修改文本XmlNode的InnerXml

时间:2011-10-26 12:25:18

标签: c# xml-parsing xmldocument sgmlreader

我使用SGML和XmlDocument遍历一个html文档。当我找到一个类型为Text的XmlNode时,我需要更改其具有xml元素的值。我不能改变InnerXml,因为它只是readonly。我尝试更改InnerText,但这次标记描述符chars <>编码为&lt;&gt;。例如:

<p>
    This is a text that will be highlighted.
    <anothertag />
    <......>
</p>

我正在尝试改为:

<p>
    This is a text that will be <span class="highlighted">highlighted</span>.
    <anothertag />
    <......>
</p>

修改文本XmlNode值的最简单方法是什么?

3 个答案:

答案 0 :(得分:2)

我有一个解决方法,我不知道它是一个真正的解决方案或什么,但它可以产生我想要的。如果有合适的解决方案,请评论此代码

    private void traverse(ref XmlNode node)
    {
        XmlNode prevOldElement = null;
        XmlNode prevNewElement = null;
        var element = node.FirstChild;
        do
        {
            if (prevNewElement != null && prevOldElement != null)
            {
                prevOldElement.ParentNode.ReplaceChild(prevNewElement, prevOldElement);
                prevNewElement = null;
                prevOldElement = null;
            }
            if (element.NodeType == XmlNodeType.Text)
            {
                var el = doc.CreateElement("text");
                //Here is manuplation of the InnerXml.
                el.InnerXml = element.Value.Replace(a_search_term, "<b>" + a_search_term + "</b>");
                //I don't replace element right now, because element.NextSibling will be null.
                //So I replace the new element after getting the next sibling.
                prevNewElement = el;
                prevOldElement = element;
            }
            else if (element.HasChildNodes)
                traverse(ref element);
        }
        while ((element = element.NextSibling) != null);
        if (prevNewElement != null && prevOldElement != null)
        {
            prevOldElement.ParentNode.ReplaceChild(prevNewElement, prevOldElement);
        }

    }

此外,我在遍历函数后删除<text></text>字符串:

        doc = new XmlDocument();
        doc.PreserveWhitespace = true;
        doc.XmlResolver = null;
        doc.Load(sgmlReader);
        var html = doc.FirstChild;
        traverse(ref html);
        textBox1.Text = doc.OuterXml.Replace("<text>", String.Empty).Replace("</text>", String.Empty);

答案 1 :(得分:1)

using System;
using System.Xml;

public class Sample {

  public static void Main() {
    XmlDocument doc = new XmlDocument();
    doc.LoadXml(
    "<p>" +
    "This is a text that will be highlighted." +
    "<br />" +
    "<img />" +
    "</p>");
    string ImpossibleMark = "_*_";
    XmlNode elem = doc.DocumentElement.FirstChild;
    string thewWord ="highlighted";
    if(elem.NodeType == XmlNodeType.Text){
        string OriginalXml = elem.ParentNode.InnerXml;
        while(OriginalXml.Contains(ImpossibleMark)) ImpossibleMark += ImpossibleMark;
        elem.InnerText = elem.InnerText.Replace(thewWord, ImpossibleMark);
        string replaceString = "<span class=\"highlighted\">" + thewWord + "</span>";
        elem.ParentNode.InnerXml = elem.ParentNode.InnerXml.Replace(ImpossibleMark, replaceString);
    }

    Console.WriteLine(doc.DocumentElement.InnerXml);
  }
}

答案 2 :(得分:0)

InnerText property将为您提供XmlNode的所有子节点的文本内容。你真正想要设置的是InnerXml property,它将被解释为XML,而不是文本。