Question

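I need to search an entire HTML document and highlight the keyword I searched for. I'm using C# and XPath for this. I thought I had a solution, but the output is not what I expected.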

static string keyword = "red";

static void Main(string[] args)
{
    string htmlString = @"<html>
                          <head>
                              <title>HTML sample page</title>
                          </head>
                          <body>
                              <div><div>This is inside div red paragraph</div></div>
                              <p>This is a red paragraph</p>
                              <p>This is a tes paragraph</p>
                              <p>This is a test paragraph</p>
                              <p>This is a paragraph red </p>
                          </body>
                          </html>";

    XmlDocument htmlDocument = new XmlDocument();
    htmlDocument.Load(new StringReader(htmlString));

    foreach (XmlNode node in htmlDocument.SelectNodes("//*[contains(., 'red')]"))
    {
        node.InnerText = node.InnerText.Replace(keyword, "highlight" + keyword + "highlight");
    }
    Console.WriteLine(htmlDocument.InnerXml);
}

The output I get is:

<html>HTML sample pageThis is inside div highlightredhighlight paragraphThis is a highlightredhighlight paragraphThis is a tes paragraphThis is a test paragraphThis is a paragraph highlightredhighlight </html>

The output seems to have lost every tag except html. What am I doing wrong?

Answer 1

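The first match you get is the html element, because its text content contains "red". Assigning to its InnerText then replaces everything inside it, child elements included, with a single piece of plain text plus the extra highlight words.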
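To see why html is matched, it can help to list which elements the question's XPath actually selects. The following is only a minimal sketch: the class name MatchDemo, the method MatchedNames, and the shortened sample markup are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class MatchDemo
{
    // Returns the names of all elements selected by the question's XPath,
    // joined with commas. "." is the string value of an element, which
    // includes the text of all of its descendants.
    public static string MatchedNames(string xml)
    {
        var doc = new XmlDocument();
        doc.Load(new StringReader(xml));

        var names = new List<string>();
        foreach (XmlNode node in doc.SelectNodes("//*[contains(., 'red')]"))
        {
            names.Add(node.Name);
        }
        return string.Join(",", names);
    }

    static void Main()
    {
        // Shortened version of the question's document (an assumption for brevity).
        string xml = "<html><body><div><div>inside red</div></div>"
                   + "<p>a red one</p><p>plain</p></body></html>";
        Console.WriteLine(MatchedNames(xml));
    }
}
```

Every ancestor of a text node containing the keyword is selected as well, so the root element is among the matches, and overwriting its InnerText discards the rest of the tree.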

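Also, if you really need to highlight the text, you need to split the text node and insert an element (a highlighting tag or similar) around the keyword.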
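One way the splitting could look is sketched below. This is not the answerer's code: the span element, the class name "highlight", and the names Highlighter and Highlight are assumptions chosen for illustration, and the match is a plain case-sensitive substring search.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class Highlighter
{
    // Wraps each occurrence of keyword in <span class="highlight">...</span>.
    // The span element and the class name are assumptions, not from the thread.
    public static string Highlight(string xml, string keyword)
    {
        if (string.IsNullOrEmpty(keyword))
            throw new ArgumentException("keyword must not be empty", nameof(keyword));

        var doc = new XmlDocument();
        doc.Load(new StringReader(xml));

        // Collect matching text nodes first, because the loop below edits the tree.
        var textNodes = new List<XmlNode>();
        foreach (XmlNode n in doc.SelectNodes("//text()"))
        {
            if (n.Value.Contains(keyword)) textNodes.Add(n);
        }

        foreach (XmlNode textNode in textNodes)
        {
            XmlNode parent = textNode.ParentNode;
            string text = textNode.Value;
            int start = 0;
            int hit;
            while ((hit = text.IndexOf(keyword, start, StringComparison.Ordinal)) >= 0)
            {
                // Text before the match stays a plain text node.
                if (hit > start)
                    parent.InsertBefore(doc.CreateTextNode(text.Substring(start, hit - start)), textNode);

                // The match itself goes into a new element.
                XmlElement span = doc.CreateElement("span");
                span.SetAttribute("class", "highlight");
                span.InnerText = keyword;
                parent.InsertBefore(span, textNode);

                start = hit + keyword.Length;
            }
            // Text after the last match, then drop the original node.
            if (start < text.Length)
                parent.InsertBefore(doc.CreateTextNode(text.Substring(start)), textNode);
            parent.RemoveChild(textNode);
        }
        return doc.OuterXml;
    }

    static void Main()
    {
        Console.WriteLine(Highlight("<body><p>This is a red paragraph</p></body>", "red"));
    }
}
```

Because only text nodes are touched, the surrounding elements stay in place.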

Answer 2

Try this:

foreach (XmlNode node in htmlDocument.SelectNodes("//*[contains(text(), 'red')]"))

dotnetfiddle: https://dotnetfiddle.net/iqBxeJ
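One caveat worth keeping in mind: in XPath 1.0, contains(text(), 'red') only tests the first text child of each element, and assigning to InnerText still flattens any child elements of the matched node. A possible variation is to select the text nodes themselves and change their Value in place. The sketch below assumes the same "highlight" markers as the question; the names TextNodeReplace and MarkKeyword are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class TextNodeReplace
{
    // Replaces keyword inside each text node, leaving all elements untouched.
    public static string MarkKeyword(string xml, string keyword)
    {
        var doc = new XmlDocument();
        doc.Load(new StringReader(xml));

        // Copy the text nodes into a list before modifying anything.
        var textNodes = new List<XmlNode>();
        foreach (XmlNode n in doc.SelectNodes("//text()"))
        {
            textNodes.Add(n);
        }

        foreach (XmlNode n in textNodes)
        {
            if (n.Value.Contains(keyword))
                n.Value = n.Value.Replace(keyword, "highlight" + keyword + "highlight");
        }
        return doc.OuterXml;
    }

    static void Main()
    {
        // The keyword sits in the second text child of <p>, after the <b> element.
        Console.WriteLine(MarkKeyword("<p>This is <b>very</b> red</p>", "red"));
    }
}
```

In the Main example the keyword is not in the first text child of the paragraph, so an element-level test on text() would miss it, while the text-node query still finds it and keeps the b element.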

Find and replace a keyword throughout an HTML document using XPath
