Question

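I've been playing with this code, trying to figure out how to extract the title information with XPath. Because it is on an internal network, I don't have access to a tool like FirePath.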

<div style="float:left">
<table border="0">
<tbody>
   <tr width="100%">
      <td valign="top">Code that does not matter</td>
      <td colspan="2">
          <span class="textinfo">
          <a href="http.....">
             <b> HI!  I am the TITLE!</b>
          </a>
          </span>
      </td>
   </tr>
   <tr></tr>
   <tr></tr>
   <tr width="100%">
      <td valign="top">Code that does not matter</td>
      <td colspan="2">
          <span class="textinfo">
          <a href="http.....">
             <b> HI!  Here is another TITLE!</b>
          </a>
          </span>
      </td>
    </tr>
   </tbody>
  </table>
  </div>

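This goes on for a while. Basically there are 10 results, and I'm trying to figure out how to get all the titles. Any ideas? Have I provided enough information? Thanks!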

Answer 1

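The sample is an HTML fragment rather than a complete XML document, although it does have a single root element. Assuming no namespaces are defined (there shouldn't be any), then...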

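You can use the inner text of the span element. Select each span and read its full text content (its string value); appending /text() would return only the whitespace directly inside the span, because the title text sits in the nested b element:

//td/span[@class='textinfo']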

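I wouldn't put the a and b in the path: a "disabled" title, for example, won't contain the a. Either way, using XPath to find the "title" is not a very reliable approach.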
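For anyone who wants to try this outside the browser, here is a minimal sketch in Python using only the standard library. It assumes the markup is well-formed enough to parse as XML (real pages often are not, and would need an HTML parser instead). The names SAMPLE and extract_titles are made up for illustration, and the second row deliberately drops the a element to mimic a hypothetical "disabled" title.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the markup from the question. The second row has no <a>,
# standing in for a hypothetical "disabled" title (an assumption for this demo).
SAMPLE = """<div style="float:left">
<table border="0">
<tbody>
   <tr width="100%">
      <td valign="top">Code that does not matter</td>
      <td colspan="2">
          <span class="textinfo">
          <a href="http.....">
             <b> HI!  I am the TITLE!</b>
          </a>
          </span>
      </td>
   </tr>
   <tr></tr>
   <tr width="100%">
      <td valign="top">Code that does not matter</td>
      <td colspan="2">
          <span class="textinfo">
             <b> HI!  Here is another TITLE!</b>
          </span>
      </td>
   </tr>
</tbody>
</table>
</div>"""


def extract_titles(markup):
    """Return the text of every span.textinfo inside a td, whitespace-collapsed."""
    root = ET.fromstring(markup)
    titles = []
    # ElementTree supports a limited XPath subset; this path stops at the span
    # so it matches whether or not the title is wrapped in <a> and <b>.
    for span in root.findall(".//td/span[@class='textinfo']"):
        # itertext() walks all nested text nodes (the span's string value).
        text = "".join(span.itertext())
        # Collapse runs of whitespace, including the indentation newlines.
        titles.append(" ".join(text.split()))
    return titles


if __name__ == "__main__":
    for title in extract_titles(SAMPLE):
        print(title)
```

Stopping the path at the span and reading all nested text is what keeps the second row from being skipped; a path ending in /a/b would miss it.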
