XPath: limit the scope of the ancestor axis to count elements with a specific parent inside a specific node

Time: 2019-12-09 09:43:41

Tags: xpath

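I am trying to select all elements that have an itemprop attribute and, at any level above them, an ancestor with itemtype = http://schema.org/Product, except elements that sit inside a node with any other itemtype, that is, a node matching //*[@itemtype and not(@itemprop)].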

Example:

<div itemtype = "http://schema.org/Product" >
  <div itemtype = "http://schema.org/BreadcrumbList" >
    <div itemprop = "name" > A </div>
    <div itemprop = "price" > B </div>
    <div itemtype = "http://schema.org/ListItem" >
      <div itemprop = "description"> C </div>
    </div>
  <div itemprop="offers" itemtype = "http://schema.org/Offer" >
      <div itemprop = "price"> D </div>
  </div>
  <div itemprop = "name" > E </div>
  <div>
    <div>
      <div itemprop = "price" > F </div>
    </div>
  </div>
</div>

I only need elements D, E and F, not A, B or C. I tried something like this:

//*[@itemprop and count(ancestor::*[@itemtype='http://schema.org/Product'])=count(ancestor::*[@itemtype and not(@itemprop)])]

This works, but it gets complicated when the Product node itself has more itemtype and itemprop elements among its ancestors. Example:

<div itemtype = "http://schema.org/WebPage" >
 //a lot of itemtypes and itemprops between them as Product nodes' ancestor//
  <div itemtype = "http://schema.org/Product" >
    <div itemtype = "http://schema.org/BreadcrumbList" >
      <div itemprop = "name" > A </div>
      <div itemprop = "price" > B </div>
      <div itemtype = "http://schema.org/ListItem" >
        <div itemprop = "description"> C </div>
      </div>
     <div itemprop="offers" itemtype = "http://schema.org/Offer" >
        <div itemprop = "price"> D </div>
    </div>
    <div itemprop = "name" > E </div>
    <div>
      <div>
        <div itemprop = "price" > F </div>
      </div>
    </div>
  </div>
</div>
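
To make the mismatch concrete, here is a rough Python sketch (standard-library xml.etree.ElementTree only) that computes the two counts the expression above compares. The helper names ancestors and ancestor_counts are invented for illustration, the BreadcrumbList div is closed so that the sample parses, and only leaf elements are reported, keyed by their text.

```python
import xml.etree.ElementTree as ET

PRODUCT = "http://schema.org/Product"

# The WebPage-wrapped sample, with BreadcrumbList closed so that it parses.
XML = """
<div itemtype="http://schema.org/WebPage">
  <div itemtype="http://schema.org/Product">
    <div itemtype="http://schema.org/BreadcrumbList">
      <div itemprop="name">A</div>
      <div itemprop="price">B</div>
      <div itemtype="http://schema.org/ListItem">
        <div itemprop="description">C</div>
      </div>
    </div>
    <div itemprop="offers" itemtype="http://schema.org/Offer">
      <div itemprop="price">D</div>
    </div>
    <div itemprop="name">E</div>
    <div>
      <div>
        <div itemprop="price">F</div>
      </div>
    </div>
  </div>
</div>
"""


def ancestors(elem, parents):
    """Yield the ancestors of elem, nearest first (hypothetical helper)."""
    node = parents.get(elem)
    while node is not None:
        yield node
        node = parents.get(node)


def ancestor_counts(root):
    """Map leaf text -> (Product ancestors, itemtype-without-itemprop ancestors)."""
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parents = {child: parent for parent in root.iter() for child in parent}
    counts = {}
    for elem in root.iter():
        # Report only leaf elements carrying itemprop (skips the offers container).
        if "itemprop" not in elem.attrib or len(elem):
            continue
        chain = list(ancestors(elem, parents))
        products = [a for a in chain if a.get("itemtype") == PRODUCT]
        scopes = [a for a in chain
                  if "itemtype" in a.attrib and "itemprop" not in a.attrib]
        counts[elem.text.strip()] = (len(products), len(scopes))
    return counts


counts = ancestor_counts(ET.fromstring(XML))
for label, pair in counts.items():
    print(label, pair)
```

With the WebPage wrapper, the second number for D, E and F is one higher than the first, so the equality test rejects exactly the elements that should be kept.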

So my question is: is there any way to limit the scope of the ancestor axis?

<div itemtype = "http://schema.org/WebPage" >
a lot of itemtypes and itemprops between them as Product nodes' ancestor
--------------------------------------------------------------------------------
Calculate count of ancestor from here
--------------------------------------------------------------------------------
  <div itemtype = "http://schema.org/Product" >
    <div itemtype = "http://schema.org/BreadcrumbList" >
      <div itemprop = "name" > A </div>
      <div itemprop = "price" > B </div>
      <div itemtype = "http://schema.org/ListItem" >
        <div itemprop = "description"> C </div>
      </div>
     <div itemprop="offers" itemtype = "http://schema.org/Offer" >
        <div itemprop = "price"> D </div>
    </div>
    <div itemprop = "name" > E </div>
    <div>
      <div>
        <div itemprop = "price" > F </div>
      </div>
    </div>
  </div>
----------------------------------------------------------------------------------
To here
----------------------------------------------------------------------------------
........
</div>

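In summary, after selecting the node //*[contains(@itemtype, 'schema.org/Product') and not(@itemprop)], I want to count ancestors only within that node. Is there any way to do this? Without limiting the scope of the ancestor axis, this code works: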

//*[@itemprop and count(ancestor::*[contains(@itemtype, 'schema.org/Product') and not(@itemprop)])+count(//*[@itemtype and not(@itemprop) and descendant::*[contains(@itemtype, 'schema.org/Product') and not(@itemprop)]])=count(ancestor::*[@itemtype and not(@itemprop)])]

Is this the best approach? Many thanks to everyone :)

1 answer:

Answer 0: (score: 2)

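It seems you want to select the leaves of the tree: the elements that have an itemprop attribute. In addition, you want only those whose first ancestor carrying just an itemtype attribute (and no itemprop) has the value "http://schema.org/Product".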

This XPath 1.0 expression:

//*[@itemprop][not(@itemtype)]
   [ancestor::*[@itemtype][not(@itemprop)][1]
               [@itemtype='http://schema.org/Product']
   ]

With this well-formed input (note the closed BreadcrumbList):

<div itemtype="http://schema.org/WebPage">//a lot of itemtypes and itemprops between them as Product nodes' ancestor//
  <div itemtype="http://schema.org/Product">
    <div itemtype="http://schema.org/BreadcrumbList">
      <div itemprop="name">A</div>
      <div itemprop="price">B</div>
      <div itemtype="http://schema.org/ListItem">
        <div itemprop="description">C</div>
      </div>
    </div>
    <div itemprop="offers" itemtype="http://schema.org/Offer">
      <div itemprop="price">D</div>
    </div>
    <div itemprop="name">E</div>
    <div>
      <div>
        <div itemprop="price">F</div>
      </div>
    </div>
  </div>
</div>

selects:

<div itemprop="price">D</div>

<div itemprop="name">E</div>

<div itemprop="price">F</div>
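
For readers who want to check the rule outside an XPath engine, the following is a minimal Python sketch of the same "nearest itemtype-only ancestor must be Product" logic. It assumes only the standard-library xml.etree.ElementTree, which does not support the ancestor axis, so it walks a parent map instead. The function names nearest_scope and select_product_props are invented for this example, and the XML is the well-formed sample above with the free-text comment dropped.

```python
import xml.etree.ElementTree as ET

PRODUCT = "http://schema.org/Product"

# Well-formed sample: WebPage > Product > (BreadcrumbList, Offer, plain divs).
XML = """
<div itemtype="http://schema.org/WebPage">
  <div itemtype="http://schema.org/Product">
    <div itemtype="http://schema.org/BreadcrumbList">
      <div itemprop="name">A</div>
      <div itemprop="price">B</div>
      <div itemtype="http://schema.org/ListItem">
        <div itemprop="description">C</div>
      </div>
    </div>
    <div itemprop="offers" itemtype="http://schema.org/Offer">
      <div itemprop="price">D</div>
    </div>
    <div itemprop="name">E</div>
    <div>
      <div>
        <div itemprop="price">F</div>
      </div>
    </div>
  </div>
</div>
"""


def nearest_scope(elem, parents):
    """Return the closest ancestor with @itemtype and no @itemprop, or None."""
    node = parents.get(elem)
    while node is not None:
        if "itemtype" in node.attrib and "itemprop" not in node.attrib:
            return node
        node = parents.get(node)
    return None


def select_product_props(root):
    """Keep itemprop-only elements whose nearest scope is a Product node."""
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parents = {child: parent for parent in root.iter() for child in parent}
    selected = []
    for elem in root.iter():  # document order
        # Mirrors [@itemprop][not(@itemtype)] in the XPath expression.
        if "itemprop" not in elem.attrib or "itemtype" in elem.attrib:
            continue
        scope = nearest_scope(elem, parents)
        if scope is not None and scope.get("itemtype") == PRODUCT:
            selected.append(elem)
    return selected


result = [e.text.strip() for e in select_product_props(ET.fromstring(XML))]
print(result)
```

Because only the nearest qualifying ancestor is inspected, anything above the Product node (WebPage here) has no influence on the result, which is the scoping the question asks for. The Offer div is skipped as a scope because it also carries itemprop, so D still resolves to Product.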
