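I am trying to select all elements that have an itemprop attribute and that have, at any depth, an ancestor with itemtype = http://schema.org/Product, except for elements located inside a node that carries any other itemtype, i.e. a node matching //*[@itemtype and not(@itemprop)].
Example: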
<div itemtype = "http://schema.org/Product" >
<div itemtype = "http://schema.org/BreadcrumbList" >
<div itemprop = "name" > A </div>
<div itemprop = "price" > B </div>
<div itemtype = "http://schema.org/ListItem" >
<div itemprop = "description"> C </div>
</div>
<div itemprop="offers" itemtype = "http://schema.org/Offer" >
<div itemprop = "price"> D </div>
</div>
<div itemprop = "name" > E </div>
<div>
<div>
<div itemprop = "price" > F </div>
</div>
</div>
</div>
I only need elements D, E, and F, not A, B, or C. I tried something like this:
//*[@itemprop and count(ancestor::*[@itemtype='http://schema.org/Product'])=count(ancestor::*[@itemtype and not(@itemprop)])]
This works, but it seems to get complicated when there are more itemtype and itemprop attributes among the ancestors.
Example:
<div itemtype = "http://schema.org/WebPage" >
//a lot of itemtypes and itemprops between them as Product nodes' ancestor//
<div itemtype = "http://schema.org/Product" >
<div itemtype = "http://schema.org/BreadcrumbList" >
<div itemprop = "name" > A </div>
<div itemprop = "price" > B </div>
<div itemtype = "http://schema.org/ListItem" >
<div itemprop = "description"> C </div>
</div>
<div itemprop="offers" itemtype = "http://schema.org/Offer" >
<div itemprop = "price"> D </div>
</div>
<div itemprop = "name" > E </div>
<div>
<div>
<div itemprop = "price" > F </div>
</div>
</div>
</div>
</div>
So my question is: is there any way to limit the scope of the ancestors that are counted?
<div itemtype = "http://schema.org/WebPage" >
a lot of itemtypes and itemprops between them as Product nodes' ancestor
--------------------------------------------------------------------------------
Calculate count of ancestor from here
--------------------------------------------------------------------------------
<div itemtype = "http://schema.org/Product" >
<div itemtype = "http://schema.org/BreadcrumbList" >
<div itemprop = "name" > A </div>
<div itemprop = "price" > B </div>
<div itemtype = "http://schema.org/ListItem" >
<div itemprop = "description"> C </div>
</div>
<div itemprop="offers" itemtype = "http://schema.org/Offer" >
<div itemprop = "price"> D </div>
</div>
<div itemprop = "name" > E </div>
<div>
<div>
<div itemprop = "price" > F </div>
</div>
</div>
</div>
----------------------------------------------------------------------------------
To here
----------------------------------------------------------------------------------
........
</div>
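In short, after selecting the node //*[contains(@itemtype, 'schema.org/Product') and not(@itemprop)], I want to count only the ancestors that lie inside that node.
Is there any way to do this?
Without limiting the ancestor scope, this code works: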
//*[@itemprop and count(ancestor::*[contains(@itemtype, 'schema.org/Product') and not(@itemprop)])+count(//*[@itemtype and not(@itemprop) and descendant::*[contains(@itemtype, 'schema.org/Product') and not(@itemprop)]])=count(ancestor::*[@itemtype and not(@itemprop)])]
Is this the best approach? Many thanks to everyone :)
Answer 0 (score: 2)
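It seems you want to select the leaves of the tree: elements that have an itemprop attribute. In addition, you only want those whose nearest ancestor carrying an itemtype attribute and no itemprop attribute has the itemtype value "http://schema.org/Product".
This XPath 1.0 expression: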
//*[@itemprop][not(@itemtype)]
[ancestor::*[@itemtype][not(@itemprop)][1]
[@itemtype='http://schema.org/Product']
]
with this well-formed input (note the closed BreadcrumbList):
<div itemtype="http://schema.org/WebPage">//a lot of itemtypes and itemprops between them as Product nodes' ancestor//
<div itemtype="http://schema.org/Product">
<div itemtype="http://schema.org/BreadcrumbList">
<div itemprop="name">A</div>
<div itemprop="price">B</div>
<div itemtype="http://schema.org/ListItem">
<div itemprop="description">C</div>
</div>
</div>
<div itemprop="offers" itemtype="http://schema.org/Offer">
<div itemprop="price">D</div>
</div>
<div itemprop="name">E</div>
<div>
<div>
<div itemprop="price">F</div>
</div>
</div>
</div>
</div>
selects:
<div itemprop="price">D</div>
<div itemprop="name">E</div>
<div itemprop="price">F</div>
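For readers who want to check the logic without an XPath engine that supports the ancestor axis, here is a minimal Python sketch of the same rule. It does not evaluate the XPath itself (the standard library's ElementTree has no ancestor axis); instead it walks up from each candidate element to the nearest ancestor that has itemtype and no itemprop, which is what the `[1]` step on the reverse axis picks out. The function name `select_product_props` and the constants `PRODUCT` and `DOC` are hypothetical names chosen for this illustration.

```python
import xml.etree.ElementTree as ET

PRODUCT = "http://schema.org/Product"

# Well-formed sample input, with the BreadcrumbList closed.
DOC = """
<div itemtype="http://schema.org/WebPage">
  <div itemtype="http://schema.org/Product">
    <div itemtype="http://schema.org/BreadcrumbList">
      <div itemprop="name">A</div>
      <div itemprop="price">B</div>
      <div itemtype="http://schema.org/ListItem">
        <div itemprop="description">C</div>
      </div>
    </div>
    <div itemprop="offers" itemtype="http://schema.org/Offer">
      <div itemprop="price">D</div>
    </div>
    <div itemprop="name">E</div>
    <div>
      <div>
        <div itemprop="price">F</div>
      </div>
    </div>
  </div>
</div>
"""


def select_product_props(root):
    """Mirror the XPath predicate by walking the tree by hand."""
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    selected = []
    for node in root.iter():  # iter() visits elements in document order
        # Corresponds to [@itemprop][not(@itemtype)]
        if "itemprop" not in node.attrib or "itemtype" in node.attrib:
            continue
        # Climb to the nearest ancestor with itemtype and without itemprop,
        # i.e. ancestor::*[@itemtype][not(@itemprop)][1]
        anc = parent_of.get(node)
        while anc is not None and not (
            "itemtype" in anc.attrib and "itemprop" not in anc.attrib
        ):
            anc = parent_of.get(anc)
        # Corresponds to the final [@itemtype='http://schema.org/Product']
        if anc is not None and anc.get("itemtype") == PRODUCT:
            selected.append(node)
    return selected


root = ET.fromstring(DOC)
result = [el.text.strip() for el in select_product_props(root)]
print(result)
```

Because only the nearest qualifying ancestor is inspected, any number of itemtype or itemprop nodes above the Product element (such as the WebPage wrapper) has no effect on the result, so no ancestor counting is needed.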