Removing markup with XSLT

Date: 2015-12-10 12:13:55

Tags: xml xslt

I am filtering an XML document with the following structure:

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <channel>
        <item>
            <title>My Second Great Title</title>
            <link>http://server.com/content/my-second-great-title</link>
            <tag>vuluptate</tag>
            <tag>id</tag>
            <tag>cras</tag>
            <tag>pretium</tag>
            <tag>conubia</tag>
            <tag>libero</tag>
            <description><![CDATA[This is a second great description <img src="http://server.com/images/image01.png" />]]></description>
            <publishedAt>Sat, 08 Nov 2015 10:00:52 +0000</publishedAt>
            <isVisible>true</isVisible>
            <content>Ut luctus auctor varius. Donec vitae erat felis. Nam ac erat vulputate, consequat elit id, dictum urna. Vestibulum dignissim eget felis vitae tempor. Suspendisse molestie lectus at est accumsan, et porta sapien elementum. Vivamus pretium imperdiet nisl id consequat. Sed gravida bibendum odio, et vehicula nibh hendrerit eget. Cras sit amet semper sem. Vivamus non lorem sed ex fringilla malesuada consequat non arcu. Etiam nec sodales tortor. In scelerisque massa vitae purus suscipit consectetur. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Cras ultrices eros tortor, eu sollicitudin eros pellentesque sit amet. Integer rutrum velit eget libero efficitur, non auctor lorem rutrum. Vivamus porta dolor ut enim dapibus, nec rutrum nisi sagittis.</content>
        </item>
        <item>
            <title>My Great Title</title>
            <link>http://server.com/content/my-great-title</link>
            <tag>lorem</tag>
            <tag>ipsum</tag>
            <tag>arcu</tag>
            <tag>sic</tag>
            <description><![CDATA[This is a great description <img src="http://server.com/images/image08.png" />]]></description>
            <publishedAt>Sat, 08 Nov 2015 10:00:52 +0000</publishedAt>
            <isVisible>true</isVisible>
            <content>Praesent consectetur, dolor non vehicula ultrices, nisl libero feugiat ligula, ut faucibus metus arcu et dui. Curabitur eleifend feugiat posuere. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Donec cursus blandit lorem, ullamcorper vestibulum massa molestie non. Maecenas erat enim, pretium eget velit dapibus, consequat placerat eros. Nam vulputate nisi at urna gravida accumsan. Fusce id ultrices nunc. Aenean varius quam in tincidunt cursus. Quisque sed arcu est. Etiam dignissim, neque at maximus feugiat, turpis nunc sollicitudin eros, et lobortis enim dui sed felis. Nulla rhoncus diam porttitor ullamcorper imperdiet.</content>
        </item>
        <item>
            <title>My Title</title>
            <link>http://server.com/content/my-title</link>
            <tag>auctor</tag>
            <tag>felis</tag>
            <description><![CDATA[This is a description <img src="http://server.com/images/image301.png" />]]></description>
            <publishedAt>Sat, 05 Nov 2015 16:07:23 +0000</publishedAt>
            <isVisible>true</isVisible>
            <content>Ut luctus auctor varius. Donec vitae erat felis. Nam ac erat vulputate, consequat elit id, dictum urna. Vestibulum dignissim eget felis vitae tempor. Suspendisse molestie lectus at est accumsan, et porta sapien elementum. Vivamus pretium imperdiet nisl id consequat. Sed gravida bibendum odio, et vehicula nibh hendrerit eget. Cras sit amet semper sem. Vivamus non lorem sed ex fringilla malesuada consequat non arcu. Etiam nec sodales tortor. In scelerisque massa vitae purus suscipit consectetur. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Cras ultrices eros tortor, eu sollicitudin eros pellentesque sit amet. Integer rutrum velit eget libero efficitur, non auctor lorem rutrum. Vivamus porta dolor ut enim dapibus, nec rutrum nisi sagittis.</content>
        </item>
    </channel>
</root>

I currently get the content of the description element like this:

<description><xsl:value-of select="description" /></description>

How can I get rid of the img tag (and possibly other tags) found inside description?

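1 answer:

Answer 0 (score: 1)

Because the content of description is a CDATA section, the img "tag" is plain text rather than an element, so it cannot be matched by a template and has to be removed by string processing. In XSLT 1.0, that means using a recursive named template to strip the pseudo-markup. Here is an example: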

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<!-- identity transform -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="description">
    <xsl:copy>
        <xsl:call-template name="exclude-markup">
            <xsl:with-param name="string" select="." />
        </xsl:call-template>
    </xsl:copy>
</xsl:template>

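<!-- Outputs $string with every run from $prefix up to the next $suffix removed.
     Each pass emits the text before the first $prefix, then recurses on the text after the matching $suffix. -->
<xsl:template name="exclude-markup">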
    <xsl:param name="string"/>
    <xsl:param name="prefix" select="'&lt;'"/>
    <xsl:param name="suffix" select="'&gt;'"/>
    <xsl:choose>
        <xsl:when test="contains($string, $prefix) and contains(substring-after($string, $prefix), $suffix)">
            <xsl:value-of select="substring-before($string, $prefix)" />
            <!-- recursive call -->
            <xsl:call-template name="exclude-markup">
                <xsl:with-param name="string" select="substring-after(substring-after($string, $prefix), $suffix)" />
            </xsl:call-template>
        </xsl:when>
        <xsl:otherwise>
            <xsl:value-of select="$string" />
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

</xsl:stylesheet>

Sample input:

<root>
    <channel>
        <item>
            <title>Item One</title>
            <description><![CDATA[This is the first part of a description. <img src="http://server.com/images/image01.png" />Here is the second part.]]></description>
        </item>
        <item>
            <title>Item Two</title>
            <description><![CDATA[<img src="http://server.com/images/image02.png" />This is another description.<img src="http://server.com/images/image03.png" />]]></description>
        </item>
        <item>
            <title>Item Three</title>
            <description><![CDATA[This description has a <b>bold</b> tag.]]></description>
        </item>
    </channel>
</root>

Result:

<?xml version="1.0" encoding="UTF-8"?>
<root>
   <channel>
      <item>
         <title>Item One</title>
         <description>This is the first part of a description. Here is the second part.</description>
      </item>
      <item>
         <title>Item Two</title>
         <description>This is another description.</description>
      </item>
      <item>
         <title>Item Three</title>
         <description>This description has a bold tag.</description>
      </item>
   </channel>
</root>
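
For comparison, here is a minimal sketch of how the same stripping could be done without recursion if an XSLT 2.0 processor (for example Saxon) is available. This is not part of the original answer; it assumes the processor supports the XPath 2.0 replace() function and that everything between a "<" and the next ">" should be dropped, which is the same rule the recursive template applies.

```xml
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<!-- identity transform -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<!-- Assumption: XSLT 2.0 processor. The regex matches "<", any characters other than ">", then ">",
     and replaces each match with an empty string. -->
<xsl:template match="description">
    <xsl:copy>
        <xsl:value-of select="replace(., '&lt;[^&gt;]*&gt;', '')"/>
    </xsl:copy>
</xsl:template>

</xsl:stylesheet>
```

Applied to the sample input above, this should yield the same description text as the XSLT 1.0 version. Like the recursive template, it is a plain text substitution and does not parse the embedded HTML, so a literal "<" followed later by a ">" in ordinary prose would also be removed.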