Removing elements that have no attributes, child elements, or text during a transformation

Time: 2013-09-11 14:58:40

Tags: xml xslt

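XSL is hard. The answers to my question here got me mostly on the right track, but I initially overlooked a few small things. Here is my latest attempt: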

XSL:

<!--
    When a file is transformed using this stylesheet the output will be
    formatted as follows:

    1.)  Elements named "info" will be removed
    2.)  Attributes named "file_line_nr" or "file_name" will be removed
    3.)  Comments will be removed
    4.)  Processing instructions will be removed
    5.)  XML declaration will be removed
    6.)  Extra whitespace will be removed
    7.)  Empty attributes will be removed
    8.)  Elements void of both attributes and child elements will be removed
    9.)  All elements will be sorted by name recursively
    10.) All attributes will be sorted by name
-->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes" method="xml" omit-xml-declaration="yes"/>
    <xsl:strip-space elements="*"/>

    <!--
        Elements/attributes to remove.  Note that comments are not elements or
        attributes.  Since there is no template to match comments they are
        automatically ignored.
    -->
    <xsl:template match="@*[normalize-space()='']|info|@file_line_nr|@file_name"/>

    <!-- Match any attribute -->
    <xsl:template match="@*">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
        </xsl:copy>
    </xsl:template>

    <!-- Match any element -->
    <xsl:template match="*">
        <xsl:copy>
            <xsl:apply-templates select="@*">
                <xsl:sort select="name()"/>
            </xsl:apply-templates>
            <xsl:apply-templates>
                <xsl:sort select="name()"/>
            </xsl:apply-templates>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

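I think I have addressed every one of my requirements except number 8. I can successfully create a stylesheet that removes elements without child elements, or one that removes elements without attributes, but that is not what I want. I only want to remove elements that have no attributes, no child elements, and no text.

Input XML: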

<?xml version="1.0" encoding="UTF-8" standalone="no" ?><!-- XML declaration should be removed -->
<foo b="b" a="a" c="c">
    <?some-app inst="some instruction"?><!-- Processing instructions should be removed -->
    <qwer><!-- Keep elements like this because it has child elements -->
        <zxcv c="c" b="b"/><!-- Keep elements like this because it has attributes -->
        <id>some text</id><!-- Keep elements like this because it has text -->
        <info i="i"/><!-- Elements named "info" are to be removed -->
        <rewq file_line_nr="42" file_name="somefile.txt"/><!-- Attributes named "file_line_nr" and "file_name" are to be removed which will leave this element empty, so it should be removed too -->
        <vcxz c="c" b="b"/>
    </qwer>
    <baz e="e" d="d"/>
    <bar>
        <fdsa g="g" f="f"/>
        <asdf g="g" f="f"/>
    </bar>
</foo>

Desired output XML: (no comments, no whitespace/indentation, elements and attributes sorted)

<foo a="a" b="b" c="c">
<bar>
<asdf f="f" g="g"/>
<fdsa f="f" g="g"/>
</bar>
<baz d="d" e="e"/>
<qwer>
<id>some text</id>
<vcxz b="b" c="c"/>
<zxcv b="b" c="c"/>
</qwer>
</foo>

2 Answers:

Answer 0: (score: 1)

This should do the job:

<xsl:stylesheet 
  version="1.0" 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
  xmlns:msxsl="urn:schemas-microsoft-com:xslt">
  <xsl:output indent="yes" method="xml" omit-xml-declaration="yes"/>
  <xsl:strip-space elements="*"/>

  <!--
        Elements/attributes to remove.  Note that comments are not elements or
        attributes.  Since there is no template to match comments they are
        automatically ignored.
    -->
  <xsl:template match="@*[normalize-space()='']|info|@file_line_nr|@file_name"/>

  <!-- Match any attribute -->
  <xsl:template match="@*">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Match any element -->
  <xsl:template match="*">
    <xsl:variable name="elementFragment">
      <xsl:copy>
        <xsl:apply-templates select="@*">
          <xsl:sort select="name()"/>
        </xsl:apply-templates>
        <xsl:apply-templates>
          <xsl:sort select="name()"/>
        </xsl:apply-templates>
      </xsl:copy>
    </xsl:variable>
    <xsl:variable name="element" select="msxsl:node-set($elementFragment)/*"/>
    <xsl:if test="$element/@* or $element/* or normalize-space($element)">
      <xsl:copy-of select="$element"/>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>

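The idea is to preprocess the element, put the result in a variable, and then run the "element has no attributes, child elements, or text" test against that variable. Because the children are processed by the same template, an element whose children are all removed ends up empty and is dropped as well.

In XSLT 1.0 the variable is a result tree fragment, which must be converted to a node-set with an extension function before it can be queried. My XSLT uses Microsoft's msxsl:node-set; other processors offer an equivalent function (for example, exsl:node-set from EXSLT).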
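For processors that do not ship the Microsoft extension, the same idea can be sketched with EXSLT. This is only an illustration under the assumption that your processor implements the EXSLT Common module (libxslt and Xalan do, as far as I know); the `exsl` prefix and the variable names are carried over or chosen for this example, not required by anything in the question.

```xml
<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:exsl="http://exslt.org/common"
  extension-element-prefixes="exsl">
  <xsl:output indent="yes" method="xml" omit-xml-declaration="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Drop blank attributes, "info" elements, and the two file_* attributes -->
  <xsl:template match="@*[normalize-space()='']|info|@file_line_nr|@file_name"/>

  <!-- Copy every remaining attribute -->
  <xsl:template match="@*">
    <xsl:copy/>
  </xsl:template>

  <!-- Build the filtered copy first, then emit it only if something survived -->
  <xsl:template match="*">
    <xsl:variable name="elementFragment">
      <xsl:copy>
        <xsl:apply-templates select="@*">
          <xsl:sort select="name()"/>
        </xsl:apply-templates>
        <xsl:apply-templates>
          <xsl:sort select="name()"/>
        </xsl:apply-templates>
      </xsl:copy>
    </xsl:variable>
    <!-- Assumption: exsl:node-set is available in the processor being used -->
    <xsl:variable name="element" select="exsl:node-set($elementFragment)/*"/>
    <xsl:if test="$element/@* or $element/* or normalize-space($element)">
      <xsl:copy-of select="$element"/>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

With libxslt this could be run as `xsltproc stylesheet.xsl input.xml` (both file names are placeholders).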

Answer 1: (score: 0)

The simplest way is to write a rule that suppresses processing of all elements:

<xsl:template match="*"/>

Then follow it with a rule that matches elements having an attribute or a child element:

<xsl:template match="*[attribute::*] | *[child::*]">
    ...process...
</xsl:template>

Or, if you prefer:

match="*[@*] | *[*]"
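
One caveat with a purely pattern-based rule is that the pattern is evaluated against the input tree, so an element such as `rewq`, whose only attributes are later stripped, would still match `*[@*]`. The following is a rough sketch of how the predicate might be tightened so that it also covers text and ignores content that the other rules remove. It is an illustration of this answer's approach under those assumptions, not a tested drop-in solution, and the long predicate is my own construction rather than something taken from the question.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes" method="xml" omit-xml-declaration="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- Drop blank attributes, "info" elements, and the two file_* attributes -->
    <xsl:template match="@*[normalize-space()='']|info|@file_line_nr|@file_name"/>

    <!-- Copy every remaining attribute -->
    <xsl:template match="@*">
        <xsl:copy/>
    </xsl:template>

    <!-- Default rule: suppress every element (default priority -0.5) -->
    <xsl:template match="*"/>

    <!--
        Keep an element only if it, or a descendant that is not inside an
        "info" element, has an attribute that survives the rule above or has
        non-blank text.  The predicate gives this rule a default priority of
        0.5, so it wins over the suppressing rule.
    -->
    <xsl:template match="*[not(self::info)]
                          [descendant-or-self::*
                              [not(ancestor-or-self::info)]
                              [@*[name()!='file_line_nr' and name()!='file_name']
                                 [normalize-space()!='']
                               or text()[normalize-space()!='']]]">
        <xsl:copy>
            <xsl:apply-templates select="@*">
                <xsl:sort select="name()"/>
            </xsl:apply-templates>
            <xsl:apply-templates>
                <xsl:sort select="name()"/>
            </xsl:apply-templates>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
```

The trade-off is that the list of removed attributes appears twice (once in the removal rule and once in the predicate), so the two must be kept in sync by hand. In exchange, the stylesheet needs no extension function.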