Should indenting XML in Java respect `xml:space="preserve"`?

Asked: 2013-06-11 20:23:49

Tags: java xml pretty-print

I am pretty-printing (indenting) some XML in Java:

<div xml:space="default"><h1 xml:space="default">Indenting mixed content in Java</h1><p xml:space="preserve">Why does indenting mixed content (like this paragraph) add whitespace around <a href="http://www.stackoverflow.com" xml:space="preserve"><strong>this strong element</strong></a>?</p></div>

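When I pretty-print the XML, I don't want whitespace added inside the content of the <a> element, so I specified xml:space="preserve", expecting the transformer to leave the whitespace there untouched.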

However, when I transform the XML, I get this:

<div>
    <h1 xml:space="default">Indenting mixed content in Java</h1>
    <p>Why does indenting mixed content (like this paragraph) add whitespace around <a href="http://www.stackoverflow.com">
            <strong xml:space="preserve">this strong element</strong>
        </a>?</p>
</div>

... with extra whitespace between the <a> and <strong> elements. (Not only that, the closing </a> tag awkwardly fails to line up with its opening tag.)

How can I prevent the pretty-printer from adding this whitespace? Am I doing something wrong? Here is the Java code I'm using:

import org.w3c.dom.Element;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import org.w3c.dom.Document;
import java.io.ByteArrayInputStream;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.Transformer;
import java.io.StringWriter;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.stream.StreamResult;

public class XmlExample {

    public static void main(String[] argv) {
        Document xmlDoc    = parseXml("<div xml:space=\"default\">" + 
                                          "<h1 xml:space=\"default\">Indenting mixed content in Java</h1>" + 
                                          "<p xml:space=\"preserve\">Why does indenting mixed content (like this paragraph) add whitespace around " + 
                                              "<a href=\"http://www.stackoverflow.com\" xml:space=\"preserve\"><strong>this strong element</strong></a>?" + 
                                          "</p>" + 
                                      "</div>");
        String   xmlString = xmlToString(xmlDoc.getDocumentElement());
        System.out.println(xmlString);
    }

    public static Document parseXml(String xml) {
        try {
            DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
            docFactory.setNamespaceAware(true);
            DocumentBuilder docBuilder = docFactory.newDocumentBuilder();

            Document doc = docBuilder.parse(new ByteArrayInputStream(xml.getBytes("UTF-8"))); 
            return doc;
        }
        catch(Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static String xmlToString(Element el) {
        try {
            TransformerFactory tf = TransformerFactory.newInstance();
            Transformer transformer = tf.newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            DOMSource source = new DOMSource(el);
            transformer.transform(source, new StreamResult(writer));
            return writer.getBuffer().toString().trim();
        }
        catch(Exception e) {
            throw new RuntimeException(e);
        }
    }

}

1 answer:

Answer 0 (score: 1)

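If you use a serializer that conforms to the XSLT 1.0 or XSLT 2.0 specification, it should respect xml:space (that is, indentation should be suppressed within the scope of xml:space="preserve"). The XSLT 2.0 specification is more explicit on this point than XSLT 1.0, and makes it a "must" rather than a "should" requirement.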
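To make "within the scope of" concrete: under the XML specification, an xml:space value applies to the element that carries it and to all of its descendants, until a nested xml:space overrides it. The sketch below is only an illustration of that rule, not how any particular serializer is implemented; the class and method names (XmlSpaceScope, isPreserved, parse) are made up for this example. It walks up the DOM tree and reports the nearest xml:space value in effect.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class XmlSpaceScope {

    /** Returns true if the nearest xml:space on el or one of its ancestors is "preserve". */
    public static boolean isPreserved(Element el) {
        for (Node n = el; n instanceof Element; n = n.getParentNode()) {
            // getAttributeNS returns "" when the attribute is absent
            String value = ((Element) n).getAttributeNS(XMLConstants.XML_NS_URI, "space");
            if ("preserve".equals(value)) {
                return true;
            }
            if ("default".equals(value)) {
                return false;   // a nearer "default" overrides an outer "preserve"
            }
        }
        return false;           // no xml:space anywhere up the tree
    }

    /** Parses a string into a namespace-aware DOM (hypothetical helper for this sketch). */
    public static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<div xml:space=\"default\"><h1>Title</h1>"
                + "<p xml:space=\"preserve\">text <a><strong>s</strong></a>?</p></div>");
        Element h1 = (Element) doc.getElementsByTagName("h1").item(0);
        Element strong = (Element) doc.getElementsByTagName("strong").item(0);
        System.out.println("h1 preserved: " + isPreserved(h1));         // false
        System.out.println("strong preserved: " + isPreserved(strong)); // true
    }
}
```

A conforming indenter would, in effect, make this kind of check before inserting any whitespace around an element.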

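You are using a JAXP identity transformation rather than an XSLT transformation; the JAXP specification does refer to the XSLT 1.0 specification for this, but the reference is a bit woolly.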

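If you use Saxon, you should get the behaviour you want. Saxon also lets you suppress indentation for specific elements with the SUPPRESS_INDENTATION output parameter, so you don't even need to include xml:space in the document being serialized.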
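A rough sketch of what that might look like through JAXP follows. Several things here are assumptions to verify against the Saxon version you use: that the Saxon jar is on the classpath, that its JAXP factory class is net.sf.saxon.TransformerFactoryImpl, and that the SUPPRESS_INDENTATION parameter is addressed by the key {http://saxon.sf.net/}suppress-indentation with a space-separated list of element names as its value (newer releases may expect a different key). So that the sketch still runs without Saxon, it falls back to the JDK's default factory, which is not expected to act on the Saxon-specific key.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerFactoryConfigurationError;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class SaxonIndentExample {

    // Assumed names; check them against the documentation for your Saxon version.
    static final String SAXON_FACTORY = "net.sf.saxon.TransformerFactoryImpl";
    static final String SUPPRESS_INDENTATION = "{http://saxon.sf.net/}suppress-indentation";

    /** Prefers Saxon's factory; falls back to the JDK default if Saxon is not on the classpath. */
    static TransformerFactory newFactory() {
        try {
            return TransformerFactory.newInstance(SAXON_FACTORY, null);
        } catch (TransformerFactoryConfigurationError e) {
            return TransformerFactory.newInstance();
        }
    }

    /** Runs an identity transformation with indentation on, asking for none inside p and a. */
    public static String indent(String xml) {
        try {
            Transformer transformer = newFactory().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            // Only meaningful to Saxon (assumed key and value format)
            transformer.setOutputProperty(SUPPRESS_INDENTATION, "p a");
            StringWriter writer = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(writer));
            return writer.toString().trim();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(indent("<div><h1>Indenting mixed content in Java</h1>"
                + "<p>Why does indenting add whitespace around "
                + "<a href=\"http://www.stackoverflow.com\"><strong>this strong element</strong></a>?"
                + "</p></div>"));
    }
}
```

Note that the input here carries no xml:space attributes at all; the elements to leave alone are named in the output property instead.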