Validating XML against a local DTD throws an error

Asked: 2018-07-29 11:28:25

Tags: java xml dom dtd saxparser

I am trying to validate an XML file against a local DTD. I searched Google and put together the following code.

public Document buildDocument(File receivedFile) {
    Document doc = null;
    try {
        logger.info("Inside buildDocument() , create a new DocumentBuilderFactory");
        // create a new DocumentBuilderFactory
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);
        // use the factory to create a documentbuilder
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setErrorHandler(new ErrorHandler() {
            @Override
            public void fatalError(SAXParseException exception) throws SAXException {
                System.err.println("fatalError: " + exception);
            }

            @Override
            public void error(SAXParseException exception) throws SAXException {
                System.err.println("error: " + exception);
            }

            @Override
            public void warning(SAXParseException exception) throws SAXException {
                System.err.println("warning: " + exception);
            }
        });

        builder.setEntityResolver(new EntityResolver() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) throws SAXException, IOException {
                if (systemId.contains("xyz.com/remote.dtd")) {
                    return new InputSource(FileUtils.readFileToString(
                            new File("C:\\Users\\xyz\\local.dtd")));
                } else {
                    return null;
                }
            }
        });
        doc = builder.parse(new InputSource(new StringReader(FileUtils.readFileToString(receivedFile, "UTF-16"))));
    } catch (ParserConfigurationException | SAXException | IOException e) {
        logger.warn("Oops, got an error while building the document", e);
    }
    return doc;
}

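I get the error below. I am sure the local DTD works with the XML I want to validate, and I cannot find anything wrong with the DTD, yet the error still occurs. Please help.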

> java.net.MalformedURLException: no protocol: <!DOCTYPE ichicsr [
> <!ENTITY lt     "&#38;#60;">
> 
> <!-- Greater Than ">" --> <!ENTITY gt     "&#62;"> 
> 
> <!-- Ampersand "&" --> <!ENTITY amp    "&#38;#38;">

and then the entire DTD is printed, followed by this stack trace:

at java.net.URL.<init>(Unknown Source)
    at java.net.URL.<init>(Unknown Source)
    at java.net.URL.<init>(Unknown Source)
    at org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:964)
    at org.apache.xerces.impl.XMLEntityManager.startEntity(XMLEntityManager.java:902)
    at org.apache.xerces.impl.XMLEntityManager.startDTDEntity(XMLEntityManager.java:869)
    at org.apache.xerces.impl.XMLDTDScannerImpl.setInputSource(XMLDTDScannerImpl.java:241)
    at org.apache.xerces.impl.XMLDocumentScannerImpl$DTDDispatcher.dispatch(XMLDocumentScannerImpl.java:1001)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:324)
    at org.apache.xerces.parsers.XML11Configuration.parse(XML11Configuration.java:875)
    at org.apache.xerces.parsers.XML11Configuration.parse(XML11Configuration.java:798)
    at org.apache.xerces.parsers.XMLParser.parse(XMLParser.java:108)
    at org.apache.xerces.parsers.DOMParser.parse(DOMParser.java:230)
  

Note: my XML is encoded as UTF-16.

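Update: after removing "UTF-16" from the call that reads the file, it looks like the parser now tries to process the DTD, and it fails with this error: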

  

latin-entities.dtd (The system cannot find the path specified)

Does this mean the DTD is looking for another DTD that it depends on?

1 answer:

Answer 0 (score: 2)

I think the problem is here:

if (systemId.contains("xyz.com/remote.dtd")) {
    return new InputSource(FileUtils.readFileToString(
                     new File("C:\\Users\\xyz\\local.dtd")));

I assume FileUtils.readFileToString() here returns the file contents as a String.

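XMLEntityManager expects the InputSource to carry a system ID in the form of a URL. The InputSource(String) constructor treats its argument as that system ID, so passing the file contents makes the parser try to open the DTD text as a URL, hence the MalformedURLException.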
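The difference between the two constructors can be seen without running a parser at all. The following is a minimal sketch, not the asker's code: the class name SystemIdDemo, both helper methods, and the one-line DTD text are made up for illustration.

```java
import java.io.StringReader;
import org.xml.sax.InputSource;

public class SystemIdDemo {

    // Hypothetical helper mirroring the question's mistake:
    // InputSource(String) stores the string as the system ID (a URI), not as content.
    static InputSource fromTextAsSystemId(String dtdText) {
        return new InputSource(dtdText);
    }

    // Hypothetical helper showing one fix:
    // InputSource(Reader) stores the text as a character stream to be read.
    static InputSource fromTextAsStream(String dtdText) {
        return new InputSource(new StringReader(dtdText));
    }

    public static void main(String[] args) {
        String dtdText = "<!ELEMENT note (#PCDATA)>";

        InputSource wrong = fromTextAsSystemId(dtdText);
        System.out.println(wrong.getSystemId());                // the DTD text itself
        System.out.println(wrong.getCharacterStream());         // null: nothing to read

        InputSource right = fromTextAsStream(dtdText);
        System.out.println(right.getSystemId());                // null: no URI was set
        System.out.println(right.getCharacterStream() != null); // true
    }
}
```

With the first form the parser has no stream to read, so it falls back to opening the system ID as a URL, which matches the "no protocol" message in the question.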

Either create the InputSource from the file's URL:

new InputSource(new File(...).toURI().toASCIIString())

or create the InputSource from a StringReader over the file contents:

new InputSource(new StringReader(FileUtils....))
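
Put together, the first option might look like the sketch below. It is an illustration under stated assumptions rather than the asker's actual setup: the class LocalDtdDemo, the method parseWithLocalDtd, the temporary local.dtd, and the tiny note document are all invented for the example, and only JDK classes are used.

```java
import java.io.File;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class LocalDtdDemo {

    // Hypothetical helper: parses xml with DTD validation, redirecting any
    // system ID that contains remoteFragment to the given local DTD file.
    static Document parseWithLocalDtd(String xml, String remoteFragment, File localDtd)
            throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);
        DocumentBuilder builder = factory.newDocumentBuilder();

        builder.setEntityResolver((publicId, systemId) -> {
            if (systemId != null && systemId.contains(remoteFragment)) {
                // Pass a file: URL as the system ID, not the file contents.
                return new InputSource(localDtd.toURI().toASCIIString());
            }
            return null; // fall back to the parser's default resolution
        });

        // Rethrow validation errors instead of only printing them.
        builder.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) throws SAXException {
                throw e;
            }
        });

        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the real DTD: a one-element grammar in a temp directory.
        Path dir = Files.createTempDirectory("dtd-demo");
        Path dtd = dir.resolve("local.dtd");
        Files.write(dtd, "<!ELEMENT note (#PCDATA)>".getBytes(StandardCharsets.UTF_8));

        String xml = "<!DOCTYPE note SYSTEM \"http://xyz.com/remote.dtd\"><note>hi</note>";
        Document doc = parseWithLocalDtd(xml, "xyz.com/remote.dtd", dtd.toFile());
        System.out.println(doc.getDocumentElement().getTextContent()); // prints "hi"
    }
}
```

Returning a URL rather than a StringReader also gives the parser a base location for the DTD. That may matter for the latin-entities.dtd error in the update: if the local DTD pulls in that file through a relative reference, the parser would look for it next to local.dtd, so the file probably needs to sit in the same folder.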