针对多个模式定义验证XML文件

时间:2009-07-07 21:09:56

标签: java xsd xerces

我正在尝试针对许多不同的模式验证XML文件(为人为的例子道歉):

  • a.xsd
  • b.xsd
  • c.xsd

c.xsd特别是导入b.xsd和b.xsd导入a.xsd,使用:

<xs:include schemaLocation="b.xsd"/>

我正试图通过Xerces以下列方式做到这一点:

XMLSchemaFactory xmlSchemaFactory = new XMLSchemaFactory();
Schema schema = xmlSchemaFactory.newSchema(new StreamSource[] { new StreamSource(this.getClass().getResourceAsStream("a.xsd"), "a.xsd"),
                                                         new StreamSource(this.getClass().getResourceAsStream("b.xsd"), "b.xsd"),
                                                         new StreamSource(this.getClass().getResourceAsStream("c.xsd"), "c.xsd")});     
Validator validator = schema.newValidator();
validator.validate(new StreamSource(new StringReader(xmlContent)));

但是这无法正确导入所有三个模式,导致无法将名称'blah'解析为(n)'组'组件。

我已经使用 Python 成功验证了这一点,但是 Java 6.0 Xerces 2.8.1 存在实际问题。任何人都可以在这里建议出现问题,或者更简单的方法来验证我的XML文档吗?

8 个答案:

答案 0 :(得分:17)

所以,如果其他人遇到同样的问题,我需要从单元测试中加载父模式(和隐式子模式) - 作为资源 - 来验证XML String。我使用Xerces XMLSchemFactory与Java 6验证器一起执行此操作。

为了通过include正确加载子模式,我必须编写自定义资源解析器。代码可以在这里找到:

https://code.google.com/p/xmlsanity/source/browse/src/com/arc90/xmlsanity/validation/ResourceResolver.java

要使用解析程序在架构工厂中指定它:

xmlSchemaFactory.setResourceResolver(new ResourceResolver());

它将使用它来通过类路径解析您的资源(在我的例子中来自src / main / resources)。欢迎任何评论......

答案 1 :(得分:6)

http://www.kdgregory.com/index.php?page=xml.parsing 部分'单个文档的多个模式'

我的解决方案基于该文档:

URL xsdUrlA = this.getClass().getResource("a.xsd");
URL xsdUrlB = this.getClass().getResource("b.xsd");
URL xsdUrlC = this.getClass().getResource("c.xsd");

SchemaFactory schemaFactory = schemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
//---
String W3C_XSD_TOP_ELEMENT =
"<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n"
   + "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\" elementFormDefault=\"qualified\">\n"
   + "<xs:include schemaLocation=\"" +xsdUrlA.getPath() +"\"/>\n"
   + "<xs:include schemaLocation=\"" +xsdUrlB.getPath() +"\"/>\n"
   + "<xs:include schemaLocation=\"" +xsdUrlC.getPath() +"\"/>\n"
   +"</xs:schema>";
Schema schema = schemaFactory.newSchema(new StreamSource(new StringReader(W3C_XSD_TOP_ELEMENT), "xsdTop"));

答案 2 :(得分:2)

Xerces中的模式是(a)非常非常迂腐,(b)当它不喜欢它发现的时候会给出完全没用的错误信息。这是一个令人沮丧的组合。

python中的架构内容可能更宽容,并且让架构中的小错误超出了未报告的范围。

现在,正如你所说,如果c.xsd包含b.xsd,而b.xsd包含a.xsd,那么就不需要将所有三个加载到模式工厂中。它不仅没有必要,它可能会混淆Xerces并导致错误,所以这可能是你的问题。只需将c.xsd传递给工厂,让它解析b.xsd和a.xsd本身,它应该相对于c.xsd做。

答案 3 :(得分:2)

来自xerces文档: http://xerces.apache.org/xerces2-j/faq-xs.html

import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

...

StreamSource[] schemaDocuments = /* created by your application */;
Source instanceDocument = /* created by your application */;

SchemaFactory sf = SchemaFactory.newInstance(
    "http://www.w3.org/XML/XMLSchema/v1.1");
Schema s = sf.newSchema(schemaDocuments);
Validator v = s.newValidator();
v.validate(instanceDocument);

答案 4 :(得分:1)

我最终使用了这个:

import org.apache.xerces.parsers.SAXParser;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;
import java.io.IOException;
 .
 .
 .
 try {
        SAXParser parser = new SAXParser();
        parser.setFeature("http://xml.org/sax/features/validation", true);
        parser.setFeature("http://apache.org/xml/features/validation/schema", true);
        parser.setFeature("http://apache.org/xml/features/validation/schema-full-checking", true);
        parser.setProperty("http://apache.org/xml/properties/schema/external-noNamespaceSchemaLocation", "http://your_url_schema_location");

        Validator handler = new Validator();
        parser.setErrorHandler(handler);
        parser.parse("file:///" + "/home/user/myfile.xml");

 } catch (SAXException e) {
    e.printStackTrace();
 } catch (IOException ex) {
    e.printStackTrace();
 }


class Validator extends DefaultHandler {
    public boolean validationError = false;
    public SAXParseException saxParseException = null;

    public void error(SAXParseException exception)
            throws SAXException {
        validationError = true;
        saxParseException = exception;
    }

    public void fatalError(SAXParseException exception)
            throws SAXException {
        validationError = true;
        saxParseException = exception;
    }

    public void warning(SAXParseException exception)
            throws SAXException {
    }
}

记得改变:

1)xsd文件位置的参数&#34; http:// your_url_schema_location&#34;

2)指向xml文件的字符串&#34; /home/user/myfile.xml"

我没有设置变量:-Djavax.xml.validation.SchemaFactory:http://www.w3.org/2001/XMLSchema=org.apache.xerces.jaxp.validation.XMLSchemaFactory

答案 5 :(得分:1)

我遇到了同样的问题,经过调查发现了这个解决方案。它对我有用。

Enum设置不同的XSDs

public enum XsdFile {
    // @formatter:off
    A("a.xsd"),
    B("b.xsd"),
    C("c.xsd");
    // @formatter:on

    private final String value;

    private XsdFile(String value) {
        this.value = value;
    }

    public String getValue() {
        return this.value;
    }
}

验证方法:

public static void validateXmlAgainstManyXsds() {
    final SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);

    String xmlFile;
    xmlFile = "example.xml";

    // Use of Enum class in order to get the different XSDs
    Source[] sources = new Source[XsdFile.class.getEnumConstants().length];
    for (XsdFile xsdFile : XsdFile.class.getEnumConstants()) {
        sources[xsdFile.ordinal()] = new StreamSource(xsdFile.getValue());
    }

    try {
        final Schema schema = schemaFactory.newSchema(sources);
        final Validator validator = schema.newValidator();
        System.out.println("Validating " + xmlFile + " against XSDs " + Arrays.toString(sources));
        validator.validate(new StreamSource(new File(xmlFile)));
    } catch (Exception exception) {
        System.out.println("ERROR: Unable to validate " + xmlFile + " against XSDs " + Arrays.toString(sources)
                + " - " + exception);
    }
    System.out.println("Validation process completed.");
}

答案 6 :(得分:1)

以防万一,仍然有人在这里找到针对多个XSD验证xml或对象的解决方案,我在这里提到它

//Using **URL** is the most important here. With URL, the relative paths are resolved for include, import inside the xsd file. Just get the parent level xsd here (not all included xsds).

URL xsdUrl = getClass().getClassLoader().getResource("my/parent/schema.xsd");

SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = schemaFactory.newSchema(xsdUrl);

JAXBContext jaxbContext = JAXBContext.newInstance(MyClass.class);
Unmarshaller unmarshaller = jaxbContext.createUnmarshaller();
unmarshaller.setSchema(schema);

/* If you need to validate object against xsd, uncomment this
ObjectFactory objectFactory = new ObjectFactory();
JAXBElement<MyClass> wrappedObject = objectFactory.createMyClassObject(myClassObject); 
marshaller.marshal(wrappedShipmentMessage, new DefaultHandler());
*/

unmarshaller.unmarshal(getClass().getClassLoader().getResource("your/xml/file.xml"));

答案 7 :(得分:0)

如果所有XSD都属于同一个名称空间,则创建一个新的XSD并将其他XSD导入其中。然后在Java中使用新的XSD创建架构。

Schema schema = xmlSchemaFactory.newSchema(
    new StreamSource(this.getClass().getResourceAsStream("/path/to/all_in_one.xsd"));

all_in_one.xsd:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
 xmlns:ex="http://example.org/schema/" 
 targetNamespace="http://example.org/schema/" 
 elementFormDefault="unqualified"
 attributeFormDefault="unqualified">

    <xs:include schemaLocation="relative/path/to/a.xsd"></xs:include>
    <xs:include schemaLocation="relative/path/to/b.xsd"></xs:include>
    <xs:include schemaLocation="relative/path/to/c.xsd"></xs:include>

</xs:schema>