Language recommendation for XML checking

Time: 2011-05-08 21:33:07

Tags: xml validation

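I want to build an XML document validator: a program that walks an XML document and checks for attribute duplication and consistency against a defined schema (not whether the XML conforms to the standard, but whether the attributes comply with specific rules).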

I have experience with:

  • Java
  • Perl
  • Groovy
  • C#
  • C

Which language/library/extension would you recommend for this kind of task?

Thanks in advance

2 answers:

Answer 0: (score: 0)

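I'd go with libxml2 or one of its bindings in whatever language you like. How to validate a specific document depends on the XML dialect it uses. There are currently three common validation mechanisms: DTD, RelaxNG and XML Schema, and every self-respecting dialect ships at least one of them with its specification.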

The following is a C version that validates a MathML document using RelaxNG:

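#include <err.h>
#include <libxml/parser.h>
#include <libxml/relaxng.h>

/* xmlRelaxNGNewParserCtxt() takes a const char *, not a const xmlChar * */
static const char
mml_rng_uri[] = "http://www.w3.org/Math/RelaxNG/mathml3/mathml3.rng";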

/**
 * @brief Validate the MathML document located at the given URI.
 */
/*
 * -- Implementation notes --
 *
 * The goal is xmlRelaxNGValidateDoc.
 * For that we need an xmlDocPtr for the document and an xmlRelaxNGValidCtxtPtr
 * for the RelaxNG schema.
 * Given a uri we can use xmlCtxtReadFile for the document.
 * We will also need a validation schema, which is always the result of a
 * RelaxNG parse operation.
 * The parse operation requires a parser context obtained from either
 * xmlRelaxNGNewParserCtxt, which takes a URI, or xmlRelaxNGNewMemParserCtxt,
 * which takes a pointer and size.
 *
 * -- Short hand --
 * xmlRelaxNGValidateDoc()
 *   |
 *   |- xmlDocPtr = xmlCtxtReadFile()
 *   |  |
 *   |  |- xmlParserCtxtPtr = xmlNewParserCtxt()
 *   |
 *   |- xmlRelaxNGValidCtxtPtr = xmlRelaxNGNewValidCtxt()
 *   |  |
 *   |  |- xmlRelaxNGPtr = xmlRelaxNGParse()
 *   |  |  |
 *   |  |  |- xmlRelaxNGParserCtxtPtr = xmlRelaxNGNewParserCtxt()
 *   |  |  |- xmlRelaxNGParserCtxtPtr = xmlRelaxNGNewMemParserCtxt()
 */
int MML_validate(const char *uri)
{
    xmlDocPtr doc;
    xmlParserCtxtPtr docparser;
    xmlRelaxNGValidCtxtPtr validator;
    xmlRelaxNGPtr schema;
    xmlRelaxNGParserCtxtPtr rngparser;
    int retval;

    /* RelaxNG schema setup */
    if( (rngparser = xmlRelaxNGNewParserCtxt(mml_rng_uri)) == NULL )
        errx(1, "Failed to create a RelaxNG parser context");

    if( (schema = xmlRelaxNGParse(rngparser)) == NULL )
        errx(1, "Failed to parse MathML RelaxNG schema");
    if( (validator = xmlRelaxNGNewValidCtxt(schema)) == NULL )
        errx(1, "Failed to create a RelaxNG validator");

    /* MathML document setup */
    if( (docparser = xmlNewParserCtxt()) == NULL )
        errx(1, "Failed to create a document parser");
    if( (doc = xmlCtxtReadFile(docparser, uri, NULL, XML_PARSE_XINCLUDE)) ==
            NULL )
        errx(1, "Failed to parse document at %s", uri);

    /* Validation */
    retval = xmlRelaxNGValidateDoc(validator, doc);

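    /* Clean up: release the document and its parser as well as the schema */
    xmlFreeDoc(doc);
    xmlFreeParserCtxt(docparser);
    xmlRelaxNGFreeValidCtxt(validator);
    xmlRelaxNGFreeParserCtxt(rngparser);
    xmlRelaxNGFree(schema);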

    return(retval);
}
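
The value returned by MML_validate is whatever xmlRelaxNGValidateDoc returns. According to the libxml2 documentation that is 0 when the document is valid, a positive error code when it is not, and -1 on an internal or API error. A minimal sketch of turning that into a message follows; the name validation_verdict is my own and is not part of libxml2.

```c
/*
 * Hypothetical helper, not part of libxml2: maps the return value of
 * xmlRelaxNGValidateDoc() (and therefore of MML_validate) to a short
 * human-readable verdict.
 */
const char *validation_verdict(int retval)
{
    if (retval == 0)
        return "valid";          /* document matches the schema */
    if (retval > 0)
        return "invalid";        /* positive value is an error code */
    return "internal error";     /* -1: internal or API error */
}
```

A caller could then print validation_verdict(MML_validate(uri)) instead of inspecting the raw integer.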

Answer 1: (score: 0)

The requirements statement is too short to say anything definitive, but this sounds like a Schematron problem to me.
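
If a full rule language turns out to be more than is needed, a single rule such as "no attribute value may occur twice" can also be checked by hand once the values have been collected from the parsed document. The sketch below assumes the values are already in an array of C strings (how they get there, for example through your XML parser's attribute accessor, is left out), and the function name find_duplicate_value is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of C string pointers. */
static int cmp_str(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/*
 * Hypothetical helper: looks for a value that occurs more than once in
 * values[0..n-1]. Returns 1 and stores one duplicated value in *dup,
 * 0 if all values are distinct, -1 if memory allocation fails.
 * The input array itself is not modified; a sorted copy of the
 * pointers is used so that equal values end up next to each other.
 */
int find_duplicate_value(const char *const *values, size_t n,
                         const char **dup)
{
    const char **sorted;
    size_t i;
    int found = 0;

    if (n < 2)
        return 0;
    if ((sorted = malloc(n * sizeof *sorted)) == NULL)
        return -1;
    memcpy(sorted, values, n * sizeof *sorted);
    qsort(sorted, n, sizeof *sorted, cmp_str);

    for (i = 1; i < n && !found; i++) {
        if (strcmp(sorted[i - 1], sorted[i]) == 0) {
            *dup = sorted[i];   /* points at the caller's string */
            found = 1;
        }
    }
    free(sorted);
    return found;
}
```

Sorting keeps the check at O(n log n); for rules that relate several attributes to each other, a declarative Schematron rule set is likely easier to maintain than hand-written checks like this one.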