VTD-XML exception: Name space qualification Exception: prefixed attribute not qualified

Time: 2012-05-10 23:55:03

Tags: vtd-xml

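I receive XML through a web service, and I use legacy code (built on dom4j) to perform some XML transformations. Loading and parsing the original XML with VTD-XML (VTDGen) works fine and throws no exception. After loading the XML into dom4j, however, I noticed that some elements' namespace declarations and attributes had been reordered. Apparently this reordering causes VTD-XML to throw the following exception: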

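Exception: Name space qualification Exception: prefixed attribute not qualified

Line number: 101 Offset: 1827

Here is the element at this line number in the original XML: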

<RR_PerformanceSite:PerformanceSite_1_4 RR_PerformanceSite:FormVersion="1.4"
    xmlns:NSF_ApplicationChecklist="http://apply.grants.gov/forms/NSF_ApplicationChecklist-V1.1"
    xmlns:NSF_CoverPage="http://apply.grants.gov/forms/NSF_CoverPage-V1.1"
    xmlns:NSF_DeviationAuthorization="http://apply.grants.gov/forms/NSF_DeviationAuthorization-V1.1"
    xmlns:NSF_Registration="http://apply.grants.gov/forms/NSF_Registration-V1.1"
    xmlns:NSF_SuggestedReviewers="http://apply.grants.gov/forms/NSF_SuggestedReviewers-V1.1"
    xmlns:PHS398_CareerDevelopmentAwardSup="http://apply.grants.gov/forms/PHS398_CareerDevelopmentAwardSup_1_1-V1.1"
    xmlns:PHS398_Checklist="http://apply.grants.gov/forms/PHS398_Checklist_1_3-V1.3"
    xmlns:PHS398_CoverPageSupplement="http://apply.grants.gov/forms/PHS398_CoverPageSupplement_1_4-V1.4"
    xmlns:PHS398_ModularBudget="http://apply.grants.gov/forms/PHS398_ModularBudget-V1.1"
    xmlns:PHS398_ResearchPlan="http://apply.grants.gov/forms/PHS398_ResearchPlan_1_3-V1.3"
    xmlns:PHS_CoverLetter="http://apply.grants.gov/forms/PHS_CoverLetter_1_2-V1.2"
    xmlns:RR_Budget="http://apply.grants.gov/forms/RR_Budget-V1.1"
    xmlns:RR_KeyPersonExpanded="http://apply.grants.gov/forms/RR_KeyPersonExpanded_1_2-V1.2"
    xmlns:RR_OtherProjectInfo="http://apply.grants.gov/forms/RR_OtherProjectInfo_1_2-V1.2"
    xmlns:RR_PerformanceSite="http://apply.grants.gov/forms/PerformanceSite_1_4-V1.4"
    xmlns:RR_PersonalData="http://apply.grants.gov/forms/RR_PersonalData-V1.1"
    xmlns:RR_SF424="http://apply.grants.gov/forms/RR_SF424_1_2-V1.2"
    xmlns:RR_SubawardBudget="http://apply.grants.gov/forms/RR_SubawardBudget-V1.2"
    xmlns:SF424C="http://apply.grants.gov/forms/SF424C-V1.0"
    xmlns:att="http://apply.grants.gov/system/Attachments-V1.0"
    xmlns:codes="http://apply.grants.gov/system/UniversalCodes-V2.0"
    xmlns:globlib="http://apply.grants.gov/system/GlobalLibrary-V2.0">

After loading into dom4j, this is the same element:

<RR_PerformanceSite:PerformanceSite_1_4
    xmlns:RR_PerformanceSite="http://apply.grants.gov/forms/PerformanceSite_1_4-V1.4"
    xmlns:NSF_ApplicationChecklist="http://apply.grants.gov/forms/NSF_ApplicationChecklist-V1.1"
    xmlns:NSF_CoverPage="http://apply.grants.gov/forms/NSF_CoverPage-V1.1"
    xmlns:NSF_DeviationAuthorization="http://apply.grants.gov/forms/NSF_DeviationAuthorization-V1.1"
    xmlns:NSF_Registration="http://apply.grants.gov/forms/NSF_Registration-V1.1"
    xmlns:NSF_SuggestedReviewers="http://apply.grants.gov/forms/NSF_SuggestedReviewers-V1.1"
    xmlns:PHS398_CareerDevelopmentAwardSup="http://apply.grants.gov/forms/PHS398_CareerDevelopmentAwardSup_1_1-V1.1"
    xmlns:PHS398_Checklist="http://apply.grants.gov/forms/PHS398_Checklist_1_3-V1.3"
    xmlns:PHS398_CoverPageSupplement="http://apply.grants.gov/forms/PHS398_CoverPageSupplement_1_4-V1.4"
    xmlns:PHS398_ModularBudget="http://apply.grants.gov/forms/PHS398_ModularBudget-V1.1"
    xmlns:PHS398_ResearchPlan="http://apply.grants.gov/forms/PHS398_ResearchPlan_1_3-V1.3"
    xmlns:PHS_CoverLetter="http://apply.grants.gov/forms/PHS_CoverLetter_1_2-V1.2"
    xmlns:RR_Budget="http://apply.grants.gov/forms/RR_Budget-V1.1"
    xmlns:RR_KeyPersonExpanded="http://apply.grants.gov/forms/RR_KeyPersonExpanded_1_2-V1.2"
    xmlns:RR_OtherProjectInfo="http://apply.grants.gov/forms/RR_OtherProjectInfo_1_2-V1.2"
    xmlns:RR_PersonalData="http://apply.grants.gov/forms/RR_PersonalData-V1.1"
    xmlns:RR_SF424="http://apply.grants.gov/forms/RR_SF424_1_2-V1.2"
    xmlns:RR_SubawardBudget="http://apply.grants.gov/forms/RR_SubawardBudget-V1.2"
    xmlns:SF424C="http://apply.grants.gov/forms/SF424C-V1.0"
    xmlns:att="http://apply.grants.gov/system/Attachments-V1.0"
    xmlns:codes="http://apply.grants.gov/system/UniversalCodes-V2.0"
    xmlns:globlib="http://apply.grants.gov/system/GlobalLibrary-V2.0"
    RR_PerformanceSite:FormVersion="1.4">

The problem is with this attribute in the new XML element (at offset 1827, at the end of the element): RR_PerformanceSite:FormVersion="1.4"

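Either of the following makes the exception go away:

1. Adding this element's RR_PerformanceSite xmlns declaration to the root element of the XML document.
2. Replacing the new element with the original element.

This SEEMS to suggest that the order of attributes and namespace declarations affects VTD when parsing.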
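The first workaround could be automated along the following lines. This is only a sketch that uses the JDK's built-in DOM parser, not dom4j or VTD-XML; the class name `HoistNamespaces` and its methods are hypothetical. It copies each prefixed xmlns declaration found anywhere in the document onto the root element, leaves the original declarations in place, and does not overwrite a prefix the root already declares.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of dom4j or VTD-XML.
class HoistNamespaces {

    // Parses the XML and copies every prefixed xmlns declaration to the root
    // element, unless the root already declares that prefix.
    static Document hoist(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Element root = doc.getDocumentElement();
        NodeList all = doc.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            NamedNodeMap attrs = all.item(i).getAttributes();
            for (int j = 0; j < attrs.getLength(); j++) {
                Attr a = (Attr) attrs.item(j);
                // Prefixed declarations look like xmlns:foo="..."; the default
                // declaration xmlns="..." has no prefix and is skipped.
                boolean prefixedDecl =
                        XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(a.getNamespaceURI())
                        && "xmlns".equals(a.getPrefix());
                if (prefixedDecl && !root.hasAttribute(a.getName())) {
                    root.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI,
                            a.getName(), a.getValue());
                }
            }
        }
        return doc;
    }

    // Serializes the document back to text so it can be handed to another parser.
    static String toXml(Document doc) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><p:child xmlns:p=\"urn:example:p\" p:v=\"1\"/></root>";
        System.out.println(toXml(hoist(xml)));
    }
}
```

Because the descendant declarations are left untouched, a prefix that is bound to different URIs in different subtrees keeps its local meaning; only the first binding encountered is copied to the root.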

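Note: I parse with the flag set to 'true' (namespace-aware) for both XML documents (the original and the post-dom4j XML). A new VTD object was also created for each XML document, original and post-dom4j.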

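I tried moving 'RR_PerformanceSite:FormVersion="1.4"' to the beginning of the element, as in the original, but that does not remove the exception. The offset in the error message differs because the attribute's position changed. Does the order of the xmlns declarations affect VTD?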

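I looked at the VTDGen source code but could not figure out why this exception is thrown.

Why can dom4j parse the new document while VTD cannot? Can anyone shed some light on this?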

1 Answer:

Answer 0: (score: 1)

It appears to be a bug in VTD-XML, related to the order of the namespace declarations.
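For context, the Namespaces in XML recommendation scopes a prefix declaration to the whole element on which it appears, including attributes written before it in the same start tag, so the position of the declaration should not matter. The sketch below checks this with the JDK's built-in namespace-aware DOM parser rather than VTD-XML. The class name `NamespaceOrderCheck`, the element name `p:Site`, and the short prefix `p` are hypothetical stand-ins for the much longer element in the question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Element;

// Hypothetical demo class; it uses only the JDK, not VTD-XML or dom4j.
class NamespaceOrderCheck {

    static final String NS = "http://apply.grants.gov/forms/PerformanceSite_1_4-V1.4";

    // Shortened stand-ins for the two layouts in the question:
    // prefixed attribute before its declaration, and after it.
    static final String ATTR_FIRST =
            "<p:Site p:FormVersion=\"1.4\" xmlns:p=\"" + NS + "\"/>";
    static final String DECL_FIRST =
            "<p:Site xmlns:p=\"" + NS + "\" p:FormVersion=\"1.4\"/>";

    // Parses the XML with namespace awareness and returns the namespace URI
    // that the p:FormVersion attribute on the root element resolves to.
    static String formVersionNamespace(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Element root = dbf.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        Attr attr = root.getAttributeNode("p:FormVersion");
        return attr == null ? null : attr.getNamespaceURI();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("attribute first:   " + formVersionNamespace(ATTR_FIRST));
        System.out.println("declaration first: " + formVersionNamespace(DECL_FIRST));
    }
}
```

Both layouts resolve the attribute to the same namespace URI here, which is consistent with dom4j accepting the reordered document.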

The bug can always be reproduced with the following Java code:

import java.nio.file.Files;
import java.nio.file.Paths;

import com.ximpleware.AutoPilot;
import com.ximpleware.VTDGen;
import com.ximpleware.VTDNav;

public class SchemaTester {

    /**
     * Parses an XML file with namespace awareness and prints every attribute.
     *
     * @param args unused
     */
    public static void main(String[] args) throws Exception {

        // XML files to test; swap in `good` below to compare the two
        String bad = "C:/Temp/VTD_bad.xml";
        String good = "C:/Temp/VTD_good.xml";

        // Read the raw bytes so the file is handed to VTD-XML unchanged
        byte[] doc = Files.readAllBytes(Paths.get(bad));

        // Instantiate VTDGen and parse with namespace awareness set to true
        VTDGen vg = new VTDGen();
        vg.setDoc(doc);
        vg.parse(true);
        VTDNav vn = vg.getNav();

        // Select every attribute of every element
        AutoPilot ap = new AutoPilot(vn);
        ap.selectXPath("//*/@*");

        int i;
        while ((i = ap.evalXPath()) != -1) {
            // i is the attribute name token, i + 1 is the attribute value token
            System.out.println("\t\tAttribute ==> " + vn.toNormalizedString(i));
            System.out.println("\t\tValue ==> " + vn.toNormalizedString(i + 1));
        }
    }
}

The OP has uploaded the XML to https://gist.github.com/2696220
