Parsing a UTF-8 BOM XML document with newDocumentBuilder().parse()

Time: 2017-10-26 02:20:57

Tags: java xml utf-8 jaxb

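I am trying to parse a document from a URL that is encoded as UTF-8 with a BOM, but I cannot get my code to work and strip the first character so that I can use JAXB on the document. I tried: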

Document k = factory.newDocumentBuilder().parse(new URL(url).openStream());

I also tried:

// Requires Apache Commons IO: org.apache.commons.io.input.BOMInputStream, org.apache.commons.io.ByteOrderMark
String defaultEncoding = "UTF-8";
try {
    // With this constructor, BOMInputStream detects a leading UTF-8 BOM and excludes it from the stream
    BOMInputStream bOMInputStream = new BOMInputStream(new URL(url).openStream());
    ByteOrderMark bom = bOMInputStream.getBOM();
    // Fall back to UTF-8 when no BOM was found
    String charsetName = bom == null ? defaultEncoding : bom.getCharsetName();
    InputSource reader = new InputSource(new BufferedInputStream(bOMInputStream));
    reader.setEncoding(charsetName);
    System.out.println("Passed!");
    Document k = factory.newDocumentBuilder().parse(reader);
} catch (IOException | SAXException | ParserConfigurationException e) {
    // parse() can also throw SAXException, and newDocumentBuilder() ParserConfigurationException
    e.printStackTrace();
    return null;
}

It doesn't seem to work.
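
For comparison, here is a minimal sketch of stripping the BOM with the JDK alone, with no Commons IO dependency. It is not taken from the question: it assumes the document really is UTF-8, and the class name `BomXml` and the helpers `skipUtf8Bom` and `parse` are hypothetical names chosen for illustration. The idea is to peek at the first three bytes, drop them only if they are `EF BB BF`, and hand the remaining byte stream to the parser.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class; the names are illustrative, not from the question.
final class BomXml {

    /** Returns a stream positioned after a leading UTF-8 BOM (EF BB BF), if one is present. */
    static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pb = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = 0;
        // read() may return fewer bytes than requested, so loop until 3 bytes or end of stream
        while (n < 3) {
            int r = pb.read(head, n, 3 - n);
            if (r < 0) {
                break;
            }
            n += r;
        }
        boolean hasBom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        if (!hasBom && n > 0) {
            pb.unread(head, 0, n); // not a BOM: push the bytes back untouched
        }
        return pb;
    }

    /** Parses XML from a byte stream, assuming UTF-8 and tolerating a leading BOM. */
    static Document parse(InputStream in)
            throws IOException, SAXException, ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        InputSource source = new InputSource(skipUtf8Bom(in));
        source.setEncoding("UTF-8"); // assumption: the document is UTF-8
        return factory.newDocumentBuilder().parse(source);
    }

    public static void main(String[] args) throws Exception {
        // U+FEFF encoded as UTF-8 yields the three BOM bytes EF BB BF
        byte[] xml = "\uFEFF<?xml version=\"1.0\" encoding=\"UTF-8\"?><root>ok</root>"
                .getBytes(StandardCharsets.UTF_8);
        Document doc = parse(new ByteArrayInputStream(xml));
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

With the URL from the question, the call would presumably look like `BomXml.parse(new URL(url).openStream())`, ideally inside a try-with-resources block so the stream is closed. If parsing still fails after the BOM is removed, the cause may lie elsewhere, for example in the content the server actually returns.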

0 answers:

No answers