iText PDF does not extract read-only fields into the XML

Time: 2017-05-02 08:50:02

Tags: xml pdf itext

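I am using iText PDF for Java to extract XML from a PDF. It works fine, but it skips read-only fields: they do not appear in the generated XML. I use the following code to extract the XML: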

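import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
// Assumption: iText 5 package names; adjust if a different iText version is used.
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.pdf.AcroFields;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.XfaForm;

public class PDFReadExample
{
    public static void main(String[] args) throws IOException, DocumentException, TransformerException
    {
        // Expect exactly two arguments: the source PDF and the destination XML file.
        if (args.length < 2) {
            System.err.println("Usage: PDFReadExample <source.pdf> <dest.xml>");
            return;
        }
        String src = args[0];
        String dest = args[1];

        // Create the output directory if the destination path has one.
        File parent = new File(dest).getAbsoluteFile().getParentFile();
        if (parent != null) {
            parent.mkdirs();
        }
        new PDFReadExample().readXml(src, dest);
    }

    public void readXml(String src, String dest) throws IOException, DocumentException, TransformerException
    {
        PdfReader reader = new PdfReader(src);
        try {
            AcroFields form = reader.getAcroFields();
            XfaForm xfa = form.getXfa();

            // Locate the "data" child of the XFA datasets node.
            Node node = xfa.getDatasetsNode();
            NodeList list = node.getChildNodes();
            for (int i = 0; i < list.getLength(); ++i) {
                if ("data".equals(list.item(i).getLocalName())) {
                    node = list.item(i);
                    break;
                }
            }

            Transformer tf = TransformerFactory.newInstance().newTransformer();
            tf.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            tf.setOutputProperty(OutputKeys.INDENT, "yes");

            // Serialize the selected node to the destination file; the stream is closed automatically.
            try (OutputStream os = new FileOutputStream(dest)) {
                tf.transform(new DOMSource(node), new StreamResult(os));
            }
        } finally {
            reader.close();
        }
    }
}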

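I am not very familiar with PDF. The read-only fields are populated automatically based on input from other fields. How can I extract the read-only values into the XML?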
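To make clear what the loop in readXml actually selects, here is a minimal sketch that runs the same "find the data child, then serialize it" step on a hand-written XML string, using only the JDK (no iText). The class name DatasetsDataSketch, the method extractData, and the sample packet with its form1, name and total fields are all hypothetical and only for illustration. The sketch shows that the output can contain only what is stored under the data element; it does not establish why the read-only values are missing from the real PDF.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class DatasetsDataSketch {
    // Hypothetical datasets packet: "name" is stored under data,
    // while a calculated field (imagine "total") is simply not present.
    static final String SAMPLE =
        "<xfa:datasets xmlns:xfa=\"http://www.xfa.org/schema/xfa-data/1.0/\">"
      + "<xfa:data><form1><name>Alice</name></form1></xfa:data>"
      + "</xfa:datasets>";

    // Mirrors the loop in readXml: pick the child whose local name is "data"
    // and serialize that subtree to a string.
    static String extractData(String datasetsXml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // needed so getLocalName() is not null
            Node node = dbf.newDocumentBuilder()
                .parse(new ByteArrayInputStream(
                    datasetsXml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();

            NodeList list = node.getChildNodes();
            for (int i = 0; i < list.getLength(); ++i) {
                if ("data".equals(list.item(i).getLocalName())) {
                    node = list.item(i);
                    break;
                }
            }

            Transformer tf = TransformerFactory.newInstance().newTransformer();
            tf.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            tf.transform(new DOMSource(node), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not extract data node", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(extractData(SAMPLE));
    }
}
```

If a value is missing from the real output, a reasonable first check is whether it is present anywhere under the datasets node at all, for example by serializing the whole node returned by getDatasetsNode() instead of only its data child.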

0 answers:

No answers