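Error converting docx to PDF in Java

Date: 2014-02-12 20:13:31

Tags: java apache-poi docx4j

Good afternoon,

Here is my situation: I am generating a docx document by joining two other docx files, that is, I am merging them.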

public static void main(String[] args) throws Exception {
    InputStream in1 = new FileInputStream(new File("C:\\Clientes\\Constremac\\Repositorio_DOCS\\UPLOAD\\LAYOUT_PAGINA_VERSAO_FINAL.docx"));
    InputStream in2 = new FileInputStream(new File("C:\\Clientes\\Constremac\\Repositorio_DOCS\\UPLOAD\\modeloContratoSocial.docx"));
    OutputStream out = new FileOutputStream(new File("C:\\Clientes\\Constremac\\Repositorio_DOCS\\UPLOAD\\modeloContratoSocialMerge.docx"));
    mergeDocx(in1, in2, out);
}

public static void mergeDocx(InputStream s1, InputStream s2, OutputStream os) throws Exception {
    // Load the first document and attach the second one to it as an altChunk.
    WordprocessingMLPackage target = WordprocessingMLPackage.load(s1);
    insertDocx(target.getMainDocumentPart(), IOUtils.toByteArray(s2));
    SaveToZipFile saver = new SaveToZipFile(target);
    saver.save(os);
}

private static void insertDocx(MainDocumentPart main, byte[] bytes) throws Exception {
    // `chunk` (a counter) and CONTENT_TYPE are class fields that are not shown in this post.
    AlternativeFormatInputPart afiPart = new AlternativeFormatInputPart(new PartName("/part" + (chunk++) + ".docx"));
    afiPart.setContentType(new ContentType(CONTENT_TYPE));
    afiPart.setBinaryData(bytes);
    Relationship altChunkRel = main.addTargetPart(afiPart);
    //convertAltChunks()
    // Reference the embedded part from the document body through its relationship id.
    CTAltChunk altChunk = Context.getWmlObjectFactory().createCTAltChunk();
    altChunk.setId(altChunkRel.getId());

    main.addObject(altChunk);
}

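My final document (docx) is fine and I can open it normally. The problem appears when I convert this generated file to PDF, which fails with the following error: NOT IMPLEMENTED: support for w:altChunk - .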

public boolean createPDF(String nomeArquivo) {
    try {
        long start = System.currentTimeMillis();
        Configuration confg = new Configuration();

        System.out.println(Configuration.repositorioUpload + nomeArquivo + ".docx");
        InputStream is = new FileInputStream(new File(Configuration.repositorioUpload + nomeArquivo + ".docx"));
        WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(is);

        PdfSettings pdfSettings = new PdfSettings();

        OutputStream out = new FileOutputStream(new File(Configuration.repositorioUpload + nomeArquivo + ".pdf"));
        PdfConversion converter = new Conversion(wordMLPackage);
        converter.output(out, pdfSettings);

        System.err.println("Generate " + Configuration.repositorioUpload + nomeArquivo + ".pdf" + " with " + (
                System.currentTimeMillis() - start) + "ms");
        // Report success only when the conversion finished without throwing.
        return true;
    }
    catch (Throwable e) {
        e.printStackTrace();
    }
    return false;
}
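
The error message points at a w:altChunk element that is still present in the document handed to the PDF converter. As a minimal sketch (not part of the original post, and using only the JDK rather than docx4j), the following counts w:altChunk elements in the XML of word/document.xml. The class name AltChunkCheck and the method countAltChunks are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class AltChunkCheck {
    // Main WordprocessingML namespace, to which w:altChunk belongs.
    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Counts w:altChunk elements in the given XML (the content of word/document.xml). */
    static int countAltChunks(String documentXml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // required for namespace-based lookups
            Document doc = dbf.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(documentXml.getBytes(StandardCharsets.UTF_8)));
            return doc.getElementsByTagNameNS(W_NS, "altChunk").getLength();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document XML", e);
        }
    }

    public static void main(String[] args) {
        // Hand-written sample; a real document.xml is much larger.
        String xml = "<w:document xmlns:w=\"" + W_NS + "\" "
                + "xmlns:r=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships\">"
                + "<w:body><w:p/><w:altChunk r:id=\"rId5\"/></w:body></w:document>";
        System.out.println("altChunk elements: " + countAltChunks(xml));
    }
}
```

For a real file, the XML can be read from the word/document.xml entry of the docx, which is a ZIP archive. A non-zero count means the chunk has not yet been turned into ordinary content.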

I am posting the Java code I use. I have been trying to generate this PDF for some time, and I would be grateful if anyone could help.

Thank you all.

Hugs!


I found a way to use altChunk, but when exporting to PDF the merge still does not work correctly, and the header and footer images do not appear.

public static void main(String[] args) throws Exception {

    boolean ADD_TO_HEADER = true;
    HeaderPart hp = null;

    String inputfilepath = "C:\\Clientes\\Constremac\\Repositorio_DOCS\\UPLOAD\\default_template.xml";

    String chunkPath = "C:\\Clientes\\Constremac\\Repositorio_DOCS\\UPLOAD\\sample.docx";

    boolean save = true;
    String outputfilepath =  "C:\\Clientes\\Constremac\\Repositorio_DOCS\\UPLOAD\\altChunk_out.docx";


    // Open a document from the file system
    // 1. Load the Package
    WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(new java.io.File(inputfilepath));
    //proce
    MainDocumentPart main = wordMLPackage.getMainDocumentPart();

    if (ADD_TO_HEADER) {
        hp = wordMLPackage.getDocumentModel().getSections().get(0).getHeaderFooterPolicy().getDefaultHeader();
    }

    AlternativeFormatInputPart afiPart = new AlternativeFormatInputPart(new PartName("/chunk.docx"));
    afiPart.setBinaryData(new FileInputStream(chunkPath));

    afiPart.setContentType(new ContentType("application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml")); //docx
    //afiPart.setContentType(new ContentType("application/xhtml+xml")); //xhtml

    Relationship altChunkRel = null;
    if (ADD_TO_HEADER) {
        altChunkRel = hp.addTargetPart(afiPart);
    } else {
        altChunkRel = main.addTargetPart(afiPart);          
    }

    CTAltChunk ac = Context.getWmlObjectFactory().createCTAltChunk();
    ac.setId(altChunkRel.getId());

    if (ADD_TO_HEADER) {
        hp.getJaxbElement().getEGBlockLevelElts().add(ac);
    } else {
        main.addObject(ac);
    }

    // Save it

    if (save) {     
        SaveToZipFile saver = new SaveToZipFile(wordMLPackage);
        saver.save(outputfilepath);
        System.out.println("Saved " + outputfilepath);
    }
}

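What am I doing wrong?

2 answers:

Answer 0: (score: 2)

altChunk is not "real" docx content.

Before the document can be output as PDF, the altChunk needs to be replaced with normal WordML paragraphs, tables, and so on.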

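You can try to do this yourself; it is easy if the content does not contain any relationships (images, hyperlinks, etc.) or conflicting styles or numbering. For more, see http://www.docx4java.org/blog/2010/11/merging-word-documents/ or my company's website, plutext.com.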
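
For the easy case described above (no images, hyperlinks, or conflicting styles and numbering), a rough sketch of the idea is to copy the block-level children of the second document's w:body into the first document's w:body instead of embedding an altChunk. The example below uses only the JDK DOM API on the raw document.xml strings; it is not docx4j's merge implementation, and SimpleBodyMerge, merge, and body are hypothetical names for illustration. It assumes each input has a single w:body.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class SimpleBodyMerge {
    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document XML", e);
        }
    }

    /** Returns the first w:body element of the document. */
    static Element body(Document doc) {
        return (Element) doc.getElementsByTagNameNS(W_NS, "body").item(0);
    }

    static boolean isSectPr(Node n) {
        return n.getNodeType() == Node.ELEMENT_NODE
                && W_NS.equals(n.getNamespaceURI())
                && "sectPr".equals(n.getLocalName());
    }

    /** Copies the source body content into the target body, keeping the target's w:sectPr last. */
    static Document merge(String targetXml, String sourceXml) {
        Document target = parse(targetXml);
        Element targetBody = body(target);
        Element sourceBody = body(parse(sourceXml));

        // Locate the target's body-level section properties, if any.
        Node sectPr = null;
        for (Node n = targetBody.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (isSectPr(n)) sectPr = n;
        }
        // Import every source child except its own sectPr; a null reference node means "append".
        for (Node n = sourceBody.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (isSectPr(n)) continue;
            targetBody.insertBefore(target.importNode(n, true), sectPr);
        }
        return target;
    }

    public static void main(String[] args) {
        String a = "<w:document xmlns:w=\"" + W_NS + "\"><w:body><w:p/><w:sectPr/></w:body></w:document>";
        String b = "<w:document xmlns:w=\"" + W_NS + "\"><w:body><w:p/><w:sectPr/></w:body></w:document>";
        Document merged = merge(a, b);
        System.out.println("paragraphs after merge: "
                + merged.getElementsByTagNameNS(W_NS, "p").getLength());
    }
}
```

This deliberately ignores relationships, styles, and numbering, which is exactly where a real merge becomes hard; content that references images or hyperlinks would end up with dangling relationship ids.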

Answer 1: (score: 0)

This can be solved.

altChunk is not "real" docx content.

Using Java, we can convert the altChunk back into the original Word markup by converting the document.xml in the docx:

Docx4jProperties.setProperty("docx4j.Convert.Out.HTML.OutputMethodXML", true);
Docx4J.toHTML(htmlSettings, os, Docx4J.FLAG_EXPORT_PREFER_XSL);

Follow the link for the complete code.

Convert AltChunk to original content: https://kishankichi.wordpress.com/2016/05/26/convert-altchunk-to-original-content-or-convert-to-real-docx-format-using-java/

Note:

Please ignore &nbsp; and other such entities in the HTML content. I only checked &nbsp;.
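
One possible reading of this note is that HTML named entities such as &nbsp; are not predefined in plain XML, so they may need to be rewritten before the HTML is processed as XML. The sketch below, which is an assumption and not the code from the linked post, replaces &nbsp; with the equivalent numeric character reference. EntityCleanup and replaceNbsp are hypothetical names, and other named entities would need similar handling.

```java
class EntityCleanup {
    /** Replaces the HTML entity &nbsp; with the numeric reference &#160;, which XML parsers accept. */
    static String replaceNbsp(String html) {
        return html.replace("&nbsp;", "&#160;");
    }

    public static void main(String[] args) {
        System.out.println(replaceNbsp("<p>first&nbsp;second</p>"));
    }
}
```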

Thanks for the reply...