Can iTextSharp convert a PDF document to PDF/A?

Time: 2010-03-02 21:19:37

Tags: pdf itextsharp pdfa

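I could not find in the FAQ whether the API offers this functionality, although the book mentions it as something that might be available. Does anyone have experience implementing this?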

2 answers:

Answer 0 (score: 7)

In this thread (June 2007), Paulo Soares posted code that demonstrates PDF/A support. Here is the C# code (he also has a Java example):

// Requires: using System.IO; using iTextSharp.text; using iTextSharp.text.pdf;
private void PdfATest() {
    Document doc = new Document(PageSize.A4);
    PdfWriter writer = PdfWriter.GetInstance(doc, new FileStream("C:\\hello_A1-b_cs.pdf", FileMode.Create));
    // Request PDF/A-1b conformance before opening the document.
    writer.PDFXConformance = PdfWriter.PDFA1B;
    doc.Open();

    // Build the output intent dictionary that describes the target color space (sRGB).
    PdfDictionary outi = new PdfDictionary(PdfName.OUTPUTINTENT);
    outi.Put(PdfName.OUTPUTCONDITIONIDENTIFIER, new PdfString("sRGB IEC61966-2.1"));
    outi.Put(PdfName.INFO, new PdfString("sRGB IEC61966-2.1"));
    outi.Put(PdfName.S, PdfName.GTS_PDFA1);

    // get this file here: http://old.nabble.com/attachment/10971467/0/srgb.profile
    // Load the ICC profile and reference it from the output intent.
    ICC_Profile icc = ICC_Profile.GetInstance("c:\\srgb.profile");
    PdfICCBased ib = new PdfICCBased(icc);
    ib.Remove(PdfName.ALTERNATE);
    outi.Put(PdfName.DESTOUTPUTPROFILE, writer.AddToBody(ib).IndirectReference);

    // Register the output intent in the document catalog.
    writer.ExtraCatalog.Put(PdfName.OUTPUTINTENTS, new PdfArray(outi));

    // Create the font with embedding enabled (third argument is true).
    BaseFont bf = BaseFont.CreateFont("c:\\windows\\fonts\\arial.ttf", BaseFont.WINANSI, true);
    Font f = new iTextSharp.text.Font(bf, 12);
    doc.Add(new Paragraph("hello", f));

    // Write the XMP metadata before closing the document.
    writer.CreateXmpMetadata();

    doc.Close();
}

The link above includes a download for the ICC profile file.

Answer 1 (score: 5)

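Here is how I parse an HTML file and create a PDF/A archive document from it. It also embeds the font by using a stylesheet (to avoid the error "All the fonts must be embedded. This one isn't: Helvetica").

Hope this helps someone.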

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using iTextSharp.text.pdf;
using iTextSharp.text;
using System.IO;
using iTextSharp.text.html.simpleparser;

namespace SaveAsPDF
{
    class HtmlPdfConverter
    {
        public void RendererWebForm2PDFArchive(string fileName)
        {
            Console.WriteLine("Parsing HTML " + fileName);
            Document document = new Document(PageSize.A4);

            try
            {
                // create a writer that listens to the document and writes the PDF output to a file
                PdfWriter writer = PdfWriter.GetInstance(document, new FileStream(fileName + ".pdf", FileMode.Create));

                //mark the document as a PDF/A-1a archive document
                writer.PDFXConformance = PdfWriter.PDFA1A;
                document.Open();

                //apply stylesheet to change font (and embed it)
                StyleSheet styles = new StyleSheet();
                FontFactory.Register("c:\\windows\\fonts\\verdana.ttf");
                styles.LoadTagStyle("body", "face", "Verdana");

                //prepare html
                string html = File.ReadAllText(fileName, Encoding.Default);
                html = RemoveTag(html, "<title>", "</title>");

                //convert string to stream
                byte[] byteArray = Encoding.UTF8.GetBytes(html);
                MemoryStream ms = new MemoryStream(byteArray);

                //parse html
                HTMLWorker htmlWorker = new HTMLWorker(document);
                System.Collections.Generic.List<IElement> elements;

                elements = HTMLWorker.ParseToList(new StreamReader(ms), styles);
                foreach (IElement item in elements)
                {
                    document.Add(item);
                }

                writer.CreateXmpMetadata();
                document.Close();
                Console.WriteLine("Done");
            }
            catch (Exception e)
            {
                Console.Error.WriteLine(e.Message);
                Console.Error.WriteLine(e.StackTrace);
            }
        }