iTextSharp 5.5.6 PdfCopy fails with "Cannot access a closed file"

Time: 2015-05-22 18:04:01

Tags: itextsharp itext

This seems similar to this question: Merging Tagged PDF without ruining the tags

I'm using the latest iTextSharp NuGet package (v5.5.6) to try to merge two tagged PDFs. When calling Document.Close(), I get an ObjectDisposedException originating from PdfCopy.FlushIndirectObjects().

   at System.IO.__Error.FileNotOpen()
   at System.IO.FileStream.get_Position()
   at iTextSharp.text.io.RAFRandomAccessSource.Get(Int64 position, Byte[] bytes, Int32 off, Int32 len) in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\io\RAFRandomAccessSource.cs:line 96
   at iTextSharp.text.io.IndependentRandomAccessSource.Get(Int64 position, Byte[] bytes, Int32 off, Int32 len) in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\io\IndependentRandomAccessSource.cs:line 76
   at iTextSharp.text.pdf.RandomAccessFileOrArray.Read(Byte[] b, Int32 off, Int32 len) in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\RandomAccessFileOrArray.cs:line 235
   at iTextSharp.text.pdf.RandomAccessFileOrArray.ReadFully(Byte[] b, Int32 off, Int32 len) in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\RandomAccessFileOrArray.cs:line 264
   at iTextSharp.text.pdf.RandomAccessFileOrArray.ReadFully(Byte[] b) in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\RandomAccessFileOrArray.cs:line 254
   at iTextSharp.text.pdf.PdfReader.GetStreamBytesRaw(PRStream stream, RandomAccessFileOrArray file) in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\PdfReader.cs:line 2406
   at iTextSharp.text.pdf.PdfReader.GetStreamBytesRaw(PRStream stream) in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\PdfReader.cs:line 2443
   at iTextSharp.text.pdf.PRStream.ToPdf(PdfWriter writer, Stream os) in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\PRStream.cs:line 224
   at iTextSharp.text.pdf.PdfIndirectObject.WriteTo(Stream os) in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\PdfIndirectObject.cs:line 157
   at iTextSharp.text.pdf.PdfWriter.PdfBody.Write(PdfIndirectObject indirect, Int32 refNumber, Int32 generation) in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\PdfWriter.cs:line 389
   at iTextSharp.text.pdf.PdfWriter.PdfBody.Add(PdfObject objecta, Int32 refNumber, Int32 generation, Boolean inObjStm) in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\PdfWriter.cs:line 379
   at iTextSharp.text.pdf.PdfCopy.WriteObjectToBody(PdfIndirectObject objecta) in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\PdfCopy.cs:line 1238
   at iTextSharp.text.pdf.PdfCopy.FlushIndirectObjects() in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\PdfCopy.cs:line 1186
   at iTextSharp.text.pdf.PdfCopy.FlushTaggedObjects() in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\PdfCopy.cs:line 884
   at iTextSharp.text.pdf.PdfDocument.Close() in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\PdfDocument.cs:line 825

Looking at the 5.5.6 source code branch, line 96 of RAFRandomAccessSource.cs looks like the culprit.

Here is the code that produces the exception. If I do not call copy.SetTagged() and do not pass true as the third argument to GetImportedPage(), the code runs without an exception but ignores all tagging.

using(var ms = new MemoryStream())
{
    var doc = new Document();
    var copy = new PdfSmartCopy(doc, ms);
    copy.SetTagged();
    doc.Open();

    string[] files = new string[]{@"d:\tagged.pdf", @"d:\tagged.pdf"};
    foreach(var f in files)
    {
        var reader = new PdfReader(f);
        int pages = reader.NumberOfPages;
        for(int i = 0; i < pages;)
            copy.AddPage(copy.GetImportedPage(reader, ++i, true));
        copy.FreeReader(reader);
        reader.Close();
    }

    // ObjectDisposedException
    doc.Close();

    ms.Flush();
    File.WriteAllBytes(@"d:\pdf.merged.v5.pdf", ms.ToArray());
}

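Line 96 of RAFRandomAccessSource.cs is the raf.Position check in this method:

public virtual int Get(long position, byte[] bytes, int off, int len) {
    if (position > length)
        return -1;

    // Not thread safe!
    if (raf.Position != position)

raf has already been disposed by that point, but I can't tell where it is being disposed.

I'm hoping there is just something more I need to do, beyond calling copy.SetTagged() and passing true to GetImportedPage(), to fix the problem.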

1 answer:

Answer 0 (score: 2)

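You are closing the PdfReader instances too early. You may only call:

reader.Close();

after you have closed the PdfSmartCopy instance, so you have to rethink where you create the different PdfReader objects (currently inside the loop) and where you close them.

The reason the different PdfReader instances have to stay open is purely technical: merging structure trees (where all the tagging information is stored) is not trivial. It can only happen once all the other work is done, and it needs access to the original structure of each separate document. If you close the PdfReader for such a document, that structure can no longer be retrieved.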
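
The advice above can be sketched roughly as follows: keep every PdfReader in a list and close them only after doc.Close() has run. This is a minimal sketch, not a tested fix. The TaggedMerge class, the Merge method and its parameters are hypothetical names introduced for illustration; the iTextSharp calls are the ones already used in the question. The copy.FreeReader(reader) call from the question is left out here, on the assumption that the readers should stay untouched until the copy is closed.

```csharp
using System.Collections.Generic;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

// Hypothetical helper class, named here for illustration only.
static class TaggedMerge
{
    public static void Merge(IEnumerable<string> files, string output)
    {
        // Readers must outlive doc.Close(), so collect them instead of
        // closing each one inside the loop.
        var readers = new List<PdfReader>();
        try
        {
            using (var ms = new MemoryStream())
            {
                var doc = new Document();
                var copy = new PdfSmartCopy(doc, ms);
                copy.SetTagged();
                doc.Open();

                foreach (var f in files)
                {
                    var reader = new PdfReader(f);
                    readers.Add(reader);
                    for (int i = 1; i <= reader.NumberOfPages; i++)
                        copy.AddPage(copy.GetImportedPage(reader, i, true));
                }

                // The structure trees are merged here, which still reads
                // from the source documents.
                doc.Close();

                File.WriteAllBytes(output, ms.ToArray());
            }
        }
        finally
        {
            // Only now is it safe to release the source documents.
            foreach (var r in readers)
                r.Close();
        }
    }
}
```

With the question's inputs this would be called as TaggedMerge.Merge(new[] { @"d:\tagged.pdf", @"d:\tagged.pdf" }, @"d:\pdf.merged.v5.pdf"). The try/finally is a design choice so the readers are still released if the merge throws.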