Question

I am looking for a way to get an image of the first page of a PDF file using C#. Any solutions?

Answer 1

iTextSharp should handle this. Pull out the first image.

Example: http://www.vbforums.com/showthread.php?t=530736

Edit:

stanav

Code copied from the thread:

' Needs Imports System.Collections.Generic, System.Drawing and System.Windows.Forms at the top of the file.
Public Shared Function ExtractImages(ByVal sourcePdf As String) As List(Of Image)
    Dim imgList As New List(Of Image)

    Dim raf As iTextSharp.text.pdf.RandomAccessFileOrArray = Nothing
    Dim reader As iTextSharp.text.pdf.PdfReader = Nothing
    Dim pdfObj As iTextSharp.text.pdf.PdfObject = Nothing
    Dim pdfStream As iTextSharp.text.pdf.PdfStream = Nothing

    Try
        raf = New iTextSharp.text.pdf.RandomAccessFileOrArray(sourcePdf)
        reader = New iTextSharp.text.pdf.PdfReader(raf, Nothing)

        'Walk every object in the cross-reference table and keep the image streams
        For i As Integer = 0 To reader.XrefSize - 1
            pdfObj = reader.GetPdfObject(i)
            If Not IsNothing(pdfObj) AndAlso pdfObj.IsStream() Then
                pdfStream = DirectCast(pdfObj, iTextSharp.text.pdf.PdfStream)
                Dim subtype As iTextSharp.text.pdf.PdfObject = pdfStream.Get(iTextSharp.text.pdf.PdfName.SUBTYPE)
                If Not IsNothing(subtype) AndAlso subtype.ToString = iTextSharp.text.pdf.PdfName.IMAGE.ToString Then
                    Dim bytes() As Byte = iTextSharp.text.pdf.PdfReader.GetStreamBytesRaw(CType(pdfStream, iTextSharp.text.pdf.PRStream))
                    If Not IsNothing(bytes) Then
                        Try
                            Using memStream As New System.IO.MemoryStream(bytes)
                                memStream.Position = 0
                                Using img As Image = Image.FromStream(memStream)
                                    'Copy into a Bitmap so the image stays usable after the stream is disposed
                                    imgList.Add(New Bitmap(img))
                                End Using
                            End Using
                        Catch ex As Exception
                            'Most likely the image is in an unsupported format
                            'Do nothing
                            'You can add your own code to handle this exception if you want to
                        End Try
                    End If
                End If
            End If
        Next
    Catch ex As Exception
        MessageBox.Show(ex.Message)
    Finally
        'Close the reader even if extraction failed part-way through
        If reader IsNot Nothing Then reader.Close()
    End Try
    Return imgList
End Function
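
Since the question asks for C#, the VB.NET function above could be ported roughly as follows. This is an untested sketch, not a definitive implementation: it assumes iTextSharp 5.x and System.Drawing are referenced, and the class name PdfImageExtractor is made up for illustration. Like the VB.NET version, it scans every object in the file rather than only page one, and it only returns images whose raw stream bytes GDI+ can decode directly (JPEG streams, for instance).

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;
using System.IO;
using iTextSharp.text.pdf;

// Hypothetical class name, used only for this example.
public static class PdfImageExtractor
{
    public static List<Image> ExtractImages(string sourcePdf)
    {
        var images = new List<Image>();
        PdfReader reader = null;
        try
        {
            // Same constructor the VB.NET code uses; null means no owner password.
            reader = new PdfReader(new RandomAccessFileOrArray(sourcePdf), (byte[])null);

            // Walk every object in the cross-reference table.
            for (int i = 0; i < reader.XrefSize; i++)
            {
                PdfObject obj = reader.GetPdfObject(i);
                if (obj == null || !obj.IsStream()) continue;

                var stream = (PdfStream)obj;
                PdfObject subtype = stream.Get(PdfName.SUBTYPE);
                if (subtype == null || subtype.ToString() != PdfName.IMAGE.ToString()) continue;

                byte[] bytes = PdfReader.GetStreamBytesRaw((PRStream)stream);
                if (bytes == null) continue;

                try
                {
                    using (var ms = new MemoryStream(bytes))
                    using (var img = Image.FromStream(ms))
                    {
                        // Copy so the image stays valid after the stream is disposed.
                        images.Add(new Bitmap(img));
                    }
                }
                catch (ArgumentException)
                {
                    // The raw bytes are not in a format GDI+ can decode; skip this image.
                }
            }
        }
        finally
        {
            if (reader != null) reader.Close();
        }
        return images;
    }
}
```

Unlike the VB.NET version, this sketch lets file and parsing errors propagate to the caller instead of showing a message box, which is usually easier to handle outside a WinForms application.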

Answer 2

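You are probably trying to rasterize a page of the PDF. If you search for getting images and the like, you will mostly turn up other operations that can be performed on a PDF. A list of ways to do this has already been posted. I have used ABCpdf to do this very easily.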

Answer 3

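Are you in a web or a native environment? It makes a huge difference. What you want is to rasterize the PDF into an image. In a native environment this is easily done with GhostDoc or similar tools, which all use a virtual printer driver to rasterize the PDF. That approach will not work in a web environment, where you will probably need a commercial product, because writing your own rasterization engine is a huge task.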
