How to extract PNG images from a PDF file using Python

Time: 2018-09-18 09:47:12

Tags: python image-processing

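After going through many Stack Overflow questions, I finally worked out a way to extract images from a PDF. However, it only works for images in JPG/JPEG format, not for PNG images inside the PDF.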

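# content3 is assumed to already hold the raw bytes of the PDF file.
startmark = b"\xff\xd8"  # JPEG start-of-image marker
startfix = 0
endmark = b"\xff\xd9"    # JPEG end-of-image marker
endfix = 2               # include the two end-marker bytes in the slice
i = 0
njpg = 0

while True:
    istream = content3.find(b"stream", i)
    if istream < 0:
        break  # no further streams in the file
    istart = content3.find(startmark, istream, istream + 20)
    if istart < 0:
        # This stream does not begin with a JPEG; move on to the next one.
        i = istream + 20
        continue
    iend = content3.find(b"endstream", istart)
    if iend < 0:
        raise Exception("Didn't find end of stream!")
    iend = content3.find(endmark, iend - 20)
    if iend < 0:
        raise Exception("Didn't find end of JPG!")
    istart += startfix
    iend += endfix
    print("JPG %d from %d to %d" % (njpg, istart, iend))
    jpg = content3[istart:iend]  # the JPEG as a bytes object
    njpg += 1
    i = iend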

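How can I adapt this for PNG files? When I extract a PNG file, I can see \x89PNG as the start marker and \xaeB`\x82 as the end marker. But I get an error while reading the stream: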

Exception: Didn't find end of stream!
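One possible cause, offered as an assumption rather than a diagnosis: if the search for the start marker returns -1 and that value is then passed on as the start position of the `endstream` search, `bytes.find` interprets -1 as "start at the last byte", so the search cannot match. The fragment below reproduces that behaviour on a made-up byte string (`sample` is hypothetical, not real PDF data).

```python
# Hypothetical stand-in for a PDF fragment whose stream holds a PNG, not a JPEG.
sample = b"<< /Length 8 >>\nstream\n\x89PNG\r\n\x1a\n...\nendstream\nendobj\n"

# The JPEG start marker is absent, so find() reports -1.
jpeg_start = sample.find(b"\xff\xd8")

# A start offset of -1 is treated like a negative slice index: only the
# final byte is searched, so "endstream" is never found.
end_from_miss = sample.find(b"endstream", jpeg_start)

# Searching from offset 0 finds the keyword as expected.
end_from_zero = sample.find(b"endstream", 0)

print(jpeg_start, end_from_miss, end_from_zero)
```

Checking the result of every `find` call before reusing it as an offset avoids this silent fall-through.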

Note that I need the image as a bytes object so that it can be used for later processing.
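A minimal sketch of the same marker-scanning idea applied to PNG, assuming the PDF really does contain a complete PNG file verbatim. Many PDFs instead store images as compressed pixel data without any PNG signature, in which case this scan finds nothing. The function name `extract_pngs` is hypothetical; the two constants are the standard 8-byte PNG signature and the `IEND` chunk type followed by its fixed CRC.

```python
from typing import List

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # 8-byte signature that opens every PNG file
PNG_TRAILER = b"IEND\xaeB`\x82"       # IEND chunk type followed by its fixed CRC


def extract_pngs(data: bytes) -> List[bytes]:
    """Return every signature-to-IEND byte run found in data (hypothetical helper)."""
    images = []
    pos = 0
    while True:
        start = data.find(PNG_SIGNATURE, pos)
        if start < 0:
            break  # no further PNG signature
        end = data.find(PNG_TRAILER, start)
        if end < 0:
            break  # signature without a trailer: treat as truncated and stop
        end += len(PNG_TRAILER)  # include the trailer bytes in the slice
        images.append(data[start:end])
        pos = end
    return images


# Usage with a fabricated blob; the "PNG" here is a placeholder, not a valid image.
fake_png = PNG_SIGNATURE + b"placeholder-chunks" + b"\x00\x00\x00\x00" + PNG_TRAILER
blob = b"stream\n" + fake_png + b"\nendstream\n"
found = extract_pngs(blob)
print(len(found), found[0] == fake_png)
```

Each element of the returned list is a bytes object, so it can be passed directly to later processing. One limitation of this approach: the trailer bytes could in principle also occur inside compressed image data, so a stricter version would walk the PNG chunk lengths instead of searching for the trailer.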

Thanks in advance.

0 answers:

No answers