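I read "Create a tiff with only text and no images from a postscript file with ghostscript" and tried KenS's answer. However, that approach only removes "black" images, that is, images that contain data only in the black channel (the PDF uses the CMYK color space). How can I remove all images in my case?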
Answer 0: (score: 2)
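This does a better job, but it is incomplete. For example, it does not handle images that use multiple data sources. It is essentially untested, except that I tried it on your small file (pages.pdf): I used ps2write to convert the file to PostScript, then ran the PostScript program below together with the pdfwrite device to convert it back to PDF.
The first thing you will notice is that almost all the text has disappeared from your document. That is because the fonts you are using are bitmap fonts, and the program cannot tell a bitmap that represents a character from any other kind of bitmap. For this file you can fix that by removing the definition of imagemask, because all the characters use imagemask while the other images use 'image'.
I have a sneaking suspicion that the program's formatting is going to get messed up here :-(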
8<------------------------------8<--------------------------8<-------------------------
%!
%
% numbytes -file- ConsumeFileData -
%
/ConsumeFileData {
  userdict begin
  /DataString 256 string def
  /DataFile exch def
  /BytesToRead exch def
  %(BytesToRead = ) print BytesToRead ==
  mark
  {
    DataFile DataString readstring {              % read bytes
      /BytesToRead BytesToRead 256 sub def        % not EOF, subtract 256 from required amount
      %(Read 256 bytes) ==
      %(BytesToRead now = ) print BytesToRead ==
    } {
      length
      %(Read ) print dup 256 string cvs print (bytes) ==
      BytesToRead exch sub /BytesToRead exch def  % reached EOF, subtract length read from required amount
      %(BytesToRead now = ) print BytesToRead ==
      exit                                        % and exit loop
    } ifelse
  } loop
  %BytesToRead ==
  BytesToRead 0 gt {
    (Ran out of image data reading from DataSource\n) ==
  } if
  cleartomark
  end
} bind def
%
% numbytes -proc- ConsumeProcData -
%
/ConsumeProcData {
  userdict begin
  /DataProc exch def
  /BytesToRead exch def
  {
    DataProc exec                 % returns a string
    length BytesToRead exch sub   % subtract # bytes read
    /BytesToRead exch def
    BytesToRead 0 le {
      exit                        % exit when read enough
    } if
  } loop
  end
} bind def
/image {
  (image) ==
  dup type /dicttype eq {
    dup /MultipleDataSources known {
      dup /MultipleDataSources get {
        (Can't handle image with multiple sources!) ==
      } if
    } if
    dup /Width get                  % stack = -dict- width
    exch dup /BitsPerComponent get  % stack = width -dict- bpc
    exch dup /Decode get            % stack = width bpc -dict- decode
    length 2 div                    % decode = 2 * num components
    exch 4 1 roll                   % stack = -dict- width bpc ncomps
    mul mul                         % stack = -dict- width*bpc*ncomps
    7 add cvi 8 idiv                % stack = -dict- width(bytes)
    exch dup /Height get            % stack = width -dict- height
    exch /DataSource get            % stack = width height DataSource
    3 1 roll                        % stack = DataSource width height
    mul                             % stack = DataSource width*height
    exch                            % stack = size DataSource
  } {
                                    % stack = width height bps matrix DataSource
    exch pop                        % throw away matrix
    4 1 roll                        % stack = DataSource width height bps
    3 -1 roll mul                   % stack = DataSource height width*bps
    7 add cvi 8 idiv                % bytes per row, floor((bits+7) / 8)
    mul                             % stack = DataSource size
    exch                            % stack = size DataSource
  } ifelse
  dup type /filetype eq {
    ConsumeFileData
  } {
    dup type /arraytype eq
    1 index type /packedarraytype eq or {
      ConsumeProcData
    } {
      pop pop                       % Remove DataSource and size
    } ifelse
  } ifelse
} bind def
/imagemask {
  (imagemask) ==
  dup type /dicttype eq {
    dup /MultipleDataSources known {
      dup /MultipleDataSources get {
        (Can't handle imagemask with multiple sources!) ==
      } if
    } if
    dup /Width get                  % stack = -dict- width
    7 add cvi 8 idiv                % bytes per row, floor((bits+7) / 8)
    exch dup /Height get            % stack = width -dict- height
    exch /DataSource get            % stack = width height DataSource
    3 1 roll                        % stack = DataSource width height
    mul                             % stack = DataSource width*height
    exch                            % stack = size DataSource
  } {
                                    % stack = width height polarity matrix DataSource
    exch pop                        % throw away matrix
    exch pop                        % throw away polarity
    3 1 roll                        % stack = DataSource width height
    exch                            % stack = DataSource height width
    7 add cvi 8 idiv                % bytes per row, floor((bits+7) / 8)
    mul                             % stack = DataSource size
    exch                            % stack = size DataSource
  } ifelse
  dup type /filetype eq {
    ConsumeFileData
  } {
    dup type /arraytype eq
    1 index type /packedarraytype eq or {
      ConsumeProcData
    } {
      pop pop                       % Remove DataSource and size
    } ifelse
  } ifelse
} bind def
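/colorimage {
  (colorimage) ==
                                    % stack: w h bpc m d0 ... dn-1 multi ncomp
  2 copy 1 ne and {                 % multi is true and ncomp is not 1
    (Can't handle colorimage with multiple sources!) ==
    exch pop                        % get rid of 'multi'
    { pop } repeat                  % discard the ncomp data sources unread
    pop pop pop pop                 % discard w h bpc m
    0 ()                            % dummy size and DataSource, popped below
  } {
    exch pop                        % get rid of 'multi'
                                    % stack: w h bpc m d ncomp
    3 -1 roll pop                   % stack: w h bpc d ncomp
    exch 5 1 roll                   % stack: d w h bpc ncomp
    mul                             % stack: d w h bpc*ncomp
    3 -1 roll mul                   % stack: d h w*bpc*ncomp
    7 add cvi 8 idiv                % bytes per row, floor((bits+7) / 8)
    mul exch                        % stack: bytes datasource
  } ifelse
  dup type /filetype eq {
    ConsumeFileData
  } {
    dup type /arraytype eq
    1 index type /packedarraytype eq or {
      ConsumeProcData
    } {
      pop pop                       % Remove DataSource and size
    } ifelse
  } ifelse
} bind def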
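
If, as in the file discussed above, the glyphs are painted with imagemask and the pictures with image, one way to keep the text without editing the program is to drop the imagemask override again once the definitions have been made. This is a minimal sketch, assuming the overrides ended up in userdict (which is what happens when the program is run as-is); the KeepImageMasks flag is a made-up name for illustration and is not part of the program above.

```postscript
% Hypothetical switch, not part of the program above.
/KeepImageMasks true def

KeepImageMasks {
  % Assumes the /imagemask override lives in userdict. Removing the key
  % makes lookups fall through to the built-in operator in systemdict.
  % undef does nothing if the key is not present.
  userdict /imagemask undef
} if
```

Appended after the definitions, this should leave image and colorimage suppressed while bitmap glyphs are painted as usual. Any other 1-bit masks in the file will come back as well.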
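Answer 1: (score: 1)
The technique works for images of any color, because the image operator is used for both color and monochrome images, unless your file uses the obsolete level 1.5 'colorimage' operator. I can't remember whether I redefined that operator in the example; if not, you can redefine it in a similar way.
In fact, I see that I provided redefinitions of image, colorimage and imagemask, so all image types should be omitted. Perhaps you could share an example?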
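
A small test file can help check the redefinitions against a CMYK image. The following is only a sketch and is not taken from the original question: it shows one line of text and a 2x2 CMYK image in dictionary form, read from currentfile through an ASCIIHexDecode filter. The coordinates, the font and the pixel values are arbitrary choices, and the file names in the comment are placeholders.

```postscript
%!PS
% Sketch of a test page (LanguageLevel 2 or later).
% Possible use, file names are placeholders:
%   gs -sDEVICE=pdfwrite -o out.pdf strip-images.ps test.ps
/Helvetica findfont 12 scalefont setfont
72 720 moveto (This text should survive) show

gsave
  72 600 translate 100 100 scale     % map the unit square to a 100pt box
  /DeviceCMYK setcolorspace
  <<
    /ImageType 1
    /Width 2 /Height 2
    /BitsPerComponent 8
    /Decode [0 1 0 1 0 1 0 1]        % 4 components, so 8 entries
    /ImageMatrix [2 0 0 -2 0 2]
    /DataSource currentfile /ASCIIHexDecode filter
  >> image
FF000000 00FF0000
0000FF00 000000FF>
grestore
showpage
```

Run on its own, the file should paint a 2x2 block of cyan, magenta, yellow and black pixels under the text. With the redefinitions loaded first, only the text should remain, and the 16 bytes of image data should be read and discarded.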