When modifying a PDF

Time: 2016-06-01 10:34:31

Tags: c++ pdf podofo

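I ran into a problem while modifying a PDF file with PoDoFo. If you have time, please help me solve it!

I found the PoDoFo source code at http://podofo.sourceforge.net/download.html and compiled it on Windows 7 x86. PoDoFo turns out to be very powerful.

The trouble started when I changed a few lines of the "helloworld.cpp" example so that it modifies an existing PDF document and saves it under a different file name.

When I pass a local PDF file to the function (the file was saved from a Word 2007 document through the Windows COM interface), the new file is written successfully, but the text I add comes out flipped vertically, and its Y position is mirrored as well.

(Someone suggested that in this case you have to deal with the fact that the existing content may have changed the graphics state, for example the current transformation matrix. He may be right, but I cannot find the function that changes the graphics state and the current transformation matrix.)

Here is a screenshot. I do not know why the output text is flipped vertically:

screenshot

Strangely, when I pass in the file "output.pdf" created by the "helloworld" example

If you have time, please help me solve this. Thank you very much!

The code I changed is shown below: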

#define MEMDOCUMENT 1 // macro switch  
void HelloWorld( const char* pszFilename ) 
{
    /*
     * PdfStreamedDocument is the class that can actually write a PDF file.
     * PdfStreamedDocument is much faster than PdfDocument, but it is only
     * suitable for creating/drawing PDF files and cannot modify existing
     * PDF documents.
     *
     * The document is written directly to pszFilename while being created.
     */
#if MEMDOCUMENT
     PdfMemDocument document( pszFilename ); //open local pdf document
#else
     PdfStreamedDocument document( pszFilename ); //create a new pdf document
#endif
    /*
     * PdfPainter is the class which is able to draw text and graphics
     * directly on a PdfPage object.
     */
    PdfPainter painter;

    /*
     * This pointer will hold the page object later. 
     * PdfSimpleWriter can write several PdfPage's to a PDF file.
     */
    PdfPage* pPage;

    /*
     * A PdfFont object is required to draw text on a PdfPage using a PdfPainter.
     * PoDoFo will find the font using fontconfig on your system and embed truetype
     * fonts automatically in the PDF file.
     */     
    PdfFont* pFont;

    try {
        /*
         * The PdfDocument object can be used to create new PdfPage objects.
         * The PdfPage object is owned by the PdfDocument will also be deleted automatically
         * by the PdfDocument object.
         *
         * You have to pass only one argument, i.e. the page size of the page to create.
         * There are predefined enums for some common page sizes.
         */
#if MEMDOCUMENT
        pPage = document.GetPage(0); //get the first page and modify it
#else
        pPage = document.CreatePage( PdfPage::CreateStandardPageSize( ePdfPageSize_A4 ) );
#endif
        /*
         * If the page cannot be created because of an error (e.g. ePdfError_OutOfMemory )
         * a NULL pointer is returned.
         * We check for a NULL pointer here and throw an exception using the RAISE_ERROR macro.
         * The raise error macro initializes a PdfError object with a given error code and
         * the location in the file in which the error occurred and throws it as an exception.
         */
        if( !pPage ) 
        {
            PODOFO_RAISE_ERROR( ePdfError_InvalidHandle );
        }

        /*
         * Set the page as drawing target for the PdfPainter.
         * Before the painter can draw, a page has to be set first.
         */
        painter.SetPage( pPage );

        /*
         * Create a PdfFont object using the font "Arial".
         * The font is found on the system using fontconfig and embedded into the
         * PDF file. If Arial is not available, a default font will be used.
         *
         * The created PdfFont will be deleted by the PdfDocument.
         */
        pFont = document.CreateFont( "Arial" );

        /*
         * If the PdfFont object cannot be allocated return an error.
         */
        if( !pFont )
        {
            PODOFO_RAISE_ERROR( ePdfError_InvalidHandle );
        }

        /*
         * Set the font size
         */
        pFont->SetFontSize( 18.0 );

        /*
         * Set the font as default font for drawing.
         * A font has to be set before you can draw text on
         * a PdfPainter.
         */
        painter.SetFont( pFont );

        /*
         * You could set a different color than black to draw
         * the text.
         *
         * SAFE_OP( painter.SetColor( 1.0, 0.0, 0.0 ) );
         */

        /*
         * Actually draw the line "Hello World!" on to the PdfPage at
         * the position 2cm,2cm from the top left corner. 
         * Please remember that PDF files have their origin at the 
         * bottom left corner. Therefore we subtract the y coordinate 
         * from the page height.
         * 
         * The position specifies the start of the baseline of the text.
         *
         * All coordinates in PoDoFo are in PDF units.
         * You can also use PdfPainterMM which takes coordinates in 1/1000th mm.
         *
         */

        painter.SetTransformationMatrix(1,0,0,-1,0,pPage->GetPageSize().GetHeight());

        painter.DrawText( 56.69, pPage->GetPageSize().GetHeight() - 56.69, "Hello World!" );

        painter.DrawText( 56.69, pPage->GetPageSize().GetHeight() - 96.69, "Hello World!" );

        /*
         * Tell PoDoFo that the page has been drawn completely.
         * This is required to optimize drawing operations inside PoDoFo
         * and has to be done whenever you are done with drawing a page.
         */
        painter.FinishPage();

        /*
         * Set some additional information on the PDF file.
         */
        document.GetInfo()->SetCreator ( PdfString("examplahelloworld - A PoDoFo test application") );
        document.GetInfo()->SetAuthor  ( PdfString("Dominik Seichter") );
        document.GetInfo()->SetTitle   ( PdfString("Hello World") );
        document.GetInfo()->SetSubject ( PdfString("Testing the PoDoFo PDF Library") );
        document.GetInfo()->SetKeywords( PdfString("Test;PDF;Hello World;") );

        /*
         * The last step is to close the document.
         */

#if MEMDOCUMENT
        document.Write("outputex.pdf"); //save page change
#else
        document.Close(); 
#endif


    } catch ( const PdfError & e ) {
        /*
         * All PoDoFo methods may throw exceptions, so
         * make sure that painter.FinishPage() is called
         * or you will get an assert in its destructor
         */
        try {
            painter.FinishPage();
        } catch( ... ) {
            /*
             * Ignore errors this time
             */
        }

        throw; // rethrow the original exception object
    }
}

2 answers:

Answer 0 (score: 0)

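Thanks to mkl. With mkl's help, the problem has been solved.

The problem was caused by a reflection effect. PoDoFo exposes a transformation matrix that you can change before adding text or lines to the PDF document.

Add code like the following: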

        painter.SetTransformationMatrix(1,0,0,-1,0,pPage->GetPageSize().GetHeight()); // set Reflection effect
        painter.Save();

        painter.DrawText( 56.69, pPage->GetPageSize().GetHeight() - 56.69, "Hello World!" );

        painter.DrawText( 56.69, pPage->GetPageSize().GetHeight() - 96.69, "Hello World!" );

        painter.Restore(); // balance the Save() above
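
Why does applying the same matrix again fix the output? A minimal sketch, assuming the existing page content has already applied a flip of the form 1 0 0 -1 0 H cm (H being the page height) and left it in effect: concatenating that flip with itself gives the identity, so text drawn afterwards lands the right way up. The Matrix struct and the concat, isIdentity and checkFlipTwice helpers are hypothetical names introduced for illustration and are not PoDoFo API.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type for illustration; not part of PoDoFo.
// Holds the six numbers of a PDF matrix "a b c d e f".
struct Matrix { double a, b, c, d, e, f; };

// Concatenate two PDF matrices: the result applies `first`, then `second`.
// This mirrors how "cm" puts a new matrix in front of the current one.
Matrix concat(const Matrix& first, const Matrix& second) {
    return Matrix{
        first.a * second.a + first.b * second.c,
        first.a * second.b + first.b * second.d,
        first.c * second.a + first.d * second.c,
        first.c * second.b + first.d * second.d,
        first.e * second.a + first.f * second.c + second.e,
        first.e * second.b + first.f * second.d + second.f };
}

// True when m is (numerically) the identity matrix 1 0 0 1 0 0.
bool isIdentity(const Matrix& m) {
    const double eps = 1e-9;
    return std::fabs(m.a - 1.0) < eps && std::fabs(m.b) < eps &&
           std::fabs(m.c) < eps && std::fabs(m.d - 1.0) < eps &&
           std::fabs(m.e) < eps && std::fabs(m.f) < eps;
}

// Self-check: flipping twice about the same page height cancels out.
void checkFlipTwice() {
    const double pageHeight = 841.0;  // A4 height as used in this thread
    const Matrix flip{ 1.0, 0.0, 0.0, -1.0, 0.0, pageHeight };
    assert(isIdentity(concat(flip, flip)));
}
```

If the existing content used a different matrix, or restored the graphics state before the end of the stream, this cancellation would not hold and the hard-coded flip would itself mirror the new text.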

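Answer 1 (score: 0)

For anyone struggling to understand why this happens: it is caused by the following command at the top of each page (in this example the page is A4 size), which flips the content along the y axis: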

1 0 0 -1 0 841 cm
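
To see what this operator does to coordinates, it can help to push a point through the matrix by hand. The sketch below is plain C++ with no PoDoFo dependency; the Matrix and Point structs and the apply and checkFlip helpers are hypothetical names used only for illustration. It relies on the PDF convention that a b c d e f cm maps (x, y) to (a*x + c*y + e, b*x + d*y + f).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper types for illustration; not part of PoDoFo.
struct Matrix { double a, b, c, d, e, f; };
struct Point  { double x, y; };

// Map a point through the PDF matrix "a b c d e f cm":
// x' = a*x + c*y + e,  y' = b*x + d*y + f.
Point apply(const Matrix& m, const Point& p) {
    return Point{ m.a * p.x + m.c * p.y + m.e,
                  m.b * p.x + m.d * p.y + m.f };
}

// Self-check: the flip sends y to 841 - y and leaves x alone.
void checkFlip() {
    const Matrix flip{ 1.0, 0.0, 0.0, -1.0, 0.0, 841.0 };  // "1 0 0 -1 0 841 cm"
    const Point p = apply(flip, Point{ 56.69, 784.31 });
    assert(std::fabs(p.x - 56.69) < 1e-9);
    assert(std::fabs(p.y - 56.69) < 1e-9);                 // 841 - 784.31
}
```

A point meant to sit 2 cm below the top of the page therefore ends up 2 cm above the bottom, and glyphs are mirrored for the same reason.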

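From what I have observed, this seems to be quite common and is present in PDFs generated by several programs. There are also many PDFs that do not contain it at all. I suspect this is entirely due to commit 1e07ce in cairo 1.15.4, see https://cairographics.org/releases/ChangeLog.cairo-1.15.4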

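The tricky part is that this command comes before any q (save graphics state) or Q (restore graphics state) command, so it is not possible to get back to a known transform with a simple Q. In other words, the only way to get back to a known transform is to parse the page content stream and look at the transform that precedes any q/Q pair. Once that transform is known, its inverse can be applied before any new content is overlaid on the existing content.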

To parse the page and get the transform before any q:

// Needs <cstring>, <iostream>, <vector> and the PoDoFo headers.
PoDoFo::PdfPage* page = ...;
PoDoFo::PdfContentsTokenizer tokenizer(page);
const char* token = NULL;
PoDoFo::PdfVariant param;
PoDoFo::EPdfContentsType type;
std::vector<PoDoFo::PdfVariant> params;

// Inverse of the transform found before the first q (identity by default):
// | a c e |
// | b d f |
// | 0 0 1 |
double tf_a = 1, tf_c = 0, tf_e = 0;
double tf_b = 0, tf_d = 1, tf_f = 0;

while(tokenizer.ReadNext(type, token, param)){
    //Command
    if(type == PoDoFo::ePdfContentsType_Keyword){
        //First Save at page, we assume that it will eventually be paired
        //with enough Restores to go back to the current transform
        if(strcmp(token, "q") == 0)
            break;
        //Transform before first q, must apply the inverse when overlaying dots
        else if(strcmp(token, "cm") == 0){
            if(params.size() == 6){
                tf_a = params[0].GetReal();
                tf_b = params[1].GetReal();
                tf_c = params[2].GetReal();
                tf_d = params[3].GetReal();
                tf_e = params[4].GetReal();
                tf_f = params[5].GetReal();
                invertTransform(tf_a, tf_b, tf_c, tf_d, tf_e, tf_f);
            }
            else
                std::cout << "Warning! Found transform before first q at page with wrong number of arguments!" << std::endl;
        }
        else
            std::cout << "Warning! Unrelated command at page before first q: " << token << std::endl;
        params.clear();
    }
    //Parameter for command
    else if(type == PoDoFo::ePdfContentsType_Variant)
        params.push_back(param);
}

where invertTransform() is a small utility function:

// Needs <cmath> for std::fabs.
void invertTransform(double& a, double& b, double& c, double& d, double& e, double& f){
    // The matrix being inverted is
    // | a c e |
    // | b d f |
    // | 0 0 1 |
    double m_11 = a, m_12 = c, m_13 = e;
    double m_21 = b, m_22 = d, m_23 = f;

    // With the bottom row fixed at (0 0 1) the determinant reduces to this
    double det = m_11*m_22 - m_21*m_12;

    if(std::fabs(det) < 1e-10){
        // Singular matrix: fall back to the identity
        a = 1; c = 0; e = 0;
        b = 0; d = 1; f = 0;
    }
    else{
        double det_1 = 1.0/det;
        a = det_1*( m_22);
        c = det_1*(-m_12);
        e = det_1*( m_23*m_12 - m_22*m_13);
        b = det_1*(-m_21);
        d = det_1*( m_11);
        f = det_1*(-m_23*m_11 + m_21*m_13);
    }
}

Then the inverse transform can be applied (it is simply the identity if there is no cm before the first q) and things can be drawn on the page:

PoDoFo::PdfPainter painter;
painter.SetPage(page);
painter.Save();
painter.SetTransformationMatrix(tf_a, tf_b, tf_c, tf_d, tf_e, tf_f);
/* painter.Draw...() */
painter.Restore();
painter.FinishPage();

Of course, this whole solution assumes that there can be only one cm transform before the first q and nothing else.

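Another, simpler solution would be to put a q before everything in the stream and a Q after it, followed by the desired content, but I am not sure whether this can be done directly with PoDoFo.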
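
The idea can at least be shown at the level of the raw content stream text. The following is a minimal sketch that works on plain strings only and makes no PoDoFo calls; wrapAndAppend and checkWrap are hypothetical names, and it assumes the existing stream is available uncompressed as a single string with its own q/Q operators balanced. How to read and replace a page's content stream through PoDoFo is deliberately left out, since that part is unverified here.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, plain string handling only; it does not call PoDoFo.
// Wraps the existing content in q ... Q so that any change it makes to the
// graphics state (such as a leading "cm" flip) is undone before the overlay.
std::string wrapAndAppend(const std::string& existing, const std::string& overlay) {
    std::string out;
    out += "q\n";
    out += existing;
    // Make sure the closing Q starts on its own line.
    if (!existing.empty() && existing.back() != '\n') out += '\n';
    out += "Q\n";
    out += overlay;
    return out;
}

// Self-check on a tiny stream that starts with the flip from this thread.
void checkWrap() {
    const std::string out = wrapAndAppend("1 0 0 -1 0 841 cm", "BT ET\n");
    assert(out == "q\n1 0 0 -1 0 841 cm\nQ\nBT ET\n");
}
```

With the old content enclosed this way, the overlay is drawn in the default page coordinate system, so no inverse matrix has to be computed.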