Question

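I am trying to convert an HTML page or an HTML URL to a PDF, converting not only the HTML but also its CSS, and then save the result. I am confused about which tool I should use (weasyprint, wkhtmltopdf, or Python pdfkit). In the meantime I am using this code: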

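def ConvertToPdf(urltoConvert='http://tdi.dartmouth.edu/'):
    import pdfkit
    # wkhtmltopdf options; an empty string marks a flag that takes no value
    pdfFormatOptions = {'page-size': 'Letter', 'disable-forms': '', 'zoom': 1}
    pdfObject = None
    try:
        # from_url returns True once the PDF has been written to disk
        pdfObject = pdfkit.from_url(urltoConvert, 'dart.pdf', options=pdfFormatOptions)
    except Exception as exc:
        print("Exception while converting:", exc)
    return pdfObject

if __name__ == "__main__":
    # url = 'http://tdi.dartmouth.edu/'
    ConvertToPdf()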

and this code:

import weasyprint
pdf = weasyprint.HTML('http://tdi.dartmouth.edu/').write_pdf()
len(pdf)
# write_pdf() returns bytes, so the file must be opened in binary mode
with open('dart.pdf', 'wb') as f:
    f.write(pdf)

But it was all in vain. Please help.

Answer 1

You might want to try pdfkit: https://pypi.python.org/pypi/pdfkit

It can also apply CSS. From its documentation:

You can specify external CSS files when converting files or strings using css option.

Warning: This is a workaround for this bug in wkhtmltopdf. You should try --user-style-sheet option first.

# Single CSS file
css = 'example.css'
pdfkit.from_file('file.html', options=options, css=css)

# Multiple CSS files
css = ['example.css', 'example2.css']
pdfkit.from_file('file.html', options=options, css=css)
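As far as I know, pdfkit's css argument only works when the source is a single file or a string, not a URL. For a URL, the page's own linked stylesheets are normally loaded by wkhtmltopdf anyway, and an extra stylesheet can be passed through the user-style-sheet option. Here is a minimal sketch of both cases. The helper names build_options and convert, and the option values, are my own and only illustrative:

```python
def build_options(user_css=None):
    """Build a wkhtmltopdf options dict for pdfkit.

    The 'user-style-sheet' key is only added when a CSS path is given.
    """
    options = {"page-size": "Letter", "encoding": "UTF-8"}
    if user_css:
        options["user-style-sheet"] = user_css
    return options


def convert(source, pdf_path, css=None):
    """Convert a URL or a local HTML file to PDF (hypothetical helper).

    For a URL, css must be a single path and is passed as a user style sheet.
    For a local file, css may be a path or a list of paths.
    """
    import pdfkit  # imported lazily; needs the wkhtmltopdf binary installed

    if source.startswith(("http://", "https://")):
        return pdfkit.from_url(source, pdf_path, options=build_options(user_css=css))
    return pdfkit.from_file(source, pdf_path, options=build_options(), css=css)
```

For example, convert('http://tdi.dartmouth.edu/', 'dart.pdf') would go through from_url, while convert('file.html', 'out.pdf', css=['example.css', 'example2.css']) would go through from_file.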

Answer 2

This should work fine:

import pdfkit
pdfkit.from_url('http://google.com', 'res.pdf')
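If that call fails, a common cause is that the wkhtmltopdf binary is not installed or not on the PATH, because pdfkit is only a wrapper around it. Below is a rough sketch that checks for the binary first and reports conversion errors instead of swallowing them. The function names find_wkhtmltopdf and url_to_pdf are made up for this example:

```python
import shutil


def find_wkhtmltopdf():
    """Return the full path of the wkhtmltopdf binary, or None if it is not on PATH."""
    return shutil.which("wkhtmltopdf")


def url_to_pdf(url, pdf_path):
    """Convert a URL to a PDF file; returns True on success, False on failure."""
    import pdfkit  # imported lazily so the check above works without pdfkit

    binary = find_wkhtmltopdf()
    if binary is None:
        raise RuntimeError("wkhtmltopdf not found; install it or add it to PATH")

    # Point pdfkit at the binary explicitly
    config = pdfkit.configuration(wkhtmltopdf=binary)
    try:
        return pdfkit.from_url(url, pdf_path, configuration=config)
    except OSError as exc:
        # pdfkit raises IOError/OSError when wkhtmltopdf reports an error
        print("Conversion failed:", exc)
        return False
```

Calling url_to_pdf('http://google.com', 'res.pdf') then either writes res.pdf or tells you why it could not.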

Another solution might be to take screenshots with selenium and assemble a .pdf from those images. However, that is a dirty approach.
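For completeness, here is a rough sketch of that screenshot route. It assumes Chrome with a matching driver, plus the selenium and Pillow packages; the helper names and the window size are arbitrary choices of mine. The result is a PDF made of images, so the text in it is not selectable.

```python
import os


def scroll_offsets(total_height, viewport_height):
    """Return the vertical scroll positions needed to cover the whole page."""
    if viewport_height <= 0:
        raise ValueError("viewport_height must be positive")
    return list(range(0, max(total_height, 1), viewport_height))


def capture_screenshots(url, out_dir, width=1280, height=1000):
    """Scroll through the page and save one PNG per viewport; returns the paths."""
    from selenium import webdriver  # imported lazily; needs Chrome and its driver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    options.add_argument(f"--window-size={width},{height}")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        total = driver.execute_script("return document.documentElement.scrollHeight")
        viewport = driver.execute_script("return window.innerHeight")
        paths = []
        for i, y in enumerate(scroll_offsets(total, viewport)):
            # The browser cannot scroll past the bottom, so the last
            # screenshot may overlap the previous one.
            driver.execute_script("window.scrollTo(0, arguments[0]);", y)
            path = os.path.join(out_dir, f"page_{i:03d}.png")
            driver.save_screenshot(path)
            paths.append(path)
        return paths
    finally:
        driver.quit()


def images_to_pdf(image_paths, pdf_path):
    """Combine the images into a single multi-page PDF using Pillow."""
    from PIL import Image

    # PDF output does not support an alpha channel, so convert to RGB first
    pages = [Image.open(p).convert("RGB") for p in image_paths]
    pages[0].save(pdf_path, save_all=True, append_images=pages[1:])
```

Usage would look like images_to_pdf(capture_screenshots('http://google.com', '.'), 'res.pdf').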

How do I convert a web page or HTML URL to PDF?