Converting doc / docx to PDF in AWS Lambda?

Asked: 2020-05-12 07:21:52

Tags: amazon-web-services aws-lambda pdf-generation

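What I have tried:

  • The prebuilt docx-to-pdf Lambda application (it can no longer be deployed): https://github.com/NativeDocuments/docx-to-pdf-on-AWS-Lambda
  • Installing comtypes.client and win32com.client (neither seems to work once deployed to Lambda). The error is: Unable to import module 'lambda_function': cannot import name 'COMError'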
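
For context on the COMError failure: comtypes and win32com drive Microsoft Word through COM, which exists only on Windows, while Lambda functions run on Linux. The check below is a minimal sketch of guarding that import. The helper names com_automation_available and load_com_client are hypothetical and not part of either library.

```python
import sys


def com_automation_available(platform=sys.platform):
    """Return True only on Windows, the one platform where COM exists."""
    return platform == "win32"


def load_com_client():
    """Import comtypes.client lazily; return None where COM is unavailable.

    Hypothetical helper: it only avoids the import-time crash on Linux,
    it does not make Word automation work there.
    """
    if not com_automation_available():
        return None
    import comtypes.client  # attempted only on Windows
    return comtypes.client
```

With a guard like this the module at least imports on Linux, so the handler can fail with a clear message instead of an import error.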

Possible approaches:

  • When I fetch the document from S3, convert it to PDF in browser-side JS.
  • Somehow fix comtypes or win32com.

I am using Python 3.6.
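
One further route, not mentioned in the original post, is to shell out to headless LibreOffice, which runs on Linux. The sketch below assumes a soffice binary has been made available to the function (for example through a layer or a container image); that availability, the function names, and the timeout value are all assumptions for illustration.

```python
import os
import subprocess


def build_convert_command(in_file, out_dir, soffice="soffice"):
    """Build the LibreOffice command line that converts one file to PDF."""
    return [soffice, "--headless", "--convert-to", "pdf",
            "--outdir", out_dir, in_file]


def doctopdf_libreoffice(in_file, out_dir="/tmp", soffice="soffice", timeout=120):
    """Convert in_file to PDF and return the expected output path.

    Assumes `soffice` is on PATH or passed as a full path (hypothetical setup).
    """
    subprocess.run(build_convert_command(in_file, out_dir, soffice),
                   check=True, timeout=timeout)
    # LibreOffice writes <original stem>.pdf into out_dir.
    stem = os.path.splitext(os.path.basename(in_file))[0]
    return os.path.join(out_dir, stem + ".pdf")
```

For example, doctopdf_libreoffice('/tmp/report.docx') would be expected to produce /tmp/report.pdf if the binary is present.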

import json
import logging
import urllib.parse
import boto3
from boto3.s3.transfer import TransferConfig
from botocore.exceptions import ClientError
import lxml
import comtypes.client  # COM automation; Windows-only
import io
import os
import sys
import threading
from docx import Document

# S3 client (not defined in the snippet as originally posted)
s3 = boto3.client('s3')

# Word's WdSaveFormat value for PDF output
wdFormatPDF = 17

def lambda_handler(event, context):

    # Bucket and object key of the uploaded file, taken from the S3 event
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')

    try:
        response = s3.get_object(Bucket=bucket, Key=key)

        # Create the Document from the downloaded bytes
        f = io.BytesIO(response['Body'].read())
        document = Document(f)

        # Code for formatting my document object is hidden in this section.

        document.save('/tmp/'+key)
        pdfkey = key.split(".")[0]+".pdf"

        # The following function is supposed to convert my doc to PDF
        doctopdf('/tmp/'+ key,'/tmp/'+pdfkey)

        # The PDF file is then saved to S3
        s3.upload_file('/tmp/'+pdfkey,'output',pdfkey)

    except Exception as e:
        logging.error(e)
        raise

def doctopdf(in_file,out_file):
    # Drive Microsoft Word through COM to export the document as PDF
    word = comtypes.client.CreateObject('Word.Application')
    doc = word.Documents.Open(in_file)
    doc.SaveAs(out_file, FileFormat=wdFormatPDF)
    doc.Close()
    word.Quit()
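
A side note on the path handling in the handler: '/tmp/' + key points into a subdirectory that does not exist when the key contains a prefix such as reports/, and key.split(".")[0] truncates names that contain more than one dot. A minimal sketch of a more defensive variant follows; the helper name local_paths is hypothetical.

```python
import os
from urllib.parse import unquote_plus


def local_paths(raw_key, tmp_dir="/tmp"):
    """Return (local docx path, local pdf path, pdf object key) for an S3 key."""
    key = unquote_plus(raw_key, encoding="utf-8")
    name = os.path.basename(key)            # drop any "folder/" prefix
    stem, _ext = os.path.splitext(name)     # split only at the last dot
    pdf_key = os.path.splitext(key)[0] + ".pdf"
    return (os.path.join(tmp_dir, name),
            os.path.join(tmp_dir, stem + ".pdf"),
            pdf_key)
```

The local files then always sit directly under /tmp, while the uploaded PDF keeps the original key prefix.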

0 answers:

No answers yet.