Via Flask

Time: 2017-09-02 01:57:59

Tags: python web-services http curl flask

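I am currently running a Flask-RESTful application and an Apache Tika server in separate Docker containers. The Flask server is served on port 5000, both in the container and on the host, and the Tika server is served on 9998.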

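I want to pass files that clients upload to the Flask server on to the Tika server so that I can extract the document's text. However, I can't get anything to work; every way I have tried to read in the file has failed. Does anyone see what I am doing wrong?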

The following works in Python for accessing the Tika server directly:

requests.request('PUT', 'http://localhost:9998/rmeta/text', data=open('test_doc.docx', 'rb'), headers={}).text

However, trying to route the file through the Flask server like this fails:

requests.request('post', 'http://localhost:5000/index', files={'file': open('test_doc.docx', 'rb')}, headers={}).text

app/__init__.py

class Index(MethodView):
    def post(self):
        #Load in file
        parse = reqparse.RequestParser()
        parse.add_argument('file', type=werkzeug.datastructures.FileStorage, location='files')
        args = parse.parse_args()
        uploadedFile = args['file']
        filename = secure_filename(uploadedFile.filename)

        #Create temporary file
        tmpfile = TemporaryFile()
        tmpfile.write(uploadedFile.stream.read())

        #Extract text
        data = tika.extract_text(tmpfile)
        tmpfile.close()
        return data

app/tika/__init__.py

import json
import requests

class Tika:
    def __init__(self, endpoint):
        self.endpoint = endpoint

    def extract_text(self, filedata):
        response = requests.request('put', self.endpoint, data=filedata, headers={}).json()
        try:
            return response[0]["X-TIKA:content"]
        except:
            return "ERROR"

Traceback

Traceback (most recent call last):
  File "/opt/conda/envs/SDL/lib/python3.5/site-packages/flask/app.py", line 1997, in __call__
    return self.wsgi_app(environ, start_response)
  File "/opt/conda/envs/SDL/lib/python3.5/site-packages/flask/app.py", line 1985, in wsgi_app
    response = self.handle_exception(e)
  File "/opt/conda/envs/SDL/lib/python3.5/site-packages/flask_restful/__init__.py", line 273, in error_router
    return original_handler(e)
  File "/opt/conda/envs/SDL/lib/python3.5/site-packages/flask/app.py", line 1540, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/opt/conda/envs/SDL/lib/python3.5/site-packages/flask/_compat.py", line 32, in reraise
    raise value.with_traceback(tb)
  File "/opt/conda/envs/SDL/lib/python3.5/site-packages/flask/app.py", line 1982, in wsgi_app
    response = self.full_dispatch_request()
  File "/opt/conda/envs/SDL/lib/python3.5/site-packages/flask/app.py", line 1614, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/opt/conda/envs/SDL/lib/python3.5/site-packages/flask_restful/__init__.py", line 273, in error_router
    return original_handler(e)
  File "/opt/conda/envs/SDL/lib/python3.5/site-packages/flask/app.py", line 1517, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/opt/conda/envs/SDL/lib/python3.5/site-packages/flask/_compat.py", line 32, in reraise
    raise value.with_traceback(tb)
  File "/opt/conda/envs/SDL/lib/python3.5/site-packages/flask/app.py", line 1612, in full_dispatch_request
    rv = self.dispatch_request()
  File "/opt/conda/envs/SDL/lib/python3.5/site-packages/flask/app.py", line 1598, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/opt/conda/envs/SDL/lib/python3.5/site-packages/flask_restful/__init__.py", line 480, in wrapper
    resp = resource(*args, **kwargs)
  File "/opt/conda/envs/SDL/lib/python3.5/site-packages/flask/views.py", line 84, in view
    return self.dispatch_request(*args, **kwargs)
  File "/opt/conda/envs/SDL/lib/python3.5/site-packages/flask/views.py", line 149, in dispatch_request
    return meth(*args, **kwargs)
  File "/app/__init__.py", line 109, in post
    data = get_clean_text(tika.extract_text(tmpfile))
  File "/app/tika/__init__.py", line 16, in extract_text
    response = requests.request('put', self.endpoint, data=filedata, headers={}).json()
  File "/opt/conda/envs/SDL/lib/python3.5/site-packages/requests/models.py", line 885, in json
    return complexjson.loads(self.text, **kwargs)
  File "/opt/conda/envs/SDL/lib/python3.5/json/__init__.py", line 319, in loads
    return _default_decoder.decode(s)
  File "/opt/conda/envs/SDL/lib/python3.5/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/opt/conda/envs/SDL/lib/python3.5/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

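There must be something wrong with how I pass the file to the server and decode it, but for the life of me I can't figure it out. Any and all help would be greatly appreciated.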

1 answer:

Answer 0 (score: 0)

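The problem is that no data is passed to tika.extract_text(). The call tmpfile.write(uploadedFile.stream.read()) writes the uploaded data to the temporary file, which leaves the file pointer at the end of the file. That file handle is then passed to tika.extract_text(tmpfile); because the pointer is at the end of the file, any read returns no data, so nothing is sent to your Tika server.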
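
The behaviour can be reproduced without Flask or Tika. The following is a minimal sketch using only the standard library; the `payload` bytes are a made-up stand-in for `uploadedFile.stream.read()`:

```python
from tempfile import TemporaryFile

# Hypothetical stand-in for the bytes of the uploaded document.
payload = b"example document bytes"

with TemporaryFile() as tmpfile:
    tmpfile.write(payload)

    # The pointer now sits at the end of the file, so this read finds nothing.
    read_at_end = tmpfile.read()

    # Rewind to the start; the same read now returns everything written.
    tmpfile.seek(0)
    read_from_start = tmpfile.read()

print(repr(read_at_end))           # empty bytes object
print(read_from_start == payload)  # True
```

Any consumer that reads from the handle, including `requests` when a file object is given as `data=`, starts from the current position, which is why the rewind matters.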

You can easily fix this by seeking to the start of the temporary file before handing it to tika.extract_text():

    #Create temporary file
    tmpfile = TemporaryFile()
    tmpfile.write(uploadedFile.stream.read())
    tmpfile.seek(0)    # reposition file pointer to the start of the file

    data = tika.extract_text(tmpfile)

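From the code you posted, it is not clear to me why you need a temporary file at all. You can simply pass the uploaded data directly to the Tika server: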

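from flask import Flask, request
from flask.views import MethodView
from flask.json import jsonify

# `tika` is assumed to be the Tika client instance from the question
# (an instance of the Tika class in app/tika/__init__.py).

app = Flask(__name__)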

class Index(MethodView):
    def post(self):
        uploaded_file = request.files.get('file')
        if uploaded_file:
            data = tika.extract_text(uploaded_file)
        else:
            data = {'error': 'Missing upload file'}

        return jsonify(data)

app.add_url_rule('/', view_func=Index.as_view('/'))

app.run()
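
One further observation about the traceback in the question: the exception is raised by `.json()`, which sits outside the `try` block in `extract_text()`, so the `"ERROR"` fallback never runs. Below is a minimal sketch of how the response handling could be separated out and made explicit. The names `TikaError` and `parse_tika_response` are hypothetical, not part of Tika or requests, and the sketch assumes only what the question shows: that a successful response is a JSON list whose first element has an `"X-TIKA:content"` key.

```python
import json


class TikaError(Exception):
    """Hypothetical error type for an unusable Tika response."""


def parse_tika_response(status_code, body):
    """Return the extracted text from a response body, or raise TikaError.

    `status_code` and `body` are meant to come from the HTTP response,
    e.g. `response.status_code` and `response.text` when using requests.
    """
    if status_code != 200:
        raise TikaError("Tika returned HTTP status %d" % status_code)
    if not body.strip():
        # An empty body is what makes json.loads() fail with
        # "Expecting value: line 1 column 1 (char 0)".
        raise TikaError("Tika returned an empty body")
    try:
        documents = json.loads(body)
    except json.JSONDecodeError as exc:
        raise TikaError("Tika response is not valid JSON") from exc
    try:
        return documents[0]["X-TIKA:content"]
    except (IndexError, KeyError, TypeError) as exc:
        raise TikaError("Tika response has no X-TIKA:content field") from exc


# Example with a made-up response body in the shape the question expects.
sample_body = json.dumps([{"X-TIKA:content": "hello"}])
print(parse_tika_response(200, sample_body))
```

Inside `extract_text()` this could be called as `parse_tika_response(response.status_code, response.text)`, and the view can then catch `TikaError` and return a meaningful error to the client instead of an unhandled traceback.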