Classifying multiple documents and storing them in different folders with Flask (Python)

Time: 2019-02-07 07:12:47

Tags: python flask document text-classification

I want my web application to take multiple documents as input, classify them with my model, and then store the classified documents in different folders.

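I have already developed a model that classifies documents. The model is ready and achieves an F-score of about 0.96. I want to serve it from Flask. So far I have implemented text input, and it shows accurate results.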

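((# Libraries (the imports for the model and vectorizer are not shown in the original)
from flask import Flask, flash, redirect, render_template, request, session, url_for
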
app = Flask(__name__)

app.config.from_object(__name__) # load config from th

app.config['UPLOAD_FOLDER'] = 'CV_upload/'
app.config['ALLOWED_EXTENSIONS'] = set(['txt', 'pdf', 'png', 'jpg', 
    'jpeg', 'gif'])

# Route for handling the login page logic
@app.route('/login', methods=['GET', 'POST'])
def login():
    error = None
    if request.method == 'POST':
        if (request.form['username'] != 'admin'
                or request.form['password'] != 'admin'):
            error = 'Invalid Credentials. Please try again.'
        else:
            return redirect(url_for('home'))
    return render_template('index.html', error=error)

@app.route('/logout')
def logout():
    session.pop('logged_in', None)
    flash('You were logged out')
    return redirect(url_for('home'))


@app.route('/home')
def home():
    return render_template('home.html')

@app.route('/predict',methods=['POST'])
def predict():
    # <<< MY NAIVE BAYES MODEL HERE >>>

    # This is where I take the input text, but I want to change this to
    # accept multiple PDF files as input and then classify them.
    if request.method == 'POST':
        message = request.form['message']
        data = [message]
        vect = vectorizer.transform(data).toarray()
        my_prediction = model.predict(vect)
        return render_template('result.html', prediction=my_prediction)



if __name__ == '__main__':
    app.run(debug=True, threaded=True)
))

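To restate: I want my web app to take multiple documents as input, classify them with my model, and then store the classified documents in different folders. When queried, it should return the results.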

1 answer:

Answer 0 (score: 0)

A minimal solution using FileUploads:

import os

from flask import Flask, request
from werkzeug.utils import secure_filename

app = Flask(__name__)

def classify(bindata):
    #this is a pdf file, extract text and run your model
    return "abc"

def ensure_dirs(path):
    if not os.path.exists(path):
        os.makedirs(path)

@app.route('/', methods=['GET', 'POST'])
def upload_file():
    if request.method == 'POST':
        for fileobj in request.files.getlist('file'):
            print("Found {}".format(fileobj.filename))
            binary_data = fileobj.stream.read()
            file_class = classify(binary_data)
            path = "uploads/{}".format(file_class)
            ensure_dirs(path)
            # Sanitize the client-supplied name before using it in a path.
            filename = secure_filename(fileobj.filename)
            with open(os.path.join(path, filename), "wb") as fd:
                fd.write(binary_data)
            fileobj.close()
    return '''
    <!doctype html>
    <title>Upload new File</title>
    <h1>Upload new File</h1>
    <form method=post enctype=multipart/form-data>
      <input type=file name=file multiple>
      <input type=submit value=Upload>
    </form>
    '''

if __name__ == '__main__':
    app.run()
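
The `classify` placeholder above still has to be connected to the model from the question. The following is only a sketch of one way to do that and to keep the folder-sorting step testable outside Flask. The helper names `safe_component`, `classify_text`, and `store_by_class` are hypothetical and do not come from the question or from Flask. It assumes the vectorizer and model expose the scikit-learn style `transform` and `predict` calls shown in the question, and it leaves out PDF text extraction, which needs a separate library.

```python
import re
import tempfile
from pathlib import Path
from types import SimpleNamespace


def safe_component(value, default="unknown"):
    """Reduce a label or file name to a single safe path component."""
    # Replace anything outside a conservative character set, then drop
    # leading/trailing dots so values such as ".." cannot survive.
    cleaned = re.sub(r"[^A-Za-z0-9_.-]", "_", str(value)).strip(".")
    return cleaned or default


def classify_text(text, vectorizer, model):
    """Predict a label for the text of one document.

    Assumes vectorizer.transform(list_of_str) and model.predict(features),
    as in the question. Add .toarray() if the model needs dense input.
    """
    features = vectorizer.transform([text])
    return str(model.predict(features)[0])


def store_by_class(filename, data, label, root="uploads"):
    """Write data to <root>/<label>/<filename> and return the final path."""
    target_dir = Path(root) / safe_component(label)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / safe_component(filename, default="unnamed")
    target.write_bytes(data)
    return target


if __name__ == "__main__":
    # Hypothetical stand-ins so the sketch runs without a trained model.
    vectorizer = SimpleNamespace(transform=lambda docs: docs)
    model = SimpleNamespace(
        predict=lambda docs: ["invoice" if "total" in d else "other" for d in docs]
    )
    with tempfile.TemporaryDirectory() as root:
        label = classify_text("total due: 10", vectorizer, model)
        print(store_by_class("a.txt", b"total due: 10", label, root=root))
```

Inside the upload loop, `classify` would extract the text from `binary_data`, pass it to `classify_text`, and hand the resulting label to `store_by_class`. Sanitizing the label as well as the file name is a design choice: the label becomes a directory name, so it should not be trusted to be a valid path component.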