Predicting data from a CSV file with Flask and Python

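Hello world,

I have just started learning Python and the Flask API, and I am trying to import a CSV file and export another CSV file that contains the predictions.

The input CSV file looks like this: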

    experience    test_score    interview
    five          10            110
    nan           5             140
    six           15            90
    .....

The Python script looks like this (app.py):

    from flask import Flask, make_response, request
    import io
    import csv

    app = Flask(__name__)

    def transform(text_file_contents):
        return text_file_contents.replace("=", ",")

    @app.route('/')
    def form():
        return """
            <html>
                <body>
                    <h1>Let's TRY to Predict..</h1>
                    </br>
                    </br>
                    <p> Insert your CSV file and then download the Result
                    <form action="/transform" method="post" enctype="multipart/form-data">
                        <input type="file" name="data_file" class="btn btn-block"/>
                        </br>
                        </br>
                        <button type="submit" class="btn btn-primary btn-block btn-large">Predict</button>
                    </form>
                </body>
            </html>
        """

    @app.route('/transform', methods=["POST"])
    def transform_view():
        f = request.files['data_file']
        if not f:
            return "No file"

        stream = io.StringIO(f.stream.read().decode("UTF8"), newline=None)
        csv_input = csv.reader(stream)
        #print("file contents: ", file_contents)
        #print(type(file_contents))
        print(csv_input)
        for row in csv_input:
            print(row)

        stream.seek(0)
        result = transform(stream.read())

        response = make_response(result)
        response.headers["Content-Disposition"] = "attachment; filename=result.csv"
        return response

    if __name__ == "__main__":
        app.run(debug=True)

I run the script above from the Anaconda prompt with the following command:

    (base) C:\Users\username> python app.py

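This prints an HTTP URL that can be copied and pasted into a browser.

The browser then shows the upload form, with a file picker and a Predict button.

So I am able to upload a CSV file and download another CSV file.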

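What I want is to apply a model script and predict the salary for every row in the CSV.

I have looked at many tutorials that provide a script for predicting values, for example https://www.youtube.com/watch?v=UbCWoMf80PY. In most of them, however, each value has to be entered by hand to get a prediction. I would like to predict the whole CSV file in one go.

The model script looks like this: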

    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeRegressor

    df = pd.read_csv('hiring.csv')
    df['experience'] = df['experience'].fillna(0)

    # Call the get_dummies function
    X = df.drop('salary', axis=1)
    y = df.salary

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=0)

    tree = DecisionTreeRegressor(max_depth=4, random_state=42)
    tree.fit(X_train, y_train)
    pred_tree = tree.predict(X_test)

The output CSV should look like this:

    experience    test_score    interview   salary
    ten           10            110         1500
    nan           5             140         1870
    six           15            90          1650
    .....

Answer:

First train the model and save it:

    import pickle

    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeRegressor

    df = pd.read_csv('hiring.csv')
    df['experience'] = df['experience'].fillna(0)

    # Call the get_dummies function
    X = df.drop('salary', axis=1)
    y = df.salary

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=0)

    tree = DecisionTreeRegressor(max_depth=4, random_state=42)
    tree.fit(X_train, y_train)
    pred_tree = tree.predict(X_test)

    # save the model to disk
    filename = 'finalized_model.sav'
    with open(filename, 'wb') as model_file:
        pickle.dump(tree, model_file)
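The sample rows in the question show experience spelled out as words (five, six), and a scikit-learn decision tree needs numeric features, so that column has to be converted before fitting. Below is a minimal sketch of one way to do that, followed by an in-memory pickle round trip to check that the restored model predicts the same values as the original. The lookup table WORD_TO_NUMBER, the helper encode_experience, and the toy rows are all made up for illustration; none of them come from hiring.csv.

```python
import pickle

import pandas as pd
from sklearn.tree import DecisionTreeRegressor

# Hypothetical lookup table; extend it to cover the words in your own data.
WORD_TO_NUMBER = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12,
}


def encode_experience(df):
    """Return a copy whose 'experience' words are replaced by integers.

    Missing values, and words that are not in the table, become 0.
    """
    out = df.copy()
    out["experience"] = out["experience"].map(WORD_TO_NUMBER).fillna(0).astype(int)
    return out


# Toy training data, invented for this example.
train = pd.DataFrame({
    "experience": ["five", None, "six", "ten"],
    "test_score": [10, 5, 15, 8],
    "interview": [110, 140, 90, 120],
    "salary": [1500, 1870, 1650, 2100],
})

X = encode_experience(train.drop("salary", axis=1))
y = train["salary"]

tree = DecisionTreeRegressor(max_depth=4, random_state=42)
tree.fit(X, y)

# Round trip through pickle (in memory here; the script above writes a file).
restored = pickle.loads(pickle.dumps(tree))
same = (restored.predict(X) == tree.predict(X)).all()
print(same)
```

Whatever encoding is used for training has to be applied to the uploaded file as well, otherwise the model sees columns it was not fitted on.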

Then load the model and run the prediction on the uploaded file. The same application can then be used to download the predictions:

    from flask import Flask, make_response, request
    import io
    import csv
    import pickle

    import pandas as pd

    app = Flask(__name__)

    # Must match the file name used when the model was saved
    MODEL_FILENAME = 'finalized_model.sav'

    def transform(text_file_contents):
        return text_file_contents.replace("=", ",")

    @app.route('/')
    def form():
        return """
            <html>
                <body>
                    <h1>Let's TRY to Predict..</h1>
                    <br>
                    <br>
                    <p> Insert your CSV file and then download the Result</p>
                    <form action="/transform" method="post" enctype="multipart/form-data">
                        <input type="file" name="data_file" class="btn btn-block"/>
                        <br>
                        <br>
                        <button type="submit" class="btn btn-primary btn-block btn-large">Predict</button>
                    </form>
                </body>
            </html>
        """

    @app.route('/transform', methods=["POST"])
    def transform_view():
        f = request.files['data_file']
        if not f:
            return "No file"

        stream = io.StringIO(f.stream.read().decode("UTF8"), newline=None)

        # Debug output: print every uploaded row to the console
        csv_input = csv.reader(stream)
        for row in csv_input:
            print(row)

        stream.seek(0)
        result = transform(stream.read())
        df = pd.read_csv(io.StringIO(result))

        # load the model from disk
        with open(MODEL_FILENAME, 'rb') as model_file:
            loaded_model = pickle.load(model_file)

        # Apply the same missing-value handling as in training,
        # without changing the values written to the output file
        features = df.fillna({'experience': 0})
        df['prediction'] = loaded_model.predict(features)

        # index=False keeps the pandas row index out of the downloaded file
        response = make_response(df.to_csv(index=False))
        response.headers["Content-Disposition"] = "attachment; filename=result.csv"
        return response

    if __name__ == "__main__":
        app.run(debug=False, port=9000)
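The prediction step inside the view can also be pulled out into a plain function, which makes it possible to try it without starting Flask or uploading a file. The following is only a sketch under some assumptions: predict_csv, its feature_columns parameter, and the SumModel stand-in are names introduced here for illustration, and the stand-in simply adds the feature values so the example runs without a trained model. Any object with a predict method, such as the unpickled tree, could be passed instead.

```python
import io

import pandas as pd


def predict_csv(csv_text, model, feature_columns):
    """Read CSV text, add a 'prediction' column, and return CSV text."""
    df = pd.read_csv(io.StringIO(csv_text))
    # Use only the columns the model was trained on; fill gaps with 0
    # to mirror the fillna(0) step in the training script.
    features = df[feature_columns].fillna(0)
    df["prediction"] = model.predict(features)
    return df.to_csv(index=False)


class SumModel:
    """Stand-in for a trained model (illustration only)."""

    def predict(self, X):
        # Adds up the feature values of each row.
        return X.sum(axis=1).to_numpy()


sample = "experience,test_score,interview\n5,10,110\n,5,140\n6,15,90\n"
columns = ["experience", "test_score", "interview"]
result = predict_csv(sample, SumModel(), columns)
print(result)
```

Selecting feature_columns explicitly means extra columns in an uploaded file are carried through to the output but never passed to the model.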
