How to compute a classification report for sentiment analysis with scikit-learn

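How can I get a classification report with precision, recall, accuracy, and support for a three-class classification problem, where the classes are "positive", "negative", and "neutral"? Here is the code: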

    vec_clf = Pipeline([('vectorizer', vec), ('pac', svm_clf)])
    print vec_clf.fit(X_train.values.astype('U'), y_train.values.astype('U'))
    y_pred = vec_clf.predict(X_test.values.astype('U'))
    print "SVM Accuracy-", metrics.accuracy_score(y_test, y_pred)
    print "confuson metrics :\n", metrics.confusion_matrix(y_test, y_pred, labels=["positive","negative","neutral"])
    print(metrics.classification_report(y_test, y_pred))

Running the code produces the following error:

    SVM Accuracy- 0.850318471338
    confuson metrics :
    [[206   9  67]
     [  4 373 122]
     [  9  21 756]]
    Traceback (most recent call last):
      File "<ipython-input-62-e6ab3066790e>", line 1, in <module>
        runfile('C:/Users/HP/abc16.py', wdir='C:/Users/HP')
      File "C:\ProgramData\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 880, in runfile
        execfile(filename, namespace)
      File "C:\ProgramData\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 87, in execfile
        exec(compile(scripttext, filename, 'exec'), glob, loc)
      File "C:/Users/HP/abc16.py", line 133, in <module>
        print(metrics.classification_report(y_test, y_pred))
      File "C:\ProgramData\Anaconda2\lib\site-packages\sklearn\metrics\classification.py", line 1391, in classification_report
        labels = unique_labels(y_true, y_pred)
      File "C:\ProgramData\Anaconda2\lib\site-packages\sklearn\utils\multiclass.py", line 104, in unique_labels
        raise ValueError("Mix of label input types (string and number)")
    ValueError: Mix of label input types (string and number)

Please help me understand where I am going wrong.

Edit 1: this is what y_true and y_pred look like

    print "y_true :", y_test
    print "y_pred :", y_pred

    y_true : 5985     neutral
    899     positive
    2403     neutral
    3963     neutral
    3457     neutral
    5345     neutral
    3779     neutral
    299      neutral
    5712     neutral
    5511     neutral
    234      neutral
    1684    negative
    3701    negative
    2886     neutral
    .
    .
    .
    2623    positive
    3549     neutral
    4574     neutral
    4972    positive
    Name: sentiment, Length: 1570, dtype: object
    y_pred : [u'neutral' u'positive' u'neutral' ..., u'neutral' u'neutral' u'negative']

Edit 2: output of type(y_true) and type(y_pred)

    type(y_true):  <class 'pandas.core.series.Series'>
    type(y_pred):  <type 'numpy.ndarray'>

Answer:

I cannot reproduce your error:

    import pandas as pd
    import numpy as np
    from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

    # Toy data, similar to yours:
    data = {'id': [5985, 899, 2403, 1684],
            'sentiment': ['neutral', 'positive', 'neutral', 'negative']}

    y_true = pd.Series(data['sentiment'], index=data['id'], name='sentiment')
    y_true
    # 5985     neutral
    # 899     positive
    # 2403     neutral
    # 1684    negative
    # Name: sentiment, dtype: object

    type(y_true)
    # pandas.core.series.Series

    y_pred = np.array(['neutral', 'positive', 'negative', 'neutral'])

    # All metrics run without problems:
    accuracy_score(y_true, y_pred)
    # 0.5

    confusion_matrix(y_true, y_pred)
    # array([[0, 1, 0],
    #        [1, 1, 0],
    #        [0, 0, 1]], dtype=int64)

    classification_report(y_true, y_pred)
    # Result:
    #              precision    recall  f1-score   support
    #
    #    negative       0.00      0.00      0.00         1
    #     neutral       0.50      0.50      0.50         2
    #    positive       1.00      1.00      1.00         1
    #
    #       total       0.50      0.50      0.50         4
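Since the toy data runs cleanly, the difference is probably in your real `y_test`. One possible cause, which is an assumption because the question does not confirm it, is that `y_test` contains a few non-string values (for example a NaN from a missing label): the training labels and the features are cast with `.astype('U')`, but `y_test` is passed to the metrics as-is. The following is a minimal sketch of how to check for that and work around it. The data is made up for illustration, it is written for Python 3, and `zero_division` needs a reasonably recent scikit-learn.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import classification_report

# Hypothetical labels: one missing value (NaN, a float) hides among the strings.
y_test = pd.Series(['neutral', 'positive', np.nan, 'negative'],
                   dtype=object, name='sentiment')
y_pred = np.array(['neutral', 'positive', 'neutral', 'neutral'])

# Count the Python types present in the labels; more than one type is suspicious.
type_counts = y_test.map(lambda v: type(v).__name__).value_counts()
print(type_counts)

# Keep only the rows whose label really is a string, plus the matching predictions.
is_str = y_test.map(lambda v: isinstance(v, str)).to_numpy()
y_test_clean = y_test[is_str].to_numpy().astype('U')
y_pred_clean = y_pred[is_str]

# Both arrays now hold only strings, so the label types are consistent.
report = classification_report(
    y_test_clean, y_pred_clean,
    labels=['positive', 'negative', 'neutral'],
    zero_division=0,  # report 0 instead of warning when a class is never predicted
)
print(report)
```

If the type count on your real `y_test` shows only strings, this is not the cause and the problem lies elsewhere. Whether to drop the offending rows or fix them at the source depends on why the labels are missing in the first place.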
