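Error in the classification report for the test set when using SVM for machine learning in Python

I split my data into a test set and a training set, both of which contain the target values '0' and '1'. But after fitting and predicting with SVM, the classification report says there are no '0's in the test sample, which is clearly incorrect.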

    import pandas as pd
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split

    data = load_breast_cancer()
    df = pd.DataFrame(data=data['data'], columns=data['feature_names'])
    x = df
    y = data['target']
    xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.3, random_state=42)

As you can see, the test set does contain both 0s and 1s, yet the support column in the classification report shows no 0s at all!

!(https://i.sstatic.net/n2uUM.png)

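Answer:

(It is always a good idea to include the relevant code in the example itself rather than in an image.)

"the classification report says there are no '0's in the test sample, which is clearly incorrect."

This is because, judging from the code in the image you linked, you have swapped the order of the arguments to classification_report; you are using: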

    print(classification_report(pred, ytest))  # wrong order of arguments

which indeed gives:

                 precision    recall  f1-score   support

        class 0       0.00      0.00      0.00         0
        class 1       1.00      0.63      0.77       171

    avg / total       1.00      0.63      0.77       171

But the correct usage (see the docs) is

    print(classification_report(ytest, pred))  # ytest first

which gives

                 precision    recall  f1-score   support

        class 0       0.00      0.00      0.00        63
        class 1       0.63      1.00      0.77       108

    avg / total       0.40      0.63      0.49       171

along with the following warning message:

    C:\Users\Root\Anaconda3\envs\tensorflow1\lib\site-packages\sklearn\metrics\classification.py:1135: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
      'precision', 'predicted', average, warn_for)

because, as already pointed out in the comments, you are predicting only 1s:

    pred
    # result:
    array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
           1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
           1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
           1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
           1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
           1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
           1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
           1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])

Why the model predicts only 1s is another story, and outside the scope of the current question.
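To make the effect of the argument order concrete, here is a minimal pure-Python sketch of how per-class precision, recall and support can be computed. It is a simplified illustration, not scikit-learn's implementation; the helper name `per_class_report` and the label lists are hypothetical and only mirror the counts reported above (63 zeros, 108 ones, every prediction equal to 1).

```python
from collections import Counter

def per_class_report(y_true, y_pred):
    """Return {label: (precision, recall, support)} for each label.

    Simplified sketch for illustration, not scikit-learn's code.
    """
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)    # support is counted from the FIRST argument
    predicted = Counter(y_pred)  # how often each label was predicted
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    report = {}
    for label in labels:
        # A ratio with a zero denominator is undefined; report it as 0.0
        precision = hits[label] / predicted[label] if predicted[label] else 0.0
        recall = hits[label] / support[label] if support[label] else 0.0
        report[label] = (round(precision, 2), round(recall, 2), support[label])
    return report

# Hypothetical labels mirroring the counts above
y_true = [0] * 63 + [1] * 108
y_pred = [1] * 171

correct = per_class_report(y_true, y_pred)  # true labels first
swapped = per_class_report(y_pred, y_true)  # arguments swapped
print(correct)
print(swapped)
```

Because support is counted from whatever is passed first, swapping the arguments makes the all-1 predictions play the role of the true labels, so class 0 ends up with a support of 0 and the precision and recall of class 1 trade places.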

Here is the full reproducible code for the above:

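    from sklearn.datasets import load_breast_cancer
    from sklearn.metrics import classification_report
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    X, y = load_breast_cancer(return_X_y=True)
    xtrain, xtest, ytrain, ytest = train_test_split(X, y, test_size=0.3, random_state=42)

    # The outputs above come from an older scikit-learn release; newer releases
    # use a different default gamma for SVC, so the predictions may differ.
    svc = SVC()
    svc.fit(xtrain, ytrain)
    pred = svc.predict(xtest)

    print(classification_report(ytest, pred))  # true labels first, predictions second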
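One way to guard against swapping the arguments is to pass them by keyword. The sketch below assumes a scikit-learn release recent enough to accept the `zero_division` and `output_dict` parameters of classification_report; the label lists are synthetic and only mirror the counts above, so no model is trained here.

```python
from sklearn.metrics import classification_report

# Synthetic labels (an assumption for illustration): 63 zeros, 108 ones,
# and a model that predicts 1 for every sample
y_true = [0] * 63 + [1] * 108
y_pred = [1] * 171

# Keyword arguments make the role of each array explicit, so the order
# cannot be swapped by accident. zero_division=0 reports the undefined
# precision of class 0 as 0 without emitting UndefinedMetricWarning.
report = classification_report(
    y_true=y_true, y_pred=y_pred, zero_division=0, output_dict=True
)

print(report["0"]["support"], report["1"]["support"])
```

With `output_dict=True` the report comes back as a nested dictionary keyed by the label as a string, which is convenient for checking individual values such as the support of each class.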
