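I am trying to get the mean precision and recall for both the positive and the negative class across 10-fold cross-validation. My model is a binary classifier.
I ran the code below, but unfortunately it only returns the mean precision and recall for the positive class. How can I make it return the mean precision and recall for the negative class as well?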
    import numpy as np
    from sklearn.metrics import make_scorer, accuracy_score, precision_score, recall_score, f1_score
    from sklearn.model_selection import cross_validate

    # With default arguments these scorers report on the positive class (label 1) only.
    scoring = {'accuracy': make_scorer(accuracy_score),
               'precision': make_scorer(precision_score),
               'recall': make_scorer(recall_score),
               'f1_score': make_scorer(f1_score)}

    results = cross_validate(model_unbalanced_data_10_times_weight, X, Y, cv=10, scoring=scoring)

    # Mean over the 10 folds
    np.mean(results['test_precision'])
    np.mean(results['test_recall'])
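I also tried printing a classification report with classification_report(y_test, predictions) and got the result shown in the screenshot. However, I believe the precision/recall scores in that report are based on a single run only, not on the average over the 10 folds (please correct me if I am wrong).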
Answer:
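Based on our discussion above, I do think that computing the predictions for each CV fold and then computing classification_report on them is the right approach. Every sample is then predicted by a model that did not see it during training, so the report covers all of the CV folds, although it pools the out-of-fold predictions rather than averaging per-fold scores: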
    >>> from sklearn.metrics import classification_report
    >>> from sklearn.datasets import load_iris
    >>> from sklearn.ensemble import RandomForestClassifier
    >>> from sklearn.model_selection import cross_val_predict
    >>>
    >>> iris = load_iris()
    >>>
    >>> rf_clf = RandomForestClassifier()
    >>>
    >>> preds = cross_val_predict(estimator=rf_clf,
    ...                           X=iris["data"],
    ...                           y=iris["target"],
    ...                           cv=15)
    >>>
    >>> print(classification_report(iris["target"], preds))
                  precision    recall  f1-score   support

               0       1.00      1.00      1.00        50
               1       0.92      0.94      0.93        50
               2       0.94      0.92      0.93        50

        accuracy                           0.95       150
       macro avg       0.95      0.95      0.95       150
    weighted avg       0.95      0.95      0.95       150
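If a true per-fold average is wanted for each class, as in the original question, one option is to keep cross_validate and build one scorer per class by passing pos_label through make_scorer. The following is only a minimal sketch under some assumptions: the classes are labelled 0 (negative) and 1 (positive), the synthetic data from make_classification stands in for the asker's X and Y, and LogisticRegression is a placeholder for the asker's model. The scorer names (precision_neg and so on) are made up for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, precision_score, recall_score
from sklearn.model_selection import cross_validate

# Assumption: synthetic, mildly imbalanced binary data standing in for X and Y.
X, y = make_classification(n_samples=500, n_features=10,
                           weights=[0.7, 0.3], random_state=0)

# Assumption: placeholder model; substitute the real classifier here.
model = LogisticRegression(max_iter=1000)

# One scorer per class: pos_label selects which label counts as "positive"
# when precision_score / recall_score are evaluated on each fold.
scoring = {
    "precision_pos": make_scorer(precision_score, pos_label=1),
    "recall_pos": make_scorer(recall_score, pos_label=1),
    "precision_neg": make_scorer(precision_score, pos_label=0),
    "recall_neg": make_scorer(recall_score, pos_label=0),
}

results = cross_validate(model, X, y, cv=10, scoring=scoring)

# Average each metric over the 10 folds.
mean_scores = {name: float(np.mean(results[f"test_{name}"])) for name in scoring}
for name, value in mean_scores.items():
    print(f"{name}: {value:.3f}")
```

Unlike the pooled report above, each value here is the mean of 10 per-fold scores, so the two approaches can give slightly different numbers when fold sizes or class proportions differ.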