How do I print the labels of a confusion matrix in scikit-learn?

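I am using scikit-learn to classify some data. I have 13 different class values/categories to classify the data into. I have been able to run cross-validation and print the confusion matrix. However, it only shows the raw counts (TP, FP, and so on) without the class labels, so I don't know which class is which. Below are my code and my output: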

    # Imports assumed; the original post did not show them.
    from imblearn.over_sampling import RandomOverSampler
    from imblearn.pipeline import make_pipeline
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score, confusion_matrix
    from sklearn.model_selection import cross_val_predict

    def classify_data(df, feature_cols, file):
        nbr_folds = 5
        RANDOM_STATE = 0
        attributes = df.loc[:, feature_cols]  # also known as x
        class_label = df['task']  # class label, also known as y

        file.write("\nFeatures used: ")
        for feature in feature_cols:
            file.write(feature + ",")
        print("Features used", feature_cols)

        sampler = RandomOverSampler(random_state=RANDOM_STATE)

        print("Random Forest")
        file.write("\nRandom Forest")
        rfc = RandomForestClassifier(max_depth=2, random_state=RANDOM_STATE)
        pipeline = make_pipeline(sampler, rfc)
        class_label_predicted = cross_val_predict(pipeline, attributes, class_label, cv=nbr_folds)

        conf_mat = confusion_matrix(class_label, class_label_predicted)
        print(conf_mat)
        accuracy = accuracy_score(class_label, class_label_predicted)

        print("Rows classified: " + str(len(class_label_predicted)))
        print("Accuracy: {0:.3f}%\n".format(accuracy * 100))

        file.write("\nClassifier settings:" + str(pipeline) + "\n")
        file.write("\nRows classified: " + str(len(class_label_predicted)))
        file.write("\nAccuracy: {0:.3f}%\n".format(accuracy * 100))
        # Write the matrix one row per line, tab-separated.
        file.writelines('\t'.join(str(j) for j in i) + '\n' for i in conf_mat)

Output:

    Rows classified: 23504
    Accuracy: 17.925%
    0   372 46  88  5   73  0   536 44  317 0   200 127
    0   501 29  85  0   136 0   655 9   154 0   172 67
    0   97  141 78  1   56  0   336 37  429 0   435 198
    0   135 74  416 5   37  0   507 19  323 0   128 164
    0   247 72  145 12  64  0   424 21  296 0   304 223
    0   190 41  36  0   178 0   984 29  196 0   111 43
    0   218 13  71  7   52  0   917 139 177 0   111 103
    0   215 30  84  3   71  0   1175    11  55  0   102 62
    0   257 55  156 1   13  0   322 184 463 0   197 160
    0   188 36  104 2   34  0   313 99  827 0   69  136
    0   281 80  111 22  16  0   494 19  261 0   313 211
    0   207 66  87  18  58  0   489 23  157 0   464 239
    0   113 114 44  6   51  0   389 30  408 0   338 315

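As you can see, there is no real way to tell which column is which, and the printed values are also misaligned, so the matrix is hard to read.

Is there a way to print the labels as well?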

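Answer:

Judging by the documentation, there seems to be no option for printing the row and column labels of the confusion matrix. However, you can specify the label order with the labels=... argument.

Example: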

    from sklearn.metrics import confusion_matrix

    y_true = ['yes', 'yes', 'yes', 'no', 'no', 'no']
    y_pred = ['yes', 'no', 'no', 'no', 'no', 'no']

    # Default order: labels sorted, so 'no' comes before 'yes'.
    print(confusion_matrix(y_true, y_pred))
    # Output:
    # [[3 0]
    #  [2 1]]

    # Explicit order: 'yes' first, then 'no'.
    print(confusion_matrix(y_true, y_pred, labels=['yes', 'no']))
    # Output:
    # [[1 2]
    #  [0 3]]
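If you omit labels=, it helps to know which class each row stands for. The following is a minimal sketch, assuming that the default ordering matches what sklearn.utils.multiclass.unique_labels returns for the two arrays (the sorted labels that appear in either one). It reuses the toy data from the example above.

```python
from sklearn.metrics import confusion_matrix
from sklearn.utils.multiclass import unique_labels

y_true = ['yes', 'yes', 'yes', 'no', 'no', 'no']
y_pred = ['yes', 'no', 'no', 'no', 'no', 'no']

# Sorted labels found in either array; assumed to be the order
# confusion_matrix uses when labels= is not given.
default_order = unique_labels(y_true, y_pred)
cm = confusion_matrix(y_true, y_pred)

# Pair each row of the matrix with the true class it counts.
for label, row in zip(default_order, cm):
    print(label, row)
```

Each printed line starts with the true class, followed by the counts for every predicted class in the same order.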

If you want to print the confusion matrix with labels, you can try pandas and set the index and columns of a DataFrame.

    import pandas as pd

    # Rows are true classes, columns are predicted classes.
    cmtx = pd.DataFrame(
        confusion_matrix(y_true, y_pred, labels=['yes', 'no']),
        index=['true:yes', 'true:no'],
        columns=['pred:yes', 'pred:no']
    )
    print(cmtx)
    # Output:
    #           pred:yes  pred:no
    # true:yes         1        2
    # true:no          0        3

Or

    import numpy as np

    # Collect the labels from the data instead of typing them by hand.
    unique_label = np.unique([y_true, y_pred])
    cmtx = pd.DataFrame(
        confusion_matrix(y_true, y_pred, labels=unique_label),
        index=['true:{:}'.format(x) for x in unique_label],
        columns=['pred:{:}'.format(x) for x in unique_label]
    )
    print(cmtx)
    # Output:
    #           pred:no  pred:yes
    # true:no         3         0
    # true:yes        2         1
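For a case with many classes, such as the 13 in the question, the same idea can be wrapped in a small function. This is only a sketch: labeled_confusion_matrix is a hypothetical helper name, and the three-class data below is made up for illustration. In the question's code, the arguments would presumably be class_label and class_label_predicted.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix


def labeled_confusion_matrix(y_true, y_pred):
    """Return the confusion matrix as a DataFrame with class names on both axes."""
    # Hypothetical helper: gather every class seen in either array, sorted.
    labels = np.unique(np.concatenate([np.asarray(y_true), np.asarray(y_pred)]))
    cm = confusion_matrix(y_true, y_pred, labels=labels)
    return pd.DataFrame(
        cm,
        index=['true:{}'.format(x) for x in labels],
        columns=['pred:{}'.format(x) for x in labels],
    )


# Made-up three-class data, for illustration only.
y_true = ['walk', 'run', 'sit', 'sit', 'run', 'walk']
y_pred = ['walk', 'sit', 'sit', 'sit', 'run', 'run']

cmtx = labeled_confusion_matrix(y_true, y_pred)
# to_string() renders the whole table with aligned columns.
print(cmtx.to_string())
```

Because to_string() returns the aligned table as plain text, something like file.write(cmtx.to_string() + "\n") could replace the tab-joined writelines call in the question, which should also take care of the alignment problem.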
