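On each cell of the heatmap, I want to display the actual number of predictions, either as a percentage or as a count. I also want to label the cells with names such as "True Positive" and "False Negative".
The code is as follows: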
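import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Rows are the actual labels (ytest), columns are the predicted labels
sns.heatmap(pd.crosstab(ytest, classifier.predict(xtest)), cmap='Spectral')
plt.xlabel('predicted')
plt.ylabel('actual')
plt.show()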
Answer:
I use the following code to get what you are asking for, although a Google search would also have turned up the answer.
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix

def find_best_threshold(threshold, fpr, tpr):
    # TPR * TNR: we try to maximize both TNR (= 1 - FPR) and TPR
    t = threshold[np.argmax(tpr * (1 - fpr))]
    print("max of tpr*(1-fpr):", max(tpr * (1 - fpr)), "at threshold", np.round(t, 3))
    return t

def predict_with_best_thresh(prob, t):
    # Label a sample 1 when its predicted probability reaches the threshold
    pred = [1 if i >= t else 0 for i in prob]
    return pred

### https://medium.com/@dtuk81/confusion-matrix-visualization-fc31e3f30fea
def conf_matrix_plot(cf_matrix, title):
    # Order matches cf_matrix.flatten() for a 2x2 matrix: [[TN, FP], [FN, TP]]
    group_names = ['True Negative', 'False Positive', 'False Negative', 'True Positive']
    group_counts = ["{0:0.0f}".format(value) for value in cf_matrix.flatten()]
    group_percentages = ["{0:.2%}".format(value) for value in cf_matrix.flatten() / np.sum(cf_matrix)]
    labels = [f"{v1}\n{v2}\n{v3}" for v1, v2, v3 in zip(group_names, group_counts, group_percentages)]
    labels = np.asarray(labels).reshape(2, 2)
    # sns.set(font_scale=1.5)
    sns.heatmap(cf_matrix, annot=labels, fmt='', cmap='coolwarm').set_title(title + ' TFIDF confusion matrix')
    # confusion_matrix puts actual classes on rows (y-axis) and predicted classes on columns (x-axis)
    plt.xlabel('Predicted')
    plt.ylabel('Actual')

# tr_thresholds, train_fpr, train_tpr, y_train, y_test, y_train_pred and y_test_pred
# come from earlier code that is not shown here
best_t = find_best_threshold(tr_thresholds, train_fpr, train_tpr)
cf_matrix_train = confusion_matrix(y_train, predict_with_best_thresh(y_train_pred[:, 1], best_t))
cf_matrix_test = confusion_matrix(y_test, predict_with_best_thresh(y_test_pred[:, 1], best_t))
conf_matrix_plot(cf_matrix_train, 'Train')
The result looks like this:
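To see how the cell labels line up with the counts, here is a minimal sketch in plain Python, with no plotting. The helper names `confusion_counts` and `annotate` and the toy labels are made up for illustration and are not part of the answer above. It assumes binary 0/1 labels and the same layout that sklearn's `confusion_matrix` uses: rows are actual classes, columns are predicted classes.

```python
def confusion_counts(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for binary 0/1 labels (rows = actual, columns = predicted)."""
    matrix = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix


def annotate(matrix):
    """Build one 'name, count, percentage' label per cell, in the same 2x2 layout."""
    names = [["True Neg", "False Pos"], ["False Neg", "True Pos"]]
    total = sum(sum(row) for row in matrix)
    return [
        [f"{names[r][c]}\n{matrix[r][c]}\n{matrix[r][c] / total:.2%}" for c in range(2)]
        for r in range(2)
    ]


# Toy data (hypothetical): 8 samples
y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 1, 0, 1, 0, 1, 1, 0]

matrix = confusion_counts(y_true, y_pred)
labels = annotate(matrix)

for row in labels:
    for cell in row:
        print(cell.replace("\n", " | "))
```

A 2x2 grid of strings like `labels` is the kind of value that can be passed as `annot=` to `sns.heatmap` together with `fmt=''`, so that each cell shows its name, count and percentage.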