Since my dataset is imbalanced, the following method uses a KNN classifier with StratifiedShuffleSplit:
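import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.neighbors import KNeighborsClassifier

def KNN(train_x, train_y):
    skf = StratifiedShuffleSplit()
    scores = []
    for train, test in skf.split(train_x, train_y):
        clf = KNeighborsClassifier(n_neighbors=2, n_jobs=-1)
        clf.fit(train_x.loc[train], train_y.loc[train])
        # clf.score() returns accuracy for classifiers
        score = clf.score(train_x.loc[test], train_y.loc[test])
        scores.append(score)
    res = np.asarray(scores).mean()
    print(res)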
How can I modify scores to compute the recall and precision metrics instead of the default accuracy?
Thanks,
Answer:
You need:
sklearn.metrics.recall_score(y_true, y_pred)
sklearn.metrics.precision_score(y_true, y_pred)
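To make the two calls concrete, here is a minimal sketch on hand-made labels. The arrays `y_true` and `y_pred` are invented for illustration and are not from the question's data. With the default settings both functions treat label 1 as the positive class; for multiclass targets you would need to pass an `average` argument.

```python
from sklearn.metrics import precision_score, recall_score

# Hypothetical toy labels: 3 actual positives, 4 predicted positives
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1]

# True positives = 2, false negatives = 1, false positives = 2
recall = recall_score(y_true, y_pred)        # TP / (TP + FN) = 2 / 3
precision = precision_score(y_true, y_pred)  # TP / (TP + FP) = 2 / 4

print(recall, precision)
```

On an imbalanced dataset these two numbers are usually more informative than accuracy, since accuracy can stay high even when the minority class is rarely predicted correctly.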
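import numpy as np
from sklearn.metrics import recall_score
from sklearn.metrics import precision_score
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.neighbors import KNeighborsClassifier

def KNN(train_x, train_y):
    skf = StratifiedShuffleSplit()
    scores = []   # recall per split
    scores2 = []  # precision per split
    for train, test in skf.split(train_x, train_y):
        clf = KNeighborsClassifier(n_neighbors=2, n_jobs=-1)
        # split() yields positional indices, so select rows with .iloc
        # (.loc only matches when the index is the default 0..n-1)
        clf.fit(train_x.iloc[train], train_y.iloc[train])
        y_pred = clf.predict(train_x.iloc[test])  # predict labels for the test split
        y_true = train_y.iloc[test]               # true labels for the test split
        score = recall_score(y_true, y_pred)      # recall estimate
        score2 = precision_score(y_true, y_pred)  # precision estimate
        scores.append(score)
        scores2.append(score2)
    # Average over the splits, as the question's code does for accuracy
    print(np.asarray(scores).mean())
    print(np.asarray(scores2).mean())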
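As an alternative to the manual loop, scikit-learn's `cross_validate` can compute several metrics per split in one call. The following is only a sketch under stated assumptions: the data comes from `make_classification` as a synthetic stand-in for `train_x` and `train_y`, and the values of `n_splits`, `test_size`, and `random_state` are arbitrary choices, not values from the question.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedShuffleSplit, cross_validate
from sklearn.neighbors import KNeighborsClassifier

# Synthetic imbalanced binary data (assumption: stands in for train_x / train_y)
X, y = make_classification(
    n_samples=500, n_features=10, weights=[0.8, 0.2], random_state=0
)

# Same splitter and classifier as in the question; parameters are illustrative
cv = StratifiedShuffleSplit(n_splits=10, test_size=0.1, random_state=0)
clf = KNeighborsClassifier(n_neighbors=2)

# One recall value and one precision value are produced per split
results = cross_validate(clf, X, y, cv=cv, scoring=["recall", "precision"])

mean_recall = results["test_recall"].mean()
mean_precision = results["test_precision"].mean()
print(f"recall: {mean_recall:.3f}, precision: {mean_precision:.3f}")
```

The result keys follow the pattern `test_<scorer name>`, so further metrics such as `"f1"` can be added to the `scoring` list without changing the rest of the code.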