我想在Python中比较不同的二元分类器。为此，我希望计算ROC AUC分数，测量95%置信区间（CI），并计算p值以评估统计显著性。

下面是一个使用scikit-learn的最小示例，它在一个二元分类数据集上训练了三个不同的模型，绘制了ROC曲线并计算了AUC分数。

以下是我的具体问题：

如何计算测试集上ROC AUC分数的95%置信区间（CI）？（例如，使用自助法）。
如何比较测试集上的AUC分数并测量p值以评估统计显著性？（原假设是模型之间没有差异。拒绝原假设意味着AUC分数的差异在统计上是显著的。）

import numpy as npnp.random.seed(2018)from sklearn.datasets import load_breast_cancerfrom sklearn.metrics import roc_auc_score, roc_curvefrom sklearn.model_selection import train_test_splitfrom sklearn.naive_bayes import GaussianNBfrom sklearn.ensemble import RandomForestClassifierfrom sklearn.neural_network import MLPClassifierimport matplotlibimport matplotlib.pyplot as pltdata = load_breast_cancer()X = data.datay = data.targetX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=17)# Naive Bayes Classifiernb_clf = GaussianNB()nb_clf.fit(X_train, y_train)nb_prediction_proba = nb_clf.predict_proba(X_test)[:, 1]# Ranodm Forest Classifierrf_clf = RandomForestClassifier(n_estimators=20)rf_clf.fit(X_train, y_train)rf_prediction_proba = rf_clf.predict_proba(X_test)[:, 1]# Multi-layer Perceptron Classifiermlp_clf = MLPClassifier(alpha=1, hidden_layer_sizes=150)mlp_clf.fit(X_train, y_train)mlp_prediction_proba = mlp_clf.predict_proba(X_test)[:, 1]def roc_curve_and_score(y_test, pred_proba):    fpr, tpr, _ = roc_curve(y_test.ravel(), pred_proba.ravel())    roc_auc = roc_auc_score(y_test.ravel(), pred_proba.ravel())    return fpr, tpr, roc_aucplt.figure(figsize=(8, 6))matplotlib.rcParams.update({'font.size': 14})plt.grid()fpr, tpr, roc_auc = roc_curve_and_score(y_test, rf_prediction_proba)plt.plot(fpr, tpr, color='darkorange', lw=2,         label='ROC AUC={0:.3f}'.format(roc_auc))fpr, tpr, roc_auc = roc_curve_and_score(y_test, nb_prediction_proba)plt.plot(fpr, tpr, color='green', lw=2,         label='ROC AUC={0:.3f}'.format(roc_auc))fpr, tpr, roc_auc = roc_curve_and_score(y_test, mlp_prediction_proba)plt.plot(fpr, tpr, color='crimson', lw=2,         label='ROC AUC={0:.3f}'.format(roc_auc))plt.plot([0, 1], [0, 1], color='navy', lw=1, linestyle='--')plt.legend(loc="lower right")plt.xlim([0.0, 1.0])plt.ylim([0.0, 1.05])plt.xlabel('1 - Specificity')plt.ylabel('Sensitivity')plt.show()

回答：

自助法计算95%置信区间

你需要对数据进行多次重抽样重复分析。一般情况下，假设你有一个函数f(x)，它从数据x中计算你需要的任何统计量，你可以这样进行自助法：

def bootstrap(x, f, nsamples=1000):    stats = [f(x[np.random.randint(x.shape[0], size=x.shape[0])]) for _ in range(nsamples)]    return np.percentile(stats, (2.5, 97.5))

这会给你所谓的95%置信区间的插入估计值（即你只需取自助分布的百分位数）。

在你的情况下，你可以编写一个更具体的函数，如下所示：

def bootstrap_auc(clf, X_train, y_train, X_test, y_test, nsamples=1000):    auc_values = []    for b in range(nsamples):        idx = np.random.randint(X_train.shape[0], size=X_train.shape[0])        clf.fit(X_train[idx], y_train[idx])        pred = clf.predict_proba(X_test)[:, 1]        roc_auc = roc_auc_score(y_test.ravel(), pred.ravel())        auc_values.append(roc_auc)    return np.percentile(auc_values, (2.5, 97.5))

这里，clf是你想要测试性能的分类器，X_train、y_train、X_test、y_test与你的代码中一样。

这给我以下置信区间（四舍五入到三位小数，1000次自助样本）：

朴素贝叶斯：0.986 [0.980 0.988]（估计值，置信区间的下限和上限）
随机森林：0.983 [0.974 0.989]
多层感知器：0.974 [0.223 0.98]

学技术

如何在Python中比较不同二元分类器的ROC AUC分数并评估统计显著性？（p值，置信区间）

自助法计算95%置信区间

发表回复取消回复

自助法计算95%置信区间

相关文章：

Related Posts

Flatten and back keras

如何按索引访问PyTorch模型参数

Python中多元逻辑回归显示错误

在MACOS上安装NLTK

如何在R中将通过RFE选择的变量插入到机器学习模型中？

CountVectorizer 错误：ValueError: setting an array element with a sequence

发表回复 取消回复

发表回复取消回复