How do I integrate G-mean into the cross_validate function?

    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_validate

    # X, y: feature matrix and binary labels, assumed to be defined already
    scores = cross_validate(
        LogisticRegression(class_weight='balanced', max_iter=100000),
        X, y, cv=5,
        scoring=('roc_auc', 'average_precision', 'f1', 'recall', 'balanced_accuracy'),
    )
    scores['test_roc_auc'].mean(), scores['test_average_precision'].mean(), scores['test_f1'].mean(), scores['test_recall'].mean(), scores['test_balanced_accuracy'].mean()

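Given the cross_validate scoring parameter above, how can I also compute the G-mean, which I currently calculate in either of the following ways: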

    from imblearn.metrics import geometric_mean_score

    print('The geometric mean is {}'.format(geometric_mean_score(y_test, y_test_pred)))

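    import numpy as np
    from sklearn.metrics import accuracy_score

    g_mean = 1.0
    # Multiply the per-class accuracies (the recall of each class)
    for label in np.unique(y_test):
        idx = (y_test == label)
        g_mean *= accuracy_score(y_test[idx], y_test_pred[idx])
    # With two classes, the square root of the product is the geometric mean
    g_mean = np.sqrt(g_mean)
    score = g_mean
    print(score)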

Answer:

Just pass it as a custom scorer:

    from sklearn.metrics import make_scorer
    from imblearn.metrics import geometric_mean_score

    gm_scorer = make_scorer(geometric_mean_score, greater_is_better=True, average='binary')

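Set greater_is_better=True (which is also the default) because higher scores are better, with the best value being 1. Any extra parameters for geometric_mean_score, such as average above, can be passed directly to make_scorer.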
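To illustrate how make_scorer forwards extra keyword arguments, here is a minimal sketch that uses scikit-learn only. The helper `per_class_g_mean` and its `correction` parameter are hypothetical names introduced for this example. The helper is meant to follow the same idea as the manual loop in the question (geometric mean of per-class recalls) and is not a drop-in replacement for imblearn's geometric_mean_score.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import make_scorer, recall_score


def per_class_g_mean(y_true, y_pred, correction=0.0):
    """Geometric mean of the per-class recalls (hypothetical helper)."""
    recalls = recall_score(y_true, y_pred, average=None)
    # Swap zero recalls for `correction` so that a single missed class
    # does not force the whole score to zero (no effect when correction=0).
    recalls = np.where(recalls == 0.0, correction, recalls)
    return float(np.prod(recalls) ** (1.0 / len(recalls)))


# make_scorer stores the extra keyword argument and forwards it on every call.
gm_scorer_sklearn = make_scorer(
    per_class_g_mean, greater_is_better=True, correction=0.001
)

# Toy labels: class 0 recall is 3/4, class 1 recall is 1/2.
y_true = np.array([0, 0, 0, 0, 1, 1])
y_pred = np.array([0, 0, 0, 1, 1, 0])
print(per_class_g_mean(y_true, y_pred))

# A classifier that always predicts 0 has zero recall on class 1,
# so the scorer's forwarded correction=0.001 comes into play here.
X_demo = np.zeros((6, 1))
always_zero = DummyClassifier(strategy="constant", constant=0).fit(X_demo, y_true)
print(gm_scorer_sklearn(always_zero, X_demo, y_true))
```

A scorer built this way can be passed as the scoring argument of cross_validate in the same way as gm_scorer.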

Full example

    from sklearn.model_selection import cross_validate
    from sklearn.metrics import make_scorer
    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from imblearn.metrics import geometric_mean_score

    X, y = load_breast_cancer(return_X_y=True)

    gm_scorer = make_scorer(geometric_mean_score, greater_is_better=True)

    scores = cross_validate(
        LogisticRegression(class_weight='balanced', max_iter=100000),
        X, y,
        cv=5,
        scoring=gm_scorer
    )
    scores
    >>>
    {'fit_time': array([0.76488066, 0.69808364, 1.22158527, 0.94157672, 1.01577377]),
     'score_time': array([0.00103951, 0.00100923, 0.00065804, 0.00071168, 0.00068736]),
     'test_score': array([0.91499142, 0.93884403, 0.9860133 , 0.92439026, 0.9525989 ])}

Edit

To specify multiple metrics, pass a dictionary to the scoring parameter. Each key becomes a test_<key> entry in the result:

    scores = cross_validate(
        LogisticRegression(class_weight='balanced', max_iter=100000),
        X, y,
        cv=5,
        scoring={'gm_scorer': gm_scorer, 'AUC': 'roc_auc', 'Avg_Precision': 'average_precision'}
    )
    scores
    >>>
    {'fit_time': array([1.03509665, 0.96399784, 1.49760461, 1.13874388, 1.32006526]),
     'score_time': array([0.00560617, 0.00357151, 0.0057447 , 0.00566769, 0.00549698]),
     'test_gm_scorer': array([0.91499142, 0.93884403, 0.9860133 , 0.92439026, 0.9525989 ]),
     'test_AUC': array([0.99443171, 0.99344907, 0.99801587, 0.97949735, 0.99765258]),
     'test_Avg_Precision': array([0.99670544, 0.99623085, 0.99893162, 0.98640759, 0.99861043])}
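The question averaged each metric over the folds by hand. With a scoring dictionary the same summary can be sketched as a loop over the test_ keys. The example below uses only built-in scikit-learn metric names so that it runs without imblearn; the dictionary keys `AUC` and `Balanced_Acc` are arbitrary labels chosen for illustration, and the gm_scorer from the answer could be added as one more entry.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

X, y = load_breast_cancer(return_X_y=True)

scores = cross_validate(
    LogisticRegression(class_weight="balanced", max_iter=100000),
    X, y,
    cv=5,
    # Keys are arbitrary labels; values are built-in scorer names.
    scoring={"AUC": "roc_auc", "Balanced_Acc": "balanced_accuracy"},
)

# Keep only the per-fold test scores and average each metric over the folds.
mean_scores = {
    name[len("test_"):]: float(values.mean())
    for name, values in scores.items()
    if name.startswith("test_")
}

for name, value in mean_scores.items():
    print(f"{name}: {value:.4f}")
```

The fit_time and score_time entries are skipped because they do not start with test_.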
