XGBoost fails to run with CalibratedClassifierCV

I am trying to run XGBoost with a calibrated classifier. Here is the code snippet that raises the error:

from sklearn.calibration import CalibratedClassifierCV
from xgboost import XGBClassifier
import numpy as np

x_train = np.array([1,2,2,3,4,5,6,3,4,10,]).reshape(-1,1)
y_train = np.array([1,1,1,1,1,3,3,3,3,3])

x_cfl = XGBClassifier(n_estimators=1)
x_cfl.fit(x_train, y_train)

sig_clf = CalibratedClassifierCV(x_cfl, method="sigmoid")
sig_clf.fit(x_train, y_train)

Error:

TypeError: predict_proba() got an unexpected keyword argument 'X'

Full traceback:

TypeError                                 Traceback (most recent call last)
<ipython-input-48-08dd0b4ae8aa> in <module>
----> 1 sig_clf.fit(x_train, y_train)

~/anaconda3/lib/python3.8/site-packages/sklearn/calibration.py in fit(self, X, y, sample_weight)
    309                 parallel = Parallel(n_jobs=self.n_jobs)
    310
--> 311                 self.calibrated_classifiers_ = parallel(
    312                     delayed(_fit_classifier_calibrator_pair)(
    313                         clone(base_estimator), X, y, train=train, test=test,

~/anaconda3/lib/python3.8/site-packages/joblib/parallel.py in __call__(self, iterable)
   1039             # remaining jobs.
   1040             self._iterating = False
-> 1041             if self.dispatch_one_batch(iterator):
   1042                 self._iterating = self._original_iterator is not None
   1043

~/anaconda3/lib/python3.8/site-packages/joblib/parallel.py in dispatch_one_batch(self, iterator)
    857                 return False
    858             else:
--> 859                 self._dispatch(tasks)
    860                 return True
    861

~/anaconda3/lib/python3.8/site-packages/joblib/parallel.py in _dispatch(self, batch)
    775         with self._lock:
    776             job_idx = len(self._jobs)
--> 777             job = self._backend.apply_async(batch, callback=cb)
    778             # A job can complete so quickly than its callback is
    779             # called before we get here, causing self._jobs to

~/anaconda3/lib/python3.8/site-packages/joblib/_parallel_backends.py in apply_async(self, func, callback)
    206     def apply_async(self, func, callback=None):
    207         """Schedule a func to be run"""
--> 208         result = ImmediateResult(func)
    209         if callback:
    210             callback(result)

~/anaconda3/lib/python3.8/site-packages/joblib/_parallel_backends.py in __init__(self, batch)
    570         # Don't delay the application, to avoid keeping the input
    571         # arguments in memory
--> 572         self.results = batch()
    573
    574     def get(self):

~/anaconda3/lib/python3.8/site-packages/joblib/parallel.py in __call__(self)
    260         # change the default number of processes to -1
    261         with parallel_backend(self._backend, n_jobs=self._n_jobs):
--> 262             return [func(*args, **kwargs)
    263                     for func, args, kwargs in self.items]
    264

~/anaconda3/lib/python3.8/site-packages/joblib/parallel.py in <listcomp>(.0)
    260         # change the default number of processes to -1
    261         with parallel_backend(self._backend, n_jobs=self._n_jobs):
--> 262             return [func(*args, **kwargs)
    263                     for func, args, kwargs in self.items]
    264

~/anaconda3/lib/python3.8/site-packages/sklearn/utils/fixes.py in __call__(self, *args, **kwargs)
    220     def __call__(self, *args, **kwargs):
    221         with config_context(**self.config):
--> 222             return self.function(*args, **kwargs)

~/anaconda3/lib/python3.8/site-packages/sklearn/calibration.py in _fit_classifier_calibrator_pair(estimator, X, y, train, test, supports_sw, method, classes, sample_weight)
    443     n_classes = len(classes)
    444     pred_method = _get_prediction_method(estimator)
--> 445     predictions = _compute_predictions(pred_method, X[test], n_classes)
    446
    447     sw = None if sample_weight is None else sample_weight[test]

~/anaconda3/lib/python3.8/site-packages/sklearn/calibration.py in _compute_predictions(pred_method, X, n_classes)
    499         (X.shape[0], 1).
    500     """
--> 501     predictions = pred_method(X=X)
    502     if hasattr(pred_method, '__name__'):
    503         method_name = pred_method.__name__

TypeError: predict_proba() got an unexpected keyword argument 'X'

This really surprises me, because it worked for me until yesterday, and the same code still runs when I use a different classifier:

from sklearn.calibration import CalibratedClassifierCV
from lightgbm import LGBMClassifier
import numpy as np

x_train = np.array([1,2,2,3,4,5,6,3,4,10,]).reshape(-1,1)
y_train = np.array([1,1,1,1,1,3,3,3,3,3])

x_cfl = LGBMClassifier(n_estimators=1)
x_cfl.fit(x_train, y_train)

sig_clf = CalibratedClassifierCV(x_cfl, method="sigmoid")
sig_clf.fit(x_train, y_train)

Output:

CalibratedClassifierCV(base_estimator=LGBMClassifier(n_estimators=1))

Is something wrong with my XGBoost installation? I installed it with conda, and I remember uninstalling and reinstalling xgboost yesterday.

My xgboost version:

1.3.0


Answer:

I think the problem is in XGBoost. It is explained here: https://github.com/dmlc/xgboost/pull/6555

XGBoost defines:

predict_proba(self, data, ...

rather than:

predict_proba(self, X, ...

Because sklearn 0.24 calls clf.predict_proba(X=X), the keyword X does not match any parameter and the exception is raised.
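To check which parameter name an installed estimator actually exposes, something like the following could help. It is only a sketch: `first_parameter_name` and `DataKeywordClassifier` are hypothetical names introduced here, and the stand-in class merely mimics the signature described above so the snippet runs without xgboost. With xgboost installed, you would pass `XGBClassifier.predict_proba` to the helper instead.

```python
import inspect


def first_parameter_name(method):
    """Return the name of the first non-self parameter of a method."""
    names = [n for n in inspect.signature(method).parameters if n != "self"]
    return names[0] if names else None


# Hypothetical stand-in that mimics the signature described above;
# it is not the real XGBClassifier.
class DataKeywordClassifier:
    def predict_proba(self, data):
        # Dummy probabilities: one [p0, p1] row per sample.
        return [[0.5, 0.5] for _ in data]


name = first_parameter_name(DataKeywordClassifier.predict_proba)
print(name)  # "data", so a call with X=... cannot bind

# Calling with the keyword X reproduces the kind of error in the traceback.
try:
    DataKeywordClassifier().predict_proba(X=[[1], [2]])
    keyword_error = None
except TypeError as exc:
    keyword_error = str(exc)
print(keyword_error)
```

If the helper reports anything other than X for your installed version, a keyword call such as predict_proba(X=X) will fail the same way.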

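Here is a workaround that does not require changing your package versions: create a class that inherits from XGBClassifier, override predict_proba with the correct parameter name (X), and call super().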
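A minimal sketch of that workaround follows. To keep it runnable without xgboost installed, `DataKeywordClassifier` is a hypothetical stand-in whose predict_proba names its first parameter `data`; with the real library the base class would be XGBClassifier. `XKeywordClassifier` is likewise just an illustrative name.

```python
# Hypothetical stand-in for an estimator whose predict_proba names its
# first parameter "data". With the real library, inherit from
# xgboost.XGBClassifier instead of this class.
class DataKeywordClassifier:
    def predict_proba(self, data):
        # Dummy probabilities: one [p0, p1] row per sample.
        return [[0.5, 0.5] for _ in data]


class XKeywordClassifier(DataKeywordClassifier):
    """Subclass whose predict_proba accepts the keyword X."""

    def predict_proba(self, X, *args, **kwargs):
        # Forward X positionally so the parent's parameter name no longer matters.
        return super().predict_proba(X, *args, **kwargs)


clf = XKeywordClassifier()
probs = clf.predict_proba(X=[[1], [2], [3]])  # keyword call now binds
print(probs)
```

Applied to a subclass of XGBClassifier, the same override should let CalibratedClassifierCV call predict_proba(X=X). Because the subclass does not define its own constructor, it keeps the parent's parameters, which is what cloning inside CalibratedClassifierCV relies on.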
