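I want to tune a sklearn.ensemble.StackingRegressor with a simple RandomizedSearchCV. Because estimators must be supplied when StackingRegressor() is instantiated, I cannot work out how to define the parameter space for those estimators in the param_distributions of the random search.
I tried the following, but it raises an error: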
    from sklearn.datasets import load_diabetes
    from sklearn.linear_model import RidgeCV
    from sklearn.svm import LinearSVR
    from sklearn.model_selection import RandomizedSearchCV
    from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
    from sklearn.ensemble import StackingRegressor

    X, y = load_diabetes(return_X_y=True)

    rfr = RandomForestRegressor()
    gbr = GradientBoostingRegressor()
    estimators = [rfr, gbr]
    sreg = StackingRegressor(estimators=estimators)

    params = {'rfr__max_depth': [3, 5, 10, 100],
              'gbr__max_depth': [3, 5, 10, 100]}

    grid = RandomizedSearchCV(estimator=sreg,
                              param_distributions=params,
                              cv=3)
    grid.fit(X, y)
I get the error AttributeError: 'RandomForestRegressor' object has no attribute 'estimators_'.
Is there a way to tune the parameters of the individual estimators inside a StackingRegressor?
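Answer:
Your code should work if you define estimators as a list of (name, estimator instance) tuples, as shown below. The names you choose ('rfr' and 'gbr' here) are what the rfr__ and gbr__ prefixes in params refer to.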
    import pandas as pd
    from sklearn.datasets import load_diabetes
    from sklearn.model_selection import RandomizedSearchCV
    from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
    from sklearn.ensemble import StackingRegressor

    X, y = load_diabetes(return_X_y=True)

    rfr = RandomForestRegressor()
    gbr = GradientBoostingRegressor()
    # Each estimator is given a name; the name is the prefix used in params.
    estimators = [('rfr', rfr), ('gbr', gbr)]
    sreg = StackingRegressor(estimators=estimators)

    params = {
        'rfr__max_depth': [3, 5],
        'gbr__max_depth': [3, 5]
    }

    grid = RandomizedSearchCV(
        estimator=sreg,
        param_distributions=params,
        n_iter=2,
        cv=3,
        verbose=1,
        random_state=100
    )
    grid.fit(X, y)

    res = pd.DataFrame(grid.cv_results_)
    print(res)
    #    mean_fit_time  std_fit_time  ...  std_test_score  rank_test_score
    # 0       1.121728      0.024188  ...        0.024546                2
    # 1       1.096936      0.034377  ...        0.013047                1
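If you are unsure which keys are valid in param_distributions, one option is to ask the stacked model itself via get_params(). The sketch below is an illustration rather than part of the original answer: it assumes a Ridge final estimator and small n_estimators values (both chosen here only to keep the example fast), lists the tunable names, and then also tunes the final estimator through the final_estimator__ prefix.

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import Ridge
from sklearn.model_selection import RandomizedSearchCV

X, y = load_diabetes(return_X_y=True)

# Named base estimators; n_estimators=10 is an illustrative choice for speed.
estimators = [
    ("rfr", RandomForestRegressor(n_estimators=10, random_state=0)),
    ("gbr", GradientBoostingRegressor(n_estimators=10, random_state=0)),
]
# Assumption: Ridge as the final estimator (the answer above uses the default).
sreg = StackingRegressor(estimators=estimators, final_estimator=Ridge())

# Every key returned by get_params() can be used in param_distributions.
param_names = sorted(sreg.get_params().keys())
print([name for name in param_names if "max_depth" in name])

params = {
    "rfr__max_depth": [3, 5],
    "gbr__max_depth": [3, 5],
    "final_estimator__alpha": [0.1, 1.0, 10.0],
}

search = RandomizedSearchCV(
    estimator=sreg,
    param_distributions=params,
    n_iter=2,
    cv=3,
    random_state=100,
)
search.fit(X, y)
print(search.best_params_)
```

The double underscore separates the estimator name from the parameter name, so the same pattern should extend to any other parameter that appears in the get_params() output.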