I want to run GridSearchCV on an SVC model that uses the one-vs-all strategy. For the one-vs-all part, I can just do this:
model_to_set = OneVsRestClassifier(SVC(kernel="poly"))
My problem is with the parameters. Say I want to try the following values:
parameters = {"C":[1,2,4,8], "kernel":["poly","rbf"],"degree":[1,2,3,4]}
To run GridSearchCV, I should do something like this:
cv_generator = StratifiedKFold(y, k=10)
model_tunning = GridSearchCV(model_to_set, param_grid=parameters, score_func=f1_score, n_jobs=1, cv=cv_generator)
However, when I run it I get the following error:
Traceback (most recent call last):
  File "/.../main.py", line 66, in <module>
    argclass_sys.set_model_parameters(model_name="SVC", verbose=3, file_path=PATH_ROOT_MODELS)
  File "/.../base.py", line 187, in set_model_parameters
    model_tunning.fit(self.feature_encoder.transform(self.train_feats), self.label_encoder.transform(self.train_labels))
  File "/usr/local/lib/python2.7/dist-packages/sklearn/grid_search.py", line 354, in fit
    return self._fit(X, y)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/grid_search.py", line 392, in _fit
    for clf_params in grid for train, test in cv)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 473, in __call__
    self.dispatch(function, args, kwargs)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 296, in dispatch
    job = ImmediateApply(func, args, kwargs)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 124, in __init__
    self.results = func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/grid_search.py", line 85, in fit_grid_point
    clf.set_params(**clf_params)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/base.py", line 241, in set_params
    % (key, self.__class__.__name__))
ValueError: Invalid parameter kernel for estimator OneVsRestClassifier
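Basically, because the SVC sits inside a OneVsRestClassifier, and that wrapper is the estimator I pass to GridSearchCV, the SVC's parameters cannot be reached.
I see two possible solutions to get what I want:
- When creating the SVC, somehow tell it to use the one-vs-all strategy instead of one-vs-one.
- Somehow tell GridSearchCV that the parameters belong to the estimator inside the OneVsRestClassifier.
I have not found a way to do either of these. Do you know of a way to do either one? Or can you suggest another way to reach the same result?
Thanks!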
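Answer:
When you use nested estimators with grid search, you can scope the parameters with __ as a separator. In this case the SVC model is stored as an attribute named estimator inside the OneVsRestClassifier model: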
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC
from sklearn.grid_search import GridSearchCV
from sklearn.metrics import f1_score

iris = load_iris()

model_to_set = OneVsRestClassifier(SVC(kernel="poly"))

parameters = {
    "estimator__C": [1, 2, 4, 8],
    "estimator__kernel": ["poly", "rbf"],
    "estimator__degree": [1, 2, 3, 4],
}

model_tunning = GridSearchCV(model_to_set, param_grid=parameters, score_func=f1_score)

model_tunning.fit(iris.data, iris.target)

print model_tunning.best_score_
print model_tunning.best_params_
This yields the following result:
0.973290762737
{'estimator__kernel': 'poly', 'estimator__C': 1, 'estimator__degree': 2}
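The code above targets the old scikit-learn API under Python 2 (the sklearn.grid_search module and the score_func argument). On a recent scikit-learn release the same idea might look roughly like the sketch below. Some choices here are my own assumptions and not part of the original answer: macro-averaged F1 via make_scorer (f1_score needs an explicit average for multiclass targets), a shuffled 10-fold StratifiedKFold with a fixed seed, and the helper names valid_names, scorer, and model_tuning. The best score and parameters may differ from the numbers shown above, since the scoring and the folds are not the same.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

iris = load_iris()
model_to_set = OneVsRestClassifier(SVC(kernel="poly"))

# get_params() lists every name that set_params() (and so GridSearchCV)
# accepts; the wrapped SVC's parameters appear with the "estimator__" prefix.
valid_names = sorted(model_to_set.get_params().keys())
print(valid_names)

parameters = {
    "estimator__C": [1, 2, 4, 8],
    "estimator__kernel": ["poly", "rbf"],
    "estimator__degree": [1, 2, 3, 4],
}

# Assumption: macro-averaged F1 as the multiclass score.
scorer = make_scorer(f1_score, average="macro")

# Assumption: 10 stratified folds, shuffled with a fixed seed for repeatability.
cv_generator = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

model_tuning = GridSearchCV(
    model_to_set,
    param_grid=parameters,
    scoring=scorer,
    cv=cv_generator,
    n_jobs=1,
)
model_tuning.fit(iris.data, iris.target)

print(model_tuning.best_score_)
print(model_tuning.best_params_)
```

Printing valid_names is a quick way to check which prefixed names a wrapper exposes before building the grid; a bare "kernel" is not among them, which is why the original call failed.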