I tried to change an imputer's 'strategy' to "most_frequent" through set_params(), but it had no effect.
What am I doing wrong?
Code:
categorical_preprocessing = Pipeline(steps=[
    ('cat_transformer', imputer),
    ('encode', encoder),
])
numerical_preprocessing = Pipeline(steps=[
    ('numeric_transformer', imputer),
])
preprocessing = ColumnTransformer(transformers=[
    ('cat', categorical_preprocessing, cat_feats),
    ('num', numerical_preprocessing, num_feats),
])
feature_transformer = FeatureUnion(transformer_list=[
    ('pca', pca_transformer),
    ('kbest', kbest),
])
params = {
    'preprocess__cat__cat_transformer__strategy': 'most_frequent',
    'preprocess__num__numeric_transformer__strategy': 'mean',
}
pipe = Pipeline(steps=[
    ('preprocess', preprocessing),
    ('feature_selection', feature_transformer),
])
pipe = pipe.set_params(**params)
print([pipe.get_params()[key] for key in params.keys()])
Output: ['mean', 'mean']
Answer:
Try changing the second value in params to "blatantly wrong" and you will see the problem.
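A minimal sketch of that experiment, trimmed to the preprocessing step. It assumes the shared imputer is a SimpleImputer (the question does not say which class it is), and the column names are hypothetical. set_params() only stores the value, so the invalid strategy raises no error at this point.

```python
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline

# Assumption: `imputer` is one SimpleImputer reused in both branches.
imputer = SimpleImputer(strategy="mean")

preprocessing = ColumnTransformer(transformers=[
    # "color" and "height" are hypothetical column names.
    ("cat", Pipeline(steps=[("cat_transformer", imputer)]), ["color"]),
    ("num", Pipeline(steps=[("numeric_transformer", imputer)]), ["height"]),
])
pipe = Pipeline(steps=[("preprocess", preprocessing)])

params = {
    "preprocess__cat__cat_transformer__strategy": "most_frequent",
    "preprocess__num__numeric_transformer__strategy": "blatantly wrong",
}
pipe.set_params(**params)

shared_result = [pipe.get_params()[key] for key in params]
print(shared_result)  # ['blatantly wrong', 'blatantly wrong']
```

Both keys report the second value, which suggests the two parameter paths lead to the same underlying setting.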
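In your pipeline, cat_transformer and numeric_transformer are the same object, imputer. Setting strategy on that object overwrites it in both places in the pipeline, so whichever value is applied last wins. You should define two separate imputer instances.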
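One way the fix could look, again as a sketch rather than the asker's full setup: it assumes SimpleImputer, uses hypothetical column names, and leaves out the encoder and the FeatureUnion step to stay short. The names cat_imputer and num_imputer are illustrative.

```python
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline

# Hypothetical column names, for illustration only.
cat_feats = ["color"]
num_feats = ["height"]

# One imputer instance per branch, so each keeps its own `strategy`.
cat_imputer = SimpleImputer(strategy="mean")
num_imputer = SimpleImputer(strategy="mean")

categorical_preprocessing = Pipeline(steps=[("cat_transformer", cat_imputer)])
numerical_preprocessing = Pipeline(steps=[("numeric_transformer", num_imputer)])

preprocessing = ColumnTransformer(transformers=[
    ("cat", categorical_preprocessing, cat_feats),
    ("num", numerical_preprocessing, num_feats),
])
pipe = Pipeline(steps=[("preprocess", preprocessing)])

params = {
    "preprocess__cat__cat_transformer__strategy": "most_frequent",
    "preprocess__num__numeric_transformer__strategy": "mean",
}
pipe.set_params(**params)

result = [pipe.get_params()[key] for key in params]
print(result)  # ['most_frequent', 'mean']
```

If you already have a configured imputer and want an independent copy, sklearn.base.clone(imputer) returns a new unfitted estimator with the same parameters.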