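I am trying to fit a CatBoostRegressor using a train set and an eval set. For the train set there is a sample_weight parameter that can be used to weight observations, but I don't see an equivalent parameter for the eval set.
Here is an example: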
    from catboost import CatBoostRegressor

    # Initialize data
    cat_features = [0, 1, 2]
    x_train = [["a", "b", 1, 4, 5, 6], ["a", "b", 4, 5, 6, 7], ["c", "d", 30, 40, 50, 60]]
    x_eval = [["a", "b", 2, 4, 6, 8], ["a", "d", 1, 4, 50, 60]]
    y_train = [10, 20, 30]
    y_eval = [10, 20]
    w_train = [0.1, 0.2, 0.7]
    w_eval = [0.1, 0.2]

    # Initialize CatBoostRegressor
    model = CatBoostRegressor(iterations=2, learning_rate=1, depth=2)

    # Fit model
    model.fit(X=x_train, y=y_train, sample_weight=w_train,
              eval_set=(x_eval, y_eval), cat_features=cat_features)
In this example, where should w_eval go?
Answer:
To do this, you need to use the Pool class, which takes a weight argument for each dataset. Example:
    from catboost import CatBoostClassifier, Pool

    # Training data with per-observation weights
    train_data = Pool(
        data=[[1, 4, 5, 6], [4, 5, 6, 7], [30, 40, 50, 60]],
        label=[1, 1, -1],
        weight=[0.1, 0.2, 0.3]
    )

    # Evaluation data with its own per-observation weights
    eval_data = Pool(
        data=[[1, 4, 5, 6], [4, 5, 6, 7], [30, 40, 50, 60]],
        label=[1, 0, -1],
        weight=[0.7, 0.1, 0.3]
    )

    model = CatBoostClassifier(iterations=10)
    model.fit(X=train_data, eval_set=eval_data)