LGBM predictions do not change with the random state

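I am trying to compute prediction intervals for a classifier.

I trained the model through sklearn. Even after setting a new random_state parameter in my pipeline, refitting on the data does not seem to change my results. What should I do?

Here is the relevant snippet of the code I am using: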

    from lightgbm import LGBMClassifier
    from sklearn.pipeline import Pipeline

    SEED_VALUE = 3

    # `preprocessor`, `train`, and `test` are defined elsewhere
    t_clf = Pipeline(steps=[
        ('preprocessor', preprocessor),
        ('lgbm', LGBMClassifier(class_weight="balanced",
                                random_state=SEED_VALUE, max_depth=20,
                                min_child_samples=20, num_leaves=31)),
    ])

    states = [0, 1, 2, 3]
    for state in states:
        train_temp = train.copy()
        t_clf.set_params(lgbm__random_state=state)
        t_clf.fit(train_temp, train_temp['label'])
        t_clf.predict_proba(test)  # the predicted probabilities do not change across states

The same thing happens when I try changing the shuffle order or the bagging seed.

In case it helps, here are my current parameters:

    LGBMClassifier(bagging_seed=2, boosting_type='gbdt', class_weight='balanced',
                   colsample_bytree=1.0, importance_type='split', learning_rate=0.1,
                   max_depth=50, min_child_samples=1, min_child_weight=0.001,
                   min_data_in_leaf=10, min_split_gain=0.0, n_estimators=100,
                   n_jobs=-1, num_leaves=30, objective=None, random_state=1,
                   reg_alpha=0.0, reg_lambda=0.0, silent=True, subsample=1.0,
                   subsample_for_bin=200000, subsample_freq=0)

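Answer:

You get the same results regardless of the random seed because your model specification does no random sampling at any stage, so the seed has nothing to act on. For example, if you set colsample_bytree to a value below 1, each tree is built on a random subset of the features, and you will see the predicted probabilities differ across random seeds.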

    from sklearn.datasets import make_classification
    from lightgbm import LGBMClassifier

    # Generate some data
    X, y = make_classification(n_samples=1000, n_features=50, random_state=100)

    # Loop over random states
    for state in [0, 1, 2, 3]:

        # Instantiate the classifier
        clf = LGBMClassifier(
            class_weight='balanced',
            max_depth=20,
            min_child_samples=20,
            num_leaves=31,
            random_state=state,
            colsample_bytree=0.1,
        )

        # Fit the classifier
        clf.fit(X, y)

        # Predict class probabilities
        y_pred = clf.predict_proba(X)

        # Print the predicted probability of the first class for the first sample
        print([state, format(y_pred[0, 0], '.4%')])

    # Output:
    # [0, '97.8132%']
    # [1, '97.4980%']
    # [2, '98.3729%']
    # [3, '98.0737%']
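
The same reasoning applies to the bagging seed mentioned in the question. In LightGBM, row subsampling (bagging) only runs when subsample is below 1 and subsample_freq is non-zero, and the posted parameters have subsample=1.0 and subsample_freq=0. As a rough sketch, a small helper can flag which of these sampling options are switched on in a parameter dictionary. The helper name `active_sampling` is hypothetical, and it only looks at the three parameters that appear in the question, so it is not an exhaustive list of LightGBM's sources of randomness.

```python
def active_sampling(params):
    """Return the names of seed-dependent sampling options enabled in params.

    Hypothetical helper: it only covers the three sklearn-API parameters
    shown in the question, not every source of randomness in LightGBM.
    """
    active = []
    # Column subsampling: a random subset of features is drawn for each tree.
    if params.get("colsample_bytree", 1.0) < 1.0:
        active.append("colsample_bytree")
    # Row subsampling (bagging): needs BOTH a fraction below 1 and a
    # non-zero frequency; subsample_freq=0 disables bagging entirely.
    if params.get("subsample", 1.0) < 1.0 and params.get("subsample_freq", 0) > 0:
        active.append("subsample")
    return active


# Sampling-related parameters copied from the question.
question_params = {"colsample_bytree": 1.0, "subsample": 1.0, "subsample_freq": 0}
print(active_sampling(question_params))  # []

# Illustrative settings (assumed values) under which the seed would matter.
print(active_sampling({"colsample_bytree": 0.1}))  # ['colsample_bytree']
print(active_sampling({"subsample": 0.8, "subsample_freq": 1}))  # ['subsample']

# A fraction below 1 alone is not enough: bagging is still off.
print(active_sampling({"subsample": 0.8, "subsample_freq": 0}))  # []
```

An empty list for the question's parameters matches what was observed: with nothing sampled, changing random_state or bagging_seed cannot change the fitted trees. With a fitted or unfitted estimator, the dictionary could be taken from `clf.get_params()`.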
