I get an error when I try to tune the hyperparameters of an XGBoost classifier with Hyperopt. My code and the error message are below.
Step 1: Objective function
import csv
from hyperopt import STATUS_OK
from timeit import default_timer as timer

MAX_EVALS = 200
N_FOLDS = 10

def objective(params, n_folds = N_FOLDS):
    """Objective function for XGBoost hyperparameter optimization"""

    # Keep track of the number of evaluations
    global ITERATION
    ITERATION += 1

#     # Retrieve the subsample if present, otherwise set it to 1.0
#     subsample = params['boosting_type'].get('subsample', 1.0)

#     # Extract the boosting type
#     params['boosting_type'] = params['boosting_type']['boosting_type']
#     params['subsample'] = subsample

    # Make sure parameters that need to be integers are integers
    for parameter_name in ['max_depth', 'colsample_bytree', 'min_child_weight']:
        params[parameter_name] = int(params[parameter_name])

    start = timer()

    # Perform n_folds cross validation
    cv_results = xgb.cv(params, train_set, num_boost_round = 10000, nfold = n_folds,
                        early_stopping_rounds = 100, metrics = 'auc', seed = 50)

    run_time = timer() - start

    # Extract the best score
    best_score = np.max(cv_results['auc-mean'])

    # Loss must be minimized
    loss = 1 - best_score

    # Boosting round that returned the highest cv score
    n_estimators = int(np.argmax(cv_results['auc-mean']) + 1)

    # Write to the csv file ('a' means append)
    of_connection = open(out_file, 'a')
    writer = csv.writer(of_connection)
    writer.writerow([loss, params, ITERATION, n_estimators, run_time])

    # Dictionary with information for evaluation
    return {'loss': loss, 'params': params, 'iteration': ITERATION,
            'estimators': n_estimators, 'train_time': run_time,
            'status': STATUS_OK}
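I have already defined the search space and the optimization algorithm. When I run Hyperopt, I hit the error below, which is raised inside the objective function.
Error: KeyError: 'auc-mean'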
<ipython-input-62-8d4e97f16929> in objective(params, n_folds)
     25     run_time = timer() - start
     26     # Extract the best score
---> 27     best_score = np.max(cv_results['auc-mean'])
     28     # Loss must be minimized
     29     loss = 1 - best_score
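Answer:
First, print cv_results and check which keys it actually contains.
xgb.cv prefixes each metric column with train- or test-, so there is no plain 'auc-mean' key. In the example notebook linked below, the keys are 'test-auc-mean' and 'train-auc-mean'; the objective function should read the score from 'test-auc-mean'.
See cell 5 here: https://www.kaggle.com/tilii7/bayesian-optimization-of-xgboost-parameters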
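To make the suggestion concrete, here is a minimal sketch of how the score lookup in the objective function could be rewritten. It is an illustration rather than the asker's actual code: `best_test_score` is a hypothetical helper, and `fake_cv_results` is made-up data standing in for the table that `xgb.cv` returns, so the snippet runs without XGBoost installed. It relies only on column lookup by name, so it should also work when passed the real `cv_results` DataFrame.

```python
def best_test_score(cv_results, metric="auc"):
    """Return (best validation score, 1-based boosting round) from CV results.

    Hypothetical helper: looks up the 'test-<metric>-mean' column and, if it
    is missing, reports the columns that are present instead of failing with
    a bare KeyError.
    """
    key = f"test-{metric}-mean"
    if key not in cv_results:
        raise KeyError(f"{key!r} not found; available columns: {list(cv_results)}")
    scores = list(cv_results[key])
    # Index of the round with the highest mean validation score
    best_round = max(range(len(scores)), key=scores.__getitem__)
    return scores[best_round], best_round + 1


# Made-up numbers, shaped like the columns xgb.cv produces for metrics='auc'
fake_cv_results = {
    "train-auc-mean": [0.80, 0.90, 0.95],
    "train-auc-std": [0.01, 0.01, 0.01],
    "test-auc-mean": [0.70, 0.78, 0.76],
    "test-auc-std": [0.02, 0.02, 0.02],
}

# Step 1 of the answer: look at the keys that actually exist
print(list(fake_cv_results))

# Step 2: read the validation score instead of the non-existent 'auc-mean'
best_score, n_estimators = best_test_score(fake_cv_results)
loss = 1 - best_score  # Hyperopt minimizes, so turn the score into a loss
print(best_score, n_estimators)
```

In the original objective function, the equivalent change would be to replace both occurrences of `cv_results['auc-mean']` with `cv_results['test-auc-mean']`. The test column is used rather than the train column because the loss should reflect performance on held-out folds.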