I am learning how to use RandomizedSearchCV for hyperparameter tuning in machine learning.
My code is as follows:
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import RandomizedSearchCV, train_test_split

    # Load and shuffle the data, then split features from the label
    data = pd.read_csv("heart-disease.csv")
    data_shuffled = data.sample(frac=1)
    X = data_shuffled.drop("target", axis=1)
    y = data_shuffled["target"]
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

    # Hyperparameter values to sample from
    grid = {
        "n_estimators": [10, 100, 200, 500, 1000, 1200],
        "max_depth": [None, 5, 10, 15, 20, 30],
        "max_featuers": ["auto", "sqrt"],
        "min_samples_split": [2, 4, 6],
        "min_samples_leaf": [1, 2, 4],
    }

    rfc = RandomForestClassifier(n_jobs=-1)
    rscv = RandomizedSearchCV(estimator=rfc, param_distributions=grid,
                              n_iter=100, cv=5, verbose=1)
    rscv.fit(X_train, y_train)
The error I get is:
ValueError: Invalid parameter max_featuers for estimator RandomForestClassifier(min_samples_leaf=2, min_samples_split=6, n_estimators=1200, n_jobs=-1). Check the list of available parameters with `estimator.get_params().keys()`.
I checked the RandomForestClassifier documentation to see whether I had passed a wrong hyperparameter name, but I could not find anything wrong.
Answer:
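You have a typo in the definition of your parameter grid: the key should be max_features, not max_featuers. The error message names the misspelled key, and RandomForestClassifier has no parameter by that name.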
    grid = {
        "n_estimators": [10, 100, 200, 500, 1000, 1200],
        "max_depth": [None, 5, 10, 15, 20, 30],
        "max_features": ["auto", "sqrt"],  # <-- changed here
        "min_samples_split": [2, 4, 6],
        "min_samples_leaf": [1, 2, 4],
    }
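
Typos like this are easy to miss by eye, so it can help to do what the error message suggests and compare the grid keys against `estimator.get_params().keys()` before fitting. The following is only a minimal sketch: `find_invalid_params` is a hypothetical helper name (not part of scikit-learn), and the closest-name suggestion uses the standard library's `difflib`, which is my own addition rather than anything scikit-learn does.

```python
import difflib

from sklearn.ensemble import RandomForestClassifier


def find_invalid_params(estimator, grid):
    """Map each grid key the estimator does not accept to its closest valid name.

    Hypothetical helper for illustration; the value is None when no
    similar valid parameter name is found.
    """
    valid = list(estimator.get_params().keys())
    invalid = {}
    for name in grid:
        if name not in valid:
            close = difflib.get_close_matches(name, valid, n=1)
            invalid[name] = close[0] if close else None
    return invalid


# The grid from the question, including the misspelled key
grid = {
    "n_estimators": [10, 100, 200, 500, 1000, 1200],
    "max_depth": [None, 5, 10, 15, 20, 30],
    "max_featuers": ["auto", "sqrt"],
    "min_samples_split": [2, 4, 6],
    "min_samples_leaf": [1, 2, 4],
}

problems = find_invalid_params(RandomForestClassifier(), grid)
for bad, suggestion in problems.items():
    print(f"{bad!r} is not a valid parameter; did you mean {suggestion!r}?")
```

This check only validates the key names, not the values. Separately, as far as I know the "auto" option for max_features was deprecated in scikit-learn 1.1 and removed in 1.3, so on a recent install you may need to drop it from the list and keep "sqrt".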