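While building my multilayer perceptron, I am trying to combine k-fold cross-validation with a grid search to find the best combination of hidden layers and nodes.
I first tried simply varying the alpha value, which worked. I used the following code: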
    from sklearn.model_selection import GridSearchCV
    import numpy as np
    from sklearn.neural_network import MLPClassifier
    import matplotlib.pyplot as plt
    import mglearn

    mlp = MLPClassifier()
    param_grid = {'alpha': np.arange(0, 1, 0.5)}
    knn_gscv = GridSearchCV(mlp, param_grid, cv=5)
    # fit model to data
    knn_gscv.fit(X, y)
    # check top performing alpha value
    print("best alpha value is", knn_gscv.best_params_)
    # check mean score for the top performing value of alpha
    print("best score best alpha", knn_gscv.best_score_)
This works. Now, however, I am trying to vary the number of hidden layers and nodes, and I tried the following code:
    from sklearn.model_selection import GridSearchCV
    import numpy as np
    from sklearn.neural_network import MLPClassifier
    import matplotlib.pyplot as plt
    import mglearn

    mlp = MLPClassifier()
    # this is the line that raises the error
    param_grid = {'hidden_layer_sizes': np.arange([10,10],[20,20],[30,30])}
    knn_gscv = GridSearchCV(mlp, param_grid, cv=5)
    # fit model to data
    knn_gscv.fit(X, y)
    # check top performing parameter value
    print("best alpha value is", knn_gscv.best_params_)
    # check mean score for the top performing parameter value
    print("best score best alpha", knn_gscv.best_score_)
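But I get an error. I think this is because np.arange() does not accept lists as input. I still think I should use np.arange, though, since it seems the easiest way to combine with the grid search. Is there a way around this?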
Answer:
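First of all,
    np.arange([10,10],[20,20],[30,30])
will never work, because np.arange takes scalar start, stop and step values, not lists.
Even
    np.arange([[10,10],[20,20],[30,30]])
as suggested in the comments, will not work either.
Both raise a TypeError, for example:
    TypeError: unsupported operand type(s) for -: 'list' and 'list'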
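Short answer
For 'hidden_layer_sizes' you need a list of tuples, where each tuple gives the number of nodes in each hidden layer. For example:
    param_grid = {'hidden_layer_sizes': [(10,10), (20,20)]}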
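To make the short answer concrete, here is a minimal sketch of the full grid search with a list of tuples. The question does not show X and y, so the synthetic dataset from make_classification is an assumption, as are the max_iter and random_state settings and the name mlp_gscv.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Assumption: a small synthetic dataset standing in for the asker's X and y.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Each tuple is one candidate architecture:
# (nodes in first hidden layer, nodes in second hidden layer).
param_grid = {'hidden_layer_sizes': [(10, 10), (20, 20)]}

# max_iter and random_state are illustrative choices, not from the question.
mlp = MLPClassifier(max_iter=500, random_state=0)
mlp_gscv = GridSearchCV(mlp, param_grid, cv=5)

# Fit one model per candidate and fold, then refit on the best candidate.
mlp_gscv.fit(X, y)

print("best hidden_layer_sizes:", mlp_gscv.best_params_)
print("best mean CV score:", mlp_gscv.best_score_)
```

A plain Python list is enough here; GridSearchCV only needs an iterable of candidate values, so there is no need to go through NumPy.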
Long answer
To generate a sequence of tuples, you can use something like this:
    start = 10
    stop = 20
    step = 5
    param_grid = {'hidden_layer_sizes': [(n, min(n+step, stop)) for n in range(start, stop, step)]}
    param_grid
    Out[29]: {'hidden_layer_sizes': [(10, 15), (15, 20)]}
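If the goal is to search over both the number of hidden layers and the nodes per layer, one possible variation is to build the tuples with itertools.product. This is only a sketch: node_options and depth_options are hypothetical names and values chosen for illustration, not something from the question.

```python
from itertools import product

# Hypothetical search space: candidate node counts and network depths.
node_options = [10, 20, 30]
depth_options = [1, 2]

# product(..., repeat=depth) yields every tuple of length `depth`
# drawn from node_options, e.g. (10,), (10, 20), (30, 30).
hidden_layer_sizes = [
    combo
    for depth in depth_options
    for combo in product(node_options, repeat=depth)
]

param_grid = {'hidden_layer_sizes': hidden_layer_sizes}
print(len(hidden_layer_sizes), "candidate architectures")
print(hidden_layer_sizes)
```

The grid grows quickly with depth (3 candidates for one layer, 9 for two), and every candidate is fitted once per fold, so it is worth keeping the option lists short.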