Keras scikit-learn wrapper: AUC scorer does not work

I am trying to use the Keras scikit-learn wrapper to simplify a randomized search over hyperparameters. I wrote some example code in which:

  1. I generated an artificial dataset:

I used moons from scikit-learn:

    from sklearn.datasets import make_moons

    dataset = make_moons(1000)

  2. Model builder definition:

I defined the required build_fn function:

    # Keras 1.x API (W_regularizer, activity_l2), as used at the time of the post
    from keras.models import Sequential
    from keras.layers import Dense, Dropout, BatchNormalization
    from keras.regularizers import l2, activity_l2
    from keras.wrappers.scikit_learn import KerasClassifier

    def build_fn(nr_of_layers=2,
                 first_layer_size=10,
                 layers_slope_coeff=0.8,
                 dropout=0.5,
                 activation="relu",
                 weight_l2=0.01,
                 act_l2=0.01,
                 input_dim=2):
        result_model = Sequential()
        result_model.add(Dense(first_layer_size,
                               input_dim=input_dim,
                               activation=activation,
                               W_regularizer=l2(weight_l2),
                               activity_regularizer=activity_l2(act_l2)))

        # Each hidden layer shrinks by layers_slope_coeff relative to the previous one
        current_layer_size = int(first_layer_size * layers_slope_coeff) + 1

        for index_of_layer in range(nr_of_layers - 1):
            result_model.add(BatchNormalization())
            result_model.add(Dropout(dropout))
            result_model.add(Dense(current_layer_size,
                                   W_regularizer=l2(weight_l2),
                                   activation=activation,
                                   activity_regularizer=activity_l2(act_l2)))
            current_layer_size = int(current_layer_size * layers_slope_coeff) + 1

        # Single sigmoid unit for binary classification
        result_model.add(Dense(1,
                               activation="sigmoid",
                               W_regularizer=l2(weight_l2)))

        result_model.compile(optimizer="rmsprop",
                             metrics=["accuracy"],
                             loss="binary_crossentropy")
        return result_model

    NeuralNet = KerasClassifier(build_fn)

  3. Parameter grid definition:

Then I defined a parameter grid:

    param_grid = {
        "nr_of_layers": [2, 3, 4, 5],
        "first_layer_size": [5, 10, 15],
        "layers_slope_coeff": [0.4, 0.6, 0.8],
        "dropout": [0.3, 0.5, 0.8],
        "weight_l2": [0.01, 0.001, 0.0001],
        "verbose": [0],
        "batch_size": [1],
        "nb_epoch": [30]
    }

  4. Random search phase:

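I defined the RandomizedSearchCV object and fit it on the values of the artificial dataset:

    # Legacy module path, matching the sklearn version in the traceback
    from sklearn.grid_search import RandomizedSearchCV

    random_search = RandomizedSearchCV(NeuralNet,
                                       param_distributions=param_grid,
                                       verbose=2,
                                       n_iter=1,
                                       scoring="roc_auc")

    random_search.fit(dataset[0], dataset[1])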
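
For a sense of scale, the grid above contains 4 × 3 × 3 × 3 × 3 = 324 combinations, and n_iter=1 draws just one of them. RandomizedSearchCV draws its candidates through ParameterSampler, so the sampling step can be sketched on its own. This is a minimal sketch, assuming a newer scikit-learn (0.18 or later) where ParameterGrid and ParameterSampler are importable from sklearn.model_selection rather than the sklearn.grid_search module used in this post. The random_state=0 argument is an arbitrary choice for reproducibility and is not part of the original code.

```python
from sklearn.model_selection import ParameterGrid, ParameterSampler

# Same grid as in the post
param_grid = {
    "nr_of_layers": [2, 3, 4, 5],
    "first_layer_size": [5, 10, 15],
    "layers_slope_coeff": [0.4, 0.6, 0.8],
    "dropout": [0.3, 0.5, 0.8],
    "weight_l2": [0.01, 0.001, 0.0001],
    "verbose": [0],
    "batch_size": [1],
    "nb_epoch": [30],
}

# Total number of distinct combinations in the grid
n_combinations = len(ParameterGrid(param_grid))

# Draw one candidate, as a search with n_iter=1 would (random_state is an assumption)
sampled = list(ParameterSampler(param_grid, n_iter=1, random_state=0))

print(n_combinations)  # 324
print(sampled[0])
```

With a single sampled candidate, the search scores only one configuration per cross-validation fold, which is enough to reproduce the error.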

Running the random search produces the following traceback in the console:

    Traceback (most recent call last):
      File "C:\Anaconda2\lib\site-packages\IPython\core\interactiveshell.py", line 2885, in run_code
        exec(code_obj, self.user_global_ns, self.user_ns)
      File "<ipython-input-3-c5bdbc2770b7>", line 2, in <module>
        random_search.fit(dataset[0], dataset[1])
      File "C:\Anaconda2\lib\site-packages\sklearn\grid_search.py", line 996, in fit
        return self._fit(X, y, sampled_params)
      File "C:\Anaconda2\lib\site-packages\sklearn\grid_search.py", line 553, in _fit
        for parameters in parameter_iterable
      File "C:\Anaconda2\lib\site-packages\sklearn\externals\joblib\parallel.py", line 800, in __call__
        while self.dispatch_one_batch(iterator):
      File "C:\Anaconda2\lib\site-packages\sklearn\externals\joblib\parallel.py", line 658, in dispatch_one_batch
        self._dispatch(tasks)
      File "C:\Anaconda2\lib\site-packages\sklearn\externals\joblib\parallel.py", line 566, in _dispatch
        job = ImmediateComputeBatch(batch)
      File "C:\Anaconda2\lib\site-packages\sklearn\externals\joblib\parallel.py", line 180, in __init__
        self.results = batch()
      File "C:\Anaconda2\lib\site-packages\sklearn\externals\joblib\parallel.py", line 72, in __call__
        return [func(*args, **kwargs) for func, args, kwargs in self.items]
      File "C:\Anaconda2\lib\site-packages\sklearn\cross_validation.py", line 1550, in _fit_and_score
        test_score = _score(estimator, X_test, y_test, scorer)
      File "C:\Anaconda2\lib\site-packages\sklearn\cross_validation.py", line 1606, in _score
        score = scorer(estimator, X_test, y_test)
      File "C:\Anaconda2\lib\site-packages\sklearn\metrics\scorer.py", line 175, in __call__
        y_pred = y_pred[:, 1]
    IndexError: index 1 is out of bounds for axis 1 with size 1

When I use the accuracy metric instead of scoring="roc_auc", this code works fine. Can anyone explain what is going on? Has anyone run into a similar problem?


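Answer:

There is a bug in KerasClassifier that causes this problem. As the traceback shows, the scorer indexes column 1 of the predicted probabilities (y_pred[:, 1]), while a model with a single sigmoid output returns only one column. I have opened an issue for it in the repository: https://github.com/fchollet/keras/issues/2864

The fix is described there as well. As a temporary workaround, you can define your own KerasClassifier:

    import numpy as np
    from keras.models import Sequential
    from keras.wrappers.scikit_learn import KerasClassifier

    class FixedKerasClassifier(KerasClassifier):
        def predict_proba(self, X, **kwargs):
            kwargs = self.filter_sk_params(Sequential.predict_proba, kwargs)
            probs = self.model.predict_proba(X, **kwargs)
            # A single sigmoid unit yields one column: prepend P(class 0) = 1 - P(class 1)
            if probs.shape[1] == 1:
                probs = np.hstack([1 - probs, probs])
            return probs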
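
The effect of the np.hstack line can be checked without Keras. The following is a minimal NumPy-only sketch. The helper name to_two_columns and the sample probabilities are hypothetical and chosen for illustration; they are not part of Keras or scikit-learn. It reproduces the IndexError on a one-column array and shows that the padded array exposes the positive-class probability in column 1, which is where the scorer looks.

```python
import numpy as np

def to_two_columns(probs):
    """Expand an (n, 1) array of positive-class probabilities to shape (n, 2).

    Hypothetical helper mirroring the np.hstack step in the workaround.
    """
    probs = np.asarray(probs, dtype=float)
    if probs.ndim == 2 and probs.shape[1] == 1:
        return np.hstack([1.0 - probs, probs])
    return probs

# Made-up output of a single sigmoid unit for four samples
sigmoid_output = np.array([[0.9], [0.2], [0.7], [0.4]])

# Indexing column 1 of a one-column array fails, as in the traceback
try:
    sigmoid_output[:, 1]
    raised = False
except IndexError:
    raised = True

# After padding, column 1 holds the positive-class probability
fixed = to_two_columns(sigmoid_output)
positive_class = fixed[:, 1]

print(raised)
print(fixed.shape)
print(positive_class)
```

Each row of the padded array sums to 1, so it has the two-column layout that scikit-learn expects from predict_proba on a binary problem.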
