Keras Scikit-Learn wrapper: AUC scorer does not work

I am trying to use the Keras Scikit-Learn wrapper to make random search over parameters easier. I wrote some example code in which:

  1. I generated an artificial dataset:

I used moons from scikit-learn:

    from sklearn.datasets import make_moons

    dataset = make_moons(1000)

  2. Model builder definition:

I defined the required build_fn function:

    # Imports assume the Keras 1.x API used throughout this post.
    from keras.models import Sequential
    from keras.layers import Dense, Dropout
    from keras.layers.normalization import BatchNormalization
    from keras.regularizers import l2, activity_l2
    from keras.wrappers.scikit_learn import KerasClassifier

    def build_fn(nr_of_layers=2,
                 first_layer_size=10,
                 layers_slope_coeff=0.8,
                 dropout=0.5,
                 activation="relu",
                 weight_l2=0.01,
                 act_l2=0.01,
                 input_dim=2):
        result_model = Sequential()
        result_model.add(Dense(first_layer_size,
                               input_dim=input_dim,
                               activation=activation,
                               W_regularizer=l2(weight_l2),
                               activity_regularizer=activity_l2(act_l2)))

        # Each following hidden layer shrinks by layers_slope_coeff.
        current_layer_size = int(first_layer_size * layers_slope_coeff) + 1

        for index_of_layer in range(nr_of_layers - 1):
            result_model.add(BatchNormalization())
            result_model.add(Dropout(dropout))
            result_model.add(Dense(current_layer_size,
                                   W_regularizer=l2(weight_l2),
                                   activation=activation,
                                   activity_regularizer=activity_l2(act_l2)))
            current_layer_size = int(current_layer_size * layers_slope_coeff) + 1

        # Single sigmoid unit for binary classification.
        result_model.add(Dense(1,
                               activation="sigmoid",
                               W_regularizer=l2(weight_l2)))

        result_model.compile(optimizer="rmsprop",
                             metrics=["accuracy"],
                             loss="binary_crossentropy")
        return result_model

    NeuralNet = KerasClassifier(build_fn)

  3. Parameter grid definition:

Then I defined a parameter grid:

    param_grid = {
        "nr_of_layers": [2, 3, 4, 5],
        "first_layer_size": [5, 10, 15],
        "layers_slope_coeff": [0.4, 0.6, 0.8],
        "dropout": [0.3, 0.5, 0.8],
        "weight_l2": [0.01, 0.001, 0.0001],
        "verbose": [0],
        "batch_size": [1],
        "nb_epoch": [30]
    }

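  4. Random search phase:

I defined the RandomizedSearchCV object and fitted it on the values of the artificial dataset:

    # Module path taken from the traceback (sklearn\grid_search.py).
    from sklearn.grid_search import RandomizedSearchCV

    random_search = RandomizedSearchCV(NeuralNet,
                                       param_distributions=param_grid,
                                       verbose=2,
                                       n_iter=1,
                                       scoring="roc_auc")

    random_search.fit(dataset[0], dataset[1])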

After running this code, I get the following in the console:

    Traceback (most recent call last):
      File "C:\Anaconda2\lib\site-packages\IPython\core\interactiveshell.py", line 2885, in run_code
        exec(code_obj, self.user_global_ns, self.user_ns)
      File "<ipython-input-3-c5bdbc2770b7>", line 2, in <module>
        random_search.fit(dataset[0], dataset[1])
      File "C:\Anaconda2\lib\site-packages\sklearn\grid_search.py", line 996, in fit
        return self._fit(X, y, sampled_params)
      File "C:\Anaconda2\lib\site-packages\sklearn\grid_search.py", line 553, in _fit
        for parameters in parameter_iterable
      File "C:\Anaconda2\lib\site-packages\sklearn\externals\joblib\parallel.py", line 800, in __call__
        while self.dispatch_one_batch(iterator):
      File "C:\Anaconda2\lib\site-packages\sklearn\externals\joblib\parallel.py", line 658, in dispatch_one_batch
        self._dispatch(tasks)
      File "C:\Anaconda2\lib\site-packages\sklearn\externals\joblib\parallel.py", line 566, in _dispatch
        job = ImmediateComputeBatch(batch)
      File "C:\Anaconda2\lib\site-packages\sklearn\externals\joblib\parallel.py", line 180, in __init__
        self.results = batch()
      File "C:\Anaconda2\lib\site-packages\sklearn\externals\joblib\parallel.py", line 72, in __call__
        return [func(*args, **kwargs) for func, args, kwargs in self.items]
      File "C:\Anaconda2\lib\site-packages\sklearn\cross_validation.py", line 1550, in _fit_and_score
        test_score = _score(estimator, X_test, y_test, scorer)
      File "C:\Anaconda2\lib\site-packages\sklearn\cross_validation.py", line 1606, in _score
        score = scorer(estimator, X_test, y_test)
      File "C:\Anaconda2\lib\site-packages\sklearn\metrics\scorer.py", line 175, in __call__
        y_pred = y_pred[:, 1]
    IndexError: index 1 is out of bounds for axis 1 with size 1

When I use the accuracy metric instead of scoring="roc_auc", the code works fine. Can anyone explain what is going on? Has anyone run into a similar problem?


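Answer:

There is a bug in KerasClassifier that causes this problem. As the last frame of the traceback shows, the roc_auc scorer reads column 1 of the predicted probabilities (y_pred[:, 1]), but a model with a single sigmoid output returns an array with only one column. I have opened an issue for it in the repository: https://github.com/fchollet/keras/issues/2864

The fix is in there as well. As a temporary workaround, you can define your own KerasClassifier: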

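    import numpy as np
    from keras.models import Sequential
    from keras.wrappers.scikit_learn import KerasClassifier

    class FixedKerasClassifier(KerasClassifier):
        def predict_proba(self, X, **kwargs):
            kwargs = self.filter_sk_params(Sequential.predict_proba, kwargs)
            probs = self.model.predict_proba(X, **kwargs)
            # A single sigmoid output has shape (n_samples, 1); prepend the
            # class-0 column so that scorers can read probs[:, 1].
            if probs.shape[1] == 1:
                probs = np.hstack([1 - probs, probs])
            return probs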
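
The reshaping step in the workaround can be checked without Keras. The following is a minimal sketch using NumPy only. The helper name to_two_columns and the sample probabilities are made up for illustration and are not part of Keras or scikit-learn. It first reproduces the IndexError from the traceback on a one-column array, then shows that column 1 exists after stacking.

```python
import numpy as np


def to_two_columns(probs):
    """Hypothetical helper: expand an (n, 1) array of P(class 1)
    into an (n, 2) array [P(class 0), P(class 1)]."""
    probs = np.asarray(probs, dtype=float)
    if probs.ndim == 2 and probs.shape[1] == 1:
        probs = np.hstack([1.0 - probs, probs])
    return probs


# Made-up sigmoid outputs, shaped like a single-unit output layer.
single_column = np.array([[0.1], [0.4], [0.35], [0.8]])

# Indexing column 1 of a one-column array fails, as in the traceback.
try:
    single_column[:, 1]
    index_failed = False
except IndexError:
    index_failed = True

# After stacking, column 1 holds the original class-1 probabilities.
two_columns = to_two_columns(single_column)
positive_class = two_columns[:, 1]
```

Arrays that already have two or more columns pass through unchanged, so the same check would not disturb a softmax output.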
