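I am working on a deep learning project and trying to evaluate my model with cross-validation, following a tutorial.
The tutorial I am following is: https://machinelearningmastery.com/use-keras-deep-learning-models-scikit-learn-python/
First, I split the dataset into features and labels: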
labels = dataset['Label']
features = dataset.loc[:, dataset.columns != 'Label'].astype('float64')
My data has the following shapes:
features.shape ,labels.shape
((2425727, 78), (2425727,))
I scaled the data with RobustScaler, and it now looks like this:
features
array([[ 1.40474359e+02, -1.08800488e-02, 0.00000000e+00, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00],
[ 1.40958974e+02, -1.08609909e-02, -2.50000000e-01, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00],
[ 1.40961538e+02, -1.08712390e-02, -2.50000000e-01, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00],
...,
[ 1.48589744e+02, -1.08658453e-02, 0.00000000e+00, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00],
[-6.92307692e-02, 1.77654485e-01, 1.00000000e+00, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00],
[-6.92307692e-02, 6.18858398e-03, 5.00000000e-01, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00]])
labels
array([0, 0, 0, ..., 0, 0, 0])
The data is now ready for cross-validation.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import StratifiedKFold
from sklearn.model_selection import cross_val_score

# Function that creates the model, required by KerasClassifier
def create_model():
    # Create the model
    model = Sequential()
    model.add(Dense(128, activation='relu', input_shape=(78,1)))
    model.add(Dropout(0.01))
    model.add(Dense(15, activation='softmax'))
    # Compile the model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# Fix the random seed for reproducibility
seed = 7
np.random.seed(seed)
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
# Create the model
model = KerasClassifier(build_fn=create_model(), epochs=30, batch_size=64, verbose=0)
# Evaluate using 5-fold cross-validation
results = cross_val_score(model, features, labels, cv=kfold, scoring='accuracy', error_score="raise")
print(results.mean())
Running the code above produces the following error: "ValueError: The first argument to Layer.call must always be passed."
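I also checked the scikit-learn documentation to see whether I was doing something wrong: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.cross_val_score.html
I also looked for others who ran into the same problem, for example: https://github.com/scikit-learn/scikit-learn/issues/18944
But I could not resolve the issue. Can anyone help me fix this?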
Answer:
model = KerasClassifier(build_fn=create_model(), ...)
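Try removing the parentheses after create_model. The build_fn parameter expects a callable that the wrapper invokes whenever it needs a model. With the parentheses, create_model() runs immediately and the already-built model object is passed instead; when the wrapper later calls that object with no arguments, Keras raises the Layer.call error. So it should be: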
model = KerasClassifier(build_fn=create_model, ... )
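To make the difference concrete without needing Keras installed, here is a minimal pure-Python sketch. FakeModel and FakeWrapper are hypothetical stand-ins, not the real Keras classes; the sketch only assumes that the wrapper stores build_fn and calls it with no arguments at fit time, and that a model object requires an input when it is called.

```python
# Hypothetical stand-ins for illustration only; NOT the real Keras classes.
class FakeModel:
    """Mimics a model object: calling it requires an input argument."""

    def __call__(self, inputs):
        return inputs


class FakeWrapper:
    """Mimics a wrapper that stores build_fn and calls it only when fitting."""

    def __init__(self, build_fn):
        self.build_fn = build_fn
        self.model = None

    def fit(self):
        # Assumption: a fresh model is obtained by calling build_fn()
        # with no arguments each time fit() runs.
        self.model = self.build_fn()
        return self


def create_model():
    return FakeModel()


# Correct: pass the function itself; fit() calls it and gets a new model.
ok = FakeWrapper(build_fn=create_model).fit()
built_ok = isinstance(ok.model, FakeModel)

# Incorrect: pass the result of the call; fit() then calls the model
# object itself with no inputs, which fails.
try:
    FakeWrapper(build_fn=create_model()).fit()
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(built_ok)
print(error_message)
```

In this sketch the failure shows up as a plain TypeError about the missing inputs argument; in the question the analogous failure is reported by Keras as the ValueError about Layer.call. Passing the function also matters for cross-validation, since each fold should get a freshly built model rather than reuse one instance.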