I'm having trouble with Keras's TimeseriesGenerator. I want to experiment with different values of look_back, the variable that determines how far X lags behind each y. Right now it is set to 3, but I'd like to be able to test multiple values. Essentially, I want to see whether using the last n rows of data to predict a value improves accuracy. Here is my code:
### Attempt with the timeseries generator
from keras.preprocessing.sequence import TimeseriesGenerator

look_back = 3

train_data_gen = TimeseriesGenerator(X_train, X_train,
    length=look_back, sampling_rate=1, stride=1,
    batch_size=3)
test_data_gen = TimeseriesGenerator(X_test, X_test,
    length=look_back, sampling_rate=1, stride=1,
    batch_size=1)

### Bi_LSTM
Bi_LSTM = Sequential()
Bi_LSTM.add(layers.Bidirectional(layers.LSTM(512, input_shape=(look_back, 11))))
Bi_LSTM.add(layers.Dropout(.5))
# Bi_LSTM.add(layers.Flatten())
Bi_LSTM.add(Dense(11, activation='softmax'))
Bi_LSTM.compile(optimizer='rmsprop',
                loss='categorical_crossentropy',
                metrics=['accuracy'])

### Fitting a small normal model seems to be necessary for compiling
Bi_LSTM.fit(X_train[:1],
            y_train[:1],
            epochs=1,
            batch_size=32,
            validation_data=(X_test[:1], y_test[:1]),
            class_weight=class_weights)

print('ignore above, necessary to run custom generator...')

Bi_LSTM_history = Bi_LSTM.fit_generator(Bi_LSTM.fit_generator(generator,
                                                              steps_per_epoch=1,
                                                              epochs=20,
                                                              verbose=0,
                                                              class_weight=class_weights))
This produces the following error:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-35-11561ec7fb92> in <module>()
     26                 batch_size=32,
     27                 validation_data=(X_test[:1], y_test[:1]),
---> 28                 class_weight=class_weights)
     29 print('ignore above, necessary to run custom generator...')
     30 Bi_LSTM_history = Bi_LSTM.fit_generator(Bi_LSTM.fit_generator(generator,

2 frames
/usr/local/lib/python3.6/dist-packages/keras/engine/training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
    143                             ': expected ' + names[i] + ' to have shape ' +
    144                             str(shape) + ' but got array with shape ' +
--> 145                             str(data_shape))
    146     return data
    147

ValueError: Error when checking input: expected lstm_16_input to have shape (3, 11) but got array with shape (1, 11)
If I change the Bi-LSTM's input shape to the (1, 11) reported above, I get this error instead:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-36-7360e3790518> in <module>()
     31                                         epochs=20,
     32                                         verbose=0,
---> 33                                         class_weight=class_weights))
     34

5 frames
/usr/local/lib/python3.6/dist-packages/keras/engine/training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
    143                             ': expected ' + names[i] + ' to have shape ' +
    144                             str(shape) + ' but got array with shape ' +
--> 145                             str(data_shape))
    146     return data
    147

ValueError: Error when checking input: expected lstm_17_input to have shape (1, 11) but got array with shape (3, 11)
What is going on here?
In case it is relevant: my data is read from a DataFrame, where each row (observation) is a (1, 11) vector of floats, and each label is an integer that I convert to a one-hot vector of shape (1, 11).
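For reference, a minimal sketch of that kind of conversion, assuming a hypothetical DataFrame with 11 float feature columns and an integer label column (all names and sizes here are placeholders, not the original data):

import numpy as np
import pandas as pd
from keras.utils import to_categorical

# Hypothetical data: 11 float features plus an integer class label per row.
df = pd.DataFrame(np.random.uniform(0, 1, (200, 11)),
                  columns=['feat_%d' % i for i in range(11)])
df['label'] = np.random.randint(0, 11, 200)

X = df.drop(columns='label').to_numpy()          # floats, shape (n_samples, 11)
y = to_categorical(df['label'], num_classes=11)  # one-hot labels, shape (n_samples, 11)
print(X.shape, y.shape)                          # (200, 11) (200, 11)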
Answer:
There are quite a few errors in your code... so instead I'll provide example code that you can follow for your task. Pay attention to the dimensionality of the raw data and of the data produced by TimeseriesGenerator; this is very important for understanding how to build the network.
import numpy as np
from keras.preprocessing.sequence import TimeseriesGenerator
from keras.models import Sequential
from keras.layers import Bidirectional, LSTM, Dropout, Dense

# utility variables
look_back = 3
batch_size = 3
n_feat = 11
n_class = 11
n_train = 200
n_test = 60

# data simulation
X_train = np.random.uniform(0, 1, (n_train, n_feat))   # 2D!
X_test = np.random.uniform(0, 1, (n_test, n_feat))     # 2D!
y_train = np.random.randint(0, 2, (n_train, n_class))  # 2D!
y_test = np.random.randint(0, 2, (n_test, n_class))    # 2D!

train_data_gen = TimeseriesGenerator(X_train, y_train, length=look_back, batch_size=batch_size)
test_data_gen = TimeseriesGenerator(X_test, y_test, length=look_back, batch_size=batch_size)

# check generator dimensions
for i in range(len(train_data_gen)):
    x, y = train_data_gen[i]
    print(x.shape, y.shape)

Bi_LSTM = Sequential()
Bi_LSTM.add(Bidirectional(LSTM(512), input_shape=(look_back, n_feat)))
Bi_LSTM.add(Dropout(.5))
Bi_LSTM.add(Dense(n_class, activation='softmax'))
print(Bi_LSTM.summary())

Bi_LSTM.compile(optimizer='rmsprop',
                loss='categorical_crossentropy',
                metrics=['accuracy'])

Bi_LSTM_history = Bi_LSTM.fit_generator(train_data_gen,
                                        steps_per_epoch=50,
                                        epochs=3,
                                        verbose=1,
                                        validation_data=test_data_gen)  # class_weight=class_weights)
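Each batch from the generator has x of shape (batch_size, look_back, n_feat) and y of shape (batch_size, n_class), which is why the network's input_shape is (look_back, n_feat). To get back to the original goal of testing several look_back values, one option is to wrap the model construction in a small function and loop over candidate windows. This is only a sketch that reuses the simulated arrays and variable names from the example above; run_experiment and the candidate values are placeholders of my own, not part of the original answer:

def run_experiment(look_back, epochs=3, batch_size=3):
    """Build, train, and evaluate a Bi-LSTM for one look_back window."""
    train_gen = TimeseriesGenerator(X_train, y_train, length=look_back, batch_size=batch_size)
    test_gen = TimeseriesGenerator(X_test, y_test, length=look_back, batch_size=batch_size)

    model = Sequential()
    model.add(Bidirectional(LSTM(512), input_shape=(look_back, n_feat)))
    model.add(Dropout(.5))
    model.add(Dense(n_class, activation='softmax'))
    model.compile(optimizer='rmsprop',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])

    model.fit_generator(train_gen, epochs=epochs, verbose=0, validation_data=test_gen)
    loss, acc = model.evaluate_generator(test_gen)  # test-set loss and accuracy
    return acc

# compare a few window lengths (candidate values are arbitrary examples)
for lb in [1, 3, 5, 10]:
    print('look_back=%d: test accuracy=%.3f' % (lb, run_experiment(lb)))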