I built an RNN with Keras to solve a regression problem:
```python
def RNN_keras(feat_num, timestep_num=100):
    model = Sequential()
    model.add(BatchNormalization(input_shape=(timestep_num, feat_num)))
    model.add(LSTM(input_shape=(timestep_num, feat_num), output_dim=512,
                   activation='relu', return_sequences=True))
    model.add(BatchNormalization())
    model.add(LSTM(output_dim=128, activation='relu', return_sequences=True))
    model.add(BatchNormalization())
    model.add(TimeDistributed(Dense(output_dim=1, activation='relu')))  # sequence labeling
    rmsprop = RMSprop(lr=0.00001, rho=0.9, epsilon=1e-08)
    model.compile(loss='mean_squared_error', optimizer=rmsprop,
                  metrics=['mean_squared_error'])
    return model
```
The whole process seems fine, but the loss stays exactly the same across epochs:
```
61267 in the training set
6808 in the test set
Building training input vectors ...
888 unique feature names
The length of each vector will be 888
Using TensorFlow backend.
Build model...
# Each batch has 1280 examples
# The training data are shuffled at the beginning of each epoch.
****** Iterating over each batch of the training data ******
Epoch 1/3 : Batch  1/48 | loss = 11011073.000000 | root_mean_squared_error = 3318.232910
Epoch 1/3 : Batch  2/48 | loss = 620.271667 | root_mean_squared_error = 24.904161
Epoch 1/3 : Batch  3/48 | loss = 620.068665 | root_mean_squared_error = 24.900017
......
Epoch 1/3 : Batch 47/48 | loss = 618.046448 | root_mean_squared_error = 24.859678
Epoch 1/3 : Batch 48/48 | loss = 652.977051 | root_mean_squared_error = 25.552946
****** Epoch 1: RMSD(training) = 24.897174
Epoch 2/3 : Batch  1/48 | loss = 607.372620 | root_mean_squared_error = 24.644049
Epoch 2/3 : Batch  2/48 | loss = 599.667786 | root_mean_squared_error = 24.487448
Epoch 2/3 : Batch  3/48 | loss = 621.368103 | root_mean_squared_error = 24.926300
......
Epoch 2/3 : Batch 47/48 | loss = 620.133667 | root_mean_squared_error = 24.901398
Epoch 2/3 : Batch 48/48 | loss = 639.971924 | root_mean_squared_error = 25.297264
****** Epoch 2: RMSD(training) = 24.897174
Epoch 3/3 : Batch  1/48 | loss = 651.519836 | root_mean_squared_error = 25.523636
Epoch 3/3 : Batch  2/48 | loss = 673.582581 | root_mean_squared_error = 25.952084
Epoch 3/3 : Batch  3/48 | loss = 613.930054 | root_mean_squared_error = 24.776562
......
Epoch 3/3 : Batch 47/48 | loss = 624.460327 | root_mean_squared_error = 24.988203
Epoch 3/3 : Batch 48/48 | loss = 629.544250 | root_mean_squared_error = 25.090448
****** Epoch 3: RMSD(training) = 24.897174
```
I don't think this is normal. Am I missing something?
UPDATE: I found that all predictions are always zero after every epoch. This is why all the RMSDs are the same: the predictions are all identical, namely 0. I checked the y values of the training set; it contains only a few zeros, so this is not caused by imbalanced data.
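The zero-prediction check above can be sketched as a small diagnostic. This is a hypothetical helper (not from the original post) that reports whether the model's outputs have collapsed to a constant, here using synthetic arrays in place of the real predictions and targets:

```python
import numpy as np

def diagnose_constant_output(preds, y):
    """Report whether predictions collapsed to a constant (here: zero)."""
    return {
        'pred_zero_frac': float(np.mean(preds == 0)),    # fraction of exactly-zero predictions
        'pred_std': float(np.std(preds)),                # 0.0 means every prediction is identical
        'target_zero_frac': float(np.mean(y == 0)),      # how many targets are actually zero
    }

# Synthetic example mirroring the symptom: all predictions zero, few zero targets
report = diagnose_constant_output(np.zeros((4, 100, 1)),
                                  np.array([3.1, 0.0, 7.2, 5.5]))
```

If `pred_std` is 0.0 while `target_zero_frac` is small, the constant RMSD comes from the model, not from the data.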
So now I am wondering whether it is caused by the layers and activation functions I used.
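One way to test that hypothesis: with `relu` on every layer, all-negative pre-activations at the output get clamped to exactly zero. The sketch below is a minimal variant of the model (written against the current `tensorflow.keras` API rather than the question's older Keras 1 API) using the default `tanh` inside the LSTMs and a `linear` output, so the network can emit any real value:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, TimeDistributed, BatchNormalization
from tensorflow.keras.optimizers import RMSprop

def RNN_keras_linear_out(feat_num, timestep_num=100):
    model = Sequential()
    model.add(BatchNormalization(input_shape=(timestep_num, feat_num)))
    model.add(LSTM(512, activation='tanh', return_sequences=True))
    model.add(BatchNormalization())
    model.add(LSTM(128, activation='tanh', return_sequences=True))
    model.add(BatchNormalization())
    # linear output: no relu clamping negative pre-activations to zero
    model.add(TimeDistributed(Dense(1, activation='linear')))
    model.compile(loss='mean_squared_error',
                  optimizer=RMSprop(learning_rate=1e-4))
    return model
```

If the loss starts moving with this variant, the `relu` activations were the likely culprit.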
Answer:
Your RNN function looks fine.
How fast the loss decreases depends on the optimizer and the learning rate.
In any case, you are using a decay rate of 0.9. Try a larger learning rate; it is going to decay at a rate of 0.9 anyway.
Try other optimizers with different learning rates. Other optimizers available in Keras: https://keras.io/optimizers/
Often, some optimizers work well on certain datasets while others may fail.
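Trying several optimizers can be sketched as a small sweep. This is a hypothetical setup (the model here is a scaled-down stand-in for the question's architecture, and the learning rates are illustrative starting points larger than the original 1e-5, not tuned values):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, TimeDistributed
from tensorflow.keras.optimizers import RMSprop, Adam, SGD

def build_model(feat_num, timestep_num=100):
    # Small stand-in for the question's architecture
    model = Sequential()
    model.add(LSTM(16, input_shape=(timestep_num, feat_num),
                   return_sequences=True))
    model.add(TimeDistributed(Dense(1)))
    return model

# Candidate optimizers; learning rates are starting points to sweep, not answers
candidates = {
    'rmsprop': RMSprop(learning_rate=1e-3, rho=0.9),
    'adam': Adam(learning_rate=1e-3),
    'sgd_momentum': SGD(learning_rate=1e-2, momentum=0.9),
}

for name, opt in candidates.items():
    model = build_model(8, timestep_num=5)   # rebuild fresh weights each run
    model.compile(loss='mean_squared_error', optimizer=opt)
    # model.fit(X_train, y_train, epochs=3, batch_size=1280)  # then compare losses
```

Rebuilding the model inside the loop matters: reusing the same weights across runs would make the comparison meaningless.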