Why is the MSE on the test set very low and apparently not changing (it does not increase with more epochs)?

I am working on a stock price prediction problem using an LSTM.

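My work is based on the following project. I use a stock price time series with a total length of 12075, split into a training set and a test set (about 10%). It is the same dataset used in the linked project.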

train_data.shape: (11000,)

test_data.shape: (1075,)
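A split with these shapes could be produced roughly as follows. This is only a minimal sketch: the random-walk series is a synthetic stand-in for the real price data, and the name `all_mid_data` is borrowed from the training loop purely for illustration.

```python
import numpy as np

# Synthetic stand-in for the real price series (assumption: 12075 points).
rng = np.random.default_rng(0)
all_mid_data = np.cumsum(rng.normal(size=12075)) + 100.0

# Chronological split: the first 11000 points are used for training,
# the remaining 1075 for testing. There is no shuffling, so the test
# set lies strictly after the training set in time.
train_data = all_mid_data[:11000]
test_data = all_mid_data[11000:]

print(train_data.shape)
print(test_data.shape)
```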

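In our model, we first train a many-to-many LSTM: we provide N input sequences (stock prices) and N label sequences. The inputs are sampled by splitting train_data into N segments, and the labels are sampled as the sequence of values that follows each input.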
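The sampling described above might look like the sketch below. The function `make_unrolled_batches` is hypothetical and written only for illustration; the actual `data_gen.unroll_batches()` used in the loop further down is not shown in the question and may behave differently (for example, by sampling labels at random offsets).

```python
import numpy as np

def make_unrolled_batches(series, num_unrollings, batch_size):
    """Hypothetical helper: build inputs and next-step labels.

    The series is cut into `batch_size` contiguous segments. For each of
    the `num_unrollings` time steps, one value per segment is taken as
    the input and the value right after it as the label.
    """
    segment_len = len(series) // batch_size
    if num_unrollings >= segment_len:
        raise ValueError("num_unrollings must be smaller than the segment length")

    # Start index of every segment.
    cursors = np.arange(batch_size) * segment_len

    inputs, labels = [], []
    for step in range(num_unrollings):
        idx = cursors + step
        inputs.append(series[idx])      # value at time t in each segment
        labels.append(series[idx + 1])  # value at time t + 1 in each segment
    return inputs, labels

# Tiny usage example on a toy series 0, 1, ..., 19.
inputs, labels = make_unrolled_batches(np.arange(20.0), num_unrollings=3, batch_size=4)
print(inputs[0], labels[0])
```

Each list element has shape `(batch_size,)`, which matches the `dat.reshape(-1,1)` and `lbl.reshape(-1,1)` calls in the training loop.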

We then predict each value individually and feed it back in as the next input, until num_predictions predictions have been made.
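This feed-back procedure can be illustrated independently of TensorFlow. In the sketch below, `rollout` and `toy_step` are hypothetical names, and the "model" is a trivial linear extrapolator standing in for the LSTM; only the control flow (warm up on true values, then feed each prediction back as the next input) is meant to mirror the question.

```python
import numpy as np

def toy_step(x, prev):
    """Stand-in for one LSTM step: linear extrapolation from the last two inputs.

    `prev` plays the role of the recurrent state (here simply the previous input).
    """
    pred = x if prev is None else x + (x - prev)
    return pred, x

def rollout(step_fn, warmup, n_predict):
    """Warm up the state on true values, then predict autoregressively."""
    state = None
    # Feed the recent history so the state reflects past behaviour.
    for x in warmup[:-1]:
        _, state = step_fn(x, state)

    x = warmup[-1]
    predictions = []
    for _ in range(n_predict):
        # The prediction becomes the input of the next step.
        x, state = step_fn(x, state)
        predictions.append(x)
    return np.array(predictions)

preds = rollout(toy_step, warmup=[1.0, 2.0, 3.0], n_predict=3)
print(preds)
```

Because every step consumes the previous prediction rather than a true value, errors made early in the rollout propagate into later predictions.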

The loss is simply the MSE between the predicted and actual values.

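The final predictions look reasonable. However, I do not understand why the training error drops sharply while the test error is always very, very low (although it keeps decreasing by small amounts). I know that the test error should normally start to increase after a certain number of epochs because of overfitting. I have also tried simpler code and a different dataset, and got fairly similar MSE curves.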

Here is my main loop:

for ep in range(epochs):

    # ========================= Training =====================================
    for step in range(num_batches):
        u_data, u_labels = data_gen.unroll_batches()

        feed_dict = {}
        for ui,(dat,lbl) in enumerate(zip(u_data,u_labels)):
            feed_dict[train_inputs[ui]] = dat.reshape(-1,1)
            feed_dict[train_outputs[ui]] = lbl.reshape(-1,1)

        feed_dict.update({tf_learning_rate: 0.0001, tf_min_learning_rate:0.000001})

        _, l = session.run([optimizer, loss], feed_dict=feed_dict)
        average_loss += l

    # ============================ Validation ==============================
    if (ep+1) % valid_summary == 0:
        average_loss = average_loss/(valid_summary*num_batches)

        # The average loss
        if (ep+1)%valid_summary==0:
            print('Average loss at step %d: %f' % (ep+1, average_loss))

        train_mse_ot.append(average_loss)
        average_loss = 0 # reset loss

        predictions_seq = []
        mse_test_loss_seq = []

        # ===================== Updating State and Making Predicitons ========================
        for w_i in test_points_seq:
            mse_test_loss = 0.0
            our_predictions = []

            if (ep+1)-valid_summary==0:
                # Only calculate x_axis values in the first validation epoch
                x_axis=[]

            # Feed in the recent past behavior of stock prices
            # to make predictions from that point onwards
            for tr_i in range(w_i-num_unrollings+1,w_i-1):
                current_price = all_mid_data[tr_i]
                feed_dict[sample_inputs] = np.array(current_price).reshape(1,1)
                _ = session.run(sample_prediction,feed_dict=feed_dict)

            feed_dict = {}
            current_price = all_mid_data[w_i-1]
            feed_dict[sample_inputs] = np.array(current_price).reshape(1,1)

            # Make predictions for this many steps
            # Each prediction uses previous prediciton as it's current input
            for pred_i in range(n_predict_once):
                pred = session.run(sample_prediction,feed_dict=feed_dict)
                our_predictions.append(np.asscalar(pred))
                feed_dict[sample_inputs] = np.asarray(pred).reshape(-1,1)

                if (ep+1)-valid_summary==0:
                    # Only calculate x_axis values in the first validation epoch
                    x_axis.append(w_i+pred_i)

                mse_test_loss += 0.5*(pred-all_mid_data[w_i+pred_i])**2

            session.run(reset_sample_states)
            predictions_seq.append(np.array(our_predictions))

            mse_test_loss /= n_predict_once
            mse_test_loss_seq.append(mse_test_loss)

            if (ep+1)-valid_summary==0:
                x_axis_seq.append(x_axis)

        current_test_mse = np.mean(mse_test_loss_seq)

        # Learning rate decay logic
        if len(test_mse_ot)>0 and current_test_mse > min(test_mse_ot):
            loss_nondecrease_count += 1
        else:
            loss_nondecrease_count = 0

        if loss_nondecrease_count > loss_nondecrease_threshold :
            session.run(inc_gstep)
            loss_nondecrease_count = 0
            print('\tDecreasing learning rate by 0.5')

        test_mse_ot.append(current_test_mse)
        #print('\tTest MSE: %.5f'%np.mean(mse_test_loss_seq))
        print('\tTest MSE: %.5f' % current_test_mse)
        predictions_over_time.append(predictions_seq)
        print('\tFinished Predictions')
        epochs_evolution.append(ep+1)

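Is this normal? Should I increase the size of the test set? Am I doing something wrong? Any suggestions on how to test or investigate this would be appreciated.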

Answer:

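The gap between the training and test MSE above arises because the two numbers do not measure the same thing. During training, the MSE is the average, over the samples in the training data, of the error summed across time steps, so it is large. During testing, we make N=50 predictions and compute the average error between the predicted and actual values. That average is always very small, and it looks almost constant in the plots above.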
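A small numeric sketch makes the scale difference concrete. It assumes, as the answer describes, that the training loss sums the per-time-step error over the unrolled steps while the test loss averages over the N predictions; the values of `num_unrollings` and `batch_size` and the constant residual of 0.1 are made up for illustration and do not come from the question.

```python
import numpy as np

num_unrollings = 50   # assumed number of unrolled time steps during training
batch_size = 500      # assumed batch size
n_predict_once = 50   # N = 50 predictions at test time

# Pretend every prediction, in training and in testing, is off by exactly 0.1.
train_err = np.full((num_unrollings, batch_size), 0.1)
test_err = np.full(n_predict_once, 0.1)

# Training-style loss: mean over the batch at each time step,
# then SUMMED over the time steps.
train_style = np.sum(np.mean(0.5 * train_err**2, axis=1))

# Test-style loss: summed over the predictions, then DIVIDED by their number.
test_style = np.sum(0.5 * test_err**2) / n_predict_once

# Dividing the training loss by the number of time steps puts both
# quantities on the same per-step scale.
train_per_step = train_style / num_unrollings

print(train_style, test_style, train_per_step)
```

With identical per-step errors, the two reported losses differ by a factor of `num_unrollings`. To compare the curves meaningfully, one option is to normalize both to a per-step average before plotting them together.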
