I am following the neural machine translation tutorial at https://www.tensorflow.org/tutorials/text/nmt_with_attention#restore_the_latest_checkpoint_and_test, but the tutorial does not seem to track train_losses and val_losses (there is only batch_loss). Is there a way to get the loss history, the way you can with other models? For example:
train_loss = seqModel.history['loss']
val_loss = seqModel.history['val_loss']
train_acc = seqModel.history['acc']
val_acc = seqModel.history['val_acc']
Answer:
Actually, the tutorial does compute it. When they run
for epoch in range(EPOCHS):
    start = time.time()
    enc_hidden = encoder.initialize_hidden_state()
    total_loss = 0

    for (batch, (inp, targ)) in enumerate(dataset.take(steps_per_epoch)):
        batch_loss = train_step(inp, targ, enc_hidden)
        total_loss += batch_loss
this is how they accumulate the training loss returned by the train_step method. But since there is no validation set, no validation loss is shown.
Based on your comment: you need to write a test_step function and call it inside the training loop. Here is a minimal sketch for getting the validation loss.
@tf.function
def test_step(inp, targ, enc_hidden):
    loss = 0
    enc_output, enc_hidden = encoder(inp, enc_hidden, training=False)
    dec_hidden = enc_hidden
    dec_input = tf.expand_dims([targ_lang.word_index['<start>']] * BATCH_SIZE, 1)

    for t in range(1, targ.shape[1]):
        predictions, dec_hidden, _ = decoder(dec_input, dec_hidden,
                                             enc_output, training=False)
        loss += loss_function(targ[:, t], predictions)
        # teacher forcing: feed the target token as the next decoder input
        dec_input = tf.expand_dims(targ[:, t], 1)

    batch_loss = (loss / int(targ.shape[1]))
    return batch_loss
To use it in the custom training loop, you can do the following. Note that I am reusing the same dataset here, but in practice you should create a separate validation dataset.
EPOCHS = 5
history = {'loss': [], 'val_loss': []}

for epoch in range(EPOCHS):
    start = time.time()
    enc_hidden = encoder.initialize_hidden_state()

    total_loss = 0
    for (batch, (inp, targ)) in enumerate(dataset.take(steps_per_epoch)):
        batch_loss = train_step(inp, targ, enc_hidden)
        total_loss += batch_loss

    if (epoch + 1) % 2 == 0:
        checkpoint.save(file_prefix=checkpoint_prefix)

    history['loss'].append(total_loss.numpy() / steps_per_epoch)
    print(f'Epoch {epoch+1} Loss {total_loss/steps_per_epoch:.4f}')

    total_loss = 0
    for (batch, (inp, targ)) in enumerate(dataset.take(steps_per_epoch)):
        batch_loss = test_step(inp, targ, enc_hidden)
        total_loss += batch_loss

    history['val_loss'].append(total_loss.numpy() / steps_per_epoch)
    print(f'Epoch {epoch+1} Val Loss {total_loss/steps_per_epoch:.4f}')

    print(f'Time taken for 1 epoch {time.time()-start:.2f} sec\n')
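As mentioned above, reusing the training batches for validation is only a placeholder. With tf.data you would typically hold out some batches via dataset.take(val_steps) for validation and dataset.skip(val_steps) for training. Here is the same idea sketched with a plain Python list standing in for the dataset (the batch contents are dummies, purely for illustration):

```python
# Hold-out split sketch: a plain list stands in for a tf.data.Dataset.
# With tf.data this would be dataset.take(val_steps) / dataset.skip(val_steps).
batches = [f"batch_{i}" for i in range(10)]  # dummy stand-ins for (inp, targ) pairs

val_steps = 2
val_batches = batches[:val_steps]    # analogous to dataset.take(val_steps)
train_batches = batches[val_steps:]  # analogous to dataset.skip(val_steps)

print(len(train_batches), len(val_batches))  # 8 2
```

Splitting before training ensures the validation loss is measured on examples the model never sees during the train_step updates.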
Afterwards, history['loss'] and history['val_loss'] will hold the per-epoch training and validation losses.
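Since both entries are plain Python lists, you can inspect them directly once training ends, for example to find the epoch with the lowest validation loss. The numbers below are made up purely for illustration:

```python
# Hypothetical per-epoch losses standing in for the real history dict.
history = {'loss':     [2.1, 1.6, 1.3, 1.1, 1.0],
           'val_loss': [2.3, 1.9, 1.7, 1.8, 1.9]}

# 1-based epoch numbering, to match the print statements in the loop above.
best_epoch = min(range(len(history['val_loss'])),
                 key=lambda i: history['val_loss'][i]) + 1
print(best_epoch)  # 3
```

Here the training loss keeps falling while the validation loss bottoms out at epoch 3, which is exactly the kind of divergence that tracking both histories lets you spot.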