I implemented a data generator that splits the training data into mini-batches of 256 to avoid out-of-memory errors. It runs on the training data, but the validation loss and validation accuracy are not reported at the end of each epoch, even though I also apply the generator to the validation data and define the validation steps. Which part of the code is wrong and keeps the validation loss and accuracy from showing up? Here is the code:
early_stopping_cb = tf.keras.callbacks.EarlyStopping(patience=3, restore_best_weights=True)
batch_size = 256
epoch_steps = math.ceil(len(utt) / batch_size)
val_steps = math.ceil(len(val_prev) / batch_size)

hist = model.fit_generator(
    generate_data(utt_minus_one, utt, y_train, batch_size),
    steps_per_epoch=epoch_steps,
    epochs=3,
    callbacks=[early_stopping_cb],
    validation_data=generate_data(val_prev, val_curr, y_val, batch_size),
    validation_steps=val_steps,
    class_weight=custom_weight_dict,
    verbose=1,
)
Here is the generator code:
# Function that splits the data into mini-batches of 256 that are loaded one at a time
def generate_data(X1, X2, Y, batch_size):
    p_input = []
    c_input = []
    target = []
    batch_count = 0
    for i in range(len(X1)):
        p_input.append(X1[i])
        c_input.append(X2[i])
        target.append(Y[i])
        batch_count += 1
        if batch_count > batch_size:
            prev_X = np.array(p_input, dtype=np.int64)
            cur_X = np.array(c_input, dtype=np.int64)
            cur_y = np.array(target, dtype=np.int32)
            yield ([prev_X, cur_X], cur_y)
            p_input = []
            c_input = []
            target = []
            batch_count = 0
    return
Here is the trace of the first epoch, which also shows an error:
Epoch 1/3
346/348 [============================>.] - ETA: 4s - batch: 172.5000 - size: 257.0000 - loss: 0.8972 - accuracy: 0.8424
WARNING:tensorflow:Your dataset iterator ran out of data; interrupting training. Make sure that your iterator can generate at least `steps_per_epoch * epochs` batches (in this case, 1044 batches). You may need to use the repeat() function when building your dataset.
WARNING:tensorflow:Early stopping conditioned on metric `val_loss` which is not available. Available metrics are: loss,accuracy
346/348 [============================>.] - 858s 2s/step - batch: 172.5000 - size: 257.0000 - loss: 0.8972 - accuracy: 0.8424
Can anyone help resolve these issues?
Answer:
The for loop that splits the data into mini-batches needs to be wrapped in a while loop so the generator keeps yielding across epochs instead of stopping after one pass. If each epoch consumes 348 batches, the generator must supply 3 * 348 = 1044 batches in total.
# Function that splits the data into mini-batches of 256 that are loaded one at a time
def generate_data(X1, X2, Y, batch_size):
    count = 0
    p_input = []
    c_input = []
    target = []
    batch_count = 0
    while True:  # keep cycling over the data so every epoch gets fresh batches
        for i in range(len(X1)):
            p_input.append(X1[i])
            c_input.append(X2[i])
            target.append(Y[i])
            batch_count += 1
            if batch_count > batch_size:
                count = count + 1
                prev_X = np.array(p_input, dtype=np.int64)
                cur_X = np.array(c_input, dtype=np.int64)
                cur_y = np.array(target, dtype=np.int32)
                yield ([prev_X, cur_X], cur_y)
                p_input = []
                c_input = []
                target = []
                batch_count = 0
        print(count)  # batches yielded so far, printed after each pass over the data
    return
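As a quick sanity check, a minimal sketch using the arrays from the question can confirm that the repaired generator keeps producing batches well past a single pass over the data (itertools.islice just caps the now-infinite generator):

import itertools

gen = generate_data(utt_minus_one, utt, y_train, 256)
# The old generator stopped after one pass (~348 batches); the repaired one
# keeps cycling, so pulling steps_per_epoch * epochs batches should succeed.
n = sum(1 for _ in itertools.islice(gen, 1044))
print(n)  # expect 1044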
Here is the trace of the first epoch:
Epoch 1/3
335/347 [===========================>..] - ETA: 30s - batch: 167.0000 - size: 257.0000 - loss: 1.2734 - accuracy: 0.8105
346
347/347 [==============================] - ETA: 0s - batch: 173.0000 - size: 257.0000 - loss: 1.2635 - accuracy: 0.8113
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training_v1.py:2048: Model.state_updates (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version.
Instructions for updating:
This property should not be used in TensorFlow 2.0, as updates are applied automatically.
86
347/347 [==============================] - 964s 3s/step - batch: 173.0000 - size: 257.0000 - loss: 1.2635 - accuracy: 0.8113 - val_loss: 0.5700 - val_accuracy: 0.8367
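As the first warning in the original trace suggests, another option is to build a tf.data.Dataset and let repeat() supply the steps_per_epoch * epochs batches, instead of hand-writing the looping generator. Below is a minimal sketch under the assumption that the arrays from the question (utt_minus_one, utt, y_train, val_prev, val_curr, y_val, custom_weight_dict, early_stopping_cb) are available in memory; make_dataset is a hypothetical helper, not part of the original code:

import math
import numpy as np
import tensorflow as tf

def make_dataset(prev_x, cur_x, y, batch_size=256):
    # Slice the in-memory arrays into ((prev, cur), label) examples, batch them,
    # and repeat forever so fit() never runs out of data mid-training.
    ds = tf.data.Dataset.from_tensor_slices(
        (
            (np.asarray(prev_x, dtype=np.int64), np.asarray(cur_x, dtype=np.int64)),
            np.asarray(y, dtype=np.int32),
        )
    )
    return ds.batch(batch_size).repeat()

train_ds = make_dataset(utt_minus_one, utt, y_train)
val_ds = make_dataset(val_prev, val_curr, y_val)

hist = model.fit(
    train_ds,
    steps_per_epoch=math.ceil(len(utt) / 256),
    epochs=3,
    callbacks=[early_stopping_cb],
    validation_data=val_ds,
    validation_steps=math.ceil(len(val_prev) / 256),
    class_weight=custom_weight_dict,
    verbose=1,
)

Note that Model.fit_generator is deprecated in recent TensorFlow releases; Model.fit accepts generators and tf.data datasets directly.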