I'm doing handwritten digit recognition with Keras, and I have two files: predict.py and train.py.

train.py trains the model (if it hasn't been trained yet) and saves it to a directory; otherwise, it loads the trained model from the saved directory and prints the Test Loss and Test Accuracy.
```python
import os

from keras.datasets import mnist
from keras.models import Sequential, model_from_json
from keras.layers import Dense
from keras.utils import to_categorical


def getData():
    (X_train, y_train), (X_test, y_test) = mnist.load_data()
    y_train = to_categorical(y_train, num_classes=10)
    y_test = to_categorical(y_test, num_classes=10)
    X_train = X_train.reshape(X_train.shape[0], 784)
    X_test = X_test.reshape(X_test.shape[0], 784)
    # Normalize the data to help training
    X_train /= 255
    X_test /= 255
    return X_train, y_train, X_test, y_test


def trainModel(X_train, y_train, X_test, y_test):
    # Training parameters
    batch_size = 1
    epochs = 10
    # Create the model and add layers
    model = Sequential()
    model.add(Dense(64, activation='relu', input_shape=(784,)))
    model.add(Dense(10, activation='softmax'))
    # Compile the sequential model
    model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')
    # Train the model and store the metrics in history
    history = model.fit(X_train, y_train,
                        batch_size=batch_size, epochs=epochs,
                        verbose=2, validation_data=(X_test, y_test))
    loss_and_metrics = model.evaluate(X_test, y_test, verbose=2)
    print("Test Loss", loss_and_metrics[0])
    print("Test Accuracy", loss_and_metrics[1])
    # Save the model architecture and weights
    model_json = model.to_json()
    with open('model.json', 'w') as json_file:
        json_file.write(model_json)
    model.save_weights('mnist_model.h5')
    return model


def loadModel():
    json_file = open('model.json', 'r')
    model_json = json_file.read()
    json_file.close()
    model = model_from_json(model_json)
    model.load_weights("mnist_model.h5")
    return model


X_train, y_train, X_test, y_test = getData()

if not os.path.exists('mnist_model.h5'):
    model = trainModel(X_train, y_train, X_test, y_test)
    print('trained model')
    print(model.summary())
else:
    model = loadModel()
    print('loaded model')
    print(model.summary())
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    loss_and_metrics = model.evaluate(X_test, y_test, verbose=2)
    print("Test Loss", loss_and_metrics[0])
    print("Test Accuracy", loss_and_metrics[1])
```
Here is the output (assuming the model was trained earlier, so this run only loads it):
```
('Test Loss', 1.741784990310669)
('Test Accuracy', 0.414)
```
predict.py, on the other hand, is meant to predict handwritten digits:
```python
from keras.datasets import mnist
from keras.models import model_from_json
from keras.utils import to_categorical


def loadModel():
    json_file = open('model.json', 'r')
    model_json = json_file.read()
    json_file.close()
    model = model_from_json(model_json)
    model.load_weights("mnist_model.h5")
    return model


model = loadModel()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())

(X_train, y_train), (X_test, y_test) = mnist.load_data()
y_test = to_categorical(y_test, num_classes=10)
X_test = X_test.reshape(X_test.shape[0], 28*28)

loss_and_metrics = model.evaluate(X_test, y_test, verbose=2)
print("Test Loss", loss_and_metrics[0])
print("Test Accuracy", loss_and_metrics[1])
```
To my surprise, in this case I get the following results:
```
('Test Loss', 1.8380377866744995)
('Test Accuracy', 0.8856)
```
In the second file, I get a Test Accuracy of 0.88 (more than double what I got before).
Also, model.summary() is identical in both files:
```
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_1 (Dense)              (None, 64)                50240
_________________________________________________________________
dense_2 (Dense)              (None, 10)                650
=================================================================
Total params: 50,890
Trainable params: 50,890
Non-trainable params: 0
_________________________________________________________________
```
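As a sanity check, the parameter counts in this summary follow from the Dense layer formula `inputs * units + units` (one bias per unit); a quick verification of the numbers above:

```python
# Param count of a Dense layer: inputs * units + units (one bias per unit)
dense_1 = 784 * 64 + 64   # first layer: 784 inputs, 64 units
dense_2 = 64 * 10 + 10    # second layer: 64 inputs, 10 units
print(dense_1)
print(dense_2)
print(dense_1 + dense_2)
```

These match the 50240, 650, and 50,890 reported by model.summary(), confirming the two files really do hold the same architecture.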
I can't figure out the reason behind this behavior. Is it normal, or am I missing something?
Answer:
The differing results come from the fact that in one case you call the evaluate() method on normalized data (i.e. divided by 255), whereas in the other (i.e. in the predict.py file) you call it on unnormalized data. At inference (i.e. test) time, you should always apply the same preprocessing steps that were used on the training data.
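One way to rule out this kind of mismatch is to factor the preprocessing into a single helper that both train.py and predict.py import. A minimal, framework-free sketch (the `preprocess` name and the simulated batch are my own, not from the question):

```python
import numpy as np

def preprocess(X):
    """Shared preprocessing: flatten to 784 features and scale to [0, 1]."""
    X = X.reshape(X.shape[0], 784).astype('float32')
    return X / 255.

# Simulated batch shaped like raw MNIST images (uint8, 28x28),
# standing in for the output of mnist.load_data()
rng = np.random.RandomState(0)
raw = rng.randint(0, 256, size=(4, 28, 28)).astype('uint8')

X = preprocess(raw)
print(X.shape, X.dtype)
print(float(X.min()) >= 0.0 and float(X.max()) <= 1.0)
```

With a single shared function, evaluate() in both scripts is guaranteed to see data on the same scale the model was trained on.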
Further, convert the data to floats first and only then divide by 255 (otherwise, with /, Python 2.x performs integer division on the integer arrays, silently zeroing out the data, and under Python 3.x running X_train /= 255 and X_test /= 255 raises an error):
```python
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255.
X_test /= 255.
```
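The failure mode is easy to reproduce without Keras, since it comes from NumPy's casting rules: in-place true division cannot write float results back into an integer array. A small demonstration (the toy array is my own):

```python
import numpy as np

# A tiny uint8 array standing in for the raw MNIST pixel data
X = np.arange(6, dtype=np.uint8).reshape(2, 3)

# In-place true division on an integer array raises under Python 3 / NumPy
try:
    X /= 255
    raised = False
except TypeError:
    raised = True
print(raised)

# Casting to float32 first makes the in-place division valid
Xf = X.astype('float32')
Xf /= 255.
print(Xf.dtype)
```

This is why the astype('float32') call has to come before the division, not after.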