The model below comes from the Keras website, and it performs exactly as expected. It is defined with keras.models.Sequential(). I would like to convert it to a keras.models.Model() definition to make it more flexible for future use, but after the conversion the performance drops significantly. You can find the original model on the Keras website:
def build_model():
    model = Sequential([
        layers.Dense(64, activation=tf.nn.relu, input_shape=[len(train_dataset.keys())]),
        layers.Dense(64, activation=tf.nn.relu),
        layers.Dense(1)
    ])
    optimizer = keras.optimizers.Adam()
    model.compile(loss='mean_squared_error',
                  optimizer=optimizer,
                  metrics=['mean_absolute_error', 'mean_squared_error'])
    return model

model = build_model()
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_22 (Dense)             (None, 64)                640
_________________________________________________________________
dense_23 (Dense)             (None, 64)                4160
_________________________________________________________________
dense_24 (Dense)             (None, 1)                 65
=================================================================
Total params: 4,865
Trainable params: 4,865
Non-trainable params: 0
_________________________________________________________________
Here is my converted code:
def build_model_base():
    input = Input(shape=[len(train_dataset.keys())])
    x = Dense(64, activation='relu', name="dense1")(input)
    x = Dense(64, activation='relu', name="dense2")(x)
    output = Dense(1, activation='sigmoid', name='output')(x)
    model = Model(input=[input], output=[output])
    optimizer = keras.optimizers.Adam()
    model.compile(loss='mean_squared_error',
                  optimizer=optimizer,
                  metrics=['mean_absolute_error', 'mean_squared_error'])
    return model

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_18 (InputLayer)        (None, 9)                 0
_________________________________________________________________
dense1 (Dense)               (None, 64)                640
_________________________________________________________________
dense2 (Dense)               (None, 64)                4160
_________________________________________________________________
output (Dense)               (None, 1)                 65
=================================================================
Total params: 4,865
Trainable params: 4,865
Non-trainable params: 0
The only difference I can see is that .Sequential does not count the input layer while .Model does, but I don't think that makes the two model structures different. However, the .Sequential model performs as expected, while my converted .Model() performs significantly worse.
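As a quick sanity check (a sketch, assuming both build functions are defined in the same session), the trainable parameter counts of the two models can be compared directly:

seq_model = build_model()
fn_model = build_model_base()

# count_params() sums all weights in the model; both should report 4,865,
# matching the totals in the two summaries above.
print(seq_model.count_params())
print(fn_model.count_params())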
Can anyone tell me what I am doing wrong?
Other background information:

I have already read this thread, but my code runs entirely on the CPU in Google Colab.
print(keras.__version__)  # 2.0.4
print(tf.__version__)     # 1.14.0-rc1
The code used to plot the losses:
def plot_history(history):
    hist = pd.DataFrame(history.history)
    hist['epoch'] = history.epoch

    plt.figure()
    plt.xlabel('Epoch')
    plt.ylabel('Mean Abs Error [MPG]')
    plt.plot(hist['epoch'], hist['mean_absolute_error'], label='Train Error')
    plt.plot(hist['epoch'], hist['val_mean_absolute_error'], label='Val Error')
    y_max = max(hist['val_mean_absolute_error'])
    plt.ylim([0, y_max])
    plt.legend()

    plt.figure()
    plt.xlabel('Epoch')
    plt.ylabel('Mean Square Error [$MPG^2$]')
    plt.plot(hist['epoch'], hist['mean_squared_error'], label='Train Error')
    plt.plot(hist['epoch'], hist['val_mean_squared_error'], label='Val Error')
    y_max = max(hist['val_mean_squared_error'])
    plt.ylim([0, y_max])
    plt.legend()

    plt.show()
The code used to train the models (identical for both):
his_seq = model.fit(normed_train_data.values, train_labels.values,
                    batch_size=128, validation_split=0.1,
                    epochs=100, verbose=0)
plot_history(his_seq)
Any suggestions would be greatly appreciated!
Answer:
Keras Dense layers use a 'linear' activation by default, and that is also what the output layer of your Sequential model uses. In your converted model, however, you specified a 'sigmoid' activation for the output layer, which is most likely what causes the difference: a sigmoid squashes the output into the range (0, 1), while the regression targets here (MPG values) lie well outside that range, so the model can never predict them.
Here is the description Keras provides for the default activation:
activation: Activation function to use (see activations). If you don't specify anything, no activation is applied (i.e. "linear" activation: a(x) = x).
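For reference, a minimal sketch of the functional-API version with the sigmoid removed, so the output layer falls back to the default linear activation (this assumes the same train_dataset and standalone Keras imports as in the question):

import keras
from keras.layers import Input, Dense
from keras.models import Model

def build_model_functional():
    # Same 64-64-1 architecture as the Sequential version, with ReLU hidden layers.
    inputs = Input(shape=[len(train_dataset.keys())])
    x = Dense(64, activation='relu', name="dense1")(inputs)
    x = Dense(64, activation='relu', name="dense2")(x)
    # No activation argument: Dense defaults to linear, which a regression output needs.
    outputs = Dense(1, name='output')(x)

    model = Model(inputs=inputs, outputs=outputs)
    model.compile(loss='mean_squared_error',
                  optimizer=keras.optimizers.Adam(),
                  metrics=['mean_absolute_error', 'mean_squared_error'])
    return model

With the sigmoid removed, the two graphs are identical apart from the explicit InputLayer, which carries no parameters, so the functional model should train just like the Sequential one.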