I am building my autoencoder following the tutorial at https://blog.keras.io/building-autoencoders-in-keras.html. To do so, I have two strategies:
A) Step 1: build the autoencoder; Step 2: build the encoder; Step 3: build the decoder; Step 4: compile the autoencoder; Step 5: train the autoencoder.
B) Step 1: build the autoencoder; Step 2: compile the autoencoder; Step 3: train the autoencoder; Step 4: build the encoder; Step 5: build the decoder.
In both cases the model converges to a loss of 0.100. However, with strategy A, which is the one used in the tutorial, the reconstructions are very poor. With strategy B the reconstructions are much better.
This makes sense to me, because in strategy A the encoder and decoder models are built on layers that have not been trained yet, so their output is random. In strategy B, on the other hand, the weights are well defined after training, so the reconstructions are better.
My question is: is strategy B valid, or am I cheating on the reconstruction? In the case of strategy A, should Keras automatically update the weights of the encoder and decoder models that are built from the autoencoder's layers?
```python
# Imports assumed by the snippet (x_train, x_test and encoding_dim are defined elsewhere)
from keras.layers import Input, Dense
from keras.models import Model

###### Code for Strategy A
# Step 1: build the autoencoder
features = Input(shape=(x_train.shape[1],))
encoded = Dense(1426, activation='relu')(features)
encoded = Dense(732, activation='relu')(encoded)
encoded = Dense(328, activation='relu')(encoded)
encoded = Dense(encoding_dim, activation='relu')(encoded)
decoded = Dense(328, activation='relu')(encoded)
decoded = Dense(732, activation='relu')(decoded)
decoded = Dense(1426, activation='relu')(decoded)
decoded = Dense(x_train.shape[1], activation='relu')(decoded)
autoencoder = Model(inputs=features, outputs=decoded)

# Step 2: build the encoder
encoder = Model(features, encoded)

# Step 3: build the decoder from the autoencoder's last four layers
encoded_input = Input(shape=(encoding_dim,))
decoder_layer = autoencoder.layers[-4](encoded_input)
decoder_layer = autoencoder.layers[-3](decoder_layer)
decoder_layer = autoencoder.layers[-2](decoder_layer)
decoder_layer = autoencoder.layers[-1](decoder_layer)
decoder = Model(encoded_input, decoder_layer)

# Step 4: compile the autoencoder
autoencoder.compile(optimizer='adam', loss='mse')

# Step 5: train the autoencoder
history = autoencoder.fit(x_train, x_train, epochs=150, batch_size=256,
                          shuffle=True, verbose=1, validation_split=0.2)

# Testing encoding
encoded_fts = encoder.predict(x_test)
decoded_fts = decoder.predict(encoded_fts)


###### Code for Strategy B
# Step 1: build the autoencoder
features = Input(shape=(x_train.shape[1],))
encoded = Dense(1426, activation='relu')(features)
encoded = Dense(732, activation='relu')(encoded)
encoded = Dense(328, activation='relu')(encoded)
encoded = Dense(encoding_dim, activation='relu')(encoded)
decoded = Dense(328, activation='relu')(encoded)
decoded = Dense(732, activation='relu')(decoded)
decoded = Dense(1426, activation='relu')(decoded)
decoded = Dense(x_train.shape[1], activation='relu')(decoded)
autoencoder = Model(inputs=features, outputs=decoded)

# Step 2: compile the autoencoder
autoencoder.compile(optimizer='adam', loss='mse')

# Step 3: train the autoencoder
history = autoencoder.fit(x_train, x_train, epochs=150, batch_size=256,
                          shuffle=True, verbose=1, validation_split=0.2)

# Step 4: build the encoder
encoder = Model(features, encoded)

# Step 5: build the decoder from the autoencoder's last four layers
encoded_input = Input(shape=(encoding_dim,))
decoder_layer = autoencoder.layers[-4](encoded_input)
decoder_layer = autoencoder.layers[-3](decoder_layer)
decoder_layer = autoencoder.layers[-2](decoder_layer)
decoder_layer = autoencoder.layers[-1](decoder_layer)
decoder = Model(encoded_input, decoder_layer)

# Testing encoding
encoded_fts = encoder.predict(x_test)
decoded_fts = decoder.predict(encoded_fts)
```
Answer:
My question is: is strategy B valid, or am I cheating on the reconstruction?
A and B are equivalent; no, you are not cheating.
In the case of strategy A, should Keras automatically update the weights of the encoder and decoder models that are built from the autoencoder's layers?
The decoder model simply reuses the autoencoder's layers. In case A:
```
decoder.layers
Out:
[<keras.engine.input_layer.InputLayer at 0x7f8a44d805c0>,
 <keras.layers.core.Dense at 0x7f8a44e58400>,
 <keras.layers.core.Dense at 0x7f8a44e746d8>,
 <keras.layers.core.Dense at 0x7f8a44e14940>,
 <keras.layers.core.Dense at 0x7f8a44e2dba8>]

autoencoder.layers
Out:
[<keras.engine.input_layer.InputLayer at 0x7f8a44e91c18>,
 <keras.layers.core.Dense at 0x7f8a44e91c50>,
 <keras.layers.core.Dense at 0x7f8a44e91ef0>,
 <keras.layers.core.Dense at 0x7f8a44e89080>,
 <keras.layers.core.Dense at 0x7f8a44e89da0>,
 <keras.layers.core.Dense at 0x7f8a44e58400>,
 <keras.layers.core.Dense at 0x7f8a44e746d8>,
 <keras.layers.core.Dense at 0x7f8a44e14940>,
 <keras.layers.core.Dense at 0x7f8a44e2dba8>]
```
The hexadecimal numbers (the object IDs) in the last four entries of each list are identical, because they are the very same objects. Naturally, they also share their weights.
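If you prefer not to compare object IDs by eye, you can check the identity (and hence the weight sharing) directly. A minimal sketch, assuming the `autoencoder` and `decoder` variables from your code:

```python
import numpy as np

# Skip the decoder's InputLayer and line its Dense layers up with the
# last four layers of the autoencoder.
for dec_layer, ae_layer in zip(decoder.layers[1:], autoencoder.layers[-4:]):
    print(dec_layer is ae_layer)  # True: same Python object, not a copy
    # Being the same object, the weights are necessarily identical too.
    for w_dec, w_ae in zip(dec_layer.get_weights(), ae_layer.get_weights()):
        assert np.array_equal(w_dec, w_ae)
```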
In case B:
```
decoder.layers
Out:
[<keras.engine.input_layer.InputLayer at 0x7f8a41de05f8>,
 <keras.layers.core.Dense at 0x7f8a41ee4828>,
 <keras.layers.core.Dense at 0x7f8a41eaceb8>,
 <keras.layers.core.Dense at 0x7f8a41e50ac8>,
 <keras.layers.core.Dense at 0x7f8a41e5d780>]

autoencoder.layers
Out:
[<keras.engine.input_layer.InputLayer at 0x7f8a41da3940>,
 <keras.layers.core.Dense at 0x7f8a41da3978>,
 <keras.layers.core.Dense at 0x7f8a41da3a90>,
 <keras.layers.core.Dense at 0x7f8a41da3b70>,
 <keras.layers.core.Dense at 0x7f8a44720cf8>,
 <keras.layers.core.Dense at 0x7f8a41ee4828>,
 <keras.layers.core.Dense at 0x7f8a41eaceb8>,
 <keras.layers.core.Dense at 0x7f8a41e50ac8>,
 <keras.layers.core.Dense at 0x7f8a41e5d780>]
```
Again, the layers are identical.
So the training order in A and B is equivalent. More generally, if you share layers (and therefore weights), the order in which you build, compile and train mostly doesn't matter, because everything lives in the same TensorFlow graph.
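You can see the same effect with a toy model. This is only an illustrative sketch (sizes and data are made up, not your dimensions): the encoder sub-model is built before compiling and training, strategy-A style, and still ends up with the trained weights.

```python
import numpy as np
from keras.layers import Input, Dense
from keras.models import Model

# Toy 8 -> 2 -> 8 autoencoder (illustrative sizes only)
inp = Input(shape=(8,))
code = Dense(2, activation='relu')(inp)
out = Dense(8, activation='linear')(code)
ae = Model(inp, out)

# Sub-model built BEFORE compiling/training, i.e. strategy A's order
enc = Model(inp, code)
w_before = [w.copy() for w in enc.get_weights()]

ae.compile(optimizer='adam', loss='mse')
x = np.random.rand(256, 8)
ae.fit(x, x, epochs=5, batch_size=32, verbose=0)

# The encoder was never trained directly, yet its weights changed,
# because they are the very variables the autoencoder just updated.
changed = any(not np.array_equal(b, a)
              for b, a in zip(w_before, enc.get_weights()))
print(changed)  # True
```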
I ran both examples on the mnist dataset and they showed the same performance and reconstructed the images well. I suspect that if you ran into trouble with case A, you missed something else (I don't see how, since I copy-pasted your code and everything worked).
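For completeness, the data preparation I used is just the usual flatten-and-scale setup from the tutorial; the exact values below (including `encoding_dim = 32`) are my assumptions, not taken from your code:

```python
import numpy as np
from keras.datasets import mnist

# Flatten 28x28 MNIST images into 784-dim vectors scaled to [0, 1],
# so they can be fed directly into the models from the question.
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))

encoding_dim = 32  # assumed bottleneck size; use whatever you actually need
```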
If you are working in Jupyter, it sometimes helps to restart the kernel and run everything from top to bottom.