I'm trying to implement a variational autoencoder (VAE) on the MNIST dataset in TensorFlow. First, I trained a VAE with an MLP-based encoder and decoder. Training went smoothly: the loss kept decreasing and the generated digits looked reasonable. Here is the code for the MLP-based decoder:
x = sampled_z
x = tf.layers.dense(x, 200, tf.nn.relu)
x = tf.layers.dense(x, 200, tf.nn.relu)
x = tf.layers.dense(x, np.prod(data_shape))
img = tf.reshape(x, [-1] + data_shape)
Next, I decided to add convolutional layers. Changing only the encoder worked fine, but as soon as I use transposed convolutions in the decoder (instead of fully connected layers), it stops training entirely: the loss never decreases and the output is always black. Here is the code for the deconvolutional decoder:
x = tf.layers.dense(sampled_z, 24, tf.nn.relu)
x = tf.layers.dense(x, 7 * 7 * 64, tf.nn.relu)
x = tf.reshape(x, [-1, 7, 7, 64])
x = tf.layers.conv2d_transpose(x, 64, 3, 2, 'SAME', activation=tf.nn.relu)
x = tf.layers.conv2d_transpose(x, 32, 3, 2, 'SAME', activation=tf.nn.relu)
x = tf.layers.conv2d_transpose(x, 1, 3, 1, 'SAME', activation=tf.nn.sigmoid)
img = tf.reshape(x, [-1, 28, 28])
This seems very odd to me; the code looks fine as far as I can tell. I'm fairly sure the problem is in the decoder's transposed-convolution layers, something must be going wrong there. For example, if I add a fully connected layer after the last transposed convolution (even without a nonlinearity!), it works again. Here is that code:
x = tf.layers.dense(sampled_z, 24, tf.nn.relu)
x = tf.layers.dense(x, 7 * 7 * 64, tf.nn.relu)
x = tf.reshape(x, [-1, 7, 7, 64])
x = tf.layers.conv2d_transpose(x, 64, 3, 2, 'SAME', activation=tf.nn.relu)
x = tf.layers.conv2d_transpose(x, 32, 3, 2, 'SAME', activation=tf.nn.relu)
x = tf.layers.conv2d_transpose(x, 1, 3, 1, 'SAME', activation=tf.nn.sigmoid)
x = tf.contrib.layers.flatten(x)
x = tf.layers.dense(x, 28 * 28)
img = tf.reshape(x, [-1, 28, 28])
I'm really stuck at this point. Does anyone know what might be going on here? I'm using tf 1.8.0 and the Adam optimizer with a learning rate of 1e-4.
Edit:
As @Agost pointed out, I should clarify my loss function and training procedure. I model the posterior as a Bernoulli distribution and maximize the ELBO as my objective, inspired by this post. Here is the full code for the encoder, decoder, and loss:
def make_prior():
    mu = tf.zeros(N_LATENT)
    sigma = tf.ones(N_LATENT)
    return tf.contrib.distributions.MultivariateNormalDiag(mu, sigma)


def make_encoder(x_input):
    x_input = tf.reshape(x_input, shape=[-1, 28, 28, 1])
    x = conv(x_input, 32, 3, 2)
    x = conv(x, 64, 3, 2)
    x = conv(x, 128, 3, 2)
    x = tf.contrib.layers.flatten(x)
    mu = dense(x, N_LATENT)
    sigma = dense(x, N_LATENT, activation=tf.nn.softplus)  # softplus is log(exp(x) + 1)
    return tf.contrib.distributions.MultivariateNormalDiag(mu, sigma)


def make_decoder(sampled_z):
    x = tf.layers.dense(sampled_z, 24, tf.nn.relu)
    x = tf.layers.dense(x, 7 * 7 * 64, tf.nn.relu)
    x = tf.reshape(x, [-1, 7, 7, 64])
    x = tf.layers.conv2d_transpose(x, 64, 3, 2, 'SAME', activation=tf.nn.relu)
    x = tf.layers.conv2d_transpose(x, 32, 3, 2, 'SAME', activation=tf.nn.relu)
    x = tf.layers.conv2d_transpose(x, 1, 3, 1, 'SAME')
    img = tf.reshape(x, [-1, 28, 28])
    img_distribution = tf.contrib.distributions.Bernoulli(img)
    img = img_distribution.probs
    img_distribution = tf.contrib.distributions.Independent(img_distribution, 2)
    return img, img_distribution


def main():
    mnist = input_data.read_data_sets(os.path.join(experiment_dir(EXPERIMENT), 'MNIST_data'))
    tf.reset_default_graph()
    batch_size = 128
    x_input = tf.placeholder(dtype=tf.float32, shape=[None, 28, 28], name='X')
    prior = make_prior()
    posterior = make_encoder(x_input)
    mu, sigma = posterior.mean(), posterior.stddev()
    z = posterior.sample()
    generated_img, output_distribution = make_decoder(z)
    likelihood = output_distribution.log_prob(x_input)
    divergence = tf.distributions.kl_divergence(posterior, prior)
    elbo = tf.reduce_mean(likelihood - divergence)
    loss = -elbo
    global_step = tf.train.get_or_create_global_step()
    optimizer = tf.train.AdamOptimizer(1e-3).minimize(loss, global_step=global_step)
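For reference, the loss in main() is the negative ELBO, i.e. elbo = E_{z ~ q(z|x)}[log p(x|z)] - KL(q(z|x) || p(z)): likelihood is the Bernoulli reconstruction term and divergence is the KL term against the standard-normal prior.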
Answer:
Could it be your use of a sigmoid in the final deconv layer, which restricts the output to the 0-1 range? You don't do that in the MLP-based autoencoder, nor when you add a fully connected layer after the deconvs, so you may have a data-range issue.
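If that is indeed the cause, here is a minimal sketch of the suggested change, applied to the make_decoder from the edit above. Nothing here is new API; the only difference is dropping the sigmoid on the last transposed convolution and making the logits argument to the Bernoulli explicit:

import tensorflow as tf  # TF 1.x, as in the question


def make_decoder(sampled_z):
    x = tf.layers.dense(sampled_z, 24, tf.nn.relu)
    x = tf.layers.dense(x, 7 * 7 * 64, tf.nn.relu)
    x = tf.reshape(x, [-1, 7, 7, 64])
    x = tf.layers.conv2d_transpose(x, 64, 3, 2, 'SAME', activation=tf.nn.relu)
    x = tf.layers.conv2d_transpose(x, 32, 3, 2, 'SAME', activation=tf.nn.relu)
    x = tf.layers.conv2d_transpose(x, 1, 3, 1, 'SAME')  # no sigmoid here
    logits = tf.reshape(x, [-1, 28, 28])
    # The Bernoulli is built from the unconstrained logits, not from values
    # already squashed into (0, 1).
    img_distribution = tf.contrib.distributions.Bernoulli(logits=logits)
    img = img_distribution.probs  # sigmoid of the logits, for visualization
    img_distribution = tf.contrib.distributions.Independent(img_distribution, 2)
    return img, img_distribution

Built from logits, Bernoulli.log_prob is computed via the numerically stable sigmoid cross-entropy, while .probs still returns the sigmoid-squashed reconstruction, so nothing else in main() needs to change.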