Strange Softmax output from a TensorFlow convolutional network

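The output of my convolutional network is very unusual. When I print the output vector of the forward-propagation result, it is always exactly [0, 0, 0, 1], and it stays the same for every label in the dataset. I suspect there is a mistake in how I built the network.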

import os
import re
import sys

import tensorflow as tf

import Input

"""
This is a model based on the CIFAR10 model.
The overall structure of the program and some of its functions are borrowed
from TensorFlow's CIFAR10 model example.
https://github.com/tensorflow/tensorflow/tree/r0.7/tensorflow/models/image/cifar10/

As the tutorial says:
"If you are now interested in developing and training your own image
classification system, we recommend forking this tutorial and replacing
components to address your image classification problem."
Source: https://www.tensorflow.org/tutorials/deep_cnn/
"""

FLAGS = tf.app.flags.FLAGS
TOWER_NAME = 'tower'

tf.app.flags.DEFINE_integer('batch_size', 1, "hello")
tf.app.flags.DEFINE_string('data_dir', 'data', "hello")


def _activation_summary(x):
    # Record a histogram and the sparsity of a layer's activations.
    with tf.device('/cpu:0'):
        tensor_name = re.sub('%s_[0-9]*/' % TOWER_NAME, '', x.op.name)
        tf.histogram_summary(tensor_name + '/activations', x)
        tf.scalar_summary(tensor_name + '/sparsity', tf.nn.zero_fraction(x))


def inputs():
    if not FLAGS.data_dir:
        raise ValueError('Source Data Missing')
    data_dir = FLAGS.data_dir
    images, labels = Input.inputs(data_dir=data_dir, batch_size=FLAGS.batch_size)
    return images, labels


def eval_inputs():
    data_dir = FLAGS.data_dir
    images, labels = Input.eval_inputs(data_dir=data_dir, batch_size=1)
    return images, labels


def weight_variable(shape):
    with tf.device('/gpu:0'):
        initial = tf.random_normal(shape, stddev=0.1)
        return tf.Variable(initial)


def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)


def conv(images, W):
    with tf.device('/gpu:0'):
        return tf.nn.conv2d(images, W, strides=[1, 1, 1, 1], padding='SAME')


def forward_propagation(images):
    with tf.variable_scope('conv1') as scope:
        conv1_feature = weight_variable([20, 20, 3, 20])
        conv1_bias = bias_variable([20])
        image_matrix = tf.reshape(images, [-1, 1686, 1686, 3])
        conv1_result = tf.nn.relu(conv(image_matrix, conv1_feature) + conv1_bias)
        _activation_summary(conv1_result)

    with tf.variable_scope('conv2') as scope:
        conv2_feature = weight_variable([10, 10, 20, 40])
        conv2_bias = bias_variable([40])
        conv2_result = tf.nn.relu(conv(conv1_result, conv2_feature) + conv2_bias)
        _activation_summary(conv2_result)
        conv2_pool = tf.nn.max_pool(conv2_result, ksize=[1, 281, 281, 1],
                                    strides=[1, 281, 281, 1], padding='SAME')

    with tf.variable_scope('conv3') as scope:
        conv3_feature = weight_variable([5, 5, 40, 80])
        conv3_bias = bias_variable([80])
        conv3_result = tf.nn.relu(conv(conv2_pool, conv3_feature) + conv3_bias)
        _activation_summary(conv3_result)
        conv3_pool = tf.nn.max_pool(conv3_result, ksize=[1, 2, 2, 1],
                                    strides=[1, 2, 2, 1], padding='SAME')

    with tf.variable_scope('local3') as scope:
        perceptron1_weight = weight_variable([3 * 3 * 80, 10])
        perceptron1_bias = bias_variable([10])
        flatten_dense_connect = tf.reshape(conv3_pool, [1, -1])
        compute_perceptron1_layer = tf.nn.relu(
            tf.matmul(flatten_dense_connect, perceptron1_weight) + perceptron1_bias)
        _activation_summary(compute_perceptron1_layer)

    with tf.variable_scope('softmax_connect') as scope:
        perceptron3_weight = weight_variable([10, 4])
        perceptron3_bias = bias_variable([4])
        y_conv = tf.nn.softmax(
            tf.matmul(compute_perceptron1_layer, perceptron3_weight) + perceptron3_bias)
        _activation_summary(y_conv)
        return y_conv


def error(forward_propagation_results, labels):
    with tf.device('/cpu:0'):
        labels = tf.cast(labels, tf.int64)
        cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
            forward_propagation_results, labels)
        cost = tf.reduce_mean(cross_entropy)
        tf.add_to_collection('losses', cost)
        tf.scalar_summary('LOSS', cost)
        return cost


def train(cost):
    with tf.device('/gpu:0'):
        train_loss = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)
        return train_loss

Answer:

The main problem is that Softmax is applied twice.

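Softmax is called once in the forward-propagation part of the code (tf.nn.softmax in forward_propagation), and its result is then passed to TensorFlow's cross-entropy function, tf.nn.sparse_softmax_cross_entropy_with_logits, which already applies a Softmax internally. That function expects unscaled logits, so feeding it probabilities leads to the abnormal output. Returning the raw tf.matmul(...) + perceptron3_bias value from forward_propagation, and applying tf.nn.softmax only where probabilities are actually needed, removes the duplication.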
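To illustrate why the second Softmax hurts training, here is a minimal sketch in plain Python with no TensorFlow dependency. The helper names and the example logits are hypothetical and chosen only for illustration. It compares the loss when raw logits are passed to a cross-entropy that applies Softmax internally against the loss when already-softmaxed probabilities are passed instead.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_from_logits(logits, label):
    """Stand-in for a loss that applies softmax internally, then takes -log p[label]."""
    return -math.log(softmax(logits)[label])


# Hypothetical logits for a 4-class problem where class 3 is strongly favoured.
logits = [-5.0, -5.0, -5.0, 10.0]
label = 3

# Intended usage: raw logits go into the loss.
loss_single = cross_entropy_from_logits(logits, label)

# The bug: probabilities (already softmaxed) are passed as if they were logits.
probs = softmax(logits)
loss_double = cross_entropy_from_logits(probs, label)

# Inputs confined to [0, 1] cap the second softmax's confidence at e / (e + 3)
# for 4 classes, so the loss cannot fall below log(1 + 3 / e).
loss_floor = math.log(1 + 3 / math.e)

print(loss_single, loss_double, loss_floor)
```

In this sketch the single-Softmax loss is close to zero for a confident, correct prediction, while the double-Softmax loss stays above roughly 0.74 however good the prediction is, so the optimizer keeps receiving a misleading signal.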
