tensorflow: output layer always shows [1.]

I'm training a discriminative network so I can later use it in a generative network. I'm training on a dataset with two features and doing binary classification: 1 means meditating, 0 means not meditating. (The dataset comes from one of @[name redacted]'s videos.)

For some reason, the output layer (ol) outputs [1.] for every single test case.

我的数据集: https://drive.google.com/open?id=0B5DaSp-aTU-KSmZtVmFoc0hRa3c

import pandas as pd
import tensorflow as tf

data = pd.read_csv("E:/workspace_py/datasets/simdata/linear_data_train.csv")
data_f = data.drop("lbl", axis = 1)
data_l = data.drop(["f1", "f2"], axis = 1)

learning_rate = 0.01
batch_size = 1
n_epochs = 30
n_examples = 999 # This is highly unsatisfying >:3
n_iteration = int(n_examples/batch_size)

features = tf.placeholder('float', [None, 2], name='features_placeholder')
labels = tf.placeholder('float', [None, 1], name = 'labels_placeholder')

weights = {
    'ol': tf.Variable(tf.random_normal([2, 1], stddev= -12), name = 'w_ol')
}

biases = {
    'ol': tf.Variable(tf.random_normal([1], stddev=-12), name = 'b_ol')
}

ol = tf.nn.sigmoid(tf.add(tf.matmul(features, weights['ol']), biases['ol']), name = 'ol')

loss = -tf.reduce_sum(labels*tf.log(ol), name = 'loss') # cross entropy
train = tf.train.AdamOptimizer(learning_rate).minimize(loss)

sess = tf.Session()
sess.run(tf.global_variables_initializer())

for epoch in range(n_epochs):
    ptr = 0
    for iteration in range(n_iteration):
        epoch_x = data_f[ptr: ptr + batch_size]
        epoch_y = data_l[ptr: ptr + batch_size]
        ptr = ptr + batch_size
        _, err = sess.run([train, loss], feed_dict={features: epoch_x, labels: epoch_y})
    print("Loss @ epoch ", epoch, " = ", err)

print("Testing...\n")

data = pd.read_csv("E:/workspace_py/datasets/simdata/linear_data_eval.csv")
test_data_l = data.drop(["f1", "f2"], axis = 1)
test_data_f = data.drop("lbl", axis = 1)

#vvvHERE
print(sess.run(ol, feed_dict={features: test_data_f})) #<<<HERE
#^^^HERE

saver = tf.train.Saver()
saver.save(sess, save_path="E:/workspace_py/saved_models/meditation_disciminative_model.ckpt")
sess.close()

Output:

2017-10-11 00:49:47.453721: W C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-10-11 00:49:47.454212: W C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-10-11 00:49:49.608862: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:955] Found device 0 with properties:
name: GeForce GTX 960M
major: 5 minor: 0 memoryClockRate (GHz) 1.176
pciBusID 0000:01:00.0
Total memory: 4.00GiB
Free memory: 3.35GiB
2017-10-11 00:49:49.609281: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:976] DMA: 0
2017-10-11 00:49:49.609464: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:986] 0:   Y
2017-10-11 00:49:49.609659: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 960M, pci bus id: 0000:01:00.0)
Loss @ epoch  0  =  0.000135789
Loss @ epoch  1  =  4.16049e-05
Loss @ epoch  2  =  1.84776e-05
Loss @ epoch  3  =  9.41758e-06
Loss @ epoch  4  =  5.24522e-06
Loss @ epoch  5  =  2.98024e-06
Loss @ epoch  6  =  1.66893e-06
Loss @ epoch  7  =  1.07288e-06
Loss @ epoch  8  =  5.96047e-07
Loss @ epoch  9  =  3.57628e-07
Loss @ epoch  10  =  2.38419e-07
Loss @ epoch  11  =  1.19209e-07
Loss @ epoch  12  =  1.19209e-07
Loss @ epoch  13  =  1.19209e-07
Loss @ epoch  14  =  -0.0
Loss @ epoch  15  =  -0.0
Loss @ epoch  16  =  -0.0
Loss @ epoch  17  =  -0.0
Loss @ epoch  18  =  -0.0
Loss @ epoch  19  =  -0.0
Loss @ epoch  20  =  -0.0
Loss @ epoch  21  =  -0.0
Loss @ epoch  22  =  -0.0
Loss @ epoch  23  =  -0.0
Loss @ epoch  24  =  -0.0
Loss @ epoch  25  =  -0.0
Loss @ epoch  26  =  -0.0
Loss @ epoch  27  =  -0.0
Loss @ epoch  28  =  -0.0
Loss @ epoch  29  =  -0.0
Testing...

[[ 1.]
 [ 1.]
 [ 1.]
 [ 1.]
 ...
 [ 1.]
 [ 1.]]
Saving model...
[Finished in 57.9s]

Answer:

The main problem

First of all, this is not a valid cross-entropy loss. The equation you are using works only when you have two or more outputs. With a single sigmoid output you have to use

-tf.reduce_sum(labels*tf.log(ol) + (1-labels)*tf.log(1-ol), name = 'loss')

Otherwise the optimal solution is to always answer "1" (which is exactly what is happening now).
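To see the difference numerically, here is a small NumPy sketch (standing in for the TensorFlow ops, with made-up prediction values) comparing the one-sided loss from the question with the corrected binary cross entropy:

```python
import numpy as np

p = np.array([0.999, 0.999])   # a model that always predicts "1"
y = np.array([1.0, 0.0])       # true labels: one positive, one negative

# One-sided loss from the question: -sum(y * log(p))
naive = -np.sum(y * np.log(p))

# Corrected binary cross entropy: -sum(y*log(p) + (1-y)*log(1-p))
bce = -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

print(naive)  # ~0.001: predicting "1" for the negative example costs nothing
print(bce)    # ~6.9: the corrected loss heavily penalizes that mistake
```

With the one-sided loss, pushing every prediction toward 1 drives the loss to 0, which is exactly the degenerate solution in the training log above.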

Why?

Notice that your labels are only 0 or 1, and your whole loss is the product of the label and the log of the prediction. Consequently, when the true label is 0 the loss is 0 no matter what you predict, since 0 * log(x) = 0 for any x (as long as log(x) is defined). So your model is only penalized for failing to predict "1" when it should, and it therefore learns to output 1 all the time.

A few other oddities

  1. You are passing a negative stddev to the normal distribution, which you shouldn't do (unless this is some undocumented feature of random_normal, but according to the docs it should accept a single positive float, and you should supply a small number there).

  2. Computing cross entropy like this (in a naive way) is not numerically stable; have a look at tf.sigmoid_cross_entropy_with_logits.

  3. You are not permuting your dataset, so you always process the data in the same order, which can have bad consequences (periodic increases in loss, harder convergence, or no convergence at all).
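To illustrate points 2 and 3, here is a NumPy sketch with made-up logits. `stable_bce_with_logits` is a hypothetical helper mirroring the kind of algebraic rearrangement a stable implementation uses; it is not TensorFlow's actual code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def stable_bce_with_logits(x, z):
    # Rearranged form: max(x, 0) - x*z + log(1 + exp(-|x|)).
    # It never takes the log of a quantity that can round to exactly 0.
    return np.maximum(x, 0) - x * z + np.log1p(np.exp(-np.abs(x)))

x = np.array([0.5, -40.0, 40.0])   # logits, including extreme values
z = np.array([1.0, 1.0, 0.0])      # binary labels

with np.errstate(divide='ignore'):
    naive = -(z * np.log(sigmoid(x)) + (1 - z) * np.log(1 - sigmoid(x)))
stable = stable_bce_with_logits(x, z)

print(naive)   # last entry is inf: 1 - sigmoid(40) rounds to 0, log(0) = -inf
print(stable)  # every entry is finite

# Point 3: visit the training examples in a fresh random order each epoch
rng = np.random.default_rng(0)
order = rng.permutation(len(x))   # use these indices to slice each batch
```

The naive formula blows up as soon as a logit is confident enough that the sigmoid saturates in floating point, which is why the logits-based API is preferred.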
