I'm trying to implement an MLP from scratch in TensorFlow and test it on the MNIST dataset. Here is my code:
import tensorflow.compat.v1 as tf
from tensorflow.compat.v1.keras.losses import categorical_crossentropy

tf.disable_v2_behavior()

image_tensor = tf.placeholder(tf.float32 , shape=(None , 784))
label_tensor = tf.placeholder(tf.float32 , shape=(None , 10))

# Model architecture
# --> Layer 1
w1 = tf.Variable(tf.random_uniform([784 , 128])) # weights
b1 = tf.Variable(tf.zeros([128])) # bias
a1 = tf.matmul(image_tensor , w1) + b1
h1 = tf.nn.relu(a1)

# --> Layer 2
w2 = tf.Variable(tf.random_uniform([128 , 128]))
b2 = tf.zeros([128])
a2 = tf.matmul(h1 , w2) + b2
h2 = tf.nn.relu(a2)

# --> output layer
w3 = tf.Variable(tf.random_uniform([128 , 10]))
b3 = tf.zeros([10])
a3 = tf.matmul(h2 , w3) + b3
predicted_tensor = tf.nn.softmax(a3)

loss = tf.reduce_mean(categorical_crossentropy(label_tensor , predicted_tensor))
opt = tf.train.GradientDescentOptimizer(0.01)
training_step = opt.minimize(loss)

with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)

    epochs = 50
    batch = 100
    iterations = len(training_images) // batch

    for j in range(epochs):
        start = 0
        end = batch

        for i in range(iterations):
            image_batch = np.array(training_images[start : end])
            label_batch = np.array(training_labels[start : end])
            start = batch + 1
            end = start + batch
            _ , loss = sess.run(training_step , feed_dict = {
                image_tensor : image_batch,
                label_tensor : label_batch
            })
But when I run this code, I get the following error:
  File "MNIST3.py", line 97, in <module>
    main()
  File "MNIST3.py", line 88, in main
    label_tensor : label_batch
TypeError: 'NoneType' object is not iterable
Yet when I print the first 10 samples of the labels:
print(training_labels[0 : 10])
the output is:
[[1 0 0 0 0 0 0 0 0 0] [1 0 0 0 0 0 0 0 0 0] [1 0 0 0 0 0 0 0 0 0] [1 0 0 0 0 0 0 0 0 0] [1 0 0 0 0 0 0 0 0 0] [1 0 0 0 0 0 0 0 0 0] [1 0 0 0 0 0 0 0 0 0] [1 0 0 0 0 0 0 0 0 0] [1 0 0 0 0 0 0 0 0 0] [1 0 0 0 0 0 0 0 0 0]]
And when I print the shapes of the dataset:
print(training_images.shape)
print(training_labels.shape)
the output is:
(10000, 784)
(10000, 10)
What am I missing here?
Answer:
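You have misread the error message (Python can be misleading here, and we have all been bitten by errors like this more than once). Although the traceback shows the line label_tensor : label_batch, it actually refers to the entire sess.run() call.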
You see this error because you expect the call to return a tuple, but you asked TensorFlow to evaluate only a single operation.
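sess.run(training_step, feed_dict=...) returns None, because the operation training_step is not meant to return anything; calling it just performs one step of the optimization process. Unpacking that None into _ , loss is what raises the TypeError.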
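The same failure can be reproduced without TensorFlow. The sketch below is only an illustration: run_training_step is a hypothetical stand-in for a call that returns None, not part of the original code. The exact wording of the message depends on the Python version.

```python
def run_training_step():
    """Hypothetical stand-in for sess.run(training_step, ...): returns None."""
    return None


try:
    # Tuple unpacking needs an iterable on the right-hand side; None is not one.
    _, loss_value = run_training_step()
except TypeError as err:
    message = str(err)
    print(message)
```

The exception comes from the assignment itself, not from anything in the feed dictionary.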
To get the result you want, change the code to:
_ , loss_result = sess.run([training_step, loss], feed_dict={
    image_tensor : image_batch,
    label_tensor : label_batch
})
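This way TensorFlow evaluates both operations: the first returns None (as you already saw), and the second computes the value of the loss function for the given batch.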
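(Note that you must rename the loss variable on the left-hand side; otherwise you overwrite the loss operation, and the next call may raise an exception or, worse, silently give wrong results.)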
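To check this behavior in isolation, here is a minimal sketch on a toy graph. It assumes TensorFlow is installed with the compat.v1 API available; the variable w, the quadratic loss, and the names train_op, single_result and loss_value are made up for illustration and are not the MNIST model above.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()

# Toy graph (assumption, not the MNIST model): minimize (w - 3)^2.
w = tf.Variable(0.0)
loss = tf.square(w - 3.0)
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # Fetching only the training operation yields None.
    single_result = sess.run(train_op)

    # Fetching a list yields one result per fetch, in the same order.
    # The value goes into loss_value, so the name loss still refers to the tensor.
    _, loss_value = sess.run([train_op, loss])

print(single_result)
print(type(loss_value))
```

Because the fetched value is stored under a different name, loss can be passed to sess.run again on the next iteration.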