I am trying to implement a logical AND operation in TensorFlow, with two inputs and two weights. I want to multiply them to get a single number, then add the bias to that number. My problem is with matmul: I pass X (the input) and W (the weights) to the method, where X has the shape [[1], [1]] (a column) and W has the shape [0.49900547, 0.49900547] (a row), and I expect a single number as the result, but it returns two numbers instead. How do I do this multiplication correctly? Here is my code:
import tensorflow as tf
import numpy
rng = numpy.random

# Parameters
learning_rate = 0.01
training_epochs = 2000
display_step = 50

# Training Data
train_X = numpy.asarray([[[1.0],[1.0]],[[1.0],[0.0]],[[0.0],[1.0]],[[0.0],[0.0]]])
train_Y = numpy.asarray([1.0,0.0,0.0,0.0])
n_samples = train_X.shape[0]

# tf Graph Input
X = tf.placeholder("float", [2,1], name="inputarr")
Y = tf.placeholder("float", name="outputarr")

# Create Model
# Set model weights
W = tf.Variable(tf.zeros([1,2]), name="weight")
b = tf.Variable(rng.randn(), name="bias")

# Construct a linear model
activation = tf.add(tf.matmul(X, W), b)
mulres = tf.matmul(X, W)

# Minimize the squared errors
cost = tf.reduce_sum(tf.pow(activation-Y, 2))/(2*n_samples)  # L2 loss
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)  # Gradient descent

# Initializing the variables
init = tf.initialize_all_variables()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)

    # Fit all training data
    for epoch in range(training_epochs):
        for (x, y) in zip(train_X, train_Y):
            sess.run(optimizer, feed_dict={X: x, Y: y})

        # Display logs per epoch step
        if epoch % display_step == 0:
            print "Epoch:", '%04d' % (epoch+1), \
                "W=", sess.run(W), "b=", sess.run(b), "x= ", x, " y =", y, \
                " result :", sess.run(mulres, feed_dict={X: x})

    print "Optimization Finished!"
    print "W=", sess.run(W), "b=", sess.run(b), '\n'

    # Testing example, as requested (Issue #2)
    test_X = numpy.asarray([[1.0,0.0]])
    test_Y = numpy.asarray([0])

    for x, y in zip(train_X, train_Y):
        print "x: ", x, "y: ", y
        print "Testing... (L2 loss Comparison)", "result :", sess.run(mulres, feed_dict={X: x})
        print sess.run(tf.matmul(X, W), feed_dict={X: x})
        print "result :"
        predict = sess.run(activation, feed_dict={X: x})
        print predict
Answer:
As in standard matrix multiplication, if A has shape [m, k] and B has shape [k, n], then tf.matmul(A, B) has shape [m, n] (TensorFlow uses the convention of m rows and n columns).
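This shape rule can be checked quickly with NumPy, whose matmul follows the same convention (the shapes below are arbitrary examples, not taken from the question):

```python
import numpy as np

# A has shape [m, k] = [3, 4]; B has shape [k, n] = [4, 2].
A = np.ones((3, 4))
B = np.ones((4, 2))

# The product collapses the shared inner dimension k,
# leaving shape [m, n] = [3, 2].
C = np.matmul(A, B)
print(C.shape)  # (3, 2)
```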
In your program, you compute tf.matmul(X, W). X is defined as a placeholder with shape [2, 1]; W is defined as a variable initialized to a [1, 2] matrix of zeros. Therefore mulres = tf.matmul(X, W) has shape [2, 2], which matches what was printed (result: ...) when I ran your code locally.
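To see exactly why these shapes produce more than one number, here is the same multiplication reproduced in NumPy (the weight values mirror those quoted in the question):

```python
import numpy as np

# X is the column input from the question, shape [2, 1].
X = np.array([[1.0],
              [1.0]])

# W is the row of weights, shape [1, 2].
W = np.array([[0.49900547, 0.49900547]])

# [2, 1] x [1, 2] -> [2, 2]: an outer product, not a single number.
result = np.matmul(X, W)
print(result.shape)  # (2, 2)
```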
If you want to define a hidden layer with a single output, the fix is simple:

W = tf.Variable(tf.zeros([1,2]), name="weight")

...should be replaced with:

W = tf.Variable(tf.zeros([2, 1]), name="weight")

(Note that with X keeping its [2, 1] shape, you will also need to transpose one operand so the inner dimensions match, e.g. tf.matmul(X, W, transpose_a=True), which yields a [1, 1] result.)
(In practice, initializing the weights with tf.zeros will prevent them from training, because in backpropagation every input element receives the same gradient. Instead, you should initialize them randomly, for example with:

W = tf.Variable(tf.truncated_normal([2, 1], stddev=0.5), name="weight")

This lets the network learn a different value for each component of the weights.)
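A quick NumPy sketch of the corrected setup (random initialization here stands in for tf.truncated_normal; transposing X so the inner dimensions match is my assumption, since X keeps its [2, 1] shape):

```python
import numpy as np

rng = np.random.default_rng(0)

# W now has shape [2, 1] and is randomly initialized,
# analogous to tf.truncated_normal([2, 1], stddev=0.5).
W = rng.normal(scale=0.5, size=(2, 1))

# X is still the [2, 1] column input.
X = np.array([[1.0],
              [1.0]])

# X^T has shape [1, 2], so X^T x W has shape [1, 1]: a single
# number, as in tf.matmul(X, W, transpose_a=True).
result = np.matmul(X.T, W)
print(result.shape)  # (1, 1)
```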