I'm using a multilayer perceptron for binary classification, with numpy and tensorflow.
The input matrix has shape (9578, 18) and the labels have shape (9578, 1).
Here is my code:
# Preprocessing
import numpy as np
import tensorflow as tf
from math import floor

input = np.loadtxt("input.csv", delimiter=",", ndmin=2).astype(np.float32)
labels = np.loadtxt("label.csv", delimiter=",", ndmin=2).astype(np.float32)

train_size = 0.9
train_cnt = floor(input.shape[0] * train_size)
x_train = input[0:train_cnt]
y_train = labels[0:train_cnt]
x_test = input[train_cnt:]
y_test = labels[train_cnt:]
# Define parameters
learning_rate = 0.01
training_epochs = 100
batch_size = 50
n_classes = labels.shape[1]
n_samples = 9578
n_inputs = input.shape[1]
n_hidden_1 = 20
n_hidden_2 = 20
def multilayer_network(X, weights, biases, keep_prob):
    '''X: placeholder for the data input
    weights: dictionary of weights
    biases: dictionary of bias values'''
    # First hidden layer with a sigmoid activation
    # sigmoid(X*W + b)
    layer_1 = tf.add(tf.matmul(X, weights['h1']), biases['h1'])
    layer_1 = tf.nn.sigmoid(layer_1)
    layer_1 = tf.nn.dropout(layer_1, keep_prob)

    # Second hidden layer
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['h2'])
    layer_2 = tf.nn.sigmoid(layer_2)
    layer_2 = tf.nn.dropout(layer_2, keep_prob)

    # Output layer (returns raw logits)
    out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
    return out_layer
# Define dictionaries of weights and biases
weights = {
    'h1': tf.Variable(tf.random_normal([n_inputs, n_hidden_1])),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_hidden_2, n_classes]))
}
biases = {
    'h1': tf.Variable(tf.random_normal([n_hidden_1])),
    'h2': tf.Variable(tf.random_normal([n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}
keep_prob = tf.placeholder("float")
X = tf.placeholder(tf.float32,[None,n_inputs])
Y = tf.placeholder(tf.float32,[None,n_classes])
predictions = multilayer_network(X,weights,biases,keep_prob)
# Cost (loss) and optimizer functions
cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=predictions,labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
# Run the session
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)

    # Training loop
    for epoch in range(training_epochs):
        avg_cost = 0.0
        total_batch = int(len(x_train) / batch_size)
        x_batches = np.array_split(x_train, total_batch)
        y_batches = np.array_split(y_train, total_batch)
        for i in range(total_batch):
            batch_x, batch_y = x_batches[i], y_batches[i]
            _, c = sess.run([optimizer, cost],
                            feed_dict={
                                X: batch_x,
                                Y: batch_y,
                                keep_prob: 0.8
                            })
            avg_cost += c / total_batch
        print("Epoch:", '%04d' % (epoch + 1), "cost=",
              "{:.9f}".format(avg_cost))
    print("Model has completed {} epochs of training".format(training_epochs))

    correct_prediction = tf.equal(tf.argmax(predictions, 1), tf.argmax(Y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    print("Accuracy:", accuracy.eval({X: x_test, Y: y_test, keep_prob: 1.0}))
After running my model for 100 epochs, the cost goes down each epoch, which suggests the network is working fine, but the accuracy comes out as 1.0 every time and I have no idea why. I'm a beginner with neural networks and how they work, so any help would be greatly appreciated. Thanks!
Edit: I tried checking the prediction matrix after each epoch and got all zeros every time. I used the following code inside the epoch loop to check the prediction matrix:
for epoch in range(training_epochs):
    avg_cost = 0.0
    total_batch = int(len(x_train) / batch_size)
    x_batches = np.array_split(x_train, total_batch)
    y_batches = np.array_split(y_train, total_batch)
    for i in range(total_batch):
        batch_x, batch_y = x_batches[i], y_batches[i]
        _, c, p = sess.run([optimizer, cost, predictions],
                           feed_dict={
                               X: batch_x,
                               Y: batch_y,
                               keep_prob: 0.8
                           })
        avg_cost += c / total_batch
    print("Epoch:", '%04d' % (epoch + 1), "cost=",
          "{:.9f}".format(avg_cost))
    y_pred = sess.run(tf.argmax(predictions, 1), feed_dict={X: x_test, keep_prob: 1.0})
    y_true = sess.run(tf.argmax(y_test, 1))
    acc = sess.run(accuracy, feed_dict={X: x_test, Y: y_test, keep_prob: 1.0})
    print('Accuracy:', acc)
    print('---------------')
    print(y_pred, y_true)
print("Model has completed {} epochs of training".format(training_epochs))
Here is the output for the first epoch:
Epoch: 0001 cost= 0.543714217
Accuracy: 1.0
---------------
[0 0 0 ... 0 0 0] [0 0 0 ... 0 0 0]
(both printed arrays, y_pred and y_true, consist entirely of zeros)
Answer:
You are not calling sess.run on predictions. That means it currently represents the tensorflow graph, rather than the predicted values. Replace your _, c = sess.run([optimizer, cost], ...) with _, c, p = sess.run([optimizer, cost, predictions], ...), then do the correct_prediction calculation on the p value you get back. Likewise, the ground truth is batch_y, since your Y variable is also a tensorflow graph object. You will then be working with numpy variables, so the argmax calls should use np rather than tf. I think that should fix the problem.
If you want to do it in tensorflow instead, move the correct-prediction and accuracy calculations up to where you compute the cost, and change your sess.run line to: _, c, a = sess.run([optimizer, cost, accuracy], ...)
As for an explanation of why you get 100% accuracy: you have the line correct_prediction = tf.equal(tf.argmax(predictions, 1), tf.argmax(Y, 1)), where both predictions and Y are tensorflow graph variables. You can think of them as wrappers that values flow through when you call sess.run(). So when you print the accuracy, you are comparing a tensorflow graph operation to a tensorflow graph operation, and I suspect the backend treats those as always equal.
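To illustrate the graph-versus-values distinction, here is a minimal sketch (assuming the TF 1.x placeholders and session from the question; the printed tensor name is illustrative):

# `predictions` is a symbolic graph node until it is evaluated:
print(predictions)  # e.g. Tensor("add_1:0", shape=(?, 1), dtype=float32)

# Only sess.run feeds data through the graph and returns numpy values:
p = sess.run(predictions, feed_dict={X: x_test, keep_prob: 1.0})
print(type(p), p.shape)  # <class 'numpy.ndarray'> (958, 1)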
Edit: below is sample code for each of the two approaches mentioned. Since I can't easily test it (I don't have your data), I'm not sure it works as-is, but it should look something like this.
First approach:
_, c, p = sess.run([optimizer, cost, predictions], ...)
...
correct_prediction = np.equal(np.argmax(p, axis=1), np.argmax(batch_y, axis=1))
accuracy = np.mean(correct_prediction)
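To make the placement concrete, here is a sketch of how this first approach could slot into the batch loop from the question (same variable names as above; untested, as noted):

for i in range(total_batch):
    batch_x, batch_y = x_batches[i], y_batches[i]
    _, c, p = sess.run([optimizer, cost, predictions],
                       feed_dict={X: batch_x, Y: batch_y, keep_prob: 0.8})
    # p and batch_y are plain numpy arrays at this point, so use np, not tf
    correct_prediction = np.equal(np.argmax(p, axis=1), np.argmax(batch_y, axis=1))
    batch_accuracy = np.mean(correct_prediction)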
Second approach:
cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=predictions, labels=Y))
correct_prediction = tf.equal(tf.argmax(predictions, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
...
for i in range(total_batch):
    batch_x, batch_y = x_batches[i], y_batches[i]
    _, c, a = sess.run([optimizer, cost, accuracy],
                       feed_dict={
                           X: batch_x,
                           Y: batch_y,
                           keep_prob: 0.8
                       })
    print(a)
Edit 2: While the information above is still true, there is another issue as well. Using cross-entropy and accuracy this way makes no sense when you are only predicting a single class. If you call argmax on something of length 1, you will always get 0, because that's the only position that exists! Accuracy and cross-entropy only make sense in the context of class-based predictions, where your ground-truth values are one-hot vectors over the list of classes.
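To see the argmax point concretely, here is a small numpy demonstration, followed by a sketch of a threshold-based accuracy that does make sense for a single sigmoid output (it reuses predictions and Y from the question; the 0.5 cutoff and the probs/predicted_class names are illustrative, not from the original code):

import numpy as np
import tensorflow as tf

# argmax along axis 1 of an (N, 1) array is always 0: each row has only one position.
logits = np.array([[-2.3], [0.7], [-0.1]])
labels = np.array([[0.0], [1.0], [1.0]])
print(np.argmax(logits, axis=1))  # [0 0 0]
print(np.argmax(labels, axis=1))  # [0 0 0] -> tf.equal is always true, so accuracy is always 1.0

# For a single-logit binary classifier, threshold the sigmoid probability instead:
probs = tf.nn.sigmoid(predictions)                  # shape (?, 1), values in (0, 1)
predicted_class = tf.cast(probs > 0.5, tf.float32)  # 0.0 or 1.0 per sample
correct_prediction = tf.equal(predicted_class, Y)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))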