I am trying to approximate noisy data from the sin(2x) function with a multilayer perceptron:
```python
import tensorflow as tf
import matplotlib.pyplot as plt

# Get data (gen_datasets() and add_noise() are helpers defined elsewhere in my script)
datasets = gen_datasets()
# Add noise
datasets["ysin_train"] = add_noise(datasets["ysin_train"])
datasets["ysin_test"] = add_noise(datasets["ysin_test"])
# Extract wanted data
patterns_train = datasets["x_train"]
targets_train = datasets["ysin_train"]
patterns_test = datasets["x_test"]
targets_test = datasets["ysin_test"]
# Reshape to fit model
patterns_train = patterns_train.reshape(62, 1)
targets_train = targets_train.reshape(62, 1)
patterns_test = patterns_test.reshape(62, 1)
targets_test = targets_test.reshape(62, 1)

# Parameters
learning_rate = 0.001
training_epochs = 10000
batch_size = patterns_train.shape[0]
display_step = 1

# Network parameters
n_hidden_1 = 2
n_hidden_2 = 2
n_input = 1
n_classes = 1

# tf Graph input
X = tf.placeholder("float", [None, n_input])
Y = tf.placeholder("float", [None, n_classes])

# Store layers' weights & biases
weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_hidden_2, n_classes]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}

# Create model
def multilayer_perceptron(x):
    # Hidden fully connected layer with 2 neurons
    layer_1 = tf.sigmoid(tf.add(tf.matmul(x, weights['h1']), biases['b1']))
    # Hidden fully connected layer with 2 neurons
    layer_2 = tf.sigmoid(tf.add(tf.matmul(layer_1, weights['h2']), biases['b2']))
    # Output fully connected layer
    out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
    return out_layer

# Construct model
logits = multilayer_perceptron(X)

# Define loss and optimizer
loss_op = tf.reduce_mean(tf.losses.absolute_difference(
    labels=Y, predictions=logits, reduction=tf.losses.Reduction.NONE))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
train_op = optimizer.minimize(loss_op)

# Initializing the variables
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    # Training cycle
    for epoch in range(training_epochs):
        _ = sess.run(train_op, feed_dict={X: patterns_train, Y: targets_train})
        c = sess.run(loss_op, feed_dict={X: patterns_test, Y: targets_test})
        if epoch % display_step == 0:
            print("Epoch: {0: 4} cost={1:9}".format(epoch + 1, c))
    print("Optimization finished!")
    outputs = sess.run(logits, feed_dict={X: patterns_test})
    print("outputs: {0}".format(outputs.T))
    plt.plot(patterns_test, outputs, "r.", label="outputs")
    plt.plot(patterns_test, targets_test, "b.", label="targets")
    plt.legend()
    plt.show()
```
When I plot the results at the end, I get a straight line, just as if I had a linear network. See the plot:

[plot: the test outputs fall on a straight line instead of following the noisy sin(2x) targets]
For a linear network, that would be the correct error-minimizing result. But I should not have a linear network, because I use the sigmoid function in the multilayer_perceptron() function! Why does my network behave like this?
Answer:
The default stddev=1.0 that tf.random_normal uses for your weight and bias initialization is very large. Try setting stddev=0.01 explicitly for the weights; as for the biases, the common practice is to initialize them to zero.
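A minimal sketch of that change, reusing the dictionary layout from the question (stddev=0.01 is the starting value suggested above, not a tuned constant):

```python
# Weights: small random values; biases: zeros
weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1], stddev=0.01)),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2], stddev=0.01)),
    'out': tf.Variable(tf.random_normal([n_hidden_2, n_classes], stddev=0.01))
}
biases = {
    'b1': tf.Variable(tf.zeros([n_hidden_1])),
    'b2': tf.Variable(tf.zeros([n_hidden_2])),
    'out': tf.Variable(tf.zeros([n_classes]))
}
```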
As a first approach, I would also try raising the learning_rate to 0.01 (or possibly not; see the answer to a related question here).
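If you do try the larger step size, it is a one-line change to the optimizer (again, 0.01 is only a suggested starting point):

```python
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
train_op = optimizer.minimize(loss_op)
```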