I'm just getting started with TensorFlow and neural networks, and I'm trying to build a network that predicts the value of a property (it's a getting-started competition on Kaggle.com). I know a neural network may not be the best model for a regression problem, but I decided to give it a try.
With a single-layer network (no hidden layer, which effectively makes it a linear regression), the model's predictions actually come quite close to the real values. However, as soon as I add a hidden layer, all the predictions for a batch of 20 input tensors are identical:
('real', array([[ 181000.], [ 128900.], [ 161500.], [ 180500.], [ 181000.], [ 183900.], [ 122000.], [ 378500.], [ 381000.], [ 144000.], [ 260000.], [ 185750.], [ 137000.], [ 177000.], [ 139000.], [ 137000.], [ 162000.], [ 197900.], [ 237000.], [ 68400.]]))
('prediction ', array([[ 4995.10597687], [ 4995.10597687], [ 4995.10597687], [ 4995.10597687], [ 4995.10597687], [ 4995.10597687], [ 4995.10597687], [ 4995.10597687], [ 4995.10597687], [ 4995.10597687], [ 4995.10597687], [ 4995.10597687], [ 4995.10597687], [ 4995.10597687], [ 4995.10597687], [ 4995.10597687], [ 4995.10597687], [ 4995.10597687], [ 4995.10597687], [ 4995.10597687]]))
Update: I noticed that the predicted values only reflect the bias of the output layer; the weights of the hidden layer and the output layer never change and remain zero.
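Here is roughly how I checked this, inside the training loop (a snippet rather than standalone code; session, hidden1_w, and output_b/output_w are defined in the full listing below):

# Snippet from inside the training loop: fetch both weight matrices
# alongside the training op to see whether they ever move off zero.
w1_val, w2_val = session.run([hidden1_w, output_w])
print("hidden1_w min/max:", w1_val.min(), w1_val.max())
print("output_w  min/max:", w2_val.min(), w2_val.max())
# In my runs both stay at exactly 0.0 on every step; only output_b changes.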
To narrow down what was going wrong, I generated the model's graph (once with the hidden layer and once without) and compared the two to see whether anything was missing. Unfortunately, both graphs look correct to me, and I still don't understand why the model works without the hidden layer but fails with it.
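For completeness, this is roughly how I exported each graph for inspection in TensorBoard (a sketch; graph is the tf.Graph built in the full listing below, and the log directory is just an example path):

# Write the graph definition to disk so TensorBoard can render it;
# run once for the variant with the hidden layer and once without.
writer = tf.summary.FileWriter('/tmp/house_price_graph', graph=graph)
writer.close()
# Then inspect with: tensorboard --logdir=/tmp/house_price_graph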
My full code is below:
# coding: utf-8
import math
import tensorflow as tf
import numpy as np

# Read the whole CSV through TF's queue-based input pipeline and
# return it as two numpy arrays (features, labels).
def loadDataFromCSV(fileName, numberOfFields, numberOfOutputFields, numberOfRecords):
    XsArray = np.ndarray([numberOfRecords, (numberOfFields - numberOfOutputFields)], dtype=np.float64)
    YsArray = np.ndarray([numberOfRecords, numberOfOutputFields], dtype=np.float64)
    fileQueue = tf.train.string_input_producer(fileName)
    defaultValues = [[0]] * numberOfFields
    reader = tf.TextLineReader()
    key, singleLine = reader.read(fileQueue)
    decodedLine = tf.decode_csv(singleLine, record_defaults=defaultValues)
    inputFeatures = decodedLine[0:numberOfFields - numberOfOutputFields]
    outputFeatures = decodedLine[numberOfFields - numberOfOutputFields:numberOfFields]
    with tf.Session() as session:
        tf.global_variables_initializer().run()
        coor = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(coord=coor)
        for i in range(numberOfRecords):
            XsArray[i, :], YsArray[i, :] = session.run([inputFeatures, outputFeatures])
        coor.request_stop()
        coor.join(threads)
    return XsArray, YsArray

x, y = loadDataFromCSV(['/Users/mousaalsulaimi/Downloads/convertcsv.csv'], 289, 1, 1460)

num_steps = 10000
batch_size = 20

graph = tf.Graph()
with graph.as_default():
    with tf.name_scope('input'):
        inputProperties = tf.placeholder(tf.float32, shape=(batch_size, 287))
    with tf.name_scope('realPropertyValue'):
        outputValues = tf.placeholder(tf.float32, shape=(batch_size, 1))
    with tf.name_scope('weights'):
        hidden1_w = tf.Variable(tf.truncated_normal([287, 1000], stddev=math.sqrt(3/(287+1000)), dtype=tf.float32))
    with tf.name_scope('baises'):
        hidden1_b = tf.Variable(tf.zeros([1000], dtype=tf.float32))
    with tf.name_scope('hidden_layer'):
        hidden1 = tf.matmul(inputProperties, hidden1_w) + hidden1_b
        #hidden1_relu = tf.nn.relu(hidden1)
        #hidden1_dropout = tf.nn.dropout(hidden1_relu, .5)
    with tf.name_scope('layer2_weights'):
        output_w = tf.Variable(tf.truncated_normal([1000, 1], stddev=math.sqrt(3/(1000+1)), dtype=tf.float32))
    with tf.name_scope('layer2_baises'):
        output_b = tf.Variable(tf.zeros([1], dtype=tf.float32))
    with tf.name_scope('layer_2_predictions'):
        output = tf.matmul(hidden1, output_w) + output_b
    with tf.name_scope('predictions'):
        predictedValues = (output)
    # RMSE loss plus an L2 penalty on the hidden-layer weights.
    loss = tf.sqrt(tf.reduce_mean(tf.square(predictedValues - outputValues)))
    loss_l2 = tf.nn.l2_loss(hidden1_w)
    with tf.name_scope('minimization'):
        minimum = tf.train.AdamOptimizer(.5).minimize(loss + .004 * loss_l2)

with tf.Session(graph=graph) as session:
    tf.global_variables_initializer().run()
    print("Initialized")
    for step in range(num_steps):
        # Pick an offset within the training data, which has been randomized.
        # Note: we could use better randomization across epochs.
        offset = (step * batch_size) % (y.shape[0] - batch_size)
        # Generate a minibatch.
        batch_data = x[offset:(offset + batch_size), 1:]
        batch_labels = y[offset:(offset + batch_size), :]
        print("real", batch_labels)
        # Prepare a dictionary telling the session where to feed the minibatch.
        # The key of the dictionary is the placeholder node of the graph to be fed,
        # and the value is the numpy array to feed to it.
        feed_dict = {inputProperties: batch_data, outputValues: batch_labels}
        _, l, predictions, inp = session.run([minimum, loss, predictedValues, inputProperties], feed_dict=feed_dict)
        print("prediction ", predictions)
        print("loss : ", l)
        print("----------")
        print('+++++++++++')
I've also uploaded the data file convertcsv.csv here, in case you want to take a look at it.
I would really appreciate any suggestions that help me figure out what I'm doing wrong.
Thank you
Answer:
OK, so I finally figured out what the problem was. As expected, it was the weights of the neural network, and I also did some preprocessing to improve the predictions.
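Concretely, under Python 2, 3/(287+1000) and 3/(1000+1) are integer divisions that evaluate to 0, so both stddev arguments are 0 and every weight is initialized to exactly zero. With all-zero weights, the hidden activations are zero (so output_w receives no gradient) and the gradient into hidden1_w is scaled by output_w (also zero), which is why only output_b ever learned. Roughly, the two changes look like this (a sketch; the scaling constants are illustrative, not the only reasonable choice):

import math
import tensorflow as tf

# 1) Weight initialization: force float division so stddev is non-zero.
#    Under Python 2, 3/(287+1000) == 0, which zeroed out every weight.
hidden1_w = tf.Variable(
    tf.truncated_normal([287, 1000], stddev=math.sqrt(3.0 / (287 + 1000)), dtype=tf.float32))
output_w = tf.Variable(
    tf.truncated_normal([1000, 1], stddev=math.sqrt(3.0 / (1000 + 1)), dtype=tf.float32))

# 2) Preprocessing: standardize the input features and shrink the targets
#    so the optimizer is not asked to bridge several orders of magnitude.
#    (x and y are the arrays returned by loadDataFromCSV above; the divisor
#    for y is an illustrative choice.)
x_features = x[:, 1:]                     # same slice that is fed to the network
x_mean = x_features.mean(axis=0)
x_std = x_features.std(axis=0) + 1e-8     # guard against constant columns
x_scaled = (x_features - x_mean) / x_std
y_scaled = y / 100000.0                   # scale predictions back up afterwards

Any scaling that brings the features and the targets into comparable, small ranges serves the same purpose; standardization is just a common default.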