I reused TensorFlow's multivariate linear regression code and tried to reduce the cost, but the problem is that after a few iterations the cost, as well as the values of W and b, becomes inf and very soon after that nan. Can anyone tell me what the problem is? I have about 100,000 values; I trimmed the set down to 10,000 values for testing. The dataset is here.
Here is the code:
import numpy as np
import tensorflow as tf

def computeX():
    all_xs = np.loadtxt("test.csv", delimiter=',', skiprows=1, usecols=range(4, 260))  # reads the columns except the first ones
    timestamps = np.loadtxt("test.csv", delimiter=',', skiprows=1, usecols=(0), dtype=str)
    symbols = np.loadtxt("test.csv", delimiter=',', skiprows=1, usecols=(1), dtype=float)
    categories = np.loadtxt("test.csv", delimiter=',', skiprows=1, usecols=(2), dtype=str)
    tempList = []
    BOW = {"M1": 1.0, "M5": 2.0, "M15": 3.0, "M30": 4.0, "H1": 5.0, "H4": 6.0, "D1": 7.0}
    # explode dates and make them features.. 2016/11/1 01:54 becomes [2016, 11, 1, 01, 54]
    for i, v in enumerate(timestamps):
        splitted = v.split()
        dateVal = splitted[0]
        timeVal = splitted[1]
        ar = dateVal.split("/")
        splittedTime = timeVal.split(":")
        ar = ar + splittedTime
        Features = np.asarray(ar)
        Features = Features.astype(float)
        # append symbols
        Features = np.append(Features, symbols[i])
        # append categories from BOW
        Features = np.append(Features, BOW[categories[i]])
        row = np.append(Features, all_xs[i])
        row = row.tolist()
        tempList.append(row)
    all_xs = np.array(tempList)
    del tempList[:]
    return all_xs

if __name__ == "__main__":
    print("Starting....")
    learn_rate = 0.5
    all_ys = np.loadtxt("test.csv", delimiter=',', skiprows=1, usecols=3)  # reads only the target column (index 3)
    all_xs = computeX()
    datapoint_size = int(all_xs.shape[0])
    print(datapoint_size)
    x = tf.placeholder(tf.float32, [None, 263], name="x")
    W = tf.Variable(tf.ones([263, 1]), name="W")
    b = tf.Variable(tf.ones([1]), name="b")
    product = tf.matmul(x, W)
    y = product + b
    y_ = tf.placeholder(tf.float32, [datapoint_size])
    cost = tf.reduce_mean(tf.square(y_ - y)) / (2 * datapoint_size)
    train_step = tf.train.GradientDescentOptimizer(learn_rate).minimize(cost)
    sess = tf.Session()
    init = tf.global_variables_initializer()
    sess.run(init)
    batch_size = 10000
    steps = 10
    for i in range(steps):
        print("Entering Loop")
        if datapoint_size == batch_size:
            batch_start_idx = 0
        elif datapoint_size < batch_size:
            raise ValueError("datapoint_size: %d, must be greater than batch_size: %d" % (datapoint_size, batch_size))
        else:
            batch_start_idx = (i * batch_size) % (datapoint_size - batch_size)
        batch_end_idx = batch_start_idx + batch_size
        batch_xs = all_xs[batch_start_idx:batch_end_idx]
        batch_ys = all_ys[batch_start_idx:batch_end_idx]
        xs = np.array(batch_xs)
        ys = np.array(batch_ys)
        feed = {x: xs, y_: ys}
        sess.run(train_step, feed_dict=feed)
        print("W: %s" % sess.run(W))
        print("b: %f" % sess.run(b))
        print("cost: %f" % sess.run(cost, feed_dict=feed))
Answer:
Take a look at your data:
id8          id9          id10    id11    id12
1451865600   1451865600   -19.8   87.1    0.5701
1451865600   1451865600   -1.6    3.6     0.57192
1451865600   1451865600   -5.3    23.9    0.57155
You also initialize your weights to ones. When you multiply every input by 1 and sum them up, the "heavy" columns (id8, id9, and the other columns holding very large numbers) completely drown out the contribution of the smaller columns. On top of that, you have columns that are full of zeros:
id236   id237   id238   id239   id240
0       0       0       0       0
0       0       0       0       0
0       0       0       0       0
These factors do not play well together. The huge values produce extremely high predictions, and those make the loss explode and overflow. Even shrinking the learning rate by a factor of a billion barely helps.
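To get a feel for the scale, here is a rough back-of-the-envelope sketch (the row values are taken from the sample above, and the arithmetic is only illustrative, not part of the original code):

import numpy as np

# One illustrative feature row: two timestamp-sized values plus a few small ones,
# roughly like the sample rows shown above.
row = np.array([1451865600.0, 1451865600.0, -19.8, 87.1, 0.5701])
W = np.ones_like(row)      # weights initialized to ones, as in the question
b = 1.0

prediction = row @ W + b   # ~2.9e9, dominated entirely by the timestamp columns
target = 0.57              # the label is tiny by comparison
squared_error = (target - prediction) ** 2   # ~8.4e18

print(prediction, squared_error)
# A single gradient step involves products of numbers around 1e9 * 1e9 = 1e18,
# so W overflows to inf (and then nan) almost immediately.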
So the recommendations are:
- Inspect your data and drop everything that carries no information (the columns full of zeros).
- Normalize your input data (see the sketch after this list).
- Only at that point check the magnitude of the loss, and then tune the learning rate.
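A minimal preprocessing sketch, assuming all_xs is the feature matrix built by computeX(); the function name preprocess, the zero-variance column drop, and the choice of z-score standardization are my own illustration, not something from the original code:

import numpy as np

def preprocess(all_xs):
    # Drop constant columns (e.g. the all-zero ones): they carry no information
    # and would cause a division by zero during standardization.
    keep = all_xs.std(axis=0) > 0.0
    all_xs = all_xs[:, keep]

    # Z-score normalization: every remaining column gets mean 0 and std 1,
    # so the timestamp-sized columns no longer dominate the small ones.
    mean = all_xs.mean(axis=0)
    std = all_xs.std(axis=0)
    return (all_xs - mean) / std, keep, mean, std

all_xs, keep, mean, std = preprocess(all_xs)
# Reuse `keep`, `mean`, and `std` when preprocessing new data at prediction time,
# and adjust the placeholder / W shapes to the new number of columns.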