import tensorflow as tf
import matplotlib.pyplot as plt
import pandas as pd
import pylab as pl
import numpy as np
import tensorflow as tf
import matplotlib.patches as mpatches
import matplotlib.pyplot as plt

plt.rcParams['figure.figsize'] = (20, 6)

df1 = pd.read_csv("TrainData.csv")
df2 = pd.read_csv("TestData.csv")

train_data_X = np.asanyarray(df1['ENGINE SIZE'])
train_data_Y = np.asanyarray(df1['CO2 EMISSIONS'])
test_data_X = np.asanyarray(df2['ENGINE SIZE'])
test_data_Y = np.asanyarray(df2['CO2 EMISSIONS'])

W = tf.Variable(20.0, name= 'Weight')
b = tf.Variable(30.0, name= 'Bias')
X = tf.placeholder(tf.float32, name= 'Input')
Y = tf.placeholder(tf.float32, name= 'Output')
Y = W*X + b

loss = tf.reduce_mean(tf.square(Y - train_data_Y))
optimizer = tf.train.GradientDescentOptimizer(0.05)
train = optimizer.minimize(loss)

init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

loss_values = []
train_data = []
for step in range(100):
    _, loss_val, a_val, b_val = sess.run([train, loss, W, b], feed_dict={X:train_data_X, Y:train_data_Y})
    loss_values.append(loss_val)
    if step % 5 == 0:
        print(step, loss_val, a_val, b_val)
        train_data.append([a_val, b_val])

plt.plot(loss_values, 'ro')
plt.show()
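I am trying to build a linear regression model that predicts CO2 emissions from engine size. I used the code above in TensorFlow. 1) When I run this code, the weight and bias stay constant. What is wrong with the code? 2) If I want to feed in both engine size and mileage, what code changes do I need to make?
Thanks in advance.
Answer: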
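There are a few errors in the code, listed below:
- You reassigned the placeholder Y with Y = W*X + b, and that same Y is later used to feed data (feed_dict={X:train_data_X, Y:train_data_Y}). In TensorFlow 1.x, feeding Y overrides the model output with the target values, so the loss is zero, its gradient is zero, and W and b never change. Use a separate variable for the prediction (not the placeholder used for feeding data) and compute the loss from it. This change is made in the code below; see prediction = W*X + b.
- You pass the complete data set to feed_dict in a single call (feed_dict={X:train_data_X, Y:train_data_Y}). The code below instead passes one data value at a time (feed_dict={X:x, Y:y}).
The corrected code below should work.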
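The first point can be illustrated without TensorFlow. The sketch below is a plain-Python stand-in, not the TensorFlow graph itself; mse_gradients and the sample values are hypothetical and chosen only for illustration. It computes the gradient of the mean squared error for W*X + b twice: once with the real prediction, and once with the prediction replaced by the targets, which is what feeding Y does in the original code.

```python
def mse_gradients(predictions, xs, targets):
    """Gradients of mean((prediction - target)^2) w.r.t. W and b,
    where prediction = W * x + b."""
    n = len(xs)
    residuals = [p - t for p, t in zip(predictions, targets)]
    grad_w = sum(2.0 * r * x for r, x in zip(residuals, xs)) / n
    grad_b = sum(2.0 * r for r in residuals) / n
    return grad_w, grad_b


# Hypothetical data, not taken from TrainData.csv.
xs = [1.0, 2.0, 3.0]
targets = [120.0, 180.0, 240.0]
W, b = 20.0, 30.0

# Corrected graph: the prediction is computed from W, X and b.
predictions = [W * x + b for x in xs]
good_grads = mse_gradients(predictions, xs, targets)

# Original graph: feeding Y replaces the prediction with the targets,
# so every residual is zero and so is every gradient.
overridden_grads = mse_gradients(targets, xs, targets)

print(good_grads)        # non-zero: W and b would be updated
print(overridden_grads)  # all zeros: W and b stay where they started
```

With zero gradients, GradientDescentOptimizer has nothing to subtract from W and b, which matches the constant values you observed.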
import tensorflow as tf
import matplotlib.pyplot as plt
import pandas as pd
import pylab as pl
import numpy as np
import matplotlib.patches as mpatches

plt.rcParams['figure.figsize'] = (20, 6)

# Load the training and test sets.
df1 = pd.read_csv("TrainData.csv")
df2 = pd.read_csv("TestData.csv")

train_data_X = np.asanyarray(df1['ENGINE SIZE'])
train_data_Y = np.asanyarray(df1['CO2 EMISSIONS'])
test_data_X = np.asanyarray(df2['ENGINE SIZE'])
test_data_Y = np.asanyarray(df2['CO2 EMISSIONS'])

# Trainable parameters and input placeholders (TensorFlow 1.x API).
W = tf.Variable(20.0, name='Weight')
b = tf.Variable(30.0, name='Bias')
X = tf.placeholder(tf.float32, name='Input')
Y = tf.placeholder(tf.float32, name='Output')

# Keep the prediction separate from the placeholder Y.
prediction = W*X + b
loss = tf.reduce_mean(tf.square(prediction - Y))
optimizer = tf.train.GradientDescentOptimizer(0.05)
train = optimizer.minimize(loss)

loss_values = []
train_data = []
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    for step in range(100):
        # One gradient step per (x, y) pair.
        for (x, y) in zip(train_data_X, train_data_Y):
            _, loss_val, a_val, b_val = sess.run([train, loss, W, b], feed_dict={X: x, Y: y})
        # Record the loss of the last sample seen in this pass.
        loss_values.append(loss_val)
        if step % 5 == 0:
            print(step, loss_val, a_val, b_val)
            train_data.append([a_val, b_val])

plt.plot(loss_values, 'ro')
plt.show()
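Note: your loss value will increase with each step, because the chosen loss function is not suitable here.
I have given a loss function below that may work for your data. I am not sure what your data looks like, but you can try it and let me know whether it works.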
n_samples = train_data_X.shape[0]
loss = tf.reduce_sum(tf.pow(prediction - Y, 2)) / (2 * n_samples)
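One way to see why the rescaled loss may help: dividing by 2 * n_samples shrinks every gradient, which acts like a smaller learning rate. The sketch below is plain Python with hypothetical values (an engine size of 5.0, an emission of 250.0 and n_samples = 100 are assumptions, not taken from your data), and run_updates is a made-up helper. It repeats the per-sample update on a single sample and tracks the absolute prediction error under both losses.

```python
def run_updates(x, y, scale, steps=5, learning_rate=0.05, w=20.0, b=30.0):
    """Repeat gradient steps on one sample for loss = scale * (w*x + b - y)^2.

    Returns the absolute prediction error before training and after each step.
    """
    errors = [abs(w * x + b - y)]
    for _ in range(steps):
        residual = w * x + b - y
        # d(loss)/dw = 2 * scale * residual * x
        # d(loss)/db = 2 * scale * residual
        w -= learning_rate * 2.0 * scale * residual * x
        b -= learning_rate * 2.0 * scale * residual
        errors.append(abs(w * x + b - y))
    return errors


x, y = 5.0, 250.0   # hypothetical sample
n_samples = 100     # hypothetical training-set size

# Plain squared error on one sample (what reduce_mean of a scalar gives).
plain = run_updates(x, y, scale=1.0)
# Suggested loss: squared error divided by 2 * n_samples.
scaled = run_updates(x, y, scale=1.0 / (2 * n_samples))

print(plain)   # error grows from step to step
print(scaled)  # error shrinks slowly
```

With these numbers, each plain update multiplies the error by 1 - 2 * 0.05 * (5**2 + 1) = -1.6, so it grows, while the scaled version multiplies it by about 0.987. Lowering the learning rate would have a similar effect.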
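Answer to your second question.
Assuming the extra data column is named MILEAGE, you can make the following changes to train_data_X and test_data_X. The rest of the code stays largely the same as above, although W is a single scalar there, so it should be changed to hold one weight per input column if each feature is to get its own coefficient.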
train_data_X = np.asanyarray(df1[['ENGINE SIZE','MILEAGE']])
train_data_Y = np.asanyarray(df1['CO2 EMISSIONS'])
test_data_X = np.asanyarray(df2[['ENGINE SIZE','MILEAGE']])
test_data_Y = np.asanyarray(df2['CO2 EMISSIONS'])
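With two input columns, the model needs one weight per column; a single scalar weight would apply the same coefficient to both features. The following is a minimal plain-Python sketch of that idea, not TensorFlow code: predict, train and the tiny data set are hypothetical, and the learning rate and epoch count were picked for this toy data only.

```python
def predict(weights, bias, features):
    """Linear model with one weight per feature: sum(w_i * x_i) + bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def train(rows, targets, learning_rate=0.05, epochs=2000):
    """Full-batch gradient descent on the mean squared error."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        residuals = [predict(weights, bias, r) - t for r, t in zip(rows, targets)]
        # Gradient of mean(residual^2) w.r.t. each weight and the bias.
        grad_w = [
            sum(2.0 * res * r[j] for res, r in zip(residuals, rows)) / n
            for j in range(n_features)
        ]
        grad_b = sum(2.0 * res for res in residuals) / n
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


# Hypothetical rows of [engine size, mileage]; targets follow 3*x1 + 2*x2 + 1.
rows = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0], [4.0, 1.0], [2.0, 4.0]]
targets = [3.0 * x1 + 2.0 * x2 + 1.0 for x1, x2 in rows]

weights, bias = train(rows, targets)
print(weights, bias)  # close to [3.0, 2.0] and 1.0
```

Real engine-size and mileage columns are likely to be on very different scales, so in practice you may need to normalize the inputs or lower the learning rate before this kind of update converges.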