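I wrote a linear regression model from scratch, but the loss keeps increasing. My data is the living area and sale price (used as the label) of houses from the Houston housing dataset. I tried a wide range of learning rates (from 10 down to 0.00000000001), but none of them helped. With every epoch, my fitted line moves further away from the data points. I suspect something is wrong in one of my functions, but I can't work out what. Here is a sample of the loss values: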
    loss: 0.5977188541860982
    loss: 0.6003449724263221
    loss: 0.6029841845821928
    loss: 0.6056365560589673
    loss: 0.6083021525886172
    loss: 0.6109810402314608
    loss: 0.6136732853778034
    loss: 0.6163789547495854
    loss: 0.6190981154020385
    loss: 0.6218308347253524
    loss: 0.6245771804463445
Here is the code:
    from preprocessing import load_csv
    import pandas as pd
    import numpy as np
    import random
    import matplotlib.pyplot as plt

    # mean squared error
    def MSE(y_prediction, y_true, deriv=(False, 1)):
        if deriv[0]:
            # deriv[1] is the derivative of the fit_function
            return 2 * np.mean(np.subtract(y_true, y_prediction) * deriv[1])
        return np.mean(np.square(np.subtract(y_true, y_prediction)))

    # linear function
    def fit_function(theta_0, theta_1, x):
        return theta_0 + (theta_1 * x)

    # train model
    def train(dataset, epochs=10, lr=0.01):
        # loading and normalizing the data
        x = (v := np.array(dataset["GrLivArea"].tolist()[:100])) / max(v)
        y = (l := np.array(dataset["SalePrice"].tolist()[:100])) / max(l)

        # y-intercept
        theta_0 = random.uniform(min(y), max(y))
        # slope
        theta_1 = random.uniform(-1, 1)

        for epoch in range(epochs):
            predictions = fit_function(theta_0, theta_1, x)
            loss = MSE(predictions, y)

            delta_theta_0 = MSE(predictions, y, deriv=(True, 1))
            delta_theta_1 = MSE(predictions, y, deriv=(True, x))

            theta_0 -= lr * delta_theta_0
            theta_1 -= lr * delta_theta_1

            print("\nloss:", loss)

        plt.style.use("ggplot")
        plt.scatter(x, y)

        x, predictions = map(list, zip(*sorted(zip(x, predictions))))
        plt.plot(x, predictions, "b--")

        plt.show()

    train(load_csv("dataset/houston_housing/single_variable_dataset/train.csv"), epochs=500, lr=0.001)
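Thanks for your help 🙂

Answer:

This is a fairly old post, but I'd still like to give an answer.

You have the sign wrong in the MSE derivative. The residual must be prediction minus target, not target minus prediction: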
    def MSE(y_prediction, y_true, deriv=(False, 1)):
        if deriv[0]:
            return 2 * np.mean(np.subtract(y_prediction, y_true) * deriv[1])
        return np.mean(np.square(np.subtract(y_true, y_prediction)))
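One way to convince yourself which sign is right is to compare both versions against a numerical gradient. The sketch below is only an illustration: the toy data, the helper names `analytic_grad` and `numeric_grad`, the starting point `theta`, and the step size `eps` are all assumptions and not part of the original code.

```python
import numpy as np

def fit_function(theta_0, theta_1, x):
    return theta_0 + theta_1 * x

def mse(theta_0, theta_1, x, y):
    return np.mean((fit_function(theta_0, theta_1, x) - y) ** 2)

def analytic_grad(theta_0, theta_1, x, y, flipped=False):
    """Gradient of the MSE with respect to (theta_0, theta_1).

    flipped=True reproduces the sign used in the question
    (y_true - y_prediction instead of y_prediction - y_true).
    """
    pred = fit_function(theta_0, theta_1, x)
    residual = (y - pred) if flipped else (pred - y)
    return np.array([2 * np.mean(residual), 2 * np.mean(residual * x)])

def numeric_grad(theta_0, theta_1, x, y, eps=1e-6):
    """Central finite-difference approximation of the same gradient."""
    d0 = (mse(theta_0 + eps, theta_1, x, y) - mse(theta_0 - eps, theta_1, x, y)) / (2 * eps)
    d1 = (mse(theta_0, theta_1 + eps, x, y) - mse(theta_0, theta_1 - eps, x, y)) / (2 * eps)
    return np.array([d0, d1])

# Hypothetical toy data lying exactly on the line y = 0.5 + 0.3 * x
x = np.linspace(0.0, 1.0, 20)
y = 0.5 + 0.3 * x

theta = (0.1, -0.2)  # arbitrary starting parameters
correct = analytic_grad(*theta, x, y)
flipped = analytic_grad(*theta, x, y, flipped=True)
numeric = numeric_grad(*theta, x, y)

print("numeric:", numeric)
print("correct:", correct)
print("flipped:", flipped)
```

The prediction-minus-target version should agree with the numerical gradient, while the version from the question should come out as its negative. Subtracting `lr` times a negated gradient moves the parameters uphill, which is why the loss grew regardless of the learning rate.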
The partial derivatives with respect to your parameters are:
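Written out for the model in the question, with $\hat{y}_i = \theta_0 + \theta_1 x_i$ and $n$ samples (this notation is assumed here and reconstructed from the code, not quoted from the original answer), they would be:

```latex
\begin{aligned}
\mathrm{MSE}(\theta_0, \theta_1) &= \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2 \\
\frac{\partial\,\mathrm{MSE}}{\partial \theta_0} &= \frac{2}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i) \\
\frac{\partial\,\mathrm{MSE}}{\partial \theta_1} &= \frac{2}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)\, x_i
\end{aligned}
```

The factor multiplying the residual is the derivative of the fit function with respect to the parameter: 1 for $\theta_0$ and $x_i$ for $\theta_1$. That is the value passed in through `deriv`.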
For brevity, this can be written as:
    def MSE(y_prediction, y_true, deriv=None):
        if deriv is not None:
            return 2 * np.mean((y_prediction - y_true) * deriv)
        return np.mean((y_prediction - y_true) ** 2)
This lets you get the derivatives without passing in a tuple that carries a flag:
    delta_theta_0 = MSE(predictions, y, deriv=1)
    delta_theta_1 = MSE(predictions, y, deriv=x)
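Dropped into the training loop, the corrected derivative can be checked end to end with a minimal self-contained sketch like the one below. The synthetic data (a noisy line), the fixed seed, the learning rate, and the zero starting values for `theta_0` and `theta_1` are assumptions chosen so the block runs without the original CSV. The update rule itself is the one from the question.

```python
import numpy as np

def MSE(y_prediction, y_true, deriv=None):
    # With deriv given, return the partial derivative; otherwise the loss itself.
    if deriv is not None:
        return 2 * np.mean((y_prediction - y_true) * deriv)
    return np.mean((y_prediction - y_true) ** 2)

def fit_function(theta_0, theta_1, x):
    return theta_0 + theta_1 * x

def train(x, y, epochs=500, lr=0.1, theta_0=0.0, theta_1=0.0):
    """Batch gradient descent; returns the parameters and the loss per epoch."""
    losses = []
    for _ in range(epochs):
        predictions = fit_function(theta_0, theta_1, x)
        losses.append(MSE(predictions, y))

        delta_theta_0 = MSE(predictions, y, deriv=1)
        delta_theta_1 = MSE(predictions, y, deriv=x)

        theta_0 -= lr * delta_theta_0
        theta_1 -= lr * delta_theta_1
    return theta_0, theta_1, losses

# Hypothetical data: inputs in [0, 1], targets on a noisy line
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 100)
y = 0.2 + 0.6 * x + rng.normal(0.0, 0.05, 100)

theta_0, theta_1, losses = train(x, y)
print("first loss:", losses[0])
print("last loss:", losses[-1])
print("theta_0, theta_1:", theta_0, theta_1)
```

Returning the per-epoch losses instead of printing them makes it easy to check that the sequence is decreasing, which is the property the original code was missing.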
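Here is an example using sklearn.datasets.load_boston, with LSTAT (percentage of lower-status population) and MEDV (median value of owner-occupied homes, in $1000s), the last two features in the data, as input and target respectively.

Trained with epochs=10000 and lr=0.001: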