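I am trying to implement gradient descent in Python. When I plot the cost function history it appears to converge, but the mean absolute error from my implementation is much worse than the one I get from sklearn's linear_model. I can't figure out what is wrong with my implementation.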
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn import linear_model
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

def gradient_descent(x, y, theta, alpha, num_iters):
    m = len(y)
    cost_history = np.zeros(num_iters)
    for iter in range(num_iters):
        h = np.dot(x, theta)
        for i in range(len(theta)):
            theta[i] = theta[i] - (alpha/m) * np.sum((h - y) * x[:,i])
        # Save the cost at each iteration
        cost_history[iter] = np.sum(np.square((h - y))) / (2 * m)
    return theta, cost_history

attributes = [...]
class_field = [...]

x_df = pd.read_csv('train.csv', usecols = attributes)
y_df = pd.read_csv('train.csv', usecols = class_field)

# Standardize the features
x_df = (x_df - x_df.mean()) / x_df.std()

# Gradient descent
alpha = 0.01
num_iters = 1000
err = 0
i = 10
for i in range(i):
    x_train, x_test, y_train, y_test = train_test_split(x_df, y_df, test_size=0.2)
    x_train = np.array(x_train)
    y_train = np.array(y_train).flatten()
    theta = np.random.sample(len(x_df.columns))
    theta, cost_history = gradient_descent(x_train, y_train, theta, alpha, num_iters)
    err = err + mean_absolute_error(y_test, np.dot(x_test, theta))
    print(np.dot(x_test, theta))
    #plt.plot(cost_history)
    #plt.show()
print(err/i)

regr = linear_model.LinearRegression()
regr.fit(x_train, y_train)
y_pred = regr.predict(x_test)
print(mean_absolute_error(y_test, y_pred))
Answer:
It looks like you are missing the bias/intercept column and its coefficient.
The hypothesis for a linear function should look like this:
H = theta_0 + theta_1 * x
In your implementation, it looks like this:
H = theta_1 * x
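One way to add the missing term is to prepend a column of ones to the feature matrix, so that the first entry of theta plays the role of theta_0. The sketch below is only an illustration of that idea: the helper name add_intercept and the synthetic data are made up for the example and are not part of the question's script, and the per-coefficient loop is written as an equivalent vectorized update.

```python
import numpy as np


def add_intercept(x):
    """Prepend a column of ones so that theta[0] acts as the intercept."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([np.ones(x.shape[0]), x])


def gradient_descent(x, y, theta, alpha, num_iters):
    """Batch gradient descent; all coefficients are updated simultaneously."""
    m = len(y)
    cost_history = np.zeros(num_iters)
    for it in range(num_iters):
        h = x @ theta
        # Cost of the current theta, recorded before the update
        cost_history[it] = np.sum((h - y) ** 2) / (2 * m)
        theta = theta - (alpha / m) * (x.T @ (h - y))
    return theta, cost_history


# Synthetic data, made up for illustration: y = 5 + 2*x1 - 3*x2
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 2))
target = 5.0 + 2.0 * features[:, 0] - 3.0 * features[:, 1]

x_with_bias = add_intercept(features)        # shape (200, 3)
theta0 = np.zeros(x_with_bias.shape[1])      # one extra coefficient for the intercept
theta, cost_history = gradient_descent(
    x_with_bias, target, theta0, alpha=0.1, num_iters=2000
)
print(theta)
```

In your script the same change would presumably mean applying the ones column to both x_train and x_test and giving theta len(x_df.columns) + 1 entries. Since LinearRegression fits an intercept by default (fit_intercept=True), this would put the two models on equal footing.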