Linear regression with gradient descent using Python and NumPy

I am trying to implement the first exercise of Andrew Ng's Machine Learning course on Coursera in Python. In the course the exercise is done in Matlab/Octave, but I want to implement the same thing in Python.

The problem is that the line that updates the theta values does not seem to work correctly: it returns [[0.72088159] [0.72088159]], but it should return [[-3.630291] [1.166362]].

I am using a learning rate of 0.01 and 1500 gradient descent iterations (the same values as in the original Octave exercise).

Obviously, since the theta values are wrong, the predictions are also wrong, as shown in the final plot.

In the lines that test the cost function, the results are correct when theta is defined as [0; 0] and as [-1; 2] (they match the Octave exercise), so the error can only be in the gradient function, but I don't know what went wrong there.

I hope someone can help me figure out what I am doing wrong. Any help would be much appreciated.

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

def load_data():
    X = np.genfromtxt('data.txt', usecols=(0), delimiter=',', dtype=None)
    y = np.genfromtxt('data.txt', usecols=(1), delimiter=',', dtype=None)
    X = X.reshape(1, X.shape[0])
    y = y.reshape(1, y.shape[0])
    ones = np.ones(X.shape)
    X = np.append(ones, X, axis=0)
    theta = np.zeros((2, 1))
    return (X, y, theta)

alpha = 0.01
iter_num = 1500
debug_at_loop = 10

def plot(x, y, y_hat=None):
    x = x.reshape(x.shape[0], 1)
    plt.xlabel('x')
    plt.ylabel('hΘ(x)')
    plt.ylim(ymax=25, ymin=-5)
    plt.xlim(xmax=25, xmin=5)
    plt.scatter(x, y)
    if type(y_hat) is np.ndarray:
        plt.plot(x, y_hat, '-')
    plt.show()

plot(X[1], y)

def hip(X, theta):
    return np.dot(theta.T, X)

def cost(X, y, theta):
    m = y.shape[1]
    return np.sum(np.square(hip(X, theta) - y)) / (2 * m)

print('With theta = [0 ; 0]')
print('Cost computed =', cost(X, y, np.array([0, 0])))
print()
print('With theta = [-1 ; 2]')
print('Cost computed =', cost(X, y, np.array([-1, 2])))

def grad(X, y, alpha, theta, iter_num=1500, debug_cost_at_each=10):
    J = []
    m = y.shape[1]
    for i in range(iter_num):
        theta -= ((alpha * 1) / m) * np.sum(np.dot(hip(X, theta) - y, X.T))
        if i % debug_cost_at_each == 0:
            J.append(round(cost(X, y, theta), 6))
    return J, theta

X, y, theta = load_data()
J, fit_theta = grad(X, y, alpha, theta)
print('Theta found by Gradient Descent:', fit_theta)

# Predict values for population sizes of 35,000 and 70,000
predict1 = np.dot(np.array([[1], [3.5]]).T, fit_theta)
print('For population = 35,000, we predict a profit of \n', predict1 * 10000)
predict2 = np.dot(np.array([[1], [7]]).T, fit_theta)
print('For population = 70,000, we predict a profit of \n', predict2 * 10000)

pred_y = hip(X, fit_theta)
plot(X[1], y, pred_y.T)

The data I am using is the following text file:

6.1101,17.592
5.5277,9.1302
8.5186,13.662
7.0032,11.854
5.8598,6.8233
8.3829,11.886
7.4764,4.3483
8.5781,12
6.4862,6.5987
5.0546,3.8166
5.7107,3.2522
14.164,15.505
5.734,3.1551
8.4084,7.2258
5.6407,0.71618
5.3794,3.5129
6.3654,5.3048
5.1301,0.56077
6.4296,3.6518
7.0708,5.3893
6.1891,3.1386
20.27,21.767
5.4901,4.263
6.3261,5.1875
5.5649,3.0825
18.945,22.638
12.828,13.501
10.957,7.0467
13.176,14.692
22.203,24.147
5.2524,-1.22
6.5894,5.9966
9.2482,12.134
5.8918,1.8495
8.2111,6.5426
7.9334,4.5623
8.0959,4.1164
5.6063,3.3928
12.836,10.117
6.3534,5.4974
5.4069,0.55657
6.8825,3.9115
11.708,5.3854
5.7737,2.4406
7.8247,6.7318
7.0931,1.0463
5.0702,5.1337
5.8014,1.844
11.7,8.0043
5.5416,1.0179
7.5402,6.7504
5.3077,1.8396
7.4239,4.2885
7.6031,4.9981
6.3328,1.4233
6.3589,-1.4211
6.2742,2.4756
5.6397,4.6042
9.3102,3.9624
9.4536,5.4141
8.8254,5.1694
5.1793,-0.74279
21.279,17.929
14.908,12.054
18.959,17.054
7.2182,4.8852
8.2951,5.7442
10.236,7.7754
5.4994,1.0173
20.341,20.992
10.136,6.6799
7.3345,4.0259
6.0062,1.2784
7.2259,3.3411
5.0269,-2.6807
6.5479,0.29678
7.5386,3.8845
5.0365,5.7014
10.274,6.7526
5.1077,2.0576
5.7292,0.47953
5.1884,0.20421
6.3557,0.67861
9.7687,7.5435
6.5159,5.3436
8.5172,4.2415
9.1802,6.7981
6.002,0.92695
5.5204,0.152
5.0594,2.8214
5.7077,1.8451
7.6366,4.2959
5.8707,7.2029
5.3054,1.9869
8.2934,0.14454
13.394,9.0551
5.4369,0.61705

Answer:

Well, after tearing out more than a few hairs I finally solved this (programming may yet make me bald).

The problem was in the gradient update line, and the fix is this:

theta -= ((alpha * 1) / m) * np.dot(X, (hip(X, theta) - y).T)

I moved X to the front and transposed the error vector. The original line collapsed the entire gradient into a single scalar with np.sum, so both components of theta received the same update (hence the repeated 0.72088159); np.dot(X, (hip(X, theta) - y).T) instead yields a (2, 1) vector with a separate gradient component for each parameter.
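As a sanity check, here is a minimal sketch of the corrected update on synthetic data (a hypothetical example, not the course dataset): with noise-free points from y = 2x + 1 and the same shape conventions as the question (X is (2, m) with a row of ones, y is (1, m), theta is (2, 1)), gradient descent should drive theta toward [1, 2].

```python
import numpy as np

def hip(X, theta):
    # hypothesis h_theta(x) = theta^T X, shape (1, m)
    return np.dot(theta.T, X)

def grad(X, y, alpha, theta, iter_num):
    m = y.shape[1]
    for _ in range(iter_num):
        # np.dot(X, error.T) has shape (2, 1), matching theta,
        # so each parameter gets its own gradient component
        theta -= (alpha / m) * np.dot(X, (hip(X, theta) - y).T)
    return theta

x = np.linspace(0, 10, 50)
X = np.vstack([np.ones_like(x), x])   # (2, 50): bias row + feature row
y = (2 * x + 1).reshape(1, -1)        # (1, 50): targets from y = 2x + 1
theta = grad(X, y, alpha=0.01, theta=np.zeros((2, 1)), iter_num=20000)
print(theta.ravel())  # approaches [1, 2]
```

With the buggy scalar update, both entries of theta would stay identical no matter how long you iterate, so a quick check like this immediately exposes the problem.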

