Gradient descent on a logarithmic decline curve in Python

I want to perform gradient descent on a logarithmic decline curve, as represented by:

y = y0 - a * ln(b + x)

In this example, my y0 is: 800

I tried to do this using partial derivatives with respect to a and b, but while this apparently minimizes the squared error, it doesn't converge. I know this isn't vectorized, and I may be taking the wrong approach entirely. Is there a simple mistake I'm making, or am I way off on this problem?

import numpy as np

# constants my gradient descent model should find:
a = 4
b = 4

# function to fit on!
def function(x, a, b):
    y0 = 800
    return y0 - a * np.log(b + x)

# Generates data
def gen_data(numpoints):
    a = 4
    b = 4
    x = np.array(range(0, numpoints))
    y = function(x, a, b)
    return x, y

x, y = gen_data(600)

def grad_model(x, y, iterations):
    converged = False
    # length of dataset
    m = len(x)
    # guess   a ,  b
    theta = [0.1, 0.1]
    alpha = 0.001
    # initial error
    e = np.sum((np.square(function(x, theta[0], theta[1])) - y))
    for iteration in range(iterations):
        hypothesis = function(x, theta[0], theta[1])
        loss = hypothesis - y
        # compute partial derivatives to find slope to "fall" into
        theta0_grad = (np.mean(np.sum(-np.log(x + y)))) / (m)
        theta1_grad = (np.mean((((np.log(theta[1] + x)) / theta[0]) - (x*(np.log(theta[1] + x)) / theta[0])))) / (2*m)
        theta0 = theta[0] - (alpha * theta0_grad)
        theta1 = theta[1] - (alpha * theta1_grad)
        theta[1] = theta1
        theta[0] = theta0
        new_e = np.sum(np.square((function(x, theta[0], theta[1])) - y))
        if new_e > e:
            print("AHHHH!")
            print("Iteration: " + str(iteration))
            break
        print(theta)
    return theta[0], theta[1]

Answer:

I found some mistakes in your code. The line

e = np.sum((np.square(function(x, theta[0], theta[1])) - y))

is wrong and should be replaced with

e = np.sum((np.square(function(x, theta[0], theta[1]) - y)))

The formula for new_e contains the same mistake: it squares the model output rather than the residual, so what it computes is not the squared error at all.

Also, the gradient formulas are wrong. Your loss function is $L(a,b) = \sum_{i=1}^N \left(y_0 - a \log(b + x_i) - y_i\right)^2$, so you have to compute the partial derivatives of $L$ with respect to $a$ and $b$. (Does LaTeX really not work on stackoverflow?) One last point: the gradient descent method has a step-size restriction, so the step size must not be too large.
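Writing those derivatives out explicitly (a short sketch added here; these are exactly what the theta0_grad and theta1_grad lines in the code below compute):

$\frac{\partial L}{\partial a} = 2 \sum_{i=1}^N \left(y_0 - a \log(b + x_i) - y_i\right)\left(-\log(b + x_i)\right)$

$\frac{\partial L}{\partial b} = 2 \sum_{i=1}^N \left(y_0 - a \log(b + x_i) - y_i\right)\left(-\frac{a}{b + x_i}\right)$

Here is a version of your code that works better: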

import numpy as np
import matplotlib.pyplot as plt

# constants my gradient descent model should find:
a = 4.0
b = 4.0
y0 = 800.0

# function to fit on!
def function(x, a, b):
    # y0 = 800
    return y0 - a * np.log(b + x)

# Generates data
def gen_data(numpoints):
    # a = 4
    # b = 4
    x = np.array(range(0, numpoints))
    y = function(x, a, b)
    return x, y

x, y = gen_data(600)

def grad_model(x, y, iterations):
    converged = False
    # length of dataset
    m = len(x)
    # guess   a ,  b
    theta = [0.1, 0.1]
    alpha = 0.00001
    # initial error
    # e = np.sum((np.square(function(x, theta[0], theta[1])) - y))  # This was a bug
    e = np.sum((np.square(function(x, theta[0], theta[1]) - y)))
    costs = np.zeros(iterations)
    for iteration in range(iterations):
        hypothesis = function(x, theta[0], theta[1])
        loss = hypothesis - y
        # compute partial derivatives to find slope to "fall" into
        # theta0_grad = (np.mean(np.sum(-np.log(x + y)))) / (m)
        # theta1_grad = (np.mean((((np.log(theta[1] + x)) / theta[0]) - (x*(np.log(theta[1] + x)) / theta[0])))) / (2*m)
        theta0_grad = 2*np.sum((y0 - theta[0]*np.log(theta[1] + x) - y)*(-np.log(theta[1] + x)))
        # use theta[1] here, not the true constant b, so the gradient
        # depends only on the current parameter estimates
        theta1_grad = 2*np.sum((y0 - theta[0]*np.log(theta[1] + x) - y)*(-theta[0]/(theta[1] + x)))
        theta0 = theta[0] - (alpha * theta0_grad)
        theta1 = theta[1] - (alpha * theta1_grad)
        theta[1] = theta1
        theta[0] = theta0
        # new_e = np.sum(np.square((function(x, theta[0], theta[1])) - y))  # This was a bug
        new_e = np.sum(np.square((function(x, theta[0], theta[1]) - y)))
        costs[iteration] = new_e
        if new_e > e:
            print("AHHHH!")
            print("Iteration: " + str(iteration))
            # break
        print(theta)
    return theta[0], theta[1], costs

(theta0, theta1, costs) = grad_model(x, y, 100000)
plt.semilogy(costs)
plt.show()
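As an independent sanity check (an addition here, not part of the original answer, and assuming SciPy is available), scipy.optimize.curve_fit fits the same model without any manual step-size tuning. A minimal sketch:

import numpy as np
from scipy.optimize import curve_fit

y0 = 800.0

def function(x, a, b):
    return y0 - a * np.log(b + x)

x = np.arange(600)
y = function(x, 4.0, 4.0)

# Bound a and b below by 0 so the optimizer never evaluates the log of a
# non-positive number; with bounds set, curve_fit uses the 'trf'
# least-squares method instead of plain Levenberg-Marquardt.
popt, pcov = curve_fit(function, x, y, p0=[0.1, 0.1], bounds=(0, np.inf))
print(popt)  # should print values close to [4.0, 4.0]

Because it is a trust-region least-squares solver rather than fixed-step gradient descent, it avoids the step-size sensitivity that the hand-rolled loop shows on this problem.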
