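I am trying to implement stochastic gradient descent in MATLAB. I followed the algorithm exactly, but the coefficients w of the fitted prediction function come out extremely large. Is there a mistake in my algorithm?
The code is as follows: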
x = 0:0.1:2*pi;                      % x-axis
n = size(x,2);
r = -0.2 + 0.4.*rand(n,1);           % random noise to be added to sin(x)
t = zeros(1,n);
y = zeros(1,n);
for i = 1:n
    t(i) = sin(x(i)) + r(i);         % observation with noise
    y(i) = sin(x(i));                % the function without noise
end
f = randi(n, 1, 20);                 % 20 random indices in 1..n
h = x(f);                            % random x points
k = t(f);                            % corresponding noisy y points
m = size(h,2);                       % length of the h vector
scatter(h, k, 'r');                  % training points (with noise)
%scatter(x,t,2);
hold on;
plot(x, sin(x));                     % the sine function
w = [0.3 1 0.5];                     % starting point of w
a = 0.05;                            % learning rate "alpha"
% ---------------- ALGORITHM ----------------
for i = 1:20
    v = [1 h(i) h(i).^2];            % feature vector
    e = ((w*v') - k(i)).*v;          % (prediction - observation) times features
    w = w - a*e;                     % update w
end
hold on;
l = 0:1:6;
g = w(1) + w(2)*l + w(3)*(l.^2);
plot(l, g, 'y');                     % the fitted prediction function
Answer:
If the learning rate is too large, SGD is likely to diverge.
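As a rough check of why a = 0.05 is probably too large for these features, consider a single update on one sample, using the question's symbols w, v, k and a. The residual symbol \delta is introduced here only for illustration:

```latex
\[
w' = w - a\,(w v^{\top} - k)\,v,
\qquad
\delta' = w' v^{\top} - k = \bigl(1 - a\,\lVert v\rVert^{2}\bigr)\,\delta,
\quad \delta = w v^{\top} - k .
\]
\[
|\delta'| < |\delta| \iff 0 < a < \frac{2}{\lVert v\rVert^{2}},
\qquad
x = 6:\ \lVert v\rVert^{2} = 1 + 36 + 1296 = 1333,\quad
1 - 0.05 \cdot 1333 = -65.65 .
\]
```

So for a sample near x = 6, one step multiplies the residual on that sample by about -65 instead of shrinking it, which is consistent with the very large coefficients reported in the question.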
The learning rate should decay toward zero over the iterations.
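A minimal sketch of the same quadratic fit with a decaying step size might look like the following. The values a0, tau and nSteps are assumptions chosen for illustration, not taken from the original post. a0 is picked so that a0 times the largest squared feature norm (about 1517 for x up to 6.2) stays below 2.

```matlab
% Sketch: SGD for a quadratic fit to noisy sin(x) with a decaying step size.
% Assumed values (not from the original post): a0, tau, nSteps.
x = 0:0.1:2*pi;                          % input grid
n = numel(x);
t = sin(x) + (-0.2 + 0.4 .* rand(1, n)); % noisy targets

f = randi(n, 1, 20);                     % 20 random training indices
h = x(f);                                % training inputs
k = t(f);                                % training targets

w      = [0.3 1 0.5];                    % same starting point as the question
a0     = 1e-3;                           % initial learning rate (assumed)
tau    = 5000;                           % decay time scale (assumed)
nSteps = 20000;                          % many updates instead of one pass

for s = 1:nSteps
    i = randi(numel(h));                 % pick one training point at random
    v = [1 h(i) h(i)^2];                 % feature vector
    a = a0 / (1 + s / tau);              % step size decays toward zero
    e = (w * v' - k(i)) .* v;            % gradient of half the squared error
    w = w - a * e;                       % SGD update
end

disp(w);                                 % fitted coefficients
```

With step sizes this small the coefficients stay bounded, although convergence can be slow because the features 1, x and x^2 have very different scales. Rescaling x before building v is one common way to allow a larger step.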