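Gradient descent with multiple variables (without matrices)

I'm new to Matlab and machine learning, and I'm trying to write a gradient descent function that doesn't use matrix operations.

m is the number of examples in the training set
n is the number of features of each example

The function gradientDescentMulti takes 5 arguments:

X: an m x n matrix
y: an m-dimensional vector
theta: an n-dimensional vector
alpha: a real number (the learning rate)
num_iters: the number of iterations (a positive integer)

I already have a solution that uses matrix multiplication: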

function theta = gradientDescentMulti(X, y, theta, alpha, num_iters)
  m = length(y); % number of training examples
  for iter = 1:num_iters
    gradJ = 1/m * (X'*X*theta - X'*y);
    theta = theta - alpha * gradJ;
  end
end

The result after the iterations:

theta =

   1.0e+05 *

    3.3430
    1.0009
    0.0367

But now I'm trying to do the same thing without matrix multiplication. This is the function:

function theta = gradientDescentMulti(X, y, theta, alpha, num_iters)
  m = length(y); % number of training examples
  n = size(X, 2); % number of features

  for iter = 1:num_iters

    new_theta = zeros(1, n);

    %// for each feature, find the new theta
    for t = 1:n
      S = 0;
      for example = 1:m
        h = 0;
        for example_feature = 1:n
          h = h + (theta(example_feature) * X(example, example_feature));
        end
        S = S + ((h - y(example)) * X(example, n)); %// Sum each feature for this example
      end
      new_theta(t) = theta(t) - alpha * (1/m) * S; %// Calculate new theta for this example
    end

    %// only at the end of the function, update all theta simultaneously
    theta = new_theta'; %// Transpose new_theta (horizontal vector) to theta (vertical vector)
  end
end

As a result, all the theta values are the same :/

theta =

   1.0e+04 *

    3.5374
    3.5374
    3.5374

Answer:

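If you look at the gradient update rule, it may be more efficient to first compute the hypothesis for all of the training examples, then subtract the ground-truth value of each training example from its hypothesis and store these differences in an array or vector. Once you have done that, the update rule is very easy to compute. It doesn't look to me like your code does this.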
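For reference, the update rule mentioned above can be written out per feature. This is a sketch in standard notation rather than something taken from the original post: the symbols e^{(i)} (the error of example i) and x^{(i)}_j (feature j of example i, i.e. X(i,j)) are introduced here for illustration only.

```latex
\begin{align*}
% Error of example i under the current parameters
e^{(i)} &= \sum_{k=1}^{n} \theta_k \, x^{(i)}_k - y^{(i)},
  & i &= 1, \dots, m \\
% Simultaneous update of every parameter
\theta_j &\leftarrow \theta_j - \frac{\alpha}{m} \sum_{i=1}^{m} e^{(i)} \, x^{(i)}_j,
  & j &= 1, \dots, n
\end{align*}
```

Stacking these n updates gives the matrix form used in the question, theta - alpha * (1/m) * (X'*X*theta - X'*y). Note that the error is multiplied by feature j of the example. The loop version in the question multiplies by X(example, n) instead of X(example, t), so S comes out the same for every t. Assuming theta starts with equal entries (for example all zeros), that would explain why every theta ends up with the same value.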

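So I rewrote the code, this time with a separate array that stores the difference between the hypothesis and the ground-truth value for each training example. Once I have that, I can compute the update rule for each feature separately: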

m = length(y);  %// number of training examples (as in the question)
n = size(X, 2); %// number of features (as in the question)

for iter = 1 : num_iters

    %// Compute hypothesis differences with ground truth first
    h = zeros(1, m);
    for t = 1 : m
        %// Compute hypothesis
        for tt = 1 : n
            h(t) = h(t) + theta(tt)*X(t,tt);
        end
        %// Compute difference between hypothesis and ground truth
        h(t) = h(t) - y(t);
    end

    %// Now update parameters
    new_theta = zeros(1, n);
    %// for each feature, find the new theta
    for tt = 1 : n
        S = 0;
        %// For each sample, compute products of hypothesis difference
        %// and the right feature of the sample and accumulate
        for t = 1 : m
            S = S + h(t)*X(t,tt);
        end
        %// Compute gradient descent step
        new_theta(tt) = theta(tt) - (alpha/m)*S;
    end

    theta = new_theta'; %// Transpose new_theta (horizontal vector) to theta (vertical vector)

end

When I do this, I get the same results as with the matrix formulation.
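One way to check that claim on your own machine is to run both versions side by side on a small data set. The script below is only a sketch: the data in X and y, the learning rate, and the iteration count are made-up values for illustration, and the variable names theta_mat, theta_loop, and err do not appear in the original post.

```matlab
% Hypothetical toy data (not from the original post):
% m = 4 examples, n = 3 features, first column is the intercept term.
X = [1 2 3; 1 4 1; 1 0 5; 1 3 2];
y = [10; 8; 12; 9];
alpha = 0.01;       % assumed learning rate
num_iters = 50;     % assumed number of iterations
m = length(y);
n = size(X, 2);

theta_mat  = zeros(n, 1);   % parameters updated with the matrix formula
theta_loop = zeros(n, 1);   % parameters updated with explicit loops

for iter = 1:num_iters
    % Matrix version, as in the question
    theta_mat = theta_mat - alpha * (1/m) * (X' * X * theta_mat - X' * y);

    % Loop version: first the error of every example...
    err = zeros(m, 1);
    for i = 1:m
        for k = 1:n
            err(i) = err(i) + theta_loop(k) * X(i, k);
        end
        err(i) = err(i) - y(i);
    end
    % ...then one simultaneous update per feature
    new_theta = zeros(n, 1);
    for j = 1:n
        S = 0;
        for i = 1:m
            S = S + err(i) * X(i, j);   % note the column index j
        end
        new_theta(j) = theta_loop(j) - (alpha / m) * S;
    end
    theta_loop = new_theta;
end

% The two versions should agree up to floating-point rounding
assert(max(abs(theta_mat - theta_loop)) < 1e-9);
```

If the assertion passes silently, the loop version reproduces the matrix version on this data. Replacing X(i, j) with X(i, n) in the inner sum should make it fail, which is the behaviour described in the question.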
