Correctly implementing gradient descent for hinge loss minimization

I copied the hinge loss function from here (along with LossC and LossFunc, which it is based on). I then plugged it into my gradient descent algorithm as follows:

    do {
        iteration++;
        error = 0.0;
        cost = 0.0;

        // loop over all instances (one full epoch)
        for (p = 0; p < number_of_files__train; p++) {
            // 1. compute the hypothesis h = X * theta
            hypothesis = calculateHypothesis(theta, feature_matrix__train, p, globo_dict_size);

            // 2. compute the loss = h - y, and possibly the squared cost (loss^2)/2m
            //cost = hypothesis - outputs__train[p];
            cost = HingeLoss.loss(hypothesis, outputs__train[p]);
            System.out.println("cost " + cost);

            // 3. compute the gradient = X' * loss / m
            gradient = calculateGradent(theta, feature_matrix__train, p, globo_dict_size, cost, number_of_files__train);

            // 4. update the parameters: theta = theta - alpha * gradient
            for (int i = 0; i < globo_dict_size; i++) {
                theta[i] = theta[i] - LEARNING_RATE * gradient[i];
            }
        }

        // sum of squared errors (error value over all instances)
        error += (cost * cost);

        /* root mean squared error */
        //System.out.println("Iteration " + iteration + " : RMSE = " + Math.sqrt( error/number_of_files__train ) );
        System.out.println("Iteration " + iteration + " : RMSE = " + Math.sqrt(error / number_of_files__train));
    } while (error != 0);

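But this does not work at all. Is that because of the loss function? Or is there a problem with the way I added the loss function to my code?

I suppose it is also possible that my gradient descent implementation itself is wrong.

Here are my methods for computing the gradient and the hypothesis. Are they correct?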

    static double calculateHypothesis(double[] theta, double[][] feature_matrix, int file_index, int globo_dict_size) {
        double hypothesis = 0.0;

        for (int i = 0; i < globo_dict_size; i++) {
            hypothesis += (theta[i] * feature_matrix[file_index][i]);
        }
        // bias
        hypothesis += theta[globo_dict_size];

        return hypothesis;
    }

    static double[] calculateGradent(double theta[], double[][] feature_matrix, int file_index, int globo_dict_size, double cost, int number_of_files__train) {
        double m = number_of_files__train;

        double[] gradient = new double[globo_dict_size]; // one more for the bias?

        for (int i = 0; i < gradient.length; i++) {
            gradient[i] = (1.0 / m) * cost * feature_matrix[file_index][i];
        }

        return gradient;
    }

The rest of the code is here, if you are interested in looking at it.

Below is what the loss functions look like. Should I be using loss or deriv, and are they correct?

    /**
     * Computes the HingeLoss loss
     *
     * @param pred the predicted value
     * @param y the target value
     * @return the HingeLoss loss
     */
    public static double loss(double pred, double y) {
        return Math.max(0, 1 - y * pred);
    }

    /**
     * Computes the first derivative of the HingeLoss loss
     *
     * @param pred the predicted value
     * @param y the target value
     * @return the first derivative of the HingeLoss loss
     */
    public static double deriv(double pred, double y) {
        if (pred * y > 1)
            return 0;
        else
            return -y;
    }

Answer:

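The code you provided for computing the gradient does not look like the gradient of the hinge loss. The hinge loss max(0, 1 - y * h) has, with respect to theta, the subgradient -y * x when y * h < 1 and 0 otherwise, whereas calculateGradent multiplies the loss value itself by the features. Take a look at a valid equation, for example here: https://stats.stackexchange.com/questions/4608/gradient-of-hinge-loss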
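To make that concrete, here is a minimal sketch of a per-example update that uses deriv rather than loss, following the chain rule: the derivative of the loss with respect to the prediction, times the derivative of the prediction with respect to each weight (the feature value, or 1 for the bias). It is not the asker's full program. The class name HingeSgdSketch, the helper totalLoss, the fixed epoch count, and the toy data are all assumptions made for illustration. It also assumes labels are -1 or +1 and, as in calculateHypothesis, that the bias lives in the last slot of theta.

```java
import java.util.Arrays;

/** Illustrative sketch: per-example subgradient descent on the hinge loss. */
final class HingeSgdSketch {

    /** h = theta[0..n-1] . x + theta[n]; the bias is stored in the last slot. */
    static double predict(double[] theta, double[] x) {
        double h = theta[x.length]; // bias term
        for (int i = 0; i < x.length; i++) {
            h += theta[i] * x[i];
        }
        return h;
    }

    /** Hinge loss max(0, 1 - y*h); assumes y is -1 or +1. */
    static double loss(double h, double y) {
        return Math.max(0.0, 1.0 - y * h);
    }

    /** Subgradient of the hinge loss with respect to the prediction h. */
    static double deriv(double h, double y) {
        return (h * y > 1.0) ? 0.0 : -y;
    }

    /** One update: theta -= rate * dLoss/dh * dh/dtheta. */
    static void step(double[] theta, double[] x, double y, double rate) {
        double d = deriv(predict(theta, x), y);
        if (d == 0.0) {
            return; // example is outside the margin, nothing to update
        }
        for (int i = 0; i < x.length; i++) {
            theta[i] -= rate * d * x[i]; // dh/dtheta_i = x_i
        }
        theta[x.length] -= rate * d;     // dh/dbias = 1
    }

    /** Runs a fixed number of epochs (an assumption; not the original stop test). */
    static double[] train(double[][] xs, double[] ys, double rate, int epochs) {
        double[] theta = new double[xs[0].length + 1]; // +1 slot for the bias
        for (int e = 0; e < epochs; e++) {
            for (int p = 0; p < xs.length; p++) {
                step(theta, xs[p], ys[p], rate);
            }
        }
        return theta;
    }

    /** Sum of hinge losses over the data set, useful for monitoring progress. */
    static double totalLoss(double[] theta, double[][] xs, double[] ys) {
        double sum = 0.0;
        for (int p = 0; p < xs.length; p++) {
            sum += loss(predict(theta, xs[p]), ys[p]);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Hypothetical, linearly separable toy data with labels in {-1, +1}.
        double[][] xs = {{2, 1}, {1, 2}, {-1, -2}, {-2, -1}};
        double[] ys = {1, 1, -1, -1};
        double[] theta = train(xs, ys, 0.1, 100);
        System.out.println("theta = " + Arrays.toString(theta));
        System.out.println("total hinge loss = " + totalLoss(theta, xs, ys));
    }
}
```

In this sketch loss is only used for reporting, while deriv drives the update. The loop runs for a fixed number of epochs because a test such as error != 0 may never become true on data that is not separable. The bias slot is updated too, which the gradient array in the question leaves out.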
