Probability prediction with liblinear (Java), using the classifier directly in code

Consider the following use of liblinear (http://liblinear.bwaldvogel.de/):

    import de.bwaldvogel.liblinear.FeatureNode;
    import de.bwaldvogel.liblinear.Linear;
    import de.bwaldvogel.liblinear.Model;
    import de.bwaldvogel.liblinear.Parameter;
    import de.bwaldvogel.liblinear.Problem;
    import de.bwaldvogel.liblinear.SolverType;

    double C = 1.0;    // cost of constraint violation
    double eps = 0.01; // stopping criterion

    Parameter param = new Parameter(SolverType.L2R_L2LOSS_SVC, C, eps);
    Problem problem = new Problem();

    // labels of the four training examples
    double[] GROUPS_ARRAY = {1, 0, 0, 0};
    problem.y = GROUPS_ARRAY;

    int NUM_OF_TS_EXAMPLES = 4;
    problem.l = NUM_OF_TS_EXAMPLES; // number of training examples
    problem.n = 2;                  // number of features

    FeatureNode[] instance1 = { new FeatureNode(1, 1), new FeatureNode(2, 1) };
    FeatureNode[] instance2 = { new FeatureNode(1, -1), new FeatureNode(2, 1) };
    FeatureNode[] instance3 = { new FeatureNode(1, -1), new FeatureNode(2, -1) };
    FeatureNode[] instance4 = { new FeatureNode(1, 1), new FeatureNode(2, -1) };

    FeatureNode[] instance5 = { new FeatureNode(1, 1), new FeatureNode(2, -0.1) };
    FeatureNode[] instance6 = { new FeatureNode(1, -0.1), new FeatureNode(2, 1) };
    FeatureNode[] instance7 = { new FeatureNode(1, -0.1), new FeatureNode(2, -0.1) };

    FeatureNode[][] testSetWithUnknown = {
        instance5,
        instance6,
        instance7
    };

    FeatureNode[][] trainingSetWithUnknown = {
        instance1,
        instance2,
        instance3,
        instance4
    };

    problem.x = trainingSetWithUnknown;

    Model m = Linear.train(problem, param);

    for (int i = 0; i < trainingSetWithUnknown.length; i++)
        System.out.println(" Train.instance =  " + i + " =>  " + Linear.predict(m, trainingSetWithUnknown[i]));

    System.out.println("---------------------");

    for (int i = 0; i < testSetWithUnknown.length; i++)
        System.out.println(" Test.instance =  " + i + " =>  " + Linear.predict(m, testSetWithUnknown[i]));

Here is the output:

    iter  1 act 1.778e+00 pre 1.778e+00 delta 6.285e-01 f 4.000e+00 |g| 5.657e+00 CG   1
     Train.instance =  0 =>  1.0
     Train.instance =  1 =>  0.0
     Train.instance =  2 =>  0.0
     Train.instance =  3 =>  0.0
    ---------------------
     Test.instance =  0 =>  1.0
     Test.instance =  1 =>  1.0
     Test.instance =  2 =>  0.0

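What I need is a probability prediction, not an integer (hard) prediction. The command line has a -b option, but I cannot find any way to get the same behavior by calling a function directly in code. I also looked at the source (https://github.com/bwaldvogel/liblinear-java/blob/master/src/main/java/de/bwaldvogel/liblinear/Predict.java); apparently there is no option for probability prediction when the library is used directly from code. Is that correct?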

Update: I ended up using the liblinear code from https://github.com/bwaldvogel/liblinear-java. In Predict.java, I changed

    private static boolean       flag_predict_probability = true;

to

    private static boolean       flag_predict_probability = false;

and used

    SolverType.L2R_LR

but I still get integer classes. Any ideas?

Answer:

To get probabilities, the code needs to be changed. The prediction is made in the following function in Linear.java:

    public static double predictValues(Model model, Feature[] x, double[] dec_values) {

You need to change

    if (model.nr_class == 2) {
        System.out.println("Two classes ");
        if (model.solverType.isSupportVectorRegression()) {
            System.out.println("Support vector");
            return dec_values[0];
        } else {
            System.out.println("Not Support vector");
            return (dec_values[0] > 0) ? model.label[0] : model.label[1];
        }
    }

to

    if (model.nr_class == 2) {
        System.out.println("Two classes ");
        if (model.solverType.isSupportVectorRegression()) {
            System.out.println("Support vector");
            return dec_values[0];
        } else {
            System.out.println("Not Support vector");
            return dec_values[0];
        }
    }

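Note that the output is still not a probability; it is the linear combination of the weights and the feature values. Passing it through a softmax function (with two classes this is equivalent to the logistic sigmoid, 1 / (1 + exp(-value))) maps it into [0, 1], where it can be read as a probability.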
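As a rough illustration of that last step, here is a minimal, self-contained sketch that turns a decision value into a probability for the two-class case. It does not call liblinear at all: the class name, the helper methods, and the weight values are hypothetical, and the sketch assumes no bias term. Following the original branch `(dec_values[0] > 0) ? model.label[0] : model.label[1]`, a positive decision value points at `model.label[0]`, so the sigmoid output is read here as the probability of that first label.

```java
import java.util.Locale;

// Hypothetical helper, not part of liblinear-java.
class DecisionValueToProbability {

    // Logistic (sigmoid) function: maps any real decision value into (0, 1).
    static double sigmoid(double decisionValue) {
        return 1.0 / (1.0 + Math.exp(-decisionValue));
    }

    // Decision value of a linear model: the dot product of weights and features.
    // Assumption for this sketch: dense arrays of equal length and no bias term.
    static double decisionValue(double[] weights, double[] features) {
        double sum = 0.0;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * features[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        // Made-up weights for illustration only; they do not come from a trained model.
        double[] weights = {0.8, 0.8};
        double[] features = {1.0, -0.1};

        double dec = decisionValue(weights, features);
        double pFirstLabel = sigmoid(dec);          // probability of model.label[0]
        double pSecondLabel = 1.0 - pFirstLabel;    // probability of model.label[1]

        System.out.printf(Locale.ROOT, "decision value  = %.4f%n", dec);
        System.out.printf(Locale.ROOT, "P(first label)  = %.4f%n", pFirstLabel);
        System.out.printf(Locale.ROOT, "P(second label) = %.4f%n", pSecondLabel);
    }
}
```

The sigmoid is only meaningful as a probability when the model was trained with a logistic regression solver; applying it to SVM decision values gives a number in (0, 1) but not a calibrated probability. It may also be worth checking whether your version of liblinear-java exposes `Linear.predictProbability(model, x, prob_estimates)`, which Predict.java appears to call when `flag_predict_probability` is set; if it does, patching `predictValues` may not be necessary.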

Also, make sure to select logistic regression as the solver:

    SolverType.L2R_LR
