Consider the following usage of liblinear (http://liblinear.bwaldvogel.de/):
import de.bwaldvogel.liblinear.FeatureNode;
import de.bwaldvogel.liblinear.Linear;
import de.bwaldvogel.liblinear.Model;
import de.bwaldvogel.liblinear.Parameter;
import de.bwaldvogel.liblinear.Problem;
import de.bwaldvogel.liblinear.SolverType;

double C = 1.0;    // cost of constraint violation
double eps = 0.01; // stopping criterion

Parameter param = new Parameter(SolverType.L2R_L2LOSS_SVC, C, eps);
Problem problem = new Problem();

// Class labels of the four training instances
double[] GROUPS_ARRAY = {1, 0, 0, 0};
problem.y = GROUPS_ARRAY;

int NUM_OF_TS_EXAMPLES = 4;
problem.l = NUM_OF_TS_EXAMPLES; // number of training examples
problem.n = 2;                  // number of features

FeatureNode[] instance1 = { new FeatureNode(1, 1), new FeatureNode(2, 1) };
FeatureNode[] instance2 = { new FeatureNode(1, -1), new FeatureNode(2, 1) };
FeatureNode[] instance3 = { new FeatureNode(1, -1), new FeatureNode(2, -1) };
FeatureNode[] instance4 = { new FeatureNode(1, 1), new FeatureNode(2, -1) };
FeatureNode[] instance5 = { new FeatureNode(1, 1), new FeatureNode(2, -0.1) };
FeatureNode[] instance6 = { new FeatureNode(1, -0.1), new FeatureNode(2, 1) };
FeatureNode[] instance7 = { new FeatureNode(1, -0.1), new FeatureNode(2, -0.1) };

FeatureNode[][] testSetWithUnknown = { instance5, instance6, instance7 };
FeatureNode[][] trainingSetWithUnknown = { instance1, instance2, instance3, instance4 };
problem.x = trainingSetWithUnknown;

Model m = Linear.train(problem, param);

for (int i = 0; i < trainingSetWithUnknown.length; i++) {
    System.out.println(" Train.instance = " + i + " => " + Linear.predict(m, trainingSetWithUnknown[i]));
}
System.out.println("---------------------");
for (int i = 0; i < testSetWithUnknown.length; i++) {
    System.out.println(" Test.instance = " + i + " => " + Linear.predict(m, testSetWithUnknown[i]));
}
Here is the output:
iter 1 act 1.778e+00 pre 1.778e+00 delta 6.285e-01 f 4.000e+00 |g| 5.657e+00 CG 1
 Train.instance = 0 => 1.0
 Train.instance = 1 => 0.0
 Train.instance = 2 => 0.0
 Train.instance = 3 => 0.0
---------------------
 Test.instance = 0 => 1.0
 Test.instance = 1 => 1.0
 Test.instance = 2 => 0.0
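I need probability predictions rather than integer (hard) predictions. The command line has a -b option, but I can't find any way to use that functionality directly from code. I also looked at the source (https://github.com/bwaldvogel/liblinear-java/blob/master/src/main/java/de/bwaldvogel/liblinear/Predict.java); apparently there is no option for probability prediction when calling the library directly from code. Is that correct?
UPDATE: I ended up using the liblinear code from https://github.com/bwaldvogel/liblinear-java. In the Predict.java file, I changed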
private static boolean flag_predict_probability = true;
to
private static boolean flag_predict_probability = false;
and used
SolverType.L2R_LR
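but I still get integer classes. Any ideas?
Answer:
To get probabilities, you need to change the code. Prediction is performed in the following function in Linear.java: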
public static double predictValues(Model model, Feature[] x, double[] dec_values) {
Change
if (model.nr_class == 2) {
    System.out.println("Two classes ");
    if (model.solverType.isSupportVectorRegression()) {
        System.out.println("Support vector");
        return dec_values[0];
    } else {
        System.out.println("Not Support vector");
        return (dec_values[0] > 0) ? model.label[0] : model.label[1];
    }
}
to
if (model.nr_class == 2) {
    System.out.println("Two classes ");
    if (model.solverType.isSupportVectorRegression()) {
        System.out.println("Support vector");
        return dec_values[0];
    } else {
        System.out.println("Not Support vector");
        return dec_values[0]; // return the raw decision value instead of the class label
    }
}
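Note that the output is still not a probability; it is the linear combination of the weights and the feature values (the decision value). If you pass it through a softmax function (which, for two classes, reduces to the logistic sigmoid 1 / (1 + e^(-x))), it becomes a probability in [0, 1].
Also, make sure to select logistic regression: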
SolverType.L2R_LR
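The conversion from a decision value to a probability described above can be sketched in plain Java. This is a minimal illustration, not liblinear code: the class DecisionToProbability and its methods are hypothetical names, and the input values are made up. It assumes a two-class model in which a positive decision value favors model.label[0], as in the original predictValues branch.

```java
/** Hypothetical helper for illustration; not part of liblinear-java. */
final class DecisionToProbability {

    /** Logistic sigmoid, written so that Math.exp never overflows for large |z|. */
    static double sigmoid(double z) {
        if (z >= 0) {
            return 1.0 / (1.0 + Math.exp(-z));
        }
        double e = Math.exp(z);
        return e / (1.0 + e);
    }

    /**
     * Converts one binary decision value into two probabilities:
     * index 0 is for the label chosen when the decision value is positive,
     * index 1 is for the other label.
     */
    static double[] toProbabilities(double decisionValue) {
        double p = sigmoid(decisionValue);
        return new double[] { p, 1.0 - p };
    }

    public static void main(String[] args) {
        // Made-up decision values, standing in for what the patched predictValues returns.
        double[] decisionValues = { -2.0, 0.0, 2.0 };
        for (double z : decisionValues) {
            double[] p = toProbabilities(z);
            System.out.printf("z = %5.2f -> P(label[0]) = %.4f, P(label[1]) = %.4f%n", z, p[0], p[1]);
        }
    }
}
```

In practice you would pass the value returned by the patched predictValues to toProbabilities. It may also be worth checking whether your copy of Linear.java already contains a predictProbability method (this depends on the version you have); if it does, it likely performs this conversion for logistic-regression models without patching the source.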