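Results of the mlr package's predict function in R disagree with predict on the underlying model

I am using the mlr package framework to build an SVM model that predicts land cover classes in an image. I used the raster package's predict function, and I also converted the raster to a data frame and predicted on that data frame with "learner.model" as the model input. Both approaches gave me realistic results.

This works:

> predict(raster, mod$learner.model)

or

> xy <- as.data.frame(raster, xy = T)
> C <- predict(mod$learner.model, xy)

However, if I predict on the data frame derived from the raster without specifying learner.model, the results are different.

> C2 <- predict(mod, newdata=xy)

C2$data$response is not identical to C. Why?

Here is an example that reproduces the problem: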

> library(mlr)
> library(kernlab)
> x1 <- rnorm(50)
> x2 <- rnorm(50, 3)
> x3 <- rnorm(50, -20, 3)
> C <- sample(c("a","b","c"), 50, T)
> d <-  data.frame(x1, x2, x3, C)
> classif <- makeClassifTask(id = "example", data = d, target = "C")
> lrn <- makeLearner("classif.ksvm", predict.type = "prob", fix.factors.prediction = T)
> t <- train(lrn, classif)
Using automatic sigma estimation (sigest) for RBF or laplace kernel
> res1 <- predict(t, newdata = data.frame(x2,x1,x3))
> res1
Prediction: 50 observations
predict.type: prob
threshold: a=0.33,b=0.33,c=0.33
time: 0.01
     prob.a    prob.b    prob.c response
1 0.2110131 0.3817773 0.4072095        c
2 0.1551583 0.4066868 0.4381549        c
3 0.4305353 0.3092737 0.2601910        a
4 0.2160050 0.4142465 0.3697485        b
5 0.1852491 0.3789849 0.4357659        c
6 0.5879579 0.2269832 0.1850589        a
> res2 <- predict(t$learner.model, data.frame(x2,x1,x3))
> res2
 [1] c c a b c a b a c c b c b a c b c a a b c b c c a b b b a a b a c b a c c c
[39] c a a b c b b b b a b b
Levels: a b c
> res1$data$response == res2
 [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE FALSE
[13]  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE
[25]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
[37]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[49]  TRUE FALSE

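The predictions do not agree. Going by mlr's tutorial page on prediction, I don't understand why the results would differ. Thanks for your help.

-----

Update: when I do the same thing with a random forest model, the two vectors are equal. Is this because SVM is scale-dependent and random forest is not?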

> library(randomForest)
> classif <- makeClassifTask(id = "example", data = d, target = "C")
> lrn <- makeLearner("classif.randomForest", predict.type = "prob", fix.factors.prediction = T)
> t <- train(lrn, classif)
> res1 <- predict(t, newdata = data.frame(x2,x1,x3))
> res1
Prediction: 50 observations
predict.type: prob
threshold: a=0.33,b=0.33,c=0.33
time: 0.00
  prob.a prob.b prob.c response
1  0.654  0.228  0.118        a
2  0.742  0.090  0.168        a
3  0.152  0.094  0.754        c
4  0.092  0.832  0.076        b
5  0.748  0.100  0.152        a
6  0.680  0.098  0.222        a
> res2 <- predict(t$learner.model, data.frame(x2,x1,x3))
> res2
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
 a  a  c  b  a  a  a  c  a  b  b  b  b  c  c  a  b  b  a  c  b  a  c  c  b  c
27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
 a  a  b  a  c  c  c  b  c  b  c  a  b  c  c  b  c  b  c  a  c  c  b  b
Levels: a b c
> res1$data$response == res2
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[16] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[31] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[46] TRUE TRUE TRUE TRUE TRUE

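-----

Another update: if I change predict.type from "prob" to "response", the two SVM prediction vectors agree. I will look into how these types differ. I had assumed that "prob" would give the same classes and simply add the probabilities. Maybe that is not the case?

Answer:

The answer is here:

Why are probabilities and response in ksvm in R not consistent?

In short, ksvm's type = "probabilities" gives different results from type = "response".

If I run

> res2 <- predict(t$learner.model, data.frame(x2,x1,x3), type = "probabilities")
> res2

then I get the same result as res1 above. The earlier res2 differed because type = "response" is the default when predict is called directly on the ksvm model.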
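To see the effect without mlr in between, here is a minimal sketch using kernlab directly. It assumes kernlab is installed. The seed, the factor() wrapper around the labels, and the names fit, by_response, by_prob and agree are my own choices for illustration and are not from the question. It compares the class that predict returns by default with the class that has the highest predicted probability.

```r
library(kernlab)

set.seed(1)  # arbitrary seed, only so the sketch is repeatable
n <- 50
d <- data.frame(
  x1 = rnorm(n),
  x2 = rnorm(n, 3),
  x3 = rnorm(n, -20, 3),
  C  = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)

# prob.model = TRUE asks ksvm to fit an additional probability model
# on top of the SVM decision values
fit <- ksvm(C ~ ., data = d, prob.model = TRUE)

newdata <- d[, c("x1", "x2", "x3")]

# Default prediction: type = "response", classes from the SVM decision itself
by_response <- predict(fit, newdata)

# Class probabilities from the separately fitted probability model
prob <- predict(fit, newdata, type = "probabilities")

# Class with the highest probability in each row
by_prob <- factor(colnames(prob)[max.col(prob, ties.method = "first")],
                  levels = levels(d$C))

# The two label vectors are not guaranteed to match
agree <- as.character(by_response) == as.character(by_prob)
table(agree)
```

How many rows disagree depends on the data and the seed. On well-separated classes the two will usually match far more often than on random labels like these.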

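Unfortunately, classifying the image based on the probabilities does not seem to work as well as using "response". Maybe the probabilities are still the best way to estimate the certainty of the classification?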
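One way to act on that idea, as a sketch rather than a recommendation: keep the "response" labels for the map and use the probability matrix only to derive a per-observation certainty. The example below uses base R only and the first three probability rows printed for res1 in the question. The names certainty and margin are hypothetical, and using the top probability or the gap between the top two classes as a certainty measure is my assumption, not something the linked answer prescribes.

```r
# First three rows of the probability output shown for res1 in the question
prob <- matrix(
  c(0.2110131, 0.3817773, 0.4072095,
    0.1551583, 0.4066868, 0.4381549,
    0.4305353, 0.3092737, 0.2601910),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("a", "b", "c"))
)

# Class with the highest probability in each row
label <- colnames(prob)[max.col(prob, ties.method = "first")]

# Certainty as the highest class probability per row
certainty <- unname(apply(prob, 1, max))

# Margin between the best and second-best class; small values flag
# observations where the classifier is close to undecided
margin <- unname(apply(prob, 1, function(p) {
  s <- sort(p, decreasing = TRUE)
  s[1] - s[2]
}))

data.frame(label, certainty, margin)
```

For a raster, the same columns could be computed on the data frame of pixels and written back as extra layers, leaving the class layer to come from "response".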
