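Inconsistent "best tune" and "resampling results across tuning parameters" in the caret R package

I am trying to build a model with caret using a tuning grid: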

svmGrid <- expand.grid(C = c(0.0001,0.001,0.01,0.1,1,10,20,30,40,50,100))

and then again with a subset of that grid:

svmGrid <- expand.grid(C = c(0.0001,0.001,0.01,0.1,1,10,20,30,40,50))

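The problem is that I get a different "best tune" and different "resampling results across tuning parameters", even though the C value selected with the first grid is also present in the second grid.

I see the same kind of discrepancy when I change the sampling option or the summaryFunction option in trainControl().

Needless to say, because a different model is selected each time, the predictions on the test set change as well.

Does anyone know why this happens?

Reproducible dataset: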

library(caret)
library(doMC)
registerDoMC(cores = 8)
set.seed(2969)
imbal_train <- twoClassSim(100, intercept = -20, linearVars = 20)
imbal_test  <- twoClassSim(100, intercept = -20, linearVars = 20)
table(imbal_train$Class)

Run with the first tuning grid:

svmGrid <- expand.grid(C = c(0.0001,0.001,0.01,0.1,1,10,20,30,40,50,100))
up_fitControl = trainControl(method = "cv", number = 10, savePredictions = TRUE,
                             allowParallel = TRUE, sampling = "up", seeds = NA)
set.seed(5627)
up_inside <- train(Class ~ ., data = imbal_train,
                   method = "svmLinear",
                   trControl = up_fitControl,
                   tuneGrid = svmGrid,
                   scale = FALSE)
up_inside

Output of the first run:

> up_inside
Support Vector Machines with Linear Kernel
100 samples
 25 predictors
  2 classes: 'Class1', 'Class2'
No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 90, 91, 90, 90, 89, 90, ...
Addtional sampling using up-sampling
Resampling results across tuning parameters:
  C      Accuracy   Kappa         Accuracy SD  Kappa SD
  1e-04  0.7734343   0.252201364  0.1227632    0.3198165
  1e-03  0.8225253   0.396439198  0.1245455    0.3626456
  1e-02  0.7595960   0.116150973  0.1431780    0.3046825
  1e-01  0.7686869   0.051430454  0.1167093    0.2712062
  1e+00  0.7695960  -0.004261294  0.1162279    0.2190151
  1e+01  0.7093939   0.111852756  0.2030250    0.3810059
  2e+01  0.7195960   0.040458804  0.1932690    0.2580560
  3e+01  0.7195960   0.040458804  0.1932690    0.2580560
  4e+01  0.7195960   0.040458804  0.1932690    0.2580560
  5e+01  0.7195960   0.040458804  0.1932690    0.2580560
  1e+02  0.7195960   0.040458804  0.1932690    0.2580560
Accuracy was used to select the optimal model using  the largest value.
The final value used for the model was C = 0.001.

Run with the second tuning grid:

svmGrid <- expand.grid(C = c(0.0001,0.001,0.01,0.1,1,10,20,30,40,50))
up_fitControl = trainControl(method = "cv", number = 10, savePredictions = TRUE,
                             allowParallel = TRUE, sampling = "up", seeds = NA)
set.seed(5627)
up_inside <- train(Class ~ ., data = imbal_train,
                   method = "svmLinear",
                   trControl = up_fitControl,
                   tuneGrid = svmGrid,
                   scale = FALSE)
up_inside

Output of the second run:

> up_inside
Support Vector Machines with Linear Kernel
100 samples
 25 predictors
  2 classes: 'Class1', 'Class2'
No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 90, 91, 90, 90, 89, 90, ...
Addtional sampling using up-sampling
Resampling results across tuning parameters:
  C      Accuracy   Kappa         Accuracy SD  Kappa SD
  1e-04  0.8125253   0.392165694  0.13043060   0.3694786
  1e-03  0.8114141   0.375569633  0.12291273   0.3549978
  1e-02  0.7995960   0.205413345  0.06734882   0.2662161
  1e-01  0.7495960   0.017139266  0.09742161   0.2270128
  1e+00  0.7695960  -0.004261294  0.11622791   0.2190151
  1e+01  0.7093939   0.111852756  0.20302503   0.3810059
  2e+01  0.7195960   0.040458804  0.19326904   0.2580560
  3e+01  0.7195960   0.040458804  0.19326904   0.2580560
  4e+01  0.7195960   0.040458804  0.19326904   0.2580560
  5e+01  0.7195960   0.040458804  0.19326904   0.2580560
Accuracy was used to select the optimal model using  the largest value.
The final value used for the model was C = 1e-04.

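Answer:

If you do not supply seeds, caret picks them for you. Because your two grids have different lengths, the seeds assigned to each fold come out slightly different.

Below I have pasted an example. I only renamed your second model to up_inside2 so the outputs are easier to compare: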

> up_inside$control$seeds[[1]]
 [1] 825016 802597 128276 935565 324036 188187 284067  58853 923008 995461  60759
> up_inside2$control$seeds[[1]]
 [1] 825016 802597 128276 935565 324036 188187 284067  58853 923008 995461
> up_inside$control$seeds[[2]]
 [1] 966837 256990 592077 291736 615683 390075 967327 349693  73789 155739 916233
# note that the first seed here is the same as the last seed of the first model's first fold
> up_inside2$control$seeds[[2]]
 [1]  60759 966837 256990 592077 291736 615683 390075 967327 349693  73789
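The one-position shift in the listing above can be mimicked in base R. This is only a sketch of the pattern the output suggests (one pool of random integers cut into per-fold chunks as long as the grid). `make_fold_seeds` is a hypothetical helper and not a caret function, and the values it produces will not match the seeds printed above.

```r
# Hypothetical helper (not part of caret): draw one pool of integers and cut it
# into one chunk per resample, each chunk as long as the tuning grid, plus one
# extra seed for the final model.
make_fold_seeds <- function(n_grid, n_resamples = 10L, rng_seed = 1L) {
  set.seed(rng_seed)
  pool <- sample.int(1000000L, n_resamples * n_grid + 1L)
  starts <- seq(1L, n_resamples * n_grid, by = n_grid)
  chunks <- lapply(starts, function(i) pool[i:(i + n_grid - 1L)])
  c(chunks, list(pool[length(pool)]))
}

seeds_11 <- make_fold_seeds(11L)  # grid that includes C = 100
seeds_10 <- make_fold_seeds(10L)  # grid without C = 100

# Fold 1: the ten shared grid rows receive the same seeds in both runs.
print(identical(seeds_11[[1]][1:10], seeds_10[[1]]))

# Fold 2: the shorter grid starts one position earlier in the pool, so its
# first seed is the seed that the longer grid used for its 11th row in fold 1.
print(seeds_10[[2]][1] == seeds_11[[1]][11])
```

From the second fold onward, every shared C value is therefore fitted under a different seed in the two runs, which presumably changes the random up-sampling inside each fold and hence the resampled accuracy.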

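If you now set your own seeds, so that every C value the two grids share gets the same seed in each fold, you get the same output: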

# seeds for your first train (pass via trainControl(seeds = myseeds))
myseeds <- list(c(1:10, 1000), c(11:20, 2000), c(21:30, 3000), c(31:40, 4000), c(41:50, 5000),
                c(51:60, 6000), c(61:70, 7000), c(71:80, 8000), c(81:90, 9000), c(91:100, 10000), c(343))
# seeds for your second train (pass via trainControl(seeds = myseeds2))
myseeds2 <- list(c(1:10), c(11:20), c(21:30), c(31:40), c(41:50), c(51:60),
                 c(61:70), c(71:80), c(81:90), c(91:100), c(343))

> up_inside
Support Vector Machines with Linear Kernel
100 samples
 25 predictor
  2 classes: 'Class1', 'Class2'
No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 90, 91, 90, 90, 89, 90, ...
Addtional sampling using up-sampling
Resampling results across tuning parameters:
  C      Accuracy   Kappa
  1e-04  0.7714141  0.239823027
  1e-03  0.7914141  0.332834590
  1e-02  0.7695960  0.207000745
  1e-01  0.7786869  0.103957926
  1e+00  0.7795960  0.006849817
  1e+01  0.7093939  0.111852756
  2e+01  0.7195960  0.040458804
  3e+01  0.7195960  0.040458804
  4e+01  0.7195960  0.040458804
  5e+01  0.7195960  0.040458804
  1e+02  0.7195960  0.040458804
Accuracy was used to select the optimal model using  the largest value.
The final value used for the model was C = 0.001.

> up_inside2
Support Vector Machines with Linear Kernel
100 samples
 25 predictor
  2 classes: 'Class1', 'Class2'
No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 90, 91, 90, 90, 89, 90, ...
Addtional sampling using up-sampling
Resampling results across tuning parameters:
  C      Accuracy   Kappa
  1e-04  0.7714141  0.239823027
  1e-03  0.7914141  0.332834590
  1e-02  0.7695960  0.207000745
  1e-01  0.7786869  0.103957926
  1e+00  0.7795960  0.006849817
  1e+01  0.7093939  0.111852756
  2e+01  0.7195960  0.040458804
  3e+01  0.7195960  0.040458804
  4e+01  0.7195960  0.040458804
  5e+01  0.7195960  0.040458804
Accuracy was used to select the optimal model using  the largest value.
The final value used for the model was C = 0.001.
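For completeness, here is a rough sketch of how such seed lists might be wired into the calls from the question. It assumes caret and kernlab are installed. `make_seeds` and `fit_with_grid` are hypothetical helpers introduced only for illustration, and the seed values are arbitrary. What matters is that row j of the grid gets the same seed in every fold, whichever grid is used.

```r
library(caret)  # method = "svmLinear" additionally needs the kernlab package

set.seed(2969)
imbal_train <- twoClassSim(100, intercept = -20, linearVars = 20)

# Hypothetical helper: one integer vector per resample (length = grid rows),
# plus a single seed for the final model, as trainControl(seeds = ) expects.
make_seeds <- function(n_grid, n_resamples = 10L, final_seed = 343L) {
  per_fold <- lapply(seq_len(n_resamples), function(b) {
    (b - 1L) * 1000L + seq_len(n_grid)  # row j in fold b gets the same seed for any grid size
  })
  c(per_fold, list(final_seed))
}

# Hypothetical wrapper around the train() call from the question.
fit_with_grid <- function(c_values) {
  grid <- expand.grid(C = c_values)
  ctrl <- trainControl(method = "cv", number = 10, savePredictions = TRUE,
                       sampling = "up", seeds = make_seeds(nrow(grid)))
  set.seed(5627)  # same fold assignment for every call
  train(Class ~ ., data = imbal_train,
        method = "svmLinear",
        trControl = ctrl,
        tuneGrid = grid,
        scale = FALSE)  # kept as in the question
}

c_short <- c(0.0001, 0.001, 0.01, 0.1, 1, 10, 20, 30, 40, 50)
up_inside  <- fit_with_grid(c(c_short, 100))
up_inside2 <- fit_with_grid(c_short)

# Compare resampled accuracy on the ten C values both grids share.
print(all.equal(up_inside$results$Accuracy[1:10], up_inside2$results$Accuracy))
```

Building the seeds from the grid size, rather than typing them out, keeps the list valid when the grid changes. caret expects one vector per resample with as many entries as there are grid rows.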
