I have recently been following some tutorials on how to use `GraphLearner` in mlr3, but I am still confused by the tuning results of a `GraphLearner` with branches. I set up a simple example; here is my code:
```r
# Create a Graph with branching
graph_branch <- po("branch", c("nop", "pca", "scale"), id = "preprocess_branch") %>>%
  gunion(list(
    po("nop"),
    po("pca", id = "pca1"),
    po("scale") %>>% po("pca", id = "pca2")
  )) %>>%
  po("unbranch", id = "preprocess_unbranch") %>>%
  po("branch", c("classif.kknn", "classif.featureless"), id = "lrn_branch") %>>%
  gunion(list(
    lrn("classif.kknn", predict_type = "prob"),
    lrn("classif.featureless", predict_type = "prob")
  )) %>>%
  po("unbranch", id = "lrn_unbranch")

# Convert the graph to a learner
graph_branch_lrn <- as_learner(graph_branch)
graph_branch_lrn$graph$plot()
```
```r
# Set the tuning grid
tune_grid <- ParamSet$new(list(
  ParamFct$new("preprocess_branch.selection", levels = c("nop", "pca", "scale")),
  ParamInt$new("pca1.rank.", lower = 1, upper = 10),
  ParamInt$new("pca2.rank.", lower = 1, upper = 10),
  ParamFct$new("lrn_branch.selection", levels = c("classif.kknn", "classif.featureless")),
  ParamInt$new("classif.kknn.k", lower = 1, upper = 10)
))

# Set the tuning instance
instance_rs <- TuningInstanceSingleCrit$new(
  task = task_train,
  learner = graph_branch_lrn,
  resampling = rsmp("cv", folds = 5),
  measure = msr("classif.auc"),
  search_space = tune_grid,
  terminator = trm("evals", n_evals = 20)
)

# Random search tuning
tuner_rs <- tnr("random_search")
plan(multisession, workers = 5)
set.seed(100)
tuner_rs$optimize(instance_rs)
plan(sequential)
```
The best tuning result is:
```r
# Check the result
instance_rs$result_learner_param_vals

$preprocess_branch.selection
[1] "nop"

$scale.robust
[1] FALSE

$lrn_branch.selection
[1] "classif.kknn"

$classif.featureless.method
[1] "mode"

$pca1.rank.
[1] 9

$pca2.rank.
[1] 9

$classif.kknn.k
[1] 9
```
I would like to know why the tuning results for `pca1.rank.` and `pca2.rank.` appear when the "nop" branch was selected. I had assumed that tuning a `GraphLearner` with branches would report the best result according to the selected branch, i.e. if the "nop" branch is chosen, the parameters of the other branches would not be considered and would not appear in the output. Am I misunderstanding the tuning mechanism of `GraphLearner`, or is there a problem with my code?
Answer:
You are not doing anything wrong; the output simply shows the best values found for all hyperparameters. Not all of them are relevant: since the "nop" branch was selected, you can simply ignore the values of `scale`, `pca1`, and `pca2`.
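As a side note, if you want the tuner to only sample branch-specific parameters when their branch is actually active, you can declare dependencies in the search space. The sketch below assumes the `ParamSet$add_dep()` / `CondEqual` API from the paradox package (matching the `ParamSet$new()` style used in the question); it is an illustration, not a definitive fix:

```r
library(paradox)

# Same search space as in the question, but with dependencies added.
tune_grid <- ParamSet$new(list(
  ParamFct$new("preprocess_branch.selection", levels = c("nop", "pca", "scale")),
  ParamInt$new("pca1.rank.", lower = 1, upper = 10),
  ParamInt$new("pca2.rank.", lower = 1, upper = 10),
  ParamFct$new("lrn_branch.selection", levels = c("classif.kknn", "classif.featureless")),
  ParamInt$new("classif.kknn.k", lower = 1, upper = 10)
))

# pca1.rank. is only meaningful when the "pca" branch is chosen,
# pca2.rank. only when the "scale" branch (scale %>>% pca2) is chosen,
# and classif.kknn.k only when the kknn learner branch is chosen.
tune_grid$add_dep("pca1.rank.", "preprocess_branch.selection", CondEqual$new("pca"))
tune_grid$add_dep("pca2.rank.", "preprocess_branch.selection", CondEqual$new("scale"))
tune_grid$add_dep("classif.kknn.k", "lrn_branch.selection", CondEqual$new("classif.kknn"))
```

With these dependencies, configurations drawn by random search leave the dependent parameters unset whenever their branch is inactive, so the result table makes the branch structure explicit rather than reporting values that were never used.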