Two-level stacked learner (ensemble model) combining elastic net and logistic regression using mlr3

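I am trying to solve a common problem in medicine: combining a prediction model with another source of information, such as an expert's opinion (which is sometimes heavily emphasized in medicine). In this post, that second source is called the superdoc predictor.

This can be done by stacking a model with a logistic regression that also takes the expert's opinion as input, as described on page 26 of this paper:

Afshar P, Mohammadi A, Plataniotis KN, Oikonomou A, Benali H. From Handcrafted to Deep-Learning-Based Cancer Radiomics: Challenges and Opportunities. IEEE Signal Process Mag 2019; 36: 132–60.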
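For orientation, the stacked model fitted in the code further down can be written as a logistic regression on two inputs. This is only my reading of that code, not a formula taken from the cited paper, and the symbols are introduced here for illustration: $s$ is the superdoc score, $x_j$ are the clinical predictors, $\beta$ are the elastic net coefficients and $\gamma$ are the coefficients of the second-level logistic regression.

$$
\operatorname{logit} P(\text{diabetes} = \text{pos} \mid x, s) = \gamma_0 + \gamma_1 s + \gamma_2\,\eta(x), \qquad \eta(x) = \beta_0 + \sum_j \beta_j x_j
$$

When $\eta(x)$ is computed on the same rows that were used to estimate $\beta$, the second level sees overly optimistic inputs, which is why out-of-fold values of $\eta(x)$ are preferred.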

I have tried this here, without accounting for overfitting (I did not use out-of-fold predictions for the lower-level learner):

Example data

    # Libraries
    library(tidyverse)
    library(caret)
    library(glmnet)
    library(mlbench)
    library(broom) # provides tidy(), used below
    library(pROC)  # provides roc(), used below

    # Get example data
    data(PimaIndiansDiabetes, package = "mlbench")
    data <- PimaIndiansDiabetes

    # Add the superdoc opinion to the dataset
    set.seed(2323)
    data %>%
      rowwise() %>%
      mutate(superdoc = case_when(diabetes == "pos" ~ as.numeric(sample(0:2, 1)), TRUE ~ 0)) -> data

    # Split the data into training and test sets
    train.data <- data[1:550, ]
    test.data <- data[551:768, ]

Stacked model without out-of-fold predictions:

    # Elastic net regression (without the superdoc opinion)
    set.seed(2323)
    model <- train(
      diabetes ~ ., data = train.data %>% select(-superdoc), method = "glmnet",
      trControl = trainControl("repeatedcv",
                               number = 10,
                               repeats = 10,
                               classProbs = TRUE,
                               savePredictions = TRUE,
                               summaryFunction = twoClassSummary),
      tuneLength = 10,
      metric = "ROC" # the ROC metric is provided by twoClassSummary
    )

    # Extract the coefficients at the best alpha and lambda
    coef(model$finalModel, model$finalModel$lambdaOpt) -> coeffs
    tidy(coeffs) %>% tibble() -> coeffs
    coef.interc = coeffs %>% filter(row == "(Intercept)") %>% pull(value)
    coef.pregnant = coeffs %>% filter(row == "pregnant") %>% pull(value)
    coef.glucose = coeffs %>% filter(row == "glucose") %>% pull(value)
    coef.pressure = coeffs %>% filter(row == "pressure") %>% pull(value)
    coef.mass = coeffs %>% filter(row == "mass") %>% pull(value)
    coef.pedigree = coeffs %>% filter(row == "pedigree") %>% pull(value)
    coef.age = coeffs %>% filter(row == "age") %>% pull(value)

    # Combine the model with the superdoc opinion in a logistic regression
    finalmodel = glm(diabetes ~ superdoc + I(coef.interc + coef.pregnant*pregnant + coef.glucose*glucose + coef.pressure*pressure + coef.mass*mass + coef.pedigree*pedigree + coef.age*age),
                     family = binomial, data = train.data)

    # Predict on the test data
    predict(finalmodel, test.data, type = "response") -> predictions

    # Check the AUC of the model on the test data
    roc(test.data$diabetes, predictions, ci = TRUE)
    #> Setting levels: control = neg, case = pos
    #> Setting direction: controls < cases
    #>
    #> Call:
    #> roc.default(response = test.data$diabetes, predictor = predictions,     ci = TRUE)
    #>
    #> Data: predictions in 145 controls (test.data$diabetes neg) < 73 cases (test.data$diabetes pos).
    #> Area under the curve: 0.9345
    #> 95% CI: 0.8969-0.9721 (DeLong)
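To make concrete what "out-of-fold predictions for the lower-level learner" would mean here, the following is a minimal sketch in plain glmnet, not the approach used in the post. It assumes a fixed `alpha = 0.5` instead of the tuned value from caret, uses 5 folds, and rebuilds the superdoc column in base R, so the random draws differ from the code above. The names `dat`, `enet`, `fold` and `meta` are illustrative.

```r
library(glmnet)
library(mlbench)

data(PimaIndiansDiabetes, package = "mlbench")
dat <- PimaIndiansDiabetes

# Simulated expert opinion, analogous to the superdoc column in the post
# (assumption: vectorised base R version, so the draws are not identical)
set.seed(2323)
dat$superdoc <- ifelse(dat$diabetes == "pos",
                       sample(0:2, nrow(dat), replace = TRUE), 0)

train <- dat[1:550, ]
test  <- dat[551:768, ]
features <- setdiff(names(dat), c("diabetes", "superdoc"))
x_train <- as.matrix(train[, features])
x_test  <- as.matrix(test[, features])

# Level 0: out-of-fold linear predictor of the elastic net.
# Each training row is scored by a model that never saw that row.
k <- 5
fold <- sample(rep(seq_len(k), length.out = nrow(train)))
train$enet <- NA_real_
for (i in seq_len(k)) {
  held_out <- fold == i
  fit <- cv.glmnet(x_train[!held_out, ], train$diabetes[!held_out],
                   family = "binomial", alpha = 0.5) # alpha is an assumption
  train$enet[held_out] <- as.numeric(
    predict(fit, newx = x_train[held_out, ], s = "lambda.min", type = "link"))
}

# Level 1: logistic regression on superdoc plus the out-of-fold predictor
meta <- glm(diabetes ~ superdoc + enet, family = binomial, data = train)

# For new data, the elastic net is refitted on all training rows
full_fit <- cv.glmnet(x_train, train$diabetes, family = "binomial", alpha = 0.5)
test$enet <- as.numeric(
  predict(full_fit, newx = x_test, s = "lambda.min", type = "link"))
pred <- predict(meta, newdata = test, type = "response")
```

Because superdoc is nonzero only for positive cases in this simulated data, `glm()` may warn about fitted probabilities of 0 or 1; that comes from how the example data is built rather than from the stacking itself.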

Now I would like to account for out-of-fold predictions using the mlr3 family of packages, following this very helpful post: Tuning a stacked learner

    # Libraries
    library(mlr3)
    library(mlr3learners)
    library(mlr3pipelines)
    library(mlr3filters)
    library(mlr3tuning)
    library(paradox)
    library(glmnet)

    # Create the elastic net regression
    glmnet_lrn = lrn("classif.cv_glmnet", predict_type = "prob")

    # Create out-of-fold predictions of the learner
    glmnet_cv1 = po("learner_cv", glmnet_lrn, id = "glmnet") # I could not find a setting to filter predictors (i.e. to not send the superdoc predictor here)

    # Put the steps together
    level0 = gunion(list(
      glmnet_cv1,
      po("nop", id = "only_superdoc_predictor"))) %>>% # I could not find a setting to send only the superdoc predictor to "union1"
      po("featureunion", id = "union1")

    # Final logistic regression
    log_reg_lrn = lrn("classif.log_reg", predict_type = "prob")

    # Combine the ensemble model
    ensemble = level0 %>>% log_reg_lrn
    ensemble$plot(html = FALSE)

Created on 2021-03-15 by the reprex package (v1.0.0)

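My questions (I am still fairly new to the `mlr3` package family)

Is the mlr3 package family well suited to the ensemble model I am trying to build?
If so, how can I finish the ensemble model and make predictions on test.data?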
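One possible way to route the features, given as a hedged sketch rather than a confirmed solution: `po("select")` with `selector_name()` and `selector_invert()` can keep or drop the superdoc column on each branch before the feature union. The sketch assumes a single task built from the full data with a base R version of the superdoc column (so the draws differ from the post), and uses row ids for the train/test split. The names `dat`, `task` and `ens_lrn` are illustrative.

```r
library(mlr3)
library(mlr3learners)
library(mlr3pipelines)
library(mlbench)

data(PimaIndiansDiabetes, package = "mlbench")
dat = PimaIndiansDiabetes

# Simulated expert opinion (assumption: base R variant of the post's code)
set.seed(2323)
dat$superdoc = ifelse(dat$diabetes == "pos",
                      sample(0:2, nrow(dat), replace = TRUE), 0)

task = as_task_classif(dat, target = "diabetes", positive = "pos", id = "pima")

# Branch 1: drop superdoc, then produce out-of-fold elastic net predictions
glmnet_cv1 = po("learner_cv",
                lrn("classif.cv_glmnet", predict_type = "prob"),
                id = "glmnet")
branch_model = po("select", id = "without_superdoc",
                  selector = selector_invert(selector_name("superdoc"))) %>>%
  glmnet_cv1

# Branch 2: pass only superdoc through
branch_superdoc = po("select", id = "only_superdoc",
                     selector = selector_name("superdoc"))

level0 = gunion(list(branch_model, branch_superdoc)) %>>%
  po("featureunion", id = "union1")

# Level 1: logistic regression on the combined features
ensemble = level0 %>>% lrn("classif.log_reg", predict_type = "prob")
ens_lrn = GraphLearner$new(ensemble)
ens_lrn$predict_type = "prob"

# Train on rows 1 to 550 and predict on rows 551 to 768, as in the post
ens_lrn$train(task, row_ids = 1:550)
prediction = ens_lrn$predict(task, row_ids = 551:768)
prediction$score(msr("classif.auc"))
```

With `predict_type = "prob"`, `learner_cv` may pass both class probabilities to the logistic regression. These are perfectly collinear, so one coefficient can come back as `NA` and a rank-deficiency warning may appear; dropping one of the two columns with a further `po("select")` would avoid that.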

