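I built a decision tree model in R, and I want to add a new column that contains "yes" whenever the predicted probability is above 50%.
Note: the target column in the dataset is boolean; 1 means heart disease and 0 means normal.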
library(rpart)
tree <- rpart(target ~ ., method = 'class', data = train)
print(summary(tree))
tree.preds <- predict(tree, test)
print(head(tree.preds))
tree.preds <- as.data.frame(tree.preds)
joiner <- function(x) {
  if (x >= 0.5) return('yes') else return('no')
}
tree.preds$disease <- sapply(tree.preds$yes, joiner)
print(head(tree.preds))
Running it produces the following error:
Error in `$<-.data.frame`(`*tmp*`, t, value = list()) : replacement has 0 rows, data has 91
Answer:
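The error arises because tree.preds has no column named yes: for a classification tree, predict returns one probability column per class level (here 0 and 1), so tree.preds$yes is NULL and sapply returns an empty list. Select the probability column by position instead, and use ifelse rather than iterating with sapply: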
library(rpart)
dat = iris[,-5]
dat$target = as.numeric(iris$Species=="versicolor")
idx = sample(nrow(dat),100)
train = dat[idx,]
test = dat[-idx,]
tree = rpart(target ~ .,method ='class', data=train)
tree.preds = data.frame(predict(tree,test))
tree.preds$Species = ifelse(tree.preds[,2]>0.5,"yes","no")
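
To keep the original column name (disease) and threshold (>= 0.5), it may help to look at the column names that predict actually produces. The following is a minimal sketch, not a definitive fix. It assumes the same iris stand-in data as above, since the heart disease dataset is not shown, and the set.seed call is added only to make the split reproducible.

```r
library(rpart)

# Stand-in data (assumption): iris with a 0/1 target, as in the example above
set.seed(1)
dat <- iris[, -5]
dat$target <- as.numeric(iris$Species == "versicolor")
idx <- sample(nrow(dat), 100)
train <- dat[idx, ]
test <- dat[-idx, ]

tree <- rpart(target ~ ., method = "class", data = train)

# For a class tree, predict() returns a probability matrix by default,
# with one column per class level, here "0" and "1"
tree.preds <- as.data.frame(predict(tree, test))
print(colnames(tree.preds))

# There is no column called "yes", so $yes is NULL;
# sapply() over NULL yields an empty list, which triggers the error
print(is.null(tree.preds$yes))

# Label each row from the probability of class 1 (vectorised, no sapply)
tree.preds$disease <- ifelse(tree.preds[["1"]] >= 0.5, "yes", "no")
print(head(tree.preds))

# Alternative: ask rpart for the predicted class directly
pred.class <- predict(tree, test, type = "class")
print(head(pred.class))
```

Note that as.data.frame keeps the column names 0 and 1, whereas data.frame(...) renames them to X0 and X1, so selecting by position with [, 2] works in both cases.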