I am running 10-fold cross-validation of Naive Bayes on test data with 2 classes (0 and 1). I followed the steps below, but got an error.
    data(testdata)
    attach(testdata)
    X <- subset(testdata, select=-Class)
    Y <- Class
    library(e1071)
    naive_bayes <- naiveBayes(X,Y)
    library(caret)
    library(klaR)
    nb_cv <- train(X, Y, method = "nb", trControl = trainControl(method = "cv", number = 10))
    ## Error:
    ## Error in train.default(X, Y, method = "nb", trControl = trainControl(number = 10)) :
    ## wrong model type for regression

    dput(testdata)
    structure(list(Feature.1 = 6.534088, Feature.2 = -19.050915,
        Feature.3 = 7.599378, Feature.4 = 5.093594, Feature.5 = -22.15166,
        Feature.6 = -7.478444, Feature.7 = -59.534652, Feature.8 = -1.587918,
        Feature.9 = -5.76889, Feature.10 = 95.810563, Feature.11 = 49.124086,
        Feature.12 = -21.101489, Feature.13 = -9.187984, Feature.14 = -10.53006,
        Feature.15 = -3.782506, Feature.16 = -10.805074, Feature.17 = 34.039509,
        Feature.18 = 5.64245, Feature.19 = 19.389724, Feature.20 = 16.450196,
        Class = 1L), .Names = c("Feature.1", "Feature.2", "Feature.3",
        "Feature.4", "Feature.5", "Feature.6", "Feature.7", "Feature.8",
        "Feature.9", "Feature.10", "Feature.11", "Feature.12", "Feature.13",
        "Feature.14", "Feature.15", "Feature.16", "Feature.17", "Feature.18",
        "Feature.19", "Feature.20", "Class"), class = "data.frame",
        row.names = c(NA, -1L))
Also, how do I calculate R-squared or AUC for this model?
Dataset: 10,000 records with 20 features and a binary class.
Answer:
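Naive Bayes is a classifier, so converting Y to a factor is the right way to fix this. Your original call handed a classification method a numeric outcome: Class is stored as an integer (0/1), so train treated the problem as regression, and method "nb" only supports classification. That is where "wrong model type for regression" comes from. A boolean outcome works with some classifiers, but for train a factor is the safe choice.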
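A minimal sketch of that fix, assuming the caret and klaR packages are installed. The original testdata is not available here, so the data frame below is a simulated, hypothetical stand-in with the same column names; the level labels "neg" and "pos" are likewise just illustrative.

```r
library(caret)  # train(), trainControl()
library(klaR)   # backend used by caret for method = "nb"

# Hypothetical stand-in for testdata: 20 numeric features plus an integer 0/1 Class
set.seed(1)
n <- 200
testdata <- as.data.frame(matrix(rnorm(n * 20), ncol = 20))
names(testdata) <- paste0("Feature.", 1:20)
testdata$Class <- as.integer(testdata$Feature.1 + rnorm(n) > 0)

X <- subset(testdata, select = -Class)

# The actual fix: make the outcome a factor so train() does classification
Y <- factor(testdata$Class, levels = c(0, 1), labels = c("neg", "pos"))

nb_cv <- train(X, Y, method = "nb",
               trControl = trainControl(method = "cv", number = 10))
print(nb_cv)  # cross-validated Accuracy and Kappa per tuning setting
```

Using the data frame column directly (testdata$Class) also avoids the need for attach().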
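As for R-squared, that metric applies only to regression problems, not to classification. Classification is evaluated with other metrics, such as precision and recall. AUC, which you also asked about, is one of those metrics and does apply to a binary classifier like this one.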
For more information on these metrics, see the Wikipedia article: http://en.wikipedia.org/wiki/Binary_classification
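To make those metrics concrete, here is a small base-R sketch that computes AUC (via the rank-sum identity) and precision/recall at a threshold. The function names auc_rank and precision_recall, the 0.5 threshold, and the four-point toy data are all illustrative assumptions, not part of any package.

```r
# AUC via the rank-sum (Mann-Whitney) identity:
# the probability that a random positive scores higher than a random negative.
auc_rank <- function(labels, scores) {
  pos   <- labels == 1
  n_pos <- sum(pos)
  n_neg <- sum(!pos)
  r     <- rank(scores)  # average ranks, so ties count as half
  (sum(r[pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Precision and recall for the positive class at a given score threshold.
precision_recall <- function(labels, scores, threshold = 0.5) {
  pred <- scores >= threshold
  tp <- sum(pred & labels == 1)   # predicted positive, truly positive
  fp <- sum(pred & labels == 0)   # predicted positive, truly negative
  fn <- sum(!pred & labels == 1)  # predicted negative, truly positive
  c(precision = tp / (tp + fp), recall = tp / (tp + fn))
}

# Toy example: true 0/1 labels and predicted probabilities of class 1
labels <- c(0, 0, 1, 1)
scores <- c(0.1, 0.4, 0.35, 0.8)

auc_rank(labels, scores)          # 0.75
precision_recall(labels, scores)  # precision 1.0, recall 0.5
```

With a fitted caret model, the scores would typically be the positive-class column returned by predict(model, newdata, type = "prob"). Scoring held-out data rather than the training rows gives a less optimistic estimate.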