Something is wrong; all the Accuracy metric values are missing:

I copied the following code from Brett Lantz's textbook Machine Learning with R and pasted it verbatim into the console:

    > library(caret)
    Loading required package: lattice
    Loading required package: ggplot2
    > library(kernlab)

    Attaching package: ‘kernlab’

    The following object is masked from ‘package:ggplot2’:

        alpha

    > set.seed(300)
    > ctrl <- trainControl(method = "cv", number = 10)
    > bagctrl <- bagControl(fit = svmBag$fit, predict = svmBag$pred, aggregate = svmBag$aggregate)
    > setwd("~/2148OS_code/chapter 11")
    > credit <- read.csv("credit.csv")
    > svmbag <- train(default ~ ., data = credit, "bag", trControl = ctrl, bagControl = bagctrl)

This is the response I got. What went wrong?

    Something is wrong; all the Accuracy metric values are missing:
        Accuracy       Kappa
     Min.   : NA   Min.   : NA
     1st Qu.: NA   1st Qu.: NA
     Median : NA   Median : NA
     Mean   :NaN   Mean   :NaN
     3rd Qu.: NA   3rd Qu.: NA
     Max.   : NA   Max.   : NA
     NA's   :1     NA's   :1
    Error in train.default(x, y, weights = w, ...) : Stopping
    In addition: There were 50 or more warnings (use warnings() to see the first 50)

The warning messages are as follows:

    > warnings()
    Warning messages:
    1: In data.row.names(row.names, rowsi, i) :
      some row.names duplicated: 3,6,10,13,17,19,23,24,26,27,30,32,34,36,38,41,42,45,49,54,59,60,61,64,66,69,71,72,77,80,81,90,95,102,103,106,112,114,117,118,122,125,127,132,133,137,139,141,143,146,148,151,152,155,158,161,174,176,178,181,185,187,188,189,191,194,203,208,210,212,215,216,218,219,221,223,225,229,230,235,236,239,245,246,262,266,269,271,272,276,279,282,283,285,286,287,288,296,299,305,308,309,313,314,315,317,318,319,322,323,327,328,330,332,333,335,336,338,339,343,347,349,350,352,354,358,360,361,363,366,367,368,369,371,377,379,387,389,392,394,396,397,399,400,410,412,413,414,421,425,428,437,438,441,443,445,446,448,451,453,461,467,469,471,479,481,482,484,486,487,489,491,493,503,504,506,508,511,512,514,517,519,521,522,524,529,530,532,534,537,538,545,547,550,552,555,562,570,579,582,584,588,589,590,601,606,608,610,611,614,615,618,619,623,627,628,629,630,632,634,636,638,641,642,645,653,656,659,660,661,663,667,669,672,673,676,679,681,686,687,690,693,700,701,702,707,708,721,722,724,725,728, [... truncated]
    2: In data.row.names(row.names, rowsi, i) :
      some row.names duplicated: 3,5,8,9,13,15,18,21,25,27,29,33,36,37,41,44,45,51,52,53,55,59,60,64,66,67,72,76,77,80,91,92,96,97,102,103,104,107,110,111,113,116,119,121,122,123,127,130,133,136,139,140,143,145,147,148,149,154,158,160,164,166,168,169,171,174,176,177,178,180,182,185,187,195,199,203,205,216,218,220,223,226,231,234,236,237,238,242,245,2

Answer:

I used the code provided with the second edition.

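If you set up parallel processing, these warning messages go away. However, you will still get the error about the missing Accuracy metric values.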
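For reference, one common way to set up parallel processing for caret is to register a foreach backend before calling train(). The sketch below assumes the doParallel package is installed; the worker count of 2 is an arbitrary choice, not something stated in the question or the book. It only shows the setup and does not address the missing-metric error.

```r
# Sketch: register a parallel backend before calling caret::train().
# Assumption: the doParallel package is installed; 2 workers is arbitrary.
library(doParallel)

cl <- makePSOCKcluster(2)   # start two local worker processes
registerDoParallel(cl)      # register them as the foreach backend

# ... run the trainControl() / bagControl() / train() calls from the question here ...

stopCluster(cl)             # shut the workers down
registerDoSEQ()             # switch foreach back to sequential execution
```

Stopping the cluster and calling registerDoSEQ() afterwards avoids leaving a stale backend registered for later calls in the same session.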

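This error is caused by missing values in the resampled performance measures. That can happen when a resample contains no samples of one of the outcome classes (here, default), because sensitivity or specificity is then undefined.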
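The mechanism described above can be illustrated in base R. This is a minimal sketch with made-up labels and a hypothetical helper, sensitivity_manual(); it is not caret's internal code and does not use credit.csv. It shows how a held-out set with no cases of the positive class gives an undefined sensitivity, and how one undefined value makes a plain average undefined as well.

```r
# Sketch: a resample with no cases of one class yields an undefined metric.
# sensitivity_manual() is a hypothetical helper; the labels are invented.
sensitivity_manual <- function(truth, pred, positive) {
  tp <- sum(truth == positive & pred == positive)  # true positives
  fn <- sum(truth == positive & pred != positive)  # false negatives
  tp / (tp + fn)                                   # 0 / 0 when the class is absent
}

levels_used <- c("no", "yes")
truth <- factor(c("no", "no", "no", "no"), levels = levels_used)   # no "yes" cases
pred  <- factor(c("no", "yes", "no", "no"), levels = levels_used)

sens <- sensitivity_manual(truth, pred, positive = "yes")
print(is.nan(sens))                      # the metric is undefined for this set

fold_scores <- c(0.71, 0.68, sens)       # hypothetical per-fold values
print(mean(fold_scores))                 # a plain mean propagates the undefined value
print(mean(fold_scores, na.rm = TRUE))   # averaging only the defined folds
```

Tabulating the outcome column within each resample, for example with table(), is a quick way to check whether a class is missing from any of them.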

I also tested with the GermanCredit data included in the caret package, and it produced the same error.
