I am building an ANFIS model using frbs.learn() in R. Here is my code:
    library(readxl)
    library(anfis)
    library(parallel)
    library(frbs)

    Yamuna_final <- read_excel("F:/Downloads/Yamuna_final.xlsx", col_names = FALSE)
    data.train <- as.matrix(Yamuna_final)

    frbs_obj <- frbs.learn(data.train, range.data = NULL, method.type = c("ANFIS"),
                           list(num.labels = 13, max.iter = 10, step.size = 0.01,
                                type.tnorm = "MIN", type.implication.func = "ZADEH",
                                name = "Sim-0"))

    test <- read_excel("F:/Downloads/test.xlsx", col_names = FALSE)
    res <- predict(frbs_obj, test)
When frbs.learn() runs, I get the following error:
    Error in matrix(nrow = nrow(rule.data.num), ncol = 2 * ncol(rule.data.num) - :
      invalid 'ncol' value (< 0)
My dataset (data.train) has 1539 rows and 12 columns. Here are its first few rows:
         X__1 X__2 X__3     X__4     X__5     X__6      X__7    X__8  X__9    X__10 X__11       X__12
    [1,] 1999    1    1 7.720000 11.00000 1.000000 0.0500000 0.92000  85.0 14.00000   210  8.60000000
    [2,] 1999    1    2 7.700000 10.00000 1.000000 0.0500000 2.00000  50.0 14.50000  3700 10.80000000
    [3,] 1999    1    3 8.400000 10.00000 1.000000 0.0400000 0.92000 120.0 23.00000   400  8.60000000
    [4,] 1999    1    4 8.270000  6.00000 1.000000 0.0500000 0.56000  80.0 22.00000  4600 12.50000000
    [5,] 1999    1    5 8.180000  6.00000 1.000000 0.0500000 0.80000 140.0 22.00000 23000  8.70000000
The model fails to train and I get the error above. I don't know what is wrong. :(
Answer:
The error may occur because the dataset contains columns that hold only one unique value.
In the code below, once these columns are removed, frbs.learn runs without error.
    library(frbs)

    data.train <- read.table(text = "
         X__1 X__2 X__3     X__4     X__5     X__6      X__7    X__8  X__9    X__10 X__11       X__12
    [1,] 1999    1    1 7.720000 11.00000 1.000000 0.0500000 0.92000  85.0 14.00000   210  8.60000000
    [2,] 1999    1    2 7.700000 10.00000 1.000000 0.0500000 2.00000  50.0 14.50000  3700 10.80000000
    [3,] 1999    1    3 8.400000 10.00000 1.000000 0.0400000 0.92000 120.0 23.00000   400  8.60000000
    [4,] 1999    1    4 8.270000  6.00000 1.000000 0.0500000 0.56000  80.0 22.00000  4600 12.50000000
    [5,] 1999    1    5 8.180000  6.00000 1.000000 0.0500000 0.80000 140.0 22.00000 23000  8.70000000
    ", header = TRUE)

    # Flag the columns that have more than one unique value, then keep only those
    # (this drops every column that contains a single repeated value)
    keep_cols <- apply(data.train, 2, function(x) length(unique(x)) != 1)
    data.train <- data.train[, keep_cols]

    frbs_obj <- frbs.learn(data.train, range.data = NULL, method.type = c("ANFIS"),
                           list(num.labels = 13, max.iter = 10, step.size = 0.01,
                                type.tnorm = "MIN", type.implication.func = "ZADEH",
                                name = "Sim-0"))
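The column filter above can also be wrapped in a small reusable function. The following is only a sketch: `drop_constant_cols` is a hypothetical helper name (it is not part of frbs), and the toy matrix is made up for illustration. It assumes that NA counts as a distinct value, which is how `unique()` treats it.

```r
# Hypothetical helper (not part of frbs): remove every column whose values
# are all identical. Accepts a matrix, data.frame, or tibble.
drop_constant_cols <- function(x) {
  df <- as.data.frame(x)
  # TRUE for columns with at least two distinct values
  keep <- vapply(df, function(col) length(unique(col)) > 1, logical(1))
  df[, keep, drop = FALSE]
}

# Made-up toy data: 'year' and 'flag' are constant, 'day' and 'ph' vary
toy <- cbind(year = c(1999, 1999, 1999),
             day  = c(1, 2, 3),
             flag = c(1, 1, 1),
             ph   = c(7.72, 7.70, 8.40))

cleaned <- drop_constant_cols(toy)
print(colnames(cleaned))
```

The helper returns a data.frame; it can be converted with as.matrix() before being passed on, as the question's code does.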
Otherwise, the error may be caused by NA values in the dataset.
Counting the missing values in each column of the dataset shows that the second column has one missing value:
    apply(data.train, 2, function(x) sum(is.na(x)))
    #  X1  X2  X3  X4  X5  X6  X7  X8  X9 X10 X11 X12
    #   0   1   0   0   0   0   0   0   0   0   0   0
It is in row 277:
    posNA <- which(apply(data.train, 1, function(x) any(is.na(x))))
    data.train[posNA, ]
    #       X1 X2 X3   X4 X5 X6    X7    X8   X9 X10     X11 X12
    # 277 2000 NA  1 7.49 77 25 13.17 19.26 5000  20 2.1e+07   0
Here is the final code:
    library(readxl)  # provides read_excel()
    library(frbs)

    data.train <- read_excel("F:/Downloads/Yamuna_final.xlsx", col_names = FALSE)

    # Find the rows that contain at least one NA and remove them
    posNA <- which(apply(data.train, 1, function(x) any(is.na(x))))
    data.train <- data.train[-posNA, ]
    data.train <- as.matrix(data.train)

    frbs_obj <- frbs.learn(data.train, range.data = NULL, method.type = c("ANFIS"),
                           list(num.labels = 13, max.iter = 10, step.size = 0.01,
                                type.tnorm = "MIN", type.implication.func = "ZADEH",
                                name = "Sim-0"))
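One caveat about the negative index `-posNA`: it works here because row 277 really does contain an NA. If a dataset had no missing values, posNA would be empty and `data.train[-posNA, ]` would return zero rows instead of all of them. A possible alternative, shown as a sketch on made-up toy data, is base R's complete.cases(), which keeps the rows without any NA in both situations.

```r
# Made-up toy data with no missing values
clean <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
posNA <- which(apply(clean, 1, function(x) any(is.na(x))))

# posNA is empty, so the negative index selects nothing
rows_negative_index <- nrow(clean[-posNA, ])

# complete.cases() returns one logical per row and keeps every complete row
rows_complete_cases <- nrow(clean[complete.cases(clean), ])

# Made-up toy data with one missing value in the second row
with_na <- data.frame(a = c(1, NA, 3), b = c(4, 5, 6))
kept <- with_na[complete.cases(with_na), ]

print(c(rows_negative_index, rows_complete_cases, nrow(kept)))
```

With complete.cases(), the row filter in the final code could be written as `data.train <- data.train[complete.cases(data.train), ]`.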