Building an SVM model
library(e1071)
model <- svm(SeriousDlqin2yrs ~ ., IAStrain)
predictedY <- predict(model, IAStest)
Error in names(ret2) <- rowns: 'names' attribute [2000] must be the same length as the vector [1605]
The structure of my two data sets (training and test) is as follows:
> str(IAStest)
'data.frame': 2000 obs. of 10 variables:
 $ RevolvingUtilizationOfUnsecuredLines: num 0.106 0.503 0.111 1 1 ...
 $ age : int 45 46 78 78 63 33 44 65 31 41 ...
 $ NumberOfTime30.59DaysPastDueNotWorse: int 0 0 0 0 0 0 0 0 0 0 ...
 $ DebtRatio : num 0.2877 0.311 0.0651 0.1255 45 ...
 $ MonthlyIncome: int 10000 4912 11583 12465 NA 2500 NA 18915 8200 30018 ...
 $ NumberOfOpenCreditLinesAndLoans: int 5 6 8 2 4 8 4 6 9 14 ...
 $ NumberOfTimes90DaysLate: int 0 0 0 0 0 0 0 0 0 0 ...
 $ NumberRealEstateLoansOrLines : int 2 1 0 2 0 1 0 2 1 3 ...
 $ NumberOfTime60.89DaysPastDueNotWorse: int 0 0 0 0 0 0 0 0 0 0 ...
 $ NumberOfDependents : int 5 3 0 0 0 1 0 2 0 2 ...
> str(IAStrain)
'data.frame': 28000 obs. of 11 variables:
 $ SeriousDlqin2yrs: Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
 $ RevolvingUtilizationOfUnsecuredLines: num 0.957 0.658 0.907 0.213 0.306
 $ age : int 40 38 49 74 57 39 27 57 30 51 ...
 $ NumberOfTime30.59DaysPastDueNotWorse: int 0 1 1 0 0 0 0 0 0 0 ...
 $ DebtRatio : num 1.22e-01 8.51e-02 2.49e-02 3.76e-01 5.71e+03 ...
 $ MonthlyIncome : int 2600 3042 63588 3500 NA 3500 NA 23684 2500 6501 ...
 $ NumberOfOpenCreditLinesAndLoans: int 4 2 7 3 8 8 2 9 5 7 ...
 $ NumberOfTimes90DaysLate: int 0 1 0 0 0 0 0 0 0 0 ...
 $ NumberRealEstateLoansOrLines: int 0 0 1 1 3 0 0 4 0 2 ...
 $ NumberOfTime60.89DaysPastDueNotWorse: int 0 0 0 0 0 0 0 0 0 0 ...
 $ NumberOfDependents: int 1 0 0 1 0 0 NA 2 0 2 ...
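I have read many posts about similar problems. In those, the cause was mostly the data types of the variables, but that is not the issue in my case.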
Answer:
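In addition to my comment, the NA values in your data are most likely causing the problem. The error message is consistent with this: predict() produced 1605 values for 2000 test rows, and the str() output shows NA entries in MonthlyIncome.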
predictedY <- predict(model, IAStest[!rowSums(is.na(IAStest)), ])
This should produce predictions for the rows that contain no NA values.
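If you need one result per test row (for example, to attach the predictions back to IAStest), a possible approach is to keep the row mask and fill a full-length vector. The following is only a sketch on made-up data: `toy_test`, `complete_rows`, `predicted_full`, and the dummy label "0" are illustrative names and values, not part of your data, and the real predict() call is shown only as a comment.

```r
# Toy stand-in for IAStest (made-up values, only three of the columns).
toy_test <- data.frame(
  age                = c(45L, 46L, 78L, 63L, 33L),
  MonthlyIncome      = c(10000L, 4912L, NA, NA, 2500L),
  NumberOfDependents = c(5L, 3L, 0L, NA, 1L)
)

# Number of missing values per column, to see where the NAs come from.
na_per_column <- colSums(is.na(toy_test))

# TRUE for rows without any NA; the same mask as !rowSums(is.na(toy_test)).
complete_rows <- complete.cases(toy_test)
stopifnot(all(complete_rows == (rowSums(is.na(toy_test)) == 0)))

# One slot per test row, NA where no prediction can be made.
predicted_full <- rep(NA_character_, nrow(toy_test))

# With the real objects this line would be something like
#   predicted_full[complete_rows] <-
#     as.character(predict(model, IAStest[complete_rows, ]))
# Here a dummy label "0" stands in for the model output.
predicted_full[complete_rows] <- "0"

print(na_per_column)
print(predicted_full)
```

Rows with missing values keep NA in `predicted_full`, so its length always matches nrow() of the test set. Whether to drop those rows or impute the missing values before predicting is a separate modelling decision.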