Error with SVM in R

I am new to R and am trying to extract data from a text file and then classify it with an SVM. Here is the code:

    train <- read.table("training.txt")
    train[which(train == "?", arr.ind = TRUE)] <- NA
    train = unique(train)
    y = train[, length(train)]
    classifier <- svm(y ~ ., data = train[, -length(train)], scale = F)
    classifier <- svm(x = train[, -length(train)], y = factor(y), scale = F)

I tried two different ways of calling svm. The first, svm(y~.,data=train[,-length(train)],scale=F), seems to work, but the second fails with:

    Error in svm.default(x = train[, length(train)], y = factor(y), scale = F) :
      NA/NaN/Inf in foreign function call (arg 1)
    In addition: Warning message:
    In svm.default(x = train[, length(train)], y = factor(y), scale = F) :
      NAs introduced by coercion

Here is a sample of training.txt; the last column is the target:

    39,State-gov,77516,Bachelors,13,Never-married,Adm-clerical,Not-in-family,White,Male,2174,0,40,United-States,0
    50,Self-emp-not-inc,83311,Bachelors,13,Married-civ-spouse,Exec-managerial,Husband,White,Male,0,0,13,United-States,0
    38,Private,215646,HS-grad,9,Divorced,Handlers-cleaners,Not-in-family,White,Male,0,0,40,United-States,0
    53,Private,234721,11th,7,Married-civ-spouse,Handlers-cleaners,Husband,Black,Male,0,0,40,United-States,0
    28,Private,338409,Bachelors,13,Married-civ-spouse,Prof-specialty,Wife,Black,Female,0,0,40,Cuba,0
    37,Private,284582,Masters,14,Married-civ-spouse,Exec-managerial,Wife,White,Female,0,0,40,United-States,0
    49,Private,160187,9th,5,Married-spouse-absent,Other-service,Not-in-family,Black,Female,0,0,16,Jamaica,0
    52,Self-emp-not-inc,209642,HS-grad,9,Married-civ-spouse,Exec-managerial,Husband,White,Male,0,0,45,United-States,1
    31,Private,45781,Masters,14,Never-married,Prof-specialty,Not-in-family,White,Female,14084,0,50,United-States,1
    42,Private,159449,Bachelors,13,Married-civ-spouse,Exec-managerial,Husband,White,Male,5178,0,40,United-States,1
    37,Private,280464,Some-college,10,Married-civ-spouse,Exec-managerial,Husband,Black,Male,0,0,80,United-States,1
    30,State-gov,141297,Bachelors,13,Married-civ-spouse,Prof-specialty,Husband,Asian-Pac-Islander,Male,0,0,40,India,1
    23,Private,122272,Bachelors,13,Never-married,Adm-clerical,Own-child,White,Female,0,0,30,United-States,0
    32,Private,205019,Assoc-acdm,12,Never-married,Sales,Not-in-family,Black,Male,0,0,50,United-States,0
    40,Private,121772,Assoc-voc,11,Married-civ-spouse,Craft-repair,Husband,Asian-Pac-Islander,Male,0,0,40,NA,1

Any ideas? Thanks in advance!


Answer:

From the documentation:

For the x argument:

a data matrix, a vector, or a sparse matrix (object of class Matrix provided by the Matrix package, or of class matrix.csr provided by the SparseM package, or of class simple_triplet_matrix provided by the slam package).

For the y argument:

a response vector with one label for each row/component of x. Can be either a factor (for classification tasks) or a numeric vector (for regression).

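When you pass x=train[,-length(train)] in the second call, you are actually passing a data.frame, which is not a supported type for x. As the warning in your output shows, its non-numeric values are coerced to NA, and the call then fails.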
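The coercion problem can be reproduced without e1071. The following is a minimal sketch in base R on a small made-up data frame (the column names `age` and `workclass` are illustrative and not taken from training.txt). It only shows what happens when mixed-type columns are forced to numbers; it does not claim to mirror the internals of svm.

```r
# Hypothetical data frame with one numeric and one text column.
df <- data.frame(age = c(39, 50),
                 workclass = c("State-gov", "Private"),
                 stringsAsFactors = FALSE)

# as.matrix() on mixed columns yields a character matrix, not a numeric one.
m <- as.matrix(df)
is.character(m)

# Forcing it to double turns every non-numeric string into NA
# (normally with the warning "NAs introduced by coercion").
v <- suppressWarnings(as.double(m))
anyNA(v)
```

Any NA produced this way ends up in the data handed to the underlying compiled routine, which is consistent with the "NA/NaN/Inf in foreign function call" error in the question.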

The svm function does support a numeric matrix:

    library(e1071)
    train[which(train == "?", arr.ind = TRUE)] <- NA
    train = unique(train)
    y = factor(train[, length(train)])
    # Convert to numeric. Categorical variables (factors) are stored as integer codes behind the scenes.
    train <- data.frame(lapply(train, as.numeric))
    train <- as.matrix(train[-length(train)])
    classifier <- svm(x = train, y = y, scale = F)

Output:

    > summary(classifier)

    Call:
    svm.default(x = train, y = y, scale = F)

    Parameters:
       SVM-Type:  C-classification
     SVM-Kernel:  radial
           cost:  1
          gamma:  0.07142857

    Number of Support Vectors:  14

     ( 9 5 )

    Number of Classes:  2

    Levels:
     0 1
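One caveat about the conversion step: calling as.numeric on every column only gives integer codes when the text columns are factors. In R 4.0.0 and later, read.table() returns text columns as character vectors by default, so as.numeric on them would produce NA instead. A hedged sketch of a conversion that works in both cases, again on a made-up data frame (the names `df` and `x` are illustrative), could look like this:

```r
# Hypothetical mixed-type data frame standing in for the predictor columns.
df <- data.frame(age = c(39, 50, 38),
                 workclass = c("State-gov", "Private", "Private"),
                 stringsAsFactors = FALSE)

# Turn character columns into factors so each category has an integer code.
df[] <- lapply(df, function(col) if (is.character(col)) factor(col) else col)

# data.matrix() keeps numeric columns and replaces factors by their codes.
x <- data.matrix(df)
is.numeric(x)
x[, "workclass"]
```

Note that integer codes impose an arbitrary ordering on the categories. Whether that is acceptable for an SVM, or whether dummy (one-hot) columns would be a better fit, depends on the data and is left open here.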
