This is a cancer dataset with 10 features and one class label.
X=df.iloc[:,1:10].values
y=df.iloc[:,[-1]].values
from sklearn.preprocessing import Imputer
imputer=Imputer(missing_values='NaN',strategy='mean',axis=1)
imputer=imputer.fit(X)
X=imputer.transform(X)
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.2,random_state=0)
from sklearn.svm import SVC
classifier=SVC(kernel='rbf',random_state=0)
classifier.fit(X_train,y_train)
y_pred=classifier.predict(y_test)
When I run this code, I get the following error:
ValueError: X.shape[1] = 1 should be equal to 9, the number of features at training time
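Answer:
Your error is caused by the following line, where you passed y_test instead of X_test. y_test has a single column, while the classifier was trained on nine features, which is exactly the mismatch the error message reports: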
classifier.predict(y_test)
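To see the mismatch in isolation, here is a minimal sketch on synthetic data. The arrays X_demo and y_demo and the classifier clf_demo are made up for illustration and only mimic the shapes in the question (nine feature columns, one label column); they are not the asker's data. The exact wording of the error message depends on the scikit-learn version.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical stand-in data: 50 samples, 9 features, one label column.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 9))
# Shape (50, 1), like df.iloc[:, [-1]].values in the question.
y_demo = (X_demo[:, 0] > 0).astype(int).reshape(-1, 1)

# ravel() flattens the labels to shape (50,), which is what fit() expects.
clf_demo = SVC(kernel='rbf').fit(X_demo, y_demo.ravel())

# Passing the labels to predict() reproduces the feature-count error.
try:
    clf_demo.predict(y_demo)
    mismatch_raised = False
except ValueError as exc:
    mismatch_raised = True
    print("ValueError:", exc)

# Passing a feature matrix with nine columns works.
y_pred_demo = clf_demo.predict(X_demo)
print(y_pred_demo.shape)
```

predict() always takes a feature matrix with the same number of columns as the one used in fit(); the labels are only needed afterwards, to compare against the predictions.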
The complete code is as follows:
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
# Note: Imputer exists only in scikit-learn versions before 0.22.
from sklearn.preprocessing import Imputer
from sklearn.svm import SVC

data = load_breast_cancer()
df = pd.DataFrame(data.data, columns=data.feature_names)
X = df.iloc[:,1:10]
y = data.target

imputer = Imputer(strategy='mean',axis=1)
X = imputer.fit_transform(X)

X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2,random_state=0)

clf = SVC(kernel='rbf').fit(X_train, y_train)
# Predict on the features (X_test), not on the labels (y_test).
y_pred = clf.predict(X_test)
print(clf.score(X_test, y_test))
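On current scikit-learn releases, Imputer is no longer available, so the code above will fail at the import. The following is a sketch of the same pipeline using SimpleImputer from sklearn.impute, not the answer's original code. Two assumptions to keep in mind: SimpleImputer has no axis parameter and always imputes column-wise (the original used axis=1, i.e. row-wise), and fitting the imputer on the training split only is a choice made here. The printed score may differ from the one reported below, since the default gamma of SVC changed to 'scale' in scikit-learn 0.22.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

data = load_breast_cancer()
X = data.data[:, 1:10]   # the same nine columns as df.iloc[:, 1:10]
y = data.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Column-wise mean imputation; fitted on the training split only (assumption).
imputer = SimpleImputer(missing_values=np.nan, strategy='mean')
X_train = imputer.fit_transform(X_train)
X_test = imputer.transform(X_test)

clf = SVC(kernel='rbf').fit(X_train, y_train)
y_pred = clf.predict(X_test)          # features in, predicted labels out
print(clf.score(X_test, y_test))      # mean accuracy on the test split
```

This bundled dataset contains no missing values, so the imputation step is a no-op here; it is kept only to mirror the structure of the question's code.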
The result is:
0.6842105263157895