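K-Nearest neighbours

I am trying to run the KNN algorithm on a heart disease prediction dataset. When I serialize the model to a model.pkl file and load it back, I get a "not fitted" error. Running the code without serialization gives accurate scores, but the error appears as soon as I pickle the model. How should I fit this data? I am new to machine learning, so any help is appreciated.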
from sklearn.neighbors import KNeighborsClassifier
dataset = pd.get_dummies(df, columns = ['sex', 'cp', 'fbs', 'restecg', 'exang', 'slope', 'ca', 'thal'])
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
standardScaler = StandardScaler()
columns_to_scale = ['age', 'trestbps', 'chol', 'thalach', 'oldpeak']
dataset[columns_to_scale] = standardScaler.fit_transform(dataset[columns_to_scale])
y = dataset['target']
X = dataset.drop(['target'], axis = 1)

from sklearn.model_selection import cross_val_score
knn_scores = []
for k in range(1,21):
    knn_classifier = KNeighborsClassifier(n_neighbors = k)
    score=cross_val_score(knn_classifier,X,y,cv=10)
    knn_scores.append(score.mean())

plt.plot([k for k in range(1, 21)], knn_scores, color = 'red')
for i in range(1,21):
    plt.text(i, knn_scores[i-1], (i, knn_scores[i-1]))
plt.xticks([i for i in range(1, 21)])
plt.xlabel('Number of Neighbors (K)')
plt.ylabel('Scores')
plt.title('K Neighbors Classifier scores for different K values')

Text(0.5, 1.0, 'K Neighbors Classifier scores for different K values')

knn_classifier = KNeighborsClassifier(n_neighbors = 12)
score=cross_val_score(knn_classifier,X,y,cv=10)
score.mean()

0.8448387096774195

import pickle
pickle.dump(knn_classifier, open('model.pkl', 'wb'))
Heart_disease_detector_model = pickle.load(open('model.pkl', 'rb'))
y_pred = Heart_disease_detector_model.predict(X_test)
print('Accuracy of K – Nearest Neighbor model = ',accuracy_score(y_test, y_pred))

---------------------------------------------------------------------------
NotFittedError                            Traceback (most recent call last)
<ipython-input-79-c37bd716088c> in <module>
      2 pickle.dump(knn_classifier, open('model.pkl', 'wb'))
      3 Heart_disease_detector_model = pickle.load(open('model.pkl', 'rb'))
----> 4 y_pred = Heart_disease_detector_model.predict(X_test)
      5 print('Accuracy of K – Nearest Neighbor model = ',accuracy_score(y_test, y_pred))

c:\users\jahnavi padala\miniconda3\lib\site-packages\sklearn\neighbors\_classification.py in predict(self, X)
    195         X = check_array(X, accept_sparse='csr')
    196
--> 197         neigh_dist, neigh_ind = self.kneighbors(X)
    198         classes_ = self.classes_
    199         _y = self._y

c:\users\jahnavi padala\miniconda3\lib\site-packages\sklearn\neighbors\_base.py in kneighbors(self, X, n_neighbors, return_distance)
    647                [2]]...)
    648         """
--> 649         check_is_fitted(self)
    650
    651         if n_neighbors is None:

c:\users\jahnavi padala\miniconda3\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
     61             extra_args = len(args) - len(all_args)
     62             if extra_args <= 0:
---> 63                 return f(*args, **kwargs)
     64
     65             # extra_args > 0

c:\users\jahnavi padala\miniconda3\lib\site-packages\sklearn\utils\validation.py in check_is_fitted(estimator, attributes, msg, all_or_any)
   1096
   1097     if not attrs:
-> 1098         raise NotFittedError(msg % {'name': type(estimator).__name__})
   1099
   1100

NotFittedError: This KNeighborsClassifier instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.
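Answer:
The error says the classifier is not fitted yet, and it means exactly that: you need to fit the model before using it. cross_val_score fits internal clones of the estimator on each fold and leaves the object you passed in untouched, so knn_classifier is still unfitted when you pickle it. Fit it explicitly before computing the accuracy score: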
knn_classifier.fit(X, y)
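A minimal sketch of why the explicit fit is needed. It assumes a synthetic dataset from make_classification as a stand-in for the heart disease data, and is_fitted is a hypothetical helper written for this illustration, not part of scikit-learn:

```python
from sklearn.datasets import make_classification
from sklearn.exceptions import NotFittedError
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.utils.validation import check_is_fitted

# Synthetic stand-in for the heart disease data (assumption, not the real dataset).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)


def is_fitted(estimator):
    """Hypothetical helper: True if check_is_fitted does not raise."""
    try:
        check_is_fitted(estimator)
        return True
    except NotFittedError:
        return False


knn_classifier = KNeighborsClassifier(n_neighbors=12)

# cross_val_score fits clones of the estimator, one per fold.
scores = cross_val_score(knn_classifier, X, y, cv=10)
fitted_after_cv = is_fitted(knn_classifier)

# Only an explicit fit changes the state of this object.
knn_classifier.fit(X, y)
fitted_after_fit = is_fitted(knn_classifier)

print("fitted after cross_val_score:", fitted_after_cv)
print("fitted after fit:", fitted_after_fit)
```

The cross-validation scores are still useful for choosing n_neighbors; they just do not leave behind a trained model.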
So you end up with code like this:
knn_classifier = KNeighborsClassifier(n_neighbors = 12)
knn_classifier.fit(X, y)
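For reference, here is a hedged end-to-end sketch of fitting, pickling, and reloading the model. The data is again synthetic (make_classification), the train/test split parameters are assumptions because the question does not show how X_test and y_test were created, and the file is written to a temporary directory purely to keep the example self-contained:

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the heart disease data (assumption, not the real dataset).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Assumed split; the question does not show how X_test / y_test were produced.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit before pickling so the saved object carries the training data.
knn_classifier = KNeighborsClassifier(n_neighbors=12)
knn_classifier.fit(X_train, y_train)

# Temporary location for model.pkl; any writable path works.
model_path = os.path.join(tempfile.mkdtemp(), "model.pkl")

# Context managers close the file handles even if an error occurs.
with open(model_path, "wb") as f:
    pickle.dump(knn_classifier, f)
with open(model_path, "rb") as f:
    loaded_model = pickle.load(f)

y_pred = loaded_model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

# The reloaded model should predict exactly like the in-memory one.
same_predictions = np.array_equal(y_pred, knn_classifier.predict(X_test))

print("Accuracy of K-Nearest Neighbor model =", accuracy)
print("Reloaded model matches original:", same_predictions)
```

This sketch fits on X_train only. Fitting on all of X and then scoring on X_test would place the test rows inside the training set, which tends to inflate the reported accuracy.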