"At least one label specified must be in y_true" even though the target vector is numerical

I am implementing an SVM project using this data.

This is how I extract the features:

import itertools
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn import svm
from sklearn.metrics import classification_report, confusion_matrix

df = pd.read_csv('loan_train.csv')
df['due_date'] = pd.to_datetime(df['due_date'])
df['effective_date'] = pd.to_datetime(df['effective_date'])
df['dayofweek'] = df['effective_date'].dt.dayofweek
df['weekend'] = df['dayofweek'].apply(lambda x: 1 if (x > 3) else 0)

Feature = df[['Principal', 'terms', 'age', 'Gender', 'weekend']]
Feature = pd.concat([Feature, pd.get_dummies(df['education'])], axis=1)
Feature.drop(['Master or Above'], axis=1, inplace=True)

X = Feature
y = df['loan_status'].replace(to_replace=['PAIDOFF', 'COLLECTION'], value=[0, 1], inplace=False)

Creating the model and predicting:

clf = svm.SVC(kernel='rbf')
clf.fit(X_train_svm, y_train_svm)
yhat_svm = clf.predict(X_test_svm)

Evaluation stage:

def plot_confusion_matrix(cm, classes,
                          normalize=False,
                          title='Confusion matrix',
                          cmap=plt.cm.Blues):
    """
    This function prints and plots the confusion matrix.
    Normalization can be applied by setting `normalize=True`.
    """
    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
        print("Normalized confusion matrix")
    else:
        print('Confusion matrix, without normalization')

    print(cm)

    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)

    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, format(cm[i, j], fmt),
                 horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")

    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')
    plt.show()

cnf_matrix = confusion_matrix(y_test_svm, yhat_svm, labels=[2,4])
np.set_printoptions(precision=2)

print(classification_report(y_test_svm, yhat_svm))

# Plot non-normalized confusion matrix
plt.figure()
plot_confusion_matrix(cnf_matrix, classes=['Benign(2)','Malignant(4)'], normalize=False, title='Confusion matrix')

Here is the error:

Traceback (most recent call last):
  File "E:/python/classification_project/classification.py", line 229, in <module>
    cnf_matrix = confusion_matrix(y_test_svm, yhat_svm, labels=[2,4])
  File "C:\Program Files (x86)\Python38-32\lib\site-packages\sklearn\metrics\_classification.py", line 277, in confusion_matrix
    raise ValueError("At least one label specified must be in y_true")
ValueError: At least one label specified must be in y_true

I looked at this question, which is similar to my case, and changed y from categorical to numerical, but the error is still there!


Answer:

The values in y are 0 and 1, but in the confusion_matrix call:

cnf_matrix = confusion_matrix(y_test_svm, yhat_svm, labels=[2,4])

the labels are 2 and 4.
The labels passed to confusion_matrix must match the values that actually appear in the y vector, i.e.:

cnf_matrix = confusion_matrix(y_test_svm, yhat_svm, labels=[0,1])
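To make the mismatch concrete, here is a minimal sketch that reproduces the error and then avoids it by deriving the labels from the data instead of hard-coding them. The arrays y_true and y_pred are made-up stand-ins for y_test_svm and yhat_svm, and the display names in class_names are only a suggestion; neither comes from the original code.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical stand-ins for y_test_svm and yhat_svm (0 = PAIDOFF, 1 = COLLECTION).
y_true = np.array([0, 0, 1, 1, 0, 1])
y_pred = np.array([0, 1, 1, 0, 0, 1])

# Labels that never occur in y_true reproduce the error from the question.
try:
    confusion_matrix(y_true, y_pred, labels=[2, 4])
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Derive the labels from the data so they cannot drift out of sync with y.
labels = np.unique(np.concatenate([y_true, y_pred]))
cnf_matrix = confusion_matrix(y_true, y_pred, labels=labels)
print(labels)
print(cnf_matrix)

# Suggested display names, listed in the same order as `labels`.
class_names = ['PAIDOFF(0)', 'COLLECTION(1)']
```

The tick names passed to plot_confusion_matrix probably need the same update: 'Benign(2)' and 'Malignant(4)' look like leftovers from a different dataset, so something like class_names above would describe the loan statuses more accurately.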
