ValueError: Unknown label type: 'continuous' when using clustering and classification models

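I built a clustering model with scikit-learn's KMeans to find different types of customer segments based on annual income and spending score. I then took the cluster value returned for each customer and tried to build a classification model on top of it using Support Vector Classification from sklearn.svm. However, when I try to fit the new model to the dataset, I get this error message: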

      File "/Users/user/Documents/Machine Learning A-Z Template Folder/Part 4 - Clustering/Section 24 - K-Means Clustering/cluster_and_prediction.py", line 28, in <module>
        classifier.fit(x_train, y_train)
      File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sklearn/svm/_base.py", line 149, in fit
        y = self._validate_targets(y)
      File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sklearn/svm/_base.py", line 525, in _validate_targets
        check_classification_targets(y)
      File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sklearn/utils/multiclass.py", line 169, in check_classification_targets
        raise ValueError("Unknown label type: %r" % y_type)
    ValueError: Unknown label type: 'continuous'

My code is as follows:

    import pandas as pd
    import numpy as np
    from sklearn.cluster import KMeans

    # Use the relevant columns of the dataset
    dataset = pd.read_csv('Mall_Customers.csv')
    x = dataset.iloc[:, 3:5].values

    # Create the model with the optimal number of clusters
    kmeans = KMeans(n_clusters=5, init='k-means++', max_iter=300, n_init=10, random_state=0)
    kmeans.fit(x)
    predictions = kmeans.predict(x)

    # Create numpy arrays for feature scaling
    predictions = np.array(predictions, dtype=int)
    predictions = predictions[:, None]

    from sklearn.preprocessing import StandardScaler
    sc_x = StandardScaler()
    sc_y = StandardScaler()
    x = sc_x.fit_transform(x)
    predictions = sc_y.fit_transform(predictions)

    # Split the dataset into training and test sets
    from sklearn.model_selection import train_test_split
    x_train, x_test, y_train, y_test = train_test_split(x, predictions, test_size=.25)

    # Create the Support Vector Classification model
    from sklearn.svm import SVC
    classifier = SVC(kernel='rbf')
    classifier.fit(x_train, y_train)

Elbow plot used for the clustering

Cluster visualization

.zip file containing the dataset (the dataset is named 'Mall_Customers.csv')

How can I fix this?

Answer:

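Since you want to treat this as a classification problem with 5 classes, you should not apply a scaler to the labels. Scaling converts the integer cluster labels into continuous values, and feeding those to a classification model is what raises the error.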
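To see the effect in isolation, here is a minimal sketch that uses scikit-learn's `type_of_target` helper, which `check_classification_targets` calls to decide whether labels are acceptable. The `labels` array is made up for illustration and is not taken from Mall_Customers.csv.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.utils.multiclass import type_of_target

# Made-up cluster labels of the kind KMeans.predict returns (integers 0..4).
labels = np.array([0, 1, 2, 3, 4, 0, 1, 2, 3, 4])
label_type_before = type_of_target(labels)

# Standardizing the labels turns them into non-integer floats.
scaled = StandardScaler().fit_transform(labels.reshape(-1, 1)).ravel()
label_type_after = type_of_target(scaled)

print(label_type_before)  # multiclass
print(label_type_after)   # continuous
```

A classifier such as SVC accepts the first kind of target and rejects the second, which matches the message in the traceback.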

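Also, unrelated to the error: the correct approach is to fit your scaler on the training data only, and then use that fitted scaler to transform your test data.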

So, here are the necessary changes (to be applied after you have finished setting up the predictions variable):

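    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    # Use the initial (unscaled) x here:
    x_train, x_test, y_train, y_test = train_test_split(x, predictions, test_size=.25)

    # Fit the scaler on the training features only, then reuse it for the test features
    sc = StandardScaler()
    x_train_scaled = sc.fit_transform(x_train)
    x_test_scaled = sc.transform(x_test)

    classifier = SVC(kernel='rbf')
    classifier.fit(x_train_scaled, y_train)  # no scaling of predictions or y_train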
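As an alternative to managing the scaler by hand, the same train-only fitting can be sketched with a scikit-learn pipeline. This is not part of the original answer. The data below comes from `make_blobs` with hand-picked centers and merely stands in for `x` and the integer cluster labels.

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic features and integer labels standing in for x and predictions (assumption).
x, y = make_blobs(
    n_samples=200,
    centers=[[0, 0], [10, 0], [0, 10], [10, 10], [5, 5]],
    cluster_std=0.5,
    random_state=0,
)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=0)

# fit() fits the scaler on x_train only; score() and predict() reuse that fitted scaler.
model = make_pipeline(StandardScaler(), SVC(kernel='rbf'))
model.fit(x_train, y_train)
accuracy = model.score(x_test, y_test)
print(accuracy)
```

Only the features pass through the scaler in a pipeline, so the labels stay as integers.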

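Again unrelated to the error, but you should scale your x data before using k-means, i.e. you should scale x first and then perform the clustering (left as an exercise, since it has nothing to do with the error).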
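One possible way to approach that exercise is sketched below, under stated assumptions: the feature matrix is synthetic (five made-up groups standing in for the income and spending-score columns), and the cluster count of 5 is carried over from the question. The clustering runs on scaled features, while the classifier's scaler is still fitted on the training split only.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the two feature columns (assumption, not the real data).
rng = np.random.default_rng(0)
centers = np.array([[20, 20], [20, 80], [55, 50], [90, 20], [90, 80]])
x = np.vstack([c + rng.normal(scale=5.0, size=(40, 2)) for c in centers])

# Scale the features first, then cluster the scaled features.
kmeans = KMeans(n_clusters=5, init='k-means++', n_init=10, random_state=0)
labels = kmeans.fit_predict(StandardScaler().fit_transform(x))  # integer labels, never scaled

# Split the unscaled features and labels, then fit the classifier's scaler on the training part only.
x_train, x_test, y_train, y_test = train_test_split(x, labels, test_size=0.25, random_state=0)
sc = StandardScaler()
x_train_scaled = sc.fit_transform(x_train)
x_test_scaled = sc.transform(x_test)

classifier = SVC(kernel='rbf')
classifier.fit(x_train_scaled, y_train)
accuracy = classifier.score(x_test_scaled, y_test)
print(accuracy)
```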
