Found input variables with inconsistent numbers of samples: [4, 1]

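Here is what I did; the code is below. I have the music.csv dataset. The error is "Found input variables with inconsistent numbers of samples: [4, 1]". The full error details follow the code.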

    # Import the data
    import pandas as pd
    music_data = pd.read_csv('music.csv')
    music_data

    # Split into training and test sets; there is no data to clean
    # genre = the prediction
    # The inputs are age and gender, the output is genre
    # Method = drop
    X = music_data.drop(columns=['genre'])  # everything except genre
    # X = input
    Y = music_data['genre']  # genre only
    # Y = output

    # Now choose the algorithm
    from sklearn.tree import DecisionTreeClassifier
    model = DecisionTreeClassifier()  # the model
    model.fit(X, Y)
    prediction = model.predict([[21, 1]])

    from sklearn.model_selection import train_test_split
    X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2)  # 20% of the data is used for testing
    # The first two are inputs, the last is the output
    model.fit(X_train, y_train)
    from sklearn.metrics import accuracy_score
    score = accuracy_score(y_test, predictions)

Then I got this error. It is a ValueError:

    ValueError                                Traceback (most recent call last)
    ~\AppData\Local\Temp/ipykernel_28312/3992581865.py in <module>
          5 model.fit(X_train, y_train)
          6 from sklearn.metrics import accuracy_score
    ----> 7 score = accuracy_score(y_test, predictions)

    c:\users\shrey\appdata\local\programs\python\python39\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
         61             extra_args = len(args) - len(all_args)
         62             if extra_args <= 0:
    ---> 63                 return f(*args, **kwargs)
         64
         65             # extra_args > 0

    c:\users\shrey\appdata\local\programs\python\python39\lib\site-packages\sklearn\metrics\_classification.py in accuracy_score(y_true, y_pred, normalize, sample_weight)
        200
        201     # Compute accuracy for each possible representation
    --> 202     y_type, y_true, y_pred = _check_targets(y_true, y_pred)
        203     check_consistent_length(y_true, y_pred, sample_weight)
        204     if y_type.startswith('multilabel'):

    c:\users\shrey\appdata\local\programs\python\python39\lib\site-packages\sklearn\metrics\_classification.py in _check_targets(y_true, y_pred)
         81     y_pred : array or indicator matrix
         82     """
    ---> 83     check_consistent_length(y_true, y_pred)
         84     type_true = type_of_target(y_true)
         85     type_pred = type_of_target(y_pred)

    c:\users\shrey\appdata\local\programs\python\python39\lib\site-packages\sklearn\utils\validation.py in check_consistent_length(*arrays)
        317     uniques = np.unique(lengths)
        318     if len(uniques) > 1:
    --> 319         raise ValueError("Found input variables with inconsistent numbers of"
        320                          " samples: %r" % [int(l) for l in lengths])
        321

    ValueError: Found input variables with inconsistent numbers of samples: [4, 1]

Please help. I don't know what is going on, but I think it has to do with score = accuracy_score(y_test, predictions).

Answer:

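After the split, your test data has four entries (rows), which means y_test has length 4.

When you predict on [[21, 1]], you are predicting for a single row only, so the prediction has length 1.

That is why you get the inconsistent numbers of samples error: accuracy_score compares labels pairwise, so it needs exactly one prediction per true label.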
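The mismatch can be reproduced without the dataset. The following is a minimal sketch; the genre labels are made-up placeholders, not values taken from music.csv. It passes four true labels and one prediction to accuracy_score and captures the resulting ValueError.

```python
from sklearn.metrics import accuracy_score

# Hypothetical labels: four true values, as in a 4-row test split.
y_test = ["Jazz", "HipHop", "Dance", "Jazz"]
# A single prediction, as returned by model.predict([[21, 1]]).
predictions = ["HipHop"]

try:
    accuracy_score(y_test, predictions)
    error_message = None
except ValueError as exc:
    # The length check runs before any accuracy is computed.
    error_message = str(exc)

print(error_message)
```

The two numbers in the message are simply the lengths of the two arguments, in the order they were passed.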

You can fix this in one of the following ways.

Predict on X_test:
```
prediction = model.predict(X_test) 
```
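If you want to predict on new data, you need to keep the target (y_test) separate from the input features (X_test) and then predict. For example, if the target for [21, 1] is [2]: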
```
prediction = model.predict([[21, 1]])
y_test = [2]  # note: this depends on what the corresponding target label actually is
score = accuracy_score(y_test, prediction)
```
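Putting the first fix together, the whole workflow might look like the sketch below. Since music.csv is not available here, it uses a small made-up DataFrame with assumed column names (age, gender, genre); the fixed random_state values are also an addition, used only to make the run repeatable.

```python
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in for music.csv: 20 rows, so a 20% split gives 4 test rows.
music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30, 31, 33, 35, 37] * 2,
    "gender": [1] * 10 + [0] * 10,
    "genre": (["HipHop"] * 3 + ["Jazz"] * 3 + ["Classical"] * 4
              + ["Dance"] * 3 + ["Acoustic"] * 3 + ["Classical"] * 4),
})

X = music_data.drop(columns=["genre"])  # input features
y = music_data["genre"]                 # target

# Split first, then fit on the training part only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

# Predict on the whole test set so there is one prediction per test label.
predictions = model.predict(X_test)
score = accuracy_score(y_test, predictions)
print(len(y_test), len(predictions), score)

# A single new sample is predicted separately and is not scored against y_test.
new_sample = pd.DataFrame([[21, 1]], columns=X.columns)
single_prediction = model.predict(new_sample)
```

Fitting only on X_train keeps the test rows unseen, so the score reflects how the model handles data it was not trained on.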
