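I'm a beginner in machine learning and am learning by working through Kaggle's Titanic problem. As far as I can tell, I have made sure the metrics are consistent with each other, and of course I blame myself rather than Python for this problem. Still, I can't find the source of the error, and the Spyder IDE is no help.
Here is my code:
Here is the stack trace: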
Traceback (most recent call last):
  File "<ipython-input-3-73797c87986e>", line 1, in <module>
    runfile('C:/Users/Omar/Downloads/Kaggle Competition/Titanic.py', wdir='C:/Users/Omar/Downloads/Kaggle Competition')
  File "C:\Users\Omar\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 705, in runfile
    execfile(filename, namespace)
  File "C:\Users\Omar\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
  File "C:/Users/Omar/Downloads/Kaggle Competition/Titanic.py", line 58, in <module>
    print(accuracy_score(val_y, val_predictions))
  File "C:\Users\Omar\Anaconda3\lib\site-packages\sklearn\metrics\classification.py", line 176, in accuracy_score
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
  File "C:\Users\Omar\Anaconda3\lib\site-packages\sklearn\metrics\classification.py", line 81, in _check_targets
    "and {1} targets".format(type_true, type_pred))
ValueError: Classification metrics can't handle a mix of binary and continuous targets
Answer:
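You are trying to use a regression algorithm (DecisionTreeRegressor) for a binary classification problem. As expected, the regression model produces continuous outputs, and the error is actually raised in the accuracy_score call here: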
  File "C:/Users/Omar/Downloads/Kaggle Competition/Titanic.py", line 58, in <module>
    print(accuracy_score(val_y, val_predictions))
accuracy_score expects binary (class-label) output, hence the error.
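The mismatch can be reproduced in a few lines. This is only a minimal sketch on hypothetical toy data (the lists X and y are made up for illustration and are not the Titanic dataset). Identical feature rows are given different labels so that the tree's leaves are impure and the regressor returns fractional leaf means rather than 0 or 1:

```python
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeRegressor

# Hypothetical toy data: identical feature rows carry different labels,
# so the tree cannot build pure leaves.
X = [[0], [0], [0], [1], [1], [1]]
y = [0, 0, 1, 1, 1, 0]

regressor = DecisionTreeRegressor(random_state=0).fit(X, y)
reg_predictions = regressor.predict(X)  # leaf means, i.e. fractional values

try:
    accuracy_score(y, reg_predictions)
    error_message = None
except ValueError as exc:
    # Same kind of failure as in the stack trace above.
    error_message = str(exc)

print(reg_predictions)
print(error_message)
```

Note that a regressor whose leaves all happen to be pure would predict exactly 0.0 or 1.0 and might not trigger the error, which is likely why the problem can look intermittent on small experiments.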
To start, change your model to:
from sklearn.tree import DecisionTreeClassifier

titanic_model = DecisionTreeClassifier()
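For context, here is a hedged end-to-end sketch of how the classifier might be fitted and scored. The original script is not shown, so train_X, train_y, and val_X are assumed names, and the tiny lists are hypothetical stand-ins for the real Titanic features and labels. Only titanic_model, val_y, val_predictions, and accuracy_score come from the question and its stack trace.

```python
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data standing in for the Titanic features and labels.
train_X = [[0], [0], [0], [1], [1], [1]]
train_y = [0, 0, 1, 1, 1, 0]
val_X = [[0], [1]]
val_y = [0, 1]

# random_state is fixed only to make the sketch reproducible.
titanic_model = DecisionTreeClassifier(random_state=0)
titanic_model.fit(train_X, train_y)

# A classifier predicts class labels, which accuracy_score can compare
# directly against the true labels.
val_predictions = titanic_model.predict(val_X)
accuracy = accuracy_score(val_y, val_predictions)

print(val_predictions)
print(accuracy)
```

Because the predictions are now class labels of the same type as val_y, accuracy_score no longer sees a mix of binary and continuous targets.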