SVM performs poorly on my data. How can I fix it?

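I have a dataset with 510 samples for training and 127 samples for testing, and each sample has 7680 features. I want to build a model that predicts the height label (in cm) from these features. So far I have used an SVM, but the results are very poor. Could you look at my code and give me some comments? The dataset and runnable code are available, so you can try it on your own machine.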


Partial results:
p_regression
[15.67367165 16.35094166 13.10510262 14.03943211 12.7116549  11.45071423 13.27225207  9.44959181 10.45775627 13.23953143 14.95568324 11.35994414 10.69531821 12.42556347 14.54712287 12.25965911  9.04101931 14.03604126 12.41237627 13.51951317 10.36302674  9.86389635 11.41448842 15.67146184 14.74764672 11.22794536 12.04429175 12.48199183 14.29790809 16.21724184 10.94478135  9.68210872 14.8663311   8.62974573 15.17281425 12.97230127  9.46515876 16.24388177 10.35742683 15.65336366 11.04652502 16.35094166 14.03943211 10.29066405 13.27225207  9.44959181 10.45775627 13.23953143 14.95568324 11.35994414 10.69531821 12.42556347 14.54712287 12.25965911  9.04101931 14.03604126 12.41237627 13.51951317 10.36302674  9.86389635 11.41448842 15.67146184 14.74764672 11.22794536 12.04429175 12.48199183 14.29790809 16.21724184 10.94478135  9.68210872 14.8663311   8.62974573 15.17281425 12.97230127  9.46515876 16.24388177 10.35742683 15.65336366 11.04652502 16.35094166 14.03943211 10.29066405 13.27225207  9.44959181 10.45775627 13.23953143 14.95568324 11.35994414 10.69531821 12.42556347 14.54712287 12.25965911  9.04101931 14.03604126 12.41237627 13.51951317 10.36302674  9.86389635 11.41448842 15.67146184 14.74764672 11.22794536 12.04429175 12.48199183 14.29790809 16.21724184 10.94478135  9.68210872 14.8663311   8.62974573 15.17281425 12.97230127  9.46515876 16.24388177 10.35742683 15.65336366 11.04652502 16.35094166 14.03943211 10.29066405 13.27225207  9.44959181 10.45775627 13.23953143 14.95568324 11.35994414 10.69531821]
test_Y
[13. 14. 13. 15. 15. 17. 13. 17. 16. 12. 17.  6.  4.  3.  4.  6.  6.  8.  9. 18.  3.  6.  4.  6.  7.  8. 11. 11. 13. 12. 12. 14. 13. 12. 15. 15. 16. 15. 17. 18. 17. 14. 15. 17. 13. 17. 16. 12. 17.  6.  4.  3.  4.  6.  6.  8.  9. 18.  3.  6.  4.  6.  7.  8. 11. 11. 13. 12. 12. 14. 13. 12. 15. 15. 16. 15. 17. 18. 17. 14. 15. 17. 13. 17. 16. 12. 17.  6.  4.  3.  4.  6.  6.  8.  9. 18.  3.  6.  4.  6.  7.  8. 11. 11. 13. 12. 12. 14. 13. 12. 15. 15. 16. 15. 17. 18. 17. 14. 15. 17. 13. 17. 16. 12. 17.  6.  4.]

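Answer:
Here is a similar approach. We split the dataset into train and test parts. The train set is used to tune hyperparameters and fit several different models. We then pick the best model (judged by MSE) and use it to predict on the test set.
All trained (fitted) models are saved as Pickle files so that they can be loaded later with joblib.load().
Output: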
----------------------------- [SVR_rbf] ------------------------------
Fitting 3 folds for each of 4 candidates, totalling 12 fits
---------------------------- [SVR_linear] ----------------------------
Fitting 3 folds for each of 4 candidates, totalling 12 fits
------------------------------ [Ridge] -------------------------------
Fitting 3 folds for each of 7 candidates, totalling 21 fits
------------------------------ [Lasso] -------------------------------
Fitting 3 folds for each of 6 candidates, totalling 18 fits
--------------------------- [RandomForest] ---------------------------
Fitting 3 folds for each of 3 candidates, totalling 9 fits
----------------------------- [SVR_rbf] ------------------------------
Score:      44.88%
Parameters: {'SVR_rbf__C': 10, 'SVR_rbf__max_iter': 500}
**********************************************************************
---------------------------- [SVR_linear] ----------------------------
Score:      33.40%
Parameters: {'SVR_linear__C': 0.01, 'SVR_linear__max_iter': 1000}
**********************************************************************
------------------------------ [Ridge] -------------------------------
Score:      34.83%
Parameters: {'Ridge__alpha': 500, 'Ridge__max_iter': 200}
**********************************************************************
------------------------------ [Lasso] -------------------------------
Score:      22.90%
Parameters: {'Lasso__alpha': 0.1, 'Lasso__max_iter': 1000}
**********************************************************************
--------------------------- [RandomForest] ---------------------------
Score:      36.87%
Parameters: {'RandomForest__max_depth': 5, 'RandomForest__n_estimators': 250}
**********************************************************************
Mean Squared Error: {'SVR_rbf': 5.375, 'SVR_linear': 7.036, 'Ridge': 7.02, 'Lasso': 8.108, 'RandomForest': 9.475}
Code:
(The code from the original answer is not reproduced in this copy.)
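Since the answer's own code is not available here, the following is only a minimal sketch of the approach described above, not the original script. It assumes scikit-learn and joblib are installed, uses a small synthetic dataset in place of the real 7680-feature data, and wraps each estimator in a pipeline whose step name matches the parameter keys seen in the output (for example `SVR_rbf__C`). The helper `fit_all`, the `candidates` dictionary, the `scale` step with `StandardScaler`, and all grid values are illustrative assumptions. The grid search uses the estimator's default score (R² for regressors), which may or may not be what produced the "Score" percentages above.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for the real data (assumption): 200 samples, 40 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 40))
weights = np.array([3.0, -2.0, 1.5, 1.0, -1.0])
y = X[:, :5] @ weights + rng.normal(scale=0.5, size=200) + 12.0

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Candidate estimators with illustrative grids (all values are assumptions).
candidates = {
    "SVR_rbf": (SVR(kernel="rbf"), {"C": [0.1, 1, 10, 100]}),
    "SVR_linear": (SVR(kernel="linear"), {"C": [0.01, 0.1, 1, 10]}),
    "Ridge": (Ridge(), {"alpha": [0.1, 1, 10, 100]}),
    "Lasso": (Lasso(max_iter=10000), {"alpha": [0.01, 0.1, 1]}),
    "RandomForest": (
        RandomForestRegressor(random_state=0),
        {"n_estimators": [50], "max_depth": [3, 5]},
    ),
}


def fit_all(candidates, X_train, y_train, model_dir):
    """Grid-search every candidate with 3-fold CV and pickle the best pipeline."""
    searches = {}
    for name, (estimator, grid) in candidates.items():
        # Scaling is an added assumption; SVR and the linear models are
        # sensitive to feature scale.
        pipe = Pipeline([("scale", StandardScaler()), (name, estimator)])
        # Prefix each parameter with the step name, e.g. "SVR_rbf__C".
        param_grid = {f"{name}__{key}": values for key, values in grid.items()}
        search = GridSearchCV(pipe, param_grid, cv=3)
        search.fit(X_train, y_train)
        joblib.dump(search.best_estimator_, os.path.join(model_dir, f"{name}.pkl"))
        searches[name] = search
    return searches


model_dir = tempfile.mkdtemp()
searches = fit_all(candidates, X_train, y_train, model_dir)

# Compare the tuned models by mean squared error on the held-out test set.
mse = {
    name: mean_squared_error(y_test, search.predict(X_test))
    for name, search in searches.items()
}
best_name = min(mse, key=mse.get)

# Reload the winning pipeline from disk, as the answer suggests.
best_model = joblib.load(os.path.join(model_dir, f"{best_name}.pkl"))
print("Mean Squared Error:", {k: round(v, 3) for k, v in mse.items()})
print("Best model:", best_name)
```

With the real data, `X` and `y` would be replaced by the 510-sample training set and the 127-sample test set, and the grids would need to be widened or narrowed to fit the available compute.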


