Scikit – SGDRegressor fails to fit

Hello, I am trying to use scikit-learn to fit a small dataset.

import numpy as np
from sklearn import linear_model, model_selection

# Features: three measured columns plus a constant column of ones.
X = np.array([
    [86.5999984741211, 9.10000038146973, 14.3000001907349, 1],
    [66.9000015258789, 17.3999996185303, 11.5, 1],
    [66.3000030517578, 20, 10.6999998092651, 1],
    [78.6999969482422, 15.3999996185303, 12.1000003814697, 1],
    [76.1999969482422, 18.2000007629395, 12.5, 1],
    [84.4000015258789, 9.89999961853027, 12.1000003814697, 1],
    [79.1999969482422, 8.5, 10.1000003814697, 1],
    [77.5, 10.1999998092651, 11.3999996185303, 1],
    [74.4000015258789, 17.7999992370605, 10.6000003814697, 1],
    [870.9000015258789, 13.5, 13, 1],
    [80.0999984741211, 8, 9.10000038146973, 1],
    [80.0999984741211, 10.3000001907349, 9, 1],
    [79.6999969482422, 13.1000003814697, 9.5, 1],
    [76.1999969482422, 13.6000003814697, 11.5, 1],
    [75.5999984741211, 12.1999998092651, 10.8000001907349, 1],
    [81.3000030517578, 13.1000003814697, 9.89999961853027, 1],
    [64.5999984741211, 20.3999996185303, 10.6000003814697, 1],
    [68.3000030517578, 26.3999996185303, 14.8999996185303, 1],
    [80, 10.6999998092651, 10.8999996185303, 1],
    [78.4000015258789, 9.69999980926514, 12, 1],
    [78.8000030517578, 10.6999998092651, 10.6000003814697, 1],
    [76.8000030517578, 15.3999996185303, 13, 1],
    [82.4000015258789, 11.6000003814697, 9.89999961853027, 1],
    [73.9000015258789, 16.1000003814697, 10.8999996185303, 1],
    [64.3000030517578, 24.7000007629395, 14.6999998092651, 1],
    [81, 14.8999996185303, 10.8000001907349, 1],
    [70, 14.3999996185303, 11.1000003814697, 1],
    [76.6999969482422, 11.1999998092651, 8.39999961853027, 1],
    [81.8000030517578, 10.3000001907349, 9.39999961853027, 1],
    [82.1999969482422, 9.89999961853027, 9.19999980926514, 1],
    [76.6999969482422, 10.8999996185303, 9.60000038146973, 1],
    [75.0999984741211, 17.3999996185303, 13.8000001907349, 1],
    [78.8000030517578, 9.80000019073486, 12.3999996185303, 1],
    [74.8000030517578, 16.3999996185303, 12.6999998092651, 1],
    [75.6999969482422, 13, 11.3999996185303, 1],
    [74.5999984741211, 19.8999996185303, 11.1000003814697, 1],
    [81.5, 11.8000001907349, 11.3000001907349, 1],
    [74.6999969482422, 13.1999998092651, 9.60000038146973, 1],
    [72, 11.1999998092651, 10.8000001907349, 1],
    [68.3000030517578, 18.7000007629395, 12.3000001907349, 1],
    [77.0999984741211, 14.1999998092651, 9.39999961853027, 1],
    [67.0999984741211, 19.6000003814697, 11.1999998092651, 1],
    [72.0999984741211, 17.3999996185303, 11.8000001907349, 1],
    [85.0999984741211, 10.6999998092651, 10, 1],
    [75.1999969482422, 9.69999980926514, 10.3000001907349, 1],
    [80.8000030517578, 10, 11, 1],
    [83.8000030517578, 12.1000003814697, 11.6999998092651, 1],
    [78.5999984741211, 12.6000003814697, 10.3999996185303, 1],
    [66, 22.2000007629395, 9.39999961853027, 1],
    [83, 13.3000001907349, 10.8000001907349, 1],
    [73.0999984741211, 26.3999996185303, 22.1000003814697, 1],
])

# Targets, one per row of X.
y = np.array([
    761, 780, 593, 715, 1078, 567, 456, 686, 1206, 723,
    261, 326, 282, 960, 489, 496, 463, 1062, 805, 998,
    126, 792, 327, 744, 434, 178, 679, 82, 339, 138,
    627, 930, 875, 1074, 504, 635, 503, 418, 402, 1023,
    208, 766, 762, 301, 372, 114, 515, 264, 208, 286,
    2922,
])

model = linear_model.SGDRegressor(max_iter=0x7FFFFFFF, tol=1e-12,
                                  learning_rate="constant", eta0=.1,
                                  shuffle=False)
# Alternative that was tried instead of SGDRegressor:
# model = linear_model.Lasso(max_iter=0x7FFFFFF, tol=1e-12)
model.fit(X, y)

print(model.coef_)
print(model.score(X, y))

# Per-row predictions from the coefficients alone (disabled):
# for i in range(0, len(X)):
#     print(np.dot(X[i], model.coef_))

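Ridge/Lasso/ElasticNet fit the data reasonably well (score of about 0.7), but SGDRegressor still fails to fit it well, even when I set a very high iteration count and an extremely low tolerance.

Changing max_iter or tol has no effect on the result; I always get very large coefficients.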


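Answer:

Before applying gradient descent techniques, you need to make sure your features have been scaled. Looking at your X data, this should solve the problem.

from sklearn.preprocessing import StandardScaler

X_scaled = StandardScaler().fit_transform(X)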
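As a rough illustration of the scaling advice, here is a minimal sketch that chains StandardScaler and SGDRegressor with make_pipeline, so the same scaling is applied at fit and predict time. The data is synthetic stand-in data (an assumption, not the dataset from the question), and the SGDRegressor uses its default learning-rate settings rather than the constant eta0=.1 from the question, which is also an assumption.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the question's dataset): three features
# whose means and spreads differ, plus a noisy linear target.
rng = np.random.default_rng(0)
X = rng.normal(loc=[75.0, 14.0, 11.0], scale=[6.0, 5.0, 2.0], size=(200, 3))
y = (3.0 * X[:, 0] - 20.0 * X[:, 1] + 50.0 * X[:, 2]
     + rng.normal(scale=10.0, size=200))

# The pipeline standardizes each feature to zero mean and unit variance
# before the SGD step sees it.
model = make_pipeline(StandardScaler(), SGDRegressor(random_state=0))
model.fit(X, y)

# make_pipeline names each step after its lowercased class name.
sgd = model.named_steps["sgdregressor"]
print(sgd.coef_)   # coefficients in the scaled feature space

score = model.score(X, y)
print(score)       # R^2 on the training data
```

Note that a constant column such as the trailing 1 in the question's X becomes all zeros after standardization. SGDRegressor fits an intercept by default, so that column is not needed.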
