Scikit – SGDRegressor fails to fit

Hi, I am trying to fit a small dataset using scikit-learn.

import numpy as np
from sklearn import linear_model, model_selection

X = np.array([
    [86.5999984741211, 9.10000038146973, 14.3000001907349, 1],
    [66.9000015258789, 17.3999996185303, 11.5, 1],
    [66.3000030517578, 20, 10.6999998092651, 1],
    [78.6999969482422, 15.3999996185303, 12.1000003814697, 1],
    [76.1999969482422, 18.2000007629395, 12.5, 1],
    [84.4000015258789, 9.89999961853027, 12.1000003814697, 1],
    [79.1999969482422, 8.5, 10.1000003814697, 1],
    [77.5, 10.1999998092651, 11.3999996185303, 1],
    [74.4000015258789, 17.7999992370605, 10.6000003814697, 1],
    [870.9000015258789, 13.5, 13, 1],
    [80.0999984741211, 8, 9.10000038146973, 1],
    [80.0999984741211, 10.3000001907349, 9, 1],
    [79.6999969482422, 13.1000003814697, 9.5, 1],
    [76.1999969482422, 13.6000003814697, 11.5, 1],
    [75.5999984741211, 12.1999998092651, 10.8000001907349, 1],
    [81.3000030517578, 13.1000003814697, 9.89999961853027, 1],
    [64.5999984741211, 20.3999996185303, 10.6000003814697, 1],
    [68.3000030517578, 26.3999996185303, 14.8999996185303, 1],
    [80, 10.6999998092651, 10.8999996185303, 1],
    [78.4000015258789, 9.69999980926514, 12, 1],
    [78.8000030517578, 10.6999998092651, 10.6000003814697, 1],
    [76.8000030517578, 15.3999996185303, 13, 1],
    [82.4000015258789, 11.6000003814697, 9.89999961853027, 1],
    [73.9000015258789, 16.1000003814697, 10.8999996185303, 1],
    [64.3000030517578, 24.7000007629395, 14.6999998092651, 1],
    [81, 14.8999996185303, 10.8000001907349, 1],
    [70, 14.3999996185303, 11.1000003814697, 1],
    [76.6999969482422, 11.1999998092651, 8.39999961853027, 1],
    [81.8000030517578, 10.3000001907349, 9.39999961853027, 1],
    [82.1999969482422, 9.89999961853027, 9.19999980926514, 1],
    [76.6999969482422, 10.8999996185303, 9.60000038146973, 1],
    [75.0999984741211, 17.3999996185303, 13.8000001907349, 1],
    [78.8000030517578, 9.80000019073486, 12.3999996185303, 1],
    [74.8000030517578, 16.3999996185303, 12.6999998092651, 1],
    [75.6999969482422, 13, 11.3999996185303, 1],
    [74.5999984741211, 19.8999996185303, 11.1000003814697, 1],
    [81.5, 11.8000001907349, 11.3000001907349, 1],
    [74.6999969482422, 13.1999998092651, 9.60000038146973, 1],
    [72, 11.1999998092651, 10.8000001907349, 1],
    [68.3000030517578, 18.7000007629395, 12.3000001907349, 1],
    [77.0999984741211, 14.1999998092651, 9.39999961853027, 1],
    [67.0999984741211, 19.6000003814697, 11.1999998092651, 1],
    [72.0999984741211, 17.3999996185303, 11.8000001907349, 1],
    [85.0999984741211, 10.6999998092651, 10, 1],
    [75.1999969482422, 9.69999980926514, 10.3000001907349, 1],
    [80.8000030517578, 10, 11, 1],
    [83.8000030517578, 12.1000003814697, 11.6999998092651, 1],
    [78.5999984741211, 12.6000003814697, 10.3999996185303, 1],
    [66, 22.2000007629395, 9.39999961853027, 1],
    [83, 13.3000001907349, 10.8000001907349, 1],
    [73.0999984741211, 26.3999996185303, 22.1000003814697, 1],
])

y = np.array([
    761, 780, 593, 715, 1078, 567, 456, 686, 1206, 723,
    261, 326, 282, 960, 489, 496, 463, 1062, 805, 998,
    126, 792, 327, 744, 434, 178, 679, 82, 339, 138,
    627, 930, 875, 1074, 504, 635, 503, 418, 402, 1023,
    208, 766, 762, 301, 372, 114, 515, 264, 208, 286,
    2922,
])

model = linear_model.SGDRegressor(max_iter=0x7FFFFFFF, tol=1e-12, learning_rate="constant", eta0=.1, shuffle=False)
# model = linear_model.Lasso(max_iter=0x7FFFFFF, tol=1e-12)
model.fit(X, y)
print(model.coef_)
print(model.score(X, y))

# for i in range(0, len(X)):
#     print(np.dot(X[i], model.coef_))

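Ridge/Lasso/ElasticNet fit the data to some extent (score of about 0.7), but SGDRegressor still fails to fit it well, even with an extremely high iteration count and a very low tolerance.

Changing max_iter or tol has no effect on the result; I always get very large coefficients.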


Answer:

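You need to make sure your features are scaled before applying gradient descent techniques. Judging by your X data, they are not, so this should do the trick: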

from sklearn.preprocessing import StandardScaler
X_scaled = StandardScaler().fit_transform(X)
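One way to keep the scaling and the regressor together is a pipeline, so the same transformation is applied at fit and predict time. The following is only a minimal sketch: X_demo and y_demo are synthetic stand-ins invented for illustration (not the data from the question), and the SGDRegressor settings are ordinary defaults rather than tuned values.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption, not the question's data):
# three features on different scales and a noisy linear target.
rng = np.random.default_rng(0)
X_demo = np.column_stack([
    rng.normal(75.0, 6.0, 200),
    rng.normal(14.0, 4.0, 200),
    rng.normal(11.0, 2.0, 200),
])
y_demo = (5.0 * X_demo[:, 0] - 20.0 * X_demo[:, 1] + 30.0 * X_demo[:, 2]
          + rng.normal(0.0, 10.0, 200))

# Standardize each feature to zero mean and unit variance, then run SGD.
pipeline = make_pipeline(
    StandardScaler(),
    SGDRegressor(max_iter=1000, tol=1e-3, random_state=0),
)
pipeline.fit(X_demo, y_demo)

r2_scaled = pipeline.score(X_demo, y_demo)
print(r2_scaled)
print(pipeline.named_steps["sgdregressor"].coef_)
```

Note that the coefficients printed here are in the standardized feature space, not in the original units. Also, a constant column of ones like the last column of X in the question becomes all zeros after StandardScaler; that should be harmless because SGDRegressor fits an intercept by default.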
