Adding a regression line to a plot of the actual data

I have the following data in a pandas DataFrame:

    freq = [10, 2, 1, 10, 6, 4, 1, 1, 6, 3, 4, 10, 6, 3, 9, 5, 5, 5, 4, 2, 2, 9, 11, 7, 5, 1, 3, 10, 7, 5, 5, 5, 8, 7, 25, 17, 9, 6, 7, 8, 4, 10, 3, 1, 7, 11, 6, 5, 10, 11, 8, 11, 15, 4, 6, 11, 6, 10, 10, 10, 4, 5, 7, 15, 15, 10, 12, 17, 25, 26, 22, 14, 15, 15, 7, 9, 8, 6, 1]
    date = [737444, 737445, 737446, 737447, 737448, 737449, 737450, 737451, 737452, 737453, 737454, 737455, 737456, 737457, 737458, 737459, 737460, 737461, 737462, 737463, 737464, 737465, 737466, 737467, 737468, 737469, 737470, 737472, 737473, 737474, 737475, 737476, 737477, 737478, 737479, 737480, 737481, 737482, 737483, 737484, 737485, 737486, 737487, 737488, 737489, 737490, 737491, 737492, 737493, 737494, 737495, 737496, 737497, 737498, 737499, 737500, 737501, 737502, 737503, 737504, 737505, 737506, 737507, 737508, 737509, 737510, 737511, 737512, 737513, 737514, 737515, 737516, 737517, 737518, 737519, 737520, 737521, 737522, 737523]

I have computed the regression coefficient and intercept as follows:

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split

    y = np.asarray(df['Frequency'])
    X = df[['Date']]
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    model = LinearRegression()
    model.fit(X_train, y_train)
    model.score(X_train, y_train)

    coefs = zip(model.coef_, X.columns)
    model.__dict__

This gives the following result:

    Coefficient:
     [0.08711929]
    Intercept:
     -64241.58584385233
    sl = -64241.6 + 0.1 Date

I would like to draw this line on top of a plot showing the trend of the actual data. How can I do that?


Answer:

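A linear regression model is defined by X*model.coef_ + model.intercept_ (which is essentially what predict returns). So you can plot that line together with a scatter plot of the data to get a clear visualization of the result: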

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split

    freq = [10, 2, 1, 10, 6, 4, 1, 1, 6, 3, 4, 10, 6, 3, 9, 5, 5, 5, 4, 2, 2, 9, 11, 7, 5, 1, 3, 10, 7, 5, 5, 5, 8, 7, 25, 17, 9, 6, 7, 8, 4, 10, 3, 1, 7, 11, 6, 5, 10, 11, 8, 11, 15, 4, 6, 11, 6, 10, 10, 10, 4, 5, 7, 15, 15, 10, 12, 17, 25, 26, 22, 14, 15, 15, 7, 9, 8, 6, 1]
    date = [737444, 737445, 737446, 737447, 737448, 737449, 737450, 737451, 737452, 737453, 737454, 737455, 737456, 737457, 737458, 737459, 737460, 737461, 737462, 737463, 737464, 737465, 737466, 737467, 737468, 737469, 737470, 737472, 737473, 737474, 737475, 737476, 737477, 737478, 737479, 737480, 737481, 737482, 737483, 737484, 737485, 737486, 737487, 737488, 737489, 737490, 737491, 737492, 737493, 737494, 737495, 737496, 737497, 737498, 737499, 737500, 737501, 737502, 737503, 737504, 737505, 737506, 737507, 737508, 737509, 737510, 737511, 737512, 737513, 737514, 737515, 737516, 737517, 737518, 737519, 737520, 737521, 737522, 737523]

    X = np.array(date)
    y = np.array(freq)
    # scikit-learn expects a 2-D feature array, hence X[:, None]
    X_train, X_test, y_train, y_test = train_test_split(X[:, None], y, test_size=0.3, random_state=42)

    model = LinearRegression()
    model.fit(X_train, y_train)

    plt.subplots(figsize=(15, 8))
    plt.scatter(date, freq, color='lightblue')
    # Plot using the actual equation to show clearly how it works;
    # this is essentially the same as model.predict(X[:, None])
    plt.plot(date, X*model.coef_ + model.intercept_)
    plt.show()
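One pitfall is worth a quick check here. The equation written out in the question, sl = -64241.6 + 0.1 Date, rounds the slope to one decimal place, and with Date values around 737,000 that rounding matters a great deal. The sketch below uses plain Python and only the numbers printed in the question to compare the two lines at the first date. The helper name `line` and the variable names are made up for this illustration.

```python
# Full-precision values copied from the output printed in the question.
coef = 0.08711929
intercept = -64241.58584385233

# Rounded values from the hand-written equation "sl = -64241.6 + 0.1 Date".
coef_rounded = 0.1
intercept_rounded = -64241.6

first_date = 737444  # first entry of the `date` list


def line(x, slope, offset):
    """Evaluate the straight line slope * x + offset."""
    return slope * x + offset


full_precision = line(first_date, coef, intercept)
rounded = line(first_date, coef_rounded, intercept_rounded)

print(f"full precision: {full_precision:.2f}")
print(f"rounded:        {rounded:.2f}")
print(f"difference:     {rounded - full_precision:.2f}")
```

With the full-precision values the line evaluates to about 4 at the first date, which sits among the observed frequencies, whereas the rounded version lands near 9,500. That is a good reason to read model.coef_ and model.intercept_ directly when plotting rather than retyping the rounded equation.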

