How do I use the X and Y variables in linear regression?

I am trying to predict the cost of an item using simple linear regression. As input data, I use the item's cost.

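The code seems to run fine, but I can't understand how X and Y are used when linear regression is applied. I use X as the item cost and Y as the label (a new column created by shifting the X data).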

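import math
import numpy as np
from sklearn import preprocessing
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
# df is assumed to be a pandas DataFrame with an 'Item Price' column, loaded earlier
df = df[['Item Price']]
forecast_col = 'Item Price'
forecast_out = int(math.ceil(0.0000005 * len(df)))
# The label is the price forecast_out rows ahead of the current row
df['label'] = df[forecast_col].shift(-forecast_out)
X = df[['Item Price']]
X = preprocessing.scale(X)
X_lately = X[forecast_out:]
X = X[:-forecast_out]
df.dropna(inplace=True)
y = np.array(df['label'])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
clf = LinearRegression(n_jobs=-1)
clf.fit(X_train, y_train)
forecast_set = clf.predict(X)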

How are the X and Y variables used when solving the linear regression equation Y = a + bX?

Answer:

Your line of code:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

splits your X and y each into two samples: a training set containing 80% of the data and a test set containing the remaining 20%. Next, the line:
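To make the 80/20 split concrete, here is a minimal sketch on made-up data. The arrays below are hypothetical and are not the asker's item prices; only the `train_test_split` call mirrors the original line.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 10 samples with a single feature each.
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 2.0 * X.ravel() + 1.0

# Same arguments as in the question: 20% held out, fixed seed.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# 8 rows go to training and 2 rows to testing.
print(X_train.shape, X_test.shape)
```

Rows are shuffled before splitting, but each x stays paired with its own y, so the training set still holds matching (x, y) pairs.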

clf = LinearRegression(n_jobs=-1)

creates a linear model. Finally, with your line:

clf.fit(X_train, y_train)

the linear model uses all the (x, y) pairs in X_train and y_train to compute the best-fitting regression line.
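As a rough illustration of what fitting produces, the sketch below trains on hypothetical points that lie exactly on y = 3 + 2x and then reads back the estimated coefficients. In the question's notation Y = a + bX, scikit-learn stores a in `intercept_` and b in `coef_`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data that follows y = 3 + 2x exactly.
X_train = np.array([[1.0], [2.0], [3.0], [4.0]])  # 2-D: (n_samples, n_features)
y_train = np.array([5.0, 7.0, 9.0, 11.0])         # 1-D: (n_samples,)

clf = LinearRegression()
clf.fit(X_train, y_train)

a = clf.intercept_  # estimated intercept
b = clf.coef_[0]    # estimated slope for the single feature

# predict() evaluates a + b * x for each new x.
prediction = clf.predict(np.array([[5.0]]))[0]
print(a, b, prediction)
```

Note that X must be two-dimensional even with a single feature, which is why the question selects `df[['Item Price']]` with double brackets.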

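More mathematically, the algorithm uses all the (x_i, y_i) pairs contained in X_train and y_train to find the values of a and b that minimize the sum of squared errors E:

E = SUM_i (y_i - a - b*x_i)^2

The values of a and b are found where the partial derivatives of E with respect to a and b are both equal to 0. Because E is a convex quadratic in a and b, that stationary point is the minimum.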
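For a single feature, the minimization can be solved by hand. Using the question's notation Y = a + bX (a is the intercept, b is the slope) and the least-squares error E = SUM_i (y_i - a - b*x_i)^2, setting the partial derivatives of E with respect to a and b to zero gives b = SUM((x_i - mean(x)) * (y_i - mean(y))) / SUM((x_i - mean(x))^2) and a = mean(y) - b * mean(x). The sketch below, on hypothetical data, checks that these formulas agree with `LinearRegression`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical noisy data, roughly y = 2x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Closed-form least-squares solution for one feature.
x_mean, y_mean = x.mean(), y.mean()
b = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)  # slope
a = y_mean - b * x_mean                                              # intercept

# The same fit through scikit-learn, for comparison.
model = LinearRegression().fit(x.reshape(-1, 1), y)

print(a, model.intercept_)
print(b, model.coef_[0])
```

Both approaches should report the same intercept and slope up to floating-point rounding.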
