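I built three machine learning models (linear regression, decision tree, and random forest) with Scikit-learn in a Jupyter Notebook. The models are meant to predict a cyclone's size (the prediction/output, ROCI) from several cyclone parameters (the predictors/inputs). The dataset has 9004 rows. Below is an example using the linear regression model.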
In [31]: df.head()
Out[31]:
     NAME    LAT  LON   Pc  Penv  ROCI  Vmax  Pdc
0  HECTOR    -15  128  985  1000   541    18  -15
1  HECTOR    -15  127  990  1000   541  15.4  -10
2  HECTOR    -16  126  992  1000   530    15   -8
3  HECTOR  -16.3  126  992  1000   480  15.4   -8
4  HECTOR  -16.5  126  992  1000   541  15.4   -8

In [32]: X = df[['LAT', 'LON', 'Pc', 'Vmax', 'Pdc']]  # Pdc = Pc - Penv
         y = df['ROCI']

In [33]: X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)

In [34]: lm = LinearRegression()

In [35]: lm.fit(X_train, y_train)
Out[35]: LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)

In [36]: print(lm.intercept_)
         lm.coef_
-3464.3452921023572
Out[36]: array([-2.94229126, 0.29875575, 3.65214265, -1.25577799, -6.43917746])

In [37]: predictions = lm.predict(X_test)
         predictions
Out[37]: array([401.02108725, 420.01451472, 434.4241271 , ..., 287.67803538, 343.80516896, 340.1007666 ])

In [38]: plt.scatter(y_test, predictions)
         plt.xlabel('Recorded')
         plt.ylabel('Predicted')

[Figure: scatter plot of recorded vs. predicted values, shown to illustrate accuracy]
Now, when I try to pass a single value to lm.predict(), I get the following error:
ValueError: Expected 2D array, got scalar array instead:
array=300.
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
I assumed this was because my model was trained on 5 columns of data, so I tried entering the first row of the dataset:
In [39]: lm.predict(-15,128,985,18,-15)
...
TypeError: predict() takes 2 positional arguments but 6 were given
After trying array.reshape as the message suggested, I get:
In [49]: lm.predict(X_test.reshape(-1, 1))
...
AttributeError: 'DataFrame' object has no attribute 'reshape'
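Now I'm confused! Could you please help me use my model to get a single predicted value? What should I pass to lm.predict()? I basically just want to say "Pc=990, Vmax=18, Pdc=-12" and get an output like "ROCI=540". Thank you for your time.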
Answer:
If you want to predict the first row of your data, first convert it to an array:
import numpy as np
first_row = np.array([-15, 128, 985, 18, -15])
Then, when
lm.predict(first_row)
produces an error similar to the one you reported,
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
follow the advice in the message, i.e.:
lm.predict(first_row.reshape(1, -1))
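For reference, here is a minimal, self-contained sketch of the same idea. It does not use the cyclone dataset: the training data below is synthetic and made up purely for illustration, and the names features, sample_df, and sample_from_array are hypothetical. It assumes the model was fitted on a DataFrame with the five columns shown in the question. Because the model was trained on five features, predict() needs a value for every one of them, arranged as a 2D input with one row per sample; supplying only Pc, Vmax, and Pdc would not be enough.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the cyclone data (all values are invented).
rng = np.random.default_rng(0)
features = ['LAT', 'LON', 'Pc', 'Vmax', 'Pdc']
X = pd.DataFrame({
    'LAT': rng.uniform(-25, -10, 200),
    'LON': rng.uniform(110, 150, 200),
    'Pc': rng.uniform(950, 1000, 200),
    'Vmax': rng.uniform(10, 40, 200),
    'Pdc': rng.uniform(-50, -5, 200),
})
# Made-up target so that the model has something to fit.
y = 500 - 2.0 * X['LAT'] + 3.0 * X['Pdc'] + rng.normal(0, 5, 200)

lm = LinearRegression().fit(X, y)

# A flat array of five values has shape (5,), which predict() rejects.
first_row = np.array([-15, 128, 985, 18, -15])
# reshape(1, -1) turns it into one sample with five features: shape (1, 5).
sample_2d = first_row.reshape(1, -1)

# Option 1: wrap the 2D array in a DataFrame with the training column names.
# (Passing sample_2d directly also works, as in the answer above, although
# recent scikit-learn versions may warn about missing feature names.)
sample_from_array = pd.DataFrame(sample_2d, columns=features)
pred_from_array = lm.predict(sample_from_array)

# Option 2: build a one-row DataFrame directly.
sample_df = pd.DataFrame([[-15, 128, 985, 18, -15]], columns=features)
pred_from_df = lm.predict(sample_df)

# Option 3: take an existing row; double brackets keep it a 2D DataFrame.
pred_existing_row = lm.predict(X.iloc[[0]])

print(first_row.shape, sample_2d.shape)
print(pred_from_array, pred_from_df, pred_existing_row)
```

Each call returns an array with a single element, so something like pred_from_df[0] gives the predicted value as a plain number. The X.iloc[[0]] form is also one way around the AttributeError in the question, since a DataFrame has no reshape method.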