Sorry for the newbie question. Here is my code:
from __future__ import division
import sklearn
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt

X = np.array([6,8,10,14,18])
Y = np.array([7,9,13,17.5,18])
X = np.reshape(X,(1,5))
Y = np.reshape(Y,(1,5))
print X
print Y

plt.figure()
plt.title('Pizza Price as a function of Pizza Diameter')
plt.xlabel('Pizza Diameter (Inches)')
plt.ylabel('Pizza Price (Dollars)')
axis = plt.axis([0, 25, 0 ,25])
m, b = np.polyfit(X,Y,1)
plt.grid(True)
plt.plot(X,Y, 'k.')
plt.plot(X, m*X + b, '-')
#plt.show()

#training data
#x= [[6],[8],[10],[14],[18]]
#y= [[7],[9],[13],[17.5],[18]]

# create and fit linear regression model
model = LinearRegression()
model.fit(X,Y)
print 'A 12" pizza should cost $% .2f' % model.predict(19)

#work out cost function, which is residual sum of squares
print 'Residual sum of squares: %.2f' % np.mean((model.predict(x)- y) ** 2)

#work out variance (AKA Mean squared error)
xMean = np.mean(x)
print 'Variance is: %.2f' %np.var([x], ddof=1)

#work out covariance (this is whether the x axis data and y axis data correlate with eachother)
#When a and b are 1-dimensional sequences, numpy.cov(x,y)[0][1] calculates covariance
print 'Covariance is: %.2f' %np.cov(X, Y, ddof = 1)[0][1]

#test the model on new test data, printing the r squared coefficient
X_test = [[8], [9], [11], [16], [12]]
y_test = [[11], [8.5], [15], [18], [11]]
print 'R squared for model on test data is: %.2f' %model.score(X_test,y_test)
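Basically, some of these functions work with my variables X and Y, and some do not.
For example, as the code stands, it throws the following error: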
TypeError: expected 1D vector for x
on this line:
m, b = np.polyfit(X,Y,1)
However, when I comment out the two lines that reshape the variables, like this:
#X = np.reshape(X,(1,5))
#Y = np.reshape(Y,(1,5))
I get the following error:
ValueError: Found input variables with inconsistent numbers of samples: [1, 5]
on this line:
model.fit(X,Y)
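So how do I shape the arrays so that they work with every function in the script, without keeping several differently structured arrays that hold the same data?
Thanks for your help!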
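Answer:
Change these lines to:
X = np.reshape(X,(5))
Y = np.reshape(Y,(5))
This makes X and Y 1-D arrays of shape (5,), which is what np.polyfit expects.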
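One caveat about using a single shape everywhere: np.polyfit, np.var and np.cov are happy with 1-D arrays, but LinearRegression.fit expects X as a 2-D array of shape (n_samples, n_features). A possible way to avoid duplicating the data is to keep the 1-D arrays as the only stored copy and reshape on the fly when calling scikit-learn. The sketch below assumes Python 3 and a reasonably recent NumPy and scikit-learn; the names X_col, price_12 and mse are made up for illustration, and the query diameter of 12 is taken from the message in the question's print statement.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Keep the data as 1-D arrays of shape (5,); np.polyfit expects this.
X = np.array([6, 8, 10, 14, 18], dtype=float)
Y = np.array([7, 9, 13, 17.5, 18])

# Slope and intercept of the least-squares line, straight from the 1-D data.
m, b = np.polyfit(X, Y, 1)

# scikit-learn wants X as (n_samples, n_features).
# reshape(-1, 1) gives a (5, 1) column view of the same data, not a copy.
X_col = X.reshape(-1, 1)

model = LinearRegression()
model.fit(X_col, Y)

# predict() also needs a 2-D input, even for a single sample.
price_12 = model.predict(np.array([[12.0]]))[0]
print('A 12" pizza should cost $%.2f' % price_12)

# The statistics from the question work on the 1-D arrays directly.
mse = np.mean((model.predict(X_col) - Y) ** 2)
print('Mean squared error: %.2f' % mse)
print('Variance of X: %.2f' % np.var(X, ddof=1))
print('Covariance of X and Y: %.2f' % np.cov(X, Y, ddof=1)[0][1])
```

Because X_col is only a view, X stays the single source of the data; the reshape happens only at the boundary with scikit-learn.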