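One problem with linear regression is that it tends to underfit the data. One way to address this is a technique called locally weighted linear regression. I read about it in Andrew Ng's CS229 lecture notes, and I tried writing the following script: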
trX = np.linspace(0, 1, 100)
trY = trX + np.random.normal(0, 1, 100)
sess = tf.Session()
xArr = []
yArr = []
for i in range(len(trX)):
    xArr.append([1.0, float(trX[i])])
    yArr.append(float(trY[i]))
xMat = mat(xArr); yMat = mat(yArr).T
A_tensor = tf.constant(xMat)
b_tensor = tf.constant(yMat)
m = shape(xMat)[0]
weights = mat(eye((m)))
k = 1.0
for j in range(m):
    for i in range(m):
        diffMat = xMat[i] - xMat[j,:]
        weights[j,j] = exp(diffMat*diffMat.T/(-2.0*k**2))
weights_tensor = tf.constant(weights)
# Matrix inverse solution
wA = tf.matmul(weights_tensor, A_tensor)
tA_A = tf.matmul(tf.transpose(A_tensor), wA)
tA_A_inv = tf.matrix_inverse(tA_A)
product = tf.matmul(tA_A_inv, tf.transpose(A_tensor))
solution = tf.matmul(product, b_tensor)
solution_eval = sess.run(solution)
# Extract coefficients
slope = solution_eval[0][0]
y_intercept = solution_eval[1][0]
print('slope: ' + str(slope))
print('y_intercept: ' + str(y_intercept))
# Get best fit line
best_fit = []
for i in xArr:
    best_fit.append(slope*i+y_intercept)
# Plot the results
plt.plot(xArr, yArr, 'o', label='Data')
plt.plot(xArr, best_fit, 'r-', label='Best fit line', linewidth=3)
plt.legend(loc='upper left')
plt.show()
When I run the script above, I get an error: TypeError: 'numpy.float64' object cannot be interpreted as an integer. The error is raised by the following statement:
best_fit.append(slope*i+y_intercept)
I have tried to fix this but have not found a solution so far. Please help.
Answer:
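In the loop, i is a list, e.g. [1.0, 1.0]. Multiplying a list by slope, which is a numpy.float64, is treated as list repetition, and repetition needs an integer count, hence the TypeError. You need to decide which value from the list to use in slope*i. For example: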
best_fit = []
for i in xArr:
    best_fit.append(slope*i[0]+y_intercept)
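To see the failure in isolation, here is a minimal reproduction. It is only a sketch: the numeric values are made up for illustration, `row` stands in for one entry of xArr, and the exact wording of the TypeError message may differ between NumPy versions.

```python
import numpy as np

slope = np.float64(0.5)        # made-up stand-in for the fitted slope
y_intercept = np.float64(0.1)  # made-up stand-in for the fitted intercept
row = [1.0, 0.25]              # one entry of xArr: [constant 1.0, x value]

# numpy.float64 * list is attempted as list repetition, which needs an
# integer count, so it raises TypeError.
try:
    slope * row + y_intercept
    failed = False
except TypeError as err:
    failed = True
    print("TypeError:", err)

# Picking a single element first turns this into plain scalar arithmetic.
value = slope * row[1] + y_intercept
print(value)
```

Indexing with `row[0]` or `row[1]` both avoid the error; which one is meaningful depends on what each position in the list holds.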
The first element of each list appears to always be 1.
...
[1.0, 0.24242424242424243]
[1.0, 0.25252525252525254]
[1.0, 0.26262626262626265]
[1.0, 0.27272727272727276]
[1.0, 0.2828282828282829]
[1.0, 0.29292929292929293]
[1.0, 0.30303030303030304]
[1.0, 0.31313131313131315]
[1.0, 0.32323232323232326]
[1.0, 0.33333333333333337]
[1.0, 0.3434343434343435]
[1.0, 0.3535353535353536]
...
So I think you probably want the second element of each list, which is the x value float(trX[i]) appended when xArr is built:
best_fit = []
for i in xArr:
    best_fit.append(slope*i[1]+y_intercept)
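As an alternative to the explicit loop, the same selection can be done in one step by converting xArr to a NumPy array and slicing out the second column. This is a sketch under assumptions: `slope` and `y_intercept` below are hypothetical placeholder values rather than results from the script, and `x_values` is a name introduced here for illustration.

```python
import numpy as np

# Same layout as xArr in the question: [constant 1.0, x value] per row.
trX = np.linspace(0, 1, 100)
xArr = [[1.0, float(x)] for x in trX]

slope = np.float64(0.9)         # hypothetical placeholder, not a fitted result
y_intercept = np.float64(0.05)  # hypothetical placeholder, not a fitted result

# Take the second column (the x values) as a 1-D array.
x_values = np.asarray(xArr)[:, 1]

# Array arithmetic is element-wise, so no loop or indexing per row is needed.
best_fit = slope * x_values + y_intercept
print(best_fit.shape)
```

Here `x_values` and `best_fit` have the same length, so they could be passed together to plt.plot in place of xArr, which would keep the constant column out of the plot.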