I have 100 entries in a CSV file.
Physics,Maths,Status_class0or1
30,40,0
90,70,1
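Using the data above, I am trying to build a logistic (binary) classifier. Please tell me what I am doing wrong. Why do I get a 3*3 matrix (theta has 9 values when it should have only 3)?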
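Here is the code. Importing the libraries:
import numpy as np
import pandas as pd
from sklearn import preprocessing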
Reading the data from the CSV file:
df = pd.read_csv("LogisticRegressionFirstBinaryClassifier.csv", header=None)
df.columns = ["Maths", "Physics", "AdmissionStatus"]
X = np.array(df[["Maths", "Physics"]])
y = np.array(df[["AdmissionStatus"]])
X = preprocessing.normalize(X)
X = np.c_[np.ones(X.shape[0]), X]
theta = np.ones((X.shape[1], 1))
print(X.shape)      # (100, 3)
print(y.shape)      # (100, 1)
print(theta.shape)  # (3, 1)
The calc_z function computes the dot product of X and theta:
def calc_z(X, theta):
    return np.dot(X, theta)
The sigmoid function:
def sigmoid(z):
    return 1 / (1 + np.exp(-z))
The cost function, its derivative, and gradient descent:
def cost_function(X, y, theta):
    z = calc_z(X, theta)
    h = sigmoid(z)
    return (-y * np.log(h) - (1 - y) * np.log(1 - h)).mean()

print("cost_function =", cost_function(X, y, theta))

def derivativeofcostfunction(X, y, theta):
    z = calc_z(X, theta)
    h = sigmoid(z)
    calculation = np.dot((h - y).T, X)
    return calculation

print("derivativeofcostfunction=", derivativeofcostfunction(X, y, theta))

def grad_desc(X, y, theta, lr=.001, converge_change=.001):
    cost = cost_function(X, y, theta)
    change_cost = 1
    num_iter = 1
    while change_cost > converge_change:
        old_cost = cost
        print(theta)
        print(derivativeofcostfunction(X, y, theta))
        theta = theta - lr * (derivativeofcostfunction(X, y, theta))
        cost = cost_function(X, y, theta)
        change_cost = old_cost - cost
        num_iter += 1
    return theta, num_iter
Here is the output:
[[ 0.4185146 -0.56877556 0.63999433]
 [15.39722864 9.73995197 11.07882445]
 [12.77277463 7.93485324 9.24909626]]
[[0.33944777 0.58199037 0.52493407]
 [0.02106587 0.36300629 0.30297278]
 [0.07040604 0.3969297 0.33737757]]
[[-0.05856159 -0.89826735 0.30849185]
 [15.18035041 9.59004868 10.92827046]
 [12.4804775 7.73302024 9.04599788]]
[[0.33950634 0.58288863 0.52462558]
 [0.00588552 0.35341624 0.29204451]
 [0.05792556 0.38919668 0.32833157]]
[[-5.17526527e-01 -1.21534937e+00 -1.03387571e-02]
 [ 1.49729502e+01 9.44663458e+00 1.07843504e+01]
 [ 1.21978140e+01 7.53778010e+00 8.84964495e+00]]
(array([[ 0.34002386, 0.58410398, 0.52463592],
       [-0.00908743, 0.34396961, 0.28126016],
       [ 0.04572775, 0.3816589 , 0.31948193]]), 46)
Answer:
I changed the code so that it returns the transpose of the matrix, and that solved my problem.
def derivativeofcostfunction(X, y, theta):
    z = calc_z(X, theta)
    h = sigmoid(z)
    calculation = np.dot((h - y).T, X)
    return calculation.T
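The reason the transpose helps is most likely NumPy broadcasting: np.dot((h - y).T, X) has shape (1, 3), and subtracting a (1, 3) array from the (3, 1) theta silently broadcasts to (3, 3). The sketch below reproduces just the shapes. It uses random stand-in data rather than the CSV from the question, and the names gradient_row and gradient_col are made up for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def gradient_row(X, y, theta):
    # Same computation as the question's version:
    # (1, m) @ (m, n) gives a row vector of shape (1, n).
    h = sigmoid(X @ theta)
    return np.dot((h - y).T, X)

def gradient_col(X, y, theta):
    # Same computation as the answer's version:
    # the transpose gives shape (n, 1), matching theta.
    return gradient_row(X, y, theta).T

# Stand-in data (an assumption, not the asker's file):
# 100 rows, a bias column plus 2 features, labels in {0, 1}.
rng = np.random.default_rng(0)
X = np.c_[np.ones(100), rng.random((100, 2))]
y = rng.integers(0, 2, size=(100, 1))
theta = np.ones((X.shape[1], 1))

# (3, 1) minus (1, 3) broadcasts to (3, 3): the 9-value theta.
bad_update = theta - 0.001 * gradient_row(X, y, theta)
# (3, 1) minus (3, 1) stays (3, 1).
good_update = theta - 0.001 * gradient_col(X, y, theta)

print(gradient_row(X, y, theta).shape)  # (1, 3)
print(bad_update.shape)                 # (3, 3)
print(good_update.shape)                # (3, 1)
```

An equivalent way to get the (3, 1) gradient directly is np.dot(X.T, h - y), which avoids the extra transpose.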