After some preprocessing to obtain x_train and y_train, I flattened them. Here is the code snippet.
Flattening code:
x_train = x_train_flatten.T
x_test = x_test_flatten.T
y_test = Y_test.T
y_train = Y_train.T
print("x train: ", x_train.shape)
print("x test: ", x_test.shape)
print("y train: ", y_train.shape)
print("y test: ", y_test.shape)
Result:
x train: (16384, 38)
x test: (16384, 10)
y train: (1, 38)
y test: (1, 10)
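For reference, here is a minimal sketch of how the flattened arrays could have been produced. The 16384 rows suggest 128x128 images (128 * 128 = 16384), but the array names and data below are my assumptions, purely for illustration:

import numpy as np

# Hypothetical inputs: 38 training and 10 test images of 128x128 pixels.
X_train = np.random.rand(38, 128, 128)   # placeholder data, for illustration only
X_test = np.random.rand(10, 128, 128)

# Flatten each image into one 16384-dimensional row vector.
x_train_flatten = X_train.reshape(X_train.shape[0], -1)   # (38, 16384)
x_test_flatten = X_test.reshape(X_test.shape[0], -1)      # (10, 16384)

Transposing these, as in the snippet above, then gives the (16384, 38) and (16384, 10) shapes shown in the printout.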
Normalization process (after flattening):
from sklearn.impute import SimpleImputer
imputer = SimpleImputer(missing_values=np.nan, strategy='mean')
imputer = imputer.fit(x_train)
x_train = imputer.transform(x_train)
x_train = (x_train - np.min(x_train)) / (np.max(x_train) - np.min(x_train))
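One thing worth flagging here: this snippet imputes and rescales x_train only, while x_test is left untouched, so the model is later evaluated on data at a different scale. Below is a sketch of one way to treat both sets consistently; it reworks the last line above, and fitting a second imputer for x_test is my workaround, since the imputer fitted on x_train expects 38 columns while x_test has 10 in this (features, samples) layout:

from sklearn.impute import SimpleImputer
import numpy as np

x_train = SimpleImputer(missing_values=np.nan, strategy='mean').fit_transform(x_train)
x_test = SimpleImputer(missing_values=np.nan, strategy='mean').fit_transform(x_test)

# Record the training-set range BEFORE rescaling, then apply it to both
# sets so they end up on the same scale.
train_min, train_max = np.min(x_train), np.max(x_train)
x_train = (x_train - train_min) / (train_max - train_min)
x_test = (x_test - train_min) / (train_max - train_min)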
Then I wrote some methods for logistic regression.
Logistic regression methods:
import numpy as np
import matplotlib.pyplot as plt

def initialize_weights_and_bias(dimension):
    w = np.full((dimension, 1), 0.01)
    b = 0.0
    return w, b

def sigmoid(z):
    y_head = 1 / (1 + np.exp(-z))
    return y_head

def forward_backward_propagation(w, b, x_train, y_train):
    # forward propagation
    z = np.dot(w.T, x_train) + b
    y_head = sigmoid(z)
    loss = -(1 - y_train) * np.log(1 - y_head) - y_train * np.log(y_head)
    cost = (np.sum(loss)) / x_train.shape[1]  # x_train.shape[1] is for scaling
    # backward propagation
    derivative_weight = (np.dot(x_train, ((y_head - y_train).T))) / x_train.shape[1]  # x_train.shape[1] is for scaling
    derivative_bias = np.sum(y_head - y_train) / x_train.shape[1]  # x_train.shape[1] is for scaling
    gradients = {"derivative_weight": derivative_weight, "derivative_bias": derivative_bias}
    return cost, gradients

def update(w, b, x_train, y_train, learning_rate, number_of_iterarion):
    cost_list = []
    cost_list2 = []
    index = []
    # update (learn) parameters for number_of_iterarion iterations
    for i in range(number_of_iterarion):
        # run forward and backward propagation to find the cost and the gradients
        cost, gradients = forward_backward_propagation(w, b, x_train, y_train)
        cost_list.append(cost)
        # update
        w = w - learning_rate * gradients["derivative_weight"]
        b = b - learning_rate * gradients["derivative_bias"]
        if i % 50 == 0:
            cost_list2.append(cost)
            index.append(i)
            print("Cost after iteration %i: %f" % (i, cost))
    # updated (learned) parameters: weight and bias
    parameters = {"weight": w, "bias": b}
    plt.plot(index, cost_list2)
    plt.xticks(index, rotation='vertical')
    plt.xlabel("Number of Iterations")
    plt.ylabel("Cost")
    plt.show()
    return parameters, gradients, cost_list

def predict(w, b, x_test):
    # x_test is the input for forward propagation
    z = sigmoid(np.dot(w.T, x_test) + b)
    Y_prediction = np.zeros((1, x_test.shape[1]))
    # if z is greater than 0.5, we predict female (y_head = 1);
    # if z is less than 0.5, we predict male (y_head = 0)
    for i in range(z.shape[1]):
        if z[0, i] <= 0.5:
            Y_prediction[0, i] = 0
        else:
            Y_prediction[0, i] = 1
    return Y_prediction

def logistic_regression(x_train, y_train, x_test, y_test, learning_rate, num_iterations):
    # initialize
    dimension = x_train.shape[0]
    w, b = initialize_weights_and_bias(dimension)
    parameters, gradients, cost_list = update(w, b, x_train, y_train, learning_rate, num_iterations)
    y_prediction_test = predict(parameters["weight"], parameters["bias"], x_test)
    y_prediction_train = predict(parameters["weight"], parameters["bias"], x_train)
    # print train/test accuracy
    train_acc_lr = round((100 - np.mean(np.abs(y_prediction_train - y_train)) * 100), 2)
    test_acc_lr = round((100 - np.mean(np.abs(y_prediction_test - y_test)) * 100), 2)
    print("Train accuracy: %", train_acc_lr)
    print("Test accuracy: %", test_acc_lr)
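For orientation, here is a quick sanity check (my snippet, not part of the original post) of the shapes flowing through these methods, matching the printout earlier:

# x_train: (16384, 38), y_train: (1, 38)
w, b = initialize_weights_and_bias(x_train.shape[0])  # w: (16384, 1), b: 0.0
z = np.dot(w.T, x_train) + b                          # (1, 38): one score per sample
cost, gradients = forward_backward_propagation(w, b, x_train, y_train)
print(gradients["derivative_weight"].shape)           # (16384, 1), same shape as w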
Then I called the logistic_regression method to run logistic regression.
logistic_regression(x_train, y_train, x_test, y_test, learning_rate=0.01, num_iterations=700)
After it printed some of the cost values, several of them came out as NaN, as shown below.
Cost after iteration 0: nan
Cost after iteration 50: 10.033753
Cost after iteration 100: 11.253421
Cost after iteration 150: nan
Cost after iteration 200: nan
Cost after iteration 250: nan
Cost after iteration 300: nan
Cost after iteration 350: nan
Cost after iteration 400: nan
Cost after iteration 450: 0.321755
...
How can I fix this?
Answer:
Here is my solution:
- Apply feature scaling to x_train before training the model, to avoid producing NaN values.
I added this code block before calling the logistic_regression method.
from sklearn.preprocessing import StandardScaler
sc_X = StandardScaler()
x_train = sc_X.fit_transform(x_train)
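A note on this fix (my addition, not part of the original answer): x_test should presumably be standardized with the same statistics, otherwise predict runs on data at a different scale than the model saw during training. Also, because x_train here is laid out as (features, samples), StandardScaler as called above standardizes each of the 38 sample columns rather than each feature. One way to standardize per feature and reuse the training statistics on the test set is to fit on the transposed arrays:

from sklearn.preprocessing import StandardScaler

sc_X = StandardScaler()
# Fit on the samples-as-rows layout (38, 16384) so the statistics are
# per feature, then transpose back to the (features, samples) layout.
x_train = sc_X.fit_transform(x_train.T).T
x_test = sc_X.transform(x_test.T).T   # reuse the training-set statistics

As for why scaling removes the NaNs: with unscaled inputs, np.dot(w.T, x_train) can grow so large in magnitude that sigmoid saturates to exactly 0.0 or 1.0 in floating point, np.log then returns -inf, and the cost evaluates to nan (via 0 * inf). Standardized features keep z in a range where the loss stays finite.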