I am using the following regularized `cost()` and `gradient()` functions:
```python
def cost(theta, x, y, lam):
    theta = theta.reshape(1, len(theta))
    predictions = sigmoid(np.dot(x, np.transpose(theta))).reshape(len(x), 1)
    # L2 penalty, skipping the intercept term theta_0
    regularization = (lam / (len(x) * 2)) * np.sum(np.square(np.delete(theta, 0, 1)))
    complete = -1 * np.dot(np.transpose(y), np.log(predictions)) \
               - np.dot(np.transpose(1 - y), np.log(1 - predictions))
    return np.sum(complete) / len(x) + regularization


def gradient(theta, x, y, lam):
    theta = theta.reshape(1, len(theta))
    predictions = sigmoid(np.dot(x, np.transpose(theta))).reshape(len(x), 1)
    theta_without_intercept = theta.copy()
    theta_without_intercept[0, 0] = 0
    assert theta_without_intercept.shape == theta.shape
    regularization = (lam / len(x)) * np.sum(theta_without_intercept)
    return np.sum(np.multiply((predictions - y), x), 0) / len(x) + regularization
```
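These functions rely on a `sigmoid` helper that is not shown; it is assumed to be the usual element-wise logistic function:

```python
import numpy as np

def sigmoid(z):
    # Element-wise logistic function.
    return 1.0 / (1.0 + np.exp(-z))
```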
Using these functions with `scipy.optimize.fmin_bfgs()`, I get the following output (which is almost correct):
```
Starting loss value: 0.69314718056
Warning: Desired error not necessarily achieved due to precision loss.
         Current function value: 0.208444
         Iterations: 8
         Function evaluations: 51
         Gradient evaluations: 39
7.53668131651e-08
Trained loss value: 0.208443907192
```
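For context, the output above comes from a call of roughly this shape (a minimal sketch: the random data is only a stand-in for my real training set, and `cost`, `gradient`, and `sigmoid` are the functions defined above):

```python
import numpy as np
from scipy.optimize import fmin_bfgs

# Stand-in data: 100 examples, a bias column plus two random features.
x = np.hstack([np.ones((100, 1)), np.random.randn(100, 2)])
y = (np.random.rand(100, 1) > 0.5).astype(float)
lam = 1.0

# All-zero initialization, which gives the ln(2) starting loss above.
theta0 = np.zeros(x.shape[1])
print('Starting loss value:', cost(theta0, x, y, lam))

theta_opt = fmin_bfgs(cost, theta0, fprime=gradient, args=(x, y, lam))
print('Trained loss value:', cost(theta_opt, x, y, lam))
```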
Below is the regularization formula.
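These are the standard regularized logistic-regression cost and gradient (with $m$ training examples and no penalty on the intercept $\theta_0$), which the code above is meant to implement:

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\Big[y^{(i)}\log h_\theta(x^{(i)}) + (1-y^{(i)})\log\big(1-h_\theta(x^{(i)})\big)\Big] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$

$$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\big(h_\theta(x^{(i)}) - y^{(i)}\big)\,x_j^{(i)} + \frac{\lambda}{m}\theta_j \qquad (j \geq 1)$$

where $h_\theta(x^{(i)})$ is the sigmoid of $\theta^T x^{(i)}$.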
If I comment out the regularization terms before the `scipy.optimize.fmin_bfgs()` call, it works fine and correctly returns the local optimum. Why does this happen?
Update:
After receiving further comments, I updated the cost and gradient regularization in the code above, but the warning still appears (the new output is shown above). The scipy `check_grad` function returns: 7.53668131651e-08.
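A minimal sketch of that check, assuming the same `theta0`, `x`, `y`, and `lam` as above; `check_grad` compares the analytic gradient with a finite-difference approximation of `cost`, so a value near zero means the two agree:

```python
from scipy.optimize import check_grad

# Extra positional args are forwarded to both cost() and gradient().
err = check_grad(cost, gradient, theta0, x, y, lam)
print(err)  # ~7.5e-08 here
```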
Update 2:
I am using the Iris dataset from the UCI Machine Learning Repository. Following a One-vs-All classification model, I trained on the Iris-setosa class first.
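A sketch of how the data might be prepared for this (hypothetical loading code; the column names and the raw `iris.data` file from the UCI repository are assumptions):

```python
import numpy as np
import pandas as pd

# The raw UCI file has four numeric columns and a species label, no header.
data = pd.read_csv('iris.data', header=None,
                   names=['sepal_len', 'sepal_wid', 'petal_len', 'petal_wid', 'species'])

# One-vs-all: Iris-setosa becomes the positive class, everything else 0.
y = (data['species'] == 'Iris-setosa').astype(float).values.reshape(-1, 1)

# Feature matrix with a leading column of ones for the intercept term.
x = np.hstack([np.ones((len(data), 1)), data.iloc[:, :4].values])
```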
Answer:
The problem was in my regularization computation: for some reason, I was summing the theta values inside the regularization term: `regularization = (lam / len(x)) * np.sum(theta_without_intercept)`. The `np.sum` does not belong there: it collapses the per-parameter penalties into a single scalar, so every gradient component received the same averaged regularization, which degraded the subsequent prediction loss. Thanks for the help anyway.
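To make the difference concrete, a tiny sketch with made-up numbers: the buggy version yields one scalar added uniformly to every gradient component, while the fix keeps one penalty per parameter.

```python
import numpy as np

theta_wo_bias = np.array([[0.0, 0.5, -1.5]])  # made-up parameter values
lam, m = 1.0, 100

# Buggy: a single scalar added uniformly to all gradient components.
print((lam / m) * np.sum(theta_wo_bias))  # -0.01

# Fixed: an element-wise penalty, proportional to each parameter.
print((lam / m) * theta_wo_bias)          # [[ 0.     0.005 -0.015]]
```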
The corrected gradient method:
```python
def gradient(theta, x, y, lam):
    theta_len = len(theta)
    theta = theta.reshape(1, theta_len)
    predictions = sigmoid(np.dot(x, np.transpose(theta))).reshape(len(x), 1)
    # Zero out the intercept so theta_0 is not regularized.
    theta_wo_bias = theta.copy()
    theta_wo_bias[0, 0] = 0
    assert theta_wo_bias.shape == theta.shape
    # One penalty term per parameter (no np.sum here).
    regularization = np.squeeze(((lam / len(x)) * theta_wo_bias).reshape(theta_len, 1))
    return np.sum(np.multiply((predictions - y), x), 0) / len(x) + regularization
```
Output:
```
Starting loss value: 0.69314718056
Optimization terminated successfully.
         Current function value: 0.201681
         Iterations: 30
         Function evaluations: 32
         Gradient evaluations: 32
7.53668131651e-08
Trained loss value: 0.201680992316
```