Is it possible to minimize a loss function by changing only some elements of a variable? In other words, if I have a variable X of length 2, how do I minimize my loss function by changing X[0] while keeping X[1] fixed?
Hopefully this code I have attempted will describe my problem:
import tensorflow as tf
import tensorflow.contrib.opt as opt

X = tf.Variable([1.0, 2.0])
X0 = tf.Variable([3.0])

Y = tf.constant([2.0, -3.0])

scatter = tf.scatter_update(X, [0], X0)

with tf.control_dependencies([scatter]):
    loss = tf.reduce_sum(tf.squared_difference(X, Y))

opt = opt.ScipyOptimizerInterface(loss, [X0])

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    opt.minimize(sess)

    print("X: {}".format(X.eval()))
    print("X0: {}".format(X0.eval()))
This produces the following output:
INFO:tensorflow:Optimization terminated with:
  Message: b'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
  Objective function value: 26.000000
  Number of iterations: 0
  Number of functions evaluations: 1
X: [3. 2.]
X0: [3.]
whereas I would like to find the optimal value X0 = 2 and hence X = [2, 2].
EDIT
Motivation for doing this: I would like to import a trained graph/model and then tweak various elements of some of the variables depending on some new data I have.
Answer:
You can use this trick to restrict the gradient calculation to one index:
import tensorflow as tf
import tensorflow.contrib.opt as opt

X = tf.Variable([1.0, 2.0])
part_X = tf.scatter_nd([[0]], [X[0]], [2])

X_2 = part_X + tf.stop_gradient(-part_X + X)

Y = tf.constant([2.0, -3.0])

loss = tf.reduce_sum(tf.squared_difference(X_2, Y))

opt = opt.ScipyOptimizerInterface(loss, [X])

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    opt.minimize(sess)

    print("X: {}".format(X.eval()))
part_X contains only the value you want to change, placed in a one-hot-style vector with the same shape as X. part_X + tf.stop_gradient(-part_X + X) is identical to X in the forward pass, since part_X - part_X is 0. In the backward pass, however, tf.stop_gradient blocks all of the unneeded gradient computations, so only X[0] receives a gradient.
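The effect of this trick can be checked directly with tf.gradients. The sketch below is my own variant, not the answer's exact code: it assumes a TF 1.x session and uses an explicit 0/1 mask in place of tf.scatter_nd, then prints the gradient of the loss with respect to X, which should be zero at the frozen index.

import tensorflow as tf

X = tf.Variable([1.0, 2.0])
Y = tf.constant([2.0, -3.0])

# 1 where X may change, 0 where it is frozen
mask = tf.constant([1.0, 0.0])

# Forward pass: X_2 equals X. Backward pass: gradients only flow
# through the mask * X term, i.e. into X[0].
X_2 = mask * X + tf.stop_gradient((1.0 - mask) * X)

loss = tf.reduce_sum(tf.squared_difference(X_2, Y))
grad = tf.gradients(loss, X)[0]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(grad))  # prints [-2.  0.] -- no gradient reaches X[1]

Masking the gradient this way is exactly what the one-hot part_X construction achieves; an explicit mask simply makes the frozen/free split easier to read when more than one index is involved.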