I'm trying to add gradient clipping to my graph. I used the approach recommended here: How to effectively apply gradient clipping in TensorFlow?
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
if gradient_clipping:
    gradients = optimizer.compute_gradients(loss)
    clipped_gradients = [(tf.clip_by_value(grad, -1, 1), var) for grad, var in gradients]
    opt = optimizer.apply_gradients(clipped_gradients, global_step=global_step)
else:
    opt = optimizer.minimize(loss, global_step=global_step)
But when I turn gradient clipping on, I get the following stack trace:
<ipython-input-19-be0dcc63725e> in <listcomp>(.0)
     61 if gradient_clipping:
     62     gradients = optimizer.compute_gradients(loss)
---> 63     clipped_gradients = [(tf.clip_by_value(grad, -1., 1.), var) for grad, var in gradients]
     64     opt = optimizer.apply_gradients(clipped_gradients, global_step=global_step)
     65 else:

/home/armence/mlsandbox/venv/lib/python3.4/site-packages/tensorflow/python/ops/clip_ops.py in clip_by_value(t, clip_value_min, clip_value_max, name)
     51   with ops.op_scope([t, clip_value_min, clip_value_max], name,
     52                     "clip_by_value") as name:
---> 53     t = ops.convert_to_tensor(t, name="t")
     54
     55     # Go through list of tensors, for each value in each tensor clip

/home/armence/mlsandbox/venv/lib/python3.4/site-packages/tensorflow/python/framework/ops.py in convert_to_tensor(value, dtype, name, as_ref)
    619   for base_type, conversion_func in funcs_at_priority:
    620     if isinstance(value, base_type):
--> 621       ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
    622       if ret is NotImplemented:
    623         continue

/home/armence/mlsandbox/venv/lib/python3.4/site-packages/tensorflow/python/framework/constant_op.py in _constant_tensor_conversion_function(v, dtype, name, as_ref)
    178                                          as_ref=False):
    179   _ = as_ref
--> 180   return constant(v, dtype=dtype, name=name)
    181
    182

/home/armence/mlsandbox/venv/lib/python3.4/site-packages/tensorflow/python/framework/constant_op.py in constant(value, dtype, shape, name)
    161   tensor_value = attr_value_pb2.AttrValue()
    162   tensor_value.tensor.CopyFrom(
--> 163       tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape))
    164   dtype_value = attr_value_pb2.AttrValue(type=tensor_value.tensor.dtype)
    165   const_tensor = g.create_op(

/home/armence/mlsandbox/venv/lib/python3.4/site-packages/tensorflow/python/framework/tensor_util.py in make_tensor_proto(values, dtype, shape)
    344   else:
    345     if values is None:
--> 346       raise ValueError("None values not supported.")
    347   # if dtype is provided, forces numpy array to be the type
    348   # provided if possible.

ValueError: None values not supported.
How can I fix this?

Answer:

So, one solution that seems to work is this:
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
if gradient_clipping:
    gradients = optimizer.compute_gradients(loss)

    def ClipIfNotNone(grad):
        if grad is None:
            return grad
        return tf.clip_by_value(grad, -1, 1)

    clipped_gradients = [(ClipIfNotNone(grad), var) for grad, var in gradients]
    opt = optimizer.apply_gradients(clipped_gradients, global_step=global_step)
else:
    opt = optimizer.minimize(loss, global_step=global_step)
It looks like compute_gradients returns None rather than a zero tensor for any variable the loss does not depend on, and tf.clip_by_value does not accept None. So the fix is simply to avoid passing None into clip_by_value and to pass the None entries through unchanged.
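To see where the None comes from, here is a minimal sketch (hypothetical variable names, assuming the same TF 1.x-era API as the question): a variable that the loss does not depend on is paired with None rather than a zero tensor.

import tensorflow as tf

w_used = tf.Variable(1.0)
w_unused = tf.Variable(1.0)  # the loss never touches this variable
loss = tf.square(w_used)

optimizer = tf.train.GradientDescentOptimizer(0.1)
# Returns [(<grad Tensor>, w_used), (None, w_unused)]: None, not a zero
# tensor, marks variables with no path to the loss.
gradients = optimizer.compute_gradients(loss)

Passing the None entries through is safe because apply_gradients skips variables whose gradient is None. As an aside, if you would rather not special-case None yourself, tf.clip_by_global_norm ignores None entries in its input list and returns them as None, though it clips by the global norm of all gradients together rather than clipping each element to [-1, 1] as above. A sketch of that variant (clip_norm=1.0 is an arbitrary choice; learning_rate and global_step are the same variables as in the snippets above):

gradients, variables = zip(*optimizer.compute_gradients(loss))
# None entries pass through clip_by_global_norm untouched, so no filtering helper is needed.
clipped_gradients, _ = tf.clip_by_global_norm(gradients, clip_norm=1.0)
opt = optimizer.apply_gradients(zip(clipped_gradients, variables), global_step=global_step)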