Gradient clipping appears to break on None values

I am trying to add gradient clipping to my graph. I used the approach recommended here: How do you effectively apply gradient clipping in TensorFlow?

    optimizer = tf.train.GradientDescentOptimizer(learning_rate)
    if gradient_clipping:
        gradients = optimizer.compute_gradients(loss)
        clipped_gradients = [(tf.clip_by_value(grad, -1, 1), var) for grad, var in gradients]
        opt = optimizer.apply_gradients(clipped_gradients, global_step=global_step)
    else:
        opt = optimizer.minimize(loss, global_step=global_step)

But when I turn gradient clipping on, I get the following stack trace:

    <ipython-input-19-be0dcc63725e> in <listcomp>(.0)
         61         if gradient_clipping:
         62             gradients = optimizer.compute_gradients(loss)
    ---> 63             clipped_gradients = [(tf.clip_by_value(grad, -1., 1.), var) for grad, var in gradients]
         64             opt = optimizer.apply_gradients(clipped_gradients, global_step=global_step)
         65         else:

    /home/armence/mlsandbox/venv/lib/python3.4/site-packages/tensorflow/python/ops/clip_ops.py in clip_by_value(t, clip_value_min, clip_value_max, name)
         51   with ops.op_scope([t, clip_value_min, clip_value_max], name,
         52                    "clip_by_value") as name:
    ---> 53     t = ops.convert_to_tensor(t, name="t")
         54
         55     # Go through list of tensors, for each value in each tensor clip

    /home/armence/mlsandbox/venv/lib/python3.4/site-packages/tensorflow/python/framework/ops.py in convert_to_tensor(value, dtype, name, as_ref)
        619     for base_type, conversion_func in funcs_at_priority:
        620       if isinstance(value, base_type):
    --> 621         ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
        622         if ret is NotImplemented:
        623           continue

    /home/armence/mlsandbox/venv/lib/python3.4/site-packages/tensorflow/python/framework/constant_op.py in _constant_tensor_conversion_function(v, dtype, name, as_ref)
        178                                          as_ref=False):
        179   _ = as_ref
    --> 180   return constant(v, dtype=dtype, name=name)
        181
        182

    /home/armence/mlsandbox/venv/lib/python3.4/site-packages/tensorflow/python/framework/constant_op.py in constant(value, dtype, shape, name)
        161   tensor_value = attr_value_pb2.AttrValue()
        162   tensor_value.tensor.CopyFrom(
    --> 163       tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape))
        164   dtype_value = attr_value_pb2.AttrValue(type=tensor_value.tensor.dtype)
        165   const_tensor = g.create_op(

    /home/armence/mlsandbox/venv/lib/python3.4/site-packages/tensorflow/python/framework/tensor_util.py in make_tensor_proto(values, dtype, shape)
        344   else:
        345     if values is None:
    --> 346       raise ValueError("None values not supported.")
        347     # if dtype is provided, forces numpy array to be the type
        348     # provided if possible.

    ValueError: None values not supported.

How can I fix this?


Answer:

So one solution that seems to work is this:

    optimizer = tf.train.GradientDescentOptimizer(learning_rate)
    if gradient_clipping:
        gradients = optimizer.compute_gradients(loss)

        def ClipIfNotNone(grad):
            if grad is None:
                return grad
            return tf.clip_by_value(grad, -1, 1)

        clipped_gradients = [(ClipIfNotNone(grad), var) for grad, var in gradients]
        opt = optimizer.apply_gradients(clipped_gradients, global_step=global_step)
    else:
        opt = optimizer.minimize(loss, global_step=global_step)

It looks like compute_gradients returns None rather than a zero tensor when the gradient would be a zero tensor, and tf.clip_by_value does not support None values. So the fix is simply to avoid passing None into it and to keep the None entries as they are.
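For completeness, here is a minimal alternative sketch (not from the original answer; it assumes the same optimizer, loss, and global_step are already defined): instead of keeping the None entries, you can drop those (grad, var) pairs before clipping, since a variable with a None gradient would not be updated in any case.

    # Hypothetical variant, assuming optimizer, loss and global_step already exist.
    gradients = optimizer.compute_gradients(loss)
    # Keep only the pairs that actually have a gradient, then clip those.
    clipped_gradients = [(tf.clip_by_value(grad, -1., 1.), var)
                         for grad, var in gradients
                         if grad is not None]
    opt = optimizer.apply_gradients(clipped_gradients, global_step=global_step)

If you prefer norm-based clipping, tf.clip_by_global_norm is documented to ignore None entries in its input list, so it can be applied to the raw gradients without this extra filtering.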
