I am trying to train the "wide and deep learning" model on my own dataset, and I get the error below when fitting the model to the training set.
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-15-8f5351c1fdf8> in <module>()
----> 1 m.fit(input_fn=train_input_fn, steps=200)

/Users/prisma/anaconda/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.pyc in fit(self, x, y, input_fn, steps, batch_size, monitors, max_steps)
    331         steps=steps,
    332         monitors=monitors,
--> 333         max_steps=max_steps)
    334     logging.info('Loss for final step: %s.', loss)
    335     return self

/Users/prisma/anaconda/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.pyc in _train_model(self, input_fn, steps, feed_fn, init_op, init_feed_fn, init_fn, device_fn, monitors, log_every_steps, fail_on_nan_loss, max_steps)
    660       features, targets = input_fn()
    661       self._check_inputs(features, targets)
--> 662       train_op, loss_op = self._get_train_ops(features, targets)
    663
    664       # Add default monitors.

/Users/prisma/anaconda/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/dnn_linear_combined.pyc in _get_train_ops(self, features, targets)
    188     logits = self._logits(features, is_training=True)
    189     if self._enable_centered_bias:
--> 190       centered_bias_step = [self._centered_bias_step(targets, features)]
    191     else:
    192       centered_bias_step = []

/Users/prisma/anaconda/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/dnn_linear_combined.pyc in _centered_bias_step(self, targets, features)
    272     with ops.name_scope(None, "centered_bias", (targets, features)):
    273       training_loss = self._target_column.training_loss(
--> 274           logits, targets, features)
    275     # Learn central bias by an optimizer. 0.1 is a convervative lr for a
    276     # single variable.

/Users/prisma/anaconda/lib/python2.7/site-packages/tensorflow/contrib/layers/python/layers/target_column.pyc in training_loss(self, logits, target, features, name)
    204     """
    205     target = target[self.name] if isinstance(target, dict) else target
--> 206     loss_unweighted = self._loss_fn(logits, target)
    207
    208     weight_tensor = self.get_weight_tensor(features)

/Users/prisma/anaconda/lib/python2.7/site-packages/tensorflow/contrib/layers/python/layers/target_column.pyc in _log_loss_with_two_classes(logits, target)
    387     target = array_ops.expand_dims(target, dim=[1])
    388   loss_vec = nn.sigmoid_cross_entropy_with_logits(logits,
--> 389                                                   math_ops.to_float(target))
    390   return loss_vec
    391

/Users/prisma/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/nn.pyc in sigmoid_cross_entropy_with_logits(logits, targets, name)
    432     except ValueError:
    433       raise ValueError("logits and targets must have the same shape (%s vs %s)"
--> 434                        % (logits.get_shape(), targets.get_shape()))
    435
    436     # The logistic loss formula from above is

ValueError: logits and targets must have the same shape ((?, 1) vs (13647309, 24))
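I don't understand why the logits have shape (?, 1) instead of (13647309, 24). The input_fn function should return a feature dictionary of size (13647309, 24) and a label tensor of shape (13647309, 24). As far as I know, the logits are the model's output, but DNNLinearCombinedClassifier has no obvious place to specify the output size, so I assumed the output size would be adjusted automatically to match the label size, i.e. (13647309, 24). I don't know why this error occurs, but I suspect something is wrong with my model. The full code is too long to paste, so I am only including the part that builds the model.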
model_dir = tempfile.mkdtemp()
m = tf.contrib.learn.DNNLinearCombinedClassifier(
    model_dir=model_dir,
    linear_feature_columns=wide_columns,
    dnn_feature_columns=deep_columns,
    dnn_hidden_units=[100, 50])
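I did not change the model parameters from the TensorFlow tutorial. I only defined 'wide_columns' and 'deep_columns' for my own dataset. Is the problem in the model or in my input function? I could not find reference documentation for DNNLinearCombinedClassifier on the tf.learn API site.
Update: code of the input function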
def input_fn(df):
    continuous_cols = {k: tf.constant(df[k].values) for k in CONTINUOUS_COLUMNS}
    categorical_cols = {k: tf.SparseTensor(
        indices=[[i, 0] for i in range(df[k].size)],
        values=df[k].values,
        shape=[df[k].size, 1])
        for k in CATEGORICAL_COLUMNS}
    feature_cols = dict(continuous_cols.items() + categorical_cols.items())
    label = tf.constant(df[Label_COLUMNS].values)
    return feature_cols, label
'Label_COLUMNS' contains 24 columns (channels).
Answer:
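The problem is that you need to specify n_classes=24 in the constructor of DNNLinearCombinedClassifier. The default is two classes, which is why the traceback goes through _log_loss_with_two_classes and the logits have a single column, (?, 1). See the class documentation for details.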
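To make the shape mismatch concrete, here is a minimal pure-Python sketch of the bookkeeping involved. It does not call TensorFlow. The helper names `logits_width` and `one_hot_to_class_ids` are hypothetical, and the convention they encode (one logit for a binary problem, one logit per class otherwise, with labels given as integer class ids) is my understanding of this API rather than something stated in the question, so please check it against the documentation for your TensorFlow version.

```python
# Sketch only: mimics the shape bookkeeping, not TensorFlow itself.

def logits_width(n_classes):
    """Number of logit columns the classifier head is assumed to produce.

    Assumed convention: a single logit for binary problems,
    one logit per class otherwise.
    """
    if n_classes < 2:
        raise ValueError("n_classes must be at least 2")
    return 1 if n_classes == 2 else n_classes


def one_hot_to_class_ids(rows):
    """Convert one-hot label rows to integer class ids.

    Only valid when every row contains exactly one 1 and zeros elsewhere
    (single-label data).
    """
    ids = []
    for row in rows:
        if row.count(1) != 1 or any(v not in (0, 1) for v in row):
            raise ValueError("row is not one-hot: %r" % (row,))
        ids.append(row.index(1))
    return ids


# Default n_classes=2 gives one logit column, matching the (?, 1) in the error.
print(logits_width(2))
# n_classes=24 gives 24 logit columns.
print(logits_width(24))

# Hypothetical 3-class labels, one row per example.
labels = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
print(one_hot_to_class_ids(labels))
```

If the 24 label columns are one-hot (exactly one 1 per row), a conversion like this would turn the (N, 24) label array into N class ids. If a row can contain several 1s, the task is multi-label rather than multi-class, and this conversion does not apply.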