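I'm building a chatbot with Python and Keras, so I created a Keras CNN model and now want to train it. I created two classes, one for greetings and one for goodbyes. When I run the training code, it throws an error about incompatible shapes.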
Training Data Shape: [[1. 1.]
 [1. 1.]]
Target Data Shape: [1. 1.]
Number of classes: 2
Classes: ['byes' 'greeting']
Epoch 1/100
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-8-95b45bf4ab5b> in <module>()
     78 #Call fns
     79 create_class()
---> 80 model()
     81
     82 def chat():

11 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/func_graph.py in wrapper(*args, **kwargs)
    971           except Exception as e:  # pylint:disable=broad-except
    972             if hasattr(e, "ag_error_metadata"):
--> 973               raise e.ag_error_metadata.to_exception(e)
    974             else:
    975               raise

ValueError: in user code:

    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:806 train_function  *
        return step_function(self, iterator)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:796 step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:1211 run
        return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:2585 call_for_each_replica
        return self._call_for_each_replica(fn, args, kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:2945 _call_for_each_replica
        return fn(*args, **kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:789 run_step  **
        outputs = model.train_step(data)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:749 train_step
        y, y_pred, sample_weight, regularization_losses=self.losses)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/compile_utils.py:204 __call__
        loss_value = loss_obj(y_t, y_p, sample_weight=sw)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/losses.py:149 __call__
        losses = ag_call(y_true, y_pred)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/losses.py:253 call  **
        return ag_fn(y_true, y_pred, **self._fn_kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/util/dispatch.py:201 wrapper
        return target(*args, **kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/losses.py:1535 categorical_crossentropy
        return K.categorical_crossentropy(y_true, y_pred, from_logits=from_logits)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/util/dispatch.py:201 wrapper
        return target(*args, **kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/backend.py:4687 categorical_crossentropy
        target.shape.assert_is_compatible_with(output.shape)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/tensor_shape.py:1134 assert_is_compatible_with
        raise ValueError("Shapes %s and %s are incompatible" % (self, other))

    ValueError: Shapes (None, 1) and (None, 2) are incompatible
Here is the code that reproduces the error:
#Create arrays
nn = [2, 4, 6, 8, 12, 10, 8, 2]
labels = []
words = []
docs_x = []
docs_y = []

with open("intents.json") as file:
    data = json.load(file)

for intent in data["intents"]:
    for pattern in intent["patterns"]:
        pattern = intent["tag"]
    for response in intent["responses"]:
        response = intent["responses"]
    for tags in intent["tag"]:
        tags = intent["tag"]
    #Add tags to labels[]
    if intent["tag"] not in labels:
        labels.append(intent["tag"])
    #Add patterns to docs_y[]
    if intent["patterns"] not in words:
        docs_y.append(intent["patterns"])

def create_class():
    y = np.array(labels)
    x = np.array(words)
    classes = np.unique(y)
    nClasses = len(classes)
    print("Number of classes: " , nClasses)
    print("Classes: " , classes)

def create_training_data():
    training_data = np.ones(np.shape(docs_y))
    target_data = np.ones(np.shape(labels))
    print("Training Data Shape: " , training_data)
    print("Target Data Shape: " , target_data)

#Create training shapes:
create_training_data()

def model():
    INIT_LR = 1e-3
    epochs = 6
    batch_size = 64
    model = kr.Sequential()
    model.add(kr.layers.Dense(nn[1], activation='relu'))
    model.add(kr.layers.Dense(nn[2], activation='relu'))
    model.add(kr.layers.Dense(nn[3], activation='relu'))
    model.add(kr.layers.Dense(nn[4], activation='relu'))
    model.add(kr.layers.Dense(nn[5], activation='relu'))
    model.add(kr.layers.Dense(nn[6], activation='relu'))
    model.add(kr.layers.Dense(nn[7], activation='sigmoid'))
    model.compile(loss=kr.losses.categorical_crossentropy,
                  optimizer=kr.optimizers.Adagrad(lr=INIT_LR, decay=INIT_LR / 100),
                  metrics=['accuracy'])
    #Training
    model.fit(training_data , target_data , epochs=100)

#Call fns
create_class()
model()
Can anyone help me?
Thanks.
Answer:
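Your last Dense layer (the output layer), which uses the sigmoid activation, should have only 1 neuron, not 2. You have two output classes, which makes this binary classification, so the output layer only needs to produce a single probability between 0 and 1 for the "positive" class. The probability of the other class is simply 1 minus that value. This is also what the error message reports: the targets have shape (None, 1), while the model outputs shape (None, 2).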
So, try changing
model.add(kr.layers.Dense(nn[7], activation='sigmoid'))
to
model.add(kr.layers.Dense(1, activation='sigmoid'))
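To make the reasoning concrete, here is a minimal NumPy sketch (not Keras itself) of what a one-unit sigmoid output means. The scores, the labels, and the encoding 1 = "greeting", 0 = "byes" are made up for illustration, and the `sigmoid` and `binary_crossentropy` helpers are hand-written stand-ins rather than library calls. The sketch shows that one probability per sample covers both classes, and that a two-column output fails a shape check against one-column targets in the same way as the traceback.

```python
import numpy as np

def sigmoid(z):
    """Map raw scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(y_true, p_pos, eps=1e-7):
    """Mean binary cross-entropy; both arrays must have the same shape."""
    if y_true.shape != p_pos.shape:
        raise ValueError(
            f"Shapes {y_true.shape} and {p_pos.shape} are incompatible"
        )
    p = np.clip(p_pos, eps, 1.0 - eps)  # avoid log(0)
    per_sample = -(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))
    return float(np.mean(per_sample))

# Hypothetical raw scores from a one-unit output layer, 4 samples.
logits = np.array([[2.0], [-1.0], [0.0], [3.0]])
# Hypothetical labels: 1 = "greeting", 0 = "byes" (assumed encoding).
y_true = np.array([[1.0], [0.0], [1.0], [1.0]])

p_greeting = sigmoid(logits)   # shape (4, 1): probability of the positive class
p_byes = 1.0 - p_greeting      # the other class needs no second neuron

loss = binary_crossentropy(y_true, p_greeting)
print(p_greeting.shape, y_true.shape, loss)

# A two-unit output has shape (4, 2) and no longer matches (4, 1) targets.
two_unit_output = np.hstack([p_byes, p_greeting])
try:
    binary_crossentropy(y_true, two_unit_output)
except ValueError as err:
    print(err)
```

One related point worth checking in the original model: with a single sigmoid unit, the loss usually paired with it in Keras is binary_crossentropy rather than categorical_crossentropy, since the categorical loss expects one output column per class.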