Using multiple loss functions on (partially) overlapping sub-models in Keras

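I have a model in Keras for which I want to use two loss functions. The model consists of an autoencoder with a classifier on top of it. I want one loss function that makes sure the autoencoder fits reasonably well (for example, mean squared error, mse) and another that evaluates the classifier (for example, categorical_crossentropy). I want to fit the model with a single loss that is a linear combination of these two loss functions.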

# Loss functions
from keras import backend as K

def ae_mse_loss(x_true, x_pred):
    ae_loss = K.mean(K.square(x_true - x_pred), axis=1)
    return ae_loss

def clf_loss(y_true, y_pred):
    return K.sum(K.categorical_crossentropy(y_true, y_pred), axis=-1)

def combined_loss(y_true, y_pred):
    ???
    return ae_loss + w1*clf_loss

where w1 is a weight that defines the "importance of clf_loss" in the final combined loss.

# Autoencoder
ae_in_layer = Input(shape=in_dim, name='ae_in_layer')
ae_interm_layer1 = Dense(interm_dim, activation='relu', name='ae_interm_layer1')(ae_in_layer)
ae_mid_layer = Dense(latent_dim, activation='relu', name='ae_mid_layer')(ae_interm_layer1)
ae_interm_layer2 = Dense(interm_dim, activation='relu', name='ae_interm_layer2')(ae_mid_layer)
ae_out_layer = Dense(in_dim, activation='linear', name='ae_out_layer')(ae_interm_layer2)
ae_model = Model(ae_in_layer, ae_out_layer)
ae_model.compile(optimizer='adam', loss=ae_mse_loss)

# Classifier
clf_in_layer = Dense(interm_dim, activation='sigmoid', name='clf_in_layer')(ae_out_layer)
clf_out_layer = Dense(3, activation='softmax', name='clf_out_layer')(clf_in_layer)
clf_model = Model(clf_in_layer, clf_out_layer)
clf_model.compile(optimizer='adam', loss=combined_loss, metrics=[ae_mse_loss, clf_loss])

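What I am unsure about is how to tell y_true and y_pred apart in the two loss functions, since they refer to the true and predicted data at different stages of the model. What I have in mind is something like the following (I don't know how to implement it, because apparently I can only pass a single pair of arguments, y_true and y_pred):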

def combined_loss(y_true, y_pred):
    ae_loss = ae_mse_loss(x_true_ae, x_pred_ae)
    clf_loss = clf_loss(y_true_clf, y_pred_clf)
    return ae_loss + w1*clf_loss

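I could define this as two separate models and train each one on its own, but if possible I would really like to do everything in one go, because that would optimize both problems at the same time. I realize this model doesn't make much sense, but it is a simple illustration of the (much more complicated) problem I am trying to solve.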

Any suggestions would be greatly appreciated.

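Answer:

You can do this with native Keras alone.

The loss_weights argument combines multiple losses for you automatically.

In the example below I tried to reproduce your setup, combining an mse loss for the regression task with a categorical_crossentropy loss for the classification task.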

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

in_dim = 10
interm_dim = 64
latent_dim = 32
n_class = 3
n_sample = 100

# Random inputs and one-hot encoded random labels
X = np.random.uniform(0, 1, (n_sample, in_dim))
y = tf.keras.utils.to_categorical(np.random.randint(0, n_class, n_sample))

# Autoencoder
ae_in_layer = Input(shape=(in_dim,), name='ae_in_layer')
ae_interm_layer1 = Dense(interm_dim, activation='relu', name='ae_interm_layer1')(ae_in_layer)
ae_mid_layer = Dense(latent_dim, activation='relu', name='ae_mid_layer')(ae_interm_layer1)
ae_interm_layer2 = Dense(interm_dim, activation='relu', name='ae_interm_layer2')(ae_mid_layer)
ae_out_layer = Dense(in_dim, activation='linear', name='ae_out_layer')(ae_interm_layer2)

# Classifier, fed by the autoencoder's reconstruction
clf_in_layer = Dense(interm_dim, activation='sigmoid', name='clf_in_layer')(ae_out_layer)
clf_out_layer = Dense(n_class, activation='softmax', name='clf_out_layer')(clf_in_layer)

# One model with two outputs; each output gets its own loss, keyed by layer name
model = Model(ae_in_layer, [ae_out_layer, clf_out_layer])
model.compile(optimizer='adam',
              loss={'ae_out_layer': 'mse', 'clf_out_layer': 'categorical_crossentropy'},
              loss_weights={'ae_out_layer': 1., 'clf_out_layer': 0.5})

# Targets are given in the same order as the outputs: X for the reconstruction, y for the classes
model.fit(X, [X, y], epochs=10)

In this particular case, the loss is 1*ae_out_layer_loss + 0.5*clf_out_layer_loss.
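To make that weighted sum concrete, here is a small pure-Python sketch of the arithmetic on a toy batch. It is not Keras code: the helper names (mse, categorical_crossentropy, combined_loss) and the toy numbers are made up for illustration, and it assumes no regularization terms and no sample weights, so each per-output loss is simply averaged over the batch before weighting.

```python
import math

def mse(x_true, x_pred):
    """Mean squared error over the features of one sample."""
    return sum((t - p) ** 2 for t, p in zip(x_true, x_pred)) / len(x_true)

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

def combined_loss(batch, w_ae=1.0, w_clf=0.5):
    """Average each per-output loss over the batch, then take the weighted sum."""
    ae = sum(mse(x, x_hat) for x, x_hat, _, _ in batch) / len(batch)
    clf = sum(categorical_crossentropy(y, y_hat) for _, _, y, y_hat in batch) / len(batch)
    return ae, clf, w_ae * ae + w_clf * clf

# Toy batch (hypothetical values):
# (input, reconstruction, one-hot label, predicted class probabilities)
batch = [
    ([0.0, 1.0], [0.0, 0.5], [1, 0, 0], [0.5, 0.25, 0.25]),
    ([1.0, 1.0], [1.0, 1.0], [0, 1, 0], [0.25, 0.5, 0.25]),
]

ae, clf, total = combined_loss(batch)
print("ae loss:", ae)
print("clf loss:", clf)
print("total = 1.0*ae + 0.5*clf:", total)
```

The point of the sketch is that each output is compared only with its own target (the reconstruction with X, the class probabilities with y), which is why you never need to unpack two kinds of y_true inside a single loss function.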
