I am implementing a multi-task regression model using the shared-layers part of the Keras API.
There are two datasets, call them data_1 and data_2, as shown below.
data_1 : shape(1434, 185, 37)
data_2 : shape(283, 185, 37)
data_1 contains 1434 samples; each sample is 185 characters long, and 37 is the total number of unique characters, in other words the vocab_size. In contrast, data_2 contains 283 samples.
Before feeding data_1 and data_2 into the embedding layer, I convert each of them into a 2D numpy array, as follows.
data_1 = np.argmax(data_1, axis=2)
data_2 = np.argmax(data_2, axis=2)
After this conversion, the data shapes are as follows.
print(np.shape(data_1))   # (1434, 185)
print(np.shape(data_2))   # (283, 185)
Each number in these matrices represents an integer index.
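To make the conversion concrete, here is a minimal sketch (toy values made up for illustration) of how np.argmax recovers the integer indices from one-hot vectors:

import numpy as np

# One sample, 2 timesteps, vocab of 4: each row is a one-hot character
sample = np.array([[[0, 0, 1, 0],
                    [1, 0, 0, 0]]])    # shape (1, 2, 4)
print(np.argmax(sample, axis=2))       # [[2 0]] -- the integer indices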
The multi-task model looks like this.
user_input = keras.layers.Input(shape=(185,), name='Input_1')
products_input = keras.layers.Input(shape=(185,), name='Input_2')

shared_embed = keras.layers.Embedding(vocab_size, 50, input_length=185)
user_vec_1 = shared_embed(user_input)
user_vec_2 = shared_embed(products_input)

input_vecs = keras.layers.concatenate([user_vec_1, user_vec_2], name='concat')
input_vecs_1 = keras.layers.Flatten()(input_vecs)
input_vecs_2 = keras.layers.Flatten()(input_vecs)

# Task 1 FC layers
nn = keras.layers.Dense(90, activation='relu', name='layer_1')(input_vecs_1)
result_a = keras.layers.Dense(1, activation='linear', name='output_1')(nn)

# Task 2 FC layers
nn1 = keras.layers.Dense(90, activation='relu', name='layer_2')(input_vecs_2)
result_b = keras.layers.Dense(1, activation='linear', name='output_2')(nn1)

model = Model(inputs=[user_input, products_input], outputs=[result_a, result_b])
model.compile(optimizer='rmsprop', loss='mse', metrics=['accuracy'])
I then fit the model as follows.
model.fit([data_1, data_2], [Y_1, Y_2], epochs=10)
Error:
ValueError: All input arrays (x) should have the same number of samples. Got array shapes: [(1434, 185), (283, 185)]
Is there a way in Keras to use two inputs with different numbers of samples, or some trick to avoid this error so I can reach my multi-task regression goal?
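(For context: fit() builds each training batch by indexing every input array with the same sample indices, so all inputs must share the first dimension. I know I could simply oversample the smaller dataset to equalize the counts, as in the hypothetical sketch below, but that duplicates samples for the second task, so I am looking for a cleaner approach.)

import numpy as np

rng = np.random.default_rng(0)                          # hypothetical workaround, not used below
idx = rng.choice(len(data_2), size=len(data_1), replace=True)
data_2_up, Y_2_up = data_2[idx], Y_2[idx]               # upsample data_2 to 1434 samples
# model.fit([data_1, data_2_up], [Y_1, Y_2_up], epochs=10)   # shapes now match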
Here is a minimal working example for testing.
data_1 = np.array([[25, 5, 11, 24, 6],
                   [25, 5, 11, 24, 6],
                   [25, 0, 11, 24, 6],
                   [25, 11, 28, 11, 24],
                   [25, 11, 6, 11, 11]])
data_2 = np.array([[25, 11, 31, 6, 11],
                   [25, 11, 28, 11, 31],
                   [25, 11, 11, 11, 31]])
Y_1 = np.array([[2.33], [2.59], [2.59], [2.54], [4.06]])
Y_2 = np.array([[2.9], [2.54], [4.06]])

user_input = keras.layers.Input(shape=(5,), name='Input_1')
products_input = keras.layers.Input(shape=(5,), name='Input_2')

shared_embed = keras.layers.Embedding(37, 3, input_length=5)
user_vec_1 = shared_embed(user_input)
user_vec_2 = shared_embed(products_input)

input_vecs = keras.layers.concatenate([user_vec_1, user_vec_2], name='concat')
input_vecs_1 = keras.layers.Flatten()(input_vecs)
input_vecs_2 = keras.layers.Flatten()(input_vecs)

# Task 1 FC layers
nn = keras.layers.Dense(90, activation='relu', name='layer_1')(input_vecs_1)
result_a = keras.layers.Dense(1, activation='linear', name='output_1')(nn)

# Task 2 FC layers
nn1 = keras.layers.Dense(90, activation='relu', name='layer_2')(input_vecs_2)
result_b = keras.layers.Dense(1, activation='linear', name='output_2')(nn1)

model = Model(inputs=[user_input, products_input], outputs=[result_a, result_b])
model.compile(optimizer='rmsprop', loss='mse', metrics=['accuracy'])
model.fit([data_1, data_2], [Y_1, Y_2], epochs=10)
Answer:

New answer:
Here I have written a solution using TensorFlow 2. What you need is:
- Define a dynamic input whose shape is taken from the data
- Use average pooling so that your dense layer dimensions are independent of the input dimension (see the shape sketch after this list)
- Compute the losses separately
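To illustrate the second point, a minimal shape check (assuming TF 2.x) shows that GlobalAveragePooling1D averages the timestep axis away, so the Dense layer that follows never sees the sequence length:

import tensorflow as tf

x = tf.random.normal((4, 185, 50))                 # (batch, timesteps, embedding_dim)
y = tf.keras.layers.GlobalAveragePooling1D()(x)
print(y.shape)                                     # (4, 50): timesteps are averaged away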
Here is the modified example code for reference:
## Do this
# pip install tensorflow==2.0.0
import tensorflow as tf
import tensorflow.keras as keras
import numpy as np
from tensorflow.keras.models import Model

data_1 = np.array([[25, 5, 11, 24, 6],
                   [25, 5, 11, 24, 6],
                   [25, 0, 11, 24, 6],
                   [25, 11, 28, 11, 24],
                   [25, 11, 6, 11, 11]])
data_2 = np.array([[25, 11, 31, 6, 11],
                   [25, 11, 28, 11, 31],
                   [25, 11, 11, 11, 31]])
Y_1 = np.array([[2.33], [2.59], [2.59], [2.54], [4.06]])
Y_2 = np.array([[2.9], [2.54], [4.06]])

# Dynamic inputs: shape=(None,) accepts sequences of any length,
# so no fixed input_length is passed to the shared embedding
user_input = keras.layers.Input(shape=(None,), name='Input_1')
products_input = keras.layers.Input(shape=(None,), name='Input_2')

shared_embed = keras.layers.Embedding(37, 3)
user_vec_1 = shared_embed(user_input)
user_vec_2 = shared_embed(products_input)

# Task 1 FC layers: pooling makes the Dense input size length-independent
x = keras.layers.GlobalAveragePooling1D()(user_vec_1)
nn = keras.layers.Dense(90, activation='relu', name='layer_1')(x)
result_a = keras.layers.Dense(1, activation='linear', name='output_1')(nn)

# Task 2 FC layers
x = keras.layers.GlobalAveragePooling1D()(user_vec_2)
nn1 = keras.layers.Dense(90, activation='relu', name='layer_2')(x)
result_b = keras.layers.Dense(1, activation='linear', name='output_2')(nn1)

model = Model(inputs=[user_input, products_input], outputs=[result_a, result_b])

loss = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.Adam()

loss_values = []
num_iter = 300
for i in range(num_iter):
    with tf.GradientTape() as tape:
        # Forward pass.
        logits = model([data_1, data_2])
        # Compute the two task losses separately, then sum them
        loss_value = loss(Y_1, logits[0]) + loss(Y_2, logits[1])
        loss_values.append(float(loss_value))   # store as float for plotting
    gradients = tape.gradient(loss_value, model.trainable_weights)
    optimizer.apply_gradients(zip(gradients, model.trainable_weights))

import matplotlib.pyplot as plt
plt.plot(range(num_iter), loss_values)
plt.xlabel("iterations")
plt.ylabel("loss value")
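Once trained, the same model call runs inference on the two differently sized inputs; a short usage sketch reusing the arrays from the code above:

# Each output keeps its own batch size, since the two branches never interact
pred_1, pred_2 = model([data_1, data_2])
print(pred_1.shape)   # (5, 1): one regression value per data_1 sample
print(pred_2.shape)   # (3, 1): one regression value per data_2 sample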