I'm trying to create a single embedding for two distinct input sequences. For each observation, I take a sequence of integer symbols and a time-series vector, and want to produce one embedding vector. A standard single-input approach seems to be an autoencoder: feed the data in as both the input and the target, then extract the hidden layer's output as your embedding.
I'm using Keras, and it seems I'm close. Input 1 has shape (1000000, 50) (one million integer lists of length 50). Input 2 has shape (1000000, 50, 1).
Here is my Keras code:
###########################################
# Input 1: event type sequences
input_1a = Input(shape=(max_seq_length,), dtype='int32', name='first_input')

# Input 1: Embedding layer
input_1b = Embedding(output_dim=embedding_length, input_dim=num_unique_event_symbols, input_length=max_seq_length, mask_zero=True)(input_1a)

# Input 1: LSTM
input_1c = LSTM(10, return_sequences=True)(input_1b)

###########################################
# Input 2: unix time (minutes) vectors
input_2a = Input(shape=(max_seq_length, 1), dtype='float32', name='second_input')

# Input 2: Masking
input_2b = Masking(mask_value=99999999.0)(input_2a)

# Input 2: LSTM
input_2c = LSTM(10, return_sequences=True)(input_2b)

###########################################
# Concatenation layer here
x = keras.layers.concatenate([input_1c, input_2c])
x2 = Dense(40, activation='relu')(x)
x3 = Dense(20, activation='relu', name="journey_embeddings")(x2)

###########################################
# Re-create the inputs
xl = Lambda(lambda x: x, output_shape=lambda s: s)(x3)
xf = Flatten()(xl)
xf1 = Dense(20, activation='relu')(xf)
xf2 = Dense(50, activation='relu')(xf1)
xd = Dense(20, activation='relu')(x3)
xd2 = TimeDistributed(Dense(1, activation='linear'))(xd)

###########################################
# Compile and fit the model
model = Model(inputs=[input_1a, input_2a], outputs=[xf2, xd2])
model.compile(optimizer=rms_prop, loss='mse')
print(model.summary())
np.random.seed(21)
model.fit([X1, X2], [X1, X2], epochs=1, batch_size=200)
Once I run this, I extract the output of the "journey_embeddings" hidden layer like this:
layer_name = 'journey_embeddings'
intermediate_layer_model = Model(inputs=model.input, outputs=model.get_layer(layer_name).output)
intermediate_output = intermediate_layer_model.predict([X1, X2])
However, the shape of intermediate_output is (1000000, 50, 20). I want a single embedding vector of length 20 per observation. How do I get output of shape (1000000, 20)?
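The extra timestep axis comes from return_sequences=True: each LSTM emits one vector per timestep, and a Keras Dense layer applied to a 3-D tensor acts on the last axis only, so "journey_embeddings" is computed once per timestep. A minimal NumPy sketch of that shape arithmetic (toy dimensions, not the real model):

```python
import numpy as np

# Toy stand-ins for the real shapes: batch=4, timesteps=50, features=20, units=20.
batch, timesteps, features, units = 4, 50, 20, 20

# With return_sequences=True the upstream output is (batch, timesteps, features).
# A Dense kernel multiplies only the last axis, so the result keeps the
# timestep axis: one 20-dim vector per timestep, not one per sequence.
seq_out = np.random.rand(batch, timesteps, features)
kernel = np.random.rand(features, units)
per_step = seq_out @ kernel
print(per_step.shape)  # (4, 50, 20)

# With return_sequences=False the LSTM returns only its final state,
# (batch, features); the same Dense then yields one vector per sequence.
last_out = seq_out[:, -1, :]  # stand-in for return_sequences=False
per_seq = last_out @ kernel
print(per_seq.shape)  # (4, 20)
```

This is why the fix in the answer below the fold is to switch both LSTMs to return_sequences=False.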
Answer:
Thanks to @, the following code works:
###########################################
# Input 1: event type sequences
input_1a = Input(shape=(max_seq_length,), dtype='int32', name='first_input')

# Input 1: Embedding layer
input_1b = Embedding(output_dim=embedding_length, input_dim=num_unique_event_symbols, input_length=max_seq_length, mask_zero=True)(input_1a)

# Input 1: LSTM
input_1c = LSTM(10, return_sequences=False)(input_1b)

###########################################
# Input 2: unix time (minutes) vectors
input_2a = Input(shape=(max_seq_length, 1), dtype='float32', name='second_input')

# Input 2: Masking
input_2b = Masking(mask_value=99999999.0)(input_2a)

# Input 2: LSTM
input_2c = LSTM(10, return_sequences=False)(input_2b)

###########################################
# Concatenation layer here
x = keras.layers.concatenate([input_1c, input_2c])
x2 = Dense(40, activation='relu')(x)
x3 = Dense(20, activation='relu', name="journey_embeddings")(x2)

###########################################
# An arbitrary number of dense, hidden layers here
xf1 = Dense(20, activation='relu')(x3)
xf2 = Dense(50, activation='relu')(xf1)
xd = Dense(50, activation='relu')(x3)
xd2 = Reshape((50, 1))(xd)

###########################################
# Compile and fit the model
model = Model(inputs=[input_1a, input_2a], outputs=[xf2, xd2])
model.compile(optimizer=rms_prop, loss='mse')
print(model.summary())
np.random.seed(21)
model.fit([X1, X2], [X1, X2], epochs=1, batch_size=200)
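As a sanity check that the fixed graph really produces per-observation embeddings, here is a self-contained sketch with made-up small dimensions (n=32 observations, 10 symbols, embedding length 8 are all placeholder values, not the real ones) that builds the same architecture and extracts the "journey_embeddings" layer:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras.layers import Input, Embedding, LSTM, Masking, Dense, Reshape
from tensorflow.keras.models import Model

# Placeholder hyperparameters, chosen small so this runs quickly.
max_seq_length, num_unique_event_symbols, embedding_length = 50, 10, 8
n = 32  # stand-in for the 1,000,000 observations

# Input 1: integer symbol sequences -> embedding -> final LSTM state.
in1 = Input(shape=(max_seq_length,), dtype='int32')
e = Embedding(input_dim=num_unique_event_symbols, output_dim=embedding_length, mask_zero=True)(in1)
h1 = LSTM(10, return_sequences=False)(e)

# Input 2: time-series vectors -> masking -> final LSTM state.
in2 = Input(shape=(max_seq_length, 1), dtype='float32')
m = Masking(mask_value=99999999.0)(in2)
h2 = LSTM(10, return_sequences=False)(m)

# Bottleneck: one 20-dim embedding per observation.
x = keras.layers.concatenate([h1, h2])
x = Dense(40, activation='relu')(x)
emb = Dense(20, activation='relu', name='journey_embeddings')(x)

# Decoder heads reconstructing both inputs, as in the answer above.
out1 = Dense(50, activation='relu')(Dense(20, activation='relu')(emb))
out2 = Reshape((50, 1))(Dense(50, activation='relu')(emb))
model = Model([in1, in2], [out1, out2])

# Random stand-in data (symbols start at 1 so 0 stays the mask value).
X1 = np.random.randint(1, num_unique_event_symbols, size=(n, max_seq_length))
X2 = np.random.rand(n, max_seq_length, 1).astype('float32')

# Extract the bottleneck exactly as in the question.
extractor = Model(model.input, model.get_layer('journey_embeddings').output)
print(extractor.predict([X1, X2]).shape)  # (32, 20)
```

With return_sequences=False on both LSTMs, the timestep axis collapses before the concatenation, so the extracted embeddings come out as (n, 20) rather than (n, 50, 20).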