AttributeError: 'Tensor' object has no attribute '_keras_history' when implementing a co-attention layer

Hi everyone. I'm trying to build a custom co-attention layer for a matching task, and I've run into an error that confuses me.

model = Model(inputs=[ans_input, ques_input], outputs=output)

My program crashes when it reaches the line above, and throws this error:

AttributeError: 'Tensor' object has no attribute '_keras_history'

This means my model can't form a complete graph, I guess. I've tried many of the fixes I found on StackOverflow and other blogs, but none of them worked. :(
I'll paste my model below. Thanks for helping me 🙂

import time

import numpy as np
from keras import backend as K
from keras.layers import Embedding, LSTM, TimeDistributed, Lambda
from keras.layers.core import *
from keras.layers.merge import concatenate
from keras.layers.pooling import GlobalMaxPooling1D
from keras.models import *
from keras.optimizers import *
from dialog.keras_lstm.k_call import *
from dialog.model.keras_himodel import ZeroMaskedEntries, logger


class Co_AttLayer(Layer):
    def __init__(self, **kwargs):
        # self.input_spec = [InputSpec(ndim=3)]
        super(Co_AttLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        assert len(input_shape) == 2
        assert len(input_shape[0]) == len(input_shape[1])
        super(Co_AttLayer, self).build(input_shape)

    def cosine_sim(self, x):
        ans_ss = K.sum(K.square(x[0]), axis=2, keepdims=True)
        ans_norm = K.sqrt(K.maximum(ans_ss, K.epsilon()))
        ques_ss = K.sum(K.square(x[1]), axis=2, keepdims=True)
        ques_norm = K.sqrt(K.maximum(ques_ss, K.epsilon()))
        tr_ques_norm = K.permute_dimensions(ques_norm, (0, 2, 1))
        tr_ques = K.permute_dimensions(x[1], (0, 2, 1))
        ss = K.batch_dot(x[0], tr_ques, axes=[2, 1])
        den = K.batch_dot(ans_norm, tr_ques_norm, axes=[2, 1])
        return ss / den

    def call(self, x, mask=None):
        cosine = Lambda(self.cosine_sim)(x)
        coqWij = K.softmax(cosine)
        print(x[1].shape, coqWij.shape)
        ai = K.dot(coqWij, x[1])  # (N A Q) (N Q L)
        coaWij = K.softmax(K.permute_dimensions(cosine, (0, 2, 1)))
        qj = K.dot(coaWij, x[0])
        print(qj.shape, ai.shape)
        return concatenate([ai, qj], axis=2)

    def compute_output_shape(self, input_shape):
        return input_shape


def build_QAmatch_model(opts, vocab_size=0, maxlen=300, embedd_dim=50, init_mean_value=None):
    ans_input = Input(shape=(maxlen,), dtype='int32', name='ans_input')
    ques_input = Input(shape=(maxlen,), dtype='int32', name='ques_input')
    embedding = Embedding(output_dim=embedd_dim, input_dim=vocab_size, input_length=maxlen,
                          mask_zero=True, name='embedding')
    dropout = Dropout(opts.dropout, name='dropout')
    lstm = LSTM(opts.lstm_units, return_sequences=True, name='lstm')
    hidden_layer = Dense(units=opts.hidden_units, name='hidden_layer')
    output_layer = Dense(units=1, name='output_layer')
    zme = ZeroMaskedEntries(name='maskedout')

    ans_maskedout = zme(embedding(ans_input))
    ques_maskedout = zme(embedding(ques_input))
    ans_lstm = lstm(dropout(ans_maskedout))    # (A V)
    ques_lstm = lstm(dropout(ques_maskedout))  # (Q V)
    co_att = Co_AttLayer()([ans_lstm, ques_lstm])

    def slice(x, index):
        return x[:, :, index, :]

    ans_att = Lambda(slice, output_shape=(maxlen, embedd_dim), arguments={'index': 0})(co_att)
    ques_att = Lambda(slice, output_shape=(maxlen, embedd_dim), arguments={'index': 1})(co_att)
    merged_ques = concatenate([ques_lstm, ques_att, ques_maskedout], axis=2)
    merged_ans = concatenate([ans_lstm, ans_att, ans_maskedout], axis=2)
    ans_vec = GlobalMaxPooling1D(name='ans_pooling')(merged_ans)
    ques_vec = GlobalMaxPooling1D(name='ques_pooling')(merged_ques)
    ans_hid = hidden_layer(ans_vec)
    ques_hid = hidden_layer(ques_vec)
    merged_hid = concatenate([ans_hid, ques_hid], axis=-1)
    merged_all = concatenate([merged_hid, ans_hid + ques_hid, ans_hid - ques_hid,
                              K.abs(ans_hid - ques_hid)], axis=-1)
    output = output_layer(merged_all)
    model = Model(inputs=[ans_input, ques_input], outputs=output)

    if init_mean_value:
        logger.info("Initialise output layer bias with log(y_mean/1-y_mean)")
        bias_value = (np.log(init_mean_value) - np.log(1 - init_mean_value)).astype(K.floatx())
        model.layers[-1].b.set_value(bias_value)
    if verbose:
        model.summary()

    start_time = time.time()
    model.compile(loss='mse', optimizer='rmsprop')
    total_time = time.time() - start_time
    logger.info("Model compiled in %.4f s" % total_time)
    return model

Answer:

I can't reproduce your code, but my guess is that the error happens here:

merged_all = concatenate([merged_hid, ans_hid + ques_hid, ans_hid - ques_hid,
                          K.abs(ans_hid - ques_hid)], axis=-1)

The backend operations +, - and K.abs are not wrapped in Lambda layers, so the resulting tensors are not Keras tensors; as a consequence they lack some attributes, such as _keras_history. You can wrap them as follows:

l1 = Lambda(lambda x: x[0] + x[1])([ans_hid, ques_hid])
l2 = Lambda(lambda x: x[0] - x[1])([ans_hid, ques_hid])
l3 = Lambda(lambda x: K.abs(x[0] - x[1]))([ans_hid, ques_hid])
merged_all = concatenate([merged_hid, l1, l2, l3], axis=-1)

Note: untested.
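As a side note, recent Keras versions ship built-in add and subtract merge layers that do this wrapping for you. The sketch below (also untested, and it assumes Keras >= 2.0.7, where keras.layers.subtract exists) uses them for the sum and difference, keeping an explicit Lambda only for the absolute value:

from keras import backend as K
from keras.layers import Lambda, add, subtract
from keras.layers.merge import concatenate

# add/subtract are real Keras layers, so their outputs keep _keras_history;
# only K.abs still needs an explicit Lambda wrapper.
l1 = add([ans_hid, ques_hid])
l2 = subtract([ans_hid, ques_hid])
l3 = Lambda(lambda t: K.abs(t))(l2)
merged_all = concatenate([merged_hid, l1, l2, l3], axis=-1)

Either way, a quick sanity check is hasattr(t, '_keras_history'): any tensor you pass to Model or concatenate that fails this test was produced by a bare backend op and needs wrapping.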
