I need to train two models, modelA and modelB, that use a different optimizer and different hiddenLayers. I want to combine their outputs as follows:
    # w = the weight I assign to each model's output
    output_modelC = output_modelA * w + output_modelB * (1 - w)
Both models share the same Input, but after calling compile on each of them I don't know how to proceed. My code is as follows:
    import keras

    Input = keras.layers.Input(shape=(2,))

    # modelA: two hidden layers, trained with SGD
    Hidden_A_1 = keras.layers.Dense(units=20)(Input)
    Hidden_A_2 = keras.layers.Dense(units=20)(Hidden_A_1)
    Output_A = keras.layers.Dense(units=1, activation='sigmoid')(Hidden_A_2)
    # learning_rate replaces the deprecated lr argument
    optimizer_A = keras.optimizers.SGD(learning_rate=0.00001, momentum=0.09, nesterov=True)
    model_A = keras.Model(inputs=Input, outputs=Output_A)
    # optimizer_A is the optimizer defined above (the original passed an undefined optimizer_slow)
    model_A.compile(loss="binary_crossentropy", optimizer=optimizer_A, metrics=['accuracy'])

    # modelB: one hidden layer, trained with Adagrad
    Hidden1_B = keras.layers.Dense(units=10, activation='relu')(Input)
    Output_B = keras.layers.Dense(units=1, activation='sigmoid')(Hidden1_B)
    model_B = keras.Model(inputs=Input, outputs=Output_B)
    optimizer_B = keras.optimizers.Adagrad()
    model_B.compile(loss="binary_crossentropy", optimizer=optimizer_B, metrics=['accuracy'])
Answer:
Assuming you supply the value of w yourself, the following code may help:
    import keras

    Input = keras.layers.Input(shape=(784,))

    # modelA: two hidden layers, trained with SGD
    Hidden_A_1 = keras.layers.Dense(units=20)(Input)
    Hidden_A_2 = keras.layers.Dense(units=20)(Hidden_A_1)
    Output_A = keras.layers.Dense(units=1, activation='sigmoid')(Hidden_A_2)
    # learning_rate replaces the deprecated lr argument
    optimizer_A = keras.optimizers.SGD(learning_rate=0.00001, momentum=0.09, nesterov=True)
    model_A = keras.Model(inputs=Input, outputs=Output_A)
    model_A.compile(loss="binary_crossentropy", optimizer=optimizer_A, metrics=['accuracy'])

    # modelB: one hidden layer, trained with Adagrad
    Hidden1_B = keras.layers.Dense(units=10, activation='relu')(Input)
    Output_B = keras.layers.Dense(units=1, activation='sigmoid')(Hidden1_B)
    model_B = keras.Model(inputs=Input, outputs=Output_B)
    optimizer_B = keras.optimizers.Adagrad()
    model_B.compile(loss="binary_crossentropy", optimizer=optimizer_B, metrics=['accuracy'])

    # Placeholder data: MNIST images flattened to 784 features
    (x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
    x_train = x_train.reshape(60000, 784).astype('float32')
    x_test = x_test.reshape(10000, 784).astype('float32')

    # Train each model separately, each with its own optimizer
    model_A.fit(x_train, y_train)
    model_B.fit(x_train, y_train)

    # Combine the two predictions with the chosen weight
    w = 0.8
    output_modelC = model_A.predict(x_test) * w + model_B.predict(x_test) * (1 - w)
Example output:
    array([[0.98165023],
           [0.9918817 ],
           [0.93426293],
           ...,
           [0.99940777],
           [0.9960805 ],
           [0.9992139 ]], dtype=float32)
The example data I chose may not be the most suitable, but it is only meant to show how to combine the two networks.
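If the weighted combination is needed in more than one place, it could be wrapped in a small helper. The following is a minimal sketch and not part of the original answer: blend_predictions and ConstantModel are hypothetical names, and the helper only assumes that each model exposes a predict method returning a NumPy array, as Keras models do. The stub models stand in for model_A and model_B so the snippet runs without any training.

```python
import numpy as np


def blend_predictions(model_a, model_b, x, w):
    """Return w * model_a.predict(x) + (1 - w) * model_b.predict(x).

    Hypothetical helper: assumes both models expose predict(x) and
    return arrays of the same shape.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    pred_a = np.asarray(model_a.predict(x))
    pred_b = np.asarray(model_b.predict(x))
    if pred_a.shape != pred_b.shape:
        raise ValueError("model outputs must have the same shape")
    return w * pred_a + (1.0 - w) * pred_b


class ConstantModel:
    """Stand-in for a trained model: always predicts the same value."""

    def __init__(self, value):
        self.value = value

    def predict(self, x):
        return np.full((len(x), 1), self.value, dtype="float32")


# Three dummy samples with two features each
x_demo = np.zeros((3, 2), dtype="float32")
blended = blend_predictions(ConstantModel(0.9), ConstantModel(0.5), x_demo, w=0.8)
print(blended.ravel())  # each entry is approximately 0.8 * 0.9 + 0.2 * 0.5 = 0.82
```

With the trained models from the answer, the call would be blend_predictions(model_A, model_B, x_test, w=0.8), which gives the same result as the output_modelC expression above.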