Hi everyone, I'm new to machine learning. I'm implementing federated learning with an LSTM that predicts the next label in a sequence. My sequences look like this: [2,3,5,1,4,2,5,7]. For example, my goal is to predict the 7 in this sequence. So I tried a simple Keras-based federated learning setup. I used the same approach on another model (not an LSTM) and it worked for me, but here it always collapses to 2: no matter what the input is, it always predicts 2. I made sure the input data is balanced, meaning that each label appears roughly the same number of times at the last index (the 7 here). I tested the same data with plain (non-federated) deep learning and the results were very good. So either this data may not suit an LSTM, or there is some other problem. Please help me. Here is my federated learning code. If you need more information, please let me know, I really need this. Thanks.
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras import backend as K
from tensorflow.keras.models import Model

def get_lstm(units):
    """LSTM (Long Short-Term Memory)
    Build LSTM Model.

    # Arguments
        units: List(int), number of input, output and hidden units.
    # Returns
        model: Model, nn model.
    """
    inp = layers.Input((units[0], 1))
    x = layers.LSTM(units[1], return_sequences=True)(inp)
    x = layers.LSTM(units[2])(x)
    x = layers.Dropout(0.2)(x)
    out = layers.Dense(units[3], activation='softmax')(x)
    model = Model(inp, out)
    return model

optimizer = keras.optimizers.Adam(learning_rate=0.01)
seqLen = 8 - 1  # 8 items per sequence; the last one is the label to predict
# 15 output units: we have 14 categories, arrays start at 0, but class zero is never predicted
# note: in my project get_lstm lives in the Mymodel module
global_model = Mymodel.get_lstm([seqLen, 64, 64, 15])
global_model.compile(loss="sparse_categorical_crossentropy",
                     optimizer=optimizer,
                     metrics=[tf.keras.metrics.SparseTopKCategoricalAccuracy(k=1)])

def main(argv):
    # comms_round, train, userDataSize, X_test, Y_test, f2,
    # scale_model_weights and sum_scaled_weights are defined elsewhere
    for comm_round in range(comms_round):
        print("round_%d" % comm_round)
        scaled_local_weight_list = list()
        global_weights = global_model.get_weights()
        np.random.shuffle(train)
        temp_data = train[:]
        # data divided among ten users and shuffled
        for user in range(10):
            user_data = temp_data[user * userDataSize:(user + 1) * userDataSize]
            X_train = np.asarray(user_data[:, 0:seqLen]).astype(np.float32)
            Y_train = np.asarray(user_data[:, seqLen]).astype(np.float32)
            local_model = Mymodel.get_lstm([seqLen, 64, 64, 15])
            X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
            local_model.compile(loss="sparse_categorical_crossentropy",
                                optimizer=optimizer,
                                metrics=[tf.keras.metrics.SparseTopKCategoricalAccuracy(k=1)])
            local_model.set_weights(global_weights)
            local_model.fit(X_train, Y_train)
            scaling_factor = 1 / 10  # 10 is the number of users
            scaled_weights = scale_model_weights(local_model.get_weights(), scaling_factor)
            scaled_local_weight_list.append(scaled_weights)
            K.clear_session()
        average_weights = sum_scaled_weights(scaled_local_weight_list)
        global_model.set_weights(average_weights)

    predictions = global_model.predict(X_test)
    for i in range(len(X_test)):
        print('%d,%d' % (np.argmax(predictions[i]), Y_test[i]), file=f2)
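(For context: scale_model_weights and sum_scaled_weights are not shown in the question. A minimal sketch of what FedAvg-style scaling and summing typically look like follows; these are my assumptions for readability, not the asker's actual implementations.)

import numpy as np

def scale_model_weights(weights, scaling_factor):
    # multiply every weight tensor of one client by its FedAvg share
    return [scaling_factor * w for w in weights]

def sum_scaled_weights(scaled_weight_list):
    # element-wise sum of the scaled client weights -> averaged global weights
    return [np.sum(layer_weights, axis=0)
            for layer_weights in zip(*scaled_weight_list)]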
Answer:
I found a few causes of my problem, so I'd like to share them:
1- The proportions of the different items inside the sequences were imbalanced. I mean that, for example, I had 1000 "2"s but only about 100 of each of the other numbers, so after a few rounds the model fit to 2 simply because there was more data for that particular number. (A quick way to check this is sketched after this list.)
2- I changed my sequences so that no two items within a sequence share the same value. This let me drop some duplicated data from the sequences and made them more balanced. It may not be a complete representation of the activity, but in my case it made sense. (A sketch of this deduplication is also shown below.)
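For point 1, a quick way to inspect how often each item occurs across all sequences is to count them; this kind of imbalance can also be softened by passing Keras's class_weight argument to fit. This is only an illustrative check under the assumption that train is the same 2-D array of sequences used above, not code from the original post:

from collections import Counter
import numpy as np

# count every item across all sequences
item_counts = Counter(int(v) for v in np.asarray(train).ravel())
print(item_counts)  # e.g. Counter({2: 1000, 5: 130, 7: 110, ...})

# optional: weight the loss inversely to label frequency when fitting
label_counts = Counter(int(y) for y in np.asarray(train)[:, -1])
total = sum(label_counts.values())
class_weight = {label: total / count for label, count in label_counts.items()}
# local_model.fit(X_train, Y_train, class_weight=class_weight)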
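For point 2, removing duplicate items within each sequence while keeping their order can be done with dict.fromkeys, since Python dicts preserve insertion order. A minimal sketch, assuming sequences are plain lists of ints:

def dedupe_sequence(seq):
    # keep only the first occurrence of each value, preserving order
    return list(dict.fromkeys(seq))

print(dedupe_sequence([2, 3, 5, 1, 4, 2, 5, 7]))  # -> [2, 3, 5, 1, 4, 7]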