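I have been working on a two-input neural network that evaluates positions for my chess engine. To train it on a GPU, I ported the network from my C++ code to Keras.
My model looks like this: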
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_1 (InputLayer)            (None, 20480)        0
__________________________________________________________________________________________________
input_2 (InputLayer)            (None, 20480)        0
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 256)          5243136     input_1[0][0]
__________________________________________________________________________________________________
dense_2 (Dense)                 (None, 256)          5243136     input_2[0][0]
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, 512)          0           dense_1[0][0]
                                                                 dense_2[0][0]
__________________________________________________________________________________________________
dense_3 (Dense)                 (None, 32)           16416       concatenate_1[0][0]
__________________________________________________________________________________________________
dense_4 (Dense)                 (None, 32)           1056        dense_3[0][0]
__________________________________________________________________________________________________
dense_5 (Dense)                 (None, 1)            33          dense_4[0][0]
==================================================================================================
Total params: 10,503,777
Trainable params: 10,503,777
Non-trainable params: 0
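Because the input is very large and there is a lot of training data (roughly 300 million positions), I use sparse matrices during training, which works well.
I now want to transfer the weights back into my handwritten C++ code. For debugging, I want to feed a single input into the Keras model and compare the result against my C++ model.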
indices = [21768, 21769, 21770, 21771, 21773, 21774, 21775, 21788, 21825, 21830,
           21890, 21893, 21952, 21959, 22019, 1288, 1289, 1290, 1291, 1292,
           1293, 1294, 1295, 1345, 1350, 1410, 1413, 1472, 1479, 1539]
eval = -0.24
x_1 = np.zeros(half_input_size)
x_2 = np.zeros(half_input_size)
for i in indices:
    if i < half_input_size:
        x_1[i] = 1
    else:
        x_2[i - half_input_size] = 1
print(x_1.shape)
print(x_2.shape)
print(model.predict([x_1, x_2]))
The shapes of the two inputs are printed as:
(20480,)
(20480,)
However, Keras raises the following error:
Traceback (most recent call last):
  File "A:/OneDrive/ProgrammSpeicher/CLionProjects/Koivisto/resources/networkTrainingKeras/Train.py", line 317, in <module>
    print(model.predict([x_1, x_2]))
  File "C:\Users\finne\.conda\envs\DeepLearning\lib\site-packages\keras\engine\training.py", line 1441, in predict
    x, _, _ = self._standardize_user_data(x)
  File "C:\Users\finne\.conda\envs\DeepLearning\lib\site-packages\keras\engine\training.py", line 579, in _standardize_user_data
    exception_prefix='input')
  File "C:\Users\finne\.conda\envs\DeepLearning\lib\site-packages\keras\engine\training_utils.py", line 145, in standardize_input_data
    str(data_shape))
ValueError: Error when checking input: expected input_1 to have a shape (20480,) but got array with shape (1,)
I would be very glad if someone could briefly tell me where I went wrong!
Regards,
Finn
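Answer:
You need to add a batch dimension when predicting.
Your model expects 2D input of shape (batch_size, 20480), so you must pass 2D samples to predict as well, even for a single position.
You can do this simply by expanding the dimensions: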
model.predict([np.expand_dims(x_1,0), np.expand_dims(x_2,0)])
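To make the shape change concrete, here is a minimal sketch that uses only NumPy. Keras treats the first axis of each input as the batch axis, so a 1D array of length 20480 is read as 20480 separate samples rather than one sample with 20480 features. The helper name `encode_position` and the shortened index list are made up for illustration, and `half_input_size = 20480` is taken from the model summary in the question.

```python
import numpy as np

half_input_size = 20480  # features per input, per the model summary


def encode_position(indices, half_input_size):
    """Hypothetical helper: build both input vectors with a leading batch axis of size 1."""
    x_1 = np.zeros(half_input_size, dtype=np.float32)
    x_2 = np.zeros(half_input_size, dtype=np.float32)
    for i in indices:
        if i < half_input_size:
            x_1[i] = 1
        else:
            x_2[i - half_input_size] = 1
    # (20480,) -> (1, 20480): one sample with 20480 features
    return np.expand_dims(x_1, 0), np.expand_dims(x_2, 0)


indices = [21768, 1288, 1539]  # shortened example list, not the full position
batch_1, batch_2 = encode_position(indices, half_input_size)
print(batch_1.shape, batch_2.shape)  # (1, 20480) (1, 20480)

# With the trained Keras model from the question available as `model` (assumed):
# prediction = model.predict([batch_1, batch_2])
```

`x[np.newaxis, :]` and `x.reshape(1, -1)` produce the same (1, 20480) shape, so any of the three forms should work here.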