I am testing a simple RNN model.
My test data is very simple: three 12-element binary vectors, repeated over and over.
[1,0,0,0,1,0,0,1,0,0,0,0],[1,0,0,0,0,1,0,0,0,1,0,0],[0,0,1,0,0,0,0,1,0,0,0,1]
line = []
for i in range(0, 100):
    temp = [[1,0,0,0,1,0,0,1,0,0,0,0],
            [1,0,0,0,0,1,0,0,0,1,0,0],
            [0,0,1,0,0,0,0,1,0,0,0,1]]
    line.extend(temp)

total = []
for j in range(0, 500):
    total.append(line)
total = np.array(total)
print(total.shape)  # (500, 300, 12)
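This produces a numpy array of shape (500, 300, 12). All of the data is just the same pattern repeated, so I expected training and prediction to work perfectly.
However, val_loss does not decrease, and the prediction does not work well either.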
Epoch 1/5
743/743 [==============================] - 5s 5ms/step - loss: 0.1386 - val_loss: 0.1305
Epoch 2/5
743/743 [==============================] - 3s 4ms/step - loss: 0.1305 - val_loss: 0.1294
Epoch 3/5
743/743 [==============================] - 3s 4ms/step - loss: 0.1299 - val_loss: 0.1292
Epoch 4/5
743/743 [==============================] - 3s 4ms/step - loss: 0.1300 - val_loss: 0.1291
Epoch 5/5
743/743 [==============================] - 3s 4ms/step - loss: 0.1299 - val_loss: 0.1293
[[ 0.67032564 -0.0020391 0.3332582 -0.0095186 0.35370785 0.3042156 0.00809216 0.7059332 0.00199411 0.30952734 -0.0021943 0.333712 ]]
tf.Tensor([[1 0 0 0 0 0 0 1 0 0 0 0]], shape=(1, 12), dtype=int32)
The result I expect is something like [1,0,0,0,1,0,0,1,0,0,0,0]…
Do I need to change something somewhere, or is there a problem with my code?
This is all of my code.
import tensorflow as tf
from django.core.management.base import BaseCommand, CommandError
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense
from tensorflow.keras.layers import SimpleRNN
import numpy as np
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.optimizers import Adam
from sklearn.model_selection import train_test_split

def makeModel(input_len, n_in):
    n_hidden = 512
    model = Sequential()
    model.add(SimpleRNN(n_hidden, input_shape=(input_len, n_in), return_sequences=False))
    model.add(Dense(n_hidden, activation="relu"))
    model.add(Dense(n_in, activation="relu"))
    opt = Adam(lr=0.001)
    model.compile(loss='mse', optimizer=opt)
    model.summary()
    return model

line = []
for i in range(0, 100):
    temp = [[1,0,0,0,1,0,0,1,0,0,0,0],
            [1,0,0,0,0,1,0,0,0,1,0,0],
            [0,0,1,0,0,0,0,1,0,0,0,1]]
    line.extend(temp)

total = []
for j in range(0, 500):
    total.append(line)
total = np.array(total)
print(total.shape)  # (500, 300, 12)

chordIdList = total
n_in = 12      # dimension of each vector
input_len = 3  # number of time steps used for prediction

model = makeModel(input_len, n_in)

# Build sliding windows: input_len consecutive vectors -> the next vector
input_ = []
target_ = []
for C in chordIdList:
    for i in range(0, len(C) - input_len):
        input_.append(C[i:i+input_len])
        target_.append(C[i+input_len])

X = np.array(input_)
Y = np.array(target_).reshape((len(input_), 1, n_in))

from sklearn.model_selection import train_test_split
x, x_val, y, y_val = train_test_split(X, Y, train_size=0.8, random_state=1)
print(x.shape)      # (23760, 3, 12)
print(y.shape)      # (23760, 1, 12)
print(x_val.shape)  # (5940, 3, 12)
print(y_val.shape)  # (5940, 1, 12)

epoch = 5
history = model.fit(x, y, epochs=epoch, validation_data=(x_val, y_val))

in_ = np.array([[1,0,0,0,1,0,0,1,0,0,0,0],
                [1,0,0,0,0,1,0,0,0,1,0,0],
                [0,0,1,0,0,0,0,1,0,0,0,1]]).reshape(1, 3, 12)
print(in_.shape)
out_ = model.predict(in_)
print(out_)
Answer:
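Well, you have one main problem here: you are trying to do regression, while your problem is purely a classification problem.
You need to make the following changes to this part of your code: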
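As a side check on the log in the question, the short NumPy sketch below computes what MSE a model would score if it ignored its input and always predicted the per-position average of the three vectors. It assumes the three vectors occur equally often as targets, which follows from the repeating pattern. The names `chords` and `mean_pred` are made up for this illustration.

```python
import numpy as np

# The three repeating vectors from the question.
chords = np.array([
    [1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1],
], dtype=float)

# A predictor that ignores its input and outputs the average target.
mean_pred = chords.mean(axis=0)

# MSE of that constant predictor, averaged over vectors and positions.
mse_of_mean = float(np.mean((chords - mean_pred) ** 2))

print(np.round(mean_pred, 2))
print(round(mse_of_mean, 4))  # 0.1296
```

The printed average (about 0.67 at positions 0 and 7, about 0.33 at positions 2, 4, 5, 9 and 11) is close to the prediction shown in the question, and 0.1296 is close to the loss plateau of about 0.13. That suggests the network settled on the average instead of learning the sequence, which would fit the regression-versus-classification point.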
def makeModel(input_len, n_in):
    n_hidden = 512
    model = Sequential()
    model.add(SimpleRNN(n_hidden, input_shape=(input_len, n_in), return_sequences=False))
    model.add(Dense(n_hidden, activation="relu"))
    model.add(Dense(n_in, activation="relu"))
    opt = Adam(lr=0.001)
    model.compile(loss='mse', optimizer=opt)
    model.summary()
    return model
Change the last layer to use a sigmoid activation (its output lies between 0 and 1, which matches your targets):
model.add(Dense(n_in, activation="sigmoid"))
Change the loss function to binary cross-entropy:
model.compile(loss='binary_crossentropy', optimizer=opt)
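With relu, the output layer is unbounded above (its range is [0, ∞)) while your targets are only ever 0 or 1, and I think that makes learning harder.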
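To make the difference concrete, here is a small NumPy sketch comparing the output range of relu and sigmoid, and showing how binary cross-entropy scores a confident correct prediction against a confident wrong one. The helper functions are hand-written for illustration (they are not the Keras implementations), and the `eps` clipping value is an assumption.

```python
import numpy as np

def relu(z):
    # Unbounded above: large inputs give large outputs.
    return np.maximum(z, 0.0)

def sigmoid(z):
    # Squashes any real input into the interval (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    # Clip to avoid log(0); eps is an illustrative choice.
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    per_element = -(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))
    return float(np.mean(per_element))

z = np.array([-5.0, -0.5, 0.0, 0.5, 5.0, 50.0])
relu_out = relu(z)    # grows without limit, up to 50.0 here
sig_out = sigmoid(z)  # always stays within [0, 1]

y_true = np.array([1.0, 0.0])
loss_good = binary_crossentropy(y_true, np.array([0.9, 0.1]))  # confident and right
loss_bad = binary_crossentropy(y_true, np.array([0.1, 0.9]))   # confident and wrong

print(relu_out.max(), sig_out.min(), sig_out.max())
print(loss_good, loss_bad)
```

The wrong prediction is penalised roughly twenty times more than the right one, which is the behaviour you want when every output element is a yes/no decision.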
Also, squeeze Y so that its shape matches the model output, using the following code:
history = model.fit(x, np.squeeze(y), epochs=epoch,validation_data=(x_val, np.squeeze(y_val)))
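The shape side of this can be checked without training anything. With `return_sequences=False`, the model emits one 12-element vector per sample, so the targets should have shape `(N, 12)`, not `(N, 1, 12)`. The sketch below uses a zero-filled stand-in for `y` and made-up sigmoid outputs (`probs` is hypothetical, not real model output) to show the squeeze and one way to turn probabilities back into a 0/1 vector. The 0.5 threshold is an assumption.

```python
import numpy as np

n_in = 12

# Stand-in for the targets built in the question: shape (N, 1, 12).
y = np.zeros((5, 1, n_in))

# Remove only the singleton middle axis; axis=1 guards against
# accidentally dropping the batch axis when N happens to be 1.
y_flat = np.squeeze(y, axis=1)
print(y.shape, "->", y_flat.shape)  # (5, 1, 12) -> (5, 12)

# Hypothetical sigmoid outputs for one sample (illustrative values only).
probs = np.array([[0.97, 0.01, 0.02, 0.01, 0.95, 0.03,
                   0.02, 0.96, 0.01, 0.04, 0.02, 0.01]])

# Threshold at 0.5 to recover a binary vector.
pred = (probs >= 0.5).astype(int)
print(pred)  # [[1 0 0 0 1 0 0 1 0 0 0 0]]
```

In the real script, the same thresholding would be applied to the array returned by `model.predict(in_)`.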