import numpy
import pandas
from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense, Activation

CONST_TRAINTING_SEQUENCE_LENGTH = 12
CONST_TESTING_CASES = 5


def dataNormalization(data):
    return [(datum - data[0]) / data[0] for datum in data]


def dataDeNormalization(data, base):
    return [(datum + 1) * base for datum in data]


def getDeepLearningData(ticker):
    # Step 1. Load data
    data = pandas.read_csv('/Users/yindeyong/Desktop/Django_Projects/pythonstock/data/Intraday/' + ticker + '.csv')[
        'close'].tolist()

    # Step 2. Build training data
    dataTraining = []
    for i in range(len(data) - CONST_TESTING_CASES * CONST_TRAINTING_SEQUENCE_LENGTH):
        dataSegment = data[i:i + CONST_TRAINTING_SEQUENCE_LENGTH + 1]
        dataTraining.append(dataNormalization(dataSegment))

    dataTraining = numpy.array(dataTraining)
    numpy.random.shuffle(dataTraining)
    X_Training = dataTraining[:, :-1]
    Y_Training = dataTraining[:, -1]

    # Step 3. Build testing data
    X_Testing = []
    Y_Testing_Base = []
    for i in range(CONST_TESTING_CASES, 0, -1):
        dataSegment = data[-(i + 1) * CONST_TRAINTING_SEQUENCE_LENGTH:-i * CONST_TRAINTING_SEQUENCE_LENGTH]
        Y_Testing_Base.append(dataSegment[0])
        X_Testing.append(dataNormalization(dataSegment))

    Y_Testing = data[-CONST_TESTING_CASES * CONST_TRAINTING_SEQUENCE_LENGTH:]

    X_Testing = numpy.array(X_Testing)
    Y_Testing = numpy.array(Y_Testing)

    # Step 4. Reshape data for deep learning
    X_Training = numpy.reshape(X_Training, (X_Training.shape[0], X_Training.shape[1], 1))
    X_Testing = numpy.reshape(X_Testing, (X_Testing.shape[0], X_Testing.shape[1], 1))

    return X_Training, Y_Training, X_Testing, Y_Testing, Y_Testing_Base


def predictLSTM(ticker):
    # Step 1. Load data
    X_Training, Y_Training, X_Testing, Y_Testing, Y_Testing_Base = getDeepLearningData(ticker)

    # Step 2. Build model
    model = Sequential()

    model.add(LSTM(
        input_shape=1,
        output_dim=50,
        return_sequences=True))
    model.add(Dropout(0.2))

    model.add(LSTM(
        200,
        return_sequences=False))
    model.add(Dropout(0.2))

    model.add(Dense(output_dim=1))
    model.add(Activation('linear'))

    model.compile(lose='mse', optimizer='rmsprop')

    # Step 3. Train model
    model.fit(X_Training, Y_Training,
              batch_size=512,
              nb_epoch=5,
              validation_split=0.05)


predictLSTM(ticker='MRIN')
I get this error:
  File "/Users/yindeyong/Desktop/Django_Projects/envs/stockenv/lib/python3.6/site-packages/keras/engine/base_layer.py", line 147, in __init__
    batch_size,) + tuple(kwargs['input_shape'])
TypeError: 'int' object is not iterable
I tried changing input_shape=1 to input_shape=(1,), and then got a different error:
ValueError: Input 0 is incompatible with layer lstm_1: expected ndim=3, found ndim=2
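Answer:
An LSTM is a recurrent network that processes sequence data. A sequence must have both a length and features, and your input shape must include both: input_shape=(length, features).
Your data must be shaped to match, as (sequences, length, features).
For variable-length sequences, you can use input_shape=(None, features).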
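As a rough illustration of the shapes described above, the sketch below slices a series into windows and reshapes them to (sequences, length, features). The question's arrays are already reshaped to (samples, 12, 1), so the matching value there would be input_shape=(12, 1). The names make_windows and build_model, the constants, and the stand-in price series are hypothetical and not part of the original post. The windowing is also simplified compared with the question's (no normalization, no held-out test cases). The model helper assumes a Keras 2 style API and is only defined here, not called.

```python
import numpy as np

SEQUENCE_LENGTH = 12  # mirrors CONST_TRAINTING_SEQUENCE_LENGTH in the question
FEATURES = 1          # one value (the closing price) per time step


def make_windows(series, length):
    """Slice a 1-D series into overlapping windows of `length` inputs plus one target."""
    windows = np.array([series[i:i + length + 1]
                        for i in range(len(series) - length)], dtype=float)
    x = windows[:, :-1]   # shape: (sequences, length)
    y = windows[:, -1]    # shape: (sequences,)
    # Recurrent layers expect 3-D input: (sequences, length, features).
    return x.reshape(x.shape[0], length, FEATURES), y


def build_model(length, features):
    """Hypothetical helper: a stacked LSTM declared with a (length, features) input shape."""
    # Imported lazily so the shape helper above runs without Keras installed.
    from keras.models import Sequential
    from keras.layers import LSTM, Dropout, Dense

    model = Sequential()
    # Pass length=None to accept sequences of varying length.
    model.add(LSTM(50, input_shape=(length, features), return_sequences=True))
    model.add(Dropout(0.2))
    model.add(LSTM(200, return_sequences=False))
    model.add(Dropout(0.2))
    model.add(Dense(1))  # linear activation by default
    model.compile(loss='mse', optimizer='rmsprop')
    return model


prices = np.arange(100, 130, dtype=float)  # stand-in data, not real prices
X, y = make_windows(prices, SEQUENCE_LENGTH)
print(X.shape, y.shape)
```

With Keras installed, something like build_model(SEQUENCE_LENGTH, FEATURES).fit(X, y, epochs=5) should train on these windows. In Keras 2, fit takes epochs rather than nb_epoch, and compile takes loss.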