Predicting a number with Keras, passing if it falls within a certain range

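I am having trouble with the accuracy across epochs and runs. The accuracy fluctuates a lot, which has to do with the fact that I am trying to estimate a number. I would like the test to count as a pass if the estimated value is within plus or minus 2%.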

Code:

    # Imports were not shown in the original post; these are the ones the code appears to need.
    import os

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    from keras.layers import Dense
    from keras.models import Sequential
    from sklearn.preprocessing import LabelEncoder, StandardScaler

    seed = 7
    basepath = '.'

    # find the right path for batch ai vs local
    outpath = os.path.join(basepath, "out")
    if not os.path.exists(outpath):
        os.makedirs(outpath)

    # Importing the dataset
    dataset = pd.read_csv(os.path.join(basepath, 'data.csv'))

    # fix random seed for reproducibility
    np.random.seed(seed)

    # Encode columns using label encoding
    # use a new label encoder everytime is important!
    # (the post called fit_transform on undefined encoder names; each call now uses the encoder created just above it)
    vixpercentencoder = LabelEncoder()
    dataset['VIX Open Percent'] = vixpercentencoder.fit_transform(dataset['VIX Open Percent'])
    fiftydayaverageencoder = LabelEncoder()
    dataset['50 day average'] = fiftydayaverageencoder.fit_transform(dataset['50 day average'])
    twohundreddayaverageencoder = LabelEncoder()
    dataset['200 day average'] = twohundreddayaverageencoder.fit_transform(dataset['200 day average'])
    openingencoder = LabelEncoder()
    dataset['opening'] = openingencoder.fit_transform(dataset['opening'])
    #routetomarketencoder = LabelEncoder()
    #dataset['Route To Market'] = routetomarketencoder.fit_transform(dataset['Route To Market'])

    # What are the correlations between columns and target
    correlations = dataset.corr()['closing'].sort_values()

    # Throw out unneeded columns
    dataset = dataset.drop('Date', axis=1)
    dataset = dataset.drop('VIX Open', axis=1)
    dataset = dataset.drop('VIX Close', axis=1)
    dataset = dataset.drop('Ticker', axis=1)
    #dataset = dataset.drop('VIX Open Percent', axis=1)

    # One Hot Encode columns that are more than binary
    # avoid the dummy variable trap
    #dataset = pd.concat([pd.get_dummies(dataset['Route To Market'], prefix='Route To Market', drop_first=True), dataset], axis=1)
    #dataset = dataset.drop('Route To Market', axis=1)

    # Create the input data set (X) and the outcome (y)
    X = dataset.drop('closing', axis=1).iloc[:, 0:dataset.shape[1] - 1].values
    y = dataset.iloc[:, dataset.columns.get_loc('closing')].values

    # Feature Scaling
    sc = StandardScaler()
    X = sc.fit_transform(X)

    # Initializing the ANN
    model = Sequential()

    # Adding the input layer
    model.add(Dense(units=8, activation='relu', input_dim=X.shape[1], name='Input_Layer'))

    # Add hidden layer
    model.add(Dense(units=8, activation='relu', name='Hidden_Layer_1'))

    # Add the output layer
    model.add(Dense(units=1, activation='sigmoid', name='Output_Layer'))

    # compiling the ANN
    model.compile(optimizer='nadam', loss='binary_crossentropy', metrics=['accuracy'])

    # summary to console
    print(model.summary())

    # Fit the ANN to the training set
    history = model.fit(X, y, validation_split=.20, batch_size=64, epochs=25)

    # summarize history for accuracy
    plt.plot(history.history['acc'])
    plt.plot(history.history['val_acc'])
    plt.title('model accuracy')
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['train', 'test'], loc='upper left')
    plt.show()

    # summarize history for loss
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('model loss')
    plt.ylabel('loss')
    plt.xlabel('epoch')
    plt.legend(['train', 'test'], loc='upper left')
    plt.show()

Answer:

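It looks like you are trying to predict a continuous value (a regression problem) rather than a discrete one (a classification problem). I would therefore suggest the following: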

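Using sigmoid as the activation function of the last layer is not appropriate here unless the target values lie strictly in the range [0,1]. If the values are unbounded, use no activation on the last layer instead (that is, the linear activation).
Use a loss function suited to regression, such as mean squared error, i.e. 'mse'.
Using 'accuracy' as a metric makes no sense in a regression task; it only applies to classification problems. If you want a metric to monitor during training, use a different one such as mean absolute error, i.e. 'mae'.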
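The three changes above could be sketched as follows. This is a minimal illustration rather than a tuned model: the helper name `build_regression_model` is hypothetical, the layer sizes and optimizer are simply carried over from the question, and the `tensorflow.keras` import path is an assumption (the question may have used standalone Keras).

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_regression_model(n_features: int) -> keras.Model:
    """Hypothetical helper: a small regression network with a linear output."""
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        layers.Dense(8, activation="relu", name="Hidden_Layer_1"),
        layers.Dense(8, activation="relu", name="Hidden_Layer_2"),
        # Linear (identity) activation so the output is unbounded.
        layers.Dense(1, activation="linear", name="Output_Layer"),
    ])
    # Mean squared error as the loss, mean absolute error as the monitored metric.
    model.compile(optimizer="nadam", loss="mse", metrics=["mae"])
    return model


# n_features=4 is an arbitrary example; in the question it would be X.shape[1].
model = build_regression_model(n_features=4)
```

With this setup the plotting code would read the MAE series from `history.history` instead of the accuracy series. The exact key names depend on the Keras version, so it is worth printing `history.history.keys()` after fitting.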

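Follow the suggestions above to set up your model correctly. Then the cycle of experimenting with and tuning the model begins. You can try different layers, different numbers of layers or units per layer, add regularization and so on, until you find a model that performs well. You will of course also need a validation set, so that you can compare the performance of the different configurations on a fixed set of unseen samples.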
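To compare configurations against the plus or minus 2% criterion from the question, one option is to compute, on the validation set, the fraction of predictions whose relative error is within that tolerance. The sketch below uses plain NumPy. The function name `within_tolerance_rate` and the sample numbers are made up for illustration, and it assumes none of the true values is zero.

```python
import numpy as np


def within_tolerance_rate(y_true, y_pred, tolerance=0.02):
    """Fraction of predictions within +/- tolerance (relative) of the true value.

    Assumes y_true contains no zeros, since the error is divided by |y_true|.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    relative_error = np.abs(y_pred - y_true) / np.abs(y_true)
    return float(np.mean(relative_error <= tolerance))


# Illustrative values only: relative errors are 1%, 5%, 1% and 2.5%.
y_true = np.array([100.0, 200.0, 50.0, 80.0])
y_pred = np.array([101.0, 210.0, 50.5, 78.0])
rate = within_tolerance_rate(y_true, y_pred)
print(rate)  # 0.5, since two of the four predictions are within 2%
```

This number is meant for evaluating a trained model after the fact. Training would still minimize a smooth loss such as 'mse', because a pass/fail threshold gives no useful gradient.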

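As a final note, do not expect anyone here to hand you a complete "winning solution". You need to experiment yourself with the data you have, because in machine learning, designing a suitable model for a specific dataset and task is a combination of art, science and experience. In the end, all that others can give you are some hints or ideas (apart from pointing out your mistakes, of course).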
