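I used MinMax normalization to normalize my dataset, both the features and the label. My question is: should the label be normalized as well? If so, how do I denormalize the output of the neural network (that is, the predictions made on the normalized test set)?
I can't upload the dataset, but it consists of 18 features and 1 label. It is a regression task, and the features and the label are all physical quantities.
So the problem is that the values of y_train_pred and y_test_pred lie between 0 and 1. How do I predict the "real values"?
The code is as follows: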
dataset = pd.read_csv('DataSet.csv', decimal=',', delimiter=";")
label = dataset.iloc[:, -1]
features = dataset.drop(columns=['Label'])
features = features[best_features]

X_train1, X_test1, y_train1, y_test1 = train_test_split(features, label, test_size=0.25, random_state=1, shuffle=True)
y_test2 = y_test1.to_frame()
y_train2 = y_train1.to_frame()

scaler1 = preprocessing.MinMaxScaler()
scaler2 = preprocessing.MinMaxScaler()
X_train = scaler1.fit_transform(X_train1)
X_test = scaler2.fit_transform(X_test1)

scaler3 = preprocessing.MinMaxScaler()
scaler4 = preprocessing.MinMaxScaler()
y_train = scaler3.fit_transform(y_train2)
y_test = scaler4.fit_transform(y_test2)

optimizer = tf.keras.optimizers.Adamax(lr=0.001)
model = Sequential()
model.add(Dense(80, input_shape=(X_train.shape[1],), activation='relu', kernel_initializer='random_normal'))
model.add(Dropout(0.15))
model.add(Dense(120, activation='relu', kernel_initializer='random_normal'))
model.add(Dropout(0.15))
model.add(Dense(80, activation='relu', kernel_initializer='random_normal'))
model.add(Dense(1, activation='linear'))
model.compile(loss='mse', optimizer=optimizer, metrics=['mse'])

history = model.fit(X_train, y_train, epochs=300, validation_split=0.1, shuffle=False, batch_size=120)
history_dict = history.history
loss_values = history_dict['loss']
val_loss_values = history_dict['val_loss']

y_train_pred = model.predict(X_train)
y_test_pred = model.predict(X_test)
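Answer:
You should denormalize the output so that you get the network's real-world predictions rather than just numbers between 0 and 1.
Min-Max normalization is defined as: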
z = (x - min)/(max - min)
where z is the normalized value, x is the label value, max is the maximum of x, and min is the minimum of x. So, given z, min, and max, we can solve for x:
x = z(max - min) + min
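As a quick sanity check of the two formulas, here is a minimal sketch in plain Python. The label values and the helper names normalize and denormalize are made up for illustration; they are not from the question's dataset.

```python
def normalize(x, x_min, x_max):
    # z = (x - min) / (max - min)
    return (x - x_min) / (x_max - x_min)


def denormalize(z, x_min, x_max):
    # x = z * (max - min) + min
    return z * (x_max - x_min) + x_min


# Hypothetical label values, chosen only to illustrate the round trip
labels = [12.0, 20.0, 28.0]
lo, hi = min(labels), max(labels)

scaled = [normalize(x, lo, hi) for x in labels]
restored = [denormalize(z, lo, hi) for z in scaled]

print(scaled)    # [0.0, 0.5, 1.0]
print(restored)  # [12.0, 20.0, 28.0]
```

The round trip only works if denormalize receives the same min and max that normalize used.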
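Therefore, if the label is continuous, define variables that store its maximum and minimum before you normalize the data (these must be the same values the normalization uses). Then, once you have the predictions, you can use the following function: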
y_max_pre_normalize = max(label)
y_min_pre_normalize = min(label)

def denormalize(y):
    # Invert min-max scaling: x = z * (max - min) + min
    final_value = y * (y_max_pre_normalize - y_min_pre_normalize) + y_min_pre_normalize
    return final_value
Then apply this function to your normalized y_test and to your predictions to get the corresponding values in the original units.
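Since the question already uses scikit-learn's MinMaxScaler, an alternative to a hand-written function is the scaler's own inverse_transform method. The sketch below assumes one scaler is fit on the training labels and reused for the predictions. The arrays are hypothetical stand-ins for y_train2 and for the output of model.predict, not the asker's data.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical training labels standing in for y_train2 (shape: n_samples x 1)
y_train_raw = np.array([[10.0], [30.0], [50.0]])

# Fit the label scaler on the training labels only
y_scaler = MinMaxScaler()
y_train_scaled = y_scaler.fit_transform(y_train_raw)

# Hypothetical stand-in for model.predict(X_test): values on the scaled axis
y_pred_scaled = np.array([[0.25], [0.75]])

# Map the scaled predictions back to the original units
y_pred = y_scaler.inverse_transform(y_pred_scaled)
print(y_pred)  # approximately [[20.], [40.]]
```

In the question's code this would correspond to something like scaler3.inverse_transform(y_test_pred), because scaler3 is the scaler that was fit on the training labels. The usual practice is to fit each scaler on the training data only and call transform, not fit_transform, on the test data, so that both splits share one scale.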