After several days of failing to get Q-learning with a neural network to work, I decided to go back to basics and do a simple function approximation, both to check that everything works at all and to see how a few parameters affect the learning process. Here is the code I came up with:
from keras.models import Sequential
from keras.layers import Dense
import matplotlib.pyplot as plt
import random
import numpy
from sklearn.preprocessing import MinMaxScaler

regressor = Sequential()
regressor.add(Dense(units=20, activation='sigmoid', kernel_initializer='uniform', input_dim=1))
regressor.add(Dense(units=20, activation='sigmoid', kernel_initializer='uniform'))
regressor.add(Dense(units=20, activation='sigmoid', kernel_initializer='uniform'))
regressor.add(Dense(units=1))
regressor.compile(loss='mean_squared_error', optimizer='sgd')
#regressor = ExtraTreesRegressor()

N = 5000
X = numpy.empty((N,))
Y = numpy.empty((N,))

for i in range(N):
    X[i] = random.uniform(-10, 10)
X = numpy.sort(X).reshape(-1, 1)

for i in range(N):
    Y[i] = numpy.sin(X[i])
Y = Y.reshape(-1, 1)

X_scaler = MinMaxScaler()
Y_scaler = MinMaxScaler()
X = X_scaler.fit_transform(X)
Y = Y_scaler.fit_transform(Y)

regressor.fit(X, Y, epochs=2, verbose=1, batch_size=32)
#regressor.fit(X, Y.reshape(5000,))

x = numpy.mgrid[-10:10:100*1j]
x = x.reshape(-1, 1)
y = numpy.mgrid[-10:10:100*1j]
y = y.reshape(-1, 1)
x = X_scaler.fit_transform(x)

for i in range(len(x)):
    y[i] = regressor.predict(numpy.array([x[i]]))

plt.figure()
plt.plot(X_scaler.inverse_transform(x), Y_scaler.inverse_transform(y))
plt.plot(X_scaler.inverse_transform(X), Y_scaler.inverse_transform(Y))
The problem is that all my predictions end up clustered around 0. As you can see, I used an ExtraTreesRegressor from sklearn (the commented-out lines) to verify that the overall pipeline is correct. So what is wrong with my neural network? Why doesn't it work?
(The problem I actually want to solve is computing the Q-function for the mountain car problem with a neural network. How is that different from this function approximator?)
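For reference, the sanity check with ExtraTreesRegressor (the commented-out lines) can be run like this; a minimal sketch, assuming the same scaled X, Y and evaluation grid x as in the code above:

from sklearn.ensemble import ExtraTreesRegressor

# Fit the tree ensemble on the same scaled data the network sees
tree_regressor = ExtraTreesRegressor()
tree_regressor.fit(X, Y.ravel())    # tree models expect a 1-D target
y_tree = tree_regressor.predict(x)  # predict over the scaled evaluation grid

plt.plot(X_scaler.inverse_transform(x), Y_scaler.inverse_transform(y_tree.reshape(-1, 1)))

This recovers the sine curve, which is how I convinced myself the data pipeline itself is fine.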
Answer:
With the following changes:

- activation changed to relu
- kernel_initializer removed (i.e. keep the default 'glorot_uniform')
- optimizer switched to Adam
- 100 training epochs

The original setup most likely fails because the sigmoid units saturate: combined with the narrow 'uniform' initialization and only 2 epochs of plain SGD, the weights barely move, so the output collapses toward a near-constant value. That is:
regressor = Sequential()
regressor.add(Dense(units=20, activation='relu', input_dim=1))
regressor.add(Dense(units=20, activation='relu'))
regressor.add(Dense(units=20, activation='relu'))
regressor.add(Dense(units=1))
regressor.compile(loss='mean_squared_error', optimizer='adam')
regressor.fit(X, Y, epochs=100, verbose=1, batch_size=32)
The rest of the code stays the same, and this is the result:
Keep trying, and try again…
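By the way, one detail in the part that stays the same is worth double-checking: x = X_scaler.fit_transform(x) refits the scaler on the evaluation grid instead of reusing the fit from the training data. It happens to be harmless here because the grid spans the same [-10, 10] range, but transform is the safer call, and predict accepts the whole grid at once instead of a point-by-point loop. A sketch of that variant, using the same names as above:

x = numpy.mgrid[-10:10:100*1j].reshape(-1, 1)
x_scaled = X_scaler.transform(x)      # reuse the scaling fitted on the training data
y_pred = regressor.predict(x_scaled)  # vectorized prediction over the whole grid

plt.plot(x, Y_scaler.inverse_transform(y_pred))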