I am new to machine learning, and I want to use Keras to classify each number in a sequence as 1 or 0, depending on whether it is greater than the previous number. That is, if I have the sequence:
a = [1, 2, 6, 4, 5],
the solution should be: b = [0, 1, 1, 0, 1].
So far, I have written the following code:
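For reference, the target labels can also be computed deterministically, without any learning; a minimal sketch (the first element has no predecessor, so it is labeled 0 here by convention):

```python
def label_increases(seq):
    """1 if an element is greater than its predecessor, else 0 (first element -> 0)."""
    return [0] + [1 if cur > prev else 0 for prev, cur in zip(seq, seq[1:])]

print(label_increases([1, 2, 6, 4, 5]))  # [0, 1, 1, 0, 1]
```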
```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from keras.models import Sequential
from keras.layers import Dense

model = tf.keras.Sequential([tf.keras.layers.Dense(units=1, input_shape=[1, 1])])
model.add(tf.keras.layers.Dense(17))
model.add(tf.keras.layers.Dense(17))
model.compile(optimizer='sgd', loss='BinaryCrossentropy', metrics=['binary_accuracy'])

b = [1, 6, 8, 3, 5, 8, 90, 5, 432, 3, 5, 6, 8, 8, 4, 234, 0]
a = [0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0]
b = np.array(b, dtype=float)
a = np.array(a, dtype=float)

model.fit(b, a, epochs=500, batch_size=1)

# Generate predictions for samples
predictions = model.predict(b)
print(predictions)
```
When I run this, the result I get is:
```
Epoch 500/500
17/17 [==============================] - 0s 499us/step - loss: 7.9229 - binary_accuracy: 0.4844
[[[-1.37064695e+01  4.70858345e+01 -4.67341652e+01 -1.94298875e+00
    5.75960045e+01  6.70146179e+01  6.34545479e+01 -4.86319550e+02
    2.26250134e+01 -8.60109329e+00 -4.03220863e+01 -1.67574768e+01
    3.36148148e+01 -4.55171967e+00 -1.39924898e+01  6.31023712e+01
   -9.14120102e+00]]
 ...
 (one 17-dimensional row like this per input sample)
 ...
 [[-2.59487200e+00  8.44894505e+00 -8.53793907e+00 -4.46333081e-01
    1.04523640e+01  1.21989994e+01  1.13933916e+01 -8.49708328e+01
    4.10160637e+00 -1.55452514e+00 -7.19183874e+00 -3.14619255e+00
    6.28279734e+00 -4.88203079e-01 -2.48353434e+00  1.12964716e+01
   -1.81198704e+00]]]
```
Answer:
There are several problems with the way you are approaching this:

- Your setup of the deep-learning problem is flawed. You want to use information from the previous element to infer the label of the current one, but at inference (and training) time you only pass in the current element. Imagine what happens if I deploy this model tomorrow: I hand you a single number, say "15", and ask whether it is greater than the previous element, which does not exist. How would your model respond?
- Second, why does your output layer predict a 17-dimensional vector? Shouldn't the goal be to predict a single 0 or 1 (a probability)? In that case, your output should be a single unit with a sigmoid activation. Please refer to this chart as a guide for setting up neural networks in the future.
- Third, you are not using any activation functions, which are the core reason for using neural networks in the first place (non-linearity). Without activations, you are just building a standard regression model. Here is a basic proof:
```
# 2-layer NN without activation
h = W1.X + B1
o = W2.h + B2
o = W2.(W1.X + B1) + B2
  = W2.W1.X + (W2.B1 + B2)
  = W3.X + B3            # same as linear regression!

# 2-layer NN with activation
h = activation(W1.X + B1)
o = activation(W2.h + B2)
```
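The algebra above can be checked numerically; a small NumPy sketch (weights are arbitrary random values, not from the model) showing that two stacked linear layers collapse into a single linear map:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, B1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, B2 = rng.normal(size=(2, 4)), rng.normal(size=2)
X = rng.normal(size=3)

# Two linear layers applied in sequence (no activation)
o = W2 @ (W1 @ X + B1) + B2

# Equivalent single linear layer: W3 = W2.W1, B3 = W2.B1 + B2
W3, B3 = W2 @ W1, W2 @ B1 + B2
assert np.allclose(o, W3 @ X + B3)  # identical output
```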
I suggest you start with the fundamentals of neural networks and establish best practices first, before jumping to creating your own problem statements. Fchollet, the author of Keras, has some excellent starter notebooks that you can explore.
For your case, try the following modifications:
```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from keras.models import Sequential
from keras.layers import Dense

# Modify input and output shapes + add activation functions
model = tf.keras.Sequential([tf.keras.layers.Dense(units=1, input_shape=(2,))])  # <------
model.add(tf.keras.layers.Dense(17, activation='relu'))                          # <------
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))                        # <------
model.compile(optimizer='sgd', loss='BinaryCrossentropy', metrics=['binary_accuracy'])

# Create 2 features: the first is the previous element, the second is the current one
b = [1, 6, 8, 3, 5, 8, 90, 5, 432, 3, 5, 6, 8, 8, 4, 234, 0]
b = np.array([i for i in zip(b, b[1:])])  # <---- (16, 2)

# Start from the first pair of elements
a = np.array([0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0])[1:]  # <---- (16,)

model.fit(b, a, epochs=20, batch_size=1)

# Generate predictions for samples
predictions = model.predict(b)
print(np.round(predictions))
```
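The zip-based pairing can be sanity-checked on its own, without the model; a small sketch verifying that the (previous, current) rows line up with the shifted labels:

```python
import numpy as np

b = [1, 6, 8, 3, 5, 8, 90, 5, 432, 3, 5, 6, 8, 8, 4, 234, 0]
pairs = np.array([i for i in zip(b, b[1:])])
print(pairs.shape)  # (16, 2)
print(pairs[0])     # [1 6] -> previous=1, current=6, so the label should be 1

# Label k says whether pairs[k][1] > pairs[k][0]; this reproduces a[1:] above
labels = (pairs[:, 1] > pairs[:, 0]).astype(int)
print(labels)  # [1 1 0 1 1 1 0 1 0 1 1 1 0 0 1 0]
```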
```
Epoch 1/20
16/16 [==============================] - 0s 1ms/step - loss: 3.0769 - binary_accuracy: 0.7086
Epoch 2/20
16/16 [==============================] - 0s 823us/step - loss: 252.6490 - binary_accuracy: 0.6153
Epoch 3/20
16/16 [==============================] - 0s 1ms/step - loss: 3.8109 - binary_accuracy: 0.9212
Epoch 4/20
16/16 [==============================] - 0s 787us/step - loss: 0.0131 - binary_accuracy: 0.9845
Epoch 5/20
16/16 [==============================] - 0s 2ms/step - loss: 0.0767 - binary_accuracy: 1.0000
...
(loss keeps shrinking while binary_accuracy stays at 1.0000)
...
Epoch 19/20
16/16 [==============================] - 0s 2ms/step - loss: 1.7345e-04 - binary_accuracy: 1.0000
Epoch 20/20
16/16 [==============================] - 0s 1ms/step - loss: 7.7950e-05 - binary_accuracy: 1.0000
[[1.] [1.] [0.] [1.] [1.] [1.] [0.] [1.] [0.] [1.] [1.] [1.] [0.] [0.] [1.] [0.]]
```
The above model trains easily because the problem itself is not complex. You can see that the accuracy reaches 100% very quickly. Let's try using this new model to predict on unseen data:
```python
np.round(model.predict([[5, 1],      # <- is 5 less than 1?
                        [5, 500],    # <- is 5 less than 500?
                        [5, 6]]))    # <- is 5 less than 6?
```
```
array([[0.],                  # <- no
       [1.],                  # <- yes
       [1.]], dtype=float32)  # <- yes
```
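To score a whole new sequence rather than hand-built pairs, the same pairing step can be reused; a hedged sketch (the `to_pairs` helper is illustrative, not part of the original code):

```python
import numpy as np

def to_pairs(seq):
    """Turn a sequence into (previous, current) rows suitable for model.predict."""
    return np.array([i for i in zip(seq, seq[1:])])

new_seq = [1, 2, 6, 4, 5]
X = to_pairs(new_seq)
print(X.shape)  # (4, 2) -> one row per element from the second one onward
# preds = np.round(model.predict(X))  # would label elements 2..5 of new_seq
```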