Neural network fails on a toy dataset

I created the following toy dataset: [dataset plot]

I am trying to predict the class with a neural network in Keras:

model = Sequential()
model.add(Dense(units=2, activation='sigmoid', input_shape=(nr_feats,)))
model.add(Dense(units=nr_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

where nr_feats and nr_classes are both set to 2. The network never gets past 50% accuracy and predicts either all 1s or all 2s. Logistic regression reaches 100% accuracy on the same data.

I cannot figure out what is going wrong here.
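For comparison, the logistic-regression baseline mentioned above can be reproduced with a short standalone sketch. Since the original df_train is not shown, this uses hypothetical data: two well-separated uniform clusters, one per class, similar to the toy set described in the question:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical stand-in for the original dataset: two linearly
# separable uniform clusters (the real df_train is not shown).
rng = np.random.RandomState(0)
cls0 = rng.uniform(0.2, 0.4, size=(200, 2))
cls1 = rng.uniform(0.5, 0.7, size=(200, 2))
x_train = np.concatenate((cls0, cls1))
y_train = np.concatenate((np.zeros(200), np.ones(200)))

clf = LogisticRegression()
clf.fit(x_train, y_train)
print(clf.score(x_train, y_train))  # linearly separable data -> perfect train accuracy
```

On data this cleanly separable, logistic regression finds the boundary immediately, which is what makes the neural network's 50% plateau so surprising.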

I have uploaded a notebook to GitHub in case you want to try something quickly.

Edit 1

I increased the number of epochs substantially: accuracy starts improving from 0.5 at epoch 72 and converges to 1.0 at epoch 98. That still seems extremely slow for such a simple dataset.

I am aware that a single output neuron with a sigmoid activation is the better choice here, but I would rather understand why two output neurons with a softmax activation do not work.
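It is worth noting (my own illustration, not part of the original question) that a two-unit softmax is mathematically just a sigmoid applied to the difference of the two logits, so the two output formulations have identical expressive power; any difference must come from the training dynamics, not the model:

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over a 1-D logit vector
    e = np.exp(z - z.max())
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

z = np.array([0.3, -1.2])  # arbitrary pair of logits
# P(class 1) under a 2-way softmax equals sigmoid(z1 - z0):
# e^z1 / (e^z0 + e^z1) = 1 / (1 + e^(z0 - z1))
assert np.isclose(softmax(z)[1], sigmoid(z[1] - z[0]))
print(softmax(z)[1])
```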

Here is how I preprocess the dataframe:

from sklearn.preprocessing import LabelEncoder

x_train = df_train.iloc[:, 0:-1].values
y_train = df_train.iloc[:, -1]
nr_feats = x_train.shape[1]
nr_classes = y_train.nunique()
label_enc = LabelEncoder()
label_enc.fit(y_train)
y_train = keras.utils.to_categorical(label_enc.transform(y_train), nr_classes)
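As a quick sanity check on what this preprocessing produces (a standalone sketch with hypothetical labels; keras.utils.to_categorical is emulated with np.eye here since only the resulting encoding matters):

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

y_raw = np.array([2, 1, 2, 1])        # hypothetical raw class labels, as in the question
label_enc = LabelEncoder().fit(y_raw)
encoded = label_enc.transform(y_raw)  # LabelEncoder maps sorted classes: 1 -> 0, 2 -> 1
one_hot = np.eye(2)[encoded]          # same result keras.utils.to_categorical would give
print(one_hot)
```

So each row of y_train ends up as a one-hot vector, which is the format categorical_crossentropy expects.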

Training and evaluation:

model.fit(x_train, y_train, epochs=500, batch_size=32, verbose=True)
accuracy_score(model.predict_classes(x_train), df_train.iloc[:, -1].values)

Edit 2

After changing the output layer to a single neuron with a sigmoid activation and using the binary_crossentropy loss, as suggested by modesitt, accuracy still stays at 0.5 for 200 epochs and only converges to 1.0 after a further 100 epochs.


Answer:

Note: If you want to know the real reason, read the "UPDATE" section at the end of my answer. The other two reasons I mention here only apply when the learning rate is set very low (below 1e-3).
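To see why a very low learning rate stalls training like this, here is a small numpy sketch (my own illustration, not code from the question or answer) of plain gradient descent on a single sigmoid unit over separable toy data: with the same number of steps, a tiny learning rate leaves accuracy at chance while a larger one fits the data:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(lr, steps=2000, seed=0):
    """Fit one sigmoid unit with full-batch gradient descent; return train accuracy."""
    rng = np.random.RandomState(seed)
    x0 = rng.uniform(0.2, 0.4, size=(200, 2))   # class 0 cluster
    x1 = rng.uniform(0.5, 0.7, size=(200, 2))   # class 1 cluster
    x = np.concatenate((x0, x1))
    y = np.concatenate((np.zeros(200), np.ones(200)))
    w, b = np.zeros(2), 0.0
    for _ in range(steps):
        p = sigmoid(x @ w + b)
        grad_w = x.T @ (p - y) / len(y)   # gradient of binary cross-entropy
        grad_b = np.mean(p - y)
        w -= lr * grad_w
        b -= lr * grad_b
    return np.mean((sigmoid(x @ w + b) > 0.5) == y)

print(train(lr=1.0))    # separates the classes within the step budget
print(train(lr=1e-4))   # weights barely move; accuracy stays near chance
```

With lr=1e-4 the per-step weight update is four orders of magnitude smaller, so after the same step budget the decision boundary has not yet moved between the clusters.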


I put together some code. It is very similar to yours, but I cleaned it up a little and simplified it for myself. As you can see, I use a Dense layer with one unit and a sigmoid activation for the last layer, and I only changed the optimizer from adam to rmsprop (this is not important; you can use adam if you like):

import numpy as np
import random

# generate random data with two features
n_samples = 200
n_feats = 2

cls0 = np.random.uniform(low=0.2, high=0.4, size=(n_samples, n_feats))
cls1 = np.random.uniform(low=0.5, high=0.7, size=(n_samples, n_feats))
x_train = np.concatenate((cls0, cls1))
y_train = np.concatenate((np.zeros((n_samples,)), np.ones((n_samples,))))

# shuffle the data, because all the negative samples (i.e. class "0")
# come first, followed by all the positive samples (i.e. class "1")
indices = np.arange(x_train.shape[0])
np.random.shuffle(indices)
x_train = x_train[indices]
y_train = y_train[indices]

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(2, activation='sigmoid', input_shape=(n_feats,)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
model.summary()
model.fit(x_train, y_train, epochs=5, batch_size=32, verbose=True)

Here is the output:

Layer (type)                 Output Shape              Param #
=================================================================
dense_25 (Dense)             (None, 2)                 6
_________________________________________________________________
dense_26 (Dense)             (None, 1)                 3
=================================================================
Total params: 9
Trainable params: 9
Non-trainable params: 0
_________________________________________________________________
Epoch 1/5
400/400 [==============================] - 0s 966us/step - loss: 0.7013 - acc: 0.5000
Epoch 2/5
400/400 [==============================] - 0s 143us/step - loss: 0.6998 - acc: 0.5000
Epoch 3/5
400/400 [==============================] - 0s 137us/step - loss: 0.6986 - acc: 0.5000
Epoch 4/5
400/400 [==============================] - 0s 149us/step - loss: 0.6975 - acc: 0.5000
Epoch 5/5
400/400 [==============================] - 0s 132us/step - loss: 0.6966 - acc: 0.5000

As you can see, the accuracy never rises above 50%. What if you increase the number of epochs to 50:

Layer (type)                 Output Shape              Param #
=================================================================
dense_35 (Dense)             (None, 2)                 6
_________________________________________________________________
dense_36 (Dense)             (None, 1)                 3
=================================================================
Total params: 9
Trainable params: 9
Non-trainable params: 0
_________________________________________________________________
Epoch 1/50
400/400 [==============================] - 0s 1ms/step - loss: 0.6925 - acc: 0.5000
Epoch 2/50
400/400 [==============================] - 0s 136us/step - loss: 0.6902 - acc: 0.5000
Epoch 3/50
400/400 [==============================] - 0s 133us/step - loss: 0.6884 - acc: 0.5000
Epoch 4/50
400/400 [==============================] - 0s 160us/step - loss: 0.6866 - acc: 0.5000
Epoch 5/50
400/400 [==============================] - 0s 140us/step - loss: 0.6848 - acc: 0.5000
Epoch 6/50
400/400 [==============================] - 0s 168us/step - loss: 0.6832 - acc: 0.5000
Epoch 7/50
400/400 [==============================] - 0s 154us/step - loss: 0.6817 - acc: 0.5000
Epoch 8/50
400/400 [==============================] - 0s 146us/step - loss: 0.6802 - acc: 0.5000
Epoch 9/50
400/400 [==============================] - 0s 161us/step - loss: 0.6789 - acc: 0.5000
Epoch 10/50
400/400 [==============================] - 0s 140us/step - loss: 0.6778 - acc: 0.5000
Epoch 11/50
400/400 [==============================] - 0s 177us/step - loss: 0.6766 - acc: 0.5000
Epoch 12/50
400/400 [==============================] - 0s 180us/step - loss: 0.6755 - acc: 0.5000
Epoch 13/50
400/400 [==============================] - 0s 165us/step - loss: 0.6746 - acc: 0.5000
Epoch 14/50
400/400 [==============================] - 0s 128us/step - loss: 0.6736 - acc: 0.5000
Epoch 15/50
400/400 [==============================] - 0s 125us/step - loss: 0.6728 - acc: 0.5000
Epoch 16/50
400/400 [==============================] - 0s 165us/step - loss: 0.6718 - acc: 0.5000
Epoch 17/50
400/400 [==============================] - 0s 161us/step - loss: 0.6710 - acc: 0.5000
Epoch 18/50
400/400 [==============================] - 0s 170us/step - loss: 0.6702 - acc: 0.5000
Epoch 19/50
400/400 [==============================] - 0s 122us/step - loss: 0.6694 - acc: 0.5000
Epoch 20/50
400/400 [==============================] - 0s 110us/step - loss: 0.6686 - acc: 0.5000
Epoch 21/50
400/400 [==============================] - 0s 142us/step - loss: 0.6676 - acc: 0.5000
Epoch 22/50
400/400 [==============================] - 0s 142us/step - loss: 0.6667 - acc: 0.5000
Epoch 23/50
400/400 [==============================] - 0s 149us/step - loss: 0.6659 - acc: 0.5000
Epoch 24/50
400/400 [==============================] - 0s 125us/step - loss: 0.6651 - acc: 0.5000
Epoch 25/50
400/400 [==============================] - 0s 134us/step - loss: 0.6643 - acc: 0.5000
Epoch 26/50
400/400 [==============================] - 0s 143us/step - loss: 0.6634 - acc: 0.5000
Epoch 27/50
400/400 [==============================] - 0s 137us/step - loss: 0.6625 - acc: 0.5000
Epoch 28/50
400/400 [==============================] - 0s 131us/step - loss: 0.6616 - acc: 0.5025
Epoch 29/50
400/400 [==============================] - 0s 119us/step - loss: 0.6608 - acc: 0.5100
Epoch 30/50
400/400 [==============================] - 0s 143us/step - loss: 0.6601 - acc: 0.5025
Epoch 31/50
400/400 [==============================] - 0s 148us/step - loss: 0.6593 - acc: 0.5350
Epoch 32/50
400/400 [==============================] - 0s 161us/step - loss: 0.6584 - acc: 0.5325
Epoch 33/50
400/400 [==============================] - 0s 152us/step - loss: 0.6576 - acc: 0.5700
Epoch 34/50
400/400 [==============================] - 0s 128us/step - loss: 0.6568 - acc: 0.5850
Epoch 35/50
400/400 [==============================] - 0s 155us/step - loss: 0.6560 - acc: 0.5975
Epoch 36/50
400/400 [==============================] - 0s 136us/step - loss: 0.6552 - acc: 0.6425
Epoch 37/50
400/400 [==============================] - 0s 140us/step - loss: 0.6544 - acc: 0.6150
Epoch 38/50
400/400 [==============================] - 0s 120us/step - loss: 0.6538 - acc: 0.6375
Epoch 39/50
400/400 [==============================] - 0s 140us/step - loss: 0.6531 - acc: 0.6725
Epoch 40/50
400/400 [==============================] - 0s 135us/step - loss: 0.6523 - acc: 0.6750
Epoch 41/50
400/400 [==============================] - 0s 136us/step - loss: 0.6515 - acc: 0.7300
Epoch 42/50
400/400 [==============================] - 0s 126us/step - loss: 0.6505 - acc: 0.7450
Epoch 43/50
400/400 [==============================] - 0s 141us/step - loss: 0.6496 - acc: 0.7425
Epoch 44/50
400/400 [==============================] - 0s 162us/step - loss: 0.6489 - acc: 0.7675
Epoch 45/50
400/400 [==============================] - 0s 161us/step - loss: 0.6480 - acc: 0.7775
Epoch 46/50
400/400 [==============================] - 0s 126us/step - loss: 0.6473 - acc: 0.7575
Epoch 47/50
400/400 [==============================] - 0s 124us/step - loss: 0.6464 - acc: 0.7625
Epoch 48/50
400/400 [==============================] - 0s 130us/step - loss: 0.6455 - acc: 0.7950
Epoch 49/50
400/400 [==============================] - 0s 191us/step - loss: 0.6445 - acc: 0.8100
Epoch 50/50
400/400 [==============================] - 0s 163us/step - loss: 0.6435 - acc: 0.8625

