I have just started learning machine learning and am trying TFLearn because it is simple to use.
I am trying to build a basic classifier, which I find interesting. My goal is to train the system to predict the direction from one point to another.
For example, if I feed in the two 2D coordinates (50,50) and (51,51), the system must predict the direction as north-east (NE). If I feed in (50,50) and (49,49), the system must predict the direction as south-west (SW).
Input: X1, Y1, X2, Y2, Label
Output: 0 to 7, representing the 8 compass directions.
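To make the encoding concrete, here is a tiny sketch of the kind of mapping I have in mind. The helper `direction_label` and this particular numbering of the 8 directions are only illustrative assumptions; my real CSV may order them differently.

    # Illustrative only: this 0-7 numbering is an assumption, not necessarily
    # the one used in my real data set. Assumes the two points are not identical.
    DIRECTIONS = {(1, 0): 0,    # E
                  (1, 1): 1,    # NE
                  (0, 1): 2,    # N
                  (-1, 1): 3,   # NW
                  (-1, 0): 4,   # W
                  (-1, -1): 5,  # SW
                  (0, -1): 6,   # S
                  (1, -1): 7}   # SE

    def sign(v):
        return (v > 0) - (v < 0)  # -1, 0 or 1

    def direction_label(x1, y1, x2, y2):
        """Return 0-7 for the compass direction of the step (x1,y1) -> (x2,y2)."""
        return DIRECTIONS[(sign(x2 - x1), sign(y2 - y1))]

    # direction_label(50, 50, 51, 51) -> 1 (NE); direction_label(50, 50, 49, 49) -> 5 (SW)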
So here is the small piece of code I wrote:
    from __future__ import print_function
    import numpy as np
    import tflearn
    import tensorflow as tf
    import time
    from tflearn.data_utils import load_csv

    # Sample input: 50,50,51,51,5
    data, labels = load_csv(filename, target_column=4, categorical_labels=True, n_classes=8)

    my_optimizer = tflearn.SGD(learning_rate=0.1)
    net = tflearn.input_data(shape=[None, 4])
    net = tflearn.fully_connected(net, 32)  # input 4, output 32
    net = tflearn.fully_connected(net, 32)  # input 32, output 32
    net = tflearn.fully_connected(net, 8, activation='softmax')
    net = tflearn.regression(net, optimizer=my_optimizer)

    model = tflearn.DNN(net)
    model.fit(data, labels, n_epoch=100, batch_size=100000, show_metric=True)
    model.save("direction-classifier.tfl")
The problem I am facing is that even after feeding in around 40 million samples, the accuracy of the system stays as low as 20%.
I have constrained the inputs to 40 < x < 60 and 40 < y < 60.
I cannot tell whether I have overfit the samples, because the accuracy never reached a high value at any point while training on the full 40 million inputs.
Why is the accuracy so low for such a simple example?
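To give an idea of what the data looks like, here is a rough sketch of how a CSV in this format could be generated. The file name, the sample count, and the angle-based 0-7 numbering are assumptions for illustration only; my real generator may differ.

    import math
    import random

    # A sketch only: "directions.csv" and this particular 0-7 numbering are
    # assumptions, not necessarily what my real data set uses.
    def label_for(dx, dy):
        angle = math.atan2(dy, dx)                    # angle of the step (x1,y1) -> (x2,y2)
        return int(round(angle / (math.pi / 4))) % 8  # snap to the nearest of 8 compass sectors

    with open("directions.csv", "w") as f:
        for _ in range(1000000):                      # my real file has ~40M rows
            x1, y1 = random.randint(40, 60), random.randint(40, 60)
            x2, y2 = random.randint(40, 60), random.randint(40, 60)
            if (x1, y1) == (x2, y2):                  # skip pairs with no direction
                continue
            f.write("%d,%d,%d,%d,%d\n" % (x1, y1, x2, y2, label_for(x2 - x1, y2 - y1)))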
EDIT: I have lowered the learning rate and reduced the batch size. However, the result is the same: the accuracy stays very low. I have included the output of the first 25 epochs below.
    Training Step: 100000 | total loss: 6.33983 | time: 163.327s | SGD | epoch: 001 | loss: 6.33983 - acc: 0.0663 -- iter: 999999/999999
    Training Step: 200000 | total loss: 6.84055 | time: 161.981s | SGD | epoch: 002 | loss: 6.84055 - acc: 0.1568 -- iter: 999999/999999
    Training Step: 300000 | total loss: 5.90203 | time: 158.853s | SGD | epoch: 003 | loss: 5.90203 - acc: 0.1426 -- iter: 999999/999999
    Training Step: 400000 | total loss: 5.97782 | time: 157.607s | SGD | epoch: 004 | loss: 5.97782 - acc: 0.1465 -- iter: 999999/999999
    Training Step: 500000 | total loss: 5.97215 | time: 155.929s | SGD | epoch: 005 | loss: 5.97215 - acc: 0.1234 -- iter: 999999/999999
    Training Step: 600000 | total loss: 6.86967 | time: 157.299s | SGD | epoch: 006 | loss: 6.86967 - acc: 0.1230 -- iter: 999999/999999
    Training Step: 700000 | total loss: 6.10330 | time: 158.137s | SGD | epoch: 007 | loss: 6.10330 - acc: 0.1242 -- iter: 999999/999999
    Training Step: 800000 | total loss: 5.81901 | time: 157.464s | SGD | epoch: 008 | loss: 5.81901 - acc: 0.1464 -- iter: 999999/999999
    Training Step: 900000 | total loss: 7.09744 | time: 157.486s | SGD | epoch: 009 | loss: 7.09744 - acc: 0.1359 -- iter: 999999/999999
    Training Step: 1000000 | total loss: 7.19259 | time: 158.369s | SGD | epoch: 010 | loss: 7.19259 - acc: 0.1248 -- iter: 999999/999999
    Training Step: 1100000 | total loss: 5.60177 | time: 157.221s | SGD | epoch: 011 | loss: 5.60177 - acc: 0.1378 -- iter: 999999/999999
    Training Step: 1200000 | total loss: 7.16676 | time: 158.607s | SGD | epoch: 012 | loss: 7.16676 - acc: 0.1210 -- iter: 999999/999999
    Training Step: 1300000 | total loss: 6.19163 | time: 163.711s | SGD | epoch: 013 | loss: 6.19163 - acc: 0.1635 -- iter: 999999/999999
    Training Step: 1400000 | total loss: 7.46101 | time: 162.091s | SGD | epoch: 014 | loss: 7.46101 - acc: 0.1216 -- iter: 999999/999999
    Training Step: 1500000 | total loss: 7.78055 | time: 158.468s | SGD | epoch: 015 | loss: 7.78055 - acc: 0.1122 -- iter: 999999/999999
    Training Step: 1600000 | total loss: 6.03101 | time: 158.251s | SGD | epoch: 016 | loss: 6.03101 - acc: 0.1103 -- iter: 999999/999999
    Training Step: 1700000 | total loss: 5.59769 | time: 158.083s | SGD | epoch: 017 | loss: 5.59769 - acc: 0.1182 -- iter: 999999/999999
    Training Step: 1800000 | total loss: 5.45591 | time: 158.088s | SGD | epoch: 018 | loss: 5.45591 - acc: 0.0868 -- iter: 999999/999999
    Training Step: 1900000 | total loss: 6.54951 | time: 157.755s | SGD | epoch: 019 | loss: 6.54951 - acc: 0.1353 -- iter: 999999/999999
    Training Step: 2000000 | total loss: 6.18566 | time: 157.408s | SGD | epoch: 020 | loss: 6.18566 - acc: 0.0551 -- iter: 999999/999999
    Training Step: 2100000 | total loss: 4.95146 | time: 157.572s | SGD | epoch: 021 | loss: 4.95146 - acc: 0.1114 -- iter: 999999/999999
    Training Step: 2200000 | total loss: 5.97208 | time: 157.279s | SGD | epoch: 022 | loss: 5.97208 - acc: 0.1277 -- iter: 999999/999999
    Training Step: 2300000 | total loss: 6.75645 | time: 157.201s | SGD | epoch: 023 | loss: 6.75645 - acc: 0.1507 -- iter: 999999/999999
    Training Step: 2400000 | total loss: 7.04119 | time: 157.346s | SGD | epoch: 024 | loss: 7.04119 - acc: 0.1512 -- iter: 999999/999999
    Training Step: 2500000 | total loss: 5.95451 | time: 157.722s | SGD | epoch: 025 | loss: 5.95451 - acc: 0.1421 -- iter: 999999/999999
Answer:
It turned out that the optimizer was causing all the problems. Once the custom optimizer was removed, the loss started to decrease properly and the accuracy rose to 99%.
The following two lines had to be changed:
    my_optimizer = tflearn.SGD(learning_rate=0.1)
    net = tflearn.regression(net, optimizer=my_optimizer)
and replaced with
    net = tflearn.regression(net)
which produced perfect results.
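For completeness, here is a rough sketch of how the corrected model can be rebuilt and queried after training (with no optimizer given, TFLearn's regression layer falls back to its default Adam optimizer). The test pair (50,50) to (51,51) is just an example.

    import numpy as np
    import tflearn

    # Same network as above, but without the custom SGD optimizer.
    net = tflearn.input_data(shape=[None, 4])
    net = tflearn.fully_connected(net, 32)
    net = tflearn.fully_connected(net, 32)
    net = tflearn.fully_connected(net, 8, activation='softmax')
    net = tflearn.regression(net)  # default optimizer and learning rate

    model = tflearn.DNN(net)
    model.load("direction-classifier.tfl")  # weights saved by the training run above

    # Query a single point pair; argmax picks the most likely of the 8 classes.
    pred = model.predict([[50, 50, 51, 51]])
    print(np.argmax(pred[0]))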