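I am trying to train an ANN to predict the probabilities that an image belongs to each of several classes, and my target values are a set of such probabilities.
The inputs are 28×28 grayscale images, simply reshaped into flat vectors, with pixel values from 0 to 255.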
A single "target" looks like this: 0.738832,0.238159,0.023009,0,0.238159,0,0.238159,0,0.238159,0,0,0.238159,0,0.19793,0.80207,0.066806667,0.663691308,0.008334764,0,0,0.0494825,0.098965,0.0494825,0,0,0,0,0,0,0,0,0,0,0,0,0,0
However, the results I get are very poor (far worse than simple linear regression) and look like this: 0.011947,0.448668,0,0,0.095688,0,0.038233,0,0,0,0,0,0,0,0.405464,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
The results are about the same whether I use 300 images or 30,000. I am clearly doing something wrong and would really appreciate some advice.
Code:
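from pybrain.datasets import SupervisedDataSet
from pybrain.structure import SoftmaxLayer
from pybrain.supervised.trainers import BackpropTrainer
from pybrain.tools.shortcuts import buildNetwork

# create dataset (ia: input array, one row of 784 pixels per image; ta: target array, 37 values per image)
DS = SupervisedDataSet(784, 37)
assert ia.shape[0] == ta.shape[0]
DS.setField('input', ia)
DS.setField('target', ta)
# one hidden layer of 200 units, softmax over the 37 outputs
fnn = buildNetwork(DS.indim, 200, 37, outclass=SoftmaxLayer)
trainer = BackpropTrainer(fnn, dataset=DS, momentum=0.1, verbose=True, weightdecay=0.01)
trainer.trainUntilConvergence(maxEpochs=10, verbose=True, validationProportion=0.20)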
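Answer:
Your problem is the values you are training on. A softmax layer means that all of the values in that layer sum to 1. So when you set 37 output dimensions, all 37 dimensions together will sum to 1.0. Your sample target does not appear to follow that distribution.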
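To make the mismatch concrete, here is a minimal sketch in plain Python (no PyBrain needed) that sums the sample target from the question and compares it with what a softmax can produce. The `softmax` helper and the `activations` list are illustrative assumptions, not part of the original code; only the `target` values come from the post.

```python
import math

# Sample target copied from the question (37 values).
target = [0.738832, 0.238159, 0.023009, 0, 0.238159, 0, 0.238159, 0,
          0.238159, 0, 0, 0.238159, 0, 0.19793, 0.80207, 0.066806667,
          0.663691308, 0.008334764, 0, 0, 0.0494825, 0.098965, 0.0494825,
          0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]


def softmax(xs):
    """Numerically stable softmax over a list of floats (illustrative helper)."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


# Arbitrary stand-in activations for a 37-unit output layer (assumption).
activations = [math.sin(i) for i in range(37)]
output = softmax(activations)

target_sum = sum(target)
output_sum = sum(output)

print("number of target values:", len(target))
print("sum of target values:   ", target_sum)   # well above 1
print("sum of softmax outputs: ", output_sum)   # 1 up to rounding
```

The sample target sums to roughly 3.89, while any softmax output sums to 1, so the network cannot reproduce such a target no matter how long it trains. If the 37 values are really several separate probability groups, an output layer that does not force the total to 1 (for example a per-output sigmoid) might be worth trying; that is a suggestion beyond the original answer, not something it states.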