I'm trying to learn the Deeplearning4j library. I'm attempting to implement a simple 3-layer neural network using the Sigmoid activation function to solve the XOR problem. What configuration or hyperparameters am I missing? I got accurate output using the ReLU activation and Softmax output from some MLP examples I found online, but with Sigmoid activation the network doesn't seem to fit accurately. Can someone explain why my network isn't producing the correct output?
DenseLayer inputLayer = new DenseLayer.Builder()
        .nIn(2)
        .nOut(3)
        .name("Input")
        .weightInit(WeightInit.ZERO)
        .build();

DenseLayer hiddenLayer = new DenseLayer.Builder()
        .nIn(3)
        .nOut(3)
        .name("Hidden")
        .activation(Activation.SIGMOID)
        .weightInit(WeightInit.ZERO)
        .build();

OutputLayer outputLayer = new OutputLayer.Builder()
        .nIn(3)
        .nOut(1)
        .name("Output")
        .activation(Activation.SIGMOID)
        .weightInit(WeightInit.ZERO)
        .lossFunction(LossFunction.MEAN_SQUARED_LOGARITHMIC_ERROR)
        .build();

NeuralNetConfiguration.Builder nncBuilder = new NeuralNetConfiguration.Builder();
nncBuilder.iterations(10000);
nncBuilder.learningRate(0.01);
nncBuilder.optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT);

NeuralNetConfiguration.ListBuilder listBuilder = nncBuilder.list();
listBuilder.layer(0, inputLayer);
listBuilder.layer(1, hiddenLayer);
listBuilder.layer(2, outputLayer);
listBuilder.backprop(true);

MultiLayerNetwork myNetwork = new MultiLayerNetwork(listBuilder.build());
myNetwork.init();

INDArray trainingInputs = Nd4j.zeros(4, inputLayer.getNIn());
INDArray trainingOutputs = Nd4j.zeros(4, outputLayer.getNOut());

// If 0,0 show 0
trainingInputs.putScalar(new int[]{0,0}, 0);
trainingInputs.putScalar(new int[]{0,1}, 0);
trainingOutputs.putScalar(new int[]{0,0}, 0);

// If 0,1 show 1
trainingInputs.putScalar(new int[]{1,0}, 0);
trainingInputs.putScalar(new int[]{1,1}, 1);
trainingOutputs.putScalar(new int[]{1,0}, 1);

// If 1,0 show 1
trainingInputs.putScalar(new int[]{2,0}, 1);
trainingInputs.putScalar(new int[]{2,1}, 0);
trainingOutputs.putScalar(new int[]{2,0}, 1);

// If 1,1 show 0
trainingInputs.putScalar(new int[]{3,0}, 1);
trainingInputs.putScalar(new int[]{3,1}, 1);
trainingOutputs.putScalar(new int[]{3,0}, 0);

DataSet myData = new DataSet(trainingInputs, trainingOutputs);
myNetwork.fit(myData);

INDArray actualInput = Nd4j.zeros(1, 2);
actualInput.putScalar(new int[]{0,0}, 0);
actualInput.putScalar(new int[]{0,1}, 0);

INDArray actualOutput = myNetwork.output(actualInput);
System.out.println("myNetwork Output " + actualOutput);
// Output is producing 1.00. Should be 0.0
Answer:
In general, I'll point you to: https://deeplearning4j.org/troubleshootingneuralnets
A few specific tips: never use zero weight initialization. There's a reason we don't use it in the examples (and I'd strongly encourage you to start from those examples rather than from scratch): https://github.com/deeplearning4j/dl4j-examples
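For instance, here is your hidden layer with a random initialization scheme instead of zeros (a minimal sketch against the same 0.x builder API your code already uses):

DenseLayer hiddenLayer = new DenseLayer.Builder()
        .nIn(3)
        .nOut(3)
        .name("Hidden")
        .activation(Activation.SIGMOID)
        // Random initialization (e.g. Xavier) instead of WeightInit.ZERO, so the
        // units don't all compute identical gradients and symmetry can be broken.
        .weightInit(WeightInit.XAVIER)
        .build();

The same change applies to the input and output layers.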
For the output layer, since you're trying to learn XOR, why not just use binary cross-entropy: https://github.com/deeplearning4j/dl4j-examples/blob/master/dl4j-examples/src/main/java/org/deeplearning4j/examples/feedforward/xor/XorExample.java
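Concretely, the binary cross-entropy loss in DL4J is LossFunction.XENT, which pairs naturally with a sigmoid output unit. A sketch of your output layer with that change (same API assumptions as above):

OutputLayer outputLayer = new OutputLayer.Builder()
        .nIn(3)
        .nOut(1)
        .name("Output")
        .activation(Activation.SIGMOID)
        .weightInit(WeightInit.XAVIER)
        // Binary cross-entropy instead of MEAN_SQUARED_LOGARITHMIC_ERROR
        .lossFunction(LossFunction.XENT)
        .build();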
Beyond that, also turn off minibatch mode (see the example above); this is covered at: https://deeplearning4j.org/toyproblems
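In your configuration that is a one-line change on the builder (sketch, assuming the same API version as your snippet):

// The whole 4-row XOR truth table is trained as a single batch, so disable
// per-minibatch gradient normalization, as the XorExample linked above does.
nncBuilder.miniBatch(false);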