Neural network for handwriting recognition?

I have been following Andrew Ng's machine learning course, and I have some questions about implementing a handwriting recognition tool.

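First, he says he uses a subset of the MNIST dataset containing 5000 training examples, where each training example is a 20×20 grayscale image. He then says we have a vector of length 400, which is the "unrolled" form of the data described above. Does this mean the training set has a format like the following?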

    Training example 1     v[1,2,...,400]
    Training example 2     v[1,2,...,400]
    ...
    Training example 5000  v[1,2,...,400]

For the coding part, the author provides the following complete Matlab code:

    %% Machine Learning Online Class - Exercise 3 | Part 2: Neural Networks

    %  Instructions
    %  ------------
    %
    %  This file contains code that helps you get started on the
    %  linear exercise. You will need to complete the following functions
    %  in this exercise:
    %
    %     lrCostFunction.m (logistic regression cost function)
    %     oneVsAll.m
    %     predictOneVsAll.m
    %     predict.m
    %
    %  For this exercise, you will not need to change any code in this file,
    %  or any other files other than those mentioned above.
    %

    %% Initialization
    clear ; close all; clc

    %% Setup the parameters you will use for this exercise
    input_layer_size  = 400;  % 20x20 Input Images of Digits
    hidden_layer_size = 25;   % 25 hidden units
    num_labels = 10;          % 10 labels, from 1 to 10
                              % (note that we have mapped "0" to label 10)

    %% =========== Part 1: Loading and Visualizing Data =============
    %  We start the exercise by first loading and visualizing the dataset.
    %  You will be working with a dataset that contains handwritten digits.
    %

    % Load Training Data
    fprintf('Loading and Visualizing Data ...\n')

    load('ex3data1.mat');
    m = size(X, 1);

    % Randomly select 100 data points to display
    sel = randperm(size(X, 1));
    sel = sel(1:100);

    displayData(X(sel, :));

    fprintf('Program paused. Press enter to continue.\n');
    pause;

    %% ================ Part 2: Loading Parameters ================
    % In this part of the exercise, we load some pre-initialized
    % neural network parameters.

    fprintf('\nLoading Saved Neural Network Parameters ...\n')

    % Load the weights into variables Theta1 and Theta2
    load('ex3weights.mat');

    %% ================= Part 3: Implement Predict =================
    %  After training the neural network, we would like to use it to predict
    %  the labels. You will now implement the "predict" function to use the
    %  neural network to predict the labels of the training set. This lets
    %  you compute the training set accuracy.

    pred = predict(Theta1, Theta2, X);

    fprintf('\nTraining Set Accuracy: %f\n', mean(double(pred == y)) * 100);

    fprintf('Program paused. Press enter to continue.\n');
    pause;

    %  To give you an idea of the network's output, you can also run
    %  through the examples one at a time to see what it is predicting.

    %  Randomly permute examples
    rp = randperm(m);

    for i = 1:m
        % Display
        fprintf('\nDisplaying Example Image\n');
        displayData(X(rp(i), :));

        pred = predict(Theta1, Theta2, X(rp(i),:));
        fprintf('\nNeural Network Prediction: %d (digit %d)\n', pred, mod(pred, 10));

        % Pause
        fprintf('Program paused. Press enter to continue.\n');
        pause;
    end

The student is supposed to complete the predict function, and this is what I did:

    function p = predict(Theta1, Theta2, X)
    %PREDICT Predict the label of an input given a trained neural network
    %   p = PREDICT(Theta1, Theta2, X) outputs the predicted label of X given the
    %   trained weights of a neural network (Theta1, Theta2)

    % Useful values
    m = size(X, 1);
    num_labels = size(Theta2, 1);

    % You need to return the following variables correctly
    p = zeros(size(X, 1), 1);

    X = [ones(m , 1) X];   % prepend the bias column: m x 401

    % ====================== YOUR CODE HERE ======================
    % Instructions: Complete the following code to make predictions using
    %               your learned neural network. You should set p to a
    %               vector containing labels between 1 to num_labels.
    %
    % Hint: The max function might come in useful. In particular, the max
    %       function can also return the index of the max element, for more
    %       information see 'help max'. If your examples are in rows, then, you
    %       can use max(A, [], 2) to obtain the max for each row.
    %

    a1 = X;                        % input activations (with bias)
    a2 = sigmoid(a1*Theta1');      % hidden-layer activations
    a2 = [ones(m , 1) a2];         % prepend the bias column for the hidden layer
    a3 = sigmoid(a2*Theta2');      % output activations, one column per label
    [M , p] = max(a3 , [] , 2);    % p is the column index of the largest output in each row

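Although it runs, I don't fully understand how it actually works (I just followed the step-by-step instructions on the author's site). I have doubts about the following points: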

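The author treats X (the input) as a 5000 x 400 array of elements; in other words, the network has 400 input neurons, 10 output neurons, and one hidden layer. Does that mean these 5000 x 400 values are the training set?
The author gives us the values of theta 1 and theta 2, which I believe are the weights used for the computations in the inner layers, but how were these values obtained? And why does he use 25 hidden-layer neurons rather than 24 or 30?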

Any help would be greatly appreciated. Thanks.

Answer:

Let's go through your question one part at a time:

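First, he says he uses a subset of the MNIST dataset containing 5000 training examples, where each training example is a 20×20 grayscale image. He then says we have a vector of length 400, which is the "unrolled" form of the data described above. Does this mean the training set has a format like the following? (…)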

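You are on the right track. Each training example is a 20×20 image. The simplest neural network model, the one introduced in the course, treats each image as a plain 1×400 vector (that conversion is exactly what "unrolled" means). The dataset is stored in a matrix because that lets the computation run faster on the efficient linear algebra libraries that Octave/Matlab use. You don't strictly have to store all the training examples as a 5000×400 matrix, but your code will run faster if you do.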
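As a rough illustration of the unrolling step, here is a minimal Octave/Matlab sketch. It uses random stand-in images rather than the course data, the names `images`, `X_demo`, and `restored` are made up for this example, and it assumes column-major unrolling (the order that `img(:)` and `reshape` use).

```matlab
% Hypothetical stand-in data: three random 20x20 "images" (not the course data).
num_examples = 3;
images = rand(20, 20, num_examples);

% Unroll each image into one 1x400 row of the data matrix.
X_demo = zeros(num_examples, 400);
for i = 1:num_examples
    img = images(:, :, i);
    X_demo(i, :) = img(:)';   % column-major unroll, then transpose to a row
end

% Rolling a row back up with reshape recovers the original image.
restored = reshape(X_demo(2, :), 20, 20);

disp(size(X_demo));                       % 3 rows (examples), 400 columns (pixels)
disp(isequal(restored, images(:, :, 2))); % 1 when the round trip is lossless
```

With 5000 examples instead of 3, the same layout gives the 5000×400 matrix described in the question.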

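The author treats X (the input) as a 5000 x 400 array of elements; in other words, the network has 400 input neurons, 10 output neurons, and one hidden layer. Does that mean these 5000 x 400 values are the training set?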

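The "input layer" is nothing more than the input image itself. You can think of it as neurons whose output values have already been computed, or as values that come from outside the network (think of your retina: it acts as the input layer of your visual system). So this network has 400 input units (the "unrolled" 20×20 image). Of course, your training set is more than a single image, so you stack all 5000 images into a 5000×400 matrix to form the training set.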
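To make the matrix shapes concrete, the forward pass from the question's predict function can be traced on stand-in data. This is only a sketch: the inputs and weights are random, the names ending in `_demo` are hypothetical, `sigmoid` is defined inline so the snippet runs on its own, and the weight shapes (25×401 and 10×26) are inferred from the layer sizes in the posted script plus one bias column per layer.

```matlab
sigmoid = @(z) 1 ./ (1 + exp(-z));   % inline stand-in for the course's sigmoid function

m = 5;                          % small stand-in for the 5000 examples
X_demo = rand(m, 400);          % one unrolled 20x20 image per row
Theta1_demo = randn(25, 401);   % hidden layer: 25 units, 400 inputs + 1 bias (assumed shape)
Theta2_demo = randn(10, 26);    % output layer: 10 labels, 25 hidden units + 1 bias (assumed shape)

a1 = [ones(m, 1) X_demo];                        % m x 401, input layer plus bias
a2 = [ones(m, 1) sigmoid(a1 * Theta1_demo')];    % m x 26, hidden layer plus bias
a3 = sigmoid(a2 * Theta2_demo');                 % m x 10, one score per label
[~, p_demo] = max(a3, [], 2);                    % m x 1, index of the largest score per row

% A single example is just a one-row matrix and goes through the same steps.
a2_one = [1 sigmoid(a1(1, :) * Theta1_demo')];
a3_one = sigmoid(a2_one * Theta2_demo');
[~, p_one] = max(a3_one, [], 2);
disp([p_one p_demo(1)]);   % the two predictions agree
```

Each row of the matrix is processed independently, which is why the same function works for the full training set and for one image at a time.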

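The author gives us the values of theta 1 and theta 2, which I believe are the weights used for the computations in the inner layers, but how were these values obtained?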

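These theta values are found with an algorithm called backpropagation. If you haven't implemented it in the course yet, be patient. It will probably show up in the exercises soon! And yes, by the way, they are the weights.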
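For a feel of what that involves, below is a minimal sketch of backpropagation with plain gradient descent on a toy network. It is not the course's code and not how the provided weights were necessarily produced: the toy data, the layer sizes, the learning rate `alpha`, and the unregularized cross-entropy cost are all assumptions made for illustration.

```matlab
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Hypothetical toy problem (not the course data): 6 examples, 4 inputs,
% 3 hidden units, 2 labels. Values are deterministic so runs are repeatable.
m = 6;
X_toy = reshape(sin(1:24), m, 4);
y_toy = [1; 2; 1; 2; 1; 2];
I = eye(2);
Y = I(y_toy, :);                          % one-hot labels, m x 2
Theta1 = 0.5 * reshape(cos(1:15), 3, 5);  % 3 x (4 + 1)
Theta2 = 0.5 * reshape(sin(1:8), 2, 4);   % 2 x (3 + 1)

a1 = [ones(m, 1) X_toy];                  % m x 5, inputs plus bias
% Unregularized cross-entropy cost as a function of the weights.
hidden = @(T1) [ones(m, 1) sigmoid(a1 * T1')];
cost = @(T1, T2) -sum(sum(Y .* log(sigmoid(hidden(T1) * T2')) + ...
    (1 - Y) .* log(1 - sigmoid(hidden(T1) * T2')))) / m;

cost_before = cost(Theta1, Theta2);
alpha = 0.1;                              % learning rate (arbitrary choice)
for iter = 1:200
    % Forward pass
    a2 = hidden(Theta1);                  % m x 4
    a3 = sigmoid(a2 * Theta2');           % m x 2
    % Backward pass: push the output error back through the layers
    d3 = a3 - Y;                                                         % m x 2
    d2 = (d3 * Theta2(:, 2:end)) .* a2(:, 2:end) .* (1 - a2(:, 2:end));  % m x 3
    Theta2_grad = d3' * a2 / m;           % 2 x 4
    Theta1_grad = d2' * a1 / m;           % 3 x 5
    if iter == 1
        % Sanity check on one weight: compare with a central difference
        e = zeros(size(Theta1));
        e(2, 3) = 1e-4;
        num_grad = (cost(Theta1 + e, Theta2) - cost(Theta1 - e, Theta2)) / 2e-4;
        bp_grad = Theta1_grad(2, 3);
    end
    % Gradient descent step
    Theta1 = Theta1 - alpha * Theta1_grad;
    Theta2 = Theta2 - alpha * Theta2_grad;
end
cost_after = cost(Theta1, Theta2);
fprintf('cost %.4f -> %.4f, backprop grad %.6f vs numeric %.6f\n', ...
    cost_before, cost_after, bp_grad, num_grad);
```

The backward pass yields the gradient of the cost with respect to every weight, and the optimizer then nudges the weights downhill. A real exercise would typically add regularization and random initialization, and may use a more sophisticated optimizer than this fixed-step loop.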

Why does he use 25 hidden-layer neurons rather than 24 or 30?

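He probably picked an arbitrary value that neither runs too slowly nor performs too badly. You may well find a better value for this hyperparameter, but if you increase it too much, training can take considerably longer. Also, because you are using only a small fraction of the full training set (the original MNIST has 60000 training examples and 28×28 images), you need to use "fewer" hidden units to prevent overfitting. With too many units, your network will "memorize" the training examples and fail to generalize to new, unseen data. Choosing hyperparameters such as the number of hidden units is an art you will pick up with experience (and possibly with Bayesian optimization and more advanced methods, but that's another story xD).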
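One way to get a rough sense of the overfitting concern is to count the weights for a few hidden-layer sizes and compare that with the 5000 training examples. The sketch below assumes the 400-input, 10-output architecture from the question with one bias unit per layer; the list of sizes is arbitrary, and parameter count is only a crude proxy for how easily a network can memorize its training data.

```matlab
hidden_sizes = [24 25 30 100];   % arbitrary candidates for comparison

% Weights in Theta1 ((400 + 1) per hidden unit) plus weights in Theta2
% ((hidden + 1) per output label).
num_params = (400 + 1) * hidden_sizes + (hidden_sizes + 1) * 10;

for k = 1:numel(hidden_sizes)
    fprintf('%3d hidden units -> %5d weights\n', hidden_sizes(k), num_params(k));
end
% Prints 9874, 10285, 12340 and 41110 weights respectively.
```

Going from 24 to 25 to 30 units changes the count only modestly, which fits the answer's point that 25 is a reasonable but not special choice, while a much larger hidden layer multiplies the number of weights that 5000 examples have to constrain.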
