I have a simple model in PyTorch.
model = Network()
Its details are as follows:
Network(
  (hidden): Linear(in_features=784, out_features=256, bias=True)
  (output): Linear(in_features=256, out_features=10, bias=True)
  (sigmoid): Sigmoid()
  (softmax): Softmax(dim=1)
)
There are 3 layers of neurons in total: an input layer (784 neurons), a hidden layer (256 neurons), and an output layer (10 neurons). So there will be two layers of weights, and accordingly there should be two biases for those two weight layers (simply put, two floating-point numbers), right? (Please correct me if I'm wrong.)
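For context, here is a plausible reconstruction of the Network class matching the printout above; the original class definition is not shown in the question, so the wiring of the forward pass is an assumption:

import torch.nn as nn

# Hypothetical reconstruction of Network, inferred from the repr above
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(784, 256)   # input (784) -> hidden (256)
        self.output = nn.Linear(256, 10)    # hidden (256) -> output (10)
        self.sigmoid = nn.Sigmoid()
        self.softmax = nn.Softmax(dim=1)

    def forward(self, x):                   # assumed order of the activations
        x = self.sigmoid(self.hidden(x))
        return self.softmax(self.output(x))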
Now, after initializing my network, I was curious about these two bias values. I wanted to check the bias of the hidden layer, so I wrote:
model.hidden.bias
But the result was not what I expected! I was expecting a single value, but what I actually got was:
tensor([-1.6868e-02, -3.5661e-02, 1.2489e-02, -2.7880e-02, 1.4025e-02, -2.6085e-02, 1.2625e-02, -3.1748e-02, 5.0335e-03, 3.8031e-03, -3.1648e-02, -3.4881e-02, -2.0026e-02, 1.9728e-02, 6.2461e-03, 9.3936e-04, -5.9270e-03, -2.7183e-02, -1.9850e-02, -3.5693e-02, -1.9393e-02, 2.6555e-02, 2.3482e-02, 2.1230e-02, -2.2175e-02, -2.4386e-02, 3.4848e-02, -2.6044e-02, 1.3575e-02, 9.4125e-03, 3.0012e-02, -2.6078e-02, 7.1615e-05, -1.7061e-02, 6.6355e-03, -3.4966e-02, 2.9311e-02, 1.4060e-02, -2.5763e-02, -1.4020e-02, 2.9852e-02, -7.9176e-03, -1.8396e-02, 1.6927e-02, -1.1001e-03, 1.5595e-02, 1.2169e-02, -1.2275e-02, -2.9270e-03, -6.5685e-04, -2.4297e-02, 3.0048e-02, 2.9692e-03, -2.5398e-02, 2.9955e-03, -9.3653e-04, -1.2932e-02, 2.4232e-02, -3.5182e-02, -1.6163e-02, 3.0025e-02, 3.1227e-02, -8.2498e-04, 2.7102e-02, -2.3830e-02, -3.4958e-02, -1.1886e-02, 1.6097e-02, 1.4579e-02, -2.6744e-02, 1.1900e-02, -3.4855e-02, -4.2208e-03, -5.2035e-03, 1.7055e-02, -4.8580e-03, 3.4088e-03, 1.6923e-02, 3.5570e-04, -3.0478e-02, 8.4647e-03, 2.5704e-02, -2.3255e-02, 6.9396e-03, -1.2521e-03, -9.4101e-03, -2.5798e-02, -1.4438e-03, -7.2684e-03, 3.5417e-02, -3.4388e-02, 1.3706e-02, -5.1430e-03, 1.6174e-02, 1.8135e-03, -2.9018e-02, -2.9083e-02, 7.4100e-03, -2.7758e-02, 2.4367e-02, -3.8350e-03, 9.4390e-03, -1.0844e-02, 1.6381e-02, -2.5268e-02, 1.3553e-02, -1.0545e-02, -1.3782e-02, 2.8519e-02, 2.3630e-02, -1.9703e-02, -2.0147e-02, -1.0485e-02, 2.4637e-02, 1.9989e-02, 5.6601e-03, 1.9121e-02, -1.5286e-02, 2.5996e-02, -2.9833e-02, -2.9458e-02, 2.3944e-02, -3.0107e-02, -1.2307e-02, -1.8419e-02, 3.3551e-02, 1.2396e-02, 2.9356e-02, 3.3274e-02, 5.4677e-03, 3.1715e-02, 1.3361e-02, 3.3042e-02, 2.7843e-03, 2.2837e-02, -3.4981e-02, 3.2355e-02, -2.7658e-03, 2.2184e-02, -2.0203e-02, -3.3264e-02, -3.4858e-02, 1.0820e-03, -1.4279e-02, -2.8041e-02, 4.1962e-03, 2.4266e-02, -3.5704e-02, -2.6172e-02, 2.3335e-02, 2.0657e-02, -3.0387e-03, -5.7096e-03, -1.1062e-02, 1.3450e-02, -3.3965e-02, 1.9623e-03, -2.0067e-02, -3.3858e-02, -2.1931e-02, -1.5414e-02, 2.4454e-02, 2.5668e-02, -1.1932e-02, 5.7540e-04, 1.5130e-02, 1.3916e-02, -2.1521e-02, -3.0575e-02, 1.8841e-02, -2.3240e-02, -2.7297e-02, -3.2668e-02, -1.5544e-02, -5.9408e-03, 3.0241e-02, 2.2039e-02, -2.4389e-02, 3.1703e-02, 3.5305e-02, -2.7501e-03, 2.0154e-02, -5.3489e-03, 1.4177e-02, 1.6829e-02, 3.3066e-02, -1.3425e-02, -3.2565e-02, 6.5624e-03, -1.5681e-02, 2.3047e-02, 6.5880e-03, -3.3803e-02, 2.3790e-02, -5.5061e-03, 2.9413e-02, 1.2290e-02, -1.0958e-02, 1.2680e-03, 1.3343e-02, 6.6689e-03, -2.2975e-03, -1.2068e-02, 1.6523e-02, -3.1612e-02, -1.7529e-02, -2.2220e-02, -1.4723e-02, -1.3495e-02, -5.1805e-03, -2.9620e-02, 3.0571e-02, -3.0999e-02, 3.3681e-03, 1.3579e-02, 1.4837e-02, 1.5694e-02, -1.1178e-02, 4.6233e-03, -2.2583e-02, -3.5281e-03, 3.0918e-02, 2.6407e-02, 1.5822e-04, -3.0181e-03, 8.6989e-03, 2.8998e-02, -1.5975e-02, -3.1574e-02, -1.5609e-02, 1.0472e-02, 5.8976e-03, 7.0131e-03, -3.2047e-02, 2.6045e-02, -2.8882e-02, -2.2121e-02, -3.2960e-02, 1.8268e-02, 3.0984e-02, 1.4824e-02, 3.0010e-02, -5.7523e-03, -2.0017e-02, 4.8700e-03, 1.4997e-02, -1.4898e-02, 6.8572e-03, 9.7713e-03, 1.3410e-02, 4.9619e-03, 3.1016e-02, 3.1240e-02, -3.0203e-02, 2.1435e-02, 2.7331e-02], requires_grad=True)
Can someone explain this behavior? Why did I get 256 values instead of one?
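For reference, the shapes can be checked directly (a minimal sketch, assuming the model above):

print(model.hidden.bias.shape)   # torch.Size([256]) -- one value per hidden neuron
print(model.output.bias.shape)   # torch.Size([10])  -- one value per output neuron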
Edit 1:
Here is my understanding of a layer: the bias is a single value for the entire layer of neurons. Am I right? But the output I see contains about 256 values. Why? Does PyTorch assume I want one bias per neuron? Is that okay?
Answer:
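Your understanding is not quite right: in PyTorch (and in the standard neural-network formulation), each neuron in a layer has its own bias, not one shared bias per layer. nn.Linear(in_features, out_features) stores a weight matrix of shape (out_features, in_features) and a bias vector of shape (out_features,), and computes y = x·Wᵀ + b. Your hidden layer therefore has 256 biases (one per hidden neuron) and your output layer has 10, which is exactly why model.hidden.bias is a tensor of 256 values. A minimal sketch verifying this against a fresh layer:

import torch
import torch.nn as nn

layer = nn.Linear(784, 256)          # same shape as model.hidden
print(layer.weight.shape)            # torch.Size([256, 784]) -- one weight row per output neuron
print(layer.bias.shape)              # torch.Size([256])      -- one bias per output neuron

x = torch.randn(1, 784)              # dummy input batch of size 1
y = x @ layer.weight.T + layer.bias  # what the layer computes: y = x W^T + b
print(torch.allclose(y, layer(x)))   # True

If you really wanted a single shared bias for the whole layer, you would have to build that yourself, e.g. nn.Linear(784, 256, bias=False) plus a scalar nn.Parameter added to the output, but per-neuron biases are the standard (and more expressive) choice.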