How does batch normalization work in an example?

I am trying to understand batch normalization. My simple example:

    import numpy as np
    import tensorflow as tf

    layer1 = tf.keras.layers.BatchNormalization(scale=False, center=False)
    x = np.array([[3., 4.]])
    out = layer1(x)
    print(out)

prints

tf.Tensor([[2.99850112 3.9980015 ]], shape=(1, 2), dtype=float64)

I tried to reproduce it with NumPy:

    e = 0.001
    m = np.sum(x) / 2
    b = np.sum((x - m)**2) / 2
    x_ = (x - m) / np.sqrt(b + e)
    print(x_)

It prints

[[-0.99800598  0.99800598]]

What am I doing wrong?

Answer:

There are two problems here.

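First, batch normalization has two "modes": training mode, which normalizes using the statistics of the current batch, and inference mode, which normalizes using "population statistics" collected from batches during training. By default, Keras layers/models run in inference mode when called directly, and you need to pass training=True in the call to change this (there are other ways, but this is the simplest).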
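To see where the values in the question come from, the inference-mode computation can be sketched in NumPy. This assumes a freshly created layer with the Keras defaults (moving mean 0, moving variance 1, epsilon 0.001); the variable names are illustrative only.

```python
import numpy as np

# Assumed Keras defaults for a new BatchNormalization layer:
# moving mean = 0, moving variance = 1, epsilon = 1e-3.
eps = 1e-3
moving_mean = 0.0
moving_var = 1.0

x = np.array([[3.0, 4.0]])

# Inference mode ignores the batch and uses the stored statistics.
inference_out = (x - moving_mean) / np.sqrt(moving_var + eps)
print(inference_out)
```

Under those assumptions the result agrees with the roughly 2.9985 and 3.998 printed in the question: the input is only divided by sqrt(1.001). With training=True, as in the snippet that follows, the layer uses the statistics of the current batch instead.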

    layer1 = tf.keras.layers.BatchNormalization(scale=False, center=False)
    x = np.array([[3., 4.]], dtype=np.float32)
    out = layer1(x, training=True)
    print(out)

This prints tf.Tensor([[0. 0.]], shape=(1, 2), dtype=float32). Still not right!

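Second, batch normalization normalizes over the batch axis, separately for each feature. However, the way you specified your input (as a 1x2 array) makes it a single input (batch size 1) with two features. Batch normalization then simply normalizes each feature to mean 0 (with a single sample the batch variance is 0, so there is no meaningful standard deviation to scale by). Instead, you want two inputs with one feature each: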

    layer1 = tf.keras.layers.BatchNormalization(scale=False, center=False)
    x = np.array([[3.], [4.]], dtype=np.float32)
    out = layer1(x, training=True)
    print(out)

This prints

tf.Tensor([[-0.99800634] [ 0.99800587]], shape=(2, 1), dtype=float32)
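The difference between the two input shapes can be reproduced by hand. The following is a minimal NumPy sketch of training-mode normalization, assuming epsilon 0.001 and no scale or offset; `batch_norm_train` is a hypothetical helper, not part of Keras.

```python
import numpy as np

def batch_norm_train(x, eps=1e-3):
    """Hypothetical helper: normalize each feature (column) using
    statistics taken over the batch axis (rows)."""
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)  # population variance (divides by N)
    return (x - mean) / np.sqrt(var + eps)

one_sample = np.array([[3.0, 4.0]])     # batch size 1, two features
two_samples = np.array([[3.0], [4.0]])  # batch size 2, one feature

# Each feature equals its own batch mean, so everything collapses to 0.
print(batch_norm_train(one_sample))

# Mean 3.5 and variance 0.25 give roughly -0.998 and +0.998.
print(batch_norm_train(two_samples))
```

The second result agrees with the manual calculation in the question, which implicitly treated 3 and 4 as two samples of one feature.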

Alternatively, specify the "feature axis":

    layer1 = tf.keras.layers.BatchNormalization(axis=0, scale=False, center=False)
    x = np.array([[3., 4.]], dtype=np.float32)
    out = layer1(x, training=True)
    print(out)

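Note that the input shape is "wrong", but we told batch normalization that axis 0 is the feature axis (it defaults to -1, the last axis). This also gives the desired result: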

tf.Tensor([[-0.99800634  0.99800587]], shape=(1, 2), dtype=float32)
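One way to picture the axis argument is that statistics are computed over every axis except the feature axis. The NumPy sketch below illustrates that idea under the same assumptions as before (epsilon 0.001, no scale or offset); `batch_norm_axis` and its `feature_axis` parameter are hypothetical names, not the Keras implementation.

```python
import numpy as np

def batch_norm_axis(x, feature_axis=-1, eps=1e-3):
    """Hypothetical helper: reduce over all axes except feature_axis."""
    feature_axis = feature_axis % x.ndim  # turn -1 into a positive index
    reduce_axes = tuple(a for a in range(x.ndim) if a != feature_axis)
    mean = x.mean(axis=reduce_axes, keepdims=True)
    var = x.var(axis=reduce_axes, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

x = np.array([[3.0, 4.0]])

# Feature axis 0: one feature, statistics taken over the two values in the row.
print(batch_norm_axis(x, feature_axis=0))

# Feature axis -1 (the default): two features with one sample each, all zeros.
print(batch_norm_axis(x, feature_axis=-1))
```

With feature axis 0 the 1x2 input is treated as one feature observed twice, which is why the output shape stays (1, 2) while the values match the two-sample case.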
