I'm still fairly new to TensorFlow, and these two code snippets have me confused.
Code A:
    self.h1_layer = tf.layers.dense(self.x, self.n_nodes_hl1, activation=tf.nn.relu, name="h1")
    self.h2_layer = tf.layers.dense(self.h1_layer, self.n_nodes_hl2, activation=tf.nn.relu, name="h2")
    self.h3_layer = tf.layers.dense(self.h2_layer, self.n_nodes_hl3, activation=tf.nn.relu, name="h3")
    self.logits = tf.layers.dense(self.h3_layer, self.num_of_classes, name="output")
Code B:
    self.hidden_1_layer = {'weights': tf.Variable(tf.random_normal([self.num_of_words, self.h1])),
                           'biases': tf.Variable(tf.random_normal([self.h1]))}
    self.hidden_2_layer = {'weights': tf.Variable(tf.random_normal([self.h1, self.h2])),
                           'biases': tf.Variable(tf.random_normal([self.h2]))}
    self.hidden_3_layer = {'weights': tf.Variable(tf.random_normal([self.h2, self.h3])),
                           'biases': tf.Variable(tf.random_normal([self.h3]))}
    self.final_output_layer = {'weights': tf.Variable(tf.random_normal([self.h3, self.num_of_classes])),
                               'biases': tf.Variable(tf.random_normal([self.num_of_classes]))}

    layer1 = tf.add(tf.matmul(data, self.hidden_1_layer['weights']), self.hidden_1_layer['biases'])
    layer1 = tf.nn.relu(layer1)
    layer2 = tf.add(tf.matmul(layer1, self.hidden_2_layer['weights']), self.hidden_2_layer['biases'])
    layer2 = tf.nn.relu(layer2)
    layer3 = tf.add(tf.matmul(layer2, self.hidden_3_layer['weights']), self.hidden_3_layer['biases'])
    layer3 = tf.nn.relu(layer3)
    output = tf.matmul(layer3, self.final_output_layer['weights']) + self.final_output_layer['biases']
Are they the same thing? Can the weights and biases in both Code A and Code B be saved with tf.train.Saver()?
Thanks
EDIT: I ran into problems when using Code A to generate predictions. The logits from Code A seem to change on every run.
Full code:
    import tensorflow as tf
    import os
    from utils import Utils as utils

    os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

    class Neural_Network:
        # Neural Network Setup
        num_of_epoch = 50

        n_nodes_hl1 = 500
        n_nodes_hl2 = 500
        n_nodes_hl3 = 500

        def __init__(self):
            self.num_of_classes = utils.get_num_of_classes()
            self.num_of_words = utils.get_num_of_words()

            # placeholders
            self.x = tf.placeholder(tf.float32, [None, self.num_of_words])
            self.y = tf.placeholder(tf.int32, [None, self.num_of_classes])

            with tf.name_scope("model"):
                self.h1_layer = tf.layers.dense(self.x, self.n_nodes_hl1, activation=tf.nn.relu, name="h1")
                self.h2_layer = tf.layers.dense(self.h1_layer, self.n_nodes_hl2, activation=tf.nn.relu, name="h2")
                self.h3_layer = tf.layers.dense(self.h2_layer, self.n_nodes_hl3, activation=tf.nn.relu, name="h3")
                self.logits = tf.layers.dense(self.h3_layer, self.num_of_classes, name="output")

        def predict(self):
            return self.logits

        def make_prediction(self, query):
            result = None
            with tf.Session() as sess:
                sess.run(tf.global_variables_initializer())
                saver = tf.train.import_meta_graph('saved_models/testing.meta')
                saver.restore(sess, 'saved_models/testing')

                # for variable in tf.trainable_variables():
                #     print(sess.run(variable))

                prediction = self.predict()
                pre, prediction = sess.run([self.logits, prediction], feed_dict={self.x: query})
                print(pre)
                prediction = prediction.tolist()
                prediction = tf.nn.softmax(prediction)
                prediction = sess.run(prediction)
                print(prediction)
                return utils.get_label_from_encoding(prediction[0])

        def train(self, data):
            prediction = self.predict()
            cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=self.y))
            optimizer = tf.train.AdamOptimizer().minimize(cost)

            with tf.Session() as sess:
                sess.run(tf.global_variables_initializer())

                writer = tf.summary.FileWriter("mygraph/logs", tf.get_default_graph())

                for epoch in range(self.num_of_epoch):
                    optimised, loss = sess.run([optimizer, cost],
                                               feed_dict={self.x: data['values'], self.y: data['labels']})

                    if epoch % 1 == 0:
                        print("Completed Training Cycle: " + str(epoch) + " out of " + str(self.num_of_epoch))
                        print("Current Loss: " + str(loss))

                        saver = tf.train.Saver()
                        saver.save(sess, 'saved_models/testing')
                        print("Model saved")
Answer:
TL;DR: The two operations are essentially the same, but they differ in how the variables are created and initialized.
If you trace the code from here, you will eventually reach the point where tf.get_variable is called to initialize the variables. In your example above, since kernel_initializer and bias_initializer are not set, they default to None and tf.zeros_initializer() respectively (see the Dense API). When None is passed to tf.get_variable as the initializer, glorot_uniform_initializer will be used:
If initializer is None (the default), the default initializer passed in the variable scope will be used. If that one is None too, a glorot_uniform_initializer will be used. The initializer can also be a Tensor, in which case the variable is initialized to this value and shape.
More on tf.get_variable can be found here.
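As a quick illustration of that fallback (a minimal sketch; the variable names are made up):

    import tensorflow as tf

    # No initializer is passed, so tf.get_variable falls back to
    # glorot_uniform_initializer, as described in the quote above.
    v = tf.get_variable("v", shape=[3, 4])

    # Passing an explicit initializer overrides that default.
    w = tf.get_variable("w", shape=[3, 4], initializer=tf.random_normal_initializer())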
So: in one case you used a tf.random_normal initializer for both the kernel weights and the bias weights, while in the other you used tf.layers.dense with no initializer arguments, which results in glorot_uniform_initializer for the kernel weights and zeros_initializer for the bias weights.
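If you want Code A to initialize the same way as Code B, you can pass the initializers explicitly. A minimal sketch, showing only the first layer and reusing the names from your code:

    self.h1_layer = tf.layers.dense(
        self.x, self.n_nodes_hl1,
        activation=tf.nn.relu,
        # match Code B: draw both kernel and bias from a standard normal
        kernel_initializer=tf.random_normal_initializer(),
        bias_initializer=tf.random_normal_initializer(),
        name="h1")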
As for your second question, whether they can be saved with tf.train.Saver(): yes, both can.
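In both cases the layers end up as ordinary variables in the graph, so a plain tf.train.Saver() picks them up. A minimal sketch (the checkpoint path is illustrative):

    saver = tf.train.Saver()  # by default collects every variable in the graph,
                              # whether created via tf.layers.dense or tf.Variable
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        saver.save(sess, 'saved_models/testing')     # write the checkpoint
        saver.restore(sess, 'saved_models/testing')  # read it back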
One final note: be careful when using tf.Variable, as it can complicate things if scopes are not set up correctly.
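For example, here is a sketch of the pitfall (the function and variable names are made up): tf.Variable always creates a fresh variable, while tf.get_variable cooperates with variable scopes and can reuse an existing one.

    def dense_block():
        with tf.variable_scope("block", reuse=tf.AUTO_REUSE):
            # reused on every call -> a single variable named "block/w"
            return tf.get_variable("w", shape=[10, 5])

    def dense_block_raw():
        # a brand-new variable on every call -> "w", "w_1", "w_2", ...
        return tf.Variable(tf.random_normal([10, 5]), name="w")

The silent renaming in the second case is exactly what makes saving and restoring by name fragile if the scopes are not managed carefully.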