I think I have a decent grasp of the basics of dropout and of how to implement it with the TensorFlow API. However, the normalization tied to the dropout probability in tf.nn.dropout does not seem to be part of DropConnect. Is that correct? If so, does the normalization do any "harm", or can I simply apply tf.nn.dropout to my weights to implement DropConnect?
Answer
Yes, you can use tf.nn.dropout to implement DropConnect: simply wrap your weight matrix in tf.nn.dropout instead of the result of your matrix multiplication. You can then "undo" the rescaling that tf.nn.dropout applies to the kept weights by multiplying by keep_prob, like this:
dropConnect = tf.nn.dropout( m1, keep_prob ) * keep_prob
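For intuition on why that extra multiplication is needed: tf.nn.dropout scales every surviving element by 1/keep_prob, while DropConnect only zeroes weights out. A minimal sketch illustrating the cancellation (toy values, variable names are my own):

import tensorflow as tf

w = tf.ones([4, 4])                       # toy weight matrix
keep_prob = 0.5

dropped   = tf.nn.dropout( w, keep_prob ) # surviving entries become 1/keep_prob = 2.0
connected = dropped * keep_prob           # rescale: surviving entries are 1.0 again

with tf.Session() as sess:
    print( sess.run(dropped) )    # mix of 0.0 and 2.0
    print( sess.run(connected) )  # mix of 0.0 and 1.0, i.e. a pure binary mask on w

So the net effect of tf.nn.dropout( m1, keep_prob ) * keep_prob is to zero out each weight with probability 1 - keep_prob and leave the rest untouched, which is exactly the DropConnect masking.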
Code example
Here is a code example that computes the XOR function using DropConnect. I have also left in, commented out, the code that performs regular dropout, so you can swap it in and compare the outputs.
### imports
import tensorflow as tf

### constant data
x  = [[0.,0.],[1.,1.],[1.,0.],[0.,1.]]
y_ = [[1.,0.],[1.,0.],[0.,1.],[0.,1.]]

### induction
# Layer 0 = the x2 inputs
x0 = tf.constant( x  , dtype=tf.float32 )
y0 = tf.constant( y_ , dtype=tf.float32 )

keep_prob = tf.placeholder( dtype=tf.float32 )

# Layer 1 = the 2x12 hidden sigmoid
m1 = tf.Variable( tf.random_uniform( [2,12] , minval=0.1 , maxval=0.9 , dtype=tf.float32 ))
b1 = tf.Variable( tf.random_uniform( [12]   , minval=0.1 , maxval=0.9 , dtype=tf.float32 ))

########## DROP CONNECT
# - use this to perform the "DropConnect" flavor of dropout
dropConnect = tf.nn.dropout( m1, keep_prob ) * keep_prob
h1 = tf.sigmoid( tf.matmul( x0, dropConnect ) + b1 )

########## DROP OUT
# - uncomment this to use "regular" dropout instead
#h1 = tf.nn.dropout( tf.sigmoid( tf.matmul( x0, m1 ) + b1 ) , keep_prob )

# Layer 2 = the 12x2 softmax output
m2 = tf.Variable( tf.random_uniform( [12,2] , minval=0.1 , maxval=0.9 , dtype=tf.float32 ))
b2 = tf.Variable( tf.random_uniform( [2]    , minval=0.1 , maxval=0.9 , dtype=tf.float32 ))
y_out = tf.nn.softmax( tf.matmul( h1, m2 ) + b2 )

# loss : sum of the squares of y0 - y_out
loss = tf.reduce_sum( tf.square( y0 - y_out ) )

# training step : learning rate of 1e-2 discovered through experimentation
train = tf.train.AdamOptimizer(1e-2).minimize(loss)

### training
# run 5000 times using all the X and Y
# print out the loss and any other interesting info
with tf.Session() as sess:
    sess.run( tf.global_variables_initializer() )
    print( "\nloss" )
    for step in range(5000):
        sess.run( train, feed_dict={keep_prob: 0.5} )
        if (step + 1) % 100 == 0:
            print( sess.run( loss, feed_dict={keep_prob: 1.} ) )

    results = sess.run( [m1,b1,m2,b2,y_out,loss], feed_dict={keep_prob: 1.} )
    labels  = "m1,b1,m2,b2,y_out,loss".split(",")
    for label, result in zip(labels, results):
        print( "" )
        print( label )
        print( result )

print( "" )
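Note that the example above targets the TF 1.x graph API (tf.placeholder, tf.Session). On TensorFlow 2.x the same masking trick can be written eagerly; here is a minimal sketch under that assumption (the drop_connect helper is my own, not a library function):

import tensorflow as tf  # assumes TF 2.x

def drop_connect(w, rate):
    # zero out each weight independently with probability `rate`,
    # WITHOUT the 1/(1-rate) rescaling that tf.nn.dropout applies
    mask = tf.cast(tf.random.uniform(tf.shape(w)) >= rate, w.dtype)
    return w * mask

x0 = tf.constant([[0., 0.], [1., 1.], [1., 0.], [0., 1.]])
m1 = tf.Variable(tf.random.uniform([2, 12], minval=0.1, maxval=0.9))
b1 = tf.Variable(tf.random.uniform([12], minval=0.1, maxval=0.9))
h1 = tf.sigmoid(tf.matmul(x0, drop_connect(m1, rate=0.5)) + b1)

Equivalently, tf.nn.dropout(m1, rate=rate) * (1 - rate) gives the same result in TF 2.x, mirroring the keep_prob trick from the answer.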
Output
Both approaches correctly separate the inputs into the correct outputs:
y_out
[[  7.05891490e-01   2.94108540e-01]
 [  9.99605477e-01   3.94574134e-04]
 [  4.99370173e-02   9.50062990e-01]
 [  4.39682379e-02   9.56031740e-01]]
Here you can see that the DropConnect output correctly classifies Y as true, true, false, false.
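To read those probabilities off programmatically rather than by eye, one could take the argmax of each row (a small illustrative snippet, assuming the y_out values printed above):

import numpy as np

y_out = np.array([[7.05891490e-01, 2.94108540e-01],
                  [9.99605477e-01, 3.94574134e-04],
                  [4.99370173e-02, 9.50062990e-01],
                  [4.39682379e-02, 9.56031740e-01]])
print(np.argmax(y_out, axis=1))   # -> [0 0 1 1], i.e. true, true, false, false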