In the context of designing a convolutional neural network for extracting DNA motifs, why stack convolutional layers without max-pooling between them?

Here is the context in which this architecture appears:
    self.model = Sequential()
    assert len(num_filters) == len(conv_width)
    for i, (nb_filter, nb_col) in enumerate(zip(num_filters, conv_width)):
        conv_height = 4 if i == 0 else 1
        self.model.add(Convolution2D(
            nb_filter=nb_filter, nb_row=conv_height, nb_col=nb_col,
            activation='linear', init='he_normal',
            input_shape=self.input_shape,
            W_regularizer=l1(L1), b_regularizer=l1(L1)))
        self.model.add(Activation('relu'))
        self.model.add(Dropout(dropout))
    self.model.add(MaxPooling2D(pool_size=(1, pool_width)))
Answer:

The code provided does in fact apply an activation function between the convolutions:
    self.model = Sequential()
    assert len(num_filters) == len(conv_width)
    for i, (nb_filter, nb_col) in enumerate(zip(num_filters, conv_width)):
        conv_height = 4 if i == 0 else 1
        self.model.add(Convolution2D(
            nb_filter=nb_filter, nb_row=conv_height, nb_col=nb_col,
            activation='linear', init='he_normal',
            input_shape=self.input_shape,
            W_regularizer=l1(L1), b_regularizer=l1(L1)))
        self.model.add(Activation('relu'))  # <--------------------- activation function
        self.model.add(Dropout(dropout))
    self.model.add(MaxPooling2D(pool_size=(1, pool_width)))
The resulting model structure is roughly:
conv -- relu -- dropout -- conv -- relu -- dropout -- ... -- max pool
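For reference, here is a minimal runnable sketch of that same stack in the modern Keras API (Conv2D, kernel_initializer, and kernel_regularizer replace the Keras 1.x Convolution2D, init, and W_regularizer names used above); all hyperparameter values below are hypothetical stand-ins, not the original's:

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Conv2D, Activation, Dropout, MaxPooling2D
    from tensorflow.keras.regularizers import l1

    # Hypothetical stand-ins for the original's num_filters, conv_width,
    # dropout, pool_width, and L1 hyperparameters.
    num_filters = [16, 16]
    conv_width = [15, 15]
    dropout = 0.1
    pool_width = 35
    l1_decay = 1e-4
    input_shape = (4, 500, 1)  # one-hot DNA: 4 bases x sequence length x 1 channel

    model = Sequential()
    for i, (n_filter, width) in enumerate(zip(num_filters, conv_width)):
        conv_height = 4 if i == 0 else 1  # first layer spans all 4 bases
        kwargs = {'input_shape': input_shape} if i == 0 else {}
        model.add(Conv2D(n_filter, (conv_height, width),
                         activation='linear',
                         kernel_initializer='he_normal',
                         kernel_regularizer=l1(l1_decay),
                         bias_regularizer=l1(l1_decay),
                         **kwargs))
        model.add(Activation('relu'))
        model.add(Dropout(dropout))
    model.add(MaxPooling2D(pool_size=(1, pool_width)))

    # summary() confirms the conv -- relu -- dropout -- ... -- max pool ordering.
    model.summary()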
Why did they add the activation as a separate layer instead of specifying "activation" inside the convolution layer? I don't know; it looks like an odd implementation choice, but in practical terms,
    self.model.add(Convolution2D(
        nb_filter=nb_filter, nb_row=conv_height, nb_col=nb_col,
        activation='linear', init='he_normal',
        input_shape=self.input_shape,
        W_regularizer=l1(L1), b_regularizer=l1(L1)))
    self.model.add(Activation('relu'))
and
    self.model.add(Convolution2D(
        nb_filter=nb_filter, nb_row=conv_height, nb_col=nb_col,
        activation='relu', init='he_normal',
        input_shape=self.input_shape,
        W_regularizer=l1(L1), b_regularizer=l1(L1)))
are equivalent.
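If you want to verify that equivalence numerically, a minimal sketch (again using the modern Conv2D API, with arbitrary shapes chosen purely for the demo) is to give both variants identical weights and compare their outputs:

    import numpy as np
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Conv2D, Activation

    # Variant A: linear convolution followed by a separate ReLU layer.
    model_a = Sequential([
        Conv2D(8, (4, 15), activation='linear', input_shape=(4, 100, 1)),
        Activation('relu'),
    ])

    # Variant B: the same convolution with the ReLU fused into the layer.
    model_b = Sequential([
        Conv2D(8, (4, 15), activation='relu', input_shape=(4, 100, 1)),
    ])

    # Copy variant A's weights into variant B so both compute the same function.
    model_b.set_weights(model_a.get_weights())

    x = np.random.rand(2, 4, 100, 1).astype('float32')
    # Both compute relu(conv(x)), so the outputs match.
    assert np.allclose(model_a.predict(x), model_b.predict(x))

The only practical difference is cosmetic: the separate Activation layer shows up as its own row in model.summary().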