I want to create a model with multiple inputs, so I tried to build one like this:
```python
from tensorflow.keras import layers
from tensorflow.keras.layers import Input, Dense, concatenate
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# Define the two sets of inputs
inputA = Input(shape=(32, 64, 1))
inputB = Input(shape=(32, 1024))

# CNN branch
x = layers.Conv2D(32, kernel_size=(3, 3), activation='relu')(inputA)
x = layers.Conv2D(32, (3, 3), activation='relu')(x)
x = layers.MaxPooling2D(pool_size=(2, 2))(x)
x = layers.Dropout(0.2)(x)
x = layers.Flatten()(x)
x = layers.Dense(500, activation='relu')(x)
x = layers.Dropout(0.5)(x)
x = layers.Dense(500, activation='relu')(x)
x = Model(inputs=inputA, outputs=x)

# DNN branch
y = layers.Flatten()(inputB)
y = Dense(64, activation="relu")(y)
y = Dense(250, activation="relu")(y)
y = Dense(500, activation="relu")(y)
y = Model(inputs=inputB, outputs=y)

# Merge the outputs of the two branches
combined = concatenate([x.output, y.output])

# Head on top of the combined outputs
z = Dense(300, activation="relu")(combined)
z = Dense(100, activation="relu")(z)
z = Dense(1, activation="softmax")(z)

model = Model(inputs=[x.input, y.input], outputs=z)
model.summary()

opt = Adam(lr=1e-3, decay=1e-3 / 200)
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=opt, metrics=['accuracy'])
```
And the model summary is as follows:
However, when I try to train this model with
```python
history = model.fit([trainimage, train_product_embd], train_label,
                    validation_data=([validimage, valid_product_embd], valid_label),
                    epochs=10, steps_per_epoch=100, validation_steps=10)
```
the problem appears:
```
ResourceExhaustedError                    Traceback (most recent call last)
<ipython-input-18-2b79f16d63c0> in <module>()
----> 1 history = model.fit([trainimage, train_product_embd],train_label, validation_data=([validimage,valid_product_embd],valid_label), epochs=10, steps_per_epoch=100, validation_steps=10)

4 frames
/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py in __call__(self, *args, **kwargs)
   1470     ret = tf_session.TF_SessionRunCallable(self._session._session,
   1471                                            self._handle, args,
-> 1472                                            run_metadata_ptr)
   1473     if run_metadata:
   1474       proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

ResourceExhaustedError: 2 root error(s) found.
(0) Resource exhausted: OOM when allocating tensor with shape[800000,32,30,62] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
  [[{{node conv2d_1/convolution}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
  [[metrics/acc/Mean_1/_185]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
(1) Resource exhausted: OOM when allocating tensor with shape[800000,32,30,62] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
  [[{{node conv2d_1/convolution}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
0 successful operations.
0 derived errors ignored.
```
Thanks for reading, and I hope you can help me 🙂
Answer:
OOM stands for "out of memory". Your GPU is running out of memory, so it cannot allocate memory for this tensor. There are a few things you can try:
- Reduce the number of filters in your `Dense` and `Conv2D` layers
- Use a smaller `batch_size` (or increase `steps_per_epoch` and `validation_steps`); see the sketch after this list
- Use grayscale images (you can use `tf.image.rgb_to_grayscale`)
- Reduce the number of layers
- Use `MaxPooling2D` layers after convolutional layers
- Reduce the size of your images (you can use `tf.image.resize` for that)
- Use smaller `float` precision for your input, namely `np.float32`
- If you are using a pre-trained model, freeze the first layers (like this)
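As a concrete starting point, here is a minimal sketch of the `batch_size` and `float` precision suggestions, reusing the array names from the question (`trainimage`, `train_product_embd`, and so on) and assuming they are NumPy arrays:

```python
import numpy as np
import tensorflow as tf

# Cast the inputs to 32-bit floats: if the arrays were float64,
# this halves the memory they occupy.
trainimage = trainimage.astype(np.float32)
train_product_embd = train_product_embd.astype(np.float32)
validimage = validimage.astype(np.float32)
valid_product_embd = valid_product_embd.astype(np.float32)

# Optionally shrink the images, e.g. from 32x64 down to 16x32.
# (If you do this, remember to change Input(shape=...) to match.)
# trainimage = tf.image.resize(trainimage, (16, 32)).numpy()

# Train with an explicit, small batch size instead of pushing
# the whole dataset through the GPU at once.
history = model.fit(
    [trainimage, train_product_embd], train_label,
    validation_data=([validimage, valid_product_embd], valid_label),
    epochs=10,
    batch_size=32,
)
```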
There is more useful information in the error itself:
```
OOM when allocating tensor with shape[800000,32,30,62]
```
This is a weird shape. If you are working with images, you should normally have 1 or 3 channels. On top of that, it looks like you are passing your entire dataset at once; you should instead pass it in batches.
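One way to make the batching explicit is to wrap the arrays in a `tf.data.Dataset`, so that only one batch at a time is materialized on the GPU. A minimal sketch, assuming TF 2.x and the same array names as above:

```python
import tensorflow as tf

BATCH_SIZE = 32

# Yield ((image_batch, embedding_batch), label_batch) tuples that match
# the two-input model defined above, one batch at a time.
train_ds = (
    tf.data.Dataset.from_tensor_slices(
        ((trainimage, train_product_embd), train_label))
    .shuffle(buffer_size=10_000)
    .batch(BATCH_SIZE)
)
valid_ds = tf.data.Dataset.from_tensor_slices(
    ((validimage, valid_product_embd), valid_label)).batch(BATCH_SIZE)

history = model.fit(train_ds, validation_data=valid_ds, epochs=10)
```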