How to fix "ResourceExhaustedError: OOM when allocating tensor"

I want to create a model with multiple inputs, so I tried to build one like this.

# define two sets of inputs
inputA = Input(shape=(32,64,1))
inputB = Input(shape=(32,1024))

# CNN
x = layers.Conv2D(32, kernel_size = (3, 3), activation = 'relu')(inputA)
x = layers.Conv2D(32, (3,3), activation='relu')(x)
x = layers.MaxPooling2D(pool_size=(2,2))(x)
x = layers.Dropout(0.2)(x)
x = layers.Flatten()(x)
x = layers.Dense(500, activation = 'relu')(x)
x = layers.Dropout(0.5)(x)
x = layers.Dense(500, activation='relu')(x)
x = Model(inputs=inputA, outputs=x)

# DNN
y = layers.Flatten()(inputB)
y = Dense(64, activation="relu")(y)
y = Dense(250, activation="relu")(y)
y = Dense(500, activation="relu")(y)
y = Model(inputs=inputB, outputs=y)

# combine the outputs of the two models
combined = concatenate([x.output, y.output])

# combined outputs
z = Dense(300, activation="relu")(combined)
z = Dense(100, activation="relu")(combined)
z = Dense(1, activation="softmax")(combined)
model = Model(inputs=[x.input, y.input], outputs=z)

model.summary()

opt = Adam(lr=1e-3, decay=1e-3 / 200)
model.compile(loss = 'sparse_categorical_crossentropy', optimizer = opt,
    metrics = ['accuracy'])

The model summary looks like this:

[Model summary output not preserved in this copy.]

However, when I try to train this model,

history = model.fit([trainimage, train_product_embd], train_label,
    validation_data=([validimage, valid_product_embd], valid_label),
    epochs=10, steps_per_epoch=100, validation_steps=10)

this problem occurs:

ResourceExhaustedError                    Traceback (most recent call last)
<ipython-input-18-2b79f16d63c0> in <module>()
----> 1 history = model.fit([trainimage, train_product_embd],train_label, validation_data=([validimage,valid_product_embd],valid_label), epochs=10, steps_per_epoch=100, validation_steps=10)

4 frames
/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py in __call__(self, *args, **kwargs)
   1470         ret = tf_session.TF_SessionRunCallable(self._session._session,
   1471                                                self._handle, args,
-> 1472                                                run_metadata_ptr)
   1473         if run_metadata:
   1474           proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

ResourceExhaustedError: 2 root error(s) found.
  (0) Resource exhausted: OOM when allocating tensor with shape[800000,32,30,62] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
     [[{{node conv2d_1/convolution}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

     [[metrics/acc/Mean_1/_185]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

  (1) Resource exhausted: OOM when allocating tensor with shape[800000,32,30,62] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
     [[{{node conv2d_1/convolution}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

0 successful operations.
0 derived errors ignored.

Thanks for reading, and I hope you can help me 🙂

Answer:

OOM stands for "out of memory". Your GPU has run out of memory, so it cannot allocate this tensor. There are a few things you can try:

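Reduce the number of units in the Dense layers and the number of filters in the Conv2D layers
Use a smaller batch_size (or increase steps_per_epoch and validation_steps)
Use grayscale images (you can use tf.image.rgb_to_grayscale)
Reduce the number of layers
Use MaxPooling2D layers after the convolutional layers
Reduce the size of your images (you can use tf.image.resize for that)
Use a smaller float precision for your input, namely np.float32
If you are using a pre-trained model, freeze the first layers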
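Several of these suggestions shrink the same quantity: the size of the activation tensor that a convolution has to write. The sketch below is a rough back-of-the-envelope estimate, not a measurement. It assumes a stride-1 Conv2D with 'valid' padding (the Keras defaults) and 4 bytes per float32 value, and the helper name `conv_output_bytes` is made up for this illustration. It counts only one layer's output, so real training memory (gradients, weights, other activations) will be higher.

```python
def conv_output_bytes(batch, height, width, filters, kernel=3, bytes_per_value=4):
    """Bytes for the output of one stride-1, 'valid'-padded Conv2D layer."""
    out_h = height - kernel + 1  # 'valid' padding trims kernel - 1 rows
    out_w = width - kernel + 1   # and kernel - 1 columns
    return batch * out_h * out_w * filters * bytes_per_value


GIB = 1024 ** 3
MIB = 1024 ** 2

# Assumption: 800000 samples fed in a single step, which is the leading
# dimension of the tensor named in the error message.
full = conv_output_bytes(batch=800000, height=32, width=64, filters=32)

# The same layer with a batch of 32 samples (an arbitrary, common choice).
small = conv_output_bytes(batch=32, height=32, width=64, filters=32)

# Batch of 32 and half as many filters.
fewer_filters = conv_output_bytes(batch=32, height=32, width=64, filters=16)

print(f"whole dataset per step: {full / GIB:.1f} GiB")
print(f"batch of 32:            {small / MIB:.2f} MiB")
print(f"batch of 32, 16 filters: {fewer_filters / MIB:.2f} MiB")
```

Under these assumptions, the batch dimension dominates everything else: the output shape for a 32x64 input is 30x62 with 32 filters, which matches the `[800000,32,30,62]` tensor in the error, and cutting the batch from 800000 to 32 reduces that one tensor from roughly 177 GiB to a few MiB.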

There is also more useful information in the error itself:

OOM when allocating tensor with shape[800000,32,30,62]

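That is an odd shape. If you are working with images, you would normally have 3 channels or 1. On top of that, it looks like you are passing your entire dataset at once; you should pass it in batches instead.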
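To illustrate what "passing it in batches" means, here is a minimal sketch. It assumes the training set holds 800000 samples (inferred from the leading dimension of the tensor in the error message) and uses a batch size of 32, which is an arbitrary choice. The helper names `iter_batches` and `steps_for` are hypothetical. The commented `model.fit` call reuses the variable names from the question and relies on the standard Keras `batch_size` argument.

```python
import math


def iter_batches(num_samples, batch_size):
    """Yield (start, stop) index pairs covering num_samples in chunks."""
    for start in range(0, num_samples, batch_size):
        yield start, min(start + batch_size, num_samples)


def steps_for(num_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)


# Small demonstration: 10 samples in batches of 4; the last batch is shorter.
demo = list(iter_batches(10, 4))
print(demo)

# Assumed dataset size from the error message, with a batch size of 32.
print(steps_for(800000, 32))

# With in-memory arrays, Keras does this slicing itself when batch_size is
# given, so the call from the question could be tried as (not run here):
#
# history = model.fit([trainimage, train_product_embd], train_label,
#                     validation_data=([validimage, valid_product_embd],
#                                      valid_label),
#                     epochs=10, batch_size=32)
```

In the question, `fit` was called with `steps_per_epoch=100` and no `batch_size`. The batch dimension of 800000 in the error suggests that each step received the full arrays, so setting `batch_size` explicitly and dropping `steps_per_epoch` and `validation_steps` is one thing to try first.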
