When training an image classification model on Colab, my training always stops without an error

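When I set up a Colab environment to train an image classification model, the training process starts and eventually stops on its own. I suspect the allocated 12 GB of RAM is not enough, because the RAM bar turns orange, then the process stops and the output shows ^C (Ctrl+C, which means the training was interrupted). Can I increase the RAM?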

WARNING:tensorflow:The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.
WARNING:tensorflow:From /content/drive/My Drive/models/research/slim/nets/inception_resnet_v2.py:373: The name tf.GraphKeys is deprecated. Please use tf.compat.v1.GraphKeys instead.
WARNING:tensorflow:From /content/drive/My Drive/models/research/slim/nets/mobilenet/mobilenet.py:397: The name tf.nn.avg_pool is deprecated. Please use tf.nn.avg_pool2d instead.
WARNING:tensorflow:From train.py:55: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.
WARNING:tensorflow:From train.py:55: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead.
WARNING:tensorflow:From train.py:184: The name tf.app.run is deprecated. Please use tf.compat.v1.app.run instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/absl/app.py:250: main (from __main__) is deprecated and will be removed in a future version.
Instructions for updating:
Use object_detection/model_main.py.
W1001 13:13:15.483837 139753866016640 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/absl/app.py:250: main (from __main__) is deprecated and will be removed in a future version.
Instructions for updating:
Use object_detection/model_main.py.
WARNING:tensorflow:From train.py:90: The name tf.gfile.MakeDirs is deprecated. Please use tf.io.gfile.makedirs instead.
W1001 13:13:15.484074 139753866016640 deprecation_wrapper.py:119] From train.py:90: The name tf.gfile.MakeDirs is deprecated. Please use tf.io.gfile.makedirs instead.
WARNING:tensorflow:From /content/drive/My Drive/models/research/object_detection/utils/config_util.py:102: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.
W1001 13:13:15.484555 139753866016640 deprecation_wrapper.py:119] From /content/drive/My Drive/models/research/object_detection/utils/config_util.py:102: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.
WARNING:tensorflow:From train.py:95: The name tf.gfile.Copy is deprecated. Please use tf.io.gfile.copy instead.
W1001 13:13:15.490095 139753866016640 deprecation_wrapper.py:119] From train.py:95: The name tf.gfile.Copy is deprecated. Please use tf.io.gfile.copy instead.
WARNING:tensorflow:From /content/drive/My Drive/models/research/object_detection/legacy/trainer.py:266: create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.create_global_step
W1001 13:13:15.501523 139753866016640 deprecation.py:323] From /content/drive/My Drive/models/research/object_detection/legacy/trainer.py:266: create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.create_global_step
WARNING:tensorflow:From /content/drive/My Drive/models/research/object_detection/data_decoders/tf_example_decoder.py:182: The name tf.FixedLenFeature is deprecated. Please use tf.io.FixedLenFeature instead.
W1001 13:13:15.505989 139753866016640 deprecation_wrapper.py:119] From /content/drive/My Drive/models/research/object_detection/data_decoders/tf_example_decoder.py:182: The name tf.FixedLenFeature is deprecated. Please use tf.io.FixedLenFeature instead.
WARNING:tensorflow:From /content/drive/My Drive/models/research/object_detection/data_decoders/tf_example_decoder.py:197: The name tf.VarLenFeature is deprecated. Please use tf.io.VarLenFeature instead.
W1001 13:13:15.506183 139753866016640 deprecation_wrapper.py:119] From /content/drive/My Drive/models/research/object_detection/data_decoders/tf_example_decoder.py:197: The name tf.VarLenFeature is deprecated. Please use tf.io.VarLenFeature instead.
WARNING:tensorflow:From /content/drive/My Drive/models/research/object_detection/builders/dataset_builder.py:64: The name tf.gfile.Glob is deprecated. Please use tf.io.gfile.glob instead.
W1001 13:13:15.524426 139753866016640 deprecation_wrapper.py:119] From /content/drive/My Drive/models/research/object_detection/builders/dataset_builder.py:64: The name tf.gfile.Glob is deprecated. Please use tf.io.gfile.glob instead.
WARNING:tensorflow:From /content/drive/My Drive/models/research/object_detection/builders/dataset_builder.py:71: The name tf.logging.warning is deprecated. Please use tf.compat.v1.logging.warning instead.
W1001 13:13:15.527117 139753866016640 deprecation_wrapper.py:119] From /content/drive/My Drive/models/research/object_detection/builders/dataset_builder.py:71: The name tf.logging.warning is deprecated. Please use tf.compat.v1.logging.warning instead.
WARNING:tensorflow:num_readers has been reduced to 1 to match input file shards.
W1001 13:13:15.527241 139753866016640 dataset_builder.py:72] num_readers has been reduced to 1 to match input file shards.
WARNING:tensorflow:From /content/drive/My Drive/models/research/object_detection/builders/dataset_builder.py:86: parallel_interleave (from tensorflow.contrib.data.python.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.experimental.parallel_interleave(...)`.
W1001 13:13:15.533276 139753866016640 deprecation.py:323] From /content/drive/My Drive/models/research/object_detection/builders/dataset_builder.py:86: parallel_interleave (from tensorflow.contrib.data.python.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.experimental.parallel_interleave(...)`.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/contrib/data/python/ops/interleave_ops.py:77: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_determinstic`.
W1001 13:13:15.533428 139753866016640 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/contrib/data/python/ops/interleave_ops.py:77: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_determinstic`.
WARNING:tensorflow:From /content/drive/My Drive/models/research/object_detection/builders/dataset_builder.py:155: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.map()
W1001 13:13:15.562883 139753866016640 deprecation.py:323] From /content/drive/My Drive/models/research/object_detection/builders/dataset_builder.py:155: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.map()
WARNING:tensorflow:From /content/drive/My Drive/models/research/object_detection/builders/dataset_builder.py:43: DatasetV1.make_initializable_iterator (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py:1066: get_checkpoint_mtimes (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file utilities to get mtimes.
W1001 13:13:33.349572 139753866016640 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py:1066: get_checkpoint_mtimes (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file utilities to get mtimes.
INFO:tensorflow:Running local_init_op.
I1001 13:13:33.351701 139753866016640 session_manager.py:500] Running local_init_op.
INFO:tensorflow:Done running local_init_op.
I1001 13:13:33.607376 139753866016640 session_manager.py:502] Done running local_init_op.
INFO:tensorflow:Starting Session.
I1001 13:13:38.220966 139753866016640 learning.py:754] Starting Session.
INFO:tensorflow:Saving checkpoint to path training/model.ckpt
I1001 13:13:38.410431 139752680122112 supervisor.py:1117] Saving checkpoint to path training/model.ckpt
INFO:tensorflow:Starting Queues.
I1001 13:13:38.413790 139753866016640 learning.py:768] Starting Queues.
2019-10-01 13:13:49.720631: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:111] Filling up shuffle buffer (this may take a while): 1382 of 2048
INFO:tensorflow:global_step/sec: 0
I1001 13:13:49.738999 139752671729408 supervisor.py:1099] global_step/sec: 0
2019-10-01 13:13:54.929910: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:162] Shuffle buffer filled.
INFO:tensorflow:Recording summary at step 0.
I1001 13:13:56.814973 139752663336704 supervisor.py:1050] Recording summary at step 0.
INFO:tensorflow:global step 1: loss = 13.7762 (20.265 sec/step)
I1001 13:14:00.905406 139753866016640 learning.py:507] global step 1: loss = 13.7762 (20.265 sec/step)
^C

Answer:

The RAM bar turning yellow, and possibly red, means that the memory is filling up. You cannot increase the RAM yourself. The amount is fixed by Google.
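To confirm that RAM is really what runs out, it can help to log memory usage from inside the notebook while training runs. The following is a minimal sketch, assuming a Linux runtime that exposes `/proc/meminfo` with a `MemAvailable` field (which Colab VMs should, but this is an assumption); the helper names `parse_meminfo` and `ram_usage_fraction` are made up for this illustration.

```python
from pathlib import Path


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value in kB}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def ram_usage_fraction(info):
    """Return the fraction of total RAM that is currently not available."""
    total = info["MemTotal"]
    available = info["MemAvailable"]
    return (total - available) / total


if __name__ == "__main__":
    meminfo_path = Path("/proc/meminfo")  # only present on Linux
    if meminfo_path.exists():
        info = parse_meminfo(meminfo_path.read_text())
        print("RAM in use: {:.0%}".format(ram_usage_fraction(info)))
```

If the reported fraction climbs toward 100% right before the `^C` appears, that supports the out-of-memory explanation.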

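One way to work around this is to reduce the size of the model. If that is not enough, you can also reduce the number of neurons in the model's layers, which lowers the number of parameters.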
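To illustrate how layer width drives the parameter count, here is a rough sketch for a plain fully connected network. The layer sizes are hypothetical and are not taken from the model in the question. The estimate covers only float32 weights and biases, so actual training memory (activations, gradients, optimizer state, input buffers) will be higher.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases of a fully connected network.

    layer_sizes lists the width of every layer, input first, e.g. [784, 512, 10].
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus bias vector
    return total


def weights_megabytes(layer_sizes, bytes_per_param=4):
    """Approximate size of the parameters alone, assuming float32 (4 bytes)."""
    return dense_param_count(layer_sizes) * bytes_per_param / 1e6


if __name__ == "__main__":
    # Hypothetical layer widths, chosen only for illustration.
    wide = [784, 512, 10]
    narrow = [784, 256, 10]
    for sizes in (wide, narrow):
        print(sizes, dense_param_count(sizes), "params,",
              round(weights_megabytes(sizes), 2), "MB")
```

Halving the hidden layer here roughly halves the parameter count, which is the effect the answer is relying on.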
