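I cloned the TensorFlow object detection model from GitHub (GitHub link).
I want to train this model on my own data (331 images of Samoyed dogs), following this blog tutorial (click here).
My steps were as follows:
- Created a dataset in PASCAL VOC format;
- Downloaded the pretrained model (ssd_mobilenet_v1_coco_11_06_2017.tar.gz);
- Modified the config file (ssd_mobilenet_v1_pets.config);
- Started training with the following command: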
python object_detection/train.py \
    --logtostderr \
    --pipeline_config_path=./samoyed_test_and_train/training/ssd_mobilenet_v1_pets.config \
    --train_dir=./samoyed_test_and_train/data/train.record
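But I got an error. My operating system is macOS, and I also tried running it on AWS, where the same problem occurred. Can you spot my mistake? The error is as follows: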
INFO:tensorflow:Summary name Learning Rate is illegal; using Learning_Rate instead.
WARNING:tensorflow:From /Users/zhaoenpei/Desktop/dabai-robot-arm/experiments/models/object_detection/meta_architectures/ssd_meta_arch.py:579: all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Please use tf.global_variables instead.
INFO:tensorflow:Summary name /clone_loss is illegal; using clone_loss instead.
2017-08-01 10:34:42.992224: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-01 10:34:42.992254: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-08-01 10:35:00.359032: I tensorflow/core/common_runtime/simple_placer.cc:675] Ignoring device specification /device:GPU:0 for node 'prefetch_queue_Dequeue' because the input edge from 'prefetch_queue' is a reference connection and already has a device field set to /device:CPU:0
INFO:tensorflow:Restoring parameters from /Users/zhaoenpei/Desktop/dabai-robot-arm/experiments/models/samoyed_test_and_train/training/model.ckpt
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.FailedPreconditionError'>, ./samoyed_test_and_train/data/train.record/graph.pbtxt.tmpf4587d1958df43cbaa9a0d7a04199f6f
2017-08-01 10:35:29.556458: E tensorflow/core/util/events_writer.cc:62] Could not open events file: ./samoyed_test_and_train/data/train.record/events.out.tfevents.1501554929.MacBook-Pro.local: Failed precondition: ./samoyed_test_and_train/data/train.record/events.out.tfevents.1501554929.MacBook-Pro.local
2017-08-01 10:35:29.556480: E tensorflow/core/util/events_writer.cc:95] Write failed because file could not be opened.
Traceback (most recent call last):
  File "object_detection/train.py", line 198, in <module>
    tf.app.run()
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "object_detection/train.py", line 194, in main
    worker_job_name, is_chief, FLAGS.train_dir)
  File "/Users/zhaoenpei/Desktop/dabai-robot-arm/experiments/models/object_detection/trainer.py", line 290, in train
    saver=saver)
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/contrib/slim/python/slim/learning.py", line 732, in train
    master, start_standard_services=False, config=session_config) as sess:
  File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 964, in managed_session
    self.stop(close_summary_writer=close_summary_writer)
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 792, in stop
    stop_grace_period_secs=self._stop_grace_secs)
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/training/coordinator.py", line 389, in join
    six.reraise(*self._exc_info_to_raise)
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 953, in managed_session
    start_standard_services=start_standard_services)
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 709, in prepare_or_wait_for_session
    self._write_graph()
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 612, in _write_graph
    self._logdir, "graph.pbtxt")
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/framework/graph_io.py", line 67, in write_graph
    file_io.atomic_write_string_to_file(path, str(graph_def))
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 418, in atomic_write_string_to_file
    write_string_to_file(temp_pathname, contents)
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 305, in write_string_to_file
    f.write(file_content)
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 101, in write
    self._prewrite_check()
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 87, in _prewrite_check
    compat.as_bytes(self.__name), compat.as_bytes(self.__mode), status)
  File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.FailedPreconditionError: ./samoyed_test_and_train/data/train.record/graph.pbtxt.tmpf4587d1958df43cbaa9a0d7a04199f6f
Answer:
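The train_dir flag should point to some (typically empty) directory where training logs and checkpoints are written during training. For example, you could set train_dir=/tmp/training_directory. It looks like you tried to point it at your dataset, whereas the config file should already point to the dataset. That would explain the traceback: TensorFlow tries to write graph.pbtxt and the events file under ./samoyed_test_and_train/data/train.record/, which cannot work if train.record is a data file rather than a directory.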
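To guard against this kind of mix-up before launching training, a small pre-flight check along the following lines might help. This is only a sketch: prepare_train_dir is a hypothetical helper that is not part of the object detection repository, and it assumes that train.record is a regular TFRecord file, as the error message suggests.

```python
import os


def prepare_train_dir(train_dir):
    """Return train_dir, creating it if needed; reject paths that are files.

    Hypothetical helper for illustration, not part of the TensorFlow
    object detection code.
    """
    if os.path.isfile(train_dir):
        # The situation assumed in the question: the path names a data file
        # (e.g. train.record), so logs and checkpoints cannot be created
        # underneath it.
        raise ValueError(
            "train_dir %r is a file; pass a directory for logs and checkpoints"
            % train_dir)
    if not os.path.isdir(train_dir):
        # Create the (empty) output directory on first use.
        os.makedirs(train_dir)
    return train_dir
```

Calling prepare_train_dir("/tmp/training_directory") would create the directory if it is missing, while passing the path of the dataset file would raise a ValueError instead of failing later inside TensorFlow.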