TensorFlow Object Detection API – not all classes are detected

  • Python: 3.7
  • TF-gpu==1.15
  • Quadro RTX 4000
  • 8 GB VRAM, 64 GB system RAM
  • Pretrained model config: ssd_mobilenet_v1_pets.config

I've just started using the TensorFlow Object Detection API and want to apply it to my own image set. I want to teach it to distinguish top, bottom, and side views of BGA chips (and, if present, the table with the dimensions). The images come from documents called datasheets, which show the exact dimensions of these components.

images/train = 565 images
images/test = 24 images

What I don't understand is why only the "top" label is being recognized. This has been bothering me all day, and I know it's not a problem with my CSV files or TFRecords, because I've gone over them again and again and made sure they are fine.
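
To rule the records out for good, one quick check is to count the class labels actually stored in train.record. A minimal sketch, assuming the records were written with the standard 'image/object/class/text' feature key used by the usual generate_tfrecord.py scripts:

# Count how often each class name appears in the training record.
# Assumption: the records store class names under 'image/object/class/text'
# (the key used by the standard TF Object Detection API record writers).
import collections

import tensorflow as tf  # TF 1.15

counts = collections.Counter()
for record in tf.python_io.tf_record_iterator("data/train.record"):
    example = tf.train.Example()
    example.ParseFromString(record)
    names = example.features.feature["image/object/class/text"].bytes_list.value
    counts.update(name.decode("utf-8") for name in names)

print(counts)  # should show boxes for all four classes: top, bottom, side, table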

Config file:

# SSD with Mobilenet v1, configured for Oxford-IIIT Pets Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.

model {
  ssd {
    num_classes: 4
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 1
        box_code_size: 4
        apply_sigmoid_to_scores: false
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.03
              mean: 0.0
            }
          }
          batch_norm {
            train: true,
            scale: true,
            center: true,
            decay: 0.9997,
            epsilon: 0.001,
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_mobilenet_v1'
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
    }
    loss {
      classification_loss {
        weighted_sigmoid {
        }
      }
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.99
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 0
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}

train_config: {
  batch_size: 16
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 2500
          decay_factor: 0.9
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  fine_tune_checkpoint: "ssd_mobilenet_v1_coco_2018_01_28/model.ckpt"
  from_detection_checkpoint: true
  load_all_detection_checkpoint_vars: true
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 4000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "data/train.record"
  }
  label_map_path: "data/object-detection.pbtxt"
}

eval_config: {
  metrics_set: "coco_detection_metrics"
  num_examples: 24
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "data/test.record"
  }
  label_map_path: "training/object-detection.pbtxt"
  shuffle: false
  num_readers: 1
}

Label map:

item {
  id: 1
  name: 'top'
}
item {
  id: 2
  name: 'bottom'
}
item {
  id: 3
  name: 'side'
}
item {
  id: 4
  name: 'table'
}
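
For completeness, these ids have to line up with the class-to-integer mapping used when the records were generated. A minimal sketch of what that mapping should look like for this label map (modeled on the class_text_to_int helper in the commonly used generate_tfrecord.py, so the function name is an assumption):

# Class-name-to-id mapping that must agree with object-detection.pbtxt.
# If this helper falls through to a default (e.g. returns 1) for 'bottom',
# 'side' or 'table', every box in the records ends up labelled as 'top'.
def class_text_to_int(row_label):
    mapping = {'top': 1, 'bottom': 2, 'side': 3, 'table': 4}
    if row_label not in mapping:
        raise ValueError('Unknown label: {}'.format(row_label))
    return mapping[row_label]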

[screenshot of the detection output]


Answer:

If I understand correctly, after training you can't see all of the classes at the detection stage. I suggest using this script to load the trained frozen inference graph, and don't forget to specify the number of classes. Good luck! Here is the link to the code. Please don't forget to accept this answer if it solves your problem.
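
For reference, here is a minimal sketch of that kind of loading code, following the usual TF1 object_detection tutorial pattern; the paths below are placeholders for your own exported graph and label map:

# Build the category index with the correct number of classes and load the
# exported frozen inference graph (TF 1.15, object_detection utils).
import tensorflow as tf
from object_detection.utils import label_map_util

PATH_TO_FROZEN_GRAPH = "inference_graph/frozen_inference_graph.pb"  # placeholder
PATH_TO_LABELS = "data/object-detection.pbtxt"
NUM_CLASSES = 4  # top, bottom, side, table -- this is the value not to forget

label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(
    label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

# Load the frozen graph into its own tf.Graph for inference.
detection_graph = tf.Graph()
with detection_graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, "rb") as fid:
        graph_def.ParseFromString(fid.read())
    tf.import_graph_def(graph_def, name="")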
