YOLOv3 returns zero confidence for every class

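I am implementing YOLOv3 for multi-class object detection.

YOLO predicts candidate boxes over a grid in a single pass and keeps the highest-confidence candidates as its predictions; you can read more about it here.

For this particular task I followed this Murtaza tutorial, which walked me through it from scratch.

Because the complex network architecture takes hours to train, I preferred to start from a pretrained network and weights (parameters), as in transfer learning. You can find both here:
Architecture configuration: cfg
Network parameters (weights): weights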

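I used YOLOv3-tiny here because I need a higher frame rate for video processing, but the results are not as promising as those shown in the tutorial, and I do not know where I went wrong. Even after switching the network cfg and weights files to the original YOLOv3 (320), I still do not get correct results. I do get all 5 spatial values, the coordinates and confidence [cx, cy, w, h, confidence], but the probabilities for all 80 classes are still a zero vector [0.0, 0.0, 0.0, ..., 0.0]. Switching to a different video source gives the same zero vector, whereas in the tutorial all of this works.

Implementation code: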

    # YOLO algorithm
    # Network weights and configuration files
    yolov3_tiny_cfg='/root/Downloads/ML TASK/yolov3-tiny.cfg'  # configuration file
    yolov3_tiny_weights='/root/Downloads/ML TASK/yolov3-tiny.weights'  # weights
    coco_names='/root/Downloads/ML TASK/coco.names'  # coco classes
    # for the general YOLO 320 architecture
    # put paths to directory
    yolov3_cfg='/root/Downloads/ML TASK/yolov3.cfg'
    yolov3_weights='/root/Downloads/ML TASK/yolov3.weights'
    # Test videos
    Test_video_1='/root/Downloads/ML TASK/mn.mp4'
    Test_video_2='/root/Downloads/ML TASK/bg.mp4'

    # Dependencies
    import cv2
    import numpy as np

    # Dataset classes:
    # there are around 80 classes in the coco dataset, so writing them out by hand
    # would not be the right choice; instead we read them from a file named coco.names
    # getting list of classes
    classes=[]  # empty list initialization
    with open(coco_names,'r') as f:
        classes=f.read().splitlines()
    # viewing the multiclass list, around 80 classes in the coco dataset

    # Loading yolov3 using the configuration file and weights
    network=cv2.dnn.readNetFromDarknet(yolov3_cfg,yolov3_weights)
    network.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)  # use the OpenCV CPU backend
    network.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)

    # NOTE: the network cannot take the image directly; we first have to preprocess it
    # to match the network's input shape and type, i.e. a blob, which generally refers
    # to a mathematical form of binary images, like a bitmap
    Width,Height=320,320  # square image, so the network grid is n*n, equal on both dimensions
    Confidence_Threshould=0.5  # minimum probability for claiming a prediction
    NMS_Threshould=0.3

    cap=cv2.VideoCapture('game.mp4')
    fps = cap.get(cv2.CAP_PROP_FPS)
    timestamps = [cap.get(cv2.CAP_PROP_POS_MSEC)]

    # function to find objects on the captured video stream
    def findObjects(outputs,image):
        h,w,c=image.shape
        bound_box=[]  # for feeding through the function
        classIds=[]
        confidence=[]
        for output in outputs:  # outputs from 2 layers (v3 tiny), 3 if using yolov3 320
            for detection in output:
                scores=detection[5:]  # skip the first five values because we use them for the bounding box
                classId=np.argmax(scores)
                confs=scores[classIds]
                # keep an object as a final prediction only when it exceeds the minimum confidence threshold
                if confs > Confidence_Threshould:
                    w,h=int(detection[2]*Width),int(detection[3]*Height)  # convert % into pixels
                    x,y=int((detection[0]*Width)-(w/2)),int((detection[1]*Height)-(h/2))
                    bound_box.append([x,y,w,h])
                    classIds.append(classId)
                    confidence.append(float(confs))
        print(len(bound_box))
        # to reduce the number of boxes on the frame we use NMSBoxes; it returns the
        # indices of the boxes to keep
        indices=cv2.dnn.NMSBoxes(bound_box,confs,Confidence_Threshould,NMS_Threshould)
        for i in indices:
            i=i[0]
            box=bound_box[i]
            x,y,w,h=box[0],box[1],box[2],box[3]
            cv2.rectangle(image,(x,y),(x+w,y+h),(255,0,0),2)
            cv2.puttext(image,f'{classes[classIds[i]]}{int(confidence[i]*100)}%',
                        (x,y-10),cv2.FONT_HERSHEY_PLAIN,0.6,(0,255,0),2)
            cv2.puttext(image,f'FPS:{fps}',(0,150),cv2.FONT_HERSHEY_PLAIN,0.6,(0,255,0),2)
            cv2.puttext(image,f'TIMESTAMPS:{timestamps}',(150,0),cv2.FONT_HERSHEY_PLAIN,0.6,(0,255,0),2)

    while True:
        success,image=cap.read()
        # converting the image into a blob for network I/O processing
        try:
            blob=cv2.dnn.blobFromImage(image,1/255,(Width,Height),[0,0,0],crop=False)
        except:
            continue
        # I/P
        network.setInput(blob)  # setting input
        # O/P
        # The general YOLO architecture produces three outputs from the respective layers,
        # and the maximum confidence decides the final predictions.
        # Here the network has only 2 outputs because we use the tiny version for higher frame rates.
        # To get the outputs we have to know the names of the respective layers, i.e. not
        # the names directly but their indexes (starting from 1, not zero), obtained with
        # the getUnconnectedOutLayers function
        layers_names=network.getLayerNames()
        #print(network.getUnconnectedOutLayers())  # 36th and 48th indexes
        # looping because there are multiple OutLayers values
        outputNames=[layers_names[i[0]-1] for i in network.getUnconnectedOutLayers()]  # -1 because the indexes start from one, not zero
        #print(outputNames)  # for v3 tiny, 16 and 23 are the layer names
        # forwarding the image through the network
        outputs=network.forward(outputNames)
        # finding objects
        # print(outputs[0].shape)=>(300,85) 300=>no. of boxes, 85=>[cx,cy,width,height,confidence,probability of 80 classes]
        # using cx,cy,w,h we determine the bounding box
        # print(outputs[1].shape)=>(1200,85) 1200 boxes; the shape is in m*n (matrix) form,
        # where each of the 1200 rows is a box with the 85-value vector explained above
        #print(outputs[0][0])
        findObjects(outputs,image)
        cv2.imshow('Window',image)
        if cv2.waitKey(15) & 0xFF == ord('q'):
            break
    cap.release()
    cv2.destroyAllWindows()

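Answer:

Your code has a number of problems.

You must scale the boxes by the h and w taken from the image, not by the default Width and Height you used to convert the image into the YOLOv3 blob. The network's box values are normalized to [0, 1], so multiplying by the blob size (320x320) gives coordinates in the resized blob rather than in the frame you draw on.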

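Change

    w,h=int(detection[2]*Width),int(detection[3]*Height)
    x,y=int((detection[0]*Width)-(w/2)),int((detection[1]*Height)-(h/2))

to

    box_w,box_h=int(detection[2]*w),int(detection[3]*h)
    x,y=int((detection[0]*w)-(box_w/2)),int((detection[1]*h)-(box_h/2))

and append [x,y,box_w,box_h] to bound_box, so that the frame's w and h are not overwritten inside the loop.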
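As a minimal sketch of that scaling step, assuming each detection row starts with normalized (cx, cy, w, h): the helper name `to_pixel_box` and the parameters `frame_w` and `frame_h` are made up for illustration and are not part of the original code.

```python
def to_pixel_box(detection, frame_w, frame_h):
    """Convert a normalized YOLO row to a top-left pixel box [x, y, w, h].

    Assumes detection[0:4] is (cx, cy, w, h), each in the range [0, 1].
    """
    box_w = int(detection[2] * frame_w)
    box_h = int(detection[3] * frame_h)
    # Shift from the box centre to its top-left corner.
    x = int(detection[0] * frame_w - box_w / 2)
    y = int(detection[1] * frame_h - box_h / 2)
    return [x, y, box_w, box_h]


# A box centred in a 640x480 frame, a quarter of the width and half the height.
print(to_pixel_box([0.5, 0.5, 0.25, 0.5], frame_w=640, frame_h=480))
# → [240, 120, 160, 240]
```

Using separate names for the box size keeps the frame dimensions intact for every later detection in the same frame.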

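You also mixed up confs and confidence, which causes confusion: confs=scores[classIds] indexes the scores with the classIds list instead of the classId you just computed, and cv2.dnn.NMSBoxes is passed confs instead of the confidence list. You can check this against the Murtaza tutorial, though that takes some time.

There may also be a few small errors I missed.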
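One way to keep the per-detection score apart from the running lists is sketched below. It is written in plain Python so it can be tried without a model. The function name `collect_detections` and its parameters are illustrative assumptions, and `max(...)` over the indices stands in for `np.argmax`.

```python
def collect_detections(outputs, frame_w, frame_h, conf_threshold=0.5):
    """Gather boxes, class ids and confidences from YOLO-style output rows.

    Each row is assumed to be [cx, cy, w, h, objectness, class scores...].
    """
    boxes, class_ids, confidences = [], [], []
    for output in outputs:
        for detection in output:
            scores = detection[5:]
            # Index of the best class (what np.argmax would return).
            class_id = max(range(len(scores)), key=lambda i: scores[i])
            # Index with the scalar class_id, not with the class_ids list.
            conf = float(scores[class_id])
            if conf > conf_threshold:
                box_w = int(detection[2] * frame_w)
                box_h = int(detection[3] * frame_h)
                x = int(detection[0] * frame_w - box_w / 2)
                y = int(detection[1] * frame_h - box_h / 2)
                boxes.append([x, y, box_w, box_h])
                class_ids.append(class_id)
                confidences.append(conf)
    return boxes, class_ids, confidences


# Two fake rows with 3 classes each; only the first passes the threshold.
fake_outputs = [[
    [0.5, 0.5, 0.25, 0.5, 0.9, 0.1, 0.8, 0.05],
    [0.25, 0.25, 0.5, 0.5, 0.2, 0.1, 0.2, 0.3],
]]
print(collect_detections(fake_outputs, frame_w=100, frame_h=200))
# → ([[37, 50, 25, 100]], [1], [0.8])
```

The returned `confidences` list, one float per kept box, is what would then be passed as the scores argument of `cv2.dnn.NMSBoxes(boxes, confidences, conf_threshold, nms_threshold)`.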
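One further thing that may be worth checking is the `i[0]` indexing on the result of `getUnconnectedOutLayers()`. Older OpenCV builds return one-element arrays there, while newer ones (reportedly 4.5.4 and later) return plain scalars, so `i[0]` can fail after an upgrade. The helper below is a hypothetical sketch that accepts both shapes. `output_layer_names` is not an OpenCV function, and the layer names in the demo are invented.

```python
def output_layer_names(layer_names, out_layer_ids):
    """Map 1-based output layer ids to layer names.

    Accepts ids either nested ([[2], [4]]) or flat ([2, 4]).
    """
    names = []
    for i in out_layer_ids:
        try:
            idx = int(i[0])   # nested form: one-element sequence
        except (TypeError, IndexError):
            idx = int(i)      # flat form: plain scalar
        names.append(layer_names[idx - 1])  # ids start at 1, lists at 0
    return names


demo_names = ["conv_0", "yolo_1", "conv_2", "yolo_3"]
print(output_layer_names(demo_names, [[2], [4]]))  # → ['yolo_1', 'yolo_3']
print(output_layer_names(demo_names, [2, 4]))      # → ['yolo_1', 'yolo_3']
```

If the installed OpenCV provides it, calling `network.getUnconnectedOutLayersNames()` returns the names directly and avoids the index arithmetic.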

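Final solution:

To save you time, here is correct, working code for your project.

Note 1: I slightly changed how the coco.names labels are loaded, because your method did not work well on my MacBook Pro.

Note 2: In my code, you need to change the file paths back to the ones in your original code: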

yolov3_cfg='/root/Downloads/ML TASK/yolov3.cfg'

yolov3_weights='/root/Downloads/ML TASK/yolov3.weights'
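The changed label-loading method mentioned in Note 1 is not shown here. One possible version, offered only as a sketch, strips whitespace and skips blank lines so that a trailing newline or stray carriage returns do not end up in the class names. The function name `load_class_names` is an assumption, and the demo writes a small temporary file rather than the real coco.names.

```python
import os
import tempfile


def load_class_names(path):
    """Read one class label per line, ignoring blank lines and surrounding whitespace."""
    with open(path, "r", encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


# Demo with a throwaway file standing in for coco.names.
with tempfile.NamedTemporaryFile("w", suffix=".names", delete=False) as tmp:
    tmp.write("person\nbicycle\n\ncar\n")

print(load_class_names(tmp.name))  # → ['person', 'bicycle', 'car']
os.remove(tmp.name)
```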




