YOLOv3 returns zero confidence for every class

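I am implementing YOLOv3 for multi-class object detection.

YOLO is a region-proposal-based algorithm that takes the proposal with the highest confidence as its prediction; you can read more about it here.

For this particular task I followed this Murtaza tutorial, which walked me through it from scratch.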

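Because a complex network architecture takes hours to train, I prefer transfer learning, i.e. using a pretrained network and its weights (parameters). You can find both here:
Architecture configuration: cfg
Network parameters (weights): weights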

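I am using YOLOv3-tiny here because I need a higher frame rate for video processing, but the results are nowhere near as promising as those shown in the tutorial, and I cannot tell where I went wrong. Switching the network cfg and weights files to the original YOLOv3 (320) does not give correct results either.

I do get all five spatial values, i.e. the coordinates and the confidence [cx, cy, h, w, confidence], but the probabilities for all 80 classes remain a zero vector [0.0, 0.0, 0.0, ..., 0.0]. Changing the video source to a different video still yields a zero vector, whereas in the tutorial all of this works.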

Implementation code:

    # YOLO Algorithm
    # Network weights and configuration files
    yolov3_tiny_cfg='/root/Downloads/ML TASK/yolov3-tiny.cfg' # configuration file
    yolov3_tiny_weights='/root/Downloads/ML TASK/yolov3-tiny.weights' # weights
    coco_names='/root/Downloads/ML TASK/coco.names' # coco classes

    # for yolo general 320 architecture
    # put paths to directory
    yolov3_cfg='/root/Downloads/ML TASK/yolov3.cfg'
    yolov3_weights='/root/Downloads/ML TASK/yolov3.weights'

    # Test videos
    Test_video_1='/root/Downloads/ML TASK/mn.mp4'
    Test_video_2='/root/Downloads/ML TASK/bg.mp4'

    # Dependencies
    import cv2
    import numpy as np

    # Dataset classes:
    # there are around 80 classes in the coco dataset, so writing them out manually would not be
    # the right choice; instead we read them from a file named coco.names stored on the drive
    # getting list of classes
    classes=[] # empty list initialization
    with open(coco_names,'r')as f:
        classes=f.read().splitlines()
    # viewing the multiclass list, around 80 classes in coco dataset

    # Loading yolov3 using configuration file and weights
    network=cv2.dnn.readNetFromDarknet(yolov3_cfg,yolov3_weights)
    network.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV) # to use opencv CPU as backend
    network.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)

    # NOTE: the network cannot be fed the image directly; we first have to preprocess it to match
    # the input shape and type of the network, i.e. a blob, which generally refers to a
    # mathematical form of binary images like a bitmap
    Width,Height=320,320 # square image so the network grid should be n*n, equal on both dimensions
    Confidence_Threshould=0.5 # minimum probability for claiming the prediction
    NMS_Threshould=0.3

    cap=cv2.VideoCapture('game.mp4')
    fps = cap.get(cv2.CAP_PROP_FPS)
    timestamps = [cap.get(cv2.CAP_PROP_POS_MSEC)]

    # function to find objects on captured video stream
    def findObjects(outputs,image):
        h,w,c=image.shape
        bound_box=[] # for feeding through function
        classIds=[]
        confidence=[]
        for output in outputs: # getting o/p from 2 layers (v3 tiny), 3 if using yolov3 320
            for detection in output:
                scores=detection[5:] # slice off the first five values because we use them for the bounding box
                classId=np.argmax(scores)
                confs=scores[classIds]
                # keep an object as a final prediction only when it exceeds the minimum confidence threshold
                if confs > Confidence_Threshould:
                    w,h=int(detection[2]*Width),int(detection[3]*Height) # to convert % into pixels
                    x,y=int((detection[0]*Width)-(w/2)),int((detection[1]*Height)-(h/2))
                    bound_box.append([x,y,w,h])
                    classIds.append(classId)
                    confidence.append(float(confs))
        print(len(bound_box))
        # to reduce the number of boxes on the frame we use NMS; it gives the indices of the boxes to keep
        indices=cv2.dnn.NMSBoxes(bound_box,confs,Confidence_Threshould,NMS_Threshould)
        for i in indices:
            i=i[0]
            box=bound_box[i]
            x,y,w,h=box[0],box[1],box[2],box[3]
            cv2.rectangle(image,(x,y),(x+w,y+h),(255,0,0),2)
            cv2.puttext(image,f'{classes[classIds[i]]}{int(confidence[i]*100)}%',
                        (x,y-10),cv2.FONT_HERSHEY_PLAIN,0.6,(0,255,0),2)
            cv2.puttext(image,f'FPS:{fps}',(0,150),cv2.FONT_HERSHEY_PLAIN,0.6,(0,255,0),2)
            cv2.puttext(image,f'TIMESTAMPS:{timestamps}',(150,0),cv2.FONT_HERSHEY_PLAIN,0.6,(0,255,0),2)

    while True:
        success,image=cap.read()
        # converting image into blob for network I/O processing
        try:
            blob=cv2.dnn.blobFromImage(image,1/255,(Width,Height),[0,0,0],crop=False)
        except:
            continue
        # I/P
        network.setInput(blob) # setting input
        # O/P
        # the general YOLO architecture produces three outputs from the respective layers, and the
        # maximum confidence decides the final predictions
        # but here there are only 2 outputs, as we are using the tiny version for higher frame rates
        # to get the outputs we have to know the names of the respective layers,
        # i.e. not names actually but indexes (starting from 1, not zero), via getUnconnectedOutLayers
        layers_names=network.getLayerNames()
        #print(network.getUnconnectedOutLayers()) # 36th and 48th indexes
        # looping over as we are traversing multiple values of OutLayers
        outputNames=[layers_names[i[0]-1]for i in network.getUnconnectedOutLayers()] # -1 because the indexes start from one, not zero
        #print(outputNames) # for v3 tiny the layer names are 16 and 23
        # forwarding the image to the network
        outputs=network.forward(outputNames)
        # finding objects
        # print(outputs[0].shape)=>(300,85) 300=>no. of boxes 85=>[cx,cy,height,width,confidence,probability of 80 classes]
        # using cx,cy,h,w we determine the bounding box
        # print(outputs[1].shape)=>(1200,85) 1200 boxes, in matrix form: 1200 rows of boxes, each an 85-vector as explained above
        #print(outputs[0][0])
        findObjects(outputs,image)
        cv2.imshow('Window',image)
        if cv2.waitKey(15) & 0xFF == ord('q'):
            break
    cap.release()
    cv2.destroyAllWindows()

Answer:

Your code has several problems.

  1. You must use the h, w taken from the image itself, not the default Width and Height you used to convert the image into a YOLOv3 blob.

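Change

    w,h=int(detection[2]*Width),int(detection[3]*Height)
    x,y=int((detection[0]*Width)-(w/2)),int((detection[1]*Height)-(h/2))

to

    # h, w are the frame dimensions from image.shape; keep the box size under separate names
    box_w, box_h = int(detection[2]*w), int(detection[3]*h)
    x, y = int((detection[0]*w) - box_w/2), int((detection[1]*h) - box_h/2)
    # then append [x, y, box_w, box_h] to bound_box
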
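  2. You mixed up confs and confidence, which is the source of the confusion: scores[classIds] indexes with the list classIds instead of the scalar classId, and NMSBoxes is passed confs rather than the confidence list. You can check against the Murtaza tutorial, but that takes some time.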
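
The two points above can be sketched as a small helper that decodes a single output row. This is only an illustration, not the code from the tutorial: the name decode_detection and its parameters img_w, img_h and conf_threshold are made up here, and it assumes each row is laid out as [cx, cy, w, h, objectness, class scores...] with the first four values normalised to 0..1. It is written in plain Python so it runs without OpenCV, and it also accepts a NumPy row.

```python
def decode_detection(detection, img_w, img_h, conf_threshold=0.5):
    """Turn one YOLO output row into (class_id, score, [x, y, w, h]) or None.

    Assumed row layout: [cx, cy, w, h, objectness, class_0 ... class_N],
    with cx, cy, w, h normalised to the 0..1 range.
    """
    scores = detection[5:]
    # Scalar index of the best class (the question indexed with the list classIds).
    class_id = max(range(len(scores)), key=lambda i: scores[i])
    score = float(scores[class_id])
    if score <= conf_threshold:
        return None
    # Scale by the frame size, not by the 320x320 blob size.
    box_w = int(detection[2] * img_w)
    box_h = int(detection[3] * img_h)
    x = int(detection[0] * img_w - box_w / 2)
    y = int(detection[1] * img_h - box_h / 2)
    return class_id, score, [x, y, box_w, box_h]


# Hypothetical row: a centred box covering half of a 640x480 frame, class 2 at 0.9.
row = [0.5, 0.5, 0.5, 0.5, 0.95] + [0.0, 0.0, 0.9] + [0.0] * 77
print(decode_detection(row, img_w=640, img_h=480))
# (2, 0.9, [160, 120, 320, 240])
```

Inside findObjects, the returned score would go into the confidence list, and that list (not a single confs value) is what NMSBoxes expects as its second argument.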

There may also be a few small errors that I missed.
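
One candidate, which depends on the installed OpenCV version: as far as I know, newer releases return flat integer arrays from getUnconnectedOutLayers() and NMSBoxes(), while older ones return one-element rows, so i[0] only works on the older layout. Treat the exact version boundary as an assumption. A hypothetical helper (flat_indices is not an OpenCV function) that accepts either shape could look like this:

```python
def flat_indices(indices):
    """Flatten [[3], [7]] or [3, 7] (or an empty result) into [3, 7]."""
    flat = []
    for item in indices:
        try:
            flat.append(int(item[0]))   # older layout: one-element sequences
        except (TypeError, IndexError):
            flat.append(int(item))      # newer layout: plain integers
    return flat


# Possible use with the names from the question (requires OpenCV, so left as comments):
# outputNames = [layers_names[i - 1] for i in flat_indices(network.getUnconnectedOutLayers())]
# for i in flat_indices(indices): box = bound_box[i]
print(flat_indices([[3], [7]]), flat_indices([3, 7]), flat_indices(()))
# [3, 7] [3, 7] []
```

Separately, the drawing call in OpenCV's Python bindings is spelled cv2.putText; cv2.puttext raises an AttributeError.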

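---------- Final solution: ----------

To save you time, here is a corrected version of the code for your project.

Note 1: I slightly changed the way the coco.names labels are loaded; your method did not work well on my MacBook Pro.

Note 2: In my code you need to change the file paths back to the ones in your original code.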
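
The changed loading code itself is not shown here, so the following is only a guess at a portable alternative, not the answerer's method. The helper name load_class_names is invented for this sketch; it strips whitespace and skips blank lines, so stray carriage returns or a trailing empty line do not end up as labels. The demo writes a small temporary file in place of coco.names.

```python
import os
import tempfile


def load_class_names(path):
    """Read one label per line, dropping blank lines and stray whitespace."""
    with open(path, "r", encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


# Demo with a hypothetical three-label file standing in for coco.names.
with tempfile.NamedTemporaryFile("w", suffix=".names", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write("person\nbicycle\r\ncar\n\n")
    demo_path = tmp.name

class_names = load_class_names(demo_path)
os.remove(demo_path)
print(class_names)
# ['person', 'bicycle', 'car']
```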

yolov3_cfg='/root/Downloads/ML TASK/yolov3.cfg'

yolov3_weights='/root/Downloads/ML TASK/yolov3.weights'
