How to detect multiple objects in a frame using Keras

【Question title】: How to detect multiple objects in a frame using Keras
【Posted】: 2021-09-16 04:46:47
【Question description】:

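I trained a model with Keras on a dataset of airplanes, chairs, and cups. Training went well, and the model detects objects well. I followed this tutorial.

During detection I noticed that it detects only one object per frame. For example, if a frame contains both an airplane and a chair, it should ideally detect both, but it reports only the one with the higher confidence.

Below is the code I use for detection: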

from tensorflow.keras.preprocessing.image import img_to_array
from tensorflow.keras.preprocessing.image import load_img
from tensorflow.keras.models import load_model
import numpy as np
import mimetypes
import argparse
import imutils
import pickle
import cv2
import os


imagePaths = os.path.join(os.path.dirname(__file__), 'test')
image_list = os.listdir(imagePaths)

model = load_model("output/detector.model")
lb = pickle.loads(open(config.LB_PATH, "rb").read())

for test_image in image_list:
    imagePath = os.path.join(imagePaths, test_image)
    image1 = cv2.imread(imagePath)
    image = load_img(imagePath, target_size=(224, 224))
    image = img_to_array(image) / 255.0
    image = np.expand_dims(image, axis=0)
    print(model.predict(image))
    (boxPreds, labelPreds) = model.predict(image)

    (startX, startY, endX, endY) = boxPreds[0]
    i = np.argmax(labelPreds, axis=1)
    label = lb.classes_[i][0]

    confidence = labelPreds.max()
    
    print(labelPreds.min())
    text = str(label + " " + str(confidence))

    image = cv2.imread(imagePath)
    image = imutils.resize(image, width=600)
    (h, w) = image.shape[:2]
    startX = int(startX * w)
    startY = int(startY * h)
    endX = int(endX * w)
    endY = int(endY * h)
    y = startY - 10 if startY - 10 > 10 else startY + 10
    cv2.putText(image, text, (startX, y), cv2.FONT_HERSHEY_SIMPLEX, 0.65, (0, 255, 0), 2)
    cv2.rectangle(image, (startX, startY), (endX, endY), (0, 255, 0), 2)

    cv2.imshow("Output", image)
    cv2.waitKey(0)

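How can I update the model so that, at prediction time, it predicts multiple objects in a frame? Please help. Thanks.

【Question comments】:

You draw only one box (the first one), so asking about multiple objects is moot.

Tags: python tensorflow keras object-detection

【Solution 1】:

This is only a guess: could the boxPreds array contain more than one prediction?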

(startX, startY, endX, endY) = boxPreds[0]
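
One way to check this guess is to look at the shapes of the two arrays returned by model.predict. The sketch below uses NumPy stand-ins for boxPreds and labelPreds; the values, the three-class layout, and the helper name describe_predictions are assumptions for illustration, not output from the asker's model. Keras uses the first axis of a prediction as the batch axis, so for a single input image a box array of shape (1, 4) would mean one box per image, not several.

```python
import numpy as np

def describe_predictions(box_preds, label_preds):
    """Return (number of rows, values per box, number of classes)."""
    box_preds = np.asarray(box_preds)
    label_preds = np.asarray(label_preds)
    return box_preds.shape[0], box_preds.shape[1], label_preds.shape[1]

# Hypothetical stand-ins shaped like the output for a batch of one image.
boxPreds = np.array([[0.125, 0.25, 0.5, 0.75]])
labelPreds = np.array([[0.05, 0.90, 0.05]])

n_rows, n_coords, n_classes = describe_predictions(boxPreds, labelPreds)
print("rows:", n_rows, "coords per box:", n_coords, "classes:", n_classes)
```

If the number of rows equals the number of images passed to predict, looping over boxPreds iterates over images, not over objects within one image.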

Try this:

# Load and resize the image once, before the loop, so earlier boxes are kept
image = cv2.imread(imagePath)
image = imutils.resize(image, width=600)
(h, w) = image.shape[:2]

for x in range(len(boxPreds)):
    (startX, startY, endX, endY) = boxPreds[x]

    # Class and confidence for this row only
    i = np.argmax(labelPreds[x])
    label = lb.classes_[i]
    confidence = labelPreds[x].max()
    text = "{} {:.2f}".format(label, confidence)

    # Scale the normalized box to pixel coordinates
    startX = int(startX * w)
    startY = int(startY * h)
    endX = int(endX * w)
    endY = int(endY * h)

    # Draw the label above the box, or below its top edge if there is no room
    y = startY - 10 if startY - 10 > 10 else startY + 10
    cv2.putText(image, text, (startX, y), cv2.FONT_HERSHEY_SIMPLEX, 0.65, (0, 255, 0), 2)
    cv2.rectangle(image, (startX, startY), (endX, endY), (0, 255, 0), 2)

cv2.imshow("Output", image)
cv2.waitKey(0)

Untested, but I hope you get the idea.
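
If the prediction arrays do hold one row per detection, the per-row bookkeeping could be factored out roughly as follows. This is a minimal sketch, assuming each row of boxPreds is a normalized (startX, startY, endX, endY) box as in the question's code; the function name boxes_to_pixels, the min_conf threshold, and the sample arrays are hypothetical and not part of the original post.

```python
import numpy as np

def boxes_to_pixels(box_preds, label_preds, classes, w, h, min_conf=0.5):
    """Turn normalized boxes into pixel boxes with one label per row."""
    results = []
    for box, scores in zip(np.asarray(box_preds), np.asarray(label_preds)):
        i = int(np.argmax(scores))        # best class for this row only
        confidence = float(scores[i])
        if confidence < min_conf:         # skip weak rows (assumed threshold)
            continue
        startX, startY, endX, endY = box
        results.append((classes[i], confidence,
                        int(startX * w), int(startY * h),
                        int(endX * w), int(endY * h)))
    return results

# Hypothetical data: two rows, as if two objects had been predicted.
classes = ["airplane", "chair", "cup"]
boxPreds = np.array([[0.125, 0.25, 0.5, 0.75],
                     [0.5, 0.5, 1.0, 1.0]])
labelPreds = np.array([[0.9, 0.05, 0.05],
                       [0.2, 0.7, 0.1]])

detections = boxes_to_pixels(boxPreds, labelPreds, classes, w=600, h=400)
for det in detections:
    print(det)
```

Each returned tuple holds a label, a confidence, and four pixel coordinates, which can be passed to cv2.putText and cv2.rectangle in the same way as in the code above.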

【Comments】: