ResNet shows wrong predictions even when no object is present

【Question title】: Resnet is showing wrong predictions even without any object
【Posted】: 2020-06-13 21:25:42
【Question】:

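I am trying to use ResNet to classify plates as defective or non-defective ("fair"). I have a dataset of 2000 images for each of the two classes. But when I test it in real time, it shows "Defect" even when there is no object under the camera (neither a fair plate nor a defective one). My training code and test code are given below. Please help me find out what the problem might be.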

Test code:

    image =frame
    #print(image.shape)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    ret,thresh = cv2.threshold(gray,160,255,0)
    contours, _ = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    height, width, channels = image.shape
    min_x, min_y = width, height
    max_x = max_y = 0
    for cnt in contours:
        area = cv2.contourArea(cnt)
        if area > 400:
            (x, y, w, h) = cv2.boundingRect(cnt)
            min_x, max_x = min(x, min_x), max(x + w, max_x)
            min_y, max_y = min(y, min_y), max(y + h, max_y)
            if (w>200) and (h>200):
                roi1 = image[y:y + h, x:x + w]

                roi = cv2.resize(roi1, (64, 64))
                roi = roi.astype("float") / 255.0
                # order channel dimensions (channels-first or channels-last)
                # depending on our Keras backend, then add a batch dimension to
                # the image
                roi = img_to_array(roi)
                roi = np.expand_dims(roi, axis=0)
                # make predictions on the input image
                pred = model.predict(roi)
                if (pred[0][1]) > .96:
                    label = "Defect"
                    color = (0, 0, 255)
                    confidence = pred[0][1]

                elif (pred[0][0]) > .32:
                    label = "Fair"
                    color = (0, 255, 0)
                    confidence = pred[0][0]             
                #pred = pred.argmax(axis=1)[0]

                # label = "Fair" if pred == 0 else "Defected"
                # color = (0, 255, 0) if pred == 0 else (0, 0, 255)

                cv2.putText(roi1, label, (3, 20), cv2.FONT_HERSHEY_SIMPLEX, 0.5,
                    color, 2)
                cv2.putText(roi1, "confidence:"+str(confidence*100)+"%", (3, 40), 
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5,
                    (255,255,255), 2)

Training code:

NUM_EPOCHS = 500
BS = 32

# derive the path to the directories containing the training,
# validation, and testing splits, respectively
TRAIN_PATH = os.path.sep.join([args["dataset"], "training"])
VAL_PATH = os.path.sep.join([args["dataset"], "validation"])
TEST_PATH = os.path.sep.join([args["dataset"], "testing"])

# determine the total number of image paths in training, validation,
# and testing directories
totalTrain = len(list(paths.list_images(TRAIN_PATH)))
totalVal = len(list(paths.list_images(VAL_PATH)))
totalTest = len(list(paths.list_images(TEST_PATH)))

# initialize the training data augmentation object
trainAug = ImageDataGenerator(
    rescale=1 / 255.0,
    rotation_range=20,
    zoom_range=0.05,
    width_shift_range=0.05,
    height_shift_range=0.05,
    shear_range=0.05,
    horizontal_flip=True,
    fill_mode="nearest")

# initialize the validation (and testing) data augmentation object
valAug = ImageDataGenerator(rescale=1 / 255.0)

# initialize the training generator
trainGen = trainAug.flow_from_directory(
    TRAIN_PATH,
    class_mode="categorical",
    target_size=(64, 64),
    color_mode="rgb",
    shuffle=True,
    batch_size=32)

# initialize the validation generator
valGen = valAug.flow_from_directory(
    VAL_PATH,
    class_mode="categorical",
    target_size=(64, 64),
    color_mode="rgb",
    shuffle=False,
    batch_size=BS)

# initialize the testing generator
testGen = valAug.flow_from_directory(
    TEST_PATH,
    class_mode="categorical",
    target_size=(64, 64),
    color_mode="rgb",
    shuffle=False,
    batch_size=BS)

# initialize our Keras implementation of ResNet model and compile it
model = ResNet.build(64, 64, 3, 2, (2, 2, 3),
    (32, 64, 128, 256), reg=0.0005)
opt = SGD(lr=1e-1, momentum=0.9, decay=1e-1 / NUM_EPOCHS)
model.compile(loss="binary_crossentropy", optimizer=opt,
    metrics=["accuracy"])

# train our Keras model
H = model.fit_generator(
    trainGen,
    steps_per_epoch=totalTrain // BS,
    validation_data=valGen,
    validation_steps=totalVal // BS,
    epochs=NUM_EPOCHS)

# reset the testing generator and then use our trained model to
# make predictions on the data
print("[INFO] evaluating network...")
testGen.reset()
predIdxs = model.predict_generator(testGen,
    steps=(totalTest // BS) + 1)

# for each image in the testing set we need to find the index of the
# label with corresponding largest predicted probability
predIdxs = np.argmax(predIdxs, axis=1)

# show a nicely formatted classification report
print(classification_report(testGen.classes, predIdxs,
    target_names=testGen.class_indices.keys()))

# save the network to disk
print("[INFO] serializing network to '{}'...".format(args["model"]))
model.save(args["model"])

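When I test it on video, it is biased toward the defective class: the confidence score for fair plates is very low, while the confidence score for defective plates is very high.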

【Question comments】:

Tags: machine-learning keras deep-learning computer-vision resnet

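【Solution 1】:

A network like this outputs a probability distribution over its classes. That does not include a "none" class unless you add one. If you are in a lab environment (which it sounds like you are), the regularity of your test area will probably allow you to add one. The network will always have some highest-scoring output class, even when all the raw scores are very low, because with a softmax activation the outputs across classes always sum to 1. There are ways to deal with this in various situations, but it is largely considered a pressing area of active research.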
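To make the point about softmax concrete, here is a minimal NumPy sketch. It is not the asker's model, and the raw scores are made up for illustration. It shows that a two-class softmax always sums to 1 and always has a top class, even when both raw scores are weak, so an empty scene still receives a label.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = np.asarray(logits, dtype=float)
    z = z - z.max(axis=-1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical raw scores for [Fair, Defect]; the values are illustrative only.
strong_evidence = softmax([0.5, 6.0])   # large gap between the two scores
weak_evidence = softmax([-4.0, -3.0])   # both raw scores are low

for name, p in [("strong", strong_evidence), ("weak", weak_evidence)]:
    print(name, p, "sum =", p.sum(), "argmax =", int(p.argmax()))
```

In the weak case the second class still gets roughly 73% of the probability mass, because softmax only sees the difference between the raw scores, not how low they both are.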

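Beyond that, your network's performance cannot be predicted in advance; that is true of the whole field of applied AI. You will need to improve it through experimentation.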

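【Discussion】:

So what do you suggest: binary_crossentropy or categorical, some other activation function, or something else?
You could try a binary cross-entropy loss with a sigmoid output activation. That gives each class probability its own value on the interval [0,1], rather than values that sum to one (as with softmax). However, my thought was that if you are using this in a controlled environment, you could simply tape a big pink "X" with colored tape onto the table beneath the camera, ideally where it is covered up while you are inspecting a plate. Then you can train the big pink "X" as the "none" class.
Thanks for the nice suggestion. Let me try it, and I will get back to you with the results.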
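The sigmoid suggestion in the comment above can be sketched in plain NumPy as follows. This is a hedged illustration, not Keras code: the function name `decide`, the class order in `LABELS`, the threshold of 0.5, and the raw scores are all assumptions. With independent sigmoid outputs, both class scores can be low at the same time, which makes it possible to report "None" instead of forcing a choice.

```python
import numpy as np

LABELS = ("Fair", "Defect")  # hypothetical class order

def sigmoid(x):
    """Element-wise logistic function; each output lies in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))

def decide(logits, threshold=0.5):
    """Return (label, score); label is "None" if no class reaches the threshold."""
    scores = sigmoid(logits)
    best = int(scores.argmax())
    if scores[best] < threshold:
        return "None", float(scores[best])
    return LABELS[best], float(scores[best])

print(decide([-4.0, -3.0]))  # both scores are low, so no class is reported
print(decide([-2.0, 3.0]))   # the Defect score clears the threshold
```

Whether a trained network really produces low scores on an empty scene depends on the training data, which is why the comment also proposes an explicit "none" class. For that route, the training directory would presumably gain a third subfolder of background images (`flow_from_directory` infers classes from subfolders) and the class count passed to the model would become 3.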