Count red pixel values and plot a histogram in Python

【Question title】: Count red pixel values and plot histogram in Python
【Posted】: 2017-10-22 05:59:08
【Question description】:

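I have a set of images stored in 3 separate folders according to their type. I want to iterate through each type and count the red pixel values in each image. I set a range for red, from 200 to 256. I want to create a histogram for each type, then cluster the histograms and distinguish the 3 classes. My experience with Python is very limited, and I am stuck on how to isolate and count the red pixel values. I have attached my code and the resulting histogram for type 1, which is a straight line. Can someone help?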

import numpy as np
import cv2
import os.path
import glob
import matplotlib.pyplot as plt

## take the image, compute sum of all row colors and return the percentage
#iterate through every Type
for t in [1]:

    #load_files
    files = glob.glob(os.path.join("..", "data", "train", "Type_{}".format(t), "*.jpg"))
    no_files = len(files)

    #iterate and read
    for n, file in enumerate(files):
        try:
            image = cv2.imread(file)
            hist = cv2.calcHist([img], [0], None, [56], [200, 256])

            print(file, t, "-files left", no_files - n)

        except Exception as e:
            print(e)
            print(file)

plt.plot(hist)
plt.show()
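
For reference, in the snippet as posted the image is stored in `image` while `calcHist` is given `img`, channel index 0 is blue in OpenCV's BGR order, and `hist` is overwritten on every iteration. The following is only a sketch of what the code seems to be aiming for, assuming the goal is one histogram of red values in the 200 to 255 range summed over all images of a type. It uses NumPy instead of `cv2.calcHist`, and the helper name `red_histogram` and the synthetic images are made up for illustration:

```python
import numpy as np

def red_histogram(images_bgr, lo=200, hi=256):
    """Sum per-image histograms of the red channel over [lo, hi), one bin per value."""
    total = np.zeros(hi - lo, dtype=np.int64)
    for img in images_bgr:
        red = img[:, :, 2]  # BGR order: index 2 is the red channel
        counts, _ = np.histogram(red, bins=hi - lo, range=(lo, hi))
        total += counts  # accumulate instead of overwriting
    return total

# Two synthetic 4x4 BGR arrays standing in for files read with cv2.imread
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8) for _ in range(2)]
hist = red_histogram(images)
```

The result could then be plotted with something like `plt.plot(np.arange(200, 256), hist)`.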

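【Comments】:

Why are you converting BGR2RGB? If you only care about red, why do you run inRange on all 3 channels? Why not just grab the red channel (using cv2.split or plain numpy indexing) and work with that?
I tried image = cv2.imread(file) img = cv2.split(image), but the img it returns is neither a numpy array nor a scalar
What do you mean by numpy indexing?
Right, it is not a numpy array; it is a Python list holding one single-channel numpy array per original channel (e.g. a BGR image is split into 3 separate arrays). See: Numpy array indexing
So, to get the red channel, I should add a line im = img[0, :, :]?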
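
To make the indexing point concrete: for an image array of shape (height, width, 3), as `cv2.imread` returns for a colour image, the channels sit on the last axis, so `img[0, :, :]` selects the first row of pixels rather than a channel. A small sketch on a made-up array (`bgr` is synthetic data, not taken from the thread):

```python
import numpy as np

# Synthetic 2x3 image in B, G, R channel order
bgr = np.zeros((2, 3, 3), dtype=np.uint8)
bgr[:, :, 2] = 250   # fill the red channel
bgr[0, 0, 2] = 10    # one pixel with a low red value

red = bgr[:, :, 2]         # red channel, shape (2, 3)
first_row = bgr[0, :, :]   # first row of pixels, shape (3, 3), not a channel

# Number of pixels whose red value is at least 200
n_red = int(np.count_nonzero(red >= 200))
```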

Tags: python opencv numpy image-processing scikit-image

【Solution 1】:

Here is the solution I came up with. I took the liberty of refactoring and simplifying your code.

import os
import glob
import numpy as np
import matplotlib.pyplot as plt
from skimage import io

root = r'C:\Users\you\imgs'  # Change this appropriately (raw string so backslashes are not treated as escapes)
folders = ['Type_1', 'Type_2', 'Type_3']
extension = '*.bmp'  # Change if necessary
threshold = 150  # Adjust to fit your needs

n_bins = 5  # Tune these values to customize the plot
width = 2.
colors = ['cyan', 'magenta', 'yellow']
edges = np.linspace(0, 100, n_bins+1)
centers = .5*(edges[:-1]+ edges[1:])

# This is just a convenience class used to encapsulate data
class img_type(object):
    def __init__(self, folder, color):
        self.folder = folder
        self.percents = []
        self.color = color

lst = [img_type(f, c) for f, c in zip(folders, colors)]

fig, ax = plt.subplots()

for n, obj in enumerate(lst):
    filenames = glob.glob(os.path.join(root, obj.folder, extension))

    for fn in filenames:
        img = io.imread(fn)
        red = img[:, :, 0]
        obj.percents.append(100.*np.sum(red >= threshold)/red.size)

    h, _ = np.histogram(obj.percents, bins=edges)
    h = np.float64(h)
    h /= h.sum()
    h *= 100.
    # Offset each type's bars so the group is centred on the bin centre
    ax.bar(centers + (n - .5*(len(lst) - 1))*width, h, width, color=obj.color)

ax.legend(folders)
ax.set_xlabel('% of pixels whose red component is >= threshold')
ax.set_ylabel('% of images')
plt.show()

Note that I used scikit-image rather than OpenCV to read the images. If that does not suit you, insert import cv2 and change:

    img = io.imread(fn)
    red = img[:, :, 0]

to:

    img = cv2.imread(fn)
    red = img[:, :, 2]
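
The index changes from 0 to 2 because `skimage.io.imread` returns colour images in R, G, B order while `cv2.imread` returns them in B, G, R order. A minimal sketch on a synthetic array (no image file is read here; `rgb` and `bgr` are made-up data) showing that both indexings pick out the same channel:

```python
import numpy as np

# Synthetic 2x2 image in R, G, B order
rgb = np.zeros((2, 2, 3), dtype=np.uint8)
rgb[:, :, 0] = 220

# Reversing the last axis gives the same pixels in B, G, R order
bgr = rgb[:, :, ::-1]

red_from_rgb = rgb[:, :, 0]
red_from_bgr = bgr[:, :, 2]
same = np.array_equal(red_from_rgb, red_from_bgr)
```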

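【Comments】:

Thanks for your help, Tonechas. It works, but I have a few questions so that I fully understand what you did. Why did you set the threshold to 150? I know I can change it to any other number, but what does it represent? Also, this code compares the distribution of red pixels in each image according to type, so this is the training step. If I want to test a new image without knowing its type, how do I classify it as the correct type based on its red pixel count?
I set the threshold to an intermediate value of 150 because the images I used for testing had no pixels with high intensity values in the red channel. If I set the threshold to 220, I get this result
I will reply to the second part of your comment in this thread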
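
To illustrate what the threshold represents: it is the red intensity a pixel must reach to be counted, so raising it lowers the percentage computed for each image. A minimal sketch with a made-up red channel (the helper `percent_red` and the array values are hypothetical, chosen only for illustration):

```python
import numpy as np

def percent_red(red, threshold):
    """Percentage of pixels whose red value is >= threshold."""
    return 100.0 * np.count_nonzero(red >= threshold) / red.size

# Synthetic red channel with 8 pixels
red = np.array([[100, 160, 180, 230],
                [140, 150, 210, 255]], dtype=np.uint8)

p150 = percent_red(red, 150)  # 6 of 8 pixels reach 150
p220 = percent_red(red, 220)  # 2 of 8 pixels reach 220
```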