How is Spark reading my image using the image format?

【Question title】: How is Spark reading my image using the image format?
【Posted】: 2019-10-29 16:42:43
【Question】:

This may be a silly question, but I can't figure out how Spark reads my image when I load it with spark.read.format("image").load(....).

After importing my image, I get the following:

>>> image_df.select("image.height","image.width","image.nChannels", "image.mode", "image.data").show()
+------+-----+---------+----+--------------------+
|height|width|nChannels|mode|                data|
+------+-----+---------+----+--------------------+
|   430|  470|        3|  16|[4D 55 4E 4C 54 4...|
+------+-----+---------+----+--------------------+

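From this I concluded that:

my image is 430x470 pixels,
my image is a color one (RGB, since nChannels = 3), which is an openCV-compatible type,
my image mode is 16, which corresponds to a particular openCV byte order.
- Does anyone know which website/documentation I could browse to learn more about this?
the data in the data column is of type Binary, but:
- when I run image_df.select("image.data").take(1), I get an output that seems to be a single array (see below).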

>>> image_df.select("image.data").take(1)

# 1/ Here are the last elements of the result
....<<One Eternity Later>>....x92\x89\x8a\x8d\x84\x86\x89\x80\x84\x87~'))]

# 2/ I also got several parts of the result that look like:
.....\x89\x80\x80\x83z|\x7fvz}tpsjqtkrulsvmsvmsvmrulrulrulqtkpsjnqhnqhmpgmpgmpgnqhnqhn
qhnqhnqhnqhnqhnqhmpgmpgmpgmpgmpgmpgmpgmpgnqhnqhnqhnqhnqhnqhnqhnqhknejmdilcilchkbh
kbilcilckneloflofmpgnqhorioripsjsvmsvmtwnvypx{ry|sz}t{~ux{ry|sy|sy|sy|sz}tz}tz}tz}
ty|sy|sy|sy|sz}t{~u|\x7fv|\x7fv}.....

What follows relates to the results shown above. My confusion may be due to my lack of knowledge about openCV (or something else). Nonetheless:

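1/ I don't understand this: if I have an RGB image, I should have 3 matrices, yet the output ends with .......\x84\x87~'))]. I was expecting something more like [(...),(...),(...\x87~')].
2/ Does the part shown in 2/ above have a special meaning? Are those characters separators between the matrices, or something else?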

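To be clearer about what I am trying to achieve: I want to process images in order to compare them pixel by pixel. I therefore want to know the pixel values at a given position in an image (I assume that, for an RGB image, a given position should have 3 pixel values).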

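Example: say I have a webcam pointed at the sky, during daytime only, and I want to know the pixel values at the position corresponding to the top-left part of the sky. I find that the concatenation of those values gives the color Light Blue, which means the photo was taken on a sunny day. Assume that only a sunny day can produce the color Light Blue.
Next, I want to compare that concatenation with the concatenation of the pixel values at exactly the same position in a photo taken the following day. If they are not equal, I conclude that the second photo was taken on a cloudy/rainy day. If they are equal, it was a sunny day.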
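
The comparison described above could be sketched roughly as follows. This is only an illustration under stated assumptions: both photos are assumed to be already available as NumPy arrays of shape (height, width, 3), the images here are synthetic, and the helper names `patch_mean` and `same_color`, the patch size, and the tolerance are all hypothetical. A tolerance is used because exact equality between two real photos is unlikely to hold.

```python
import numpy as np

def patch_mean(img, top, left, size):
    """Mean value per channel over a size x size patch (hypothetical helper)."""
    patch = img[top:top + size, left:left + size].astype(np.float64)
    return patch.reshape(-1, img.shape[2]).mean(axis=0)

def same_color(img_a, img_b, top=0, left=0, size=20, tol=30.0):
    """True if the two patches differ by at most `tol` on every channel."""
    diff = np.abs(patch_mean(img_a, top, left, size)
                  - patch_mean(img_b, top, left, size))
    return bool(np.all(diff <= tol))

# Synthetic stand-ins for three photos; three channel values per pixel.
shape = (430, 470, 3)
sunny = np.full(shape, (230, 216, 173), dtype=np.uint8)       # a light blue
also_sunny = np.full(shape, (225, 212, 170), dtype=np.uint8)  # nearly the same
cloudy = np.full(shape, (128, 128, 128), dtype=np.uint8)      # grey

print(same_color(sunny, also_sunny))  # top-left patches are close
print(same_color(sunny, cloudy))      # top-left patches are far apart
```

Averaging over a small patch rather than reading a single pixel makes the check less sensitive to noise; whether that is appropriate depends on the images.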

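Any help with this would be much appreciated. I simplified my example to make it easier to understand, but my goal is pretty much the same. I know ML models exist for this kind of thing, but I would be happy to try this first. My first goal is to split this column into 3 columns, one per color: a red matrix, a green matrix, and a blue matrix.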

【Question discussion】:

Tags: image-processing pyspark apache-spark-sql pixel

【Solution 1】:

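I think I have worked out the logic. I used the keras.preprocessing.image.img_to_array() function to understand how the values are arranged (since I have an RGB image, I must have 3 matrices, one per color: R, G, B). I am posting this in case someone wants to know how it works. I may be wrong, but I think I am onto something: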

from keras.preprocessing import image
import numpy as np
from PIL import Image

# Using spark built-in data source
first_img = spark.read.format("image").schema(imageSchema).load(".....")
raw = first_img.select("image.data").take(1)[0][0]
np.shape(raw)
(606300,) # which is 470*430*3



# Using keras function
img = image.load_img(".../path/to/img")
yy = image.img_to_array(img)
>>> np.shape(yy)
(430, 470, 3) # the shape is right, but the channel order is reversed (raw is B, G, R; Keras gives R, G, B):

>>> raw[0], raw[1], raw[2]
(77, 85, 78)
>>> yy[0][0]
array([78., 85., 77.], dtype=float32)

# Therefore I used the numpy reshape function directly on raw
# to get 430 rows of 470 pixels, each pixel holding 3 channel values:

array = np.reshape(raw, (430,470,3))
xx = image.img_to_array(array)     # OPTIONAL and not used here

>>> array[0][0] == (raw[0],raw[1],raw[2])
array([ True,  True,  True])

>>> array[0][1] == (raw[3],raw[4],raw[5])
array([ True,  True,  True])

>>> array[0][2] == (raw[6],raw[7],raw[8])
array([ True,  True,  True])

>>> array[0][3] == (raw[9],raw[10],raw[11])
array([ True,  True,  True])

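So, if I understand correctly, Spark reads the image as one big array, here (606300,), in which the elements are ordered and each one is a single color channel value. Judging from the comparison with Keras above (raw starts with 77, 85, 78 where Keras gives 78, 85, 77), the channels are stored in B, G, R order rather than R, G, B, which matches the OpenCV convention.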
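
To make the flat layout concrete, here is a minimal sketch using a synthetic buffer in place of the real image.data column. The dimensions are the ones reported in the question; the helper `flat_index` is a hypothetical name, and the layout it encodes (row by row, with the channel values of each pixel stored next to each other) is an assumption consistent with the reshape check above. The last lines also illustrate why long runs of letters show up in the printed output: Python displays any byte that happens to be printable ASCII as a character.

```python
import numpy as np

height, width, n_channels = 430, 470, 3  # values reported in the question

# Synthetic stand-in for image.data: one byte per channel value.
raw = (np.arange(height * width * n_channels) % 256).astype(np.uint8).tobytes()

def flat_index(row, col, channel, width=width, n_channels=n_channels):
    """Offset of one channel value in a row-major, channel-interleaved buffer."""
    return (row * width + col) * n_channels + channel

# View the same bytes as a (height, width, channels) array, without copying.
pixels = np.frombuffer(raw, dtype=np.uint8).reshape(height, width, n_channels)

# The reshaped view and the manual offset agree for an arbitrary pixel.
row, col = 10, 20
for channel in range(n_channels):
    assert pixels[row, col, channel] == raw[flat_index(row, col, channel)]

# Printable bytes are shown as ASCII characters in a bytes repr,
# so b"nqh" is simply the three values 110, 113, 104.
ascii_run = list(b"nqh")
print(len(raw), ascii_run)
```

Under this reading, the repeated "nqh", "mpg" and similar runs in the question are neighboring pixels with nearly identical values, not separators.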
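After my small conversion, I get 430 matrices of 470 rows x 3 columns. Since my image is 470x430 (width x height), each matrix corresponds to one height position, that is, one row of pixels. Inside each matrix there is one row per width position (470) and one column per color channel (3).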
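
Splitting the reshaped array into one matrix per color can then be sketched as below. This is not taken from the session above: the buffer is synthetic (every pixel repeats the first pixel seen there, 77, 85, 78), and the B, G, R channel order is an assumption based on the comparison with Keras.

```python
import numpy as np

height, width = 430, 470  # values reported in the question

# Synthetic stand-in for image.data: every pixel is (77, 85, 78),
# the first three bytes observed in the session above.
raw = bytes([77, 85, 78]) * (height * width)

bgr = np.frombuffer(raw, dtype=np.uint8).reshape(height, width, 3)

# Assumption: channels are stored in B, G, R order (OpenCV convention).
blue, green, red = bgr[..., 0], bgr[..., 1], bgr[..., 2]

# Reversing the last axis gives the R, G, B order that Keras returned.
rgb = bgr[..., ::-1]

print(blue.shape, rgb[0, 0])
```

Each of `blue`, `green` and `red` is a 430 x 470 matrix, so `red[row, col]` is the red value of the pixel at that position.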

Hope this helps someone :)!

【Discussion】:

How did you extract the RGB values?