Topic: Perceptual Hash: Average -- How to compare images  
define IMAGE as any input image.

Convert IMAGE to greyscale:
Code:
for each PIXEL in IMAGE:
    Set the current pixel to (R+G+B) / 3
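The greyscale step above can be sketched in Python, assuming the image is held as a flat list of (R, G, B) tuples (that storage format is an assumption for illustration, not a requirement):

```python
# Greyscale conversion for an image stored as a flat list of (R, G, B) tuples.
# A real image library will do this for you; this just shows the arithmetic.

def to_greyscale(pixels):
    """Replace each (R, G, B) pixel with the average of its channels."""
    return [(r + g + b) // 3 for (r, g, b) in pixels]

print(to_greyscale([(30, 60, 90), (200, 100, 0)]))  # [60, 100]
```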

Resize the image down to 8x8 pixels, regardless of its size or aspect ratio.
How to do this is beyond the scope of this post, and I suggest you use an existing image library for it.
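Purely for illustration (as said above, a real image library is the better choice), a crude nearest-neighbour downscale could look like this. `resize_nearest` and the flat-list pixel layout are hypothetical, not part of any library:

```python
def resize_nearest(pixels, width, height, new_w=8, new_h=8):
    """Nearest-neighbour downscale of a flat pixel list.

    Crude and blocky compared to proper resampling, but enough to get
    an 8x8 grid out of any input size.
    """
    out = []
    for y in range(new_h):
        src_y = y * height // new_h  # nearest source row
        for x in range(new_w):
            src_x = x * width // new_w  # nearest source column
            out.append(pixels[src_y * width + src_x])
    return out

small = resize_nearest(list(range(16 * 16)), 16, 16)
print(len(small))  # 64
```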

Compute the average colour value for all 64 pixels:
Code:
define AVERAGE as a 32-bit, unsigned integer.

for each PIXEL in IMAGE:
    Add the pixel colour to AVERAGE

Divide AVERAGE by the number of pixels (64 in our case).
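A minimal Python sketch of the averaging step, assuming the 64 greyscale values are in a flat list:

```python
def average_value(pixels):
    """Average greyscale value over all pixels (64 for an 8x8 image)."""
    return sum(pixels) // len(pixels)

print(average_value([10, 20, 30, 40]))  # 25
```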

Now use the average to create the hash value:
Code:
define HASH as a 64-bit, unsigned integer.

for each PIXEL (at index N, from 0 to 63) in IMAGE:
    If PIXEL is greater than AVERAGE:
        Set the Nth bit of HASH to 1.
    Otherwise:
        Set the Nth bit of HASH to 0.
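The hash-building loop above can be sketched in Python with bit shifts; the bit ordering is arbitrary, as long as it is the same for every image:

```python
def average_hash(pixels, average):
    """Build a 64-bit hash: bit N is 1 if pixel N is above the average."""
    hash_value = 0
    for n, pixel in enumerate(pixels):
        if pixel > average:
            hash_value |= 1 << n  # set bit N; bits default to 0
    return hash_value

print(average_hash([1, 9, 1, 9], 5))  # 10 (binary 1010: bits 1 and 3 set)
```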

Comparing hashes:

In order to compare two image hashes and see how similar the images actually are, we use something called the Hamming Distance. This simply counts the number of bits in which the two hashes differ. The larger the Hamming Distance, the more different the images are. A Hamming Distance of 0 means the images are identical, and a distance of 1-5 means the images are likely very similar. In our case the maximum possible distance is 64, which means the images are completely different.

Code:
define DISTANCE as a 64-bit, unsigned integer.
define HASH_A and HASH_B as the two hashes to compare.

for each BIT_A in HASH_A and BIT_B in HASH_B:
    If BIT_A is not equal to BIT_B:
        Increment DISTANCE by 1.
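In Python the whole loop collapses into an XOR plus a bit count; a minimal sketch:

```python
def hamming_distance(hash_a, hash_b):
    """Count the bits that differ between two hashes."""
    # XOR leaves a 1 in every position where the hashes disagree,
    # so counting the 1s gives the Hamming Distance.
    return bin(hash_a ^ hash_b).count("1")

print(hamming_distance(0b1010, 0b0110))  # 2
```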

Now we have the distance between the two hashes and we know how the two images compare to each other.
Whether two images are considered equal depends on your needs. Ideally, you will define some threshold distance: any distance larger than the threshold means the images are not the same, and the match is discarded.
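As a sketch of such a threshold check, using a hypothetical threshold of 5 (tune this value for your own data):

```python
THRESHOLD = 5  # example value only; pick one that suits your needs

def images_match(hash_a, hash_b, threshold=THRESHOLD):
    """Treat two images as the same if their hashes are within the threshold."""
    distance = bin(hash_a ^ hash_b).count("1")
    return distance <= threshold

print(images_match(0b1010, 0b1011))  # True (distance 1)
print(images_match(0, 2**64 - 1))    # False (distance 64)
```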
