Find Similar Pictures
This search method enables you to find similar or almost identical pictures. A checksum is calculated for each image and then compared with the checksums of all other images.
- Comparison Method
- Hints
- Image Formats
- Percentage Match
- Picture Area
- Compare Size
- Checksum
- Compare only pictures with the same properties
- Detect picture modifications (slower)
- Recognition Rate Test
- Examples
Comparison Method
The comparison methods aHash, bHash, dHash, mHash and pHash enable you to find similar or almost identical pictures by using a percentage match lower than 100%. To find exactly identical pictures, use a percentage match of 100% or the comparison methods MD5/SHA. An overview of the recognition rate of the comparison methods aHash, bHash, dHash, mHash and pHash can be found in the section Recognition Rate Test. More information about these comparison methods is available on the Internet in the articles Testing different image hash functions and Detection of Duplicate Images Using Image Hash Functions.
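The percentage match between two checksums can be pictured as the share of bits they have in common. How the program actually computes it is not stated here. The sketch below assumes the common approach for bit-based image hashes: count the differing bits (Hamming distance) and convert the remainder into a percentage. The function name `percent_match` and the sample checksums are hypothetical.

```python
def percent_match(hash_a: int, hash_b: int, bits: int = 64) -> float:
    """Share of identical bits between two checksums, as a percentage.

    Assumption: similarity = 1 - (Hamming distance / checksum size).
    The formula used by the program itself is not documented here.
    """
    differing = bin(hash_a ^ hash_b).count("1")  # Hamming distance
    return 100.0 * (bits - differing) / bits


a = 0xFFFF0000FFFF0000
b = a ^ 0b1111  # same checksum with the lowest 4 bits flipped
print(percent_match(a, a))  # identical checksums
print(percent_match(a, b))  # 60 of 64 bits agree
```

Under this assumption, a 64-bit checksum can only produce match values in steps of about 1.6%, while a 256-bit checksum allows finer steps.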
aHash
The comparison method aHash (Average Hash) resizes the image to 8x8 or 16x16 pixels. The image is then converted to grayscale and the average color value of all image pixels is calculated. Each pixel is then compared with the average color value to calculate the checksum.
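The aHash steps can be sketched in a few lines. This is an illustration only, not the program's implementation: it assumes the resize and grayscale steps have already produced a grid of brightness values, and that a pixel contributes a 1 bit when it is brighter than the average. The name `average_hash` and the sample grid are hypothetical.

```python
def average_hash(gray: list[list[int]]) -> int:
    """Build a checksum from an already resized grayscale grid.

    Assumption: `gray` is the 8x8 (or 16x16) grid of brightness values
    (0-255) left after the resize and grayscale steps; a pixel sets a
    1 bit when it is brighter than the average of all pixels.
    """
    pixels = [value for row in gray for value in row]
    average = sum(pixels) / len(pixels)
    checksum = 0
    for value in pixels:
        checksum = (checksum << 1) | (1 if value > average else 0)
    return checksum


# Hypothetical 8x8 grid: dark left half, bright right half.
grid = [[0] * 4 + [255] * 4 for _ in range(8)]
print(f"{average_hash(grid):016x}")
```

An 8x8 grid yields 64 bits and a 16x16 grid yields 256 bits, which matches the checksum sizes listed for aHash.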
bHash
The comparison method bHash (Blockhash) resizes the image to 128x128, 256x256 or 512x512 pixels. The image is divided into a matrix of blocks and the median value of all blocks is calculated to create the checksum. The options "Fast" and "Precise" enable you to influence the accuracy of the checksum calculation.
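A strongly simplified sketch of a block-based checksum follows. It assumes a square grayscale grid whose side length is a multiple of the number of blocks, and it sets one bit per block whose brightness sum exceeds the median of all block sums. The "Fast" and "Precise" options are not modeled, and `block_hash` is a hypothetical name.

```python
from statistics import median


def block_hash(gray: list[list[int]], blocks_per_side: int) -> int:
    """Simplified block-based checksum for a square grayscale grid.

    Assumptions: the side length is a multiple of `blocks_per_side`, and
    a block sets a 1 bit when its brightness sum exceeds the median of
    all block sums.
    """
    block = len(gray) // blocks_per_side
    sums = []
    for by in range(blocks_per_side):
        for bx in range(blocks_per_side):
            total = sum(
                gray[y][x]
                for y in range(by * block, (by + 1) * block)
                for x in range(bx * block, (bx + 1) * block)
            )
            sums.append(total)
    middle = median(sums)
    checksum = 0
    for total in sums:
        checksum = (checksum << 1) | (1 if total > middle else 0)
    return checksum


# Hypothetical 8x8 grid split into 2x2 blocks; only the top-left block is bright.
grid = [[200 if x < 4 and y < 4 else 10 for x in range(8)] for y in range(8)]
print(f"{block_hash(grid, 2):04b}")
```

In this sketch, 16 blocks per side would produce 16 x 16 = 256 bits, the checksum size listed for bHash in the tables of this page.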
dHash
The comparison method dHash (Difference Hash) resizes the image to 8x8 or 16x16 pixels. The image is then converted to grayscale and the checksum is created by comparing the brightness values of neighboring pixels.
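The neighbor comparison can be illustrated as follows. Which neighbors the program compares is not specified here, so the sketch makes two assumptions: each pixel is compared with its right-hand neighbor, and the last pixel of a row wraps around to the first so that an 8x8 grid yields exactly 64 bits. The name `difference_hash` is hypothetical.

```python
def difference_hash(gray: list[list[int]]) -> int:
    """Checksum from brightness differences between horizontal neighbors.

    Assumptions: `gray` is the resized grayscale grid, each pixel is
    compared with its right-hand neighbor, and the last pixel of a row
    wraps around to the first one.
    """
    checksum = 0
    for row in gray:
        width = len(row)
        for x in range(width):
            right = row[(x + 1) % width]
            checksum = (checksum << 1) | (1 if row[x] > right else 0)
    return checksum


# Hypothetical 8x8 grid whose rows get darker from left to right.
grid = [[70, 60, 50, 40, 30, 20, 10, 0] for _ in range(8)]
print(f"{difference_hash(grid):016x}")
```

Because only the direction of each brightness change is stored, adding the same constant to every pixel leaves this checksum unchanged.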
mHash
The comparison method mHash (Median Hash) resizes the image to 8x8 or 16x16 pixels. The image is then converted to grayscale and the median color value of all image pixels is determined. Each pixel is then compared with the median color value to calculate the checksum.
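Assuming that mHash thresholds against the median, as its name suggests, the only difference from an average-based checksum is the reference value. The sketch below takes an already resized grayscale grid and is not the program's implementation. The name `median_hash` and the sample grid are hypothetical.

```python
from statistics import median


def median_hash(gray: list[list[int]]) -> int:
    """Checksum that sets a 1 bit for every pixel brighter than the median.

    Assumption: `gray` is the 8x8 (or 16x16) grid of brightness values
    left after the resize and grayscale steps.
    """
    pixels = [value for row in gray for value in row]
    middle = median(pixels)
    checksum = 0
    for value in pixels:
        checksum = (checksum << 1) | (1 if value > middle else 0)
    return checksum


# Hypothetical grid: four dark rows, three slightly brighter rows, one very bright row.
grid = [[10] * 8] * 4 + [[20] * 8] * 3 + [[250] * 8]
print(f"{median_hash(grid):016x}")
```

In this grid the median is 15, so the lower four rows set their bits. The average would be 43.75 because of the single very bright row, so an average-based checksum would set only the bits of the last row.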
pHash
The comparison method pHash (Perceptual Hash) resizes the image to 32x32 pixels. The image is then converted to grayscale and transformed by a discrete cosine transform (DCT). Next, the mean value of the 8x8 area in the top left of the transformed image, which holds the lowest frequencies, is determined. The checksum is then calculated by comparing each value in this area with the mean value.
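The pHash steps can be sketched with a plain two-dimensional DCT. This is a minimal illustration under stated assumptions, not the program's implementation: the input is the grayscale grid after resizing, the DCT is left unnormalized, and the mean includes every value of the top-left 8x8 area (some pHash variants leave out the first one). The names `dct_1d` and `perceptual_hash` are hypothetical.

```python
import math


def dct_1d(values: list[float]) -> list[float]:
    """Unnormalized DCT-II of a sequence."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(values))
        for k in range(n)
    ]


def perceptual_hash(gray: list[list[float]], low: int = 8) -> int:
    """Checksum from the low-frequency DCT values of a grayscale grid.

    Assumptions: `gray` is the 32x32 grid left after resizing and
    grayscale conversion, and the mean covers the whole top-left
    `low` x `low` area of the transformed grid.
    """
    rows = [dct_1d(row) for row in gray]               # transform each row
    cols = [dct_1d(list(col)) for col in zip(*rows)]   # then each column
    # cols[x][y] is the value for horizontal frequency x and vertical frequency y.
    area = [cols[x][y] for y in range(low) for x in range(low)]
    mean = sum(area) / len(area)
    checksum = 0
    for value in area:
        checksum = (checksum << 1) | (1 if value > mean else 0)
    return checksum


# Hypothetical 32x32 test pattern.
grid = [[(x * y) % 256 for x in range(32)] for y in range(32)]
print(f"{perceptual_hash(grid):016x}")
```

Because every value and the mean scale by the same factor, multiplying all brightness values by a positive constant does not change this checksum.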
MD5, SHA
These comparison methods can only be used to find exactly the same images. The following comparison methods are available:
Calculation time of the checksums
We measured the time required to create a checksum with the comparison methods aHash, bHash, dHash, mHash and pHash. The results are listed in the following table:
Comparison Method | Compare Size | Checksum | Time per Checksum |
---|---|---|---|
aHash | 8x8 | 64-bit | 0.0450 ms |
aHash | 16x16 | 256-bit | 0.1425 ms |
bHash (fast) | 256x256 | 256-bit | 9.6030 ms |
bHash (precise) | 256x256 | 256-bit | 28.4792 ms |
dHash | 8x8 | 64-bit | 0.0458 ms |
dHash | 16x16 | 256-bit | 0.0988 ms |
mHash | 8x8 | 64-bit | 0.1435 ms |
mHash | 16x16 | 256-bit | 1.1012 ms |
pHash | 32x32 | 64-bit | 8.6922 ms |
Hints
The following files are automatically excluded from the search:
- Files with a size of 0 bytes
- Pictures with a width or height smaller than the specified compare size
- Corrupted, invalid or incomplete pictures (*)
- Files with blocked read access (*)
Image Formats
Here you can specify which image formats should be checked during a search. Image files with the following file extensions are supported: 3FR, ARW, BMP, CR2, CRW, CUT, DCR, DIB, DNG, EMF, ERF, GIF, HDP, ICO, IFF, J2C, J2K, JP2, JPE, JPG, JPEG, JPX, JFIF, KDC, MDC, MEF, MOS, MRW, NEF, ORF, PEF, PBM, PCX, PGM, PNG, PPM, PSD, RAF, RAS, RAW, RW2, SRW, TGA, TIF, TIFF, RLE, WBMP, WEBP, WMF and X3F.
Percentage Match
Here you can specify the minimum percentage match of two pictures. The calculated percentage match between two pictures is shown in the column Match. The percentage always refers to the reference picture of a group, which is shown in a different text color.
Picture Area
This option enables you to specify the picture area to be used to create the checksum. The following options are available:
- entire picture
- area in the upper left corner
- area in the upper right corner
- area in the lower left corner
- area in the lower right corner
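As an illustration of the five options, the sketch below computes the pixel rectangle that each one could select. How large a corner area is in the program is not stated here, so the `fraction` parameter (half the width and height by default) is purely an assumption, as are the function name `area_box` and the option strings.

```python
def area_box(width: int, height: int, area: str = "entire",
             fraction: float = 0.5) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the selected picture area.

    Assumption: a corner area covers `fraction` of the width and of the
    height; the size actually used by the program is not documented here.
    """
    w, h = int(width * fraction), int(height * fraction)
    boxes = {
        "entire": (0, 0, width, height),
        "upper left": (0, 0, w, h),
        "upper right": (width - w, 0, width, h),
        "lower left": (0, height - h, w, height),
        "lower right": (width - w, height - h, width, height),
    }
    return boxes[area]


for name in ("entire", "upper left", "upper right", "lower left", "lower right"):
    print(name, area_box(1600, 1200, name))
```

Restricting the checksum to one corner can help when two pictures differ only in another part, for example by an added caption.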
Compare Size
Here you can specify the maximum width and height of the pictures to be compared. A lower compare size finds more similar pictures and speeds up the comparison. A higher compare size finds more identical pictures and fewer similar pictures, and needs more time for the comparison.
Checksum
The size of the checksum in bits is displayed here. The checksum size can only be changed when using the bHash comparison method.
Compare only pictures with the same properties
This option is applied before the option Detect picture modifications. The following picture properties are available:
File Name
This option enables you to compare only pictures with the same file name.
File Extension
This option enables you to compare only pictures with the same file extension.
Width and Height
This option enables you to compare only pictures with the same width and height.
Orientation
This option enables you to compare only pictures with the same orientation (portrait or landscape).
Aspect Ratio
This option enables you to compare only pictures with the same aspect ratio. The aspect ratio is calculated as width divided by height, and the result is truncated to one decimal place. For example, a picture with 1920x1080 pixels gives 1920 / 1080 = 1.777..., which is truncated to an aspect ratio of "1.7".
Detect picture modifications (slower)
This option enables you to detect picture modifications when comparing two pictures. For this purpose, each modification is applied to the picture to be compared and an additional checksum is created for each result. The following picture modifications can be detected:
- Rotated 90° to the right
- Rotated 180°
- Rotated 90° to the left
- Flipped horizontally
- Rotated 90° to the right and flipped horizontally
- Flipped vertically
- Rotated 90° to the left and flipped horizontally
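The seven modifications listed above can be generated from three basic operations. The sketch below works on a small grid of pixel values and is only meant to show how the variants relate to each other; the helper names and dictionary keys are hypothetical, and the program's own implementation is not described here.

```python
def rotate_right(grid):
    """Rotate the grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def flip_horizontal(grid):
    """Mirror the grid left to right."""
    return [list(row[::-1]) for row in grid]


def flip_vertical(grid):
    """Mirror the grid top to bottom."""
    return [list(row) for row in grid[::-1]]


def variants(grid):
    """Return the seven modified versions of the grid, keyed by description."""
    r90 = rotate_right(grid)
    r180 = rotate_right(r90)
    r270 = rotate_right(r180)  # same as rotating 90 degrees to the left
    return {
        "rotated 90 right": r90,
        "rotated 180": r180,
        "rotated 90 left": r270,
        "flipped horizontally": flip_horizontal(grid),
        "rotated 90 right, flipped horizontally": flip_horizontal(r90),
        "flipped vertically": flip_vertical(grid),
        "rotated 90 left, flipped horizontally": flip_horizontal(r270),
    }


sample = [[1, 2], [3, 4]]
for name, result in variants(sample).items():
    print(name, result)
```

Together with the unmodified picture, these seven variants cover all eight combinations of 90° rotations and mirroring. Creating an additional checksum for each variant is also why the option is slower.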
Recognition Rate Test
We have carried out tests with various comparison methods to determine the recognition rate for different image changes. For this purpose, 29 copies of a JPEG image (1600x1200px, 606KB) were created and their color, size and format changed. The tests always compared the original JPEG image and a modified copy of the image.
The following comparison methods were used in the test:
Comparison Method | Compare Size | Checksum |
---|---|---|
aHash¹ | 8x8 | 64-bit |
aHash² | 16x16 | 256-bit |
bHash¹ (fast) | 256x256 | 256-bit |
bHash² (precise) | 256x256 | 256-bit |
dHash | 16x16 | 256-bit |
mHash | 16x16 | 256-bit |
pHash | 32x32 | 64-bit |
The percentage values in the table below show how well a comparison method recognizes the change in the image copy:
Image Modification | aHash¹ | aHash² | bHash¹ | bHash² | dHash | mHash | pHash |
---|---|---|---|---|---|---|---|
Image reduction to 75% | 100% | 100% | 100% | 100% | 100% | 100% | 97% |
Image reduction to 50% | 100% | 100% | 100% | 100% | 99% | 100% | 100% |
Image reduction to 25% | 100% | 100% | 100% | 100% | 99% | 100% | 97% |
Image enlargement to 150% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
Image enlargement to 200% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
Conversion to grayscale | 97% | 98% | 97% | 93% | 97% | 96% | 97% |
Brightness increased by 30% | 95% | 99% | 100% | 99% | 98% | 97% | 91% |
Brightness decreased by 30% | 95% | 95% | 89% | 93% | 92% | 94% | 91% |
Contrast increased by 30% | 100% | 99% | 95% | 96% | 96% | 96% | 94% |
Contrast decreased by 30% | 100% | 99% | 100% | 100% | 98% | 100% | 94% |
JPEG quality reduced by 25% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
JPEG quality reduced by 50% | 100% | 100% | 100% | 100% | 99% | 100% | 100% |
JPEG quality reduced by 75% | 100% | 100% | 100% | 100% | 99% | 100% | 94% |
Image rotated 90° to the left | 47% | 52% | 68% | 54% | 46% | 50% | 47% |
Image rotated 90° to the right | 47% | 52% | 54% | 55% | 48% | 50% | 44% |
Image flipped vertically | 81% | 81% | 77% | 73% | 71% | 80% | 50% |
Image flipped horizontally | 69% | 59% | 62% | 64% | 54% | 61% | 47% |
Added white frame (30px) | 84% | 82% | 68% | 61% | 86% | 82% | 88% |
Added black frame (30px) | 94% | 94% | 92% | 89% | 92% | 92% | 88% |
Image height reduced to 80% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
Image width reduced to 80% | 100% | 100% | 100% | 100% | 99% | 100% | 100% |
Conversion to PNG 24bit | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
Conversion to BMP 24bit | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
Conversion to GIF 256 colors | 100% | 100% | 100% | 99% | 99% | 100% | 97% |
Cropped 100 pixels from the left edge | 92% | 91% | 88% | 71% | 82% | 92% | 78% |
Cropped 100 pixels from the right edge | 97% | 92% | 86% | 86% | 85% | 90% | 91% |
Cropped 100 pixels from the top edge | 94% | 88% | 86% | 83% | 86% | 89% | 84% |
Cropped 100 pixels from the bottom edge | 98% | 91% | 87% | 86% | 86% | 91% | 84% |
Cropped 100 pixels from all edges | 89% | 80% | 77% | 75% | 70% | 80% | 62% |
The following table gives you an overview of how many duplicates were found with the various comparison methods in the test. The evaluation was carried out with a Percentage Match of at least 70%, 80% and 90%.
Comparison Method | 70% | 80% | 90% |
---|---|---|---|
aHash¹ (64-bit) | 26 | 26 | 23 |
aHash² (256-bit) | 26 | 26 | 22 |
bHash¹ (fast) | 25 | 23 | 18 |
bHash² (precise) | 25 | 22 | 18 |
dHash (256-bit) | 26 | 24 | 19 |
mHash (256-bit) | 26 | 26 | 22 |
pHash (64-bit) | 26 | 24 | 19 |
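As a worked example of how a Percentage Match setting turns match values into a number of found duplicates, the sketch below takes the 29 values of the aHash¹ column from the recognition table and counts how many reach each threshold. It assumes that a copy counts as found when its match value is greater than or equal to the setting. The helper name `duplicates_found` is hypothetical.

```python
# Match values of the aHash¹ column, in table order (29 modified copies).
ahash_8x8 = [
    100, 100, 100, 100, 100, 97, 95, 95, 100, 100, 100, 100, 100,
    47, 47, 81, 69, 84, 94, 100, 100, 100, 100, 100, 92, 97, 94, 98, 89,
]


def duplicates_found(matches: list[int], minimum: int) -> int:
    """Count the copies whose match value reaches the minimum percentage."""
    return sum(1 for match in matches if match >= minimum)


for minimum in (70, 80, 90):
    print(f"{minimum}%: {duplicates_found(ahash_8x8, minimum)}")
```

For this column the counts are 26, 26 and 23, the same values as in the aHash¹ row of the table above.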
Examples
The displayed search results were created with the following settings:
- Comparison method: aHash
- Compare size: 16x16
- Checksum: 256-bit