wagtail-images-deduplicator is a Wagtail app that detects duplicate images in the admin. It is built with imagehash.
Wagtail Images De-duplicator works with wagtail>=3.0.
Use pip to install this package:
pip install wagtail-images-deduplicator
- Add wagtail_images_deduplicator to your INSTALLED_APPS in your project's settings.
- Add the DuplicateFindingMixin to your custom image model, as in the following example:
from django.db import models
from wagtail.images.models import Image, AbstractImage, AbstractRendition
from wagtail_images_deduplicator.models import DuplicateFindingMixin

# The mixin comes first so that it extends the behaviour of AbstractImage.
class CustomImage(DuplicateFindingMixin, AbstractImage):
    admin_form_fields = Image.admin_form_fields

# A custom image model needs its own rendition model.
class CustomRendition(AbstractRendition):
    image = models.ForeignKey(
        CustomImage, on_delete=models.CASCADE, related_name="renditions"
    )

    class Meta:
        unique_together = (("image", "filter_spec", "focal_point_key"),)
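For reference, the settings touched by the two steps above might look like the fragment below. This is a sketch under assumptions: `WAGTAILIMAGES_IMAGE_MODEL` is Wagtail's standard setting for pointing at a custom image model, while the `images` app label is a hypothetical placeholder for whichever app defines `CustomImage` in your project.

```python
# settings.py (fragment, not a complete settings file)
INSTALLED_APPS = [
    # ... your existing Django and Wagtail apps ...
    "wagtail.images",
    "wagtail_images_deduplicator",
    "images",  # hypothetical app that defines CustomImage
]

# Tell Wagtail to use the custom image model ("app_label.ModelName").
WAGTAILIMAGES_IMAGE_MODEL = "images.CustomImage"
```

If your project already uses a custom image model, only the `INSTALLED_APPS` entry and the mixin are new.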
If you add the mixin to a model that already has image data, you will need to call save() on all existing instances to fill in the new hash value:
from wagtail.images import get_image_model

# Re-save every image so that its hash value is computed and stored.
for image in get_image_model().objects.all():
    image.save()
This setting determines the hash function to use.
Hash function | Reference | Setting name |
---|---|---|
Average hashing | http://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html | average_hash |
Perceptual hashing | http://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html | phash (default) |
Difference hashing | http://www.hackerfactor.com/blog/index.php?/archives/529-Kind-of-Like-That.html | dhash or dhash_vertical |
Wavelet hashing | https://fullstackml.com/2016/07/02/wavelet-image-hash-in-python/ | whash |
HSV color hashing | | colorhash |
Crop-resistant hashing | https://ieeexplore.ieee.org/document/6980335 | crop_resistant_hash |
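To give a feel for what these hash functions compute, here is a simplified pure-Python sketch of the idea behind average hashing. It is not the imagehash implementation: it assumes the image has already been reduced to a small grayscale grid, and the helper names `average_hash_bits` and `bits_to_hex` are made up for this illustration.

```python
def average_hash_bits(pixels):
    """Return one bit per pixel: 1 if the pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def bits_to_hex(bits):
    """Pack a list of 0/1 bits into a zero-padded hexadecimal string."""
    width = (len(bits) + 3) // 4
    return format(int("".join(map(str, bits)), 2), f"0{width}x")


# A tiny 2x2 "image" and a uniformly brighter copy of it.
grid = [[10, 200], [220, 30]]
brighter = [[30, 220], [240, 50]]

bits = average_hash_bits(grid)
brighter_bits = average_hash_bits(brighter)

print(bits, bits_to_hex(bits))  # [0, 1, 1, 0] 6
print(bits == brighter_bits)    # True
```

Because each bit only records whether a pixel is above or below the mean, a uniform brightness change leaves the hash unchanged, which is why near-identical images end up with identical or nearby hashes.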
This setting determines the maximum distance between the hashes of two images for them to be considered duplicates.
The default value is 5.
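The sketch below illustrates how such a threshold can be applied. It assumes that "distance" means the Hamming distance (number of differing bits) between two hashes, which is what imagehash computes when two of its standard hashes are subtracted, and that a distance equal to the threshold still counts as a duplicate. The function names and the sample hash strings are hypothetical and not part of this package.

```python
MAX_DISTANCE = 5  # the default value mentioned above


def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count the differing bits between two equal-length hex hash strings."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def is_duplicate(hash_a: str, hash_b: str, max_distance: int = MAX_DISTANCE) -> bool:
    """Assumption: a distance equal to the threshold still counts as a duplicate."""
    return hamming_distance(hash_a, hash_b) <= max_distance


original = "ffd8a0c0e0f0f8fc"
near_copy = "ffd8a0c0e0f0f8ff"  # differs from the original in 2 bits
unrelated = "00275f3f1f0f0703"  # every bit flipped, so 64 bits differ

print(hamming_distance(original, near_copy), is_duplicate(original, near_copy))  # 2 True
print(hamming_distance(original, unrelated), is_duplicate(original, unrelated))  # 64 False
```

A lower threshold flags fewer images and misses more edited copies, while a higher one catches more variants at the cost of more false positives.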
To help you assess how these different algorithms behave and to learn more about hash distances, check out the examples section of the imagehash library's README.