JettIsOnTheNet / undupe

Duplicate image file mover command line tool. Built in Xcode 14.3.


undupe

I needed a simple command line tool for a first-pass sort of image duplicates, so I built undupe. It is written in C++ and built with Xcode 14.3. It takes a directory as input, hashes all files to find binary duplicates, then moves those duplicates to an output directory. This is a fast and efficient way to prune exact duplicates from a dataset before ranking the remaining images by similarity on a deeper level, which takes much more processing power.

I am sure this can be improved upon. For example, it does not buffer or process images in batches of n; it just grabs all the images and brute forces them. I only needed a first-pass prune of roughly 200k images in a single directory, and it was good enough for my use case, running in only a few minutes. If you want to contribute improvements, please open a pull request. I will verify and merge.
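The hash-and-group step described above could look roughly like the sketch below. This is not the code in main.cpp. The helper names `hash_file` and `find_duplicates` are hypothetical, and `std::hash` over the file bytes is an assumption; the real tool may hash differently. It needs C++17 for `std::filesystem`.

```cpp
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <functional>
#include <iterator>
#include <string>
#include <unordered_map>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: read a whole file into memory and hash its bytes.
// std::hash is an assumption here; undupe's actual hash function may differ.
std::size_t hash_file(const fs::path& file) {
    std::ifstream in(file, std::ios::binary);
    std::string bytes((std::istreambuf_iterator<char>(in)),
                      std::istreambuf_iterator<char>());
    return std::hash<std::string>{}(bytes);
}

// Hypothetical helper: map each content hash to the files that produced it.
// Only groups holding two or more files (the duplicates) are returned.
std::unordered_map<std::size_t, std::vector<fs::path>>
find_duplicates(const fs::path& dir) {
    std::unordered_map<std::size_t, std::vector<fs::path>> groups;
    for (const auto& entry : fs::directory_iterator(dir)) {
        if (entry.is_regular_file()) {
            groups[hash_file(entry.path())].push_back(entry.path());
        }
    }
    // Drop hashes that were seen only once.
    for (auto it = groups.begin(); it != groups.end();) {
        if (it->second.size() < 2) {
            it = groups.erase(it);
        } else {
            ++it;
        }
    }
    return groups;
}
```

Two different files can, in principle, share a hash value, so a stricter version would compare the bytes of files in the same group before treating them as duplicates.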

Works automatically with the following file extensions:

  • .jpg
  • .png
  • .gif
  • .webp
  • .bmp
  • .jpeg
  • .tiff
  • .heif
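A filter for the extensions listed above might be sketched as follows. The name `has_image_extension` is hypothetical, and lowercasing the extension before comparing is an assumption; the tool itself may match case-sensitively.

```cpp
#include <algorithm>
#include <array>
#include <cctype>
#include <filesystem>
#include <string>
#include <string_view>

namespace fs = std::filesystem;

// The extensions from the list above: lowercase, with the leading dot.
constexpr std::array<std::string_view, 8> kImageExtensions = {
    ".jpg", ".png", ".gif", ".webp", ".bmp", ".jpeg", ".tiff", ".heif"};

// Hypothetical helper: true if the path ends in one of the listed extensions.
// Lowercasing first is an assumption; undupe may compare case-sensitively.
bool has_image_extension(const fs::path& file) {
    std::string ext = file.extension().string();
    std::transform(ext.begin(), ext.end(), ext.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return std::find(kImageExtensions.begin(), kImageExtensions.end(),
                     std::string_view{ext}) != kImageExtensions.end();
}
```

Calling `has_image_extension(entry.path())` inside a directory loop would skip anything that is not one of the eight listed types.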

To build, clone this repo.

Open the Xcode project file, or create a new Xcode project and move main.cpp into it.

Build for running (menu: Product > Build For > Running, or Cmd+Shift+R).

The compiled binary will be in the build directory (menu: Product > Show Build Folder in Finder).

Navigate to ./Build/Products/Debug/ and move the binary to a directory that is in your PATH, such as /usr/local/bin. (On recent macOS versions, /usr/bin is protected by System Integrity Protection and cannot be written to.)

Then it can be run in terminal:

undupe -i <input_directory> -o <output_directory>

Example output:

$ undupe 
Usage: undupe -i <input directory> -o <output directory>

$ undupe -i input -o output
2849471691476935143: input/1.jpg input/copy.jpg
Move the duplicates to output directory? (y/n): y
Moved file: "output/copy.jpg"

Builds and runs on macOS Ventura 13.4. I have not tested it on any other platform. It can probably be built with gcc or g++ on Mac or Linux easily enough.

About

License: MIT License


Languages

Language: C++ 100.0%