Duplicates
This Python script walks the current directory and builds an SQLite database recording the path, size, and mtime of each file. It then computes the md5 checksum of every duplicate candidate (a file that shares its byte size with at least one other file) and adds it to the database. Finally, it generates a CSV with the following information for each file whose content (md5sum) matches that of another file:
- location (path relative to the initial directory)
- md5sum, size, and mtime
- name (basename) and ext (file extension)
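
The steps above can be sketched roughly as follows. This is a minimal illustration, not the script itself: it uses the standard-library sqlite3 module in place of SQLAlchemy, keeps the database in memory, and the names find_duplicates, md5sum, and the files table are hypothetical. The CSV column order is also an assumption.

```python
import csv
import hashlib
import os
import sqlite3


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root, csv_path):
    """Index files under root, hash same-size candidates, write duplicates to a CSV."""
    # Assumption: an in-memory database; the actual script may persist it to disk.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files ("
        "location TEXT PRIMARY KEY, size INTEGER, mtime REAL, md5sum TEXT)"
    )

    # Step 1: record path, size, and mtime for every regular file.
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full = os.path.join(dirpath, filename)
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            st = os.stat(full)
            conn.execute(
                "INSERT INTO files (location, size, mtime) VALUES (?, ?, ?)",
                (os.path.relpath(full, root), st.st_size, st.st_mtime),
            )

    # Step 2: hash only the files that share a byte size with another file.
    candidates = conn.execute(
        "SELECT location FROM files WHERE size IN "
        "(SELECT size FROM files GROUP BY size HAVING COUNT(*) > 1)"
    ).fetchall()
    for (location,) in candidates:
        conn.execute(
            "UPDATE files SET md5sum = ? WHERE location = ?",
            (md5sum(os.path.join(root, location)), location),
        )

    # Step 3: keep the files whose checksum occurs more than once.
    rows = conn.execute(
        "SELECT location, md5sum, size, mtime FROM files WHERE md5sum IN "
        "(SELECT md5sum FROM files WHERE md5sum IS NOT NULL "
        "GROUP BY md5sum HAVING COUNT(*) > 1) "
        "ORDER BY md5sum, location"
    ).fetchall()
    conn.close()

    # Step 4: write the report, deriving name and ext from the location.
    with open(csv_path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["location", "md5sum", "size", "mtime", "name", "ext"])
        for location, digest, size, mtime in rows:
            name = os.path.basename(location)
            ext = os.path.splitext(name)[1]
            writer.writerow([location, digest, size, mtime, name, ext])
    return rows
```

Calling find_duplicates(".", "duplicates.csv") would index the current directory and write the report next to it. Hashing only same-size candidates avoids reading files that cannot possibly be duplicates, since two files with identical content always have identical sizes.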
Dependencies
- Python 3.9+
- SQLAlchemy 1.4+