pkolaczk / fclones

Efficient Duplicate File Finder

fclones scans a ton of files that have no chance of matching

KyleSanderson opened this issue

Runline: `fclones group --hidden --no-ignore -s1M --cache /mnt/*T*`

[2024-01-13 14:03:31.740] fclones:  info: Started grouping
[2024-01-13 14:03:48.012] fclones:  info: Scanned 633467 file entries
[2024-01-13 14:03:48.018] fclones:  info: Found 466729 (107.6 TB) files matching selection criteria
[2024-01-13 14:03:48.580] fclones:  info: Found 425599 (35.6 TB) candidates after grouping by size
[2024-01-13 14:03:49.004] fclones:  info: Found 425599 (35.6 TB) candidates after grouping by paths
[2024-01-13 14:22:18.464] fclones:  info: Found 73996 (5.6 TB) candidates after grouping by prefix
[2024-01-13 14:24:36.793] fclones:  info: Found 73858 (5.5 TB) candidates after grouping by suffix
[2024-01-13 22:14:39.468] fclones:  info: Found 73294 (5.5 TB) redundant files

The true hardlink size when running `link` is ~350 GB.

Can you elaborate on why you think so? Which files should not be scanned? It basically scans all the files in the given directory and then compares them by hashes.
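
To make the staged narrowing in the log above concrete, here is a rough sketch of how such a grouping pipeline can work. This is not the actual fclones code; the prefix length, the placeholder hash, and the exact set of stages (the real tool also groups by paths and by suffix) are simplifications and assumptions:

```rust
// Rough sketch of staged duplicate grouping -- NOT the actual fclones code.
// Assumption: files are bucketed by size, then by a hash of a small prefix,
// then by a full-content hash; only groups with more than one member survive
// each stage, so unique files are dropped without being read in full.
use std::collections::HashMap;
use std::fs::File;
use std::io::{self, Read};
use std::path::PathBuf;

const PREFIX_LEN: usize = 4096; // hypothetical prefix size

// Placeholder hash; a real tool would use a fast, strong 128-bit hash.
fn hash_bytes(buf: &[u8]) -> u64 {
    use std::hash::{Hash, Hasher};
    let mut h = std::collections::hash_map::DefaultHasher::new();
    buf.hash(&mut h);
    h.finish()
}

fn hash_prefix(path: &PathBuf) -> io::Result<u64> {
    let mut buf = vec![0u8; PREFIX_LEN];
    let n = File::open(path)?.read(&mut buf)?;
    Ok(hash_bytes(&buf[..n]))
}

fn hash_full(path: &PathBuf) -> io::Result<u64> {
    let mut buf = Vec::new();
    File::open(path)?.read_to_end(&mut buf)?;
    Ok(hash_bytes(&buf))
}

/// Keep only the buckets that still contain more than one candidate.
fn narrow<K>(groups: HashMap<K, Vec<PathBuf>>) -> Vec<Vec<PathBuf>> {
    groups.into_values().filter(|g| g.len() > 1).collect()
}

fn find_duplicates(files: Vec<(PathBuf, u64)>) -> io::Result<Vec<Vec<PathBuf>>> {
    // Stage 1: group by size -- no file contents are read yet.
    let mut by_size: HashMap<u64, Vec<PathBuf>> = HashMap::new();
    for (path, size) in files {
        by_size.entry(size).or_default().push(path);
    }
    let mut result = Vec::new();
    for group in narrow(by_size) {
        // Stage 2: group by prefix hash -- reads only the first few KB.
        let mut by_prefix: HashMap<u64, Vec<PathBuf>> = HashMap::new();
        for path in group {
            by_prefix.entry(hash_prefix(&path)?).or_default().push(path);
        }
        for group in narrow(by_prefix) {
            // Stage 3: group by full hash -- whole files, candidates only.
            let mut by_full: HashMap<u64, Vec<PathBuf>> = HashMap::new();
            for path in group {
                by_full.entry(hash_full(&path)?).or_default().push(path);
            }
            result.extend(narrow(by_full));
        }
    }
    Ok(result)
}
```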

The scan is 5.6 TB + 5.5 TB; if the prefix and suffix stages match, the total should be 5.5 TB or less, not 11 TB. Additionally, these files are not new, but it seems to scan them anyway even though --cache is specified.

Looks like half the problem is that the cache isn't updated when `link` runs, so fclones sees the files as new.
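
One plausible mechanism for that (an assumption about the cache design, not taken from the fclones source): if cached hashes are keyed by file metadata such as inode and modification time, then replacing a duplicate with a hardlink to another inode changes that metadata, so the next run misses the cache and re-hashes the file as if it were new. A minimal sketch of such a metadata-keyed cache:

```rust
// Minimal sketch of a metadata-keyed hash cache, assuming (hypothetically)
// that entries are keyed by (path, inode, mtime, size). This is NOT the
// fclones cache implementation; it only illustrates why replacing a file
// with a hardlink to another inode can invalidate its cached hash.
// Unix-only because of the inode lookup.
use std::collections::HashMap;
use std::fs;
use std::os::unix::fs::MetadataExt;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

#[derive(Hash, PartialEq, Eq)]
struct CacheKey {
    path: PathBuf,
    inode: u64,
    mtime: SystemTime,
    size: u64,
}

fn cache_key(path: &Path) -> std::io::Result<CacheKey> {
    let meta = fs::metadata(path)?;
    Ok(CacheKey {
        path: path.to_path_buf(),
        inode: meta.ino(),
        mtime: meta.modified()?,
        size: meta.len(),
    })
}

fn lookup(cache: &HashMap<CacheKey, u128>, path: &Path) -> std::io::Result<Option<u128>> {
    // After linking, `path` points at a different inode (and usually a
    // different mtime), so the freshly computed key no longer matches the
    // key stored before linking and this lookup misses.
    Ok(cache.get(&cache_key(path)?).copied())
}
```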

I've literally never had a problem