pkolaczk / fclones

Efficient Duplicate File Finder

fclones scans a ton of files that have no chance of matching

KyleSanderson opened this issue

Runline: `fclones group --hidden --no-ignore -s1M --cache /mnt/*T*`

[2024-01-13 14:03:31.740] fclones:  info: Started grouping
[2024-01-13 14:03:48.012] fclones:  info: Scanned 633467 file entries
[2024-01-13 14:03:48.018] fclones:  info: Found 466729 (107.6 TB) files matching selection criteria
[2024-01-13 14:03:48.580] fclones:  info: Found 425599 (35.6 TB) candidates after grouping by size
[2024-01-13 14:03:49.004] fclones:  info: Found 425599 (35.6 TB) candidates after grouping by paths
[2024-01-13 14:22:18.464] fclones:  info: Found 73996 (5.6 TB) candidates after grouping by prefix
[2024-01-13 14:24:36.793] fclones:  info: Found 73858 (5.5 TB) candidates after grouping by suffix
[2024-01-13 22:14:39.468] fclones:  info: Found 73294 (5.5 TB) redundant files

The true hardlink size when running `link` is ~350 GB.

Can you elaborate on why you think so? Which files should not be scanned? It basically scans all the files in the given directory and then compares them by hashes.
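
To make the staged narrowing in the log above concrete, here is a rough sketch of how such a grouping pipeline can work. This is not the actual fclones code; the prefix length, the placeholder hash, and the exact set of stages (the real tool also groups by paths and by suffix) are simplifications and assumptions:

```rust
// Rough sketch of staged duplicate grouping -- NOT the actual fclones code.
// Assumption: files are bucketed by size, then by a hash of a small prefix,
// then by a full-content hash; only groups with more than one member survive
// each stage, so unique files are dropped without being read in full.
use std::collections::HashMap;
use std::fs::File;
use std::io::{self, Read};
use std::path::PathBuf;

const PREFIX_LEN: usize = 4096; // hypothetical prefix size

// Placeholder hash; a real tool would use a fast, strong 128-bit hash.
fn hash_bytes(buf: &[u8]) -> u64 {
    use std::hash::{Hash, Hasher};
    let mut h = std::collections::hash_map::DefaultHasher::new();
    buf.hash(&mut h);
    h.finish()
}

fn hash_prefix(path: &PathBuf) -> io::Result<u64> {
    let mut buf = vec![0u8; PREFIX_LEN];
    let n = File::open(path)?.read(&mut buf)?;
    Ok(hash_bytes(&buf[..n]))
}

fn hash_full(path: &PathBuf) -> io::Result<u64> {
    let mut buf = Vec::new();
    File::open(path)?.read_to_end(&mut buf)?;
    Ok(hash_bytes(&buf))
}

/// Keep only the buckets that still contain more than one candidate.
fn narrow<K>(groups: HashMap<K, Vec<PathBuf>>) -> Vec<Vec<PathBuf>> {
    groups.into_values().filter(|g| g.len() > 1).collect()
}

fn find_duplicates(files: Vec<(PathBuf, u64)>) -> io::Result<Vec<Vec<PathBuf>>> {
    // Stage 1: group by size -- no file contents are read yet.
    let mut by_size: HashMap<u64, Vec<PathBuf>> = HashMap::new();
    for (path, size) in files {
        by_size.entry(size).or_default().push(path);
    }
    let mut result = Vec::new();
    for group in narrow(by_size) {
        // Stage 2: group by prefix hash -- reads only the first few KB.
        let mut by_prefix: HashMap<u64, Vec<PathBuf>> = HashMap::new();
        for path in group {
            by_prefix.entry(hash_prefix(&path)?).or_default().push(path);
        }
        for group in narrow(by_prefix) {
            // Stage 3: group by full hash -- whole files, candidates only.
            let mut by_full: HashMap<u64, Vec<PathBuf>> = HashMap::new();
            for path in group {
                by_full.entry(hash_full(&path)?).or_default().push(path);
            }
            result.extend(narrow(by_full));
        }
    }
    Ok(result)
}
```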

The scan is 5.6 TB + 5.5 TB; if the prefix and suffix stages match, the total should be 5.5 TB or less, not 11 TB. Additionally, these files are not new, but it seems to scan them anyway even though --cache is specified.

Looks like half the problem is that the cache isn't updated when `link` runs, so fclones sees the files as new.
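
One plausible mechanism for that (an assumption about the cache design, not taken from the fclones source): if cached hashes are keyed by file metadata such as inode and modification time, then replacing a duplicate with a hardlink to another inode changes that metadata, so the next run misses the cache and re-hashes the file as if it were new. A minimal sketch of such a metadata-keyed cache:

```rust
// Minimal sketch of a metadata-keyed hash cache, assuming (hypothetically)
// that entries are keyed by (path, inode, mtime, size). This is NOT the
// fclones cache implementation; it only illustrates why replacing a file
// with a hardlink to another inode can invalidate its cached hash.
// Unix-only because of the inode lookup.
use std::collections::HashMap;
use std::fs;
use std::os::unix::fs::MetadataExt;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

#[derive(Hash, PartialEq, Eq)]
struct CacheKey {
    path: PathBuf,
    inode: u64,
    mtime: SystemTime,
    size: u64,
}

fn cache_key(path: &Path) -> std::io::Result<CacheKey> {
    let meta = fs::metadata(path)?;
    Ok(CacheKey {
        path: path.to_path_buf(),
        inode: meta.ino(),
        mtime: meta.modified()?,
        size: meta.len(),
    })
}

fn lookup(cache: &HashMap<CacheKey, u128>, path: &Path) -> std::io::Result<Option<u128>> {
    // After linking, `path` points at a different inode (and usually a
    // different mtime), so the freshly computed key no longer matches the
    // key stored before linking and this lookup misses.
    Ok(cache.get(&cache_key(path)?).copied())
}
```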

I've literally never had a problem