pkolaczk / fclones

Efficient Duplicate File Finder

isolate reports duplicates under same root if they also exist elsewhere

felciano opened this issue

According to the README, --isolate finds files that match across two directory trees, without matching identical files within each tree. However, this doesn't seem to be the case. Consider this file structure, where all the files are identical:

dir-1/A.jpg
dir-1/A copy.jpg
dir-2/A.jpg

Then run the following:

fclones group dir-1 dir-2 --isolate

I would expect this to find duplicates of files in dir-1 in dir-2 only, and vice versa. Instead I get:

815e2d46660c7176848ad3900fb7a456, 1019282 B (1019.3 KB) * 3:
    /Volumes/Main/fclones/dir-1/A copy.JPG
    /Volumes/Main/fclones/dir-1/A.JPG
    /Volumes/Main/fclones/dir-2/A.JPG

The first two entries in this report indicate that A.JPG and A copy.JPG are duplicates, which is true, but should be excluded with the --isolate flag.

This is not a bug; it works as designed. fclones always reports both sides of a duplicate match, because it has no idea which of the duplicates you want to remove. If multiple files are present under one isolated root, they are counted as one, but all of them are still reported.
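
To make the rule above concrete, here is a minimal Python sketch of the described behavior. It is not fclones' actual implementation (fclones is written in Rust), and the names `root_of` and `isolated_groups`, the placeholder hashes `h1`/`h2`, and the extra `B` files are all hypothetical, added only for illustration. The assumption is that a group qualifies when its files span at least two distinct roots, and that a qualifying group is then reported in full.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def root_of(path, roots):
    """Return the first root directory that contains path, or None."""
    parents = PurePosixPath(path).parents
    for root in roots:
        if PurePosixPath(root) in parents:
            return root
    return None


def isolated_groups(files, roots):
    """Group (path, content_hash) pairs by hash, isolate-style.

    Assumed rule: all files under one root count as a single file when
    deciding whether a group is a duplicate group, but every file of a
    qualifying group is still reported.
    """
    by_hash = defaultdict(list)
    for path, digest in files:
        by_hash[digest].append(path)

    result = {}
    for digest, paths in by_hash.items():
        distinct_roots = {root_of(p, roots) for p in paths}
        distinct_roots.discard(None)
        if len(distinct_roots) >= 2:  # the match must cross roots
            result[digest] = sorted(paths)
    return result


files = [
    ("dir-1/A.jpg", "h1"),
    ("dir-1/A copy.jpg", "h1"),
    ("dir-2/A.jpg", "h1"),
    # Hypothetical extra pair that exists only inside dir-1.
    ("dir-1/B.jpg", "h2"),
    ("dir-1/B copy.jpg", "h2"),
]
groups = isolated_groups(files, ["dir-1", "dir-2"])
print(groups)
```

Under these assumptions the `A` group is reported with all three paths, because it spans both roots, while the `B` pair is dropped, because both copies live under `dir-1` only.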

@pkolaczk this makes sense -- thanks for the clarification.

In the isolate scenario, is there a way to tell which of the duplicates in the first directory will be used if you elect the link option?

That is, under the above scenario, if I tell fclones that I want to replace duplicates with links, will file .../dir-2/A.JPG end up being linked to .../dir-1/A.JPG or .../dir-1/A copy.JPG?