twpayne / find-duplicates

Find duplicate files quickly.

Improve directory walk performance

twpayne opened this issue

The find command seems to perform much better than Go's filepath.WalkDir.

stapelberg indicated that bradfitz (no mentions to avoid spamming) investigated this as part of goimports and was able to significantly improve performance, maybe by using the right syscalls.

This looks like the relevant conversation. From what I understand, the slow function was filepath.Walk. filepath.WalkDir is much faster (although, according to the conversation, still not as fast as find) because it "uses the new os.DirEntry [...] type to avoid a stat for every file".