dtolnay / dissimilar

Diff library with semantic cleanup, based on Google's diff-match-patch

Incompatibility with the original algorithm

andelf opened this issue

commented

This crate does not generate the same result as the speedtest in the original google/diff-match-patch.

Note:

  • The speedtest should return 2187 diffs (I used the python3 version)
  • This crate: 313, with semantic cleanup post-processing applied
  • dmp generates the same result
  • diff_match_patch cannot generate the same result: 2177

commented

Closing. This should instead be a feature request for timeout and post-processing options.

The Rust dissimilar::diff(text1, text2) is equivalent to the following from the python3 API:

from diff_match_patch import diff_match_patch

dmp = diff_match_patch()
dmp.Diff_Timeout = 0.0  # no time limit
diffs = dmp.diff_main(text1, text2)
dmp.diff_cleanupSemantic(diffs)  # modifies diffs in place

Both produce 313 diffs.

Without diff_cleanupSemantic the python3 implementation produces a low-quality diff. That's why you saw so many chunks come out of it (2187).
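
For anyone who wants to see the effect of the cleanup pass on their own inputs, here is a minimal sketch that counts chunks before and after diff_cleanupSemantic. The helper names summarize and compare are hypothetical and exist only for this illustration. It assumes the diff-match-patch package from PyPI is installed, and it relies on the python3 API returning diffs as a list of (op, text) tuples with op equal to -1, 0, or 1.

```python
from collections import Counter

# Op codes used in diff_match_patch's (op, text) tuples.
DIFF_DELETE, DIFF_INSERT, DIFF_EQUAL = -1, 1, 0
OP_NAMES = {DIFF_DELETE: "delete", DIFF_INSERT: "insert", DIFF_EQUAL: "equal"}


def summarize(diffs):
    """Return (total chunk count, per-operation counts) for a diff list."""
    counts = Counter(OP_NAMES[op] for op, _ in diffs)
    return len(diffs), dict(counts)


def compare(text1, text2):
    """Summarize the diff before and after semantic cleanup.

    Hypothetical helper; needs `pip install diff-match-patch`.
    """
    from diff_match_patch import diff_match_patch

    dmp = diff_match_patch()
    dmp.Diff_Timeout = 0.0  # no time limit, as in the snippet above
    raw = dmp.diff_main(text1, text2)
    cleaned = list(raw)  # shallow copy: the cleanup edits the list in place
    dmp.diff_cleanupSemantic(cleaned)
    return summarize(raw), summarize(cleaned)
```

Calling compare(text1, text2) on the speedtest inputs is one way to check the chunk counts quoted in this thread against your own installation; the first element of each returned pair is the figure being compared.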

commented

@dtolnay Thanks for your clarification.