kpdecker / jsdiff

A JavaScript text differencing implementation.

[BUG?] Inconsistent separation logic when the same character repeats

loynoir opened this issue

Given

  • diffChars('1 aab 2','1 zzb 2')
[
  {
    "count": 2,
    "value": "1 "
  },
  {
    "count": 2,
    "removed": true,
    "value": "aa"
  },
  {
    "count": 2,
    "added": true,
    "value": "zz"
  },
  {
    "count": 3,
    "value": "b 2"
  }
]

Expected

  • diffChars('1 aab 2','1 bbb 2')
[
  {
    "count": 2,
    "value": "1 "
  },
  {
    "count": 2,
    "removed": true,
    "value": "aa"
  },
  {
    "count": 2,
    "added": true,
    "value": "bb"
  },
  {
    "count": 3,
    "value": "b 2"
  }
]

Actual

  • diffChars('1 aab 2','1 bbb 2')
[
  {
    "count": 2,
    "value": "1 "
  },
  {
    "count": 2,
    "removed": true,
    "value": "aa"
  },
  {
    "count": 1,
    "value": "b"
  },
  {
    "count": 2,
    "added": true,
    "value": "bb"
  },
  {
    "count": 2,
    "value": " 2"
  }
]
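
As a quick sanity check on the two outputs above, here is a minimal sketch that rebuilds the old and new strings from each change list and counts the edited characters. The change lists are copied from the "Expected" and "Actual" outputs; `summarize` is a hypothetical helper written for this illustration, not part of jsdiff.

```javascript
// Change lists copied from the "Expected" and "Actual" outputs in this issue.
const expected = [
  { count: 2, value: '1 ' },
  { count: 2, removed: true, value: 'aa' },
  { count: 2, added: true, value: 'bb' },
  { count: 3, value: 'b 2' },
];
const actual = [
  { count: 2, value: '1 ' },
  { count: 2, removed: true, value: 'aa' },
  { count: 1, value: 'b' },
  { count: 2, added: true, value: 'bb' },
  { count: 2, value: ' 2' },
];

// Hypothetical helper (not a jsdiff API): rebuild both strings from a
// change list and count how many characters were inserted or deleted.
function summarize(changes) {
  let oldStr = '';
  let newStr = '';
  let edits = 0;
  for (const part of changes) {
    if (!part.added) oldStr += part.value;   // common + removed parts form the old string
    if (!part.removed) newStr += part.value; // common + added parts form the new string
    if (part.added || part.removed) edits += part.value.length;
  }
  return { oldStr, newStr, edits };
}

console.log(summarize(expected));
console.log(summarize(actual));
```

Both lists rebuild the same pair of strings with the same number of inserted and deleted characters, so they differ only in which b is treated as unchanged.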

Additional

If this is not a bug, it would be nice to have an option to separate at the last b rather than the first b.

Hmm. I guess the underlying intuition here is that it's better to preserve the b that's in the same index in the string? So e.g. with diffChars('1 baa 2','1 bbb 2') you WOULD want to preserve the first b?

I think to get a diff that matches your intuition here you probably want to be using a diffing algorithm where edits can be substitutions, like a diff based on Levenshtein distance? If the edits you're making can be substitutions then the single optimal way to convert bbb to aab is to substitute the first two bs with as (which achieves the transformation with 2 edits). But to the Myers algorithm, which can only do insertions and deletions, it simply doesn't matter which of the three bs you keep; it's the same edit distance (4, made up of 2 insertions and 2 deletions) either way.
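
To make the two cost models concrete, here is a rough sketch that computes both distances with a plain dynamic-programming table. It is not the Myers algorithm and not jsdiff code; `editDistance` and its `allowSubstitution` flag are names made up for this illustration.

```javascript
// Hypothetical helper: classic O(n*m) edit-distance table.
// With allowSubstitution = true this is Levenshtein distance;
// with false, only insertions and deletions are allowed, which is
// the cost model an insert/delete-only diff minimizes.
function editDistance(a, b, allowSubstitution) {
  const rows = a.length + 1;
  const cols = b.length + 1;
  const dp = Array.from({ length: rows }, () => new Array(cols).fill(0));

  // Turning a prefix into the empty string (or back) costs its length.
  for (let i = 0; i < rows; i++) dp[i][0] = i;
  for (let j = 0; j < cols; j++) dp[0][j] = j;

  for (let i = 1; i < rows; i++) {
    for (let j = 1; j < cols; j++) {
      // Delete a[i-1] or insert b[j-1].
      let best = Math.min(dp[i - 1][j] + 1, dp[i][j - 1] + 1);
      if (a[i - 1] === b[j - 1]) {
        // Matching characters are kept for free.
        best = Math.min(best, dp[i - 1][j - 1]);
      } else if (allowSubstitution) {
        // Replace a[i-1] with b[j-1] as a single edit.
        best = Math.min(best, dp[i - 1][j - 1] + 1);
      }
      dp[i][j] = best;
    }
  }
  return dp[rows - 1][cols - 1];
}

console.log(editDistance('bbb', 'aab', true));  // with substitutions
console.log(editDistance('bbb', 'aab', false)); // insertions and deletions only
```

With substitutions the distance is 2; restricted to insertions and deletions it is 4, whichever b is kept.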

Since jsdiff is based on the Myers diff algorithm and that's unlikely to change, I don't think there's a reasonable way for us to make jsdiff behave in the way you wanted, though.