mattphillips / deep-object-diff

Deep diffs two objects, including nested structures of arrays and objects, and returns the difference. ❄️

Home Page:https://www.npmjs.com/package/deep-object-diff

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Array diffs entirely on order and not contents

antgonzales opened this issue · comments

When I diff ['a', 'b', 'c'] and ['b', 'c'], the result is:

{
  added: {},
  deleted: { '2': undefined },
  updated: { '0': 'b', '1': 'c' }
}

I would expect the result to be:

{
  added: {},
  deleted: { '0': undefined },
  updated: { '0': 'b', '1': 'c' }
}

I think this is by design.

How should your expected diff be applied to ['a', 'b', 'c'] to get ['b', 'c']?

If the deleted property corresponds to the delete operator, the result is incorrect:

x = ['a', 'b', 'c']
delete x[0] // -> x = [undefined, 'b', 'c']
x[0] = 'b'  // -> x = ['b', 'b', 'c']
x[1] = 'c'  // -> x = ['b', 'c', 'c']

If the deleted property corresponds to Array.splice(n, 1), the result would depend on the order in which the operations are applied:

  • If deleted is applied first, the result is correct but the updated part is completely redundant:

    x = ['a', 'b', 'c']
    x.splice(0,1) // -> x = ['b', 'c']
    x[0] = 'b'    // -> x = ['b', 'c']
    x[1] = 'c'    // -> x = ['c', 'c']
  • If updated is applied first, the result is incorrect:

    x = ['a', 'b', 'c']
    x[0] = 'b'    // -> x = ['b', 'b', 'c']
    x[1] = 'c'    // -> x = ['b', 'c', 'c']
    x.splice(0,1) // -> x = ['c', 'c']

The operations specified by the diff currently output by this library would be applied with these operations (which work in any order):

x = ['a', 'b', 'c']
delete x[2] // -> x = ['a', 'b']
x[0] = 'b'  // -> x = ['b', 'b']
x[1] = 'c'  // -> x = ['b', 'c']

This applies correctly, but it is indeed a lot of work for what could be 1 Array operation. I imagine this is also why diffing large Arrays takes so long (#56).

Making an Array.splice-based diff for arrays (without the redundant updated property) would often be shorter than a per-key diff, especially when large arrays are pushed or popped and many keys shift (it is effectively a run length encoding), but it would require special cases both when calculating the diff and when applying it. It would require care with non-index numeric keys like { '0': 'xyz' }.

Certainly possible to make. Would have to consider if it's a good fit for this library. If my idea for custom comparisons is implemented in the future, plugging in a separate library to do more compact Array diffs when necessary would be easy.