flitbit / diff

Javascript utility for calculating deep difference, capturing changes, and applying changes across objects; for nodejs and the browser.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Incorrectly rejects identical structure when encountering circularity

cpcallen opened this issue · comments

In this case there is no difference between o1 and o2, but diff incorrectly reports that there is:

var o1 = {};
var o2 = {};
o1.foo = o1;
o2.foo = o2;
diff(o1, o2)

By comparison, in this case there is actually a structural difference (which is correctly detected, albeit with less than entirely illuminating output):

var o1 = {};
var o2 = {};
o1.foo = o1;
o2.foo = {foo: o2};
diff(o1, o2)

I've left this issue here for about a year. Seems like a theoretical question... here's why I say that:

const { log, inspect } = require('util');
const assert = require('assert');
const diff = require('../');

const o1 = {};
const o2 = {};
o1.foo = o1;
o2.foo = o2;

assert.notEqual(o1, o2, 'not same object');
assert.notEqual(o1.foo, o2.foo, 'not same object');

const differences = diff(o1, o2);
log(inspect(differences, false, 9));

Produces:

[ DiffEdit {
    kind: 'E',
    path: [ 'foo' ],
    lhs: { foo: [Circular] },
    rhs: { foo: [Circular] } } ]

Seems correct to me. I suppose the README

for determining the structural differences between objects

may be the counter argument, after all, the structures are similar. I'm inclined to close this.

The use case I would have for diff, absent this bug, is testing serialisation / deserialisation round-tripping or other deep-copy routines. My test code consists of creating various datastructures, serialising and then deserialising them, and then diffing the deserialised result against the original input. The data I am serialising is arbitrary JavaScript data, so the serialisation routines need to handle cyclic data—and therefore the tests must too.

Having already had to write my own diff routines for the golang version of this code, I will observe that the algorithm is actually pretty easy:

  • Maintain two maps. The first maps objects found while traversing o1 to the corresponding objects found while traversing o2; the second map vice versa.
  • Assume the two structures are the same until proven otherwise.
  • "Proof" means:
    • o1[…][…]… and o2[…][…]… are primitives with different values, or
    • one is a primitive and the other is not, or
    • one exists and the other does not, or
    • both are objects but of different internal types (e.g., plain object vs. function vs. array vs. date vs. regexp, etc.)
    • both are objects of the same internal type (e.g., regexps) but with different data (e.g., different patterns, different dates, boxed primitives with different boolean/string/number values, etc.), or
    • both are objects and map1[o1[…][…]…] exists but is !== o2[…][…]… or,
    • both are objects and map2[o2[…][…]…] exists but is !== o1[…][…]….

(Object prototypes, set members, and map keys and values ought to be treated just like properties: i.e., recursed upon and with the same two-way map tests. WeakMaps and WeakSets present some difficulty, though there are other possible workarounds if we can constrain the set of possible keys.)

Thanks for the feedback. Now I'm inclined to fix it.

Any update on this? Thanks!