kriszyp / msgpackr

Ultra-fast MessagePack implementation with extension for record and structural cloning / msgpack.org[JavaScript/NodeJS]

Upgrading from old msgpack - compatibility issues

gonenduk-dy opened this issue · comments

Hi,

As you know, we are trying to move away from the old msgpack package (which used to be the most popular one). We are part of a big code base: we are the ones who pack, while others unpack and read, each in their own language, platform, and runtime. To make the move safely, we need to create buffers identical to the ones created by the old lib.

Consider the following code:

const msgpack = require('msgpack');
const { Packr } = require('msgpackr');
const packr = new Packr({ useRecords: false, encodeUndefinedAsNil: true, mapAsObject: true, variableMapSize: true, mapAsEmptyObject: true, setAsEmptyObject: true });

const input =  {
  '123' : 456,
  tick: new Date(),
}

const newPack = packr.pack(input);
const oldPack = msgpack.pack(input);

console.log(`buffers are ${Buffer.compare(oldPack, newPack) ? 'different' : 'the same'}`);

It seems the two buffers are not identical.
There are two issues here:

  1. Objects with numeric-looking keys (which are still strings).
  2. Date objects. We suspect old msgpack is so old that it does not support Date objects and converts them to strings before packing. Maybe we could have a flag to do the same (like dateAsString).

What do you think? Thanks again for your prompt replies.

You should be able to add an extension for handling dates as you would like:

addExtension({
	Class: Date,
	write(date) {
		return date.toString()
	}
})

For objects with numeric keys (or keys that can be converted to numbers), does the old library convert these to numbers? The thing is that in JS, all object property keys are strings: any number used as a key is converted to a string (at least in terms of what is returned for the object's keys). You would need to convert the object to a Map in order to explicitly define the type of the keys (and there are plenty of different ways to decide what counts as a number or a string).
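To illustrate the point about key types, here is a minimal sketch in plain JavaScript with no msgpackr calls. The helper name `toTypedKeyMap` and its "canonical non-negative integer" rule are assumptions made for illustration; the thread leaves open how to decide which keys should count as numbers.

```javascript
// Property keys are always strings, even when written as number literals.
const input = { 123: 456, tick: 'now' };
const keyTypes = Object.keys(input).map((k) => typeof k); // ['string', 'string']

// Hypothetical helper (not part of msgpackr): copy an object into a Map,
// turning keys that look like canonical non-negative integers into numbers
// and leaving every other key as a string.
function toTypedKeyMap(obj) {
  const map = new Map();
  for (const [key, value] of Object.entries(obj)) {
    const isCanonicalInt =
      /^(0|[1-9]\d*)$/.test(key) && Number.isSafeInteger(Number(key));
    map.set(isCanonicalInt ? Number(key) : key, value);
  }
  return map;
}

const typed = toTypedKeyMap(input);
console.log(keyTypes, [...typed.keys()]);
```

A Map built this way carries the key types explicitly, so it could then be handed to the packer in place of the plain object; how the result compares byte for byte with the old library's output would still need to be checked.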

Hi,

Indeed, when an object key is a string made of digits, the old lib converts it to a number and stores the number in the buffer. Both the old lib and your lib can unpack this buffer successfully, so I think we can ignore this change.

Regarding the date, this solution works great (!!!), just one thing: in index.d.ts:

interface Extension {
	Class: Function
	type: number
	pack?(value: any): Buffer | Uint8Array
	unpack?(messagePack: Buffer | Uint8Array): any	
	read?(datum: any): any
	write?(instance: any): any
}

type is required. It shouldn't be when we want to supply only Class and write (as in this case).
If we do supply type, we can pack using this lib but cannot unpack using the old lib, since it doesn't recognize the type.
Can you change type to not be required?

Thanks again for the prompt replies!!!

Saw the commit,
Will close this issue once 1.9.3 is published.

In the meantime, I can overcome the issue with:

addExtension({
	type: undefined,
	Class: Date,
	write(date) {
		return date.toISOString()
	}
})

Thanks!
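The two `write` functions shown in this thread return different strings: the first uses `toString()`, the workaround uses `toISOString()`. The sketch below, in plain JavaScript with a fixed example date chosen for illustration, shows how the two differ. Which format matches the old library's buffers is not established here beyond the reporter's choice of `toISOString()`.

```javascript
// Fixed example instant (an assumption for illustration), with milliseconds.
const date = new Date(Date.UTC(2024, 0, 2, 3, 4, 5, 678));

const iso = date.toISOString(); // '2024-01-02T03:04:05.678Z' (always UTC)
const loose = date.toString();  // local-time text; format has no milliseconds

// Parsing the ISO string recovers the exact instant.
const isoRoundTrips = new Date(iso).getTime() === date.getTime();

// Parsing the toString() text loses the milliseconds.
const looseRoundTrips = new Date(loose).getTime() === date.getTime();

console.log({ iso, loose, isoRoundTrips, looseRoundTrips });
```

Because `toString()` depends on the local time zone and drops milliseconds, two machines can produce different bytes for the same instant, which matters when the goal is identical buffers.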

Can we add an extension for the Object type, so that the write function receives each key and can manipulate it, like we do with Date?

No, you cannot alter the handling of the Object type. For performance, the extensions are tested only after checking whether the value is a plain Object, so changing the code to check all the extension possibilities before serializing plain objects would be a performance regression.

Ok, understood.

When I think about storing keys made of digits as numbers, it might be actually a good feature for users who want to make their packed buffer smaller (especially those who have very limited bandwidth). It's not only for backward compatibility like the mapAsEmptyObject.
Consider an object with a list of keys of ids, like:

{
  123456: { ... },
  234567: { ... },
  345678: { ... },
  ...
}

Storing the keys as numbers will reduce the result buffer size.
It's like the variableMapSize which takes slightly longer, but might save some space.
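As a rough check of the size claim, the sketch below computes how many bytes the three example keys would take as strings and as unsigned integers, using the format sizes from the MessagePack specification. The helper names `strSize` and `uintSize` are made up for this example, and a real encoder may pick different number formats, so treat the totals as an estimate.

```javascript
// Encoded size in bytes of a string, per the MessagePack str formats.
function strSize(s) {
  const n = Buffer.byteLength(s, 'utf8');
  if (n <= 31) return 1 + n;     // fixstr: 1-byte header
  if (n <= 0xff) return 2 + n;   // str 8
  if (n <= 0xffff) return 3 + n; // str 16
  return 5 + n;                  // str 32
}

// Encoded size of a non-negative integer in the smallest uint format.
function uintSize(v) {
  if (v <= 0x7f) return 1;       // positive fixint
  if (v <= 0xff) return 2;       // uint 8
  if (v <= 0xffff) return 3;     // uint 16
  if (v <= 0xffffffff) return 5; // uint 32
  return 9;                      // uint 64
}

const keys = ['123456', '234567', '345678'];
const asStrings = keys.reduce((sum, k) => sum + strSize(k), 0);          // 21
const asNumbers = keys.reduce((sum, k) => sum + uintSize(Number(k)), 0); // 15
console.log({ asStrings, asNumbers });
```

For six-digit ids the saving is 2 bytes per key (7 bytes as a fixstr versus 5 bytes as a uint 32), and it grows for shorter numeric keys: '12' takes 3 bytes as a string but 1 byte as a positive fixint.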

The change is simple (pack.js lines 551 and 562), from:

pack(key = keys[i])

to:

key = this.flagName && !isNaN(keys[i]) ? Number(keys[i]) : keys[i]
pack(key)

And since it is behind a (boolean) flag, it will not affect performance when not in use.
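The intended behaviour of the proposed change can be sketched in isolation. The function `coerceKey` and its `coerceEnabled` parameter are hypothetical stand-ins for the flag-guarded expression above; this is not the code that was later added to msgpackr.

```javascript
// Hypothetical stand-in for the proposed flag-guarded key conversion.
function coerceKey(key, coerceEnabled) {
  if (!coerceEnabled) return key; // flag off: the key is packed unchanged
  const asNumber = Number(key);
  // Keep the original string when it does not convert to a number.
  return Number.isNaN(asNumber) ? key : asNumber;
}

const sample = ['123', 'tick', '007', ''];
const coerced = sample.map((k) => coerceKey(k, true));    // [123, 'tick', 7, 0]
const untouched = sample.map((k) => coerceKey(k, false)); // same strings as sample
console.log(coerced, untouched);
```

Note the edge cases: with a plain `Number()` conversion, '007' becomes 7 and the empty string becomes 0, so those keys would not round-trip to their original text. A stricter rule may be needed if such keys can occur.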

What do you think?

hey @kriszyp
Any thoughts? Thanks!

I think that may be doable. A flag check has some overhead, but it would be small. Are you using this with variableMapSize? Would it be feasible to only have it as an option in conjunction with variableMapSize?

Yes, I use it only with variableMapSize, as I use it for backward compatibility with the older unmaintained package.
It does make sense to relate the two, since both make the output smaller at the price of being slightly slower: size over time.
Those who seek smaller sizes will probably have both enabled.

Ok, I added a coercibleKeyAsNumber option in 1.9.4 for this.

Thank you so much!
Tried it and it looks good.
I suggest adding the flag also to the readme file, for users who want to squeeze the size to the max.
Closing the issue.