Refactor encodings

vweevers opened this issue 3 years ago

As written in #58, there are too many forms of encoding now. Yet for an abstract-leveldown implementation there's no builtin primitive to decode/deserialize data.

So here's a WIP plan for refactoring. I'll be editing this.

Vincent Weevers · Answer 1 · Thu Oct 21 2021 20:39:47 GMT+0800 (China Standard Time)

Updated the above with a more complete plan that removes the need for serialization and asBuffer options, and also paves the way for typed arrays.

Example of a transcoder (some should be hardcoded to optimize them, others like json+buffer can be dynamic):

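// TextEncoder and TextDecoder are globals in browsers and modern Node.js.
// Encoding is assumed to be defined elsewhere.
const textEncoder = new TextEncoder()
const textDecoder = new TextDecoder()

exports['utf8+view'] = new Encoding({
  // Pass views through untouched; encode strings to a Uint8Array
  encode (data) {
    return ArrayBuffer.isView(data) ? data : textEncoder.encode(data)
  },
  // Decode a view back into a utf8 string
  decode (data) {
    return textDecoder.decode(data)
  },
  type: 'utf8+view',
  format: 'view'
})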

Vincent Weevers · Answer 2 · Thu Oct 21 2021 20:51:44 GMT+0800 (China Standard Time)

Doing this in abstract-leveldown could have performance benefits too. For example, if we know that a particular value encoding on a get() operation is idempotent (like buffer or utf8) and we know that the implementation supports that encoding "natively", then we can do _get(..., callback) without wrapping the callback in another function that decodes the value.
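
To illustrate the fast path described above, here is a minimal sketch. Everything in it is hypothetical and for illustration only: the `encodings` table, the `identity` flag, the `nativeFormats` set and the stand-in `_get()` are assumptions, not the actual abstract-leveldown API. The callbacks are invoked synchronously to keep the example short.

```javascript
// Hypothetical sketch; names are illustrative, not abstract-leveldown's API.

// Formats the underlying store can return without conversion (assumption).
const nativeFormats = new Set(['utf8'])

// Minimal encodings: `identity` marks decode() as a no-op.
const encodings = {
  utf8: { format: 'utf8', identity: true, decode: (data) => data },
  json: { format: 'utf8', identity: false, decode: (data) => JSON.parse(data) }
}

// Stand-in for an implementation's private _get(), backed by a Map.
const store = new Map([['a', 'hello'], ['b', '{"x":1}']])

function _get (key, callback) {
  callback(null, store.get(key))
}

// Counts how many times a decoding wrapper had to be allocated.
let wrapped = 0

function get (key, valueEncoding, callback) {
  const encoding = encodings[valueEncoding]

  if (encoding.identity && nativeFormats.has(encoding.format)) {
    // Fast path: hand the user's callback straight to _get().
    return _get(key, callback)
  }

  // Slow path: wrap the callback in a function that decodes the value.
  wrapped++
  _get(key, (err, value) => {
    if (err) return callback(err)
    callback(null, encoding.decode(value))
  })
}
```

Calling `get('a', 'utf8', cb)` takes the fast path and leaves `wrapped` untouched, while `get('b', 'json', cb)` allocates one wrapper and passes the parsed object to `cb`.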

Vincent Weevers · Answer 3 · Mon Oct 25 2021 07:05:23 GMT+0800 (China Standard Time)

https://github.com/Level/transcoder

Vincent Weevers · Answer 4 · Thu Oct 28 2021 00:27:10 GMT+0800 (China Standard Time)

Ugh, this works better than I expected. Now subleveldown can just forward operations to a db, without unwrapping or rewrapping that db with encoding-down or levelup. Doesn't matter what encoding options that db used, or whether it stores data as buffers or Uint8Array internally. On the subleveldown db you can use any encoding too, including Uint8Array even though subleveldown internally works with Buffers and strings. It just fucking works.
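
A rough sketch of why forwarding can work regardless of format. This is a toy model, not subleveldown's or level-transcoder's real code: `transcode()`, `createDb()` and `sublevel()` are hypothetical helpers, and the toy only handles keys that survive a utf8 round trip.

```javascript
// Hypothetical sketch; none of these helpers exist in subleveldown itself.
const textEncoder = new TextEncoder()
const textDecoder = new TextDecoder()

// Convert a string, Buffer or Uint8Array into the requested format.
function transcode (data, format) {
  if (format === 'utf8') {
    return typeof data === 'string' ? data : textDecoder.decode(data)
  }

  const view = typeof data === 'string' ? textEncoder.encode(data) : data

  if (format === 'buffer') {
    return Buffer.isBuffer(view)
      ? view
      : Buffer.from(view.buffer, view.byteOffset, view.byteLength)
  }

  return view // 'view': any Uint8Array (a Buffer is one too)
}

// Toy parent db that keeps its keys in a configurable internal format.
function createDb (storeFormat) {
  const entries = new Map()

  return {
    put (key, value) {
      const stored = transcode(key, storeFormat)
      // Index by string, because a Map compares views by reference.
      entries.set(transcode(stored, 'utf8'), { key: stored, value })
    },
    get (key) {
      const entry = entries.get(transcode(key, 'utf8'))
      return entry && entry.value
    },
    keys (format) {
      return [...entries.values()].map((entry) => transcode(entry.key, format))
    }
  }
}

// Sublevel that prefixes keys as strings and forwards to the parent as-is,
// without knowing which format the parent uses internally.
function sublevel (db, prefix) {
  const wrap = (key) => '!' + prefix + '!' + transcode(key, 'utf8')

  return {
    put: (key, value) => db.put(wrap(key), value),
    get: (key) => db.get(wrap(key))
  }
}
```

With `const sub = sublevel(createDb('view'), 'test')`, a value written via `sub.put(new Uint8Array([97]), 'x')` can be read back with `sub.get('a')` or `sub.get(Buffer.from('a'))`, because each layer transcodes at its own boundary.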

The following tests are passing (locally; haven't pushed yet):

const test = require('tape')
const suite = require('abstract-leveldown/test')
const memdown = require('memdown')
const subleveldown = require('subleveldown')

// Test abstract-leveldown compliance
function runSuite (factory) {
  suite({ test, factory })
}

// Test basic prefix
runSuite(function factory (opts) {
  return subleveldown(memdown(), 'test', opts)
})

// Test empty prefix
runSuite(function factory (opts) {
  return subleveldown(memdown(), '', opts)
})

// Test custom separator
runSuite(function factory (opts) {
  return subleveldown(memdown(), 'test', { ...opts, separator: '%' })
})

// Test on db with buffer encoding
runSuite(function factory (opts) {
  return subleveldown(memdown({ keyEncoding: 'buffer' }), 'test', opts)
})

// Test on db with view encoding (Uint8Array)
runSuite(function factory (opts) {
  return subleveldown(memdown({ keyEncoding: 'view' }), 'test', opts)
})

// Have memdown internally use views too
runSuite(function factory (opts) {
  return subleveldown(memdown({ keyEncoding: 'view', storeEncoding: 'view' }), 'test', opts)
})

// Lastly, for good measure:
runSuite(function factory (opts) {
  return subleveldown(memdown({ keyEncoding: 'buffer', storeEncoding: 'view' }), 'test', opts)
})

@ralphtheninja @juliangruber @MeirionHughes ARE YOU EXCITED? Because I am! Fuck!

Julian Gruber · Answer 5 · Thu Oct 28 2021 16:34:48 GMT+0800 (China Standard Time)

Sounds like this gets us a lot of flexibility and api simplicity as well. Well done! 👏

Vincent Weevers · Answer 6 · Sun Oct 31 2021 00:53:34 GMT+0800 (China Standard Time)

Added support for other ecosystem encodings (codecs, abstract-encoding, multiformats) to level-transcoder, fixed some bugs, and updated its README: https://github.com/Level/transcoder. That part is now done.