Venemo / node-lmdb

Node.js binding for lmdb

Append mode is not working correctly for string keys with character codes larger than 255 (MDB_KEYEXIST error)

hjerabek opened this issue

The following sample code throws an MDB_KEYEXIST: Key/data pair already exists error when performing a put operation using the {append:true} option with a string containing character codes larger than 255, even though the key does not already exist:

var fs=require("fs"),node_lmdb=require("node-lmdb");
var path="append.mdb";
if (fs.existsSync(path)) fs.unlinkSync(path);
var env=new node_lmdb.Env();
env.open({path:path,maxDbs:1,mapSize:1e7,noSubdir:true});
var dbi=env.openDbi({create:true,keyIsBuffer:false});
var i,k,n=65536,txn=env.beginTxn(),opts={append:true};
for (i=0;i<n;i++) {
    k=String.fromCharCode(i);
    txn.putBoolean(dbi,k,true,opts);
}
txn.commit();
dbi.close();
env.close();

I assume the reason is that node-lmdb internally encodes string keys as little-endian rather than big-endian UTF-16. As soon as the character code reaches 256, the first of the two bytes wraps around to 0, so from LMDB's point of view the key is out of order: the database already contains keys whose first byte is as high as 255. My assumption is based on the fact that I get the same error if I use a utf16le-encoded buffer as the key:
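
The ordering problem can be reproduced without LMDB at all. The sketch below assumes that LMDB's default key comparison is a bytewise lexicographic one (which Buffer.compare mimics) and ignores any terminator bytes node-lmdb may add to string keys. keyLE and keyBE are illustrative names, not part of node-lmdb:

```javascript
// Encode a single UTF-16 code unit as a 2-byte key, little-endian.
function keyLE(code) {
    return Buffer.from(String.fromCharCode(code), "utf16le");
}

// Same code unit, but with the two bytes swapped (big-endian).
function keyBE(code) {
    return Buffer.from(String.fromCharCode(code), "utf16le").swap16();
}

// Buffer.compare compares byte by byte, like a memcmp-style key comparator.
const leOrder = Buffer.compare(keyLE(255), keyLE(256)); // ff 00 vs 00 01
const beOrder = Buffer.compare(keyBE(255), keyBE(256)); // 00 ff vs 01 00

console.log(keyLE(255), keyLE(256), leOrder); // positive: 255 sorts AFTER 256
console.log(keyBE(255), keyBE(256), beOrder); // negative: 255 sorts before 256
```

With the little-endian encoding, the key for character code 256 compares as smaller than the key for 255, which would explain why an append at that point is rejected.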

var fs=require("fs"),node_lmdb=require("node-lmdb");
var path="append.mdb";
if (fs.existsSync(path)) fs.unlinkSync(path);
var env=new node_lmdb.Env();
env.open({path:path,maxDbs:1,mapSize:1e7,noSubdir:true});
var dbi=env.openDbi({create:true,keyIsBuffer:true});
var i,k,n=65536,txn=env.beginTxn(),opts={append:true};
for (i=0;i<n;i++) {
    k=Buffer.from(String.fromCharCode(i),"utf16le");
    txn.putBoolean(dbi,k,true,opts);
}
txn.commit();
dbi.close();
env.close();

...yet it works without error if the key is utf16be-encoded (since Node.js does not provide that encoding itself, I use swap16 on the utf16le-encoded buffer):

var fs=require("fs"),node_lmdb=require("node-lmdb");
var path="append.mdb";
if (fs.existsSync(path)) fs.unlinkSync(path);
var env=new node_lmdb.Env();
env.open({path:path,maxDbs:1,mapSize:1e7,noSubdir:true});
var dbi=env.openDbi({create:true,keyIsBuffer:true});
var i,k,n=65536,txn=env.beginTxn(),opts={append:true};
for (i=0;i<n;i++) {
    k=Buffer.from(String.fromCharCode(i),"utf16le").swap16();
    txn.putBoolean(dbi,k,true,opts);
}
txn.commit();
dbi.close();
env.close();
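
For anyone who wants to reuse the workaround, it could be wrapped in a small helper. This is only a sketch: toUtf16beKey and countOutOfOrder are hypothetical names, and the check again assumes a bytewise key comparison. It counts how many consecutive single-character keys would be out of order for each encoding:

```javascript
// Hypothetical helper: encode a string as UTF-16BE so that bytewise order
// follows UTF-16 code unit order.
function toUtf16beKey(str) {
    // Buffer.from returns a new Buffer, so the in-place swap16() does not
    // affect anything else.
    return Buffer.from(str, "utf16le").swap16();
}

// Count positions where a key is not strictly greater than its predecessor
// when iterating over all 65536 single-code-unit strings in ascending order.
function countOutOfOrder(encode) {
    let bad = 0;
    let prev = encode(String.fromCharCode(0));
    for (let i = 1; i < 65536; i++) {
        const cur = encode(String.fromCharCode(i));
        if (Buffer.compare(cur, prev) <= 0) bad++;
        prev = cur;
    }
    return bad;
}

const outOfOrderBE = countOutOfOrder(toUtf16beKey);
const outOfOrderLE = countOutOfOrder((s) => Buffer.from(s, "utf16le"));
console.log({ outOfOrderBE, outOfOrderLE });
```

The big-endian keys are strictly increasing over the whole range, while the little-endian keys fall out of order every time the low byte wraps from 255 to 0.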

PS: The sample code has been tested with Node.js v12.19 and node-lmdb v0.9.4.

PPS: I know this is a niche issue. I don't expect it to get fixed; I just wanted to report it so that others can find the error code in a web search and see the workaround.

I couldn't get append in put options working at all.

const someString = "someString"
const val = txn.getString(db, 1)
console.log(val)
if (val != someString) {
    txn.putString(db, 1, someString);
} else {
    txn.putString(db, 1, 'someOtherString', {append: true})
}

Simply removing the append option does allow me to overwrite the value, but I can't get append itself working.

Your code will use the append mode only if val==someString, in which case the database already contains a value for the key 1. AFAIK, appending only works if the key is larger than any other existing key in the given database. It should even fail if the key you are overwriting is the largest one.
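
To illustrate the rule, here is a toy model of an append-only put. It is not node-lmdb code and does not touch a real database; AppendOnlyModel and tryAppend are made-up names, and the model assumes a database without dupSort and a bytewise key comparison:

```javascript
// Toy model: an append is accepted only if the new key is strictly greater
// than the current last key; otherwise it fails like MDB_KEYEXIST.
class AppendOnlyModel {
    constructor() {
        this.keys = [];
    }
    putAppend(key) {
        const last = this.keys[this.keys.length - 1];
        if (last !== undefined && Buffer.compare(key, last) <= 0) {
            throw new Error("MDB_KEYEXIST: Key/data pair already exists");
        }
        this.keys.push(key);
    }
}

const model = new AppendOnlyModel();
model.putAppend(Buffer.from("a"));
model.putAppend(Buffer.from("b"));

// Returns "ok" if the append was accepted, otherwise the error message.
function tryAppend(key) {
    try {
        model.putAppend(Buffer.from(key));
        return "ok";
    } catch (e) {
        return e.message;
    }
}

// "b" equals the largest key, "a" is smaller, "c" is larger.
const results = [tryAppend("b"), tryAppend("a"), tryAppend("c")];
console.log(results);
```

In this model, only the last call succeeds: re-using the largest key or inserting a smaller one is rejected, which matches the behaviour described above.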

I completely misunderstood what append is for. I figured out what I actually wanted to know, which was either extending the existing entry (at which point I would just input the existing entry plus the new one together as the new input), or multiple entries on the same key (solved by modifying the db on creation with dupSort: true).

I appreciate the response and apologize for cluttering up the thread.