vlcn-io / js

Components to build apps in JavaScript atop cr-sqlite

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Malformed DB using fts5

theboolean opened this issue · comments

Hi!
I think I found a blocking bug for @vlcn.io/crsqlite-wasm when using fts5.

Basically after some INSERT I receive this error and the db becomes unusable:

Error: database disk image is malformed
    at check (sqlite-api.js:866:1)
    at Object.step (sqlite-api.js:699:1)
    at async TX.statements (TX.ts:152:1)

I saw this can be mitigated by wrapping the INSERT into a transaction, but still if you perform lot of transactions this happens again.

Minimal code to reproduce:

    const sqlite = await initWasm()
    const db = await sqlite.open('sqlite_test.db')
    await db.exec('CREATE VIRTUAL TABLE IF NOT EXISTS test_fts USING fts5(fieldA, fieldB, fieldC, fieldD, fieldE, id UNINDEXED, prefix=2, detail=none);')

    for (let i = 0; i < 1000; i++) {
      await db.exec(`INSERT INTO test_fts (fieldA, fieldB, fieldC, fieldD, fieldE, id) VALUES ('test', 'test', 'test', 'test', 'test', '${i}');`)
        .catch(err => {
          console.error(err, i)
        })
    }

to me it always fail during iteration 877.

I also created a basic create-react-app project that demonstrates this, code lives here.

Forgot to say: tested in Chrome and Firefox on Mac OS

Thanks for the report. It may be a problematic interaction with the indexeddb vfs. I should release a variant of cr-sqlite that lets the user choose their virtual filesystem so we can track these things down.

FYI I tried to set the DB in memory and the error didn't occur, so you're right: it must be something related to indexeddb. Did you manage to reproduce the bug on your side?

I found a blocking bug

In the interest of getting you unblocked, have you tried inserting multiple values at a time, e.g.

INSERT INTO test_fts (
  fieldA, fieldB, fieldC, fieldD, fieldE, 
  id
) 
VALUES 
  (
    'test', 'test', 'test', 'test', 'test', 
    '${i}'
  ), 
  (
    'test', 'test', 'test', 'test', 'test', 
    '${i}'
  ), 
  (
    'test', 'test', 'test', 'test', 'test', 
    '${i}'
  ), 
  (
    'test', 'test', 'test', 'test', 'test', 
    '${i}'
  ), 
  (
    'test', 'test', 'test', 'test', 'test', 
    '${i}'
  );

I do this in my own project here in batches of 1000 and was able to add ~35k rows without database disk image is malformed (I really should use a transaction though). Schema here - I only have 3 FTS columns though compared to your example's 5. If you want to test my code I can upload those rows somewhere.

Unfortunately, this is not a workaround we can implement with our use case. We perform lot of updates after the first db population, and eventually it will corrupt

@theboolean - would switching to OPFS be acceptable for your use case? I think the OPFS VFS should handle this fine since it is less temperamental than indexeddb. I'll need to expose OPFS for you -- it currently isn't exposed by the cr-sqlite wrappers over wa-sqlite.

@rhashimoto - just an fyi. I assume this is in wa-sqlite itself but haven't stripped out cr-sqlite layers to confirm.

Will take a look, but I've never used FTS (it's not in the wa-sqlite default build) and I'm deep into something else right now.

@theboolean If you run your test with Dev Tools open is there anything interesting printed to the console when it fails?

@tantaman I would like to use indexeddb because of wider browser support, but I can give OPFS a try to compare performances.

@rhashimoto the only error I receive is the one I pasted in the OP, but I added the repo that can reproduce the issue to codesandbox, so you can try by yourself.

Thank you guys for the interest in this issue 🙏🏻

@tantaman is exposing OPFS a simple change to do?

@theboolean - should be pretty straightforward. Would be a matter of exposing an option for the user to supply a VFS class to initWasm

See these lines:

sqlite3.vfs_register(
new IDBBatchAtomicVFS("idb-batch-atomic", { durability: "relaxed" })
);

export default async function initWasm(
locateWasm?: (file: string) => string
): Promise<SQLite3> {
if (api != null) {
return api;
}
const wasmModule = await SQLiteAsyncESMFactory({
locateFile(file: string) {
if (locateWasm) {
return locateWasm(file);
}
return new URL("crsqlite.wasm", import.meta.url).href;
},
});
const sqlite3 = SQLite.Factory(wasmModule);
sqlite3.vfs_register(
new IDBBatchAtomicVFS("idb-batch-atomic", { durability: "relaxed" })
);

Reproduced in wa-sqlite. It fails with SQLITE_CORRUPT on iteration 877, exactly as reported.

It's a bug in IDBBatchAtomicVFS. It also reveals a likely bug in SQLite, without which it probably wouldn't have shown up here. It's still a critical bug in IDBBatchAtomicVFS, though, as I think it can be exposed in other ways. I hope to merge the fix tomorrow-ish.

Appreciate it.

@theboolean - I'll roll a new release to update to https://github.com/rhashimoto/wa-sqlite/releases/tag/v0.9.8 which will resolve this issue.

@theboolean What a great bug report, including a compact reproduction! This exposed not one but two quite subtle bugs, one in SQLite which is possibly the best tested library on the planet (and one in wa-sqlite which is...not). The bug was fixed in SQLite only hours after submission, and was considered important enough to bypass the code freeze for their imminent release.

@theboolean, @iansinnott - do you guys need a fix immediately or can you wait till the v0.16 release?

@tantaman I think it makes sense to merge wa-sqlite v0.9.8 as a patch and then releasing v0.16. What do you think?

@theboolean, @iansinnott - do you guys need a fix immediately or can you wait till the v0.16 release?

@tantaman no rush, thanks for looking into this

This'll fix it -- #36

but I'll need to publish a new release for everyone.

Hi @tantaman! Touching base just to check if there are any updates on this :)

@theboolean - I'll be publishing a new release today. Luckily it'll be compatible with the current release as we decided to revert some breaking changes.

re-opening until the release is actually published.

Almost there. Need to get this pulled in -- #38

New release published. You should be all set. Note that I've published these under the next tag until I've had a chance to roll them into the various examples and demo apps to further verify the builds.

Thank you @tantaman, @rhashimoto, and all the people involved in fixing this 🙏🏻

v0.16.0-next.2 is pretty solid at this point. There were a few bugs in v0.16.0-next.0 that are now fixed.

Will be making next.2 the stable release soon.