observablehq / feedback

Customer submitted bugs and feature requests

Update duckdb-wasm to latest

jzavala-gonzalez opened this issue

Is your feature request related to a problem? Please describe.
I'm writing a notebook that uses the native Observable DuckDB client. One query uses DuckDB's UNPIVOT statement, introduced in DuckDB 0.8.0 (released around mid-May 2023, see announcement and docs). The query fails with a syntax error (see screenshot), which I'm guessing is due to an outdated duckdb-wasm version in Observable. Checking the stdlib dependencies file shows Observable is using version 1.24.0 of duckdb-wasm, which was published in March according to npm. That months-long difference between 1.24.0 and 1.27.0 could explain why UNPIVOT isn't working within Observable yet.
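
Until the bundled version supports UNPIVOT, the same wide-to-long reshaping can usually be written with UNION ALL. The sketch below is illustrative only: the `readings` table and its columns are hypothetical, and Python's built-in `sqlite3` stands in for DuckDB purely so the SQL can be run without extra dependencies.

```python
import sqlite3

# Hypothetical wide table: one row per month, one column per measure.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (month TEXT, min_temp REAL, max_temp REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [("2023-05", 11.0, 24.5), ("2023-06", 15.5, 29.0)],
)

# With DuckDB >= 0.8.0 this would be roughly:
#   UNPIVOT readings ON min_temp, max_temp INTO NAME measure VALUE value
# The UNION ALL form below avoids the UNPIVOT keyword entirely.
UNPIVOT_EMULATION = """
SELECT month, 'min_temp' AS measure, min_temp AS value FROM readings
UNION ALL
SELECT month, 'max_temp' AS measure, max_temp AS value FROM readings
ORDER BY month, measure
"""

# One output row per (month, measure) pair.
long_rows = conn.execute(UNPIVOT_EMULATION).fetchall()
```

The emulation needs one SELECT per unpivoted column, so it gets verbose for wide tables, which is part of why the native statement is worth having.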

Describe the solution you'd like
Update to the latest version of duckdb-wasm, which at time of writing is 1.27.0 and was published mid-June.

Describe alternatives you've considered
If updating isn't possible, one could update the intro notebook "Hello, DuckDB" to mention what version of duckdb Observable is currently running so users can look at the correct documentation.

Additional context
unpivot syntax error
stdlib dependencies
npm duckdb-wasm versions

Related observablehq/stdlib#364 observablehq/stdlib#378

We ran into backwards incompatibility issues last time we upgraded, but someone simply needs to test the latest version of duckdb-wasm (currently 1.27.0) and then we can upgrade.

Thank you for the quick response and merge!! I'm still getting the syntax error but I haven't been able to check whether the update has activated on the notebook (or what version of the stdlib and runtime it's using). Is the update immediate? I can check over the next few days if it changes or check if I just have a bug in the query.

Hasn’t quite landed yet! It should be available this week though. We will give an update here when it is released.

hey @jzavala-gonzalez, we just deployed the update and i see UNPIVOT working in production now. let us know if it's working for you!

It's working!! I can finish the notebook now. Thank you so much!! In case there's interest here's the (in-progress) link: https://observablehq.com/@jzavala-gonzalez/am4-scalable-time-series-visualization It's a fork of UW Data's implementation of the M4 algorithm so it adjusts into AM4. The algorithm is very SQL friendly so UNPIVOT really helps wrap it together as the last step before plotting. Thank you! Closing the issue

unfortunately the upgrade seems to have created some issues with the DuckDB client on Safari, so we just had to revert it. sorry for the churn, we will work on getting a fix in and re-applying the upgrade ASAP! i'll reopen the issue for now.

Ok! Take the time you need. I had the chance to actually finish the notebook while the update was out so I already know it should all work once the update is back up. So no rush from my part. Thank you!!

To summarize a little more:

  • It looks like duckdb-wasm 1.25.0 introduces the issue.
  • This notebook reproduces the error: in Safari and Firefox, loading a sufficiently large CSV from S3 (i.e. a file attachment if you refresh after uploading) into DuckDB fails with a range request error. movies.csv (439 KB) fails; penguins.csv (14 KB) doesn’t.
  • There’s a config option, allowFullHTTPReads, which looked like it might help, but either it didn’t or I don’t understand.

We’re investigating; I also opened an issue on duckdb-wasm, in case they spot or know anything else.

Hi all, Carlo here, from DuckDB team.

We would really like to see duckdb-wasm bumped, and I am available to help, but I think in this case there is something off with either the S3 settings or the CDN settings when range requests are made with the "Accept-Encoding" header (this is automatically handled by the browser, see: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept-Encoding).

Could someone take a look at these simple curl examples: duckdb/duckdb-wasm#1366 (comment)? I believe fixing the responses in those cases (that is, NOT advertising something as compressed blindly but only when actually compressed) would solve the issue you are seeing.
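
To make the "advertised as compressed but not actually compressed" condition concrete, here is a minimal sketch of a check one could run on a captured response. It is an assumption-laden illustration, not the check from the linked curl examples: the function name is hypothetical, only gzip is handled, and the test is meaningful only for a response (or range) that starts at byte 0, since a later slice of a gzip stream would not begin with the magic bytes.

```python
import gzip
from typing import Mapping

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def encoding_matches_body(headers: Mapping[str, str], body: bytes) -> bool:
    """Return False when headers claim gzip but the body is not gzip.

    Hypothetical helper: `headers` and `body` are whatever was captured
    from the response, with `body` starting at byte 0 of the payload.
    """
    encoding = ""
    for name, value in headers.items():
        if name.lower() == "content-encoding":  # header names are case-insensitive
            encoding = value.strip().lower()
    if encoding != "gzip":
        return True  # nothing advertised, or a coding this sketch does not check
    return body.startswith(GZIP_MAGIC)


# Small demonstration with synthetic data.
plain_csv = b"species,island\nAdelie,Torgersen\n"
honest = encoding_matches_body({"Content-Encoding": "gzip"}, gzip.compress(plain_csv))
mislabeled = encoding_matches_body({"Content-Encoding": "gzip"}, plain_csv)
```

Here `honest` is true and `mislabeled` is false; a mislabeled response is the kind that would make a client try to decompress bytes that were never compressed.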

thanks for digging into this, Carlo! this is super helpful. we're going to look into our S3/CDN configuration and figure out what's happening – will update here.

+1
Especially since duckdb-wasm can load extensions like spatial.
https://duckdb.org/2023/12/18/duckdb-extensions-in-wasm.html

apologies for the delay – the fix turned out to be more complicated than a CDN config change, we've still got a few things to iron out but hopefully will be able to ship it soon.

I agree. It will be very useful to load the spatial extension in DuckDB to create beautiful maps. Thank you in advance.

I can't wait to use duckdb's spatial extension in observable...

Any updates on this? I also found myself needing to use duckdb with unpivot and extension support. Will these changes be reflected in Observable Framework or should I also create an issue over there?

@zookini Observable Framework uses a different standard library than notebooks and generally defaults to the latest version of everything. However, we have currently pinned the default version of @duckdb/duckdb-wasm to 1.28.0 because the DuckDB-Wasm team inadvertently marked 1.28.1 as the latest tag in npm even though it is still in prerelease. Also, the DuckDB-Wasm team was having problems publishing 1.28.1 prereleases to jsDelivr (duckdb/duckdb-wasm#1561), though that has since been fixed.

https://github.com/observablehq/framework/blob/0b40787072ee245a54dbafc9a0d24d9a97ae2c17/src/javascript/imports.ts#L346
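
As a rough illustration of why a pin helps when the `latest` tag points at a prerelease, the sketch below picks the highest version that has no prerelease suffix. This is not how Observable Framework resolves versions; the function names are hypothetical and the prerelease string in the example is made up for illustration.

```python
from typing import Iterable, Optional, Tuple


def stable_key(version: str) -> Optional[Tuple[int, int, int]]:
    """Return (major, minor, patch) for a plain release, or None for a prerelease."""
    core, dash, _prerelease = version.partition("-")
    if dash:  # anything after "-" marks a prerelease in semver
        return None
    parts = core.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        return None
    return (int(parts[0]), int(parts[1]), int(parts[2]))


def latest_stable(versions: Iterable[str]) -> Optional[str]:
    """Return the highest non-prerelease version, or None if there is none."""
    keyed = []
    for version in versions:
        key = stable_key(version)
        if key is not None:
            keyed.append((key, version))
    return max(keyed)[1] if keyed else None


# Illustrative version list; "1.28.1-dev1.0" is a made-up prerelease string.
chosen = latest_stable(["1.27.0", "1.28.0", "1.28.1-dev1.0"])
```

With that list, `chosen` is "1.28.0", matching the version the pin selects by hand.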

For reference, the correspondence between duckdb versions and storage version is here: https://duckdb.org/docs/internals/storage#storage-version-table

Also: all versions of duckdb, starting with 0.10.0, will be able to read previous versions (from 0.9). See https://duckdb.org/docs/internals/storage#backward-compatibility and https://duckdb.org/2024/02/13/announcing-duckdb-0100.html#backward-compatibility.

Currently, in Observable notebooks, we're not able to attach duckdb files created with duckdb 0.10.0 (such as https://huggingface.co/datasets/nyu-mll/glue/resolve/refs%2Fconvert%2Fduckdb/ax/test/index.duckdb); it gives:

Error: Invalid Input Error: Attempting to fetch from an unsuccessful query result
Error: IO Error: Trying to read a database file with version number 64, but we can only read version 43.
The database file was created with an newer version of DuckDB.

The storage of DuckDB is not yet stable; newer versions of DuckDB cannot read old database files and vice versa.
The storage will be stabilized when version 1.0 releases.

For now, we recommend that you load the database file in a supported version of DuckDB, and use the EXPORT DATABASE command followed by IMPORT DATABASE on the current version of DuckDB.

See the storage page for more information: https://duckdb.org/internals/storage
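
For anyone who wants to check a file before attaching it, the storage version in the error above can be read from the file header. The sketch below assumes the layout described on the storage page (an 8-byte checksum, the 4 magic bytes `DUCK`, then the version as a little-endian 64-bit integer); the function name is hypothetical and the layout should be treated as an assumption for versions not covered by that page.

```python
import struct

HEADER_LENGTH = 20  # 8-byte checksum + 4 magic bytes + 8-byte version
MAGIC = b"DUCK"


def read_storage_version(header: bytes) -> int:
    """Extract the storage version number from the first bytes of a DuckDB file."""
    if len(header) < HEADER_LENGTH:
        raise ValueError("need at least the first 20 bytes of the file")
    if header[8:12] != MAGIC:
        raise ValueError("magic bytes not found; not a DuckDB database file?")
    # The version is assumed to be an unsigned little-endian 64-bit integer at offset 12.
    (version,) = struct.unpack_from("<Q", header, 12)
    return version


# Synthetic header standing in for a real file, carrying version 64.
fake_header = bytes(8) + MAGIC + struct.pack("<Q", 64)
detected = read_storage_version(fake_header)
```

For a real file, pass the result of reading its first 20 bytes in binary mode. A result of 64 against a client that reports "we can only read version 43" reproduces the mismatch in the error message.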

If we do tag duckdb-wasm 1.29.0 (which is aligned to duckdb 0.10.1) later today, would that help in finally solving this problem?

It would be awesome! It would also include the fix to duckdb/duckdb#10263, right? (sorry if I'm hijacking the thread, I can delete)

@severo: I think the right place would be discussing this as part of the support channels available