lmangani / embedded-db-benchmarks

Simple, non-authoritative benchmarks for embedded databases running in GitHub Actions

Home Page: https://metrico.in/benchmark

incorrect result, glaredb's sql api is lazy

sundy-li opened this issue

The sql() call just constructs a plan without running it, so the benchmark result is incorrect. We should use gdb.sql(query).execute() to actually execute the query.

https://github.com/GlareDB/glaredb/blob/main/bindings/python/src/connection.rs#L107-L130
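To illustrate why timing a lazy call under-reports the real cost, here is a minimal sketch. LazyQuery, sql() and timed() are hypothetical stand-ins written for this example; they are not GlareDB's classes or API, and the "work" is just a Python sum.

```python
import time


class LazyQuery:
    """Hypothetical lazy query handle (an assumption, not GlareDB's real class)."""

    def __init__(self, rows):
        self._rows = rows
        self.executed = False

    def execute(self):
        # The actual work happens only when execute() is called.
        self.executed = True
        return sum(range(self._rows))


def sql(rows):
    # Building the handle does no work, like a call that only constructs a plan.
    return LazyQuery(rows)


def timed(fn):
    # Return the function's result together with the elapsed wall-clock seconds.
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


plan, plan_seconds = timed(lambda: sql(1_000_000))
total, run_seconds = timed(plan.execute)
print(f"plan only: {plan_seconds:.6f}s, executed: {run_seconds:.6f}s")
```

A benchmark that stops the clock after the first call measures only plan construction, so the timed region has to include whatever step forces execution.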

Thanks @sundy-li, I'm new to GlareDB and this really helps! The suggested form didn't work, but gdb.execute(query) seems to. Could you check and confirm whether this is acceptable?

After some testing, it seems only the .show() function produces realistic results, and the .execute() function does not appear to be supported any longer.

Resolved! Thanks for your valuable input @sundy-li, and let me know if you have other suggestions 👍

let me know if you have other suggestions

Maybe you can try adding Databend to this benchmark; it has a similar API to GlareDB: https://github.com/datafuselabs/databend/blob/main/src/bendpy/README.md

I thought Databend required a service. I'll definitely add it if it works embedded! Thanks for the suggestion.

@sundy-li Databend added, but I couldn't find a way to query a local Parquet file. Feel free to send a PR.

@sundy-li my apologies, I deleted your comment by mistake instead of replying to it! 😢

Example of reading a local Parquet file:

 select * from 'fs:///home/sundy/data_parquet/parquet-00001.snappy.parquet' limit 3;

I've played with the approach a little, and indeed it seems to want full paths. Something like this works, but I see no way of selecting multiple local files without adding them to a stage, which seems overly complex for this test.

>>> import os
>>> db.sql("SELECT COUNT(*) FROM 'fs://"+os.getcwd()+"/hits_0.parquet'").collect()
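The path handling above can be wrapped in a small helper. This is only a sketch: local_parquet_uri() and count_query() are hypothetical names introduced here, and it assumes a POSIX system where absolute paths start with "/", so that "fs://" plus the path gives the fs:///... form shown earlier.

```python
import os


def local_parquet_uri(path):
    # Hypothetical helper: resolve to an absolute path, then prefix the fs:// scheme.
    # On POSIX this produces fs:///abs/path/file.parquet.
    return "fs://" + os.path.abspath(path)


def count_query(path):
    # Build the same COUNT(*) query as in the session above for one local file.
    return f"SELECT COUNT(*) FROM '{local_parquet_uri(path)}'"


print(count_query("hits_0.parquet"))
```

The resulting string can be passed to db.sql(...).collect() as in the session above. It still targets a single file per query.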