This DuckDB extension, Sheetreader, allows you to read .XLSX files by using https://github.com/polydbms/sheetreader-core.
This repository is based on https://github.com/duckdb/extension-template.
To install the extension binaries from S3, you will need to do two things. Firstly, DuckDB should be launched with the allow_unsigned_extensions option set to true. How to set this depends on the client you're using. Some examples:
CLI:
duckdb -unsigned
Python:
con = duckdb.connect(':memory:', config={'allow_unsigned_extensions' : 'true'})
NodeJS:
db = new duckdb.Database(':memory:', {"allow_unsigned_extensions": "true"});
Secondly, get the extension from S3 (platform is one of linux_amd64, linux_amd64_gcc4, linux_arm64, osx_arm64, osx_amd64, windows_amd64, wasm_eh, wasm_mvp, wasm_threads):
wget https://duckdb-sheetreader-extension.s3.eu-central-1.amazonaws.com/v0.10.3/<platform>/sheetreader.duckdb_extension.gz
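If you script the download, the URL can be assembled from the platform name. The following is a minimal Python sketch, not part of the extension: the helper name extension_url and the constants are hypothetical, and the default version is simply the one that appears in the wget command above. Whether other versions are published in the bucket is not something this sketch assumes.

```python
# Hypothetical helper: builds the S3 download URL shown in the wget command.
BASE_URL = "https://duckdb-sheetreader-extension.s3.eu-central-1.amazonaws.com"

# Platform names as listed in this README.
PLATFORMS = {
    "linux_amd64", "linux_amd64_gcc4", "linux_arm64",
    "osx_arm64", "osx_amd64", "windows_amd64",
    "wasm_eh", "wasm_mvp", "wasm_threads",
}


def extension_url(platform: str, version: str = "v0.10.3") -> str:
    """Return the download URL for one platform, rejecting unknown names."""
    if platform not in PLATFORMS:
        raise ValueError(f"unknown platform: {platform!r}")
    return f"{BASE_URL}/{version}/{platform}/sheetreader.duckdb_extension.gz"
```

For example, extension_url("osx_arm64") yields the same URL as the wget command with <platform> replaced by osx_arm64.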
At the moment the metadata mechanism doesn't work, so you have to prepare the extension for loading:
gzip -d sheetreader.duckdb_extension.gz
truncate -s -256 sheetreader.duckdb_extension # Delete metadata
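On systems without gzip and truncate (for example a plain Windows shell), the same two steps can be approximated in Python. This is a sketch under the assumption that the two commands above are all that is needed: it decompresses the archive and drops the last 256 bytes. The function name prepare_extension is hypothetical. Unlike gzip -d, it leaves the .gz file in place.

```python
import gzip
from pathlib import Path

# Same byte count as `truncate -s -256` in the shell commands above.
METADATA_BYTES = 256


def prepare_extension(gz_path: str, out_path: str) -> int:
    """Decompress gz_path, strip the trailing metadata bytes, write out_path.

    Returns the number of bytes written.
    """
    data = gzip.decompress(Path(gz_path).read_bytes())
    if len(data) <= METADATA_BYTES:
        raise ValueError("file is too small to contain extension code and metadata")
    trimmed = data[:-METADATA_BYTES]  # drop the last 256 bytes
    Path(out_path).write_bytes(trimmed)
    return len(trimmed)
```

A call such as prepare_extension("sheetreader.duckdb_extension.gz", "sheetreader.duckdb_extension") produces the file that the install step expects.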
After running these steps, you can install and load the extension using the regular INSTALL/LOAD commands in DuckDB:
D FORCE INSTALL './sheetreader.duckdb_extension';
D LOAD sheetreader;
Now we can use the features from the extension directly in DuckDB. The extension contains a table function sheetreader()
that takes the path of an .XLSX file and returns a table:
D from sheetreader('data.xlsx',threads=4);
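The same steps can be driven from the Python client. The sketch below is an illustration rather than a documented API of the extension: sheetreader_query and read_sheet are hypothetical helper names, and the code assumes the duckdb Python package is installed and that the prepared sheetreader.duckdb_extension file matches your DuckDB version and platform. Only the table function sheetreader() and its threads parameter come from the example above.

```python
def sheetreader_query(xlsx_path: str, threads: int = 4) -> str:
    """Build the SQL shown above; single quotes in the path are doubled."""
    escaped = xlsx_path.replace("'", "''")
    return f"FROM sheetreader('{escaped}', threads={int(threads)})"


def read_sheet(extension_path: str, xlsx_path: str, threads: int = 4):
    """Install and load the unsigned extension, then return all rows."""
    import duckdb  # imported here so the query helper works without duckdb

    # Unsigned extensions must be allowed when the connection is created.
    con = duckdb.connect(":memory:", config={"allow_unsigned_extensions": "true"})
    ext = extension_path.replace("'", "''")
    con.execute(f"FORCE INSTALL '{ext}'")
    con.execute("LOAD sheetreader")
    return con.execute(sheetreader_query(xlsx_path, threads)).fetchall()
```

For example, read_sheet("./sheetreader.duckdb_extension", "data.xlsx") would run the same query as the shell session above and return the rows as a list of tuples.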
To build the extension, run:
GEN=ninja make
The main binaries that will be built are:
./build/release/duckdb
./build/release/extension/sheetreader/sheetreader.duckdb_extension
duckdb is the binary for the DuckDB shell with the extension code automatically loaded.
sheetreader.duckdb_extension is the loadable binary as it would be distributed.
To run the extension code, simply start the shell with ./build/release/duckdb.