Fedomn / sqlrs

sqlrs is an in-process sql query engine

sqlrs

sqlrs is an in-process SQL query engine written in Rust and modeled on DuckDB.

  • The goal of this project is to build an embedded, in-process SQL query engine for OLAP workloads.
  • It builds on the Rust ecosystem and Apache Arrow.
  • It implements a columnar, vectorized execution engine.
  • It will support pipeline-parallel execution in the future.
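
To make "columnar, vectorized execution" concrete, here is a minimal sketch that is not taken from the sqlrs source. It assumes a hypothetical `Batch` type that stores each column as a plain `Vec` (sqlrs itself uses Apache Arrow arrays), and it evaluates a query shaped like `select sum(salary+1) from employee where id > 1` one whole column at a time instead of row by row. The names `Batch`, `filter_id_gt`, and `sum_salary_plus_one` are illustrative only.

```rust
/// Hypothetical in-memory batch: one Vec per column, all of equal length.
/// (An assumption for illustration; sqlrs stores columns as Arrow arrays.)
struct Batch {
    id: Vec<i64>,
    salary: Vec<i64>,
}

/// Filter step: evaluate `id > threshold` over the whole column in one pass
/// and return a selection mask with one bool per row.
fn filter_id_gt(batch: &Batch, threshold: i64) -> Vec<bool> {
    batch.id.iter().map(|&id| id > threshold).collect()
}

/// Aggregate step: sum `salary + 1` over the rows the mask keeps.
fn sum_salary_plus_one(batch: &Batch, mask: &[bool]) -> i64 {
    let mut total = 0;
    for (salary, keep) in batch.salary.iter().zip(mask) {
        if *keep {
            total += salary + 1;
        }
    }
    total
}
```

Each operator consumes and produces whole columns, so the inner loops stay tight and touch only the columns the query needs. That property is what makes the columnar layout a good fit for OLAP-style scans and aggregates.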

🚧 The project is actively developing a new planner, v2, inspired by DuckDB. It will replace planner v1 in the future. 🚧

SQL demo

The following SQL statements are currently supported. To test them, start interactive mode with one of these commands:

  • make run: run sqlrs with planner_v1
  • make run_v2: run sqlrs with planner_v2

-- supported in Roadmap 0.1 (planner_v1)
select first_name from employee where last_name = 'Hopkins';

-- supported in Roadmap 0.2 (planner_v1)
select sum(salary+1), count(salary), max(salary) from employee where id > 1;
select state, count(state), sum(salary) from employee group by state;
-- load csv table
\load csv department ./tests/csv/department.csv
\load csv employee ./tests/csv/employee.csv
-- show tables
\dt

-- supported in Roadmap 0.3 (planner_v1)
select id from employee order by id desc offset 2 limit 1;
select * from employee left join state on employee.state=state.state_code and state.state_name!='California State';

-- supported in Roadmap 0.4 (planner_v1)
-- explain plan tree
\explain select a from t1;
-- the heuristic optimizer includes rules such as column pruning, predicate pushdown, and limit pushdown

-- supported in Roadmap 0.5 (planner_v1)
-- distinct
select distinct state from employee;
select count(distinct(b)) from t2;
-- alias
select a as c1 from t1 order by c1 desc limit 1;
select t.a from t1 t where t.b > 1 order by t.a desc limit 1;
-- uncorrelated subqueries (derived table, then scalar)
select t.* from (select * from t1 where a > 1) t where t.b > 7;
select a, (select max(b) from t1) max_b from t1;


-- supported in Roadmap 0.6 (planner_v2)
-- create and insert table in memory
create table t1(v1 int, v2 int, v3 int);
create table t2 as select * from read_csv('t2.csv');
insert into t1 values (0, 4, 1), (1, 5, 2);
select * from t1;
-- select only expressions
select 1, 2.3, '😇', true, null;
-- pragma commands
show tables;
describe t1;
-- previous SQL statements
select v1+1 as a from t1 where a >= 2;
select v1 from t1 limit 2 offset 1;
-- table functions
select * from sqlrs_tables();
select * from sqlrs_columns();
select * from read_csv('t1.csv');
select * from read_csv('t1.csv', header=>true, delim=>',');
select * from 't1.csv';
-- copy
copy t1 from 't1.csv' ( DELIMITER '|', HEADER false);
-- date and interval
select date '1998-12-01' - interval '1' month;
select interval '1' year + date '1998-12-01';

Roadmap

High level description:

  • Roadmap 0.1: Support basic SQL queries on CSV storage
  • Roadmap 0.2: Support aggregation operators, an end-to-end testing framework, and interactive mode
  • Roadmap 0.3: Support limit, order, and join operators
  • Roadmap 0.4: Introduce a heuristic optimizer and common optimization rules
  • Roadmap 0.5: Support distinct, alias, and uncorrelated scalar subquery
  • Roadmap 0.6: New planner_v2, heavily inspired by DuckDB

Please see the Roadmap for more information on the implementation steps.
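
As an illustration of the kind of rewrite rule a heuristic optimizer like the one in Roadmap 0.4 applies, the sketch below pushes a limit beneath a projection. It is a simplified example under stated assumptions, not the sqlrs implementation: the `Plan` enum and `push_down_limit` function are hypothetical, and the rule covers only the `Limit(Project(x))` shape. The rewrite is safe because a projection is computed row by row and changes neither the number nor the order of rows.

```rust
/// Hypothetical logical plan with just enough nodes to show the rule.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { table: String },
    Project { columns: Vec<String>, input: Box<Plan> },
    Limit { n: usize, input: Box<Plan> },
}

/// Rewrite rule: Limit(Project(x)) => Project(Limit(x)).
/// After the rewrite, the projection only processes the rows the limit keeps.
fn push_down_limit(plan: Plan) -> Plan {
    match plan {
        Plan::Limit { n, input } => match *input {
            // The limit sits directly on a projection: swap the two nodes
            // and keep pushing the limit further down.
            Plan::Project { columns, input: inner } => Plan::Project {
                columns,
                input: Box::new(push_down_limit(Plan::Limit { n, input: inner })),
            },
            // Any other child: leave the limit in place, optimize below it.
            other => Plan::Limit {
                n,
                input: Box::new(push_down_limit(other)),
            },
        },
        Plan::Project { columns, input } => Plan::Project {
            columns,
            input: Box::new(push_down_limit(*input)),
        },
        scan @ Plan::Scan { .. } => scan,
    }
}
```

For a plan shaped like `select a from t1 limit 1`, the rule turns `Limit(Project(Scan))` into `Project(Limit(Scan))`. A real optimizer would apply many such rules repeatedly, and would also have to handle operators that a limit cannot cross, such as filters and sorts.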

Deep Dive Series Blog (in Chinese)

License: Apache License 2.0

Languages: Rust 99.8%, Makefile 0.2%