frederick-vs-ja / type-exercise-in-rust

Learn the Rust type system by implementing basic types in database systems

Type Exercise in Rust

(In Chinese) Rust 语言中的类型体操 - 以数据库系统为例 (Type Exercise in the Rust Language, Using a Database System as an Example)

This is a short lecture on how to use the Rust type system to build necessary components in a database system.

The lecture revolves around how Rust programmers (like me) build database systems in the Rust programming language. We leverage the Rust type system to minimize runtime cost and to make development easier, using safe, nightly Rust.

Map of Types

Day 1: Array and ArrayBuilder

ArrayBuilder and Array are reciprocal traits: an ArrayBuilder produces an Array, and we can build a new Array by feeding the elements of an existing Array into an ArrayBuilder. In day 1, we implement arrays for primitive types (like i32 and f32) and for variable-length types (like String). We use associated types in traits to deduce the right type in generic functions, and use generic associated types (GATs) to unify the Array interface for both fixed-length and variable-length types. This framework is similar to libraries like Arrow, but with much stronger type constraints and much lower runtime overhead.
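To make the reciprocal relationship concrete, here is a minimal sketch of how the two traits could be tied together. The trait names come from the lecture; the method set, the `Vec<Option<_>>` storage, and the `copy_array` helper are assumptions made for illustration, not the lecture's actual code. GATs have been stable since Rust 1.65, so this sketch does not need nightly.

```rust
/// A column of nullable values. The exact method set is an assumption.
pub trait Array: Sized {
    /// The builder that produces this array type.
    type Builder: ArrayBuilder<Array = Self>;
    /// Borrowed view of one element. It is a GAT because the view may
    /// borrow from the array (`&'a str` for strings).
    type RefItem<'a>: Copy
    where
        Self: 'a;

    fn len(&self) -> usize;
    /// Returns `None` for a null slot. Panics if `idx` is out of bounds.
    fn get(&self, idx: usize) -> Option<Self::RefItem<'_>>;
}

pub trait ArrayBuilder: Sized {
    /// The array type this builder produces.
    type Array: Array<Builder = Self>;

    fn with_capacity(capacity: usize) -> Self;
    fn push(&mut self, value: Option<<Self::Array as Array>::RefItem<'_>>);
    fn finish(self) -> Self::Array;
}

/// Fixed-length element type: the reference item is the value itself.
pub struct I32Array(Vec<Option<i32>>);
pub struct I32ArrayBuilder(Vec<Option<i32>>);

impl Array for I32Array {
    type Builder = I32ArrayBuilder;
    type RefItem<'a> = i32;
    fn len(&self) -> usize { self.0.len() }
    fn get(&self, idx: usize) -> Option<i32> { self.0[idx] }
}

impl ArrayBuilder for I32ArrayBuilder {
    type Array = I32Array;
    fn with_capacity(capacity: usize) -> Self { Self(Vec::with_capacity(capacity)) }
    fn push(&mut self, value: Option<i32>) { self.0.push(value) }
    fn finish(self) -> I32Array { I32Array(self.0) }
}

/// Variable-length element type: the reference item borrows from the array.
/// Stored as `Vec<Option<String>>` for brevity, not as a flat byte buffer.
pub struct StringArray(Vec<Option<String>>);
pub struct StringArrayBuilder(Vec<Option<String>>);

impl Array for StringArray {
    type Builder = StringArrayBuilder;
    type RefItem<'a> = &'a str;
    fn len(&self) -> usize { self.0.len() }
    fn get(&self, idx: usize) -> Option<&str> { self.0[idx].as_deref() }
}

impl ArrayBuilder for StringArrayBuilder {
    type Array = StringArray;
    fn with_capacity(capacity: usize) -> Self { Self(Vec::with_capacity(capacity)) }
    fn push(&mut self, value: Option<&str>) { self.0.push(value.map(str::to_owned)) }
    fn finish(self) -> StringArray { StringArray(self.0) }
}

/// Hypothetical helper: one generic function works for every array type,
/// because the associated types tell the compiler which builder to use.
pub fn copy_array<A: Array>(src: &A) -> A {
    let mut builder = A::Builder::with_capacity(src.len());
    for idx in 0..src.len() {
        builder.push(src.get(idx));
    }
    builder.finish()
}
```

Note that `copy_array` never names `I32Array` or `StringArray`; the compiler resolves the builder and the element view at compile time, so no runtime type check is involved.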

Day 2: Scalar and ScalarRef

Scalar and ScalarRef are reciprocal traits. We can get a ScalarRef that borrows from a Scalar, and convert a ScalarRef back into an owned Scalar. By adding these two traits, we can write more generic functions with zero runtime overhead on type matching and conversion. Meanwhile, we associate Scalar with Array, so that functions over both are easier to write.
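A minimal sketch of the owned/borrowed pairing might look like the following. The trait names are the lecture's; the method names `as_scalar_ref` and `to_owned_scalar` and the `round_trip` helper are assumptions for illustration, and the association with Array is left out to keep the example short.

```rust
/// Owned value, e.g. `i32` or `String`.
pub trait Scalar: Clone + 'static {
    /// The borrowed counterpart. It is a GAT because it may borrow from `self`.
    type RefType<'a>: ScalarRef<'a, ScalarType = Self>
    where
        Self: 'a;

    fn as_scalar_ref(&self) -> Self::RefType<'_>;
}

/// Borrowed value, e.g. `i32` or `&str`.
pub trait ScalarRef<'a>: Copy + 'a {
    /// The owned counterpart, which must point back at this reference type.
    type ScalarType: Scalar<RefType<'a> = Self>;

    fn to_owned_scalar(&self) -> Self::ScalarType;
}

// For a primitive, the owned type and the reference type are the same.
impl Scalar for i32 {
    type RefType<'a> = i32;
    fn as_scalar_ref(&self) -> i32 { *self }
}

impl<'a> ScalarRef<'a> for i32 {
    type ScalarType = i32;
    fn to_owned_scalar(&self) -> i32 { *self }
}

// For a variable-length type, the reference borrows from the owned value.
impl Scalar for String {
    type RefType<'a> = &'a str;
    fn as_scalar_ref(&self) -> &str { self.as_str() }
}

impl<'a> ScalarRef<'a> for &'a str {
    type ScalarType = String;
    fn to_owned_scalar(&self) -> String { self.to_string() }
}

/// Hypothetical helper: borrow a scalar and convert it back, for any scalar
/// type. The conversions are resolved at compile time; there is no type tag
/// to match on.
pub fn round_trip<S: Scalar>(value: &S) -> S {
    value.as_scalar_ref().to_owned_scalar()
}
```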

TBD Lectures

Day 3: ArrayImpl, ArrayBuilderImpl, ScalarImpl and ScalarRefImpl

Some type information may not be available until runtime. Therefore, we use XXXImpl enums that cover all variants of a single kind of type (for example, ArrayImpl covers every concrete Array).
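As a rough sketch of the idea, the enums below wrap two concrete array types and their element references. The enum names come from the lecture title; the variants, the tuple-struct arrays, and the `column_from_type_name` function are hypothetical stand-ins.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct I32Array(pub Vec<Option<i32>>);

#[derive(Debug, Clone, PartialEq)]
pub struct StringArray(pub Vec<Option<String>>);

/// One variant per concrete array type.
#[derive(Debug, Clone, PartialEq)]
pub enum ArrayImpl {
    Int32(I32Array),
    String(StringArray),
}

/// One variant per concrete element reference type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ScalarRefImpl<'a> {
    Int32(i32),
    String(&'a str),
}

impl ArrayImpl {
    pub fn len(&self) -> usize {
        match self {
            Self::Int32(a) => a.0.len(),
            Self::String(a) => a.0.len(),
        }
    }

    /// Returns `None` for a null slot. Panics if `idx` is out of bounds.
    pub fn get(&self, idx: usize) -> Option<ScalarRefImpl<'_>> {
        match self {
            Self::Int32(a) => a.0[idx].map(ScalarRefImpl::Int32),
            Self::String(a) => a.0[idx].as_deref().map(ScalarRefImpl::String),
        }
    }
}

/// Hypothetical example of a type that is only known at runtime: the column
/// type arrives as a string, so the result has to be an `ArrayImpl`.
pub fn column_from_type_name(type_name: &str) -> Option<ArrayImpl> {
    match type_name {
        "int" => Some(ArrayImpl::Int32(I32Array(vec![Some(1), None]))),
        "varchar" => Some(ArrayImpl::String(StringArray(vec![Some("a".to_string()), None]))),
        _ => None,
    }
}
```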

Day 4: More Types with Macro

As we add more and more data types, we have to write the same code in every match arm. In day 4, we use declarative macros (instead of procedural macros or other kinds of code generators) to generate such code and avoid boilerplate.
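The following sketch shows the kind of boilerplate a declarative macro can remove. The macro name `impl_array_dispatch` is hypothetical, and plain `Vec<T>` stands in for the lecture's array types; adding a new data type then only takes one more line in the invocation.

```rust
/// Hypothetical macro: from a list of `{ Variant, ArrayType }` pairs, generate
/// the enum, the per-variant `match` arms, and the `From` conversions.
macro_rules! impl_array_dispatch {
    ($( { $variant:ident, $array:ty } ),* $(,)?) => {
        pub enum ArrayImpl {
            $( $variant($array), )*
        }

        impl ArrayImpl {
            /// Every arm is identical apart from the variant name.
            pub fn len(&self) -> usize {
                match self {
                    $( Self::$variant(a) => a.len(), )*
                }
            }

            /// Name of the variant, produced with `stringify!`.
            pub fn type_name(&self) -> &'static str {
                match self {
                    $( Self::$variant(_) => stringify!($variant), )*
                }
            }
        }

        $(
            impl From<$array> for ArrayImpl {
                fn from(array: $array) -> Self {
                    Self::$variant(array)
                }
            }
        )*
    };
}

// `Vec<T>` is a stand-in for the concrete array types.
impl_array_dispatch! {
    { Int32, Vec<i32> },
    { Float32, Vec<f32> },
    { String, Vec<String> },
}
```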

Day 5: Binary Expressions

Now that we have Array, ArrayBuilder, Scalar and ScalarRef, we can convert every function we wrote to a vectorized one using generics.
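The sketch below illustrates the vectorization step in a simplified setting. It assumes slices of `Option<T>` in place of the lecture's Array and ArrayBuilder traits, and the function name `eval_binary` is hypothetical. Any scalar function of two arguments becomes a function over whole columns, with nulls propagated.

```rust
/// Apply a scalar function element-wise over two columns.
/// A null (`None`) on either side produces a null in the output.
pub fn eval_binary<A, B, O, F>(lhs: &[Option<A>], rhs: &[Option<B>], f: F) -> Vec<Option<O>>
where
    F: Fn(&A, &B) -> O,
{
    assert_eq!(lhs.len(), rhs.len(), "input columns must have equal length");
    lhs.iter()
        .zip(rhs)
        .map(|(a, b)| match (a, b) {
            (Some(a), Some(b)) => Some(f(a, b)),
            _ => None,
        })
        .collect()
}

/// An ordinary scalar function; it knows nothing about columns or nulls.
pub fn cmp_ge(a: &i32, b: &i32) -> bool {
    a >= b
}

/// Another scalar function over a variable-length type.
pub fn str_contains(haystack: &String, needle: &String) -> bool {
    haystack.contains(needle.as_str())
}
```

Because `eval_binary` is generic over the input and output element types, the compiler generates a specialized loop for each scalar function it is used with.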

Day 6: Aggregators

Aggregators are another kind of expression. We learn how to implement them easily with our type system in day 6.
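Since this lecture is still to be written, the following is only one possible shape for an aggregator, not the lecture's design. The `Aggregator` trait, its methods, and the `SumI32`, `MinString`, and `aggregate` names are all assumptions. Slices of `Option<T>` stand in for arrays.

```rust
/// Hypothetical aggregator interface: fold a column into a single value.
pub trait Aggregator: Sized {
    type Input;
    type Output;

    /// Feed one (possibly null) value into the running state.
    fn update(&mut self, value: Option<&Self::Input>);
    /// Consume the state. `None` means no non-null input was seen.
    fn finish(self) -> Option<Self::Output>;
}

/// Sum of an i32 column, widened to i64.
#[derive(Default)]
pub struct SumI32(Option<i64>);

impl Aggregator for SumI32 {
    type Input = i32;
    type Output = i64;

    fn update(&mut self, value: Option<&i32>) {
        if let Some(v) = value {
            self.0 = Some(self.0.unwrap_or(0) + i64::from(*v));
        }
    }

    fn finish(self) -> Option<i64> { self.0 }
}

/// Lexicographic minimum of a string column.
#[derive(Default)]
pub struct MinString(Option<String>);

impl Aggregator for MinString {
    type Input = String;
    type Output = String;

    fn update(&mut self, value: Option<&String>) {
        if let Some(v) = value {
            let replace = match &self.0 {
                Some(current) => v < current,
                None => true,
            };
            if replace {
                self.0 = Some(v.clone());
            }
        }
    }

    fn finish(self) -> Option<String> { self.0 }
}

/// Drive any aggregator over a column; nulls are passed through to `update`.
pub fn aggregate<G: Aggregator>(mut agg: G, column: &[Option<G::Input>]) -> Option<G::Output> {
    for value in column {
        agg.update(value.as_ref());
    }
    agg.finish()
}
```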

Day 7: Expression Framework

Now that we have more and more kinds of expressions, we need an expression framework to unify them, including unary expressions, binary expressions, and expressions with more inputs. At the same time, we also need to automatically convert an ArrayImpl into its corresponding concrete type using the TryFrom and TryInto traits.

We will also experiment with return value optimizations for variable-size types.
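The TryFrom conversion described for day 7 could be sketched as follows. `ArrayImpl` is the lecture's name; the tuple-struct arrays, the `TypeMismatch` error, and the `add_i32` and `eval_add` functions are hypothetical. The point is that the runtime check happens once per array at the boundary, after which the statically typed kernel runs without further matching.

```rust
use std::convert::TryFrom;

#[derive(Debug, PartialEq)]
pub struct I32Array(pub Vec<Option<i32>>);

#[derive(Debug, PartialEq)]
pub struct StringArray(pub Vec<Option<String>>);

#[derive(Debug, PartialEq)]
pub enum ArrayImpl {
    Int32(I32Array),
    String(StringArray),
}

/// Hypothetical error type for a failed downcast.
#[derive(Debug, PartialEq)]
pub struct TypeMismatch {
    pub expected: &'static str,
}

impl<'a> TryFrom<&'a ArrayImpl> for &'a I32Array {
    type Error = TypeMismatch;

    fn try_from(array: &'a ArrayImpl) -> Result<Self, TypeMismatch> {
        match array {
            ArrayImpl::Int32(inner) => Ok(inner),
            _ => Err(TypeMismatch { expected: "Int32" }),
        }
    }
}

impl<'a> TryFrom<&'a ArrayImpl> for &'a StringArray {
    type Error = TypeMismatch;

    fn try_from(array: &'a ArrayImpl) -> Result<Self, TypeMismatch> {
        match array {
            ArrayImpl::String(inner) => Ok(inner),
            _ => Err(TypeMismatch { expected: "String" }),
        }
    }
}

/// Statically typed kernel: element-wise addition with null propagation.
fn add_i32(lhs: &I32Array, rhs: &I32Array) -> I32Array {
    let data = lhs.0.iter().zip(&rhs.0).map(|(a, b)| match (a, b) {
        (Some(a), Some(b)) => Some(a + b),
        _ => None,
    });
    I32Array(data.collect())
}

/// Dynamically typed entry point: downcast both inputs, then call the kernel.
pub fn eval_add(lhs: &ArrayImpl, rhs: &ArrayImpl) -> Result<ArrayImpl, TypeMismatch> {
    let lhs = <&I32Array>::try_from(lhs)?;
    let rhs = <&I32Array>::try_from(rhs)?;
    Ok(ArrayImpl::Int32(add_i32(lhs, rhs)))
}
```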

Day 8: Physical Data Type and Logical Data Type

i32 and i64 are simply physical types: they describe how values are stored in memory (or on disk). But in a database system, we also have logical types (like Char and Varchar). In day 8, we learn how to associate logical types with physical types using macros.
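One way such an association could be expressed is sketched below. The `DataType` and `PhysicalType` enums, their variants, and the `physical_type_of` macro are assumptions for illustration; the lecture's actual mapping may differ. The macro keeps the logical-to-physical table in one place and expands it into a `match`.

```rust
/// How values are laid out in memory.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PhysicalType {
    Int32,
    Int64,
    String,
}

/// What the SQL user sees. Several logical types can share one physical type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DataType {
    Integer,
    BigInt,
    Char { width: u16 },
    Varchar,
}

/// Hypothetical macro: expand a `logical => physical` table into a method.
macro_rules! physical_type_of {
    ($( $logical:pat => $physical:ident ),* $(,)?) => {
        impl DataType {
            pub fn physical_type(&self) -> PhysicalType {
                match self {
                    $( $logical => PhysicalType::$physical, )*
                }
            }
        }
    };
}

physical_type_of! {
    DataType::Integer => Int32,
    DataType::BigInt => Int64,
    DataType::Char { .. } => String,
    DataType::Varchar => String,
}
```

Because the expansion is an ordinary `match`, the compiler still reports an error if a newly added logical type is missing from the table.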

License:Apache License 2.0


Languages

Language:Rust 100.0%