Mordent

Mordent is a fully managed relational database management system for .NET. This is an open-source research project, starting from scratch.

I am going to start by writing down the ideas behind this system and, eventually, write some code to implement them.

Manifesto

Here is the list of design ideas that are floating around in my head:

The server would expose everything via a RESTful API. Undecided yet: how to express the query part. OData? RQL? Something custom?
1. "Tables": GET https:////[?query]
2. "Queries": GET https:///?query
3. Read "Procedures": GET https:///database>/[?Parameters]
4. Modifying "Procedures": POST https:///database>/[?Parameters]
Authentication: OAuth; consider an RBAC security model with GRANTs, DENYs and REVOKEs as usual
Transactions: one command - one transaction. No way to declare a transaction other than to send a complete query and wait for the response.

We need a full-fledged language for data processing and logic management. C# would be a perfect fit if we can restrict it to something reasonable.
Each database is a collection of data + code. Code lives in a "project"; a project might depend on another "project", which is a specific database version.
Migrations are still underdesigned - need to think that through. TODO: talk to the people maintaining large and old relational databases; what is their preferred way to handle migrations? We should build a similar counterpart.
There would be various objects in the database:
1. "Tables": C# records, supporting inheritance
2. "Views" aka table-valued functions; defined in terms of queries over the rest of the database (NB: how to prevent circular dependencies?)
3. "Procedures" aka code blocks with parameters that might apply some changes to the data
4. "Queues": special tables that don't allow arbitrary selection - only "consumption" and "population". The idea is that adding to the queue might be part of one transaction, while processing the queue might be part of a different transaction.
5. Custom scalar functions
6. Custom aggregates: special types that are similar to the CLR User-Defined Aggregates, but with a decent implementation based on generics
7. Indexes - indexes are supposed to be outside of the database schema, so we can add and remove them on the fly.
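
To make the "tables as records" and "generic aggregates" ideas a bit more concrete, here is a minimal sketch. Nothing in it is a decided API: the `Person` and `Employee` records, the `IAggregate` interface, the `AverageSalary` class and the `Aggregates.Run` helper are all hypothetical names invented for illustration, and `Run` is only an in-memory stand-in for whatever the engine would actually do.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical "tables": plain C# records, with inheritance between them.
public record Person(int Id, string Name);
public record Employee(int Id, string Name, decimal Salary) : Person(Id, Name);

// Hypothetical contract for a custom aggregate (not an existing Mordent API).
// TState is the running state; Merge lets two partial states be combined.
public interface IAggregate<TInput, TState, TResult>
{
    TState Init();
    TState Accumulate(TState state, TInput value);
    TState Merge(TState left, TState right);
    TResult Terminate(TState state);
}

// Example aggregate: average salary; returns null for an empty input.
public sealed class AverageSalary
    : IAggregate<Employee, (decimal Sum, int Count), decimal?>
{
    public (decimal Sum, int Count) Init() => (0m, 0);

    public (decimal Sum, int Count) Accumulate(
        (decimal Sum, int Count) state, Employee value)
        => (state.Sum + value.Salary, state.Count + 1);

    public (decimal Sum, int Count) Merge(
        (decimal Sum, int Count) left, (decimal Sum, int Count) right)
        => (left.Sum + right.Sum, left.Count + right.Count);

    public decimal? Terminate((decimal Sum, int Count) state)
        => state.Count == 0 ? (decimal?)null : state.Sum / state.Count;
}

public static class Aggregates
{
    // Stand-in for the engine: folds an aggregate over in-memory rows.
    public static TResult Run<TInput, TState, TResult>(
        IAggregate<TInput, TState, TResult> aggregate, IEnumerable<TInput> rows)
    {
        TState state = aggregate.Init();
        foreach (TInput row in rows)
            state = aggregate.Accumulate(state, row);
        return aggregate.Terminate(state);
    }
}
```

With generics, the input, state and result types are all checked by the compiler, and the state can be an ordinary value type. The `Merge` method is an assumption borrowed from the CLR aggregate model; whether Mordent would need it is an open question.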

We would use a memory-mapped file and struct types to represent the stored values.
We will require x64 to avoid the addressability issues.
The engine would generate the struct types for storing the user-defined data automatically, during the schema load.
When processing a query expressed in terms of the user-defined and user-visible types, the engine would convert it to a "query plan", which is imperative code implemented in terms of the internal struct types.
The database structure would be represented as a memory-mapped file, treated as an array of DbPages.
1. Most operations on the database would receive a Span<DbPage> parameter and operate on that span. (TODO: verify the performance penalty of range checking; the expectation is that it is negligibly small)
2. Different types of DbPages will coexist in the form of a "union", i.e. internal blocks explicitly overlaid via the FieldOffset attribute. This is to avoid casts between various Span<DbPageXXX> types
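
The two points above could look roughly like the sketch below. It is only an illustration under several assumptions: the 4096-byte page size, the `PageKind` discriminator, the `DataPageBody` and `IndexPageBody` layouts, and the `PageLayout` and `PageOps` helpers are all made up here, and a plain byte array stands in for the memory-mapped file.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PageLayout
{
    public const int PageSize = 4096; // assumed page size, not decided anywhere
}

public enum PageKind : byte { Free = 0, Data = 1, Index = 2 }

// Hypothetical page bodies; real layouts would carry far more detail.
[StructLayout(LayoutKind.Sequential)]
public struct DataPageBody { public int RowCount; public int FreeBytes; }

[StructLayout(LayoutKind.Sequential)]
public struct IndexPageBody { public int KeyCount; public int RightSibling; }

// The "union": both bodies start at the same offset, so one DbPage type
// covers every page kind and no Span<DbPageXXX> casts are needed.
[StructLayout(LayoutKind.Explicit, Size = PageLayout.PageSize)]
public struct DbPage
{
    [FieldOffset(0)] public PageKind Kind;      // tells which body is valid
    [FieldOffset(8)] public DataPageBody Data;
    [FieldOffset(8)] public IndexPageBody Index;
}

public static class PageOps
{
    // Reinterprets raw bytes as pages without copying.
    public static Span<DbPage> AsPages(Span<byte> raw)
        => MemoryMarshal.Cast<byte, DbPage>(raw);

    // A typical operation: takes a span of pages and works on it in place.
    public static int CountRows(ReadOnlySpan<DbPage> pages)
    {
        int total = 0;
        foreach (ref readonly DbPage page in pages)
        {
            if (page.Kind == PageKind.Data)
                total += page.Data.RowCount;
        }
        return total;
    }
}
```

Because `Data` and `Index` share the same bytes, reading the wrong one returns whatever the other body wrote there, so every operation has to check `Kind` first. Getting a span over a real memory-mapped view would need pointer-based code, which this sketch leaves out.
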
Strings would be stored in the form of an in-row prefix, optionally followed by a B+-tree based implementation. See strings.md for details.