This is not my first attempt at writing a programming language, but it is the first to generate code, though it cannot (yet) parse anything.
It exists largely to write down my mental model of compilers, though most of
the time spent so far has gone into encoding x86_64 instructions from an embedded
DSL. The x86_64 encoder is incomplete and only handles `ADD`-like instructions.
There are a good number of hardcoded tests that compare encoded instructions
to a slice of bytes. However, this can be a little error-prone, so there is
also some "integration testing" that compares the instruction encoding here
with that of `objdump`.
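As an illustration of what encoding an `ADD`-like instruction and checking it against a byte slice involves, here is a minimal sketch. It is not the embedded DSL used in this repository: `AluOp`, `Reg64` and `encode_alu_rr` are hypothetical names, and reading "ADD-like" as the eight ALU operations that share one opcode layout (ADD, OR, ADC, SBB, AND, SUB, XOR, CMP) is an assumption. Only the register-to-register `op r/m64, r64` form is covered.

```rust
/// The eight ALU operations that share one encoding pattern (assumed meaning of "ADD-like").
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum AluOp {
    Add = 0,
    Or = 1,
    Adc = 2,
    Sbb = 3,
    And = 4,
    Sub = 5,
    Xor = 6,
    Cmp = 7,
}

/// 64-bit general purpose registers, numbered as in the hardware encoding.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Reg64 {
    Rax = 0,
    Rcx,
    Rdx,
    Rbx,
    Rsp,
    Rbp,
    Rsi,
    Rdi,
    R8,
    R9,
    R10,
    R11,
    R12,
    R13,
    R14,
    R15,
}

/// Encode `op dst, src` for two 64-bit registers (the `op r/m64, r64` form).
/// Hypothetical helper, not the repository's API.
pub fn encode_alu_rr(op: AluOp, dst: Reg64, src: Reg64) -> Vec<u8> {
    let (dst, src) = (dst as u8, src as u8);
    // REX prefix 0100WRXB: W=1 selects 64-bit operands,
    // R extends ModRM.reg (src), B extends ModRM.rm (dst).
    let rex = 0x48 | ((src >> 3) << 2) | (dst >> 3);
    // Each operation owns a block of 8 opcodes; offset 1 is `r/m64, r64`.
    let opcode = ((op as u8) << 3) | 0x01;
    // ModRM: mod=11 (register direct), reg=src, rm=dst.
    let modrm = 0xC0 | ((src & 7) << 3) | (dst & 7);
    vec![rex, opcode, modrm]
}
```

A hardcoded test in the style described above would then compare the result to literal bytes, for example `assert_eq!(encode_alu_rr(AluOp::Add, Reg64::Rax, Reg64::Rbx), [0x48, 0x01, 0xD8]);` for `add rax,rbx`.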
Each test case also has an associated string name, which is the Intel-syntax
version of the instruction. The names and encodings are written to the files
`test_xxx.asm` and `test_xxx.bin` respectively. The output of
`objdump ... test_xxx.bin` can then be diffed against `test_xxx.asm`.
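The file-writing half of that workflow could look roughly like the sketch below. `Case` and `write_cases` are hypothetical names, and the layout (one name per line in the `.asm` file, encodings concatenated in the `.bin` file) is an assumption about how the two files line up. Running `objdump` and normalizing its output for the diff are left out, since the exact invocation is not spelled out here.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// A test case: the Intel-syntax text doubles as the test name.
pub struct Case<'a> {
    pub name: &'a str,
    pub bytes: &'a [u8],
}

/// Write all names to `<stem>.asm` (one per line, assumed layout) and all
/// encodings, concatenated, to `<stem>.bin`. Returns both paths.
pub fn write_cases(
    dir: &Path,
    stem: &str,
    cases: &[Case<'_>],
) -> io::Result<(PathBuf, PathBuf)> {
    let asm_path = dir.join(format!("{}.asm", stem));
    let bin_path = dir.join(format!("{}.bin", stem));

    let mut asm = String::new();
    let mut bin: Vec<u8> = Vec::new();
    for case in cases {
        // Keep the same order in both files so a disassembly lines up with the names.
        asm.push_str(case.name);
        asm.push('\n');
        bin.extend_from_slice(case.bytes);
    }

    fs::write(&asm_path, asm)?;
    fs::write(&bin_path, bin)?;
    Ok((asm_path, bin_path))
}
```

Keeping names and bytes in one struct means the expected text and the encoding cannot drift apart between the two files.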
-
- parser
- - basic types
- - tuples
- - abstract data types
- - abstraction
- - application
- - let binding
- - patterns (missing ADTs)
- - types (missing ADTs)
- - multiple lets
- - pattern lets
-
- string interning
-
- interpreter
- - values
- - beta reduction (function calling)
- - pattern matching
- - let bindings
- - multiple lets
- - pattern lets
-
- type check
- - basic types
- - patterns
- - sizing
- - match arm consistency
- - match exhaustiveness
- - polymorphism
- - type inference
-
- pointers?
- - requires some level of polymorphism
-
- optimizer
- - transform local shadowing into mutability
- - automatically demote args to references
-
- x86_64 instruction encoding
- - add-like
- - ret
- - call
- - jmp
- - syscall
-
- codegen
- - everything on the stack
- - use regs
- - basic types prelude
- - non-builtin string
- - debug info
-
- more arches
- - Z80/Game Boy
- - RISC-V
- - ARM