leegao / Lua-SideProjectMonth

Let's build a completely self-hosting Lua compiler from scratch (that means regex, lexer, parser, bytecode, the whole nine yards!)

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Let's build a Lua compiler from the ground up in Lua. Here, we will be developing our own tools from scratch along the way, which is great because Lua doesn't come with much in terms of its standard library so we get to do a lot platforming work as well :)

I will be focusing heavier on the parsing aspect of this compiler as that is the low-hanging fruit aspect of compiler construction that I haven't had a chance to explore deeply yet.

See https://github.com/leegao/Lua-In-Lua/blob/master/lua/grammar.ylua for the grammar, the parser language is written in the same language that the parser parses (yo dawg). Note that Lua's language specification is not LL(n) for any finite n, which means that any oracular lookahead machine will still not be able to parse this without angelicism. To get around this, we use an extremely clever idea: we relax the language to be parsable by LL(3) and we use the semantic action during parse time to restrict valid trees. This mix of "dynamic" and "static" parsing analysis will allow us to get a full Lua parser.

To see the compiler in action, run hello.lua. The output will be a list of opcodes that is generated based on ll1/ll1.lua.

Status Report:

  1. Regular expressions recognizer generator: completed!
  2. Lexer generator: backend completed, pending Parser to self-host the frontend as well.
  3. LL(1) Parser generator: completed, which will use the graph reduction to efficiently compute the fixed point.
  4. Add nullable elimination transformation - (grammar cannot have inherent nullables anymore)
  5. Eliminate production cycles (A ⩲> A)
  6. Eliminte immediate left recursion
  7. Create left-recursion folding mechanism
  8. Create left-factor elimination
  9. Self-hosting the Lexer.
  10. Self-hosting the Parser.
  11. Added support for extende BNF grammars
  12. Added support for oracular lookaheads
  13. Create a lua parser in LL(1)
  14. Add semantic actions to the lua parser
  15. Add a tokenizer for Lua
  16. Lua 5.2 bytecode deserializer
  17. AST -> Bytecode translation
  18. AST -> Bytecode compiler.
  19. Able to compile Hello World!

In progress:

  1. Operator precedence in parser
  2. Lua 5.2 bytecode interpreter

TODO:

  1. Standard library
  2. Self-host the toolchain (5.2 compatibility)

Stretch:

  1. LR(0-1) Parser, which will also use a similar mechanism
  2. Type-induction.
  3. Type inferencing.

About

Let's build a completely self-hosting Lua compiler from scratch (that means regex, lexer, parser, bytecode, the whole nine yards!)


Languages

Language:Lua 100.0%