DS (Data Structure) AST (Abstract Syntax Tree)
A lexer transforms a sequence of characters into a sequence of tokens.
The lexer is a crucial component of a compiler. It breaks the source code down into smaller, meaningful units called tokens; the raw text matched for each token is its lexeme. The tokens are then fed into the parser, which constructs the abstract syntax tree (AST) of the program.
enum TokenType {
...
}
interface TokenPosition {
/** line */
ln: number;
/** column */
col: number;
}
interface Token {
type: TokenType;
lexeme: string;
/** useful later on for debugging and error reporting in a given programming language */
position?: TokenPosition;
}
type LexerFn = (input: string) => Token[];
- Tokenization: The lexer converts the source code into tokens, which are the smallest syntactic units of the language. These tokens can be identifiers, keywords, literals, operators, or other special characters.
- Regular Expressions: Lexers commonly use regular expressions to define the patterns that identify these tokens. This approach allows for efficient and flexible token recognition.
- State Transition Table: The lexer can be implemented as a finite-state machine driven by a state transition table. Lexer generators may instead emit a directly coded engine that jumps to follow-up states via goto statements, which tends to be faster than the table-driven approach.
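The tokenization and regular-expression points above can be sketched as a small function of the `LexerFn` shape. This is a minimal sketch, not a definitive implementation: the token categories, the `rules` table, and its patterns are assumptions made for illustration. The `TokenType` body in the original is elided, so a hypothetical string-literal union stands in for it here.

```typescript
// Hypothetical token categories (assumption: the original enum body is elided).
type TokenType = "Number" | "Identifier" | "Operator" | "Paren";

interface TokenPosition {
  ln: number;
  col: number;
}

interface Token {
  type: TokenType;
  lexeme: string;
  position?: TokenPosition;
}

type LexerFn = (input: string) => Token[];

// One anchored pattern per token category, tried in order (illustrative only).
const rules: Array<{ type: TokenType; pattern: RegExp }> = [
  { type: "Number", pattern: /^\d+(\.\d+)?/ },
  { type: "Identifier", pattern: /^[A-Za-z_][A-Za-z0-9_]*/ },
  { type: "Operator", pattern: /^[-+*\/=]/ },
  { type: "Paren", pattern: /^[()]/ },
];

const lex: LexerFn = (input) => {
  const tokens: Token[] = [];
  let pos = 0;
  let ln = 1;
  let col = 1;

  while (pos < input.length) {
    const rest = input.slice(pos);
    const ch = rest.charAt(0);

    // Skip whitespace while keeping the line and column counters up to date.
    if (ch === "\n") { pos++; ln++; col = 1; continue; }
    if (ch === " " || ch === "\t" || ch === "\r") { pos++; col++; continue; }

    let matched = false;
    for (const rule of rules) {
      const m = rule.pattern.exec(rest);
      if (m !== null) {
        // The matched text is the lexeme; the rule decides the token type.
        tokens.push({ type: rule.type, lexeme: m[0], position: { ln, col } });
        pos += m[0].length;
        col += m[0].length;
        matched = true;
        break;
      }
    }
    if (!matched) {
      throw new Error(`Unexpected character '${ch}' at ${ln}:${col}`);
    }
  }
  return tokens;
};
```

With these assumed rules, `lex("x = 42")` yields three tokens (`Identifier` `x`, `Operator` `=`, `Number` `42`) at columns 1, 3, and 5 of line 1. Trying the rules in order is the simplest design; a fuller lexer would typically also handle keywords, string literals, and comments.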
A parser is a software component that takes input data (typically text) and builds a data structure, often a parse tree or abstract syntax tree (AST), giving a structural representation of the input while checking for correct syntax. It is a crucial part of the compilation process, particularly in compiler design.
- Recursive Descent Parser | GeeksforGeeks (2023/06/09)
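As a rough illustration of how a parser turns input into a tree while checking syntax, here is a minimal recursive descent sketch for arithmetic expressions. Everything in it is an assumption made for the example: the grammar, the `Expr` node shape, and the inline tokenizer (which simply ignores characters it does not recognize) are hypothetical and not taken from any of the linked resources.

```typescript
// Hypothetical AST node shape, chosen only for this sketch.
type Expr =
  | { kind: "Num"; value: number }
  | { kind: "Binary"; op: string; left: Expr; right: Expr };

// Grammar assumed for this example:
//   expr   := term (("+" | "-") term)*
//   term   := factor (("*" | "/") factor)*
//   factor := NUMBER | "(" expr ")"
function parse(source: string): Expr {
  // A deliberately tiny tokenizer: numbers, operators, and parentheses only.
  const tokens: string[] = source.match(/\d+(?:\.\d+)?|[-+*\/()]/g) || [];
  let pos = 0;

  const peek = (): string | undefined => tokens[pos];
  const next = (): string | undefined => tokens[pos++];

  // One function per grammar rule; each consumes the tokens it recognizes.
  function parseExpr(): Expr {
    let left = parseTerm();
    while (peek() === "+" || peek() === "-") {
      const op = next() as string;
      const right = parseTerm();
      left = { kind: "Binary", op, left, right }; // left-associative
    }
    return left;
  }

  function parseTerm(): Expr {
    let left = parseFactor();
    while (peek() === "*" || peek() === "/") {
      const op = next() as string;
      const right = parseFactor();
      left = { kind: "Binary", op, left, right };
    }
    return left;
  }

  function parseFactor(): Expr {
    const tok = next();
    if (tok === undefined) throw new Error("Unexpected end of input");
    if (tok === "(") {
      const inner = parseExpr();
      if (next() !== ")") throw new Error("Expected ')'");
      return inner; // the parentheses themselves leave no node behind
    }
    const value = Number(tok);
    if (isNaN(value)) throw new Error(`Unexpected token '${tok}'`);
    return { kind: "Num", value };
  }

  const ast = parseExpr();
  if (pos < tokens.length) throw new Error(`Unexpected token '${tokens[pos]}'`);
  return ast;
}
```

For `parse("1 + 2 * 3")` the root is a `Binary` node with `op` `"+"` whose right child is the `"*"` node, so operator precedence falls out of the rule nesting (`expr` calls `term`, which calls `factor`). Malformed input such as `"1 +"` raises an error instead of producing a tree.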
An Abstract Syntax Tree (AST) is a data structure used in computer science to represent the structure of a program or code snippet. It is a tree-like representation of the source code that abstracts away details of the concrete syntax, such as punctuation and grouping parentheses, which are implied by the shape of the tree. The AST is designed to preserve essential information such as variable types, the location of each declaration, the order of executable statements, the left and right components of binary operations, and identifiers and their assigned values.
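To make the description above concrete, here is a small sketch of what AST nodes might look like for the statement `x = (1 + 2) * 3`, together with a tree walk that evaluates it. The node kinds, field names, `Loc` type, and `evaluate` function are all hypothetical choices for this example; real compilers define many more node kinds.

```typescript
// Hypothetical source location, mirroring the ln/col idea used for tokens.
interface Loc {
  ln: number;
  col: number;
}

// Hypothetical node shapes (assumption: a tiny expression language).
type AstNode =
  | { kind: "NumberLiteral"; value: number; loc: Loc }
  | { kind: "Identifier"; name: string; loc: Loc }
  | { kind: "BinaryExpr"; op: "+" | "-" | "*" | "/"; left: AstNode; right: AstNode; loc: Loc }
  | { kind: "Assign"; target: string; value: AstNode; loc: Loc };

// Hand-built tree for `x = (1 + 2) * 3`. Note that the parentheses do not
// appear as nodes: the grouping is expressed by "+" being a child of "*".
const ast: AstNode = {
  kind: "Assign",
  target: "x",
  loc: { ln: 1, col: 1 },
  value: {
    kind: "BinaryExpr",
    op: "*",
    loc: { ln: 1, col: 13 },
    left: {
      kind: "BinaryExpr",
      op: "+",
      loc: { ln: 1, col: 8 },
      left: { kind: "NumberLiteral", value: 1, loc: { ln: 1, col: 6 } },
      right: { kind: "NumberLiteral", value: 2, loc: { ln: 1, col: 10 } },
    },
    right: { kind: "NumberLiteral", value: 3, loc: { ln: 1, col: 15 } },
  },
};

// Walk the tree recursively; `env` maps identifiers to their assigned values.
function evaluate(node: AstNode, env: { [name: string]: number }): number {
  switch (node.kind) {
    case "NumberLiteral":
      return node.value;
    case "Identifier": {
      const bound = env[node.name];
      if (bound === undefined) {
        throw new Error(`Undefined variable '${node.name}' at ${node.loc.ln}:${node.loc.col}`);
      }
      return bound;
    }
    case "BinaryExpr": {
      const l = evaluate(node.left, env);
      const r = evaluate(node.right, env);
      if (node.op === "+") return l + r;
      if (node.op === "-") return l - r;
      if (node.op === "*") return l * r;
      return l / r;
    }
    case "Assign": {
      const value = evaluate(node.value, env);
      env[node.target] = value;
      return value;
    }
  }
}

const env: { [name: string]: number } = {};
const result = evaluate(ast, env); // 9, and env.x is now 9
```

Keeping a `loc` on every node is one way to preserve the location information mentioned above, so that later phases can report errors against the original source.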
- [YouTube Playlist] Compiler Design - Quick Concepts | Neso Academy
- [YouTube Playlist] Compiler Design - Chapter 1 - Introduction to Compiler Design | Neso Academy
- [YouTube Playlist] Compiler Design - Chapter 2 - Syntax Analysis | Neso Academy
- [YouTube Playlist] Compiler Design - Chapter 3 - Top-Down Parsers | Neso Academy
- A Guide To Parsing: Algorithms And Terminology | Gabriele Tomassetti (2023/07/26)
- Compilers Series' Articles | by Paul Lefebvre - DEV Community
  - Compilers 101 - Overview and Lexer (2018/01/19)
  - Compilers 102 - Parser (2018/01/22)
- What is a Lexer ? known also as Tokenizer or Scanner - Lexical Analysis | DataCadamia
- Lexical Analysis - (Token|Lexical unit|Lexeme|Symbol|Word) | DataCadamia
- Parser / Compiler - (Abstract) Syntax Tree (AST) | DataCadamia
- Abstract Syntax Tree (AST) - Explained in Plain English | DEV Community (2024/06/11) - As a developer, the source code that you write is all so concise and elegant.
- [GitHub] cowchimp/awesome-ast - A curated list of awesome AST resources
- BNF Notation: Dive Deeper Into Python's Grammar | Real Python
- [YouTube] LLVM in 100 Seconds | Fireship (2022/05/23)
- Writing Your Own Lexer With Simple Steps | Serhii Chornenkyi (2023/11/24)
- A simple recursive descent parser | DEV Community (2023/10/09)
- [GitHub] tlaceby/guide-to-interpreters-series - Contains source-code for viewers following along with my Beginners Guide To Building Interpreters series on my Youtube Channel.
- Let's Build A Simple Interpreter | Ruslan's Blog
  - Part 7: Abstract Syntax Trees | Ruslan's Blog (2015/12/15) - python and rust implementations
  - Part 13: Semantic Analysis | Ruslan's Blog (2017/04/27)
- [YouTube Playlist] Building a Compiler in JS | benwatkins10xd
  - [GitHub] benwatkins10xd/js-compile - Compiler in vanilla javascript from scratch
- [YouTube] abstract syntax tree's are gonna be IMPORTANT in 2024 | Chris Hay (2023/12/28)
- [GitHub] antlr/antlr4 - A powerful parser generator for reading, processing, executing, or translating structured text or binary files.
- [GitHub] antlr/grammars-v4 - Grammars written for ANTLR v4; expectation that the grammars are free of actions.