A lame C compiler that implements a basic lexer, an LR(1) parser, and a recursive descent parser.
git clone --recurse-submodules https://github.com/leo4048111/LameCC
- Lexer ☑
- LR(1) Parser ☑
- Recursive Descent Parser ☑
- Semantic Analysis ☑
- Intermediate Code Generator (in both Quaternion and LLVM IR forms) ☑
- Code Optimization ☐ // TODO
- Assembly Generator ☐ // TODO
- Prettified json dump
- Log info/error
- Visualized LR(1) canonical collection, ACTION GOTO table and LR(1) parsing process
- Other features
Build:
- OS: Windows or GNU/Linux
- CMake version >= 3.8
- LLVM libraries and C++ headers installed; make sure the CMAKE_PREFIX_PATH or LLVM_DIR environment variable points to your LLVM directory
- If you are running Windows and have installed MinGW64, simply run build.bat
Supported syntax:
- int/float var declaration/definition
- if-else statement
- int/float/void function declaration/definition
- while statement
- value statement (complex expression, function call, etc.)
- return statement
Example input source file (see ./testcases/test.cpp):
// nonvoid return type function decl with params
int NonVoidFuncDeclWithParams(int parm1, int parm2);
// nonvoid return type function decl without params
char NonVoidFuncDeclWithoutParams();
// nonvoid return type function definition without params
float NonVoidFuncDefWithoutParamsWithEmptyBody()
{
    return 0xAF.D65P-5; // some float representations
}
// nonvoid return type function definition with params
int NonVoidFuncDefWithParamsWithEmptyBody(int param1, char param2)
{
    return 0;
}
// void return type function decl with params
void VoidFuncDeclWithParams(int parm1, int parm2);
// void return type function decl without params
void VoidFuncDeclWithoutParams();
// void return type function definition without params with empty body
void VoidFuncDefWithoutParamsWithEmptyBody()
{
}
// void return type function definition with params with empty body
void VoidFuncDefWithParamsWithEmptyBody(int param1, int param2)
{
}
// function definition
int main()
{
    int left = 0;    // DeclStmt
    int right = 100; // DeclStmt
    int target = (NonVoidFuncDefWithParamsWithEmptyBody(99, 100) % 2 + 5) - right * left; // complex Expression
    while (left < right) // WhileStmt
    {
        int mid = (left + right) / 2;
        if (mid == target) // IfStmt
            return mid; // ReturnStmt
        else if (mid < target) // elseBody which is another IfStmt
            left = mid + 1; // ValueStmt
        else // elseBody
            right = mid; // ValueStmt
    }
    return left; // ReturnStmt
}
Command options:
PS D:\Projects\CPP\Homework\LameCC\build> .\LameCC.exe -?
Usage:
  LameCC.exe <input file> [options]
Available options:
  -?, --help          show all available options
  -o, --out           set output file path
  -T, --dump-tokens   dump tokens in json format
  -A, --dump-ast      dump AST Nodes in json format
  --LR1               specify grammar with a json file and use LR(1) parser
  --log               print LR(1) parsing process
Run command:
PS D:\Projects\CPP\Homework\LameCC\build> .\LameCC.exe ../testcases/test.cpp -A -T --LR1 ../src/grammar.gram --log
Token dump:
[
  {
    "id": 1,
    "type": "TOKEN_KWINT",
    "content": "int",
    "position": [
      2,
      1
    ]
  },
  {
    "id": 2,
    "type": "TOKEN_IDENTIFIER",
    "content": "NonVoidFuncDeclWithParams",
    "position": [
      2,
      5
    ]
  },
  {
    "id": 3,
    "type": "TOKEN_LPAREN",
    "content": "(",
    "position": [
      2,
      30
    ]
  },
  {
    "id": 4,
    "type": "TOKEN_KWINT",
    "content": "int",
    "position": [
      2,
      31
    ]
  },
  ...
AST dump:
{
  "type": "TranslationUnitDecl",
  "children": [
    {
      "type": "FunctionDecl",
      "functionType": "int(int, int)",
      "name": "NonVoidFuncDeclWithParams",
      "params": [
        {
          "type": "ParmVarDecl",
          "name": "parm1"
        },
        {
          "type": "ParmVarDecl",
          "name": "parm2"
        }
      ],
      "body": "empty"
    },
    {
      "type": "FunctionDecl",
      "functionType": "char()",
      "name": "NonVoidFuncDeclWithoutParams",
      "params": [],
      "body": "empty"
    },
    {
      "type": "FunctionDecl",
      "functionType": "float()",
      "name": "NonVoidFuncDefWithoutParamsWithEmptyBody",
      "params": [],
      "body": [
        {
          "type": "CompoundStmt",
          "children": [
            {
              "type": "ReturnStmt",
              "value": [
                {
                  "type": "FloatingLiteral",
                  "value": "5.494911"
                }
              ]
            }
          ]
        }
      ]
    },
    ...
LR(1) Canonical Collections:
ACTION GOTO Table:
Parsing Process: