yiiigww / Cmoji

A non C-like language written in emojis

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

This is a non C-like language written entirely in emojis.
This project is composed of two main parts: a compiler and a processor.

Usercode lifecycle:
Compiler:
 - The usercode (written entirely in emojis) will be preprocessed into english keywords.
 - The preprocessed code will then be run through a lexer in order to traslate keywords to tokens.
   The lexer is written with the tool Lex.
 - The tokens can then be run through the parser in order to verify syntax and generate Abstract Syntax Trees (AST).
   The parser is written with the tool Yacc.
 - The ASTs can then be run through the code genereate in order to create the binary file that will be executed.
   The Binary file is written in a custom instruction set I call 'Emoji-86'

Processor:
 - The virtual processor loads in the 'Emoji-86' binary file that the compiler has just created
 - The processor then executes the binary using a Tomasulo out-of-order algorithm
 - The Tomasulo algorith is used to minimize runtime, and maximize speed.



Strengths of this language:
	This language is meant to be a very concise scripting language that makes it very easy to do simple tasks.
	An Emoji is worth 1000 lines.

Weaknesses of this language:
	Beacuse this language runs on a virtual processor with a custom instruction set written by hand, the language is limited by the processor.
	Although what the processor *can* do, it does very well, it cannot write back to memory.
	This is a very restirctive limitations. In order to store data (variables), the processor must use its internal registers.
	The virtual processor has a total of 16 registers. 1 register is dedicated to printing, leaving 15 to store user variables.
	Because variable data must be stored in 16 bit registers, only a single data type can be supported - ints
	Summary: You can only have 15 variables max per file, all of which must be ints.

	I realize how limiting this makes the language. I had to make the choice between the following two design approaches:
	1) Compile into X86 and have the language be Turing complete
	2) Compile into Emoji-86, and execute on a custom CPU I wrote by hand.
	I made the choice to have Cmoji run on 100% custom code from start of compile time to end of execution time, despite this choice drastically limiting the capabilities of the language.	


The current mapping of Emoji -> keywords is as follows

# //Comment
πŸ” //loop
πŸ”‚ //loop1
0️⃣ //0
1️⃣ //1
2️⃣ //2
3️⃣ //3
4️⃣ //4
5️⃣ //5
6️⃣ //6
7️⃣ //7
8️⃣ //8
9️⃣ //9
πŸ‘‰ //point
πŸ™… //*
πŸ‘ //+
πŸ‘Ž //-
πŸ–– ///
πŸ–¨ //print
πŸ‘¬ //=
🌜 //( or {
πŸŒ› //) or }
πŸ€” //if
πŸ™„ //elif
😳 //else
πŸ‘† //>
πŸ‘‡ //<
πŸ”š //newline character

This language supports comments. Use the # symbol to begin a comment. Everything following the # will be ignored by the compiler

All other emojis encountered will be compiled as variable names

The Emoji-86 instruction set for the virtual processor is as follows:
 encoding          instruction   description

0   0000iiiiiiiitttt  mov i,t       regs[t] = i; pc += 1;
1   0001aaaabbbbtttt  add a,b,t     regs[t] = regs[a] + regs[b]; pc += 1;
7   0111aaaabbbbtttt  sub a,b,t     regs[t] = regs[a] - regs[b]; pc += 1;
9   1001aaaabbbbtttt  mul a,b,t     regs[t] = regs[a] * regs[b]; pc += 1;
10  1010aaaabbbbtttt  div a,b,t     regs[t] = regs[a] / regs[b]; pc += 1;
2   0010jjjjjjjjjjjj  jmp j         pc = j;
3   0011000000000000  halt          <stop fetching instructions>
4   0100iiiiiiiitttt  ld i,t        regs[t] = mem[i]; pc += 1;
5   0101aaaabbbbtttt  ldr a,b,t     regs[t] = mem[regs[a]+regs[b]]; pc += 1;
6   0110aaaabbbbtttt  jeq a,b,t     if (regs[a] == regs[b]) pc += d
                                    else pc += 1;
11  1011aaaabbbbtttt  jlt           if(regs[a] < regs[b]) pc += t
                                    else pc += 1;
8   1000aaaabbbbcccc print a	    print regs[a]; pc += 1;
				                    (a == b == c. This makes implementation easier)
12  1100iiiiiiiiiiii print i  	    print character i in ascii; pc += 1;

About

A non C-like language written in emojis


Languages

Language:C 54.4%Language:Verilog 27.7%Language:Yacc 5.6%Language:Common Lisp 3.3%Language:C++ 2.2%Language:Makefile 1.8%Language:Python 1.8%Language:Lex 1.7%Language:Java 1.5%Language:Shell 0.0%