emojicode / emojicode

I'm interested in the formal syntax of Emojicode.

http://www.emojicode.org/docs/reference/syntax.html

Are tokens separated by white space or could for example an emoji be immediately followed by an identifier? Could a number be immediately followed by an identifier? And so on.

If tokens are separated by white space, which characters are considered white space? Linefeed (10), carriage return (13), space (32), tab (9)?

Regards,
Daniel

Tokens do not have to be separated by whitespace. The tokenizer ignores whitespace as long as it has not yet determined a token type. If a token is being parsed (e.g. a number) and a whitespace character occurs that syntactically cannot be part of the token, the token is terminated. The tokenizer then continues with that whitespace as if no token type had been determined.

E.g. 432😀 is an integer token followed by an identifier, and 😏493903@var is an identifier, an integer and a variable.
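The rule described above can be illustrated with a minimal sketch. This is not the compiler's actual tokenizer: the `Kind` enum, the `Token` struct, the `tokenize` function and the three character-class helpers are hypothetical names, the emoji test is a rough code point range, and the whitespace test is deliberately reduced to four characters.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds; the real compiler distinguishes many more.
enum class Kind { Integer, Identifier, Variable };

struct Token {
    Kind kind;
    std::u32string text;
};

// Simplified stand-ins for the real character classes (assumptions).
inline bool isSpace(char32_t c) { return c == U' ' || c == U'\t' || c == U'\n' || c == U'\r'; }
inline bool isDigit(char32_t c) { return U'0' <= c && c <= U'9'; }
inline bool isEmoji(char32_t c) { return 0x1F300 <= c && c <= 0x1FAFF; }

std::vector<Token> tokenize(const std::u32string &src) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        char32_t c = src[i];
        if (isSpace(c)) {  // no token type determined yet: whitespace is ignored
            ++i;
            continue;
        }
        if (isEmoji(c)) {  // in this sketch every emoji is its own identifier token
            tokens.push_back({Kind::Identifier, std::u32string(1, c)});
            ++i;
            continue;
        }
        Kind kind = isDigit(c) ? Kind::Integer : Kind::Variable;
        std::size_t start = i;
        while (i < src.size()) {
            char32_t d = src[i];
            // The token ends at the first character that cannot belong to it.
            if (isSpace(d) || isEmoji(d)) break;
            if (kind == Kind::Integer && !isDigit(d)) break;
            ++i;
        }
        tokens.push_back({kind, src.substr(start, i - start)});
    }
    return tokens;
}
```

With these assumptions, `tokenize(U"432\U0001F600")` yields an integer followed by an identifier, and `tokenize(U"\U0001F60F493903@var")` yields an identifier, an integer and a variable, with or without whitespace between them.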

All Unicode whitespace characters are considered whitespace. See http://www.unicode.org/Public/6.3.0/ucd/PropList.txt and emojicode/Compiler/EmojicodeCompiler.hpp, lines 25 to 29 in 5c328c6:

    /// @returns True if @c c is a whitespace character. See http://www.unicode.org/Public/6.3.0/ucd/PropList.txt
    inline bool isWhitespace(char32_t c) {
        return (0x9 <= c && c <= 0xD) || c == 0x20 || c == 0x85 || c == 0xA0 || c == 0x1680 || (0x2000 <= c && c <= 0x200A)
        || c == 0x2028 || c== 0x2029 || c == 0x2029 || c == 0x202F || c == 0x205F || c == 0x3000 || c == 0xFE0F;
    }
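To check the four characters from the original question against this function, a small sketch may help. `isWhitespace` is copied from the quote above with only the redundant second `0x2029` comparison dropped; `allWhitespace` is a hypothetical helper added for illustration.

```cpp
#include <string>

// Copy of the quoted compiler function (duplicate 0x2029 test omitted).
inline bool isWhitespace(char32_t c) {
    return (0x9 <= c && c <= 0xD) || c == 0x20 || c == 0x85 || c == 0xA0 || c == 0x1680
        || (0x2000 <= c && c <= 0x200A) || c == 0x2028 || c == 0x2029 || c == 0x202F
        || c == 0x205F || c == 0x3000 || c == 0xFE0F;
}

// Hypothetical helper: true if every code point in text passes isWhitespace.
inline bool allWhitespace(const std::u32string &text) {
    for (char32_t c : text) {
        if (!isWhitespace(c)) return false;
    }
    return true;
}
```

Linefeed (10), carriage return (13), space (32) and tab (9) all pass, since the first three fall into the `0x9` to `0xD` range or equal `0x20`. Note that the function also accepts U+FE0F, the emoji variation selector, which is not a Unicode whitespace character; presumably this lets the tokenizer skip it after an emoji.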

Thanks. May two emojis come immediately after each other and are they in that case considered as one token, or may for example an emoji for a for loop be immediately followed by an emoji for a symbol?

Examples: May 😀 🔤a is bigger than b🔤 be written as 😀🔤a is bigger than b🔤?
Or may 🍊 ▶️ a b 🍇 be written as 🍊▶️ a b 🍇?

I am looking at this code:
🐇 🔶🎅🎁 🍇

🍉
And the first line has two white spaces. Are these white spaces mandatory in order to separate the tokens?

Consecutive emojis are never considered one token. There is no difference between writing 🐇 🔶🎅🎁 🍇 and 🐇🔶🎅🎁🍇.

Note, however, that if your system is not properly displaying emojis you might see ZWJ emoji sequences, flags or emojis with gender/skin color as multiple symbols, e.g. 👱🏽‍♂️, 👩‍💻, 👨‍💻, 🙆‍♂️, 👩‍👧‍👦, 👨‍👩‍👧‍👧. By definition each of them is a single emoji and treated accordingly by the tokenizer.
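How several code points can form a single emoji may be sketched as follows. This is a simplified illustration, not the tokenizer's actual logic: the function names are hypothetical, and the rules cover only skin tone modifiers (U+1F3FB to U+1F3FF), the variation selector U+FE0F, zero width joiner (U+200D) sequences and flags built from two regional indicators.

```cpp
#include <cstddef>
#include <string>

inline bool isSkinTone(char32_t c) { return 0x1F3FB <= c && c <= 0x1F3FF; }
inline bool isRegionalIndicator(char32_t c) { return 0x1F1E6 <= c && c <= 0x1F1FF; }

// Skips an optional skin tone modifier and an optional variation selector.
inline std::size_t skipModifiers(const std::u32string &s, std::size_t i) {
    if (i < s.size() && isSkinTone(s[i])) ++i;
    if (i < s.size() && s[i] == 0xFE0F) ++i;
    return i;
}

// Returns the index just past the emoji starting at s[pos] (requires pos < s.size()).
std::size_t emojiEnd(const std::u32string &s, std::size_t pos) {
    std::size_t i = pos + 1;
    // A flag consists of two regional indicator symbols.
    if (isRegionalIndicator(s[pos]) && i < s.size() && isRegionalIndicator(s[i])) return i + 1;
    i = skipModifiers(s, i);
    // Each zero width joiner glues the following code point onto the same emoji.
    while (i + 1 < s.size() && s[i] == 0x200D) {
        i = skipModifiers(s, i + 2);
    }
    return i;
}

// Counts emojis in a string that is assumed to contain only emoji code points.
std::size_t countEmojis(const std::u32string &s) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); i = emojiEnd(s, i)) ++count;
    return count;
}
```

Under these rules 👩‍💻 (U+1F469 U+200D U+1F4BB) counts as one emoji, while 🐇🔶🎅🎁🍇 counts as five.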

Emojicode version 0.5: variableInitAndScoping.emojic

What does this line do?
😀 🍪🔤i=🔤 🔡i 10🍪

😀 prints the string to the standard output.
🍪 concatenates strings.
🔤i=🔤 one of the strings to be concatenated.

But what does 🔡i 10 do? Convert i to a string with a radix of 10? 🔡 is the string class, but I don't get the syntax here.

It's an ordinary method call on i, which seems to be an integer: http://www.emojicode.org/docs/packages/s/1f682.html#m🔡
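Assuming, as the question guesses, that the argument 10 is the radix (the linked documentation is the authority here), the call would behave roughly like the following C++ sketch. `toStringInBase` is a hypothetical helper for illustration, not part of Emojicode.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: renders value in the given base (2 to 36).
// Returns an empty string for an unsupported base.
std::string toStringInBase(long long value, int base) {
    if (base < 2 || base > 36) return "";
    const char digits[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    bool negative = value < 0;
    // Compute the magnitude in unsigned arithmetic so the most negative value is safe.
    unsigned long long magnitude = static_cast<unsigned long long>(value);
    if (negative) magnitude = 0ULL - magnitude;
    std::string out;
    do {  // digits are produced least significant first
        out.push_back(digits[magnitude % static_cast<unsigned long long>(base)]);
        magnitude /= static_cast<unsigned long long>(base);
    } while (magnitude != 0);
    if (negative) out.push_back('-');
    std::reverse(out.begin(), out.end());
    return out;
}
```

Under that assumption, the line in question corresponds to something like `std::string("i=") + toStringInBase(i, 10)` being printed.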


Syntax question: How are tokens separated?