felixhao28 / JSCPP

A simple C++ interpreter written in JavaScript

Home Page: https://felixhao28.github.io/JSCPP/

Grammar railroad diagram

mingodad opened this issue

Using the tool at https://www.bottlecaps.de/convert/ we can generate a railroad diagram for the grammar: paste the contents of https://github.com/felixhao28/JSCPP/blob/master/pegjs/ast.pegjs into the "Input grammar:" textarea, click "Convert", and after the conversion click "View Diagram". The result is a nice interactive railroad diagram ( https://en.wikipedia.org/wiki/Syntax_diagram ) that can also be downloaded as XHTML.

This can also be done offline with the Java tool https://www.bottlecaps.de/rr/download/rr-1.63-java8.zip (linked on the Welcome tab).

Closing as it is not an issue.

I'm revisiting this topic with a new parser that converts your pegjs grammars to a PEG grammar understood by https://github.com/mingodad/peg and https://github.com/yhirose/cpp-peglib. The latter has an online playground at https://yhirose.github.io/cpp-peglib/ where the grammar can be tested and the AST viewed.

Notice that your grammar has two redefined rules and several unreferenced ones:

peg-dad -e jscpp-prepast-naked.peg
rule 'ELSE' redefined
rule 'HexDigit' redefined

217:2 'NAMESPACE' is not referenced.
103:2 'BREAK' is not referenced.
211:2 'DECLSPEC' is not referenced.
205:2 'COMPLEX' is not referenced.
175:2 'STATIC' is not referenced.
202:2 'BOOL' is not referenced.
187:2 'UNION' is not referenced.
181:2 'SWITCH' is not referenced.
172:2 'SIZEOF' is not referenced.
166:2 'SHORT' is not referenced.
157:2 'REGISTER' is not referenced.
178:2 'STRUCT' is not referenced.
193:2 'VOID' is not referenced.
136:2 'FLOAT' is not referenced.
151:2 'INLINE' is not referenced.
145:2 'IF' is not referenced.
139:2 'FOR' is not referenced.
127:2 'ELSE' is not referenced.
124:2 'DO' is not referenced.
118:2 'DEFAULT' is not referenced.
184:2 'TYPEDEF' is not referenced.
106:2 'CASE' is not referenced.
100:2 'AUTO' is not referenced.
169:2 'SIGNED' is not referenced.
112:2 'CONST' is not referenced.
154:2 'LONG' is not referenced.
190:2 'UNSIGNED' is not referenced.
196:2 'VOLATILE' is not referenced.
142:2 'GOTO' is not referenced.
121:2 'DOUBLE' is not referenced.
109:2 'CHAR' is not referenced.
214:2 'ATTRIBUTE' is not referenced.
208:2 'STDCALL' is not referenced.
199:2 'WHILE' is not referenced.
133:2 'EXTERN' is not referenced.
115:2 'CONTINUE' is not referenced.
163:2 'RETURN' is not referenced.
130:2 'ENUM' is not referenced.
148:2 'INT' is not referenced.
160:2 'RESTRICT' is not referenced.
220:2 'USING' is not referenced.
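
For anyone who wants to reproduce this kind of report without peg-dad, here is a rough sketch in Python. It is not how peg-dad works internally; it is only a regex-based approximation that assumes rules are written as `Name <- ...` with the name at the start of a line, and that the first rule is the start rule. The function name `check_rules` and the sample grammar are made up for illustration.

```python
import re
from collections import Counter

# A rule header such as "PrepUndef <-" at the start of a line.
RULE_DEF = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*<-", re.MULTILINE)
# Double-quoted literals and [...] character classes; both may contain
# text that looks like a rule name, so they are removed before scanning.
LITERAL_OR_CLASS = re.compile(r'"(?:\\.|[^"\\])*"|\[(?:\\.|[^\]\\])*\]')
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def check_rules(grammar, start=None):
    """Return (redefined, unreferenced) rule names, each sorted."""
    names = RULE_DEF.findall(grammar)
    redefined = sorted(n for n, count in Counter(names).items() if count > 1)

    # Drop the rule headers and the literals/classes; every identifier
    # that is left is a reference to some rule.
    body = RULE_DEF.sub(" ", grammar)
    body = LITERAL_OR_CLASS.sub(" ", body)
    referenced = set(IDENT.findall(body))

    # Assumption: the first rule is the start rule and needs no reference.
    if start is None and names:
        start = names[0]
    unreferenced = sorted(
        n for n in set(names) if n not in referenced and n != start
    )
    return redefined, unreferenced


# Hypothetical miniature grammar with the same pattern as the report above.
sample = r'''
Start <- Keyword / Identifier
Keyword <- ( "auto" / "else" ) ! IdChar
AUTO <- "auto" ! IdChar
ELSE <- "else" ! IdChar
ELSE <- "else" ! IdChar
Identifier <- ! Keyword [a-z]+
IdChar <- [a-z] / [A-Z] / [0-9] / [_]
'''

if __name__ == "__main__":
    print(check_rules(sample))  # (['ELSE'], ['AUTO', 'ELSE'])
```

In the sample, `ELSE` shows up in both lists, just as it does in the report: it is defined twice and nothing refers to it.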

The converted prepast.pegjs, which can be used with https://github.com/yhirose/cpp-peglib to parse C++ input and show the AST:

 TranslationUnit <-
	 Spacing ( Preprocessor / PrepMacroText Spacing )+ EOT

 Preprocessor <-
	 ( PrepDefine / PrepInclude / ConditionalInclusion ) Spacing

 PrepDefine <-
	 PrepFunctionMacro / PrepSimpleMacro / PrepUndef

 PrepUndef <-
	 SHARP UNDEF Identifier

 PrepSimpleMacro <-
	 SHARP DEFINE Identifier PrepMacroText?

 PrepFunctionMacro <-
	 SHARP DEFINE Identifier PrepFunctionMacroArgs PrepMacroText

 PrepFunctionMacroArgs <-
	 LPAR Identifier ( COMMA Identifier )* RPAR

 PrepFunctionMacroCallArgs <-
	 LPAR PrepMacroMacroCallText ( COMMA PrepMacroMacroCallText )* RPAR

 PrepMacroMacroCallText <-
	 ( Identifier PrepFunctionMacroCallArgs InlineSpacing / Identifier / SeperatorArgs )+

 PrepMacroText <-
	 ( Identifier PrepFunctionMacroCallArgs InlineSpacing / Identifier / Seperator )+

 PrepInclude <-
	 PrepIncludeLib / PrepIncludeLocal

 PrepIncludeLib <-
	 SHARP INCLUDE LT Filename GT

 PrepIncludeLocal <-
	 SHARP INCLUDE QUO Filename QUO

 Filename <-
	 ( IdChar / [/\\.] )+

 ConditionalInclusion <-
	 PrepIfdef / PrepIfndef / PrepEndif / PrepElse

 PrepIfdef <-
	 SHARP IFDEF Identifier

 PrepIfndef <-
	 SHARP IFNDEF Identifier

 PrepEndif <-
	 SHARP ENDIF

 PrepElse <-
	 SHARP PREP_ELSE

 SHARP <-
	 "#" InlineSpacing

 DEFINE <-
	 "define" InlineSpacing

 UNDEF <-
	 "undef" InlineSpacing

 INCLUDE <-
	 "include" InlineSpacing

 IFDEF <-
	 "ifdef" InlineSpacing

 IFNDEF <-
	 "ifndef" InlineSpacing

 ENDIF <-
	 "endif" InlineSpacing

 PREP_ELSE <-
	 "else" InlineSpacing

 InlineSpacing <-
	 ( InlineWhiteSpace / LongComment / LineComment )*

 Spacing <-
	 ( WhiteSpace / LongComment / LineComment )*

 InlineWhiteSpace <-
	 [ \t\x0B\x0C]

 WhiteSpace <-
	 [ \n\r\t\x0B\x0C]

 LongComment <-
	 "/*" ( ! "*/" _ )* "*/"

 LineComment <-
	 "//" ( ! "\n" _ )*

 AUTO <-
	 "auto" ! IdChar Spacing

 BREAK <-
	 "break" ! IdChar Spacing

 CASE <-
	 "case" ! IdChar Spacing

 CHAR <-
	 "char" ! IdChar Spacing

 CONST <-
	 "const" ! IdChar Spacing

 CONTINUE <-
	 "continue" ! IdChar Spacing

 DEFAULT <-
	 "default" ! IdChar Spacing

 DOUBLE <-
	 "double" ! IdChar Spacing

 DO <-
	 "do" ! IdChar Spacing

 ELSE <-
	 "else" ! IdChar Spacing

 ENUM <-
	 "enum" ! IdChar Spacing

 EXTERN <-
	 "extern" ! IdChar Spacing

 FLOAT <-
	 "float" ! IdChar Spacing

 FOR <-
	 "for" ! IdChar Spacing

 GOTO <-
	 "goto" ! IdChar Spacing

 IF <-
	 "if" ! IdChar Spacing

 INT <-
	 "int" ! IdChar Spacing

 INLINE <-
	 "inline" ! IdChar Spacing

 LONG <-
	 "long" ! IdChar Spacing

 REGISTER <-
	 "register" ! IdChar Spacing

 RESTRICT <-
	 "restrict" ! IdChar Spacing

 RETURN <-
	 "return" ! IdChar Spacing

 SHORT <-
	 "short" ! IdChar Spacing

 SIGNED <-
	 "signed" ! IdChar Spacing

 SIZEOF <-
	 "sizeof" ! IdChar Spacing

 STATIC <-
	 "static" ! IdChar Spacing

 STRUCT <-
	 "struct" ! IdChar Spacing

 SWITCH <-
	 "switch" ! IdChar Spacing

 TYPEDEF <-
	 "typedef" ! IdChar Spacing

 UNION <-
	 "union" ! IdChar Spacing

 UNSIGNED <-
	 "unsigned" ! IdChar Spacing

 VOID <-
	 "void" ! IdChar Spacing

 VOLATILE <-
	 "volatile" ! IdChar Spacing

 WHILE <-
	 "while" ! IdChar Spacing

 BOOL <-
	 "_Bool" ! IdChar Spacing

 COMPLEX <-
	 "_Complex" ! IdChar Spacing

 STDCALL <-
	 "_stdcall" ! IdChar Spacing

 DECLSPEC <-
	 "__declspec" ! IdChar Spacing

 ATTRIBUTE <-
	 "__attribute__" ! IdChar Spacing

 NAMESPACE <-
	 "namespace" ! IdChar Spacing

 USING <-
	 "using" ! IdChar Spacing

 TRUE <-
	 "true" ! IdChar Spacing

 FALSE <-
	 "false" ! IdChar Spacing

 Keyword <-
	 ( "auto" / "break" / "case" / "char" / "const" / "continue" / "default" / "double" / "do" / "else" / "enum" / "extern" / "float" / "for" / "goto" / "if" / "int" / "inline" / "long" / "register" / "restrict" / "return" / "short" / "signed" / "sizeof" / "static" / "struct" / "switch" / "typedef" / "union" / "unsigned" / "void" / "volatile" / "while" / "_Bool" / "_Complex" / "_Imaginary" / "_stdcall" / "__declspec" / "__attribute__" ) ! IdChar

 Identifier <-
	 ! Keyword IdNondigit IdChar* InlineSpacing

 SeperatorArgs <-
	 ( Keyword / ! IdNondigit ! [\r\n,)] _ ) InlineSpacing

 Seperator <-
	 ( Keyword / Constant / StringLiteral / ! IdNondigit ! [\r\n] _ ) InlineSpacing

 IdNondigit <-
	 [a-z] / [A-Z] / [_] / UniversalCharacter

 IdChar <-
	 [a-z] / [A-Z] / [0-9] / [_] / UniversalCharacter

 UniversalCharacter <-
	 "\\u" HexQuad / "\\U" HexOcto

 HexOcto <-
	 ( HexDigit HexDigit HexDigit HexDigit HexDigit HexDigit HexDigit HexDigit )

 HexQuad <-
	 ( HexDigit HexDigit HexDigit HexDigit )

 Constant <-
	 ( FloatConstant / IntegerConstant / EnumerationConstant / CharacterConstant / BooleanConstant )

 BooleanConstant <-
	 ( TRUE / FALSE )

 IntegerConstant <-
	 ( BinaryConstant / DecimalConstant / HexConstant / OctalConstant ) IntegerSuffix?

 DecimalConstant <-
	 [1-9] [0-9]*

 OctalConstant <-
	 "0" [0-7]*

 HexConstant <-
	 HexPrefix HexDigit+

 HexPrefix <-
	 "0x" / "0X"

 HexDigit <-
	 [a-f] / [A-F] / [0-9]

 BinaryPrefix <-
	 "0b"

 BinaryDigit <-
	 [0-1]

 BinaryConstant <-
	 BinaryPrefix BinaryDigit+

 IntegerSuffix <-
	 [uU] Lsuffix? / Lsuffix [uU]?

 Lsuffix <-
	 "ll" / "LL" / [lL]

 FloatConstant <-
	 ( DecimalFloatConstant / HexFloatConstant ) FloatSuffix? Spacing

 DecimalFloatConstant <-
	 Fraction Exponent? / [0-9]+ Exponent

 HexFloatConstant <-
	 HexPrefix HexFraction BinaryExponent? / HexPrefix HexDigit+ BinaryExponent

 Fraction <-
	 [0-9]* "." [0-9]+ / [0-9]+ "."

 HexFraction <-
	 HexDigit* "." HexDigit+ / HexDigit+ "."

 Exponent <-
	 [eE] [+-]? [0-9]+

 BinaryExponent <-
	 [pP] [+-]? [0-9]+

 FloatSuffix <-
	 [flFL]

 EnumerationConstant <-
	 Identifier

 CharacterConstant <-
	 "L"? "'" Char* "'" Spacing

 Char <-
	 Escape / ! ['\n\\] _

 Escape <-
	 ( SimpleEscape / OctalEscape / HexEscape / UniversalCharacter )

 SimpleEscape <-
	 "\\" ['\"?\\abfnrtv]

 OctalEscape <-
	 "\\" [0-7] [0-7]? [0-7]?

 HexEscape <-
	 "\\x" HexDigit+

 StringLiteral <-
	 ( "L" / "u8" / "u" / "U" )? ( RawStringLiteral / EscapedStringLiteral )

 RawStringLiteral <-
	 "R" ( ["] RawStringChar* ["] Spacing )+

 EscapedStringLiteral <-
	 ( ["] StringChar* ["] Spacing )+

 RawStringChar <-
	 ! [\"\n] _

 StringChar <-
	 Escape / ! [\"\n\\] _

 LPAR <-
	 "(" InlineSpacing

 RPAR <-
	 ")" InlineSpacing

 COMMA <-
	 "," InlineSpacing

 LT <-
	 "<" ! [=] InlineSpacing

 GT <-
	 ">" ! [=] InlineSpacing

 QUO <-
	 "\"" InlineSpacing

 EOT <-
	 ! _

 _ <-
	 .

After writing the above message I revisited the grammar to look at the unreferenced rules and found the cause: the grammar defines an independent rule for each keyword token, but uses the keyword literals directly throughout the grammar instead of those rules.
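
To make the redundancy concrete, here is a small Python sketch that mimics both styles with regular expressions: a per-keyword token rule such as `IF <- "if" ! IdChar Spacing`, and the `Keyword` rule that lists the literals inline. It is only an approximation under stated assumptions: `IdChar` is reduced to ASCII letters, digits and underscore (no `UniversalCharacter`), `Spacing` is reduced to plain whitespace (no comments), and the helper names are invented for this example. Note also that regex alternation backtracks while PEG ordered choice does not, so the order of alternatives matters more in the real grammar than it does here.

```python
import re

# Simplified IdChar: the UniversalCharacter alternative is left out.
ID_CHAR = r"[A-Za-z0-9_]"


def keyword_token(word):
    # Mirrors:  WORD <- "word" ! IdChar Spacing
    # (Spacing is approximated by plain whitespace, without comments.)
    return re.compile(re.escape(word) + rf"(?!{ID_CHAR})\s*")


def keyword_literals(words):
    # Mirrors:  Keyword <- ( "a" / "b" / ... ) ! IdChar
    alternatives = "|".join(re.escape(w) for w in words)
    return re.compile(rf"(?:{alternatives})(?!{ID_CHAR})")


IF = keyword_token("if")
KEYWORD = keyword_literals(["double", "do", "else", "if", "int"])

if __name__ == "__main__":
    for text in ["if (x)", "ifdef FOO", "double y", "integer"]:
        print(repr(text), bool(IF.match(text)), bool(KEYWORD.match(text)))
```

Both styles accept `if` and reject `ifdef`, which is why a grammar that spells the literals inline never needs to reference the separate token rules.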

The duplicated rule definitions for ELSE and HexDigit, however, really are there in the original grammar (they are already fixed in the converted PEG grammar shown above).