Grammar railroad diagram
mingodad opened this issue · comments
Using the tool at https://www.bottlecaps.de/convert/ we can copy and paste https://github.com/felixhao28/JSCPP/blob/master/pegjs/ast.pegjs into the "Input grammar:" textarea, click the "Convert" button, and then, after the conversion, click the "View Diagram" button to see a nice interactive railroad diagram ( https://en.wikipedia.org/wiki/Syntax_diagram ) that can also be downloaded as XHTML.
This can also be done offline using the Java tool https://www.bottlecaps.de/rr/download/rr-1.63-java8.zip (link on the "Welcome" tab).
Closing as it is not an issue.
I'm revisiting this topic with a new parser that converts your pegjs grammars to a peg grammar understood by https://github.com/mingodad/peg and https://github.com/yhirose/cpp-peglib, which has an online playground at https://yhirose.github.io/cpp-peglib/ to test the grammar and view the AST.
Notice that your grammar has several unreferenced rules:
peg-dad -e jscpp-prepast-naked.peg
rule 'ELSE' redefined
rule 'HexDigit' redefined
217:2 'NAMESPACE' is not referenced.
103:2 'BREAK' is not referenced.
211:2 'DECLSPEC' is not referenced.
205:2 'COMPLEX' is not referenced.
175:2 'STATIC' is not referenced.
202:2 'BOOL' is not referenced.
187:2 'UNION' is not referenced.
181:2 'SWITCH' is not referenced.
172:2 'SIZEOF' is not referenced.
166:2 'SHORT' is not referenced.
157:2 'REGISTER' is not referenced.
178:2 'STRUCT' is not referenced.
193:2 'VOID' is not referenced.
136:2 'FLOAT' is not referenced.
151:2 'INLINE' is not referenced.
145:2 'IF' is not referenced.
139:2 'FOR' is not referenced.
127:2 'ELSE' is not referenced.
124:2 'DO' is not referenced.
118:2 'DEFAULT' is not referenced.
184:2 'TYPEDEF' is not referenced.
106:2 'CASE' is not referenced.
100:2 'AUTO' is not referenced.
169:2 'SIGNED' is not referenced.
112:2 'CONST' is not referenced.
154:2 'LONG' is not referenced.
190:2 'UNSIGNED' is not referenced.
196:2 'VOLATILE' is not referenced.
142:2 'GOTO' is not referenced.
121:2 'DOUBLE' is not referenced.
109:2 'CHAR' is not referenced.
214:2 'ATTRIBUTE' is not referenced.
208:2 'STDCALL' is not referenced.
199:2 'WHILE' is not referenced.
133:2 'EXTERN' is not referenced.
115:2 'CONTINUE' is not referenced.
163:2 'RETURN' is not referenced.
130:2 'ENUM' is not referenced.
148:2 'INT' is not referenced.
160:2 'RESTRICT' is not referenced.
220:2 'USING' is not referenced.
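For anyone who wants to reproduce this kind of report without installing peg-dad, here is a rough Python sketch of the same two checks (redefined rules and unreferenced rules). It is not how peg-dad implements them; the function name `check_grammar` and the sample grammar are made up for illustration. It assumes every rule starts on its own line as `Name <-`, treats the first rule as the start rule, and does not handle grammar comments.

```python
import re
from collections import Counter

# A rule header such as "PrepUndef <-" at the start of a line.
RULE_HEAD = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)\s*<-", re.MULTILINE)
# Literals and character classes are blanked out before collecting rule names,
# so that "else" or [a-z] are not mistaken for references.
LITERAL = re.compile(r'''
    "(?:\\.|[^"\\])*"      # double-quoted string
  | '(?:\\.|[^'\\])*'      # single-quoted string
  | \[(?:\\.|[^\]\\])*\]   # character class
''', re.VERBOSE)
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def check_grammar(text):
    """Return (redefined, unreferenced) rule names, each as a sorted list."""
    heads = list(RULE_HEAD.finditer(text))
    names = [m.group(1) for m in heads]
    redefined = sorted(n for n, count in Counter(names).items() if count > 1)

    referenced = set()
    for i, head in enumerate(heads):
        # A rule body runs from its "<-" up to the next rule header.
        end = heads[i + 1].start() if i + 1 < len(heads) else len(text)
        body = LITERAL.sub(" ", text[head.end():end])
        referenced.update(IDENT.findall(body))

    start = names[0] if names else None  # assumption: first rule is the start rule
    unreferenced = sorted(
        n for n in set(names) if n not in referenced and n != start
    )
    return redefined, unreferenced


# Hypothetical miniature grammar with the same two problems as reported above.
sample = r'''
Start <-
    IfStmt
IfStmt <-
    "if" Spacing "else" Spacing
ELSE <-
    "else" Spacing
ELSE <-
    "else" Spacing
Spacing <-
    [ \t]*
'''

if __name__ == "__main__":
    redefined, unreferenced = check_grammar(sample)
    print("redefined:", redefined)
    print("unreferenced:", unreferenced)
```

In the sample, `ELSE` is defined twice and never used, because `IfStmt` spells the `"else"` literal inline, so it shows up in both lists.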
The converted prepast.pegjs, which can be used with https://github.com/yhirose/cpp-peglib to parse C++ inputs and show the AST:
TranslationUnit <-
Spacing ( Preprocessor / PrepMacroText Spacing )+ EOT
Preprocessor <-
( PrepDefine / PrepInclude / ConditionalInclusion ) Spacing
PrepDefine <-
PrepFunctionMacro / PrepSimpleMacro / PrepUndef
PrepUndef <-
SHARP UNDEF Identifier
PrepSimpleMacro <-
SHARP DEFINE Identifier PrepMacroText?
PrepFunctionMacro <-
SHARP DEFINE Identifier PrepFunctionMacroArgs PrepMacroText
PrepFunctionMacroArgs <-
LPAR Identifier ( COMMA Identifier )* RPAR
PrepFunctionMacroCallArgs <-
LPAR PrepMacroMacroCallText ( COMMA PrepMacroMacroCallText )* RPAR
PrepMacroMacroCallText <-
( Identifier PrepFunctionMacroCallArgs InlineSpacing / Identifier / SeperatorArgs )+
PrepMacroText <-
( Identifier PrepFunctionMacroCallArgs InlineSpacing / Identifier / Seperator )+
PrepInclude <-
PrepIncludeLib / PrepIncludeLocal
PrepIncludeLib <-
SHARP INCLUDE LT Filename GT
PrepIncludeLocal <-
SHARP INCLUDE QUO Filename QUO
Filename <-
( IdChar / [/\\.] )+
ConditionalInclusion <-
PrepIfdef / PrepIfndef / PrepEndif / PrepElse
PrepIfdef <-
SHARP IFDEF Identifier
PrepIfndef <-
SHARP IFNDEF Identifier
PrepEndif <-
SHARP ENDIF
PrepElse <-
SHARP PREP_ELSE
SHARP <-
"#" InlineSpacing
DEFINE <-
"define" InlineSpacing
UNDEF <-
"undef" InlineSpacing
INCLUDE <-
"include" InlineSpacing
IFDEF <-
"ifdef" InlineSpacing
IFNDEF <-
"ifndef" InlineSpacing
ENDIF <-
"endif" InlineSpacing
PREP_ELSE <-
"else" InlineSpacing
InlineSpacing <-
( InlineWhiteSpace / LongComment / LineComment )*
Spacing <-
( WhiteSpace / LongComment / LineComment )*
InlineWhiteSpace <-
[ \t\x0B\x0C]
WhiteSpace <-
[ \n\r\t\x0B\x0C]
LongComment <-
"/*" ( ! "*/" _ )* "*/"
LineComment <-
"//" ( ! "\n" _ )*
AUTO <-
"auto" ! IdChar Spacing
BREAK <-
"break" ! IdChar Spacing
CASE <-
"case" ! IdChar Spacing
CHAR <-
"char" ! IdChar Spacing
CONST <-
"const" ! IdChar Spacing
CONTINUE <-
"continue" ! IdChar Spacing
DEFAULT <-
"default" ! IdChar Spacing
DOUBLE <-
"double" ! IdChar Spacing
DO <-
"do" ! IdChar Spacing
ELSE <-
"else" ! IdChar Spacing
ENUM <-
"enum" ! IdChar Spacing
EXTERN <-
"extern" ! IdChar Spacing
FLOAT <-
"float" ! IdChar Spacing
FOR <-
"for" ! IdChar Spacing
GOTO <-
"goto" ! IdChar Spacing
IF <-
"if" ! IdChar Spacing
INT <-
"int" ! IdChar Spacing
INLINE <-
"inline" ! IdChar Spacing
LONG <-
"long" ! IdChar Spacing
REGISTER <-
"register" ! IdChar Spacing
RESTRICT <-
"restrict" ! IdChar Spacing
RETURN <-
"return" ! IdChar Spacing
SHORT <-
"short" ! IdChar Spacing
SIGNED <-
"signed" ! IdChar Spacing
SIZEOF <-
"sizeof" ! IdChar Spacing
STATIC <-
"static" ! IdChar Spacing
STRUCT <-
"struct" ! IdChar Spacing
SWITCH <-
"switch" ! IdChar Spacing
TYPEDEF <-
"typedef" ! IdChar Spacing
UNION <-
"union" ! IdChar Spacing
UNSIGNED <-
"unsigned" ! IdChar Spacing
VOID <-
"void" ! IdChar Spacing
VOLATILE <-
"volatile" ! IdChar Spacing
WHILE <-
"while" ! IdChar Spacing
BOOL <-
"_Bool" ! IdChar Spacing
COMPLEX <-
"_Complex" ! IdChar Spacing
STDCALL <-
"_stdcall" ! IdChar Spacing
DECLSPEC <-
"__declspec" ! IdChar Spacing
ATTRIBUTE <-
"__attribute__" ! IdChar Spacing
NAMESPACE <-
"namespace" ! IdChar Spacing
USING <-
"using" ! IdChar Spacing
TRUE <-
"true" ! IdChar Spacing
FALSE <-
"false" ! IdChar Spacing
Keyword <-
( "auto" / "break" / "case" / "char" / "const" / "continue" / "default" / "double" / "do" / "else" / "enum" / "extern" / "float" / "for" / "goto" / "if" / "int" / "inline" / "long" / "register" / "restrict" / "return" / "short" / "signed" / "sizeof" / "static" / "struct" / "switch" / "typedef" / "union" / "unsigned" / "void" / "volatile" / "while" / "_Bool" / "_Complex" / "_Imaginary" / "_stdcall" / "__declspec" / "__attribute__" ) ! IdChar
Identifier <-
! Keyword IdNondigit IdChar* InlineSpacing
SeperatorArgs <-
( Keyword / ! IdNondigit ! [\r\n,)] _ ) InlineSpacing
Seperator <-
( Keyword / Constant / StringLiteral / ! IdNondigit ! [\r\n] _ ) InlineSpacing
IdNondigit <-
[a-z] / [A-Z] / [_] / UniversalCharacter
IdChar <-
[a-z] / [A-Z] / [0-9] / [_] / UniversalCharacter
UniversalCharacter <-
"\\u" HexQuad / "\\U" HexOcto
HexOcto <-
( HexDigit HexDigit HexDigit HexDigit HexDigit HexDigit HexDigit HexDigit )
HexQuad <-
( HexDigit HexDigit HexDigit HexDigit )
Constant <-
( FloatConstant / IntegerConstant / EnumerationConstant / CharacterConstant / BooleanConstant )
BooleanConstant <-
( TRUE / FALSE )
IntegerConstant <-
( BinaryConstant / DecimalConstant / HexConstant / OctalConstant ) IntegerSuffix?
DecimalConstant <-
[1-9] [0-9]*
OctalConstant <-
"0" [0-7]*
HexConstant <-
HexPrefix HexDigit+
HexPrefix <-
"0x" / "0X"
HexDigit <-
[a-f] / [A-F] / [0-9]
BinaryPrefix <-
"0b"
BinaryDigit <-
[0-1]
BinaryConstant <-
BinaryPrefix BinaryDigit+
IntegerSuffix <-
[uU] Lsuffix? / Lsuffix [uU]?
Lsuffix <-
"ll" / "LL" / [lL]
FloatConstant <-
( DecimalFloatConstant / HexFloatConstant ) FloatSuffix? Spacing
DecimalFloatConstant <-
Fraction Exponent? / [0-9]+ Exponent
HexFloatConstant <-
HexPrefix HexFraction BinaryExponent? / HexPrefix HexDigit+ BinaryExponent
Fraction <-
[0-9]* "." [0-9]+ / [0-9]+ "."
HexFraction <-
HexDigit* "." HexDigit+ / HexDigit+ "."
Exponent <-
[eE] [+-]? [0-9]+
BinaryExponent <-
[pP] [+-]? [0-9]+
FloatSuffix <-
[flFL]
EnumerationConstant <-
Identifier
CharacterConstant <-
"L"? "'" Char* "'" Spacing
Char <-
Escape / ! ['\n\\] _
Escape <-
( SimpleEscape / OctalEscape / HexEscape / UniversalCharacter )
SimpleEscape <-
"\\" ['\"?\\abfnrtv]
OctalEscape <-
"\\" [0-7] [0-7]? [0-7]?
HexEscape <-
"\\x" HexDigit+
StringLiteral <-
( "L" / "u8" / "u" / "U" )? ( RawStringLiteral / EscapedStringLiteral )
RawStringLiteral <-
"R" ( ["] RawStringChar* ["] Spacing )+
EscapedStringLiteral <-
( ["] StringChar* ["] Spacing )+
RawStringChar <-
! [\"\n] _
StringChar <-
Escape / ! [\"\n\\] _
LPAR <-
"(" InlineSpacing
RPAR <-
")" InlineSpacing
COMMA <-
"," InlineSpacing
LT <-
"<" ! [=] InlineSpacing
GT <-
">" ! [=] InlineSpacing
QUO <-
"\"" InlineSpacing
EOT <-
! _
_ <-
.
After writing the above message I revisited the grammar, looking at the unreferenced rules, and found that they come from having an independent rule for each keyword token while using the keyword literals directly throughout the grammar.
But the duplicated rule definitions for ELSE and HexDigit are real (in the converted peg grammar shown above they are fixed).
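To make the "token rule exists but the literal is used instead" pattern easier to spot, here is a small Python sketch that lists, for each keyword token rule, the other rules that spell the same literal inline. The names `split_rules` and `inline_keyword_literals` and the sample grammar are hypothetical, and the heuristic is an assumption on my part: a token rule is taken to be an all-uppercase rule whose body starts with a double-quoted literal.

```python
import re

RULE_HEAD = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)\s*<-", re.MULTILINE)
# Group 1 captures the text of a double-quoted literal; character classes are
# matched too (group 1 stays None) so that a quote inside ["] is not misread.
LITERAL_OR_CLASS = re.compile(r'"((?:\\.|[^"\\])*)"|\[(?:\\.|[^\]\\])*\]')


def split_rules(text):
    """Yield (name, body) pairs for rules written as 'Name <-' at line start."""
    heads = list(RULE_HEAD.finditer(text))
    for i, head in enumerate(heads):
        end = heads[i + 1].start() if i + 1 < len(heads) else len(text)
        yield head.group(1), text[head.end():end]


def inline_keyword_literals(text):
    """Map each token rule to the rules that use its literal directly."""
    rules = list(split_rules(text))

    # literal text -> token rules that start with that literal
    token_rules = {}
    for name, body in rules:
        first = LITERAL_OR_CLASS.match(body.strip())
        if name.isupper() and first and first.group(1) is not None:
            token_rules.setdefault(first.group(1), []).append(name)

    overlap = {}
    for name, body in rules:
        if name.isupper():
            continue  # skip the token rules themselves
        for m in LITERAL_OR_CLASS.finditer(body):
            for token in token_rules.get(m.group(1), []):
                overlap.setdefault(token, set()).add(name)
    return overlap


# Hypothetical excerpt in the style of the converted grammar above.
sample = r'''
Keyword <-
    ( "do" / "else" ) ! IdChar
DO <-
    "do" ! IdChar Spacing
ELSE <-
    "else" ! IdChar Spacing
IdChar <-
    [a-z]
Spacing <-
    [ \t]*
'''

if __name__ == "__main__":
    for token, users in sorted(inline_keyword_literals(sample).items()):
        print(token, "literal also appears in:", sorted(users))
```

Each reported rule is a candidate for either referencing the token rule or dropping the token rule, which is a design decision for the grammar author rather than something the script can settle.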