The ll1 tool performs a small set of analyses on a context-free grammar in order to determine whether it is LL(1), and thus suitable for implementation via a predictive parser algorithm. Running a grammar through ll1 can help you catch left recursion and other conflicts that could ruin your day if they pop up later during parser implementation.
ll1 parses a CFG in an 'un-adorned' GNU Bison format. The use of %empty is absolutely mandatory for writing epsilon productions. Character literals are also accepted as tokens, so the following grammar:
%%
/* left-recursive! */
expr : expr '+' term
| expr '-' term
| term ;
/* left-recursive! */
term : term '*' factor
| term '/' factor
| factor ;
factor : ID | NUM ;
%%
would be perfectly acceptable input, even though it will cause ll1 to pitch a fit about predict set overlaps between productions of the same nonterminal. ;)
However, ll1 will be tickled pink to accept this input:
%%
/* tail-recursive... :) */
expr : term expr_next ;
term : factor term_next ;
expr_next : '+' expr | '-' expr | %empty ;
term_next : '*' term | '/' term | %empty ;
factor : ID | NUM ;
%%
In addition to LL(1) conflict reports, ll1 will also output information about:
- Terminal symbols
- Nonterminal symbols
- Nonterminals deriving
%empty
- Recapitulation of the input grammar
- First, follow and predict sets
It's neither perfect nor complete, but it helps provide some basic insights.
I wrote ll1 in a day, and only passed it through valgrind a handful of times to check for serious errors. Given the fact that most of the data structures are patterned after the null-termination patterns of C strings, there has to be a bug or two in there... somewhere...
Anyways, if you run into one, let me know and I'll push a patch.
The ll1 source code is released under the MIT license. See the LICENSE.md file for the complete license terms.
And as always, enjoy!
*~ Brad.*