shnewto / bnf

Parse BNF grammar definitions

Allow partial parsing from string

CrockAgile opened this issue · comments

Frequently, while using this crate, terms, expressions, productions, and grammars are built from their raw type parts:

let t1: Term = Term::Terminal(String::from("terminal"));
let nt1: Term = Term::Nonterminal(String::from("nonterminal"));
let e1: Expression = Expression::from_parts(vec![nt1, t1]);

Productions and Grammars require even more work. It would be convenient and powerful if instead the types could be constructed using the BNF syntax:

let term = Term::from_str("<nonterminal>").unwrap();
let expression = Expression::from_str("<nonterminal> \"terminal\"").unwrap();
let production = Production::from_str("<nonterminal> ::= <nonterminal> \"terminal\" | \"terminal\"").unwrap();

This could be achieved by implementing the FromStr trait for each type and leveraging existing parsers. Adding this to the Grammar as well would remove the need for the bnf::parse function.
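
As a rough illustration of the FromStr idea, here is a minimal standalone sketch. It does not use the crate's real types or parsers: the `Term` enum below is a local stand-in that mirrors the variants shown above, `ParseTermError` is a hypothetical error type, and the `<...>` / `"..."` delimiters are assumed from the examples in this issue.

```rust
use std::str::FromStr;

// Local stand-in for the crate's `Term`, redefined so the sketch compiles alone.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Term {
    Terminal(String),
    Nonterminal(String),
}

// Hypothetical error type; the crate's actual error type may differ.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ParseTermError(String);

impl FromStr for Term {
    type Err = ParseTermError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let s = s.trim();
        if let Some(name) = s.strip_prefix('<').and_then(|r| r.strip_suffix('>')) {
            // `<name>` is read as a nonterminal; the name must be non-empty.
            if !name.is_empty() && !name.contains('>') {
                return Ok(Term::Nonterminal(name.to_string()));
            }
        } else if let Some(text) = s.strip_prefix('"').and_then(|r| r.strip_suffix('"')) {
            // `"text"` is read as a terminal; embedded quotes are rejected.
            if !text.contains('"') {
                return Ok(Term::Terminal(text.to_string()));
            }
        }
        Err(ParseTermError(format!("not a delimited term: {}", s)))
    }
}
```

With this in place, `"<nonterminal>".parse::<Term>()` yields a `Term::Nonterminal`, while an undelimited string such as `terminal` is rejected. A real implementation would presumably delegate to the crate's existing parsers instead of hand-rolling the delimiter checks.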

There is, however, a difficult wrinkle to this problem:

  • How are terminals and nonterminals differentiated in string format?

A grammar can differentiate between the two by inspecting the full set of productions, but terms in productions and expressions are essentially ambiguous. Maybe some solution brainstorming will help start the discussion:

  • All terms are terminals until incorporated into a grammar or production that says otherwise
  • Add a third type of term which represents some "undefined" state

Hopefully @Snewt will have some perspective to resolve this block and enable this convenient functionality 🤞

@CrockAgile I agree that this would be a really good addition. I also think we can resolve your concern by requiring the same input format all the way down, so that we can use all the existing parsers. Just as the grammar string format requires nonterminals to be delimited by < and > and terminals to be delimited by " and ", can we expect the same for the other types? For instance, revise
let expression = Expression::from_str("nonterminal terminal").unwrap();
to
let expression = Expression::from_str("<nonterminal> \"terminal\"").unwrap();
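
To make the delimiter rule concrete, here is a small standalone sketch of parsing an expression string in which every term carries its own delimiters. The `Term` and `Expression` types are local stand-ins, not the crate's definitions, and the `String` error type is an assumption chosen for brevity.

```rust
use std::str::FromStr;

// Local stand-ins for the crate's types, so the sketch compiles on its own.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Term {
    Terminal(String),
    Nonterminal(String),
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Expression {
    pub terms: Vec<Term>,
}

impl FromStr for Expression {
    // Assumption: a plain String error, used here only to keep the sketch short.
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let mut terms = Vec::new();
        let mut rest = s.trim_start();
        while !rest.is_empty() {
            // The opening delimiter decides the term kind and its closing delimiter.
            let (close, is_nonterminal) = match rest.chars().next() {
                Some('<') => ('>', true),
                Some('"') => ('"', false),
                _ => return Err(format!("expected '<' or '\"' at: {}", rest)),
            };
            let body = &rest[1..];
            let end = body
                .find(close)
                .ok_or_else(|| format!("missing closing {:?}", close))?;
            let name = body[..end].to_string();
            terms.push(if is_nonterminal {
                Term::Nonterminal(name)
            } else {
                Term::Terminal(name)
            });
            // Skip past the closing delimiter and any whitespace before the next term.
            rest = body[end + 1..].trim_start();
        }
        if terms.is_empty() {
            return Err("expression has no terms".to_string());
        }
        Ok(Expression { terms })
    }
}
```

Because each term is delimited, no surrounding grammar is needed to tell the two kinds apart: the undelimited form `nonterminal terminal` is simply an error, which is the point of the suggestion above.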

@Snewt That makes so much sense. I just realized I was thinking about terminals totally wrong. Wooo, that makes this very doable. I'll count my question as resolved then and add this to the 0.2.0 milestone.

Closed by PR #24