Idea: Support `node_tag` and `branch_tag`
oovm opened this issue
Add two fields to identify nodes
#[derive(Clone, Debug)]
pub struct Node<'i> {
    pub rule: Rule,
    pub start: usize,
    pub end: usize,
    pub node_tag: Option<&'i str>,   // The lifetime is the same as the input text
    pub branch_tag: Option<&'i str>, // The lifetime is the same as the input text
    pub children: Vec<Node<'i>>,
    pub alternative: Option<u16>,
}
Basically like this:
expr <-
"(" expr ")" #Priority
/ lhs=expr "*" rhs=expr #Mul
/ lhs=expr "/" rhs=expr #Div
/ lhs=expr "+" rhs=expr #Add
/ lhs=expr "-" rhs=expr #Sub
/ num #Atom
;
num <- re#[0-9]+#;
PEG.js has this feature
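To make the intent concrete, here is a rough sketch of how a consumer might walk such a tree: dispatch on branch_tag to pick the alternative, and look up operands by node_tag. The single-variant `Rule` enum, the `child` and `eval` helpers, and the hand-built tree in `demo` are assumptions made for illustration and are not part of the library. It also assumes that literals such as "(" produce no child nodes.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Rule {
    Expr, // hypothetical: a real grammar would generate more variants
}

#[derive(Clone, Debug)]
pub struct Node<'i> {
    pub rule: Rule,
    pub start: usize,
    pub end: usize,
    pub node_tag: Option<&'i str>,
    pub branch_tag: Option<&'i str>,
    pub children: Vec<Node<'i>>,
    pub alternative: Option<u16>,
}

impl<'i> Node<'i> {
    /// First direct child whose node_tag equals `tag` (hypothetical helper).
    pub fn child(&self, tag: &str) -> Option<&Node<'i>> {
        self.children.iter().find(|c| c.node_tag == Some(tag))
    }
}

/// Evaluate an `expr` tree by dispatching on branch_tag.
/// Returns None on unknown tags, missing operands, overflow, or division by zero.
pub fn eval(node: &Node<'_>, input: &str) -> Option<i64> {
    match node.branch_tag {
        // Assumption: the inner expr is the only child of a #Priority node.
        Some("Priority") => eval(node.children.first()?, input),
        // Assumption: an #Atom node spans exactly the digits of the number.
        Some("Atom") => input.get(node.start..node.end)?.parse().ok(),
        Some(op) => {
            let lhs = eval(node.child("lhs")?, input)?;
            let rhs = eval(node.child("rhs")?, input)?;
            match op {
                "Mul" => lhs.checked_mul(rhs),
                "Div" => lhs.checked_div(rhs),
                "Add" => lhs.checked_add(rhs),
                "Sub" => lhs.checked_sub(rhs),
                _ => None,
            }
        }
        None => None,
    }
}

fn atom(node_tag: Option<&'static str>, start: usize, end: usize) -> Node<'static> {
    Node { rule: Rule::Expr, start, end, node_tag, branch_tag: Some("Atom"),
           children: Vec::new(), alternative: None }
}

fn branch(branch_tag: &'static str, node_tag: Option<&'static str>,
          start: usize, end: usize, children: Vec<Node<'static>>) -> Node<'static> {
    Node { rule: Rule::Expr, start, end, node_tag, branch_tag: Some(branch_tag),
           children, alternative: None }
}

/// Hand-built tree for "(2+3)*4", standing in for real parser output.
pub fn demo() -> Option<i64> {
    let input = "(2+3)*4";
    let sum = branch("Add", None, 1, 4,
                     vec![atom(Some("lhs"), 1, 2), atom(Some("rhs"), 3, 4)]);
    let group = branch("Priority", Some("lhs"), 0, 5, vec![sum]);
    let product = branch("Mul", None, 0, 7, vec![group, atom(Some("rhs"), 6, 7)]);
    eval(&product, input)
}
```

Without the tags, the same walk would have to rely on the numeric `alternative` index and on child positions, both of which shift whenever the grammar is reordered.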
This is a great idea, thank you!
So I get that you wish to label a node. What is the branch_tag for?
I think the lifetime can be 'static, because we can simply use a const static str for this.
I have an actual usage example here
Consider the grammar
expr <-
"(" expr ")" #Priority
/ expr "<-" expr #Mark
// ...others
I marked the branch_tag here
A macro is used here, and it looks like this after expansion:
#[inline]
pub fn expr(s: RuleState) -> RuleResult {
    // Try each alternative in order; the first one that matches is tagged and returned.
    let s = match s.rule(Rule::BRANCH, self::__aux_expr_priority) {
        Ok(o) => return o.tag_branch("Priority"),
        Err(e) => e,
    };
    let s = match s.rule(Rule::BRANCH, self::__aux_expr_mark) {
        Ok(o) => return o.tag_branch("Mark"),
        Err(e) => e,
    };
    // ...others
    return Err(s);
}
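For readers without the macro source at hand, the following is a minimal, self-contained sketch of the same ordered-choice pattern. Everything in it is an assumption for illustration: this `RuleState`, `RuleResult`, `literal`, and `choose` are stand-ins written for this example, not the actual types or methods behind the expansion above. The only thing taken from the expansion is the shape: a failed alternative hands the state back through `Err`, and the first successful one is tagged with a `&'static str`.

```rust
/// Hypothetical parser state: the input, a cursor, and the tag of the matched branch.
#[derive(Clone, Debug, PartialEq)]
pub struct RuleState<'i> {
    pub input: &'i str,
    pub pos: usize,
    pub branch_tag: Option<&'static str>,
}

/// Ok carries the advanced state; Err hands the untouched state back for the next alternative.
pub type RuleResult<'i> = Result<RuleState<'i>, RuleState<'i>>;

impl<'i> RuleState<'i> {
    pub fn new(input: &'i str) -> Self {
        RuleState { input, pos: 0, branch_tag: None }
    }

    /// Consume `literal` at the cursor, or return the state unchanged as an error.
    pub fn literal(mut self, literal: &str) -> RuleResult<'i> {
        if self.input[self.pos..].starts_with(literal) {
            self.pos += literal.len();
            Ok(self)
        } else {
            Err(self)
        }
    }

    /// Record which alternative matched. The tag is a string literal, hence 'static.
    pub fn tag_branch(mut self, tag: &'static str) -> RuleResult<'i> {
        self.branch_tag = Some(tag);
        Ok(self)
    }
}

/// Ordered choice over two stand-in alternatives, mirroring the expansion's control flow.
pub fn choose(s: RuleState<'_>) -> RuleResult<'_> {
    let s = match s.literal("(") {
        Ok(o) => return o.tag_branch("Priority"),
        Err(e) => e,
    };
    let s = match s.literal("<-") {
        Ok(o) => return o.tag_branch("Mark"),
        Err(e) => e,
    };
    Err(s)
}
```

Because the tags are written out as literals in the generated code, nothing ties them to the lifetime of the input text.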
Finally deal with branch_tag here
You are right, it should be Option<&'static str>
👍 Right, that is pretty nice.
I was thinking the start/end in the node could be replaced with a `&str` (with the same lifetime as the input string). I suspect this uses the same amount of memory, but with a much better developer experience.
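A quick way to check the memory claim, as a sketch: on common targets a `&str` is a pointer plus a length, so it should occupy the same space as two `usize` offsets. The `Span` struct and the `offsets_of` helper below are hypothetical names used only for this comparison. `offsets_of` assumes the slice really borrows from `input`.

```rust
use std::mem::size_of;

/// The current representation: two offsets into the input.
pub struct Span {
    pub start: usize,
    pub end: usize,
}

/// Sizes in bytes of the offset pair and of a string slice reference.
pub fn span_and_slice_sizes() -> (usize, usize) {
    (size_of::<Span>(), size_of::<&str>())
}

/// Recover (start, end) byte offsets from a slice.
/// Assumption: `slice` is a sub-slice of `input`; otherwise the result is meaningless
/// (and the subtraction may panic in debug builds).
pub fn offsets_of(input: &str, slice: &str) -> (usize, usize) {
    let start = slice.as_ptr() as usize - input.as_ptr() as usize;
    (start, start + slice.len())
}
```

So storing the slice loses nothing: the matched text is available directly, and the offsets can still be derived when needed for error reporting.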
I've given this some thought. I think using special comments is not a great way of doing this. Also, we want to be able to mark expressions as "create nodes for this". With that in mind I've taken some inspiration from lalrpop and I've come up with this syntax:
start <- (<foo> / bar) EOI;
foo <-
add:/ <left:foo> "+" <right:num>
sub:/ <left:foo> "-" <right:num>
num;
bar <- "NO";
num <- re#\d+#;
So the idea is:
- Only create nodes for expressions surrounded by < and >
- Expressions written as <label:expression> get nodes with a label
- Sequence alternatives can have labels, written label:/
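The "only create nodes for marked expressions" rule can be illustrated as a pruning pass over a full tree. This is purely a sketch of the intended semantics, not how the parser does or will do it: the `Node` layout, the `marked` flag, and the `node` and `prune` functions are all hypothetical. Unmarked nodes are dropped and their marked descendants are hoisted into the nearest marked ancestor.

```rust
/// Hypothetical node: `marked` is true when the expression was written inside < >.
#[derive(Clone, Debug, PartialEq)]
pub struct Node {
    pub label: Option<&'static str>,
    pub marked: bool,
    pub start: usize,
    pub end: usize,
    pub children: Vec<Node>,
}

/// Small constructor to keep examples short.
pub fn node(label: Option<&'static str>, marked: bool, start: usize, end: usize,
            children: Vec<Node>) -> Node {
    Node { label, marked, start, end, children }
}

/// Keep marked nodes; replace each unmarked node by its (pruned) children.
pub fn prune(n: Node) -> Vec<Node> {
    let Node { label, marked, start, end, children } = n;
    let kept: Vec<Node> = children.into_iter().flat_map(prune).collect();
    if marked {
        vec![Node { label, marked, start, end, children: kept }]
    } else {
        kept
    }
}
```

For an input such as "1+2" under the grammar above, a full tree with an unmarked `start` node and an unmarked "+" literal would reduce to a single `foo` node holding the `left` and `right` children.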
I've pushed the changes which add the labels for nodes (< .. >) and for alternatives (label:/).
At the moment, all nodes are still being created. Changing that is the next step.