p4lang / p4c

P4_16 reference compiler

Home Page: https://p4.org/

Inquiry Regarding Tools and Methods for Generating Abstract Syntax Trees (AST) from P4 Code

ROBINRUGAN opened this issue · comments

I'm currently exploring methods for converting P4 code into Abstract Syntax Trees (AST) for further analysis and processing. I'd greatly appreciate any insights or recommendations on existing tools or methods to achieve this.

Specifically, I'm interested in: Are there any existing tools or methods available to convert P4 code into its corresponding Abstract Syntax Tree (AST)? @jafingerhut

Eagerly anticipating your prompt and timely reply.

Hi Robin. You can already do this with the existing P4 compiler. You can use P4C to convert the existing P4 code into an AST that you can then manipulate. Please see https://github.com/fruffy/p4dummy for an instructional example back end.

However, I still don't understand how to build an AST for P4 and print its structure. I wonder if there are any P4 grammar rules or Python programs that would help me build and print the AST of P4 programs (the ingress part is enough).

The internal IR of the compiler frontend is an abstract syntax tree -- you can dump that as text or json. We don't have any tools for visualizing it, however.

https://github.com/p4lang/p4c/tree/main/tools/ir-generator
Do you mean this one? If so, I will try it.

No, that is the tool the compiler uses to generate C++ code from the IR .def files describing the IR classes. The compiler itself is p4c-X, where 'X' is a suffix for the target. For example, p4c-bm2-psa is the compiler for PSA running on the behavioral model bm2.

One of the available open-source backends is p4c-graphs, which generates graphs (in .dot format) from programs, but those are control-flow graphs of the parser(s) and control(s), not ASTs. It does, however, accept the various --toJSON and --top4 dump options to dump the IR at various points, as do most backend executables that don't explicitly disable those options.
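As a rough illustration of what can be done with a JSON dump of the IR, here is a small Python sketch that prints an outline of node types. It assumes, and the thread does not confirm this, that each IR node is serialized as a JSON object with `Node_Type` and `Node_ID` keys and that child nodes are nested in its other fields. The `outline` helper and the `SAMPLE` data are hypothetical and not part of P4C.

```python
import json


def outline(value, depth=0, lines=None):
    """Collect one indented line per JSON object that looks like an IR node."""
    if lines is None:
        lines = []
    if isinstance(value, dict):
        # Assumption: IR nodes carry "Node_Type" (and usually "Node_ID").
        if "Node_Type" in value:
            node_id = value.get("Node_ID", "?")
            lines.append("  " * depth + f"{value['Node_Type']}({node_id})")
            depth += 1
        for child in value.values():
            outline(child, depth, lines)
    elif isinstance(value, list):
        for child in value:
            outline(child, depth, lines)
    return lines


# Hand-written sample in the assumed shape, not real compiler output.
SAMPLE = json.loads("""
{"Node_ID": 1, "Node_Type": "ParserState", "name": "parse_ipv4",
 "components": {"Node_ID": 2, "Node_Type": "Vector<StatOrDecl>",
                "vec": [{"Node_ID": 3, "Node_Type": "MethodCallStatement"}]}}
""")

if __name__ == "__main__":
    print("\n".join(outline(SAMPLE)))
```

To try it on a real dump, replace `SAMPLE` with `json.load(open(path))` for whatever file the compiler wrote, and adjust the key names if the actual format differs.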

So there are no standalone tools that simply produce the AST of P4 programs? Just p4c-graphs or the direct compiler output (the JSON file), right?

P4C is the tool for getting a P4 AST. The P4 IR that we have in the compiler is a form of AST. It is in C++, so to access the AST you would need to work with that. P4C is in some sense more a library for building P4 compilers, so to get the AST, you would just run the parser and then do whatever you want with the result. You could, for example, write a pass that dumps it to dot, or to some other format useful for visualization. There are a lot of node types, though, and I don't think we have anything like that. We only have a textual dumper.
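To make the "dump it to dot" idea concrete, here is a minimal Python sketch of the output side only. It is not a P4C pass (a real one would be written in C++ against the compiler's IR); it assumes the tree has already been extracted into hypothetical nested `(label, children)` tuples, and `to_dot` is an illustrative name.

```python
from itertools import count


def to_dot(tree):
    """Render a (label, [children]) tree as Graphviz DOT text."""
    ids = count()
    lines = ["digraph ast {"]

    def visit(node):
        label, children = node
        me = next(ids)
        # Escape backslashes and quotes so the label stays valid DOT.
        escaped = label.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'  n{me} [label="{escaped}"];')
        for child in children:
            kid = visit(child)
            lines.append(f"  n{me} -> n{kid};")
        return me

    visit(tree)
    lines.append("}")
    return "\n".join(lines)


# Hand-built example tree; the node names mimic IR class names.
TREE = ("MethodCallStatement", [
    ("MethodCallExpression", [
        ("Member extract", [("PathExpression pkt", [])]),
        ("Vector<Argument>", [
            ("Argument", [("Member ipv4", [("PathExpression h", [])])]),
        ]),
    ]),
])

if __name__ == "__main__":
    print(to_dot(TREE))
```

The resulting text can be saved to a file and rendered with Graphviz, for example with `dot -Tpng`.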

There might be other unofficial tools that parse P4, but I've never heard of any (public) tool for AST visualization, nor of any P4 compiler in Python.

Someone once built a grammar railroad diagram from the P4 specification. Maybe this is helpful?

#4017

You could build this from the P4C IR, too. Currently, no tool does this though.

Another approach to get the AST in text form is to use p4test prog.p4 --dump dmp_folder --top4 "End" -v. The -v option will dump the entire internal representation along with the P4 program nodes. For example, for the parser state

state parse_ipv4 {
    pkt.extract(h.ipv4);
    transition accept;
}

it will produce

/* 
<ParserState>(1522) */
state parse_ipv4 {
    /* 
  <MethodCallStatement>(1518)
    <MethodCallExpression>(1517)
      <Member>(1510)extract
        <PathExpression>(1509)
          pkt
      <Vector<Type>>(1516), size=0
      <Vector<Argument>>(1515), size=1
        <Argument>(1514)
          <Member>(1513)ipv4
            <PathExpression>(1512)
              h */
    pkt.extract(/* 
        <Argument>(1514)
          <Member>(1513)ipv4
            <PathExpression>(1512)
              h */
h.ipv4);
/* 
  <PathExpression>(1519)
    accept */
            transition accept;
}
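If the goal is to post-process this annotated output, a short Python sketch like the following can pull the node lines out of the comments. It relies only on the `<Type>(id)` shape visible in the dump above; whether every node line follows that shape is an assumption, and `node_lines` is a hypothetical helper name. Nodes that appear more than once (such as the Argument subtree above) are kept only the first time.

```python
import re

# Matches lines such as "    <Member>(1510)extract" or "<Vector<Type>>(1516), size=0".
NODE_LINE = re.compile(r"^(\s*)<(.+)>\((\d+)\)")


def node_lines(dump_text):
    """Return (indent, node_type, node_id) for each first-seen node line."""
    seen = set()
    result = []
    for line in dump_text.splitlines():
        match = NODE_LINE.match(line)
        if not match:
            continue  # plain P4 source, comment delimiters, or leaf names
        indent, node_type, node_id = match.groups()
        if node_id in seen:
            continue  # the dump repeats subtrees next to the code they annotate
        seen.add(node_id)
        result.append((len(indent), node_type, int(node_id)))
    return result


# Excerpt copied from the dump above.
SAMPLE = """\
/*
  <MethodCallStatement>(1518)
    <MethodCallExpression>(1517)
      <Member>(1510)extract
        <PathExpression>(1509)
          pkt
      <Vector<Type>>(1516), size=0 */
"""

if __name__ == "__main__":
    for indent, node_type, node_id in node_lines(SAMPLE):
        print(" " * indent + f"{node_type} #{node_id}")
```

The indentation width is kept as-is, so it can be used to rebuild parent/child relations within one comment block.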