kaby76 / Trash

Toolkit for grammars

Add test coverage of grammar

kaby76 opened this issue

Given input files on the command line, determine which rules and alts are tested. Output a hotspot tagging of the grammar.

Algorithm for test coverage

  • Create a tool called trcover. It parses the input files and constructs a parse tree for each one. I figured passing around parse trees for thousands of parses would not be very efficient, so this tool should operate like trperf.
  • Get the compiled parser driver code and find the .g4 files. Compute an NFA for each rule of the grammar.
  • Parse the normal input, get parse trees.
  • For these parse trees, define a tree visitor to determine grammar coverage. When visiting each node in the parse tree, take the children of the node as input and, using a backtracking parser, match them against the NFA created for the node's rule, where each transition is a symbol in the grammar (TOKEN_REF, RULE_REF, ...). Record the NFA states used. Make sure that each transition in the NFA carries not only the token type, but also the token index in the .g4. We want to calculate a table of the number of times each RHS symbol is used, something like Dictionary<int index, int count>.
  • Construct tables and visualizations for "hotspots" of RHS symbols tested.
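
The visitor and NFA matching steps above can be sketched roughly as follows. This is only an illustration in Python, not trcover's actual code: the NFA encoding (a dict of edges), the tuple-based parse tree shape, the function names match and cover, and the .g4 index values 101 to 103 are all assumptions made for the example. A Counter stands in for the Dictionary<int index, int count> table.

```python
from collections import Counter

# Hypothetical NFA encoding: edges maps state -> [(symbol, g4_index, next_state)].
# symbol None marks an epsilon transition; g4_index is the position of the RHS
# symbol in the .g4 file (the values here are made up for illustration).
PROC_CALL = {  # procCall : procName procParameter? ;
    "start": 0,
    "accept": 2,
    "edges": {
        0: [("procName", 101, 1)],
        1: [("procParameter", 102, 2), (None, None, 2)],
    },
}
PROC_NAME = {  # procName : IDENTIFIER ;
    "start": 0,
    "accept": 1,
    "edges": {0: [("IDENTIFIER", 103, 1)]},
}


def match(nfa, children, state=None, pos=0, seen=frozenset()):
    """Backtracking search: return the .g4 indices of the transitions that
    consume all of `children`, or None if the NFA does not accept them."""
    if state is None:
        state = nfa["start"]
    if state == nfa["accept"] and pos == len(children):
        return []
    for symbol, index, nxt in nfa["edges"].get(state, []):
        if symbol is None:
            # Epsilon move: skip states already visited at this input position.
            if (nxt, pos) in seen:
                continue
            rest = match(nfa, children, nxt, pos, seen | {(state, pos)})
            used = []
        elif pos < len(children) and children[pos] == symbol:
            rest = match(nfa, children, nxt, pos + 1)
            used = [index]
        else:
            continue
        if rest is not None:
            return used + rest
    return None  # every alternative failed, so the caller backtracks


def cover(node, nfas, counts):
    """Visit a parse tree node (rule_name, children) and tally RHS symbols.
    A child is either a token type string or a nested (rule_name, children)."""
    rule, children = node
    symbols = [c if isinstance(c, str) else c[0] for c in children]
    used = match(nfas[rule], symbols)
    if used is not None:
        counts.update(used)
    for child in children:
        # Rules without an NFA in this small example are simply skipped.
        if not isinstance(child, str) and child[0] in nfas:
            cover(child, nfas, counts)


nfas = {"procCall": PROC_CALL, "procName": PROC_NAME}
counts = Counter()
cover(("procCall", [("procName", ["IDENTIFIER"])]), nfas, counts)
cover(("procCall", [("procName", ["IDENTIFIER"]),
                    ("procParameter", ["BRACKET_OPEN", "BRACKET_CLOSE"])]),
      nfas, counts)
```

Any RHS symbol whose index never appears in counts was not exercised by the inputs, which is the information the hotspot tables need.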

For the test coverage "hotspot" visualization, I can wrap the .g4 file in a <pre><code> element and highlight the parts of each rule that were tested with something like <b style="background-color:Tomato;">...</b>. But it looks like I cannot embed the marked-up code in GitHub:

parser grammar abbParser;

options { tokenVocab = abbLexer; }

/*
    This grammar is still in development.
    In the current state, it is only able to parse .sys-files and read the given declarations.
*/
/*
    This file is the grammar for the ABB RAPID Robot Language.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU Lesser General Public License for more details.

    You should have received a copy of the GNU Lesser General Public License
    along with this program.  If not, see .
*/
/*
Antlr4 port by Tom Everett, 2016

Question mark stands for: zero or one
Plus stands for: one or more
Star stands for: zero or more

*/

module_
    : moduleData EOF
    ;

moduleData
    : MODULE moduleName NEWLINE
      dataList
      NEWLINE*
      ENDMODULE
    ;

moduleName
    : IDENTIFIER
    | procCall
    ;

dataList
    : (NEWLINE
    | declaration NEWLINE
    | procedure NEWLINE)*
    ;

procedure
    : PROC procCall NEWLINE
      (functionCall NEWLINE)*
    ENDPROC
    ;

procCall
    : procName procParameter?
    ;

procName
    : IDENTIFIER
    ;

procParameter
    : BRACKET_OPEN IDENTIFIER? BRACKET_CLOSE
    ;

functionCall
    : IDENTIFIER (functionParameter COMMA)* functionParameter SEMICOLON
    ;

functionParameter
    : ON_CALL
    | OFF_CALL
    | primitive
    | IDENTIFIER
    ;

declaration
    : init_ type_ IDENTIFIER (EQUALS expression)? SEMICOLON
    ;

type_
    : ( TOOLDATA | WOBJDATA | SPEEDDATA | ZONEDATA | CLOCK | BOOL )
    ;

init_
    : LOCAL? ( CONST | PERS | VAR )
    ;

expression
    : array_ | primitive
    ;

array_
    : SQUARE_OPEN (expression COMMA)* expression SQUARE_CLOSE
    ;

primitive
    : BOOLLITERAL
    | CHARLITERAL
    | STRINGLITERAL
    | (PLUS | MINUS)? FLOATLITERAL
    | (PLUS | MINUS)? INTLITERAL
    ;
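
For reference, the <pre><code> markup described before the grammar could be generated along these lines. This is a minimal Python sketch, not the tool's implementation: the highlight function and its input format (non-overlapping character-offset spans into the .g4 text) are assumptions for illustration.

```python
import html


def highlight(source, spans, color="Tomato"):
    """Wrap each (start, end) character span of `source` in a colored <b>
    element and return the result inside <pre><code>. Spans are assumed
    to be non-overlapping offsets into the .g4 text."""
    out = []
    pos = 0
    for start, end in sorted(spans):
        # Escape the text so '<', '>' and '&' in the grammar stay literal.
        out.append(html.escape(source[pos:start]))
        out.append('<b style="background-color:{};">{}</b>'.format(
            color, html.escape(source[start:end])))
        pos = end
    out.append(html.escape(source[pos:]))
    return "<pre><code>" + "".join(out) + "</code></pre>"


# Mark the single RHS symbol of a one-line rule.
page = highlight("procName : IDENTIFIER ;", [(11, 21)])
```

The spans would come from the per-symbol table computed by the coverage visitor, with one span per RHS symbol that was (or was not) exercised.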

Completed.