[TorqueScript: C++] Old scripting language conversion

Question

marauder2k7 opened this issue 2 months ago

I am currently converting TorqueScript to use ANTLR instead of Bison/Flex.

Almost everything is working well so far.

The original lexer had built-in functions that returned a specific token based on certain criteria. I am having trouble figuring out how to do this in ANTLR, because the resulting tokens require different parser rules.

I have this ID token:

ID : LETTER IDTAIL*;

In the original code, the matching rule called into a function like this:

{ID}        { return Sc_ScanIdent(); }

static int Sc_ScanIdent()
{
   ConsoleBaseType *type;

   CMDtext[CMDleng] = 0;

   if((type = ConsoleBaseType::getTypeByName(CMDtext)) != NULL)
   {
      /* It's a type */
      CMDlval.i = MakeToken< int >( type->getTypeID(), lineIndex );
      return TYPEIDENT;
   }

   /* It's an identifier */
   CMDlval.s = MakeToken< StringTableEntry >( StringTable->insert(CMDtext), lineIndex );
   return IDENT;
}

IDENT and TYPEIDENT have different parser rules, so I was curious whether this type of behavior can be replicated.

Target: C++

Ken Domino · Answer 1 · Thu Apr 18 2024 21:39:50 GMT+0800 (China Standard Time)

This isn't handled well in ANTLR because the lexer is independent of the parser. This is why I've been advocating parser-state-aware lexers for ANTLR5, and this case is an example of where they would be useful.

The alternative is to parse the input twice: once to collect definitions in a symbol table, and a second time to reference that symbol table during the parse.
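
A minimal sketch of that two-pass idea in plain C++, independent of any parser generator. Everything here is hypothetical and for illustration only: the `TokenKind` enum stands in for IDENT and TYPEIDENT, the toy `type Name` declaration syntax is an assumption (TorqueScript's real type names come from the `ConsoleBaseType` registry at runtime, so in practice the table would probably be filled from there rather than from the source text), and `scanWords`, `collectTypes`, and `classify` are made-up helper names.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical token kinds standing in for IDENT and TYPEIDENT.
enum class TokenKind { Ident, TypeIdent };

struct Token {
    TokenKind kind;
    std::string text;
};

// Split the input into identifier-like words (a letter or '_' followed by
// letters, digits, or '_'). All other characters are skipped.
std::vector<std::string> scanWords(const std::string& src) {
    std::vector<std::string> words;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isalpha(c) || c == '_') {
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) {
                ++i;
            }
            words.push_back(src.substr(start, i - start));
        } else {
            ++i;
        }
    }
    return words;
}

// Pass 1: build the symbol table.
// Assumption: the toy syntax "type Name" declares a type called Name.
std::unordered_set<std::string> collectTypes(const std::vector<std::string>& words) {
    std::unordered_set<std::string> types;
    for (std::size_t i = 0; i + 1 < words.size(); ++i) {
        if (words[i] == "type") {
            types.insert(words[i + 1]);
        }
    }
    return types;
}

// Pass 2: consult the symbol table to decide each word's token kind,
// which is the decision Sc_ScanIdent made inside the original lexer.
std::vector<Token> classify(const std::vector<std::string>& words,
                            const std::unordered_set<std::string>& types) {
    std::vector<Token> tokens;
    for (const std::string& w : words) {
        TokenKind kind = types.count(w) ? TokenKind::TypeIdent : TokenKind::Ident;
        tokens.push_back({kind, w});
    }
    return tokens;
}
```

With the input `type Point Point p`, the first pass records `Point` as a type, so the second pass tags both occurrences of `Point` as `TypeIdent` and leaves `type` and `p` as `Ident`. The cost of this design is a second walk over the input, and names registered after the first pass are not seen.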

marauder2k7 · Answer 2 · Fri Apr 19 2024 23:23:33 GMT+0800 (China Standard Time)

No worries, thanks for your reply. This is really something we need, since we add a lot of tokens at runtime. I will keep a watchful eye on ANTLR5, though.