alan-if / alan

ALAN IF compilers and interpreters

Home Page: https://alanif.se


Q: About ALAN's Parser & Lexer Generators

tajmone opened this issue

@thoni56, In #12 you mentioned

The character sets used are actually generated by the scanner generator ("lexem parser generator") that is part of the legacy compiler-compiler toolset that is used to generate the scanner (and parser) for the Alan compiler.

I've been trying to find information about this compiler-compiler tool, but searching for "lexem parser generator" turns up nothing. I noticed that the generated files pmParse.c and smScan.c mention the tools "ParserMaker" and "ScannerMaker", but I couldn't find anything about those on Google either.

Not much luck with Wikipedia's page Comparison of parser generators either.

Could you provide me with more info about this toolset and its author, and a link to its website (if still in existence) or some place where its original codebase can still be found?

I would like to add a page to our Wiki (here) with some info about it, since it's part of the required toolset for building ALAN, and because these tools are now old and hard to find information about.

Is this compiler-compiler tool (which I believe you've had to modify in order to adapt it to the latest changes) included in this repository? That is, how would someone wanting to update the ALAN grammar and parser/lexer go about it? (I'm guessing they would need access to the same parser/lexer generator tools.)

Thanks.

Yes, and I think I have mentioned before that those are legacy tools from a suite that my colleagues and I built in the '90s at my previous company, SoftLab (not the German one). It was of course proprietary at the time. It introduced modern parsing and compiling techniques for Plex, the Ericsson telephone exchange language, and shrank compile times to a tenth of what they had been. It has been said that the reduction in CPU time alone, over a single year, paid for the cost of the project. It was also used by a handful of our other clients.

But since that company, after some intricate turns of events, ended up as part of Ericsson, and the tool probably hasn't been used during the last few decades (though who knows how long they maintain these things...), I made it public at https://github.com/thoni56/ToolMaker, with the disclaimer that if anyone objects to it being available, I'll take it down.

Actually, as usual, if you know what to Google for, you can find a few matches.

Documentation? There is none. I always kept a personal copy of the source, as it was vital to building Alan. Building Alan on multiple platforms also sometimes made me take ToolMaker along to a new platform, although that was not strictly necessary. Unfortunately, the documentation was lost (or I never had a copy), and I have not been able to track down anyone who does, not even at Ericsson, where some of my old colleagues still work (unless they have retired by now...).

Is this compiler-compiler tool (which I believe you've had to modify in order to adapt it to the latest changes) included in this repository? That is, how would someone wanting to update the ALAN grammar and parser/lexer go about it? (I'm guessing they would need access to the same parser/lexer generator tools.)

You never need to modify the tools themselves. The tool is only needed if the language grammar changes, or if there are changes to the internal form that must be handled when the AST is built. But yes, if that happens, you need a working ToolMaker kit installed. Its use is entirely handled by the Makefiles, though, and I doubt there will ever be changes so radical that you can't wing it by mimicking what's already there.

One concern that I have (not much, but a little) is that the grammar might at some point change so that ParserMaker, the aptly named parser generator of ToolMaker, cannot handle it. ParserMaker only handles LALR grammars, although there are some means to tweak the parsing rules so that a larger set of grammars can be handled (which has already been done in a few places).
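To illustrate the kind of tweak this refers to (a classic textbook example in generic BNF, not ALAN's actual grammar nor ToolMaker's input syntax): a grammar can be LR(1) yet not LALR(1), because LALR state merging can create a reduce-reduce conflict that the canonical LR(1) tables avoid. The usual workaround is to duplicate a nonterminal so the conflicting states keep distinct cores and are never merged:

```
/* LR(1) but not LALR(1): after reading "a c", the lookahead (d or e)
   tells an LR(1) parser whether to reduce c to A or to B. LALR(1)
   merges the two states containing "c ." (same core) and ends up
   with a reduce-reduce conflict on both lookaheads. */
S ::= a A d | b B d | a B e | b A e
A ::= c
B ::= c

/* Tweak: split the nonterminals so the states no longer share a core.
   The language is unchanged, but the grammar is now LALR(1). */
S  ::= a A d | b B d | a B2 e | b A2 e
A  ::= c
B  ::= c
A2 ::= c
B2 ::= c
```

Whether ParserMaker's actual workarounds take this form or another (e.g. precedence annotations or rule restructuring) isn't documented here; the point is only that such rewrites can extend an LALR-only tool's reach without changing the language accepted.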

And of course I've been toying with converting to other parser generators multiple times over the years; that's why you see fragments of CoCo and ANTLR grammars in the compiler directory. These feeble attempts have so far come to nothing, as the work would be quite large and the value, at this time, very low. A lot would have to change, since most compiler toolkits handle many things differently. What's cool about ToolMaker is that many things already exist and integrate easily with each other: source positions, error messages, listing files, the interface to the scanner/lexer/tokenizer, etc. ParserMaker also has the world's first back-tracking error-recovery logic; it was actually the doctoral work of one of my then colleagues, also a former professor, Kenth Ericson (no relation to the company ;-).

With all that background it might not come as a surprise that my recommendation is not to go down that rabbit hole ;-)

But it is good that you asked the question (I will move this to a discussion rather than an issue). That way, this answer is saved for posterity until someone does that wiki-page write-up.