skvadrik / re2c

Lexer generator for C, C++, Go and Rust.

Home Page: https://re2c.org

Any interest in replacing usage of GNU Bison with Lemon?

starseeker opened this issue

Many years ago, we did some work exploring the use of the Lemon parser generator in place of Bison for RE2C's grammar files. It did work, but we haven't kept those changes current with the latest RE2C sources.

If we were to look at that problem again, would upstream RE2C be interested in using Lemon for the purpose of grammar definitions instead of Bison? This time I wanted to inquire as to whether such an effort would be of interest before plunging in ;-)

Last time I checked, lemon was not released as a standalone project; it has always been a part of sqlite.

Glancing at the distributions that currently package lemon, it looks like most of them just don't: https://repology.org/projects/?search=lemon&maintainer=&category=&inrepo=&notinrepo=&repos=&families=&repos_newest=&families_newest=

Thus it would probably require bundling and maintaining lemon within the re2c project. And given that sqlite does not accept external contributions, relying on it sounds a bit scary.

Out of curiosity: why do you need a different parser implementation? bison is an implementation detail of re2c.

Thanks for asking, @starseeker. I agree with @trofi. Besides, the bison parser has been considerably improved over the years and the current implementation is quite neat (we adopted various new bison features, such as the pure API).

The original motivation was to try and use a tool that was easier to build on Windows. Our project has had a long history of trying to automate our full build starting with just a C/C++ compiler, without requiring users to install other external tools or use cached generated source files. If build tools (like RE2C) are needed, we've built those too as part of our build. Bison, at least historically, was problematic on Windows in that regard, although I've not looked at it in a long time. I see per the RE2C documentation that the current "expected" build mode is to build using the cached generated sources, so easy cross-platform automated bootstrapping probably won't appeal much as a feature to the RE2C project.

> or use cached generated source files

re2c uses re2c to build the lexers for its .re files:

./src/parse/lex.re
./src/parse/lex_conf.re
./bootstrap/lib/lex.cc
./bootstrap/src/parse/lex.cc

Does it automatically rule out re2c entirely?

Regenerating sources is such a nuanced topic. Different projects have different trade-offs. Do you consider ./configure to be such a cached generated source? I wonder whether that rules out all autotools build systems for you (or forces you to solve circular dependencies).

> I see per the RE2C documentation that the current "expected" build mode is to build using the cached generated sources, so easy cross-platform automated bootstrapping probably won't appeal much as a feature to the RE2C project.

That's right, bison is required as a dependency only in developer mode.

Being able to regenerate sources on an arbitrary platform would require bundling all generators with re2c sources, as there are always platforms that won't have the necessary tool/version. And that would be hard to maintain. Also, bison is not the only generator, docs are also autogenerated and require docutils.
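To make the two build modes concrete, here is a minimal sketch of the general pattern: regenerate when the generator is installed, otherwise fall back to a checked-in copy. This is an assumption-laden illustration, not re2c's actual build logic; the function name `obtain_generated_source` and the paths used with it are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def obtain_generated_source(grammar: Path, bootstrap_copy: Path, out: Path,
                            generator: str = "bison") -> str:
    """Produce `out` from `grammar`, or fall back to a pre-generated copy.

    Hypothetical helper: returns "regenerated" or "bootstrap" so the caller
    can report which path was taken.
    """
    out.parent.mkdir(parents=True, exist_ok=True)
    tool = shutil.which(generator)
    if tool is not None:
        # Developer mode: the generator is on PATH, so rebuild from the grammar.
        subprocess.run([tool, "-o", str(out), str(grammar)], check=True)
        return "regenerated"
    # Default mode: no generator available, reuse the checked-in output.
    shutil.copyfile(bootstrap_copy, out)
    return "bootstrap"
```

With this shape, only people who edit the grammar need the generator installed; everyone else builds from the bundled file.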

@trofi It doesn't rule out re2c completely - bootstrapping like that is one of the situations where there's no practical alternative to generated sources. My reflex is to be wary of generated sources because I'm tempted to do the lazy thing and work around problems such as portability bugs in the generated copies, rather than addressing them properly by fixing or enhancing the generator tool. But if making the generation step itself portable is also very difficult, then that cure can end up worse than the disease. It's a tricky cost/benefit calculation and depends a lot on both the specifics of the situation and the overall project philosophy.

We (and most of our dependencies) have moved off of autotools in favor of CMake, but that has more to do with cross-platform support for things like Visual Studio. Personally, if I were maintaining an autotools project with a configure.ac, I'd want my CI system generating configure on all platforms to make sure every targeted development platform could use autotools to regenerate configure successfully, but that's just my take.

My position on the issue isn't hard and fast - I just bias in favor of minimizing the number of potential portability failure points. If we want our software to be buildable and maintainable (say) on Haiku OS as a first class citizen then we have to consider the viability of every dependency and tool we might potentially need on that OS to build our code. The project I primarily work on is decades old and has already outlived a number of its original targeted operating systems, so the desire to ensure wide and deep cross platform viability flows from that history.

@skvadrik You're quite right about the maintainability challenge. We strive to guarantee availability of our complete dependency chain beyond basic C/C++ compiler and system libraries (like graphics and font layers) but our project's goals in that regard are rather unusual and it does indeed take considerable effort.

It sounds as if lemon wouldn't be a good fit for modern re2c, so that answers the question and I can go ahead and close the issue - thanks for taking the time to consider it.