thoughtpolice / tree-sitter-openddl

a tree-sitter grammar, for OpenDDL v2.0

Home Page: http://openddl.org

tree-sitter parser for OpenDDL

This repository contains a tree-sitter grammar for the Open Data Description Language ("OpenDDL", "ODDL"), designed and authored by Eric Lengyel. It is a very close transcription of the official OpenDDL grammar, described using railroad diagrams, on http://openddl.org. It targets the latest OpenDDL 2.0 specification.

The intention of this project is to provide a canonical, machine-usable description of the original grammar, one that can be used in other OpenDDL-based tools -- such as derivative, format-specific parsers -- by simply incorporating tree-sitter. A distant secondary goal is to start building a canonical test suite of ODDL files that other implementations could share, to ensure they can parse things correctly (so that we can avoid creating our own nightmares).

HEADS UP: This grammar should be considered very unstable, and it is neither thoroughly tested nor documented at this time. String literal parsing, at minimum, is certainly not within spec. There are few test cases, and they exercise only small, trivial parts of the grammar. tree-sitter's highlighting support is still changing, so highlighting here should be considered non-functional. There are likely other gaps I've forgotten.

The current primary use case is a foundational parser for tools built around the Open Game Engine Exchange ("OpenGEX", "OGEX") format, but the grammar is reusable for any ODDL tool, and is likely useful for other applications of OpenDDL that people think up.

Thanks to the design of tree-sitter itself, it also provides a foundation for incremental re-parsing and syntax highlighting of OpenDDL-based formats, which could be used for efficient editor integration, refactoring, etc -- though this is likely only useful for simpler, custom OpenDDL formats, versus formats like OpenGEX (which are intended to be generated, and will often be very large).

NOTE: While OpenDDL is the base language for OpenGEX, and one intention of this project is to be usable for OpenGEX-based tooling, the tree-sitter parser here DOES NOT offer any specific support or validation for the OpenGEX format, such as validating properties, types, etc. That must be built as a layer on top of the tree-sitter AST.

Usage

Traditionally, developers of tree-sitter grammars are encouraged to write grammars, and generate C code for their grammar using tree-sitter generate. This auto-generated code is then committed next to the grammar code itself, in the Git repository. Users of tree-sitter grammars are intended to clone that repository as a submodule, and link against the C code checked into it.

While this design works okay, I find it flawed for a number of reasons (which won't be elaborated on here), and so it is avoided to some extent.

Instead, generated C code is distributed separately from the grammar code (though still in Git), and is automatically generated upon every commit using continuous integration. You're encouraged to instead simply vendor the C code into your repository by downloading a version of it when needed (or, using git submodule directly -- if you hate yourself and anyone who has to contribute.)

Downloading C code for the grammar

Version information: The C code for this grammar is generated by tree-sitter version 0.16.2, and therefore you MUST link the generated C code against a compatible version of the tree-sitter library -- version 0.16.x or later.

TBD.

Sample C program

TBD.

Building & hacking

I use Nix for both continuous integration and local development, so install Nix on your Linux distribution of choice if you want the same setup. Then run nix-build a lot, or enter nix-shell and hack iteratively.

Alternatively, you can install tree-sitter yourself and do typical tree-sitter generate && tree-sitter test development, but Nix does all that for you and a lot more (provisioning nodejs, etc). It's your choice.
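The manual route can be wrapped in a small script. What follows is a minimal sketch, assuming that `tree-sitter` and `node` are the executable names on your PATH and that the script is run from the repository root. The function names `have_tool`, `check_prereqs`, and `dev_loop` are made up for illustration and are not part of this repository.

```shell
#!/bin/sh
# Sketch of the manual (non-Nix) development loop.
# Function names are illustrative, not part of this repository.

# Succeeds if the named program can be found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Reports every missing tool on stderr; fails if any tool is missing.
check_prereqs() {
    rc=0
    for tool in "$@"; do
        if ! have_tool "$tool"; then
            echo "missing required tool: $tool" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Regenerate the parser from the grammar, then run the test corpus.
# Assumption: the nodejs executable is called `node` on this system.
dev_loop() {
    check_prereqs tree-sitter node || return 1
    tree-sitter generate && tree-sitter test
}
```

After sourcing the file, calling `dev_loop` either reports which tools are missing or runs the generate-and-test cycle once.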

NOTE: The nix-based build here ONLY works on x86_64 Linux, but this is only a technical restriction, due to the usage of a static Linux binary for tree-sitter. This could be lifted in the future for macOS and aarch64 Linux.

In the meantime, macOS users cannot use the Nix-based build, and must use tree-sitter directly. They must also install nodejs.

I strongly suggest that Windows users use a tool like WSL2 in order to do grammar development. Like any Linux distro, WSL2 Linux distributions can use nix, or tree-sitter and nodejs directly, as macOS users do.
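The platform rules above can be summed up in a small helper. This is a sketch only: `pick_workflow` is a hypothetical name, and the kernel and machine strings are the values that `uname -s` and `uname -m` commonly report.

```shell
#!/bin/sh
# Sketch: choose a development workflow from the platform rules described above.
# pick_workflow is a hypothetical helper, not something shipped in this repository.

# $1 = kernel name (as from `uname -s`), $2 = machine type (as from `uname -m`).
# Prints "nix" where the Nix-based build is supported, otherwise "manual".
pick_workflow() {
    case "$1/$2" in
        Linux/x86_64) echo nix ;;      # includes WSL2 distributions on x86_64
        *)            echo manual ;;   # macOS, aarch64 Linux, anything else
    esac
}
```

For the current machine, call `pick_workflow "$(uname -s)" "$(uname -m)"`. A result of `manual` means installing tree-sitter and nodejs yourself.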

A useful guide to keep open in your browser is tree-sitter's documentation on how to create parsers.

Continuous deployment

TBD: get GH Actions deploying things, and describe it here.

Authors

See AUTHORS.txt for the list of contributors to the project.

License

MIT, like most tree-sitter grammars. See LICENSE.txt for precise terms of copyright and redistribution.
