This repository contains a tree-sitter grammar for the Open Data Description Language ("OpenDDL", "ODDL"), designed and authored by Eric Lengyel. It is a very close transcription of the official OpenDDL grammar, described using railroad diagrams, on http://openddl.org. It targets the latest OpenDDL 2.0 specification.
The intention of this project is to provide a canonical, machine-usable
description of the original grammar, one that can be used in other
OpenDDL-based tools -- such as derivative, format-specific parsers -- by simply
incorporating tree-sitter. A distant secondary goal is to start building a
canonical test suite of ODDL files that other implementations could share, to
ensure they can parse things correctly (so that we can avoid creating our own
nightmares).
HEADS UP: This grammar should be considered very unstable, and it is not yet thoroughly tested or documented. String literal parsing, at minimum, is certainly not within spec. There are few test cases, and they exercise only small, trivial parts of the grammar. tree-sitter's highlighting support is still changing, and should be considered non-functional here -- and there are more issues I've forgotten.
The current primary use case is a foundational parser for tools built around the Open Graphics Exchange Format ("OpenGEX", "OGEX"), but the grammar can be reused for any ODDL tool, and is likely useful for other uses of the OpenDDL format that people can think up.
Thanks to the design of tree-sitter itself, it also provides a foundation for
incremental re-parsing and syntax highlighting of OpenDDL-based formats, which
could be used for efficient editor integration, refactoring, etc -- though this
is likely only useful for simpler, custom OpenDDL formats, versus formats like
OpenGEX (which are intended to be generated, and will often be very large).
NOTE: While OpenDDL is the basis language for OpenGEX, and one intention of this project is to be usable for OpenGEX-based tooling, the tree-sitter parser here DOES NOT offer any specific support or validation for the OpenGEX format, such as validating properties, types, etc. That must be built as a layer on top of the tree-sitter AST.
Traditionally, developers of tree-sitter grammars are encouraged to write
grammars, and generate C code for their grammar using tree-sitter generate.
This auto-generated code is then committed next to the grammar code itself, in
the Git repository. Users of tree-sitter grammars are intended to clone that
repository as a submodule, and link against the C code checked into it.
While this design works okay, I find it flawed for a number of reasons (which won't be elaborated on here), and so it is avoided to some extent.
Instead, generated C code is distributed separately from the grammar code
(though still in Git), and is automatically generated upon every commit using
continuous integration. You're encouraged to instead simply vendor the C code
into your repository by downloading a version of it when needed (or use git submodule
directly, if you hate yourself and anyone who has to contribute).
Version information: The C code for this grammar is generated by tree-sitter version 0.16.2, and therefore you MUST link the generated C code against a compatible version of the tree-sitter library -- version 0.16.x or later.
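The version requirement can be checked in a build script before linking. The following is a minimal sketch, not something shipped with this repository: the helper name version_at_least is hypothetical, and it assumes plain dotted numeric versions such as 0.16.2. How you obtain the version of your installed tree-sitter library (package manager, vendored source, etc.) is left to you.

```shell
# Sketch only: succeed (exit status 0) when version $1 is >= version $2.
# Assumes purely numeric, dot-separated versions (e.g. "0.16.2").
# Missing components are treated as 0, so "0.16" equals "0.16.0".
version_at_least() {
    val_have=$1
    val_want=$2
    for _field in major minor patch; do
        # Take the leading component of each version.
        val_h=${val_have%%.*}
        val_w=${val_want%%.*}
        if [ "$val_h" -gt "$val_w" ]; then return 0; fi
        if [ "$val_h" -lt "$val_w" ]; then return 1; fi
        # Drop the leading component; fall back to 0 when none remain.
        case $val_have in *.*) val_have=${val_have#*.} ;; *) val_have=0 ;; esac
        case $val_want in *.*) val_want=${val_want#*.} ;; *) val_want=0 ;; esac
    done
    return 0
}

# Hypothetical usage: LIB_VERSION is whatever version your library reports.
LIB_VERSION="0.16.2"
if version_at_least "$LIB_VERSION" "0.16.0"; then
    echo "tree-sitter library version looks compatible"
else
    echo "tree-sitter library is older than 0.16.x" >&2
fi
```

This only compares version strings; it does not prove that a given library build is ABI-compatible with the generated parser.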
TBD.
TBD.
I use Nix for both continuous integration and local development, so install
Nix, if you wish, on your favorite Linux distribution (any distribution will
do). Then run nix-build a lot, or nix-shell and hack iteratively.
Alternatively, you can install tree-sitter yourself and do typical
tree-sitter generate && tree-sitter test development, but Nix does all that
for you and a lot more (provisioning nodejs, etc). It's your choice.
NOTE: The nix-based build here ONLY works on x86_64 Linux, but this is only a technical restriction, due to the usage of a static Linux binary for tree-sitter. This could be lifted in the future for macOS and aarch64 Linux. In the meantime, macOS users cannot use Nix, and must use tree-sitter directly. They must also install nodejs. I strongly suggest that Windows users use a tool like WSL2 in order to do grammar development. Like any Linux distro, WSL2 Linux distributions can use nix, or tree-sitter and nodejs directly, as macOS users do.
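The platform rules above can be summarized as a small shell helper. This is a sketch under stated assumptions, not a script from this repository: the function name dev_command is hypothetical, and it only prints the suggested command rather than running it.

```shell
# Sketch only: print the development command that fits this machine.
# - Nix workflow: only on x86_64 Linux, and only if nix-build is installed.
# - Otherwise: the plain tree-sitter workflow, if tree-sitter is installed.
dev_command() {
    dev_platform="$(uname -s) $(uname -m)"
    if [ "$dev_platform" = "Linux x86_64" ] && command -v nix-build >/dev/null 2>&1; then
        echo "nix-build"
    elif command -v tree-sitter >/dev/null 2>&1; then
        # nodejs must also be installed for this path; that is not checked here.
        echo "tree-sitter generate && tree-sitter test"
    else
        echo "no usable toolchain: install Nix (x86_64 Linux) or tree-sitter and nodejs" >&2
        return 1
    fi
}
```

Calling dev_command prints one line to standard output when a workflow is available, and returns a non-zero status with a message on standard error when neither toolchain is found.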
A useful guide to keep open in your browser is tree-sitter's documentation
on how to create parsers.
TBD: get GH Actions deploying things, and describe it here.
See AUTHORS.txt for the list of contributors to the project.
MIT, like most tree-sitter grammars. See LICENSE.txt for precise terms of
copyright and redistribution.