How does it work?

qwertie opened this issue 5 years ago

I'm curious about lots of things about this project, but a few jump out right away.

1. The video says I can use several input languages, but how? I don't see a way to choose anything but TypeScript for the top-left pane, and editing any other pane does not propagate changes to the others.
2. Does the Generator tab contain a complete description of how your internal representation is converted to text, or are there additional knobs we can't see?
3. The Generator tab appears to contain code in some kind of programming language, in curly braces. What language is this?
4. How does the reverse conversion (from text to the internal representation) work? I don't expect you can magically use the stuff in the "Generator tab" in reverse? And yet, somehow, you appear to have single-handedly written five "reverse generators"! How much time did that take?
5. Are you aware of the Loyc project?

As the inventor of LES I'm a bit biased, but I can't help but think that this project could benefit from using it. For example, compared to JSON or plain text, LES would provide a much more compact and intuitive way to express your ASTs.

Perhaps we could collaborate... for a brief time period starting right now, I have a lot of involuntary free time on my hands. Your primary language seems to be TypeScript, and it would be an interesting exercise to try to port my C#-based LES parser and printer to TypeScript... maybe using OneLang itself to help me do it! What do you think, if I did that, would you be interested in using LES somewhere in OneLang?

Then there's my Enhanced C# parser... if I could port that to TypeScript, you'd have access to a high-quality C# parser (though it doesn't quite support C# 7 yet.) Since the EC# and LES parsers were generated by the same parser generator (which I also wrote), if I can manage to port the LES parser to TypeScript, the EC# parser should be a pretty easy follow-up.

Tamás Koczka · Answer 1 · Sun Mar 10 2019 20:15:40 GMT+0800 (China Standard Time)

Hey there! :)

1. If you select the "Cross-language editing demo" mode, it enables editing in TS, C#, PHP and Ruby. I only implemented these 4 parsers, and they are probably pretty limited and buggy. TS was indeed always the main language.
2. Generally speaking, yes, it's a complete description. Here and there I have to implement some feature in the Core because of one specific language (like detecting which variables are constant / mutable for Swift), but I try to implement it in a generic way, so the Generator can use the isMutable property later. This was one of my main design goals: the Generator should be a template-based language.
   Note: this is just an experiment, like the whole project.
3. The file itself is YAML; the curly brace stuff is a custom template language called OneTemplate. I tried to reuse existing template languages, but they were not precise enough, mostly in whitespace handling. Although my current solution is still not perfect and needs at least one revision, it allows the Generator code to be somewhat more readable than other solutions. By the way, here are a few test cases which show what the language knows and how it works in different situations.
4. Indeed, it does not use the Generator tab. I wrote the "reverse generators" by hand and I call them Parsers. The TypeScript parser was the first; it implemented a lot of common components ("tokenizer", comment, line manager, etc.) and was rewritten multiple times. I think it took somewhere between 20 and 40 hours. Writing the other parsers (C#, Ruby, PHP) took 4-8 hours per language, I think. But it's important to note that these are nowhere near complete or compliant parsers. I call them e.g. 'C#-like frontend for One', so they won't be able to parse existing C# programs yet. For serious purposes I plan to wire real parsers (e.g. Roslyn) into One. But one day I want to 'rewrite' my parsers in One, so you will be able to "parse any language from any language".
5. Until now, no. :)

I will think about your other suggestions, but I have to go through the docs and posts you linked first.

The plaintext AST is mostly used for debugging purposes (the JSON one is not really used), and for that it's better if it is less compact and shows more information, like the type of the internal representation, the variable's inferred type, etc. It also helps that one node is usually one line, so I can diff before and after states: after I modify something in the compiler and regenerate all the tests, I see in Git in an instant whether some representation changed, and I can decide if it is a bug or something I indeed wanted.

For example here I can see that the compiler indeed understood that the call returns a OneJProperty which has a getName method, so it becomes a MethodReference instead of a generic PropertyAccess.
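To illustrate the "one node per line" idea, here is a minimal sketch of such a dump. The AstNode shape and the dump function are hypothetical names made up for this example, not OneLang's actual classes; the output format is modeled on the AST excerpt quoted later in this thread.

```typescript
// Hypothetical node shape for illustration; not OneLang's real AST types.
interface AstNode {
  kind: string;        // e.g. "Variable", "Binary"
  label: string;       // name, operator, or literal text
  type: string;        // inferred type, e.g. "OneNumber" or "any"
  children: AstNode[];
}

// Emit exactly one line per node, indenting two spaces per nesting level.
function dump(node: AstNode, depth = 0, out: string[] = []): string[] {
  out.push(`${"  ".repeat(depth)}- ${node.kind}: ${node.label} [${node.type}]`);
  for (const child of node.children) dump(child, depth + 1, out);
  return out;
}

const tree: AstNode = {
  kind: "Variable", label: "x", type: "OneNumber", children: [
    { kind: "Binary", label: "+", type: "any", children: [
      { kind: "Identifier", label: "y", type: "any", children: [] },
      { kind: "Literal (numeric)", label: '"7"', type: "OneNumber", children: [] },
    ] },
  ],
};

const before = dump(tree);

// Simulate a compiler change that improves type inference on the "+" node.
tree.children[0].type = "OneNumber";
const after = dump(tree);

// Only the line of the affected node differs, which keeps the Git diff small.
const changed = after.filter((line, i) => line !== before[i]);
console.log(changed); // [ '  - Binary: + [OneNumber]' ]
```

Because each node owns exactly one line, a change in one node's inferred type shows up as a single changed line in the regenerated test output.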

So, I will look into your projects later, but I can't promise a quick answer; I am pretty dormant nowadays regarding OneLang...

David Piepgrass · Answer 2 · Mon Mar 11 2019 19:33:55 GMT+0800 (China Standard Time)

Thanks for your answer. I would say that it never hurts to have a more compact representation... for example, consider var x: number = y + 7;, represented as

- Variable: x [OneNumber]
  - Binary: + [any]
    - Identifier: y [any]
    - Literal (numeric): "7" [OneNumber]

In LESv3, the same information could be expressed with something like

.var (@#is(OneNumber) x) = (@#is(any) (@#is(any) y) + (@#is(OneNumber) 7))

The attribute notation for types is a bit clunky here, admittedly, because the low precedence of the attribute marker @ requires a lot of parentheses. But how about...

.var x is. OneNumber = y is. any + 7 is. OneNumber is+ any

[edited] In this example I've introduced the "is." operator and the "is+" operator. These are known as "combo operators" (a combination of text and punctuation). Because of the dot, the precedence of "is." is the same as that of the dot operator, and similarly "is+" has the precedence of "+". So we can mentally parse the expression as follows:

.var ( (x is. OneNumber) = ((y is. any) + (7 is. OneNumber)) is+ any )
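The grouping above can be reproduced with a small precedence-climbing sketch. This is not the LES parser: it assumes that a combo operator simply takes the precedence of its trailing punctuation character, as described above, and the numeric levels, the right-associativity of "=", and the names PRECEDENCE, precedenceOf and parse are illustrative assumptions. The leading .var keyword is left out, and tokens are assumed to be separated by whitespace.

```typescript
// Illustrative binding strengths per punctuation character (higher binds tighter).
// These numbers are assumptions for the sketch, not LES's real precedence table.
const PRECEDENCE: Record<string, number> = { "=": 1, "+": 2, ".": 3 };

// A token counts as an operator if it ends in a known punctuation character,
// so plain "+" and the combo operator "is+" get the same precedence.
function precedenceOf(token: string): number | undefined {
  return PRECEDENCE[token[token.length - 1]];
}

// Parse a whitespace-separated expression and return it fully parenthesized.
function parse(source: string): string {
  const tokens = source.trim().split(/\s+/);
  let pos = 0;

  function parseExpr(minPrec: number): string {
    let left = tokens[pos++]; // operand (identifier or literal)
    while (pos < tokens.length) {
      const op = tokens[pos];
      const prec = precedenceOf(op);
      if (prec === undefined || prec < minPrec) break;
      pos++;
      // "=" is treated as right-associative; everything else as left-associative.
      const right = parseExpr(op === "=" ? prec : prec + 1);
      left = `(${left} ${op} ${right})`;
    }
    return left;
  }

  return parseExpr(1);
}

const grouped = parse("x is. OneNumber = y is. any + 7 is. OneNumber is+ any");
console.log(grouped);
// ((x is. OneNumber) = (((y is. any) + (7 is. OneNumber)) is+ any))
```

The result has the same nesting as the hand-parenthesized expression, with the implicit groups written out in full.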