lunacookies / eldiro

Learn to make your own programming language with Rust

Home Page: https://lunacookies.github.io/lang/

Question: Desired Idents

Zij-IT opened this issue

Hey!

First of all, thanks for the great read, and the wealth of information provided by your tutorial!

Okay, onto the "issue"!

TokenKind::Ident currently uses the regex [A-Za-z][A-Za-z0-9]* to match all idents. However, this misses the all-important _ character. I am not sure whether it is intended that idents cannot contain an underscore, as this was not discussed from what I can tell.

If idents should be able to have _, then the regex should be either [A-Za-z][A-Za-z0-9_]* or [A-Za-z_][A-Za-z0-9_]*, depending on whether you would like _ on its own to be a valid ident or not.
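To make the difference between the patterns concrete, here is a minimal sketch that emulates them by hand using only the standard library. It is not how Eldiro's lexer is implemented; `IdentRule` and `ident_len` are hypothetical names made up for this illustration, and the variant with `_` in the leading character class (`[A-Za-z_][A-Za-z0-9_]*`) is an assumption about what a pattern accepting a bare `_` would look like.

```rust
/// Which identifier pattern to emulate (illustrative names only).
#[derive(Clone, Copy)]
enum IdentRule {
    /// `[A-Za-z][A-Za-z0-9]*` (the current pattern)
    NoUnderscore,
    /// `[A-Za-z][A-Za-z0-9_]*` (underscore allowed after the first character)
    UnderscoreAfterFirst,
    /// `[A-Za-z_][A-Za-z0-9_]*` (assumed variant: underscore allowed anywhere)
    UnderscoreAnywhere,
}

/// Returns the length in bytes of the identifier at the start of `input`,
/// or 0 if `input` does not start with one under the given rule.
fn ident_len(input: &str, rule: IdentRule) -> usize {
    let is_start = |c: char| {
        c.is_ascii_alphabetic() || (c == '_' && matches!(rule, IdentRule::UnderscoreAnywhere))
    };
    let is_continue = |c: char| {
        c.is_ascii_alphanumeric() || (c == '_' && !matches!(rule, IdentRule::NoUnderscore))
    };

    let mut chars = input.chars();
    match chars.next() {
        // Every accepted character is ASCII, so the char count equals the byte count.
        Some(c) if is_start(c) => 1 + chars.take_while(|&c| is_continue(c)).count(),
        _ => 0,
    }
}

fn main() {
    let input = "long_name = 2";
    // The current pattern stops at the underscore, leaving "_name" behind.
    println!("{}", ident_len(input, IdentRule::NoUnderscore)); // prints 4
    // Allowing `_` after the first character consumes the whole name.
    println!("{}", ident_len(input, IdentRule::UnderscoreAfterFirst)); // prints 9
    // A bare `_` is only an identifier when `_` may also start one.
    println!("{}", ident_len("_", IdentRule::UnderscoreAfterFirst)); // prints 0
    println!("{}", ident_len("_", IdentRule::UnderscoreAnywhere)); // prints 1
}
```

With the current pattern the match ends after `long`, so the `_` and the remaining `name` are left for the lexer to deal with separately.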

What currently happens when an ident has an _:

 > let long_name = 2
Root@0..18
  VariableDef@0..14
    LetKw@0..3 "let"
    Whitespace@3..4 " "
    Ident@4..8 "long"
    Error@8..9
      Error@8..9 "_"
    VariableRef@9..14
      Ident@9..13 "name"
      Whitespace@13..14 " "
  Error@14..16
    Equals@14..15 "="
    Whitespace@15..16 " "
  Literal@16..18
    Number@16..17 "2"
    Whitespace@17..18 "\n"

lower(root) = (
    Database {
        exprs: Arena {
            len: 0,
            data: [],
        },
    },
    [
        VariableDef(
            "long",
            VariableRef(
                "name",
            ),
        ),
        Expr(
            Literal(
                Some(
                    2,
                ),
            ),
        ),
    ],
)

With the first suggested change:

 -> let long_name = 1
Root@0..18
  VariableDef@0..18
    Let@0..3 "let"
    Whitespace@3..4 " "
    Ident@4..13 "long_name"
    Whitespace@13..14 " "
    Equals@14..15 "="
    Whitespace@15..16 " "
    Literal@16..18
      Number@16..17 "1"
      Whitespace@17..18 "\n"

lower(root) = (
    Database {
        exprs: Arena {
            len: 0,
            data: [],
        },
    },
    [
        VariableDef(
            "long_name",
            Literal(
                Some(
                    1,
                ),
            ),
        ),
    ],
)


That’s a great point, thanks for the discovery. I definitely would prefer to allow identifiers to contain underscores, so that was an oversight when I was writing the regex. I’m really busy with schoolwork at the moment, and will likely continue to be until the end of the term. Currently the next part is halfway done, so I think I can finish it during the holidays. I’ll make sure to add an addendum about the regex.

I know you are probably busy and working on a new project, but your blog series is very, very useful. This is probably the only resource on the web on "how" you can use Rowan outside of a small toy project. It deserves way more views.

I looked at your new project (unnamed-language) and it seems a bit more evolved; I think you have learned a bit more since 2020 :) . Hope we can have some new blog posts about your recent discoveries.

@kMeillet Thank you for the compliments, it means a lot! :)

unnamed-language (name pending) is definitely more advanced than Eldiro ever was, but it’s my first ever try at some parts of making a language that go beyond the absolute basics (name resolution, type checking). I came up with the ad-hoc algorithms I used for these, which means there’s probably a ‘proper’ way of doing things that I’ve just naively ignored. I don’t want to write a blog post about areas that are totally new to me, both to avoid distributing incorrect information and to avoid having to make tons of corrections and edits later.

Although the main reason I stopped writing blog posts for the series was initially a lack of time and later a loss of interest, I think having to write a detailed explanation with code snippets for every single change became tiring. Rather than just being able to write code and improve Eldiro, I had to keep pausing to write, only to have to change it later once I would inevitably find problems with my initial implementation.

What do you think would be useful to include in a blog post about unnamed-language? Maybe an overview of the architecture? Then again, to me at least it’s a straightforward extension of the one used by Eldiro.

@arzg I'm totally impressed by your blog and would also love to see some new posts :)

From my perspective, an overview of the new architecture and what you have learned would be the most interesting. You also introduced hir_lower and hir_ty; a little more detail on these crates would also be cool.

I more or less need to develop a scripting language myself right now, because neither rhai nor rune fulfills my needs. I'm developing a business platform where customers can run their own business logic. I use rune today, but it has a lot of pitfalls...

Keep up the really cool work and please post again. From my personal perspective, your language may or may not end up being used somewhere, but your blog posts have taught a lot of people how to create a language and will most likely outlive it. 😄