Suggestion: Decide on a proper grammar before implementing a fully fledged parser
tjgurwara99 opened this issue · comments
Very cool project!!
I was thinking about this yesterday, and I think that before writing a concrete parser there should be a formal specification (tokens/grammar) of what kind of operators it should handle; otherwise it will become technical debt and the parser will start to get messy.
Right off the bat, I think you should consider what operators should and shouldn't be legal in this shell. Once that is decided, then the grammatical (or syntactical) structure can be defined.
These are a few tokens that you may want to consider. Depending on your idea, you might want a simple shell, in which case most of them can be ignored. But if you're aiming for a fully fledged shell then they might be useful to think about beforehand.
const (
PIPE = iota
DOT
COMMA
COLON
SEMICOLON
LPAREN
RPAREN
LBRACK
RBRACK
LBRACE
RBRACE
EQUALS
PLUS
MINUS
MULT
DIV
MOD
NOT
AND
OR
XOR
LSHIFT
RSHIFT
LOR
LAND
LNOT
LT
GT
LE
GE
EQ
NE
INC
DEC
ASSIGN
ADD_ASSIGN
SUB_ASSIGN
MUL_ASSIGN
DIV_ASSIGN
MOD_ASSIGN
AND_ASSIGN
OR_ASSIGN
XOR_ASSIGN
IF_STMT
ELSE_STMT
FOR_STMT
WHILE_STMT
BREAK_STMT
CONTINUE_STMT
RETURN_STMT
FUNC_STMT
BLOCK_STMT
VAR_STMT
ESCAPE // for escaping special tokens (the ones defined here)
)
Once the grammar is fixed, you would need to work on writing a lexer for each of these tokens. Go is great when it comes to writing your own lexer! Anyways, if you need any help, let me know 😄 I'd love to help out with this one. It will be a very good learning experience for me 😄
Thank you for being interested in mash!
So first of all, the package name is wrong: it is not a parser. The thing that is in the parser package is actually a lexer. I noticed it but was too lazy to change it (typical me).
The current syntax the lexer assumes is very simple, so it does not require a parser. It just breaks the command into "words": the first word is the command and the rest are the arguments.
Anyways, the current implementation is bad. It works, but it is hard to add anything new. Since the whole lexer is hidden behind its exposed API, it should not be hard to change the implementation. The problem is that I am currently trying to decide on the syntax. There are many things, for example, that I would like to take from bash, but whose syntax I do not like. Once that is decided a lexer should be easy, and a parser design will also be created.
I also want to put as much of the functionality as possible into the builtin commands, so that it remains more of a shell than a programming language. I would of course want to make sure that it can do all that a regular programming language can, maybe a bit less.
Currently, I am trying to stay on commands only (I will look at other types of statements later). I have been using this Stackoverflow question as a guide, and am trying to turn its operators into a modern-looking language, while keeping them easily usable as operators. If you have a syntax in mind, please let me know.
Here is the notes file I have been scribbling my thoughts in:
mash syntax:
1. Commands with args
2. Redirection operators [both stdout and stderr]
3. Logical operators
Logical operators:
1. [command] && [command]
2. [command] || [command]
3. ![command]
Redirection operators:
1. [command] [i[o]e]> [stream]
2. [command] [i[o]e]>> [stream]
3. [command] [i[o]e]| [command]
4. [command] < [stream]
Streams:
Stdin: i
Stdout: o
Stderr: e
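As a sketch of how the `[i[o]e]>` style prefixes above could be interpreted, here is a small, hypothetical Go helper (the names `redirect` and `parseRedirect` are mine, not part of mash) that classifies which output streams a redirect token affects:

```go
// Hypothetical sketch: classify which streams a redirection operator
// like "oe>" or "e>>" affects, following the notes above.
package main

import (
	"fmt"
	"strings"
)

// redirect describes a parsed redirection operator.
type redirect struct {
	stdout, stderr bool // which output streams are redirected
	appendTo       bool // ">>" appends instead of truncating
}

// parseRedirect splits an operator such as "oe>>" into its stream
// prefix and mode. It returns false if the token is not a redirect.
func parseRedirect(op string) (redirect, bool) {
	var r redirect
	switch {
	case strings.HasSuffix(op, ">>"):
		r.appendTo = true
		op = strings.TrimSuffix(op, ">>")
	case strings.HasSuffix(op, ">"):
		op = strings.TrimSuffix(op, ">")
	default:
		return r, false
	}
	if op == "" { // bare ">" defaults to stdout, like sh
		r.stdout = true
		return r, true
	}
	for _, c := range op {
		switch c {
		case 'o':
			r.stdout = true
		case 'e':
			r.stderr = true
		default:
			return redirect{}, false
		}
	}
	return r, true
}

func main() {
	fmt.Println(parseRedirect("oe>"))  // {true true false} true
	fmt.Println(parseRedirect("e>>"))  // {false true true} true
	fmt.Println(parseRedirect("echo")) // {false false false} false
}
```

Note the sketch deliberately leaves out `i` (stdin), since redirecting input is handled by the separate `<` form.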
Ah cool, the operations you pointed out are, I think, enough to start with.
mash syntax:
1. Commands with args
2. Redirection operators [both stdout and stderr]
3. Logical operators
Logical operators:
1. [command] && [command]
2. [command] || [command]
3. ![command]
Redirection operators:
1. [command] [i[o]e]> [stream]
2. [command] [i[o]e]>> [stream]
3. [command] [i[o]e]| [command]
4. [command] < [stream]
Streams:
Stdin: i
Stdout: o
Stderr: e
If you'd like, I can start working on the lexer - see whether you like this kind of approach, because it's quite a bit different from other approaches (eg Lex - the tool - and regular expressions). Once the lexer is done, we can work on the parser 😄. The benefit of my approach is that you can do lexing and parsing concurrently. I will push something in the next few days if you're okay with me working on it?
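To illustrate the "lexing and parsing concurrently" idea mentioned above, here is a minimal sketch (not mash's real lexer; the `token` type and `lex` function are illustrative) where the lexer runs in its own goroutine and streams tokens over a channel, so a parser could consume tokens before lexing finishes:

```go
// Minimal sketch of a concurrent lexer: tokens are produced in a
// goroutine and streamed over a channel to the consumer.
package main

import (
	"fmt"
	"strings"
)

type token struct {
	kind string // "WORD" or "EOF"
	text string
}

// lex splits input on white-space and emits one token at a time,
// so the consumer can start parsing before lexing is complete.
func lex(input string) <-chan token {
	out := make(chan token)
	go func() {
		defer close(out)
		for _, w := range strings.Fields(input) {
			out <- token{"WORD", w}
		}
		out <- token{"EOF", ""}
	}()
	return out
}

func main() {
	for tok := range lex("echo hello world") {
		fmt.Printf("%s %q\n", tok.kind, tok.text)
	}
}
```

This is the channel-based style popularized by Go's `text/template` lexer; a real implementation would carry positions and more token kinds.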
You can create a new branch on your fork and then open a pr here. It is open source for a reason :)
Here is a grammar specification. I have described it in simple english:
A command is composed of words, single quoted strings, and double quoted strings, with white-space separating them (not required if two different types are side by side). The strings will be parsed by strconv.Unquote(). Normal words will be kept raw. Any white-space which is not inside a string is discarded and acts as a separator.
Operators are defined in the above comment by me.
If you need any help, feel free to ask me (though I think you won't).
@tjgurwara99 Also, what is your opinion on the operator syntax I have created? Any suggestions would be greatly appreciated.
Everything seems to be fine - nothing too drastic IMO, but I think there is an issue with the redirection operators. The redirect operators `>` and `<` are used to write to a file or stream (any `io.Writer`), but a pipe command like `command_1 | command_2` is essentially `command_1 > tmp_file && command_2 < tmp_file`, so I don't think it makes sense to have `[i[o]e]` with pipe. Unless you were thinking of it in some different way, in which case could you elaborate as to what your thinking for pipe was?
The `i` should not be there, that is true. But I thought they should be able to configure whether they want to pipe `Stdout`, `Stderr`, or both.
Is there a particular use case that you have in mind. I can't think of why you would need to pipe stderr?
Not to say that this would be difficult, but rather a question of design choice. I think it is worth considering what things are just syntactic sugar and what is an actual feature. Because we can easily work on the syntactic sugar later, but the features need to be robust so that there is no technical debt and sugar can be added without issues.
Yes, we can work on it later. The reason to pipe `Stderr` is simply to maintain regularity with the other operators, and just because we can. There may be specific areas where it can be useful, but really it is more due to the above reasons.
A question came to me today about this. Do we need the lexer to be public at all? We can do the whole lexing and parsing in the same package and the lexer doesn't even have to be public. What do you think?
I don't think the lexer needs to be public. You can combine them into the same package.
Suggestion for the Lexer
The Lexer will return the following types of tokens:
WORD
DOUBLE_QUOTED
SINGLE_QUOTED
EOF
The parser will extract relevant information from them depending on the context.
Lexing details
- All tokens will be emitted at runs of white-space, except inside strings.
- A semicolon will be emitted at each newline whose preceding token was a string.
- A semicolon will be inserted if the preceding word is a valid identifier.
- A semicolon will be inserted if the preceding word is one of `)`, `}`, or `]`.
- An `EOF` token will be emitted at the end of the file.
Definitions
string: a token of type `SINGLE_QUOTED` or `DOUBLE_QUOTED`
word: a token of type `WORD`
identifier: a word which matches `[_a-zA-Z][_a-zA-Z0-9]*`
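The newline rules above can be sketched as a small predicate; this is my illustrative code (the function name `needsSemicolon` is not from mash), assuming `)`, `}`, and `]` arrive as `WORD` tokens:

```go
// Sketch of the semicolon-insertion rules: insert a semicolon at a
// newline when the line's final token is a string, an identifier,
// or one of ")", "}", "]".
package main

import (
	"fmt"
	"regexp"
)

// isIdent matches the identifier definition given in the spec.
var isIdent = regexp.MustCompile(`^[_a-zA-Z][_a-zA-Z0-9]*$`)

// needsSemicolon reports whether a semicolon should be inserted
// after the given final token of a line.
func needsSemicolon(kind, text string) bool {
	switch kind {
	case "SINGLE_QUOTED", "DOUBLE_QUOTED":
		return true
	case "WORD":
		return isIdent.MatchString(text) ||
			text == ")" || text == "}" || text == "]"
	}
	return false
}

func main() {
	fmt.Println(needsSemicolon("WORD", "echo")) // true: identifier
	fmt.Println(needsSemicolon("WORD", ")"))    // true: closer
	fmt.Println(needsSemicolon("WORD", "&&"))   // false
}
```

This mirrors the approach described in the Go spec's semicolon rules, just with a smaller token set.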
Here is the current parser grammar:
(* the entire mash script *)
program = { statement } EOF ;
(* a statement is an effector followed by a semicolon *)
statement = ( single | block ) ";" ;
(* a block statement is a list of statements *)
block = "{" { statement } "}" ;
(* a single is a non block statement *)
single = condition | loop | command ;
(* condition statement is the if-elif-else statements *)
condition = "if" expression block { "elif" expression block } [ "else" block ] ;
(* loop is the for loop in c-type languages *)
loop = "for" expression block ;
(* command is a builtin or executable *)
command = string { string } ;
string = WORD | SINGLE_QUOTED | DOUBLE_QUOTED ;
If you are not familiar with Extended Backus-Naur form, I would suggest the Table of Symbols in this Wikipedia page.
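The EBNF above maps naturally onto a recursive-descent parser, with one function per production. Here is a hedged skeleton in Go covering just blocks and commands (conditions and loops elided for brevity); the token and AST representations are illustrative, not mash's real ones:

```go
// Recursive-descent skeleton mirroring the EBNF above, for
// pre-lexed string tokens where ";" terminates statements.
package main

import "fmt"

type parser struct {
	toks []string
	pos  int
}

func (p *parser) peek() string {
	if p.pos < len(p.toks) {
		return p.toks[p.pos]
	}
	return "EOF"
}

func (p *parser) next() string { t := p.peek(); p.pos++; return t }

// program = { statement } EOF ;
func (p *parser) program() []any {
	var stmts []any
	for p.peek() != "EOF" {
		stmts = append(stmts, p.statement())
	}
	return stmts
}

// statement = ( single | block ) ";" ;
func (p *parser) statement() any {
	var s any
	if p.peek() == "{" {
		s = p.block()
	} else if p.peek() != ";" {
		s = p.command()
	}
	if p.next() != ";" {
		panic("expected ';'")
	}
	return s
}

// block = "{" { statement } "}" ;
func (p *parser) block() any {
	p.next() // consume "{"
	var stmts []any
	for p.peek() != "}" {
		stmts = append(stmts, p.statement())
	}
	p.next() // consume "}"
	return stmts
}

// command = string { string } ;
func (p *parser) command() any {
	words := []string{p.next()}
	for t := p.peek(); t != ";" && t != "EOF" && t != "}"; t = p.peek() {
		words = append(words, p.next())
	}
	return words
}

func main() {
	p := &parser{toks: []string{"echo", "hello", ";", "{", "ls", ";", "}", ";"}}
	fmt.Println(p.program()) // [[echo hello] [[ls]]]
}
```

A real parser would return typed AST nodes and proper errors instead of `any` and `panic`, but the shape of the functions follows the grammar one-to-one.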
Regarding the parser's grammar, I think you also need to consider the order of pipes and redirects.
PS: I'm familiar with EBN form
I am also updating the gists in which I previously wrote the grammar to use this one.
Regarding the parser's grammar, I think you also need to consider the order of pipes and redirects.
PS: I'm familiar with EBN form
Yes, the productions for the remaining grammar and for expressions still need to be added. I just wanted to make sure we are on the same page on the operators. I posted this once before, but these were the suggested operators for commands:
mash syntax:
1. Commands with args
2. Redirection operators [both stdout and stderr]
3. Logical operators
Logical operators:
1. [command] && [command]
2. [command] || [command]
3. ![command]
Redirection operators:
1. [command] [b | e]>[c] [stream]
2. [command] [b | e]>>[c] [stream]
3. [command] [b | e]| [command]
4. [command] <[c] [stream]
Streams:
Stdin: i
Stdout: o
Stderr: e
Any suggestions to improve it?
I think the operators are good; they seem to be different but not difficult to work with. The only thing I was concerned about was to have it in Backus-Naur form. For example, bash currently has this BN form (if I remember correctly):
cmd [arg]* [ | cmd [arg]* ]* [ [> filename] [< filename] [ >& filename] [>> filename] [>>& filename] ]* [&]
If you can think of the order of operations and priority that would be infinitely more useful than having a proper grammar - because grammar can be extended later based on the order and priority of operations (think BIDMAS in maths). Does that make sense?
If you can think of the order of operations and priority that would be infinitely more useful than having a proper grammar - because grammar can be extended later based on the order and priority of operations (think BIDMAS in maths). Does that make sense?
I am not sure what you tried to say there, could you elaborate?
Sure, I'll try to construct an example. But I'm at work right now, so I will have to explain it to you in the evening (in about 8 hours) 😄
In the mean time, could you try to write the grammar in Backus-Naur form (not extended BN form)?
I am actually much more familiar with EBNF, and also know how to convert each operator as code. If it is better to use BNF, I will try to learn it though.
The benefit of BNF is that it forces you to think about the order in which operations are legal. The EBNF is equivalent (in fact better for certain complex languages) but I think your EBNF is not complete.
As mentioned earlier, the BNF of shell is something like this
cmd [arg]* [ | cmd [arg]* ]* [ [> filename] [< filename] [ >& filename] [>> filename] [>>& filename] ]* [&]
The `[...]*` means zero or more can be present and `[...]` means zero or one. So a simple command will be something like
You can note from this that:
- `| filename` is illegal.
- `cmd | cmd2` is legal.
- The `&` at the end means that the commands on the left and right would be run asynchronously, and it can be present without a right hand side command.
- `a < b < c` is illegal - you can't have two inputs on the same line. In your current syntax on the linked gist it's possible to have more than two inputs I think, but should it really be?
- `a <` is also illegal - you can't have an input redirection with no file.
- It also forces the whitespace characters around the redirections (although that's my personal taste and not the language spec I think)
etc etc.
As you can see, this makes it clear what the precise nature of the language is. Indeed, you can choose to leave things ambiguous, but if you do that and write a specification, then the next person who implements your language might produce something that does not behave the same in all conditions but is essentially the same language. Python comes to mind, as their specification is a bit loose in certain areas and different variants of Python (Jython, PyPy, Python etc) behave a bit differently (although the basics are the same in all variants since they were unambiguous).
So I wanted you to consider writing these down precisely so that we can write proper tests in order to completely nail the basic language and extend from there.
The final grammar is of course going to be unambiguous. Before writing the grammar for an expression, we need to decide on what operators to include in expressions. Notice the distinction between the command and a normal expression or statement. That is one of the design choices I think should work, but I would like your perspective on it.
Currently, I am thinking of treating statements which do not start with a keyword as commands. So:
if true {
    # do stuff
}
echo "Hello, World!" # command
This does create some problems, which I would like your help on.
- Automatic semicolon insertion does not work, because the last token may be a non-identifier word.
- Is treating commands as their own separate entity a good idea?
Also, here are the suggested expression operators.
+     sum                    integers, floats, strings
-     difference             integers, floats
*     product                integers, floats
/     quotient               integers, floats
%     remainder              integers
&     bitwise AND            integers
|     bitwise OR             integers
^     bitwise XOR            integers
&^    bit clear (AND NOT)    integers
<<    left shift             integer << integer >= 0
>>    right shift            integer >> integer >= 0
==    equal
!=    not equal
<     less
<=    less or equal
>     greater
>=    greater or equal
&&    conditional AND
||    conditional OR
!     NOT
I am using Go's operator set, as it is smaller than most others but provides much better functionality.
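Since the operator set is borrowed from Go, its precedence levels can be borrowed too. A sketch of a precedence function (the kind a Pratt or precedence-climbing expression parser would consult), using Go's own five binary levels:

```go
// Go-style precedence levels for the operator set above.
// Higher numbers bind tighter; 0 means "not a binary operator".
package main

import "fmt"

func precedence(op string) int {
	switch op {
	case "||":
		return 1
	case "&&":
		return 2
	case "==", "!=", "<", "<=", ">", ">=":
		return 3
	case "+", "-", "|", "^":
		return 4
	case "*", "/", "%", "<<", ">>", "&", "&^":
		return 5
	}
	return 0
}

func main() {
	// "*" binds tighter than "+", so a + b * c parses as a + (b * c).
	fmt.Println(precedence("*") > precedence("+")) // true
	fmt.Println(precedence("||"))                  // 1
}
```

Unary `!` is not in the table because prefix operators are handled separately from binary precedence.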
Also, another little implementation detail. I am thinking of parsing to an AST, which we will traverse to form bytecode. How does that sound? Do you have another suggestion for code execution?
@tjgurwara99 I have updated the formal grammar spec with more details. Please see and review it.
https://gist.github.com/raklaptudirm/9aa25462cbb434906a340d047184a23e
- Automatic semicolon insertion does not work, because the last token may be a non-identifier word.
Can you give an example of a non-ident where this doesn't work? If you look at the spec for Go, under the section on lexical elements there should be a subsection on semicolons which explains their considerations - I think the same principle would apply here as well.
- Is treating commands as their own separate entity is a good idea?
It is a design consideration. I don't think there is a problem with that; it might need some more thinking when it comes to constructing the AST though.
The second thing that I think you are confusing yourself with is that you consider a statement not starting with a keyword to be a command, but what do you mean by keyword? The grammar doesn't explain that, right? Also, if it is what I think it is, then `if` is a keyword, but then `echo` is also a built-in command (which would be a keyword)?
Anyways, to not go down the road of constant improvisation on every new feature, I recommend constructing a simple baseline program in your newly confirmed syntax (a program that does very basic things and you want your language to execute this). This way, we would have a baseline to work towards and build on top of - does that make sense?
I like the operators that you have chosen, but then what happens when you use these with commands; remember, we always work with the assumption that the user (me) is stupid 😄.
Also, another little implementation detail. I am thinking of parsing to an AST, which we will traverse to form bytecode. How does that sound? Do you have another suggestion for code execution?
IMO, there is no need to do byte-code if you're going to convert it to an AST to begin with. Just parse it to an AST from the lexer directly - I don't think that is difficult. Unless I misunderstood you here - I don't know what you mean by "traverse from bytecode" - I'm assuming from byte-code to AST, am I right in my assumption?
I left some comments on your gist about the grammar, let me know if anything doesn't make sense there.
Also, I recommend looking at some programming language lexers and tokens. Here's the source for Go's lexer and parser.
Also, in the meantime, until we have the grammar confirmed, if you'd like to sharpen your language writing skills, I can help you write the Monkey language interpreter with an AST... if you think that would help and not waste time. Let me know 😄
Can you give an example of a non-ident where this doesn't work?
I think I have actually got it figured out: if the current line is a command, automatically insert a semicolon at the end of the line if the last token is a string.
I think you are confusing yourself in that you consider a statement not starting with a keyword to be a command, but what do you mean by keyword?
If you look at the grammar for the statement, it has a `command` terminal, followed by others (like `condition`, `expression`, etc). All the terminals other than `command` start with a certain keyword, if you check the grammar. That is what I mean when I say that a statement starts with a keyword. Hope I cleared that up.
IMO, there is no need to do byte-code if you're going to convert it to AST to begin with. Just parse it to AST from lexer directly - I don't think that is difficult. Unless I misunderstood you here
You did misunderstand me. I meant that we first parse the tokens to form the AST, and then go through the AST (traverse it) to create bytecode, which can then be run by the virtual machine.
Also, in the meantime, until we have the grammar confirmed, if you'd like to sharpen your language writing skills, I can help you write the Monkey language interpreter with an AST... if you think that would help and not waste time. Let me know 😄
We surely can. Tell me what to do and we will start.
I looked at it and the language is surprisingly close to the syntax of mash, which just makes it a better exercise.
Here's what I did yesterday on Monkey - we can collaborate there and work with that as a simple language to work with and go from there.
There's a book that I read a few years back (I think 2017-18) called "Writing an Interpreter in Go" - it develops Monkey and I followed that for my first implementation of Monkey because I wanted to write my own language once. At the time, I followed the book but now I have more experience with this so I'm writing things in a bit of a different way. You can refer to the book as a basis for my implementation on the repo and we can work there.
@tjgurwara99 What is your opinion on this:
There will be no function statements. Creating a function will be like creating a variable and assigning a function expression to it. Also, there will be no argument list. Every function body will have an `args` array, which will contain all the arguments provided to the current running instance of the function.
let function := func {
    # function body
}
I think it's alright 😄 I'm all for first class functions and it keeps things simple.
I'm not sure about the no-arguments part of it though... I personally don't like it (as it's not explicit what is happening, or reader friendly) but up to you...
On a sidenote, I'm not sure if you want to have a distinction between `:=` and `=`.
What I wanted to do is require variable declarations. So `:=` is for declaring a variable while `=` is for assigning to it.
What I wanted to do is require variable declarations. So `:=` is for declaring a variable while `=` is for assigning to it.
That's all good, but then do you need `let` in the assign statement too? If so, we shouldn't have `let` in declarations either. Does that make sense?
I understand what you are trying to say, but non-command statements need to start with a keyword. You can think of `let` as saying that there will be some sort of declaration after it.
Otherwise, if expressions start with an identifier, we will be wondering whether it is a command or an identifier:
git ...
@tjgurwara99 How does the mechanism sound to you? Do you have another way we could do this?
I understand what you are trying to say, but non-command statements need to start with a keyword. You can think of `let` as saying that there will be some sort of declaration after it.
Hmm... Not sure what you mean here. Even if you look at the current `bash` or `zsh`, we have simple assigns right? For example, the `GO_PATH` environment variable and such. Setting it is as simple as `GO_PATH=/path/to/gopath/`, so as you can see this is not a command but rather within the language. I think the declaration can just as easily be the same...
I think you are viewing the program as command vs not-command, but personally I think you should consider it the other way around. If it's a program, the syntactical analysis will take care of it; if it's not a program, then it falls back to the simple shell (look for the command, and if the command exists, execute it). Does that make sense?
Yes, but that creates problems while lexing. We would be extracting keywords from what is actually an argument to a command. This mechanism makes it easy to lex the command properly.
And non-command statements should start with the keyword for that very reason.
Those are the reasons the `let` keyword is required, in my opinion. What do you think @tjgurwara99 ?
Yes, but that creates problems while lexing. We would be extracting keywords from what is actually an argument to a command. This mechanism makes it easy to lex the command properly.
And non-command statements should start with the keyword for that very reason.
The lexer's job is to lex tokens, that's all; the parser's job is to create the AST; and the evaluator's job is to execute the program AST (usually).
Once the parser has the AST, the way to recognise that it's language syntax is simply to think of it as a program first, and then if it can't be a program it must either be a command or invalid syntax.
Whether you go with command vs not-command or program vs not-program, you will have two different branches anyways.
Also, making it easier to work with your language is much more important than making it easier to write your language. Simple is better than complex, but simplicity is difficult to achieve, so even if it may be a bit tricky to do it the other way around, I think it's worth it in the long run. Anyways, this is also a design decision, so ultimately you're the one to go whichever way you think is best. Let me know what you decide 😄
Those are the reasons the `let` keyword is required, in my opinion. What do you think @tjgurwara99 ?
I think in that case `let` will be required even for assigns, and that is not ideal either 😄
How do you suggest we should do it?
How do you suggest we should do it?
My suggestion is to go the program-or-not-program way, and nail the small language first, and then work on the fallback to commands later, since that is well studied in terms of shells.
@tjgurwara99
Here is a list of issues which I think the `let` keyword will solve:
- Lexing commands: `echo ======` will be lexed as `echo, ======`, and not `echo, ==, ==, ==` (treating the `==` as the operator), as statements will be separated from commands. Otherwise, separation of the arguments of commands will become more complicated.
- Deciding whether a line is a command or a statement will be trivial, as statements will start with predefined keywords.
- It will prevent the problem where an expression with incorrect syntax is evaluated as a command, which can be really hard to diagnose. For example, `i = a * b c` is invalid (probably an operator is missing), but it would be treated like a command and the vm would try to run it, which can lead to undesirable circumstances.
Replacing the current `let` keyword will require an elegant solution to all of these problems. If you do have one, we can proceed with the newer approach.
Yeah now I see the problem, I didn't think about the first point... Nice catch!
Regarding your second point, I think triviality of writing the language is beside the point of making the language easier to work with - if you catch my point 😄
My problem with the `let` keyword is that I think we will use `let` with both `=` and `:=`, and therefore I don't think it's an elegant solution to this problem either.
I don't have a solution yet, unfortunately. Let's think this through; we can circle back to it after finishing parts of Monkey and you finish the lexer for this. It seems like you have thought through the direction of going with the command-or-not-command route anyways, so it's not like we're completely out of options, but I think it's worth considering better routes (if they exist - right now, I'm not sure my approach would be a better route).
My thinking on shells like `bash` or `zsh` is that essentially they are dumb by design, because they only need to work with binaries, and therefore binaries are "first class citizens" in the language of the shell. But the plan for mash is to be a bit more nuanced, so I don't know...
My problem with the `let` keyword is that I think we will use `let` with both `=` and `:=`, and therefore I don't think it's an elegant solution to this problem either.
If you change your viewpoint to what I intend the `let` keyword to mean, I think it is elegant. You should think of `let` as saying "I am going to assign something to this assignable", and how we are going to do it depends on the assignment operator. Like the `for` keyword in Go, where you know it is going to be some sort of loop, but to know what sort you have to look further.
Let me know your opinions @tjgurwara99 .
But then what is the difference between `let something = 1` and `let something_else := 2`? In the AST they would both be the same thing then. Plus, let's say that you have
let something = 1
...
something=2
What happens then? At first glance there doesn't seem to be any problem, but let's say that we have a binary (command) called `something`: what would happen to the variable `something`? Syntactically it would be fine, as this begins with not-command. Because of this we will come back to the complicated impasse again 😄 where we would have to distinguish the program syntax vs the command syntax. This is why I don't think that is an elegant solution either. Does that make sense? @raklaptudirm
something=2
That will be treated as a command, and it will also not do anything to the existing variable, which is a completely different thing. Commands and variables are not related. So having a `something` command won't change anything in the variable.
You've lost me there, I don't understand what you mean. How will that be treated as a command, and how will we be able to access the `something` variable then? Like, how do I access the `something` variable that we assigned with `let something = 2`?
First of all, anything that does not begin with a keyword will be treated as a command. The lexer will lex the first "word" and then use the lookup to see if it is an `IDENT` or any of the keywords. If it is a keyword, it will lex the rest of the line like normal lexers do. Otherwise, it will say that the line is a command, and will lex it accordingly (so no operators and keywords and stuff).
Now, when do you use a variable? In a statement. Inside statements (starting with a keyword) the language will act like you expect it to: identifiers are variables and so on, and you cannot run a command there. So there is no problem using a variable with the same name.
Now, it is possible you want to embed the data of a variable into a command. That is possible because we will add a feature for embedding expressions inside strings, and commands don't work inside expressions, so there will still be no ambiguity.
In other words, commands and variables are used in separate places, so there will be no ambiguity in what is which.
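The dispatch rule described above (lex the first word, look it up, and branch) can be sketched in a few lines; the keyword set and the `classify` helper here are illustrative guesses based on the grammar, not mash's actual implementation:

```go
// Sketch of the first-word dispatch: a line whose first word is a
// keyword is a statement; anything else is a command.
package main

import (
	"fmt"
	"strings"
)

var keywords = map[string]bool{
	"if": true, "elif": true, "else": true,
	"for": true, "let": true,
}

// classify reports whether a non-empty line is a "statement"
// or a "command", based only on its first word.
func classify(line string) string {
	first := strings.Fields(line)[0]
	if keywords[first] {
		return "statement"
	}
	return "command"
}

func main() {
	fmt.Println(classify("let x := 1"))     // statement
	fmt.Println(classify("echo something")) // command
	fmt.Println(classify("git status"))     // command
}
```

Note that only the first word is consulted, which is exactly why `echo ======` stays a command: the rest of the line is never scanned for operators.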
Hope I made it clear @tjgurwara99
I still think there is a lot of ambiguity here. I have a few cases that I am considering, and according to what you've written down, I think I understand what you meant, but I still don't think that is valid, because it still is not clear what happened to the variable (according to your EBNF). Consider the following:
In the current shell implementations we have something like this
The variables here are accessed with `$echo`; note that `echo` is a command, but the token `=` takes precedence. This is possible because the shell language is a language first, but falls back to commands when it fails to work it through.
Note another thing: we never really had to use a `let` keyword here, but it was fine when it came to assignment. So, the next part.
Now, the main question I had was: how are you going to use a variable after you've assigned a value to one? The way it's done in the current shells is simply prefixing with `$`; is that how you will be using them? If that is the case, then is there really a need for the `let` in the following?
let something := "hello"
This could just as easily be
$something := "hello"
but there is no benefit to having this, because
something="hello"
will do the exact same thing. So there is no difference between `:=` and `=`, which is why I'm a bit confused as to why you think the `:=` declaration and `=` are different to begin with.
To your point about commands and variables being used in different places, what happens here then:
let something = "hello"
echo something
Does it print "something" or does it print "hello"? With this question we get to the same point again: how are you accessing this variable?
Finally, the evaluator of shell in fact expands all the values before executing, for example:
So we have something much more dangerous to be considered, because essentially you can have a self-executing variable (a term I just invented lol - maybe there is a better term), which is what the Log4j fiasco is all about. For example, consider something like this:
let something = "rm -r /"
$something # please don't run this command even though shell would just consider the whole string to be
In the current implementations of shell it's not dangerous though, because the variable is one ident, but this should always be kept in mind I guess. Anyways, I digress.
@tjgurwara99 It seems like we have some misunderstanding regarding this.
- `:=` declares or creates the variable. The other assignment operators just assign to it.
- Any identifier in a command (which does not start with a keyword) is treated as a string. So:
let something := "hello"
echo something
will print `something`, not `hello`.
- Using variable values inside a command is done inside the string, with something like template literals. I am still not sure how exactly I want to implement it, but am thinking along the lines of:
echo {something}
echo "{something}, World!"
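The `{name}` embedding idea above could be prototyped as a simple substitution pass over the command string. This is a hypothetical sketch (the `expand` helper and its leave-unknown-names-alone behaviour are my assumptions; mash's real semantics are undecided, as the comment says):

```go
// Hypothetical sketch: expand {identifier} holes in a command
// string from a variable map, leaving unknown names untouched.
package main

import (
	"fmt"
	"regexp"
)

// hole matches {name} where name is an identifier per the spec.
var hole = regexp.MustCompile(`\{([_a-zA-Z][_a-zA-Z0-9]*)\}`)

// expand replaces each {name} with its value from vars.
func expand(s string, vars map[string]string) string {
	return hole.ReplaceAllStringFunc(s, func(m string) string {
		name := m[1 : len(m)-1] // strip the braces
		if v, ok := vars[name]; ok {
			return v
		}
		return m // unknown name: keep the literal text
	})
}

func main() {
	vars := map[string]string{"something": "Hello"}
	fmt.Println(expand("{something}, World!", vars)) // Hello, World!
	fmt.Println(expand("{unknown}", vars))           // {unknown}
}
```

A real implementation would likely run full expression evaluation inside the braces and handle escaping of literal `{`, but the substitution shape is the same.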
Lol I feel like we're going around in circles haha.
`:=` declares or creates the variable. The other assignment operators just assign to it.
Lol, I know what that means, but there are a few things that don't sit well with me. My precise concern is that in any shell we have the failsafe of being able to use a variable without having defined it. It never catches errors when something is not defined; it just sends back `""`, an empty string word. So my main question is: are you thinking that the `something` variable doesn't exist to begin with? In that case, I further understand what you meant, but another concern pops up in my head (sorry to be a pain about this 😄 - I have a tendency to understand requirements before I begin with something). Now that I understand the preliminary, what happens in this case:
let something := "echo"
something=hahaha
Will the above work in your language? If so, what is the difference between `let something = "hahaha"` and `something=hahaha`? Note that the first one is not a declaration but an assignment. Is the second syntax even going to be valid in your new language? (I don't think I would call that language a shell anymore haha.)
Another concern is: what happens when the variable has not been declared...
something=hahaha
or in the other case where the above syntax is invalid
let something = "hahaha"
Like, what happens when `something` is not declared but you use it anyways in the assign? Do you throw an error? Or do you assign it anyways? If you assign it anyways, there is no need for `:=` IMO. If you don't, then it's a different story, but then this deviates a lot from the shell specification, and I would say we should consider making something else entirely. Maybe something like a distinction between a command mode and a language mode, similar to how vim has a command mode and an insert mode - not something that I recommend though, since there would be a distinction between scripts and the language, and I firmly think there shouldn't be any such distinction...
Anyways, let me know 😄
Hey @raklaptudirm , I thought a lot about this, and I think the best approach for us currently is to look at what our program would look like in the actual language - both approaches, the command one and the keyword thing that you talked about.
Once you write it down, I'll try and create a pseudo AST and see if that makes sense. After that we can discuss the problems that arise...
After that we can discuss problems that arise after that...
I would like to leave some final comments regarding the points you raised in your previous comment @tjgurwara99
Like what happens when `something` is not declared but you use it anyways in the assign.
Assigning to a non declared variable is considered an error.
If you don't then, it's a different story but then this deviates a lot from shell specification and I would say we should consider making something else entirely.
In this shell, I am trying to take the positives from existing shell implementations, but not to let them prevent me from doing things my way.
Maybe something like a distinction between a command mode and language mode similar to what vim has a command mode and insert mode.
I don't see how that would work, as there is only one language, and the shell will just be a REPL for it.
The distinction between a statement (starting with a keyword) and a command (not starting with a keyword), along with the `let` statement for assignments, fixes all the above problems elegantly imo. Let me know what you think :)