stevedonovan / el

A more command-line friendly style of using Lua

A Better Scientific Calculator, Done Unscientifically

This all started because I became dissatisfied with console calculators, and wanted to move beyond keeping interactive Python open. My love for Lua made me type this a few too many times:

$ lua -e 'print(math.sin(2*math.pi))'
-2.4492935982947e-16

That print is irritating, as are the carefully namespaced globals. But I find the quotes particularly annoying.

So the first step: make printing implicit, put everything into the global namespace, and replace '()' with '[]' to avoid freaking out the shell.

$ el sin[2*pi]
-2.4492935982947e-16

That's definitely easier on the fingers! Making the fingers do as little work as possible for a given result starts making sense when your hands ache after a day of furious typing. (Unlike modern people, I don't think the solution is a new keyboard.)

There's another simplification: if the expression is a simple function call, we collect the argument expressions separated by spaces:

$ el sin 2*pi
-2.4492935982947e-16
$ el max 10 5 23
23
$ el time
1636196164
$ el date
Sat Nov  6 12:54:19 2021

It's refreshing to have finger-friendly access to the Lua standard library without too much bracketing and commafication.
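
Under the hood this just assembles an ordinary Lua chunk. A minimal sketch of that assembly, assuming a printx helper that prints a non-nil result (printx does appear in el's error output, but the real assembly is more involved than this):

-- hypothetical sketch: "sin 2*pi" becomes a printable Lua chunk
local function assemble(fname, argv)
  -- join the space-separated argument expressions with commas
  return ("printx(%s(%s))"):format(fname, table.concat(argv, ","))
end

print(assemble("sin", {"2*pi"}))          --> printx(sin(2*pi))
print(assemble("max", {"10","5","23"}))   --> printx(max(10,5,23))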

The argument expressions can be similarly written, if so desired. So we can party like it's 1963 with distinctly Lisp-ish syntax:

$ el max 20 {min 30 10}
20

There is a further rule: if the arguments are of the form VAR=EXPR, then they are treated as key-value pairs and the function receives a table:

$ el a=1 b=2
{b=2,a=1}
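
A rough sketch of how such arguments might be gathered; the helper name and the details are illustrative assumptions, not el's actual parser:

-- hypothetical sketch: gather VAR=EXPR arguments into a single table argument
local function gather(argv)
  local kv, n = {}, 0
  for _, a in ipairs(argv) do
    local k, v = a:match("^([%a_][%w_]*)=(.+)$")
    if k then
      kv[k] = v
      n = n + 1
    end
  end
  if n > 0 then return kv end
end

local t = gather{"a=1", "b=2"}
print(t.a, t.b)   --> 1	2  (as source text; the real el evaluates the EXPR parts)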

el defines a global set which writes key-value pairs to a little ~/.el file, which is then reloaded on each invocation.

$ el set tau=2*pi
$ el sin tau
1.3965925355373e-14

We would like our own functions to benefit from such convenient access, and Lua files can be registered with el using set.

-- normal.lua
local spi = math.sqrt(2*math.pi)
function normal(x,sig,mean)
  if sig == nil then
    -- called with a single key-value table, e.g. normal mean=10 sig=0.5 val=11
    mean = x.mean
    sig = x.sig
    x = x.val
  end
  return 1.0/(sig*spi) * math.exp(-0.5*((x - mean)/sig)^2)
end
$ el set add_file=^ex/normal.lua
$ el normal 11 0.5 10
0.10798193302638
$ el normal mean=10 sig=0.5 val=11
0.10798193302638

There is now a dofile("/home/steve/dev/el/ex/normal.lua") in ~/.el and so it is available globally. (Scary shit, but worse things live in '~')

The implicit table feature makes named arguments straightforward. The testing workflow is to edit the file and exercise the functions using el.

You can also bring in Lua modules:

$ el set add_mod=^lfs
$ el lfs.dir ^. do it
data.txt
basic.lua
collect.lua
test.sh
...

You can save aliases to functions within modules, but there is a little catch:

$ el set dir=^lfs.dir
$ el dir ^. do it
...

The expression has to be quoted, because after it is evaluated we have no idea where the function came from (and in fact, not even its name, without using the debug interface). So it must be a 'dotted expression', which we look up, confirm that it resolves, and handle specially.

Now, I have not forgotten that the original itch, before the explosion of curiosity, was to make a convenient technical calculator. We work with bits, so support for binary is useful & educational. Lua does not have binary literals, but we can make something similar happen at lookup time - check whether the name starts with 'b' and is followed by binary digits. Bits read from least significant upwards, as in the usual bit-numbering convention:

$ el b011
6
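
A sketch of how such a lookup might decode the name; reading the leftmost digit as bit 0 is what makes b011 come out as 6. This is an illustration, not el's implementation:

-- hypothetical sketch: decode an LSB-first binary 'literal' such as b011
local function bin_literal(name)
  local bits = name:match("^b([01]+)$")
  if not bits then return nil end
  -- reverse, because the leftmost digit is the least significant bit
  return tonumber(bits:reverse(), 2)
end

print(bin_literal("b011"))   --> 6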

There are three output conversion functions: json, hex and bin. The special sauce here is that the first word of a top-level expression can be a conversion function. In this way, the cool Lua 5.2+ bit32 bitwise functions can be explored:

$ el bin band b011 b010 
01

This convention saves us from having to say bin {band b011 b010} when applying an output conversion.

(These functions were retired in 5.4 in favour of operators, but el defines them if not found)
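
For the output side, a sketch of what an LSB-first bin conversion could look like (again an assumption, not necessarily el's own code):

-- hypothetical sketch: format a number as LSB-first binary digits
local function bin(n)
  local s = ""
  repeat
    s = s .. tostring(n % 2)
    n = math.floor(n / 2)
  until n == 0
  return s
end

print(bin(6))   --> 011
print(bin(2))   --> 01   (band of b011 and b010)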

Here we print out the bits of a byte:

$ el 0 7 do print it {extract b01011010 it}
0	0
1	1
2	0
3	1
4	1
5	0
6	1
7	0

At this point, el is sufficient for the original purpose, of doing calculations on the command line, with the extra coolness of saved variables and custom functions. But it's always fun to take an idea too far...

Quoting Strings

The standard UNIX shell is awkward with quoting strings. Merely single (or double) quoting is insufficient:

$ el date '%c'
error	[string "expr"]:1: unexpected symbol near '%'	date(%c)

Instead, you have to say '"%c"', which is awkward. So strings are prefixed with '^', which has no special meaning to the shell; only a few sigils are available that don't also interfere with Lua operators.

$ el date ^%c
Sat Nov  6 14:24:47 2021
$ el date ^'%F %T'
2021-11-06 14:25:28
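
A sketch of the single-word case; the multi-word case is messier because the shell strips the quotes (see below). The helper here is hypothetical:

-- hypothetical sketch: expand the ^ quote marker on a single word
local function expand_caret(arg)
  local word = arg:match("^%^(.+)$")
  if word then
    return ("%q"):format(word)   -- ^%c  ->  "%c"
  end
  return arg
end

print(expand_caret("^%c"))   --> "%c"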

Could we infer strings from context? It is impossible to do this perfectly, it is easy to create silent ambiguities, and a parser/evaluator that is 80% correct forces the user to second-guess it.

Because of shell quoting rules, there are limitations. You can say f[^'hello dolly',^fine], but those single quotes are only there to make the statement more readable - the shell strips them, so I do barbaric stuff like looking for ',', ']', and so on. When in doubt, say '"hello dolly"'. (This is definitely one of the least attractive parts of el notation.)

There is something equivalent to Perl's qw{...}. If the function name ends with '^', then all the arguments of that function are simply interpreted as quoted strings.

$ el list^ hello dolly fine
{"hello","dolly","fine"}

This makes collecting the results of a shell expansion possible:

$ el files={list^ *.lua }
{files={"1311-el.lua","basic.lua","bin.lua","collect.lua","el.lua","normal.lua"}}

Note the important space after *.lua for the shell expansion to work!

In the special case of subexprs, {^ one two} is the shortcut for {list^ one two}.

Also (borrowed from the convention set by curl and others) a file may be directly read in as a string using @filename. The special filename stdin means 'read standard input' (lf is a convenient global so we don't have to type the awkward ^'\n')

$ el split @data/text.txt lf
{"Normally text isn't as interesting","as a line from a poem,","or a sentence scrawled in lipstick","on the bathroom mirror"}

Hit and Miss: it and this

We can calculate now, but not repeat calculations. The keyword do separates the command-line into two expressions: the first is an iterator, the second an expression that consumes it. The first value passed by the iterator is it (as in Kotlin) and the second value is this. The value of the expression will be printed out if it is not nil. (Generally you should treat it as a reserved word, since weird shit will happen if you use it in other contexts.)

$ el seq 0 4 do it
0
1
2
3
4

The global seq works rather like the usual Linux command of that name, but it can do floating point!

$ el seq 0 2*pi 0.2 do print it sin[it]
0	0
0.2	0.19866933079506
0.4	0.38941834230865
0.6	0.56464247339504
0.8	0.71735609089952
1	0.8414709848079
1.2	0.93203908596723
....

The formatting is crap, though. This looks better:

$ el seq 0 2*pi 0.2 do format ^'%5.1f %6.2f' it sin[it]
  0.0   0.00
  0.2   0.20
  0.4   0.39
  0.6   0.56
  0.8   0.72
  1.0   0.84
...

There are some implicit functions, depending on the content. seq is the default function for the iterator, if the first argument is a number. And format is the default for the expression, if the first argument is a string.

$ el 0 2*pi 0.2 do ^'%5.1f %6.2f' it sin[it]
  0.0   0.00
  0.2   0.20
  0.4   0.39
  0.6   0.56
  0.8   0.72
  1.0   0.84
  ...

Another place it appears is when continuing a set of expressions using :. Personally I prefer the flow of data from left to right, rather than the right-to-left buildup of function application, so each expression is assigned to it in turn. The last expression's value is printed out as usual (unless it is nil). This is also good for interactive refinement by adding modifiers, like shell pipes.

$ el ^'hello dolly' : sub it 1 4 : upper it
HELL

this appears in the common case of an iterator returning two values:

$ el pairs n=20 msg=^hello do print it this
n	20
msg	hello

Another context-specific implicit happens with table constructors:

$ el json {10 20 30}
[10,20,30]
$ el json {a=1 b=1.5}
{"a":1,"b":1.5}

Here the implicit function is list.

These implicit functions kick in when the expression does not start with a known global function.

With this release, these colon-separated 'steps' can be used in more contexts. For instance, you can have steps that initialize the loop:

$ el let k=0 N=10 : 1 N do let k=k+it
{k=1}
{k=3}
{k=6}
{k=10}
{k=15}
{k=21}
{k=28}
{k=36}
{k=45}
{k=55}

Often we just want the final result, so it's now possible to put final expressions after the loop with the end keyword:

$ el let k=0 N=1000 : 1 N do let k=k+it end k
500500

Lambdas and User-Defined Functions

I've always wanted a shortcut form for function(a,b) return a+b end but most Lua people resist this idea, since (a) explicit is good (b) using new sigils would be a break from (mostly) keyword-driven syntax and (c) writing code is not as important as reading code. And these are all good points, if you are programming. But here, we are writing shell one-liners.

The el notation is a little different from the usual proposals, whether LHF's \a,b(a+b) lambdas or Rust-style |a,b| a+b closures:

$ el sort {5 2 3 6 1} {a,b: a '>' b} 
{6,5,3,2,1}

And the main reason is that the characters those forms use would fight with the shell, and we would have to quote again. (Even so, we did have to quote the 'greater than' here for exactly this reason.)
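
The {a,b: expr} form maps directly onto a Lua function - it is also the form these lambdas are saved in (see the ~/.el dump in the appendix). A sketch of the translation; the real parser handles nesting and more, this one does not:

-- hypothetical sketch: translate "{a,b: a > b}" into Lua function source
local function lambda_to_lua(s)
  local params, body = s:match("^%s*{%s*(.-)%s*:%s*(.-)%s*}%s*$")
  if not params then return nil end
  return ("function(%s) return %s end"):format(params, body)
end

print(lambda_to_lua("{a,b: a > b}"))
--> function(a,b) return a > b end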

You can save these functions and reuse them!

$ el set desc={a,b: a '>' b}
$ el sort {5 2 3 6 1} desc
{6,5,3,2,1}

Another Lua library function where lambdas are useful is string.gsub. The problem is that some log formats just use unadorned UNIX time-stamps, but we turn this into an opportunity to present the data exactly as we want:

$ el ^'%d,%s' 1636908445 ^'A log message' > log.log
$ cat log.log
1636908445,A log message
$ cat log.log | el gsub L ^'%d+' {t: date ^'%F %T' t}
2021-11-14 18:47:25,A log message

Higher-order functions can be concisely expressed:

$ el set f={y: {x: x+y}}
$ el f[10][20]
30
$ el set tstamp={fmt: {t: date fmt t}}
$ cat log.log | el gsub L ^'%d+' tstamp[^'%F %T']
2021-11-14 18:47:25,A log message

Multiple steps in subexpressions make lambdas more flexible:

~/dev/el$ el set services={: tinyyaml.parse @docker-compose.yaml : mapma it.services}
~/dev/el$ el services do it.image
zookeeper:${ZOOKEEPER_VERSION}
docker.elastic.co/elasticsearch/elasticsearch-oss:${ELASTICSEARCH_VERSION}
wurstmeister/kafka:${KAFKA_VERSION}
docker.elastic.co/logstash/logstash-oss:${LOGSTASH_VERSION}
nginx:${NGINX_VERSION}
bobrik/curator:${CURATOR_VERSION}
docker.elastic.co/kibana/kibana-oss:${KIBANA_VERSION}
mongo:${MONGODB_VERSION}
bobrik/curator:${CURATOR_VERSION}

it.services is a map of all the services, so we make it an array. Arrays are automatically iterable.

Filtering Text

Filters are an important part of the shell experience, and I wanted el to be useful in shell pipelines.

$ cat data/text.txt
Normally text isn't as interesting
as a line from a poem,
or a sentence scrawled in lipstick
on the bathroom mirror
$ cat data/text.txt | el lines do sub it 1 6
Normal
as a l
or a s
on the

There's no magic here: io.lines() and string.sub(), in their informal, first-names-please guise. There is a command cut that does this kind of thing, but I can never remember it, whereas I do remember Lua standard library functions.

There is another special variable, row, which increments from 1 to n over the input:

$ cat data/text.txt | el lines do ^'%02d %s' row it
01 Normally text isn't as interesting
02 as a line from a poem,
03 or a sentence scrawled in lipstick
04 on the bathroom mirror

lines do is common to all filters over input lines, so there is yet another implicit: if the expression contains the special variable L, then the loop is over all lines:

$ cat data/text.txt | el sub L 0 6
Normal
as a l
or a s
on the

There is a postfix if, allowing yet another re-invention of grep (note the double '^^'):

$ cat data/text.txt | el L if match L ^^or
or a sentence scrawled in lipstick

But this is an expression, so we can make more involved queries without going blind trying to write complicated regexes:

$ cat data/text.txt | el L if {match L ^a} and {match L ^poem}
as a line from a poem,

(My tests currently involve a one-liner to collect all the commands in this file:

cat readme.md | el match L ^'^%$ (.+)' > test.sh

)

Too Much Implicit?

This started as a calculator-itch and ended up scratching something more interesting: mostly (as an over-generalization) shells are nasty programming languages and vice versa. So we take Lua and make it easier to write once-off expressions. The coding styles appropriate for applications (e.g. 'global-free' for Lua) don't apply in console golfing. 'Type-ability' is a thing, and the punctuation in 'match("hello dolly","^hell")' slows us down and seems tiring. (At least subjectively, and that is sufficient for me now.) In the context of programming, typing is not an important part of total effort, but shells are about typing fast and accurately.

There are two kinds of implicits going on in el - one is eliding common functions in context, such as seq, format and list. The other is where we look for a particular special variable, L, and assume the expression must be looped over all input lines (which is very AWK-ish, another favourite tool of mine). There is also the implicit table constructor, where one key-value argument makes the rest of the arguments get collected into a table. That could be a little surprising, I will grant you. But these are just experiments in a certain design space - it is hard to evaluate a feature without an implementation, and it is certainly no fun. I admit I am laying on the special sauce a little thick sometimes, but it is all optional.

The implementation itself is a fevered hot mess of string patterns and substitutions, but the great thing about Itch Research is that a bad implementation is enough to show whether the ideas are bad, good or merely meh. There is no point in spending weeks doing a formal grammar and parser if the idea itself is not useful to more than a handful of people. But it means that things are rough and the error handling is poor.

Appendix: Some Trickery Used to Prepare this Entertainment

Metatable Madness

Global lookup proceeds by looking at math, io, os, string, bit32 and table, and then saved_globals, which is the table containing all the key-values stored by set. Then we check whether the name looks like a binary 'literal', e.g. 'b1011', and finally whether we can look it up in the environment. Obviously this is not the fastest way to do this, but it is good enough for now (in dynamic languages you pay for cleverness at runtime, not compile-time).
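
Roughly, that chain might be sketched like this; the resolve name and the environment-variable fallback are illustrative assumptions, and the real code differs:

-- hypothetical sketch of the global lookup chain
local saved_globals = {}   -- stands in for the table loaded from ~/.el
local libs = { math, io, os, string, bit32 or {}, table }

local function resolve(name)
  for _, lib in ipairs(libs) do
    if lib[name] ~= nil then return lib[name] end
  end
  if saved_globals[name] ~= nil then return saved_globals[name] end
  local bits = name:match("^b([01]+)$")        -- binary 'literal' like b1011
  if bits then return tonumber(bits:reverse(), 2) end
  return os.getenv(name)                       -- assumed: environment variables last
end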

So, how is saved_globals managed?

scratch$ cat ~/.el
local saved = {_G=_G};_ENV=saved
simple={two=2,one=1}
_LITERAL_={join="function(p1,p2) return p1 .. '/' .. p2 end",f="function(y) return function(x) return x+y  end  end",tstamp="function(fmt) return function(t) return date(fmt,t) end  end",exists="function(p) return open(p) and p end",desc="function(a,b) return a > b end"}
_ENV=saved._G
join=function(p1,p2) return p1 .. '/' .. p2 end
f=function(y) return function(x) return x+y  end  end
tstamp=function(fmt) return function(t) return date(fmt,t) end  end
exists=function(p) return open(p) and p end
desc=function(a,b) return a > b end
return saved

For regular data values, things are straightforward: they get put into saved using the _ENV strategy. But we put the actual functions into _G, and keep their code representation in the _LITERAL_ table. The next time set gets called, the values are separated into data and functions: the data goes after the _ENV=saved, and the functions get copied into _LITERAL_.
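
Loading the file back is then just a dofile, since it ends with return saved. A minimal sketch (the actual path handling in el may differ):

-- hypothetical sketch: reload the saved state on each invocation
local path = (os.getenv("HOME") or ".") .. "/.el"
local ok, saved = pcall(dofile, path)
local saved_globals = (ok and type(saved) == "table") and saved or {}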

A similar trick is used for add_file (which uses dofile('/path/to/file')) and add_mod (which uses mod=require('mod')).

A hack was required to make functions work - we automatically quote them as strings in a key-value context.

Saving/loading is the platform-dependent part here; it should not be difficult to generalize. A sensible person might ask: can you not just add some dependencies? But nah: single-file Lua is the way to make things easier for users, although it might offend our fine-tuned developer feelings.

Massage for Shell Happiness

The standard shell claims a lot of characters for itself, so we have to adjust our lexical symbols to whatever will not give offense. () are special, so functions are called with f[x] (or f x if the stars are in alignment). Since '"hello"' is awkward to type, el has the quote marker ^hello.

Things like > and | must be quoted, but at least only single quotes are needed. * is a sneaky one, since if the shell can expand it, it will, so 2 * 3 blows up weirdly but 2*3 will not.

All decisions have consequences which may cause headaches. Array indexing gets messed up, because a[1] becomes a(1) and arrays are (generally) not callable. So we make them callable, either as a result of list or when loaded, with a metatable that implements call in terms of indexing. This is not beautiful but (again) will do for now. (There's an opportunity to invent slice notation a[2,3] here, but perhaps for another day.)
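
A sketch of such a metatable; the names are assumptions, but the idea is just __call delegating to indexing:

-- hypothetical sketch: arrays with a __call metamethod, so a[1] -> a(1) still works
local callable_mt = {
  __call = function(t, i) return t[i] end
}

local function list(...)
  return setmetatable({...}, callable_mt)
end

local a = list(10, 20, 30)
print(a(2))   --> 20, i.e. what a[2] would have meant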

Parsing Kludgery

The moral of the story: don't just start typing. But sometimes, that's how you start. The parser has far too many 'if ... but' statements, and so the features tend to fight with each other. The implicit stuff is mostly optional, but implicit lists will bite:

$ el {10 + 20}
error	[string "expr"]:1: unexpected symbol near '+'	printx(eval(list(10,+,20)))
$ el {eval 10 + 20}
30

So el solves the old Lisp quoting problem the other way around: it assumes subexpressions are lists, unless the first item is a global function. This is convenient but (again) will bite us further down the road - it is not possible to tell from the shape of the code how it will be interpreted.
