Configuration, logging, and CLI arguments parser libraries are all far from endangered species. Why create another one?
It's fair to say that parsing command line arguments in Scala is a solved problem. There are quite a few libraries that do exactly that, and do it pretty well. There's also an army of Java libraries that do the same thing with varying degrees of style.
The problem with all of the above is this: IMHO it is counterintuitive to learn & use an API that has little to do with the way CLI applications are used, and with how such applications display "usage" text. Each CLI arguments parsing library has its own API that seems to address a goal completely orthogonal to the task of parsing CLI arguments & generating a human-readable help/usage text. This, in my mind, qualifies as a distraction.
What I wanted is a fluent description language that specifies the interface of
a CLI application. That description language (call it a DSL if you must)
should be used both to parse CLI arguments and display help/usage text to
the user. This library, literargs, achieves exactly that.
import literargs._
import cats.implicits._
object hello {
  def main(argv: Array[String]): Unit = {
    val args"""
      -l $living
      -n <$name>
      -a [$age:Int]
      -i <..$interests>
      -c [..$accomplishments]
    """ = argv
    println(s"""
      hello, ${name.value}!
      you're aged ${age.value}
      you like ${interests.value.map(_.toList.mkString(", "))}
      and you've accomplished ${accomplishments.value.map(_.mkString(", "))}
    """)
    if (!living.value.valueOr(_ => false))
      println("alas, you're no longer with us. :(")
  }
}
Let's try to run this now:
hello.main(Array(
  "-n", "Albert Einstein",
  "-a", "137",
  "-i", "physics",
  "-i", "cosmology",
  "-i", "wonders of the universe",
  "-c", "Nobel Prize",
  "-c", "New Jersey Hall Of Fame"
))
//
// hello, Right(Albert Einstein)!
// you're aged Right(Some(137))
// you like Right(physics, cosmology, wonders of the universe)
// and you've accomplished Right(Nobel Prize, New Jersey Hall Of Fame)
//
// alas, you're no longer with us. :(
So, we have an argv: Array[String] that we want to parse into meaningful
values. Here val args""" ... """ = argv is an extractor that turns the
provided arguments array into a bunch of values, in this case living, name,
age, interests, and accomplishments.
You can think of the val args""" ... """ = argv piece as a DSL block that
configures and invokes an extractor. The quoted text, for example
"""-n <$name> -a [$age:Int] -i <..$interests>""", is used similarly to
quasiquote extractors. The difference is that instead of Scala-like syntax
used to extract pieces of the Scala AST, here we use a syntax similar to the
"usage text" that shows up when a CLI application is run with the --help
option. After each option description (e.g. -i) comes a type description with
a binding inside of it: <..$interests>.
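To make the "string interpolator as extractor" idea concrete, here is a minimal sketch in plain Scala 2 with no macros. The kv interpolator and the KvContext class are hypothetical names invented for this illustration and are not part of literargs; the real args extractor is macro-based and does far more (typing, arity, flags). The sketch only shows the language mechanism that lets an interpolated string sit on the left-hand side of a val.

```scala
object InterpolatorExtractorSketch {
  // Hypothetical `kv` interpolator, for illustration only (not literargs API).
  implicit class KvContext(sc: StringContext) {
    object kv {
      // The literal parts of the pattern are quoted into a regex and
      // every $binding becomes one capture group.
      def unapplySeq(input: String): Option[Seq[String]] = {
        val regex = sc.parts.map(java.util.regex.Pattern.quote).mkString("(.+)").r
        regex.unapplySeq(input)
      }
    }
  }

  // On the left-hand side of `=` the interpolated string is a pattern:
  // the compiler calls kv.unapplySeq and binds `name` and `age`.
  val kv"name=$name;age=$age" = "name=Albert Einstein;age=137"
}
```

Both bindings come out as plain String values here, which is exactly the "stringly typed" situation that the typed Argument values described in this document are meant to avoid.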
Let's examine in more detail the types of values bound by the args extractor.
Stringly typing an entire application based on contents of Array[String]
isn't anyone's idea of fun, after all.
scala> val args"-l $living -n <$name> -a [$age:Int] -i <..$interests> -c [..$accomplishments]" =
| Array(
| "-n", "Albert Einstein",
| "-a", "137",
| "-i", "physics",
| "-i", "cosmology",
| "-i", "wonders of the universe",
| "-c", "Nobel Prize",
| "-c", "New Jersey Hall Of Fame"
| )
living: literargs.Argument[cats.Id,Boolean] = $anon$1$Match$`l`$macro$6$@114ef90c
name: literargs.Argument[cats.Id,String] = $anon$1$Match$`n`$macro$7$@63edd5fe
age: literargs.Argument[Option,Int] = $anon$1$Match$`a`$macro$8$@1a7031ae
interests: literargs.Argument[cats.data.NonEmptyList,String] = $anon$1$Match$`i`$macro$9$@b5afa66
accomplishments: literargs.Argument[List,String] = $anon$1$Match$`c`$macro$10$@6dbae498
We get five symbols (living, name, age, interests, and accomplishments), each
of type Argument[M[_], T]. The Argument type is defined by literargs as
follows:
sealed trait Argument[M[_], T] {
  def value: Xor[Throwable, M[T]]
}
The purpose of Argument is clear: it encapsulates an M[T], where M[_] is
some type constructor that contains zero, one, or multiple values of type T.
Syntax within the val args""" ... """ extractor controls which M[_] and T
types are chosen by the macro.
We see four M[_] types here: Id, Option, NonEmptyList, and List. With the
exception of living (which is a Boolean) and age (which is an Int), the T
types are all String.
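The shape of these values can be sketched without any dependencies. The snippet below is a stand-in, not the library's definition: it uses a case class instead of the sealed trait, Either in place of Xor, and a local Id alias in place of cats.Id. All of those substitutions are assumptions made so the example compiles on its own.

```scala
import scala.util.Try

object ArgumentShapeSketch {
  // Stand-in for cats.Id: the identity type constructor.
  type Id[A] = A

  // Stand-in for literargs' Argument, with Either playing the role of Xor.
  final case class Argument[M[_], T](value: Either[Throwable, M[T]])

  // One value, always present.
  val living: Argument[Id, Boolean] = Argument[Id, Boolean](Right(false))
  val name: Argument[Id, String]    = Argument[Id, String](Right("Albert Einstein"))

  // Zero or one value.
  val age: Argument[Option, Int] = Argument[Option, Int](Right(Some(137)))

  // Zero or more values.
  val accomplishments: Argument[List, String] =
    Argument[List, String](Right(List("Nobel Prize", "New Jersey Hall Of Fame")))

  // A conversion failure is carried on the left side instead of being thrown.
  val badAge: Argument[Option, Int] =
    Argument[Option, Int](Try(Option("abc".toInt)).toEither)
}
```

The point of keeping M[_] as a type parameter is that the same wrapper describes flags, required options, optional options, and repeated options without separate classes for each.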
How does this relate to the syntax we used to describe our options?
Definition syntax | Bound to symbol | M[_] type | Value type |
---|---|---|---|
-l $living | living | Id | Boolean |
-n <$name> | name | Id | String |
-a [$age:Int] | age | Option | Int |
-i <..$interests> | interests | NonEmptyList | String |
-c [..$accomplishments] | accomplishments | List | String |
In plain English, this means that:
- Bindings without any adornments are Boolean flags.
- Bindings surrounded by < and > are considered required.
- Bindings surrounded by [ and ] are considered optional.
- Bindings with .. before the name will accept multiple values.
  - Required .. bindings turn into a NonEmptyList.
  - Optional .. bindings turn into a List.
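The rules above can be sketched as a handful of plain functions over Array[String]. Everything in this snippet is hypothetical and written only for illustration: the function names, the NonEmpty stand-in for cats.data.NonEmptyList, and the deliberately naive "value follows flag" lookup are assumptions, not how the literargs macro or its parser actually work.

```scala
object ArityMappingSketch {
  // Hypothetical stand-in for cats.data.NonEmptyList.
  final case class NonEmpty[A](head: A, tail: List[A])

  // Naive lookup: every element that directly follows an occurrence of `flag`.
  def valuesOf(flag: String, argv: Array[String]): List[String] =
    argv.toList.sliding(2).collect { case List(`flag`, v) => v }.toList

  // -l $living : a bare binding is a Boolean flag.
  def switch(flag: String, argv: Array[String]): Boolean =
    argv.contains(flag)

  // -n <$name> : required, so absence is an error.
  def required(flag: String, argv: Array[String]): Either[Throwable, String] =
    valuesOf(flag, argv).headOption
      .toRight(new IllegalArgumentException(s"missing required option $flag"))

  // -a [$age] : optional, so absence is None.
  def optional(flag: String, argv: Array[String]): Option[String] =
    valuesOf(flag, argv).headOption

  // -i <..$interests> : required and repeatable, at least one value.
  def requiredMany(flag: String, argv: Array[String]): Either[Throwable, NonEmpty[String]] =
    valuesOf(flag, argv) match {
      case h :: t => Right(NonEmpty(h, t))
      case Nil    => Left(new IllegalArgumentException(s"missing required option $flag"))
    }

  // -c [..$accomplishments] : optional and repeatable, possibly empty.
  def optionalMany(flag: String, argv: Array[String]): List[String] =
    valuesOf(flag, argv)
}
```

Each return type lines up with one row of the table: Boolean for a flag, a single value for a required option, Option for an optional one, and a non-empty or plain list for the repeated forms.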
Let's check out some of the values now:
scala> living.value
res2: cats.data.Xor[Throwable,cats.Id[Boolean]] = Right(false)
scala> name.value
res3: cats.data.Xor[Throwable,cats.Id[String]] = Right(Albert Einstein)
scala> age.value
res4: cats.data.Xor[Throwable,Option[Int]] = Right(Some(137))
scala> interests.value
res5: cats.data.Xor[Throwable,cats.data.NonEmptyList[String]] = Right(OneAnd(physics,List(cosmology, wonders of the universe)))
scala> accomplishments.value
res6: cats.data.Xor[Throwable,List[String]] = Right(List(Nobel Prize, New Jersey Hall Of Fame))
We see here that types of values are assigned depending on what syntax was used to adorn bindings.
The -c [..$accomplishments] option above is a List[String] detailing the
accomplishments of one famous scientist. What if we wanted to keep a
List[Accomplishment], containing both the accomplishment and the year when
it was achieved? Let's adorn the option definition with our desired type:
import literargs._
import cats.implicits._
scala> {
| case class Accomplishment(year: Int, description: String)
| val args"""
| -l $living
| -n <$name>
| -a [$age:Int]
| -i <..$interests>
| -c [..$accomplishments:Accomplishment]
| """ =
| Array(
| "-n", "Albert Einstein",
| "-a", "137",
| "-i", "physics",
| "-i", "cosmology",
| "-i", "wonders of the universe",
| "-c", "Nobel Prize",
| "-c", "New Jersey Hall Of Fame"
| )
| }
<console>:21: error: could not find implicit value for parameter E: literargs.Extractor[List,Accomplishment]
val args"""
^
<console>:22: error: recursive value x$1 needs type
-l $living
^
This fails to compile because literargs doesn't know how to turn String-s
inside Array[String] into instances of Accomplishment. This is easy to fix
by providing the requisite type class instance:
import literargs._
import cats.Functor
import cats.implicits._
case class Accomplishment(year: Int, description: String)
// defined class Accomplishment
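implicit def ExtractAccomplishment[F[_]: Functor] =
  new Extractor[F, Accomplishment] {
    // Expects every raw value in "year:description" form.
    def extract(raw: F[String]) =
      // Summon the Functor instance explicitly; a context bound does not
      // bring a term named F into scope.
      Functor[F].map(raw)(_.split(':') match {
        case Array(year, description) => Accomplishment(year.toInt, description)
      })
  }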
// ExtractAccomplishment: [F[_]](implicit evidence$1: cats.Functor[F])literargs.Extractor[F,Accomplishment]
val args"""
  -l $living
  -n <$name>
  -a [$age:Int]
  -i <..$interests>
  -c [..$accomplishments:Accomplishment]
""" =
  Array(
    "-n", "Albert Einstein",
    "-a", "137",
    "-i", "physics",
    "-i", "cosmology",
    "-i", "wonders of the universe",
    "-c", "Nobel Prize",
    "-c", "New Jersey Hall Of Fame"
  )
// living: literargs.Argument[cats.Id,Boolean] = $anon$1$Match$`l`$macro$16$@6080a3dd
// name: literargs.Argument[cats.Id,String] = $anon$1$Match$`n`$macro$17$@3277307e
// age: literargs.Argument[Option,Int] = $anon$1$Match$`a`$macro$18$@69106507
// interests: literargs.Argument[cats.data.NonEmptyList,String] = $anon$1$Match$`i`$macro$19$@54e0e733
// accomplishments: literargs.Argument[List,Accomplishment] = $anon$1$Match$`c`$macro$20$@319b3839
Here we have provided an instance of Extractor[F[_], Accomplishment] that
turns String-s into Accomplishment-s. A couple of things to note:
- We must abstract over F[_] because in the field it might be an Id, Option, List, NonEmptyList, or some other thing.
- This Extractor performs a couple of dangerous operations (year.toInt and a non-exhaustive match on the result of split), which, in case of failure, would make use of the LHS of def value: Xor[Throwable, M[T]] on Argument[M[_], T].
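For comparison, the same conversion can be written so that neither dangerous operation throws. This is a standalone sketch under stated assumptions: it uses Either instead of Xor, does not implement the literargs Extractor type class, and assumes the "year:description" input format implied by the split(':') call above. The names parse and parseAll are hypothetical.

```scala
import scala.util.Try

object SafeAccomplishmentSketch {
  final case class Accomplishment(year: Int, description: String)

  // Parse one "year:description" string, reporting problems as a Left
  // instead of throwing MatchError or NumberFormatException.
  def parse(raw: String): Either[Throwable, Accomplishment] =
    raw.split(":", 2) match {
      case Array(year, description) =>
        Try(year.trim.toInt).toEither.map(Accomplishment(_, description.trim))
      case _ =>
        Left(new IllegalArgumentException(s"expected year:description, got '$raw'"))
    }

  // All-or-nothing over a list: the leftmost failure is the one returned.
  def parseAll(raws: List[String]): Either[Throwable, List[Accomplishment]] =
    raws.foldRight[Either[Throwable, List[Accomplishment]]](Right(Nil)) { (raw, acc) =>
      for {
        a  <- parse(raw)
        as <- acc
      } yield a :: as
    }
}
```

With this shape, an input such as "Nobel Prize" (no year prefix) produces a Left describing the problem, while "1921:Nobel Prize" produces Right(Accomplishment(1921, "Nobel Prize")).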
Currently literargs is in its larval stage: the basic building blocks are
there, but there's still a lot of work to be done in a number of areas,
including but not limited to:
- Error handling and messaging to the programmer at compile time.
- Error handling and messaging to the CLI user at run time.
- Support for more CLI option description syntax; perhaps docopt?
- Macros in this project care little for potential symbol collisions.
- Generation of help/usage text to the user.
- Parsing of argv: Array[String] is accomplished by a custom parser that is probably broken in numerous ways. This was easier than using another library as the parsing backend (as shown by the extinct commons-cli integration that has gone sour), but is perhaps not the best approach long term.
In addition, option description syntax is subject to change, and may not be flexible enough for all use cases. YMMV!