CatalaLang / catala

Programming language for literate programming law specification

Home Page: https://catala-lang.org

Collection syntax conflicts

AltGr opened this issue

The parsing issue

Current state

Among our ad hoc syntaxes for collection operations, two are a source of trouble for the implementation:

  • x among some_list such that x > 1 (the result of which is a list)
  • x among some_list such that x*x is maximum or if list empty then 0 (the result of which is a single element)
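As a rough illustration of what the two forms compute, here is a Python analogue. It is only a sketch of the intended meaning, not Catala's implementation; `some_list` and its values are made up, and how Catala breaks ties between equal maxima is not specified here.

```python
# Hypothetical input collection (illustration only).
some_list = [3, -4, 1, 2]

# Analogue of: x among some_list such that x > 1
# The result is a list.
filtered = [x for x in some_list if x > 1]

# Analogue of: x among some_list such that x*x is maximum
#              or if list empty then 0
# The result is a single element, with a fallback for the empty list.
arg_max = max(some_list, key=lambda x: x * x, default=0)
empty_case = max([], key=lambda x: x * x, default=0)
```

The first form keeps every matching element, while the second returns the element that maximises the criterion (not the maximum value itself).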

In short, both forms start with a variable name. When the parser first sees x, it cannot yet tell whether to parse it as the expression x (e.g. in x + x) or as the identifier bound by one of the forms above; an ambiguity that can only be resolved later is a problem for parsing.

Why it's a problem

For now it works because the binder is just a single variable, but it breaks as soon as we want anything more complex, for example a pair (x, y), which is now allowed for other list operations (forms like (x * y) for (x, y) among (list1, list2)).

In the future we intend to extend this further and allow more complex forms, possibly nested structures (they could be the same as in match patterns).

Proposed solutions

A workaround

For tuples, a trick is possible: accept an arbitrary expression in place of x, then check afterwards that it has the expected form. However, this won't scale to more complex patterns.
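The workaround can be sketched in Python as follows. The AST classes (`Var`, `Tuple`, `BinOp`) and the function `expr_to_binder` are hypothetical names chosen for illustration; they do not come from the Catala compiler.

```python
from dataclasses import dataclass


@dataclass
class Var:
    name: str


@dataclass
class Tuple:
    items: list


@dataclass
class BinOp:
    op: str
    left: object
    right: object


def expr_to_binder(expr):
    """Reinterpret an already-parsed expression as a list of bound names.

    Only a single variable or a flat tuple of variables is accepted;
    anything else is rejected after the fact.
    """
    if isinstance(expr, Var):
        return [expr.name]
    if isinstance(expr, Tuple) and all(isinstance(i, Var) for i in expr.items):
        return [i.name for i in expr.items]
    raise SyntaxError(f"not a valid binder: {expr!r}")
```

Every new pattern shape needs another case in this check, and the error is only reported once the whole expression has been parsed, which is why the approach is unlikely to scale to nested patterns.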

Syntax update

(if we go this way, it should be done sooner rather than later)

Adding a keyword in front of the x would be enough to avoid the ambiguity and solve the problem. But we'd rather avoid adding too many new keywords to the language.
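To see why a leading keyword helps, the sketch below computes how far a parser has to read before it knows which form it is in. It works on a pre-split token list and is deliberately simplistic; `decision_point` and the placeholder keyword `KW` are assumptions for illustration, not part of Catala.

```python
def decision_point(tokens, leading_keywords=("KW",)):
    """Index of the first token at which the syntactic form is known.

    Returns 0 when a leading keyword announces a collection form.
    Otherwise the parser has to read up to `among` (collection form)
    or to the end of the input (plain expression).
    """
    if tokens and tokens[0] in leading_keywords:
        return 0
    if "among" in tokens:
        return tokens.index("among")
    return len(tokens)


single = decision_point("x among some_list such that x > 1".split())
pair = decision_point("( x , y ) among ( list1 , list2 )".split())
with_kw = decision_point("KW ( x , y ) among ( list1 , list2 )".split())
```

Without the keyword, the decision point moves further right as the binder grows (1 token for `x`, 5 for `( x , y )`); with it, the first token is always enough.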

Proposition 1

Use let:

  • let x among some_list such that x > 1
  • let x among some_list such that x * x is maximum or if list empty then 0

Personally, I find this a bit confusing next to the let x equals ... in ... form, but I am not sure; how well does it read?

Proposition 2

Reuse words from type definitions:

  • list of x among some_list such that x > 1
  • content of x among some_list such that x * x is maximum or if list empty then 0

What do you think?

Very likely to help for #452 as well

In its meeting of February 13th, 2024, the syntax committee picked Proposition 2, with the caveat that the committee might decide to add filter as a keyword later, in which case this decision should be revisited.

Last-minute proposal (only if it suddenly turns out to be 10x better?), for the first case only:

  • all x among some_list such that x > 1

A few remarks gathered after porting our examples to the new syntax:

  • The most frequent use was of the form number of (x among some_collection such that some_condition of x), which is now number of (list of x among some_collection such that some_condition of x); this works but is a little verbose (note that the parens can be removed).
  • For the case where this was directly followed by >= 1, I took the liberty of replacing it with exists x among some_collection such that some_condition of x, except when the legal text explicitly said "1 or more" or similar. Do we actually need the exists notation?
  • In lots of cases, some_condition is actually a function. As a functional programmer I would be very tempted, instead of list of x among some_collection such that some_condition of x, to get rid of the repeated x and write something much more concise along the lines of some_condition among some_collection. Probably not worth adding one more syntax though.

For an example in French, that last proposal would allow the following change:

      nombre de (liste de enfant parmi enfants_à_charge
                 tel que droit_ouvert_forfaitaire de enfant)
 # into
      nombre de droit_ouvert_forfaitaire parmi enfants_à_charge

* For the case where this was directly followed by `>= 1`, I took the liberty of replacing it with `exists x among some_collection such that some_condition of x`, except when the legal text explicitly said "1 or more" or similar. Do we actually need the `exists` notation?

You can indeed encode exists as number of (<filtered list>) > 0, but I think exists better conveys the intent in some contexts.
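The equivalence mentioned here can be checked with a small Python analogue. The function names are hypothetical and this is not how Catala implements either form.

```python
def exists(collection, condition):
    """Analogue of: exists x among collection such that condition of x."""
    return any(condition(x) for x in collection)


def at_least_one_by_count(collection, condition):
    """Analogue of: number of (list of x among collection
    such that condition of x) > 0."""
    return len([x for x in collection if condition(x)]) > 0
```

Both return the same boolean for any input; the difference is only in how directly the code states the intent.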

* In lots of cases, `some_condition` is actually a function. As a functional programmer I would be very tempted, instead of `list of x among some_collection such that some_condition of x`, to get rid of the repeated `x` and write something much more concise along the lines of `some_condition among some_collection`. Probably not worth adding one more syntax though.

I don't think we want lawyers to ponder the concept of eta-expansion too much. Always providing function parameters explicitly sounds like a good thing, even if it is more verbose.
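For readers unfamiliar with the term, the trade-off can be illustrated in Python. The names `is_positive` and `values` are made up for the example.

```python
def is_positive(n):
    return n > 0


# Hypothetical collection (illustration only).
values = [-2, 5, 0, 7]

# Explicit parameter, in the spirit of:
#   list of x among values such that is_positive of x
explicit = [x for x in values if is_positive(x)]

# Function passed directly, in the spirit of:
#   is_positive among values
point_free = list(filter(is_positive, values))
```

The two results are identical; the explicit version names the element being tested, which is the property argued for above.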