tweag / HaskellR

The full power of R in Haskell.

Home Page:https://tweag.github.io/HaskellR

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Simplifying the SEXP and SomeSEXP types with Liquid Haskell

facundominguez opened this issue · comments

inline-r currently uses the Haskell type system to catch some programming mistakes at compile time. Because producing these compile time errors has a cost in complexity of the API presented to the user, this ticket comes to challenge the assumption that catching these errors at compile-time is required in absolutely all uses of inline-r.

The problem

The SEXP type is the type of references to values in the R heap. It contains two phantom parameters.

newtype SEXP s (a :: SEXPTYPE) = SEXP { unSEXP :: SEXP0 }

The first parameter s defines a region, and determines the scope of the program in which the reference can be used.

The second parameter a specifies the form of the R value.

Both type parameters allow the Haskell compiler preventing mistakes.

  • The region parameter allows producing compile-time errors when a reference is used after it has (or could have) been deleted.
  • The form parameter allows producing compile-time errors when a reference is used with a function that expected a reference of a different form.

These provisions introduce some complexity.

The region parameter complexity

A region parameter requires the users to carry around the parameter in all functions manipulating references. For uses of inline-r where the program deals with a single region that matches the lifetime of the running program, managing the region parameter is unwelcome overhead when there is no opportunity to use the references after the region has been terminated.

The form parameter complexities

The form parameter can't express the unknown form of a reference returned by a function (f :: ... -> R (SEXP s a)). Using a type variable as the form in the return type allows the caller to choose any form, which may not match the actual form at run time. The function should instead return a SomeSEXP (f :: ... -> R (SomeSEXP s)), where SomeSEXP is a type wrapping an existential quantification.

data SomeSEXP s = forall a. SomeSEXP (SEXP s a)

Even if the form is known to be either of two options (like Int or Character), using SomeSEXP would be necessary still.

Moreover, the user is forced to manage the form parameter in all cases in type signatures, including those where she would not mind getting an error at run time because of a mismatched form.

A solution with Liquid Haskell

Here we discuss what a solution could look like with Liquid Haskell instead of using the type checker. If we used Liquid Haskell, we could drop both type parameters from the SEXP type, and we could probably delete the SomeSEXP wrapper.

If the user cared about checking at compile time for more programming mistakes, she could enable Liquid Haskell on her project. This would require inline-r to provide Liquid Haskell specifications for all functions. Then the specification of function f could look like

{-@ f :: ... -> R { v:SEXP s | formOf v = Character || formOf v = Int } @-}
f :: ... -> R (SEXP s)

which is a rather precise spec. Or if nothing is known of the form of the returned reference, then the following expresses it well.

{-@ f :: ... -> R (SEXP s) @-}
f :: ... -> R (SEXP s)

It is to be analyzed if there is a better design in Haskell that can get this precision using the type checker instead.

Removing the region parameter is a bit harder.

{-@ measure isValidIn :: SEXP -> World -> Bool @-}

{-@ f :: ... -> R<\v w0 wf -> isValidIn v wf> SEXP @-}
f :: ... -> R SEXP

It will require using features of Liquid Haskell known as abstract and bounded refinement types, where R<\v w0 wf -> isValidIn v wf> asserts that the reference returned by the R monad is valid in the "world" state wf resulting from running the returned R SEXP computation. It is to be analyzed if the hassle of writing these predicates is better than dealing with the region parameter.

Another alternative is to use linear types to control the scope of validity of references, like done in inline-java. And additionally, we could also consider linear constraints.