Add gradual typing to Bend

Question

Add gradual typing to Bend

developedby opened this issue 2 months ago · comments

Is your feature request related to a problem? Please describe.
Bend already assumes by the semantics of ADTs and matches that users are writing programs within the contraints of some kind of polymorphic lambda calculus type system.

This is not enforced anywhere since bend is currently untyped. This allows very easily writing very clever programs that escape a simple type system, but in most cases just leads to writing incorrect programs. It'd be better if there was some way of enforcing it.

Having it be untyped like it currently is makes Bend basically unusable for anything not super tiny.

Describe the solution you'd like
Add optional hindler-milner typing to Bend. Types are created by adt defs.

Needs to have support for axiomatic type statements to allow unscoped variables, superpositions and native hvm definitions to be used with typed code.

Untyped functions have type Any, meaning that they are trusted, like in typescript or mypy.

For Imp syntax, a syntax like typehints in python could work well

def and3(a: Bool, b: Bool, c: Bool) -> Bool:
  x1: Bool = and(a, b)
  return and(x1, c)

For Fun syntax, we can have separate function type declaration like in haskell for example.

# This is hard to parse, but I like how it looks similar to the other one
(and3 (a: Bool) (b: Bool) (c: Bool)) -> Bool 
(and3 a b c) =
  # Parens are not necessary here
  let (x1: Bool) = (and2 a b);
  (and2 x1 c)

# This is easier to parse, but i don't like how it looks
and3: Bool -> Bool -> Bool -> Bool
and3 a b c =
  # We could also have annotation terms instead
  # Parens are also not necessary here either
  let x1 = ((and2 a b) : Bool);
  (and2 x1 c)

We can compile to Kind to typecheck. That would need some trickery to deal with pairs, sups and unscopeds, but should be possible. Functions that have unscopeds need to be untyped or typed with an axiom.

This leaves a couple things still open:
What syntax to use for type variables. I like these two options

def List/reverse<t>(list: List(t)) -> List(t):

def List/reverse(~t, list: List(t)) -> List(t):

What syntax to use for unchecked type statements. For example we could use :: instead of -> in Imp syntax for an unchecked type.

# Works, but a bit too subtle
def channel<a, b>() :: (a -> b, b -> a):
  return (lambda $a: $b, lambda $b: $a)

# Using '->' for type and a prefix marker for unchecked type
def !channel<a, b> -> (a -> b, b -> a):
def unchecked channel<a, b> -> (a -> b, b -> a):
...

# An attribute?
@unchecked
def channel<a, b>() -> (a -> b, b -> a):

# something else?

I can't think of a use for it in expressions (except as from the desugaring of functions) but we could use :: instead of : there as well if this is wanted. Initially I think we shouldn't because it incentivizes users destroying types when a type error appears.

What syntax to use for foralls. Functions with parametric types compile to lambdas with forall for types

def id<t>(x: t) -> t:
  return x

# becomes
id = (@x x : forall t t -> t)

Describe alternatives you've considered
Leaving typing to another language. That would go against our current direction of making Bend the language people use.

Using a dependent type system. While personally I'd love to see that, if values cross into the types then our type checker necessarily needs to know how to evaluate all the hvm quirks. Basically it would restrict the type checker to HVM itself or an implementation of HVM in the type checker. That would be annoying and could potentially be catastrophic if the type checker has the same limitations as HVM.

Using another kind of polymorphic lambda calculus, like system F or system F_omega. I think for users of imperative languages hindley milner is easier to understand and write.

Eduardo Sandalo Porto · Answer 1 · Thu Jul 04 2024 04:46:57 GMT+0800 (China Standard Time)

For Fun syntax, we can have separate function type declaration like in haskell for example.

Kind2 used to have function type declarations within their definitions, so something like this:

(and3 (a: Bool) (b: Bool) (c: Bool)): Bool =
  ...

Considering gradual typing, I think this also works better than separating it since we could give types only to the variables we want:

(and3 (a: Bool) b c): Bool =
  ...

What syntax to use for type variables.

Regarding generics, Kind2 does it with the diamond operator following the name of the function being defined. For example:

concat <T> (xs: (List T)) (ys: (List T)) : (List T) =
  match xs {
    ++: (cons xs.head (concat xs.tail ys))
    []: ys
  }

It's also similar to other existing languages, so I think it's nice.

With Hindley-Milner, it would be possible to automatically "float" any undefined names to type quantifiers, but I personally think being explicit is better.

I agree with what Kate said on the Discord channel - passing ~t to a function as a normal parameter makes it look like types live in the same domain as values and would be too much like dependently-typed languages, so I prefer the other option

What syntax to use for foralls

I'll look into this more later, but if we use a HM style type checker, it could be good to do something similar to the existing literature.

Nicolas Abril · Answer 2 · Thu Jul 04 2024 05:48:39 GMT+0800 (China Standard Time)

Kind2 used to have function type declarations within their definitions, so something like this:

That doesn't allow for pattern matching. If we want to allow it, then it was

Foo (a: T1) (b: T2) : T3
Foo (CtrA1) b = ...
Foo (CtrA2) b = ...

It's a good option

Nicolas Abril · Answer 3 · Thu Jul 04 2024 21:00:05 GMT+0800 (China Standard Time)

For native hvm definitions we can give them a type with the imp syntax

hvm to_u24 -> forall t. t -> u24:
  ...

Nicolas Abril · Answer 4 · Fri Jul 05 2024 03:09:37 GMT+0800 (China Standard Time)

@edusporto @imaqtkatt @LunaAmora I'm also considering using :: instead of : for imp functions.

def fun <a, b> (x: b, y: b) :: a -> b:
  ...

This deviates from the python syntax but i think it'll get too hard to read functions that return functions

def fun <a, b> (x: b, y: b) -> a -> b:

# must add parens to be readable
def fun <a, b> (x: b, y: b) -> (a -> b) :

Nicolas Abril · Answer 5 · Fri Jul 05 2024 14:51:57 GMT+0800 (China Standard Time)

We could also use haskell style for fun syntax, with implicit type variables

fun :: b -> b -> (a -> b)
fun x y = ...

Nicolas Abril · Answer 6 · Fri Jul 05 2024 19:23:36 GMT+0800 (China Standard Time)

Ok here's what I settled on.

First version has types only on functions. Possibly will need to add annotation expressions later to mix typed and untyped code.

For imp syntax

# <var1, ..., varn> for type vars
# args can be given a type with `: Type`
# Type constructors are like normal function applications
# Since types and values never cross, `object` can share its type name with the constructor name.
# Can declare type variables in a type expression with `forall a, b: ...` or `∀a, b:`
# Eg. `def const(val: u24) -> forall t: t -> u24:`. Not sure of the usefulness but seems like we should have the option.
# Arrow types are infix `ArgType -> RetType`.
# Tuple types are written like tuples `(a, b)`
# Superpositions are not in the type system.
# Affinity is not in the type system.
# `Any` type will typecheck with anything. Default type of unchecked things
# Other builtin types are `u24`, `f24`, `i24` and `None` (the type of `*` as a value, a unit type).
def foo <a, b> (x: Ctr(a), y: a -> b, z, w: Any) -> OtherCtr -> b:
  ...
# Can mark a function as having a trusted unchecked type with `unchecked def`.
# This is needed for functions with superposition, unscopeds, other non-lambda things.
unchecked def channel<a, b>() -> (a -> b, b -> a):
  return (lambda $a: $b, lambda $b: $a)
# Untyped arguments and returns are given type `Any`
def foo(a, b):
  ...
# same as
def foo(a: Any, b: Any) -> Any:

Fun syntax

# Functions can be preceeded by a type signature
foo <a> <b> (x: a) (y: b) : a -> b
foo x y = ...

# Signatures can have parens like pattern matching rules
(reverse <t> (xs: (List t)) (acc: (List t))) : (List t)
(reverse (Cons x xs) acc) = (reverse xs (Cons x acc))
(reverse  Nil        acc) = acc

# If only one non-pattern matching rule, we can write the equation and the signature together like in Kind
(reverse <t> (xs: (List t)) : (List t) = (reverse.go xs [])
# Type ctrs use parens around like function calls
# forall is written `forall var body` or `∀var body` like lambdas are. No patterns allowed.
# Arrow type is still infix
# Everything else same as above
fold <a> <b> (xs: (List a)) (cons: a -> b -> b) (nil: b) : b

# Unchecked functions start their signature with `unchecked`
unchecked channel <a> <b> : (a -> b, b -> a) = (@$a b, @$b a)

Hvm definitions are always unchecked, but accept a type

# Uses imp syntax
hvm to_u24 -> forall a: (a -> u24):
  ...

I'm just not sure whether untyped functions should be type checked by default or not.
Some possible situations

def foo(a, b):
  # `bar` is well typed and returns type `Type1` 
  x = bar(a, b)
  # Ok either way
  match x:
    case Type1/Ctr:
      ...
  # If `foo` is completely untyped should this be an error or not
  match x:
    case Type2/Ctr:
      ...

def foo2(a: Any, b: Any) -> Any :
  # Now that our function is explicitly typed with any type, should the behaviour change?
  match bar(a, b)
    case Type2/CtR:
      ...

Nicolas Abril · Answer 7 · Fri Jul 05 2024 21:36:09 GMT+0800 (China Standard Time)

For type definitions, I'll go with

type List <t>:
  Cons { head: t, ~tail: List(t) }
  Nil

type List <t> =
  (Cons (head: t) (~tail: (List t))) |
  Nil

object Pair <a, b> { fst: a, snd: b }

These are always checked, and generate the constructors

List/Cons <t> (head: t) (tail: (List t)) : (List t)
List/Nil <t> : (List t)
Pair <a> <b> (fst: a) (snd: b) : (Pair a b)

Nicolas Abril · Answer 8 · Sat Jul 06 2024 01:16:35 GMT+0800 (China Standard Time)

With implicit type variable in functions (eg. like haskell)

def foo (x: Ctr(a), y: a -> b, z, w: Any) -> OtherCtr -> b:

foo (x: a) (y: b) : a -> b
foo x y = ...

hvm to_u24 -> (a -> u24):

type List(t):
  Cons { head: t, ~tail: List(t) }
  Nil

# parens optional
type (List t) =
  (Cons (head: t) (~tail: (List t))) |
  Nil

object Pair(a, b) { fst: a, snd: b }

Nicolas Abril · Answer 9 · Tue Jul 09 2024 21:16:31 GMT+0800 (China Standard Time)

The needed builtin types to cover all the operations we have so far are

Any, for untyped things
None, for * (erasure)
u24, f24 and i24, the three types of native numbers
Number(t) the polymorphic type that accepts any of the three numbers as parameter
Integer(t) the polymorphic type that accepts i24 and u24 as parameter. Needed for bitwise operations.
Float(t) the polymorphic type that accepts only f24. Not strictly needed at the moment, but will be if we ever add more native floating point numbers. For things like sin, log, etc.
(t1, ..., tn) the type of tuples. Technically multiple types.

Eduardo Sandalo Porto · Answer 10 · Tue Jul 09 2024 22:17:48 GMT+0800 (China Standard Time)

None, for * (erasure)

Wouldn't it be more like a Unit type? I'm not too sure about the equivalencies between interaction nets and linear logic, but since erasures can be created, it doesn't seem like their type is empty

Nicolas Abril · Answer 11 · Tue Jul 09 2024 22:25:59 GMT+0800 (China Standard Time)

None, for * (erasure)

Wouldn't it be more like a Unit type? I'm not too sure about the equivalencies between interaction nets and linear logic, but since erasures can be created, it doesn't seem like their type is empty

Null types are usually unit types in most languages, inhabited by the null value.

I called it None because that's what it's called in Python, but it could be named null or nothing like it is on other languages, or just * like tuple types are (,).

We could also substitute it by Any. I chose to give it an explicitly different type to signal that the value shouldn't be used, but purely at the type level, the only things you can do with purely generic type forall t. t (Any) is move, duplicate or erase.