scala / scala3

The Scala 3 compiler, also known as Dotty.

Home Page: https://dotty.epfl.ch

Add enum construct

odersky opened this issue · comments

Introduction

This is a proposal to add an enum construct to Scala's syntax. The construct is intended to serve at the same time as a native implementation of enumerations as found in other languages and as a more concise notation for ADTs and GADTs. The proposal affects the Scala definition and its compiler in the following ways:

  • It adds new syntax, including a new keyword, enum.
  • It adds code to the scanner and parser to support the new syntax.
  • It adds new rules for desugaring enums.
  • It adds a predefined trait scala.Enum and a predefined runtime class scala.runtime.EnumValues.

This is all that's needed. After desugaring, the resulting programs are expressible as normal Scala code.

Motivation

enums are essentially syntactic sugar. So one should ask whether they are necessary at all. Here are some issues that the proposal addresses:

  1. Enumerations as a lightweight type with a finite number of user-defined elements are not very well supported in Scala. Using integers for this task is tedious and loses type safety. Using case objects is less efficient and gets verbose as the number of values grows. The existing library-based approach in the form of Scala's Enumeration object has been criticized for being hard to use and for lack of interoperability with host-language enumerations. Alternative approaches, such as Enumeratum, fix some of these issues, but have their own tradeoffs.

  2. The standard approach to model an ADT uses a sealed base class with final case classes and objects as children. This works well, but is more verbose than specialized syntactic constructs.

  3. The standard approach keeps the children of ADTs as separate types. For instance, Some(x) has type Some[T], not Option[T]. This gives finer type distinctions but can also confuse type inference. Obtaining the standard ADT behavior is possible, but very tricky. Essentially, one has to make the case class abstract and implement the apply method in the companion object by hand.

  4. Generic programming techniques need to know all the children types of an ADT or a GADT. Furthermore, this information has to be present during type-elaboration, when symbols are first completed. There is currently no robust way to do so. Even if the parent type is sealed, its compilation unit has to be analyzed completely to know its children. Such an analysis can potentially introduce cyclic references or it is not guaranteed to be exhaustive. It seems to be impossible to avoid both problems at the same time.
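The workaround mentioned in point 3 can be sketched as follows. This is a minimal, self-contained example; the names Opt, Som and Non are hypothetical stand-ins chosen to avoid clashing with the standard library:

```scala
// The "abstract case class" trick: making the case class abstract
// suppresses the synthesized apply (which would return Som[T]),
// so the companion can supply one returning the parent type instead.
sealed abstract class Opt[+T]
object Opt {
  abstract case class Som[+T](x: T) extends Opt[T]
  object Som {
    def apply[T](x: T): Opt[T] = new Som[T](x) {}
  }
  case object Non extends Opt[Nothing]
}

// Som(1) now gets the parent type, mirroring standard ADT behavior:
val o = Opt.Som(1) // inferred type: Opt[Int], not Som[Int]
```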

I think all of these are valid criticisms. In my personal opinion, none of these criticisms taken alone is strong enough to warrant introducing a new language feature. But taken together they could shift the balance.

Objectives

  1. The new feature should allow the concise expression of enumerations.
  2. Enumerations should be efficient, even if they define many values. In particular, we should avoid defining a new class for every value.
  3. It should be possible to model Java enumerations as Scala enumerations.
  4. The new feature should allow the concise expression of ADTs and GADTs.
  5. It should support all idioms that can be expressed with case classes. In particular, we want to support type and value parameters, arbitrary base traits, self types, and arbitrary statements in a case class and its companion object.
  6. It should lend itself to generic programming.

Basic Idea

We define a new kind of enum class. This is essentially a sealed class whose instances are given by cases defined in its companion object. Cases can be simple or parameterized. Simple cases without any parameters map to values. Parameterized cases map to case classes. A shorthand form enum E { Cs } defines both an enum class E and a companion object with cases Cs.

Examples

Here's a simple enumeration:

enum Color { 
  case Red
  case Green
  case Blue
}

or, even shorter:

enum Color { case Red, Green, Blue }
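Values of such an enumeration can be used like ordinary case objects, e.g. in pattern matches with exhaustivity checking. A small sketch (describe is a hypothetical helper; the short form shown here is also what the eventual Scala 3 implementation accepts):

```scala
enum Color { case Red, Green, Blue }

// Hypothetical helper: enum values pattern-match like case objects,
// and the compiler can check the match for exhaustiveness.
def describe(c: Color): String = c match {
  case Color.Red                => "warm"
  case Color.Green | Color.Blue => "cool"
}
```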

Here's a simple ADT:

enum Option[T] {
  case Some[T](x: T)
  case None[T]()
}

Here's Option again, but expressed as a covariant GADT, where None is a value that extends Option[Nothing].

enum Option[+T] {
  case Some[T](x: T)
  case None
}

It is also possible to add fields or methods to an enum class or its companion object, but in this case we need to split the `enum' into a class and an object to make clear what goes where:

enum class Option[+T] extends Serializable {
  def isDefined: Boolean
}
object Option {
  def apply[T](x: T) = if (x != null) Some(x) else None
  case Some[+T](x: T) {
     def isDefined = true
  }
  case None {
     def isDefined = false
  }
}

The canonical Java "Planet" example (https://docs.oracle.com/javase/tutorial/java/javaOO/enum.html) can be expressed
as follows:

enum class Planet(mass: Double, radius: Double) {
  private final val G = 6.67300E-11
  def surfaceGravity = G * mass / (radius * radius)
  def surfaceWeight(otherMass: Double) =  otherMass * surfaceGravity
}
object Planet {
  case MERCURY extends Planet(3.303e+23, 2.4397e6)
  case VENUS   extends Planet(4.869e+24, 6.0518e6)
  case EARTH   extends Planet(5.976e+24, 6.37814e6)
  case MARS    extends Planet(6.421e+23, 3.3972e6)
  case JUPITER extends Planet(1.9e+27,   7.1492e7)
  case SATURN  extends Planet(5.688e+26, 6.0268e7)
  case URANUS  extends Planet(8.686e+25, 2.5559e7)
  case NEPTUNE extends Planet(1.024e+26, 2.4746e7)

  def main(args: Array[String]) = {
    val earthWeight = args(0).toDouble
    val mass = earthWeight/EARTH.surfaceGravity
    for (p <- enumValues)
      println(s"Your weight on $p is ${p.surfaceWeight(mass)}")
  }
}

Syntax Extensions

Changes to the syntax fall into two categories: enum classes and cases inside enums.

The changes are specified below as deltas with respect to the Scala syntax given here

  1. Enum definitions and enum classes are defined as follows:

    TmplDef ::=  `enum' `class' ClassDef
             |   `enum' EnumDef
    EnumDef ::=  id ClassConstr [`extends' [ConstrApps]]
                 [nl] `{' EnumCaseStat {semi EnumCaseStat} `}'
    
  2. Cases of enums are defined as follows:

     EnumCaseStat  ::=  {Annotation [nl]} {Modifier} EnumCase
     EnumCase      ::=  `case' (EnumClassDef | ObjectDef | ids)
     EnumClassDef  ::=  id [ClsTpeParamClause | ClsParamClause] 
                        ClsParamClauses TemplateOpt
     TemplateStat  ::=  ... | EnumCaseStat
    

Desugarings

Enum classes and cases expand via syntactic desugarings to code that can be expressed in existing Scala. First, some terminology and notational conventions:

  • We use E as a name of an enum class, and C as a name of an enum case that appears in the companion object of E.

  • We use <...> for syntactic constructs that in some circumstances might be empty. For instance <body> represents either the body of a case between {...} or nothing at all.

  • Enum cases fall into three categories:

    • Class cases are those cases that are parameterized, either with a type parameter section [...] or with one or more (possibly empty) parameter sections (...).
    • Simple cases are cases of a non-generic enum class that have neither parameters nor an extends clause or body. That is, they consist of a name only.
    • Value cases are all cases that do not have a parameter section but that do have a (possibly generated) extends clause and/or a body.

Simple cases and value cases are called collectively singleton cases.

The desugaring rules imply that class cases are mapped to case classes, and singleton cases are mapped to val definitions.

There are eight desugaring rules. Rules (1) and (2) desugar enums and enum classes. Rules (3) and (4) define extends clauses for cases that are missing them. Rules (5) - (7) define how such expanded cases map into case classes or val definitions. Finally, rule (8) expands comma-separated simple cases into a sequence of cases.

  1. An enum definition

     enum E ... { <cases> }
    

    expands to an enum class and a companion object

    enum class E ...
    object E { <cases> }
    
  2. An enum class definition

     enum class E ... extends <parents> ...
    

    expands to a sealed abstract class that extends the scala.Enum trait:

    sealed abstract class E ... extends <parents> with scala.Enum ...
    
  3. If E is an enum class without type parameters, then a case in its companion object without an extends clause

     case C <params> <body>
    

    expands to

     case C <params> extends E <body>
    
  4. If E is an enum class with type parameters Ts, then a case in its companion object without an extends clause

     case C <params> <body>
    

    expands according to two alternatives, depending on whether C has type parameters or not. If C has type parameters, they must have the same names and appear in the same order as the enum type parameters Ts (variances may be different, however). In this case

     case C [Ts] <params> <body>
    

    expands to

     case C[Ts] <params> extends E[Ts] <body>
    

    For the case where C does not have type parameters, assume E's type parameters are

     V1 T1 >: L1 <: U1 ,   ... ,    Vn Tn >: Ln <: Un      (n > 0)
    

    where each of the variances Vi is either '+' or '-'. Then the case expands to

     case C <params> extends E[B1, ..., Bn] <body>
    

    where Bi is Li if Vi = '+' and Ui if Vi = '-'. It is an error if Bi refers to some other type parameter Tj (j = 1, ..., n). It is also an error if E has type parameters that are non-variant.

  5. A class case

     case C <params> ...
    

    expands analogously to a case class:

     final case class C <params> ...
    

    However, unlike for a regular case class, the return type of the associated apply method is a fully parameterized type instance of the enum class E itself instead of C. Also the enum case defines an enumTag method of the form

     def enumTag = n
    

    where n is the ordinal number of the case in the companion object, starting from 0.

  6. A value case

     case C extends <parents> <body>
    

    expands to a value definition

     val C = new <parents> { <body>; def enumTag = n; $values.register(this) }
    

    where n is the ordinal number of the case in the companion object, starting from 0.
    The statement $values.register(this) registers the value as one of the enumValues of the
    enumeration (see below). $values is a compiler-defined private value in
    the companion object.

  7. A simple case

     case C
    

    of an enum class E that does not take type parameters expands to

     val C = $new(n, "C")
    

    Here, $new is a private method that creates an instance of E (see below).

  8. A simple case consisting of a comma-separated list of enum names

    case C_1, ..., C_n
    

    expands to

     case C_1; ...; case C_n
    

    Any modifiers or annotations on the original case extend to all expanded cases.
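As a worked instance of rule (4), consider the hypothetical enums Maybe and Sink below (the short-form syntax coincides with what Scala 3 later adopted). For a parameterless case, a covariant parameter is filled with its lower bound, by default Nothing, and a contravariant one with its upper bound, by default Any:

```scala
// Rule (4), second alternative, applied to parameterless cases:
enum Maybe[+T] { case Empty }   // case Empty expands to: case Empty extends Maybe[Nothing]
enum Sink[-T]  { case Ignore }  // case Ignore expands to: case Ignore extends Sink[Any]
```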

Enumerations

Non-generic enum classes E that define one or more singleton cases are called enumerations. Companion objects of enumerations define the following additional members.

  • A method enumValue of type scala.collection.immutable.Map[Int, E]. enumValue(n) returns the singleton case value with ordinal number n.
  • A method enumValueNamed of type scala.collection.immutable.Map[String, E]. enumValueNamed(s) returns the singleton case value whose toString representation is s.
  • A method enumValues which returns an Iterable[E] of all singleton case values in E, in the order of their definitions.

Companion objects that contain at least one simple case define in addition:

  • A private method $new which defines a new simple case value with given ordinal number and name. This method can be thought of as being defined as follows.

       def $new(tag: Int, name: String): E = new E {
          def enumTag = tag
          override def toString = name
          $values.register(this)   // register enum value so that `enumValue` and `enumValues` can return it.
       }
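Neither scala.Enum nor scala.runtime.EnumValues is specified in detail above. The following is a hypothetical sketch of what they might provide, inferred purely from their use sites in this proposal ($values.register, fromInt, fromName, values):

```scala
import scala.collection.mutable

// Hypothetical stand-in for the scala.Enum trait: every enum value
// carries its ordinal tag.
trait Enum { def enumTag: Int }

// Hypothetical sketch of scala.runtime.EnumValues. Singleton case
// values register themselves from their constructors, in definition order.
class EnumValues[E <: Enum] {
  private val buf = mutable.ArrayBuffer.empty[E]
  def register(v: E): Unit = buf += v
  // Lookup by ordinal, keyed on each value's own enumTag.
  def fromInt: Map[Int, E] = buf.map(v => v.enumTag -> v).toMap
  // Lookup by name, keyed on toString (overridden to the case name).
  def fromName: Map[String, E] = buf.map(v => v.toString -> v).toMap
  // All registered values, in definition order.
  def values: Iterable[E] = buf.toList
}
```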
    

Examples

The Color enumeration

enum Color { 
  case Red, Green, Blue
}

expands to

sealed abstract class Color extends scala.Enum
object Color {
  private val $values = new scala.runtime.EnumValues[Color]
  def enumValue: Map[Int, Color] = $values.fromInt
  def enumValueNamed: Map[String, Color] = $values.fromName
  def enumValues: Iterable[Color] = $values.values

  def $new(tag: Int, name: String): Color = new Color {
    def enumTag: Int = tag
    override def toString: String = name
    $values.register(this)
  }

  final val Red: Color = $new(0, "Red")
  final val Green: Color = $new(1, "Green")
  final val Blue: Color = $new(2, "Blue")
}
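Assuming the expansion above compiles as shown, the generated members could be exercised like this (a sketch against the member names used at this point in the proposal; a later revision in this thread renames them to valueOf/withName/values):

```scala
val all: List[String] = Color.enumValues.toList.map(_.toString) // List("Red", "Green", "Blue")
val green: Color = Color.enumValueNamed("Green") // lookup by name
val red: Color = Color.enumValue(0)              // lookup by ordinal
```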

The Option GADT

enum Option[+T] {
  case Some[+T](x: T)
  case None
}

expands to

sealed abstract class Option[+T] extends Enum
object Option {
  final case class Some[+T](x: T) extends Option[T] {
     def enumTag = 0
  }
  object Some {
    def apply[T](x: T): Option[T] = new Some(x)
  }
  val None = new Option[Nothing] {
    def enumTag = 1
    override def toString = "None"
    $values.register(this)
  }
} 

Note: We have written out the apply method of the case class expansion because
its return type differs from the one generated for normal case classes.

Implementation Status

An implementation of the proposal is in #1958.

Interoperability with Java Enums

On the Java platform, an enum class may extend java.lang.Enum. In that case, the enum as a whole is implemented as a Java enum. The compiler will enforce the necessary restrictions on the enum to make such an implementation possible. The precise mapping scheme and associated restrictions remain to be defined.

Open Issue: Generic Programming

One advantage of the proposal is that it offers a reliable way to enumerate all cases of an enum class before any typechecking is done. This makes enums a good basis for generic programming. One could envisage compiler-generated hooks that map enums to their "shapes", i.e. typelevel sums of products. An example of what could be done is elaborated in a test in the dotty repo.

A very nice explanation of the new feature 👍

There seems to be an inconsistency between the desugaring Rule 5 and the following code example:

enum Option[+T] {
  case Some(x: T)
  case None extends Option[Nothing]
}

If I understand correctly, the desugaring Rule 5 says that for the case None, it is an error for Option to take type parameters.

  1. A case without explicitly given type or value parameters but with an explicit extends clause or body

case C extends |parents| |body|

expands to a value definition

val C = new |parents| { |body|; def enumTag = n }

where n is the ordinal number of the case in the companion object, starting from 0. It is an error in this case if the enum class E takes type parameters.

Another minor question is, it seems the following code in the example expansions does not type check:

  object Some extends T => Option[T] {
    def apply[T](x: T): Option[T] = new Some(x)
  }

Do we need to remove the extends T => Option[T] part?

@liufengyun

If I understand correctly, the desugaring Rule 5 says that for the case None, it is an error for Option to take type parameters.

Well spotted. This clause should go to rule 6. I fixed it.

Another minor question is, it seems the following code in the example expansions does not type check

You are right. We need to drop the extends clause.

In the following introductory example:

enum class Option[+T] extends Serializable {
  def isDefined: Boolean
}
object Option {
  def apply[T](x: T) = if (x != null) Some(x) else None
  case Some(x: T) {
     def isDefined = true
  }
  case None extends Option[Nothing] {
     def isDefined = false
  }
}

I find it a little bit confusing that in the case Some(x: T) definition the type parameter T is bound to the one defined in enum class Option[+T]. I think it is the first time that symbol binding crosses lexical scopes.

Also, how would that interact with additional type parameters?

case Some[A](x: T, a: A)

Also, how would that interact with additional type parameters?

We have to disallow that.

Keeping type parameters undefined looks more like an artifact of desugaring and Dotty's type system than a feature to me. Are there any cases where this would actually be useful?

enum Option[T] {
  case Some(x: T)
  case None()
}

OTOH, covariant type parameters look very useful and are common in immutable data structures. Could this case be simplified?

enum Option[+T] {
  case Some(x: T)
  case None extends Option[Nothing]
}

How about automatically filling in unused type parameters in cases as their lower (covariant) or upper (contravariant) bounds and only leaving invariant type parameters undefined?

  1. It should be possible to model Java enumerations as Scala enumerations.

Instead of only exposing Java enums to Scala in this way, is there a well-defined subset of Scala enumerations that can be compiled to proper Java enums for the best efficiency and Java interop on the JVM?

I'm proposing a modification to the longer syntax:

enum class Option[+T] extends Serializable {
  def isDefined: Boolean
}
object Option {
  def apply[T](x: T) = if (x != null) Some(x) else None
  case Some[T](x: T) { // <-- changed
     def isDefined = true
  }
  case None extends Option[Nothing] {
     def isDefined = false
  }
}

In this case the T is obviously bound in the scope. It still desugars to the same thing, but I feel it's more regular and it allows renaming the type argument:

enum class Option[+T] extends Serializable {
  def isDefined: Boolean
}
object Option {
  def apply[T](x: T) = if (x != null) Some(x) else None
  case Some[U](x: U) extends Option[U] { // <-- changed
     def isDefined = true
  }
  case None extends Option[Nothing] {
     def isDefined = false
  }
}

At the meeting, we also proposed an additional rule:
require that all extends clauses in cases list the enum super-class.
This would render the code below invalid:

enum class Option[+T] extends Serializable {
  def isDefined: Boolean
}
object Option {
  def apply[T](x: T) = if (x != null) Some(x) else None
  case Some(x: Int) extends AnyRef { // <-- Not part of enum
     def isDefined = true
  }
  case None extends Option[Nothing] {
     def isDefined = false
  }
}

@DarkDimius I think this is still insufficient because it is still (a little bit) confusing that the T type parameter of case Some[T] is automatically identified with the T type parameter of the parent Option[+T] class.

Despite these inconveniences, I think that the shorter syntax is a huge benefit, so I find it acceptable to have just case Some(x: T) as a shorthand for case class Some[T](x: T) extends Option[T]. It is always possible to fall back to usual sealed traits and case classes for the cases where we need more fine-grained control (e.g. case class Flipped[A, B]() extends Parent[B, A] cannot be expressed with case enums).

One more point discussed on the dotty meeting:

there should be an additional limitation that no other class can extend an abstract case class. Otherwise the super-class isn't a sum of its children and serialization/patmat won't be able to enumerate all children.

It is always possible to fall back to usual sealed traits and case classes for the cases where we need more fine-grained control

Sealed classes give fewer guarantees. The point of this addition is that you cannot get equivalent guarantees from sealed classes.

e.g. case class Flipped[A, B]() extends Parent[B, A] can not be expressed with case enums).

Given the currently proposed rules it can be expressed; you simply need to write it explicitly using the longer version.

However, unlike for a regular case class, the return type of the associated apply and copy methods is a fully parameterized type instance of the enum class E itself instead of C

Am I understanding correctly that the following occurs?

enum IntWrapper {
  case W(i: Int)
  case N
}
val i = W(1)
i match {
  case w: W =>
    w.copy(i = 2)
     .copy(i = 3) // this line won't compile because the previous copy returned an IntWrapper
  case N => ???
}

If so then it seems like copy should still return C

If so then it seems like copy should still return C

That's a good argument. I dropped copy from the description.

Instead of only exposing Java enums to Scala in this way, Is there a well-defined subset of Scala enumerations that can be compiled to proper Java enums for the best efficiency and Java interop on the JVM?

AFAICT, any enum containing only simple cases (i.e., without ()) can be compiled to Java enums, and exposed as an enum to Java for interop. This even includes enum classes with cases that redefine members.

For enumerations I would love to see a valueOf method with String => E type as well, to look values up by name as well as ordinal.

I'd probably never use a naked Int=>E valueOf method for fear of exceptions; I'd very much prefer Int=>Option[E]. Or maybe (if that's still a thing in Dotty) something like paulp's structural pseudo-Option used in pattern matching.

Oh, and if there is a String=>E (Option preferred, of course), then why not an E=>String, too?

This looks great!

I don't think the long form is an improvement, though. The case keyword is all you need to disambiguate the members of the ADT from other stuff.

enum Either[+L, +R] {
  def fold[Z](f: L => Z, g: R => Z): Z
  case Left(value: L) {
    def fold[Z](f: L => Z, g: Nothing => Z) = f(value)
  }
  case Right(value: R) {
    def fold[Z](f: Nothing => Z, g: R => Z) = g(value)
  }
}

I don't see any issues here. I agree with Stefan that generics should be handled automatically by default, with a missing type parameter filled in as Nothing if the type is not referenced. If you want something else, you can do it explicitly.

  case Right[+L, +R](value: R) extends Either[L, R]

Currently this is looking great! I wrote Enumeratum and would be happy to see something like this baked into the language :)

Just a few thoughts/questions based on feedback I've received in the past:

  • It might be nice to make valueOf non-throwing (returns Option) by default. Slightly easier to reason about and might even be faster
  • A withName method might also be nice to have
  • Would it be possible to customise the enumTag for a given enum member so that users can control the resolution of valueOf? If so, it might be nice to have the compiler check for uniqueness too :)

AFAICT, any enum containing only simple cases (i.e., without ()) can be compiled to Java enums, and exposed as an enum to Java for interop. This even includes enum classes with cases that redefine members.

Compiling to Java enums has some downsides:

  • The base enum type cannot have a user-selected subclass, as the one-and-only subclass slot would be taken by extending java.lang.Enum
  • Methods inherited from java.lang.Enum might not be desired (e.g. case Person(name: Name) would not be allowed because java.lang.Enum.name(): String is final, so the accessor method for name would clash).
  • Crossing the threshold of "compilable to platform Enum" in either direction (by adding the first, or removing the last, case with params) would likely be a binary-incompatible change.

This suggests to me that we need an opt-in (or maybe an opt-out) annotation for this compilation strategy.

Java enums are exposed to the Scala typechecker as though they were constant value definitions:

scala> symbolOf[java.lang.annotation.RetentionPolicy].companionModule.info.decls.toList.take(3).map(_.initialize.defString).mkString("\n")
res21: String =
final val SOURCE: java.lang.annotation.RetentionPolicy(SOURCE)
final val CLASS: java.lang.annotation.RetentionPolicy(CLASS)
final val RUNTIME: java.lang.annotation.RetentionPolicy(RUNTIME)

scala> showRaw(symbolOf[java.lang.annotation.RetentionPolicy].companionModule.info.decls.toList.head.info.resultType)
res24: String = ConstantType(Constant(TermName("SOURCE")))

This is something of an implementation detail, but is needed:

  • to allow references in (platform) annotation arguments, which only admit constant values
  • to keep track of the cases long enough for the pattern matcher to analyse exhaustivity/reachability.

The enums from this proposal will need a similar approach, and I think that should be specced.

@Ichoran The long form is intended to allow for

  • class members
  • companion members
  • parent types of the companion

I played with various variants but found none that was clearer than what was eventually proposed. If one is worried about the scoping of the type parameter, one could specify that the long form is a single syntactic construct

enum <ident> <params> extends <parents> <body> 
[object <ident> extends <parents> <body>]

and specify that any type parameters in <params> are visible in the whole construct. That would be an option.

@ritschwumm

I'd probably never use a naked Int=>E valueOf method for fear of exceptions; I'd very much prefer Int=>Option[E]. Or maybe (if that's still a thing in Dotty) something like paulp's structural pseudo-Option used in pattern matching.

What about making valueOf an immutable map?
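A Map-typed valueOf would give both styles for free: total lookup via apply (throwing on a missing key) and safe lookup via get. A sketch against the proposed API, using the Color enumeration from the proposal:

```scala
// Hypothetical, assuming valueOf: Map[Int, Color] on the companion:
val c: Color = Color.valueOf(0)                 // throws NoSuchElementException if absent
val safe: Option[Color] = Color.valueOf.get(99) // None instead of an exception
```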

@retronym Thanks for the analysis wrt Java enums. It seems like an opt-in is the best way to do it. How about we take inheritance from java.lang.Enum as our cue? I.e.

enum JavaColor extends java.lang.Enum {
   case Red
   case Green
   case Blue
}

Then there would be no surprise that we cannot redefine name because it is final in java.lang.Enum.

Also, can you suggest spec language for the constantness part?

A withName method might also be nice to have

I agree. This would also be required for most of the useful generic programming stuff we want to do (e.g. automatically generate serializers/deserializers for enumerations based on their name rather than their ordinal).
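For example, a name-based round-trip for an enumeration could then be written without boilerplate. A sketch assuming the withName member discussed here (not part of the proposal text at this point), with Color as above:

```scala
// Hypothetical: withName is assumed to be a Map[String, Color].
def encode(c: Color): String = c.toString
def decode(s: String): Option[Color] = Color.withName.get(s)
// decode(encode(Color.Red)) would yield Some(Color.Red)
```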

@szeiger I agree it would be nice if we could fill in extremal types of co/contravariant enum types, i.e. expand

case None

to

case None extends Option[Nothing]

But maybe it's too much magic? Have to think about it some more.

A withName method might also be nice to have

I agree. This would also be required for most of the useful generic programming stuff we want to do (e.g. automatically generate serializers/deserializers for enumerations based on their name rather than their ordinal).

Agreed. But that means we'd have to design that feature together with the generic programming stuff, because it would likely end up on the type level? Not sure about this point.

I pushed a new version where enumerations now define three public members:

  • valueOf: Map[Int, E]
  • withName: Map[String, E]
  • values: Iterable[E]

How does the result type of apply affect cases with type parameters that do not coincide with the superclass' type parameters?

Besides, the rationale for using the super class as the apply result type is that it will be more friendly to type inference. However, I fail to come up with a realistic example that would fail to infer correctly before but would infer correctly with this feature. For example, the typical type inference problem:

val x: List[Int] = ???
val reversed = x.foldLeft(Nil)((ys, y) => y :: ys)

In current Scala, the type parameter of foldLeft is inferred as Nil.type, and then y :: ys does not conform to that. With the scheme proposed here, it still fails to compile, because the type parameter is inferred as List[Nothing].

The exact same would happen to None and Option[Nothing].

Is there any (realistic, common) snippet that would fail to compile before, and succeed now?

@sjrd

colors.foldLeft(Color.Black)((result, color) => someOp(result, color))

Where someOp computes a color based on two colors.

However, your point with List[Nothing] and Option[Nothing] still holds :-(

Where someOp computes a color based on two colors.

Have you ever actually seen such a case? I.e., an operation that takes an enum value and something, and returns a new value of the same enum set?

There's a reason I insisted on realistic, common. We sure can come up with snippets, but that does not count.

Here is a snippet adapted from a real example:

sealed trait Step
case class A(b: Boolean) extends Step
case class B(s: String) extends Step

xs.foldLeft[Step](A(b = false))( … )
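For comparison, if Step were an enum under this proposal, A.apply would already return Step, so the explicit type argument on foldLeft could be dropped. A sketch (the short-form syntax and the widened apply result type are both part of what Scala 3 eventually shipped, so this compiles there):

```scala
enum Step {
  case A(b: Boolean)
  case B(s: String)
}
import Step.*

// A(b = false) already has type Step, so inference succeeds
// without the [Step] type argument:
val xs = List(1, 2, 3)
val last = xs.foldLeft(A(b = false))((acc, n) => B(acc.toString + n))
```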

@retronym @odersky I like the idea of opting into Java enum compatibility by extending java.lang.Enum. The spec just needs to be compatible with Java enums (https://docs.oracle.com/javase/specs/jls/se8/html/jls-8.html#jls-8.9.3) in principle so that a compiler can emit them in these cases. In particular, defining both the Scala enum values method from the current implementation and the static values method required by Java enums could be problematic.

i'd probably never use a naked Int=>E valueOf method for fear of exceptions; i'd very much prefer Int=>Option[E]. or maybe (if that's still a thing in dotty) something like paulp's structural pseudo-Option used in pattern matching.

Can't agree with this more. If we are going to the trouble of adding enums, we really need a String => Option[T] function which lets you look up an enum by its String representation. That's a lot of the reason why libraries like https://github.com/lloydmeta/enumeratum are required in order to do basic stuff that is already available by default in other languages.

Otherwise I agree with the proposal in general

After thinking about values some more: Enumerations are mostly "normal" bytecode but with some special support by the Java compiler. In this sense they are similar to varargs, so we could treat values: Array[E] vs values: Iterable[E] in the same way:

  1. When defining an enum that extends java.lang.Enum in Scala, generate a values method with the Java signature instead of one with the Scala signature.

  2. When using any Java enum (whether defined in Java or Scala) from Scala, treat its values method as having the Scala signature (with the same downside as for Java varargs: every call needs to allocate a wrapper Seq).

I updated the proposal to add withName, and make both valueOf and withName maps. The implementation #1958 has also been updated. Still to do: Clarify the connection to Java enums and what it means for values.

I'm VERY excited about this proposal. I spent 50+ hours working on a solution 2+ years ago, planning to turn it into a macro. And to solve the core problem, Odersky is the one that finally got me to see that a key fix was to use an abstract case class.

It thrills me to the moon that I can consider abandoning my own implementation.

@sjrd this doesn't take typeclasses into account. Having an Option[Nothing] is still miles more useful than None.type if for example you care about using a Monad[Option] with it.

Is there a precedent for having the compiler output depend on scala.Map?

@odersky Thinking about it again, having a Map[Int,E] and an additional Iterable[E] is overkill. A simple (preferably immutable) Seq is just as good. It's not that big of a difference whether I call Map#get or Seq#lift.

Is there a precedent for having the compiler output depend on scala.Map?

AFAIK there is no such precedent currently in Dotty.

If my answer here is accurate, there isn't any precedent in scalac either.

Will case class and sealed be removed in Scala 3?

Will case class and sealed be removed in Scala 3?

Case class: definitely not. We need open as well as closed sums. Sealed: will also stay because there are more complex class hierarchies that cannot be explained by enums but that are still confined to one compilation unit.

@odersky, is it possible to allow extending an enum in the same source file?

I don't feel good about the language having two very similar features: sealed trait and enum.

I also do not like the nested definitions in the enum syntax.

It brings Java-style static member definitions back. Static member definitions are inconsistent with the type / companion separation conventions in the Scala language, and cause confusion with Scala's path-dependent type syntax.

It looks like the only reason for case val is sharing Java class files between enum values. If that is the case, why not apply this optimization to all objects?

Is it possible to share the class file for all empty-body objects that have the same super types and same modifiers (no matter whether they are case object or not) ?

I don't feel good about the language having two very similar features: sealed trait and enum.

They are not just similar, but related: One expands into the other. There's arguably lots of precedent in Scala for this. Think about how function values expand into objects, or how for expressions expand into map,flatMap,withFilter operations.

Is it possible to share the class file for all empty-body objects that have the same super types and same modifiers (no matter whether they are case object or not)?

That's a good question. There's a problem with the fact that each object is supposed to define its own class. I am not sure to what degree one can ignore that. Also, there's the issue that objects are lazy and we want non-lazy vals for efficiency.
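The laziness difference can be shown with plain Scala today (Enclosing, LazyCase, and eagerVal are made-up names for illustration): an object's initializer is deferred until the object is first accessed, whereas a val's initializer runs as part of the enclosing object's initialization.

```scala
import scala.collection.mutable.ListBuffer

object Enclosing {
  val log = ListBuffer.empty[String]
  object LazyCase { log += "object initialized" }     // deferred until first access
  val eagerVal: Int = { log += "val initialized"; 0 } // runs when Enclosing initializes
}

val force = Enclosing.eagerVal           // initializes Enclosing, but not LazyCase
val beforeAccess = Enclosing.log.toList  // List("val initialized")
val touch = Enclosing.LazyCase           // first access runs LazyCase's initializer
val afterAccess = Enclosing.log.toList   // List("val initialized", "object initialized")
```

This per-value lazy initialization is the cost the proposal avoids by expanding simple cases to vals.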

how for expressions expand into map,flatMap,withFilter operations.

I bet the for/yield expansion is a nightmare for the scala.meta folks. It also troubled macro authors like me, because we have to deal with two different syntaxes for (almost) the same AST.

One expands into the other

One key difference, though, is the return type of the apply method of the case companions. Do you think we could pull this feature to sealed traits too?

That's a good question. There's a problem with the fact that each object is supposed to define its own class. I am not sure to what degree one can ignore that. Also, there's the issue that objects are lazy and we want non-lazy vals for efficiency.

I know the optimization would break the behavior of getClass. Fortunately, IIRC, there is no mention of getClass in the SLS.

One key difference, though, is the return type of the apply method of the case companions. Do you think we could pull this feature to sealed traits too?

Usually adding methods to an existing object would not break source code backward compatibility.

Compiling empty-body objects to the same shared class would expose more fragility at the binary compatibility level: now adding the first member to an object would break binary compatibility. I don't think this is an option.

Why not go full blown GADT style? enum Option[T] = None | Some(t: T) or enum Either[A, B] = Left(a: A) | Right(b: B) ? And then just replace enum with data?

@notxcain
We want to maintain the ability for enumeration values to define custom methods and inherit from custom classes that aren't necessarily inherited by the common enum definition.

Binary compatibility is a platform-dependent attribute that does not affect the semantics of the language.

It does affect the semantics of the ecosystem, though, which is closely related to the language. Dismissing binary compat concerns like that is dangerous for the well-being of the language, because a language cannot survive without its ecosystem.

I agree binary compatibility matters. However, since both @shareJavaClass case object and case val have the same impact on binary compatibility, why not keep the existing syntax?

I prefer @shareJavaClass case object because the author of a macro library like upickle does not have to modify their code, as the AST does not change.

I am fully in favor of this proposal, for all the reasons invoked above!

A few months ago I made a prototype based on macro annotations that had almost the same syntax as this proposal. This was a good compromise, reusing the existing Scala syntax but allowing for more concision.
However, I am worried that this syntax will confuse newcomers more than anything. The main reason, as pointed out by @Atry, is that it blurs the distinction between object and class members.
Concepts such as type members are already hard to grasp coming from other languages; if we make the object/class member distinction murkier, it will jump in the way of the fundamental intuition.

Since we have the possibility to change the syntax of the language, I would propose a syntax where the cases are not syntactically scoped inside the class, but rather "appended" to it, a bit like one defines co-recursive entities in ML by starting with let or type and then separating definitions with and. To remain Scala-ish and avoid adding keywords, we could use with instead:

enum Option[T]
with case Some(x: T)
with case None

object Option {
  def apply[T](x: T) = if (x != null) Some(x) else None
}

Or if we really want fewer keystrokes, simply use case alone:

enum Option[T]
case Some(x: T)
case None

enum Color case Red case Green case Blue

Since the cases are syntactically kept together with the enum (as opposed to potentially defined in an object later in the file), it is no longer a problem to have them refer to the enum's type parameters.

I would also be in favor of two improvements:

  • Nominality of type and value parameters: if a case has parameters named the same as the parent enum, these are used in the implicit extends clause. Parent enum parameters that are not parameters in the case can be defined in the body of the case.
  • Type and type bounds elision: when introducing a case parameter homonymous to an enum parameter, its type (for values) or bounds (for types) can be left out.

For example:

enum Param(name: String, typ: Type)
case Nominal(name, typ)
case Positional(ord: Int, typ) { val name = s"_$ord" }

Would be comparable to the current:

sealed abstract class Param(val name: String, val typ: Type) extends Product with Serializable
final case class Nominal(override val name: String, override val typ: Type) extends Param(name, typ)
final case class Positional(ord: Int, override val typ: Type) extends Param(s"_$ord", typ)

It also means we can express the Flipped case alluded to by @julienrf without an explicit extends clause, as in: enum Parent[A,B] case Flipped[B,A].

I agree with letting unspecified type parameters resolve to the lower bound of covariant and upper bound of contravariant parameters. This is consistent with the way they are inferred in expressions.
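In today's Scala the covariant half of this rule looks as follows (Opt, Som, Non are made-up stand-ins for Option to keep the example self-contained): a case that does not mention the covariant parameter can extend the parent at the lower bound Nothing, so a single value fits every instantiation.

```scala
sealed abstract class Opt[+T]
final case class Som[+T](x: T) extends Opt[T]
case object Non extends Opt[Nothing] // parameter resolved to the lower bound

val s: Opt[String] = Non // Opt[Nothing] <: Opt[String] by covariance
val i: Opt[Int]    = Non // the same single value fits here too
```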

Also, the same type/bounds elision principle could be applied to overriding methods as well (which is implemented in my boilerless prototype). I think it would make sense to allow that in enums only, because in that situation one can quickly look up the parent method signature (the enum has to be defined just above the cases).

Combined with the previous remark, this means one can write:

enum EitherOrBoth[+A,+B]  { def fold[T](f: A => T, g: B => T)(m: (T,T) => T): T }
case First (value: A)     { def fold(f,g)(m) = f(value)          }
case Second(value: B)     { def fold(f,g)(m) = g(value)          }
case Both(fst: A, snd: B) { def fold(f,g)(m) = m(f(fst),g(snd))  }

All that being said, and if we assume nominal implicit extension, the magic of automatic type parameter insertion becomes of dubious usefulness. It would be better IMHO to have cases list all their parameters explicitly (types and values). This would not lead to much more code, but would be much clearer.

Say, for example:

enum MyOption[+T <: AnyRef] // bounds are implicitly propagated to homonymous case parameters
case Som[+T](x: T)          // means `case Some[+T<:AnyRef](x:T) extends MyOption[T]`
case Non                    // means `case Non extends MyOption[Nothing]`

I like this proposal. I do hope two things come in:

  1. inferring Nothing on covariant enums that have cases that don't use a type parameter.
  2. I really don't like preventing the addition of extra type parameters on the cases: #1970 (comment)

If we ban adding additional types, then when you need to add one, you have to refactor to sealed trait/case classes and expand out this boilerplate by hand, which is a pretty bad experience.

I can't see why we can't just take all the type parameters of the enum and consider those the first type parameters of all the cases, with any additional ones coming after. (Or Scala could have multiple type parameter blocks, which would also be nice for many inference situations.)
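For reference, today's sealed-trait encoding already allows a case to introduce type parameters beyond the parent's, which is the flexibility being argued for here (Expr, IntLit, Pair are hypothetical examples):

```scala
sealed trait Expr[T]
final case class IntLit(i: Int) extends Expr[Int]
// Pair adds two type parameters of its own and instantiates the parent at (A, B):
final case class Pair[A, B](a: Expr[A], b: Expr[B]) extends Expr[(A, B)]

val e: Expr[(Int, Int)] = Pair(IntLit(1), IntLit(2))
```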

Or if we really want fewer keystrokes, simply use case alone:

     enum Option[T]
     case Some(x: T)
     case None

That was actually the design I started with. The problem with it is that it effectively disallows toplevel enums. First, simple cases like None could not be mapped to vals because they are not allowed at toplevel. Second, enums with many alternatives would pollute the namespace too much.

Note that one can always "pull out" cases into the enclosing scope using val and type aliases. So cases inside objects are more flexible.
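The "pulling out" mentioned here can be sketched in plain Scala (Colors and its cases are hypothetical names): aliases re-expose selected cases in the enclosing scope while the definitions stay inside the object.

```scala
object Colors {
  sealed trait Color
  case object Red extends Color
  case object Green extends Color
}

// Pulled-out aliases; the stable-identifier val still works in patterns:
type Color = Colors.Color
val Red: Colors.Red.type = Colors.Red

def describe(c: Color): String = c match {
  case Red          => "red"
  case Colors.Green => "green"
}
```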

cases inside objects are more flexible

Actually, I had in mind that enum E case A case B ... would expand into class E; object E { case class/object A; case class/object B; ... }. I realize now that the expansion of my Param example was inaccurate.
I think it's still less counter-intuitive than something that looks like a class but whose "members" are moved to a companion object. After all, adding case in front of a class creates a companion object with an apply method inside. Is it a big stretch to accept that case following a class creates case classes in the companion object?
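The precedent referred to, shown concretely (Point is a made-up example): writing case in front of a class already makes the compiler synthesize a companion object with apply (and unapply).

```scala
case class Point(x: Int, y: Int)

val p = Point(1, 2)   // calls the synthesized Point.apply
val Point(a, b) = p   // uses the synthesized Point.unapply
```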

A language should provide mechanism, not policy.

I agree that:

  • Sharing a Java class is an optimization, which is important for the JVM.
  • values, valueOf and java.lang.Enum interoperability are important for a sealed trait

I do not recognize "less verbose" as a goal for a language feature itself. "Less verbose" should be achieved by composing tiny, reusable, elegant, atomic features.

I propose not to change Scala 2.x syntax except for replacing sealed trait with enum.

enum Color
object Color {
  case object Red extends Color
  case object Blue extends Color
  final case class Rgb(r: Byte, g: Byte, b: Byte) extends Color
}

expands to

sealed trait Color extends java.lang.Enum with scala.Enumeration

object Color extends scala.EnumerationCompanion {
  @shareJavaClass case object Red extends Color
  @shareJavaClass case object Blue extends Color
  final case class Rgb(r: Byte, g: Byte, b: Byte) extends Color
}

As enum would already cover all usages of sealed, I guess the sealed keyword might be removed from the language.

My take on the whole compatibility-with-Java-enums question: if Java enum compatibility is going to harm the design of enum in general, then compatibility should be opt-in via some annotation. Scala is going to start taking other platforms into account, and I don't think being shackled to Java compatibility is a good idea. Binary compatibility when adding more entries to an enum should be provided, however; that is a killer feature.

Also, if the syntax is overly verbose for enums, then there isn't really going to be much of a difference between using enum and something like Enumeratum, which means there is little advantage to having this as a language feature.

In my opinion, I would rather prefer a light enum that does less, with minimal syntax, focused on performance/memory usage, and that provides generated methods for looking up enum cases by String value, rather than something which is sort of, but not really, ADTs.

After reading the entire discussion, it's starting to get muddied what the difference is between what people are discussing and what is possible now with ADTs/case objects. If the difference between the two isn't going to be substantial, it's really just going to be confusing for end users, and we don't want to repeat what happened with Scala's Enumeration again.

Another reason against this PR:

Multiple case objects for one sealed trait is a rare situation in real-world Scala code. For an enumeration with many candidate values, we usually store them in an external format, like a database schema, XML DTD, or JSON schema.

Popular Scala libraries, like scalaz, are usually modeled with a lot of case classes to represent their states, even for cases that have no arguments, because they might need a type parameter, like case class Tower[A]().

Even enum in the Java language is rarely used. Java programmers use the visitor pattern to avoid the need for ADTs.

If you search for "enum" in all Java source code on GitHub, you will find that the main usage of enum is testing the enum keyword itself.
https://github.com/search?l=Java&q=enum&type=Code&utf8=%E2%9C%93

Multiple case objects for one sealed trait is a rare situation in real-world Scala code.

I don't think we live in the same "real world". A search in the Scala.js repo shows that virtually all case objects are part of a collection of several that extend a sealed trait or sealed abstract class. And in most of those cases, it's actually a proper enum, in the sense that there is no case class in the same set.

@sjrd How many case classes are there in the Scala.js repository? What is the ratio of case object to case class?

@Atry @sjrd - to sort out usage, @olafurpg has a huge corpus that can be used to check relevant usage statistics. This feels to me like it's sliding into conjecture...

I don’t know how relevant it is but in a simple project (~10k LOC) I count about 30 enum-like definitions (sealed traits extended by case objects only).

I noticed that almost all usages of case object in the Scala.js repository are schemas for configuration. My libraries, by contrast, usually provide APIs for standalone mechanisms and do not expose configuration to users.

I guess that's why @sjrd and I live in different "real world"s. No surprise that other utility libraries like Scalaz or Shapeless do not use case object very often either.

@Atry

Multiple case objects for one sealed trait is a rare situation in real-world Scala code. For an enumeration with many candidate values, we usually store them in an external format, like a database schema, XML DTD, or JSON schema.

Uh, I wouldn't make assumptions like this. Where I work, we have a huge amount of code consisting of enums defined with enumeratum (i.e. case objects in a sealed abstract class, which could easily be a sealed trait).

@Atry

How many case classes are there in the Scala.js repository? What is the ratio of case object to case class?

scalajs$ git grep 'case object' | wc
     75     502    8418
scalajs$ git grep 'case class' | wc
    269    2197   35161
scalajs$ git grep 'case class' | grep -v '/Trees.scala' | wc
    146    1101   19364

75 case objects versus 269 case classes, of which 123 are for defining the ASTs of the IR and JavaScript code.

In any case, it's definitely not rare.

It's also used all over the place when building web servers (one of the primary types of software built with Scala today) to define enums which get output as JSON/XML.

And any kind of GUI needs a concept of ordered enums.

It's really not rare at all.

I think this code is unreadable:

enum class Option[+T] extends Serializable {
  def isDefined: Boolean
}
object Option {
  def apply[T](x: T) = if (x != null) Some(x) else None
  case Some(x: T) {
     def isDefined = true
  }
  case None extends Option[Nothing] {
     def isDefined = false
  }
}

Why is Some defined in the object? It is a class! enum should be used to list all possible cases; what is an abstract function declaration doing in an enum? In my opinion it should look like this:

enum Option[+T] {
  Some(x: T)
  None 
} {
  def isDefined: Boolean = this match {
    case Some(_) => true
    case None => false
  }
}

It is a lot like Rust enums and much more intuitive in my opinion.

Even if the use of case object is not "rare", is it really "many values"?

I believe there are a lot of enumeration-like structures that have many (10-256) values and that are not represented as case objects, precisely because case objects are too heavyweight. For instance, there are quite a few of these in the various compilers that I know of. The idea is that the enum construct should cover these use cases.

So the number of case objects in existing code bases is not a good indicator for deciding whether we need more lightweight enums.
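The heavyweight-avoiding idiom alluded to here looks roughly like this in existing compilers (Tokens and its members are hypothetical names): plain Ints as token classes, which keeps the footprint to a single class but gives up type safety.

```scala
object Tokens {
  final val IF: Int    = 1
  final val ELSE: Int  = 2
  final val WHILE: Int = 3
}

def show(token: Int): String = token match {
  case Tokens.IF   => "if"
  case Tokens.ELSE => "else"
  case _           => "?" // nothing stops an out-of-range Int from reaching here
}
```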

Enumerations should be efficient, even if they define many values. In particular, we should avoid defining a new class for every value.

(I realize the following is a tangent and I am not involved at all writing compilers or optimizers.)

Scenarios where Scala will be deployed the good old fashioned way as non-optimized JAR files will remain with us for a long time, but won't the upcoming Dotty linker change our notion of what might be heavyweight vs. lightweight, inefficient vs. efficient?

For example with Scala.js (and its linker/optimizer) you care much less about optimizing for certain patterns because you know the optimizer takes care of it (for example erasing certain collection operations down to basic operations on Scala.js arrays, see @sjrd's talks on this).

With an appropriate linker/optimizer, a Scala ADT with hundreds of objects wouldn't necessarily imply the creation of hundreds of Java classes at runtime: the linker/optimizer could create a much more optimal runtime representation of sealed hierarchies as needed.

I don't think this changes much regarding the original proposal, which has other motivations. I just wanted to point out that the performance motivation might mostly go away in the future.

@ebruchez Scala.js optimizes away objects that are not referenced. Here we are talking about large enums which are referenced in large match expressions, e.g. an enum for token classes in a compiler.
Those won't be optimized away.

Generally, there's a danger to assume a "sufficiently smart compiler". If you have a concrete plan how to optimize a certain construct, that should surely be factored in. But if it's only a vague feeling, don't count on it.

That's understood. I realize that the Scala.js linker doesn't do this kind of thing right now (AFAIK). I had something more concrete in mind in fact, although I realize that this would need implementation work in the linker.

Stephen Compall wrote here about the equivalence between type-safe pattern matches and "matchless folds". If I understand well this implies that, assuming type-safety, one could write pattern matches for ADTs which do not rely on individual isInstanceOf checks (or maybe just one such test on the base class/trait).

This means that many common sealed hierarchies could be implemented at runtime with much more efficient representations. Obviously the runtime pattern matcher would have to be aware of that.

This tells me that it might not entirely be a crazy assumption that common enums patterns could be massively optimized by the linker.
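A sketch of the fold idea under discussion (Opt, Som, Non are stand-ins, not the actual proposal): each case implements its own eliminator, so a type-safe "match" becomes a single virtual call with no isInstanceOf tests.

```scala
sealed abstract class Opt[+A] {
  def fold[B](ifEmpty: => B)(f: A => B): B
}
final case class Som[+A](a: A) extends Opt[A] {
  def fold[B](ifEmpty: => B)(f: A => B): B = f(a)
}
case object Non extends Opt[Nothing] {
  def fold[B](ifEmpty: => B)(f: Nothing => B): B = ifEmpty
}

val len   = Som("abc").fold(0)(_.length)            // one virtual dispatch, no type test
val empty = (Non: Opt[String]).fold("none")(identity)
```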

@ebruchez It comes down to this: Can we assume that an optimizing compiler/linker will replace object definitions with value definitions? For this we need to show:

  • The object's initializer does not have side effects
  • Nobody does interesting things with the object's class (because it will go away)

Both conditions are uncheckable with our current technologies. The first condition might be checkable with a global code analysis or an effect system, but neither exists yet. The second condition is probably too vaguely defined to be effectively checkable.

The enum proposal needs to give an expansion anyway for simple enum cases. It currently maps them to value definitions, which is straightforward. Mapping them to object definitions instead would couple the whole issue to the two hairy problems mentioned above, which looks too risky. Not to mention that interop with Java enumerations would become a game of roulette with an unpredictable optimizer.

@ebruchez

Comparing to Scala.js (or Dotty's experimental deep linker) is not an apples-to-apples comparison.

In regards to Dotty's deep linker, it is not designed to work with libraries and only works with final "executable" JARs. At least with Scala on the JVM, any built JAR artifact is treated as a library, so it needs to be able to be dynamically linked at runtime. Artifacts built with Dotty's deep linker wouldn't be able to be linked as libraries, AFAIK.

Scala.js is slightly different in this regard. Its target is still building highly optimized .js output (similar to Dotty's deep linker building highly optimized JARs); however, for Scala.js libraries it uses its own internal bytecode, which differs from JVM bytecode and still allows you to make Scala.js libraries.

The reason this is important is that enums aren't "just case objects". Case objects carry around class information, because they can be referenced, looked up, and extended in ways that enums can't be. Having an actual enum keyword signals to the compiler that this construct is a much more limited version of case object, and because of this the compiler can do optimizations (which are not possible if we are dealing with dynamically linked libraries) that it otherwise can't do. This is the same reasoning behind constructs like AnyVal: it's not possible to automatically have unboxed value classes if we also have to provide dynamically linked libraries. The same goes for abstract class versus trait, or the new @static proposal being made.

This is why I am also in favour of making the proposed enum construct as limited as possible, in the same way that AnyVal is very limited compared to a standard one-field case class. The more things enum is asked to support, the closer it gets to being a slightly different variant of the standard sealed abstract class/sealed trait + case object pattern we have now.

At least personally, the things that are important for enum are the same things provided by https://github.com/lloydmeta/enumeratum, and nothing more, specifically:

  • A way to look up an ADT case by its symbolic name
  • A way to represent an ADT case by an int index (for efficiency)
  • Enums can be ordered

And more importantly, what it doesn't support

  • Constructor cases with more than one argument (the parameter should only be the actual symbolic enum id)
  • Being able to extend the values in the enum itself

This is, of course, so that enums can avoid instantiating a class (or two) per enum value.
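A rough sketch of the requested feature set in today's Scala (Weekday, entryName, withName, and values are assumed names, loosely modeled on enumeratum, not a proposed API): name lookup, an int index, and ordering, with nothing else.

```scala
sealed abstract class Weekday(val entryName: String, val index: Int)
    extends Ordered[Weekday] {
  def compare(that: Weekday): Int = index - that.index
}
object Weekday {
  case object Mon extends Weekday("Mon", 0)
  case object Tue extends Weekday("Tue", 1)
  case object Wed extends Weekday("Wed", 2)

  val values: IndexedSeq[Weekday] = IndexedSeq(Mon, Tue, Wed) // index lookup
  val withName: Map[String, Weekday] =
    values.map(w => w.entryName -> w).toMap                   // name lookup
}
```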

@notxcain

Why not go full blown GADT style? enum Option[T] = None | Some(t: T) or enum Either[A, B] = Left(a: A) | Right(b: B) ? And then just replace enum with data?

@DarkDimius

We want to maintain the ability for enumeration values to define custom methods and inherit from custom classes that aren't necessarily inherited by the common enum definition.

Objective 5:

It should support all idioms that can be expressed with case classes. In particular, ...elided.., and arbitrary statements in a case class and its companion object.

I'd argue that if we follow @notxcain's idea to go pure ML-style GADT/ADT and disallow templates for the enum class or its cases (extends is fine), we lose nothing. As Scala has the wonderful feature of implicit helpers, I can foresee a programming paradigm emerging:

data Option[+T] = Some(x: T) | None

implicit class OptionHelper[T](opt: Option[T]) extends AnyVal {
  def isDefined: Boolean = opt match {
    case Some(_) => true
    case None => false
  }
}

This way, Scala can have the best of FP and OO. Also, the expression problem reminds us that there are two orthogonal ways to define operations on data. As case classes are mainly the OO way of operating on data, I think it's probably good for this new language feature to go the FP way. This also alleviates the problem of confused users wondering which construct to use.

@liufengyun Just adding standard ADTs would not be very Scala-like. Scala has always avoided having FP and OOP features side by side. Instead it tries very hard to unify them. So if we define enums, we want to map them to classes, and we do not want to artificially restrict the things you can do with these classes, because that would lose orthogonality.

Just adding standard ADTs would not be very Scala-like. Scala has always avoided having FP and OOP features side by side.

@odersky I guess there is some language design philosophy here I cannot argue against. But as a programmer, the appeal of syntactical simplicity is irresistible -- I believe that's also what most Scala programmers think Scala is about. It's just about the syntax; behind the scenes the standard ADTs also map to classes. For programmers, syntactical simplicity makes them love the language, not just use the language, IMHO.

Or even something like

data Option[A] = Some(a: A) | None
ops Option[A] {
  def isDefined: Boolean = this match {
    case Some(_) => true
    case None => false
  }
}

Just improvising

It's just about the syntax

@liufengyun What about the syntax I proposed earlier?

enum Option[T] case Some(x: T) case None

This lends itself to both short, elegant definitions and to more typical OOP: you can add braces to introduce members for cases (and it still looks like Scala), but if you want to keep the ADT clean and add your methods via implicits, no one is preventing you from doing that. Why impose arbitrary restrictions on programmers?

@LPTK

enum Option[T] case Some(x: T) case None: This lends itself to both short, elegant definitions and to more typical OOP: you can add braces to introduce members for cases (and it still looks like Scala)

First, I'd argue that this is not the simplest possible syntax. By the philosophy of making simple things easy and difficult things possible, the ML-like ADT approach also supports OOP via extends; it just disallows templates.

Second, syntactical simplicity also relates to the redundancy of language features. By disallowing templates, we minimize the overlap between the new ADTs and sealed class/trait and differentiate their use cases.

Third, it's related to @odersky 's concern about top-level definitions:

That was actually the design I started with. The problem with it is that it effectively disallows toplevel enums. First, simple cases like None could not be mapped to vals because they are not allowed at toplevel. Second, enums with many alternatives would pollute the namespace too much.

If we have a completely new ML-style ADT syntax that has no prima facie connection with case object and case class, the language designer has much more flexibility in the implementation. For example, the implementation can put the case definitions in object Color or object Weekday. This implementation behaviour is justified only if we make the syntax clearly different from case object/class (i.e. don't use the keyword case); otherwise it contradicts existing intuitions.

This also means programmers can still define a companion object Color or object Weekday; the compiler just merges the custom companion object with the synthesized one, consistent with existing Scala behavior.

@liufengyun Have you seen the answer I provided to the message you quoted? (Genuine question; not saying you should necessarily agree with it.)

I had in mind that enum E case A case B ... would expand into class E; object E { case class/object A; case class/object B; ... }. I realize now that the expansion of my Param example was inaccurate.
I think it's still less counter-intuitive than something that looks like a class but whose "members" are moved to a companion object. After all, adding case in front of a class creates a companion object with an apply method inside. Is it a big stretch to accept that case following a class creates case classes in the companion object?

There is nothing in this approach (AFAIK) that would prevent what you then describe:

This also means programmers can still define the companion object Color and object Weekday [...] it just merges the custom provided companion object with the synthesised one

@LPTK Yes, I see it's just syntactical difference from yours -- I'm just reserved about using the case keyword and allowing templates for cases and enum.

BTW, I see a potential usability problem with putting case definitions inside an object. If a programmer defines multiple ADTs, say 10, in a single file, then to use them in another file the programmer has to import 10 times. This becomes very annoying in a project where all data definitions are centralized in a single file.

I'm not sure if it's technically possible to alleviate this problem by putting top-level statements (non-class/object defs) inside the package object -- some tricky merging would be required in the Namer. Technically, it seems feasible.

It will become very annoying in a project where all data definitions are centralized in a single file.

Good point. But a simple @expose or @forward macro annotation could automatically create type and value forwarders alongside any non-top-level annotated class, exposing selected members of its companion object (essentially its public classes, objects, and those values that result from enum cases, I would suggest).
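What such a hypothetical @expose/@forward annotation would generate can be written by hand today (Shapes and its cases are made-up names): type and value forwarders next to the enclosing definition.

```scala
object Shapes {
  sealed trait Shape
  case object Dot extends Shape
  final case class Circle(r: Double) extends Shape
}

// Hand-written versions of the forwarders the annotation would synthesize:
type Shape  = Shapes.Shape
type Circle = Shapes.Circle
val Dot: Shapes.Dot.type = Shapes.Dot
val Circle: Shapes.Circle.type = Shapes.Circle // keeps Circle(1.0) working

val c: Shape = Circle(1.0)
```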

@mdedetrich

In regards to Dotty's deep linker, it is not designed to work with libraries and only works with final "executable" JARs. At least with Scala on the JVM, any built JAR artifact is treated as a library, so it needs to be able to be dynamically linked at runtime. Artifacts built with Dotty's deep linker wouldn't be able to be linked as libraries, AFAIK.

Scala.js is slightly different in this regard. Its target is still building highly optimized .js output (similar to Dotty's deep linker building highly optimized JARs); however, for Scala.js libraries it uses its own internal bytecode, which differs from JVM bytecode and still allows you to make Scala.js libraries.

I am not sure I follow or understand the distinction above. In both cases, you have whole-program optimization under a closed-world assumption, except for explicit entry points. My understanding is that all Scala libraries, at some point in the future, would include TASTY trees, which the linker will use for its analysis.

Artifacts built with Dotty's deep linker wouldn't be able to be linked as libraries afaik.

If you mean runtime linkage (as in dropping a JAR into a J2EE container classpath), then you're right.
If you mean compile-time dependencies, then, as correctly pointed out by @ebruchez, no. The reason is that you can recover the entire library from TASTY and recompile it under different assumptions if you need to.
Note that it's too early to tell how it will work out, but design-wise there's nothing prohibiting compile-time dependencies on pre-optimized JARs.

This tells me that it might not entirely be a crazy assumption that common enums patterns could be massively optimized by the linker.

Yes, they might be. But I think there are two points that are conflated in this discussion: enum semantics (and whether we need them), and the performance of the implementation.
My understanding is that enum started as a way to provide stronger guarantees than sealed, enabling reliable discovery of subclasses in metaprogramming. In this regard, I consider enum a substantial improvement over the current situation and I think we should include it.

As far as how to compile them: the current scheme is very inefficient if you compare it with C enums, but C enums are a lot less expressive. There may be a place for an analysis that optimizes existing classes/objects into enums and then finds whether those could be represented in a more compact/efficient way. I think that this is a separate project which may be a nice feature for the Dotty linker/Scala.js linker. For the Dotty linker, specialization and devirtualization are already huge, and we don't want to spread ourselves thin.

@odersky

  • The object's initializer does not have side effects
  • Nobody does interesting things with the object's class (because it will go away)

Both conditions are uncheckable with our current technologies. The first condition might be checkable with a global code analysis or an effect system, but neither exists yet. The second condition is probably too vaguely defined to be effectively checkable.

Once you have that linker/optimizer and an "enumeration optimizer", it seems to me that the main (only?) risk would be getting slower enumerations if you do, as you say, "interesting things" with the objects. Like with @tailrec, you could have an annotation ensuring that the enumeration is in fact compiled/linked efficiently. If that cannot be checked, you would get a warning or error. But I can also imagine that too many uses of the enumeration might fail the check, making the annotation less useful; that remains a bit unclear to me.

Not to mention that interop with Java enumerations would become a game of roulette with an unpredictable optimizer.

It wouldn't be if Scala enumerations that must be Java-compatible were marked as such, e.g. by extending a particular trait or having an annotation, as suggested in some comments above. Only pure-Scala enumerations would benefit from an optimized representation.
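One concrete shape the "marked as such" idea could take is a sketch like the following, where an enumeration opts into Java compatibility by extending a designated class (this is, roughly, the approach Scala 3 eventually adopted with java.lang.Enum):

```scala
// Sketch: an enumeration explicitly marked as Java-compatible by
// extending java.lang.Enum, so an optimizer knows it must preserve the
// Java enum representation rather than compact it away.
enum Planet extends java.lang.Enum[Planet] {
  case Mercury, Venus, Earth
}
```

Anything not marked this way would remain fair game for a more aggressive representation.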

All in all, I don't have anything against the simpler and more straightforward solution of using values rather than objects.

I still would love to see somebody experimenting with optimizing the runtime representation of ADTs in the context of a linker/optimizer. As @DarkDimius just wrote above, there are other fish to fry, so I will leave things at that for now ;)

I have updated the proposal to reflect all suggestions in the discussion that were adopted so far. I believe this proposal as a whole will not change much anymore. I would still welcome suggestions on details. More fundamental change requests, such as a change to the syntax or scope of the proposal, should be worked out fully as alternative proposals, ideally including an implementation, so that they can be discussed in depth. In that case, it would be best to file the alternative proposals as separate issues.

We'd like to reach a decision whether we want to go ahead with this or not by the end of next week.

I like the basic design of these enum GADTs:

enum Option[+T] {
  case Some(x: T)
  case None extends Option[Nothing]
}
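For comparison, the enum above corresponds roughly to the standard sealed-hierarchy encoding described in the proposal's motivation. This is a sketch, not the exact desugaring scheme the proposal specifies; the names Opt/Sm/Nn are made up to avoid clashing with the standard library's Option:

```scala
// Rough hand-written equivalent of the Option enum above: a sealed base
// class with case children, where the GADT-style case fixes its own
// type argument via the extends clause.
sealed trait Opt[+T]
case class Sm[+T](x: T) extends Opt[T]
case object Nn extends Opt[Nothing]
```

The enum syntax buys conciseness; the underlying pattern-matching behavior stays the same.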

But to add instance members, would it be possible to put them directly
in the cases? I.e.,

enum Option[+T] {
  case Some(x: T) { override def isDefined: Boolean = true }
  case None extends Option[Nothing] {
    override def isDefined: Boolean = false
  }

  def isDefined: Boolean
}

object Option {
  def apply[T](t: T): Option[T] = if (t != null) Some(t) else None
}
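The suggested shape can already be written in today's Scala with the sealed-hierarchy encoding, which suggests the question is mainly one of surface syntax. A sketch (MyOpt and friends are made-up names to avoid shadowing the real Option):

```scala
// What the suggestion amounts to in current Scala: an abstract member
// declared on the base type, overridden in the body of each case.
sealed trait MyOpt[+T] { def isDefined: Boolean }
case class MySome[+T](x: T) extends MyOpt[T] {
  override def isDefined: Boolean = true
}
case object MyNone extends MyOpt[Nothing] {
  override def isDefined: Boolean = false
}
```

So the debate below is about whether the enum syntax should admit these per-case bodies directly, not about whether the semantics are expressible.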

The logic is that if something inside an enum definition doesn't start
with the case keyword, it's an instance member.

The logic is that if something inside an enum definition doesn't start
with the case keyword, it's an instance member.

I am philosophically opposed to that. It makes enum look like a way to introduce another kind of class, but then case makes no sense in a class - it should go in the object! This matters when you consider what other members can be accessed from a case. The existing syntax treats an enum alone as neither a class nor an object (or, if you want, as both a class and an object).

OK. Not sure I understand this part:

This matters when you consider what other members can be accessed from a case.

Re: enums being just another kind of class [with instance members], I think that is familiar to Java programmers.