Add enum construct
odersky opened this issue · comments
Introduction
This is a proposal to add an `enum` construct to Scala's syntax. The construct is intended to serve both as a native implementation of enumerations as found in other languages and as a more concise notation for ADTs and GADTs. The proposal affects the Scala definition and its compiler in the following ways:
- It adds new syntax, including a new keyword, `enum`.
- It adds code to the scanner and parser to support the new syntax.
- It adds new rules for desugaring enums.
- It adds a predefined trait `scala.Enum` and a predefined runtime class `scala.runtime.EnumValues`.
This is all that's needed. After desugaring, the resulting programs are expressible as normal Scala code.
Motivation
`enum`s are essentially syntactic sugar, so one should ask whether they are necessary at all. Here are some issues that the proposal addresses:
- Enumerations as a lightweight type with a finite number of user-defined elements are not very well supported in Scala. Using integers for this task is tedious and loses type safety. Using case objects is less efficient and gets verbose as the number of values grows. The existing library-based approach in the form of Scala's `Enumeration` object has been criticized for being hard to use and for its lack of interoperability with host-language enumerations. Alternative approaches, such as Enumeratum, fix some of these issues but have their own tradeoffs.
- The standard approach to model an ADT uses a `sealed` base class with `final` case classes and objects as children. This works well, but is more verbose than specialized syntactic constructs.
- The standard approach keeps the children of ADTs as separate types. For instance, `Some(x)` has type `Some[T]`, not `Option[T]`. This gives finer type distinctions but can also confuse type inference. Obtaining the standard ADT behavior is possible, but very tricky. Essentially, one has to make the case class `abstract` and implement the `apply` method in the companion object by hand.
- Generic programming techniques need to know all the child types of an ADT or a GADT. Furthermore, this information has to be present during type elaboration, when symbols are first completed. There is currently no robust way to do so. Even if the parent type is sealed, its compilation unit has to be analyzed completely to know its children. Such an analysis can potentially introduce cyclic references, or it is not guaranteed to be exhaustive; it seems impossible to avoid both problems at the same time.
I think all of these are valid criticisms. In my personal opinion, taken alone, none of these criticisms is strong enough to warrant introducing a new language feature, but taken together they could shift the balance.
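For concreteness, the "abstract case class plus hand-written `apply`" trick alluded to in the third point can be sketched in current Scala as follows (the names `Opt`, `Som`, and `Non` are made up here to avoid clashing with the standard library):

```scala
// Sketch of the trick: make the case class abstract and write `apply` by hand
// so that construction yields the parent type instead of the case type.
sealed trait Opt[+A]
abstract case class Som[+A](x: A) extends Opt[A]
case object Non extends Opt[Nothing]

object Som {
  // Hand-written apply returning Opt[A] rather than Som[A].
  def apply[A](x: A): Opt[A] = new Som[A](x) {}
}

val o = Som(1) // inferred as Opt[Int], not Som[Int]
```

This achieves the standard ADT behavior, but as the text says, it is tricky and easy to get wrong.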
Objectives
- The new feature should allow the concise expression of enumerations.
- Enumerations should be efficient, even if they define many values. In particular, we should avoid defining a new class for every value.
- It should be possible to model Java enumerations as Scala enumerations.
- The new feature should allow the concise expression of ADTs and GADTs.
- It should support all idioms that can be expressed with case classes. In particular, we want to support type and value parameters, arbitrary base traits, self types, and arbitrary statements in a case class and its companion object.
- It should lend itself to generic programming.
Basic Idea
We define a new kind of class, the `enum` class. This is essentially a `sealed` class whose instances are given by cases defined in its companion object. Cases can be simple or parameterized: simple cases without any parameters map to values, while parameterized cases map to case classes. A shorthand form `enum E { Cs }` defines both an enum class `E` and a companion object with cases `Cs`.
Examples
Here's a simple enumeration:
enum Color {
case Red
case Green
case Blue
}
or, even shorter:
enum Color { case Red, Green, Blue }
Here's a simple ADT:
enum Option[T] {
case Some[T](x: T)
case None[T]()
}
Here's `Option` again, but expressed as a covariant GADT, where `None` is a value that extends `Option[Nothing]`:
enum Option[+T] {
case Some[T](x: T)
case None
}
It is also possible to add fields or methods to an enum class or its companion object, but in this case we need to split the `enum` into a class and an object to make clear what goes where:
enum class Option[+T] extends Serializable {
def isDefined: Boolean
}
object Option {
def apply[T](x: T) = if (x != null) Some(x) else None
case Some[+T](x: T) {
def isDefined = true
}
case None {
def isDefined = false
}
}
The canonical Java "Planet" example (https://docs.oracle.com/javase/tutorial/java/javaOO/enum.html) can be expressed
as follows:
enum class Planet(mass: Double, radius: Double) {
private final val G = 6.67300E-11
def surfaceGravity = G * mass / (radius * radius)
def surfaceWeight(otherMass: Double) = otherMass * surfaceGravity
}
object Planet {
case MERCURY extends Planet(3.303e+23, 2.4397e6)
case VENUS extends Planet(4.869e+24, 6.0518e6)
case EARTH extends Planet(5.976e+24, 6.37814e6)
case MARS extends Planet(6.421e+23, 3.3972e6)
case JUPITER extends Planet(1.9e+27, 7.1492e7)
case SATURN extends Planet(5.688e+26, 6.0268e7)
case URANUS extends Planet(8.686e+25, 2.5559e7)
case NEPTUNE extends Planet(1.024e+26, 2.4746e7)
def main(args: Array[String]) = {
val earthWeight = args(0).toDouble
val mass = earthWeight/EARTH.surfaceGravity
for (p <- enumValues)
println(s"Your weight on $p is ${p.surfaceWeight(mass)}")
}
}
Syntax Extensions
Changes to the syntax fall in two categories: enum classes and cases inside enums.
The changes are specified below as deltas with respect to the Scala syntax given here
- Enum definitions and enum classes are defined as follows:

      TmplDef  ::=  `enum' `class' ClassDef
                 |  `enum' EnumDef
      EnumDef  ::=  id ClassConstr [`extends' [ConstrApps]] [nl]
                    `{' EnumCaseStat {semi EnumCaseStat} `}'
- Cases of enums are defined as follows:

      EnumCaseStat  ::=  {Annotation [nl]} {Modifier} EnumCase
      EnumCase      ::=  `case' (EnumClassDef | ObjectDef | ids)
      EnumClassDef  ::=  id [ClsTpeParamClause | ClsParamClause] ClsParamClauses TemplateOpt
      TemplateStat  ::=  ... | EnumCaseStat
Desugarings
Enum classes and cases expand via syntactic desugarings to code that can be expressed in existing Scala. First, some terminology and notational conventions:
- We use `E` as the name of an enum class, and `C` as the name of an enum case that appears in the companion object of `E`.
- We use `<...>` for syntactic constructs that in some circumstances might be empty. For instance, `<body>` represents either the body of a case between `{...}` or nothing at all.
- Enum cases fall into three categories:
  - Class cases are those cases that are parameterized, either with a type parameter section `[...]` or with one or more (possibly empty) parameter sections `(...)`.
  - Simple cases are cases of a non-generic enum class that have neither parameters nor an extends clause or body. That is, they consist of a name only.
  - Value cases are all cases that do not have a parameter section but that do have a (possibly generated) extends clause and/or a body.

Simple cases and value cases are collectively called singleton cases.
The desugaring rules imply that class cases are mapped to case classes, and singleton cases are mapped to `val` definitions.
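To make the three categories concrete, here is a sketch in the proposed syntax (not compilable today) showing one case of each kind:

```scala
enum Color {
  case Red, Green, Blue         // simple cases: a name only
  case Custom(rgb: Int)         // class case: has a parameter section
  case Unknown extends Color {  // value case: no parameters, but an extends clause and body
    override def toString = "?"
  }
}
```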
There are eight desugaring rules. Rules (1) and (2) desugar enums and enum classes. Rules (3) and (4) define extends clauses for cases that are missing them. Rules (5) to (7) define how such expanded cases map into case classes, case objects, or vals. Finally, rule (8) expands comma-separated simple cases into a sequence of separate cases.
1. An `enum` definition

       enum E ... { <cases> }

   expands to an enum class and a companion object:

       enum class E ...
       object E { <cases> }

2. An enum class definition

       enum class E ... extends <parents> ...

   expands to a `sealed` `abstract` class that extends the `scala.Enum` trait:

       sealed abstract class E ... extends <parents> with scala.Enum ...

3. If `E` is an enum class without type parameters, then a case in its companion object without an extends clause

       case C <params> <body>

   expands to

       case C <params> <body> extends E

4. If `E` is an enum class with type parameters `Ts`, then a case in its companion object without an extends clause

       case C <params> <body>

   expands according to two alternatives, depending on whether `C` has type parameters or not. If `C` has type parameters, they must have the same names and appear in the same order as the enum type parameters `Ts` (variances may be different, however). In this case

       case C [Ts] <params> <body>

   expands to

       case C[Ts] <params> extends E[Ts] <body>

   For the case where `C` does not have type parameters, assume `E`'s type parameters are

       V1 T1 >: L1 <: U1 , ... , Vn Tn >: Ln <: Un      (n > 0)

   where each of the variances `Vi` is either `'+'` or `'-'`. Then the case expands to

       case C <params> extends E[B1, ..., Bn] <body>

   where `Bi` is `Li` if `Vi = '+'` and `Ui` if `Vi = '-'`. It is an error if `Bi` refers to some other type parameter `Tj (j = 1,..,n)`. It is also an error if `E` has type parameters that are non-variant.

5. A class case

       case C <params> ...

   expands analogously to a case class:

       final case class C <params> ...

   However, unlike for a regular case class, the return type of the associated `apply` method is a fully parameterized type instance of the enum class `E` itself instead of `C`. Also, the enum case defines an `enumTag` method of the form

       def enumTag = n

   where `n` is the ordinal number of the case in the companion object, starting from 0.

6. A value case

       case C extends <parents> <body>

   expands to a value definition

       val C = new <parents> { <body>; def enumTag = n; $values.register(this) }

   where `n` is the ordinal number of the case in the companion object, starting from 0. The statement `$values.register(this)` registers the value as one of the `enumValues` of the enumeration (see below). `$values` is a compiler-defined private value in the companion object.

7. A simple case

       case C

   of an enum class `E` that does not take type parameters expands to

       val C = $new(n, "C")

   Here, `$new` is a private method that creates an instance of `E` (see below).

8. A simple case consisting of a comma-separated list of enum names

       case C_1, ..., C_n

   expands to

       case C_1; ...; case C_n

   Any modifiers or annotations on the original case extend to all expanded cases.
Enumerations
Non-generic enum classes `E` that define one or more singleton cases are called enumerations. Companion objects of enumerations define the following additional members:

- A method `enumValue` of type `scala.collection.immutable.Map[Int, E]`. `enumValue(n)` returns the singleton case value with ordinal number `n`.
- A method `enumValueNamed` of type `scala.collection.immutable.Map[String, E]`. `enumValueNamed(s)` returns the singleton case value whose `toString` representation is `s`.
- A method `enumValues` which returns an `Iterable[E]` of all singleton case values in `E`, in the order of their definitions.
Companion objects that contain at least one simple case define in addition:

- A private method `$new` which defines a new simple case value with a given ordinal number and name. This method can be thought of as being defined as follows:

      def $new(tag: Int, name: String): E = new E {
        def enumTag = tag
        override def toString = name
        $values.register(this) // register the value so that `enumValue` and `enumValues` can return it
      }
Examples
The `Color` enumeration
enum Color {
case Red, Green, Blue
}
expands to
sealed abstract class Color extends scala.Enum
object Color {
private val $values = new scala.runtime.EnumValues[Color]
def enumValue: Map[Int, Color] = $values.fromInt
def enumValueNamed: Map[String, Color] = $values.fromName
def enumValues: Iterable[Color] = $values.values
private def $new(tag: Int, name: String): Color = new Color {
def enumTag: Int = tag
override def toString: String = name
$values.register(this)
}
final val Red: Color = $new(0, "Red")
final val Green: Color = $new(1, "Green")
final val Blue: Color = $new(2, "Blue")
}
The `Option` GADT
enum Option[+T] {
case Some[+T](x: T)
case None
}
expands to
sealed abstract class Option[+T] extends scala.Enum
object Option {
final case class Some[+T](x: T) extends Option[T] {
def enumTag = 0
}
object Some {
def apply[T](x: T): Option[T] = new Some(x)
}
val None = new Option[Nothing] {
def enumTag = 1
override def toString = "None"
$values.register(this)
}
}
Note: We have added the `apply` method of the case class expansion because its return type differs from the one generated for normal case classes.
Implementation Status
An implementation of the proposal is in #1958.
Interoperability with Java Enums
On the Java platform, an enum class may extend `java.lang.Enum`. In that case, the enum as a whole is implemented as a Java enum. The compiler will enforce the necessary restrictions on the enum to make such an implementation possible. The precise mapping scheme and associated restrictions remain to be defined.
Open Issue: Generic Programming
One advantage of the proposal is that it offers a reliable way to enumerate all cases of an enum class before any typechecking is done. This makes enums a good basis for generic programming. One could envisage compiler-generated hooks that map enums to their "shapes", i.e. typelevel sums of products. An example of what could be done is elaborated in a test in the dotty repo.
A very nice explanation of the new feature 👍
There seems to be an inconsistency between the desugaring Rule 5 and the following code example:
enum Option[+T] {
case Some(x: T)
case None extends Option[Nothing]
}
If I understand correctly, desugaring Rule 5 says that for the case `None`, it is an error for `Option` to take type parameters.
- A case without explicitly given type or value parameters but with an explicit extends clause or body
case C extends |parents| |body|
expands to a value definition
val C = new |parents| { |body|; def enumTag = n }
where n is the ordinal number of the case in the companion object, starting from 0. It is an error in this case if the enum class E takes type parameters.
Another minor question is, it seems the following code in the example expansions does not type check:
object Some extends T => Option[T] {
def apply[T](x: T): Option[T] = new Some(x)
}
Do we need to remove the `extends T => Option[T]` part?
If I understand correctly, the desugaring Rule 5 says that for the case None, it is an error for Option to take type parameters.
Well spotted. This clause should go to rule 6. I fixed it.
Another minor question is, it seems the following code in the example expansions does not type check
You are right. We need to drop the extends clause.
In the following introductory example:
enum class Option[+T] extends Serializable {
def isDefined: Boolean
}
object Option {
def apply[T](x: T) = if (x != null) Some(x) else None
case Some(x: T) {
def isDefined = true
}
case None extends Option[Nothing] {
def isDefined = false
}
}
I find it a little bit confusing that in the `case Some(x: T)` definition the type parameter `T` is bound to the one defined in `enum class Option[+T]`. I think it is the first time that symbol binding crosses lexical scopes.
Also, how would that interact with additional type parameters?
case Some[A](x: T, a: A)
Also, how would that interact with additional type parameters?
We have to disallow that.
Keeping type parameters undefined looks more like an artifact of desugaring and Dotty's type system than a feature to me. Are there any cases where this would actually be useful?
enum Option[T] {
case Some(x: T)
case None()
}
OTOH, covariant type parameters look very useful and are common in immutable data structures. Could this case be simplified?
enum Option[+T] {
case Some(x: T)
case None extends Option[Nothing]
}
How about automatically filling in unused type parameters in cases as their lower (covariant) or upper (contravariant) bounds and only leaving invariant type parameters undefined?
- It should be possible to model Java enumerations as Scala enumerations.
Instead of only exposing Java enums to Scala in this way, is there a well-defined subset of Scala enumerations that can be compiled to proper Java enums for the best efficiency and Java interop on the JVM?
I'm proposing a modification to the longer syntax:
enum class Option[+T] extends Serializable {
def isDefined: Boolean
}
object Option {
def apply[T](x: T) = if (x != null) Some(x) else None
case Some[T](x: T) { // <-- changed
def isDefined = true
}
case None extends Option[Nothing] {
def isDefined = false
}
}
In this case the `T` is obviously bound in scope. It still desugars to the same thing, but I feel it's more regular, and it allows renaming the type argument:
enum class Option[+T] extends Serializable {
def isDefined: Boolean
}
object Option {
def apply[T](x: T) = if (x != null) Some(x) else None
case Some[U](x: U) extends Option[U] { // <-- changed
def isDefined = true
}
case None extends Option[Nothing] {
def isDefined = false
}
}
At the meeting, we also proposed an additional rule: require that all `extends` clauses in `case`s list the enum super-class. This would render the code below invalid:
enum class Option[+T] extends Serializable {
def isDefined: Boolean
}
object Option {
def apply[T](x: T) = if (x != null) Some(x) else None
case Some(x: Int) extends AnyRef { // <-- Not part of enum
def isDefined = true
}
case None extends Option[Nothing] {
def isDefined = false
}
}
@DarkDimius I think this is still insufficient, because it is still (a little bit) confusing that the `T` type parameter of `case Some[T]` is automatically applied to the `T` type parameter of the parent `Option[+T]` class.
Despite these inconveniences, I think that the shorter syntax is a huge benefit, so I find it acceptable to have just `case Some(x: T)` as a shorthand for `case class Some[T](x: T) extends Option[T]`. It is always possible to fall back to the usual sealed traits and case classes for the cases where we need more fine-grained control (e.g. `case class Flipped[A, B]() extends Parent[B, A]` cannot be expressed with case enums).
One more point discussed at the dotty meeting: there should be an additional limitation that no other class can extend an abstract `case class`. Otherwise the super-class isn't a sum of its children, and serialization/pattern matching won't be able to enumerate all children.
It is always possible to fallback to usual sealed traits and case classes for the cases we need more fine grained control
Sealed classes give fewer guarantees. The point of this addition is that you cannot get equivalent guarantees from sealed classes.
e.g.
case class Flipped[A, B]() extends Parent[B, A]
can not be expressed with case enums).
Given the currently proposed rules, it can be expressed; you simply need to write it explicitly using the longer version.
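For instance, under the proposed rules the explicit longer version would presumably look like this (a sketch; `Parent` stands for the hypothetical enum class from the earlier comment):

```scala
enum class Parent[A, B]
object Parent {
  // The explicit extends clause is free to permute the type arguments.
  case Flipped[A, B]() extends Parent[B, A]
}
```

This satisfies the meeting rule above, since the extends clause still lists the enum super-class.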
However, unlike for a regular case class, the return type of the associated apply and copy methods is a fully parameterized type instance of the enum class E itself instead of C
Am I understanding correctly that the following occurs?
enum IntWrapper {
case W(i: Int)
case N
}
val i = W(1)
i match {
  case w: W =>
    w.copy(i = 2)
      .copy(i = 3) // this line won't compile because the previous copy returned an IntWrapper
  case N => ???
}
If so, then it seems like `copy` should still return `C`.
If so then it seems like copy should still return C
That's a good argument. I dropped `copy` from the description.
Instead of only exposing Java enums to Scala in this way, Is there a well-defined subset of Scala enumerations that can be compiled to proper Java enums for the best efficiency and Java interop on the JVM?
AFAICT, any `enum` containing only simple `case`s (i.e., without `()`) can be compiled to Java enums, and exposed as an enum to Java for interop. This even includes `enum class`es with cases that redefine members.
For enumerations I would love to see a `valueOf` method of type `String => E` as well, to look values up by name as well as by ordinal.
I'd probably never use a naked `Int => E` `valueOf` method for fear of exceptions; I'd very much prefer `Int => Option[E]`. Or maybe (if that's still a thing in dotty) something like paulp's structural pseudo-Option used in pattern matching.
Oh, and if there is a `String => E` (`Option` preferred, of course), then why not an `E => String`, too?
This looks great!
I don't think the long form is an improvement, though. The `case` keyword is all you need to disambiguate the members of the ADT from other stuff.
enum Either[+L, +R] {
def fold[Z](f: L => Z, g: R => Z): Z
case Left(value: L) {
def fold[Z](f: L => Z, g: Nothing => Z) = f(value)
}
case Right(value: R) {
def fold[Z](f: Nothing => Z, g: R => Z) = g(value)
}
}
I don't see any issues here. I agree with Stefan that generics should be handled automatically by default, and have the type parameter missing and filled in as Nothing if the type is not referenced. If you want something else, you can do it explicitly.
case Right[+L, +R](value: R) extends Either[L, R]
Currently this is looking great! I wrote Enumeratum and would be happy to see something like this baked into the language :)
Just a few thoughts/questions based on feedback I've received in the past:
- It might be nice to make `valueOf` non-throwing (returning `Option`) by default. Slightly easier to reason about, and might even be faster.
- A `withName` method might also be nice to have.
- Would it be possible to customise the `enumTag` for a given enum member so that users can control the resolution of `valueOf`? If so, it might be nice to have the compiler check for uniqueness too :)
AFAICT, any enum containing only simple cases (i.e., without ()) can be compiled to Java enums, and exposed as an enum to Java for interop. This even includes enum classes with cases that redefine members.
Compiling to Java enums has some downsides:
- The base enum type cannot have a user-selected superclass, as the one-and-only superclass slot would be taken by extending `java.lang.Enum`.
- Methods inherited from `java.lang.Enum` might not be desired. E.g. `case Person(name: Name)` would not be allowed, because `java.lang.Enum.name(): String` is final, so the accessor method for `name` would clash.
- Crossing the threshold of "compilable to platform Enum" in either direction (by adding the first, or removing the last, case with params) would likely be a binary-incompatible change.
This suggests to me that we need an opt-in (or maybe an opt-out) annotation for this compilation strategy.
Java enums are exposed to the Scala typechecker as though they were constant value definitions:
scala> symbolOf[java.lang.annotation.RetentionPolicy].companionModule.info.decls.toList.take(3).map(_.initialize.defString).mkString("\n")
res21: String =
final val SOURCE: java.lang.annotation.RetentionPolicy(SOURCE)
final val CLASS: java.lang.annotation.RetentionPolicy(CLASS)
final val RUNTIME: java.lang.annotation.RetentionPolicy(RUNTIME)
scala> showRaw(symbolOf[java.lang.annotation.RetentionPolicy].companionModule.info.decls.toList.head.info.resultType)
res24: String = ConstantType(Constant(TermName("SOURCE")))
This is something of an implementation detail, but is needed:
- to allow references in (platform) annotation arguments, which only admit constant values
- to keep track of the cases long enough for the pattern matcher to analyse exhaustivity/reachability.
The enums from this proposal will need a similar approach, and I think that should be specced.
@Ichoran The long form is intended to allow for
- class members
- companion members
- parent types of the companion
I played with various variants but found none that was clearer than what was eventually proposed. If one is worried about scoping of the type parameter, one could specify that the long form is a single syntactic construct
enum <ident> <params> extends <parents> <body>
[object <ident> extends <parents> <body>]
and specify that any type parameters in `<params>` are visible in the whole construct. That would be an option.
I'd probably never use a naked Int=>E valueOf method for fear of exceptions; i'd very much prefer Int=>Option[E]. or maybe (if that's still a thing in dotty) something like paulp's structural pseudo-Option used in pattern matching.
What about making `valueOf` an immutable map?
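A `Map` would give both styles at once: applying the map throws on a missing key, while `get` returns an `Option`. A small sketch in current Scala, with a plain map standing in for the compiler-generated member (the names here are illustrative only):

```scala
sealed trait Color
case object Red extends Color
case object Green extends Color

// Stand-in for the generated valueOf: Map[Int, Color] of an enumeration.
val valueOf: Map[Int, Color] = Map(0 -> Red, 1 -> Green)

valueOf(0)     // Red; throws NoSuchElementException for an unknown tag
valueOf.get(5) // None; total, Option-returning lookup
```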
@retronym Thanks for the analysis wrt Java enums. It seems like an opt-in is the best way to do it. How about we take inheritance from `java.lang.Enum` as our cue? I.e.
enum JavaColor extends java.lang.Enum {
case Red
case Green
case Blue
}
Then there would be no surprise that we cannot redefine `name`, because it is final in `java.lang.Enum`.
Also, can you suggest spec language for the constantness part?
A `withName` method might also be nice to have
I agree. This would also be required for most of the useful generic programming stuff we want to do (e.g. automatically generate serializers/deserializers for enumerations based on their name rather than their ordinal).
@szeiger I agree it would be nice if we could fill in extremal types of co/contravariant enum types, i.e. expand
case None
to
case None extends Option[Nothing]
But maybe it's too much magic? Have to think about it some more.
A withName method might also be nice to have
I agree. This would also be required for most of the useful generic programming stuff we want to do (e.g. automatically generate serializers/deserializers for enumerations based on their name rather than their ordinal).
Agreed. But that means we'd have to design that feature together with the generic programming stuff, because it would likely end up on the type level. Not sure about this point.
I pushed a new version where enumerations now define three public members:
- valueOf: Map[Int, E]
- withName: Map[String, E]
- values: Iterable[E]
How does the result type of `apply` affect cases with type parameters that do not coincide with the superclass' type parameters?
Besides, the rationale for using the super class as the `apply` result type is that it will be more friendly to type inference. However, I fail to come up with a realistic example that would not infer correctly before, but would infer correctly with this feature. For example, the typical type inference problem:
val x: List[Int] = ???
val reversed = x.foldLeft(Nil)((ys, y) => y :: ys)
In current Scala, the type parameter of `foldLeft` is inferred as `Nil.type`, and then `y :: ys` does not conform to that. With the scheme proposed here, it still fails to compile, because the type parameter is inferred as `List[Nothing]`.
The exact same thing would happen with `None` and `Option[Nothing]`.
Is there any (realistic, common) snippet that would fail to compile before, and succeed now?
colors.foldLeft(Color.Black)((result, color) => someOp(result, color))
Where `someOp` computes a color based on two colors.
However, your point with `List[Nothing]` and `Option[Nothing]` still holds :-(
Where someOp computes a color based on two colors.
Have you ever actually seen such a case? I.e., an operation that takes an enum value and something, and returns a new value of the same enum set?
There's a reason I insisted on realistic, common. We sure can come up with snippets, but that does not count.
Here is a snippet adapted from a real example:
sealed trait Step
case class A(b: Boolean) extends Step
case class B(s: String) extends Step
xs.foldLeft[Step](A(b = false))( … )
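If `Step` were an enum under this proposal, `A(b = false)` would already have type `Step`, so the explicit type argument on `foldLeft` could be dropped. A sketch in the proposed syntax (the fold body is elided as in the original):

```scala
enum Step {
  case A(b: Boolean)
  case B(s: String)
}

// A.apply returns Step, so no ascription or type argument is needed:
xs.foldLeft(A(b = false))( … )
```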
@retronym @odersky I like the idea of opting into Java enum compatibility by extending `java.lang.Enum`. The spec just needs to be compatible with Java enums (https://docs.oracle.com/javase/specs/jls/se8/html/jls-8.html#jls-8.9.3) in principle, so that a compiler can emit them in these cases. In particular, defining both the Scala enum `values` method from the current implementation and the static `values` method required by Java enums could be problematic.
i'd probably never use a naked Int=>E valueOf method for fear of exceptions; i'd very much prefer Int=>Option[E]. or maybe (if that's still a thing in dotty) something like paulp's structural pseudo-Option used in pattern matching.
Can't agree with this more. If we are going to the trouble of adding enums, we really need a `String => Option[T]` function which lets you look up an enum by its `String` representation. That's a lot of the reason why libraries like https://github.com/lloydmeta/enumeratum are required in order to do basic stuff that is already available by default in other languages.
Otherwise I agree with the proposal in general
After thinking about `values` some more: enumerations are mostly "normal" bytecode, but with some special support by the Java compiler. In this sense they are similar to varargs, so we could treat `values: Array[E]` vs `values: Iterable[E]` in the same way:
- When defining an enum that `extends java.lang.Enum` in Scala, generate a `values` method with the Java signature instead of the one with the Scala signature.
- When using any Java enum (whether defined in Java or Scala) from Scala, treat its `values` method as having the Scala signature (with the same downside as for Java varargs: every call needs to allocate a wrapper `Seq`).
I updated the proposal to add `withName`, and to make both `valueOf` and `withName` maps. The implementation in #1958 has also been updated. Still to do: clarify the connection to Java enums and what it means for `values`.
I'm VERY excited about this proposal. I spent 50+ hours working on a solution 2+ years ago, planning to turn it into a macro. And to solve the core problem, Odersky is the one who finally got me to see that a key fix was to use an abstract case class.
It thrills me to the moon that I can consider abandoning my own implementation.
@sjrd this doesn't take typeclasses into account. Having an `Option[Nothing]` is still miles more useful than `None.type` if, for example, you care about using a `Monad[Option]` with it.
Is there a precedent for having the compiler output depend on `scala.Map`?
@odersky thinking about it again, having a `Map[Int, E]` and an additional `Iterable[E]` is overkill. A simple (preferably immutable) `Seq` is just as good; it's not that big a difference whether I call `Map#get` or `Seq#lift`.
Is there a precedent for having the compiler output depend on scala.Map?
AFAIK there is no such precedent currently in Dotty.
If my answer here is accurate, there isn't any precedent in scalac either.
Will `case class` and `sealed` be removed in Scala 3?
Will case class and sealed be removed in Scala 3?
Case classes: definitely not. We need open as well as closed sums. Sealed: it will also stay, because there are more complex class hierarchies that cannot be expressed by enums but that are still confined to one compilation unit.
I don't feel good about the language having two very similar features: `sealed trait` and `enum`.
I also do not like the nested definitions in the `enum` syntax. They bring Java-style static member definitions back. Static member definitions are inconsistent with the type/companion separation conventions of the Scala language, and cause confusion with Scala's path-dependent type syntax.
It looks like the only reason for `case val` is sharing Java class files between enum values. If that is the case, why not apply this optimization to all `object`s? Is it possible to share the class file for all empty-body `object`s that have the same super types and same modifiers (no matter whether they are `case object`s or not)?
I didn't feel good if the language has two very similar features: sealed trait and enum
They are not just similar, but related: one expands into the other. There's arguably lots of precedent in Scala for this. Think about how function values expand into objects, or how for expressions expand into map/flatMap/withFilter operations.
Is it possible to share the class file for all empty-body objects that have the same super types and same modifiers (no matter whether they are case objects or not)?
That's a good question. There's a problem with the fact that each object is supposed to define its own class. I am not sure to what degree one can ignore that. Also, there's the issue that objects are lazy and we want non-lazy vals for efficiency.
how for expressions expand into map,flatMap,withFilter operations.
I bet the `for`/`yield` expansion is a nightmare for the scala.meta guys. It also troubled macro authors like me, because we have to deal with two different syntaxes with (almost) equal ASTs.
One expands into the other
One key difference, though, is the return type of the apply method of the case companions. Do you think we could pull this feature to sealed traits too?
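For reference, the apply-widening that enum case companions would provide can be approximated in today's Scala with the "abstract case class" trick mentioned in the proposal's introduction; a sketch with illustrative names (Opt, Som, Non):

```scala
sealed trait Opt[+T]

// Making the case class abstract suppresses the synthetic companion apply,
// so we can write one by hand that widens the result to the base type.
abstract case class Som[+T](x: T) extends Opt[T]
object Som {
  def apply[T](x: T): Opt[T] = new Som[T](x) {}
}

case object Non extends Opt[Nothing]
```

With this encoding, Som(1) is typed as Opt[Int] rather than Som[Int], which is what the proposed enum cases would give for free.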
That's a good question. There's a problem with the fact that each object is supposed to define its own class. I am not sure to what degree one can ignore that. Also, there's the issue that objects are lazy and we want non-lazy vals for efficiency.
I know the optimization would break the behavior of getClass. Fortunately, IIRC, there is no mention of getClass in the SLS.
One key difference, though, is the return type of the apply method of the case companions. Do you think we could pull this feature to sealed traits too?
Usually, adding methods to an existing object would not break source-code backward compatibility.
Compiling empty-body objects to the same shared class would expose more fragility at the binary compatibility level: now adding the first member to an object would break binary compatibility. I don't think this is an option.
Why not go full-blown GADT style? enum Option[T] = None | Some(t: T) or enum Either[A, B] = Left(a: A) | Right(b: B)? And then just replace enum with data?
@notxcain We want to maintain the ability for enumeration values to define custom methods and inherit from custom classes that aren't necessarily inherited by the common enum definition.
Binary compatibility is a platform-dependent attribute that does not affect the semantics of the language.
It does affect the semantics of the ecosystem, though, which is closely related to the language. Dismissing binary compat concerns like that is dangerous for the well-being of the language, because a language cannot survive without its ecosystem.
I agree binary compatibility matters. However, since both @shareJavaClass case object and case val have the same impact on binary compatibility, why not keep the existing syntax?
I prefer @shareJavaClass case object because an author of a macro library like upickle does not have to modify its code, as the AST does not change.
I am fully in favor of such proposal, for all the reasons invoked above!
A few months ago I made a prototype based on macro annotations that had almost the same syntax as this proposal. This was a good compromise, reusing the existing Scala syntax but allowing for more concision.
However, I am worried that this syntax will confuse newcomers more than anything. The main reason, as pointed by @Atry, is that it blurs the distinction between object and class members.
Concepts such as type members are already hard to grasp coming from other languages; if we make the object/class member distinction murkier, it will jump in the way of the fundamental intuition.
Since we have the possibility to change the syntax of the language, I would propose a syntax where the cases are not syntactically scoped inside the class, but rather "appended" to it, a bit like one defines co-recursive entities in ML by starting with let or type and then separating definitions with and. To remain Scala-ish and avoid adding keywords, we could use with instead:
enum Option[T]
with case Some(x: T)
with case None
object Option {
def apply[T](x: T) = if (x != null) Some(x) else None
}
Or if we really want fewer keystrokes, simply use case alone:
enum Option[T]
case Some(x: T)
case None
enum Color case Red case Green case Blue
Since the cases are syntactically kept together with the enum (as opposed to potentially being defined in an object later in the file), it is no longer a problem to have them refer to the enum's type parameters.
I would also be in favor of two improvements:
- Nominality of type and value parameters: if a case has parameters named the same as the parent enum's, these are used in the implicit extends clause. Parent enum parameters that are not parameters of the case can be defined in the body of the case.
- Type and type-bounds elision: when introducing a case parameter homonymous to an enum parameter, its type (for values) or bounds (for types) can be left out.
For example:
enum Param(name: String, typ: Type)
case Nominal(name, typ)
case Positional(ord: Int, typ) { val name = s"_$ord" }
Would be comparable to the current:
sealed abstract class Param(val name: String, val typ: Type) extends Product with Serializable
final case class Nominal(override val name: String, override val typ: Type) extends Param(name, typ)
final case class Positional(ord: Int, override val typ: Type) extends Param(s"_$ord", typ)
It also means we can express the Flipped case alluded to by @julienrf without an explicit extends clause, as in: enum Parent[A, B] case Flipped[B, A].
I agree with letting unspecified type parameters resolve to the lower bound of covariant and upper bound of contravariant parameters. This is consistent with the way they are inferred in expressions.
Also, the same type/bounds elision principle could be applied to overriding methods as well (which is implemented in my boilerless prototype). I think it would make sense to allow that in enums only, because in that situation one can quickly look up the parent method signature (since the enum has to be defined just above the cases).
Combined with the previous remark, this means one can write:
enum EitherOrBoth[+A,+B] { def fold[T](f: A => T, g: B => T)(m: (T,T) => T): T }
case First (value: A) { def fold(f,g)(m) = f(value) }
case Second(value: B) { def fold(f,g)(m) = g(value) }
case Both(fst: A, snd: B) { def fold(f,g)(m) = m(f(fst),g(snd)) }
All that being said, and if we assume nominal implicit extension, the magic of automatic type parameters insertion becomes of dubious usefulness. It would be better IMHO to have cases list all their parameters explicitly (types and values). This would not lead to much more code, but would be much clearer.
Say, for example:
enum MyOption[+T <: AnyRef] // bounds are implicitly propagated to homonymous case parameters
case Som[+T](x: T) // means `case Some[+T<:AnyRef](x:T) extends MyOption[T]`
case Non // means `case Non extends MyOption[Nothing]`
I like this proposal. I do hope two things come in:
- Inferring Nothing on covariant enums that have cases that don't use a type parameter.
- I really don't like preventing additional type parameters on the cases: #1970 (comment)
If we ban additional type parameters, then when you need to add one, you have to refactor to sealed trait / case classes and expand out this boilerplate by hand, which is a pretty bad experience.
I can't see why we can't just take all the type parameters of the enum and consider those the first type parameters on all the cases, with any additional ones coming after. (Or Scala could have multiple type-parameter blocks, which would also be nice for many inference situations.)
Or if we really want less key strokes, simply use case alone:
enum Option[T]
case Some(x: T)
case None
That was actually the design I started with. The problem with it is that it effectively disallows top-level enums. First, simple cases like None could not be mapped to vals, because vals are not allowed at the top level. Second, enums with many alternatives would pollute the namespace too much.
Note that one can always "pull out" cases into the enclosing scope using val and type aliases. So cases inside objects are more flexible.
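The hand-written version of that "pulling out" looks roughly like this (a sketch; Status and aliases are illustrative names):

```scala
sealed trait Status
object Status {
  case object Active extends Status
  case object Inactive extends Status
}

// Forwarders re-exposing the cases in an enclosing scope,
// using a val alias for the term and a type alias for the type:
object aliases {
  type Active = Status.Active.type
  val Active: Status.Active.type = Status.Active
  type Inactive = Status.Inactive.type
  val Inactive: Status.Inactive.type = Status.Inactive
}
```

After import aliases._, the cases are usable unqualified, just as if they had been defined at the outer level.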
cases inside objects are more flexible
Actually, I had in mind that enum E case A case B ... would expand into class E; object E { case class/object A; case class/object B; ... }. Realizing now that the expansion of my Param example was inaccurate.
I think it's still less counter-intuitive than something that looks like a class but whose "members" are moved to a companion object. After all, adding case in front of a class creates a companion object with an apply method inside. Is it a big stretch to accept that case following a class creates case classes in the companion object?
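Concretely, the expansion described above could be written by hand in today's Scala roughly as follows (a sketch, not actual compiler output):

```scala
// Hypothetical source:  enum E case A(x: Int) case B
// would expand, under this reading, to:
sealed trait E
object E {
  final case class A(x: Int) extends E
  case object B extends E
}
```

The cases live in the companion, so they do not pollute the enclosing namespace, yet E.A and E.B remain ordinary case classes/objects.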
A language should provide mechanism, not policy.
I agree that:
- Sharing a Java class is an optimization, which is important for the JVM.
- values, valueOf, and java.lang.Enum interoperability are important for a sealed trait.
I do not recognize "less verbose" as a goal for a language feature itself; "less verbose" should be achieved by composing tiny, reusable, elegant, atomic features.
I propose not to change the Scala 2.x syntax except for replacing sealed trait with enum.
enum Color
object Color {
case object Red extends Color
case object Blue extends Color
final case class Rgb(r: Byte, g: Byte, b: Byte) extends Color
}
expand to
sealed trait Color extends java.lang.Enum with scala.Enumeration
object Color extends scala.EnumerationCompanion {
@shareJavaClass case object Red extends Color
@shareJavaClass case object Blue extends Color
final case class Rgb(r: Byte, g: Byte, b: Byte) extends Color
}
Since enum would already cover all usages of sealed, I guess the sealed keyword could be removed from the language.
My take on compatibility with Java enums: if Java enum compatibility is going to harm the design of enum in general, then compatibility should be opt-in with some annotation. Scala is going to start taking other platforms into account, and I don't think being shackled to Java compatibility is a good idea. Binary compatibility for adding more entries to the enum should be provided, however; that is a killer feature.
Also, if the syntax is overly verbose for enums, then there isn't really going to be that much of a difference between using enum and something like Enumeratum, which means there is little advantage in having this as a language feature.
In my opinion, I would rather prefer a light enum that does less with minimal syntax, focuses on performance/memory usage, and provides SYB-generated methods for looking up enum cases by String value, rather than something which is sort of, but not really, ADTs.
After reading the entire discussion, it's starting to get muddled what the difference is between what people are discussing and what is possible now with ADTs/case objects. If the difference between the two isn't going to be substantial, it's really just going to be confusing for end users, and we don't want to repeat what happened with Scala's Enumeration again.
Another reason against this PR:
Multiple case objects for one sealed trait is a rare situation in real-world Scala code. For an enumeration with many candidate values, we usually store them in an external format, like a database schema, an XML DTD, or a JSON schema.
Popular Scala libraries, like Scalaz, are usually modeled with a lot of case classes to represent their states, even for those cases that have no arguments, because they might need a type parameter, like case class Tower[A]().
Even enum in the Java language is rarely used. Java programmers use the visitor pattern to avoid the need for ADTs.
If you search for "enum" in all Java source code on GitHub, you will find that the main usage of enum is testing the enum keyword itself:
https://github.com/search?l=Java&q=enum&type=Code&utf8=%E2%9C%93
Multiple case object for one sealed trait is a rare situation in real world Scala code.
I don't think we live in the same "real world". A search in the Scala.js repo shows that virtually all case objects are part of a collection of several that extend a sealed trait or sealed abstract class. And in most of those cases, it's actually a proper enum, in the sense that there is no case class in the same set.
@sjrd How many case classes are there in the Scala.js repository? What is the ratio between case object and case class?
I don’t know how relevant it is but in a simple project (~10k LOC) I count about 30 enum-like definitions (sealed traits extended by case objects only).
I noticed that almost all usages of case object in the Scala.js repository are schemas for configuration, while my libraries usually provide APIs for standalone mechanisms that do not expose configuration to users.
I guess that's why @sjrd and I live in different "real world"s. No surprise that other utility libraries like Scalaz or Shapeless do not use case object very often either.
Multiple case objects for one sealed trait is a rare situation in real-world Scala code. For an enumeration with many candidate values, we usually store them in an external format, like a database schema, an XML DTD, or a JSON schema.
Uh, I wouldn't make assumptions like this. Where I work, we have a huge amount of code consisting of enums defined with Enumeratum (i.e. case objects in a sealed abstract class, which could easily be a sealed trait).
How many case classes are there in Scala.js repository? What is the ratio between case object and case class?
scalajs$ git grep 'case object' | wc
75 502 8418
scalajs$ git grep 'case class' | wc
269 2197 35161
scalajs$ git grep 'case class' | grep -v '/Trees.scala' | wc
146 1101 19364
75 case objects versus 269 case classes. Of which 123 are for defining the ASTs of the IR and JavaScript code.
In any case, it's definitely not rare.
It's also used all over the place when building web servers (one of the primary types of software built with Scala today) to define enums that get output as JSON/XML. And any type of GUI needs a concept of ordered enums.
It's really not rare at all.
I think this code is unreadable:
enum class Option[+T] extends Serializable {
def isDefined: Boolean
}
object Option {
def apply[T](x: T) = if (x != null) Some(x) else None
case Some(x: T) {
def isDefined = true
}
case None extends Option[Nothing] {
def isDefined = false
}
}
Why is Some defined in the object? It is a class! enum should be used to list all possible cases; what is an abstract function declaration doing in an enum? In my opinion it should look like this:
enum Option[+T] {
Some(x: T)
None
} {
def isDefined: Boolean = this match {
case Some(_) => true
case None => false
}
}
It is a lot like Rust's enum and much more intuitive in my opinion.
Even if the use of case object is not "rare", is it really "many values"?
I believe there are a lot of enumeration-like structures that have many (10-256) values and that are not represented as case objects, precisely because case objects are too heavyweight. For instance, there are quite a few of these in the various compilers that I know. The idea is that the enum construct should cover these use cases.
So the number of case objects in existing code bases is not a good indicator for deciding whether we need more lightweight enums.
Enumerations should be efficient, even if they define many values. In particular, we should avoid defining a new class for every value
(I realize the following is a tangent and I am not involved at all writing compilers or optimizers.)
Scenarios where Scala will be deployed the good old fashioned way as non-optimized JAR files will remain with us for a long time, but won't the upcoming Dotty linker change our notion of what might be heavyweight vs. lightweight, inefficient vs. efficient?
For example with Scala.js (and its linker/optimizer) you care much less about optimizing for certain patterns because you know the optimizer takes care of it (for example erasing certain collection operations down to basic operations on Scala.js arrays, see @sjrd's talks on this).
With an appropriate linker/optimizer, a Scala ADT with hundreds of objects wouldn't necessarily imply the creation of hundreds of Java classes at runtime: the linker/optimizer could create a much more optimal runtime representation of sealed hierarchies as needed.
I don't think this changes much regarding the original proposal, which has other motivations. I just wanted to point out that the performance motivation might mostly go away in the future.
@ebruchez Scala.js optimizes away objects that are not referenced. Here we are talking about large enums which are referenced in large match expressions, e.g. an enum for token classes in a compiler.
Those won't be optimized away.
Generally, there's a danger to assume a "sufficiently smart compiler". If you have a concrete plan how to optimize a certain construct, that should surely be factored in. But if it's only a vague feeling, don't count on it.
That's understood. I realize that the Scala.js linker doesn't do this kind of things right now (AFAIK). I had something more concrete in mind in fact, although I realize that this would need implementation work in the linker.
Stephen Compall wrote here about the equivalence between type-safe pattern matches and "matchless folds". If I understand well, this implies that, assuming type safety, one could write pattern matches for ADTs which do not rely on individual isInstanceOf checks (or maybe just one such test on the base class/trait).
This means that many common sealed hierarchies could be implemented at runtime with much more efficient representations. Obviously the runtime pattern matcher would have to be aware of that.
This tells me that it might not entirely be a crazy assumption that common enums patterns could be massively optimized by the linker.
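A sketch of that idea in plain Scala: if each variant carries its own eliminator, a type-safe "match" becomes a fold that needs no per-case isInstanceOf tests (Maybe/Just/Empty are illustrative names, not the technique from the linked article verbatim):

```scala
sealed trait Maybe[+A] {
  // The fold *is* the match: each variant supplies its own branch.
  def fold[B](ifEmpty: => B)(ifJust: A => B): B
}
final case class Just[+A](a: A) extends Maybe[A] {
  def fold[B](ifEmpty: => B)(ifJust: A => B): B = ifJust(a)
}
case object Empty extends Maybe[Nothing] {
  def fold[B](ifEmpty: => B)(ifJust: Nothing => B): B = ifEmpty
}
```

Dispatch happens through a single virtual call to fold, which is the kind of representation freedom the linker argument appeals to.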
@ebruchez It comes down to this: Can we assume that an optimizing compiler/linker will replace object definitions with value definitions? For this we need to show:
- The object's initializer does not have side effects
- Nobody does interesting things with the object's class (because it will go away)
Both conditions are uncheckable with our current technologies. The first condition might be checkable with a global code analysis or an effect system, but neither exist yet. The second condition is probably too vaguely defined to be effectively checkable.
The enum proposal needs to give an expansion anyway for simple enum cases. It currently maps them to value definitions, which is straightforward. Mapping them to object definitions instead would couple the whole issue to the two hairy problems mentioned above, which looks too risky. Not to mention that interop with Java enumerations would become a game of roulette with an unpredictable optimizer.
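For illustration, here is a hand-written sketch (not the proposal's actual expansion; names like enumTag and Val are made up) of what mapping simple cases to value definitions can look like, with one shared implementation class for all parameterless cases:

```scala
sealed abstract class Color(val enumTag: Int)
object Color {
  // A single shared class backs every simple case:
  // no per-value class, and vals are initialized eagerly.
  private final class Val(tag: Int) extends Color(tag)
  val Red: Color   = new Val(0)
  val Green: Color = new Val(1)
  val Blue: Color  = new Val(2)
}
```

All three values share one runtime class, which is exactly why "interesting things" with a case's getClass stop being meaningful under this scheme.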
Comparing to Scala.js (or Dotty's experimental deep linker) is not an apples-to-apples comparison.
In regards to Dotty's deep linker: it is not designed to work with libraries and only works with final "executable" jars. At least with Scala on the JVM, any built artifact which is a JAR is treated as a library, so it needs to be able to be dynamically linked at runtime. Artifacts built with Dotty's deep linker wouldn't be able to be linked as libraries, afaik.
Scala.js is slightly different in this regard. Its target is still building highly optimized .js output (similar to Dotty's deep linker building highly optimized JARs); however, for Scala.js libraries it uses its own internal bytecode, which differs from JVM bytecode and still allows you to make Scala.js libraries.
The reason this is important is that enums aren't "just case objects". Case objects carry around class information, because they can be referenced, looked up, and extended in ways that enums can't be. Having an actual enum keyword signals to the compiler that this construct is a much more limited version of a case object, and because of this the compiler can do optimizations that it otherwise can't (optimizations which are not possible if we are dealing with dynamically linked libraries). This is the same reasoning behind constructs like AnyVal: it's not possible to automatically have unboxed value classes if we also have to provide dynamically linked libraries. The deal is the same with abstract class versus trait, or the new @static proposal being made.
This is why I am also in favour of making the proposed enum construct as limited as possible, in the same way that AnyVal is very limited compared to a standard one-field case class. The more things that enum is asked to support, the closer it gets to becoming a slightly different variant of the standard sealed abstract class / sealed trait + case object pattern we have now.
At least personally, the things that are important for enum are the same things provided by https://github.com/lloydmeta/enumeratum, and nothing more, specifically:
- A way to look up an ADT case by its symbolic name
- A way to represent an ADT case by an int index (for efficiency)
- Enums can be ordered
And more importantly, what it doesn't support:
- Cases with more than one constructor argument (the parameter should only be the actual symbolic enum id)
- Being able to extend the values in the enum itself
This is, of course, so the enums can avoid instantiating a class (or two) per enum value.
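Those three supported features can be hand-rolled on today's sealed traits roughly as follows (a sketch only; this is not Enumeratum's actual API, and Weekday/withName/fromIndex are illustrative names):

```scala
sealed abstract class Weekday(val name: String, val index: Int)
object Weekday {
  case object Mon extends Weekday("Mon", 0)
  case object Tue extends Weekday("Tue", 1)
  case object Wed extends Weekday("Wed", 2)

  // Declaration order doubles as the enum's ordering.
  val values: Vector[Weekday] = Vector(Mon, Tue, Wed)
  // Lookup by symbolic name:
  def withName(n: String): Option[Weekday] = values.find(_.name == n)
  // Lookup by int index:
  def fromIndex(i: Int): Option[Weekday] = values.lift(i)
}
```

A language-level enum could generate the values sequence and both lookups automatically, which is the whole feature set argued for here.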
Why not go full-blown GADT style? enum Option[T] = None | Some(t: T) or enum Either[A, B] = Left(a: A) | Right(b: B)? And then just replace enum with data?
We want to maintain the ability for enumeration values to define custom methods and inherit from custom classes that aren't necessarily inherited by the common enum definition.
Objective 5:
It should support all idioms that can be expressed with case classes. In particular, ...elided.., and arbitrary statements in a case class and its companion object.
I'd argue that if we follow @notxcain's idea to go pure ML-style GADT/ADT and disallow templates for the enum class or its cases (extends is fine), we lose nothing. As Scala has the wonderful feature of implicit helpers, I can foresee a programming paradigm emerging:
data Option[+T] = Some(x: T) | None
implicit class OptionHelper[T](opt: Option[T]) extends AnyVal {
  def isDefined: Boolean = opt match {
    case Some(_) => true
    case None => false
  }
}
This way, Scala can have the best of FP and OO. Also, the expression problem reminds us that there are two orthogonal ways to define operations on data. As case classes are mainly the OO way of operations on data, I think it's probably good for this new language feature to go the FP way of operations on data. This also alleviates the problem of confused users: which construct to use?
@liufengyun Just adding standard ADTs would not be very Scala-like. Scala has always avoided having FP and OOP features side by side. Instead it tries very hard to unify them. So if we define enums, we want to map them to classes, and we want to not artificially restrict the things you can do with these classes because that would lose orthogonality.
Just adding standard ADTs would not be very Scala-like. Scala has always avoided having FP and OOP features side by side.
@odersky I guess there is some language-design philosophy here I cannot argue against. But as a programmer, the appeal of syntactic simplicity is irresistible -- I believe that's also why most Scala programmers love Scala. It's just about the syntax; behind the scenes the standard ADTs also map onto classes. For programmers, syntactic simplicity makes them love the language, not just use the language, IMHO.
Or even something like
data Option[A] = Some(a: A) | None
ops Option[A] {
def isDefined: Boolean = this match {
case Some(_) => true
case None => false
}
}
Just improvising
It's just about the syntax
@liufengyun What about the syntax I proposed earlier?
enum Option[T] case Some(x: T) case None
This lends itself to both short, elegant definitions and to more typical OOP: you can add braces to introduce members for cases (and it still looks like Scala), but if you want to keep the ADT clean and add your methods via implicits, no one is preventing you from doing that. Why impose arbitrary restrictions on programmers?
enum Option[T] case Some(x: T) case None: This lends itself to both short, elegant definitions and to more typical OOP: you can add braces to introduce members for cases (and it still looks like Scala)
First, I'd argue that this is not the simplest possible syntax. By the philosophy of "make simple things easy and difficult things possible", the ML-like ADT approach also supports OOP via extends; it just disallows templates.
Second, syntactic simplicity also relates to redundancy of language features. By disallowing templates, we are minimising the overlap between the new ADT and sealed class/trait, and differentiating their use cases.
Third, it's related to @odersky's concern about top-level definitions:
That was actually the design I started with. The problem with it is that it effectively disallows toplevel enums. First, simple cases like None could not be mapped to vals because they are not allowed at toplevel. Second, enums with many alternatives would pollute the namespace too much.
If we have a completely new ML-style ADT syntax that has no prima facie connection with case object and case class, the language designer has much more flexibility in the implementation. For example, the implementation can put the case definitions in object Color or object Weekday. This implementation behaviour is justified only if we make the syntax more different from case object/class (don't use the keyword case); otherwise it contradicts existing intuitions.
This also means programmers can still define the companion object Color or object Weekday themselves; the compiler just merges the custom-provided companion object with the synthesised one, consistent with existing Scala behavior.
@liufengyun Have you seen the answer I provided to the message you quoted? (Genuine question; not saying you should necessarily agree with it.)
I had in mind that enum E case A case B ... would expand into class E; object E { case class/object A; case class/object B; ... }. Realizing now that the expansion of my Param example was inaccurate.
I think it's still less counter-intuitive than something that looks like a class but whose "members" are moved to a companion object. After all, adding case in front of a class creates a companion object with an apply method inside. Is it a big stretch to accept that case following a class creates case classes in the companion object?
There is nothing in this approach (AFAIK) that would prevent what you then describe:
This also means programmers can still define the companion object Color and object Weekday [...] it just merges the custom provided companion object with the synthesised one
@LPTK Yes, I see it's just a syntactic difference from yours -- I'm just reserved about using the case keyword and allowing templates for cases and the enum.
BTW, I see a potential usability problem with putting case definitions inside an object. If a programmer defines multiple ADTs, say 10, in a single file, then to use the ADTs in another file the programmer has to import 10 times. It will become very annoying in a project where all data definitions are centralized in a single file.
I'm not sure if it's technically possible to alleviate this problem by putting top-level statements (non class/object defs) inside the package object -- some tricky merging is required in the Namer. Technically, it seems feasible.
It will become very annoying in a project where all data definitions are centralized in a single file.
Good point. But a simple @expose or @forward macro annotation could automatically create type and value forwarders alongside any non-top-level annotated class, exposing selected members of its companion object (essentially its public classes, objects, and those values that result from enum cases, I would suggest).
In regards to Dotty's deep linker, this is not designed to work with libraries and only works with final "executable" jars. At least with Scala on the JVM, any built artifact which is a JAR is treated as library so it needs to be able to be dynamically linked at runtime. Artifacts built with Dotty's deep linker wouldn't be able to be linked as libraries afaik.
Scala.js is slightly different in this regard. Its target is still building highly optimized .js output (similar to Dotty's deep linker building highly optimized JARs); however, for Scala.js libraries it uses its own internal bytecode, which differs from JVM bytecode and still allows you to make Scala.js libraries.
I am not sure I follow or understand the distinction above. In both cases, you have whole-program optimization under a closed-world assumption, except for explicit entry points. My understanding is that all Scala libraries, at some point in the future, would include TASTY trees, which the linker will use for its analysis.
Artifacts built with Dotty's deep linker wouldn't be able to be linked as libraries afaik.
If you mean runtime linkage (as in dropping a jar into a J2EE container classpath), then you're right.
If you mean compile-time dependencies, then, as correctly pointed out by @ebruchez, no. The reason is that you can recover the entire library from TASTY and recompile it under different assumptions, if you need to.
Note that it's too early to tell how it will work out, but design-wise there's nothing prohibiting compile-time dependencies on a pre-optimized jar.
This tells me that it might not entirely be a crazy assumption that common enums patterns could be massively optimized by the linker.
Yes, they might be. But I think there are two points which are conflated in this discussion: enum semantics (and whether we need them), and performance of the implementation.
My understanding is that enum started as a way to provide bigger guarantees than sealed, ones that enable reliable discovery of subclasses in metaprogramming. In this regard, I consider enum a substantial improvement over the current situation and I think we should include them.
As far as how to compile them: the current scheme is very inefficient if you compare it with C enums, but C enums are a lot less expressive. There may be a place for an analysis that optimizes existing classes/objects into enums and then finds whether those could be represented in a more compact/efficient way. I think that this is a separate project which may be a nice feature for the Dotty linker or the Scala.js linker. For the Dotty linker, specialization and devirtualization are already huge, and we don't want to spread ourselves thin.
- The object's initializer does not have side effects
- Nobody does interesting things with the object's class (because it will go away)
Both conditions are uncheckable with our current technologies. The first condition might be checkable with a global code analysis or an effect system, but neither exist yet. The second condition is probably too vaguely defined to be effectively checkable.
Once you have that linker/optimizer and an "enumeration optimizer", it seems to me that the main (only?) risk would be getting slower enumerations if you do, as you say, "interesting things" with the objects. As with @tailrec, you could have an annotation to ensure that the enumeration is in fact compiled/linked efficiently; if that is not possible to check, then you would get a warning or an error. But I can also imagine that too many uses of the enumeration might fail the check, making the annotation less useful, although that remains a bit unclear to me.
Not to mention that interop with Java enumerations would become a game of roulette with an unpredictable optimizer.
It wouldn't be, if Scala enumerations that must be Java-compatible were marked as such, like extending a particular trait or having an annotation, as suggested in some comments above. Only pure-Scala enumerations would benefit from an optimized representation.
All in all, I don't have anything against the simpler and more straightforward solution of using values rather than objects.
I still would love to see somebody experimenting with optimizing the runtime representation of ADTs in the context of a linker/optimizer. As @DarkDimius just wrote above, there is other fish to fry, so I will leave things at that for now ;)
I have updated the proposal to reflect all suggestions in the discussion that were adopted so far. I believe this proposal as a whole will not change much anymore. I would still welcome suggestions on details. More fundamental change requests, such as a fundamental change in the syntax or scope of the proposal, could be worked out fully as alternative proposals, ideally including an implementation, so that they can be discussed in depth. In that case, it would be best to make alternative proposals in separate issues.
We'd like to reach a decision whether we want to go ahead with this or not by the end of next week.
I like the basic design of these enum GADTs:
enum Option[+T] {
case Some(x: T)
case None extends Option[Nothing]
}
But to add instance members, would it be possible to put them directly in the cases? I.e.,
enum Option[+T] {
case Some(x: T) { override def isDefined: Boolean = true }
case None extends Option[Nothing] {
override def isDefined: Boolean = false
}
def isDefined: Boolean
}
object Option {
def apply[T](t: T): Option[T] = if (t != null) Some(t) else None
}
The logic is that if something inside an enum definition doesn't start with the case keyword, it's an instance member.
The logic is that if something inside an enum definition doesn't start with the case keyword, it's an instance member.
I am philosophically opposed to that. It makes enum look like a way to introduce another kind of class, but then case makes no sense in a class - it should go in the object! This matters when you consider what other members can be accessed from a case. The existing syntax treats an enum alone as neither a class nor an object (or, if you want, as both a class and an object).
OK. Not sure I understand this part:
This matters when you consider what other members can be accessed from a case.
Re: enums being just another kind of class [with instance members], I think that is familiar to Java programmers.