Make approach to defining enums consistent across the project
calebbrown opened this issue · comments
#654 introduces the enum type Ecosystem implemented differently to other enums (e.g. analysis.Mode
and staticanalysis.Task
and analysisrun.DynamicPhase
)
We should develop a consistent approach to how we define enums:
Some areas that should be covered:
- should the base type be string vs numeric
- how do we handle lists of enums
- how do enums interact with flags
- how do enums handle serialization (JSON)
Both:
- require String()
Numeric based enums:
- can use
iota
- can consume as little as 8bits (byte)
- very fast to compare equality
encoding.TextMarshaler
andencoding.TextUnmarshaler
needed to supportflag.TextVar
and json serialization- requires strings to be specified for marshaling and unmarshaling (although an array of strings may make this easier)
String based enums:
- consume 128 bits per enum
- simpler String() function and serialization
- equality is more costly (at minimum two comparisons, maximum the size of the shortest string)
Some research:
- https://www.practical-go-lessons.com/chap-27-enum-iota-and-bitmask - proposes int + iota and implementing fmt.Stringer() and serialization interfaces.
- https://threedots.tech/post/safer-enums-in-go/ - documents each style, suggests strings are better for URL slugs
- Go stdlib:
- https://github.com/golang/go/blob/master/src/time/time.go#L302 - uses int, and a
[]string
to associate constant with string - https://github.com/golang/go/blob/master/src/net/interface.go#L39 - uses uint as bitmask, and a
[]string
to associate constant with string
- https://github.com/golang/go/blob/master/src/time/time.go#L302 - uses int, and a
- GitHub:
- https://cs.github.com/TencentBlueKing/bk-cmdb/blob/53fa5996b3ddff2ec8e9dba039452ce46ce26dca/src/common/blog/glog/glog.go - uses int32 and
[]string
to associate constant with string, implementsflag.Value
(origin: https://code.google.com/p/google-glog) - https://github.com/uber-go/zap/blob/master/zapcore/level.go - uses int8 and switches to implement various interfaces.
- https://github.com/zhlhahaha/gvisor/blob/master/runsc/config/config.go - uses int and switches to implement
flag.Value
- https://github.com/dt/cockroach/blob/9fd4ce118639ff09d11fffd85fbad6d42fa727c4/pkg/cli/flags_util.go#L156 - dumpMode uses int, and switches to implement
flag.Value
- https://cs.github.com/TencentBlueKing/bk-cmdb/blob/53fa5996b3ddff2ec8e9dba039452ce46ce26dca/src/common/blog/glog/glog.go - uses int32 and
As far as flag.TextVar
usage go, the usage is currently super low since the API was only recently introduced.
The standard approaches I've seen for enums is either:
type Foo int
const (
A Foo = iota
B
C
)
func (f Foo) String() string {
switch f {
case A:
return "A"
case B:
return "B"
case C:
return "C"
default:
return "<unknown>"
}
}
or
type Foo int
const (
A Foo = iota
B
C
)
fooStrings = []string{"A","B","C"} // this can also be map[Foo]string
func (f Foo) String() string {
// lookup string in fooStrings and return it
}
My proposal:
- use numeric based enum types (e.g.
int
,int8
) withiota
- rationale: this approach is almost universally used
- use
[]string
ormap[x]string
to associate const to string, and for parsing- rationale: while switches are more common, this approaches reduces string duplication,
and supports producing help text for flags
- rationale: while switches are more common, this approaches reduces string duplication,
- implement
encoding.TextMarshaler
andencoding.TextUnmarshaler
rather thanflag.Value
and JSON marshalers- rationale:
encoding.TextMarshaler
andencoding.TextUnmarshaler
are both usable with the newflag.TextVar
and in-place of JSON marshalers, requiring only 2 methods instead of 4.
- rationale:
We should definitely have a consistent set of functions to work with types used as enums.
Perhaps the decision of type could be decided case-by-case? e.g. in some cases having a string name defined as the enum could be more meaningful, as described in the reference above. But happy to restrict ourselves to int
if that's what we want.
For numeric based enums, I think the idea of using a slice or map to define the string values for the enum in a single place is a very good one.
Building on Caleb's proposal, how about the following five functions/methods for type Foo
which is used like an enum:
String()
string implementingfmt.Stringer
FooFromString(name string) (Foo, error)
implemented like this (from the same reference linked above), perhaps with the error replaced by a boolean true/false like in task.goMarshalText()(text []byte, err error)
implementingencoding.TextMarshaler
for serialisation and flag usages (callsString()
)UnmarshalText(text []byte) error
implementingencoding.TextUnmarshaler
for deserialisation and flag interfaces (callsFooFromString()
)FooValues() []Foo
which returns a slice containing all valid enum values.
Methods 1 and 2 are 'nicer' to use in internal code with strings, while 3 and 4 are implementing external interfaces which are designed to be very general and so are a little bit less readable.
I agree with everything - except for naming of (2) FooFromString
.
Usually methods for returning a type from a string are named Parse
(i.e when the package is named foo
) or ParseFoo
(e.g. https://pkg.go.dev/net/url#Parse, https://pkg.go.dev/time#Parse, https://pkg.go.dev/net/mail#ParseAddress, and https://pkg.go.dev/strconv).
Oh, and for FooValues()
I'd accept Values() []Foo
as well, if the package is named foo
(so foo.Values()
instead of foo.FooValues()
).
I'd also add that implementing ParseFoo()
, encoding.TextMarshaler
and encoding.TextUnmarshaler
could be optional.
But if an enum is used in a flag it must implement all three.
Sure, naming the 2nd method ParseFoo()
or foo.Parse()
depending on the containing package name and/or size sounds good to me.
Should we consistently define a none/empty value for each enum?