stepchowfun / typical

Data interchange with algebraic data types.

First class support for a `Map` type

AriaFallah opened this issue

Description
Seems like there's no built-in Map type.

Alternatives considered
I think this can be worked around by doing something like:

struct MapEntry {
  key: String = 0
  value: String = 1
}

struct Map {
  values: [MapEntry] = 0
}

Additional context
Just curious if this is an intentional design decision to omit, or just something with a good enough workaround that there's no need to implement it.

Hi @AriaFallah, thanks for opening this issue!

You're correct that there's no primitive map type, and this is a great opportunity to start a discussion about it and the general principles for deciding what types should be built-in.

Typical currently has a conservative set of built-in types (e.g., Booleans, common numeric types, Unicode strings, arrays, binary blobs). These types have straightforward encodings and correspond to concepts whose semantics are largely language-agnostic. More complicated data types can be built out of these, as in the workaround you described, although with worse ergonomics than if they were built in.

If we added maps, we would need to answer some questions such as:

  1. What guarantees do we make about ordering? Is it preserved? Is it deterministic (within the same binary, across different binaries, or even across different Typical versions)? If so, what if the host language doesn't offer the same guarantees?
  2. What types are supported for the keys? Do we allow only strings, or would we be more flexible, like Python? What would we do about languages that have restrictions on key types (e.g., that they must be hashable)?
  3. How do we handle duplicate keys when decoding (return an error, last-one-wins, etc.)?
  4. We also need to think about schema compatibility / migrations. Maps are obviously covariant in their value types, but what about their key types? Conceptually speaking, when used as lookup tables, maps are contravariant in their keys, but covariance makes more sense when iterating over the entries of a map. So, perhaps invariance is the answer, but that is also unsatisfactory because people probably expect covariance everywhere ("what do you mean I can't add a new field to this struct?"). This could be an argument for only supporting string keys, at least initially, but then should the schema syntax still require specifying the key type, even though it would have to be String, in anticipation of relaxing that restriction in the future?
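
To make question 3 concrete, here is a minimal Python sketch of how user code might turn an array of key-value pairs into a dictionary under an explicit duplicate policy. The names `DuplicatePolicy` and `entries_to_dict` are hypothetical and are not part of Typical or its generated code; the point is only that each policy is a few lines, and the hard part is deciding which one a built-in map type should commit to.

```python
from enum import Enum


class DuplicatePolicy(Enum):
    """Hypothetical policies for handling repeated keys while decoding."""
    ERROR = "error"
    FIRST_WINS = "first-wins"
    LAST_WINS = "last-wins"


def entries_to_dict(entries, policy=DuplicatePolicy.ERROR):
    """Build a dict from (key, value) pairs under an explicit duplicate policy."""
    result = {}
    for key, value in entries:
        if key in result:
            if policy is DuplicatePolicy.ERROR:
                raise ValueError(f"duplicate key: {key!r}")
            if policy is DuplicatePolicy.FIRST_WINS:
                continue  # Keep the value that was seen first.
        result[key] = value  # Either a new key, or LAST_WINS overwriting.
    return result


entries = [("a", "1"), ("b", "2"), ("a", "3")]
print(entries_to_dict(entries, DuplicatePolicy.LAST_WINS))
print(entries_to_dict(entries, DuplicatePolicy.FIRST_WINS))
```

With an array of pairs, the application picks the policy; with a built-in map, the serialization framework would have to pick one for everybody.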

In contrast, arrays do not have these complications: ordering is expected, any element type is allowed, duplicates are unproblematic, and arrays are logically covariant. Arrays have fairly consistent semantics across programming languages. When I worked at Google, we often found ourselves using arrays of key-value pairs (as you described) rather than Protobuf's built-in map type for various reasons related to these complications, such as more control over ordering (e.g., so we could deterministically compute the hash of such messages), at the cost of convenience.
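
As a rough illustration of the hashing point, the Python sketch below hashes a list of key-value pairs. The length-prefixed encoding is invented for this example and is not Typical's or Protobuf's wire format, and the function names are hypothetical. It shows that the digest depends on entry order, so whoever controls the order of the array (for instance by sorting it) controls whether the hash is reproducible.

```python
import hashlib
import struct


def encode_entries(entries):
    """Length-prefix each key and value so the byte stream is unambiguous.

    This is a made-up encoding for illustration only.
    """
    out = bytearray()
    for key, value in entries:
        for text in (key, value):
            data = text.encode("utf-8")
            out += struct.pack("<I", len(data))  # 4-byte little-endian length
            out += data
    return bytes(out)


def digest(entries):
    """Hash the entries exactly in the order given."""
    return hashlib.sha256(encode_entries(entries)).hexdigest()


def canonical_digest(entries):
    """Hash the entries after sorting them, so input order no longer matters."""
    return digest(sorted(entries))


a = [("region", "eu"), ("tier", "gold")]
b = list(reversed(a))

print(digest(a) == digest(b))                      # order-sensitive
print(canonical_digest(a) == canonical_digest(b))  # order-insensitive
```

A map type whose iteration order is unspecified gives no such handle, which is one reason the explicit array can be preferable despite being less convenient.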

I am not necessarily opposed to adding a map type, but we would need to be thoughtful about it. Maps are certainly useful and convenient, but they come with some complications. I'm not sure the trade-offs are worth it.

Thanks for such a thorough reply! My question was mostly curiosity as to why such a complete-looking library wouldn't be using map types, and I've definitely gotten the answer and then some. I'm personally satisfied so I have no issue closing it.