Add convenience APIs

siennathesane opened this issue 2 years ago

This is a great library, and I'm getting a lot of use out of it. However, it's very complicated and the learning curve is high. It would be great if portions of the API could be simplified for developers who don't need all features all the time. I'd like to suggest a couple of convenience-style wrappers for folks who want the benefits but don't need the fine-grained details. I would assume most of the functionality I'm thinking of would happen during code generation, so it wouldn't necessarily require a lot of structural changes.

While I am raising these as opportunities, I am aware there are potential performance, copying, or reference challenges. I think convenience at the cost of speed is a reasonable trade-off that folks could be willing to make, so long as the trade-off and the reason for it are documented well. My thought process is: how could the adoption curve be lowered and made as simple as possible? I'm not trying to capture all use cases, just simplify serialization and general struct interactions.

With a bit of help, I could likely introduce these APIs into the generator.

Reference

I'll use the books example for shared context.

using Go = import "/go.capnp";
@0x85d3acc39d94e0f8;
$Go.package("books");
$Go.import("foo/books");

struct Book {
    title @0 :Text;
    # Title of the book.

    pageCount @1 :Int32;
    # Number of pages in the book.
}

Creation & Marshal

I think this workflow could be simplified, from this:

// Make a brand new empty message.  A Message allocates Cap'n Proto structs.
msg, seg, err := capnp.NewMessage(capnp.SingleSegment(nil))
if err != nil {
    panic(err)
}

// Create a new Book struct.  Every message must have a root struct.
book, err := books.NewRootBook(seg)
if err != nil {
    panic(err)
}
book.SetTitle("War and Peace")
book.SetPageCount(1440)

// Write the message to stdout.
err = capnp.NewEncoder(os.Stdout).Encode(msg)
if err != nil {
    panic(err)
}

Into something more like this:

// create a new book, create a new root and segment under the hood.
book, err := books.New()
if err != nil {
    panic(err)
}

book.SetTitle("War and Peace")
book.SetPageCount(1440)

// call `book.Message().Marshal()` behind the scenes
// or the encoder, whichever is the preferred way
payload, err := book.Marshal()
if err != nil {
    panic(err)
}

I think the segments, messages, structs, and other various constructs should still exist. This type of simplicity would really benefit a lot of devs without hiding away the different layers.
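To make the layering idea concrete, here is a minimal sketch built from toy stand-ins rather than the real capnp types. Everything in it is an assumption for illustration: `Message`, `NewRootBook`, `NewBook`, and the byte layout are hypothetical and are not the library's API or the Cap'n Proto wire format. The point is only that a convenience constructor and a forwarding `Marshal` can sit on top of the explicit path without replacing it.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Message is a toy stand-in for a capnp message. It only tracks its root.
type Message struct{ root *Book }

// Book is a toy stand-in for a generated struct type.
type Book struct {
	msg       *Message
	title     string
	pageCount int32
}

// NewRootBook is the explicit, lower-level path: the caller supplies the message.
func NewRootBook(msg *Message) (*Book, error) {
	if msg == nil {
		return nil, errors.New("nil message")
	}
	b := &Book{msg: msg}
	msg.root = b
	return b, nil
}

// NewBook is the hypothetical convenience path: it creates the message itself.
func NewBook() (*Book, error) { return NewRootBook(&Message{}) }

func (b *Book) SetTitle(t string)    { b.title = t }
func (b *Book) SetPageCount(n int32) { b.pageCount = n }

// Message exposes the underlying layer, so advanced users can still reach it.
func (b *Book) Message() *Message { return b.msg }

// Marshal encodes the root as a little-endian page count followed by the title.
// This toy layout is NOT the Cap'n Proto wire format.
func (m *Message) Marshal() ([]byte, error) {
	if m.root == nil {
		return nil, errors.New("message has no root")
	}
	out := make([]byte, 4, 4+len(m.root.title))
	binary.LittleEndian.PutUint32(out, uint32(m.root.pageCount))
	return append(out, m.root.title...), nil
}

// Marshal on Book simply forwards to the message layer.
func (b *Book) Marshal() ([]byte, error) { return b.Message().Marshal() }

func main() {
	book, err := NewBook()
	if err != nil {
		panic(err)
	}
	book.SetTitle("War and Peace")
	book.SetPageCount(1440)

	payload, err := book.Marshal()
	if err != nil {
		panic(err)
	}
	fmt.Printf("encoded %d bytes\n", len(payload))
}
```

Callers who need control over the message still call `NewRootBook` directly; the wrapper adds two short functions and removes nothing.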

Unmarshal and Reference

Simplifying this:

// Read the message from stdin.
msg, err := capnp.NewDecoder(os.Stdin).Decode()
if err != nil {
    panic(err)
}

// Extract the root struct from the message.
book, err := books.ReadRootBook(msg)
if err != nil {
    panic(err)
}

// Access fields from the struct.
title, err := book.Title()
if err != nil {
    panic(err)
}
pageCount := book.PageCount()
fmt.Printf("%q has %d pages\n", title, pageCount)

Into something like this:

// implement decoding and extraction under the hood
book, err := books.Unmarshal(someByteArray)
if err != nil {
    panic(err)
}

// Get the field from the struct, don't access it.
title, err := book.GetTitle()
if err != nil {
    panic(err)
}
pageCount := book.GetPageCount()
fmt.Printf("%q has %d pages\n", title, pageCount)

In this case, calling capnp.NewDecoder(io.Reader).Decode() + books.ReadRootBook(msg) is boilerplate code for a lot of use cases. There's definitely a use case for both of those APIs, but a wrapper around them would bring a lot of value.

I think there's a lot of value in having an explicit foo.GetBar() method instead of just having the current API of foo.Bar(). If you're required to call foo.SetBar(val), it makes sense to have an equivalent construct. Other languages have proper get and set field or property accessors that you can override or implement, and I think the Get()/Set() symmetry is easier on the brain for most people. Since v3 can include breaking changes, I think it'd be a good time to remove the standard accessor foo.Bar() in favour of foo.GetBar().

Sienna · Answer 1 · Fri Jul 01 2022 02:04:05 GMT+0800 (China Standard Time)

I could also see something like client := cap.AddRef().Client being reimplemented into something simpler, such as client := cap.Clone().

Louis Thibault · Answer 2 · Tue Jul 05 2022 22:26:06 GMT+0800 (China Standard Time)

Hi Sienna,

Firstly, wow! Thank you so much for this detailed analysis! This issue is rather timely, as we are currently discussing improvements to our RPC API, so it might make sense to bundle additional improvements to the encoding API into v3.

Let me now respond to your thoughts inline:

Creation & Marshal

I agree that the marshaling API has some awkward bits, and generally like the direction of your suggestion. The main limitation I see is that books.New() assumes that the books package will only export a single type: Book. However, it is common practice for packages to export several types, so I think something like books.NewBook is more appropriate. On this second point, you'll have no doubt noticed that the current pattern is NewRootBook, and that NewRootFoo vs NewFoo is a footgun. Needless to say, this is something we absolutely intend to fix in time for v3. It is also a relatively straightforward fix, and a good first issue for someone inclined to submit a PR 🙂

An alternative API design might involve the use of functional options, of which I am personally a fan. The following example has the benefit of better readability and less boilerplate, at the expense of a larger set of backwards-incompatible changes. I submit it here for our collective consideration:

// Defaults to a new root book in a single-segment arena.
book, _ := books.NewBook()

// We can override defaults like so...
book, _ = books.NewBook(
    capnp.WithRootMessage(false),             // orphaned message
    capnp.WithArena(capnp.MultiSegment(nil)), // override default arena
)
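For readers who have not met the pattern, the sketch below shows one common way functional options are wired up in Go. It is self-contained and uses toy types: `settings`, `Option`, and the string-valued arena label are assumptions made for the example, and `WithRootMessage` and `WithArena` only borrow the names proposed above; none of them exist in the capnp package today.

```go
package main

import "fmt"

// settings collects what the constructor needs to know (hypothetical).
type settings struct {
	root  bool   // true: make the new struct the message root
	arena string // label for the arena kind; a real version would hold an arena value
}

// Option adjusts one setting.
type Option func(*settings)

// WithRootMessage borrows the name proposed above; false would mean an orphan.
func WithRootMessage(root bool) Option {
	return func(s *settings) { s.root = root }
}

// WithArena overrides the default arena.
func WithArena(arena string) Option {
	return func(s *settings) { s.arena = arena }
}

// Book is a toy type that just records the settings it was built with.
type Book struct{ settings }

// NewBook starts from the defaults, then applies each option in order.
func NewBook(opts ...Option) (Book, error) {
	s := settings{root: true, arena: "single-segment"}
	for _, opt := range opts {
		opt(&s)
	}
	return Book{s}, nil
}

func main() {
	def, _ := NewBook()
	custom, _ := NewBook(WithRootMessage(false), WithArena("multi-segment"))
	fmt.Println(def.root, def.arena)
	fmt.Println(custom.root, custom.arena)
}
```

The common case stays a zero-argument call, and each override is named at the call site; the cost is one extra exported function per setting.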

Unmarshal and Reference

Concerning the Get* convention you suggest, I think "getters" are generally discouraged in the design of Go APIs. I for one find the "Get" prefix to be redundant. It doesn't add any useful information, and even clutters the mind. Consider the following (somewhat contrived) example: book.AddPage(book.GetPage(1)) vs book.AddPage(book.Page(1)). I personally find the latter to be much easier on the brain.

In this case, calling capnp.NewDecoder(io.Reader).Decode() + books.ReadRootBook(msg) is boilerplate code for a lot of use cases. There's definitely a use case for both of those APIs, but a wrapper around them would bring a lot of value.

Here, I enthusiastically agree. As a minor point, I would suggest making this a method on books.Book rather than a stand-alone function for the same reason as above: there may be more than one type exported from books. Perhaps books.Book.Unmarshal and UnmarshalPacked?

Misc

I could also see something like client := cap.AddRef().Client being reimplemented into something simpler, such as client := cap.Clone().

Usually, you would interact with the capability type that wraps a Client, rather than with the client directly. There's a bit of a balancing act between "batteries-included" and keeping the API size small. In this case, I think I'm leaning towards the latter.

Ian Denhardt · Answer 3 · Wed Jul 06 2022 00:12:05 GMT+0800 (China Standard Time)

Thanks for getting the ball rolling on this.

Some low hanging fruit: Message should implement io.WriterTo and io.ReaderFrom. This avoids needing to create an encoder/decoder for one message.
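As a rough illustration of what that could look like, here is a toy `Message` that implements both standard-library interfaces. The type and its 4-byte length-prefix framing are assumptions for the example, not the real capnp.Message or the Cap'n Proto stream format. One caveat: the io.ReaderFrom documentation describes reading until EOF, whereas this sketch reads exactly one framed message.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// Message is a toy stand-in: one opaque payload.
type Message struct{ payload []byte }

// Compile-time checks that *Message satisfies both interfaces.
var (
	_ io.WriterTo   = (*Message)(nil)
	_ io.ReaderFrom = (*Message)(nil)
)

// WriteTo writes a 4-byte little-endian length followed by the payload.
func (m *Message) WriteTo(w io.Writer) (int64, error) {
	var hdr [4]byte
	binary.LittleEndian.PutUint32(hdr[:], uint32(len(m.payload)))
	n, err := w.Write(hdr[:])
	total := int64(n)
	if err != nil {
		return total, err
	}
	n, err = w.Write(m.payload)
	return total + int64(n), err
}

// ReadFrom reads one length-prefixed payload from r.
// A production version would cap the length before allocating.
func (m *Message) ReadFrom(r io.Reader) (int64, error) {
	var hdr [4]byte
	n, err := io.ReadFull(r, hdr[:])
	total := int64(n)
	if err != nil {
		return total, err
	}
	payload := make([]byte, binary.LittleEndian.Uint32(hdr[:]))
	n, err = io.ReadFull(r, payload)
	total += int64(n)
	if err != nil {
		return total, err
	}
	m.payload = payload
	return total, nil
}

// roundTrip writes a payload into a buffer and reads it back.
func roundTrip(payload []byte) ([]byte, error) {
	var buf bytes.Buffer
	if _, err := (&Message{payload: payload}).WriteTo(&buf); err != nil {
		return nil, err
	}
	var out Message
	if _, err := out.ReadFrom(&buf); err != nil {
		return nil, err
	}
	return out.payload, nil
}

func main() {
	got, err := roundTrip([]byte("War and Peace"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", got)
}
```

With these two methods, a caller holding a single message can write `msg.WriteTo(w)` or `msg.ReadFrom(r)` without constructing an encoder or decoder first.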

A lot of the stuff we've historically done codegen for can actually be done with generics now, and I think it's a better approach. e.g.

func UnmarshalStruct[T ~struct { capnp.Struct }](data []byte) (T, error)
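A self-contained sketch of how that constraint behaves, using a toy `Struct` in place of capnp.Struct. The body of `UnmarshalStruct` here is an assumption: it only wraps the bytes, where a real version would decode a message and read its root. What it does show is that one generic function can return any generated type whose underlying type is `struct{ Struct }`.

```go
package main

import (
	"errors"
	"fmt"
)

// Struct is a toy stand-in for capnp.Struct; it only carries raw bytes.
type Struct struct{ data []byte }

// Book mimics a generated type: a named struct that embeds Struct.
type Book struct{ Struct }

// Title is a toy accessor that reads the bytes back as text.
func (b Book) Title() string { return string(b.data) }

// UnmarshalStruct accepts any T whose underlying type is struct{ Struct },
// so one generic function can stand in for one generated function per type.
func UnmarshalStruct[T ~struct{ Struct }](data []byte) (T, error) {
	if len(data) == 0 {
		var zero T
		return zero, errors.New("empty input")
	}
	// A real version would decode a message and read its root pointer here.
	return T(struct{ Struct }{Struct{data: data}}), nil
}

func main() {
	book, err := UnmarshalStruct[Book]([]byte("War and Peace"))
	if err != nil {
		panic(err)
	}
	fmt.Println(book.Title())
}
```

The conversion `T(struct{ Struct }{...})` is permitted because every type in T's type set shares that underlying type, which is what the `~` in the constraint expresses.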

I'll find some time to pick through this in more detail soonish.

Sienna · Answer 4 · Sat Jul 16 2022 23:43:04 GMT+0800 (China Standard Time)

Creation & Marshal

I agree that the marshaling API has some awkward bits, and generally like the direction of your suggestion. The main limitation I see is that books.New() assumes that the books package will only export a single type: Book. However, it is common practice for packages to export several types, so I think something like books.NewBook is more appropriate. On this second point, you'll have no doubt noticed that the current pattern is NewRootBook, and that NewRootFoo vs NewFoo is a footgun. Needless to say, this is something we absolutely intend to fix in time for v3. It is also a relatively straightforward fix, and a good first issue for someone inclined to submit a PR 🙂

Sure, I think the T.NewT vs T.NewRootT is definitely the right way to go. I can likely submit a PR (might take me a few days), can you please create or link a PR that has acceptance criteria for fixing that?

An alternative API design might involve the use of functional options, of which I am personally a fan. The following example has the benefit of better readability and less boilerplate, at the expense of a larger set of backwards-incompatible changes. I submit it here for our collective consideration:

// Defaults to a new root book in a single-segment arena.
book, _ := books.NewBook()

// We can override defaults like so...
book, _ = books.NewBook(
    capnp.WithRootMessage(false),             // orphaned message
    capnp.WithArena(capnp.MultiSegment(nil)), // override default arena
)

Major versions should never be backwards compatible, so that part doesn't matter to me so much, but I want to push against API styling a little and go for what makes sense to humans. Over chat, we talked about how Cap'n Proto is not a simple protocol; I'm surfacing that there is a lot of existing complexity with boilerplate code, and I think what I care most about is simplicity. I'm not sure I have a preference towards builder-style, functional-style, or adapter-style API designs, so long as the focus is on simplicity with overridable defaults 😄

Unmarshal and Reference

Concerning the Get* convention you suggest, I think "getters" are generally discouraged in the design of Go APIs. I for one find the "Get" prefix to be redundant. It doesn't add any useful information, and even clutters the mind. Consider the following (somewhat contrived) example: book.AddPage(book.GetPage(1)) vs book.AddPage(book.Page(1)). I personally find the latter to be much easier on the brain.

I'm going to very strongly disagree with you about the value of "getters" in API patterns. Go supports type conversions via (T1).T2, (*T1).T2, but also T1(T2). When I went through the examples and was learning how to use Cap'n Proto, it was immensely confusing to me that I would convert a type to get something, e.g. book.Page(1). That doesn't look or feel like a getter, that looks like a type conversion. If you've been working in Go for a long time, you know that book.Page(1) is really doing a bunch of compile-time type conversions to convert 1 to uint16 (or whatever the underlying type is), but to a new developer, you may not realise that the int you're passing in is getting converted at compilation. Something like book.GetPage(1) gives a dev a chance to see there might be more involved in getting the field.

The reality is that the majority of people aren't going to care enough to provide feedback on this library, and that's totally fine, but that doesn't mean that we should expect everyone to be an expert - this library needs to be available to developers transitioning from other languages, junior developers, and developers who just need to get things done. What I'm suggesting with book.GetPage(1) has very little to do with the core API design but a lot to do with how people think (esp people with less experience), and I think making an effort to accommodate is worth it. I personally disagree with the Effective Go mentality of not using getters, it feels elitist, useless, and inaccessible when the core conversion syntax is so similar.

In this case, calling capnp.NewDecoder(io.Reader).Decode() + books.ReadRootBook(msg) is boilerplate code for a lot of use cases. There's definitely a use case for both of those APIs, but a wrapper around them would bring a lot of value.

Here, I enthusiastically agree. As a minor point, I would suggest making this a method on books.Book rather than a stand-alone function for the same reason as above: there may be more than one type exported from books. Perhaps books.Book.Unmarshal and UnmarshalPacked?

Ya, I completely agree on this one. I'm fine with books.Book.Unmarshal or something like that. What I'm really trying to surface is that I should be able to take a byte array and unmarshal it into its matching type in a single, relevant method without requiring boilerplate code.

Misc

I could also see something like client := cap.AddRef().Client being reimplemented into something simpler, such as client := cap.Clone().

Usually, you would interact with the capability type that wraps a Client, rather than with the client directly. There's a bit of a balancing act between "batteries-included" and keeping the API size small. In this case, I think I'm leaning towards the latter.

I think sane defaults with overridable options is the right thing to do; how that manifests in reality matters a bit less to me so long as it's simple and makes sense.

Ian Denhardt · Answer 5 · Mon Jul 18 2022 13:23:48 GMT+0800 (China Standard Time)

Some thoughts:

I'm not religiously opposed to functional options, but I agree it's a bit of complexity that would be better avoided unless it buys us something really nontrivial.
I'm a bit confused by the discussion on getters; in particular I'm unclear in the examples folks are using what e.g. GetPage (or Page, depending on whether we like a Get prefix?) is supposed to do, since the example schema doesn't have anything that would actually generate that. Can folks clarify what part of the schema this maps to (if any, and if not provide an example that this would actually apply to)?
I agree the library should be as accessible to folks as we can make it, including folks coming from other languages, but it's also worth being aware that just adopting idioms common in language A, while it may make things easier on folks who know A, may also just make things harder for folks who are coming from language B -- not only do they have to learn Go's idioms, but this one library is also now pushing them to learn language A's idioms too. It's also really easy for discussions about stylistic issues like these to turn into religious arguments, which I want to avoid. Probably the fact that different language ecosystems each full of bright people take different stances on these conventions suggests that the evidence to support either position isn't exactly overwhelming, and in the absence of something much more compelling we really should defer to Go's norms.

Sure, I think the T.NewT vs T.NewRootT is definitely the right way to go. I can likely submit a PR (might take me a few days), can you please create or link a PR that has acceptance criteria for fixing that?

I'm not sure what exactly you're looking for wrt. acceptance criteria, but the idea for the fix is discussed in #245: basically, all of the generated stand-alone NewFoo functions (one for each struct type in the schema) should be renamed to NewOrphanFoo. Note that this does not apply to the New* methods that correspond to struct fields, since those attach the object to the parent struct and are therefore not orphans (or roots for that matter). Does that sufficiently clarify things?

Louis Thibault · Answer 6 · Wed Jul 20 2022 21:06:35 GMT+0800 (China Standard Time)

I'm a bit confused by the discussion on getters; in particular I'm unclear in the examples folks are using what e.g. GetPage (or Page, depending on whether we like a Get prefix?) is supposed to do, since the example schema doesn't have anything that would actually generate that. Can folks clarify what part of the schema this maps to (if any, and if not provide an example that this would actually apply to)?

My understanding was that Sienna's example was just meant to illustrate how auto-generated field accessors should look. The essence of the debate is GetFoo() vs Foo().

Sure, I think the T.NewT vs T.NewRootT is definitely the right way to go.

@mxplusb I think we may be talking slightly past each other. Ian addressed this in his response, but I just want to clarify a bit further that there is a difference between somepackage.NewFoo/NewRootFoo and somepackage.T.NewFoo. The former is a package-level constructor that returns a capnp object that does not "belong" to another object. The latter is an object-level constructor that instantiates a field Foo on type somepackage.T (in other words: after calling T.NewFoo, you can get the same object by calling T.Foo).
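A toy sketch of the two kinds of constructor may help. The `Shelf` type and its `featured` field are hypothetical (the books schema in this issue has no such parent struct), plain Go pointers stand in for capnp objects, and `NewOrphanBook` uses the rename proposed earlier in the thread rather than a function that exists today.

```go
package main

import "fmt"

// Book is a toy stand-in for a generated struct type.
type Book struct{ title string }

func (b *Book) SetTitle(t string) { b.title = t }
func (b *Book) Title() string     { return b.title }

// Shelf is a hypothetical parent struct with one struct-typed field, "featured".
type Shelf struct{ featured *Book }

// NewOrphanBook is package-level: the result is not attached to any parent.
func NewOrphanBook() *Book { return &Book{} }

// NewFeatured is object-level: it allocates a Book and stores it in the
// parent's "featured" field before returning it.
func (s *Shelf) NewFeatured() *Book {
	s.featured = &Book{}
	return s.featured
}

// Featured returns whatever NewFeatured last attached.
func (s *Shelf) Featured() *Book { return s.featured }

func main() {
	shelf := &Shelf{}
	b := shelf.NewFeatured()
	b.SetTitle("War and Peace")

	// The accessor yields the same object that the field constructor returned.
	fmt.Println(shelf.Featured() == b)

	// An orphan is a separate object that no parent refers to.
	fmt.Println(NewOrphanBook() == shelf.Featured())
}
```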

I'm not religiously opposed to functional options, but I agree it's a bit of complexity that would be better avoided unless it buys us something really nontrivial.

Arguably, the reduction in boilerplate code constitutes a non-trivial gain. In my experience -- which may well diverge from that of others -- 95% of object creations follow the same pattern: root object with single-segment arena. The advantage of functional options is that the NewFoo() constructor can default to this configuration, while offering a legible API for overriding defaults in the exceptional case.

in the absence of something much more compelling we really should defer to Go's norms.

100% agreed. This is, in fact, my primary argument against the GetFoo accessor naming convention.

Ian Denhardt · Answer 7 · Fri Jul 22 2022 02:56:05 GMT+0800 (China Standard Time)

Re: functional options, an alternate solution that I think is more straightforward is just to have an Options struct, where you can leave things as zero values for defaults. To make things a bit more terse, we can pass a pointer to it and treat nil as all defaults. This just seems like less cognitive overhead.
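Roughly, that could look like the sketch below. The field names, the default size, and the toy `Book` type are all assumptions made for illustration; nothing here is an existing capnp API.

```go
package main

import "fmt"

// defaultSegmentSize is a hypothetical default used when the caller sets nothing.
const defaultSegmentSize = 4096

// Options is a hypothetical settings struct; every zero value means "use the default".
type Options struct {
	MultiSegment     bool // false: single-segment arena
	FirstSegmentSize int  // 0: defaultSegmentSize
}

// Book is a toy type that records the resolved settings.
type Book struct {
	multiSegment bool
	segmentSize  int
}

// NewRootBook treats a nil *Options as "all defaults".
func NewRootBook(opts *Options) Book {
	if opts == nil {
		opts = &Options{}
	}
	size := opts.FirstSegmentSize
	if size == 0 {
		size = defaultSegmentSize
	}
	return Book{multiSegment: opts.MultiSegment, segmentSize: size}
}

func main() {
	def := NewRootBook(nil)
	custom := NewRootBook(&Options{MultiSegment: true, FirstSegmentSize: 64})
	fmt.Println(def.multiSegment, def.segmentSize)
	fmt.Println(custom.multiSegment, custom.segmentSize)
}
```

One design consequence: because zero means "default", a caller cannot explicitly request a zero value for a field, so fields need to be chosen so that zero is never a meaningful override.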

Ian Denhardt · Answer 8 · Fri Jul 22 2022 02:56:56 GMT+0800 (China Standard Time)

FWIW though, I think we really should force the caller to decide re: root vs. orphan; I don't think there's a good default there.

Louis Thibault · Answer 9 · Fri Jul 22 2022 23:11:57 GMT+0800 (China Standard Time)

Re: functional options, an alternate solution that I think is more straightforward is just to have an Options struct, where you can leave things as zero values for defaults. To make things a bit more terse, we can pass a pointer to it and treat nil as all defaults. This just seems like less cognitive overhead.

👍 That ticks all my boxes.

FWIW though, I think we really should force the caller to decide re: root vs. orphan; I don't think there's a good default there.

I defer to your judgement on this one.

Louis Thibault · Answer 10 · Sat Sep 03 2022 05:20:11 GMT+0800 (China Standard Time)

I think the ideas generated by this issue have been folded into separate issues, so I'm going to go ahead and close this.

Sienna · Answer 11 · Thu Sep 08 2022 00:51:38 GMT+0800 (China Standard Time)

Awesome! Thanks for taking this on!