Last question about status_code before adopting it

Question

kjcamann opened this issue 5 years ago · comments

Hi Niall,

Thank you for all your work on status_code, outcome, and llfio. I would like to start adopting these for a large project -- currently I use <system_error> -- but I have one last issue. My concern is about C interoperability. This is a long one, but it might be important for standardization's sake.

I may have missed it, but it doesn't appear that I can directly construct a system_code from a (status_code_domain *, intptr_t) pair, because no public constructor or static factory function permits this. What follows is an explanation of why someone would want to do this.

I have a lot of libraries which have both a C and C++ API. The C++ APIs are better in certain ways, but the C APIs are ABI-stable. In practice, the C APIs are thin wrappers around the C++ implementation -- they statically link the C++ implementation and also the C++ runtime.

It seems like I can unify the C and C++ error handling, by making a C API that wraps system_code and reuses all the internal machinery. To make this more concrete, here is some code:

// C error API, prefix everything because we don't have namespaces
#include <stdbool.h> // bool
#include <stdint.h>  // intptr_t

struct foo_syscode_domain; // C error domain, opaque, actually a status_code_domain

// C system code
struct foo_syscode {
  struct foo_syscode_domain *domain; // C requires the struct keyword here
  intptr_t errcode;
};

// system_code API
bool foo_syscode_success(struct foo_syscode);
bool foo_syscode_failure(struct foo_syscode);
const char *foo_syscode_message(struct foo_syscode);
bool foo_syscode_equal(struct foo_syscode, struct foo_syscode);

// system_code_domain API
const char *foo_syscode_domain_name(const struct foo_syscode_domain *);
unsigned long long foo_syscode_domain_id(const struct foo_syscode_domain *);
bool foo_syscode_domain_equal(const struct foo_syscode_domain *

The intention is to "export" a system_code to C by copying the (domain, intptr_t) pair into a foo_syscode. To implement the C error API, we need to reconstruct the system_code from that exported data, but it doesn't appear possible without relying on some "hair-raising" UB, such as this:

bool foo_syscode_success(struct foo_syscode sc) {
  system_code cxxsc;
  memmove(&cxxsc, &sc, sizeof sc);
  return cxxsc.success();
}

Because system_code is very far from POD -- neither standard-layout nor trivial -- I would rather not do this. Direct construction is something the old API (the C++11 <system_error>) did allow.
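For comparison, here is a minimal sketch of the direct construction that <system_error> allows. std::error_code has a public constructor taking (int, const std::error_category &), so a pair exported to C can be rebuilt without copying raw bytes. The names foo_errcode, foo_export and foo_import are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <system_error>

// Hypothetical C-visible mirror of std::error_code: an opaque category
// pointer plus the integer value. Not part of any real library.
struct foo_errcode {
  const void *category;  // actually a const std::error_category *
  int value;
};

// Export: read the two parts through the public observers.
inline foo_errcode foo_export(const std::error_code &ec) noexcept {
  return foo_errcode{&ec.category(), ec.value()};
}

// Import: rebuild through the public (int, category) constructor.
inline std::error_code foo_import(foo_errcode c) noexcept {
  assert(c.category != nullptr);
  return std::error_code(
      c.value, *static_cast<const std::error_category *>(c.category));
}
```

This works because std::error_code never owns its payload, so there is no lifetime to track across the boundary.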

Niall Douglas · Answer 1 · Mon Oct 14 2019 16:46:00 GMT+0800 (China Standard Time)

Firstly, if you wish to avoid any UB at all, then you can't use system_code or any status_code<erased<T>> in C, except by pointer indirection. That's simply because we don't have support for move relocation in the language yet. Maybe for C++23.

If you want to avoid UB, I'd define a status_code<trivially_erased<T>> which is the same as status_code<erased<T>>, but which guarantees that it won't construct from domains whose payload isn't trivially copyable. This ought to be easy enough to do.

I'm not minded to support that directly, because WG21 already gets confused by the move-only nature of erased status codes, and I'd like to not confuse them further. But you yourself definitely can duplicate status_code<erased<T>> into a trivially-copyable form with very little work, and all your boxes get ticked.
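As a rough illustration of the idea, the sketch below uses toy stand-ins (toy_domain, toy_code, toy_trivially_erased_code), not the real status_code types. The erased form refuses at compile time to construct from a payload that is not trivially copyable, which is what keeps the erased form itself trivially copyable.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Toy stand-ins for illustration only; NOT the real library types.
struct toy_domain {
  const char *name;
};

template <class Payload> struct toy_code {
  const toy_domain *domain;
  Payload value;
};

struct toy_trivially_erased_code {
  const toy_domain *domain;
  std::intptr_t value;

  // Erasing constructor: rejects payloads that cannot be copied bytewise
  // or that do not fit into the erased storage.
  template <class Payload>
  explicit toy_trivially_erased_code(const toy_code<Payload> &src) noexcept
      : domain(src.domain), value(0) {
    static_assert(std::is_trivially_copyable<Payload>::value,
                  "payload must be trivially copyable");
    static_assert(sizeof(Payload) <= sizeof(std::intptr_t),
                  "payload must fit into the erased storage");
    std::memcpy(&value, &src.value, sizeof(Payload));
  }
};

// A constructor template is never a copy constructor, so the implicit
// copy operations stay trivial.
static_assert(std::is_trivially_copyable<toy_trivially_erased_code>::value,
              "erased form must be trivially copyable");
static_assert(std::is_standard_layout<toy_trivially_erased_code>::value,
              "erased form must be standard layout");
```

Trying to erase, say, a toy_code<std::string> would fail the first static_assert instead of compiling into something that needs a destructor.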

Do note that under C, compatible layout rules mean that any layout-compatible structure can be used to initialise any other. So, we don't need a formal construction API in C: just use memcpy, or poke the values of a compatible layout structure.
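A minimal sketch of that memcpy route, seen from the C++ side. It assumes a hypothetical trivially copyable toy_trivial_code with the same members as the question's foo_syscode; foo_syscode_domain is given a toy definition here only so the sketch is self-contained. The static_asserts spell out the assumptions the copy relies on.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Toy definition for self-containment; in the question this type is opaque
// to C and is really a status_code_domain.
struct foo_syscode_domain {
  const char *name;
};

// C-visible pair, as declared in the question.
struct foo_syscode {
  foo_syscode_domain *domain;
  std::intptr_t errcode;
};

// Hypothetical trivially copyable C++-side code with matching members.
struct toy_trivial_code {
  foo_syscode_domain *domain;
  std::intptr_t value;
};

static_assert(std::is_trivially_copyable<toy_trivial_code>::value &&
                  std::is_standard_layout<toy_trivial_code>::value,
              "C++ side must be trivially copyable and standard layout");
static_assert(sizeof(toy_trivial_code) == sizeof(foo_syscode),
              "both sides must have the same size");

// Export by copying the bytes; member-wise assignment would work as well.
inline foo_syscode to_c(const toy_trivial_code &src) noexcept {
  foo_syscode out;
  std::memcpy(&out, &src, sizeof out);
  return out;
}

// Import by the same bytewise copy in the other direction.
inline toy_trivial_code from_c(const foo_syscode &src) noexcept {
  assert(src.domain != nullptr);
  toy_trivial_code out;
  std::memcpy(&out, &src, sizeof out);
  return out;
}
```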

Kenneth Camann · Answer 2 · Sat Oct 19 2019 07:00:00 GMT+0800 (China Standard Time)

After thinking about it further, some aspects of my design don't make sense. For a C-linkage-compatible API to actually work, it needs to mirror your "object lifetime" design, namely, it needs a way to say "I'm done with an error code" so that the underlying implementation can (ultimately) call _do_erased_destroy.

The alternative is to never pass an error across the C linkage boundary that uses the indirecting domain, or any domain that uses a similar trick.

Much like C++ has categories of types (polymorphic, trivial, aggregate, etc.), it seems that there is a need to think about categories of domains too. Namely, domains that actually do something inside of _do_erased_destroy, and those that don't. The former aren't just "status codes," they're status codes that require "cleanup," and the cleanup mechanism is tied to the RAII pattern.

The beauty of the C++ design is that the annoying boilerplate of cleaning up the status_codes is hidden by the C++ concepts like move semantics and destructors, and a C-linkage-compatible version of the API would need to match this. But in C-style APIs, resource management is verbose and awkward. I need to think about whether the idea is salvageable. It either has to be verbose, or it has to ban the sending of "requires cleanup" status codes by mapping them to the generic domain.
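To make the "verbose" option concrete, here is a minimal sketch of a C-linkage API with an explicit release call. Everything in it is hypothetical: foo_status is a toy type with an owned std::string payload, standing in for a status code that requires cleanup, and a zero value means success by toy convention.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for a status code whose payload owns memory. A C header
// would only see the forward declaration `struct foo_status;`.
struct foo_status {
  int value;           // toy convention: 0 means success
  std::string detail;  // owned payload, so the object needs cleanup
};

extern "C" {

// Returns NULL on allocation failure; exceptions never cross the C boundary.
foo_status *foo_status_create(int value, const char *detail) {
  try {
    return new foo_status{value, detail != nullptr ? detail : ""};
  } catch (...) {
    return nullptr;
  }
}

int foo_status_failure(const foo_status *s) {
  assert(s != nullptr);
  return s->value != 0;
}

// The returned pointer stays valid until foo_status_destroy(s) is called.
const char *foo_status_message(const foo_status *s) {
  assert(s != nullptr);
  return s->detail.c_str();
}

// The explicit "I'm done with this error code" call; accepts NULL.
void foo_status_destroy(foo_status *s) {
  delete s;
}

}  // extern "C"
```

The cost is visible at every call site: each code obtained from foo_status_create must be paired with exactly one foo_status_destroy, which is the boilerplate that destructors hide on the C++ side.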

One issue here is that I can't tell whether something requires cleanup once it's erased. I can reasonably guess (I can check whether the domain is the indirecting_domain), but it would be nice if some pure virtual, bool-valued method of status_code_domain could tell me. Then I would know whether I had to do the mapping to the generic domain or not.
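A sketch of what such a query could look like, using toy types only. The method requires_cleanup() and every other name below are hypothetical; nothing here is claimed to exist in the real status_code_domain.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Toy stand-ins; none of these names exist in the real library.
struct toy_domain {
  virtual ~toy_domain() = default;
  // Hypothetical query: does destroying a code of this domain do real work?
  virtual bool requires_cleanup() const noexcept = 0;
  // Maps a domain-specific value to a plain integer for the generic domain.
  virtual std::intptr_t to_generic(std::intptr_t value) const noexcept = 0;
};

struct toy_erased {
  const toy_domain *domain;
  std::intptr_t value;
};

struct toy_generic_domain final : toy_domain {
  bool requires_cleanup() const noexcept override { return false; }
  std::intptr_t to_generic(std::intptr_t v) const noexcept override { return v; }
};

// Payload that an indirecting-style domain would keep behind a pointer.
struct toy_payload {
  int generic_value;
  std::string detail;
};

struct toy_indirecting_domain final : toy_domain {
  bool requires_cleanup() const noexcept override { return true; }
  std::intptr_t to_generic(std::intptr_t v) const noexcept override {
    return reinterpret_cast<const toy_payload *>(v)->generic_value;
  }
};

inline const toy_domain &generic_domain() {
  static const toy_generic_domain d{};
  return d;
}

inline const toy_domain &indirecting_domain() {
  static const toy_indirecting_domain d{};
  return d;
}

// Returns a code that C may copy freely. Ownership of `in` stays with the
// caller, who still has to destroy it on the C++ side.
inline toy_erased make_c_safe(const toy_erased &in) noexcept {
  assert(in.domain != nullptr);
  if (!in.domain->requires_cleanup()) return in;
  return toy_erased{&generic_domain(), in.domain->to_generic(in.value)};
}
```

With a query like this, the export path never needs to compare against a specific domain such as the indirecting one; any domain that does work on destruction reports so itself.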

In the meantime, here is something you might consider:

Although you've deleted the copy constructor for status_code<erased<T>>, it seems you've also left an escape hatch, clone, which allows users to copy if they're sure they understand all the lifetime-management issues involved.

Is that the right way to think of it? If so, I don't see why you wouldn't allow direct construction from a (status_code_domain *, value_type) pair as another escape hatch (as a static factory member function) because it's a reasonable thing to want to do. I'm not strongly advocating for it, but it's something to consider.

The alternative you suggested -- implementing an out-of-line specialization status_code<trivially_erased<T>> -- is rough in practice. I did implement this, and I had to become familiar with a fair number of implementation details, so it doesn't seem like something I could easily do if this were to become standardized (I'd need an implementation for all major compilers, and I'd need to track their internals with every major version).

Anyway, feel free to close the issue with any final thoughts, and thank you for your time!

Niall Douglas · Answer 3 · Sat Oct 19 2019 21:34:43 GMT+0800 (China Standard Time)

If a type is trivially copyable, then it is usually safe to pass into C code. A trivially copyable, type-erased status code is absolutely fine in C on the major compilers (technically, it is UB in the standard without standard layout, but every major compiler guarantees C compatibility with trivial copyability only).

Erased status codes cannot know what sort of copyability their original payload has. Hence they cannot expose a copy constructor, except via clone(), which may fail if the erased type cannot be copied, or fails to copy. C++ does not have language support for better here.

If you want to implement direct construction, there is nothing stopping you implementing that. status code is intended to be extended and customised. There's lots of custom stuff I extend status code with in my own code which is intentionally not shipped in the core library.

I'm not sure how you found implementing a trivially copyable erased status code problematic. It certainly doesn't vary between compilers, nor does it require maintenance.

There is always a fine balancing act in standard libraries between doing detail for the end user, and leaving detail open for end users to constantly reimplement. The balance is never perfect, but always a personal best judgement, with a fair bit of consensus-generating, suboptimal compromise as well.

Thanks for your feedback!