hpyproject / hpy

HPy: a better API for Python

Home Page: https://hpyproject.org

HPy helpers deployment

fangerer opened this issue · comments

We had some discussion in the October 2023 dev call about how we provide the HPy helpers.

@steve-s brought up that it may be problematic to propagate updates/fixes for helpers if there are bugs or even security vulnerabilities.

Currently, the helpers are provided as a collection of source files and shipped with HPy. The sources of the helpers are automatically added to the extension's sources list and will be compiled with the extension.
Consequently, if anything needs to be fixed in the helpers, all extensions using them need to be rebuilt.

@steve-s was wondering whether we could do better.

IMO, there are three main reasons for having the HPy helpers compiled with the extension (rather than having an appropriate HPy API function):

  1. Portability: Helpers are often those functions that use compiler-/platform-/architecture-specific features like variadic functions or variable-width types (e.g. int). Compiling those functions together with the extension ensures everything works as expected.
  2. Performance: Some helpers are just thin wrappers around some HPy context function and act as glue code (e.g. the HPyLong_FromSomething where Something is not a fixed-width type). If they are compiled with the extension, there should be very little to no overhead compared to calling the target context function.
  3. Code sharing: Some helpers implement common code patterns, which saves extension authors from writing the same patterns over and over.
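
To illustrate points 1 and 2, here is a minimal sketch of such a glue helper. It does not use the real HPy headers: `MockCtx`, `MockHandle`, `mock_long_from_long` and the `impl_*` functions are hypothetical stand-ins, assuming a context that only offers fixed-width entry points.

```c
#include <stdint.h>

/* Hypothetical stand-in for a handle; not the real HPy type. */
typedef int64_t MockHandle;

/* Hypothetical context that only exposes fixed-width entry points. */
typedef struct MockCtx MockCtx;
struct MockCtx {
    MockHandle (*long_from_i32)(MockCtx *ctx, int32_t v);
    MockHandle (*long_from_i64)(MockCtx *ctx, int64_t v);
};

/*
 * Helper for the variable-width type `long`. Because it is compiled with
 * the extension, sizeof(long) is the extension's own value, and the
 * compiler can fold the branch and inline the call.
 */
static inline MockHandle mock_long_from_long(MockCtx *ctx, long v)
{
    if (sizeof(long) <= sizeof(int32_t))
        return ctx->long_from_i32(ctx, (int32_t)v);
    return ctx->long_from_i64(ctx, (int64_t)v);
}

/* Trivial "interpreter side" implementations so the sketch can be run. */
static MockHandle impl_from_i32(MockCtx *ctx, int32_t v)
{
    (void)ctx;
    return (MockHandle)v;
}

static MockHandle impl_from_i64(MockCtx *ctx, int64_t v)
{
    (void)ctx;
    return (MockHandle)v;
}
```

The helper picks the fixed-width target at compile time, so the interpreter never needs to know how wide `long` is on the extension's platform.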

We discussed the idea of managing the helpers' version separately and compiling that version (just like the HPy ABI version) into the HPy extension. Then the interpreter can at least check it and maybe deny usage of certain helper versions.
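
A rough sketch of what such a version check could look like. Everything here is an assumption for illustration: the macro `DEMO_HELPERS_VERSION`, the accessor `demo_get_helpers_version`, and the deny list are hypothetical and not part of HPy.

```c
#include <stdbool.h>
#include <stddef.h>

/* Extension side (hypothetical): helper version baked in at build time. */
#define DEMO_HELPERS_VERSION 3

/* Hypothetical well-known accessor the interpreter could look up. */
int demo_get_helpers_version(void)
{
    return DEMO_HELPERS_VERSION;
}

/* Interpreter side (hypothetical): helper versions with known bugs. */
static const int demo_denied_versions[] = { 1, 2 };

/* Returns false if the extension was built with a denied helper version. */
bool demo_helpers_version_allowed(int version)
{
    size_t n = sizeof(demo_denied_versions) / sizeof(demo_denied_versions[0]);
    for (size_t i = 0; i < n; i++) {
        if (demo_denied_versions[i] == version)
            return false;
    }
    return true;
}
```

On load, the interpreter would call the accessor and refuse (or warn about) the extension if the check fails. This only detects the problem; the extension still has to be rebuilt to pick up the fix.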

One way to look at this is that the HPy helpers are just a convenience C library, sugar on top of the C API. In the case of C, the API and ABI are 1:1, so it does not make much sense to talk about an HPy binding for C, but we can view it like that.

Other languages, like Rust, should provide their own API and sugar that fits their ecosystem. For example, string formatting is typically not done via varargs functions in Rust, but by other means. In C++, it would also be more idiomatic to provide a stream-like interface rather than a printf-like one.

Should we treat something like a Rust HPy binding differently from the C binding? Perhaps yes, the C binding should be treated specially. Otherwise it would have to be treated like any other system library, which would mean that HPy extensions have a system dependency, and that's not good.

Existing C++ bindings, like nanobind, seem to ignore the issue that potential bugs/vulnerabilities require recompilation, but there isn't much choice with C++ templates, and additionally the current C API, which they are based on, also requires recompilation. So maybe it's a non-issue for them.


One possible solution: I am not sure if this is too complicated for something like the helpers library, but we could:

  • define something like an HPyContext, but for the helpers, i.e., a vtable for the helper functions. This vtable will be stored in a global variable, so users will not explicitly pass it to any helper function; it will be completely transparent.
  • we would still compile the helper functions into the extension and fill the vtable with pointers to the helper functions in the module init
  • the module will expose the version of the helper functions and some well-known function that the runtime can call to override the vtable if desired
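
The three steps above could be sketched roughly as follows. This is not HPy code: `DemoHelpers`, `Demo_ClampPercent`, `demo_module_init`, `demo_helpers_get_version` and `demo_helpers_override` are hypothetical names, and the compiled-in helper contains a deliberate bug so that the effect of an override is visible.

```c
#include <stddef.h>

/* Hypothetical vtable for the helpers, analogous to a context struct. */
typedef struct DemoHelpers {
    int version;
    int (*clamp_percent)(int v);
} DemoHelpers;

/* Global vtable: helper callers never pass it explicitly. */
static DemoHelpers demo_helpers;

/* Helper as compiled into the extension. Deliberately buggy for the demo:
 * it forgets to clamp negative values. */
static int builtin_clamp_percent(int v)
{
    return v > 100 ? 100 : v;
}

/* What extension authors call; the indirection is transparent to them. */
static inline int Demo_ClampPercent(int v)
{
    return demo_helpers.clamp_percent(v);
}

/* Module init fills the vtable with the compiled-in helpers. */
void demo_module_init(void)
{
    demo_helpers.version = 1;
    demo_helpers.clamp_percent = builtin_clamp_percent;
}

/* Well-known entry points the runtime could look up in the module. */
int demo_helpers_get_version(void)
{
    return demo_helpers.version;
}

void demo_helpers_override(const DemoHelpers *replacement)
{
    if (replacement != NULL)
        demo_helpers = *replacement;
}

/* Runtime side: a fixed implementation shipped with the interpreter. */
static int runtime_clamp_percent(int v)
{
    return v < 0 ? 0 : (v > 100 ? 100 : v);
}
```

The cost of this design is one indirect call per helper, which gives up part of the performance argument for compiling helpers into the extension. Helpers that depend on the extension's own type widths or on varargs would also need care, since a runtime-provided replacement is compiled separately.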