python / devguide

The Python developer's guide

Home Page:https://devguide.python.org/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

C API: Add guidelines for C APIs with output params

erlend-aasland opened this issue · comments

One other issue is whether the output parameter should be touched if there is no meaningful value.

Take PyDict_GetItemSuffixToBeDecided(PyDictObject *op, PyObject *key, PyObject **pvalue).
Should we touch *pvalue if returning -1 or 0.
In other words, in

   PyObject *value;
   int err = PyDict_GetItemSuffixToBeDecided(op, key, &value);

Should value be undefined if err <= 0, or should it be set to NULL?

It is more efficient to not touch it, since the return value specifies that it has no meaning.
But, do we trust all C extension authors not to read the value, which may point to a real object if they are unlucky? Setting *pvalue = NULL is safer.

Originally posted by @markshannon in #1121 (comment)

Quoting Victor's reply, #1121 (comment):

In my PyDict_GetItemRef() PR, I chose to always initialized *pvalue, especially in the error case (return -1). IMO there is a minor performance overhead, but it closes a whole class of issue: undefined behavior if the function returns -1. Some C compilers try to evaluate all code paths and say that a &value argument "can be undefined" when the code is correct. A common workaround makes me sad: initialize the value in the calling site: PyObject *value = NULL; :-(

For the "return 0" case (missing key), I also prefer to set *pvalue to NULL for the same reason, but also because people upgrading their code from PyDict_GetItem() and PyDict_GetItemWithError() may want to check if value is NULL or not, rather than checking the return value:

PyObject *value;  // OMG! undefined value
if (PyDict_GetItemRef(dict, key, &value) < 0) ... error ...
else if (value == NULL) ... missing key ...
else ... key is present ...

Here the function result is not used for the second code path: ... missing key ....

I'm not comfortable to document that *pvalue is intialized for the error case (return -1). But I wrote an unit test for that.

After thinking about this for some days, I've landed on the conclusion that API exposed in Python.h should explicitly set output params to NULL1; let's try to avoid segfaults bco. API misuse :) I would expect that for most APIs, the minor performance impact would be lost in benchmark noise.

Footnotes

  1. I agree with Victor that we don't think we need to document that fact in the C API docs.

[...] I've landed on the conclusion that API exposed in Python.h should explicitly set output params to NULL1; let's try to avoid segfaults bco. API misuse :) [...]

IOW, we should ensure that there is no UB when API users check the output parameters.