hadley / r-internals

Documentation for R's internal C API

Safely releasing external resources on error

hadley opened this issue · comments

From Simon Urbanek on R-devel:

I agree that it may be useful to have some kind of "finally"-like infrastructure. However, in the use cases you list there are already ways to do that - the same mechanism that R_alloc uses. First, you don't need to call UNPROTECT on the error path - the whole point is that the protection stack is automatically popped in case of an abnormal termination. That is precisely how memory leaks are prevented - as long as you play by the rules, the memory will be released for you. The check for UNPROTECTs at the end of .Call is explicitly there to catch bugs in normal termination. So no memory leaks there.
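The automatic popping can be pictured with a toy model in plain C. This is only a sketch of the idea, not R's implementation: `toy_protect`, `toy_unprotect`, `toy_error` and `run_worker` are hypothetical names, and `setjmp`/`longjmp` stand in for R's error handling. The caller remembers the stack height on entry and restores it if the call exits through an error, so entries pushed by the failing function do not accumulate.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Toy model only: every name here is hypothetical, none of this is R's API. */
#define STACK_MAX 16

static void *pp_stack[STACK_MAX]; /* stand-in for the protection stack */
static size_t pp_top = 0;         /* number of entries currently protected */
static size_t last_unwound = 0;   /* entries popped by the most recent unwind */
static jmp_buf toplevel;          /* stand-in for the calling context */

static void toy_protect(void *p) {
    assert(pp_top < STACK_MAX);
    pp_stack[pp_top++] = p;
}

static void toy_unprotect(size_t n) { pp_top -= n; }

static void toy_error(void) { longjmp(toplevel, 1); }

/* Protects two values, then optionally fails before its unprotect call. */
static void worker(int fail) {
    static int a, b;
    toy_protect(&a);
    toy_protect(&b);
    if (fail) toy_error();   /* jumps out; the line below never runs */
    toy_unprotect(2);
}

/* Runs worker() the way an enclosing context might: remember the stack
   height on entry and restore it if the call exits through an error.
   Returns the stack height after the call. */
size_t run_worker(int fail) {
    volatile size_t saved_top = pp_top;
    last_unwound = 0;
    if (setjmp(toplevel) == 0) {
        worker(fail);
    } else {
        last_unwound = pp_top - saved_top; /* entries the worker left behind */
        pp_top = saved_top;                /* abnormal exit: pop them all */
    }
    return pp_top;
}
```

Under these assumptions, `run_worker(1)` leaves the stack at the height it had on entry even though `worker` never reached its unprotect call, while `run_worker(0)` balances the stack itself and nothing needs to be unwound.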

The real case where users may create leaks is if you allocate memory that you don't tell R about. As long as you associate a finalizer with any non-R memory you allocate, there will be no memory leaks - that is how you are supposed to write R packages with external allocations. So the only difference from your example is that you don't register a finalizer with the function but rather with the allocation you make. That also seems IMHO less error-prone.

So to take your example, the way one would typically write that code safely is something like:

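#include <R.h>            // R_Calloc, R_Free
#include <Rinternals.h>   // SEXP, external pointers, finalizers
#include <libxml/tree.h>  // xmlNodePtr, xmlNewNode, xmlFreeNode
#include <openssl/evp.h>  // EVP_PKEY_CTX

typedef struct {
    xmlNodePtr node;      // xmlNodePtr is already a pointer type
    EVP_PKEY_CTX *ctx;
} my_context_t;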

// define how to dispose of all things you care about correctly
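static void context_fin(SEXP what) {
    my_context_t *c = (my_context_t*) R_ExternalPtrAddr(what);
    if (!c) return;
    if (c->ctx) EVP_PKEY_CTX_free(c->ctx);
    if (c->node) xmlFreeNode(c->node);
    R_Free(c);                 // release the struct itself (allocated with R_Calloc)
    R_ClearExternalPtr(what);  // make a repeated finalizer call a no-op
}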

[...]
// allocate the context and tell R to manage its protection and finalization
// (you could write a macro to make this a one-liner)
my_context_t* c = (my_context_t*) R_Calloc(1, my_context_t);
SEXP res = PROTECT(R_MakeExternalPtr(c, R_NilValue, R_NilValue));
R_RegisterCFinalizer(res, context_fin);

// do all work here ... you can safely abort at any point without memory leaks
c->node = xmlNewNode(...);
c->ctx = EVP_PKEY_CTX_new(...);
[...]

The point of using a finalizer is that no matter what happens the memory is always released. The structure with all allocations is protected until you unprotect it or there is any interrupt/error. Since all regular R rules apply, you can also assign it someplace to make the protection dependent on any other object you care about. This is often useful because you don't need to PROTECT things left and right, but instead you can just have one object that holds references to random things you care about.
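One detail that makes this pattern safe at any abort point is that the struct starts out zeroed and the finalizer checks each field before releasing it. The following is a minimal sketch of that property in plain C, not the R API: `toy_context_t`, `toy_context_fin`, `do_work` and `run_and_collect` are hypothetical names, `malloc`/`free` stand in for the libxml2 and OpenSSL calls, `longjmp` stands in for an R error, and a direct call to the finalizer stands in for the garbage collector.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

/* Toy model only: hypothetical names, no R, libxml2 or OpenSSL involved. */
typedef struct {
    char *node;  /* stand-in for the XML node */
    char *ctx;   /* stand-in for the key context */
} toy_context_t;

static int resources_freed = 0;  /* inner resources released by the finalizer */
static jmp_buf on_error;         /* stand-in for R's error jump target */

/* Finalizer: releases whatever was allocated so far, then the struct. */
static void toy_context_fin(toy_context_t *c) {
    if (!c) return;
    if (c->ctx)  { free(c->ctx);  resources_freed++; }
    if (c->node) { free(c->node); resources_freed++; }
    free(c);
}

/* Fills in the context, failing once `steps` resources have been allocated. */
static void do_work(toy_context_t *c, int steps) {
    if (steps < 1) longjmp(on_error, 1);
    c->node = malloc(8);
    if (steps < 2) longjmp(on_error, 1);
    c->ctx = malloc(8);
}

/* Allocates a zeroed context, runs the work, then "collects" the context.
   Returns how many inner resources the finalizer released. */
int run_and_collect(int steps) {
    toy_context_t *volatile c = calloc(1, sizeof *c);
    assert(c != NULL);
    resources_freed = 0;
    if (setjmp(on_error) == 0) {
        do_work(c, steps);
    }
    /* Reached on both the normal and the error path. */
    toy_context_fin(c);
    return resources_freed;
}
```

In this model `run_and_collect(1)` aborts after only the first allocation, and the finalizer releases exactly that one resource because the other field is still NULL. In R the finalizer would run at some later garbage collection rather than immediately, but the same field checks apply.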

Of course, you could write a wrapper for the above with some syntactic sugar to achieve the same - essentially limiting the finalizer to be just a function call on the reference that you create. It may be a bit of overkill since you may end up creating objects for every allocation, but certainly doable. I would argue that in most cases you already tend to have a structure for the things you allocate so the "normal" approach is typically more clear and readable than inlining calls with side-effects, but that may be a matter of taste.