ovis-hpc / ovis

OVIS/LDMS High Performance Computing monitoring, analysis, and visualization project.

Home Page:https://github.com/ovis-hpc/ovis-wiki/wiki

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

ldms_record_heap_size() is confusing

morrone opened this issue · comments

The documentation (and maybe the function name) for ldms_record_heap_size() is confusing. Mostly I think the confusion arises because it only claims to "determines the memory size in the heap required for an instance of the given record". But looking at the code, that isn't true, it actually gives the size for both the record AND the storage for the list entry combined. That really needs to be clearly documented, because otherwise the list API is rather vague and confusing. I think that any competent developer is going to wonder where the list entry memory is coming from and be confused about how to grow and shrink the heap without that important detail.

We might also explicitly mention that records can only be used as part of lists (I assume that is true?) which would help to explain why the API doesn't give clean separate details for the list entry and record sizes separately.

The ldms_list_append_item() and ldms_list_append_record() API both allocate the list entry as part of the allocation of the entry to be added. The functions return the ldms_mval_t for the allocated entity. The application gets the needed information from ldms_xxx_heap_size() function to get the size they need to provide to ldms_set_new_with_heap(). That said we don't have an interface called ldms_set_new_with_heap_and_auth() and so I'm bit confounded how we got here.

We need to schedule a UG meeting to walk through this and get an API together that is more cogent. Some ideas:

  • The heap list size can be reduced to list_max * sizeof(list_entry)
  • Currently anything that is allocated off the heap goes into a list. I don't see any other use case.
  • Ultimately, heaps will be exchanged in the monitoring infrastructure like sets are today...more to come here.
  • What to do when ldms_set_heap_alloc returns ENOMEM

Currently the code pattern is:

mval = ldms_list_append_item(list_handle, type, ...);
rc = ldms_mval_set_XXX(mval, <value>);

I would argue that this is preferred:

mval = ldms_set_heap_alloc(set, type, ...);
rc = ldms_mval_set_xxx(mval, <value>);
rc = ldms_list_append_item(..., type, mval);

For ldms_set_heap_alloc(), the var-arg list is empty for primitive types. For arrays, it would be the array length, and for records, it would be the metric-id for the record definition.

I would advocate for combining the ldms_list_append_item() and ldms_list_append_record() API into a single interface and deprecating ldms_list_append_record().

I think we should do this ASAP while these interfaces are exclusively used in our source base.