ovis-hpc / ovis

OVIS/LDMS High Performance Computing monitoring, analysis, and visualization project.

Home Page: https://github.com/ovis-hpc/ovis-wiki/wiki

Decomposition behavior when metric set list is empty

morrone opened this issue · comments

For context, I am seeing this when using store_avro_kafka. I suspect this happens with any decomposition-enabled store, but I include that detail in case I am wrong.

With most of the samplers that we use with decomposition, we have a single metric set containing a single list. Each entry in the list holds the data for a single device. For instance, with the slingshot_info sampler, there is one list entry for each slingshot interface found on the node.

This ticket is about the behavior that we see when that list is empty. Right now it appears that even when the list length is zero, decomposition still creates a row as if there were a list entry, with all of the numbers zero and the strings (I think) empty.

I think that we would probably prefer that no row be generated at all if the list (or all lists?) in the metric set are zero length.

So maybe there is a question about the proper way to achieve this. Should the sampler have to delete the metric set to make that happen? Could we make some change to the decomposition to (maybe optionally) avoid generating a row when the (all?) list lengths are zero?
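To make the proposal concrete, here is a minimal sketch of the check I have in mind. Everything in it is hypothetical: `struct demo_set`, `all_lists_empty()`, and `rows_to_emit()` are illustrative stand-ins, not the real LDMS or decomposition API. It also assumes the row count follows the longest list, which may not match how the real decomposition combines multiple lists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a metric set: only the list lengths matter here. */
struct demo_set {
    size_t list_count;        /* number of list metrics in the set */
    const size_t *list_lens;  /* current length of each list */
};

/* True when the set has at least one list and every list is empty. */
static bool all_lists_empty(const struct demo_set *set)
{
    assert(set != NULL);
    if (set->list_count == 0)
        return false;         /* no lists at all: decompose as usual */
    for (size_t i = 0; i < set->list_count; i++) {
        if (set->list_lens[i] != 0)
            return false;
    }
    return true;
}

/*
 * Number of rows to generate for one sample.
 * Assumption: one row per entry of the longest list. Without the option,
 * an empty list still yields one zero-filled row (the behavior reported
 * in this ticket). With the option, empty lists yield no rows.
 */
static size_t rows_to_emit(const struct demo_set *set,
                           bool empty_array_skips_row)
{
    size_t max_len = 0;

    assert(set != NULL);
    if (empty_array_skips_row && all_lists_empty(set))
        return 0;
    for (size_t i = 0; i < set->list_count; i++) {
        if (set->list_lens[i] > max_len)
            max_len = set->list_lens[i];
    }
    return max_len > 0 ? max_len : 1;
}
```

This version only skips the row when all lists are empty, which sidesteps the "list or all lists?" question for sets with a single list.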

@tom95858 How does this sound?

I am thinking of adding an "empty_array_skips_row: true" field to the decomp static code.

I am skimming the decomp code, and it looks like decomp flex is likely tolerant of other decomposition code returning an empty row list/row count zero. The only other place I see calling decompose() right now is strgp_decompose(). I probably need to update that to avoid calling strgp->store->commit() when the row_count is 0. The only question then is whether callers should avoid calling release_rows() when the list is empty, or just require that the decomposition plugins be tolerant of release_rows() being called with an empty list. I am leaning towards the latter.
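A rough sketch of the caller-side change described above, under the option I am leaning towards: the caller skips the commit when the row count is zero but still calls the release function, which must tolerate an empty list. All names here (`demo_store`, `demo_commit()`, `demo_release_rows()`, `demo_decompose()`, `demo_store_sample()`) are hypothetical stand-ins and not the actual ldmsd types or signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical row list; the real ldmsd row list type differs. */
struct demo_row { struct demo_row *next; };
struct demo_row_list { struct demo_row *head; };

/* Hypothetical store that just counts commit calls. */
struct demo_store { int commits; };

static void demo_commit(struct demo_store *store,
                        const struct demo_row_list *rows, size_t row_count)
{
    (void)rows;
    (void)row_count;
    store->commits++;
}

/* Tolerant of an empty list: the loop body simply never runs. */
static void demo_release_rows(struct demo_row_list *rows)
{
    struct demo_row *row = rows->head;
    while (row != NULL) {
        struct demo_row *next = row->next;
        free(row);
        row = next;
    }
    rows->head = NULL;
}

/* Stand-in for a decompose step that produces `want` rows (possibly zero). */
static int demo_decompose(size_t want, struct demo_row_list *rows,
                          size_t *row_count)
{
    rows->head = NULL;
    *row_count = 0;
    for (size_t i = 0; i < want; i++) {
        struct demo_row *row = malloc(sizeof(*row));
        if (row == NULL) {
            demo_release_rows(rows);
            *row_count = 0;
            return -1;
        }
        row->next = rows->head;
        rows->head = row;
        (*row_count)++;
    }
    return 0;
}

/* Caller: commit only when there are rows, but always release. */
static int demo_store_sample(struct demo_store *store, size_t want)
{
    struct demo_row_list rows;
    size_t row_count;
    int rc;

    assert(store != NULL);
    rc = demo_decompose(want, &rows, &row_count);
    if (rc != 0)
        return rc;
    if (row_count > 0)
        demo_commit(store, &rows, row_count);
    demo_release_rows(&rows);
    return 0;
}
```

Keeping the release call unconditional means the caller has a single cleanup path, at the cost of requiring every decomposition plugin to handle the empty case.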

So probably the most complicated part (for me, as not the author of the decomp code) is actually handling the configuration option in decomp_static.c. __decomp_static_config() is over 300 lines long and largely undocumented.

In ldmsd.h:struct ldmsd_decomp_s, the documentation of config() appears to be out of date. It talks about "param json_path", but the parameter is actually "param jcfg". And I have no idea from context what to expect jcfg to contain.

Am I correct to guess that decomp_flex calls the config() of (for instance) decomp_static multiple times, once for each '"type": "static"' block? Are all of the JSON fields in the mapping included (i.e., not just "rows")?