Format of ceph health changes jewel -> luminous
jan--f opened this issue
I agree. It shouldn't be hard to keep ceph_exporter backwards compatible, i.e. have it accept both formats.
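To make "both formats" concrete, here is a minimal sketch of a parser that accepts either layout. It is not the ceph_exporter implementation. It assumes Jewel reports the cluster state under health.overall_status and Luminous under health.status in the output of ceph status --format json; those field names, the healthReport type, and the overallHealth helper are my assumptions for illustration and should be checked against real output from both releases.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// healthReport covers both layouts of the "health" object.
// ASSUMPTION: the field names below reflect Jewel and Luminous output
// as I understand it; verify them against a real cluster.
type healthReport struct {
	Health struct {
		OverallStatus string `json:"overall_status"` // assumed Jewel field
		Status        string `json:"status"`         // assumed Luminous field
	} `json:"health"`
}

// overallHealth returns the cluster health string from either layout,
// preferring the Luminous field when both are present.
func overallHealth(raw []byte) (string, error) {
	var r healthReport
	if err := json.Unmarshal(raw, &r); err != nil {
		return "", err
	}
	if r.Health.Status != "" {
		return r.Health.Status, nil
	}
	if r.Health.OverallStatus != "" {
		return r.Health.OverallStatus, nil
	}
	return "", errors.New("no health status found in either format")
}

func main() {
	// Hand-written samples, not captured from a cluster.
	jewel := []byte(`{"health":{"overall_status":"HEALTH_WARN","summary":[]}}`)
	luminous := []byte(`{"health":{"status":"HEALTH_OK","checks":{}}}`)
	for _, in := range [][]byte{jewel, luminous} {
		status, err := overallHealth(in)
		fmt.Println(status, err)
	}
}
```

Preferring the newer field means a Luminous cluster running with the compat option, which may emit both fields, still reads the native value.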
@jan--f have you tried out a luminous build with these changes yet? I'm seeing that besides the change in format for messages, all the mon stats have disappeared. Specifically, the section under "health -> health_services -> mons" is gone. Do you know where these stats went? I don't see them in the new mgr status.
To follow up, Sage indicated that he didn't think the mon metrics were being used or needed by anyone, but he could add them back if they are actually helpful.
For a workaround (at least for the health stats) one can add mon_health_preluminous_compat=true to ceph.conf.
Hi @jan--f, did the client/recovery/cache I/O output of ceph status --format plain also get removed or moved elsewhere in Luminous? The lines in this function use the plain status output to look for client io, recovery io, and cache io, but I no longer see them in the luminous output: https://github.com/digitalocean/ceph_exporter/blob/master/collectors/health.go#L741
Any idea how to collect those values still? Client I/O and recovery I/O (but not cache I/O) look to be available per pool with ceph osd pool stats --format=json, but I'm wondering if those stats are still available top level (aggregate). Thanks!
Actually I do see client IO in the ceph status output, but it has to be active to be printed. Same with recovery and cache IO.
Instead of updating the parsing of the plain text, which changed format in luminous, the best bet is probably to parse the JSON instead. The format can be gleaned from the client, cache, and recovery functions called from here in Luminous: https://github.com/ceph/ceph/blob/138f08d5df311d9e4987819a792c01838dc36806/src/mon/PGMap.cc#L253
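A rough sketch of what parsing the JSON could look like, under some assumptions: that ceph status --format json carries the aggregate rates in a pgmap object, that the keys are the ones named below (my reading of the PGMap.cc functions linked above, so please verify), and that keys are simply left out when there is no activity, which would match the plain output only printing active IO. The ioRates type and parseIORates helper are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ioRates holds the aggregate IO rates from the assumed "pgmap" object.
// ASSUMPTION: key names are my reading of PGMap.cc, not verified output.
type ioRates struct {
	PGMap struct {
		// Client IO
		ReadBytesSec  float64 `json:"read_bytes_sec"`
		WriteBytesSec float64 `json:"write_bytes_sec"`
		ReadOpPerSec  float64 `json:"read_op_per_sec"`
		WriteOpPerSec float64 `json:"write_op_per_sec"`
		// Recovery IO
		RecoveringObjectsPerSec float64 `json:"recovering_objects_per_sec"`
		RecoveringBytesPerSec   float64 `json:"recovering_bytes_per_sec"`
		RecoveringKeysPerSec    float64 `json:"recovering_keys_per_sec"`
		// Cache tier IO
		FlushBytesSec   float64 `json:"flush_bytes_sec"`
		EvictBytesSec   float64 `json:"evict_bytes_sec"`
		PromoteOpPerSec float64 `json:"promote_op_per_sec"`
	} `json:"pgmap"`
}

// parseIORates decodes the status JSON. Keys missing from the input keep
// Go's zero value, so an idle cluster yields rates of 0 rather than an error.
func parseIORates(raw []byte) (ioRates, error) {
	var r ioRates
	err := json.Unmarshal(raw, &r)
	return r, err
}

func main() {
	// Hand-written samples, not captured from a cluster.
	active := []byte(`{"pgmap":{"read_bytes_sec":4096,"write_op_per_sec":12}}`)
	idle := []byte(`{"pgmap":{"num_pgs":64}}`)
	for _, in := range [][]byte{active, idle} {
		r, err := parseIORates(in)
		fmt.Println(r.PGMap.ReadBytesSec, r.PGMap.WriteOpPerSec, err)
	}
}
```

Treating a missing key as zero lets the exporter keep publishing a gauge of 0 while the cluster is idle, instead of the metric disappearing between scrapes.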
> For a workaround (at least for the health stats) one can add mon_health_preluminous_compat=true to ceph.conf.
This would probably break openATTIC's (oA's) health display.
@jbw976 Yeah, the I/O parts are not working for me either. I agree about parsing the JSON rather than the plain output. No idea why this implementation was chosen.
The JSON format also changed, though, so parsing the JSON won't spare us the upgrade pains.
FWIW I'm also working on a mgr plugin that exports prometheus metrics. It's not equivalent to the ceph_exporter but should export roughly the same metrics (with differences in naming and labels though). See ceph/ceph#16990
FYI, this is our downstream issue: https://tracker.openattic.org/browse/OP-2583
This should no longer be a problem. :) Feel free to re-open if you're having problems with client IO metrics.