[ERROR] Unable to collect data from ceph osd df rados: Invalid argument
gopherunner opened this issue · comments
After building the ceph_exporter Go project, I tried to run the ceph_exporter binary and got the following error messages:
root@cm03:/# /usr/local/go/bin/ceph_exporter
2017/05/18 14:33:48 Starting ceph exporter on ":9128"
2017/05/18 14:33:55 [ERROR] cannot extract total bytes: strconv.ParseFloat: parsing "": invalid syntax
2017/05/18 14:33:55 [ERROR] cannot extract used bytes: strconv.ParseFloat: parsing "": invalid syntax
2017/05/18 14:33:55 [ERROR] cannot extract available bytes: strconv.ParseFloat: parsing "": invalid syntax
2017/05/18 14:33:55 failed collecting cluster health metrics: strconv.ParseFloat: parsing "": invalid syntax
2017/05/18 14:33:55 [ERROR] Unable to collect data from ceph osd df rados: Invalid argument
2017/05/18 14:33:55 failed collecting osd metrics: rados: Invalid argument
Any ideas?
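For reference, the `strconv.ParseFloat: parsing "": invalid syntax` text in the log is what Go reports when an empty string is converted to a float. One way that can happen is sketched below, assuming the value is decoded into a `json.Number` field and the expected key is missing from the JSON. The `usageStats` type, the `totalBytes` helper and the key names are hypothetical, not taken from the exporter source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// usageStats is a hypothetical struct for illustration only;
// it is not the exporter's real type.
type usageStats struct {
	TotalBytes json.Number `json:"total_bytes"`
}

// totalBytes decodes raw and converts the total_bytes field to a float64.
func totalBytes(raw []byte) (float64, error) {
	var s usageStats
	if err := json.Unmarshal(raw, &s); err != nil {
		return 0, err
	}
	// If the key is absent, s.TotalBytes stays "" and Float64 fails,
	// because it calls strconv.ParseFloat on the empty string.
	return s.TotalBytes.Float64()
}

func main() {
	// Key renamed or missing: conversion of "" fails.
	_, err := totalBytes([]byte(`{"total_used_bytes": 1024}`))
	fmt.Println(err)

	// Key present: conversion succeeds.
	v, err := totalBytes([]byte(`{"total_bytes": 2048}`))
	fmt.Println(v, err)
}
```

So an empty value here may simply mean the command output no longer contains the field the exporter looks for, rather than the cluster reporting bad data.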
I'm getting the same error while using:
/ceph_exporter -ceph.config "/etc/ceph/ceph.conf" -ceph.user "admin" -exporter.config "/etc/ceph/exporter.yml"
Same issue here
[root@mds01ceph ~]# systemctl -l status ceph-exporter
● ceph-exporter.service - Ceph Exporter Service
Loaded: loaded (/usr/lib/systemd/system/ceph-exporter.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2020-03-12 12:15:31 +08; 3min 0s ago
Process: 2184 ExecStartPre=/usr/bin/docker rm ceph_exporter (code=exited, status=0/SUCCESS)
Process: 1146 ExecStartPre=/usr/bin/docker kill ceph_exporter (code=exited, status=1/FAILURE)
Main PID: 2196 (docker)
Tasks: 14
Memory: 95.4M
CGroup: /system.slice/ceph-exporter.service
└─2196 /usr/bin/docker run --name ceph_exporter -v /etc/ceph:/etc/ceph --net=host -p=9128:9128 ceph_exporter
Mar 12 12:17:30 mds01ceph docker[2196]: 2020/03/12 04:17:30 [ERROR] cannot extract total objects: strconv.ParseFloat: parsing "": invalid syntax
Mar 12 12:17:30 mds01ceph docker[2196]: 2020/03/12 04:17:30 failed performing PG dump brief: json: cannot unmarshal object into Go value of type collectors.cephPGDumpBrief
Mar 12 12:17:45 mds01ceph docker[2196]: 2020/03/12 04:17:45 [ERROR] cannot extract total objects: strconv.ParseFloat: parsing "": invalid syntax
Mar 12 12:17:45 mds01ceph docker[2196]: 2020/03/12 04:17:45 failed performing PG dump brief: json: cannot unmarshal object into Go value of type collectors.cephPGDumpBrief
I am having the same problem. For now I simply added StandardError=null to the systemd service to avoid flooding the system log. However, this will also hide any other error messages.
2020/06/16 18:43:43 failed performing PG dump brief: json: cannot unmarshal object into Go value of type collectors.cephPGDumpBrief
I believe this was fixed. If you're running a Nautilus cluster, make sure you're not using the Luminous exporter.
There were a few changes from Luminous to Nautilus:
- the total objects error was due to changes in ceph df
- the unmarshal error on ceph pg dump pgs_brief was due to a JSON output change
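To illustrate the unmarshal error: Go's `encoding/json` returns "cannot unmarshal object into Go value of type ..." when the target is a slice but the input is a JSON object. Below is a minimal sketch of a decoder that tolerates both shapes. The `pgState` type, the `decodePGBrief` helper and the `pg_stats` wrapper key are assumptions made for illustration and may not match what the exporter or a given Ceph release actually uses.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pgState is a hypothetical, trimmed-down PG entry.
type pgState struct {
	PGID  string `json:"pgid"`
	State string `json:"state"`
}

// decodePGBrief accepts either a bare JSON array of PG entries or an
// object that wraps the array under "pg_stats" (assumed key name).
func decodePGBrief(raw []byte) ([]pgState, error) {
	var list []pgState
	if err := json.Unmarshal(raw, &list); err == nil {
		return list, nil
	}
	// Fall back to the wrapped form.
	var wrapped struct {
		PGStats []pgState `json:"pg_stats"`
	}
	if err := json.Unmarshal(raw, &wrapped); err != nil {
		return nil, err
	}
	return wrapped.PGStats, nil
}

func main() {
	objectForm := []byte(`{"pg_stats":[{"pgid":"1.0","state":"active+clean"}]}`)

	// Decoding an object straight into a slice reproduces the error class.
	var direct []pgState
	fmt.Println(json.Unmarshal(objectForm, &direct))

	// The tolerant decoder handles the same input.
	pgs, err := decodePGBrief(objectForm)
	fmt.Println(pgs, err)
}
```

Trying the array form first keeps older output working, at the cost of a second parse when the wrapped form is returned.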
FWIW my clusters are Infernalis and Nautilus. Working to sunset the former. I've deployed Nautilus admin nodes on the Infernalis clusters for uniformity; the Docker image I'm using is for sure from the Nautilus branch, from ... late June or early July. That said, I've never really understood the branch strategy; it's susceptible to divergence, which we've for sure seen. It would seem cleaner to have just one branch with internal logic that acts differently based on the version installed. Upstream randomly changing things (notably time skew) makes this a real moving target.
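The single-branch idea could look roughly like the sketch below: read the major version from the version string and pick the parsing path from it. This is only an illustration under assumptions. The `majorVersion` and `pgDumpFormat` helpers are hypothetical, the input is assumed to look like "ceph version 14.2.9 (...) nautilus (stable)", and the cutoff at major version 14 (Nautilus) is inferred from the Luminous-to-Nautilus changes mentioned in this thread.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorVersion extracts the major number from a string such as
// "ceph version 14.2.9 (...) nautilus (stable)".
func majorVersion(s string) (int, error) {
	fields := strings.Fields(s)
	for i, f := range fields {
		if f == "version" && i+1 < len(fields) {
			major := strings.SplitN(fields[i+1], ".", 2)[0]
			return strconv.Atoi(major)
		}
	}
	return 0, fmt.Errorf("no version found in %q", s)
}

// pgDumpFormat returns a hypothetical label for which decoder to use.
// Assumption: output changed shape starting with major version 14.
func pgDumpFormat(major int) string {
	if major >= 14 {
		return "object"
	}
	return "array"
}

func main() {
	for _, v := range []string{
		"ceph version 9.2.1 (placeholder)",
		"ceph version 14.2.9 (placeholder) nautilus (stable)",
	} {
		major, err := majorVersion(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(major, pgDumpFormat(major))
	}
}
```

Keying on the numeric major version rather than the release name avoids depending on the name being present in the string.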
I never had to look at Infernalis, but this is likely the issue: a change in the pg_dump brief -f json output.
100% agreed on the multi-branch point. Our next step for ceph_exporter is to merge everything into one main branch before we start working on Octopus compatibility, but a bit of work is required to make that happen.