matusnovak / prometheus-smartctl

HDD S.M.A.R.T exporter for Prometheus written in Python

More datapoints?

SnorreSelmer opened this issue · comments

I tried setting up this SMART-exporter script (https://github.com/prometheus-community/node-exporter-textfile-collector-scripts/blob/master/smartmon.sh) but couldn't quite make it work right, so I found your excellent exporter, and was up and running in 5 minutes!

However! The above script has some more datapoints, some of which I'd consider to be a bit more user-friendly. I'm thinking of this one:

# HELP smartmon_temperature_celsius_raw_value SMART metric temperature_celsius_raw_value
# TYPE smartmon_temperature_celsius_raw_value gauge
smartmon_temperature_celsius_raw_value{disk="/dev/sda",type="sat",smart_id="194"} 3.000000e+01
smartmon_temperature_celsius_raw_value{disk="/dev/sdb",type="sat",smart_id="194"} 3.200000e+01
smartmon_temperature_celsius_raw_value{disk="/dev/sdc",type="sat",smart_id="194"} 3.100000e+01
smartmon_temperature_celsius_raw_value{disk="/dev/sdd",type="sat",smart_id="194"} 3.200000e+01
smartmon_temperature_celsius_raw_value{disk="/dev/sde",type="sat",smart_id="194"} 3.200000e+01
smartmon_temperature_celsius_raw_value{disk="/dev/sdf",type="sat",smart_id="194"} 3.100000e+01
smartmon_temperature_celsius_raw_value{disk="/dev/sdg",type="sat",smart_id="194"} 3.300000e+01
smartmon_temperature_celsius_raw_value{disk="/dev/sdh",type="sat",smart_id="194"} 3.100000e+01
smartmon_temperature_celsius_raw_value{disk="/dev/sdj",type="sat",smart_id="194"} 2.900000e+01

What's the difference between this and your smartprom_Temperature_Celsius?
Well, on my drives, smartprom_Temperature_Celsius gives out a value that I have to subtract from 152 to get the correct temperature. smartprom_Temperature_Celsius{disk="sda"} returns 122, and 152-122=30.
The above smartmon_temperature_celsius_raw_value{disk="/dev/sda"} is already 30, no math needed.
Maybe you could look at the datapoints scraped in the script I linked (there's also a Python script there), and implement some of them?

Hi @SnorreSelmer

That's not a problem, I will add the missing data points.

On my drives (Seagate Barracuda/IronWolf) the Temperature_Celsius is accurate and the temperature_celsius_raw_value is missing completely. Weird. I suppose each hard drive manufacturer publishes different data points.

This should be resolved. Check the latest master, or run docker pull matusnovak/prometheus-smartctl:v1.1.1.

I have changed the script so that the Prometheus gauges are created dynamically based on what smartctl -A /dev/sd* reports. So this way the script will report all data points available.
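To illustrate the dynamic approach, here is a minimal sketch, not the exporter's actual code. It assumes the ATA attribute table layout that smartctl -A prints for SATA drives. The names ROW, parse_attributes, and update_metrics are hypothetical, the sample rows are made up for illustration, and a plain dict stands in for real Prometheus gauge objects.

```python
import re

# One row of the ATA attribute table printed by `smartctl -A` (assumed layout):
# ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
ROW = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>\S+)\s+0x[0-9a-fA-F]+\s+"
    r"(?P<value>\d+)\s+(?P<worst>\d+)\s+(?P<thresh>\S+)\s+"
    r"\S+\s+\S+\s+\S+\s+(?P<raw>.+?)\s*$"
)

# Illustrative sample only, not output captured from a real drive.
SAMPLE = """
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   083   064   044    Pre-fail  Always       -       200384840
194 Temperature_Celsius     0x0022   122   108   000    Old_age   Always       -       30
"""


def parse_attributes(output):
    """Map attribute name -> normalized value and raw text for each table row."""
    attrs = {}
    for line in output.splitlines():
        match = ROW.match(line)
        if match:  # header and blank lines do not match and are skipped
            attrs[match.group("name")] = {
                "id": int(match.group("id")),
                "value": int(match.group("value")),
                "raw": match.group("raw"),
            }
    return attrs


def update_metrics(metrics, drive, output, prefix="smartprom_"):
    """Create one metric per reported attribute on first sight, then set it."""
    for name, attr in parse_attributes(output).items():
        # setdefault creates the per-metric dict the first time a name appears
        metrics.setdefault(prefix + name, {})[drive] = float(attr["value"])
    return metrics


metrics = update_metrics({}, "sda", SAMPLE)
for metric_name, per_drive in sorted(metrics.items()):
    for drive, value in per_drive.items():
        print(f'{metric_name}{{drive="{drive}"}} {value}')
```

Because metric names come from whatever rows the drive reports, nothing has to be hard-coded per manufacturer. With a real Prometheus client library the dict would be replaced by gauge objects created on demand.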

I think the problem with the incorrect smartprom_Temperature_Celsius was that my script was using the default device type and wasn't running -A with the correct -d [sat,nvme,scsi,...]. With the latest changes, the command is run with the correct -d and the smartprom_Temperature_Celsius should be a normal value.
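A rough sketch of passing the device type explicitly could look like the following. It assumes the type is discovered from smartctl --scan, whose lines have the form "/dev/sda -d sat # comment"; how the exporter actually discovers the type is not stated here, and parse_scan and attributes_command are hypothetical helper names.

```python
def parse_scan(scan_output):
    """Return (device, type) pairs from `smartctl --scan` style output."""
    devices = []
    for line in scan_output.splitlines():
        # Drop the trailing "# ..." comment, then split "/dev/sda -d sat"
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 3 and fields[1] == "-d":
            devices.append((fields[0], fields[2]))
    return devices


def attributes_command(device, dev_type):
    """Build the argument list for reading attributes with an explicit -d."""
    return ["smartctl", "-A", "-d", dev_type, device]


# Illustrative scan output, not captured from a real machine.
scan = (
    "/dev/sda -d sat # /dev/sda [SAT], ATA device\n"
    "/dev/nvme0 -d nvme # /dev/nvme0, NVMe device\n"
)
for device, dev_type in parse_scan(scan):
    print(" ".join(attributes_command(device, dev_type)))
```

The argument list can be handed to subprocess.run on a machine with smartmontools installed; the sketch only builds the command so it can be read without root access or real disks.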

You have a SCSI disk, don't you? I think smartctl calculates the values differently depending on command line option -d.

My bad. I think the correct temperature is meant to be the raw value (raw_value column reported by smartctl) and not the current value (value column).

I have added *_raw gauges that will use the "raw_value" column, but only if it is an integer.
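The "only if it is an integer" rule can be sketched like this. It is an assumption about the behaviour rather than the code in the repository, and raw_as_int and render_raw_lines are hypothetical names. The input is a mapping from attribute name to the raw_value text as smartctl prints it.

```python
def raw_as_int(raw):
    """Return the raw value as an int, or None if it is not a plain integer."""
    text = raw.strip()
    if text.isascii() and text.isdigit():
        return int(text)
    return None


def render_raw_lines(raw_values, drive, prefix="smartprom_"):
    """Emit one *_raw sample line per attribute whose raw value is an integer."""
    lines = []
    for name, raw in raw_values.items():
        number = raw_as_int(raw)
        if number is None:
            continue  # e.g. "1234h+56m+07.890s" or "35 (Min/Max 20/45)"
        lines.append(f'{prefix}{name}_raw{{drive="{drive}"}} {float(number)}')
    return lines


# Illustrative raw values, not taken from a real drive.
example = {
    "Temperature_Celsius": "35",
    "Head_Flying_Hours": "1234h+56m+07.890s",
}
print("\n".join(render_raw_lines(example, "sdl")))
```

Under this strict reading, a raw value with trailing detail such as "35 (Min/Max 20/45)" produces no *_raw gauge. Whether the exporter handles that case differently is not covered in this thread.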

Pushed to master and to DockerHub -> docker pull matusnovak/prometheus-smartctl:v1.1.2.

So you will get:

smartprom_Temperature_Celsius{drive="sdl"} 35.0
smartprom_Temperature_Celsius{drive="sdk"} 33.0
smartprom_Temperature_Celsius{drive="sdj"} 35.0
smartprom_Temperature_Celsius_raw{drive="sdl"} 35.0
smartprom_Temperature_Celsius_raw{drive="sdk"} 33.0
smartprom_Temperature_Celsius_raw{drive="sdj"} 35.0
etc...

Closing this. The current implementation will provide you with all data points.