BoukeHaarsma23 / WattmanGTK

A Wattman-like GTK3+ GUI

Remove hardcoded hwmon0 value and dynamically search for valid cards

urbenlegend opened this issue

This app expects to read values from /sys/class/drm/card0/device/hwmon/hwmon0 but this could change depending on the hardware configuration. For example, my Vega 64 card in Arch is located at /sys/class/drm/card0/device/hwmon/hwmon2

Also, many places use /hwmon/hwmon0/subsystem/hwmon0/, but I believe subsystem is just a symlink. You can access the same info via `/hwmon/hwmon0`, but I may be wrong on this.
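One way to check the symlink claim on a given machine is to resolve both paths and compare them. This is a minimal sketch, not code from WattmanGTK; the helper name `same_hwmon_dir` is made up for illustration, and the example path is the one quoted in this thread.

```python
import os

def same_hwmon_dir(hwmon_dir):
    """Return True if <hwmon_dir>/subsystem/<name> resolves back to hwmon_dir."""
    name = os.path.basename(os.path.normpath(hwmon_dir))
    via_subsystem = os.path.join(hwmon_dir, "subsystem", name)
    # realpath follows every symlink, so equal results mean the same directory.
    return os.path.realpath(via_subsystem) == os.path.realpath(hwmon_dir)

if __name__ == "__main__":
    path = "/sys/class/drm/card0/device/hwmon/hwmon0"  # example path from this thread
    if os.path.isdir(path):
        print(same_hwmon_dir(path))
```

If this prints True, the shorter path can be used wherever the longer subsystem path appears.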

Do you know if there is any documentation on this online?

Okay, so I have a few suggestions

  1. We definitely need a way to select which graphics card we want. We may have card0 and card1 etc.
  2. I am thinking that since we already specify the card via /sys/class/drm/card0/device/ and there's a hwmon folder inside that directory, maybe we can just assume that there will only be one hwmon* directory inside hwmon. But then again, assuming makes an ass out of you and me, so they say, which brings me to my 3rd point.
  3. Instead of assuming, we should check whether the contents of /sys/class/drm/card0/device/device == /sys/class/drm/card0/device/hwmon/hwmon*/device/device. Then we just iterate through all the hwmon* folders that we find until we find the one that fits. Then we access info via that hwmon* folder.

By the way, /sys/class/drm/card0/device/device lists the PCI ID of the device, which we can also find via lspci -nn.
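The check in point 3 could be sketched roughly as follows. This is only an illustration of the idea, not the code that ended up in the project; `read_text` and `find_hwmon_dir` are hypothetical helper names, and the sketch assumes each hwmon* directory has a device link pointing back at the card's device directory.

```python
import glob
import os

def read_text(path):
    """Return stripped file contents, or None if the file cannot be read."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None

def find_hwmon_dir(card_device_dir):
    """Return the first hwmon* directory whose device ID matches the card's, else None."""
    card_id = read_text(os.path.join(card_device_dir, "device"))
    if card_id is None:
        return None
    pattern = os.path.join(card_device_dir, "hwmon", "hwmon*")
    for hwmon_dir in sorted(glob.glob(pattern)):
        # hwmon*/device/device should hold the same PCI device ID as the card.
        hwmon_id = read_text(os.path.join(hwmon_dir, "device", "device"))
        if hwmon_id == card_id:
            return hwmon_dir
    return None

if __name__ == "__main__":
    print(find_hwmon_dir("/sys/class/drm/card0/device"))
```

Sensor files such as temp1_input would then be read relative to the returned directory instead of a hardcoded hwmon0.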

I will write a more generic way to retrieve sensor paths. For now, as a workaround, edit the paths in plot.py and GPU.py to the correct values.

I also had to edit the paths in handler.py in order to get the script to successfully load (in addition to GPU.py and plot.py).

Just in case others are having trouble.

I also have a Vega64 like @urbenlegend

My Vega56 on Arch Linux 4.18 had a similar issue of not being at the hardcoded path. It was at the following:
/hwmon/hwmon1/subsystem/hwmon1/

I edited the files as described above but wanted to give you some more data.

My RX 480 was under "hwmon3"

Please try the generic-sensors branch and let me know if that fixed this problem.

I won't be able to test that until Tuesday night. The flickering from turning on the mask makes gaming impossible, but I will try the latest code on Tuesday night for you, no problem.

Just switched to generic-sensors branch. Seems like it's working for me with hwmon2

Working fine for me too with a 580 on hwmon1, thanks.

Worked for me with Vega56 on hwmon1. Thanks @BoukeHaarsma23!

Yup, works now! :D

Failed to install:

Traceback (most recent call last):
  File "setup.py", line 18, in <module>
    from setuptools import setup
ModuleNotFoundError: No module named 'setuptools'

@greevar I don't know if you found the answer to your issue, but you just need to install the setuptools package:

  • sudo apt-get install -y python3-setuptools
  • sudo python3 setup.py install

Ideally, when looking for the appropriate cards and hwmon interfaces, you should search:

/sys/class/drm/card*/

and

/sys/class/drm/card*/device/hwmon/hwmon*/

You should check the PCI-ID information of the associated device numbers for these branches (./device) to find the appropriate locations. This will also cover additional cards that aren't the primary boot card.

One could also validate that it is indeed the appropriate interface by checking other data in the proc tree for those files, since this search will return ALL devices registered with the DRM video interface module.
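A rough sketch of that search, listing every card with its PCI IDs and hwmon directories. The function name `list_cards` and the returned dictionary layout are made up for illustration. The one fact relied on beyond the paths above is that 0x1002 is the PCI vendor ID for AMD/ATI.

```python
import glob
import os
import re

def read_text(path):
    """Return stripped file contents, or None if the file cannot be read."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None

def list_cards(drm_root="/sys/class/drm"):
    """Return one dict per cardN entry with its PCI IDs and hwmon directories."""
    cards = []
    for card_dir in sorted(glob.glob(os.path.join(drm_root, "card*"))):
        name = os.path.basename(card_dir)
        if not re.fullmatch(r"card\d+", name):
            continue  # skip connector entries such as card0-DP-1
        device_dir = os.path.join(card_dir, "device")
        cards.append({
            "card": name,
            "vendor": read_text(os.path.join(device_dir, "vendor")),
            "device": read_text(os.path.join(device_dir, "device")),
            "hwmon": sorted(glob.glob(os.path.join(device_dir, "hwmon", "hwmon*"))),
        })
    return cards

if __name__ == "__main__":
    for card in list_cards():
        # 0x1002 is the PCI vendor ID for AMD/ATI.
        if card["vendor"] == "0x1002":
            print(card["card"], card["device"], card["hwmon"])
```

Filtering on the vendor ID is one way to drop non-AMD devices, which covers secondary cards as well as the primary boot card.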

[kzlosnik@kgrCentos WattmanGTK-master]$ python3 run.py
1 AMD GPU(s) found. Checking if correct kernel driver is used for this/these.
44:00.0 uses amdgpu kernel driver
Searching for sysfs path...
/sys/devices/pci0000:40/0000:40:03.1/0000:42:00.0/0000:43:00.0/0000:44:00.0 belongs to 44:00.0 with symbolic link to /sys/class/drm/card0/device
Sysfs path found in /sys/devices/pci0000:40/0000:40:03.1/0000:42:00.0/0000:43:00.0/0000:44:00.0
amdgpu card found in /sys/class/hwmon/hwmon0 hwmon folder
Checking which device this hwmon path belongs to
/sys/class/hwmon/hwmon0 belongs to /sys/devices/pci0000:40/0000:40:03.1/0000:42:00.0/0000:43:00.0/0000:44:00.0 (Advanced Micro Devices, Inc. [AMD/ATI] Device 081e)
Found sensor in0_input
Trying to read /sys/class/hwmon/hwmon0/in0_input
Found sensor fan1_min
Trying to read /sys/class/hwmon/hwmon0/fan1_min
Found sensor temp1_crit
Trying to read /sys/class/hwmon/hwmon0/temp1_crit
Found sensor pwm1_enable
Trying to read /sys/class/hwmon/hwmon0/pwm1_enable
Found sensor pwm1
Trying to read /sys/class/hwmon/hwmon0/pwm1
Found sensor temp1_crit_hyst
Trying to read /sys/class/hwmon/hwmon0/temp1_crit_hyst
Found sensor power1_cap_min
Trying to read /sys/class/hwmon/hwmon0/power1_cap_min
Found sensor fan1_enable
Trying to read /sys/class/hwmon/hwmon0/fan1_enable
Found sensor fan1_max
Trying to read /sys/class/hwmon/hwmon0/fan1_max
Found sensor power1_cap
Trying to read /sys/class/hwmon/hwmon0/power1_cap
Found sensor pwm1_min
Trying to read /sys/class/hwmon/hwmon0/pwm1_min
Found sensor power1_average
Trying to read /sys/class/hwmon/hwmon0/power1_average
Found sensor power1_cap_max
Trying to read /sys/class/hwmon/hwmon0/power1_cap_max
Found sensor fan1_input
Trying to read /sys/class/hwmon/hwmon0/fan1_input
Found sensor temp1_input
Trying to read /sys/class/hwmon/hwmon0/temp1_input
Found sensor pwm1_max
Trying to read /sys/class/hwmon/hwmon0/pwm1_max
Found sensor fan1_target
Trying to read /sys/class/hwmon/hwmon0/fan1_target
Found sensor in0_label
Trying to read /sys/class/hwmon/hwmon0/in0_label
Reading clock states and limits.
Traceback (most recent call last):
  File "run.py", line 23, in <module>
    wattman.main()
  File "/home/kzlosnik/Downloads/WattmanGTK-master/WattmanGTK/wattman.py", line 158, in main
    card.get_states()
  File "/home/kzlosnik/Downloads/WattmanGTK-master/WattmanGTK/GPU.py", line 70, in get_states
    self.pstate_clock.append(int(match.group(2)))
AttributeError: 'NoneType' object has no attribute 'group'

what now?
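The AttributeError means the regular expression match on line 70 of GPU.py returned None, i.e. a line in the clock state file did not have the format the parser expected. The actual pattern in GPU.py is not shown in this thread, so the following is only a sketch of how such a parser can skip non-matching lines instead of crashing. The pattern `STATE_RE`, the function `parse_clock_states` and the sample line format ("1: 608Mhz *") are assumptions for illustration.

```python
import re

# Hypothetical pattern for lines such as "1: 608Mhz *"; not taken from GPU.py.
STATE_RE = re.compile(r"^\s*(\d+):\s*(\d+)\s*mhz", re.IGNORECASE)

def parse_clock_states(text):
    """Return (state, clock_mhz) tuples, skipping lines that do not match."""
    states = []
    for line in text.splitlines():
        match = STATE_RE.match(line)
        if match is None:
            continue  # e.g. a header line or an unexpected format
        states.append((int(match.group(1)), int(match.group(2))))
    return states

if __name__ == "__main__":
    sample = "OD_SCLK:\n0: 300Mhz\n1: 608Mhz *\n"
    print(parse_clock_states(sample))
```

Checking for None before calling group() turns an unexpected file format into an empty or partial result that the caller can report, rather than a traceback.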

Trying to read /sys/class/hwmon/hwmon1/in0_label
Reading clock states and limits.
Cannot read file pp_od_clk_voltage, trying using pp_dpm_sclk and pp_dpm_mclk
Cannot do seperate overclocking via states, only by percentage!
Also got an error reading /sys/devices/pci0000:00/0000:00:01.1/0000:10:00.0/pp_dpm_sclk or /sys/devices/pci0000:00/0000:00:01.1/0000:10:00.0/pp_dpm_sclk
WattmanGTK will not be able to continue

I am using an R9 280 with the amdgpu open source driver, and my CPU is the Ryzen 5 2400G with integrated Vega 11 graphics.

I am unsure what to do at the moment, as I am not great at deciphering these sorts of messages. Any and all help is greatly appreciated.