martinpitt / umockdev

Mock hardware devices for creating unit tests and bug reporting

Home Page:https://launchpad.net/umockdev

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Properties not visible to libudev since systemd database changes

hadess opened this issue · comments

It took me a long while to figure out, but sd-device, which is the backend for libudev in systemd, barely reads uevent files ever anymore (if at all). The devices were marked as "sealed" every time device_read_uevent_file() was called. I think this has to do with the changes made to systemd to counteract the ABI change in uevents in the kernel, with more data being cached.

The use case is https://gitlab.gnome.org/GNOME/gnome-control-center/-/merge_requests/1090 where I try to hotplug a UPS while a real upowerd is running. upowerd doesn't manage to detect the UPS because the UPOWER_* properties aren't accessible.

The smallest reproducer for the bug that I could find is:

diff --git a/tests/test-umockdev.py b/tests/test-umockdev.py
index 58d8bab..c42daa8 100644
--- a/tests/test-umockdev.py
+++ b/tests/test-umockdev.py
@@ -134,10 +134,12 @@ class Testbed(unittest.TestCase):
         def on_uevent(client, action, device, counters):
             if action == 'add':
                 counters[0] += 1
+                self.assertEqual(device.get_property('ID_INPUT'), '1')
             elif action == 'remove':
                 counters[1] += 1
             else:
                 assert action == 'change'
+                self.assertEqual(device.get_property('ID_INPUT'), '1')
                 counters[2] += 1
 
             counters[3] = device.get_sysfs_path()

or even better, so as to not send an add uevent twice:

diff --git a/tests/test-umockdev.py b/tests/test-umockdev.py
index 58d8bab..5f991ac 100644
--- a/tests/test-umockdev.py
+++ b/tests/test-umockdev.py
@@ -134,25 +134,26 @@ class Testbed(unittest.TestCase):
         def on_uevent(client, action, device, counters):
             if action == 'add':
                 counters[0] += 1
+                self.assertEqual(device.get_property('ID_INPUT'), '1')
             elif action == 'remove':
                 counters[1] += 1
             else:
                 assert action == 'change'
+                self.assertEqual(device.get_property('ID_INPUT'), '1')
                 counters[2] += 1
 
             counters[3] = device.get_sysfs_path()
 
-        syspath = self.testbed.add_device('pci', 'mydev', None, ['idVendor', '0815'], ['ID_INPUT', '1'])
-        self.assertNotEqual(syspath, None)
-
         # set up listener for uevent signal
         client = GUdev.Client.new(['pci'])
         client.connect('uevent', on_uevent, counter)
 
+        syspath = self.testbed.add_device('pci', 'mydev', None, ['idVendor', '0815'], ['ID_INPUT', '1'])
+        self.assertNotEqual(syspath, None)
+
         mainloop = GLib.MainLoop()
 
-        # send a signal and run main loop for 0.5 seconds to catch it
-        self.testbed.uevent(syspath, 'add')
+        # run main loop for 0.5 seconds to catch the add uevent
         GLib.timeout_add(500, mainloop.quit)
         mainloop.run()
         self.assertEqual(counter, [1, 0, 0, syspath])

Thanks @hadess! Ugh, that's a major bug.. I'll look at it ASAP. I created a reproducer test for now, and as expected it fails everywhere except Alpine (with musl).

as to not send an add uevent twice:

I'd like to keep that test as is: It's specifically meant to test the self.testbed.uevent(syspath, 'add') call. t_testbed_add_devicev() (and others) already cover the "implicit" add event from add_device().

I'd like to keep that test as is: It's specifically meant to test the self.testbed.uevent(syspath, 'add') call. t_testbed_add_devicev() (and others) already cover the "implicit" add event from add_device().

That's fine. It was easier for me to modify an existing test than to create a new one. Took me more than a day to "find" the bug.

I checked current systemd's code path: udev_device_get_property_value()sd_device_get_property_value(), and from then on pretty much everything happens inside sd-device.c: sd_device_get_property_value()device_properties_prepare()device_read_uevent_file(). So that bit is still as expected, but the latter immediately exits on device->uevent_loaded || device->sealed as you mentioned. uevent_loaded is just a function-internal cache flag that it only ever runs the function once.

At first sight we can ignore the "sealed" code paths -- device_seal() is only called from udevadm-test.c, and from device_new_from_{strv,nulstr}, and some copy constructor.

So after the first review I didn't yet see the root cause. Needs building a debug version and go through the code with a fine comb (and gdb).

Patches which could be useful, for setting SYSTEMD_LOG_LEVEL=debug. See also systemd/systemd#21761

libgudev:

diff --git a/gudev/gudevclient.c b/gudev/gudevclient.c
index d96283a..46c6623 100644
--- a/gudev/gudevclient.c
+++ b/gudev/gudevclient.c
@@ -178,6 +178,10 @@ g_udev_client_constructed (GObject *object)
 
   client->priv->udev = udev_new ();
 
+  /* From sys/syslog.h */
+#define LOG_DEBUG 7
+  udev_set_log_priority (client->priv->udev, LOG_DEBUG);
+
   /* connect to event source */
   client->priv->monitor = udev_monitor_new_from_netlink (client->priv->udev, "udev");
 

systemd:

diff --git a/src/libudev/libudev.c b/src/libudev/libudev.c
index 73574874ca..d7cfdb31c0 100644
--- a/src/libudev/libudev.c
+++ b/src/libudev/libudev.c
@@ -150,5 +150,8 @@ _public_ int udev_get_log_priority(struct udev *udev) {
  *
  **/
 _public_ void udev_set_log_priority(struct udev *udev, int priority) {
+        log_set_target(LOG_TARGET_AUTO);
+        log_parse_environment();
+        log_set_open_when_needed(true);
         log_set_max_level(priority);
 }

Update from Martin: I tried the systemd patch, doesn't seem to work for me. But gdb/adding log_warning() etc. is fine.

This is when the device gets sealed:

#0  device_verify (device=0x5555557d9210) at ../../../../Projects/jhbuild/systemd/src/libsystemd/sd-device/device-private.c:410
#1  0x00007fffe8efd056 in device_new_from_nulstr (ret=ret@entry=0x7fffffff8068, nulstr=0x7fffffff81b8 "ACTION", len=len@entry=57) at ../../../../Projects/jhbuild/systemd/src/libsystemd/sd-device/device-private.c:504
#2  0x00007fffe8efb017 in device_monitor_receive_device (m=0x55555579b560, ret=ret@entry=0x7fffffffa1e0) at ../../../../Projects/jhbuild/systemd/src/libsystemd/sd-device/device-monitor.c:510
#3  0x00007fffe8ef7658 in udev_monitor_receive_sd_device (ret=<optimized out>, udev_monitor=<optimized out>) at ../../../../Projects/jhbuild/systemd/src/libudev/libudev-monitor.c:202
#4  udev_monitor_receive_device (udev_monitor=0x5555557efe30) at ../../../../Projects/jhbuild/systemd/src/libudev/libudev-monitor.c:247
#5  0x00007fffe8ee3aa3 in monitor_event (source=<optimized out>, condition=<optimized out>, data=0x5555557e9e70) at ../../../../Projects/jhbuild/libgudev/gudev/gudevclient.c:84
#6  0x00007fffe9d7c130 in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
#7  0x00007fffe9dd1208 in g_main_context_iterate.constprop () at /lib64/libglib-2.0.so.0
#8  0x00007fffe9d7b853 in g_main_loop_run () at /lib64/libglib-2.0.so.0

which means it's sealed as soon as it's parsed the contents of the uevent sent over the socket, rather than after it's read the uevent file once.

Update from Martin: I tried the systemd patch, doesn't seem to work for me. But gdb/adding log_warning() etc. is fine.

You need both to get the logging working.

Update: For the gudev test, yes; I'm debugging the libudev test from the C code, gudev isn't involved there.

Update: For the gudev test, yes; I'm debugging the libudev test from the C code, gudev isn't involved there.

OK, so you'll need to call udev_set_log_priority() from somewhere too. Also, would be best to avoid replying inside my comments, it just looks weird...

Heh, sorry, I wanted to avoid generating more github notifications, these were just meant as notes for myself. (Also: I'm stuck in meetings all afternoon, will dive into this in earnest tomorrow morning)

Note for tomorrow: To fix this we need to include all the properties in uvent_sender_send(). We need to pass that from the caller in umockdev.vala, as that has the knowledge about where to find and read the file.

With this PoC hack the test passes:

diff --git src/uevent_sender.c src/uevent_sender.c
index 4c98bda..12384d5 100644
--- src/uevent_sender.c
+++ src/uevent_sender.c
@@ -255,6 +255,7 @@ uevent_sender_send(uevent_sender * sender, const char *devpath, const char *acti
         count += append_property(props, sizeof (props), count, "DEVNAME=", devname);
     if (devtype)
         count += append_property(props, sizeof (props), count, "DEVTYPE=", devtype);
+    count += append_property(props, sizeof (props), count, "ID_INPUT=", "1");
 
     /* add versioned header */
     memset(&nlh, 0x00, sizeof(struct udev_monitor_netlink_header));

That looks like my ugly diff:

diff --git a/src/uevent_sender.c b/src/uevent_sender.c
index 7bbb814..eca806d 100644
--- a/src/uevent_sender.c
+++ b/src/uevent_sender.c
@@ -254,6 +254,8 @@ uevent_sender_send(uevent_sender * sender, const char *devpath, const char *acti
         count += append_property(props, sizeof (props), count, "DEVNAME=", devname);
     if (devtype)
         count += append_property(props, sizeof (props), count, "DEVTYPE=", devtype);
+    //count += append_property(props, sizeof (props), count, "UPOWER_BATTERY_TYPE=", "ups");
+    //count += append_property(props, sizeof (props), count, "UPOWER_FAKE_DEVICE=", "1");
 
     /* add versioned header */
     memset(&nlh, 0x00, sizeof(struct udev_monitor_netlink_header));

But I wonder if it's there's a better way. The uevent file is supposed to only contain kernel stuff, but I guess this is a mock, so making libudev happy is pretty much the only thing we're interested in.

The uevent file is supposed to only contain kernel stuff

Right, and current umockdev is exactly the wrong way around -- the uevent files only contain the userland udevd properties, and the kernel stuff like DEVNAME= or DEVTYPE= is added on the fly in uevent_sender_send(). What was I thinking back then.. Let's clean this up.

But I wonder if it's there's a better way.

umockdev could store the userland properties in a /.udev.properties file, and then concatenate both when sending netlink responses. It's a trade-off whether tested programs are more likely to get confused by userland properties in /uevent or that extra /.udev.properties file (which is not a sysfs attribute). The properties could also be stored in some parallel file hierarchy, or not written to disk at all and just stored in the preload's brain. I'll think about that after fixing the above issue, which is much more urgent.

Hi, the following may be redundant info, but

  • If udevd sends information about a device through netlink, then the message contains all properties. See device_monitor_send_device() in device-monitor.c, and device_get_properties_nulstr() in device-private.c.
  • A receiver of the netlink socket parses the properties and create a sd_device object. Then the generated sd_device object has the sealed flag set. See device_new_from_nulstr() in device-private.c.

I am not familiar with umockdev, but if it emulates of eavesdrops the netlink socket, then it should send all expected properties.

The properties could also be stored in some parallel file hierarchy, or not written to disk at all and just stored in the preload's brain. I'll think about that after fixing the above issue, which is much more urgent.

That's what the /run/udev/data/ folder is used for. I would make a note somewhere (in the README?) that the uevent files contains all the properties when it should only contain the kernel ones, and insist that umockdev mocks udev, not sysfs.

I first had to do some yak shaving with some false-positive gcc -fanalyzer errors in the latest Fedora version, and then another gcc bug around diagnostic pragmas. With that out of the way, I sent #165 with a quick fix for that, which should unblock you.

After that I'll do the code cleanups as discussed above, they are not that urgent any more. I didn't make up my mind about mocking /run/udev/data yet -- if libudev looks at that directly, umockdev kind of has to, but if it's just udevd's internal persistent database and libudev only looks at netlink uevents (current code inspection looks that way), then things are fine -- the properties can remain in umockdev's memory, and /uevent file can look like the real kernel's.

Thanks for your help!

Released 0.17.4 with this, will shepherd to Fedora.