artyom-poptsov / guile-udev

GNU Guile bindings to libudev.

100% CPU usage when using default timeouts (0 sec and 0 usec)

Apteryks opened this issue · comments

Hi,

Is it expected that 100% of CPU be used when not setting a timeout value for make-udev-monitor, as in this modification to the example:

modified   examples/device-listener.scm
@@ -13,7 +13,7 @@
 (define (main args)
   (let* ((udev         (make-udev))
          (udev-monitor (make-udev-monitor udev
-                                          #:timeout-sec  1
+                                          #:timeout-sec  0
                                           #:timeout-usec 0
                                           #:callback     callback
                                           #:filter       (list "usb"

Reading man 2 select, the defaults (0) are explained like this:

If both fields of the timeval structure are zero, then select() returns immediately. (This is useful for polling.)

That's probably the reason. My expectation for the default would rather be NULL valued timeouts (no timeout):

If timeout is specified as NULL, select() blocks indefinitely waiting for a file descriptor to become ready.
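The difference between the two quoted behaviours can be seen with a plain pipe. This is only a minimal sketch, not code from Guile-Udev: the helper names `poll_once` and `wait_until_ready` are made up for illustration, and a pipe stands in for the udev monitor file descriptor.

```c
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: ask select() about `fd` with a zeroed timeval.
   select() returns immediately, so a loop around this call never sleeps. */
static int poll_once(int fd)
{
    fd_set reads;
    struct timeval tv = { 0, 0 };   /* both fields zero: pure polling */

    FD_ZERO(&reads);
    FD_SET(fd, &reads);
    return select(fd + 1, &reads, NULL, NULL, &tv);
}

/* Hypothetical helper: same call with a NULL timeout.
   select() blocks until `fd` becomes readable. */
static int wait_until_ready(int fd)
{
    fd_set reads;

    FD_ZERO(&reads);
    FD_SET(fd, &reads);
    return select(fd + 1, &reads, NULL, NULL, NULL);
}
```

On an empty pipe, `poll_once` returns 0 right away, which is what turns a scanning loop into a busy loop. `wait_until_ready` on the same empty pipe would sleep in the kernel until something is written.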

I'll see if I can implement this as accepting #f in scheme and producing NULL for the C call.

I've been trying to use scm_select instead of plain select (C), which would make it easy to share the same interface (e.g. secs / usecs and correct argument validation/behavior).

But this is not easy from C, especially trying to preserve the existing error handling scheme. I think I may be able to do so using nested functions in GNU C (https://gcc.gnu.org/onlinedocs/gcc/Nested-Functions.html), but I was wondering, perhaps the future would be using the FFI of Guile to access libudev instead of C SMOBS?

Hello!

First of all thank you for helping me with Guile-Udev! I'm really curious now what project you're using it for. :-)

I was wondering, perhaps the future would be using the FFI of Guile to access libudev instead of C SMOBS?

Sorry, I'm not willing to migrate Guile-Udev to FFI yet, as it would require rewriting most of the code (which basically works already) and I don't have enough motivation to do that right now. But let's see what we can do about this bug.

So if a user wants zero timeouts, would it be proper to just pass NULL to select? I think we can implement it pretty easily with just one check.
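The "one check" could look roughly like the sketch below. It is an assumption about the shape of the fix, not the code that ended up in Guile-Udev: `timeout_or_null` and its `has_timeout` flag are hypothetical, and whether "no timeout" is signalled by #f or by both fields being zero is the open question in this thread.

```c
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical helper: pick the timeout argument for select().
   has_timeout == 0 models "no timeout requested" (e.g. #f on the
   Scheme side) and yields NULL, so select() blocks indefinitely.
   Otherwise the caller-provided storage is filled in and returned. */
static struct timeval *timeout_or_null(int has_timeout, long secs, long usecs,
                                       struct timeval *storage)
{
    if (!has_timeout)
        return NULL;

    storage->tv_sec = secs;
    storage->tv_usec = usecs;
    return storage;
}
```

Since select() on Linux may modify the timeval it is given, a loop would call such a helper again before every select() call rather than reuse the structure.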

I've been trying to use scm_select instead of plain select (C), which would make it easy to share the same interface (e.g. secs / usecs and correct argument validation/behavior).

Could you please push the changes you made to a new branch in your Guile-Udev GitHub clone, even if it is not working as you expect? Maybe I'll be able to see what is going on in the code and help with the patch.

-- avp

Hi Artyom! I understand about a rewrite to use Guile FFI, it seems like it'd be some work.

Here's what I'm currently debugging: https://github.com/Apteryks/guile-udev/tree/udev-monitor-improvements. The commit replacing SM_NEWSMOB with scm_new_smob could be dropped if it causes problems (it at least causes compilation warnings that I'm not sure how to address).

Debugging currently looks like this:

$ gdb --args sh ./pre-inst-env ./examples/device-listener.scm

warning: Currently logging to gdb.txt.  Turn the logging off and on to make the new setting effective.
Reading symbols from sh...
(No debugging symbols found in sh)
(gdb) b udev_monitor_scanner
Function "udev_monitor_scanner" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (udev_monitor_scanner) pending.
(gdb) r
Starting program: /gnu/store/6xybfny8349lhp04z5ih6h1a854w51ls-profile/bin/sh ./pre-inst-env ./examples/device-listener.scm
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/gnu/store/gsjczqir1wbz8p770zndrpw4rnppmxi3-glibc-2.35/lib/libthread_db.so.1".
[Detaching after fork from child process 16595]
[Detaching after fork from child process 16596]
process 16592 is executing new program: /gnu/store/yr39rh6wihd1wv6gzf7w4w687dwzf3vb-coreutils-9.1/bin/env
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/gnu/store/gsjczqir1wbz8p770zndrpw4rnppmxi3-glibc-2.35/lib/libthread_db.so.1".
process 16592 is executing new program: /gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9/bin/guile
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/gnu/store/gsjczqir1wbz8p770zndrpw4rnppmxi3-glibc-2.35/lib/libthread_db.so.1".
[New Thread 0x7ffff7017640 (LWP 16597)]
[New Thread 0x7ffff6816640 (LWP 16598)]
[New Thread 0x7ffff6015640 (LWP 16599)]
[New Thread 0x7ffff5735640 (LWP 16600)]
[Switching to Thread 0x7ffff5735640 (LWP 16600)]

Thread 5 "guile" hit Breakpoint 1, udev_monitor_scanner (arg=0x7ffff7550fc0) at udev-monitor-func.c:188
warning: Source file is more recent than executable.
188         SCM udev_monitor = (SCM) arg;
(gdb) n
189         gudev_monitor_t* umd = gudev_monitor_from_scm(udev_monitor);
(gdb) n
194         SCM error_callback = umd->error_callback;
(gdb) n
196         scm_init_guile();
(gdb) 
198         result = udev_monitor_enable_receiving(umd->udev_monitor);
(gdb) 
199         if (result < 0) {
(gdb) 
207         monitor_fd = scm_from_int(udev_monitor_get_fd(umd->udev_monitor));
(gdb) 
208         if (scm_less_p(monitor_fd, scm_from_int(0))) {
(gdb) 
209              char msg[] = "Could not udev monitor file descriptor.";
(gdb) 
210              scm_call_2(error_callback, udev_monitor,
(gdb) 
guile: uncaught exception:
Wrong number of arguments to [New Thread 0x7ffff4e85640 (LWP 16606)]
#<procedure callback (device)>
[Thread 0x7ffff4e85640 (LWP 16606) exited]
[Thread 0x7ffff5735640 (LWP 16600) exited]
[Thread 0x7ffff6015640 (LWP 16599) exited]
[Thread 0x7ffff7017640 (LWP 16597) exited]
[Thread 0x7ffff791e380 (LWP 16592) exited]
[Thread 0x7ffff6816640 (LWP 16598) exited]
[New process 16592]
[Inferior 1 (process 16592) exited with code 01]
(gdb) bt
No stack.

OK, that's an existing bug, at least: the callback is being used in place of the error-callback:

d07666f 2021-03-27 64 (udev-monitor-set-error-callback! monitor callback)

edit: fixed on branch

On the just-pushed branch, it segfaults when attempting to run scm_internal_catch:


Thread 5 "guile" hit Breakpoint 1, udev_monitor_scanner (arg=0x7ffff7550fc0) at udev-monitor-func.c:188
188         SCM udev_monitor = (SCM) arg;
(gdb) n
189         gudev_monitor_t* umd = gudev_monitor_from_scm(udev_monitor);
(gdb) n
195         SCM error_callback = umd->error_callback;
(gdb) 
197         scm_init_guile();
(gdb) 
199         result = udev_monitor_enable_receiving(umd->udev_monitor);
(gdb) 
200         if (result < 0) {
(gdb) 
208         c_monitor_fd = udev_monitor_get_fd(umd->udev_monitor);
(gdb) 
209         monitor_fd = scm_from_int(c_monitor_fd);
(gdb) 
220         select_args.reads = scm_list_1(monitor_fd);
(gdb) 
221         select_args.secs = umd->secs;
(gdb) 
222         select_args.usecs = umd->usecs;
(gdb) 
224         SCM callback = umd->scanner_callback;
(gdb) 
229             pthread_mutex_lock(&umd->lock);
(gdb) 
230             if (! umd->is_scanning) {
(gdb) 
233             pthread_mutex_unlock(&umd->lock);
(gdb) 
238             select_result = scm_internal_catch(scm_from_utf8_symbol("system-error"),
(gdb) 

Thread 5 "guile" received signal SIGSEGV, Segmentation fault.
0x00007ffff7f50ae8 in ?? () from /gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9/lib/libguile-3.0.so.1
(gdb) ,bt
Undefined command: "".  Try "help".
(gdb) bt
#0  0x00007ffff7f50ae8 in ?? ()
   from /gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9/lib/libguile-3.0.so.1
#1  0x00007ffff7f50d9c in scm_call_n ()
   from /gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9/lib/libguile-3.0.so.1
#2  0x00007ffff7ebb6a9 in scm_call_5 ()
   from /gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9/lib/libguile-3.0.so.1
#3  0x00007ffff7f62092 in ?? ()
   from /gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9/lib/libguile-3.0.so.1
#4  0x00007ffff7f3de1f in scm_c_catch ()
   from /gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9/lib/libguile-3.0.so.1
#5  0x00007ffff7f3de3e in scm_internal_catch ()
   from /gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9/lib/libguile-3.0.so.1
#6  0x00007ffff5786241 in udev_monitor_scanner (arg=0x7ffff7550fc0) at udev-monitor-func.c:238
#7  0x00007ffff79a43aa in start_thread ()
   from /gnu/store/gsjczqir1wbz8p770zndrpw4rnppmxi3-glibc-2.35/lib/libc.so.6
#8  0x00007ffff7a24f7c in clone3 ()
   from /gnu/store/gsjczqir1wbz8p770zndrpw4rnppmxi3-glibc-2.35/lib/libc.so.6
(gdb) 

This compilation warning is probably relevant, but I'm not sure how to fix it:

udev-monitor-func.c: In function 'call_select':
udev-monitor-func.c:176:24: warning: passing argument 1 of 'scm_call_5' from incompatible pointer type [-Wincompatible-pointer-types]
  176 |      return scm_call_5(scm_select, args->reads, SCM_EOL, SCM_EOL,
      |                        ^~~~~~~~~~
      |                        |
      |                        struct scm_unused_struct * (*)(struct scm_unused_struct *, struct scm_unused_struct *, struct scm_unused_struct *, struct scm_unused_struct *, struct scm_unused_struct *)
In file included from /gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9/include/guile/3.0/libguile.h:47,
                 from udev-monitor-func.c:22:
/gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9/include/guile/3.0/libguile/eval.h:63:29: note: expected 'SCM' {aka 'struct scm_unused_struct *'} but argument is of type 'struct scm_unused_struct * (*)(struct scm_unused_struct *, struct scm_unused_struct *, struct scm_unused_struct *, struct scm_unused_struct *, struct scm_unused_struct *)'
   63 | SCM_API SCM scm_call_5 (SCM proc, SCM arg1, SCM arg2, SCM arg3, SCM arg4,
      |                         ~~~~^~~~

The first argument of scm_call_5 must be a Scheme procedure, but scm_select is a C procedure:

C Function: scm_select (reads, writes, excepts, secs, usecs)

https://www.gnu.org/software/guile/manual/html_node/Ports-and-File-Descriptors.html#index-scm_005fselect

Thanks, dropping the scm_call_5 and calling scm_select directly in call_select fixed it. :-) Tested working. Will send an MR soon.

By the way, the use case is this: https://gitlab.com/Apteryks/x-resize/, which allows fixing this: https://issues.guix.gnu.org/57068.