100% CPU usage when using default timeouts (0 sec and 0 usec)
Apteryks opened this issue
Hi,
Is it expected that 100% of the CPU is used when no timeout value is set for make-udev-monitor, as in this modification to the example:
modified examples/device-listener.scm
@@ -13,7 +13,7 @@
(define (main args)
(let* ((udev (make-udev))
(udev-monitor (make-udev-monitor udev
- #:timeout-sec 1
+ #:timeout-sec 0
#:timeout-usec 0
#:callback callback
#:filter (list "usb"
Reading man 2 select, the defaults (0) are explained like this:
If both fields of the timeval structure are zero, then select() returns immediately. (This is useful for polling.)
That's probably the reason. My expectation for the default would rather be NULL valued timeouts (no timeout):
If timeout is specified as NULL, select() blocks indefinitely waiting for a file descriptor to become ready.
I'll see if I can implement this as accepting #f in scheme and producing NULL for the C call.
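To make the two quoted select(2) behaviours concrete, here is a minimal C sketch. It is not Guile-Udev code: wait_readable is a hypothetical helper name, and the descriptor is assumed to be any readable fd such as the one returned by udev_monitor_get_fd.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: wait until fd is readable.
   timeout == NULL     -> select() blocks until fd is ready.
   timeout == {0, 0}   -> select() returns immediately (polling).
   Returns the select() result: number of ready descriptors, 0 on
   timeout, -1 on error. */
static int wait_readable(int fd, struct timeval *timeout)
{
    fd_set readfds;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    return select(fd + 1, &readfds, NULL, NULL, timeout);
}
```

Called on an empty pipe with a zeroed struct timeval, wait_readable returns 0 right away, so a scanner loop built around it never sleeps. With NULL it returns only once the descriptor is readable.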
I've been trying to use scm_select instead of plain select (C), which would make it easy to share the same interface (e.g. secs / usecs and correct argument validation/behavior).
But this is not easy from C, especially trying to preserve the existing error handling scheme. I think I may be able to do so using nested functions in GNU C (https://gcc.gnu.org/onlinedocs/gcc/Nested-Functions.html), but I was wondering, perhaps the future would be using the FFI of Guile to access libudev instead of C SMOBS?
Hello!
First of all thank you for helping me with Guile-Udev! I'm really curious now what project you're using it for. :-)
I was wondering, perhaps the future would be using the FFI of Guile to access libudev instead of C SMOBS?
Sorry, I'm not willing to migrate Guile-Udev to FFI yet, as it would require rewriting most of the code (which basically works already) and I don't have enough motivation to do that right now. But let's see what we can do about this bug.
So if a user wants zero timeouts, would it be proper to just pass NULL to select? I think we can implement it pretty easily with just one check.
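A rough sketch of what that single check might look like in C. The name timeout_or_null and its signature are made up for illustration and do not come from the Guile-Udev sources.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical helper: map a (secs, usecs) pair to the pointer that
   would be handed to select().  When both values are zero it returns
   NULL, so select() blocks instead of returning immediately;
   otherwise it fills in *storage and returns it. */
static struct timeval *timeout_or_null(long secs, long usecs,
                                       struct timeval *storage)
{
    assert(storage != NULL);
    if (secs == 0 && usecs == 0)
        return NULL;
    storage->tv_sec = secs;
    storage->tv_usec = usecs;
    return storage;
}
```

One trade-off of this variant: treating 0/0 as "no timeout" removes the ability to ask for a true non-blocking poll, whereas accepting #f for "no timeout" on the Scheme side would keep both behaviours available.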
I've been trying to use scm_select instead of plain select (C), which would make it easy to share the same interface (e.g. secs / usecs and correct argument validation/behavior).
Could you please push the changes you made to a new branch in your Guile-Udev GitHub clone, even if it is not working as you expect? Maybe I'll be able to see what is going on in the code and help with the patch.
-- avp
Hi Artyom! I understand about a rewrite to use Guile FFI, it seems like it'd be some work.
Here's what I'm currently debugging: https://github.com/Apteryks/guile-udev/tree/udev-monitor-improvements. The commit replacing SCM_NEWSMOB with scm_new_smob could be dropped if it causes problems (it at least causes compilation warnings that I'm not sure how to address).
Debugging currently looks like this:
$ gdb --args sh ./pre-inst-env ./examples/device-listener.scm
warning: Currently logging to gdb.txt. Turn the logging off and on to make the new setting effective.
Reading symbols from sh...
(No debugging symbols found in sh)
(gdb) b udev_monitor_scanner
Function "udev_monitor_scanner" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (udev_monitor_scanner) pending.
(gdb) r
Starting program: /gnu/store/6xybfny8349lhp04z5ih6h1a854w51ls-profile/bin/sh ./pre-inst-env ./examples/device-listener.scm
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/gnu/store/gsjczqir1wbz8p770zndrpw4rnppmxi3-glibc-2.35/lib/libthread_db.so.1".
[Detaching after fork from child process 16595]
[Detaching after fork from child process 16596]
process 16592 is executing new program: /gnu/store/yr39rh6wihd1wv6gzf7w4w687dwzf3vb-coreutils-9.1/bin/env
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/gnu/store/gsjczqir1wbz8p770zndrpw4rnppmxi3-glibc-2.35/lib/libthread_db.so.1".
process 16592 is executing new program: /gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9/bin/guile
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/gnu/store/gsjczqir1wbz8p770zndrpw4rnppmxi3-glibc-2.35/lib/libthread_db.so.1".
[New Thread 0x7ffff7017640 (LWP 16597)]
[New Thread 0x7ffff6816640 (LWP 16598)]
[New Thread 0x7ffff6015640 (LWP 16599)]
[New Thread 0x7ffff5735640 (LWP 16600)]
[Switching to Thread 0x7ffff5735640 (LWP 16600)]
Thread 5 "guile" hit Breakpoint 1, udev_monitor_scanner (arg=0x7ffff7550fc0) at udev-monitor-func.c:188
warning: Source file is more recent than executable.
188 SCM udev_monitor = (SCM) arg;
(gdb) n
189 gudev_monitor_t* umd = gudev_monitor_from_scm(udev_monitor);
(gdb) n
194 SCM error_callback = umd->error_callback;
(gdb) n
196 scm_init_guile();
(gdb)
198 result = udev_monitor_enable_receiving(umd->udev_monitor);
(gdb)
199 if (result < 0) {
(gdb)
207 monitor_fd = scm_from_int(udev_monitor_get_fd(umd->udev_monitor));
(gdb)
208 if (scm_less_p(monitor_fd, scm_from_int(0))) {
(gdb)
209 char msg[] = "Could not udev monitor file descriptor.";
(gdb)
210 scm_call_2(error_callback, udev_monitor,
(gdb)
guile: uncaught exception:
Wrong number of arguments to [New Thread 0x7ffff4e85640 (LWP 16606)]
#<procedure callback (device)>
[Thread 0x7ffff4e85640 (LWP 16606) exited]
[Thread 0x7ffff5735640 (LWP 16600) exited]
[Thread 0x7ffff6015640 (LWP 16599) exited]
[Thread 0x7ffff7017640 (LWP 16597) exited]
[Thread 0x7ffff791e380 (LWP 16592) exited]
[Thread 0x7ffff6816640 (LWP 16598) exited]
[New process 16592]
[Inferior 1 (process 16592) exited with code 01]
(gdb) bt
No stack.
OK, that's at least one existing bug: the example passes callback where the error callback is expected:
d07666f 2021-03-27 64 (udev-monitor-set-error-callback! monitor callback)
edit: fixed on branch
On the just pushed branch, it segfaults when attempting to run scm_internal_catch:
Thread 5 "guile" hit Breakpoint 1, udev_monitor_scanner (arg=0x7ffff7550fc0) at udev-monitor-func.c:188
188 SCM udev_monitor = (SCM) arg;
(gdb) n
189 gudev_monitor_t* umd = gudev_monitor_from_scm(udev_monitor);
(gdb) n
195 SCM error_callback = umd->error_callback;
(gdb)
197 scm_init_guile();
(gdb)
199 result = udev_monitor_enable_receiving(umd->udev_monitor);
(gdb)
200 if (result < 0) {
(gdb)
208 c_monitor_fd = udev_monitor_get_fd(umd->udev_monitor);
(gdb)
209 monitor_fd = scm_from_int(c_monitor_fd);
(gdb)
220 select_args.reads = scm_list_1(monitor_fd);
(gdb)
221 select_args.secs = umd->secs;
(gdb)
222 select_args.usecs = umd->usecs;
(gdb)
224 SCM callback = umd->scanner_callback;
(gdb)
229 pthread_mutex_lock(&umd->lock);
(gdb)
230 if (! umd->is_scanning) {
(gdb)
233 pthread_mutex_unlock(&umd->lock);
(gdb)
238 select_result = scm_internal_catch(scm_from_utf8_symbol("system-error"),
(gdb)
Thread 5 "guile" received signal SIGSEGV, Segmentation fault.
0x00007ffff7f50ae8 in ?? () from /gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9/lib/libguile-3.0.so.1
(gdb) ,bt
Undefined command: "". Try "help".
(gdb) bt
#0 0x00007ffff7f50ae8 in ?? ()
from /gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9/lib/libguile-3.0.so.1
#1 0x00007ffff7f50d9c in scm_call_n ()
from /gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9/lib/libguile-3.0.so.1
#2 0x00007ffff7ebb6a9 in scm_call_5 ()
from /gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9/lib/libguile-3.0.so.1
#3 0x00007ffff7f62092 in ?? ()
from /gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9/lib/libguile-3.0.so.1
#4 0x00007ffff7f3de1f in scm_c_catch ()
from /gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9/lib/libguile-3.0.so.1
#5 0x00007ffff7f3de3e in scm_internal_catch ()
from /gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9/lib/libguile-3.0.so.1
#6 0x00007ffff5786241 in udev_monitor_scanner (arg=0x7ffff7550fc0) at udev-monitor-func.c:238
#7 0x00007ffff79a43aa in start_thread ()
from /gnu/store/gsjczqir1wbz8p770zndrpw4rnppmxi3-glibc-2.35/lib/libc.so.6
#8 0x00007ffff7a24f7c in clone3 ()
from /gnu/store/gsjczqir1wbz8p770zndrpw4rnppmxi3-glibc-2.35/lib/libc.so.6
(gdb)
This compilation warning is probably relevant, but I'm not sure how to fix it:
udev-monitor-func.c: In function 'call_select':
udev-monitor-func.c:176:24: warning: passing argument 1 of 'scm_call_5' from incompatible pointer type [-Wincompatible-pointer-types]
176 | return scm_call_5(scm_select, args->reads, SCM_EOL, SCM_EOL,
| ^~~~~~~~~~
| |
| struct scm_unused_struct * (*)(struct scm_unused_struct *, struct scm_unused_struct *, struct scm_unused_struct *, struct scm_unused_struct *, struct scm_unused_struct *)
In file included from /gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9/include/guile/3.0/libguile.h:47,
from udev-monitor-func.c:22:
/gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9/include/guile/3.0/libguile/eval.h:63:29: note: expected 'SCM' {aka 'struct scm_unused_struct *'} but argument is of type 'struct scm_unused_struct * (*)(struct scm_unused_struct *, struct scm_unused_struct *, struct scm_unused_struct *, struct scm_unused_struct *, struct scm_unused_struct *)'
63 | SCM_API SCM scm_call_5 (SCM proc, SCM arg1, SCM arg2, SCM arg3, SCM arg4,
| ~~~~^~~~
The first argument of scm_call_5 must be a Scheme procedure, but scm_select is a C procedure:
C Function: scm_select (reads, writes, excepts, secs, usecs)
Thanks, dropping the scm_call_5 and calling scm_select directly in call_select fixed it. :-) Tested and working. I will send an MR soon.
By the way, the use case is this: https://gitlab.com/Apteryks/x-resize/, which allows fixing this: https://issues.guix.gnu.org/57068.