monome / serialosc

multi-device, bonjour-capable monome OSC server

Home Page: http://monome.org/docs/serialosc/osc

serialosc bails out under heavy cpu load on linux

artfwo opened this issue

This issue seems to reproduce reliably when the system is under heavy load.

  1. run something to impose load on the cpu, e.g. stress -c 4
  2. run serialosc, which segfaults.

backtrace:

#0  0x0000000000404b90 in handle_device_msg (self=0x7fffffffd8c0,
    dev=0x62b3f0, msg=0x0) at ../src/serialoscd/uv.c:576
#1  0x0000000000403c45 in dispatch_ipc_msgs (stream=0x62b580, nbytes=22,
    buf=0x7fffffffa580, cb=0x404b78 <handle_device_msg>,
    self=0x7fffffffd8c0, dev=0x62b3f0) at ../src/serialoscd/uv.c:172
#2  0x0000000000404cb6 in device_read_cb (stream=0x62b580, nbytes=22,
    buf=0x7fffffffa580) at ../src/serialoscd/uv.c:610
#3  0x0000000000411640 in uv__read (stream=0x62b580)
    at ../third-party/libuv/src/unix/stream.c:1143
#4  0x0000000000411919 in uv__stream_io (
    loop=0x6285a0 <default_loop_struct>, w=0x62b608, events=1)
    at ../third-party/libuv/src/unix/stream.c:1206
#5  0x0000000000416ec3 in uv__io_poll (loop=0x6285a0 <default_loop_struct>,
    timeout=-1) at ../third-party/libuv/src/unix/linux-core.c:319
#6  0x00000000004088fb in uv_run (loop=0x6285a0 <default_loop_struct>,
    mode=UV_RUN_DEFAULT) at ../third-party/libuv/src/unix/core.c:324
#7  0x000000000040517e in main (argc=1, argv=0x7fffffffde28)
    at ../src/serialoscd/uv.c:762

Every now and then I hit the exact same issue. I'm not 100% sure, but it seems to happen when I plug in my grid, and only once in a while. It might indeed be load related; I haven't validated that yet.

Program received signal SIGSEGV, Segmentation fault.
handle_device_msg (self=0x7fffffffd420, dev=0x619c30, msg=0x0) at ../src/serialoscd/uv.c:576
576	../src/serialoscd/uv.c: No such file or directory.
(gdb) thread apply all bt

Thread 1 (Thread 0x7ffff7fb4700 (LWP 4885)):
#0  handle_device_msg (self=0x7fffffffd420, dev=0x619c30, msg=0x0) at ../src/serialoscd/uv.c:576
#1  0x0000000000401f6c in dispatch_ipc_msgs (stream=<optimized out>, nbytes=22, buf=0x7fffffffa120, 
    cb=cb@entry=0x401dc2 <handle_device_msg>, self=0x7fffffffd420, dev=0x619c30)
    at ../src/serialoscd/uv.c:172
#2  0x0000000000401fe4 in device_read_cb (stream=<optimized out>, nbytes=<optimized out>, 
    buf=<optimized out>) at ../src/serialoscd/uv.c:610
#3  0x00007ffff7bc7c6a in uv__read (stream=stream@entry=0x619dc0)
    at /var/tmp/portage/dev-libs/libuv-1.10.2/work/libuv-1.10.2/src/unix/stream.c:1197
#4  0x00007ffff7bc88a6 in uv__stream_io (loop=<optimized out>, w=0x619e48, events=1)
    at /var/tmp/portage/dev-libs/libuv-1.10.2/work/libuv-1.10.2/src/unix/stream.c:1263
#5  0x00007ffff7bccec9 in uv__io_poll (loop=loop@entry=0x7ffff7dd7c40 <default_loop_struct>, timeout=-1)
    at /var/tmp/portage/dev-libs/libuv-1.10.2/work/libuv-1.10.2/src/unix/linux-core.c:382
#6  0x00007ffff7bbffb0 in uv_run (loop=0x7ffff7dd7c40 <default_loop_struct>, 
    mode=mode@entry=UV_RUN_DEFAULT)
    at /var/tmp/portage/dev-libs/libuv-1.10.2/work/libuv-1.10.2/src/unix/core.c:352
#7  0x000000000040258f in main (argc=<optimized out>, argv=<optimized out>)
    at ../src/serialoscd/uv.c:762

@artfwo Wanted to add that after this segfault serialosc will keep segfaulting. The only way I've found to get it to work again is to restart. Does the same happen in your case?

@simonvanderveldt Hey, sorry for the late reply. This doesn't reproduce at all on my current system. Will look into it further when I get home.

I hadn't run into the segfault since my last comment, but it just happened again.
I started serialoscd, plugged in the grid, and it segfaulted. I was compiling in the background, so CPU load was roughly 90-95%. Still not sure if that's relevant.

Starting serialoscd again would result in the same segfault as long as I kept the grid plugged in.
Unplugging the grid allowed me to start serialoscd again without a segfault. Also, plugging in the grid after that didn't cause a segfault and it's now working correctly again.

@simonvanderveldt apparently, I'm getting the same kind of crash on a Raspberry Pi 3. Had to plug the grid in and out like a dozen times to catch this one :)

#0  dispatch_ipc_msgs (stream=0x7effef70, buf=<optimized out>, dev=0x0, self=0x7effee90, cb=<optimized out>, nbytes=14) at ../src/serialoscd/uv.c:172
#1  detector_read_cb (stream=0x7effef70, nbytes=<optimized out>, buf=0x7effabe4) at ../src/serialoscd/uv.c:681
#2  0x00407f12 in uv__read (stream=stream@entry=0x7effef70) at ../third-party/libuv/src/unix/stream.c:1143
#3  0x004084ca in uv__stream_io (loop=<optimized out>, w=0x7effefb4, events=1) at ../third-party/libuv/src/unix/stream.c:1206
#4  0x0040b18e in uv__io_poll (loop=loop@entry=0x422020 <default_loop_struct>, timeout=-1) at ../third-party/libuv/src/unix/linux-core.c:319
#5  0x0040515a in uv_run (loop=0x422020 <default_loop_struct>, mode=UV_RUN_DEFAULT) at ../third-party/libuv/src/unix/core.c:324
#6  0x00402a9a in main (argc=<optimized out>, argv=<optimized out>) at ../src/serialoscd/uv.c:778

@wrl okay, here are a few problematic messages (they happen to come in a sequence of 4):

 [-] bad message, bailing out
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |  ................ 
00 00 00 00 5C 50                                 |  ....\P 
 [-] bad message, bailing out
0C 00 00 00 00 00 00 00                           |  ........ 
 [-] bad message, bailing out
2F 64 65 76 2F 74 74 79  55 53 42 30              |  /dev/ttyUSB0 
 [-] bad message, bailing out
5C 50                                             |  \P 

another sequence I've managed to catch just now:

 [-] bad message, bailing out
01 00 00 00 A0 72 CB 36  B8 55 00 00 00 00 00 00  |  .....r.6.U...... 
00 00 00 00 5C 50 08 00  00 00 00 00 00 00 6D 30  |  ....\P........m0 
30 30 31 37 35 34 00 50  0A 00 00 00 00 00 00 00  |  001754.P........ 
6D 6F 6E 6F 6D 65 20 31  32 38                    |  monome 128 
 [-] bad message, bailing out
5C 50 04 00 00 00 07 2C  00 00 00 00 00 00 00 00  |  \P.....,........ 
00 00 00 00 00 00 5C 50  02 00 00 00 00 00 00 00  |  ......\P........ 
00 00 00 00 00 00 00 00  00 00 00 00 5C 50        |  ............\P 

for the record, here's the "good" message sequence when a device is plugged in:

SOSC_DEVICE_CONNECTION

00 00 00 00 A0 A2 4A CA  B2 55 00 00 00 00 00 00  |  ......J..U...... 
00 00 00 00 5C 50 0C 00  00 00 00 00 00 00 2F 64  |  ....\P......../d 
65 76 2F 74 74 79 55 53  42 30 00 50              |  ev/ttyUSB0.P 

SOSC_DEVICE_INFO

01 00 00 00 A0 A2 4A CA  B2 55 00 00 50 81 4A CA  |  ......J..U..P.J. 
B2 55 00 00 5C 50 08 00  00 00 00 00 00 00 6D 30  |  .U..\P........m0 
30 30 31 37 35 34 00 50  0A 00 00 00 00 00 00 00  |  001754.P........ 
6D 6F 6E 6F 6D 65 20 31  32 38 00 50 04 00 00 00  |  monome 128.P.... 
07 2C 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |  .,.............. 
5C 50 02 00 00 00 00 00  00 00 00 00 00 00 00 00  |  \P.............. 
00 00 00 00 00 00 5C 50                           |  ......\P 

SOSC_OSC_PORT_CHANGE

04 00 00 00 07 2C 00 00  00 00 00 00 00 00 00 00  |  .....,.......... 
00 00 00 00 5C 50 02 00  00 00 00 00 00 00 00 00  |  ....\P.......... 
00 00 00 00 00 00 00 00  00 00 5C 50              |  ..........\P 

SOSC_DEVICE_READY

02 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |  ................ 
00 00 00 00 5C 50                                 |  ....\P 
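To make the dumps above easier to read, here is a minimal sketch that peeks at the leading bytes of a buffer. It assumes, based only on these dumps, that each message starts with a 4-byte little-endian type tag (00 for SOSC_DEVICE_CONNECTION, 01 for SOSC_DEVICE_INFO, 02 for SOSC_DEVICE_READY, 04 for SOSC_OSC_PORT_CHANGE). The function names are hypothetical and are not taken from the serialosc source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read the first 4 bytes as a little-endian tag.
 * Returns -1 when the buffer is missing or too short to hold a tag.
 * ASSUMPTION: the tag layout is inferred from the hex dumps only. */
static int64_t peek_msg_type(const unsigned char *buf, size_t len)
{
    if (buf == NULL || len < 4)
        return -1;

    uint32_t tag = (uint32_t)buf[0]
                 | ((uint32_t)buf[1] << 8)
                 | ((uint32_t)buf[2] << 16)
                 | ((uint32_t)buf[3] << 24);
    return (int64_t)tag;
}

/* Hypothetical helper: map the tags seen in this thread to their labels. */
static const char *msg_type_name(int64_t tag)
{
    switch (tag) {
    case 0:  return "SOSC_DEVICE_CONNECTION";
    case 1:  return "SOSC_DEVICE_INFO";
    case 2:  return "SOSC_DEVICE_READY";
    case 4:  return "SOSC_OSC_PORT_CHANGE";
    default: return "unknown";
    }
}
```

Under that assumption, the 8-byte fragment from the first bad sequence (`0C 00 00 00 00 00 00 00`) would read as tag 12, which is none of the labels above. That would be consistent with a read that started in the middle of a message rather than at its beginning.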

finally, a mixed case when we get one good message and a few bad ones:

SOSC_DEVICE_CONNECTION

00 00 00 00 A0 C2 80 F7  28 56 00 00 00 00 00 00  |  ........(V...... 
00 00 00 00 5C 50 0C 00  00 00 00 00 00 00 2F 64  |  ....\P......../d 
65 76 2F 74 74 79 55 53  42 30 00 50              |  ev/ttyUSB0.P 

bad message, 40 bytes:

 [-] bad message, bailing out
01 00 00 00 A0 C2 80 F7  28 56 00 00 00 00 00 00  |  ........(V...... 
00 00 00 00 5C 50 08 00  00 00 00 00 00 00 6D 30  |  ....\P........m0 
30 30 31 37 35 34 00 50                           |  001754.P 

bad message, 64 bytes:

 [-] bad message, bailing out
0A 00 00 00 00 00 00 00  6D 6F 6E 6F 6D 65 20 31  |  ........monome 1 
32 38 5C 50 04 00 00 00  07 2C 00 00 00 00 00 00  |  28\P.....,...... 
00 00 00 00 00 00 00 00  5C 50 02 00 00 00 00 00  |  ........\P...... 
00 00 00 00 00 00 00 00  00 00 00 00 00 00 5C 50  |  ..............\P 
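One observation on the dumps: the 40-byte and 64-byte reads add up to 104 bytes, the same length as the good SOSC_DEVICE_INFO dump, and the four fragments in the first bad sequence (22 + 8 + 12 + 2) add up to the 44 bytes of the good SOSC_DEVICE_CONNECTION dump. That looks like one message arriving split across several reads. Below is a minimal sketch of one common way to handle that on a stream: buffer the chunks until the expected number of bytes is available. It is not how serialosc or the fix in #35 works; `struct reasm`, `reasm_feed`, `reasm_consume`, and the 256-byte capacity are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define REASM_CAP 256 /* ASSUMPTION: arbitrary capacity for this sketch */

/* Accumulates bytes from successive stream reads. */
struct reasm {
    unsigned char buf[REASM_CAP];
    size_t have; /* number of valid bytes currently in buf */
};

/* Append one chunk. Returns 1 once at least `need` bytes are buffered,
 * 0 if more data is still required, -1 if the chunk would not fit. */
static int reasm_feed(struct reasm *r, const void *chunk, size_t n, size_t need)
{
    assert(r != NULL && (chunk != NULL || n == 0));

    if (n > REASM_CAP - r->have)
        return -1;
    if (n > 0)
        memcpy(r->buf + r->have, chunk, n);
    r->have += n;
    return r->have >= need;
}

/* Drop the first n bytes after a complete message has been handled,
 * keeping any trailing bytes that belong to the next message. */
static void reasm_consume(struct reasm *r, size_t n)
{
    assert(r != NULL);

    if (n > r->have)
        n = r->have;
    memmove(r->buf, r->buf + n, r->have - n);
    r->have -= n;
}
```

With this, feeding the 40-byte read with `need` set to 104 returns 0, and feeding the 64-byte read afterwards returns 1, at which point the whole message could be parsed from `buf` and removed with `reasm_consume`. In a real parser `need` would come from the message framing, which this sketch leaves to the caller.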

apparently fixed with #35