squidfunk / generic-linked-in-driver

A generic non-blocking linked-in driver for interfacing Erlang and C

c_src/gen_driver.c segfaults at line 160

ToJans opened this issue

Using the current commit of our fdb_driver, ready() segfaults on the error in the last test, on cluster_create_database.

void
ready(ErlDrvData drv_data, ErlDrvThreadData thread_data) {
  gd_t *drv     = (gd_t *)drv_data;
  gd_ptr_t *ptr = (gd_ptr_t *)thread_data;

  /* Check, if we reached the end of the request buffer */
  ei_decode_list_header(ptr->req->buf, &ptr->req->index, NULL);
  if (!error_occurred(ptr->res) && ptr->req->len != ptr->req->index)
    // THIS IS LINE 160 & IT SEGFAULTS
    error(ptr->res, GD_ERR_DEC);

  /* Check for error on synchronous request, output data */
  if (ptr->req->syn) {
    if (error_occurred(ptr->res) && (ptr->res->index = 1))
      encode_error(ptr->res->buf, &ptr->res->index, ptr->res->error);
    else if (ptr->res->index == 1)
      encode_ok(ptr->res->buf, &ptr->res->index);
    driver_output(drv->port, ptr->res->buf, ptr->res->index);
  }

  /* Free request and result */
  driver_free(ptr->req->buf); /* control */
  driver_free(ptr->res->buf); /* control */
  driver_free(ptr->req); /* control */
  driver_free(ptr->res); /* control */
  driver_free(ptr); /* control */
}

I assume this has to do with the size of the preallocated buffer; I'll try changing the size.

Tried making the buffer larger, no help yet.

Yes, you have to make sure that the result buffer is large enough. By default only 64 bytes are allocated, see:
https://github.com/squidfunk/generic-linked-in-driver/blob/master/c_src/gen_driver.c#L219
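As a rough illustration of "large enough", here is a minimal sketch of growing a result buffer before writing to it. The res_buf_t type, its size field and res_buf_reserve() are hypothetical and not part of gen_driver; plain realloc() stands in for the allocator, where a real driver would more likely use driver_realloc().

```c
#include <stdlib.h>

/* Hypothetical result buffer; buf and index mirror the ptr->res usage
   above, while the size field is an assumption made for this sketch. */
typedef struct {
  char *buf;  /* encoded response */
  int index;  /* bytes written so far */
  int size;   /* bytes currently allocated */
} res_buf_t;

/* Make sure `extra` more bytes fit after res->index.
   Returns 0 on success, -1 if the allocation fails. */
static int
res_buf_reserve(res_buf_t *res, int extra) {
  int needed = res->index + extra;
  if (needed <= res->size)
    return 0;

  /* Start at 64 bytes (the default mentioned above) and double. */
  int size = res->size > 0 ? res->size : 64;
  while (size < needed)
    size *= 2;

  /* In a linked-in driver this would probably be driver_realloc(). */
  char *buf = realloc(res->buf, (size_t)size);
  if (buf == NULL)
    return -1;

  res->buf = buf;
  res->size = size;
  return 0;
}
```

Calling such a helper before each encode step keeps the write position inside the allocation, whatever the initial size was.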

Please post a stack trace from gdb.

tojans@ubuntu:/mnt/hgfs/develop/erlang/fdb-erlang$ clear && ./rebar compile eunit -v


GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2.1) 7.4-2012.04
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.launchpad.net/gdb-linaro/>...
Reading symbols from /usr/local/lib/erlang/erts-5.10.3/bin/erlexec...done.
(gdb) set solib-search-path priv
(gdb) continue 
The program is not being run.
(gdb) start 
Temporary breakpoint 1 at 0x401400: file ./erlexec.c, line 394.
Starting program: /usr/local/lib/erlang/erts-5.10.3/bin/erlexec +B -boot start_clean -noshell -pa rebar/rebar/ebin -run escript start -extra ./rebar compile eunit -v

Temporary breakpoint 1, main (argc=15, argv=0x7fffffffe118) at ./erlexec.c:394
394 {
(gdb) continue 
Continuing.
process 30079 is executing new program: /usr/local/lib/erlang/erts-5.10.3/bin/beam
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff7fe1700 (LWP 30082)]
[New Thread 0x7ffff7ebf700 (LWP 30083)]
[New Thread 0x7ffff7e9d700 (LWP 30084)]
[New Thread 0x7ffff7e7b700 (LWP 30085)]
[New Thread 0x7ffff7e59700 (LWP 30086)]
[New Thread 0x7ffff7e37700 (LWP 30087)]
[New Thread 0x7ffff61bf700 (LWP 30088)]
[New Thread 0x7ffff619d700 (LWP 30089)]
[New Thread 0x7ffff617b700 (LWP 30090)]
[New Thread 0x7ffff6159700 (LWP 30091)]
==> gen_driver (compile)
==> fdb-erlang (compile)
==> gen_driver (eunit)
======================== EUnit ========================
module 'gen_driver_test'
module 'gen_driver'
module 'basic_test'
  basic_test: setup_and_teardown_test...warning: .dynamic section for "/mnt/hgfs/develop/erlang/fdb-erlang/priv/test.so" is not at the expected address (wrong library or version mismatch?)
[0.011 s] ok
  basic_test: basic_test...[0.504 s] ok
  [done in 0.521 s]
=======================================================
  All 2 tests passed.
==> fdb-erlang (eunit)
======================== EUnit ========================
module 'fdb_test'
  fdb_test: api_version_test...[0.006 s] ok
  fdb_test: setup_network_test...[0.001 s] ok
  fdb_test: run_network_test...[New Thread 0x7fffd7fff700 (LWP 30096)]
[0.005 s] ok
  fdb_test: cluster_test...[New Thread 0x7fffd77fe700 (LWP 30097)]
/usr/local/lib/erlang/erts-5.10.3/bin/beam: 
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff6f61d2a in strchrnul () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) backtrace 
#0  0x00007ffff6f61d2a in strchrnul () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff6f17a60 in vfprintf () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007ffff6f1d1a4 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007ffff6f17bde in vfprintf () from /lib/x86_64-linux-gnu/libc.so.6
#4  0x00007ffff6fc0ce5 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#5  0x00007ffff6fc0e43 in error () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x00007ffff4df56d7 in ready (drv_data=0x7ffff66c24b0, thread_data=0x7ffff66c2a08) at deps/gen_driver/c_src/gen_driver.c:160
#7  0x000000000047c078 in async_ready (p=0x7ffff66c25f0, data=<optimized out>) at beam/io.c:4805
#8  0x00000000004d318a in call_async_ready (a=0x7ffff66c2298) at beam/erl_async.c:399
#9  erts_check_async_ready (varq=0x7ffff6531440) at beam/erl_async.c:549
#10 0x0000000000485968 in handle_async_ready (aux_work=64, awdp=<optimized out>, waiting=<optimized out>) at beam/erl_process.c:1247
#11 handle_aux_work (awdp=0x7ffff64c02c0, orig_aux_work=<optimized out>, waiting=<optimized out>) at beam/erl_process.c:1743
#12 0x0000000000489174 in scheduler_wait (rq=0x7ffff64c0080, esdp=0x7ffff64c0280, fcalls=<synthetic pointer>) at beam/erl_process.c:2435
#13 schedule (p=<optimized out>, calls=<optimized out>) at beam/erl_process.c:7017
#14 0x0000000000507adb in process_main () at beam/beam_emu.c:1198
#15 0x000000000044685f in erl_start (argc=24, argv=<optimized out>) at beam/erl_init.c:1783
#16 0x000000000042ab39 in main (argc=<optimized out>, argv=<optimized out>) at sys/unix/erl_main.c:29

If I detect an error in my input, I do not parse the rest; I assume that might have something to do with it?

No, you do not have to handle the whole request; you can abort at any time. But I think the problem is a name collision on the function "error": the stack trace shows that error() in libc is being called instead of ours. We had better prefix the internal functions to avoid such name collisions.

#5  0x00007ffff6fc0e43 in error () from /lib/x86_64-linux-gnu/libc.so.6
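A minimal sketch of what prefixing could look like, not the actual change made in the repository. The gd_res_t layout, the placeholder value of GD_ERR_DEC and the gd_-prefixed names are assumptions for illustration. glibc exports its own error(int, int, const char *, ...), so an unprefixed global error in a shared object can end up bound to the libc symbol at load time. A distinct prefix avoids the clash, and static helps further when the helper is only used within one file.

```c
#include <stddef.h>
#include <string.h>

/* Placeholder value; the real GD_ERR_DEC is defined by gen_driver. */
#define GD_ERR_DEC "decode_error"

/* Hypothetical stand-in for the driver's result struct. */
typedef struct {
  const char *error;  /* NULL while no error has been recorded */
} gd_res_t;

/* Prefixed name: cannot be confused with libc's error().
   static gives it internal linkage, so the dynamic linker never
   resolves it against symbols from other libraries. */
static void
gd_error_set(gd_res_t *res, const char *error) {
  res->error = error;
}

/* Prefixed counterpart of the error_occurred() check used in ready(). */
static int
gd_error_occurred(const gd_res_t *res) {
  return res->error != NULL;
}
```

If the helpers have to be shared between several source files, static is not an option, and the prefix alone has to keep the names unique.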

I just renamed error/2 to error_set/2. Can you verify that this was the cause?

No more segfault, but a decode error now; thanks!