APE-Project / APE_Server

Ajax Push Engine: lightweight HTTP streaming server. Written entirely in C, it provides the best performance, making it the fastest Comet server to date. APE now supports server-side JavaScript modules through SpiderMonkey.

Home Page: www.ape-project.org

Segfault in ape_disconnect() roughly once a day on production server

Hi,

I have a chat-like app using APE. About once a day, the APE server crashes.
I use Debian 6 and APE 1.1.1 installed from the .deb file.

Sometimes I can see a "glibc crash detected [...] double free or corruption" in the logs. Sometimes not.

Yesterday I tried the following:

  • "git checkout" the current head of the master branch
  • "make" it with -g (instead of -O2)
  • copy the main config file and set "daemon = no"
  • run it in a screen session with "gdb ./aped", then:
    set args --cfg /etc/ape/ape_debug.conf
    handle SIGPIPE nostop
    run

It starts fine, and the server works as in daemon mode.
After around 8 hours, Nagios reported that port 6969 had stopped responding.

4 new lines appeared in the screen session:
Error: Cannot write to socket 230; Connection timed out

Program received signal SIGSEGV, Segmentation fault.
0x00000000004186ae in ape_disconnect (co=0x86bf40, g_ape=0x62aa90) at src/servers.c:59
59 if (co->fd == sub->client->fd) {
(gdb)

I am currently trying to run the server under the debugger with a slight change in the code: before the problematic line, I have added an "if (sub->client != NULL)" check.
I have also added an else branch with an ape_log() and a printf() call...
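
The guard described above can be sketched roughly as follows. This is only an illustration, not the actual code of src/servers.c: the two structs are minimal stand-ins that assume subusers form a singly linked list with a client pointer, and count_matching_subusers is a hypothetical helper name. Note that such a check only catches a NULL pointer; a dangling or overwritten non-NULL pointer would still pass it.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical minimal stand-ins for APE's types (assumption: only the
   fields needed for this illustration, not the real definitions). */
typedef struct ape_socket {
    int fd;
} ape_socket;

typedef struct subuser {
    ape_socket *client;
    struct subuser *next;
} subuser;

/* Walk the list and count subusers whose client socket matches fd.
   Entries with a NULL client are logged and skipped instead of being
   dereferenced. */
int count_matching_subusers(const subuser *head, int fd)
{
    int matches = 0;
    const subuser *sub;

    for (sub = head; sub != NULL; sub = sub->next) {
        if (sub->client == NULL) {
            fprintf(stderr, "subuser %p has no client socket\n",
                    (const void *)sub);
            continue;
        }
        if (fd == sub->client->fd) {
            matches++;
        }
    }
    return matches;
}
```

With a list in which one entry has a NULL client, the function logs that entry and keeps going instead of crashing.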

I will post an update in a few days.

Regards,
Ludovic

OK. Two more crashes last night. It seems to occur when a client loses its internet access while the server is replying to it.
I took 2 core dumps with GDB and printed out some values. Console dump of the first crash last night:

[JS] Loading script /var/ape/framework/mootools.js...
[JS] Loading script /var/ape/framework/Http.js...
[JS] Loading script /var/ape/framework/userslist.js...
[JS] Loading script /var/ape/utils/utils.js...
[JS] Loading script /var/ape/commands/proxy.js...
[JS] Loading script /var/ape/examples/move.js...
[JS] Loading script /var/ape/utils/checkTool.js...
[JS] Loading script /var/ape/module_mtarget/chat_module.js...


Program received signal SIGPIPE, Broken pipe.
Error: Cannot write to socket 68; Broken pipe

Program received signal SIGSEGV, Segmentation fault.
0x00000000004186bf in ape_disconnect (co=0x11ab850, g_ape=0x62aa90) at src/servers.c:63
63                              if (co->fd == sub->client->fd) {

Here is what gdb prints:

(gdb) print sub
$1 = (subuser *) 0x11fd130
(gdb) print sub->client
$2 = (ape_socket *) 0x35003800330033
(gdb) print sub->client->fd
Cannot access memory at address 0x350038003300e3

(gdb) bt full
#0  0x00000000004186bf in ape_disconnect (co=0x11ab850, g_ape=0x62aa90) at src/servers.c:63
        sub = 0x11fd130
#1  0x00000000004086ae in sockroutine (g_ape=0x62aa90) at src/sock.c:445
        readb = 0
        bitev = 3
        active_fd = 247
        timeout_to_hang = 49
        sl = {co = 0x7fffffffe110, tfd = 0x7fffffffdcc8}
        new_fd = -1
        nfds = 21
        sin_size = 16
        i = 10
        tfd = 336
        t_start = {tv_sec = 1369170540, tv_usec = 929675}
        t_end = {tv_sec = 1369170540, tv_usec = 929675}
        ticks = 0
        uticks = 3964
        lticks = 659
        their_addr = {sin_family = 2, sin_port = 49090, sin_addr = {s_addr = 3776933214}, sin_zero = "\000\000\000\000\000\000\000"}
#2  0x0000000000406f53 in main (argc=3, argv=0x7fffffffe118) at src/entry.c:306
        srv = 0x62a990
        random = 6
        im_r00t = 1
        pidfd = 0
        serverfd = 7
        getrandom = 2470454
        pidfile = 0x0
        confs_path = 0x62a010 "/etc/ape/"
        fdev = {basemem = 0x62abe4, add = 0x417ec4 <event_epoll_add>, remove = 0x417f50 <event_epoll_remove>, poll = 0x417f62 <event_epoll_poll>, get_current_fd = 0x417fb1 <event_epoll_get_fd>,
          growup = 0x417fe0 <event_epoll_growup>, revent = 0x418028 <event_epoll_revent>, reload = 0x41809a <event_epoll_reload>, events = 0x14df1f0, epoll_fd = 6, handler = EVENT_EPOLL}
        cfgfile = "/etc/ape/ape_debug.conf", '\000' <repeats 489 times>
        g_ape = 0x62aa90
        i = 0

The second crash shows the same backtrace but a 0xffffffff address for sub->client (uninitialized memory?). There was no SIGPIPE this time.

[JS] Loading script /var/ape/module_mtarget/chat_module.js...

Program received signal SIGSEGV, Segmentation fault.
0x00000000004186bf in ape_disconnect (co=0x70a6b0, g_ape=0x62aa90) at src/servers.c:63
63                              if (co->fd == sub->client->fd) {
(gdb) generate-core-file
Saved corefile core.17516
(gdb) print sub
$3 = (subuser *) 0xa47050
(gdb) print sub->client
$4 = (ape_socket *) 0xffffffff
(gdb) backtrace
#0  0x00000000004186bf in ape_disconnect (co=0x70a6b0, g_ape=0x62aa90) at src/servers.c:63
#1  0x00000000004086ae in sockroutine (g_ape=0x62aa90) at src/sock.c:445
#2  0x0000000000406f53 in main (argc=3, argv=0x7fffffffe118) at src/entry.c:306
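
A side observation on the two sub->client values above, offered as a hypothesis rather than a diagnosis: neither looks like a valid heap address. Assuming a little-endian x86-64 host (the addresses in the traces suggest this, but the thread does not state it), 0x35003800330033 is stored as the bytes 33 00 33 00 38 00 35 00, which read as the characters "3385" at two bytes per character. That would be consistent with string data having overwritten the structure, for example after the subuser was freed and its memory reused, though nothing here confirms it. The helper below, whose name is made up for this illustration, performs that decoding:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Interpret a 64-bit value as four 16-bit units, lowest unit first
   (the in-memory order on a little-endian machine), and copy them to
   out if every unit is printable ASCII.
   Returns 1 on success, 0 otherwise. out must hold at least 5 bytes. */
int pointer_as_utf16_ascii(uint64_t value, char out[5])
{
    int i;

    assert(out != NULL);
    for (i = 0; i < 4; i++) {
        uint16_t unit = (uint16_t)(value >> (16 * i));
        if (unit < 0x20 || unit > 0x7e) {
            out[0] = '\0';
            return 0;
        }
        out[i] = (char)unit;
    }
    out[4] = '\0';
    return 1;
}
```

Called with 0x35003800330033, it fills the buffer with "3385"; called with 0xffffffff, it returns 0 because the units are not printable characters.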

Could it be your chat module? I once had a module that was causing a segmentation fault, but I can't remember exactly which statement caused it.

This issue is over a year old. Closing. Open a new one if the issue is still there.

Ok. I don't use APE anymore, so I can't say whether the issue is still there. Thanks, folks. Ludovic