_nss_mdns_gethostbyname4_r crash with glibc 2.25
civodul opened this issue
nss-mdns 0.11 crashes reliably on GuixSD (GNU/Linux with glibc 2.25) when doing getaddrinfo
lookups (0.10 was fine):
$ sudo ltrace ping ribbon.local
[...]
getaddrinfo("ribbon.local", nil, 0x7ffc1c86ee80, 0x7ffc1c86ee78) = -2
error(1, 0, 0x40c2ac, 0ping: unknown host
<no return ...>
+++ exited (status 1) +++
$ sudo ltrace getent hosts ribbon.local
mtrace() = <void>
setlocale(LC_ALL, "") = "LC_CTYPE=en_US.utf8;LC_NUMERIC=e"...
textdomain("libc") = "libc"
argp_parse(0x605440, 3, 0x7ffdfe8f7c08, 0) = 0
strcmp("hosts", "hosts") = 0
inet_pton(10, 0x7ffdfe8f85b3, 0x7ffdfe8f7aa0, 8) = 0
inet_pton(2, 0x7ffdfe8f85b3, 0x7ffdfe8f7aa0, 0xff42000000000000) = 0
gethostbyname2(0x7ffdfe8f85b3, 10, 0, 66) = 0x7f8524197200
inet_ntop(10, 0x21c5088, 0x7ffdfe8f7a40, 46) = 0x7ffdfe8f7a40
printf("%-15s %s", "fe80::bcb:7426:", "ribbon.local") = 37
__overflow(0x7f8524194600, 10, 0x7f8524195760, 0x7fffffdafe80::bcb:7426:7adb:6aa7 ribbon.local
) = 10
+++ exited (status 0) +++
The first execution triggers an nscd crash. Backtrace:
Core was generated by `/gnu/store/3h31zsqxjjg52da5gp3qmhkh4x8klhah-glibc-2.25/sbin/nscd -f /gnu/store/'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 strlen () at ../sysdeps/x86_64/strlen.S:106
106 ../sysdeps/x86_64/strlen.S: No such file or directory.
[Current thread is 1 (Thread 0x7fee65a4b700 (LWP 32659))]
(gdb) bt
#0 strlen () at ../sysdeps/x86_64/strlen.S:106
#1 0x000055a0e3263883 in addhstaiX (db=db@entry=0x55a0e3472340 <dbs+704>,
fd=fd@entry=13, req=req@entry=0x7fee65a4a8c0, key=key@entry=0x7fee65a4ab10,
uid=uid@entry=4294967295, he=he@entry=0x0, dh=0x0) at aicache.c:174
#2 0x000055a0e326432e in addhstai (db=db@entry=0x55a0e3472340 <dbs+704>,
fd=fd@entry=13, req=req@entry=0x7fee65a4a8c0, key=key@entry=0x7fee65a4ab10,
uid=uid@entry=4294967295) at aicache.c:571
#3 0x000055a0e325857a in handle_request (uid=4294967295, pid=<optimized out>,
key=0x7fee65a4ab10, req=0x7fee65a4a8c0, fd=13) at connections.c:1275
#4 nscd_run_worker (p=<optimized out>) at connections.c:1762
#5 0x00007fee6b66e454 in start_thread (arg=0x7fee65a4b700) at pthread_create.c:456
#6 0x00007fee6b1987cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
(gdb) bt full
#0 strlen () at ../sysdeps/x86_64/strlen.S:106
No locals.
#1 0x000055a0e3263883 in addhstaiX (db=db@entry=0x55a0e3472340 <dbs+704>, fd=fd@entry=13, req=req@entry=0x7fee65a4a8c0, key=key@entry=0x7fee65a4ab10, uid=uid@entry=4294967295,
he=he@entry=0x0, dh=0x0) at aicache.c:174
atmem = {next = 0x55a0e3472800 <readylist_lock>, name = 0x0, family = 1801920929, addr = {32750, 0, 2, 1801929696}, scopeid = 32750}
at = 0x7fee65a4a7e0
addrs = <optimized out>
family = <optimized out>
status = {-1, -1}
naddrs = 2
canon = 0x0
canonlen = <optimized out>
cp = <optimized out>
addrslen = 0
fct4 = <optimized out>
dataset = 0x0
hosts_database = 0x55a0e42025d0
nip = 0x55a0e4202610
no_more = 0
rc6 = 0
rc4 = 0
herrno = 0
old_res_options = 705
tmpbuf6len = 1024
tmpbuf6 = 0x7fee65a4a2e0 "pluto.local"
tmpbuf4len = <optimized out>
tmpbuf4 = <optimized out>
ttl = 2147483647
total = 0
key_copy = 0x0
alloca_used = false
timeout = 9223372036854775807
__PRETTY_FUNCTION__ = "addhstaiX"
#2 0x000055a0e326432e in addhstai (db=db@entry=0x55a0e3472340 <dbs+704>, fd=fd@entry=13, req=req@entry=0x7fee65a4a8c0, key=key@entry=0x7fee65a4ab10, uid=uid@entry=4294967295)
at aicache.c:571
No locals.
#3 0x000055a0e325857a in handle_request (uid=4294967295, pid=<optimized out>, key=0x7fee65a4ab10, req=0x7fee65a4a8c0, fd=13) at connections.c:1275
db = 0x55a0e3472340 <dbs+704>
#4 nscd_run_worker (p=<optimized out>) at connections.c:1762
keybuf = "pluto.local", '\000' <repeats 1013 times>
fd = 13
pid = <optimized out>
it = <optimized out>
req = {version = 2, type = GETAI, key_len = 12}
uid = 4294967295
buf = '\000' <repeats 255 times>
#5 0x00007fee6b66e454 in start_thread (arg=0x7fee65a4b700) at pthread_create.c:456
__res = <optimized out>
pd = 0x7fee65a4b700
now = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140661884237568, -461186331514265124, 140724270282382, 140724270282383, 0, 140661884237568, 451840114903196124,
451872565911724508}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = <optimized out>
pagesize_m1 = <optimized out>
sp = <optimized out>
freesize = <optimized out>
__PRETTY_FUNCTION__ = "start_thread"
#6 0x00007fee6b1987cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.
More details at https://bugs.gnu.org/30396 .
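To reproduce the lookup without ping, a small helper that makes the same kind of getaddrinfo call seen in the ltrace output may be useful. This is only a sketch: the helper name lookup_status is hypothetical, and the zeroed hints with AF_UNSPEC are an assumption, since the trace does not show what ping actually passes.

```c
#include <assert.h>
#include <netdb.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: run one getaddrinfo lookup with no service name
   and return its status code (0 on success, an EAI_* value on failure).
   The hints are an assumption; ping's real hints are not in the trace. */
static int lookup_status(const char *host)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int rc;

    assert(host != NULL);

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;   /* accept both IPv4 and IPv6 answers */

    rc = getaddrinfo(host, NULL, &hints, &res);
    if (rc == 0)
        freeaddrinfo(res);         /* only the status matters here */
    return rc;
}
```

Calling lookup_status("ribbon.local") on an affected system would be expected to return a non-zero code matching the -2 in the trace above; gai_strerror() turns that code into a readable message.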
gethostbyname4_r wasn't implemented before release 0.11. It looks like there is a bug or we might need a workaround for nscd.
My guess is that it is probably this:
https://lists.freedesktop.org/archives/systemd-devel/2013-February/008606.html
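The backtrace seems consistent with that guess: in frame #1, atmem holds what looks like uninitialized data (name = 0x0, family = 1801920929), which is what one would expect if nscd reads the tuple storage it passed in while the module only redirected the pointer to its own buffer. A minimal sketch of a result hand-off that covers both kinds of caller follows. It is an illustration under that assumption, not the actual nss-mdns fix; struct addrtuple is a local mirror of glibc's struct gaih_addrtuple from <nss.h>, and publish_tuples is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Local mirror of glibc's struct gaih_addrtuple (declared in <nss.h>),
   repeated here so the sketch compiles on its own. */
struct addrtuple {
    struct addrtuple *next;
    char *name;
    int family;
    uint32_t addr[4];
    uint32_t scopeid;
};

/* Hypothetical helper: hand back a result list that the module built
   in its own buffer.
   - If the caller pre-seeded *pat with storage (as nscd appears to do),
     copy the first tuple into that storage. The struct copy keeps the
     next pointer, so the remaining results stay reachable.
   - Otherwise, return the list head through *pat. */
static void publish_tuples(struct addrtuple **pat, struct addrtuple *head)
{
    assert(pat != NULL && head != NULL);

    if (*pat != NULL)
        **pat = *head;   /* copies name, family, addr, scopeid and next */
    else
        *pat = head;
}
```

With this shape, a caller that later reads its own storage finds a valid name instead of NULL, and a caller that follows the returned pointer still sees the whole list.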
I think I made a mistake in the fix. The segfault is gone, but now some clients won't see more than one result. I need to investigate.
The fix for the missing results is in 31ccbec. You can use 0.12 and cherry-pick the fix, or wait for 0.13 (no ETA).
Works like a charm, thank you!