NLnetLabs / nsd

The NLnet Labs Name Server Daemon (NSD) is an authoritative, RFC compliant DNS nameserver.

Home Page: https://nlnetlabs.nl/nsd

NSD crashes when started and loading a catz zone from disk

pettai opened this issue · comments

commented

NSD crashes when starting up and loading a catz zone file from disk (previously transferred and written to disk via nsd-control write).

...
Aug 16 16:08:49 bygg-u2204 nsd[19710]: listen on ip-address ::1@53 (udp) with server(s): *
Aug 16 16:08:49 bygg-u2204 nsd[19710]: listen on ip-address ::1@53 (tcp) with server(s): *
Aug 16 16:08:49 bygg-u2204 nsd[19710]: listen on ip-address 127.0.0.1@53 (udp) with server(s): *
Aug 16 16:08:49 bygg-u2204 nsd[19710]: listen on ip-address 127.0.0.1@53 (tcp) with server(s): *
Aug 16 16:08:49 bygg-u2204 nsd[19710]: file rotation on (null) enabled
Aug 16 16:08:49 bygg-u2204 nsd[19710]: dropped user privileges, run as nsd
Aug 16 16:08:49 bygg-u2204 nsd[19710]: xfrd pre-startup
Aug 16 16:08:49 bygg-u2204 nsd[19710]: xfrd: adding catz-test.catalog zone
Aug 16 16:08:49 bygg-u2204 nsd[19710]: xfrd zone catz-test.catalog is activated, state 1
Aug 16 16:08:49 bygg-u2204 nsd[19710]: xfrd: started server 1 secondary zones
Aug 16 16:08:49 bygg-u2204 nsd[19710]: task procsync /tmp/nsd-xfr-19710/nsd.19710.task.1 size 288
Aug 16 16:08:49 bygg-u2204 nsd[19711]: delete zone visit catz-test.catalog.
Aug 16 16:08:49 bygg-u2204 nsd[19711]: axfrdel: recyclebin holds 0 bytes
Aug 16 16:08:49 bygg-u2204 nsd[19711]: memory: 26 objects (26 small/0 large), 1536 bytes allocated (43 wasted) in 1 chunks, 0 cleanups, 0 in recyclebin 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Aug 16 16:08:49 bygg-u2204 nsd[19711]: zone catz-test.catalog read with success
Aug 16 16:08:49 bygg-u2204 nsd[19711]: Catz version TXT
Aug 16 16:08:49 bygg-u2204 nsd[19711]: Catz version is 2
Aug 16 16:08:49 bygg-u2204 nsd[19711]: 1a3e5aca65819ae2.zones
Aug 16 16:08:49 bygg-u2204 nsd[19711]: Task created for catalog catz-test.catalog.: catztest1.se.
Aug 16 16:08:49 bygg-u2204 nsd[19711]: add task addcatzone catztest1.se. catz-test.catalog. catz-test.catalog. 1a3e5aca65819ae2.zones.catz-test.catalog.
Aug 16 16:08:49 bygg-u2204 kernel: [84389.527356] nsd: main[19711]: segfault at 20 ip 00005575b86842b5 sp 00007fff89844330 error 4 in nsd[5575b865a000+7f000]
Aug 16 16:08:49 bygg-u2204 kernel: [84389.527381] Code: 00 e8 0f 74 fd ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 f3 0f 1e fa 41 57 41 56 41 55 41 54 55 48 89 f5 53 48 83 ec 28 <48> 8b 5e 20 48 8d 71 11 48 89 7c 24 10 4c 8b 0b 89 54 24 1c 48 89
Aug 16 16:08:49 bygg-u2204 nsd[19710]: did not get start signal from main

(Full info in the pull request #261 (comment) and below)

commented

Hello @Koenvh1

I built the latest git version with symbols, so here's a stack trace to pin down the issue that causes NSD to segfault:

(gdb) run
Starting program: /usr/sbin/nsd -d
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[Attaching after Thread 0x7ffff74e6c00 (LWP 47788) fork to child process 47789]
[New inferior 2 (process 47789)]
[Detaching after fork from parent process 47788]
[Inferior 1 (process 47788) detached]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Thread 2.1 "nsd: main" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff74e6c00 (LWP 47789)]
0x00005555555852b5 in udb_ptr_alloc_space (ptr=0x7fffffffdb40, udb=0x0, type=udb_chunk_type_task, sz=122) at ./udb.c:2077
2077	./udb.c: No such file or directory.
(gdb) bt
#0  0x00005555555852b5 in udb_ptr_alloc_space (ptr=0x7fffffffdb40, udb=0x0, type=udb_chunk_type_task, sz=122) at ./udb.c:2077
#1  0x00005555555a38c1 in task_create_new_elem (udb=udb@entry=0x0, last=last@entry=0x0, e=e@entry=0x7fffffffdb40, sz=sz@entry=122, zname=zname@entry=0x0)
    at ./difffile.c:1525
#2  0x00005555555a438c in task_new_add_catzone (udb=0x0, last=0x0, zone=0x55555576c6d0 "catztest2.se.", pattern=0x55555576b700 "catz-test.catalog.",
    from_catalog=0x55555576c6e0 "catz-test.catalog.", member_id=0x555555612c60 <buf> "e752a6d54bfb58ac.zones.catz-test.catalog.", zonestatid=0) at ./difffile.c:1761
#3  0x00005555555c610a in catz_add_zone (nsd=<optimized out>, last_task=0x0, udb=0x0, pname=<optimized out>, catalog_zone=0x7ffff55f43a8, member_id=<optimized out>,
    member_zone_name=0x7ffff55f4b28) at ./cat-zones-nsd.c:71
#4  nsd_catalog_consumer_process.constprop.0 (zone=zone@entry=0x7ffff55f43a8, udb=udb@entry=0x0, last_task=last_task@entry=0x0, nsd=<optimized out>) at ./cat-zones-nsd.c:251
#5  0x00005555555c7c11 in namedb_read_zonefile.constprop.0 (zone=<optimized out>, taskudb=0x0, last_task=0x0, nsd=<optimized out>) at ./dbaccess.c:624
#6  0x00005555555c993a in namedb_check_zonefiles.constprop.0.isra.0 (taskudb=0x0, last_task=0x0, nsd=<optimized out>, opt=<optimized out>, opt=<optimized out>)
    at ./dbaccess.c:713
#7  0x00005555555bdd9f in server_prepare.constprop.0 (nsd=<optimized out>) at ./server.c:1449
#8  0x000055555555f491 in main (argc=<optimized out>, argv=<optimized out>) at ./nsd.c:1745
(gdb)

(for reference, this is built & running on a vanilla Ubuntu 22.04 cloud image)
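
A side note tying the kernel log line to this backtrace: the kernel reported "segfault at 20 ... error 4" (a user-mode read of an unmapped address), and the faulting instruction bytes "<48> 8b 5e 20" decode to a 64-bit load from [rsi+0x20]. In the x86-64 System V calling convention rsi carries the second argument, which here is udb, and gdb shows udb=0x0. So the crash looks like a plain field read through a NULL struct pointer. The sketch below only illustrates that arithmetic: sketch_udb_base and its field layout are assumptions made up for the example, not copied from udb.h, and the 0x20 result holds only on a typical LP64 platform.

```c
#include <stddef.h>

/* Hypothetical stand-in for the udb base struct; the real layout may differ. */
struct sketch_udb_base {
    char *fname;        /* offset 0x00 on LP64 */
    int fd;             /* offset 0x08, followed by 4 bytes of padding */
    void *base;         /* offset 0x10 */
    size_t base_size;   /* offset 0x18 */
    void *alloc;        /* offset 0x20 */
};

/*
 * Address the CPU would touch when reading ->alloc through a NULL pointer:
 * NULL (0) plus the offset of the field inside the struct.
 */
size_t sketch_null_deref_address(void)
{
    return offsetof(struct sketch_udb_base, alloc);
}
```

If the real struct happens to have a pointer-sized field at that offset, a read of it through udb=0x0 would fault at exactly the address the kernel logged.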

HTH,
/P

commented

Here's the current catz-zone on disk, for reference:

; zone catz-test.catalog written by NSD 4.6.2 on Wed Sep  6 12:11:17 2023
; received update to serial 1686924749 at 2023-09-06T11:11:17 from 192.71.100.227 TSIG verified with key xxxx
$ORIGIN catalog.
catz-test	0	IN	SOA	invalid. invalid. (
		1686924749 3600 600 2147483646 0 )
	0	IN	NS	invalid.
$ORIGIN catz-test.catalog.
version	0	IN	TXT	"2"
$ORIGIN zones.catz-test.catalog.
d000250b594a7c3f	0	IN	PTR	catztest1.se.
e752a6d54bfb58ac	0	IN	PTR	catztest2.se.
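
For anyone matching this file against the backtrace: the member_id that gdb prints ("e752a6d54bfb58ac.zones.catz-test.catalog.") is simply the PTR owner label joined with the active $ORIGIN. A minimal sketch of that join follows. sketch_absolute_owner is a hypothetical helper written for this illustration, not NSD's zone parser.

```c
#include <stdio.h>

/*
 * Join a relative owner label with the active $ORIGIN to form the absolute
 * owner name. Returns 0 on success, -1 if the result does not fit in buf.
 * Hypothetical helper for illustration only; not an NSD function.
 */
int sketch_absolute_owner(const char *label, const char *origin,
                          char *buf, size_t buflen)
{
    int n = snprintf(buf, buflen, "%s.%s", label, origin);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

Calling it with the label "e752a6d54bfb58ac" and the origin "zones.catz-test.catalog." gives the same string as the member_id argument in frame #2.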

I can't reproduce this error in #294, so it must have been fixed there 👍

Too bad... I was able to reproduce once again.

root@bygg-u2204:/etc/nsd/nsd.conf.d# systemctl start nsd
Job for nsd.service failed because the control process exited with error code.
See "systemctl status nsd.service" and "journalctl -xeu nsd.service" for details.

root@bygg-u2204:/etc/nsd/nsd.conf.d# gdb --args /usr/sbin/nsd -d
GNU gdb (Ubuntu 12.1-0ubuntu1~22.04) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/nsd...
Reading symbols from /usr/lib/debug/.build-id/cb/73766dd6870ef97313611b767fa8f48416d0e3.debug...
(gdb) set follow-fork-mode child
(gdb) run
Starting program: /usr/sbin/nsd -d
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[Attaching after Thread 0x7ffff74e6c00 (LWP 13195) fork to child process 13198]
[New inferior 2 (process 13198)]
[Detaching after fork from parent process 13195]
[Inferior 1 (process 13195) detached]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Thread 2.1 "nsd: main" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff74e6c00 (LWP 13198)]
0x0000555555584a15 in udb_ptr_alloc_space (ptr=0x7fffffffdc30, udb=0x0, type=udb_chunk_type_task, sz=121) at ./udb.c:2077
2077	./udb.c: No such file or directory.
(gdb) bt
#0  0x0000555555584a15 in udb_ptr_alloc_space (ptr=0x7fffffffdc30, udb=0x0, type=udb_chunk_type_task, sz=121) at ./udb.c:2077
#1  0x00005555555a3fd1 in task_create_new_elem (udb=udb@entry=0x0, last=last@entry=0x0, e=e@entry=0x7fffffffdc30, sz=sz@entry=121, zname=zname@entry=0x0) at ./difffile.c:1525
#2  0x00005555555a47fc in task_new_add_catzone (udb=0x0, last=0x0, zone=0x55555576f760 "catztest2.se.", pattern=0x55555576e740 "catz-test.pattern",
    from_catalog=0x55555576f770 "catz-test.catalog.", member_id=0x555555614c60 <buf> "e752a6d54bfb58ac.zones.catz-test.catalog.", zonestatid=0) at ./difffile.c:1761
#3  0x00005555555c6c7e in catz_add_zone (nsd=<optimized out>, last_task=0x0, udb=0x0, pname=<optimized out>, catalog_zone=0x7ffff55f43a8, member_id=<optimized out>,
    member_zone_name=0x7ffff55f4b28) at ./cat-zones.c:63
#4  catalog_consumer_process.constprop.0 (zone=zone@entry=0x7ffff55f43a8, udb=udb@entry=0x0, last_task=last_task@entry=0x0, nsd=<optimized out>) at ./cat-zones.c:246
#5  0x00005555555c9470 in namedb_read_zonefile.constprop.0 (zone=<optimized out>, taskudb=0x0, last_task=0x0, nsd=<optimized out>) at ./dbaccess.c:623
#6  0x00005555555ba5ab in namedb_check_zonefiles (opt=<optimized out>, taskudb=0x0, last_task=0x0, nsd=<optimized out>) at ./dbaccess.c:713
#7  0x00005555555be7ab in server_prepare.constprop.0 (nsd=<optimized out>) at ./server.c:1449
#8  0x000055555555f52e in main (argc=<optimized out>, argv=<optimized out>) at ./nsd.c:1747
(gdb)

The issue appears when I add zonefile: /../zone.file to the zone: stanza of a catalog zone (one that contains catalog: yes).
If I save that catalog zone to disk and then restart nsd, it segfaults as shown above.

pattern:
        name: "catz-test.catalog."
        zonefile: "/var/lib/nsd/catzones/%s"
        allow-notify: x.x.x.x
        request-xfr:  x.x.x.x

zone:
        name: "catz-test.catalog"
        catalog: yes
        catalog-member-pattern: "catz-test.catalog."
        zonefile: "/var/lib/nsd/catzones/%s"        <<---- segfaults when nsd restarts and the zone has been written to disk earlier
        allow-notify: ::1 NOKEY
        allow-notify: 127.0.0.1 NOKEY
        provide-xfr: ::1 NOKEY
        provide-xfr: 127.0.0.1 NOKEY

(So with the new conf syntax for catalog zones, the crash is isolated to reading a catalog zone from disk.)
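
To make the failing pattern concrete: in both backtraces the startup path (server_prepare -> namedb_check_zonefiles -> namedb_read_zonefile) hands taskudb=0x0 and last_task=0x0 down to the catalog code, which then tries to create an "add member zone" task in a task database that does not exist. The sketch below shows the kind of guard that avoids the dereference. It is not the actual fix, and every name in it (sketch_taskdb, sketch_task_add_catzone) is hypothetical. Whether NSD should skip, defer, or handle the member zones some other way at startup is for the maintainers to decide.

```c
#include <stddef.h>

#define SKETCH_MAX_TASKS 8

/* Hypothetical task queue standing in for the task udb. */
struct sketch_taskdb {
    const char *zones[SKETCH_MAX_TASKS];
    size_t count;
};

/*
 * Queue an "add member zone" task. Returns 1 if queued, 0 if there is no
 * task database (the startup case seen in the backtrace) or the queue is full.
 */
int sketch_task_add_catzone(struct sketch_taskdb *db, const char *zone)
{
    if (db == NULL || zone == NULL)
        return 0;               /* refuse instead of dereferencing NULL */
    if (db->count >= SKETCH_MAX_TASKS)
        return 0;               /* no room left in this toy queue */
    db->zones[db->count++] = zone;
    return 1;
}
```

With a guard like this, a caller that passes a NULL task database gets a 0 return it can act on (for example by deferring the member zones) rather than a segfault.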