NLnetLabs / nsd

Hi,

we recently stumbled upon an issue with NSD v4.6.1 in combination with the nsd_exporter, where the latter appeared to be unable to get any metrics from NSD at certain times.

The nsd_exporter uses the stats_noreset command to fetch the data from NSD and then does $THINGS to represent them in prometheus-readable format (it uses go-nsdctl for communicating with NSD, which is basically a golang version of your very own nsd-control).

We were able to correlate the timing of this issue to the times where NSD is writing a zone file to disk (as specified in our nsd.conf with zonefiles-write), which takes quite a bit of time (think 60+ seconds, it's a rather large zone + slow disk).

During this time, NSD does accept incoming connections from the exporter on the control channel (it answers the TCP SYN with a SYN-ACK), but it does not send any data until the write operation is complete (the connection is kept open in the meantime).

Once the write is complete, all the outstanding commands in the still open connections are executed / answered (and the connections closed afterwards).

Since the write is taking so long, the client attempts multiple times to retrieve data from NSD while it is still writing the zone to disk, so at some point NSD refuses the new connections due to its built-in maximum number of control connections.

I understand that this is probably a weird edge-case (because honestly, who expects the write to take that long 😄 ), but I would have expected the write operation not to "block" the collection of zone stats.

We also ran nsd-control status, nsd-control zonesstatus and nsd-control stats_noreset while the zone file was being written - the two status commands work just fine, only the stats_noreset hangs.

Unfortunately my knowledge of C is basically nonexistent so I am not sure where to even start looking how this might be fixed / changed...

Any feedback would be greatly appreciated!

CC @bschoenbach @Max-02

Also, another question related to this issue: what is max_active about?

nsd/remote.c

Line 284 in 3ad1ec0

rc->max_active = 10;

Would it be helpful if we increase it in this case?

The max_active is the maximum number of remote connections for control operations. If that is raised, then the connections are no longer turned away. They would still wait for the long operation to complete before a result is returned. Because a result is still returned, the statistics monitoring system may no longer show an outage for that time. So perhaps change that to 100; does that ameliorate the situation for you?
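To make the effect of the cap concrete, here is a minimal sketch of the pile-up described above. It is not NSD code: the function name, the struct, and all parameters (busy time, scrape interval) are hypothetical, and it assumes the client never closes a waiting connection on its own.

```c
#include <assert.h>

/* Outcome of a simulated busy period on the control channel. */
struct scrape_outcome {
    int accepted; /* connections held open until the long task finishes */
    int refused;  /* connections turned away because the cap was reached */
};

/*
 * Simulate a monitoring client that opens one new control connection every
 * scrape_interval_s seconds while the server is busy for busy_s seconds.
 * Accepted connections stay open (unanswered) until the busy period ends,
 * so each of them counts against max_active. All values are hypothetical.
 */
struct scrape_outcome
simulate_busy_period(int busy_s, int scrape_interval_s, int max_active)
{
    struct scrape_outcome out = {0, 0};
    int t;

    assert(scrape_interval_s > 0);
    for (t = 0; t < busy_s; t += scrape_interval_s) {
        if (out.accepted < max_active)
            out.accepted++; /* accepted, but waits for the long task */
        else
            out.refused++;  /* cap reached: connection is turned away */
    }
    return out;
}
```

With a 60 second write and an assumed scrape every 5 seconds, simulate_busy_period(60, 5, 10) gives 10 accepted and 2 refused connections, while simulate_busy_period(60, 5, 100) gives 12 accepted and none refused. In both cases every accepted connection still waits for the write to finish.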

NSD performs a process refork to collect the statistics information. This is different from the status command, which can be answered directly by the process that the client is connected to. The write operation is also performed by a refork that carries out the requested changes. The long write operation makes the other interprocess tasks wait for its completion.

@wcawijngaards thanks for clarifying!

So this is currently working "as intended", correct?

I understand that the statistics-related commands and the IO-related commands fork into new processes, but I have to admit I am not sure I understand why those processes need to communicate with each other / why other processes need to wait for completion of the IO-task.

Unfortunately this is currently preventing us from using NSD, because it leads our monitoring systems to assume the instances of NSD that are currently writing a zone to file to be down (when in fact they are working just fine, but not responding to the statistics requests).

Do you think it is feasible to "untangle" these different kinds of operations?

I don't really know anything about the internal structure of NSD, so I have no idea whether something like that would be on the scale of a few hours of work, or if it would require big changes to NSD's architecture...

Any insights would be appreciated :)

Currently this is not a problem in the sense of a bug, but there is room for improvement: getting the statistics while long operations are in progress. After talking with colleagues in the team, I think a good way to implement the improvement is with a shared memory statistics buffer. This is also how the per-zone statistics currently work. That would mean the statistics command can print its output without having to wait.
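As a rough illustration of the shared memory idea (not the code from NSD or from any linked change), the sketch below maps an anonymous shared region before fork(), lets a child process update a counter in it, and reads the counter from the parent. The struct and function names are hypothetical. The parent only waits for the child here to make the example deterministic; a real reader could look at the counter at any time without asking the writer for it.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical counter block; NSD's real statistics struct differs. */
struct shared_stats {
    volatile unsigned long queries;
};

/* Map a zero-filled counter block that stays shared across fork(). */
struct shared_stats *shared_stats_create(void)
{
    void *p = mmap(NULL, sizeof(struct shared_stats),
                   PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : (struct shared_stats *)p;
}

/* Release the mapping created by shared_stats_create(). */
void shared_stats_destroy(struct shared_stats *s)
{
    if (s != NULL)
        munmap(s, sizeof(*s));
}

/*
 * Fork a worker that counts n "queries" directly into the shared block,
 * then return the total visible to the parent, or -1 on error.
 */
long count_in_child(struct shared_stats *stats, unsigned long n)
{
    pid_t pid;
    int status;

    assert(stats != NULL);
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        unsigned long i;
        for (i = 0; i < n; i++)
            stats->queries++; /* single writer, so no atomics needed here */
        _exit(0);
    }
    /* Waiting is only for a deterministic demo, not a requirement. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (long)stats->queries;
}
```

With several writers updating the same counter, this would need atomic operations or one counter block per process that the reader sums up.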

There is code in the linked change, and the code from that branch should work for the purpose of getting statistics output while the reload process is performing a long task. You can try it if you like, or wait for it to get included in the main code repository.

This issue is fixed by #305.

NSD not responding to stats_noreset command while writing zone to file