MichaIng / DietPi

Lightweight justice for your single-board computer!

Home Page: https://dietpi.com/

dietpi-wifi-monitor enhancement

cbrenberg opened this issue

I've recently been fighting with dietpi-wifi-monitor, which has been incorrectly detecting network outages and resetting my Pi Zero 2W's DHCP lease when my router occasionally drops a ping packet. In almost all cases, the Pi's network connection is still active and does not need to be reset.

While investigating the above issue, I noticed the dietpi-wifi-monitor script sends one single ping, and if the ping fails, it interprets the result as a lost connection.

I don't know enough bash to suggest a specific implementation, but my idea for an enhancement would be for dietpi-wifi-monitor to send multiple pings each time it runs instead of just one, and only attempt to reset the connection if ALL ping attempts fail. This would avoid unnecessary network resets due to a single packet drop.

For now I've disabled dietpi-wifi-monitor, which so far seems to result in a much more stable wlan0 connection on this Pi.

This ties in with a question that recently came up on the forum: https://dietpi.com/forum/t/auto-reconnect-wifi/18667/17

What is your use case? If both the Pi and the router are stationary, then DietPi-WiFi-Monitor indeed usually does not make much sense. Short connection drops do not permanently break a WiFi connection, as you noticed yourself. It does make sense if the router is regularly shut down or if either of the two is not stationary.

I am not sure whether increasing the number of pings would help much. Often, if one ping is dropped because the connection is flaky, several in a row are dropped as well. Or did you verify in your particular case that, when sending multiple pings, occasionally only one of them is dropped while all the others are answered?

In case we want to implement this, one problem is that I see no way to get a successful (zero) exit code from ping if any response is missing. If you send multiple pings with a timeout, all of them must be answered for a successful exit code; otherwise you get the same result as if all of them had failed. It would be nice to have something like "exit successfully as soon as a response arrives within X seconds". Of course we could add our own loop ... we even have a function for this:

G_EXEC_RETRIES=3 G_EXEC ping -nqc 1 -I wlan0 192.168.1.1

This runs ping up to 3 times, sending 1 ping each time, and exits successfully as soon as one attempt succeeds. This is quite similar to how we do network connection tests at the start of DietPi updates, software installs etc. However, since ping has quite a long timeout by default, the overall time before a network connection restart is tripled as well. It would be nicer to send 3 pings and then wait only once for any response. We could of course reduce the timeout, but there are cases where the AP simply always takes a little longer to respond, and those setups would then break.
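For readers without the DietPi helper at hand, the retry behaviour described above can be sketched in plain shell. This is only an illustration and not how G_EXEC is implemented: retry_until_success is a hypothetical name, and the -W 2 per-reply timeout in the example call is an arbitrary value.

```shell
# Hypothetical helper (not DietPi's G_EXEC): run a command up to N times
# and return 0 as soon as one attempt succeeds, 1 if all attempts fail.
retry_until_success() {
    local attempts="$1" i=0
    shift
    while [ "$i" -lt "$attempts" ]; do
        "$@" && return 0
        i=$((i + 1))
    done
    return 1
}

# Example call (interface and address as used in this thread; -W 2 is an
# assumed per-reply timeout, not a recommended default):
#   retry_until_success 3 ping -nqc 1 -W 2 -I wlan0 192.168.1.1
```

The worst case still waits for the timeout on every attempt, which is the tripled delay mentioned above.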

If someone finds a way to make ping return successfully on the first answer after sending multiple echo requests, let me know. That would be ideal and relatively regression-free to implement.
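One candidate worth testing, going by the iputils ping man page: when -c is combined with a -w deadline, ping is described as waiting for count replies until the deadline expires, rather than stopping after count requests. If that holds on the target system, -c 1 with a deadline would keep sending requests and exit with code 0 on the first reply. The sketch below assumes iputils ping; first_reply_within is a hypothetical name and the 1-second interval is an arbitrary choice.

```shell
# Hypothetical wrapper: ask for a single reply (-c 1) within an overall
# deadline (-w), sending one echo request per second (-i 1) until then.
# Assumption: iputils ping, where -c combined with -w means "wait for
# count replies until the deadline expires". Verify before relying on it.
first_reply_within() {
    local deadline="$1" iface="$2" target="$3"
    ping -nq -c 1 -w "$deadline" -i 1 -I "$iface" "$target" > /dev/null 2>&1
}

# Example call: succeed if any reply arrives within 3 seconds.
#   first_reply_within 3 wlan0 192.168.1.1 || echo "no reply within deadline"
```

Other ping implementations (for example the busybox one) may treat these flags differently, so this is not a drop-in change without testing.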

Since the script currently relies on the exit code, here's what the ping man page says about exit codes:

If ping does not receive any reply packets at all it will exit with code 1. If a packet count and deadline are both specified, and fewer than count packets are received by the time the deadline has arrived, it will also exit with code 1. On other error it exits with code 2. Otherwise it exits with code 0. This makes it possible to use the exit code to see if a host is alive or not.
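To make the three cases in that man page excerpt concrete, here is a small sketch that maps an exit status to a label. The function name and the labels are hypothetical and only serve as illustration.

```shell
# Hypothetical helper: translate a ping exit status into a label,
# following the man page wording quoted above.
classify_ping_exit() {
    case "$1" in
        0) echo "alive" ;;            # enough replies were received
        1) echo "missing-replies" ;;  # no reply, or fewer than count by the deadline
        *) echo "error" ;;            # 2 according to the man page: some other error
    esac
}

# Example call:
#   ping -nqc 1 -I wlan0 192.168.1.1 > /dev/null 2>&1
#   classify_ping_exit "$?"
```

Treating codes 1 and 2 separately would let a script tell a silent access point apart from a local problem such as a bad argument.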

One option that would still respect the exit code would be to add the -W flag to set a maximum timeout threshold, which would reduce the amount of time it waits for a response on each of the three retries. But since network latency varies widely, I don't know what the default timeout should be set to.

An alternative would be to issue a single command that sends three pings in quick succession, using the -f option with a small interval and a count of 3. In this case, the output would have to be captured and parsed, looking either for "..." (which -f prints if all three pings fail) or for "100% packet loss".

That command might look something like ping -nfc 3 -i 0.2 -I wlan0 192.168.1.1
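A minimal sketch of the "100% packet loss" variant follows. The helper name all_packets_lost is hypothetical, and it assumes the English summary line that iputils ping prints; forcing LC_ALL=C in the example call is a precaution in case the output is localized.

```shell
# Hypothetical helper: read ping output on stdin and return 0 only if the
# summary line reports that every packet was lost. The leading space keeps
# the pattern from matching inside a longer number.
all_packets_lost() {
    grep -q ' 100% packet loss'
}

# Example call with the command from this thread:
#   if LC_ALL=C ping -nfc 3 -i 0.2 -I wlan0 192.168.1.1 2>&1 | all_packets_lost; then
#       echo "all pings failed"
#   fi
```

A single dropped packet then shows up as 33% loss and does not trigger anything, which is the behaviour asked for in the original request.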

Here's another possibility using the -f flag but still using exit codes:

ping -i 0.2 -c 5 -f -I wlan0 192.168.1.1 | grep '\.\.\.\.\.' -m1

It pipes the output from ping into grep, which checks for a run of . characters matching the ping count. If there's a match, grep stops after it (-m1) and forces ping to exit. The resulting exit code is that of grep, the last command in the pipeline, so it is inverted compared to plain ping:

Exit code is 0 when all ping attempts fail; otherwise the exit code is 1.
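Since a pipeline reports the status of its last command, a script that expects 0 to mean "connection fine" would need to negate the grep result. A rough sketch, with a hypothetical helper name and assuming that flood mode prints one dot per request sent:

```shell
# Hypothetical helper: read captured flood-ping output on stdin and
# return 0 (link considered up) unless it contains a run of at least
# $1 consecutive dots, i.e. that many requests in a row without a reply.
link_up_from_flood_output() {
    ! grep -q "\.\{$1\}"
}

# Example call with the command from this thread:
#   ping -i 0.2 -c 5 -f -I wlan0 192.168.1.1 | link_up_from_flood_output 5 \
#       || echo "all 5 pings failed"
```

Note that empty output, for example when ping itself fails with an error, also counts as "up" in this sketch, so a real script would have to handle that case separately.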