hessu / aprsc

aprsc, a core APRS-IS server

APRS-IS Packets Being Denied

na7q opened this issue · comments

commented

I've been having an issue for some time now. I've tested using quite a few servers.
I also put it to the test further by creating my own aprsc server and testing against it.

Here's my issue. I have several APRS iGate stations that are also digipeaters, and some are within range of each other. The ones in range of each other are set to use the same server, for example "Northwest.aprs2.net" or my own that I've created.

Whenever one of the stations beacons over RF, nothing shows up on aprs.fi or other sites, even though the other station heard the beacon. I confirmed that it hears it by remotely accessing the computers and watching the live traffic.

If I use a different server on each station, and then beacon over RF, I can see the packet show up on aprs.fi and other sites.

The issue here is that these sites may sometimes lose internet for a brief moment. So it's not only the beacon packets that get lost, but also some packets digipeated by one of the stations. If a station can't gate a packet because it has lost internet for that minute, and the other station picks up the digipeated packet instead, it's lost for good.
The beacons over RF are also useful for being gated. Mostly for propagation and showing up online as an RF station.

Nobody has been able to explain this. It acts as if the server treats packets from stations that are logged into the same server as duplicates and tosses them.

Could you document the exact callsign-SSIDs of each of the stations, and document a few example cases of full packets, with paths shown?

One of the standard duplicate/looped packet rejection mechanisms of the APRS-IS servers (both aprsc and javAPRSSrvr) is that, if a station is logged in to the server as CALL-1, and then packets from CALL-1 enter the server through another socket, those packets are dropped, as they are only expected from CALL-1's direct connection. That might potentially be the thing.

commented

NA7Q-2 and KNAPPA are the two that deliver the main issue described.

Here's my example packet from KNAPPA that NA7Q-2 hears.
KNAPPA>APDW14:>Knappa iGate, Digipeater, WX - NA7Q

Here's a packet from NA7Q-2 that KNAPPA hears.
NA7Q-2>APRS:!4621.44NI12342.59W#PHG7180Fill-in Digi/IGate

So going from or to either station, if on the same server (maybe even the same hub), neither will show the traffic being gated on APRS.fi or other sites. At times it is gated by other stations that are not mine, however none of them are using the same servers, so the issue doesn't happen with them. However, whenever they do happen to be on the same server, it does present the issue.

I would normally use the northwest.aprs2.net server. To solve my problem I had used yyz.aprs2.net and indiana.aprs2.net (which currently forward to the same server). At the time, they were on different hubs as well.

I created my own server to also test the issue, and it produces identical results when the stations are using it. My server is "na7q.com", using standard ports. I've currently connected the two stations (NA7Q-2 & KNAPPA) to this server.

I may try to create a video when I get a chance and attempt to show this issue. It's difficult to explain otherwise.

One of the standard duplicate/looped packet rejection mechanisms of the APRS-IS servers (both aprsc and javAPRSSrvr) is that, if a station is logged in to the server as CALL-1, and then packets from CALL-1 enter the server through another socket, those packets are dropped, as they are only expected from CALL-1's direct connection. That might potentially be the thing. It's a built-in loop protection mechanism that is present in all APRS-IS servers.

https://github.com/hessu/aprsc/blob/master/src/incoming.c#L998

It may help if you can make the stations connect to different servers, or if you can make the connection more reliable (faster reconnection after failure).

commented

I don't quite understand then. The logins are different. The stations are different. The servers can, at times, be different. It takes testing each set of servers to figure out which combination doesn't do it. I don't think having to do this is the solution.
I think something needs to be done about this, because it creates a HUGE loss in packets in larger areas.

If I get this right, a packet is only lost when:

  1. A station, CALL-1, while being connected to the APRS-IS server A using that callsign, transmits a packet on RF, but does not transmit it to the APRS-IS at the same time
  2. Another station, CALL-2, hears the packet and rx-iGates it to the APRS-IS
  3. The packet goes to APRS-IS server A, where CALL-1 is logged in, and the APRS-IS server drops the packet since it should be coming in from CALL-1 directly
  4. If CALL-2 and CALL-1 are on the same APRS-IS server, CALL-2 "doesn't seem to hear" CALL-1.

Or, do you see any other chain of events? I just want to check that I understand you right, because this is what I have the explanation for.

commented

I don't know how I missed this response... sorry..

I think you have it right.
[For reference: Call-1 and Call-2 are igates and digipeaters]

Example 1:
Call-1 is APRS-IS connected via server A.
Call-2 is APRS-IS connected via server A.
Call-1 beacons over RF. Call-2 hears Call-1's beacon and sends it to APRS-IS server A.
Server A then drops the packet.
Same results going the opposite direction.

Example 2:
[Add Call-3 (a standard mobile station)]
[Add internet connectivity issues]
One particular place where the above issue becomes problematic is when Call-1 randomly loses its internet connection. It may drop for seconds, or for 1, 5, 10 minutes or more, and sometimes it still appears connected.
So Call-3 beacons, Call-1 hears it (but has no internet due to the internet loss), Call-1 digipeats, Call-2 hears that digipeated packet and sends it to server A. Server A drops the packet. The packet over RF is lost.

Example 3:
[Add Call-4 (a mobile igate & digi station)]
Call-4 is APRS-IS connected via server A.
Call-4 beacons, Call-1 hears Call-4 and sends it to APRS-IS Server A. Server A drops the packet. The packet over RF is lost from Call-4.

Example 4:
Call-4 loses internet momentarily. Call-3 beacons, Call-4 hears Call-3 and due to internet loss, it does not make it to Server A. Call-4 still digipeats Call-3, and is heard by Call-1. Call-1 sends Call-3's packet digipeated by Call-4 to Server A. Server A drops the packet.

I think I described the issue correctly to the best of my experience. Maybe you know what's going on. If not, I will eventually try to do some tests with video or screen shots.

With all of this, I have seen a lot of lost packets from my stations and others. Sometimes they do make it to APRS-IS when on the same server or hub, but not always, sometimes never.

Sorry for the slow reply. Remind me if I forget.

One of the loop prevention checks in APRS-IS servers (both javAPRSSrvr and aprsc) is:

  • If a client having a callsign of X is logged in, with a correct passcode, to a server
  • and a packet from callsign X arrives from another network socket than the socket of client X
  • the packet shall be dropped

This is how the servers operate currently. This is by design. It has a drawback, as you have found out, but that's how it works. It doesn't look at the digipeater path though; just compares the source callsign with the list of currently logged-in clients. The algorithm was designed as one countermeasure against packet loops which seriously flooded and broke the APRS-IS network years before I even started looking at APRS.
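
The check described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this thread, not the aprsc source (which is written in C). The `Server` class, its method names, and the integer socket ids are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Toy model of one APRS-IS server (hypothetical, for illustration)."""

    # Maps a verified login callsign to the id of the socket it logged in on.
    verified_clients: dict = field(default_factory=dict)

    def login(self, callsign, socket_id):
        self.verified_clients[callsign.upper()] = socket_id

    def accept(self, packet, arriving_socket_id):
        """Return True if the packet passes the source-callsign check."""
        header = packet.split(":", 1)[0]
        source = header.split(">", 1)[0].upper()
        owner = self.verified_clients.get(source)
        # Drop only when the source callsign belongs to a logged-in client
        # and the packet arrived on some other socket. The digipeater path
        # is deliberately not inspected.
        return owner is None or owner == arriving_socket_id


server = Server()
server.login("KNAPPA", socket_id=1)
server.login("NA7Q-2", socket_id=2)

beacon = "KNAPPA>APDW14:>Knappa iGate, Digipeater, WX - NA7Q"
print(server.accept(beacon, arriving_socket_id=1))  # sent by KNAPPA itself
print(server.accept(beacon, arriving_socket_id=2))  # gated by NA7Q-2
```

In this model a packet from a third station that was merely digipeated by KNAPPA would still pass when gated by NA7Q-2, because only the source callsign is compared.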

If a station is connected to the APRS-IS, it must transmit its own packets to the APRS-IS. If another iGate passes them on, and they happen to be on the same server, packets may be lost.

This explains example 1 above.

It does not explain example 2; I didn't find anything in the code that would drop the packet in such a case. The code only looks at the source callsign, not whether a validated login appears in the digipeater path. I'm not entirely convinced example 2 would happen. Running a server in debug logging level would reveal exact packet drop reasons.

Example 3 is, in principle, a duplicate of example 1; working as designed. I didn't design it, I just reimplemented it. :)

Example 4 is a duplicate of example 2. I'm not entirely sure this would really happen as the code is not looking at the digipeater path, as far as I remember, and as far as I can see in the source code now. Please test by running a server in debug mode. Demonstrate and provide packet logs, and I'll look into it - it is possible that there is a bug somewhere.

commented

Looks like this problem is once again being introduced at a couple stations we deployed. We didn't mean for them to be on the same server. But packets are being lost, and data we want to collect goes out the door with it.

MARBLE is in range of TROUT, and both are on the same server. TROUT beacons, but when MARBLE attempts to gate it, the server rejects it, and the packet only gets through once it has been digipeated on to the next igate, as shown below. The same applies vice versa.

TROUT>APDW15,MARBLE,WIDE1,REDMT,WIDE2*,qAO,WA7RGO-10:!4559.39NI12132.41W#PHG7290W2, WAn-N, SARn-N / NA7Q
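
To read a packet like the one above, it helps to split the header into its parts. The following is a minimal sketch, assuming the usual `SOURCE>DEST,PATH...:payload` text layout and that the q construct is a three-character field starting with `qA`. The function name `parse_tnc2_header` and the returned dictionary keys are made up for this example.

```python
def parse_tnc2_header(packet):
    """Split a text-format APRS packet into header fields and payload."""
    header, _, payload = packet.partition(":")
    source, _, rest = header.partition(">")
    fields = rest.split(",")
    destination, path = fields[0], fields[1:]

    # Find the q construct (three characters starting with "qA"), if any.
    q_index = next(
        (i for i, f in enumerate(path) if len(f) == 3 and f.startswith("qA")),
        None,
    )
    if q_index is None:
        rf_path, q_construct, gate = path, None, None
    else:
        rf_path = path[:q_index]
        q_construct = path[q_index]
        # The field after the q construct names the station that gated it.
        gate = path[q_index + 1] if q_index + 1 < len(path) else None

    return {
        "source": source,
        "destination": destination,
        "rf_path": rf_path,
        "q_construct": q_construct,
        "gate": gate,
        "payload": payload,
    }


packet = ("TROUT>APDW15,MARBLE,WIDE1,REDMT,WIDE2*,qAO,WA7RGO-10:"
          "!4559.39NI12132.41W#PHG7290W2, WAn-N, SARn-N / NA7Q")
for key, value in parse_tnc2_header(packet).items():
    print(f"{key}: {value}")
```

For this packet the RF path lists MARBLE and REDMT as digipeaters, while the field after the q construct is WA7RGO-10, which is the station that actually got the packet onto the APRS-IS.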

If I understand, this is working as intended by the server design to reduce the loopback?

Right. I'm just copy-pasting from the answer above, because it's still how things are:

This is how the servers operate currently. This is by design. It has a drawback, as you have found out, but that's how it works. If a station is connected to the APRS-IS, it must transmit its own packets to the APRS-IS. If another iGate passes them on, and they happen to be on the same server, packets may be lost.

If they happen to be on different servers, the packets igated by another gateway will still not be shown to clients which are connected to the same server as the originator of the packets. If TROUT was on server-1, and MARBLE was on server-2, packets would be forwarded to the rest of the APRS-IS from MARBLE but server-1 would still drop them as TROUT is a local client and the packets were not injected by that client.
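
The same-server and different-server cases can be compared with a small simulation. It is a deliberately simplified model, assuming every server is offered every packet and applies only the source-callsign rule described in this thread. Real APRS-IS topology and the other duplicate checks are ignored, and the function names are made up for the example.

```python
def source_of(packet):
    """Return the source callsign of a text-format APRS packet."""
    return packet.split(":", 1)[0].split(">", 1)[0]


def servers_passing(packet, injecting_client, entry_server, logins):
    """List the servers that would pass the packet on to their clients.

    logins maps a server name to the set of callsigns logged in there.
    """
    src = source_of(packet)
    passing = []
    for server, clients in logins.items():
        injected_by_source = server == entry_server and injecting_client == src
        # A server drops the packet if the source is one of its own
        # logged-in clients and that client did not inject it directly.
        if src in clients and not injected_by_source:
            continue
        passing.append(server)
    return passing


beacon = "TROUT>APDW15:!4559.39NI12132.41W#PHG7290"

same = {"server-1": {"TROUT", "MARBLE"}, "server-2": set()}
split = {"server-1": {"TROUT"}, "server-2": {"MARBLE"}}

print(servers_passing(beacon, "MARBLE", "server-1", same))   # gated by MARBLE
print(servers_passing(beacon, "MARBLE", "server-2", split))  # gated by MARBLE
print(servers_passing(beacon, "TROUT", "server-1", split))   # sent by TROUT
```

In the split case the beacon gated by MARBLE reaches the rest of the network through server-2, but server-1, where TROUT is logged in, still drops it.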

commented

I think I understand that. I will try to debug a bit more to be sure. I have changed servers and of course, it's resolved for now.

The dilemma is that we need to keep track of propagation and most importantly transmitter issues by way of gating other stations.

So for a literal and recent example. TROUT had a recent transmitter failure. We usually can tell by the lack of it being gated by MARBLE. No gating means something is wrong with the radio or antenna.

Since we weren't seeing gating between the two, especially from TROUT to MARBLE, we assumed the transmitter had died. It did die, but ironically the server issue was happening at the exact same time.

If there were ways to have these packets pass through without rejection, it would be amazing. Though I'm not sure it's possible?

Unfortunately the APRS-IS design currently does not support your use case. There is no simple or easy workaround for this; I cannot change this without risking duplicate packets and packet looping issues.

To detect a broken transmitter, you'll need to do the detection & alarms by looking locally at what is received by an APRS receiver or iGate which hears the transmitter, without the packets traversing the APRS-IS. Have the monitoring software look at the received packets log locally.
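
A local check along these lines could be sketched as follows. The log format is an assumption made for the example: each line is a Unix timestamp, a space, and the received packet in text form. Real iGate software logs differently and the parsing would need adapting. The function names and the one-hour threshold are likewise only illustrative.

```python
def last_heard(log_lines):
    """Map each source callsign to the latest time it was heard locally."""
    seen = {}
    for line in log_lines:
        stamp, _, packet = line.strip().partition(" ")
        if ">" not in packet:
            continue  # not a packet line
        try:
            when = float(stamp)
        except ValueError:
            continue  # line does not start with a timestamp
        source = packet.split(">", 1)[0]
        seen[source] = max(when, seen.get(source, when))
    return seen


def silent_stations(log_lines, watched, now, max_age_s=3600):
    """Return watched callsigns not heard within max_age_s seconds."""
    seen = last_heard(log_lines)
    never = float("-inf")  # stations never heard always count as silent
    return sorted(c for c in watched if now - seen.get(c, never) > max_age_s)


log = [
    "1000 TROUT>APDW15:!4559.39NI12132.41W#PHG7290",
    "4000 REDMT>APDW15:>example status",
]
print(silent_stations(log, ["TROUT", "REDMT"], now=5000))
```

Run against the receive log of the iGate that hears the transmitter, a check like this does not depend on whether the APRS-IS server passes the packets on.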

commented

Year after year, this keeps coming back and biting me. It has proven to be a difficult problem with no real resolution.

The resolution, as far as I can see, is still the same:

If a station is connected to the APRS-IS, it must transmit its own packets to the APRS-IS. If another iGate passes them on, and they happen to be on the same server, packets may be lost.

Have you tried that?