pion / interceptor

Pluggable RTP/RTCP processors for building real time communication

Home Page: https://pion.ly/

totalLost count error in receiverStream?

Marco-LIU opened this issue · comments

In `interceptor/receiver_stream.go`:

```go
func (stream *receiverStream) setReceived(seq uint16) {
	pos := seq % stream.size
	stream.packets[pos/64] |= 1 << (pos % 64)
}

func (stream *receiverStream) delReceived(seq uint16) {
	pos := seq % stream.size
	stream.packets[pos/64] &^= 1 << (pos % 64)
}

func (stream *receiverStream) getReceived(seq uint16) bool {
	pos := seq % stream.size
	return (stream.packets[pos/64] & (1 << (pos % 64))) != 0
}
```

In these 3 functions, pos is always in the range [0, 128), so pos/64 is always 0 or 1.
As a result, only the most recent 128 packets are recorded, which can produce wrong data in totalLost.
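For example (a minimal standalone sketch of the current logic, not the real receiverStream):

```go
package main

import "fmt"

func main() {
	// Standalone mock of the current logic: 128 uint64 entries, but pos
	// wraps at 128, so only the first two entries are ever touched.
	const size = 128
	packets := make([]uint64, size)

	setReceived := func(seq uint16) {
		pos := seq % size
		packets[pos/64] |= 1 << (pos % 64)
	}
	getReceived := func(seq uint16) bool {
		pos := seq % size
		return (packets[pos/64] & (1 << (pos % 64))) != 0
	}

	setReceived(0)
	// seq 128 aliases to the same bit as seq 0, so it looks "received"
	// even though setReceived(128) was never called.
	fmt.Println(getReceived(128)) // true
}
```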

If it's not a bug, just ignore this.

Yes, I think you are right. If the RTCP RR interval is more than 128 packets, we may under-report lost packets in the RR. The solution is probably to increase the size to something much larger (maybe 10X larger).

@davidzhao / @Sean-Der - I would be interested in your opinion on this. It seems to me that 128 is simply 'much too small' (since the RTCP RR interval is usually longer than 128 packets), and since things like the decision of whether to include FEC in Opus packets rely on an accurate lost packet count, it appears that this should be fixed. But I wanted to make sure I am not missing something here.

It appears to me from reading the code that each element in stream.packets doesn't represent a packet, but instead is a bitmask of 64 packets. That means the history is 8192 packets long. That should probably be enough for most cases, though it could be improved just in case the RR interval is very high or the packet rate is very high; I'm not certain there is any protection around those edge cases. Thankfully, they should be VERY rare.
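As a rough sanity check (my own illustrative numbers, not from the code):

```go
package main

import "fmt"

func main() {
	const entries = 128     // len(stream.packets)
	const bitsPerEntry = 64 // each uint64 entry is a bitmask of 64 packets

	historyPackets := entries * bitsPerEntry
	fmt.Println(historyPackets) // 8192 packets of history

	// Illustrative only: at ~50 packets/s (20 ms audio frames), 8192 packets
	// of history covers roughly 163 seconds, much longer than a typical RR interval.
	fmt.Println(historyPackets / 50) // 163
}
```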

However, there does appear to be a bug in how the offsets are calculated. I think it should be `pos := seq % (stream.size * 64)`.
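To make that concrete, the three helpers would look roughly like this with the change (a sketch of what I have in mind, not the final PR):

```go
// Sketch of the proposed change: index into the full stream.size*64-bit
// history instead of wrapping after the first 128 sequence numbers.

func (stream *receiverStream) setReceived(seq uint16) {
	pos := seq % (stream.size * 64)
	stream.packets[pos/64] |= 1 << (pos % 64)
}

func (stream *receiverStream) delReceived(seq uint16) {
	pos := seq % (stream.size * 64)
	stream.packets[pos/64] &^= 1 << (pos % 64)
}

func (stream *receiverStream) getReceived(seq uint16) bool {
	pos := seq % (stream.size * 64)
	return (stream.packets[pos/64] & (1 << (pos % 64))) != 0
}
```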

I'm going to write a test to verify the bug, then make sure my proposed fix works. If it does, I'll submit a PR.

Update: this test fails (as expected), but with the fix it passes. I'll put up a PR shortly.

```go
func TestReceiverStream(t *testing.T) {
	t.Run("can use entire history size", func(t *testing.T) {
		stream := newReceiverStream(12345, 90000)
		maxPackets := stream.size * packetsPerHistoryEntry
		for seq := uint16(0); seq < maxPackets; seq++ {
			require.False(t, stream.getReceived(seq), "packet with SN %v shouldn't be received yet", seq)
			stream.setReceived(seq)
			require.True(t, stream.getReceived(seq), "packet with SN %v should now be received", seq)
		}
	})
}
```
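(For reference, packetsPerHistoryEntry in the test above is a constant I'm assuming will be defined alongside the fix, something like the following.)

```go
// Assumed helper constant for the test above: the number of packets tracked
// by each uint64 entry in stream.packets (one bit per packet).
const packetsPerHistoryEntry = 64
```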

Hi @kcaffrey - unless I've misunderstood (not uncommon!), aren't what you're saying and what I'm saying isomorphic to each other? The difference seems to be in what the units of stream.size are 'meant to be', which isn't documented.

If the units of stream.size are 'meant to be packets', then the bug is that stream.size is much smaller than the typical RTCP RR interval (and also that stream.packets is allocated larger than it needs to be for the given stream.size). (my interpretation)

If the units of stream.size are 'meant to be blocks of 64 packets' then the bug is that pos := seq % stream.size is not the right calculation. (your interpretation)

Edit to add: I see in the code we have this:

```go
		size:         128,
		packets:      make([]uint64, 128),
```

Since each entry in the packets slice represents a 64-packet block, and both size and the length of the packets slice are 128, this suggests that the unit of size is also meant to be '64-packet blocks', in which case your interpretation is correct, and so I fully support your PR.