FRRouting / frr

The FRRouting Protocol Suite

Home Page: https://frrouting.org/

GR fails under large route

wangdan1323 opened this issue

Kernel: Linux 4.19.0-12-2-amd64
FRR Version: stable/8.5
Graceful restart (GR) fails when the number of routes is large.

The cause appears to be bpacket_queue_is_full: once the packet queue is full, some routes are sent to the GR helper after End-of-RIB.
Why does subgroup-pkt-queue-max default to 40, and why is it configurable?
What is the impact of raising the default to the maximum value of 100?
Can the value be allowed to exceed 100?

We need more details on the situation in which you hit this limit. Can you share the configuration, the scale, and maybe some logs? Also, please share the output of show event cpu when this limit is reached.

bgp92----------bgp62-----------bgp51
bgp92 advertises 100k routes to bgp62.
When bgpd on bgp62 restarts, bgp62 receives the 100k routes from bgp92 and then sends them to its neighbors.
DEBUG bgp#bgpd[62]: u1:s1 send UPDATE 100.0.141.124/32 IPv4 unicast
DEBUG bgp#bgpd[62]: u1:s1 send UPDATE 100.0.141.139/32 IPv4 unicast
DEBUG bgp#bgpd[62]: u1:s1 send UPDATE 100.0.141.125/32 IPv4 unicast
DEBUG bgp#bgpd[62]: u1:s1 send UPDATE 100.0.141.140/32 IPv4 unicast
DEBUG bgp#bgpd[62]: u1:s1 send UPDATE 100.0.141.126/32 IPv4 unicast
DEBUG bgp#bgpd[62]: u1:s1 send UPDATE 100.0.141.127/32 IPv4 unicast
DEBUG bgp#bgpd[62]: u1:s1 send UPDATE 100.0.141.141/32 IPv4 unicast
DEBUG bgp#bgpd[62]: u1:s1 send UPDATE 100.0.141.128/32 IPv4 unicast
DEBUG bgp#bgpd[62]: u1:s1 send UPDATE 100.0.141.129/32 IPv4 unicast
DEBUG bgp#bgpd[62]: u1:s1 send UPDATE 100.0.141.142/32 IPv4 unicast
DEBUG bgp#bgpd[62]: u1:s1 send UPDATE 100.0.141.130/32 IPv4 unicast
DEBUG bgp#bgpd[62]: u1:s1 send UPDATE 100.0.141.143/32 IPv4 unicast
DEBUG bgp#bgpd[62]: u1:s1 send UPDATE len 4096 numpfx 806
806 prefixes per packet * subgroup-pkt-queue-max (default = 40) = 32240 prefixes.
With 100k routes, this limit is reached.
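
The arithmetic above can be checked with a short script. This is only a rough sketch: it assumes every UPDATE carries 806 prefixes, as in the log line above, which holds only when the prefixes share attributes and pack as densely as these /32s. The function names are hypothetical and not part of FRR.

```python
import math


def queue_capacity_prefixes(prefixes_per_packet: int, queue_max: int) -> int:
    """Upper bound on prefixes held by a full subgroup packet queue."""
    return prefixes_per_packet * queue_max


def packets_needed(total_prefixes: int, prefixes_per_packet: int) -> int:
    """Number of UPDATE packets needed to carry all prefixes."""
    return math.ceil(total_prefixes / prefixes_per_packet)


PREFIXES_PER_PACKET = 806  # from "send UPDATE len 4096 numpfx 806"
TOTAL_ROUTES = 100_000     # route count in the reported setup

for queue_max in (40, 100):
    capacity = queue_capacity_prefixes(PREFIXES_PER_PACKET, queue_max)
    fits = capacity >= TOTAL_ROUTES
    print(f"queue-max {queue_max}: {capacity} prefixes, fits 100k: {fits}")

print("packets needed:", packets_needed(TOTAL_ROUTES, PREFIXES_PER_PACKET))
```

Under these assumptions, 100k routes need more packets than even the maximum queue depth of 100 can hold at once, so raising the value alone would not keep the whole table inside the queue.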

Can you also share the output of show ip bgp update-groups while doing a restart?

Here is the output with 37k routes:
show ip bgp update-groups
Update-group 3:
Created: Tue Jun 4 15:57:48 2024
Outgoing route map: wd1
MRAI value (seconds): 0

Update-subgroup 3:
Created: Tue Jun 4 15:57:48 2024
Join events: 1
Prune events: 0
Merge events: 0
Split events: 0
Update group switch events: 0
Peer refreshes combined: 0
Merge checks triggered: 2
Coalesce Time: 1350
Version: 76000
Packet queue length: 0
Total packets enqueued: 0
Packet queue high watermark: 0
Adj-out list count: 0
Advertise list: empty
Flags:
Peers:
- 210.6.1.92
Update-group 4:
Created: Tue Jun 4 15:57:48 2024
MRAI value (seconds): 0

Update-subgroup 4:
Created: Tue Jun 4 15:57:48 2024
Join events: 2
Prune events: 0
Merge events: 0
Split events: 0
Update group switch events: 0
Peer refreshes combined: 1
Merge checks triggered: 0
Coalesce Time: 1350
Version: 76000
Packet queue length: 0
Total packets enqueued: 138
Packet queue high watermark: 48
Adj-out list count: 38000
Advertise list: empty
Flags:
Peers:
- 210.2.1.179
- 210.4.1.51

The packet queue length is 0 here, so the limit you described at the beginning doesn't seem to be hit in this case. By the way, does anything change if you raise the value from 40 to 100?

1. bgp default subgroup-pkt-queue-max 90
While updating:
Update-group 5:
Created: Tue Jun 4 16:54:32 2024
MRAI value (seconds): 0

Update-subgroup 9:
Created: Tue Jun 4 16:54:32 2024
Join events: 2
Prune events: 0
Merge events: 0
Split events: 0
Update group switch events: 0
Peer refreshes combined: 0
Merge checks triggered: 0
Coalesce Time: 1350
Version: 156000
Packet queue length: 39
Total packets enqueued: 41
Packet queue high watermark: 39

Adj-out list count: 38000
Advertise list: not empty
Flags:
Peers:
- 210.4.1.51
- 210.2.1.179

End of update:

Update-group 5:
Created: Tue Jun 4 16:54:32 2024
MRAI value (seconds): 0

Update-subgroup 9:
Created: Tue Jun 4 16:54:32 2024
Join events: 2
Prune events: 0
Merge events: 0
Split events: 0
Update group switch events: 0
Peer refreshes combined: 0
Merge checks triggered: 0
Coalesce Time: 1350
Version: 156000
Packet queue length: 0
Total packets enqueued: 48
Packet queue high watermark: 39

Adj-out list count: 38000
Advertise list: empty
Flags:
Peers:
- 210.4.1.51
- 210.2.1.179

2. bgp default subgroup-pkt-queue-max 30
While updating:

Update-group 9:
Created: Tue Jun 4 16:59:00 2024
MRAI value (seconds): 0

Update-subgroup 13:
Created: Tue Jun 4 16:59:00 2024
Join events: 2
Prune events: 0
Merge events: 0
Split events: 0
Update group switch events: 0
Peer refreshes combined: 0
Merge checks triggered: 1
Coalesce Time: 1350
Version: 270000
Packet queue length: 0
Total packets enqueued: 2
Packet queue high watermark: 2

Adj-out list count: 38000
Advertise list: not empty
Flags:
Peers:
- 210.4.1.51
- 210.2.1.179

End of update:
Update-group 9:
Created: Tue Jun 4 16:59:00 2024
MRAI value (seconds): 0

Update-subgroup 13:
Created: Tue Jun 4 16:59:00 2024
Join events: 2
Prune events: 0
Merge events: 0
Split events: 0
Update group switch events: 0
Peer refreshes combined: 0
Merge checks triggered: 1
Coalesce Time: 1350
Version: 270000
Packet queue length: 0
Total packets enqueued: 48
Packet queue high watermark: 29

Adj-out list count: 38000
Advertise list: empty
Flags:
Peers:
- 210.4.1.51
- 210.2.1.179
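
To compare runs like the ones above, the queue counters can be pulled out of the show ip bgp update-groups text with a small script. This is a minimal sketch that assumes the plain-text layout pasted in this thread; the function name and the dictionary keys are hypothetical, and the sample values are copied from the "while updating" output for subgroup 9.

```python
import re

# Map the counter labels seen in the output to short, hypothetical key names.
FIELDS = {
    "Packet queue length": "queue_len",
    "Total packets enqueued": "enqueued",
    "Packet queue high watermark": "high_watermark",
    "Adj-out list count": "adj_out",
}


def parse_update_subgroups(text: str) -> list[dict]:
    """Return one dict of integer counters per Update-subgroup block."""
    subgroups = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = re.match(r"Update-subgroup (\d+):", line)
        if match:
            current = {"id": int(match.group(1))}
            subgroups.append(current)
            continue
        if current is None or ":" not in line:
            continue
        # Split on the first colon only, so timestamps are left alone.
        key, _, value = line.partition(":")
        name = FIELDS.get(key.strip())
        if name and value.strip().isdigit():
            current[name] = int(value.strip())
    return subgroups


sample = """Update-subgroup 9:
Created: Tue Jun 4 16:54:32 2024
Packet queue length: 39
Total packets enqueued: 41
Packet queue high watermark: 39
Adj-out list count: 38000
"""

for sg in parse_update_subgroups(sample):
    print(sg)
```

Running this against snapshots taken during and after the update makes it easier to see whether the high watermark ever reaches the configured subgroup-pkt-queue-max.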