Weird BGP IPv6 link-local next-hop behavior
qeleq opened this issue · comments
Hi all!
FRR version 10.0.
I have two interfaces with IPv6 link-local addresses, each carrying an eBGP IPv6 session:
7: ens13f0np0.80@ens13f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 18:9b:a5:82:25:e2 brd ff:ff:ff:ff:ff:ff
inet6 fe80:14:fc01:1::2/64 scope link
valid_lft forever preferred_lft forever
inet6 fe80::1a9b:a5ff:fe82:25e2/64 scope link
valid_lft forever preferred_lft forever
10: ens28f0np0.80@ens28f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether e8:eb:d3:b3:54:b6 brd ff:ff:ff:ff:ff:ff
inet6 fe80:14:fc01:2::2/64 scope link
valid_lft forever preferred_lft forever
inet6 fe80::eaeb:d3ff:feb3:54b6/64 scope link
valid_lft forever preferred_lft forever
FRR settings
frr version 10.0
frr defaults traditional
hostname el-fw1.cdnwb.ru
log syslog informational
service integrated-vtysh-config
router bgp 65323
neighbor SW-LAN peer-group
neighbor fe80:14:fc01:1::1 peer-group SW-LAN
neighbor fe80:14:fc01:1::1 interface ens13f0np0.80
no neighbor fe80:14:fc01:1::1 enforce-first-as
neighbor fe80:14:fc01:2::1 peer-group SW-LAN
neighbor fe80:14:fc01:2::1 interface ens28f0np0.80
no neighbor fe80:14:fc01:2::1 enforce-first-as
address-family ipv6 unicast
neighbor SW-LAN activate
neighbor SW-LAN soft-reconfiguration inbound
neighbor SW-LAN route-map FROM_LAN_V6 in
neighbor SW-LAN route-map TO_LAN_V6 out
exit-address-family
All sessions are UP and stable
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc
fe80:14:fc01:1::1 4 65322 2400 2170 11 0 0 16:27:31 1 0 N/A
fe80:14:fc01:2::1 4 65322 2362 2142 11 0 0 16:27:31 1 0 N/A
Both BGP peers announce one IPv6 prefix to me, 2a03:720:1000::/36:
el-fw1.cdnwb.ru# sh bgp neighbors fe80:14:fc01:1::1 received-routes
BGP table version is 11, local router ID is 192.168.0.1, vrf id 0
Default local pref 100, local AS 65323
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 2a03:720:1000::/36
fe80:14:fc01:1::1
0 65322 4206000170 57073 i
Total number of prefixes 1
el-fw1.cdnwb.ru# sh bgp neighbors fe80:14:fc01:2::1 received-routes
BGP table version is 11, local router ID is 192.168.0.1, vrf id 0
Default local pref 100, local AS 65323
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 2a03:720:1000::/36
fe80:14:fc01:2::1
0 65322 4206000170 57073 i
Total number of prefixes 1
So BGP signaling is OK, but I have a very weird situation when the routes are added to the RIB:
el-fw1.cdnwb.ru# sh bgp neighbors fe80:14:fc01:1::1 received-routes detail
BGP table version is 11, local router ID is 192.168.0.1, vrf id 0
Default local pref 100, local AS 65323
BGP routing table entry for 2a03:720:1000::/36, version 11
Paths: (2 available, best #1, table default)
Not advertised to any peer
65322 4206000170 57073
**fe80:14:fc01:2::1** from **fe80:14:fc01:2::1** (10.255.193.111)
(fe80:14:fc01:2::1) (used)
Origin IGP, valid, external, best (First path received)
Last update: Mon May 27 17:45:41 2024
65322 4206000170 57073
**fe80:14:fc01:1::1** (inaccessible, import-check enabled) from **fe80:14:fc01:1::1** (10.255.193.110)
(fe80:14:fc01:1::1) (used)
Origin IGP, invalid, external
Last update: Mon May 27 17:45:41 2024
Total number of prefixes 1
Question 1: why is the route from peer fe80:14:fc01:2::1 shown as a route from peer fe80:14:fc01:1::1?
The second question is probably related to the first: I have a big problem with installing the route into the RIB. Sometimes I have both routes:
B>* 2a03:720:1000::/36 [20/0] via fe80:14:fc01:1::1, ens13f0np0.80, weight 1, 00:11:59
  *                           via fe80:14:fc01:2::1, ens28f0np0.80, weight 1, 00:11:59
Sometimes one:
B>* 2a03:720:1000::/36 [20/0] via fe80:14:fc01:2::1, ens28f0np0.80, weight 1, 16:38:59
Sometimes none :-(
Help me please.
Can you enable debug bgp updates, debug bgp neighbor, and debug bgp nht, and then send us the logs?
Also, just in case, the output of the following commands would be handy too:
show ipv6 nht
show bgp nexthop
show bgp import-check-table
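If it helps to gather all of this in one go, here is a minimal Python sketch that runs each requested command through `vtysh -c` and keeps the output per command. The function and constant names (`vtysh_argv`, `collect`, `DEBUG_COMMANDS`, `SHOW_COMMANDS`) are made up for illustration, and the sketch assumes `vtysh` is on the PATH and the script runs with enough privileges to talk to the daemons.

```python
import subprocess

# Commands exactly as requested in this thread.
DEBUG_COMMANDS = ["debug bgp updates", "debug bgp neighbor", "debug bgp nht"]
SHOW_COMMANDS = ["show ipv6 nht", "show bgp nexthop", "show bgp import-check-table"]


def vtysh_argv(command):
    """Build the argument list for running one command via `vtysh -c`."""
    return ["vtysh", "-c", command]


def collect(commands, run=subprocess.run):
    """Run each command and return a {command: stdout} mapping.

    `run` is injectable so the function can be exercised without FRR installed.
    """
    results = {}
    for command in commands:
        proc = run(vtysh_argv(command), capture_output=True, text=True, check=False)
        results[command] = proc.stdout
    return results
```

On the router, calling `collect(DEBUG_COMMANDS)` first and then printing the items of `collect(SHOW_COMMANDS)` would produce one text blob per command that can be pasted into the issue.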
Done
VRF default:
Resolve via default: on
fe80:14:fc01:1::1(Connected)
resolved via connected
is directly connected, ens13f0np0.80 (vrf default)
Client list: bgp(fd 18)
fe80:14:fc01:2::1(Connected)
resolved via connected
is directly connected, ens28f0np0.80 (vrf default)
Client list: bgp(fd 18)
el-fw1.cdnwb.ru# show bgp nexthop
Current BGP nexthop cache:
fe80:14:fc01:1::1 valid [IGP metric 0], #paths 0, peer fe80:14:fc01:1::1
if ens13f0np0.80
Last update: Mon May 27 16:26:58 2024
fe80:14:fc01:2::1 valid [IGP metric 0], #paths 1, peer fe80:14:fc01:2::1
if ens28f0np0.80
Last update: Mon May 27 16:35:25 2024
fe80:14:fc01:1::1 invalid, #paths 1
Must be Connected
Last update: Wed May 22 17:20:29 2024
el-fw1.cdnwb.ru# show bgp import-check-table
Current BGP import check cache:
el-fw1.cdnwb.ru#
You have something strange in the next-hop cache:
fe80:14:fc01:1::1 valid [IGP metric 0], #paths 0, peer fe80:14:fc01:1::1
if ens13f0np0.80
Last update: Mon May 27 16:26:58 2024
fe80:14:fc01:1::1 invalid, #paths 1
Must be Connected
Last update: Wed May 22 17:20:29 2024
There are two entries for the same next hop, but one is invalid, and its last update is much older. Does this bad behavior happen even right after the router is restarted, or does it start after some time?
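For anyone who wants to check for this condition automatically, below is a rough Python sketch that scans `show bgp nexthop` text and reports next hops listed more than once with different validity. It assumes the line layout pasted in this thread (address, then `valid` or `invalid`); the names `nexthop_entries` and `conflicting_nexthops` are invented for the example and are not part of FRR.

```python
import ipaddress
import re
from collections import defaultdict

# An entry line starts with an address followed by "valid" or "invalid".
ENTRY = re.compile(r"^\s*(\S+)\s+(valid|invalid)\b")

# Output copied from this thread.
SAMPLE = """\
Current BGP nexthop cache:
fe80:14:fc01:1::1 valid [IGP metric 0], #paths 0, peer fe80:14:fc01:1::1
if ens13f0np0.80
Last update: Mon May 27 16:26:58 2024
fe80:14:fc01:2::1 valid [IGP metric 0], #paths 1, peer fe80:14:fc01:2::1
if ens28f0np0.80
Last update: Mon May 27 16:35:25 2024
fe80:14:fc01:1::1 invalid, #paths 1
Must be Connected
Last update: Wed May 22 17:20:29 2024
"""


def nexthop_entries(text):
    """Map each next-hop address to the list of states seen for it, in order."""
    entries = defaultdict(list)
    for line in text.splitlines():
        match = ENTRY.match(line)
        if not match:
            continue
        try:
            addr = ipaddress.ip_address(match.group(1))
        except ValueError:
            continue  # first token is not an IP address, so not an entry line
        entries[str(addr)].append(match.group(2))
    return dict(entries)


def conflicting_nexthops(text):
    """Return next hops that appear with more than one distinct state."""
    return {a: s for a, s in nexthop_entries(text).items() if len(set(s)) > 1}
```

With the output above, `conflicting_nexthops(SAMPLE)` flags only fe80:14:fc01:1::1, which is the entry that shows up both as valid and as invalid.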
I don't know whether this is related or not.
I had a similar issue after restoring a config from 9.1 on 10.0 (where enforce-first-as is the default). After entering no neighbor XXX enforce-first-as, the peer still showed an oddly low number of received routes. Clear ip bgp did not help either; it was only solved by neighbor XXX shutdown followed by no shutdown.
So the command no neighbor XXX enforce-first-as needs a shutdown and no shutdown of the peer before it is applied.
It's a new router with a new IPv6 design. I ran into this problem right after the FRR and host configuration was completed. There was one period of about 15 minutes when everything worked. It seems that the situation can change after FRR is restarted: both next hops can become invalid, for example, or both can work; anything is possible. By the way, the next-hop table currently looks like this:
el-fw1.cdnwb.ru# sh bgp nexthop
Current BGP nexthop cache:
fe80:14:fc01:1::1 valid [IGP metric 0], #paths 0, peer fe80:14:fc01:1::1
if ens13f0np0.80
Last update: Mon May 27 16:26:58 2024
fe80:14:fc01:2::1 valid [IGP metric 0], #paths 1, peer fe80:14:fc01:2::1
if ens28f0np0.80
Last update: Mon May 27 16:35:25 2024
fe80:14:fc01:1::1 invalid, #paths 1
Must be Connected
Last update: Wed May 22 17:20:29 2024
el-fw1.cdnwb.ru#
Regarding the suggestion to run neighbor XXX shutdown and no shutdown after no neighbor XXX enforce-first-as: sorry, it didn't help me.