FRRouting / frr

The FRRouting Protocol Suite

Home Page: https://frrouting.org/

EVPN type-2 MACIP not imported into other VRFs by originating node, only at receivers

toreanderson opened this issue

Description

I am not 100% certain whether this is a bug or expected behaviour. In any case, it leads to suboptimal routing, so if it is not a bug, it could at least be considered a feature request.

When importing routes from an EVPN VRF into another VRF (such as the default VRF), type-2 MACIP routes (containing an IP address) do not get imported into the target VRF on the node that originated the type-2 MACIP route (i.e., the one that has the MAC/IP in the type-2 route locally attached). They are, however, imported on all other nodes.

This leads to suboptimal routing, as traffic from the external network is always directed to a node where the MAC/IP destination is not present. From there it is encapsulated in VXLAN and sent to the target node.

It would be better if the node where the MAC/IP is present also imported the route into the target VRF and re-advertised it there, as that path would be preferred by the external network (due to its shorter AS path).

Version

FRRouting 10.1-dev (frrtest) on Linux(6.1.0-21-amd64).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
configured with:
    '--build=x86_64-linux-gnu' '--prefix=/usr' '--includedir=${prefix}/include' '--mandir=${prefix}/share/man' '--infodir=${prefix}/share/info' '--sysconfdir=/etc' '--localstatedir=/var' '--disable-option-checking' '--disable-silent-rules' '--libdir=${prefix}/lib/x86_64-linux-gnu' '--libexecdir=${prefix}/lib/x86_64-linux-gnu' '--disable-maintainer-mode' '--sbindir=/usr/lib/frr' '--with-vtysh-pager=/usr/bin/pager' '--libdir=/usr/lib/x86_64-linux-gnu/frr' '--with-moduledir=/usr/lib/x86_64-linux-gnu/frr/modules' '--disable-dependency-tracking' '--enable-rpki' '--enable-scripting' '--enable-pim6d' '--with-libpam' '--enable-doc' '--enable-doc-html' '--enable-snmp' '--enable-fpm' '--disable-protobuf' '--disable-zeromq' '--enable-ospfapi' '--enable-bgp-vnc' '--enable-multipath=256' '--enable-user=frr' '--enable-group=frr' '--enable-vty-group=frrvty' '--enable-configfile-mask=0640' '--enable-logfile-mask=0640' '--enable-sharpd' 'build_alias=x86_64-linux-gnu' 'PYTHON=python3'

(from frr_10.1-dev-master-ga24c805-20240604.084942-1~deb12u1_amd64.deb)

How to reproduce

To illustrate, I've set up a lab with three Debian 12 nodes connected in a triangle (full mesh between interfaces ens7 and ens8 on each node).

One of the nodes, frrtest, does not participate in EVPN - it represents the external network.

The other two nodes, frrtest2 and frrtest3, represent two EVPN routers with ASN 2 and 3, each with a single L3VNI (10) bound to VRF 10 and an IRB on L2VNI 100. Static MAC/ARP entries (192.168.0.2 and 192.168.0.3) represent downstream hosts on the L2VNI and cause the type-2 routes to be generated. The default VRF imports routes from VRF 10.

These are the scripts I use to configure the three nodes from scratch:

frrtest

vtysh <<EOF
configure

interface lo
 ip address 10.0.0.1/32

interface ens7
 no shutdown

interface ens8
 no shutdown

router bgp 1
 no bgp ebgp-requires-policy
 neighbor ens7 interface remote-as external
 neighbor ens8 interface remote-as external

 address-family ipv4 unicast
  network 10.0.0.1/32
  neighbor ens7 activate
  neighbor ens8 activate
 exit-address-family
EOF

frrtest2 and frrtest3

# ID=2 on frrtest2
# ID=3 on frrtest3
ID=${HOSTNAME#frrtest}

vtysh <<EOF
configure

interface lo
 ip address 10.0.0.$ID/32

interface ens7
 no shutdown

interface ens8
 no shutdown

vrf vrf10
 vni 10

router bgp $ID
 no bgp ebgp-requires-policy
 neighbor ens7 interface remote-as external
 neighbor ens8 interface remote-as external

 address-family ipv4 unicast
  network 10.0.0.$ID/32
  neighbor ens7 activate
  neighbor ens8 activate
  import vrf vrf10
 exit-address-family

 address-family l2vpn evpn
  advertise-all-vni
  neighbor ens8 activate
 exit-address-family

router bgp $ID vrf vrf10
 address-family ipv4 unicast
  redistribute connected
 exit-address-family

 address-family l2vpn evpn
  advertise ipv4 unicast
 exit-address-family
EOF

# L3VNI setup
ip link add up vrf10 type vrf table 10
ip link add up br10 type bridge
ip link set br10 master vrf10
ip link add up vni10 type vxlan id 10 local 10.0.0.$ID nolearning dstport 4789
ip link set vni10 master br10
bridge link set dev vni10 learning off

# L2VNI and IRB setup
ip link add up br100 type bridge
ip link set br100 master vrf10
ip link add up vni100 type vxlan id 100 local 10.0.0.$ID nolearning dstport 4789
ip link set vni100 master br100
bridge link set dev vni100 learning off
ip address add 192.168.0.1/24 dev br100

# Mock host setup (to cause type-2 MACIP advertisement)
ip link add up dummy0 type dummy
ip link set dummy0 master br100
bridge fdb add 02:00:00:00:00:$ID$ID dev dummy0 master static sticky
ip neigh add 192.168.0.$ID lladdr 02:00:00:00:00:$ID$ID dev br100

Expected behavior

frrtest should see a direct route to 192.168.0.2 via frrtest2, and a direct route to 192.168.0.3 via frrtest3.

Both frrtest2 and frrtest3 should see routes to 192.168.0.2 and 192.168.0.3 in the default VRF, with an empty AS path for the locally generated route and an AS-path length of one for the route received from the other EVPN node.

Actual behavior

frrtest only sees a route to 192.168.0.2 via frrtest3 (as-path 3 2) and to 192.168.0.3 via frrtest2 (as-path 2 3):

frrtest# show ip bgp 192.168.0.2
BGP routing table entry for 192.168.0.2/32, version 5
Paths: (1 available, best #1, table default)
  Advertised to non peer-group peers:
  ens7 ens8
  3 2
    ::ffff:a00:3 from ens8 (10.0.0.3)
    (fe80::f816:3eff:fe87:37c2) (used)
      Origin IGP, valid, external, best (First path received)
      Extended Community: ET:8 Rmac:0e:cd:2f:40:19:54
      Last update: Tue Jun  4 14:27:28 2024
frrtest# show ip bgp 192.168.0.3
BGP routing table entry for 192.168.0.3/32, version 6
Paths: (1 available, best #1, table default)
  Advertised to non peer-group peers:
  ens7 ens8
  2 3
    ::ffff:a00:2 from ens7 (10.0.0.2)
    (fe80::f816:3eff:fec0:6702) (used)
      Origin IGP, valid, external, best (First path received)
      Extended Community: ET:8 Rmac:86:42:e3:bc:7c:91
      Last update: Tue Jun  4 14:27:28 2024
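
For anyone who wants to check the AS path from a script rather than by eye, the following is a minimal sketch that pulls it out of the text output above. The function names as_path_of and as_path_len are hypothetical, and the parsing assumes the AS path is the first line consisting only of digits and spaces with a two-space indent, as in the output quoted here. Other FRR versions or locally originated paths may print it differently.

```shell
# Print the AS path of the first path found in `show ip bgp <prefix>` text
# read from stdin (assumed layout: two-space indent, digits and spaces only).
as_path_of() {
    awk '/^  [0-9][0-9 ]*$/ { sub(/^ +/, ""); sub(/ +$/, ""); print; exit }'
}

# Count the ASNs in that path (prints nothing if no AS path line is found).
as_path_len() {
    as_path_of | awk '{ print NF }'
}

# Sample input copied from the frrtest output above.
sample='BGP routing table entry for 192.168.0.2/32, version 5
Paths: (1 available, best #1, table default)
  Advertised to non peer-group peers:
  ens7 ens8
  3 2
    ::ffff:a00:3 from ens8 (10.0.0.3)'

printf '%s\n' "$sample" | as_path_of
printf '%s\n' "$sample" | as_path_len
```

On a live node the same functions could be fed from vtysh -c 'show ip bgp 192.168.0.2'. With the expected behaviour, frrtest would report a path length of 1 for each host route instead of 2.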

This is also visible on frrtest2 and frrtest3, which only have routes in the default VRF to the remote MACIP route, not to their own:

frrtest2# show ip route vrf default 192.168.0.0/24 longer-prefixes 
Codes: K - kernel route, C - connected, L - local, S - static,
       R - RIP, O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric, t - Table-Direct,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

B>* 192.168.0.0/24 [20/0] is directly connected, vrf10 (vrf vrf10), weight 1, 01:19:45
B>* 192.168.0.3/32 [20/0] via 10.0.0.3, br10 (vrf vrf10) onlink, weight 1, 01:19:36
frrtest3# show ip route vrf default 192.168.0.0/24 longer-prefixes
Codes: K - kernel route, C - connected, L - local, S - static,
       R - RIP, O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric, t - Table-Direct,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

B>* 192.168.0.0/24 [20/0] is directly connected, vrf10 (vrf vrf10), weight 1, 01:19:47
B>* 192.168.0.2/32 [20/0] via 10.0.0.2, br10 (vrf vrf10) onlink, weight 1, 01:19:45
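
The missing local host route can also be tested for mechanically. This is a rough sketch, assuming the prefix is the second whitespace-separated field of each route line as in the output above; has_route is a hypothetical helper name.

```shell
# Succeed if the given prefix appears as a route entry in `show ip route`
# text read from stdin (assumed layout: "<codes> <prefix> ...").
has_route() {
    awk -v prefix="$1" '$2 == prefix { found = 1 } END { exit !found }'
}

# Sample copied from the frrtest2 output above (legend lines omitted).
frrtest2_routes='B>* 192.168.0.0/24 [20/0] is directly connected, vrf10 (vrf vrf10), weight 1, 01:19:45
B>* 192.168.0.3/32 [20/0] via 10.0.0.3, br10 (vrf vrf10) onlink, weight 1, 01:19:36'

for prefix in 192.168.0.2/32 192.168.0.3/32; do
    if printf '%s\n' "$frrtest2_routes" | has_route "$prefix"; then
        echo "$prefix present"
    else
        echo "$prefix missing"
    fi
done
```

For the frrtest2 sample this reports 192.168.0.2/32 as missing and 192.168.0.3/32 as present, which is the behaviour described in this report.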

Additional context

Both frrtest2 and frrtest3 see their local and the remote MACIP route with the expected as-path lengths (empty for the locally generated type-2 route, one AS for the remotely generated one):

frrtest2

frrtest2# show bgp l2vpn evpn route type 2 
BGP table version is 3, local router ID is 10.0.0.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-1 prefix: [1]:[EthTag]:[ESI]:[IPlen]:[VTEP-IP]:[Frag-id]
EVPN type-2 prefix: [2]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
EVPN type-4 prefix: [4]:[ESI]:[IPlen]:[OrigIP]
EVPN type-5 prefix: [5]:[EthTag]:[IPlen]:[IP]

   Network          Next Hop            Metric LocPrf Weight Path
                    Extended Community
Route Distinguisher: 10.0.0.2:3
 *>  [2]:[0]:[48]:[02:00:00:00:00:22]
                    10.0.0.2                           32768 i
                    ET:8 RT:2:100 MM:0, sticky MAC
 *>  [2]:[0]:[48]:[02:00:00:00:00:22]:[32]:[192.168.0.2]
                    10.0.0.2                           32768 i
                    ET:8 RT:2:100 RT:2:10 Rmac:0e:cd:2f:40:19:54
Route Distinguisher: 10.0.0.3:3
 *>  [2]:[0]:[48]:[02:00:00:00:00:33]
                    10.0.0.3                               0 3 i
                    RT:3:100 ET:8 MM:0, sticky MAC
 *>  [2]:[0]:[48]:[02:00:00:00:00:33]:[32]:[192.168.0.3]
                    10.0.0.3                               0 3 i
                    RT:3:10 RT:3:100 ET:8 Rmac:86:42:e3:bc:7c:91

Displayed 4 prefixes (4 paths) (of requested type)

frrtest3

frrtest3# show bgp l2vpn evpn route type 2 
BGP table version is 3, local router ID is 10.0.0.3
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-1 prefix: [1]:[EthTag]:[ESI]:[IPlen]:[VTEP-IP]:[Frag-id]
EVPN type-2 prefix: [2]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
EVPN type-4 prefix: [4]:[ESI]:[IPlen]:[OrigIP]
EVPN type-5 prefix: [5]:[EthTag]:[IPlen]:[IP]

   Network          Next Hop            Metric LocPrf Weight Path
                    Extended Community
Route Distinguisher: 10.0.0.2:3
 *>  [2]:[0]:[48]:[02:00:00:00:00:22]
                    10.0.0.2                               0 2 i
                    RT:2:100 ET:8 MM:0, sticky MAC
 *>  [2]:[0]:[48]:[02:00:00:00:00:22]:[32]:[192.168.0.2]
                    10.0.0.2                               0 2 i
                    RT:2:10 RT:2:100 ET:8 Rmac:0e:cd:2f:40:19:54
Route Distinguisher: 10.0.0.3:3
 *>  [2]:[0]:[48]:[02:00:00:00:00:33]
                    10.0.0.3                           32768 i
                    ET:8 RT:3:100 MM:0, sticky MAC
 *>  [2]:[0]:[48]:[02:00:00:00:00:33]:[32]:[192.168.0.3]
                    10.0.0.3                           32768 i
                    ET:8 RT:3:100 RT:3:10 Rmac:86:42:e3:bc:7c:91

Displayed 4 prefixes (4 paths) (of requested type)
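
To list which MACIP routes a node originated itself, the tables above can be filtered on the weight column. This is only a sketch under two assumptions: the line after each prefix holds the next hop, weight and path in that order, and weight 32768 marks a locally originated route, as it does in these tables. The function name local_macip_ips is hypothetical, and only IPv4 entries (IP length 32) are matched.

```shell
# Print the IP of each type-2 MACIP route whose path line carries weight 32768.
local_macip_ips() {
    awk '
        /\[2\]:.*\]:\[32\]:\[/ {
            ip = $NF
            sub(/.*:\[/, "", ip)    # drop everything up to the last ":["
            sub(/\]$/, "", ip)      # drop the closing bracket
            # Read the following line (next hop, weight, path) and test the weight.
            if ((getline) > 0 && $2 == "32768") print ip
        }'
}

# Sample lines copied from the frrtest2 table above.
frrtest2_evpn=' *>  [2]:[0]:[48]:[02:00:00:00:00:22]
                    10.0.0.2                           32768 i
 *>  [2]:[0]:[48]:[02:00:00:00:00:22]:[32]:[192.168.0.2]
                    10.0.0.2                           32768 i
 *>  [2]:[0]:[48]:[02:00:00:00:00:33]:[32]:[192.168.0.3]
                    10.0.0.3                               0 3 i'

printf '%s\n' "$frrtest2_evpn" | local_macip_ips
```

Each IP printed this way is one that should also show up as a /32 in the default VRF of the same node if the local MACIP route were imported.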

I would be happy to give any interested developer access to this lab (including full sudo access).

Checklist

  • I have searched the open issues for this bug.
  • I have not included sensitive information in this report.