EVPN Enhancements
This section describes EVPN enhancements.
Define RDs and RTs
The RD and RTs for the layer 2 VNI are different from the tenant VRF RD and RTs. To define the tenant VRF RD and RTs, see Configure the RD and RTs for the Tenant VRF.
When FRR learns about a local VNI and there is no explicit configuration for that VNI in FRR, the switch derives the RD and import and export RTs for this VNI automatically. The RD uses RouterId:VNI-Index and the import and export RTs use AS:VNI. For routes that come from a layer 2 VNI (type-2 and type-3), the RD uses the VXLAN local tunnel IP address (vxlan-local-tunnelip) from the layer 2 VNI interface instead of the RouterId (vxlan-local-tunnelip:VNI). EVPN route exchange uses the RD and RTs.
The RD disambiguates EVPN routes in different VNIs (they can have the same MAC and IP address) while the RTs describe the VPN membership for the route. The VNI-Index for the RD is a unique number that the switch generates. It only has local significance; on remote switches, its only role is for route disambiguation. The switch uses this number instead of the VNI value itself because this number has to be less than or equal to 65535. In the RT, the AS is always a 2-byte value to allow room for a large VNI. If the router has a 4-byte AS, it only uses the lower 2 bytes. This ensures a unique RT for different VNIs while having the same RT for the same VNI across routers in the same AS.
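The derivation rules described above can be sketched in a few lines of Python. This is an illustrative model only, not FRR code: the function names derive_rd and derive_rt are hypothetical, and the masking of a 4-byte AS simply mirrors the sentence above rather than documenting FRR internals.

```python
import ipaddress


def derive_rd(router_id: str, vni_index: int) -> str:
    """Build an RD of the form RouterId:VNI-Index (hypothetical helper)."""
    ipaddress.IPv4Address(router_id)  # raises ValueError on a malformed address
    if not 0 <= vni_index <= 65535:
        raise ValueError("the VNI-Index must be less than or equal to 65535")
    return f"{router_id}:{vni_index}"


def derive_rt(asn: int, vni: int) -> str:
    """Build an RT of the form AS:VNI, keeping only the lower 2 bytes of the AS."""
    if not 0 < vni < 2**24:
        raise ValueError("a VXLAN VNI is a 24-bit value")
    return f"{asn & 0xFFFF}:{vni}"


print(derive_rd("10.10.10.1", 6))
print(derive_rt(65101, 10))
```

Because the RT depends only on the AS and the VNI, two routers in the same AS produce the same RT for the same VNI, while the RD stays unique per router.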
For eBGP EVPN peering, the peers are in a different AS so using an automatic RT of AS:VNI does not work for route import. Therefore, Cumulus Linux treats the import RT as *:VNI to determine which received routes apply to a particular VNI. This only applies when the switch auto-derives the import RT.
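A minimal sketch of the *:VNI import behavior, assuming RTs are written as AS:VNI strings. The function name rt_matches_vni is hypothetical; the sketch only models the matching rule described above.

```python
def rt_matches_vni(received_rt: str, local_vni: int) -> bool:
    """Return True when the VNI part of the RT equals the local VNI.

    The AS part is ignored, which models an import RT of *:VNI.
    """
    _, sep, value = received_rt.rpartition(":")
    return bool(sep) and value.isdigit() and int(value) == local_vni


# A route exported by a peer in AS 65102 still matches local VNI 10.
print(rt_matches_vni("65102:10", 10))
print(rt_matches_vni("65102:20", 10))
```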
If you do not want to derive RDs and RTs (layer 2 RTs) automatically, you can define them manually. The following example commands are per VNI.
cumulus@leaf01:~$ nv set evpn vni 10 rd 10.10.10.1:20
cumulus@leaf01:~$ nv set evpn vni 10 route-target export 65101:10
cumulus@leaf01:~$ nv set evpn vni 10 route-target import 65102:10
cumulus@leaf01:~$ nv config apply
cumulus@leaf03:~$ nv set evpn vni 10 rd 10.10.10.3:20
cumulus@leaf03:~$ nv set evpn vni 10 route-target export 65102:10
cumulus@leaf03:~$ nv set evpn vni 10 route-target import 65101:10
cumulus@leaf03:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# vni 10
leaf01(config-router-af-vni)# rd 10.10.10.1:20
leaf01(config-router-af-vni)# route-target export 65101:10
leaf01(config-router-af-vni)# route-target import 65102:10
leaf01(config-router-af-vni)# exit
leaf01(config-router-af)# advertise-all-vni
leaf01(config-router-af)# end
leaf01# write memory
leaf01# exit
The vtysh commands create the following configuration snippet in the /etc/frr/frr.conf file:
...
address-family l2vpn evpn
advertise-all-vni
vni 10
rd 10.10.10.1:20
route-target export 65101:10
route-target import 65102:10
...
cumulus@leaf03:~$ sudo vtysh
...
leaf03# configure terminal
leaf03(config)# router bgp 65102
leaf03(config-router)# address-family l2vpn evpn
leaf03(config-router-af)# vni 10
leaf03(config-router-af-vni)# rd 10.10.10.3:20
leaf03(config-router-af-vni)# route-target export 65102:10
leaf03(config-router-af-vni)# route-target import 65101:10
leaf03(config-router-af-vni)# exit
leaf03(config-router-af)# advertise-all-vni
leaf03(config-router-af)# end
leaf03# write memory
leaf03# exit
The vtysh commands create the following configuration snippet in the /etc/frr/frr.conf file:
...
address-family l2vpn evpn
advertise-all-vni
vni 10
rd 10.10.10.3:20
route-target export 65102:10
route-target import 65101:10
- If you delete the RD or RT later, it reverts to its corresponding default value.
- Route target auto derivation does not support 4-byte AS numbers; if the router has a 4-byte AS, you must define the RTs manually.
You can configure multiple RT values. In addition, you can configure both the import and export route targets with a single command by using route-target both:
cumulus@leaf01:~$ nv set evpn vni 10 route-target import 65102:10
cumulus@leaf01:~$ nv set evpn vni 10 route-target import 65102:20
cumulus@leaf01:~$ nv set evpn vni 20 route-target both 65101:10
cumulus@leaf01:~$ nv config apply
cumulus@leaf03:~$ nv set evpn vni 10 route-target import 65101:10
cumulus@leaf03:~$ nv set evpn vni 10 route-target import 65101:20
cumulus@leaf03:~$ nv set evpn vni 20 route-target both 65102:10
cumulus@leaf03:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# vni 10
leaf01(config-router-af-vni)# route-target import 65102:10
leaf01(config-router-af-vni)# route-target import 65102:20
leaf01(config-router-af-vni)# exit
leaf01(config-router-af)# vni 20
leaf01(config-router-af-vni)# route-target both 65101:10
leaf01(config-router-af-vni)# end
leaf01# write memory
leaf01# exit
The vtysh commands create the following configuration snippet in the /etc/frr/frr.conf file:
...
address-family l2vpn evpn
vni 10
route-target import 65102:10
route-target import 65102:20
vni 20
route-target import 65101:10
route-target export 65101:10
...
cumulus@leaf03:~$ sudo vtysh
...
leaf03# configure terminal
leaf03(config)# router bgp 65102
leaf03(config-router)# address-family l2vpn evpn
leaf03(config-router-af)# vni 10
leaf03(config-router-af-vni)# route-target import 65101:10
leaf03(config-router-af-vni)# route-target import 65101:20
leaf03(config-router-af-vni)# exit
leaf03(config-router-af)# vni 20
leaf03(config-router-af-vni)# route-target both 65102:10
leaf03(config-router-af-vni)# end
leaf03# write memory
leaf03# exit
The vtysh commands create the following configuration snippet in the /etc/frr/frr.conf file:
...
address-family l2vpn evpn
vni 10
route-target import 65101:10
route-target import 65101:20
vni 20
route-target import 65102:10
route-target export 65102:10
...
Enable EVPN in an iBGP Environment with an OSPF Underlay
You can use EVPN with an OSPF or static route underlay. This is a more complex configuration than using eBGP. In this case, iBGP advertises EVPN routes directly between VTEPs and the spines are unaware of EVPN or BGP.
The leafs peer with each other in a full mesh within the EVPN address family without using route reflectors. The leafs generally peer to each other's loopback addresses, which OSPF advertises. The receiving VTEP imports routes into a specific VNI with a matching route target community.
cumulus@leaf01:~$ nv set router bgp autonomous-system 65101
cumulus@leaf01:~$ nv set router bgp router-id 10.10.10.1
cumulus@leaf01:~$ nv set vrf default router bgp neighbor 10.10.10.2 remote-as internal
cumulus@leaf01:~$ nv set vrf default router bgp neighbor 10.10.10.3 remote-as internal
cumulus@leaf01:~$ nv set vrf default router bgp neighbor 10.10.10.4 remote-as internal
cumulus@leaf01:~$ nv set evpn enable on
cumulus@leaf01:~$ nv set vrf default router bgp address-family l2vpn-evpn enable on
cumulus@leaf01:~$ nv set vrf default router bgp neighbor 10.10.10.2 address-family l2vpn-evpn enable on
cumulus@leaf01:~$ nv set vrf default router bgp neighbor 10.10.10.3 address-family l2vpn-evpn enable on
cumulus@leaf01:~$ nv set vrf default router bgp neighbor 10.10.10.4 address-family l2vpn-evpn enable on
cumulus@leaf01:~$ nv set vrf default router ospf router-id 10.10.10.1
cumulus@leaf01:~$ nv set vrf default router ospf area 0 network 10.10.10.1/32
cumulus@leaf01:~$ nv set interface lo router ospf passive on
cumulus@leaf01:~$ nv set interface swp49 router ospf area 0.0.0.0
cumulus@leaf01:~$ nv set interface swp50 router ospf area 0.0.0.0
cumulus@leaf01:~$ nv set interface swp51 router ospf area 0.0.0.0
cumulus@leaf01:~$ nv set interface swp52 router ospf area 0.0.0.0
cumulus@leaf01:~$ nv set interface swp49 router ospf network-type point-to-point
cumulus@leaf01:~$ nv set interface swp50 router ospf network-type point-to-point
cumulus@leaf01:~$ nv set interface swp51 router ospf network-type point-to-point
cumulus@leaf01:~$ nv set interface swp52 router ospf network-type point-to-point
cumulus@leaf01:~$ nv config apply
NVUE creates the following configuration snippet in the /etc/nvue.d/startup.yaml file:
cumulus@leaf01:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    interface:
      lo:
        ip:
          address:
            10.10.10.1/32: {}
        router:
          ospf:
            area: 0
            enable: on
            network-type: point-to-point
        type: loopback
      swp49:
        router:
          ospf:
            area: 0.0.0.0
            enable: on
        type: swp
      swp50:
        router:
          ospf:
            area: 0.0.0.0
            enable: on
            network-type: point-to-point
        type: swp
      swp51:
        router:
          ospf:
            area: 0.0.0.0
            enable: on
            network-type: point-to-point
        type: swp
      swp52:
        router:
          ospf:
            area: 0.0.0.0
            enable: on
            network-type: point-to-point
        type: swp
    bridge:
      domain:
        br_default:
          multicast:
            snooping:
              enable: off
              querier:
                enable: on
    router:
      bgp:
        autonomous-system: 65101
        enable: on
        router-id: 10.10.10.1
      ospf:
        router-id: 10.10.10.1
        enable: on
    vrf:
      default:
        router:
          bgp:
            peer:
              10.10.10.2:
                remote-as: internal
                type: numbered
                address-family:
                  l2vpn-evpn:
                    enable: on
              10.10.10.3:
                remote-as: internal
                type: numbered
                address-family:
                  l2vpn-evpn:
                    enable: on
              10.10.10.4:
                remote-as: internal
                type: numbered
                address-family:
                  l2vpn-evpn:
                    enable: on
            enable: on
            address-family:
              l2vpn-evpn:
                enable: on
    evpn:
      enable: on
    nve:
      vxlan:
        enable: on
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# neighbor 10.10.10.2 remote-as internal
leaf01(config-router)# neighbor 10.10.10.3 remote-as internal
leaf01(config-router)# neighbor 10.10.10.4 remote-as internal
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# neighbor 10.10.10.2 activate
leaf01(config-router-af)# neighbor 10.10.10.3 activate
leaf01(config-router-af)# neighbor 10.10.10.4 activate
leaf01(config-router-af)# advertise-all-vni
leaf01(config-router-af)# exit
leaf01(config-router)# exit
leaf01(config)# router ospf
leaf01(config-router)# router-id 10.10.10.1
leaf01(config-router)# passive-interface lo
leaf01(config-router)# exit
leaf01(config)# interface lo
leaf01(config-if)# ip ospf area 0.0.0.0
leaf01(config-if)# exit
leaf01(config)# interface swp49
leaf01(config-if)# ip ospf area 0.0.0.0
leaf01(config-if)# ip ospf network point-to-point
leaf01(config-if)# exit
leaf01(config)# interface swp50
leaf01(config-if)# ip ospf area 0.0.0.0
leaf01(config-if)# ip ospf network point-to-point
leaf01(config-if)# exit
leaf01(config)# interface swp51
leaf01(config-if)# ip ospf area 0.0.0.0
leaf01(config-if)# ip ospf network point-to-point
leaf01(config-if)# exit
leaf01(config)# interface swp52
leaf01(config-if)# ip ospf area 0.0.0.0
leaf01(config-if)# ip ospf network point-to-point
leaf01(config-if)# end
leaf01# write memory
leaf01# exit
The vtysh commands create the following configuration snippet in the /etc/frr/frr.conf file:
...
interface lo
ip ospf area 0.0.0.0
!
interface swp49
ip ospf area 0.0.0.0
ip ospf network point-to-point
!
interface swp50
ip ospf area 0.0.0.0
ip ospf network point-to-point
!
interface swp51
ip ospf area 0.0.0.0
ip ospf network point-to-point
!
interface swp52
ip ospf area 0.0.0.0
ip ospf network point-to-point
!
router bgp 65101
neighbor 10.10.10.2 remote-as internal
neighbor 10.10.10.3 remote-as internal
neighbor 10.10.10.4 remote-as internal
!
address-family l2vpn evpn
neighbor 10.10.10.2 activate
neighbor 10.10.10.3 activate
neighbor 10.10.10.4 activate
advertise-all-vni
exit-address-family
!
router ospf
ospf router-id 10.10.10.1
passive-interface lo
...
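Because a full mesh needs one neighbor statement per remote leaf, the per-neighbor NVUE commands shown in this section become repetitive as the fabric grows. The following sketch generates them from a list of loopback addresses. The helper name full_mesh_commands and the address list are illustrative assumptions; the command strings follow the same form as the NVUE example above.

```python
def full_mesh_commands(local: str, loopbacks: list[str], vrf: str = "default") -> list[str]:
    """Return NVUE commands that peer `local` with every other loopback over iBGP."""
    commands = []
    for peer in loopbacks:
        if peer == local:
            continue  # a leaf does not peer with itself
        base = f"nv set vrf {vrf} router bgp neighbor {peer}"
        commands.append(f"{base} remote-as internal")
        commands.append(f"{base} address-family l2vpn-evpn enable on")
    return commands


# Example loopbacks for four leafs (assumed values matching this section).
leafs = ["10.10.10.1", "10.10.10.2", "10.10.10.3", "10.10.10.4"]
for line in full_mesh_commands("10.10.10.1", leafs):
    print(line)
```

Review the generated commands before you run them, then finish with nv config apply as in the example above.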
ARP and ND Suppression
ARP suppression with EVPN allows a VTEP to suppress ARP flooding over VXLAN tunnels as much as possible. A local proxy handles ARP requests from locally attached hosts for remote hosts. ARP suppression is for IPv4; ND suppression is for IPv6.
Cumulus Linux enables ARP and ND suppression by default on all VNIs to reduce ARP and ND packet flooding over VXLAN tunnels; however, you must configure layer 3 interfaces (SVIs) for ARP and ND suppression to work with EVPN.
- ARP and ND suppression only suppresses the flooding of known hosts. To disable all flooding refer to the Disable BUM Flooding section.
- NVIDIA recommends that you keep ARP and ND suppression enabled on all VXLAN interfaces on the switch. If you must disable suppression for a special use case, you cannot disable ARP and ND suppression on some VXLAN interfaces but not others.
- When deploying EVPN and VXLAN using a hardware profile other than the default forwarding table profile, ensure that both the soft maximum and hard maximum garbage collection threshold settings have a value larger than the number of neighbor (ARP and ND) entries you expect in your deployment. Refer to Global Timer Settings.
ND Suppression and IPv6 Address Reuse
If you disable ND suppression and reuse IPv6 addresses, IPv6 duplicate address detection fails and the address remains tentative and unusable. The following example shows an IPv6 duplicate address detection failure on vlan10:
cumulus@switch:~$ ip address show vlan10 | grep dad
inet6 2001:db8::1/32 scope global dadfailed tentative
To prevent IPv6 duplicate address detection from failing, you can either disable IPv6 duplicate address detection globally or on the interface address.
To disable IPv6 duplicate address detection globally, add the following lines in the /etc/sysctl.conf file, then reboot the switch.
cumulus@switch:~$ sudo nano /etc/sysctl.conf
...
net.ipv6.conf.default.accept_dad = 0
To disable IPv6 duplicate address detection on an interface address, create an NVUE snippet, then patch and apply the configuration. The following snippet disables duplicate address detection on vlan10 with the IP address 2001:db8::1/32:
cumulus@switch:~$ sudo nano DisableDadVlan10.yaml
- set:
    system:
      config:
        snippet:
          ifupdown2_eni:
            vlan10: |
              post-up ip address add 2001:db8::1/32 dev vlan10 nodad
cumulus@switch:~$ nv config patch DisableDadVlan10.yaml
cumulus@switch:~$ nv config apply
You do not need to reboot the switch after you create and apply the snippet.
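To check for the failure across many interfaces, you can scan the output of ip address show for the dadfailed flag. The following is a minimal sketch that parses text you already captured; the function name dad_failed_addresses is hypothetical, and the sample line is the one shown earlier in this section.

```python
def dad_failed_addresses(ip_output: str) -> list[str]:
    """Return the IPv6 addresses whose line carries the dadfailed flag."""
    failed = []
    for line in ip_output.splitlines():
        fields = line.split()
        # Address lines look like: inet6 <address>/<prefix> scope ... [flags]
        if len(fields) >= 2 and fields[0] == "inet6" and "dadfailed" in fields:
            failed.append(fields[1])
    return failed


sample = "inet6 2001:db8::1/32 scope global dadfailed tentative"
print(dad_failed_addresses(sample))
```

An empty list after you apply one of the two workarounds suggests that no address on the captured interfaces is stuck in the failed state.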
ARP and ND Suppression and Centralized Routing
In a centralized routing deployment, you must configure layer 3 interfaces even if you configure the switch only for layer 2 (you are not using VXLAN routing). To avoid installing unnecessary layer 3 information, you can turn off IP forwarding.
The following example commands turn off IPv4 and IPv6 forwarding on VLAN 10 and VLAN 20.
cumulus@leaf01:~$ nv set interface vlan10 ip ipv4 forward off
cumulus@leaf01:~$ nv set interface vlan10 ip ipv6 forward off
cumulus@leaf01:~$ nv set interface vlan20 ip ipv4 forward off
cumulus@leaf01:~$ nv set interface vlan20 ip ipv6 forward off
cumulus@leaf01:~$ nv config apply
Edit the /etc/network/interfaces file.
cumulus@leaf01:~$ sudo nano /etc/network/interfaces
...
auto vlan10
iface vlan10
ip6-forward off
ip-forward off
vlan-id 10
vlan-raw-device bridge
auto vlan20
iface vlan20
ip6-forward off
ip-forward off
vlan-id 20
vlan-raw-device bridge
auto vni10
iface vni10
bridge-access 10
vxlan-id 10
bridge-learning off
auto vni20
iface vni20
bridge-access 20
vxlan-id 20
bridge-learning off
...
For a bridge in traditional mode, you must edit the bridge configuration in the /etc/network/interfaces file using a text editor:
cumulus@leaf01:~$ sudo nano /etc/network/interfaces
...
auto bridge1
iface bridge1
bridge-ports swp1.10 swp2.10 vni10
ip6-forward off
ip-forward off
...
Disable ARP and ND Suppression
NVIDIA recommends that you keep ARP and ND suppression on to reduce ARP and ND packet flooding over VXLAN tunnels. However, if you do need to disable ARP and ND suppression, run the NVUE nv set nve vxlan arp-nd-suppress off command or set bridge-arp-nd-suppress off in the /etc/network/interfaces file:
cumulus@leaf01:~$ nv set nve vxlan arp-nd-suppress off
cumulus@leaf01:~$ nv config apply
Edit the /etc/network/interfaces file to set bridge-arp-nd-suppress off on the VXLAN device, then run the ifreload -a command.
cumulus@leaf01:~$ sudo nano /etc/network/interfaces
...
auto vxlan48
iface vxlan48
bridge-vlan-vni-map 10=10 20=20 30=30 4036=4002 4024=4001
bridge-learning off
bridge-arp-nd-suppress off
...
cumulus@leaf01:~$ sudo ifreload -a
The neighbor manager service relies on ARP and ND suppression to snoop on packets and update forwarding entries based on neighbor changes. If you disable suppression, you must enable the neighbor manager snooper manually:
- Create the systemd override configuration file /etc/systemd/system/neighmgrd.service with the following content:
[Service]
ExecStart=/usr/bin/neighmgrd --snoop-all-bridges
- Reload the systemd unit configuration with the sudo systemctl daemon-reload command.
- Restart the neighmgrd service with the sudo systemctl restart neighmgrd.service command.
Configure Static MAC Addresses
You can configure a MAC address that you intend to pin to a particular VTEP on the VTEP as a static bridge FDB entry. EVPN picks up these MAC addresses and advertises them to peers as remote static MACs. You configure static bridge FDB entries for MAC addresses under the bridge configuration:
Edit the /etc/network/interfaces file. For example:
cumulus@leaf01:~$ sudo nano /etc/network/interfaces
...
auto bridge
iface bridge
bridge-ports bond1 vni10
bridge-vids 10
bridge-vlan-aware yes
post-up bridge fdb add 26:76:e6:93:32:78 dev bond1 vlan 10 master static sticky
...
For a bridge in traditional mode, you must edit the bridge configuration in the /etc/network/interfaces file using a text editor:
cumulus@leaf01:~$ sudo nano /etc/network/interfaces
...
auto br10
iface br10
bridge-ports swp1.10 vni10
post-up bridge fdb add 26:76:e6:93:32:78 dev swp1.10 master static sticky
...
Configure a Site ID for MLAG
When you use EVPN with MLAG, EVPN might install local MAC addresses or neighbor entries as remote entries. To prevent EVPN from taking ownership of local MAC addresses or neighbor entries from MLAG, you can associate all local layer 2 VNIs with a unique site ID, which represents an MLAG pair.
When you configure a site ID, Cumulus Linux:
- Adds a Site-of-Origin extended community encoded with the local site ID to EVPN routes that originate from local layer 2 VNIs. Cumulus Linux adds the Site-of-Origin extended community when creating the route.
- Filters all received EVPN routes with a Site-of-Origin extended community that matches the local site ID. Cumulus Linux filters the routes when importing the routes from the global table to the layer 2 VNI or layer 3 VNI table.
The site ID is in the format <IPv4 address>:<2-byte Value>, where the IPv4 address is the anycast IP address (a virtual IP address for VXLAN data-path termination) and the 2-byte value is an integer between 0 and 65535. For example: 10.0.1.12:10.
cumulus@leaf01:~$ nv set evpn mac-vrf-soo 10.0.1.12:10
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# mac-vrf soo 10.0.1.12:10
leaf01(config-router-af)# end
leaf01# write memory
leaf01# exit
NVIDIA recommends you do not configure a site ID on a standalone or multihoming VTEP.
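The site ID format described above can be checked before you push it to a switch. The following is a minimal sketch using only the Python standard library; the function name parse_site_id is hypothetical and the check covers the format only, not whether the address is really the anycast IP of the MLAG pair.

```python
import ipaddress


def parse_site_id(site_id: str) -> tuple[str, int]:
    """Split '<IPv4 address>:<2-byte value>' and validate both parts."""
    address, sep, value = site_id.rpartition(":")
    if not sep:
        raise ValueError("expected <IPv4 address>:<2-byte value>")
    ipaddress.IPv4Address(address)  # raises ValueError if not a valid IPv4 address
    number = int(value)             # raises ValueError if not an integer
    if not 0 <= number <= 65535:
        raise ValueError("the 2-byte value must be between 0 and 65535")
    return address, number


print(parse_site_id("10.0.1.12:10"))
```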
Filter EVPN Routes
It is common to subdivide the data center into multiple pods with full host mobility within a pod but only do prefix-based routing across pods. You can achieve this by only exchanging EVPN type-5 routes across pods.
The following example commands configure EVPN to advertise type-5 routes:
cumulus@leaf01:~$ nv set router policy route-map map1 rule 10 match type ipv4
cumulus@leaf01:~$ nv set router policy route-map map1 rule 10 match evpn-route-type ip-prefix
cumulus@leaf01:~$ nv set router policy route-map map1 rule 10 action permit
cumulus@leaf01:~$ nv set vrf default router bgp address-family ipv4-unicast route-export to-evpn route-map map1
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# route-map map1 permit 1
leaf01(config-route-map)# match evpn route-type prefix
leaf01(config-route-map)# end
leaf01# write memory
leaf01# exit
You must apply the route map for the configuration to take effect. See Route Maps for more information.
In many situations, it is also desirable to only exchange EVPN routes carrying a particular VXLAN ID. For example, if data centers or pods within a data center only share certain tenants, you can use a route map to control the EVPN routes exchanged based on the VNI.
The following example configures a route map that only advertises EVPN routes from VNI 1000:
cumulus@switch:~$ nv set router policy route-map map1 rule 10 match evpn-vni 1000
cumulus@switch:~$ nv set router policy route-map map1 rule 10 action permit
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# route-map map1 permit 1
switch(config-route-map)# match evpn vni 1000
switch(config-route-map)# end
switch# write memory
switch# exit
You can only match type-2 and type-5 routes based on VNI.
BGP Neighbor Prefix Limits for EVPN
Cumulus Linux provides commands to control the number of inbound prefixes allowed from a BGP neighbor for EVPN.
To configure inbound prefix limits, set:
- The maximum inbound prefix limit from the BGP neighbor. You can set a value between 0 and 4294967295 or none.
- When to generate a warning syslog message and bring down the BGP session. This is a percentage of the maximum inbound prefix limit. You can set a value between 0 and 100. Alternatively, you can configure the switch to generate a warning syslog message only without bringing down the BGP session.
- The time in seconds to wait before establishing the BGP session again with the neighbor. The default value is auto, which uses standard BGP timers and processing (typically between 2 and 3 seconds). You can set a value between 1 and 65535.
Before you configure a prefix limit, determine how many routes the remote BGP neighbor typically sends and set a threshold that is slightly higher than the number of BGP prefixes you expect to receive during normal operations.
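The sizing advice above comes down to simple arithmetic. The following sketch shows one way to work it out; the function names and the 20 percent headroom are assumptions chosen for illustration, and the exact rounding that the switch applies to the warning threshold is not described here.

```python
def suggest_prefix_limit(expected_prefixes: int, headroom_percent: int = 20) -> int:
    """Return a limit slightly above the expected prefix count (rounded up)."""
    return (expected_prefixes * (100 + headroom_percent) + 99) // 100


def warning_count(maximum: int, threshold_percent: int) -> float:
    """Return the prefix count that corresponds to the warning threshold."""
    return maximum * threshold_percent / 100


limit = suggest_prefix_limit(1000)  # 1000 expected prefixes, 20 percent headroom
print(limit)
print(warning_count(limit, 75))     # count at which a 75 percent threshold is reached
```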
The following example sets the maximum inbound prefix limit from the neighbor swp51 to 3, generates a warning syslog message and brings down the BGP session when the number of prefixes received reaches 50 percent of the maximum limit. After 60 seconds, the BGP session with the peer reestablishes.
cumulus@switch:~$ nv set vrf default router bgp neighbor swp51 address-family l2vpn-evpn prefix-limits inbound maximum 3
cumulus@switch:~$ nv set vrf default router bgp neighbor swp51 address-family l2vpn-evpn prefix-limits inbound warning-threshold 50
cumulus@switch:~$ nv set vrf default router bgp neighbor swp51 address-family l2vpn-evpn prefix-limits inbound reestablish-wait 60
cumulus@switch:~$ nv config apply
The following example sets the maximum inbound prefix limit from peer swp51 to 3 and generates a warning syslog message only (without bringing down the BGP session) when the number of prefixes received reaches 50 percent of the maximum limit.
cumulus@switch:~$ nv set vrf default router bgp neighbor swp51 address-family l2vpn-evpn prefix-limits inbound maximum 3
cumulus@switch:~$ nv set vrf default router bgp neighbor swp51 address-family l2vpn-evpn prefix-limits inbound warning-threshold 50
cumulus@switch:~$ nv set vrf default router bgp neighbor swp51 address-family l2vpn-evpn prefix-limits inbound warning-only on
cumulus@switch:~$ nv config apply
The following example sets the maximum inbound prefix limit from the neighbor swp51 to 3, generates a warning syslog message and brings down the BGP session when the number of prefixes received reaches 50 percent of the maximum limit. After 1 minute, the BGP session with the peer reestablishes.
cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# router bgp 65101
switch(config-router)# address-family l2vpn evpn
switch(config-router-af)# neighbor swp51 maximum-prefix 3 50 restart 1
switch(config-router-af)# end
switch# write memory
switch# exit
You can use the force option (neighbor swp51 maximum-prefix 3 50 restart 1 force) to force check all received routes, not only accepted routes.
The following example sets the maximum inbound prefix limit from peer swp51 to 3, and generates a warning syslog message only (without bringing down the BGP session) when the number of prefixes received reaches 50 percent of the maximum limit.
cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# router bgp 65101
switch(config-router)# address-family l2vpn evpn
switch(config-router-af)# neighbor swp51 maximum-prefix 3 50 warning-only
switch(config-router-af)# end
switch# write memory
switch# exit
You can use the force option (neighbor swp51 maximum-prefix 3 50 warning-only force) to force check all received routes, not only accepted routes.
The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:
cumulus@switch:~$ sudo cat /etc/frr/frr.conf
...
address-family l2vpn evpn
advertise-all-vni
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp51 maximum-prefix 5 warning-only
...
Advertise SVI IP Addresses
In a typical EVPN deployment, you reuse SVI IP addresses on VTEPs across multiple racks. However, if you use unique SVI IP addresses across multiple racks and you want the local SVI IP address to be reachable via remote VTEPs, you can enable the advertise SVI IP and MAC address option. This option advertises the SVI IP and MAC address as a type-2 route and eliminates the need for any flooding over VXLAN to reach the IP address from a remote VTEP or rack.
- When you enable the advertise SVI IP and MAC address option, the anycast IP and MAC address pair is not advertised. Be sure not to enable both the advertise-svi-ip option and the advertise-default-gw option at the same time. (The advertise-default-gw option configures the gateway VTEPs to advertise their IP and MAC address. See Advertising the Default Gateway.)
- If you use MLAG on your switch, refer to Advertise Primary IP Address.
To advertise all SVI IP and MAC addresses on the switch, run these commands:
cumulus@leaf01:~$ nv set evpn route-advertise svi-ip on
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# advertise-svi-ip
leaf01(config-router-af)# end
leaf01# write memory
leaf01# exit
To advertise a specific SVI IP/MAC address, run these commands:
cumulus@leaf01:~$ nv set evpn vni 10 route-advertise svi-ip on
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# vni 10
leaf01(config-router-af-vni)# advertise-svi-ip
leaf01(config-router-af-vni)# end
leaf01# write memory
leaf01# exit
The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:
cumulus@leaf01:~$ sudo cat /etc/frr/frr.conf
...
address-family l2vpn evpn
vni 10
advertise-svi-ip
exit-address-family
...
Disable BUM Flooding
By default, the VTEP floods all broadcast, unknown unicast, and multicast (BUM) packets (such as ARP, NS, or DHCP) it receives to all interfaces (except for the incoming interface) and to all VXLAN tunnel interfaces in the same broadcast domain. When the switch receives such packets on a VXLAN tunnel interface, it floods the packets to all interfaces in the packet’s broadcast domain.
You can disable BUM flooding over VXLAN tunnels so that EVPN does not advertise type-3 routes for each local VNI and stops taking action on received type-3 routes.
Disabling BUM flooding is useful in a deployment with a controller or orchestrator, where the switch is pre-provisioned and there is no need to flood any ARP, NS, or DHCP packets.
For information on EVPN BUM flooding with PIM, refer to EVPN BUM Traffic with PIM-SM.
To disable BUM flooding:
cumulus@leaf01:~$ nv set nve vxlan flooding enable off
cumulus@leaf01:~$ nv config apply
To reenable BUM flooding, run the following commands. Enabling BUM flooding requires head-end replication.
cumulus@leaf01:~$ nv set nve vxlan flooding enable on
cumulus@leaf01:~$ nv set nve vxlan flooding head-end-replication evpn
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# flooding disable
leaf01(config-router-af)# end
leaf01# write memory
leaf01# exit
The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:
...
router bgp 65101
!
address-family l2vpn evpn
flooding disable
exit-address-family
...
To reenable BUM flooding, run the vtysh flooding head-end-replication command.
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# flooding head-end-replication
leaf01(config-router-af)# end
leaf01# write memory
leaf01# exit
To show that BUM flooding is off, run the vtysh show bgp l2vpn evpn vni command. For example:
cumulus@leaf01:~$ sudo vtysh
...
leaf01# show bgp l2vpn evpn vni
Advertise Gateway Macip: Disabled
Advertise SVI Macip: Disabled
Advertise All VNI flag: Enabled
BUM flooding: Disabled
Number of L2 VNIs: 3
Number of L3 VNIs: 2
Flags: * - Kernel
  VNI   Type  RD            Import RT   Export RT   Tenant VRF
* 20    L2    10.10.10.1:3  65101:20    65101:20    RED
* 30    L2    10.10.10.1:4  65101:30    65101:30    BLUE
* 10    L2    10.10.10.1:6  65101:10    65101:10    RED
* 4002  L3    10.1.30.2:2   65101:4002  65101:4002  BLUE
* 4001  L3    10.1.20.2:5   65101:4001  65101:4001  RED
Run the vtysh show bgp l2vpn evpn route type multicast command to make sure there are no EVPN type-3 routes that originate locally.
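If you collect the show bgp l2vpn evpn vni output from many switches, you can read the flooding state from the header lines. The following sketch assumes the output format shown in the example above; the function name bum_flooding_state is hypothetical.

```python
from typing import Optional


def bum_flooding_state(show_output: str) -> Optional[str]:
    """Return the value of the 'BUM flooding' header line, or None if absent."""
    for line in show_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "BUM flooding":
            return value.strip()
    return None


# Header lines copied from the example output in this section.
sample = """Advertise Gateway Macip: Disabled
Advertise SVI Macip: Disabled
Advertise All VNI flag: Enabled
BUM flooding: Disabled
Number of L2 VNIs: 3"""
print(bum_flooding_state(sample))
```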
Extended Mobility
Cumulus Linux supports scenarios where the IP to MAC binding for a host or virtual machine changes across the move. In addition to the simple mobility scenario where a host or virtual machine with a binding of IP1, MAC1 moves from one rack to another, Cumulus Linux supports additional scenarios where a host or virtual machine with a binding of IP1, MAC1 moves and takes on a new binding of IP2, MAC1 or IP1, MAC2. The EVPN protocol mechanism to handle extended mobility continues to use the MAC mobility extended community and is the same as the standard mobility procedures. Extended mobility defines how to compute the sequence number in this attribute when binding changes occur.
Extended mobility not only supports virtual machine moves, but also where one virtual machine shuts down and you provision another on a different rack that uses the IP address or the MAC address of the previous virtual machine. For example, in an EVPN deployment with OpenStack, where virtual machines for a tenant provision and shut down dynamically, a new virtual machine can use the same IP address as an earlier virtual machine but with a different MAC address.
To reuse the same distributed gateway on VLANs fabric wide, you can set the fabric-wide MAC address; see Change the VRR MAC address.
Cumulus Linux enables extended mobility by default.
To examine the sequence numbers for a host or virtual machine MAC address and IP address, run the vtysh show evpn mac vni <vni> mac <address> command. For example:
cumulus@switch:~$ sudo vtysh
...
switch# show evpn mac vni 10100 mac 00:02:00:00:00:42
MAC: 00:02:00:00:00:42
Remote VTEP: 10.0.0.2
Local Seq: 0 Remote Seq: 3
Neighbors:
10.1.1.74 Active
switch# show evpn arp vni 10100 ip 10.1.1.74
IP: 10.1.1.74
Type: local
State: active
MAC: 44:39:39:ff:00:24
Local Seq: 2 Remote Seq: 3
Duplicate Address Detection
Cumulus Linux can detect duplicate MAC and IPv4 or IPv6 addresses on hosts or virtual machines in a VXLAN-EVPN configuration. The Cumulus Linux switch (VTEP) considers a host MAC or IP address to be duplicate if the address moves across the network more than a certain number of times within a certain number of seconds (five moves within 180 seconds by default). In addition to legitimate host or VM mobility scenarios, address movement can occur when you configure IP addresses incorrectly on a host or when packet looping occurs in the network due to faulty configuration or behavior.
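The default threshold can be pictured as a count of moves inside a time window. The following sketch is a simplified model, not the FRR implementation: the class name MoveDetector is hypothetical, it uses a sliding window rather than a timer, and it assumes the address is flagged when the fifth move lands within 180 seconds.

```python
from collections import deque


class MoveDetector:
    """Flag an address as duplicate after max_moves moves within window_seconds."""

    def __init__(self, max_moves: int = 5, window_seconds: float = 180.0):
        self.max_moves = max_moves
        self.window_seconds = window_seconds
        self._moves = deque()  # timestamps of recent moves, oldest first

    def record_move(self, timestamp: float) -> bool:
        """Record one move and return True if the address now looks duplicate."""
        self._moves.append(timestamp)
        # Discard moves that fall outside the detection window.
        while timestamp - self._moves[0] > self.window_seconds:
            self._moves.popleft()
        return len(self._moves) >= self.max_moves


detector = MoveDetector()
for t in (0, 30, 60, 90, 120):  # five moves in two minutes
    flagged = detector.record_move(t)
print(flagged)
```

Moves that are spread out further than the window, such as occasional legitimate virtual machine migrations, never accumulate enough entries to reach the threshold in this model.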
Cumulus Linux enables duplicate address detection by default, which triggers when:
- Two hosts have the same MAC address (the host IP addresses are the same or different)
- Two hosts have the same IP address but different MAC addresses
By default, when the switch detects a duplicate address, it flags the address as a duplicate and generates an error in syslog so that you can troubleshoot the reason and address the fault, then clear the duplicate address flag. The switch does not take any functional action on the address.
- If the switch flags a MAC address as duplicate, it also flags all IP addresses associated with that MAC as duplicates. However, in an MLAG configuration, sometimes only one of the MLAG peers flags the associated IP addresses as duplicates.
- In an MLAG configuration, MAC mobility detection runs independently on each switch in the MLAG pair. Depending on the order in which local learning or route withdrawal from the remote VTEP occurs, the MAC mobility counter for a type-2 route increments on only one of the switches in the MLAG pair. In rare cases, it is possible for neither VTEP to increment the MAC mobility counter for the type-2 prefix.
- Duplicate address detection is not supported in an EVPN multihoming configuration.
When Does Duplicate Address Detection Trigger?
The VTEP that sees an address move from remote to local begins the detection process by starting a timer. Each VTEP runs duplicate address detection independently. Detection always starts with the first mobility event from remote to local. If the address is initially remote, the detection count can start with the first move for the address. If the address is initially local, the detection count starts only with the second or higher move for the address. If an address is undergoing a mobility event between remote VTEPs, duplicate detection does not start.
The following illustration shows VTEP-A, VTEP-B, and VTEP-C in an EVPN configuration. Duplicate address detection triggers on VTEP-A when there is a duplicate MAC address for two hosts attached to VTEP-A and VTEP-B. However, duplicate detection does not trigger on VTEP-A when mobility events occur between two remote VTEPs (VTEP-B and VTEP-C).
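The rules above can be sketched as a small per-address model. This is a simplified illustration, not the FRR implementation: it assumes that every move involving the local VTEP counts once detection is running, that the first remote-to-local move counts as move one, and that an expired window restarts only on the next remote-to-local move. The class name DupDetector and its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DupDetector:
    """Simplified per-address model of duplicate detection on one VTEP."""
    max_moves: int = 5              # default threshold from the text
    window: float = 180.0           # default interval in seconds from the text
    start: Optional[float] = None   # when the current detection window began
    count: int = 0
    duplicate: bool = False

    def move(self, now: float, src: str, dst: str) -> bool:
        """Record a move between 'local' and 'remote'; return the duplicate flag."""
        if src == "remote" and dst == "remote":
            # Moves between two remote VTEPs never start or advance detection.
            return self.duplicate
        expired = self.start is not None and now - self.start > self.window
        if self.start is None or expired:
            # Only a remote-to-local move can start (or restart) the timer.
            if (src, dst) == ("remote", "local"):
                self.start, self.count = now, 1
            else:
                self.start, self.count = None, 0
            return self.duplicate
        # Assumption: any move inside the window counts toward the threshold.
        self.count += 1
        if self.count >= self.max_moves:
            self.duplicate = True
        return self.duplicate


# A host that flaps between this VTEP and a remote VTEP every 10 seconds.
det = DupDetector()
flags = []
location = "remote"
for i in range(5):
    nxt = "local" if location == "remote" else "remote"
    flags.append(det.move(now=10.0 * i, src=location, dst=nxt))
    location = nxt
print(flags)
```

With the default settings, the model flags the address on the fifth move because all five moves fall inside one 180 second window.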
Configure Duplicate Address Detection
You can configure the threshold for MAC and IP address moves. The maximum number of moves allowed can be between 2 and 1000 and the detection time interval can be between 2 and 1800 seconds.
The following example commands set the maximum number of address moves allowed to 10 and the duplicate address detection time interval to 1200 seconds. The NVUE commands appear first, followed by the equivalent vtysh commands.
cumulus@switch:~$ nv set evpn dad mac-move-threshold 10
cumulus@switch:~$ nv set evpn dad move-window 1200
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# router bgp 65101
switch(config-router)# address-family l2vpn evpn
switch(config-router-af)# dup-addr-detection max-moves 10 time 1200
switch(config-router-af)# end
switch# write memory
switch# exit
To disable duplicate address detection, see Disable Duplicate Address Detection below.
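If you generate this configuration from a script, you can check the values against the ranges stated above before you build the commands. The following Python sketch is one way to do that. The helper name dad_commands is hypothetical, and the command strings are copied from the NVUE example in this section.

```python
from typing import List


def dad_commands(max_moves: int, window_seconds: int) -> List[str]:
    """Validate the thresholds and return the NVUE commands that set them."""
    # Ranges taken from the text: 2-1000 moves, 2-1800 seconds.
    if not 2 <= max_moves <= 1000:
        raise ValueError("max_moves must be between 2 and 1000")
    if not 2 <= window_seconds <= 1800:
        raise ValueError("window_seconds must be between 2 and 1800")
    return [
        f"nv set evpn dad mac-move-threshold {max_moves}",
        f"nv set evpn dad move-window {window_seconds}",
        "nv config apply",
    ]


for command in dad_commands(10, 1200):
    print(command)
```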
Example syslog Messages
The following example shows the syslog message that generates when Cumulus Linux detects a MAC address as a duplicate during a local update:
2018/11/06 18:55:29.463327 ZEBRA: [EC 4043309149] VNI 1001: MAC 00:01:02:03:04:11 detected as duplicate during local update, last VTEP 172.16.0.16
The following example shows the syslog message that generates when Cumulus Linux detects an IP address as a duplicate during a remote update:
2018/11/09 22:47:15.071381 ZEBRA: [EC 4043309151] VNI 1002: MAC aa:22:aa:aa:aa:aa IP 10.0.0.9 detected as duplicate during remote update, from VTEP 172.16.0.16
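To pick these events out of a log file, you can match on the message text. The following Python sketch extracts the VNI, MAC address, optional IP address, update type, and VTEP from lines shaped like the two examples above. The function name parse_dup_message is hypothetical, and the pattern assumes the message wording shown here, which can differ between releases.

```python
import re
from typing import Dict, Optional

# Pattern built from the two sample messages above; the IP field is optional
# because the MAC-only message does not include it.
DUP_RE = re.compile(
    r"VNI (?P<vni>\d+): MAC (?P<mac>[0-9a-fA-F:]{17})"
    r"(?: IP (?P<ip>\S+))? detected as duplicate during "
    r"(?P<update>local|remote) update, (?:last|from) VTEP (?P<vtep>\S+)"
)


def parse_dup_message(line: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the fields of a duplicate-detection message, or None."""
    match = DUP_RE.search(line)
    return match.groupdict() if match else None


mac_line = (
    "2018/11/06 18:55:29.463327 ZEBRA: [EC 4043309149] VNI 1001: "
    "MAC 00:01:02:03:04:11 detected as duplicate during local update, "
    "last VTEP 172.16.0.16"
)
ip_line = (
    "2018/11/09 22:47:15.071381 ZEBRA: [EC 4043309151] VNI 1002: "
    "MAC aa:22:aa:aa:aa:aa IP 10.0.0.9 detected as duplicate during "
    "remote update, from VTEP 172.16.0.16"
)
print(parse_dup_message(mac_line))
print(parse_dup_message(ip_line))
```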
Freeze a Detected Duplicate Address
Cumulus Linux provides a freeze option that takes action on a detected duplicate address. You can freeze the address permanently (until you intervene) or for a defined amount of time, after which it clears automatically.
When you enable the freeze option and the switch detects a duplicate address:
- If the switch learns the MAC or IP address from a remote VTEP at the time it freezes, the forwarding information in the kernel and hardware does not update, leaving it in the prior state. Any future remote updates process but they do not reflect in the kernel entry. If the remote VTEP sends a MAC-IP route withdrawal, the local VTEP removes the frozen remote entry. Then, if the local VTEP has a locally learned entry already present in its kernel, FRR originates a corresponding MAC-IP route and advertises it to all remote VTEPs.
- If the MAC or IP address is locally learned on this VTEP at the time it freezes, the address does not advertise to remote VTEPs. Future local updates process but do not advertise to remote VTEPs. If FRR receives a local entry delete event, it removes the frozen entry from the FRR database. Any remote updates (from other VTEPs) change the state of the entry to remote but the entry does not install in the kernel (until cleared).
To recover from a freeze, shut down the faulty host or VM or fix any other misconfiguration in the network. If the address freezes permanently, run the clear command on the VTEP where the address is duplicate. If the address freezes for a defined period of time, it clears automatically after the timer expires (you can clear the duplicate address before the timer expires with the clear command).
If you run the clear command or the timer expires before you address the fault, duplicate address detection can continue to occur.
After you clear a frozen address, if it is present behind a remote VTEP, the kernel and hardware forwarding tables update. If this VTEP learns the address locally, the address advertises to remote VTEPs. All VTEPs get the correct address as soon as the host communicates. The switch only learns silent hosts after the faulty entries age out, or you intervene and clear the faulty MAC and ARP table entries.
Configure the Freeze Option
You can enable Cumulus Linux to freeze detected duplicate addresses. The duration can be any number of seconds between 30 and 3600.
The following example commands freeze duplicate addresses for a period of 1000 seconds, after which they clear automatically. The NVUE commands appear first, followed by the equivalent vtysh commands.
cumulus@switch:~$ nv set evpn dad duplicate-action freeze duration 1000
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# router bgp 65101
switch(config-router)# address-family l2vpn evpn
switch(config-router-af)# dup-addr-detection freeze 1000
switch(config-router-af)# end
switch# write memory
switch# exit
Set the freeze timer to be three times the duplicate address detection window. For example, if the duplicate address detection window is 180 seconds, set the freeze timer to 540 seconds.
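The recommendation is simple arithmetic, but the result has to stay inside the configurable range of 30 to 3600 seconds. The following Python sketch computes three times the detection window and clamps the result to that range. The clamping is an assumption added here for illustration, and the function name recommended_freeze_seconds is hypothetical.

```python
def recommended_freeze_seconds(detection_window: int) -> int:
    """Return 3x the detection window, limited to the 30-3600 second range."""
    # Assumption: when 3x the window falls outside the configurable range,
    # use the nearest allowed value instead.
    return max(30, min(3600, 3 * detection_window))


window = 180  # default detection window in seconds
freeze = recommended_freeze_seconds(window)
print(f"nv set evpn dad duplicate-action freeze duration {freeze}")
```

For the default 180 second window, the sketch produces the 540 second value from the example above. A window of 1200 seconds or more yields the maximum of 3600 seconds.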
The following example commands freeze duplicate addresses permanently (until you run the clear command):
cumulus@switch:~$ nv set evpn dad duplicate-action freeze duration permanent
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# router bgp 65101
switch(config-router)# address-family l2vpn evpn
switch(config-router-af)# dup-addr-detection freeze permanent
switch(config-router-af)# end
switch# write memory
switch# exit
Clear Duplicate Addresses
You can clear duplicate addresses for all VNIs, or clear a duplicate MAC or IP address (and unfreeze a frozen address). The NVUE commands appear first, followed by the equivalent vtysh commands.
To clear duplicate addresses for all VNIs:
cumulus@switch:~$ nv action clear evpn vni
Action succeeded
To clear duplicate IP address 10.0.0.9 for VNI 10:
cumulus@switch:~$ nv action clear evpn vni 10 host 10.0.0.9
Action succeeded
To clear duplicate MAC address 00:e0:ec:20:12:62 for VNI 10:
cumulus@switch:~$ nv action clear evpn vni 10 mac 00:e0:ec:20:12:62
Action succeeded
To clear duplicate addresses for all VNIs:
cumulus@switch:~$ sudo vtysh
...
switch# clear evpn dup-addr vni all
switch# exit
To clear duplicate IP address 10.0.0.9 for VNI 10:
cumulus@switch:~$ sudo vtysh
...
switch# clear evpn dup-addr vni 10 ip 10.0.0.9
switch# exit
To clear duplicate MAC address 00:e0:ec:20:12:62 for VNI 10:
cumulus@switch:~$ sudo vtysh
...
switch# clear evpn dup-addr vni 10 mac 00:e0:ec:20:12:62
switch# exit
- In an MLAG configuration, you need to run the clear command on both the MLAG primary and secondary switch.
- When you clear a duplicate MAC address, all its associated IP addresses also clear. However, you cannot clear an associated IP address if its MAC address is still in a duplicate state.
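The relationship between duplicate MAC addresses and their IP addresses can be sketched as a toy model. This is only an illustration of the rules stated in this section, not how FRR stores the flags. The class name DuplicateTable and its methods are hypothetical.

```python
from typing import Dict, Set


class DuplicateTable:
    """Toy model of duplicate flags on one switch (not FRR's data structure)."""

    def __init__(self) -> None:
        self.dup_macs: Set[str] = set()
        self.dup_ips: Dict[str, str] = {}  # duplicate IP -> its MAC address

    def flag_mac(self, mac: str, ips: Set[str]) -> None:
        # Flagging a MAC also flags every IP address associated with it.
        self.dup_macs.add(mac)
        for ip in ips:
            self.dup_ips[ip] = mac

    def flag_ip(self, ip: str, mac: str) -> None:
        # An IP address can be a duplicate while its MAC address is not.
        self.dup_ips[ip] = mac

    def clear_mac(self, mac: str) -> None:
        # Clearing a MAC also clears all of its associated IP addresses.
        self.dup_macs.discard(mac)
        self.dup_ips = {ip: m for ip, m in self.dup_ips.items() if m != mac}

    def clear_ip(self, ip: str) -> bool:
        # Refuse to clear an IP whose MAC is still in a duplicate state.
        if self.dup_ips.get(ip) in self.dup_macs:
            return False
        self.dup_ips.pop(ip, None)
        return True


table = DuplicateTable()
table.flag_mac("00:e0:ec:20:12:62", {"10.0.0.9"})
print(table.clear_ip("10.0.0.9"))   # refused while the MAC is a duplicate
table.clear_mac("00:e0:ec:20:12:62")
print(table.dup_ips)                # the associated IP cleared with the MAC
```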
Disable Duplicate Address Detection
Duplicate address detection is on by default. The switch generates a syslog error when it detects a duplicate address. To disable duplicate address detection, run the following NVUE commands or the equivalent vtysh commands.
cumulus@switch:~$ nv set evpn dad enable off
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# router bgp 65101
switch(config-router)# address-family l2vpn evpn
switch(config-router-af)# no dup-addr-detection
switch(config-router-af)# end
switch# write memory
switch# exit
When you disable duplicate address detection, Cumulus Linux clears the configuration and all existing duplicate addresses.
Show Detected Duplicate Address Information
During the duplicate address detection process, you can see the start time and current detection count with the vtysh show evpn mac vni <vni_id> mac <mac_addr> command. The following example shows that detection started for MAC address 00:01:02:03:04:11 in VNI 1001 on Tuesday, November 6 at 18:55:05 and that Cumulus Linux has detected one move.
cumulus@switch:~$ sudo vtysh
...
switch# show evpn mac vni 1001 mac 00:01:02:03:04:11
MAC: 00:01:02:03:04:11
Intf: hostbond3(15) VLAN: 1001
Local Seq: 1 Remote Seq: 0
Duplicate detection started at Tue Nov 6 18:55:05 2018, detection count 1
Neighbors:
10.0.1.26 Active
After the switch flags the MAC address as a duplicate, the vtysh show evpn mac vni <vni_id> mac <mac_addr> command shows:
MAC: 00:01:02:03:04:11
Remote VTEP: 172.16.0.16
Local Seq: 13 Remote Seq: 14
Duplicate, detected at Tue Nov 6 18:55:29 2018
Neighbors:
10.0.1.26 Active
To display information for a duplicate IP address, run the vtysh show evpn arp-cache vni <vni_id> ip <ip_addr> command. The following command example shows information for IP address 10.0.0.9 for VNI 1001.
cumulus@switch:~$ sudo vtysh
...
switch# show evpn arp-cache vni 1001 ip 10.0.0.9
IP: 10.0.0.9
Type: remote
State: inactive
MAC: 00:01:02:03:04:11
Remote VTEP: 10.0.0.34
Local Seq: 0 Remote Seq: 14
Duplicate, detected at Tue Nov 6 18:55:29 2018
To show a list of MAC addresses detected as duplicate for a specific VNI or for all VNIs, run the vtysh show evpn mac vni <vni-id|all> duplicate command. The following example command shows a list of duplicate MAC addresses for VNI 1001:
cumulus@switch:~$ sudo vtysh
...
switch# show evpn mac vni 1001 duplicate
Number of MACs (local and remote) known for this VNI: 16
MAC               Type   Intf/Remote VTEP      VLAN
aa:bb:cc:dd:ee:ff local  hostbond3             1001
To show a list of IP addresses detected as duplicate for a specific VNI or for all VNIs, run the vtysh show evpn arp-cache vni <vni-id|all> duplicate command. The following example command shows a list of duplicate IP addresses for VNI 1001:
cumulus@switch:~$ sudo vtysh
...
switch# show evpn arp-cache vni 1001 duplicate
Number of ARPs (local and remote) known for this VNI: 20
IP                Type   State    MAC                Remote VTEP
10.0.0.8          local  active   aa:11:aa:aa:aa:aa
10.0.0.9          local  active   aa:11:aa:aa:aa:aa
10.10.0.12        remote active   aa:22:aa:aa:aa:aa  172.16.0.16
To show configured duplicate address detection parameters, run the vtysh show evpn command:
cumulus@switch:~$ sudo vtysh
...
switch# show evpn
L2 VNIs: 4
L3 VNIs: 2
Advertise gateway mac-ip: No
Duplicate address detection: Enable
Detection max-moves 7, time 300
Detection freeze permanent
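To audit these settings across many switches, you can parse the relevant lines from the command output. The following Python sketch reads the three duplicate address detection lines from output shaped like the example above. The function name parse_dad_config and the dictionary keys are hypothetical, and the patterns assume the exact wording in that sample.

```python
import re
from typing import Any, Dict


def parse_dad_config(output: str) -> Dict[str, Any]:
    """Extract duplicate address detection settings from show evpn output."""
    config: Dict[str, Any] = {}
    match = re.search(r"Duplicate address detection:\s*(\S+)", output)
    if match:
        config["enabled"] = match.group(1).lower().startswith("enable")
    match = re.search(r"Detection max-moves (\d+), time (\d+)", output)
    if match:
        config["max_moves"] = int(match.group(1))
        config["time"] = int(match.group(2))
    match = re.search(r"Detection freeze (\S+)", output)
    if match:
        # Kept as text because the value is either "permanent" or a duration.
        config["freeze"] = match.group(1)
    return config


show_evpn = """L2 VNIs: 4
L3 VNIs: 2
Advertise gateway mac-ip: No
Duplicate address detection: Enable
Detection max-moves 7, time 300
Detection freeze permanent"""

print(parse_dad_config(show_evpn))
```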