NVIDIA Cumulus Linux

Cumulus Linux 5.12 User Guide

NVIDIA® Cumulus Linux is the first full-featured, Debian bookworm-based Linux operating system for the networking industry.

This user guide provides in-depth documentation on the Cumulus Linux installation process, system configuration and management, network solutions, and monitoring and troubleshooting recommendations. In addition, the quick start guide provides an end-to-end setup process to get you started.

Cumulus Linux 5.12 includes the NVIDIA NetQ agent and CLI. You can use NetQ to monitor and manage your data center network infrastructure and operational health. Refer to the NVIDIA NetQ documentation for details.

For a list of the new features in this release, see What's New. For bug fixes and known issues present in this release, refer to the Cumulus Linux 5.12 Release Notes.

Try It Pre-built Demos

The Cumulus Linux documentation includes pre-built Try It demos for certain Cumulus Linux features. The Try It demos run a simulation in NVIDIA Air, a cloud-hosted platform that works exactly like a real-world production deployment. Use the Try It demos to examine switch configuration for a feature. For more information, see Try It Pre-built Demos.

Open Source Contributions

To implement various Cumulus Linux features, NVIDIA has forked various software projects, like CFEngine Netdev and some Puppet Labs packages. Some of the forked code resides in the NVIDIA Networking GitHub repository and some is available as part of the Cumulus Linux repository as Debian source packages.

NVIDIA has also developed and released new applications as open source. The list of open source projects is on the Cumulus Linux packages page.

Download the User Guide

Use one of the following methods to download the Cumulus Linux user guide and view it offline:

What's New

This document supports the Cumulus Linux 5.12 release, and lists new platforms, features, and enhancements.

What’s New in Cumulus Linux 5.12

Platforms

New Features and Enhancements

To align with a long-term vision of a common interface between Cumulus Linux, NVIDIA OS (NVOS), and Host-Based Networking, certain NVUE commands in Cumulus Linux 5.12 have changed. Before you upgrade to 5.12, review the list of changed and removed NVUE commands above and be sure to make any necessary changes to your automation.

Release Considerations

Review the following considerations before you upgrade to Cumulus Linux 5.12.

Linux Configuration Files Overwritten

If you use Linux commands to configure the switch, read the following information before you upgrade to Cumulus Linux 5.12.0 or later.

Cumulus Linux includes a default NVUE startup.yaml file. In addition, NVUE configuration auto save is enabled by default. As a result, Cumulus Linux overwrites any manual changes to Linux configuration files on the switch when the switch reboots after upgrade or you change the cumulus user account password with the Linux passwd command.

These issues occur only if you use Linux commands to configure the switch. If you use NVUE commands to configure the switch, these issues do not occur and no action is needed.

To prevent Cumulus Linux from overwriting manual changes to the Linux configuration files when the switch reboots or when changing the cumulus user account password with the passwd command, follow the steps below before you upgrade to 5.12.0 or later, or after a new binary image installation:

  1. Disable NVUE auto save:

    cumulus@switch:~$ nv set system config auto-save state disabled
    cumulus@switch:~$ nv config apply
    cumulus@switch:~$ nv config save

  2. Delete the /etc/nvue.d/startup.yaml file:

    cumulus@switch:~$ sudo rm -rf /etc/nvue.d/startup.yaml

  3. Add the PASSWORD_NVUE_SYNC=no line to the /etc/default/nvued file:

    cumulus@switch:~$ sudo nano /etc/default/nvued
    PASSWORD_NVUE_SYNC=no
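
If you script the PASSWORD_NVUE_SYNC edit instead of using an editor, the change is safer when it is idempotent, so that repeated runs do not append duplicate lines. The following is a minimal Python sketch, assuming the file holds simple KEY=value lines. The helper name ensure_setting is hypothetical, and the function only transforms text; reading and writing /etc/default/nvued with root privileges is left to the caller.

```python
def ensure_setting(text: str, key: str, value: str) -> str:
    """Return text with exactly one active `key=value` line (hypothetical helper)."""
    wanted = f"{key}={value}"
    out, found = [], False
    for line in text.splitlines():
        if line.strip().startswith(f"{key}="):
            if not found:
                out.append(wanted)  # rewrite the first occurrence in place
                found = True
            continue  # drop any later duplicates
        out.append(line)
    if not found:
        out.append(wanted)  # key was absent, so append it
    return "\n".join(out) + "\n"


print(ensure_setting("A=1\nPASSWORD_NVUE_SYNC=yes\n", "PASSWORD_NVUE_SYNC", "no"), end="")
```

Commented-out lines (starting with #) are left untouched, so an existing comment does not count as the setting.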
    

DHCP Lease with the host-name Option

When a Cumulus Linux switch with NVUE enabled receives a DHCP lease containing the host-name option, it ignores the received hostname and does not apply it. For details, see this knowledge base article.

NVUE Commands After Upgrade

Cumulus Linux 5.12 includes the NVUE object model. After you upgrade to Cumulus Linux 5.12, running NVUE configuration commands might override configuration for features that are now configurable with NVUE and remove configuration you added manually to files or with automation tools like Ansible, Chef, or Puppet. To keep your configuration, you can do one of the following:

Quick Start Guide

This quick start guide provides an end-to-end setup process for installing and running Cumulus Linux.

Prerequisites

This guide assumes you have intermediate-level Linux knowledge. You need to be familiar with basic text editing, Unix file permissions, and process monitoring. Cumulus Linux includes a variety of preinstalled text editors, such as vi and nano.

You must have access to a Linux or UNIX shell. If you are running Windows, use a Linux environment like Cygwin as your command line tool for interacting with Cumulus Linux.

Get Started

Cumulus Linux is on the switch by default. To upgrade to a different Cumulus Linux release or reinstall Cumulus Linux, refer to Installation Management. To show the current Cumulus Linux release on the switch, run the NVUE nv show system command.

When starting Cumulus Linux for the first time, the management port makes a DHCPv4 request. To determine the IP address of the switch, you can cross reference the serial number of the switch with your DHCP server. The DHCP request from the switch includes the serial number in the client identifier (option 61).

To get started:

You can choose to configure Cumulus Linux either with NVUE commands or Linux commands (with vtysh or by manually editing configuration files). Do not run both NVUE configuration commands (such as nv set, nv unset, nv action, nv config) and Linux commands to configure the switch. NVUE commands replace the configuration in files such as /etc/network/interfaces and /etc/frr/frr.conf, and remove any configuration you add manually or with automation tools like Ansible, Chef, or Puppet.

If you choose to configure Cumulus Linux with NVUE, you can configure features that do not yet support the NVUE object model by creating NVUE Snippets.

Login Credentials

The default installation includes two accounts:

ONIE includes options that allow you to change the default password for the cumulus account automatically when you install a new Cumulus Linux image. Refer to ONIE Installation Options. You can also change the default password using a ZTP script.

In this quick start guide, you use the cumulus account to configure Cumulus Linux.

All accounts except root can use remote SSH login; you can use sudo to grant a non-root account root-level access. Commands that change the system configuration require this elevated level of access.

For more information about sudo, see Using sudo to Delegate Privileges.

Serial Console Management

NVIDIA recommends you perform management and configuration over the network, either in band or out of band. A serial console is fully supported.

Typically, switches ship from the manufacturer with a mating DB9 serial cable. Switches with ONIE are always set to a 115200 baud rate.

Wired Ethernet Management

A Cumulus Linux switch always provides at least one dedicated Ethernet management port called eth0. This interface is specifically for out-of-band management use. The management interface uses DHCPv4 for addressing by default.

To set a static IP address and gateway address for eth0:

cumulus@switch:~$ nv unset interface eth0 ip address dhcp
cumulus@switch:~$ nv set interface eth0 ip address 192.0.2.42/24
cumulus@switch:~$ nv set interface eth0 ip gateway 192.0.2.1
cumulus@switch:~$ nv config apply

To do the same with Linux commands, edit the /etc/network/interfaces file:

cumulus@switch:~$ sudo nano /etc/network/interfaces
# Management interface
auto eth0
iface eth0
    address 192.0.2.42/24
    gateway 192.0.2.1
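
Before you apply a static management address, it can help to confirm that the gateway is reachable on the same subnet as eth0. The following is a minimal sketch using the Python standard library ipaddress module; the function name gateway_is_on_link is hypothetical and the check is an illustration, not something Cumulus Linux runs for you.

```python
import ipaddress


def gateway_is_on_link(address: str, gateway: str) -> bool:
    """True if gateway lies inside the subnet of address and differs from it."""
    iface = ipaddress.ip_interface(address)  # for example 192.0.2.42/24
    gw = ipaddress.ip_address(gateway)
    return gw in iface.network and gw != iface.ip


# Values taken from the eth0 example in this section.
print(gateway_is_on_link("192.0.2.42/24", "192.0.2.1"))
```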

Configure the Hostname

The hostname identifies the switch; make sure you configure the hostname to be unique and descriptive.

Do not use an underscore (_), apostrophe ('), or non-ASCII characters in the hostname.
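
If you generate hostnames in automation, you can check them against the restrictions above before you push them to the switch. The following is a minimal Python sketch; the function name hostname_problems is hypothetical, and it checks only the three rules stated in this section, not every rule a DNS server or other tool might enforce.

```python
def hostname_problems(name: str) -> list[str]:
    """Return the rules from this section that a candidate hostname violates."""
    problems = []
    if "_" in name:
        problems.append("contains an underscore")
    if "'" in name:
        problems.append("contains an apostrophe")
    if not name.isascii():
        problems.append("contains non-ASCII characters")
    return problems


for candidate in ("leaf01", "leaf_01"):
    print(candidate, hostname_problems(candidate) or "ok")
```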

To change the hostname:

Run the nv set system hostname <hostname> command. The following example sets the hostname to leaf01:

cumulus@switch:~$ nv set system hostname leaf01
cumulus@switch:~$ nv config apply

To change the hostname with Linux commands:

  1. Change the hostname with the hostnamectl command; for example:

    cumulus@switch:~$ sudo hostnamectl set-hostname leaf01
    
  2. In the /etc/hosts file, replace the host for IP address 127.0.1.1 with the new hostname:

    cumulus@switch:~$ sudo nano /etc/hosts
    ...
    127.0.1.1       leaf01
    

The command prompt in the terminal does not reflect the new hostname until you either log out of the switch or start a new shell.

Configure the Time Zone

The default time zone on the switch is UTC (Coordinated Universal Time). Change the time zone on your switch to be the time zone for your location.

To update the time zone:

Run the nv set system timezone <timezone> command. To see all the available time zones, run nv set system timezone and press the Tab key. The following example sets the time zone to US/Eastern:

cumulus@switch:~$ nv set system timezone US/Eastern
cumulus@switch:~$ nv config apply

To update the time zone with Linux commands:

  1. In a terminal, run the following command:

    cumulus@switch:~$ sudo dpkg-reconfigure tzdata
    
  2. Follow the on-screen menu options to select the geographic area and region.

Programs that are already running (including log files) and logged-in users do not see time zone changes. To set the time zone for all services and daemons, reboot the switch.

Verify the System Time

Verify that the date and time on the switch are correct. If the date and time are incorrect, the switch does not synchronize with automation tools, such as Puppet, and returns errors after you restart switchd.

To show the current date and time, run the nv show system time command:

cumulus@switch:~$ nv show system time
                           operational                  
-------------------------  -----------------------------
local-time                 Wed 2024-08-21 17:39:44 EDT
universal-time             Wed 2024-08-21 21:39:44 UTC
rtc-time                   Fri 2024-08-16 16:50:06    
time-zone                  US/Eastern (EDT, -0400)    
system-clock-synchronized  no                         
ntp-service                n/a                        
rtc-in-local-tz            no                         
unix-time                  1724276384.1403222
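
The unix-time row is the same instant as the local-time and universal-time rows, expressed as seconds since 1970-01-01 UTC. The following Python sketch converts the value from the example output; the fixed -4 hour offset for EDT is taken from the time-zone row and is hard-coded here as an assumption for this one sample.

```python
from datetime import datetime, timedelta, timezone

unix_time = 1724276384.1403222  # unix-time value from the example output

utc = datetime.fromtimestamp(unix_time, tz=timezone.utc)
# EDT is UTC-4, as shown in the time-zone row (-0400).
edt = utc.astimezone(timezone(timedelta(hours=-4), "EDT"))

print(utc.strftime("%Y-%m-%d %H:%M:%S %Z"))  # 2024-08-21 21:39:44 UTC
print(edt.strftime("%Y-%m-%d %H:%M:%S %Z"))  # 2024-08-21 17:39:44 EDT
```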

To set the software clock according to the configured time zone, run the nv action change system time <YYYY-MM-DD> <HH:MM:SS> command; for example:

cumulus@switch:~$ nv action change system time 2023-12-04 2:33:30
System Date-time changed successfully
Local Time is now Mon 2023-12-04 02:33:30 UTC
Action succeeded

To show the current date and time on the switch, run the date command:

cumulus@switch:~$ date
Wed 11 Oct 2023 12:18:33 PM UTC

To set the software clock according to the configured time zone, run the sudo date -s command:

cumulus@switch:~$ sudo date -s "Tue Jan 26 00:37:13 2021"

For more information about setting the system time, see Setting the Date and Time.

NTP and PTP

Configure Breakout Ports with Splitter Cables

If you are using 4x10G DAC or AOC cables, or you want to break out (split) switch ports, configure the breakout ports; see Switch Port Attributes.

Test Cable Connectivity

By default, Cumulus Linux disables all data plane ports (every Ethernet port except the management interface, eth0). To test cable connectivity, administratively enable physical ports.

To enable a port administratively, run the nv set interface <interface> command:

cumulus@switch:~$ nv set interface swp1
cumulus@switch:~$ nv config apply

To enable all physical ports administratively on a switch that has ports numbered from swp1 to swp52:

cumulus@switch:~$ nv set interface swp1-52
cumulus@switch:~$ nv config apply

To view link status, run the nv show interface command.

To enable a port administratively, edit the /etc/network/interfaces file to add the port, then run the ifreload -a command.

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto swp1
iface swp1
...
cumulus@switch:~$ sudo ifreload -a

To enable all physical ports administratively, edit the /etc/network/interfaces file to add all the interfaces, then run the ifreload -a command.

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto swp1
iface swp1

auto swp2
iface swp2

auto swp3
iface swp3
...
cumulus@switch:~$ sudo ifreload -a
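
Typing one stanza per port by hand is error-prone on a switch with many ports. The following Python sketch generates the auto and iface stanzas for a numeric port range so that you can paste them into /etc/network/interfaces. The function name port_stanzas is hypothetical, and the sketch assumes ports named swp followed by a number with no breakout suffixes.

```python
def port_stanzas(first: int, last: int, prefix: str = "swp") -> str:
    """Build 'auto'/'iface' stanzas for prefix<first> through prefix<last>."""
    blocks = [
        f"auto {prefix}{n}\niface {prefix}{n}\n"
        for n in range(first, last + 1)
    ]
    return "\n".join(blocks)  # one blank line between stanzas


print(port_stanzas(1, 3), end="")
```

For a switch with ports swp1 through swp52, call port_stanzas(1, 52), add the output to the file, then run sudo ifreload -a as described above.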

To view link status, run the ip link show command.

Configure Layer 2 Ports

Cumulus Linux does not put all ports into a bridge by default. To create a bridge and configure one or more front panel ports as members of the bridge, run the following commands.

The following example places the front panel port swp1 into the default bridge called br_default.

cumulus@switch:~$ nv set interface swp1 bridge domain br_default
cumulus@switch:~$ nv config apply

You can add a range of ports in one command. For example, to add swp1 through swp3, swp6, and swp14 through swp20 to the bridge:

cumulus@switch:~$ nv set interface swp1-3,swp6,swp14-20 bridge domain br_default
cumulus@switch:~$ nv config apply

The following example places the front panel port swp1 into the default bridge called br_default:

...
auto br_default
iface br_default
    bridge-ports swp1
...

The following example adds swp1 through swp3, swp6, and swp14 through swp20 to the bridge:

...
auto br_default
iface br_default
    bridge-ports swp1 swp2 swp3 swp6 swp14 swp15 swp16 swp17 swp18 swp19 swp20
...
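
The NVUE command accepts a compact range expression, while the bridge-ports line in /etc/network/interfaces lists every port. The following Python sketch expands a range expression into the explicit list; the function name expand_ports is hypothetical, and it handles only the simple name, name-range, and comma forms used in this section, not every syntax that NVUE accepts.

```python
import re


def expand_ports(spec: str) -> list[str]:
    """Expand 'swp1-3,swp6,swp14-20' into individual port names."""
    ports = []
    for item in spec.split(","):
        m = re.fullmatch(r"([A-Za-z]+)(\d+)(?:-(\d+))?", item.strip())
        if m is None:
            raise ValueError(f"unrecognized port item: {item!r}")
        prefix = m.group(1)
        start = int(m.group(2))
        end = int(m.group(3) or m.group(2))  # a single port is a range of one
        ports.extend(f"{prefix}{n}" for n in range(start, end + 1))
    return ports


print("bridge-ports " + " ".join(expand_ports("swp1-3,swp6,swp14-20")))
```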

Before you apply the configuration, check for typos:

cumulus@switch:~$ sudo ifquery -a

If there are no errors, run the following command:

cumulus@switch:~$ sudo ifup -a

For more information about Ethernet bridges, see Ethernet Bridging - VLANs.

Configure Layer 3 Ports

You can configure a front panel port or bridge interface as a layer 3 port.

The following example configures the front panel port swp1 as a layer 3 access port:

cumulus@switch:~$ nv set interface swp1 ip address 10.0.0.0/31
cumulus@switch:~$ nv config apply
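
The /31 prefix in this example contains exactly two addresses, a layout commonly used on point-to-point links (RFC 3021). The following Python sketch uses the standard library ipaddress module to find the address that the device at the other end of the link would use; the function name peer_address is hypothetical.

```python
import ipaddress


def peer_address(address: str) -> str:
    """Return the other address in a /31 given one end, such as 10.0.0.0/31."""
    iface = ipaddress.ip_interface(address)
    if iface.network.prefixlen != 31:
        raise ValueError("this sketch handles /31 prefixes only")
    first, second = list(iface.network)  # a /31 holds exactly two addresses
    return str(second if iface.ip == first else first)


print(peer_address("10.0.0.0/31"))
```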

To add an IP address to a bridge interface, you must put it into a VLAN interface. If you want to use a VLAN other than the native one, set the bridge PVID:

cumulus@switch:~$ nv set interface swp1-2 bridge domain br_default
cumulus@switch:~$ nv set bridge domain br_default vlan 10
cumulus@switch:~$ nv set interface vlan10 ip address 10.1.10.2/24
cumulus@switch:~$ nv set bridge domain br_default untagged 1
cumulus@switch:~$ nv config apply

The following example configures the front panel port swp1 as a layer 3 access port:

auto swp1
iface swp1
  address 10.0.0.0/31

To add an IP address to a bridge interface, include the address under the iface stanza in the /etc/network/interfaces file. If you want to use a VLAN other than the native one, set the bridge PVID:

auto vlan10
iface vlan10
    address 10.1.10.2/24
    vlan-raw-device br_default
    vlan-id 10
auto br_default
iface br_default
    bridge-ports swp1 swp2
    hwaddress 44:38:39:22:01:78
    bridge-vlan-aware yes
    bridge-vids 10
    bridge-pvid 1
    bridge-stp yes
    bridge-mcsnoop no
    mstpctl-forcevers rstp

Before you apply the configuration, check for typos:

cumulus@switch:~$ sudo ifquery -a

If there are no errors, run the following command:

cumulus@switch:~$ sudo ifup -a

Configure a Loopback Interface

Cumulus Linux has a preconfigured loopback interface. When the switch boots up, the loopback interface, called lo, is up and assigned an IP address of 127.0.0.1.

The loopback interface lo must always exist on the switch and must always be up. To check the status of the loopback interface, run the NVUE nv show interface lo command or the Linux ip addr show lo command.

The following example sets the loopback IP address to 10.10.10.1/32.

cumulus@switch:~$ nv set interface lo ip address 10.10.10.1/32
cumulus@switch:~$ nv config apply

Add the IP address directly under the iface lo inet loopback definition in the /etc/network/interfaces file:

auto lo
iface lo inet loopback
    address 10.10.10.1/32

If you configure an IP address without a subnet mask, it becomes a /32 IP address. For example, 10.10.10.1 is 10.10.10.1/32.
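
The Python standard library ipaddress module follows the same convention, which makes it convenient for normalizing addresses in automation before you compare them with switch output. This is a small illustrative sketch; the function name with_default_mask is hypothetical.

```python
import ipaddress


def with_default_mask(address: str) -> str:
    """Return the address with an explicit prefix length (/32 if none given)."""
    return str(ipaddress.ip_interface(address))


print(with_default_mask("10.10.10.1"))
```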

You can add multiple loopback addresses. For more information, see Interface Configuration and Management.

Show Platform and System Settings

Next Steps

You are now ready to configure the switch according to your needs. This guide provides separate sections that describe how to configure system, layer 1, layer 2, layer 3, and network virtualization settings. Each section includes example configurations and pre-built demos.

For a deep dive into the NVUE object model that provides a CLI to simplify configuration, see NVUE.

Installation Management

This section describes how to manage, install, and upgrade Cumulus Linux on your switch.

Managing Cumulus Linux Disk Images

The Cumulus Linux operating system resides on a switch as a disk image. This section discusses how to manage the image.

To install a new Cumulus Linux image, refer to Installing a New Cumulus Linux Image. To upgrade Cumulus Linux, refer to Upgrading Cumulus Linux.

Reprovision the System (Restart the Installer)

Reprovisioning the system deletes all system data from the switch.

To stage an ONIE installer from the network (where ONIE automatically locates the installer), run the onie-select -i command. You must reboot the switch to start the install process.

cumulus@switch:~$ sudo onie-select -i
WARNING:
WARNING: Operating System install requested.
WARNING: This will wipe out all system data.
WARNING:
Are you sure (y/N)? y
Enabling install at next reboot...done.
Reboot required to take effect.

To cancel a pending reinstall operation, run the onie-select -c command:

cumulus@switch:~$ sudo onie-select -c
Cancelling pending install at next reboot...done.

To stage an installer located in a specific location, run the onie-install -i <location> command. You can specify a local file (absolute or relative path) or a location on an HTTP, HTTPS, SCP, or FTP server. You can also stage a Zero Touch Provisioning (ZTP) script along with the installer. You typically use the onie-install command with the -a option to activate installation. If you do not specify the -a option, you must reboot the switch to start the installation process.

The following example stages the installer located at http://203.0.113.10/image-installer together with the ZTP script located at http://203.0.113.10/ztp-script and activates installation and ZTP:

cumulus@switch:~$ sudo onie-install -i http://203.0.113.10/image-installer
cumulus@switch:~$ sudo onie-install -z http://203.0.113.10/ztp-script
cumulus@switch:~$ sudo onie-install -a

You can also specify these options together in the same command. For example:

cumulus@switch:~$ sudo onie-install -i http://203.0.113.10/image-installer -z http://203.0.113.10/ztp-script -a

To see more onie-install options, run man onie-install.

Migrate from Cumulus Linux to ONIE (Uninstall All Images and Remove the Configuration)

To remove all installed images and configurations, and return the switch to its factory defaults, run the onie-select -k command.

The onie-select -k command takes a long time to run as it overwrites the entire NOS section of the flash. Only use this command if you want to erase all NOS data and take the switch out of service.

cumulus@switch:~$ sudo onie-select -k
WARNING:
WARNING: Operating System uninstall requested.
WARNING: This will wipe out all system data.
WARNING:
Are you sure (y/N)? y
Enabling uninstall at next reboot...done.
Reboot required to take effect.

You must reboot the switch to start the uninstallation process.

To cancel a pending uninstall operation, run the onie-select -c command:

cumulus@switch:~$ sudo onie-select -c
Cancelling pending uninstall at next reboot...done.

Boot Into Rescue Mode

If your system becomes unresponsive, you can correct certain issues by booting into ONIE rescue mode, which uses unmounted file systems. You can use various Cumulus Linux utilities to try to resolve a problem.

To reboot the system into ONIE rescue mode, run the onie-select -r command:

cumulus@switch:~$ sudo onie-select -r
WARNING:
WARNING: Rescue boot requested.
WARNING:
Are you sure (y/N)? y
Enabling rescue at next reboot...done.
Reboot required to take effect.

You must reboot the system to boot into rescue mode.

To cancel a pending rescue boot operation, run the onie-select -c command:

cumulus@switch:~$ sudo onie-select -c
Cancelling pending rescue at next reboot...done.

Inspect the Image File

The Cumulus Linux image file is executable. From a running switch, you can display, extract, and verify the contents of the image file.

To display the contents of the Cumulus Linux image file, pass the info option to the image file. For example, to display the contents of an image file called onie-installer located in the /var/lib/cumulus/installer directory:

cumulus@switch:~$ sudo /var/lib/cumulus/installer/onie-installer info
Verifying image checksum ...OK.
Preparing image archive ... OK.
Control File Contents
=====================
Description: Cumulus Linux 4.1.0
Release: 4.1.0
Architecture: amd64
Switch-Architecture: bcm-amd64
Build-Id: dirtyz224615f
Build-Date: 2019-05-17T16:34:22+00:00
Build-User: clbuilder
Homepage: http://www.cumulusnetworks.com/
Min-Disk-Size: 1073741824
Min-Ram-Size: 536870912
mkimage-version: 0.11.111_gbcf0
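
The control file is a list of Key: value fields, so it is easy to read in a script. The following Python sketch parses a few fields copied from the example output and converts the minimum sizes to larger units. It assumes that Min-Disk-Size and Min-Ram-Size are expressed in bytes, which the output does not state explicitly; the function name parse_control is hypothetical.

```python
control = """\
Description: Cumulus Linux 4.1.0
Min-Disk-Size: 1073741824
Min-Ram-Size: 536870912
"""


def parse_control(text: str) -> dict[str, str]:
    """Split 'Key: value' lines into a dictionary; other lines are ignored."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            fields[key] = value
    return fields


fields = parse_control(control)
disk_gib = int(fields["Min-Disk-Size"]) / 2**30  # assumed to be bytes
ram_mib = int(fields["Min-Ram-Size"]) / 2**20    # assumed to be bytes
print(f"minimum disk: {disk_gib:g} GiB, minimum RAM: {ram_mib:g} MiB")
```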

To extract the contents of the image file, pass the extract <path> option to the image file. For example, to extract an image file called onie-installer located in the /var/lib/cumulus/installer directory to the mypath directory:

cumulus@switch:~$ sudo /var/lib/cumulus/installer/onie-installer extract mypath
total 181860
-rw-r--r-- 1 4000 4000       308 May 16 19:04 control
drwxr-xr-x 5 4000 4000      4096 Apr 26 21:28 embedded-installer
-rw-r--r-- 1 4000 4000  13273936 May 16 19:04 initrd
-rw-r--r-- 1 4000 4000   4239088 May 16 19:04 kernel
-rw-r--r-- 1 4000 4000 168701528 May 16 19:04 sysroot.tar

To verify the contents of the image file, pass the verify option to the image file. For example, to verify the contents of an image file called onie-installer located in the /var/lib/cumulus/installer directory:

cumulus@switch:~$ sudo /var/lib/cumulus/installer/onie-installer verify
Verifying image checksum ...OK.
Preparing image archive ... OK.
./cumulus-linux-bcm-amd64.bin.1: 161: ./cumulus-linux-bcm-amd64.bin.1: onie-sysinfo: not found
Verifying image compatibility ...OK.
Verifying system ram ...OK.

Installing a New Cumulus Linux Image

The default password for the cumulus user account is cumulus. The first time you log into Cumulus Linux, you must change this default password. Be sure to update any automation scripts before installing a new image. Cumulus Linux provides command line options to change the default password automatically during the installation process. Refer to ONIE Installation Options.

You can install a new Cumulus Linux image using ONIE, an open source project (equivalent to PXE on servers) that enables the installation of network operating systems (NOS) on bare metal switches.

Before you install Cumulus Linux, the switch can be in two different states:

The sections below describe some of the different ways you can install the Cumulus Linux image. Steps show how to install directly from ONIE (if no image is on the switch) and from Cumulus Linux (if the image is already on the switch). For additional methods to find and install the Cumulus Linux image, see the ONIE Design Specification.

You can download a Cumulus Linux image from the NVIDIA Enterprise support portal.

Installing the Cumulus Linux image is destructive; configuration files on the switch are not saved; copy them to a different server before installing.

In the following procedures:

Install Using a DHCP/Web Server With DHCP Options

To install Cumulus Linux using a DHCP or web server with DHCP options, set up a DHCP/web server on your laptop and connect the eth0 management port of the switch to your laptop. After you connect the cable, the installation proceeds as follows:

  1. The switch boots up and requests an IP address (DHCP request).

  2. The DHCP server acknowledges and responds with DHCP option 114 and the location of the installation image.

  3. ONIE downloads the Cumulus Linux image, installs, and reboots.

    You are now running Cumulus Linux.

The most common method is to send DHCP option 114 with the complete URL of the image on the web server (the web server can be the same system as the DHCP server). However, there are other ways you can use DHCP even if you do not have full control over DHCP. See the ONIE user guide for information on partial installer URLs and advanced DHCP options; both articles list more supported DHCP options.

The following shows an example DHCP configuration with an ISC DHCP server:

subnet 172.0.24.0 netmask 255.255.255.0 {
  range 172.0.24.20 172.0.24.200;
  option default-url = "http://172.0.24.14/onie-installer-x86_64";
}

The following shows an example DHCP configuration with dnsmasq (static address assignment):

dhcp-host=sw4,192.168.100.14,6c:64:1a:00:03:ba,set:sw4
dhcp-option=tag:sw4,114,"http://roz.rtplab.test/onie-installer-x86_64"
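
Both configurations result in the same bytes on the wire: DHCP options use a code, length, value layout (RFC 2132), so option 114 carries the URL as a one-byte code, a one-byte length, and the URL text. The following Python sketch shows that encoding for illustration only; the function name encode_dhcp_option is hypothetical, and your DHCP server performs this step for you.

```python
def encode_dhcp_option(code: int, value: str) -> bytes:
    """Encode a text DHCP option as code (1 byte), length (1 byte), value."""
    data = value.encode("ascii")
    if not 0 < code < 255:
        raise ValueError("option code must be between 1 and 254")
    if len(data) > 255:
        raise ValueError("value too long for a single option")
    return bytes([code, len(data)]) + data


url = "http://172.0.24.14/onie-installer-x86_64"
option = encode_dhcp_option(114, url)
print(option[0], option[1], option[2:].decode("ascii"))
```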

If you do not have a web server, you can use this free Apache example.

Install Using a DHCP/Web Server without DHCP Options

Follow the steps below if you can log into the switch on a serial console (ONIE), or log in on the console or with ssh (Install from Cumulus Linux).

Install from ONIE:

  1. Place the Cumulus Linux image in a directory on the web server.

  2. Run the onie-nos-install command:

    ONIE:/ # onie-nos-install http://10.0.1.251/path/to/cumulus-install-x86_64.bin

Install from Cumulus Linux:

  1. Place the Cumulus Linux image in a directory on the web server.

  2. From the Cumulus Linux command prompt, run the onie-install command, then reboot the switch.

    cumulus@switch:~$ sudo onie-install -a -i http://10.0.1.251/path/to/cumulus-install-x86_64.bin
    

Install Using a Web Server With no DHCP

Follow the steps below if you can log into the switch on a serial console (ONIE), or you can log in on the console or with ssh (Install from Cumulus Linux) but no DHCP server is available.

You need a console connection to access the switch; you cannot perform this procedure remotely.

Install from ONIE:

  1. ONIE is in discovery mode. You must disable discovery mode with the following command:

    onie# onie-discovery-stop

    On older ONIE versions, if the onie-discovery-stop command is not supported, run:

    onie# /etc/init.d/discover.sh stop

  2. Assign a static address to eth0 with the ip addr add command:

    ONIE:/ # ip addr add 10.0.1.252/24 dev eth0

  3. Place the Cumulus Linux image in a directory on your web server.

  4. Run the installer manually (because there are no DHCP options):

    ONIE:/ # onie-nos-install http://10.0.1.251/path/to/cumulus-install-x86_64.bin

Install from Cumulus Linux:

  1. Place the Cumulus Linux image in a directory on your web server.

  2. From the Cumulus Linux command prompt, run the onie-install command, then reboot the switch.

    cumulus@switch:~$ sudo onie-install -a -i http://10.0.1.251/path/to/cumulus-install-x86_64.bin
    

Install Using FTP Without a Web Server

Follow the steps below if your laptop is on the same network as the switch eth0 interface but no DHCP server is available.

Install from ONIE:

  1. Set up DHCP or static addressing for eth0. The following example assigns a static address to eth0:

    ONIE:/ # ip addr add 10.0.1.252/24 dev eth0

  2. If you are using static addressing, disable ONIE discovery mode:

    onie# onie-discovery-stop

    On older ONIE versions, if the onie-discovery-stop command is not supported, run:

    onie# /etc/init.d/discover.sh stop

  3. Place the Cumulus Linux image into a TFTP or FTP directory.

  4. If you are not using DHCP options, run one of the following commands (tftp for TFTP or ftp for FTP):

    ONIE# onie-nos-install ftp://local-ftp-server/cumulus-install-x86_64.bin

    ONIE# onie-nos-install tftp://local-tftp-server/cumulus-install-[PLATFORM].bin

Install from Cumulus Linux:

  1. Place the Cumulus Linux image into an FTP directory (TFTP is not supported in Cumulus Linux).

  2. From the Cumulus Linux command prompt, run the following command, then reboot the switch.

    cumulus@switch:~$ sudo onie-install -a -i ftp://local-ftp-server/cumulus-install-x86_64.bin
    

Install Using a Local File

Follow the steps below to install the Cumulus Linux image referencing a local file.

  1. Set up DHCP or static addressing for eth0. The following example assigns a static address to eth0:

    ONIE:/ # ip addr add 10.0.1.252/24 dev eth0
    
  2. If you are using static addressing, disable ONIE discovery mode.

    onie# onie-discovery-stop
    

    On older ONIE versions, if the onie-discovery-stop command is not supported, run:

    onie# /etc/init.d/discover.sh stop
    
  3. Use scp to copy the Cumulus Linux image to the switch.

  4. Run the installer manually from ONIE:

    ONIE:/ # onie-nos-install /path/to/local/file/cumulus-install-x86_64.bin
    

When you install from Cumulus Linux, the onie-install command lets you stage a Cumulus Linux image and other files, such as a ZTP script or an NVUE startup.yaml file, then run the installation on the switch when you are ready.

You can provide the following file paths with the onie-install command:

  • The local file path (absolute or relative path)
  • http://server/path/
  • https://server/path/
  • scp://user@server/path/
  • ftp://server/path/ (anonymous only)
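
In automation, you can reject an unsupported location before you call onie-install. The following Python sketch classifies a location string against the list above; the function name location_kind is hypothetical, and the check looks only at the URL scheme, not at whether the server or file exists.

```python
from urllib.parse import urlsplit

# Schemes taken from the list of supported file paths in this section.
SUPPORTED_SCHEMES = {"http", "https", "scp", "ftp"}


def location_kind(location: str) -> str:
    """Return 'local file' or the URL scheme; raise for unsupported schemes."""
    if "://" not in location:
        return "local file"  # absolute or relative path
    scheme = urlsplit(location).scheme.lower()
    if scheme in SUPPORTED_SCHEMES:
        return scheme
    raise ValueError(f"unsupported location scheme: {scheme!r}")


print(location_kind("http://203.0.113.10/image-installer"))
print(location_kind("/var/lib/cumulus/installer/onie-installer"))
```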

Use these options to stage additional files with the Cumulus Linux image:

  • -z stages a ZTP script.
  • -t stages an NVUE startup.yaml file.

The following example stages an image on an HTTP server:

cumulus@cumulus:~$ sudo onie-install -i http://203.0.113.10/image-installer 

The following example stages an image and a ZTP script on an HTTP server:

cumulus@cumulus:~$ sudo onie-install -i http://203.0.113.10/image-installer -z http://203.0.113.10/ztp-script

The following example stages an image on an HTTP server and a local NVUE startup.yaml file:

cumulus@cumulus:~$ sudo onie-install -i http://203.0.113.10/image-installer -t /etc/nvue.d/startup.yaml

When you stage an NVUE startup.yaml file, ZTP still runs after the new image is installed. To prevent ZTP from running after the new image is installed, either:

  • Use the -z option to specify an existing ZTP script that takes no action.
  • Run the sudo ztp -d or nv action disable system ztp commands to disable ZTP after the new image is running.

To activate the staged installation, use the -a option, then reboot the switch:

cumulus@cumulus:~$ sudo onie-install -a
WARNING: This will wipe out all system data
WARNING: Make sure to back up your data
Are you sure (N/y)? y
Activating staged installer...done.
Reboot required to take effect.

You can combine the -i, -z, -t and -a options. In addition, you can use the -f (force) option together with the -a option to suppress the yes and no prompts:

cumulus@cumulus:~$ sudo onie-install -fa -i http://203.0.113.10/image-installer -z http://203.0.113.10/ztp-script -t /etc/nvue.d/startup.yaml
Staging installer image... Adding ZTP script...done.
Activating staged installer...done.
Reboot required to take effect.
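
If you drive installations from a script, you can assemble the onie-install command line from the options described above instead of concatenating strings by hand. The following Python sketch builds the command only and does not run it; the function name onie_install_command and its parameter names are hypothetical, and it uses only the -i, -z, -t, -a, and -f options covered in this section.

```python
import shlex


def onie_install_command(image=None, ztp=None, startup=None,
                         activate=False, force=False) -> str:
    """Build an onie-install command line from the options in this section."""
    if force and not activate:
        raise ValueError("this section describes -f only together with -a")
    argv = ["sudo", "onie-install"]
    if activate:
        argv.append("-fa" if force else "-a")
    if image:
        argv += ["-i", image]    # installer image location
    if ztp:
        argv += ["-z", ztp]      # ZTP script to stage
    if startup:
        argv += ["-t", startup]  # NVUE startup.yaml file to stage
    return shlex.join(argv)


print(onie_install_command(
    image="http://203.0.113.10/image-installer",
    ztp="http://203.0.113.10/ztp-script",
    startup="/etc/nvue.d/startup.yaml",
    activate=True,
    force=True,
))
```

Using shlex.join keeps any path that contains spaces or shell metacharacters correctly quoted.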

Install Using a USB Drive

Follow the steps below to install the Cumulus Linux image using a USB drive.

Installing Cumulus Linux using a USB drive is not scalable. Unlike USB installs, DHCP can scale to hundreds of switch installs with zero manual input.

Prepare for USB Installation

  1. From the NVIDIA Enterprise support portal, download the appropriate Cumulus Linux image for your platform.

  2. From a computer, prepare your USB drive by formatting it using one of the supported formats: FAT32, vFAT or EXT2.

    Optional: Prepare a USB Drive inside Cumulus Linux

    a. Insert your USB drive into the USB port on the switch running Cumulus Linux and log in to the switch. Examine output from cat /proc/partitions and sudo fdisk -l [device] to determine the location of your USB drive. For example, sudo fdisk -l /dev/sdb.

    These instructions assume your USB drive is the /dev/sdb device, which is typical if you insert the USB drive after the machine is already booted. However, if you insert the USB drive during the boot process, it is possible that your USB drive is the /dev/sda device. Make sure to modify the commands below to use the proper device for your USB drive.

    b. Create a new partition table on the USB drive. If the parted utility is not on the system, install it with sudo -E apt-get install parted.

    sudo parted /dev/sdb mklabel msdos
    

    c. Create a new partition on the USB drive:

    sudo parted /dev/sdb -a optimal mkpart primary 0% 100%
    

    d. Format the partition to your filesystem of choice using one of the examples below:

    sudo mkfs.ext2 /dev/sdb1
    sudo mkfs.msdos -F 32 /dev/sdb1
    sudo mkfs.vfat /dev/sdb1
    

    To use mkfs.msdos or mkfs.vfat, you need to install the dosfstools package from the Debian software repositories, as they are not included by default.

    e. To continue installing Cumulus Linux, mount the USB drive to move files:

    sudo mkdir /mnt/usb
    sudo mount /dev/sdb1 /mnt/usb
    
  3. Copy the Cumulus Linux image to the USB drive, then rename the image file to onie-installer-x86_64.

    You can also use any of the ONIE naming schemes mentioned here.

    When you rename the installation file on a Mac or Windows computer, the file extension might still be present. Make sure you remove the file extension so that ONIE can detect the file.

  4. Insert the USB drive into the switch, then prepare the switch for installation:

    • If the switch is offline, connect to the console and power on the switch.
    • If the switch is already online in ONIE, use the reboot command.

    SSH sessions to the switch get dropped after this step. To complete the remaining instructions, connect to the console of the switch. Cumulus Linux switches display their boot process to the console; you need to monitor the console specifically to complete the next step.

  5. Monitor the console and select the ONIE option from the first GRUB screen shown below.

  6. Cumulus Linux on x86 uses GRUB chainloading to present a second GRUB menu specific to the ONIE partition. No action is necessary in this menu to select the default option ONIE: Install OS.

  7. The switch recognizes the USB drive and mounts it automatically. Cumulus Linux installation begins.

  8. After installation completes, the switch automatically reboots into the newly installed instance of Cumulus Linux.
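
Copying and renaming the image can be scripted after the USB drive is mounted. The sketch below is an illustration only: stage_onie_image is a hypothetical helper name, and the demo at the end runs against temporary directories instead of a real /mnt/usb mount.

```shell
#!/bin/bash
set -euo pipefail

# stage_onie_image IMAGE MOUNTPOINT
# Copies IMAGE onto the mounted USB drive under the extension-free name
# onie-installer-x86_64 so that ONIE can detect it.
stage_onie_image() {
  local image="$1" mountpoint="$2"
  [ -f "$image" ] || { echo "image not found: $image" >&2; return 1; }
  [ -d "$mountpoint" ] || { echo "not a directory: $mountpoint" >&2; return 1; }
  cp "$image" "$mountpoint/onie-installer-x86_64"
  sync   # flush writes before the drive is removed
}

# Demo against throwaway paths (stand-ins for the real image and /mnt/usb).
demo_dir="$(mktemp -d)"
mkdir "$demo_dir/usb"
echo "stand-in image" > "$demo_dir/cumulus-linux.bin"
stage_onie_image "$demo_dir/cumulus-linux.bin" "$demo_dir/usb"
ls "$demo_dir/usb"
```

On a switch, the equivalent call would pass the downloaded image and /mnt/usb, run with sufficient privileges to write to the mount.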

ONIE Installation Options

You can run several installer command line options from ONIE to perform basic switch configuration automatically after installation completes and Cumulus Linux boots for the first time. These options enable you to:

  • Set the cumulus user password
  • Provide initial network configuration
  • Execute a ZTP script

The onie-nos-install command does not allow you to specify command line parameters. You must access the switch from the console and transfer a disk image to the switch. You must then make the disk image executable and install the image directly from the ONIE command line with the options you want to use.

The following example commands transfer a disk image to the switch, make the image executable, and install the image with the --password option to change the default cumulus user password:

ONIE:/ # wget http://myserver.datacenter.com/cumulus-linux-4.4.0-mlx-amd64.bin
ONIE:/ # chmod 755 cumulus-linux-4.4.0-mlx-amd64.bin
ONIE:/ # ./cumulus-linux-4.4.0-mlx-amd64.bin --password 'MyP4$$word'

You can run more than one option in the same command.

Set the cumulus User Password

The default cumulus user account password is cumulus. When you log into Cumulus Linux for the first time, you must provide a new password for the cumulus account, then log back into the system.

To automate this process, you can specify a new password from the command line of the installer with the --password '<clear text-password>' option. For example, to change the default cumulus user password to MyP4$$word:

ONIE:/ # ./cumulus-linux-4.4.0-mlx-amd64.bin --password 'MyP4$$word'
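
The single quotes around the password in the example matter when the command runs from a POSIX-style shell: inside double quotes, the shell replaces $$ with its own process ID. The short sketch below only demonstrates that quoting behavior; it does not call the installer.

```shell
#!/bin/bash
# Single quotes keep the password literal.
literal='MyP4$$word'
# Double quotes let the shell expand $$ to the current process ID.
expanded="MyP4$$word"

printf 'single-quoted: %s\n' "$literal"
printf 'double-quoted: %s\n' "$expanded"
```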

To provide a hashed password instead of a clear text password, use the --hashed-password '<hash>' option. An encrypted hash maintains a secure management network.

  1. Generate a SHA-512 password hash with the following openssl command. The example command generates a SHA-512 password hash for the password MyP4$$word.

    user@host:~$ openssl passwd -6 'MyP4$$word'
    $6$LXOrvmOkqidBGqu7$dy0dpYYllekNKOY/9LLrobWA4iGwL4zHsgG97qFQWAMZ3ZzMeyz11JcqtgwKDEgYR6RtjfDtdPCeuj8eNzLnS.
    
  2. Specify the new password from the command line of the installer with the --hashed-password '<hash>' option:

    ONIE:/ # ./cumulus-linux-4.4.0-mlx-amd64.bin --hashed-password '$6$LXOrvmOkqidBGqu7$dy0dpYYllekNKOY/9LLrobWA4iGwL4zHsgG97qFQWAMZ3ZzMeyz11JcqtgwKDEgYR6RtjfDtdPCeuj8eNzLnS.'
    

If you specify both the --password and --hashed-password options, the --hashed-password option takes precedence and the switch ignores the --password option.

Provide Initial Network Configuration

To provide initial network configuration automatically when Cumulus Linux boots for the first time after installation, use the --interfaces-file <filename> option. For example, to copy the contents of a file called network.intf into the /etc/network/interfaces file and run the ifreload -a command:

ONIE:/ # ./cumulus-linux-4.4.0-mlx-amd64.bin  --interfaces-file network.intf

Execute a ZTP Script

To run a ZTP script that contains commands to execute after Cumulus Linux boots for the first time after installation, use the --ztp <filename> option. For example, to run a ZTP script called initial-conf.ztp:

ONIE:/ # ./cumulus-linux-4.4.0-mlx-amd64.bin --ztp initial-conf.ztp

The ZTP script must contain the CUMULUS-AUTOPROVISIONING string near the beginning of the file and must reside on the ONIE filesystem. Refer to Zero Touch Provisioning - ZTP.

If you use the --ztp option together with any of the other command line options, the ZTP script takes precedence and the switch ignores other command line options.
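
Before copying a ZTP script to the ONIE filesystem, it can help to confirm that it carries the required marker. The following sketch is a hypothetical pre-flight check, not part of the installer; the helper name has_ztp_marker and the 10-line window (standing in for "near the beginning of the file") are assumptions.

```shell
#!/bin/bash
set -euo pipefail

# has_ztp_marker FILE [LINES]
# Succeeds when the marker string appears within the first LINES lines.
has_ztp_marker() {
  local file="$1" lines="${2:-10}"
  head -n "$lines" "$file" | grep 'CUMULUS-AUTOPROVISIONING' >/dev/null
}

# Demo with two throwaway scripts: one with the marker, one without.
demo="$(mktemp -d)"
printf '#!/bin/bash\n# CUMULUS-AUTOPROVISIONING\nexit 0\n' > "$demo/good.ztp"
printf '#!/bin/bash\nexit 0\n' > "$demo/bad.ztp"

for f in "$demo/good.ztp" "$demo/bad.ztp"; do
  if has_ztp_marker "$f"; then
    echo "$f: marker found"
  else
    echo "$f: marker missing"
  fi
done
```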

Change the Default BIOS Password

To provide a layer of security and to prevent unauthorized access to the switch, NVIDIA recommends you change the default BIOS password. The default BIOS password is admin.

To change the default BIOS password:

  1. During system boot, press Ctrl+B through the serial console while the BIOS version prints.

  2. From the Security menu, select Administrator Password.

  3. Follow the prompts.

Edit the Cumulus Linux Image (Advanced)

The Cumulus Linux disk image file contains a BASH script that includes a set of variables. You can set these variables to be able to install a fully configured system with a single image file.

To edit the image, review the example image file, then follow the steps in Edit the Image File.

Example Image File

The Cumulus Linux disk image file is a self-extracting executable. The executable part of the file is a BASH script at the beginning of the file. Towards the beginning of this BASH script is a set of variables with empty strings:

...
CL_INSTALLER_PASSWORD=''
CL_INSTALLER_HASHED_PASSWORD=''
CL_INSTALLER_LICENSE=''
CL_INSTALLER_INTERFACES_FILENAME=''
CL_INSTALLER_INTERFACES_CONTENT=''
CL_INSTALLER_ZTP_FILENAME=''
CL_INSTALLER_QUIET=""
CL_INSTALLER_FORCEINST=""
CL_INSTALLER_INTERACTIVE=""
CL_INSTALLER_EXTRACTDIR=""
CL_INSTALLER_PAYLOAD_SHA256="72a8c3da28cda7a610e272b67fa1b3a54c50248bf6abf720f73ff3d10e79ae76"

You can set these variables:

Variable | Description
CL_INSTALLER_PASSWORD | Defines the clear text password. This variable is equivalent to the ONIE installer command line option --password.
CL_INSTALLER_HASHED_PASSWORD | Defines the hashed password. This variable is equivalent to the ONIE installer command line option --hashed-password. If you set both the CL_INSTALLER_PASSWORD and CL_INSTALLER_HASHED_PASSWORD variables, CL_INSTALLER_HASHED_PASSWORD takes precedence.
CL_INSTALLER_INTERFACES_FILENAME | Defines the name of the file on the ONIE filesystem you want to use as the /etc/network/interfaces file. This variable is equivalent to the ONIE installer command line option --interfaces-file.
CL_INSTALLER_INTERFACES_CONTENT | Describes the network interfaces available on your system and how to activate them. Setting this variable defines the contents of the /etc/network/interfaces file. There is no equivalent ONIE installer command line option. If you set both the CL_INSTALLER_INTERFACES_FILENAME and CL_INSTALLER_INTERFACES_CONTENT variables, CL_INSTALLER_INTERFACES_FILENAME takes precedence.
CL_INSTALLER_ZTP_FILENAME | Defines the name of the ZTP file on the ONIE filesystem you want to execute at first boot after installation. This variable is equivalent to the ONIE installer command line option --ztp.

Edit the Image File

Because the Cumulus Linux image file is a binary file, you cannot use standard text editors to edit the file directly. Instead, you must split the file into two parts, edit the first part, then put the two parts back together.

  1. Copy the first 20 lines to an empty file:

    head -20 cumulus-linux-4.4.0-mlx-amd64.bin > cumulus-linux-4.4.0-mlx-amd64.bin.1

  2. Remove the first 20 lines of the image, then copy the remaining lines into another empty file:

    sed -e '1,20d' cumulus-linux-4.4.0-mlx-amd64.bin > cumulus-linux-4.4.0-mlx-amd64.bin.2

    The original file is now split, with the first 20 lines in cumulus-linux-4.4.0-mlx-amd64.bin.1 and the remaining lines in cumulus-linux-4.4.0-mlx-amd64.bin.2.

  3. Use a text editor to change the variables in cumulus-linux-4.4.0-mlx-amd64.bin.1.

  4. Put the two pieces back together using cat:

    cat cumulus-linux-4.4.0-mlx-amd64.bin.1 cumulus-linux-4.4.0-mlx-amd64.bin.2 > cumulus-linux-4.4.0-mlx-amd64.bin.final

  5. Calculate the new checksum and update the CL_INSTALLER_PAYLOAD_SHA256 variable:

    sed -e '1,/^exit_marker$/d' "cumulus-linux-4.4.0-mlx-amd64.bin.final" | sha256sum | awk '{ print $1 }'
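
The split, edit, and rejoin steps above can be rehearsed on a small stand-in file before touching a real image. In this sketch the file demo-image.bin, its 20-line header, and its one-line payload are all fabricated for illustration; only the head, sed, cat, and sha256sum invocations mirror the steps above.

```shell
#!/bin/bash
set -euo pipefail

work="$(mktemp -d)"
img="$work/demo-image.bin"   # fabricated stand-in, not a real installer

# Build the stand-in: 20 header lines, an exit_marker line, then a payload.
{
  echo '#!/bin/bash'
  echo "CL_INSTALLER_PASSWORD=''"
  echo "CL_INSTALLER_HASHED_PASSWORD=''"
  for i in $(seq 4 20); do echo "# header filler line $i"; done
  echo 'exit_marker'
  echo 'stand-in payload'
} > "$img"

# Split the file after line 20.
head -20 "$img" > "$img.1"
sed -e '1,20d' "$img" > "$img.2"

# Edit a variable in the first part (scripted here instead of a text editor).
sed "s/^CL_INSTALLER_PASSWORD=''/CL_INSTALLER_PASSWORD='MyP4\$\$word'/" "$img.1" > "$img.1.edited"
mv "$img.1.edited" "$img.1"

# Put the two parts back together.
cat "$img.1" "$img.2" > "$img.final"

# Compute the payload checksum with the same pipeline as the final step.
payload_sha256() {
  sed -e '1,/^exit_marker$/d' "$1" | sha256sum | awk '{ print $1 }'
}
before="$(payload_sha256 "$img")"
after="$(payload_sha256 "$img.final")"
echo "payload sha256 (original): $before"
echo "payload sha256 (final):    $after"
```

In this rehearsal only header lines change, so the two digests match. That equality is a property of the stand-in file; on a real image, still compute the checksum and update CL_INSTALLER_PAYLOAD_SHA256 as described in the final step.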

The following example shows a modified image file:

...
CL_INSTALLER_PAYLOAD_SHA256='d14a028c2a3a2bc9476102bb288234c415a2b01f828ea62ac332e42f'
CL_INSTALLER_PASSWORD='MyP4$$word'
CL_INSTALLER_HASHED_PASSWORD=''
CL_INSTALLER_LICENSE='customer@datacenter.com|4C3YMCACDiK0D/EnrxlXpj71FBBNAg4Yrq+brza4ZtJFCInvalid'
CL_INSTALLER_INTERFACES_FILENAME=''
CL_INSTALLER_INTERFACES_CONTENT='# This file describes the network interfaces available on your system and how to activate them.

source /etc/network/interfaces.d/*.intf

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0 inet dhcp
	vrf mgmt

auto bridge
iface bridge
    bridge-ports swp1 swp2
    bridge-pvid 1
    bridge-vids 10 11
    bridge-vlan-aware yes

auto mgmt
iface mgmt
	address 127.0.0.1/8
	address ::1/128
	vrf-table auto
'
CL_INSTALLER_ZTP_FILENAME=''
...

You can install this edited image file in the usual way, by using the ONIE install waterfall or the onie-nos-install command.

If you install the modified installation image and specify installer command line parameters, the command line parameters take precedence over the variables modified in the image.

Secure Boot

Secure Boot validates each binary image loaded during system boot with key signatures that correspond to a stored trusted key in firmware.

Secure Boot is only on the NVIDIA SN3700C-S switch and switches with the Spectrum-4 ASIC.

Secure Boot settings are in the BIOS Security menu. To access BIOS, press Ctrl+B through the serial console during system boot while the BIOS version prints.

To access the BIOS menu, use admin, which is the default BIOS password.

NVIDIA recommends changing the default BIOS password; navigate to Security and select Administrator Password.

To validate or change the Secure Boot mode, navigate to Security and select Secure Boot.

In the Secure Boot menu, you can enable and disable Secure Boot mode. To install an unsigned version of Cumulus Linux or access ONIE without a prompt for a username and password, set Secure Boot to disabled.

To access ONIE when Secure Boot is enabled, authentication is necessary. The default username and password are both root:

ONIE: Rescue Mode ...
Platform  : x86_64-mlnx_x86-r0
Version   : 2021.02-5.3.0006-rc3-115200
Build Date: 2021-05-20T14:27+03:00
Info: Mounting kernel filesystems... done.

Info: Mounting ONIE-BOOT on /mnt/onie-boot ...
[   17.011057] ext4 filesystem being mounted at /mnt/onie-boot supports timestamps until 2038 (0x7fffffff)
Info: Mounting EFI System on /boot/efi ...
Info: BIOS mode: UEFI
Info: Using eth0 MAC address: b8:ce:f6:3c:62:06
Info: eth0:  Checking link... up.
Info: Trying DHCPv4 on interface: eth0
ONIE: Using DHCPv4 addr: eth0: 10.20.84.226 / 255.255.255.0
Starting: klogd... done.
Starting: dropbear ssh daemon... done.
Starting: telnetd... done.
discover: Rescue mode detected.  Installer disabled.

Please press Enter to activate this console. To check the install status inspect /var/log/onie.log.
Try this:  tail -f /var/log/onie.log

** Rescue Mode Enabled **
login: root
Password: root
ONIE:~ #

To validate the Secure Boot status of a system from Cumulus Linux, run the mokutil --sb-state command.

cumulus@leaf01:mgmt:~$ mokutil --sb-state
SecureBoot enabled

On a switch with the Spectrum-4 ASIC, if the ASIC firmware fails to boot, you see a message alerting you to contact NVIDIA Customer Support for further options.

Upgrading Cumulus Linux

The default password for the cumulus user account is cumulus. The first time you log into Cumulus Linux, you must change this default password. Be sure to update any automation scripts before you upgrade. You can use ONIE command line options to change the default password automatically during the Cumulus Linux image installation process. Refer to ONIE Installation Options.

Cumulus Linux provides several options for upgrading the switch:

NVIDIA recommends deploying, provisioning, configuring, and upgrading switches using automation, even with small networks or test labs. During the upgrade process, you can upgrade dozens of devices in a repeatable manner. Using tools like Ansible, Chef, or Puppet for configuration management greatly increases the speed and accuracy of the next major upgrade; these tools also enable you to quickly swap failed switch hardware.

Before You Upgrade

Optimized image upgrade and package upgrade do not overwrite configuration files on the switch. However, upgrading Cumulus Linux with ONIE is destructive and does not save any configuration files on the switch. Before you start an upgrade with ONIE, back up configuration files to a different server.

For troubleshooting any upgrade issues, create a cl-support file before you start and after you complete the upgrade.

Back up Configuration Files

Understanding the location of configuration data is important for successful upgrades, migrations, and backup. As with other Linux distributions, the /etc directory is the primary location for all configuration data in Cumulus Linux. The following list contains the files you need to back up and migrate to a new release. Make sure you examine any changed files. Make the following files and directories part of a backup strategy.

File Name and Location | Description | Cumulus Linux Documentation | Debian Documentation
/etc/frr/ | Routing application (responsible for BGP and OSPF) | FRRouting | N/A
/etc/hostname | Configuration file for the hostname of the switch | Quick Start Guide | https://wiki.debian.org/HowTo/ChangeHostname
/etc/network/ | Network configuration files, most notably /etc/network/interfaces and /etc/network/interfaces.d/ | Switch Port Attributes | N/A
/etc/resolv.conf | DNS resolution | Not unique to Cumulus Linux: wiki.debian.org/NetworkConfiguration | https://www.debian.org/doc/manuals/debian-reference/ch05.en.html
/etc/hosts | Configuration file for the hostname of the switch | Quick Start Guide | https://wiki.debian.org/HowTo/ChangeHostname
/etc/cumulus/acl/* | Netfilter configuration | Access Control List Configuration | N/A
/etc/cumulus/control-plane/policers.conf | Configuration for control plane policers | Access Control List Configuration | N/A
/etc/cumulus/datapath/qos/qos_features.conf | QoS configuration. Note: In Cumulus Linux 5.0 and later, default ECN configuration parameters start with default_ecn_red_conf instead of default_ecn_conf. | Quality of Service | N/A
/etc/mlx/datapath/qos/qos_infra.conf | QoS configuration | Quality of Service | N/A
/etc/mlx/datapath/tcam_profile.conf | Configuration for the forwarding table profiles | Forwarding Table Size and Profiles | N/A
/etc/cumulus/datapath/traffic.conf | Configuration for the forwarding table profiles | Forwarding Table Size and Profiles | N/A
/etc/cumulus/ports.conf | Breakout cable configuration file | Switch Port Attributes | N/A; read the guide on breakout cables
/etc/cumulus/switchd.conf | switchd configuration | Configuring switchd | N/A; read the guide on switchd configuration

File Name and Location | Description | Cumulus Linux Documentation | Debian Documentation
/etc/motd | Message of the day | Not unique to Cumulus Linux | wiki.debian.org/motd
/etc/passwd | User account information | Not unique to Cumulus Linux | https://www.debian.org/doc/manuals/debian-reference/ch04.en.html
/etc/shadow | Secure user account information | Not unique to Cumulus Linux | https://www.debian.org/doc/manuals/debian-reference/ch04.en.html
/etc/group | Defines user groups on the switch | Not unique to Cumulus Linux | https://www.debian.org/doc/manuals/debian-reference/ch04.en.html
/etc/init/lldpd.conf | Link Layer Discovery Protocol (LLDP) daemon configuration | Link Layer Discovery Protocol | https://packages.debian.org/buster/lldpd
/etc/lldpd.d/ | Configuration directory for lldpd | Link Layer Discovery Protocol | https://packages.debian.org/buster/lldpd
/etc/nsswitch.conf | Name Service Switch (NSS) configuration file | TACACS | N/A
/etc/ssh/ | SSH configuration files | SSH for Remote Access | https://wiki.debian.org/SSH
/etc/sudoers, /etc/sudoers.d | Best practice is to place changes in /etc/sudoers.d/ instead of /etc/sudoers; changes in the /etc/sudoers.d/ directory are not lost during upgrade | Using sudo to Delegate Privileges |

  • If you are using the root user account, consider including /root/.
  • If you have custom user accounts, consider including /home/<username>/.

File Name and Location | Description
/etc/mlx/ | Per-platform hardware configuration directory, created on first boot. Do not copy.
/etc/default/clagd | Created and managed by ifupdown2. Do not copy.
/etc/default/grub | Grub init table. Do not modify manually.
/etc/default/hwclock | Platform hardware-specific file. Created during first boot. Do not copy.
/etc/init | Platform initialization files. Do not copy.
/etc/init.d/ | Platform initialization files. Do not copy.
/etc/fstab | Static information on filesystem. Do not copy.
/etc/image-release | System version data. Do not copy.
/etc/os-release | System version data. Do not copy.
/etc/lsb-release | System version data. Do not copy.
/etc/lvm/archive | Filesystem files. Do not copy.
/etc/lvm/backup | Filesystem files. Do not copy.
/etc/modules | Created during first boot. Do not copy.
/etc/modules-load.d/ | Created during first boot. Do not copy.
/etc/sensors.d | Platform-specific sensor data. Created during first boot. Do not copy.
/root/.ansible | Ansible tmp files. Do not copy.
/home/cumulus/.ansible | Ansible tmp files. Do not copy.
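
The file lists above translate naturally into a scripted backup. The following is a minimal sketch, assuming a hypothetical helper named backup_configs and a hand-picked subset of the paths listed for backup; extend the list to match your deployment. Paths that do not exist are skipped, and the demo runs against a temporary directory tree, not a real switch.

```shell
#!/bin/bash
set -euo pipefail

# backup_configs ROOT ARCHIVE
# Archives whichever of the listed paths exist under ROOT into ARCHIVE.
backup_configs() {
  local root="$1" archive="$2" path
  local -a found=()
  for path in etc/frr etc/hostname etc/hosts etc/network etc/resolv.conf \
              etc/cumulus/acl etc/cumulus/ports.conf etc/cumulus/switchd.conf \
              etc/ssh etc/sudoers.d; do
    if [ -e "$root/$path" ]; then
      found+=("$path")
    fi
  done
  if [ "${#found[@]}" -eq 0 ]; then
    echo "no listed paths found under $root" >&2
    return 1
  fi
  tar -C "$root" -czf "$archive" "${found[@]}"
}

# Demo against a throwaway tree; etc/mlx is present but deliberately not listed.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/etc/frr" "$demo_root/etc/mlx"
echo "leaf01" > "$demo_root/etc/hostname"
echo "bgpd=yes" > "$demo_root/etc/frr/daemons"
backup_configs "$demo_root" "$demo_root/config-backup.tar.gz"
tar -tzf "$demo_root/config-backup.tar.gz"
```

On a switch, the call would take / as the root and a path such as /tmp/config-backup.tar.gz as the archive, run with enough privileges to read protected files; copy the archive off the switch afterward.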

The following commands verify which files have changed compared to the previous Cumulus Linux install. Be sure to back up any changed files.

Back Up and Restore Configuration with NVUE

You can back up and restore the configuration file with NVUE only if you used NVUE commands to configure the switch you want to upgrade.

To back up and restore the configuration file:

  1. Save the configuration to the /etc/nvue.d/startup.yaml file with the nv config save command:

    cumulus@switch:~$ nv config save
    saved
    
  2. Copy the /etc/nvue.d/startup.yaml file off the switch to a different location.

  3. After upgrade is complete, restore the configuration. Copy the /etc/nvue.d/startup.yaml file to the switch, run the nv config patch command, then run the nv config apply command. In the following example startup.yaml is in the /home/cumulus directory on the switch:

    cumulus@switch:~$ nv config patch /home/cumulus/startup.yaml
    cumulus@switch:~$ nv config apply
    

For information about the NVUE object model and commands, see NVIDIA User Experience - NVUE.

As NVUE supports more features and introduces new syntax, existing snippets and flexible snippets can become invalid.

Before you upgrade Cumulus Linux to a new release, make sure to:

  • Review What's New for new NVUE syntax.
  • Remove any snippet that configures a feature for which NVUE introduces new syntax.

Create a cl-support File

Before and after you upgrade the switch, run the cl-support script to create a cl-support archive file. The file is a compressed archive of useful information for troubleshooting. If you experience any issues during upgrade, you can send this archive file to the Cumulus Linux support team to investigate.

  1. Create the cl-support archive file with either the NVUE nv action generate system tech-support command or the Linux sudo cl-support command:

    cumulus@switch:~$ nv action generate system tech-support

  2. Copy the cl-support file off the switch to a different location.

  3. After upgrade is complete, create a new archive file:

    cumulus@switch:~$ nv action generate system tech-support

Upgrade Cumulus Linux

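You can upgrade Cumulus Linux in one of two ways:

  • Install a Cumulus Linux image of the new release.
  • Upgrade only the changed packages with package upgrade.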

Cumulus Linux also provides ISSU to upgrade an active switch with minimal disruption to the network. See In-Service-System-Upgrade-ISSU.

  • To upgrade to Cumulus Linux 5.12 from Cumulus Linux 4.x or 3.x, you must install a disk image of the new release. You cannot use package upgrade.
  • Upgrading an MLAG pair requires additional steps. If you are using MLAG to dual connect two Cumulus Linux switches in your environment, follow the steps in Upgrade Switches in an MLAG Pair below to ensure a smooth upgrade.

Install a Cumulus Linux Image or Upgrade Packages?

The decision to upgrade Cumulus Linux by either installing a Cumulus Linux image or upgrading packages depends on your environment and your preferences. The following section provides recommendations for each upgrade method.

Install a Cumulus Linux image if you are performing a rolling upgrade in a production environment and if you are using up-to-date and comprehensive automation scripts. This upgrade method enables you to choose the exact release to which you want to upgrade and is the only method available to upgrade your switch to a new release train (for example, from 4.4.3 to 5.12).

Be aware of the following when installing the Cumulus Linux image:

Run package upgrade if you are upgrading from one Cumulus Linux 5.x release to a later 5.x release, and if you use third-party applications (package upgrade does not replace or remove third-party applications, unlike the Cumulus Linux image install).

Be aware of the following when upgrading packages:

Install an Image

Optimized Cumulus Linux image install uses two partitions to upgrade the image with just one reboot cycle and takes less time than installing the image with ONIE, which requires two reboots.

With two partitions on the switch, the current image boots from one partition, from which the image upgrade triggers. After detecting the running partition, and checking if the second partition is available for installation, optimized upgrade starts to stage the installation in the second partition (copying the image, preparing the partition, unpacking the new image, and tuning and finalizing the new partition for the new image). The subsequent boot occurs from the second partition.

To upgrade the switch with optimized install:

  1. Download the Cumulus Linux image with the nv action fetch system image <remote-url> command:

    cumulus@switch:~$ nv action fetch system image http://10.0.1.251/cumulus-linux-5.12.0-mlx-amd64.bin
    
  2. Install the image on the second partition:

    cumulus@switch:~$ nv action install system image
    

    Use the force option to force install the image:

    cumulus@switch:~$ nv action install system image force
    
  3. Set the boot partition:

    cumulus@switch:~$ nv action boot-next system image other 
    
  4. Reboot the switch:

    cumulus@switch:~$ reboot
    
  • To rename a Cumulus Linux image on the switch, run the nv action rename system image files <image> <new-image-name> command.
  • To delete a Cumulus Linux image from the switch, run the nv action delete system image files <image> command.

To show information about the Cumulus Linux image:

cumulus@switch:~$ nv show system image

To list the available Cumulus Linux image files:

cumulus@switch:~$ nv show system image files

To show information about a specific Cumulus Linux image file:

cumulus@switch:~$ nv show system image files cumulus-linux-5.12.0-mlx-amd64.bin

To upgrade the switch with optimized install using the Linux cl-image-upgrade command:

  1. Download the Cumulus Linux image to the switch.

  2. Install the image on the second partition:

    cumulus@switch:~$ cl-image-upgrade -u cumulus-linux-5.12.0-mlx-amd64.bin
    

To check the current boot partition status, run the cl-image-upgrade -s command:

cumulus@switch:~$ cl-image-upgrade -s  
Current system partition is 1 on /dev/sda5 
Current system partition has "Cumulus Linux 5.12.0" 
Other system partition is 2 on /dev/sda6 
Other system partition has "Cumulus Linux 5.12.0" 
Next boot to partition 1. 
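
For scripted checks, the status text can be parsed. The sketch below feeds the sample output shown above into awk; it assumes the line formats stay exactly as in that sample, which is not guaranteed across releases.

```shell
#!/bin/bash
set -euo pipefail

# Sample text copied from the example output. On a switch, this could be
# replaced with the live output of cl-image-upgrade -s.
status='Current system partition is 1 on /dev/sda5
Current system partition has "Cumulus Linux 5.12.0"
Other system partition is 2 on /dev/sda6
Other system partition has "Cumulus Linux 5.12.0"
Next boot to partition 1.'

# The partition number is the fifth whitespace-separated field on each line.
current="$(printf '%s\n' "$status" | awk '/^Current system partition is/ { print $5 }')"
other="$(printf '%s\n' "$status" | awk '/^Other system partition is/ { print $5 }')"
# Strip the trailing period from the "Next boot" line before printing.
next="$(printf '%s\n' "$status" | awk '/^Next boot to partition/ { sub(/\.$/, "", $5); print $5 }')"

if [ "$next" = "$current" ]; then
  echo "Next boot stays on the current partition ($current)."
else
  echo "Next boot switches to partition $next."
fi
```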

To activate the other partition at next boot, run the cl-image-upgrade -a command:

cumulus@switch:~$ cl-image-upgrade -a 

ONIE is an open source project (equivalent to PXE on servers) that enables the installation of network operating systems (NOS) on a bare metal switch.

To upgrade the switch with ONIE:

  1. Back up the configurations off the switch.

  2. Download the Cumulus Linux image.

  3. Install the Cumulus Linux image with the onie-install -a -i <image-location> command, which boots the switch into ONIE. The following example command installs the image from a web server, then reboots the switch. There are additional ways to install the Cumulus Linux image, such as using FTP, a local file, or a USB drive. For more information, see Installing a New Cumulus Linux Image.

    cumulus@switch:~$ sudo onie-install -a -i http://10.0.1.251/cumulus-linux-5.12.0-mlx-amd64.bin && sudo reboot
    
  4. Restore the configuration files to the new release (NVIDIA does not recommend restoring files with automation).

  5. Verify correct operation with the old configurations on the new release.

  6. Reinstall third-party applications and associated configurations.

Package Upgrade

  • NVUE deprecated the port split command options (2x10G, 2x25G, 2x40G, 2x50G, 2x100G, 2x200G, 4x10G, 4x25G, 4x50G, 4x100G, 8x50G) available in Cumulus Linux 5.3 and earlier. If you use NVUE to configure port breakout speeds in Cumulus Linux 5.3 or earlier, NVUE automatically updates the configuration during upgrade to Cumulus Linux 5.5 and later to use the new format (2x, 4x, 8x).
  • Cumulus Linux continues to support the old port split format in the /etc/cumulus/ports.conf file; however, NVIDIA recommends that you use the new format.

Cumulus Linux completely embraces the Linux and Debian upgrade workflow, where you use an installer to install a base image, then perform any package upgrades within that release train. Any packages that have changed after the base install get upgraded in place from the repository. All switch configuration files remain untouched, or in rare cases merged during the package upgrade.

When you use package upgrade to upgrade your switch, configuration data stays in place during the upgrade. If the new release updates a previously changed configuration file, the upgrade process prompts you to either specify the version you want to use or evaluate the differences.

Disk Space Requirements

Make sure you have enough disk space to perform a package upgrade. To upgrade from Cumulus Linux 5.11 to Cumulus Linux 5.12, you need 0.8GB of free disk space.

Before you upgrade, run the NVUE nv show system disk usage command or the Linux sudo df -h command to show how much disk space you are currently using on the switch.

cumulus@switch:~$ nv show system disk usage 
Mount Point   Filesystem   Size   Used         Avail   Use% 
-----------   ----------   --     ---------    ----    ---- 
/             /dev/sda5    5.4G    3.0G        2.2G     58% 
/dev          udev         2.0G    0           2.0G     0% 
/dev/shm      tmpfs        2.1G    61M         2.0G     3% 
/run          tmpfs        411M    38M         374M     10% 
/run/lock     tmpfs        5.0M    0           5.0M     0% 
/tmp          tmpfs        2.1G    12K         2.1G     1% 
/vagrant      vagrant      4.3T    3.1T        1.3T     72% 
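
The free-space check can also be automated with standard tools. The following sketch compares the space available on / against the 0.8 GB figure above; the helper name has_enough_space is hypothetical, and treating "GB" as GiB is an assumption.

```shell
#!/bin/bash
set -euo pipefail

# 0.8 GB expressed in KiB (0.8 * 1024 * 1024, rounded up). Treating "GB" as
# GiB is an assumption; adjust the value if you prefer decimal units.
required_kib=838861

# has_enough_space AVAILABLE_KIB
has_enough_space() {
  [ "$1" -ge "$required_kib" ]
}

# Available space on / in KiB, taken from POSIX-format df output.
avail_kib="$(df -Pk / | awk 'NR == 2 { print $4 }')"

if has_enough_space "$avail_kib"; then
  echo "OK: ${avail_kib} KiB available on /"
else
  echo "WARNING: only ${avail_kib} KiB available on /; 0.8 GB required"
fi
```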

Upgrade from Cumulus Linux 5.9.x to Cumulus Linux 5.12.0

If you are running Cumulus Linux 5.9.x (the current extended-support release), the default switch configuration allows you to upgrade to the latest Cumulus Linux 5.9.x release only.

To upgrade from Cumulus Linux 5.9.x to Cumulus Linux 5.11.0 or later, perform the following procedure before you start the package upgrade:

  1. Edit the /etc/apt/sources.list file to include the following lines at the top of the file.

    cumulus@switch:~$ sudo nano /etc/apt/sources.list
    deb      https://apt.cumulusnetworks.com/repo CumulusLinux-d12-latest cumulus upstream netq
    deb-src  https://apt.cumulusnetworks.com/repo CumulusLinux-d12-latest cumulus upstream netq
    
  2. Remove or comment out the following lines in the /etc/apt/sources.list file:

    deb      https://apt.cumulusnetworks.com/repo CumulusLinux-5.9-latest cumulus upstream netq
    deb-src  https://apt.cumulusnetworks.com/repo CumulusLinux-5.9-latest cumulus upstream netq
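
The two edits can be rehearsed with sed on a copy of the file before changing /etc/apt/sources.list itself. In the sketch below, the temporary file is a stand-in for a copy of the real sources.list, and commenting out the 5.9 lines (instead of deleting them) is one of the two options the step allows.

```shell
#!/bin/bash
set -euo pipefail

work="$(mktemp -d)"
copy="$work/sources.list"   # stand-in for a copy of /etc/apt/sources.list
cat > "$copy" <<'EOF'
deb      https://apt.cumulusnetworks.com/repo CumulusLinux-5.9-latest cumulus upstream netq
deb-src  https://apt.cumulusnetworks.com/repo CumulusLinux-5.9-latest cumulus upstream netq
EOF

{
  # New repository lines go at the top of the file.
  echo 'deb      https://apt.cumulusnetworks.com/repo CumulusLinux-d12-latest cumulus upstream netq'
  echo 'deb-src  https://apt.cumulusnetworks.com/repo CumulusLinux-d12-latest cumulus upstream netq'
  # Comment out the active 5.9 lines; every other line passes through unchanged.
  sed -e '/^deb.*CumulusLinux-5\.9-latest/ s/^/# /' "$copy"
} > "$copy.new"

cat "$copy.new"
```

Review the generated file before using it to replace the real one.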
    

Upgrade the Switch

To upgrade the switch using package upgrade:

  1. Back up the configurations from the switch.

  2. Fetch the latest update metadata from the repository and review potential upgrade issues (in some cases, upgrading new packages might also upgrade additional existing packages due to dependencies).

    cumulus@switch:~$ nv action upgrade system packages to latest use-vrf default dry-run
    

    By default, the NVUE nv action upgrade system packages command runs in the management VRF. To run the command in a non-management VRF such as default, you must use the use-vrf <vrf> option.

  3. Upgrade all the packages to the latest distribution.

    cumulus@switch:~$ nv action upgrade system packages to latest use-vrf default
    

    By default, the NVUE nv action upgrade system packages command runs in the management VRF. To run the command in a non-management VRF such as default, you must use the use-vrf <vrf> option.

    If you see errors for expired GPG keys that prevent you from upgrading packages, follow the steps in Upgrading Expired GPG Keys.

  4. After the upgrade completes, check if you need to reboot the switch, then reboot the switch if required:

    cumulus@switch:~$ nv show system reboot required
    yes
    cumulus@switch:~$ nv action reboot system
    
  5. Verify correct operation with the old configurations on the new version.

To upgrade the switch using package upgrade with Linux commands:

  1. Back up the configurations from the switch.

  2. Fetch the latest update metadata from the repository.

    cumulus@switch:~$ sudo -E apt-get update
    
  3. Review potential upgrade issues (in some cases, upgrading new packages might also upgrade additional existing packages due to dependencies).

    cumulus@switch:~$ sudo -E apt-get upgrade --dry-run
    
  4. Upgrade all the packages to the latest distribution.

    cumulus@switch:~$ sudo -E apt-get upgrade
    

    If you do not need to reboot the switch after the upgrade completes, the upgrade ends, restarts all upgraded services, and logs messages in the /var/log/syslog file similar to the ones shown below. In the examples below, the process only upgrades the frr package.

    Policy: Service frr.service action stop postponed
    Policy: Service frr.service action start postponed
    Policy: Restarting services: frr.service
    Policy: Finished restarting services
    Policy: Removed /usr/sbin/policy-rc.d
    Policy: Upgrade is finished
    

    If the upgrade process encounters changed configuration files that have new versions in the release to which you are upgrading, you see a message similar to this:

    Configuration file '/etc/frr/daemons'
    ==> Modified (by you or by a script) since installation.
    ==> Package distributor has shipped an updated version.
    What would you like to do about it ? Your options are:
    Y or I : install the package maintainer's version
    N or O : keep your currently-installed version
    D : show the differences between the versions
    Z : start a shell to examine the situation
    The default action is to keep your current version.
    *** daemons (Y/I/N/O/D/Z) [default=N] ?
    
    • To see the differences between the currently installed version and the new version, type D.
    • To keep the currently installed version, type N. The new package version installs with the suffix .dpkg-dist (for example, /etc/frr/daemons.dpkg-dist). When the upgrade completes and before you reboot, merge your changes with the changes from the newly installed file.
    • To install the new version, type I. Your currently installed version has the suffix .dpkg-old.
    • Cumulus Linux includes /etc/apt/sources.list in the cumulus-archive-keyring package. During the upgrade, you must choose whether to install the new version from the package or keep the existing file.

    When the upgrade is complete, you can search for the files with the sudo find / -mount -type f -name '*.dpkg-*' command.

    If you see errors for expired GPG keys that prevent you from upgrading packages, follow the steps in Upgrading Expired GPG Keys.

  5. Reboot the switch if the upgrade messages indicate that you need to perform a system restart.

    cumulus@switch:~$ sudo -E apt-get upgrade
    ... upgrade messages here ...
    
    *** Caution: Service restart prior to reboot could cause unpredictable behavior
    *** System reboot required ***
    cumulus@switch:~$ sudo reboot
    
  6. Verify correct operation with the old configurations on the new version.
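
When the upgrade leaves .dpkg-dist or .dpkg-old files behind, it helps to see each leftover next to the live file it belongs to. The following is a minimal sketch built on the find command from step 4. The function name list_dpkg_leftovers and the default search root of /etc are assumptions for illustration; the command in step 4 searches from /.

```shell
# Print "<live file><TAB><leftover file>" for every *.dpkg-* file under a
# directory, so that each pair can be passed to diff for review.
list_dpkg_leftovers() {
  # -mount keeps find on one file system, as in the command from step 4
  find "${1:-/etc}" -mount -type f -name '*.dpkg-*' 2>/dev/null | sort |
  while IFS= read -r leftover; do
    # Strip the trailing .dpkg-<suffix> to recover the live file name
    printf '%s\t%s\n' "${leftover%.dpkg-*}" "$leftover"
  done
}

# Example review loop on a switch (not executed here):
#   list_dpkg_leftovers /etc | while IFS="$(printf '\t')" read -r live left; do
#     sudo diff -u "$live" "$left"
#   done
```

Reviewing the pairs before you reboot makes it easier to merge your changes with the package maintainer's version.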

Upgrade Notes

Package upgrade always updates to the latest available release in the Cumulus Linux repository. For example, if you are currently running Cumulus Linux 5.0.0 and perform a package upgrade, the packages upgrade to the versions in the latest 5.x release.

Cumulus Linux is a collection of different Debian Linux packages; be aware of the following:

Upgrade Switches in an MLAG Pair

If you are using MLAG to dual connect two switches in your environment, follow the steps below to upgrade the switches.

You must upgrade both switches in the MLAG pair to the same release of Cumulus Linux.

Cumulus Linux supports different software versions between MLAG peer switches only during the upgrade process. After you upgrade the first MLAG switch in the pair, run the clagctl showtimers command to monitor the init-delay timer. When the timer expires, make the upgraded MLAG switch the primary, then upgrade the peer to the same version of Cumulus Linux.

NVIDIA has not tested running different versions of Cumulus Linux on MLAG peer switches outside of the upgrade time period; you might see unexpected results.

  1. Verify the switch is in the secondary role:

    cumulus@switch:~$ nv show mlag
    
  2. Shut down the core uplink layer 3 interfaces. The following example shuts down swp1:

    cumulus@switch:~$ nv set interface swp1 link state down
    cumulus@switch:~$ nv config apply
    
  3. Shut down the peer link:

    cumulus@switch:~$ nv set interface peerlink link state down
    cumulus@switch:~$ nv config apply
    
  4. To boot the switch into ONIE, run the onie-install -a -i <image-location> command. The following example command installs the image from a web server. There are additional ways to install the Cumulus Linux image, such as using FTP, a local file, or a USB drive. For more information, see Installing a New Cumulus Linux Image.

    cumulus@switch:~$ sudo onie-install -a -i http://10.0.1.251/downloads/cumulus-linux-5.12.0-mlx-amd64.bin
    

    To upgrade the switch with package upgrade instead of booting into ONIE, see Package Upgrade.

  5. Save the changes to the NVUE configuration from steps 2 and 3, then reboot the switch:

    cumulus@switch:~$ nv config save
    cumulus@switch:~$ nv action reboot system
    
  6. If you installed a new image on the switch, restore the configuration files to the new release. If you performed an upgrade with apt, bring up the uplink and peer link interfaces that you shut down in steps 2 and 3:

    cumulus@switch:~$ nv set interface swp1 link state up
    cumulus@switch:~$ nv set interface peerlink link state up
    cumulus@switch:~$ nv config apply
    cumulus@switch:~$ nv config save
    
  7. Verify STP convergence across both switches with the Linux mstpctl showall command. NVUE does not provide an equivalent command.

    cumulus@switch:~$ mstpctl showall
    
  8. Verify core uplinks and peer links are UP:

    cumulus@switch:~$ nv show interface
    
  9. Verify MLAG convergence:

    cumulus@switch:~$ nv show mlag
    
  10. Make this secondary switch the primary:

    cumulus@switch:~$ nv set mlag priority 2048
    cumulus@switch:~$ nv config apply
    
  11. Verify the other switch is now in the secondary role.

  12. Repeat steps 2-9 on the new secondary switch.

  13. Remove the priority 2048 and restore the priority back to 32768 on the current primary switch:

    cumulus@switch:~$ nv set mlag priority 32768
    cumulus@switch:~$ nv config apply
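
If the switch has several core uplinks, you can generate the commands for steps 2 and 3 instead of typing them one by one. The following is a minimal sketch; the function name mlag_isolate_cmds is illustrative, and the function only prints commands so that you can review them before you run anything.

```shell
# Print the NVUE commands from steps 2 and 3 for a list of uplink ports.
# Nothing is executed; the output is meant for review first.
mlag_isolate_cmds() {
  # Step 2: shut down each core uplink passed as an argument
  for mi_ifname in "$@"; do
    printf 'nv set interface %s link state down\n' "$mi_ifname"
  done
  printf 'nv config apply\n'
  # Step 3: shut down the peer link only after the uplinks are down
  printf 'nv set interface peerlink link state down\n'
  printf 'nv config apply\n'
}

# Example: mlag_isolate_cmds swp1 swp2
```

Printing the uplink commands before the peer link command preserves the order of the procedure above.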
    
  1. Verify the switch is in the secondary role:

    cumulus@switch:~$ clagctl status
    
  2. Shut down the core uplink layer 3 interfaces:

    cumulus@switch:~$ sudo ip link set <switch-port> down
    
  3. Shut down the peer link:

    cumulus@switch:~$ sudo ip link set peerlink down
    
  4. To boot the switch into ONIE, run the onie-install -a -i <image-location> command. The following example command installs the image from a web server. There are additional ways to install the Cumulus Linux image, such as using FTP, a local file, or a USB drive. For more information, see Installing a New Cumulus Linux Image.

    cumulus@switch:~$ sudo onie-install -a -i http://10.0.1.251/downloads/cumulus-linux-5.12.0-mlx-amd64.bin
    

    To upgrade the switch with package upgrade instead of booting into ONIE, see Package Upgrade.

  5. Reboot the switch:

    cumulus@switch:~$ sudo reboot
    
  6. If you installed a new image on the switch, restore the configuration files to the new release.

  7. Verify STP convergence across both switches:

    cumulus@switch:~$ mstpctl showall
    
  8. Verify that core uplinks and peer links are UP:

    cumulus@switch:~$ ip addr show
    
  9. Verify MLAG convergence:

    cumulus@switch:~$ clagctl status
    
  10. Make this secondary switch the primary:

    cumulus@switch:~$ clagctl priority 2048
    
  11. Verify the other switch is now in the secondary role.

  12. Repeat steps 2-9 on the new secondary switch.

  13. Remove the priority 2048 and restore the priority back to 32768 on the current primary switch:

    cumulus@switch:~$ clagctl priority 32768
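
Steps 1 and 11 ask you to confirm the MLAG role. The following is a minimal sketch for reading the local role from clagctl status output. It rests on an assumption that this document does not confirm: that the output contains a line of the form "Our Priority, ID, and Role: <priority> <id> <role>". Check the output on your release before relying on it. The function name mlag_local_role is illustrative.

```shell
# Extract the local MLAG role from `clagctl status` output read on stdin.
# Assumption: the output contains a line such as
#   Our Priority, ID, and Role: 32768 44:38:39:00:00:11 secondary
mlag_local_role() {
  # The role is the last field of the matching line
  awk '/Our Priority, ID, and Role:/ { print $NF; exit }'
}

# On a switch (not executed here):
#   clagctl status | mlag_local_role
```

If the function prints nothing, the expected line was not found and you need to verify the role manually.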
    

Roll Back a Cumulus Linux Installation

Even the most well planned and tested upgrades can result in unforeseen problems and sometimes the best solution is to roll back to the previous state. These main strategies require detailed planning and execution:

The method you employ is specific to your deployment strategy. Providing detailed steps for each scenario is outside the scope of this document.

Third Party Packages

If you install any third party applications on a Cumulus Linux switch, the configuration data typically resides in the /etc directory, but this is not guaranteed. It is your responsibility to understand the behavior and configuration file information of any third party packages installed on the switch.

After you upgrade using a full Cumulus Linux image install, you need to reinstall any third party packages or any Cumulus Linux add-on packages.
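
One way to work out which packages to reinstall is to record the installed package names before the image install and compare them with the list afterward. The following is a minimal sketch; the function name packages_to_reinstall and the file names are illustrative. Keep the first list off the switch, because it has to survive the image install.

```shell
# Print packages that are listed in the "before" file but not in the
# "after" file. Both files hold one package name per line.
packages_to_reinstall() {
  # -v invert, -x match whole lines, -F fixed strings, -f patterns from file
  grep -vxF -f "$2" "$1"
}

# Creating the lists on a switch (not executed here):
#   dpkg-query -W -f='${Package}\n' > before.txt   # before the image install
#   dpkg-query -W -f='${Package}\n' > after.txt    # after the image install
#   packages_to_reinstall before.txt after.txt
```

The function returns a non-zero exit status when there is nothing to reinstall, which is the normal grep behavior when no lines are selected.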

Adding and Updating Packages

To manage additional applications in the form of packages and to install the latest updates, use the Advanced Packaging Tool (apt).

Updating, upgrading, and installing packages with apt causes disruptions to network services:

  • Upgrading a package can cause services to restart or stop.
  • Installing a package sometimes disrupts core services by changing core service dependency packages. In some cases, installing new packages also upgrades additional existing packages due to dependencies.
  • If services stop, you need to reboot the switch to restart the services.

Update the Package Cache

To work correctly, apt relies on a local cache listing of the available packages. You must populate the cache initially, then periodically update it with sudo -E apt-get update:

cumulus@switch:~$ sudo -E apt-get update
Ign:1 copy:/var/lib/cumulus/cumulus-local-apt-archive cumulus-local-apt-archive InRelease
Get:2 copy:/var/lib/cumulus/cumulus-local-apt-archive cumulus-local-apt-archive Release [1,115 B]
Ign:3 copy:/var/lib/cumulus/cumulus-local-apt-archive cumulus-local-apt-archive Release.gpg
Get:4 http://security.debian.org buster/updates InRelease [65.4 kB]                 
Hit:5 http://deb.debian.org/debian buster InRelease                                 
Get:6 http://deb.debian.org/debian buster-updates InRelease [51.9 kB]
Get:7 http://deb.debian.org/debian buster-backports InRelease [46.7 kB]
Get:8 http://deb.debian.org/debian buster-updates/main Sources.diff/Index [8,608 B] 
Get:9 http://deb.debian.org/debian buster-updates/main amd64 Packages.diff/Index [8,608 B]
Get:10 http://deb.debian.org/debian buster-updates/main Sources 2021-09-28-1420.03.pdiff [185 B]
Get:10 http://deb.debian.org/debian buster-updates/main Sources 2021-09-28-1420.03.pdiff [185 B]
Get:11 http://deb.debian.org/debian buster-updates/main amd64 Packages 2021-09-28-1420.03.pdiff [184 B]               
Get:11 http://deb.debian.org/debian buster-updates/main amd64 Packages 2021-09-28-1420.03.pdiff [184 B]               
Get:12 http://deb.debian.org/debian buster-backports/main Sources.diff/Index [27.8 kB]                     
Get:13 http://deb.debian.org/debian buster-backports/main amd64 Packages.diff/Index [27.8 kB]                         
Hit:14 http://apps3.cumulusnetworks.com/repos/deb CumulusLinux-4 InRelease                                            
Get:15 http://security.debian.org buster/updates/main Sources [200 kB]                             
Get:16 http://security.debian.org buster/updates/main amd64 Packages [305 kB]              
Hit:17 http://apt.cumulusnetworks.com/repo CumulusLinux-4-latest InRelease                       
Get:18 http://deb.debian.org/debian buster-backports/main Sources 2021-10-02-0801.17.pdiff [681 B]
Get:19 http://deb.debian.org/debian buster-backports/main Sources 2021-10-02-1405.24.pdiff [31 B]
Get:19 http://deb.debian.org/debian buster-backports/main Sources 2021-10-02-1405.24.pdiff [31 B]
Get:20 http://deb.debian.org/debian buster-backports/main amd64 Packages 2021-10-02-1405.24.pdiff [178 B]
Get:20 http://deb.debian.org/debian buster-backports/main amd64 Packages 2021-10-02-1405.24.pdiff [178 B]
Fetched 744 kB in 1s (982 kB/s)
Reading package lists... Done

Use the -E option with sudo whenever you run any apt-get command. This option preserves your environment variables (such as HTTP proxies) before you install new packages or upgrade your distribution.
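
To see what sudo -E would carry over, you can list the proxy variables set in your shell before you run apt-get. The following is a minimal sketch; the function name show_proxy_env is illustrative, and the variable names are the conventional ones, so your environment might use others.

```shell
# List the proxy-related variables that are set in the current shell.
# These are conventional names, not a list defined by Cumulus Linux.
show_proxy_env() {
  for sp_name in http_proxy https_proxy ftp_proxy no_proxy; do
    # Indirect variable lookup that also works in POSIX sh
    eval "sp_value=\${$sp_name:-}"
    if [ -n "$sp_value" ]; then
      printf '%s=%s\n' "$sp_name" "$sp_value"
    fi
  done
  return 0
}

# Example:
#   show_proxy_env
#   sudo -E apt-get update
```

If the function prints nothing, no proxy variables are set and apt-get connects to the repositories directly.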

List Available Packages

After the cache populates, use the apt-cache command to search the cache and find the packages of interest or to get information about an available package.

The following shows examples of the search and show sub-commands:

cumulus@switch:~$ apt-cache search tcp
collectd-core - statistics collection and monitoring daemon (core system)
fakeroot - tool for simulating superuser privileges
iperf - Internet Protocol bandwidth measuring tool
iptraf-ng - Next Generation Interactive Colorful IP LAN Monitor
libfakeroot - tool for simulating superuser privileges - shared libraries
libfstrm0 - Frame Streams (fstrm) library
libibverbs1 - Library for direct userspace use of RDMA (InfiniBand/iWARP)
libnginx-mod-stream - Stream module for Nginx
libqt4-network - Qt 4 network module
librtr-dev - Small extensible RPKI-RTR-Client C library - development files
librtr0 - Small extensible RPKI-RTR-Client C library
libwiretap8 - network packet capture library -- shared library
libwrap0 - Wietse Venema's TCP wrappers library
libwrap0-dev - Wietse Venema's TCP wrappers library, development files
netbase - Basic TCP/IP networking system
nmap-common - Architecture independent files for nmap
nuttcp - network performance measurement tool
openssh-client - secure shell (SSH) client, for secure access to remote machines
openssh-server - secure shell (SSH) server, for secure access from remote machines
openssh-sftp-server - secure shell (SSH) sftp server module, for SFTP access from remote machines
python-dpkt - Python 2 packet creation / parsing module for basic TCP/IP protocols
rsyslog - reliable system and kernel logging daemon
socat - multipurpose relay for bidirectional data transfer
tcpdump - command-line network traffic analyzer
cumulus@switch:~$ apt-cache show tcpdump
Package: tcpdump
Version: 4.9.3-1~deb10u1
Installed-Size: 1109
Maintainer: Romain Francoise <rfrancoise@debian.org>
Architecture: amd64
Replaces: apparmor-profiles-extra (<< 1.12~)
Depends: libc6 (>= 2.14), libpcap0.8 (>= 1.5.1), libssl1.1 (>= 1.1.0)
Suggests: apparmor (>= 2.3)
Breaks: apparmor-profiles-extra (<< 1.12~)
Size: 400060
SHA256: 3a63be16f96004bdf8848056f2621fbd863fadc0baf44bdcbc5d75dd98331fd3
SHA1: 2ab9f0d2673f49da466f5164ecec8836350aed42
MD5sum: 603baaf914de63f62a9f8055709257f3
Description: command-line network traffic analyzer
 This program allows you to dump the traffic on a network. tcpdump
 is able to examine IPv4, ICMPv4, IPv6, ICMPv6, UDP, TCP, SNMP, AFS
 BGP, RIP, PIM, DVMRP, IGMP, SMB, OSPF, NFS and many other packet
 types.
 .
 It can be used to print out the headers of packets on a network
 interface, filter packets that match a certain expression. You can
 use this tool to track down network problems, to detect attacks
 or to monitor network activities.
Description-md5: f01841bfda357d116d7ff7b7a47e8782
Homepage: http://www.tcpdump.org/
Multi-Arch: foreign
Section: net
Priority: optional
Filename: pool/upstream/t/tcpdump/tcpdump_4.9.3-1~deb10u1_amd64.deb

The search command looks for the search terms not only in the package name but also in other parts of the package information, so it can match more packages than you expect.
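
To narrow the results to packages whose name contains the search term, you can filter the output. The following is a minimal sketch; the function name filter_by_name is illustrative. The apt-cache search --names-only option provides similar behavior without a filter.

```shell
# Keep only the `apt-cache search` result lines whose package NAME contains
# the term. Reads "name - description" lines on stdin.
filter_by_name() {
  # Split each line at " - " and test only the first field (the name)
  awk -v term="$1" -F' - ' 'index($1, term) { print }'
}

# On a switch (not executed here):
#   apt-cache search tcp | filter_by_name tcp
```

In the example output above, this filter keeps nuttcp and tcpdump but drops netbase, which matches only in its description.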

List Packages Installed on the System

The apt-cache command shows information about all the packages available in the repository. To see which packages are actually installed on your system, run the following command.

cumulus@switch:~$ nv show platform software installed
acpi                                   libfreeipmi17                          libyajl2
acpid                                  libfreetype6                           libyaml-0-2
acpi-support-base                      libfstrm0                              libyang2
adduser                                libfuse2                               libyuv0
apt                                    libgav1-1                              libzmq5
arping                                 libgcc-12-dev                          libzstd1
arptables                              libgcc-s1                              linux-base
atftp                                  libgcrypt20                            linux-image-6.1.0-cl-1-amd64
atftpd                                 libgd3                                 linux-image-amd64
auditd                                 libgdbm6                               linux-libc-dev
babeltrace                             libgdbm-compat4                        linux-perf
base-files                             libgee-0.8-2                           linuxptp
base-passwd                            libgeoip1                              linux-selftests
...
cumulus@switch:~$ dpkg -l
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                Version                   Architecture Description
+++-===================-=========================-============-=================================
ii  acpi                1.7-1.1                   amd64        displays information on ACPI devices
ii  acpi-support-base   0.142-8                   all          scripts for handling base ACPI events such as th
ii  acpid               1:2.0.31-1                amd64        Advanced Configuration and Power Interface event
ii  adduser             3.118                     all          add and remove users and groups
ii  apt                 1.8.2                     amd64        commandline package manager
ii  arping              2.19-6                    amd64        sends IP and/or ARP pings (to the MAC address)
ii  arptables           0.0.4+snapshot20181021-4  amd64        ARP table administration
...
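
For scripts, a plain list of names and versions is easier to process than the full dpkg -l table. The following is a minimal sketch that reads dpkg -l output; the function name installed_versions is illustrative.

```shell
# Print "name version" for every fully installed package (status "ii")
# from `dpkg -l` output read on stdin. Header lines are skipped because
# their first field is never exactly "ii".
installed_versions() {
  awk '$1 == "ii" { print $2, $3 }'
}

# On a switch (not executed here):
#   dpkg -l | installed_versions
```

The dpkg-query -W -f='${Package} ${Version}\n' command produces a similar list directly.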

Show the Version of a Package

To show the version of a specific package installed on the system:

The following example command shows which version of the vrf package is on the system:

cumulus@switch:~$ nv show platform software installed vrf
             operational        
-----------  -------------------
package      vrf                
version      1.0-cl5.9.0u4      
description  Linux tools for VRF

The following example command shows which version of the vrf package is on the system:

cumulus@switch:~$ dpkg -l vrf
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name       Version        Architecture Description
+++-==========-==============-============-=================================
ii  vrf        1.0-cl5.9.0u4  amd64        Linux tools for VRF

Upgrade Packages

To upgrade all the packages installed on the system to their latest versions, run the following commands:

cumulus@switch:~$ nv action upgrade system packages to latest use-vrf default

By default, the NVUE nv action upgrade system packages command runs in the management VRF. To run the command in a non-management VRF such as default, you must use the use-vrf <vrf> option.

cumulus@switch:~$ sudo -E apt-get update
cumulus@switch:~$ sudo -E apt-get upgrade

The system lists the packages for upgrade and prompts you to continue.

The above commands upgrade all installed versions with their latest versions but do not install any new packages.

Add New Packages

To add a new package, first ensure the package is not already on the system:

cumulus@switch:~$ dpkg -l | grep <name of package>
cumulus@switch:~$ sudo -E apt-get update
cumulus@switch:~$ sudo -E apt-get install tcpreplay
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
tcpreplay
0 upgraded, 1 newly installed, 0 to remove and 1 not upgraded.
Need to get 436 kB of archives.
After this operation, 1008 kB of additional disk space will be used
...

You can install several packages at the same time:

cumulus@switch:~$ sudo -E apt-get install <package1> <package2> <package3>

In some cases, installing a new package also upgrades additional existing packages due to dependencies. To view these additional packages before you install, run the apt-get install --dry-run command.
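
The dry-run output can be long. The following is a minimal sketch that lists only the packages the simulated run would install or upgrade, based on the "Inst <package> ..." lines that apt-get prints in simulation mode. The function name dry_run_packages is illustrative.

```shell
# List the packages that a simulated install would install or upgrade.
# Reads `apt-get install --dry-run` output on stdin; apt-get prints one
# "Inst <package> ..." line per package it would unpack.
dry_run_packages() {
  awk '$1 == "Inst" { print $2 }'
}

# On a switch (not executed here):
#   sudo -E apt-get install --dry-run tcpreplay | dry_run_packages
```

Any name in the list other than the package you asked for is a dependency that the install would add or upgrade.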

Add Packages From Another Repository

As shipped, Cumulus Linux searches the Cumulus Linux repository for available packages. You can add additional repositories to search by adding them to the list of sources that apt-get consults. See man sources.list for more information.

NVIDIA adds features or makes bug fixes to certain packages; do not replace these packages with versions from other repositories.

If you want to install packages that are not in the Cumulus Linux repository, the procedure is the same as above, but with one additional step.

NVIDIA does not test and Cumulus Linux Technical Support does not support packages that are not part of the Cumulus Linux repository.

Installing packages outside of the Cumulus Linux repository requires the use of sudo -E apt-get; however, depending on the package, you can use easy-install and other commands.

To install a new package, complete the following steps:

  1. Run the dpkg command to ensure that the package is not already installed on the system:

    cumulus@switch:~$ dpkg -l | grep <name of package>
    
  2. If the package is already on the system, ensure it is the version you need. If it is an older version, update the package from the Cumulus Linux repository:

    cumulus@switch:~$ sudo -E apt-get update
    cumulus@switch:~$ sudo -E apt-get install <name of package>
    cumulus@switch:~$ sudo -E apt-get upgrade
    
  3. If the package is not on the system, the package source location might not be in the /etc/apt/sources.list file. Edit the file and add the appropriate source. For example, add the following if you want a package from the Debian repository that is not in the Cumulus Linux repository:

    deb http://http.us.debian.org/debian bookworm main
    deb http://security.debian.org/debian-security bookworm-security main
    

    Otherwise, /etc/apt/sources.list lists the repository but comments it out. To uncomment the repository, remove the # at the start of the line, then save the file.

  4. Run sudo -E apt-get update, then install the package and upgrade:

    cumulus@switch:~$ sudo -E apt-get update
    cumulus@switch:~$ sudo -E apt-get install <name of package>
    cumulus@switch:~$ sudo -E apt-get upgrade
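
If the repository is already in the sources file but commented out, you can script the edit from step 3. The following is a minimal sketch; the function name uncomment_repo and the example URL are illustrative. It assumes the URL contains no | character and treats the URL as a regular expression, so a dot matches any character.

```shell
# Uncomment the deb lines in a sources.list-style file that mention a
# given repository URL. A backup copy is kept as <file>.bak.
uncomment_repo() {
  ur_file="$1"
  ur_url="$2"
  cp -p "$ur_file" "$ur_file.bak" || return 1
  # Only touch commented-out deb or deb-src lines that contain the URL;
  # ordinary comments stay as they are.
  sed -E "\|^#[[:space:]]*deb(-src)?[[:space:]].*${ur_url}|s|^#[[:space:]]*||" \
    "$ur_file.bak" > "$ur_file"
}

# On a switch, run as root (not executed here):
#   uncomment_repo /etc/apt/sources.list 'http://repo.example.com/debian'
```

After the edit, run sudo -E apt-get update so that apt reads the newly enabled repository.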
    

Add Packages from the Cumulus Linux Local Archive

Cumulus Linux contains a local archive embedded in the Cumulus Linux image. This archive, cumulus-local-apt-archive, contains the packages you need to install ifplugd, LDAP, RADIUS or TACACS+ without a network connection.

The archive contains the following packages:

Add these packages with sudo -E apt-get update && sudo -E apt-get install, as described above.

Zero Touch Provisioning - ZTP

Use ZTP to deploy network devices in large-scale environments. On first boot, Cumulus Linux runs ZTP, which executes the provisioning automation that deploys the device for its intended role in the network.

The provisioning framework allows you to execute a one-time, user-provided script. You can develop this script using a variety of automation tools and scripting languages. You can also use it to add the switch to a configuration management (CM) platform such as Puppet, Chef, CFEngine or a custom, proprietary tool.

While developing and testing the provisioning logic, you can use the ztp command in Cumulus Linux to run your provisioning script manually on a device.

The ZTP service can run a script automatically in this order:

  1. Through a local file
  2. Using a USB drive inserted into the switch (ZTP-USB)
  3. Through DHCP

You can also run ZTP manually.

Use a Local File

ZTP only looks one time for a ZTP script on the local file system when the switch boots. ZTP searches for an install script that matches an ONIE-style waterfall in /var/lib/cumulus/ztp, looking for the most specific name first, and ending at the most generic:

Use a USB Drive

NVIDIA tests this feature only with thumb drives, not an external large USB hard drive.

If the ztp process does not discover a local script, it tries one time to locate an inserted but unmounted USB drive. If it discovers one, it begins the ZTP process. Cumulus Linux supports the use of a FAT32, FAT16, or VFAT-formatted USB drive as an installation source for ZTP scripts. You must plug in the USB drive before you power up the switch.

At minimum, the script must:

Follow these steps to perform ZTP using a USB drive:

  1. Copy the installation image to the USB drive.
  2. The ztp process searches the root filesystem of the newly mounted drive for filenames matching an ONIE-style waterfall (see the patterns and examples above), looking for the most specific name first, and ending at the most generic.
  3. ZTP parses the contents of the script to ensure it contains the CUMULUS-AUTOPROVISIONING flag (see example scripts).

The USB drive mounts to a temporary directory under /tmp (for example, /tmp/tmpigGgjf/). To reference files on the USB drive, use the environment variable ZTP_USB_MOUNTPOINT to refer to the USB root partition.

ZTP Over DHCP

If the ztp process does not discover a local ONIE script or applicable USB drive, it checks DHCP every ten seconds for up to five minutes for the presence of a ZTP URL specified in /var/run/ztp.dhcp. The URL can use HTTP, HTTPS, FTP, or TFTP.

For ZTP using DHCP, provisioning initially takes place over the management network and initiates through a DHCP hook. A DHCP option specifies a configuration script. The ZTP process requests this script from the Web server and the script executes locally.

The ZTP process over DHCP follows these steps:

  1. The first time you boot Cumulus Linux, eth0 makes a DHCP request. By default, Cumulus Linux sends DHCP option 60 (the vendor class identifier) with the value cumulus-linux x86_64 to identify itself to the DHCP server.
  2. The DHCP server offers a lease to the switch.
  3. If option 239 is in the response, the ZTP process starts.
  4. The ZTP process requests the contents of the script from the URL, sending additional HTTP headers containing details about the switch.
  5. ZTP parses the contents of the script to ensure it contains the CUMULUS-AUTOPROVISIONING flag (see example scripts).
  6. If provisioning is necessary, the script executes locally on the switch with root privileges.
  7. ZTP examines the return code of the script. If the return code is 0, ZTP marks the provisioning state as complete in the autoprovisioning configuration file.
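
The ten-second, five-minute check for /var/run/ztp.dhcp described above is a poll with a timeout. The following is a minimal sketch of that pattern, which you might reuse in your own provisioning scripts. It is not the ZTP implementation; the function name wait_for_file is illustrative, and the sketch prints the file contents without assuming any particular file format.

```shell
# Wait for a file to appear and become non-empty, checking at a fixed
# interval until a timeout expires. The defaults mirror the ten-second,
# five-minute cadence described above. The interval must be at least 1.
wait_for_file() {
  wf_file="$1"; wf_interval="${2:-10}"; wf_timeout="${3:-300}"; wf_waited=0
  while :; do
    if [ -s "$wf_file" ]; then
      cat "$wf_file"      # print the contents for the caller
      return 0
    fi
    # Give up when the time budget is used up
    [ "$wf_waited" -ge "$wf_timeout" ] && return 1
    sleep "$wf_interval"
    wf_waited=$((wf_waited + wf_interval))
  done
}

# Example: wait_for_file /var/run/ztp.dhcp 10 300
```

The function checks once immediately, so a file that already exists is returned without any delay.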

Trigger ZTP Over DHCP

If you have not yet provisioned the switch, you can trigger the ZTP process over DHCP when eth0 uses DHCP and one of the following events occurs:

You can also run the ztp --run <URL> command, where the URL is the path to the ZTP script.

Configure the DHCP Server

During the DHCP process over eth0, Cumulus Linux requests DHCP option 239. This option specifies the custom provisioning script.

For example, the /etc/dhcp/dhcpd.conf file for an ISC DHCP server looks like:

option cumulus-provision-url code 239 = text;

subnet 192.0.2.0 netmask 255.255.255.0 {
  range 192.0.2.100 192.0.2.200;
  option cumulus-provision-url "http://192.0.2.1/demo.sh";
}

DHCP on Front Panel Ports

ZTP runs DHCP on all the front panel switch ports and on any active interface. ZTP assesses the list of active ports on every retry cycle. When it receives the DHCP lease and option 239 is present in the response, ZTP starts to execute the script.

Inspect HTTP Headers

ZTP sends the following HTTP headers in the request to the web server to retrieve the provisioning script:

Header                        Value                 Example
------                        -----                 -------
User-Agent                                          CumulusLinux-AutoProvision/0.4
CUMULUS-ARCH                  CPU architecture      x86_64
CUMULUS-BUILD                                       5.1.0
CUMULUS-MANUFACTURER                                odm
CUMULUS-PRODUCTNAME                                 switch_model
CUMULUS-SERIAL                                      XYZ123004
CUMULUS-BASE-MAC                                    44:38:39:FF:40:94
CUMULUS-MGMT-MAC                                    44:38:39:FF:00:00
CUMULUS-VERSION                                     5.1.0
CUMULUS-PROV-COUNT                                  0
CUMULUS-PROV-MAX                                    32
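
If your web server selects a script based on these headers, you can test it from any host by sending the same headers with curl. The following is a minimal sketch that writes a curl configuration to standard output. The function name ztp_curl_config is illustrative, the default values are the examples from the table above, and only a subset of the headers is included.

```shell
# Print a curl configuration that imitates the headers ZTP sends.
# Arguments: script URL, serial number (optional), release (optional).
ztp_curl_config() {
  zc_url="$1"; zc_serial="${2:-XYZ123004}"; zc_release="${3:-5.1.0}"
  cat <<EOF
url = "$zc_url"
user-agent = "CumulusLinux-AutoProvision/0.4"
header = "CUMULUS-ARCH: x86_64"
header = "CUMULUS-BUILD: $zc_release"
header = "CUMULUS-SERIAL: $zc_serial"
header = "CUMULUS-VERSION: $zc_release"
header = "CUMULUS-PROV-COUNT: 0"
header = "CUMULUS-PROV-MAX: 32"
EOF
}

# Usage (needs network access to your web server; not executed here):
#   ztp_curl_config http://192.0.2.1/demo.sh | curl --silent --fail --config -
```

Changing the serial number argument lets you confirm that the server returns the script you expect for each switch.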

Manually Run ZTP

Cumulus Linux provides commands so that you can manually:

The following example enables ZTP and activates the provisioning process. ZTP tries to run the next time the switch boots. However, if ZTP already ran on a previous boot up or if you made manual configuration changes, ZTP exits without trying to look for a script.

cumulus@switch:~$ nv action enable system ztp
The operation will perform enable of the ZTP.
Type [y] to perform enable of the ZTP.
Type [N] to cancel an action.

Do you want to continue? [y/N]

If you add the force option, ZTP enables and activates the provisioning process without prompting you for confirmation.

cumulus@switch:~$ nv action enable system ztp force
Action executing ...
Enabling ZTP for next reboot
Action executing ...
Action succeeded

The following example disables ZTP and deactivates the provisioning process. If a ZTP script is currently running, ZTP is not disabled.

cumulus@switch:~$ nv action disable system ztp
The operation will perform disable of the ZTP.
Type [y] to perform disable of the ZTP.
Type [N] to cancel an action.

Do you want to continue? [y/N] 

If you add the force option, ZTP disables and deactivates the provisioning process without prompting you for confirmation.

cumulus@switch:~$ nv action disable system ztp force
Action executing ...
Disabling ZTP for next reboot
Action executing ...
Action succeeded

The following example manually runs ZTP from the beginning. If you made manual configuration changes, ZTP considers the switch as already provisioned and exits.

cumulus@switch:~$ nv action run system ztp
The operation will perform rerun of the ZTP.
Type [y] to perform rerun of the ZTP.
Type [N] to cancel an action.

Do you want to continue? [y/N] 

If you add the force option, ZTP runs without prompting you for confirmation.

cumulus@switch:~$ nv action run system ztp force
Action executing ...
Action succeeded

The following example manually runs ZTP and specifies a custom URL for the ZTP script. If you made manual configuration changes, ZTP considers the switch as already provisioned and exits.

cumulus@switch:~$ nv action run system ztp url https://myserver/mypath/cumulus-ztp.sh
The operation will perform rerun of the ZTP.
Type [y] to perform rerun of the ZTP.
Type [N] to cancel an action.

Do you want to continue? [y/N]

The following example manually runs ZTP from the /home/cumulus directory on the switch. If you made manual configuration changes, ZTP considers the switch as already provisioned and exits.

cumulus@switch:~$ nv action run system ztp url /home/cumulus/cumulus-ztp.sh
The operation will perform rerun of the ZTP.
Type [y] to perform rerun of the ZTP.
Type [N] to cancel an action.

Do you want to continue? [y/N]

If you add the force option, ZTP runs without prompting you for confirmation.

cumulus@switch:~$ nv action run system ztp url https://myserver/mypath/cumulus-ztp.sh force
cumulus@switch:~$ nv action run system ztp url /home/cumulus/cumulus-ztp.sh force

The following example terminates ZTP if it is in the discovery process or is not currently running a script:

cumulus@switch:~$ nv action abort system ztp

If you add the force option, ZTP terminates without prompting you for confirmation:

cumulus@switch:~$ nv action abort system ztp force

To show the status of the ZTP service, run the nv show system ztp command.

cumulus@switch:~$ nv show system ztp
        operational
-------  -----------
service  enabled   
status   enabled   
version  1.0  

Use caution when using the above ZTP commands:

  • When running ZTP with a custom URL, ensure that the specified URL is accessible and contains the script you want to run.
  • Abruptly terminating ZTP can disrupt ongoing configurations and have unintended consequences for the system.
  • Enabling or disabling ZTP, especially with the force option, might interrupt existing processes or ongoing configurations.

To enable ZTP and activate the provisioning process, use the -e option:

cumulus@switch:~$ sudo ztp -e

To reset ZTP to its original state, use the -R option. This option removes the ztp directory and ZTP runs the next time the switch reboots.

cumulus@switch:~$ sudo ztp -R

To disable ZTP and deactivate the provisioning process, use the -d option:

cumulus@switch:~$ sudo ztp -d

To manually run ZTP, use the -r option:

cumulus@switch:~$ sudo ztp -r

To run ZTP and specify a custom URL for the ZTP script:

cumulus@switch:~$ sudo ztp -r https://myserver/mypath/cumulus-ztp.sh

To run ZTP from a directory on the switch:

cumulus@switch:~$ sudo ztp -r /home/cumulus/cumulus-ztp.sh

To see the current ZTP state, use the -s option:

cumulus@switch:~$ sudo ztp -s
ZTP INFO:
State          disabled
Version        1.0
Result         success
Date           Mon May 20 21:51:04 2019 UTC
Method         Switch manually configured  
URL            None

Write ZTP Scripts

You must include the following line in any of the supported scripts that you expect to run using the autoprovisioning framework.

# CUMULUS-AUTOPROVISIONING

The script must contain the CUMULUS-AUTOPROVISIONING flag. You can include this flag in a comment or remark; you do not need to echo or write the flag to stdout.

You can write the script in any language that Cumulus Linux supports, such as:

The script must return an exit code of 0 upon success to mark the process as complete in the autoprovisioning configuration file.
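
Before you publish a script, you can confirm that it carries the required flag. The following is a minimal sketch and not a Cumulus Linux tool; the function name check_ztp_marker is hypothetical, and the function tests only for the CUMULUS-AUTOPROVISIONING string described above.

```shell
# Hypothetical pre-publish check (not part of Cumulus Linux):
# succeed only if the file contains the required flag string.
# Usage: check_ztp_marker <script-file>
check_ztp_marker() {
    grep -q 'CUMULUS-AUTOPROVISIONING' "$1"
}
```

For example, check_ztp_marker cumulus-ztp.sh || echo "flag missing" >&2 prints a warning when the flag is absent.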

The following script installs Cumulus Linux from a USB drive and applies a configuration:

#!/bin/bash
function error() {
  echo -e "\e[0;33mERROR: The ZTP script failed while running the command $BASH_COMMAND at line $BASH_LINENO.\e[0m" >&2
  exit 1
}

# Log all output from this script
exec >> /var/log/autoprovision 2>&1
date "+%FT%T ztp starting script $0"

trap error ERR

#Load NVUE startup.yaml from usb
nv config patch ${ZTP_USB_MOUNTPOINT}/startup.yaml
nv config apply

# CUMULUS-AUTOPROVISIONING
exit 0

Continue Provisioning

Typically, ZTP exits after executing the script locally and does not continue. To continue with provisioning so that you do not have to intervene manually or embed an Ansible callback into the script, you can add the CUMULUS-AUTOPROVISION-CASCADE directive.

Best Practices

ZTP scripts come in different forms and frequently perform the same tasks. As BASH is the most common language for ZTP scripts, use the following BASH snippets to perform common tasks with robust error checking.

Set the Default Cumulus User Password

The default cumulus user account password is cumulus. When you log into Cumulus Linux for the first time, you must provide a new password for the cumulus account, then log back into the system.

Add the following NVUE commands to your ZTP script to change the default cumulus user account password to a clear-text password. The example changes the password cumulus to MyP4$$word.

nv set system aaa user cumulus password 'MyP4$$word'
nv config apply

If you have an insecure management network, include the following commands in your ZTP script to set the password with an encrypted hash instead of a clear-text password. See Hashed Passwords for additional information.

nv set system aaa user <username> hashed-password <password>
nv config apply

If you configure the switch with Linux commands instead of NVUE, add the following function to your ZTP script to change the default cumulus user account password to a clear-text password. The example changes the password cumulus to MyP4$$word.

function set_password(){
     # Unexpire the cumulus account
     passwd -x 99999 cumulus
     # Set the password
     echo 'cumulus:MyP4$$word' | chpasswd
}
set_password

If you have an insecure management network, set the password with an encrypted hash instead of a clear-text password.

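  • First, generate a SHA-512 password hash with the following Python command. The example command generates a SHA-512 password hash for the password MyP4$$word. Enclose the Python code in single quotes so that the shell does not expand $$ in the password to a process ID. The crypt module is not available in Python 3.13 and later.

    user@host:~$ python3 -c 'import crypt; print(crypt.crypt("MyP4$$word", salt=crypt.mksalt()))'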
    $6$hs7OPmnrfvLNKfoZ$iB3hy5N6Vv6koqDmxixpTO6lej6VaoKGvs5E8p5zNo4tPec0KKqyQnrFMII3jGxVEYWntG9e7Z7DORdylG5aR/
    
  • Then, add the following function to the ZTP script to change the default cumulus user account password:

    function set_password(){
         # Unexpire the cumulus account
         passwd -x 99999 cumulus
         # Set the password
         usermod -p '$6$hs7OPmnrfvLNKfoZ$iB3hy5N6Vv6koqDmxixpTO6lej6VaoKGvs5E8p5zNo4tPec0KKqyQnrFMII3jGxVEYWntG9e7Z7DORdylG5aR/' cumulus
    }
    set_password
    

Set the System Hostname

To set the system hostname with NVUE, include the following commands in your ZTP script. This example sets the hostname to leaf01:

nv set system hostname leaf01
nv config apply

To set the system hostname with Linux commands:

  1. Change the hostname with the hostnamectl command; for example:

    cumulus@switch:~$ sudo hostnamectl set-hostname leaf01
    
  2. In the /etc/hosts file, replace the host for IP address 127.0.1.1 with the new hostname:

    cumulus@switch:~$ sudo nano /etc/hosts
    ...
    127.0.1.1       leaf01
    

If you do not manage your switch with NVUE and want to manage the system hostname through the DHCP host-name option, see this knowledge base article.
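
A ZTP script runs unattended, so the interactive nano step in the Linux procedure above needs a non-interactive equivalent. The following is a minimal sketch under stated assumptions: the function name set_loopback_hostname is hypothetical, the hosts file path is passed as an argument (normally /etc/hosts), and the hostname contains no characters that are special to sed, such as / or &.

```shell
# Hypothetical helper: point the 127.0.1.1 entry of a hosts file at a new
# hostname, appending the entry if the file has none.
# Usage: set_loopback_hostname <hosts-file> <hostname>
set_loopback_hostname() (
    hosts_file="$1"
    new_hostname="$2"
    tmp="$(mktemp)" || exit 1
    if grep -q '^127\.0\.1\.1[[:space:]]' "$hosts_file"; then
        # Replace the existing entry and leave every other line untouched.
        sed "s/^127\.0\.1\.1[[:space:]].*/127.0.1.1       ${new_hostname}/" "$hosts_file" > "$tmp"
    else
        cat "$hosts_file" > "$tmp"
        printf '127.0.1.1       %s\n' "$new_hostname" >> "$tmp"
    fi
    # Overwrite in place so the file keeps its owner and permissions.
    cat "$tmp" > "$hosts_file"
    status=$?
    rm -f "$tmp"
    exit "$status"
)
```

In a ZTP script, which runs as root, a call such as set_loopback_hostname /etc/hosts leaf01 would follow the hostnamectl command.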

Set the Management IP Address

A Cumulus Linux switch always provides at least one dedicated Ethernet management port called eth0. This interface is specifically for out-of-band management use. The management interface uses DHCPv4 for addressing by default. To set a static IP address and gateway for the management interface, include the following commands in your ZTP script:

nv set interface eth0 ip address 192.0.2.42/24
nv set interface eth0 ip gateway 192.0.2.1
nv config apply

Set the System Time Zone

To set the system time zone, include the following commands in your ZTP script. This example sets the time zone to US/Eastern.

nv set system timezone US/Eastern
nv config apply

Configure NTP

NTP starts at boot by default on the switch and the NTP configuration includes default servers. For additional information, see NTP. To configure additional NTP servers, include the following commands in your ZTP script. This example adds the server 4.cumulusnetworks.pool.ntp.org in the default VRF:

nv set service ntp default server 4.cumulusnetworks.pool.ntp.org iburst on
nv config apply

Test DNS Name Resolution

DNS names are frequently used in ZTP scripts. The ping_until_reachable function tests that each DNS name resolves into a reachable IP address. Call this function with each DNS target used in your script before you use the DNS name elsewhere in your script.

The following example defines the ping_until_reachable function. The script in Check the Cumulus Linux Release below calls the function in the context of a larger task.

function ping_until_reachable(){
    last_code=1
    max_tries=30
    tries=0
    while [ "0" != "$last_code" ] && [ "$tries" -lt "$max_tries" ]; do
        tries=$((tries+1))
        echo "$(date) INFO: ( Attempt $tries of $max_tries ) Pinging $1 Target Until Reachable."
        ping -c2 "$1" &> /dev/null
        last_code=$?
        sleep 1
    done
    if [ "$tries" -eq "$max_tries" ] && [ "$last_code" -ne "0" ]; then
        echo "$(date) ERROR: Reached maximum number of attempts to ping the target $1 ."
        exit 1
    fi
}
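
The same retry pattern can wrap commands other than ping, such as a download that depends on the name resolving. The following is a minimal sketch, not part of Cumulus Linux; the function name retry and its arguments (maximum attempts, delay in seconds, then the command) are assumptions made for illustration.

```shell
# Hypothetical generalization of ping_until_reachable: run any command until
# it succeeds or the attempt limit is reached. The body runs in a subshell so
# its variables do not leak into the calling script.
# Usage: retry <max-tries> <delay-seconds> <command> [args...]
retry() (
    max_tries="$1"
    delay="$2"
    shift 2
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        tries=$((tries + 1))
        echo "$(date) INFO: ( Attempt $tries of $max_tries ) Running: $*"
        if "$@"; then
            exit 0
        fi
        # Wait only when another attempt remains.
        [ "$tries" -lt "$max_tries" ] && sleep "$delay"
    done
    echo "$(date) ERROR: Reached maximum number of attempts for: $*" >&2
    exit 1
)
```

For example, retry 30 1 ping -c2 webserver.example.com makes up to 30 attempts, one second apart. Unlike ping_until_reachable, the function returns a failure status instead of exiting the script, so the caller decides how to handle the failure.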

Check the Cumulus Linux Release

The following script segment checks which Cumulus Linux release is running and upgrades the node if the release is not the target release. If the release is the target release, normal ZTP tasks execute. The script calls the ping_until_reachable function (described above) to make sure that the server hosting the image and the ZTP script is reachable.

function init_ztp(){
    # Do normal ZTP tasks here. The no-op (:) keeps the empty function valid.
    :
}

CUMULUS_TARGET_RELEASE=5.1.0
CUMULUS_CURRENT_RELEASE=$(grep RELEASE /etc/lsb-release | cut -d "=" -f2)
IMAGE_SERVER_HOSTNAME=webserver.example.com
IMAGE_SERVER="http://${IMAGE_SERVER_HOSTNAME}/${CUMULUS_TARGET_RELEASE}.bin"
ZTP_URL="http://${IMAGE_SERVER_HOSTNAME}/ztp.sh"

if [ "$CUMULUS_TARGET_RELEASE" != "$CUMULUS_CURRENT_RELEASE" ]; then
    ping_until_reachable "$IMAGE_SERVER_HOSTNAME"
    /usr/cumulus/bin/onie-install -fa -i "$IMAGE_SERVER" -z "$ZTP_URL" && reboot
else
    init_ztp && reboot
fi
exit 0
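
The grep RELEASE filter in the script above matches any line that contains the word RELEASE. The following sketch reads one specific key instead. It assumes that the file stores the release in a line of the form DISTRIB_RELEASE=<version>, which is the usual lsb-release layout; the function names get_release and release_matches are hypothetical.

```shell
# Hypothetical helper: print the release value from an lsb-release style file.
# Assumes a line of the form DISTRIB_RELEASE=<version>; surrounding double
# quotes, if any, are removed. The file defaults to /etc/lsb-release.
# Usage: get_release [file]
get_release() {
    sed -n 's/^DISTRIB_RELEASE=//p' "${1:-/etc/lsb-release}" | tr -d '"'
}

# Succeed when the release in the file equals the target release.
# Usage: release_matches <target-release> [file]
release_matches() {
    [ "$(get_release "$2")" = "$1" ]
}
```

With these helpers, the comparison in the script could read: if ! release_matches "$CUMULUS_TARGET_RELEASE"; then, followed by the upgrade branch.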

Perform Ansible Provisioning Callbacks

After initially configuring a node with ZTP, use Provisioning Callbacks to inform Ansible Tower or AWX that the node is ready for more detailed provisioning. The following example demonstrates how to use a provisioning callback:

/usr/bin/curl -H "Content-Type:application/json" -k -X POST --data '{"host_config_key":"'somekey'"}' -u username:password http://ansible.example.com/api/v2/job_templates/1111/callback/

Test ZTP Scripts

Use these commands to test and debug your ZTP scripts.

You can use verbose mode to debug your script and see where your script fails. Include the -v option when you run ZTP:

cumulus@switch:~$ sudo ztp -v -r http://192.0.2.1/demo.sh
Attempting to provision via ZTP Manual from http://192.0.2.1/demo.sh

Broadcast message from root@dell-s6010-01 (ttyS0) (Tue May 10 22:44:17 2016):  

ZTP: Attempting to provision via ZTP Manual from http://192.0.2.1/demo.sh
ZTP Manual: URL response code 200
ZTP Manual: Found Marker CUMULUS-AUTOPROVISIONING
ZTP Manual: Executing http://192.0.2.1/demo.sh
error: ZTP Manual: Payload returned code 1
error: Script returned failure

To see results of the most recent ZTP execution, you can run the ztp -s command.

cumulus@switch:~$ ztp -s
ZTP INFO:

State              enabled
Version            1.0
Result             Script Failure
Date               Mon 20 May 2019 09:31:27 PM UTC
Method             ZTP DHCP
URL                http://192.0.2.1/demo.sh

If ZTP runs when the switch boots and not manually, you can run the systemctl -l status ztp.service command, then the journalctl -l -u ztp.service command, to see if any failures occur:

cumulus@switch:~$ sudo systemctl -l status ztp.service
● ztp.service - Cumulus Linux ZTP
    Loaded: loaded (/lib/systemd/system/ztp.service; enabled)
    Active: failed (Result: exit-code) since Wed 2016-05-11 16:38:45 UTC; 1min 47s ago
        Docs: man:ztp(8)
    Process: 400 ExecStart=/usr/sbin/ztp -b (code=exited, status=1/FAILURE)
    Main PID: 400 (code=exited, status=1/FAILURE)

May 11 16:37:45 cumulus ztp[400]: ztp [400]: ZTP USB: Device not found
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: ZTP DHCP: Looking for ZTP Script provided by DHCP
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: Attempting to provision via ZTP DHCP from http://192.0.2.1/demo.sh
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: ZTP DHCP: URL response code 200
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: ZTP DHCP: Found Marker CUMULUS-AUTOPROVISIONING
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: ZTP DHCP: Executing http://192.0.2.1/demo.sh
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: ZTP DHCP: Payload returned code 1
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: Script returned failure
May 11 16:38:45 dell-s6010-01 systemd[1]: ztp.service: main process exited, code=exited, status=1/FAILURE
May 11 16:38:45 dell-s6010-01 systemd[1]: Unit ztp.service entered failed state.
cumulus@switch:~$
cumulus@switch:~$ sudo journalctl -l -u ztp.service --no-pager
-- Logs begin at Wed 2016-05-11 16:37:42 UTC, end at Wed 2016-05-11 16:40:39 UTC. --
May 11 16:37:45 cumulus ztp[400]: ztp [400]: /var/lib/cumulus/ztp: State Directory does not exist. Creating it...
May 11 16:37:45 cumulus ztp[400]: ztp [400]: /var/run/ztp.lock: Lock File does not exist. Creating it...
May 11 16:37:45 cumulus ztp[400]: ztp [400]: /var/lib/cumulus/ztp/ztp_state.log: State File does not exist. Creating it...
May 11 16:37:45 cumulus ztp[400]: ztp [400]: ZTP LOCAL: Looking for ZTP local Script
May 11 16:37:45 cumulus ztp[400]: ztp [400]: ZTP LOCAL: Waterfall search for /var/lib/cumulus/ztp/cumulus-ztp-x86_64-dell_s6010_s1220-rUNKNOWN
May 11 16:37:45 cumulus ztp[400]: ztp [400]: ZTP LOCAL: Waterfall search for /var/lib/cumulus/ztp/cumulus-ztp-x86_64-dell_s6010_s1220
May 11 16:37:45 cumulus ztp[400]: ztp [400]: ZTP LOCAL: Waterfall search for /var/lib/cumulus/ztp/cumulus-ztp-x86_64-dell
May 11 16:37:45 cumulus ztp[400]: ztp [400]: ZTP LOCAL: Waterfall search for /var/lib/cumulus/ztp/cumulus-ztp-x86_64
May 11 16:37:45 cumulus ztp[400]: ztp [400]: ZTP LOCAL: Waterfall search for /var/lib/cumulus/ztp/cumulus-ztp
May 11 16:37:45 cumulus ztp[400]: ztp [400]: ZTP USB: Looking for unmounted USB devices
May 11 16:37:45 cumulus ztp[400]: ztp [400]: ZTP USB: Parsing partitions
May 11 16:37:45 cumulus ztp[400]: ztp [400]: ZTP USB: Device not found
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: ZTP DHCP: Looking for ZTP Script provided by DHCP
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: Attempting to provision via ZTP DHCP from http://192.0.2.1/demo.sh
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: ZTP DHCP: URL response code 200
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: ZTP DHCP: Found Marker CUMULUS-AUTOPROVISIONING
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: ZTP DHCP: Executing http://192.0.2.1/demo.sh
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: ZTP DHCP: Payload returned code 1
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: Script returned failure
May 11 16:38:45 dell-s6010-01 systemd[1]: ztp.service: main process exited, code=exited, status=1/FAILURE
May 11 16:38:45 dell-s6010-01 systemd[1]: Unit ztp.service entered failed state.

Instead of running journalctl, you can see the log history by running:

cumulus@switch:~$ cat /var/log/syslog | grep ztp
2016-05-11T16:37:45.132583+00:00 cumulus ztp [400]: /var/lib/cumulus/ztp: State Directory does not exist. Creating it...
2016-05-11T16:37:45.134081+00:00 cumulus ztp [400]: /var/run/ztp.lock: Lock File does not exist. Creating it...
2016-05-11T16:37:45.135360+00:00 cumulus ztp [400]: /var/lib/cumulus/ztp/ztp_state.log: State File does not exist. Creating it...
2016-05-11T16:37:45.185598+00:00 cumulus ztp [400]: ZTP LOCAL: Looking for ZTP local Script
2016-05-11T16:37:45.485084+00:00 cumulus ztp [400]: ZTP LOCAL: Waterfall search for /var/lib/cumulus/ztp/cumulus-ztp-x86_64-dell_s6010_s1220-rUNKNOWN
2016-05-11T16:37:45.486394+00:00 cumulus ztp [400]: ZTP LOCAL: Waterfall search for /var/lib/cumulus/ztp/cumulus-ztp-x86_64-dell_s6010_s1220
2016-05-11T16:37:45.488385+00:00 cumulus ztp [400]: ZTP LOCAL: Waterfall search for /var/lib/cumulus/ztp/cumulus-ztp-x86_64-dell
2016-05-11T16:37:45.489665+00:00 cumulus ztp [400]: ZTP LOCAL: Waterfall search for /var/lib/cumulus/ztp/cumulus-ztp-x86_64
2016-05-11T16:37:45.490854+00:00 cumulus ztp [400]: ZTP LOCAL: Waterfall search for /var/lib/cumulus/ztp/cumulus-ztp
2016-05-11T16:37:45.492296+00:00 cumulus ztp [400]: ZTP USB: Looking for unmounted USB devices
2016-05-11T16:37:45.493525+00:00 cumulus ztp [400]: ZTP USB: Parsing partitions
2016-05-11T16:37:45.636422+00:00 cumulus ztp [400]: ZTP USB: Device not found
2016-05-11T16:38:43.372857+00:00 cumulus ztp [1805]: Found ZTP DHCP Request
2016-05-11T16:38:45.696562+00:00 cumulus ztp [400]: ZTP DHCP: Looking for ZTP Script provided by DHCP
2016-05-11T16:38:45.698598+00:00 cumulus ztp [400]: Attempting to provision via ZTP DHCP from http://192.0.2.1/demo.sh
2016-05-11T16:38:45.816275+00:00 cumulus ztp [400]: ZTP DHCP: URL response code 200
2016-05-11T16:38:45.817446+00:00 cumulus ztp [400]: ZTP DHCP: Found Marker CUMULUS-AUTOPROVISIONING
2016-05-11T16:38:45.818402+00:00 cumulus ztp [400]: ZTP DHCP: Executing http://192.0.2.1/demo.sh
2016-05-11T16:38:45.834240+00:00 cumulus ztp [400]: ZTP DHCP: Payload returned code 1
2016-05-11T16:38:45.835488+00:00 cumulus ztp [400]: Script returned failure
2016-05-11T16:38:45.876334+00:00 cumulus systemd[1]: ztp.service: main process exited, code=exited, status=1/FAILURE
2016-05-11T16:38:45.879410+00:00 cumulus systemd[1]: Unit ztp.service entered failed state.

If you see that the issue is a script failure, you can modify the script and then run ZTP manually using ztp -v -r <URL/path to that script>, as above.

cumulus@switch:~$ sudo ztp -v -r http://192.0.2.1/demo.sh
Attempting to provision via ZTP Manual from http://192.0.2.1/demo.sh

Broadcast message from root@dell-s6010-01 (ttyS0) (Tue May 10 22:44:17 2019):  

ZTP: Attempting to provision via ZTP Manual from http://192.0.2.1/demo.sh
ZTP Manual: URL response code 200
ZTP Manual: Found Marker CUMULUS-AUTOPROVISIONING
ZTP Manual: Executing http://192.0.2.1/demo.sh
error: ZTP Manual: Payload returned code 1
error: Script returned failure
cumulus@switch:~$ sudo ztp -s
State         enabled
Version       1.0
Result        Script Failure
Date          Mon 20 May 2019 09:31:27 PM UTC
Method        ZTP Manual
URL           http://192.0.2.1/demo.sh

Use the following command to check syslog for information about ZTP:

cumulus@switch:~$ sudo grep -i ztp /var/log/syslog
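
When the log is long, you might want only the exit code that the script returned. The following is a minimal sketch, not a Cumulus Linux command; the function name last_ztp_payload_code is hypothetical, and it relies only on the "Payload returned code" message format shown in the log examples above.

```shell
# Hypothetical helper: print the exit code from the most recent
# "Payload returned code <n>" line in a log file. Prints nothing if the
# file has no such line.
# Usage: last_ztp_payload_code <log-file>
last_ztp_payload_code() {
    sed -n 's/.*Payload returned code \([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1
}
```

For example, sudo sh -c '. ./helpers.sh; last_ztp_payload_code /var/log/syslog' would print the code, assuming you saved the function in a file named helpers.sh.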

Common ZTP Script Errors

Could not find referenced script/interpreter in downloaded payload

cumulus@leaf01:~$ sudo cat /var/log/syslog | grep ztp
2018-04-24T15:06:08.887041+00:00 leaf01 ztp [13404]: Attempting to provision via ZTP Manual from http://192.168.0.254/ztp_oob_windows.sh
2018-04-24T15:06:09.106633+00:00 leaf01 ztp [13404]: ZTP Manual: URL response code 200
2018-04-24T15:06:09.107327+00:00 leaf01 ztp [13404]: ZTP Manual: Found Marker CUMULUS-AUTOPROVISIONING
2018-04-24T15:06:09.107635+00:00 leaf01 ztp [13404]: ZTP Manual: Executing http://192.168.0.254/ztp_oob_windows.sh
2018-04-24T15:06:09.132651+00:00 leaf01 ztp [13404]: ZTP Manual: Could not find referenced script/interpreter in downloaded payload.
2018-04-24T15:06:14.135521+00:00 leaf01 ztp [13404]: ZTP Manual: Retrying
2018-04-24T15:06:14.138915+00:00 leaf01 ztp [13404]: ZTP Manual: URL response code 200
2018-04-24T15:06:14.139162+00:00 leaf01 ztp [13404]: ZTP Manual: Found Marker CUMULUS-AUTOPROVISIONING
2018-04-24T15:06:14.139448+00:00 leaf01 ztp [13404]: ZTP Manual: Executing http://192.168.0.254/ztp_oob_windows.sh
2018-04-24T15:06:14.143261+00:00 leaf01 ztp [13404]: ZTP Manual: Could not find referenced script/interpreter in downloaded payload.
2018-04-24T15:06:24.147580+00:00 leaf01 ztp [13404]: ZTP Manual: Retrying
2018-04-24T15:06:24.150945+00:00 leaf01 ztp [13404]: ZTP Manual: URL response code 200
2018-04-24T15:06:24.151177+00:00 leaf01 ztp [13404]: ZTP Manual: Found Marker CUMULUS-AUTOPROVISIONING
2018-04-24T15:06:24.151374+00:00 leaf01 ztp [13404]: ZTP Manual: Executing http://192.168.0.254/ztp_oob_windows.sh
2018-04-24T15:06:24.155026+00:00 leaf01 ztp [13404]: ZTP Manual: Could not find referenced script/interpreter in downloaded payload.
2018-04-24T15:06:39.164957+00:00 leaf01 ztp [13404]: ZTP Manual: Retrying
2018-04-24T15:06:39.165425+00:00 leaf01 ztp [13404]: Script returned failure
2018-04-24T15:06:39.175959+00:00 leaf01 ztp [13404]: ZTP script failed. Exiting...

Errors in syslog for ZTP like those shown above often occur if you create or edit the script on a Windows machine. Make sure that the script does not use Windows (\r\n) end-of-line encodings.

Use the cat -v ztp.sh command to view the contents of the script and search for any hidden characters.

root@oob-mgmt-server:/var/www/html# cat -v ./ztp_oob_windows.sh 
#!/bin/bash^M
^M
###################^M
#   ZTP Script^M
###################^M
^M
/usr/cumulus/bin/cl-license -i http://192.168.0.254/license.txt^M
^M
# Clean method of performing a Reboot^M
nohup bash -c 'sleep 2; shutdown now -r "Rebooting to Complete ZTP"' &^M
^M
exit 0^M
^M
# The line below is required to be a valid ZTP script^M
#CUMULUS-AUTOPROVISIONING^M
root@oob-mgmt-server:/var/www/html#

The ^M characters in the cat -v output, as shown above, indicate the presence of Windows end-of-line encodings that you need to remove.

Use the translate (tr) command on any Linux system to remove the '\r' characters from the file.

root@oob-mgmt-server:/var/www/html# tr -d '\r' < ztp_oob_windows.sh > ztp_oob_unix.sh
root@oob-mgmt-server:/var/www/html# cat -v ./ztp_oob_unix.sh 
#!/bin/bash
###################
#   ZTP Script
###################
/usr/cumulus/bin/cl-license -i http://192.168.0.254/license.txt
# Clean method of performing a Reboot
nohup bash -c 'sleep 2; shutdown now -r "Rebooting to Complete ZTP"' &
exit 0
# The line below is required to be a valid ZTP script
#CUMULUS-AUTOPROVISIONING
root@oob-mgmt-server:/var/www/html#
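
You can script the same check on the server that hosts your ZTP scripts. The following is a minimal sketch; the function names has_cr and strip_cr are hypothetical, and strip_cr wraps the tr command shown above.

```shell
# Hypothetical helper: succeed if the file contains at least one carriage
# return (the ^M character that cat -v displays).
# Usage: has_cr <file>
has_cr() (
    cr="$(printf '\r')"
    grep -q "$cr" "$1"
)

# Hypothetical helper: write a copy of the file without carriage returns.
# Usage: strip_cr <input-file> <output-file>
strip_cr() {
    tr -d '\r' < "$1" > "$2"
}
```

For example, has_cr ztp_oob_windows.sh && strip_cr ztp_oob_windows.sh ztp_oob_unix.sh converts the script only when it contains carriage returns.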

Considerations

Factory Reset

Factory reset puts the switch back to the same or similar state it was in when shipped from the factory. When you perform a factory reset, the currently installed image remains on the switch.

Run factory reset when you want to remove a complex or corrupted configuration that is blocking your progress, move a switch from one network to another, reset the switch to factory defaults and configure it as a new switch, or selectively remove either configuration or system log files to identify issues.

  • To run factory reset commands, you must have system admin, root, or sudo privileges.
  • The switch does not support factory reset if you upgrade to Cumulus Linux 5.12 from Cumulus Linux 5.9.x or 5.10.x with package upgrade.
  • To run factory reset with NVUE commands, the nvued service must be running.
  • After a successful reset, Cumulus Linux runs ztp -X to restart the ZTP process. The ztp -X option resets ZTP and clears the URL cache.

Run Factory Reset

Factory reset provides options to:

To reset the switch to the factory defaults and remove all configuration, system files, and log files, run the nv action reset system factory-default command.

Use the following options to keep configuration or system and log files:

Option           Description
---------------  -----------
keep basic       Retains password policy rules, management interface configuration, local user accounts and roles, and SSH configuration.
keep all-config  Retains all configuration.
keep only-files  Retains all system files and log files.

When you run the NVUE factory reset commands, the switch prompts you to confirm that you want to continue. To run the commands without the prompts to continue, add the force option at the end of the command.

The following example resets the switch to the factory defaults and removes all configuration, system files, and log files:

cumulus@switch:~$ nv action reset system factory-default
This operation will reset the system configuration, delete the log files and reboot the switch.
Type [y] continue. 
Type [n] to abort. 
Do you want to continue? [y/n] y
...

The following example resets the switch to the factory defaults but keeps password policy rules, management interface configuration (such as eth0), local user accounts and roles, and SSH configuration:

cumulus@switch:~$ nv action reset system factory-default keep basic
This operation will keep only the basic system configuration, delete the log files and reboot the switch.
Type [y] to continue. 
Type [n] to abort. 
Do you want to continue? [y/n] y
... 

The following example resets the switch to the factory defaults but keeps all configuration:

cumulus@switch:~$ nv action reset system factory-default keep all-config
This operation will not reset the system configuration, only delete the log files and reboot the switch.
Type [y] to continue.
Type [n] to abort.
Do you want to continue? [y/n] y 
...

The following example resets the switch to the factory defaults but keeps all system files and log files:

cumulus@switch:~$ nv action reset system factory-default keep only-files
This operation will reset the system configuration, not delete the log files and reboot the switch.
Type [y] to continue. 
Type [n] to abort. 
Do you want to continue? [y/n] y 
...

The following example resets the switch to the factory defaults but keeps all system files and log files. The force option runs factory reset without the prompts to continue:

cumulus@switch:~$ nv action reset system factory-default keep only-files force 

To reset the switch to the factory defaults and remove all configuration, system files, and log files (the default option), run the systemctl restart factory-reset.service command.

cumulus@switch:~$ sudo systemctl restart factory-reset.service

To keep only basic configuration, to keep all configuration but remove system and log files, or to keep system and log files but remove all configuration, create the /tmp/factory-reset.conf file, add one of the following reset options to the file, then run the systemctl restart factory-reset.service command.

  • TYPE=keep-basic resets the switch to the factory defaults but keeps password policy rules, management interface configuration (such as eth0), local user accounts and roles, and SSH configuration.
  • TYPE=keep-all-config resets the switch to the factory defaults but keeps all configuration.
  • TYPE=keep-all-files resets the switch to the factory defaults but keeps all system files and log files.

The following example resets the switch to the factory defaults but keeps password policy rules, management interface configuration (such as eth0), local user accounts and roles, and SSH configuration.

cumulus@switch:~$ sudo nano /tmp/factory-reset.conf
TYPE=keep-basic

When you use the keep-basic option, you must create a /tmp/startup-new.yaml file with the configuration you want after factory reset, then start factory-reset.service. This is not necessary for the other options.

cumulus@switch:~$ sudo systemctl restart factory-reset.service
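
If you automate this procedure, you can write the option file without an editor. The following is a minimal sketch, not a Cumulus Linux tool; the function name write_factory_reset_conf is hypothetical, and it accepts only the three TYPE values listed above.

```shell
# Hypothetical helper: write a factory reset option file non-interactively.
# Rejects any value other than the TYPE options listed in this section.
# The file defaults to /tmp/factory-reset.conf.
# Usage: write_factory_reset_conf <type> [conf-file]
write_factory_reset_conf() {
    case "$1" in
        keep-basic|keep-all-config|keep-all-files) ;;
        *)
            echo "ERROR: unsupported reset type: $1" >&2
            return 1
            ;;
    esac
    printf 'TYPE=%s\n' "$1" > "${2:-/tmp/factory-reset.conf}"
}
```

The function only writes the file. You still restart factory-reset.service afterwards and, for keep-basic, create the /tmp/startup-new.yaml file first, as described above.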

Considerations

System Configuration

This section describes how to configure the following system settings:

NVIDIA User Experience - NVUE

NVUE is an object-oriented, schema-driven model of a complete Cumulus Linux system (hardware and software). It provides a robust API that allows multiple interfaces to both view (show) and configure (set and unset) any element within a system running the NVUE software.

NVUE Object Model

The NVUE object model definition uses the OpenAPI specification (OAS). Similar to YANG (RFC 6020 and RFC 7950), OAS is a data definition, manipulation, and modeling language (DML) that lets you build model-driven interfaces for both humans and machines. Although the computer networking and telecommunications industry commonly uses YANG (standardized by IETF) as a DML, the adoption of OpenAPI is broader, spanning cloud to compute to storage to IoT and even social media. The OpenAPI Initiative (OAI) consortium, a chartered project under the Linux Foundation, leads OpenAPI standardization.

The OAS schema forms the management plane model with which you configure, monitor, and manage the Cumulus Linux switch. The v3.0.2 version of OAS defines the NVUE data model.

Like other systems that use OpenAPI, the NVUE OAS schema defines the endpoints (paths) exposed as RESTful APIs. With these REST APIs, you can perform various create, retrieve, update, delete, and eXecute (CRUDX) operations. The OAS schema also describes the API inputs and outputs (data models).

You can use the NVUE object model in these two ways:

  • Through the NVUE CLI
  • Through the NVUE REST API

The CLI and the REST API are equivalent in functionality; you can run all management operations from the REST API or the CLI. The NVUE object model drives both the REST API and the CLI management operations. All operations are consistent; for example, the CLI nv show commands reflect any PATCH operation (create) you run through the REST API.

NVUE follows a declarative model, removing context-specific commands and settings. It is structured as a big tree that represents the entire state of a Cumulus Linux instance. At the base of the tree are high-level branches representing objects, such as router and interface. Under each of these branches are further branches. As you navigate through the tree, you gain a more specific context. At the leaves of the tree are actual attributes, represented as key-value pairs. The path through the tree is similar to a filesystem path.
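
To picture the comparison with a filesystem path, the following sketch joins the branches you pass through with slashes. It is an illustration only: the function name nvue_path is hypothetical, and the output is not claimed to be an exact REST API endpoint.

```shell
# Illustration only: join the elements of an object path with "/" separators,
# the way a filesystem path is written.
# Usage: nvue_path <element>...
nvue_path() (
    path=""
    for element in "$@"; do
        path="${path}/${element}"
    done
    printf '%s\n' "$path"
)
```

For example, nvue_path interface swp1 link state prints /interface/swp1/link/state. Each additional element narrows the context until the path reaches an attribute.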

Cumulus Linux installs NVUE by default and enables the NVUE service nvued.

NVUE CLI

The NVUE CLI has a flat structure instead of a modal structure. Therefore, you can run all commands from the primary prompt instead of only in a specific mode.

You can choose to configure Cumulus Linux either with NVUE commands or Linux commands (with vtysh or by manually editing configuration files). Do not run both NVUE configuration commands (such as nv set, nv unset, nv action, and nv config) and Linux commands to configure the switch. NVUE commands replace the configuration in files such as /etc/network/interfaces and /etc/frr/frr.conf, and remove any configuration you add manually or with automation tools like Ansible, Chef, or Puppet.

If you choose to configure Cumulus Linux with NVUE, you can configure features that do not yet support the NVUE object model by creating snippets. See NVUE Snippets.

Command Syntax

NVUE commands all begin with nv and fall into one of three syntax categories:

Command Completion

As you enter commands, you can get help with the valid keywords or options using the tab key. For example, using tab completion with nv set displays the possible options for the command and returns you to the command prompt to complete the command.

cumulus@switch:~$ nv set <<press tab>>
acl        evpn       mlag       platform   router     system     
bridge     interface  nve        qos        service    vrf
cumulus@switch:~$ nv set

Command Question Mark

You can type a question mark (?) after a command to display required information quickly and concisely. When you type ?, NVUE specifies the value type, range, and options with a brief description of each; for example:

cumulus@switch:~$ nv set interface swp1 link state ?
    [Enter]               
    down                   The interface is not ready
    up                     The interface is ready
cumulus@switch:~$ nv set interface swp1 link mtu ?
    <arg>                  (integer:552 - 9216)
cumulus@switch:~$ nv set interface swp1 link speed ?
    <arg>                  (string | enum:10M, 100M, 1G, 10G, 25G, 40G, 50G, 100G,
                           200G, 400G, 800G, auto)

NVUE also indicates if you need to provide specific values for the command:

cumulus@switch:~$ nv set interface swp1 bridge domain ?
    <domain-id>            Domain (bridge-name)

Command Abbreviation

NVUE supports command abbreviation, where you can type a certain number of characters instead of a whole command to speed up CLI interaction. For example, instead of typing nv show interface, you can type nv sh int.

If the command you type is ambiguous, NVUE shows the reason for the ambiguity so that you can correct the shortcut. For example:

cumulus@switch:~$ nv s i 
Ambiguous Command: 
   set interface 
   show interface 
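
The ambiguity above arises because both set and show begin with s. The following sketch shows one way that prefix matching can resolve an abbreviation. It is an illustration and not NVUE's implementation; the function name resolve_abbrev is hypothetical.

```shell
# Illustration only (not NVUE's implementation): resolve an abbreviation
# against a list of keywords by prefix matching. Prints the keyword when
# exactly one matches; otherwise reports the problem and fails.
# Usage: resolve_abbrev <abbreviation> <keyword>...
resolve_abbrev() (
    abbrev="$1"
    shift
    matches=""
    count=0
    for keyword in "$@"; do
        case "$keyword" in
            "$abbrev"*)
                matches="${matches}${keyword} "
                count=$((count + 1))
                ;;
        esac
    done
    if [ "$count" -eq 1 ]; then
        printf '%s\n' "${matches% }"
        exit 0
    fi
    if [ "$count" -gt 1 ]; then
        echo "Ambiguous: ${matches% }" >&2
    else
        echo "No match for: $abbrev" >&2
    fi
    exit 1
)
```

For example, resolve_abbrev sh set show prints show, while resolve_abbrev s set show fails because both keywords match.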

Command Help

As you enter commands, you can get help with command syntax by entering -h or --help at various points within a command entry. For example, to examine the options available for nv set interface, enter nv set interface -h or nv set interface --help.

cumulus@switch:~$ nv set interface -h
usage: 
  nv [options] set interface <interface-id>

Description:
  interface             Update all interfaces. Provide single interface or multiple interfaces using ranging (e.g. swp1-2,5-6 -> swp1,swp2,swp5,swp6).

Identifiers:
  <interface-id>        Interface (interface-name)

General Options:
  -h, --help            Show help.

Command List

You can list all the NVUE commands by running nv list-commands. See List All NVUE Commands below.

Command History

At the command prompt, press the Up Arrow and Down Arrow keys to move back and forth through the list of commands you entered previously. When you find the command you want to use, you can run the command by pressing Enter. You can also modify the command before you run it.

Command Categories

The NVUE CLI has a flat structure; however, the commands are in three functional categories:

Configuration Commands

The NVUE configuration commands modify switch configuration. You can set and unset configuration options.

The nv set and nv unset commands are in the following categories. Each command group includes subcommands. Use command completion (press the tab key) to list the subcommands.

Command Group
Description
nv set acl
nv unset acl
Configures Access Control Lists.
nv set bridge
nv unset bridge
Configures a bridge domain. This is where you configure bridge attributes, such as the bridge type (VLAN-aware), the STP state and priority, and VLANs.
nv set evpn
nv unset evpn
Configures EVPN. This is where you enable and disable the EVPN control plane, and set EVPN route advertise, multihoming, and duplicate address detection options.
nv set interface <interface-id>
nv unset interface <interface-id>
Configures the switch interfaces. Use this command to configure bond and bridge interfaces, interface IP addresses and descriptions, VLAN IDs, and links (MTU, FEC, speed, duplex, and so on).
nv set mlag
nv unset mlag
Configures MLAG. This is where you configure the backup IP address or interface, MLAG system MAC address, peer IP address, MLAG priority, and the delay before bonds come up.
nv set nve
nv unset nve
Configures network virtualization (VXLAN) settings. This is where you configure the UDP port for VXLAN frames, control dynamic MAC learning over VXLAN tunnels, enable and disable ARP and ND suppression, and configure how Cumulus Linux handles BUM traffic in the overlay.
nv set platform
nv unset platform
Configures Pulse per Second (PPS), the simplest form of synchronization for the physical hardware clock.
nv set qos
nv unset qos
Configures QoS RoCE.
nv set router
nv unset router
Configures router policies (prefix list rules and route maps), global BGP options (enable and disable, ASN and router ID, BGP graceful restart and shutdown), global OSPF options (enable and disable, router ID, and OSPF timers), PIM, IGMP, PBR, VRR, and VRRP.
nv set service
nv unset service
Configures DHCP relays and servers, NTP, PTP, LLDP, SNMP servers, DNS, and syslog.
nv set system
nv unset system
Configures system settings, such as the hostname of the switch, pre and post login messages, reboot options (warm, cold, fast), the time zone and global system settings, such as the anycast ID, the system MAC address, and the anycast MAC address. This is also where you configure SPAN and ERSPAN sessions, telemetry, and set how configuration apply operations work (which files to ignore and which files to overwrite; see Configure NVUE to Ignore Linux Files).
nv set vrf <vrf-id>
nv unset vrf <vrf-id>
Configures VRFs. This is where you configure VRF-level configuration for PTP, BGP, OSPF, and EVPN.

Monitoring Commands

The NVUE monitoring commands show various parts of the network configuration. For example, you can show the complete network configuration or only interface configuration. The monitoring commands are in the following categories. Each command group includes subcommands. Use command completion (press the tab key) to list the subcommands.

Command Group
Description
nv show acl Shows Access Control List configuration.
nv show action Shows information about the action commands that reset counters and remove conflicts.
nv show bridge Shows bridge domain configuration.
nv show evpn Shows EVPN configuration.
nv show interface Shows interface configuration and counters.
nv show mlag Shows MLAG configuration.
nv show nve Shows network virtualization configuration, such as VXLAN-specific MLAG configuration and VXLAN flooding.
nv show platform Shows platform configuration, such as hardware and software components.
nv show qos Shows QoS RoCE configuration.
nv show router Shows router configuration, such as router policies, global BGP and OSPF configuration, PBR, PIM, IGMP, VRR, and VRRP configuration.
nv show service Shows DHCP relays and server, NTP, PTP, LLDP, and syslog configuration.
nv show system Shows global system settings, such as the reserved routing table range for PBR and the reserved VLAN range for layer 3 VNIs. You can also see system login messages and switch reboot history.
nv show system version Shows the Cumulus Linux release running on the switch.
nv show vrf Shows VRF configuration.

The following example shows the nv show router commands after pressing the tab key, then shows the output of the nv show router bgp command.

cumulus@leaf01:mgmt:~$ nv show router <<tab>>
adaptive-routing  igmp              ospf              pim               ptm               vrrp              
bgp               nexthop           pbr               policy            vrr               

cumulus@leaf01:mgmt:~$ nv show router bgp
                                applied      pending    
------------------------------  -----------  -----------
enable                          on           on         
autonomous-system               65101        65101      
router-id                       10.10.10.1   10.10.10.1 
policy-update-timer             5            5          
graceful-shutdown               off          off        
wait-for-install                off          off        
graceful-restart                                        
  mode                          helper-only  helper-only
  restart-time                  120          120        
  path-selection-deferral-time  360          360        
  stale-routes-time             360          360        
convergence-wait                                        
  time                          0            0          
  establish-wait-time           0            0          
queue-limit                                             
  input                         10000        10000      
  output                        10000        10000 

If there are no pending or applied configuration changes, the nv show command only shows the running configuration (under operational).

Additional options are available for certain nv show commands. For example, you can choose the configuration you want to show (pending, applied, startup, or operational). You can also turn on colored output, and paginate specific output.

Option
Description
--applied Shows configuration applied with the nv config apply command. For example, nv show --applied.
--brief-help Shows help about the nv show command. For example, nv show interface swp1 --brief-help.
--color Turns colored output on or off. For example, nv show interface swp1 --color on.
--filter Filters show command output on column data. For example, the nv show interface --filter mtu=1500 shows only the interfaces with MTU set to 1500. For more information, see Filter nv show Command Output below.
--hostname Shows system configuration for the switch with the specified hostname. For example, nv show --hostname leaf01.
--operational Shows the running configuration (the actual system state). For example, nv show interface swp1 --operational shows the running configuration for swp1. The running and applied configuration should be the same. If different, inspect the logs.
--output Shows command output in table (auto), json, yaml or plain text (raw) format, such as vtysh native output. For example:
nv show interface bond1 --output auto
nv show interface bond1 --output json
nv show interface bond1 --output yaml
nv show router bgp --output raw
--paginate Paginates the output. For example, nv show interface bond1 --paginate on.
--pending Shows the last applied configuration and any pending set or unset configuration that you have not yet applied. For example, nv show interface bond1 --pending.
--rev <revision> Shows a detached pending configuration. See the nv config detach configuration management command below. For example, nv show --rev 1. You can also show only applied or only operational information in the nv show output. For example, to show only the applied settings for swp1 configuration, run the nv show interface swp1 --rev=applied command. To show only the operational settings for swp1 configuration, run the nv show interface swp1 --rev=operational command.
--startup Shows configuration saved with the nv config apply command. This is the configuration after the switch boots. For example: nv show interface --startup.
--tab Shows information in tab format. For example, nv show interface swp1 --tab.
--view Shows different views. A view is a subset of information provided by certain nv show commands. To see the views available for an nv show command, run the command with --view and press TAB.

The following example shows pending BGP graceful restart configuration:

cumulus@switch:~$ nv show router bgp graceful-restart --pending
                              Rev ID: 8                  
----------------------------  -----------------  
mode                          helper-only        
path-selection-deferral-time  360              
restart-time                  120              
stale-routes-time             360              

The following example shows the views available for the nv show interface command:

cumulus@switch:~$ nv show interface --view <<TAB>>
acl-statistics  carrier-stats   dot1x-counters  lldp-detail     physical        status          vrf
bond-members    counters        dot1x-summary   mac             port-security   svi             
bonds           description     down            mlag-cc         qos-profile     synce-counters  
brief           detail          lldp            neighbor        small           up

Configuration Management Commands

The NVUE configuration management commands manage and apply configurations.

Command
Description
nv config apply Saves the pending configuration (nv config apply) or a specific revision (nv config apply 2) to the startup configuration automatically (when auto save is on, which is the default setting). To see the list of revisions you can apply, run nv config apply <<Tab>>.
You can also use these prompt options:
  • -y or --assume-yes to automatically reply yes to all prompts.
  • --assume-no to automatically reply no to all prompts.
You can also use these apply options:
  • --confirm applies the configuration change, but you must confirm the applied configuration. If you do not confirm within ten minutes, the configuration rolls back automatically. You can change the default time with the apply --confirm <time> command. For example, apply --confirm 60 requires you to confirm within one hour.
  • --confirm-status shows the amount of time left before the automatic rollback.
nv config detach Detaches the configuration from the current pending configuration and uses an integer to identify it; for example, 4. To list all the current detached pending configurations, run nv config diff <<press tab>>.
nv config diff <revision> <revision> Shows differences between configurations, such as the pending configuration and the applied configuration, or the detached configuration and the pending configuration.
nv config find <string> Finds a portion of the applied configuration according to the search string you provide. For example, to find swp1 in the applied configuration, run nv config find swp1.
nv config history Enables you to keep track of the configuration changes on the switch and shows a table with the configuration revision ID, the date and time of the change, the user account that made the change, and the type of change (such as CLI or REST API). The nv config history <revision> command shows the apply history for a specific revision.
nv config patch <nvue-file> Updates the pending configuration with the specified YAML configuration file.
nv config replace <nvue-file> Replaces the pending configuration with the specified YAML configuration file.
nv config revision Shows all the configuration revisions on the switch.
nv config save Overwrites the startup configuration with the applied configuration by writing to the /etc/nvue.d/startup.yaml file. The configuration persists after a reboot. Use this command when the auto save option is off.
nv config show Shows the currently applied configuration in yaml format. This command also shows NVUE version information.
nv config show -o commands Shows the currently applied configuration commands.
nv config diff -o commands Shows differences between two configuration revisions.

You can use the NVUE configuration management commands to back up and restore configuration when you upgrade Cumulus Linux on the switch. Refer to Upgrading Cumulus Linux.

Action Commands

The NVUE action commands clear counters, and provide system reboot and TACACS user disconnect options.

Command
Description
nv action change system time Sets the software clock date and time.
nv action clear Provides commands to clear ACL statistics, duplicate addresses, PTP violations, interfaces from a protodown state, interface counters, QoS buffers, BGP routes, OSPF interface counters, matches against a route map, and remove conflicts from protodown MLAG bonds.
nv action deauthenticate interface <interface> dot1x authorized-sessions Deauthenticates the 802.1X supplicant on the specified interface. If you do not want to notify the supplicant when deauthenticating, you can add the silent option; for example, nv action deauthenticate interface swp1 dot1x authorized-sessions 00:55:00:00:00:09 silent.
nv action delete system security Provides commands to delete CA and entity certificates.
nv action disable system maintenance mode
nv action disable system maintenance ports
Disables system maintenance mode.
Brings up the ports.
nv action disconnect system aaa user Provides commands to disconnect users logged into the switch.
nv action enable system maintenance mode
nv action enable system maintenance ports
Enables system maintenance mode.
Brings all the ports down for maintenance.
nv action import system security ca-certificate
nv action import system security certificate
Provides commands to import CA and entity certificates.
nv action reboot system Reboots the switch in the configured restart mode (fast, cold, or warm). You must specify the no-confirm option with this command.
nv action rename Renames the system configuration.
nv action upload Uploads system configuration to the switch.

List All NVUE Commands

To show the full list of NVUE commands, run nv list-commands. For example:

cumulus@switch:~$ nv list-commands
nv show platform
nv show platform inventory
nv show platform inventory <inventory-id>
nv show platform software
nv show platform software installed
nv show platform software installed <installed-id>
nv show platform firmware
nv show platform firmware <platform-component-id>
nv show platform environment
...

You can show the list of commands for a command grouping. For example, to show the list of interface commands:

cumulus@switch:~$ nv list-commands interface
nv show interface
nv show interface <interface-id>
nv show interface <interface-id> ip
nv show interface <interface-id> ip address
nv show interface <interface-id> ip address <ip-prefix-id>
nv show interface <interface-id> ip gateway
nv show interface <interface-id> ip gateway <ip-address-id>
...

To view the NVUE command reference for Cumulus Linux, which describes all the NVUE CLI commands and provides examples, go to the NVUE Command Reference.

NVUE Configuration File

When you save network configuration, NVUE writes the configuration to the /etc/nvue.d/startup.yaml file.

You can edit or replace the contents of the /etc/nvue.d/startup.yaml file. NVUE applies the configuration in the /etc/nvue.d/startup.yaml file during system boot only if the nvue-startup.service is running. If this service is not running, the switch reboots with the same configuration that was running before the reboot.

To enable and start nvue-startup.service:

cumulus@switch:~$ sudo systemctl enable nvue-startup.service
cumulus@switch:~$ sudo systemctl start nvue-startup.service

When you apply a configuration with nv config apply, NVUE also writes to underlying Linux files such as /etc/network/interfaces and /etc/frr/frr.conf. You can view these configuration files; however, do not manually edit them while using NVUE. If you need to configure certain network settings manually or use automation such as Ansible to configure the switch, see Configure NVUE to Ignore Linux Files below.

Default Startup File

NVUE provides a default /etc/nvue.d/startup.yaml file that includes configuration such as the switch hostname, default firewall rules, and cumulus user account credentials. The file also enables the NVUE API. This file is the factory configuration file that you can restore at any time.

  • The default startup configuration file sets the default hostname as cumulus; therefore, Cumulus Linux does not accept the DHCP host-name option. To set a different hostname with NVUE, see Configure the Hostname. If you do not manage your switch with NVUE and want to change this behavior with Linux configuration files, see this knowledge base article.
  • The default NVUE startup.yaml file includes the cumulus user account, which is the default account for the system. If you modify the NVUE configuration so that it no longer includes the cumulus user account, or you replace the configuration or apply a startup configuration that does not include it, NVUE deletes the cumulus account. To merge in configuration changes or to restore a backup startup.yaml file, use the nv config patch command as described in Back up and Restore Configuration with NVUE.
  • You cannot delete a logged in user account.

Encrypted Passwords

By default, NVUE encrypts passwords, such as the RADIUS secret, TACACS secret, BGP peer password, OSPF MD5 key, and SNMP strings in the startup.yaml file. You can disable password encryption with the nv set system security encryption db state disabled command:

cumulus@switch:~$ nv set system security encryption db state disabled
cumulus@switch:~$ nv config apply

To reenable password encryption, run the nv set system security encryption db state enabled command.

To show if password encryption is on, run the nv show system security encryption command:

cumulus@switch:~$ nv show system security encryption
         operational  applied
-------  -----------  -------
db                           
  state               enabled

Configuration Files that NVUE Manages

NVUE manages the following configuration files:

File Description
/etc/network/interfaces Configures the network interfaces available on your system.
/etc/frr/frr.conf Configures FRRouting.
/etc/cumulus/switchd.conf Configures switchd options.
/etc/cumulus/switchd.d/ptp.conf Configures PTP timestamping.
/etc/frr/daemons Configures FRRouting services.
/etc/hosts Configures the hostname of the switch.
/etc/default/isc-dhcp-relay-default Configures DHCP relay options.
/etc/dhcp/dhcpd.conf Configures DHCP server options.
/etc/hostname Configures the hostname of the switch.
/etc/cumulus/datapath/qos/qos_features.conf Configures QoS settings, such as traffic marking, shaping and flow control.
/etc/mlx/datapath/qos/qos_infra.conf Configures QoS platform specific configurations, such as buffer allocations and Alpha values.
/etc/cumulus/switchd.d/qos.conf Configures QoS settings.
/etc/cumulus/ports.conf Configures port breakouts.
/etc/ntpsec/ntp.conf Configures NTP settings.
/etc/ptp4l.conf Configures PTP settings.
/etc/snmp/snmpd.conf Configures SNMP settings.

Search for a Specific Configuration

To search for a specific portion of the NVUE configuration, run the nv config find <search string> command. The search shows all items above and below the search string. For example, to search the entire NVUE object model configuration for any mention of bond1:

cumulus@switch:~$ nv config find bond1
- set:
    interface:
      bond1:
        bond:
          lacp-bypass: on
          member:
            swp1: {}
          mlag:
            enable: on
            id: 1
        bridge:
          domain:
            br_default:
              access: 10
        link:
          mtu: 9000
        type: bond
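
The "items above and below the search string" behavior can be pictured as pruning a nested configuration tree: keep the ancestors of every matching key and everything beneath it. The Python sketch below illustrates that idea on a small, made-up configuration; it matches on keys only and is not NVUE's actual search code.

```python
def find(config, term):
    """Return a pruned copy of config that keeps only branches with a matching key."""
    if not isinstance(config, dict):
        return None
    result = {}
    for key, value in config.items():
        if term in str(key):
            result[key] = value      # keep everything below the match
        else:
            sub = find(value, term)
            if sub:
                result[key] = sub    # keep the ancestors of a deeper match
    return result or None


# Made-up configuration used only for this illustration.
config = {
    "interface": {
        "bond1": {"bond": {"member": {"swp1": {}}}, "type": "bond"},
        "swp2": {"type": "swp"},
    },
    "system": {"hostname": "leaf01"},
}

print(find(config, "bond1"))
```

In this sketch, the result contains the interface key (the item above bond1) and the whole bond1 subtree (the items below it), while swp2 and system are dropped.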

Configure NVUE to Ignore Linux Files

You can configure NVUE to ignore certain underlying Linux files when applying configuration changes. For example, if you push certain configuration to the switch using Ansible and Jinja2 file templates or you want to use custom configuration for a particular service such as PTP, you can ensure that NVUE never writes to those configuration files.

The following example configures NVUE to ignore the Linux /etc/ptp4l.conf file when applying configuration changes.

cumulus@switch:~$ nv set system config apply ignore /etc/ptp4l.conf
cumulus@switch:~$ nv config apply

Auto Save

By default, when you run the nv config apply command to apply a configuration setting, NVUE applies the pending configuration to become the applied configuration and automatically saves the changes to the startup configuration file (/etc/nvue.d/startup.yaml).

To disable auto save so that NVUE does not save applied configuration changes, run the nv set system config auto-save state disabled command:

cumulus@switch:~$ nv set system config auto-save state disabled
cumulus@switch:~$ nv config apply

When you disable auto save, you must run the nv config save command to save the applied configuration to the startup configuration so that the changes persist after a reboot.

To reenable auto save, run the nv set system config auto-save state enabled command.
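
The relationship between the pending, applied, and startup configurations can be modeled with a small Python class. This is a toy model for illustration only: the class and method names are hypothetical, auto save is a constructor flag instead of a configuration setting, and the model assumes that the switch loads the startup configuration at boot.

```python
import copy


class ConfigStore:
    """Toy model of the pending, applied, and startup configurations."""

    def __init__(self, auto_save=True):
        self.auto_save = auto_save
        self.pending = {}
        self.applied = {}
        self.startup = {}

    def set(self, key, value):
        """Like nv set: changes only the pending configuration."""
        self.pending[key] = value

    def apply(self):
        """Like nv config apply: pending becomes applied; saved if auto save is on."""
        self.applied = copy.deepcopy(self.pending)
        if self.auto_save:
            self.save()

    def save(self):
        """Like nv config save: applied overwrites startup."""
        self.startup = copy.deepcopy(self.applied)

    def reboot(self):
        """Assumes the switch comes up with the startup configuration."""
        self.applied = copy.deepcopy(self.startup)
        self.pending = copy.deepcopy(self.startup)


store = ConfigStore(auto_save=False)
store.set("hostname", "leaf01")
store.apply()
store.reboot()
print(store.applied)  # {} because the change was never saved
```

Calling store.save() before store.reboot() in the model keeps the hostname, which corresponds to running nv config save when auto save is off.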

Show Switch Configuration

To show the applied configuration on the switch, run the nv config show command:

cumulus@switch:~$ nv config show
header:
    model: VX
    nvue-api-version: nvue_v1
    rev-id: 1.0
    version: Cumulus Linux 5.7.0
- set:
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
            '30':
              vni:
                '30': {}
...

To show the configuration on the switch in YAML format and include all default options, run the nv config show --all command.

Add Configuration Apply Messages

When you run the nv config apply command, you can add a message that describes the configuration updates you make. You can see the message when you run the nv config history command.

To add a configuration apply message, run the nv config apply -m <message> command. If the message includes more than one word, enclose the message in quotes.

cumulus@switch:~$ nv config apply -m "this is my message"

Reset NVUE Configuration to Default Values

To reset the NVUE configuration on the switch back to the default values, run the following commands:

cumulus@switch:~$ nv config replace /usr/lib/python3/dist-packages/cue_config_v1/initial.yaml
cumulus@switch:~$ nv config apply

Detach a Pending Configuration

The following example configures the IP address of the loopback interface, then detaches the configuration from the current pending configuration. Cumulus Linux saves the detached configuration to a file with a numerical value to distinguish it from other pending configurations.

cumulus@switch:~$ nv set interface lo ip address 10.10.10.1/32
cumulus@switch:~$ nv config detach

View Differences Between Configurations

To view differences between configurations, run the nv config diff command.

To view differences between two detached pending configurations, run the nv config diff <<press tab>> command to list all the current detached pending configurations, then run the nv config diff command with the pending configurations you want to diff.

cumulus@switch:~$ nv config diff <<press tab>>
1        2        3        4        5        6        applied  empty    startup
cumulus@switch:~$ nv config diff 2 3
- unset:
    system:
      wjh:
        channel:
          forwarding:
            trigger:
              l2:

To view differences between the applied configuration and the startup configuration:

cumulus@switch:~$ nv config diff applied startup
- unset:
    interface:
    system:
      wjh:
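
The set and unset structure of the diff output can be pictured as a recursive comparison of two configuration trees. The following Python sketch illustrates the concept on made-up data; it is not how NVUE computes a diff, and the function name is hypothetical.

```python
def diff(old, new):
    """Return (unset, to_set) trees that describe how to turn old into new."""
    unset, to_set = {}, {}
    for key in sorted(old.keys() | new.keys()):
        if key not in new:
            unset[key] = None                 # the key disappears entirely
        elif key not in old:
            to_set[key] = new[key]            # the key is new
        elif isinstance(old[key], dict) and isinstance(new[key], dict):
            sub_unset, sub_set = diff(old[key], new[key])
            if sub_unset:
                unset[key] = sub_unset
            if sub_set:
                to_set[key] = sub_set
        elif old[key] != new[key]:
            to_set[key] = new[key]            # the value changed
    return unset, to_set


# Made-up configurations used only for this illustration.
old = {"system": {"hostname": "leaf01", "wjh": {"enable": "on"}}}
new = {"system": {"hostname": "leaf02"}, "interface": {"lo": {}}}

unset, to_set = diff(old, new)
print(unset)   # {'system': {'wjh': None}}
print(to_set)  # {'interface': {'lo': {}}, 'system': {'hostname': 'leaf02'}}
```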

Replace and Patch a Pending Configuration

The following example replaces the pending configuration with the contents of the YAML configuration file called nv-02/13/2021.yaml located in the /deps directory:

cumulus@switch:~$ nv config replace /deps/nv-02/13/2021.yaml

The following example patches the pending configuration (runs the set or unset commands from the configuration in the nv-02/13/2021.yaml file located in the /deps directory):

cumulus@switch:~$ nv config patch /deps/nv-02/13/2021.yaml

A patch contains a single request to the NVUE service. Ordering of parameters within a patch is not guaranteed; NVUE does not support both unset and set commands for the same object in a single patch.
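
The difference between replacing and patching can be sketched with nested dictionaries in Python. This is a conceptual illustration only: the function names are hypothetical, the data is made up, and the sketch models set operations but not unset operations.

```python
import copy


def replace(pending, new_config):
    """Discard the pending configuration and use new_config instead."""
    return copy.deepcopy(new_config)


def patch(pending, changes):
    """Merge changes into a copy of the pending configuration, key by key."""
    merged = copy.deepcopy(pending)
    for key, value in changes.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = patch(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Made-up configurations used only for this illustration.
pending = {
    "system": {"hostname": "leaf01"},
    "interface": {"lo": {"ip": "10.10.10.1/32"}},
}
changes = {"system": {"hostname": "leaf02"}}

print(patch(pending, changes))    # interface lo is kept, hostname changes
print(replace(pending, changes))  # only the system settings remain
```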

Translate a Configuration Revision or File

NVUE provides commands to translate an NVUE configuration revision or YAML file into NVUE commands. The revision ID must be either an integer or a named revision (such as startup, applied, pending). The configuration file must be located on the switch and must include the full path to the file containing the configuration you want to translate. The file must be in YAML format and must be accessible with proper read permissions.

To translate a specific NVUE configuration revision, run the nv config translate system config revision <revision-id> command. NVUE displays the translation on the console.

The following command translates the configuration in revision 10:

cumulus@switch:~$ nv config translate revision 10 

The following command translates the configuration in the applied revision:

cumulus@switch:~$ nv config translate revision applied 

To translate a configuration file, run the nv config translate system config input-file <file-path> command. The following example translates the backup.yaml file in the /home/cumulus directory. NVUE displays the translation on the console.

cumulus@switch:~$ nv config translate input-file /home/cumulus/backup.yaml

If the revision or YAML file is not readable, is in an invalid format, or includes invalid parameters, NVUE returns an error message and prompts you to correct the issue before proceeding.

Session-Based Authentication

NVUE uses sessions to authenticate and authorize requests. After authenticating the user with the first request, NVUE stores the session in the nvued cache. NVUE authenticates subsequent interactions within the session locally so that it does not have to keep checking with external authentication servers. This process enhances system performance and efficiency, making it ideal for high-traffic environments.

The following example clears the admin user session:

cumulus@switch:~$ nv action clear system api session user admin

The following example clears all sessions:

cumulus@switch:~$ nv action clear system api session

If you do not clear a user session after making changes directly on the RADIUS, TACACS, or LDAP server, NVUE uses the existing session for authentication and authorization until the session times out (up to 60 minutes).
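
The caching behavior described in this section can be modeled as a per-user session cache with a timeout. The Python sketch below is a conceptual model only; the class, its methods, and the authenticate_remote callback are hypothetical, and the 3600-second lifetime is taken from the 60-minute upper limit mentioned above.

```python
import time


class SessionCache:
    """Toy model of a session cache that avoids repeated remote authentication."""

    def __init__(self, ttl_seconds=3600, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._sessions = {}  # user -> expiry time

    def is_authenticated(self, user, authenticate_remote):
        now = self.clock()
        expiry = self._sessions.get(user)
        if expiry is not None and now < expiry:
            return True                      # answered from the cache
        self._sessions.pop(user, None)       # expired or missing
        if authenticate_remote(user):        # for example, RADIUS, TACACS, or LDAP
            self._sessions[user] = now + self.ttl
            return True
        return False

    def clear(self, user=None):
        """Clear one user session, or all sessions when user is None."""
        if user is None:
            self._sessions.clear()
        else:
            self._sessions.pop(user, None)
```

In this model, a permission change on the remote server has no effect until the cached session expires or clear() runs, which is why you clear the session after you change a user on the authentication server.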

Passwords and Special Characters

If you use certain special characters in a password, you must quote or escape (with a backslash) these characters so that the system understands that they are part of the password.

The following table shows if you need to quote or escape a special character.

Special Character Normal Use Single Quotes ('') Double Quotes ("") Escape (\)
backtick (`) x 1
exclamation point (!) x x
semicolon (;) x
ampersand (&) x
question mark (?) x x
tilde (~) x
at-sign (@)
hash sign (#) x
dollar sign ($) x x
percent sign (%)
caret (^)
asterisk (*)
parentheses (()) x
dash (-)
underscore (_)
equals sign (=)
plus sign (+)
vertical bar (|) x
brackets ([])
braces ({})
colon (:)
single quote (') x x
double quote (") x x
comma (,)
angle brackets (<>) x
slash (/)
dot (.) 2 2 2 2
white space x x 3 x
  1. Requires escape (\) in addition to the double quotes ("").
  2. You cannot use this character at the beginning of a word.
  3. A word cannot consist entirely of white space, even inside double quotes.

The following example shows a password that includes a question mark (?):

cumulus@switch:~$ nv set system aaa user cumulus password "Hello?world123"

The following example shows a password that includes a dot (.):

cumulus@switch:~$ nv set system aaa user cumulus password "Hello.world.123"

The following example shows a password that includes a dot (.) and tilde (~):

cumulus@switch:~$ nv set system aaa user cumulus password "Hello.world\~123"
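
If a script generates these commands, the Python standard library can apply shell quoting for you. The sketch below shows only how a POSIX shell would split the generated command line; the helper name is hypothetical, and the sketch does not model any additional rules that NVUE applies (such as the footnotes in the table above), so check the result against the table.

```python
import shlex


def nv_password_command(user, password):
    """Build a shell-quoted nv set system aaa user ... password command line."""
    return "nv set system aaa user {} password {}".format(
        shlex.quote(user), shlex.quote(password)
    )


cmd = nv_password_command("cumulus", "Hello?world123")
print(cmd)

# shlex.split follows POSIX shell word-splitting rules, so the last word is
# the argument that the shell would pass to nv as the password.
assert shlex.split(cmd)[-1] == "Hello?world123"
```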

Filter nv show Command Output

You can filter nv show command output on column data; for example, to show only the interfaces with MTU set to 1500:

cumulus@switch:~$ nv show interface --filter mtu=1500

To filter on multiple column outputs, enclose the entire filter in double quotes; for example, to show data for bridges with MTU 9216:

cumulus@switch:~$ nv show interface --filter "type=bridge&mtu=9216" 

You can use wildcards; for example, to show all IP addresses that start with 1 for swp1:

cumulus@switch:~$ nv show interface swp1 --filter "ip.address=1*"

You can filter on all revisions (operational, applied, and pending); for example, to show all IP addresses that start with 1 in the applied revision:

cumulus@switch:~$ nv show interface --filter "ip.address=1*" --rev=applied
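
The filter expressions above combine column=pattern terms with an ampersand (&) and allow the asterisk (*) wildcard. The following Python sketch illustrates those semantics on made-up rows; it is not NVUE's filter implementation, the function names are hypothetical, and it does not handle other operators such as !=.

```python
from fnmatch import fnmatchcase


def parse_filter(expr):
    """Split 'type=bridge&mtu=9216' into [('type', 'bridge'), ('mtu', '9216')]."""
    terms = []
    for term in expr.split("&"):
        column, sep, pattern = term.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in filter term {term!r}")
        terms.append((column, pattern))
    return terms


def matches(row, expr):
    """Return True if every column=pattern term matches the row."""
    return all(
        fnmatchcase(str(row.get(column, "")), pattern)
        for column, pattern in parse_filter(expr)
    )


# Made-up rows used only for this illustration.
rows = [
    {"name": "br_default", "type": "bridge", "mtu": 9216, "ip.address": ""},
    {"name": "swp1", "type": "swp", "mtu": 1500, "ip.address": "10.0.0.1/31"},
]

print([r["name"] for r in rows if matches(r, "type=bridge&mtu=9216")])  # ['br_default']
print([r["name"] for r in rows if matches(r, "ip.address=1*")])         # ['swp1']
```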

You can filter the FRR nv show vrf <vrf> router rib command output by protocol (bgp, ospf, kernel, static, ospf6, sharp, or connected); for example, to show all BGP IPv4 routes in the routing table:

cumulus@switch:~$ nv show vrf default router rib ipv4 route --filter=protocol=bgp                                                                             
Flags - * - selected, q - queued, o - offloaded, i - installed, S - fib-        
selected, x - failed                                                            
                                                                                
Route            Protocol  Distance  Uptime                NHGId  Metric  Flags
---------------  --------  --------  --------------------  -----  ------  -----
10.0.1.34/32     bgp       20        2024-12-17T10:24:14Z  127    0       *Si  
10.0.1.255/32    bgp       20        2024-12-17T10:24:10Z  127    0       *Si  
10.10.10.2/32    bgp       20        2024-12-17T10:24:10Z  62     0       *Si  
10.10.10.3/32    bgp       20        2024-12-17T10:24:17Z  127    0       *Si  
10.10.10.4/32    bgp       20        2024-12-17T10:24:10Z  127    0       *Si  
10.10.10.63/32   bgp       20        2024-12-17T10:24:10Z  127    0       *Si  
10.10.10.64/32   bgp       20        2024-12-17T10:24:17Z  127    0       *Si  
10.10.10.101/32  bgp       20        2024-12-17T10:24:10Z  102    0       *Si  
10.10.10.102/32  bgp       20        2024-12-17T10:24:10Z  115    0       *Si  
10.10.10.103/32  bgp       20        2024-12-17T10:24:10Z  121    0       *Si  
10.10.10.104/32  bgp       20        2024-12-17T10:24:10Z  113    0       *Si  

You can filter the FRR nv show vrf <vrf> router bgp neighbor command output by state (established or non-established); for example, to show all BGP established neighbors:

cumulus@switch:~$ nv show vrf default router bgp neighbor --filter=state=established                                                                             
AS - Remote Autonomous System, PeerEstablishedTime - Peer established time in   
UTC format, UpTime - Uptime in milliseconds, Afi-Safi - Address family, PfxSent 
- Transmitted prefix counter, PfxRcvd - Recieved prefix counter                 
                                                                                
Neighbor       AS     State        PeerEstablishedTime   UpTime   MsgRcvd  MsgSent  Afi-Safi      PfxSent  PfxRcvd
-------------  -----  -----------  --------------------  -------  -------  -------  ------------  -------  -------
peerlink.4094  65102  established  2024-12-17T10:22:36Z  8998000  3145     3151     ipv4-unicast  13       12     
                                                                                    l2vpn-evpn    72       51     
swp51          65199  established  2024-12-17T10:22:41Z  8998000  3132     3149     ipv4-unicast  13       8      
                                                                                    l2vpn-evpn    72       51     
swp52          65199  established  2024-12-17T10:22:44Z  8998000  3125     3139     ipv4-unicast  13       8      
                                                                                    l2vpn-evpn    72       51     
swp53          65199  established  2024-12-17T10:22:44Z  8998000  3138     3139     ipv4-unicast  13       8      
                                                                                    l2vpn-evpn    72       51     
swp54          65199  established  2024-12-17T10:22:44Z  8998000  3143     3139     ipv4-unicast  13       8      
                                                                                    l2vpn-evpn    72       51  

To show all BGP non-established neighbors:

cumulus@switch:~$ nv show vrf default router bgp neighbor --filter=state!=established
No Data

NVUE and FRR Restart

NVUE restarts the FRR service when you:

Restarting FRR restarts all the routing protocol daemons that you enable and that are running, which might impact traffic.

Date and Time

This section describes how to set the time zone and the date and time on the switch software clock, and how to configure NTP, PTP, PPS, and SyncE.

Setting the Date and Time

This section discusses how to set the time zone, and how to set the date and time on the software clock on the switch. To configure NTP, see Network Time Protocol - NTP. To configure PTP, see Precision Time Protocol - PTP.

Setting the time zone, and the date and time on the software clock requires root privileges; use sudo.

Show the Current Time Zone, Date, and Time

To show the current time zone, date, and time on the switch:

cumulus@switch:~$ nv show system time
                           operational                  
-------------------------  -----------------------------
local-time                 Wed 2024-08-21 17:39:44 EDT
universal-time             Wed 2024-08-21 21:39:44 UTC
rtc-time                   Fri 2024-08-16 16:50:06    
time-zone                  US/Eastern (EDT, -0400)    
system-clock-synchronized  no                         
ntp-service                n/a                        
rtc-in-local-tz            no                         
unix-time                  1724276384.1403222

cumulus@switch:~$ date
Wed 11 Oct 2023 12:18:33 PM UTC
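
The unix-time row in the nv show system time output is the same instant as the universal-time and local-time rows, expressed as seconds since the Unix epoch. The following Python sketch is not part of NVUE and is only an illustration you can run on any workstation; it converts the unix-time value from the example output. The fixed -04:00 offset that stands in for US/Eastern daylight time is an assumption made to keep the example self-contained.

```python
from datetime import datetime, timedelta, timezone

# Value copied from the unix-time row of the example output.
UNIX_TIME = 1724276384.1403222

# Assumption: a fixed -04:00 offset represents EDT for this one timestamp.
EDT = timezone(timedelta(hours=-4), "EDT")


def format_like_nvue(epoch_seconds: float, tz: timezone) -> str:
    """Render an epoch timestamp in the 'Wed 2024-08-21 17:39:44 EDT' style."""
    moment = datetime.fromtimestamp(epoch_seconds, tz)
    return moment.strftime("%a %Y-%m-%d %H:%M:%S ") + moment.tzname()


print(format_like_nvue(UNIX_TIME, timezone.utc))  # compare with universal-time
print(format_like_nvue(UNIX_TIME, EDT))           # compare with local-time
```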

To show the time zone only, run the date +%Z command:

cumulus@switch:~$ date +%Z
UTC

Set the Time Zone

You can use one of these methods to set the time zone on the switch:

Run the nv set system timezone <timezone> command. To see all the available time zones, run nv set system timezone and press the Tab key. The following example sets the time zone to US/Eastern:

cumulus@switch:~$ nv set system timezone US/Eastern
cumulus@switch:~$ nv config apply

Alternatively, select the time zone from the interactive tzdata menu:

  1. In a terminal, run the following command:

    cumulus@switch:~$ sudo dpkg-reconfigure tzdata
    
  2. Follow the on screen menu options to select the geographic area and region.

Alternatively, edit the time zone configuration files directly:

  1. Edit the /etc/timezone file to add your desired time zone. For a list of valid time zones, refer to tz database time zones.

    cumulus@switch:~$ sudo vi /etc/timezone
    US/Eastern
    
  2. Apply the new time zone:

    cumulus@switch:~$ sudo dpkg-reconfigure --frontend noninteractive tzdata
    
  3. Change /etc/localtime to reflect your current time zone:

    cumulus@switch:~$ sudo ln -sf /usr/share/zoneinfo/US/Eastern /etc/localtime
    

Set the Date and Time

The switch contains a battery-backed hardware clock that maintains the time while the switch is powered off and between reboots. When the switch is running, the Cumulus Linux operating system maintains its own software clock.

During boot up, the switch copies the time from the hardware clock to the operating system software clock. The software clock takes care of all the timekeeping. During system shutdown, the switch copies the software clock back to the battery-backed hardware clock.

If you need to reconfigure the current time zone, refer to the instructions above.

To set the software clock according to the configured time zone:

Run the nv action change system time <clock-date> <clock-time> command. Specify <clock-date> in YYYY-MM-DD format and <clock-time> in HH:MM:SS format.

cumulus@switch:~$ nv action change system time 2023-12-04 02:33:30
System Date-time changed successfully
Local Time is now Mon 2023-12-04 02:33:30 UTC
Action succeeded

cumulus@switch:~$ sudo date -s "Tue Jan 26 00:37:13 2021"
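
If a script supplies the <clock-date> and <clock-time> arguments, it can help to check them before running the command. The following Python sketch is a hypothetical helper, not an NVUE tool; it confirms that the two values name a real calendar date and time and returns them zero-padded in YYYY-MM-DD and HH:MM:SS form. It does not claim to reproduce the validation that NVUE itself performs.

```python
from datetime import datetime
from typing import Optional, Tuple


def normalize_clock_args(clock_date: str, clock_time: str) -> Optional[Tuple[str, str]]:
    """Return (YYYY-MM-DD, HH:MM:SS) if the inputs name a real date and time, else None."""
    try:
        moment = datetime.strptime(f"{clock_date} {clock_time}", "%Y-%m-%d %H:%M:%S")
    except ValueError:
        # Raised for malformed input and for impossible values such as February 30.
        return None
    return moment.strftime("%Y-%m-%d"), moment.strftime("%H:%M:%S")


print(normalize_clock_args("2023-12-04", "2:33:30"))
print(normalize_clock_args("2023-02-30", "02:33:30"))
```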

You can write the current value of the software clock to the hardware clock using the hwclock command:

cumulus@switch:~$ sudo hwclock -w

See man hwclock(8) for more information.

NVUE Snippets

NVUE supports both traditional snippets and flexible snippets:

  • A snippet configures a single parameter associated with a specific configuration file.
  • You can only set or unset a snippet; you cannot modify, partially update, or change a snippet.
  • Setting the snippet value replaces any existing snippet value.
  • Cumulus Linux supports only one snippet for a configuration file.
  • Only certain configuration files support a snippet.
  • NVUE does not parse or validate the snippet content and does not validate the resulting file after you apply the snippet.
  • PATCH is only the method of applying snippets and does not refer to any snippet capabilities.
  • As NVUE supports more features and introduces new syntax, snippets and flexible snippets become invalid. Before you upgrade Cumulus Linux to a new release, review the What's New for new NVUE syntax and remove the snippet if NVUE introduces new syntax for the feature that the snippet configures.

Traditional Snippets

Use traditional snippets if you configure Cumulus Linux with NVUE commands but need to configure a feature that the NVUE object model does not yet support. Create the snippet in YAML format, then add the configuration to the file with the nv config patch command.

The nv config patch command requires the fully qualified path name of the snippet .yaml file; for example, you cannot use ./ with the nv config patch command.
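
When automation invokes nv config patch, one way to satisfy the path requirement is to resolve the snippet file name against a known directory first. The following Python sketch is an illustration only; the patch_command helper and the /home/cumulus directory are hypothetical names, not part of NVUE.

```python
import posixpath
from typing import List


def patch_command(snippet_file: str, cwd: str) -> List[str]:
    """Build the nv config patch argument list with a fully qualified path."""
    # join() leaves an already absolute snippet_file untouched;
    # normpath() removes "./" and similar redundant components.
    absolute = posixpath.normpath(posixpath.join(cwd, snippet_file))
    return ["nv", "config", "patch", absolute]


print(patch_command("./bgp_snippet.yaml", cwd="/home/cumulus"))
```

You can pass the returned list to subprocess.run() on the switch; the sketch itself only builds the arguments and runs nothing.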

/etc/frr/frr.conf Snippets

Example 1: Top Level Configuration

NVUE does not support configuring BGP to peer across the default route. The following example configures BGP to peer across the default route from the default VRF:

  1. Create a .yaml file with the following traditional snippet:

    cumulus@switch:~$ sudo nano bgp_snippet.yaml
    - set:
        system:
          config:
            snippet:
              frr.conf: |
                ip nht resolve-via-default
    
  2. Run the following command to patch the configuration:

    cumulus@switch:~$ nv config patch bgp_snippet.yaml
    
  3. Run the nv config apply command to apply the configuration:

    cumulus@switch:~$ nv config apply
    
  4. Verify that the configuration exists at the end of the /etc/frr/frr.conf file:

    cumulus@switch:~$ sudo cat /etc/frr/frr.conf
    ...
    ! end of router ospf block
    !---- CUE snippets ----
    ip nht resolve-via-default
    

Example 2: Nested Configuration

NVUE does not support configuring EVPN route targets using auto derived values from RFC 8365. The following example configures BGP to enable RFC 8365 derived route targets:

  1. Create a .yaml file with the following traditional snippet:

    cumulus@switch:~$ sudo nano bgp_snippet.yaml
    - set:
        system:
          config:
            snippet:
              frr.conf: |
                router bgp 65517 vrf default
                  address-family l2vpn evpn
                    autort rfc8365-compatible
    

Make sure to use spaces, not tabs; the parser expects spaces in YAML format.

  2. Run the following command to patch the configuration:

    cumulus@switch:~$ nv config patch bgp_snippet.yaml
    
  3. Run the nv config apply command to apply the configuration:

    cumulus@switch:~$ nv config apply
    
  4. Verify that the configuration exists at the end of the /etc/frr/frr.conf file:

    cumulus@switch:~$ sudo cat /etc/frr/frr.conf
    ...
    ! end of router bgp 65517 vrf default
    !---- CUE snippets ----
    router bgp 65517 vrf default
    address-family l2vpn evpn
    autort rfc8365-compatible
    

The traditional snippets for FRR write content to the /etc/frr/frr.conf file. When you apply the configuration and snippet with the nv config apply command, the FRR service reads the /etc/frr/frr.conf file.
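
Because the YAML parser expects spaces, a quick check for tab indentation before you patch a snippet can save a failed attempt. The following Python sketch is a hypothetical pre-check, not an NVUE feature; it reports the line numbers whose leading whitespace contains a tab.

```python
from typing import List


def tab_indented_lines(snippet_text: str) -> List[int]:
    """Return the 1-based numbers of lines whose leading whitespace contains a tab."""
    flagged = []
    for number, line in enumerate(snippet_text.splitlines(), start=1):
        # Everything before the first non-blank character is the indentation.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            flagged.append(number)
    return flagged


sample = "- set:\n\tsystem:\n      config:\n"
print(tab_indented_lines(sample))
```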

Example 3: EVPN Multihoming FRR Debugging

NVUE does not support configuring FRR debugging for EVPN multihoming. The following example configures FRR debugging:

  1. Create a .yaml file and add the following traditional snippet:

    cumulus@switch:~$ sudo nano mh_debug_snippet.yaml
    - set:
        system:
          config:
            snippet:
              frr.conf: |
                debug bgp evpn mh es
                debug bgp evpn mh route
                debug bgp zebra
                debug zebra evpn mh es
                debug zebra evpn mh mac
                debug zebra evpn mh neigh
                debug zebra evpn mh nh
                debug zebra vxlan
    
  2. Run the following command to patch the configuration:

    cumulus@switch:~$ nv config patch mh_debug_snippet.yaml
    
  3. Run the nv config apply command to apply the configuration:

    cumulus@switch:~$ nv config apply
    
  4. Verify that the configuration exists in the /etc/frr/frr.conf file:

    cumulus@switch:~$ sudo cat /etc/frr/frr.conf
    ...
    !---- NVUE snippets ----
    debug bgp evpn mh es
    debug bgp evpn mh route
    debug bgp zebra
    debug zebra evpn mh es
    debug zebra evpn mh mac
    debug zebra evpn mh neigh
    debug zebra evpn mh nh
    debug zebra vxlan
    

/etc/network/interfaces Snippets

MLAG Timers Example

NVUE supports configuring only one of the MLAG service timeouts (initDelay). The following example configures the MLAG peer timeout to 400 seconds:

  1. Create a .yaml file and add the following traditional snippet:

    cumulus@switch:~$ sudo nano mlag_snippet.yaml
    - set:
        system:
          config:
            snippet:
              ifupdown2_eni:
                peerlink.4094: |
                  clagd-args --peerTimeout 400
    
  2. Run the following command to patch the configuration:

    cumulus@switch:~$ nv config patch mlag_snippet.yaml
    
  3. Run the nv config apply command to apply the configuration:

    cumulus@switch:~$ nv config apply
    
  4. Verify that the configuration exists in the peerlink.4094 stanza of the /etc/network/interfaces file:

    cumulus@switch:~$ sudo cat /etc/network/interfaces
    ...
    auto peerlink.4094
    iface peerlink.4094
     clagd-args --peerTimeout 400
     clagd-peer-ip linklocal
     clagd-backup-ip 10.10.10.2
     clagd-sys-mac 44:38:39:BE:EF:AA
     clagd-args --initDelay 180
    ...
    

Traditional Bridge Example

NVUE does not support configuring traditional bridges. The following example configures a traditional bridge called br0 with the IP address 11.0.0.10/24. swp1 and swp2 are members of the bridge.

  1. Create a .yaml file and add the following traditional snippet:

    cumulus@switch:~$ sudo nano bridge_snippet.yaml
    - set:
        system:
          config:
            snippet:
              ifupdown2_eni:
                eni_stanzas: |
                  auto br0
                  iface br0
                    address 11.0.0.10/24
                    bridge-ports swp1 swp2
                    bridge-vlan-aware no
    
  2. Run the following command to patch the configuration:

    cumulus@switch:~$ nv config patch bridge_snippet.yaml
    
  3. Run the nv config apply command to apply the configuration:

    cumulus@switch:~$ nv config apply
    
  4. Verify that the configuration exists at the end of the /etc/network/interfaces file:

    cumulus@switch:~$ sudo cat /etc/network/interfaces
    ...
    auto br0
    iface br0
      address 11.0.0.10/24
      bridge-ports swp1 swp2
      bridge-vlan-aware no
    

VLAN-aware RSTP Timers Example

NVUE does not support configuring RSTP timers on VLAN-aware bridges. The following example configures non-default RSTP timers for the NVUE default bridge br_default:

  1. Create a .yaml file and add the following traditional snippet:

    cumulus@switch:~$ sudo nano vlan-aware_bridge_snippet.yaml
    - set:
        system:
          config:
            snippet:
              ifupdown2_eni:
                br_default: |
                  mstpctl-maxage 10
                  mstpctl-hello 1
                  mstpctl-fdelay 8
    
  2. Run the following command to patch the configuration:

    cumulus@switch:~$ nv config patch vlan-aware_bridge_snippet.yaml
    
  3. Run the nv config apply command to apply the configuration:

    cumulus@switch:~$ nv config apply
    
  4. Verify that the configuration exists at the end of the /etc/network/interfaces file:

    cumulus@switch:~$ sudo cat /etc/network/interfaces
    ...
    auto br_default
    iface br_default
        mstpctl-maxage 10
        mstpctl-hello 1
        mstpctl-fdelay 8
    ...
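
The /etc/network/interfaces snippets above share one layout: an ifupdown2_eni key, a stanza name, and a block of lines. If you generate such files from a script, the following Python sketch shows one way to emit that layout with space-only indentation. The eni_snippet function is a hypothetical helper, and the output is plain text that you still patch and apply as described in the steps above.

```python
from typing import List


def eni_snippet(stanza: str, lines: List[str]) -> str:
    """Return YAML text for a traditional ifupdown2_eni snippet."""
    header = [
        "- set:",
        "    system:",
        "      config:",
        "        snippet:",
        "          ifupdown2_eni:",
        f"            {stanza}: |",
    ]
    # Lines of the block scalar sit two spaces deeper than the stanza key.
    body = [" " * 14 + line for line in lines]
    return "\n".join(header + body) + "\n"


print(eni_snippet("peerlink.4094", ["clagd-args --peerTimeout 400"]), end="")
```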
    

/etc/cumulus/datapath/traffic.conf Snippets

To add data path configuration for the Cumulus Linux switchd module that NVUE does not yet support, create a traffic.conf snippet.

The following example creates a file called traffic_conf_snippet.yaml and enables the resilient hash setting.

  1. Create a .yaml file and add the following traditional snippet:

    cumulus@switch:~$ sudo nano traffic_conf_snippet.yaml
    - set:
        system:
          config:
            snippet:
              traffic.conf: |
                resilient_hash_enable = TRUE
    
  2. Run the following command to patch the configuration:

    cumulus@switch:~$ nv config patch traffic_conf_snippet.yaml
    
  3. Run the nv config apply command to apply the configuration:

    cumulus@switch:~$ nv config apply
    
  4. Verify that the configuration exists at the end of the /etc/cumulus/datapath/traffic.conf file:

    cumulus@switch:~$ sudo cat /etc/cumulus/datapath/traffic.conf
    ...
    !---- NVUE snippets ----
    resilient_hash_enable = TRUE
    

/etc/snmp/snmpd.conf Snippets

To add Cumulus Linux SNMP agent configuration not yet available with NVUE commands, create an snmpd.conf snippet.

The following example creates a file called snmpd.conf_snippet.yaml, and sets the read-only community string and the listening address to run in the mgmt VRF.

SNMP snippets do not take effect unless you first enable SNMP with the NVUE nv set system snmp-server state enable and nv set system snmp-server listening-address commands (or with the equivalent REST API methods).

  1. Create a .yaml file and add the following traditional snippet:

    cumulus@switch:~$ sudo nano snmpd.conf_snippet.yaml
    - set:
        system:
          config:
            snippet:
              snmpd.conf: |
                rocommunity cumuluspassword default
                agentaddress udp:@mgmt:161
    
  2. Run the following command to patch the configuration:

    cumulus@switch:~$ nv config patch snmpd.conf_snippet.yaml
    
  3. Run the nv config apply command to apply the configuration:

    cumulus@switch:~$ nv config apply
    
  4. Verify that the configuration exists at the end of the /etc/snmp/snmpd.conf file:

    cumulus@switch:~$ sudo cat /etc/snmp/snmpd.conf
    ...
    !---- NVUE SNMP Server Snippets ----
    rocommunity cumuluspassword default
    agentaddress udp:@mgmt:161
    

/etc/ssh/sshd_config Snippets

To add SSH service configuration not yet available with NVUE commands, create an sshd_config snippet.

The following example creates a file called sshd_config_snippet.yaml to allow root login and enable X11 forwarding for all users except user anoncvs. The snippet also disables TCP forwarding for the anoncvs user and runs the cvs server command when anoncvs logs in.

  1. Create a .yaml file and add the following traditional snippet:

    cumulus@switch:~$ sudo nano sshd_config_snippet.yaml
    - set:
        system:
          config:
            snippet:
              sshd_config: |
                PermitRootLogin yes
                X11Forwarding yes
                Match User anoncvs
                   X11Forwarding no
                   AllowTcpForwarding no
                   ForceCommand cvs server
    
  2. Run the following command to patch the configuration:

    cumulus@switch:~$ nv config patch sshd_config_snippet.yaml
    
  3. Run the nv config apply command to apply the configuration:

    cumulus@switch:~$ nv config apply
    
  4. Verify that the configuration exists at the end of the /etc/ssh/sshd_config file:

    cumulus@switch:~$ sudo cat /etc/ssh/sshd_config
    ...
    !---- NVUE snippets ----
    PermitRootLogin yes
    X11Forwarding yes
    Match User anoncvs
       X11Forwarding no
       AllowTcpForwarding no
       ForceCommand cvs server
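
Each traditional snippet example ends with a manual check that the snippet lines appear in the target file. The following Python sketch shows how a script might perform the same check on text it has already read; snippet_present is a hypothetical helper, and it compares lines after stripping indentation, so it does not verify nesting.

```python
def snippet_present(config_text: str, snippet_text: str) -> bool:
    """Return True if every non-blank snippet line appears, in order, in the config."""
    wanted = [line.strip() for line in snippet_text.splitlines() if line.strip()]
    remaining = iter(line.strip() for line in config_text.splitlines())
    # Testing membership on an iterator consumes it, so each match must
    # come after the previous one in the configuration text.
    return all(line in remaining for line in wanted)


config = "Port 22\nPermitRootLogin yes\nX11Forwarding yes\n"
print(snippet_present(config, "PermitRootLogin yes\nX11Forwarding yes\n"))
```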
    

Flexible Snippets

Flexible snippets are an extension of traditional snippets that let you manage any text file on the system.

The account you use through the CLI or the REST API to configure and manage flexible snippets must be in the sudo group, which includes the NVUE system-admin role, or you must be the root user.

Files NVUE Manages

You can use flexible snippets to add configuration to the following files that NVUE manages:

| Filename | Description |
| --- | --- |
| /etc/cumulus/csmgrd | Configuration file for csmgrctl commands. |
| /etc/default/isc-dhcp-relay-<VRF> | Configuration file for DHCP relay. Changes to this file require a dhcrelay@<VRF>.service restart. |
| /etc/resolv.conf | Configuration file for DNS resolution. |
| /etc/hosts | Configuration file for the hostname of the switch. |
| /etc/default/isc-dhcp-server-<VRF> | Configuration file for DHCP servers. Changes to this file require a dhcpd@<VRF>.service restart. |
| /etc/default/isc-dhcp-server6-<VRF> | Configuration file for DHCP servers for IPv6. Changes to this file require a dhcpd6@<VRF>.service restart. |
| /etc/dhcp/dhcpd-<VRF>.conf | Configuration file for the dhcpd service. Changes to this file require a dhcpd@<VRF>.service restart. |
| /etc/dhcp/dhcpd6-<VRF>.conf | Configuration file for the dhcpd service for IPv6. Changes to this file require a dhcpd6@<VRF>.service restart. |
| /etc/ntpsec/ntp.conf | Configuration file for NTP servers. Changes to this file require an ntp service restart. |
| /etc/default/isc-dhcp-relay6-<VRF> | Configuration file for DHCP relay for IPv6. Changes to this file require a dhcrelay6@<VRF>.service restart. |
| /etc/snmp/snmpd.conf | Configuration file for SNMP. Changes to this file require an snmpd restart. |
| /etc/cumulus/datapath/traffic.conf | Configuration file for forwarding table profiles. Changes to this file require a switchd restart. |
| /etc/cumulus/switchd.conf | Configuration file for switchd. Changes to this file require a switchd restart. |

Flexible snippets do not support:

Use caution when creating flexible snippets:

  • If you configure flexible snippets incorrectly, they might impact switch functionality. For example, even though flexible snippet validation allows you to add only textual content, Cumulus Linux does not prevent you from creating a flexible snippet that adds content to sensitive text files, such as /boot/grub.cfg and /etc/fstab, or that adds corrupt content. Such snippets might render the switch unusable or create a potential security vulnerability because the NVUE service (nvued) runs with superuser privileges.
  • Do not manually update configuration files to which you add flexible snippets.
  • Any sensitive data in plain text (such as passwords) appears in the NVUE-managed configuration files as plain text.

Create a Flexible Snippet

To create a flexible snippet:

  1. Create a file in yaml format and add each flexible snippet you want to apply in the format shown below. NVUE appends the flexible snippet at the end of an existing file. If the file does not exist, NVUE creates the file, then adds the content.

    cumulus@leaf01:mgmt:~$ sudo nano <filename>.yaml
    - set:
        system:
          config:
            snippet:
              <snippet-name>:
                file: "<filename>"
                permissions: "<umask-permissions>"
                content: |
                  # This is my content
                services:
                  <name>:
                    service: <service-name>
                    action: <action>
    
    • You can only set the umask permissions on a new file that you create. Adding the permissions: line is optional. The default umask permissions are 644.
    • You can add a service with an action, such as start, restart, or stop. Adding the services: lines is optional; however, if you add the services: line, you must specify at least one service.
  2. Run the following command to patch the configuration:

    cumulus@switch:~$ nv config patch <filename>.yaml
    
  3. Run the nv config apply command to apply the configuration:

    cumulus@switch:~$ nv config apply
    
  4. Verify the patched configuration.

The nv config patch command requires the fully qualified path name of the snippet .yaml file; for example, you cannot use ./ with the nv config patch command.
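
The flexible snippet format lends itself to generation from a script. The following Python sketch is a minimal illustration that assembles the YAML text from its parts; the flexible_snippet function and its parameter names are hypothetical, and NVUE still validates the result only when you patch and apply it.

```python
from typing import Dict, List, Optional, Tuple


def flexible_snippet(
    name: str,
    file: str,
    content_lines: List[str],
    permissions: Optional[str] = None,
    services: Optional[Dict[str, Tuple[str, str]]] = None,
) -> str:
    """Return YAML text for one flexible snippet, indented with spaces only."""
    out = [
        "- set:",
        "    system:",
        "      config:",
        "        snippet:",
        f"          {name}:",
        f'            file: "{file}"',
    ]
    if permissions is not None:  # optional; applies only to a new file
        out.append(f'            permissions: "{permissions}"')
    out.append("            content: |")
    out += [" " * 14 + line for line in content_lines]
    if services:  # optional; maps a label to a (service, action) pair
        out.append("            services:")
        for label, (service, action) in services.items():
            out.append(f"              {label}:")
            out.append(f"                service: {service}")
            out.append(f"                action: {action}")
    return "\n".join(out) + "\n"


print(flexible_snippet("example-snippet", "/etc/example.conf", ["# This is my content"]), end="")
```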

Flexible Snippet Examples

The following example flexible snippet called crontab-flex-snippet appends the single line @daily /opt/utils/run-backup.sh to the existing /etc/crontab file, then restarts the cron service.

cumulus@leaf01:mgmt:~$ sudo nano crontab-flex-snippet.yaml
- set:
    system:
      config:
        snippet:
          crontab-flex-snippet:
            file: "/etc/crontab"
            content: |
              @daily /opt/utils/run-backup.sh
            services:
              schedule:
                service: cron
                action: restart

The following example flexible snippet called apt-flexible-snippet creates a new file /etc/apt/sources.list.d/microsoft-prod.list with 0644 permissions and adds multi-line text:

cumulus@leaf01:mgmt:~$ sudo nano apt-flex-snippet.yaml
- set:
    system:
      config:
        snippet:
          apt-flexible-snippet:
            file: "/etc/apt/sources.list.d/microsoft-prod.list"
            content: |
              # Adding Microsoft SQL Server Sources
              deb [arch=amd64] https://packages.microsoft.com/debian/10/prod buster main
            permissions: "0644"

The following flexible snippet file called lldp_config_snippet.yaml disables LLDP on swp1 and swp2 using the configure system interface pattern-blacklist command:

cumulus@leaf01:mgmt:~$ sudo nano lldp_config_snippet.yaml
- set:
    system:
      config:
        snippet:
          lldp-interfaces-config:
            file: "/etc/lldpd.d/lldp-interfaces.conf"
            content: |
              configure system interface pattern-blacklist swp1,swp2
            services:
              lldp:
                service: lldpd
                action: restart

The following flexible snippet file called lldp_config_snippet.yaml disables LLDP on swp1 and swp2 using the system interface pattern keyword:

cumulus@leaf01:mgmt:~$ sudo nano lldp_config_snippet.yaml
- set:
    system:
      config:
        snippet:
          lldp-interfaces-config:
            file: "/etc/lldpd.d/lldp-interfaces.conf"
            content: |
              configure system interface pattern eth*,swp*,!swp1,!swp2
            services:
              lldp:
                service: lldpd
                action: restart

After you patch and apply the configuration above, the snippet creates a new file in the /etc/lldpd.d directory, then restarts the lldpd service to stop LLDP transmitting and receiving on swp1 and swp2. Other interfaces continue to participate in LLDP.
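
To see why the pattern eth*,swp*,!swp1,!swp2 leaves swp1 and swp2 out while keeping the other ports, the following Python sketch models the matching in a simplified way. It is not the lldpd implementation: it handles only plain wildcard entries and entries prefixed with a single exclamation mark, and it assumes that a list containing only exclusions selects every other interface.

```python
from fnmatch import fnmatchcase


def pattern_selects(interface: str, pattern: str) -> bool:
    """Simplified model: True if the comma-separated pattern selects the interface."""
    entries = [entry for entry in pattern.split(",") if entry]
    excluded = [entry[1:] for entry in entries if entry.startswith("!")]
    included = [entry for entry in entries if not entry.startswith("!")]
    if not included:
        # Assumption: with exclusions only, all other interfaces are selected.
        included = ["*"]
    if any(fnmatchcase(interface, glob) for glob in excluded):
        return False
    return any(fnmatchcase(interface, glob) for glob in included)


for port in ("eth0", "swp1", "swp2", "swp3"):
    print(port, pattern_selects(port, "eth*,swp*,!swp1,!swp2"))
```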

If you try to apply a flexible snippet to a file that NVUE does not allow, you see an error message similar to the following:

cumulus@leaf01:mgmt:~$ nv config apply
Invalid config [rev_id: 8]
  Flexible snippets are not allowed to be configured on the file '/etc/cumulus/ports.conf'.
  Flexible snippets are not allowed to be configured on the file '/etc/cumulus/ports_width.conf'.

If you try to apply a flexible snippet to a file that supports traditional snippets, you see an error message similar to the following:

cumulus@leaf01:mgmt:~$ nv config apply
Invalid config [rev_id: 1]
  Flexible snippet cannot be used to modify the file '/etc/ssh/sshd_config'. Traditional snippets (for e.g., 'sshd_config') are supported on this file. Consult NVIDIA NVUE documentation for further information on snippets.

You can also create a flexible snippet with the REST API. See NVUE API.

Remove a Snippet

To remove a traditional or flexible snippet, edit the snippet .yaml file to change set to unset, then patch and apply the configuration. You can also use the REST API DELETE and PATCH methods.

The following example removes the MLAG timer traditional snippet created above to configure the MLAG peer timeout:

  1. Edit the mlag_snippet.yaml file to change set to unset:

    cumulus@switch:~$ sudo nano mlag_snippet.yaml
    - unset:
        system:
          config:
            snippet:
              ifupdown2_eni:
    
  2. Run the following command to patch the configuration:

    cumulus@switch:~$ nv config patch mlag_snippet.yaml
    
  3. Run the nv config apply command to apply the configuration:

    cumulus@switch:~$ nv config apply
    
  4. Verify that the peer timeout parameter no longer exists in the peerlink.4094 stanza of the /etc/network/interfaces file:

    cumulus@switch:~$ sudo cat /etc/network/interfaces
    ...
    auto peerlink.4094
    iface peerlink.4094
     clagd-peer-ip linklocal
     clagd-backup-ip 10.10.10.2
     clagd-sys-mac 44:38:39:BE:EF:AA
     clagd-args --initDelay 180
    ...
    

Network Time Protocol - NTP

The ntpd daemon running on the switch implements NTP. It synchronizes the system time with the time servers listed in the /etc/ntpsec/ntp.conf file. The ntpd daemon starts at boot by default.

If you intend to run this service within a VRF, including the management VRF, follow these steps to configure the service.

Configure NTP Servers

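The default NTP configuration includes the following servers, which are in the /etc/ntpsec/ntp.conf file:

server 0.cumulusnetworks.pool.ntp.org iburst
server 1.cumulusnetworks.pool.ntp.org iburst
server 2.cumulusnetworks.pool.ntp.org iburst
server 3.cumulusnetworks.pool.ntp.org iburst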

To add the NTP servers you want to use, run the following commands. Include the iburst option to increase the sync speed.

The NVUE command requires a VRF. The following command adds the NTP servers in the default VRF.

cumulus@switch:~$ nv set service ntp default server 4.cumulusnetworks.pool.ntp.org iburst on
cumulus@switch:~$ nv config apply

Edit the /etc/ntpsec/ntp.conf file to add or update NTP server information:

cumulus@switch:~$ sudo nano /etc/ntpsec/ntp.conf
# pool.ntp.org maps to about 1000 low-stratum NTP servers.  Your server will
# pick a different set every time it starts up.  Please consider joining the
# pool: <http://www.pool.ntp.org/join.html>
server 0.cumulusnetworks.pool.ntp.org iburst
server 1.cumulusnetworks.pool.ntp.org iburst
server 2.cumulusnetworks.pool.ntp.org iburst
server 3.cumulusnetworks.pool.ntp.org iburst
server 4.cumulusnetworks.pool.ntp.org iburst

To set the initial date and time with NTP before starting the ntpd daemon, run the ntpd -q command. Be aware that ntpd -q can hang if the time servers are not reachable.

To verify that ntpd is running on the system:

cumulus@switch:~$ ps -ef | grep ntp
ntp       4074     1  0 Jun20 ?        00:00:33 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 101:102

To check the NTP peer status:

cumulus@switch:~$ nv show service ntp mgmt server
                 delay    iburst  jitter  offset   peer-state  poll  reach  refid         stratum  type  when
---------------  -------  ------  ------  -------  ----------  ----  -----  ------------  -------  ----  ----
23.157.160.168   67.4257          2.3843  -3.9378  -           128   377    129.6.15.28   2        u     41  
50.205.57.38     72.6007          1.0799  -1.8208  *           128   377    .GPS.         1        u     63  
h134-215-155-17  59.4988          2.3081  -2.6286  +           128   377    216.239.35.0  2        u     15  
li1150-42.membe  40.9645          0.4877  -1.9565  +           64    376    129.7.1.66    2        u     162

The nv show service ntp <vrf-id> pool command shows information about the configured NTP pools. However, this command does not show an accurate representation of the connectivity state to the NTP reference clocks on the network. To show the actual state of the NTP reference servers discovered by the switch, run the nv show service ntp <vrf-id> server command.

cumulus@switch:~$ ntpq -p
      remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+ec2-34-225-6-20 129.6.15.30      2 u   73 1024  377   70.414   -2.414   4.110
+lax1.m-d.net    132.163.96.1     2 u   69 1024  377   11.676    0.155   2.736
*69.195.159.158  199.102.46.72    2 u  133 1024  377   48.047   -0.457   1.856
-2.time.dbsinet. 198.60.22.240    2 u 1057 1024  377   63.973    2.182   2.692

The following example commands remove default NTP servers:

cumulus@switch:~$ nv unset service ntp default server 0.cumulusnetworks.pool.ntp.org
cumulus@switch:~$ nv unset service ntp default server 1.cumulusnetworks.pool.ntp.org
cumulus@switch:~$ nv unset service ntp default server 2.cumulusnetworks.pool.ntp.org
cumulus@switch:~$ nv unset service ntp default server 3.cumulusnetworks.pool.ntp.org
cumulus@switch:~$ nv config apply

Edit the /etc/ntpsec/ntp.conf file to delete NTP servers.

cumulus@switch:~$ sudo nano /etc/ntpsec/ntp.conf
...
# pool.ntp.org maps to about 1000 low-stratum NTP servers.  Your server will
# pick a different set every time it starts up.  Please consider joining the
# pool: <http://www.pool.ntp.org/join.html>
server 4.cumulusnetworks.pool.ntp.org iburst
...

Specify the NTP Source Interface

By default, the source interface that NTP uses is eth0. The following example command configures the NTP source interface to be swp10.

cumulus@switch:~$ nv set service ntp default listen swp10
cumulus@switch:~$ nv config apply

Edit the /etc/ntpsec/ntp.conf file and modify the entry under the Specify interfaces comment.

cumulus@switch:~$ sudo nano /etc/ntpsec/ntp.conf
...
# Specify interfaces
interface listen swp10
...

Use NTP in a DHCP Environment

You can use DHCP to specify your NTP servers. Ensure that the DHCP-generated configuration file /run/ntp.conf.dhcp exists. The /etc/dhcp/dhclient-exit-hooks.d/ntp script generates this file, which is a copy of the default /etc/ntpsec/ntp.conf file with a modified server list from the DHCP server. If this file does not exist and you plan on using DHCP in the future, you can copy your current /etc/ntpsec/ntp.conf file to the location of the DHCP file.

To use DHCP to specify your NTP servers, run the sudo -E systemctl edit ntpsec.service command and add the ExecStart= line:

cumulus@switch:~$ sudo -E systemctl edit ntpsec.service
[Service]
ExecStart=
ExecStart=/usr/sbin/ntpd -n -u ntp:ntp -g -c /run/ntp.conf.dhcp

The sudo -E systemctl edit ntpsec.service command always updates the base ntpsec.service even if you use ntp@mgmt.service. The ntpsec@mgmt.service is re-generated automatically.

To validate your configuration, run these commands:

cumulus@switch:~$ sudo systemctl restart ntp
cumulus@switch:~$ sudo systemctl status -n0 ntpsec.service

If the state is not Active, or the alternate configuration file does not appear in the ntp command line, it is likely that you made a configuration mistake. Correct the mistake and rerun the commands above to verify.

Configure NTP with Authorization Keys

For added security, you can configure NTP to use authorization keys.

Configure the NTP Server

  1. Create a .keys file, such as /etc/ntp.keys. Specify a key identifier (a number between 1 and 65535), an encryption method (M for MD5), and the password. The following provides an example:
```
#
# PLEASE DO NOT USE THE DEFAULT VALUES HERE.
#
#65535  M  akey
#1      M  pass

1  M  CumulusLinux!
```
  2. In the /etc/ntpsec/ntp.conf file, add a pointer to the /etc/ntp.keys file you created above and specify the key identifier. For example:

    keys /etc/ntp.keys
    trustedkey 1
    controlkey 1
    requestkey 1

  3. Restart NTP with the sudo systemctl restart ntp command.
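
The key file format described above (key identifier, type, and password) can be checked before you restart NTP. The following is a minimal sketch, not a Cumulus Linux tool: the parse_ntp_keys function is a hypothetical helper, and it only accepts the M type and the 1 to 65535 identifier range mentioned in this section.

```python
def parse_ntp_keys(text):
    """Parse ntp.keys-style content into {key_id: (key_type, password)}.

    Hypothetical helper: blank lines and lines starting with '#' are skipped.
    """
    keys = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(None, 2)
        if len(fields) != 3:
            raise ValueError(f"line {lineno}: expected '<id> <type> <password>'")
        key_id, key_type, password = fields
        if not key_id.isdigit() or not 1 <= int(key_id) <= 65535:
            raise ValueError(f"line {lineno}: key identifier must be 1-65535")
        if key_type != "M":
            # Assumption: only the MD5 type (M) shown in this section is checked.
            raise ValueError(f"line {lineno}: unsupported key type {key_type!r}")
        keys[int(key_id)] = (key_type, password)
    return keys


SAMPLE = """\
#
# PLEASE DO NOT USE THE DEFAULT VALUES HERE.
#
#65535  M  akey
#1      M  pass

1  M  CumulusLinux!
"""

if __name__ == "__main__":
    print(parse_ntp_keys(SAMPLE))
```

Running the same check on the server and on the client copy of the file is one way to confirm that both sides hold the same key identifier and password.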

Configure the NTP Client

The NTP client is the Cumulus Linux switch.

  1. Create the same .keys file you created on the NTP server (/etc/ntp.keys). For example:
```
cumulus@switch:~$  sudo nano /etc/ntp.keys
#
# DO NOT USE THE DEFAULT VALUES HERE.
#
#65535  M  akey
#1      M  pass

1  M  CumulusLinux!
```
  2. Edit the /etc/ntpsec/ntp.conf file to specify the server you want to use, the key identifier, and a pointer to the /etc/ntp.keys file you created in step 1. For example:

    cumulus@switch:~$ sudo nano /etc/ntpsec/ntp.conf
    ...
    # You do need to talk to an NTP server or two (or three).
    #pool ntp.your-provider.example
    # OR
    #server ntp.your-provider.example

    # pool.ntp.org maps to about 1000 low-stratum NTP servers.  Your server will
    # pick a different set every time it starts up.  Please consider joining the
    # pool: <http://www.pool.ntp.org/join.html>
    #server 0.cumulusnetworks.pool.ntp.org iburst
    #server 1.cumulusnetworks.pool.ntp.org iburst
    #server 2.cumulusnetworks.pool.ntp.org iburst
    #server 3.cumulusnetworks.pool.ntp.org iburst
    server 10.50.23.121 key 1

    #keys
    keys /etc/ntp.keys
    trustedkey 1
    controlkey 1
    requestkey 1
    ...

  3. Restart NTP in the active VRF (default or management). For example:

    cumulus@switch:~$ sudo systemctl restart ntp@mgmt.service

  4. Wait a few minutes, then run the ntpq -c as command to verify the configuration:

    cumulus@switch:~$ ntpq -c as
    
    ind assid status  conf reach auth condition  last_event cnt
    ===========================================================
      1 40828  f014   yes   yes   ok     reject   reachable  1
    

    After a successful authorization, you see the following command output:

    cumulus@switch:~$ ntpq -c as
    
    ind assid status  conf reach auth condition  last_event cnt
    ===========================================================
      1 40828  f61a   yes   yes   ok   sys.peer    sys_peer  1
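
The two ntpq -c as outputs above differ only in the condition and last_event columns, so the check can be automated. The following is a minimal sketch that parses the table text; the function names are hypothetical, and the column list is copied from the header shown above.

```python
COLUMNS = ["ind", "assid", "status", "conf", "reach", "auth",
           "condition", "last_event", "cnt"]


def parse_associations(output):
    """Return one dict per association row of `ntpq -c as` output."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows have nine fields and start with a numeric index;
        # the header and the separator line do not.
        if len(fields) == len(COLUMNS) and fields[0].isdigit():
            rows.append(dict(zip(COLUMNS, fields)))
    return rows


def has_authenticated_sys_peer(output):
    """True if any association is authenticated and selected as sys.peer."""
    return any(row["auth"] == "ok" and row["condition"] == "sys.peer"
               for row in parse_associations(output))


REJECTED = """\
ind assid status  conf reach auth condition  last_event cnt
===========================================================
  1 40828  f014   yes   yes   ok     reject   reachable  1
"""

SYNCED = """\
ind assid status  conf reach auth condition  last_event cnt
===========================================================
  1 40828  f61a   yes   yes   ok   sys.peer    sys_peer  1
"""

if __name__ == "__main__":
    print(has_authenticated_sys_peer(REJECTED))
    print(has_authenticated_sys_peer(SYNCED))
```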
    

Considerations

NTP in Cumulus Linux uses the /usr/share/zoneinfo/leap-seconds.list file, which expires periodically and results in generated log messages about the expiration. When the file expires, update it from https://www.ietf.org/timezones/data/leap-seconds.list or upgrade the tzdata package to the newest version.

NVUE API

When you upgrade to Cumulus Linux 5.6 or later, the switch overwrites any manual configuration you performed by editing files in Cumulus Linux 5.5 or earlier, such as configuring the listening address, port, TLS, or certificate.

In addition to the CLI, NVUE supports a REST API. Instead of accessing Cumulus Linux using SSH, you can interact with the switch using an HTTP client, such as cURL or a web browser.

The nvued service provides access to the NVUE REST API. Cumulus Linux exposes the HTTP endpoint internally, which makes the NVUE REST API accessible locally within the Cumulus Linux switch. The NVUE CLI also communicates with the nvued service using internal APIs. To provide external access to the NVUE REST API, Cumulus Linux uses an HTTP reverse proxy server, and supports HTTPS and TLS connections from external REST API clients.

The following illustration shows the NVUE REST API architecture and illustrates how Cumulus Linux forwards the requests internally.

Supported HTTP Methods

The NVUE REST API supports the following methods:

In Cumulus Linux 5.9 and earlier, the REST API PATCH response returns the full state of the NVUE system (your configuration change and all other NVUE configuration on the switch), which can be inefficient with large scale configurations as the system state grows with the configuration. In Cumulus Linux 5.10 and later, the REST API PATCH response returns only the resulting configuration change. The response typically equals the request payload; however, in certain instances the response returns additional changes that the NVUE server patches in automatically. For example, when using well-named interface names like swp1, NVUE configures the type automatically:

PATCH request: {'interface': {'swp1': {}}}
PATCH response: {'interface': {'swp1': {'type': 'swp'}},
...
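
Because a PATCH response in Cumulus Linux 5.10 and later can contain more than the request payload, a client might want to see what the server added. The following is a minimal sketch that compares two already-decoded JSON objects; the added_by_server function is a hypothetical helper and not part of NVUE.

```python
def added_by_server(request, response):
    """Return the parts of `response` that are absent from `request`."""
    extra = {}
    for key, value in response.items():
        if key not in request:
            extra[key] = value
        elif isinstance(value, dict) and isinstance(request[key], dict):
            nested = added_by_server(request[key], value)
            if nested:
                extra[key] = nested
        elif value != request[key]:
            extra[key] = value
    return extra


if __name__ == "__main__":
    request = {"interface": {"swp1": {}}}
    response = {"interface": {"swp1": {"type": "swp"}}}
    print(added_by_server(request, response))
```

For the swp1 example above, the helper reports only the type attribute that NVUE filled in.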

In Cumulus Linux 5.10 and later, DELETE responses return a 204 (No Content) status code. In Cumulus Linux 5.9 and earlier, DELETE responses return a 200 status code with an empty JSON body ({}).
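
A client that talks to switches on both sides of the 5.10 boundary has to accept either DELETE behavior. The following is a minimal sketch under that assumption; delete_succeeded is a hypothetical helper that takes the status code and raw body from whichever HTTP library you use.

```python
import json


def delete_succeeded(status_code, body):
    """Interpret a DELETE response from either behavior described above."""
    if status_code == 204:
        # Cumulus Linux 5.10 and later: no content is returned.
        return True
    if status_code == 200:
        # Cumulus Linux 5.9 and earlier: an empty JSON object is returned.
        return json.loads(body or "{}") == {}
    return False


if __name__ == "__main__":
    print(delete_succeeded(204, b""))
    print(delete_succeeded(200, b"{}"))
    print(delete_succeeded(403, b""))
```

Calling a JSON decoder unconditionally on a 204 response fails because the body is empty, which is why the status code is checked first.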

Secure the API

The NVUE REST API supports HTTP basic authentication, and the same underlying authentication methods for username and password that the NVUE CLI supports. User accounts work the same on both the API and the CLI.
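
HTTP basic authentication sends the username and password as a Base64-encoded Authorization header on every request. The following is a minimal sketch of how that header is built using only the Python standard library; the credentials are placeholders, and basic_auth_header is a hypothetical helper.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header used by HTTP basic authentication."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return {"Authorization": "Basic " + token.decode("ascii")}


if __name__ == "__main__":
    # Placeholder credentials; use the account configured on your switch.
    print(basic_auth_header("cumulus", "example-password"))
```

Base64 is an encoding, not encryption, so the credentials are only protected when the request travels over HTTPS.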

Certificates

Cumulus Linux includes a self-signed certificate and private key to use on the server so that it works out of the box. The switch generates the self-signed certificate and private key when it boots for the first time. The X.509 certificate with the public key is in /etc/ssl/certs/cumulus.pem and the corresponding private key is in /etc/ssl/private/cumulus.key.

NVIDIA recommends you use your own certificates and keys. For the steps to generate self-signed certificates and keys, refer to the Ubuntu Certificates and Security documentation.

Cumulus Linux lets you manage CA certificates (such as DigiCert or Verisign) and entity (end-point) certificates. Both a CA certificate and an entity certificate can contain a chain of certificates.

You can import certificates onto the switch (fetch certificates from an external source), set which certificate you want to use for the NVUE REST API, and show information about a certificate, such as the serial number, and the date and time during which the certificate is valid.

Import a Certificate

  • You can import a maximum of 25 entity certificates and a maximum of 25 CA bundles. Each CA bundle file supports up to 100 CA certificates.
  • The certificate you import contains sensitive private key information. NVIDIA recommends that you use a secure transport such as SFTP, SCP, or HTTPS.
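
Before you import a CA bundle, you can count the certificates it contains against the limit of 100 stated above. The following is a minimal sketch that assumes a PEM-format bundle in which each certificate sits between the standard BEGIN CERTIFICATE and END CERTIFICATE markers; the function names are hypothetical.

```python
import re

MAX_CA_CERTS_PER_BUNDLE = 100  # limit stated in this section

PEM_CERT = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL)


def count_pem_certificates(bundle_text):
    """Count PEM certificate blocks in the text of a bundle file."""
    return len(PEM_CERT.findall(bundle_text))


def bundle_within_limit(bundle_text, limit=MAX_CA_CERTS_PER_BUNDLE):
    """True if the bundle holds at least one and at most `limit` certificates."""
    return 0 < count_pem_certificates(bundle_text) <= limit


if __name__ == "__main__":
    # Dummy content for illustration only; not a real certificate.
    fake = "-----BEGIN CERTIFICATE-----\nAAAA\n-----END CERTIFICATE-----\n"
    print(count_pem_certificates(fake * 3))
    print(bundle_within_limit(fake * 101))
```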

If the certificate is passphrase protected, you need to include the passphrase.

You must provide a certificate ID (<cert-id>) to uniquely identify the certificate you import.

The following example imports a CA certificate bundle with a public key and calls the certificate tls-cert-1. The certificate is passphrase protected with mypassphrase. The public key is a Base64 ASCII encoded PEM string.

  • You must enclose the public key in the NVUE command with three double quotes ("""<public-key>""").
  • With the REST API, you must enclose the public key with one double quote ("<public-key>").

cumulus@switch:~$ nv action import system security ca-certificate tls-cert-1 passphrase mypassphrase data """<public-key>""" 

The following example imports an entity certificate and calls the certificate tls-cert-1. The certificate is passphrase protected with mypassphrase.

A certificate bundle must be in .PFX or .P12 format.

cumulus@switch:~$ nv action import system security certificate tls-cert-1 passphrase mypassphrase uri-bundle scp://user@pass:1.2.3.4:/opt/certs/cert.p12 

The following example imports an entity certificate with the public key URI scp://user@pass:1.2.3.4 and private key URI scp://user@pass:1.2.3.4, and calls the certificate tls-cert-1. The certificate is not passphrase protected.

A CA certificate must be in .pem, .p7a, or .p7c format.

cumulus@switch:~$ nv action import system security certificate tls-cert-1 uri-public-key scp://user@pass:1.2.3.4 uri-private-key scp://user@pass:1.2.3.4

Set the Certificate to Use

You can configure the NVUE REST API to use a specific certificate.

The following example configures the API to use the certificate or CA bundle named tls-cert-1:

cumulus@switch:~$ nv set system api certificate tls-cert-1
cumulus@switch:~$ nv config apply

The following example configures the API to use the self-signed certificate:

cumulus@switch:~$ nv set system api certificate self-signed
cumulus@switch:~$ nv config apply

To unset the certificate to use with the NVUE REST API:

cumulus@switch:~$ nv unset system api certificate tls-cert-1

To configure a certificate to use for mutual authentication with mTLS:

cumulus@switch:~$ nv set system api mtls ca-certificate tls-cert-1

Delete Certificates

The following command deletes the certificate tls-cert-1:

cumulus@switch:~$ nv action delete system security certificate tls-cert-1 

Show Certificate Information

The following example shows all the entity certificates on the switch:

cumulus@switch:~$ nv show system security certificate

The following example shows the applications that are using a specific entity certificate.

cumulus@switch:~$ nv show system security certificate tls-cert-1 installed

The following example shows detailed information about the CA certificate tls-cert-1:

cumulus@switch:~$ nv show system security ca-certificate tls-cert-1 dump

API-only User

To create an API-only user without SSH permissions, use Linux group permissions. You can create the API-only user in the ZTP script.

# Create the dedicated automation user 
adduser --disabled-password --gecos "Automation User,,,," --shell /usr/bin/nologin automation

# Set the password
echo 'automation:password!' | chpasswd

# Add the user to nvapply group to make NVUE config changes
adduser automation nvapply

Control Plane ACLs

You can secure the API by configuring:

This example shows how to create ACLs to allow users from the management subnet and the local switch to communicate with the switch using REST APIs, and restrict all other access.

cumulus@switch:~$ nv set acl API-PROTECT type ipv4 
cumulus@switch:~$ nv set acl API-PROTECT rule 10 action permit
cumulus@switch:~$ nv set acl API-PROTECT rule 10 match ip .protocol tcp .dest-port 8765 .source-ip 192.168.200.0/24
cumulus@switch:~$ nv set acl API-PROTECT rule 10 remark "Allow the Management Subnet to talk to API"

cumulus@switch:~$ nv set acl API-PROTECT rule 20 action permit
cumulus@switch:~$ nv set acl API-PROTECT rule 20 match ip .protocol tcp .dest-port 8765 .source-ip 127.0.0.1
cumulus@switch:~$ nv set acl API-PROTECT rule 20 remark "Allow the local switch to talk to the API"

cumulus@switch:~$ nv set acl API-PROTECT rule 30 action deny
cumulus@switch:~$ nv set acl API-PROTECT rule 30 match ip .protocol tcp .dest-port 8765
cumulus@switch:~$ nv set acl API-PROTECT rule 30 remark "Block everyone else from talking to the API"

cumulus@switch:~$ nv set system control-plane acl API-PROTECT inbound
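
To reason about which clients the API-PROTECT rules above admit, you can model them offline. The following is a minimal sketch that assumes rules are evaluated in ascending rule number and that the first matching rule decides the action; that ordering is an assumption of this example, and the RULES table and evaluate function are hypothetical.

```python
import ipaddress

API_PORT = 8765

# (rule number, action, source network); None means any source address.
RULES = [
    (10, "permit", "192.168.200.0/24"),
    (20, "permit", "127.0.0.1/32"),
    (30, "deny", None),
]


def evaluate(source_ip, dest_port, protocol="tcp"):
    """Return the action of the first matching rule, or 'no-match'."""
    # Every rule in the example matches TCP traffic to the API port only.
    if protocol != "tcp" or dest_port != API_PORT:
        return "no-match"
    address = ipaddress.ip_address(source_ip)
    # Assumption: lowest rule number first, first match wins.
    for _number, action, source in sorted(RULES, key=lambda rule: rule[0]):
        if source is None or address in ipaddress.ip_network(source):
            return action
    return "no-match"


if __name__ == "__main__":
    for ip in ("192.168.200.15", "127.0.0.1", "10.1.1.1"):
        print(ip, evaluate(ip, API_PORT))
```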

Supported Objects

The NVUE object model supports most features on the Cumulus Linux switch. The following list shows the supported objects. The NVUE API supports more objects within each of these objects. To see a full listing of the supported API endpoints, refer to NVUE OpenAPI Specification for Cumulus Linux.

| High-level Object | Description |
| --- | --- |
| acl | Access control lists. |
| bridge | Bridge domain configuration. |
| evpn | EVPN configuration. |
| interface | Interface configuration. |
| mlag | MLAG configuration. |
| nve | Network virtualization configuration, such as VXLAN-specific MLAG configuration and VXLAN flooding. |
| platform | Platform configuration, such as hardware and software components. |
| qos | QoS RoCE configuration. |
| router | Router configuration, such as router policies, global BGP and OSPF configuration, PBR, PIM, IGMP, VRR, and VRRP configuration. |
| service | DHCP relays and server, NTP, PTP, LLDP, and syslog configuration. |
| system | Global system settings, such as the reserved routing table range for PBR and the reserved VLAN range for layer 3 VNIs, system login messages and switch reboot history. |
| vrf | VRF configuration. |

Use the API

The NVUE CLI and the REST API are equivalent in functionality; you can run all management operations from the REST API or from the CLI. The NVUE object model drives both the REST API and the CLI management operations. All operations are consistent; for example, the CLI nv show commands reflect any PATCH operation (create and update) you run through the REST API.

NVUE follows a declarative model, removing context-specific commands and settings. The structure of NVUE is like a big tree that represents the entire state of a Cumulus Linux instance. At the base of the tree are high level branches representing objects, such as router and interface. Under each of these branches are more branches. As you navigate through the tree, you gain a more specific context. At the leaves of the tree are actual attributes, represented as key-value pairs. The path through the tree is similar to a filesystem path.
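
The tree and path analogy above can be shown with a nested dictionary. The following is a minimal sketch, not NVUE code: get_path is a hypothetical helper, the sample tree is a small excerpt in the shape the REST API returns for an interface, and path segments that contain a slash are assumed to be percent-encoded.

```python
from urllib.parse import unquote

TREE = {
    "interface": {
        "eth0": {
            "ip": {"address": {"192.168.200.12/24": {}}},
            "link": {"mtu": 1500, "state": {"up": {}}},
        }
    }
}


def get_path(tree, path):
    """Follow a filesystem-like path down a nested dictionary."""
    node = tree
    for part in path.strip("/").split("/"):
        if part:
            # Keys such as 192.168.200.12/24 arrive as 192.168.200.12%2F24.
            node = node[unquote(part)]
    return node


if __name__ == "__main__":
    print(get_path(TREE, "/interface/eth0/link/mtu"))
    print(get_path(TREE, "/interface/eth0/ip/address/192.168.200.12%2F24"))
```

Branches return dictionaries and leaves return attribute values, which mirrors the way deeper paths give a more specific context.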

Cumulus Linux enables the NVUE REST API by default. To disable the NVUE REST API, run the nv set system api state disabled command.

To use the NVUE REST API in Cumulus Linux 5.6 and later, you must change the password for the cumulus user; otherwise you see 403 responses when you run commands.

API Port and Listening Address

This section shows how to:

The following example sets the port to 8888:

cumulus@switch:~$ nv set system api port 8888
cumulus@switch:~$ nv config apply

You can listen on multiple interfaces by specifying different listening addresses:

cumulus@switch:~$ nv set system api listening-address 10.10.10.1
cumulus@switch:~$ nv set system api listening-address 10.10.20.1
cumulus@switch:~$ nv config apply

The following example configures the listening address on eth0, which has IP address 172.0.24.0 and uses the management VRF by default:

cumulus@switch:~$ nv set system api listening-address 172.0.24.0
cumulus@switch:~$ nv config apply

The following example configures VRF BLUE on swp1, which has IP address 10.10.10.1, then sets the API listening address to the IP address for swp1 (configured for VRF BLUE).

cumulus@switch:~$ nv set interface swp1 ip address 10.10.10.1/24
cumulus@switch:~$ nv set interface swp1 ip vrf BLUE
cumulus@switch:~$ nv config apply

cumulus@switch:~$ nv set system api listening-address 10.10.10.1
cumulus@switch:~$ nv config apply

The following example sets the port to 8888:

cumulus@switch:~$ curl -u 'cumulus:cumulus' -k --request PATCH https://localhost:8765/nvue_v1/system/api?rev=2 -H 'Content-Type:application/json' -d '{"port": 8888 }'

You can listen on multiple interfaces by specifying different listening addresses. The following example sets localhost, interface address 10.10.10.1, and 10.10.20.1 as listen-addresses.

cumulus@switch:~$ curl -u 'cumulus:cumulus' -k --request PATCH https://localhost:8765/nvue_v1/system/api/listening-address?rev=2 -H 'Content-Type:application/json' -d '{ "localhost": {}, "10.10.10.1": {}, "10.10.20.1": {}}'

The following example configures the listening address on eth0, which has IP address 172.0.24.0 and uses the management VRF by default:

cumulus@switch:~$ curl -u 'cumulus:cumulus' -k --request PATCH https://localhost:8765/nvue_v1/system/api/listening-address?rev=2 -H 'Content-Type:application/json' -d '{"172.0.24.0": {}}'
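
The listening-address payloads above are JSON objects in which each address is a key with an empty object as its value. The following is a minimal sketch that builds such a payload from a list; listening_address_payload is a hypothetical helper and performs no validation of the addresses.

```python
import json


def listening_address_payload(addresses):
    """Return the JSON body for a PATCH to system/api/listening-address."""
    return json.dumps({address: {} for address in addresses})


if __name__ == "__main__":
    print(listening_address_payload(["localhost", "10.10.10.1", "10.10.20.1"]))
```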

Show NVUE REST API Information

To show REST API port configuration, state (enabled or disabled), certificate, listening address, and connection information:

Run the nv show system api command:

cumulus@switch:~$ nv show system api
                  operational     applied
--------------    -----------     -------
port                 8888         8888     
state                enabled      enabled
certificate          self-signed  self-signed  
[listening-address]  localhost    localhost
connections
  accepted        31
  active          1
  handled         33
  reading         0
  requests        28
  waiting         0
  writing         1

To show connection information only, run the nv show system api connections command:

cumulus@switch:~$ nv show system api connections
          operational  applied
--------  -----------  -------
accepted  31                  
active    1                   
handled   33                  
reading   0                   
requests  28                   
waiting   0                   
writing   1     

To show the configured listening address, run the nv show system api listening-address command:

cumulus@switch:~$ nv show system api listening-address
---------
localhost

To show all the certificates installed on the switch, run the nv show system security certificate command. To show information about a specific certificate, such as the serial number and how long the certificate is valid, run the nv show system security certificate <certificate> command:

cumulus@switch:~$ nv show system security certificate tls-cert-1 
               operational                applied  pending 
-------------  -------------------------  -------  ------- 
installed      
 app           TLS 
serial-number  67:03:3B:B4:6E:35:D3 
valid-from     2023-02-14T00:35:18+00:00 
valid-to       2033-02-11T00:35:18+00:00 

cumulus@switch:~$ curl -u 'cumulus:cumulus' -k --request GET https://localhost:8765/nvue_v1/system/api?rev=2 -H "accept: application/json"
{
  "certificate": "self-signed",
  "listening-address": {
    "10.10.10.1": {},
    "10.10.20.1": {},
    "172.0.24.0": {},
    "localhost": {}
  },
  "port": 8888,
  "state": "enabled"
}

To show the configured listening address:

cumulus@switch:~$ curl -u 'cumulus:cumulus' -k --request GET https://localhost:8765/nvue_v1/system/api/listening-address?rev=2 -H "accept: application/json"
{
  "10.10.10.1": {},
  "10.10.20.1": {}
}

To show the certificates on the switch:

cumulus@switch:~$ curl -u 'cumulus:cumulus' -k --request GET https://localhost:8765/nvue_v1/system/api/certificate?rev=2 -H "accept: application/json"
{
  "tls-cert-1": {},
  "tls-cert-2": {}
}

To show information about a specific certificate, such as the serial number and how long the certificate is valid:

cumulus@switch:~$ curl -u 'cumulus:cumulus' -k --request GET https://localhost:8765/nvue_v1/system/api/certificate/tls-cert-1?rev=2 -H "accept: application/json"
{
  "serial-number": "67:03:3B:B4:6E:35:D3",
  "valid-from": "2023-02-14T00:35:18+00:00",
  "valid-to": "2033-02-11T00:35:18+00:00"
}
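
The valid-from and valid-to values shown above are ISO 8601 timestamps, so a script can check whether a certificate is about to expire. The following is a minimal sketch that works on the decoded JSON response; the function names are hypothetical.

```python
from datetime import datetime, timezone


def is_currently_valid(cert_info, now=None):
    """True if `now` falls between valid-from and valid-to."""
    now = now or datetime.now(timezone.utc)
    valid_from = datetime.fromisoformat(cert_info["valid-from"])
    valid_to = datetime.fromisoformat(cert_info["valid-to"])
    return valid_from <= now <= valid_to


def days_until_expiry(cert_info, now=None):
    """Whole days left before valid-to (negative if already expired)."""
    now = now or datetime.now(timezone.utc)
    return (datetime.fromisoformat(cert_info["valid-to"]) - now).days


if __name__ == "__main__":
    cert = {
        "serial-number": "67:03:3B:B4:6E:35:D3",
        "valid-from": "2023-02-14T00:35:18+00:00",
        "valid-to": "2033-02-11T00:35:18+00:00",
    }
    print(is_currently_valid(cert), days_until_expiry(cert))
```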

Run cURL Commands

You can run the cURL commands from the command line. Use the username and password for the switch. For example:

cumulus@switch:~$ curl  -u 'cumulus:cumulus' --insecure https://127.0.0.1:8765/nvue_v1/interface
{
  "eth0": {
    "ip": {
      "address": {
        "192.168.200.12/24": {}
      }
    },
    "link": {
      "mtu": 1500,
      "state": {
        "up": {}
      },
      "stats": {
        "carrier-transitions": 2,
        "in-bytes": 184151,
        "in-drops": 0,
        "in-errors": 0,
        "in-pkts": 2371,
        "out-bytes": 117506,
        "out-drops": 0,
        "out-errors": 0,
        "out-pkts": 762
      }
...

API Use Cases

The following examples show the primary API use cases.

View a Configuration

Use the following example to obtain the current applied configuration on the switch. Change the rev argument to view any revision. Possible options for the rev argument include startup, pending, operational, and applied.

cumulus@switch:~$ curl -k -u cumulus:cumulus -X GET "https://127.0.0.1:8765/nvue_v1/?rev=applied&filled=false"
{
  "acl": {}, 
  "bridge": { 
    "domain": { 
      "br_default": { 
        "encap": "802.1Q", 
        "mac-address": "auto", 
        "multicast": { 
          "snooping": { 
            "enable": "off" 
          } 
        }, 
        "stp": { 
          "priority": 32768, 
          "state": { 
            "up": {} 
          } 
        }, 
        "type": "vlan-aware", 
        "untagged": 1, 
        "vlan": { 
          "10": { 
            "multicast": { 
...

#!/usr/bin/env python3

import requests
from requests.auth import HTTPBasicAuth
import json
import time

auth = HTTPBasicAuth(username="cumulus", password="password")
nvue_end_point = "https://127.0.0.1:8765/nvue_v1"
mime_header = {"Content-Type": "application/json"}

if __name__ == "__main__":
    r = requests.get(url=nvue_end_point + "/?rev=applied&filled=false",
                     auth=auth,
                     verify=False)
    print("=======Current Applied Revision=======")
    print(json.dumps(r.json(), indent=2))

cumulus@switch:~$ nv config show
- set: 
    bridge: 
      domain: 
        br_default: 
          type: vlan-aware 
          vlan: 
            '10': 
              vni: 
                '10': {} 
            '20': 
              vni: 
                '20': {} 
            '30': 
              vni: 
                '30': {} 
    evpn: 
      enable: on 
    mlag: 
      backup: 
        10.10.10.2: {} 
      enable: on 
      init-delay: 10 
      mac-address: 44:38:39:BE:EF:AA 
... 
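
The rev argument described above is an ordinary query parameter, so the request URL can be assembled programmatically. The following is a minimal sketch using the Python standard library; config_url is a hypothetical helper, the base URL is the one used in the examples above, and only filled=false appears in this guide (the true value is an assumption).

```python
from urllib.parse import urlencode

BASE = "https://127.0.0.1:8765/nvue_v1"

# Named revisions listed in this section; a numeric revision ID also works.
NAMED_REVISIONS = ("startup", "pending", "operational", "applied")


def config_url(path="/", rev="applied", filled=False, base=BASE):
    """Build a GET URL for the configuration at `path` in revision `rev`."""
    query = urlencode({"rev": rev, "filled": "true" if filled else "false"})
    return f"{base}{path}?{query}"


if __name__ == "__main__":
    for rev in NAMED_REVISIONS:
        print(config_url(rev=rev))
```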

Replace an Entire Configuration

To replace an entire configuration:

  1. Create a new revision ID with a POST:

    cumulus@switch:~$ curl -u 'cumulus:cumulus' --insecure -X POST https://127.0.0.1:8765/nvue_v1/revision
    {
     "1": {
       "state": "pending",
       "transition": {
         "issue": {},
         "progress": ""
       }
     }
    }
    
  2. Record the revision ID. In the above example, the revision ID is "1".

  3. Send a DELETE request to the root to delete the whole configuration.

    cumulus@switch:~$ curl -u 'cumulus:cumulus' -d '{}' -H 'Content-Type: application/json' -k -X DELETE https://127.0.0.1:8765/nvue_v1/?rev=1
    {}
    
  4. Do a root patch to update the switch with the new configuration.

    cumulus@switch:~$ curl -u 'cumulus:cumulus' -d '{
       "system": {
         "hostname": "switch01"
       },
       "bridge": {
         "domain": {
           "br_default": {
             "type": "vlan-aware",
             "vlan": {
               "10": {
                 "vni": {
                   "10": {}
                   }
                 },
               "20": {
                 "vni": {
                   "20": {}
                 }
               },
               "30": {
                 "vni": {
                   "30": {}
                 }
               }
             }
           }
         }
       },
       "interface": {
         "eth0": {
           "ip": {
             "address": {
               "192.168.200.6/24": {}
             },
             "vrf": "mgmt"
           },
           "type": "eth"
         },
         "lo": {
           "ip": {
             "address": {
               "10.10.10.1/32": {}
             }
           },
           "type": "loopback"
         },
         "swp51": {
           "link": {
             "state": {
               "up": {}
             }
           },
           "type": "swp"
         },
         "swp52": {
           "link": {
             "state": {
               "up": {}
             }
           },
           "type": "swp"
         },
         "swp53": {
           "link": {
             "state": {
               "up": {}
             }
           },
           "type": "swp"
         },
         "swp54": {
           "link": {
             "state": {
               "up": {}
             }
           },
           "type": "swp"
         }
       },
       "mlag": {
         "backup": {
           "10.10.10.2": {}
         },
         "enable": "on",
         "init-delay": 10,
         "mac-address": "44:38:39:BE:EF:AA",
         "peer-ip": "linklocal",
         "priority": 1000
       },
       "router": {
         "bgp": {
           "enable": "on"
         },
         "vrr": {
           "enable": "on"
         }
       },
       "service": {},
       "vrf": {
         "mgmt": {
           "router": {
             "static": {
               "0.0.0.0/0": {
                 "address-family": "ipv4-unicast",
                 "via": {
                   "192.168.200.1": {
                     "type": "ipv4-address"
                   }
                 }
               }
             }
           }
         }
       }
     }' -H 'Content-Type: application/json' -k -X PATCH https://127.0.0.1:8765/nvue_v1/?rev=1
    {}
    
  5. Apply the changes with a PATCH to the revision changeset.

    cumulus@switch:~$ curl -u 'cumulus:cumulus' -H 'Content-Type:application/json' -d '{"state": "apply", "auto-prompt": {"ays": "ays_yes"}}' -k -X PATCH https://127.0.0.1:8765/nvue_v1/revision/1
    {
      "state": "apply",
      "transition": {
        "issue": {},
        "progress": ""
      }
    }
    
    cumulus@switch:~$ nv config apply
    
  6. Review the status of the apply and the configuration:

    cumulus@switch:~$ curl -u 'cumulus:cumulus' -k -X GET https://127.0.0.1:8765/nvue_v1/revision/1
    {
      "state": "applied",
      "transition": {
        "issue": {},
        "progress": ""
      }
    }
    
    cumulus@switch:~$ curl -u 'cumulus:cumulus' --insecure https://127.0.0.1:8765/nvue_v1/system
    {
     "build": "Cumulus Linux 5.4.0",
     "hostname": "switch01",
     "timezone": "Etc/UTC",
     "uptime": 763
    }
    cumulus@switch:~$ curl -u 'cumulus:cumulus' --insecure https://127.0.0.1:8765/nvue_v1/bridge/domain/br_default/vlan/10
    {
     "multicast": {
       "snooping": {
         "querier": {
           "source-ip": "0.0.0.0"
         }
       }
     },
     "ptp": {
       "enable": "off"
     },
     "vni": {
       "10": {
         "flooding": {
           "enable": "auto"
         },
         "mac-learning": "off"
       }
     }
    }
    
    #!/usr/bin/env python3
    
    import requests
    from requests.auth import HTTPBasicAuth
    import json
    import time
    
    auth = HTTPBasicAuth(username="cumulus", password="password")
    nvue_end_point = "https://127.0.0.1:8765/nvue_v1"
    mime_header = {"Content-Type": "application/json"}
    
    DUMMY_SLEEP = 5  # In seconds
    POLL_APPLIED = 1  # in seconds
    RETRIES = 10
    
    def print_request(r: requests.Request):
        print("=======Request=======")
        print("URL:", r.url)
        print("Headers:", r.headers)
        print("Body:", r.body)
    
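    def print_response(r: requests.Response):
        print("=======Response=======")
        print("Headers:", r.headers)
        # A DELETE returns 204 (No Content) in Cumulus Linux 5.10 and later,
        # so only decode the body when there is one.
        body = r.json() if r.content else {}
        print("Body:", json.dumps(body, indent=2))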
    
    def create_nvue_changest():
        r = requests.post(url=nvue_end_point + "/revision",
                          auth=auth,
                          verify=False)
        print_request(r.request)
        print_response(r)
        response = r.json()
        changeset = response.popitem()[0]
        return changeset
    
    def apply_nvue_changeset(changeset):
        apply_payload = {"state": "apply", "auto-prompt": {"ays": "ays_yes"}}
        url = nvue_end_point + "/revision/" + requests.utils.quote(changeset,
                                                                   safe="")
        r = requests.patch(url=url,
                           auth=auth,
                           verify=False,
                           data=json.dumps(apply_payload),
                           headers=mime_header)
        print_request(r.request)
        print_response(r)
    
    def is_config_applied(changeset) -> bool:
        # Check if the configuration was indeed applied
        global RETRIES
        global POLL_APPLIED
        retries = RETRIES
        while retries > 0:
            r = requests.get(url=nvue_end_point + "/revision/" + requests.utils.quote(changeset, safe=""),
                             auth=auth,
                             verify=False)
            response = r.json()
            print(response)
            if response["state"] == "applied":
                return True
            retries -= 1
            time.sleep(POLL_APPLIED)
    
        return False
    
    def apply_new_config(path,payload):
        # Create a new revision ID
        changeset = create_nvue_changest()
        print("Using NVUE Changeset: '{}'".format(changeset))
    
        # Delete existing configuration
        query_string = {"rev": changeset}
        r = requests.delete(url=nvue_end_point + path,
                           auth=auth,
                           verify=False,
                           params=query_string,
                           headers=mime_header)
        print_request(r.request)
        print_response(r)
    
        # Patch the new configuration
        
        query_string = {"rev": changeset}
        r = requests.patch(url=nvue_end_point + path,
                           auth=auth,
                           verify=False,
                           data=json.dumps(payload),
                           params=query_string,
                           headers=mime_header)
        print_request(r.request)
        print_response(r)
    
        # Apply the changes to the new revision changeset
        apply_nvue_changeset(changeset)
    
        # Check if the changeset was applied
        is_config_applied(changeset)
    
    def nvue_get(path):
        r = requests.get(url=nvue_end_point + path,
                         auth=auth,
                         verify=False)
        print_request(r.request)
        print_response(r)
    
    if __name__ == "__main__":
        payload = {
          "system": {
            "hostname": "switch01"
          },
          "bridge": {
            "domain": {
              "br_default": {
                "type": "vlan-aware",
                "vlan": {
                  "10": {
                    "vni": {
                      "10": {}
                      }
                    },
                  "20": {
                    "vni": {
                      "20": {}
                    }
                  },
                  "30": {
                    "vni": {
                      "30": {}
                    }
                  }
                }
              }
            }
          },
          "interface": {
            "eth0": {
              "ip": {
                "address": {
                  "192.168.200.6/24": {}
                },
                "vrf": "mgmt"
              },
              "type": "eth"
            },
            "lo": {
              "ip": {
                "address": {
                  "10.10.10.1/32": {}
                }
              },
              "type": "loopback"
            },
            "swp51": {
              "link": {
                "state": {
                  "up": {}
                }
              },
              "type": "swp"
            },
            "swp52": {
              "link": {
                "state": {
                  "up": {}
                }
              },
              "type": "swp"
            },
            "swp53": {
              "link": {
                "state": {
                  "up": {}
                }
              },
              "type": "swp"
            },
            "swp54": {
              "link": {
                "state": {
                  "up": {}
                }
              },
              "type": "swp"
            }
          },
          "mlag": {
            "backup": {
              "10.10.10.2": {}
            },
            "enable": "on",
            "init-delay": 10,
            "mac-address": "44:38:39:BE:EF:AA",
            "peer-ip": "linklocal",
            "priority": 1000
          },
          "router": {
            "bgp": {
              "enable": "on"
            },
            "vrr": {
              "enable": "on"
            }
          },
          "service": {},
          "vrf": {
            "mgmt": {
              "router": {
                "static": {
                  "0.0.0.0/0": {
                    "address-family": "ipv4-unicast",
                    "via": {
                      "192.168.200.1": {
                        "type": "ipv4-address"
                      }
                    }
                  }
                }
              }
            }
          }
        }
        apply_new_config("/",payload)
        time.sleep(DUMMY_SLEEP)
        print("=====Verifying some of the configurations=====")
        nvue_get("/system")
        nvue_get("/bridge/domain/br_default/vlan/10")
    
    cumulus@switch:~$ nv show system
                operational          applied
    --------  -------------------  -------
    hostname  switch01             cumulus
    build     Cumulus Linux 5.4.0
    uptime    0:12:59
    timezone  Etc/UTC
    
    cumulus@switch:~$ nv show bridge domain br_default vlan 10
    
                     operational  applied  pending  description
    ---------------  -----------  -------  -------  ------------------------------------------------------
    [vni]            10           10       10       L2 VNI
    multicast
      snooping
        querier
          source-ip  0.0.0.0      0.0.0.0  0.0.0.0  Source IP to use when sending IGMP/MLD queries.
    ptp
      enable         off          off      off      Turn the feature 'on' or 'off'.  The default is 'off'.
    

Make a Configuration Change

To make a configuration change:

  1. Create a new revision ID with a POST:

    cumulus@switch:~$ curl -u 'cumulus:cumulus' --insecure -X POST https://127.0.0.1:8765/nvue_v1/revision
    {
      "2": {
        "state": "pending",
        "transition": {
          "issue": {},
          "progress": ""
        }
      }
    }
    
  2. Record the revision ID. In the above example, the revision ID is "2".

  3. Make the change with a PATCH and link it to the revision ID:

    cumulus@switch:~$ curl -u 'cumulus:cumulus' -d '{"99.99.99.99/32": {}}' -H 'Content-Type: application/json' -k -X PATCH https://127.0.0.1:8765/nvue_v1/interface/lo/ip/address?rev=2
    {
      "99.99.99.99/32": {}
    }
    
    cumulus@switch:~$ nv set interface lo ip address 99.99.99.99/32
    
  4. Apply the changes with a PATCH to the revision changeset:

    cumulus@switch:~$ curl -u 'cumulus:cumulus' -d '{"state": "apply", "auto-prompt": {"ays": "ays_yes"}}' -H 'Content-Type:application/json' -k -X PATCH https://127.0.0.1:8765/nvue_v1/revision/2
    {
      "state": "apply",
      "transition": {
        "issue": {},
        "progress": ""
      }
    }
    
    cumulus@switch:~$ nv config apply
    
  5. Review the status of the apply and the configuration:

    cumulus@switch:~$ curl -u 'cumulus:cumulus' -k -X GET https://127.0.0.1:8765/nvue_v1/revision/2
    {
      "state": "applied",
      "transition": {
        "issue": {},
        "progress": ""
      }
    }
    
    cumulus@switch:~$ curl -u 'cumulus:cumulus' --insecure https://127.0.0.1:8765/nvue_v1/interface/lo/ip/address
    {
      "127.0.0.1/8": {},
      "99.99.99.99/32": {},
      "::1/128": {}
    }
    
    #!/usr/bin/env python3
    
    import requests
    from requests.auth import HTTPBasicAuth
    import json
    import time
    
    auth = HTTPBasicAuth(username="cumulus", password="password")
    nvue_end_point = "https://127.0.0.1:8765/nvue_v1"
    mime_header = {"Content-Type": "application/json"}
    
    DUMMY_SLEEP = 5  # In seconds
    POLL_APPLIED = 1  # in seconds
    RETRIES = 10
    
    def print_request(r: requests.Request):
        print("=======Request=======")
        print("URL:", r.url)
        print("Headers:", r.headers)
        print("Body:", r.body)
    
    def print_response(r: requests.Response):
        print("=======Response=======")
        print("Headers:", r.headers)
        print("Body:", json.dumps(r.json(), indent=2))
    
    def create_nvue_changest():
        r = requests.post(url=nvue_end_point + "/revision",
                          auth=auth,
                          verify=False)
        print_request(r.request)
        print_response(r)
        response = r.json()
        changeset = response.popitem()[0]
        return changeset
    
    def apply_nvue_changeset(changeset):
        apply_payload = {"state": "apply", "auto-prompt": {"ays": "ays_yes"}}
        url = nvue_end_point + "/revision/" + requests.utils.quote(changeset,
                                                                   safe="")
        r = requests.patch(url=url,
                           auth=auth,
                           verify=False,
                           data=json.dumps(apply_payload),
                           headers=mime_header)
        print_request(r.request)
        print_response(r)
    
    def is_config_applied(changeset) -> bool:
        # Check if the configuration was indeed applied
        global RETRIES
        global POLL_APPLIED
        retries = RETRIES
        while retries > 0:
            r = requests.get(url=nvue_end_point + "/revision/" + requests.utils.quote(changeset, safe=""),
                             auth=auth,
                             verify=False)
            response = r.json()
            print(response)
            if response["state"] == "applied":
                return True
            retries -= 1
            time.sleep(POLL_APPLIED)
    
        return False
    
    def apply_new_config(path,payload):
        # Create a new revision ID
        changeset = create_nvue_changest()
        print("Using NVUE Changeset: '{}'".format(changeset))
    
        # Delete existing configuration
        query_string = {"rev": changeset}
        r = requests.delete(url=nvue_end_point + path,
                           auth=auth,
                           verify=False,
                           params=query_string,
                           headers=mime_header)
        print_request(r.request)
        print_response(r)
    
        # Patch the new configuration
        
        query_string = {"rev": changeset}
        r = requests.patch(url=nvue_end_point + path,
                           auth=auth,
                           verify=False,
                           data=json.dumps(payload),
                           params=query_string,
                           headers=mime_header)
        print_request(r.request)
        print_response(r)
    
        # Apply the changes to the new revision changeset
        apply_nvue_changeset(changeset)
    
        # Check if the changeset was applied
        is_config_applied(changeset)
    
    def nvue_get(path):
        r = requests.get(url=nvue_end_point + path,
                         auth=auth,
                         verify=False)
        print_request(r.request)
        print_response(r)
    
    if __name__ == "__main__":
        payload = {
            "99.99.99.99/32": {}
        }
        apply_new_config("/interface/lo/ip/address",payload)
        time.sleep(DUMMY_SLEEP)
        nvue_get("/interface/lo/ip/address")
    
    cumulus@switch:~$ nv show interface lo ip address
       
    -------------
    99.99.99.99/32
    127.0.0.1/8
    ::1/128
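
The polling loop in the script above keeps retrying until the state is applied, even when the revision has already failed. The following is a minimal sketch of a variant that stops early. It assumes the revision body carries a top-level state key, treats invalid and ays_fail (the failure states shown in the troubleshooting examples of this guide) as terminal, and takes a hypothetical fetch_state callable so that it can run without a switch.

```python
import time
from typing import Callable, Dict

# "applied" is the success state used throughout this guide. "invalid" and
# "ays_fail" are the failure states shown in its troubleshooting examples.
# Treating these as the complete set of terminal states is an assumption.
SUCCESS_STATE = "applied"
FAILURE_STATES = {"invalid", "ays_fail"}


def wait_for_revision(fetch_state: Callable[[], Dict],
                      retries: int = 10,
                      interval: float = 1.0) -> str:
    """Poll a revision until it reaches a terminal state or retries run out.

    fetch_state is a hypothetical callable that returns the decoded JSON
    body of GET /nvue_v1/revision/<id>.
    Returns the terminal state, or "timeout" if none was reached.
    """
    for _ in range(retries):
        state = fetch_state().get("state", "")
        if state == SUCCESS_STATE or state in FAILURE_STATES:
            return state
        time.sleep(interval)
    return "timeout"
```

With the requests library, fetch_state could be a lambda that wraps the same GET call that is_config_applied makes and returns r.json().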
    

View Differences Between Configurations

To view differences between configurations, run the API GET /nvue_v1/<resource>?rev=<rev-A>&diff=<rev-B> method with the configurations you want to diff. This method is equivalent to the NVUE nv config diff <rev-A> <rev-B> command.

To see the difference between the startup revision and the applied revision:

cumulus@switch:~$ curl -u 'cumulus:cumulus' --insecure -X GET 'https://127.0.0.1:8765/nvue_v1/interface?rev=startup&diff=applied'

To see the difference between revision 1 and revision 2:

cumulus@switch:~$ curl -u 'cumulus:cumulus' --insecure -X GET 'https://127.0.0.1:8765/nvue_v1/<resource>?rev=1&diff=2'

You can change the order of the revisions; for example, GET /nvue_v1/<resource>?rev=2&diff=1.
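
When you build the diff request in a script, letting the standard library encode the query string avoids shell quoting problems with the & character. The following is a minimal sketch; the diff_url helper name and the NVUE_ENDPOINT constant are illustrative and not part of NVUE.

```python
from urllib.parse import urlencode

# Same local endpoint that the other examples in this guide use.
NVUE_ENDPOINT = "https://127.0.0.1:8765/nvue_v1"


def diff_url(resource: str, rev_a: str, rev_b: str) -> str:
    """Return the URL that diffs rev_a against rev_b for one resource."""
    query = urlencode({"rev": rev_a, "diff": rev_b})
    return f"{NVUE_ENDPOINT}/{resource.strip('/')}?{query}"


print(diff_url("interface", "startup", "applied"))
# https://127.0.0.1:8765/nvue_v1/interface?rev=startup&diff=applied
```

Swapping the rev_a and rev_b arguments reverses the order of the comparison.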

Troubleshoot Configuration Changes

When a configuration change fails, you see an error in the change request.

Configuration Fails Because of a Dependency

If you stage a configuration but it fails because of a dependency, the failure shows the reason. In the following example, the change fails because the BGP router ID is not set.

cumulus@switch:~$ curl -u 'cumulus:cumulus' --insecure https://127.0.0.1:8765/nvue_v1/revision/6
{
  "state": "invalid",
  "transition": {
    "issue": {
      "0": {
        "code": "config_invalid",
        "data": {
          "location": "router.bgp.enable",
          "reason": "BGP requires router-id to be set globally or in the VRF.\n"
        },
        "message": "Config invalid at router.bgp.enable: BGP requires router-id to be set globally or in the VRF.\n",
        "severity": "error"
      }
    },
    "progress": "Invalid config"
  }
}

The staged configuration is missing router-id.

cumulus@switch:~$ curl -u 'cumulus:cumulus' --insecure https://127.0.0.1:8765/nvue_v1/vrf/default/router/bgp?rev=6
{
  "autonomous-system": 65999,
  "enable": "on"
}
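
A script can surface the same information by reading the transition.issue object of the revision. The following is a minimal sketch that assumes the response has the shape shown in the example above; the revision_issues helper is illustrative, and a real response might carry fields that this sketch ignores.

```python
from typing import Dict, List


def revision_issues(revision: Dict) -> List[str]:
    """Return one 'severity at location: reason' string per reported issue."""
    issues = revision.get("transition", {}).get("issue", {})
    lines = []
    for key in sorted(issues):
        issue = issues[key]
        data = issue.get("data", {})
        location = data.get("location", "unknown")
        # Fall back to the full message when no reason is present.
        reason = data.get("reason", issue.get("message", "")).strip()
        lines.append(f"{issue.get('severity', 'error')} at {location}: {reason}")
    return lines


# Sample body copied from the failed revision shown above.
sample = {
    "state": "invalid",
    "transition": {
        "issue": {
            "0": {
                "code": "config_invalid",
                "data": {
                    "location": "router.bgp.enable",
                    "reason": "BGP requires router-id to be set globally or in the VRF.\n",
                },
                "message": "Config invalid at router.bgp.enable: BGP requires router-id to be set globally or in the VRF.\n",
                "severity": "error",
            }
        },
        "progress": "Invalid config",
    },
}

for line in revision_issues(sample):
    print(line)
```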

Configuration Apply Fails with Warnings

In some cases, such as the first push with NVUE or if you change a file manually instead of using NVUE, you see a warning prompt and the apply fails.

cumulus@switch:~$ curl -u 'cumulus:cumulus' --insecure -X GET https://127.0.0.1:8765/nvue_v1/revision/6
{
  "6": {
    "state": "ays_fail",
    "transition": {
      "issue": {
        "0": {
          "code": "client_timeout",
          "data": {},
          "message": "Timeout while waiting for client response",
          "severity": "error"
        }
      },
      "progress": "Aborted apply after warnings"
    }
  }
}

To resolve this issue, review the reported failures or errors, then inspect the configuration that you are trying to apply. After you resolve the errors, retry the API request. If you prefer to override the warnings and force an apply, add "auto-prompt":{"ays": "ays_yes"} to the apply request.

cumulus@switch:~$ curl -u 'cumulus:cumulus' -d '{"state":"apply","auto-prompt":{"ays": "ays_yes"}}' -H 'Content-Type:application/json' --insecure -X PATCH https://127.0.0.1:8765/nvue_v1/revision/6
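
In a script, it can help to make the forced apply an explicit choice instead of the default. The following is a minimal sketch of a payload builder; the apply_payload name and its force parameter are illustrative only.

```python
import json


def apply_payload(force: bool = False) -> str:
    """Return the JSON body for the PATCH that applies a revision."""
    body = {"state": "apply"}
    if force:
        # Answers the confirmation prompt automatically, as in the
        # curl example above.
        body["auto-prompt"] = {"ays": "ays_yes"}
    return json.dumps(body)


print(apply_payload())
print(apply_payload(force=True))
```

Keeping force off by default means that a warning stops the apply unless the caller opts in.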

Save a Configuration

To save an applied configuration change to the startup configuration file (/etc/nvue.d/startup.yaml) so that the changes persist after a reboot, use a PATCH to the applied revision with the save state.

cumulus@switch:~$ curl -u 'cumulus:cumulus' -k -X PATCH -d '{"state": "save", "auto-prompt": {"ays": "ays_yes"}}' -H 'Content-Type: application/json' https://127.0.0.1:8765/nvue_v1/revision/applied
{
  "state": "save",
  "transition": {
    "issue": {},
    "progress": ""
  }
}

#!/usr/bin/env python3

import requests
from requests.auth import HTTPBasicAuth
import json
import time

auth = HTTPBasicAuth(username="cumulus", password="password")
nvue_end_point = "https://127.0.0.1:8765/nvue_v1"
mime_header = {"Content-Type": "application/json"}

DUMMY_SLEEP = 5  # In seconds
POLL_APPLIED = 1  # in seconds
RETRIES = 10

def print_request(r: requests.Request):
    print("=======Request=======")
    print("URL:", r.url)
    print("Headers:", r.headers)
    print("Body:", r.body)

def print_response(r: requests.Response):
    print("=======Response=======")
    print("Headers:", r.headers)
    print("Body:", json.dumps(r.json(), indent=2))

def save_nvue_changeset():
    apply_payload = {"state": "save", "auto-prompt": {"ays": "ays_yes"}}
    url = nvue_end_point + "/revision/applied"
    r = requests.patch(url=url,
                       auth=auth,
                       verify=False,
                       data=json.dumps(apply_payload),
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

if __name__ == "__main__":
    save_nvue_changeset()

cumulus@switch:~$ nv config save
saved

Unset a Configuration Change

To unset a configuration change, set the key to the null value. For example, to delete vlan100 from a switch, use the following syntax:

cumulus@switch:~$ curl -u 'cumulus:cumulus' -d '{"vlan100":null}' -H 'Content-Type: application/json' --insecure -X PATCH https://127.0.0.1:8765/nvue_v1/interface?rev=4

When you unset a change, you must still use the PATCH action. The null value indicates removal of the entry; in this example, the data sent with the PATCH action is {"vlan100":null}.
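
In Python, None serializes to JSON null, so an unset payload can be built from a list of keys. The following is a minimal sketch; the unset_payload helper name is illustrative.

```python
import json
from typing import Iterable


def unset_payload(keys: Iterable[str]) -> str:
    """Map each key to None, which json.dumps writes as JSON null."""
    return json.dumps({key: None for key in keys})


print(unset_payload(["vlan100"]))
# {"vlan100": null}
```

Send the resulting string as the body of the PATCH request, in the same way as the payloads in the other Python examples.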

Use the API for Active Monitoring

The example below fetches the counters for interface swp1.

cumulus@switch:~$ curl -u 'cumulus:cumulus' -k -X GET https://127.0.0.1:8765/nvue_v1/interface/swp1/link/stats
{
  "carrier-transitions": 6,
  "in-bytes": 293771538,
  "in-drops": 0,
  "in-errors": 0,
  "in-pkts": 2321737,
  "out-bytes": 366068936,
  "out-drops": 0,
  "out-errors": 0,
  "out-pkts": 3536629
}

#!/usr/bin/env python3

import requests
from requests.auth import HTTPBasicAuth
import json
import time

auth = HTTPBasicAuth(username="cumulus", password="password")
nvue_end_point = "https://127.0.0.1:8765/nvue_v1"
mime_header = {"Content-Type": "application/json"}

if __name__ == "__main__":
    r = requests.get(url=nvue_end_point + "/interface/swp1/link/stats",
                     auth=auth,
                     verify=False)
    print("=======Interface swp1 Statistics=======")
    print(json.dumps(r.json(), indent=2))

cumulus@switch:~$ nv show interface swp1 link stats
                     operational  applied  pending  description
-------------------  -----------  -------  -------  ----------------------------------------------------------------------
carrier-transitions  6                              Number of times the interface state has transitioned between up and...
in-bytes             280.15 MB                      total number of bytes received on the interface
in-drops             0                              number of received packets dropped
in-errors            0                              number of received packets with errors
in-pkts              2321659                        total number of packets received on the interface
out-bytes            349.10 MB                      total number of bytes transmitted out of the interface
out-drops            0                              The number of outbound packets that were chosen to be discarded eve...
out-errors           0                              The number of outbound packets that could not be transmitted becaus...
out-pkts             3536508                        total number of packets transmitted out of the interface
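
For active monitoring, a single sample is usually less useful than the change between two samples. The following is a minimal sketch that turns two counter samples into per-second rates. The counter_rates helper and the second sample are hypothetical, and the sketch does not handle counters that reset or wrap between samples.

```python
from typing import Dict


def counter_rates(previous: Dict[str, int], current: Dict[str, int],
                  interval_seconds: float) -> Dict[str, float]:
    """Return per-second rates for counters present in both samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return {
        name: (current[name] - previous[name]) / interval_seconds
        for name in current
        if name in previous
    }


# First sample taken from the response above; the second is made up.
first = {"in-bytes": 293771538, "in-pkts": 2321737}
second = {"in-bytes": 293781538, "in-pkts": 2321837}
print(counter_rates(first, second, interval_seconds=10))
# {'in-bytes': 1000.0, 'in-pkts': 10.0}
```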

Retrieve View Types

NVUE provides views for certain show commands. A view is a subset of information.

To see the views available for a show command, run the command with --view and press TAB:

cumulus@switch:~$ nv show interface --view <<TAB>>
acl-statistics  description     lldp            physical        status          
bond-members    detail          lldp-detail     pluggables      svi             
bonds           dot1x-counters  mac             port-security   synce-counters  
brief           dot1x-summary   mlag-cc         qos-profile     up              
counters        down            neighbor        small           vrf
cumulus@switch:~$ nv show vrf default router rib ipv4 route --view <<TAB>>
brief   detail

To retrieve view types through the REST API, use the curl -u 'cumulus:CumulusLinux!' -k -X GET https://<path>?view=<view> syntax. For example, the equivalent REST API method for the NVUE nv show vrf <vrf-id> router rib ipv4 route --view=brief command is:

cumulus@switch:~$ curl -u 'cumulus:CumulusLinux!' -k -X GET https://127.0.0.1:8765/nvue_v1/vrf/BLUE/router/rib/ipv4/route?view=brief

The equivalent REST API method for the NVUE nv show interface --view=acl-statistics command is:

cumulus@switch:~$ curl -u 'cumulus:CumulusLinux!' -k -X GET https://127.0.0.1:8765/nvue_v1/interface?view=acl-statistics

For a query with a view that does not exist, the API returns a 400 Bad Request error and displays all the defined views for that endpoint.
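
The two examples above suggest a simple pattern: each word after nv show becomes a path segment, and --view=<name> becomes the view query parameter. The following is a minimal sketch of that translation. It assumes the pattern holds for the command you convert, which this guide does not state for every command, and the show_to_url helper is illustrative.

```python
import shlex
from urllib.parse import quote, urlencode

NVUE_ENDPOINT = "https://127.0.0.1:8765/nvue_v1"


def show_to_url(command: str) -> str:
    """Translate an 'nv show ... [--view=NAME]' command into a REST URL."""
    words = shlex.split(command)
    if words[:2] != ["nv", "show"]:
        raise ValueError("expected a command that starts with 'nv show'")
    view = None
    path = []
    for word in words[2:]:
        if word.startswith("--view="):
            view = word.split("=", 1)[1]
        else:
            # Encode each segment so that values containing '/' stay in one segment.
            path.append(quote(word, safe=""))
    url = NVUE_ENDPOINT + "/" + "/".join(path)
    if view:
        url += "?" + urlencode({"view": view})
    return url


print(show_to_url("nv show vrf BLUE router rib ipv4 route --view=brief"))
```

The sketch only recognizes the --view=<name> form; a view given as a separate word after --view is not handled.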

Convert CLI Changes to Use the API

You can take a configuration change from the CLI and use the API to configure the same set of changes.

  1. Make your configuration changes on the system with the NVUE CLI.

    cumulus@switch:~$ nv set system hostname switch01
    cumulus@switch:~$ nv set interface lo ip address 99.99.99.99/32
    cumulus@switch:~$ nv set interface eth0 ip address 192.168.200.6/24
    cumulus@switch:~$ nv set interface bond0 bond member swp1-4
    
  2. View the changes as a JSON blob.

    cumulus@switch:~$ nv config diff -o json
    [
      {
        "set": {
          "interface": {
            "bond0": {
              "bond": {
                "member": {
                  "swp1": {},
                  "swp2": {},
                  "swp3": {},
                  "swp4": {}
                }
              },
              "type": "bond"
            },
            "lo": {
              "ip": {
                "address": {
                  "99.99.99.99/32": {}
                }
              }
            }
          },
          "system": {
            "hostname": "switch01"
          }
        }
      }
    ]
    
  3. Use the contents of the set object in the JSON blob as the payload of a root PATCH request.

    cumulus@switch:~$ curl -u 'cumulus:cumulus' -d '{
          "interface": {
            "bond0": {
              "bond": {
                "member": {
                  "swp1": {},
                  "swp2": {},
                  "swp3": {},
                  "swp4": {}
                }
              },
              "type": "bond"
            },
            "lo": {
              "ip": {
                "address": {
                  "99.99.99.99/32": {}
                }
              }
            }
          },
          "system": {
            "hostname": "switch01"
          }
        }' -H 'Content-Type: application/json' -k -X PATCH https://127.0.0.1:8765/nvue_v1/?rev=3
    
    {
      "bridge": {
        "domain": {
          "br_default": {
            "type": "vlan-aware",
            "vlan": {
              "10": {
                "vni": {
                  "10": {}
                }
              },
              "20": {
                "vni": {
                  "20": {}
                }
              },
              "30": {
                "vni": {
                  "30": {}
                }
              }
            }
          }
        }
      },
      "evpn": {
        "enable": "on"
      },
      "interface": {
        "bond1": {
          "bond": {
            "lacp-bypass": "on",
            "member": {
              "swp1": {}
            },
    ...
    
  4. Apply the changes with a PATCH to the revision changeset.

    cumulus@switch:~$ curl -u 'cumulus:cumulus' -H 'Content-Type:application/json' -k -d '{"state": "apply", "auto-prompt": {"ays": "ays_yes"}}' -X PATCH https://127.0.0.1:8765/nvue_v1/revision/3
    {
      "state": "apply",
      "transition": {
        "issue": {},
        "progress": ""
      }
    }
    
  5. Review the status of the apply and the configuration:

    cumulus@switch:~$ curl -u 'cumulus:cumulus' -k -X GET https://127.0.0.1:8765/nvue_v1/revision/3
    {
      "state": "applied",
      "transition": {
        "issue": {},
        "progress": ""
      }
    }
    
    #!/usr/bin/env python3
    
    import requests
    from requests.auth import HTTPBasicAuth
    import json
    import time
    
    auth = HTTPBasicAuth(username="cumulus", password="password")
    nvue_end_point = "https://127.0.0.1:8765/nvue_v1"
    mime_header = {"Content-Type": "application/json"}
    
    DUMMY_SLEEP = 5  # In seconds
    POLL_APPLIED = 1  # in seconds
    RETRIES = 10
    
    def print_request(r: requests.Request):
        print("=======Request=======")
        print("URL:", r.url)
        print("Headers:", r.headers)
        print("Body:", r.body)
    
    def print_response(r: requests.Response):
        print("=======Response=======")
        print("Headers:", r.headers)
        print("Body:", json.dumps(r.json(), indent=2))
    
    def create_nvue_changest():
        r = requests.post(url=nvue_end_point + "/revision",
                          auth=auth,
                          verify=False)
        print_request(r.request)
        print_response(r)
        response = r.json()
        changeset = response.popitem()[0]
        return changeset
    
    def apply_nvue_changeset(changeset):
        # apply_payload = {"state": "apply"}
        apply_payload = {"state": "apply", "auto-prompt": {"ays": "ays_yes"}}
        url = nvue_end_point + "/revision/" + requests.utils.quote(changeset,
                                                                   safe="")
        r = requests.patch(url=url,
                           auth=auth,
                           verify=False,
                           data=json.dumps(apply_payload),
                           headers=mime_header)
        print_request(r.request)
        print_response(r)
    
    def is_config_applied(changeset) -> bool:
        # Check if the configuration was indeed applied
        global RETRIES
        global POLL_APPLIED
        retries = RETRIES
        while retries > 0:
            r = requests.get(url=nvue_end_point + "/revision/" + requests.utils.quote(changeset, safe=""),
                             auth=auth,
                             verify=False)
            response = r.json()
            print(response)
    
            if response["state"] == "applied":
                return True
            retries -= 1
            time.sleep(POLL_APPLIED)
    
        return False
    
    def apply_new_config(path,payload):
        # Create a new revision ID
        changeset = create_nvue_changest()
        print("Using NVUE Changeset: '{}'".format(changeset))
    
        # Delete existing configuration
        query_string = {"rev": changeset}
        r = requests.delete(url=nvue_end_point + path,
                           auth=auth,
                           verify=False,
                           params=query_string,
                           headers=mime_header)
        print_request(r.request)
        print_response(r)
    
        # Patch the new configuration
        
        query_string = {"rev": changeset}
        r = requests.patch(url=nvue_end_point + path,
                           auth=auth,
                           verify=False,
                           data=json.dumps(payload),
                           params=query_string,
                           headers=mime_header)
        print_request(r.request)
        print_response(r)
    
        # Apply the changes to the new revision changeset
        apply_nvue_changeset(changeset)
    
        # Check if the changeset was applied
        is_config_applied(changeset)
    
    def nvue_get(path):
        r = requests.get(url=nvue_end_point + path,
                         auth=auth,
                         verify=False)
        print_request(r.request)
        print_response(r)
    
    if __name__ == "__main__":
        payload = {
          "interface": {
            "bond0": {
              "bond": {
                "member": {
                  "swp1": {},
                  "swp2": {},
                  "swp3": {},
                  "swp4": {}
                }
              },
              "type": "bond"
            },
            "lo": {
              "ip": {
                "address": {
                  "99.99.99.99/32": {}
                }
              }
            }
          },
          "system": {
            "hostname": "switch01"
          }
        }
        apply_new_config("/",payload)
        time.sleep(DUMMY_SLEEP)
        nvue_get("/interface/bond0")
        nvue_get("/interface/lo")
        nvue_get("/system")
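
Steps 2 and 3 above can also be scripted: parse the output of nv config diff -o json and pull out the set object to use as the root patch payload. The following is a minimal sketch that assumes the output is a JSON list of objects like the one shown in step 2. The set_payload_from_diff helper is illustrative; it ignores any keys other than set, and the merge is shallow, so a later entry replaces an earlier one that shares a top-level key.

```python
import json
from typing import Any, Dict, List


def set_payload_from_diff(diff_text: str) -> Dict[str, Any]:
    """Merge every "set" object in the diff output into one payload."""
    entries: List[Dict[str, Any]] = json.loads(diff_text)
    payload: Dict[str, Any] = {}
    for entry in entries:
        payload.update(entry.get("set", {}))
    return payload


# Shortened version of the diff output shown in step 2.
diff_text = """
[
  {
    "set": {
      "interface": {"lo": {"ip": {"address": {"99.99.99.99/32": {}}}}},
      "system": {"hostname": "switch01"}
    }
  }
]
"""
print(json.dumps(set_payload_from_diff(diff_text), indent=2))
```

The returned dictionary can be passed as the payload argument of apply_new_config in the script above.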
    
    

API Examples

The following sections provide practical API examples.

Configure the System

To set the system hostname, pre-login or post-login message, and time zone on the switch, send an API request that targets the system resource. The following examples use a root patch with a system payload.

cumulus@switch:~$ curl -u 'cumulus:cumulus' -d '{"system": {"hostname":"switch01","timezone":"America/Los_Angeles","message":{"pre-login":"Welcome to NVIDIA Cumulus Linux","post-login":"You have successfully logged in to switch01"}}}' -H 'Content-Type: application/json' -k -X PATCH https://127.0.0.1:8765/nvue_v1/?rev=4

#!/usr/bin/env python3

import requests
from requests.auth import HTTPBasicAuth
import json
import time

auth = HTTPBasicAuth(username="cumulus", password="password")
nvue_end_point = "https://127.0.0.1:8765/nvue_v1"
mime_header = {"Content-Type": "application/json"}

DUMMY_SLEEP = 5  # In seconds
POLL_APPLIED = 1  # in seconds
RETRIES = 10

def print_request(r: requests.Request):
    print("=======Request=======")
    print("URL:", r.url)
    print("Headers:", r.headers)
    print("Body:", r.body)

def print_response(r: requests.Response):
    print("=======Response=======")
    print("Headers:", r.headers)
    print("Body:", json.dumps(r.json(), indent=2))

def create_nvue_changest():
    r = requests.post(url=nvue_end_point + "/revision",
                      auth=auth,
                      verify=False)
    print_request(r.request)
    print_response(r)
    response = r.json()
    changeset = response.popitem()[0]
    return changeset

def apply_nvue_changeset(changeset):
    # apply_payload = {"state": "apply"}
    apply_payload = {"state": "apply", "auto-prompt": {"ays": "ays_yes"}}
    url = nvue_end_point + "/revision/" + requests.utils.quote(changeset,
                                                               safe="")
    r = requests.patch(url=url,
                       auth=auth,
                       verify=False,
                       data=json.dumps(apply_payload),
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

def is_config_applied(changeset) -> bool:
    # Check if the configuration was indeed applied
    global RETRIES
    global POLL_APPLIED
    retries = RETRIES
    while retries > 0:
        r = requests.get(url=nvue_end_point + "/revision/" + requests.utils.quote(changeset, safe=""),
                         auth=auth,
                         verify=False)
        response = r.json()
        print(response)

        if response["state"] == "applied":
            return True
        retries -= 1
        time.sleep(POLL_APPLIED)

    return False

def apply_new_config(path,payload):
    # Create a new revision ID
    changeset = create_nvue_changest()
    print("Using NVUE Changeset: '{}'".format(changeset))

    # Delete existing configuration
    query_string = {"rev": changeset}
    r = requests.delete(url=nvue_end_point + path,
                       auth=auth,
                       verify=False,
                       params=query_string,
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

    # Patch the new configuration
    
    query_string = {"rev": changeset}
    r = requests.patch(url=nvue_end_point + path,
                       auth=auth,
                       verify=False,
                       data=json.dumps(payload),
                       params=query_string,
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

    # Apply the changes to the new revision changeset
    apply_nvue_changeset(changeset)

    # Check if the changeset was applied
    is_config_applied(changeset)

def nvue_get(path):
    r = requests.get(url=nvue_end_point + path,
                     auth=auth,
                     verify=False)
    print_request(r.request)
    print_response(r)

if __name__ == "__main__":
    payload = {
      "system": 
      {
        "hostname":"switch01",
        "timezone":"America/Los_Angeles",
        "message":
        {
          "pre-login":"Welcome to NVIDIA Cumulus Linux",
          "post-login":"You have successfully logged in to switch01"
        }
      }
    }
    apply_new_config("/",payload) # Root patch
    time.sleep(DUMMY_SLEEP)
    nvue_get("/system")

cumulus@switch:~$ nv set system hostname switch01
cumulus@switch:~$ nv set system timezone America/Los_Angeles
cumulus@switch:~$ nv set system message pre-login "Welcome to NVIDIA Cumulus Linux"
cumulus@switch:~$ nv set system message post-login "You have successfully logged in to switch01"

Configure Services

To set up NTP, DNS, and syslog on the switch, send an API request that targets the service resource. The following examples use a root patch with a service payload.

cumulus@switch:~$ curl -u 'cumulus:cumulus' -d '{"service": { "ntp": {"default":{"server":{"4.cumulusnetworks.pool.ntp.org":{"iburst":"on"}}}}, "dns": {"mgmt":{"server":{"192.168.1.100":{}}}}, "syslog": {"mgmt":{"server":{"192.168.1.120":{"port":8000}}}}}}' -H 'Content-Type: application/json' -k -X PATCH https://127.0.0.1:8765/nvue_v1/?rev=5

#!/usr/bin/env python3

import requests
from requests.auth import HTTPBasicAuth
import json
import time

auth = HTTPBasicAuth(username="cumulus", password="password")
nvue_end_point = "https://127.0.0.1:8765/nvue_v1"
mime_header = {"Content-Type": "application/json"}

DUMMY_SLEEP = 5  # In seconds
POLL_APPLIED = 1  # in seconds
RETRIES = 10

def print_request(r: requests.Request):
    print("=======Request=======")
    print("URL:", r.url)
    print("Headers:", r.headers)
    print("Body:", r.body)

def print_response(r: requests.Response):
    print("=======Response=======")
    print("Headers:", r.headers)
    print("Body:", json.dumps(r.json(), indent=2))

def create_nvue_changest():
    r = requests.post(url=nvue_end_point + "/revision",
                      auth=auth,
                      verify=False)
    print_request(r.request)
    print_response(r)
    response = r.json()
    changeset = response.popitem()[0]
    return changeset

def apply_nvue_changeset(changeset):
    # apply_payload = {"state": "apply"}
    apply_payload = {"state": "apply", "auto-prompt": {"ays": "ays_yes"}}
    url = nvue_end_point + "/revision/" + requests.utils.quote(changeset,
                                                               safe="")
    r = requests.patch(url=url,
                       auth=auth,
                       verify=False,
                       data=json.dumps(apply_payload),
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

def is_config_applied(changeset) -> bool:
    # Check if the configuration was indeed applied
    global RETRIES
    global POLL_APPLIED
    retries = RETRIES
    while retries > 0:
        r = requests.get(url=nvue_end_point + "/revision/" + requests.utils.quote(changeset, safe=""),
                         auth=auth,
                         verify=False)
        response = r.json()
        print(response)

        if response["state"] == "applied":
            return True
        retries -= 1
        time.sleep(POLL_APPLIED)

    return False

def apply_new_config(path,payload):
    # Create a new revision ID
    changeset = create_nvue_changest()
    print("Using NVUE Changeset: '{}'".format(changeset))

    # Delete existing configuration
    query_string = {"rev": changeset}
    r = requests.delete(url=nvue_end_point + path,
                       auth=auth,
                       verify=False,
                       params=query_string,
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

    # Patch the new configuration
    
    query_string = {"rev": changeset}
    r = requests.patch(url=nvue_end_point + path,
                       auth=auth,
                       verify=False,
                       data=json.dumps(payload),
                       params=query_string,
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

    # Apply the changes to the new revision changeset
    apply_nvue_changeset(changeset)

    # Check if the changeset was applied
    is_config_applied(changeset)

def nvue_get(path):
    r = requests.get(url=nvue_end_point + path,
                     auth=auth,
                     verify=False)
    print_request(r.request)
    print_response(r)

if __name__ == "__main__":
    payload = {
      "service":
      {
        "ntp":
        {
          "default":
          {
            "server":
            {
              "4.cumulusnetworks.pool.ntp.org":
              {
                "iburst":"on"
              }
            }
          }
        },
        "dns":
        {
          "mgmt":
          {
            "server":
            {
              "192.168.1.100":{}
            }
          }
        },
        "syslog":
        {
          "mgmt":
          {
            "server":
            {
              "192.168.1.120":
              {
                "port":8000
              }
            }
          }
        }
      }
    }
    apply_new_config("/",payload) # Root patch
    time.sleep(DUMMY_SLEEP)
    nvue_get("/service/ntp")
    nvue_get("/service/dns")
    nvue_get("/service/syslog")
cumulus@switch:~$ nv set service ntp default server 4.cumulusnetworks.pool.ntp.org iburst on
cumulus@switch:~$ nv set service dns mgmt server 192.168.1.100 
cumulus@switch:~$ nv set service syslog mgmt server 192.168.1.120 port 8000

Configure Users

The following example creates a new user, then deletes the user.

This example creates a new user called test1.

cumulus@switch:~$ curl -u 'cumulus:cumulus' -d '{"system": {"aaa": {"user": {"test1": {"hashed-password":"72b28582708d749c6c82f3b3f226041f1bd37090281641eaeba8d44bd915d0042d609a92759d9f6fb96475cb0601cf428cd22613df8a53a09461e0b426cf0a35","role": "nvue-monitor","enable": "on","full-name": "Test User"}}}}}' -k -X PATCH https://127.0.0.1:8765/nvue_v1/?rev=5

This example deletes the test1 user.

cumulus@switch:~$ curl -u 'cumulus:cumulus' -k -X DELETE https://127.0.0.1:8765/nvue_v1/system/aaa/user/test1?rev=6
#!/usr/bin/env python3

import requests
from requests.auth import HTTPBasicAuth
import json
import time

auth = HTTPBasicAuth(username="cumulus", password="password")
nvue_end_point = "https://127.0.0.1:8765/nvue_v1"
mime_header = {"Content-Type": "application/json"}

DUMMY_SLEEP = 5  # In seconds
POLL_APPLIED = 1  # in seconds
RETRIES = 10

def print_request(r: requests.Request):
    print("=======Request=======")
    print("URL:", r.url)
    print("Headers:", r.headers)
    print("Body:", r.body)

def print_response(r: requests.Response):
    print("=======Response=======")
    print("Headers:", r.headers)
    print("Body:", json.dumps(r.json(), indent=2))

def create_nvue_changest():
    r = requests.post(url=nvue_end_point + "/revision",
                      auth=auth,
                      verify=False)
    print_request(r.request)
    print_response(r)
    response = r.json()
    changeset = response.popitem()[0]
    return changeset

def apply_nvue_changeset(changeset):
    # apply_payload = {"state": "apply"}
    apply_payload = {"state": "apply", "auto-prompt": {"ays": "ays_yes"}}
    url = nvue_end_point + "/revision/" + requests.utils.quote(changeset,
                                                               safe="")
    r = requests.patch(url=url,
                       auth=auth,
                       verify=False,
                       data=json.dumps(apply_payload),
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

def is_config_applied(changeset) -> bool:
    # Check if the configuration was indeed applied
    global RETRIES
    global POLL_APPLIED
    retries = RETRIES
    while retries > 0:
        r = requests.get(url=nvue_end_point + "/revision/" + requests.utils.quote(changeset, safe=""),
                         auth=auth,
                         verify=False)
        response = r.json()
        print(response)

        if response["state"] == "applied":
            return True
        retries -= 1
        time.sleep(POLL_APPLIED)

    return False

def apply_new_config(path,payload):
    # Create a new revision ID
    changeset = create_nvue_changest()
    print("Using NVUE Changeset: '{}'".format(changeset))

    # Delete existing configuration
    query_string = {"rev": changeset}
    r = requests.delete(url=nvue_end_point + path,
                       auth=auth,
                       verify=False,
                       params=query_string,
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

    # Patch the new configuration
    
    query_string = {"rev": changeset}
    r = requests.patch(url=nvue_end_point + path,
                       auth=auth,
                       verify=False,
                       data=json.dumps(payload),
                       params=query_string,
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

    # Apply the changes to the new revision changeset
    apply_nvue_changeset(changeset)

    # Check if the changeset was applied
    is_config_applied(changeset)

def delete_config(path):
    # Create an NVUE changeset
    changeset = create_nvue_changest()
    print("Using NVUE Changeset: '{}'".format(changeset))

    # Equivalent to JSON `null`
    payload = None

    # Stage the change
    query_string = {"rev": changeset}
    r = requests.delete(url=nvue_end_point + path,
                        auth=auth,
                        verify=False,
                        data=json.dumps(payload),
                        params=query_string,
                        headers=mime_header)
    print_request(r.request)
    print_response(r)

    # Apply the staged changeset
    apply_nvue_changeset(changeset)

    # Check if the changeset was applied
    is_config_applied(changeset)

def nvue_get(path):
    r = requests.get(url=nvue_end_point + path,
                     auth=auth,
                     verify=False)
    print_request(r.request)
    print_response(r)

if __name__ == "__main__":

    # Need to create a hashed password - The supported password
    # hashes are documented here:
    # https://docs.nvidia.com/networking-ethernet-software/cumulus-linux-55/System-Configuration/Authentication-Authorization-and-Accounting/User-Accounts/#hashed-passwords  # noqa
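    # This example uses SHA-512.
    # Note: the crypt module is deprecated since Python 3.11 and removed in
    # Python 3.13, so run this script with an interpreter that still ships it.
    import crypt
    hashed_password = crypt.crypt("hello$world#2023", salt=crypt.METHOD_SHA512)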
    payload = {
        "system": {
            "aaa": {
                "user": {
                    "test1": {
                        "hashed-password": hashed_password,
                        "role": "nvue-monitor",
                        "enable": "on",
                        "full-name": "Test User",
                    }
                }
            }
        }
    }
    apply_new_config("/",payload) # Root patch
    time.sleep(DUMMY_SLEEP)
    nvue_get("/system/aaa/user")

    # Delete an existing user account using the AAA API
    delete_config("/system/aaa/user/test1")
    time.sleep(DUMMY_SLEEP)
    nvue_get("/system/aaa/user")

This example creates a new user test1.

cumulus@switch:~$ nv set system aaa user test1
cumulus@switch:~$ nv set system aaa user test1 full-name "Test User" 
cumulus@switch:~$ nv set system aaa user test1 password "abcd@test"
cumulus@switch:~$ nv set system aaa user test1 role nvue-monitor
cumulus@switch:~$ nv set system aaa user test1 enable on

This example deletes the user test1.

cumulus@switch:~$ nv unset system aaa user test1

Configure an Interface

The following example configures an interface.

cumulus@switch:~$ curl -u 'cumulus:cumulus' -d '{"swp1": {"link":{"state":{"up": {}}}}}' -H 'Content-Type: application/json' -k -X PATCH https://127.0.0.1:8765/nvue_v1/interface?rev=21
#!/usr/bin/env python3

import requests
from requests.auth import HTTPBasicAuth
import json
import time

auth = HTTPBasicAuth(username="cumulus", password="password")
nvue_end_point = "https://127.0.0.1:8765/nvue_v1"
mime_header = {"Content-Type": "application/json"}

DUMMY_SLEEP = 5  # In seconds
POLL_APPLIED = 1  # in seconds
RETRIES = 10

def print_request(r: requests.Request):
    print("=======Request=======")
    print("URL:", r.url)
    print("Headers:", r.headers)
    print("Body:", r.body)

def print_response(r: requests.Response):
    print("=======Response=======")
    print("Headers:", r.headers)
    print("Body:", json.dumps(r.json(), indent=2))

def create_nvue_changest():
    r = requests.post(url=nvue_end_point + "/revision",
                      auth=auth,
                      verify=False)
    print_request(r.request)
    print_response(r)
    response = r.json()
    changeset = response.popitem()[0]
    return changeset

def apply_nvue_changeset(changeset):
    # apply_payload = {"state": "apply"}
    apply_payload = {"state": "apply", "auto-prompt": {"ays": "ays_yes"}}
    url = nvue_end_point + "/revision/" + requests.utils.quote(changeset,
                                                               safe="")
    r = requests.patch(url=url,
                       auth=auth,
                       verify=False,
                       data=json.dumps(apply_payload),
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

def is_config_applied(changeset) -> bool:
    # Check if the configuration was indeed applied
    global RETRIES
    global POLL_APPLIED
    retries = RETRIES
    while retries > 0:
        r = requests.get(url=nvue_end_point + "/revision/" + requests.utils.quote(changeset, safe=""),
                         auth=auth,
                         verify=False)
        response = r.json()
        print(response)

        if response["state"] == "applied":
            return True
        retries -= 1
        time.sleep(POLL_APPLIED)

    return False

def apply_new_config(path,payload):
    # Create a new revision ID
    changeset = create_nvue_changest()
    print("Using NVUE Changeset: '{}'".format(changeset))

    # Delete existing configuration
    query_string = {"rev": changeset}
    r = requests.delete(url=nvue_end_point + path,
                       auth=auth,
                       verify=False,
                       params=query_string,
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

    # Patch the new configuration
    
    query_string = {"rev": changeset}
    r = requests.patch(url=nvue_end_point + path,
                       auth=auth,
                       verify=False,
                       data=json.dumps(payload),
                       params=query_string,
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

    # Apply the changes to the new revision changeset
    apply_nvue_changeset(changeset)

    # Check if the changeset was applied
    is_config_applied(changeset)

def nvue_get(path):
    r = requests.get(url=nvue_end_point + path,
                     auth=auth,
                     verify=False)
    print_request(r.request)
    print_response(r)

if __name__ == "__main__":
    payload = {
      "swp1":
      {
        "type":"swp",
        "link":
        {
          "state":"up"
        }
      }
    }
    apply_new_config("/interface",payload)
    time.sleep(DUMMY_SLEEP)
    nvue_get("/interface/swp1")
cumulus@switch:~$ nv set interface swp1

Configure a Bond

The following example configures a bond.

cumulus@switch:~$ curl -u 'cumulus:cumulus' -d '{"bond0": {"type":"bond","bond":{"member":{"swp1":{},"swp2":{},"swp3":{},"swp4":{}}}}}' -H 'Content-Type: application/json' -k -X PATCH https://127.0.0.1:8765/nvue_v1/interface?rev=7
{
  "bond0": {
    "bond": {
      "member": {
        "swp1": {},
        "swp2": {},
        "swp3": {},
        "swp4": {}
      }
    },
    "type": "bond"
  }
}
#!/usr/bin/env python3

import requests
from requests.auth import HTTPBasicAuth
import json
import time

auth = HTTPBasicAuth(username="cumulus", password="password")
nvue_end_point = "https://127.0.0.1:8765/nvue_v1"
mime_header = {"Content-Type": "application/json"}

DUMMY_SLEEP = 5  # In seconds
POLL_APPLIED = 1  # in seconds
RETRIES = 10

def print_request(r: requests.Request):
    print("=======Request=======")
    print("URL:", r.url)
    print("Headers:", r.headers)
    print("Body:", r.body)

def print_response(r: requests.Response):
    print("=======Response=======")
    print("Headers:", r.headers)
    print("Body:", json.dumps(r.json(), indent=2))

def create_nvue_changest():
    r = requests.post(url=nvue_end_point + "/revision",
                      auth=auth,
                      verify=False)
    print_request(r.request)
    print_response(r)
    response = r.json()
    changeset = response.popitem()[0]
    return changeset

def apply_nvue_changeset(changeset):
    # apply_payload = {"state": "apply"}
    apply_payload = {"state": "apply", "auto-prompt": {"ays": "ays_yes"}}
    url = nvue_end_point + "/revision/" + requests.utils.quote(changeset,
                                                               safe="")
    r = requests.patch(url=url,
                       auth=auth,
                       verify=False,
                       data=json.dumps(apply_payload),
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

def is_config_applied(changeset) -> bool:
    # Check if the configuration was indeed applied
    global RETRIES
    global POLL_APPLIED
    retries = RETRIES
    while retries > 0:
        r = requests.get(url=nvue_end_point + "/revision/" + requests.utils.quote(changeset, safe=""),
                         auth=auth,
                         verify=False)
        response = r.json()
        print(response)

        if response["state"] == "applied":
            return True
        retries -= 1
        time.sleep(POLL_APPLIED)

    return False

def apply_new_config(path,payload):
    # Create a new revision ID
    changeset = create_nvue_changest()
    print("Using NVUE Changeset: '{}'".format(changeset))

    # Delete existing configuration
    query_string = {"rev": changeset}
    r = requests.delete(url=nvue_end_point + path,
                       auth=auth,
                       verify=False,
                       params=query_string,
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

    # Patch the new configuration
    
    query_string = {"rev": changeset}
    r = requests.patch(url=nvue_end_point + path,
                       auth=auth,
                       verify=False,
                       data=json.dumps(payload),
                       params=query_string,
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

    # Apply the changes to the new revision changeset
    apply_nvue_changeset(changeset)

    # Check if the changeset was applied
    is_config_applied(changeset)

def nvue_get(path):
    r = requests.get(url=nvue_end_point + path,
                     auth=auth,
                     verify=False)
    print_request(r.request)
    print_response(r)

if __name__ == "__main__":
    payload = {
      "bond0":
      {
        "type":"bond",
        "bond":
        {
          "member":
          {
            "swp1":{},
            "swp2":{},
            "swp3":{},
            "swp4":{}
          }
        }
      }
    }
    apply_new_config("/interface",payload)
    time.sleep(DUMMY_SLEEP)
    nvue_get("/interface/bond0")
cumulus@switch:~$ nv set interface bond0 bond member swp1-4
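
The NVUE command above names the bond members as the range swp1-4, while the REST payload lists each member as its own key. The following is a minimal sketch of building that payload from a range string. The helper names expand_port_range and bond_payload are hypothetical and not part of NVUE, and the sketch assumes only the simple <prefix><start>-<end> form.

```python
import json
import re


def expand_port_range(spec):
    """Expand a simple range such as "swp1-4" into individual port names.

    Hypothetical helper: handles only "<prefix><start>-<end>"; any other
    string is returned unchanged as a single-item list.
    """
    match = re.fullmatch(r"([A-Za-z_]+)(\d+)-(\d+)", spec)
    if match is None:
        return [spec]
    prefix = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    return [f"{prefix}{n}" for n in range(start, end + 1)]


def bond_payload(bond_name, member_spec):
    """Build the body for a PATCH to /interface that defines a bond."""
    members = {port: {} for port in expand_port_range(member_spec)}
    return {bond_name: {"type": "bond", "bond": {"member": members}}}


if __name__ == "__main__":
    # Prints the same structure as the payload in the script above
    print(json.dumps(bond_payload("bond0", "swp1-4"), indent=2))
```

You could pass the resulting dictionary as the payload argument of apply_new_config("/interface", ...) in the script above.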

Configure a Bridge

The following example configures a bridge.

cumulus@switch:~$ curl -u 'cumulus:cumulus' -d '{"swp1": {"bridge":{"domain":{"br_default":{}}}},"swp2": {"bridge":{"domain":{"br_default":{}}}}}' -H 'Content-Type: application/json' -k -X PATCH https://127.0.0.1:8765/nvue_v1/interface?rev=21
{
  "swp1": {
    "bridge": {
      "domain": {
        "br_default": {}
      }
    },
    "type": "swp"
  },
  "swp2": {
    "bridge": {
      "domain": {
        "br_default": {}
      }
    },
    "type": "swp"
  }
}

cumulus@switch:~$ curl -u 'cumulus:cumulus' -d '{"untagged":1,"vlan":{"10":{},"20":{}}}' -H 'Content-Type: application/json' -k -X PATCH https://127.0.0.1:8765/nvue_v1/bridge/domain/br_default?rev=8

{
  "untagged": 1,
  "vlan": {
    "10": {},
    "20": {}
  }
}

#!/usr/bin/env python3

import requests
from requests.auth import HTTPBasicAuth
import json
import time

auth = HTTPBasicAuth(username="cumulus", password="password")
nvue_end_point = "https://127.0.0.1:8765/nvue_v1"
mime_header = {"Content-Type": "application/json"}

DUMMY_SLEEP = 5  # In seconds
POLL_APPLIED = 1  # in seconds
RETRIES = 10

def print_request(r: requests.Request):
    print("=======Request=======")
    print("URL:", r.url)
    print("Headers:", r.headers)
    print("Body:", r.body)

def print_response(r: requests.Response):
    print("=======Response=======")
    print("Headers:", r.headers)
    print("Body:", json.dumps(r.json(), indent=2))

def create_nvue_changest():
    r = requests.post(url=nvue_end_point + "/revision",
                      auth=auth,
                      verify=False)
    print_request(r.request)
    print_response(r)
    response = r.json()
    changeset = response.popitem()[0]
    return changeset

def apply_nvue_changeset(changeset):
    # apply_payload = {"state": "apply"}
    apply_payload = {"state": "apply", "auto-prompt": {"ays": "ays_yes"}}
    url = nvue_end_point + "/revision/" + requests.utils.quote(changeset,
                                                               safe="")
    r = requests.patch(url=url,
                       auth=auth,
                       verify=False,
                       data=json.dumps(apply_payload),
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

def is_config_applied(changeset) -> bool:
    # Check if the configuration was indeed applied
    global RETRIES
    global POLL_APPLIED
    retries = RETRIES
    while retries > 0:
        r = requests.get(url=nvue_end_point + "/revision/" + requests.utils.quote(changeset, safe=""),
                         auth=auth,
                         verify=False)
        response = r.json()
        print(response)

        if response["state"] == "applied":
            return True
        retries -= 1
        time.sleep(POLL_APPLIED)

    return False

def apply_new_config(path,payload):
    # Create a new revision ID
    changeset = create_nvue_changest()
    print("Using NVUE Changeset: '{}'".format(changeset))

    # Delete existing configuration
    query_string = {"rev": changeset}
    r = requests.delete(url=nvue_end_point + path,
                       auth=auth,
                       verify=False,
                       params=query_string,
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

    # Patch the new configuration
    
    query_string = {"rev": changeset}
    r = requests.patch(url=nvue_end_point + path,
                       auth=auth,
                       verify=False,
                       data=json.dumps(payload),
                       params=query_string,
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

    # Apply the changes to the new revision changeset
    apply_nvue_changeset(changeset)

    # Check if the changeset was applied
    is_config_applied(changeset)

def nvue_get(path):
    r = requests.get(url=nvue_end_point + path,
                     auth=auth,
                     verify=False)
    print_request(r.request)
    print_response(r)

if __name__ == "__main__":
    int_payload = {
      "swp1":
      {
        "bridge":
        {
          "domain":
          {
            "br_default":{}
          }
        }
      },
      "swp2":
      {
        "bridge":
        {
          "domain":
          {
            "br_default":{}
          }
        }
      }
    }
    apply_new_config("/interface",int_payload)
    br_payload = {
      "untagged":1,
      "vlan":
      {
        "10":{},
        "20":{}
      }
    }
    apply_new_config("/bridge/domain/br_default",br_payload)
    time.sleep(DUMMY_SLEEP)
    nvue_get("/interface/swp1")
    nvue_get("/bridge/domain/br_default")
cumulus@switch:~$ nv set interface swp1-2 bridge domain br_default
cumulus@switch:~$ nv set bridge domain br_default vlan 10,20
cumulus@switch:~$ nv set bridge domain br_default untagged 1
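
The bridge example sends two payloads: one to /interface with a sibling key per port, and one to /bridge/domain/br_default with the VLAN list. The following sketch generates both from plain lists so that each port stays at the top level of the interface payload. The function names bridge_port_payload and bridge_vlan_payload are hypothetical helpers, not part of NVUE.

```python
import json


def bridge_port_payload(ports, domain="br_default"):
    """Build the body for a PATCH to /interface: one sibling key per port."""
    return {port: {"bridge": {"domain": {domain: {}}}} for port in ports}


def bridge_vlan_payload(vlans, untagged=1):
    """Build the body for a PATCH to /bridge/domain/<domain-id>.

    vlans is a comma-separated string such as "10,20" or an iterable of IDs.
    """
    if isinstance(vlans, str):
        vlans = [v.strip() for v in vlans.split(",") if v.strip()]
    # JSON object keys are strings, so each VLAN ID is converted with str()
    return {"untagged": untagged, "vlan": {str(v): {} for v in vlans}}


if __name__ == "__main__":
    print(json.dumps(bridge_port_payload(["swp1", "swp2"]), indent=2))
    print(json.dumps(bridge_vlan_payload("10,20"), indent=2))
```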

Configure BGP

The following example configures BGP.

cumulus@switch:~$ curl -u 'cumulus:cumulus' -d '{"bgp": {"autonomous-system": 65101,"router-id":"10.10.10.1"}}' -H 'Content-Type: application/json' -k -X PATCH https://127.0.0.1:8765/nvue_v1/router?rev=9
cumulus@switch:~$ curl -u 'cumulus:cumulus' -d '{"bgp":{"neighbor":{"swp51":{"remote-as":"external"}},"address-family":{"ipv4-unicast":{"network":{"10.10.10.1/32":{}}}}}}' -H 'Content-Type: application/json' -k -X PATCH https://127.0.0.1:8765/nvue_v1/vrf/default/router?rev=9
#!/usr/bin/env python3

import requests
from requests.auth import HTTPBasicAuth
import json
import time

auth = HTTPBasicAuth(username="cumulus", password="password")
nvue_end_point = "https://127.0.0.1:8765/nvue_v1"
mime_header = {"Content-Type": "application/json"}

DUMMY_SLEEP = 5  # In seconds
POLL_APPLIED = 1  # in seconds
RETRIES = 10

def print_request(r: requests.Request):
    print("=======Request=======")
    print("URL:", r.url)
    print("Headers:", r.headers)
    print("Body:", r.body)

def print_response(r: requests.Response):
    print("=======Response=======")
    print("Headers:", r.headers)
    print("Body:", json.dumps(r.json(), indent=2))

def create_nvue_changest():
    r = requests.post(url=nvue_end_point + "/revision",
                      auth=auth,
                      verify=False)
    print_request(r.request)
    print_response(r)
    response = r.json()
    changeset = response.popitem()[0]
    return changeset

def apply_nvue_changeset(changeset):
    # apply_payload = {"state": "apply"}
    apply_payload = {"state": "apply", "auto-prompt": {"ays": "ays_yes"}}
    url = nvue_end_point + "/revision/" + requests.utils.quote(changeset,
                                                               safe="")
    r = requests.patch(url=url,
                       auth=auth,
                       verify=False,
                       data=json.dumps(apply_payload),
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

def is_config_applied(changeset) -> bool:
    # Check if the configuration was indeed applied
    global RETRIES
    global POLL_APPLIED
    retries = RETRIES
    while retries > 0:
        r = requests.get(url=nvue_end_point + "/revision/" + requests.utils.quote(changeset, safe=""),
                         auth=auth,
                         verify=False)
        response = r.json()
        print(response)

        if response["state"] == "applied":
            return True
        retries -= 1
        time.sleep(POLL_APPLIED)

    return False

def apply_new_config(path,payload):
    # Create a new revision ID
    changeset = create_nvue_changest()
    print("Using NVUE Changeset: '{}'".format(changeset))

    # Delete existing configuration
    query_string = {"rev": changeset}
    r = requests.delete(url=nvue_end_point + path,
                       auth=auth,
                       verify=False,
                       params=query_string,
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

    # Patch the new configuration
    
    query_string = {"rev": changeset}
    r = requests.patch(url=nvue_end_point + path,
                       auth=auth,
                       verify=False,
                       data=json.dumps(payload),
                       params=query_string,
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

    # Apply the changes to the new revision changeset
    apply_nvue_changeset(changeset)

    # Check if the changeset was applied
    is_config_applied(changeset)

def nvue_get(path):
    r = requests.get(url=nvue_end_point + path,
                     auth=auth,
                     verify=False)
    print_request(r.request)
    print_response(r)

if __name__ == "__main__":
    rt_payload = {
      "bgp":
      {
        "autonomous-system": 65101,
        "router-id":"10.10.10.1"
      }
    }
    apply_new_config("/router",rt_payload)
    vrf_payload = {
      "bgp":
      {
        "neighbor":
        {
          "swp51":
          {
            "remote-as":"external"
          }
        },
        "address-family":
        {
          "ipv4-unicast":
          {
            "network":
            {
              "10.10.10.1/32":{}
            }
          }
        }
      }
    }
    apply_new_config("/vrf/default/router",vrf_payload)
    time.sleep(DUMMY_SLEEP)
    nvue_get("/router")
    nvue_get("/vrf/default/router")
cumulus@switch:~$ nv set router bgp autonomous-system 65101
cumulus@switch:~$ nv set router bgp router-id 10.10.10.1
cumulus@switch:~$ nv set vrf default router bgp neighbor swp51 remote-as external
cumulus@switch:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.1/32

Action Operations

The NVUE action operations are ephemeral operations that do not modify the state of the configuration; they reset counters for interfaces, BGP, QoS buffers and pools, and remove conflicts from protodown MLAG bonds.

To clear counters on swp1:

cumulus@switch:~$ curl -u 'cumulus:cumulus' -H 'Content-Type:application/json' -d '{"@clear": {"state": "start", "parameters": {}}}' -k -X POST https://127.0.0.1:8765/nvue_v1/interface/swp1/counters
1
cumulus@switch:~$ curl -u 'cumulus:cumulus' -X GET https://127.0.0.1:8765/nvue_v1/action/1 -k
{"detail":"swp1 counters cleared.","http_status":200,"issue":[],"state":"action_success","status":"swp1 counters cleared.","timeout":60,"type":""}

To clear QoS buffers on swp1:

cumulus@switch:~$ curl -u 'cumulus:cumulus' -H 'Content-Type:application/json' -d '{"@clear": {"state": "start", "parameters": {}}}' -k -X POST https://127.0.0.1:8765/nvue_v1/interface/swp1/qos/buffer
2
cumulus@switch:~$ curl -u 'cumulus:cumulus'  -X GET https://127.0.0.1:8765/nvue_v1/action/2 -k
{"detail":"QoS buffers cleared on swp1.","http_status":200,"issue":[],"state":"action_success","status":"QoS buffers cleared on swp1.","timeout":60,"type":""}
#!/usr/bin/env python3

import requests
from requests.auth import HTTPBasicAuth
import json
import time

auth = HTTPBasicAuth(username="cumulus", password="password")
nvue_end_point = "https://127.0.0.1:8765/nvue_v1"
mime_header = {"Content-Type": "application/json"}

DUMMY_SLEEP = 5  # In seconds
POLL_APPLIED = 1  # in seconds
RETRIES = 10

def print_request(r: requests.Request):
    print("=======Request=======")
    print("URL:", r.url)
    print("Headers:", r.headers)
    print("Body:", r.body)

def print_response(r: requests.Response):
    print("=======Response=======")
    print("Headers:", r.headers)
    print("Body:", json.dumps(r.json(), indent=2))

def nvue_action(path, payload):
    # POST the action payload; the response body is the action ID
    r = requests.post(url=nvue_end_point + path,
                      auth=auth,
                      verify=False,
                      data=json.dumps(payload),
                      headers=mime_header)
    print_request(r.request)
    print_response(r)
    return r.json()

def nvue_get(path):
    r = requests.get(url=nvue_end_point + path,
                     auth=auth,
                     verify=False)
    print_request(r.request)
    print_response(r)

if __name__ == "__main__":
    payload = {
      "@clear": 
      {
        "state": "start", 
        "parameters": {}
      }
    }
    action_id=nvue_action("/interface/swp1/qos/counter",payload)
    time.sleep(DUMMY_SLEEP)
    nvue_get(f"/action/{action_id}")
   
cumulus@switch:~$ nv action clear interface swp1 qos counter
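
The examples above post an action, receive an action ID, then read /action/<id> once after a fixed delay. As a rough alternative, the sketch below polls until the response reports the action_success state shown in the sample output. The function wait_for_action and its fetch_status parameter are hypothetical; fetch_status stands for any callable that returns the decoded JSON body of GET /action/<id>. The "running" value in the demonstration is a made-up placeholder, not a documented NVUE state.

```python
import time


def wait_for_action(fetch_status, action_id, retries=10, delay=1.0):
    """Poll an action until it reports success or the retries run out.

    fetch_status takes the action ID and returns the decoded JSON body
    of GET /action/<id> as a dict.
    """
    for attempt in range(retries):
        status = fetch_status(action_id)
        if status.get("state") == "action_success":
            return status
        if attempt < retries - 1:
            time.sleep(delay)
    raise TimeoutError(
        f"action {action_id} did not succeed after {retries} polls")


if __name__ == "__main__":
    # Stand-in for a real GET request: reports success on the third poll
    replies = iter([
        {"state": "running"},
        {"state": "running"},
        {"state": "action_success", "status": "swp1 counters cleared."},
    ])
    result = wait_for_action(lambda _id: next(replies), 1, delay=0)
    print(result["status"])
```

Injecting the fetch function keeps the polling logic independent of the HTTP client, so you can exercise it without a switch.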

Example Python Scripts

Configuration example

In the following Python example, the full_config_example() function sets the system pre-login and post-login messages, enables BGP globally, and changes a few other configuration settings in a single bulk operation. The API endpoint is the root node /nvue_v1. The bridge_config_example() function sends a targeted API request to /nvue_v1/bridge/domain/<domain-id> to set the vlan-vni-offset attribute.
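
To illustrate how a targeted request relates to a root request, the sketch below wraps a targeted payload in the nested keys that a root PATCH would need. It assumes that each URL path segment corresponds to one level of the JSON tree, which is how the examples in this section are laid out. The function name wrap_for_root is hypothetical.

```python
import json


def wrap_for_root(path, payload):
    """Nest payload under the keys named by path, for a PATCH to the root."""
    nested = payload
    # Walk the path from the deepest segment outward, wrapping as we go
    for segment in reversed([s for s in path.split("/") if s]):
        nested = {segment: nested}
    return nested


if __name__ == "__main__":
    targeted = {"vlan-vni-offset": 1000}
    print(json.dumps(wrap_for_root("/bridge/domain/br_default", targeted),
                     indent=2))
```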

Example Configuration Script
#!/usr/bin/env python3

import requests
from requests.auth import HTTPBasicAuth
import json
import time

auth = HTTPBasicAuth(username="vagrant", password="vagrant")
nvue_end_point = "https://127.0.0.1:8765/nvue_v1"
mime_header = {"Content-Type": "application/json"}

DUMMY_SLEEP = 5  # In seconds
POLL_APPLIED = 1  # in seconds
RETRIES = 10

def print_request(r: requests.Request):
    print("=======Request=======")
    print("URL:", r.url)
    print("Headers:", r.headers)
    print("Body:", r.body)

def print_response(r: requests.Response):
    print("=======Response=======")
    print("Headers:", r.headers)
    print("Body:", json.dumps(r.json(), indent=2))

def sanity():
    # Basic retrieval to check connectivity
    r = requests.get(url=nvue_end_point + "/system",
                     auth=auth,
                     verify=False)
    print_request(r.request)
    print_response(r)

def create_nvue_changest():
    r = requests.post(url=nvue_end_point + "/revision",
                      auth=auth,
                      verify=False)
    print_request(r.request)
    print_response(r)
    response = r.json()
    changeset = response.popitem()[0]
    return changeset

def apply_nvue_changeset(changeset):
    # apply_payload = {"state": "apply"}
    apply_payload = {"state": "apply", "auto-prompt": {"ays": "ays_yes"}}
    url = nvue_end_point + "/revision/" + requests.utils.quote(changeset,
                                                               safe="")
    r = requests.patch(url=url,
                       auth=auth,
                       verify=False,
                       data=json.dumps(apply_payload),
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

def full_config_example():
    # Create an NVUE changeset
    changeset = create_nvue_changest()
    print("Using NVUE Changeset: '{}'".format(changeset))

    # https://www.asciiart.eu/comics/batman
    # Delimited with ''' because the art contains runs of double quotes
    pre_login_message = u'''
                   ,.ood888888888888boo.,
              .od888P^""            ""^Y888bo.
          .od8P''   ..oood88888888booo.    ``Y8bo.
       .odP'"  .ood8888888888888888888888boo.  "`Ybo.
     .d8'   od8'd888888888f`8888't888888888b`8bo   `Yb.
    d8'  od8^   8888888888[  `'  ]8888888888   ^8bo  `8b
  .8P  d88'     8888888888P      Y8888888888     `88b  Y8.
 d8' .d8'       `Y88888888'      `88888888P'       `8b. `8b
.8P .88P            """"            """"            Y88. Y8.
88  888                                              888  88
88  888                                              888  88
88  888.        ..                        ..        .888  88
`8b `88b,     d8888b.od8bo.      .od8bo.d8888b     ,d88' d8'
 Y8. `Y88.    8888888888888b    d8888888888888    .88P' .8P
  `8b  Y88b.  `88888888888888  88888888888888'  .d88P  d8'
    Y8.  ^Y88bod8888888888888..8888888888888bod88P^  .8P
     `Y8.   ^Y888888888888888LS888888888888888P^   .8P'
       `^Yb.,  `^^Y8888888888888888888888P^^'  ,.dP^'
          `^Y8b..   ``^^^Y88888888P^^^'    ..d8P^'
              `^Y888bo.,            ,.od888P^'
                   "`^^Y888888888888P^^'"
'''

    # https://www.asciiart.eu/comics/superman
    # Raw string so that the backslashes in the art are kept literally
    post_login_message = r'''
        _____________________________________________
      //:::::::::::::::::::::::::::::::::::::::::::::\\
    //:::_______:::::::::________::::::::::_____:::::::\\
  //:::_/   _-"":::_--"""        """--_::::\_  ):::::::::\\
 //:::/    /:::::_"                    "-_:::\/:::::|^\:::\\
//:::/   /~::::::I__                      \:::::::::|  \:::\\
\\:::\   (::::::::::""""---___________     "--------"  /::://
 \\:::\  |::::::::::::::::::::::::::::""""==____      /::://
  \\:::"\/::::::::::::::::::::::::::::::::::::::\   /~::://
    \\:::::::::::::::::::::::::::::::::::::::::::)/~::://
      \\::::\""""""------_____::::::::::::::::::::::://
        \\:::"\               """""-----_____:::::://
          \\:::"\    __----__                )::://
            \\:::"\/~::::::::~\_         __/~:://
              \\::::::::::::::::""----""":::://
                \\::::::::::::::::::::::::://
                  \\:::\^""--._.--""^/::://
                    \\::"\         /":://
                      \\::"\     /":://
                        \\::"\_/":://
                          \\::::://
                            \\_//
                              "
'''

    # Prepare payload which configures a few
    # different switch configurations
    payload = {
        "interface":{
            "eth0":{
                "description": "management port"
            }
        },
        "router":{
            "bgp":{
                "enable":"on"
            }
        },
        "system":{
            "message":{
                "pre-login": pre_login_message,
                "post-login": post_login_message
            },
            "timezone": "Europe/Paris",
            "config": {
               "snippet": {
                   "test-flexible-snippet": {
                       "file": "/tmp/blah",
                       "content": "NVIDIA rocks"
                   },
                   "frr.conf": "hello world"
               }
            }
        },
        "service": {
            "ntp": {
                "mgmt": {
                    "listen": "eth0"
                }
            }
        }
    }
    # Stage the change
    query_string = {"rev": changeset}
    r = requests.patch(url=nvue_end_point + "/",  # Root patch
                       auth=auth,
                       verify=False,
                       data=json.dumps(payload),
                       params=query_string,
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

    # Apply the staged changeset
    apply_nvue_changeset(changeset)

    # Check if the changeset was applied
    is_config_applied(changeset)

def bridge_config_example(domain_id):
    # Create an NVUE changeset
    changeset = create_nvue_changest()
    print("Using NVUE Changeset: '{}'".format(changeset))
    payload = {
        "vlan-vni-offset": 1000
    }

    # Stage the change
    query_string = {"rev": changeset}
    r = requests.patch(url=nvue_end_point + f"/bridge/domain/{domain_id}",
                       auth=auth,
                       verify=False,
                       data=json.dumps(payload),
                       params=query_string,
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

    # Apply the staged changeset
    apply_nvue_changeset(changeset)

    # Check if the changeset was applied
    is_config_applied(changeset)

def message_get():
    # Get the system pre-login/post-login
    # message that was configured.
    r = requests.get(url=nvue_end_point + "/system/message",
                     auth=auth,
                     verify=False)
    print_request(r.request)
    print_response(r)

def is_config_applied(changeset) -> bool:
    # Check if the configuration was indeed applied
    global RETRIES
    global POLL_APPLIED
    retries = RETRIES
    while retries > 0:
        r = requests.get(url=nvue_end_point + "/revision/" + requests.utils.quote(changeset, safe=""),
                         auth=auth,
                         verify=False)
        response = r.json()
        print(response)

        if response["state"] == "applied":
            return True
        retries -= 1
        time.sleep(POLL_APPLIED)

    return False

if __name__ == "__main__":
    sanity()
    time.sleep(DUMMY_SLEEP)
    full_config_example()
    time.sleep(DUMMY_SLEEP)
    bridge_config_example("br_default")
    time.sleep(DUMMY_SLEEP)
    message_get()

In the following example, get_link_status() fetches the current running state of the switches passed as a parameter. link_status_down() brings down totalLinks links between each leaf and spine passed as parameters. It discovers the neighbor switches using LLDP and filters out interfaces that are not 400G swp interfaces. link_status_up() brings the links recorded in downLinks back up.

Link Status Manipulation Script
#!/usr/bin/env python3

import requests
from requests.auth import HTTPBasicAuth
import json
import time
from urllib3.exceptions import InsecureRequestWarning


auth = HTTPBasicAuth(username="vagrant", password="vagrant")
mime_header = {"Content-Type": "application/json"}

DUMMY_SLEEP = 5  # In seconds
POLL_APPLIED = 1  # in seconds
RETRIES = 10
 
# Suppress the warnings from urllib3
requests.packages.urllib3.disable_warnings(category=InsecureRequestWarning)

def print_request(r: requests.Request):
    print("API URL:", r.url)
    print("API Body:", r.body)

def print_response(r: requests.Response):
    print("API response:", json.dumps(r.json(), indent=2))

def create_nvue_changest(url):
    r = requests.post(url=url + "/revision",
                      auth=auth,
                      verify=False)
    print_response(r)
    response = r.json()
    changeset = response.popitem()[0]
    return changeset

def apply_nvue_changeset(url,changeset):
    apply_payload = {"state": "apply", "auto-prompt": {"ays": "ays_yes"}}
    url = url + "/revision/" + requests.utils.quote(changeset,
                                                               safe="")
    r = requests.patch(url=url,
                       auth=auth,
                       verify=False,
                       data=json.dumps(apply_payload),
                       headers=mime_header)
    print_response(r)

def is_config_applied(url,changeset) -> bool:
    # Check if the configuration was indeed applied
    global RETRIES
    global POLL_APPLIED
    retries = RETRIES
    while retries > 0:
        r = requests.get(url=url + "/revision/" + requests.utils.quote(changeset, safe=""),
                         auth=auth,
                         verify=False)
        response = r.json()

        if response["state"] == "applied":
            return True
        retries -= 1
        time.sleep(POLL_APPLIED)
    return False

def apply_new_config(url,path,payload):
    # Create a new revision ID
    changeset = create_nvue_changest(url)
    print("Using NVUE Changeset: '{}'".format(changeset))

    # Patch the new configuration
    query_string = {"rev": changeset}
    r = requests.patch(url=url + path,
                       auth=auth,
                       verify=False,
                       data=json.dumps(payload),
                       params=query_string,
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

    # Apply the changes to the new revision changeset
    apply_nvue_changeset(url,changeset)

    # Check if the changeset was applied
    is_config_applied(url,changeset)

def nvue_get(url,path):
    r = requests.get(url=url + path,
                     auth=auth,
                     verify=False)
    return(r.json())

def get_link_status(switches):
    print("===Current Link State===")
    for switch in switches:
        print("===Switch name: " + switch + "===")
        interfaces = nvue_get("https://" + switch + ":8765/nvue_v1","/interface")
        for interface in interfaces:
            if "swp" in interface:
                print("Interface: " + interface)
                for state in interfaces[interface]['link']['state']:
                    print("State: " + state)

def link_status_down(spines, leafs, totalLinks):
    # Bring down switch-leaf interfaces
    # Discover LLDP neighbor of the switch interfaces (swp with link speed 400G)
    # Collate interfaces per leaf eg: leaf01 = ["swp1s0","swp1s1"]
    discovery = {}
    leafDiscovery = {}
    for spine in spines:
        leafDiscovery[spine] = {}
        for leaf in leafs:
            leafDiscovery[spine][leaf] = []
        discovery[spine] = []
        interfaces = nvue_get("https://" + spine +":8765/nvue_v1","/interface?rev=operational")
        for interface in interfaces:
            if "swp" in interface and "speed" in interfaces[interface]["link"].keys():
                if interfaces[interface]["link"]["speed"] == "400G":
                    details = {}
                    details["LocalPort"] = interface
                    if "lldp" in interfaces[interface].keys():
                        for neighbor in interfaces[interface]["lldp"]["neighbor"]:
                            details["Neighbor"] = neighbor
                            details["RemotePort"] = interfaces[interface]["lldp"]["neighbor"][neighbor]['port']['name']
                            if neighbor in leafs:
                                leafDiscovery[spine][neighbor].append({"LocalPort": details["LocalPort"], "RemotePort": details["RemotePort"]})
                        discovery[spine].append(details)
                    else:
                        print(spine + " - " + interface + " - No neighbors found!")
                else:
                    print(spine + " - " + interface + " is not a 400G interface!")
            else:
                print(spine + " - " + interface + " is not a swp or/and we are unable to determine the link speed!")

    for switch in discovery:
        print("Switch name: " + switch)
        if not discovery[switch]:
            print("No neighbors found!")
        else:
            print("\nLocal port Neighbor Remote port\n---------- -------- -----------\n")
            for port in discovery[switch]:
                print(port["LocalPort"] + "  " + port["Neighbor"] + "  " + port["RemotePort"] + "\n")
        
    # Bring down the links
    downLinks = {}
    for spine in leafDiscovery:
        downLinks[spine] = []
        for leaf in leafDiscovery[spine]:
            # Keep any ports already recorded for this leaf from another spine
            downLinks.setdefault(leaf, [])
            if not leafDiscovery[spine][leaf]:
                print("No link(s) between " + leaf + " and " + spine)
            else:
                spineBody = {}
                leafBody = {}
                print("===Bringing down " + str(totalLinks) + " link(s) between " + leaf + " and " + spine + "===")
                i = 0
                for interfaces in leafDiscovery[spine][leaf]:
                    if i < totalLinks:
                        # Build the spine int body
                        spineBody[interfaces['LocalPort']] = {'link':{'state':{'down':{}}}}
                        downLinks[spine].append(interfaces['LocalPort'])
                        # Build the leaf int body
                        leafBody[interfaces['RemotePort']] = {'link':{'state':{'down':{}}}}
                        downLinks[leaf].append(interfaces['RemotePort'])
                        i += 1
                    else:
                        break
                apply_new_config("https://" + spine + ":8765/nvue_v1","/interface",spineBody)
                apply_new_config("https://" + leaf + ":8765/nvue_v1","/interface",leafBody)
    # Return the ports brought down on every spine and leaf
    return downLinks
 
def link_status_up(downLinks):
    # Bring the links up
    # The script assumes that the links were first brought down, and passed as a parameter
    for switch in downLinks:
        switchBody = {}
        print("===Bringing up the links on " + switch + "===")
        for interface in downLinks[switch]:
            switchBody[interface] = {'link':{'state':{'up':{}}}}
        apply_new_config("https://" + switch + ":8765/nvue_v1","/interface",switchBody)

if __name__ == "__main__":
    switches = ['leaf01','leaf02','spine01']
    spines = ['spine01']
    leafs = ['leaf01', 'leaf02']
    time.sleep(DUMMY_SLEEP)
    get_link_status(switches)
    time.sleep(DUMMY_SLEEP)
    downLinks = link_status_down(spines, leafs, 1)
    time.sleep(DUMMY_SLEEP)
    link_status_up(downLinks)
    get_link_status(switches)
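
The script changes the link state by patching a nested dictionary under /interface. As a minimal sketch, the helper below builds that body for a list of ports. The function name build_link_state_payload is hypothetical, and the payload shape (the state name as a key with an empty object as its value) is assumed from the script above and from the way get_link_status() reads the state.

```python
import json


def build_link_state_payload(ports, state):
    """Return an /interface PATCH body that sets the link state of each port.

    Hypothetical helper; the payload shape is assumed from the script above.
    """
    if state not in ("up", "down"):
        raise ValueError("state must be 'up' or 'down'")
    # One entry per port: {"<port>": {"link": {"state": {"<state>": {}}}}}
    return {port: {"link": {"state": {state: {}}}} for port in ports}


if __name__ == "__main__":
    body = build_link_state_payload(["swp1s0", "swp1s1"], "down")
    print(json.dumps(body, indent=2))
```

Keeping the payload construction in one function means the "down" and "up" paths cannot drift apart, and the body can be checked without contacting a switch.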

Reboot example

In the following example, switch_reboot() reboots the switches passed as a parameter. issu_reboot() configures the ISSU (In-Service System Upgrade) reboot mode passed in the mode parameter on each switch, then reboots the switches.

ISSU and Switch Reboot Script
#!/usr/bin/env python3

import os
import requests
from requests.auth import HTTPBasicAuth
import json
import time
from urllib3.exceptions import InsecureRequestWarning


auth = HTTPBasicAuth(username="vagrant", password="vagrant")
mime_header = {"Content-Type": "application/json"}

DUMMY_SLEEP = 5  # In seconds
POLL_APPLIED = 1  # in seconds
RETRIES = 10
 
# Suppress the warnings from urllib3
requests.packages.urllib3.disable_warnings(category=InsecureRequestWarning)

def print_request(r: requests.Request):
    print("API URL:", r.url)
    print("API Body:", r.body)

def print_response(r: requests.Response):
    print("API response:", json.dumps(r.json(), indent=2))

def create_nvue_changest(url):
    r = requests.post(url=url + "/revision",
                      auth=auth,
                      verify=False)
    print_response(r)
    response = r.json()
    changeset = response.popitem()[0]
    return changeset

def apply_nvue_changeset(url,changeset):
    apply_payload = {"state": "apply", "auto-prompt": {"ays": "ays_yes"}}
    url = url + "/revision/" + requests.utils.quote(changeset,
                                                               safe="")
    r = requests.patch(url=url,
                       auth=auth,
                       verify=False,
                       data=json.dumps(apply_payload),
                       headers=mime_header)
    print_response(r)

def is_config_applied(url,changeset) -> bool:
    # Check if the configuration was indeed applied
    global RETRIES
    global POLL_APPLIED
    retries = RETRIES
    while retries > 0:
        r = requests.get(url=url + "/revision/" + requests.utils.quote(changeset, safe=""),
                         auth=auth,
                         verify=False)
        response = r.json()

        if response["state"] == "applied":
            return True
        retries -= 1
        time.sleep(POLL_APPLIED)
    return False

def apply_new_config(url,path,payload):
    # Create a new revision ID
    changeset = create_nvue_changest(url)
    print("Using NVUE Changeset: '{}'".format(changeset))

    # Patch the new configuration
    query_string = {"rev": changeset}
    r = requests.patch(url=url + path,
                       auth=auth,
                       verify=False,
                       data=json.dumps(payload),
                       params=query_string,
                       headers=mime_header)
    print_request(r.request)
    print_response(r)

    # Apply the changes to the new revision changeset
    apply_nvue_changeset(url,changeset)

    # Check if the changeset was applied
    is_config_applied(url,changeset)

def nvue_post_action(url,payload):
    r = requests.post(url=url,
                    auth=auth,
                    verify=False,
                    data=json.dumps(payload),
                    headers=mime_header)
    print_request(r.request)
    print_response(r)

def nvue_get(url,path):
    r = requests.get(url=url + path,
                     auth=auth,
                     verify=False)
    return(r.json())

def switch_reboot(switches):
    for switch in switches:
        # Reboot the switch
        payload = {
            "@reboot":{
                "state":"start",
                "parameters":{
                    "no-confirm": True
                    }
                }
            }
        nvue_post_action("https://" + switch + ":8765/nvue_v1/system",payload)

        # Wait until the switch answers ping again
        time.sleep(DUMMY_SLEEP)  # give the reboot time to start
        # os.system() returns 0 only when ping receives a reply
        while os.system(f"ping -c 1 {switch}") != 0:
            time.sleep(POLL_APPLIED)
        print(json.dumps(nvue_get("https://" + switch + ":8765/nvue_v1","/system/reboot")["history"]["1"]))

def issu_reboot(switches, mode):
    for switch in switches:
        # Configure ISSU in fast/warm mode
        body = {
            "reboot":{
                "mode": mode
            }

        }
        apply_new_config("https://" + switch + ":8765/nvue_v1","/system",body)
        
    switch_reboot(switches)

if __name__ == "__main__":
    switches = ['leaf01','leaf02','spine01']
    issu_switches = ['spine01']
    time.sleep(DUMMY_SLEEP)
    switch_reboot(switches)
    time.sleep(DUMMY_SLEEP)
    issu_reboot(issu_switches, "fast")
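
Both is_config_applied() and the wait for the switch to come back after a reboot follow the same pattern: check a condition, sleep, and try again a limited number of times. The following is a minimal sketch of that pattern as one reusable function. The name wait_until and its parameters are hypothetical and are not part of the scripts above or of any NVUE library.

```python
import time


def wait_until(predicate, retries=10, delay=1.0, sleep=time.sleep):
    """Call predicate() up to `retries` times, sleeping `delay` seconds between tries.

    Returns True as soon as predicate() is truthy, or False if every try fails.
    The sleep function is a parameter so that callers can replace it in tests.
    """
    for _ in range(retries):
        if predicate():
            return True
        sleep(delay)
    return False


if __name__ == "__main__":
    # Stand-in for a real check: succeeds on the third attempt.
    attempts = []

    def ready():
        attempts.append(1)
        return len(attempts) >= 3

    print("ready:", wait_until(ready, retries=5, delay=0.1))
    print("attempts:", len(attempts))
```

With a helper like this, a caller could pass a predicate that pings the switch or that compares the revision state to "applied", and a bounded retry count prevents the script from waiting forever on a switch that never returns.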

Try the API

To try out the NVUE REST API, use the NVUE API Lab available on NVIDIA Air. The lab provides a basic example to help you get started. You can also try out the other examples in this document.

Resources

For information about using the NVUE REST API, refer to the NVUE API Swagger documentation. The full object model download is available here.

Considerations

Precision Time Protocol - PTP

Cumulus Linux supports IEEE 1588-2008 Precision Time Protocol (PTPv2), which defines the algorithm and method for synchronizing clocks of various devices across packet-based networks, including Ethernet switches and IP routers.

PTP is capable of sub-microsecond accuracy. The clocks are in a master-slave hierarchy, where the slaves synchronize to their masters, which can be slaves to their own masters. The Best Master Clock (BMC) algorithm, which runs on every clock, creates and updates the hierarchy automatically. The Grand Master clock is the top-level master. To provide a high degree of accuracy, a Global Positioning System (GPS) time source typically synchronizes the Grand Master clock.

In the following example:

Cumulus Linux and PTP

PTP in Cumulus Linux uses the linuxptp package that includes the following programs:

Cumulus Linux supports:

  • You cannot run both PTP and NTP on the switch.
  • PTP supports the default VRF only.
  • PTP on the NVIDIA SN5400 switch is in beta.
  • 1G links might have a lower accuracy for PTP due to hardware limitations. If your application needs high accuracy from PTP, use higher link speeds.

Basic Configuration

Basic PTP configuration requires you:

If you configure PTP with Linux commands, you must also enable PTP timestamping; see step 1 of the Linux procedure below. NVUE enables timestamping when you enable PTP on the switch.

The basic configuration shown below uses the default PTP settings:

To configure other settings, such as the PTP profile, domain, priority, and DSCP, the PTP interface transport mode and timers, and PTP monitoring, see the Optional Configuration sections below.

Disable NTP

Remove the default NTP configuration on the switch:

cumulus@switch:~$ nv unset service ntp mgmt server 0.cumulusnetworks.pool.ntp.org
cumulus@switch:~$ nv unset service ntp mgmt server 1.cumulusnetworks.pool.ntp.org
cumulus@switch:~$ nv unset service ntp mgmt server 2.cumulusnetworks.pool.ntp.org
cumulus@switch:~$ nv unset service ntp mgmt server 3.cumulusnetworks.pool.ntp.org
cumulus@switch:~$ nv config apply

Stop and disable the NTP service in the management VRF:

cumulus@switch:~$ sudo systemctl stop ntpsec@mgmt.service
cumulus@switch:~$ sudo systemctl disable ntpsec@mgmt.service
  1. Edit the /etc/ntpsec/ntp.conf file to comment out the default NTP configuration:

    cumulus@switch:~$ sudo nano /etc/ntpsec/ntp.conf
    # server 0.cumulusnetworks.pool.ntp.org iburst
    # server 1.cumulusnetworks.pool.ntp.org iburst
    # server 2.cumulusnetworks.pool.ntp.org iburst
    # server 3.cumulusnetworks.pool.ntp.org iburst
    
  2. Stop and disable the NTP service in the management VRF:

    cumulus@switch:~$ sudo systemctl stop ntpsec@mgmt.service
    cumulus@switch:~$ sudo systemctl disable ntpsec@mgmt.service
    

Configure PTP

The NVUE nv set service ptp commands require an instance number (1 in the example command below) for management purposes.

When you enable the PTP service with the nv set service ptp <instance> enable on command, NVUE restarts the switchd service, which causes all network ports to reset in addition to resetting the switch hardware configuration.

The following example commands enable PTP on the routed ports swp1 and swp2:

cumulus@switch:~$ nv set service ptp 1 enable on
cumulus@switch:~$ nv set interface swp1 ip address 10.0.0.9/32
cumulus@switch:~$ nv set interface swp2 ip address 10.0.0.10/32
cumulus@switch:~$ nv set interface swp1 ptp enable on
cumulus@switch:~$ nv set interface swp2 ptp enable on
cumulus@switch:~$ nv config apply
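
If you prefer to stage the same settings through the REST API, the patch body is a nested dictionary. The following minimal sketch builds such a body from CLI-style paths. It assumes that the REST object model mirrors the nv set path (as the service ntp mgmt listen entry in the API script payload suggests); the helper names nest and merge are hypothetical, and the key names are not verified against the NVUE object model, so check them before you use the result. IP addresses are left out because their JSON form is not shown in this guide.

```python
import json


def nest(path, value):
    """Fold a list of keys into a nested dict that ends in value."""
    payload = value
    for key in reversed(path):
        payload = {key: payload}
    return payload


def merge(target, extra):
    """Recursively merge dict `extra` into dict `target` and return target."""
    for key, val in extra.items():
        if isinstance(val, dict) and isinstance(target.get(key), dict):
            merge(target[key], val)
        else:
            target[key] = val
    return target


# Assumed mapping of the PTP enable commands above to a patch body
payload = {}
merge(payload, nest(["service", "ptp", "1", "enable"], "on"))
merge(payload, nest(["interface", "swp1", "ptp", "enable"], "on"))
merge(payload, nest(["interface", "swp2", "ptp", "enable"], "on"))

if __name__ == "__main__":
    print(json.dumps(payload, indent=2))
```

A body like this could then be sent with a root patch against a new revision and applied in the same way as the full_config_example() payload.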

The configuration writes to the /etc/ptp4l.conf file.

The following example commands enable PTP on the trunk port swp1 in VLAN 10:

cumulus@switch:~$ nv set service ptp 1 enable on
cumulus@switch:~$ nv set bridge domain br_default
cumulus@switch:~$ nv set bridge domain br_default type vlan-aware
cumulus@switch:~$ nv set bridge domain br_default vlan 10-30
cumulus@switch:~$ nv set bridge domain br_default vlan 10 ptp enable on
cumulus@switch:~$ nv set interface vlan10 type svi
cumulus@switch:~$ nv set interface vlan10 ip address 10.1.10.2/24
cumulus@switch:~$ nv set interface vlan10 ptp enable on
cumulus@switch:~$ nv set interface swp1 bridge domain br_default
cumulus@switch:~$ nv set interface swp1 bridge domain br_default vlan 10
cumulus@switch:~$ nv set interface swp1 ptp enable on
cumulus@switch:~$ nv config apply

  • You can configure only one address, either IPv4 or IPv6.
  • For IPv6, set the trunk port transport mode to IPv6.

The configuration writes to the /etc/ptp4l.conf file.

The following example commands enable PTP on the access port swp2 in VLAN 10:

cumulus@switch:~$ nv set service ptp 1 enable on
cumulus@switch:~$ nv set bridge domain br_default
cumulus@switch:~$ nv set bridge domain br_default type vlan-aware
cumulus@switch:~$ nv set bridge domain br_default vlan 10-30
cumulus@switch:~$ nv set bridge domain br_default vlan 10 ptp enable on
cumulus@switch:~$ nv set interface vlan10 type svi
cumulus@switch:~$ nv set interface vlan10 ip address 10.1.10.2/24
cumulus@switch:~$ nv set interface swp2 bridge domain br_default
cumulus@switch:~$ nv set interface swp2 bridge domain br_default access 10
cumulus@switch:~$ nv set interface swp2 ptp enable on
cumulus@switch:~$ nv config apply

  • You can configure only one address, either IPv4 or IPv6.
  • For IPv6, set the trunk port transport mode to IPv6.
  • When you enable PTP on a bridge port, you must also enable PTP on the VLAN configured for the port with the nv set bridge domain <domain> vlan <vlan-id> ptp enable on command.

The configuration writes to the /etc/ptp4l.conf file.

Configure NVUE to stop managing PTP configuration files:
cumulus@switch:~$ nv set system config apply ignore /etc/linuxptp/phc2sys.conf
cumulus@switch:~$ nv set system config apply ignore /etc/ptp4l.conf
cumulus@switch:~$ nv set system config apply ignore /etc/cumulus/switchd.d/ptp.conf
cumulus@switch:~$ nv config apply
  1. Edit the /etc/cumulus/switchd.d/ptp.conf file to set the ptp.timestamping parameter to TRUE:

    cumulus@switch:~$ sudo nano /etc/cumulus/switchd.d/ptp.conf
    ...
    ptp.timestamping     TRUE
    ...
    
  2. Restart the switchd service:

    cumulus@switch:~$ sudo systemctl restart switchd.service
    

Restarting the switchd service causes all network ports to reset in addition to resetting the switch hardware configuration.

  3. Edit the Default interface options section of the /etc/ptp4l.conf file to configure the interfaces on the switch that you want to use for PTP.
cumulus@switch:~$ sudo nano /etc/ptp4l.conf
...
#
# Default interface options
#
time_stamping                  hardware
# Interfaces in which ptp should be enabled
# these interfaces should be routed ports
# if an interface does not have an ip address
# the ptp4l will not work as expected.
[swp1]
udp_ttl                 1
masterOnly              0
delay_mechanism         E2E
[swp2]
udp_ttl                 1
masterOnly              0
delay_mechanism         E2E

For a trunk VLAN, add the VLAN configuration to the switch port stanza: set l2_mode to trunk, vlan_intf to the VLAN interface, and src_ip to the IP address of the VLAN interface:

[swp1]
l2_mode                 trunk
vlan_intf               vlan10
src_ip                  10.1.10.2
udp_ttl                 1
masterOnly              0
delay_mechanism         E2E
network_transport       RAWUDPv4

For a switch port VLAN, add the VLAN configuration to the switch port stanza: set l2_mode to access, vlan_intf to the VLAN interface, and src_ip to the IP address of the VLAN interface:

[swp2]
l2_mode                 access
vlan_intf               vlan10
src_ip                  10.1.10.2
udp_ttl                 1
masterOnly              0
delay_mechanism         E2E
network_transport       RAWUDPv4
  4. Edit the /etc/linuxptp/phc2sys.conf file to add the following parameters:

    cumulus@switch:~$ sudo nano /etc/linuxptp/phc2sys.conf
    # phc2sys is enabled
    [global]
    logging_level         6
    path_trace_enabled    0
    use_syslog            1
    verbose               0
    domainNumber          0
    
  5. Enable and start the ptp4l and phc2sys services:

    cumulus@switch:~$ sudo systemctl enable ptp4l.service phc2sys.service
    cumulus@switch:~$ sudo systemctl start ptp4l.service phc2sys.service
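
The per-port stanzas in /etc/ptp4l.conf follow a regular pattern, so you can generate them instead of typing them by hand. The following is a minimal sketch that renders a routed, trunk, or access stanza with the same option names and values as the examples above. The function render_port_stanza is hypothetical and is not part of Cumulus Linux or linuxptp.

```python
def render_port_stanza(port, l2_mode=None, vlan_intf=None, src_ip=None,
                       transport=None):
    """Render one interface stanza in the layout used by the examples above.

    For a routed port, pass only the port name. For a VLAN port, pass
    l2_mode ("trunk" or "access") together with vlan_intf and src_ip.
    """
    options = []
    if l2_mode is not None:
        if l2_mode not in ("trunk", "access"):
            raise ValueError("l2_mode must be 'trunk' or 'access'")
        if not vlan_intf or not src_ip:
            raise ValueError("VLAN ports need vlan_intf and src_ip")
        options += [("l2_mode", l2_mode),
                    ("vlan_intf", vlan_intf),
                    ("src_ip", src_ip)]
    options += [("udp_ttl", 1), ("masterOnly", 0), ("delay_mechanism", "E2E")]
    if transport is not None:
        options.append(("network_transport", transport))
    lines = [f"[{port}]"]
    # Pad the option name to 24 columns to match the file layout
    lines += [f"{key:<24}{value}" for key, value in options]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_port_stanza("swp1"))
    print(render_port_stanza("swp2", l2_mode="access", vlan_intf="vlan10",
                             src_ip="10.1.10.2", transport="RAWUDPv4"))
```

The sketch only produces text; writing the result into /etc/ptp4l.conf and restarting the ptp4l service remain manual steps.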
    

Global Configuration

Cumulus Linux provides several ways to modify the default basic global configuration. You can:

When a predefined profile is set, NVUE does not allow you to configure global parameters. Do not edit the Linux /etc/ptp4l.conf file to modify the global parameters when a predefined profile is in use. For information about profiles, see PTP Profiles.

Clock Domains

PTP domains allow different independent timing systems to be present in the same network without confusing each other. A PTP domain is a network or a portion of a network within which all the clocks synchronize. Every PTP message contains a domain number. A PTP instance works in only one domain and ignores messages that contain a different domain number. Cumulus Linux supports only one domain in the system.

A network can contain multiple PTP clock domains. PTP isolates each domain from other domains so that each domain is a different PTP network. You can specify a domain number between 0 and 127.
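
The domain rule described above (one domain per instance, messages from other domains ignored, and a valid range of 0 to 127) can be sketched in a few lines. The class and function names below are hypothetical and only illustrate the behavior; they do not reflect how ptp4l implements it.

```python
from dataclasses import dataclass

MIN_DOMAIN = 0
MAX_DOMAIN = 127


def validate_domain(number):
    """Return number if it is a valid PTP domain for this guide's range."""
    if not MIN_DOMAIN <= number <= MAX_DOMAIN:
        raise ValueError(f"domain must be between {MIN_DOMAIN} and {MAX_DOMAIN}")
    return number


@dataclass
class PtpInstance:
    """A PTP instance that works in exactly one domain."""
    domain: int

    def __post_init__(self):
        validate_domain(self.domain)

    def accepts(self, message_domain):
        # Messages that carry a different domain number are ignored
        return message_domain == self.domain


if __name__ == "__main__":
    instance = PtpInstance(domain=3)
    for domain in (0, 3, 24):
        print(domain, "accepted" if instance.accepts(domain) else "ignored")
```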

The following example commands configure domain 3 when a profile is not set:

cumulus@switch:~$ nv set service ptp 1 domain 3
cumulus@switch:~$ nv config apply

Edit the Default Data Set section of the /etc/ptp4l.conf file to change the domainNumber setting, then restart the ptp4l service.

cumulus@switch:~$ sudo nano /etc/ptp4l.conf
[global]
#
# Default Data Set
#
slaveOnly               0
priority1               128
priority2               128
domainNumber            3
...
cumulus@switch:~$ sudo systemctl restart ptp4l.service

Clock Timestamp Mode

The Cumulus Linux switch provides the following clock timestamp modes:

One-step mode significantly reduces the number of PTP messages. Two-step mode is the default configuration.

Cumulus Linux supports one-step mode on Spectrum-2 and later.

The following example commands configure one-step mode when a profile is not set:

cumulus@switch:~$ nv set service ptp 1 two-step off
cumulus@switch:~$ nv config apply

To revert the clock timestamp mode to the default setting (two-step mode), run the nv set service ptp 1 two-step on command.

To set the clock timestamp mode for a custom profile based on IEEE1588, ITU 8275-1 or ITU 8275-2, run the nv set service ptp <instance-id> profile <profile-id> two-step command. For example, to set one-step mode for the custom profile called CUSTOM1, run the nv set service ptp 1 profile CUSTOM1 two-step off command.

Edit the Default Data Set section of the /etc/ptp4l.conf file to change the twoStepFlag setting to 0, then restart the ptp4l service.

cumulus@switch:~$ sudo nano /etc/ptp4l.conf
[global]
#
# Default Data Set
#
slaveOnly               0
priority1               254
priority2               254
domainNumber            3

twoStepFlag             0
dscp_event              43
dscp_general            43
udp6_scope              0x0E
...
cumulus@switch:~$ sudo systemctl restart ptp4l.service

To revert the clock timestamp mode to the default setting (two-step mode), change the twoStepFlag setting to 1.
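
If you manage /etc/ptp4l.conf from a script instead of an editor, the change is a single key in the [global] section. The following minimal sketch rewrites one option in the text of the file. The function set_global_option is hypothetical; it works on a string only, so reading the file, writing it back, and restarting the ptp4l service are left to the caller.

```python
def set_global_option(conf_text, key, value):
    """Return conf_text with `key` set to `value` in the [global] section.

    If the key is missing from [global], it is added at the end of that
    section. Text without a [global] section is returned unchanged.
    """
    out = []
    in_global = False
    done = False
    new_line = f"{key:<24}{value}"
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Leaving [global] without finding the key: add it first
            if in_global and not done:
                out.append(new_line)
                done = True
            in_global = stripped == "[global]"
        elif in_global and not done and stripped.split()[:1] == [key]:
            line = new_line
            done = True
        out.append(line)
    if in_global and not done:
        out.append(new_line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "[global]\ntwoStepFlag             1\n[swp1]\nudp_ttl                 1\n"
    print(set_global_option(sample, "twoStepFlag", 0))
```

Comment lines and blank lines pass through unchanged because the key must be the first word on the line to match.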

PTP Priority

The BMC selects the PTP master according to the criteria in the following order:

  1. Priority 1
  2. Clock class
  3. Clock accuracy
  4. Clock variance
  5. Priority 2
  6. Port ID

Use the PTP priority to select the Best Master Clock. You can set priority 1 and 2:

The range for both priority 1 and priority 2 is between 0 and 255. The default priority is 128. For the boundary clock, use a number above 128. The lower value takes precedence.
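
The selection order above can be pictured as a tuple comparison in which the lowest value wins at the first field that differs. The following is a simplified sketch of that ordering only, not the full BMC algorithm from IEEE 1588. The class ClockInfo, the function best_master, and the default attribute values are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClockInfo:
    """Attributes compared when selecting the master (illustrative values)."""
    name: str
    priority1: int = 128
    clock_class: int = 248
    clock_accuracy: int = 254
    clock_variance: int = 65535
    priority2: int = 128
    port_id: str = ""

    def rank(self):
        # Same order as the list above; lower values are preferred
        return (self.priority1, self.clock_class, self.clock_accuracy,
                self.clock_variance, self.priority2, self.port_id)


def best_master(clocks):
    """Return the clock with the lowest rank tuple."""
    return min(clocks, key=ClockInfo.rank)


if __name__ == "__main__":
    grandmaster = ClockInfo("grandmaster", priority1=100, clock_class=6, port_id="a")
    boundary = ClockInfo("boundary", priority1=200, port_id="b")
    print(best_master([boundary, grandmaster]).name)
```

Because priority 1 is compared first, a boundary clock configured with a value above 128 loses to any clock that keeps the default, regardless of the remaining attributes.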

The following example commands set priority 1 and priority 2 to 200 when a profile is not set:

cumulus@switch:~$ nv set service ptp 1 priority1 200
cumulus@switch:~$ nv set service ptp 1 priority2 200
cumulus@switch:~$ nv config apply

Edit the Default Data Set section of the /etc/ptp4l.conf file to change the priority1 setting, the priority2 setting, or both, then restart the ptp4l service.

cumulus@switch:~$ sudo nano /etc/ptp4l.conf
[global]
#
# Default Data Set
#
slaveOnly               0
priority1               200
priority2               200
domainNumber            3
...
cumulus@switch:~$ sudo systemctl restart ptp4l.service

Noise Transfer Servo

ITU-T noise transfer specifies the following key elements to measure, test, and classify the accuracy of a clock:

Cumulus Linux PTP has an option to use a servo specifically designed to handle the ITU-T Noise Transfer specification. When you use this option, the Noise Transfer Servo resolves the jitter and wander noise from the master clock.

  • To use Noise Transfer Servo, you need to enable SyncE on the switch and on PTP interfaces.
  • Cumulus Linux supports Noise Transfer Servo on Spectrum ASICs that support SyncE.
  • NVIDIA recommends you use Noise Transfer Servo with PTP Telecom profiles. If you use other profiles or choose not to use a profile, make sure to set the sync interval to -3 or better.
  • When you enable Noise Transfer Servo, the PTP log reports the offset once every two seconds instead of once every second.
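
PTP message intervals are base-2 logarithms of a period in seconds, so a sync interval of -3 means one message every 2^-3 seconds, or eight per second. The following minimal sketch does the conversion; the function names are hypothetical, and it assumes that "better" in the note above means a smaller (more frequent) value such as -4.

```python
def interval_seconds(log_interval):
    """Convert a PTP log interval to the period between messages in seconds."""
    return 2.0 ** log_interval


def messages_per_second(log_interval):
    """Convert a PTP log interval to a message rate."""
    return 1.0 / interval_seconds(log_interval)


if __name__ == "__main__":
    for value in (0, -3, -4):
        print(f"log interval {value}: "
              f"{interval_seconds(value)} s between messages, "
              f"{messages_per_second(value)} messages per second")
```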

To enable Noise Transfer Servo:

The following example enables PTP, sets the profile to default-itu-8275-1, enables SyncE, enables PTP on swp3, and enables Noise Transfer Servo.

cumulus@switch:~$ nv set service ptp 1 enable on
cumulus@switch:~$ nv set service ptp 1 current-profile default-itu-8275-1
cumulus@switch:~$ nv set system synce enable on
cumulus@switch:~$ nv set interface swp3 ptp enable on
cumulus@switch:~$ nv set service ptp 1 servo noise-transfer
cumulus@switch:~$ nv config apply

Edit the /etc/ptp4l.conf and the /etc/firefly_servo/servo.conf files; see examples below.

cumulus@switch:~$ sudo nano /etc/ptp4l.conf
[global]
#
# Default Data Set
#
slaveOnly                      0
free_running                   1
slave_event_monitor            /var/run/servo_slave_event_monitor
priority1                      128
priority2                      128
domainNumber                   24

twoStepFlag                    1
dscp_event                     46
dscp_general                   46
network_transport              L2
dataset_comparison             G.8275.x
G.8275.defaultDS.localPriority 128
ptp_dst_mac                    01:80:C2:00:00:0E
#
# Port Data Set
#
logAnnounceInterval            -3
logSyncInterval                -4
logMinDelayReqInterval         -4
announceReceiptTimeout         3
delay_mechanism                E2E
 
offset_from_master_min_threshold   -50
offset_from_master_max_threshold   50
mean_path_delay_threshold          200
tsmonitor_num_ts                   100
tsmonitor_num_log_sets             2
tsmonitor_num_log_entries          4
tsmonitor_log_wait_seconds         1
#
# Run time options
#
logging_level                  6
path_trace_enabled             0
use_syslog                     1
verbose                        0
summary_interval               0
#
# servo parameters
#
pi_proportional_const          0.000000
pi_integral_const              0.000000
pi_proportional_scale          0.700000
pi_proportional_exponent       -0.300000
pi_proportional_norm_max       0.700000
pi_integral_scale              0.300000
pi_integral_exponent           0.400000
pi_integral_norm_max           0.300000
first_step_threshold           0.000020
step_threshold                 0.000000025
servo_offset_threshold         20
servo_num_offset_values        10
write_phase_mode               1
max_frequency                  50000000
sanity_freq_limit              0
#
# Default interface options
#
time_stamping                  hardware

[swp3]
udp_ttl                      1
masterOnly                   0
delay_mechanism              E2E
cumulus@switch:~$ sudo nano /etc/firefly_servo/servo.conf
[global]
free_running                        0
domainNumber                        24

offset_from_master_min_threshold    -50
offset_from_master_max_threshold    50

# Debugging & Logging
doca_logging_level                  50

init_max_time_adjustment            0
max_time_adjustment                 1500
hold_over_timer                     0
# Sampling Window & servo logic
servo_window_timer                  3000
servo_window_min_samples            10
servo_num_offset_values             5

To show Noise Transfer Servo configuration settings, run the nv show service ptp <instance-id> servo command:

cumulus@switch:~$ nv show service ptp 1 servo
       operational  applied
-----  -----------  --------------
servo               noise-transfer

Ignore Source Port ID

If the master clock has Announce messages disabled, you can disable the source port ID check in Sync, Follow Up, and Delay Response PTP messages. Disabling the check is also useful with the rare PTP implementations in which the master sends these messages with a source port ID that differs from the one in its Announce messages.

To disable the source port ID check, run the nv set service ptp 1 ignore-source-id on command:

cumulus@switch:~$ nv set service ptp 1 ignore-source-id on
cumulus@switch:~$ nv config apply

To reenable the source port ID check, run the nv set service ptp 1 ignore-source-id off command.

To disable the source port ID check, edit the /etc/ptp4l.conf file to add the ignore_source_id 1 parameter, then restart the ptp4l service.

cumulus@switch:~$ sudo nano /etc/ptp4l.conf
[global]
#
# Default Data Set
#
slaveOnly                      0
priority1                      128
priority2                      128
domainNumber                   0
ignore_source_id               1
...
cumulus@switch:~$ sudo systemctl restart ptp4l.service
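
Before you restart the service, you might want to confirm that the parameter is in the [global] section and not in an interface stanza. The following is a minimal shell sketch, not a Cumulus Linux tool: get_global_option is a hypothetical helper that reads a ptp4l-style file on standard input and prints the value of one [global] option. The sample input is a trimmed copy of the example above.

```shell
#!/bin/sh
# get_global_option: hypothetical helper (not shipped with Cumulus Linux).
# Reads a ptp4l-style configuration on stdin and prints the value of the
# option named in $1, but only when it appears in the [global] section.
get_global_option() {
    awk -v key="$1" '
        /^[ \t]*\[/ { in_global = ($1 == "[global]"); next }  # section header
        /^[ \t]*#/  { next }                                   # comment line
        in_global && $1 == key { print $2; exit }
    '
}

# Demo on inline sample text instead of the live file.
get_global_option ignore_source_id <<'EOF'
[global]
#
# Default Data Set
#
slaveOnly                      0
priority1                      128
priority2                      128
domainNumber                   0
ignore_source_id               1

[swp1]
udp_ttl                        1
EOF
```

To check the file on the switch, redirect it into the function, for example get_global_option ignore_source_id < /etc/ptp4l.conf. An empty result means the option is not set in [global].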

Multicast MAC Address

PTP over Ethernet uses the following types of multicast MAC addresses:

For Telecom Profile ITU 8275-1, set the multicast MAC address to non-forwarding.

To set the multicast MAC address to non-forwarding:

cumulus@switch:~$ nv set service ptp 1 multicast-mac non-forwarding
cumulus@switch:~$ nv config apply

To set the multicast MAC address to forwarding, run the nv unset service ptp 1 multicast-mac non-forwarding command.

To set the multicast MAC address to non-forwarding, edit the /etc/ptp4l.conf file to add the ptp_dst_mac parameter, then restart the ptp4l service.

cumulus@switch:~$ sudo nano /etc/ptp4l.conf
...
#
# Run time options
#
logging_level                  6
path_trace_enabled             0
use_syslog                     1
verbose                        0
summary_interval               0
ptp_dst_mac                    01:80:C2:00:00:0E
cumulus@switch:~$ sudo systemctl restart ptp4l.service

Optional Global Configuration

Optional global PTP configuration includes configuring the DiffServ code point (DSCP). You can configure the DSCP value for all PTP IPv4 packets originated locally. You can set a value between 0 and 63.

cumulus@switch:~$ nv set service ptp 1 ip-dscp 22
cumulus@switch:~$ nv config apply

Edit the Default Data Set section of the /etc/ptp4l.conf file to change the dscp_event setting for PTP messages that trigger a timestamp read from the clock and the dscp_general setting for PTP messages that carry commands, responses, information, or timestamps.

After you save the /etc/ptp4l.conf file, restart the ptp4l service.

cumulus@switch:~$ sudo nano /etc/ptp4l.conf
[global]
#
# Default Data Set
#
slaveOnly               0
priority1               200
priority2               200
domainNumber            3

twoStepFlag             1
dscp_event              22
dscp_general            22
udp6_scope              0x0E
...
cumulus@switch:~$ sudo systemctl restart ptp4l.service
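
Because the DSCP value must be between 0 and 63, a small check before you edit the file can catch typing mistakes. The sketch below assumes a hypothetical helper named dscp_conf_lines; it validates the value and prints the two lines to place in the Default Data Set section. It only prints text and does not modify /etc/ptp4l.conf.

```shell
#!/bin/sh
# dscp_conf_lines: hypothetical helper, for illustration only.
# Validates a DSCP value (0-63) and prints the matching ptp4l.conf lines.
dscp_conf_lines() {
    case "$1" in
        ''|*[!0-9]*)
            echo "DSCP must be a non-negative integer" >&2
            return 1
            ;;
    esac
    if [ "$1" -gt 63 ]; then
        echo "DSCP must be between 0 and 63" >&2
        return 1
    fi
    # The same value applies to event and general messages in the example above.
    printf 'dscp_event              %s\n' "$1"
    printf 'dscp_general            %s\n' "$1"
}

dscp_conf_lines 22
```

The example uses one value for both settings because the NVUE ip-dscp option shown above takes a single value; if you edit the file by hand you can choose to set the two parameters separately.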

PTP Interface Configuration

Cumulus Linux provides several ways to modify the default basic interface configuration. You can:

When a profile is in use, avoid configuring the following interface configuration parameters with NVUE or in the Linux configuration file so that the interface retains its profile settings.

Transport Mode

By default, Cumulus Linux encapsulates PTP messages in UDP IPv4 frames. To encapsulate PTP messages on an interface in UDP IPv6 frames:

cumulus@switch:~$ nv set interface swp1 ptp transport ipv6
cumulus@switch:~$ nv config apply

Edit the Default interface options section of the /etc/ptp4l.conf file to change the network_transport setting for the interface, then restart the ptp4l service.

cumulus@switch:~$ sudo nano /etc/ptp4l.conf
...
# Default interface options
#
time_stamping           hardware

# Interfaces in which ptp should be enabled
# these interfaces should be routed ports
# if an interface does not have an ip address
# the ptp4l will not work as expected.

[swp1]
udp_ttl                 1
masterOnly              0
delay_mechanism         E2E
network_transport       RAWUDPv6

[swp2]
udp_ttl                 1
masterOnly              0
delay_mechanism         E2E
network_transport       RAWUDPv6
...
cumulus@switch:~$ sudo systemctl restart ptp4l.service

Message Mode

Cumulus Linux supports the following PTP message modes:

Multicast mode is the default setting; when you enable PTP on an interface, the message mode is multicast.

To change the message mode to mixed on swp1:

cumulus@switch:~$ nv set interface swp1 ptp mixed-multicast-unicast on
cumulus@switch:~$ nv config apply

To change the message mode back to the default setting of multicast on swp1:

cumulus@switch:~$ nv set interface swp1 ptp mixed-multicast-unicast off
cumulus@switch:~$ nv config apply

Edit the Default interface options section of the /etc/ptp4l.conf file to add the hybrid_e2e 1 line under the interface, then restart the ptp4l service.

cumulus@switch:~$ sudo nano /etc/ptp4l.conf
...
# Default interface options
#
time_stamping           hardware

# Interfaces in which ptp should be enabled
# these interfaces should be routed ports
# if an interface does not have an ip address
# the ptp4l will not work as expected.

[swp1]
hybrid_e2e              1
...
cumulus@switch:~$ sudo systemctl restart ptp4l.service

To change the message mode back to the default setting of multicast, remove the hybrid_e2e line under the interface, then restart the ptp4l service.

PTP Interface Timers

You can set the following timers for PTP messages.

Timer Description
announce-interval The average interval between successive Announce messages. Specify the value as a power of two in seconds.
announce-timeout The number of announce intervals that have to occur without receiving an Announce message before a timeout occurs. Make sure that this value is longer than the announce-interval in your network.
delay-req-interval The minimum average time interval allowed between successive Delay Request messages.
sync-interval The interval between PTP synchronization messages on an interface. Specify the value as a power of two in seconds.

The following example sets the announce interval between successive Announce messages on swp1 to -1.

cumulus@switch:~$ nv set interface swp1 ptp timers announce-interval -1
cumulus@switch:~$ nv config apply

The following example sets the mean sync-interval for multicast messages on swp1 to -5.

cumulus@switch:~$ nv set interface swp1 ptp timers sync-interval -5
cumulus@switch:~$ nv config apply
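
The timer values are exponents, not seconds: a value of n means an interval of 2^n seconds, so -1 is half a second and -5 is 1/32 of a second. The sketch below uses two hypothetical helpers to convert a log value to seconds and to work out how long an announce timeout lasts, which is the announce-timeout count multiplied by the announce interval.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Convert a log2 timer value (for example -1 or -5) to seconds.
log_interval_to_seconds() {
    awk -v n="$1" 'BEGIN { printf "%g\n", 2 ^ n }'
}

# Seconds without an Announce message before a timeout occurs.
# $1 = announce-timeout (number of intervals), $2 = announce-interval (log2)
announce_timeout_seconds() {
    awk -v count="$1" -v n="$2" 'BEGIN { printf "%g\n", count * 2 ^ n }'
}

log_interval_to_seconds -1      # announce-interval -1, prints 0.5
log_interval_to_seconds -5      # sync-interval -5, prints 0.03125
announce_timeout_seconds 3 -3   # 3 intervals of 0.125 s, prints 0.375
```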

Edit the Default interface options section of the /etc/ptp4l.conf file:

  • To set the announce interval between successive Announce messages on swp1 to -1, add logAnnounceInterval -1 under the interface stanza.
  • To set the mean sync-interval for multicast messages on swp1 to -5, add logSyncInterval -5 under the interface stanza.

After you edit the /etc/ptp4l.conf file, restart the ptp4l service.

cumulus@switch:~$ sudo nano /etc/ptp4l.conf
...
# Default interface options
#
time_stamping           hardware

# Interfaces in which ptp should be enabled
# these interfaces should be routed ports
# if an interface does not have an ip address
# the ptp4l will not work as expected.

[swp1]
logAnnounceInterval     -1
logSyncInterval         -5
udp_ttl                 20
masterOnly              1
delay_mechanism         E2E
...
cumulus@switch:~$ sudo systemctl restart ptp4l.service

Optional PTP Interface Configuration

Forced Master Mode

By default, PTP ports are in auto mode, where the BMC algorithm determines the state of the port.

You can configure Forced Master mode on a PTP port so that it is always in a master state and the BMC algorithm does not run for this port. This port ignores any Announce messages it receives.

cumulus@switch:~$ nv set interface swp1 ptp forced-master on
cumulus@switch:~$ nv config apply

Edit the Default interface options section of the /etc/ptp4l.conf file to change the masterOnly setting for the interface, then restart the ptp4l service.

cumulus@switch:~$ sudo nano /etc/ptp4l.conf
...
# Default interface options
#
time_stamping           hardware

# Interfaces in which ptp should be enabled
# these interfaces should be routed ports
# if an interface does not have an ip address
# the ptp4l will not work as expected.

[swp1]
udp_ttl                 1
masterOnly              1
delay_mechanism         E2E
...
cumulus@switch:~$ sudo systemctl restart ptp4l.service

TTL for a PTP Message

To restrict the number of hops a PTP message can travel, set the TTL on the PTP interface. You can set a value between 1 and 255.

cumulus@switch:~$ nv set interface swp1 ptp ttl 20
cumulus@switch:~$ nv config apply

Edit the Default interface options section of the /etc/ptp4l.conf file to change the udp_ttl setting for the interface, then restart the ptp4l service.

cumulus@switch:~$ sudo nano /etc/ptp4l.conf
...
# Default interface options
#
time_stamping           hardware

# Interfaces in which ptp should be enabled
# these interfaces should be routed ports
# if an interface does not have an ip address
# the ptp4l will not work as expected.

[swp1]
udp_ttl                 20
masterOnly              1
delay_mechanism         E2E
...
cumulus@switch:~$ sudo systemctl restart ptp4l.service

Unicast Mode

Cumulus Linux supports unicast mode so that a unicast client can perform unicast discovery and negotiation with servers. In the default multicast mode, both the server (master) and the client (slave) send out announce requests and discover each other. In unicast mode, the client starts by sending requests for unicast transmission to every server address in its unicast master table. The server responds to each request with an accept or deny.

Global Unicast Configuration

Unicast clients need a unicast master table for unicast negotiation; you must configure at least one unicast master table on the switch.

To configure unicast globally:

Interface Unicast Configuration

For interface unicast configuration, in addition to enabling PTP on an interface, you also need to configure the PTP interface to be either a unicast client or a unicast server.

When configuring multiple PTP interfaces on the switch to be unicast clients, you must configure a unicast table ID on every interface set as a unicast client. Each client must have a different table ID.
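
If you maintain /etc/ptp4l.conf by hand, the rule that each client interface needs its own table ID is easy to break. The following sketch assumes a hypothetical helper named duplicate_table_ids that reads a ptp4l-style file on standard input and reports any table_id value that appears in more than one [swpN] interface stanza. It ignores the table_id line inside [unicast_master_table] stanzas.

```shell
#!/bin/sh
# duplicate_table_ids: hypothetical helper, for illustration only.
# Prints one line for every table_id shared by two interface stanzas.
duplicate_table_ids() {
    awk '
        /^[ \t]*\[/ {
            # Remember the current section name without the brackets.
            section = $0
            sub(/^[ \t]*\[/, "", section)
            sub(/\].*$/, "", section)
            next
        }
        section ~ /^swp/ && $1 == "table_id" {
            if ($2 in seen)
                print $2 " used by " seen[$2] " and " section
            else
                seen[$2] = section
        }
    '
}

# Sample input: swp1 and swp2 both point at table 1.
duplicate_table_ids <<'EOF'
[unicast_master_table]
table_id               1
logQueryInterval       0

[swp1]
table_id               1

[swp2]
table_id               1
EOF
```

With this sample the function prints 1 used by swp1 and swp2. No output means every interface stanza uses a different table ID.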

To configure a PTP interface to be the unicast client:

cumulus@switch:~$ nv set interface swp1 ptp unicast-service-mode client
cumulus@switch:~$ nv config apply
  1. Add the following lines at the end of the interface section of the /etc/ptp4l.conf file:

    [unicast_master_table]
    table_id               1
    logQueryInterval       0
    RAWUDPv4                  100.100.100.1
    
    [swp1]
    table_id                1
    ...
    
  2. Restart the ptp4l service.

    cumulus@switch:~$ sudo systemctl restart ptp4l.service
    

To configure a PTP interface to be the unicast server:

cumulus@switch:~$ nv set interface swp1 ptp unicast-service-mode server
cumulus@switch:~$ nv config apply
  1. Add the following lines at the end of the interface section of the /etc/ptp4l.conf file:

    [swp1]
    ...
    unicast_listen      1
    ...
    
  2. Restart the ptp4l service.

    cumulus@switch:~$ sudo systemctl restart ptp4l.service
    

To configure a unicast table ID:

cumulus@switch:~$ nv set interface swp1 ptp unicast-master-table-id 1
cumulus@switch:~$ nv config apply
  1. Add the table ID at the end of the interface section of the /etc/ptp4l.conf file:

    [swp1]
    ...
    table_id   1
    
    
  2. Restart the ptp4l service.

    cumulus@switch:~$ sudo systemctl restart ptp4l.service
    

To show the unicast master table configuration on the switch, run the nv show service ptp <instance-id> unicast-master <table-id> command.

To show unicast PTP related counters, run the nv show interface <interface> counters ptp command and examine the Signaling section in the output.

cumulus@switch:~$ nv show interface swp1 counters ptp
Packet Type                       Received       Transmitted    
---------------------             ------------   ------------   
Announce                                    0            681
Sync                                        0          43530
Follow-up                                   0          43530
Delay Request                           42064              0
Delay Response                              0          42064
Peer Delay Request                          0              0
Peer Delay Response                         0              0
Management                                  0              0
Signaling                                  94            282
  Announce Grant Request                   94              0
  Announce Grant Response                   0             94
  Announce Deny Response                    0              0
  Sync Grant Request                       94              0
  Sync Grant Response                       0             94
  Sync Deny Response                        0              0
  Delay Grant Request                      94              0
  Delay Grant Response                      0             94
  Delay Deny Response                       0              0
  Cancel Announce Request                   0              0
  Cancel Sync Request                       0              0
  Cancel Delay Request                      0              0

  • The client sends unicast requests together in one signaling message (Announce, Sync, Delay request TLV), and the unicast server sees one signaling message and three TLVs. The counter increments for each request received.
  • The server responds with a grant signaling message individually for each response; the response includes three signaling messages each with one TLV. The counters increment individually.
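
To see how the two notes apply to the example output, add up the counters: the server received 94 signaling messages that each carried three request TLVs, and it transmitted 3 x 94 = 282 signaling messages, one for each grant. The sketch below uses a hypothetical helper named signaling_summary that reads counter output on standard input and compares the sum of the grant responses with the Signaling transmitted counter. The sample rows are copied from the output above.

```shell
#!/bin/sh
# signaling_summary: hypothetical helper, for illustration only.
# Sums the transmitted Grant Response counters and prints them next to
# the transmitted Signaling counter (the last column of each row).
signaling_summary() {
    awk '
        $1 == "Signaling" { signaling_tx = $NF }
        /Grant Response/  { grants_tx += $NF }
        END {
            printf "grant responses sent: %d, signaling transmitted: %d\n",
                   grants_tx, signaling_tx
        }
    '
}

signaling_summary <<'EOF'
Signaling                                  94            282
  Announce Grant Request                   94              0
  Announce Grant Response                   0             94
  Sync Grant Request                       94              0
  Sync Grant Response                       0             94
  Delay Grant Request                      94              0
  Delay Grant Response                      0             94
EOF
```

On the switch, you can pipe the real command output into the function, for example nv show interface swp1 counters ptp | signaling_summary.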

Optional Unicast Interface Configuration

You can set the unicast request duration for unicast clients, which is the service time in seconds requested by the unicast client during unicast negotiation. The default value is 300 seconds.

cumulus@switch:~$ nv set interface swp1 ptp unicast-request-duration 20
cumulus@switch:~$ nv config apply
  1. Add the unicast_req_duration parameter at the end of the interface section of the /etc/ptp4l.conf file:

    [swp1]
    ...
    table_id   1
    unicast_req_duration 20
    
  2. Restart the ptp4l service.

    cumulus@switch:~$ sudo systemctl restart ptp4l.service
    

PTP Profiles

PTP profiles are a standardized set of configurations and rules intended to meet the requirements of a specific application. Profiles define required, allowed, and restricted PTP options, network restrictions, and performance requirements.

Cumulus Linux supports three predefined profiles: IEEE 1588 and two Telecom profiles, ITU 8275-1 and ITU 8275-2.

                       IEEE 1588               ITU 8275-1       ITU 8275-2
---------------------  ----------------------  ---------------  ---------------
Application            Enterprise              Mobile Networks  Mobile Networks
Transport              Layer 2 and Layer 3     Layer 2          Layer 3
Encapsulation          802.3, UDPv4, or UDPv6  802.3            UDPv4 or UDPv6
Transmission           Unicast and Multicast   Multicast        Unicast
Supported Clock Types  Boundary Clock          Boundary Clock   Boundary Clock

  • You cannot modify the predefined profiles. If you want to set a parameter to a different value in a predefined profile, you need to create a custom profile. You can modify a custom profile within the range applicable to the profile type.
  • You cannot set the current profile to a profile not yet created.
  • You cannot set global PTP parameters in a profile currently in use.
  • PTP profiles do not support VLANs or bonds.
  • If you set a predefined or custom profile, do not change any global PTP settings, such as the DSCP or the clock domain.
  • For better performance in a high scale network with PTP on multiple interfaces, configure a higher system policer rate with the nv set system control-plane policer lldp-ptp burst <value> and nv set system control-plane policer lldp-ptp rate <value> commands. The switch uses the LLDP policer for PTP protocol packets. The default value for the LLDP policer is 2500. When you use the ITU 8275.1 profile with higher sync rates, use higher policer values.

Set a Predefined Profile

To set a predefined profile:

  • To set the ITU 8275.1 profile, run the nv set service ptp <instance-id> current-profile default-itu-8275-1 command.
  • To set the ITU 8275.2 profile, run the nv set service ptp <instance-id> current-profile default-itu-8275-2 command.

The following example sets the profile to ITU 8275.1:

cumulus@switch:~$ nv set service ptp 1 current-profile default-itu-8275-1
cumulus@switch:~$ nv config apply

To set the IEEE 1588 profile:

cumulus@switch:~$ nv set service ptp 1 current-profile default-1588
cumulus@switch:~$ nv config apply

To set the predefined ITU 8275.1 profile, edit the /etc/ptp4l.conf file and set the parameters shown below, then restart the ptp4l service:

cumulus@switch:~$ sudo nano /etc/ptp4l.conf
...
...
[global]
#
# Default Data Set
#
slaveOnly                      0
priority1                      128
priority2                      128
domainNumber                   24

twoStepFlag                    1
dscp_event                     46
dscp_general                   46
dataset_comparison             G.8275.x
G.8275.defaultDS.localPriority 128
ptp_dst_mac                    01:80:C2:00:00:0E
...
cumulus@switch:~$ sudo systemctl restart ptp4l.service

To set the predefined ITU 8275.2 profile, edit the /etc/ptp4l.conf file and set the parameters shown below, then restart the ptp4l service:

cumulus@switch:~$ sudo nano /etc/ptp4l.conf
...
...
[global]
#
# Default Data Set
#
slaveOnly                      0
priority1                      128
priority2                      128
domainNumber                   24

twoStepFlag                    1 
dscp_event                     46
dscp_general                   46
network_transport              RAWUDPv4
dataset_comparison             G.8275.x
G.8275.defaultDS.localPriority 128
hybrid_e2e                     1
inhibit_multicast_service      1
unicast_listen                 1
unicast_req_duration           60
...
cumulus@switch:~$ sudo systemctl restart ptp4l.service

To use the predefined IEEE 1588 profile, edit the /etc/ptp4l.conf file and set the parameters shown below, then restart the ptp4l service:

cumulus@switch:~$ sudo nano /etc/ptp4l.conf
[global]
#
# Default Data Set
#
slaveOnly                      0
priority1                      128
priority2                      128
domainNumber                   0

twoStepFlag                    1
dscp_event                     46
dscp_general                   46
network_transport              RAWUDPv4
dataset_comparison             ieee1588
...
cumulus@switch:~$ sudo systemctl restart ptp4l.service

Create a Custom Profile

To create a custom profile:

  • Create a profile name.
  • Set the profile type on which to base the new profile (itu-g-8275-1, itu-g-8275-2, or ieee-1588).
  • Update any of the profile settings you want to change (announce-interval, delay-req-interval, priority1, sync-interval, announce-timeout, domain, priority2, transport, delay-mechanism, local-priority).
  • Set the custom profile to be the current profile.

The following example commands create a custom profile called CUSTOM1 based on the predefined profile ITU 8275-1. The commands set the domain to 28 and the announce-timeout to 3, then set CUSTOM1 to be the current profile:

cumulus@switch:~$  nv set service ptp 1 profile CUSTOM1 
cumulus@switch:~$  nv set service ptp 1 profile CUSTOM1 profile-type itu-g-8275-1  
cumulus@switch:~$  nv set service ptp 1 profile CUSTOM1 domain 28
cumulus@switch:~$  nv set service ptp 1 profile CUSTOM1 announce-timeout 3
cumulus@switch:~$  nv set service ptp 1 current-profile CUSTOM1
cumulus@switch:~$  nv config apply
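
If you create several custom profiles, it can help to generate the command sequence and review it before you run anything. The sketch below is a dry run only: custom_profile_commands is a hypothetical helper that prints the same NVUE commands as the example above for a given profile name, profile type, domain, and announce-timeout. It does not run nv, and it assumes PTP instance 1.

```shell
#!/bin/sh
# custom_profile_commands: hypothetical helper, for illustration only.
# Prints (does not run) the NVUE commands that create a custom profile.
# $1 = profile name, $2 = profile type, $3 = domain, $4 = announce-timeout
custom_profile_commands() {
    case "$2" in
        itu-g-8275-1|itu-g-8275-2|ieee-1588) ;;
        *) echo "unknown profile type: $2" >&2; return 1 ;;
    esac
    base="nv set service ptp 1 profile $1"
    printf '%s\n' \
        "$base" \
        "$base profile-type $2" \
        "$base domain $3" \
        "$base announce-timeout $4" \
        "nv set service ptp 1 current-profile $1" \
        "nv config apply"
}

custom_profile_commands CUSTOM1 itu-g-8275-1 28 3
```

Review the printed commands, then paste them into the switch shell. Printing instead of executing keeps the sketch safe to try on any machine.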

The following example /etc/ptp4l.conf file creates a custom profile based on the predefined profile ITU 8275-1 and sets the domain to 28 and the announce-timeout to 3.

cumulus@switch:~$ sudo nano /etc/ptp4l.conf
[global]
#
# Default Data Set
#
slaveOnly                      0
priority1                      128
priority2                      128
domainNumber                   28

twoStepFlag                    1
dscp_event                     46
dscp_general                   46
network_transport              L2
dataset_comparison             G.8275.x
G.8275.defaultDS.localPriority 128
ptp_dst_mac                    01:80:C2:00:00:0E

#
# Port Data Set
#
logAnnounceInterval            -3
logSyncInterval                -4
logMinDelayReqInterval         -4
announceReceiptTimeout         3
delay_mechanism                E2E

offset_from_master_min_threshold   -50
offset_from_master_max_threshold   50
mean_path_delay_threshold          200
tsmonitor_num_ts                   100
tsmonitor_num_log_sets             3
tsmonitor_num_log_entries          4
tsmonitor_log_wait_seconds         1

#
# Run time options
#
logging_level                  6
path_trace_enabled             0
use_syslog                     1
verbose                        0
summary_interval               0

#
# servo parameters
#
pi_proportional_const          0.000000
pi_integral_const              0.000000
pi_proportional_scale          0.700000
pi_proportional_exponent       -0.300000
pi_proportional_norm_max       0.700000
pi_integral_scale              0.300000
pi_integral_exponent           0.400000
pi_integral_norm_max           0.300000
step_threshold                 0.000002
first_step_threshold           0.000020
max_frequency                  900000000
sanity_freq_limit              0

#
# Default interface options
#
time_stamping                  hardware


# Interfaces in which ptp should be enabled
# these interfaces should be routed ports
# if an interface does not have an ip address
# the ptp4l will not work as expected.

[swp1]
udp_ttl                 1
masterOnly              0
delay_mechanism         E2E

[swp2]
udp_ttl                 1
masterOnly              0
delay_mechanism         E2E
cumulus@switch:~$ sudo systemctl restart ptp4l.service

Telecom Profiles

ITU 8275-1 and ITU 8275-2 are Telecom profiles. You can use the PTP Telecom profiles for phase distribution in networks that have full timing support and for time distribution in networks that have partial timing support. ITU 8275-1 uses 802.3 encapsulation with multicast transmission, and ITU 8275-2 uses UDPv4 or UDPv6 encapsulation with unicast transmission. When you use a Telecom profile, PTP uses the Alternate Best Master Clock Algorithm (BMCA), which provides the following functionality over the regular BMCA:

Local Priority

The local priority attributes of the Telecom profiles provide a powerful tool in building the synchronization topology. The profiles have two local priority configuration parameters:

Both clock-local-priority and local-priority have default values of 128. When you use the default values, the Alternate BMCA determines the synchronization topology automatically. If you use non-default local priority values, you build the synchronization topology manually.

  • Exercise caution when using local priority attributes to build the synchronization topology manually.
  • With two connected switches, you must set the local priority on one switch higher than 128 and the local priority on the second switch lower than 128.

The following example commands set:

  • The local priority to 10 for the custom profile called CUSTOM1, based on ITU 8275-2.
  • The clock local priority to 100 for the custom profile called CUSTOM1, based on ITU 8275-2.
cumulus@switch:~$ nv set service ptp 1 profile CUSTOM1 local-priority 10
cumulus@switch:~$ nv set service ptp 1 profile CUSTOM1 clock-local-priority 100
cumulus@switch:~$ nv config apply

Add the G.8275.portDS.localPriority (local priority) option and the G.8275.defaultDS.localPriority (clock local priority) option to the Global section of the /etc/ptp4l.conf file, then restart the ptp4l service.

The following example sets:

  • The local priority to 10.
  • The clock local priority to 100.
cumulus@switch:~$ sudo nano /etc/ptp4l.conf
[global]
#
# Default Data Set
#
slaveOnly                      0
priority1                      128
priority2                      128
domainNumber                   28

twoStepFlag                    1
dscp_event                     46
dscp_general                   46
network_transport              L2
dataset_comparison             G.8275.x
G.8275.defaultDS.localPriority 100
G.8275.portDS.localPriority    10
...
cumulus@switch:~$ sudo systemctl restart ptp4l.service

The following example sets the local priority on swp1 to 120.

cumulus@switch:~$ nv set interface swp1 ptp local-priority 120
cumulus@switch:~$ nv config apply

Add the G.8275.portDS.localPriority option to the interface section of the /etc/ptp4l.conf file, then restart the ptp4l service.

cumulus@switch:~$ sudo nano /etc/ptp4l.conf
...
[swp1]
udp_ttl                      1
hybrid_e2e                   1
masterOnly                   0
delay_mechanism              E2E
network_transport            RAWUDPv6
G.8275.portDS.localPriority  120
...
cumulus@switch:~$ sudo systemctl restart ptp4l.service

Show Profile Settings

To show the current PTP profile setting, run the nv show service ptp <ptp-instance> command:

cumulus@switch:~$ nv show service ptp 1
                             operational  applied             description
---------------------------  -----------  ------------------  --------------------------------------------------------------------
enable                       on           on                  Turn the feature 'on' or 'off'.  The default is 'off'.
current-profile                           default-itu-8275-1  Current PTP profile index
domain                       24           0                   Domain number of the current syntonization
ip-dscp                      46           46                  Sets the Diffserv code point for all PTP packets originated locally.
priority1                    128          128                 Priority1 attribute of the local clock
priority2                    128          128                 Priority2 attribute of the local clock
...

To show the settings for a profile, run the nv show service ptp <instance> profile <profile-name> command:

cumulus@switch:~$ nv show service ptp 1 profile CUSTOM1
                             operational  applied           
---------------------------  -----------  ------------------
enable                                    on                
current-profile                           default-itu-8275-1
domain                                    0                 
ip-dscp                                   46                
logging-level                             info              
priority1                                 128               
priority2                                 128               
[acceptable-master]    
monitor                                                     
  max-offset-threshold                    50                
  max-timestamp-entries                   100               
  max-violation-log-entries               4                 
  max-violation-log-sets                  3                 
  min-offset-threshold                    -50               
  path-delay-threshold                    200               
  violation-log-interval                  1                 

Optional Acceptable Master Table

The acceptable master table option is a security feature that prevents a rogue player from pretending to be the Grand Master clock to take over the PTP network. To use this feature, you configure the clock IDs of known Grand Master clocks in the acceptable master table and set the acceptable master table option on a PTP port. The BMC algorithm checks if the Grand Master clock received in the Announce message is in this table before proceeding with the master selection. Cumulus Linux disables this option by default on PTP ports.

The following example command adds the Grand Master clock ID 24:8a:07:ff:fe:f4:16:06 to the acceptable master table:

cumulus@switch:~$ nv set service ptp 1 acceptable-master 24:8a:07:ff:fe:f4:16:06
cumulus@switch:~$ nv config apply

You can also configure an alternate priority 1 value for the Grand Master:

cumulus@switch:~$ nv set service ptp 1 acceptable-master 24:8a:07:ff:fe:f4:16:06 alt-priority 2
cumulus@switch:~$ nv config apply

To enable the PTP acceptable master table option for swp1:

cumulus@switch:~$ nv set interface swp1 ptp acceptable-master on
cumulus@switch:~$ nv config apply

Edit the Default interface options section of the /etc/ptp4l.conf file to add acceptable_master_clockIdentity 248a07.fffe.f41606.

cumulus@switch:~$ sudo nano /etc/ptp4l.conf
...
#
# Default interface options
#
time_stamping           hardware


[acceptable_master_table]
maxTableSize 16
acceptable_master_clockIdentity 248a07.fffe.f41606
...
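
The NVUE command takes the clock ID as eight colon-separated bytes (24:8a:07:ff:fe:f4:16:06), while the /etc/ptp4l.conf file uses the dotted form 248a07.fffe.f41606. The following sketch shows one way to convert between the two; clock_id_to_ptp4l is a hypothetical helper, and the grouping (6, 4, and 6 hexadecimal digits) is inferred from the example in this section.

```shell
#!/bin/sh
# clock_id_to_ptp4l: hypothetical helper, for illustration only.
# Converts aa:bb:cc:dd:ee:ff:gg:hh into aabbcc.ddee.ffgghh.
# Returns a non-zero status if the input is not 16 hexadecimal digits.
clock_id_to_ptp4l() {
    printf '%s\n' "$1" | awk '
        {
            gsub(/:/, "")
            if (length($0) != 16 || $0 ~ /[^0-9A-Fa-f]/) exit 1
            print substr($0, 1, 6) "." substr($0, 7, 4) "." substr($0, 11, 6)
        }'
}

clock_id_to_ptp4l 24:8a:07:ff:fe:f4:16:06   # prints 248a07.fffe.f41606
```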

You can also configure an alternate priority 1 value for the Grand Master.

cumulus@switch:~$ sudo nano /etc/ptp4l.conf
...
#
# Default interface options
#
time_stamping           hardware


[acceptable_master_table]
maxTableSize 16
acceptable_master_clockIdentity 248a07.fffe.f41606 2

To enable the PTP acceptable master table option for swp1, add acceptable_master on under [swp1].

...
# Default interface options
#
time_stamping           hardware

# Interfaces in which ptp should be enabled
# these interfaces should be routed ports
# if an interface does not have an ip address
# the ptp4l will not work as expected.

[swp1]
udp_ttl                 20
masterOnly              1
delay_mechanism         E2E
acceptable_master       on
...

Restart the ptp4l service:

cumulus@switch:~$ sudo systemctl restart ptp4l.service

Optional Monitor Configuration

Cumulus Linux provides the following optional PTP monitoring configuration.

Configure Clock TimeStamp and Path Delay Thresholds

Cumulus Linux monitors clock timestamp and path delay against thresholds, and generates counters when PTP reaches the set thresholds. You can see the counters in the NVUE nv show command output and in log messages.

You can configure the following monitor settings:

Command Description
nv set service ptp <instance> monitor min-offset-threshold Sets the minimum difference allowed between the master and slave time. You can set a value between -1000000000 and 0 nanoseconds. The default value is -50 nanoseconds.
nv set service ptp <instance> monitor max-offset-threshold Sets the maximum difference allowed between the master and slave time. You can set a value between 0 and 1000000000 nanoseconds. The default value is 50 nanoseconds.
nv set service ptp <instance> monitor path-delay-threshold Sets the mean time that PTP packets take to travel between the master and slave. You can set a value between 0 and 1000000000 nanoseconds. The default value is 200 nanoseconds.
nv set service ptp <instance> monitor max-timestamp-entries Sets the maximum number of timestamp entries allowed. Cumulus Linux updates the timestamps continuously. You can specify a value between 100 and 200. The default value is 100 entries.

The following example sets the minimum offset threshold to -1000, the maximum offset threshold to 1000, and the path delay threshold to 300:

cumulus@switch:~$ nv set service ptp 1 monitor min-offset-threshold -1000
cumulus@switch:~$ nv set service ptp 1 monitor max-offset-threshold 1000
cumulus@switch:~$ nv set service ptp 1 monitor path-delay-threshold 300
cumulus@switch:~$ nv config apply
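
As a rough illustration of the ranges in the table above, the following Python sketch checks candidate monitor values before you use them in nv set commands. The MONITOR_RANGES table and the validate_monitor_setting helper are hypothetical names for this example and are not part of NVUE; the ranges are copied from the table.

```python
# Hypothetical helper; the ranges come from the monitor settings table above.
MONITOR_RANGES = {
    "min-offset-threshold": (-1_000_000_000, 0),
    "max-offset-threshold": (0, 1_000_000_000),
    "path-delay-threshold": (0, 1_000_000_000),
    "max-timestamp-entries": (100, 200),
}


def validate_monitor_setting(name: str, value: int) -> int:
    """Return value unchanged if it is inside the documented range."""
    low, high = MONITOR_RANGES[name]
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")
    return value


# The values used in the example above all pass the check.
for setting, candidate in [
    ("min-offset-threshold", -1000),
    ("max-offset-threshold", 1000),
    ("path-delay-threshold", 300),
]:
    validate_monitor_setting(setting, candidate)
```

A value outside the range, such as a positive minimum offset threshold, raises ValueError in this sketch; NVUE performs its own validation when you run the command.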

You can configure the following monitor settings manually in the /etc/ptp4l.conf file. Be sure to run the sudo systemctl restart ptp4l.service command to apply the settings.

Parameter Description
offset_from_master_min_threshold Sets the minimum difference allowed between the master and slave time. You can set a value between -1000000000 and 0 nanoseconds. The default value is -50 nanoseconds.
offset_from_master_max_threshold Sets the maximum difference allowed between the master and slave time. You can set a value between 0 and 1000000000 nanoseconds. The default value is 50 nanoseconds.
mean_path_delay_threshold Sets the mean time that PTP packets take to travel between the master and slave. You can set a value between 0 and 1000000000 nanoseconds. The default value is 200 nanoseconds.

The following example sets the minimum offset threshold to -1000, the maximum offset threshold to 1000, and the path delay threshold to 300:

cumulus@switch:~$ sudo nano /etc/ptp4l.conf
...
[global]
#
# Default Data Set
#
slaveOnly               0
priority1               128
priority2               128
domainNumber            0

twoStepFlag             1
dscp_event              46
dscp_general            46

offset_from_master_min_threshold   -1000
offset_from_master_max_threshold   1000
mean_path_delay_threshold          300
...
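
To make the effect of the three thresholds concrete, here is a minimal Python sketch that classifies one measurement against them. The classify_sample function is hypothetical and only models the comparison; it is not how ptp4l implements monitoring. It assumes that a violation occurs only when a value is strictly beyond its threshold, and it uses the default thresholds from the table above.

```python
def classify_sample(offset_ns, mean_path_delay_ns,
                    min_offset=-50, max_offset=50, path_delay=200):
    """Return the violation types triggered by one measurement (in nanoseconds)."""
    violations = []
    if offset_ns > max_offset:
        violations.append("max-offset")
    if offset_ns < min_offset:
        violations.append("min-offset")
    if mean_path_delay_ns > path_delay:
        violations.append("path-delay")
    return violations


# With the default thresholds, an offset of 75 ns exceeds the maximum offset.
print(classify_sample(75, 100))
# With the thresholds from the example above (-1000, 1000, 300):
print(classify_sample(-1200, 350, min_offset=-1000, max_offset=1000, path_delay=300))
```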

Configure PTP Logging

A log set contains the log entries for clock timestamp and path delay violations at different times. You can set the number of entries to log and the interval between successive violation logs.

Command Description
nv set service ptp 1 monitor max-violation-log-sets Sets the maximum number of log sets allowed. You can specify a value between 2 and 4. The default value is 2.
nv set service ptp 1 monitor max-violation-log-entries Sets the maximum number of log entries allowed in a log set. You can specify a value between 4 and 8. The default value is 4.
nv set service ptp 1 monitor violation-log-interval Sets the number of seconds to wait before logging back-to-back violations. You can specify a value between 0 and 60. The default value is 1.

The following example sets the maximum number of log sets allowed to 4, the maximum number of log entries allowed to 6, and the violation log interval to 10:

cumulus@switch:~$ nv set service ptp 1 monitor max-violation-log-sets 4
cumulus@switch:~$ nv set service ptp 1 monitor max-violation-log-entries 6
cumulus@switch:~$ nv set service ptp 1 monitor violation-log-interval 10
cumulus@switch:~$ nv config apply

You can configure the following monitor settings manually in the /etc/ptp4l.conf file. Be sure to run the sudo systemctl restart ptp4l.service command to apply the settings.

Parameter Description
tsmonitor_num_log_sets Sets the maximum number of log sets allowed. You can specify a value between 2 and 4. The default value is 2.
tsmonitor_num_log_entries Sets the maximum number of log entries allowed in a log set. You can specify a value between 4 and 8. The default value is 4.
tsmonitor_log_wait_seconds Sets the number of seconds to wait before logging back-to-back violations. You can specify a value between 0 and 60. The default value is 1.

The following example sets the maximum number of log sets allowed to 4, the maximum number of log entries allowed to 6, and the violation log interval to 10:

cumulus@switch:~$ sudo nano /etc/ptp4l.conf
...
[global]
#
# Default Data Set
#
slaveOnly               0
priority1               128
priority2               128
domainNumber            0

twoStepFlag             1
dscp_event              46
dscp_general            46

offset_from_master_min_threshold   -50
offset_from_master_max_threshold   50
mean_path_delay_threshold          300
tsmonitor_num_ts                   100
tsmonitor_num_log_sets             4
tsmonitor_num_log_entries          6
tsmonitor_log_wait_seconds         10
...
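
The NVUE monitor options and the /etc/ptp4l.conf parameters in the tables above pair up one to one. The following Python sketch records that pairing and renders the matching ptp4l.conf lines. NVUE_TO_PTP4L and render_monitor_lines are hypothetical names for illustration only. The max-timestamp-entries option is left out because this section does not state which file parameter it corresponds to.

```python
# Pairing taken from the NVUE and ptp4l.conf tables in this section.
NVUE_TO_PTP4L = {
    "min-offset-threshold": "offset_from_master_min_threshold",
    "max-offset-threshold": "offset_from_master_max_threshold",
    "path-delay-threshold": "mean_path_delay_threshold",
    "max-violation-log-sets": "tsmonitor_num_log_sets",
    "max-violation-log-entries": "tsmonitor_num_log_entries",
    "violation-log-interval": "tsmonitor_log_wait_seconds",
}


def render_monitor_lines(settings):
    """Turn {nvue-option: value} into 'parameter value' lines for ptp4l.conf."""
    return [f"{NVUE_TO_PTP4L[name]:<34} {value}" for name, value in settings.items()]


# The logging example above: 4 log sets, 6 entries per set, 10 second interval.
for line in render_monitor_lines({
    "max-violation-log-sets": 4,
    "max-violation-log-entries": 6,
    "violation-log-interval": 10,
}):
    print(line)
```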

Show PTP Logs

PTP monitoring provides commands to show counters for violations as well as the timestamp log entries for a violation.

Command Description
nv show service ptp <instance> monitor timestamp-log Shows the last 25 PTP timestamps.
nv show service ptp <instance> monitor violations Shows the threshold violation count and the last time a violation of a specific type occurred.
nv show service ptp 1 monitor violations log acceptable-master Shows logs with violations that occur when a PTP server not in the Acceptable Master table sends an Announce request.
nv show service ptp 1 monitor violations log forced-master Shows logs with violations that occur when a forced master port gets a higher clock.
nv show service ptp 1 monitor violations log max-offset Shows logs with violations that occur when the timestamp offset is higher than the max offset threshold.
nv show service ptp 1 monitor violations log min-offset Shows logs with violations that occur when the timestamp offset is lower than the minimum offset threshold.
nv show service ptp 1 monitor violations log path-delay Shows logs with violations that occur when the mean path delay is higher than the path delay threshold.

The following example shows the threshold violation count and the last time a minimum offset threshold violation occurred:

cumulus@switch:~$ nv show service ptp 1 monitor violations
                  operational                  applied
----------------  ---------------------------  -------
last-max-offset
last-min-offset   2023-04-24T15:22:01.312295Z
last-path-delay
max-offset-count  0
min-offset-count  2
path-delay-count  0
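
If you collect this output in a script, a small parser can turn it into key and value pairs. The following Python sketch is one possible approach, not an NVIDIA tool. It assumes the text layout shown above, with a dashed separator line and at most one populated value per row; parse_violations and SAMPLE_VIOLATIONS are hypothetical names.

```python
SAMPLE_VIOLATIONS = """\
                  operational                  applied
----------------  ---------------------------  -------
last-max-offset
last-min-offset   2023-04-24T15:22:01.312295Z
last-path-delay
max-offset-count  0
min-offset-count  2
path-delay-count  0
"""


def parse_violations(text):
    """Map each row name to its first value, or None when the row is empty."""
    result = {}
    past_header = False
    for line in text.splitlines():
        if line.startswith("---"):
            past_header = True  # rows start after the dashed separator
            continue
        if not past_header or not line.strip():
            continue
        fields = line.split()
        result[fields[0]] = fields[1] if len(fields) > 1 else None
    return result


violations = parse_violations(SAMPLE_VIOLATIONS)
print(violations["min-offset-count"], violations["last-min-offset"])
```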

Clear PTP Violation Logs

The following example clears the path delay violation logs:

cumulus@leaf01:mgmt:~$ nv action clear service ptp 1 monitor violations log path-delay
Action succeeded

Delete PTP Configuration

To delete PTP configuration, delete the PTP master and slave interfaces. The following example commands delete the PTP interfaces swp1, swp2, and swp3.

cumulus@switch:~$ nv unset interface swp1 ptp
cumulus@switch:~$ nv unset interface swp2 ptp
cumulus@switch:~$ nv unset interface swp3 ptp
cumulus@switch:~$ nv config apply

Edit the /etc/ptp4l.conf file to remove the interfaces from the Default interface options section, then restart the ptp4l service.

cumulus@switch:~$ sudo nano /etc/ptp4l.conf
...
# Default interface options
#
time_stamping           hardware

# Interfaces in which ptp should be enabled
# these interfaces should be routed ports
# if an interface does not have an ip address
# the ptp4l will not work as expected.
cumulus@switch:~$ sudo systemctl restart ptp4l.service

To disable PTP on the switch and stop the ptp4l and phc2sys processes:

cumulus@switch:~$ nv set service ptp 1 enable off
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo systemctl stop ptp4l.service phc2sys.service
cumulus@switch:~$ sudo systemctl disable ptp4l.service phc2sys.service

Troubleshooting

Show PTP Configuration

To show a summary of the PTP configuration on the switch, run the nv show service ptp <instance> command:

cumulus@switch:~$ nv show service ptp 1
                             operational  applied
---------------------------  -----------  ------------------
enable                       on           on
current-profile                            default-itu-8275-2
domain                                    0
ip-dscp                                   46
logging-level                             info
priority1                                 128
priority2                                 128
[acceptable-master]
monitor
  max-offset-threshold                     50
  max-timestamp-entries                   100
  max-violation-log-entries               4
  max-violation-log-sets                  2
  min-offset-threshold                     -50
  path-delay-threshold                    200
  violation-log-interval                  1
[profile]                                  abc
[profile]                                  default-1588
[profile]                                  default-itu-8275-1
[profile]                                  default-itu-8275-2
[unicast-master]                          1
[unicast-master]                          2
[unicast-master]                          3
[unicast-master]                          4
...

You can drill down with the following nv show service ptp <instance> commands:

Show PTP Interface Configuration

To check configuration for a PTP interface, run the nv show interface <interface> ptp command.

cumulus@switch:~$ nv show interface swp1 ptp
                           operational  applied     description
-------------------------  -----------  ----------  ----------------------------------------------------------------------
enable                                  on          Turn the feature 'on' or 'off'.  The default is 'off'.
acceptable-master                       off         Determines if acceptable master check is enabled for this interface.
delay-mechanism            end-to-end   end-to-end  Mode in which PTP message is transmitted.
forced-master              off          off         Configures PTP interfaces to forced master state.
instance                                1           PTP instance number.
mixed-multicast-unicast                 off         Enables Multicast for Announce, Sync and Followup and Unicast for D...
transport                  ipv4         ipv4        Transport method for the PTP messages.
ttl                        1            1           Maximum number of hops the PTP messages can make before it gets dro...
unicast-request-duration                300         The service time in seconds to be requested during discovery.
timers
  announce-interval        0            0           Mean time interval between successive Announce messages.  It's spec...
  announce-timeout         3            3           The number of announceIntervals that have to pass without receipt o...
  delay-req-interval       -3           -3          The minimum permitted mean time interval between successive Delay R...
  sync-interval            -3           -3          The mean SyncInterval for multicast messages.  It's specified as a...
peer-mean-path-delay       0                        An estimate of the current one-way propagation delay on the link wh...
port-state                 master                   State of the port
protocol-version           2                        The PTP version in use on the port

Show PTP Counters

To show all PTP counters, run the nv show service ptp <instance> counters command:

cumulus@switch:~$ nv show service ptp 1 counters
Packet Type              Received       Transmitted    
---------------------    ------------   ------------   
Port swp4
  Announce                 0              10370            
  Sync                     0              20731             
  Follow-up                0              20731            
  Delay Request            0              0              
  Delay Response           0              0              
  Peer Delay Request       0              0              
  Peer Delay Response      0              0              
  Management               0              0              
  Signaling                0              0

To show PTP counters for an interface, run the nv show interface <interface> counters ptp command.

To clear PTP counters for an interface, run the nv action clear interface <interface> counters ptp command:

cumulus@switch:~$ nv action clear interface swp1 counters ptp
Action succeeded

Show the Status of All PTP Interfaces

To show the status of all PTP interfaces, run the nv show service ptp <instance> status command. The command output shows the PTP enabled ports, the PTP port mode (unicast or multicast), the state of the port based on BMCA, the unicast state, and identifies the server address to which the client connects.

cumulus@switch:~$ nv show service ptp 1 status
Port   Mode   State    Ustate                           Server
-----  -----  -------  -------------------------------  -------
swp9   Ucast  SLAVE    Sync and Delay Granted (H_SYDY)  9.9.9.2
swp10  Ucast  PASSIVE  Initial State (WAIT)
swp11  Ucast  PASSIVE  Initial State (WAIT)
swp12  Ucast  PASSIVE  Initial State (WAIT)
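
To pick out the port that is currently synchronized to a server, you can parse the status table. The following Python sketch assumes the column layout shown above, where columns are separated by two or more spaces and the Server column can be empty. parse_status and SAMPLE_STATUS are hypothetical names for this example.

```python
import re

SAMPLE_STATUS = """\
Port   Mode   State    Ustate                           Server
-----  -----  -------  -------------------------------  -------
swp9   Ucast  SLAVE    Sync and Delay Granted (H_SYDY)  9.9.9.2
swp10  Ucast  PASSIVE  Initial State (WAIT)
swp11  Ucast  PASSIVE  Initial State (WAIT)
swp12  Ucast  PASSIVE  Initial State (WAIT)
"""


def parse_status(text):
    """Return one dict per port row; a missing Server column becomes ''."""
    rows = []
    for line in text.splitlines()[2:]:  # skip the header and separator lines
        if not line.strip():
            continue
        fields = re.split(r"\s{2,}", line.strip())
        fields += [""] * (5 - len(fields))
        port, mode, state, ustate, server = fields[:5]
        rows.append({"port": port, "mode": mode, "state": state,
                     "ustate": ustate, "server": server})
    return rows


slave_ports = [row["port"] for row in parse_status(SAMPLE_STATUS)
               if row["state"] == "SLAVE"]
print(slave_ports)
```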

Show the List of NVUE PTP Commands

cumulus@switch:~$ nv list-commands service ptp
nv show service ptp
nv show service ptp <instance-id>
nv show service ptp <instance-id> status
nv show service ptp <instance-id> domain
nv show service ptp <instance-id> priority1
nv show service ptp <instance-id> priority2
nv show service ptp <instance-id> ip-dscp
nv show service ptp <instance-id> acceptable-master
...
cumulus@switch:~$ nv list-commands | grep 'nv show interface <interface-id> ptp'
...
nv show interface <interface-id> ptp
nv show interface <interface-id> ptp timers
nv show interface <interface-id> ptp shaper
...

Example Configuration

In the following example, the boundary clock on the switch receives time from Master 1 (the Grand Master) on PTP slave port swp1, sets its clock and passes the time down through PTP master ports swp2, swp3, and swp4 to the hosts that receive the time.

The following example configuration assumes that you have already configured the layer 3 routed interfaces (swp1, swp2, swp3, and swp4) you want to use for PTP.

cumulus@switch:~$ nv set service ptp 1 enable on
cumulus@switch:~$ nv set service ptp 1 priority2 254
cumulus@switch:~$ nv set service ptp 1 priority1 254
cumulus@switch:~$ nv set service ptp 1 domain 3
cumulus@switch:~$ nv set interface swp1 ptp enable on
cumulus@switch:~$ nv set interface swp2 ptp enable on
cumulus@switch:~$ nv set interface swp3 ptp enable on
cumulus@switch:~$ nv set interface swp4 ptp enable on
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    interface:
      lo:
        ip:
          address:
            10.10.10.1/32: {}
        type: loopback
      swp1:
        ptp:
          enable: on
        type: swp
      swp2:
        ptp:
          enable: on
        type: swp
      swp3:
        ptp:
          enable: on
        type: swp
      swp4:
        ptp:
          enable: on
        type: swp
    service:
      ptp:
        '1':
          domain: 3
          enable: on
          priority1: 254
          priority2: 254
cumulus@switch:~$ sudo cat /etc/ptp4l.conf
...
[global]
#
# Default Data Set
#
slaveOnly                      0
priority1                      254
priority2                      254
domainNumber                   3

twoStepFlag                    1
dscp_event                     46
dscp_general                   46

offset_from_master_min_threshold   -50
offset_from_master_max_threshold   50
mean_path_delay_threshold          200
tsmonitor_num_ts                   100
tsmonitor_num_log_sets             2
tsmonitor_num_log_entries          4
tsmonitor_log_wait_seconds         1

#
# Run time options
#
logging_level                  6
path_trace_enabled             0
use_syslog                     1
verbose                        0
summary_interval               0

#
# servo parameters
#
pi_proportional_const          0.000000
pi_integral_const              0.000000
pi_proportional_scale          0.700000
pi_proportional_exponent       -0.300000
pi_proportional_norm_max       0.700000
pi_integral_scale              0.300000
pi_integral_exponent           0.400000
pi_integral_norm_max           0.300000
step_threshold                 0.000002
first_step_threshold           0.000020
max_frequency                  900000000
sanity_freq_limit              0

#
# Default interface options
#
time_stamping                  hardware


# Interfaces in which ptp should be enabled
# these interfaces should be routed ports
# if an interface does not have an ip address
# the ptp4l will not work as expected.

[swp1]
udp_ttl                      1
masterOnly                   0
delay_mechanism              E2E
network_transport            RAWUDPv4

[swp2]
udp_ttl                      1
masterOnly                   0
delay_mechanism              E2E
network_transport            RAWUDPv4

[swp3]
udp_ttl                      1
masterOnly                   0
delay_mechanism              E2E
network_transport            RAWUDPv4

[swp4]
udp_ttl                      1
masterOnly                   0
delay_mechanism              E2E
network_transport            RAWUDPv4

Considerations

PTP Version

Cumulus Linux uses a linuxptp package that is PTP v2.1 compliant, and sets the major PTP version to 2 and the minor PTP version to 1 by default in the configuration. If your PTP configuration does not work correctly when the minor version is set, you can change the minor version to 0.

cumulus@switch:~$ nv set service ptp 1 force-version 2.0
cumulus@switch:~$ nv config apply

To set the minor PTP version back to the default, run the nv unset service ptp 1 force-version command.

Edit the /etc/ptp4l.conf file to add ptp_minor_version 0 to the Global section, then restart the ptp4l service.

cumulus@switch:~$ sudo nano /etc/ptp4l.conf
...
[global]
#
# Default Data Set
#
slaveOnly                      0
priority1                      128
priority2                      128
domainNumber                   0

twoStepFlag                    1
dscp_event                     46
dscp_general                   46
ptp_minor_version              0
cumulus@switch:~$ sudo systemctl restart ptp4l.service

To set the minor PTP version back to the default value (1), remove ptp_minor_version 0 from the Global section of the /etc/ptp4l.conf file, then restart the ptp4l service.

To show that the PTP minor version is now 0, run the nv show service ptp <instance> force-version command:

cumulus@switch:~$ nv show service ptp 1 force-version
               applied
-------------  -------
force-version  2.0

PTP Traffic Shaping

To improve performance on the NVIDIA Spectrum 1 switch for PTP-enabled ports with speeds lower than 100G, you can enable a pre-defined traffic shaping profile. For example, if you see that the PTP timing offset varies widely and does not stabilize, enable PTP shaping on all PTP enabled ports to reduce the bandwidth on the ports slightly and improve timing stabilization.

  • Switches with Spectrum-2 and later do not support PTP shaping.

  • Bonds do not support PTP shaping.

  • You cannot configure QoS traffic shaping and PTP traffic shaping on the same ports.

  • You must configure a strict priority for PTP traffic; for example:

    cumulus@switch:~$ nv set qos egress-scheduler default-global traffic-class 0-5,7 mode dwrr
    cumulus@switch:~$ nv set qos egress-scheduler default-global traffic-class 0-5,7 bw-percent 12
    cumulus@switch:~$ nv set qos egress-scheduler default-global traffic-class 6 mode strict
    

For each PTP-enabled port on which you want to set traffic shaping, run the nv set interface <interface> ptp shaper enable on command.

cumulus@switch:~$ nv set interface swp1 ptp shaper enable on
cumulus@switch:~$ nv set interface swp2 ptp shaper enable on
cumulus@switch:~$ nv config apply

To see the PTP shaping setting for an interface, run the nv show interface <interface> ptp shaper command:

cumulus@switch:~$ nv show interface swp1 ptp shaper
        operational  applied  
------  -----------  -------  
enable               on   

In the /etc/cumulus/switchd.d/ptp_shaper.conf file, set the following parameters for the interfaces to which you want to apply traffic shaping and enable the traffic shaper. You must reload switchd for the changes to take effect.

cumulus@switch:~$ sudo nano /etc/cumulus/switchd.d/ptp_shaper.conf
## Per-port configuration for PTP shaper
ptp_shaper.port_group_list = [enable-group]
ptp_shaper.enable-group.port_set = swp1,swp2
ptp_shaper.enable-group.ptp_shaper_enable = true
cumulus@switch:~$ sudo systemctl reload switchd.service
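
When many ports need the shaper, you can generate the file content from a port list. The following Python sketch reproduces the four lines shown above; render_ptp_shaper_conf is a hypothetical helper, and using a group name other than enable-group is an untested assumption.

```python
def render_ptp_shaper_conf(ports, group="enable-group"):
    """Return ptp_shaper.conf content that enables the shaper on the given ports."""
    lines = [
        "## Per-port configuration for PTP shaper",
        f"ptp_shaper.port_group_list = [{group}]",
        f"ptp_shaper.{group}.port_set = {','.join(ports)}",
        f"ptp_shaper.{group}.ptp_shaper_enable = true",
    ]
    return "\n".join(lines) + "\n"


print(render_ptp_shaper_conf(["swp1", "swp2"]), end="")
```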

Spanning Tree and PTP

STP filtering affects PTP frames. Events such as an STP topology change, where ports temporarily go into the blocking state, can interrupt PTP communications.

If you configure PTP on bridge ports, NVIDIA recommends that the bridge ports are spanning tree edge ports or in a bridge domain where spanning tree is disabled.

Pulse Per Second - PPS

PPS is the simplest form of synchronization. The PPS source provides a signal precisely every second. The switch is capable of using an external PPS signal to synchronize its PHC (for PPS In) and can also generate the PPS signal that other devices can use to synchronize their clocks (for PPS Out).

Cumulus Linux supports PPS for the NVIDIA SN3750-SX and SN5400 switches only.

Enable PPS Synchronization

To enable PPS synchronization:

Before you enable PPS In, make sure to configure a PTP slave port on the switch. See Precision Time Protocol - PTP.

cumulus@switch:~$ nv set platform pulse-per-second in state enabled
cumulus@switch:~$ nv config apply

  • If you configure SyncE or PTP noise transfer, Cumulus Linux does not support PPS In.
  • When you enable PPS In, the PTP log reports the offset once every two seconds instead of once every second.

To enable PPS Out:

cumulus@switch:~$ nv set platform pulse-per-second out state enabled
cumulus@switch:~$ nv config apply

  1. Edit the Default interface options section of the /etc/ptp4l.conf file to configure the PTP slave port on the switch. PPS In requires a PTP slave port. See Precision Time Protocol - PTP for information about PTP.

    cumulus@switch:~$ sudo nano /etc/ptp4l.conf
    ...
    # Default interface options
    #
    time_stamping                  hardware
    [swp29]
    udp_ttl                      1
    masterOnly                   0
    delay_mechanism              E2E
    network_transport            RAWUDPv4
    
  2. Edit the /etc/linuxptp/ts2phc.conf file to set the following parameters to enable PPS In.

    cumulus@switch:~$ sudo nano /etc/linuxptp/ts2phc.conf
    # Default configurations
    [global]
    use_syslog                0
    verbose                   1
    logging_level             6
    slave_event_monitor       /var/run/ptp_sem.sock
    ts2phc.pulsewidth         500000000
    ts2phc.tod_source         ptp 
    #
    # servo parameters
    #
    pi_proportional_const          0.000000
    pi_integral_const              0.000000
    pi_proportional_scale          0.700000
    pi_proportional_exponent       -0.300000
    pi_proportional_norm_max       0.700000
    pi_integral_scale              0.300000
    pi_integral_exponent           0.400000
    pi_integral_norm_max           0.300000
    step_threshold                 0.000000050
    first_step_threshold           0.000000001
    max_frequency                  500000000
    sanity_freq_limit              0
    #
    [/dev/ptp1] 
    ts2phc.pin_index               0 
    ts2phc.channel                 0
    ts2phc.extts_polarity          rising 
    ts2phc.extts_correction        0
    
  3. Enable and start the ptp4l and phc2sys services:

    cumulus@switch:~$ sudo systemctl enable ptp4l.service phc2sys.service
    cumulus@switch:~$ sudo systemctl start ptp4l.service phc2sys.service
    
  1. Edit the /etc/linuxptp/pps_out.conf file to set the following parameters.

    cumulus@switch:~$ sudo nano /etc/linuxptp/pps_out.conf
    # Configuration file used for the pps_out.service
    # It is shell formatted and the file is source'd by the service
    # Set the PTP device to source our PPS from. 
    # If not specified, the service will find the first device with a clock name "sx_ptp".
    PTP_DEV=/dev/ptp1
    # Set the pin index on the PPS device to send on. 
    # On the NVIDIA systems, only pin 1 (0-based) is supported
    OUT_PIN=1
    # Set the file where to cache the last started values. 
    # This is used primarily in the "stop" operation to know what to clean up.
    CACHE_FILE=/var/run/pps_out
    # Set the out pulse charateristics for frequency and width
    PULSE_FREQ=1000000000
    PULSE_WIDTH=500000000
    PULSE_PHASE=0
    
  2. Enable and start the pps_out service:

    cumulus@switch:~$ sudo systemctl enable pps_out.service 
    cumulus@switch:~$ sudo systemctl start pps_out.service 
    

PPS Synchronization Settings

You can configure these PPS settings:

PPS In Setting Description
channel-index Sets the channel index for PPS In. You can set a value of 1 or 0. The default value is 0.
logging-level Sets the logging level for PPS In. You can specify emergency, alert, critical, error, warning, notice, info, or debug. The default logging level is info.
pin-index Sets the pin index for PPS In. You can set a value of 1 or 0. The default value is 0.
signal-polarity Sets the polarity of the PPS In signal. You can specify rising-edge, falling-edge, or both. The default setting is rising-edge.
signal-width Sets the pulse width of the PPS In signal. You can set a value between 1000000 and 999000000. The default value is 500000000.
timestamp-correction Sets the value, in nanoseconds, to add to each PPS In timestamp. You can set a value between -1000000000 and 1000000000. The default value is 0.

PPS Out Setting Description
channel-index Sets the channel index for PPS Out. You can set a value of 1 or 0. The default value is 0.
frequency-adjustment Sets the frequency adjustment of the PPS Out signal. You can set a value between 1000000000 and 2147483647. The default value is 1000000000.
pin-index Sets the pin index for PPS Out. Cumulus Linux supports only pin 1.
signal-width Sets the pulse width of the PPS Out signal. You can set a value between 1000000 and 999000000. The default value is 500000000.

The NVUE CLI includes the phase adjustment setting for PPS Out. Cumulus Linux 5.9 and later does not support this setting.

The following example configures PPS In and sets:

  • The channel index to 1.
  • The pin index to 1.
  • The signal width to 999000000.
  • The timestamp correction to 1000000000.
  • The logging level to warning.
  • The polarity of the PPS In signal to falling-edge.
cumulus@switch:~$ nv set platform pulse-per-second in channel-index 1
cumulus@switch:~$ nv set platform pulse-per-second in pin-index 1
cumulus@switch:~$ nv set platform pulse-per-second in signal-width 999000000
cumulus@switch:~$ nv set platform pulse-per-second in timestamp-correction 1000000000
cumulus@switch:~$ nv set platform pulse-per-second in logging-level warning
cumulus@switch:~$ nv set platform pulse-per-second in signal-polarity falling-edge
cumulus@switch:~$ nv config apply

The following example configures PPS Out and sets:

  • The channel index to 1.
  • The signal width to 999000000.
  • The frequency-adjustment of the PPS Out signal to 2147483647.
cumulus@switch:~$ nv set platform pulse-per-second out channel-index 1
cumulus@switch:~$ nv set platform pulse-per-second out signal-width 999000000
cumulus@switch:~$ nv set platform pulse-per-second out frequency-adjustment 2147483647
cumulus@switch:~$ nv config apply

To configure PPS In, edit the /etc/linuxptp/ts2phc.conf file, then restart the PPS In service with the sudo systemctl restart ts2phc.service command.

The following example configures PPS In and sets:

  • The channel index to 1.
  • The pin index to 1.
  • The signal width to 999000000.
  • The timestamp correction to 1000000000.
  • The logging level to 4 (warning).
  • The polarity of the PPS In signal to falling edge (falling).
cumulus@switch:~$ sudo nano /etc/linuxptp/ts2phc.conf
# ts2phc is enabled 
[global] 
use_syslog                     0 
verbose                        1 
slave_event_monitor            /var/run/ptp_sem.sock 
logging_level                  4 
ts2phc.pulsewidth              999000000 
ts2phc.tod_source              ptp 
domainNumber                   0
...
[/dev/ptp1] 
ts2phc.pin_index               1 
ts2phc.channel                 1 
ts2phc.extts_polarity          falling 
ts2phc.extts_correction        1000000000
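
The NVUE PPS In settings use names such as warning and falling-edge, while the ts2phc.conf file uses the values 4 and falling. The following Python sketch shows one way to translate between them. It assumes that the logging levels follow the standard syslog severity numbers; this section confirms only warning (4) and info (6). It also assumes that both maps to both. LOGGING_LEVELS, POLARITY, and ts2phc_values are hypothetical names.

```python
# Standard syslog severities; only warning=4 and info=6 appear in this section.
LOGGING_LEVELS = {
    "emergency": 0, "alert": 1, "critical": 2, "error": 3,
    "warning": 4, "notice": 5, "info": 6, "debug": 7,
}

# NVUE polarity names to ts2phc.extts_polarity values ("both" is assumed).
POLARITY = {"rising-edge": "rising", "falling-edge": "falling", "both": "both"}


def ts2phc_values(logging_level, signal_polarity):
    """Translate NVUE PPS In names into the values used in ts2phc.conf."""
    return {
        "logging_level": LOGGING_LEVELS[logging_level],
        "ts2phc.extts_polarity": POLARITY[signal_polarity],
    }


print(ts2phc_values("warning", "falling-edge"))
```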

To configure PPS Out, edit the /etc/linuxptp/pps_out.conf file, then restart the PPS Out service with the sudo systemctl restart pps_out.service command.

The following example configures PPS Out and sets:

  • The channel index to 1.
  • The signal width to 999000000.
  • The frequency-adjustment of the PPS Out signal to 2147483647.
cumulus@switch:~$ sudo nano /etc/linuxptp/pps_out.conf
# Configuration file used for the pps_out.service
# It is shell formatted and the file is source'd by the service
#
# Set the PTP device to source our PPS from. 
# If not specified, the service will find the first device with a clock name "sx_ptp".
PTP_DEV=/dev/ptp1
#
# Set the pin index on the PPS device to send on. 
# On the NVIDIA systems, only pin 1 (0-based) is supported
OUT_PIN=1
#
OUT_CHANNEL=1 
#
# Set the file where to cache the last started values. 
# This is used primarily in the "stop" operation to know what to clean up.
CACHE_FILE=/var/run/pps_out
#
# Set the out pulse charateristics for frequency and width
PULSE_FREQ=2147483647
PULSE_WIDTH=999000000
PULSE_PHASE=1000000000
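
Because the pps_out.conf file is shell formatted, you can read its settings with a simple KEY=VALUE parser. The following Python sketch is one way to do this for audit scripts. It handles only plain assignments and comment lines, not general shell syntax, and parse_shell_conf and SAMPLE_PPS_OUT are hypothetical names.

```python
SAMPLE_PPS_OUT = """\
# Set the PTP device to source our PPS from.
PTP_DEV=/dev/ptp1
# Set the pin index on the PPS device to send on.
OUT_PIN=1
#
OUT_CHANNEL=1
PULSE_FREQ=2147483647
PULSE_WIDTH=999000000
"""


def parse_shell_conf(text):
    """Collect KEY=VALUE assignments, skipping comments and blank lines."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


settings = parse_shell_conf(SAMPLE_PPS_OUT)
print(settings["PTP_DEV"], settings["PULSE_WIDTH"])
```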

Show PPS Configuration Settings

To show a summary of the PPS In and PPS Out configuration settings, run the nv show platform pulse-per-second command:

cumulus@switch:~$ nv show platform pulse-per-second
                        applied
----------------------  -----------
in
  state                 enabled
  pin-index             0
  channel-index         0
  signal-width          500000000
  signal-polarity       rising-edge
  timestamp-correction  0
  logging-level         info
out
  state                 disabled
  pin-index             1
  channel-index         0
  frequency-adjustment  1000000000
  phase-adjustment      0
  signal-width          500000000

To show only PPS In configuration settings, run the nv show platform pulse-per-second in command:

cumulus@switch:~$ nv show platform pulse-per-second in
                      applied
--------------------  -----------
state                 enabled
pin-index             0
channel-index         0
signal-width          500000000
signal-polarity       rising-edge
timestamp-correction  0
logging-level         info

To show only PPS Out configuration settings, run the nv show platform pulse-per-second out command:

cumulus@switch:~$ nv show platform pulse-per-second out
                      applied
--------------------  ----------
state                 disabled
pin-index             1
channel-index         0
frequency-adjustment  1000000000
phase-adjustment      0
signal-width          500000000

Synchronous Ethernet - SyncE

SyncE is an ITU-T standard for transmitting clock signals over the Ethernet physical layer to synchronize clocks across the network by propagating frequency using the transmission rate of symbols in the network. A dedicated channel, the Ethernet Synchronization Message Channel (ESMC), manages this synchronization, as specified by the ITU-T Rec. G.8264 standard.

The Cumulus Linux switch includes a SyncE controller and a SyncE daemon.

Cumulus Linux constructs the SyncE clock identity as follows:

  • Only the NVIDIA SN3750-SX switch and the NVIDIA SN5400 switch support SyncE.
  • SyncE on 1G interfaces only supports 1000BASE-SX transceivers, 1000BASE-LX transceivers, and ADVA 5401 GrandMaster transceivers.
  • When you configure SyncE on a switch with PTP enabled, configure ITU-T noise transfer.
  • To use SyncE on the SN5400 switch running Cumulus Linux 5.11 or later, you must upgrade SyncE firmware. See Upgrade SyncE Firmware on the SN5400 Switch below.

Upgrade SyncE Firmware on the NVIDIA SN5400 Switch

The NVIDIA SN5400 switch running Cumulus Linux 5.11 and later requires a firmware upgrade to use SyncE.

To upgrade the SyncE firmware on the SN5400 switch:

  1. From the NVIDIA Enterprise support portal, download all the SyncE firmware files.

  2. Upload the SyncE firmware files to the switch.

  3. Upgrade the firmware with the sudo flint command for each file; for example:

    cumulus@switch:~$ sudo flint -d /dev/mst/mt53120_pciconf0 -i MC000030_HIPPO_ALBATROSS_CLKBRD1_CLK_FW_UPGRADE_REV0100.bin burn
        Current FW version on flash:  0.0
        New FW version:               1.0
    -I- Downloading FW ...
    FSMST_INITIALIZE -   OK
    Writing COMPID_CLOCK_SYNC_EEPROM component -   OK
    FSMST_LOCKED -   OK
    FSMST_DOWNSTREAM_DEVICE_TRANSFER -   OK
    -I- Component FW burn finished successfully.
    -I- To load new FW run reboot machine.
    
    cumulus@switch:~$ sudo flint -d /dev/mst/mt53120_pciconf0 -i MC000031_HIPPO_ALBATROSS_CLKBRD2_CLK_FW_UPGRADE_REV0100.bin burn
        Current FW version on flash:  0.0
        New FW version:               1.0
    -I- Downloading FW ...
    FSMST_INITIALIZE -   OK
    Writing COMPID_CLOCK_SYNC_EEPROM component -   OK
    FSMST_LOCKED -   OK
    FSMST_DOWNSTREAM_DEVICE_TRANSFER -   OK
    -I- Component FW burn finished successfully.
    -I- To load new FW run reboot machine.
    
  4. Completely reset the system by removing power for at least 30 seconds.

Basic Configuration

Basic SyncE configuration requires you:

The basic configuration shown below uses the default SyncE settings:

cumulus@switch:~$ nv set system synce enable on
cumulus@switch:~$ nv set interface swp2 synce enable on
cumulus@switch:~$ nv set interface swp2 synce bundle-id 10
cumulus@switch:~$ nv config apply

Edit the /etc/synced/synced.conf file to configure the interface, then enable and start the SyncE service. Adding an interface section in the /etc/synced/synced.conf file enables SyncE on that interface.

The following example enables SyncE on swp2.

cumulus@switch:~$ sudo nano /etc/synced/synced.conf
...
# NVUE SyncE state is enable on

[global]
twtr_seconds=300
priority=1

[swp2]
bundle=10
cumulus@switch:~$ sudo systemctl enable synced.service
cumulus@switch:~$ sudo systemctl start synced.service
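
As a rough illustration of the file layout shown above, the following Python sketch renders a minimal synced.conf-style file with a [global] section and one section per interface. The function name render_synced_conf is hypothetical, and the sketch only reproduces the keys that appear in the example above (twtr_seconds, priority, and bundle); it does not write to the switch or validate values.

```python
import configparser
import io


def render_synced_conf(interfaces, twtr_seconds=300, priority=1):
    """Render a minimal synced.conf-style file as text (hypothetical helper).

    interfaces maps an interface name (for example "swp2") to its bundle ID.
    """
    parser = configparser.ConfigParser()
    parser["global"] = {
        "twtr_seconds": str(twtr_seconds),
        "priority": str(priority),
    }
    # Adding an interface section is what enables SyncE on that interface.
    for name, bundle in interfaces.items():
        parser[name] = {"bundle": str(bundle)}

    buffer = io.StringIO()
    # Write key=value without spaces, matching the style of the example file.
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


conf_text = render_synced_conf({"swp2": 10})
print(conf_text)
```

You can review the generated text before you copy it into /etc/synced/synced.conf and restart the SyncE service.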

Optional Global Configuration

Wait to Restore Time

The wait to restore time is the number of seconds SyncE waits for each port to be up before opening the Ethernet Synchronization Message Channel (ESMC) for messages. You can set a value between 0 and 720 seconds (12 minutes). The default value is 300 seconds (5 minutes).

The following command example sets the wait to restore time to 180 seconds (3 minutes):

cumulus@switch:~$ nv set system synce wait-to-restore-time 180
cumulus@switch:~$ nv config apply

Edit the /etc/synced/synced.conf file to change the twtr_seconds setting, then restart the SyncE service.

cumulus@switch:~$ sudo nano /etc/synced/synced.conf
...
[global]
twtr_seconds=180
cumulus@switch:~$ sudo systemctl restart synced.service
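
If you generate this setting from a script, a small range check can catch mistakes before you apply the configuration. The following Python sketch assumes the valid range is 0 to 720 seconds (12 minutes); the function name check_wait_to_restore is hypothetical.

```python
def check_wait_to_restore(seconds):
    """Validate a wait to restore time and return it in minutes.

    Assumes the valid range is 0 to 720 seconds inclusive.
    """
    if not 0 <= seconds <= 720:
        raise ValueError(f"wait to restore time {seconds} is outside 0-720 seconds")
    return seconds / 60


print(check_wait_to_restore(180))
```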

Priority

You can set the priority for the clock source. The lowest priority is 1 and the highest priority is 256. If two clock sources have the same priority, the switch uses the lowest clock source.

The following example command sets the priority to 256:

cumulus@switch:~$ nv set system synce provider-default-priority 256
cumulus@switch:~$ nv config apply

Edit the /etc/synced/synced.conf file to change the priority setting, then restart the SyncE service.

cumulus@switch:~$ sudo nano /etc/synced/synced.conf
...
[global]
twtr_seconds=180
priority=256
cumulus@switch:~$ sudo systemctl restart synced.service

Minimum Acceptable Quality Level

You can prevent SyncE from tracking a source with a quality level lower than a specific value. The quality level can be: eec1, eeec, ssu-b, ssu-a, prc, eprc, prtc, or eprtc, where eec1 is the lowest quality level and eprtc is the highest quality level.

Run the nv set system synce min-acceptable-ql <quality-level> command. The following example prevents SyncE from tracking a source with a quality level lower than ssu-b:

cumulus@switch:~$ nv set system synce min-acceptable-ql ssu-b
cumulus@switch:~$ nv config apply

Edit the /etc/synced/synced.conf file to add the min_ql parameter, then restart the SyncE service. The following example prevents SyncE from tracking a source with a quality level lower than ssu-b:

cumulus@switch:~$ sudo nano /etc/synced/synced.conf
...
min_ql=ssu-b
cumulus@switch:~$ sudo systemctl restart synced.service
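
The following Python sketch illustrates how a minimum acceptable quality level filters sources. It assumes that the quality levels listed above are in ascending order (eec1 lowest, eprtc highest); the function name is_acceptable is hypothetical and the sketch does not model how the SyncE service makes this decision internally.

```python
# Quality levels from lowest to highest, in the order listed in this section.
QUALITY_LEVELS = ["eec1", "eeec", "ssu-b", "ssu-a", "prc", "eprc", "prtc", "eprtc"]


def is_acceptable(source_ql, min_ql):
    """Return True if source_ql is at least as good as min_ql."""
    return QUALITY_LEVELS.index(source_ql) >= QUALITY_LEVELS.index(min_ql)


# With min-acceptable-ql set to ssu-b, a prc source is tracked; an eec1 source is not.
print(is_acceptable("prc", "ssu-b"))
print(is_acceptable("eec1", "ssu-b"))
```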

Logging

You can set the logging level that the SyncE service uses:

The following example command sets the logging level to debug.

cumulus@switch:~$ nv set system synce log-level debug
cumulus@switch:~$ nv config apply

Edit the /etc/synced/synced.conf file to change the logging level setting, then reload the SyncE service.

cumulus@switch:~$ sudo nano /etc/synced/synced.conf
...
[global]
twtr_seconds=180
priority=256
loglevel=debug
cumulus@switch:~$ sudo systemctl reload synced.service

Optional Interface Configuration

Frequency Source Priority

The clock selection algorithm uses the frequency source priority on an interface to choose between two sources that have the same QL. You can specify a value between 1 (the highest priority) and 256 (the lowest priority). The default value is 1.

The following command example sets the priority on swp1 to 10, on swp2 to 20, and on swp3 to 10:

cumulus@switch:~$ nv set interface swp1 synce provider-priority 10
cumulus@switch:~$ nv set interface swp2 synce provider-priority 20
cumulus@switch:~$ nv set interface swp3 synce provider-priority 10
cumulus@switch:~$ nv config apply

Edit the /etc/synced/synced.conf file to change the priority setting for the interface, then restart the SyncE service.

cumulus@switch:~$ sudo nano /etc/synced/synced.conf
...
[global]
twtr_seconds=180
priority=256
log-level=debug

[swp1]
priority=10
 
[swp2]
priority=20
 
[swp3]
priority=10
cumulus@switch:~$ sudo systemctl restart synced.service
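
To make the role of the interface priority concrete, the following Python sketch picks a source by highest quality level first and then by the lowest priority number (1 is the highest priority). This is a simplified illustration, not the full ITU-T clock selection algorithm. The Source class, the select_source function, and the final tie-break on interface name are assumptions made for the example.

```python
from dataclasses import dataclass

# Quality levels from lowest to highest, in the order listed in this section.
QL_ORDER = ["eec1", "eeec", "ssu-b", "ssu-a", "prc", "eprc", "prtc", "eprtc"]


@dataclass
class Source:
    interface: str
    quality_level: str
    priority: int = 1  # 1 is the highest priority, 256 the lowest


def select_source(sources):
    """Pick the best source: highest QL, then lowest priority number.

    The interface name is used as a last tie-break (an assumption).
    """
    if not sources:
        return None
    return min(
        sources,
        key=lambda s: (-QL_ORDER.index(s.quality_level), s.priority, s.interface),
    )


candidates = [
    Source("swp1", "prc", 10),
    Source("swp2", "prc", 20),
    Source("swp3", "eec1", 10),
]
print(select_source(candidates).interface)
```

In this example, swp1 and swp2 advertise the same QL, so the lower priority number on swp1 decides the selection.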

Troubleshooting

Show SyncE Configuration and Counters

To show global SyncE configuration, run the NVUE nv show system synce command or the Linux syncectl show status command.

To show SyncE configuration for a specific interface, run the NVUE nv show interface <interface-id> synce command or the Linux syncectl show interface status <interface> command.

cumulus@switch:~$ nv show system synce
                           operational                                                        applied
-------------------------  -----------------------------------------------------------------  -------
enable                     On                                                                 on
log-level                  notice
provider-default-priority  10                                                                 10
wait-to-restore-time       40                                                                 40
clock-identity             0x849e00fffe00ca00
local-clock-quality        eec1
network-type               1
summary                    Group #0: TRACKING holdover acquired on swp1. freq_diff: 77 (ppb)

To show SyncE statistics for a specific interface, run the NVUE nv show interface <interface-id> counters synce command or the Linux syncectl show interface counters <interface> command:

cumulus@switch:~$ nv show interface swp2 counters synce
Packet Type                       Received       Transmitted    
---------------------             ------------   ------------   
ESMC                                      700            708
ESMC Error                                  0              0
ESMC DNU                                  549              0
ESMC EEC1                                   1            558
ESMC E-EEC                                  0              0
ESMC SSU B                                  0              0
ESMC SSU A                                  0              0
ESMC PRC                                  150            150
ESMC E-PRC                                  0              0
ESMC PRTC                                   0              0
ESMC E-PRTC                                 0              0
ESMC Unknown                                0              0
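
If you collect this output in a script (for example, over SSH), the following Python sketch shows one way to turn the table into a dictionary. It assumes the column layout shown above, with two numeric columns at the end of each row; the function name parse_synce_counters is hypothetical and the sample text is an excerpt of the output above.

```python
SAMPLE_OUTPUT = """\
Packet Type                       Received       Transmitted
---------------------             ------------   ------------
ESMC                                      700            708
ESMC Error                                  0              0
ESMC DNU                                  549              0
ESMC EEC1                                   1            558
ESMC PRC                                  150            150
"""


def parse_synce_counters(text):
    """Map each packet type to a (received, transmitted) tuple."""
    counters = {}
    for line in text.splitlines():
        # Split off the last two columns; the rest is the packet type name.
        parts = line.rsplit(None, 2)
        if len(parts) != 3:
            continue
        name, received, transmitted = parts
        if not (received.isdigit() and transmitted.isdigit()):
            continue  # skip the header and separator rows
        counters[name.strip()] = (int(received), int(transmitted))
    return counters


counters = parse_synce_counters(SAMPLE_OUTPUT)
print(counters["ESMC DNU"])
```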

Clear SyncE Interface Counters

To clear counters for a specific SyncE interface, run the NVUE nv action clear interface <interface> counters synce command or the Linux syncectl clear interface counters <interface> command.

cumulus@switch:~$ nv action clear interface swp1 counters synce
swp1 counters cleared
Action succeeded

To clear counters for all SyncE interfaces, run the syncectl clear counters command.

To see all the syncectl commands, run syncectl -h.

ITU G.781

Authentication Authorization and Accounting

This section describes how to set up user accounts and SSH for remote access, and configure LDAP authentication, TACACS+, and RADIUS AAA.

SSH for Remote Access

Cumulus Linux uses the OpenSSH package to provide access to the system using the Secure Shell (SSH) protocol.

Configure SSH

You can configure SSH to provide login access to the root user and to specific user accounts, limit SSH to listen on a specific VRF, and configure timeouts and session options.

Root User Settings

By default, the root account cannot use SSH to log in.

You can configure the root account to use SSH to log into the switch with:

To allow the root account to SSH into the switch with a password:

cumulus@switch:~$ nv set system ssh-server permit-root-login enabled
cumulus@switch:~$ nv config apply

Run the nv set system ssh-server permit-root-login disabled command to disable SSH login for the root account with a password.

To allow the root account to SSH into the switch and authenticate with a public key or any allowed mechanism that is not a password and not keyboard interactive:

cumulus@switch:~$ nv set system ssh-server permit-root-login prohibit-password
cumulus@switch:~$ nv config apply

To allow the root account to SSH into the switch and only run a set of commands defined in the authorized_keys file:

cumulus@switch:~$ nv set system ssh-server permit-root-login forced-commands-only
cumulus@switch:~$ nv config apply

To allow the root account to SSH into the switch using a password, edit the /etc/ssh/sshd_config file and set the PermitRootLogin option to yes:

cumulus@switch:~$ sudo cat /etc/ssh/sshd_config
...
# Authentication:
LoginGraceTime 2m
PermitRootLogin yes
...

Set the PermitRootLogin option to no to disable SSH login for the root account with a password.

To allow the root account to SSH into the switch and authenticate with a public key or any allowed mechanism that is not a password and not keyboard interactive:

  1. Create an .ssh directory for the root user.

    cumulus@switch:~$ sudo mkdir -p /root/.ssh
    cumulus@switch:~$ sudo chmod 0700 /root/.ssh 
    
  2. As a privileged user (such as the cumulus user), either echo the public key contents and redirect the contents to the authorized key file or copy the public key file to the switch, then copy it to the root account (with privilege escalation).

    To echo the public key contents and redirect the contents to the authorized key file:

    cumulus@switch:~$ echo "<SSH public key contents>" | sudo tee -a /root/.ssh/authorized_keys 
    cumulus@switch:~$ sudo chmod 0644 /root/.ssh/authorized_keys 
    

    To copy the public key file to the switch, then copy it to the root account:

    cumulus@switch:~$ sudo cp <SSH public key file> /root/.ssh/authorized_keys 
    cumulus@switch:~$ sudo chmod 0644 /root/.ssh/authorized_keys
    

Allow and Deny Users

To allow certain users to establish an SSH session:

cumulus@switch:~$ nv set system ssh-server allow-users user1
cumulus@switch:~$ nv config apply

To prevent certain users from establishing an SSH session:

cumulus@switch:~$ nv set system ssh-server deny-users user4
cumulus@switch:~$ nv config apply

To allow certain users to establish an SSH session, edit the /etc/ssh/sshd_config file and add the AllowUsers parameter:

cumulus@switch:~$ sudo cat /etc/ssh/sshd_config
...
...
# Example of overriding settings on a per-user basis
#Match User anoncvs
# X11Forwarding no
# AllowTcpForwarding no
# PermitTTY no
# ForceCommand cvs server
AllowUsers = user1

To prevent certain users from establishing an SSH session, edit the /etc/ssh/sshd_config file and add the DenyUsers parameter:

cumulus@switch:~$ sudo cat /etc/ssh/sshd_config
...
# Example of overriding settings on a per-user basis
#Match User anoncvs
# X11Forwarding no
# AllowTcpForwarding no
# PermitTTY no
# ForceCommand cvs server
AllowUsers = user1
DenyUsers  = user4
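
When both lists are present, OpenSSH checks DenyUsers before AllowUsers, and when AllowUsers is set, only the listed users can log in. The following Python sketch models that behavior for plain user names only. It ignores patterns, user@host entries, and the AllowGroups and DenyGroups options, and the function name ssh_login_permitted is hypothetical.

```python
def ssh_login_permitted(user, allow_users=(), deny_users=()):
    """Simplified model of the AllowUsers and DenyUsers checks in sshd."""
    if user in deny_users:
        return False  # DenyUsers is evaluated first
    if allow_users and user not in allow_users:
        return False  # a non-empty AllowUsers list excludes everyone else
    return True


allow = ["user1"]
deny = ["user4"]
for name in ("user1", "user2", "user4"):
    print(name, ssh_login_permitted(name, allow, deny))
```

Note that with AllowUsers set to user1, an unlisted account such as user2 cannot log in even though it is not in the DenyUsers list.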

SSH and VRFs

The SSH service runs in the default VRF on the switch but listens on all interfaces in all VRFs. You can limit SSH to listen on specific VRFs.

You cannot run SSH in the default VRF and other VRFs at the same time.

The following example configures SSH to listen only on the management VRF:

cumulus@switch:~$ nv set system ssh-server vrf mgmt
cumulus@switch:~$ nv config apply

The following example configures SSH to listen on the management VRF and VRF RED:

cumulus@switch:~$ nv set system ssh-server vrf mgmt
cumulus@switch:~$ nv set system ssh-server vrf RED
cumulus@switch:~$ nv config apply

Bind the SSH service to the VRF. The following example configures SSH to listen only on the management VRF:

cumulus@switch:~$ sudo systemctl stop ssh.service
cumulus@switch:~$ sudo systemctl disable ssh.service
cumulus@switch:~$ sudo systemctl start ssh@mgmt.service
cumulus@switch:~$ sudo systemctl enable ssh@mgmt.service

The following example configures SSH to listen on the management VRF and VRF RED:

cumulus@switch:~$ sudo systemctl stop ssh.service
cumulus@switch:~$ sudo systemctl disable ssh.service
cumulus@switch:~$ sudo systemctl start ssh@mgmt.service
cumulus@switch:~$ sudo systemctl enable ssh@mgmt.service
cumulus@switch:~$ sudo systemctl start ssh@RED.service
cumulus@switch:~$ sudo systemctl enable ssh@RED.service

To configure SSH to listen to only one IP address or a subnet in a VRF, you need to bind the service to that VRF (as above), then set the ListenAddress parameter in the /etc/ssh/sshd_config file to the IP address or subnet in that VRF.

cumulus@switch:~$ sudo cat /etc/ssh/sshd_config
...

#Port 22
#AddressFamily any
ListenAddress 10.10.10.6
#ListenAddress ::

Enable and Disable the SSH Server

Cumulus Linux enables the SSH server by default. To disable the SSH server:

cumulus@switch:~$ nv set system ssh-server state disabled
cumulus@switch:~$ nv config apply

Run the nv set system ssh-server state enabled command to re-enable the SSH server.

To disable the SSH server with Linux commands:

cumulus@switch:~$ sudo systemctl stop ssh.service
cumulus@switch:~$ sudo systemctl disable ssh.service

To re-enable the SSH server:

cumulus@switch:~$ sudo systemctl start ssh.service
cumulus@switch:~$ sudo systemctl enable ssh.service

SSH Strict Mode

By default, SSH strict mode is on; Cumulus Linux disables X11, TCP forwarding, and compression and enforces secure ciphers.

To disable SSH strict mode, run the nv set system ssh-server strict disabled command:

cumulus@switch:~$ nv set system ssh-server strict disabled
cumulus@switch:~$ nv config apply

To re-enable strict mode, run the nv set system ssh-server strict enabled command.

To show if strict mode is on or off, run the nv show system ssh-server command:

cumulus@switch:~$ nv show system ssh-server

                             applied
---------------------------  --------
authentication-retries       6
login-timeout                120
inactive-timeout             15
permit-root-login            enabled
max-sessions-per-connection  30
state                        enabled
strict                       disabled
...  

Edit the /etc/ssh/sshd_config file and change the AllowTcpForwarding, X11Forwarding and Compression parameters to yes. Also, remove the ciphers and keys under #RekeyLimit default none in the Ciphers and keying section of the file.

cumulus@switch:~$ sudo nano /etc/ssh/sshd_config
...

# Ciphers and keying
#RekeyLimit default none
...
#AllowAgentForwarding yes
AllowTcpForwarding yes
#GatewayPorts no
X11Forwarding yes
#X11DisplayOffset 10
#X11UseLocalhost yes
#PermitTTY yes
PrintMotd no
#PrintLastLog yes
#TCPKeepAlive yes
#PermitUserEnvironment no
Compression yes
ClientAliveInterval 0
ClientAliveCountMax 0
#UseDNS no
#PidFile /var/run/sshd.pid
MaxStartups 10:30:100
#PermitTunnel no
#ChrootDirectory none
#VersionAddendum none

Configure Timeouts and Sessions

You can configure the following SSH timeout and session options:

The following example configures the number of login attempts allowed before rejecting the SSH session to 10 and the number of seconds allowed before login times out to 200:

cumulus@switch:~$ nv set system ssh-server authentication-retries 10
cumulus@switch:~$ nv set system ssh-server login-timeout 200
cumulus@switch:~$ nv config apply

Edit the /etc/ssh/sshd_config file and change the MaxAuthTries parameter in the Authentication section to 10 and the LoginGraceTime parameter to 200:

cumulus@switch:~$ sudo nano /etc/ssh/sshd_config
...
# Authentication:

LoginGraceTime 200s
PermitRootLogin prohibit-password
#StrictModes yes
MaxAuthTries 10
MaxSessions 10

The following example configures the TCP port that listens for incoming SSH sessions to 443:

cumulus@switch:~$ nv set system ssh-server port 443
cumulus@switch:~$ nv config apply

Edit the /etc/ssh/sshd_config file and add the Port parameter:

cumulus@switch:~$ sudo nano /etc/ssh/sshd_config
...
Port 443
#AddressFamily any
#ListenAddress 0.0.0.0
#ListenAddress ::
...

The following example configures the amount of time a session can be inactive before the SSH server terminates the connection to 5 minutes (300 seconds) and the maximum number of SSH sessions allowed per TCP connection to 5. The default inactive-timeout is 15 minutes and the default max-sessions-per-connection is 10:

cumulus@switch:~$ nv set system ssh-server inactive-timeout 5
cumulus@switch:~$ nv set system ssh-server max-sessions-per-connection 5
cumulus@switch:~$ nv config apply

Edit the Authentication section of the /etc/ssh/sshd_config file.

  • To configure the amount of time (in seconds) a session can be inactive before the SSH server terminates the connection, change the ClientAliveInterval parameter.
  • To configure the maximum number of SSH sessions allowed per TCP connection, change the MaxSessions parameter.

cumulus@switch:~$ sudo nano /etc/ssh/sshd_config
...
# Authentication:

LoginGraceTime 120s
PermitRootLogin prohibit-password
#StrictModes yes
MaxAuthTries 10
MaxSessions 5
...
#AllowAgentForwarding yes
#AllowTcpForwarding yes
#GatewayPorts no
X11Forwarding yes
#X11DisplayOffset 10
#X11UseLocalhost yes
#PermitTTY yes
PrintMotd no
#PrintLastLog yes
#TCPKeepAlive yes
#PermitUserEnvironment no
#Compression delayed
ClientAliveInterval 300
...
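
The NVUE settings and the sshd_config parameters in the examples above use different names and, for the inactive timeout, different units (minutes in NVUE, seconds in sshd_config). The following Python sketch collects those pairs in one table. The mapping is inferred from the paired examples in this section and is not an official translation table; the names NVUE_TO_SSHD and sshd_lines are hypothetical.

```python
# NVUE ssh-server setting -> (sshd_config keyword, value converter).
# The pairs are inferred from the paired examples in this section.
NVUE_TO_SSHD = {
    "port": ("Port", str),
    "login-timeout": ("LoginGraceTime", lambda seconds: f"{seconds}s"),
    "authentication-retries": ("MaxAuthTries", str),
    "max-sessions-per-connection": ("MaxSessions", str),
    # NVUE takes minutes; ClientAliveInterval takes seconds.
    "inactive-timeout": ("ClientAliveInterval", lambda minutes: str(minutes * 60)),
}


def sshd_lines(nvue_settings):
    """Return sshd_config lines for the NVUE settings that are present."""
    lines = []
    for name, (keyword, convert) in NVUE_TO_SSHD.items():
        if name in nvue_settings:
            lines.append(f"{keyword} {convert(nvue_settings[name])}")
    return lines


example = sshd_lines({"inactive-timeout": 5, "max-sessions-per-connection": 5})
print("\n".join(example))
```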

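The following example configures:

  • The number of unauthenticated SSH sessions allowed before throttling starts to 5.
  • The starting percentage of connections to reject above the throttle start count before reaching the session count limit to 22.
  • The maximum number of unauthenticated SSH sessions allowed to 20.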

cumulus@switch:~$ nv set system ssh-server max-unauthenticated throttle-start 5
cumulus@switch:~$ nv set system ssh-server max-unauthenticated throttle-percent 22
cumulus@switch:~$ nv set system ssh-server max-unauthenticated session-count 20
cumulus@switch:~$ nv config apply

Edit the /etc/ssh/sshd_config file and change the MaxStartups parameter.

The following example configures:

  • The number of unauthenticated SSH sessions allowed before throttling starts to 5.
  • The starting percentage of connections to reject above the throttle start count before reaching the session count limit to 22.
  • The maximum number of unauthenticated SSH sessions allowed to 20.

cumulus@switch:~$ sudo nano /etc/ssh/sshd_config
...
MaxStartups 5:22:20
...
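
The three MaxStartups values are start:rate:full. According to the OpenSSH sshd_config documentation, sshd refuses connection attempts with a probability of rate percent when there are start unauthenticated connections, and the probability increases linearly until all attempts are refused at full. The following Python sketch computes that percentage. The function name drop_percent is hypothetical, and the sketch uses floating point where sshd uses integer arithmetic, so treat the results as approximate.

```python
def drop_percent(unauthenticated, max_startups="10:30:100"):
    """Approximate percentage of new connections that sshd refuses.

    max_startups uses the start:rate:full form, which corresponds to the NVUE
    throttle-start, throttle-percent, and session-count settings.
    """
    start, rate, full = (int(value) for value in max_startups.split(":"))
    if unauthenticated < start:
        return 0.0
    if unauthenticated >= full:
        return 100.0
    # Linear increase from rate percent at start to 100 percent at full.
    return rate + (100 - rate) * (unauthenticated - start) / (full - start)


for sessions in (4, 5, 10, 20):
    print(sessions, drop_percent(sessions, "5:22:20"))
```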

SSH Login Notifications

Cumulus Linux shows the following SSH login information on the console after authentication:

Cumulus Linux displays login notifications for both SSH and serial connections. The information can help to detect unwanted or malicious activities, such as suspicious logins or password and role changes.

To configure the time period in days during which to show login notifications, run the nv set system ssh-server login-record-period <days> command. You can specify a value between 1 and 30. The default value is 1.

The following example sets the SSH login notification period to 20 days:

cumulus@switch:~$ nv set system ssh-server login-record-period 20
cumulus@switch:~$ nv config apply

To set the SSH login notification period back to the default value (1 day), run the nv unset system ssh-server login-record-period command.

To show the configured SSH login notification period, run the nv show system ssh-server command. See Troubleshooting below.

Generate and Install an SSH Key Pair

This section describes how to generate an SSH key pair on one system and install the key as an authorized key on another system.

Generate an SSH Key Pair

To generate an SSH key pair, run the ssh-keygen command and follow the prompts.

Cumulus Linux does not support SHA-1 SSH key exchange methods.

To configure the system without a password, do not enter a passphrase when prompted in the following step.

cumulus@host01:~$ ssh-keygen 
Generating public/private rsa key pair. 
Enter file in which to save the key (/home/cumulus/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/cumulus/.ssh/id_rsa. 
Your public key has been saved in /home/cumulus/.ssh/id_rsa.pub. 
The key fingerprint is: 
5a:b4:16:a0:f9:14:6b:51:f6:f6:c0:76:1a:35:2b:bb cumulus@leaf04 
The key's randomart image is: 
+---[RSA 2048]----+ 
|      +.o   o    | 
|     o * o . o   | 
|    o + o O o    | 
|     + . = O     | 
|      . S o .    | 
|       +   .     | 
|      .   E      | 
|                 | 
|                 | 
+-----------------+ 

Install an Authorized SSH Key

To install an authorized SSH key, you take the contents of an SSH public key and add it to the SSH authorized key file (~/.ssh/authorized_keys) of the user.

A public key is a text file with three space-separated fields:

<type> <key string> <comment>

Field         Description
------------  -----------
<type>        The algorithm you want to use to hash the key. The algorithm can be ecdsa-sha2-nistp256, ecdsa-sha2-nistp384, ecdsa-sha2-nistp521, ssh-dss, ssh-ed25519, or ssh-rsa (the default value).
<key string>  A base64 format string for the key.
<comment>     A single word string. By default, this is the name of the system that generated the key. NVUE uses the <comment> field as the key name.

The procedure to install an authorized SSH key is different based on whether the user is an NVUE managed user or a non-NVUE managed user.

The following example adds an authorized key named prod_key to the user admin2. The content of the public key file is ssh-rsa 1234 prod_key.

cumulus@leaf01:~$ nv set system aaa user admin2 ssh authorized-key prod_key key XABDB3NzaC1yc2EAAAADAQABAAABgQCvjs/RFPhxLQMkckONg+1RE1PTIO2JQhzFN9TRg7ox7o0tfZ+IzSB99lr2dmmVe8FRWgxVjc...
cumulus@leaf01:~$ nv set system aaa user admin2 ssh authorized-key prod_key type ssh-rsa
cumulus@leaf01:~$ nv config apply
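
The following Python sketch shows how the three fields of a public key line map onto the two NVUE commands in the example above. It only builds the command strings and does not run them. The function name nvue_authorized_key_commands is hypothetical, and the sketch assumes the comment field is a single word that is safe to use as the key name.

```python
def nvue_authorized_key_commands(user, public_key_line):
    """Build the nv set commands for one public key line.

    The line must have the form "<type> <key string> <comment>".
    """
    key_type, key_string, comment = public_key_line.strip().split(None, 2)
    # NVUE uses the comment field as the key name.
    base = f"nv set system aaa user {user} ssh authorized-key {comment}"
    return [f"{base} key {key_string}", f"{base} type {key_type}"]


commands = nvue_authorized_key_commands("admin2", "ssh-rsa 1234 prod_key")
for command in commands:
    print(command)
```

Run nv config apply after you enter the generated commands.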

The following example adds an authorized key file from the account cumulus on a host to the cumulus account on the switch:

  1. To copy a previously generated public key to the desired location, run the ssh-copy-id command and follow the prompts:

    cumulus@host01:~$ ssh-copy-id -i /home/cumulus/.ssh/id_rsa.pub cumulus@leaf02
    The authenticity of host 'leaf02 (192.168.0.11)' can't be established.
    ECDSA key fingerprint is b1:ce:b7:6a:20:f4:06:3a:09:3c:d9:42:de:99:66:6e.
    Are you sure you want to continue connecting (yes/no)? yes
    /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
    /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
    cumulus@leaf01's password:
    Number of key(s) added: 1
    

    The ssh-copy-id command does not work if the username on the remote switch is different from the username on the local switch. To work around this issue, use the scp command instead:

    cumulus@host01:~$ scp .ssh/id_rsa.pub cumulus@leaf02:.ssh/authorized_keys
    Enter passphrase for key '/home/cumulus/.ssh/id_rsa':
    id_rsa.pub
    
  2. Connect to the remote switch to confirm that the authentication keys are in place:

    cumulus@leaf01:~$ ssh cumulus@leaf02
    Welcome to Cumulus VX (TM) 
    Cumulus VX (TM) is a community supported virtual appliance designed for
    experiencing, testing and prototyping the latest technology.
    For any questions or technical support, visit our community site at:
    http://community.cumulusnetworks.com 
    The registered trademark Linux (R) is used pursuant to a sublicense from LMI,
    the exclusive licensee of Linus Torvalds, owner of the mark on a world-wide basis. 
    Last login: Thu Sep 29 16:56:54 2016
    

Troubleshooting

To show all the current SSH server configuration settings, run the NVUE nv show system ssh-server command:

cumulus@switch:~$ nv show system ssh-server
                             applied          
---------------------------  -----------------
authentication-retries       6               
login-timeout                120            
inactive-timeout             0           
permit-root-login            prohibit-password
max-sessions-per-connection  10 
state                        enabled       
strict                       enabled
login-record-period          20          
max-unauthenticated                                              
  session-count              100         
  throttle-percent           30            
  throttle-start             10

To show the current number of active SSH sessions, run the NVUE nv show system ssh-server active-sessions command or the Linux w command:

cumulus@switch:~$ nv show system ssh-server active-sessions
Peer Address:Port    Local Address:Port      State
-------------------  ----------------------  -----
192.168.200.1:46528  192.168.200.11%mgmt:22  ESTAB
cumulus@switch:~$ w
 11:10:46 up 19:19,  4 users,  load average: 0.08, 0.05, 0.05
USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU WHAT
cumulus  ttyS0    -                Wed15   19:19m  0.03s  0.02s -bash
cumulus  pts/0    192.168.200.1    07:27    3:43m  0.03s  0.03s -bash
cumulus  pts/1    192.168.200.1    10:01    1:09m  0.02s  0.02s -bash
cumulus  pts/2    192.168.200.1    11:10    1.00s  0.03s  0.00s w

To show which users can establish an SSH session, run the nv show system ssh-server allow-users command. To show which users cannot establish an SSH session, run the nv show system ssh-server deny-users command. You can also show information for a specific user with the nv show system ssh-server allow-users <user> command and the nv show system ssh-server deny-users <user> command.

To show the TCP port numbers that listen for incoming SSH sessions, run the nv show system ssh-server port command. You can also show information for a specific port with the nv show system ssh-server port <port> command.

To show the SSH timer and session information, run the nv show system ssh-server max-unauthenticated command:

cumulus@switch:~$ nv show system ssh-server max-unauthenticated
                  applied
----------------  -------
session-count     20     
throttle-percent  22     
throttle-start    5

Log Files with NVUE

NVUE provides commands to show the current system logging configuration, show the contents of the log files on the switch, and to delete log files.

Show System Logging Configuration

To show the current system log configuration on the switch, run the nv show system log command:

cumulus@switch:~$ nv show system log

Show System Log Files

To show the contents of the most current system log file, run the nv show system log file command. The contents are shown with the less command, which enables you to scroll through the file interactively. The less command is typically used to view the most recent log entries.

cumulus@switch:~$ nv show system log file

The nv show system log file command provides the following options:

Option       Description
-----------  -----------
brief        Shows the contents of the most current system log file but in a more concise format.
follow       Shows the contents of a system log file in real-time. The command shows the log file output continuously as it is updated, similar to the behavior of the tail -f command.
list         Shows the available system log files on the system with their filenames and corresponding file paths.
<file-name>  Shows the contents of a specific system log file. If the file is a regular log file (such as syslog.1), the system uses less so that you can scroll and search through the log entries. If the file is compressed (such as syslog.2.gz), the system displays the contents without decompressing the file. This command is useful for viewing both archived and compressed log files.

The following example shows the contents of the most current system log file:

cumulus@switch:~$ nv show system log file

The following example shows the contents of the most current system log file in a more concise format:

cumulus@switch:~$ nv show system log file brief

The following example shows the contents of a system log file in real-time:

cumulus@switch:~$ nv show system log file follow

The following example shows the available system log files on the system with their filenames and corresponding file paths:

cumulus@switch:~$ nv show system log file list

The following example shows the contents of syslog.1:

cumulus@switch:~$ nv show system log file syslog.1

Show Components Generating the Logs

To show the components of the system generating the logs and the log severity levels associated with each component, run the nv show system log component command.

cumulus@switch:~$ nv show system log component 
Component         Level 
----------------  ------ 
nvue             info  
orchagent         notice 
portsyncd         notice 
sai_api_port      notice 
sai_api_switch    notice 
symmetry-manager  info  
syncd             notice

The nv show system log component command provides the following options:

Option                      Description
--------------------------  -----------
<component-name> file       Shows the contents of the most current file for a specific component. The system uses the less command so that you can scroll through the file interactively.
<component-name> file list  Provides a list of log files for the specified component and shows the associated logs.

Component File List

System Component   Files
-----------------  -----
apt                All files in the /var/log/apt directory. The most current file is history.log.
routing            All files in /var/log/frr directory. The most current file is frr.log.
auth               auth.log
audit              All files in the /var/log/audit directory.
boot               boot.log
dpkg               All files in the /var/log/dpkg directory.
installer          All files in the /var/log/installer directory.
stp                mstpd.log
nvue               nvued.log, nv-cli.log. The most current file is nvued.log
otlp-telemetry     All files in the /var/log/otlp-telemetry directory.
nv-telemetry       All files in the /var/log/nv-telemetry directory.
mlag               clagd.log
csmgr              csmgrd.log, cl-system-services.log
platform-thermal   tc_log
ptp                ptp4l.log
synce              synced.log, synced.log, synced-selector.log
pps                ts2phc.log
platform-phc       phc2sys.log
ptp-firefly-servo
ptm                ptmd.log
ifupdown2          ifupdown2/*/ifupdown2.debug.log
nginx              All files in the nginx folder. The most current file is access.log.
datapath           mswitchd.log

The following example shows the contents of the most current file for NVUE:

cumulus@switch:~$ nv show system log component nvue file

The following example shows the log files and associated logs for NVUE:

cumulus@switch:~$ nv show system log component nvue file list 

Delete System Log Files

Deleting log files enables you to manage storage space and ensure that only relevant logs remain. You typically delete log files after you upload or archive them, or when you no longer need the logs for troubleshooting or auditing. Log file deletion is a crucial step in log management to ensure that outdated or irrelevant data does not occupy system resources.

To delete a log file, run the nv action delete system log file <file-name> command:

cumulus@switch:~$ nv action delete system log file mstpd.log 

To delete a log file from a specific system component, run the nv action delete system log component <component-name> file <file-name> command:

cumulus@switch:~$ nv action delete system log component nvue file nvued.log

User Accounts

By default, Cumulus Linux has two user accounts: cumulus and root.

The cumulus account:

The root account:

Add a New User Account

You can add additional user accounts as needed.

Default Roles

Cumulus Linux provides the following default roles:

Role
Permissions
system-admin Allows the user to use sudo to run commands as the privileged user, run nv show commands, run nv set and nv unset commands to stage configuration changes, and run nv apply commands to apply configuration changes.
nvue-admin Allows the user to run nv show commands, run nv set and nv unset commands to stage configuration changes, and run nv apply commands to apply configuration changes.
nvue-monitor Allows the user to run nv show commands only.
Role
Permissions
sudo Allows the user to use sudo to run commands as the privileged user.
nvshow Allows the user to run nv show commands only.
nvset Allows the user to run nv show commands, and run nv set and nv unset commands to stage configuration changes.
nvapply Allows the user to run nv show commands, run nv set and nv unset commands to stage configuration changes, and run nv apply commands to apply configuration changes.
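
As a quick reference, the two tables above can be expressed as a small lookup. The following Python sketch is illustrative only: the ROLE_CAPABILITIES name and the capability labels (sudo, show, set, apply) are assumptions made for this example and are not NVUE identifiers. The set label covers both nv set and nv unset.

```python
# Illustrative lookup of the default NVUE roles and Linux groups described above.
# The capability labels are hypothetical names chosen for this sketch.
ROLE_CAPABILITIES = {
    # NVUE roles
    "system-admin": {"sudo", "show", "set", "apply"},
    "nvue-admin": {"show", "set", "apply"},
    "nvue-monitor": {"show"},
    # Linux groups
    "sudo": {"sudo"},
    "nvshow": {"show"},
    "nvset": {"show", "set"},
    "nvapply": {"show", "set", "apply"},
}


def can(role, capability):
    """Return True if the role or group grants the capability."""
    return capability in ROLE_CAPABILITIES.get(role, set())


def group_capabilities(groups):
    """Return the union of capabilities for a user in several Linux groups."""
    caps = set()
    for group in groups:
        caps |= ROLE_CAPABILITIES.get(group, set())
    return caps
```

For example, group_capabilities(["sudo", "nvapply"]) yields the same capabilities as the system-admin role, which matches the Linux example later in this section that adds a user to both groups.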

To add a new user account and assign the user a default role:

The following example:

  • Creates a new user account called admin2 and sets the role to system-admin.
  • Sets a plain text password. NVUE hashes the plain text password and stores the value as a hashed password. To set a hashed password, see Hashed Passwords, below.
  • Adds the full name FIRST LAST. If the full name includes more than one name, either separate the names with a hyphen (FIRST-LAST) or enclose the full name in quotes ("FIRST LAST").
cumulus@switch:~$ nv set system aaa user admin2 role system-admin
cumulus@switch:~$ nv set system aaa user admin2 password
Enter new password:
Confirm password:
cumulus@switch:~$ nv set system aaa user admin2 full-name "FIRST LAST"
cumulus@switch:~$ nv config apply

You can also run the nv set system aaa user <user> password <plain-text-password> command to specify the plain text password inline. This command bypasses the Enter new password and Confirm password prompts but displays the plain text password as you type it.

If you are an NVUE-managed user, you can update your own password with the Linux passwd command.

The following example:

  • Creates a new user account called admin2, creates a home directory for the user, and adds the full name First Last.
  • Securely sets the password for the user with passwd.
  • Sets the group membership (role) to sudo and nvapply (permissions to use sudo, nv show, nv set, and nv apply).
cumulus@switch:~$ sudo useradd admin2 -m -c "First Last"
cumulus@switch:~$ sudo passwd admin2
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
cumulus@switch:~$ sudo adduser admin2 sudo
cumulus@switch:~$ sudo adduser admin2 nvapply

  • When you run Linux commands to add a new user, you must create a home directory for the user with the -m option. NVUE commands create a home directory automatically.
  • If you run Linux commands to configure a user password with five or fewer characters, Cumulus Linux logs the message BAD PASSWORD: The password is shorter than 6 characters. If password security is disabled, this is only a warning and the password is set. If password security is enabled, the short password is not set.

Only the following user accounts can create, modify, and delete other system-admin accounts:

  • NVUE-managed users with the system-admin role.
  • The root user.
  • Non NVUE-managed users that are in the sudo group.

You can also create custom roles and assign a custom role to a user. See Role-Based Access Control.

Hashed Passwords

Instead of a plain text password, you can provide a hashed password for a local user.

You must specify the hashed password in Linux crypt format; the password must be a minimum of 15 to 20 characters long and must include special characters, digits, lowercase alphabetic letters, and more. Typically, the password format is set to $id$salt$hashed, where $id is the hashing algorithm. In GNU or Linux:

To generate a hashed password on the switch, you can either run a python3 command or install and use the mkpasswd utility:

Run the following command on the switch or Linux host. When prompted, enter the plain text password you want to hash:

cumulus@switch:~$ python3 -c "import crypt; import getpass; print(crypt.crypt(getpass.getpass(), salt=crypt.METHOD_SHA512))"                    
Password:                                                                                                                                                                 
$6$MIDE.sdxwxuAMGHd$XFXSpHV4NRJymUpeCKz.SYEMUfGGEtLbcqK0fBw3d96ZzegP3sw6ppl5Atx9xLS3UHLLTWS/BOwjkeBJJaRx10
  1. Install the mkpasswd utility on the switch or Linux host:
cumulus@switch:~$ sudo -E apt-get update
cumulus@switch:~$ sudo -E apt-get install whois
  2. To generate a hashed password for SHA-512, SHA-256, or MD5 encryption, run the following command. When prompted, enter the plain text password you want to hash:

    SHA-512 encryption:

    cumulus@switch:~$ mkpasswd -m SHA-512
    Password:
    $6$bQcjKuWgKC0vdwT5$.ZlRgmS44geDH/HsCIttldsaxJ7Y/NidicXwR0FarwXq74uA/yJHxQXGHZwNviY/cG412i7Grzl6Wk8mStJwD0
    

    SHA-256 encryption:

    cumulus@switch:~$ mkpasswd -m SHA-256
    Password:
    $5$SJsPU8bjl2F$.fzRpTGxwGw82RDdFPwhIermSSh6g2ZCYzPeNpeDrgC
    

    MD5 encryption:

    cumulus@switch:~$ mkpasswd -m MD5
    Password:
    $1$/ETjhZMJ$P73qhBZEYP20mKnRkhBol0
    

To set the hashed password for the local user:

Run the nv set system aaa user <username> hashed-password <password> command:

cumulus@switch:~$ nv set system aaa user admin2 hashed-password '$1$/ETjhZMJ$P73qhBZEYP20mKnRkhBol0'
cumulus@switch:~$ nv config apply
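
To set the hashed password with Linux commands, pass it to the useradd -p option when you create the user:

cumulus@switch:~$ sudo useradd admin2 -c "First Last" -p '$1$/ETjhZMJ$P73qhBZEYP20mKnRkhBol0'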

Hashed password strings contain characters, such as $, that have a special meaning in the Linux shell; you must enclose the hashed password in single quotes (').
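
The $id$salt$hashed layout and the quoting rule can be checked before you paste a hash into a command. The following Python sketch is a minimal illustration, not part of Cumulus Linux: the function names describe_hash and nvue_command are hypothetical, and the id-to-algorithm table covers only the three schemes shown above (1 for MD5, 5 for SHA-256, 6 for SHA-512).

```python
import shlex

# crypt prefix identifiers for the schemes shown above.
ALGORITHMS = {"1": "MD5", "5": "SHA-256", "6": "SHA-512"}


def describe_hash(hashed):
    """Split a $id$salt$hashed string into its parts (hypothetical helper)."""
    parts = hashed.split("$")
    # A valid string starts with "$", so the first element is empty.
    if len(parts) < 4 or parts[0] != "":
        raise ValueError("not in $id$salt$hashed format")
    return {
        "algorithm": ALGORITHMS.get(parts[1], "unknown"),
        "salt": parts[-2],
        "hash": parts[-1],
    }


def nvue_command(user, hashed):
    """Build the nv set command with the hash single-quoted for the shell."""
    return (
        "nv set system aaa user "
        f"{shlex.quote(user)} hashed-password {shlex.quote(hashed)}"
    )
```

shlex.quote wraps any string that contains shell-special characters such as $ in single quotes, so the generated command matches the quoting used in the example above.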

Delete a User Account

To delete a user account:

Run the nv unset system aaa user <user> command. The following example deletes the user account called admin2.

cumulus@switch:~$ nv unset system aaa user admin2
cumulus@switch:~$ nv config apply

Run the sudo userdel <user> command. The following example deletes the user account called admin2.

cumulus@switch:~$ sudo userdel admin2

Show User Accounts

To show the user accounts configured on the system, run the NVUE nv show system aaa user command or the Linux sudo cat /etc/passwd command.

cumulus@switch:~$ nv show system aaa user
Username          Full-name                           Role     enable  Summary
----------------  ----------------------------------  -------  ------  -------
_apt                                                  Unknown  system         
_lldpd                                                Unknown  system         
backup            backup                              Unknown  system         
bin               bin                                 Unknown  system         
cumulus           cumulus,,,                          Unknown  on             
daemon            daemon                              Unknown  system         
dnsmasq           dnsmasq,,,                          Unknown  system         
frr               Frr routing suite,,,                Unknown  system         
games             games                               Unknown  system         
gnats             Gnats Bug-Reporting System (admin)  Unknown  system         
irc               ircd                                Unknown  system         
list              Mailing List Manager                Unknown  system         
lp                lp                                  Unknown  system         
mail              mail                                Unknown  system         
man               man                                 Unknown  system         
messagebus                                            Unknown  system         
news              news                                Unknown  system         
nobody            nobody                              Unknown  off            
ntp                                                   Unknown  system         
nvue              NVIDIA User Experience              Unknown  system         
proxy             proxy                               Unknown  system         
root              root                                Unknown  system         
snmp                                                  Unknown  system         
sshd                                                  Unknown  system         
sync              sync                                Unknown  system         
sys               sys                                 Unknown  system         
systemd-coredump  systemd Core Dumper                 Unknown  system         
systemd-network   systemd Network Management,,,       Unknown  system         
systemd-resolve   systemd Resolver,,,                 Unknown  system         
systemd-timesync  systemd Time Synchronization,,,     Unknown  system         
user1                                                 OSPF     on             
user2                                                 IFMgr    on             
uucp              uucp                                Unknown  system         
uuidd                                                 Unknown  system

To show information about a specific user account, run the NVUE nv show system aaa user <user> command:

cumulus@switch:~$ nv show system aaa user cumulus
                    operational  applied
------------------  -----------  -------
role                Unknown             
full-name           cumulus,,,          
hashed-password     *                   
ssh                                     
  [authorized-key]                      
state               enabled       enabled 

Enable the root User

The root user does not have a password and cannot log into a switch using SSH. This default account behavior is consistent with Debian.

Enable Console Access

To log into the switch using root from the console, you must set the password for the root account:

cumulus@switch:~$ sudo passwd root
Enter new password:
...

Enable SSH Access

To log into the switch using root with SSH, either:

Password Security

A user password is the key credential that verifies the user accessing the switch and acts as the first line of defense to secure the switch. The complexity of the password, replacement capabilities, and change frequency define the security level of the first perimeter of the switch. To further harden the switch, Cumulus Linux provides a password security option that enforces password policies for all users on the switch. By default, user passwords must include at least one lowercase character, one uppercase character, one digit, and one special character, and cannot be usernames. In addition, passwords must be a minimum of eight characters long, expire in 180 days, and provide a warning 15 days before expiration.

You can change these password security policies; see Configure Password Policies below.

Disable Password Security

The password security option is on by default. To disable password security, run the nv set system security password-hardening state disabled command:

cumulus@switch:~$ nv set system security password-hardening state disabled
cumulus@switch:~$ nv config apply

To reenable password security, run the nv set system security password-hardening state enabled command.

Configure Password Policies

The following table describes the password policies that Cumulus Linux provides and shows the default settings when password security is on. You can change these settings with NVUE commands.

Policy Description Default Setting
Lowercase Passwords must include at least one lowercase character. You can specify enabled or disabled. enabled
Uppercase Passwords must include at least one uppercase character. You can specify enabled or disabled. enabled
Digits Passwords must include at least one digit. You can specify enabled or disabled. enabled
Special characters Passwords must include at least one special character. You can specify enabled or disabled. enabled
Password length The minimum password length. You can specify a value between 6 and 32 characters. 8 characters
Expiration in days The duration in days after which passwords expire. You can set a value between 1 and 365 days. 180 days
Password expiration warning The number of days before a password expires to send a warning. You can set a value between 1 and 30 days. 15 days
Prevent usernames as passwords Passwords cannot be usernames. You can specify enabled or disabled. enabled
Password reuse The number of times you can reuse the same password. You can set a value between 1 and 100. 10

The following example commands disable enforcement of lowercase and uppercase characters, digits, and special characters:

cumulus@switch:~$ nv set system security password-hardening lower-class disabled
cumulus@switch:~$ nv set system security password-hardening upper-class disabled
cumulus@switch:~$ nv set system security password-hardening digits-class disabled
cumulus@switch:~$ nv set system security password-hardening special-class disabled

Special characters include ` ~ ! @ # $ % ^ & * ( ) - _ + = | [ { } ] ; : ' , < . > / ? and white space.
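
To see how the character-class, length, and username policies combine, the following Python sketch checks a candidate password against them. It is an approximation written for illustration, not the check that Cumulus Linux performs: the function name policy_violations is hypothetical, the returned labels reuse the NVUE option names from this section, and expiration and reuse history are not modeled.

```python
import string

# Special characters as listed above, plus white space.
SPECIAL = set("`~!@#$%^&*()-_+=|[{}];:',<.>/?") | set(string.whitespace)


def policy_violations(password, username, len_min=8, lower=True, upper=True,
                      digits=True, special=True, reject_username=True):
    """Return the policies (by NVUE option name) that the password violates."""
    problems = []
    if len(password) < len_min:
        problems.append("len-min")
    if lower and not any(c.islower() for c in password):
        problems.append("lower-class")
    if upper and not any(c.isupper() for c in password):
        problems.append("upper-class")
    if digits and not any(c.isdigit() for c in password):
        problems.append("digits-class")
    if special and not any(c in SPECIAL for c in password):
        problems.append("special-class")
    if reject_username and password == username:
        problems.append("reject-user-passw-match")
    return problems
```

With the default settings, policy_violations("abc", "admin2") reports len-min, upper-class, digits-class, and special-class. Passing upper=False, digits=False, and special=False mirrors the effect of disabling those classes with the commands above.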

The following example commands set the minimum password length to 10 characters, the password expiration to 30 days, and the expiration warning to 5 days before expiration.

cumulus@switch:~$ nv set system security password-hardening len-min 10
cumulus@switch:~$ nv set system security password-hardening expiration 30
cumulus@switch:~$ nv set system security password-hardening expiration-warning 5

The following example commands allow usernames as passwords and set the number of times you can reuse a password to 20:

cumulus@switch:~$ nv set system security password-hardening reject-user-passw-match disabled
cumulus@switch:~$ nv set system security password-hardening history-cnt 20

Show Password Policies

To show the currently configured password policies, run the nv show system security password-hardening command:

cumulus@switch:~$ nv show system security password-hardening
                         operational  applied 
-----------------------  -----------  --------
state                    enabled      enabled 
reject-user-passw-match  disabled     disabled
lower-class              enabled      enabled 
upper-class              enabled      enabled 
digits-class             disabled     disabled
special-class            disabled     disabled
expiration-warning       15           15      
expiration               180          180     
history-cnt              20           20      
len-min                  8            8

Role-Based Access Control

In addition to the default roles that Cumulus Linux provides, you can create your own roles to restrict authorization, giving you more granular control over what a user can manage on the switch. For example, you can assign a user the role of network manager and provide the user privileges for interface management, service management and system management. When the user logs in and executes an NVUE command, NVUE checks the user privileges and authorizes the user to run that command.

Custom role-based access control consists of the following elements:

Element Description
Role A virtual identifier for multiple classes (groups). You can assign only one role for a user. For example, for a user that can manage interfaces, you can create a role called IFMgr.
Class A class is similar in concept to a Linux group. Creating and managing classes is the simplest way to configure multiple users simultaneously, especially when configuring permissions. A class consists of:
  • Command paths, which Cumulus Linux bases on the objects in the NVUE declarative model and which are the same as URI paths. For example, you can use the /vrf/ command path to allow or deny a user access to all VRFs, or /system/nat to allow or deny a user access to NAT configuration. Use the tab key to see available command paths (nv set system aaa class <class-name> command-path / <<press tab>>).
  • Permissions for the command paths: (ro) to run show commands, (rw) to run set, unset, and apply commands, (act) to run action commands, or (all) to run all commands. The default permission setting is all.
Action The action for the class: allow or deny.

  • You can assign a maximum of 64 classes to a role.
  • You can configure a maximum of 128 command paths for a class.
  • When you configure a command path, you allow or deny a specific schema path and its children. For example, the command path /qos/ allows or denies access to QoS commands, whereas the command path /qos/egress-scheduler allows or denies access to QoS egress scheduler commands.

The following example describes the permissions for a role (role1) that consists of three classes: class1, class2, and class3.

class1 has the allow class action and the following command path permissions:

Command Path Permissions
/interface/ all
/interface/*/acl/ ro
/interface/*/ptp/ ro

class2 has the allow class action and the following command path permissions:

Command Path Permissions
/system/ ro
/vrf/ rw

class3 has the deny class action and the following command path permissions:

Command Path Permissions
/interface/*/evpn/ rw
/interface/*/qos/ rw

The following table shows the permissions for a user assigned the role role1. In the table, R is read only (RO), W is write, and X is action (ACT).

Path Allow Deny Permissions
/acl/ RWX Implicit deny
/qos/ RWX Implicit deny
All unspecified paths are implicit deny
/interface/ RWX The permissions specified
/interface/* (* matches all interfaces) RWX Inherited from parent
/interface/*/bond/ RWX Inherited from parent
/interface/*/ip/ RWX Inherited from parent
All unspecified children of /interface/ inherit parent permissions RWX
/interface/*/acl/ R WX The permissions specified
/interface/*/ptp/ R WX The permissions specified
/interface/*/evpn/ RWX The permissions specified
/interface/*/qos/ RWX The permissions specified
/system/ R WX The permissions specified
/system/aaa/ R WX Inherited from parent
/system/api/ R WX Inherited from parent
All unspecified children of /system/ inherit parent permissions R
/vrf/ RW X The permissions specified
All unspecified children of /vrf/ inherit parent permissions RW X
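
One way to reason about the table above is a longest-match lookup over the command paths. The following Python sketch is a simplified model under stated assumptions, not the NVUE implementation: it assumes the most specific matching command path decides the result, that * matches exactly one path segment, that a path in a deny class permits nothing (reading the RWX on the evpn and qos rows as the Deny column), and that unmatched paths are implicitly denied. The names CLASSES, matches, and allowed are hypothetical.

```python
# Permission keywords mapped to the R/W/X letters used in the table above.
PERMS = {"ro": "R", "rw": "RW", "act": "X", "all": "RWX"}

# role1 as described above: class name -> (action, {command path: permission}).
CLASSES = {
    "class1": ("allow", {"/interface/": "all",
                         "/interface/*/acl/": "ro",
                         "/interface/*/ptp/": "ro"}),
    "class2": ("allow", {"/system/": "ro", "/vrf/": "rw"}),
    "class3": ("deny", {"/interface/*/evpn/": "rw",
                        "/interface/*/qos/": "rw"}),
}


def segments(path):
    """Split a command path into its non-empty segments."""
    return [s for s in path.split("/") if s]


def matches(pattern, path):
    """True if the pattern matches the path or one of its parents."""
    pat, segs = segments(pattern), segments(path)
    return len(pat) <= len(segs) and all(p in ("*", s) for p, s in zip(pat, segs))


def allowed(path):
    """Return the allowed letters (subset of RWX) for a path under role1."""
    best = None  # (pattern depth, class action, permission keyword)
    for action, paths in CLASSES.values():
        for pattern, perm in paths.items():
            depth = len(segments(pattern))
            if matches(pattern, path) and (best is None or depth > best[0]):
                best = (depth, action, perm)
    if best is None or best[1] == "deny":
        return ""  # implicit deny, or explicitly denied path (assumption)
    return PERMS[best[2]]
```

Under these assumptions, allowed("/interface/swp1/ip/") returns RWX (inherited from /interface/), allowed("/interface/swp1/acl/") returns R, and allowed("/acl/") returns an empty string because no class mentions that path.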

Assign a Custom Role to a User Account

To assign a custom role to a user account:

You assign a custom role to an existing user account. For information about creating user accounts, see User Accounts.

When you create a class, then run nv config apply, NVUE removes LDAP configuration from the /etc/nsswitch.conf file. If you are using LDAP, run the nv set system config apply ignore /etc/nsswitch.conf command before you run nv config apply to keep the LDAP configuration.

The following example creates the three classes described above for role role1.

class1 has permissions to manage all interfaces except for ACL and PTP interfaces, which only have show permissions:

cumulus@leaf01:mgmt:~$ nv set system aaa role role1 class class1
cumulus@leaf01:mgmt:~$ nv set system aaa class class1 action allow
cumulus@leaf01:mgmt:~$ nv set system aaa class class1 command-path /interface/ permission all   
cumulus@leaf01:mgmt:~$ nv set system aaa class class1 command-path /interface/*/acl/ permission ro
cumulus@leaf01:mgmt:~$ nv set system aaa class class1 command-path /interface/*/ptp/ permission ro
cumulus@leaf01:mgmt:~$ nv config apply

class2 has permissions to only show system commands and to set, unset, and apply VRF commands:

cumulus@leaf01:mgmt:~$ nv set system aaa role role1 class class2
cumulus@leaf01:mgmt:~$ nv set system aaa class class2 action allow
cumulus@leaf01:mgmt:~$ nv set system aaa class class2 command-path /system/ permission ro
cumulus@leaf01:mgmt:~$ nv set system aaa class class2 command-path /vrf/ permission rw
cumulus@leaf01:mgmt:~$ nv config apply

class3 prevents setting, unsetting, and applying interface commands for EVPN and QoS:

cumulus@leaf01:mgmt:~$ nv set system aaa role role1 class class3
cumulus@leaf01:mgmt:~$ nv set system aaa class class3 action deny
cumulus@leaf01:mgmt:~$ nv set system aaa class class3 command-path /interface/*/evpn/ permission rw
cumulus@leaf01:mgmt:~$ nv set system aaa class class3 command-path /interface/*/qos/ permission rw
cumulus@leaf01:mgmt:~$ nv config apply

The following command assigns user admin2 the role role1:

cumulus@leaf01:mgmt:~$ nv set system aaa user admin2 role role1
cumulus@leaf01:mgmt:~$ nv config apply

Delete Custom Roles

To delete a custom role and all its classes, you must first unassign the role from the user, then delete the role:

cumulus@switch:~$ nv unset system aaa user admin2 role role1
cumulus@switch:~$ nv unset system aaa role role1
cumulus@switch:~$ nv config apply

To delete a class from a role, run the nv unset system aaa role <role> class <class> command:

cumulus@switch:~$ nv unset system aaa role role1 class class2
cumulus@switch:~$ nv config apply

Show Custom Role Information

To show the user accounts configured on the system, run the NVUE nv show system aaa user command or the Linux sudo cat /etc/passwd command.

cumulus@switch:~$ nv show system aaa user
Username          Full-name                           Role     enable  Summary
----------------  ----------------------------------  -------  ------  -------
_apt                                                  Unknown  system         
_lldpd                                                Unknown  system         
backup            backup                              Unknown  system         
bin               bin                                 Unknown  system         
cumulus           cumulus,,,                          Unknown  on             
daemon            daemon                              Unknown  system         
dnsmasq           dnsmasq,,,                          Unknown  system         
frr               Frr routing suite,,,                Unknown  system         
games             games                               Unknown  system         
gnats             Gnats Bug-Reporting System (admin)  Unknown  system         
irc               ircd                                Unknown  system         
list              Mailing List Manager                Unknown  system         
lp                lp                                  Unknown  system         
mail              mail                                Unknown  system         
man               man                                 Unknown  system         
messagebus                                            Unknown  system         
news              news                                Unknown  system         
nobody            nobody                              Unknown  off            
ntp                                                   Unknown  system         
nvue              NVIDIA User Experience              Unknown  system         
proxy             proxy                               Unknown  system         
root              root                                Unknown  system         
snmp                                                  Unknown  system         
sshd                                                  Unknown  system         
sync              sync                                Unknown  system         
sys               sys                                 Unknown  system         
systemd-coredump  systemd Core Dumper                 Unknown  system         
systemd-network   systemd Network Management,,,       Unknown  system         
systemd-resolve   systemd Resolver,,,                 Unknown  system         
systemd-timesync  systemd Time Synchronization,,,     Unknown  system         
admin2                                                role1    on             
uucp              uucp                                Unknown  system         
uuidd                                                 Unknown  system         
www-data          www-data                            Unknown  system    

To show information about a specific user account including the role assigned to the user, run the NVUE nv show system aaa user <user> command:

cumulus@switch:~$ nv show system aaa user admin2
           operational  applied
---------  -----------  -------
role       role1        role1  
full-name                      
enable     on           on

To show all the roles configured on the switch, run the NVUE nv show system aaa role command:

cumulus@switch:~$ nv show system aaa role
Role          Class  
------------  -------
nvue-admin    nvapply
nvue-monitor  nvshow 
role1         class1 
              class2 
              class3 
system-admin  nvapply
              sudo

To show the classes applied to a specific role, run the nv show system aaa role <role> command:

cumulus@switch:~$ nv show system aaa role role1
         applied
-------  -------
[class]  class1 
[class]  class2 
[class]  class3

To show all the classes configured on the switch, run the nv show system aaa class command:

cumulus@switch:~$ nv show system aaa class
Class Name  Command Path        Permission  Action
----------  ------------------  ----------  ------
class1      /interface/         all         allow 
            /interface/*/acl/   ro                
            /interface/*/ptp/   ro                
class2      /system/            ro          allow 
            /vrf/               rw                
class3      /interface/*/evpn/  rw          deny  
            /interface/*/qos/   rw                
nvapply     /                   all         allow 
nvshow      /                   ro          allow 
sudo        /                   all         allow 

To show the configuration and state of the command paths for a class, run the nv show system aaa class <class> command:

cumulus@switch:~$ nv show system aaa class class3
               applied           
--------------  ------------------
action          deny              
[command-path]  /interface/*/evpn/
[command-path]  /interface/*/qos/

Using sudo to Delegate Privileges

By default, Cumulus Linux has two user accounts: root and cumulus. The cumulus account is a normal user and is in the group sudo.

You can add more user accounts as needed. Like the cumulus account, these accounts must use sudo to execute privileged commands.

sudo Basics

sudo allows you to execute a command as superuser or another user as specified by the security policy.

The default security policy is sudoers, which you configure in the /etc/sudoers file. Use /etc/sudoers.d/ to add to the default sudoers policy.

Use visudo only to edit the sudoers file; do not use another editor like vi or emacs.

When creating a new file in /etc/sudoers.d, use visudo -f. This option performs sanity checks before writing the file to avoid errors that prevent sudo from working.

Errors in the sudoers file can result in losing the ability to elevate privileges to root. You can fix this issue only by power cycling the switch and booting into single user mode. Before modifying sudoers, enable the root user by setting a password for the root user.

By default, users in the sudo group can use sudo to execute privileged commands. To add users to the sudo group, use the useradd(8) or usermod(8) command. To see which users belong to the sudo group, see /etc/group (man group(5)).

You can run any command as sudo, including su. You must enter a password.

The example below shows how to use sudo as a non-privileged user cumulus to bring up an interface:

cumulus@switch:~$ ip link show dev swp1
3: swp1: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master br0 state DOWN mode DEFAULT qlen 500
link/ether 44:38:39:00:27:9f brd ff:ff:ff:ff:ff:ff

cumulus@switch:~$ ip link set dev swp1 up
RTNETLINK answers: Operation not permitted

cumulus@switch:~$ sudo ip link set dev swp1 up
Password:

cumulus@switch:~$ ip link show dev swp1
3: swp1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br0 state UP mode DEFAULT qlen 500
link/ether 44:38:39:00:27:9f brd ff:ff:ff:ff:ff:ff

sudoers Examples

The following examples show how you grant as few privileges as necessary to a user or group of users to allow them to perform the required task. Each example uses the system group noc; groups include the prefix %.

When an unprivileged user runs a command, the command must include the sudo prefix.

Category Privilege Example Command sudoers Entry
Monitoring Switch port information ethtool -m swp1 %noc ALL=(ALL) NOPASSWD:/sbin/ethtool
Monitoring System diagnostics cl-support %noc ALL=(ALL) NOPASSWD:/usr/cumulus/bin/cl-support
Monitoring Routing diagnostics cl-resource-query %noc ALL=(ALL) NOPASSWD:/usr/cumulus/bin/cl-resource-query
Image management Install images onie-select http://lab/install.bin %noc ALL=(ALL) NOPASSWD:/usr/cumulus/bin/onie-select
Package management Any apt-get command apt-get update or apt-get install %noc ALL=(ALL) NOPASSWD:/usr/bin/apt-get
Package management Just apt-get update apt-get update %noc ALL=(ALL) NOPASSWD:/usr/bin/apt-get update
Package management Install packages apt-get install vim %noc ALL=(ALL) NOPASSWD:/usr/bin/apt-get install *
Package management Upgrading apt-get upgrade %noc ALL=(ALL) NOPASSWD:/usr/bin/apt-get upgrade
Netfilter Install ACL policies cl-acltool -i %noc ALL=(ALL) NOPASSWD:/usr/cumulus/bin/cl-acltool
Netfilter List iptables rules iptables -L %noc ALL=(ALL) NOPASSWD:/sbin/iptables
Layer 1 and 2 Any LLDP command lldpcli show neighbors / configure %noc ALL=(ALL) NOPASSWD:/usr/sbin/lldpcli
Layer 1 and 2 Just show neighbors lldpcli show neighbors %noc ALL=(ALL) NOPASSWD:/usr/sbin/lldpcli show neighbors*
Interfaces Modify any interface ip link set dev swp1 {up|down} %noc ALL=(ALL) NOPASSWD:/sbin/ip link set *
Interfaces Up any interface ifup swp1 %noc ALL=(ALL) NOPASSWD:/sbin/ifup
Interfaces Down any interface ifdown swp1 %noc ALL=(ALL) NOPASSWD:/sbin/ifdown
Interfaces Up/down only swp2 ifup swp2 / ifdown swp2 %noc ALL=(ALL) NOPASSWD:/sbin/ifup swp2,/sbin/ifdown swp2
Interfaces Any IP address change ip addr {add|del} 192.0.2.1/30 dev swp1 %noc ALL=(ALL) NOPASSWD:/sbin/ip addr *
Interfaces Only set IP address ip addr add 192.0.2.1/30 dev swp1 %noc ALL=(ALL) NOPASSWD:/sbin/ip addr add *
Ethernet bridging Any bridge command brctl addbr br0 / brctl delif br0 swp1 %noc ALL=(ALL) NOPASSWD:/sbin/brctl
Ethernet bridging Add bridges and interfaces brctl addbr br0 / brctl addif br0 swp1 %noc ALL=(ALL) NOPASSWD:/sbin/brctl addbr *,/sbin/brctl addif *
Spanning tree Set STP properties mstpctl setmaxage br2 20 %noc ALL=(ALL) NOPASSWD:/sbin/mstpctl
Troubleshooting Restart switchd systemctl restart switchd.service %noc ALL=(ALL) NOPASSWD:/usr/sbin/service switchd *
Troubleshooting Restart any service systemctl cron switchd.service %noc ALL=(ALL) NOPASSWD:/usr/sbin/service
Troubleshooting Packet capture tcpdump %noc ALL=(ALL) NOPASSWD:/usr/sbin/tcpdump
Layer 3 Add static routes ip route add 10.2.0.0/16 via 10.0.0.1 %noc ALL=(ALL) NOPASSWD:/bin/ip route add *
Layer 3 Delete static routes ip route del 10.2.0.0/16 via 10.0.0.1 %noc ALL=(ALL) NOPASSWD:/bin/ip route del *
Layer 3 Any static route change ip route * %noc ALL=(ALL) NOPASSWD:/bin/ip route *
Layer 3 Any iproute command ip * %noc ALL=(ALL) NOPASSWD:/bin/ip
Layer 3 Non-modal OSPF cl-ospf area 0.0.0.1 range 10.0.0.0/24 %noc ALL=(ALL) NOPASSWD:/usr/bin/cl-ospf
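
Every entry in the table follows the same pattern: a % prefixed group, the ALL=(ALL) NOPASSWD: tag, and a comma-separated list of absolute command paths. The following Python sketch composes such a line. It is only a text-formatting helper written for this example (the function name sudoers_entry is hypothetical), and it does not validate sudoers syntax.

```python
def sudoers_entry(group, commands):
    """Format a NOPASSWD sudoers line for a group (illustrative helper)."""
    if not commands:
        raise ValueError("at least one command is required")
    for command in commands:
        # The entries in the table above always use absolute paths.
        if not command.startswith("/"):
            raise ValueError(f"command must use an absolute path: {command}")
    return f"%{group} ALL=(ALL) NOPASSWD:{','.join(commands)}"
```

For example, sudoers_entry("noc", ["/sbin/ifup swp2", "/sbin/ifdown swp2"]) produces the entry shown in the table for bringing only swp2 up or down. Write the resulting line to a file in /etc/sudoers.d with visudo -f, as described in sudo Basics, so that the syntax is checked before the file is saved.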

LDAP Authentication and Authorization

Cumulus Linux uses Pluggable Authentication Modules (PAM) and Name Service Switch (NSS) for user authentication. NSS enables PAM to use LDAP to provide user authentication, group mapping, and information for other services on the system.

NVUE manages LDAP authentication with PAM and NSS.

  • Cumulus Linux only supports LDAP with IPv4.
  • LDAP authentication is sensitive to network delay. For optimal performance, NVIDIA recommends a round trip time of 10 ms or less between LDAP clients and the LDAP server. If latency is between 10 and 50 ms, NVIDIA recommends changing the authentication order to prioritize local authentication before LDAP. If latency exceeds 50 ms, authentication might experience unacceptable delays; consider alternative authentication methods.
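The latency guidance above can be expressed as a small decision helper. The following Python sketch is illustrative only: the function name and the returned labels are hypothetical and are not part of Cumulus Linux or NVUE; only the 10 ms and 50 ms thresholds come from the recommendation above.

```python
def ldap_latency_advice(rtt_ms: float) -> str:
    """Map a measured round trip time (in ms) to the guidance in the text.

    The returned labels are hypothetical names chosen for this sketch.
    """
    if rtt_ms < 0:
        raise ValueError("round trip time cannot be negative")
    if rtt_ms <= 10:
        # 10 ms or less: within the recommended range.
        return "optimal"
    if rtt_ms <= 50:
        # Between 10 and 50 ms: put local authentication before LDAP.
        return "prefer-local-first"
    # More than 50 ms: expect delays; consider another method.
    return "consider-alternatives"


for sample in (8, 35, 120):
    print(sample, ldap_latency_advice(sample))
```

You could feed the function an average round trip time measured with a tool such as ping from the switch to the LDAP server.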

Configure LDAP Server Settings

You can configure LDAP server settings with NVUE commands or by editing Linux configuration files.

Connection

Configure the following connection settings:

The following example configures the LDAP server and port, and the BIND credentials.

cumulus@switch:~$ nv set system aaa ldap hostname ldapserver1
cumulus@switch:~$ nv set system aaa ldap port 388
cumulus@switch:~$ nv set system aaa ldap bind-dn CN=cumulus-admin,CN=Users,DC=rtp,DC=example,DC=test
cumulus@switch:~$ nv set system aaa ldap secret 1Q2w3e4r!
cumulus@switch:~$ nv config apply

The following example sets the priority to 2 for ldapserver2 when using multiple LDAP servers:

cumulus@switch:~$ nv set system aaa ldap hostname ldapserver2 priority 2

Edit the /etc/nslcd.conf file to uncomment the following lines, then set the URI and BIND credentials:

cumulus@switch:~$ sudo nano /etc/nslcd.conf
...
# The location at which the LDAP server(s) should be reachable.
uri ldaps://ldapserver1:388/
#uripriority 1
...
# The DN to bind with for normal lookups.
binddn CN=cumulus-admin,CN=Users,DC=rtp,DC=example,DC=test
bindpw 1Q2w3e4r!
...
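To review the connection settings before you restart the LDAP client daemon, you can read the keyword and value pairs back out of the file. The following Python sketch assumes only the simple "keyword value" line layout shown above; the function name is hypothetical and the sample text stands in for the real /etc/nslcd.conf file.

```python
def parse_nslcd_conf(text):
    """Collect 'keyword value' lines, skipping comments and blank lines.

    A keyword can appear more than once, so each keyword maps to a list.
    """
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        # "..." marks elided content in documentation excerpts.
        if not line or line.startswith("#") or line == "...":
            continue
        parts = line.split(None, 1)
        keyword = parts[0]
        value = parts[1] if len(parts) > 1 else ""
        options.setdefault(keyword, []).append(value)
    return options


SAMPLE = """\
# The location at which the LDAP server(s) should be reachable.
uri ldaps://ldapserver1:388/
#uripriority 1
# The DN to bind with for normal lookups.
binddn CN=cumulus-admin,CN=Users,DC=rtp,DC=example,DC=test
bindpw 1Q2w3e4r!
"""

settings = parse_nslcd_conf(SAMPLE)
print(settings["uri"])
print(settings["binddn"])
```

Commented-out lines such as #uripriority do not appear in the result, which makes it easy to spot a setting that you forgot to uncomment.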

Set the Authentication Order

The authentication order sets the sequence in which Cumulus Linux tries different authentication methods to verify user access to the switch. By default, Cumulus Linux verifies users according to their local passwords.

If you set the authentication order to start with LDAP, but the LDAP servers do not have the user in the directory or do not respond, Cumulus Linux tries local password authentication.

To set the authentication order to start with LDAP before local authentication:

cumulus@switch:~$ nv set system aaa authentication-order 1 ldap
cumulus@switch:~$ nv config apply

To set the authentication order to start with local authentication before querying LDAP:

cumulus@switch:~$ nv set system aaa authentication-order 1 local
cumulus@switch:~$ nv set system aaa authentication-order 2 ldap
cumulus@switch:~$ nv config apply

Edit the /etc/nsswitch.conf file and add ldap before files for the passwd and group options to attempt LDAP authentication first, or configure files first to prioritize local authentication:

cumulus@switch:~$ sudo nano /etc/nsswitch.conf
...
passwd:         ldap files
group:          ldap files
shadow:         files
gshadow:        files

hosts:          files dns
networks:       files

protocols:      db files
services:       db files
ethers:         db files
rpc:            db files
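To confirm which source NSS consults first for a given database, you can read the source list from the file. The following Python sketch is a minimal illustration with hypothetical function names; it assumes plain source lists like the ones shown above and does not handle bracketed action items such as [NOTFOUND=return].

```python
def nss_sources(nsswitch_text, database):
    """Return the ordered list of sources for one database (for example, passwd)."""
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if line.startswith(database + ":"):
            return line.split(":", 1)[1].split()
    return []


def ldap_is_first(nsswitch_text, database):
    """True when ldap is the first source listed for the database."""
    sources = nss_sources(nsswitch_text, database)
    return bool(sources) and sources[0] == "ldap"


SAMPLE = """\
passwd:         ldap files
group:          ldap files
shadow:         files
"""

print(nss_sources(SAMPLE, "passwd"))
print(ldap_is_first(SAMPLE, "shadow"))
```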

Search Function

When an LDAP client requests information about a resource, the client must connect and bind to the server, then perform one or more resource queries depending on the lookup. All search queries to the LDAP server use the configured search base, filter, and the desired entry (uid=myuser). If the LDAP directory is large, this search can take a long time. To shorten the search, define a more specific search base for the common maps (passwd and group).

cumulus@switch:~$ nv set system aaa ldap base-dn ou=support,dc=rtp,dc=example,dc=test
cumulus@switch:~$ nv config apply

Edit the /etc/nslcd.conf file to add the search base:

cumulus@switch:~$ sudo nano /etc/nslcd.conf
...
# The search base that will be used for all queries.
base ou=support,dc=rtp,dc=example,dc=test
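To make the relationship between the filter and the desired entry concrete, the following Python sketch builds the kind of combined filter that appears in the nslcd debug output later in this section. It is an illustration, not the nslcd implementation: the function names are hypothetical, the default base filter is taken from that debug output, and the escaping follows the LDAP search filter string rules (RFC 4515).

```python
def escape_filter_value(value):
    """Escape characters that are special in an LDAP search filter value."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\x00": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def user_lookup_filter(uid, base_filter="(objectClass=posixAccount)"):
    """Combine the map filter with the requested user entry using a logical AND."""
    return "(&{}(uid={}))".format(base_filter, escape_filter_value(uid))


print(user_lookup_filter("myuser"))
```

The server evaluates this filter against every entry under the search base, which is why a narrower base reduces lookup time.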

Search Scope

You can set the search scope to one level to limit the level of the search to users directly under the base DN or to subtree to search for users in all branches under the base DN. The default setting is subtree.

To set the search scope to one level:

NVUE does not provide commands to set the search scope.

Edit the /etc/nslcd.conf file to set the scope option to one.

cumulus@switch:~$ sudo nano /etc/nslcd.conf
...
# The search scope.
scope one
...

To set the search scope back to the default setting (subtree), set the scope option to sub.

Search Filters

To narrow the search results when authenticating users, use search filters to specify criteria when searching for objects within the directory.

NVUE does not provide commands to set search filters.

Edit the /etc/nslcd.conf file to set the filter passwd, filter group, or filter shadow options.

cumulus@switch:~$ sudo nano /etc/nslcd.conf
...
# filters and maps
filter passwd cumulus
filter group cn
filter shadow 1234
...

Attribute Mapping

The map configuration allows you to override the attributes pushed from LDAP. To override an attribute for a given map, specify the attribute name and the new value.

NVUE does not provide commands for attribute mapping.

Edit the /etc/nslcd.conf file to set the map passwd and map group options.

cumulus@switch:~$ sudo nano /etc/nslcd.conf
...
# filters and maps
...
map passwd homedirectory /home/$sAMAccountName
map passwd userpassword cumulus
map group cn sAMAccountName
map group gidnumber objectSid:S-1-5-21-1391733952-3059161487-1245441232
...

LDAP Version

Cumulus Linux uses LDAP version 3 by default. If you need to change the LDAP version to 2:

cumulus@switch:~$ nv set system aaa ldap version 2
cumulus@switch:~$ nv config apply

Edit the /etc/nslcd.conf file to change the ldap_version option.

cumulus@switch:~$ sudo nano /etc/nslcd.conf
...
# The LDAP protocol version to use.
ldap_version 2
...

LDAP Timeouts

Cumulus Linux provides two timeout settings:

The following example sets both the BIND session timeout and the search timeout to 60 seconds.

cumulus@switch:~$ nv set system aaa ldap timeout-bind 60
cumulus@switch:~$ nv set system aaa ldap timeout-search 60
cumulus@switch:~$ nv config apply

Edit the /etc/nslcd.conf file to change the bind_timelimit option and the timelimit option.

cumulus@switch:~$ sudo nano /etc/nslcd.conf
...
bind_timelimit 60
timelimit 60
...

SSL Options

You can configure the following SSL options:

The following example sets the SSL mode to SSL, sets the port to 8443, enables the SSL certificate checker, sets the CA certificate list to none, sets the SSL cipher suites to TLS1.3, and sets the Certificate Revocation List to /etc/ssl/certs/rtp-example-ca.crt.

cumulus@switch:~$ nv set system aaa ldap ssl mode ssl
cumulus@switch:~$ nv set system aaa ldap ssl port 8443
cumulus@switch:~$ nv set system aaa ldap ssl ca-list none
cumulus@switch:~$ nv set system aaa ldap ssl tls-ciphers TLS1.3
cumulus@switch:~$ nv set system aaa ldap ssl crl-check /etc/ssl/certs/rtp-example-ca.crt
cumulus@switch:~$ nv config apply

Edit the /etc/nslcd.conf file to set the SSL options.

cumulus@switch:~$ sudo nano /etc/nslcd.conf
...
# SSL options
ssl on
tls_reqcert try
tls_ciphers NORMAL:-VERS-ALL:+VERS-TLS1.3
tls_crlcheck all
tls_crlfile /etc/ssl/certs/rtp-example-ca.crt
...

LDAP Referrals

LDAP referrals allow a directory tree to be partitioned and distributed between multiple LDAP servers.

To enable LDAP referral:

cumulus@switch:~$ nv set system aaa ldap referrals enabled
cumulus@switch:~$ nv config apply

Edit the /etc/nslcd.conf file to set the referrals option.

cumulus@switch:~$ sudo nano /etc/nslcd.conf
...
referrals yes
...

Show LDAP Settings

To show the LDAP configuration settings on the switch, run the following commands:

The following example shows all the LDAP configuration settings:

cumulus@switch:~$ nv show system aaa ldap
                      operational                                          applied                                            
--------------------  ---------------------------------------------------  ---------------------------------------------------
vrf                   default                                              mgmt                                               
bind-dn               CN=cumulus-admin,CN=Users,DC=rtp,DC=example,DC=test  CN=cumulus-admin,CN=Users,DC=rtp,DC=example,DC=test
base-dn               ou=support,dc=rtp,dc=example,dc=test                 ou=support,dc=rtp,dc=example,dc=test               
referrals             yes                                                  off                                                
port                  389                                                  389                                                
timeout-bind          5                                                    5                                                  
timeout-search        5                                                    5                                                  
secret                *                                                    *                                                  
version               3                                                    3                                                  
[hostname]            ldapserver1                                          ldapserver1                                        
ssl                                                                                                                           
  mode                none                                                 none                                               
  port                389                                                  636                                                
  ca-list             default                                              default                                            
  tls-ciphers         all                                                  all                                                
  crl-check           none                                                 none                                               
...

The following example shows the hostnames of the LDAP servers and their priorities:

cumulus@switch:~$ nv show system aaa ldap hostname
Hostname     Priority
-----------  --------
ldapserver1  1
ldapserver2  2     

The following example shows the SSL configuration settings:

cumulus@switch:~$ nv show system aaa ldap ssl
             operational  applied
-----------  -----------  -------
mode         none         none   
port         389          636    
ca-list      default      default
tls-ciphers  all          all    
crl-check    none         none
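When you compare the operational and applied columns in this kind of output, a short script can list the rows that differ. The following Python sketch is illustrative only; the function name is hypothetical and it assumes the three-column layout shown above, where no value contains spaces.

```python
def mismatched_settings(table_text):
    """Return {setting: (operational, applied)} for rows whose columns differ."""
    diffs = {}
    for raw in table_text.splitlines():
        row = raw.split()
        # Skip the header (two tokens), the dashed separator, and blank lines.
        if len(row) != 3 or row[0].startswith("-"):
            continue
        name, operational, applied = row
        if operational != applied:
            diffs[name] = (operational, applied)
    return diffs


SSL_TABLE = """\
             operational  applied
-----------  -----------  -------
mode         none         none
port         389          636
ca-list      default      default
tls-ciphers  all          all
crl-check    none         none
"""

print(mismatched_settings(SSL_TABLE))
```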

Configure LDAP Authorization

Linux uses the sudo command to allow non-administrator users (such as the default cumulus user account) to perform privileged operations. To control the users that can use sudo, define a series of rules in the /etc/sudoers file and files in the /etc/sudoers.d/ directory. The rules apply to groups but you can also define specific users. You can add sudo rules using the group names from LDAP. For example, if a group of users are in the group netadmin, you can add a rule to give those users sudo privileges. Refer to the sudoers manual (man sudoers) for a complete usage description. The following shows an example in the /etc/sudoers file:

# The basic structure of a user specification is "who where = (as_whom) what ".
%sudo ALL=(ALL:ALL) ALL
%netadmin ALL=(ALL:ALL) ALL

LDAP Verification Tools

The LDAP client daemon retrieves and caches password and group information from LDAP. To verify the LDAP interaction, use these command-line tools to trigger an LDAP query from the device.

Identify a User with the id Command

The id command performs a username lookup by following the lookup information sources in NSS for the passwd service. This returns the user ID, group ID and the group list retrieved from the information source. In the following example, the user cumulus is locally defined in /etc/passwd, and myuser is on LDAP. The NSS configuration has the passwd map configured with the sources compat ldap:

cumulus@switch:~$ id cumulus
uid=1000(cumulus) gid=1000(cumulus) groups=1000(cumulus),24(cdrom),25(floppy),27(sudo),29(audio),30(dip),44(video),46(plugdev)
cumulus@switch:~$ id myuser
uid=1230(myuser) gid=3000(Development) groups=3000(Development),500(Employees),27(sudo)

getent

The getent command retrieves all records found with NSS for a given map. It can also retrieve a specific entry under that map. You can perform tests with the passwd, group, shadow, or any other map in the /etc/nsswitch.conf file. The command formats its output according to the map requested. For the passwd service, the structure of the output is the same as the entries in /etc/passwd. The group map outputs the same structure as /etc/group.

In this example, looking up a specific user in the passwd map, the user cumulus is locally defined in /etc/passwd, and myuser is only in LDAP.

cumulus@switch:~$ getent passwd cumulus
cumulus:x:1000:1000::/home/cumulus:/bin/bash
cumulus@switch:~$ getent passwd myuser
myuser:x:1230:3000:My Test User:/home/myuser:/bin/bash

In the next example, looking up a specific group in the group service, the group cumulus is locally defined in /etc/group, and netadmin is on LDAP.

cumulus@switch:~$ getent group cumulus
cumulus:x:1000:
cumulus@switch:~$ getent group netadmin
netadmin:*:502:larry,moe,curly,shemp

Running the command getent passwd or getent group without a specific request returns all local and LDAP entries for the passwd and group maps.
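Because getent output follows the /etc/passwd and /etc/group layouts, you can split each record on colons to check individual fields. The following Python sketch shows one way to do that; the type and function names are hypothetical, and the sample records are copied from the examples above.

```python
from typing import List, NamedTuple


class PasswdEntry(NamedTuple):
    name: str
    password: str
    uid: int
    gid: int
    gecos: str
    home: str
    shell: str


def parse_passwd_line(line: str) -> PasswdEntry:
    """Split one passwd-format record into its seven fields."""
    fields = line.rstrip("\n").split(":")
    if len(fields) != 7:
        raise ValueError("expected 7 colon-separated fields")
    return PasswdEntry(fields[0], fields[1], int(fields[2]), int(fields[3]),
                       fields[4], fields[5], fields[6])


def group_members(line: str) -> List[str]:
    """Return the member list (fourth field) of one group-format record."""
    fields = line.rstrip("\n").split(":")
    if len(fields) != 4:
        raise ValueError("expected 4 colon-separated fields")
    return [member for member in fields[3].split(",") if member]


entry = parse_passwd_line("myuser:x:1230:3000:My Test User:/home/myuser:/bin/bash")
print(entry.uid, entry.gid, entry.home)
print(group_members("netadmin:*:502:larry,moe,curly,shemp"))
```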

Troubleshooting

nslcd Debug Mode

When setting up LDAP authentication for the first time, turn off the nslcd service using the systemctl stop nslcd.service command (or the systemctl stop nslcd@mgmt.service command if you are running the service in a management VRF) and run it in debug mode. Debug mode works whether you are using LDAP over SSL (port 636) or an unencrypted LDAP connection (port 389).

cumulus@switch:~$ sudo systemctl stop nslcd.service
cumulus@switch:~$ sudo nslcd -d

After you enable debug mode, run the following command to test LDAP queries:

cumulus@switch:~$ getent passwd

If you configure LDAP correctly, the following messages appear after you run the getent command:

nslcd: DEBUG: accept() failed (ignored): Resource temporarily unavailable
nslcd: [8e1f29] DEBUG: connection from pid=11766 uid=0 gid=0
nslcd: [8e1f29] <passwd(all)> DEBUG: myldap_search(base="dc=example,dc=com", filter="(objectClass=posixAccount)")
nslcd: [8e1f29] <passwd(all)> DEBUG: ldap_result(): uid=myuser,ou=people,dc=example,dc=com
nslcd: [8e1f29] <passwd(all)> DEBUG: ldap_result(): ... 152 more results
nslcd: [8e1f29] <passwd(all)> DEBUG: ldap_result(): end of results (162 total)

In the example output above, <passwd(all)> shows a query of the entire directory structure.

You can query a specific user with the following command:

cumulus@switch:~$ getent passwd myuser

You can replace myuser with any username on the switch. The following debug output indicates that user myuser exists:

nslcd: DEBUG: add_uri(ldap://10.50.21.101)
nslcd: version 0.8.10 starting
nslcd: DEBUG: unlink() of /var/run/nslcd/socket failed (ignored): No such file or directory
nslcd: DEBUG: setgroups(0,NULL) done
nslcd: DEBUG: setgid(110) done
nslcd: DEBUG: setuid(107) done
nslcd: accepting connections
nslcd: DEBUG: accept() failed (ignored): Resource temporarily unavailable
nslcd: [8b4567] DEBUG: connection from pid=11369 uid=0 gid=0
nslcd: [8b4567] <passwd="myuser"> DEBUG: myldap_search(base="dc=cumulusnetworks,dc=com", filter="(&(objectClass=posixAccount)(uid=myuser))")
nslcd: [8b4567] <passwd="myuser"> DEBUG: ldap_initialize(ldap://<ip_address>)
nslcd: [8b4567] <passwd="myuser"> DEBUG: ldap_set_rebind_proc()
nslcd: [8b4567] <passwd="myuser"> DEBUG: ldap_set_option(LDAP_OPT_PROTOCOL_VERSION,3)
nslcd: [8b4567] <passwd="myuser"> DEBUG: ldap_set_option(LDAP_OPT_DEREF,0)
nslcd: [8b4567] <passwd="myuser"> DEBUG: ldap_set_option(LDAP_OPT_TIMELIMIT,0)
nslcd: [8b4567] <passwd="myuser"> DEBUG: ldap_set_option(LDAP_OPT_TIMEOUT,0)
nslcd: [8b4567] <passwd="myuser"> DEBUG: ldap_set_option(LDAP_OPT_NETWORK_TIMEOUT,0)
nslcd: [8b4567] <passwd="myuser"> DEBUG: ldap_set_option(LDAP_OPT_REFERRALS,LDAP_OPT_ON)
nslcd: [8b4567] <passwd="myuser"> DEBUG: ldap_set_option(LDAP_OPT_RESTART,LDAP_OPT_ON)
nslcd: [8b4567] <passwd="myuser"> DEBUG: ldap_simple_bind_s(NULL,NULL) (uri="ldap://<ip_address>")
nslcd: [8b4567] <passwd="myuser"> DEBUG: ldap_result(): end of results (0 total)
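When the debug output is long, you can extract the per-request result counts instead of reading every line. The following Python sketch assumes the "end of results (N total)" line format shown in the examples above; the function name and the regular expression are illustrative and are not part of nslcd.

```python
import re

# Matches lines such as:
# nslcd: [8e1f29] <passwd(all)> DEBUG: ldap_result(): end of results (162 total)
RESULT_RE = re.compile(
    r"<(?P<request>[^>]+)> DEBUG: ldap_result\(\): "
    r"end of results \((?P<total>\d+) total\)"
)


def result_totals(log_text):
    """Map each request (for example, passwd(all)) to its reported result count."""
    return {m.group("request"): int(m.group("total"))
            for m in RESULT_RE.finditer(log_text)}


LOG = """\
nslcd: [8e1f29] <passwd(all)> DEBUG: ldap_result(): ... 152 more results
nslcd: [8e1f29] <passwd(all)> DEBUG: ldap_result(): end of results (162 total)
nslcd: [8b4567] <passwd="myuser"> DEBUG: ldap_result(): end of results (0 total)
"""

print(result_totals(LOG))
```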

TACACS

Cumulus Linux implements TACACS+ client AAA in a transparent way with minimal configuration. The client implements the TACACS+ protocol as described in this IETF document. There is no need to create accounts or directories on the switch. Accounting records go to all configured TACACS+ servers by default. Using per-command authorization requires additional setup on the switch.

TACACS+ in Cumulus Linux:

TACACS+ Client Packages

NVUE automatically installs the TACACS+ packages; you do not have to install the packages if you use NVUE commands to configure TACACS+.

If you use Linux commands to configure TACACS+, you must install the TACACS+ packages. You can install the TACACS+ packages even if the switch is not connected to the internet; the packages are in the cumulus-local-apt-archive repository in the Cumulus Linux image.

To install all required packages, run these commands:

cumulus@switch:~$ sudo -E apt-get update
cumulus@switch:~$ sudo -E apt-get install tacplus-client

Required TACACS+ Client Configuration

Configure the following required settings on the switch (the TACACS+ client).

If you use NVUE commands to configure TACACS+, you must also set the priority for the authentication order for local and TACACS+ users, and enable TACACS+.

After you configure any TACACS+ settings with NVUE and you run nv config apply, you must restart the NVUE service with the sudo systemctl restart nvued.service command.

NVUE commands require you to specify the priority for each TACACS+ server. You must set a priority even if you only specify one server.

The following example commands set:

  • The TACACS+ server priority to 5.
  • The IP address of the server to 192.168.0.30.
  • The secret to mytacac$key.
  • The VRF to mgmt.
  • The authentication order so that TACACS+ authentication has priority over local (the lower number has priority).
  • TACACS+ to enabled.

If you include special characters in the password (such as $), you must enclose the password in single quotes (').

cumulus@switch:~$ nv set system aaa tacacs server 5 host 192.168.0.30
cumulus@switch:~$ nv set system aaa tacacs server 5 secret 'mytacac$key'
cumulus@switch:~$ nv set system aaa tacacs vrf mgmt 
cumulus@switch:~$ nv set system aaa authentication-order 5 tacacs
cumulus@switch:~$ nv set system aaa authentication-order 10 local
cumulus@switch:~$ nv set system aaa tacacs enable on
cumulus@switch:~$ nv config apply

If you want the server to use IPv6, you must add the nv set system aaa tacacs server <priority> prefer-ip-version 6 command:

cumulus@switch:~$ nv set system aaa tacacs server 5 host server5
cumulus@switch:~$ nv set system aaa tacacs server 5 prefer-ip-version 6
...

If you configure more than one TACACS+ server, you need to set the priority for each server. If the switch cannot establish a connection with the server that has the highest priority, it tries to establish a connection with the server that has the next highest priority. The server with the lower number has the higher priority. In the example below, server 192.168.0.30 with a priority value of 5 has a higher priority than server 192.168.1.30, which has a priority value of 10.

cumulus@switch:~$ nv set system aaa tacacs server 5 host 192.168.0.30
cumulus@switch:~$ nv set system aaa tacacs server 5 secret 'mytacac$key' 
cumulus@switch:~$ nv set system aaa tacacs server 10 host 192.168.1.30
cumulus@switch:~$ nv set system aaa tacacs server 10 secret 'mytacac$key2'
cumulus@switch:~$ nv config apply
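The priority rule can be modeled in a few lines. The following Python sketch is a simplified illustration, not the switch implementation: the function names are hypothetical, and the set of reachable servers is a stand-in for real connection attempts.

```python
from typing import Dict, List, Optional, Set


def connection_order(servers: Dict[int, str]) -> List[str]:
    """Order servers by priority value; the lower number has the higher priority."""
    return [host for _, host in sorted(servers.items())]


def first_reachable(servers: Dict[int, str], reachable: Set[str]) -> Optional[str]:
    """Return the highest priority server that accepts a connection, if any."""
    for host in connection_order(servers):
        if host in reachable:  # stand-in for an actual connection attempt
            return host
    return None


servers = {10: "192.168.1.30", 5: "192.168.0.30"}
print(connection_order(servers))
print(first_reachable(servers, {"192.168.1.30"}))
```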
  1. Edit the /etc/tacplus_servers file to add at least one server and one shared secret (key). You can specify the server and secret parameters in any order anywhere in the file. Whitespace (spaces or tabs) is not allowed. For example, if your TACACS+ server IP address is 192.168.0.30 and your shared secret is mytacac$key, add these parameters to the /etc/tacplus_servers file:

    cumulus@switch:~$ sudo nano /etc/tacplus_servers
    secret=mytacac$key
    server=192.168.0.30
    

    Cumulus Linux supports a maximum of seven TACACS+ servers. To specify multiple servers, add one per line to the /etc/tacplus_servers file. The switch establishes connections in the order in which the servers appear in the file.

    cumulus@switch:~$ sudo nano /etc/tacplus_servers
    secret=mytacac$key
    server=192.168.0.30
    secret=mytacac$key2
    server=192.168.1.30
    

    If you want the server to use IPv6, you must add the prefer_ip_version=6 parameter in the /etc/tacplus_servers file:

    cumulus@switch:~$ sudo nano /etc/tacplus_servers
    secret=mytacac$key
    server=server5
    prefer_ip_version=6
    secret=mytacac$key2
    server=server6
    prefer_ip_version=6
    
  2. Uncomment the vrf=mgmt line in the /etc/tacplus_servers file:

    # If the management network is in a vrf, set this variable to the vrf name.
    # This would usually be "mgmt"
    # When this variable is set, the connection to the TACACS+ accounting servers
    # will be made through the named vrf.
    vrf=mgmt
    
  3. Restart auditd:

    cumulus@switch:~$ sudo systemctl restart auditd
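
Before you restart auditd, you can check the file against the two constraints stated in step 1: no whitespace in parameter lines and no more than seven servers. The following Python sketch is illustrative only; the function name is hypothetical, and treating any space or tab inside a parameter line as an error is an assumption based on the description above.

```python
MAX_SERVERS = 7  # maximum number of TACACS+ servers stated in this section


def check_tacplus_servers(text):
    """Return a list of problems found in tacplus_servers-style text."""
    problems = []
    server_count = 0
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # ignore blank lines and comments
        if any(ch.isspace() for ch in line):
            problems.append("line {}: whitespace is not allowed".format(number))
        if line.startswith("server="):
            server_count += 1
    if server_count == 0:
        problems.append("no server configured")
    if server_count > MAX_SERVERS:
        problems.append("more than {} servers configured".format(MAX_SERVERS))
    return problems


print(check_tacplus_servers("secret=mytacac$key\nserver=192.168.0.30\n"))
print(check_tacplus_servers("secret=mytacac$key\nserver = 192.168.0.30\n"))
```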
    

Optional TACACS+ Configuration

You can configure the following optional TACACS+ settings:

The following example commands set the timeout to 10 seconds and the TACACS+ server port to 32:

cumulus@switch:~$ nv set system aaa tacacs timeout 10
cumulus@switch:~$ nv set system aaa tacacs server 5 port 32
cumulus@switch:~$ nv config apply

The following example commands set the source IP address to 10.10.10.1 and the authentication type to CHAP:

cumulus@switch:~$ nv set system aaa tacacs source-ip 10.10.10.1
cumulus@switch:~$ nv set system aaa tacacs authentication mode chap
cumulus@switch:~$ nv config apply

The following example commands exclude the user USER1 from going to the TACACS+ server for authentication and enable Cumulus Linux to create a separate home directory for each TACACS+ user when the TACACS+ user first logs in:

cumulus@switch:~$ nv set system aaa tacacs exclude-user USER1
cumulus@switch:~$ nv set system aaa tacacs authentication per-user-homedir on
cumulus@switch:~$ nv config apply
  • To set the server port (use the format server:port), source IP address, authentication type, and enable Cumulus Linux to create a separate home directory for each TACACS+ user, edit the /etc/tacplus_servers file, then restart auditd.
  • To set the timeout and the usernames to exclude from TACACS+ authentication, edit the /etc/tacplus_nss.conf file (you do not need to restart auditd).

The following example sets the server port to 32, the authentication type to CHAP, the source IP address to 10.10.10.1, and enables Cumulus Linux to create a separate home directory for each TACACS+ user when the TACACS+ user first logs in:

cumulus@switch:~$ sudo nano /etc/tacplus_servers
...
secret=mytacac$key
server=192.168.0.30:32
...
# Sets the IPv4 address used as the source IP address when communicating with
# the TACACS+ server.  IPv6 addresses are not supported, nor are hostnames.
# The address must work when passed to the bind() system call, that is, it must
# be valid for the interface being used.
source_ip=10.10.10.1
...
# If user_homedir=1, then tacacs users will be set to have a home directory
# based on their login name, rather than the mapped tacacsN home directory.
# mkhomedir_helper is used to create the directory if it does not exist (similar
# to use of pam_mkhomedir.so). This flag is ignored for users with restricted
# shells, e.g., users mapped to a tacacs privilege level that has enforced
# per-command authorization (see the tacplus-restrict man page).
user_homedir=1
...
login=chap
cumulus@switch:~$ sudo systemctl restart auditd

The following example sets the timeout to 10 seconds and excludes the user USER1 from going to the TACACS+ server for authentication:

cumulus@switch:~$ sudo nano /etc/tacplus_nss.conf
...
# The connection timeout for an NSS library should be short, since it is
# invoked for many programs and daemons, and a failure is usually not
# catastrophic.  Not set or set to a negative value disables use of poll().
# This follows the include of tacplus_servers, so it can override any
# timeout value set in that file.
# It's important to have this set in this file, even if the same value
# as in tacplus_servers, since tacplus_servers should not be readable
# by users other than root.
timeout=10
...
# This is a comma separated list of usernames that are never sent to
# a tacacs server, they cause an early not found return.
#
# "*" is not a wild card.  While it's not a legal username, it turns out
# that during pathname completion, bash can do an NSS lookup on "*"
# To avoid server round trip delays, or worse, unreachable server delays
# on filename completion, we include "*" in the exclusion list.
exclude_users=root,daemon,nobody,cron,radius_user,radius_priv_user,sshd,cumulus,quagga,frr,snmp,www-data,ntp,man,_lldpd,USER1,*
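
The comment in the file notes that "*" is not a wildcard. The following Python sketch illustrates that behavior with an exact-match lookup; the function name is hypothetical, and the code models the description in the file comments rather than the NSS plugin itself.

```python
def is_excluded(username, exclude_users_line):
    """Check whether a username appears in an exclude_users=... line.

    Matching is exact: "*" only matches the literal name "*".
    """
    key, _, value = exclude_users_line.partition("=")
    if key.strip() != "exclude_users":
        raise ValueError("not an exclude_users line")
    names = {name.strip() for name in value.split(",")}
    return username in names


LINE = "exclude_users=root,daemon,nobody,cron,sshd,cumulus,frr,USER1,*"

print(is_excluded("USER1", LINE))
print(is_excluded("USER2", LINE))
```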

Cumulus Linux supports the following additional Linux parameters in the /etc/tacplus_nss.conf file. Currently, there are no equivalent NVUE commands.

Linux Parameter Description
include Configures a supplemental configuration file to avoid duplicating configuration information. You can include up to eight additional configuration files. For example: include=/myfile/myname.
min_uid Configures the minimum user ID that the NSS plugin can look up. 0 specifies that the plugin never looks up uid 0 (root). Do not specify a value greater than the local TACACS+ user IDs (0 through 15).

TACACS+ Accounting

When you install the TACACS+ packages and configure the basic TACACS+ settings (set the server and shared secret), accounting is on and there is no additional configuration required.

TACACS+ accounting uses the audisp module, with an additional plugin for auditd and audisp. The plugin maps the auid in the accounting record to a TACACS login, which it bases on the auid and sessionid. The audisp module requires libnss_tacplus and uses the libtacplus_map.so library interfaces as part of the modified libpam_tacplus package.

Communication with the TACACS+ servers occurs with the libsimple-tacact1 library, through dlopen(). The accounting record includes a maximum of 240 bytes of command name and arguments, due to the TACACS+ field length limitation of 255 bytes.

  • All sudo commands run by TACACS+ users generate accounting records against the original TACACS+ login name.
  • All Linux and NVUE commands result in an accounting record, including login commands and sub-processes of other commands. This can generate a lot of accounting records.
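
To see how the 240-byte limit affects long commands, the following Python sketch truncates a command line to that size. It is an illustration only: the function name is hypothetical, and joining the command name and arguments with single spaces before truncating is an assumption, not a description of the accounting plugin.

```python
MAX_COMMAND_BYTES = 240  # limit on command name plus arguments stated above


def truncate_command(argv):
    """Join the command name and arguments, then keep at most 240 bytes."""
    data = " ".join(argv).encode("utf-8")
    return data[:MAX_COMMAND_BYTES]


short = truncate_command(["nv", "config", "apply"])
long_record = truncate_command(["echo", "x" * 300])
print(short, len(long_record))
```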

By default, Cumulus Linux sends accounting records to all servers. You can change this setting to send accounting records to the server that is first to respond:

cumulus@switch:~$ nv set system aaa tacacs accounting send-records first-response
cumulus@switch:~$ nv config apply

To reset to the default configuration (send accounting records to all servers), run the nv set system aaa tacacs accounting send-records all command.

  1. Edit the /etc/audisp/audisp-tac_plus.conf file and change the acct_all parameter to 0:

    cumulus@switch:~$ sudo nano /etc/audisp/audisp-tac_plus.conf
    ...
    acct_all=0
    
  2. Restart auditd:

    cumulus@switch:~$ sudo systemctl restart auditd
    

To reset to the default configuration (send accounting records to all servers), change the value of acct_all to 1 (acct_all=1).

To disable TACACS+ accounting:

cumulus@switch:~$ nv set system aaa tacacs accounting enable off
cumulus@switch:~$ nv config apply
  1. Edit the /etc/audit/plugins.d/audisp-tacplus.conf file and change the active parameter to no:

    cumulus@switch:~$ sudo nano /etc/audit/plugins.d/audisp-tacplus.conf
    ...
    # default to enabling tacacs accounting; change to no to disable
    active = no
    
  2. Restart auditd:

    cumulus@switch:~$ sudo systemctl restart auditd
    

Local Fallback Authentication

You can configure the switch to allow local fallback authentication for a user when the TACACS servers are unreachable, do not include the user for authentication, or have the user in the exclude user list.

To allow local fallback authentication for a user, add a local privileged user account on the switch with the same username as a TACACS user. A local user is always active even when the TACACS service is not running.

NVUE does not provide commands to configure local fallback authentication.

To configure local fallback authentication:

  1. Edit the /etc/nsswitch.conf file to remove the keyword tacplus from the line starting with passwd. (You need to add the keyword back in step 3.)

    The following example shows the /etc/nsswitch.conf file with no tacplus keyword in the line starting with passwd.

    cumulus@switch:~$ sudo nano /etc/nsswitch.conf
    #
    # Example configuration of GNU Name Service Switch functionality.
    # If you have the `glibc-doc-reference' and `info' packages installed, try:
    # `info libc "Name Service Switch"' for information about this file.
    passwd:         files
    group:          tacplus files
    shadow:         files
    gshadow:        files
    ...
    
  2. To enable the local privileged user to run sudo and NVUE commands, run the adduser commands shown below. In the example commands, the TACACS account name is tacadmin.

    The first adduser command prompts for information and a password. You can skip most of the requested information by pressing ENTER.

    cumulus@switch:~$ sudo adduser --ingroup tacacs tacadmin
    cumulus@switch:~$ sudo adduser tacadmin nvset
    cumulus@switch:~$ sudo adduser tacadmin nvapply
    cumulus@switch:~$ sudo adduser tacadmin sudo
    
  3. Edit the /etc/nsswitch.conf file to add the keyword tacplus back to the line starting with passwd (the keyword you removed in the first step).

    cumulus@switch:~$ sudo nano /etc/nsswitch.conf
    #
    # Example configuration of GNU Name Service Switch functionality.
    # If you have the `glibc-doc-reference' and `info' packages installed, try:
    # `info libc "Name Service Switch"' for information about this file.
    passwd:         tacplus files
    group:          tacplus files
    shadow:         files
    gshadow:        files
    ...
    
  4. Restart the nvued service with the following command:

    cumulus@switch:~$ sudo systemctl restart nvued
    

TACACS+ Per-command Authorization

TACACS+ per-command authorization lets you configure the commands that TACACS+ users at different privilege levels can run.

To reach the TACACS+ server through the default VRF, you must specify the egress interface you use in the default VRF. Either run the NVUE nv set system aaa tacacs vrf <interface> command (for example, nv set system aaa tacacs vrf swp51) or set the vrf=<interface> option in the /etc/tacplus_servers file (for example, vrf=swp51).

The following commands allow TACACS+ users at privilege level 0 to run the nv and ip commands (if authorized by the TACACS+ server):

cumulus@switch:~$ nv set system aaa tacacs authorization 0 command ip 
cumulus@switch:~$ nv set system aaa tacacs authorization 0 command nv
cumulus@switch:~$ nv config apply

To show the per-command authorization settings, run the nv show system aaa tacacs authorization command:

cumulus@switch:~$ nv show system aaa tacacs authorization
Privilege Level  role          command
---------------  ------------  -------
0                nvue-monitor  ip     
                               nv  
tacuser0@switch:~$ sudo tacplus-restrict -i -u tacacs0 -a ip nv

The tacplus-auth command handles authorization for each command. To enforce this authorization, change the TACACS+ login to use a restricted shell with a very limited executable search path; otherwise, the user can bypass the authorization. The tacplus-restrict utility simplifies setting up the restricted environment.

The following table provides the tacplus-restrict command options:

Option Description
-i Initializes the environment. You only need to issue this option one time per username.
-a You can invoke the utility with the -a option as often as you like. For each command in the -a list, the utility creates a symbolic link from tacplus-auth to the relative portion of the command name in the local bin subdirectory. You also need to enable these commands on the TACACS+ server (refer to your TACACS+ server documentation). It is common for the server to allow some options to a command, but not others.
-f Re-initializes the environment. If you need to restart, run the -f option with -i to force re-initialization; otherwise, the utility ignores repeated use of -i.
During initialization:
- The user shell changes to /bin/rbash.
- The utility saves any existing dot files.

After running this command, examine the tacacs0 directory:

cumulus@switch:~$ sudo ls -lR ~tacacs0
total 12
lrwxrwxrwx 1 root root 22 Nov 21 22:07 ip -> /usr/sbin/tacplus-auth
lrwxrwxrwx 1 root root 22 Nov 21 22:07 nv -> /usr/sbin/tacplus-auth

Except for shell built-ins, privilege level 0 TACACS users can only run the ip and nv commands.
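
To review which commands the restricted environment exposes, you can list the symbolic links that point to tacplus-auth. The following is a minimal sketch; the list_authorized_commands function name is hypothetical, and the function takes the directory that holds the links as its argument.

```shell
# Hypothetical helper: print the name of each symbolic link in a directory
# that points to tacplus-auth, one name per line.
list_authorized_commands() {
    dir="$1"
    for link in "$dir"/*; do
        # Skip regular files and an unmatched glob.
        [ -L "$link" ] || continue
        case "$(readlink "$link")" in
            */tacplus-auth) basename "$link" ;;
        esac
    done
}
```

With the links shown in the example above, the function prints ip and nv. Run it as a user that can read the directory.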

If you add commands with the -a option by mistake, you can remove them. The example below removes the nv command:

cumulus@switch:~$ sudo rm ~tacacs0/bin/nv

To remove all commands:

cumulus@switch:~$ sudo rm ~tacacs0/bin/*

Remove the TACACS+ Client Packages

To remove all the TACACS+ client packages, use the following commands:

cumulus@switch:~$ sudo -E apt-get remove tacplus-client
cumulus@switch:~$ sudo -E apt-get autoremove

To remove the TACACS+ client configuration files as well as the packages (recommended), use this command:

cumulus@switch:~$ sudo -E apt-get autoremove --purge

Troubleshooting

Show TACACS+ Configuration

Run the following commands to show TACACS+ configuration:

The following example command shows all TACACS+ configuration:

cumulus@switch:~$ nv show system aaa tacacs
                    applied
------------------  -------
enable              off    
debug-level         0      
timeout             5      
vrf                 mgmt   
accounting                 
  enable            off    
authentication             
  mode              pap    
  per-user-homedir  off    
[server]            5      
[server]            10 

The following command shows the list of users excluded from TACACS+ server authentication:

cumulus@switch:~$ nv show system aaa tacacs exclude-user
          applied
--------  -------
username  USER1  

Basic Server Connectivity or NSS Issues

You can use the getent command to determine if you configured TACACS+ correctly and if the local password is in the configuration files. In the example commands below, the cumulus user represents the local user, while cumulusTAC represents the TACACS user.

To look up the username within all NSS methods:

cumulus@switch:~$ sudo getent passwd cumulusTAC
cumulusTAC:x:1016:1001:TACACS+ mapped user at privilege level 15,,,:/home/tacacs15:/bin/bash

To look up the user within the local database only:

cumulus@switch:~$ sudo getent -s compat passwd cumulus
cumulus:x:1000:1000:cumulus,,,:/home/cumulus:/bin/bash

To look up the user within the TACACS+ database only:

cumulus@switch:~$ sudo getent -s tacplus passwd cumulusTAC
cumulusTAC:x:1016:1001:TACACS+ mapped user at privilege level 15,,,:/home/tacacs15:/bin/bash

If TACACS+ is not working correctly, you can use debugging. Add the debug=1 parameter to the /etc/tacplus_servers and /etc/tacplus_nss.conf files; see the Linux Commands under Optional TACACS+ Configuration above. You can also add debug=1 to individual pam_tacplus lines in /etc/pam.d/common*.

All log messages are in /var/log/syslog.

Incorrect Shared Key

The TACACS client on the switch and the TACACS server must have the same shared secret key. If this key is incorrect, the following message prints to syslog:

2017-09-05T19:57:00.356520+00:00 leaf01 sshd[3176]: nss_tacplus: TACACS+ server 192.168.0.254:49 read failed with protocol error (incorrect shared secret?) user cumulus
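
To find which servers report this error, you can filter the log for the message shown above. The following is a minimal sketch that assumes the message format in the example; the find_shared_secret_errors function name is hypothetical, and the log path is an optional argument that defaults to /var/log/syslog.

```shell
# Hypothetical helper: print each TACACS+ server (address:port) for which the
# log contains the nss_tacplus protocol error shown in the example message.
find_shared_secret_errors() {
    log="${1:-/var/log/syslog}"
    grep 'nss_tacplus: TACACS+ server .* read failed with protocol error' "$log" |
        sed 's/.*TACACS+ server \([^ ]*\) read failed.*/\1/' |
        sort -u
}
```

Each server appears one time in the output, regardless of how many times the error repeats in the log.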

Debug Issues with Per-command Authorization

To debug TACACS user command authorization, have the TACACS+ user enter the following command at a shell prompt, then try the command again:

tacuser0@switch:~$ export TACACSAUTHDEBUG=1

When you enable debugging, the command authorization conversation with the TACACS+ server shows additional information.

To disable debugging:

tacuser0@switch:~$ export -n TACACSAUTHDEBUG

Debug Issues with Accounting Records

If you add or delete TACACS+ servers from the configuration files, make sure you notify the audisp plugin with this command:

cumulus@switch:~$ sudo killall -HUP audisp-tacplus

If the switch does not send accounting records, add debug=1 to the /etc/audisp/audisp-tac_plus.conf file, then run the command above to notify the plugin. Ask the TACACS+ user to run a command and examine the end of /var/log/syslog for messages from the plugin. You can also check the auditing log file /var/log/audit/audit.log to be sure the auditing records exist. If the auditing records do not exist, restart the audit daemon with:

cumulus@switch:~$ sudo systemctl restart auditd.service

TACACS+ Package Descriptions

Cumulus Linux uses the following packages for TACACS.

Package Description
audisp-tacplus Uses auditing data from auditd to send accounting records to the TACACS+ server and starts as part of auditd.
libtac2 Provides basic TACACS+ server utility and communication routines.
libnss-tacplus Provides an interface between libc username lookups, the mapping functions, and the TACACS+ server.
tacplus-auth Includes the tacplus-restrict setup utility, which enables you to perform per-command TACACS+ authorization. Per-command authorization is not the default.
libpam-tacplus Provides a modified version of the standard Debian package.
libtacplus-map1 Provides mapping between local and TACACS+ users on the server. The package:
- Sets the immutable sessionid and auditing UID to ensure that you can track the original user through multiple processes and privilege changes.
- Sets the auditing loginuid as immutable.
- Creates and maintains a status database in /run/tacacs_client_map to manage and look up mappings.
libsimple-tacacct1 Provides an interface for programs to send accounting records to the TACACS+ server. audisp-tacplus uses this package.
libtac2-bin Provides the tacc testing program and TACACS+ man page.

TACACS+ Client Configuration Files

The following table describes the TACACS+ client configuration files that Cumulus Linux uses.

Filename Description
/etc/tacplus_servers The primary file that requires configuration after installation. All packages with include=/etc/tacplus_servers parameters use this file. Typically, this file contains the shared secrets; make sure that the Linux file mode is 600.
/etc/nsswitch.conf When the libnss_tacplus package installs, this file configures tacplus lookups through libnss_tacplus. If you replace this file by automation, you need to add tacplus as the first lookup method for the passwd database line.
/etc/tacplus_nss.conf Sets the basic parameters for libnss_tacplus. The file includes a debug variable for debugging NSS lookups separately from other client packages.
/usr/share/pam-configs/tacplus The configuration file for pam-auth-update to generate the files in the next row. The file uses these configurations at login, by su, and by ssh.
/etc/pam.d/common-* The /etc/pam.d/common-* files update for tacplus authentication. The files update with pam-auth-update when you install or remove libpam-tacplus.
/etc/sudoers.d/tacplus Allows TACACS+ privilege level 15 users to run commands with sudo. The file includes an example (commented out) of how to enable privilege level 15 TACACS users to use sudo without a password and provides an example of how to enable all TACACS users to run specific commands with sudo. Only edit this file with the visudo -f /etc/sudoers.d/tacplus command.
/etc/audisp/plugins.d/audisp-tacplus.conf The audisp plugin configuration file. You do not need to modify this file.
/etc/audisp/audisp-tac_plus.conf The TACACS+ server configuration file for accounting. You do not need to modify this file. You can use this configuration file when you only want to debug TACACS+ accounting issues, not all TACACS+ users.
/etc/audit/rules.d/audisp-tacplus.rules The auditd rules for TACACS+ accounting. The augenrules command uses all rule files to generate the rules file.
/etc/audit/audit.rules The audit rules file that is generated when you install auditd.
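
Because /etc/tacplus_servers typically contains the shared secrets, you can verify that its file mode is 600 before you deploy configuration. The following is a minimal sketch; the has_mode_600 function name is hypothetical, and the check reads the permission string that ls reports.

```shell
# Hypothetical helper: succeeds only when the file is a regular file with
# mode 600 (read and write for the owner, no access for group or others).
has_mode_600() {
    # The first ten characters of "ls -ld" output are the type and permissions.
    perms=$(ls -ld "$1" | cut -c1-10)
    [ "$perms" = "-rw-------" ]
}
```

For example, sudo sh -c can run has_mode_600 /etc/tacplus_servers after you source the function; if the check fails, sudo chmod 600 /etc/tacplus_servers sets the expected mode.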

Considerations

Multiple TACACS+ Users

If two or more TACACS+ users log in simultaneously with the same privilege level, the accounting records are correct, but a lookup on either name matches both users and a UID lookup returns only the user that logs in first.

As a result, any processes that either user runs apply to both and all files either user creates apply to the first name matched. This is similar to adding two local users to the password file with the same UID and GID and is an inherent limitation of using the UID for the base user from the password file.

The current algorithm returns the first name matching the UID from the mapping file; either the first or the second user that logs in.

To work around this issue, you can use the switch audit log or the TACACS server accounting logs to determine which processes and files each user creates.

The Linux auditd system does not always generate audit events for processes that a signal terminates (through the kill system call or internal errors such as SIGSEGV). As a result, processes that exit on a signal that you do not handle might not generate a STOP accounting record.

Issues with the deluser Command

TACACS+ and other non-local users that run the deluser command with the --remove-home option see the following error:

tacuser0@switch:~$ deluser --remove-home USERNAME
userdel: cannot remove entry 'USERNAME' from /etc/passwd
/usr/sbin/deluser: `/usr/sbin/userdel USERNAME' returned error code 1. Exiting

The command does remove the home directory. The user can still log in on that account but does not have a valid home directory. This is a known upstream issue with the deluser command for all non-local users.

Only use the --remove-home option with the user_homedir=1 configuration command.

Both TACACS+ and RADIUS AAA Clients

When you install both the TACACS+ and the RADIUS AAA client, Cumulus Linux does not attempt RADIUS login. As a workaround, do not install both the TACACS+ and the RADIUS AAA client on the same switch.

TACACS+ and PAM

PAM modules and an updated version of the libpam-tacplus package configure authentication initially. When you install the package, the pam-auth-update command updates the PAM configuration in /etc/pam.d. If you make changes to your PAM configuration, you need to integrate these changes. If you also use LDAP with the libpam-ldap package, you need to edit the PAM configuration with the LDAP and TACACS ordering you prefer. The libpam-tacplus package ignore rules and the values in success=2 require adjustments to ignore LDAP rules.

The TACACS+ privilege attribute priv_lvl determines the privilege level for the user that the TACACS+ server returns during the user authorization exchange. The client accepts the attribute in either the mandatory or optional forms and also accepts priv-lvl as the attribute name. The attribute value must be a numeric string in the range 0 to 15, with 15 the most privileged level.

By default, TACACS+ users at privilege levels other than 15 cannot run sudo commands and can only run commands with standard Linux user permissions.

You can edit the /etc/pam.d/common-* files manually. However, if you run pam-auth-update again after making the changes, the update fails. Only configure /usr/share/pam-configs/tacplus, then run pam-auth-update.

NSS Plugin

With pam_tacplus, TACACS+ authenticated users can log in without a local account on the system using the NSS plugin that comes with the tacplus_nss package. The plugin uses the mapped tacplus information if the user is not in the local password file, provides the getpwnam() and getpwuid() entry points, and uses the TACACS+ authentication functions.

The plugin asks the TACACS+ server if it knows the user, and then for relevant attributes to determine the privilege level of the user. When you install the libnss_tacplus package, nsswitch.conf changes to set tacplus as the first lookup method for passwd. If you change the order, lookups return the local accounts, such as tacacs0.

If the user is not in the local password file, the plugin uses the libtacplus.so exported functions to do a mapped lookup. The plugin appends the privilege level to tacacs and searches for that name in the local password file. For example, privilege level 15 searches for the tacacs15 user. If the lookup finds the user, the plugin adds information for the user to the password structure.

If the lookup does not find the user, the plugin decrements the privilege level and checks again until it reaches privilege level 0 (user tacacs0). This allows you to use only the two local users tacacs0 and tacacs15, for minimal configuration.
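
The mapped lookup described above can be sketched as a loop that starts at the privilege level of the user and steps down until a local tacacsN account exists. This is an illustration of the documented behavior only, not the plugin code; the mapped_lookup function name is hypothetical, and the password file path is an optional argument that defaults to /etc/passwd.

```shell
# Hypothetical sketch of the mapped lookup: starting at the given privilege
# level, look for a local tacacsN user in a passwd-format file and step the
# level down until a match is found or level 0 has been checked.
mapped_lookup() {
    level="$1"
    passwd_file="${2:-/etc/passwd}"
    while [ "$level" -ge 0 ]; do
        # The trailing colon keeps tacacs1 from matching tacacs15.
        if grep -q "^tacacs${level}:" "$passwd_file"; then
            echo "tacacs${level}"
            return 0
        fi
        level=$((level - 1))
    done
    return 1
}
```

With only the tacacs0 and tacacs15 local users present, a lookup at privilege level 15 returns tacacs15 and a lookup at any lower level returns tacacs0.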

TACACS+ Client Sequencing

Cumulus Linux requires the following information at the beginning of the AAA sequence:

For non-local users (users not in the local password file) you need to send a TACACS+ authorization request as the first communication with the TACACS+ server, before authentication and before the user logging in requests a password.

You need to configure certain TACACS+ servers to allow authorization requests before authentication. Contact your TACACS+ server vendor for information.

Multiple Servers with Different User Accounts

If you configure multiple TACACS+ servers that have different user accounts:

RADIUS AAA

Cumulus Linux provides add-on packages to enable RADIUS users to log into the switch transparently with minimal configuration. There is no need to create accounts or directories on the switch. Authentication uses PAM and includes login, ssh, sudo and su.

Install the RADIUS Packages

NVUE automatically installs the RADIUS AAA packages; you do not have to install the packages if you use NVUE commands to configure RADIUS AAA.

If you use Linux commands to configure RADIUS AAA, you must install the RADIUS libnss-mapuser and libpam-radius-auth packages before you start configuration. The packages are in the cumulus-local-apt-archive repository, which is embedded in the Cumulus Linux image. You can install the packages even when the switch is not connected to the internet.

To install the RADIUS packages:

cumulus@switch:~$ sudo apt-get update
cumulus@switch:~$ sudo apt-get install libnss-mapuser libpam-radius-auth

When installing the libpam-radius-auth package, Cumulus Linux prompts you to either overwrite the local files with those from the package or to keep the local files. The default option is to keep the local files, which does not allow RADIUS to operate as expected. To install the libpam-radius-auth package and overwrite the local files, run the following command:

cumulus@switch:~$ sudo apt-get -y -o Dpkg::Options::=--force-confnew install libnss-mapuser libpam-radius-auth

If you install the libpam-radius-auth package without overwriting the local files, you must either remove and reinstall the package with the sudo apt-get -y -o Dpkg::Options::=--force-confnew install libnss-mapuser libpam-radius-auth command, or overwrite the local files without removing or reinstalling the package with the sudo pam-auth-update --force command.

After installation completes, either reboot the switch or run the sudo systemctl restart nvued command.

The nvshow group includes the radius_user account, and the nvset, nvapply, and sudo groups include the radius_priv_user account. This enables all RADIUS logins to run NVUE nv show commands and all privileged RADIUS users to also run nv set, nv unset, and nv config apply commands, and to use sudo.

Required RADIUS Client Configuration

After you install the required RADIUS packages, configure the following required settings on the switch (the RADIUS client):

After you configure any RADIUS settings with NVUE and you run nv config apply, you must restart the NVUE service with the sudo systemctl restart nvued.service command.

The following example commands set:

  • The IP address of the RADIUS server to 192.168.0.254 and the port to 42.
  • The secret to 'myradius$key'.
  • The priority at which Cumulus Linux contacts the RADIUS server to 10.
  • The authentication order so that RADIUS authentication (order 10) has priority over local authentication (order 20).

cumulus@switch:~$ nv set system aaa radius server 192.168.0.254 port 42
cumulus@switch:~$ nv set system aaa radius server 192.168.0.254 secret 'myradius$key'
cumulus@switch:~$ nv set system aaa radius server 192.168.0.254 priority 10
cumulus@switch:~$ nv set system aaa authentication-order 10 radius
cumulus@switch:~$ nv set system aaa authentication-order 20 local
cumulus@switch:~$ nv config apply

Edit the /etc/pam_radius_auth.conf file to specify the hostname or IP address of at least one RADIUS server, and the shared secret you want to use to authenticate and encrypt communication with each server.

...
mapped_priv_user   radius_priv_user

# server[:port]    shared_secret   timeout (secs)  src_ip
192.168.0.254:42   myradius$key       3
...

You must be able to resolve the hostname of the switch to an IP address. If you cannot find the hostname in DNS, you can add the hostname to the /etc/hosts file manually. Be aware that adding the hostname to the /etc/hosts file manually can cause problems because DHCP assigns the IP address, which can change at any time.

Cumulus Linux verifies multiple server configuration lines in the order listed. Other than memory, there is no limit to the number of RADIUS servers you can use.

The server port number is optional. The system looks up the port in the /etc/services file. However, you can override the ports in the /etc/pam_radius_auth.conf file.

Optional RADIUS Configuration

You can configure the following global RADIUS settings and server specific settings.

Setting Description
vrf The VRF you want to use to communicate with the RADIUS servers. This is typically the management VRF (mgmt), which is the default VRF on the switch. You cannot specify more than one VRF.
privilege-level The minimum privilege level that determines if users can configure the switch with NVUE commands and sudo, or have read-only rights. The default privilege level is 15, which provides full administrator access. This is a global option only; you cannot set the minimum privilege level for specific RADIUS servers.
retransmit The maximum number of retransmission attempts allowed for requests when a RADIUS authentication request times out. This is a global option only; you cannot set the number of retransmission attempts for specific RADIUS servers.
timeout The timeout value when a server is slow or latencies are high. You can set a value between 1 and 60. The default timeout is 3 seconds. If you configure multiple RADIUS servers, you can set a global timeout for all servers.
source-ipv4, source-ipv6 A specific interface to reach all RADIUS servers. To configure the source IP address for a specific RADIUS server, use the source-ip option.
debug The debug option for troubleshooting. The debugging messages write to /var/log/syslog. When the RADIUS client is working correctly, you can disable the debug option. You enable the debug option globally for all the servers.

The following example configures global RADIUS settings:

cumulus@switch:~$ nv set system aaa radius vrf mgmt
cumulus@switch:~$ nv set system aaa radius privilege-level 10
cumulus@switch:~$ nv set system aaa radius retransmit 8
cumulus@switch:~$ nv set system aaa radius timeout 10
cumulus@switch:~$ nv set system aaa radius source-ipv4 192.168.1.10
cumulus@switch:~$ nv set system aaa radius debug enable
cumulus@switch:~$ nv config apply

The following example configures RADIUS settings for a specific RADIUS server:

cumulus@switch:~$ nv set system aaa radius server 192.168.0.254 source-ip 192.168.1.10
cumulus@switch:~$ nv set system aaa radius server 192.168.0.254 timeout 10
cumulus@switch:~$ nv config apply

Setting Description
vrf The VRF you want to use to communicate with the RADIUS servers. This is typically the management VRF (mgmt), which is the default VRF on the switch. You cannot specify more than one VRF.
privilege-level Determines the privilege level for the user on the switch.
timeout The timeout value when a server is slow or latencies are high. You can set a value between 1 and 60. The default timeout is 3 seconds. If you configure multiple RADIUS servers, you can set a global timeout for all servers.
src_ip A specific IPv4 or IPv6 interface to reach the RADIUS server. If you configure multiple RADIUS servers, you can configure a specific interface to reach all RADIUS servers.
debug The debug option for troubleshooting. The debugging messages write to /var/log/syslog. When the RADIUS client is working correctly, you can disable the debug option. If you configure multiple RADIUS servers, you can enable the debug option globally for all the servers.

Edit the /etc/pam_radius_auth.conf file.

...
# Set the minimum privilege level in VSA attribute shell:privilege-level=VALUE
# default is 15, range is 0-15.
privilege-level 10
#
#  Uncomment to enable debugging, can be used instead of altering pam files
debug
#
# Account for privileged radius user mapping.  If you change it here,  you need
# to change /etc/nss_mapuser.conf as well
mapped_priv_user radius_priv_user

# server[:port]                                    shared_secret       timeout (secs)     src_ip
192.168.0.254:42                                   myradius$key        10                 192.168.1.10        

vrf-name mgmt

Enable Login without Local Accounts

NVUE does not provide commands to enable login without local accounts.

Because LDAP is not commonly used with switches and adding accounts locally is cumbersome, Cumulus Linux includes a mapping capability with the libnss-mapuser package.

Mapping uses two NSS (Name Service Switch) plugins, one for account name, and one for UID lookup. The installation process configures these accounts automatically in the /etc/nsswitch.conf file and removes them when you delete the package. See the nss_mapuser (8) man page for the full description of this plugin.

A username is mapped at login to a fixed account specified in the configuration file, with the fields of the fixed account used as a template for the user that is logging in.

For example, if you look up the name dave and the fixed account in the configuration file is radius_user, and that entry in /etc/passwd is:

radius_user:x:1017:1002:radius user:/home/radius_user:/bin/bash

then the matching line that returns when you run getent passwd dave is:

cumulus@switch:~$ getent passwd dave
dave:x:1017:1002:dave mapped user:/home/dave:/bin/bash
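
The template substitution in this example can be sketched with awk. This is an illustration only, not the plugin code; the map_user_entry function name is hypothetical, and the sketch assumes that home directories are under /home and that the description field follows the "NAME mapped user" pattern shown in the example output.

```shell
# Hypothetical sketch: build the passwd entry returned for a mapped user from
# the entry of the fixed account, keeping the UID, GID, and shell.
map_user_entry() {
    name="$1"
    template="$2"
    printf '%s\n' "$template" | awk -F: -v OFS=: -v user="$name" '{
        $1 = user                  # login name becomes the user logging in
        $5 = user " mapped user"   # description field, as in the example
        $6 = "/home/" user         # per-user home directory (assumed /home)
        print
    }'
}
```

Calling map_user_entry dave with the radius_user entry above as the second argument prints the same line that getent passwd dave returns in the example.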

The login process creates the home directory /home/dave if it does not already exist and uses the mkhomedir_helper command to populate it with the standard skeleton files.

The configuration file /etc/nss_mapuser.conf configures the plugins. The file includes the mapped account name, which is radius_user by default. You can change the mapped account name by editing the file. The nss_mapuser (5) man page describes the configuration file.

A flat file mapping derives from the session number assigned during login, which persists across su and sudo. Cumulus Linux removes the mapping at logout.

Local Fallback Authentication

NVUE does not provide commands to configure local fallback authentication.

To allow local fallback authentication for a user when none of the RADIUS servers are reachable, you can add a privileged user account as a local account on the switch. The local account must have the same unique identifier as the privileged user and the shell must be the same.

To configure local fallback authentication:

  1. Add a local privileged user account. For example, if the radius_priv_user account in the /etc/passwd file is radius_priv_user:x:1002:1001::/home/radius_priv_user:/sbin/radius_shell, run the following command to add a local privileged user account named johnadmin:

    cumulus@switch:~$ sudo useradd -u 1002 -g 1001 -o -s /sbin/radius_shell johnadmin
    
  2. To enable the local privileged user to run sudo and NVUE commands, run the following commands:

    cumulus@switch:~$ sudo adduser johnadmin nvset
    cumulus@switch:~$ sudo adduser johnadmin nvapply
    cumulus@switch:~$ sudo adduser johnadmin sudo
    cumulus@switch:~$ sudo systemctl restart nvued
    
  3. Edit the /etc/passwd file to move the local user line before the radius_priv_user line:

    cumulus@switch:~$ sudo vi /etc/passwd
    ...
    johnadmin:x:1002:1001::/home/johnadmin:/sbin/radius_shell
    radius_priv_user:x:1002:1001::/home/radius_priv_user:/sbin/radius_shell
    
  4. To set the local password for the local user, run the following command:

    cumulus@switch:~$ sudo passwd johnadmin
    

RADIUS User Command Accounting

RADIUS user command accounting lets you log every command that a RADIUS user runs and send the commands to RADIUS servers for auditing. Audit logs are a requirement for compliance standards, such as PCI and HIPAA.

The RADIUS servers must be configured to accept packets from clients and have a dictionary entry for NV-Command-String.

When you enable or change accounting settings, NVUE disconnects currently logged in RADIUS users.

Enable RADIUS Accounting

To enable RADIUS user command accounting:

cumulus@switch:~$ nv set system aaa radius accounting state enabled
cumulus@switch:~$ nv config apply

To disable RADIUS user command accounting, run the nv set system aaa radius accounting state disabled command.

The /var/log/radius-cmd-acct.log file contains the local copy of the logs, which match the logs that the server receives.

If you do not receive any accounting packets, check the /var/log/radius-send-cmd.log file.

To see if RADIUS user command accounting is enabled, run the nv show system aaa radius command.

Send Accounting Records to First Response

By default, Cumulus Linux sends accounting records to all servers. You can change this setting to send accounting records to the server that is first to respond. If the first available server does not respond, Cumulus Linux continues trying down the list of servers (by priority) until one is reachable. If none of the servers are reachable, there is a 30 second timeout, after which Cumulus Linux retries the servers. After 10 failed retries, the switch drops the packet.

cumulus@switch:~$ nv set system aaa radius accounting send-records first-response
cumulus@switch:~$ nv config apply

To reset to the default configuration (send accounting records to all servers), run the nv set system aaa radius accounting send-records all command. If none of the servers respond, there is a 30 second timeout, after which Cumulus Linux retries the servers. After 10 failed retries, the switch drops the packet.
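
The first-response behavior described above can be sketched as a loop over the servers in priority order with a delay between full passes. This is an illustration of the documented behavior only, not the Cumulus Linux implementation; send_first_response is a hypothetical name, and send_to_server is a placeholder that you must define to return 0 when a server accepts the record.

```shell
# Hypothetical sketch of first-response delivery. send_to_server is a
# placeholder supplied by the caller; it returns 0 when the server accepts
# the record and nonzero when the server is unreachable.
send_first_response() {
    record="$1"
    servers="$2"              # space-separated list, highest priority first
    retry_delay="${3:-30}"    # seconds to wait after every server fails
    max_retries="${4:-10}"    # failed retries before the record is dropped
    attempt=0
    while :; do
        for server in $servers; do
            if send_to_server "$server" "$record"; then
                echo "$server"    # report the server that responded
                return 0
            fi
        done
        [ "$attempt" -ge "$max_retries" ] && break
        attempt=$((attempt + 1))
        sleep "$retry_delay"
    done
    return 1                      # every retry failed; drop the record
}
```

The delay and retry count default to the 30 seconds and 10 retries that this section describes; they are parameters here only so that you can experiment with shorter values.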

Verify RADIUS Client Configuration

To verify the RADIUS client configuration, log in as a non-privileged user and run the nv set interface command.

In this example, the ops user is not a privileged RADIUS user so the ops user cannot add an interface.

ops@leaf01:~$ nv set interface swp1
ERROR: User ops does not have permission to make networking changes.

In this example, the admin user is a privileged RADIUS user (with privilege level 15) so is able to add interface swp1.

admin@leaf01:~$ nv set interface swp1
admin@leaf01:~$ nv config apply

Show RADIUS Configuration

To show global RADIUS configuration, run the nv show system aaa radius command:

cumulus@switch:~$ nv show system aaa radius
                 operational    applied        
---------------  -------------  ------------- 
vrf              mgmt           mgmt          
debug            disabled       disabled      
privilege-level                 15            
retransmit       0              0             
port                            1812          
timeout                         3             
accounting       enabled        enabled       
[server]         192.168.0.254  192.168.0.254 

To show all RADIUS configured servers, run the nv show system aaa radius server command:

cumulus@switch:~$ nv show system aaa radius server
Hostname       Port  Priority  Password  source-ip     Timeout
-------------  ----  --------  --------  ------------  -------
192.168.0.254  42    1         *         192.168.1.10  10

To show configuration for a specific RADIUS server, run the nv show system aaa radius server <server> command:

cumulus@switch:~$ nv show system aaa radius server 192.168.0.254
           operational   applied     
---------  ------------  ------------
port       42            42          
timeout    10            10          
secret     *             *           
priority   1             10          

Remove RADIUS Client Packages

Remove the RADIUS packages with the following command:

cumulus@switch:~$ sudo apt-get remove libnss-mapuser libpam-radius-auth

When you remove the packages, Cumulus Linux deletes the plugins from the /etc/nsswitch.conf file and from the PAM files.

To remove all configuration files for these packages, run:

cumulus@switch:~$ sudo apt-get purge libnss-mapuser libpam-radius-auth

Cumulus Linux does not remove the RADIUS fixed account from the /etc/passwd or /etc/group file or the home directories. They remain in case of modifications to the account or files in the home directories.

To remove the home directories of the RADIUS users, obtain the list by running the following command:

cumulus@switch:~$ sudo ls -l /home | grep radius

For all users listed, except the radius_user, run the following command to remove the home directories:

cumulus@switch:~$ sudo deluser --remove-home USERNAME

USERNAME is the account name (the home directory relative portion). This command gives the following warning because the user is not listed in the /etc/passwd file.

userdel: cannot remove entry 'USERNAME' from /etc/passwd
/usr/sbin/deluser: `/usr/sbin/userdel USERNAME' returned error code 1. Exiting.

After you remove all the RADIUS users, run the command to remove the fixed account. If there are changes to the account in the /etc/nss_mapuser.conf file, use that account name instead of radius_user.

cumulus@switch:~$ sudo deluser --remove-home radius_user
cumulus@switch:~$ sudo deluser --remove-home radius_priv_user
cumulus@switch:~$ sudo delgroup radius_users

Considerations

Access Control Lists

Access Control Lists (ACLs) are rules on the switch that act as filters to manage traffic.

This section discusses:

Firewall Rules

The Cumulus Linux default firewall rules protect the switch control plane and CPU from DoS and other potentially malicious network attacks.

In Cumulus Linux 5.8 and earlier, the set of default firewall rules is more open; Cumulus Linux accepts packets from all addresses and protocols. Cumulus Linux 5.9 and later provides a set of default firewall rules that allows only specific addresses and ports, and drops disallowed packets.

The default set of firewall rules consists of IP and transport level rules. To block specific layer 2 packets such as ARP, LLDP, or STP or any packets sent to the CPU as part of generic traps, you must configure separate rules using control plane ACLs in the INPUT or OUTPUT chain of ebtables. See Access Control List Configuration.

Default Firewall Rule Files without NVUE

Cumulus Linux enables the default firewall rules on the switch even before you apply NVUE configuration for the first time. The default firewall rules are in the 01control_plane.rules and 98control_plane_whitelist.rules files in the /etc/cumulus/acl/policy.d/ directory.

If you prefer to configure the switch by editing Linux files instead of running NVUE commands, you can make changes to these files to add additional rules.

DoS Rules

DoS rules protect the switch control plane and CPU from DoS attacks. Cumulus Linux provides firewall DoS rules to:

Whitelist Rules

Whitelist rules specify the services or application ports enabled on the switch. Cumulus Linux provides firewall whitelist rules to enable TCP ports and UDP ports.

The following table lists the ports that Cumulus Linux enables by default.

Protocol  Port          Application
--------  ------------  -----------------------------------
TCP       22            SSH
TCP       179           BGP
UDP       68            DHCP Client
UDP       67            DHCP Server
UDP       123           NTP
UDP       323           Chrony
UDP       161           SNMP
UDP       6306          A multicast socket used internally.
UDP       69            TFTP
TCP/UDP   389           LDAP
UDP       1812, 1813    RADIUS
TCP/UDP   49            TACACS
TCP/UDP   53            DNS
TCP       8765          NVUE NGINX
UDP       6343, 6344    sFlow
UDP       514           remote syslog
UDP       3786          BFD
UDP       4784          Multi-Hop BFD
TCP       5342          MLAG
UDP       4789          VXLAN
UDP       319, 320      PTP
TCP       443           HTTPS
TCP       9339          gNMI
TCP       31980, 31982  NetQ Agent
OSPF      NA            NA
UDP       53 (SPORT)    DNS response packets
TCP       9999          XMLRPC
ICMP      NA            Ping
PIM       NA            NA
IGMP      NA            NA
VRRP      NA            NA
TCP       639           MSDP
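
When you audit which services need an extra whitelist rule, it can help to hold the table above in a lookup structure. The following Python sketch is illustrative only: it transcribes a subset of the table, and the names DEFAULT_WHITELIST and default_application are hypothetical helpers, not part of Cumulus Linux.

```python
# Subset of the default whitelist table: (protocol, port) -> application.
DEFAULT_WHITELIST = {
    ("tcp", 22): "SSH",
    ("tcp", 179): "BGP",
    ("udp", 123): "NTP",
    ("udp", 161): "SNMP",
    ("udp", 1812): "RADIUS",
    ("udp", 1813): "RADIUS",
    ("tcp", 8765): "NVUE NGINX",
    ("tcp", 443): "HTTPS",
    ("tcp", 9339): "gNMI",
}


def default_application(protocol, port):
    """Return the application listed for this protocol and port, or None."""
    return DEFAULT_WHITELIST.get((protocol.lower(), port))


# Ports that are not in the table need an explicit whitelist rule.
for proto, port in [("udp", 1812), ("udp", 3020)]:
    app = default_application(proto, port)
    print(proto, port, app if app else "not enabled by default")
```

The lookup only reflects the table; it does not model match direction (source port versus destination port) or connection state.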

Unset the Default Firewall Rules

To unset the default firewall rules and use the setting in Cumulus Linux 5.8 and earlier that accepts packets from all addresses and protocols:

cumulus@switch:~$ nv unset system control-plane acl acl-default-dos 
cumulus@switch:~$ nv unset system control-plane acl acl-default-whitelist
cumulus@switch:~$ nv config apply

To set the firewall rules back to the default setting:

cumulus@switch:~$ nv set system control-plane acl acl-default-dos inbound
cumulus@switch:~$ nv set system control-plane acl acl-default-whitelist inbound
cumulus@switch:~$ nv config apply

Add Firewall Rules

You cannot modify the acl-default-dos and acl-default-whitelist rules. However, you can append or insert additional rules. Additionally, you can add your own ACLs and apply them on the control plane; control plane ACLs take precedence over acl-default-whitelist rules when the default firewall rules are enabled.

If you use non-default ports for an application, NVIDIA recommends that you add a whitelist rule for the non-default port. For example, if you use ports 3020 and 3022 for RADIUS server accounting and authentication instead of 1812 and 1813, you can add the following whitelist rules:

cumulus@switch:~$ nv set acl acl-default-whitelist rule 73 match ip udp source-port 3020
cumulus@switch:~$ nv set acl acl-default-whitelist rule 73 match ip connection-state new
cumulus@switch:~$ nv set acl acl-default-whitelist rule 73 match ip connection-state established
cumulus@switch:~$ nv set acl acl-default-whitelist rule 73 action permit
cumulus@switch:~$ nv set acl acl-default-whitelist rule 74 match ip udp source-port 3022
cumulus@switch:~$ nv set acl acl-default-whitelist rule 74 match ip connection-state new
cumulus@switch:~$ nv set acl acl-default-whitelist rule 74 match ip connection-state established
cumulus@switch:~$ nv set acl acl-default-whitelist rule 74 action permit
cumulus@switch:~$ nv config apply
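
If you need the same four commands for several non-default ports, you can generate them instead of typing them. The Python sketch below only reproduces the command pattern shown above; the function name whitelist_udp_source_port is hypothetical, and the script prints the commands rather than running them.

```python
from typing import List


def whitelist_udp_source_port(rule_id: int, port: int) -> List[str]:
    """Build the NVUE commands that permit a UDP source port in acl-default-whitelist."""
    base = f"nv set acl acl-default-whitelist rule {rule_id}"
    return [
        f"{base} match ip udp source-port {port}",
        f"{base} match ip connection-state new",
        f"{base} match ip connection-state established",
        f"{base} action permit",
    ]


# Rule IDs 73 and 74 and ports 3020 and 3022 come from the example above.
for rule_id, port in [(73, 3020), (74, 3022)]:
    for command in whitelist_udp_source_port(rule_id, port):
        print(command)
print("nv config apply")
```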

Hashlimit and Recent List Match

For firewall IPv4 type ACLs on the control plane, you can match on hashlimit and recent list. These matches are not supported for data plane ACLs, which get installed in hardware.

Cumulus Linux provides the following commands for matching on hashlimit.

Command Description
nv set acl <acl> rule <rule> match ip hashlimit name The hashlimit name.
nv set acl <acl> rule <rule> match ip hashlimit mode The hashlimit mode. You can specify src-ip or dst-ip.
nv set acl <acl> rule <rule> match ip hashlimit burst The hashlimit burst rate; the maximum number of packets to match in a burst. You can specify a value between 1 and 4294967295.
nv set acl <acl> rule <rule> match ip hashlimit rate-above The limit rate. You can specify <integer/second>, <integer/min>, or <integer/hour>. The maximum rate is 1000000/second.
nv set acl <acl> rule <rule> match ip hashlimit expire The number of milliseconds after which hash entries expire.
nv set acl <acl> rule <rule> match ip hashlimit source-mask The source address grouping prefix length.
nv set acl <acl> rule <rule> match ip hashlimit destination-mask The destination address grouping prefix length.

The following example shows an ACL that drops packets when matching on hashlimit.

To configure the hashlimit match, you must set the hashlimit name, mode, expiration, burst, and rate; the source mask and destination mask settings are optional.

cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip protocol tcp
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip source-ip 10.0.14.2/32
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip hashlimit name ssh
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip hashlimit mode src-ip 
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip hashlimit expire 100
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip hashlimit burst 100
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip hashlimit rate-above 100/second
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip hashlimit source-mask 32
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 action deny
cumulus@switch:~$ nv set interface swp1 acl EXAMPLE1 inbound control-plane
cumulus@switch:~$ nv config apply

NVUE writes this rule in the /etc/cumulus/acl/policy.d/50_nvue.rules file:

cumulus@switch:~$ sudo cat /etc/cumulus/acl/policy.d/50_nvue.rules
[iptables]
## ACL EXAMPLE1 in dir inbound on interface swp1 ##
# rule-id #10:  #
-A INPUT -i swp1 -m comment --comment rule_id:10,acl_name:EXAMPLE1,dir:inbound,interface_id:swp1 -s 10.0.14.2/32 -p tcp -m hashlimit --hashlimit-name ssh --hashlimit-mode srcip --hashlimit-htable-expire 100 --hashlimit-burst 100 --hashlimit-above 100/second --hashlimit-srcmask 32 -j DROP
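
To make the relationship between the NVUE hashlimit settings and the iptables options in the rule above easier to see, the following Python sketch rebuilds the hashlimit portion of the rule. It is an illustration, not the NVUE implementation: the function name is hypothetical, and the dst-ip to dstip mapping and the --hashlimit-dstmask option are assumptions based on standard iptables hashlimit syntax, because the example above only shows the source variants.

```python
# NVUE mode keyword -> iptables hashlimit mode (dst-ip mapping is an assumption).
MODE_TO_IPTABLES = {"src-ip": "srcip", "dst-ip": "dstip"}


def hashlimit_match(name, mode, expire_ms, burst, rate_above,
                    source_mask=None, destination_mask=None):
    """Return the '-m hashlimit ...' fragment for the given NVUE-style settings."""
    parts = [
        "-m hashlimit",
        f"--hashlimit-name {name}",
        f"--hashlimit-mode {MODE_TO_IPTABLES[mode]}",
        f"--hashlimit-htable-expire {expire_ms}",
        f"--hashlimit-burst {burst}",
        f"--hashlimit-above {rate_above}",
    ]
    # The mask settings are optional, as noted in the text.
    if source_mask is not None:
        parts.append(f"--hashlimit-srcmask {source_mask}")
    if destination_mask is not None:
        parts.append(f"--hashlimit-dstmask {destination_mask}")
    return " ".join(parts)


print(hashlimit_match("ssh", "src-ip", 100, 100, "100/second", source_mask=32))
```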

You can also show the ACL settings with the nv show acl <acl> command:

cumulus@switch:~$ nv show acl EXAMPLE1
      applied
----  -------
type  ipv4
rule
=======
    Number  Summary                                   
    ------  ------------------------------------------
    10      match.ip.hashlimit.burst:              100
            match.ip.hashlimit.expire:             100
            match.ip.hashlimit.mode:            src-ip
            match.ip.hashlimit.name:            ssh
            match.ip.hashlimit.rate-above: 100/second
            match.ip.hashlimit.source-mask:         32
            match.ip.protocol:                     tcp
            match.ip.source-ip:           10.0.14.2/32

Cumulus Linux provides the following commands to match on recent list.

Command Description
nv set acl <acl> rule <rule> match ip recent-list name The recent module name.
nv set acl <acl> rule <rule> match ip recent-list action The recent action. You can specify set or update.
nv set acl <acl> rule <rule> match ip recent-list hit-count The number of hits in an interval. You can specify a value between 1 and 4294967295.
nv set acl <acl> rule <rule> match ip recent-list update-interval The update interval. You can specify a value between 1 and 4294967295.

The following example shows an ACL that drops packets when matching on recent-list.

To configure the recent module match, you must set the recent list name and action; other recent-list settings are optional.

cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip protocol tcp
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip source-ip 10.0.14.2/32
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip recent-list name bruteforce
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip recent-list action set
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip recent-list hit-count 5
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip recent-list update-interval 3600
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 action deny
cumulus@switch:~$ nv set interface swp1 acl EXAMPLE1 inbound control-plane
cumulus@switch:~$ nv config apply

NVUE writes this rule in the /etc/cumulus/acl/policy.d/50_nvue.rules file:

cumulus@switch:~$ sudo cat /etc/cumulus/acl/policy.d/50_nvue.rules
[iptables]

## ACL EXAMPLE1 in dir inbound on interface swp1 ##
# rule-id #10:  #
-A INPUT -i swp1 -m comment --comment rule_id:10,acl_name:EXAMPLE1,dir:inbound,interface_id:swp1 -s 10.0.14.2/32 -p tcp -m recent --name bruteforce --set --hitcount 5 --seconds 3600 -j DROP

You can also show the ACL settings with the NVUE nv show acl <acl> command.
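
To build intuition for how hit-count and update-interval work together, the following Python sketch models a recent list as a sliding window of timestamps per source address. It is a deliberately simplified model: the class name is hypothetical, and the kernel recent module keeps its own bookkeeping that differs in detail (for example, how the set and update actions each record a packet).

```python
from collections import defaultdict, deque


class RecentListModel:
    """Toy sliding-window model of a recent-list match."""

    def __init__(self, hit_count, update_interval_s):
        self.hit_count = hit_count
        self.update_interval_s = update_interval_s
        self._seen = defaultdict(deque)  # source address -> packet timestamps

    def matches(self, source, now):
        """Record a packet and report whether the source reached the hit count."""
        stamps = self._seen[source]
        stamps.append(now)
        # Forget packets that are older than the update interval.
        while stamps and now - stamps[0] > self.update_interval_s:
            stamps.popleft()
        return len(stamps) >= self.hit_count


model = RecentListModel(hit_count=5, update_interval_s=3600)
results = [model.matches("10.0.14.2", t) for t in range(6)]
print(results)  # the fifth and sixth packets reach the hit count
```

In this model, a source that stays quiet for longer than the update interval starts again from a count of one.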

Show Firewall Rules

To show the DoS rules, run the nv show acl acl-default-dos command:

cumulus@switch:~$ nv show acl acl-default-dos
      applied  pending
----  -------  -------
type  ipv4     ipv4   
rule
=======
    Number  Summary                                 
    ------  ----------------------------------------
    30      match.ip.protocol:                   tcp
    40      match.ip.protocol:                   tcp
    41      match.ip.protocol:                   tcp
    42      match.ip.protocol:                   tcp
    50                                              
    60      match.ip.protocol:                   tcp
    70      match.ip.protocol:                   tcp
    80      match.ip.protocol:                   tcp
    90      match.ip.protocol:                   tcp
            match.ip.tcp.all-mss-except:   536-65535
    100     match.ip.recent-list.action:         set
            match.ip.tcp.dest-port:               22
    110     match.ip.recent-list.action:      update
            match.ip.recent-list.hit-count:       50
            match.ip.recent-list.update-interval: 60
            match.ip.tcp.dest-port:               22
    120     match.ip.hashlimit.burst:              2
            match.ip.hashlimit.expire:         30000
            match.ip.hashlimit.mode:          src-ip
            match.ip.hashlimit.name:          TCPRST
            match.ip.hashlimit.rate-above:     5/min
            match.ip.hashlimit.source-mask:       32
            match.ip.protocol:                   tcp
    130     match.ip.hashlimit.burst:             30
            match.ip.hashlimit.expire:         30000
            match.ip.hashlimit.mode:          src-ip
            match.ip.hashlimit.name:      TCPGENERAL
            match.ip.hashlimit.rate-above: 50/second
            match.ip.hashlimit.source-mask:       32
            match.ip.protocol:                   tcp

Run the nv show acl acl-default-dos --rev=applied -o json command to show additional information, such as the connection state, hit count and update interval:

cumulus@switch:~$ nv show acl acl-default-dos --rev=applied -o json
{
  "rule": {
    "100": {
      "action": {
        "recent": {}
      },
      "match": {
        "ip": {
          "connection-state": {
            "new": {}
          },
          "recent-list": {
            "action": "set"
          },
          "tcp": {
            "dest-port": {
              "22": {}
            }
          }
        }
      }
    },
    "110": {
      "action": {
        "deny": {}
      },
      "match": {
        "ip": {
          "connection-state": {
            "new": {}
          },
          "recent-list": {
            "action": "update",
            "hit-count": 50,
            "update-interval": 60
          },
          "tcp": {
            "dest-port": {
              "22": {}
            }
          }
        }
      }
    },
...

To show the whitelist rules, run the nv show acl acl-default-whitelist command:

cumulus@switch:~$ nv show acl acl-default-whitelist 
      applied  pending
----  -------  -------
type  ipv4     ipv4
rule
=======
    Number  Summary                                          
    ------  -------------------------------------------------
    5       match.ip.protocol:                            tcp
            match.ip.tcp.dest-port:                       ssh
    10      match.ip.protocol:                            tcp
            match.ip.tcp.dest-port:                       bgp
    15      match.ip.protocol:                            tcp
            match.ip.tcp.dest-port:                      ldap
    20      match.ip.protocol:                            tcp
            match.ip.tcp.dest-port:                      8765
    25      match.ip.protocol:                            tcp
            match.ip.tcp.dest-port:                     https
    30      match.ip.protocol:                            tcp
            match.ip.tcp.dest-port:                      clag
    35      match.ip.protocol:                            tcp
            match.ip.tcp.source-port:                      49
    40      match.ip.protocol:                            udp
            match.ip.udp.dest-port:               dhcp-client
    45      match.ip.protocol:                            udp
            match.ip.udp.dest-port:               dhcp-server
    50      match.ip.protocol:                            udp
            match.ip.udp.dest-port:                       ntp
    55      match.ip.protocol:                            udp
            match.ip.udp.dest-port:                       323
    60      match.ip.protocol:                            udp
            match.ip.udp.dest-port:                      snmp
    65      match.ip.protocol:                            udp
            match.ip.udp.dest-port:                      tftp
    70      match.ip.protocol:                            udp
            match.ip.udp.dest-port:                      ldap
    75      match.ip.protocol:                            udp
            match.ip.udp.source-port:                    1812
    80      match.ip.protocol:                            udp
            match.ip.udp.source-port:                    1813
    85      match.ip.protocol:                            udp
            match.ip.udp.dest-port:                      6343
    90      match.ip.protocol:                            udp
            match.ip.udp.dest-port:                      6344
    95      match.ip.protocol:                            udp
            match.ip.udp.dest-port:                       514
    100     match.ip.protocol:                            udp
            match.ip.udp.dest-port:                       bfd
    105     match.ip.protocol:                            udp
            match.ip.udp.dest-port:              bfd-multihop
    110     match.ip.protocol:                            udp
            match.ip.udp.dest-port:                      4789
    115     match.ip.protocol:                            udp
            match.ip.udp.dest-port:                       319
    120     match.ip.protocol:                            udp
            match.ip.udp.dest-port:                       320
    125     match.ip.protocol:                            tcp
            match.ip.tcp.dest-port:                      9339
    130     match.ip.protocol:                            tcp
            match.ip.tcp.dest-port:                     31980
            match.ip.tcp.dest-port:                     31982
    135     match.ip.protocol:                            tcp
            match.ip.tcp.dest-port:                       639
    140     match.ip.protocol:                            udp
            match.ip.udp.source-port:                      53
    145     match.ip.protocol:                            tcp
            match.ip.tcp.dest-port:                      9999
    150     match.ip.protocol:                           ospf
    155     match.ip.protocol:                            pim
    160     match.ip.protocol:                           vrrp
    165     match.ip.protocol:                           igmp
    170     match.ip.protocol:                           icmp
    9999    Log Level:                                      3
            action.log.log-prefix: IPTables-Dropped-<Domain>:
            Log Rate:                                       1

Run the nv show acl acl-default-whitelist --rev=applied -o json command to show additional information, such as the connection state:

cumulus@switch:~$ nv show acl acl-default-whitelist --rev=applied -o json
{
  "rule": {
    "10": {
      "action": {
        "permit": {}
      },
      "match": {
        "ip": {
          "connection-state": {
            "established": {},
            "new": {}
          },
          "protocol": "tcp",
          "tcp": {
            "dest-port": {
              "bgp": {}
            }
          }
        }
      }
    },
    "100": {
      "action": {
        "permit": {}
      },
      "match": {
        "ip": {
          "connection-state": {
            "established": {},
            "new": {}
          },
          "protocol": "udp",
          "udp": {
            "dest-port": {
              "bfd": {}
            }
          }
        }
      }
...
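
The JSON output is convenient to post-process. The following Python sketch summarizes the action, protocol, connection states, and destination ports of each rule. The SAMPLE text is a trimmed, closed copy of the two rules shown above, and summarize_rules is a hypothetical helper; on a switch, you would feed it the output of the nv show command instead.

```python
import json

# Trimmed copy of the acl-default-whitelist output shown above (rules 10 and 100).
SAMPLE = """
{
  "rule": {
    "10": {
      "action": {"permit": {}},
      "match": {"ip": {"connection-state": {"established": {}, "new": {}},
                       "protocol": "tcp",
                       "tcp": {"dest-port": {"bgp": {}}}}}
    },
    "100": {
      "action": {"permit": {}},
      "match": {"ip": {"connection-state": {"established": {}, "new": {}},
                       "protocol": "udp",
                       "udp": {"dest-port": {"bfd": {}}}}}
    }
  }
}
"""


def summarize_rules(doc):
    """Map each rule number to its action, protocol, states, and destination ports."""
    summary = {}
    for rule_id, rule in doc.get("rule", {}).items():
        ip = rule.get("match", {}).get("ip", {})
        protocol = ip.get("protocol")
        # Destination ports sit under the key named after the protocol, when present.
        ports = sorted(ip.get(protocol, {}).get("dest-port", {})) if protocol else []
        summary[int(rule_id)] = {
            "action": next(iter(rule.get("action", {})), None),
            "protocol": protocol,
            "states": sorted(ip.get("connection-state", {})),
            "dest_ports": ports,
        }
    return summary


for number, info in sorted(summarize_rules(json.loads(SAMPLE)).items()):
    print(number, info)
```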

To show information about a specific rule, run the nv show acl acl-default-dos rule <rule> command:

cumulus@switch:~$ nv show acl acl-default-dos rule 30
              applied  pending
------------  -------  -------
match                         
  ip                          
    protocol  tcp      tcp

Run the nv show acl acl-default-dos rule <rule> --rev=applied -o json command to see additional information, such as the connection state:

cumulus@switch:~$ nv show acl acl-default-dos rule 30 --rev=applied -o json
{
  "action": {
    "permit": {}
  },
  "match": {
    "ip": {
      "connection-state": {
        "established": {},
        "related": {}
      },
      "protocol": "tcp"
    }
  }
}

syslog Messages

Default firewall rules include a log rule for packets that arrive in the control plane and do not match user-defined or default firewall rules. The switch generates a log message in /var/log/syslog for packets that match the log rule.

Access Control List Configuration

Cumulus Linux provides several tools to configure ACLs:

Traffic Rules

Chains

ACLs in Cumulus Linux classify and control packets to, from, and across the switch, asserting policies at layer 2, 3 and 4 of the OSI model by inspecting packet and frame headers according to a list of rules. The iptables, ip6tables, and ebtables userspace applications provide syntax you use to define rules.

The rules inspect or operate on packets at several points (chains) in the life of the packet through the system:

Tables

When you build rules to affect the flow of traffic, tables can access the individual chains. Linux provides three tables by default:

Each table has a set of default chains that modify or inspect packets at different points of the path through the switch. Chains contain the individual rules to influence traffic.

Rules

Rules classify the traffic you want to control. You apply rules to chains, which attach to tables.

Rules have several different components:

How Rules Parse and Apply

The switch reads all the rules from each chain from iptables, ip6tables, and ebtables and enters them in order into either the filter table or the mangle table. The switch reads the rules from the kernel in the following order:

When you combine and put rules into one table, the order determines the relative priority of the rules; iptables and ip6tables have the highest precedence and ebtables has the lowest.

The Linux packet forwarding construct is an overlay for how the silicon underneath processes packets. Be aware of the following:

Rule Placement in Memory

INPUT and ingress (FORWARD -i) rules occupy the same memory space. A rule counts as ingress if you set the -i option. If you set both input and output options (-i and -o), the switch treats the rule as ingress and the rule occupies the ingress memory space. For example:

-A FORWARD -i swp1 -o swp2 -s 10.0.14.2 -d 10.0.15.8 -p tcp -j ACCEPT

If you set an output flag with the INPUT chain, you see an error. For example:

-A FORWARD,INPUT -i swp1 -o swp2 -s 10.0.14.2 -d 10.0.15.8 -p tcp -j ACCEPT
error: line 2 : output interface specified with INPUT chain error processing rule '-A FORWARD,INPUT -i swp1 -o swp2 -s 10.0.14.2 -d 10.0.15.8 -p tcp -j ACCEPT'

If you remove the -o option and the interface, it is a valid rule.
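
The placement logic described above can be sketched in Python as follows. This is an illustration of the stated behavior, not the cl-acltool parser: the function name is hypothetical, and treating a rule that sets only -o as egress is an assumption, because the text above describes only the ingress case explicitly.

```python
import shlex


def classify_rule(rule):
    """Return the memory space a rule would occupy: 'ingress', 'egress', or 'unspecified'."""
    tokens = shlex.split(rule)
    chains = tokens[tokens.index("-A") + 1].split(",")
    has_input_if = "-i" in tokens
    has_output_if = "-o" in tokens
    if has_output_if and "INPUT" in chains:
        # Mirrors the error shown above.
        raise ValueError("output interface specified with INPUT chain")
    if has_input_if or "INPUT" in chains:
        return "ingress"  # INPUT and FORWARD -i rules share the same space
    if has_output_if:
        return "egress"   # assumption: FORWARD rules with only -o count as egress
    return "unspecified"


print(classify_rule("-A FORWARD -i swp1 -o swp2 -s 10.0.14.2 -d 10.0.15.8 -p tcp -j ACCEPT"))
```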

Nonatomic Update Mode and Atomic Update Mode

Cumulus Linux enables atomic update mode by default. However, this mode limits the number of ACL rules that you can configure.

To increase the number of configurable ACL rules, configure the switch to operate in nonatomic mode.

Instead of reserving 50% of your TCAM space for atomic updates, nonatomic mode runs incremental updates that use available free space to write the new TCAM rules, then swaps over to the new rules. Cumulus Linux then deletes the old rules and frees up the original TCAM space. If there is insufficient free space to complete this task, the regular nonatomic (non-incremental) update runs, which interrupts traffic.

Nonatomic updates offer better scaling because all TCAM resources actively impact traffic. With atomic updates, half of the hardware resources are on standby and do not actively impact traffic.

Incremental nonatomic updates are table based, so they do not interrupt network traffic when you install new rules. The rules map to the following tables and update in this order:

The incremental nonatomic update operation follows this order:

  1. Updates are incremental, one table at a time without stopping traffic.
  2. Cumulus Linux checks if the rules in a table are different from installation time; if a table does not have any changes, it does not reinstall the rules.
  3. If there are changes in a table, the new rules populate in new groups or slices in hardware, then that table switches over to the new groups or slices.
  4. Finally, old resources for that table free up. This process repeats for each of the tables listed above.
  5. If there are insufficient resources to hold both the new rule set and old rule set, Cumulus Linux tries regular nonatomic mode, which interrupts network traffic.
  6. If the regular nonatomic update fails, Cumulus Linux reverts back to the previous rules.
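
The decision sequence in the steps above can be sketched as a small Python function. This is a toy model under stated assumptions: each rule is assumed to cost one hardware entry, a regular nonatomic update is assumed to fail only when the new rule set does not fit in the table at all, and the function name and entry counts are hypothetical.

```python
def plan_table_update(old_rules, new_rules, free_entries, total_entries):
    """Pick the update path for one table in a simplified nonatomic model."""
    if list(old_rules) == list(new_rules):
        return "skip"               # step 2: unchanged tables are not reinstalled
    needed = len(new_rules)         # assumption: one entry per rule
    if needed <= free_entries:
        return "incremental"        # steps 3 and 4: write new rules, switch, free old
    if needed <= total_entries:
        return "regular-nonatomic"  # step 5: interrupts traffic
    return "revert"                 # step 6: keep the previous rules


print(plan_table_update(["r1"], ["r1", "r2"], free_entries=5, total_entries=10))
```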

To set nonatomic mode:

cumulus@switch:~$ nv set system acl mode non-atomic 
cumulus@switch:~$ nv config apply

  • On Spectrum-2 and later, NVUE reloads switchd after you run and apply the nv set system acl mode command. Reloading switchd does not interrupt network services.
  • On Spectrum 1, NVUE restarts switchd after you run and apply the nv set system acl mode command. Restarting switchd causes all network ports to reset in addition to resetting the switch hardware configuration.

If you prefer to edit Linux files instead of running NVUE commands, set nonatomic mode as follows:

  1. Edit the /etc/cumulus/switchd.conf file to add acl.non_atomic_update_mode = TRUE:

    cumulus@switch:~$ sudo nano /etc/cumulus/switchd.conf
    ...
    acl.non_atomic_update_mode = TRUE
    
  2. On Spectrum-2 and later, reload switchd for the changes to take effect. Reloading switchd does not interrupt network services.

    cumulus@switch:~$ sudo systemctl reload switchd.service
    

    On Spectrum 1, restart switchd for the changes to take effect. Restarting switchd causes all network ports to reset in addition to resetting the switch hardware configuration.

    cumulus@switch:~$ sudo systemctl restart switchd.service
    

During regular non-incremental nonatomic updates, traffic stops, then continues after all the new configuration is in the hardware.

iptables, ip6tables, and ebtables

Do not use iptables, ip6tables, or ebtables directly; rules you install this way apply only to the Linux kernel, and Cumulus Linux does not hardware accelerate them. When you run cl-acltool -i, Cumulus Linux resets all rules and deletes anything that is not in /etc/cumulus/acl/policy.conf.

For example, the following rule appears to work:

cumulus@switch:~$ sudo iptables -A INPUT -p icmp --icmp-type echo-request -j DROP

The cl-acltool -L command shows the rule:

cumulus@switch:~$ sudo cl-acltool -L ip
-------------------------------
Listing rules of type iptables:
-------------------------------

TABLE filter :
Chain INPUT (policy ACCEPT 72 packets, 5236 bytes)
pkts bytes target prot opt in out source destination
0 0 DROP icmp -- any any anywhere anywhere icmp echo-request

However, Cumulus Linux does not synchronize the rule to hardware. Running cl-acltool -i or rebooting the switch removes the rule without replacing it. To ensure that Cumulus Linux hardware accelerates all rules that can be in hardware, add them to /etc/cumulus/acl/policy.conf and install them with the cl-acltool -i command.

Estimate the Number of Rules

To estimate the number of rules you can create from an ACL entry, first determine if the ACL entry is ingress or egress. Then, determine if the entry is an IPv4-mac or IPv6 type rule. This determines the slice to which the rule belongs. Use the following to determine how many entries the switch uses for each type.

By default, each entry occupies one double wide entry, except if the entry is one of the following:

Match on VLAN IDs on Layer 2 Interfaces

You can match on VLAN IDs on layer 2 interfaces for ingress rules. The following example matches on a VLAN and DSCP class, and sets the internal class of the packet. For extended matching on IP fields, combine this rule with ingress iptables rules.

[ebtables]
-A FORWARD -p 802_1Q --vlan-id 100 -j mark --mark-set 102

[iptables]
-A FORWARD -i swp31 -m mark --mark 102 -m dscp --dscp-class CS1 -j SETCLASS --class 2

  • Cumulus Linux reserves mark values between 0 and 100; for example, if you use --mark-set 10, you see an error. Use mark values between 101 and 4196.
  • You cannot mark multiple VLANs with the same value.
  • If you enable EVPN-MH and configure VLAN match rules in ebtables with a mark target, the ebtables rule might overwrite the mark set by traffic class rules you configure for EVPN-MH on ingress. Egress EVPN-MH traffic class rules that match the ingress traffic class mark might not get hit. To work around this issue, add ebtables rules to ACCEPT the packets already marked by EVPN-MH traffic class rules on ingress.
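
Before you write the ebtables rules, you can check a planned VLAN-to-mark assignment against the first two constraints above. The following Python sketch is one way to do that; the function name is hypothetical, and the bounds are the values quoted in the list above.

```python
RESERVED_MAX = 100  # marks 0 through 100 are reserved
MARK_MAX = 4196     # upper bound quoted in the list above


def check_vlan_marks(vlan_marks):
    """Return a list of problems for a {vlan_id: mark} mapping."""
    problems = []
    owner = {}  # mark -> first VLAN that used it
    for vlan, mark in sorted(vlan_marks.items()):
        if not RESERVED_MAX < mark <= MARK_MAX:
            problems.append(f"VLAN {vlan}: mark {mark} is outside 101-{MARK_MAX}")
        elif mark in owner:
            problems.append(f"VLAN {vlan}: mark {mark} already used by VLAN {owner[mark]}")
        else:
            owner[mark] = vlan
    return problems


print(check_vlan_marks({100: 102, 200: 102, 300: 10}))
```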

Install and Manage ACL Rules with NVUE

Instead of crafting a rule by hand, then installing it with cl-acltool, you can use NVUE commands. Cumulus Linux converts the commands to the /etc/cumulus/acl/policy.d/50_nvue.rules file. The rules you create with NVUE are independent of the default files /etc/cumulus/acl/policy.d/00control_plane.rules and 99control_plane_catch_all.rules.

Cumulus Linux 5.0 and later uses the -t mangle -A PREROUTING chain for ingress rules and the -t mangle -A POSTROUTING chain for egress rules instead of the -A FORWARD chain used in previous releases.

Consider the following iptables rule:

-t mangle -A PREROUTING -i swp1 -s 10.0.14.2/32 -d 10.0.15.8/32 -p tcp -j ACCEPT

To create this rule with NVUE, follow the steps below. NVUE adds all options in the rule automatically.

  1. Set the rule type, the matching protocol, source IP address and port, destination IP address and port, and the action. You must provide a name for the rule (EXAMPLE1 in the commands below):

    cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
    cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip protocol tcp
    cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip source-ip 10.0.14.2/32
    cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip tcp source-port ANY
    cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip dest-ip 10.0.15.8/32
    cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip tcp dest-port ANY
    cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 action permit
    

    For firewall IPv4 type ACLs on the control plane, you can match on the hashlimit module (hashlimit), the connection state (connection-state), and the recent module (recent-list). Refer to Firewall Rules.

  2. Apply the rule to an inbound or outbound interface with the nv set interface <interface> acl command.

    • For rules affecting the -t mangle -A PREROUTING chain (-A FORWARD in previous releases), apply the rule to an inbound or outbound interface. For example:
    cumulus@switch:~$ nv set interface swp1 acl EXAMPLE1 inbound
    cumulus@switch:~$ nv config apply
    
    • For rules affecting the INPUT or OUTPUT chain (-A INPUT or -A OUTPUT), apply the rule to a control plane interface. For example:
    cumulus@switch:~$ nv set interface swp1 acl EXAMPLE1 inbound control-plane
    cumulus@switch:~$ nv config apply
    

To see the installed rule, either examine the /etc/cumulus/acl/policy.d/50_nvue.rules file or run the NVUE nv show acl <rule-name> rule <ID> command:

cumulus@switch:~$ sudo cat /etc/cumulus/acl/policy.d/50_nvue.rules
[iptables]

## ACL EXAMPLE1 in dir inbound on interface swp1 ##
-t mangle -A PREROUTING -i swp1 -s 10.0.14.2/32 -d 10.0.15.8/32 -p tcp -j ACCEPT
...
cumulus@switch:~$ nv show acl EXAMPLE1 rule 10 
                     operational   applied     
-------------------  ------------  ------------
match                                          
  ip                                           
    source-ip        10.0.14.2/32  10.0.14.2/32
    dest-ip          10.0.15.8/32  10.0.15.8/32
    protocol         tcp           tcp         
    tcp                                        
      [source-port]  ANY           ANY         
      [dest-port]    ANY           ANY
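
The mapping from rule direction to chain described earlier in this section can be sketched in Python as follows. This only illustrates the resulting rule text and is not how NVUE renders rules: the function name is hypothetical, and using -o with the POSTROUTING chain for outbound rules is an assumption, because the example above shows only the inbound case.

```python
def data_plane_rule(direction, interface, source_ip, dest_ip, protocol, action="ACCEPT"):
    """Build the mangle-table rule text for an inbound or outbound data plane ACL."""
    if direction == "inbound":
        chain, flag = "PREROUTING", "-i"
    elif direction == "outbound":
        chain, flag = "POSTROUTING", "-o"  # assumption: egress rules match on -o
    else:
        raise ValueError(f"unknown direction: {direction}")
    return (f"-t mangle -A {chain} {flag} {interface} "
            f"-s {source_ip} -d {dest_ip} -p {protocol} -j {action}")


print(data_plane_rule("inbound", "swp1", "10.0.14.2/32", "10.0.15.8/32", "tcp"))
```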

To remove this rule, run the nv unset acl <acl-name> and nv unset interface <interface> acl <acl-name> commands. These commands delete the rule from the /etc/cumulus/acl/policy.d/50_nvue.rules file.

cumulus@switch:~$ nv unset acl EXAMPLE1
cumulus@switch:~$ nv unset interface swp1 acl EXAMPLE1
cumulus@switch:~$ nv config apply

To show ACL statistics per interface, such as the total number of bytes that match the ACL rule, run the nv show interface <interface-id> acl <acl-id> statistics or nv show interface <interface-id> acl <acl-id> statistics <rule-id> command.

To see the list of all NVUE ACL commands, run the nv list-commands acl command.

Install and Manage ACL Rules with cl-acltool

You can manage Cumulus Linux ACLs with cl-acltool. Rules write first to the iptables chains, as described above, and then synchronize to hardware through switchd.

To examine the current state of chains and list all installed rules, run:

cumulus@switch:~$ sudo cl-acltool -L all
-------------------------------
Listing rules of type iptables:
-------------------------------
TABLE filter :
Chain INPUT (policy ACCEPT 432K packets, 31M bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DROP       all  --  swp+   any     240.0.0.0/5          anywhere            
    0     0 DROP       all  --  swp+   any     127.0.0.0/8          anywhere            
    0     0 DROP       all  --  swp+   any     base-address.mcast.net/4  anywhere            
    0     0 DROP       all  --  swp+   any     255.255.255.255      anywhere            
    0     0 ACCEPT     all  --  swp+   any     anywhere             anywhere            

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 457K packets, 35M bytes)
 pkts bytes target     prot opt in     out     source               destination
...

To list installed rules using native iptables, ip6tables and ebtables, use the -L option with the respective commands:

cumulus@switch:~$ sudo iptables -L
cumulus@switch:~$ sudo ip6tables -L
cumulus@switch:~$ sudo ebtables -L

To remove all installed rules, run:

cumulus@switch:~$ sudo cl-acltool -F all

To remove only the IPv4 iptables rules, run:

cumulus@switch:~$ sudo cl-acltool -F ip

If the install fails, ACL rules in the kernel and hardware roll back to the previous state. You also see errors from programming rules in the kernel or ASIC.

Install Packet Filtering (ACL) Rules

cl-acltool takes access control list (ACL) rule input in files. Each ACL policy file includes iptables, ip6tables and ebtables categories under the tags [iptables], [ip6tables] and [ebtables]. You must assign each rule in an ACL policy to one of the rule categories.

See man cl-acltool(5) for ACL rule details. For iptables rule syntax, see man iptables(8). For ip6tables rule syntax, see man ip6tables(8). For ebtables rule syntax, see man ebtables(8).

See man cl-acltool(5) and man cl-acltool(8) for more details on using cl-acltool.

By default:

  • ACL policy files are in /etc/cumulus/acl/policy.d/.
  • The /etc/cumulus/acl/policy.conf file includes all *.rules files in the /etc/cumulus/acl/policy.d/ directory.
  • All files in the policy.conf file install when the switch boots up.
  • The policy.conf file expects rule files to have a .rules suffix as part of the file name.

The following shows an example ACL policy file:

[iptables]
-A INPUT -i swp1 -p tcp --dport 80 -j ACCEPT
-A FORWARD -i swp1 -p tcp --dport 80 -j ACCEPT

[ip6tables]
-A INPUT -i swp1 -p tcp --dport 80 -j ACCEPT
-A FORWARD -i swp1 -p tcp --dport 80 -j ACCEPT

[ebtables]
-A INPUT -p IPv4 -j ACCEPT
-A FORWARD -p IPv4 -j ACCEPT

You can use wildcards or variables to specify chain and interface lists.

  • You can only use swp+ and bond+ as wildcard names.
  • swp+ rules apply as an aggregate, not per port. If you want to apply per port policing, specify a specific port instead of the wildcard.

INGRESS = swp+
INPUT_PORT_CHAIN = INPUT,FORWARD

[iptables]
-A $INPUT_PORT_CHAIN -i $INGRESS -p tcp --dport 80 -j ACCEPT

[ip6tables]
-A $INPUT_PORT_CHAIN -i $INGRESS -p tcp --dport 80 -j ACCEPT

[ebtables]
-A INPUT -p IPv4 -j ACCEPT
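
As a rough mental model of how the variables and the comma-separated chain list above turn into individual rules, the following Python sketch expands a policy file. It is an illustration only, not the cl-acltool implementation: the function name expand_policy and the parsing rules (variables defined before the first tag, one rule emitted per chain) are assumptions made for this example.

```python
import re

TAG_RE = re.compile(r"^\[(iptables|ip6tables|ebtables)\]$")
VAR_RE = re.compile(r"^(\w+)\s*=\s*(.+)$")
RULE_RE = re.compile(r"^(-t\s+\S+\s+)?-A\s+(\S+)(.*)$")


def expand_policy(text):
    """Hypothetical helper: return {category: [expanded rule, ...]}."""
    variables = {}
    rules = {"iptables": [], "ip6tables": [], "ebtables": []}
    category = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # ignore blank lines and comments
        tag = TAG_RE.match(line)
        if tag:
            category = tag.group(1)
            continue
        var = VAR_RE.match(line)
        if var and category is None:
            # Assumption: variables appear before the first category tag.
            # Drop spaces around commas so "INPUT, FORWARD" stays one token.
            variables[var.group(1)] = re.sub(r"\s*,\s*", ",", var.group(2).strip())
            continue
        if category is None:
            raise ValueError(f"rule outside a category: {line!r}")
        # Substitute every $NAME with the value defined earlier.
        line = re.sub(r"\$(\w+)", lambda m: variables[m.group(1)], line)
        rule = RULE_RE.match(line)
        if rule is None:
            rules[category].append(line)
            continue
        table, chains, rest = rule.group(1) or "", rule.group(2), rule.group(3)
        # Assumption: a chain list such as INPUT,FORWARD yields one rule per chain.
        for chain in chains.split(","):
            rules[category].append(f"{table}-A {chain}{rest}")
    return rules


POLICY = """\
INGRESS = swp+
INPUT_PORT_CHAIN = INPUT,FORWARD

[iptables]
-A $INPUT_PORT_CHAIN -i $INGRESS -p tcp --dport 80 -j ACCEPT

[ebtables]
-A INPUT -p IPv4 -j ACCEPT
"""

if __name__ == "__main__":
    for category, expanded in expand_policy(POLICY).items():
        for rule in expanded:
            print(category, rule)
```

Under these assumptions, the single [iptables] line produces two rules, one for the INPUT chain and one for the FORWARD chain, both with swp+ as the input interface.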

You can write ACL rules for the system into multiple files under the default /etc/cumulus/acl/policy.d/ directory. The ordering of rules during installation follows the sort order of the files according to their file names.

Use multiple files to stack rules. The example below shows two rule files that separate rules for management and datapath traffic:

cumulus@switch:~$ ls /etc/cumulus/acl/policy.d/
00sample_mgmt.rules 01sample_datapath.rules
cumulus@switch:~$ cat /etc/cumulus/acl/policy.d/00sample_mgmt.rules

INGRESS_INTF = swp+
INGRESS_CHAIN = INPUT

[iptables]
# protect the switch management
-A $INGRESS_CHAIN -i $INGRESS_INTF -s 10.0.14.2 -d 10.0.15.8 -p tcp -j ACCEPT
-A $INGRESS_CHAIN -i $INGRESS_INTF -s 10.0.11.2 -d 10.0.12.8 -p tcp -j ACCEPT
-A $INGRESS_CHAIN -i $INGRESS_INTF -d 10.0.16.8 -p udp -j DROP

cumulus@switch:~$ cat /etc/cumulus/acl/policy.d/01sample_datapath.rules
INGRESS_INTF = swp+
INGRESS_CHAIN = INPUT, FORWARD

[iptables]
-A $INGRESS_CHAIN -i $INGRESS_INTF -s 192.0.2.5 -p icmp -j ACCEPT
-A $INGRESS_CHAIN -i $INGRESS_INTF -s 192.0.2.6 -d 192.0.2.4 -j DROP
-A $INGRESS_CHAIN -i $INGRESS_INTF -s 192.0.2.2 -d 192.0.2.8 -j DROP
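
Because installation order follows the file names, numeric prefixes only behave as expected when they have the same width. The following Python sketch shows this, assuming a plain lexical sort (the guide states only that the order follows the sort order of the file names; the exact collation is an assumption here). The file names 10_custom.rules and 9_custom.rules are hypothetical.

```python
def install_order(filenames):
    """Return the .rules files in the order a plain lexical sort gives."""
    return sorted(name for name in filenames if name.endswith(".rules"))


FILES = [
    "01sample_datapath.rules",
    "00sample_mgmt.rules",
    "10_custom.rules",   # hypothetical file name
    "9_custom.rules",    # hypothetical file name
    "notes.txt",         # ignored: no .rules suffix
]

if __name__ == "__main__":
    for name in install_order(FILES):
        print(name)
```

In a lexical sort, 9_custom.rules comes after 10_custom.rules, so zero-padded prefixes such as 09 keep the intended order.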

Install all ACL policies under a directory:

cumulus@switch:~$ sudo cl-acltool -i -P ./rules
Reading files under rules
Reading rule file ./rules/01_http_rules.txt ...
Processing rules in file ./rules/01_http_rules.txt ...
Installing acl policy ...
Done.

Apply all rules and policies included in /etc/cumulus/acl/policy.conf:

cumulus@switch:~$ sudo cl-acltool -i

Specify the Policy Files to Install

By default, Cumulus Linux installs any .rules file you configure in /etc/cumulus/acl/policy.d/. To install other policy files, you need to include them in /etc/cumulus/acl/policy.conf. For example, for Cumulus Linux to install the rules in a policy file called 01_new.datapathacl, add include /etc/cumulus/acl/policy.d/01_new.datapathacl to policy.conf:

cumulus@switch:~$ sudo nano /etc/cumulus/acl/policy.conf

#
# This file is a master file for acl policy file inclusion
#
# Note: This is not a file where you list acl rules.
#
# This file can contain:
# - include lines with acl policy files
#   example:
#     include <filepath>
#
# see manpage cl-acltool(5) and cl-acltool(8) for how to write policy files 
#

include /etc/cumulus/acl/policy.d/01_new.datapathacl
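
To check which policy files a policy.conf file pulls in, you can list its include lines. The following Python sketch is a minimal example; the function name included_files is made up for this illustration, and it handles only the include <filepath> form shown in the file comments above.

```python
def included_files(conf_text):
    """Return the paths named on 'include <filepath>' lines, in file order."""
    paths = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # comments and blank lines carry no includes
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] == "include":
            paths.append(parts[1].strip())
    return paths


SAMPLE_CONF = """\
#
# This file is a master file for acl policy file inclusion
#
# - include lines with acl policy files
#   example:
#     include <filepath>
#

include /etc/cumulus/acl/policy.d/01_new.datapathacl
"""

if __name__ == "__main__":
    for path in included_files(SAMPLE_CONF):
        print(path)
```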

Hardware Limitations for ACL Rules

The maximum number of rules that the switch hardware can store depends on:

If you exceed the maximum number of rules or run out of related memory resources for the ACL table, cl-acltool -i generates one of the following errors:

error: hw sync failed (sync_acl hardware installation failed) Rolling back .. failed.
error: hw sync failed (Bulk counter init failed with No More Resources). Rolling back ..

To troubleshoot this issue and manage resources with high VLAN and ACL scale, refer to Troubleshooting ACL Rule Installation Failures.

NVIDIA Spectrum switches use a TCAM or ATCAM to quickly look up various tables that include ACLs, multicast routes, and certain internal VLAN counters. Depending on the size of the network ACLs, multicast routes, and VLAN counters, you might need to adjust some parameters to fit your network requirements into the tables.

TCAM Profiles on Spectrum 1

The NVIDIA Spectrum 1 ASIC (model numbers 2xx0) has one common TCAM space for both ingress and egress ACLs, which the switch also uses for multicast route entries.

Cumulus Linux controls the ACL and multicast route entry scale on NVIDIA Spectrum 1 switches with different TCAM profiles in combination with the ACL atomic and nonatomic update setting.

Profile | Atomic Mode IPv4 Rules | Atomic Mode IPv6 Rules | Nonatomic Mode IPv4 Rules | Nonatomic Mode IPv6 Rules | Multicast Route Entries
default | 500 | 250 | 1000 | 500 | 1000
ipmc-heavy | 750 | 500 | 1500 | 1000 | 8500
acl-heavy | 1750 | 1000 | 3500 | 2000 | 450
ipmc-max | 1000 | 500 | 2000 | 1000 | 13000
ip-acl-heavy | 6000 | 0 | 12000 | 0 | 0

  • Even though the table above specifies the ip-acl-heavy profile supports no IPv6 rules, Cumulus Linux does not prevent you from configuring IPv6 rules. However, there is no guarantee that IPv6 rules work under the ip-acl-heavy profile.
  • The ip-acl-heavy profile shows an updated number of supported atomic mode and nonatomic mode IPv4 rules. The previously published numbers were 7500 for atomic mode and 15000 for nonatomic mode IPv4 rules.
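
When you choose a Spectrum 1 TCAM profile, it can help to compare your planned rule and multicast route counts against the table above. The following Python sketch does that comparison. It treats each column as an independent upper bound, which is a simplifying assumption: a configuration that sits near several limits at once might still fail to install, so read the result as a first filter rather than a guarantee. The function name profiles_that_fit is hypothetical.

```python
# Values copied from the TCAM profile table above:
# (IPv4 rules, IPv6 rules) per update mode, plus multicast route entries.
SPECTRUM1_PROFILES = {
    "default":      {"atomic": (500, 250),  "nonatomic": (1000, 500),  "mcast": 1000},
    "ipmc-heavy":   {"atomic": (750, 500),  "nonatomic": (1500, 1000), "mcast": 8500},
    "acl-heavy":    {"atomic": (1750, 1000), "nonatomic": (3500, 2000), "mcast": 450},
    "ipmc-max":     {"atomic": (1000, 500), "nonatomic": (2000, 1000), "mcast": 13000},
    "ip-acl-heavy": {"atomic": (6000, 0),   "nonatomic": (12000, 0),   "mcast": 0},
}


def profiles_that_fit(ipv4_rules, ipv6_rules, mcast_routes, mode="atomic"):
    """Return profile names whose published limits cover the planned counts."""
    if mode not in ("atomic", "nonatomic"):
        raise ValueError("mode must be 'atomic' or 'nonatomic'")
    fits = []
    for name, limits in SPECTRUM1_PROFILES.items():
        max_v4, max_v6 = limits[mode]
        # Assumption: each limit is checked on its own, not in combination.
        if ipv4_rules <= max_v4 and ipv6_rules <= max_v6 and mcast_routes <= limits["mcast"]:
            fits.append(name)
    return fits


if __name__ == "__main__":
    print(profiles_that_fit(1200, 600, 400, mode="atomic"))
    print(profiles_that_fit(400, 200, 5000, mode="atomic"))
```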

To configure the profile you want to use, set the tcam_resource.profile parameter in the /etc/mlx/datapath/tcam_profile.conf file, then restart switchd:

cumulus@switch:~$ sudo nano /etc/mlx/datapath/tcam_profile.conf
...
tcam_resource.profile = ipmc-max
cumulus@switch:~$ sudo systemctl restart switchd.service

Spectrum 1 TCAM resource profiles that control ACLs and multicast route scale are different from forwarding resource profiles that control MAC table, IPv4, and IPv6 entry scale.

ATCAM on Spectrum-2 and Later

Switches with Spectrum-2 and later use a newer KVD scheme and an ATCAM design that is more flexible and allows a higher ACL scale than Spectrum 1. There is no TCAM resource profile on Spectrum-2 and later.

The following table shows the tested ACL rule limits. Because the KVD and ATCAM share space with forwarding table entries, multicast route entries, and VLAN flow counters, these ACL limits might vary based on your use of other tables.

These limits are valid when using any Spectrum-2 and later forwarding profile, except for the l2-heavy-3 and v6-lpm-heavy1 profiles, which reduce the ACL scale significantly.

For Spectrum-2 and later, all profiles support the same number of rules.

Atomic Mode IPv4 Rules | Atomic Mode IPv6 Rules | Nonatomic Mode IPv4 Rules | Nonatomic Mode IPv6 Rules
12500 | 6250 | 25000 | 12500

For information about nonatomic and atomic mode, refer to Nonatomic Update Mode and Atomic Update Mode.

ATCAM Resource Exhaustion

If you see error messages similar to No More Resources .. Rolling back when you try to apply ACLs, refer to Troubleshooting ACL Rule Installation Failures for information on troubleshooting and managing resources.

Supported Rule Types

The iptables/ip6tables/ebtables construct tries to layer the Linux implementation on top of the underlying hardware but they are not always directly compatible. The following shows the supported rules for chains in iptables, ip6tables and ebtables.

To learn more about any of the options shown in the tables below, run iptables -h [name of option]. The same help syntax works for options for ip6tables and ebtables.

root@leaf1# ebtables -h tricolorpolice
...
tricolorpolice option:
--set-color-mode STRING setting the mode in blind or aware
--set-cir INT setting committed information rate in kbits per second
--set-cbs INT setting committed burst size in kbyte
--set-pir INT setting peak information rate in kbits per second
--set-ebs INT setting excess burst size in kbyte
--set-conform-action-dscp INT setting dscp value if the action is accept for conforming packets
--set-exceed-action-dscp INT setting dscp value if the action is accept for exceeding packets
--set-violate-action STRING setting the action (accept/drop) for violating packets
--set-violate-action-dscp INT setting dscp value if the action is accept for violating packets
Supported chains for the filter table:
INPUT FORWARD OUTPUT

iptables and ip6tables Rule Support

Rule Element: Matches
  • Supported: Src/Dst, IP protocol; In/out interface; IPv4: ecn, icmp, frag, ttl; IPv6: icmp6, hl; IP common: tcp (with flags), udp, multiport, DSCP, addrtype
  • Unsupported: Rules with input/output Ethernet interfaces do not apply; Inverse matches

Rule Element: Standard Targets
  • Supported: ACCEPT, DROP
  • Unsupported: RETURN, QUEUE, STOP, Fall Thru, Jump

Rule Element: Extended Targets
  • Supported: LOG (IPv4/IPv6); UID is not supported for LOG; TCP SEQ, TCP options or IP options; ULOG; SETQOS; DSCP
  • Supported, unique to Cumulus Linux: SPAN, ERSPAN (IPv4/IPv6), POLICE, TRICOLORPOLICE, SETCLASS

ebtables Rule Support

Rule Element: Matches
  • Supported: ether type; input interface/wildcard; output interface/wildcard; Src/Dst MAC; IP: src, dest, tos, proto, sport, dport; IPv6: tclass, icmp6: type, icmp6: code range, src/dst addr, sport, dport; 802.1p (CoS); VLAN
  • Unsupported: Inverse matches; Proto length

Rule Element: Standard Targets
  • Supported: ACCEPT, DROP
  • Unsupported: RETURN, CONTINUE, Jump, Fall Thru

Rule Element: Extended Targets
  • Supported: ULOG; LOG
  • Supported, unique to Cumulus Linux: SPAN, ERSPAN, POLICE, TRICOLORPOLICE, SETCLASS

Other Unsupported Rules

Considerations

Splitting rules across the ingress TCAM and the egress TCAM causes the ingress IPv6 part of the rule to match packets going to all destinations, which can interfere with the regular expected linear rule match in a sequence. For example:

A higher rule can prevent a lower rule from matching:

Rule 1: -A FORWARD -o vlan100 -p icmp6 -j ACCEPT

Rule 2: -A FORWARD -o vlan101 -p icmp6 -s 01::02 -j ACCEPT

Rule 1 matches all icmp6 packets to all out interfaces in the ingress TCAM.

This prevents rule 2 from matching, which is more specific but with a different out interface. Make sure to put more specific matches above more general matches even if the output interfaces are different.

When you have two rules with the same output interface, the lower rule might match depending on the presence of the previous rules.

Rule 1: -A FORWARD -o vlan100 -p icmp6 -j ACCEPT

Rule 2: -A FORWARD -o vlan101 -s 00::01 -j DROP

Rule 3: -A FORWARD -o vlan101 -p icmp6 -j ACCEPT

Rule 3 still matches an icmp6 packet with source IP address 00::01 going out of vlan101. Rule 1 interferes with the normal function of rule 2 and/or rule 3.

When you have two adjacent rules with the same match and different output interfaces, such as:

Rule 1: -A FORWARD -o vlan100 -p icmp6 -j ACCEPT

Rule 2: -A FORWARD -o vlan101 -p icmp6 -j DROP

Rule 2 never matches on ingress. Both rules share the same mark.

Common Examples

Data Plane Policers

You can configure quality of service for traffic on the data plane. By using QoS policers, you can rate limit traffic so incoming packets get dropped if they exceed specified thresholds.

Counters on POLICE ACL rules in iptables do not show dropped packets due to those rules.

The following example rate limits the incoming traffic on swp1 to 400 packets per second with a burst of 200 packets:

cumulus@switch:~$ nv set acl example1 type ipv4
cumulus@switch:~$ nv set acl example1 rule 10 action police
cumulus@switch:~$ nv set acl example1 rule 10 action police mode packet
cumulus@switch:~$ nv set acl example1 rule 10 action police burst 200
cumulus@switch:~$ nv set acl example1 rule 10 action police rate 400
cumulus@switch:~$ nv set interface swp1 acl example1 inbound
cumulus@switch:~$ nv config apply

Use the POLICE target with iptables. POLICE takes these arguments:

  • --set-rate value specifies the maximum rate in kilobytes (KB) or packets.
  • --set-burst value specifies the number of packets or kilobytes (KB) allowed to arrive sequentially.
  • --set-mode string sets the mode in KB (kilobytes) or pkt (packets) for rate and burst size.

For example, to rate limit the incoming traffic on swp1 to 400 packets per second with a burst of 200 packets, add this rule to the appropriate .rules file:

-t mangle -A PREROUTING -i swp1  -j POLICE --set-mode pkt --set-rate 400 --set-burst 200
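
To build intuition for how a rate and a burst value interact, the following Python sketch models a policer as a token bucket. This is a common way to reason about policers and is used here as an assumption; the guide does not describe how the switch hardware implements POLICE, and the function name police is made up for this example.

```python
def police(arrival_times, rate, burst):
    """Token bucket model: return True for each packet that conforms.

    arrival_times: packet arrival times in seconds, in ascending order.
    rate:  tokens (packets) added per second.
    burst: bucket depth, the most packets accepted back to back.
    """
    tokens = float(burst)  # assumption: the bucket starts full
    last = None
    verdicts = []
    for now in arrival_times:
        if last is not None:
            # Refill for the elapsed time, never beyond the bucket depth.
            tokens = min(float(burst), tokens + (now - last) * rate)
        last = now
        if tokens >= 1:
            tokens -= 1
            verdicts.append(True)   # packet conforms and is forwarded
        else:
            verdicts.append(False)  # packet exceeds the policer and drops
    return verdicts


if __name__ == "__main__":
    # 300 packets arriving at the same instant with rate 400 and burst 200.
    back_to_back = police([0.0] * 300, rate=400, burst=200)
    print("accepted from a 300-packet burst:", sum(back_to_back))

    # 1000 packets spread evenly over one second.
    spaced = police([i / 1000 for i in range(1000)], rate=400, burst=200)
    print("accepted from 1000 evenly spaced packets:", sum(spaced))
```

In this model, a burst of back-to-back packets is accepted only up to the burst value, and sustained traffic above the rate is accepted at roughly the rate once the initial burst allowance is used up.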

Control Plane Policers

You can configure quality of service for traffic on the control plane and rate limit traffic so incoming packets drop if they exceed certain thresholds in the following ways:

Cumulus Linux 5.0 and later no longer uses INPUT chain rules to configure control plane policers.

To configure control plane policers:

  • Set the burst rate for the trap group with the nv set system control-plane policer <trap-group> burst <value> command. The burst rate is the number of packets or kilobytes (KB) allowed to arrive sequentially.
  • Set the forwarding rate for the trap group with the nv set system control-plane policer <trap-group> rate <value> command. The forwarding rate is the maximum rate in kilobytes (KB) or packets.

The trap group can be: arp, bfd, pim-ospf-rip, bgp, clag, icmp-def, dhcp-ptp, igmp, ssh, icmp6-neigh, icmp6-def-mld, lacp, lldp, rpvst, eapol, ip2me, acl-log, nat, stp, l3-local, span-cpu, catch-all, or NONE.

The following example changes the PIM trap group forwarding rate and burst rate to 400 packets per second, and the IGMP trap group forwarding rate to 400 packets per second and burst rate to 200 packets per second:

cumulus@switch:~$ nv set system control-plane policer pim-ospf-rip rate 400
cumulus@switch:~$ nv set system control-plane policer pim-ospf-rip burst 400
cumulus@switch:~$ nv set system control-plane policer pim-ospf-rip state on
cumulus@switch:~$ nv set system control-plane policer igmp rate 400
cumulus@switch:~$ nv set system control-plane policer igmp burst 200
cumulus@switch:~$ nv config apply

To rate limit traffic using the /etc/cumulus/control-plane/policers.conf file, you:

  • Enable an individual policer for a trap group (set enable to TRUE).
  • Set the policer rate in packets per second. The forwarding rate is the maximum rate in kilobytes (KB) or packets.
  • Set the policer burst rate in packets per second. The burst rate is the number of packets or kilobytes (KB) allowed to arrive sequentially.

After you edit the /etc/cumulus/control-plane/policers.conf file, you must reload the file with the /usr/lib/cumulus/switchdctl --load /etc/cumulus/control-plane/policers.conf command.

When enable is FALSE for a trap group, the trap group and catch-all trap group have a shared policer. When enable is TRUE, Cumulus Linux creates an individual policer for the trap group.

The following example changes the PIM trap group forwarding rate and burst rate to 400 packets per second, and the IGMP trap group forwarding rate to 400 packets per second and burst rate to 200 packets per second:

cumulus@switch:~$ sudo nano /etc/cumulus/control-plane/policers.conf
...
copp.pim_ospf_rip.enable = TRUE
copp.pim_ospf_rip.rate = 400
copp.pim_ospf_rip.burst = 400
...
copp.igmp.enable = TRUE
copp.igmp.rate = 400
copp.igmp.burst = 200
...
cumulus@switch:~$ /usr/lib/cumulus/switchdctl --load /etc/cumulus/control-plane/policers.conf
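
Before you reload the file, you might want to review the values you set for each trap group. The following Python sketch parses copp.<group>.<field> = <value> lines into a dictionary. The function name parse_policers is hypothetical, and the sketch assumes only the three fields shown in this section (enable, rate, and burst).

```python
def parse_policers(text):
    """Return {trap_group: {"enable": bool, "rate": int, "burst": int}}."""
    policers = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip anything that is not a copp.<group>.<field> = <value> line.
        if not line.startswith("copp.") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        _, group, field = key.split(".", 2)
        entry = policers.setdefault(group, {})
        if field == "enable":
            entry[field] = value.upper() == "TRUE"
        else:
            entry[field] = int(value)  # rate and burst are integers
    return policers


SAMPLE = """\
...
copp.pim_ospf_rip.enable = TRUE
copp.pim_ospf_rip.rate = 400
copp.pim_ospf_rip.burst = 400
...
copp.igmp.enable = TRUE
copp.igmp.rate = 400
copp.igmp.burst = 200
...
"""

if __name__ == "__main__":
    for group, fields in parse_policers(SAMPLE).items():
        print(group, fields)
```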

To show the control plane policer configuration and statistics, run the NVUE nv show system control-plane policer --view=brief command.

Cumulus Linux provides default control plane policer values. You can adjust these values to accommodate higher scale requirements for specific protocols as needed.

Policers Default Values
cumulus@leaf01:mgmt:~$ sudo cat /etc/cumulus/control-plane/policers.conf
copp.arp.enable = TRUE
copp.arp.rate = 800
copp.arp.burst = 800

copp.bfd.enable = TRUE
copp.bfd.rate = 2000
copp.bfd.burst = 2000

copp.pim_ospf_rip.enable = TRUE
copp.pim_ospf_rip.rate = 2000
copp.pim_ospf_rip.burst = 2000

copp.bgp.enable = TRUE
copp.bgp.rate = 2000
copp.bgp.burst = 2000

copp.clag.enable = TRUE
copp.clag.rate = 2000
copp.clag.burst = 2000

copp.icmp_def.enable = TRUE
copp.icmp_def.rate = 100
copp.icmp_def.burst = 40

copp.dhcp_ptp.enable = TRUE
copp.dhcp_ptp.rate = 2000
copp.dhcp_ptp.burst = 2000

copp.igmp.enable = TRUE
copp.igmp.rate = 1000
copp.igmp.burst = 1000

copp.ssh.enable = TRUE
copp.ssh.rate = 1000
copp.ssh.burst = 1000

copp.icmp6_neigh.enable = TRUE
copp.icmp6_neigh.rate = 500
copp.icmp6_neigh.burst = 500

copp.icmp6_def_mld.enable = TRUE
copp.icmp6_def_mld.rate = 300
copp.icmp6_def_mld.burst = 100

copp.lacp.enable = TRUE
copp.lacp.rate = 2000
copp.lacp.burst = 2000

copp.lldp.enable = TRUE
copp.lldp.rate = 200
copp.lldp.burst = 200

copp.rpvst.enable = TRUE
copp.rpvst.rate = 2000
copp.rpvst.burst = 2000

copp.eapol.enable = TRUE
copp.eapol.rate = 2000
copp.eapol.burst = 2000

copp.ip2me.enable = TRUE
copp.ip2me.rate = 1000
copp.ip2me.burst = 1000

copp.acl_log.enable = TRUE
copp.acl_log.rate = 100
copp.acl_log.burst = 100

copp.nat.enable = TRUE
copp.nat.rate = 200
copp.nat.burst = 200

copp.stp.enable = TRUE
copp.stp.rate = 2000
copp.stp.burst = 2000

copp.l3_local.enable = TRUE
copp.l3_local.rate = 400
copp.l3_local.burst = 100

copp.span_cpu.enable = TRUE
copp.span_cpu.rate = 100
copp.span_cpu.burst = 100

copp.catch_all.enable = TRUE
copp.catch_all.rate = 100
copp.catch_all.burst = 100

Control Plane ACLs

You can configure control plane ACLs to apply a single rule for all packets forwarded to the CPU regardless of the source interface or destination interface on the switch. Control plane ACLs allow you to regulate traffic forwarded to applications on the switch with more granularity than traps and to configure ACLs to block SSH from specific addresses or subnets.

Cumulus Linux applies inbound control plane ACLs in the INPUT chain and outbound control plane ACLs in the OUTPUT chain.

Cumulus Linux does not support a deny all control plane rule. This type of rule blocks traffic for interprocess communication and impacts overall system functionality.

The following example command applies the input control plane ACL called ACL1.

cumulus@switch:~$ nv set system control-plane acl ACL1 inbound
cumulus@switch:~$ nv config apply

The following example command applies the output control plane ACL called ACL2.

cumulus@switch:~$ nv set system control-plane acl ACL2 outbound
cumulus@switch:~$ nv config apply

To show statistics for all control-plane ACLs, run the nv show system control-plane acl command:

cumulus@switch:~$ nv show system control-plane acl
ACL Name   Rule ID  In Packets  In Bytes  Out Packets  Out Bytes
---------  -------  ----------  --------  -----------  ---------
acl1       1        0           0         0            0
           65535    0           0         0            0
acl2       1        0           0         0            0
           65535    0           0         0            0 

To show statistics for a specific control-plane ACL, run the nv show system control-plane acl <acl_name> statistics command:

cumulus@switch:~$ nv show system control-plane acl ACL1 statistics
Rule  In Packet  In Byte  Out Packet  Out Byte  Summary
----  ---------  -------  ----------  --------  ---------------------------
1     0          0 Bytes  0           0 Bytes   match.ip.dest-ip:   9.1.2.3
2     0          0 Bytes  0           0 Bytes   match.ip.source-ip: 7.8.2.3

Set DSCP on Transit Traffic

The following examples use the mangle table to modify the packet as it transits the switch. DSCP is in decimal notation in the examples below.

[iptables]

#Set SSH as high priority traffic.
-t mangle -A PREROUTING -i swp+ -p tcp -m multiport --dports 22 -j SETQOS --set-dscp 46

#Set everything coming in swp1 as AF13
-t mangle -A PREROUTING -i swp1  -j SETQOS --set-dscp 14

#Set Packets destined for 10.0.100.27 as best effort
-t mangle -A PREROUTING -i swp+ -d 10.0.100.27/32 -j SETQOS --set-dscp 0

#Example using a range of ports for TCP traffic
-t mangle -A PREROUTING -i swp+ -s 10.0.0.17/32 -d 10.0.100.27/32 -p tcp -m multiport --sports 10000:20000 -m multiport --dports 10000:20000 -j SETQOS --set-dscp 34

Apply the rule:

cumulus@switch:~$ sudo cl-acltool -i
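
The decimal DSCP values in the rules above correspond to standard per-hop behaviors: 46 is Expedited Forwarding (EF), and an Assured Forwarding class x with drop precedence y (AFxy) has the value 8x + 2y, so AF13 is 14 and AF41 is 34. The following Python sketch computes these values and the resulting ToS byte. The helper names are made up for this example.

```python
EF = 46  # Expedited Forwarding


def af_dscp(af_class, drop_precedence):
    """Return the decimal DSCP for AF<class><drop precedence>, such as AF13."""
    if not 1 <= af_class <= 4:
        raise ValueError("AF class must be 1 to 4")
    if not 1 <= drop_precedence <= 3:
        raise ValueError("drop precedence must be 1 to 3")
    return 8 * af_class + 2 * drop_precedence


def tos_byte(dscp, ecn=0):
    """DSCP occupies the upper six bits of the ToS byte; ECN the lower two."""
    if not 0 <= dscp <= 63 or not 0 <= ecn <= 3:
        raise ValueError("dscp must be 0 to 63 and ecn 0 to 3")
    return (dscp << 2) | ecn


if __name__ == "__main__":
    print("AF13 =", af_dscp(1, 3))
    print("AF41 =", af_dscp(4, 1))
    print("EF ToS byte =", hex(tos_byte(EF)))
```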

To set SSH as high priority traffic:

cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip protocol tcp
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip tcp dest-port 22
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 action set dscp 46
cumulus@switch:~$ nv set interface swp1-48 acl EXAMPLE1 inbound
cumulus@switch:~$ nv config apply

To set everything coming in swp1 as AF13:

cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 action set dscp 14
cumulus@switch:~$ nv set interface swp1 acl EXAMPLE1 inbound
cumulus@switch:~$ nv config apply

To set packets destined for 10.0.100.27 as best effort:

cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip dest-ip 10.0.100.27/32
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 action set dscp 0
cumulus@switch:~$ nv set interface swp1-48 acl EXAMPLE1 inbound
cumulus@switch:~$ nv config apply

To use a range of ports for TCP traffic:

cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip protocol tcp
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip source-ip 10.0.0.17/32
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip tcp source-port 10000:20000
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip dest-ip 10.0.100.27/32
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip tcp dest-port 10000:20000
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 action set dscp 34
cumulus@switch:~$ nv set interface swp1-48 acl EXAMPLE1 inbound
cumulus@switch:~$ nv config apply

To specify all ports on the switch in NVUE (swp+ in an iptables rule), you must set the range of interfaces on the switch as in the examples above (nv set interface swp1-48). This command creates as many rules in the /etc/cumulus/acl/policy.d/50_nvue.rules file as the number of interfaces in the range you specify.
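
To estimate how many per-interface rules a range produces, you can expand the range yourself. The following Python sketch handles only the simple <prefix><start>-<end> form used in the examples above, such as swp1-48; other range notations are out of scope, and the function name expand_interface_range is hypothetical.

```python
import re


def expand_interface_range(spec):
    """Expand 'swp1-48' into ['swp1', ..., 'swp48']; return [spec] otherwise."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+)-(\d+)", spec)
    if match is None:
        return [spec]  # a single interface name such as 'swp1'
    prefix, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError(f"empty range: {spec}")
    return [f"{prefix}{number}" for number in range(start, end + 1)]


if __name__ == "__main__":
    interfaces = expand_interface_range("swp1-48")
    print(len(interfaces), "interfaces, from", interfaces[0], "to", interfaces[-1])
```

An ACL with one rule applied to swp1-48 therefore accounts for 48 entries in the generated rules file, which matters when you plan against the hardware rule limits.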

Filter Specific TCP Flags

The example rule below drops ingress IPv4 TCP packets that have the SYN bit set and the RST, ACK, and FIN bits cleared. The rule applies inbound on interface swp1. After configuring this rule, you cannot establish new TCP sessions that originate from ingress port swp1. You can establish TCP sessions that originate from any other port.

-t mangle -A PREROUTING -i swp1 -p tcp --tcp-flags  ACK,SYN,FIN,RST SYN -j DROP

Apply the rule:

cumulus@switch:~$ sudo cl-acltool -i

To configure the rule with NVUE:

cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
cumulus@switch:~$ nv set acl EXAMPLE1 rule 20 match ip protocol tcp
cumulus@switch:~$ nv set acl EXAMPLE1 rule 20 match ip tcp flags syn
cumulus@switch:~$ nv set acl EXAMPLE1 rule 20 match ip tcp mask rst
cumulus@switch:~$ nv set acl EXAMPLE1 rule 20 match ip tcp mask syn
cumulus@switch:~$ nv set acl EXAMPLE1 rule 20 match ip tcp mask fin
cumulus@switch:~$ nv set acl EXAMPLE1 rule 20 match ip tcp mask ack
cumulus@switch:~$ nv set acl EXAMPLE1 rule 20 action deny 
cumulus@switch:~$ nv set interface swp1 acl EXAMPLE1 inbound
cumulus@switch:~$ nv config apply
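
The --tcp-flags option takes two lists: the flags to examine (the mask) and the flags that must be set among them. The following Python sketch reproduces that comparison with bit masks so you can check which packets the rule above matches. The flag bit values are the standard TCP header bits; the function names are made up for this example.

```python
TCP_FLAG_BITS = {
    "FIN": 0x01, "SYN": 0x02, "RST": 0x04,
    "PSH": 0x08, "ACK": 0x10, "URG": 0x20,
}


def flag_bits(names):
    """Combine flag names into one bit field."""
    value = 0
    for name in names:
        value |= TCP_FLAG_BITS[name]
    return value


def tcp_flags_match(packet_flags, mask, comp):
    """True when, among the mask flags, exactly the comp flags are set."""
    return (packet_flags & flag_bits(mask)) == flag_bits(comp)


MASK = ["ACK", "SYN", "FIN", "RST"]
COMP = ["SYN"]

if __name__ == "__main__":
    samples = {
        "SYN": flag_bits(["SYN"]),
        "SYN+ACK": flag_bits(["SYN", "ACK"]),
        "ACK": flag_bits(["ACK"]),
        "SYN+PSH": flag_bits(["SYN", "PSH"]),
    }
    for label, flags in samples.items():
        print(label, tcp_flags_match(flags, MASK, COMP))
```

Only the initial SYN of a new connection matches. A SYN+ACK reply does not match because ACK is part of the mask but not part of the comparison list, and PSH is ignored because it is not in the mask.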

Control Who Can SSH into the Switch

Run the following commands to control who can SSH into the switch. In the following example, 10.10.10.1/32 is the interface IP address (or loopback IP address) of the switch and 10.255.4.0/24 can SSH into the switch.

-A INPUT -i swp+ -s 10.255.4.0/24 -d 10.10.10.1/32 -j ACCEPT
-A INPUT -i swp+ -d 10.10.10.1/32 -j DROP

Apply the rule:

cumulus@switch:~$ sudo cl-acltool -i

To configure the rule with NVUE:

cumulus@switch:~$ nv set acl example2 type ipv4
cumulus@switch:~$ nv set acl example2 rule 10 match ip source-ip 10.255.4.0/24 
cumulus@switch:~$ nv set acl example2 rule 10 match ip dest-ip 10.10.10.1/32
cumulus@switch:~$ nv set acl example2 rule 10 action permit
cumulus@switch:~$ nv set acl example2 rule 20 match ip source-ip ANY 
cumulus@switch:~$ nv set acl example2 rule 20 match ip dest-ip 10.10.10.1/32
cumulus@switch:~$ nv set acl example2 rule 20 action deny
cumulus@switch:~$ nv set system control-plane acl example2 inbound
cumulus@switch:~$ nv config apply
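
The order of the two rules matters: the ACCEPT rule for 10.255.4.0/24 must come before the DROP rule for the switch address. The following Python sketch evaluates the rules first match first, with a default of ACCEPT to mirror the INPUT chain policy shown in the cl-acltool -L output. It models only source and destination matching and is not how the switch evaluates rules internally; the names RULES and verdict are hypothetical.

```python
import ipaddress

# (source network, destination network, action), in rule order.
RULES = [
    ("10.255.4.0/24", "10.10.10.1/32", "ACCEPT"),
    ("0.0.0.0/0",     "10.10.10.1/32", "DROP"),
]


def verdict(src, dst, rules=RULES, default="ACCEPT"):
    """Return the action of the first rule that matches src and dst."""
    source = ipaddress.ip_address(src)
    destination = ipaddress.ip_address(dst)
    for rule_src, rule_dst, action in rules:
        if (source in ipaddress.ip_network(rule_src)
                and destination in ipaddress.ip_network(rule_dst)):
            return action
    return default  # assumption: chain policy is ACCEPT


if __name__ == "__main__":
    print(verdict("10.255.4.7", "10.10.10.1"))
    print(verdict("192.0.2.9", "10.10.10.1"))
    print(verdict("192.0.2.9", "10.10.10.2"))
```

If you reverse the two rules, the DROP rule matches first and hosts in 10.255.4.0/24 lose access as well.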

Block Traffic towards the eth0 Interface

To block traffic towards the eth0 interface, apply an ACL on the system control plane instead of on the eth0 interface. The following example creates an ACL called DENY-IN that blocks traffic from ingressing eth0 with source IP address 192.168.200.10:

cumulus@switch:~$ nv set acl DENY-IN rule 10 action deny
cumulus@switch:~$ nv set acl DENY-IN rule 10 match ip source-ip 192.168.200.10
cumulus@switch:~$ nv set acl DENY-IN type ipv4
cumulus@switch:~$ nv set system control-plane acl DENY-IN inbound
cumulus@switch:~$ nv config apply

Match on ECN Bits in the TCP IP Header

ECN allows end-to-end notification of network congestion without dropping packets. You can add ECN rules to match on the ECE, CWR, and ECT flags in the TCP IPv4 header.

By default, ECN rules match a packet with the bit set. You can reverse the match by using an exclamation point (!).

Match on the ECE Bit

After an endpoint receives a packet with the CE bit set by a router, it sets the ECE bit in the returning ACK packet to notify the other endpoint that it needs to slow down.

To match on the ECE bit:

Create a rules file in the /etc/cumulus/acl/policy.d directory and add the following rule under [iptables]:

cumulus@switch:~$ sudo nano /etc/cumulus/acl/policy.d/30-tcp-flags.rules
[iptables]
-t mangle -A PREROUTING -i swp1 -p tcp -m ecn  --ecn-tcp-ece  -j ACCEPT

Apply the rule:

cumulus@switch:~$ sudo cl-acltool -i

To configure the rule with NVUE:

cumulus@switch:~$ nv set acl example2 type ipv4
cumulus@switch:~$ nv set acl example2 rule 10 match ip protocol tcp
cumulus@switch:~$ nv set acl example2 rule 10 match ip ecn flags tcp-ece
cumulus@switch:~$ nv set acl example2 rule 10 action permit
cumulus@switch:~$ nv set interface swp1 acl example2 inbound
cumulus@switch:~$ nv config apply

Match on the CWR Bit

The CWR bit notifies the other endpoint of the connection that it received and reacted to an ECE.

To match on the CWR bit:

Create a rules file in the /etc/cumulus/acl/policy.d directory and add the following rule under [iptables]:

cumulus@switch:~$ sudo nano /etc/cumulus/acl/policy.d/30-tcp-flags.rules
[iptables]
-t mangle -A PREROUTING -i swp1 -p tcp -m ecn  --ecn-tcp-cwr  -j ACCEPT

Apply the rule:

cumulus@switch:~$ sudo cl-acltool -i

To configure the rule with NVUE:

cumulus@switch:~$ nv set acl example2 type ipv4
cumulus@switch:~$ nv set acl example2 rule 10 match ip protocol tcp
cumulus@switch:~$ nv set acl example2 rule 10 match ip ecn flags tcp-cwr
cumulus@switch:~$ nv set acl example2 rule 10 action permit
cumulus@switch:~$ nv set interface swp1 acl example2 inbound
cumulus@switch:~$ nv config apply

Match on the ECT Bit

The ECT codepoints negotiate if the connection is ECN capable by setting one of the two bits to 1. Routers also use the ECT bit to indicate that they are experiencing congestion by setting both the ECT codepoints to 1.

To match on the ECT bit:

Create a rules file in the /etc/cumulus/acl/policy.d directory and add the following rule under [iptables]:

cumulus@switch:~$ sudo nano /etc/cumulus/acl/policy.d/30-tcp-flags.rules
[iptables]
-t mangle -A PREROUTING -i swp1 -p tcp -m ecn  --ecn-ip-ect 1 -j ACCEPT

Apply the rule:

cumulus@switch:~$ sudo cl-acltool -i

To configure the rule with NVUE:

cumulus@switch:~$ nv set acl example2 type ipv4
cumulus@switch:~$ nv set acl example2 rule 10 match ip protocol tcp
cumulus@switch:~$ nv set acl example2 rule 10 match ip ecn ip-ect 1
cumulus@switch:~$ nv set acl example2 rule 10 action permit
cumulus@switch:~$ nv set interface swp1 acl example2 inbound
cumulus@switch:~$ nv config apply
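
The ECN matches in this section look at two different places: the two-bit ECN field in the IP header (the lowest two bits of the ToS byte) and the ECE and CWR flags in the TCP header. The following Python sketch decodes both, using the codepoints defined in RFC 3168. The function names are made up for this example.

```python
# IP header: the two least significant bits of the ToS / traffic class byte.
ECN_CODEPOINTS = {0b00: "Not-ECT", 0b01: "ECT(1)", 0b10: "ECT(0)", 0b11: "CE"}

# TCP header flag bits.
TCP_ECE = 0x40
TCP_CWR = 0x80


def ecn_codepoint(tos):
    """Return the name of the ECN codepoint carried in a ToS byte."""
    return ECN_CODEPOINTS[tos & 0b11]


def matches_ip_ect(tos, value):
    """Mirror --ecn-ip-ect <value>: compare the ECN field with 0 to 3."""
    return (tos & 0b11) == value


def tcp_ecn_flags(tcp_flags):
    """Return which of the ECE and CWR flags are set in the TCP flags byte."""
    return {"ECE": bool(tcp_flags & TCP_ECE), "CWR": bool(tcp_flags & TCP_CWR)}


if __name__ == "__main__":
    print(ecn_codepoint(0xB9))          # DSCP 46 with the ECN field set to 01
    print(matches_ip_ect(0xB9, 1))
    print(tcp_ecn_flags(0x40 | 0x10))   # ACK segment that carries ECE
```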

Example Configuration

The following example demonstrates how Cumulus Linux applies several different rules.

Egress Rule

The following rule blocks any TCP traffic with destination port 200 going through leaf01 to server01 (rule 1 in the diagram above).

[iptables]
-t mangle -A POSTROUTING -o swp1 -p tcp -m multiport --dports 200 -j DROP

To configure the rule with NVUE:

cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip protocol tcp
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip tcp dest-port 200
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 action deny
cumulus@switch:~$ nv set interface swp1 acl EXAMPLE1 outbound
cumulus@switch:~$ nv config apply

Ingress Rule

The following rule blocks any UDP traffic with source port 200 going from server01 through leaf01 (rule 2 in the diagram above).

[iptables] 
-t mangle -A PREROUTING -i swp1 -p udp -m multiport --sports 200 -j DROP

To configure the rule with NVUE:

cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip protocol udp
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip udp source-port 200
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 action deny
cumulus@switch:~$ nv set interface swp1 acl EXAMPLE1 inbound
cumulus@switch:~$ nv config apply

Input Rule

The following rule blocks any UDP traffic with destination port 50 going from server02 to the leaf02 control plane (rule 3 in the diagram above).

[iptables] 
-A INPUT -i swp2 -p udp -m multiport --dports 50 -j DROP

To configure the rule with NVUE:

cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip protocol udp
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip udp dest-port 50
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 action deny
cumulus@switch:~$ nv set interface swp2 acl EXAMPLE1 inbound control-plane
cumulus@switch:~$ nv config apply

Output Rule

The following rule blocks any TCP traffic with source port 123 and destination port 123 going from leaf02 to server02 (rule 4 in the diagram above).

[iptables] 
-A OUTPUT -o swp2 -p tcp -m multiport --sports 123 -m multiport --dports 123 -j DROP

To configure the rule with NVUE:

cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip protocol tcp
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip tcp source-port 123
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip tcp dest-port 123
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 action deny
cumulus@switch:~$ nv set interface swp2 acl EXAMPLE1 outbound control-plane
cumulus@switch:~$ nv config apply

Layer 2 Rules (ebtables)

The following rule blocks any traffic with source MAC address 00:00:00:00:00:12 and destination MAC address 08:9e:01:ce:e2:04 that enters or leaves any switch port.

[ebtables]
-A FORWARD -s 00:00:00:00:00:12 -d 08:9e:01:ce:e2:04 -j DROP

To configure the rule with NVUE:

cumulus@switch:~$ nv set acl EXAMPLE type mac
cumulus@switch:~$ nv set acl EXAMPLE rule 10 match mac source-mac 00:00:00:00:00:12
cumulus@switch:~$ nv set acl EXAMPLE rule 10 match mac dest-mac 08:9e:01:ce:e2:04
cumulus@switch:~$ nv set acl EXAMPLE rule 10 action deny
cumulus@switch:~$ nv set interface swp1-48 acl EXAMPLE inbound
cumulus@switch:~$ nv config apply

Considerations

Not All Rules Supported

Cumulus Linux does not support all iptables, ip6tables, or ebtables rules. Refer to Supported Rule Types for specific rule support.

ACL Log Policer Limits Traffic

To protect the CPU from overloading, Cumulus Linux limits traffic copied to the CPU to 1 packet per second by an ACL Log Policer.

Bridge Traffic Limitations

Bridge traffic that matches LOG ACTION rules does not log to syslog; the kernel and hardware identify packets using different information.

You Cannot Forward Log Actions

You cannot forward logged packets. The hardware cannot both forward a packet and send the packet to the control plane (or kernel) for logging. A log action must also have a drop action.

SPAN Sessions that Reference an Outgoing Interface

SPAN sessions that reference an outgoing interface create mirrored packets based on the ingress interface before the routing/switching decision. See SPAN Sessions that Reference an Outgoing Interface and Use the CPU Port as the SPAN Destination in the Network Troubleshooting section.

iptables Interactions with cl-acltool

Because Cumulus Linux is a Linux operating system, you can use the iptables commands. However, consider using cl-acltool instead for the following reasons:

For example, running the following command works:

cumulus@switch:~$ sudo iptables -A INPUT -p icmp --icmp-type echo-request -j DROP

The rules appear when you run cl-acltool -L:

cumulus@switch:~$ sudo cl-acltool -L ip
-------------------------------
Listing rules of type iptables:
-------------------------------
TABLE filter :
Chain INPUT (policy ACCEPT 72 packets, 5236 bytes)
pkts bytes target  prot opt in   out   source    destination

0     0 DROP    icmp --  any  any   anywhere  anywhere      icmp echo-request

However, running cl-acltool -i or rebooting the switch removes them. To ensure that Cumulus Linux can hardware accelerate all rules that can be in hardware, place them in the /etc/cumulus/acl/policy.conf file, then run cl-acltool -i.

Where to Assign Rules

Troubleshooting ACL Rule Installation Failures

On Spectrum-2 and later, in addition to ACLs, items stored in KVD and ATCAM include internal counters for VLANs and interfaces in a bridge. If the network includes more than 1000 VLAN interfaces, the counters might occupy a significant amount of space and reduce the amount of available space for ACLs.

If you use all the ACL space, you might see error messages similar to the following when you try to apply ACLs:

cumulus@switch:~$ sudo cl-acltool -i -p 00control_plane.rules
Using user provided rule file 00control_plane.rules
Reading rule file 00control_plane.rules ...
Processing rules in file 00control_plane.rules ...
error: hw sync failed (sync_acl hardware installation failed)
Installing acl policy... Rolling back ..
failed.
error: hw sync failed (Bulk counter init failed with No More Resources). Rolling back ..

You might also see messages similar to the following in the /var/log/syslog file:

2023-12-07T16:31:32.386792-05:00 mlx-4700-51 sx_sdk: 1951 [FLOW_COUNTER] [NOTICE ]:
Spectrum_flow_counter_bulk_set: cm_bulk_block_add failed toallocated bulk size 64

You might also see messages similar to the following in the /var/log/switchd.log file:

2023-12-07T16:31:32.387219-05:00 mlx-4700-51 switchd[7354]: hal_mlx_sdk_counter_wrap.c:366 ERR
sx_api_flow_counter_bulk_set create failed with: No More Resources
2023-12-07T16:31:32.387338-05:00 mlx-4700-51 switchd[7354]: hal_mlx_flx_acl.c:9531 ERR
flow_counter_bulk_set create failed with: No More Resources
2023-12-07T16:31:32.387415-05:00 mlx-4700-51 switchd[7354]: hal_mlx_flx_acl.c:3202 ERR BULK
counter init failed with No More Resources
2023-12-07T16:31:32.387481-05:00 mlx-4700-51 switchd[7354]: hal_mlx_flx_acl.c:2765
 hal_mlx_flx_chain_desc_install returned 0
2023-12-07T16:31:32.387554-05:00 mlx-4700-51 switchd[7354]: hal_mlx_flx_acl.c:1981 ERR
acl_plan_install returned 0
2023-12-07T16:31:32.393928-05:00 mlx-4700-51 switchd[7354]: sync_acl.c:225 ERR BULK counter init
failed with No More Resources
2023-12-07T16:31:32.394047-05:00 mlx-4700-51 switchd[7354]: sync_acl.c:6669 ERR BULK counter init
failed with No More Resources

For information on ACL resource limitations, refer to Hardware Limitations for ACL Rules.

On Spectrum-2 and later, you might see resource errors when you try to configure more than 1000 VLAN interfaces because certain VLAN counters share space with ACL memory in the ATCAM.

To free up resources, you can:

The flow counters are internal counters for debugging; you do not see the counters in nv show interface <interface> counters or cl-netstat commands.

To see how much space the flow counters consume, examine the Flow Counters line in the cl-resource-query output.

ACLs Do not Match when the Output Port on the ACL is a Subinterface

The ACL does not match on packets when you configure a subinterface as the output port. The ACL matches on packets only if the primary port is the output port. If a subinterface is an output or egress port, the packets match correctly.

For example:

-A FORWARD -o swp49s1.100 -j ACCEPT

Egress ACL Matching on Bonds

Cumulus Linux does not support ACL rules that match on an outbound bond interface. For example, you cannot create the following rule:

[iptables]
-A FORWARD -o <bond_intf> -j DROP

To work around this issue, duplicate the ACL rule on each physical port of the bond. For example:

[iptables]
-A FORWARD -o <bond-member-port-1> -j DROP
-A FORWARD -o <bond-member-port-2> -j DROP
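If you maintain many rules, you can script the workaround. The following Python sketch rewrites one rule that matches on an outbound bond into one rule per member port. The helper name expand_bond_rule is hypothetical, and the list of member ports is something you supply; the sketch does not discover bond members for you.

```python
def expand_bond_rule(rule, bond, members):
    """Return one copy of `rule` per member port, replacing the bond name.

    Only an exact '-o <bond>' (or '--out-interface <bond>') token pair is
    replaced, so a bond named bond1 never matches a rule for bond10.
    """
    tokens = rule.split()
    for i in range(len(tokens) - 1):
        if tokens[i] in ("-o", "--out-interface") and tokens[i + 1] == bond:
            return [
                " ".join(tokens[:i + 1] + [member] + tokens[i + 2:])
                for member in members
            ]
    raise ValueError(f"rule does not match on outbound interface {bond!r}")


if __name__ == "__main__":
    # bond1, swp1, and swp2 are illustrative names.
    for line in expand_bond_rule("-A FORWARD -o bond1 -j DROP", "bond1",
                                 ["swp1", "swp2"]):
        print(line)
```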

SSH Traffic to the Management VRF

To allow SSH traffic to the management VRF, use -i mgmt, not -i eth0. For example:

-A INPUT -i mgmt -s 10.0.14.2/32 -p tcp --dport ssh -j ACCEPT

INPUT Chain Rules and swp+

In INPUT chain rules, the -i swp+ match works only if the destination of the packet is towards a layer 3 swp interface; the match does not work if the packet terminates at an SVI interface (for example, vlan10). To allow traffic towards specific SVIs, use rules without any interface match or rules with individual -i <SVI> matches.

Services and Daemons in Cumulus Linux

Services (also known as daemons) and processes are at the heart of how a Linux system functions. Most of the time, a service takes care of itself; you just enable and start it, then let it run. However, because a Cumulus Linux switch is a Linux system, you can dig deeper if you like. Services can start multiple processes as they run. Services are important to monitor on a Cumulus Linux switch.

Managing services in Cumulus Linux includes identifying active or stopped services, checking the boot time state of a specific service, disabling or enabling a specific service, and identifying active listener ports.

systemd and the systemctl Command

You manage services that use systemd with the systemctl command.

| Command Option | Description |
|---|---|
| status | Returns the status of the specified service. |
| start | Starts the service. |
| stop | Stops the service. |
| restart | Stops, then starts the service, all the while maintaining state. If there are dependent services or services that mark the restarted service as Required, the other services also restart. For example, running systemctl restart frr.service restarts any of the routing protocol services that you enable and that are running, such as bgpd or ospfd. |
| reload | Reloads the configuration for the service. |
| enable | Enables the service to start when the system boots, but does not start it unless you use the systemctl start SERVICENAME.service command or reboot the switch. |
| disable | Disables the service, but does not stop it unless you use the systemctl stop SERVICENAME.service command or reboot the switch. You can start or stop a disabled service. |
| reenable | Disables, then enables a service. Run this command so that any new Wants or WantedBy lines create the symlinks necessary for ordering. This does not affect other services. |

You do not typically need to interact with the services directly using these commands. If a critical service crashes or encounters an error, systemd restarts it automatically. systemd is the caretaker of services in modern Linux systems and is responsible for starting all the necessary services at boot time.

The following example restarts the networking service:

cumulus@switch:~$ sudo systemctl restart networking.service

The following example shows all running services:

cumulus@switch:~$ sudo systemctl status
● switch
    State: running
      Jobs: 0 queued
    Failed: 0 units
    Since: Thu 2019-01-10 00:19:34 UTC; 23h ago
    CGroup: /
            ├─init.scope
            │ └─1 /sbin/init
            └─system.slice
              ├─haveged.service
              │ └─234 /usr/sbin/haveged --Foreground --verbose=1 -w 1024
              ├─sysmonitor.service
              │ ├─  658 /bin/bash /usr/lib/cumulus/sysmonitor
              │ └─26543 sleep 60
              ├─systemd-udevd.service
              │ └─218 /lib/systemd/systemd-udevd
              ├─system-ntp.slice
              │ └─ntp@mgmt.service
              │   └─vrf
              │     └─mgmt
              │       └─12108 /usr/sbin/ntpd -n -u ntp:ntp -g
              ├─cron.service
              │ └─274 /usr/sbin/cron -f -L 38
              ├─system-serial\x2dgetty.slice
              │ └─serial-getty@ttyS0.service
              │   └─745 /sbin/agetty -o -p -- \u --keep-baud 115200,38400,9600 ttyS0 vt220
              ├─nginx.service
              │ ├─332 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
              │ └─333 nginx: worker process
              ├─auditd.service
              │ └─235 /sbin/auditd
              ├─rasdaemon.service
              │ └─275 /usr/sbin/rasdaemon -f -r
              ├─clagd.service
              │ └─11443 /usr/bin/python /usr/sbin/clagd --daemon 169.254.1.2 peerlink.4094 44:39:39:ff:40:9
              --priority 100 --vxlanAnycas
              ├─switchd.service
              │ └─430 /usr/sbin/switchd -vx
              ...

To act on a specific service, add the service name after the systemctl command option; for example, systemctl status switchd.service.

Ensure a Service Starts after Multiple Restarts

By default, systemd tries to restart a particular service only a certain number of times within a given interval before the service fails to start. The settings StartLimitInterval (which defaults to 10 seconds) and StartLimitBurst (which defaults to 5 attempts) are in the service script; however, certain services override these defaults, sometimes with much longer times. For example, switchd.service sets StartLimitInterval=10m and StartLimitBurst=3; therefore, if you restart switchd more than three times in ten minutes, it does not start.

When the restart fails for this reason, you see a message similar to the following:

Job for switchd.service failed. See 'systemctl status switchd.service' and 'journalctl -xn' for details.

systemctl status switchd.service shows output similar to:

Active: failed (Result: start-limit) since Thu 2016-04-07 21:55:14 UTC; 15s ago

To clear this error, run systemctl reset-failed switchd.service. If you know you are going to restart frequently (multiple times within the StartLimitInterval), you can run the same command before you issue the restart request. This also applies to stop followed by start.
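To reason about whether a planned series of restarts trips the limit, you can model the rule described above. The following Python sketch is a simplified sliding-window model, not systemd's own accounting; the function name hits_start_limit is hypothetical, and the defaults mirror the values quoted in this section.

```python
def hits_start_limit(start_times, interval_s=10, burst=5):
    """Return True if more than `burst` starts fall inside any window of
    `interval_s` seconds.

    start_times: start or restart times in seconds (any common origin).
    This is a simplified model for planning purposes only.
    """
    times = sorted(start_times)
    left = 0
    for right, current in enumerate(times):
        # Drop starts that are a full interval (or more) older than `current`.
        while current - times[left] >= interval_s:
            left += 1
        if right - left + 1 > burst:
            return True
    return False


if __name__ == "__main__":
    # switchd.service values from this section: 10 minutes, 3 attempts.
    print(hits_start_limit([0, 60, 120, 180], interval_s=600, burst=3))
    print(hits_start_limit([0, 60, 120], interval_s=600, burst=3))
```

In this model, four switchd restarts one minute apart exceed the limit, while three do not. If a planned sequence exceeds the limit, run systemctl reset-failed switchd.service before the next restart, as described above.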

Keep systemd Services from Hanging after Starting

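If you start, restart, or reload a systemd service from within another systemd service, you must use the --no-block option with systemctl. Without this option, systemctl waits for the requested operation to finish, which can leave the calling service hanging.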

Identify Active Listener Ports for IPv4 and IPv6

You can identify the active listener ports under both IPv4 and IPv6 using the netstat command:

cumulus@switch:~$ netstat -nlp --inet --inet6
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:53              0.0.0.0:*               LISTEN      444/dnsmasq
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      874/sshd
tcp6       0      0 :::53                   :::*                    LISTEN      444/dnsmasq
tcp6       0      0 :::22                   :::*                    LISTEN      874/sshd
udp        0      0 0.0.0.0:28450           0.0.0.0:*                           839/dhclient
udp        0      0 0.0.0.0:53              0.0.0.0:*                           444/dnsmasq
udp        0      0 0.0.0.0:68              0.0.0.0:*                           839/dhclient
udp        0      0 192.168.0.42:123        0.0.0.0:*                           907/ntpd
udp        0      0 127.0.0.1:123           0.0.0.0:*                           907/ntpd
udp        0      0 0.0.0.0:123             0.0.0.0:*                           907/ntpd
udp        0      0 0.0.0.0:4784            0.0.0.0:*                           909/ptmd
udp        0      0 0.0.0.0:3784            0.0.0.0:*                           909/ptmd
udp        0      0 0.0.0.0:3785            0.0.0.0:*                           909/ptmd
udp6       0      0 :::58352                :::*                                839/dhclient
udp6       0      0 :::53                   :::*                                444/dnsmasq
udp6       0      0 fe80::a200:ff:fe00::123 :::*                                907/ntpd
udp6       0      0 ::1:123                 :::*                                907/ntpd
udp6       0      0 :::123                  :::*                                907/ntpd
udp6       0      0 :::4784                 :::*                                909/ptmd
udp6       0      0 :::3784                 :::*                                909/ptmd
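To compare listener ports across switches or against an expected list, you can parse the netstat output. The following Python sketch assumes the column layout shown above and a program name without spaces in the last column; the function name listening_ports and the SAMPLE text are illustrative.

```python
SAMPLE = """\
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      874/sshd
tcp6       0      0 :::22                   :::*                    LISTEN      874/sshd
udp        0      0 0.0.0.0:123             0.0.0.0:*                           907/ntpd
udp6       0      0 :::3784                 :::*                                909/ptmd
"""


def listening_ports(netstat_output):
    """Return a set of (protocol, port, program) tuples.

    Assumes the layout of `netstat -nlp --inet --inet6`: the local address is
    the fourth column and PID/Program is the last column. UDP rows have an
    empty State column, so the last column is located from the end.
    """
    ports = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 6 or not fields[0].startswith(("tcp", "udp")):
            continue  # skip the banner and header lines
        port = int(fields[3].rsplit(":", 1)[1])
        program = fields[-1].split("/", 1)[-1]
        ports.add((fields[0], port, program))
    return ports


if __name__ == "__main__":
    for entry in sorted(listening_ports(SAMPLE)):
        print(entry)
```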

Identify Active or Stopped Services

To see active or stopped services, run the cl-service-summary command:

cumulus@switch:~$ cl-service-summary
Service cron               enabled    active   
Service ssh                enabled    active   
Service rsyslog            enabled    active   
Service asic-monitor       enabled    inactive 
Service clagd              disabled   active   
Service cumulus-poe                   inactive 
Service lldpd              enabled    active   
Service mstpd              enabled    active   
Service neighmgrd          enabled    active   
Service netd               enabled    active   
Service netq-agent         disabled   inactive 
Service ntp                disabled   inactive 
Service portwd             enabled    inactive 
Service ptmd               enabled    active   
Service pwmd               enabled    active   
Service smond              enabled    active   
Service switchd            enabled    active   
Service sysmonitor         enabled    active   
Service vxrd                          inactive 
Service vxsnd                         inactive 
Service rdnbrd             disabled   inactive 
Service frr                enabled    active   
Service ntp@mgmt           disabled   inactive 
Service ntp@ntp            disabled   inactive
...
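A service that is enabled but inactive is often worth a closer look. The following Python sketch parses text in the format shown above and lists such services. It assumes that rows with only three fields (such as cumulus-poe) have no boot time state; the function names and the SAMPLE text are illustrative.

```python
SAMPLE = """\
Service cron               enabled    active
Service asic-monitor       enabled    inactive
Service clagd              disabled   active
Service cumulus-poe                   inactive
Service portwd             enabled    inactive
"""


def parse_service_summary(output):
    """Map service name -> (boot_state or None, run_state)."""
    services = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0] != "Service":
            continue
        name, run_state = fields[1], fields[-1]
        # Rows without an enabled/disabled column have only three fields.
        boot_state = fields[2] if len(fields) == 4 else None
        services[name] = (boot_state, run_state)
    return services


def enabled_but_inactive(services):
    """Return the sorted names of services that are enabled yet not running."""
    return sorted(
        name for name, (boot, run) in services.items()
        if boot == "enabled" and run == "inactive"
    )


if __name__ == "__main__":
    print(enabled_but_inactive(parse_service_summary(SAMPLE)))
```

An inactive state is not always a fault; some services stay inactive until you configure the feature that uses them.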

You can also run the systemctl list-unit-files --type service command to list all services on the switch and to see their status:

cumulus@switch:~$ systemctl list-unit-files --type service
UNIT FILE                                  STATE           VENDOR PRESET
aclinit.service                            enabled         enabled      
acltool.service                            enabled         enabled      
acpid.service                              disabled        enabled      
air-agent@.service                         indirect        enabled      
apt-daily-upgrade.service                  static          -            
apt-daily.service                          static          -            
asic-monitor.service                       enabled         enabled      
atftpd.service                             generated       -            
auditd.service                             enabled         enabled      
autovt@.service                            alias           -            
blk-availability.service                   enabled         enabled      
bmcd.service                               disabled        enabled      
bootlog.service                            enabled         enabled      
cl-system-services.service                 enabled         enabled      
clagd.service                              disabled        enabled      
clagd_rebootNotifier.service               disabled        enabled      
console-getty.service                      disabled        disabled     
console-setup.service                      enabled         enabled      
container-getty@.service                   static          -            
containerd.service                         disabled        enabled      
cron.service                               enabled         enabled      
cryptdisks-early.service                   masked          enabled      
cryptdisks.service                         masked          enabled      
csmgrd.service                             enabled         enabled      
cumulus-aclcheck.service                   static          -            
cumulus-cleanup-health_check.service       static          -            
cumulus-cleanup-lttng_traces.service       static          -            
cumulus-core.service                       static          -
...

Identify Essential Services

To identify which services must run when the switch boots:

cumulus@switch:~$ systemctl list-dependencies --before basic.target

To identify which services you need for networking:

cumulus@switch:~$ systemctl list-dependencies --after network.target
   ├─switchd.service
   └─network-pre.target

To identify the services needed for a multi-user environment, run:

cumulus@switch:~$ systemctl list-dependencies --before multi-user.target

 ●  ├─bootlog.service
   ├─systemd-readahead-done.service
   ├─systemd-readahead-done.timer
   ├─systemd-update-utmp-runlevel.service
   └─graphical.target
   └─systemd-update-utmp-runlevel.service

Important Services

The following table lists the most important services in Cumulus Linux.

| Service Name | Description | Affects Forwarding? |
|---|---|---|
| switchd | Hardware abstraction daemon. Synchronizes the kernel with the ASIC. | YES |
| sx_sdk | Interfaces with the Spectrum ASIC. Only on Spectrum switches. | YES |
| frr | FRR. Handles routing protocols. There are separate processes for each routing protocol, such as bgpd and ospfd. | YES if routing |
| clagd | Cumulus link aggregation daemon. Handles MLAG. | YES if using MLAG |
| neighmgrd | Keeps neighbor entries refreshed, snoops on ARP and ND packets if ARP suppression is on, and refreshes VRR MAC addresses. | YES |
| mstpd | Spanning tree protocol daemon. | YES if using layer 2 |
| ptmd | Prescriptive Topology Manager. Verifies cabling based on LLDP output. Also sets up BFD sessions. | YES if using BFD |
| nvued | Handles the NVUE object model. | NO |
| rsyslog | Handles logging of syslog messages. | NO |
| ntp | Network time protocol. | NO |
| ledmgrd | LED manager. Reads the state of system LEDs. | NO |
| sysmonitor | Watches and logs critical system load (free memory, disk, CPU). | NO |
| lldpd | Handles Tx/Rx of LLDP information. | NO |
| smond | Reads platform sensors and fan information from pwmd. | NO |
| pwmd | Reads and sets fan speeds. | NO |

Limit Resources for Services

You can configure a limit on memory and CPU usage for the following services to divide hardware resources up among applications and users, increasing overall efficiency.

To configure a limit on CPU usage, run the nv set service control <service-name-id> resource-limit cpu <percent> command.

The following example configures the syslog service to limit CPU usage to 60 percent:

cumulus@switch:~$ nv set service control rsyslog resource-limit cpu 60
cumulus@switch:~$ nv config apply

To configure a limit on memory usage, run the nv set service control <service-name-id> resource-limit memory <size> command.

The following example configures the DHCP service to limit memory usage to 6700M:

cumulus@switch:~$ nv set service control dhcpd resource-limit memory 6700M
cumulus@switch:~$ nv config apply

A value of 100 configures no limit on CPU usage for the service.

To show the current CPU and memory usage for all services, run the nv show service control command:

cumulus@switch:~$ nv show service control

To show the current CPU and memory usage for a specific service, run the nv show service control <service> command:

cumulus@switch:~$ nv show service control rsyslog

To show the configured resource limits for a specific service, run the nv show service control <service-name-id> resource-limit command:

cumulus@switch:~$ nv show service control rsyslog resource-limit

Configuring switchd

The switchd service enables the switch to communicate with Cumulus Linux and all the applications running on Cumulus Linux.

Configure switchd Settings

You can control certain options associated with the switchd process. For example, you can set polling intervals, optimize ACL hardware resources for better utilization, configure log message levels, set the internal VLAN range, and configure VXLAN encapsulation and decapsulation.

To configure switchd options, you either run NVUE commands or manually edit the /etc/cumulus/switchd.conf file.

NVUE currently only supports a subset of the switchd configuration available in the /etc/cumulus/switchd.conf file.

You can run NVUE commands to set the following switchd options:

  • The statistic polling interval for physical interfaces and for logical interfaces.
    • For physical interfaces, you can specify a value between 1 and 10. The default setting is 2 seconds.
    • For logical interfaces, you can specify a value between 1 and 30. The default setting is 5 seconds.

A low setting, such as 1, might affect system performance.

  • The log level to debug the data plane programming related code. You can specify debug, info, notice, warning, or error. The default setting is info. NVIDIA recommends that you do not set the log level to debug in a production environment.
  • The DSCP action and value for encapsulation. You can set the DSCP action to copy (to copy the value from the IP header of the packet), set (to specify a specific value), or derive (to obtain the value from the switch priority). The default action is derive. Only specify a value if the action is set.
  • The DSCP action for decapsulation in VXLAN outer headers. You can specify copy (to copy the value from the IP header of the packet), preserve (to keep the inner DSCP value), or derive (to obtain the value from the switch priority). The default action is derive.
  • The preference between a route and neighbor with the same IP address and mask. You can specify route, neighbor, or route-and-neighbor. The default setting is route.
  • The ACL mode (atomic or non-atomic). The default setting is atomic.
  • The reserved VLAN range. The default setting is 3725-3999.

Certain switchd settings require a switchd restart or reload. Before applying the settings, NVUE indicates if it requires a switchd restart or reload and prompts you for confirmation.

  • When the switchd service restarts, in addition to resetting the switch hardware configuration, all network ports reset.
  • When the switchd service reloads, there is no interruption to network services.

The following command example sets the statistic polling interval for both logical interfaces and physical interfaces to 6 seconds:

cumulus@switch:~$ nv set system counter polling-interval logical-interface 6
cumulus@switch:~$ nv set system counter polling-interval physical-interface 6
cumulus@switch:~$ nv config apply
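If you generate these commands from automation, you can check the value against the ranges listed earlier (1 to 10 seconds for physical interfaces, 1 to 30 seconds for logical interfaces) before you send anything to the switch. The following Python sketch does only that; the names LIMITS, is_valid_interval, and polling_interval_commands are hypothetical.

```python
# Allowed polling intervals in seconds, taken from the ranges in this section.
LIMITS = {"physical-interface": (1, 10), "logical-interface": (1, 30)}


def is_valid_interval(kind, seconds):
    """Return True if `seconds` is an integer within the range for `kind`."""
    low, high = LIMITS[kind]
    return isinstance(seconds, int) and low <= seconds <= high


def polling_interval_commands(intervals):
    """Build NVUE command strings from a {kind: seconds} mapping."""
    commands = []
    for kind, seconds in intervals.items():
        if not is_valid_interval(kind, seconds):
            raise ValueError(f"{kind} interval out of range: {seconds}")
        commands.append(
            f"nv set system counter polling-interval {kind} {seconds}"
        )
    commands.append("nv config apply")
    return commands


if __name__ == "__main__":
    for command in polling_interval_commands(
            {"logical-interface": 6, "physical-interface": 6}):
        print(command)
```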

The following command example sets the log level for debugging the data plane programming related code to warning:

cumulus@switch:~$ nv set system forwarding programming log-level warning
cumulus@switch:~$ nv config apply

The following command example sets the DSCP action for encapsulation in VXLAN outer headers to set and the value to af12:

cumulus@switch:~$ nv set nve vxlan encapsulation dscp action set
cumulus@switch:~$ nv set nve vxlan encapsulation dscp value af12
cumulus@switch:~$ nv config apply
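The value af12 is an Assured Forwarding code point name. Under the standard naming (RFC 2597), AFxy corresponds to the decimal DSCP value 8x + 2y, where x is the class (1 to 4) and y is the drop precedence (1 to 3). The following Python sketch applies that formula so you can relate the name to the number you see in a packet capture; the function name af_to_dscp is hypothetical.

```python
def af_to_dscp(name):
    """Convert an Assured Forwarding name such as 'af12' to its DSCP number.

    Uses the standard mapping DSCP = 8 * class + 2 * drop_precedence.
    """
    text = name.lower()
    if len(text) != 4 or not text.startswith("af") or not text[2:].isdigit():
        raise ValueError(f"not an AF code point name: {name!r}")
    af_class, drop = int(text[2]), int(text[3])
    if not (1 <= af_class <= 4 and 1 <= drop <= 3):
        raise ValueError(f"AF class or drop precedence out of range: {name!r}")
    return 8 * af_class + 2 * drop


if __name__ == "__main__":
    print(af_to_dscp("af12"))
```

For af12, the class is 1 and the drop precedence is 2, which gives DSCP 12.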

The following command example sets the DSCP action for decapsulation in VXLAN outer headers to preserve:

cumulus@switch:~$ nv set nve vxlan decapsulation dscp action preserve
cumulus@switch:~$ nv config apply

The following command example sets the route or neighbor preference to both route and neighbor:

cumulus@switch:~$ nv set system forwarding host-route-preference route-and-neighbour
cumulus@switch:~$ nv config apply

The following command example sets the ACL mode to non-atomic:

cumulus@switch:~$ nv set system acl mode non-atomic 
cumulus@switch:~$ nv config apply

  • On Spectrum-2 and later, NVUE reloads switchd after you run and apply the nv set system acl mode command.
  • On Spectrum 1 switches, NVUE restarts switchd after you run and apply the nv set system acl mode command.

The following command example sets the reserved VLAN range between 4064 and 4094:

cumulus@switch:~$ nv set system global reserved vlan internal range 4064-4094
cumulus@switch:~$ nv config apply
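Before you change the reserved range, it can help to confirm that none of the VLANs you already use fall inside it. The following Python sketch parses a range string, counts the VLANs it covers, and reports conflicts. It checks only the general VLAN ID bounds (1 to 4094) and does not model any minimum range size that Cumulus Linux might enforce; all function names are hypothetical.

```python
def parse_vlan_range(text):
    """Parse 'LOW-HIGH' into a (low, high) tuple of VLAN IDs."""
    low_text, high_text = text.split("-")
    low, high = int(low_text), int(high_text)
    if not 1 <= low <= high <= 4094:
        raise ValueError(f"invalid VLAN range: {text!r}")
    return low, high


def vlan_count(text):
    """Return how many VLAN IDs the range covers (inclusive)."""
    low, high = parse_vlan_range(text)
    return high - low + 1


def conflicting_vlans(reserved, vlans_in_use):
    """Return the sorted VLAN IDs in use that fall inside the reserved range."""
    low, high = parse_vlan_range(reserved)
    return sorted(vlan for vlan in vlans_in_use if low <= vlan <= high)


if __name__ == "__main__":
    print(vlan_count("3725-3999"))                         # default range
    print(conflicting_vlans("4064-4094", [10, 20, 4070]))  # illustrative VLANs
```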

To configure the switchd parameters, edit the /etc/cumulus/switchd.conf file. Change the setting and uncomment the line if needed. The switchd.conf file contains comments with a description for each setting.

The following example shows the first few lines of the /etc/cumulus/switchd.conf file.

cumulus@switch:~$ sudo nano /etc/cumulus/switchd.conf
#
# /etc/cumulus/switchd.conf - switchd configuration file
#
# Statistic poll interval (in msec)
#stats.poll_interval = 2000

# Buffer utilization poll interval (in msec), 0 means disable
#buf_util.poll_interval = 0

# Buffer utilization measurement interval (in mins)
#buf_util.measure_interval = 0

# Optimize ACL HW resources for better utilization
#acl.optimize_hw = FALSE

# Enable Flow based mirroring.
#acl.flow_based_mirroring = TRUE
...
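To audit which settings differ from the commented defaults, you can read the file with a short script. The following Python sketch assumes the key = value layout shown in the excerpt above, where a commented line such as #stats.poll_interval = 2000 documents a default and an uncommented line is an active setting. The function name parse_switchd_conf and the SAMPLE text are illustrative.

```python
SAMPLE = """\
# Statistic poll interval (in msec)
#stats.poll_interval = 2000

# Optimize ACL HW resources for better utilization
acl.optimize_hw = TRUE
"""


def parse_switchd_conf(text):
    """Return (active, commented) dictionaries of key -> value strings.

    active:    settings on uncommented lines
    commented: settings on lines that start with '#'
    Descriptive comments are skipped because they do not form a single
    'key = value' pair.
    """
    active, commented = {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        target = active
        if line.startswith("#"):
            line = line[1:].strip()
            target = commented
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if not key or " " in key:
            continue  # prose that happens to contain '='
        target[key] = value
    return active, commented


if __name__ == "__main__":
    active, commented = parse_switchd_conf(SAMPLE)
    print("active:", active)
    print("commented defaults:", commented)
```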

The following table describes the /etc/cumulus/switchd.conf file parameters. For each parameter, the table indicates whether you need to restart switchd with the sudo systemctl restart switchd.service command or reload switchd with the sudo systemctl reload switchd.service command for a change to take effect.

Restarting the switchd service causes all network ports to reset in addition to resetting the switch hardware configuration.

Parameter Description
switchd reload or restart
stats.poll_interval The statistics polling interval in milliseconds.The default setting is 2000. restart
buf_util.poll_interval The buffer utilization polling interval in milliseconds. 0 disables buffer utilization polling.The default setting is 0. restart
buf_util.measure_interval The buffer utilization measurement interval in minutes.The default setting is 0. restart
acl.optimize_hw Optimizes ACL hardware resources for better utilization.The default setting is FALSE. restart
acl.flow_based_mirroring Enables flow-based mirroring.The default setting is TRUE. restart
acl.non_atomic_update_mode Enables non atomic ACL updatesThe default setting is FALSE. Spectrum-2 and later: reload
Spectrum A1: restart
arp.next_hops Sends ARPs for next hops.The default setting is TRUE. restart
route.table The kernel routing table ID. The range is between 1 and 2^31. The default is 254. restart
route.host_max_percent The maximum neighbor table occupancy in hardware (a percentage of the hardware table size).The default setting is 100. restart
coalescing.reducer The coalescing reduction factor for accumulating changes to reduce CPU load.The default setting is 1. restart
coalescing.timeout The coalescing time limit in seconds.The default setting is 10. restart
ignore_non_swps Ignore routes that point to non-swp interfaces.The default setting is TRUE. restart
disable_internal_parity_restart Disables restart after a parity error.The default setting is TRUE. restart
disable_internal_hw_err_restart Disables restart after an unrecoverable hardware error.The default setting is FALSE. restart
nat.static_enable Enables static NAT.
The default setting is TRUE.
restart
nat.dynamic_enable Enables dynamic NAT.
The default setting is TRUE.
restart
nat.age_poll_interval The NAT age polling interval in minutes. The minimum is 1 minute and the maximum is 24 hours. You can configure this setting only when nat.dynamic_enable is set to TRUE.
The default setting is 5.
restart
nat.table_size The NAT table size limit in number of entries. You can configure this setting only when nat.dynamic_enable is set to TRUE.
The default setting is 1024.
restart
nat.config_table_size The NAT configuration table size limit in number of entries. You can configure this setting only when nat.dynamic_enable is set to TRUE.
The default setting is 64.
restart
logging Configures logging in the format BACKEND=LEVEL. Separate multiple BACKEND=LEVEL pairs with a space. The BACKEND value can be stderr, file:filename, syslog, program:executable. The LEVEL value can be CRIT, ERR, WARN, INFO, DEBUG.The default value is syslog=INFO restart
interface.<interface>.storm_control.broadcast Enables broadcast storm control and sets the number of packets per second (pps).The default setting is 400. reload
interface.<interface>.storm_control.multicast Enables multicast storm control and sets the number of packets per second (pps).The default setting is 3000. reload
interface.<interface>.storm_control.unknown_unicast Enables unicast storm control and sets the number of packets per second (pps).The default setting is 2000. reload
stats.vlan.aggregate Enables hardware statistics for VLANs and specifies the type of statistics needed. You can specify NONE, BRIEF, or DETAIL.The default setting is BRIEF. restart
stats.vxlan.aggregate Enables hardware statistics for VXLANs and specifies the type of statistics needed. You can specify NONE, BRIEF, or DETAIL. The default setting is DETAIL. restart
stats.vxlan.member Enables hardware statistics for VXLAN members and specifies the type of statistics needed. You can specify NONE, BRIEF, or DETAIL.The default setting is BRIEF. restart
stats.vlan.show_internal_vlans Show internal VLANs.The default setting is FALSE. restart
stats.vdev_hw_poll_interval The polling interval in seconds for virtual device hardware statisitcs.The default setting is 5. restart
resv_vlan_range The internal VLAN range.The default setting is 3725-3999. restart
netlink.buf_size The netlink socket buffer size in MB.The default setting is 136314880. restart
route.delete_dead_routes Delete routes on interfaces when the carrier is down.The default setting is TRUE. restart
vxlan.default_ttl The default TTL to use in VXLAN headers.The default setting is 64. restart
bridge.broadcast_frame_to_cpu Enables bridge broadcast frames to the CPU even if the SVI is not enabled.The default setting is FALSE. restart
bridge.unreg_mcast_init Initialize the prune module for IGMP snooping unregistered layer 2 multicast flood control.The default setting is FALSE. restart
bridge.unreg_v4_mcast_prune Enables unregistered layer 2 multicast prune to mrouter ports (IPv4).The default setting is FALSE (flood unregistered layer 2 multicast traffic). restart
bridge.unreg_v6_mcast_prune Enables unregistered layer 2 multicast prune to mrouter ports (IPv6).The default setting is FALSE (flood unregistered layer 2 multicast traffic). restart
netlink libnl logger The default setting is [0-5]. restart
netlink.nl_logger The default setting is 0. restart
vxlan.def_encap_dscp_action Sets the default VXLAN router DSCP action during encapsulation. You can specify copy if the inner packet is IP, set to set a specific value, or derive to derive the value from the switch priority.The default setting is derive. reload
vxlan.def_encap_dscp_value Sets the default VXLAN encapsulation DSCP value if the action is set. reload
vxlan.def_decap_dscp_action Sets the default VXLAN router DSCP action during decapsulation. You can specify copy if the inner packet is IP, preserve to preserve the inner DSCP value, or derive to derive the value from the switch priority.The default setting is derive. reload
ipmulticast.unknown_ipmc_to_cpu Enables sending unknown IPMC to the CPU.The default setting is FALSE. restart
vrf_route_leak_enable_dynamic Enables dynamic VRF route leaking.The default setting is FALSE. restart
sync_queue_depth_val The event queue depth.The default setting is 50000. restart
route.route_preferred_over_neigh Sets the preference between a route and neighbor with the same IP address and mask. You can specify TRUE to prefer the route over the neighbor, FALSE to prefer the neighbor over the route, or BOTH to install both the route and neighbor.The default setting is TRUE. reload
evpn.multihoming.enable Enables EVPN multihoming.The default setting is TRUE. restart
evpn.multihoming.shared_l2_groups Enables sharing for layer 2 next hop groups.The default setting is FALSE. restart
evpn.multihoming.shared_l3_groups Enables sharing for layer 3 next hop groups.The default setting is FALSE. restart
evpn.multihoming.fast_local_protect Enables fast reroute for egress link protection. The default setting is FALSE. restart
evpn.multihoming.bum_sph_filter Sets split-horizon filtering for EVPN multihoming. You can specify TRUE to filter only BUM traffic from the Ethernet segment (ES) peer or FALSE to filter all traffic from the ES peer. The default setting is TRUE. restart
link_flap_window The duration in seconds during which a link must flap the number of times set in the link_flap_threshold before Cumulus Linux sets the link to protodown and specifies linkflap as the reason. The default setting is 10. A value of 0 disables link flap protection. restart
link_flap_threshold The number of times the link must flap within the link flap window before Cumulus Linux sets the link to protodown and specifies linkflap as the reason. The default setting is 5. A value of 0 disables link flap protection. restart
res_usage_warn_threshold Sets the percentage over which forwarding resources (routes, hosts, MAC addresses) must go before Cumulus Linux generates a warning. You can set a value between 50 and 95. The default setting is 90. restart
res_warn_msg_int The time interval in seconds between resource warning messages. Warning messages generate only one time in the specified interval per resource type even if the threshold falls below or goes over the value set in res_usage_warn_threshold multiple times during this interval. You can set a value between 60 and 3600. The default setting is 300. restart
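
The interaction between res_usage_warn_threshold and res_warn_msg_int can be pictured with a short Python sketch. This is only an illustration of the behavior described in the table, not switchd code: the ResourceWarner class and its method names are hypothetical, and the sketch assumes that a warning fires when usage is strictly above the threshold and that each resource type keeps its own timer.

```python
class ResourceWarner:
    """Illustrative model of threshold-based, rate-limited resource warnings."""

    def __init__(self, threshold_pct=90, interval_s=300):
        # Ranges taken from the table: 50-95 percent and 60-3600 seconds.
        if not 50 <= threshold_pct <= 95:
            raise ValueError("threshold must be between 50 and 95")
        if not 60 <= interval_s <= 3600:
            raise ValueError("interval must be between 60 and 3600")
        self.threshold_pct = threshold_pct
        self.interval_s = interval_s
        self.last_warned = {}  # resource type -> time of the last warning

    def should_warn(self, resource, usage_pct, now):
        """Return True if a warning would be emitted at time `now` (seconds)."""
        if usage_pct <= self.threshold_pct:
            return False
        last = self.last_warned.get(resource)
        if last is not None and now - last < self.interval_s:
            return False  # already warned for this resource in this interval
        self.last_warned[resource] = now
        return True


warner = ResourceWarner()
print(warner.should_warn("routes", 95, now=0))    # first crossing
print(warner.should_warn("routes", 95, now=100))  # same interval, suppressed
print(warner.should_warn("hosts", 95, now=100))   # different resource type
```

With the default values, the second call is suppressed because it falls inside the 300 second interval for the same resource type, while the third call warns because hosts has its own timer.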

Show switchd Settings

You can run the following NVUE commands to show the current switchd configuration settings.

Command
Description
nv show system counter polling-interval Shows the polling interval for physical and logical interface counters in seconds.
nv show system forwarding programming Shows the log level for data plane programming logs.
nv show nve vxlan encapsulation dscp Shows the DSCP action and value (if the action is set) for the outer header in VXLAN encapsulation.
nv show nve vxlan decapsulation dscp Shows the DSCP action for the outer header in VXLAN decapsulation.
nv show system acl Shows the ACL mode (atomic or non-atomic).
nv show system global reserved vlan internal Shows the reserved VLAN range.

The following example command shows that the polling interval setting for logical interface counters is 6 seconds:

cumulus@switch:~$ nv show system counter polling-interval
                   applied  description
-----------------  -------  -----------------------------------------------------
logical-interface  0:00:06  Config polling-interval for logical interface(in sec)

The following example command shows that the log level setting for data plane programming logs is warning:

cumulus@switch:~$ nv show system forwarding programming
           applied  description
---------  -------  -------------------
log-level  warning  configure Log-level

The following example command shows that the DSCP action setting for the outer header in VXLAN encapsulation is set and the value is af12.

cumulus@switch:~$ nv show nve vxlan encapsulation dscp
        operational  applied  description
------  -----------  -------  --------------------------------------------------
action  set          set      DSCP encapsulation action
value   af12         af12     Configured DSCP value to put in outer Vxlan packet

The following command example shows that ACL mode is atomic:

cumulus@switch:~$ nv show system acl
      applied  description
----  -------  -----------------------------------------
mode  atomic   configure Atomic or Non-Atomic ACL update

The following command example shows that the reserved VLAN range is between 4064 and 4094:

cumulus@switch:~$ nv show system global reserved vlan internal
       operational  applied    description
-----  -----------  ---------  -------------------
range  4064-4094    4064-4094  Reserved Vlan range

In addition to restarting switchd when you change certain /etc/cumulus/switchd.conf file parameters manually, you also need to restart switchd whenever you modify a switchd hardware configuration file (any *.conf file that requires making a change to the switching hardware, such as /etc/cumulus/datapath/traffic.conf). You do not have to restart the switchd service when you update a network interface configuration (for example, when you edit the /etc/network/interfaces file).

Configuring a Global Proxy

You configure global HTTP and HTTPS proxies in the /etc/profile.d/ directory of Cumulus Linux. Set the http_proxy and https_proxy variables to configure the switch with the address of the proxy server you want to use to get URLs on the command line. This is useful for programs such as apt, apt-get, curl and wget, which can all use this proxy.

  1. In a terminal, create a new file in the /etc/profile.d/ directory.

    cumulus@switch:~$ sudo nano /etc/profile.d/proxy.sh
    
  2. Add a line to the file to configure either an HTTP or an HTTPS proxy, or both:

    • HTTP proxy:

      http_proxy=http://myproxy.domain.com:8080
      export http_proxy
      
    • HTTPS proxy:

      https_proxy=https://myproxy.domain.com:8080
      export https_proxy
      
  3. Create a file in the /etc/apt/apt.conf.d directory and add the following lines to the file to get the HTTP and HTTPS proxies. The example below uses http_proxy as the file name:

    cumulus@switch:~$ sudo nano /etc/apt/apt.conf.d/http_proxy
    Acquire::http::Proxy "http://myproxy.domain.com:8080";
    Acquire::https::Proxy "https://myproxy.domain.com:8080";
    
  4. Add the proxy addresses to the /etc/wgetrc file, then uncomment the http_proxy and https_proxy lines, if necessary:

    cumulus@switch:~$ sudo nano /etc/wgetrc
    ...
    https_proxy = https://myproxy.domain.com:8080
    http_proxy = http://myproxy.domain.com:8080
    ...
    
  5. To execute the /etc/profile.d/proxy.sh file in the current environment, run the source command:

    cumulus@switch:~$ source /etc/profile.d/proxy.sh
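
If you manage several switches, it can help to generate the file contents from a single proxy URL pair so the three files stay consistent. The following Python sketch only builds the text shown in the steps above; it does not write any files. The render_proxy_files function name is hypothetical, and the /etc/wgetrc entry contains only the two lines to add or uncomment, not the whole file.

```python
def render_proxy_files(http_proxy: str, https_proxy: str) -> dict[str, str]:
    """Return {path: text} for the proxy settings used in the steps above."""
    return {
        "/etc/profile.d/proxy.sh": (
            f"http_proxy={http_proxy}\n"
            "export http_proxy\n"
            f"https_proxy={https_proxy}\n"
            "export https_proxy\n"
        ),
        "/etc/apt/apt.conf.d/http_proxy": (
            f'Acquire::http::Proxy "{http_proxy}";\n'
            f'Acquire::https::Proxy "{https_proxy}";\n'
        ),
        # Fragment only: the rest of /etc/wgetrc stays unchanged.
        "/etc/wgetrc": (
            f"https_proxy = {https_proxy}\n"
            f"http_proxy = {http_proxy}\n"
        ),
    }


files = render_proxy_files(
    "http://myproxy.domain.com:8080",
    "https://myproxy.domain.com:8080",
)
for path, text in files.items():
    print(f"# {path}")
    print(text)
```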
    

Use the echo command to confirm the configuration:

Set up an apt package cache

In Service System Upgrade - ISSU

Use ISSU to upgrade and troubleshoot an active switch with minimal disruption to the network.

ISSU includes the following modes:

In earlier Cumulus Linux releases, ISSU was called Smart System Manager.

Restart Mode

You can configure the switch to restart in one of the following modes.

Cumulus Linux supports:

  • Fast mode for all protocols.
  • Warm mode for 802.1X, layer 2 forwarding, layer 3 forwarding with BGP, static routing, and VXLAN routing with EVPN. Cumulus Linux does not support warm boot with EVPN MLAG or EVPN multihoming.

NVIDIA recommends you use NVUE commands to configure restart mode and reboot the system. If you prefer to use csmgrctl commands, you must stop NVUE from managing the /etc/cumulus/csmgrd.conf file before you set restart mode:

  1. Run the following NVUE commands:

    cumulus@switch:~$ nv set system config apply ignore /etc/cumulus/csmgrd.conf
    cumulus@switch:~$ nv config apply
    
  2. Edit the /etc/cumulus/csmgrd.conf file and set the csmgrctl_override option to true:

    cumulus@switch:~$ sudo nano /etc/cumulus/csmgrd.conf
    csmgrctl_override=true
    ...
    
  3. Save the configuration:

    cumulus@switch:~$ nv config save
    

The following command configures the switch to restart in cold mode:

cumulus@switch:~$ nv set system reboot mode cold
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo csmgrctl -c

The following command configures the switch to restart in fast mode:

cumulus@switch:~$ nv set system reboot mode fast
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo csmgrctl -f

The following command configures the switch to restart in warm mode:

cumulus@switch:~$ nv set system reboot mode warm
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo csmgrctl -w

To reboot the switch in the restart mode you configure above with NVUE:

cumulus@switch:~$ nv action reboot system no-confirm

You must specify no-confirm at the end of the command.

To show system reboot information, such as the reboot date and time, reason, and reset mode (fast, cold, warm), run the NVUE nv show system reboot command:

cumulus@switch:~$ nv show system reboot
           operational                       applied  pending
---------  --------------------------------  -------  -------
reason                                                       
  gentime  2023-04-26T15:11:23.140569+00:00                  
  reason   Unknown                                           
  user     system/root
  mode     cold                              cold
  required no

Upgrade Mode

Upgrade mode updates all the components and services on the switch to the latest Cumulus Linux minor release without impacting traffic. After the upgrade is complete, you must restart the switch with a warm, cold, or fast restart.

If the switch is in warm restart mode, restarting the switch after an upgrade does not result in traffic loss (this is a hitless upgrade).

Upgrade mode includes the following options:

The following command upgrades all the system components to the latest release:

cumulus@switch:~$ nv action upgrade system packages to latest use-vrf default

By default, the NVUE nv action upgrade system packages command runs in the management VRF. To run the command in a non-management VRF such as default, you must use the use-vrf <vrf> option.

cumulus@switch:~$ sudo csmgrctl -u

The following command provides information on the components you want to upgrade:

cumulus@switch:~$ nv action upgrade system packages to latest use-vrf default dry-run

By default, the NVUE nv action upgrade system packages command runs in the management VRF. To run the command in a non-management VRF such as default, you must use the use-vrf <vrf> option.

cumulus@switch:~$ sudo csmgrctl -d

Maintenance Mode

Maintenance mode globally manages the BGP and MLAG control plane.

To enable maintenance mode:

cumulus@switch:~$ nv action enable system maintenance mode
Action executing ...
System maintenance mode has been enabled successfully
 Current System Mode: Maintenance, cold  
 Maintenance mode since Thu Jun 13 23:59:47 2024 (Duration: 00:00:00)
 Ports shutdown for Maintenance
 frr             : Maintenance, cold, down, up time: 29:06:27
 switchd         : Maintenance, cold, down, up time: 29:06:31
 System Services : Maintenance, cold, down, up time: 29:07:00

Action succeeded
cumulus@switch:~$ sudo csmgrctl -m1

To disable maintenance mode:

cumulus@switch:~$ nv action disable system maintenance mode
Action executing ...
System maintenance mode has been disabled successfully
 Current System Mode: cold  
 frr             : cold, up, up time: 12:57:48 (1 restart)
 switchd         : cold, up, up time: 13:12:13
 System Services : cold, up, up time: 13:12:32
Action succeeded
cumulus@switch:~$ sudo csmgrctl -m0

Before you disable maintenance mode, be sure to bring the ports back up.

To show maintenance mode status, run either the NVUE nv show system maintenance command or the Linux sudo csmgrctl -s command:

cumulus@switch:~$ nv show system maintenance 
       operational
-----  -----------
mode   enabled   
ports  disabled 
cumulus@switch:~$ sudo csmgrctl -s
Current System Mode: cold  
 frr             : cold, up, up time: 00:14:51 (2 restarts)
 clagd           : cold, up, up time: 00:14:47
 switchd         : cold, up, up time: 01:09:48
 System Services : cold, up, up time: 01:10:07

Maintenance Ports

Maintenance ports globally disables or enables all configured ports.

To enable maintenance ports:

cumulus@switch:~$ nv action enable system maintenance ports
Action executing ...
System maintenance ports has been enabled successfully
 Current System Mode: cold  
 frr             : cold, up, up time: 28:54:36
 switchd         : cold, up, up time: 28:54:40
 System Services : cold, up, up time: 28:55:09

Action succeeded
cumulus@switch:~$ sudo csmgrctl -p0

To disable maintenance ports:

cumulus@switch:~$ nv action disable system maintenance ports
Action executing ...
System maintenance ports has been disabled successfully
 Current System Mode: cold  
 Ports shutdown for Maintenance
 frr             : cold, up, up time: 28:55:49
 switchd         : cold, up, up time: 28:55:53
 System Services : cold, up, up time: 28:56:22

Action succeeded
cumulus@switch:~$ sudo csmgrctl -p1

To see the status of maintenance ports, run the NVUE nv show system maintenance command:

cumulus@switch:~$ nv show system maintenance 
       operational
-----  -----------
mode   enabled   
ports  disabled 

Layer 1 and Switch Ports

This section discusses the following layer 1 and switch port configuration:

CLI Configuration

Cumulus Linux provides several options to configure the CLI; you can set a CLI session timeout, and enable and configure the pager.

Set the CLI Session Timeout

To reduce the window of opportunity for unauthorized user access to an unattended CLI session on the switch, or to end an inactive session and release the resources associated with it, set the CLI session to exit after a certain amount of idle time.

Run the nv set system cli inactive-timeout <minutes> command. You can set the CLI session timeout to a value between 0 and 86400 minutes. The default value is 0 (disabled).

cumulus@switch:~$ nv set system cli inactive-timeout 300
cumulus@switch:~$ nv config apply

Create a file in the /etc/profile.d/ directory and add the following lines with the TMOUT value in seconds:

cumulus@switch:~$ sudo nano /etc/profile.d/tmout.sh
...
readonly TMOUT=18000
export TMOUT
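
The NVUE command takes the timeout in minutes, while the TMOUT variable takes seconds, so the 300 minutes in the NVUE example correspond to the TMOUT value of 18000 above. A minimal Python sketch of the conversion follows; the function name is hypothetical and the range check simply mirrors the 0 to 86400 minute range stated for the NVUE command.

```python
def inactive_timeout_to_tmout(minutes: int) -> int:
    """Convert an NVUE inactive-timeout (minutes) to a TMOUT value (seconds)."""
    if not 0 <= minutes <= 86400:
        raise ValueError("inactive-timeout must be between 0 and 86400 minutes")
    return minutes * 60


print(inactive_timeout_to_tmout(300))  # 18000
```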

Configure the CLI Pager

The CLI pager enables you to view the contents of a large file or the output of an NVUE command one page at a time in the terminal window, using the up and down arrow keys or the space bar.

To configure the CLI pager, set the pager state and the pager options.

cumulus@switch:~$ nv set system cli pagination state enabled
cumulus@switch:~$ nv set system cli pagination pager more
cumulus@switch:~$ nv config apply

Edit the NVUE_PAGINATE and the NVUE_PAGER values in the /etc/profile.d/nvue_cli.sh file.

cumulus@switch:~$ sudo nano /etc/profile.d/nvue_cli.sh
...
export NVUE_PAGINATE=on
export NVUE_PAGER=more

Show CLI Settings

To show the current CLI settings, run the nv show system cli command:

cumulus@switch:~$ nv show system cli
                  applied
----------------  -------
inactive-timeout  300  
pagination               
  state           enabled
  pager           more

To show the configured pager options only, run the nv show system cli pagination command:

cumulus@switch:~$ nv show system cli pagination
       applied
-----  -------
state  enabled
pager  more

Interface Configuration and Management

This section discusses how to configure the interfaces on the switch.

Cumulus Linux (including NVUE) manages network interfaces with ifupdown2, a new implementation of the Debian network interface manager ifupdown.

Bring an Interface Up or Down

An interface status can be in an:

The carrier state is the lower layer state of an interface. For a switch port, the carrier state represents if the switch port is enabled at the ASIC level and a cable is connected successfully. For a virtual interface, the carrier state involves the operational state of lower-level interfaces. For example, for a VLAN interface, the carrier state depends on the underlying bridge device operational state.

The operational state is a function of the administrative state, the carrier state, and other link states; it always depends on the administrative state and the carrier state.

To configure and bring an interface up administratively, use the nv set interface command:

cumulus@switch:~$ nv set interface swp1
cumulus@switch:~$ nv config apply

After you bring up an interface, you can bring it down administratively by changing the link state to down:

cumulus@switch:~$ nv set interface swp1 link state down
cumulus@switch:~$ nv config apply

To bring the interface back up, change the link state back to up:

cumulus@switch:~$ nv set interface swp1 link state up
cumulus@switch:~$ nv config apply

To remove an interface from the configuration entirely, use the nv unset interface command:

cumulus@switch:~$ nv unset interface swp1
cumulus@switch:~$ nv config apply

NVUE applies only current configuration changes instead of processing the entire /etc/network/interfaces file.

To configure and bring an interface up administratively, edit the /etc/network/interfaces file to add the interface stanza, then run the ifreload -a command:

cumulus@switch:~$ sudo nano /etc/network/interfaces
auto lo
iface lo inet loopback
    address 10.10.10.1/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
...

To bring an interface down administratively after you configure it, add link-down yes to the interface stanza in the /etc/network/interfaces file, then run ifreload -a:

auto swp1
iface swp1
 link-down yes

If you configure an interface in the /etc/network/interfaces file, you can bring it down administratively with the ifdown swp1 command, then bring the interface back up with the ifup swp1 command. These changes do not persist after a reboot. After a reboot, the configuration present in /etc/network/interfaces takes effect.

  • By default, the ifdown and ifup commands are quiet. Use the verbose option (-v) to show commands as they execute when you bring an interface down or up.
  • For configurations at scale, you can run the ifreload -a --diff command to apply only current configuration changes instead of processing the entire /etc/network/interfaces file.

To remove an interface from the configuration entirely, remove the interface stanza from the /etc/network/interfaces file, then run the ifreload -a command.

For additional information on interface administrative state and physical state, refer to this knowledge base article.

Loopback Interface

Cumulus Linux has a preconfigured loopback interface. When the switch boots up, the loopback interface called lo is up and assigned an IP address of 127.0.0.1.

The loopback interface lo must always exist on the switch and must always be up.

To configure an IP address for the loopback interface:

cumulus@switch:~$ nv set interface lo ip address 10.10.10.1
cumulus@switch:~$ nv config apply

Edit the /etc/network/interfaces file to add an address line:

auto lo
iface lo inet loopback
    address 10.10.10.1

  • If the IP address has no subnet mask, it automatically becomes a /32 IP address. For example, 10.10.10.1 is 10.10.10.1/32.
  • You can configure multiple IP addresses for the loopback interface.
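
The /32 default for an IPv4 address without a mask matches how Python's standard ipaddress module treats a bare host address, which makes it a convenient way to check what prefix an address line resolves to. This is a small illustration only; it does not query the switch.

```python
import ipaddress

# A bare IPv4 address is treated as a host address with a /32 prefix.
loopback = ipaddress.ip_interface("10.10.10.1")
print(loopback)                    # 10.10.10.1/32
print(loopback.network.prefixlen)  # 32

# An explicit mask is kept as written.
explicit = ipaddress.ip_interface("10.10.10.1/32")
print(loopback == explicit)        # True
```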

Subinterfaces

On Linux, an interface is a network device that can be either physical (for example, swp1) or virtual (for example, vlan100). A VLAN subinterface is a VLAN device on an interface, and the VLAN ID appends to the parent interface using dot (.) VLAN notation. For example, a VLAN with ID 100 that is a subinterface of swp1 is swp1.100. The dot VLAN notation for a VLAN device name is a standard way to specify a VLAN device on Linux.

A VLAN subinterface only receives traffic tagged for that VLAN; therefore, swp1.100 only receives packets that have a VLAN 100 tag on switch port swp1. Any packets that transmit from swp1.100 have a VLAN 100 tag.
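
The dot notation is easy to split programmatically, which can be useful in scripts that audit interface names. The following Python sketch is an illustration only; the parse_subinterface function is hypothetical, and the 1 to 4094 check reflects the usable 802.1Q VLAN ID range rather than any Cumulus Linux specific limit.

```python
def parse_subinterface(name: str) -> tuple[str, int]:
    """Split a name such as 'swp1.100' into its parent interface and VLAN ID."""
    parent, sep, vlan = name.rpartition(".")
    if not sep or not parent or not vlan.isdigit():
        raise ValueError(f"{name!r} is not in <parent>.<vlan-id> form")
    vlan_id = int(vlan)
    if not 1 <= vlan_id <= 4094:
        raise ValueError(f"VLAN ID {vlan_id} is outside the range 1-4094")
    return parent, vlan_id


print(parse_subinterface("swp1.100"))  # ('swp1', 100)
```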

The following example configures a routed subinterface on swp1 in VLAN 100:

cumulus@switch:~$ nv set interface swp1.100 ip address 192.168.100.1/24
cumulus@switch:~$ nv config apply

Edit the /etc/network/interfaces file, then run ifreload -a:

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto swp1.100
iface swp1.100
 address 192.168.100.1/24
cumulus@switch:~$ sudo ifreload -a

  • If you are using a VLAN subinterface, do not add that VLAN under the bridge stanza.
  • You cannot use NVUE commands to create a routed subinterface for VLAN 1.

Interface IP Addresses

You can specify both IPv4 and IPv6 addresses for the same interface.

For IPv6 addresses:

The following example commands configure three IP addresses for swp1: two IPv4 addresses and one IPv6 address.

cumulus@switch:~$ nv set interface swp1 ip address 10.0.0.1/30
cumulus@switch:~$ nv set interface swp1 ip address 10.0.0.2/30
cumulus@switch:~$ nv set interface swp1 ip address 2001:DB8::1/126
cumulus@switch:~$ nv config apply

To show the MAC address for an interface, run the nv show interface <interface> link command.

In the /etc/network/interfaces file, list all IP addresses under the iface section.

auto swp1
iface swp1
    address 10.0.0.1/30
    address 10.0.0.2/30
    address 2001:DB8::1/126

The address family and address method are not mandatory; they default to inet/inet6 and static, respectively. However, you must specify inet/inet6 when you are creating DHCP or loopback interfaces.

auto lo
iface lo inet loopback

To make non-persistent changes to interfaces at runtime, use ip addr add:

cumulus@switch:~$ sudo ip addr add 10.0.0.1/30 dev swp1
cumulus@switch:~$ sudo ip addr add 2001:DB8::1/126 dev swp1

To remove an address from an interface, use ip addr del:

cumulus@switch:~$ sudo ip addr del 10.0.0.1/30 dev swp1
cumulus@switch:~$ sudo ip addr del 2001:DB8::1/126 dev swp1

Interface MAC Addresses

You can configure a MAC address for an interface with the nv set interface <interface> link mac-address <mac-address> command.

The following command configures swp1 with MAC address 00:02:00:00:00:05:

cumulus@switch:~$ nv set interface swp1 link mac-address 00:02:00:00:00:05
cumulus@switch:~$ nv config apply

The following command configures vlan10 with MAC address 00:00:5E:00:01:00:

cumulus@switch:~$ nv set interface vlan10 link mac-address 00:00:5E:00:01:00
cumulus@switch:~$ nv config apply

To unset the MAC address for an interface, run the nv unset interface <interface> link mac-address command. This command resets the MAC address to the system assigned address.

cumulus@switch:~$ nv unset interface swp1 link mac-address
cumulus@switch:~$ nv config apply

In the /etc/network/interfaces file, add a MAC address for the interface in the interface stanza, then run ifreload -a.

The following example configures swp1 with MAC address 00:02:00:00:00:05:

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto swp1
iface swp1
    address 10.0.0.2/24
    hwaddress 00:02:00:00:00:05
...
cumulus@switch:~$ sudo ifreload -a

The following example configures vlan10 with MAC address 00:00:5E:00:01:00:

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto vlan10
iface vlan10
    address 10.1.10.5/24
    hwaddress 00:00:5E:00:01:00
cumulus@switch:~$ sudo ifreload -a

To unset the MAC address for an interface, remove the hwaddress line from the interface stanza, then run the sudo ifreload -a command.

Interface Descriptions

You can add a description (alias) to an interface.

Interface descriptions also appear in the SNMP OID IF-MIB::ifAlias.

  • Interface descriptions can have a maximum of 256 characters.
  • Avoid using apostrophes or non-ASCII characters. Cumulus Linux does not parse these characters.

The following example commands create the description hypervisor_port_1 for swp1:

cumulus@switch:~$ nv set interface swp1 description hypervisor_port_1
cumulus@switch:~$ nv config apply

In the /etc/network/interfaces file, add a description using the alias keyword:

cumulus@switch:~$ sudo nano /etc/network/interfaces

auto swp1
iface swp1
    alias hypervisor_port_1

Interface Commands

You can specify user commands for an interface that run at pre-up, up, post-up, pre-down, down, and post-down.

You can add any valid command in the sequence to bring an interface up or down; however, limit the scope to network-related commands associated with the particular interface. For example, it does not make sense to install a Debian package on ifup of swp1, even though it is technically possible. See man interfaces for more details.

The following example adds a command to an interface to enable proxy ARP:

NVUE does not provide commands to configure this feature.

cumulus@switch:~$ sudo nano /etc/network/interfaces
auto swp1
iface swp1
    address 12.0.0.1/30
    post-up echo 1 > /proc/sys/net/ipv4/conf/swp1/proxy_arp

If your post-up command also starts, restarts, or reloads any systemd service, you must use the --no-block option with systemctl. Otherwise, that service or even the switch itself might hang after starting or restarting. For example, to restart the dhcrelay service after bringing up a VLAN, the /etc/network/interfaces configuration looks like this:

auto bridge.100
iface bridge.100
    post-up systemctl --no-block restart dhcrelay.service

Port Ranges

To specify port ranges in commands:

Use commas to separate different port ranges.

The following example configures the default bridge br_default with swp1 through swp4, swp6, and swp10 through swp12:

cumulus@switch:~$ nv set interface swp1-4,6,10-12 bridge domain br_default
cumulus@switch:~$ nv config apply

The following example sets all subinterfaces of swp1s within the range 1-4:

cumulus@switch:~$ nv set interface swp1s1-4
cumulus@switch:~$ nv config apply

The following example sets all interfaces within the swp range 1 through 64 and their subinterface range 1 through 3:

cumulus@switch:~$ nv set interface swp1-64s1-3
cumulus@switch:~$ nv config apply
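
To check which interfaces a range expression covers before you apply it, you can expand it offline. The following Python sketch is an approximation of the range syntax used in the three examples above, not the NVUE parser: the expand_ports function is hypothetical, and it assumes a single interface prefix (swp by default) followed by comma-separated numbers or ranges, with an optional s<range> suffix.

```python
import re


def _expand_numbers(spec: str) -> list[int]:
    """Expand '1-4,6,10-12' into [1, 2, 3, 4, 6, 10, 11, 12]."""
    numbers = []
    for part in spec.split(","):
        if "-" in part:
            start, end = part.split("-")
            numbers.extend(range(int(start), int(end) + 1))
        else:
            numbers.append(int(part))
    return numbers


def expand_ports(spec: str, prefix: str = "swp") -> list[str]:
    """Expand a range expression such as 'swp1-4,6,10-12' or 'swp1-64s1-3'."""
    match = re.fullmatch(rf"{re.escape(prefix)}([\d,-]+)(?:s([\d,-]+))?", spec)
    if match is None:
        raise ValueError(f"unsupported port range: {spec!r}")
    ports = _expand_numbers(match.group(1))
    if match.group(2) is None:
        return [f"{prefix}{p}" for p in ports]
    subs = _expand_numbers(match.group(2))
    return [f"{prefix}{p}s{s}" for p in ports for s in subs]


print(expand_ports("swp1-4,6,10-12"))
print(expand_ports("swp1s1-4"))
print(len(expand_ports("swp1-64s1-3")))  # 192
```

The last call returns 192 names because each of the 64 ports expands to three entries.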

Use the glob keyword to specify bridge ports and bond slaves:

auto br0
iface br0
    bridge-ports glob swp1-6.100

auto br1
iface br1
    bridge-ports glob swp7-9.100  swp11.100 glob swp15-18.100

Fast Linkup

Cumulus Linux supports fast linkup on interfaces on NVIDIA Spectrum1 switches. Fast linkup enables you to bring up ports with cards that require links to come up fast, such as certain 100G optical network interface cards.

You must configure both sides of the connection with the same speed and FEC settings.

cumulus@switch:~$ nv set interface swp1 link fast-linkup on
cumulus@switch:~$ nv config apply

Edit the /etc/cumulus/switchd.conf file and add the interface.<interface>.enable_media_depended_linkup_flow=TRUE and interface.<interface>.enable_short_tuning=TRUE settings for the interfaces on which you want to enable fast linkup. The following example enables fast linkup on swp1:

cumulus@switch:~$ sudo nano /etc/cumulus/switchd.conf
...
interface.swp1.enable_media_depended_linkup_flow=TRUE
interface.swp1.enable_short_tuning=TRUE

Reload switchd with the sudo systemctl reload switchd.service command.

Cumulus Linux enables link flap detection by default. Link flap detection triggers when there are five link flaps within ten seconds, at which point the interface goes into a protodown state and shows linkflap as the reason. The switchd service also shows a log message similar to the following:

2023-02-10T17:53:21.264621+00:00 cumulus switchd[10109]: sync_port.c:2263 ERR swp2 link flapped more than 3 times in the last 60 seconds, setting protodown
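
The detection rule can be modeled as a sliding window over flap timestamps. The following Python sketch is an illustration of that idea, not the switchd implementation: the LinkFlapDetector class is hypothetical, and it assumes the interface goes into protodown as soon as the number of flaps inside the window reaches the threshold, and that a window or threshold of 0 disables protection.

```python
from collections import deque


class LinkFlapDetector:
    """Sliding-window model of link flap protection for one interface."""

    def __init__(self, window_s=10, threshold=5):
        self.window_s = window_s
        self.threshold = threshold
        self.flaps = deque()    # timestamps of recent flaps, oldest first
        self.protodown = False

    def record_flap(self, now: float) -> bool:
        """Record a flap at time `now` (seconds); return the protodown state."""
        if self.window_s == 0 or self.threshold == 0:
            return False        # protection disabled
        self.flaps.append(now)
        # Drop flaps that fell out of the window.
        while self.flaps and now - self.flaps[0] > self.window_s:
            self.flaps.popleft()
        if len(self.flaps) >= self.threshold:
            self.protodown = True
        return self.protodown


detector = LinkFlapDetector()  # defaults: 5 flaps within 10 seconds
print([detector.record_flap(t) for t in (0, 2, 4, 6, 8)])
# [False, False, False, False, True]
```

Flaps spaced further apart than the window allows, for example one every five seconds with the default settings, never accumulate enough entries to trip the detector in this model.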

To show interfaces with the protodown flag, run the NVUE nv show interface command or the Linux ip link command. To check a specific interface, run the nv show interface <interface> link command.

cumulus@switch:~$ nv show interface
Interface  State  Speed  MTU    Type      Remote Host      Remote Port  Summary                                 
---------  -----  -----  -----  --------  ---------------  -----------  ----------------------------------------
eth0       up     1G     1500   eth       oob-mgmt-switch  swp10        IP Address:            192.168.200.11/24
                                                                        IP Address:  fe80::4638:39ff:fe22:17a/64
lo         up            65536  loopback                                IP Address:                  127.0.0.1/8
                                                                        IP Address:                      ::1/128
mgmt       up            65575  vrf                                     IP Address:                  127.0.0.1/8
                                                                        IP Address:                      ::1/128
swp1       up            1500   swp                                                                             
swp2       protodown     9178   swp                                                                             
swp3       up            1500   swp                                                                             
swp4       up            1500   swp                                                                             
...
cumulus@switch:~$ ip link
...
37: swp2: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 9178 qdisc pfifo_fast master bond131 state DOWN mode DEFAULT group default qlen 1000
  link/ether 1c:34:da:ba:bb:2a brd ff:ff:ff:ff:ff:ff protodown on protodown_reason <linkflap>
...
cumulus@switch:~$ nv show interface swp1 link
                       operational                     
---------------------  ------------------------------
admin-status           up
oper-status            up
protodown              disabled
auto-negotiate         on
duplex                 full
speed                  800G
mac-address            9c:05:91:9a:e0:b8
fec                    rs
mtu                    9216
fast-linkup            off
stats
  in-bytes             145.08 KB
  in-pkts              756
  in-drops             8
  in-errors            0
  out-bytes            145.42 KB
  out-pkts             757
  out-drops            0
  out-errors           0
  carrier-transitions  12
eyes                   65, 62, 70, 65, 80, 82, 81, 82
grade                  65, 62, 70, 65, 80, 82, 81, 82
troubleshooting-info   No issue was observed

Clear the Interface Protodown State and Reason

To clear the protodown state and the reason:

cumulus@switch:~$ nv action clear interface swp1 link flap-protection violation 

After a few seconds the port state returns to up. Run the nv show interface <interface> link state command to verify that the interface is no longer in a protodown state and that the reason clears:

cumulus@switch:~$ nv show interface swp1 link state
operational    applied
-----------    -------
up             up

To clear all the interfaces from a protodown state, run the nv action clear system link flap-protection violation command.

The ifdown and ifup commands do not clear the protodown state. You must clear the protodown state and the reason manually using the sudo ip link set <interface> protodown_reason linkflap off and sudo ip link set <interface> protodown off commands.

cumulus@switch:~$ sudo ip link set swp2 protodown_reason linkflap off
cumulus@switch:~$ sudo ip link set swp2 protodown off

After a few seconds, the port state returns to UP. To verify that the interface is no longer in a protodown state and that the reason clears, run the ip link show <interface> command:

cumulus@switch:~$ ip link show swp2
37: swp2: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 9178 qdisc pfifo_fast master bond131 state UP mode DEFAULT group default qlen 1000
  link/ether 1c:34:da:ba:bb:2a brd ff:ff:ff:ff:ff:ff

You can change the following link flap protection settings:

The following example sets the link flap window to 30 seconds and the number of times the link must flap within that window to 8.

cumulus@switch:~$ nv set system link flap-protection interval 30
cumulus@switch:~$ nv set system link flap-protection threshold 8 
cumulus@switch:~$ nv config apply

Edit the /etc/cumulus/switchd.conf file to change the link_flap_window and link_flap_threshold settings.

cumulus@switch:~$ sudo nano /etc/cumulus/switchd.conf
...
link_flap_window = 30
link_flap_threshold = 8
...

After you change the link flap settings, you must restart switchd with the sudo systemctl restart switchd.service command.

To disable link flap protection:

cumulus@switch:~$ nv set interface swp1 link flap-protection enable off
cumulus@switch:~$ nv config apply

Edit the /etc/cumulus/switchd.conf file, and set the link_flap_window and link_flap_threshold parameters to 0 (zero).

cumulus@switch:~$ sudo nano /etc/cumulus/switchd.conf
...
link_flap_window = 0
link_flap_threshold = 0

To show the link flap protection time interval and threshold settings:

cumulus@switch:~$ nv show system link flap-protection
           applied
---------  -------
threshold  8      
interval   30 

To show if link flap protection is enabled on an interface, run the nv show interface <interface> link flap-protection command:

cumulus@switch:~$ nv show interface swp1 link flap-protection
        applied
------  -------
enable  off

Source Interface File Snippets

Sourcing interface files helps organize and manage the /etc/network/interfaces file. For example:

cumulus@switch:~$ sudo cat /etc/network/interfaces
# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0 inet dhcp

source /etc/network/interfaces.d/bond0

The contents of the sourced file used above are:

cumulus@switch:~$ sudo cat /etc/network/interfaces.d/bond0
auto bond0
iface bond0
    address 14.0.0.9/30
    address 2001:ded:beef:2::1/64
    bond-slaves swp25 swp26
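
To see which snippet files a configuration pulls in, you can scan the interfaces file for source lines. The following sketch is illustrative only; sourced_files is a hypothetical helper, and it does not expand wildcards or follow nested source lines.

```python
def sourced_files(interfaces_text: str) -> list:
    """Return the paths named on `source` lines of an interfaces file."""
    paths = []
    for raw in interfaces_text.splitlines():
        words = raw.split("#", 1)[0].split()  # drop comments, then tokenize
        if len(words) >= 2 and words[0] == "source":
            paths.extend(words[1:])
    return paths


example = """# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0 inet dhcp

source /etc/network/interfaces.d/bond0
"""
print(sourced_files(example))
```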

Mako Templates

ifupdown2 supports Mako-style templates. The Mako template engine processes the interfaces file before parsing.

Use the template to declare cookie-cutter bridges and to declare addresses in the interfaces file:

%for i in [1,12]:
auto swp${i}
iface swp${i}
    address 10.20.${i}.3/24
%endfor

  • In Mako syntax, use square brackets ([1,12]) to specify a list of individual numbers. Use range(1,12) to specify a range of interfaces.
  • To test your template and confirm it evaluates correctly, run mako-render /etc/network/interfaces.
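
Mako control lines use Python expressions, so a list and a range behave as they do in Python. The following sketch reproduces the stanzas from the template above in plain Python (it does not call Mako, and render_stanzas is a hypothetical helper) to show which interfaces each form generates. Note that range(1,12) stops at 11; use range(1,13) if you also want swp12.

```python
def render_stanzas(indexes) -> str:
    """Build the same stanzas the template loop produces, one per index."""
    blocks = []
    for i in indexes:
        blocks.append(
            f"auto swp{i}\n"
            f"iface swp{i}\n"
            f"    address 10.20.{i}.3/24\n"
        )
    return "\n".join(blocks)


# A list yields only the listed ports: swp1 and swp12
print(render_stanzas([1, 12]))

# A range yields every port from the start up to, but not including, the end
print(list(range(1, 12)))
```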

To comment out content in Mako templates, use double hash marks (##). For example:

## % for i in range(1, 4):
## auto swp${i}
## iface swp${i}
## % endfor
##

For more Mako template examples, refer to this knowledge base article.

ifupdown Scripts

Unlike the traditional ifupdown system, ifupdown2 does not run scripts installed in /etc/network/*/ automatically to configure network interfaces.

To enable or disable ifupdown2 scripting, edit the addon_scripts_support line in the /etc/network/ifupdown2/ifupdown2.conf file. 1 enables scripting and 0 disables scripting. For example:

cumulus@switch:~$ sudo nano /etc/network/ifupdown2/ifupdown2.conf
# Support executing of ifupdown style scripts.
# Note that by default python addon modules override scripts with the same name
addon_scripts_support=1

ifupdown2 sets the following environment variables when executing commands:

Show Interface Information

To show the administrative and physical (operational) state of all interfaces on the switch:

cumulus@switch:~$ nv show interface
Interface  Admin Status  Oper Status  Speed  MTU    Type      Remote Host      Remote Port  Summary                                 
---------  ------------  -----------  -----  -----  --------  ---------------  -----------  ----------------------------------------
eth0       up            up           1G     1500   eth       oob-mgmt-switch  swp10        IP Address:            192.168.200.11/24
                                                                                            IP Address:  fe80::4638:39ff:fe22:17a/64
lo         up            unknown             65536  loopback                                IP Address:                  127.0.0.1/8
                                                                                            IP Address:                      ::1/128
mgmt       up            up                  65575  vrf                                     IP Address:                  127.0.0.1/8
                                                                                            IP Address:                      ::1/128
swp1       up            up           1G     9216   swp                                     IP Address: fe80::4ab0:2dff:fe50:fecf/64
swp2       down          down                1500   swp                                                                             
swp3       down          down                1500   swp                                                                             
swp4       down          down                1500   swp                                                                             
swp5       down          down                1500   swp                                                                             
swp6       down          down                1500   swp                                                                             
swp7       down          down                1500   swp
...

To show the administrative and physical (operational) state of an interface, and the date and time the physical state of the interface changed:

cumulus@switch:~$ nv show interface swp1
                         operational        applied
-----------------------  -----------------  -------
...
  oper-status              up                                       
  admin-status             up                                       
  oper-status-last-change  2024/10/11 19:12:16.339

Run the ip link show dev <interface> command.

In the following example, swp1 is administratively UP and the physical link is UP (LOWER_UP).

cumulus@switch:~$ ip link show dev swp1
3: swp1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT qlen 500
    link/ether 44:38:39:00:03:c1 brd ff:ff:ff:ff:ff:ff
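
The flags between the angle brackets carry both states: UP is the administrative state and LOWER_UP indicates that the physical link has carrier. The following sketch (link_flags and describe_link are hypothetical helper names) extracts the two states from the first line of the output.

```python
import re


def link_flags(ip_link_line: str) -> set:
    """Return the set of flags between < and > in `ip link show` output."""
    match = re.search(r"<([^>]*)>", ip_link_line)
    return set(match.group(1).split(",")) if match else set()


def describe_link(ip_link_line: str) -> dict:
    flags = link_flags(ip_link_line)
    # Compare whole flag names so that LOWER_UP is not mistaken for UP
    return {"admin_up": "UP" in flags, "carrier": "LOWER_UP" in flags}


line = ("3: swp1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 "
        "qdisc pfifo_fast state UP mode DEFAULT qlen 500")
print(describe_link(line))
```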

To show the last time (date and time) the operational state of an interface changed and the number of carrier transitions for each interface (from the time of interface creation):

cumulus@switch:~$ nv show interface --view=carrier-stats
Interface       Oper Status  Up Count  Down Count  Total State Changes  Last State Change      
--------------  -----------  --------  ----------  -------------------  -----------------------
BLUE            up           0         0           0                    Never                  
RED             up           0         0           0                    Never                  
bond1           up           2         1           3                    2024/10/11 19:14:59.265
bond2           up           1         0           1                    2024/10/11 19:12:18.817
bond3           up           1         0           1                    2024/10/11 19:12:18.833
br_default      up           2         2           4                    2024/10/11 19:12:15.216
eth0            up           1         1           2                    2024/10/11 19:12:02.157
lo              unknown      0         0           0                    Never                  
mgmt            up           0         0           0                    Never                  
peerlink        up           1         0           1                    2024/10/11 19:12:06.913
peerlink.4094   up           1         0           1                    2024/10/11 19:12:06.915
swp1            up           2         2           4                    2024/10/11 19:12:16.339
swp2            up           2         2           4                    2024/10/11 19:12:16.345
swp3            up           2         2           4                    2024/10/11 19:12:16.351
swp4            down         1         1           2                    2024/10/11 19:11:28.936
swp5            down         1         1           2                    2024/10/11 19:11:28.936
swp6            down         1         1           2                    2024/10/11 19:11:28.936
swp7            down         1         1           2                    2024/10/11 19:11:28.936
...

In the example above:

To show the date and time the operational state of a specific interface changes (oper-status-last-change) and the number of carrier transitions (carrier-transitions, carrier-up-count, carrier-down-count):

cumulus@switch:~$ nv show interface swp1 link
                         operational              applied  pending
-----------------------  -----------------------  -------  -------
admin-status             up                                       
oper-status              up                                       
oper-status-last-change  2024/10/11 19:12:16.339                  
protodown                disabled                                 
auto-negotiate           off                      on       on     
duplex                   full                     full     full   
speed                    1G                       auto     auto   
mac-address              48:b0:2d:fa:a1:14                        
fec                                               auto     auto   
mtu                      9000                     9216     9216   
fast-linkup              off                                      
[breakout]                                                        
state                    up                       up       up     
flap-protection                                                   
  enable                                          on       on     
stats                                                             
  in-bytes               1.96 MB                                  
  in-pkts                16399                                    
  in-drops               0                                        
  in-errors              0                                        
  out-bytes              2.37 MB                                  
  out-pkts               24669                                    
  out-drops              0                                        
  out-errors             0                                        
  carrier-transitions    4                                        
  carrier-up-count       2                                        
  carrier-down-count     2 

To show the number of carrier transitions only (carrier-transitions, carrier-up-count, carrier-down-count) for a specific interface, run the nv show interface <interface> link stats command.
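
If you collect these counters in a script, the following sketch pulls the three carrier values out of the command output. The helper name carrier_counters is hypothetical, and the sketch assumes each counter appears as a name followed by its value, as in the nv show interface swp1 link output above. In that sample output, carrier-transitions equals the sum of the up and down counts.

```python
def carrier_counters(nv_output: str) -> dict:
    """Collect the carrier-* counters from NVUE link output text."""
    counters = {}
    for line in nv_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].startswith("carrier-"):
            counters[fields[0]] = int(fields[1])
    return counters


sample = """
  out-errors             0
  carrier-transitions    4
  carrier-up-count       2
  carrier-down-count     2
"""
counters = carrier_counters(sample)
print(counters)
```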

To show the assigned IP address on an interface:

cumulus@switch:~$ nv show interface lo ip address
-------------
10.0.1.12/32 
10.10.10.1/32
127.0.0.1/8  
::1/128
cumulus@switch:~$ ip addr show swp1
3: swp1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 500
    link/ether 44:38:39:00:03:c1 brd ff:ff:ff:ff:ff:ff
    inet 192.0.2.1/30 scope global swp1
    inet 192.0.2.2/30 scope global swp1
    inet6 2001:DB8::1/126 scope global tentative
        valid_lft forever preferred_lft forever

To show the description (alias) for an interface:

cumulus@switch:~$ nv show interface swp1
                          operational                   applied          
------------------------  ----------------------------  -----------------
...                                                            
description               hypervisor_port_1             hypervisor_port_1
ip                                                                       
  vrrp                                                                   
    enable                                              off              
  igmp                                                                   
    enable                                              off              
  neighbor-discovery                                                     
    enable                                              on               
    router-advertisement                                                 
      enable                                            off              
    home-agent                                                           
      enable                                            off              
    [rdnss]                                                              
    [dnssl]                                                              
    [prefix]                                                             
  ipv4                                                                   
    forward                                             on               
  ipv6                                                                   
    enable                                              on               
    forward                                             on               
  vrf                                                   default          
  [address]               fe80::4ab0:2dff:feeb:db72/64                   
  [gateway]                                                              
...
cumulus@switch:~$ ip link show swp1
3: swp1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT qlen 500
    link/ether aa:aa:aa:aa:aa:bc brd ff:ff:ff:ff:ff:ff
    alias hypervisor_port_1

You can monitor the traffic rate and PPS for an interface to ensure optimal network performance and reliability; refer to Commands to monitor interface traffic rate and PPS.

Considerations

Even though ifupdown2 supports the inclusion of multiple iface stanzas for the same interface, use a single iface stanza for each interface. If you must specify more than one iface stanza (for example, if the configuration for a single interface comes from many places, like a template or a sourced file), make sure the stanzas do not specify the same interface attributes. Otherwise, you see unexpected behavior.

In the following example, swp1 is in two files: /etc/network/interfaces and /etc/network/interfaces.d/speed_settings. ifupdown2 parses this configuration because the same attributes are not in multiple iface stanzas.

cumulus@switch:~$ sudo cat /etc/network/interfaces

source /etc/network/interfaces.d/speed_settings

auto swp1
iface swp1
  address 10.0.14.2/24

cumulus@switch:~$ cat /etc/network/interfaces.d/speed_settings

auto swp1
iface swp1
  link-speed 1000
  link-duplex full
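
Before you reload, you can check that no attribute appears in more than one stanza for the same interface. The following is a rough sketch, not part of ifupdown2: overlapping_attributes is a hypothetical helper that takes the text of each file and reports attributes that occur in two or more iface stanzas of one interface. Repeating an attribute inside a single stanza (such as two address lines) is not reported.

```python
from collections import defaultdict


def overlapping_attributes(*file_texts: str) -> dict:
    """Return {interface: [attributes set in more than one iface stanza]}."""
    stanzas = defaultdict(list)  # interface -> list of attribute-name sets
    for text in file_texts:
        current = None
        for raw in text.splitlines():
            words = raw.split("#", 1)[0].split()
            if not words:
                continue
            if words[0] == "iface" and len(words) > 1:
                current = set()
                stanzas[words[1]].append(current)
            elif words[0] in ("auto", "source", "allow-hotplug"):
                current = None  # these lines end the previous stanza
            elif current is not None:
                current.add(words[0])
    overlaps = {}
    for iface, attr_sets in stanzas.items():
        counts = defaultdict(int)
        for attrs in attr_sets:
            for attr in attrs:
                counts[attr] += 1
        repeated = sorted(a for a, n in counts.items() if n > 1)
        if repeated:
            overlaps[iface] = repeated
    return overlaps


main_file = """source /etc/network/interfaces.d/speed_settings

auto swp1
iface swp1
  address 10.0.14.2/24
"""
speed_file = """auto swp1
iface swp1
  link-speed 1000
  link-duplex full
"""
print(overlapping_attributes(main_file, speed_file))
```

For the two files in the example above, the result is empty because the stanzas set different attributes.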

ifupdown2 and sysctl

For sysctl commands in the pre-up, up, post-up, pre-down, down, and post-down lines that use the $IFACE variable, if the interface name contains a dot (.), ifupdown2 does not change the name to work with sysctl. For example, the interface name bridge.1 does not convert to bridge/1.
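
Because ifupdown2 does not do this conversion, a script that generates such lines has to build the key itself. The following sketch shows one way to do so; sysctl_key is a hypothetical helper, and the net.<family>.conf.<interface>.<setting> layout and the forwarding setting are used only as an illustration.

```python
def sysctl_key(iface: str, setting: str, family: str = "ipv4") -> str:
    """Build a dotted sysctl key, replacing dots in the interface name with slashes."""
    safe_iface = iface.replace(".", "/")  # bridge.1 -> bridge/1
    return f"net.{family}.conf.{safe_iface}.{setting}"


print(sysctl_key("bridge.1", "forwarding"))
```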

ifupdown2 and the gateway Parameter

The default route that the gateway parameter creates in ifupdown2 does not install in FRR and therefore does not redistribute into other routing protocols. Define a static default route instead, which installs in FRR and redistributes, if needed.

The following shows an example of the /etc/network/interfaces file when you use a static route instead of a gateway parameter:

auto swp2
iface swp2
    address 172.16.3.3/24
    up ip route add default via 172.16.3.2

Interface Name Limitations

Interface names can be a maximum of 15 characters. You cannot use a number for the first character and you cannot include a dash (-) in the name. In addition, you cannot use any name that matches the regular expression .{0,13}\-v.*.

If you encounter issues, remove the interface name from the /etc/network/interfaces file, then restart networking.service.

cumulus@switch:~$ sudo nano /etc/network/interfaces
cumulus@switch:~$ sudo systemctl restart networking.service
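
To catch an invalid name before you add it to the configuration, you can test it against the rules above. The following sketch is one possible check; interface_name_problems is a hypothetical helper, not a Cumulus Linux tool.

```python
import re

# Regular expression quoted in the naming rules above
RESERVED_PATTERN = re.compile(r".{0,13}\-v.*")


def interface_name_problems(name: str) -> list:
    """Return the naming rules that the given interface name breaks."""
    problems = []
    if len(name) > 15:
        problems.append("longer than 15 characters")
    if name[:1].isdigit():
        problems.append("starts with a number")
    if "-" in name:
        problems.append("contains a dash")
    if RESERVED_PATTERN.fullmatch(name):
        problems.append("matches the reserved pattern")
    return problems


for candidate in ("swp1", "1uplink", "bridge-v0"):
    print(candidate, interface_name_problems(candidate))
```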

IP Address Scope

ifupdown2 does not honor the configured IP address scope setting in the /etc/network/interfaces file and treats all addresses as global. It does not report an error. Consider this example configuration:

auto swp2
iface swp2
    address 35.21.30.5/30
    address 3101:21:20::31/80
    scope link

When you run ifreload -a on this configuration, ifupdown2 considers all IP addresses as global.

cumulus@switch:~$ ip addr show swp2
5: swp2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 74:e6:e2:f5:62:82 brd ff:ff:ff:ff:ff:ff
inet 35.21.30.5/30 scope global swp2
valid_lft forever preferred_lft forever
inet6 3101:21:20::31/80 scope global
valid_lft forever preferred_lft forever
inet6 fe80::76e6:e2ff:fef5:6282/64 scope link
valid_lft forever preferred_lft forever

To work around this issue, configure the IP address scope:

The NVUE command is not supported.

In the /etc/network/interfaces file, configure the IP address scope using post-up ip address add <address> dev <interface> scope <scope>. For example:

auto swp6
iface swp6
    post-up ip address add 71.21.21.20/32 dev swp6 scope site

Then run the ifreload -a command on this configuration.

The following configuration shows the correct scope:

cumulus@switch:~$ ip addr show swp6
9: swp6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 74:e6:e2:f5:62:86 brd ff:ff:ff:ff:ff:ff
inet 71.21.21.20/32 scope site swp6
valid_lft forever preferred_lft forever
inet6 fe80::76e6:e2ff:fef5:6286/64 scope link
valid_lft forever preferred_lft forever

System Power

In certain situations, you might need to power off the switch instead of rebooting. To power off the switch, run the cl-poweroff command, which shuts down the switch.

cumulus@switch:~$ sudo cl-poweroff

Alternatively, you can run the Linux poweroff command, which gracefully shuts down the switch (the switch LEDs stay on). On certain switches, such as the NVIDIA SN2201, SN2010, SN2100, SN2100B, SN3420, SN3700, SN3700C, SN4410, SN4600C, SN4600, SN4700, SN5400, or SN5600, the switch reboots instead of powering off.

cumulus@switch:~$ sudo poweroff

Switch Port Attributes

Cumulus Linux exposes network interfaces for several types of physical and logical devices:

Each physical network interface (port) has several settings:

For NVIDIA Spectrum ASICs, the firmware configures FEC, link speed, duplex mode and auto-negotiation automatically, following a predefined list of parameter settings until the link comes up. You can disable FEC if necessary, which forces the firmware to not try any FEC options.

MTU

Interface MTU applies to traffic traversing the management port, front panel or switch ports, bridge, VLAN subinterfaces, and bonds (both physical and logical interfaces). MTU is the only interface setting that you must set manually.

In Cumulus Linux, ifupdown2 assigns 9216 as the default MTU setting. The initial MTU value set by the driver is 9238. After you configure the interface, the default MTU setting is 9216.

To change the MTU setting, run the following commands. The example command sets the MTU to 1500 for the swp1 interface.

cumulus@switch:~$ nv set interface swp1 link mtu 1500
cumulus@switch:~$ nv config apply

Edit the /etc/network/interfaces file, then run the ifreload -a command.

cumulus@switch:~$ sudo nano /etc/network/interfaces

auto swp1
iface swp1
    mtu 1500
cumulus@switch:~$ sudo ifreload -a

Runtime Configuration (Advanced)

Run the ip link set command. The following example command sets the swp1 interface MTU to 1500.

cumulus@switch:~$ sudo ip link set dev swp1 mtu 1500

A runtime configuration is non-persistent; the configuration you create does not persist after you reboot the switch.

Set a Global Policy

To set a global MTU policy, create a policy document (called mtu.json). For example:

cumulus@switch:~$ sudo cat /etc/network/ifupdown2/policy.d/mtu.json
{
  "address": {"defaults": { "mtu": "9216" }
            }
}

The policies and attributes in any file in /etc/network/ifupdown2/policy.d/ override the default policies and attributes in /var/lib/ifupdown2/policy.d/.
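
If you generate the policy file from a script, the following sketch builds the same JSON document as the mtu.json example above. The function name mtu_policy is hypothetical; write the returned text to /etc/network/ifupdown2/policy.d/mtu.json with whatever tooling you normally use.

```python
import json


def mtu_policy(mtu: int) -> str:
    """Return the text of a global MTU policy document."""
    # The example policy stores the MTU as a string under address/defaults
    policy = {"address": {"defaults": {"mtu": str(mtu)}}}
    return json.dumps(policy, indent=2)


print(mtu_policy(9216))
```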

Bridge MTU

The bridge MTU setting is the lowest MTU of any interface that is a member of the bridge (every interface specified in bridge-ports in the bridge configuration of the /etc/network/interfaces file). You are not required to specify an MTU on the bridge. Consider this bridge configuration:

auto bridge
iface bridge
    bridge-ports bond1 bond2 bond3 bond4 peer5
    bridge-vids 100-110
    bridge-vlan-aware yes

For a bridge to have an MTU of 9000, set the MTU for each of the member interfaces (bond1 to bond4, and peer5) to 9000 at minimum.
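
The rule is a simple minimum over the bridge members. The following sketch uses made-up MTU values for the members in the example configuration (bridge_mtu is a hypothetical helper) to show how a single member with a low MTU pulls the whole bridge down.

```python
def bridge_mtu(member_mtus: dict) -> int:
    """The bridge MTU is the lowest MTU of any member interface."""
    return min(member_mtus.values())


# Illustrative values only: peer5 is left at 1500
members = {"bond1": 9000, "bond2": 9000, "bond3": 9000, "bond4": 9000, "peer5": 1500}
print(bridge_mtu(members))  # 1500

members["peer5"] = 9000
print(bridge_mtu(members))  # 9000
```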

When configuring MTU for a bond, configure the MTU value directly under the bond interface; the member links or slave interfaces inherit the configured value. If you need a different MTU on the bond, set it on the bond interface, as this ensures the slave interfaces pick it up. You do not have to specify an MTU on the slave interfaces.

VLAN interfaces inherit their MTU settings from their physical devices or their lower interface; for example, swp1.100 inherits its MTU setting from swp1. Therefore, specifying an MTU on swp1 ensures that swp1.100 inherits the MTU setting for swp1.

If you are working with VXLANs, the MTU for a virtual network interface (VNI) must be 50 bytes smaller than the MTU of the physical interfaces on the switch, as various headers and other data require those 50 bytes. Also, consider setting the MTU much higher than 1500.
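
As a worked example of the 50-byte rule, the following sketch computes the largest VNI MTU for a given physical MTU. The helper name max_vni_mtu is hypothetical, and the per-header breakdown in the comment is the commonly cited one for VXLAN over IPv4, given here as an assumption rather than taken from this guide.

```python
# Assumed breakdown: outer Ethernet 14 + outer IPv4 20 + UDP 8 + VXLAN 8 = 50 bytes
VXLAN_OVERHEAD = 14 + 20 + 8 + 8


def max_vni_mtu(physical_mtu: int) -> int:
    """Largest VNI MTU that still fits inside the physical interface MTU."""
    return physical_mtu - VXLAN_OVERHEAD


print(max_vni_mtu(9216))  # 9166
print(max_vni_mtu(1500))  # 1450
```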

To show the MTU setting for an interface:

cumulus@switch:~$ nv show interface swp1
...
link                                                            
  auto-negotiate          off                           on      
  duplex                  full                          full    
  speed                   1G                            auto    
  mac-address             48:b0:2d:c8:bb:07                     
  fec                                                   auto    
  mtu                     9216                          9216    
...
cumulus@switch:~$ ip link show dev swp1
3: swp1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9216 qdisc pfifo_fast state UP mode DEFAULT qlen 500
   link/ether 44:38:39:00:03:c1 brd ff:ff:ff:ff:ff:ff

Drop Packets that Exceed the Egress Layer 3 MTU

The switch forwards all packets that are within the MTU value set for the egress layer 3 interface. However, when packets are larger in size than the MTU value, the switch fragments the packets that do not have the DF bit set and drops the packets that do have the DF bit set.

Run the following command to drop all IP packets that are larger in size than the MTU value for the egress layer 3 interface instead of fragmenting packets:

cumulus@switch:~$ nv set system control-plane trap l3-mtu-err state off
cumulus@switch:~$ nv config apply
cumulus@switch:~$ echo "0" > /cumulus/switchd/config/trap/l3-mtu-err/enable
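
The forwarding decision described in this section can be summarized as a small decision function. This is a sketch of the documented behavior only, not of the switch implementation; egress_action and its fragment_oversized flag are hypothetical names, with the flag standing in for the l3-mtu-err trap setting.

```python
def egress_action(packet_len: int, mtu: int, df_bit: bool,
                  fragment_oversized: bool = True) -> str:
    """Return what happens to a packet on the egress layer 3 interface."""
    if packet_len <= mtu:
        return "forward"
    # Oversized packet: DF set, or fragmentation turned off, means drop
    if df_bit or not fragment_oversized:
        return "drop"
    return "fragment"


print(egress_action(1400, 1500, df_bit=False))  # forward
print(egress_action(1600, 1500, df_bit=False))  # fragment
print(egress_action(1600, 1500, df_bit=True))   # drop
print(egress_action(1600, 1500, df_bit=False, fragment_oversized=False))  # drop
```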

FEC

FEC is an encoding and decoding layer that enables the switch to detect and correct bit errors introduced over the cable between two interfaces. The target IEEE BER on high speed Ethernet links is 10^-12. Because 25G transmission speeds can introduce a higher than acceptable BER on a link, FEC is often required to correct errors to achieve the target BER at 25G, 4x25G, 100G, and higher link speeds. The type and grade of a cable or module and the medium of transmission determine which FEC setting is necessary.
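
To put the target BER in perspective, a BER of 10^-12 means one errored bit for every 10^12 bits transmitted. The following back-of-the-envelope sketch (seconds_per_bit_error is a hypothetical helper, and it assumes a fully loaded link) converts that into the average time between bit errors.

```python
def seconds_per_bit_error(line_rate_bps: float, ber: float) -> float:
    """Average seconds between bit errors on a fully loaded link."""
    errors_per_second = line_rate_bps * ber
    return 1.0 / errors_per_second


# At the target BER, a 25G link sees roughly one bit error every 40 seconds
print(round(seconds_per_bit_error(25e9, 1e-12), 1))
# A 100G link at the same BER sees roughly one every 10 seconds
print(round(seconds_per_bit_error(100e9, 1e-12), 1))
```

A link whose BER is a few orders of magnitude worse than the target produces errors many times per second at these speeds, which is why FEC matters more as the link speed rises.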

For the link to come up, the two interfaces on each end must use the same FEC setting.

FEC adds a small latency overhead. For most applications, this small amount of latency is preferable to error packet retransmission latency.

The two FEC types are:

Cumulus Linux includes additional FEC options:

While Auto FEC is the default setting on the NVIDIA Spectrum switch, do not explicitly configure the fec auto option on the switch as this leads to a link flap whenever you run nv config apply or ifreload -a.

For 25G DAC, 4x25G Breakouts DAC and 100G DAC cables, the IEEE 802.3by specification creates 3 classes:

The IEEE classification specifies various dB loss measurements and minimum achievable cable length. You can build longer and shorter cables if they comply with the dB loss and BER requirements.

If a cable has a CA-25G-S classification and FEC is not on, the BER might be unacceptable in a production network. It is important to set the FEC according to the cable class (or better) to have acceptable bit error rates. See Determining Cable Class below.

You can check bit errors using cl-netstat (RX_ERR column) or ethtool -S (HwIfInErrors counter) after a large amount of traffic passes through the link. A non-zero value indicates bit errors. Expect error packets to be zero or extremely low compared to good packets. If a cable has an unacceptable rate of errors with FEC enabled, replace the cable.

For 25G, 4x25G Breakout, and 100G Fiber modules and AOCs, there is no classification of 25G cable types for dB loss, BER or length. FEC is recommended, but might not be necessary if the BER is already low enough.

Cable Class of 100G and 25G DACs

You can determine the cable class for 100G and 25G DACs from the Extended Specification Compliance Code field (SFP28: A0h, byte 36, QSFP28: Page 0, byte 192) in the cable EEPROM programming.

For 100G DACs, most manufacturers use the 0Bh 100GBASE-CR4 or 25GBASE-CR CA-L value (the 100G DAC specification predates the IEEE 802.3by 25G DAC specification). Use RS FEC for 100G DAC; shorter or better cables might not need this setting.

A manufacturer’s EEPROM setting might not match the dB loss on a cable or the actual bit error rates that a particular cable introduces. Use the designation as a guide, but set FEC according to the bit error rate tolerance in the design criteria for the network. For most applications, the highest mutual FEC ability of both end devices is the best choice.

You can determine for which grade the manufacturer has designated the cable as follows.

For the SFP28 DAC, run the following command:

cumulus@switch:~$ sudo ethtool -m swp1 hex on | grep 0020 | awk '{ print $6}'
0c

The values at location 0x0024 are:

For the QSFP28 DAC, run the following command:

cumulus@switch:~$ sudo ethtool -m swp1s0 hex on | grep 00c0 | awk '{print $2}'
0b
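
The grep and awk pipelines above pick one byte out of the hex dump by row and column. The following sketch does the same by EEPROM offset; eeprom_byte is a hypothetical helper, the sample dump contains made-up values, and the sketch assumes each data row starts with its offset followed by a colon and then the byte values.

```python
def eeprom_byte(hex_dump: str, offset: int) -> str:
    """Return the byte at the given offset from `ethtool -m <port> hex on` text."""
    for line in hex_dump.splitlines():
        fields = line.split()
        if not fields or not fields[0].endswith(":"):
            continue  # skip header and separator rows
        try:
            base = int(fields[0].rstrip(":"), 16)
        except ValueError:
            continue
        index = offset - base
        if 0 <= index < len(fields) - 1:
            return fields[1 + index]
    raise KeyError(hex(offset))


# Made-up sample rows in the assumed layout
dump = (
    "Offset\t\tValues\n"
    "------\t\t------\n"
    "0x0020:\t\t4e 56 49 44 0c 00 00 00 00 00 00 00 00 00 00 00\n"
    "0x00c0:\t\t0b 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00\n"
)
print(eeprom_byte(dump, 0x24))  # SFP28: same position as awk field $6 on row 0x0020
print(eeprom_byte(dump, 0xC0))  # QSFP28: same position as awk field $2 on row 0x00c0
```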

The values at 0x00c0 are:

In each example below, the Compliance field comes from the method described above; the ethtool -m output does not show it.

3 meter cable that does not require FEC
(CA-N)
Cost: More expensive
Cable size: 26AWG (Note that AWG does not necessarily correspond to overall dB loss or BER performance)
Compliance Code: 25GBASE-CR CA-N

3 meter cable that requires Base-R FEC
(CA-S)
Cost: Less expensive
Cable size: 26AWG
Compliance Code: 25GBASE-CR CA-S

When in doubt, consult the manufacturer directly to determine the cable classification.

Spectrum ASIC FEC Behavior

The firmware in a Spectrum ASIC applies FEC configuration to 25G and 100G cables based on the cable type and whether the peer switch also has a Spectrum ASIC.

When the link is between two switches with Spectrum ASICs:

Cable Type                               FEC Mode
---------------------------------------  -------------
25G optical cables                       Base-R/FC-FEC
25G 1,2 meters: CA-N, loss <13dB         Base-R/FC-FEC
25G 2.5,3 meters: CA-S, loss <16dB       Base-R/FC-FEC
25G 2.5,3,4,5 meters: CA-L, loss > 16dB  RS-FEC
100G DAC or optical                      RS-FEC
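
If you want to compare the FEC mode you observe on a link with the mode you expect, the table above translates directly into a lookup. The following sketch is only a restatement of that table; the dictionary keys and the expected_fec helper are hypothetical names chosen for illustration.

```python
# (speed, cable class or medium) -> FEC mode between two Spectrum switches
SPECTRUM_TO_SPECTRUM_FEC = {
    ("25G", "optical"): "Base-R/FC-FEC",
    ("25G", "CA-N"): "Base-R/FC-FEC",
    ("25G", "CA-S"): "Base-R/FC-FEC",
    ("25G", "CA-L"): "RS-FEC",
    ("100G", "DAC"): "RS-FEC",
    ("100G", "optical"): "RS-FEC",
}


def expected_fec(speed: str, cable: str) -> str:
    """Look up the FEC mode for a link between two Spectrum switches."""
    return SPECTRUM_TO_SPECTRUM_FEC.get((speed, cable), "unknown")


print(expected_fec("25G", "CA-L"))
print(expected_fec("25G", "CA-N"))
```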

When linking to a non-Spectrum peer, the firmware lets the peer decide. The Spectrum ASIC supports RS-FEC (for both 100G and 25G), Base-R/FC-FEC (25G only), or no-FEC (for both 100G and 25G).

Cable Type                               FEC Mode
---------------------------------------  ---------------------------------
25G optical cables                       Let peer decide
25G 1,2 meters: CA-N, loss <13dB         Let peer decide
25G 2.5,3 meters: CA-S, loss <16dB       Let peer decide
25G 2.5,3,4,5 meters: CA-L, loss > 16dB  Let peer decide
100G                                     Let peer decide: RS-FEC or No FEC

How Does Cumulus Linux use FEC?

A Spectrum switch enables FEC automatically when it powers up. The port firmware tests and determines the correct FEC mode to bring the link up with the neighbor. It is possible to get a link up to a switch without enabling FEC on the remote device as the switch eventually finds a working combination to the neighbor without FEC.

The following sections describe how to show the current FEC mode, and how to enable and disable FEC.

Show the Current FEC Mode

To show the FEC mode on a switch port, run the NVUE nv show interface <interface> link command.

cumulus@switch:~$ nv show interface swp1 link
                       operational        applied
---------------------  -----------------  -------
admin-status           up                        
oper-status            up                        
protodown              disabled                  
auto-negotiate         off                on     
duplex                 full               full   
speed                  1G                 auto   
fec                                       auto
...

Enable or Disable FEC

To enable Reed Solomon (RS) FEC on a link:

cumulus@switch:~$ nv set interface swp1 link fec rs
cumulus@switch:~$ nv config apply

Edit the /etc/network/interfaces file, then run the ifreload -a command. The following example enables RS FEC for the swp1 interface (link-fec rs):

cumulus@switch:~$ sudo nano /etc/network/interfaces

auto swp1
iface swp1
    link-autoneg off
    link-speed 100000
    link-fec rs
cumulus@switch:~$ sudo ifreload -a

Runtime Configuration (Advanced)

Run the ethtool --set-fec <interface> encoding RS command. For example:

cumulus@switch:~$ sudo ethtool --set-fec swp1 encoding RS

A runtime configuration is non-persistent. The configuration you create does not persist after you reboot the switch.

To enable Base-R/FireCode FEC on a link:

cumulus@switch:~$ nv set interface swp1 link fec baser
cumulus@switch:~$ nv config apply

Edit the /etc/network/interfaces file, then run the ifreload -a command. The following example enables Base-R FEC for the swp1 interface (link-fec baser):

cumulus@switch:~$ sudo nano /etc/network/interfaces

auto swp1
iface swp1
    link-autoneg off
    link-speed 25000
    link-fec baser
cumulus@switch:~$ sudo ifreload -a

Runtime Configuration (Advanced)

Run the ethtool --set-fec <interface> encoding baser command. For example:

cumulus@switch:~$ sudo ethtool --set-fec swp1 encoding BaseR

A runtime configuration is non-persistent. The configuration you create does not persist after you reboot the switch.

To enable FEC with Auto-negotiation:

You can use FEC with auto-negotiation on DACs only.

cumulus@switch:~$ nv set interface swp1 link auto-negotiate on
cumulus@switch:~$ nv config apply

Edit the /etc/network/interfaces file to set auto-negotiation to on, then run the ifreload -a command:

cumulus@switch:~$ sudo nano /etc/network/interfaces

auto swp1
iface swp1
    link-autoneg on
cumulus@switch:~$ sudo ifreload -a

Runtime Configuration (Advanced)

You can use ethtool to enable FEC with auto-negotiation. For example:

cumulus@switch:~$ sudo ethtool -s swp1 speed 10000 duplex full autoneg on

A runtime configuration is non-persistent. The configuration you create does not persist after you reboot the switch.

To show the FEC and auto-negotiation settings for an interface, run either the NVUE nv show interface <interface> link command or the Linux sudo ethtool swp1 | egrep 'FEC|auto' command:

cumulus@switch:~$ sudo ethtool swp1 | egrep 'FEC|auto'
Supports auto-negotiation: Yes
Supported FEC modes: RS
Advertised auto-negotiation: Yes
Advertised FEC modes: RS
Link partner advertised auto-negotiation: Yes
Link partner advertised FEC modes: Not reported

To disable FEC on a link:

cumulus@switch:~$ nv set interface swp1 link fec off
cumulus@switch:~$ nv config apply

To configure FEC to the default value, run the nv unset interface swp1 link fec command.

Edit the /etc/network/interfaces file, then run the ifreload -a command. The following example disables FEC for the swp1 interface (link-fec off):

cumulus@switch:~$ sudo nano /etc/network/interfaces

auto swp1
iface swp1
    link-fec off
cumulus@switch:~$ sudo ifreload -a

Runtime Configuration (Advanced)

Run the ethtool --set-fec <interface> encoding off command. For example:

cumulus@switch:~$ sudo ethtool --set-fec swp1 encoding off

A runtime configuration is non-persistent. The configuration you create does not persist after you reboot the switch.

DR1 and DR4 Modules

100GBASE-DR1 modules, such as NVIDIA MMS1V70-CM, include internal RS FEC processing, which the software does not control. When using these optics, you must either set the FEC setting to off or leave it unset for the link to function.

400GBASE-DR4 modules, such as NVIDIA MMS1V00-WM, require RS FEC. The switch automatically enables FEC if it is set to off.

You typically use these optics to interconnect 4x SN2700 uplinks to a single SN4700 breakout downlink. The following configuration shows an explicit FEC example. You can leave the FEC settings unset for autodetection.

SN4700 (400GBASE-DR4 in swp1):

cumulus@SN4700:mgmt:~$ nv set interface swp1 link breakout 4x lanes-per-port 2
cumulus@SN4700:mgmt:~$ nv set interface swp1s0 link fec rs
cumulus@SN4700:mgmt:~$ nv set interface swp1s0 link speed 100G
cumulus@SN4700:mgmt:~$ nv set interface swp1s1 link fec rs
cumulus@SN4700:mgmt:~$ nv set interface swp1s1 link speed 100G
cumulus@SN4700:mgmt:~$ nv set interface swp1s2 link fec rs
cumulus@SN4700:mgmt:~$ nv set interface swp1s2 link speed 100G
cumulus@SN4700:mgmt:~$ nv set interface swp1s3 link fec rs
cumulus@SN4700:mgmt:~$ nv set interface swp1s3 link speed 100G
cumulus@SN4700:mgmt:~$ nv config apply

SN2700 (100GBASE-DR1 in swp11-14):

cumulus@SN2700:mgmt:~$ nv set interface swp11 link fec off
cumulus@SN2700:mgmt:~$ nv set interface swp11 link speed 100G
cumulus@SN2700:mgmt:~$ nv set interface swp12 link fec off
cumulus@SN2700:mgmt:~$ nv set interface swp12 link speed 100G
cumulus@SN2700:mgmt:~$ nv set interface swp13 link fec off
cumulus@SN2700:mgmt:~$ nv set interface swp13 link speed 100G
cumulus@SN2700:mgmt:~$ nv set interface swp14 link fec off
cumulus@SN2700:mgmt:~$ nv set interface swp14 link speed 100G
cumulus@SN2700:mgmt:~$ nv config apply

The FEC operational view of this configuration appears incorrect because FEC is operationally enabled only on the SN4700 400G breakout side. This is because the 100G DR1 module side handles FEC internally, which is not visible to Cumulus Linux.

cumulus@SN2700:mgmt:~$ nv show int swp11 link
                       operational        applied
---------------------  -----------------  -------
auto-negotiate         on                 on     
duplex                 full               full   
speed                  100G               auto   
fec                    off                off   
mtu                    9216               9216   
fast-linkup            off                       
[breakout]                                       
state                  up                 up     
...
cumulus@SN4700:mgmt:~$ nv show int swp1s1 link
                       operational        applied
---------------------  -----------------  -------
auto-negotiate         on                 on     
duplex                 full               full   
speed                  100G               auto   
fec                    rs                 off    
mtu                    9216               9216   
fast-linkup            off                       
[breakout]                                       
state                  up                 up     
...

Default Policies for Interface Settings

Instead of configuring settings for each individual interface, you can specify a policy for all interfaces on a switch or tailor custom settings for each interface. Create a file in /etc/network/ifupdown2/policy.d/ and populate the settings accordingly. The following example shows a file called address.json.

cumulus@switch:~$ cat /etc/network/ifupdown2/policy.d/address.json
{
    "ethtool": {
        "defaults": {
            "link-duplex": "full"
        },
        "iface_defaults": {
            "swp1": {
                "link-autoneg": "on",
                "link-speed": "1000"
            },
            "swp16": {
                "link-autoneg": "off",
                "link-speed": "10000"
            },
            "swp50": {
                "link-autoneg": "off",
                "link-speed": "100000",
                "link-fec": "rs"
            }
        }
    },
    "address": {
        "defaults": { "mtu": "9000" },
        "iface_defaults": {
            "eth0": {"mtu": "1500"}
        }
    }
}

Setting the default MTU also applies to the management interface. Be sure to add the iface_defaults to override the MTU for eth0, so that it remains at 1500.
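The following Python sketch shows one way to reason about which value applies to an interface in a policy file like the one above. It assumes that an entry under iface_defaults takes precedence over the matching entry under defaults, as the eth0 override implies. effective_setting is a hypothetical helper, not part of ifupdown2, and it does not model settings made in the /etc/network/interfaces file.

```python
import json

# A reduced copy of the "address" section from the example policy file.
POLICY = json.loads("""
{
    "address": {
        "defaults": {"mtu": "9000"},
        "iface_defaults": {"eth0": {"mtu": "1500"}}
    }
}
""")


def effective_setting(policy, module, iface, attr):
    """Return the per-interface value if present, otherwise the module default."""
    section = policy.get(module, {})
    per_iface = section.get("iface_defaults", {}).get(iface, {})
    if attr in per_iface:
        return per_iface[attr]
    return section.get("defaults", {}).get(attr)


print(effective_setting(POLICY, "address", "eth0", "mtu"))
print(effective_setting(POLICY, "address", "swp1", "mtu"))
```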

Breakout Ports

Cumulus Linux supports the following port breakout options:

18x SFP28 25G and 4x QSFP28 100G interfaces only support NRZ encoding. You can set all speeds down to 1G.

All 4x QSFP28 ports can break out into 4x SFP28 or 2x QSFP28.

  • 18x 1G - 18x SFP28 set to 1G
  • 16x 1G - 4x QSFP28 configured as 4x breakouts and set to 1G

Maximum 1G ports: 34

  • 18x 10G - 18x SFP28 set to 10G
  • 16x 10G - 4x QSFP28 configured as 4x breakouts and set to 10G

Maximum 10G ports: 34

  • 18x 25G - 18x SFP28 (native speed)
  • 16x 25G - 4x QSFP28 breakouts to 4x and set to 25G

Maximum 25G ports: 34

4x 40G - 4x QSFP28 set to 40G

Maximum 40G ports: 4

8x 50G - 4x QSFP28 break out into 2x and set to 50G

Maximum 50G ports: 8

4x 100G - 4x QSFP28 (native speed)

Maximum 100G ports: 4
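The maximum port counts in these lists follow from simple arithmetic: the ports used at their native form factor plus the breakout-capable ports multiplied by the split factor. The following Python sketch reproduces two of the figures above; max_ports is a hypothetical helper used only for illustration, and it does not account for platforms that disable adjacent ports.

```python
def max_ports(native_ports: int, breakout_ports: int, split: int) -> int:
    """Total interfaces when breakout_ports are each split into `split` interfaces."""
    return native_ports + breakout_ports * split


# 18x SFP28 plus 4x QSFP28 broken out into 4x
print(max_ports(18, 4, 4))
# 4x QSFP28 broken out into 2x (50G)
print(max_ports(0, 4, 2))
```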

16x QSFP28 100G interfaces only support NRZ encoding. You can set all speeds down to 1G.

All QSFP28 ports can break out into 4x SFP28 or 2x QSFP28.

64x 1G - 16x QSFP28 break out into 4x and set to 1G

Maximum 1G ports: 64

64x 10G - 16x QSFP28 break out into 4x and set to 10G

Maximum 10G ports: 64

64x 25G - 16x QSFP28 break out into 4x and set to 25G

Maximum 25G ports: 64

16x 40G - 16x QSFP28 set to 40G

Maximum 40G ports: 16

32x 50G - 16x QSFP28 break out into 2x and set to 50G

Maximum 50G ports: 32

16x 100G - 16x QSFP28 (native speed)

Maximum 100G ports: 16

48x 1GBase-T ports (RJ45 up to 100m CAT5E/6) and 4x QSFP28 100G interfaces (only support NRZ encoding). You can set all speeds down to 1G.

All 4x QSFP28 ports can break out into 4x SFP28 or 2x QSFP28.

  • 48x 1GBase-T - 48x Base-T set to 1G. You can also set them to 10/100Mb.
  • 16x 1G - 4x QSFP28 configured as 4x breakouts and set to 1G

Maximum 10/100MBase-T ports: 48

Maximum 1GBase-T ports: 48

Maximum 1G ports: 16

  • 16x 10G - 4x QSFP28 configured as 4x breakouts and set to 10G

Maximum 10G ports: 16

  • 16x 25G - 4x QSFP28 breakouts to 4x and set to 25G

Maximum 25G ports: 16

4x 40G - 4x QSFP28 set to 40G

Maximum 40G ports: 4

8x 50G - 4x QSFP28 break out into 2x

Maximum 50G ports: 8

4x 100G - 4x QSFP28 (native speed)

Maximum 100G ports: 4

48x SFP28 25G and 8x QSFP28 100G interfaces only support NRZ encoding. You can set all speeds down to 1G.

The top 4x QSFP28 ports can break out into 4x SFP28; the lower 4x QSFP28 ports are then disabled and you cannot use them.

All 8x QSFP28 ports can break out into 2x QSFP28 without disabling ports.

  • 48x 1G - 48x SFP28 set to 1G
  • 16x 1G - 4x QSFP28 break out into 4x and set to 1G

Maximum 1G ports: 64

  • 48x 10G - 48x SFP28 set to 10G
  • 16x 10G - 4x QSFP28 break out into 4x and set to 10G

Maximum 10G ports: 64

  • 48x 25G - 48x SFP28 (native speed)
  • 16x 25G - Top 4x QSFP28 break out into 4x (bottom 4x QSFP28 disabled)

Maximum 25G ports: 64

8x 40G - 8x QSFP28 set to 40G

Maximum 40G ports: 8

16x 50G - 8x QSFP28 break out into 2x

Maximum 50G ports: 16

8x 100G - 8x QSFP28 (native speed)

Maximum 100G ports: 8

32x QSFP28 100G interfaces only support NRZ encoding. You can set all speeds down to 1G.

The top 16x QSFP28 ports can break out into 4x SFP28; the lower 16x QSFP28 ports are then disabled and you cannot use them.

All 32x QSFP28 ports can break out into 2x QSFP28 without disabling ports.

64x 1G - Top 16x QSFP28 break out into 4x and set to 1G (bottom 16x QSFP28 disabled)

Maximum 1G ports: 64

64x 10G - Top 16x QSFP28 break out into 4x and set to 10G (bottom 16x QSFP28 disabled)

Maximum 10G ports: 64

64x 25G - Top 16x QSFP28 break out into 4x (bottom 16x QSFP28 disabled)

Maximum 25G ports: 64

32x 40G - 32x QSFP28 set to 40G

Maximum 40G ports: 32

64x 50G - 32x QSFP28 break out into 2x

Maximum 50G ports: 64

32x 100G - 32x QSFP28 (native speed)

Maximum 100G ports: 32

48x SFP28 25G and 12x QSFP28 100G interfaces only support NRZ encoding. You can set all speeds down to 1G.

All 12x QSFP28 ports can break out into 4x SFP28 or 2x QSFP28.

  • 48x 1G - 48x SFP28 set to 1G
  • 48x 1G - 12x QSFP28 break out into 4x and set to 1G

Maximum 1G ports: 96

  • 48x 10G - 48x SFP28 set to 10G
  • 48x 10G - 12x QSFP28 break out into 4x and set to 10G

Maximum 10G ports: 96

  • 48x 25G - 48x SFP28 (native speed)
  • 48x 25G - 12x QSFP28 break out into 4x

Maximum 25G ports: 96

12x 40G - 12x QSFP28 set to 40G

Maximum 40G ports: 12

24x 50G - 12x QSFP28 break out into 2x

Maximum 50G ports: 24

12x 100G - 12x QSFP28 (native speed)

Maximum 100G ports: 12

32x QSFP28 100G interfaces only support NRZ encoding. You can set all speeds down to 1G.

All 32x QSFP28 ports can break out into 4x SFP28 or 2x QSFP28.

128x 1G - 32x QSFP28 break out into 4x and set to 1G

Maximum 1G ports: 128

128x 10G - 32x QSFP28 break out into 4x and set to 10G

Maximum 10G ports: 128

128x 25G - 32x QSFP28 break out into 4x

Maximum 25G ports: 128

32x 40G - 32x QSFP28 set to 40G

Maximum 40G ports: 32

64x 50G - 32x QSFP28 break out into 2x

Maximum 50G ports: 64

32x 100G - 32x QSFP28 (native speed)

Maximum 100G ports: 32

32x QSFP56 200G interfaces support both PAM4 and NRZ encodings. You can set all speeds down to 1G.

For lower speed interface configurations, PAM4 is automatically converted to NRZ encoding.

All 32x QSFP56 ports can break out into 4x SFP56 or 2x QSFP56.

128x 1G - 32x QSFP56 break out into 4x and set to 1G

Maximum 1G ports: 128

128x 10G - 32x QSFP56 break out into 4x and set to 10G

Maximum 10G ports: 128

128x 25G - 32x QSFP56 break out into 4x and set to 25G

Maximum 25G ports: 128

32x 40G - 32x QSFP56 set to 40G

Maximum 40G ports: 32

128x 50G - 32x QSFP56 break out into 4x

Maximum 50G ports: 128

64x 100G - 32x QSFP56 break out into 2x

Maximum 100G ports: 64

32x 200G - 32x QSFP56 (native speed)

Maximum 200G ports: 32

SN4410 24xQSFP28-DD interfaces [ports 1-24] support both PAM4 and NRZ encoding with all speeds from 200G down to 1G.

The 8xQSFP-DD (400GbE) interfaces [ports 25-32] support both PAM4 and NRZ encodings with all speeds from 400G down to 1G.

For lower speeds, PAM4 is automatically converted to NRZ encoding.

You can split ports #1 to #32 into:

  • 2x ports with PAM4 and NRZ encoding with no limitations.
  • 4x ports with PAM4 and NRZ encoding with no limitations.
  • 8x ports with PAM4 and NRZ encoding, but this forces blocking of an adjacent port (the total available number of MAC addresses is 128).

  • 96x 1G - 24xQSFP28-DD break out into 4x and set to 1G
  • 32x 1G - Top 4xQSFP-DD break out into 8x and set to 1G (bottom 4xQSFP-DD blocked*)

Maximum 1G ports: 128

  • 96x 10G - 24xQSFP28-DD break out into 4x and set to 10G
  • 32x 10G - 4 top QSFP-DD break out into 8x and set to 10G (bottom 4xQSFP-DD blocked*)

Maximum 10G ports: 128

*Other QSFP-DD breakout combinations are available up to maximum of 128x ports.

  • 96x 25G - 24xQSFP28-DD break out into 4x
  • 32x 25G - 4 top QSFP-DD break out into 8x and set to 25G (bottom 4xQSFP-DD blocked*)

Maximum 25G ports: 128

*Other QSFP-DD breakout combinations are available up to maximum of 128x ports.

  • 48x 40G - 24xQSFP28-DD break out into 2x and set to 40G
  • 16x 40G - 8xQSFP-DD break out into 2x and set to 40G

Maximum 40G ports: 64

  • 96x 50G - 24xQSFP28-DD/QSFP56 break out into 4x
  • 32x 50G - 8xQSFP-DD break out into 4x

Maximum 50G ports: 128

  • 96x 100G - 24xQSFP28-DD/QSFP56 break out into 4x
  • 32x 100G - 8xQSFP-DD break out into 4x

Maximum 100G ports: 128

  • 48x 200G - 24xQSFP28-DD/QSFP56 break out into 2x
  • 16x 200G - 8xQSFP-DD break out into 2x

Maximum 200G ports: 64

8x 400G - 8xQSFP-DD (native speed)

Maximum 400G ports: 8

64x QSFP28 100G interfaces only support NRZ encoding. You can set all speeds down to 1G.

Only 32x QSFP28 ports can break out into 4x SFP28, and you must disable the adjacent QSFP28 port. Only the first and third or second and fourth rows can break out into 4x SFP28.

All 64x QSFP28 ports can break out into 2x QSFP28 without disabling ports.

128x 1G - 32x QSFP28 break out into 4x and set to 1G

Maximum 1G ports: 128

128x 10G - 32x QSFP28 break out into 4x and set to 10G

Maximum 10G ports: 128

128x 25G - 32x QSFP28 break out into 4x

Maximum 25G ports: 128

64x 40G - 64x QSFP28 set to 40G

Maximum 40G ports: 64

128x 50G - 64x QSFP28 break out into 2x

Maximum 50G ports: 128

64x 100G - 64x QSFP28 (native speed)

Maximum 100G ports: 64

SN4600 64xQSFP56 (200GbE) interfaces support both PAM4 and NRZ encodings with all speeds down to 1G.

For lower speeds, PAM4 is automatically converted to NRZ encoding.

Only 32xQSFP56 ports can break out into 4xSFP56 (4x50GbE). In this case, the adjacent QSFP56 ports are blocked (only the first and third or second and fourth rows can break out into 4xSFP56).

All 64xQSFP56 ports can break out into 2xQSFP56 (2x100GbE) without blocking ports.

128x 1G - 32xQSFP56 break out into 4x and set to 1G

Maximum 1G ports: 128

128x 10G - 32xQSFP56 break out into 4x and set to 10G

Maximum 10G ports: 128

128x 25G - 32xQSFP56 break out into 4x and set to 25G

Maximum 25G ports: 128

64x 40G - 64xQSFP56 set to 40G

Maximum 40G ports: 64

128x 50G - 32xQSFP56 break out into 4x

Maximum 50G ports: 128

  • 128x 100G - 64xQSFP56 break out into 2x
  • 64x 100G - 64xQSFP28 set to 100G

Maximum 100G ports: 128

64x 200G - 64xQSFP56 (native speed)

Maximum 200G ports: 64

SN4700 32x QSFP-DD 400GbE interfaces support both PAM4 and NRZ encodings. You can set all speeds down to 1G.

For lower speed interface configurations, PAM4 is automatically converted to NRZ encoding.

Only the top 16x QSFP-DD ports can break out into 8x SFP56. You must disable the adjacent QSFP-DD port.

All 32x QSFP-DD ports can break out into 2x QSFP56 at 2x200G or 4x QSFP56 at 4x 100G without disabling ports.

128x 1G - Top 16x QSFP-DD break out into 8x and set to 1G

Maximum 1G ports: 128

128x 10G - 16x QSFP-DD break out into 8x and set to 10G

Maximum 10G ports: 128

*Cumulus Linux supports other QSFP-DD breakout combinations up to maximum of 128x ports.

128x 25G - 16x QSFP-DD break out into 8x and set to 25G

Maximum 25G ports: 128

*Cumulus Linux supports other QSFP-DD breakout combinations up to maximum of 128x ports.

32x 40G - 32x QSFP-DD set to 40G

Maximum 40G ports: 32

128x 50G - 16x QSFP-DD break out into 8x

Maximum 50G ports: 128

*Cumulus Linux supports other QSFP-DD breakout combinations up to maximum of 128x ports.

128x 100G - 32x QSFP-DD break out into 4x

Maximum 100G ports: 128

64x 200G - 32x QSFP-DD break out into 2x

Maximum 200G ports: 64

32x 400G - 32x QSFP-DD (native speed)

Maximum 400G ports: 32

SN5400 64xQSFP-DD (400GbE) interfaces support both PAM4 and NRZ encodings with all speeds down to 10G.

For lower speeds, PAM4 is automatically converted to NRZ encoding.

Bonus ports #65 and #66 support 1G, 10G, and 25G but do not support breakouts.

Maximum 1G ports: 2 (bonus ports)

258x 10G

Maximum 10G ports: 258 (64 ports breakout 4x + 2 bonus ports)

258x 25G

Maximum 25G ports: 258 (64 ports breakout 4x + 2 bonus ports)

Maximum 40G ports: 128

256x 50G

Maximum 50G ports: 256 (32 odd ports breakout into 8x)

256x 100G

Maximum 100G ports: 256 (64 ports breakout into 4x)

128x 200G

Maximum 200G ports: 128 (64 ports breakout into 2x)

Maximum 400G ports: 64

SN5600 64xOSFP (800GbE) interfaces support both PAM4 and NRZ encodings with all speeds down to 10G.

For lower speeds, PAM4 is automatically converted to NRZ encoding.

Bonus port #65 supports 1G, 10G, and 25G but does not support breakouts.

Maximum 1G ports: 1 (bonus port)

256x 10G

Maximum 10G ports: 257 (256 + 1 bonus port)

256x 25G

Maximum 25G ports: 257 (256 + 1 bonus port)

128x 40G

Maximum 40G ports: 128

256x 50G - 32x OSFP break out into 8x - You must disable the adjacent OSFP port.

Maximum 50G ports: 256

256x 100G - 32x OSFP break out into 8x - You must disable the adjacent OSFP port.

Maximum 100G ports: 256

256x 200G - 64x OSFP break out into 4x

Maximum 200G ports: 256

128x 400G - 64x OSFP break out into 2x

Maximum 400G ports: 128

64x 800G

Maximum 800G ports: 64

  • You can use a single SFP (10/25/50G) transceiver in a QSFP (100/200/400G) port with QSFP-to-SFP Adapter (QSA). Set the port speed to the SFP speed with the nv set interface <interface> link speed <speed> command. Do not configure this port as a breakout port.
  • If you break out a port, then reload the switchd service on a switch running in nonatomic ACL mode, temporary disruption to traffic occurs while the ACLs reinstall.
  • Cumulus Linux does not support port ganging.

Configure a Breakout Port

You can break out (split) a port using the following options:

If you split a 100G port into four interfaces and auto-negotiation is on (the default setting), Cumulus Linux advertises the speed for each interface up to the maximum speed possible for a 100G port (100/4=25G). You can override this configuration and set specific speeds for the split ports if necessary.

  • Cumulus Linux 5.4 and later uses a new format for port splitting; instead of 1=100G or 1=4x10G, you specify 1=1x or 1=4x. The new format does not support specifying a speed for breakout ports in the /etc/cumulus/ports.conf file. To set a speed, either set the link-speed parameter for each split port in the /etc/network/interfaces file or run the NVUE nv set interface <interface> link speed <speed> command.

The following example breaks out a 100G port on swp1 into four interfaces. Cumulus Linux advertises the speed for each interface up to a maximum of 25G:

cumulus@switch:~$ nv set interface swp1 link breakout 4x
cumulus@switch:~$ nv set interface swp1s0-3 link state up
cumulus@switch:~$ nv config apply
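The split interface names (swp1s0 through swp1s3) and the 100/4=25G ceiling can be worked out with a few lines of Python. This is a sketch for illustration only: split_port is a hypothetical helper, not an NVUE command or API, and it assumes the port speed divides evenly across the split interfaces.

```python
def split_port(port: str, port_speed_gbps: int, split: int) -> list[tuple[str, int]]:
    """Return (interface name, maximum advertised speed in Gbps) for each split interface."""
    per_interface = port_speed_gbps // split
    return [(f"{port}s{i}", per_interface) for i in range(split)]


for name, speed in split_port("swp1", 100, 4):
    print(name, f"{speed}G")
```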

The following example splits the port into four interfaces and forces the link speed to be 10G. Cumulus disables auto-negotiation when you force set the speed.

cumulus@switch:~$ nv set interface swp1 link breakout 4x
cumulus@switch:~$ nv set interface swp1s0-3 link state up
cumulus@switch:~$ nv set interface swp1s0-3 link speed 10G
cumulus@switch:~$ nv config apply

Certain switches, such as the SN2700, SN4600, and SN4600c, require that you disable the subsequent even-numbered port when you configure a breakout port for 4x or 8x. NVUE automatically disables the subsequent even-numbered port on any switch with this requirement.

  1. To split a port into multiple interfaces, edit the /etc/cumulus/ports.conf file. The following example command breaks out swp1 into four interfaces.

    cumulus@switch:~$ sudo cat /etc/cumulus/ports.conf
    ...
    1=4x 
    2=disabled 
    3=1x 
    4=1x 
    ...
    

When you configure a breakout port to 4x or 8x on certain switches such as the SN2700, SN4600, and SN4600c, you must set the subsequent even-numbered port to disabled in the /etc/cumulus/ports.conf file. The SN3700, SN3700c, SN2201, SN2010, and SN2100 switches do not have this requirement.

  2. Reload switchd with the sudo systemctl reload switchd.service command. The reload does not interrupt network services.

    cumulus@switch:~$ sudo systemctl reload switchd.service
    
  3. To configure specific speeds for the split ports, edit the /etc/network/interfaces file, then run the ifreload -a command. The following example configures the speed for each swp1 breakout port (swp1s0, swp1s1, swp1s2, and swp1s3) to 10G with auto-negotiation off.

cumulus@switch:~$ sudo cat /etc/network/interfaces
...
auto swp1s0
iface swp1s0
    link-speed 10000 
    link-duplex full 
    link-autoneg off
auto swp1s1
iface swp1s1
    link-speed 10000 
    link-duplex full 
    link-autoneg off
auto swp1s2
iface swp1s2
    link-speed 10000 
    link-duplex full 
    link-autoneg off
auto swp1s3
iface swp1s3
    link-speed 10000 
    link-duplex full 
    link-autoneg off
...
cumulus@switch:~$ sudo ifreload -a
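Because the four stanzas above differ only in the interface name, you might generate them with a script instead of typing them by hand. The following Python sketch prints the same stanzas; breakout_stanzas is a hypothetical helper, and you would still need to place the output in the /etc/network/interfaces file and run ifreload -a yourself.

```python
def breakout_stanzas(port: str, split: int, speed_mbps: int) -> str:
    """Build /etc/network/interfaces stanzas for each split interface of a port."""
    blocks = []
    for i in range(split):
        name = f"{port}s{i}"
        blocks.append(
            f"auto {name}\n"
            f"iface {name}\n"
            f"    link-speed {speed_mbps}\n"
            f"    link-duplex full\n"
            f"    link-autoneg off\n"
        )
    return "".join(blocks)


# Four 10G interfaces on swp1, matching the example above.
print(breakout_stanzas("swp1", 4, 10000), end="")
```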

The SN4700 and SN4410 switches do not support auto-negotiation on QSFP-DD 400G transceiver modules. You need to force set the speed.

Set the Number of Lanes per Split Port

By default, Cumulus Linux calculates the split port width with the formula split port width = full port width / breakout. For example, an 8-lane port split into two interfaces (2x breakout) has 8 lanes / 2 = 4 lanes per split port.

If you need to use a different port width than the default, you can set the number of lanes per port.
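The default formula is small enough to check in Python. This sketch only restates the arithmetic above; lanes_per_split_port is a hypothetical helper, and rejecting uneven splits is an assumption made for the example, not documented switch behavior.

```python
def lanes_per_split_port(full_port_lanes: int, breakout: int) -> int:
    """Default split port width = full port width / breakout."""
    if full_port_lanes % breakout:
        raise ValueError("lanes do not divide evenly across the split ports")
    return full_port_lanes // breakout


# An 8-lane port with a 2x breakout
print(lanes_per_split_port(8, 2))
# An 8-lane port with a 4x breakout
print(lanes_per_split_port(8, 4))
```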

The following example command splits swp1 into two interfaces (2x) and sets the number of lanes per split port to 2.

cumulus@switch:~$ nv set interface swp1 link breakout 2x lanes-per-port 2
cumulus@switch:~$ nv config apply

Edit the /etc/cumulus/ports_width.conf file and add the number of lanes per split port you want to use, then reload switchd:

cumulus@switch:~$ sudo nano /etc/cumulus/ports_width.conf
...
1=2
2=default
3=default
4=default
5=default
6=default
7=default
8=default
...
cumulus@switch:~$ sudo systemctl reload switchd.service

In 5.9 and later, the 4x breakout on QSFP-DD/OSFP 8 lane ports allocates two lanes per port by default instead of one lane. Be sure to configure the lanes per port on both ends of a connection to be the same.

Remove a Breakout Port

To remove a breakout port:

  1. Run the nv unset interface <interface> command. For example:

    cumulus@switch:~$ nv unset interface swp1s0
    cumulus@switch:~$ nv unset interface swp1s1
    cumulus@switch:~$ nv unset interface swp1s2
    cumulus@switch:~$ nv unset interface swp1s3
    cumulus@switch:~$ nv config apply
    
  2. Run the nv unset interface <interface> link breakout command to configure the interface for the original speed. For example:

    cumulus@switch:~$ nv unset interface swp1 link breakout
    cumulus@switch:~$ nv config apply
    
  1. Edit the /etc/cumulus/ports.conf file to configure the interface for the original speed.

    cumulus@switch:~$ sudo nano /etc/cumulus/ports.conf
    ...
    1=1x 
    2=1x 
    3=1x 
    4=1x 
    ...
    
  2. Reload switchd. The reload does not interrupt network services.

    cumulus@switch:~$ sudo systemctl reload switchd.service
    
  3. Remove the breakout interface configuration from the /etc/network/interfaces file, then run the ifreload -a command.

Configure Port Lanes

You can override the default behavior for supported speeds and platforms and specify the number of lanes for a port. For example, on the NVIDIA SN4700 switch, a port uses 2 lanes with NRZ signaling at 50G and 4 lanes with NRZ signaling at 100G by default. You can override these defaults to use 1 lane with PAM4 signaling at 50G and 2 lanes with PAM4 signaling at 100G.

This setting does not apply when auto-negotiation is on because Cumulus Linux advertises all supported speed options, including PAM4 and NRZ during auto-negotiation.

cumulus@switch:~$ nv set interface swp1 link speed 50G
cumulus@switch:~$ nv set interface swp1 link lanes 1
cumulus@switch:~$ nv config apply 
cumulus@switch:~$ nv set interface swp2 link speed 100G
cumulus@switch:~$ nv set interface swp2 link lanes 2
cumulus@switch:~$ nv config apply

  1. Edit the /etc/network/interfaces file.

    cumulus@switch:~$ sudo nano /etc/network/interfaces
    ...
    auto swp1
    iface swp1
        link-lanes 1
        link-speed 50000
    auto swp2
    iface swp2
        link-lanes 2
        link-speed 100000
    
  2. Run the ifreload -a command:

    cumulus@switch:~$ sudo ifreload -a
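The relationship between speed, lane count, and signaling mode in the SN4700 example can be sketched in Python. The 25G per lane threshold is an assumption that fits the four combinations named in this section (50G and 100G with NRZ or PAM4); signaling_mode is a hypothetical helper and does not query the switch.

```python
def signaling_mode(speed_gbps: int, lanes: int) -> str:
    """Guess the signaling mode from the per-lane rate (assumed: above 25G per lane is PAM4)."""
    per_lane = speed_gbps / lanes
    return "PAM4" if per_lane > 25 else "NRZ"


for speed, lanes in [(50, 2), (100, 4), (50, 1), (100, 2)]:
    print(f"{speed}G over {lanes} lane(s): {signaling_mode(speed, lanes)}")
```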
    

ports.conf File Validator

Cumulus Linux includes a ports.conf validator that switchd runs automatically before the switch starts up to confirm that the file syntax is correct. You can run the validator manually to verify the syntax of the file whenever you make changes. The validator is useful if you want to copy a new ports.conf file to the switch with automation tools, then validate that it has the correct syntax.

To run the validator manually, run the /usr/cumulus/bin/validate-ports -f <file> command. For example:

cumulus@switch:~$ /usr/cumulus/bin/validate-ports -f /etc/cumulus/ports.conf
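If you generate ports.conf files with automation, you might add your own pre-check before you copy a file to the switch. The following Python sketch checks a single rule from this section: on switches that require it, the port after an odd-numbered 4x or 8x breakout port must be set to disabled. It is not the validate-ports tool and does not replace it. The function names are hypothetical, and the parser assumes simple port=value lines with optional # comments.

```python
def parse_ports_conf(text: str) -> dict[int, str]:
    """Read 'port=value' lines; skip comments, blanks, and anything else."""
    ports = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        port, mode = line.split("=", 1)
        if port.strip().isdigit():
            ports[int(port)] = mode.strip()
    return ports


def missing_disabled_ports(ports: dict[int, str]) -> list[int]:
    """List even-numbered ports that follow a 4x or 8x breakout but are not disabled."""
    problems = []
    for port, mode in sorted(ports.items()):
        # Only odd-numbered ports are checked; the port that follows is even.
        if port % 2 == 1 and mode in ("4x", "8x"):
            if ports.get(port + 1) != "disabled":
                problems.append(port + 1)
    return problems


good = parse_ports_conf("1=4x\n2=disabled\n3=1x\n4=1x\n")
bad = parse_ports_conf("1=4x\n2=1x\n3=1x\n4=1x\n")
print(missing_disabled_ports(good))
print(missing_disabled_ports(bad))
```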

Troubleshooting

This section shows basic commands for troubleshooting switch ports. For a more comprehensive troubleshooting guide, see Troubleshoot Layer 1.

Interface Settings

To see all settings for an interface, run the nv show interface <interface> command:

cumulus@switch:~$ nv show interface swp1
                          operational        applied
------------------------  -----------------  -------
type                      swp                swp    
[acl]                                               
evpn                                                
  multihoming                                       
    uplink                                   off    
ptp                                                 
  enable                                     off    
router                                              
  adaptive-routing                                  
    enable                                   off    
  ospf                                              
    enable                                   off    
  ospf6                                             
    enable                                   off    
  pbr                                               
    [map]                                           
  pim                                               
    enable                                   off    
synce                                               
  enable                                     off    
ip                                                  
  igmp                                              
    enable                                   off    
  ipv4                                              
    forward                                  on     
  ipv6                                              
    enable                                   on     
    forward                                  on     
  neighbor-discovery                                
    enable                                   on     
    [dnssl]                                         
    home-agent                                      
      enable                                 off    
    [prefix]                                        
    [rdnss]                                         
    router-advertisement                            
      enable                                 off    
  vrrp                                              
    enable                                   off    
  vrf                                        default
  [gateway]                                         
link                                                
  auto-negotiate          off                on     
  duplex                  full               full   
  speed                   1G                 auto   
  fec                                        auto   
  mtu                     9000               9216   
  fast-linkup             off                       
  [breakout]                                        
  state                   up                 up     
  stats                                             
    carrier-transitions   4                         
    in-bytes              600 Bytes                 
    in-drops              5                         
    in-errors             0                         
    in-pkts               10                        
    out-bytes             2.11 MB                   
    out-drops             0                         
    out-errors            0                         
    out-pkts              33143                     
  mac                     48:b0:2d:39:3f:83         
ifindex                   3

You can add the --view option to show different views:

cumulus@switch:~$ nv show interface --view <<TAB>>
acl-statistics  carrier-stats   dot1x-counters  lldp-detail     physical        status          vrf
bond-members    counters        dot1x-summary   mac             port-security   svi             
bonds           description     down            mlag-cc         qos-profile     synce-counters  
brief           detail          lldp            neighbor        small           up

For example, the nv show interface --view=small command lists the interfaces on the switch. The nv show interface --view=brief command shows information about each interface on the switch, such as the interface type, speed, remote host and port. The nv show interface --view=mac command shows the MAC address of each interface.

The description column only shows in the output when you use the --view=detail option.

The following example shows the MAC address of each interface on the switch:

cumulus@switch:~$ nv show interface --view=mac
Interface   State  Speed  MTU    MAC                Type    
----------  -----  -----  -----  -----------------  --------
BLUE        up            65575  2a:f9:b5:3c:74:b8  vrf     
RED         up            65575  8e:91:ed:ed:d5:76  vrf     
bond1       up     1G     9000   48:b0:2d:39:3f:83  bond    
bond2       up     1G     9000   48:b0:2d:b3:5e:18  bond    
bond3       up     1G     9000   48:b0:2d:c2:9d:47  bond    
br_default  up            9216   44:38:39:22:01:7a  bridge  
br_l3vni    up            9216   44:38:39:22:01:7a  bridge  
eth0        up     1G     1500   44:38:39:22:01:7a  eth     
lo          up            65536  00:00:00:00:00:00  loopback
mgmt        up            65575  8a:58:d0:25:47:7d  vrf     
swp1        up     1G     9000   48:b0:2d:39:3f:83  swp     
swp2        up     1G     9000   48:b0:2d:b3:5e:18  swp     
swp3        up     1G     9000   48:b0:2d:c2:9d:47  swp     
swp4        down          1500   48:b0:2d:c2:7e:cd  swp     
swp5        down          1500   48:b0:2d:6e:bc:c1  swp     
swp6        down          1500   48:b0:2d:2d:89:16  swp 
...

The following example shows the bonds on the switch:

cumulus@switch:~$ nv show interface --view=bonds
Interface  Admin Status  Oper Status  Mode  Mlag ID  Lacp-rate  Lacp-bypass  Up-delay  Down-delay  
---------  ------------  -----------  ----  -------  ---------  -----------  --------  ----------  
bond1      up            up                 1        fast       on           50000     40000  
bond2      up            up                 2        fast       on           60000     20000 

You can filter the nv show interface command output on specific columns. For example, the nv show interface --filter mtu=1500 shows only the interfaces with MTU set to 1500.

To filter on multiple column outputs, enclose the filter expression in quotation marks and join the conditions with an ampersand (&); for example, nv show interface --filter "type=bridge&mtu=9216" shows data for bridges with MTU 9216.

You can filter on all revisions (operational, applied, and pending); for example, nv show interface --filter mtu=1500 --rev=applied shows only the interfaces with MTU set to 1500 in the applied revision.

The following example shows information for all bridges configured on the switch with MTU 9216:

cumulus@switch:~$ nv show interface --filter "type=bridge&mtu=9216"
Interface   State  Speed  MTU   Type    Remote Host  Remote Port  Summary                                
----------  -----  -----  ----  ------  -----------  -----------  ---------------------------------------
br_default  up            9216  bridge                            IP Address: fe80::4638:39ff:fe22:17a/64
br_l3vni    up            9216  bridge                            IP Address: fe80::4638:39ff:fe22:17a/64
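The way a combined filter narrows the output can be mimicked in Python: a row is kept only when every column=value condition matches. This sketch illustrates that matching logic on a few rows taken from the examples above; it is not how NVUE implements filtering, and matches is a hypothetical helper.

```python
def matches(row: dict[str, str], expression: str) -> bool:
    """Return True when the row satisfies every 'column=value' condition joined by '&'."""
    for condition in expression.split("&"):
        column, _, wanted = condition.partition("=")
        if row.get(column.strip()) != wanted.strip():
            return False
    return True


# A few rows based on the interface listings shown earlier.
ROWS = [
    {"interface": "br_default", "type": "bridge", "mtu": "9216"},
    {"interface": "br_l3vni", "type": "bridge", "mtu": "9216"},
    {"interface": "eth0", "type": "eth", "mtu": "1500"},
    {"interface": "swp1", "type": "swp", "mtu": "9000"},
    {"interface": "swp4", "type": "swp", "mtu": "1500"},
]

bridges = [r["interface"] for r in ROWS if matches(r, "type=bridge&mtu=9216")]
mtu_1500 = [r["interface"] for r in ROWS if matches(r, "mtu=1500")]
print(bridges)
print(mtu_1500)
```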

Statistics

To show interface statistics, run the NVUE nv show interface <interface> counters command or the Linux sudo ethtool -S <interface> command.

cumulus@switch:~$ nv show interface swp1 counters
                    operational  applied
-------------------  -----------  -------
carrier-transitions  4                   
in-bytes             3.37 MB             
in-drops             0                   
in-errors            0                   
in-pkts              29025               
out-bytes            4.28 MB             
out-drops            0                   
out-errors           0                   
out-pkts             43945  
...

For more information about showing and clearing interface counters, refer to Monitoring Interfaces and Transceivers with NVUE.

SFP Port Information

To verify SFP settings, run the NVUE nv show interface <interface> transceiver command or the ethtool -m command. The following example shows the vendor, type and power output for swp1.

cumulus@switch:~$ nv show interface swp1 transceiver
cable-length           : 3m 
diagnostics-status     : Diagnostic Data Available 
revision-compliance    : SFF-8636 Rev 2.5/2.6/2.7 
vendor-data-code       : 210215__ 
identifier             : QSFP28 
vendor-rev             : B2 
vendor-name            : Mellanox 
vendor-pn              : MFA1A00-C003 
vendor-sn              : MT2108FT02204 
temperature            : 48.93 degrees C / 120.07 degrees F 
voltage                : 3.2744 V 
ch-1-rx-power          : 0.0000 mW / -inf dBm 
ch-1-tx-power          : 0.0000 mW / -inf dBm 
ch-1-tx-bias-current   : 0.000 mA 
ch-2-rx-power          : 0.0000 mW / -inf dBm 
ch-2-tx-power          : 0.0000 mW / -inf dBm 
ch-2-tx-bias-current   : 0.000 mA 
ch-3-rx-power          : 0.0000 mW / -inf dBm 
ch-3-tx-power          : 0.0000 mW / -inf dBm 
ch-3-tx-bias-current   : 0.000 mA 
ch-4-rx-power          : 0.0000 mW / -inf dBm 
ch-4-tx-power          : 0.0000 mW / -inf dBm
ch-4-tx-bias-current   : 0.000 mA
cumulus@switch:~$ sudo ethtool -m swp1 | egrep 'Vendor|type|power\s+:'
Transceiver type                          : 10G Ethernet: 10G Base-LR
Vendor name                               : FINISAR CORP.
Vendor OUI                                : 00:90:65
Vendor PN                                 : FTLX2071D327
Vendor rev                                : A
Vendor SN                                 : UY30DTX
Laser output power                        : 0.5230 mW / -2.81 dBm
Receiver signal average optical power     : 0.7285 mW / -1.38 dBm
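
Both commands report optical power in milliwatts and in dBm. The two units are related by dBm = 10 × log10(power in mW), which is why a reading of 0.0000 mW appears as -inf dBm. A minimal Python sketch of the conversion (the function name mw_to_dbm is illustrative):

```python
import math


def mw_to_dbm(milliwatts):
    """Convert optical power in mW to dBm: dBm = 10 * log10(P / 1 mW)."""
    if milliwatts <= 0:
        # No light detected; matches the "-inf dBm" shown for 0.0000 mW.
        return float("-inf")
    return 10 * math.log10(milliwatts)


for reading in (0.5230, 0.7285, 0.0):
    print(f"{reading:.4f} mW / {mw_to_dbm(reading):.2f} dBm")
```

For the readings in the ethtool output above, 0.5230 mW and 0.7285 mW convert to -2.81 dBm and -1.38 dBm.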

Considerations

Auto-negotiation and FEC

If auto-negotiation is off on 100G and 25G interfaces, you must set FEC to OFF, RS, or BaseR to match the neighbor. The FEC default setting of auto does not link up when auto-negotiation is off.

If auto-negotiation is on and you set the link speed for a port, Cumulus Linux disables auto-negotiation and uses the port speed setting you configure.
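
The first rule above can be expressed as a simple pre-check before you apply a configuration. The following Python sketch is an illustration only; the function name and the lowercase value spellings are assumptions, and the function returns True for any case the rule does not cover.

```python
def fec_setting_ok(speed, auto_negotiate, fec):
    """Return False when the FEC setting cannot link up under the rule above.

    Hypothetical helper: speed is a string such as "25G" or "100G", and
    fec is one of "auto", "off", "rs", or "baser".
    """
    if speed in ("25G", "100G") and not auto_negotiate:
        # With auto-negotiation off, FEC auto does not link up;
        # an explicit setting that matches the neighbor is required.
        return fec in ("off", "rs", "baser")
    return True  # the rule above does not restrict other combinations


print(fec_setting_ok("100G", False, "auto"))
print(fec_setting_ok("25G", False, "rs"))
```

The check does not verify that the chosen FEC mode matches the neighbor, which you still need to confirm on the peer.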

Auto-negotiation with the Spectrum-4 Switch

When you connect an NVIDIA Spectrum-4 switch to another NVIDIA Spectrum-4 switch with PAM4 modulation, you must enable auto-negotiation.

1000BASE-T SFP Modules Supported Only on Certain 25G Platforms

The following 25G switches support 1000BASE-T SFP modules:

100G or faster switches do not support 1000BASE-T SFP modules.

After you reboot the NVIDIA SN2100 switch, eth0 always comes up at 100 Mbps. If you bring the interface down and then back up again, the interface negotiates 1000 Mbps. This only occurs the first time the interface comes up.

To work around this issue, add the following commands to the /etc/rc.local file to flap the interface automatically when the switch boots:

modprobe -r igb
sleep 20
modprobe igb

NVIDIA SN5600 Switch and Force Mode

When you configure force mode on NVIDIA SN5600 switch ports 10 through 50, the Rx precoding setting must be the same on the local and peer ports to get optimal signal integrity on the link.

Delay in Reporting Interface as Operational Down

When you remove two transceivers simultaneously from a switch, both interfaces show the carrier down status immediately. However, it takes one second for the second interface to show the operational down status. In addition, the services on this interface also take an extra second to come down.

NVIDIA Spectrum-2 Switches and FEC Mode

The NVIDIA Spectrum-2 (25G) switch only supports RS FEC.

Connecting NVIDIA SN4410, SN4700, SN5600 to a Spectrum-3 and Earlier Peer Switch

When you connect an NVIDIA SN4410, SN4700, or SN5600 switch to any Spectrum-1, Spectrum-2, or Spectrum-3 peer switch (with four lanes) using a 4x breakout configuration and the default lanes per port setting, links do not come up. To work around this issue, provide the lanes per port configuration shown below:

cumulus@switch:~$ nv set interface <interface> link breakout 4x lanes-per-port 1

ifplugd

ifplugd is an Ethernet link-state monitoring daemon that executes scripts to configure an Ethernet device when you plug in or remove a cable. Follow the steps below to install and configure the ifplugd daemon.

Install ifplugd

You can install this package even if the switch does not connect to the internet. The package is in the cumulus-local-apt-archive repository on the Cumulus Linux image.

To install ifplugd:

  1. Update the switch before installing the daemon:

    cumulus@switch:~$ sudo -E apt-get update
    
  2. Install the ifplugd package:

    cumulus@switch:~$ sudo -E apt-get install ifplugd
    

Configure ifplugd

After you install ifplugd, you must edit two configuration files: /etc/default/ifplugd and /etc/ifplugd/action.d/ifupdown.

The example configuration below configures ifplugd to bring down all uplinks when the peer bond goes down in an MLAG environment.

  1. Open /etc/default/ifplugd in a text editor and configure the file as appropriate. Add the peerbond name before you save the file.

    INTERFACES="peerbond"
    HOTPLUG_INTERFACES=""
    ARGS="-q -f -u0 -d1 -w -I"
    SUSPEND_ACTION="stop"
    
  2. Open the /etc/ifplugd/action.d/ifupdown file in a text editor. Configure the script, then save the file.

    #!/bin/sh
    set -e
    case "$2" in
    up)
            clagrole=$(clagctl | grep "Our Priority" | awk '{print $8}')
            if [ "$clagrole" = "secondary" ]
            then
                #List all the interfaces below to bring up when clag peerbond comes up.
                for interface in swp1 bond1 bond3 bond4
                do
                    echo "bringing up : $interface"  
                    ip link set $interface up
                done
            fi
        ;;
    down)
            clagrole=$(clagctl | grep "Our Priority" | awk '{print $8}')
            if [ "$clagrole" = "secondary" ]
            then
                #List all the interfaces below to bring down when clag peerbond goes down.
                for interface in swp1 bond1 bond3 bond4
                do
                    echo "bringing down : $interface"
                    ip link set $interface down
                done
            fi
        ;;
    esac
    
  3. Restart the ifplugd daemon to implement the changes:

    cumulus@switch:~$ sudo systemctl restart ifplugd.service
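
The decision logic in the action script above is small enough to model and test off the switch. The following Python sketch mirrors it: the script acts only when the switch holds the secondary MLAG role, and it sets every listed uplink to the state of the peer bond. The function name link_commands is hypothetical, and the sketch only returns the commands; it does not run them.

```python
def link_commands(event, clag_role, uplinks):
    """Return the `ip link` commands the action script would run.

    event: "up" or "down" (the second argument ifplugd passes to the script).
    clag_role: the MLAG role of this switch, such as "primary" or "secondary".
    uplinks: interface names to follow the peer bond state.
    """
    if event not in ("up", "down") or clag_role != "secondary":
        return []  # the script acts only on the secondary switch
    return [f"ip link set {name} {event}" for name in uplinks]


for command in link_commands("down", "secondary", ["swp1", "bond1", "bond3", "bond4"]):
    print(command)
```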
    

Considerations

The default shell for ifplugd is dash (/bin/sh) instead of bash, as it provides a faster and more nimble shell. However, dash contains fewer features than bash (for example, dash is unable to handle multiple uplinks).

802.1X Interfaces

The IEEE 802.1X protocol provides a way to authenticate a client (called a supplicant) over wired media. It also provides access for individual MAC addresses on a switch (called the authenticator) after an authentication server authenticates the MAC addresses. The authentication server is typically a RADIUS server.

A Cumulus Linux switch acts as an intermediary between the clients connected to the wired ports and the authentication server, which is reachable over the existing network. EAPOL operates on top of the data link layer; the switch uses EAPOL to communicate with supplicants connected to the switch ports.

Cumulus Linux implements 802.1X using a modified version of the Debian hostapd package to support auth-fail and dynamic VLANs with MBA and EAP authentication for 802.1X interfaces.

  • Cumulus Linux supports 802.1X on physical interfaces (such as swp1 or swp2s0) that are bridge access ports; the interfaces cannot be part of a bond.
  • Routed interfaces, bond interfaces, and bridged trunk ports do not support 802.1X.
  • To enable 802.1X on an access-port, it must be a member of the default NVUE bridge br_default.
  • eth0 does not support 802.1X.
  • Cumulus Linux 802.1X testing covers only a few supplicants: wpa_supplicant (Debian), Windows 10, and Windows 7.
  • Cumulus Linux supports RADIUS authentication with FreeRADIUS and Cisco ACS.
  • 802.1X supports simple login and password, and EAP-TLS (Debian).
  • 802.1X supports RFC 5281 for EAP-TTLS, which provides more secure transport layer security.

Mako template-based configurations do not support 802.1X.

Configure the RADIUS Server

Before you can authenticate with 802.1X on your switch, you must configure a RADIUS server somewhere in your network. Popular examples of commercial software with RADIUS capability include Cisco ISE and Aruba ClearPass.

You can also use open source versions of software supporting RADIUS such as PacketFence and FreeRADIUS. This section discusses how to add FreeRADIUS to a Debian server on your network.

  • Do not use a Cumulus Linux switch as the RADIUS server.
  • You can configure up to three RADIUS servers (in case of failover).

To add FreeRADIUS on a Debian server:

root@radius:~# apt-get update
root@radius:~# apt-get install freeradius

After you install and configure FreeRADIUS, the FreeRADIUS server can serve Cumulus Linux running hostapd as a RADIUS client. For more information, see the FreeRADIUS documentation.

Configure 802.1X Interfaces

To configure an 802.1X interface:

Changing the 802.1X interface settings does not reset existing authorized user ports. However, removing all 802.1X interfaces or changing the RADIUS server IP address, shared secret, authentication port, accounting port, or EAP reauthentication interval restarts hostapd, which forces existing, authorized users to reauthenticate.

The following example:

  • Sets the 802.1X RADIUS server IP address to 10.10.10.1 and the shared secret to mysecret.
  • Enables 802.1X on swp1 through swp3.

cumulus@switch:~$ nv set system dot1x radius server 10.10.10.1 shared-secret mysecret
cumulus@switch:~$ nv set interface swp1,swp2,swp3 dot1x eap enabled 
cumulus@switch:~$ nv config apply

The following example:

  • Sets the 802.1X RADIUS server IP address to 10.10.10.1 and the VRF to BLUE.
  • Sets the 802.1X RADIUS shared secret to mysecret.
  • Sets the 802.1X RADIUS authentication port to 2813.
  • Sets the 802.1X RADIUS accounting port to 2812.
  • Sets the fixed IP address for the RADIUS client to receive requests to 10.10.10.6.
  • Sets the EAP reauthentication interval to 40.
  • Enables 802.1X on swp1, swp2, and swp3.

cumulus@switch:~$ nv set system dot1x radius server 10.10.10.1 vrf BLUE
cumulus@switch:~$ nv set system dot1x radius server 10.10.10.1 shared-secret mysecret
cumulus@switch:~$ nv set system dot1x radius server 10.10.10.1 authentication-port 2813 
cumulus@switch:~$ nv set system dot1x radius server 10.10.10.1 accounting-port 2812 
cumulus@switch:~$ nv set system dot1x radius client-src-ip 10.10.10.6
cumulus@switch:~$ nv set system dot1x reauthentication-interval 40
cumulus@switch:~$ nv set interface swp1,swp2,swp3 dot1x eap enabled 
cumulus@switch:~$ nv config apply
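
When you configure several switches the same way, you can generate the command list instead of typing it. The following Python sketch produces the NVUE commands used in the examples above from a few parameters. The function name dot1x_commands is hypothetical; the sketch only builds strings and covers only the server, VRF, port, and interface settings.

```python
def dot1x_commands(server, shared_secret, interfaces, vrf=None,
                   auth_port=None, acct_port=None):
    """Return NVUE commands for a basic 802.1X setup, in the order used above."""
    base = f"nv set system dot1x radius server {server}"
    commands = []
    if vrf:
        commands.append(f"{base} vrf {vrf}")
    commands.append(f"{base} shared-secret {shared_secret}")
    if auth_port is not None:
        commands.append(f"{base} authentication-port {auth_port}")
    if acct_port is not None:
        commands.append(f"{base} accounting-port {acct_port}")
    # One command enables EAP on a comma-separated list of interfaces.
    commands.append(f"nv set interface {','.join(interfaces)} dot1x eap enabled")
    commands.append("nv config apply")
    return commands


for line in dot1x_commands("10.10.10.1", "mysecret", ["swp1", "swp2", "swp3"]):
    print(line)
```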

When you enable or disable 802.1X on an interface, hostapd reloads; however, existing authorized sessions do not reset.

Edit the /etc/hostapd.conf file to configure 802.1X settings, then restart the hostapd service.

The following example:

  • Sets the 802.1X RADIUS server IP address to 10.10.10.1.
  • Sets the 802.1X RADIUS shared secret to mysecret.
  • Enables 802.1X on swp1 through swp3.

cumulus@switch:~$ sudo nano /etc/hostapd.conf
...
interfaces=swp1,swp2,swp3
...
auth_server_addr=10.10.10.1
auth_server_port=1812
auth_server_shared_secret=mysecret
...

The following example:

  • Sets the 802.1X RADIUS server IP address to 10.10.10.1 and the VRF to BLUE.
  • Sets the 802.1X RADIUS shared secret to mysecret.
  • Sets the 802.1X RADIUS authentication port to 2813.
  • Sets the 802.1X RADIUS accounting port to 2812.
  • Sets the fixed IP address for the RADIUS client to receive requests to 10.10.10.6.
  • Sets the EAP reauthentication interval to 40.
  • Enables 802.1X on swp1 through swp3.

cumulus@switch:~$ sudo nano /etc/hostapd.conf
...
interfaces=swp1,swp2,swp3
...
eap_reauth_period=40
...
auth_server_addr=10.10.10.1%BLUE
auth_server_port=2813
auth_server_shared_secret=mysecret
acct_server_addr=10.10.10.1%BLUE
acct_server_port=2812
acct_server_shared_secret=mysecret
radius_client_addr=10.10.10.6
...
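
To audit the file after editing, you can read the key=value lines back and check the values you care about. The following Python sketch parses a hostapd.conf fragment and splits the address%VRF form used in the example above. Both function names are hypothetical, and the parser handles only simple key=value lines.

```python
def parse_hostapd_conf(text):
    """Parse key=value lines from a hostapd.conf fragment into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and '...' placeholders
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def split_server_addr(value):
    """Split an 'address%VRF' value into (address, vrf); vrf is None if absent."""
    address, sep, vrf = value.partition("%")
    return address, (vrf if sep else None)


fragment = """\
interfaces=swp1,swp2,swp3
...
auth_server_addr=10.10.10.1%BLUE
acct_server_port=2812
"""
conf = parse_hostapd_conf(fragment)
print(split_server_addr(conf["auth_server_addr"]), conf["acct_server_port"])
```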

Enable then restart the hostapd service:

cumulus@switch:~$ sudo systemctl enable hostapd
cumulus@switch:~$ sudo systemctl restart hostapd

NVIDIA recommends you set the following configuration in the /etc/network/interfaces file for the 802.1X enabled interfaces:

...
auto swp1
iface swp1
        bridge-access <vlan>
        bridge-learning off
        mstpctl-bpduguard yes
        mstpctl-portadminedge yes
auto swp2
iface swp2
        bridge-access <vlan>
        bridge-learning off
        mstpctl-bpduguard yes
        mstpctl-portadminedge yes
auto swp3
iface swp3
        bridge-access <vlan>
        bridge-learning off
        mstpctl-bpduguard yes
        mstpctl-portadminedge yes
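
Because the recommended stanza is identical for every port, you can render it from a template. The following Python sketch prints the stanza for each port; the function name dot1x_port_stanza is hypothetical and VLAN 10 is an illustrative value in place of <vlan>.

```python
def dot1x_port_stanza(port, vlan):
    """Render the recommended /etc/network/interfaces stanza for one 802.1X port."""
    return "\n".join([
        f"auto {port}",
        f"iface {port}",
        f"        bridge-access {vlan}",
        "        bridge-learning off",
        "        mstpctl-bpduguard yes",
        "        mstpctl-portadminedge yes",
    ])


# VLAN 10 is only an example value.
print("\n".join(dot1x_port_stanza(port, 10) for port in ("swp1", "swp2", "swp3")))
```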

MAC-based Authentication

MAC-based authentication (MBA) enables bridged interfaces to allow devices to bypass authentication based on their MAC address. This is useful for devices that do not support EAP, such as printers or phones.

You must configure MBA on both the RADIUS server and the RADIUS client (the Cumulus Linux switch).

Changing the MBA settings does not reset existing authorized user ports. However, changing the MBA activation delay restarts hostapd, which forces existing, authorized users to reauthenticate.

To configure MBA:

Enable MBA on a bridged interface. The following example enables MBA on swp1:

cumulus@switch:~$ nv set interface swp1 dot1x mba enabled 
cumulus@switch:~$ nv config apply

Edit the /etc/hostapd.conf file. The following example enables MBA on swp1.

cumulus@switch:~$ sudo nano /etc/hostapd.conf
...
mab_interfaces=swp1
...

Restart the hostapd service:

cumulus@switch:~$ sudo systemctl restart hostapd

Auth-fail VLAN

If a non-authorized supplicant tries to communicate with the switch, you can route traffic from that device to a different VLAN and associate that VLAN with one of the switch ports to which the supplicant attaches. Cumulus Linux assigns the auth-fail VLAN by manipulating the PVID of the interface.

Changing the auth-fail VLAN settings does not reset existing authorized user ports. However, changing the auth-fail VLAN ID restarts hostapd, which forces existing, authorized users to reauthenticate.

The following example sets the auth-fail VLAN ID to 777 and enables auth-fail VLAN on swp1.

cumulus@switch:~$ nv set system dot1x auth-fail-vlan 777 
cumulus@switch:~$ nv set interface swp1 dot1x auth-fail-vlan enabled
cumulus@switch:~$ nv config apply

If the authentication for swp1 fails, the interface moves to the auth-fail VLAN:

cumulus@switch:~$ nv show interface swp1 dot1x 
Interface  MAC Address        Attribute                     Value
---------  -----------------  ----------------------------  -----------------
swp1       00:02:00:00:00:08  Status Flags                  [PARKED_VLAN]
                              Username                      vlan60
                              Authentication Type           MD5
                              VLAN                          777
                              Session Time (seconds)        24772
                              EAPOL Frames RX               9
                              EAPOL Frames TX               12
                              EAPOL Start Frames RX         1
                              EAPOL Logoff Frames RX        0
                              EAPOL Response ID Frames RX   4
                              EAPOL Response Frames RX      8
                              EAPOL Request ID Frames TX    4
                              EAPOL Request Frames TX       8
                              EAPOL Invalid Frames RX       0
                              EAPOL Length Error Frames Rx  0
                              EAPOL Frame Version           2
                              EAPOL Auth Last Frame Source  00:02:00:00:00:08
                              EAPOL Auth Backend Responses  8
                              RADIUS Auth Session ID        C2FED91A39D8D605

Edit the /etc/hostapd.conf file to add the auth-fail VLAN ID and interface:

cumulus@switch:~$ sudo nano /etc/hostapd.conf
...
parking_vlan_interfaces=swp1
parking_vlan_id=777
...

Restart the hostapd service:

cumulus@switch:~$ sudo systemctl restart hostapd

If the authentication for swp1 fails, the interface moves to the auth-fail VLAN.

Dynamic VLAN Assignments

A common requirement for campus networks is to assign dynamic VLANs to specific users in combination with IEEE 802.1X. After a supplicant authenticates, the switch assigns the user a VLAN based on the RADIUS configuration. Cumulus Linux assigns the dynamic VLAN by manipulating the PVID of the interface.

To enable dynamic VLAN assignment globally, where VLAN attributes from the RADIUS server apply to the bridge:

Run the nv set system dot1x dynamic-vlan optional or nv set system dot1x dynamic-vlan required command. If you run the nv set system dot1x dynamic-vlan required command, when VLAN attributes do not exist in the access response packet from the RADIUS server, the user is not authorized and has no connectivity. If the RADIUS server returns VLAN attributes but the user has an incorrect password, the user goes in the auth-fail VLAN (if you configure auth-fail VLAN).

cumulus@switch:~$ nv set system dot1x dynamic-vlan optional
cumulus@switch:~$ nv config apply
cumulus@switch:~$ nv set system dot1x dynamic-vlan required
cumulus@switch:~$ nv config apply
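
The difference between the optional and required settings is easier to see as a decision table. The following Python sketch models the outcomes described above. It is an illustration, not the hostapd implementation; the function name and return values are hypothetical. Two branches are assumptions that the text above does not state: a failed password without an auth-fail VLAN leaves the user unauthorized, and in optional mode a user without VLAN attributes stays on the configured access VLAN.

```python
def dynamic_vlan_outcome(mode, password_ok, vlan_attr, auth_fail_vlan=None):
    """Model the result for a supplicant under the rules described above.

    mode: "optional" or "required".
    vlan_attr: VLAN ID from the RADIUS response, or None if absent.
    auth_fail_vlan: configured auth-fail VLAN ID, or None if not configured.
    Returns a (state, vlan) tuple.
    """
    if not password_ok:
        if auth_fail_vlan is not None:
            return ("auth-fail", auth_fail_vlan)
        return ("unauthorized", None)  # assumption: no auth-fail VLAN configured
    if vlan_attr is None:
        if mode == "required":
            return ("unauthorized", None)  # required: no VLAN attributes, no access
        return ("authorized", None)  # assumption: keeps the configured access VLAN
    return ("authorized", vlan_attr)


print(dynamic_vlan_outcome("required", True, None))
print(dynamic_vlan_outcome("required", False, 100, auth_fail_vlan=777))
```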

The following example shows a typical RADIUS configuration (shown for FreeRADIUS) for a user with dynamic VLAN assignment:

# # VLAN 100 Client Configuration for Freeradius RADIUS Server.
# # This is not part of the CL configuration.
vlan10client Cleartext-Password := "client1password"
      Service-Type = Framed-User,
      Tunnel-Type = VLAN,
      Tunnel-Medium-Type = "IEEE-802",
      Tunnel-Private-Group-ID = 100

Verify the configuration (notice the [AUTHORIZED] status in the output):

cumulus@switch:~$ nv show interface --view=dot1x-summary
Interface  MAC Address        Attribute                     Value
---------  -----------------  ----------------------------  --------------------------
swp1       00:02:00:00:00:08  Status Flags                  [DYNAMIC_VLAN][AUTHORIZED]
                              Username                      host1
                              Authentication Type           MD5
                              VLAN                          888
                              Session Time (seconds)        799
                              EAPOL Frames RX               3
                              EAPOL Frames TX               3
                              EAPOL Start Frames RX         1
                              EAPOL Logoff Frames RX        0
                              EAPOL Response ID Frames RX   1
                              EAPOL Response Frames RX      2
                              EAPOL Request ID Frames TX    1
                              EAPOL Request Frames TX       2
                              EAPOL Invalid Frames RX       0
                              EAPOL Length Error Frames Rx  0
                              EAPOL Frame Version           2
                              EAPOL Auth Last Frame Source  00:02:00:00:00:08
                              EAPOL Auth Backend Responses  2
                              RADIUS Auth Session ID        939B1A53B624FC56

  1. Edit the /etc/hostapd.conf file to set the dynamic_vlan option.

    • Specify 1 for VLAN attributes to be optional.
    • Specify 2 to require VLAN attributes; if VLAN attributes do not exist in the access response packet returned from the RADIUS server, the user is not authorized and has no connectivity. If the RADIUS server returns VLAN attributes but the user has an incorrect password, the user goes in the auth-fail VLAN, if you have configured auth-fail VLAN.

    cumulus@switch:~$ sudo nano /etc/hostapd.conf
    ...
    dynamic_vlan=1
    ...
    
  2. Remove the eap_send_identity=0 option.

  3. Restart the hostapd service:

    cumulus@switch:~$ sudo systemctl restart hostapd
    

The following example shows a typical RADIUS configuration (shown for FreeRADIUS, not typically configured or run on the Cumulus Linux device) for a user with a dynamic VLAN assignment:

# # VLAN 100 Client Configuration for Freeradius RADIUS Server.
# # This is not part of the CL configuration.
vlan10client Cleartext-Password := "client1password"
      Service-Type = Framed-User,
      Tunnel-Type = VLAN,
      Tunnel-Medium-Type = "IEEE-802",
      Tunnel-Private-Group-ID = 100

To disable dynamic VLAN assignment, where Cumulus Linux ignores VLAN attributes sent from the RADIUS server and users authenticate based on existing credentials:

cumulus@switch:~$ nv set system dot1x dynamic-vlan disabled
cumulus@switch:~$ nv config apply

Edit the /etc/hostapd.conf file to set the eap_send_identity option to 0, then restart the hostapd service with the sudo systemctl restart hostapd command.

Enabling or disabling dynamic VLAN assignment restarts hostapd, which forces existing, authorized users to reauthenticate.

MAC Addresses per Port

You can specify the maximum number of authenticated MAC addresses allowed on an interface. You can specify any number between 0 and 255. The default value is 6.

The following example sets the maximum number of authenticated MAC addresses to 10.

cumulus@switch:~$ nv set system dot1x max-stations 10
cumulus@switch:~$ nv config apply
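
If you generate this setting from a script, validate the value before you build the command. The following Python sketch checks the range described above; it assumes that both 0 and 255 are valid values, and the function name max_stations_command is hypothetical.

```python
DEFAULT_MAX_STATIONS = 6  # default value stated above


def max_stations_command(count=DEFAULT_MAX_STATIONS):
    """Return the NVUE command for the per-port limit on authenticated MAC addresses."""
    # Assumption: the documented range 0 to 255 is inclusive.
    if not isinstance(count, int) or not 0 <= count <= 255:
        raise ValueError(f"max-stations must be an integer from 0 to 255, got {count!r}")
    return f"nv set system dot1x max-stations {count}"


print(max_stations_command(10))
```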

Edit the /etc/hostapd.conf file to add the max_num_sta= option. For example:

cumulus@switch:~$ sudo nano /etc/hostapd.conf
eap_server=0
ieee8021x=1
driver=wired
dynamic_vlan=1
max_num_sta=10
...

Restart the hostapd service:

cumulus@switch:~$ sudo systemctl restart hostapd

Host Modes

Cumulus Linux provides the following 802.1X host modes:

Multi Host Mode and MBA

When you enable multi host mode on an 802.1X interface with MBA, the first authorized supplicant does not need to run an EAP client but authorizes according to its MAC address.

Multi Host Mode and Auth-fail VLAN

When you enable multi host mode on an 802.1X interface with auth-fail VLAN and the first supplicant fails to authorize, Cumulus Linux changes the access VLAN on the interface to the auth-fail VLAN. The port does not allow traffic from other MAC addresses.

Multi Host Mode and Port Security

Port security limits port access to a specific number of MAC addresses or specific MAC addresses so that the port does not forward ingress traffic from undefined source addresses.

If you enable port security and 802.1X multi host mode on an interface, the MAC address limit that port security enforces on the interface limits the number of traffic sources after authorization.

In multi host mode, Cumulus Linux adds the authorized supplicant MAC address as a static sticky MAC in the forwarding table. The MAC address limit that port security enforces does not account for the supplicant MAC. For example, when you set the port security MAC limit to 2 on an interface, the supplicant and two more hosts can send traffic through the interface.

If you enable 802.1X after the switch learns port security MAC addresses, Cumulus Linux deletes the dynamic MAC addresses installed with port security from the forwarding table. Cumulus Linux disables bridge learning on an interface with 802.1X configuration; therefore, port security applies only after RADIUS authorizes the first supplicant.

Configure the Host Mode

To configure the host mode on an 802.1X interface:

The following example sets multi host mode on swp1:

cumulus@switch:~$ nv set interface swp1 dot1x host-mode multi-host
cumulus@switch:~$ nv config apply

The following example changes host mode back to the default setting (multi host authenticated) on swp1:

cumulus@switch:~$ nv set interface swp1 dot1x host-mode multi-host-authenticated
cumulus@switch:~$ nv config apply

To change back to the default host mode, you can also run the nv unset interface <interface> dot1x host-mode command.

Edit the /etc/hostapd.conf file to set the multihost_interfaces option to the 802.1X interface on which you want to enable multi host mode, then restart the hostapd service.

The following example configures multi host mode on swp1:

cumulus@switch:~$ sudo nano /etc/hostapd.conf
...
eap_server=0
ieee8021x=1
driver=wired
dynamic_vlan=0
eap_send_identity=
interfaces=swp1,swp2,swp3
voice_interfaces=
mab_interfaces=
dynamic_acl_interfaces=
default_dynamic_acl=default_preauth_dacl.rules
parking_vlan_interfaces=
parking_vlan_id=
multihost_interfaces=swp1
cumulus@switch:~$ sudo systemctl restart hostapd

To change host mode back to the default setting (multi host authenticated), remove the interface from the multihost_interfaces line in the /etc/hostapd.conf file, then restart the hostapd service.

cumulus@switch:~$ sudo nano /etc/hostapd.conf
...
mab_interfaces=
dynamic_acl_interfaces=
default_dynamic_acl=default_preauth_dacl.rules
parking_vlan_interfaces=
parking_vlan_id=
multihost_interfaces=
cumulus@switch:~$ sudo systemctl restart hostapd

When you change the mode on an 802.1X interface from multi host authenticated (with multiple authorized supplicants) to multi host, Cumulus Linux brings down all existing sessions and closes the port until one of the supplicants authenticates successfully.

When you change the mode on an 802.1X interface from multi host to multi host authenticated, Cumulus Linux brings down existing sessions and disables bridge learning.

Show the Current Host Mode

To show the current host mode, run the nv show interface <interface> dot1x command:

cumulus@switch:~$ nv show interface swp1 dot1x
           operational  applied
---------  -----------  ---------- 
eap                     enabled
host-mode               multi-host
...

Deauthenticate an 802.1X Supplicant

To deauthenticate an 802.1X supplicant on an interface, run the nv action deauthenticate interface <interface> dot1x authorized-sessions <mac-address> command:

cumulus@switch:~$ nv action deauthenticate interface swp1 dot1x authorized-sessions 00:55:00:00:00:09

If you do not want to notify the supplicant of the deauthentication, you can add the silent option:

cumulus@switch:~$ nv action deauthenticate interface swp1 dot1x authorized-sessions 00:55:00:00:00:09 silent
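
When you deauthenticate many sessions from a script, it helps to validate the MAC address before building the command. The following Python sketch formats the command shown above with or without the silent option. The function name deauth_command and the MAC pattern are assumptions for illustration; the sketch does not run the command.

```python
import re

# Six colon-separated pairs of hexadecimal digits, as in 00:55:00:00:00:09.
MAC_RE = re.compile(r"^[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}$")


def deauth_command(interface, mac, silent=False):
    """Build the deauthenticate command shown above, validating the MAC format."""
    if not MAC_RE.match(mac):
        raise ValueError(f"not a colon-separated MAC address: {mac!r}")
    command = (f"nv action deauthenticate interface {interface} "
               f"dot1x authorized-sessions {mac}")
    return f"{command} silent" if silent else command


print(deauth_command("swp1", "00:55:00:00:00:09", silent=True))
```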

Troubleshooting

Check Connectivity Between Supplicants

To check connectivity between two supplicants, ping one host from the other:

root@host1:/home/cumulus# ping 10.0.0.2
PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=0.604 ms
64 bytes from 10.0.0.2: icmp_seq=2 ttl=64 time=0.552 ms
^C
--- 10.0.0.2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.552/0.578/0

Show RADIUS Server Configuration

To show the list of RADIUS servers, run the nv show system dot1x radius command:

cumulus@switch:~$ nv show system dot1x radius
          operational  applied  
--------  -----------  ---------
[server]               10.10.10.1

To show configuration information for RADIUS servers, run the nv show system dot1x radius server command:

cumulus@switch:~$ nv show system dot1x radius server
Server      accounting-port  authentication-port  priority  shared-secret  vrf
---------   ---------------  -------------------  --------  -------------  ---
10.10.10.1  1813             1812                 1

To show configuration information for a specific RADIUS server, run the nv show system dot1x radius server <ip-address> command:

cumulus@switch:~$ nv show system dot1x radius server 10.10.10.1
                    operational  applied
-------------------  -----------  -------
priority             1            1
accounting-port      1813         1813
authentication-port  1812         1812
shared-secret                     *

Show 802.1X Configuration and Authorization Information

To check which MAC addresses RADIUS has authorized, run the nv show interface --view=dot1x-summary command:

cumulus@switch:~$ nv show interface --view=dot1x-summary
Interface  Mac-Address        Status      Auth-Type  Username      Vlan  Session-id
---------  -----------------  ----------  ---------  ------------  ----  ----------------
eth0       00:55:00:00:00:09  AUTHORIZED  MBA        005500000009  10    946E00ED478CC8D3
           00:02:00:00:00:09  AUTHORIZED  MD5        vlan10        10    9EA1784C12F4E646
lo         00:55:00:00:00:09  AUTHORIZED  MBA        005500000009  10    946E00ED478CC8D3
           00:02:00:00:00:09  AUTHORIZED  MD5        vlan10        10    9EA1784C12F4E646
mgmt       00:55:00:00:00:09  AUTHORIZED  MBA        005500000009  10    946E00ED478CC8D3
           00:02:00:00:00:09  AUTHORIZED  MD5        vlan10        10    9EA1784C12F4E646
swp1       00:55:00:00:00:09  AUTHORIZED  MBA        005500000009  10    946E00ED478CC8D3
           00:02:00:00:00:09  AUTHORIZED  MD5        vlan10        10    9EA1784C12F4E646
swp2       00:55:00:00:00:09  AUTHORIZED  MBA        005500000009  10    946E00ED478CC8D3
           00:02:00:00:00:09  AUTHORIZED  MD5        vlan10        10    9EA1784C12F4E646
swp3       00:55:00:00:00:09  AUTHORIZED  MBA        005500000009  10    946E00ED478CC8D3
           00:02:00:00:00:09  AUTHORIZED  MD5        vlan10        10    9EA1784C12F4E646
swp4       00:55:00:00:00:09  AUTHORIZED  MBA        005500000009  10    946E00ED478CC8D3
           00:02:00:00:00:09  AUTHORIZED  MD5        vlan10        10    9EA1784C12F4E646
swp5       00:55:00:00:00:09  AUTHORIZED  MBA        005500000009  10    946E00ED478CC8D3
           00:02:00:00:00:09  AUTHORIZED  MD5        vlan10        10    9EA1784C12F4E646
swp6       00:55:00:00:00:09  AUTHORIZED  MBA        005500000009  10    946E00ED478CC8D3
           00:02:00:00:00:09  AUTHORIZED  MD5        vlan10        10    9EA1784C12F4E646
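
In the summary output, a row that starts with whitespace belongs to the interface named on the previous non-indented row. The following Python sketch groups sessions by interface on that basis. It assumes the text layout shown above and that no field contains whitespace; the function name parse_dot1x_summary is hypothetical.

```python
def parse_dot1x_summary(output):
    """Group sessions from `nv show interface --view=dot1x-summary` by interface.

    Rows that begin with whitespace are continuation rows for the interface
    named on the most recent non-indented row.
    """
    columns = ("mac", "status", "auth_type", "username", "vlan", "session_id")
    sessions = {}
    current = None
    for line in output.splitlines():
        fields = line.split()
        if not fields or fields[0] == "Interface" or set(fields[0]) == {"-"}:
            continue  # blank, header, or separator row
        if not line[0].isspace():
            current = fields.pop(0)  # first field is the interface name
        if current is None or len(fields) != len(columns):
            continue  # row does not match the expected layout
        sessions.setdefault(current, []).append(dict(zip(columns, fields)))
    return sessions


sample = """\
Interface  Mac-Address        Status      Auth-Type  Username      Vlan  Session-id
---------  -----------------  ----------  ---------  ------------  ----  ----------------
swp1       00:55:00:00:00:09  AUTHORIZED  MBA        005500000009  10    946E00ED478CC8D3
           00:02:00:00:00:09  AUTHORIZED  MD5        vlan10        10    9EA1784C12F4E646
"""
for interface, entries in parse_dot1x_summary(sample).items():
    print(interface, [entry["mac"] for entry in entries])
```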

To show 802.1X configuration settings and authenticated session information for an interface, run the nv show interface <interface> dot1x command:

cumulus@switch:~$ nv show interface swp1 dot1x
                operational  applied
--------------  -----------  --------
eap                          enabled
mba                          disabled
auth-fail-vlan               disabled

Authenticated Sessions
=========================
    Mac                Auth-Type  Session-id        Status      Username      Vlan  Eapol TX  Eapol RX  Err RX  Req TX  Resp RX  Start RX  Req-id TX  Resp-id RX  Invalid RX  Logoff RX
    -----------------  ---------  ----------------  ----------  ------------  ----  --------  --------  ------  ------  -------  --------  ---------  ----------  ----------  ---------
    00:02:00:00:00:09  MD5        9EA1784C12F4E646  AUTHORIZED  vlan10        10    3         3         0       2       2        1         1          1           0           0
    00:55:00:00:00:09  MBA        946E00ED478CC8D3  AUTHORIZED  005500000009  10    0         3         0       0       0        0         0          0           0           0

To show the authenticated sessions and statistics for an interface, run the nv show interface <interface> dot1x authenticated-sessions command:

cumulus@switch:~$ nv show interface swp1 dot1x authenticated-sessions
Mac                Auth-Type  Session-id        Status      Username      Vlan  Eapol TX  Eapol RX  Err RX  Req TX  Resp RX  Start RX  Req-id TX  Resp-id RX  Invalid RX  Logoff RX
-----------------  ---------  ----------------  ----------  ------------  ----  --------  --------  ------  ------  -------  --------  ---------  ----------  ----------  ---------
00:02:00:00:00:09  MD5        9EA1784C12F4E646  AUTHORIZED  vlan10        10    3         3         0       2       2        1         1          1           0           0
00:55:00:00:00:09  MBA        946E00ED478CC8D3  AUTHORIZED  005500000009  10    0         3         0       0       0        0         0          0           0           0

To show the authenticated sessions and statistics for a specific MAC address, run the nv show interface <interface> dot1x authenticated-sessions <mac-address> command:

cumulus@switch:~$ nv show interface swp1 dot1x authenticated-sessions 00:02:00:00:00:09
                           operational
-------------------------  -----------------
username                   vlan10
auth-type                  MD5
status                     AUTHORIZED
vlan                       10
mac-address                00:02:00:00:00:09
session-id                 9EA1784C12F4E646
counters
  eapol-frames-tx          3
  eapol-frames-rx          3
  eapol-len-err-frames-rx  0
  eapol-req-frames-tx      2
  eapol-resp-frames-rx     2
  eapol-start-frames-rx    1
  eapol-req-id-frames-tx   1
  eapol-resp-id-frames-rx  1
  eapol-invalid-frames-rx  0
  eapol-logoff-frames-rx   0

Show 802.1X Statistics

To check statistics for all interfaces, run the nv show interface --view=dot1x-counters command:

cumulus@switch:~$ nv show interface --view=dot1x-counters
Interface  Mac-Address        Eapol TX  Eapol RX  Req TX  Resp RX  Err RX  Start RX  Req-id TX  Resp-id RX  Invalid RX  Logoff RX
---------  -----------------  --------  --------  ------  -------  ------  --------  ---------  ----------  ----------  ---------
eth0       00:55:00:00:00:09  0         3         0       0        0       0         0          0           0           0
           00:02:00:00:00:09  3         3         2       2        0       1         1          1           0           0
lo         00:55:00:00:00:09  0         3         0       0        0       0         0          0           0           0
           00:02:00:00:00:09  3         3         2       2        0       1         1          1           0           0
mgmt       00:55:00:00:00:09  0         3         0       0        0       0         0          0           0           0
           00:02:00:00:00:09  3         3         2       2        0       1         1          1           0           0
swp1       00:55:00:00:00:09  0         3         0       0        0       0         0          0           0           0
           00:02:00:00:00:09  3         3         2       2        0       1         1          1           0           0
swp2       00:55:00:00:00:09  0         3         0       0        0       0         0          0           0           0
           00:02:00:00:00:09  3         3         2       2        0       1         1          1           0           0
swp3       00:55:00:00:00:09  0         3         0       0        0       0         0          0           0           0
           00:02:00:00:00:09  3         3         2       2        0       1         1          1           0           0
swp4       00:55:00:00:00:09  0         3         0       0        0       0         0          0           0           0
           00:02:00:00:00:09  3         3         2       2        0       1         1          1           0           0
swp5       00:55:00:00:00:09  0         3         0       0        0       0         0          0           0           0
           00:02:00:00:00:09  3         3         2       2        0       1         1          1           0           0
swp6       00:55:00:00:00:09  0         3         0       0        0       0         0          0           0           0
           00:02:00:00:00:09  3         3         2       2        0       1         1          1
...

Advanced Troubleshooting

You can perform more advanced troubleshooting with the following commands.

To increase the debug level in hostapd, copy over the hostapd service file, then add -d, -dd or -ddd to the ExecStart line in the hostapd.service file:

cumulus@switch:~$ sudo cp /lib/systemd/system/hostapd.service /etc/systemd/system/hostapd.service
cumulus@switch:~$ sudo nano /etc/systemd/system/hostapd.service
...
ExecStart=/usr/sbin/hostapd -ddd -c /etc/hostapd.conf
...

To watch debug output with journalctl as supplicants attempt to connect:

cumulus@switch:~$ sudo journalctl -n 1000  -u hostapd      # see the last 1000 lines of hostapd debug logging
cumulus@switch:~$ sudo journalctl -f -u hostapd            # continuous tail of the hostapd daemon debug logging

To check ACL rules in /etc/cumulus/acl/policy.d/100_dot1x_swpX.rules before and after a supplicant attempts to authenticate:

cumulus@switch:~$ sudo cl-acltool -L eb | grep swp1
cumulus@switch:~$ sudo cl-netstat | grep swp1           # look at interface counters

To check tc rules in /var/lib/hostapd/acl/tc_swpX.rules:

cumulus@switch:~$ sudo tc -s filter show dev swp1 parent 1:
cumulus@switch:~$ sudo tc -s filter show dev swp1 parent ffff:

Quality of Service

This section refers to frames for all internal QoS functionality. Unless explicitly stated, the actions are independent of layer 2 frames or layer 3 packets.

Cumulus Linux supports several different QoS features and standards including:

Cumulus Linux uses two configuration files for QoS: /etc/cumulus/datapath/qos/qos_features.conf and /etc/mlx/datapath/qos/qos_infra.conf.

Cumulus Linux 5.0 and later does not use the traffic.conf and datapath.conf files but uses the qos_features.conf and qos_infra.conf files instead. Before upgrading Cumulus Linux, review your existing QoS configuration to determine the changes you need to make.

switchd and QoS

When you run Linux commands to configure QoS, you must apply QoS changes to the ASIC with the following command:

cumulus@switch:~$ sudo systemctl reload switchd.service

Unlike the restart command, the reload switchd.service command does not impact traffic forwarding except when the qos_infra.conf file changes or when you change link pause or priority flow control (PFC) settings. These changes require modifications to the ASIC buffer and might result in momentary packet loss.

NVUE reloads the switchd service automatically. You do not have to run the reload switchd.service command to apply changes when configuring QoS with NVUE commands.

Classification

When a frame or packet arrives on the switch, Cumulus Linux maps it to an internal COS (switch priority) value. Cumulus Linux never writes this value to the frame or packet; it uses the value only to classify and schedule traffic internally through the switch.

You can define which values are trusted: 802.1p, DSCP, or both.

The following table describes the default classifications for various frame and switch priority configurations:

Setting                VLAN Tagged?  IP or Non-IP  Result
---------------------  ------------  ------------  ------------------------------------------------------------------
PCP (802.1p)           Yes           IP            Accept incoming 802.1p marking.
PCP (802.1p)           Yes           Non-IP        Accept incoming 802.1p marking.
PCP (802.1p)           No            IP            Use the default priority setting.
PCP (802.1p)           No            Non-IP        Use the default priority setting.
DSCP                   Yes           IP            Accept incoming DSCP IP header marking.
DSCP                   Yes           Non-IP        Use the default priority setting.
DSCP                   No            IP            Accept incoming DSCP IP header marking.
DSCP                   No            Non-IP        Use the default priority setting.
PCP (802.1p) and DSCP  Yes           IP            Accept incoming DSCP IP header marking.
PCP (802.1p) and DSCP  Yes           Non-IP        Accept incoming 802.1p marking.
PCP (802.1p) and DSCP  No            IP            Accept incoming DSCP IP header marking.
PCP (802.1p) and DSCP  No            Non-IP        Use the default priority setting.
port                   Either        Either        Ignore any existing markings and use the default priority setting.
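
The default classification table can be read as a short decision procedure. The following Python sketch is only an illustration of that table, not switch code; the function name classification_source and the trust labels "pcp", "dscp", and "port" are hypothetical names chosen for this example.

```python
def classification_source(trust, vlan_tagged, is_ip):
    """Return which marking decides the switch priority.

    trust is a set of hypothetical labels: {"pcp"}, {"dscp"},
    {"pcp", "dscp"}, or {"port"}.
    The result is "dscp", "pcp", or "default" (use the default priority).
    """
    if "port" in trust:
        # Port trust ignores any existing markings.
        return "default"
    if "dscp" in trust and is_ip:
        # DSCP lives in the IP header, so it applies only to IP traffic,
        # and it takes precedence over 802.1p when both are trusted.
        return "dscp"
    if "pcp" in trust and vlan_tagged:
        # 802.1p is carried in the VLAN tag, so untagged frames have none.
        return "pcp"
    return "default"


if __name__ == "__main__":
    for trust in ({"pcp"}, {"dscp"}, {"pcp", "dscp"}, {"port"}):
        for tagged in (True, False):
            for ip in (True, False):
                print(sorted(trust), tagged, ip,
                      classification_source(trust, tagged, ip))
```

Running the script prints one line per table row, which makes it easy to compare the sketch against the table.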

Trust 802.1p Marking

To trust 802.1p marking:

When 802.1p (l2) is trusted, Cumulus Linux classifies these ingress 802.1p values to switch priority values:

Switch Priority  802.1p (PCP)
---------------  ------------
0                0
1                1
2                2
3                3
4                4
5                5
6                6
7                7

The PCP number is the incoming 802.1p marking; for example PCP 0 maps to switch priority 0.

To change the default profile to map PCP 0 to switch priority 4:

cumulus@switch:~$ nv set qos mapping default-global trust l2
cumulus@switch:~$ nv set qos mapping default-global pcp 0 switch-priority 4 
cumulus@switch:~$ nv config apply

You can map multiple PCP values to the same switch priority value. For example, to map PCP values 2, 3, and 4 to switch priority 0:

cumulus@switch:~$ nv set qos mapping default-global trust l2 
cumulus@switch:~$ nv set qos mapping default-global pcp 2,3,4 switch-priority 0
cumulus@switch:~$ nv config apply

If you configure the trust to be l2 but do not specify any PCP to switch priority mappings, Cumulus Linux uses the default values.

To show the ingress 802.1p mapping for the default profile, run the nv show qos mapping default-global pcp command. To show the switch priority that a specific PCP value maps to in the default profile, run the nv show qos mapping default-global pcp <value> command. The following example shows that PCP 0 maps to switch priority 4:

cumulus@switch:~$ nv show qos mapping default-global pcp 0
                 operational  applied  description
---------------  -----------  -------  ------------------------
switch-priority  4            4        Internal Switch Priority

In the /etc/cumulus/datapath/qos/qos_features.conf file, set traffic.packet_priority_source_set = [802.1p].

When 802.1p marking is trusted, the following lines classify ingress 802.1p values to switch priority (internal COS) values:

traffic.cos_0.priority_source.8021p = [0]
traffic.cos_1.priority_source.8021p = [1]
traffic.cos_2.priority_source.8021p = [2]
traffic.cos_3.priority_source.8021p = [3]
traffic.cos_4.priority_source.8021p = [4]
traffic.cos_5.priority_source.8021p = [5]
traffic.cos_6.priority_source.8021p = [6]
traffic.cos_7.priority_source.8021p = [7]

The traffic.cos_ number is the switch priority value; for example 802.1p 0 maps to switch priority 0.

To map 802.1p 4 to switch priority 0, configure the traffic.cos_0.priority_source.8021p setting to 4.

traffic.cos_0.priority_source.8021p = [4]

You can map multiple values to the same switch priority value. For example, to map 802.1p values 0, 1, and 2 to switch priority 0:

traffic.cos_0.priority_source.8021p = [0, 1, 2]

You can also choose not to use a switch priority value. This example does not use switch priority values 3 and 4.

traffic.cos_0.priority_source.8021p = [0]
traffic.cos_1.priority_source.8021p = [1]
traffic.cos_2.priority_source.8021p = [2,3,4]
traffic.cos_3.priority_source.8021p = []
traffic.cos_4.priority_source.8021p = []
traffic.cos_5.priority_source.8021p = [5]
traffic.cos_6.priority_source.8021p = [6]
traffic.cos_7.priority_source.8021p = [7]

To apply a custom profile to specific interfaces, see Port Groups.

Trust DSCP

To trust ingress DSCP markings:

If DSCP (l3) is trusted, Cumulus Linux classifies these ingress DSCP values to switch priority values:

Switch Priority  Ingress DSCP
---------------  -------------------------
0                [0,1,2,3,4,5,6,7]
1                [8,9,10,11,12,13,14,15]
2                [16,17,18,19,20,21,22,23]
3                [24,25,26,27,28,29,30,31]
4                [32,33,34,35,36,37,38,39]
5                [40,41,42,43,44,45,46,47]
6                [48,49,50,51,52,53,54,55]
7                [56,57,58,59,60,61,62,63]

The DSCP number is the ingress DSCP value; for example DSCP 0 through 7 maps to switch priority 0.
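
The default DSCP table follows a simple pattern: each switch priority covers eight consecutive DSCP values, so the default switch priority equals the DSCP value divided by 8, rounded down. The Python sketch below only illustrates that arithmetic; the function name default_switch_priority is a hypothetical helper, not part of Cumulus Linux.

```python
def default_switch_priority(dscp):
    """Return the default switch priority for an ingress DSCP value."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    # Eight consecutive DSCP values share one switch priority.
    return dscp // 8


# Rebuild the default table: switch priority -> list of DSCP values.
default_table = {
    sp: [d for d in range(64) if default_switch_priority(d) == sp]
    for sp in range(8)
}

if __name__ == "__main__":
    for sp, values in default_table.items():
        print(sp, values)
```

For example, DSCP 22 falls in the 16 through 23 range, so by default it maps to switch priority 2 until you override the mapping.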

To change the default profile to map ingress DSCP 22 to switch priority 4:

cumulus@switch:~$ nv set qos mapping default-global trust l3 
cumulus@switch:~$ nv set qos mapping default-global dscp 22 switch-priority 4 
cumulus@switch:~$ nv config apply

You can map multiple ingress DSCP values to the same switch priority value. For example, to change the default profile to map ingress DSCP values 10, 21, and 36 to switch priority 0:

cumulus@switch:~$ nv set qos mapping default-global trust l3 
cumulus@switch:~$ nv set qos mapping default-global dscp 10,21,36 switch-priority 0
cumulus@switch:~$ nv config apply

If you configure the trust to be l3 but do not specify any DSCP to switch priority mappings, Cumulus Linux uses the default values.

To show the DSCP mapping in the default profile, run the nv show qos mapping default-global dscp command. To show the switch priority that a specific DSCP value maps to in the default profile, run the nv show qos mapping default-global dscp <value> command. The following example shows that DSCP 22 maps to switch priority 4:

cumulus@switch:~$ nv show qos mapping default-global dscp 22
                 operational  applied  description
---------------  -----------  -------  ------------------------
switch-priority  4            4        Internal Switch Priority

In the /etc/cumulus/datapath/qos/qos_features.conf file, configure traffic.packet_priority_source_set = [dscp].

If DSCP is trusted, the following lines classify ingress DSCP values to switch priority (internal COS) values:

traffic.cos_0.priority_source.dscp = [0,1,2,3,4,5,6,7]
traffic.cos_1.priority_source.dscp = [8,9,10,11,12,13,14,15]
traffic.cos_2.priority_source.dscp = [16,17,18,19,20,21,22,23]
traffic.cos_3.priority_source.dscp = [24,25,26,27,28,29,30,31]
traffic.cos_4.priority_source.dscp = [32,33,34,35,36,37,38,39]
traffic.cos_5.priority_source.dscp = [40,41,42,43,44,45,46,47]
traffic.cos_6.priority_source.dscp = [48,49,50,51,52,53,54,55]
traffic.cos_7.priority_source.dscp = [56,57,58,59,60,61,62,63]

The # in the configuration file is a comment. By default, the file comments out the traffic.cos_*.priority_source.dscp lines. You must uncomment them for them to take effect.

The traffic.cos_ number is the switch priority value; for example DSCP values 0 through 7 map to switch priority 0. To map ingress DSCP 22 to switch priority 4, configure the traffic.cos_4.priority_source.dscp setting.

traffic.cos_4.priority_source.dscp = [22]

You can map multiple ingress DSCP values to the same switch priority value. For example, to map ingress DSCP values 10, 21, and 36 to switch priority 0:

traffic.cos_0.priority_source.dscp = [10,21,36]

You can also choose not to use a switch priority value. This example does not use switch priority values 3 and 4:

traffic.cos_0.priority_source.dscp = [0,1,2,3,4,5,6,7]
traffic.cos_1.priority_source.dscp = [8,9,10,11,12,13,14,15]
traffic.cos_2.priority_source.dscp = [16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31]
traffic.cos_3.priority_source.dscp = []
traffic.cos_4.priority_source.dscp = []
traffic.cos_5.priority_source.dscp = [40,41,42,43,44,45,46,47,32,33,34,35,36,37,38,39]
traffic.cos_6.priority_source.dscp = [48,49,50,51,52,53,54,55]
traffic.cos_7.priority_source.dscp = [56,57,58,59,60,61,62,63]
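
When you edit these lines by hand, it is easy to list the same DSCP value under two switch priorities. The following Python sketch renders the eight traffic.cos_*.priority_source.dscp lines from a dictionary and rejects a duplicate or out-of-range value first. It is a convenience illustration under stated assumptions: the function name render_dscp_lines is hypothetical, and the duplicate check is a sanity check added for this example, not a documented switchd validation.

```python
def render_dscp_lines(mapping):
    """Render qos_features.conf DSCP lines.

    mapping: dict of switch priority (0-7) -> list of ingress DSCP values.
    Switch priorities missing from the dict are rendered as unused ([]).
    """
    owner = {}
    for sp, values in mapping.items():
        if not 0 <= sp <= 7:
            raise ValueError(f"switch priority {sp} is out of range")
        for dscp in values:
            if not 0 <= dscp <= 63:
                raise ValueError(f"DSCP {dscp} is out of range")
            if dscp in owner:
                raise ValueError(
                    f"DSCP {dscp} is listed under switch priority "
                    f"{owner[dscp]} and {sp}"
                )
            owner[dscp] = sp
    lines = []
    for sp in range(8):
        values = ",".join(str(v) for v in mapping.get(sp, []))
        lines.append(f"traffic.cos_{sp}.priority_source.dscp = [{values}]")
    return lines


if __name__ == "__main__":
    # Same layout as the example above: switch priorities 3 and 4 unused.
    example = {
        0: list(range(0, 8)),
        1: list(range(8, 16)),
        2: list(range(16, 32)),
        5: list(range(40, 48)) + list(range(32, 40)),
        6: list(range(48, 56)),
        7: list(range(56, 64)),
    }
    print("\n".join(render_dscp_lines(example)))
```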

To apply a custom DSCP profile to specific interfaces, see Port Groups.

Trust Port

You can assign all traffic to a switch priority regardless of the ingress marking.

The following commands assign all traffic to switch priority 3 regardless of the ingress marking.

cumulus@switch:~$ nv set qos mapping default-global trust port 
cumulus@switch:~$ nv set qos mapping default-global port-default-sp 3 
cumulus@switch:~$ nv config apply

To show the switch priority setting in the default profile for all traffic regardless of the ingress marking, run the nv show qos mapping default-global command:

cumulus@switch:~$ nv show qos mapping default-global
                 operational  applied  description
---------------  -----------  -------  ----------------------------
port-default-sp  3            3        Port Default Switch Priority
trust            port         port     Port Trust configuration

In the /etc/cumulus/datapath/qos/qos_features.conf file, configure traffic.packet_priority_source_set = [port].

The traffic.port_default_priority setting defines the switch priority that all traffic uses.

To apply a custom profile to specific interfaces, see Port Groups.

Mark and Remark Traffic

You can mark or remark traffic in two ways:

802.1p or DSCP for Marking

To enable global remarking of 802.1p, DSCP or both 802.1p and DSCP values:

To remark switch priority 0 to egress 802.1p 4:

cumulus@switch:~$ nv set qos remark default-global rewrite l2
cumulus@switch:~$ nv set qos remark default-global switch-priority 0 pcp 4
cumulus@switch:~$ nv config apply

To remark switch priority 0 to egress DSCP 22:

cumulus@switch:~$ nv set qos remark default-global rewrite l3
cumulus@switch:~$ nv set qos remark default-global switch-priority 0 dscp 22
cumulus@switch:~$ nv config apply

You can remap multiple switch priority values to the same external 802.1p or DSCP value. For example, to map switch priority 1 and 2 to 802.1p 3:

cumulus@switch:~$ nv set qos remark default-global rewrite l2
cumulus@switch:~$ nv set qos remark default-global switch-priority 1 pcp 3
cumulus@switch:~$ nv set qos remark default-global switch-priority 2 pcp 3
cumulus@switch:~$ nv config apply

To map switch priority 1 and 2 to DSCP 40:

cumulus@switch:~$ nv set qos remark default-global rewrite l3
cumulus@switch:~$ nv set qos remark default-global switch-priority 1 dscp 40
cumulus@switch:~$ nv set qos remark default-global switch-priority 2 dscp 40
cumulus@switch:~$ nv config apply

In the /etc/cumulus/datapath/qos/qos_features.conf file, modify the traffic.packet_priority_remark_set value to [802.1p], [dscp] or [802.1p,dscp]. For example, to enable the remarking of only 802.1p values:

traffic.packet_priority_remark_set = [802.1p]

You remark 802.1p or DSCP with the priority_remark.8021p or priority_remark.dscp setting. The switch priority (internal cos_) value determines the egress 802.1p or DSCP remarking. For example, to remark switch priority 0 to egress 802.1p 4:

traffic.cos_0.priority_remark.8021p = [4]

To remark switch priority 0 to egress DSCP 22:

traffic.cos_0.priority_remark.dscp = [22]

The # in the configuration file is a comment. The file comments out the traffic.cos_*.priority_remark.8021p and the traffic.cos_*.priority_remark.dscp lines by default. You must uncomment them to set the configuration.

You can remap multiple switch priority values to the same external 802.1p or DSCP value. For example, to map switch priority 1 and 2 to 802.1p 3:

traffic.cos_1.priority_remark.8021p = [3]
traffic.cos_2.priority_remark.8021p = [3]

To map switch priority 1 and 2 to DSCP 40:

traffic.cos_1.priority_remark.dscp = [40]
traffic.cos_2.priority_remark.dscp = [40]

To apply a custom profile to specific interfaces, see Port Groups.

Policy-based Marking

Cumulus Linux supports ACLs through ebtables, iptables or ip6tables for egress packet marking and remarking.

Cumulus Linux uses ebtables to mark layer 2, 802.1p COS values. Cumulus Linux uses iptables to match IPv4 traffic and ip6tables to match IPv6 traffic for DSCP marking.

For more information on configuring and applying ACLs, refer to Access Control List Configuration.

Mark Layer 2 COS

You must use ebtables to match and mark layer 2 bridged traffic. You can match traffic with any supported ebtables rule.

To set the new 802.1p COS value when traffic matches, use -A FORWARD -o <interface> -j setqos --set-cos <value>.

You can only set COS on a per-egress interface basis. Cumulus Linux does not support ebtables based matching on ingress.

The configured action always has the following conditions:

For example, to set traffic leaving interface swp5 to 802.1p COS value 4:

-A FORWARD -o swp5 -j setqos --set-cos 4

Mark Layer 3 DSCP

You must use iptables (for IPv4 traffic) or ip6tables (for IPv6 traffic) to match and mark layer 3 traffic.

You can match traffic with any supported iptables or ip6tables rule. To set the new COS or DSCP value when traffic matches, use -A FORWARD -o <interface> -j SETQOS [--set-dscp <value> | --set-cos <value> | --set-dscp-class <name>].

The configured action always has the following conditions:

You can configure COS markings with --set-cos and a value between 0 and 7 (inclusive).

You can use only one of --set-dscp or --set-dscp-class, not both.
--set-dscp supports decimal or hex DSCP values between 0 and 63. --set-dscp-class supports standard DSCP naming, described in RFC 3260, including ef, be, CS and AF classes.

For example, to set traffic leaving interface swp5 to DSCP value 32:

-A FORWARD -o swp5 -j SETQOS --set-dscp 32

To set traffic leaving interface swp11 to DSCP class value CS6:

-A FORWARD -o swp11 -j SETQOS --set-dscp-class cs6
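
The DSCP class names correspond to fixed numeric DSCP values, so a rule that uses --set-dscp-class has an equivalent --set-dscp value. The following Python sketch shows the standard Differentiated Services arithmetic for the be, ef, CS, and AF names; the function name dscp_class_value is a hypothetical helper, and the sketch does not claim to list exactly the names that the SETQOS target accepts.

```python
def dscp_class_value(name):
    """Return the numeric DSCP value for a standard class name."""
    name = name.lower()
    if name == "be":
        return 0                      # best effort (default)
    if name == "ef":
        return 46                     # expedited forwarding
    if len(name) == 3 and name.startswith("cs") and name[2] in "01234567":
        return 8 * int(name[2])       # class selector: CSn = 8 * n
    if (len(name) == 4 and name.startswith("af")
            and name[2] in "1234" and name[3] in "123"):
        # assured forwarding: AFxy = 8 * x + 2 * y
        return 8 * int(name[2]) + 2 * int(name[3])
    raise ValueError(f"unknown DSCP class name: {name}")


if __name__ == "__main__":
    for cls in ("be", "cs6", "af41", "ef"):
        print(cls, dscp_class_value(cls))
```

Under this arithmetic, the cs6 class in the example above corresponds to DSCP value 48.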

Flow Control

Flow control influences data transmission to manage congestion along a network path.

Cumulus Linux supports two flow control mechanisms: link pause and Priority Flow Control (PFC).

You cannot configure link pause and PFC on the same port.

Flow Control Buffers

Before configuring link pause or PFC, configure the buffer pool memory allocated for lossless and lossy flows. The following example sets each to fifty percent:

cumulus@switch:~$ nv set qos traffic-pool default-lossless memory-percent 50
cumulus@switch:~$ nv set qos traffic-pool default-lossy memory-percent 50
cumulus@switch:~$ nv config apply

Cumulus Linux allocates 100% of the buffer memory to the default-lossy traffic pool by default. The total memory allocation across pools must not exceed 100%.
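
Before applying a buffer split, you can check the arithmetic with a small script. The following Python sketch only checks the rule stated above, that the pool percentages must not add up to more than 100; the function name check_pool_allocation is hypothetical and the check runs offline, without contacting the switch.

```python
def check_pool_allocation(pools):
    """Validate traffic pool percentages and return their total.

    pools: dict of pool name -> memory percent, for example
    {"default-lossless": 50, "default-lossy": 50}.
    """
    for name, percent in pools.items():
        if percent < 0:
            raise ValueError(f"{name}: percent cannot be negative")
    total = sum(pools.values())
    if total > 100:
        raise ValueError(f"pools add up to {total}%, which exceeds 100%")
    return total


if __name__ == "__main__":
    print(check_pool_allocation({"default-lossless": 50, "default-lossy": 50}))
```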

Edit the following lines in the /etc/mlx/datapath/qos/qos_infra.conf file:

  1. Modify the existing ingress_service_pool.0.percent and egress_service_pool.0.percent buffer allocation. Change the existing ingress setting to ingress_service_pool.0.percent = 50. Change the existing egress setting to egress_service_pool.0.percent = 50.

  2. Add the following lines to create a new service_pool, set flow_control to the service pool, and define buffer reservations:

ingress_service_pool.1.percent = 50.0
ingress_service_pool.1.mode = 1
egress_service_pool.1.percent = 50.0
egress_service_pool.1.mode = 1
egress_service_pool.1.infinite_flag = TRUE
#
flow_control.ingress_service_pool = 1
flow_control.egress_service_pool = 1
#
port.service_pool.1.ingress_buffer.reserved = 0
port.service_pool.1.ingress_buffer.dynamic_quota = ALPHA_1
port.service_pool.1.egress_buffer.uc.reserved = 0
port.service_pool.1.egress_buffer.uc.dynamic_quota = ALPHA_INFINITY
#
flow_control.ingress_buffer.dynamic_quota = ALPHA_1
flow_control.egress_buffer.reserved = 0
flow_control.egress_buffer.dynamic_quota = ALPHA_INFINITY

Link Pause

Link pause is an older flow control mechanism that causes all traffic on a link between two switches, or between a host and switch, to stop transmitting during times of congestion. Link pause starts and stops depending on buffer congestion. You configure link pause on a per-direction, per-interface basis. You can receive pause frames to stop the switch from transmitting when requested, send pause frames to request neighboring devices to stop transmitting, or both.

  • NVIDIA recommends that you use Priority Flow Control (PFC) instead of link pause.
  • Before configuring link pause, you must first modify the switch buffer allocation. Refer to Flow Control Buffers.

Link pause buffer calculation is a complex topic that IEEE 802.1Q-2012 defines. The calculation accounts for the delay between signaling congestion and the reception of the signal by the neighboring device, including the delay that the PHY and MAC layers introduce (interface delay) and the distance between end points (cable length).

Incorrect cable length settings can cause wasted buffer space (triggering congestion too early) or packet drops (congestion occurs before flow control activates).

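The following example configuration creates a link pause profile called my_pause_ports, enables sending pause frames, disables receiving pause frames, sets the cable length to 50 meters, and applies the profile to swp1 through swp4 and swp6: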

Cumulus Linux also includes frame transmission start and stop threshold, and port buffer settings. NVIDIA recommends that you do not change these settings but, instead, let Cumulus Linux configure the settings dynamically. Only change the threshold and buffer settings if you are an advanced user who understands the buffer configuration requirements for lossless traffic to work seamlessly.

cumulus@switch:~$ nv set qos link-pause my_pause_ports tx enable
cumulus@switch:~$ nv set qos link-pause my_pause_ports rx disable
cumulus@switch:~$ nv set qos link-pause my_pause_ports cable-length 50
cumulus@switch:~$ nv set interface swp1-4,swp6 qos link-pause profile my_pause_ports
cumulus@switch:~$ nv config apply

To show the link pause settings for a profile, run the nv show qos link-pause <profile> command.

Uncomment and edit the link_pause section of the /etc/cumulus/datapath/qos/qos_features.conf file.

link_pause.port_group_list = [my_pause_ports]
link_pause.my_pause_ports.port_set = swp1-swp4,swp6
link_pause.my_pause_ports.port_buffer_bytes = 25000
link_pause.my_pause_ports.xoff_size = 10000
link_pause.my_pause_ports.xon_delta = 2000
link_pause.my_pause_ports.rx_enable = false
link_pause.my_pause_ports.tx_enable = true
link_pause.my_pause_ports.cable_length = 50

To process pause frames, you must enable link pause on the specific interfaces.

Priority Flow Control (PFC)

Priority flow control extends the capabilities of link pause by pausing the frames for a specific 802.1p value instead of stopping all traffic on a link. If a switch supports PFC and receives a PFC pause frame for a given 802.1p value, the switch stops transmitting frames from that queue, but continues transmitting frames for other queues.

You use PFC with RDMA over Converged Ethernet (RoCE). The RoCE section describes how to deploy PFC and ECN for RoCE environments.

Before configuring PFC, first modify the switch buffer allocation according to Flow Control Buffers.

PFC buffer calculation is a complex topic defined in IEEE 802.1Q-2012, which attempts to incorporate the delay between signaling congestion and receiving the signal by the neighboring device. This calculation includes the delay that the PHY and MAC layers (called the interface delay) introduce as well as the distance between end points (cable length).
Incorrect cable length settings cause wasted buffer space (triggering congestion too early) or packet drops (congestion occurs before flow control activates).

To apply PFC settings on all ports, modify the default PFC profile (default-global).

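The following example modifies the default profile to enable PFC on switch priority 0, enable sending PFC pause frames, disable receiving PFC pause frames, and set the cable length to 50 meters: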

Cumulus Linux also includes frame transmission start and stop threshold, and port buffer settings. NVIDIA recommends that you do not change these settings but, instead, let Cumulus Linux configure the settings dynamically. Only change the threshold and buffer settings if you are an advanced user who understands the buffer configuration requirements for lossless traffic to work seamlessly.

cumulus@switch:~$ nv set qos pfc default-global switch-priority 0 
cumulus@switch:~$ nv set qos pfc default-global tx enable 
cumulus@switch:~$ nv set qos pfc default-global rx disable 
cumulus@switch:~$ nv set qos pfc default-global cable-length 50
cumulus@switch:~$ nv config apply

To show the PFC settings for the default profile, run the nv show qos pfc default-global command:

cumulus@switch:~$ nv show qos pfc default-global
                   operational  applied  description
-----------------  -----------  -------  --------------------------------
cable-length       50           50       Cable Length (in meters)
port-buffer        25000 B      25000 B  Port Buffer (in bytes)
rx                 disable      disable  PFC Rx State
tx                 enable       enable   PFC Tx State
xoff-threshold     10000 B      10000 B  Xoff Threshold (in bytes)
xon-threshold      2000 B       2000 B   Xon Threshold (in bytes)
[switch-priority]  0            0        Collection of switch priorities.

Edit the priority flow control section of the /etc/cumulus/datapath/qos/qos_features.conf file.

pfc.port_group_list = [default-global]
pfc.default-global.port_set = allports
pfc.default-global.cos_list = [0]
pfc.default-global.port_buffer_bytes = 25000
pfc.default-global.xoff_size = 10000
pfc.default-global.xon_delta = 2000
pfc.default-global.tx_enable = true
pfc.default-global.rx_enable = false
pfc.default-global.cable_length = 50

To apply a custom profile to specific interfaces, see Port Groups.

PFC Watchdog

PFC watchdog detects and mitigates pause storms on PFC-enabled ports.

In lossless Ethernet, the switch sends PFC PAUSE frames to instruct the link partner to pause sending packets on a traffic class. This back pressure might propagate across the network and, if it persists, can cause the network to stop forwarding traffic. PFC watchdog detects abnormal back pressure caused by receiving an excessive number of pause frames and disables PFC temporarily.

When a lossless queue receives a pause storm from its link partner and the queue is in a paused state for a certain period of time, PFC watchdog mitigates the pause storm. The watchdog stops processing received pause frames on every switch priority corresponding to the traffic class that detects the storm and discards new incoming packets to this egress queue.

The watchdog continues to count pause frames received on the port. If there are no pause frames received in any polling interval period, it restores the PFC configuration on the port and stops dropping packets.

PFC watchdog also detects and mitigates pause storms on link pause-enabled ports. The watchdog configuration for link pause-enabled ports is the same as the configuration for PFC-enabled ports. For a link pause-enabled port, the watchdog stops processing received pause frames on the egress port that detects the storm and discards new incoming packets to all egress queues on the port until congestion diminishes.

  • PFC watchdog only works for lossless traffic queues.
  • You can only configure PFC watchdog on a port with PFC (or link pause) configuration.
  • You can only enable PFC watchdog on a physical interface (swp).
  • You cannot enable the watchdog on a bond (for example, bond0) but you can enable the watchdog on a port that is a member of a bond (for example, swp1).

To enable PFC watchdog:

Enable PFC watchdog on the interfaces where you enable PFC:

cumulus@switch:~$ nv set interface swp1 qos pfc-watchdog
cumulus@switch:~$ nv set interface swp3 qos pfc-watchdog
cumulus@switch:~$ nv config apply

To disable PFC watchdog, run the nv unset interface <interface> qos pfc-watchdog command or the nv set interface <interface> qos pfc-watchdog state disable command.

Edit the PFC Watchdog Configuration section of the /etc/cumulus/datapath/qos/qos_features.conf file, then reload switchd.

...
# PFC Watchdog Configuration
# Add the port to the port_group_list where you want to enable PFC Watchdog
# It will enable PFC Watchdog on all the traffic-class corresponding to
# the lossless switch-priority configured on the port.
pfc_watchdog.port_group_list = [pfc_wd_port_group]
pfc_watchdog.pfc_wd_port_group.port_set = swp1,swp2
...
cumulus@switch:~$ sudo systemctl reload switchd

You can control the PFC watchdog polling interval and how many polling intervals the PFC watchdog must wait before it mitigates the storm condition. The default polling interval is 100 milliseconds. The default number of polling intervals is 3.
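
The polling interval and the number of polling intervals together determine how long a pause storm must last before the watchdog reacts. The following Python sketch is a simplified model of that behavior, assuming the watchdog mitigates after the configured number of consecutive polling intervals with pause frames and restores PFC after one interval without any. The function names are hypothetical, the DEADLOCK and OK labels are borrowed from the status output, and the real watchdog also takes the paused state of the queue into account.

```python
def detection_time_ms(polling_interval_ms=100, robustness=3):
    """Approximate time a storm must persist before mitigation starts."""
    return polling_interval_ms * robustness


def watchdog_states(pause_frames_per_interval, robustness=3):
    """Return "OK" or "DEADLOCK" for each polling interval.

    pause_frames_per_interval: pause frames counted in each interval.
    """
    states = []
    streak = 0          # consecutive intervals that saw pause frames
    mitigating = False
    for frames in pause_frames_per_interval:
        if frames == 0:
            # A quiet interval ends the storm and restores PFC.
            streak = 0
            mitigating = False
        else:
            streak += 1
            if streak >= robustness:
                mitigating = True
        states.append("DEADLOCK" if mitigating else "OK")
    return states


if __name__ == "__main__":
    print(detection_time_ms())          # defaults: 100 ms x 3 intervals
    print(detection_time_ms(200, 5))    # 200 ms x 5 intervals
    print(watchdog_states([5, 5, 5, 5, 0, 5]))
```

With the default values, the model reacts after roughly 300 milliseconds of continuous pause frames; with a 200 millisecond polling interval and 5 polling intervals, it reacts after roughly one second.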

The following example sets the PFC watchdog polling interval to 200 milliseconds and the number of polling intervals to 5:

cumulus@switch:~$ nv set qos pfc-watchdog polling-interval 200
cumulus@switch:~$ nv set qos pfc-watchdog robustness 5
cumulus@switch:~$ nv config apply

Edit the /etc/cumulus/switchd.conf file to set the pfc_wd.poll_interval parameter and the pfc_wd.robustness parameter.

...
# PFC Watchdog poll interval (in msec)
#pfc_wd.poll_interval = 200

# PFC Watchdog robustness (# of iterations)
#pfc_wd.robustness = 5
...

Run the following commands to apply the configuration:

cumulus@switch:~$ echo 5 > /cumulus/switchd/config/pfc_wd/robustness
cumulus@switch:~$ echo 200 > /cumulus/switchd/config/pfc_wd/poll_interval

To show if PFC watchdog is on and to show the status for each traffic class, run the nv show interface <interface> qos pfc-watchdog command:

cumulus@switch:~$ nv show interface swp1 qos pfc-watchdog
                 operational  applied 
---------------  -----------  ------- 
state            enabled      enabled 

PFC WD Status 
=========================== 
    traffic-class  status    deadlock-count 
    -------------  --------  -------------- 

    0              OK        0 
    1              OK        3 
    2              DEADLOCK  2  
    3              OK        0 
    4              OK        0 
    5              OK        0 
    6              OK        0 
    7              DEADLOCK  3 

To show PFC watchdog data for a specific traffic class, run the nv show interface <interface> qos pfc-watchdog status <traffic-class> command.

To clear the PFC watchdog deadlock-count on an interface, run the nv action clear interface <interface> qos pfc-watchdog deadlock-count command.

Congestion Control (ECN)

Explicit Congestion Notification (ECN) is an end-to-end layer 3 congestion control protocol. Defined by RFC 3168, ECN relies on two bits in the IPv4 Type of Service (ToS) field or the IPv6 Traffic Class field to signal congestion conditions. ECN requires one or both server endpoints to support ECN to be effective.

Instead of telling adjacent devices to stop transmitting during times of buffer congestion, ECN sets the ECN bits of the transit IPv4 or IPv6 header to indicate to end hosts that congestion might occur. As a result, the sending hosts reduce their sending rate until the transit switch no longer sets ECN bits.

You use ECN with RDMA over Converged Ethernet (RoCE). The RoCE section describes how to deploy PFC and ECN for RoCE environments.

ECN operates by having a transit switch that marks packets between two end hosts.

  1. The transmitting host indicates it is ECN-capable by setting the ECN bits in the outgoing IP header to 01 or 10.
  2. If the buffer utilization of a transit switch exceeds the configured minimum threshold, the switch remarks the ECN bits to 11, indicating Congestion Encountered (CE).
  3. The receiving host marks any reply packets, like a TCP-ACK, as CE (11).
  4. The original transmitting host reduces its transmission rate.
  5. When the switch buffer utilization falls below the configured minimum threshold, the switch stops remarking ECN bits and leaves them as 01 or 10.
  6. A receiving host reflects this new ECN marking in the next reply so that the transmitting host resumes sending at normal speeds.
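
The marking steps above can be sketched in a few lines of Python. This is only an illustration of the simplified model in the list, not the switch implementation; the function names are hypothetical, and the threshold value in the usage note is borrowed from the examples later in this section.

```python
# ECN codepoints from RFC 3168 (two-bit field in the IP header).
NOT_ECT = 0b00  # sender is not ECN-capable
ECT_1 = 0b01    # ECN-capable transport
ECT_0 = 0b10    # ECN-capable transport
CE = 0b11       # Congestion Encountered


def switch_forward(ecn_bits: int, queue_bytes: int, min_threshold: int) -> int:
    """Return the ECN bits a transit switch sends on (hypothetical helper)."""
    ecn_capable = ecn_bits in (ECT_0, ECT_1)
    if ecn_capable and queue_bytes > min_threshold:
        return CE  # step 2: remark to 11
    # Step 5: below the threshold the marking is left unchanged.
    # Packets that are not ECN-capable are never remarked to CE.
    return ecn_bits


def sender_should_slow_down(reply_ecn_bits: int) -> bool:
    """Steps 3 and 4: a CE-marked reply tells the sender to reduce its rate."""
    return reply_ecn_bits == CE
```

For example, `switch_forward(ECT_0, 50000, 40000)` returns `CE`, while the same packet with a queue depth of 30000 bytes keeps its original marking.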

The default profile (default-global) enables ECN by default on egress queue 0 for all ports with the following settings:

The following example commands change the default ECN profile that applies to all ports. The commands enable ECN on egress queues 4, 5, and 7, set the minimum buffer threshold to 40000 and the maximum buffer threshold to 200000, and enable RED.

cumulus@switch:~$ nv set qos congestion-control default-global traffic-class 4,5,7 min-threshold 40000
cumulus@switch:~$ nv set qos congestion-control default-global traffic-class 4,5,7 max-threshold 200000 
cumulus@switch:~$ nv set qos congestion-control default-global traffic-class 4,5,7 red enable
cumulus@switch:~$ nv config apply

The following example disables ECN bit marking in the default profile for all ports.

cumulus@switch:~$ nv set qos congestion-control default-global traffic-class 0 ecn disable
cumulus@switch:~$ nv config apply

To show the ECN settings for the default profile, run the nv show qos congestion-control default-global command:

cumulus@switch:~$ nv show qos congestion-control default-global
    operational  applied  description
--  -----------  -------  -----------

ECN Configurations
=====================
    traffic-class  ECN     RED     Min Th   Max Th    Probability
    -------------  ------  ------  -------  --------  -----------
    4              enable  enable  40000 B  200000 B  100
    5              enable  enable  40000 B  200000 B  100
    7              enable  enable  40000 B  200000 B  100

To show the ECN settings in the default profile for a specific egress queue, run the nv show qos congestion-control default-global traffic-class <value> command:

cumulus@switch:~$ nv show qos congestion-control default-global traffic-class 4 
               operational  applied   description
-------------  -----------  --------  -----------------------------------
ecn            enable       enable    Early Congestion Notification State
max-threshold  200000 B     200000 B  Maximum Threshold (in bytes)
min-threshold  40000 B      40000 B   Minimum Threshold (in bytes)
probability    100          100       Probability
red            enable       enable    Random Early Detection State

Edit the Explicit Congestion Notification section of the /etc/cumulus/datapath/qos/qos_features.conf file.

default_ecn_red_conf.egress_queue_list = [4,5,7]
default_ecn_red_conf.ecn_enable = true
default_ecn_red_conf.red_enable = true
default_ecn_red_conf.min_threshold_bytes = 40000
default_ecn_red_conf.max_threshold_bytes = 200000
default_ecn_red_conf.probability = 100

To disable ECN bit marking, set ecn_enable to false. The following example disables ECN bit marking in the default profile for all ports.

...
default_ecn_red_conf.ecn_enable = false 
...
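
To get a feel for how the minimum threshold, maximum threshold, and probability settings interact, the following sketch computes a marking probability using the textbook RED ramp: nothing is marked below the minimum threshold, the probability rises linearly to the configured value at the maximum threshold, and everything is marked above it. This is an assumption about the general RED algorithm, not a statement of how the switch ASIC computes its marking decision; the function name is hypothetical.

```python
def mark_probability(queue_bytes: int, min_threshold: int,
                     max_threshold: int, max_probability: float) -> float:
    """Return a marking probability in percent (0-100), textbook RED style."""
    if queue_bytes <= min_threshold:
        return 0.0
    if queue_bytes >= max_threshold:
        return 100.0
    span = max_threshold - min_threshold
    # Linear ramp between the two thresholds.
    return max_probability * (queue_bytes - min_threshold) / span
```

With the example values above (40000, 200000, and 100), a queue depth of 120000 bytes sits halfway between the thresholds, so `mark_probability(120000, 40000, 200000, 100)` returns 50.0.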

To apply a custom ECN profile to specific interfaces, see Port Groups.

Egress Queues

Cumulus Linux supports eight egress queues to provide different classes of service. By default, switch priority values map directly to the matching egress queue. For example, switch priority value 0 maps to egress queue 0.

You can remap queues by changing the switch priority value to the corresponding queue value. You can map multiple switch priority values to a single egress queue.

You do not have to assign all egress queues.

The following example commands assign switch priority 2 to egress queue 7:

cumulus@switch:~$ nv set qos egress-queue-mapping default-global switch-priority 2 traffic-class 7
cumulus@switch:~$ nv config apply

NVUE only supports the default-global profile.

To show the egress queue mapping configuration for the default profile, run the nv show qos egress-queue-mapping default-global command:

cumulus@switch:~$ nv show qos egress-queue-mapping default-global
    operational  applied  description
--  -----------  -------  -----------

SP->TC mapping configuration
===============================
    switch-priority  traffic-class
    ---------------  -------------
    0                0
    1                1
    2                7
    3                3
    4                4
    5                5
    6                6
    7                7

To show the egress queue mapping for a specific switch priority in the default profile, run the nv show qos egress-queue-mapping default-global switch-priority <value> command. The following example command shows that switch priority 2 maps to egress queue 7.

cumulus@switch:~$ nv show qos egress-queue-mapping default-global switch-priority 2
               operational  applied  description
-------------  -----------  -------  -------------
traffic-class  7            7        Traffic Class

You configure egress queues in the qos_infra.conf file.

cos_egr_queue.cos_0.uc  = 0
cos_egr_queue.cos_1.uc  = 1
cos_egr_queue.cos_2.uc  = 7
cos_egr_queue.cos_3.uc  = 3
cos_egr_queue.cos_4.uc  = 4
cos_egr_queue.cos_5.uc  = 5
cos_egr_queue.cos_6.uc  = 6
cos_egr_queue.cos_7.uc  = 7
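
As a quick sanity check on a mapping like the one above, the following sketch parses the cos_egr_queue lines and groups switch priorities by egress queue. It is a hypothetical helper for illustration only; it assumes the simple `key = value` line format shown in this example.

```python
import re

CONFIG = """\
cos_egr_queue.cos_0.uc  = 0
cos_egr_queue.cos_1.uc  = 1
cos_egr_queue.cos_2.uc  = 7
cos_egr_queue.cos_3.uc  = 3
cos_egr_queue.cos_4.uc  = 4
cos_egr_queue.cos_5.uc  = 5
cos_egr_queue.cos_6.uc  = 6
cos_egr_queue.cos_7.uc  = 7
"""

LINE = re.compile(r"\s*cos_egr_queue\.cos_(\d)\.uc\s*=\s*(\d)\s*$")


def parse_queue_map(text: str) -> dict:
    """Return {switch_priority: egress_queue} from cos_egr_queue lines."""
    mapping = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            mapping[int(match.group(1))] = int(match.group(2))
    return mapping


def priorities_per_queue(mapping: dict) -> dict:
    """Return {egress_queue: [switch priorities]} to spot shared or unused queues."""
    queues = {}
    for priority, queue in sorted(mapping.items()):
        queues.setdefault(queue, []).append(priority)
    return queues
```

Running `priorities_per_queue(parse_queue_map(CONFIG))` shows that switch priorities 2 and 7 share egress queue 7 and that egress queue 2 has no switch priority assigned, which matches the note that you do not have to assign all egress queues.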

Egress Scheduler

Cumulus Linux supports 802.1Qaz, Enhanced Transmission Selection, which allows the switch to assign bandwidth to egress queues and then schedule the transmission of traffic from each queue. 802.1Qaz supports Priority Queuing.

Cumulus Linux provides a default egress scheduler that applies to all ports, where the bandwidth allocated to egress queues 0,2,4,6 is 12 percent and the bandwidth allocated to egress queues 1,3,5,7 is 13 percent. You can also apply a custom egress scheduler for specific ports; see Port Groups.

The following example modifies the default profile. The commands change the bandwidth allocation for egress queues 0, 1, 5, and 7 to strict, the bandwidth allocation for egress queues 2 and 6 to 30 percent, and the bandwidth allocation for egress queues 3 and 4 to 20 percent.

  • The traffic-class value defines the egress queue where you want to assign bandwidth. For example, traffic-class 2 defines the bandwidth allocation for egress queue 2.
  • For each egress queue, you can define the mode as either dwrr or strict. In dwrr mode, you must define a bandwidth percent value between 1 and 100. If you do not specify a value for an egress queue, Cumulus Linux uses a DWRR value of 0 (no egress scheduling). The combined total of values you assign to bw-percent must be less than or equal to 100.

cumulus@switch:~$ nv set qos egress-scheduler default-global traffic-class 2,6 mode dwrr 
cumulus@switch:~$ nv set qos egress-scheduler default-global traffic-class 2,6 bw-percent 30 
cumulus@switch:~$ nv set qos egress-scheduler default-global traffic-class 3,4 mode dwrr
cumulus@switch:~$ nv set qos egress-scheduler default-global traffic-class 3,4 bw-percent 20 
cumulus@switch:~$ nv set qos egress-scheduler default-global traffic-class 0,1,5,7 mode strict
cumulus@switch:~$ nv config apply

To show the egress scheduling policy for the default profile, run the nv show qos egress-scheduler default-global command:

cumulus@switch:~$ nv show qos egress-scheduler default-global
    operational  applied  description
--  -----------  -------  -----------

TC->DWRR weight configuration
================================
    traffic-class  mode    bw-percent
    -------------  ------  ----------
    0              strict
    1              strict
    2              dwrr    30
    3              dwrr    20
    4              dwrr    20
    5              strict
    6              dwrr    30
    7              strict

You configure the egress scheduling policy in the egress scheduling section of the /etc/cumulus/datapath/qos/qos_features.conf file.

  • The egr_queue_ value defines the egress queue where you want to assign bandwidth. For example, egr_queue_0 defines the bandwidth allocation for egress queue 0.
  • The bw_percent value defines the bandwidth allocation you want to assign to an egress queue. If you do not specify a value for an egress queue, there is no egress scheduling. If you specify a value of 0 for an egress queue, Cumulus Linux assigns strict priority mode to the egress queue and always processes it ahead of other queues. The combined total of values you assign to bw_percent must be less than or equal to 100.

default_egress_sched.egr_queue_0.bw_percent = 0
default_egress_sched.egr_queue_1.bw_percent = 0
default_egress_sched.egr_queue_2.bw_percent = 30
default_egress_sched.egr_queue_3.bw_percent = 20
default_egress_sched.egr_queue_4.bw_percent = 20
default_egress_sched.egr_queue_5.bw_percent = 0
default_egress_sched.egr_queue_6.bw_percent = 30
default_egress_sched.egr_queue_7.bw_percent = 0

Strict mode does not define a maximum bandwidth allocation. This can lead to starvation of other queues.
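
The bw_percent rules above (0 means strict priority, and the total must not exceed 100) are easy to check before you apply a configuration. The following sketch is a hypothetical pre-check, not a Cumulus Linux tool; it takes a dictionary of egress queue numbers to bw_percent values.

```python
def summarize_scheduler(bw_percent: dict) -> tuple:
    """Return (strict_queues, total_percent) for a {queue: bw_percent} mapping.

    A value of 0 marks a strict priority queue, as described above.
    Raises ValueError if the combined bw_percent total exceeds 100.
    """
    strict_queues = sorted(q for q, bw in bw_percent.items() if bw == 0)
    total = sum(bw_percent.values())
    if total > 100:
        raise ValueError(f"bw_percent total is {total}, which exceeds 100")
    return strict_queues, total


# Values from the default_egress_sched example above.
EXAMPLE = {0: 0, 1: 0, 2: 30, 3: 20, 4: 20, 5: 0, 6: 30, 7: 0}
```

For the example values, `summarize_scheduler(EXAMPLE)` reports queues 0, 1, 5, and 7 as strict and a combined total of 100.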

To apply a custom egress scheduler for specific ports, see Port Groups.

Policing and Shaping

Traffic shaping and policing control the rate at which the switch sends or receives traffic on a network to prevent congestion.

Traffic shaping typically occurs at egress and traffic policing at ingress.

Shaping

Traffic shaping allows a switch to send traffic at an average bitrate lower than the physical interface. Traffic shaping prevents a receiving device from dropping bursty traffic if the device is either not capable of that rate of traffic or has a policer that limits what it accepts.

Traffic shaping works by holding packets in the buffer and releasing them at specific time intervals.

Cumulus Linux supports two levels of hierarchical traffic shaping: one at the egress queue level and one at the port level. This allows for minimum and maximum bandwidth guarantees for each egress queue and a defined port traffic shaping rate.

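The following example configuration creates a profile called shaper1 that sets the minimum bandwidth for egress queue 2 to 100 and the maximum bandwidth to 500, sets the maximum packet shaper rate for the port group to 200000, and applies the profile to swp1, swp2, swp3, and swp5. Note the following:

  • When the minimum bandwidth for an egress queue is 0, there is no bandwidth guarantee for this queue.
  • The maximum bandwidth for an egress queue must not exceed the maximum packet shaper rate for the port group.
  • The maximum packet shaper rate for the port group must not exceed the physical interface speed.
  • Cumulus Linux only shapes traffic for the traffic classes in a profile that include shaper configuration.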

cumulus@switch:~$ nv set qos egress-shaper shaper1 traffic-class 2 min-rate 100
cumulus@switch:~$ nv set qos egress-shaper shaper1 traffic-class 2 max-rate 500
cumulus@switch:~$ nv set qos egress-shaper shaper1 port-max-rate 200000
cumulus@switch:~$ nv set interface swp1,swp2,swp3,swp5 qos egress-shaper profile shaper1
cumulus@switch:~$ nv config apply

Edit the shaping section of the qos_features.conf file.

Cumulus Linux bases the egr_queue value on the configured egress queue.

shaping.port_group_list = [shaper1]
shaping.shaper1.port_set = swp1-swp3,swp5
shaping.shaper1.egr_queue_0.shaper = [50000, 100000]
shaping.shaper1.port.shaper = 900000
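
The two rate constraints listed earlier in this section (queue maximum within the port shaper rate, and port shaper rate within the interface speed) can be checked with a small helper like the following. This is a hypothetical sketch; it assumes all rates use the same unit as the configuration file, and the interface speed in the usage note is an illustrative value.

```python
def check_shaper(queue_shapers: dict, port_max_rate: int,
                 interface_speed: int) -> list:
    """Return a list of problems found; an empty list means the rates are consistent.

    queue_shapers maps an egress queue number to (min_rate, max_rate).
    """
    problems = []
    if port_max_rate > interface_speed:
        problems.append("port shaper rate exceeds the interface speed")
    for queue, (min_rate, max_rate) in sorted(queue_shapers.items()):
        if max_rate > port_max_rate:
            problems.append(
                f"egress queue {queue} maximum exceeds the port shaper rate")
    return problems
```

With the values from the example above, `check_shaper({0: (50000, 100000)}, 900000, 1000000)` returns an empty list.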

Policing

Traffic policing prevents an interface from receiving more traffic than intended. You use policing to enforce a maximum transmission rate on an interface. The switch drops any traffic above the policing level.

Cumulus Linux supports both a single-rate policer and a dual-rate policer (tricolor policer).

You configure traffic policing using ebtables, iptables, or ip6tables rules.

For more information on configuring and applying ACLs, refer to Access Control List Configuration.

Single-rate Policer

To configure a single-rate policer, use the iptables JUMP action -j POLICE.

Cumulus Linux supports the following iptables flags with a single-rate policer.

iptables Flag                      Description
---------------------------------  ---------------------------------------------------------------
--set-mode [pkt | KB]              Define the policer to count packets or kilobytes.
--set-rate [<kbytes> | <packets>]  The maximum rate of traffic in kilobytes or packets per second.
--set-burst <kilobytes>            The allowed burst size in kilobytes.

For example, to create a policer that allows 400 packets per second with a 100-packet burst:

-j POLICE --set-mode pkt --set-rate 400 --set-burst 100
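
A rate plus a burst size is the classic token bucket model. The following sketch shows how such a policer behaves for the 400 packets per second, 100-packet burst example above. It is a generic token bucket written for illustration, with timestamps passed in explicitly; it is an assumption about the general technique, not the switch hardware implementation.

```python
class TokenBucket:
    """Generic token bucket: `rate` tokens per second, at most `burst` tokens."""

    def __init__(self, rate: float, burst: float):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)  # the bucket starts full
        self.last = 0.0             # time of the previous packet, in seconds

    def allow(self, now: float, cost: float = 1) -> bool:
        """Return True if a packet arriving at time `now` conforms."""
        elapsed = now - self.last
        self.last = now
        # Refill for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # the policer drops traffic above the policing level
```

With `TokenBucket(400, 100)`, a burst of 150 packets arriving at the same instant has 100 packets accepted and 50 dropped. A quarter of a second later the bucket has refilled to its 100-packet cap.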

Dual-rate Policer

To configure a dual-rate policer, use the iptables JUMP action -j TRICOLORPOLICE.

Cumulus Linux supports the following iptables flags with a dual-rate policer.

iptables Flag                           Description
--------------------------------------  -----------
--set-color-mode [blind | aware]        The policing mode: single-rate (blind) or dual-rate (aware). The default is aware.
--set-cir <kbps>                        The committed information rate (CIR) in kilobits per second.
--set-cbs <kbytes>                      The committed burst size (CBS) in kilobytes.
--set-pir <kbps>                        The peak information rate (PIR) in kilobits per second.
--set-ebs <kbytes>                      The excess burst size (EBS) in kilobytes.
--set-conform-action-dscp <dscp value>  The numerical DSCP value to mark for traffic that conforms to the policer rate.
--set-exceed-action-dscp <dscp value>   The numerical DSCP value to mark for traffic that exceeds the policer rate.
--set-violate-action-dscp <dscp value>  The numerical DSCP value to mark for traffic that violates the policer rate.
--set-violate-action [accept | drop]    Cumulus Linux either accepts and remarks, or drops packets that violate the policer rate.

For example, to configure a dual-rate, three-color policer with a 3 Mbps CIR, 500 KB CBS, 10 Mbps PIR, and 1 MB EBS that drops packets that violate the policer:

-j TRICOLORPOLICE --set-color-mode blind --set-cir 3000 --set-cbs 500 --set-pir 10000 --set-ebs 1000 --set-violate-action drop
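
The conform, exceed, and violate outcomes can be pictured with a two-bucket marker in the style of RFC 2698 (color-blind mode). The sketch below is an illustration under stated assumptions, not the switch implementation: it treats the EBS value as the size of the peak-rate bucket, and it assumes 1 KB is 1000 bytes. The class name is hypothetical.

```python
class TwoRateMarker:
    """Color-blind two-rate, three-color marker (illustrative sketch)."""

    def __init__(self, cir_kbps, cbs_kbytes, pir_kbps, ebs_kbytes):
        self.cir = cir_kbps * 1000 / 8  # committed rate in bytes per second
        self.pir = pir_kbps * 1000 / 8  # peak rate in bytes per second
        self.cbs = cbs_kbytes * 1000    # committed bucket size in bytes
        self.ebs = ebs_kbytes * 1000    # peak bucket size in bytes (assumption)
        self.tc = self.cbs              # both buckets start full
        self.tp = self.ebs
        self.last = 0.0

    def color(self, now: float, size: int) -> str:
        """Classify a packet of `size` bytes arriving at time `now` (seconds)."""
        elapsed = now - self.last
        self.last = now
        self.tc = min(self.cbs, self.tc + elapsed * self.cir)
        self.tp = min(self.ebs, self.tp + elapsed * self.pir)
        if self.tp < size:
            return "red"      # violates the peak rate
        if self.tc < size:
            self.tp -= size
            return "yellow"   # exceeds the committed rate
        self.tc -= size
        self.tp -= size
        return "green"        # conforms


# Values from the example rule above.
marker = TwoRateMarker(cir_kbps=3000, cbs_kbytes=500,
                       pir_kbps=10000, ebs_kbytes=1000)
```

In this model, green corresponds to conforming traffic, yellow to exceeding traffic, and red to violating traffic, which the example rule drops with --set-violate-action drop.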

Port Groups

Cumulus Linux supports profiles (port groups) for all features including ECN and RED. Profiles apply similar QoS configurations to a set of ports.

  • Configurations with a profile override the global settings for the ingress ports in the port group.
  • Ports not in a profile use the global settings.
  • To apply a profile to all ports, use the global profile.

Trust and Marking

You can use port groups to assign different profiles to different ports. A profile is a label for a group of configuration settings.

The following example configures two profiles. customer1 applies to swp1, swp4, and swp6. customer2 applies to swp5 and swp7.

cumulus@switch:~$ nv set qos mapping customer1 trust l3 
cumulus@switch:~$ nv set qos mapping customer1 dscp 0-7 switch-priority 0
cumulus@switch:~$ nv set interface swp1,swp4,swp6 qos mapping profile customer1
cumulus@switch:~$ nv set qos mapping customer2 trust l2
cumulus@switch:~$ nv set qos mapping customer2 pcp 4 switch-priority 1
cumulus@switch:~$ nv set interface swp5,swp7 qos mapping profile customer2
cumulus@switch:~$ nv config apply

The following example configures the profile customports, which assigns traffic on swp1, swp2, and swp3 to switch priority 4 regardless of the ingress marking.

cumulus@switch:~$ nv set qos mapping customports trust port 
cumulus@switch:~$ nv set qos mapping customports port-default-sp 4
cumulus@switch:~$ nv set interface swp1,swp2,swp3 qos mapping profile customports
cumulus@switch:~$ nv config apply

You define profiles with the source.port_group_list configuration in the qos_features.conf file. A source.port_group_list is one or more names used for a group of settings.

The following example configures two profiles. customer1 applies to swp1 through swp4 and swp6. customer2 applies to swp5 and swp7.

source.port_group_list = [customer1,customer2]
source.customer1.packet_priority_source_set = [dscp]
source.customer1.port_set = swp1-swp4,swp6
source.customer1.port_default_priority = 0
source.customer1.cos_0.priority_source.dscp = [0-7]
source.customer2.packet_priority_source_set = [802.1p]
source.customer2.port_set = swp5,swp7
source.customer2.port_default_priority = 0
source.customer2.cos_1.priority_source.8021p = [4]

Configuration Description

source.port_group_list  The names of the port groups (profiles) you want to use.
The following example defines customer1 and customer2:
source.port_group_list = [customer1,customer2]

source.customer1.packet_priority_source_set  The ingress marking trust.
In the following example, Cumulus Linux trusts ingress DSCP values for customer1:
source.customer1.packet_priority_source_set = [dscp]

source.customer1.port_set  The set of ports on which to apply the ingress marking trust policy.
In the following example, ports swp1, swp2, swp3, swp4, and swp6 are for customer1:
source.customer1.port_set = swp1-swp4,swp6

source.customer1.port_default_priority  The default switch priority marking for unmarked or untrusted traffic.
In the following example, Cumulus Linux marks unmarked traffic or layer 2 traffic for customer1 ports with switch priority 0:
source.customer1.port_default_priority = 0

source.customer1.cos_0.priority_source  The mapping of ingress DSCP values to a switch priority value for customer1.
In the following example, the set of DSCP values from 0 through 7 map to switch priority 0:
source.customer1.cos_0.priority_source.dscp = [0,1,2,3,4,5,6,7]

source.customer2.packet_priority_source_set  The ingress marking trust for customer2.
In the following example, 802.1p is trusted:
source.customer2.packet_priority_source_set = [802.1p]

source.customer2.port_set  The set of ports on which to apply the ingress marking trust policy.
In the following example, swp5 and swp7 apply for customer2:
source.customer2.port_set = swp5,swp7

source.customer2.port_default_priority  The default switch priority marking for unmarked or untrusted traffic.
In the following example, Cumulus Linux marks unmarked VLAN-tagged layer 2 traffic for customer2 ports with switch priority 0:
source.customer2.port_default_priority = 0

source.customer2.cos_1.priority_source  The mapping of ingress 802.1p values to a switch priority value for customer2.
The following example maps ingress 802.1p value 4 to switch priority 1:
source.customer2.cos_1.priority_source.8021p = [4]

The following example configures the profile customports, which assigns traffic on swp1, swp2, and swp3 to switch priority 4 regardless of the ingress marking.

source.port_group_list = [customports]
source.customports.packet_priority_source_set = [port]
source.customports.port_default_priority = 4
source.customports.port_set = swp1,swp2,swp3
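
The way a profile turns an ingress marking into a switch priority can be sketched as a simple lookup. The following Python is an illustration only; the dictionary layout and function name are hypothetical, and falling back to the port default for a marking with no explicit mapping is an assumption made for this sketch.

```python
# Profiles modeled on the examples in this section (hypothetical data layout).
CUSTOMER1 = {"trust": "dscp", "default": 0, "map": {d: 0 for d in range(8)}}
CUSTOMER2 = {"trust": "802.1p", "default": 0, "map": {4: 1}}
CUSTOMPORTS = {"trust": "port", "default": 4, "map": {}}


def switch_priority(profile: dict, dscp=None, pcp=None) -> int:
    """Return the switch priority a profile assigns to an ingress packet."""
    trust = profile["trust"]
    if trust == "port":
        # Port trust ignores the ingress marking entirely.
        return profile["default"]
    marking = dscp if trust == "dscp" else pcp
    # Unmarked or unmapped traffic uses the port default (assumption).
    return profile["map"].get(marking, profile["default"])
```

For example, `switch_priority(CUSTOMER2, pcp=4)` returns 1, and `switch_priority(CUSTOMPORTS, dscp=46)` returns 4 regardless of the DSCP value.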

Remarking

You can use profiles to remark 802.1p or DSCP on egress according to the switch priority (internal COS) value.

To change the marked value on a packet, the switch ASIC reads the enable or disable rewrite flag on the ingress port and refers to the mapping configuration on the egress port to change the marked value. To remark 802.1p or DSCP values, you have to enable the rewrite on the ingress port and configure the mapping on the egress port.

In the following example configuration, only packets that ingress on swp1 and egress on swp2 change the marked value of the packet. Packets that ingress on other ports and egress on swp2 do not change the marked value of the packet. The commands map switch priority 0 and 1 to egress DSCP 37.

cumulus@switch:~$ nv set qos remark remark_port_group1 rewrite l3
cumulus@switch:~$ nv set interface swp1 qos remark profile remark_port_group1
cumulus@switch:~$ nv set qos remark remark_port_group2 switch-priority 0 dscp 37
cumulus@switch:~$ nv set qos remark remark_port_group2 switch-priority 1 dscp 37
cumulus@switch:~$ nv set interface swp2 qos remark profile remark_port_group2
cumulus@switch:~$ nv config apply

You define these profiles with remark.port_group_list in the /etc/cumulus/datapath/qos/qos_features.conf file. The name is a label for configuration settings.

remark.port_group_list = [remark_port_group1,remark_port_group2]
remark.remark_port_group1.packet_priority_remark_set = [dscp]
remark.remark_port_group1.port_set = swp1
remark.remark_port_group2.packet_priority_remark_set = []
remark.remark_port_group2.port_set = swp2
remark.remark_port_group2.cos_0.priority_remark.dscp = [37]
remark.remark_port_group2.cos_1.priority_remark.dscp = [37]
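
The interaction between the ingress rewrite flag and the egress mapping can be sketched as follows. This is a simplified model of the behavior described above, not the ASIC logic; the names are hypothetical, and leaving the DSCP unchanged when the egress port has no mapping for a switch priority is an assumption.

```python
# Ingress ports with DSCP rewrite enabled (remark_port_group1 in the example).
REWRITE_ENABLED = {"swp1"}

# Egress mappings of switch priority to DSCP (remark_port_group2 in the example).
EGRESS_DSCP_MAP = {"swp2": {0: 37, 1: 37}}


def egress_dscp(ingress_port: str, egress_port: str,
                switch_priority: int, dscp: int) -> int:
    """Return the DSCP value the packet carries when it leaves the switch."""
    if ingress_port not in REWRITE_ENABLED:
        return dscp  # rewrite is not enabled on the ingress port
    mapping = EGRESS_DSCP_MAP.get(egress_port, {})
    return mapping.get(switch_priority, dscp)
```

A packet with switch priority 0 that enters on swp1 and leaves on swp2 is remarked to DSCP 37, while the same packet entering on swp3 keeps its original DSCP value.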

Egress Scheduling

You can use port groups with egress scheduling weights to assign different profiles to different egress ports.

In the following example, the profile list2 applies to swp1, swp3, and swp18. list2 only assigns weights to queues 2, 5, and 6, and schedules the other queues on a best-effort basis when there is no congestion in queues 2, 5, or 6. list1 applies to swp2 and assigns weights to all queues.

cumulus@switch:~$ nv set qos egress-scheduler list2 traffic-class 2,5,6 mode dwrr 
cumulus@switch:~$ nv set qos egress-scheduler list2 traffic-class 2,5 bw-percent 50 
cumulus@switch:~$ nv set qos egress-scheduler list2 traffic-class 6 mode strict
cumulus@switch:~$ nv set interface swp1,swp3,swp18 qos egress-scheduler profile list2
cumulus@switch:~$ nv set qos egress-scheduler list1 traffic-class 0,3,4,5,6 mode dwrr 
cumulus@switch:~$ nv set qos egress-scheduler list1 traffic-class 0,3,4,5,6 bw-percent 10 
cumulus@switch:~$ nv set qos egress-scheduler list1 traffic-class 1 mode dwrr
cumulus@switch:~$ nv set qos egress-scheduler list1 traffic-class 1 bw-percent 20 
cumulus@switch:~$ nv set qos egress-scheduler list1 traffic-class 2 mode dwrr
cumulus@switch:~$ nv set qos egress-scheduler list1 traffic-class 2 bw-percent 30 
cumulus@switch:~$ nv set qos egress-scheduler list1 traffic-class 7 mode strict
cumulus@switch:~$ nv set interface swp2 qos egress-scheduler profile list1
cumulus@switch:~$ nv config apply

You define port groups with egress_sched.port_group_list in the /etc/cumulus/datapath/qos/qos_features.conf file. An egress_sched.port_group_list includes the names for the group settings. The name is a label (profile) for the configuration settings.

egress_sched.port_group_list = [list1,list2]
egress_sched.list1.port_set = swp2
egress_sched.list1.egr_queue_0.bw_percent = 10
egress_sched.list1.egr_queue_1.bw_percent = 20
egress_sched.list1.egr_queue_2.bw_percent = 30
egress_sched.list1.egr_queue_3.bw_percent = 10
egress_sched.list1.egr_queue_4.bw_percent = 10
egress_sched.list1.egr_queue_5.bw_percent = 10
egress_sched.list1.egr_queue_6.bw_percent = 10
egress_sched.list1.egr_queue_7.bw_percent = 0
#
egress_sched.list2.port_set = [swp1,swp3,swp18]
egress_sched.list2.egr_queue_2.bw_percent = 50
egress_sched.list2.egr_queue_5.bw_percent = 50
egress_sched.list2.egr_queue_6.bw_percent = 0

Configuration Description

egress_sched.port_group_list  The names of the port groups (labels) to use.
The following example defines port groups list1 and list2:
egress_sched.port_group_list = [list1,list2]

egress_sched.list1.port_set  The interfaces on which you want to apply the port group.
egress_sched.list1.port_set = swp2

egress_sched.list1.egr_queue_0.bw_percent  The percentage of bandwidth for egress queue 0.
egress_sched.list1.egr_queue_0.bw_percent = 10

egress_sched.list1.egr_queue_1.bw_percent  The percentage of bandwidth for egress queue 1.
egress_sched.list1.egr_queue_1.bw_percent = 20

egress_sched.list1.egr_queue_2.bw_percent  The percentage of bandwidth for egress queue 2.
egress_sched.list1.egr_queue_2.bw_percent = 30

egress_sched.list1.egr_queue_3.bw_percent  The percentage of bandwidth for egress queue 3.
egress_sched.list1.egr_queue_3.bw_percent = 10

egress_sched.list1.egr_queue_4.bw_percent  The percentage of bandwidth for egress queue 4.
egress_sched.list1.egr_queue_4.bw_percent = 10

egress_sched.list1.egr_queue_5.bw_percent  The percentage of bandwidth for egress queue 5.
egress_sched.list1.egr_queue_5.bw_percent = 10

egress_sched.list1.egr_queue_6.bw_percent  The percentage of bandwidth for egress queue 6.
egress_sched.list1.egr_queue_6.bw_percent = 10

egress_sched.list1.egr_queue_7.bw_percent  The percentage of bandwidth for egress queue 7.
0 indicates a strict priority queue:
egress_sched.list1.egr_queue_7.bw_percent = 0

egress_sched.list2.port_set  The interfaces on which you want to apply the port group.
The following example applies port group list2 to swp1, swp3, and swp18:
egress_sched.list2.port_set = [swp1,swp3,swp18]

egress_sched.list2.egr_queue_2.bw_percent  The percentage of bandwidth for egress queue 2.
egress_sched.list2.egr_queue_2.bw_percent = 50

egress_sched.list2.egr_queue_5.bw_percent  The percentage of bandwidth for egress queue 5.
egress_sched.list2.egr_queue_5.bw_percent = 50

egress_sched.list2.egr_queue_6.bw_percent  The percentage of bandwidth for egress queue 6.
0 indicates a strict priority queue:
egress_sched.list2.egr_queue_6.bw_percent = 0

PFC

To set priority flow control on a group of ports, you create a profile to define the egress queues that support sending PFC pause frames and define the set of interfaces to which you want to apply PFC pause frame configuration. Cumulus Linux automatically enables PFC frame transmit and PFC frame receive, and derives all other PFC settings, such as the buffer limits that trigger PFC frames transmit to start and stop, the amount of reserved buffer space, and the cable length.

The following example applies a PFC profile called my_pfc_ports for egress queues 3 and 5 on swp1, swp2, swp3, swp4, and swp6.

cumulus@switch:~$ nv set qos pfc my_pfc_ports switch-priority 3,5
cumulus@switch:~$ nv set interface swp1-4,swp6 qos pfc profile my_pfc_ports
cumulus@switch:~$ nv config apply

The following example applies a PFC profile called my_pfc_ports2 for egress queue 0 on swp1. The commands disable PFC frame receive, and set the buffer limit that triggers PFC frame transmission to start to 1500 bytes and to stop to 1000 bytes. The commands also set the amount of reserved buffer space to 2000 bytes, and the cable length to 50 meters:

cumulus@switch:~$ nv set qos pfc my_pfc_ports2 switch-priority 0 
cumulus@switch:~$ nv set qos pfc my_pfc_ports2 xoff-threshold 1500 
cumulus@switch:~$ nv set qos pfc my_pfc_ports2 xon-threshold 1000 
cumulus@switch:~$ nv set qos pfc my_pfc_ports2 tx enable 
cumulus@switch:~$ nv set qos pfc my_pfc_ports2 rx disable 
cumulus@switch:~$ nv set qos pfc my_pfc_ports2 port-buffer 2000 
cumulus@switch:~$ nv set qos pfc my_pfc_ports2 cable-length 50
cumulus@switch:~$ nv set interface swp1 qos pfc profile my_pfc_ports2
cumulus@switch:~$ nv config apply

All PFC commands

Command Description

nv set qos pfc <profile> port-buffer <value>  The amount of reserved buffer space (from the global shared buffer) for the interfaces defined in the port group list.
The following example sets the amount of reserved buffer space to 25000 bytes:
nv set qos pfc my_pfc_ports port-buffer 25000

nv set qos pfc <profile> xoff-threshold <value>  The amount of reserved buffer that the switch must consume before sending a PFC pause frame out of the set of interfaces in the port group list.
The following example sends PFC pause frames after consuming 20000 bytes of reserved buffer:
nv set qos pfc my_pfc_ports xoff-threshold 20000

nv set qos pfc <profile> xon-threshold <value>  The number of bytes below the xoff threshold that the buffer consumption must drop below before sending PFC pause frames stops.
In the following example, the buffer congestion must reduce by 1000 bytes (to 19000 bytes, given the xoff threshold of 20000 bytes above) before PFC pause frames stop:
nv set qos pfc my_pfc_ports xon-threshold 1000

nv set qos pfc <profile> rx enable
nv set qos pfc <profile> rx disable
Enables or disables receiving PFC pause frames. You do not need to define the COS values for rx enable. The switch receives any COS value. The default value is enable.
The following example disables receiving PFC pause frames:
nv set qos pfc my_pfc_ports rx disable

nv set qos pfc <profile> tx enable
nv set qos pfc <profile> tx disable
Enables or disables sending PFC pause frames. The default value is enable.
The following example disables sending PFC pause frames:
nv set qos pfc my_pfc_ports tx disable

nv set qos pfc <profile> cable-length <value>  The length, in meters, of the cable that attaches to the ports. Cumulus Linux uses this value internally to determine the latency between generating a PFC pause frame and receiving the PFC pause frame. The default is 10 meters.
The following example sets the cable length to 5 meters:
nv set qos pfc my_pfc_ports cable-length 5

Edit the priority flow control section of the /etc/cumulus/datapath/qos/qos_features.conf file.

The following example applies a PFC profile called my_pfc_ports for egress queues 3 and 5 on swp1, swp2, swp3, swp4, and swp6.

pfc.port_group_list = [my_pfc_ports]
pfc.my_pfc_ports.cos_list = [3,5]
pfc.my_pfc_ports.port_set = swp1-swp4,swp6

The following example applies a PFC profile called my_pfc_ports2 for egress queue 0 on swp1. The configuration also disables PFC frame receive, and sets xoff_size to 1500 bytes, xon_delta to 1000 bytes, the reserved port buffer to 2000 bytes, and the cable length to 10 meters:

pfc.port_group_list = [my_pfc_ports2]
pfc.my_pfc_ports2.cos_list = [0]
pfc.my_pfc_ports2.port_set = swp1
pfc.my_pfc_ports2.port_buffer_bytes = 2000
pfc.my_pfc_ports2.xoff_size = 1500
pfc.my_pfc_ports2.xon_delta = 1000
pfc.my_pfc_ports2.tx_enable = true
pfc.my_pfc_ports2.rx_enable = false
pfc.my_pfc_ports2.cable_length = 10

All PFC configuration options

Configuration Description

pfc.my_pfc_ports.port_buffer_bytes  The amount of reserved buffer space (from the global shared buffer) for the interfaces defined in the port group list.
The following example sets the amount of reserved buffer space to 25000 bytes:
pfc.my_pfc_ports.port_buffer_bytes = 25000

pfc.my_pfc_ports.xoff_size  The amount of reserved buffer that the switch must consume before sending a PFC pause frame out of the set of interfaces in the port group list.
The following example sends PFC pause frames after consuming 10000 bytes of reserved buffer:
pfc.my_pfc_ports.xoff_size = 10000

pfc.my_pfc_ports.xon_delta  The number of bytes below the xoff threshold that the buffer consumption must drop below before sending PFC pause frames stops.
In the following example, the buffer congestion must reduce by 2000 bytes (to 8000 bytes) before PFC pause frames stop:
pfc.my_pfc_ports.xon_delta = 2000

pfc.my_pfc_ports.tx_enable  Enables (true) or disables (false) sending PFC pause frames. The default value is true.
The following example enables sending PFC pause frames:
pfc.my_pfc_ports.tx_enable = true

pfc.my_pfc_ports.rx_enable  Enables (true) or disables (false) receiving PFC pause frames. You do not need to define the COS values for rx_enable. The switch receives any COS value. The default value is true.
The following example enables receiving PFC pause frames:
pfc.my_pfc_ports.rx_enable = true

pfc.my_pfc_ports.cable_length  The length, in meters, of the cable that attaches to the port in the port group list. Cumulus Linux uses this value internally to determine the latency between generating a PFC pause frame and receiving the PFC pause frame. The default is 10 meters.
In this example, the cable is 5 meters:
pfc.my_pfc_ports.cable_length = 5
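
The relationship between xoff_size and xon_delta is a hysteresis loop: pause frames start at the xoff level and stop only after the buffer drains below xoff_size minus xon_delta. The following sketch models that behavior for illustration. It is not the switch implementation; the class name is hypothetical, and treating "reaches the xoff level" as greater than or equal to is an assumption.

```python
class PfcPauseState:
    """Track whether the switch is sending PFC pause frames (illustrative)."""

    def __init__(self, xoff_size: int, xon_delta: int):
        self.xoff = xoff_size
        self.xon = xoff_size - xon_delta  # level at which pausing stops
        self.pausing = False

    def update(self, buffer_bytes: int) -> bool:
        """Feed the current buffer consumption; return True while pausing."""
        if not self.pausing and buffer_bytes >= self.xoff:
            self.pausing = True   # start sending pause frames
        elif self.pausing and buffer_bytes < self.xon:
            self.pausing = False  # buffer drained below the xon level
        return self.pausing
```

With `PfcPauseState(10000, 2000)`, the xon level is 8000 bytes. Pausing starts when consumption reaches 10000 bytes and continues at 9000 bytes, and it stops only when consumption falls below 8000 bytes.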

ECN

You can create ECN profiles and assign them to different ports.

The following example creates a custom ECN profile called my-red-profile for egress queue (traffic-class) 1 and 2. The commands set the minimum buffer threshold to 40000 bytes, maximum buffer threshold to 200000 bytes, and the probability to 10. The commands also enable RED and apply the ECN profile to swp1 and swp2.

cumulus@switch:~$ nv set qos congestion-control my-red-profile traffic-class 1,2 min-threshold-bytes 40000 
cumulus@switch:~$ nv set qos congestion-control my-red-profile traffic-class 1,2 max-threshold-bytes 200000 
cumulus@switch:~$ nv set qos congestion-control my-red-profile traffic-class 1,2 probability 10
cumulus@switch:~$ nv set qos congestion-control my-red-profile traffic-class 1,2 red enable
cumulus@switch:~$ nv set interface swp1,swp2 qos congestion-control my-red-profile
cumulus@switch:~$ nv config apply
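
To make the interaction between the two thresholds and the probability concrete, here is a minimal Python sketch of the classic RED curve. It assumes that the probability is a percentage, that it ramps linearly between the minimum and maximum thresholds, and that every eligible packet is marked (or dropped) above the maximum threshold. This guide does not spell out the exact curve the switch uses, so treat the function (a hypothetical name) as an approximation and not as the switch implementation.

```python
def mark_probability(queue_bytes: int, min_bytes: int, max_bytes: int,
                     probability: float) -> float:
    """Percent chance that a packet is marked or dropped (classic RED curve)."""
    if not 0 <= probability <= 100:
        raise ValueError("probability must be between 0 and 100")
    if min_bytes >= max_bytes:
        raise ValueError("min threshold must be below max threshold")
    if queue_bytes < min_bytes:
        return 0.0      # below the minimum threshold: never mark
    if queue_bytes >= max_bytes:
        return 100.0    # at or above the maximum threshold: always mark (assumed)
    # Linear ramp from 0 up to `probability` between the two thresholds.
    return probability * (queue_bytes - min_bytes) / (max_bytes - min_bytes)


# Thresholds and probability from the my-red-profile example above.
for depth in (30000, 120000, 200000):
    print(depth, mark_probability(depth, 40000, 200000, 10))
```

With these values, a queue depth of 120000 bytes sits halfway between the thresholds, so the sketch returns half of the configured probability.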

You can configure different thresholds and probability values for different traffic classes in a custom profile:

cumulus@switch:~$ nv set qos congestion-control my-red-profile traffic-class 1,2 min-threshold-bytes 40000 
cumulus@switch:~$ nv set qos congestion-control my-red-profile traffic-class 1,2 max-threshold-bytes 200000 
cumulus@switch:~$ nv set qos congestion-control my-red-profile traffic-class 1,2 probability 10
cumulus@switch:~$ nv set qos congestion-control my-red-profile traffic-class 1,2 red enable
cumulus@switch:~$ nv set qos congestion-control my-red-profile traffic-class 4 min-threshold-bytes 30000 
cumulus@switch:~$ nv set qos congestion-control my-red-profile traffic-class 4 max-threshold-bytes 150000 
cumulus@switch:~$ nv set qos congestion-control my-red-profile traffic-class 4 probability 80
cumulus@switch:~$ nv set interface swp1,swp2 qos congestion-control my-red-profile
cumulus@switch:~$ nv config apply

You can disable ECN bit marking for an ECN profile. The following example disables ECN bit marking for traffic class 1 in the my-red-profile profile:

cumulus@switch:~$ nv set qos congestion-control my-red-profile traffic-class 1 ecn disable
cumulus@switch:~$ nv config apply

Edit the Explicit Congestion Notification section of the /etc/cumulus/datapath/qos/qos_features.conf file.

The following example creates a custom ECN profile called my-red-profile for egress queues 1 and 2, with a minimum buffer threshold of 40000 bytes, a maximum buffer threshold of 200000 bytes, and a probability of 10. The settings also enable RED and apply the ECN profile to swp1 and swp2.

ecn_red.port_group_list = [my-red-profile] 
my-red-profile.egress_queue_list = [1,2]
my-red-profile.port_set = swp1,swp2
my-red-profile.ecn_enable = true
my-red-profile.red_enable = true
my-red-profile.min_threshold_bytes = 40000
my-red-profile.max_threshold_bytes = 200000
my-red-profile.probability = 10

To disable ECN bit marking, set ecn_enable to false. The following example disables ECN bit marking in the my-red-profile profile:

...
my-red-profile.ecn_enable = false 
...

Traffic Pools

Cumulus Linux supports adjusting the following traffic pools:

Traffic Pool Description
default-lossy The default traffic pool for all switch priorities.
default-lossless The traffic pool for lossless traffic when you enable flow control.
mc-lossy The traffic pool for multicast traffic.
roce-lossy The traffic pool for RoCE lossy mode.
roce-lossless The traffic pool for RoCE lossless mode.

  • You can only have a single lossless pool configured on the switch at a time. Configure the roce-lossless pool when you are using RoCE, otherwise configure the default-lossless pool.

  • You can configure multiple lossy pools concurrently.

You configure a traffic pool by associating switch priorities with the pool and defining the percentage of buffer memory allocated to it. The following example associates switch priority 2 with the mc-lossy pool and allocates 30 percent of memory to it. The remaining switch priorities stay in the default-lossy pool, which receives 70 percent:

cumulus@switch:~$ nv set qos traffic-pool default-lossy switch-priority 0,1,3,4,5,6,7
cumulus@switch:~$ nv set qos traffic-pool default-lossy memory-percent 70
cumulus@switch:~$ nv set qos traffic-pool mc-lossy switch-priority 2
cumulus@switch:~$ nv set qos traffic-pool mc-lossy memory-percent 30
cumulus@switch:~$ nv config apply

Configure the following settings in the /etc/mlx/datapath/qos/qos_infra.conf file:

traffic.priority_group_list = [service2,bulk]

priority_group.service2.cos_list = [2]
priority_group.bulk.cos_list = [0,1,3,4,5,6,7]

priority_group.service2.id = 2

priority_group.service2.service_pool = 2

ingress_service_pool.2.percent = 30
ingress_service_pool.0.percent = 70

port.service_pool.2.ingress_buffer.reserved = 10240

ingress_service_pool.2.mode = 1

port.service_pool.2.ingress_buffer.dynamic_quota = ALPHA_8

priority_group.service2.ingress_buffer.dynamic_quota = ALPHA_8

egress_buffer.egr_queue_2.uc.service_pool = 2

egress_service_pool.2.percent = 30
egress_service_pool.0.percent = 70

port.service_pool.2.egress_buffer.uc.reserved = 0

egress_buffer.cos_2.mc.service_pool = 2

egress_buffer.egr_queue_2.uc.reserved = 1024

port.egress_buffer.mc.reserved = 10240
port.egress_buffer.mc.shared_size = 2097152
egress_service_pool.2.mode = 1

port.service_pool.2.egress_buffer.uc.dynamic_quota = ALPHA_8

egress_buffer.egr_queue_2.uc.dynamic_quota = ALPHA_8

egress_buffer.cos_2.mc.dynamic_quota = ALPHA_8

For additional default-lossless and RoCE pool examples, see Flow Control Buffers and RoCE. You can view traffic-pool configuration with the nv show qos traffic-pool <pool name> command:

cumulus@switch:~$  nv show qos traffic-pool default-lossy
                   applied
-----------------  -------
memory-percent     80     
[switch-priority]  0      
[switch-priority]  1      
[switch-priority]  2      
[switch-priority]  3      
[switch-priority]  4      
[switch-priority]  5      
[switch-priority]  6      
[switch-priority]  7      

Advanced Buffer Tuning

You can use NVUE commands to tune advanced buffer properties in addition to the supported traffic pool configurations. Advanced buffer configuration can override the base traffic-pool profiles configured on the system.

You can only configure advanced buffer settings for the default-global profile.

Buffer Regions

You can adjust advanced buffer settings with the following NVUE command:

You can adjust settings for the following supported buffer regions and properties:

Buffers Supported Property Values
ingress-lossy-buffer
    Cumulus Linux supports the following properties for the bulk and service[1-7] priority groups:
    name - The priority group alias name.
    reserved - The reserved buffer allocation in bytes.
    service-pool - Service pool mapping.
    shared-alpha - The dynamic shared buffer alpha allocation.
    shared-bytes - The static shared buffer allocation in bytes.
    switch-priority - Switch priority values.
egress-lossless-buffer
    reserved - The reserved buffer allocation in bytes.
    service-pool - Service pool mapping.
    shared-alpha - The dynamic shared buffer alpha allocation.
    shared-bytes - The static shared buffer allocation in bytes.
ingress-lossless-buffer
    service-pool - Service pool mapping.
    shared-alpha - The dynamic shared buffer alpha allocation.
    shared-bytes - The static shared buffer allocation in bytes.
egress-lossy-buffer
    multicast-port - Multicast port reserved or shared-bytes allocation in bytes.
    multicast-switch-priority [0-7] - Set the reserved, service-pool, shared-alpha, or shared-bytes properties for each multicast switch priority.
    traffic-class [0-15] - Set the reserved, service-pool, shared-alpha, or shared-bytes properties for each traffic class.

Configure shared-bytes for buffer regions mapped to static service pools, and shared-alpha for buffer regions mapped to dynamic service pools.

The shared buffer alpha value determines the proportion of available shared memory allocated across buffer regions. Regions with higher alpha values receive a higher proportion of available shared buffer memory. The following example changes the ingress-lossless-buffer shared alpha value to alpha_2 when using RoCE lossless mode:

cumulus@switch:~$ nv set qos advance-buffer-config default-global ingress-lossless-buffer shared-alpha alpha_2
cumulus@switch:~$ nv config apply
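
The alpha names encode a numeric factor. The Python sketch below converts a name such as alpha_1_8 or alpha_2 into that factor and applies the common dynamic-threshold model, in which a region can use up to alpha times the currently free shared memory. Both the name-to-number reading and the formula are assumptions based on how dynamic shared buffers generally work; this guide only states that higher alpha values receive a larger share. The function names are hypothetical.

```python
import math


def alpha_value(name: str) -> float:
    """Convert a name such as alpha_1_8, alpha_2, or alpha_infinity to a number."""
    parts = name.lower().split("_")
    if parts[0] != "alpha" or len(parts) not in (2, 3):
        raise ValueError(f"not an alpha name: {name}")
    if len(parts) == 2 and parts[1] == "infinity":
        return math.inf
    numbers = [int(p) for p in parts[1:]]
    if len(numbers) == 2:
        return numbers[0] / numbers[1]   # alpha_1_8 -> 1/8
    return float(numbers[0])             # alpha_2 -> 2.0


def dynamic_limit(alpha: float, free_shared_bytes: int) -> float:
    """Assumed model: a region may grow to alpha * free shared memory."""
    if math.isinf(alpha):
        return math.inf
    return alpha * free_shared_bytes


print(alpha_value("alpha_2"))                              # 2.0
print(dynamic_limit(alpha_value("alpha_1_2"), 100000))     # 50000.0
```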

Service Pools

You can configure ingress and egress service pool profile properties with the following NVUE commands:

You can adjust the following properties for each pool:

Property Description
infinite The pool infinite flag.
memory-percent The pool memory percent allocation.
mode The pool mode: static or dynamic.
reserved The reserved buffer allocation in bytes.
shared-alpha The dynamic shared buffer alpha allocation.
shared-bytes The static shared buffer allocation in bytes.

A relationship exists between the default traffic pools and the advanced buffer configuration settings.

Use caution when configuring advanced buffer settings. NVUE presents a warning if you attempt to apply incompatible traffic pool and advanced buffer configurations. NVUE performs the following validation checks before applying advanced buffer configurations:

  • You must map all switch priorities (0-7) to a priority group. You can map more than one switch priority to the same priority group.
  • The sum of memory-percent values across all ingress pools must be less than or equal to 100 percent.
  • The sum of memory-percent values across all egress pools must be less than or equal to 100 percent.
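
The three checks above can be expressed as a short Python sketch for planning a configuration offline. This is not the NVUE validation code; the function name and the dictionary layout are hypothetical, and NVUE might run additional checks that are not listed here.

```python
def validate_buffer_plan(priority_groups: dict[str, list[int]],
                         ingress_pools: dict[int, int],
                         egress_pools: dict[int, int]) -> list[str]:
    """Return a list of problems; an empty list means all three checks pass."""
    errors = []
    # Check 1: every switch priority 0-7 belongs to some priority group.
    mapped = {p for priorities in priority_groups.values() for p in priorities}
    missing = sorted(set(range(8)) - mapped)
    if missing:
        errors.append(f"switch priorities without a priority group: {missing}")
    # Checks 2 and 3: memory-percent totals must not exceed 100.
    for direction, pools in (("ingress", ingress_pools), ("egress", egress_pools)):
        total = sum(pools.values())
        if total > 100:
            errors.append(f"{direction} memory-percent total is {total}")
    return errors


# Pool IDs map to memory-percent values.
groups = {"bulk": [0, 1, 3, 4, 5, 6, 7], "service2": [2]}
print(validate_buffer_plan(groups, {0: 70, 2: 30}, {0: 70, 2: 30}))  # []
print(validate_buffer_plan(groups, {0: 100, 3: 20}, {0: 100}))
```

The second call reports an ingress total of 120, which is why you must first reduce an existing pool before assigning memory to a new one.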

Reference the table below to view the mappings between the default traffic pool and advanced buffer properties:

Default Traffic Pool  Default Traffic Pool Properties  Advanced Buffer Region or Service Pool         Advanced Buffer Properties
--------------------  -------------------------------  ---------------------------------------------  --------------------------
default-lossy         memory-percent                   ingress-service-pool 0, egress-service-pool 0  memory-percent
default-lossy         switch-priority                  ingress-lossy-buffer priority-group bulk       switch-priority
default-lossless      memory-percent                   ingress-service-pool 1, egress-service-pool 1  memory-percent
roce-lossless         memory-percent                   ingress-service-pool 1, egress-service-pool 1  memory-percent
mc-lossy              memory-percent                   ingress-service-pool 2, egress-service-pool 2  memory-percent
mc-lossy              switch-priority                  ingress-lossy-buffer priority-group service2   switch-priority

For example, to assign 20 percent of memory to a new static service pool, you must allow 20 percent of memory to be available from the default traffic pools. The following commands reduce the default-lossy traffic pool to 80 percent memory, allowing you to assign the memory to ingress-service-pool 3:

cumulus@switch:~$ nv set qos traffic-pool default-lossy memory-percent 80
cumulus@switch:~$ nv set qos advance-buffer-config default-global ingress-service-pool 3 memory-percent 20
cumulus@switch:~$ nv config apply

You can view advanced buffer configuration with the nv show qos advance-buffer-config default-global <buffer/pool name> command:

cumulus@switch:~$ nv show qos advance-buffer-config default-global ingress-service-pool
Pool-Id  infinite  memory-percent  mode     reserved  shared-alpha  shared-bytes
-------  --------  --------------  -------  --------  ------------  ------------
0                  80              dynamic                                      
3                  20    

Lossy Headroom

Lossy headroom is the buffer that stores packets waiting to be processed by the switch. If the expected processing latency is longer than normal (for example, if there are multiple ACL rules), increase the lossy headroom.

To change the lossy headroom for a priority group, run the following commands. The switch calculates the default value internally based on the MTU and internal latency.

Run the nv set qos advance-buffer-config default-global ingress-lossy-buffer priority-group <priority-group> headroom <bytes> command, where <priority-group> is bulk or service1 through service7.

The following example configures the lossy headroom to 50000 bytes for priority group service1:

cumulus@switch:~$ nv set qos advance-buffer-config default-global ingress-lossy-buffer priority-group service1 headroom 50000
cumulus@switch:~$ nv config apply

To unset the lossy headroom for a priority group, run the nv unset qos advance-buffer-config default-global ingress-lossy-buffer priority-group <priority-group> headroom command.

Edit the /etc/mlx/datapath/qos/qos_infra.conf file to adjust the priority_group.<priority-group>.ingress_buffer.lossy_headroom parameter. <priority-group> can be bulk or service1 through service7.

The following example configures the lossy headroom to 50000 bytes for priority group service1:

cumulus@switch:~$ sudo nano /etc/mlx/datapath/qos/qos_infra.conf
...
priority_group.service1.ingress_buffer.lossy_headroom = 50000 

To unset the lossy headroom for a priority group, comment out the priority_group.<priority-group>.ingress_buffer.lossy_headroom parameter.
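
If you script edits to the file, commenting out a parameter can be done with a few lines of Python. The sketch below is illustrative and works on the file contents as a string; the function name is hypothetical, and it assumes each setting occupies a single key = value line.

```python
def comment_out_parameter(conf_text: str, key: str) -> str:
    """Prefix every active line that assigns `key` with a # character."""
    result = []
    for line in conf_text.splitlines():
        name = line.split("=", 1)[0].strip()
        if "=" in line and name == key:
            result.append("#" + line)   # disable the setting, keep it for reference
        else:
            result.append(line)         # other settings and existing comments stay as-is
    return "\n".join(result) + "\n"


text = (
    "priority_group.service1.ingress_buffer.lossy_headroom = 50000\n"
    "priority_group.bulk.cos_list = [0,1,3,4,5,6,7]\n"
)
print(comment_out_parameter(
    text, "priority_group.service1.ingress_buffer.lossy_headroom"))
```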

Ingress and Egress Management Buffers

Management traffic consists of OSPF and BGP hello and update packets, and BFD packets that ingress and egress the CPU.

To configure the ingress management buffer:

Run the nv set qos advance-buffer-config default-global ingress-mgmt-buffer <option> <value> command. You can adjust the following options:

Option Description
headroom The ingress management buffer headroom allocation in bytes.
reserved The ingress management reserved buffer allocation in bytes.
service-pool The ingress management buffer service pool mapping. You can specify a value between 0 and 7.
shared-alpha The dynamic ingress management shared buffer alpha allocation. You can specify one of these values: alpha_0, alpha_1_128, alpha_1_64, alpha_1_32, alpha_1_16, alpha_1_8, alpha_1_4, alpha_1_2, alpha_1, alpha_2, alpha_4, alpha_8, alpha_16, alpha_32, alpha_64, or alpha_infinity.
shared-bytes The static ingress management shared buffer allocation in bytes.

If the service-pool to which the ingress management buffer maps is dynamic, use the shared-alpha value. If the mapped service-pool is static, use the shared-bytes value.

The following example configures the ingress management buffer headroom to 20000 bytes:

cumulus@switch:~$ nv set qos advance-buffer-config default-global ingress-mgmt-buffer headroom 20000
cumulus@switch:~$ nv config apply

The following example configures the ingress management reserved buffer allocation to 45000 bytes:

cumulus@switch:~$ nv set qos advance-buffer-config default-global ingress-mgmt-buffer reserved 45000
cumulus@switch:~$ nv config apply

The following example configures the ingress management buffer service pool mapping to 0:

cumulus@switch:~$ nv set qos advance-buffer-config default-global ingress-mgmt-buffer service-pool 0
cumulus@switch:~$ nv config apply

The following example configures the static shared ingress management buffer to 14000 bytes:

cumulus@switch:~$ nv set qos advance-buffer-config default-global ingress-mgmt-buffer shared-bytes 14000
cumulus@switch:~$ nv config apply

The following example configures the dynamic shared ingress management buffer alpha allocation to alpha_2:

cumulus@switch:~$ nv set qos advance-buffer-config default-global ingress-mgmt-buffer shared-alpha alpha_2
cumulus@switch:~$ nv config apply
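
The rule for choosing between shared-alpha and shared-bytes depends only on the mode of the mapped service pool. As a rough sketch, the following Python helper (a hypothetical name, not an NVUE API) builds the matching command string from the pool mode; it only formats text using the command syntax shown in this section and does not run anything on the switch.

```python
ALPHAS = {
    "alpha_0", "alpha_1_128", "alpha_1_64", "alpha_1_32", "alpha_1_16",
    "alpha_1_8", "alpha_1_4", "alpha_1_2", "alpha_1", "alpha_2", "alpha_4",
    "alpha_8", "alpha_16", "alpha_32", "alpha_64", "alpha_infinity",
}


def mgmt_shared_buffer_command(direction: str, pool_mode: str, value) -> str:
    """Build the nv set command for the shared part of a management buffer."""
    if direction not in ("ingress", "egress"):
        raise ValueError("direction must be ingress or egress")
    base = ("nv set qos advance-buffer-config default-global "
            f"{direction}-mgmt-buffer")
    if pool_mode == "dynamic":
        if value not in ALPHAS:
            raise ValueError(f"unknown alpha value: {value}")
        return f"{base} shared-alpha {value}"
    if pool_mode == "static":
        return f"{base} shared-bytes {int(value)}"
    raise ValueError("pool_mode must be static or dynamic")


print(mgmt_shared_buffer_command("ingress", "dynamic", "alpha_2"))
print(mgmt_shared_buffer_command("ingress", "static", 14000))
```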

To unset the ingress management buffer settings, run the nv unset qos advance-buffer-config default-global ingress-mgmt-buffer <option> command. For example:

cumulus@switch:~$ nv unset qos advance-buffer-config default-global ingress-mgmt-buffer reserved
cumulus@switch:~$ nv config apply

Edit the /etc/mlx/datapath/qos/qos_infra.conf file to add the following parameters.

Parameter Description
management.ingress_service_pool The ingress management buffer service pool mapping. You can specify a value between 0 and 7.
management.ingress_buffer.lossy_headroom The ingress management buffer headroom allocation in bytes.
management.ingress_buffer.reserved The ingress management reserved buffer allocation in bytes.
management.ingress_buffer.shared_size The static ingress management shared buffer allocation in bytes. You can specify a value between 0 and 4294967295.
management.ingress_buffer.dynamic_quota The dynamic ingress management shared buffer alpha allocation. You can specify one of these values: alpha_0, alpha_1_128, alpha_1_64, alpha_1_32, alpha_1_16, alpha_1_8, alpha_1_4, alpha_1_2, alpha_1, alpha_2, alpha_4, alpha_8, alpha_16, alpha_32, alpha_64, or alpha_infinity.

If the service pool to which the ingress management buffer maps is dynamic, set the dynamic_quota parameter. If the mapped service pool is static, set the shared_size parameter.

cumulus@switch:~$ sudo nano /etc/mlx/datapath/qos/qos_infra.conf
...
# all priority groups share a service pool on Spectrum
management.ingress_service_pool = 0
...
# priority group minimum buffer allocation: size in bytes
# priority group shared buffer allocation: shared buffer size in bytes
# if a priority group has no packet priority values assigned to it, the buffers will not be allocated
...
management.ingress_buffer.reserved = 45000
management.ingress_buffer.shared_size = 14000
management.ingress_buffer.lossy_headroom = 20000 
...
# Ingress buffer per-PG dynamic buffering alpha (Default: ALPHA_8)
...
management.ingress_buffer.dynamic_quota = alpha_2

To unset the ingress management buffer settings, delete or comment out the management.ingress_service_pool or management.ingress_buffer parameters.

To configure the egress management buffer:

Run the nv set qos advance-buffer-config default-global egress-mgmt-buffer <option> <value> command. You can adjust the following options:

Option Description
reserved The egress management reserved buffer allocation in bytes.
service-pool The egress management buffer service pool mapping. You can specify a value between 0 and 7.
shared-alpha The dynamic egress management shared buffer alpha allocation. You can specify one of these values: alpha_0, alpha_1_128, alpha_1_64, alpha_1_32, alpha_1_16, alpha_1_8, alpha_1_4, alpha_1_2, alpha_1, alpha_2, alpha_4, alpha_8, alpha_16, alpha_32, alpha_64, or alpha_infinity.
shared-bytes The static egress management shared buffer allocation in bytes. You can specify a value between 0 and 4294967295.

If the service-pool to which the egress management buffer maps is dynamic, use the shared-alpha value. If the mapped service-pool is static, use the shared-bytes value.

The following example configures the egress management reserved buffer to 30000 bytes:

cumulus@switch:~$ nv set qos advance-buffer-config default-global egress-mgmt-buffer reserved 30000
cumulus@switch:~$ nv config apply

The following example configures the egress management buffer service pool mapping to 0:

cumulus@switch:~$ nv set qos advance-buffer-config default-global egress-mgmt-buffer service-pool 0
cumulus@switch:~$ nv config apply

The following example configures the dynamic egress management shared buffer alpha allocation to alpha_2:

cumulus@switch:~$ nv set qos advance-buffer-config default-global egress-mgmt-buffer shared-alpha alpha_2
cumulus@switch:~$ nv config apply

The following example configures the static egress management shared buffer to 20000 bytes:

cumulus@switch:~$ nv set qos advance-buffer-config default-global egress-mgmt-buffer shared-bytes 20000 
cumulus@switch:~$ nv config apply

To unset the egress management buffer settings, run the nv unset qos advance-buffer-config default-global egress-mgmt-buffer <option> command. For example:

cumulus@switch:~$ nv unset qos advance-buffer-config default-global egress-mgmt-buffer reserved
cumulus@switch:~$ nv config apply

Edit the /etc/mlx/datapath/qos/qos_infra.conf file to add the following parameters.

Parameter Description
management.egress_service_pool The egress management buffer service pool mapping. You can specify a value between 0 and 7.
management.egress_buffer.reserved The egress management reserved buffer allocation in bytes.
management.egress_buffer.dynamic_quota The dynamic egress management shared buffer alpha allocation. You can specify one of these values: alpha_0, alpha_1_128, alpha_1_64, alpha_1_32, alpha_1_16, alpha_1_8, alpha_1_4, alpha_1_2, alpha_1, alpha_2, alpha_4, alpha_8, alpha_16, alpha_32, alpha_64, or alpha_infinity.
management.egress_buffer.shared_size The static egress management shared buffer allocation in bytes.

If the service pool to which the egress management buffer maps is dynamic, set the dynamic_quota parameter. If the mapped service pool is static, set the shared_size parameter.

cumulus@switch:~$ sudo nano /etc/mlx/datapath/qos/qos_infra.conf
...
# service pool assigned for egress queues
management.egress_service_pool = 0
...
# Shared buffer allocation for ePort.TC region : size in bytes.
...
management.egress_buffer.shared_size = 20000
...
# Minimum buffer allocation for ePort.TC region: size in bytes
...
management.egress_buffer.reserved = 30000
...
# Egress buffer per-egress-queue dynamic buffering quota (alpha) for multicast (Default: ALPHA_INFINITY)
...
management.egress_buffer.dynamic_quota = alpha_2

To unset the egress management buffer settings, delete or comment out the management.egress_service_pool or management.egress_buffer parameters.

To show the ingress management buffer configuration, run the nv show qos advance-buffer-config default-global ingress-mgmt-buffer command:

cumulus@switch:~$ nv show qos advance-buffer-config default-global ingress-mgmt-buffer
              operational       applied
------------  -----------       ---- 
headroom      30000 Bytes       30000 Bytes 
shared-bytes   19.53 KB         19.53 KB 

To show the egress management buffer configuration, run the nv show qos advance-buffer-config default-global egress-mgmt-buffer command:

cumulus@switch:~$ nv show qos advance-buffer-config default-global egress-mgmt-buffer 
              operational       applied 
------------  -----------       ---- 
reserved       1200 Bytes       1200 Bytes 
shared-bytes   13.53 KB         13.53 KB 

Syntax Checker

Cumulus Linux provides a syntax checker for the qos_features.conf and qos_infra.conf files to check for errors, such as missing parameters or invalid parameter labels and values.

The syntax checker runs automatically with every switchd reload.

You can run the syntax checker manually from the command line with the cl-consistency-check --datapath-syntax-check command. If errors exist, they write to stderr by default. If you run the command with -q, errors write to the /var/log/switchd.log file.

The cl-consistency-check --datapath-syntax-check command takes the following options:

Option Description
-h Displays this list of command options.
-q Runs the command in quiet mode. Errors write to the /var/log/switchd.log file instead of stderr.
-qi Runs the syntax checker against a specified qos_infra.conf file.
-qf Runs the syntax checker against a specified qos_features.conf file.

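By default, the syntax checker assumes the following file locations:

  • qos_infra.conf is in /etc/mlx/datapath/qos/qos_infra.conf
  • qos_features.conf is in /etc/cumulus/datapath/qos/qos_features.conf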

You can run the syntax checker when switchd is either running or stopped.
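
Before copying an edited file to the switch, you might want a quick local sanity check. The Python sketch below is not the Cumulus Linux syntax checker and knows nothing about valid parameter labels or values; it only flags lines that are neither blank, comments, nor key = value assignments. The function name and the pattern for allowed key characters are assumptions.

```python
import re

# Assumption: keys use letters, digits, dots, underscores, and hyphens.
ASSIGNMENT = re.compile(r"^\s*[A-Za-z0-9_.\-]+\s*=\s*\S.*$")


def precheck(conf_text: str) -> list[tuple[int, str]]:
    """Return (line number, text) for every line that is not well formed."""
    problems = []
    for number, line in enumerate(conf_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments are always fine
        if not ASSIGNMENT.match(line):
            problems.append((number, stripped))
    return problems


sample = "my-red-profile.ecn_enable = true\n# comment\n\nbroken line\nx.y =\n"
print(precheck(sample))  # [(4, 'broken line'), (5, 'x.y =')]
```

Run the cl-consistency-check --datapath-syntax-check command on the switch for the authoritative result.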

Show QoS Counters

NVUE provides the following commands to show QoS statistics for an interface:

NVUE Command Description
nv show interface <interface> counters qos Shows all QoS statistics for a specific interface.
nv show interface <interface> counters qos egress-queue-stats Shows QoS egress queue statistics for a specific interface.
nv show interface <interface> counters qos ingress-buffer-stats Shows QoS ingress buffer statistics for a specific interface.
nv show interface <interface> counters qos pfc-stats Shows QoS PFC statistics for a specific interface.
nv show interface <interface> counters qos port-stats Shows QoS port statistics for a specific interface.

The following example shows all QoS statistics for swp1:

cumulus@switch:~$ nv show interface swp1 counters qos
Ingress Buffer Statistics
============================
    priority-group  rx-frames  rx-buffer-discards  rx-shared-buffer-discards
    --------------  ---------  ------------------  -------------------------
    0               0          0 Bytes             0 Bytes                  
    1               0          0 Bytes             0 Bytes                  
    2               0          0 Bytes             0 Bytes                  
    3               0          0 Bytes             0 Bytes                  
    4               0          0 Bytes             0 Bytes                  
    5               0          0 Bytes             0 Bytes                  
    6               0          0 Bytes             0 Bytes                  
    7               0          0 Bytes             0 Bytes                  

Egress Queue Statistics
==========================
    traffic-class  tx-frames  tx-bytes  tx-uc-buffer-discards  wred-discards
    -------------  ---------  --------  ---------------------  -------------
    0              0          0 Bytes   0 Bytes                0            
    1              0          0 Bytes   0 Bytes                0            
    2              0          0 Bytes   0 Bytes                0            
    3              0          0 Bytes   0 Bytes                0            
    4              0          0 Bytes   0 Bytes                0            
    5              0          0 Bytes   0 Bytes                0            
    6              0          0 Bytes   0 Bytes                0            
    7              0          0 Bytes   0 Bytes                0            

PFC Statistics
=================
    switch-priority  rx-pause-frames  rx-pause-duration  tx-pause-frames  tx-pause-duration
    ---------------  ---------------  -----------------  ---------------  -----------------
    0                0                0                  0                0                
    1                0                0                  0                0                
    2                0                0                  0                0                
    3                0                0                  0                0                
    4                0                0                  0                0                
    5                0                0                  0                0                
    6                0                0                  0                0                
    7                0                0                  0                0                

Qos Port Statistics
======================
    Counter             Receive  Transmit
    ------------------  -------  --------
    ecn-marked-packets  n/a      0       
    mc-buffer-discards  n/a      0       
    pause-frames        0        0
... 
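
If you capture this output in a script, each section is a simple column table. The following Python sketch parses one section into a list of dictionaries by splitting cells on runs of two or more spaces. The function name is hypothetical, the nonzero sample values are made up for illustration, and the layout is assumed to match the output shown above; the sketch raises StopIteration if the text has no dashed separator line.

```python
import re


def parse_table(text: str) -> list[dict[str, str]]:
    """Parse a title/header/dashes/rows section into row dictionaries."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # The header is the line directly above the dashed separator.
    sep = next(i for i, ln in enumerate(lines) if set(ln) <= {"-", " "})
    header = lines[sep - 1].split()
    rows = []
    for ln in lines[sep + 1:]:
        cells = re.split(r"\s{2,}", ln)   # keeps "0 Bytes" as one cell
        if len(cells) != len(header):
            break                          # stop at the first non-table line
        rows.append(dict(zip(header, cells)))
    return rows


# Sample text in the layout shown above; row 1 uses invented values.
sample = """\
Egress Queue Statistics
==========================
    traffic-class  tx-frames  tx-bytes    tx-uc-buffer-discards  wred-discards
    -------------  ---------  --------    ---------------------  -------------
    0              0          0 Bytes     0 Bytes                0
    1              120        9000 Bytes  0 Bytes                3
"""
rows = parse_table(sample)
print([r["traffic-class"] for r in rows if int(r["wred-discards"]) > 0])  # ['1']
```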

Clear QoS Buffers

cumulus@switch:~$ nv action clear qos buffer pool
QoS pool buffers cleared.
Action succeeded
cumulus@switch:~$ nv action clear qos buffer multicast-switch-priority
QoS multicast buffers cleared.
Action succeeded
cumulus@switch:~$ nv action clear interface swp1 qos buffer
QoS buffers cleared on swp1.
Action succeeded

Default Configuration Files

qos_features.conf
# /etc/cumulus/datapath/qos/qos_features.conf
#
# Copyright © 2021 NVIDIA CORPORATION & AFFILIATES. ALL RIGHTS RESERVED.
#
# This software product is a proprietary product of Nvidia Corporation and its affiliates
# (the "Company") and all right, title, and interest in and to the software
# product, including all associated intellectual property rights, are and
# shall remain exclusively with the Company.
#
# This software product is governed by the End User License Agreement
# provided with the software product. 

# packet header field used to determine the packet priority level
# fields include {802.1p, dscp, port}
traffic.packet_priority_source_set = [802.1p]
traffic.port_default_priority      = 0

# packet priority source values assigned to each internal cos value
# internal cos values {cos_0..cos_7}
# (internal cos 3 has been reserved for CPU-generated traffic)
# 802.1p values = {0..7}
traffic.cos_0.priority_source.8021p = [0]
traffic.cos_1.priority_source.8021p = [1]
traffic.cos_2.priority_source.8021p = [2]
traffic.cos_3.priority_source.8021p = [3]
traffic.cos_4.priority_source.8021p = [4]
traffic.cos_5.priority_source.8021p = [5]
traffic.cos_6.priority_source.8021p = [6]
traffic.cos_7.priority_source.8021p = [7]

# dscp values = {0..63}
#traffic.cos_0.priority_source.dscp = [0,1,2,3,4,5,6,7]
#traffic.cos_1.priority_source.dscp = [8,9,10,11,12,13,14,15]
#traffic.cos_2.priority_source.dscp = [16,17,18,19,20,21,22,23]
#traffic.cos_3.priority_source.dscp = [24,25,26,27,28,29,30,31]
#traffic.cos_4.priority_source.dscp = [32,33,34,35,36,37,38,39]
#traffic.cos_5.priority_source.dscp = [40,41,42,43,44,45,46,47]
#traffic.cos_6.priority_source.dscp = [48,49,50,51,52,53,54,55]
#traffic.cos_7.priority_source.dscp = [56,57,58,59,60,61,62,63]
# remark packet priority value
# fields include {802.1p, dscp}
traffic.packet_priority_remark_set = []

# packet priority remark values assigned from each internal cos value
# internal cos values {cos_0..cos_7}
# (internal cos 3 has been reserved for CPU-generated traffic)
# 802.1p values = {0..7}
#traffic.cos_0.priority_remark.8021p = [0]
#traffic.cos_1.priority_remark.8021p = [1]
#traffic.cos_2.priority_remark.8021p = [2]
#traffic.cos_3.priority_remark.8021p = [3]
#traffic.cos_4.priority_remark.8021p = [4]
#traffic.cos_5.priority_remark.8021p = [5]
#traffic.cos_6.priority_remark.8021p = [6]
#traffic.cos_7.priority_remark.8021p = [7]

# dscp values = {0..63}
#traffic.cos_0.priority_remark.dscp = [0]
#traffic.cos_1.priority_remark.dscp = [8]
#traffic.cos_2.priority_remark.dscp = [16]
#traffic.cos_3.priority_remark.dscp = [24]
#traffic.cos_4.priority_remark.dscp = [32]
#traffic.cos_5.priority_remark.dscp = [40]
#traffic.cos_6.priority_remark.dscp = [48]
#traffic.cos_7.priority_remark.dscp = [56]

# source.port_group_list = [source_port_group]
# source.source_port_group.packet_priority_source_set = [dscp]
# source.source_port_group.port_set = swp1-swp4,swp6
# source.source_port_group.port_default_priority = 0
# source.source_port_group.cos_0.priority_source.dscp = [0,1,2,3,4,5,6,7]
# source.source_port_group.cos_1.priority_source.dscp = [8,9,10,11,12,13,14,15]
# source.source_port_group.cos_2.priority_source.dscp = [16,17,18,19,20,21,22,23]
# source.source_port_group.cos_3.priority_source.dscp = [24,25,26,27,28,29,30,31]
# source.source_port_group.cos_4.priority_source.dscp = [32,33,34,35,36,37,38,39]
# source.source_port_group.cos_5.priority_source.dscp = [40,41,42,43,44,45,46,47]
# source.source_port_group.cos_6.priority_source.dscp = [48,49,50,51,52,53,54,55]
# source.source_port_group.cos_7.priority_source.dscp = [56,57,58,59,60,61,62,63]

# remark.port_group_list = [remark_port_group]
# remark.remark_port_group.packet_priority_remark_set = [dscp]
# remark.remark_port_group.port_set = swp1-swp4,swp6
# remark.remark_port_group.cos_0.priority_remark.dscp = [0]
# remark.remark_port_group.cos_1.priority_remark.dscp = [8]
# remark.remark_port_group.cos_2.priority_remark.dscp = [16]
# remark.remark_port_group.cos_3.priority_remark.dscp = [24]
# remark.remark_port_group.cos_4.priority_remark.dscp = [32]
# remark.remark_port_group.cos_5.priority_remark.dscp = [40]
# remark.remark_port_group.cos_6.priority_remark.dscp = [48]
# remark.remark_port_group.cos_7.priority_remark.dscp = [56]

# to configure priority flow control on a group of ports:
# -- assign cos value(s) to the cos list
# -- add or replace a port group names in the port group list
# -- for each port group in the list
#    -- populate the port set, e.g.
#       swp1-swp4,swp8,swp50s0-swp50s3
#    -- set a PFC buffer size in bytes for each port in the group
#    -- set the xoff byte limit (buffer limit that triggers PFC frames transmit to start)
#    -- set the xon byte delta (buffer limit that triggers PFC frames transmit to stop)
#    -- enable PFC frame transmit and/or PFC frame receive

# priority flow control
#pfc.port_group_list = [pfc_port_group]
#pfc.pfc_port_group.cos_list = []
#pfc.pfc_port_group.port_set = swp1-swp4,swp6
#pfc.pfc_port_group.port_buffer_bytes = 25000
#pfc.pfc_port_group.xoff_size = 10000
#pfc.pfc_port_group.xon_delta = 2000
#pfc.pfc_port_group.tx_enable = true
#pfc.pfc_port_group.rx_enable = true
#Specify cable length in mts
#pfc.pfc_port_group.cable_length = 10

# to configure pause on a group of ports:
# -- add or replace port group names in the port group list
# -- for each port group in the list
#    -- populate the port set, e.g.
#       swp1-swp4,swp8,swp50s0-swp50s3
#    -- set a pause buffer size in bytes for each port
#    -- set the xoff byte limit (buffer limit that triggers pause frames transmit to start)
#    -- set the xon byte delta (buffer limit that triggers pause frames transmit to stop)
#    -- enable pause frame transmit and/or pause frame receive

# link pause
# link_pause.port_group_list = [pause_port_group]
# link_pause.pause_port_group.port_set = swp1-swp4,swp6
# link_pause.pause_port_group.port_buffer_bytes = 25000
# link_pause.pause_port_group.xoff_size = 10000
# link_pause.pause_port_group.xon_delta = 2000
# link_pause.pause_port_group.rx_enable = true
# link_pause.pause_port_group.tx_enable = true
# Specify cable length in mts
# link_pause.pause_port_group.cable_length = 10

# Explicit Congestion Notification
# to configure ECN and RED on a group of ports:
# -- add or replace port group names in the port group list
# -- assign cos value(s) to the cos list
# -- for each port group in the list
#    -- populate the port set, e.g.
#       swp1-swp4,swp8,swp50s0-swp50s3
# -- to enable RED requires the latest traffic.conf
#Default ECN configuration on TC0
default_ecn_red_conf.egress_queue_list = [0]
default_ecn_red_conf.ecn_enable = true
default_ecn_red_conf.red_enable = false
default_ecn_red_conf.min_threshold_bytes = 150000
default_ecn_red_conf.max_threshold_bytes = 1500000
default_ecn_red_conf.probability = 100

#ecn_red.port_group_list = [ecn_red_port_group]
#ecn_red.ecn_red_port_group.egress_queue_list = [1]
#ecn_red.ecn_red_port_group.port_set = allports
#ecn_red.ecn_red_port_group.ecn_enable = true
#ecn_red.ecn_red_port_group.red_enable = false
#ecn_red.ecn_red_port_group.min_threshold_bytes = 40000
#ecn_red.ecn_red_port_group.max_threshold_bytes = 200000
#ecn_red.ecn_red_port_group.probability = 100

# Hierarchical traffic shaping
# to configure shaping at 2 levels:
#     - per egress queue egr_queue_0 - egr_queue_7
#     - port level aggregate
# -- add or replace a port group names in the port group list
# -- for each port group in the list
#    -- populate the port set, e.g.
#       swp1-swp4,swp8,swp50s0-swp50s3
#    -- set min and max rates in kbps for each egr_queue [min, max]
#    -- set max rate in kbps at port level
# shaping.port_group_list = [shaper_port_group]
# shaping.shaper_port_group.port_set = swp1-swp3,swp5,swp7s0-swp7s3
# shaping.shaper_port_group.egr_queue_0.shaper = [50000, 100000]
# shaping.shaper_port_group.egr_queue_1.shaper = [51000, 150000]
# shaping.shaper_port_group.egr_queue_2.shaper = [52000, 200000]
# shaping.shaper_port_group.egr_queue_3.shaper = [53000, 250000]
# shaping.shaper_port_group.egr_queue_4.shaper = [54000, 300000]
# shaping.shaper_port_group.egr_queue_5.shaper = [55000, 350000]
# shaping.shaper_port_group.egr_queue_6.shaper = [56000, 400000]
# shaping.shaper_port_group.egr_queue_7.shaper = [57000, 450000]
# shaping.shaper_port_group.port.shaper = 900000

# default egress scheduling weight per egress queue
# To be applied to all the ports if port_group profile not configured
# If you do not specify any bw_percent of egress_queues, those egress queues
# will assume DWRR weight 0 - no egress scheduling for those queues
# '0' indicates strict priority

default_egress_sched.egr_queue_0.bw_percent = 12
default_egress_sched.egr_queue_1.bw_percent = 13
default_egress_sched.egr_queue_2.bw_percent = 12
default_egress_sched.egr_queue_3.bw_percent = 13
default_egress_sched.egr_queue_4.bw_percent = 12
default_egress_sched.egr_queue_5.bw_percent = 13
default_egress_sched.egr_queue_6.bw_percent = 12
default_egress_sched.egr_queue_7.bw_percent = 13

# port_group profile for egress scheduling weight per egress queue
# If you do not specify any bw_percent of egress_queues, those egress queues
# will assume DWRR weight 0 - no egress scheduling for those queues
# '0' indicates strict priority
#egress_sched.port_group_list = [sched_port_group1]
#egress_sched.sched_port_group1.port_set = swp2
#egress_sched.sched_port_group1.egr_queue_0.bw_percent = 10
#egress_sched.sched_port_group1.egr_queue_1.bw_percent = 20
#egress_sched.sched_port_group1.egr_queue_2.bw_percent = 30
#egress_sched.sched_port_group1.egr_queue_3.bw_percent = 10
#egress_sched.sched_port_group1.egr_queue_4.bw_percent = 10
#egress_sched.sched_port_group1.egr_queue_5.bw_percent = 10
#egress_sched.sched_port_group1.egr_queue_6.bw_percent = 10
#egress_sched.sched_port_group1.egr_queue_7.bw_percent = 0

# PFC Watchdog Configuration
# Add the port to the port_group_list where you want to enable PFC Watchdog
# It will enable PFC Watchdog on all the traffic-class corresponding to
# the lossless switch-priority configured on the port.
#pfc_watchdog.port_group_list = [pfc_wd_port_group]
#pfc_watchdog.pfc_wd_port_group.port_set = swp3

# Cut-through is disabled by default on all chips with the exception of
# Spectrum.  On Spectrum cut-through cannot be disabled.
#cut_through_enable = false
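As a rough illustration of how the egress scheduling entries in the listing above fit together, the following Python sketch parses key = value lines and totals the per-queue bw_percent weights. The helper names (parse_settings, egress_weights) are hypothetical and are not part of Cumulus Linux; the sketch only mirrors the file comments, which say that an unspecified queue assumes weight 0 and that 0 indicates strict priority.

```python
def parse_settings(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def egress_weights(settings, prefix="default_egress_sched"):
    """Return {queue: bw_percent} for egress queues 0-7 under the given prefix."""
    weights = {}
    for queue in range(8):
        key = f"{prefix}.egr_queue_{queue}.bw_percent"
        # Per the file comments, a queue with no bw_percent assumes weight 0.
        weights[queue] = int(settings.get(key, 0))
    return weights


# Default scheduling weights copied from the listing above.
SAMPLE = """\
default_egress_sched.egr_queue_0.bw_percent = 12
default_egress_sched.egr_queue_1.bw_percent = 13
default_egress_sched.egr_queue_2.bw_percent = 12
default_egress_sched.egr_queue_3.bw_percent = 13
default_egress_sched.egr_queue_4.bw_percent = 12
default_egress_sched.egr_queue_5.bw_percent = 13
default_egress_sched.egr_queue_6.bw_percent = 12
default_egress_sched.egr_queue_7.bw_percent = 13
"""

settings = parse_settings(SAMPLE)
weights = egress_weights(settings)
# Weight 0 marks a strict-priority queue according to the file comments.
strict = [queue for queue, weight in weights.items() if weight == 0]
print("DWRR total:", sum(weights.values()))
print("strict-priority queues:", strict)
```

To inspect a port group profile instead of the defaults, pass its prefix, for example egress_weights(settings, prefix="egress_sched.sched_port_group1").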
qos_infra.conf
#
# Default qos-infra configuration for Mellanox Spectrum chip
#
# Copyright © 2021 NVIDIA CORPORATION & AFFILIATES. ALL RIGHTS RESERVED.
#
# This software product is a proprietary product of Nvidia Corporation and its affiliates
# (the "Company") and all right, title, and interest in and to the software
# product, including all associated intellectual property rights, are and
# shall remain exclusively with the Company.
#
# This software product is governed by the End User License Agreement
# provided with the software product. 

# scheduling algorithm: algorithm values = {dwrr}
scheduling.algorithm = dwrr

# priority groups
# supported group names are control, bulk, service1-6
traffic.priority_group_list = [bulk]

# internal cos values assigned to each priority group
# each cos value should be assigned exactly once
# internal cos values {0..7}
priority_group.bulk.cos_list = [0,1,2,3,4,5,6,7]

# Alias Name defined for each priority group
# Valid string between 0-255 chars
# Sample alias support for naming priority groups
#priority_group.bulk.alias = "Bulk"

# priority group ID assigned to each priority group
#priority_group.control.id = 7
#priority_group.service2.id = 2
priority_group.bulk.id = 0

# all priority groups share a service pool on Spectrum
# service pools assigned to each priority group
priority_group.bulk.service_pool = 0

# service pool assigned for lossless PGs
#flow_control.ingress_service_pool = 0

# --- ingress buffer space allocations ---
# total buffer
#  - ingress minimum buffer allocations
#  - ingress service pool buffer allocations
#  - priority group ingress headroom allocations
#  - ingress global headroom allocations
#  = total ingress shared buffer size

# ingress service pool buffer allocation: percent of total buffer
# If a service pool has no priority groups, the buffer is added
# to the shared buffer space.
ingress_service_pool.0.percent = 100

# Ingress buffer port.pool buffer : size in bytes
#port.service_pool.0.ingress_buffer.reserved = 10240
#port.service_pool.0.ingress_buffer.shared_size = 9000
#port.management.ingress_buffer.reserved = 0


# priority group minimum buffer allocation: size in bytes
# priority group shared buffer allocation: shared buffer size in bytes
# if a priority group has no packet priority values assigned to it, the buffers will not be allocated

#priority_group.bulk.ingress_buffer.reserved           = 0
#priority_group.bulk.ingress_buffer.shared_size        = 15

# ---- ingress dynamic buffering settings
# To enable ingress static pool, set the mode to 0
ingress_service_pool.0.mode = 1

# The ALPHA defines the max% of buffers (quota) available on a
# per ingress port OR ipool, Ingress PG, Egress TC, Egress port OR epool.
# ALPHA value equates to the following buffer limit calculated as:
# alpha%(alpha+1) = Max Buffer percentage

# https://community.mellanox.com/s/article/understanding-the-alpha-parameter-in-the-buffer-configuration-of-mellanox-spectrum-switches
# Each shared buffer pool can use a maximum of [total_buffer * (alpha / (alpha+1))]
# Configure quota values mapped to the following alpha values:
# Configuration value = alpha level:
# Both ALPHA_*(string representation) as well as integer values (old representation) will be supported for alpha
# 0/ALPHA_0  = alpha 0
# 1/ALPHA_1_128  = alpha 1/128
# 2/ALPHA_1_64  = alpha 1/64
# 3/ALPHA_1_32  = alpha 1/32
# 4/ALPHA_1_16  = alpha 1/16
# 5/ALPHA_1_8  = alpha 1/8
# 6/ALPHA_1_4  = alpha 1/4
# 7/ALPHA_1_2  = alpha 1/2
# 8/ALPHA_1  = alpha  1
# 9/ALPHA_2  = alpha  2
# 10/ALPHA_4 = alpha  4
# 11/ALPHA_8 = alpha  8
# 12/ALPHA_16 = alpha 16
# 13/ALPHA_32 = alpha 32
# 14/ALPHA_64 = alpha 64
# 15/ALPHA_INFINITY = alpha Infinity

# Ingress buffer per-port dynamic buffering alpha (Default: ALPHA_8)
#port.service_pool.0.ingress_buffer.dynamic_quota = ALPHA_8
#port.management.ingress_buffer.dynamic_quota = ALPHA_8


# Ingress buffer dynamic buffering alpha for lossless PGs (if any; Default: ALPHA_1)
#flow_control.ingress_buffer.dynamic_quota = ALPHA_1

# Ingress buffer per-PG dynamic buffering alpha (Default: ALPHA_8)
#priority_group.bulk.ingress_buffer.dynamic_quota = ALPHA_8

# --- egress buffer space allocations ---
# total egress buffer
#  - minimum buffer allocations
#  = total service pool buffer size
# service pool assigned for lossless PGs
#flow_control.egress_service_pool = 0

# service pool assigned for egress queues
egress_buffer.egr_queue_0.uc.service_pool = 0
egress_buffer.egr_queue_1.uc.service_pool = 0
egress_buffer.egr_queue_2.uc.service_pool = 0
egress_buffer.egr_queue_3.uc.service_pool = 0
egress_buffer.egr_queue_4.uc.service_pool = 0
egress_buffer.egr_queue_5.uc.service_pool = 0
egress_buffer.egr_queue_6.uc.service_pool = 0
egress_buffer.egr_queue_7.uc.service_pool = 0

# Service pool buffer allocation: percent of total
# buffer size.
egress_service_pool.0.percent = 100

# Egress buffer port.pool buffer : size in bytes
#port.service_pool.0.egress_buffer.uc.reserved = 10240
#port.service_pool.0.egress_buffer.uc.shared_size = 9000
#port.management.egress_buffer.reserved = 0

# Front panel port egress buffer limits enforced for each
# priority group.
# Unlimited egress buffers not supported on Spectrum.
#priority_group.bulk.unlimited_egress_buffer     = false

# if a priority group has no cos values assigned to it, the buffers will not be allocated

# Service pool mapping for MC.SP region
egress_buffer.cos_0.mc.service_pool = 0
egress_buffer.cos_1.mc.service_pool = 0
egress_buffer.cos_2.mc.service_pool = 0
egress_buffer.cos_3.mc.service_pool = 0
egress_buffer.cos_4.mc.service_pool = 0
egress_buffer.cos_5.mc.service_pool = 0
egress_buffer.cos_6.mc.service_pool = 0
egress_buffer.cos_7.mc.service_pool = 0
# Reserved and static shared buffer allocation for MC.SP region: size in bytes
#egress_buffer.cos_0.mc.reserved = 10240
#egress_buffer.cos_1.mc.reserved = 10240
#egress_buffer.cos_2.mc.reserved = 10240
#egress_buffer.cos_3.mc.reserved = 10240
#egress_buffer.cos_4.mc.reserved = 10240
#egress_buffer.cos_5.mc.reserved = 10240
#egress_buffer.cos_6.mc.reserved = 10240
#egress_buffer.cos_7.mc.reserved = 10240
#egress_buffer.cos_0.mc.shared_size = 40
#egress_buffer.cos_1.mc.shared_size = 40
#egress_buffer.cos_2.mc.shared_size = 40
#egress_buffer.cos_3.mc.shared_size = 40
#egress_buffer.cos_4.mc.shared_size = 40
#egress_buffer.cos_5.mc.shared_size = 40
#egress_buffer.cos_6.mc.shared_size = 40
#egress_buffer.cos_7.mc.shared_size = 40

# Shared buffer allocation for ePort.TC region : size in bytes.
#egress_buffer.egr_queue_0.uc.shared_size   = 40
#egress_buffer.egr_queue_1.uc.shared_size   = 40
#egress_buffer.egr_queue_2.uc.shared_size   = 40
#egress_buffer.egr_queue_3.uc.shared_size   = 40
#egress_buffer.egr_queue_4.uc.shared_size   = 40
#egress_buffer.egr_queue_5.uc.shared_size   = 40
#egress_buffer.egr_queue_6.uc.shared_size   = 40
#egress_buffer.egr_queue_7.uc.shared_size   = 40

# Minimum buffer allocation for ePort.TC region: size in bytes
#egress_buffer.egr_queue_0.uc.reserved = 1024
#egress_buffer.egr_queue_1.uc.reserved = 1024
#egress_buffer.egr_queue_2.uc.reserved = 1024
#egress_buffer.egr_queue_3.uc.reserved = 1024
#egress_buffer.egr_queue_4.uc.reserved = 1024
#egress_buffer.egr_queue_5.uc.reserved = 1024
#egress_buffer.egr_queue_6.uc.reserved = 1024
#egress_buffer.egr_queue_7.uc.reserved = 1024

# Reserved Egress buffer for TCs mapped to lossless SPs
#flow_control.egress_buffer.reserved = 0

# Egress buffer ePort.MC buffer : size in bytes
# the per-port limit on multicast packets (applies to all switch priorities)
#port.egress_buffer.mc.reserved = 10240
#port.egress_buffer.mc.shared_size = 92160

# To enable egress static pool, set the mode to 0
egress_service_pool.0.mode = 1

# Egress dynamic buffer pool configuration
# Replace the shared_size parameter with the dynamic_quota=n/ALPHA_x,
# where ‘n’ should be the configuration value for alpha.
# 		‘ALPHA_x’ should be string representation for alpha.
# Pls note : Same alpha configuration values can be used as mentioned in Ingress Dynamic Buffering section above
# Egress buffer per-port dynamic buffering quota (alpha ; Default: ALPHA_16)
#port.service_pool.0.egress_buffer.uc.dynamic_quota = ALPHA_16
#port.management.egress_buffer.dynamic_quota = ALPHA_8


# Egress buffer per-egress-queue dynamic buffering quota (alpha) for lossless egress queues (Default: ALPHA_INFINITY)
#flow_control.egress_buffer.dynamic_quota = ALPHA_1

# Egress buffer per-egress-queue dynamic buffering quota (alpha) for unicast (Default: ALPHA_8)
#egress_buffer.egr_queue_0.uc.dynamic_quota = ALPHA_8
#egress_buffer.egr_queue_1.uc.dynamic_quota = ALPHA_8
#egress_buffer.egr_queue_2.uc.dynamic_quota = ALPHA_8
#egress_buffer.egr_queue_3.uc.dynamic_quota = ALPHA_8
#egress_buffer.egr_queue_4.uc.dynamic_quota = ALPHA_8
#egress_buffer.egr_queue_5.uc.dynamic_quota = ALPHA_8
#egress_buffer.egr_queue_6.uc.dynamic_quota = ALPHA_8
#egress_buffer.egr_queue_7.uc.dynamic_quota = ALPHA_8

# Egress buffer per-egress-queue dynamic buffering quota (alpha) for multicast (Default: ALPHA_INFINITY)
#egress_buffer.egr_queue_0.mc.dynamic_quota = ALPHA_8
#egress_buffer.egr_queue_1.mc.dynamic_quota = ALPHA_8
#egress_buffer.egr_queue_2.mc.dynamic_quota = ALPHA_8
#egress_buffer.egr_queue_3.mc.dynamic_quota = ALPHA_8
#egress_buffer.egr_queue_4.mc.dynamic_quota = ALPHA_8
#egress_buffer.egr_queue_5.mc.dynamic_quota = ALPHA_8
#egress_buffer.egr_queue_6.mc.dynamic_quota = ALPHA_8
#egress_buffer.egr_queue_7.mc.dynamic_quota = ALPHA_8

# These parameters can be assigned to the virtual Multicast port as well (Default: ALPHA_1_4)
#egress_buffer.cos_0.mc.dynamic_quota = ALPHA_1_4
#egress_buffer.cos_1.mc.dynamic_quota = ALPHA_1_4
#egress_buffer.cos_2.mc.dynamic_quota = ALPHA_1_4
#egress_buffer.cos_3.mc.dynamic_quota = ALPHA_1_4
#egress_buffer.cos_4.mc.dynamic_quota = ALPHA_1_4
#egress_buffer.cos_5.mc.dynamic_quota = ALPHA_1_4
#egress_buffer.cos_6.mc.dynamic_quota = ALPHA_1_4
#egress_buffer.cos_7.mc.dynamic_quota = ALPHA_1_4

# internal cos values mapped to egress queues
# multicast queue: same as unicast queue
cos_egr_queue.cos_0.uc  = 0
cos_egr_queue.cos_0.cpu = 0

cos_egr_queue.cos_1.uc  = 1
cos_egr_queue.cos_1.cpu = 1

cos_egr_queue.cos_2.uc  = 2
cos_egr_queue.cos_2.cpu = 2

cos_egr_queue.cos_3.uc  = 3
cos_egr_queue.cos_3.cpu = 3

cos_egr_queue.cos_4.uc  = 4
cos_egr_queue.cos_4.cpu = 4

cos_egr_queue.cos_5.uc  = 5
cos_egr_queue.cos_5.cpu = 5

cos_egr_queue.cos_6.uc  = 6
cos_egr_queue.cos_6.cpu = 6

cos_egr_queue.cos_7.uc  = 7
cos_egr_queue.cos_7.cpu = 7
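The alpha table in the qos_infra.conf comments above can be turned into numbers with a short Python sketch. It applies the formula from the comments, total_buffer * (alpha / (alpha + 1)), to each configuration value. The function names are hypothetical, and treating ALPHA_INFINITY as a fraction of 1.0 (the limit of the formula) is an assumption rather than documented behavior.

```python
from fractions import Fraction


def alpha_for(config_value):
    """Map a configuration value (0-15) to alpha; None stands for ALPHA_INFINITY."""
    if not 0 <= config_value <= 15:
        raise ValueError("configuration value must be in 0..15")
    if config_value == 0:
        return Fraction(0)
    if config_value == 15:
        return None
    # Values 1..14 step through powers of two from 1/128 up to 64.
    return Fraction(2) ** (config_value - 8)


def max_buffer_fraction(config_value):
    """Return alpha / (alpha + 1), the maximum share of the shared buffer."""
    alpha = alpha_for(config_value)
    if alpha is None:
        # Assumption: infinity is treated as the limit of alpha / (alpha + 1).
        return 1.0
    return float(alpha / (alpha + 1))


for value in (6, 8, 11, 12):
    print(value, alpha_for(value), round(max_buffer_fraction(value), 4))
```

For example, ALPHA_8 (configuration value 11) gives 8/9 of the buffer, and ALPHA_1 (configuration value 8) gives one half.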

Caveats

Configure QoS and Breakout Ports Simultaneously

If you configure both breakout ports and QoS settings for breakout interfaces at the same time, errors might occur.

You must apply breakout port configuration before QoS configuration on the breakout ports. If you are using NVUE, configure the breakout ports and run nv config apply first, then configure QoS settings on the breakout ports and run nv config apply again. If you are using Linux file configuration, modify ports.conf and reload switchd, then modify qos_features.conf and reload switchd a second time.

QoS Settings on Bond Member Interfaces

If you use Linux commands to apply QoS settings on bond member interfaces instead of the logical bond interface, the members must share identical QoS configuration. If the configuration is not identical across the bond member interfaces, the bond inherits the configuration of the last interface you apply to the bond.

If QoS settings do not match, switchd reload fails; however, switchd restart does not fail.

NVUE rejects QoS configurations on bond member interfaces and shows an error when you try to apply the configurations; you must apply all QoS configuration on logical bond interfaces.

Cut-through Switching

You cannot disable cut-through switching on Spectrum ASICs. Cumulus Linux ignores the cut_through_enable = false setting in the qos_features.conf file.

RDMA over Converged Ethernet - RoCE

RoCE enables you to write to compute or storage elements using RDMA over an Ethernet network instead of using host CPUs. RoCE relies on ECN and PFC to operate. Cumulus Linux supports features that can enable lossless Ethernet for RoCE environments.

While Cumulus Linux can support RoCE environments, the end hosts must support the RoCE protocol.

RoCE helps you obtain a converged network, where all services, including InfiniBand applications, run over the Ethernet infrastructure.

Default RoCE Configuration

The following table shows the default RoCE configuration for lossy and lossless mode.

Configuration                                                             Lossy Mode  Lossless Mode
------------------------------------------------------------------------  ----------  -------------
Port trust mode                                                           YES         YES
Port switch priority to traffic class mapping:                            YES         YES
  • Switch priority 3 to traffic class 3 (RoCE)
  • Switch priority 6 to traffic class 6 (CNP)
  • Other switch priority to traffic class 0
Port ETS:                                                                 YES         YES
  • Traffic class 6 (CNP) - Strict
  • Traffic class 3 (RoCE) - WRR 50%
  • Traffic class 0 (Other traffic) - WRR 50%
Port ECN absolute threshold is 1501500 bytes for traffic class 3 (RoCE)   YES         YES
LLDP and Application TLV (RoCE) (UDP, Protocol: 4791, Priority: 3)        YES         YES
Enable PFC on switch priority 3 (RoCE)                                    NO          YES
Switch priority 3 allocated to RoCE lossless traffic pool                 NO          YES

RoCE lossless (with PFC and ECN)

RoCE uses the InfiniBand (IB) protocol over converged Ethernet. The IB global route header rides directly on top of the Ethernet header. The lossless Ethernet layer handles congestion hop by hop.

To enable RoCE lossless:

cumulus@switch:~$ nv set qos roce
cumulus@switch:~$ nv config apply

NVUE defaults to RoCE lossless. The commands nv set qos roce and nv set qos roce mode lossless are equivalent.

If you have already enabled RoCE lossy mode, running nv set qos roce without a mode does not change the RoCE mode. To change to lossless, you must configure lossless mode with the nv set qos roce mode lossless command.

If you are using Linux file configuration, edit the /etc/cumulus/switchd.d/qos.conf file to set the traffic.roce_mode parameter to 3, then reload switchd.

cumulus@switch:~$ sudo cat /etc/cumulus/switchd.d/qos.conf
...
traffic.roce_mode = 3
...
cumulus@switch:~$ sudo systemctl reload switchd.service

Link pause is another way to provide lossless Ethernet; however, PFC is the preferred method. PFC allows more granular control by pausing the traffic flow for a given CoS group instead of the entire link.

RoCE lossy (with ECN)

RoCEv2 requires flow control for lossless Ethernet. RoCEv2 uses the InfiniBand (IB) transport protocol over UDP. The IB transport protocol includes an end-to-end reliable delivery mechanism and has its own sender notification mechanism.

RoCEv2 congestion management uses ECN (RFC 3168) to signal congestion experienced to the receiver. The receiver generates a RoCEv2 congestion notification packet (CNP) directed to the source of the packet.

To enable RoCE lossy:

cumulus@switch:~$ nv set qos roce mode lossy
cumulus@switch:~$ nv config apply

If you are using Linux file configuration, edit the /etc/cumulus/switchd.d/qos.conf file to set the traffic.roce_mode parameter to 1, then reload switchd.

cumulus@switch:~$ sudo cat /etc/cumulus/switchd.d/qos.conf
...
traffic.roce_mode = 1
...
cumulus@switch:~$ sudo systemctl reload switchd.service

Single Shared Buffer Pool

By default, Cumulus Linux separates lossy and lossless traffic into different dedicated buffer pools on both ingress and egress. You can configure the switch to combine lossy and lossless traffic on the same buffer pool for better load absorption.

To enable single shared buffer pool mode:

cumulus@switch:~$ nv set qos roce mode lossless-single-ipool
cumulus@switch:~$ nv config apply

To disable single shared buffer pool mode and use the default mode (lossless), run the nv unset qos roce mode lossless-single-ipool command.

If you are using Linux file configuration, edit the /etc/cumulus/switchd.d/qos.conf file to set the traffic.roce_mode parameter to 4, then reload switchd.

cumulus@switch:~$ sudo cat /etc/cumulus/switchd.d/qos.conf
...
traffic.roce_mode = 4
...
cumulus@switch:~$ sudo systemctl reload switchd.service

To disable single shared buffer pool mode and use the default mode (lossless), set the traffic.roce_mode parameter to 3.

Remove RoCE Configuration

To remove RoCE configuration:

cumulus@switch:~$ nv unset qos roce
cumulus@switch:~$ nv config apply

If you are using Linux file configuration, edit the /etc/cumulus/switchd.d/qos.conf file to set the traffic.roce_mode parameter to 0, then reload switchd.

cumulus@switch:~$ sudo cat /etc/cumulus/switchd.d/qos.conf
...
traffic.roce_mode = 0
...
cumulus@switch:~$ sudo systemctl reload switchd.service
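The traffic.roce_mode values used in the sections above (0, 1, 3, and 4) can be collected in one place. The following Python sketch is a hypothetical helper, not a Cumulus Linux tool: it rewrites the traffic.roce_mode line in the text of a qos.conf file, or appends the line if it is missing. The label "disabled" for value 0 is an illustrative name chosen here; the other names match the NVUE mode names. You still need to reload switchd after changing the file.

```python
import re

# Mode values taken from the Linux file configuration steps above.
# "disabled" is an illustrative label for value 0 (RoCE configuration removed).
ROCE_MODES = {
    "disabled": 0,
    "lossy": 1,
    "lossless": 3,
    "lossless-single-ipool": 4,
}

# Matches an uncommented traffic.roce_mode assignment on a single line.
_ROCE_LINE = re.compile(r"^[ \t]*traffic\.roce_mode[ \t]*=.*$", re.MULTILINE)


def set_roce_mode(conf_text, mode):
    """Return conf_text with traffic.roce_mode set to the value for mode."""
    new_line = f"traffic.roce_mode = {ROCE_MODES[mode]}"
    if _ROCE_LINE.search(conf_text):
        return _ROCE_LINE.sub(new_line, conf_text, count=1)
    # No existing assignment: append one, keeping the file newline-terminated.
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + new_line + "\n"


example = "traffic.roce_mode = 0\n"
print(set_roce_mode(example, "lossless"), end="")
```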

Verify RoCE Configuration

You can verify RoCE configuration with NVUE nv show commands.

To show detailed information about the configured buffers, utilization and DSCP markings, run the nv show qos roce command:

cumulus@switch:mgmt:~$ nv show qos roce
                    operational            applied              
------------------  ---------------------  ---------------------
enable                                     on                   
mode                lossless-single-ipool  lossless-single-ipool
pfc                                                             
  pfc-priority      3                                           
  rx-enabled        enabled                                     
  tx-enabled        enabled                                     
  cable-length      100                                         
congestion-control                                              
  congestion-mode   ECN                                         
  enabled-tc        0,3                                         
  min-threshold     146.48 KB                                   
  max-threshold     1.43 MB                                     
  probability       100                                         
trust                                                           
  trust-mode        pcp,dscp                                    
lldp-app-tlv                                                    
  priority          -1                                          
  protocol-id       -1                                          
  selector          Non-UDP       

RoCE PCP/DSCP->SP mapping configurations
===========================================
       pcp  dscp                     switch-prio
    -  ---  -----------------------  -----------
    0  0    0,1,2,3,4,5,6,7          0          
    1  1    8,9,10,11,12,13,14,15    1          
    2  2    16,17,18,19,20,21,22,23  2          
    3  3    24,25,26,27,28,29,30,31  3          
    4  4    32,33,34,35,36,37,38,39  4          
    5  5    40,41,42,43,44,45,46,47  5          
    6  6    48,49,50,51,52,53,54,55  6          
    7  7    56,57,58,59,60,61,62,63  7          

RoCE SP->TC mapping and ETS configurations
=============================================
       switch-prio  traffic-class  scheduler-weight
    -  -----------  -------------  ----------------
    0  0            0              DWRR-50%        
    1  1            0              DWRR-50%        
    2  2            0              DWRR-50%        
    3  3            3              DWRR-50%        
    4  4            0              DWRR-50%        
    5  5            0              DWRR-50%        
    6  6            6              strict-priority 
    7  7            0              DWRR-50%        

RoCE pool config
===================
       name                   mode     size  switch-priorities  traffic-class
    -  ---------------------  -------  ----  -----------------  -------------
    0  lossy-default-ingress  Dynamic  50%   0,1,2,4,5,6,7      -            
    1  roce-reserved-ingress  Dynamic  50%   3                  -            
    2  lossy-default-egress   Dynamic  50%   -                  0,6          
    3  roce-reserved-egress   Dynamic  inf   -                  3            

Exception List
=================
No Data
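The two mapping tables in the output above follow a simple pattern: each block of eight DSCP values maps to one switch priority, and only switch priorities 3 (RoCE) and 6 (CNP) keep their own traffic class. The following Python sketch reproduces the tables as shown; the function names are hypothetical, and the sketch describes this example output only, not any custom mapping you configure.

```python
def dscp_to_switch_priority(dscp):
    """Map a DSCP value (0-63) to a switch priority as in the table above."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in 0..63")
    # Each group of eight consecutive DSCP values shares one switch priority.
    return dscp // 8


def switch_priority_to_traffic_class(switch_priority):
    """Map a switch priority (0-7) to a traffic class as in the table above."""
    if not 0 <= switch_priority <= 7:
        raise ValueError("switch priority must be in 0..7")
    # RoCE (3) and CNP (6) keep dedicated traffic classes; the rest use class 0.
    return switch_priority if switch_priority in (3, 6) else 0


for dscp in (10, 26, 48):
    sp = dscp_to_switch_priority(dscp)
    print(dscp, sp, switch_priority_to_traffic_class(sp))
```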

To show detailed RoCE information about a single interface, run the nv show interface <interface> qos roce status command.

cumulus@switch:mgmt:~$ nv show interface swp16 qos roce status
                    operational    applied  description
------------------  -------------  -------  ---------------------------------------------------
congestion-control
  congestion-mode   ecn, absolute           Congestion config mode
  enabled-tc        0,3                     Congestion config enabled Traffic Class
  max-threshold     1.43 MB                 Congestion config max-threshold
  min-threshold     153.00 KB               Congestion config min-threshold
  probability       100                  
lldp-app-tlv                             
  priority          3                    
  protocol-id       4791                 
  selector          UDP
pfc
  pfc-priority      3                       switch-prio on which PFC is enabled
  rx-enabled        yes                     PFC Rx Enabled status
  tx-enabled        yes                     PFC Tx Enabled status
trust
  trust-mode        pcp,dscp                Trust Setting on the port for packet classification
mode                lossless                Roce Mode
 
 
RoCE PCP/DSCP->SP mapping configurations
===========================================
          pcp  dscp  switch-prio
    ----  ---  ----  -----------
    cnp   6    48    6
    roce  3    26    3
 
 
RoCE SP->TC mapping and ETS configurations
=============================================
          switch-prio  traffic-class  scheduler-weight
    ----  -----------  -------------  ----------------
    cnp   6            6              strict priority
    roce  3            3              dwrr-50%
 
 
RoCE Pool Status
===================
        name                   mode     pool-id  switch-priorities  traffic-class  size      current-usage  max-usage
    --  ---------------------  -------  -------  -----------------  -------------  --------  -------------  ---------
    0   lossy-default-ingress  DYNAMIC  2        0,1,2,4,5,6,7      -              15.16 MB  0 Bytes        16.00 MB
    1   roce-reserved-ingress  DYNAMIC  3        3                  -              15.16 MB  7.30 MB        7.90 MB
    2   lossy-default-egress   DYNAMIC  13       -                  0,6            15.16 MB  0 Bytes        16.01 MB
    3   roce-reserved-egress   DYNAMIC  14       -                  3              inf       7.29 MB        13.47 MB
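The nv show output reports thresholds and buffer usage in KB and MB, while the configuration files take bytes. The following Python sketch converts between the two under the assumption that the display uses 1024-based units with two decimal places. That assumption is inferred from the values in this guide (for example, 150000 bytes appears to correspond to 146.48 KB, and 1500000 bytes to 1.43 MB) and is not confirmed by the documentation. The format_bytes name is hypothetical.

```python
def format_bytes(num_bytes):
    """Format a byte count the way the example output appears to (assumed 1024 base)."""
    if num_bytes >= 1024 ** 2:
        return f"{num_bytes / 1024 ** 2:.2f} MB"
    if num_bytes >= 1024:
        return f"{num_bytes / 1024:.2f} KB"
    return f"{num_bytes} Bytes"


# Byte values from the default ECN configuration and the RoCE defaults table.
for value in (150000, 156672, 1500000, 1501500):
    print(value, "->", format_bytes(value))
```

Under the same assumption, the 153.00 KB minimum threshold in the interface output corresponds to 156672 bytes.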

To show detailed information about current buffer utilization as well as historic RoCE byte and packet counts, run the nv show interface <interface> qos roce counters command:

cumulus@switch:mgmt:~$ nv show interface swp16 qos roce counters
                               operational   applied  description
-----------------------------  ------------  -------  ------------------------------------------------------
rx-stats
  rx-non-roce-stats
    buffer-max-usage           144 Bytes              Max Ingress Pool-buffer usage for non-RoCE traffic
    buffer-usage               0 Bytes                Current Ingress Pool-buffer usage for non-RoCE traffic
    no-buffer-discard          55                     Rx buffer discards for non-RoCE traffic
    non-roce-bytes             56.52 MB               non-roce rx bytes
    non-roce-packets           462975                 non-roce rx packets
    pg-max-usage               144 Bytes              Max PG-buffer usage for non-RoCE traffic
    pg-usage                   0 Bytes                Current PG-buffer usage for non-RoCE traffic
  rx-pfc-stats
    pause-duration             0                      Rx PFC pause duration for RoCE traffic
    pause-packets              0                      Rx PFC pause packets for RoCE traffic
  rx-roce-stats
    buffer-max-usage           0 Bytes                Max Ingress Pool-buffer usage for RoCE traffic
    buffer-usage               0 Bytes                Current Ingress Pool-buffer usage for RoCE traffic
    no-buffer-discard          0                      Rx buffer discards for RoCE traffic
    pg-max-usage               0 Bytes                Max PG-buffer usage for RoCE traffic
    pg-usage                   0 Bytes                Current PG-buffer usage for RoCE traffic
    roce-bytes                 0 Bytes                Rx RoCE Bytes
    roce-packets               0                      Rx RoCE Packets
tx-stats
  tx-cnp-stats
    buffer-max-usage           16.02 MB               Max Egress Pool-buffer usage for CNP traffic
    buffer-usage               0 Bytes                Current Egress Pool-buffer usage for CNP traffic
    cnp-bytes                  0 Bytes                Tx CNP Packet Bytes
    cnp-packets                0                      Tx CNP Packets
    tc-max-usage               0 Bytes                Max TC-buffer usage for CNP traffic
    tc-usage                   0 Bytes                Current TC-buffer usage for CNP traffic
    unicast-no-buffer-discard  0                      Tx buffer discards for CNP traffic
  tx-ecn-stats
    ecn-marked-packets         693777677344           Tx ECN marked packets
  tx-pfc-stats
    pause-duration             0                      Tx PFC pause duration for RoCE traffic
    pause-packets              0                      Tx PFC pause packets for RoCE traffic
  tx-roce-stats
    buffer-max-usage           13.47 MB               Max Egress Pool-buffer usage for RoCE traffic
    buffer-usage               7.29 MB                Current Egress Pool-buffer usage for RoCE traffic
    roce-bytes                 92824.38 GB            Tx RoCE Packet bytes
    roce-packets               803785675319           Tx RoCE Packets
    tc-max-usage               16.02 MB               Max TC-buffer usage for RoCE traffic
    tc-usage                   7.29 MB                Current TC-buffer usage for RoCE traffic
    unicast-no-buffer-discard  663060754115           Tx buffer discards for RoCE traffic

To reset the counters in the nv show interface <interface> qos roce command output, run the nv action clear interface <interface> qos roce counters command.

Change RoCE Configuration

You can adjust RoCE settings using NVUE after you enable RoCE. To change the memory allocation for RoCE lossless mode to 60 percent:

cumulus@switch:mgmt:~$ nv set qos traffic-pool default-lossy memory-percent 40
cumulus@switch:mgmt:~$ nv set qos traffic-pool roce-lossless memory-percent 60
cumulus@switch:mgmt:~$ nv config apply

To change the memory allocation of the RoCE lossy traffic pool to 60 percent and remap switch priority 4 to RoCE lossy traffic:

cumulus@switch:mgmt:~$ nv set qos traffic-pool default-lossy switch-priority 0-3,5-7
cumulus@switch:mgmt:~$ nv set qos traffic-pool roce-lossy memory-percent 60
cumulus@switch:mgmt:~$ nv set qos traffic-pool default-lossy memory-percent 40
cumulus@switch:mgmt:~$ nv set qos traffic-pool roce-lossy switch-priority 4
cumulus@switch:mgmt:~$ nv set qos egress-queue-mapping default-global switch-priority 4 traffic-class 3
cumulus@switch:mgmt:~$ nv set qos egress-queue-mapping default-global switch-priority 3 traffic-class 0
cumulus@switch:mgmt:~$ nv set qos mapping default-global trust both
cumulus@switch:mgmt:~$ nv set qos mapping default-global dscp 26 switch-priority 4
cumulus@switch:mgmt:~$ nv config apply

To change the RoCE lossless switch priority from switch priority 3 to switch priority 2:

cumulus@switch:mgmt:~$ nv set qos pfc default-global switch-priority 2
cumulus@switch:mgmt:~$ nv set qos egress-queue-mapping default-global switch-priority 2 traffic-class 3
cumulus@switch:mgmt:~$ nv set qos egress-queue-mapping default-global switch-priority 3 traffic-class 0
cumulus@switch:mgmt:~$ nv set qos mapping default-global trust both
cumulus@switch:mgmt:~$ nv set qos mapping default-global dscp 26 switch-priority 2
cumulus@switch:mgmt:~$ nv config apply

DHCP

This section describes how to configure DHCP relays and DHCP servers.

DHCP Relays

DHCP is a client-server protocol that automatically provides IP hosts with IP addresses and other related configuration information. A DHCP relay (agent) is a host that forwards DHCP packets between clients and servers that are not on the same physical subnet.

This topic describes how to configure DHCP relays for IPv4 and IPv6 using the following topology:

Basic Configuration

To set up DHCP relay, you need to provide the IP address of the DHCP server and the interfaces participating in DHCP relay (facing the server and facing the client). In an MLAG configuration, you must also specify the peerlink interface in case the local uplink interfaces fail.

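In the example commands below, the DHCP server has the IPv4 address 172.16.1.102 and the IPv6 address 2001:db8:100::2, vlan10 is the SVI that faces the clients, swp51 and swp52 are the uplinks that face the server, and peerlink.4094 is the MLAG peerlink interface: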

cumulus@leaf01:~$ nv set service dhcp-relay default interface swp51
cumulus@leaf01:~$ nv set service dhcp-relay default interface swp52
cumulus@leaf01:~$ nv set service dhcp-relay default interface vlan10
cumulus@leaf01:~$ nv set service dhcp-relay default interface peerlink.4094
cumulus@leaf01:~$ nv set service dhcp-relay default server 172.16.1.102
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ nv set service dhcp-relay6 default interface upstream swp51 server-address 2001:db8:100::2
cumulus@leaf01:~$ nv set service dhcp-relay6 default interface upstream swp52 server-address 2001:db8:100::2
cumulus@leaf01:~$ nv set service dhcp-relay6 default interface downstream vlan10
cumulus@leaf01:~$ nv set service dhcp-relay6 default interface downstream peerlink.4094
cumulus@leaf01:~$ nv config apply
  1. Edit the /etc/default/isc-dhcp-relay-default file to add the IP address of the DHCP server and the interfaces participating in DHCP relay.

    cumulus@leaf01:~$ sudo nano /etc/default/isc-dhcp-relay-default
    SERVERS="172.16.1.102"
    INTF_CMD="-i vlan10 -i swp51 -i swp52 -i peerlink.4094"
    OPTIONS=""
    
  2. Enable, then restart the dhcrelay service so that the configuration persists between reboots:

    cumulus@leaf01:~$ sudo systemctl enable dhcrelay@default.service
    cumulus@leaf01:~$ sudo systemctl restart dhcrelay@default.service
    
  1. Edit the /etc/default/isc-dhcp-relay6-default file to add the IP address of the DHCP server and the interfaces participating in DHCP relay.

    cumulus@leaf01:~$ sudo nano /etc/default/isc-dhcp-relay6-default
    SERVERS=" -u 2001:db8:100::2%swp51 -u 2001:db8:100::2%swp52"
    INTF_CMD="-l vlan10 -l peerlink.4094"
    
  2. Enable, then restart the dhcrelay6 service so that the configuration persists between reboots:

    cumulus@leaf01:~$ sudo systemctl enable dhcrelay6@default.service
    cumulus@leaf01:~$ sudo systemctl restart dhcrelay6@default.service
    

  • You configure a DHCP relay on a per-VLAN basis, specifying the SVI, not the parent bridge. In the example above, you specify vlan10 as the SVI for VLAN 10 but you do not specify the bridge named bridge.
  • When you configure DHCP relay with VRR, the DHCP relay client must run on the SVI, not on the -v0 interface.
  • For every instance of a DHCP relay in a non-default VRF, you need to create a separate default file in the /etc/default directory. See DHCP with VRF.
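
The IPv4 default file above has a fixed shape: one SERVERS line, one INTF_CMD line with a -i flag per interface, and one OPTIONS line. As an illustration only, the following Python sketch builds those three lines from a server list and an interface list. The function name render_relay_defaults is hypothetical; the variable names and the -i flag come from the example file above.

```python
def render_relay_defaults(servers, interfaces, options=""):
    """Build the SERVERS, INTF_CMD, and OPTIONS lines for an IPv4 relay file."""
    if not servers or not interfaces:
        raise ValueError("at least one server and one interface are required")
    server_list = " ".join(servers)
    # Each participating interface needs its own -i flag.
    intf_cmd = " ".join(f"-i {name}" for name in interfaces)
    return "\n".join([
        f'SERVERS="{server_list}"',
        f'INTF_CMD="{intf_cmd}"',
        f'OPTIONS="{options}"',
    ])


# Values taken from the basic configuration example above.
print(render_relay_defaults(
    ["172.16.1.102"],
    ["vlan10", "swp51", "swp52", "peerlink.4094"],
))
```

The sketch only produces text; you still write the result to the /etc/default/isc-dhcp-relay-default file and restart the dhcrelay service yourself.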

Optional Configuration

This section describes optional DHCP relay configurations. The steps provided in this section assume that you have already configured basic DHCP relay, as described above.

DHCP Agent Information Option (Option 82)

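Cumulus Linux supports DHCP Agent Information Option 82, which allows a DHCP relay to insert circuit-specific or relay-specific information into a request that the switch forwards to a DHCP server. You can set the circuit ID sub-option, which identifies the SVI or physical switch port on which the request arrives, and the remote ID sub-option, which identifies the relay agent.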

To configure DHCP Agent Information Option 82:

The following example enables Option 82 and enables circuit ID:

cumulus@leaf01:~$ nv set service dhcp-relay default agent enable on
cumulus@leaf01:~$ nv set service dhcp-relay default agent use-pif-circuit-id enable on
cumulus@leaf01:~$ nv config apply

The following example enables Option 82 and sets the remote ID to MAC address 44:38:39:BE:EF:AA:

cumulus@leaf01:~$ nv set service dhcp-relay default agent enable on
cumulus@leaf01:~$ nv set service dhcp-relay default agent remote-id 44:38:39:BE:EF:AA
cumulus@leaf01:~$ nv config apply
  1. Edit the /etc/default/isc-dhcp-relay-default file and add one of the following options:

    To inject the ingress SVI interface against which DHCP processes the relayed DHCP discover packet, add -a to the OPTIONS line:

    cumulus@leaf01:~$ sudo nano /etc/default/isc-dhcp-relay-default
    ...
    # Additional options that are passed to the DHCP relay daemon?
    OPTIONS="-a"
    

    To inject the physical switch port on which the relayed DHCP discover packet arrives instead of the SVI, add -a --use-pif-circuit-id to the OPTIONS line:

    cumulus@leaf01:~$ sudo nano /etc/default/isc-dhcp-relay-default
    ...
    # Additional options that are passed to the DHCP relay daemon?
    OPTIONS="-a --use-pif-circuit-id"
    

    To customize the Remote ID sub-option, add -a -r to the OPTIONS line followed by a custom string (up to 255 characters):

    cumulus@leaf01:~$ sudo nano /etc/default/isc-dhcp-relay-default
    ...
    # Additional options that are passed to the DHCP relay daemon?
    OPTIONS="-a -r CUSTOMVALUE"
    
  2. Restart the dhcrelay service to apply the new configuration:

    cumulus@leaf01:~$ sudo systemctl restart dhcrelay@default.service
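
The three OPTIONS values above differ only in which flags follow -a. The following Python sketch, offered as an illustration rather than a supported tool, assembles the OPTIONS value from the choices described in the Linux steps. The function name option82_options is hypothetical, and the rule that the custom string contains no whitespace is an assumption made here because the value is not quoted in the file.

```python
def option82_options(use_pif_circuit_id=False, remote_id=None):
    """Return the OPTIONS value that enables Option 82."""
    parts = ["-a"]  # -a enables the agent information option
    if use_pif_circuit_id:
        # Report the physical switch port instead of the SVI.
        parts.append("--use-pif-circuit-id")
    if remote_id is not None:
        if not remote_id or len(remote_id) > 255:
            raise ValueError("the remote ID must be 1 to 255 characters long")
        if any(ch.isspace() for ch in remote_id):
            # Assumption: keep the value a single shell word.
            raise ValueError("the remote ID must not contain whitespace")
        parts += ["-r", remote_id]
    return " ".join(parts)


print(f'OPTIONS="{option82_options(remote_id="CUSTOMVALUE")}"')
```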
    

Control the Gateway IP Address with RFC 3527

When you need DHCP relay in an environment that relies on an anycast gateway (such as EVPN), a unique IP address is necessary on each device for return traffic. By default, in a BGP unnumbered environment with DHCP relay, the source IP address is the loopback IP address and the gateway IP address is the SVI IP address. However, with an anycast gateway, the SVI IP address is not unique to each rack; it is typically shared between racks. Most EVPN ToR deployments only use a single unique IP address, which is the loopback IP address.

RFC 3527 enables the DHCP server to react to these environments by introducing the link selection sub-option of the relay agent information option, which the DHCP relay agent builds. The link selection sub-option takes over the normal role of the gateway address: it tells the DHCP server which subnet correlates to the DHCP request. When you use this sub-option, the gateway address is still present but only carries the return IP address that the DHCP server uses to reach the relay; the gateway address becomes the unique loopback IP address.

When enabling RFC 3527 support, you can specify an interface, such as the loopback interface or a switch port interface, to use as the gateway address. The relay picks the first IP address on that interface. If the interface has multiple IP addresses, you can specify which IP address to use.

RFC 3527 supports IPv4 DHCP relays only.
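
To make the split between the gateway address and the link selection sub-option concrete, the following Python sketch encodes the sub-option as RFC 3527 defines it (sub-option code 5, carried inside relay agent information option 82). It is a simplified illustration and not the dhcrelay implementation. The function names are hypothetical, and the addresses 10.10.10.1 (unique loopback) and 10.1.10.0 (client subnet) are example values.

```python
import ipaddress

RELAY_AGENT_INFORMATION = 82  # option code from RFC 3046
LINK_SELECTION = 5            # sub-option code from RFC 3527


def link_selection_option(client_subnet):
    """Encode option 82 with a single link selection sub-option."""
    address = ipaddress.IPv4Address(client_subnet).packed
    sub_option = bytes([LINK_SELECTION, len(address)]) + address
    return bytes([RELAY_AGENT_INFORMATION, len(sub_option)]) + sub_option


def relay_fields(gateway_ip, client_subnet):
    """Show which value goes where when RFC 3527 support is on."""
    return {
        # The server sends its reply to this unique address.
        "giaddr": str(ipaddress.IPv4Address(gateway_ip)),
        # The server picks the address pool from this subnet.
        "option82": link_selection_option(client_subnet),
    }


fields = relay_fields("10.10.10.1", "10.1.10.0")
print(fields["giaddr"], fields["option82"].hex())
```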

To enable RFC 3527 support and control the gateway address:

Run the nv set service dhcp-relay default gateway-interface command with the interface or IP address you want to use. The following example uses the first IP address on the loopback interface as the gateway IP address:

cumulus@leaf01:~$ nv set service dhcp-relay default gateway-interface lo

The first IP address on the loopback interface is typically the 127.0.0.1 address. This example uses IP address 10.10.10.1 on the loopback interface as the gateway address:

cumulus@leaf01:~$ nv set service dhcp-relay default gateway-interface lo address 10.10.10.1

This example uses the first IP address on swp2 as the gateway address:

cumulus@leaf01:~$ nv set service dhcp-relay default gateway-interface swp2

This example uses IP address 10.0.0.4 on swp2 as the gateway address:

cumulus@leaf01:~$ nv set service dhcp-relay default gateway-interface swp2 address 10.0.0.4
  1. Edit the /etc/default/isc-dhcp-relay-default file and provide the -U option with the interface or IP address you want to use as the gateway address.

    This example uses the first IP address on the loopback interface as the gateway address:

    cumulus@leaf01:~$ sudo nano /etc/default/isc-dhcp-relay-default
    ...
    # Additional options that are passed to the DHCP relay daemon?
    OPTIONS="-U lo"
    

    The first IP address on the loopback interface is typically the 127.0.0.1 address. This example uses IP address 10.10.10.1 on the loopback interface as the gateway address:

    cumulus@leaf01:~$ sudo nano /etc/default/isc-dhcp-relay-default
    ...
    # Additional options that are passed to the DHCP relay daemon?
    OPTIONS="-U 10.10.10.1%lo"
    

    This example uses the first IP address on swp2 as the gateway address:

    cumulus@leaf01:~$ sudo nano /etc/default/isc-dhcp-relay-default
    ...
    # Additional options that are passed to the DHCP relay daemon?
    OPTIONS="-U swp2"
    

    This example uses IP address 10.0.0.4 on swp2 as the gateway address:

    cumulus@leaf01:~$ sudo nano /etc/default/isc-dhcp-relay-default
    ...
    # Additional options that are passed to the DHCP relay daemon?
    OPTIONS="-U 10.0.0.4%swp2"
    
  2. Restart the dhcrelay service to apply the configuration change:

    cumulus@leaf01:~$ sudo systemctl restart dhcrelay@default.service
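
The four -U examples above follow two patterns: the interface name alone, or the IP address and the interface name joined by a percent sign with no spaces. The following Python sketch, provided as an illustration with a hypothetical function name, builds the value and rejects input that would break the pattern.

```python
import ipaddress


def gateway_option(interface, address=None):
    """Return the -U option for the given gateway interface."""
    if not interface or any(ch.isspace() for ch in interface):
        raise ValueError("the interface name must be a single word")
    if address is None:
        # The relay uses the first IP address on the interface.
        return f"-U {interface}"
    # IPv4Address raises ValueError for malformed or IPv6 input.
    ip = ipaddress.IPv4Address(address)
    return f"-U {ip}%{interface}"


print(f'OPTIONS="{gateway_option("lo", "10.10.10.1")}"')
```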
    

DHCP Relay for IPv4 in an EVPN Symmetric Environment with MLAG

In a multi-tenant EVPN symmetric routing environment with MLAG, you must enable RFC 3527 support. You can specify an interface, such as the loopback or VRF interface for the gateway address. The interface must be reachable in the tenant VRF that you configure for DHCP relay and must have a unique IPv4 address. For EVPN symmetric routing with an anycast gateway that reuses the same SVI IP address on multiple leaf switches, you must assign a unique IP address for the VRF interface and include the layer 3 VNI for this VRF in the DHCP relay configuration.

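The following example configures VRF RED with the unique IPv4 address 20.20.20.1/32, adds the SVIs vlan10 and vlan20 and the layer 3 VNI interface vlan4024_l3 for VRF RED to the DHCP relay configuration, sets the DHCP server IP address to 10.1.10.104, and advertises the connected routes as type-5 routes so that the VRF RED loopback IPv4 address is reachable: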

cumulus@leaf01:~$ nv set vrf RED loopback ip address 20.20.20.1/32
cumulus@leaf01:~$ nv set service dhcp-relay RED interface vlan10
cumulus@leaf01:~$ nv set service dhcp-relay RED interface vlan20
cumulus@leaf01:~$ nv set service dhcp-relay RED interface vlan4024_l3
cumulus@leaf01:~$ nv set service dhcp-relay RED gateway-interface RED
cumulus@leaf01:~$ nv set service dhcp-relay RED server 10.1.10.104
cumulus@leaf01:~$ nv set vrf RED router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf01:~$ nv set vrf RED router bgp address-family ipv4-unicast route-export to-evpn enable on
cumulus@leaf01:~$ nv config apply
  1. Edit the /etc/network/interfaces file to configure VRF RED with IPv4 address 20.20.20.1/32:

    cumulus@leaf01:mgmt:~$ sudo nano /etc/network/interfaces
    ...
    auto RED
    iface RED
            address 20.20.20.1/32
            vrf-table auto
    
  2. Configure VRF RED to advertise the connected routes as type-5 so that the loopback IPv4 address is reachable:

    cumulus@leaf01:mgmt:~$ sudo vtysh 
    ...
    leaf01# configure terminal
    leaf01(config)# router bgp 65101 vrf RED
    leaf01(config-router)# address-family l2vpn evpn
    leaf01(config-router-af)# advertise ipv4 unicast 
    leaf01(config-router-af)# end
    leaf01# write memory
    

    The /etc/frr/frr.conf file now contains the following entries:

    ...
    router bgp 65101 vrf RED
     bgp router-id 10.10.10.1
    ..
     !
     address-family ipv4 unicast
      redistribute connected
      maximum-paths 64
      maximum-paths ibgp 64
     exit-address-family
     !
     address-family l2vpn evpn
      advertise ipv4 unicast
     exit-address-family
    exit
    
  3. Edit the /etc/default/isc-dhcp-relay-RED file.

    cumulus@leaf01:mgmt:~$ sudo nano /etc/default/isc-dhcp-relay-RED
    SERVERS="10.1.10.104"
    INTF_CMD=" -i vlan10 -i vlan20 -i vlan4024_l3" 
    OPTIONS="-U RED"
    
  4. Start the DHCP relay service and enable it so that it starts automatically the next time the switch boots:

    cumulus@leaf01:mgmt:~$ sudo systemctl start dhcrelay@RED.service
    cumulus@leaf01:mgmt:~$ sudo systemctl enable dhcrelay@RED.service
    

DHCP Relay for IPv4 in an EVPN Symmetric Environment without MLAG

In a multi-tenant EVPN symmetric routing environment without MLAG, the VLAN interface (SVI) IPv4 address is typically unique on each leaf switch, so you do not need to configure RFC 3527 support.

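The following example adds the SVIs vlan10 and vlan20 and the layer 3 VNI interface vlan4024_l3 for VRF RED to the DHCP relay configuration and sets the DHCP server IP address to 10.1.10.104: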

cumulus@leaf01:~$ nv set service dhcp-relay RED interface vlan10
cumulus@leaf01:~$ nv set service dhcp-relay RED interface vlan20
cumulus@leaf01:~$ nv set service dhcp-relay RED interface vlan4024_l3
cumulus@leaf01:~$ nv set service dhcp-relay RED server 10.1.10.104
cumulus@leaf01:~$ nv config apply
  1. Edit the /etc/default/isc-dhcp-relay-RED file.

    cumulus@leaf01:mgmt:~$ sudo nano /etc/default/isc-dhcp-relay-RED
    SERVERS="10.1.10.104"
    INTF_CMD=" -i vlan10 -i vlan20 -i vlan4024_l3" 
    OPTIONS=""
    
  2. Start the DHCP relay service and enable it to start automatically when the switch boots:

    cumulus@leaf01:mgmt:~$ sudo systemctl start dhcrelay@RED.service
    cumulus@leaf01:mgmt:~$ sudo systemctl enable dhcrelay@RED.service
    

DHCP Relay for IPv6 in an EVPN Symmetric Environment

For IPv6 DHCP relay in a symmetric routing environment, you must assign a unique IPv6 address to the non-default VRF interfaces that participate in DHCP relay. Cumulus Linux uses this IPv6 address as the source address when sending packets to the DHCP server and the DHCP server replies to this address.

RFC 3527 does not apply to IPv6. IPv6 has the functionality described in RFC 3527 as part of its normal operations.

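The following example configures VRF RED with the unique IPv6 address 2001:db8:666::1/128, sets the SVIs vlan10 and vlan20 as the downstream interfaces that receive DHCP requests from hosts, reaches the DHCP server at 2001:db8:199::2 through VRF RED, sets the layer 3 VNI interface vlan4024_l3 as the upstream interface that receives replies from the DHCP server, and advertises the connected routes so that the VRF RED loopback IPv6 address is reachable: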

cumulus@leaf01:~$ nv set vrf RED loopback ip address 2001:db8:666::1/128
cumulus@leaf01:~$ nv set service dhcp-relay6 RED interface downstream vlan10
cumulus@leaf01:~$ nv set service dhcp-relay6 RED interface downstream vlan20
cumulus@leaf01:~$ nv set service dhcp-relay6 RED interface upstream RED server-address 2001:db8:199::2
cumulus@leaf01:~$ nv set service dhcp-relay6 RED interface upstream vlan4024_l3
cumulus@leaf01:~$ nv set vrf RED router bgp address-family ipv6-unicast route-export to-evpn enable on
cumulus@leaf01:~$ nv config apply
  1. Edit the /etc/network/interfaces file to configure VRF RED with IPv6 address 2001:db8:666::1/128:

    cumulus@leaf01:mgmt:~$ sudo nano /etc/network/interfaces
    ...
    auto RED
    iface RED
            address 2001:db8:666::1/128
            vrf-table auto
    
  2. Configure VRF RED to advertise the connected routes so that the loopback IPv6 address is reachable:

    cumulus@leaf01:mgmt:~$ sudo vtysh 
    ...
    leaf01# configure terminal
    leaf01(config)# router bgp 65101 vrf RED
    leaf01(config-router)# address-family l2vpn evpn
    leaf01(config-router-af)# advertise ipv6 unicast 
    leaf01(config-router-af)# end
    leaf01# write memory
    

    The /etc/frr/frr.conf file now contains the following entries:

    ...
    router bgp 65101 vrf RED
     bgp router-id 10.10.10.1
    ..
     !
     address-family ipv6 unicast
      redistribute connected
      maximum-paths 64
      maximum-paths ibgp 64
     exit-address-family
     !
     address-family l2vpn evpn
      advertise ipv6 unicast
     exit-address-family
    exit
    
  3. Edit the /etc/default/isc-dhcp-relay6-RED file.

    • Set the -l option to the VLANs that receive DHCP requests from hosts.
    • Set the <ip-address-dhcp-server>%<interface-facing-dhcp-server> option to associate the DHCP Server with VRF RED.
    • Set the -u option to indicate where the switch receives replies from the DHCP server (SVI vlan4024_l3).
    cumulus@leaf01:mgmt:~$ sudo nano /etc/default/isc-dhcp-relay6-RED
    INTF_CMD="-l vlan10 -l vlan20"
    SERVERS="-u 2001:db8:199::2%RED -u vlan4024_l3"
    
  4. Start the DHCP relay service and enable it so that it starts automatically the next time the switch boots:

    cumulus@leaf01:mgmt:~$ sudo systemctl start dhcrelay6@RED.service
    cumulus@leaf01:mgmt:~$ sudo systemctl enable dhcrelay6@RED.service
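
The IPv6 default file uses -l for each downstream interface and -u for each upstream entry, where an upstream entry is either an interface name alone or a server address joined to an interface with a percent sign. The following Python sketch is an illustration with a hypothetical function name; it reproduces the two lines of the isc-dhcp-relay6-RED example from lists of interfaces.

```python
def render_relay6_defaults(upstream, downstream):
    """Build the INTF_CMD and SERVERS lines for an IPv6 relay file.

    upstream is a list of (interface, server_address_or_None) pairs.
    downstream is a list of interface names that face the hosts.
    """
    up_parts = []
    for interface, server in upstream:
        if server:
            up_parts.append(f"-u {server}%{interface}")
        else:
            up_parts.append(f"-u {interface}")
    servers = " ".join(up_parts)
    intf_cmd = " ".join(f"-l {name}" for name in downstream)
    return f'INTF_CMD="{intf_cmd}"\nSERVERS="{servers}"'


# Values taken from the VRF RED example above.
print(render_relay6_defaults(
    upstream=[("RED", "2001:db8:199::2"), ("vlan4024_l3", None)],
    downstream=["vlan10", "vlan20"],
))
```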
    

Gateway IP Address as Source IP for Relayed DHCP Packets (Advanced)

You can configure the dhcrelay service to forward IPv4 DHCP packets to a DHCP server so that the source IP address of the relayed packet is the same as the gateway IP address. This option applies to IPv4 only.

This option impacts all relayed IPv4 packets globally.

To use the gateway IP address as the source IP address:

cumulus@leaf01:~$ nv set service dhcp-relay default source-ip gateway
cumulus@leaf01:~$ nv config apply
  1. Edit the /etc/default/isc-dhcp-relay-default file to add --giaddr-src to the OPTIONS line.

    cumulus@leaf01:~$ sudo nano /etc/default/isc-dhcp-relay-default
    SERVERS="172.16.1.102"
    INTF_CMD="-i vlan10 -i swp51 -i swp52 -U swp2"
    OPTIONS="--giaddr-src"
    
  2. Restart the dhcrelay service to apply the configuration change:

    cumulus@leaf01:~$ sudo systemctl restart dhcrelay@default.service
    

Configure Multiple DHCP Relays

Cumulus Linux supports multiple DHCP relay daemons on a switch to enable relaying of packets from different bridges to different upstream interfaces.

To configure multiple DHCP relay daemons on a switch:

  1. In the /etc/default directory, create a configuration file for each DHCP relay daemon. Use the naming scheme isc-dhcp-relay-<dhcp-name> for IPv4 or isc-dhcp-relay6-<dhcp-name> for IPv6. This is an example configuration file for IPv4:

    # Defaults for isc-dhcp-relay initscript
    # sourced by /etc/init.d/isc-dhcp-relay
    # installed at /etc/default/isc-dhcp-relay by the maintainer scripts
    
    #
    # This is a POSIX shell fragment
    #
    
    # What servers should the DHCP relay forward requests to?
    SERVERS="102.0.0.2"
    # On what interfaces should the DHCP relay (dhrelay) serve DHCP requests?
    # Always include the interface towards the DHCP server.
    # This variable requires a -i for each interface configured above.
    # This will be used in the actual dhcrelay command
    # For example, "-i eth0 -i eth1"
    INTF_CMD="-i swp2s2 -i swp2s3"
    
    # Additional options that are passed to the DHCP relay daemon?
    OPTIONS=""
    
  2. Run the following command to start a dhcrelay instance, where <dhcp-name> is the instance name or number.

    cumulus@leaf01:~$ sudo systemctl start dhcrelay@<dhcp-name>
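
Each relay instance pairs one file in the /etc/default directory with one systemd unit, and both names derive from the instance name. As a small illustration, the following Python sketch maps an instance name to those two names using the naming scheme described in this section. The function name relay_instance_names is hypothetical.

```python
def relay_instance_names(dhcp_name, ipv6=False):
    """Return the default file path and the systemd unit for an instance."""
    if not dhcp_name or "/" in dhcp_name or any(ch.isspace() for ch in dhcp_name):
        raise ValueError("the instance name must be a single word without slashes")
    suffix = "6" if ipv6 else ""
    return {
        "config_file": f"/etc/default/isc-dhcp-relay{suffix}-{dhcp_name}",
        "unit": f"dhcrelay{suffix}@{dhcp_name}.service",
    }


for name, ipv6 in [("default", False), ("RED", True)]:
    print(relay_instance_names(name, ipv6))
```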
    

Troubleshooting

This section provides troubleshooting tips.

Show DHCP Relay Status

To show the DHCP relay status:

Run the nv show service dhcp-relay command for IPv4 or the nv show service dhcp-relay6 command for IPv6:

cumulus@leaf01:~$ nv show service dhcp-relay
           source-ip  Summary
---------  ---------  -----------------------
+ default  auto       gateway-interface: lo
  default             interface:        swp51
  default             interface:        swp52
  default             interface:        vlan10
  default             server:    172.16.1.102

Run the Linux systemctl status dhcrelay@default.service command for IPv4 or the systemctl status dhcrelay6@default.service command for IPv6:

cumulus@leaf01:~$ sudo systemctl status dhcrelay@default.service
● dhcrelay@default.service - DHCPv4 Relay Agent Daemon default in vrf default
   Loaded: loaded (/lib/systemd/system/dhcrelay@.service; enabled; vendor preset: enabled)
  Drop-In: /run/systemd/generator/dhcrelay@.service.d
           └─vrf.conf
   Active: active (running) since Tue 2023-04-18 18:23:55 UTC; 9min ago
     Docs: man:dhcrelay(8)
 Main PID: 30904 (dhcrelay)
    Tasks: 1 (limit: 2056)
   Memory: 2.3M
   CGroup: /system.slice/system-dhcrelay.slice/dhcrelay@default.service
           └─vrf
             └─30904 /usr/sbin/dhcrelay --nl -d -i swp51 -i swp52 -i vlan10 -i peerlink.4094 172.16.1.102

Check systemd

If you are experiencing issues with DHCP relay, check if there is a problem with systemd.

To see how DHCP relay is working on your switch, run the journalctl command:

cumulus@leaf01:~$ sudo journalctl -l -n 20 | grep dhcrelay
Dec 05 20:58:55 leaf01 dhcrelay[6152]: sending upstream swp52
Dec 05 20:58:55 leaf01 dhcrelay[6152]: sending upstream swp51
Dec 05 20:58:55 leaf01 dhcrelay[6152]: Relaying Reply to fe80::4638:39ff:fe00:3 port 546 down.
Dec 05 20:58:55 leaf01 dhcrelay[6152]: Relaying Reply to fe80::4638:39ff:fe00:3 port 546 down.
Dec 05 21:03:55 leaf01 dhcrelay[6152]: Relaying Renew from fe80::4638:39ff:fe00:3 port 546 going up.
Dec 05 21:03:55 leaf01 dhcrelay[6152]: sending upstream swp52
Dec 05 21:03:55 leaf01 dhcrelay[6152]: sending upstream swp51
Dec 05 21:03:55 leaf01 dhcrelay[6152]: Relaying Reply to fe80::4638:39ff:fe00:3 port 546 down.
Dec 05 21:03:55 leaf01 dhcrelay[6152]: Relaying Reply to fe80::4638:39ff:fe00:3 port 546 down.

To specify a time period with the journalctl command, use the --since flag:

cumulus@leaf01:~$ sudo journalctl -l --since "2 minutes ago" | grep dhcrelay
Dec 05 21:08:55 leaf01 dhcrelay[6152]: Relaying Renew from fe80::4638:39ff:fe00:3 port 546 going up.
Dec 05 21:08:55 leaf01 dhcrelay[6152]: sending upstream swp52
Dec 05 21:08:55 leaf01 dhcrelay[6152]: sending upstream swp51

Configuration Errors

If you configure DHCP relays by editing the /etc/default/isc-dhcp-relay-default file manually, you can introduce configuration errors that cause the switch to crash.

For example, if you see an error similar to the following, check that there is no space between the DHCP server address and the interface you use as the uplink.

Core was generated by /usr/sbin/dhcrelay --nl -d -i vx-40 -i vlan10 10.0.0.4 -U 10.0.1.2  %vlan20.
Program terminated with signal SIGSEGV, Segmentation fault.

To resolve the issue, manually edit the /etc/default/isc-dhcp-relay-default file to remove the space, then run the systemctl restart dhcrelay@default.service command to restart the dhcrelay service and apply the configuration change.
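
One way to catch this class of error before you restart the service is to scan the default file for whitespace next to a percent sign. The following Python sketch is an illustration only; the function name is hypothetical and the check covers just the stray-space problem described above, not every possible configuration error.

```python
import re

# Matches lines of the form NAME="value" as used in the default files.
ASSIGNMENT = re.compile(r'^\s*([A-Z_]+)="(.*)"\s*$')


def variables_with_stray_space(text):
    """Return the names of variables whose value has a space next to %."""
    bad = []
    for line in text.splitlines():
        match = ASSIGNMENT.match(line)
        if match and re.search(r"\s%|%\s", match.group(2)):
            bad.append(match.group(1))
    return bad


# Sample content modeled on the error shown above.
sample = "\n".join([
    'SERVERS="10.0.0.4"',
    'INTF_CMD="-i vx-40 -i vlan10"',
    'OPTIONS="-U 10.0.1.2  %vlan20"',
])
print(variables_with_stray_space(sample))
```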

Considerations

DHCP Servers

A DHCP server automatically provides and assigns IP addresses and other network parameters to client devices. It relies on DHCP to respond to broadcast requests from clients.

If you intend to run the dhcpd service within a VRF, including the management VRF, follow these steps.

Basic Configuration

This section shows you how to configure a DHCP server using the following topology, where the DHCP server is a switch running Cumulus Linux.

To configure the DHCP server on a Cumulus Linux switch:

In addition, you can configure a static IP address for a resource, such as a server or printer:

  • To configure static IP address assignments, you must first configure a pool.
  • You can set the DNS server IP address and domain name globally or specify different DNS server IP addresses and domain names for different pools.

The following example configures the storage-servers pool with DNS and a static DHCP assignment for server1.

cumulus@switch:~$ nv set service dhcp-server default pool 10.1.10.0/24 pool-name storage-servers
cumulus@switch:~$ nv set service dhcp-server default pool 10.1.10.0/24 domain-name example.com
cumulus@switch:~$ nv set service dhcp-server default pool 10.1.10.0/24 domain-name-server 192.168.200.53
cumulus@switch:~$ nv set service dhcp-server default pool 10.1.10.0/24 range 10.1.10.100 to 10.1.10.199
cumulus@switch:~$ nv set service dhcp-server default pool 10.1.10.0/24 gateway 10.1.10.1
cumulus@switch:~$ nv set service dhcp-server default static server1
cumulus@switch:~$ nv set service dhcp-server default static server1 ip-address 10.0.0.2
cumulus@switch:~$ nv set service dhcp-server default static server1 mac-address 44:38:39:00:01:7e
cumulus@switch:~$ nv config apply

To allocate DHCP addresses from the configured pool, you must configure an interface with an IP address from the pool subnet. For example:

cumulus@switch:~$ nv set interface vlan10 ip address 10.1.10.1/24
cumulus@switch:~$ nv config apply
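
Because the address range and the interface address must both belong to the pool subnet, a quick consistency check can catch typing mistakes before you apply the configuration. The following Python sketch uses the standard ipaddress module. It is an illustration, the function name check_pool is hypothetical, and it does not query the switch.

```python
import ipaddress


def check_pool(pool, range_start, range_end, interface_address):
    """Return a list of problems found in a planned pool configuration."""
    network = ipaddress.ip_network(pool)
    start = ipaddress.ip_address(range_start)
    end = ipaddress.ip_address(range_end)
    interface = ipaddress.ip_interface(interface_address)
    problems = []
    if start not in network or end not in network:
        problems.append("range is outside the pool subnet")
    elif start > end:
        problems.append("range start is higher than range end")
    if interface.ip not in network:
        problems.append("interface address is outside the pool subnet")
    return problems


# Values taken from the IPv4 example above; an empty list means no problems.
print(check_pool("10.1.10.0/24", "10.1.10.100", "10.1.10.199", "10.1.10.1/24"))
```

The same function works for IPv6 pools because ipaddress accepts both address families.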

To set the DNS server IP address and domain name globally, use the nv set service dhcp-server <vrf> domain-name-server <address> and nv set service dhcp-server <vrf> domain-name <domain> commands.

To set the interface name for the static assignment, run the nv set service dhcp-server <vrf> static <server> ifname command.

cumulus@switch:~$ nv set service dhcp-server6 default pool 2001:db8:1::/64 
cumulus@switch:~$ nv set service dhcp-server6 default pool 2001:db8:1::/64 pool-name storage-servers
cumulus@switch:~$ nv set service dhcp-server6 default pool 2001:db8:1::/64 domain-name-server 2001:db8::64
cumulus@switch:~$ nv set service dhcp-server6 default pool 2001:db8:1::/64 domain-name example.com
cumulus@switch:~$ nv set service dhcp-server6 default pool 2001:db8:1::/64 range 2001:db8:1::100 to 2001:db8:1::199
cumulus@switch:~$ nv set service dhcp-server6 default static server1
cumulus@switch:~$ nv set service dhcp-server6 default static server1 ip-address 2001:db8::100
cumulus@switch:~$ nv set service dhcp-server6 default static server1 mac-address 44:38:39:00:01:7e
cumulus@switch:~$ nv config apply

To allocate DHCP addresses from the configured pool, you must configure an interface with an IP address from the pool subnet. For example:

cumulus@switch:~$ nv set interface vlan10 ip address 2001:db8:1::10/64
cumulus@switch:~$ nv config apply

To set the DNS server IP address and domain name globally, use the nv set service dhcp-server6 <vrf> domain-name-server <address> and nv set service dhcp-server6 <vrf> domain-name <domain> commands.

  1. In a text editor, edit the /etc/dhcp/dhcpd.conf file. Use the following configuration as an example:

    cumulus@switch:~$ sudo nano /etc/dhcp/dhcpd.conf
    authoritative;
    subnet 10.1.10.0 netmask 255.255.255.0 {
       option domain-name-servers 192.168.200.53;
       option domain-name "example.com";
       default-lease-time 3600;
       max-lease-time 3600;
       default-url ;
       pool {
           range 10.1.10.100 10.1.10.199;
       }
    }
    #Statics
    group {
       host server1 {
          hardware ethernet 44:38:39:00:01:7e;
          fixed-address 10.0.0.2;
       }
    }
    

To set the DNS server IP address and domain name globally, add the DNS server IP address and domain name before the pool information in the /etc/dhcp/dhcpd.conf file. For example:

cumulus@switch:~$ sudo nano /etc/dhcp/dhcpd.conf
authoritative;
option domain-name servers;
option domain-name-servers 192.168.200.51;
subnet 10.1.10.0 netmask 255.255.255.0 {
   default-lease-time 3600;
   max-lease-time 3600;
...
  1. Edit the /etc/default/isc-dhcp-server configuration file so that the DHCP server starts when the system boots. Here is an example configuration:

    cumulus@switch:~$ sudo nano /etc/default/isc-dhcp-server
    DHCPD_CONF="-cf /etc/dhcp/dhcpd.conf"
    INTERFACES="swp1"

  2. Enable and start the dhcpd service:

    cumulus@switch:~$ sudo systemctl enable dhcpd.service
    cumulus@switch:~$ sudo systemctl start dhcpd.service
    
  1. In a text editor, edit the /etc/dhcp/dhcpd6.conf file. Use the following configuration as an example:

    cumulus@switch:~$ sudo nano /etc/dhcp/dhcpd6.conf
    authoritative;
    subnet6 2001:db8::/64 {
       option domain-name-servers 2001:db8:100::64;
       option domain-name "example.com";
       default-lease-time 3600;
       max-lease-time 3600;
       default-url ;
       pool {
           range6 2001:db8::100 2001:db8::199;
       }
    }
    #Statics
    group {
       host server1 {
           hardware ethernet 44:38:39:00:01:7e;
           fixed-address6 2001:db8::100;
       }
    }
    

To set the DNS server IP address and domain name globally, add the DNS server IP address and domain name before the pool information in the /etc/dhcp/dhcpd6.conf file. For example:

cumulus@switch:~$ sudo nano /etc/dhcp/dhcpd6.conf
authoritative;
option domain-name servers;
option domain-name-servers 2001:db8:100::64;
subnet6 2001:db8::/64 {
   default-lease-time 3600;
   max-lease-time 3600;
...
  1. Edit the /etc/default/isc-dhcp-server6 file so that the DHCP server starts when the system boots. Here is an example configuration:

    cumulus@switch:~$ sudo nano /etc/default/isc-dhcp-server6
    DHCPD_CONF="-cf /etc/dhcp/dhcpd6.conf"
    INTERFACES="swp1"

  2. Enable and start the dhcpd6 service:

    cumulus@switch:~$ sudo systemctl enable dhcpd6.service
    cumulus@switch:~$ sudo systemctl start dhcpd6.service
    

Optional Configuration

Lease Time

You can set the lease time for the network addresses that the DHCP server assigns to clients. Specify a value between 180 and 31536000 seconds. The default lease time is 3600 seconds.

cumulus@switch:~$ nv set service dhcp-server default pool 10.1.10.0/24 lease-time 200000
cumulus@switch:~$ nv config apply
cumulus@switch:~$ nv set service dhcp-server6 default pool 2001:db8:1::/64 lease-time 200000
cumulus@switch:~$ nv config apply
  1. Edit the /etc/dhcp/dhcpd.conf file to set the lease time (in seconds):

    cumulus@switch:~$ sudo nano /etc/dhcp/dhcpd.conf
    authoritative;
    subnet 10.1.10.0 netmask 255.255.255.0 {
       option domain-name-servers 192.168.200.53;
       option domain-name "example.com";
       default-lease-time 200000;
       max-lease-time 200000;
       default-url ;
       pool {
           range 10.1.10.100 10.1.10.199;
       }
    }
    
  2. Restart the dhcpd service:

    cumulus@switch:~$ sudo systemctl restart dhcpd.service
    
  1. Edit the /etc/dhcp/dhcpd6.conf file to set the lease time (in seconds):

    cumulus@switch:~$ sudo nano /etc/dhcp/dhcpd6.conf
    authoritative;
    subnet6 2001:db8::/64 {
       option domain-name-servers 2001:db8:100::64;
       option domain-name "example.com";
       default-lease-time 200000;
       max-lease-time 200000;
       default-url ;
       pool {
           range6 2001:db8::100 2001:db8::199;
       }
    }
    
  2. Restart the dhcpd6 service:

    cumulus@switch:~$ sudo systemctl restart dhcpd6.service
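
The Linux examples above set default-lease-time and max-lease-time to the same value. As a minimal sketch, the following Python helper checks a lease time against the documented limits of 180 and 31536000 seconds and returns the two statements. The function name lease_statements is hypothetical.

```python
MIN_LEASE = 180          # seconds
MAX_LEASE = 31536000     # seconds (365 days)
DEFAULT_LEASE = 3600     # seconds


def lease_statements(seconds=DEFAULT_LEASE):
    """Return the two lease statements for a dhcpd configuration file."""
    if not MIN_LEASE <= seconds <= MAX_LEASE:
        raise ValueError(
            f"lease time {seconds} is outside {MIN_LEASE}-{MAX_LEASE} seconds"
        )
    return [
        f"default-lease-time {seconds};",
        f"max-lease-time {seconds};",
    ]


print("\n".join(lease_statements(200000)))
```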
    

Ping Check

Configure the DHCP server to ping the address you want to assign to a client before issuing the IP address. If there is no response, DHCP delivers the IP address; otherwise, it attempts the next available address in the range.

cumulus@switch:~$ nv set service dhcp-server default pool 10.1.10.0/24 ping-check on
cumulus@switch:~$ nv config apply
cumulus@switch:~$ nv set service dhcp-server6 default pool 2001:db8:1::/64 ping-check on
cumulus@switch:~$ nv config apply
  1. Edit the /etc/dhcp/dhcpd.conf file to add ping-check true;:

    cumulus@switch:~$ sudo nano /etc/dhcp/dhcpd.conf
    authoritative;
    subnet 10.1.10.0 netmask 255.255.255.0 {
       option domain-name-servers 192.168.200.53;
       option domain-name "example.com";
       default-lease-time 200000;
       max-lease-time 200000;
       ping-check true;
       default-url ;
       pool {
           range 10.1.10.100 10.1.10.199;
       }
    }
    
  2. Restart the dhcpd service:

    cumulus@switch:~$ sudo systemctl restart dhcpd.service
    
  1. Edit the /etc/dhcp/dhcpd6.conf file to add ping-check true;:

    cumulus@switch:~$ sudo nano /etc/dhcp/dhcpd6.conf
    authoritative;
    subnet6 2001:db8::/64 {
       option domain-name-servers 2001:db8:100::64;
       option domain-name "example.com";
       default-lease-time 200000;
       max-lease-time 200000;
       ping-check true;
       default-url ;
       pool {
           range6 2001:db8::100 2001:db8::199;
       }
    }
    
  2. Restart the dhcpd6 service:

    cumulus@switch:~$ sudo systemctl restart dhcpd6.service
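
The ping check described in this section amounts to a simple selection rule: try each address in the range in order and hand out the first one that does not answer. The following Python sketch models that rule only. It is a simplification and not the dhcpd implementation; it ignores existing leases and timeouts. The function name is hypothetical, and the responds_to_ping argument is a stand-in for a real reachability probe.

```python
import ipaddress


def next_free_address(range_start, range_end, responds_to_ping):
    """Return the first address in the range that does not answer a ping."""
    current = ipaddress.ip_address(range_start)
    last = ipaddress.ip_address(range_end)
    while current <= last:
        if not responds_to_ping(str(current)):
            return str(current)
        current += 1  # move to the next address in the range
    return None  # every address in the range answered


# Pretend that two hosts already use the first two addresses.
in_use = {"10.1.10.100", "10.1.10.101"}
print(next_free_address("10.1.10.100", "10.1.10.199", lambda ip: ip in in_use))
```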
    

Assign a Port-based IP Address

You can assign an IP address and other DHCP options based on physical location or port regardless of MAC address to clients that attach directly to the Cumulus Linux switch through a switch port. This is helpful when swapping out switches and servers; you can avoid the inconvenience of collecting the MAC address and sending it to the network administrator to modify the DHCP server configuration.

cumulus@switch:~$ nv set service dhcp-server default static server2
cumulus@switch:~$ nv set service dhcp-server default static server2 ip-address 10.0.0.3
cumulus@switch:~$ nv set service dhcp-server default static server2 ifname swp1
cumulus@switch:~$ nv config apply
cumulus@switch:~$ nv set service dhcp-server6 default static server2
cumulus@switch:~$ nv set service dhcp-server6 default static server2 ip-address 2001:db8:1::100
cumulus@switch:~$ nv set service dhcp-server6 default static server2 ifname swp1
cumulus@switch:~$ nv config apply
  1. Edit the /etc/dhcp/dhcpd.conf file to add the interface and IP address:

    cumulus@switch:~$ sudo nano /etc/dhcp/dhcpd.conf
    # Statics
    group {
        host server2 {
            ifname "swp1";
            fixed-address 10.0.0.3;
        }
    }
    ...
    
  2. Restart the dhcpd service:

    cumulus@switch:~$ sudo systemctl restart dhcpd.service
    
  1. Edit the /etc/dhcp/dhcpd6.conf file to add the interface and IP address:

    cumulus@switch:~$ sudo nano /etc/dhcp/dhcpd6.conf
    ...
    host server2 {
        ifname "swp1";
        fixed-address 2001:db8::100;
    }
    ...
    
  2. Restart the dhcpd6 service:

    cumulus@switch:~$ sudo systemctl restart dhcpd6.service
    

Troubleshooting

To show the current DHCP server settings, run the nv show service dhcp-server command for IPv4 or nv show service dhcp-server6 for IPv6:

cumulus@leaf01:mgmt:~$ nv show service dhcp-server
           Summary
---------  ------------------
+ default  interface:    swp1
  default  pool: 10.1.10.0/24
  default  static:    server1

The DHCP server determines if a DHCP request is a relay or a non-relay DHCP request. Run the following command to see the DHCP request:

cumulus@server02:~$ sudo tail /var/log/syslog | grep dhcpd
2016-12-05T19:03:35.379633+00:00 server02 dhcpd: Relay-forward message from 2001:db8:101::1 port 547, link address 2001:db8:101::1, peer address fe80::4638:39ff:fe00:3
2016-12-05T19:03:35.380081+00:00 server02 dhcpd: Advertise NA: address 2001:db8::110 to client with duid 00:01:00:01:1f:d8:75:3a:44:38:39:00:00:03 iaid = 956301315 valid for 600 seconds
2016-12-05T19:03:35.380470+00:00 server02 dhcpd: Sending Relay-reply to 2001:db8:101::1 port 547
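
If you want to pick the relayed requests out of a larger log, a short script can filter on the Relay-forward lines. The following Python sketch assumes log lines in the format shown above; the relay_sources function name and the regular expression are illustrative choices, not part of dhcpd.

```python
import re

# Pattern derived from the sample dhcpd log lines above.
RELAY_RE = re.compile(
    r"dhcpd: Relay-forward message from (?P<relay>\S+) port \d+")


def relay_sources(log_lines):
    """Return the relay agent addresses seen in dhcpd Relay-forward lines."""
    sources = []
    for line in log_lines:
        match = RELAY_RE.search(line)
        if match:
            sources.append(match.group("relay"))
    return sources


sample = [
    "2016-12-05T19:03:35.379633+00:00 server02 dhcpd: Relay-forward message "
    "from 2001:db8:101::1 port 547, link address 2001:db8:101::1, "
    "peer address fe80::4638:39ff:fe00:3",
    "2016-12-05T19:03:35.380470+00:00 server02 dhcpd: Sending Relay-reply "
    "to 2001:db8:101::1 port 547",
]
print(relay_sources(sample))
```

A request that does not produce a Relay-forward line is a non-relay request from the point of view of this filter.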

Considerations

DHCP packets received on bridge ports and sent to the CPU for processing cause the RX_DROP counter to increment on the interface.

DHCP Snooping

DHCP snooping is a network security feature that prevents unauthorized DHCP servers from assigning IP addresses, protects against DHCP spoofing and IP address conflicts, and enhances overall network security. By ensuring that only trusted DHCP servers can assign IP addresses and maintaining a binding table of IP address to MAC address mappings, DHCP snooping helps safeguard network integrity and reliability.

Cumulus Linux acts as a middle layer between the DHCP infrastructure and DHCP clients by scanning DHCP control packets and building an IP-MAC database. Cumulus Linux accepts DHCP offers from trusted interfaces only.

  • Cumulus Linux does not support DHCP option 82 processing.
  • You must add the DHCP snooping VLAN to a bridge.
  • DHCP snooping supports single bridge mode only.

Configure DHCP Snooping

To configure DHCP snooping:

The following example enables DHCP snooping on VLAN 10 and sets the trusted interface to swp3. swp3 is a member of the bridge br_default:

cumulus@switch:~$ nv set bridge domain br_default vlan 10
cumulus@switch:~$ nv set bridge domain br_default dhcp-snoop vlan 10 
cumulus@switch:~$ nv set bridge domain br_default dhcp-snoop vlan 10 trust swp3
cumulus@switch:~$ nv config apply

For IPv6, run the nv set bridge domain <bridge> dhcp-snoop6 vlan <vlan> command.

To disable DHCP snooping on a VLAN under a bridge, run the nv unset bridge domain <bridge> dhcp-snoop vlan <vlan> command for IPv4 or the nv unset bridge domain <bridge> dhcp-snoop6 vlan <vlan> command for IPv6.

Create the /etc/dhcpsnoop/dhcp_snoop.json file, then add DHCP snooping configuration under the bridge.

The following example enables DHCP snooping for IPv4 on VLAN 10 and sets the trusted interface to swp3. swp3 is a member of the bridge br_default:

cumulus@switch:~$ sudo nano /etc/dhcpsnoop/dhcp_snoop.json
{
  "bridge": [
    {
      "bridge_id": "br_default",
      "vlan": [
        {
          "vlan_id": 10,
          "snooping": 1,
          "ip_version": 4,
          "trusted_interface": [
            "swp3"
          ],
        }
      ]
    }
  ]
}

The following example enables DHCP snooping for IPv6 on VLAN 10 and sets the trusted interface to swp6. swp6 is a member of the bridge br_default:

cumulus@switch:~$ sudo nano /etc/dhcpsnoop/dhcp_snoop.json
{
  "bridge": [
    {
      "bridge_id": "br_default",
      "vlan": [
        {
          "vlan_id": 10,
          "snooping": 1,
          "ip_version": 6,
          "trusted_interface": [
            "swp6"
          ],
        }
      ]
    }
  ]
}
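
Hand-edited JSON is easy to get wrong; a single stray comma is enough to make the file invalid. As a minimal sketch, you can generate the content with a script instead. The keys below are copied from the examples above; the snoop_config helper is a hypothetical name, and the sketch prints the text rather than writing /etc/dhcpsnoop/dhcp_snoop.json.

```python
import json


def snoop_config(bridge_id, vlan_id, ip_version, trusted_interfaces):
    """Build the structure shown in the examples above for one bridge VLAN."""
    return {
        "bridge": [
            {
                "bridge_id": bridge_id,
                "vlan": [
                    {
                        "vlan_id": vlan_id,
                        "snooping": 1,
                        "ip_version": ip_version,
                        "trusted_interface": list(trusted_interfaces),
                    }
                ],
            }
        ]
    }


config = snoop_config("br_default", 10, 4, ["swp3"])
text = json.dumps(config, indent=2)
json.loads(text)  # raises ValueError if the text is not valid JSON
print(text)
```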

Show the DHCP Snooping Table

To show the DHCP snooping table, run the nv show bridge domain <bridge> dhcp-snoop command for IPv4 or the nv show bridge domain <bridge> dhcp-snoop6 command for IPv6.

The following example shows the DHCP snooping table for IPv4:

cumulus@switch:~$ nv show bridge domain br_default dhcp-snoop
DHCP Snooping Table
======================
    Vlan  Port  IP           MAC                State  Lease  Bridge    
    ----  ----  -----------  -----------------  -----  -----  ----------
    10    swp1  10.1.10.100  48:b0:2d:fa:6b:a1  ACK    3600   br_default

To show the DHCP snooping table for a specific VLAN, run the nv show bridge domain <bridge> dhcp-snoop vlan <vlan-ID> command for IPv4 or the nv show bridge domain <bridge> dhcp-snoop6 vlan <vlan-id> command for IPv6.

The following example shows the IPv4 DHCP snooping table for VLAN 10:

cumulus@switch:~$ nv show bridge domain br_default dhcp-snoop vlan 10
DHCP Snooping Vlan Trust Ports Table
=======================================
    Port 
    -----
    swp51

DHCP Snooping Vlan Bind Table
================================
    Port  IP           MAC                State  Lease  Bridge    
    ----  -----------  -----------------  -----  -----  ----------
    swp1  10.1.10.100  48:b0:2d:fa:6b:a1  ACK    3300   br_default

To show trusted port information in the DHCP snooping table, run the nv show bridge domain <bridge-id> dhcp-snoop trust-ports command for IPv4 or the nv show bridge domain <bridge> dhcp-snoop6 trust-ports command for IPv6.

The following example shows the trusted port information in the IPv4 DHCP snooping table:

cumulus@switch:~$ nv show bridge domain br_default dhcp-snoop trust-ports
Vlan               Ports
----------    --------------------
100           swp1, swp2
200           swp3, swp4

Prescriptive Topology Manager - PTM

PTM is a dynamic cabling verification tool that can detect and eliminate errors. PTM uses a Graphviz-DOT specified network cabling plan in a topology.dot file and couples it with runtime information from LLDP to verify that the cabling matches the specification. The check occurs on every link transition on each node in the network.

You can customize the topology.dot file to control the PTM service (ptmd) at both the global and network level, and the node and port level.

Supported Features

Configure PTM

The ptmd service verifies the physical network topology against a DOT-specified network graph file, /etc/ptm.d/topology.dot.

PTM supports undirected graphs.

At startup, the ptmd service connects to the lldpd service over a Unix socket and retrieves the neighbor name and port information. It then compares the retrieved port information with the configuration information that it reads from the topology file. If there is a match, the result is PASS; otherwise, it is FAIL.

PTM performs its LLDP neighbor check using the PortID ifname TLV information.
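
The comparison can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the ptmd implementation: the regular expression only handles plain edge lines, the LLDP data is a hand-written dictionary rather than output from lldpd, and the function names are hypothetical.

```python
import re

# Matches lines such as: "spine1":"swp1" -- "leaf1":"swp1"
EDGE_RE = re.compile(r'"([^"]+)":"([^"]+)"\s*--\s*"([^"]+)":"([^"]+)"')


def expected_neighbors(dot_text, self_node):
    """Map each local port of self_node to its expected (neighbor, port)."""
    expected = {}
    for a_node, a_port, b_node, b_port in EDGE_RE.findall(dot_text):
        # The graph is undirected, so the local node can be on either side.
        if a_node == self_node:
            expected[a_port] = (b_node, b_port)
        elif b_node == self_node:
            expected[b_port] = (a_node, a_port)
    return expected


def check_cabling(expected, lldp_neighbors):
    """Return pass, fail, or N/A for each port defined in the topology."""
    result = {}
    for port, want in expected.items():
        seen = lldp_neighbors.get(port)
        if seen is None:
            result[port] = "N/A"  # no LLDP information received
        elif seen == want:
            result[port] = "pass"
        else:
            result[port] = "fail"
    return result


topology = '''
graph G {
    "spine1":"swp1" -- "leaf1":"swp1";
    "spine1":"swp2" -- "leaf2":"swp1";
}
'''
# Hypothetical LLDP data: swp2 is cabled to the wrong switch.
lldp = {"swp1": ("leaf1", "swp1"), "swp2": ("leaf1", "swp2")}
print(check_cabling(expected_neighbors(topology, "spine1"), lldp))
```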

ptmd Scripts

The ptmd service executes a script for each interface that goes through a change: /etc/ptm.d/if-topo-pass when an LLDP or BFD check passes, or /etc/ptm.d/if-topo-fail when the check fails. The scripts receive an argument string that is the result of the ptmctl command; see ptmctl Commands below.

You can modify these default scripts.

Configuration Parameters

You can configure ptmd parameters in the topology file. The parameter types are host-only, global, per-port or per-node, and templates.

Host-only Parameters

Host-only parameters apply to the entire host on which PTM is running. You can include the hostnametype host-only parameter that specifies if PTM uses only the hostname (hostname) or the fully qualified domain name (fqdn) while looking for the self-node in the graph file. For example, in the graph file below PTM ignores the FQDN and only looks for switch04 because that is the hostname of the switch on which it is running:

  • Always wrap the hostname in double quotes; for example, "www.example.com" to prevent ptmd from failing.
  • To avoid errors when starting the ptmd service, make sure that /etc/hosts and /etc/hostname both reflect the hostname you are using in the topology.dot file.

graph G {
          hostnametype="hostname"
          "cumulus":"swp44" -- "switch04.cumulusnetworks.com":"swp20"
          "cumulus":"swp46" -- "switch04.cumulusnetworks.com":"swp22"
}

In this next example, PTM compares using the FQDN and looks for switch05.cumulusnetworks.com, which is the FQDN of the switch on which it is running:

graph G {
          hostnametype="fqdn"
          "cumulus":"swp44" -- "switch05.cumulusnetworks.com":"swp20"
          "cumulus":"swp46" -- "switch05.cumulusnetworks.com":"swp22"
}
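
The following Python sketch shows one way to picture the self-node lookup for the two hostnametype values. It assumes that hostname mode compares only the part of each name before the first dot, which matches the two examples above; the actual ptmd logic might differ, and the self_node function is a hypothetical name.

```python
def self_node(graph_nodes, local_fqdn, hostnametype="hostname"):
    """Pick the graph node that represents the local switch.

    With hostnametype="hostname" only the part before the first dot is
    compared; with "fqdn" the whole name must match.
    """
    for node in graph_nodes:
        if hostnametype == "fqdn":
            if node == local_fqdn:
                return node
        elif node.split(".")[0] == local_fqdn.split(".")[0]:
            return node
    return None


nodes = ["cumulus", "switch04.cumulusnetworks.com"]
print(self_node(nodes, "switch04", "hostname"))
print(self_node(nodes, "switch04.cumulusnetworks.com", "fqdn"))
print(self_node(nodes, "switch04", "fqdn"))
```

The first two calls find the switch04.cumulusnetworks.com node; the last call returns None because the short name is not the full FQDN.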

Global Parameters

Global parameters apply to every port in the topology file. LLDP is on by default; if no keyword is present, PTM uses the default values for all ports.

graph G {
          LLDP=""
          "cumulus":"swp44" -- "qct-ly2-04":"swp20"
          "cumulus":"swp46" -- "qct-ly2-04":"swp22"
}

Per-port Parameters

Per-port parameters provide finer-grained control at the port level. These parameters override any global or compiled defaults. For example:

graph G {
          LLDP=""
          "cumulus":"swp44" -- "qct-ly2-04":"swp20"
          "cumulus":"swp46" -- "qct-ly2-04":"swp22"
}

Templates

Templates provide flexibility so that you can choose different parameter combinations and apply them to a given port. A template instructs ptmd to reference a named parameter string instead of a default one. In the following configuration, LLDP1 and LLDP2 are templates for LLDP parameters:

graph G {
          LLDP=""
          LLDP1="match_type=ifname"
          LLDP2="match_type=portdescr"
          "cumulus":"swp44" -- "qct-ly2-04":"swp20" [LLDP="lldptmpl=LLDP1"]
          "cumulus":"swp46" -- "qct-ly2-04":"swp22" [LLDP="lldptmpl=LLDP2"]
          "cumulus":"swp46" -- "qct-ly2-04":"swp22"
}

Supported LLDP Parameters

ptmd supports the following LLDP parameters:

The following is an example of a topology with LLDP at the port level:

graph G {
          "cumulus-1":"swp44" -- "cumulus-2":"swp20" [LLDP="match_hostname=fqdn"]
          "cumulus-1":"swp46" -- "cumulus-2":"swp22" [LLDP="match_type=portdescr"]
}

When you specify match_hostname=fqdn, PTM matches the entire FQDN (cumulus-2.domain.com in the example below). If you do not specify a value for match_hostname, PTM matches based on the hostname only (cumulus-3 below) and ignores the rest of the FQDN:

graph G {
          "cumulus-1":"swp44" -- "cumulus-2.domain.com":"swp20" [LLDP="match_hostname=fqdn"]
          "cumulus-1":"swp46" -- "cumulus-3":"swp22" [LLDP="match_type=portdescr"]
}

BFD

BFD provides low overhead and rapid detection of failures in the paths between two network devices. PTM provides a unified mechanism for link detection over all media and protocol layers and integrates with FRR to enable BFD. Use BFD to detect failures for IPv4 and IPv6 single or multihop paths between any two network devices, including unidirectional path failure detection. For information about configuring BFD, see BFD.

You can enable PTM to perform additional checks to ensure that routing adjacencies form only on links that have connectivity and that conform to the specification that PTM defines.

You only need to enable PTM to check link state. You do not need to enable PTM to determine BFD status.

cumulus@switch:~$ nv set router ptm enable
cumulus@switch:~$ nv config apply

To disable the link state check:

cumulus@switch:~$ nv unset router ptm enable
cumulus@switch:~$ nv config apply

To enable PTM link state checking in vtysh, set the ptm-enable parameter:

cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# ptm-enable
switch(config)# end
switch# write memory
switch# exit
cumulus@switch:~$

To disable the link state check, set the no ptm-enable parameter:

cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# no ptm-enable
switch(config)# end
switch# write memory
switch# exit
cumulus@switch:~$

To check PTM status on an interface, run the vtysh show interface <interface> command.

cumulus@switch:~$ sudo vtysh -c "show interface swp51"
Interface swp51 is up, line protocol is up
  Link ups:       0    last: (never)
  Link downs:     0    last: (never)
  PTM status: disabled
...

ptmd Service Commands

PTM sends client notifications in CSV format.

To start the ptmd service, run the sudo systemctl start ptmd.service command. The topology.dot file must be present for the service to start.

cumulus@switch:~$ sudo systemctl start ptmd.service

To restart the ptmd service, run the sudo systemctl restart ptmd.service command:

cumulus@switch:~$ sudo systemctl restart ptmd.service

To instruct the ptmd service to read the topology.dot file again to apply the new configuration to the running state without restarting, run the sudo systemctl reload ptmd.service command:

cumulus@switch:~$ sudo systemctl reload ptmd.service

To stop the ptmd service, run the sudo systemctl stop ptmd.service command:

cumulus@switch:~$ sudo systemctl stop ptmd.service

To retrieve the current running state of the ptmd service, run the sudo systemctl status ptmd.service command:

cumulus@switch:~$ sudo systemctl status ptmd.service

ptmctl Commands

ptmctl is a client of the ptmd service that retrieves the operational state of the ports configured on the switch and information about BFD sessions from ptmd. ptmctl parses the CSV notifications sent by ptmd. See man ptmctl for more information.

ptmctl Examples

The examples below contain the following keywords in the output of the cbl status column:

cbl status Keyword Definition
pass The topology file defines the interface, the interface receives LLDP information, and the LLDP information for the interface matches the information in the topology file.
fail The topology file defines the interface, the interface receives LLDP information, and the LLDP information for the interface does not match the information in the topology file.
N/A The topology file defines the interface, but the interface does not receive LLDP information. The interface might be down or disconnected, or the neighbor is not sending LLDP packets.
The N/A and fail status might indicate a wiring problem to investigate.
The N/A status does not show when you use the -l option with ptmctl; the output shows only interfaces that are receiving LLDP information.

For basic output, use ptmctl without any options:

PTM show command output displays BFD status when you configure BFD through integration with FRR.

cumulus@switch:~$ sudo ptmctl

-------------------------------------------------------------
port  cbl     BFD     BFD                  BFD    BFD
      status  status  peer                 local  type
-------------------------------------------------------------
swp1  pass    pass    11.0.0.2             N/A    singlehop
swp2  pass    N/A     N/A                  N/A    N/A
swp3  pass    N/A     N/A                  N/A    N/A
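
To act on this output in a script, you can filter for ports whose cbl status is not pass. The following Python sketch assumes the six-column layout shown above; the sample text is a modified copy of that output (swp2 and swp3 are changed to fail and N/A for illustration), and the function name is hypothetical.

```python
def ports_needing_attention(ptmctl_output):
    """Return (port, cbl status) pairs whose cbl status is not pass."""
    flagged = []
    for line in ptmctl_output.splitlines():
        fields = line.split()
        # Data rows start with a port name and have six columns;
        # separator and header lines do not.
        if len(fields) == 6 and not line.startswith(("-", "port")):
            port, cbl_status = fields[0], fields[1]
            if cbl_status != "pass":
                flagged.append((port, cbl_status))
    return flagged


sample = """\
-------------------------------------------------------------
port  cbl     BFD     BFD                  BFD    BFD
      status  status  peer                 local  type
-------------------------------------------------------------
swp1  pass    pass    11.0.0.2             N/A    singlehop
swp2  fail    N/A     N/A                  N/A    N/A
swp3  N/A     N/A     N/A                  N/A    N/A
"""
print(ports_needing_attention(sample))
```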

For more detailed output, use the -d option:

cumulus@switch:~$ sudo ptmctl -d

--------------------------------------------------------------------------------------
port  cbl    exp     act      sysname  portID  portDescr  match  last    BFD   BFD
      status nbr     nbr                                  on     upd     Type  state
--------------------------------------------------------------------------------------
swp45 pass   h1:swp1 h1:swp1  h1       swp1    swp1       IfName 5m: 5s  N/A   N/A
swp46 fail   h2:swp1 h2:swp1  h2       swp1    swp1       IfName 5m: 5s  N/A   N/A

#continuation of the output
-------------------------------------------------------------------------------------------------
BFD   BFD       det_mult  tx_timeout  rx_timeout  echo_tx_timeout  echo_rx_timeout  max_hop_cnt
peer  DownDiag
-------------------------------------------------------------------------------------------------
N/A   N/A       N/A       N/A         N/A         N/A              N/A              N/A
N/A   N/A       N/A       N/A         N/A         N/A              N/A              N/A

To show information about the active BFD sessions that the ptmd service is tracking, use the -b option:

cumulus@switch:~$ sudo ptmctl -b

----------------------------------------------------------
port  peer        state  local         type       diag

----------------------------------------------------------
swp1  11.0.0.2    Up     N/A           singlehop  N/A
N/A   12.12.12.1  Up     12.12.12.4    multihop   N/A

To show LLDP information, use the -l option. The output shows only the active neighbors that the ptmd service is tracking.

cumulus@switch:~$ sudo ptmctl -l

---------------------------------------------
port  sysname  portID  port   match  last
                       descr  on     upd
---------------------------------------------
swp45 h1       swp1    swp1   IfName 5m:59s
swp46 h2       swp1    swp1   IfName 5m:59s

To show detailed information about the active BFD sessions that the ptmd service is tracking, use the -b and -d options:

cumulus@switch:~$ sudo ptmctl -b -d

----------------------------------------------------------------------------------------
port  peer                 state  local  type       diag  det   tx_timeout  rx_timeout
                                                          mult
----------------------------------------------------------------------------------------
swp1  fe80::202:ff:fe00:1  Up     N/A    singlehop  N/A   3     300         900
swp1  3101:abc:bcad::2     Up     N/A    singlehop  N/A   3     300         900

#continuation of output
---------------------------------------------------------------------
echo        echo        max      rx_ctrl  tx_ctrl  rx_echo  tx_echo
tx_timeout  rx_timeout  hop_cnt
---------------------------------------------------------------------
0           0           N/A      187172   185986   0        0
0           0           N/A      501      533      0        0

ptmctl Error Outputs

If there are errors in the topology file or there is no session, PTM returns appropriate outputs. Typical error strings are:

Topology file error [/etc/ptm.d/topology.dot] [cannot find node cumulus] -
please check /var/log/ptmd.log for more info

Topology file error [/etc/ptm.d/topology.dot] [cannot open file (errno 2)] -
please check /var/log/ptmd.log for more info

No Hostname/MgmtIP found [Check LLDPD daemon status] -
please check /var/log/ptmd.log for more info

No BFD sessions . Check connections

No LLDP ports detected. Check connections

Unsupported command

For example:

cumulus@switch:~$ sudo ptmctl
-------------------------------------------------------------------------
cmd         error
-------------------------------------------------------------------------
get-status  Topology file error [/etc/ptm.d/topology.dot]
            [cannot open file (errno 2)] - please check /var/log/ptmd.log
            for more info

If you encounter errors with the topology.dot file, you can use dot (included in the Graphviz package) to validate the syntax of the topology file.

Open the topology file with Graphviz to ensure that it is readable and that the file format is correct.

If you edit the topology.dot file from a Windows system, be sure to double-check the file formatting; there might be extra characters that keep the graph from working correctly.
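
The guide does not list which extra characters cause problems. As an assumption, the following Python sketch checks for three artifacts that Windows editors commonly introduce: a UTF-8 byte order mark, CRLF line endings, and curly quotes. The windows_artifacts function is a hypothetical helper that works on the raw bytes of the file.

```python
def windows_artifacts(data: bytes):
    """List formatting problems that Windows editors commonly introduce."""
    problems = []
    if data.startswith(b"\xef\xbb\xbf"):
        problems.append("UTF-8 byte order mark at start of file")
    if b"\r\n" in data:
        problems.append("CRLF line endings")
    if "\u201c".encode() in data or "\u201d".encode() in data:
        problems.append("curly double quotes instead of straight quotes")
    return problems


sample = b'graph G {\r\n    "leaf1":"swp3" -- "leaf2":"swp3";\r\n}\r\n'
print(windows_artifacts(sample))
```

To check a real file, read it in binary mode and pass the bytes to the function.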

Basic Topology Example

The following example shows a basic DOT file and its corresponding topology diagram. Use the same topology.dot file on all switches instead of splitting the file for each device; using the exact same file everywhere makes automation easier.

graph G {
    "spine1":"swp1" -- "leaf1":"swp1";
    "spine1":"swp2" -- "leaf2":"swp1";
    "spine2":"swp1" -- "leaf1":"swp2";
    "spine2":"swp2" -- "leaf2":"swp2";
    "leaf1":"swp3" -- "leaf2":"swp3";
    "leaf1":"swp4" -- "leaf2":"swp4";
    "leaf1":"swp5s0" -- "server1":"eth1";
    "leaf2":"swp5s0" -- "server2":"eth1";
}

Considerations

Commas in Port Descriptions

If an LLDP neighbor advertises a PortDescr that contains commas, ptmctl -d splits the string on the commas and misplaces its components in other columns. Do not use commas in your port descriptions.
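
Because PTM sends client notifications in CSV format, a comma inside a field shifts every column after it when the line is split on commas. The following Python sketch illustrates the effect with made-up field values; it is not the ptmctl parser, and the field order is an assumption for the example.

```python
import csv
import io

# Hypothetical notification fields: port, sysname, portID, portDescr, match.
fields = ["swp45", "h1", "swp1", "uplink, rack 3", "IfName"]

naive_line = ",".join(fields)  # no quoting around the description
print(naive_line.split(","))   # six columns instead of five

buffer = io.StringIO()
csv.writer(buffer).writerow(fields)  # quotes the field that contains a comma
print(next(csv.reader(io.StringIO(buffer.getvalue()))))
```

The naive split yields an extra column, which is why the description ends up spread across neighboring columns; a quoted CSV field would survive, but avoiding commas in port descriptions sidesteps the problem.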

Port Security

Port security is a layer 2 traffic control feature that enables you to limit port access to a specific number of MAC addresses or specific MAC addresses so that the port does not forward ingress traffic from undefined source addresses (static MAC).

You can configure the action to take when there is a port security violation (drop packets or put the port into protodown state) and add a timeout for the action to take effect. The default violation mode is to drop packets.

Port security supports 802.1X interfaces and layer 2 interfaces in trunk or access mode, but not interfaces in a bond. For information about how port security and 802.1X work together, see 802.1x multi host mode.

Configure Port Security

To configure port security:

To enable security on a port, run the nv set interface <interface> port-security enable on command:

cumulus@switch:~$ nv set interface swp1 port-security enable on
cumulus@switch:~$ nv config apply

You can disable port security on an interface with the nv set interface <interface> port-security enable off command.

To configure the maximum number of MAC addresses allowed to access the port, run the nv set interface <interface> port-security mac-limit command. You can specify a value between 1 and 512. The default value is 32.

cumulus@switch:~$ nv set interface swp1 port-security mac-limit 100
cumulus@switch:~$ nv config apply 

To configure specific MAC addresses allowed to access the port, run the nv set interface <interface> port-security static-mac command.

You can configure a maximum of 450 static MAC addresses per interface.

cumulus@switch:~$ nv set interface swp1 port-security static-mac 00:02:00:00:00:05
cumulus@switch:~$ nv set interface swp1 port-security static-mac 00:02:00:00:00:06
cumulus@switch:~$ nv config apply

To enable sticky MAC port security to track specific dynamically learned MAC addresses on a port, run the nv set interface <interface> port-security sticky-mac enabled command.

Cumulus Linux maintains learned sticky MAC addresses through interface flaps and reboots if the source MAC address is still sending traffic; otherwise learned sticky MAC addresses age out according to the sticky MAC aging time.

cumulus@switch:~$ nv set interface swp1 port-security sticky-mac enabled
cumulus@switch:~$ nv config apply

To enable sticky MAC aging, run the nv set interface <interface> port-security sticky-ageing enabled command.

cumulus@switch:~$ nv set interface swp1 port-security sticky-ageing enabled
cumulus@switch:~$ nv config apply

To configure the time period after which learned sticky MAC addresses age out and no longer have access to the port, run the nv set interface <interface> port-security sticky-timeout command. You can specify a value between 0 and 3600 minutes. The default setting is 1800 minutes.

cumulus@switch:~$ nv set interface swp1 port-security sticky-timeout 20
cumulus@switch:~$ nv config apply

To configure violation mode, either run the nv set interface <interface> port-security violation-mode protodown command to put a port into a protodown state or run the nv set interface <interface> port-security violation-mode restrict command to drop packets.

cumulus@switch:~$ nv set interface swp1 port-security violation-mode protodown
cumulus@switch:~$ nv config apply

To bring a port (swp2 in this example) out of the protodown state after a violation:

cumulus@switch:~$ sudo ip link set swp2 protodown_reason portsecurity off
cumulus@switch:~$ sudo ip link set swp2 protodown off

To configure the number of minutes after which the violation mode times out, run the nv set interface <interface> port-security violation-timeout command. You can specify a value between 0 and 60 minutes. The default value is 30 minutes.

cumulus@switch:~$ nv set interface swp1 port-security violation-timeout 60
cumulus@switch:~$ nv config apply

Add the configuration settings you want to use to the /etc/cumulus/switchd.d/port_security.conf file, then reload switchd with the sudo systemctl reload switchd.service command to apply the changes.

  • interface.<port>.port_security.enable: Enables and disables port security. 1 enables security on the port. 0 disables security on the port. The default setting is 0.
  • interface.<port>.port_security.mac_limit: Configures the maximum number of MAC addresses allowed to access the port. You can specify a number between 0 and 512. The default value is 32.
  • interface.<port>.port_security.static_mac: Configures the specific MAC addresses allowed to access the port. To specify multiple MAC addresses, separate each MAC address with a space.
  • interface.<port>.port_security.sticky_mac: Enables and disables sticky MAC port security to track specific dynamically learned MAC addresses on a port. 1 enables sticky MAC. 0 disables sticky MAC. Cumulus Linux maintains learned sticky MAC addresses through interface flaps and reboots if the source MAC address is still sending traffic; otherwise learned sticky MAC addresses age out according to the sticky MAC aging time.
  • interface.<port>.port_security.sticky_timeout: The time period after which learned sticky MAC addresses age out and no longer have access to the port. You can specify a value between 0 and 3600 minutes. The default aging timeout value is 1800.
  • interface.<port>.port_security.sticky_aging: Enables and disables sticky MAC aging. 1 enables sticky MAC aging. 0 disables sticky MAC aging.
  • interface.<port>.port_security.violation_mode: Configures the violation mode: 0 (protodown) puts a port into a protodown state. 1 (restrict) drops packets. The default setting is 1.
  • interface.<port>.port_security.violation_timeout: Configures the number of minutes after which the violation mode times out. You can specify a value between 0 and 3600. The default value is 1800.

The following shows an example /etc/cumulus/switchd.d/port_security.conf configuration file:

cumulus@switch:~$ sudo nano /etc/cumulus/switchd.d/port_security.conf
...
## Interface Port security
interface.swp1.port_security.enable = 1
interface.swp1.port_security.mac_limit = 100
interface.swp1.port_security.sticky_mac = 1
interface.swp1.port_security.sticky_timeout = 2000
interface.swp1.port_security.sticky_aging = 1
interface.swp1.port_security.violation_mode = 0
interface.swp1.port_security.violation_timeout = 3600
interface.swp1.port_security.static_mac = 00:02:00:00:00:05 00:02:00:00:00:06
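
Before you reload switchd, you might want to check that the values fall inside the documented ranges. The following Python sketch does this for text in the format shown above. The ranges are copied from the setting descriptions earlier in this section; the check_port_security function is a hypothetical helper and is not part of Cumulus Linux.

```python
# Allowed ranges copied from the setting descriptions in this section.
RANGES = {
    "enable": (0, 1),
    "mac_limit": (0, 512),
    "sticky_mac": (0, 1),
    "sticky_timeout": (0, 3600),
    "sticky_aging": (0, 1),
    "violation_mode": (0, 1),
    "violation_timeout": (0, 3600),
}


def check_port_security(conf_text):
    """Return a list of problems found in port_security.conf style text."""
    problems = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        setting = key.rsplit(".", 1)[-1]
        if setting not in RANGES:
            continue  # static_mac and unknown keys are not range-checked
        low, high = RANGES[setting]
        if not value.isdigit() or not low <= int(value) <= high:
            problems.append(f"{key}: {value!r} is outside {low}-{high}")
    return problems


sample = """\
## Interface Port security
interface.swp1.port_security.enable = 1
interface.swp1.port_security.mac_limit = 1000
interface.swp1.port_security.static_mac = 00:02:00:00:00:05 00:02:00:00:00:06
"""
print(check_port_security(sample))
```

The sample deliberately sets mac_limit to 1000, so the sketch reports that one line and accepts the others.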

Clear the Protodown State

If there is a port security violation and the port goes into a protodown state, you can clear the protodown state after you mitigate the MAC address causing the violation with the following commands:

cumulus@switch:~$ sudo ip link set swp1 protodown_reason portsecurity off
cumulus@switch:~$ sudo ip link set swp1 protodown off

Troubleshooting

To show port security configuration, run the nv show interface <interface-id> port-security command:

cumulus@switch:~$ nv show interface swp1 port-security
                   operational  applied
-----------------  -----------  --------
enable             on           on
mac-limit          32           32
sticky-mac         disabled     disabled
sticky-timeout     1800         1800
sticky-ageing      disabled     disabled
violation-mode     restrict     restrict
violation-timeout  30           30

mac-addresses
================
    entry-id  MAC address        Type     Status
    --------  -----------------  -------  ---------
    1         00:01:02:03:04:05
    2         00:02:00:00:00:ab  Static
    3         00:02:00:00:00:05  Static
    4         00:02:00:00:01:05  Static
    5         00:02:00:00:01:06  Static
    6         00:02:01:00:01:06  Static
    7         01:02:01:00:01:06  Static
    8         00:02:00:00:00:11  Dynamic  Installed

Layer 2

This section describes the following layer 2 configuration:

Link Layer Discovery Protocol

LLDP shows information about connected devices. The lldpd daemon implements the IEEE 802.1AB LLDP standard and starts at system boot.

LLDP in Cumulus Linux supports CDP (Cisco Discovery Protocol v1 and v2) and logs by default into /var/log/daemon.log with an lldpd prefix.

Enable or Disable LLDP

Cumulus Linux enables the lldp service by default.

You can disable LLDP globally or on an interface.

To disable LLDP globally:

cumulus@leaf01:~$ nv set service lldp state disabled 
cumulus@leaf01:~$ nv config apply

To re-enable LLDP globally, run the nv set service lldp state enabled command.

Stop the lldpd service:

cumulus@leaf01:~$ sudo systemctl stop lldpd
cumulus@leaf01:~$ sudo systemctl disable lldpd

To re-enable LLDP globally, enable and restart the lldpd service:

cumulus@leaf01:~$ sudo systemctl enable lldpd
cumulus@leaf01:~$ sudo systemctl restart lldpd

To disable LLDP on an interface:

cumulus@leaf01:~$ nv set interface swp1 lldp state disabled
cumulus@leaf01:~$ nv config apply

To re-enable LLDP on an interface, run the nv set interface swp1 lldp state enabled command.

Create the /etc/lldpd.d/lldp-interfaces.conf file and add the configure system interface pattern-blacklist option. The following example disables LLDP on swp1 and swp2:

cumulus@leaf01:~$ sudo nano /etc/lldpd.d/lldp-interfaces.conf
configure system interface pattern-blacklist swp1,swp2

An alternative method is to use the system interface pattern keyword to send LLDP on all interfaces except for swp1 and swp2:

cumulus@leaf01:~$ sudo nano /etc/lldpd.d/lldp-interfaces.conf
configure system interface pattern eth*,swp*,!swp1,!swp2

Restart the lldpd service for the changes to take effect:

cumulus@leaf01:~$ sudo systemctl restart lldpd
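
To see which interfaces a pattern such as eth*,swp*,!swp1,!swp2 selects, you can model the matching with shell-style wildcards. The following Python sketch is an approximation, not the lldpd implementation: it treats an interface as active when it matches at least one plain pattern and no pattern prefixed with an exclamation mark, and it does not model the !! prefix that lldpd also accepts.

```python
from fnmatch import fnmatchcase


def lldp_active(interface, pattern):
    """Approximate how a pattern such as eth*,swp*,!swp1,!swp2 selects ports.

    An interface is active when it matches at least one plain pattern and
    no pattern prefixed with "!". The "!!" prefix is not modeled here.
    """
    parts = [p.strip() for p in pattern.split(",") if p.strip()]
    excluded = [p[1:] for p in parts if p.startswith("!")]
    included = [p for p in parts if not p.startswith("!")]
    if any(fnmatchcase(interface, p) for p in excluded):
        return False
    return any(fnmatchcase(interface, p) for p in included)


pattern = "eth*,swp*,!swp1,!swp2"
for name in ["eth0", "swp1", "swp2", "swp3", "swp10"]:
    print(name, lldp_active(name, pattern))
```

Note that !swp1 excludes only swp1; an interface such as swp10 still matches swp* and stays active.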

A runtime configuration does not persist when you reboot the switch; you lose all changes.

To configure active interfaces:

cumulus@leaf01:~$ sudo lldpcli configure system interface pattern "swp*"

To configure inactive interfaces:

cumulus@leaf01:~$ sudo lldpcli configure system interface pattern '*,!eth0,swp*'

The active interface list always overrides the inactive interface list.

To reset any interface list to none:

cumulus@leaf01:~$ sudo lldpcli configure system interface pattern ""

To show if LLDP is enabled globally or on an interface, run the nv show service lldp command.

cumulus@leaf01:~$ nv show service lldp
                        operational  applied
----------------------  -----------  -------
tx-interval             30           30     
tx-hold-multiplier      4            4      
dot1-tlv                off          off    
lldp-med-inventory-tlv  off          off    
mode                    default      default
state                   enabled      disabled

The following example shows that swp1 through swp4 are up and advertising LLDP between leaf01 and leaf02:

cumulus@leaf01:~$ sudo lldpctl | egrep 'Inter|Port|SysName'
Interface:    eth0, via: LLDP, RID: 1, Time: 1 day, 03:07:48
    SysName:      oob-mgmt-switch
  Port:
    PortID:       ifname swp2
    PortDescr:    swp2
Interface:    swp3, via: LLDP, RID: 2, Time: 0 day, 06:52:48
    SysName:      leaf02
  Port:
    PortID:       ifname swp3
    PortDescr:    swp3
Interface:    swp4, via: LLDP, RID: 2, Time: 0 day, 00:07:38
    SysName:      leaf02
  Port:
    PortID:       ifname swp4
    PortDescr:    swp4

The following example shows that after disabling LLDP on swp1 and swp2, only swp3 and swp4 are generating and receiving LLDP on leaf01. leaf02 is only receiving LLDP on swp3 and swp4 from leaf01:

cumulus@leaf02:~$ sudo lldpctl | egrep 'Inter|Port|SysName'
Interface:    eth0, via: LLDP, RID: 2, Time: 0 day, 00:09:16
    SysName:      oob-mgmt-switch
  Port:
    PortID:       ifname swp3
    PortDescr:    swp3
Interface:    swp3, via: LLDP, RID: 1, Time: 0 day, 00:08:47
    SysName:      leaf01
  Port:
    PortID:       ifname swp3
    PortDescr:    swp3
Interface:    swp4, via: LLDP, RID: 1, Time: 0 day, 00:09:16
    SysName:      leaf01
  Port:
    PortID:       ifname swp4
    PortDescr:    swp4
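
If you want to compare neighbors programmatically, you can parse the filtered lldpctl output shown above. The following Python sketch is one possible approach; the function name parse_neighbors is hypothetical, and the parser assumes the exact line layout of the examples in this section (an Interface line followed by SysName, PortID, and PortDescr lines).

```python
import re

# Sample text copied from the filtered lldpctl output format shown above.
SAMPLE = """\
Interface:    swp3, via: LLDP, RID: 2, Time: 0 day, 06:52:48
    SysName:      leaf02
  Port:
    PortID:       ifname swp3
    PortDescr:    swp3
Interface:    swp4, via: LLDP, RID: 2, Time: 0 day, 00:07:38
    SysName:      leaf02
  Port:
    PortID:       ifname swp4
    PortDescr:    swp4
"""


def parse_neighbors(text):
    """Map each local interface to its neighbor's SysName, PortID, and PortDescr."""
    neighbors = {}
    current = None
    for line in text.splitlines():
        match = re.match(r"\s*(\w+):\s*(.*)$", line)
        if not match:
            continue
        key, value = match.groups()
        if key == "Interface":
            # The local interface name is the first comma-separated field.
            current = value.split(",")[0].strip()
            neighbors[current] = {}
        elif current is not None and key in ("SysName", "PortID", "PortDescr"):
            neighbors[current][key] = value.strip()
    return neighbors


for ifname, info in parse_neighbors(SAMPLE).items():
    print(ifname, info)
```

In practice, you would pass the output of the lldpctl command to parse_neighbors instead of the SAMPLE string.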

Configure LLDP Timers

You can configure the frequency of LLDP updates (between 5 and 32768 seconds) and the amount of time (between 1 and 8192 seconds) to hold the information before discarding it. The hold time interval is a multiple of the tx-interval.

The nv show commands reflect certain configuration changes in operational values only after the hold time interval.

The following example commands set the LLDP update interval (tx-interval) to 100 seconds and the hold time multiplier (tx-hold-multiplier) to 3.

cumulus@switch:~$ nv set service lldp tx-interval 100
cumulus@switch:~$ nv set service lldp tx-hold-multiplier 3
cumulus@switch:~$ nv config apply

Create the /etc/lldpd.conf file or create a file in the /etc/lldpd.d/ directory with a .conf suffix and add the timers:

cumulus@switch:~$ sudo nano /etc/lldpd.conf
configure lldp tx-interval 100
configure lldp tx-hold 3
...

Restart the lldpd service for the changes to take effect:

cumulus@switch:~$ sudo systemctl restart lldpd
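
Because the hold time interval is a multiple of the tx-interval, you can work out how long a neighbor keeps the information before discarding it. The following Python sketch shows the arithmetic; the function name lldp_hold_time is hypothetical, and the sketch assumes that the 1 to 8192 range applies to the multiplier value.

```python
def lldp_hold_time(tx_interval, tx_hold_multiplier):
    """Return the hold time in seconds for the given LLDP timer settings."""
    if not 5 <= tx_interval <= 32768:
        raise ValueError("tx-interval must be between 5 and 32768 seconds")
    # Assumption: the documented 1-8192 range applies to the multiplier.
    if not 1 <= tx_hold_multiplier <= 8192:
        raise ValueError("tx-hold-multiplier must be between 1 and 8192")
    return tx_interval * tx_hold_multiplier


# Values from the example commands above: 100 * 3 seconds.
print(lldp_hold_time(100, 3))
```

With the example values above (tx-interval 100, tx-hold-multiplier 3), the hold time is 300 seconds.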

SNMP Subagent

The SNMP subagent allows SNMP queries to retrieve LLDP information from the lldpd service.

If you enable SNMP with NVUE commands, NVUE enables the SNMP subagent automatically. To disable the SNMP subagent, disable SNMP with the NVUE nv set system snmp-server state disable command.

If you use Linux commands to configure the switch, Cumulus Linux does not enable the SNMP subagent by default. To enable the SNMP subagent, edit the /etc/default/lldpd file and add the -x option:

cumulus@switch:~$ sudo nano /etc/default/lldpd

# Add "-x" to DAEMON_ARGS to start SNMP subagent

# Enable CDP by default
DAEMON_ARGS="-c -x -M 4"

Restart the lldpd service for the changes to take effect:

cumulus@switch:~$ sudo systemctl restart lldpd

  • The -c option enables backwards compatibility with CDP. See Change CDP Settings below.
  • The -M 4 option sends a field in discovery packets to indicate that the switch is a network device.

Change CDP Settings

Cumulus Linux supports CDP so that the switch can advertise information about itself to Cisco routers that do not support LLDP. By default, the Cumulus Linux switch sends CDP packets only if the peer sends CDP packets. You can change this setting by replacing -c in the /etc/default/lldpd file with one of the following options:

Option  Description
------  -----------
-cc     The Cumulus Linux switch sends CDPv1 packets even when there is no detected CDP peer.
-ccc    The Cumulus Linux switch sends CDPv2 packets even when there is no detected CDP peer.
-cccc   The Cumulus Linux switch disables CDPv1 and enables CDPv2.
-ccccc  The Cumulus Linux switch disables CDPv1 and forces CDPv2.

The following example changes the CDP setting to -ccc so that the switch sends CDPv2 packets even when there is no detected CDP peer:

cumulus@switch:~$ sudo nano /etc/default/lldpd
...
# Enable CDP by default
DAEMON_ARGS="-ccc -x -M 4"

You must restart the lldpd service for the changes to take effect.

cumulus@switch:~$ sudo systemctl restart lldpd
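
If you manage the /etc/default/lldpd file with a script, the replacement of the CDP option can be sketched as follows. This Python example only manipulates the DAEMON_ARGS string; the function name set_cdp_option is hypothetical, and the sketch assumes that the CDP option always appears as a standalone token of one to five c characters, as in the examples above.

```python
import re

# A CDP option is "-c" followed by up to four more "c" characters.
CDP_FLAG = re.compile(r"-c{1,5}")


def set_cdp_option(daemon_args, cdp_flag):
    """Return daemon_args with its CDP option replaced by cdp_flag."""
    if not CDP_FLAG.fullmatch(cdp_flag):
        raise ValueError("expected an option between -c and -ccccc")
    tokens = daemon_args.split()
    replaced = [cdp_flag if CDP_FLAG.fullmatch(t) else t for t in tokens]
    if cdp_flag not in replaced:
        # No CDP option was present; add the requested one at the front.
        replaced.insert(0, cdp_flag)
    return " ".join(replaced)


print(set_cdp_option("-c -x -M 4", "-ccc"))
```

The other options in DAEMON_ARGS (such as -x and -M 4) are left unchanged. You still need to restart the lldpd service after you write the new value to the file.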

Set LLDP Mode

By default, the lldpd service sends LLDP frames unless it detects a CDP peer, in which case it sends CDP frames. You can change this behavior and configure the lldpd service to send only CDP frames or only LLDP frames.

  • You configure the lldpd service to send only CDP or only LLDP frames globally for all interfaces; you cannot configure these settings for specific interfaces.
  • If you configure the lldpd service to send only CDP frames (CDPv1 or CDPv2), LLDP DCBX TLV transmission for QOS ROCE is not supported.

To send only CDPv1 frames:

cumulus@switch:~$ nv set service lldp mode force-send-cdpv1
cumulus@switch:~$ nv config apply

To send only CDPv2 frames:

cumulus@switch:~$ nv set service lldp mode force-send-cdpv2
cumulus@switch:~$ nv config apply

To send only LLDP frames:

cumulus@switch:~$ nv set service lldp mode force-send-lldp
cumulus@switch:~$ nv config apply

To reset to the default setting (to send both CDP and LLDP frames):

cumulus@switch:~$ nv set service lldp mode default
cumulus@switch:~$ nv config apply

Edit the /etc/default/lldpd file and add one of the following options to the DAEMON_ARGS section:

To send only CDPv1 frames:

cumulus@switch:~$ sudo nano /etc/default/lldpd
...
DAEMON_ARGS="-cc -ll -M 4"

To send only CDPv2 frames:

cumulus@switch:~$ sudo nano /etc/default/lldpd
...
DAEMON_ARGS="-cccc -ll -M 4"

To send only LLDP frames:

cumulus@switch:~$ sudo nano /etc/default/lldpd
...
DAEMON_ARGS="-l -M 4"

To reset to the default setting (to send both CDP and LLDP frames):

cumulus@switch:~$ sudo nano /etc/default/lldpd
...
DAEMON_ARGS="-c -M 4"

You must restart the lldpd service for the changes to take effect.

cumulus@switch:~$ sudo systemctl restart lldpd

To show the current LLDP mode, run the nv show service lldp command. The following example shows that the lldpd service sends CDPv2 frames only.

cumulus@leaf02:mgmt:~$ nv show service lldp
                    operational       applied
------------------  ----------------  ----------------
dot1-tlv            off               off
mode                force-send-cdpv2  force-send-cdpv2
tx-hold-multiplier  4                 4
tx-interval         30                30

CDP PortID Behavior

LLDP emulates CDP by default on interfaces where it detects a CDP neighbor. CDP only supports the PortID value (TLV) in the protocol; however, LLDP has separate PortID and PortDescription fields.

By default, when the switch sends a CDP packet, if there is an alias (description) on the interface, LLDP sends the alias in the CDP PortID field. When an LLDP neighbor receives a CDP packet, the receiving switch displays the CDP Port ID in both the PortID and PortDescription fields.

If you want LLDP to send the portID (ifname) value to a CDP neighbor instead of the interface alias, you can configure the following options:

The following table shows the TLVs sent for each configuration.

LLDP Configuration LLDP PortID Value Sent LLDP Port Description Value Sent CDP PortID Value Sent
configure lldp portidsubtype ifname (default) interface ifname interface alias interface alias
configure lldp portidsubtype macaddress MAC address interface mac address interface ifname interface ifname

Use CDP only or LLDP only to get the desired behavior of PortID, Description, or MacAddress (LLDP only) across all neighbors. For more information, see LLDP Mode.

LLDP DCBX TLVs

DCBX is an extension of LLDP that supports TLVs to provide additional information in LLDP packets to peers.

Cumulus Linux supports the following LLDP DCBX TLVs:

  • You can send a maximum of 250 VLANs per switch port in one LLDP frame.
  • Cumulus Linux does not support CEE DCBX TLVs.
  • Cumulus Linux limits DCBX support to enabling DCBX TLVs (either with ROCE global configuration or per interface) as documented in the IEEE 802.1Q standard.

IEEE 802.1 TLVs

You can transmit the following IEEE 802.1 TLVs when exchanging LLDP messages. By default, IEEE 802.1 TLV transmission is off and the switch sends all LLDP frames without IEEE 802.1 TLVs.

Name              Subtype  Description
----------------  -------  -----------
Port VLAN ID      1        The port VLAN identifier.
VLAN Name         3        The name of any VLAN to which the port belongs.
Link Aggregation  7        Indicates if the port supports link aggregation and if it is on.

To enable IEEE 802.1 TLV transmission, run the nv set service lldp dot1-tlv on command:

cumulus@switch:~$ nv set service lldp dot1-tlv on
cumulus@switch:~$ nv config apply

To disable IEEE 802.1 TLV transmission, run the nv unset service lldp dot1-tlv command.

To show if IEEE 802.1 TLV transmission is on, run the NVUE nv show service lldp command:

cumulus@leaf01:mgmt:~$ nv show service lldp
                        operational  applied
----------------------  -----------  -------
tx-interval             30           30     
tx-hold-multiplier      4            4      
dot1-tlv                off          off   
...

IEEE 802.3 TLVs

Cumulus Linux transmits the following IEEE 802.3 TLVs by default. You do not need to enable them.

Name                Subtype  Description
------------------  -------  -----------
Link Aggregation    3        Indicates if the port supports link aggregation and if it is on.
Maximum Frame Size  4        The MTU configuration on the port. The MTU on the port is the MFS.

QoS TLVs

Adding QoS configuration as part of the DCBX TLVs allows automated configuration on hosts and switches that connect to the switch.

You can transmit the following QoS TLVs. By default, all QoS TLV transmission is off on all interfaces.

Name                Subtype  Description
------------------  -------  -----------
ETS Configuration   9        The ETS configuration settings on the switch.
ETS Recommendation  A        The recommended ETS settings that the switch wants the connected peer interface to use.
PFC Configuration   B        The PFC configuration settings on the switch.

Adding the QoS TLVs to LLDP packets on an interface relies on PFC and ETS configuration from switchd. Refer to Quality of Service for information on configuring PFC and ETS.

When you enable ROCE on the switch:

  • QoS TLV transmission (PFC Configuration, ETS Configuration, and ETS Recommendation) is on globally for all ports, which overrides any QoS TLV transmission setting on a switch port interface.
  • LLDP frames for all switch port interfaces carry PFC configuration, ETS configuration, ETS recommendation, and APP Priority TLVs. The ETS configuration and PFC configuration TLV payloads are the same for all interfaces.

Enable QoS TLV Transmission

To enable PFC Configuration TLV transmission, run the nv set interface <interface> lldp dcbx-pfc-tlv on command:

cumulus@switch:~$ nv set interface swp1 lldp dcbx-pfc-tlv on
cumulus@switch:~$ nv config apply

To enable ETS Configuration TLV transmission, run the nv set interface <interface> lldp dcbx-ets-config-tlv on command:

cumulus@switch:~$ nv set interface swp1 lldp dcbx-ets-config-tlv on
cumulus@switch:~$ nv config apply 

To enable ETS Recommendation TLV transmission, run the nv set interface <interface> lldp dcbx-ets-recomm-tlv on command:

cumulus@switch:~$ nv set interface swp1 lldp dcbx-ets-recomm-tlv on
cumulus@switch:~$ nv config apply

The interface must be a physical interface; you cannot enable TLVs on bonds.

Disable QoS TLV Transmission

To disable PFC Configuration TLV transmission, run the nv unset interface <interface> lldp dcbx-pfc-tlv command:

cumulus@switch:~$ nv unset interface swp1 lldp dcbx-pfc-tlv
cumulus@switch:~$ nv config apply

To disable ETS Configuration TLV transmission, run the nv unset interface <interface> lldp dcbx-ets-config-tlv command:

cumulus@switch:~$ nv unset interface swp1 lldp dcbx-ets-config-tlv
cumulus@switch:~$ nv config apply 

To disable ETS Recommendation TLV transmission, run the nv unset interface <interface> lldp dcbx-ets-recomm-tlv command:

cumulus@switch:~$ nv unset interface swp1 lldp dcbx-ets-recomm-tlv
cumulus@switch:~$ nv config apply

Show QoS TLV Transmission Settings

To show if QoS TLV transmission is on for an interface, run the NVUE nv show interface <interface> command:

cumulus@leaf01:mgmt:~$ nv show interface swp1
                          operational        applied      description
------------------------  -----------------  -----------  ---------------------------------------------------
...
lldp
  dcbx-ets-config-tlv                        on           DCBX ETS config TLV flag
  dcbx-ets-recomm-tlv                        off          DCBX ETS recommendation TLV flag
  dcbx-pfc-tlv                               on           DCBX PFC TLV flag
... 

LLDP-MED Inventory TLVs

LLDP-MED is an extension to LLDP that operates between endpoint devices, such as IP phones, and switches. The inventory management TLV enables an endpoint to transmit detailed inventory information about itself to the switch, such as the manufacturer, model, firmware, and serial number.

To enable LLDP-MED inventory TLV transmission, run the nv set service lldp lldp-med-inventory-tlv on command:

cumulus@switch:~$ nv set service lldp lldp-med-inventory-tlv on
cumulus@switch:~$ nv config apply

To disable LLDP-MED inventory TLV transmission, run the nv unset service lldp lldp-med-inventory-tlv command.

To show if LLDP-MED Inventory TLV transmission is on, run the NVUE nv show service lldp command:

cumulus@leaf01:mgmt:~$ nv show service lldp
                        operational  applied
----------------------  -----------  -------
tx-interval             30           30     
tx-hold-multiplier      4            4      
dot1-tlv                off          off    
lldp-med-inventory-tlv  on           on     
...

Application Priority TLVs

DCBX Application priority TLVs allow hosts to receive per-application priority values in LLDP packets.

Cumulus Linux supports application priority TLVs for:

Enable Application Priority TLV Transmission

To enable application priority TLV transmission, run NVUE commands to set:

  • You cannot enable application priority TLV transmission on bonds.
  • You can configure a maximum of 10 application TLV priorities on the switch.
  • Cumulus Linux can send a maximum of 10 application priority TLVs in an LLDP PDU.

The following example sets the application priority of iSCSI traffic to 3 in the application priority TLV sent in LLDP PDUs on swp1.

cumulus@switch:~$ nv set service lldp application-tlv app iSCSI priority 3
cumulus@switch:~$ nv set interface swp1 lldp application-tlv app iSCSI
cumulus@switch:~$ nv config apply

The following example sets the application priority of NVMe traffic using TCP port 4420 to 5 in the application priority TLV sent in LLDP PDUs on swp1.

cumulus@switch:~$ nv set service lldp application-tlv app NVME_4420 priority 5
cumulus@switch:~$ nv set interface swp1 lldp application-tlv app NVME_4420
cumulus@switch:~$ nv config apply

The following example sets the application priority of NVMe traffic using TCP port 8009 to 7 in the application priority TLV sent in LLDP PDUs on swp1.

cumulus@switch:~$ nv set service lldp application-tlv app NVME_8009 priority 7
cumulus@switch:~$ nv set interface swp1 lldp application-tlv app NVME_8009
cumulus@switch:~$ nv config apply

The following example sets the application priority for TCP traffic using port 4217 to 6 in the application priority TLV sent in LLDP PDUs on swp1.

cumulus@switch:~$ nv set service lldp application-tlv tcp-port 4217 priority 6
cumulus@switch:~$ nv set interface swp1 lldp application-tlv tcp-port 4217
cumulus@switch:~$ nv config apply

The following example sets the application priority for UDP traffic using port 4317 to 4 in the application priority TLV sent in LLDP PDUs on swp1.

cumulus@switch:~$ nv set service lldp application-tlv udp-port 4317 priority 4
cumulus@switch:~$ nv set interface swp1 lldp application-tlv udp-port 4317
cumulus@switch:~$ nv config apply

The following example sets the application priority of iSCSI traffic using port 3260 to 0 (the default priority) in the application priority TLV sent in LLDP PDUs on swp1.

cumulus@switch:~$ nv set interface swp1 lldp application-tlv app iSCSI
cumulus@switch:~$ nv config apply

Disable Application Priority TLV Transmission

To stop LLDP from sending PDUs with application priority TLVs on an interface, unset the interface configuration; for example:

cumulus@switch:~$ nv unset interface swp1 lldp application-tlv
cumulus@switch:~$ nv config apply

The following example unsets application priority 3 for iSCSI, then disables transmission of the application priority TLVs on swp1.

cumulus@switch:~$ nv unset service lldp application-tlv app iSCSI priority 3
cumulus@switch:~$ nv unset interface swp1 lldp application-tlv app iSCSI
cumulus@switch:~$ nv config apply

The following example unsets application priority 5 for NVMe using TCP port 4420, then disables transmission of the application priority TLVs on swp1.

cumulus@switch:~$ nv unset service lldp application-tlv app NVME_4420 priority 5
cumulus@switch:~$ nv unset interface swp1 lldp application-tlv app NVME_4420
cumulus@switch:~$ nv config apply

The following example unsets application priority 7 for NVMe using TCP port 8009, then disables transmission of the application priority TLVs on swp1.

cumulus@switch:~$ nv unset service lldp application-tlv app NVME_8009 priority 7
cumulus@switch:~$ nv unset interface swp1 lldp application-tlv app NVME_8009
cumulus@switch:~$ nv config apply

The following example unsets application priority 6 for the application using TCP port 4217, then disables transmission of application priority TLVs on swp1:

cumulus@switch:~$ nv unset service lldp application-tlv tcp-port 4217 priority 6
cumulus@switch:~$ nv unset interface swp1 lldp application-tlv tcp-port 4217
cumulus@switch:~$ nv config apply

The following example unsets application priority 4 for the application using UDP port 4317, then disables transmission of application priority TLVs on swp1:

cumulus@switch:~$ nv unset service lldp application-tlv udp-port 4317 priority 4
cumulus@switch:~$ nv unset interface swp1 lldp application-tlv udp-port 4317
cumulus@switch:~$ nv config apply

The following example unsets application priority 0 (the default priority) for iSCSI using TCP port 3260 and disables transmission of application TLVs on swp1.

cumulus@switch:~$ nv unset interface swp1 lldp application-tlv app iSCSI
cumulus@switch:~$ nv config apply

Show Application Priority TLV Settings

To show all application priority TLV configuration on the switch:

cumulus@switch:~$ nv show service lldp application-tlv
udp-port
===========
    Port  priority
    ----  --------
    4317  4       

tcp-port
===========
    Port  priority
    ----  --------
    4217  6       
app
======
    AppName    priority
    ---------  --------
    NVME_4420  5       
    NVME_8009  7       
    iSCSI      3

To show all the application TLVs configured on an interface:

cumulus@switch:~$ nv show interface swp1 lldp application-tlv 
            operational  applied  
----------  -----------  ---------
[udp-port]  4317         4317     
[tcp-port]  4217         4217     
[app]       NVME_4420    NVME_4420
[app]       NVME_8009    NVME_8009
[app]       iSCSI        iSCSI 

To show the UDP port priority mapping:

cumulus@switch:~$ nv show service lldp application-tlv udp-port
Port  priority
----  --------
4317  4 

To show the application priority mapping:

cumulus@switch:~$ nv show service lldp application-tlv app
AppName    priority
---------  --------
NVME_4420  5       
NVME_8009  7       
iSCSI      3 

To show the TCP port priority mapping:

cumulus@switch:~$ nv show service lldp application-tlv tcp-port
Port  priority
----  --------
4217  6

To show the UDP port priority mapping for swp1:

cumulus@switch:~$ nv show interface swp1 lldp application-tlv udp-port
Ports
-----
4317

To show the application names that have application priority TLVs enabled for swp1:

cumulus@switch:~$ nv show interface swp1 lldp application-tlv app
AppName  
---------
NVME_4420
NVME_8009
iSCSI

Troubleshooting

You can use the lldpcli tool to query the lldpd daemon for neighbors, statistics, and other running configuration information. See man lldpcli(8) for details.

To show all neighbors on all ports and interfaces:

cumulus@switch:~$ sudo lldpcli show neighbors
-------------------------------------------------------------------------------
LLDP neighbors:
-------------------------------------------------------------------------------
Interface:    eth0, via: LLDP, RID: 1, Time: 0 day, 17:38:08
  Chassis:
    ChassisID:    mac 08:9e:01:e9:66:5a
    SysName:      PIONEERMS22
    SysDescr:     Cumulus Linux version 4.1.0 running on quanta lb9
    MgmtIP:       192.168.0.22
    Capability:   Bridge, on
    Capability:   Router, on
  Port:
    PortID:       ifname swp47
    PortDescr:    swp47
-------------------------------------------------------------------------------
Interface:    swp1, via: LLDP, RID: 10, Time: 0 day, 17:08:27
  Chassis:
    ChassisID:    mac 00:01:00:00:09:00
    SysName:      MSP-1
    SysDescr:     Cumulus Linux version 4.1.0 running on QEMU Standard PC (i440FX + PIIX, 1996)
    MgmtIP:       192.0.2.9
    MgmtIP:       fe80::201:ff:fe00:900
    Capability:   Bridge, off
    Capability:   Router, on
  Port:
    PortID:       ifname swp1
    PortDescr:    swp1
-------------------------------------------------------------------------------
Interface:    swp2, via: LLDP, RID: 10, Time: 0 day, 17:08:27
  Chassis:
    ChassisID:    mac 00:01:00:00:09:00
    SysName:      MSP-1
    SysDescr:     Cumulus Linux version 4.1.0 running on QEMU Standard PC (i440FX + PIIX, 1996)
    MgmtIP:       192.0.2.9
    MgmtIP:       fe80::201:ff:fe00:900
    Capability:   Bridge, off
    Capability:   Router, on
  Port:
    PortID:       ifname swp2
    PortDescr:    swp2
-------------------------------------------------------------------------------
Interface:    swp3, via: LLDP, RID: 11, Time: 0 day, 17:08:27
  Chassis:
    ChassisID:    mac 00:01:00:00:0a:00
    SysName:      MSP-2
    SysDescr:     Cumulus Linux version 4.1.0 running on QEMU Standard PC (i440FX + PIIX, 1996)
    MgmtIP:       192.0.2.10
    MgmtIP:       fe80::201:ff:fe00:a00
    Capability:   Bridge, off
    Capability:   Router, on
  Port:
    PortID:       ifname swp1
    PortDescr:    swp1
...

To show lldpd statistics for all ports:

cumulus@switch:~$ sudo lldpcli show statistics
----------------------------------------------------------------------
LLDP statistics:
----------------------------------------------------------------------
Interface:    eth0
  Transmitted:  9423
  Received:     17634
  Discarded:    0
  Unrecognized: 0
  Ageout:       10
  Inserted:     20
  Deleted:      10
--------------------------------------------------------------------
Interface:    swp1
  Transmitted:  9423
  Received:     6264
  Discarded:    0
  Unrecognized: 0
  Ageout:       0
  Inserted:     2
  Deleted:      0
---------------------------------------------------------------------
Interface:    swp2
  Transmitted:  9423
  Received:     6264
  Discarded:    0
  Unrecognized: 0
  Ageout:       0
  Inserted:     2
  Deleted:      0
---------------------------------------------------------------------
Interface:    swp3
  Transmitted:  9423
  Received:     6265
  Discarded:    0
  Unrecognized: 0
  Ageout:       0
  Inserted:     2
  Deleted:      0
----------------------------------------------------------------------
...

To show a summary of lldpd statistics for all ports:

cumulus@switch:~$ sudo lldpcli show statistics summary
---------------------------------------------------------------------
LLDP Global statistics:
---------------------------------------------------------------------
Summary of stats:
  Transmitted:  648186
  Received:     437557
  Discarded:    0
  Unrecognized: 0
  Ageout:       10
  Inserted:     38
  Deleted:      10

To show the running LLDP configuration:

cumulus@switch:~$ sudo lldpcli show running-configuration
--------------------------------------------------------------------
Global configuration:
--------------------------------------------------------------------
Configuration:
  Transmit delay: 30
  Transmit hold: 4
  Receive mode: no
  Pattern for management addresses: (none)
  Interface pattern: (none)
  Interface pattern blacklist: (none)
  Interface pattern for chassis ID: (none)
  Override description with: (none)
  Override platform with: Linux
  Override system name with: (none)
  Advertise version: yes
  Update interface descriptions: no
  Promiscuous mode on managed interfaces: no
  Disable LLDP-MED inventory: yes
  LLDP-MED fast start mechanism: yes
  LLDP-MED fast start interval: 1
  Source MAC for LLDP frames on bond slaves: local
  Portid TLV Subtype for lldp frames: ifname
--------------------------------------------------------------------

Considerations

Ethernet Bridging - VLANs

Ethernet bridges enable hosts to communicate through layer 2 by connecting the physical and logical interfaces in the system into a single layer 2 domain. The bridge is a logical interface with a MAC address and an MTU. The bridge MTU is the minimum MTU among all its members.

  • Bridge members can be individual physical interfaces, bonds, or logical interfaces that traverse an 802.1Q VLAN trunk.
  • Cumulus Linux does not put all ports into a bridge by default.

Ethernet Bridge Types

The Cumulus Linux bridge driver supports two configuration modes: one that is VLAN-aware and one that follows a more traditional Linux bridge model.

NVIDIA recommends that you use VLAN-aware mode bridges instead of traditional mode bridges. The Cumulus Linux bridge driver is capable of VLAN filtering, which allows for configurations that are similar to incumbent network devices. For a comparison of traditional and VLAN-aware modes, see this knowledge base article.

You can configure both VLAN-aware and traditional mode bridges on the same network in Cumulus Linux.

Bridge MAC Addresses

The switch learns the source MAC address of a frame when the frame enters the bridge through an interface and records the MAC address in the bridge table. The bridge forwards the frame to its intended destination by looking up the destination MAC address. Cumulus Linux maintains the MAC entry for 1800 seconds (30 minutes). If the switch sees a frame with the same source MAC address before the MAC entry age expires, it refreshes the MAC entry age; if the MAC entry age expires, the switch deletes the MAC address from the bridge table.

The following example NVUE command output shows a MAC address table for the bridge.

cumulus@switch:~$ nv show bridge domain br_default mac-table
     age    bridge-domain  entry-type  interface   last-update  mac                src-vni  vlan  vni  Summary
---  -----  -------------  ----------  ----------  -----------  -----------------  -------  ----  ---  ----------------------
+ 0  87699  br_default     permanent   bond3       87699        44:38:39:00:00:35
+ 1  87699  br_default     permanent   bond1       87699        44:38:39:00:00:31
+ 2  87699  br_default     permanent   bond2       87699        44:38:39:00:00:33
+ 3                        permanent   br_default               00:00:00:00:00:10
+ 4                        permanent   br_default               00:00:00:00:00:20
+ 5                        permanent   br_default               00:00:00:00:00:30
+ 6  84130  br_default     permanent   br_default  84130        44:38:39:22:01:b1           30
+ 7  87570  br_default     permanent   vxlan48     87570        42:ff:4d:82:c9:99
+ 8  84130                 permanent   vxlan48     84130        00:00:00:00:00:00  10                  remote-dst: 224.0.0.10
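
The learning and aging behavior described above can be modeled with a small Python sketch. This is a conceptual illustration only, not how the bridge driver implements the table; the class name MacTable and its methods are hypothetical, and time is passed in explicitly as a number of seconds to keep the example deterministic.

```python
AGEING_SECONDS = 1800  # MAC entry lifetime described above (30 minutes)


class MacTable:
    """Toy model of a bridge MAC table with aging."""

    def __init__(self, ageing=AGEING_SECONDS):
        self.ageing = ageing
        self.entries = {}  # (mac, vlan) -> (interface, last_seen)

    def learn(self, mac, vlan, interface, now):
        # Learning a known source MAC address refreshes the entry age.
        self.entries[(mac, vlan)] = (interface, now)

    def expire(self, now):
        stale = [key for key, (_, seen) in self.entries.items()
                 if now - seen >= self.ageing]
        for key in stale:
            del self.entries[key]

    def lookup(self, mac, vlan, now):
        """Return the egress interface, or None if the entry is unknown or aged out."""
        self.expire(now)
        entry = self.entries.get((mac, vlan))
        return entry[0] if entry else None


table = MacTable()
table.learn("44:38:39:00:00:31", 10, "bond1", now=0)
table.learn("44:38:39:00:00:31", 10, "bond1", now=1000)  # refresh
print(table.lookup("44:38:39:00:00:31", 10, now=2500))
print(table.lookup("44:38:39:00:00:31", 10, now=2800))
```

In this model, the lookup at 2500 seconds still finds the entry because the refresh at 1000 seconds restarted the age; the lookup at 2800 seconds does not, because 1800 seconds have passed since the last refresh.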

bridge fdb Command Output

The Linux bridge fdb command interacts with the FDB, which the bridge uses to store the MAC addresses it learns and the ports on which it learns those MAC addresses. The bridge fdb show command output contains some specific keywords:

Keyword       Description
------------  -----------
self          The FDB entry belongs to the FDB on the device that the dev field of the entry references.
              For example, this FDB entry belongs to the VXLAN device vx-1000:
              00:02:00:00:00:08 dev vx-1000 dst 27.0.0.10 self
master        The FDB entry belongs to the FDB on the device’s master and the FDB entry is pointing to a master’s port.
              For example, this FDB entry is from the master device named bridge and is pointing to the VXLAN bridge port vx-1001:
              02:02:00:00:00:08 dev vx-1001 vlan 1001 master bridge
extern_learn  An external control plane, such as the BGP control plane for EVPN, manages (offloads) the FDB entry.

The following example shows the bridge fdb show command output:

cumulus@switch:~$ bridge fdb show | grep 02:02:00:00:00:08
02:02:00:00:00:08 dev vx-1001 vlan 1001 extern_learn master bridge
02:02:00:00:00:08 dev vx-1001 dst 27.0.0.10 self extern_learn
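
To pick out the keywords described above in a script, you can split each line of bridge fdb show output into fields. The following Python sketch is one way to do this; the function name parse_fdb_line is hypothetical, and the parser assumes the layout of the two example lines above (a MAC address followed by keyword and value pairs and standalone flags).

```python
def parse_fdb_line(line):
    """Split one line of bridge fdb show output into fields and flags."""
    tokens = line.split()
    entry = {"mac": tokens[0], "flags": []}
    i = 1
    while i < len(tokens):
        token = tokens[i]
        # These keywords are followed by a value (for example, "master bridge").
        if token in ("dev", "vlan", "dst", "master") and i + 1 < len(tokens):
            entry[token] = tokens[i + 1]
            i += 2
        else:
            # Standalone keywords such as self and extern_learn.
            entry["flags"].append(token)
            i += 1
    return entry


lines = [
    "02:02:00:00:00:08 dev vx-1001 vlan 1001 extern_learn master bridge",
    "02:02:00:00:00:08 dev vx-1001 dst 27.0.0.10 self extern_learn",
]
for line in lines:
    print(parse_fdb_line(line))
```

For the first line, the sketch reports master as bridge with the extern_learn flag; for the second line, it reports the remote destination 27.0.0.10 with the self and extern_learn flags.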

Considerations

VLAN-aware Bridge Mode

VLAN-aware bridge mode in Cumulus Linux implements a configuration model for large-scale layer 2 environments, with a single instance of spanning tree protocol. Each physical bridge member port includes the list of allowed VLANs as well as the port VLAN ID, which is also known as the primary VLAN identifier (PVID) or native VLAN. MAC address learning, filtering, and forwarding are VLAN-aware. This reduces the configuration size and eliminates the large overhead of managing the port and VLAN instances as subinterfaces, replacing them with lightweight VLAN bitmaps and state updates.

Cumulus Linux supports multiple VLAN-aware bridges but with the following limitations:

Configure a VLAN-aware Bridge

The example commands below create a VLAN-aware bridge for STP that contains two switch ports and includes three VLANs: tagged VLANs 10 and 20, and untagged (native) VLAN 1.

With NVUE, there is a default bridge called br_default, which has no ports assigned. The example below configures this default bridge.

cumulus@switch:~$ nv set interface swp1-2 bridge domain br_default
cumulus@switch:~$ nv set bridge domain br_default vlan 10,20
cumulus@switch:~$ nv set bridge domain br_default untagged 1
cumulus@switch:~$ nv config apply

Edit the /etc/network/interfaces file and add the bridge:

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto br_default
iface br_default
    bridge-ports swp1 swp2
    bridge-vids 10 20
    bridge-pvid 1
    bridge-vlan-aware yes
...

Run the ifreload -a command to load the new configuration:

cumulus@switch:~$ sudo ifreload -a

The Primary VLAN Identifier (PVID) of the bridge defaults to 1. You do not have to specify bridge-pvid for a bridge or a port; specifying it does not change the configuration, but it makes the configuration easier for other users to read. The following configurations are identical to each other and to the configuration above:

auto br_default
iface br_default
    bridge-ports swp1 swp2
    bridge-vids 1 10 20
    bridge-vlan-aware yes

auto br_default
iface br_default
    bridge-ports swp1 swp2
    bridge-pvid 1
    bridge-vids 1 10 20
    bridge-vlan-aware yes

auto br_default
iface br_default
    bridge-ports swp1 swp2
    bridge-vids 10 20
    bridge-vlan-aware yes

  • If you specify bridge-vids or bridge-pvid at the bridge level, all ports in the bridge inherit these configurations. However, specifying any of these settings for a specific port overrides the setting in the bridge.
  • Do not bridge the management port eth0 with any switch ports. For example, if you create a bridge with eth0 and swp1, the bridge does not work correctly and disrupts access to the management interface.
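
The inheritance rule above, together with the default PVID of 1, can be sketched in Python. This is a conceptual model only, not ifupdown2 code; the function name effective_port_vlans is hypothetical, and the dictionaries stand in for the bridge and port stanzas in the /etc/network/interfaces file.

```python
def effective_port_vlans(bridge, port):
    """Resolve the PVID and tagged VLANs for one bridge port.

    A setting on the port overrides the same setting on the bridge;
    the PVID defaults to 1 when neither level specifies it.
    """
    pvid = port.get("bridge-pvid", bridge.get("bridge-pvid", 1))
    vids = set(port.get("bridge-vids", bridge.get("bridge-vids", [])))
    # The PVID is the untagged VLAN, so it is not listed as tagged.
    return {"pvid": pvid, "tagged": sorted(vids - {pvid})}


# Three bridge-level configurations that resolve to the same result.
configs = [
    {"bridge-vids": [1, 10, 20]},
    {"bridge-pvid": 1, "bridge-vids": [1, 10, 20]},
    {"bridge-vids": [10, 20]},
]
for bridge in configs:
    print(effective_port_vlans(bridge, {}))

# A port-level bridge-pvid overrides the bridge-level value.
print(effective_port_vlans({"bridge-vids": [10, 20]}, {"bridge-pvid": 100}))
```

In this model, the three bridge-level configurations all resolve to PVID 1 with tagged VLANs 10 and 20, and the port-level override changes only the PVID of that port.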

Configure Multiple VLAN-aware Bridges

This example shows the commands required to create two VLAN-aware bridges on the switch.

Bridges are independent, so you can reuse VLANs between bridges. Each VLAN-aware bridge maintains its own MAC address and VLAN tag table; MAC addresses and VLAN tags in one bridge are not visible to the other bridge.

cumulus@switch:~$ nv set interface swp1-2 bridge domain bridge1
cumulus@switch:~$ nv set bridge domain bridge1 vlan 10,20
cumulus@switch:~$ nv set bridge domain bridge1 untagged 1
cumulus@switch:~$ nv set interface swp3 bridge domain bridge2
cumulus@switch:~$ nv set bridge domain bridge2 vlan 10
cumulus@switch:~$ nv set bridge domain bridge2 untagged 1
cumulus@switch:~$ nv config apply

Edit the /etc/network/interfaces file and add the bridge:

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto bridge1
iface bridge1
    bridge-ports swp1 swp2
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1

auto bridge2
iface bridge2
    bridge-ports swp3
    bridge-vlan-aware yes
    bridge-vids 10
    bridge-pvid 1
...

Run the ifreload -a command to load the new configuration:

cumulus@switch:~$ ifreload -a

NVIDIA Spectrum 1 switches support a maximum of 10000 VLAN elements. NVIDIA Spectrum-2 switches and later support a maximum of 15996 VLAN elements when warm restart mode is off or 7934 VLAN elements when warm restart mode is on. Cumulus Linux calculates the total number of VLAN elements as the number of VLANs times the number of configured bridges. For example, 6 bridges that each contain 2600 VLANs total 15600 VLAN elements.
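The VLAN element arithmetic can be checked with a short Python sketch. The limits are the ones quoted in the paragraph above; the function names are hypothetical, and the calculation assumes that every bridge carries the same number of VLANs.

```python
# Limits quoted in the paragraph above.
SPECTRUM1_MAX = 10000
SPECTRUM2_MAX_WARM_OFF = 15996
SPECTRUM2_MAX_WARM_ON = 7934


def vlan_elements(vlans_per_bridge, bridges):
    """Total VLAN elements = VLANs per bridge x number of bridges."""
    return vlans_per_bridge * bridges


def fits(total, spectrum1=False, warm_restart=False):
    """Check a total against the platform limit stated in the text."""
    if spectrum1:
        limit = SPECTRUM1_MAX
    elif warm_restart:
        limit = SPECTRUM2_MAX_WARM_ON
    else:
        limit = SPECTRUM2_MAX_WARM_OFF
    return total <= limit


total = vlan_elements(2600, 6)
print(total)                           # 15600
print(fits(total))                     # Spectrum-2 or later, warm restart off
print(fits(total, warm_restart=True))  # Spectrum-2 or later, warm restart on
```

In this sketch, the 15600 elements from the example fit on Spectrum-2 and later only when warm restart mode is off.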

On NVIDIA Spectrum-2 switches and later, if you enable multiple VLAN-aware bridges and want to use more VLAN elements than the default, you must update the number of VLAN elements in the /etc/mlx/datapath/broadcast_domains.conf file.

  • To specify the total number of bridge domains you want to use, uncomment and edit the broadcast_domain.max_vlans parameter. The default value is 6143 when warm restart mode is off or 4096 when warm restart mode is on.
  • To specify the total number of subinterfaces you want to use, uncomment and edit the broadcast_domain.max_subinterfaces parameter. The default value is 3872 when warm restart mode is off or 1872 when warm restart mode is on.

You must restart switchd with the systemctl restart switchd command to apply the configuration.

The number of broadcast_domain.max_vlans plus broadcast_domain.max_subinterfaces cannot exceed 15996.

Increasing the broadcast_domain.max_vlans parameter can affect layer 2 multicast scale support.
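Before you edit the file, you can sanity-check the two values against the combined limit. The following Python sketch only encodes the rule stated above; the function name check_broadcast_domains is hypothetical and the sketch does not read or write the broadcast_domains.conf file.

```python
MAX_COMBINED = 15996  # stated upper bound for max_vlans + max_subinterfaces


def check_broadcast_domains(max_vlans, max_subinterfaces):
    """Return the remaining headroom, or raise if the sum is too large."""
    total = max_vlans + max_subinterfaces
    if total > MAX_COMBINED:
        raise ValueError(
            f"max_vlans + max_subinterfaces = {total}, exceeds {MAX_COMBINED}"
        )
    return MAX_COMBINED - total


# Default values quoted above for warm restart mode off.
print(check_broadcast_domains(6143, 3872))  # 5981
```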

Reserved VLAN Range

For hardware data plane internal operations, the switching silicon requires VLANs for every physical port, Linux bridge, and layer 3 subinterface. Cumulus Linux reserves a range of VLANs by default; the reserved range is 3725-3999.

If the reserved VLAN range conflicts with any user-defined VLANs, you can modify the range. The new range must be a contiguous set of VLANs with IDs between 2 and 4094. For a single VLAN-aware bridge, the minimum size of the range is 2 VLANs. For multiple VLAN-aware bridges, the minimum size of the range is the number of VLAN-aware bridges on the system plus one.
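The constraints on a new reserved range can be expressed as a small validation helper. This Python sketch is an assumption-labeled illustration of the rules in the paragraph above, not a Cumulus Linux tool; the function name validate_reserved_range and its parameters are hypothetical.

```python
def validate_reserved_range(start, end, vlan_aware_bridges=1, user_vlans=()):
    """Return a list of problems with a proposed reserved VLAN range."""
    problems = []
    if not (2 <= start <= end <= 4094):
        problems.append("range must lie between 2 and 4094")
    size = end - start + 1
    # A single bridge needs at least 2 VLANs; several need bridges + 1.
    minimum = 2 if vlan_aware_bridges <= 1 else vlan_aware_bridges + 1
    if size < minimum:
        problems.append(f"range holds {size} VLANs, need at least {minimum}")
    clashes = sorted(v for v in user_vlans if start <= v <= end)
    if clashes:
        problems.append(f"conflicts with user-defined VLANs {clashes}")
    return problems


# Two VLAN-aware bridges, user VLANs 10 and 20: no problems reported.
print(validate_reserved_range(4064, 4094, vlan_aware_bridges=2, user_vlans=[10, 20]))
# The default range clashes with a user-defined VLAN 3800.
print(validate_reserved_range(3725, 3999, user_vlans=[3800]))
```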

The following example changes the reserved VLAN range to be between 4064 and 4094:

cumulus@switch:~$ nv set system global reserved vlan internal range 4064-4094
cumulus@switch:~$ nv config apply

  1. Edit the /etc/cumulus/switchd.conf file to uncomment the resv_vlan_range line and specify a new range.

    cumulus@switch:~$ sudo nano /etc/cumulus/switchd.conf
    ...
    # global reserved vlan internal range
    resv_vlan_range = 4064-4094
    
  2. After you save the file, you must restart switchd:

    cumulus@switch:~$ sudo systemctl restart switchd.service
    

Reserved Layer 3 VNI VLANs

In addition to the internal reserved VLAN range, Cumulus Linux allocates a reserved VLAN range for layer 3 VNIs in EVPN symmetric routing deployments. Use this reserved VLAN range when you configure layer 3 VNIs in MLAG environments with NVUE commands. The default range is 4000-4064. You can display the range with the nv show system global reserved vlan l3-vni-vlan command:

cumulus@switch:~$ nv show system global reserved vlan l3-vni-vlan
       operational  applied
-----  -----------  -------
begin  4000         4000
end    4064         4064

Do not use this range of VLANs in the same bridge as your MLAG interfaces and layer 3 VNIs. You can configure the range with the nv set system global reserved vlan l3-vni-vlan begin <vlan> and nv set system global reserved vlan l3-vni-vlan end <vlan> commands. For more information, see symmetric routing.

The global reserved layer 3 VNI VLAN range does not apply to switches that you configure manually with Linux commands instead of NVUE or for symmetric routing deployments without MLAG.

VLAN Pruning

By default, a bridge port inherits the bridge VIDs; however, you can configure a port to override the bridge VIDs.

The following example commands configure swp3 to override the bridge VIDs:

cumulus@switch:~$ nv set interface swp1-3 bridge domain br_default
cumulus@switch:~$ nv set bridge domain br_default vlan 10,20
cumulus@switch:~$ nv set bridge domain br_default untagged 1
cumulus@switch:~$ nv set interface swp3 bridge domain br_default vlan 20
cumulus@switch:~$ nv config apply

Edit the /etc/network/interfaces file, then run the ifreload -a command. The following example configures swp3 to override the bridge VIDs:

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto br_default
iface br_default
    bridge-ports swp1 swp2 swp3
    bridge-pvid 1
    bridge-vids 10 20
    bridge-vlan-aware yes

auto swp3
iface swp3
    bridge-vids 20
...
cumulus@switch:~$ ifreload -a

Access Ports and Tagged Packets

Access ports ignore all tagged packets. In the configuration below, swp1 and swp2 are access ports, and all untagged traffic on these ports goes to VLAN 10:

cumulus@switch:~$ nv set interface swp1-2 bridge domain br_default
cumulus@switch:~$ nv set bridge domain br_default vlan 10,20
cumulus@switch:~$ nv set bridge domain br_default untagged 1
cumulus@switch:~$ nv set interface swp1 bridge domain br_default access 10
cumulus@switch:~$ nv set interface swp2 bridge domain br_default access 10
cumulus@switch:~$ nv config apply

Edit the /etc/network/interfaces file, then run the ifreload -a command.

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto br_default
iface br_default
    bridge-ports swp1 swp2
    bridge-pvid 1
    bridge-vids 10 20
    bridge-vlan-aware yes

auto swp1
iface swp1
    bridge-access 10

auto swp2
iface swp2
    bridge-access 10
...
cumulus@switch:~$ ifreload -a

Drop Untagged Frames

With VLAN-aware bridge mode, you can configure a switch port to drop any untagged frames. To do this, add bridge-allow-untagged no to the switch port (not to the bridge). The bridge port then has no PVID and drops untagged packets.

The following example command configures swp2 to drop untagged frames:

cumulus@switch:~$ nv set interface swp2 bridge domain br_default untagged none
cumulus@switch:~$ nv config apply

Edit the /etc/network/interfaces file to add the bridge-allow-untagged no line under the switch port interface stanza, then run the ifreload -a command.

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto swp1
iface swp1

auto swp2
iface swp2
    bridge-allow-untagged no

auto br_default
iface br_default
    bridge-ports swp1 swp2
    bridge-pvid 1
    bridge-vids 10 20
    bridge-vlan-aware yes
...
cumulus@switch:~$ sudo ifreload -a

When you check VLAN membership for that port, it shows that there is no untagged VLAN.

cumulus@switch:~$ bridge -c vlan show
port    vlan ids
swp1     1 PVID Egress Untagged
         10 20

swp2     10 20

bridge   1
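The ingress behavior described in the Access Ports and Tagged Packets and Drop Untagged Frames sections can be summarized as a small decision function. The following Python sketch is a simplified model for illustration only, not the switch implementation; the function name classify and its parameters are hypothetical, and the model assumes that a trunk port drops tagged frames for VLANs that the port is not a member of.

```python
def classify(frame_tag, access=None, pvid=1, vids=(), allow_untagged=True):
    """Return the VLAN an ingress frame lands in, or None if it is dropped.

    frame_tag is None for an untagged frame, otherwise the 802.1Q VLAN ID.
    """
    if access is not None:
        # Access port: untagged frames go to the access VLAN,
        # tagged frames are ignored.
        return access if frame_tag is None else None
    if frame_tag is None:
        # With bridge-allow-untagged no, the port has no PVID.
        return pvid if allow_untagged else None
    members = set(vids)
    if allow_untagged:
        members.add(pvid)
    return frame_tag if frame_tag in members else None


print(classify(None, access=10))                             # access port, untagged
print(classify(20, access=10))                               # access port, tagged
print(classify(None, vids=(10, 20), allow_untagged=False))   # swp2 above, untagged
print(classify(20, vids=(10, 20), allow_untagged=False))     # swp2 above, tagged
```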

VLAN Layer 3 Addressing

When configuring the VLAN attributes for the bridge, specify the attributes for each VLAN interface. If you are configuring the switch virtual interface (SVI) for the native VLAN, you must declare the native VLAN and specify its IP address. Specifying the IP address in the bridge stanza itself returns an error.

The following example commands declare native VLAN 10 with IPv4 address 10.1.10.2/24 and IPv6 address 2001:db8::1/32.

The NVUE and Linux commands also show an example with multiple VLAN-aware bridges.

cumulus@switch:~$ nv set interface vlan10 ip address 10.1.10.2/24
cumulus@switch:~$ nv set interface vlan10 ip address 2001:db8::1/32
cumulus@switch:~$ nv config apply

cumulus@switch:~$ nv set interface bridge2_vlan10 type svi
cumulus@switch:~$ nv set interface bridge2_vlan10 vlan 10
cumulus@switch:~$ nv set interface bridge2_vlan10 base-interface bridge2
cumulus@switch:~$ nv set interface bridge2_vlan10 ip address 10.1.10.2/24
cumulus@switch:~$ nv set interface bridge1_vlan10 type svi
cumulus@switch:~$ nv set interface bridge1_vlan10 vlan 10
cumulus@switch:~$ nv set interface bridge1_vlan10 base-interface bridge1
cumulus@switch:~$ nv set interface bridge1_vlan10 ip address 12.1.10.2/24
cumulus@switch:~$ nv config apply

Edit the /etc/network/interfaces file, then run the ifreload -a command.

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto br_default
iface br_default
    bridge-ports swp1 swp2
    bridge-pvid 1
    bridge-vids 10 20
    bridge-vlan-aware yes

auto vlan10
iface vlan10
    address 10.1.10.2/24
    address 2001:db8::1/32
    vlan-id 10
    vlan-raw-device br_default
cumulus@switch:~$ ifreload -a

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto bridge2_vlan10
iface bridge2_vlan10
    address 10.1.10.2/24
    hwaddress 1c:34:da:1d:e6:fd
    vlan-raw-device bridge2
    vlan-id 10

auto bridge1_vlan10
iface bridge1_vlan10
    address 12.1.10.2/24
    hwaddress 1c:34:da:1d:e6:fd
    vlan-raw-device bridge1
    vlan-id 10

Keep SVIs Perpetually UP

The first time you configure a switch, all southbound bridge ports are down; therefore, by default, SVIs are also down. You can force SVIs to always be up by disabling interface state tracking so that the SVIs are always in the UP state even when all member ports are down. Other implementations describe this feature as no autostate. This is beneficial if you want to perform connectivity testing.

To configure all SVIs on the switch to be perpetually UP, run the nv set system global svi-force-up enable on command.

cumulus@switch:~$ nv set system global svi-force-up enable on
cumulus@switch:~$ nv config apply

To configure SVIs in a specific bridge to be perpetually UP, run the nv set bridge domain <bridge> svi-force-up enable on command:

cumulus@switch:~$ nv set bridge domain br_default svi-force-up enable on
cumulus@switch:~$ nv config apply

  • To stop forcing all SVIs on the switch to be UP, so that they return to the default behavior of tracking the state of their member ports, run the nv set system global svi-force-up enable off command.
  • To stop forcing the SVIs in a specific bridge to be UP, run the nv set bridge domain <bridge> svi-force-up enable off command.

To configure the SVIs in a bridge to be perpetually UP, edit the /etc/network/interfaces file and add the bridge-always-up on option to the bridge stanza, then reload the configuration with the sudo ifreload -a command:

cumulus@switch:~$ sudo nano /etc/network/interfaces
auto br_default
iface br_default
    bridge-ports bond1 bond2 bond3 peerlink
    hwaddress 48:b0:2d:4e:ad:89
    bridge-vlan-aware yes
    bridge-vids 10 20 30
    bridge-pvid 1
    bridge-stp yes
    bridge-mcsnoop no
    bridge-always-up on
    mstpctl-forcevers rstp
cumulus@switch:~$ sudo ifreload -a

To configure all SVIs on the switch to be perpetually UP, add the bridge-always-up on option to all bridge stanzas with SVIs.

  • To stop forcing the SVIs in a bridge to be UP, so that they return to the default behavior of tracking the state of their member ports, remove the bridge-always-up on option from the bridge stanza, then reload the configuration with the sudo ifreload -a command.
  • To stop forcing all SVIs on the switch to be UP, remove the bridge-always-up on option from all bridge stanzas, then reload the configuration with the sudo ifreload -a command.

With the svi-force-up (bridge-always-up) option set to on, even when an interface is down, the bridge remains UP:

cumulus@switch:~$ ip link show bond1
7: bond1: <BROADCAST,MULTICAST,MASTER> mtu 9216 qdisc noqueue master br_default state DOWN mode DEFAULT group default qlen 1000
    link/ether 48:b0:2d:cf:e4:3e brd ff:ff:ff:ff:ff:ff
cumulus@switch:~$ ip link show br_default
18: br_default: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 48:b0:2d:4e:ad:89 brd ff:ff:ff:ff:ff:ff

To show if the svi-force-up option is set to on for all SVIs on the switch, run the nv show system global svi-force-up command:

cumulus@switch:~$ nv show system global svi-force-up
       operational  applied
------  -----------  -------
enable  on           on

To show if the svi-force-up option is set to on for SVIs in a specific bridge, run the nv show bridge domain <domain-id> svi-force-up command:

cumulus@switch:~$ nv show bridge domain br_default svi-force-up
        applied
------  -------
enable  on

By default, Cumulus Linux automatically generates IPv6 link-local addresses on VLAN interfaces. If you want to use a different mechanism to assign link-local addresses, you can disable this feature. You can disable link-local automatic address generation for both regular IPv6 addresses and address-virtual (macvlan) addresses.

The following example disables automatic address generation for a regular IPv6 address on VLAN 10.

Cumulus Linux does not provide NVUE commands for this setting.

Edit the /etc/network/interfaces file to add the line ipv6-addrgen off to the VLAN stanza, then run the ifreload -a command.

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto vlan10
iface vlan10
    ipv6-addrgen off
    vlan-id 10
    vlan-raw-device br_default
...
cumulus@switch:~$ ifreload -a

To reenable automatic link-local address generation for a VLAN:

Cumulus Linux does not provide NVUE commands for this setting.

Edit the /etc/network/interfaces file to remove the line ipv6-addrgen off from the VLAN stanza, then run the ifreload -a command.

MAC Address for a Bridge

To configure a MAC address for a bridge, run the nv set bridge domain <bridge> mac-address <mac-address> command.

The following example configures the bridge br_default with MAC address 00:00:5E:00:53:00:

cumulus@switch:~$ nv set bridge domain br_default mac-address 00:00:5E:00:53:00
cumulus@switch:~$ nv config apply

To unset the MAC address for a bridge, run the nv unset bridge domain <bridge> mac-address command:

cumulus@switch:~$ nv unset bridge domain br_default mac-address
cumulus@switch:~$ nv config apply

Edit the /etc/network/interfaces file to add the MAC address (hwaddress) to the bridge stanza, then run the sudo ifreload -a command.

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto br_default
iface br_default
    bridge-ports bond1 bond2 bond3 peerlink vxlan48
    hwaddress 00:00:5E:00:53:00
    bridge-vlan-aware yes
    bridge-vids 10 20 30
cumulus@switch:~$ sudo ifreload -a

To unset the MAC address for a bridge, remove the MAC address from the bridge stanza and run the sudo ifreload -a command.

MAC Address Aging

By default, Cumulus Linux stores MAC addresses in the Ethernet switching table for 1800 seconds (30 minutes). You can change this setting to a value between 0 and 65535. A value of 0 disables MAC learning and frames flood out of all ports in a VLAN.

The following command example changes the MAC aging setting to 600 seconds:

cumulus@switch:~$ nv set bridge domain br_default ageing 600 
cumulus@switch:~$ nv config apply

Edit the /etc/network/interfaces file to add the bridge-ageing parameter to the bridge interface:

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto br_default
iface br_default
    bridge-ageing 600
...

To show the bridge aging configuration setting, run the nv show bridge domain <domain> command or the Linux sudo ip -d link show <bridge-domain> command.

cumulus@switch:~$ nv show bridge domain br_default
                 operational  applied   
---------------  -----------  ----------
ageing                        600
encap                         802.1Q
mac-address                   auto
type                          vlan-aware
untagged                      1
vlan-vni-offset               0
...

To reset bridge aging to the default value (1800 seconds), run the nv unset bridge domain <domain> ageing command.

Clear Dynamic MAC Address Entries

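You can clear the following entries from the forwarding database instead of waiting for them to age out:

  • All dynamic MAC addresses
  • All dynamic MAC addresses for a specific interface, VLAN, or interface and VLAN
  • A specific dynamic MAC address for an interface, VLAN, or interface and VLAN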

The clear dynamic MAC address commands do not clear sticky entries, permanent entries, or neighbor entries learned externally.

Clear All Dynamic MAC Addresses

To clear all dynamic MAC addresses from the forwarding database, run the nv action clear bridge domain <bridge-id> mac-table dynamic command:

cumulus@switch:~$ nv action clear bridge domain br_default mac-table dynamic

The nv action clear bridge domain <bridge-id> mac-table dynamic command clears static entries learned on ES bonds that are installed as static entries in EVPN multihoming including static VXLAN entries in the bridge driver.

Clear All Dynamic MAC Addresses for an Interface, VLAN, or Interface and VLAN

To clear all dynamic MAC addresses for a specific interface, run the nv action clear bridge domain <bridge-id> mac-table dynamic interface <interface-id> command:

cumulus@switch:~$ nv action clear bridge domain br_default mac-table dynamic interface swp1

To clear all dynamic MAC addresses for a specific VLAN, run the nv action clear bridge domain <bridge-id> mac-table dynamic vlan <vlan-id> command:

cumulus@switch:~$ nv action clear bridge domain br_default mac-table dynamic vlan 10

The nv action clear bridge domain <bridge-id> mac-table dynamic vlan <vlan-id> command clears the static VXLAN entries in bridge or VXLAN driver for the corresponding VLAN or VNI.

To clear all dynamic MAC addresses for a specific interface and VLAN, run the nv action clear bridge domain <bridge-id> mac-table dynamic interface <interface-id> vlan <vlan-id> command:

cumulus@switch:~$ nv action clear bridge domain br_default mac-table dynamic interface swp1 vlan 10

Clear a Specific Dynamic MAC Address for an Interface, VLAN, or Interface and VLAN

To clear a specific dynamic MAC address for a VLAN, run the nv action clear bridge domain <domain-id> mac-table dynamic mac <mac-address> vlan <vlan-id> command:

cumulus@switch:~$ nv action clear bridge domain br_default mac-table dynamic mac 00:00:0A:BB:28:FC vlan 10

To clear a specific dynamic MAC address for an interface, run the nv action clear bridge domain <domain-id> mac-table dynamic mac <mac-address> interface <interface-id> command:

cumulus@switch:~$ nv action clear bridge domain br_default mac-table dynamic mac 00:00:0A:BB:28:FC interface swp1

To clear a specific dynamic MAC address for a VLAN and interface, run the nv action clear bridge domain <domain-id> mac-table dynamic mac <mac-address> vlan <vlan-id> interface <interface-id> command:

cumulus@switch:~$ nv action clear bridge domain br_default mac-table dynamic mac 00:00:0A:BB:28:FC vlan 10 interface swp1

Static MAC Address Entries

You can add a static MAC address entry to the layer 2 table for an interface within the VLAN-aware bridge by running a command similar to the following:

cumulus@switch:~$ sudo bridge fdb add 12:34:56:12:34:56 dev swp1 vlan 150 master static sticky
cumulus@switch:~$ sudo bridge fdb show
44:38:39:00:00:7c dev swp1 master bridge permanent
12:34:56:12:34:56 dev swp1 vlan 150 sticky master bridge static
44:38:39:00:00:7c dev swp1 self permanent
12:12:12:12:12:12 dev swp1 self permanent
12:34:12:34:12:34 dev swp1 self permanent
12:34:56:12:34:56 dev swp1 self permanent
12:34:12:34:12:34 dev bridge master bridge permanent
44:38:39:00:00:7c dev bridge vlan 500 master bridge permanent
12:12:12:12:12:12 dev bridge master bridge permanent
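If you need to post-process the forwarding database output in a script, each line can be split into fields. The following Python sketch handles only the keywords that appear in the sample output above (dev, vlan, and master); the function name parse_fdb_line is hypothetical, and any other word is collected as a flag.

```python
def parse_fdb_line(line):
    """Split one 'bridge fdb show' line into a dictionary (sketch only)."""
    tokens = line.split()
    entry = {"mac": tokens[0], "dev": None, "vlan": None, "master": None, "flags": []}
    i = 1
    while i < len(tokens):
        word = tokens[i]
        if word in ("dev", "vlan", "master") and i + 1 < len(tokens):
            value = tokens[i + 1]
            entry[word] = int(value) if word == "vlan" else value
            i += 2
        else:
            entry["flags"].append(word)  # self, permanent, static, sticky
            i += 1
    return entry


static = parse_fdb_line("12:34:56:12:34:56 dev swp1 vlan 150 sticky master bridge static")
print(static["dev"], static["vlan"], static["flags"])
```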

Troubleshooting

To show the ports mapped to each bridge, run the NVUE nv show bridge port command or the Linux bridge link show command:

cumulus@switch:~$ nv show bridge port
domain                       port             
--------        ------------------------------
br_default      swp1,swp2,swp3

To show port information for a specific bridge, run the NVUE nv show bridge domain <domain-name> port command:

cumulus@switch:~$ nv show bridge domain br_default port
port  flags                       state     
----  --------------------------  ----------
swp1  flood,learning,mcast_flood  forwarding
swp2  flood,learning,mcast_flood  forwarding
swp3  flood,learning,mcast_flood  forwarding

To show the VLANs mapped to each bridge port, run the NVUE nv show bridge port-vlan command or the Linux bridge vlan show command:

cumulus@switch:~$ nv show bridge port-vlan
domain        port            vlan   tag-state
-------    ---------     ---------   ---------
br_default    swp1              10    untagged
              swp2               1    untagged
                                10      tagged
                                20      tagged
                                30      tagged
              swp3               1    untagged
                                10      tagged
                                20      tagged
                                30      tagged

To show VLAN information for a specific bridge, run the NVUE nv show bridge domain <domain-name> port vlan command or the Linux bridge -d vlan show command:

cumulus@switch:~$ nv show bridge domain br_default port vlan
port  vlan  tag-state  fwd-state 
----  ----  ---------  ----------
swp1  10    untagged   forwarding
swp2  1     untagged   forwarding
      10    tagged     forwarding
      20    tagged     forwarding
      30    tagged     forwarding
swp3  1     untagged   forwarding
      10    tagged     forwarding
      20    tagged     forwarding
      30    tagged     forwarding
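In the table above, a row with an empty port column belongs to the port named in the closest preceding row. The following Python sketch shows one way to expand such a table into complete rows; it assumes the plain-text layout shown above, and the function name parse_port_vlan is hypothetical.

```python
SAMPLE = """\
port  vlan  tag-state  fwd-state
----  ----  ---------  ----------
swp1  10    untagged   forwarding
swp2  1     untagged   forwarding
      10    tagged     forwarding
"""


def parse_port_vlan(text):
    """Return (port, vlan, tag_state, fwd_state) tuples, one per table row."""
    rows = []
    current_port = None
    for line in text.splitlines()[2:]:  # skip the header and separator lines
        fields = line.split()
        if len(fields) == 4:
            current_port, vlan, tag, fwd = fields
        elif len(fields) == 3:
            vlan, tag, fwd = fields  # continuation row: reuse the last port
        else:
            continue
        rows.append((current_port, int(vlan), tag, fwd))
    return rows


for row in parse_port_vlan(SAMPLE):
    print(row)
```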

Example Configuration

The following example configuration contains an access port (swp1), a trunk carrying all VLANs (swp3 through swp48), and a trunk pruning some VLANs from a switch port (swp2).

cumulus@switch:mgmt:~$ nv set interface swp3-48 bridge domain br_default
cumulus@switch:mgmt:~$ nv set bridge domain br_default vlan 310,700,707,712,850,910
cumulus@switch:mgmt:~$ nv set interface swp1 bridge domain br_default access 310
cumulus@switch:mgmt:~$ nv set interface swp1 bridge domain br_default stp bpdu-guard on
cumulus@switch:mgmt:~$ nv set interface swp1 bridge domain br_default stp admin-edge on
cumulus@switch:mgmt:~$ nv set interface swp2 bridge domain br_default vlan 707,712,850
cumulus@switch:mgmt:~$ nv set interface swp2 bridge domain br_default stp admin-edge on
cumulus@switch:mgmt:~$ nv set interface swp2 bridge domain br_default stp bpdu-guard on
cumulus@switch:mgmt:~$ nv set interface swp49 bridge domain br_default stp network on
cumulus@switch:mgmt:~$ nv set interface swp50 bridge domain br_default stp network on
cumulus@switch:mgmt:~$ nv config apply

cumulus@switch:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    bridge:
      domain:
        br_default:
          vlan:
            '310': {}
            '700': {}
            '707': {}
            '712': {}
            '850': {}
            '910': {}
    interface:
      swp1:
        bridge:
          domain:
            br_default:
              access: 310
              stp:
                admin-edge: on
                bpdu-guard: on
        type: swp
      swp2:
        bridge:
          domain:
            br_default:
              stp:
                admin-edge: on
                bpdu-guard: on
              vlan:
                '707': {}
                '712': {}
                '850': {}
        type: swp
      ...  
      swp49:
        bridge:
          domain:
            br_default:
              stp:
                network: on
        type: swp
      swp50:
        bridge:
          domain:
            br_default:
              stp:
                network: on
        type: swp
    system:
      hostname: switch

cumulus@switch:mgmt:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback

auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto

auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt

# the following is an access port

auto swp1
iface swp1
    bridge-access 310
    mstpctl-bpduguard yes
    mstpctl-portadminedge yes

# the following is a trunk port that is pruned
# only .1q tags of 707, 712, 850 are sent and received

auto swp2
iface swp2
    bridge-vids 707 712 850
    mstpctl-bpduguard yes
    mstpctl-portadminedge yes
...
# the following port is the trunk uplink and inherits all vlans
# from br_default; bridge assurance is enabled using portnetwork

auto swp49
iface swp49
    mstpctl-portnetwork yes

# the following port is the trunk uplink and inherits all vlans
# from 'br_default'; bridge assurance is enabled using portnetwork

auto swp50
iface swp50
    mstpctl-portnetwork yes

# ports swp3-swp48 are trunk ports that inherit vlans 
# 310,700,707,712,850,910 from the bridge br_default

auto br_default
iface br_default
    bridge-ports swp1 swp2 swp3... swp49 swp50
    hwaddress 44:38:39:22:01:af
    bridge-vlan-aware yes
    bridge-vids 310 700 707 712 850 910
    bridge-pvid 1

Considerations

Spanning Tree Protocol (STP)

VLAN Translation

You cannot enable VLAN translation on a bridge in VLAN-aware mode. Only traditional mode bridges support VLAN translation.

Bridge Conversion

You cannot convert traditional mode bridges automatically to and from a VLAN-aware bridge. You must delete the original configuration and bring down all member switch ports before creating a new bridge.

VLAN Memory Resource Limitations

On Spectrum-2 and later, Cumulus Linux uses internal debugging flow counters for each VLAN that require KVD and ATCAM memory space. When you configure more than 1000 VLAN interfaces, you might not be able to apply ACLs if flow counter resources deplete the ACL resource space. In addition, you might see error messages in the /var/log/switchd.log file similar to the following:

error: hw sync failed (sync_acl hardware installation failed) Rolling back .. failed.
error: hw sync failed (Bulk counter init failed with No More Resources). Rolling back ..

To troubleshoot this issue and manage Netfilter resources with high VLAN and ACL scale, refer to Troubleshooting ACL Rule Installation Failures.

Traditional Bridge Mode

For traditional Linux bridges, the kernel supports VLANs in the form of VLAN subinterfaces. When you enable bridging on multiple VLANs, you configure a bridge for each VLAN and create one or more VLAN subinterfaces for each member port on the bridge. This mode can pose scalability challenges with configuration size as well as boot time and run time state management when the number of ports times the number of VLANs becomes large.

  • Use VLAN-aware mode bridges instead of traditional mode bridges.
  • Use traditional mode bridges if you need to use PVSTP+.

Configure a Traditional Mode Bridge

The following example commands configure a traditional mode bridge called my_bridge, where swp1, swp2, swp3, and swp4 are members of the bridge. The example also configures the bridge with IP address 10.10.10.10/24 to provide IP access to the bridge interface.

Cumulus Linux does not provide NVUE commands for traditional bridge mode.

Edit the /etc/network/interfaces file, then run the ifreload -a command.

...
auto swp1
iface swp1

auto swp2
iface swp2

auto swp3
iface swp3

auto swp4
iface swp4

auto my_bridge
iface my_bridge
    address 10.10.10.10/24
    bridge-ports swp1 swp2 swp3 swp4
    bridge-vlan-aware no
...
cumulus@switch:~$ sudo ifreload -a

  • Do not bridge the management port, eth0, with any switch ports (swp0, swp1, and so on). For example, if you create a bridge with eth0 and swp1, it does not work.
  • The name of the bridge must be compliant with Linux interface naming conventions and unique within the switch.

To configure spanning tree options for a bridge interface, refer to Spanning Tree and Rapid Spanning Tree - STP.

Configure Multiple Traditional Mode Bridges

You can configure multiple bridges to divide a switch into multiple layer 2 domains. This enables hosts to communicate with other hosts in the same domain, while separating them from hosts in other domains.

The example below shows a multiple bridge configuration, where host-1 and host-2 connect to bridge-A, and host-3 and host-4 connect to bridge-B:

This example configuration looks like this in the /etc/network/interfaces file:

...
auto bridge-A
iface bridge-A
    bridge-ports swp1 swp2
    bridge-vlan-aware no

auto bridge-B
iface bridge-B
    bridge-ports swp3 swp4
    bridge-vlan-aware no
...

Trunks in Traditional Bridge Mode

The standard for trunking is 802.1Q. The 802.1Q specification adds a four-byte header within the Ethernet frame that identifies the VLAN to which the frame belongs.

802.1Q also identifies an untagged frame as belonging to the native VLAN (most network devices default their native VLAN to 1). In Cumulus Linux:

A bridge in traditional mode has no concept of trunks, just tagged or untagged frames. With a trunk of 200 VLANs, there need to be 199 bridges, each containing a tagged physical interface, and one bridge containing the native untagged VLAN.

The interaction of tagged and untagged frames on the same trunk often leads to undesired and unexpected behavior. A switch that uses VLAN 1 for the native VLAN can send frames to a switch that uses VLAN 2 for the native VLAN, merging those two VLANs and their spanning tree state.

To create the above example:

Cumulus Linux does not provide NVUE commands for traditional bridge mode.

Add the following configuration to the /etc/network/interfaces file:

...
auto br-VLAN10
iface br-VLAN10
   bridge-ports swp1.10 swp2.10

auto br-VLAN20
iface br-VLAN20
   bridge-ports swp1.20 swp2.20
...
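Because traditional mode needs one bridge per VLAN and one subinterface per member port in each bridge, the configuration grows with the number of ports times the number of VLANs. The following Python sketch generates stanzas in the same layout as the example above; the function name traditional_bridges is hypothetical and the sketch only produces text, it does not modify the /etc/network/interfaces file.

```python
def traditional_bridges(ports, vlans):
    """Build one bridge stanza per VLAN with a subinterface for each port."""
    stanzas = []
    for vlan in vlans:
        members = " ".join(f"{port}.{vlan}" for port in ports)
        stanzas.append(
            f"auto br-VLAN{vlan}\n"
            f"iface br-VLAN{vlan}\n"
            f"   bridge-ports {members}\n"
        )
    return "\n".join(stanzas)


config = traditional_bridges(["swp1", "swp2"], [10, 20])
print(config)
```

With 2 ports and 2 VLANs the output contains 2 bridges and 4 subinterfaces; with 48 ports and 200 VLANs the same loop produces 200 bridges and 9600 subinterfaces.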

Advanced VLAN Tagging Example

The following advanced VLAN tagging configuration shows three hosts and two switches, with several bridges and a bond that connects them all.

The bridge member ports function as 802.1Q access ports and trunk ports. To compare Cumulus Linux with a traditional Cisco device:

To create the above configuration, edit the /etc/network/interfaces file and add a configuration like the following:

# Config for host1

# swp1 does not need an iface section unless it has a specific setting
# it will be picked up as a dependent of swp1.100
# swp1 must exist in the system to create the .1q subinterfaces
# but it is not applied to any bridge or assigned an address

auto swp1.100
iface swp1.100

# Config for host2
# swp2 does not need an iface section unless it has a specific setting
# it will be picked up as a dependent of swp2.100 and swp2.120
# swp2 must exist in the system to create the .1q subinterfaces
# but it is not applied to any bridge or assigned an address

auto swp2.100
iface swp2.100

auto swp2.120
iface swp2.120

# Config for host3
# swp3 does not need an iface section unless it has a specific setting
# it will be picked up as a dependent of swp3.120 and swp3.130
# swp3 must exist in the system to create the .1q subinterfaces
# but it is not applied to any bridge or assigned an address

auto swp3.120
iface swp3.120

auto swp3.130
iface swp3.130

# configure the bond

auto bond2
iface bond2
  bond-slaves glob swp4-7

# configure the bridges

auto br-untagged
iface br-untagged
    address 10.0.0.1/24
    bridge-ports swp1 bond2
    bridge-stp on

auto br-tag100
iface br-tag100
    address 10.0.100.1/24
    bridge-ports swp1.100 swp2.100 bond2.100
    bridge-stp on

auto br-vlan120
iface br-vlan120
    address 10.0.120.1/24
    bridge-ports swp2.120 swp3.120 bond2.120
    bridge-stp on

auto v130
iface v130
    address 10.0.130.1/24
    bridge-ports swp3.130 bond2.130
    bridge-stp on

To verify the configuration:

cumulus@switch:~$ sudo mstpctl showbridge br-tag100
br-tag100 CIST info
  enabled         yes
  bridge id       8.000.44:38:39:00:32:8B
  designated root 8.000.44:38:39:00:32:8B
  regional root   8.000.44:38:39:00:32:8B
  root port       none
  path cost     0          internal path cost   0
  max age       20         bridge max age       20
  forward delay 15         bridge forward delay 15
  tx hold count 6          max hops             20
  hello time    2          ageing time          300
  force protocol version     rstp
  time since topology change 333040s
  topology change count      1
  topology change            no
  topology change port       swp2.100
  last topology change port  None
cumulus@switch:~$ sudo mstpctl showportdetail br-tag100  | grep -B 2 state
br-tag100:bond2.100 CIST info
  enabled            yes                     role                 Designated
  port id            8.003                   state                forwarding
--
br-tag100:swp1.100 CIST info
  enabled            yes                     role                 Designated
  port id            8.001                   state                forwarding
--
br-tag100:swp2.100 CIST info
  enabled            yes                     role                 Designated
  port id            8.002                   state                forwarding
cumulus@switch:~$ cat /proc/net/vlan/config
VLAN Dev name    | VLAN ID
Name-Type: VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD
bond2.100      | 100  | bond2
bond2.120      | 120  | bond2
bond2.130      | 130  | bond2
swp1.100       | 100  | swp1
swp2.100       | 100  | swp2
swp2.120       | 120  | swp2
swp3.120       | 120  | swp3
swp3.130       | 130  | swp3
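
If you want to check the subinterface list programmatically, the /proc/net/vlan/config format shown above is simple to parse. The following is a minimal Python sketch, not part of Cumulus Linux; the function name parse_vlan_config is hypothetical, and the sample text is an abbreviated copy of the output above.

```python
def parse_vlan_config(text):
    """Return {subinterface: (vlan_id, parent)} from /proc/net/vlan/config text."""
    entries = {}
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        # Data rows have exactly three fields and a numeric VLAN ID;
        # the two header lines do not, so they are skipped.
        if len(fields) == 3 and fields[1].isdigit():
            entries[fields[0]] = (int(fields[1]), fields[2])
    return entries


# Abbreviated sample taken from the output shown in this section.
SAMPLE = """\
VLAN Dev name    | VLAN ID
Name-Type: VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD
bond2.100      | 100  | bond2
swp1.100       | 100  | swp1
swp2.120       | 120  | swp2
"""

vlans = parse_vlan_config(SAMPLE)
print(vlans["swp2.120"])  # (120, 'swp2')
```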

A single bridge cannot contain multiple subinterfaces of the same port. If you try to apply this configuration, you see an error:

cumulus@switch:~$ sudo brctl addbr another_bridge
cumulus@switch:~$ sudo brctl addif another_bridge swp9 swp9.100
bridge cannot contain multiple subinterfaces of the same port: swp9, swp9.100
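
You can catch this condition before you apply a configuration by grouping the planned bridge members by base port. The following Python sketch is illustrative only; same_port_conflicts is a hypothetical helper, and it assumes that the part of an interface name before the first dot is the base port.

```python
from collections import defaultdict


def same_port_conflicts(bridge_ports):
    """Group bridge members by base port and return groups with more than one member."""
    by_base = defaultdict(list)
    for member in bridge_ports:
        # Assumption: "swp9.100" is a subinterface of "swp9".
        base = member.split(".", 1)[0]
        by_base[base].append(member)
    return {base: members for base, members in by_base.items() if len(members) > 1}


print(same_port_conflicts(["swp9", "swp9.100"]))                   # {'swp9': ['swp9', 'swp9.100']}
print(same_port_conflicts(["swp1.100", "swp2.100", "bond2.100"]))  # {}
```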

VLAN Translation

By default, Cumulus Linux does not allow VLAN subinterfaces associated with different VLAN IDs to be part of the same bridge. Base interfaces do not associate with any VLAN IDs and are exempt from this restriction.

In some cases, it is useful to relax this restriction; for example, when two servers connect to the switch using VLAN trunks, but the VLAN numbering on the two servers is not consistent. You can bridge two VLAN subinterfaces of different VLAN IDs from the servers by enabling the sysctl net.bridge.bridge-allow-multiple-vlans option. Packets that enter a bridge from a member VLAN subinterface egress another member VLAN subinterface with the VLAN ID translated.

The following example enables the VLAN translation sysctl:

cumulus@switch:~$ echo net.bridge.bridge-allow-multiple-vlans = 1 | sudo tee /etc/sysctl.d/multiple_vlans.conf
net.bridge.bridge-allow-multiple-vlans = 1
cumulus@switch:~$ sudo sysctl -p /etc/sysctl.d/multiple_vlans.conf
net.bridge.bridge-allow-multiple-vlans = 1

After you enable the sysctl option, you can add ports with different VLAN IDs to the same bridge. In the following example, the switch bridges packets that enter the bridge br_mix from swp10.100 to swp11.200. Cumulus Linux translates the VLAN ID from 100 to 200:

cumulus@switch:~$ sudo brctl addif br_mix swp10.100 swp11.200

cumulus@switch:~$ sudo brctl show br_mix
bridge name     bridge id               STP enabled     interfaces
br_mix          8000.4438390032bd       yes             swp10.100
                                                        swp11.200
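
To decide whether a planned bridge needs the VLAN translation option, you can count the distinct VLAN IDs among its subinterfaces. The following Python sketch is a rough illustration; needs_vlan_translation is a hypothetical name, and it assumes that subinterfaces follow the <port>.<vlan-id> naming used in this section. Base interfaces are ignored because they are exempt from the restriction.

```python
def needs_vlan_translation(bridge_members):
    """Return True if the members include subinterfaces with different VLAN IDs."""
    # Base interfaces (no dot in the name) carry no VLAN ID and are skipped.
    vlan_ids = {m.split(".", 1)[1] for m in bridge_members if "." in m}
    return len(vlan_ids) > 1


print(needs_vlan_translation(["swp10.100", "swp11.200"]))              # True
print(needs_vlan_translation(["swp1.100", "swp2.100", "bond2.100"]))   # False
print(needs_vlan_translation(["swp1", "bond2"]))                       # False
```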

Spanning Tree and Rapid Spanning Tree - STP

STP identifies links in the network and shuts down redundant links, preventing possible network loops and broadcast radiation on a bridged network. STP also provides redundant links for automatic failover when an active link fails. Cumulus Linux enables STP by default for both VLAN-aware and traditional bridges.

Exercise caution when changing the STP settings below to prevent STP loop avoidance issues.

STP Modes

Cumulus Linux supports STP for VLAN-aware and traditional bridges.

STP Modes for a VLAN-aware Bridge

VLAN-aware bridges operate in:

Configure the Mode for a VLAN-aware Bridge

RSTP is the default mode for a VLAN-aware bridge. You can change the mode to PVRST.

You cannot configure PVRST mode for multiple VLAN-aware bridges.

The following example sets PVRST mode on the br_default bridge:

cumulus@switch:~$ nv set bridge domain br_default stp mode pvrst
cumulus@switch:~$ nv config apply

To revert the mode to the default setting (RSTP), run the nv unset bridge domain <bridge> stp mode command.

Add mstpctl-pvrst-mode yes under the bridge stanza in the /etc/network/interfaces file, then run the ifreload -a command.

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto br_default
iface br_default
    bridge-ports swp1 swp2
    hwaddress 44:38:39:22:01:b1
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1
    bridge-stp yes
    mstpctl-pvrst-mode yes
...
cumulus@switch:~$ sudo ifreload -a

Runtime Configuration (Advanced)

A runtime configuration is non-persistent, which means the configuration you create here does not persist after you reboot the switch.

To set STP mode to PVRST at runtime:

cumulus@switch:~$ sudo mstpctl setmodepvrst

To revert the mode to the default setting (RSTP), run the sudo mstpctl clearmodepvrst command.

PVRST Scale

The maximum number of PVRST instances you can configure is 300 VLANs with 24 ports. The default forwarding rate and burst rate for the rpvst trap group are both 2000 pps, as shown in the nv show system control-plane policer rpvst command output below:

cumulus@switch:~$ nv show system control-plane policer rpvst
                 operational  applied
---------------  -----------  -------
burst            2000         2000
rate             2000         2000
state            on           on
statistics
  policer-cbs    11
  policer-cir    2000
  policer-id     22
  to-cpu-bytes   0
  to-cpu-pkts    0
  trap-group-id  4
  violated-pkts  0

If you enable PVRST mode, you must modify the rpvst trap group settings to scale to the maximum number of PVRST instances by setting the forwarding rate and the burst rate to 7200 pps:

cumulus@switch:~$ nv set system control-plane policer rpvst rate 7200
cumulus@switch:~$ nv set system control-plane policer rpvst burst 7200
cumulus@switch:~$ nv config apply

  1. Edit the /etc/cumulus/control-plane/policers.conf file to change the copp.rpvst.rate and copp.rpvst.burst parameters to 7200:

    cumulus@switch:~$ sudo nano /etc/cumulus/control-plane/policers.conf
    ...
    copp.rpvst.enable = TRUE
    copp.rpvst.rate = 7200
    copp.rpvst.burst = 7200
    ...
    
  2. Run the following command to apply the change:

    cumulus@switch:~$ /usr/lib/cumulus/switchdctl --load /etc/cumulus/control-plane/policers.conf
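
If you script the file edit in step 1, the change amounts to rewriting two key = value lines. The following Python sketch shows one way to do this on the file contents as a string; set_rpvst_policer is a hypothetical helper, it assumes that the file uses the key = value layout shown in the excerpt above, and it does not write the file or reload switchd.

```python
import re


def set_rpvst_policer(conf_text, rate, burst):
    """Return conf_text with copp.rpvst.rate and copp.rpvst.burst set to the given values."""
    new_values = {"copp.rpvst.rate": rate, "copp.rpvst.burst": burst}
    out = []
    for line in conf_text.splitlines():
        match = re.match(r"\s*(copp\.rpvst\.(?:rate|burst))\s*=", line)
        if match:
            key = match.group(1)
            line = f"{key} = {new_values[key]}"
        # All other lines, including copp.rpvst.enable, pass through unchanged.
        out.append(line)
    return "\n".join(out)


before = "copp.rpvst.enable = TRUE\ncopp.rpvst.rate = 2000\ncopp.rpvst.burst = 2000"
print(set_rpvst_policer(before, 7200, 7200))
```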
    

STP Modes for a Traditional Bridge

Traditional bridges operate in:

  • For maximum interoperability, when connected to a switch that has a native VLAN configuration, you must configure the native VLAN to VLAN 1.
  • NVUE does not provide commands to configure a traditional mode bridge.

STP Interoperability

This section discusses STP interoperability.

RSTP and STP Interoperability

If a bridge running RSTP (802.1w) receives a common STP (802.1D) BPDU, it falls back to 802.1D automatically.

RSTP and PVRST Interoperability

The RSTP domain sends BPDUs on the native VLAN, whereas PVRST sends BPDUs on each VLAN along with IEEE BPDUs. For both protocols to work together, you need to enable the native VLAN on the link between the RSTP and PVRST domains; the spanning tree builds according to the native VLAN parameters.

The RSTP protocol does not send or parse BPDUs on other VLANs, but floods BPDUs across the network, enabling the PVRST domain to maintain its spanning-tree topology and provide a loop-free network.

RSTP and MST Interoperability

RSTP works with MST seamlessly, creating a single instance of spanning tree that transmits BPDUs on the native VLAN.

RSTP treats the MST domain as one giant switch, whereas MST treats the RSTP domain as a different region. To ensure proper communication between the regions, MST creates a CST that connects all the boundary switches and forms the overall view of the MST domain. Because changes in the CST must reflect in all regions, the RSTP tree exists in the CST so that changes in the RSTP domain appear in the CST domain. Topology changes in the RSTP domain affect the rest of the network, but the MST domain learns of every change that occurs in the RSTP domain, which ensures a loop-free network.

Configure the root bridge within the MST domain by changing the priority on the relevant MST switch. When MST detects an RSTP link, it falls back into RSTP mode. The MST domain chooses the switch with the lowest cost to the CST root bridge as the CIST root bridge.

RSTP with MLAG

More than one spanning tree instance enables switches to load balance and use different links for different VLANs. With RSTP, there is only one instance of spanning tree. To better utilize the links, you can configure MLAG on the switches connected to the MST domain and set up these interfaces as an MLAG port. The MST domain thinks it connects to a single switch and utilizes all the links connected to it. Load balancing depends on the port channel hashing mechanism instead of different spanning tree instances and uses all the links between the RSTP and MST domains. For information about configuring MLAG, see Multi-Chassis Link Aggregation - MLAG.

Optional Configuration

The following section provides optional configuration commands.

Spanning Tree Priority

If you are running STP for a VLAN-aware bridge in the default mode (RSTP) and you have a multiple spanning tree instance (MSTI 0, also known as a common spanning tree, or CST), you can set the tree priority for a bridge. The bridge with the lowest priority is the root bridge. The priority must be a number between 0 and 61440, and must be a multiple of 4096. The default value is 32768.

If you are running MLAG and have multiple bridges, the STP priority must be the same on all bridges on both peer switches.

The following example sets the tree priority to 8192:

cumulus@switch:~$ nv set bridge domain br_default stp priority 8192
cumulus@switch:~$ nv config apply

Configure the tree priority (mstpctl-treeprio) under the bridge stanza in the /etc/network/interfaces file, then run the ifreload -a command.

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto bridge
iface bridge
    # bridge-ports includes all ports related to VxLAN and CLAG.
    # does not include the Peerlink.4094 subinterface
    bridge-ports bond01 bond02 peerlink vni13 vni24 vxlan4001
    bridge-pvid 1
    bridge-vids 13 24
    bridge-vlan-aware yes
    mstpctl-treeprio 8192
...
cumulus@switch:~$ sudo ifreload -a

Runtime Configuration (Advanced)

A runtime configuration is non-persistent, which means the configuration you create here does not persist after you reboot the switch.

Run the sudo mstpctl settreeprio <bridge> <MSTI> <priority> command:

cumulus@switch:~$ sudo mstpctl settreeprio br_default 0 8192

Cumulus Linux supports MSTI 0 only. It does not support MSTI 1 through 15.
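
The priority rule in this section (a multiple of 4096 between 0 and 61440) is easy to check before you apply a value. The following Python sketch is a minimal illustration; valid_tree_priority is a hypothetical helper name.

```python
def valid_tree_priority(priority):
    """Return True if priority is a multiple of 4096 between 0 and 61440."""
    return 0 <= priority <= 61440 and priority % 4096 == 0


# The full set of accepted values: 0, 4096, 8192, ..., 61440.
ALLOWED = list(range(0, 61441, 4096))

print(valid_tree_priority(8192))   # True
print(valid_tree_priority(5000))   # False
print(len(ALLOWED))                # 16
```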

Port Path Cost

You can configure the path cost for an interface in the bridge to influence the spanning tree forwarding path. You can specify a value between 1 and 200000000.

For PVRST mode, the port cost for a VLAN takes precedence over the cost for a port. If you do not configure the port cost for a VLAN, Cumulus Linux applies the port cost to all the interfaces in the VLAN. If you do not configure either the port cost for a VLAN or the cost for a port, Cumulus Linux bases the port cost on the link speed.

The following example sets the path cost to 4000.

cumulus@switch:~$ nv set interface swp1 bridge domain br_default stp path-cost 4000
cumulus@switch:~$ nv config apply

Add the mstpctl-portpathcost parameter under the interface stanza of the /etc/network/interfaces file.

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto swp1
iface swp1
    mstpctl-bpduguard yes
    mstpctl-portadminedge yes
    mstpctl-portpathcost 4000
...

Runtime Configuration (Advanced)

A runtime configuration is non-persistent, which means the configuration you create here does not persist after you reboot the switch.

To set path cost to 4000 at runtime:

cumulus@switch:~$ sudo mstpctl setportpathcost br_default swp1 4000
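
The PVRST precedence rule described in this section (VLAN port cost first, then port cost, then a cost based on link speed) can be sketched as follows. This Python example is illustrative only: effective_path_cost is a hypothetical name, and the speed-based cost is passed in by the caller because this section does not list the speed-to-cost mapping. The value 20000 in the usage lines is only an example.

```python
def effective_path_cost(vlan_cost=None, port_cost=None, speed_based_cost=None):
    """Pick the path cost that applies: VLAN cost, else port cost, else speed-based cost."""
    for cost in (vlan_cost, port_cost, speed_based_cost):
        if cost is not None:
            # Path cost values must be between 1 and 200000000.
            if not 1 <= cost <= 200_000_000:
                raise ValueError(f"path cost out of range: {cost}")
            return cost
    raise ValueError("no path cost available")


print(effective_path_cost(vlan_cost=4000, port_cost=5000, speed_based_cost=20000))  # 4000
print(effective_path_cost(port_cost=5000, speed_based_cost=20000))                  # 5000
print(effective_path_cost(speed_based_cost=20000))                                  # 20000
```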

PVRST Bridge Priority

You can set the spanning tree bridge priority for a VLAN. The priority must be a number between 4096 and 61440 and must be a multiple of 4096. The default value is 32768.

The following example sets the bridge priority to 4096 for VLAN 10 and to 61440 for VLAN 20:

cumulus@switch:~$ nv set bridge domain br_default stp vlan 10 bridge-priority 4096
cumulus@switch:~$ nv set bridge domain br_default stp vlan 20 bridge-priority 61440
cumulus@switch:~$ nv config apply

The following example sets the bridge priority to 61440 for VLAN 10, 20, and 30:

cumulus@switch:~$ nv set bridge domain br_default stp vlan 10,20,30 bridge-priority 61440
cumulus@switch:~$ nv config apply

To set the bridge priority for a range of VLANs, use a hyphen (-). For example, to set the bridge priority to 61440 for VLAN 10 through VLAN 30:

cumulus@switch:~$ nv set bridge domain br_default stp vlan 10-30 bridge-priority 61440
cumulus@switch:~$ nv config apply

Add the bridge-stp-vlan-priority parameter under the bridge stanza of the /etc/network/interfaces file, then run the ifreload -a command.

The following example sets the bridge priority to 4096 for VLAN 10 and to 61440 for VLAN 20:

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto br_default
iface br_default
    bridge-ports swp1 swp2
    hwaddress 44:38:39:22:01:b1
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1
    bridge-stp yes
    mstpctl-pvrst-mode yes
    bridge-stp-vlan-priority 10=4096 20=61440
...
cumulus@switch:~$ sudo ifreload -a

The following example sets the bridge priority to 61440 for VLAN 10, 20, and 30:

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto br_default
iface br_default
    bridge-ports swp1 swp2
    hwaddress 44:38:39:22:01:b1
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1
    bridge-stp yes
    mstpctl-pvrst-mode yes
    bridge-stp-vlan-priority 10=61440 20=61440 30=61440
...
cumulus@switch:~$ sudo ifreload -a

To set the bridge priority for a range of VLANs, use a hyphen (-). For example, to set the bridge priority to 61440 for VLAN 10 through VLAN 30:

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto br_default
iface br_default
    bridge-ports swp1 swp2
    hwaddress 44:38:39:22:01:b1
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1
    bridge-stp yes
    mstpctl-pvrst-mode yes
    bridge-stp-vlan-priority 10-30=61440
...
cumulus@switch:~$ sudo ifreload -a
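
If you generate the bridge stanza from a script, you can validate the per-VLAN priorities and build the bridge-stp-vlan-priority line in one step. The following Python sketch is an illustration under stated assumptions: expand_vlans and vlan_priority_line are hypothetical helpers, and the VLAN list syntax (commas and hyphenated ranges) follows the examples in this section.

```python
def expand_vlans(spec):
    """Expand a VLAN spec such as '10,20,30' or '10-30' into a list of VLAN IDs."""
    vlans = []
    for part in spec.split(","):
        if "-" in part:
            start, end = (int(x) for x in part.split("-"))
            vlans.extend(range(start, end + 1))
        else:
            vlans.append(int(part))
    return vlans


def vlan_priority_line(priorities):
    """Build a bridge-stp-vlan-priority line from a {vlan: priority} mapping."""
    for vlan, prio in priorities.items():
        # PVRST bridge priority: a multiple of 4096 between 4096 and 61440.
        if not (4096 <= prio <= 61440 and prio % 4096 == 0):
            raise ValueError(f"invalid priority {prio} for VLAN {vlan}")
    pairs = " ".join(f"{vlan}={prio}" for vlan, prio in sorted(priorities.items()))
    return f"bridge-stp-vlan-priority {pairs}"


print(expand_vlans("10,20,30"))                      # [10, 20, 30]
print(vlan_priority_line({10: 4096, 20: 61440}))     # bridge-stp-vlan-priority 10=4096 20=61440
```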

PVRST Timers

You can set the following PVRST timers:

The max age timer must be equal to or less than two times the forward delay minus one second (bridge max age <= 2 * bridge forward delay - 1 second).

The following example sets the max age to 6 seconds, the hello time to 4 seconds, and the forward delay to 4 seconds for VLAN 10, 20, and 30.

cumulus@switch:~$ nv set bridge domain br_default stp vlan 10,20,30 max-age 6
cumulus@switch:~$ nv set bridge domain br_default stp vlan 10,20,30 hello-time 4 
cumulus@switch:~$ nv set bridge domain br_default stp vlan 10,20,30 forward-delay 4
cumulus@switch:~$ nv config apply

To set the PVRST timers for a range of VLANs, use a hyphen (-). For example, run nv set bridge domain br_default stp vlan 10-30 max-age 6.

Add the bridge-stp-vlan-maxage, bridge-stp-vlan-hello, and bridge-stp-vlan-fdelay parameters under the bridge stanza in the /etc/network/interfaces file, then run the ifreload -a command.

The following example sets the max age to 6 seconds, the hello time to 4 seconds, and the forward delay to 4 seconds for VLAN 10, 20, and 30.

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto br_default
iface br_default
    bridge-ports swp1 swp2
    hwaddress 44:38:39:22:01:b1
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1
    bridge-stp yes
    mstpctl-pvrst-mode yes
    bridge-stp-vlan-priority 10=4096
    bridge-stp-vlan-hello 10=4 20=4 30=4
    bridge-stp-vlan-fdelay 10=4 20=4 30=4
    bridge-stp-vlan-maxage 10=6 20=6 30=6
...
cumulus@switch:~$ sudo ifreload -a

To set the PVRST timers for a range of VLANs, use a hyphen (-). For example, bridge-stp-vlan-hello 10-30=4.
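
The relationship between the max age and forward delay timers can be checked before you apply a configuration. The following Python sketch is a minimal illustration; max_age_ok is a hypothetical helper, and both values are in seconds.

```python
def max_age_ok(max_age, forward_delay):
    """Return True if max age <= 2 * forward delay - 1 (all values in seconds)."""
    return max_age <= 2 * forward_delay - 1


print(max_age_ok(6, 4))     # True: 6 <= 7, the example values in this section
print(max_age_ok(20, 15))   # True: 20 <= 29, the default timers
print(max_age_ok(10, 4))    # False: 10 > 7
```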

PVRST Port Settings

You can configure an interface port priority and path cost for each VLAN to influence the spanning tree forwarding path. You can specify a path cost between 1 and 200000000. You can specify a priority between 0 and 240; the value must be a multiple of 16.

The following examples set the path cost to 4000 and the priority to 240 for VLAN 10.

cumulus@switch:~$ nv set interface swp1 bridge domain br_default stp vlan 10 path-cost 4000
cumulus@switch:~$ nv set interface swp1 bridge domain br_default stp vlan 10 priority 240
cumulus@switch:~$ nv config apply

Add the mstpctl-port-vlan-path-cost and mstpctl-port-vlan-priority parameters under the interface stanza of the /etc/network/interfaces file:

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto swp1
iface swp1
    bridge-access 10
    mstpctl-bpduguard yes
    mstpctl-portadminedge yes
    mstpctl-port-vlan-path-cost 10=4000
    mstpctl-port-vlan-priority 10=240
...
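
The value ranges for the per-VLAN port settings can be validated before you write them to the interface stanza. The following Python sketch is illustrative only; port_vlan_lines is a hypothetical helper that returns the two parameter lines used in the example above.

```python
def port_vlan_lines(vlan, path_cost, priority):
    """Validate per-VLAN port settings and return the matching interface stanza lines."""
    # Path cost: between 1 and 200000000.
    if not 1 <= path_cost <= 200_000_000:
        raise ValueError(f"path cost out of range: {path_cost}")
    # Port priority: a multiple of 16 between 0 and 240.
    if not (0 <= priority <= 240 and priority % 16 == 0):
        raise ValueError(f"invalid port priority: {priority}")
    return [
        f"mstpctl-port-vlan-path-cost {vlan}={path_cost}",
        f"mstpctl-port-vlan-priority {vlan}={priority}",
    ]


for line in port_vlan_lines(10, 4000, 240):
    print(line)
```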

PortAdminEdge (PortFast Mode)

PortAdminEdge is equivalent to the PortFast feature offered by other vendors. It enables or disables the initial edge state of a port in a bridge. All ports with PortAdminEdge bypass the listening and learning states and go straight to forwarding.

PortAdminEdge mode can cause loops if you do not use it with BPDU guard.

You typically configure edge ports as access ports for a simple end host. In the data center, edge ports connect to servers, which pass both tagged and untagged traffic.

The following example commands configure PortAdminEdge and BPDU guard for swp5:

cumulus@switch:~$ nv set interface swp5 bridge domain br_default stp admin-edge on
cumulus@switch:~$ nv set interface swp5 bridge domain br_default stp bpdu-guard on
cumulus@switch:~$ nv config apply

Configure PortAdminEdge and BPDU guard under the switch port interface stanza in the /etc/network/interfaces file, then run the ifreload -a command.

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto swp5
iface swp5
    mstpctl-bpduguard yes
    mstpctl-portadminedge yes
...
cumulus@switch:~$ sudo ifreload -a

Runtime Configuration (Advanced)

A runtime configuration is non-persistent, which means the configuration you create here does not persist after you reboot the switch.

To configure PortAdminEdge and BPDU guard at runtime, run the following commands:

cumulus@switch:~$ sudo mstpctl setportadminedge br2 swp1 yes
cumulus@switch:~$ sudo mstpctl setbpduguard br2 swp1 yes

PortAutoEdge

PortAutoEdge is an enhancement to the standard PortAdminEdge (PortFast) mode, which allows for the automatic detection of edge ports. PortAutoEdge enables and disables the auto transition to and from the edge state of a port in a bridge.

Edge ports and access ports are not the same. Edge ports transition directly to the forwarding state and skip the listening and learning stages. Upstream topology change notifications are not generated when an edge port link changes state. Access ports only forward untagged traffic; however, there is no such restriction on edge ports, which can forward both tagged and untagged traffic.

When a port with PortAutoEdge receives a BPDU, the port stops being in the edge port state and transitions into a normal STP port. When the interface no longer receives BPDUs, the port becomes an edge port, and transitions through the discarding and learning states before it resumes forwarding.

Cumulus Linux enables PortAutoEdge by default.

The following example commands disable PortAutoEdge on swp1:

cumulus@switch:~$ nv set interface swp1 bridge domain br_default stp auto-edge off
cumulus@switch:~$ nv config apply

Edit the switch port interface stanza in the /etc/network/interfaces file to add the mstpctl-portautoedge no line, then run the ifreload -a command.

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto swp1
iface swp1
    alias to Server01
    # Port to Server02
    mstpctl-portautoedge no
...
cumulus@switch:~$ sudo ifreload -a

The following example commands reenable PortAutoEdge on swp1:

cumulus@switch:~$ nv set interface swp1 bridge domain br_default stp auto-edge on
cumulus@switch:~$ nv config apply

Edit the switch port interface stanza in the /etc/network/interfaces file to remove mstpctl-portautoedge no, then run the ifreload -a command.

BPDU Guard

You can configure BPDU guard to protect the spanning tree topology from an unauthorized device affecting the forwarding path. For example, if you add a new host to an access port off a leaf switch and the host sends an STP BPDU, BPDU guard protects against undesirable topology changes in the environment.

The following example commands set BPDU guard for swp5:

cumulus@switch:~$ nv set interface swp5 bridge domain br_default stp bpdu-guard on
cumulus@switch:~$ nv config apply

Edit the switch port interface stanza in the /etc/network/interfaces file to add the mstpctl-bpduguard yes line, then run the ifreload -a command.

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto swp5
iface swp5
    mstpctl-bpduguard yes
...
cumulus@switch:~$ sudo ifreload -a

To see if a port has BPDU guard on or if the port receives a BPDU:

cumulus@switch:~$ nv show bridge domain br_default stp
cumulus@switch:~$ mstpctl showportdetail br_default
bridge:swp5 CIST info
  enabled            no                      role                 Disabled
  port id            8.001                   state                discarding
  external port cost 305                     admin external cost  0
  internal port cost 305                     admin internal cost  0
  designated root    8.000.6C:64:1A:00:4F:9C dsgn external cost   0
  dsgn regional root 8.000.6C:64:1A:00:4F:9C dsgn internal cost   0
  designated bridge  8.000.6C:64:1A:00:4F:9C designated port      8.001
  admin edge port    no                      auto edge port       yes
  oper edge port     no                      topology change ack  no
  point-to-point     yes                     admin point-to-point auto
  restricted role    no                      restricted TCN       no
  port hello time    10                      disputed             no
  bpdu guard port    yes                     bpdu guard error     yes
  network port       no                      BA inconsistent      no
  Num TX BPDU        3                       Num TX TCN           2
  Num RX BPDU        488                     Num RX TCN           2
  Num Transition FWD 1                       Num Transition BLK   2
  bpdufilter port    no
  clag ISL           no                      clag ISL Oper UP     no
  clag role          unknown                 clag dual conn mac   0:0:0:0:0:0
  clag remote portID F.FFF                   clag system mac      0:0:0:0:0:0
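
In the output above, the relevant line is the one that contains bpdu guard port and bpdu guard error. If you collect this output from many ports, a small script can extract both fields. The following Python sketch is illustrative; bpdu_guard_status is a hypothetical helper, and it assumes that the two fields appear on one line in the layout shown above.

```python
import re


def bpdu_guard_status(port_detail):
    """Extract the BPDU guard fields from mstpctl showportdetail text, or None if absent."""
    match = re.search(
        r"bpdu guard port\s+(\S+)\s+bpdu guard error\s+(\S+)", port_detail
    )
    if match is None:
        return None
    return {"enabled": match.group(1) == "yes", "error": match.group(2) == "yes"}


# One line copied from the output shown in this section.
sample = "  bpdu guard port    yes                     bpdu guard error     yes"
print(bpdu_guard_status(sample))  # {'enabled': True, 'error': True}
```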

If a port receives a BPDU, it goes into a protodown state, which results in a local OPER DOWN (carrier down) on the interface. Cumulus Linux also sets the protodown reason as bpduguard and records a log message in /var/log/syslog.

To show the reason for the port protodown, run the ip -p -j link show <interface> command.

cumulus@switch:~$ ip -p -j link show swp5

To recover from the protodown state, remove the protodown reason and protodown from the interface with the NVUE nv action clear interface <interface> bridge domain <domain> stp bpduguardviolation command or the Linux mstpctl clearbpduguardviolation <bridge> <interface> command.

  • Bringing up the disabled port does not correct the problem if you do not resolve the configuration issue on the connected end station.
  • If you remove the interface from the bridge while the interface is in a protodown state, you must use the ip link set <interface> protodown off protodown_reason stp off command to recover from the protodown state.

Bridge Assurance

On a point-to-point link where RSTP is running, you can detect unidirectional links and put the port in a discarding state by enabling bridge assurance on the port. To enable bridge assurance, set the port type to network. The port is then in a bridge assurance inconsistent state until it receives a BPDU from the peer. For bridge assurance to work, you must configure the port type network on both ends of the link.

Cumulus Linux disables bridge assurance by default.

The following example commands enable bridge assurance on swp5:

cumulus@switch:~$ nv set interface swp5 bridge domain br_default stp network on
cumulus@switch:~$ nv config apply

Edit the switch port interface stanza in the /etc/network/interfaces file to add the mstpctl-portnetwork yes line, then run the ifreload -a command.

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto swp5
iface swp5
    mstpctl-portnetwork yes
...
cumulus@switch:~$ sudo ifreload -a

Runtime Configuration (Advanced)

A runtime configuration is non-persistent, which means the configuration you create here does not persist after you reboot the switch.

To enable bridge assurance at runtime, run mstpctl:

cumulus@switch:~$ sudo mstpctl setportnetwork br1007 swp1.1007 yes

cumulus@switch:~$ sudo mstpctl showportdetail br1007 swp1.1007 | grep network
  network port       yes                     BA inconsistent      yes

To monitor logs for bridge assurance messages, run the following command:

cumulus@switch:~$ sudo grep -in assurance /var/log/syslog | grep mstp
  1365:Jun 25 18:03:17 mstpd: br1007:swp1.1007 Bridge assurance inconsistent

BPDU Filter

You can enable bpdufilter on a switch port, which filters BPDUs in both directions. This effectively disables STP on the port because no BPDUs transit it.

Using BPDU filter sometimes causes layer 2 loops. Use this feature with caution.

The following example commands configure the BPDU filter on swp6:

cumulus@switch:~$ nv set interface swp6 bridge domain br_default stp bpdu-filter on
cumulus@switch:~$ nv config apply

Edit the switch port interface stanza in the /etc/network/interfaces file to add the mstpctl-portbpdufilter yes line, then run the ifreload -a command.

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto swp6
iface swp6
    mstpctl-portbpdufilter yes
...
cumulus@switch:~$ sudo ifreload -a

Runtime Configuration (Advanced)

A runtime configuration is non-persistent, which means the configuration you create here does not persist after you reboot the switch.

To enable BPDU filter at runtime, run mstpctl. For example:

cumulus@switch:~$ sudo mstpctl setportbpdufilter br100 swp1.100=yes swp2.100=yes

Restricted Role

To enable the interface in the bridge to take the restricted role:

cumulus@switch:~$ nv set interface swp1 bridge domain br_default stp restrrole on
cumulus@switch:~$ nv config apply

Edit the switch port interface stanza in the /etc/network/interfaces file to add the mstpctl-portrestrrole yes line, then run the ifreload -a command.

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto swp6
iface swp6
    mstpctl-portrestrrole yes
...
cumulus@switch:~$ sudo ifreload -a

Force Version Setting

By default, the switch sends RSTP type 2 BPDUs. You can configure the switch to send STP type 0 configuration BPDUs when you need to interoperate with other systems.

cumulus@switch:~$ nv set bridge domain br_default stp force-protocol-version stp
cumulus@switch:~$ nv config apply

To change the setting back to the default, run the nv set bridge domain <domain> stp force-protocol-version rstp command.

Edit the bridge stanza in the /etc/network/interfaces file to add the mstpctl-forcevers stp line, then run the ifreload -a command.

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto br_default
iface br_default
    hwaddress 08:00:27:60:36:0b
    bridge-vlan-aware yes
    bridge-vids 10
    bridge-pvid 1
    bridge-stp yes
    bridge-mcsnoop no
    mstpctl-forcevers stp
    mstpctl-pvrst-mode yes
...
cumulus@switch:~$ sudo ifreload -a

To change the setting back to the default, change the line in the bridge stanza to mstpctl-forcevers rstp, then run the ifreload -a command.

Additional STP Settings

The table below describes additional STP configuration parameters available in Cumulus Linux. You can set these optional parameters manually by editing the /etc/network/interfaces file. Cumulus Linux does not provide NVUE commands for these parameters.

The IEEE 802.1D and 802.1Q specifications describe STP parameters. For a comparison of STP parameter configuration between mstpctl and other vendors, read this knowledge base article.

Parameter | Description
mstpctl-maxage | Sets the maximum age of the bridge in seconds. The default is 20. The maximum age timer must be equal to, or less than, two times the forward delay minus one second (bridge max age <= 2 * bridge forward delay - 1 second). Add this parameter to the bridge stanza of the /etc/network/interfaces file. If you are running STP in PVRST mode, see PVRST Mode for a VLAN-aware Bridge.
mstpctl-fdelay | Sets the bridge forward delay time in seconds. The default value is 15. Two times the forward delay minus one second must be more than or equal to the maximum age (2 * bridge forward delay - 1 second >= bridge max age). Add this parameter to the bridge stanza of the /etc/network/interfaces file. If you are running STP in PVRST mode, see PVRST Mode for a VLAN-aware Bridge.
mstpctl-maxhops | Sets the maximum hops for the bridge. The default is 20. Add this parameter to the bridge stanza of the /etc/network/interfaces file. This parameter does not apply to PVRST mode.
mstpctl-txholdcount | Sets the bridge transmit hold count. The default value is 6. Add this parameter to the bridge stanza of the /etc/network/interfaces file. In PVRST mode, the transmit hold count applies to each interface in the VLAN.
mstpctl-hello | Sets the bridge hello time in seconds. The default is 2. Add this parameter to the bridge stanza of the /etc/network/interfaces file. If you are running STP in PVRST mode, see PVRST Mode for a VLAN-aware Bridge.
mstpctl-portp2p | Enables or disables point-to-point detection mode for the interface in the bridge. Add this parameter to the interface stanza of the /etc/network/interfaces file.
mstpctl-portrestrtcn | Enables or disables the interface in the bridge to propagate received topology change notifications. The default is no. Add this parameter to the interface stanza of the /etc/network/interfaces file.
mstpctl-treeportcost | Sets the spanning tree port cost to a value from 0 to 255. The default is 0. Add this parameter to the interface stanza of the /etc/network/interfaces file. This parameter applies to RSTP mode only.

Be sure to run the sudo ifreload -a command after you set the STP parameter in the /etc/network/interfaces file.

Troubleshooting

To show the STP state for a bridge:

cumulus@switch:~$ nv show bridge domain br_default stp state
   operational  applied
   -----------  -------
   up           up

To show configuration information for a bridge interface:

cumulus@switch:~$ nv show interface swp1 bridge domain br_default
               operational  applied
-------------  -----------  -------
access         10           10     
learning       on           on     
stp                                
  admin-edge   on           on     
  auto-edge    on           on     
  bpdu-filter  off          off    
  bpdu-guard   on           on     
  network      off          off    
  path-cost    20000               
  restrrole    off          off    
  [vlan]                           
  state        forwarding 

To show STP configuration information for a bridge interface:

cumulus@switch:~$ nv show interface swp1 bridge domain br_default stp
             operational  applied
-----------  -----------  -------
admin-edge   on           on     
auto-edge    on           on     
bpdu-filter  off          off    
bpdu-guard   on           on     
network      off          off    
path-cost    20000               
restrrole    off          off    
[vlan]                           
state        forwarding

To show the STP information for a bridge, run the nv show bridge domain br_default stp command.

The following example shows the output in PVRST mode:

cumulus@switch:~$ nv show bridge domain br_default stp
Bridge
    mode    : pvrst
Vlan              Bridge ID               HelloTime   MaxAge      FwdDly   
        Priority         MAC-addr        (seconds)    (seconds)   (seconds)  
-----  ------------------------------    ----------  ----------  ----------
1      32769   44:38:39:22:01:7A            2           20          15     
10     4106    44:38:39:22:01:7A            4           6           4      
20     61460   44:38:39:22:01:7A            2           20          15     
30     32798   44:38:39:22:01:7A            2           20          15 
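In the PVRST output above, each per-VLAN priority is a base priority (a multiple of 4096) plus the VLAN ID; for example, 4106 is 4096 + 10. The following Python sketch splits a displayed priority into those two parts. The function name is illustrative and is not part of Cumulus Linux.

```python
def split_pvrst_priority(displayed: int) -> tuple[int, int]:
    """Split a displayed per-VLAN bridge priority into (base priority, VLAN ID).

    Assumes the 16-bit layout implied by the sample output: the top 4 bits
    carry the configured priority (a multiple of 4096) and the low 12 bits
    carry the VLAN ID.
    """
    if not 0 <= displayed <= 0xFFFF:
        raise ValueError("priority must fit in 16 bits")
    return displayed & 0xF000, displayed & 0x0FFF


# Priorities taken from the sample PVRST output.
for value in (32769, 4106, 61460, 32798):
    base, vlan = split_pvrst_priority(value)
    print(f"{value}: base priority {base}, VLAN {vlan}")
```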

The following example shows the output in RSTP mode:

cumulus@switch:~$ nv show bridge domain br_default stp
Bridge
    mode    : rstp
    priority: 32768
    state   : up
Bridge ID                priority    : 32768   mac-address       : 44:38:39:22:01:8A   
Designated Root ID       priority    : 32768   mac-address       : 44:38:39:22:01:8A   root-port  : -
Timers                   hello-time  : 2s      forward-delay     : 15s                 max-age    : 20s
Max Hops                 max-hops    : 20      
Topology Change Network  count       : 1       time since change : 838s
                         change port : swp3    last change port  : swp2

Interface info: swp1
---------------------------------
port-id            : 128.1(priority: 128, num: 1)
role               : Designated
state              : forwarding
port-path-cost     : 20000
fdb-flush          : no
disputed           : no
...

To show PVRST information for the VLANs in a bridge:

cumulus@switch:~$ nv show bridge domain br_default stp vlan
Bridge Vlan: 1
--------------------------------------------------------------------------
Bridge ID                priority    : 32769   mac-address       : 44:38:39:22:01:B1   
Designated Root ID       priority    : 32769   mac-address       : 44:38:39:22:01:B1   root-port  : -
Timers                   hello-time  : 2s      forward-delay     : 15s                 max-age    : 20s
Topology Change Network  count       : 0       time since change : 1152s
                         change port : None    last change port  : None

Bridge Vlan: 10
--------------------------------------------------------------------------
Bridge ID                priority    : 4106    mac-address       : 44:38:39:22:01:B1   
Designated Root ID       priority    : 4106    mac-address       : 44:38:39:22:01:B1   root-port  : -
Timers                   hello-time  : 4s      forward-delay     : 4s                  max-age    : 6s
Topology Change Network  count       : 1       time since change : 1147s
                         change port : swp2    last change port  : swp1

Bridge Vlan: 20
--------------------------------------------------------------------------
Bridge ID                priority    : 32788   mac-address       : 44:38:39:22:01:B1   
Designated Root ID       priority    : 32788   mac-address       : 44:38:39:22:01:B1   root-port  : -
Timers                   hello-time  : 2s      forward-delay     : 15s                 max-age    : 20s
Topology Change Network  count       : 1       time since change : 1147s
                         change port : swp2    last change port  : swp1

To show PVRST information for a specific bridge VLAN:

cumulus@switch:~$ nv show bridge domain br_default stp vlan 10
Bridge ID                priority    : 4106    mac-address       : 44:38:39:22:01:B1   
Designated Root ID       priority    : 4106    mac-address       : 44:38:39:22:01:B1   root-port  : -
Timers                   hello-time  : 4s      forward-delay     : 4s                  max-age    : 6s
Topology Change Network  count       : 1       time since change : 1174s
                         change port : swp2    last change port  : swp1

Interface info: swp1
---------------------------------
port-id            : 8.001
role               : Designated
state              : forwarding
port-path-cost     : 20000
tx-hold-count      : 6
port-hello-time    : 4s
fdb-flush          : no
disputed           : no

Interface info: swp2
---------------------------------
port-id            : 8.002
role               : Designated
state              : forwarding
port-path-cost     : 20000
tx-hold-count      : 6
port-hello-time    : 4s
fdb-flush          : no
disputed           : no

To show STP information for the ports in a bridge:

cumulus@switch:~$ nv show bridge domain br_default stp port
Interface Info: bond1
--------------------------------------------------------------------------
enabled              : yes         mcheck            : no
admin-edge-port      : no          bpdu-guard-port   : no
auto-edge-port       : yes         bpdu-filter-port  : no
oper-edge-port       : yes         bpdu-guard-error  : no
admin-port-path-cost : 0           restricted-tcn    : no
port-path-cost       : 20000       restricted-role   : no
network-port         : no          ba-inconsistent   : no
clag-role            : primary     clag-system-mac   : 44:38:39:BE:EF:AA
clag-isl             : no          clag-isl-oper-up  : no
clag-dual-conn-mac   : 00:00:00:00:00:00

Interface Info: bond2
--------------------------------------------------------------------------
enabled              : yes         mcheck            : no
admin-edge-port      : no          bpdu-guard-port   : no
auto-edge-port       : yes         bpdu-filter-port  : no
oper-edge-port       : yes         bpdu-guard-error  : no
admin-port-path-cost : 0           restricted-tcn    : no
port-path-cost       : 20000       restricted-role   : no
network-port         : no          ba-inconsistent   : no
clag-role            : primary     clag-system-mac   : 44:38:39:BE:EF:AA
clag-isl             : no          clag-isl-oper-up  : no
clag-dual-conn-mac   : 00:00:00:00:00:00

Interface Info: bond3
--------------------------------------------------------------------------
enabled              : yes         mcheck            : no
admin-edge-port      : no          bpdu-guard-port   : no
auto-edge-port       : yes         bpdu-filter-port  : no
oper-edge-port       : yes         bpdu-guard-error  : no
admin-port-path-cost : 0           restricted-tcn    : no
port-path-cost       : 20000       restricted-role   : no
network-port         : no          ba-inconsistent   : no
clag-role            : primary     clag-system-mac   : 44:38:39:BE:EF:AA
clag-isl             : no          clag-isl-oper-up  : no
clag-dual-conn-mac   : 00:00:00:00:00:00
...

To show STP information for a specific bridge port:

cumulus@switch:~$ nv show bridge domain br_default stp port swp1
enabled              : yes         mcheck            : no
admin-edge-port      : no          bpdu-guard-port   : no
auto-edge-port       : yes         bpdu-filter-port  : no
oper-edge-port       : yes         bpdu-guard-error  : no
admin-port-path-cost : 0           restricted-tcn    : no
port-path-cost       : 20000       restricted-role   : no
network-port         : no          ba-inconsistent   : no
clag-role            : primary     clag-system-mac   : 44:38:39:BE:EF:AA
clag-isl             : no          clag-isl-oper-up  : no
clag-dual-conn-mac   : 00:00:00:00:00:00

To show the root ID and root cost for the bridge, run the nv show bridge domain <bridge> stp root command.

The following command shows the output in PVRST mode:

cumulus@switch:~$ nv show bridge domain br_default stp root
instance             root-id                root-cost  hello-time  fwd-dly     max-age      root-port
           Priority        MAC-addr                     (seconds)   (seconds)   (seconds)  
-------- --------------------------------  ----------  ----------  ----------  ----------  ----------
1        32769   44:38:39:22:01:7A            0           2           15          20          -      
10       4106    44:38:39:22:01:7A            0           4           4           6           -      
20       61460   44:38:39:22:01:7A            0           2           15          20          -      
30       32798   44:38:39:22:01:7A            0           2           15          20          -    

The following command shows the output in RSTP mode:

cumulus@switch:~$ nv show bridge domain br_default stp root
instance             root-id                root-cost  hello-time  fwd-dly     max-age      root-port
          Priority        MAC-addr                     (seconds)   (seconds)   (seconds)  
-------- --------------------------------  ----------  ----------  ----------  ----------  ----------
CIST     32768      44:38:39:22:01:8A            0           2           15          20          -      

To show STP counters for a bridge:

cumulus@switch:~$ nv show bridge domain br_default stp counters
port  tx-bpdu  rx-bpdu  tx-tcn  rx-tcn  fwd-trans  blk-trans  tx-pvst-tnl-bpdu  rx-pvst-tnl-bpdu
----  -------  -------  ------  ------  ---------  ---------  ----------------  ----------------
swp1  182      0        0       0       1          0          91                0               
swp2  296      0        2       0       2          1          297               0               
swp3  296      0        7       0       4          7          539               0
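If you collect this output in a script, you can turn the counters table into a dictionary and flag ports that moved to blocking more than once. The following Python sketch parses the sample text shown above; the function name and the threshold of 1 are assumptions chosen for illustration, not NVIDIA recommendations.

```python
SAMPLE = """\
port  tx-bpdu  rx-bpdu  tx-tcn  rx-tcn  fwd-trans  blk-trans  tx-pvst-tnl-bpdu  rx-pvst-tnl-bpdu
----  -------  -------  ------  ------  ---------  ---------  ----------------  ----------------
swp1  182      0        0       0       1          0          91                0
swp2  296      0        2       0       2          1          297               0
swp3  296      0        7       0       4          7          539               0
"""


def parse_stp_counters(text):
    """Return {port: {counter name: value}} from the counters table text."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    counters = {}
    for line in lines[2:]:  # skip the header row and the dashed separator
        fields = line.split()
        counters[fields[0]] = dict(zip(header[1:], map(int, fields[1:])))
    return counters


counters = parse_stp_counters(SAMPLE)
# Hypothetical threshold: report ports with more than one blocking transition.
unstable = [port for port, c in counters.items() if c["blk-trans"] > 1]
print(unstable)
```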

To show all blocked ports in the bridge:

cumulus@switch:~$ nv show bridge domain br_default stp blocked-ports

To show the mstpd bridge port state, run the mstpctl showstpport <bridge> command.

The following command shows the output in RSTP mode:

cumulus@switch:~$ sudo mstpctl showstpport br_default
 E swp1  8.001 forw 8.000.44:38:39:22:01:8A 8.000.44:38:39:22:01:8A 8.001 Desg
 E swp2  8.002 forw 8.000.44:38:39:22:01:8A 8.000.44:38:39:22:01:8A 8.002 Desg
 E swp3  8.003 forw 8.000.44:38:39:22:01:8A 8.000.44:38:39:22:01:8A 8.003 Desg

The following command shows the output in PVRST mode:

cumulus@switch:~$ sudo mstpctl showstpport br_default
 E swp1 
  ---PTP Info---
Port: swp1 vid: 1
8.001  8.001.44:38:39:22:01:8A 8.001.44:38:39:22:01:8A 8.001 Desg
state: forw
  ---PTP Info---
Port: swp1 vid: 10
8.001  1.00A.44:38:39:22:01:8A 1.00A.44:38:39:22:01:8A 8.001 Desg
state: forw
  ---PTP Info---
Port: swp1 vid: 20
8.001  F.014.44:38:39:22:01:8A F.014.44:38:39:22:01:8A 8.001 Desg
state: forw
  ---PTP Info---
Port: swp1 vid: 30
8.001  8.01E.44:38:39:22:01:8A 8.01E.44:38:39:22:01:8A 8.001 Desg
state: forw
 E swp2 
...
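The bridge IDs in the mstpctl output above use a dotted hexadecimal form, such as 1.00A.44:38:39:22:01:8A. The following Python sketch decodes that form on the assumption that the first field is the priority in units of 4096 and the second field is the VLAN ID in hexadecimal. The function name is illustrative.

```python
def parse_mstpctl_bridge_id(bridge_id: str) -> dict:
    """Decode '<prio hex>.<vlan hex>.<mac>' as printed by mstpctl.

    Assumption: the first field is the priority in units of 4096 and the
    second field is the VLAN ID, both in hexadecimal.
    """
    prio_hex, vlan_hex, mac = bridge_id.split(".", 2)
    base = int(prio_hex, 16) * 4096
    vlan = int(vlan_hex, 16)
    return {"priority": base + vlan, "base_priority": base, "vlan": vlan, "mac": mac}


for text in ("8.001.44:38:39:22:01:8A", "1.00A.44:38:39:22:01:8A",
             "F.014.44:38:39:22:01:8A", "8.01E.44:38:39:22:01:8A"):
    info = parse_mstpctl_bridge_id(text)
    print(f"{text}: priority {info['priority']}, VLAN {info['vlan']}")
```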

To show STP information for a bridge domain, including STP counters, run the sudo mstpctl showstpall command.

The following command shows the output in RSTP mode:

cumulus@switch:~$ sudo mstpctl showstpall 

Global info 
  debug level       2

BRIDGE: br_default, Br_index: 58

Spanning-tree enabled protocol rstp
Bridge Parameters for br_default
  enabled                    yes
  force protocol version     rstp
  tx hold count              6
  hello time                 2s
  bridge forward delay       15s
  bridge max age             20s
  max hops                   20
  migrate_time               3s
  ageing time                300s
  if_index: 58, name: br_default, up: yes, vlan_filter: yes uptime: 1244
---CIST info---
  bridge id       8.000.44:38:39:22:01:8A
  designated root 8.000.44:38:39:22:01:8A
  regional root   8.000.44:38:39:22:01:8A
  path cost     0          internal path cost   0

  root port         none
  root max age       20        s
  root forward delay 15        s
  time since topology change 1239s
  topology change count      1
  topology change            no
  topology change port       swp3
  last topology change port  swp2
  PRSSM_state: role_selection
...

The following command shows the output in PVRST mode:

cumulus@switch:~$ sudo mstpctl showstpall 

Global info 
  debug level       2

BRIDGE: br_default, Br_index: 58

Spanning-tree enabled protocol rapid-pvst
Bridge Parameters for br_default
  enabled                    yes
  force protocol version     rstp
  tx hold count              6
  migrate_time               3s
  ageing time                300s
  if_index: 58, name: br_default, up: yes, vlan_filter: yes uptime: 141
---Bridge Vlan 1---
  bridge id       8.001.44:38:39:22:01:8A
  priority      32769      
  forward delay       15         
  Max_Age       20         
  Hello_Time       2          
  designated root 8.001.44:38:39:22:01:8A
  root path cost     0          
  root port         none
  root max age       20        s
  root forward delay 15        s
  time since topology change 141s
  topology change count      0
  topology change            no
  topology change port       None
  last topology change port  None
  PRSSM_state: role_selection
---Bridge Vlan 10---
  bridge id       1.00A.44:38:39:22:01:8A
  priority      4106       
  forward delay       4          
  Max_Age       6          
  Hello_Time       4          
  designated root 1.00A.44:38:39:22:01:8A
  root path cost     0          
  root port         none
  root max age       6         s
  root forward delay 4         s
  time since topology change 136s
  topology change count      1
  topology change            no
  topology change port       swp3
  last topology change port  swp2
  PRSSM_state: role_selection
  ...

To show the bridge state, run the brctl show command:

cumulus@switch:~$ sudo brctl show
  bridge name     bridge id               STP enabled     interfaces
  br_default      8000.001401010100       yes             swp1
                                                          swp2
                                                          swp3

mstpd is the preferred utility for interacting with STP on Cumulus Linux. brctl also provides certain tools for STP; however, they are not as complete and output from brctl is sometimes misleading.

Considerations

You must remove PVRST VLAN configuration before you remove the VLANs on the interface or bridge.

The following example removes PVRST VLAN 10 configuration, then removes VLAN 10 from the bridge:

cumulus@switch:~$ nv unset bridge domain br_default stp vlan 10
cumulus@switch:~$ nv unset bridge domain br_default vlan 10

The following example removes PVRST VLAN 10 configuration, then removes VLAN 10 from swp1:

cumulus@switch:~$ nv unset interface swp1 bridge domain br_default stp vlan 10
cumulus@switch:~$ nv unset interface swp1 bridge domain br_default vlan 10
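If you generate configuration with a script, keep the two commands in the order shown above. The following Python sketch builds the command strings for a bridge or for a single interface; the helper name is hypothetical and the function only returns text, it does not run NVUE.

```python
def pvrst_vlan_removal_commands(vlan, bridge="br_default", interface=None):
    """Return the nv unset commands in the required order.

    The PVRST VLAN configuration is removed first, then the VLAN itself.
    """
    if interface:
        scope = f"interface {interface} bridge domain {bridge}"
    else:
        scope = f"bridge domain {bridge}"
    return [
        f"nv unset {scope} stp vlan {vlan}",  # 1. remove PVRST VLAN configuration
        f"nv unset {scope} vlan {vlan}",      # 2. remove the VLAN
    ]


for command in pvrst_vlan_removal_commands(10, interface="swp1"):
    print(command)
```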

Storm Control

Storm control provides protection against excessive inbound BUM (broadcast, unknown unicast, multicast) traffic on layer 2 switch port interfaces, which can cause poor network performance.

Configure Storm Control

To configure storm control settings, you can either run NVUE commands or manually edit the /etc/cumulus/switchd.conf file.

The following command example enables broadcast storm control for swp4 at 400 packets per second (pps), multicast storm control at 3000 pps, and unknown unicast at 2000 pps.

cumulus@switch:~$ nv set interface swp4 storm-control broadcast 400
cumulus@switch:~$ nv set interface swp4 storm-control multicast 3000
cumulus@switch:~$ nv set interface swp4 storm-control unknown-unicast 2000
cumulus@switch:~$ nv config apply

The storm control settings require a switchd reload. Before applying the settings, NVUE indicates if it requires a switchd reload and prompts you for confirmation. When the switchd service reloads, there is no interruption to network services.

The following example command disables multicast storm control on swp4:

cumulus@switch:~$ nv unset interface swp4 storm-control multicast
cumulus@switch:~$ nv config apply

Edit the /etc/cumulus/switchd.conf file and uncomment the storm_control.broadcast, storm_control.multicast, and storm_control.unknown_unicast lines:

cumulus@switch:~$ sudo nano /etc/cumulus/switchd.conf
...
# Storm Control setting on a port, in pps
interface.swp4.storm_control.broadcast = 400
interface.swp4.storm_control.multicast = 3000
interface.swp4.storm_control.unknown_unicast = 2000
...

When you change the storm control settings, you must reload switchd with the sudo systemctl reload switchd.service command for the changes to take effect. The reload does not interrupt network services.
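To produce the switchd.conf lines for several ports consistently, you can generate them. The following Python sketch renders lines in the format shown above; the function name is illustrative, and you still need to add the lines to the file and reload switchd yourself.

```python
def storm_control_lines(interface, broadcast=None, multicast=None, unknown_unicast=None):
    """Render switchd.conf storm control lines (values in pps) for one port."""
    settings = {
        "broadcast": broadcast,
        "multicast": multicast,
        "unknown_unicast": unknown_unicast,
    }
    lines = []
    for kind, pps in settings.items():
        if pps is None:
            continue  # leave this traffic type unconfigured
        if pps < 0:
            raise ValueError(f"{kind} rate must not be negative")
        lines.append(f"interface.{interface}.storm_control.{kind} = {pps}")
    return lines


print("\n".join(storm_control_lines("swp4", broadcast=400, multicast=3000,
                                    unknown_unicast=2000)))
```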

Show Storm Control Settings

To show the current storm control settings for a layer 2 interface, run the nv show interface <interface> storm-control command.

cumulus@switch:~$ nv show interface swp4 storm-control
                 applied  description
---------------  -------  ----------------------------------------------------------
broadcast        400      Configure storm control for broadcast traffic in pps
multicast        3000     Configure storm control for multicast traffic in pps
unknown-unicast  2000     Configure storm control for unknown unicast traffic in pps

Bonding - Link Aggregation

Linux bonding provides a way to aggregate multiple network interfaces (slaves) into a single logical bonded interface (bond). Link aggregation is useful for linear scaling of bandwidth, load balancing, and failover protection.

Cumulus Linux supports two bonding modes: IEEE 802.3ad link aggregation mode, which is the default, and balance-xor mode.

Cumulus Linux uses version 1 of the LAG control protocol (LACP).

  • NVUE does not accept a bond name starting with an interface type ID, such as sw, eth, vlan, lo, ib, fnm, or vrrp. For example, you cannot name a bond login123, eth2, sw1, or vlan10.
  • An interface cannot belong to multiple bonds.
  • A bond can have subinterfaces, but subinterfaces cannot have a bond.
  • A bond cannot enslave VLAN subinterfaces.
  • All slave ports within a bond must have the same speed and duplex, and must match the slave ports of the link partner.

Create a Bond

To create a bond, specify the bond members. In the example below, the front panel port interfaces swp1 through swp4 are members of bond1, but swp5 and swp6 are not part of bond1.

cumulus@switch:~$ nv set interface bond1 bond member swp1-4
cumulus@switch:~$ nv config apply

In NVUE, if you create the bond interface with a name that starts with bond, NVUE automatically sets the interface type to bond. If you create a bond interface with a name that does not start with bond, you must set the interface type to bond with the nv set interface <interface-name> type bond command.
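The naming rules above can be checked before you push configuration. The following Python sketch applies only the two rules stated in this section: the reserved prefixes and the automatic type for names that start with bond. The prefix list is the one given here and might not be exhaustive; the function name is hypothetical and NVUE performs its own validation.

```python
# Interface type IDs listed in this section; NVUE rejects bond names that start with them.
RESERVED_PREFIXES = ("sw", "eth", "vlan", "lo", "ib", "fnm", "vrrp")


def check_bond_name(name: str) -> str:
    """Describe how NVUE is expected to treat a proposed bond name."""
    for prefix in RESERVED_PREFIXES:
        if name.startswith(prefix):
            return f"rejected: starts with interface type ID '{prefix}'"
    if name.startswith("bond"):
        return "ok: NVUE sets the interface type to bond automatically"
    return f"ok: also run 'nv set interface {name} type bond'"


for candidate in ("bond1", "uplink1", "login123", "vlan10"):
    print(candidate, "->", check_bond_name(candidate))
```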

Edit the /etc/network/interfaces file to add a stanza for the bond, then run the ifreload -a command.

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto bond1
iface bond1
    bond-slaves swp1 swp2 swp3 swp4
...
cumulus@switch:~$ sudo ifreload -a

  • By default, the bond uses IEEE 802.3ad link aggregation mode. To configure the bond in balance-xor mode, see Optional Configuration below.
  • If the bond is not going to be part of a bridge, you must specify an IP address.
  • Make sure the name of the bond adheres to Linux interface naming conventions and is unique within the switch.
  • To temporarily bring up a bond even when there is no LACP partner, use LACP Bypass.

When you start networking, the switch creates bond1 as MASTER and interfaces swp1 through swp4 come up in SLAVE mode:

cumulus@switch:~$ ip link show
...

3: swp1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond1 state UP mode DEFAULT qlen 500
    link/ether 44:38:39:00:03:c1 brd ff:ff:ff:ff:ff:ff
4: swp2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond1 state UP mode DEFAULT qlen 500
    link/ether 44:38:39:00:03:c1 brd ff:ff:ff:ff:ff:ff
5: swp3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond1 state UP mode DEFAULT qlen 500
    link/ether 44:38:39:00:03:c1 brd ff:ff:ff:ff:ff:ff
6: swp4: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond1 state UP mode DEFAULT qlen 500
    link/ether 44:38:39:00:03:c1 brd ff:ff:ff:ff:ff:ff
...

55: bond1: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT
    link/ether 44:38:39:00:03:c1 brd ff:ff:ff:ff:ff:ff

All slave interfaces within a bond have the same MAC address as the bond. Typically, the first slave you add to the bond donates its MAC address as the bond MAC address. The bond MAC address is the source MAC address for all traffic leaving the bond and provides a single destination MAC address to address traffic to the bond.

Removing a bond slave interface from which a bond derives its MAC address affects traffic when the bond interface flaps to update the MAC address.

Optional Configuration

You can set these configuration options for a bond.

  • Link aggregation mode: Cumulus Linux supports IEEE 802.3ad link aggregation mode (802.3ad) and balance-xor mode. The default mode is 802.3ad. Set balance-xor mode only if you cannot use LACP; LACP can detect mismatched link attributes between bond members and can even detect misconnections. When you use balance-xor mode to dual-connect host-facing bonds in an MLAG environment, you must configure the MLAG ID with the same value on both MLAG switches. Otherwise, the MLAG switch pair treats the bonds as single-connected.
  • MII link monitoring frequency: How often (in milliseconds) you want to inspect the link state of each slave for failures. You can specify a value between 0 and 255. The default value is 100.
  • miimon link status mode: The miimon link status mode. You can set the mode to either netif_carrier_ok(), or MII or ethtool ioctls. The default setting is netif_carrier_ok().
  • LACP bypass: Set LACP bypass on a bond in 802.3ad mode so that it becomes active and forwards traffic even when there is no LACP partner. You can specify on or off. The default setting is off. See LACP Bypass.
  • Transmit rate: The rate at which the link partner transmits LACP control packets. You can specify slow or fast. The default setting is fast.
  • Minimum number of links: The minimum number of links that must be active before the bond goes into service. You can set a value between 0 and 255. The default value is 1, which indicates that the bond must have at least one active member. Use a value greater than 1 if you need higher level services to ensure a minimum aggregate bandwidth level before activating a bond. If the number of active members drops below this setting, the bond appears to upper-level protocols as link-down. When the number of active links returns to greater than or equal to this value, the bond becomes link-up.
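The minimum number of links rule reduces to a single comparison. The following Python sketch models it; the function name is illustrative. Treating a setting of 0 the same as 1 follows the Linux kernel bonding documentation (a bond needs at least one available link to be active) and is an assumption that this guide does not state.

```python
def bond_link_state(active_members: int, min_links: int = 1) -> str:
    """Return the state that upper-level protocols see for the bond."""
    if not 0 <= min_links <= 255:
        raise ValueError("min_links must be between 0 and 255")
    # Assumption: 0 behaves like 1 because a bond with no active member is down.
    required = max(min_links, 1)
    return "up" if active_members >= required else "down"


# A four-member bond with a minimum of three links.
for active in (4, 3, 2):
    print(active, "active ->", bond_link_state(active, min_links=3))
```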

Cumulus Linux sets the bond configuration options to the recommended values by default; use caution when changing settings.

To set the link aggregation mode on bond1 to balance-xor mode (called static in NVUE):

cumulus@switch:~$ nv set interface bond1 bond mode static
cumulus@switch:~$ nv config apply

To reset the link aggregation mode for bond1 to the default value of 802.3ad, run the nv set interface bond1 bond mode lacp command.

Edit the /etc/network/interfaces file and add the bond-mode balance-xor parameter to the bond stanza, then run the ifreload -a command:

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto bond1
iface bond1
    bond-mode balance-xor
    bond-slaves swp1 swp2 swp3 swp4
...
cumulus@switch:~$ sudo ifreload -a

To reset the bond mode for bond1 to the default value of 802.3ad, use the bond-mode 802.3ad parameter:

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto bond1
iface bond1
    bond-mode 802.3ad
    bond-slaves swp1 swp2 swp3 swp4
...

To enable LACP bypass:

cumulus@switch:~$ nv set interface bond1 bond lacp-bypass on 
cumulus@switch:~$ nv config apply

Edit the /etc/network/interfaces file and add the bond-lacp-bypass-allow parameter to the bond stanza, then run the ifreload -a command:

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto bond1
iface bond1
    bond-lacp-bypass-allow
    bond-slaves swp1 swp2 swp3 swp4
...
cumulus@switch:~$ sudo ifreload -a

To set the miimon link status mode to MII or ethtool ioctls:

Cumulus Linux does not provide NVUE commands for this setting.

Edit the /etc/network/interfaces file and add the bond-use-carrier no parameter to the bond stanza, then run the ifreload -a command:

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto bond1
iface bond1
    bond-use-carrier no
    bond-slaves swp1 swp2 swp3 swp4
...
cumulus@switch:~$ sudo ifreload -a

To reset the miimon link status mode to the default of netif_carrier_ok(), use the bond-use-carrier yes parameter.

To set the rate at which the link partner transmits LACP control packets to slow:

cumulus@switch:~$ nv set interface bond1 bond lacp-rate slow
cumulus@switch:~$ nv config apply

To reset the rate to the default value of fast, run the nv set interface bond1 bond lacp-rate fast command.

Edit the /etc/network/interfaces file and add the bond-lacp-rate slow parameter to the bond stanza, then run the ifreload -a command:

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto bond1
iface bond1
    bond-lacp-rate slow
    bond-slaves swp1 swp2 swp3 swp4
...
cumulus@switch:~$ sudo ifreload -a

To reset the rate to the default (fast), use the bond-lacp-rate fast parameter.

To set the minimum number of links that must be active before the bond goes into service to 50:

Cumulus Linux does not provide NVUE commands for this setting.

Edit the /etc/network/interfaces file and add the bond-min-links 50 parameter to the bond stanza, then run the ifreload -a command:

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto bond1
iface bond1
    bond-min-links 50
    bond-slaves swp1 swp2 swp3 swp4
...
cumulus@switch:~$ sudo ifreload -a

Custom Hashing

The switch distributes egress traffic through a bond to a slave based on a packet hash calculation, providing load balancing over the slaves; the switch distributes conversation flows over all available slaves to load balance the total traffic. Traffic for a single conversation flow always hashes to the same slave. In a failover event, the switch adjusts the hash calculation to steer traffic over available slaves.
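The behavior described above can be modeled in a few lines. The following Python sketch uses CRC32 modulo the number of members as a stand-in for the hash; the switch's actual hash function is not described in this guide, and the flow values and function name are made up for illustration.

```python
import zlib


def pick_member(flow, members):
    """Pick the bond member for a flow from its header fields.

    CRC32 modulo the member count is an illustrative stand-in, not the
    algorithm the switch hardware uses.
    """
    key = "|".join(str(field) for field in flow).encode()
    return members[zlib.crc32(key) % len(members)]


members = ["swp1", "swp2", "swp3", "swp4"]
# (source IP, destination IP, IP protocol, source port, destination port)
flow = ("10.0.0.1", "10.0.0.2", 6, 49152, 443)

chosen = pick_member(flow, members)
# The same flow always maps to the same member.
assert chosen == pick_member(flow, members)

# Failover: the chosen member goes down, so the flow moves to a remaining member.
remaining = [m for m in members if m != chosen]
print(chosen, "->", pick_member(flow, remaining))
```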

The hash calculation uses packet header data to choose the slave on which to transmit the packet.

For load balancing between multiple interfaces that are members of the same bond, you can hash on these fields:

Field                    Default Setting  NVUE Command                                            traffic.conf
-----------------------  ---------------  ------------------------------------------------------  --------------------------
IP protocol              on               nv set system forwarding lag-hash ip-protocol on        lag_hash_config.ip_prot
                                          nv set system forwarding lag-hash ip-protocol off
Source MAC address       on               nv set system forwarding lag-hash source-mac on         lag_hash_config.smac
                                          nv set system forwarding lag-hash source-mac off
Destination MAC address  on               nv set system forwarding lag-hash destination-mac on    lag_hash_config.dmac
                                          nv set system forwarding lag-hash destination-mac off
Source IP address        on               nv set system forwarding lag-hash source-ip on          lag_hash_config.sip
                                          nv set system forwarding lag-hash source-ip off
Destination IP address   on               nv set system forwarding lag-hash destination-ip on     lag_hash_config.dip
                                          nv set system forwarding lag-hash destination-ip off
Source port              on               nv set system forwarding lag-hash source-port on        lag_hash_config.sport
                                          nv set system forwarding lag-hash source-port off
Destination port         on               nv set system forwarding lag-hash destination-port on   lag_hash_config.dport
                                          nv set system forwarding lag-hash destination-port off
Ethertype                on               nv set system forwarding lag-hash ether-type on         lag_hash_config.ether_type
                                          nv set system forwarding lag-hash ether-type off
VLAN ID                  on               nv set system forwarding lag-hash vlan on               lag_hash_config.vlan_id
                                          nv set system forwarding lag-hash vlan off
TEID (see GTP Hashing)   off              nv set system forwarding lag-hash gtp-teid on           lag_hash_config.gtp_teid
                                          nv set system forwarding lag-hash gtp-teid off

The following example commands omit the source MAC address and destination MAC address from the hash calculation:

cumulus@switch:~$ nv set system forwarding lag-hash source-mac off
cumulus@switch:~$ nv set system forwarding lag-hash destination-mac off
cumulus@switch:~$ nv config apply

Use the instructions below when NVUE is not enabled. If you are using NVUE to configure your switch, the NVUE commands change the settings in /etc/cumulus/datapath/nvue_traffic.conf, which takes precedence over the settings in /etc/cumulus/datapath/traffic.conf.

  1. Edit the /etc/cumulus/datapath/traffic.conf file:
    • Uncomment the lag_hash_config.enable option.
    • Set the lag_hash_config.smac and lag_hash_config.dmac options to false.
cumulus@switch:~$ sudo nano /etc/cumulus/datapath/traffic.conf
...
#LAG HASH config
#HASH config for LACP to enable custom fields
#Fields will be applicable for LAG hash
#calculation
#Uncomment to enable custom fields configured below
lag_hash_config.enable = true

lag_hash_config.smac = false
lag_hash_config.dmac = false
lag_hash_config.sip  = true
lag_hash_config.dip  = true
lag_hash_config.ether_type = true
lag_hash_config.vlan_id = true
lag_hash_config.sport = true
lag_hash_config.dport = true
lag_hash_config.ip_prot = true
#GTP-U teid
lag_hash_config.gtp_teid = false
...
  2. Run the echo 1 > /cumulus/switchd/ctrl/hash_config_reload command. This command does not cause any traffic interruptions.

    cumulus@switch:~$ echo 1 > /cumulus/switchd/ctrl/hash_config_reload
    

Cumulus Linux enables symmetric hashing by default. Make sure that the settings for the source IP and destination IP fields match, and that the settings for the source port and destination port fields match; otherwise Cumulus Linux disables symmetric hashing automatically. If necessary, you can disable symmetric hashing manually in the /etc/cumulus/datapath/traffic.conf file by setting symmetric_hash_enable = FALSE.
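Symmetric hashing means that both directions of a conversation select the same bond member. The following Python sketch shows one common way to get that property: order the two endpoints before hashing. It is an illustration only; the key format, the CRC32 stand-in, and the function names are assumptions, not the switch's implementation.

```python
import zlib


def symmetric_key(src_ip, dst_ip, src_port, dst_port, protocol):
    """Build a hash key that is identical for both directions of a flow."""
    # Order the two endpoints so (A -> B) and (B -> A) produce the same key.
    first, second = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    return f"{first}|{second}|{protocol}".encode()


def pick_member(key, members):
    # CRC32 modulo the member count is a stand-in for the switch's hash.
    return members[zlib.crc32(key) % len(members)]


members = ["swp1", "swp2", "swp3", "swp4"]
forward = symmetric_key("10.1.1.10", "10.2.2.20", 49152, 443, 6)
reverse = symmetric_key("10.2.2.20", "10.1.1.10", 443, 49152, 6)
print(pick_member(forward, members) == pick_member(reverse, members))  # True
```

The sketch also suggests why the source and destination settings must match: if the hash includes a source field but not the corresponding destination field, the two endpoints can no longer be treated interchangeably.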

You can also set a unique hash seed for each switch to avoid hash polarization. See Unique Hash Seed.

GTP Hashing

GTP carries mobile data within the core of the mobile operator’s network. Traffic in the 5G Mobility core cluster, from cell sites to compute nodes, has the same source and destination IP address. The only way to identify individual flows is with the GTP TEID. Enabling GTP hashing adds the TEID as a hash parameter and helps the Cumulus Linux switches in the network to distribute mobile data traffic evenly across ECMP routes.

Cumulus Linux supports TEID-based load balancing for traffic egressing a bond. TEID-based load balancing applies only if the outer header egressing the port is GTP encapsulated and the ingress packet is either a GTP-U packet or a VXLAN-encapsulated GTP-U packet.

  • Cumulus Linux supports GTP Hashing on NVIDIA Spectrum-2 and later.
  • GTP-C packets are not part of GTP hashing.

To enable TEID-based load balancing:

cumulus@switch:~$ nv set system forwarding lag-hash gtp-teid on
cumulus@switch:~$ nv config apply

To disable TEID-based load balancing, run the nv set system forwarding lag-hash gtp-teid off command.

Use the instructions below when NVUE is not enabled. If you are using NVUE to configure your switch, the NVUE commands change the settings in /etc/cumulus/datapath/nvue_traffic.conf, which takes precedence over the settings in /etc/cumulus/datapath/traffic.conf.

  1. Edit the /etc/cumulus/datapath/traffic.conf file:

    • Uncomment the hash_config.enable = true line.
    • Change the lag_hash_config.gtp_teid parameter to true.
    cumulus@switch:~$ sudo nano /etc/cumulus/datapath/traffic.conf
    ...
    # Uncomment to enable custom fields configured below
    hash_config.enable = true
    ...
    #GTP-U teid
    lag_hash_config.gtp_teid = true
    
  2. Run the echo 1 > /cumulus/switchd/ctrl/hash_config_reload command. This command does not cause any traffic interruptions.

    cumulus@switch:~$ echo 1 > /cumulus/switchd/ctrl/hash_config_reload
    

To disable TEID-based load balancing, set the lag_hash_config.gtp_teid parameter to false, then reload the configuration.
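If you manage traffic.conf with a script instead of an editor, the uncomment-and-set step can be automated. The following Python sketch works on the file contents as a string and uses the option names from the steps above; the helper name and the sample text are hypothetical, and you still need to write the file and trigger the hash configuration reload yourself.

```python
import re


def set_conf_option(text: str, key: str, value: str) -> str:
    """Uncomment (if needed) and set 'key = value' in traffic.conf-style text."""
    # Match the option at the start of a line, with or without a leading '#'.
    pattern = re.compile(rf"^[ \t]*#?[ \t]*{re.escape(key)}[ \t]*=.*$", re.MULTILINE)
    line = f"{key} = {value}"
    if pattern.search(text):
        return pattern.sub(line, text, count=1)
    # Option not present: append it at the end.
    return text.rstrip("\n") + "\n" + line + "\n"


# Hypothetical excerpt, modeled on the lines shown in the steps above.
sample = """\
# Uncomment to enable custom fields configured below
#hash_config.enable = true
#GTP-U teid
lag_hash_config.gtp_teid = false
"""

updated = set_conf_option(sample, "hash_config.enable", "true")
updated = set_conf_option(updated, "lag_hash_config.gtp_teid", "true")
print(updated)
```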

Troubleshooting

To show information for a bond, run the NVUE nv show interface <bond> bond command:

cumulus@leaf01:mgmt:~$ nv show interface bond1 bond
             operational  applied  description
-----------  -----------  -------  ------------------------------------------------------
down-delay   0            0        bond down delay
lacp-bypass  on           on       lacp bypass
lacp-rate    fast         fast     lacp rate
mode                      lacp     bond mode
up-delay     0            0        bond up delay
[member]     swp1         swp1     Set of bond members
mlag
  enable                  on       Turn the feature 'on' or 'off'.  The default is 'off'.
  id         1            1        MLAG id
  status     single                Mlag Interface status

You can also run the Linux sudo cat /proc/net/bonding/<bond> command:

cumulus@leaf01:mgmt:~$ sudo cat /proc/net/bonding/bond1
...
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer3+4 (1)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: fast
Min links: 1
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: 44:38:39:be:ef:aa
Active Aggregator Info:
	Aggregator ID: 1
	Number of ports: 1
	Actor Key: 9
	Partner Key: 1
	Partner Mac Address: 00:00:00:00:00:00

Slave Interface: swp1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 44:38:39:00:00:37
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: none
Partner Churn State: churned
Actor Churned Count: 1
Partner Churned Count: 2
...
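When you need to check these fields on many bonds, the output above is easy to parse. The following is a minimal sketch, assuming the "Key: value" layout shown in the example; the function name parse_bonding is hypothetical and the sample text is a shortened copy of the output above.

```python
def parse_bonding(text: str) -> dict:
    """Split /proc/net/bonding/<bond> output into bond-level and per-member fields."""
    bond = {"members": {}}
    current = bond  # fields go to the bond until the first "Slave Interface" line
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue  # skip blank lines, "..." and headings such as "802.3ad info"
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = bond["members"].setdefault(value, {})
        else:
            current[key] = value
    return bond

sample = """Bonding Mode: IEEE 802.3ad Dynamic link aggregation
MII Status: up

802.3ad info
LACP rate: fast
System MAC address: 44:38:39:be:ef:aa

Slave Interface: swp1
MII Status: up
Actor Churn State: none
Partner Churn State: churned
"""

info = parse_bonding(sample)
print(info["LACP rate"])
print(info["members"]["swp1"]["Partner Churn State"])
```

On a switch, you can pass the contents of /proc/net/bonding/<bond> to the function instead of the sample string.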

To show specific bond information, use the nv show interface <bond> <option> commands:

cumulus@switch:~$ nv show interface bond1 TAB
acl        bridge     ip         lldp       ptp        router     
bond       evpn       link       pluggable  qos
cumulus@leaf02:mgmt:~$ nv show interface bond1 link
                       operational        applied  description
---------------------  -----------------  -------  ----------------------------------------------------------------------
auto-negotiate         off                on       Link speed and characteristic auto negotiation
duplex                 full               full     Link duplex
fec                                       auto     Link forward error correction mechanism
mtu                    9000               9000     interface mtu
speed                  1G                 auto     Link speed
dot1x
  mab                                     off      bypass MAC authentication
  parking-vlan                            off      VLAN for unauthorized MAC addresses
state                  up                 up       The state of the interface
stats
  carrier-transitions  1                           Number of times the interface state has transitioned between up and...
  in-bytes             0 Bytes                     total number of bytes received on the interface
  in-drops             0                           number of received packets dropped
  in-errors            0                           number of received packets with errors
  in-pkts              0                           total number of packets received on the interface
  out-bytes            3.65 MB                     total number of bytes transmitted out of the interface
  out-drops            0                           The number of outbound packets that were chosen to be discarded eve...
  out-errors           0                           The number of outbound packets that could not be transmitted becaus...
  out-pkts             51949                       total number of packets transmitted out of the interface
mac                    44:38:39:00:00:37           MAC Address on an interface

Multi-Chassis Link Aggregation - MLAG

MLAG or CLAG: Other vendors refer to the Cumulus Linux implementation of MLAG as CLAG, MC-LAG or VPC. You also see references to CLAG in Cumulus Linux, including the management daemon, clagd, and other options in the code, such as clag-id, which exist for historical purposes. The Cumulus Linux implementation is truly a multi-chassis link aggregation protocol, so this document uses MLAG.

MLAG enables a server or switch with a two-port bond, such as a link aggregation group (LAG), EtherChannel, port group or trunk, to connect those ports to different switches and operate as if they connect to a single, logical switch. This provides greater redundancy and greater system throughput.

Dual-connected devices can create LACP bonds that contain links to each physical switch; Cumulus Linux supports active-active links from the dual-connected devices even though they connect to two different physical switches.

How Does MLAG Work?

A basic MLAG configuration looks like this:


  • The two switches, leaf01 and leaf02, known as peer switches, appear as a single device to the bond on server01.
  • server01 distributes traffic between the two links to leaf01 and leaf02 in the way you configure on the host.
  • Traffic inbound to server01 can traverse leaf01 or leaf02 and arrive at server01.

More elaborate configurations are also possible. The number of links between the host and the switches can be greater than two and does not have to be symmetrical. Because the two peer switches appear as a single switch to other bonding devices, you can also connect pairs of MLAG switches to each other in a switch-to-switch MLAG configuration:



  • leaf01 and leaf02 are also MLAG peer switches and present a two-port bond from a single logical system to spine01 and spine02.

Link Aggregation Control Protocol (LACP), the IEEE standard protocol for managing bonds, verifies dual-connectedness. LACP runs on the dual-connected devices and on each of the MLAG peer switches. On a dual-connected device, the only configuration requirement is to create a bond that LACP manages.

On each of the peer switches, you must place the links that connect to the dual-connected host or switch in the bond. This is true even if the links are a single port on each peer switch, where each port is in a bond, as shown below:

The dual-connected bonds on the peer switches have their system ID set to the MLAG system ID. Therefore, from the point of view of the hosts, each of the links in its bond connects to the same system and so the host uses both links.

Each peer switch periodically makes a list of the LACP partner MAC addresses for its bonds and sends that list to its peer (using the clagd service). The LACP partner MAC address is the MAC address of the system at the other end of a bond (server01, server02, and server03 in the figure above). When a switch receives this list from its peer, it compares the list to the LACP partner MAC addresses on its own bonds. If a partner MAC address matches and the clag-id values for those bonds also match, then that bond is a dual-connected bond.
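The comparison described above can be sketched in a few lines. This is an illustration of the matching rule only, not the clagd implementation; the function name, the data layout, and the MAC addresses are assumptions made for the example.

```python
def dual_connected_bonds(local: dict, peer: dict) -> list:
    """Return local bonds whose (clag-id, LACP partner MAC) pair also exists on the peer.

    `local` and `peer` map a bond name to a (clag_id, partner_mac) tuple.
    """
    peer_pairs = {(clag_id, mac.lower()) for clag_id, mac in peer.values()}
    return sorted(
        name
        for name, (clag_id, mac) in local.items()
        # A clag-id of 0 disables MLAG on the bond, so it can never be dual-connected
        if clag_id != 0 and (clag_id, mac.lower()) in peer_pairs
    )

# Hypothetical partner MAC addresses reported by each peer switch
leaf01 = {
    "bond1": (1, "44:38:39:00:00:32"),
    "bond2": (2, "44:38:39:00:00:34"),
    "bond3": (3, "44:38:39:00:00:36"),
}
leaf02 = {
    "bond1": (1, "44:38:39:00:00:32"),  # same partner, same clag-id: dual-connected
    "bond2": (2, "44:38:39:00:00:35"),  # different partner MAC
    "bond3": (4, "44:38:39:00:00:36"),  # same partner, different clag-id
}

print(dual_connected_bonds(leaf01, leaf02))
```

In this example only bond1 qualifies, because both the partner MAC address and the clag-id agree on the two switches.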

Requirements

MLAG has these requirements:

  • MLAG is not supported in a multiple VLAN-aware bridge configuration.
  • Both MLAG peers must use the same VXLAN device type (single or traditional).

Basic Configuration

To configure MLAG, you need to create a bond that uses LACP on the dual-connected devices and configure the interfaces (including bonds, VLANs, bridges, and peer links) on each peer switch. Follow these steps on each peer switch in the MLAG pair:

  1. On the dual-connected device, such as a host or server that sends traffic to and from the switch, create a bond that uses LACP. The method you use varies with the type of device you are configuring.

    If you cannot use LACP in your environment, you can configure the bonds in balance-xor mode.

  2. Place every interface that connects to the MLAG pair from a dual-connected device into a bond, even if the bond contains only a single link on a single physical switch.

    The following examples place swp1 in bond1 and swp2 in bond2.

    cumulus@leaf01:~$ nv set interface bond1 bond member swp1
    cumulus@leaf01:~$ nv set interface bond1 description bond1-on-swp1
    cumulus@leaf01:~$ nv set interface bond2 bond member swp2
    cumulus@leaf01:~$ nv set interface bond2 description bond2-on-swp2
    cumulus@leaf01:~$ nv config apply
    

    Add the following lines to the /etc/network/interfaces file. The example also adds a description for the bonds (an alias), which is optional.

    cumulus@leaf01:~$ sudo nano /etc/network/interfaces
    ...
    auto bond1
    iface bond1
        alias bond1 on swp1
        bond-slaves swp1
    ...
    
    auto bond2
    iface bond2
        alias bond2 on swp2
        bond-slaves swp2
    ...
    
  3. Add a unique MLAG ID to each bond.

    You must specify a unique MLAG ID (clag-id) for every dual-connected bond on each peer switch so that switches know which links dual-connect or connect to the same host or switch. The value must be between 1 and 65535 and must be the same on both peer switches. A value of 0 disables MLAG on the bond.

    The example commands below add an MLAG ID of 1 to bond1 and 2 to bond2:

    cumulus@leaf01:~$ nv set interface bond1 bond mlag id 1
    cumulus@leaf01:~$ nv set interface bond2 bond mlag id 2 
    cumulus@leaf01:~$ nv config apply
    

    In the /etc/network/interfaces file, add the line clag-id 1 to the auto bond1 stanza and clag-id 2 to auto bond2 stanza:

    cumulus@switch:~$ sudo nano /etc/network/interfaces
    ...
    auto bond1
    iface bond1
        alias bond1 on swp1
        bond-slaves swp1
        clag-id 1
    
    auto bond2
    iface bond2
        alias bond2 on swp2
        bond-slaves swp2
        clag-id 2
    ...
    
  4. Add the bonds you created above to a bridge. The example commands below add bond1 and bond2 to a VLAN-aware bridge.

    You must add all VLANs configured on the MLAG bond to the bridge so that traffic to the downstream device connected in MLAG redirects over the peerlink in case the MLAG bond fails.

    cumulus@leaf01:~$ nv set interface bond1-2 bridge domain br_default 
    cumulus@leaf01:~$ nv config apply
    

    Edit the /etc/network/interfaces file to add the bridge-ports bond1 bond2 line to the auto br_default stanza:

    cumulus@leaf01:~$ sudo nano /etc/network/interfaces
    ...
    auto br_default
    iface br_default
        bridge-ports bond1 bond2
        bridge-vlan-aware yes
    ...
    
  5. Create the inter-chassis bond and the peer link VLAN (as a VLAN subinterface). You also need to provide the peer link IP address, the MLAG bond interfaces, the MLAG system MAC address, and the backup interface.

    • By default, Cumulus Linux configures the inter-chassis bond with the name peerlink and the peer link VLAN with the name peerlink.4094. Use peerlink.4094 to ensure that the VLAN is independent of the bridge and spanning tree forwarding decisions.
    • The peer link IP address is a link-local address that provides layer 3 connectivity between the peer switches.
    • NVIDIA provides a reserved range of MAC addresses for MLAG (between 44:38:39:ff:00:00 and 44:38:39:ff:ff:ff). Use a MAC address from this range to prevent conflicts with other interfaces in the same bridged network.
      • Do not use a multicast MAC address.
      • Do not use the same MAC address for different MLAG pairs; make sure you specify a different MAC address for each MLAG pair in the network.
    • The backup IP address is any layer 3 backup interface for the peer link, which the switch uses when the peer link goes down. You must add the backup IP address, which must be different from the peer link IP address. Make sure that any route that does not use the peer link can reach the backup IP address. Use the loopback or management IP address of the switch.
      Loopback or Management IP Address?
      • If your MLAG configuration has bridged uplinks (such as a campus network or a large, flat layer 2 network), use the peer switch eth0 address. When the peer link is down, the secondary switch routes towards the eth0 address using the OOB network (provided you have implemented an OOB network).
      • If your MLAG configuration has routed uplinks (a modern approach to the data center fabric network), use the peer switch loopback address. When the peer link is down, the secondary switch routes towards the loopback address using uplinks (towards the spine layer). If the primary switch also has a more significant problem (for example, switchd does not respond or stops), the secondary switch promotes itself to primary and the traffic flows.

      When using BGP, to ensure IP connectivity between the loopbacks, the MLAG peer switches must use unique BGP ASNs; if they use the same ASN, you must bypass the BGP loop prevention check on the AS_PATH attribute.
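The MAC address guidance in this step lends itself to a quick pre-check before you apply the configuration. The following is a minimal sketch under stated assumptions: the helper names and the pair labels are hypothetical, and the checks cover only the three rules listed above (reserved range, no multicast address, a different address for each MLAG pair).

```python
def parse_mac(mac: str) -> int:
    """Convert a colon-separated MAC address to an integer."""
    parts = mac.split(":")
    if len(parts) != 6 or any(len(part) != 2 for part in parts):
        raise ValueError(f"not a MAC address: {mac!r}")
    return int("".join(parts), 16)

# Reserved MLAG range stated in the text above
RESERVED_LOW = parse_mac("44:38:39:ff:00:00")
RESERVED_HIGH = parse_mac("44:38:39:ff:ff:ff")

def is_multicast(mac: str) -> bool:
    # The least significant bit of the first octet marks a multicast address
    return bool((parse_mac(mac) >> 40) & 0x01)

def in_reserved_range(mac: str) -> bool:
    return RESERVED_LOW <= parse_mac(mac) <= RESERVED_HIGH

def check_mlag_macs(pairs: dict) -> list:
    """`pairs` maps an MLAG pair label to its system MAC; return a list of problems."""
    problems, seen = [], {}
    for pair, mac in pairs.items():
        value = parse_mac(mac)
        if is_multicast(mac):
            problems.append(f"{pair}: {mac} is a multicast address")
        elif not in_reserved_range(mac):
            problems.append(f"{pair}: {mac} is outside the reserved MLAG range")
        if value in seen:
            problems.append(f"{pair}: {mac} is already used by {seen[value]}")
        else:
            seen[value] = pair
    return problems

# Hypothetical fabric with two MLAG pairs that reuse the same address
print(check_mlag_macs({"rack1": "44:38:39:FF:00:AA", "rack2": "44:38:39:ff:00:aa"}))
```

The comparison is case-insensitive, so 44:38:39:FF:00:AA and 44:38:39:ff:00:aa count as the same address.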

    The following examples show commands for both MLAG peers (leaf01 and leaf02).

    cumulus@leaf01:~$ nv set interface peerlink bond member swp49-50
    cumulus@leaf01:~$ nv set mlag mac-address 44:38:39:FF:00:AA
    cumulus@leaf01:~$ nv set mlag backup 10.10.10.2
    cumulus@leaf01:~$ nv set mlag peer-ip linklocal
    cumulus@leaf01:~$ nv config apply
    

    To configure the backup link to a VRF, include the name of the VRF with the backup-ip parameter. The following example configures the backup link to VRF mgmt:

    cumulus@leaf01:~$ nv set mlag backup 10.10.10.2 vrf mgmt
    cumulus@leaf01:~$ nv config apply
    
    cumulus@leaf02:~$ nv set interface peerlink bond member swp49-50
    cumulus@leaf02:~$ nv set mlag mac-address 44:38:39:FF:00:AA
    cumulus@leaf02:~$ nv set mlag backup 10.10.10.1
    cumulus@leaf02:~$ nv set mlag peer-ip linklocal
    cumulus@leaf02:~$ nv config apply
    

    To configure the backup link to a VRF, include the name of the VRF with the backup-ip parameter. The following example configures the backup link to VRF mgmt:

    cumulus@leaf02:~$ nv set mlag backup 10.10.10.1 vrf mgmt
    cumulus@leaf02:~$ nv config apply
    

    Edit the /etc/network/interfaces file to add the following parameters, then run the sudo ifreload -a command.

    • The inter-chassis bond (peerlink) with two ports in the bond (swp49 and swp50 in the example command below)
    • The peerlink bond to the bridge
    • The peer link VLAN (peerlink.4094) with the backup IP address, the peer link IP address (linklocal), and the MLAG system MAC address (from the reserved range of addresses).
    cumulus@leaf01:~$ sudo nano /etc/network/interfaces
    ...
    auto br_default
    iface br_default
        bridge-ports bond1 bond2 peerlink
        bridge-vlan-aware yes
    ...
    auto peerlink
    iface peerlink
        bond-slaves swp49 swp50
    

    auto peerlink.4094
    iface peerlink.4094
        clagd-backup-ip 10.10.10.2
        clagd-peer-ip linklocal
        clagd-sys-mac 44:38:39:FF:00:AA
    ...

    To configure the backup link to a VRF, include the name of the VRF with the clagd-backup-ip parameter. The following example configures the backup link to VRF RED:

    cumulus@leaf01:~$ sudo nano /etc/network/interfaces
    ...
    auto peerlink.4094
    iface peerlink.4094
        clagd-backup-ip 10.10.10.2 vrf RED
        clagd-peer-ip linklocal
        clagd-sys-mac 44:38:39:FF:00:AA
    ...
    

    Run the sudo ifreload -a command to apply all the configuration changes:

    cumulus@leaf01:~$ sudo ifreload -a
    
    cumulus@leaf02:~$ sudo nano /etc/network/interfaces
    ...
    auto br_default
    iface br_default
        bridge-ports bond1 bond2 peerlink
        bridge-vlan-aware yes
    ...
    auto peerlink
    iface peerlink
        bond-slaves swp49 swp50
    

    auto peerlink.4094
    iface peerlink.4094
        clagd-backup-ip 10.10.10.1
        clagd-peer-ip linklocal
        clagd-sys-mac 44:38:39:FF:00:AA
    ...

    To configure the backup link to a VRF, include the name of the VRF with the clagd-backup-ip parameter. The following example configures the backup link to VRF RED:

    cumulus@leaf02:~$ sudo nano /etc/network/interfaces
    ...
    auto peerlink.4094
    iface peerlink.4094
        clagd-backup-ip 10.10.10.1 vrf RED
        clagd-peer-ip linklocal
        clagd-sys-mac 44:38:39:FF:00:AA
    ...
    

    Run the sudo ifreload -a command to apply all the configuration changes:

    cumulus@leaf02:~$ sudo ifreload -a
    

  • Do not add VLAN 4094 to the bridge VLAN list; you cannot configure VLAN 4094 for the peer link subinterface as a bridged VLAN with bridge VIDs under the bridge.
  • Do not use 169.254.0.1 as the MLAG peer link IP address; Cumulus Linux uses this address for BGP unnumbered interfaces.
  • When you configure MLAG manually in the /etc/network/interfaces file, the changes take effect when you bring the peer link interface up with the sudo ifreload -a command. Do not use systemctl restart clagd.service to apply the new configuration.
  • The MLAG bond does not support layer 3 configuration.

MLAG synchronizes the dynamic state between the two peer switches but it does not synchronize the switch configurations. After modifying the configuration of one peer switch, you must make the same changes to the configuration on the other peer switch. This applies to all configuration changes.

Optional Configuration

This section describes optional configuration procedures.

Set Roles and Priority

Each MLAG-enabled switch in the pair has a role. When the peering relationship establishes between the two switches, one switch goes into the primary role and the other into the secondary role. When an MLAG-enabled switch is in the secondary role, it does not send STP BPDUs on dual-connected links; it only sends BPDUs on single-connected links. The switch in the primary role sends STP BPDUs on all single- and dual-connected links.

By default, the switch determines the role by comparing the MAC addresses of the two sides of the peering link; the switch with the lower MAC address assumes the primary role. You can override this by setting the priority option for the peer link:

cumulus@leaf01:~$ nv set mlag priority 2048
cumulus@leaf01:~$ nv config apply

Edit the /etc/network/interfaces file and add the clagd-priority option, then run the ifreload -a command.

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-backup-ip 10.10.10.2
    clagd-sys-mac 44:38:39:FF:00:AA
    clagd-priority 2048
...
cumulus@switch:~$ sudo ifreload -a

The switch with the lower priority value is in the primary role; the default value is 32768 and the range is between 0 and 65535.
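The role selection described in this section can be modeled as a two-level comparison. The sketch below is illustrative only: the function names are hypothetical, and it assumes that when both switches have the same priority, the MAC address comparison decides the role, which is consistent with the default behavior described above.

```python
def mac_to_int(mac: str) -> int:
    return int(mac.replace(":", ""), 16)

def elect_primary(switch_a: tuple, switch_b: tuple) -> str:
    """Each switch is (name, priority, mac); return the name of the primary switch."""
    for _, priority, _ in (switch_a, switch_b):
        if not 0 <= priority <= 65535:
            raise ValueError("priority must be between 0 and 65535")
    # Lower priority wins; on a tie, the lower MAC address wins (assumed tie-break)
    return min(switch_a, switch_b, key=lambda s: (s[1], mac_to_int(s[2])))[0]

# Both switches use the default priority, so the lower MAC address decides
print(elect_primary(("leaf01", 32768, "48:b0:2d:8b:f4:cb"),
                    ("leaf02", 32768, "48:b0:2d:cf:ba:45")))

# Lowering the priority value on leaf02 overrides the MAC address comparison
print(elect_primary(("leaf01", 32768, "48:b0:2d:8b:f4:cb"),
                    ("leaf02", 2048, "48:b0:2d:cf:ba:45")))
```

The first call selects leaf01 and the second selects leaf02.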

When the MLAG service exits during switch reboot or if you stop the service on the primary switch, the peer switch that is in the secondary role becomes the primary.

However, if the primary switch goes down without stopping the MLAG service or if the peer link goes down, the secondary switch does not change its role. If the peer switch is not alive, the switch in the secondary role rolls back the LACP system ID to be the bond interface MAC address instead of the MLAG system MAC address (clagd-sys-mac). The switch in the primary role uses the MLAG system MAC address as the LACP system ID on the bonds.

Set clagctl Timers

The clagd service has several timers that you can tune for enhanced performance.

  • --reloadTimer <seconds>: The number of seconds to wait for the peer switch to become active. If the peer switch does not become active after the timer expires, the MLAG bonds leave the initialization (protodown) state and become active. This provides clagd with sufficient time to determine whether the peer switch is coming up or if it is permanently unreachable. The default is 300 seconds.
  • --peerTimeout <seconds>: The number of seconds clagd waits without receiving any messages from the peer switch before it determines that the peer is no longer active. At this point, the switch reverts all configuration changes so that it operates as a standard non-MLAG switch. This includes removing all statically assigned MAC addresses, clearing the egress forwarding mask, and allowing addresses to move from any port to the peer port. After a message is again received from the peer, MLAG operation restarts. If this parameter is not specified, clagd uses ten times the local lacpPoll value.
  • --initDelay <seconds>: The number of seconds clagd delays bringing up MLAG bonds and anycast IP addresses. The default is 180 seconds. NVIDIA recommends you set this parameter to 300 seconds in a scaled environment. This timer sets to 0 automatically under the following conditions:
    • When the peer is not alive and the backup link is not active after a reload timeout
    • When the peer sends a goodbye (through the peer link or the backup link)
    • When both MLAG sessions come up at the same time
  • --sendTimeout <seconds>: The number of seconds clagd waits until the sending socket times out. If it takes longer than the sendTimeout value to send data to the peer, clagd generates an exception. The default is 30 seconds.
  • --lacpPoll <seconds>: The number of seconds clagd waits before obtaining local LACP information. The default is 2 seconds.
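The defaults listed above, including the derived peerTimeout value, can be summarized in a short sketch. The function name effective_timers is hypothetical; the numbers come from the timer descriptions, and the code only models the values, it does not read them from clagd.

```python
# Defaults stated in the timer descriptions (in seconds)
DEFAULTS = {"reloadTimer": 300, "initDelay": 180, "sendTimeout": 30, "lacpPoll": 2}
KNOWN = set(DEFAULTS) | {"peerTimeout"}

def effective_timers(overrides=None) -> dict:
    """Merge explicit settings with the defaults and derive peerTimeout if unset."""
    overrides = overrides or {}
    unknown = set(overrides) - KNOWN
    if unknown:
        raise ValueError(f"unknown timer(s): {sorted(unknown)}")
    timers = {**DEFAULTS, **overrides}
    # When peerTimeout is not specified, clagd uses ten times the local lacpPoll value
    timers.setdefault("peerTimeout", 10 * timers["lacpPoll"])
    return timers

print(effective_timers())                      # peerTimeout derives from lacpPoll
print(effective_timers({"peerTimeout": 900}))  # explicit value takes effect
```

With the default lacpPoll of 2 seconds, the derived peer timeout is 20 seconds.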

The only timer you can set with NVUE is the initial delay timer. The following example NVUE command sets the initial delay to 100 seconds:

cumulus@leaf01:~$ nv set mlag init-delay 100
cumulus@leaf01:~$ nv config apply

To set the clagd timers, edit the /etc/network/interfaces file to add the clagd-args --<timer> line to the peerlink.4094 stanza, then run the ifreload -a command.

The following example command sets the initial delay timer to 100 seconds:

cumulus@leaf01:~$ sudo nano /etc/network/interfaces
...
auto peerlink.4094
iface peerlink.4094
    clagd-args --initDelay 100
    clagd-peer-ip linklocal
    clagd-backup-ip 10.10.10.2
    clagd-sys-mac 44:38:39:FF:00:AA
    clagd-priority 2048
...
cumulus@leaf01:~$ sudo ifreload -a

The following example command sets the peer timeout to 900 seconds:

cumulus@leaf01:~$ sudo nano /etc/network/interfaces
...
auto peerlink.4094
iface peerlink.4094
    clagd-args --peerTimeout 900
    clagd-peer-ip linklocal
    clagd-backup-ip 10.10.10.2
    clagd-sys-mac 44:38:39:FF:00:AA
    clagd-priority 2048
...
cumulus@leaf01:~$ sudo ifreload -a

Configure MLAG with a Traditional Bridge

To configure MLAG with a traditional mode bridge instead of a VLAN-aware mode bridge, you must configure the peer link and all dual-connected links as untagged (native) ports on a bridge (note the absence of any VLANs in the bridge-ports line and the lack of the bridge-vlan-aware parameter below):

...
auto br0
iface br0
    bridge-ports peerlink bond1 bond2
...

The following example shows you how to allow VLAN 10 across the peer link:

...
auto br0.10
iface br0.10
    bridge-ports peerlink.10 bond1.10 bond2.10
    bridge-stp on
...

In an MLAG and traditional bridge configuration, NVIDIA recommends that you set bridge learning to off on all VLANs over the peerlink except for the layer 3 peer link subinterface; for example:

...
auto peerlink
iface peerlink
    bridge-learning off
    
auto peerlink.1510
iface peerlink.1510
    bridge-learning off

auto peerlink.4094
iface peerlink.4094
...

Configure a Backup UDP Port

By default, Cumulus Linux uses UDP port 5342 with the backup IP address. To change the backup UDP port, edit the /etc/network/interfaces file to add clagd-args --backupPort <port> to the auto peerlink.4094 stanza. For example:

...
auto peerlink.4094
iface peerlink.4094
    clagd-args --backupPort 5400
    clagd-backup-ip 10.10.10.2
    clagd-peer-ip linklocal
    clagd-sys-mac 44:38:39:FF:00:AA
...

Run the sudo ifreload -a command to apply all the configuration changes:

cumulus@leaf01:~$ sudo ifreload -a

Unconfigure MLAG

To unconfigure MLAG:

Run the following commands to unset MLAG, and unset the peerlink and the peerlink VLAN subinterface that Cumulus Linux creates automatically. You must run all of the unset commands before you run the nv config apply command so that they apply together.

cumulus@leaf01:~$ nv unset mlag
cumulus@leaf01:~$ nv unset interface peerlink
cumulus@leaf01:~$ nv unset interface peerlink.4094
cumulus@leaf01:~$ nv config apply

Edit the /etc/network/interfaces file.

  1. Remove the auto peerlink and auto peerlink.4094 stanzas; for example, remove lines similar to the following:

    ...
    auto peerlink
    iface peerlink
        bond-slaves swp49 swp50
    auto peerlink.4094
    iface peerlink.4094
        clagd-backup-ip 10.10.10.2
        clagd-peer-ip linklocal
        clagd-sys-mac 44:38:39:FF:00:AA
    ...

  2. Remove the clag-id line from the bond stanzas. In the following example, remove clag-id 1 from the auto bond1 stanza and clag-id 2 from the auto bond2 stanza:

    ...
    auto bond1
    iface bond1
        alias bond1 on swp1
        bond-slaves swp1
        clag-id 1

    auto bond2
    iface bond2
        alias bond2 on swp2
        bond-slaves swp2
        clag-id 2
    ...

  3. Remove peerlink from the bridge-ports line of the bridge stanza. In the following example, remove peerlink from the auto br_default stanza:

    auto br_default
    iface br_default
        bridge-ports bond1 bond2 peerlink
        bridge-vlan-aware yes

  4. Run the sudo ifreload -a command:

    cumulus@leaf01:~$ sudo ifreload -a

Best Practices

Follow these best practices when configuring MLAG on your switches.

MTU and MLAG

The bridge MTU determines the MTU in MLAG traffic. The lowest MTU setting of an interface that is a member of the bridge determines the bridge MTU. If you want to set an MTU other than the default of 9216 bytes, you must configure the MTU on each physical interface and the bond interface that is a member of every MLAG bridge in the entire bridged domain.
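The rule that the lowest member MTU determines the bridge MTU can be checked with a short calculation. This is a minimal sketch; the function names and the MTU values in the example are hypothetical.

```python
def bridge_mtu(member_mtus: dict) -> int:
    """The bridge MTU is the lowest MTU of any interface that is a bridge member."""
    return min(member_mtus.values())

def members_to_change(member_mtus: dict, wanted: int) -> list:
    """List the bridge members whose MTU differs from the MTU you want."""
    return sorted(name for name, mtu in member_mtus.items() if mtu != wanted)

# Hypothetical state: only bond1 has a non-default MTU so far
members = {"peerlink": 9216, "uplink": 9216, "bond1": 1500, "bond2": 9216}

print(bridge_mtu(members))               # the single 1500 member lowers the bridge MTU
print(members_to_change(members, 1500))  # members that still need the new MTU
```

The second call shows why you configure the MTU on every member: any interface left at a different value makes the bridged domain inconsistent.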

The following example commands set an MTU of 1500 for each of the bond interfaces (peer link, uplink, bond1, bond2), which are members of the br_default bridge:

cumulus@leaf01:~$ nv set interface peerlink link mtu 1500
cumulus@leaf01:~$ nv set interface uplink link mtu 1500
cumulus@leaf01:~$ nv set interface bond1 link mtu 1500
cumulus@leaf01:~$ nv set interface bond2 link mtu 1500
cumulus@leaf01:~$ nv config apply

Edit the /etc/network/interfaces file, then run the ifreload -a command. For example:

cumulus@leaf01:~$ sudo nano /etc/network/interfaces
...
auto br_default
iface br_default
    bridge-ports peerlink uplink bond1 bond2

auto peerlink
iface peerlink
    mtu 1500

auto bond1
iface bond1
    mtu 1500

auto bond2
iface bond2
    mtu 1500

auto uplink
iface uplink
    mtu 1500
...
cumulus@leaf01:~$ sudo ifreload -a

STP and MLAG

Always enable STP in your layer 2 network and BPDU Guard on the host-facing bond interfaces.

The peer link carries little traffic when compared to the bandwidth consumed by data plane traffic. In a typical MLAG configuration, most connections between the two switches in the MLAG pair are dual-connected; the only traffic going across the peer link is traffic from the clagd process and some LLDP or LACP traffic. The switch does not forward traffic received on the peer link out of the dual-connected bonds.

However, there are some instances where a host connects to only one switch in the MLAG pair.

Determine how much bandwidth travels across the single-connected interfaces and allocate half of that bandwidth to the peer link. On average, one half of the traffic destined to the single-connected host arrives on the switch directly connected to the single-connected host and the other half arrives on the switch that is not directly connected to the single-connected host. When this happens, only the traffic that arrives on the switch that is not directly connected to the single-connected host needs to traverse the peer link.

In addition, you can add extra links to the peer link bond to handle link failures in the peer link bond itself.


  • Each host has two 10G links, with each 10G link going to each switch in the MLAG pair.
  • Each host has 20G of dual-connected bandwidth; all three hosts have a total of 60G of dual-connected bandwidth.
  • Set at least 15G of bandwidth for each peer link bond, which represents half of the single-connected bandwidth.

When planning for link failures for a full rack, you need only set enough bandwidth to meet your site strategy for handling failure scenarios. For example, for a full rack with 40 servers and two switches, you can plan for four to six servers to lose connectivity to a single switch and become single connected before you respond to the event. Therefore, if you have 40 hosts each with 20G of bandwidth dual-connected to the MLAG pair, you can set between 20G and 30G of bandwidth to the peer link, which accounts for half of the single-connected bandwidth for four to six hosts.
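The sizing rule used in these examples is simple arithmetic: half of the total single-connected bandwidth. The following sketch reproduces the numbers above; the function name is hypothetical and it assumes every affected host has one remaining link of the same speed.

```python
def peer_link_bandwidth_g(single_connected_hosts: int, link_speed_g: float) -> float:
    """Half of the single-connected bandwidth, in Gbps.

    Assumes each single-connected host has one remaining link of `link_speed_g`.
    """
    return single_connected_hosts * link_speed_g / 2

# Three hosts that each fall back to one 10G link
print(peer_link_bandwidth_g(3, 10))

# Full-rack planning: four to six hosts become single-connected
print(peer_link_bandwidth_g(4, 10), peer_link_bandwidth_g(6, 10))
```

The results match the figures in the text: 15G for three hosts, and 20G to 30G for four to six hosts.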

When enabling a routing protocol in an MLAG environment, it is also necessary to manage the uplinks; by default MLAG is not aware of layer 3 uplink interfaces. If there is a peer link failure, MLAG does not remove static routes or bring down a BGP or OSPF adjacency unless you use a separate link state daemon such as ifplugd.

When you use MLAG with VRR, set up a routed adjacency across the peerlink.4094 interface. If a routed connection is not built across the peer link, during an uplink failure on one of the switches in the MLAG pair, egress traffic does not forward if the destination is on the switch whose uplinks are down.

To set up the adjacency, configure a BGP or OSPF unnumbered peering, as appropriate for your network.

  • For switches with the Spectrum ASIC, the MLAG loop avoidance mechanism also drops routed traffic that arrives on an MLAG peer link interface and routes to a dual-connected VNI. If you need to route unencapsulated traffic to an MLAG peer switch for VXLAN forwarding to accommodate uplink failures or other design needs, configure a routing adjacency across a separate routed interface that is not the MLAG peerlink.
  • Switches with the Spectrum-2 ASIC and later allow packets arriving on the peer link to route to a VNI for VXLAN encapsulation.

For BGP, use a configuration like this:

cumulus@leaf01:~$ nv set vrf default router bgp neighbor peerlink.4094 remote-as external
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# bgp router-id 10.10.10.1
leaf01(config-router)# neighbor peerlink.4094 remote-as external
leaf01(config-router)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

If you are using EVPN and MLAG, you need to enable the EVPN address family across the peerlink.4094 interface as well:

cumulus@leaf01:~$ sudo vtysh
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# bgp router-id 10.10.10.1
leaf01(config-router)# neighbor peerlink.4094 remote-as external
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# neighbor peerlink.4094 activate
leaf01(config-router-af)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

For OSPF, use a configuration like this:

cumulus@leaf01:~$ nv set interface peerlink.4094 router ospf area 0.0.0.1
cumulus@leaf01:~$ nv config apply

MLAG Routing Support

In addition to the routing adjacency over the peer link, Cumulus Linux supports routing adjacencies from attached network devices to MLAG switches under the following conditions:

The router cannot:

  • Attach to the switch over an MLAG bond interface.
  • Form routing adjacencies to a virtual address (VRR or VRRP).

Troubleshooting

Use the following troubleshooting tips to check MLAG configuration.

Check MLAG Status

To verify MLAG configuration, run the nv show mlag command:

cumulus@leaf01:mgmt:~$ nv show mlag
                operational              applied            description
--------------  -----------------------  -----------------  ------------------------------------------------------
enable                                   on                 Turn the feature 'on' or 'off'.  The default is 'off'.
debug                                    off                Enable MLAG debugging
init-delay                               100                The delay, in seconds, before bonds are brought up.
mac-address     44:38:39:FF:00:aa        44:38:39:FF:00:AA  Override anycast-mac and anycast-id
peer-ip         fe80::4638:39ff:fe00:5a  linklocal          Peer Ip Address
priority        32768                    32768              Mlag Priority
[backup]        10.10.10.2               10.10.10.2         Set of MLAG backups
backup-active   False                                       Mlag Backup Status
backup-reason                                               Mlag Backup Reason
local-id        44:38:39:00:00:59                           Mlag Local Unique Id
local-role      primary                                     Mlag Local Role
peer-alive      True                                        Mlag Peer Alive Status
peer-id         44:38:39:00:00:5a                           Mlag Peer Unique Id
peer-interface  peerlink.4094                               Mlag Peerlink Interface
peer-priority   32768                                       Mlag Peer Priority
peer-role       secondary                                   Mlag Peer Role
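For scripted health checks, the table above can be read column by column. The following is a minimal Python sketch, not an NVIDIA tool. It assumes the text layout shown above, where the dashed separator row marks the column boundaries. The function name parse_nv_table is hypothetical, and the sample is a trimmed copy of the output with the description column omitted.

```python
import re

# Trimmed copy of the `nv show mlag` output above (description column omitted).
SAMPLE = """\
                operational              applied
--------------  -----------------------  -----------------
peer-ip         fe80::4638:39ff:fe00:5a  linklocal
backup-active   False
local-role      primary
peer-alive      True
"""


def parse_nv_table(text):
    """Return {parameter: {"operational": ..., "applied": ...}} from fixed-width output."""
    lines = text.splitlines()
    # The dashed row marks where each column starts and ends.
    sep_index = next(i for i, line in enumerate(lines) if line.startswith("---"))
    spans = [m.span() for m in re.finditer(r"-+", lines[sep_index])]
    rows = {}
    for line in lines[sep_index + 1:]:
        if not line.strip():
            continue
        # Slicing past the end of a short line yields an empty cell.
        cells = [line[start:end].strip() for start, end in spans]
        rows[cells[0]] = {"operational": cells[1], "applied": cells[2]}
    return rows


status = parse_nv_table(SAMPLE)
peer_ok = status["peer-alive"]["operational"] == "True"
backup_ok = status["backup-active"]["operational"] == "True"
print(f"peer alive: {peer_ok}, backup active: {backup_ok}")
```

Slicing by the separator row, rather than splitting on whitespace, keeps empty cells (such as the blank applied value for peer-alive) in the correct column.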

To show the MLAG interface information, run the clagctl command:

cumulus@leaf01:mgmt:~$ clagctl
The peer is alive
     Our Priority, ID, and Role: 32768 48:b0:2d:8b:f4:cb primary
    Peer Priority, ID, and Role: 32768 48:b0:2d:cf:ba:45 secondary
          Peer Interface and IP: peerlink.4094 fe80::4ab0:2dff:fecf:ba45 (linklocal)
                      Backup IP: 10.10.10.2 (active)
                     System MAC: 44:38:39:FF:00:aa

CLAG Interfaces
Our Interface      Peer Interface     CLAG Id   Conflicts              Proto-Down Reason
----------------   ----------------   -------   --------------------   -----------------
           bond1   -                  1         lacp partner mac       -              
                                                mismatch
           bond2   -                  2         lacp partner mac       -              
                                                mismatch
           bond3   -                  3         lacp partner mac       -              
                                                mismatch

Show All MLAG Settings

To see all MLAG settings, run the nv show mlag command:

cumulus@leaf01:~$ nv show mlag
                operational                applied   
--------------  -------------------------  ----------
enable          on                         on        
mac-address     44:38:39:FF:00:aa          auto      
peer-ip         fe80::4ab0:2dff:fe52:1190  linklocal 
priority        1000                       1000      
init-delay      10                         10        
debug           off                        off       
[backup]        10.10.10.2                 10.10.10.2
peer-priority   2000                                 
backup-active   True                                 
local-id        48:b0:2d:d1:e4:e1                    
peer-id         48:b0:2d:52:11:90                    
local-role      primary                              
peer-role       secondary                            
peer-interface  peerlink.4094                        
peer-alive      True                                 
backup-reason                                        
anycast-ip      10.0.1.12

View the MLAG Log File

By default, when running, the clagd service logs status messages to the /var/log/clagd.log file and to syslog:

cumulus@spine01:~$ sudo tail /var/log/clagd.log
2016-10-03T20:31:50.471400+00:00 spine01 clagd[1235]: Initial config loaded
2016-10-03T20:31:52.479769+00:00 spine01 clagd[1235]: The peer switch is active.
2016-10-03T20:31:52.496490+00:00 spine01 clagd[1235]: Initial data sync to peer done.
2016-10-03T20:31:52.540186+00:00 spine01 clagd[1235]: Role is now primary; elected
2016-10-03T20:31:54.250572+00:00 spine01 clagd[1235]: HealthCheck: role via backup is primary
2016-10-03T20:31:54.252642+00:00 spine01 clagd[1235]: HealthCheck: backup active
2016-10-03T20:31:54.537967+00:00 spine01 clagd[1235]: Initial data sync from peer done.
2016-10-03T20:31:54.538435+00:00 spine01 clagd[1235]: Initial handshake done.
2016-10-03T22:47:35.255317+00:00 spine01 clagd[1235]: leaf01-02 is now dual connected.
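To pull specific events out of the log, a short script can split each line into its fields. The following Python sketch is illustrative only. It assumes the line layout shown in the sample above (timestamp, hostname, clagd[PID], message) and recognizes only the "Role is now" message seen there; other clagd messages can differ. The function names are hypothetical.

```python
import re

# Layout assumed from the sample: <timestamp> <host> clagd[<pid>]: <message>
LOG_LINE = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<host>\S+)\s+clagd\[(?P<pid>\d+)\]:\s+(?P<message>.*)$"
)

SAMPLE = """\
2016-10-03T20:31:52.479769+00:00 spine01 clagd[1235]: The peer switch is active.
2016-10-03T20:31:52.540186+00:00 spine01 clagd[1235]: Role is now primary; elected
2016-10-03T20:31:54.252642+00:00 spine01 clagd[1235]: HealthCheck: backup active
2016-10-03T22:47:35.255317+00:00 spine01 clagd[1235]: leaf01-02 is now dual connected.
"""


def parse_clagd_log(text):
    """Return one dict per clagd log line; lines that do not match are skipped."""
    events = []
    for line in text.splitlines():
        match = LOG_LINE.match(line)
        if match:
            events.append(match.groupdict())
    return events


def latest_role(events):
    """Return the role from the most recent 'Role is now ...' message, or None."""
    role = None
    for event in events:
        found = re.match(r"Role is now (\w+)", event["message"])
        if found:
            role = found.group(1)
    return role


events = parse_clagd_log(SAMPLE)
print(latest_role(events))
```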

Monitor the clagd Service

Because the clagd service is critical, systemd continuously monitors its status by receiving notify messages every 30 seconds. If the clagd service terminates or becomes unresponsive for any reason and systemd receives no messages after 60 seconds, systemd restarts the clagd service. systemd logs these failures in the /var/log/syslog file and, on the first failure, also generates a cl-support file.
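The timing rule can be sketched as a small model. This illustrates the described behavior only and is not systemd or clagd code. The 30-second and 60-second values come from the paragraph above; the function and constant names are hypothetical, and the model assumes the service starts at time 0.

```python
NOTIFY_INTERVAL_S = 30   # clagd sends a notify message this often (from the text)
WATCHDOG_TIMEOUT_S = 60  # silence longer than this triggers a restart (from the text)


def first_restart_time(notify_times_s, end_s, timeout_s=WATCHDOG_TIMEOUT_S):
    """Return the time at which the watchdog would fire, or None if it never does.

    notify_times_s: seconds (since service start) at which notify messages arrived.
    end_s: the moment up to which the service is observed.
    """
    last = 0  # assumption: the service starts, and is considered alive, at time 0
    for t in sorted(notify_times_s):
        if t - last > timeout_s:
            return last + timeout_s
        last = t
    if end_s - last > timeout_s:
        return last + timeout_s
    return None


# Healthy service: a message every 30 seconds, so the watchdog never fires.
print(first_restart_time([30, 60, 90, 120], end_s=130))

# Hung service: messages stop after 60 seconds, so the watchdog fires at 120 seconds.
print(first_restart_time([30, 60], end_s=200))
```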

Monitoring occurs automatically as long as:

You can check if clagd is running with the systemctl status command:

cumulus@leaf01:~$ systemctl status clagd.service
 ● clagd.service - Cumulus Linux Multi-Chassis LACP Bonding Daemon
    Loaded: loaded (/lib/systemd/system/clagd.service; enabled)
    Active: active (running) since Fri 2021-06-11 16:17:19 UTC; 12min ago
        Docs: man:clagd(8)
    Main PID: 27078 (clagd)
    CGroup: /system.slice/clagd.service
            └─27078 /usr/bin/python3 /usr/sbin/clagd --daemon linklocal peerlink.4094 44:38:39:FF:00:AA --priority 32768

When you make an MLAG configuration change, Cumulus Linux automatically validates the corresponding parameters on both MLAG peers and takes action based on the type of conflict it sees. For every conflict, the /var/log/clagd.log file records a log message.

The following table shows the conflict types and actions that Cumulus Linux takes.

Conflict | Type | Action
Bridge STP mode | Global | Protodown only the MLAG bonds on the secondary switch when there is an STP mode mismatch across peers.
MLAG native VLAN | Interface | Protodown only the MLAG bonds on the secondary switch when there is a native VLAN mismatch.
STP root bridge priority | Global | Protodown the MLAG bonds and VNIs on the secondary switch when there is an STP priority mismatch across peers.
MLAG system MAC address | Global | Protodown the MLAG bonds and VNIs on the secondary switch when there is an MLAG system MAC address mismatch across peers.
Peer IP | Global | Protodown the MLAG bonds and VNIs on the secondary switch when there is an IP address mismatch within the same subnet between peers. The consistency checker does not trigger an IP address mismatch between the link-local keyword and a static IPv4 address, or between IPv4 addresses across subnets.
Peer link MTU | Global | Protodown the MLAG bonds and VNIs on the secondary switch when there is a peer link MTU mismatch across peers.
Peer link native VLAN | Global | Protodown the MLAG bonds and VNIs on the secondary switch when there is a peer link VLAN mismatch across peers. Protodown the MLAG bonds and VNIs on the secondary switch when there is no PVID.
VXLAN anycast IP address | Global | Protodown the MLAG bonds and VNIs on the secondary switch when there is an anycast IP address mismatch across peers. Protodown the MLAG bonds and VNIs on the node where there is no configured anycast IP address.
Peer link bridge member | Global | Protodown the MLAG bonds and VNIs on the MLAG switch where there is a peer link bridge member conflict. The peer value always displays NOT-SYNCED for this consistency check because Cumulus Linux does not enforce the same interface name for the peerlink and because of limitations with traditional bridges.
MLAG bond bridge member | Interface | Protodown the MLAG bonds and VNIs on the MLAG switch if the MLAG bond is not a bridge member. The peer value always displays NOT-SYNCED for this consistency check because Cumulus Linux does not enforce the same interface name for the peerlink and because of limitations with traditional bridges.
LACP partner MAC address | Interface | Protodown the MLAG bonds on the MLAG switch if there is an LACP partner MAC address mismatch or if there is a duplicate LACP partner MAC address.
MLAG VLANs | Interface | Suspend the inconsistent VLANs on either MLAG peer if the VLANs are not part of the peer link or if there is a mismatch of VLANs configured on the MLAG bonds between the MLAG peers.
Peer link VLANs | Global | Suspend the inconsistent VLANs on either MLAG peer on all the dual-connected MLAG bonds and VXLAN interfaces.
MLAG protocol version | Global | The consistency check records an MLAG protocol version mismatch between the MLAG peers. Cumulus Linux does not take any disruptive action.
MLAG package version | Global | The consistency check records an MLAG package version mismatch between the MLAG peers. Cumulus Linux does not take any disruptive action.

You can also manually check for MLAG inconsistencies with the following commands:

The following example command shows global MLAG settings for each peer and indicates that the MLAG system MAC address does not match.

cumulus@leaf01:mgmt:~$ nv show mlag consistency-checker global
Global Consistency-checker
=============================
    Parameter               LocalValue                 PeerValue                  Conflict  Summary
    ----------------------  -------------------------  -------------------------  --------  -------
    anycast-ip              -                          -                          -                
    bridge-priority         32768                      32768                      -                
    bridge-stp              on                         on                         -                
    bridge-type             vlan-aware                 vlan-aware                 -                
    clag-pkg-version        1.6.0-cl5.7.0u2            1.6.0-cl5.7.0u2            -                
    clag-protocol-version   1.6.1                      1.6.1                      -                
    peer-ip                 fe80::4ab0:2dff:fe3c:61d1  fe80::4ab0:2dff:fe3c:61d1  -                
    peerlink-bridge-member  Yes                        Yes                        -                
    peerlink-mtu            9216                       9216                       -                
    peerlink-native-vlan    1                          1                          -                
    peerlink-vlans          1, 10, 20, 30              1, 10, 20, 30              -                
    redirect2-enable        yes                        yes                        -                
    system-mac              44:38:39:FF:00:aa          44:38:39:FF:00:aa          system mac mismatch between clag peers

The following example command shows MLAG settings for all interfaces on each peer with no conflicts:

cumulus@leaf01:mgmt:~$ nv show interface --view=mlag-cc
Interface  Conflict  LocalValue         Parameter         PeerValue
---------  --------  -----------------  ----------------  -----------------
bond1      -         yes                bridge-learning   yes
bond1      -         1                  clag-id           1
bond1      -         44:38:39:FF:00:aa  lacp-actor-mac    44:38:39:FF:00:aa
bond1      -         00:00:00:00:00:00  lacp-partner-mac  00:00:00:00:00:00
bond1      -         br_default         master            NOT-SYNCED
bond1      -         9216               mtu               9216
bond1      -         1                  native-vlan       1
bond1      -         1, 10, 20, 30      vlan-id           1, 10, 20, 30
bond2      -         yes                bridge-learning   yes
bond2      -         2                  clag-id           2
bond2      -         44:38:39:FF:00:aa  lacp-actor-mac    44:38:39:FF:00:aa
bond2      -         00:00:00:00:00:00  lacp-partner-mac  00:00:00:00:00:00
bond2      -         br_default         master            NOT-SYNCED
bond2      -         9216               mtu               9216
bond2      -         1                  native-vlan       1
bond2      -         1, 10, 20, 30      vlan-id           1, 10, 20, 30
bond3      -         yes                bridge-learning   yes
bond3      -         3                  clag-id           3
bond3      -         44:38:39:FF:00:aa  lacp-actor-mac    44:38:39:FF:00:aa
bond3      -         00:00:00:00:00:00  lacp-partner-mac  00:00:00:00:00:00
bond3      -         br_default         master            NOT-SYNCED
bond3      -         9216               mtu               9216
bond3      -         1                  native-vlan       1
bond3      -         1, 10, 20, 30      vlan-id           1, 10, 20, 30

The following example command shows the MLAG settings for bond1 on each peer and indicates that the MTU does not match:

cumulus@leaf01:mgmt:~$ nv show interface bond1 bond mlag consistency-checker
Parameter           LocalValue         PeerValue          Conflict  Summary
------------------  -----------------  -----------------  --------  -------
bridge-learning     yes                yes                -
clag-id             1                  1                  -
lacp-actor-mac      44:38:39:FF:00:aa  44:38:39:FF:00:aa  -
lacp-partner-mac    00:00:00:00:00:00  00:00:00:00:00:00  -
master              br_default         NOT-SYNCED         -
mtu                 4800               1500               mtu mismatch on clag interface between clag peers
native-vlan         1                  1                  -
vlan-id             1, 10, 20, 30      1, 10, 20, 30      -

The following example command shows global MLAG settings for each peer and indicates that the MLAG system MAC address does not match.

cumulus@leaf02:mgmt:~$ clagctl consistency-check global
Parameter              LocalValue               PeerValue                Conflict
---------------------  -----------------------  -----------------------  --------------------------------------
system-mac             44:38:39:FF:00:ab        44:38:39:FF:00:aa        system mac mismatch between clag peers
clag-protocol-version  1.6.0                    1.6.0                    -
clag-pkg-version       1.6.0-cl5.0.1+u15        1.6.0-cl5.0.1+u15        -
bridge-priority        32768                    32768                    -
anycast-ip             -                        -                        -
peer-ip                fe80::4638:39ff:fe00:59  fe80::4638:39ff:fe00:59  -
redirect2-enable       yes                      yes                      -
peerlink-mtu           9216                     9216                     -
bridge-type            vlan-aware               vlan-aware               -
peerlink-master        br_default               NOT-SYNCED               -
peerlink-vlans         1, 10, 20, 30            1, 10, 20, 30            -
bridge-stp             on                       on                       -
peerlink-native-vlan   1                        1                        -
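To flag conflicts automatically, a script can scan the Conflict column for any value other than a dash. The following Python sketch is illustrative. It assumes the layout shown above, where every cell is non-empty and cells are separated by at least two spaces; a value that itself contains two consecutive spaces would break this split. The function name find_conflicts is hypothetical.

```python
import re

# Trimmed copy of the `clagctl consistency-check global` output above.
SAMPLE = """\
Parameter              LocalValue               PeerValue                Conflict
---------------------  -----------------------  -----------------------  --------------------------------------
system-mac             44:38:39:FF:00:ab        44:38:39:FF:00:aa        system mac mismatch between clag peers
bridge-priority        32768                    32768                    -
peerlink-master        br_default               NOT-SYNCED               -
peerlink-vlans         1, 10, 20, 30            1, 10, 20, 30            -
"""


def find_conflicts(text):
    """Return the rows whose Conflict column is anything other than '-'."""
    conflicts = []
    for line in text.splitlines():
        fields = re.split(r"\s{2,}", line.strip())
        # Skip blank lines, the header row, and the dashed separator row.
        if len(fields) != 4 or fields[0] == "Parameter" or set(fields[0]) == {"-"}:
            continue
        parameter, local, peer, conflict = fields
        if conflict != "-":
            conflicts.append(
                {"parameter": parameter, "local": local, "peer": peer, "conflict": conflict}
            )
    return conflicts


conflicts = find_conflicts(SAMPLE)
for row in conflicts:
    print(f"{row['parameter']}: local={row['local']} peer={row['peer']} ({row['conflict']})")
```

The sketch keys on the Conflict column only, so a NOT-SYNCED peer value (as for peerlink-master) is not reported unless the checker itself flags a conflict.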

The following example command shows MLAG settings for all interfaces on each peer with no conflicts:

cumulus@leaf01:mgmt:~$ clagctl consistency-check interface
Clag Interface: bond1
=====================
Parameter         LocalValue         PeerValue          Conflict
----------------  -----------------  -----------------  ----------
clag-id           1                  1                  -
lacp-partner-mac  00:00:00:00:00:00  00:00:00:00:00:00  -
lacp-actor-mac    44:38:39:FF:00:aa  44:38:39:FF:00:aa  -
vlan-id           1, 10, 20, 30      1, 10, 20, 30      -
native-vlan       1                  1                  -
master            br_default         NOT-SYNCED         -
mtu               9216               9216               -
bridge-learning   yes                yes                -

Clag Interface: bond2
=====================
Parameter         LocalValue         PeerValue          Conflict
----------------  -----------------  -----------------  ----------
clag-id           2                  2                  -
lacp-partner-mac  00:00:00:00:00:00  00:00:00:00:00:00  -
lacp-actor-mac    44:38:39:FF:00:aa  44:38:39:FF:00:aa  -
vlan-id           1, 10, 20, 30      1, 10, 20, 30      -
native-vlan       1                  1                  -
master            br_default         NOT-SYNCED         -
mtu               9216               9216               -
bridge-learning   yes                yes                -

Clag Interface: bond3
=====================
Parameter         LocalValue         PeerValue          Conflict
----------------  -----------------  -----------------  ----------
clag-id           3                  3                  -
lacp-partner-mac  00:00:00:00:00:00  00:00:00:00:00:00  -
lacp-actor-mac    44:38:39:FF:00:aa  44:38:39:FF:00:aa  -
vlan-id           1, 10, 20, 30      1, 10, 20, 30      -
native-vlan       1                  1                  -
master            br_default         NOT-SYNCED         -
mtu               9216               9216               -
bridge-learning   yes                yes                -

The following example command shows MLAG parameters for bond1 on each peer and indicates that the MTU does not match:

cumulus@leaf01:mgmt:~$ clagctl consistency-check interface bond1
Parameter         LocalValue         PeerValue          Conflict
----------------  -----------------  -----------------  ----------
clag-id           1                  1                  -
lacp-partner-mac  00:00:00:00:00:00  00:00:00:00:00:00  -
lacp-actor-mac    44:38:39:FF:00:aa  44:38:39:FF:00:aa  -
vlan-id           1, 10, 20, 30      1, 10, 20, 30      -
native-vlan       1                  1                  -
master            br_default         NOT-SYNCED         -
mtu               1480               1500               mtu mismatch on clag interface between clag peers
bridge-learning   yes                yes                -

The actions that Cumulus Linux takes when there is a conflict are disruptive. If you prefer, you can configure the switch to not take any action when there is a conflict. Edit the /etc/network/interfaces file to add the clagd-args --gracefulConsistencyCheck FALSE parameter in the peer link stanza.

cumulus@leaf01:~$ sudo nano /etc/network/interfaces
...
auto peerlink.4094
iface peerlink.4094
    clagd-args --gracefulConsistencyCheck FALSE
    clagd-backup-ip 10.10.10.2
    clagd-peer-ip linklocal
    clagd-sys-mac 44:38:39:FF:00:AA
...
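To confirm the setting on a switch, a script can read the peer link stanza and check its clagd-args value. The following Python sketch is a simplified reader, not the ifupdown2 parser. It assumes the stanza layout shown above, and the function names are hypothetical.

```python
# Sample stanza copied from the listing above.
SAMPLE = """\
auto peerlink.4094
iface peerlink.4094
    clagd-args --gracefulConsistencyCheck FALSE
    clagd-backup-ip 10.10.10.2
    clagd-peer-ip linklocal
    clagd-sys-mac 44:38:39:FF:00:AA
"""


def starts_stanza(word):
    """Return True for keywords that begin a new statement in the file."""
    return word in {"auto", "iface", "mapping"} or word.startswith(("allow-", "source"))


def stanza_attributes(text, iface):
    """Return the attribute lines of the 'iface <iface>' stanza as a dict."""
    attrs = {}
    in_stanza = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        words = line.split()
        if starts_stanza(words[0]):
            in_stanza = words[0] == "iface" and len(words) > 1 and words[1] == iface
            continue
        if in_stanza:
            key, _, value = line.partition(" ")
            attrs[key] = value.strip()
    return attrs


def conflict_actions_disabled(attrs):
    """Return True if clagd-args carries '--gracefulConsistencyCheck FALSE'."""
    args = attrs.get("clagd-args", "").split()
    for flag, value in zip(args, args[1:]):
        if flag == "--gracefulConsistencyCheck":
            return value.upper() == "FALSE"
    return False


attrs = stanza_attributes(SAMPLE, "peerlink.4094")
print(conflict_actions_disabled(attrs))
```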

You can expect a large volume of packet drops across one of the peer link interfaces. These drops prevent BUM (broadcast, unknown unicast, multicast) packets from looping. When the switch receives a packet across the peer link and the destination lookup results in an egress interface that is a dual-connected bond, the switch does not forward the packet, and the peer link records a dropped packet.
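The drop rule can be expressed as a small decision function. This Python sketch is a conceptual model of the rule as described, not the switch's forwarding implementation. The Port class, the function name, and the single-attached port swp5 are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    name: str
    is_peerlink: bool = False      # True for the MLAG peer link
    dual_connected: bool = False   # True for a bond that is up on both MLAG peers


def forward_decision(ingress, egress):
    """Return "drop" or "forward" for a packet, following the rule in the text."""
    # A packet that arrived over the peer link must not leave through a
    # dual-connected bond: the peer already delivered it to that host.
    if ingress.is_peerlink and egress.dual_connected:
        return "drop"
    return "forward"


peerlink = Port("peerlink", is_peerlink=True)
bond1 = Port("bond1", dual_connected=True)
swp5 = Port("swp5")  # hypothetical single-attached port

print(forward_decision(peerlink, bond1))  # peer link to dual-connected bond
print(forward_decision(peerlink, swp5))   # peer link to single-attached port
print(forward_decision(swp5, bond1))      # local port to dual-connected bond
```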

To check packet drops across peer link interfaces, run the ethtool -S <interface> command:

cumulus@leaf01:mgmt:~$ ethtool -S swp49
NIC statistics:
     rx_queue_0_packets: 136
     rx_queue_0_bytes: 36318
     rx_queue_0_drops: 0
     rx_queue_0_xdp_packets: 0
     rx_queue_0_xdp_tx: 0
     rx_queue_0_xdp_redirects: 0
     rx_queue_0_xdp_drops: 0
     rx_queue_0_kicks: 1
     tx_queue_0_packets: 200
     tx_queue_0_bytes: 44244
     tx_queue_0_xdp_tx: 0
     tx_queue_0_xdp_tx_drops: 0
     tx_queue_0_kicks: 195

You can also run the nv show interface counters command. The number of dropped packets shows in the RX_DRP column.

cumulus@leaf01:mgmt:~$ nv show interface counters
Interface       MTU    RX_OK  RX_ERR  RX_DRP  RX_OVR  TX_OK  TX_ERR  TX_DRP  TX_OVR  Flg  
--------------  -----  -----  ------  ------  ------  -----  ------  ------  ------  -----
BLUE            65575  0      0       0       0       0      0       1       0       OmRU 
RED             65575  0      0       0       0       0      0       1       0       OmRU 
bond1           9000   0      0       0       0       1336   0       0       0       BMmRU
bond2           9000   0      0       0       0       1337   0       0       0       BMmRU
bond3           9000   0      0       0       0       1336   0       0       0       BMmRU
br_default      9216   69     0       0       0       191    0       0       0       BMRU 
eth0            1500   6184   0       0       0       3384   0       0       0       BMRU 
lo              65536  3835   0       0       0       3835   0       0       0       LRU  
mgmt            65575  4098   0       0       0       0      0       13      0       OmRU 
peerlink        9216   14604  0       0       0       14134  0       0       0       BMmRU
peerlink.4094   9216   9923   0       0       0       9423   0       0       0       BMRU 
swp1            9000   5      0       5       0       1336   0       0       0       BMsRU
swp2            9000   5      0       5       0       1337   0       0       0       BMsRU
swp3            9000   5      0       5       0       1336   0       0       0       BMsRU
swp4            1500        
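To track drops over time, a script can extract the RX_DRP value for each interface. The following Python sketch is illustrative. It assumes the whitespace-separated layout shown above and skips rows that do not have a value for every column. The function name rx_drops is hypothetical, and the sample is a trimmed copy of the output.

```python
# Trimmed copy of the `nv show interface counters` output above.
SAMPLE = """\
Interface       MTU    RX_OK  RX_ERR  RX_DRP  RX_OVR  TX_OK  TX_ERR  TX_DRP  TX_OVR  Flg
--------------  -----  -----  ------  ------  ------  -----  ------  ------  ------  -----
bond1           9000   0      0       0       0       1336   0       0       0       BMmRU
peerlink        9216   14604  0       0       0       14134  0       0       0       BMmRU
swp1            9000   5      0       5       0       1336   0       0       0       BMsRU
swp4            1500
"""


def rx_drops(text):
    """Return {interface: RX_DRP count} for every complete row."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    drop_col = header.index("RX_DRP")
    drops = {}
    for line in lines[1:]:
        cells = line.split()
        # Skip the dashed separator row and rows with missing counters.
        if len(cells) != len(header) or set(cells[0]) == {"-"}:
            continue
        drops[cells[0]] = int(cells[drop_col])
    return drops


drops = rx_drops(SAMPLE)
for name, count in drops.items():
    if count:
        print(f"{name}: {count} RX drops")
```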

In addition to the standard UP and DOWN administrative states, an interface that is a member of an MLAG bond can also be in a protodown state. When MLAG detects a problem that can result in connectivity issues, it puts that interface into protodown state. Such connectivity issues include:

When an interface goes into a protodown state, it results in a local OPER DOWN (carrier down) on the interface.

To show an interface in protodown state, run the Linux ip link show command. For example:

cumulus@leaf01:mgmt:~$ ip link show
3: swp1 state DOWN: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 9216 master pfifo_fast master host-bond1 state DOWN mode DEFAULT qlen 500 protodown on
    link/ether 44:38:39:00:69:84 brd ff:ff:ff:ff:ff:ff

LACP Partner MAC Address Duplicate or Mismatch

Cumulus Linux puts interfaces in a protodown state under the following conditions:

After you make the necessary cable or configuration changes to avoid the protodown state, have MLAG reevaluate the LACP partners by running the NVUE nv action clear mlag lacp-conflict command or the Linux clagctl clearconflictstate command. The command removes duplicate-partner-mac or partner-mac-mismatch from the protodown bonds, allowing them to come back up.

Configuration Example

The example below shows a basic MLAG configuration, where:

For an example configuration with MLAG and BGP, see the BGP configuration example.

cumulus@leaf01:~$ nv set interface lo ip address 10.10.10.1/32
cumulus@leaf01:~$ nv set interface swp1-3,swp49-51
cumulus@leaf01:~$ nv set interface bond1 bond member swp1
cumulus@leaf01:~$ nv set interface bond2 bond member swp2
cumulus@leaf01:~$ nv set interface bond3 bond member swp3
cumulus@leaf01:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf01:~$ nv set interface bond2 bond mlag id 2
cumulus@leaf01:~$ nv set interface bond3 bond mlag id 3
cumulus@leaf01:~$ nv set interface vlan10 ip address 10.1.10.2/24
cumulus@leaf01:~$ nv set interface vlan20 ip address 10.1.20.2/24
cumulus@leaf01:~$ nv set interface vlan30 ip address 10.1.30.2/24
cumulus@leaf01:~$ nv set bridge domain br_default vlan 10,20,30
cumulus@leaf01:~$ nv set interface bond1-3 bridge domain br_default 
cumulus@leaf01:~$ nv set interface peerlink bond member swp49-50
cumulus@leaf01:~$ nv set system global anycast-mac 44:38:39:FF:00:AA
cumulus@leaf01:~$ nv set mlag backup 10.10.10.2
cumulus@leaf01:~$ nv set mlag peer-ip linklocal
cumulus@leaf01:~$ nv set mlag init-delay 100
cumulus@leaf01:~$ nv config apply

cumulus@leaf02:~$ nv set interface lo ip address 10.10.10.2/32
cumulus@leaf02:~$ nv set interface swp1-3,swp49-51
cumulus@leaf02:~$ nv set interface bond1 bond member swp1
cumulus@leaf02:~$ nv set interface bond2 bond member swp2
cumulus@leaf02:~$ nv set interface bond3 bond member swp3
cumulus@leaf02:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf02:~$ nv set interface bond2 bond mlag id 2
cumulus@leaf02:~$ nv set interface bond3 bond mlag id 3
cumulus@leaf02:~$ nv set interface vlan10 ip address 10.1.10.3/24
cumulus@leaf02:~$ nv set interface vlan20 ip address 10.1.20.3/24
cumulus@leaf02:~$ nv set interface vlan30 ip address 10.1.30.3/24
cumulus@leaf02:~$ nv set bridge domain br_default vlan 10,20,30
cumulus@leaf02:~$ nv set interface bond1-3 bridge domain br_default
cumulus@leaf02:~$ nv set interface peerlink bond member swp49-50
cumulus@leaf02:~$ nv set system global anycast-mac 44:38:39:FF:00:AA
cumulus@leaf02:~$ nv set mlag backup 10.10.10.1
cumulus@leaf02:~$ nv set mlag peer-ip linklocal
cumulus@leaf02:~$ nv set mlag init-delay 100
cumulus@leaf02:~$ nv config apply

cumulus@spine01:~$ nv set interface lo ip address 10.10.10.101/32
cumulus@spine01:~$ nv set interface swp1-2
cumulus@spine01:~$ nv config apply

- set:
    bridge:
      domain:
        br_default:
          vlan:
            10,20,30: {}
    interface:
      bond1:
        bond:
          member:
            swp1: {}
          mlag:
            enable: on
            id: 1
        bridge:
          domain:
            br_default: {}
        type: bond
      bond2:
        bond:
          member:
            swp2: {}
          mlag:
            enable: on
            id: 2
        bridge:
          domain:
            br_default: {}
        type: bond
      bond3:
        bond:
          member:
            swp3: {}
          mlag:
            enable: on
            id: 3
        bridge:
          domain:
            br_default: {}
        type: bond
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.1/32: {}
        type: loopback
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        base-interface: peerlink
        type: sub
        vlan: 4094
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      vlan10:
        ip:
          address:
            10.1.10.2/24: {}
        type: svi
        vlan: 10
      vlan20:
        ip:
          address:
            10.1.20.2/24: {}
        type: svi
        vlan: 20
      vlan30:
        ip:
          address:
            10.1.30.2/24: {}
        type: svi
        vlan: 30
    mlag:
      backup:
        10.10.10.2: {}
      enable: on
      init-delay: 100
      peer-ip: linklocal
    router:
      bgp:
        autonomous-system: 65101
        enable: on
        router-id: 10.10.10.1
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$S2E6GFmpZnyoFDOp$bb7l0oMB4DfsWrTSxiWr4JmEnF/Qtt9bXO2MF.EPR3uN8u0W4yXZCVLf7d21vxswoEIe5nfKaWrp4oYsaqMlz1
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        anycast-mac: 44:38:39:FF:00:AA
        fabric-mac: 00:00:5E:00:01:01
        system-mac: 44:38:39:22:01:7a
      hostname: leaf01
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
            enable: on
            neighbor:
              swp51:
                remote-as: external
                type: unnumbered

- set:
    bridge:
      domain:
        br_default:
          vlan:
            10,20,30: {}
    interface:
      bond1:
        bond:
          member:
            swp1: {}
          mlag:
            enable: on
            id: 1
        bridge:
          domain:
            br_default: {}
        type: bond
      bond2:
        bond:
          member:
            swp2: {}
          mlag:
            enable: on
            id: 2
        bridge:
          domain:
            br_default: {}
        type: bond
      bond3:
        bond:
          member:
            swp3: {}
          mlag:
            enable: on
            id: 3
        bridge:
          domain:
            br_default: {}
        type: bond
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.2/32: {}
        type: loopback
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        base-interface: peerlink
        type: sub
        vlan: 4094
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      vlan10:
        ip:
          address:
            10.1.10.3/24: {}
        type: svi
        vlan: 10
      vlan20:
        ip:
          address:
            10.1.20.3/24: {}
        type: svi
        vlan: 20
      vlan30:
        ip:
          address:
            10.1.30.3/24: {}
        type: svi
        vlan: 30
    mlag:
      backup:
        10.10.10.1: {}
      enable: on
      init-delay: 100
      peer-ip: linklocal
    router:
      bgp:
        autonomous-system: 65102
        enable: on
        router-id: 10.10.10.2
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$JW5a3iLCLTHo1x3N$q9EkD6TfEPFd9OyAFsFHi09eQljep/UF7YidEO1xMjIs0Tv7oAoIvdurs2i1xs44AGXTD2dIeOehiqyIBUOGG0
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        anycast-mac: 44:38:39:FF:00:AA
        system-mac: 44:38:39:22:01:78
      hostname: leaf02
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
            enable: on
            neighbor:
              swp51:
                remote-as: external
                type: unnumbered

- set:
    interface:
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.101/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
    router:
      bgp:
        autonomous-system: 65199
        enable: on
        router-id: 10.10.10.101
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$hkck.ZuD4W5LusMJ$hVOsTgz/oyjK8axsEAExzZ2.hb3JDBR/tnsHjRpF5vrh2DgsWmSQshj7/Qg6oaaPl5BgSsJfe6bScC2yayvnT0
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        system-mac: 44:38:39:22:01:82
      hostname: spine01
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
            enable: on
            neighbor:
              swp1:
                remote-as: external
                type: unnumbered
              swp2:
                remote-as: external
                type: unnumbered
auto lo
iface lo inet loopback
    address 10.10.10.1/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto bond1
iface bond1
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 1
auto bond2
iface bond2
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 2
auto bond3
iface bond3
    bond-slaves swp3
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 3
auto vlan10
iface vlan10
    address 10.1.10.2/24
    hwaddress 44:38:39:22:01:b1
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    address 10.1.20.2/24
    hwaddress 44:38:39:22:01:b1
    vlan-raw-device br_default
    vlan-id 20
auto vlan30
iface vlan30
    address 10.1.30.2/24
    hwaddress 44:38:39:22:01:b1
    vlan-raw-device br_default
    vlan-id 30
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-backup-ip 10.10.10.2
    clagd-sys-mac 44:38:39:FF:00:AA
    clagd-args --initDelay 100
auto br_default
iface br_default
    bridge-ports bond1 bond2 bond3 peerlink
    hwaddress 44:38:39:22:01:b1
    bridge-vlan-aware yes
    bridge-vids 10 20 30
    bridge-pvid 1
auto lo
iface lo inet loopback
    address 10.10.10.2/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto bond1
iface bond1
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 1
auto bond2
iface bond2
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 2
auto bond3
iface bond3
    bond-slaves swp3
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 3
auto vlan10
iface vlan10
    address 10.1.10.3/24
    hwaddress 44:38:39:22:01:af
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    address 10.1.20.3/24
    hwaddress 44:38:39:22:01:af
    vlan-raw-device br_default
    vlan-id 20
auto vlan30
iface vlan30
    address 10.1.30.3/24
    hwaddress 44:38:39:22:01:af
    vlan-raw-device br_default
    vlan-id 30
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-backup-ip 10.10.10.1
    clagd-sys-mac 44:38:39:FF:00:AA
    clagd-args --initDelay 100
auto br_default
iface br_default
    bridge-ports bond1 bond2 bond3 peerlink
    hwaddress 44:38:39:22:01:af
    bridge-vlan-aware yes
    bridge-vids 10 20 30
    bridge-pvid 1
auto lo
iface lo inet loopback
    address 10.10.10.101/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1

auto swp2
iface swp2

This simulation is running Cumulus Linux 5.11. The Cumulus Linux 5.12 simulation is coming soon.

The simulation starts with the example MLAG configuration. The demo is pre-configured using NVUE commands.

To validate the configuration, run the commands listed in the troubleshooting section above.

LACP Bypass

In Cumulus Linux, LACP bypass allows a bond configured in 802.3ad mode to become active and forward traffic even when there is no LACP partner. For example, you can enable a host that does not have the capability to run LACP to PXE boot while connected to a switch on a bond configured in 802.3ad mode. After the pre-boot process completes and the host is capable of running LACP, the normal 802.3ad link aggregation operation takes over.

LACP Bypass All-active Mode

In all-active mode, when a bond has multiple slave interfaces, each bond slave interface operates as an active link while the bond is in bypass mode. This is useful during PXE boot of a server with multiple NICs, when you cannot determine beforehand which port needs to be active.

Configure LACP Bypass

To enable LACP bypass on the host-facing bond:

The following commands create a VLAN-aware bridge with LACP bypass enabled:

cumulus@leaf01:~$ nv set interface bond1 bond member swp1-2
cumulus@leaf01:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf01:~$ nv set interface bond1 bond lacp-bypass on
cumulus@leaf01:~$ nv set interface bond1-3 bridge domain br_default
cumulus@leaf01:~$ nv set bridge domain br_default vlan 10,20,30
cumulus@leaf01:~$ nv config apply

Edit the /etc/network/interfaces file to set the bond-lacp-bypass-allow option to yes, then run the ifreload -a command. The following configuration creates a VLAN-aware bridge with LACP bypass enabled.

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto bond1
iface bond1
    bond-slaves swp1 swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 1
...
auto bridge
iface bridge
    bridge-ports bond1 bond2 bond3
    bridge-vids 10 20 30
    bridge-vlan-aware yes
...
cumulus@switch:~$ sudo ifreload -a
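
Before you reload, you might want to confirm which bonds in the file have LACP bypass enabled. The following Python sketch is illustrative only and is not part of Cumulus Linux. It assumes the simple stanza layout shown above (an iface line followed by indented options), and the function name bonds_with_lacp_bypass is hypothetical.

```python
def bonds_with_lacp_bypass(interfaces_text):
    """Return the iface stanza names that set bond-lacp-bypass-allow to yes."""
    enabled = []
    current = None  # name of the iface stanza being read
    for raw in interfaces_text.splitlines():
        fields = raw.split()
        if not fields or fields[0].startswith("#"):
            continue
        if fields[0] == "iface" and len(fields) > 1:
            current = fields[1]
        elif fields[0] == "auto":
            current = None  # a new stanza starts; wait for its iface line
        elif current and fields[0] == "bond-lacp-bypass-allow":
            if len(fields) > 1 and fields[1] == "yes":
                enabled.append(current)
    return enabled


# Sample text modeled on the configuration above (bond2 is illustrative).
SAMPLE = """\
auto bond1
iface bond1
    bond-slaves swp1 swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 1
auto bond2
iface bond2
    bond-slaves swp3
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 2
"""

print(bonds_with_lacp_bypass(SAMPLE))  # ['bond1']
```

The sketch only reads text; it does not validate the rest of the bond configuration.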

To show the bond configuration, run the nv show interface <bond> command.

cumulus@leaf01:mgmt:~$ nv show interface bond1
                         operational        applied     description
-----------------------  -----------------  ----------  ----------------------------------------------------------------------
type                     bond               bond        The type of interface
[acl]                                                   Interface ACL rules
bond
  down-delay             0                  0           bond down delay
  lacp-bypass                               on          lacp bypass
  lacp-rate              fast               fast        lacp rate
  mode                                      lacp        bond mode
  up-delay               0                  0           bond up delay
  [member]               swp1               swp1        Set of bond members
  mlag
    enable                                  on          Turn the feature 'on' or 'off'.  The default is 'off'.
    id                   1                  1           MLAG id
    peer-interface       bond1                          Peer interface
    status               dual                           Mlag Interface status
bridge
  [domain]               br_default         br_default  Bridge domains on this interface
...

To check the status of the link, run the NVUE nv show interface <interface> command or the Linux ip link show command on the bond and its slave interfaces:

cumulus@switch:~$ ip link show bond1
164: bond1: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 state UP mode DORMANT group default
    link/ether c4:54:44:f6:44:5a brd ff:ff:ff:ff:ff:ff
cumulus@switch:~$ ip link show swp1
55: swp1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond1 state UP mode DEFAULT group default qlen 1000
    link/ether c4:54:44:f6:44:5a brd ff:ff:ff:ff:ff:ff
cumulus@switch:~$ ip link show swp2
56: swp2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond1 state UP mode DEFAULT group default qlen 1000
    link/ether c4:54:44:f6:44:5a brd ff:ff:ff:ff:ff:ff
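
If you collect the ip link show output from several interfaces, a short script can summarize which links are up. The following Python sketch is one possible approach and makes assumptions: it expects the output format shown above, and the names link_states and links_not_up are hypothetical helpers, not Cumulus Linux tools.

```python
import re

# Matches the first line of each `ip link show` record, for example:
# 164: bond1: <BROADCAST,...,UP,LOWER_UP> mtu 1500 ... state UP mode DORMANT ...
LINK_RE = re.compile(
    r"^\d+:\s+(?P<name>[^:@\s]+)(?:@[^:\s]+)?:\s+"
    r"<(?P<flags>[^>]*)>.*?\bstate\s+(?P<state>\S+)"
)


def link_states(ip_link_output):
    """Map each interface name to its (state, set of flags)."""
    states = {}
    for line in ip_link_output.splitlines():
        match = LINK_RE.match(line)
        if match:
            flags = set(match.group("flags").split(","))
            states[match.group("name")] = (match.group("state"), flags)
    return states


def links_not_up(ip_link_output, names):
    """Return the names that are missing, not in state UP, or lack LOWER_UP."""
    states = link_states(ip_link_output)
    problems = []
    for name in names:
        state, flags = states.get(name, (None, set()))
        if state != "UP" or "LOWER_UP" not in flags:
            problems.append(name)
    return problems


SAMPLE = """\
164: bond1: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 state UP mode DORMANT group default
    link/ether c4:54:44:f6:44:5a brd ff:ff:ff:ff:ff:ff
55: swp1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond1 state UP mode DEFAULT group default qlen 1000
    link/ether c4:54:44:f6:44:5a brd ff:ff:ff:ff:ff:ff
56: swp2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond1 state UP mode DEFAULT group default qlen 1000
    link/ether c4:54:44:f6:44:5a brd ff:ff:ff:ff:ff:ff
"""

print(links_not_up(SAMPLE, ["bond1", "swp1", "swp2"]))  # []
```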

Virtual Router Redundancy - VRR

VRR enables hosts to communicate with any redundant switch without reconfiguration and without running dynamic routing protocols or router redundancy protocols. Redundant switches respond to ARP requests from hosts. The switches respond in an identical manner, but if one fails, the other redundant switches continue to respond. You use VRR with MLAG.

Use VRR when you connect multiple devices to a single logical connection, such as an MLAG bond. A device that connects to an MLAG bond believes there is a single device on the other end of the bond and only forwards one copy of the transit frames. If the destination of this frame is the virtual MAC address and you are running VRRP, the frame can go to the link connected to the VRRP standby device, which does not forward the frame to the right destination. With the virtual MAC active on both MLAG devices, either MLAG device handles the frame it receives.

You cannot configure both VRR and VRRP on the same switch.

The diagram below illustrates a basic VRR-enabled network configuration.

The network includes three servers and two Cumulus Linux switches. The switches use MLAG.

Configure the Switches

The switches implement the layer 2 network interconnecting the servers and the redundant switches. To configure the switches, add a bridge with the following interfaces to each switch:

Cumulus Linux only supports VRR on an SVI. You cannot configure VRR on a physical interface or virtual subinterface.

The example commands below create a VLAN-aware bridge interface for a VRR-enabled network. The example assumes you have already configured a VLAN-aware bridge with VLAN 10 and that VLAN 10 has an IP address and uses the default fabric-wide VRR MAC address 00:00:5e:00:01:01.

cumulus@switch:~$ nv set interface vlan10 ip vrr address 10.1.10.1/24
cumulus@switch:~$ nv set interface vlan10 ip vrr state up
cumulus@switch:~$ nv config apply

Use the same commands for IPv6 addresses; for example:

cumulus@switch:~$ nv set interface vlan10 ip vrr address 2001:db8::1/32
cumulus@switch:~$ nv set interface vlan10 ip vrr state up

Edit the /etc/network/interfaces file, then run the ifreload -a command.

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto vlan10
iface vlan10
    address 10.1.10.2/24
    address-virtual 00:00:5e:00:01:01 10.1.10.1/24
    vlan-raw-device br_default
    vlan-id 10
...
cumulus@switch:~$ sudo ifreload -a

Change the VRR MAC Address

Cumulus Linux sets a fabric-wide MAC address to ensure consistency across VRR switches, which is especially useful in an EVPN multi-fabric environment. If you prefer, you can change the VRR MAC address globally with one NVUE command. You can also override the global setting for a specific VLAN.

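To set the VRR MAC address globally with one NVUE command, either set the fabric MAC address directly or set a fabric ID, from which the switch derives the MAC address.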

The default VRR MAC address is 00:00:5E:00:01:01, which the switch derives from a fabric ID setting of 1.

To change a VRR MAC address globally on the switch, run the nv set system global fabric-mac <mac-address> command:

cumulus@switch:mgmt:~$ nv set system global fabric-mac 00:00:5E:00:01:FF
cumulus@switch:mgmt:~$ nv config apply

To set a fabric ID, run the nv set system global fabric-id <number> command:

cumulus@switch:mgmt:~$ nv set system global fabric-id 255
cumulus@switch:mgmt:~$ nv config apply
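
The relationship between the fabric ID and the VRR MAC address can be sketched in a few lines of Python. This is an illustration, not the switch implementation. It assumes that the fabric ID becomes the last octet of 00:00:5E:00:01:00, which is consistent with the default (fabric ID 1 gives 00:00:5E:00:01:01), and that valid fabric IDs range from 1 to 255. The function name vrr_mac_from_fabric_id is hypothetical.

```python
VRR_MAC_PREFIX = "00:00:5E:00:01"  # assumed fixed prefix of the fabric-wide MAC


def vrr_mac_from_fabric_id(fabric_id):
    """Return the VRR MAC address that corresponds to a fabric ID (assumed 1-255)."""
    if not 1 <= fabric_id <= 255:
        raise ValueError("fabric ID must be between 1 and 255")
    # The fabric ID is written as the final octet in two-digit hexadecimal.
    return "{}:{:02X}".format(VRR_MAC_PREFIX, fabric_id)


print(vrr_mac_from_fabric_id(1))    # 00:00:5E:00:01:01
print(vrr_mac_from_fabric_id(255))  # 00:00:5E:00:01:FF
```

Under these assumptions, a fabric ID of 255 and a fabric MAC address of 00:00:5E:00:01:FF describe the same setting.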

To override the global setting for a specific VLAN, run the nv set interface <vlan> ip vrr mac-address <mac-address> command:

cumulus@switch:mgmt:~$ nv set interface vlan10 ip vrr mac-address 00:00:5E:00:01:00
cumulus@switch:mgmt:~$ nv config apply

To change the VRR MAC address manually, edit the /etc/network/interfaces file and update the MAC address in the address-virtual line for each VLAN. Cumulus Linux does not provide a fabric ID option in the /etc/network/interfaces file.

The following example shows vlan10, vlan20, and vlan30:

cumulus@switch:mgmt:~$ sudo nano /etc/network/interfaces
...
auto vlan10
iface vlan10
    address 10.1.10.5/24
    address-virtual 00:00:5E:00:01:FF 10.1.10.1/24
    hwaddress 44:38:39:22:01:c1
    vrf RED
    vlan-raw-device br_default
    vlan-id 10

auto vlan20
iface vlan20
    address 10.1.20.5/24
    address-virtual 00:00:5E:00:01:FF 10.1.20.1/24
    hwaddress 44:38:39:22:01:c1
    vrf RED
    vlan-raw-device br_default
    vlan-id 20

auto vlan30
iface vlan30
    address 10.1.30.5/24
    address-virtual 00:00:5E:00:01:FF 10.1.30.1/24
    hwaddress 44:38:39:22:01:c1
    vrf BLUE
    vlan-raw-device br_default
    vlan-id 30
...

Make sure to set the same VRR MAC address on both MLAG peers.
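
One way to check this is to compare the address-virtual lines from both peers. The following Python sketch is a minimal example under stated assumptions: you have already copied the text of each peer's /etc/network/interfaces file into a string, and each VLAN stanza has a single address-virtual line. The helper names vrr_macs and vrr_mac_mismatches are hypothetical.

```python
def vrr_macs(interfaces_text):
    """Map each iface name to the lower-case MAC on its address-virtual line."""
    macs = {}
    current = None
    for raw in interfaces_text.splitlines():
        fields = raw.split()
        if not fields:
            continue
        if fields[0] == "iface" and len(fields) > 1:
            current = fields[1]
        elif fields[0] == "address-virtual" and current and len(fields) > 1:
            macs[current] = fields[1].lower()
    return macs


def vrr_mac_mismatches(peer_a_text, peer_b_text):
    """Return {iface: (mac_on_a, mac_on_b)} for every interface that differs."""
    a, b = vrr_macs(peer_a_text), vrr_macs(peer_b_text)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Illustrative excerpts; the second peer has a different MAC on vlan20.
PEER_A = """\
auto vlan10
iface vlan10
    address 10.1.10.2/24
    address-virtual 00:00:5E:00:01:FF 10.1.10.1/24
auto vlan20
iface vlan20
    address 10.1.20.2/24
    address-virtual 00:00:5E:00:01:FF 10.1.20.1/24
"""
PEER_B = """\
auto vlan10
iface vlan10
    address 10.1.10.3/24
    address-virtual 00:00:5e:00:01:ff 10.1.10.1/24
auto vlan20
iface vlan20
    address 10.1.20.3/24
    address-virtual 00:00:5e:00:01:01 10.1.20.1/24
"""

print(vrr_mac_mismatches(PEER_A, PEER_B))
```

The comparison ignores letter case, so 00:00:5E:00:01:FF and 00:00:5e:00:01:ff count as the same address.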

EVPN Routing with VRR

In an EVPN routing environment, if you want to configure multiple subnets as VRR addresses on a VLAN, you must configure them with the same VRR MAC address.

The following example commands configure both 10.1.10.1/24 and 10.1.11.1/24 on VLAN 10 using the default fabric-wide VRR MAC address 00:00:5e:00:01:01.

cumulus@switch:mgmt:~$ nv set interface vlan10 ip vrr address 10.1.10.1/24
cumulus@switch:mgmt:~$ nv set interface vlan10 ip vrr address 10.1.11.1/24
cumulus@switch:mgmt:~$ nv config apply

Edit the /etc/network/interfaces file; for example:

cumulus@switch:mgmt:~$ sudo nano /etc/network/interfaces
auto vlan10
iface vlan10
    address 10.1.10.2/24
    address 10.1.11.2/24
    address-virtual 00:00:5e:00:01:01 10.1.10.1/24 10.1.11.1/24
    hwaddress 44:38:39:22:01:7a
    vlan-raw-device br_default
    vlan-id 10
...

To reduce BGP EVPN processing during convergence, NVIDIA recommends that you use the same fabric-wide MAC address across all VLANs and VRR subnets.
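
To see whether a switch follows this recommendation, you can list every VRR MAC address that appears in its interfaces file. The following Python sketch is illustrative and assumes the address-virtual syntax shown above (a MAC address followed by one or more addresses). The function name vrr_macs_in_use is hypothetical.

```python
from collections import defaultdict


def vrr_macs_in_use(interfaces_text):
    """Map each VRR MAC (lower case) to the (interface, address) pairs that use it."""
    usage = defaultdict(list)
    current = None
    for raw in interfaces_text.splitlines():
        fields = raw.split()
        if not fields:
            continue
        if fields[0] == "iface" and len(fields) > 1:
            current = fields[1]
        elif fields[0] == "address-virtual" and current and len(fields) > 2:
            mac = fields[1].lower()
            for address in fields[2:]:
                usage[mac].append((current, address))
    return dict(usage)


# Illustrative input: vlan20 deliberately uses a different MAC.
SAMPLE = """\
auto vlan10
iface vlan10
    address 10.1.10.2/24
    address 10.1.11.2/24
    address-virtual 00:00:5e:00:01:01 10.1.10.1/24 10.1.11.1/24
auto vlan20
iface vlan20
    address 10.1.20.2/24
    address-virtual 00:00:5E:00:01:FF 10.1.20.1/24
"""

usage = vrr_macs_in_use(SAMPLE)
if len(usage) > 1:
    print("More than one VRR MAC address in use:", sorted(usage))
```

A result with a single key means all VLANs and VRR subnets in the file share one MAC address.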

Configure the Servers

Each server must have two network interfaces. The switches configure the interfaces as bonds running LACP; the servers must also configure the two interfaces using teaming, port aggregation, port group, or EtherChannel running LACP. Configure the servers either statically or with DHCP, with a gateway address that is the IP address of the virtual router; this default gateway address never changes.

Configure the links between the servers and the switches in active-active mode for FHRP.

Troubleshooting

To verify the configuration on the switch, run the nv show interface command:

cumulus@leaf01:mgmt:~$ nv show interface
Interface       State  Speed  MTU    Type      Remote Host      Remote Port  Summary                                 
--------------  -----  -----  -----  --------  ---------------  -----------  ----------------------------------------
BLUE            up            65575  vrf                                     IP Address:                  127.0.0.1/8
                                                                             IP Address:                      ::1/128
RED             up            65575  vrf                                     IP Address:                  127.0.0.1/8
                                                                             IP Address:                      ::1/128
bond1           up     1G     9000   bond                                                                            
bond2           up     1G     9000   bond                                                                            
bond3           up     1G     9000   bond                                                                            
br_default      up            9216   bridge                                  IP Address:  fe80::4638:39ff:fe22:17a/64
eth0            up     1G     1500   eth       oob-mgmt-switch  swp10        IP Address:            192.168.200.11/24
                                                                             IP Address:  fe80::4638:39ff:fe22:17a/64
lo              up            65536  loopback                                IP Address:                 10.0.1.12/32
                                                                             IP Address:                10.10.10.1/32
                                                                             IP Address:                  127.0.0.1/8
                                                                             IP Address:                      ::1/128
mgmt            up            65575  vrf                                     IP Address:                  127.0.0.1/8
                                                                             IP Address:                      ::1/128
peerlink        up     2G     9216   bond                                                                            
peerlink.4094   up            9216   sub                                     IP Address: fe80::4ab0:2dff:fed1:e4e1/64
swp1            up     1G     9000   swp 
...

Configuration Example

The following example creates an MLAG configuration that incorporates VRR.

cumulus@leaf01:mgmt:~$ nv set interface lo ip address 10.10.10.1/32
cumulus@leaf01:mgmt:~$ nv set interface swp1-3,swp49-51
cumulus@leaf01:mgmt:~$ nv set interface bond1 bond member swp1
cumulus@leaf01:mgmt:~$ nv set interface bond2 bond member swp2
cumulus@leaf01:mgmt:~$ nv set interface bond3 bond member swp3
cumulus@leaf01:mgmt:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf01:mgmt:~$ nv set interface bond2 bond mlag id 2
cumulus@leaf01:mgmt:~$ nv set interface bond3 bond mlag id 3
cumulus@leaf01:mgmt:~$ nv set interface bond1-3 bridge domain br_default
cumulus@leaf01:mgmt:~$ nv set interface peerlink bond member swp49-50
cumulus@leaf01:mgmt:~$ nv set system global anycast-mac 44:38:39:FF:00:AA
cumulus@leaf01:mgmt:~$ nv set mlag backup 10.10.10.2
cumulus@leaf01:mgmt:~$ nv set mlag peer-ip linklocal
cumulus@leaf01:mgmt:~$ nv set bridge domain br_default vlan 10,20,30
cumulus@leaf01:mgmt:~$ nv set interface vlan10 ip address 10.1.10.2/24
cumulus@leaf01:mgmt:~$ nv set interface vlan10 ip vrr address 10.1.10.1/24
cumulus@leaf01:mgmt:~$ nv set interface vlan10 ip vrr state up
cumulus@leaf01:mgmt:~$ nv set interface vlan20 ip address 10.1.20.2/24
cumulus@leaf01:mgmt:~$ nv set interface vlan20 ip vrr address 10.1.20.1/24
cumulus@leaf01:mgmt:~$ nv set interface vlan20 ip vrr state up
cumulus@leaf01:mgmt:~$ nv set interface vlan30 ip address 10.1.30.2/24
cumulus@leaf01:mgmt:~$ nv set interface vlan30 ip vrr address 10.1.30.1/24
cumulus@leaf01:mgmt:~$ nv set interface vlan30 ip vrr state up
cumulus@leaf01:mgmt:~$ nv config apply
cumulus@leaf02:mgmt:~$ nv set interface lo ip address 10.10.10.2/32
cumulus@leaf02:mgmt:~$ nv set interface swp1-3,swp49-51
cumulus@leaf02:mgmt:~$ nv set interface bond1 bond member swp1
cumulus@leaf02:mgmt:~$ nv set interface bond2 bond member swp2
cumulus@leaf02:mgmt:~$ nv set interface bond3 bond member swp3
cumulus@leaf02:mgmt:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf02:mgmt:~$ nv set interface bond2 bond mlag id 2
cumulus@leaf02:mgmt:~$ nv set interface bond3 bond mlag id 3
cumulus@leaf02:mgmt:~$ nv set interface bond1-3 bridge domain br_default
cumulus@leaf02:mgmt:~$ nv set interface peerlink bond member swp49-50
cumulus@leaf02:mgmt:~$ nv set system global anycast-mac 44:38:39:FF:00:AA
cumulus@leaf02:mgmt:~$ nv set mlag backup 10.10.10.1
cumulus@leaf02:mgmt:~$ nv set mlag peer-ip linklocal
cumulus@leaf02:mgmt:~$ nv set bridge domain br_default vlan 10,20,30
cumulus@leaf02:mgmt:~$ nv set interface vlan10 ip address 10.1.10.3/24
cumulus@leaf02:mgmt:~$ nv set interface vlan10 ip vrr address 10.1.10.1/24
cumulus@leaf02:mgmt:~$ nv set interface vlan10 ip vrr state up
cumulus@leaf02:mgmt:~$ nv set interface vlan20 ip address 10.1.20.3/24
cumulus@leaf02:mgmt:~$ nv set interface vlan20 ip vrr address 10.1.20.1/24
cumulus@leaf02:mgmt:~$ nv set interface vlan20 ip vrr state up
cumulus@leaf02:mgmt:~$ nv set interface vlan30 ip address 10.1.30.3/24
cumulus@leaf02:mgmt:~$ nv set interface vlan30 ip vrr address 10.1.30.1/24
cumulus@leaf02:mgmt:~$ nv set interface vlan30 ip vrr state up
cumulus@leaf02:mgmt:~$ nv config apply
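
The SVI commands above follow a pattern: each MLAG peer has its own SVI address per VLAN, and both peers share the same VRR address. The following Python sketch generates those command strings so you can review them before you paste them into a switch. It is an illustration only. The addressing plan (10.1.<vlan>.<host>/24 with .1 as the VRR address) mirrors this example, and the function names are hypothetical.

```python
def vrr_svi_commands(vlan_id, svi_address, vrr_address):
    """Return the NVUE commands that configure one SVI with VRR."""
    iface = "vlan{}".format(vlan_id)
    return [
        "nv set interface {} ip address {}".format(iface, svi_address),
        "nv set interface {} ip vrr address {}".format(iface, vrr_address),
        "nv set interface {} ip vrr state up".format(iface),
    ]


def peer_commands(host_octet, vlans=(10, 20, 30)):
    """Build the SVI and VRR commands for one MLAG peer.

    host_octet is the last octet of the peer's own SVI addresses; the VRR
    address always ends in .1 in this example plan.
    """
    commands = []
    for vlan in vlans:
        svi = "10.1.{}.{}/24".format(vlan, host_octet)
        vrr = "10.1.{}.1/24".format(vlan)
        commands.extend(vrr_svi_commands(vlan, svi, vrr))
    commands.append("nv config apply")
    return commands


# leaf01 uses .2 and leaf02 uses .3 for their own SVI addresses.
for command in peer_commands(3):
    print(command)
```

The sketch only prints text; it does not connect to a switch or run any command.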
cumulus@leaf01:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml 
- set:
    bridge:
      domain:
        br_default:
          vlan:
            10,20,30: {}
    interface:
      bond1:
        bond:
          member:
            swp1: {}
          mlag:
            enable: on
            id: 1
        bridge:
          domain:
            br_default: {}
        type: bond
      bond2:
        bond:
          member:
            swp2: {}
          mlag:
            enable: on
            id: 2
        bridge:
          domain:
            br_default: {}
        type: bond
      bond3:
        bond:
          member:
            swp3: {}
          mlag:
            enable: on
            id: 3
        bridge:
          domain:
            br_default: {}
        type: bond
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.1/32: {}
        type: loopback
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        base-interface: peerlink
        type: sub
        vlan: 4094
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      vlan10:
        ip:
          address:
            10.1.10.2/24: {}
          vrr:
            address:
              10.1.10.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 10
      vlan20:
        ip:
          address:
            10.1.20.2/24: {}
          vrr:
            address:
              10.1.20.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 20
      vlan30:
        ip:
          address:
            10.1.30.2/24: {}
          vrr:
            address:
              10.1.30.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 30
    mlag:
      backup:
        10.10.10.2: {}
      enable: on
      init-delay: 100
      peer-ip: linklocal
    router:
      vrr:
        enable: on
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$j04yw0gknNcfsUxt$OPF0Z9ilC5IF30kJAaQ5lWEhqk67uAugMvKRomBM8az8hZGbyAKmRdfUJrKCmakKxqdd/sq/smbtkD/xQB8rW.
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        anycast-mac: 44:38:39:FF:00:AA
        fabric-mac: 00:00:5E:00:01:01
        system-mac: 44:38:39:22:01:7a
      hostname: leaf01
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
cumulus@leaf02:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    bridge:
      domain:
        br_default:
          vlan:
            10,20,30: {}
    interface:
      bond1:
        bond:
          member:
            swp1: {}
          mlag:
            enable: on
            id: 1
        bridge:
          domain:
            br_default: {}
        type: bond
      bond2:
        bond:
          member:
            swp2: {}
          mlag:
            enable: on
            id: 2
        bridge:
          domain:
            br_default: {}
        type: bond
      bond3:
        bond:
          member:
            swp3: {}
          mlag:
            enable: on
            id: 3
        bridge:
          domain:
            br_default: {}
        type: bond
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.2/32: {}
        type: loopback
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        base-interface: peerlink
        type: sub
        vlan: 4094
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      vlan10:
        ip:
          address:
            10.1.10.3/24: {}
          vrr:
            address:
              10.1.10.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 10
      vlan20:
        ip:
          address:
            10.1.20.3/24: {}
          vrr:
            address:
              10.1.20.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 20
      vlan30:
        ip:
          address:
            10.1.30.3/24: {}
          vrr:
            address:
              10.1.30.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 30
    mlag:
      backup:
        10.10.10.1: {}
      enable: on
      init-delay: 100
      peer-ip: linklocal
    router:
      vrr:
        enable: on
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$/jEbjL96YZO24NK/$3H1mMl1S1Udxcv9l4jQUXFgZN2bVAxEaDLLzy.dbpHjH80TIq0YhTbCMG.Y0p5s7wtUIEHrWaaBaBRsfSkKwM/
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        anycast-mac: 44:38:39:FF:00:AA
        fabric-mac: 00:00:5E:00:01:01
        system-mac: 44:38:39:22:01:78
      hostname: leaf02
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
cumulus@leaf01:mgmt:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.1/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto bond1
iface bond1
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 1
auto bond2
iface bond2
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 2
auto bond3
iface bond3
    bond-slaves swp3
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 3
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-backup-ip 10.10.10.2
    clagd-sys-mac 44:38:39:FF:00:AA
    clagd-args --initDelay 100
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto vlan10
iface vlan10
    address 10.1.10.2/24
    address-virtual 00:00:5e:00:01:00 10.1.10.1/24
    hwaddress 44:38:39:22:01:b1
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    address 10.1.20.2/24
    address-virtual 00:00:5e:00:01:00 10.1.20.1/24
    hwaddress 44:38:39:22:01:b1
    vlan-raw-device br_default
    vlan-id 20
auto vlan30
iface vlan30
    address 10.1.30.2/24
    address-virtual 00:00:5e:00:01:00 10.1.30.1/24
    hwaddress 44:38:39:22:01:b1
    vlan-raw-device br_default
    vlan-id 30
auto br_default
iface br_default
    bridge-ports bond1 bond2 bond3 peerlink
    hwaddress 44:38:39:22:01:b1
    bridge-vlan-aware yes
    bridge-vids 10 20 30
    bridge-pvid 1
cumulus@leaf02:mgmt:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.2/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto bond1
iface bond1
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 1
auto bond2
iface bond2
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 2
auto bond3
iface bond3
    bond-slaves swp3
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 3
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-backup-ip 10.10.10.1
    clagd-sys-mac 44:38:39:FF:00:AA
    clagd-args --initDelay 100
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto vlan10
iface vlan10
    address 10.1.10.3/24
    address-virtual 00:00:5e:00:01:00 10.1.10.1/24
    hwaddress 44:38:39:22:01:af
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    address 10.1.20.3/24
    address-virtual 00:00:5e:00:01:00 10.1.20.1/24
    hwaddress 44:38:39:22:01:af
    vlan-raw-device br_default
    vlan-id 20
auto vlan30
iface vlan30
    address 10.1.30.3/24
    address-virtual 00:00:5e:00:01:00 10.1.30.1/24
    hwaddress 44:38:39:22:01:af
    vlan-raw-device br_default
    vlan-id 30
auto br_default
iface br_default
    bridge-ports bond1 bond2 bond3 peerlink
    hwaddress 44:38:39:22:01:af
    bridge-vlan-aware yes
    bridge-vids 10 20 30
    bridge-pvid 1
cumulus@server01:mgmt:~$ sudo cat /etc/network/interfaces
...
auto eth0
iface eth0 inet dhcp
  post-up sysctl -w net.ipv6.conf.eth0.accept_ra=2

auto eth1
iface eth1

auto eth2
iface eth2

auto bond1
iface bond1
  bond-miimon 100
  bond-mode 802.3ad
  bond-min-links 1
  bond-slaves eth1 eth2
  post-up ip route add 10.0.0.0/8 via 10.1.20.1

auto bond1.10
iface bond1.10
  address 10.1.10.101/24

auto bond1.20
iface bond1.20
  address 10.1.20.101/24

auto bond1.30
iface bond1.30
  address 10.1.30.101/24

cumulus@server02:mgmt:~$ sudo cat /etc/network/interfaces
...
auto eth0
iface eth0 inet dhcp
  post-up sysctl -w net.ipv6.conf.eth0.accept_ra=2

auto eth1
iface eth1

auto eth2
iface eth2

auto bond1
iface bond1
  bond-miimon 100
  bond-mode 802.3ad
  bond-min-links 1
  bond-slaves eth1 eth2
  post-up ip route add 10.0.0.0/8 via 10.1.20.1

auto bond1.10
iface bond1.10
  address 10.1.10.102/24

auto bond1.20
iface bond1.20
  address 10.1.20.102/24

auto bond1.30
iface bond1.30
  address 10.1.30.102/24

This simulation is running Cumulus Linux 5.11. The Cumulus Linux 5.12 simulation is coming soon.

The simulation is pre-configured using NVUE commands.

To validate the configuration, run the nv show interface <vlan> ip vrr command:

cumulus@leaf02:mgmt:~$ nv show interface vlan10 ip vrr
             operational        applied            description
-----------  -----------------  -----------------  ------------------------------------------------------
enable                          on                 Turn the feature 'on' or 'off'.  The default is 'off'.
mac-address  00:00:5e:00:01:00  00:00:5e:00:01:00  Override anycast-mac
mac-id                          none               Override anycast-id
[address]    10.1.10.1/24       10.1.10.1/24       Virtual addresses with prefixes
state        up                 up                 The state of the interface

IGMP and MLD Snooping

IGMP and MLD snooping prevent hosts on a local network from receiving traffic for a multicast group they have not explicitly joined. IGMP snooping is for IPv4 environments and MLD snooping is for IPv6 environments.

The bridge driver in the Cumulus Linux kernel includes IGMP and MLD snooping. If you disable IGMP or MLD snooping, multicast traffic floods to all the bridge ports in the bridge. Similarly, in the absence of receivers in a VLAN, multicast traffic floods to all ports in the VLAN.

Configure the IGMP and MLD Querier

Without a multicast router, a single switch in an IP subnet can coordinate multicast traffic flows. This switch is the querier or the designated router. The querier generates query messages to check group membership, and processes membership reports and leave messages.

To configure the querier on the switch for a VLAN-aware bridge, enable the multicast querier on the bridge and add the source IP address of the queries to the VLAN.

Before you configure the querier, make sure to configure the bridge, VLAN, and ports.

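The following example commands add swp1 through swp3 to the br_default bridge, configure VLAN 10, enable the multicast querier on the bridge, and set the source IP address of the queries for VLAN 10 to 10.10.10.1: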

cumulus@switch:~$ nv set interface swp1-3 bridge domain br_default
cumulus@switch:~$ nv set bridge domain br_default vlan 10
cumulus@switch:~$ nv set interface swp1 bridge domain br_default vlan 10
cumulus@switch:~$ nv set bridge domain br_default multicast snooping querier enable on
cumulus@switch:~$ nv set bridge domain br_default vlan 10 multicast snooping querier source-ip 10.10.10.1
cumulus@switch:~$ nv config apply

NVUE commands for a bridge in traditional mode are not supported.

Edit the /etc/network/interfaces file to add bridge-mcquerier 1 to the bridge stanza (this enables the multicast querier on the bridge) and add bridge-igmp-querier-src <ip-address> to the VLAN stanza (this is the source IP address of the queries).

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto br_default.10
vlan br_default.10
    bridge-igmp-querier-src 10.10.10.1

auto br_default
iface br_default
    bridge-ports swp1 swp2 swp3
    hwaddress 1c:34:da:b9:46:fd
    bridge-vlan-aware yes
    bridge-vids 10
    bridge-pvid 1
    bridge-stp yes
    bridge-mcsnoop yes
    bridge-mcquerier yes
    mstpctl-forcevers rstp
...

Run the ifreload -a command to reload the configuration:

cumulus@switch:~$ sudo ifreload -a

To configure the querier on the switch for a bridge in traditional mode, edit the bridge stanza in the /etc/network/interfaces file to add bridge-mcquerier 1 (this enables the multicast querier on the bridge) and bridge-mcqifaddr 1 (this configures the source IP address of the queries to be the bridge IP address).

...
auto br0
iface br0
  address 10.10.10.10/24
  bridge-ports swp1 swp2 swp3
  bridge-vlan-aware no
  bridge-mcquerier 1
  bridge-mcqifaddr 1
...

Run the ifreload -a command to reload the configuration:

cumulus@switch:~$ sudo ifreload -a

Optimized Multicast Flooding (OMF)

IGMP snooping restricts multicast forwarding to only the ports that receive IGMP report messages. Multicast traffic for groups with no IGMP reports, known as unregistered multicast (URMC) traffic, floods to all ports in the bridge domain. To restrict this flooding to only mrouter ports, you can enable OMF.

To enable OMF:

  1. Configure an IGMP querier. See Configure the IGMP and MLD Querier above.

  2. In the IGMP snooping unregistered L2 multicast flood control section of the /etc/cumulus/switchd.conf file, uncomment and change these settings to TRUE, then restart switchd.

    • bridge.unreg_mcast_init
    • bridge.unreg_v4_mcast_prune
    • bridge.unreg_v6_mcast_prune
    cumulus@switch:~$ sudo nano /etc/cumulus/switchd.conf
    ...
    #IGMP snooping unregistered L2 multicast flood control
    #
    #Initialize prune module:
    bridge.unreg_mcast_init = TRUE
    #
    #Note:
    #Below configuration allowed only when bridge.unreg_mcast_init is set to TRUE
    #
    #Set below to TRUE to enable unregistered L2 multicast prune to mrouter ports.
    #Default is to flood the unregistered L2 multicast
    #
    bridge.unreg_v4_mcast_prune = TRUE
    bridge.unreg_v6_mcast_prune = TRUE
    
cumulus@switch:~$ sudo systemctl restart switchd.service

Restarting the switchd service causes all network ports to reset, interrupting network services, in addition to resetting the switch hardware configuration.

When IGMP reports go to a multicast group, OMF has no effect; normal IGMP snooping occurs.

When you enable OMF, you can configure a bridge port as an mrouter port to forward unregistered multicast traffic to that port.

Cumulus Linux does not provide NVUE commands for this setting.

Edit the /etc/network/interfaces file to add bridge-portmcrouter enabled to the swp1 stanza.

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto swp1
iface swp1
   bridge-portmcrouter enabled
...

Run the ifreload -a command to reload the configuration:

cumulus@switch:~$ sudo ifreload -a

OMF increases memory usage, which can impact scaling on Spectrum 1 switches.
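
The forwarding behavior described in this section can be sketched as a small decision function. This is an illustrative model only, not how the bridge driver or switchd is implemented; the function name, its parameters, and the assumption that registered groups are also forwarded to mrouter ports are all hypothetical choices made for the sketch.

```python
def egress_ports(group, ingress, bridge_ports, mdb, mrouter_ports,
                 snooping=True, omf=False):
    """Toy model: return the set of ports a multicast frame is sent to.

    group         -- destination multicast group of the frame
    ingress       -- port the frame arrived on (never an egress port)
    bridge_ports  -- all ports in the bridge
    mdb           -- dict mapping group -> set of ports that sent reports
    mrouter_ports -- ports configured or learned as mrouter ports
    """
    others = set(bridge_ports) - {ingress}
    if not snooping:
        # Snooping disabled: multicast floods to all bridge ports.
        return others
    members = set(mdb.get(group, ()))
    if members:
        # Registered group: normal snooping, OMF has no effect.
        # Assumption: mrouter ports also receive registered traffic.
        return (members | set(mrouter_ports)) & others
    if omf:
        # Unregistered group with OMF: prune to mrouter ports only.
        return set(mrouter_ports) & others
    # Unregistered group without OMF: flood.
    return others


ports = {"swp1", "swp2", "swp3", "swp4"}
mdb = {"234.10.10.10": {"swp2"}}
print(sorted(egress_ports("239.9.9.9", "swp1", ports, mdb, {"swp3"}, omf=True)))
```

In this model, enabling OMF changes the result only for groups that have no entry in the MDB.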

Improve Multicast Convergence

For large multicast environments, the default CoPP policer might be too restrictive. You can adjust the policer to improve multicast convergence.

To tune the IGMP and MLD forwarding and burst rates:

The following example commands set the IGMP forwarding rate to 400 and the IGMP burst rate to 200 packets per second:

cumulus@switch:~$ nv set system control-plane policer igmp rate 400
cumulus@switch:~$ nv set system control-plane policer igmp burst 200
cumulus@switch:~$ nv config apply

  1. Edit the /etc/cumulus/control-plane/policers.conf file.

    • For IGMP, change the copp.igmp.rate and copp.igmp.burst parameters.
    • For MLD, change the copp.icmp6_def_mld.rate and copp.icmp6_def_mld.burst parameters.

    The following example changes the IGMP and MLD forwarding rate to 400 packets per second and the burst rate to 200 packets per second:

    cumulus@switch:~$ sudo nano /etc/cumulus/control-plane/policers.conf
    ...
    copp.igmp.enable = TRUE
    copp.igmp.rate = 400
    copp.igmp.burst = 200
    ...
    copp.icmp6_def_mld.enable = TRUE
    copp.icmp6_def_mld.rate = 400
    copp.icmp6_def_mld.burst = 200
    ...
    
  2. Run the following command:

    cumulus@switch:~$ /usr/lib/cumulus/switchdctl --load /etc/cumulus/control-plane/policers.conf
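
To get a feel for how the rate and burst values interact, the following sketch models a policer as a simple token bucket. This is an assumption made for illustration; this guide does not describe the policing algorithm the hardware uses, and the function name and parameters are hypothetical.

```python
def admitted(arrival_times, rate, burst):
    """Count packets that pass a token-bucket policer.

    arrival_times -- packet arrival times in seconds
    rate          -- tokens (packets) added per second
    burst         -- bucket capacity in packets; the bucket starts full
    """
    tokens = float(burst)
    last = 0.0
    passed = 0
    for t in sorted(arrival_times):
        # Refill for the elapsed time, never beyond the bucket capacity.
        tokens = min(float(burst), tokens + (t - last) * rate)
        last = t
        if tokens >= 1.0:
            tokens -= 1.0
            passed += 1
    return passed


# 1000 IGMP reports arriving at the same instant, rate 400, burst 200.
print(admitted([0.0] * 1000, rate=400, burst=200))
```

Under this model, a simultaneous burst of reports larger than the burst value is partially dropped, while a steady stream at or below the configured rate passes in full.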
    

Change the Bridge IGMP Version

You can configure a bridge to use IGMPv2 or IGMPv3. IGMPv2 is the default version. To change the IGMP version, add the bridge-igmp-version <version> parameter to the bridge stanza in the /etc/network/interfaces file. For example, to change the IGMP version to IGMPv3:

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto br_default
iface br_default
    bridge-ports swp3
    hwaddress 44:38:39:22:01:bb
    bridge-vlan-aware yes
    bridge-vids 1
    bridge-pvid 1
    bridge-igmp-version 3

NVUE does not provide a command to change the bridge IGMP version.

Disable IGMP and MLD Snooping

If you do not use mirroring functions or other types of multicast traffic, you can disable IGMP and MLD snooping.

cumulus@switch:~$ nv set bridge domain br_default multicast snooping enable off
cumulus@switch:~$ nv config apply

Edit the /etc/network/interfaces file and set bridge-mcsnoop to 0 in the bridge stanza:

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto bridge
iface bridge
  bridge-mcquerier 1
  bridge-mcsnoop 0
  bridge-ports swp1 swp2 swp3
  bridge-pvid 1
  bridge-vids 100 200
  bridge-vlan-aware yes
...

Run the ifreload -a command to reload the configuration:

cumulus@switch:~$ sudo ifreload -a

Troubleshooting

To show the IGMP and MLD snooping bridge state, run the brctl showstp <bridge> command:

cumulus@switch:~$ sudo brctl showstp bridge
  bridge
  bridge id              8000.7072cf8c272c
  designated root        8000.7072cf8c272c
  root port                 0                    path cost                  0
  max age                  20.00                 bridge max age            20.00
  hello time                2.00                 bridge hello time          2.00
  forward delay            15.00                 bridge forward delay      15.00
  ageing time             300.00
  hello timer               0.00                 tcn timer                  0.00
  topology change timer     0.00                 gc timer                 263.70
  hash elasticity        4096                    hash max                4096
  mc last member count      2                    mc init query count        2
  mc router                 1                    mc snooping                1
  mc last member timer      1.00                 mc membership timer      260.00
  mc querier timer        255.00                 mc query interval        125.00
  mc response interval     10.00                 mc init query interval    31.25
  mc querier                0                    mc query ifaddr            0
  flags

swp1 (1)
  port id                8001                    state                forwarding
  designated root        8000.7072cf8c272c       path cost                  2
  designated bridge      8000.7072cf8c272c       message age timer          0.00
  designated port        8001                    forward delay timer        0.00
  designated cost           0                    hold timer                 0.00
  mc router                 1                    mc fast leave              0
  flags

swp2 (2)
  port id                8002                    state                forwarding
  designated root        8000.7072cf8c272c       path cost                  2
  designated bridge      8000.7072cf8c272c       message age timer          0.00
  designated port        8002                    forward delay timer        0.00
  designated cost           0                    hold timer                 0.00
  mc router                 1                    mc fast leave              0
  flags

swp3 (3)
  port id                8003                    state                forwarding
  designated root        8000.7072cf8c272c       path cost                  2
  designated bridge      8000.7072cf8c272c       message age timer          0.00
  designated port        8003                    forward delay timer        8.98
  designated cost           0                    hold timer                 0.00
  mc router                 1                    mc fast leave              0
  flags

Cumulus Linux tracks multicast group and port state in the MDB. To show the groups and bridge port state, run the Linux sudo bridge mdb show command. To show detailed router ports and group information, run the sudo bridge -d -s mdb show command:

cumulus@switch:~$ sudo bridge -d -s mdb show
  dev bridge port swp2 grp 234.10.10.10 temp 241.67
  dev bridge port swp1 grp 238.39.20.86 permanent 0.00
  dev bridge port swp1 grp 234.1.1.1 temp 235.43
  dev bridge port swp2 grp ff1a::9 permanent 0.00
  router ports on bridge: swp3

Scale Considerations

The number of unique multicast groups supported in the MDB is 4096 by default. To increase the maximum number of multicast groups in the MDB, edit the /etc/network/interfaces file to add a bridge-hashmax value to the bridge stanza:

auto br_default
iface br_default
  bridge-hashmax 16384
  bridge-ports swp1 swp2 swp3
  bridge-vlan-aware yes
  bridge-vids 10 20
  bridge-pvid 1
  bridge-mcquerier 1
  bridge-mcsnoop 1

The supported values for bridge-hashmax are 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536.
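
As a planning aid, the following sketch picks the smallest supported bridge-hashmax value for an expected number of multicast groups. It treats bridge-hashmax as the upper bound on the number of groups, as described above; the function name is hypothetical.

```python
# Supported bridge-hashmax values, as listed above (all powers of two).
SUPPORTED_HASHMAX = (512, 1024, 2048, 4096, 8192, 16384, 32768, 65536)


def smallest_hashmax(group_count):
    """Return the smallest supported value that is >= group_count."""
    if group_count < 0:
        raise ValueError("group_count must not be negative")
    for value in SUPPORTED_HASHMAX:
        if value >= group_count:
            return value
    raise ValueError(
        f"{group_count} groups exceeds the largest supported value "
        f"({SUPPORTED_HASHMAX[-1]})"
    )


print(smallest_hashmax(5000))
```

For 5000 expected groups, the default of 4096 is too small and the next supported value is 8192.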

DIP-based Multicast Forwarding

Cumulus Linux does not support DIP-based multicast forwarding. Do not configure addresses in the 224.0.0.x through 239.0.0.x and 224.128.0.x through 239.128.0.x IP ranges as multicast groups; these addresses map to link-local MAC addresses (01:00:5e:00:00:xx).
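
The overlap comes from the standard IPv4 multicast to Ethernet mapping, which copies only the low 23 bits of the group address into the MAC address. The following sketch shows the mapping and a check for the ranges to avoid; the function names are hypothetical and not part of any Cumulus Linux tool.

```python
import ipaddress


def multicast_mac(group):
    """Return the Ethernet MAC address an IPv4 multicast group maps to."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF  # only the low 23 bits are preserved
    octets = [0x01, 0x00, 0x5E,
              (low23 >> 16) & 0x7F, (low23 >> 8) & 0xFF, low23 & 0xFF]
    return ":".join(f"{o:02x}" for o in octets)


def maps_to_link_local_mac(group):
    """True when the group maps into 01:00:5e:00:00:xx."""
    return multicast_mac(group).startswith("01:00:5e:00:00:")


for g in ("224.0.0.5", "239.128.0.5", "239.1.1.1"):
    print(g, multicast_mac(g), maps_to_link_local_mac(g))
```

Both 224.0.0.5 and 239.128.0.5 map to 01:00:5e:00:00:05, which is why the second range also collides with link-local MAC addresses.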

MAC Address Translation

MAC address translation enables you to translate the source MAC address for packets on egress and the destination MAC address for packets on ingress. MAC address translation is equivalent to static NAT but operates at layer 2 on Ethernet frames.

Configure MAC Address Translation

To configure MAC address translation:

Cumulus Linux only supports one MAC address in a translation rule.

The following example matches Ethernet packets with source MAC address b8:ce:f6:3c:62:06 and translates the MAC address to 99:de:fc:32:11:01 on egress on swp5.

cumulus@switch:~$ nv set acl MACL1 type mac
cumulus@switch:~$ nv set acl MACL1 rule 1 match mac source-mac b8:ce:f6:3c:62:06  
cumulus@switch:~$ nv set acl MACL1 rule 1 action source-nat translate-mac 99:de:fc:32:11:01 
cumulus@switch:~$ nv config apply

cumulus@switch:~$ nv set interface swp5 acl MACL1 outbound  
cumulus@switch:~$ nv config apply   

The following example matches Ethernet packets with destination MAC address 01:12:34:32:11:01 and translates the MAC address to 99:de:fc:32:11:01 on ingress on swp5.

cumulus@switch:~$ nv set acl MACL2 type mac
cumulus@switch:~$ nv set acl MACL2 rule 1 match mac dest-mac 01:12:34:32:11:01 
cumulus@switch:~$ nv set acl MACL2 rule 1 action dest-nat translate-mac 99:de:fc:32:11:01
cumulus@switch:~$ nv config apply

cumulus@switch:~$ nv set interface swp5 acl MACL2 inbound  
cumulus@switch:~$ nv config apply   

To create rules, use cl-acltool.

To add rules using cl-acltool, either edit an existing file in the /etc/cumulus/acl/policy.d directory and add rules under [ebtables] or create a new file in the /etc/cumulus/acl/policy.d directory and add rules under an [ebtables] section. For example:

cumulus@switch:~$ sudo nano /etc/cumulus/acl/policy.d/60_mac.rules
[ebtables]

 #Add rule

Example Rules

The following example matches Ethernet packets with source MAC address 01:12:34:32:11:01 and translates the MAC address to 99:de:fc:32:11:01 on egress on swp5.

[ebtables]

-t nat -A POSTROUTING -s 01:12:34:32:11:01 -j snat --to-source 99:de:fc:32:11:01 -o swp5

The following example matches Ethernet packets with destination MAC address 01:12:34:32:11:01 and translates the MAC address to 99:de:fc:32:11:01 on ingress on swp5.

[ebtables]

-t nat -A PREROUTING -d 01:12:34:32:11:01 -j dnat --to-dst 99:de:fc:32:11:01 -i swp5

Show MAC Address Translation Configuration and Statistics

To show the current MAC address translation configuration:

cumulus@switch:~$ nv show acl
       type  Summary
-----  ----  -------
MACL1  mac   rule: 1
MACL2  mac   rule: 1

To show information about a specific MAC address translation rule, run the nv show acl <name> --applied -o=json command:

cumulus@switch:~$ nv show acl MACL1 --applied -o=json
{
  "rule": {
    "1": {
      "action": {
        "source-nat": {
          "translate-ip": {},
          "translate-mac": "99:de:fc:32:11:01",
          "translate-port": {}
        }
      },
      "match": {
        "mac": {
          "dest-mac-mask": "ff:ff:ff:ff:ff:ff",
          "source-mac": "b8:ce:f6:3c:62:06",
          "source-mac-mask": "ff:ff:ff:ff:ff:ff"
        }
      }
    }
  },
  "type": "mac"
}

To show statistics for MAC address translation, such as the number of packets that match the rules and the number of bytes in the matched packets, run the NVUE nv show interface acl-statistics command or the Linux cl-acltool -L eb command:

cumulus@switch:~$ nv show interface acl-statistics
Interface  ACL Name   Rule ID   In Packets  In Bytes  Out Packets  Out Bytes
---------  ---------  -------   ----------  --------  -----------  ---------
swp2       macl_snat  10                              14            1.13 KB
cumulus@switch:~$ sudo cl-acltool -L eb
-s ec:d:9a:84:8b:82 -o swp2 --comment rule_id:10 -j snat --to-src 0:0:0:0:0:2 --snat-target ACCEPT, pcnt = 14 -- bcnt = 1162

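In the above example Linux command output, pcnt is the number of packets that match the rule and bcnt is the number of bytes in the matched packets.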
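
If you collect these counters in a script, the packet and byte counts can be extracted with a regular expression. The following sketch parses the sample line shown above; the pattern is based only on that one sample, so treat the exact output format as an assumption.

```python
import re

# Sample line copied from the cl-acltool -L eb output above.
SAMPLE = ("-s ec:d:9a:84:8b:82 -o swp2 --comment rule_id:10 -j snat "
          "--to-src 0:0:0:0:0:2 --snat-target ACCEPT, pcnt = 14 -- bcnt = 1162")

COUNTERS = re.compile(r"pcnt\s*=\s*(\d+)\s*--\s*bcnt\s*=\s*(\d+)")


def parse_counters(line):
    """Return (packets, bytes) from a rule line, or None if absent."""
    match = COUNTERS.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


packets, byte_count = parse_counters(SAMPLE)
print(packets, byte_count, f"{byte_count / 1024:.2f} KB")
```

Dividing the 1162 bytes by 1024 and rounding to two decimal places gives 1.13, which is consistent with the 1.13 KB shown in the NVUE output for the same rule.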

Network Virtualization

VXLAN is a standard overlay protocol that abstracts logical virtual networks from the physical network underneath. You can deploy simple and scalable layer 3 Clos architectures while extending layer 2 segments over that layer 3 network.

VXLAN uses a VLAN-like encapsulation technique to encapsulate MAC-based layer 2 Ethernet frames within layer 3 UDP packets. Each virtual network is a VXLAN logical layer 2 segment. The 24-bit VXLAN network identifier (VNI) in the VXLAN header allows VXLAN to scale to 16 million segments for multi-tenancy.
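
To make the 24-bit identifier concrete, the following sketch builds the 8-byte VXLAN header defined in RFC 7348: an 8-bit flags field with the I flag set, 24 reserved bits, the 24-bit VNI, and 8 reserved bits. The function and constant names are hypothetical and used only for illustration.

```python
import struct

VNI_BITS = 24
VNI_COUNT = 2 ** VNI_BITS  # number of values a 24-bit field can encode


def vxlan_header(vni):
    """Pack an 8-byte VXLAN header for the given VNI (RFC 7348 layout)."""
    if not 0 <= vni < VNI_COUNT:
        raise ValueError(f"VNI {vni} does not fit in {VNI_BITS} bits")
    flags_and_reserved = 0x08 << 24  # I flag set, reserved bits zero
    vni_and_reserved = vni << 8      # VNI in the upper 24 bits of the word
    return struct.pack("!II", flags_and_reserved, vni_and_reserved)


print(VNI_COUNT)
print(vxlan_header(10).hex())
```

The count of 16,777,216 possible values is where the figure of 16 million segments comes from.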

Hosts on a given virtual network join together through an overlay protocol that initiates and terminates tunnels at the edge of the multi-tenant network, typically the hypervisor vSwitch or top of rack. These edge points are the VXLAN tunnel end points (VTEP).

Cumulus Linux can start and stop VTEPs in hardware and supports wire-rate VXLAN. VXLAN provides an efficient hashing scheme across the IP fabric during the encapsulation process; the source UDP port is unique, with the hash based on layer 2 through layer 4 information from the original frame. The UDP destination port is the standard port 4789.

Cumulus Linux does not support VXLAN encapsulation over layer 3 subinterfaces (for example, swp3.111) or SVIs. Traffic transiting through the switch drops, even if you use the subinterface only for underlay traffic and it does not perform VXLAN encapsulation. Only configure VXLAN uplinks as layer 3 interfaces without any subinterfaces (for example, swp3). The VXLAN tunnel endpoints cannot share a common subnet; there must be at least one layer 3 hop between the VXLAN source and destination.

Considerations

Cut-through Mode and Store and Forward Switching

On switches with the NVIDIA Spectrum ASICs, Cumulus Linux supports cut-through mode for VXLANs but does not support store and forward switching.

MTU Size for Virtual Network Interfaces

The maximum transmission unit (MTU) size for a virtual network interface should be 50 bytes smaller than the MTU for the physical interfaces on the switch. For more information on setting MTU, read Layer 1 and Switch Port Attributes.
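
The 50-byte difference matches the usual VXLAN encapsulation overhead. The following sketch adds up the header sizes, assuming an IPv4 underlay and an untagged Ethernet header; with an IPv6 underlay the IP header is 40 bytes and the total is larger. The constant and function names are hypothetical.

```python
# Header sizes in bytes (assumptions: IPv4 underlay, no VLAN tag).
ETHERNET_HEADER = 14  # Ethernet header of the encapsulated frame
IPV4_HEADER = 20      # outer IPv4 header without options
UDP_HEADER = 8        # outer UDP header
VXLAN_HEADER = 8      # VXLAN header

VXLAN_OVERHEAD = ETHERNET_HEADER + IPV4_HEADER + UDP_HEADER + VXLAN_HEADER


def virtual_mtu(physical_mtu):
    """Return the largest MTU for a virtual network interface."""
    if physical_mtu <= VXLAN_OVERHEAD:
        raise ValueError("physical MTU is too small for VXLAN encapsulation")
    return physical_mtu - VXLAN_OVERHEAD


print(VXLAN_OVERHEAD, virtual_mtu(9216))
```

For example, with a physical interface MTU of 9216, the virtual network interface MTU in this model is 9166.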

Layer 3 and Layer 2 VNI ID must be Different

A layer 3 VNI and a layer 2 VNI must have different IDs. If the VNI IDs are the same, Cumulus Linux does not create the layer 2 VNI.

Change a Layer 2 VNI to Layer 3

To change a layer 2 VNI to a layer 3 VNI, make sure you follow this sequence:

  1. Remove the bridge VLAN to VNI mapping.
  2. Remove the current layer 3 VNI from the VRF.
  3. Configure the VNI as a layer 3 VNI.

cumulus@switch:~$ nv unset bridge domain br_default vlan 10 vni 10
cumulus@switch:~$ nv unset vrf RED evpn vni 4001
cumulus@switch:~$ nv set vrf RED evpn vni 10
cumulus@switch:~$ nv config apply

Ethernet Virtual Private Network - EVPN

VXLAN enables layer 2 segments to extend over an IP core (the underlay). The initial definition of VXLAN (RFC 7348) does not include any control plane and relies on a flood-and-learn approach for MAC address learning.

EVPN is a standards-based control plane for VXLAN defined in RFC 7432 and RFC 8365 that allows for building and deploying VXLANs at scale. It relies on multi-protocol BGP (MP-BGP) to exchange information and uses BGP-MPLS IP VPNs (RFC 4364). It enables not only bridging between end systems in the same layer 2 segment but also routing between different segments (subnets). There is also inherent support for multi-tenancy.

Cumulus Linux installs the routing control plane (including EVPN) as part of the FRR package. For more information about FRR, refer to FRRouting.

Key Features

Cumulus Linux fully supports EVPN as the control plane for VXLAN, including for both intra-subnet bridging and inter-subnet routing, and provides these key features:

Cumulus Linux supports the EVPN address family with both eBGP and iBGP peering. If you configure underlay routing with eBGP, you can use the same eBGP session to carry EVPN routes. In a typical 2-tier Clos network where the leafs are VTEPs, if you use eBGP sessions between the leafs and spines for underlay routing, the same sessions exchange EVPN routes. The spine switches act as route forwarders and do not install any forwarding state as they are not VTEPs. When the switch exchanges EVPN routes over iBGP peering, you can use OSPF as the IGP or resolve next hops using iBGP.

Cumulus Linux disables data plane MAC learning by default on VXLAN interfaces. Do not enable MAC learning on VXLAN interfaces: EVPN installs remote MACs.

Basic Configuration

The following sections provide the basic configuration needed to use EVPN as the control plane for VXLAN in a BGP-EVPN-based layer 2 extension deployment. For layer 3 multi-tenancy configuration, see Inter-subnet Routing. For additional EVPN configuration, see EVPN Enhancements.

Basic EVPN Configuration Commands

Basic configuration in a BGP-EVPN-based layer 2 extension deployment requires you to:

For a non-VTEP device that is only participating in EVPN route exchange, such as a spine switch where the network deployment uses hop-by-hop eBGP or the switch is acting as an iBGP route reflector, configuring VXLAN interfaces is not required.

  1. Configure VXLAN Interfaces. The following example creates a single VXLAN device (vxlan0), maps VLAN 10 to vni10 and VLAN 20 to vni20, adds the VXLAN device to the default bridge br_default, and sets the VXLAN local tunnel IP address to 10.10.10.1.

    cumulus@leaf01:~$ nv set bridge domain br_default vlan 10 vni 10
    cumulus@leaf01:~$ nv set bridge domain br_default vlan 20 vni 20
    cumulus@leaf01:~$ nv set nve vxlan source address 10.10.10.1
    cumulus@leaf01:~$ nv config apply
    

    To create a traditional VXLAN device, where each VNI represents a separate device instead of a set of VNIs in a single device model, see VXLAN-Devices.

  2. Configure BGP. The following example commands assign an ASN and router ID to leaf01 and spine01, specify the interfaces between the two BGP peers, and the prefixes to originate. For complete information on how to configure BGP, see Border Gateway Protocol - BGP.

cumulus@leaf01:~$ nv set router bgp autonomous-system 65101
cumulus@leaf01:~$ nv set router bgp router-id 10.10.10.1
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 remote-as external
cumulus@leaf01:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.1/32
cumulus@leaf01:~$ nv config apply
cumulus@spine01:~$ nv set router bgp autonomous-system 65199
cumulus@spine01:~$ nv set router bgp router-id 10.10.10.101
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp1 remote-as external
cumulus@spine01:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.101/32
cumulus@spine01:~$ nv config apply
  3. Activate the EVPN address family and enable EVPN between BGP neighbors. The following example commands enable EVPN between leaf01 and spine01:
cumulus@leaf01:~$ nv set evpn enable on
cumulus@leaf01:~$ nv set vrf default router bgp address-family l2vpn-evpn enable on
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 address-family l2vpn-evpn enable on
cumulus@leaf01:~$ nv config apply

With NVUE, you do not need to set the advertise-all-vni option to enable the BGP control plane for all VNIs configured on the switch. FRR is aware of any local VNIs and MACs, and hosts (neighbors) associated with those VNIs.

NVUE creates the following configuration snippet in the /etc/nvue.d/startup.yaml file:

cumulus@leaf01:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    interface:
      lo:
        ip:
          address:
            10.10.10.1/32: {}
        type: loopback
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
    nve:
      vxlan:
        enable: on
        source:
          address: 10.10.10.1
    router:
      bgp:
        autonomous-system: 65101
        enable: on
        router-id: 10.10.10.1
    vrf:
      default:
        router:
          bgp:
            peer:
              swp51:
                remote-as: external
                type: unnumbered
            enable: on
            address-family:
              ipv4-unicast:
                network:
                  10.10.10.1/32: {}
                enable: on
cumulus@spine01:~$ nv set vrf default router bgp address-family l2vpn-evpn enable on
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp1 address-family l2vpn-evpn enable on
cumulus@spine01:~$ nv config apply

NVUE creates the following configuration snippet in the /etc/nvue.d/startup.yaml file:

cumulus@spine01:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    interface:
      lo:
        ip:
          address:
            10.10.10.101/32: {}
        type: loopback
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
    nve:
      vxlan:
        enable: on
        source:
          address: 10.10.10.101
    router:
      bgp:
        autonomous-system: 65199
        enable: on
        router-id: 10.10.10.101
    vrf:
      default:
        router:
          bgp:
            peer:
              swp1:
                remote-as: external
                type: unnumbered
            enable: on
            address-family:
              ipv4-unicast:
                network:
                  10.10.10.101/32: {}
                enable: on
  1. Edit the /etc/network/interfaces file to create a single VXLAN device, attach it to a bridge, map the VLANs to the VNIs, and set the VXLAN local tunnel IP address. The example below creates a single VXLAN interface (vxlan0), maps VLAN 10 to vni10 and VLAN 20 to vni20, and sets the VXLAN local tunnel IP address to 10.10.10.1.

    cumulus@leaf01:~$ sudo nano /etc/network/interfaces
    ...
    auto lo
    iface lo inet loopback
            address 10.10.10.1/32
            vxlan-local-tunnelip 10.10.10.1
    ...
    auto vxlan0
    iface vxlan0
     bridge-vlan-vni-map 10=10 20=20
     bridge-vids 10 20
     bridge-learning off
    
    auto br_default
    iface br_default
            bridge-ports swp1 swp2 vxlan0
            bridge-vlan-aware yes
            bridge-vids 10 20
            bridge-pvid 1
    

To create a traditional VXLAN device, where each VNI represents a separate device instead of a set of VNIs in a single device model, see VXLAN-Devices.

  2. Configure BGP with vtysh commands. The following example commands assign an ASN and router ID to leaf01 and spine01, specify the interfaces between the two BGP peers, and the prefixes to originate. For complete information on how to configure BGP, see Border Gateway Protocol - BGP.
cumulus@leaf01:~$ sudo vtysh
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# bgp router-id 10.10.10.1
leaf01(config-router)# neighbor swp51 interface remote-as external
leaf01(config-router)# address-family ipv4
leaf01(config-router-af)# network 10.10.10.1/32
leaf01(config-router-af)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$
cumulus@spine01:~$ sudo vtysh
spine01# configure terminal
spine01(config)# router bgp 65199
spine01(config-router)# bgp router-id 10.10.10.101
spine01(config-router)# neighbor swp1 interface remote-as external
spine01(config-router)# address-family ipv4
spine01(config-router-af)# network 10.10.10.101/32
spine01(config-router-af)# end
spine01# write memory
spine01# exit
cumulus@spine01:~$
  3. Activate the EVPN address family and enable EVPN between BGP neighbors. The following example commands enable EVPN between leaf01 and spine01. The commands automatically provision all locally configured VNIs so the BGP control plane can advertise them.
cumulus@leaf01:~$ sudo vtysh

leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# bgp router-id 10.10.10.1
leaf01(config-router)# neighbor swp51 interface remote-as external
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# neighbor swp51 activate
leaf01(config-router-af)# advertise-all-vni
leaf01(config-router-af)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

The vtysh commands create the following configuration snippet in the /etc/frr/frr.conf file.

...
router bgp 65101
  bgp router-id 10.10.10.1
  neighbor swp51 interface remote-as external
  address-family l2vpn evpn
    neighbor swp51 activate
    advertise-all-vni
...
cumulus@spine01:~$ sudo vtysh

spine01# configure terminal
spine01(config)# router bgp 65199
spine01(config-router)# bgp router-id 10.10.10.101
spine01(config-router)# neighbor swp1 interface remote-as external
spine01(config-router)# address-family l2vpn evpn
spine01(config-router-af)# neighbor swp1 activate
spine01(config-router-af)# end
spine01# write memory
spine01# exit
cumulus@spine01:~$

The vtysh commands create the following configuration snippet in the /etc/frr/frr.conf file:

...
router bgp 65199
  bgp router-id 10.10.10.101
  neighbor swp1 interface remote-as external
  address-family l2vpn evpn
    neighbor swp1 activate
...

You only need to set the advertise-all-vni option on leafs that are VTEPs. The switch accepts EVPN routes from a BGP peer even without this option. The routes are in the global EVPN routing table but Cumulus Linux only imports them into the per-VNI routing table and installs the appropriate entries in the kernel when the VNI corresponding to the received route is locally known.
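
The import behavior described above can be pictured as a filter over the global EVPN table. The following sketch is a toy model only, not how FRR stores or processes routes; the EvpnRoute class, its fields, and the import_routes function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvpnRoute:
    vni: int    # VNI carried in the received route
    mac: str    # advertised MAC address
    vtep: str   # IP address of the advertising VTEP


def import_routes(global_table, local_vnis):
    """Group received routes by VNI, keeping only locally known VNIs."""
    per_vni = {vni: [] for vni in local_vnis}
    for route in global_table:
        if route.vni in per_vni:
            per_vni[route.vni].append(route)
        # Routes for unknown VNIs stay in the global table only.
    return per_vni


received = [
    EvpnRoute(10, "44:38:39:00:00:32", "10.10.10.2"),
    EvpnRoute(20, "44:38:39:00:00:3a", "10.10.10.3"),
    EvpnRoute(30, "44:38:39:00:00:44", "10.10.10.4"),
]
tables = import_routes(received, local_vnis={10, 20})
print({vni: len(routes) for vni, routes in tables.items()})
```

In this model, the route for VNI 30 remains in the global table but is not imported because VNI 30 is not configured locally.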

Show EVPN Configuration

To show the current EVPN configuration on the switch, run the nv show evpn command:

cumulus@leaf01:~$ nv show evpn  
                       operational   applied      
---------------------  ------------  -------------
enable                               on           
route-advertise                                   
  nexthop-setting                    system-ip-mac
  svi-ip               off           off          
  default-gateway      off           off          
dad                                               
  enable               on            on           
  mac-move-threshold   5             5            
  move-window          180           180          
  duplicate-action     warning-only  warning-only 
[vni]                                             
multihoming                                       
  enable                             off          
  mac-holdtime         1080                       
  neighbor-holdtime    1080                       
  startup-delay        180                        
  startup-delay-timer  --:--:--                   
  uplink-count         0                          
  uplink-active        0                          
l2vni-count            3                          
l3vni-count            2

You can also show the EVPN configuration in JSON format with the nv show evpn -o json command or in YAML format with the nv show evpn -o yaml command.

EVPN and VXLAN Active-active Mode

For EVPN in VXLAN active-active mode, both switches in the MLAG pair establish EVPN peering with other EVPN speakers (for example, with spine switches if using hop-by-hop eBGP) and advertise their locally known VNIs and MACs. When MLAG is active, both switches announce this information with the shared anycast IP address.

For active-active configuration, make sure that:

MLAG synchronizes information between the two switches in the MLAG pair; EVPN does not synchronize.

For type-5 routes in an EVPN symmetric configuration with VXLAN active-active mode, Cumulus Linux uses Primary IP Address Advertisement. For information on configuring Primary IP Address Advertisement, see Advertise Primary IP Address.

For information about active-active VTEPs and anycast IP behavior, and for failure scenarios, see VXLAN Active-active Mode.

Considerations

EVPN Enhancements

This section describes EVPN enhancements.

Define RDs and RTs

The RD and RTs for the layer 2 VNI are different from the tenant VRF RD and RTs. To define the tenant VRF RD and RTs, see Configure the RD and RTs for the Tenant VRF.

When FRR learns about a local VNI and there is no explicit configuration for that VNI in FRR, the switch derives the RD and import and export RTs for this VNI automatically. The RD uses RouterId:VNI-Index and the import and export RTs use AS:VNI. For routes that come from a layer 2 VNI (type-2 and type-3), the RD uses the VXLAN local tunnel IP address (vxlan-local-tunnelip) from the layer 2 VNI interface instead of the RouterId (vxlan-local-tunnelip:VNI). EVPN route exchange uses the RD and RTs.

The RD disambiguates EVPN routes in different VNIs (they can have the same MAC and IP address) while the RTs describe the VPN membership for the route. The VNI-Index for the RD is a unique number that the switch generates. It only has local significance; on remote switches, its only role is for route disambiguation. The switch uses this number instead of the VNI value itself because this number has to be less than or equal to 65535. In the RT, the AS is always a 2-byte value to allow room for a large VNI. If the router has a 4-byte AS, it only uses the lower 2 bytes. This ensures a unique RT for different VNIs while having the same RT for the same VNI across routers in the same AS.

For eBGP EVPN peering, the peers are in a different AS so using an automatic RT of AS:VNI does not work for route import. Therefore, Cumulus Linux treats the import RT as *:VNI to determine which received routes apply to a particular VNI. This only applies when the switch auto-derives the import RT.
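The derivation rules above can be sketched in a few lines of Python. This is an illustration only, not the FRR implementation: the function names auto_rd, auto_rt, and import_matches are hypothetical, and the VNI-Index is passed in as an argument because the switch generates that number internally.

```python
def auto_rd(router_id: str, vni_index: int) -> str:
    """Build an RD in RouterId:VNI-Index form; the index must fit in 2 bytes."""
    if not 0 <= vni_index <= 0xFFFF:
        raise ValueError("VNI-Index must be between 0 and 65535")
    return f"{router_id}:{vni_index}"


def auto_rt(asn: int, vni: int) -> str:
    """Build an RT in AS:VNI form, keeping only the lower 2 bytes of the AS."""
    return f"{asn & 0xFFFF}:{vni}"


def import_matches(route_rt: str, local_vni: int) -> bool:
    """Model the auto-derived import RT as *:VNI by ignoring the AS part."""
    _, _, vni = route_rt.rpartition(":")
    return vni.isdigit() and int(vni) == local_vni


print(auto_rd("10.10.10.1", 6))        # 10.10.10.1:6
print(auto_rt(65101, 10))              # 65101:10
print(auto_rt(4200000001, 10))         # 59905:10 (lower 2 bytes of a 4-byte AS)
print(import_matches("65102:10", 10))  # True, even though the AS differs
```

The last call shows why eBGP peers in different autonomous systems can still import each other's routes when the import RT is auto-derived.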

If you do not want to derive RDs and RTs (layer 2 RTs) automatically, you can define them manually. The following example commands are per VNI.

cumulus@leaf01:~$ nv set evpn vni 10 rd 10.10.10.1:20
cumulus@leaf01:~$ nv set evpn vni 10 route-target export 65101:10
cumulus@leaf01:~$ nv set evpn vni 10 route-target import 65102:10
cumulus@leaf01:~$ nv config apply
cumulus@leaf03:~$ nv set evpn vni 10 rd 10.10.10.3:20
cumulus@leaf03:~$ nv set evpn vni 10 route-target export 65102:10
cumulus@leaf03:~$ nv set evpn vni 10 route-target import 65101:10
cumulus@leaf03:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# vni 10
leaf01(config-router-af-vni)# rd 10.10.10.1:20
leaf01(config-router-af-vni)# route-target export 65101:10
leaf01(config-router-af-vni)# route-target import 65102:10
leaf01(config-router-af-vni)# exit
leaf01(config-router-af)# advertise-all-vni
leaf01(config-router-af)# end
leaf01# write memory
leaf01# exit

The vtysh commands create the following configuration snippet in the /etc/frr/frr.conf file.

...
address-family l2vpn evpn
  advertise-all-vni
  vni 10
   rd 10.10.10.1:20
   route-target export 65101:10
   route-target import 65102:10
...
cumulus@leaf03:~$ sudo vtysh
...
leaf03# configure terminal
leaf03(config)# router bgp 65102
leaf03(config-router)# address-family l2vpn evpn
leaf03(config-router-af)# vni 10
leaf03(config-router-af-vni)# rd 10.10.10.3:20
leaf03(config-router-af-vni)# route-target export 65102:10
leaf03(config-router-af-vni)# route-target import 65101:10
leaf03(config-router-af-vni)# exit
leaf03(config-router-af)# advertise-all-vni
leaf03(config-router-af)# end
leaf03# write memory
leaf03# exit

The vtysh commands create the following configuration snippet in the /etc/frr/frr.conf file:

...
address-family l2vpn evpn
  advertise-all-vni
  vni 10
   rd 10.10.10.3:20
   route-target export 65102:10
   route-target import 65101:10

You can configure multiple RT values. In addition, you can configure both the import and export route targets with a single command by using route-target both:

cumulus@leaf01:~$ nv set evpn vni 10 route-target import 65102:10
cumulus@leaf01:~$ nv set evpn vni 10 route-target import 65102:20
cumulus@leaf01:~$ nv set evpn vni 20 route-target both 65101:10
cumulus@leaf01:~$ nv config apply
cumulus@leaf03:~$ nv set evpn vni 10 route-target import 65101:10
cumulus@leaf03:~$ nv set evpn vni 10 route-target import 65101:20
cumulus@leaf03:~$ nv set evpn vni 20 route-target both 65102:10
cumulus@leaf03:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# vni 10
leaf01(config-router-af-vni)# route-target import 65102:10
leaf01(config-router-af-vni)# route-target import 65102:20
leaf01(config-router-af-vni)# exit
leaf01(config-router-af)# vni 20
leaf01(config-router-af-vni)# route-target both 65101:10
leaf01(config-router-af-vni)# end
leaf01# write memory
leaf01# exit

The vtysh commands create the following configuration snippet in the /etc/frr/frr.conf file:

...
address-family l2vpn evpn
  vni 10
    route-target import 65102:10
    route-target import 65102:20
  vni 20
    route-target import 65101:10
    route-target export 65101:10
...
cumulus@leaf03:~$ sudo vtysh
...
leaf03# configure terminal
leaf03(config)# router bgp 65102
leaf03(config-router)# address-family l2vpn evpn
leaf03(config-router-af)# vni 10
leaf03(config-router-af-vni)# route-target import 65101:10
leaf03(config-router-af-vni)# route-target import 65101:20
leaf03(config-router-af-vni)# exit
leaf03(config-router-af)# vni 20
leaf03(config-router-af-vni)# route-target both 65102:10
leaf03(config-router-af-vni)# end
leaf03# write memory
leaf03# exit

The vtysh commands create the following configuration snippet in the /etc/frr/frr.conf file:

...
address-family l2vpn evpn
  vni 10
    route-target import 65101:10
    route-target import 65101:20
  vni 20
    route-target import 65102:10
    route-target export 65102:10
...

Enable EVPN in an iBGP Environment with an OSPF Underlay

You can use EVPN with an OSPF or static route underlay. This is a more complex configuration than using eBGP. In this case, iBGP advertises EVPN routes directly between VTEPs and the spines are unaware of EVPN or BGP.

The leafs peer with each other in a full mesh within the EVPN address family without using route reflectors. The leafs generally peer to their loopback addresses, which OSPF advertises. The receiving VTEP imports routes into a specific VNI with a matching route target community.

cumulus@leaf01:~$ nv set router bgp autonomous-system 65101
cumulus@leaf01:~$ nv set router bgp router-id 10.10.10.1
cumulus@leaf01:~$ nv set vrf default router bgp neighbor 10.10.10.2 remote-as internal
cumulus@leaf01:~$ nv set vrf default router bgp neighbor 10.10.10.3 remote-as internal
cumulus@leaf01:~$ nv set vrf default router bgp neighbor 10.10.10.4 remote-as internal
cumulus@leaf01:~$ nv set evpn enable on
cumulus@leaf01:~$ nv set vrf default router bgp address-family l2vpn-evpn enable on
cumulus@leaf01:~$ nv set vrf default router bgp neighbor 10.10.10.2 address-family l2vpn-evpn enable on
cumulus@leaf01:~$ nv set vrf default router bgp neighbor 10.10.10.3 address-family l2vpn-evpn enable on
cumulus@leaf01:~$ nv set vrf default router bgp neighbor 10.10.10.4 address-family l2vpn-evpn enable on
cumulus@leaf01:~$ nv set vrf default router ospf router-id 10.10.10.1
cumulus@leaf01:~$ nv set vrf default router ospf area 0 network 10.10.10.1/32
cumulus@leaf01:~$ nv set interface lo router ospf passive on
cumulus@leaf01:~$ nv set interface swp49 router ospf area 0.0.0.0
cumulus@leaf01:~$ nv set interface swp50 router ospf area 0.0.0.0
cumulus@leaf01:~$ nv set interface swp51 router ospf area 0.0.0.0
cumulus@leaf01:~$ nv set interface swp52 router ospf area 0.0.0.0
cumulus@leaf01:~$ nv set interface swp49 router ospf network-type point-to-point
cumulus@leaf01:~$ nv set interface swp50 router ospf network-type point-to-point
cumulus@leaf01:~$ nv set interface swp51 router ospf network-type point-to-point
cumulus@leaf01:~$ nv set interface swp52 router ospf network-type point-to-point
cumulus@leaf01:~$ nv config apply

NVUE creates the following configuration snippet in the /etc/nvue.d/startup.yaml file:

cumulus@leaf01:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    interface:
      lo:
        ip:
          address:
            10.10.10.1/32: {}
        router:
          ospf:
            area: 0
            enable: on
            network-type: point-to-point    
        type: loopback
      swp49:
        router:
          ospf:
            area: 0.0.0.0
            enable: on
        type: swp
      swp50:
        router:
          ospf:
            area: 0.0.0.0
            enable: on
            network-type: point-to-point
        type: swp
      swp51:
        router:
          ospf:
            area: 0.0.0.0
            enable: on
            network-type: point-to-point
        type: swp
      swp52:
        router:
          ospf:
            area: 0.0.0.0
            enable: on
            network-type: point-to-point
        type: swp
    bridge:
      domain:
        br_default:
          multicast:
            snooping:
              enable: off
              querier:
                enable: on
    router:
      bgp:
        autonomous-system: 65101
        enable: on
        router-id: 10.10.10.1
      ospf:
        router-id: 10.10.10.1
        enable: on
    vrf:
      default:
        router:
          bgp:
            peer:
              10.10.10.2:
                remote-as: internal
                type: numbered
                address-family:
                  l2vpn-evpn:
                    enable: on
              10.10.10.3:
                remote-as: internal
                type: numbered
                address-family:
                  l2vpn-evpn:
                    enable: on
              10.10.10.4:
                remote-as: internal
                type: numbered
                address-family:
                  l2vpn-evpn:
                    enable: on
            enable: on
            address-family:
              l2vpn-evpn:
                enable: on
    evpn:
      enable: on
    nve:
      vxlan:
        enable: on
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# neighbor 10.10.10.2 remote-as internal
leaf01(config-router)# neighbor 10.10.10.3 remote-as internal
leaf01(config-router)# neighbor 10.10.10.4 remote-as internal
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# neighbor 10.10.10.2 activate
leaf01(config-router-af)# neighbor 10.10.10.3 activate
leaf01(config-router-af)# neighbor 10.10.10.4 activate
leaf01(config-router-af)# advertise-all-vni
leaf01(config-router-af)# exit
leaf01(config-router)# exit
leaf01(config)# router ospf
leaf01(config-router)# router-id 10.10.10.1
leaf01(config-router)# passive-interface lo
leaf01(config-router)# exit
leaf01(config)# interface lo
leaf01(config-if)# ip ospf area 0.0.0.0
leaf01(config-if)# exit
leaf01(config)# interface swp49
leaf01(config-if)# ip ospf area 0.0.0.0
leaf01(config-if)# ip ospf network point-to-point
leaf01(config-if)# exit
leaf01(config)# interface swp50
leaf01(config-if)# ip ospf area 0.0.0.0
leaf01(config-if)# ip ospf network point-to-point
leaf01(config-if)# exit
leaf01(config)# interface swp51
leaf01(config-if)# ip ospf area 0.0.0.0
leaf01(config-if)# ip ospf network point-to-point
leaf01(config-if)# exit
leaf01(config)# interface swp52
leaf01(config-if)# ip ospf area 0.0.0.0
leaf01(config-if)# ip ospf network point-to-point
leaf01(config-if)# end
leaf01# write memory
leaf01# exit

The vtysh commands create the following configuration snippet in the /etc/frr/frr.conf file.

...
interface lo
  ip ospf area 0.0.0.0
!
interface swp49
  ip ospf area 0.0.0.0
  ip ospf network point-to-point
!
interface swp50
  ip ospf area 0.0.0.0
  ip ospf network point-to-point
!
interface swp51
  ip ospf area 0.0.0.0
  ip ospf network point-to-point
!
interface swp52
  ip ospf area 0.0.0.0
  ip ospf network point-to-point
!
router bgp 65101
  neighbor 10.10.10.2 remote-as internal
  neighbor 10.10.10.3 remote-as internal
  neighbor 10.10.10.4 remote-as internal
  !
  address-family l2vpn evpn
  neighbor 10.10.10.2 activate
  neighbor 10.10.10.3 activate
  neighbor 10.10.10.4 activate
  advertise-all-vni
  exit-address-family
  !
router ospf
  ospf router-id 10.10.10.1
  passive-interface lo
...
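Because the leafs peer in a full mesh without route reflectors, every leaf needs a neighbor statement for every other leaf. The following Python sketch generates the per-neighbor NVUE commands used in this section from a list of loopback addresses. The helper names ibgp_mesh_commands and mesh_session_count are hypothetical; review the generated commands before you run them on a switch.

```python
def ibgp_mesh_commands(local: str, loopbacks: list) -> list:
    """Return the NVUE neighbor commands that one leaf needs for a full mesh."""
    base = "nv set vrf default router bgp neighbor"
    commands = []
    for peer in loopbacks:
        if peer == local:
            continue  # a leaf does not peer with itself
        commands.append(f"{base} {peer} remote-as internal")
        commands.append(f"{base} {peer} address-family l2vpn-evpn enable on")
    return commands


def mesh_session_count(leaf_count: int) -> int:
    """Number of iBGP sessions in a full mesh of leaf_count leafs."""
    return leaf_count * (leaf_count - 1) // 2


leafs = ["10.10.10.1", "10.10.10.2", "10.10.10.3", "10.10.10.4"]
for command in ibgp_mesh_commands("10.10.10.1", leafs):
    print(command)
print(mesh_session_count(len(leafs)))  # 6
```

The session count grows quadratically with the number of leafs, which is one reason this design is more complex to operate than eBGP.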

ARP and ND Suppression

ARP suppression with EVPN allows a VTEP to suppress ARP flooding over VXLAN tunnels as much as possible. A local proxy handles ARP requests from locally attached hosts for remote hosts. ARP suppression is for IPv4; ND suppression is for IPv6.

Cumulus Linux enables ARP and ND suppression by default on all VNIs to reduce ARP and ND packet flooding over VXLAN tunnels; however, you must configure layer 3 interfaces (SVIs) for ARP and ND suppression to work with EVPN.

ND Suppression and IPv6 Address Reuse

If you disable ND suppression and reuse IPv6 addresses, IPv6 duplicate address detection fails and the address remains tentative and unusable. The following example shows an IPv6 duplicate address detection failure on vlan10:

cumulus@switch:~$ ip address show vlan10 | grep dad
inet6 2001:db8::1/32 scope global dadfailed tentative

To prevent IPv6 duplicate address detection from failing, you can either disable IPv6 duplicate address detection globally or on the interface address.

To disable IPv6 duplicate address detection globally, add the following lines in the /etc/sysctl.conf file, then reboot the switch.

cumulus@switch:~$ sudo nano /etc/sysctl.conf
...
net.ipv6.conf.default.accept_dad = 0

To disable IPv6 duplicate address detection on an interface address, create an NVUE snippet, then patch and apply the configuration. The following snippet disables duplicate address detection on vlan10 with the IP address 2001:db8::1/32:

cumulus@switch:~$ sudo nano DisableDadVlan10.yaml
- set:
    system:
      config:
        snippet:
          ifupdown2_eni:
            vlan10: |
              post-up ip address add 2001:db8::1/32 dev vlan10 nodad
cumulus@switch:~$ nv config patch DisableDadVlan10.yaml
cumulus@switch:~$ nv config apply

You do not need to reboot the switch after you create and apply the snippet.
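To check for the failure described in this section across many interfaces, you can scan ip address show output for the dadfailed flag. The following Python sketch shows one way to do that; the function name dad_failed_addresses is hypothetical and the sample text is the output line from the example above.

```python
def dad_failed_addresses(ip_output: str) -> list:
    """Return the inet6 addresses that carry the dadfailed flag."""
    failed = []
    for line in ip_output.splitlines():
        fields = line.split()
        # An address line looks like: inet6 <address>/<prefix> scope ... <flags>
        if len(fields) >= 2 and fields[0] == "inet6" and "dadfailed" in fields:
            failed.append(fields[1])
    return failed


sample = "inet6 2001:db8::1/32 scope global dadfailed tentative"
print(dad_failed_addresses(sample))  # ['2001:db8::1/32']
```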

ARP ND Suppression and Centralized Routing

In a centralized routing deployment, you must configure layer 3 interfaces even if you configure the switch only for layer 2 (you are not using VXLAN routing). To avoid installing unnecessary layer 3 information, you can turn off IP forwarding.

The following example commands turn off IPv4 and IPv6 forwarding on VLAN 10 and VLAN 20.

cumulus@leaf01:~$ nv set interface vlan10 ip ipv4 forward off
cumulus@leaf01:~$ nv set interface vlan10 ip ipv6 forward off
cumulus@leaf01:~$ nv set interface vlan20 ip ipv4 forward off
cumulus@leaf01:~$ nv set interface vlan20 ip ipv6 forward off
cumulus@leaf01:~$ nv config apply

Edit the /etc/network/interfaces file.

cumulus@leaf01:~$ sudo nano /etc/network/interfaces
...
auto vlan10
iface vlan10
    ip6-forward off
    ip-forward off
    vlan-id 10
    vlan-raw-device bridge

auto vlan20
iface vlan20
    ip6-forward off
    ip-forward off
    vlan-id 20
    vlan-raw-device bridge

auto vni10
iface vni10
    bridge-access 10
    vxlan-id 10
    bridge-learning off

auto vni20
iface vni20
    bridge-access 20
    vxlan-id 20
    bridge-learning off
...

For a bridge in traditional mode, you must edit the bridge configuration in the /etc/network/interfaces file using a text editor:

cumulus@leaf01:~$ sudo nano /etc/network/interfaces
...
auto bridge1
iface bridge1
    bridge-ports swp1.10 swp2.10 vni10
    ip6-forward off
    ip-forward off
...

Disable ARP and ND Suppression

NVIDIA recommends that you keep ARP and ND suppression on to reduce ARP and ND packet flooding over VXLAN tunnels. However, if you do need to disable ARP and ND suppression, run the NVUE nv set nve vxlan arp-nd-suppress off command or set bridge-arp-nd-suppress off in the /etc/network/interfaces file:

cumulus@leaf01:~$ nv set nve vxlan arp-nd-suppress off
cumulus@leaf01:~$ nv config apply

Edit the /etc/network/interfaces file to set bridge-arp-nd-suppress off on the VXLAN device, then run the ifreload -a command.

cumulus@leaf01:~$ sudo nano /etc/network/interfaces
...

auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20 30=30 4036=4002 4024=4001
    bridge-learning off
    bridge-arp-nd-suppress off

...
cumulus@leaf01:~$ sudo ifreload -a

The neighbor manager service relies on ARP and ND suppression to snoop on packets and update forwarding entries based on neighbor changes. If you disable suppression, you must enable the neighbor manager snooper manually:

  1. Create the systemd override configuration file /etc/systemd/system/neighmgrd.service with the following content:

    [Service]
    ExecStart=/usr/bin/neighmgrd --snoop-all-bridges
    
  2. Reload the systemd unit configuration with the sudo systemctl daemon-reload command.

  3. Restart the neighmgrd service with the sudo systemctl restart neighmgrd.service command.

Configure Static MAC Addresses

To pin a MAC address to a particular VTEP, configure the address on that VTEP as a static bridge FDB entry. EVPN picks up these MAC addresses and advertises them to peers as remote static MACs. You configure static bridge FDB entries for MAC addresses under the bridge configuration.

Cumulus Linux does not provide NVUE commands for this configuration.

Edit the /etc/network/interfaces file. For example:

cumulus@leaf01:~$ sudo nano /etc/network/interfaces
...
auto bridge
iface bridge
    bridge-ports bond1 vni10
    bridge-vids 10
    bridge-vlan-aware yes
    post-up bridge fdb add 26:76:e6:93:32:78 dev bond1 vlan 10 master static sticky
...

For a bridge in traditional mode, you must edit the bridge configuration in the /etc/network/interfaces file using a text editor:

cumulus@leaf01:~$ sudo nano /etc/network/interfaces
...
auto br10
iface br10
    bridge-ports swp1.10 vni10
    post-up bridge fdb add 26:76:e6:93:32:78 dev swp1.10 master static sticky
...

Configure a Site ID for MLAG

When you use EVPN with MLAG, EVPN might install local MAC addresses or neighbor entries as remote entries. To prevent EVPN from taking ownership of local MAC addresses or neighbor entries from MLAG, you can associate all local layer 2 VNIs with a unique site ID, which represents an MLAG pair.

When you configure a site ID, Cumulus Linux:

The site ID is in the format <IPv4 address>:<2-byte Value>, where the IPv4 address is the anycast IP address (a virtual IP address for VXLAN data-path termination) and the 2-byte value is an integer between 0 and 65535. For example: 10.0.1.12:10

cumulus@leaf01:~$ nv set evpn mac-vrf-soo 10.0.1.12:10
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# mac-vrf soo 10.0.1.12:10
leaf01(config-router-af)# end
leaf01# write memory
leaf01# exit

NVIDIA recommends you do not configure a site ID on a standalone or multihoming VTEP.
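Before you apply a site ID, you can check that it follows the <IPv4 address>:<2-byte Value> format. The following Python sketch validates the string with the standard ipaddress module; the function name parse_site_id is hypothetical, and the check covers only the format, not whether the address is the anycast IP address of the MLAG pair.

```python
import ipaddress


def parse_site_id(site_id: str) -> tuple:
    """Split a site ID into (IPv4 address, value) or raise ValueError."""
    address, separator, value = site_id.rpartition(":")
    if not separator:
        raise ValueError("expected <IPv4 address>:<2-byte value>")
    # IPv4Address raises AddressValueError (a ValueError) for a bad address.
    ip = ipaddress.IPv4Address(address)
    if not value.isdigit() or not 0 <= int(value) <= 65535:
        raise ValueError("value must be an integer between 0 and 65535")
    return str(ip), int(value)


print(parse_site_id("10.0.1.12:10"))  # ('10.0.1.12', 10)
```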

Filter EVPN Routes

It is common to subdivide the data center into multiple pods with full host mobility within a pod but only do prefix-based routing across pods. You can achieve this by only exchanging EVPN type-5 routes across pods.

The following example commands configure EVPN to advertise type-5 routes:

cumulus@leaf01:~$ nv set router policy route-map map1 rule 10 match type ipv4
cumulus@leaf01:~$ nv set router policy route-map map1 rule 10 match evpn-route-type ip-prefix
cumulus@leaf01:~$ nv set router policy route-map map1 rule 10 action permit
cumulus@leaf01:~$ nv set vrf default router bgp address-family ipv4-unicast route-export to-evpn route-map map1
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# route-map map1 permit 1
leaf01(config-route-map)# match evpn route-type prefix
leaf01(config-route-map)# end
leaf01# write memory
leaf01# exit

You must apply the route map for the configuration to take effect. See Route Maps for more information.

In many situations, it is also desirable to only exchange EVPN routes carrying a particular VXLAN ID. For example, if data centers or pods within a data center only share certain tenants, you can use a route map to control the EVPN routes exchanged based on the VNI.

The following example configures a route map that only advertises EVPN routes from VNI 1000:

cumulus@switch:~$ nv set router policy route-map map1 rule 10 match evpn-vni 1000
cumulus@switch:~$ nv set router policy route-map map1 rule 10 action permit
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# route-map map1 permit 1
switch(config-route-map)# match evpn vni 1000
switch(config-route-map)# end
switch# write memory
switch# exit

You can only match type-2 and type-5 routes based on VNI.

BGP Neighbor Prefix Limits for EVPN

Cumulus Linux provides commands to control the number of inbound prefixes allowed from a BGP neighbor for EVPN.

To configure inbound prefix limits, set:

Before you configure a prefix limit, determine how many routes the remote BGP neighbor typically sends and set a threshold that is slightly higher than the number of BGP prefixes you expect to receive during normal operations.
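The sizing guidance above comes down to simple arithmetic. The following Python sketch derives a maximum from the expected prefix count and converts a warning threshold percentage into a prefix count. The 10 percent headroom default and the function names are assumptions for illustration, not NVIDIA recommendations, and the sketch does not model how the switch rounds a fractional threshold.

```python
def suggest_maximum(expected_prefixes: int, headroom_percent: int = 10) -> int:
    """Expected prefix count plus headroom, rounded up to a whole prefix."""
    return (expected_prefixes * (100 + headroom_percent) + 99) // 100


def warning_prefix_count(maximum: int, threshold_percent: int) -> float:
    """Prefix count that corresponds to the warning threshold percentage."""
    return maximum * threshold_percent / 100


def prefix_limit_commands(neighbor: str, maximum: int, threshold_percent: int) -> list:
    """Build the NVUE commands that set the maximum and the warning threshold."""
    base = (f"nv set vrf default router bgp neighbor {neighbor} "
            "address-family l2vpn-evpn prefix-limits inbound")
    return [f"{base} maximum {maximum}",
            f"{base} warning-threshold {threshold_percent}"]


print(suggest_maximum(1000))        # 1100
print(warning_prefix_count(3, 50))  # 1.5
for command in prefix_limit_commands("swp51", 3, 50):
    print(command)
```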

The following example sets the maximum inbound prefix limit from the neighbor swp51 to 3, generates a warning syslog message and brings down the BGP session when the number of prefixes received reaches 50 percent of the maximum limit. After 60 seconds, the BGP session with the peer reestablishes.

cumulus@switch:~$ nv set vrf default router bgp neighbor swp51 address-family l2vpn-evpn prefix-limits inbound maximum 3
cumulus@switch:~$ nv set vrf default router bgp neighbor swp51 address-family l2vpn-evpn prefix-limits inbound warning-threshold 50
cumulus@switch:~$ nv set vrf default router bgp neighbor swp51 address-family l2vpn-evpn prefix-limits inbound reestablish-wait 60
cumulus@switch:~$ nv config apply

The following example sets the maximum inbound prefix limit from peer swp51 to 3 and generates a warning syslog message only (without bringing down the BGP session) when the number of prefixes received reaches 50 percent of the maximum limit.

cumulus@switch:~$ nv set vrf default router bgp neighbor swp51 address-family l2vpn-evpn prefix-limits inbound maximum 3
cumulus@switch:~$ nv set vrf default router bgp neighbor swp51 address-family l2vpn-evpn prefix-limits inbound warning-threshold 50
cumulus@switch:~$ nv set vrf default router bgp neighbor swp51 address-family l2vpn-evpn prefix-limits inbound warning-only on
cumulus@switch:~$ nv config apply

The following example sets the maximum inbound prefix limit from the neighbor swp51 to 3, generates a warning syslog message and brings down the BGP session when the number of prefixes received reaches 50 percent of the maximum limit. After 1 minute, the BGP session with the peer reestablishes.

cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# router bgp 65101
switch(config-router)# address-family l2vpn evpn 
switch(config-router-af)# neighbor swp51 maximum-prefix 3 50 restart 1
switch(config-router-af)# end
switch# write memory
switch# exit

You can use the force option (neighbor swp51 maximum-prefix 3 50 restart 1 force) to force check all received routes, not only accepted routes.

The following example sets the maximum inbound prefix limit from peer swp51 to 3, and generates a warning syslog message only (without bringing down the BGP session) when the number of prefixes received reaches 50 percent of the maximum limit.

cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# router bgp 65101
switch(config-router)# address-family l2vpn evpn 
switch(config-router-af)# neighbor swp51 maximum-prefix 3 50 warning-only 
switch(config-router-af)# end
switch# write memory
switch# exit

You can use the force option (neighbor swp51 maximum-prefix 3 50 warning-only force) to force check all received routes, not only accepted routes.

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

cumulus@switch:~$ sudo cat /etc/frr/frr.conf
...
address-family l2vpn evpn
advertise-all-vni
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp51 maximum-prefix 5 warning-only
...

In a typical EVPN deployment, you reuse SVI IP addresses on VTEPs across multiple racks. However, if you use unique SVI IP addresses across multiple racks and you want the local SVI IP address to be reachable via remote VTEPs, you can enable the advertise SVI IP and MAC address option. This option advertises the SVI IP and MAC address as a type-2 route and eliminates the need for any flooding over VXLAN to reach the IP address from a remote VTEP or rack.

To advertise all SVI IP and MAC addresses on the switch, run these commands:

cumulus@leaf01:~$ nv set evpn route-advertise svi-ip on
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# advertise-svi-ip
leaf01(config-router-af)# end
leaf01# write memory
leaf01# exit

To advertise a specific SVI IP/MAC address, run these commands:

cumulus@leaf01:~$ nv set evpn vni 10 route-advertise svi-ip on
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# vni 10
leaf01(config-router-af-vni)# advertise-svi-ip
leaf01(config-router-af-vni)# end
leaf01# write memory
leaf01# exit

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

cumulus@leaf01:~$ sudo cat /etc/frr/frr.conf
...
address-family l2vpn evpn
  vni 10
    advertise-svi-ip
exit-address-family
...

Disable BUM Flooding

By default, the VTEP floods all broadcast, unknown unicast, and multicast (BUM) packets (such as ARP, NS, or DHCP) that it receives to all interfaces (except for the incoming interface) and to all VXLAN tunnel interfaces in the same broadcast domain. When the switch receives such packets on a VXLAN tunnel interface, it floods the packets to all interfaces in the packet’s broadcast domain.

You can disable BUM flooding over VXLAN tunnels so that EVPN does not advertise type-3 routes for each local VNI and stops taking action on received type-3 routes.

Disabling BUM flooding is useful in a deployment with a controller or orchestrator, where the switch is pre-provisioned and there is no need to flood any ARP, NS, or DHCP packets.

For information on EVPN BUM flooding with PIM, refer to EVPN BUM Traffic with PIM-SM.

To disable BUM flooding:

cumulus@leaf01:~$ nv set nve vxlan flooding enable off
cumulus@leaf01:~$ nv config apply

To reenable BUM flooding, run the following commands. Enabling BUM flooding requires head-end replication.

cumulus@leaf01:~$ nv set nve vxlan flooding enable on
cumulus@leaf01:~$ nv set nve vxlan flooding head-end-replication evpn
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# flooding disable
leaf01(config-router-af)# end
leaf01# write memory
leaf01# exit

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

...
router bgp 65101
 !
 address-family l2vpn evpn
  flooding disable
 exit-address-family
...

To reenable BUM flooding, run the vtysh flooding head-end-replication command.

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# flooding head-end-replication
leaf01(config-router-af)# end
leaf01# write memory
leaf01# exit

To show that BUM flooding is off, run the vtysh show bgp l2vpn evpn vni command. For example:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# show bgp l2vpn evpn vni
Advertise Gateway Macip: Disabled
Advertise SVI Macip: Disabled
Advertise All VNI flag: Enabled
BUM flooding: Disabled
Number of L2 VNIs: 3
Number of L3 VNIs: 2
Flags: * - Kernel
  VNI        Type RD                    Import RT                 Export RT                Tenant VRF
* 20         L2   10.10.10.1:3          65101:20                  65101:20                 RED
* 30         L2   10.10.10.1:4          65101:30                  65101:30                 BLUE
* 10         L2   10.10.10.1:6          65101:10                  65101:10                 RED
* 4002       L3   10.1.30.2:2           65101:4002                65101:4002               BLUE
* 4001       L3   10.1.20.2:5           65101:4001                65101:4001               RED

Run the vtysh show bgp l2vpn evpn route type multicast command to make sure there are no EVPN type-3 routes that originate locally.
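If you verify this setting on many switches, you can read the BUM flooding field from the command output with a script. The following Python sketch extracts the field from text in the format shown above; the function name bum_flooding_state is hypothetical and the sample is an excerpt of the example output.

```python
from typing import Optional


def bum_flooding_state(show_output: str) -> Optional[str]:
    """Return the value of the 'BUM flooding' field, or None if it is absent."""
    for line in show_output.splitlines():
        key, separator, value = line.partition(":")
        if separator and key.strip() == "BUM flooding":
            return value.strip()
    return None


sample = """Advertise Gateway Macip: Disabled
Advertise SVI Macip: Disabled
Advertise All VNI flag: Enabled
BUM flooding: Disabled
Number of L2 VNIs: 3
Number of L3 VNIs: 2"""

print(bum_flooding_state(sample) == "Disabled")  # True
```

One way to capture the text for this function is to run sudo vtysh -c "show bgp l2vpn evpn vni" and pass the output to the function.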

Extended Mobility

Cumulus Linux supports scenarios where the IP to MAC binding for a host or virtual machine changes across the move. In addition to the simple mobility scenario where a host or virtual machine with a binding of IP1, MAC1 moves from one rack to another, Cumulus Linux supports additional scenarios where a host or virtual machine with a binding of IP1, MAC1 moves and takes on a new binding of IP2, MAC1 or IP1, MAC2. The EVPN protocol mechanism to handle extended mobility continues to use the MAC mobility extended community and is the same as the standard mobility procedures. Extended mobility defines how to compute the sequence number in this attribute when binding changes occur.

Extended mobility supports not only virtual machine moves, but also scenarios where one virtual machine shuts down and you provision another on a different rack that uses the IP address or the MAC address of the previous virtual machine. For example, in an EVPN deployment with OpenStack, where virtual machines for a tenant are provisioned and shut down dynamically, a new virtual machine can use the same IP address as an earlier virtual machine but with a different MAC address.

To reuse the same distributed gateway on VLANs fabric wide, you can set the fabric-wide MAC address; see Change the VRR MAC address.

Cumulus Linux enables extended mobility by default.

To examine the sequence numbers for a host or virtual machine MAC address and IP address, run the vtysh show evpn mac vni <vni> mac <address> command. For example:

cumulus@switch:~$ sudo vtysh
...
switch# show evpn mac vni 10100 mac 00:02:00:00:00:42
MAC: 00:02:00:00:00:42
  Remote VTEP: 10.0.0.2
  Local Seq: 0 Remote Seq: 3
  Neighbors:
    10.1.1.74 Active

switch# show evpn arp vni 10100 ip 10.1.1.74
IP: 10.1.1.74
  Type: local
  State: active
  MAC: 44:39:39:ff:00:24
  Local Seq: 2 Remote Seq: 3

Duplicate Address Detection

Cumulus Linux can detect duplicate MAC and IPv4 or IPv6 addresses on hosts or virtual machines in a VXLAN-EVPN configuration. The Cumulus Linux switch (VTEP) considers a host MAC or IP address to be duplicate if the address moves across the network more than a certain number of times within a certain number of seconds (five moves within 180 seconds by default). In addition to legitimate host or VM mobility scenarios, address movement can occur when you configure IP addresses incorrectly on a host or when packet looping occurs in the network due to faulty configuration or behavior.

Cumulus Linux enables duplicate address detection by default, which triggers when:

By default, when the switch detects a duplicate address, it flags the address as a duplicate and generates an error in syslog so that you can troubleshoot the reason and address the fault, then clear the duplicate address flag. The switch does not take any functional action on the address.

When Does Duplicate Address Detection Trigger?

The VTEP that sees an address move from remote to local begins the detection process by starting a timer. Each VTEP runs duplicate address detection independently. Detection always starts with the first mobility event from remote to local. If the address is initially remote, the detection count can start with the first move for the address. If the address is initially local, the detection count starts only with the second or higher move for the address. If an address is undergoing a mobility event between remote VTEPs, duplicate detection does not start.

The following illustration shows VTEP-A, VTEP-B, and VTEP-C in an EVPN configuration. Duplicate address detection triggers on VTEP-A when there is a duplicate MAC address for two hosts attached to VTEP-A and VTEP-B. However, duplicate detection does not trigger on VTEP-A when mobility events occur between two remote VTEPs (VTEP-B and VTEP-C).
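The counting behavior described above can be sketched in Python. This is a minimal illustration, not the switch implementation: the DadTracker class and its method names are hypothetical, it tracks a single address on a single VTEP, it counts only remote-to-local moves, and it assumes the address is flagged when the count reaches the threshold inside the window.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DadTracker:
    """Tracks remote-to-local moves for one MAC or IP address on one VTEP."""
    max_moves: int = 5         # default threshold from the text
    window_s: float = 180.0    # default detection window, in seconds
    window_start: Optional[float] = None
    count: int = 0
    duplicate: bool = False

    def record_move(self, now: float) -> bool:
        """Record a remote-to-local move at time `now`; return the duplicate flag."""
        if self.duplicate:
            return True  # stays flagged until someone clears it
        # Start the timer on the first move, or restart it after it expires.
        if self.window_start is None or now - self.window_start > self.window_s:
            self.window_start = now
            self.count = 0
        self.count += 1
        # Assumption: reaching max_moves inside the window flags the address.
        if self.count >= self.max_moves:
            self.duplicate = True
        return self.duplicate

    def clear(self) -> None:
        """Reset the state, similar in spirit to clearing the duplicate flag."""
        self.window_start = None
        self.count = 0
        self.duplicate = False


fast = DadTracker()
fast_flags = [fast.record_move(t) for t in (0, 10, 20, 30, 40)]
slow = DadTracker()
slow_flags = [slow.record_move(t) for t in (0, 100, 200, 300, 400)]
print(fast_flags, slow_flags)
```

Five moves in 40 seconds flag the address on the fifth move, while five moves spread over 400 seconds never accumulate inside one 180-second window.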

Configure Duplicate Address Detection

You can configure the threshold for MAC and IP address moves. The maximum number of moves allowed can be between 2 and 1000 and the detection time interval can be between 2 and 1800 seconds.

The following example command sets the maximum number of address moves allowed to 10 and the duplicate address detection time interval to 1200 seconds.

cumulus@switch:~$ nv set evpn dad mac-move-threshold 10
cumulus@switch:~$ nv set evpn dad move-window 1200
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# router bgp 65101
switch(config-router)# address-family l2vpn evpn
switch(config-router-af)# dup-addr-detection max-moves 10 time 1200
switch(config-router-af)# end
switch# write memory
switch# exit
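If you generate these commands from automation, it can help to check the values against the documented ranges first. The following Python sketch is illustrative only; the dad_threshold_commands helper is a hypothetical name, and it only builds the NVUE command strings shown above without running them.

```python
from typing import List

MOVES_RANGE = (2, 1000)    # allowed number of moves, from the text above
WINDOW_RANGE = (2, 1800)   # allowed detection window in seconds, from the text above


def dad_threshold_commands(max_moves: int, window_s: int) -> List[str]:
    """Validate the thresholds and return the NVUE commands as strings."""
    if not MOVES_RANGE[0] <= max_moves <= MOVES_RANGE[1]:
        raise ValueError(f"max_moves must be between {MOVES_RANGE[0]} and {MOVES_RANGE[1]}")
    if not WINDOW_RANGE[0] <= window_s <= WINDOW_RANGE[1]:
        raise ValueError(f"window_s must be between {WINDOW_RANGE[0]} and {WINDOW_RANGE[1]}")
    return [
        f"nv set evpn dad mac-move-threshold {max_moves}",
        f"nv set evpn dad move-window {window_s}",
        "nv config apply",
    ]


for cmd in dad_threshold_commands(10, 1200):
    print(cmd)
```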

To disable duplicate address detection, see Disable Duplicate Address Detection below.

Example syslog Messages

The following example shows the syslog message that Cumulus Linux generates when it detects a MAC address as a duplicate during a local update:

2018/11/06 18:55:29.463327 ZEBRA: [EC 4043309149] VNI 1001: MAC 00:01:02:03:04:11 detected as duplicate during local update, last VTEP 172.16.0.16

The following example shows the syslog message that Cumulus Linux generates when it detects an IP address as a duplicate during a remote update:

2018/11/09 22:47:15.071381 ZEBRA: [EC 4043309151] VNI 1002: MAC aa:22:aa:aa:aa:aa IP 10.0.0.9 detected as duplicate during remote update, from VTEP 172.16.0.16
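To pick these events out of a log stream, you can match on the message text. The following Python sketch is an assumption-laden example: the regular expression is derived only from the two sample messages above, so other message variants might not match, and the parse_duplicate_event function name is hypothetical.

```python
import re
from typing import Dict, Optional

# Pattern built from the two sample messages above; not an exhaustive grammar.
DUP_RE = re.compile(
    r"VNI (?P<vni>\d+): MAC (?P<mac>[0-9A-Fa-f:]{17})"
    r"(?: IP (?P<ip>\S+))? detected as duplicate during "
    r"(?P<update>local|remote) update, (?:last|from) VTEP (?P<vtep>\S+)"
)


def parse_duplicate_event(line: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the VNI, MAC, optional IP, update type, and VTEP, or None if no match."""
    match = DUP_RE.search(line)
    return match.groupdict() if match else None


mac_line = (
    "2018/11/06 18:55:29.463327 ZEBRA: [EC 4043309149] VNI 1001: MAC 00:01:02:03:04:11 "
    "detected as duplicate during local update, last VTEP 172.16.0.16"
)
ip_line = (
    "2018/11/09 22:47:15.071381 ZEBRA: [EC 4043309151] VNI 1002: MAC aa:22:aa:aa:aa:aa "
    "IP 10.0.0.9 detected as duplicate during remote update, from VTEP 172.16.0.16"
)
mac_event = parse_duplicate_event(mac_line)
ip_event = parse_duplicate_event(ip_line)
print(mac_event)
print(ip_event)
```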

Freeze a Detected Duplicate Address

Cumulus Linux provides a freeze option that takes action on a detected duplicate address. You can freeze the address permanently (until you intervene) or for a defined amount of time, after which it clears automatically.

When you enable the freeze option and the switch detects a duplicate address:

To recover from a freeze, shut down the faulty host or VM or fix any other misconfiguration in the network. If the address freezes permanently, run the clear command on the VTEP where the address is duplicate. If the address freezes for a defined period of time, it clears automatically after the timer expires (you can clear the duplicate address before the timer expires with the clear command).

If you run the clear command or the timer expires before you address the fault, duplicate address detection can continue to occur.

After you clear a frozen address, if it is present behind a remote VTEP, the kernel and hardware forwarding tables update. If this VTEP learns the address locally, the address advertises to remote VTEPs. All VTEPs get the correct address as soon as the host communicates. The switch only learns silent hosts after the faulty entries age out, or you intervene and clear the faulty MAC and ARP table entries.

Configure the Freeze Option

You can enable Cumulus Linux to freeze detected duplicate addresses. The duration can be any number of seconds between 30 and 3600.

The following example command freezes duplicate addresses for a period of 1000 seconds, after which they clear automatically:

cumulus@switch:~$ nv set evpn dad duplicate-action freeze duration 1000
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# router bgp 65101
switch(config-router)# address-family l2vpn evpn
switch(config-router-af)# dup-addr-detection freeze 1000
switch(config-router-af)# end
switch# write memory
switch# exit

Set the freeze timer to be three times the duplicate address detection window. For example, if the duplicate address detection window is 180 seconds, set the freeze timer to 540 seconds.
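The three-times guideline interacts with the allowed freeze range of 30 to 3600 seconds, so some detection windows have no compliant freeze duration. The following Python sketch makes that arithmetic explicit; the freeze_command helper is a hypothetical name, and rejecting out-of-range results (instead of clamping them) is a design choice of this example, not documented switch behavior.

```python
FREEZE_RANGE = (30, 3600)  # allowed freeze duration in seconds, from the text above


def freeze_command(move_window_s: int, multiplier: int = 3) -> str:
    """Build the NVUE freeze command with a duration of multiplier x detection window."""
    duration = multiplier * move_window_s
    if not FREEZE_RANGE[0] <= duration <= FREEZE_RANGE[1]:
        raise ValueError(
            f"{duration} seconds is outside the allowed range "
            f"{FREEZE_RANGE[0]}-{FREEZE_RANGE[1]}; choose the duration manually"
        )
    return f"nv set evpn dad duplicate-action freeze duration {duration}"


print(freeze_command(180))
```

With the default 180-second window, the result is a 540-second freeze. A 1800-second window would need 5400 seconds, which exceeds the maximum, so the helper raises an error.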

The following example command freezes duplicate addresses permanently (until you run the clear command):

cumulus@switch:~$ nv set evpn dad duplicate-action freeze duration permanent
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# router bgp 65101
switch(config-router)# address-family l2vpn evpn
switch(config-router-af)# dup-addr-detection freeze permanent
switch(config-router-af)# end
switch# write memory
switch# exit

Clear Duplicate Addresses

You can clear duplicate addresses for all VNIs, or clear a duplicate MAC or IP address (and unfreeze a frozen address).

To clear duplicate addresses for all VNIs:

cumulus@switch:~$ nv action clear evpn vni
Action succeeded

To clear duplicate IP address 10.0.0.9 for VNI 10:

cumulus@switch:~$ nv action clear evpn vni 10 host 10.0.0.9
Action succeeded

To clear duplicate MAC address 00:e0:ec:20:12:62 for VNI 10:

cumulus@switch:~$ nv action clear evpn vni 10 mac 00:e0:ec:20:12:62
Action succeeded

To clear duplicate addresses for all VNIs:

cumulus@switch:~$ sudo vtysh
...
switch# clear evpn dup-addr vni all
switch# exit

To clear duplicate IP address 10.0.0.9 for VNI 10:

cumulus@switch:~$ sudo vtysh
...
switch# clear evpn dup-addr vni 10 ip 10.0.0.9
switch# exit

To clear duplicate MAC address 00:e0:ec:20:12:62 for VNI 10:

cumulus@switch:~$ sudo vtysh
...
switch# clear evpn dup-addr vni 10 mac 00:e0:ec:20:12:62
switch# exit

Disable Duplicate Address Detection

Duplicate address detection is on by default. The switch generates a syslog error when it detects a duplicate address. To disable duplicate address detection, run the following command.

cumulus@switch:~$ nv set evpn dad enable off
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# router bgp 65101
switch(config-router)# address-family l2vpn evpn
switch(config-router-af)# no dup-addr-detection
switch(config-router-af)# end
switch# write memory
switch# exit

When you disable duplicate address detection, Cumulus Linux clears the configuration and all existing duplicate addresses.

Show Detected Duplicate Address Information

During the duplicate address detection process, you can see the start time and current detection count with the vtysh show evpn mac vni <vni_id> mac <mac_addr> command. The following command example shows that detection for MAC address 00:01:02:03:04:11 in VNI 1001 started on Tuesday, Nov 6 at 18:55:05 and that the detection count is one.

cumulus@switch:~$ sudo vtysh
...
switch# show evpn mac vni 1001 mac 00:01:02:03:04:11
MAC: 00:01:02:03:04:11
  Intf: hostbond3(15) VLAN: 1001
  Local Seq: 1 Remote Seq: 0
  Duplicate detection started at Tue Nov  6 18:55:05 2018, detection count 1
  Neighbors:
    10.0.1.26 Active

After the switch flags the MAC address as a duplicate, the vtysh show evpn mac vni <vni_id> mac <mac_addr> command shows:

MAC: 00:01:02:03:04:11
  Remote VTEP: 172.16.0.16
  Local Seq: 13 Remote Seq: 14
  Duplicate, detected at Tue Nov  6 18:55:29 2018
  Neighbors:
    10.0.1.26 Active

To display information for a duplicate IP address, run the vtysh show evpn arp-cache vni <vni_id> ip <ip_addr> command. The following command example shows information for IP address 10.0.0.9 for VNI 1001.

cumulus@switch:~$ sudo vtysh
...
switch# show evpn arp-cache vni 1001 ip 10.0.0.9
IP: 10.0.0.9
  Type: remote
  State: inactive
  MAC: 00:01:02:03:04:11
  Remote VTEP: 10.0.0.34
  Local Seq: 0 Remote Seq: 14
  Duplicate, detected at Tue Nov  6 18:55:29 2018

To show a list of MAC addresses detected as duplicate for a specific VNI or for all VNIs, run the vtysh show evpn mac vni <vni-id|all> duplicate command. The following example command shows a list of duplicate MAC addresses for VNI 1001:

cumulus@switch:~$ sudo vtysh
...
switch# show evpn mac vni 1001 duplicate
Number of MACs (local and remote) known for this VNI: 16
MAC               Type   Intf/Remote VTEP      VLAN
aa:bb:cc:dd:ee:ff local  hostbond3             1001

To show a list of IP addresses detected as duplicate for a specific VNI or for all VNIs, run the vtysh show evpn arp-cache vni <vni-id|all> duplicate command. The following example command shows a list of duplicate IP addresses for VNI 1001:

cumulus@switch:~$ sudo vtysh
...
switch# show evpn arp-cache vni 1001 duplicate
Number of ARPs (local and remote) known for this VNI: 20
IP                Type   State    MAC                Remote VTEP
10.0.0.8          local  active   aa:11:aa:aa:aa:aa
10.0.0.9          local  active   aa:11:aa:aa:aa:aa
10.10.0.12        remote active   aa:22:aa:aa:aa:aa  172.16.0.16

To show configured duplicate address detection parameters, run the vtysh show evpn command:

cumulus@switch:~$ sudo vtysh
...
switch# show evpn
L2 VNIs: 4
L3 VNIs: 2
Advertise gateway mac-ip: No
Duplicate address detection: Enable
  Detection max-moves 7, time 300
  Detection freeze permanent

To show the configured action to take when the switch detects a duplicate address, run the nv show evpn dad duplicate-action command:

cumulus@switch:~$ nv show evpn dad duplicate-action
operational   applied     
------------  ------------
warning-only  warning-only

Show Current EVPN Configuration

To show the current EVPN configuration on the switch, run the nv show evpn command:

cumulus@leaf01:~$ nv show evpn  
                       operational   applied      
---------------------  ------------  -------------
enable                               on           
route-advertise                                   
  nexthop-setting                    system-ip-mac
  svi-ip               off           off          
  default-gateway      off           off          
dad                                               
  enable               on            on           
  mac-move-threshold   5             5            
  move-window          180           180          
  duplicate-action     warning-only  warning-only 
[vni]                                             
multihoming                                       
  enable                             off          
  mac-holdtime         1080                       
  neighbor-holdtime    1080                       
  startup-delay        180                        
  startup-delay-timer  --:--:--                   
  uplink-count         0                          
  uplink-active        0                          
l2vni-count            3                          
l3vni-count            2

You can also show the EVPN configuration in JSON format with the nv show evpn -o json command or in YAML format with the nv show evpn -o yaml command.

cumulus@leaf01:~$ nv show evpn -o json
{
  "dad": {
    "duplicate-action": {
      "warning-only": {}
    },
    "enable": "on",
    "mac-move-threshold": 5,
    "move-window": 180
  },
  "l2vni-count": 3,
  "l3vni-count": 2,
  "multihoming": {
    "mac-holdtime": 1080,
    "neighbor-holdtime": 1080,
    "startup-delay": 180,
    "startup-delay-timer": "--:--:--",
    "uplink-active": 0,
    "uplink-count": 0
  },
  "route-advertise": {
    "default-gateway": "off",
    "svi-ip": "off"
  }
}
cumulus@leaf01:~$ nv show evpn -o yaml
dad:
  duplicate-action:
    warning-only: {}
  enable: on
  mac-move-threshold: 5
  move-window: 180
l2vni-count: 3
l3vni-count: 2
multihoming:
  mac-holdtime: 1080
  neighbor-holdtime: 1080
  startup-delay: 180
  startup-delay-timer: --:--:--
  uplink-active: 0
  uplink-count: 0
route-advertise:
  default-gateway: off
  svi-ip: off
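The JSON output is convenient for scripted checks. The following Python sketch reads a trimmed copy of the sample output above and summarizes the duplicate address detection settings. The summarize_dad function is a hypothetical helper, and it assumes the configured action is always the single key under duplicate-action, as in the sample.

```python
import json

# Trimmed copy of the sample output from nv show evpn -o json shown above.
SAMPLE = """
{
  "dad": {
    "duplicate-action": {"warning-only": {}},
    "enable": "on",
    "mac-move-threshold": 5,
    "move-window": 180
  },
  "l2vni-count": 3,
  "l3vni-count": 2
}
"""


def summarize_dad(evpn: dict) -> dict:
    """Extract the duplicate address detection settings from the parsed output."""
    dad = evpn.get("dad", {})
    # Assumption: the action name is the only key under "duplicate-action".
    action = next(iter(dad.get("duplicate-action", {})), None)
    return {
        "enabled": dad.get("enable") == "on",
        "max_moves": dad.get("mac-move-threshold"),
        "window_s": dad.get("move-window"),
        "action": action,
    }


summary = summarize_dad(json.loads(SAMPLE))
print(summary)
```

On a switch, you would typically feed the function the real command output instead of the embedded sample.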

Inter-subnet Routing

EVPN includes multiple models for routing between different subnets (VLANs), also known as inter-VLAN routing. The model you choose depends on whether every VTEP acts as a layer 3 gateway and performs routing or only specific VTEPs perform routing, and on whether routing occurs only at the ingress of the VXLAN tunnel or at both the ingress and the egress of the VXLAN tunnel.

Cumulus Linux supports these models:

Centralized Routing

In centralized routing, you configure a specific VTEP to act as the default gateway for all the hosts in a particular subnet throughout the EVPN fabric. It is common to provision a pair of VTEPs in active-active mode as the default gateway using an anycast IP and MAC address for each subnet. You need to configure all subnets on such a gateway VTEP. When a host in one subnet wants to communicate with a host in another subnet, it addresses the packets to the gateway VTEP. The ingress VTEP (to which the source host attaches) bridges the packets to the gateway VTEP over the corresponding VXLAN tunnel. The gateway VTEP routes to the destination host and, post-routing, the packet bridges to the egress VTEP (to which the destination host attaches). The egress VTEP then bridges the packet on to the destination host.

To enable centralized routing, you must configure the gateway VTEPs to advertise their IP and MAC address.

cumulus@leaf01:~$ nv set evpn route-advertise default-gateway on
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh

leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# advertise-default-gw
leaf01(config-router-af)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

The vtysh commands create the following configuration snippet in the /etc/frr/frr.conf file.

...
router bgp 65101
...
  address-family l2vpn evpn
    advertise-default-gw
  exit-address-family
...

You can deploy centralized routing at the VNI level, where you configure the advertise-default-gw command per VNI; you use centralized routing for certain VNIs and distributed symmetric routing (described below) for other VNIs. NVIDIA does not recommend this type of configuration.

When you use centralized routing, even if the source host and destination host attach to the same VTEP, the packets travel to the gateway VTEP, the switch routes the packets, then the packets come back.

Asymmetric Routing

In distributed asymmetric routing, each VTEP acts as a layer 3 gateway, performing routing for its attached hosts. Only the ingress VTEP performs routing; the egress VTEP only performs bridging. You can achieve asymmetric routing with only host routing, which does not involve any interconnecting VNIs. However, you must provision each VTEP with all VLANs and corresponding VNIs (the subnets between which communication takes place) even if there are no locally attached hosts for a particular VLAN.

The only additional configuration required to implement asymmetric routing beyond the standard configuration for a layer 2 VTEP described above is to ensure that each VTEP has all VLANs (and corresponding VNIs) provisioned and that you configure the SVI for each VLAN with an anycast IP or MAC address.

Symmetric Routing

In distributed symmetric routing, each VTEP acts as a layer 3 gateway, performing routing for its attached hosts; however, both the ingress VTEP and egress VTEP route the packets (similar to the traditional routing behavior of routing to a next hop router). In the VXLAN encapsulated packet, the inner destination MAC address is the router MAC address of the egress VTEP to indicate that the egress VTEP is the next hop and also needs to perform routing. All routing happens in the context of a tenant (VRF). For a packet that the ingress VTEP receives from a locally attached host, the SVI interface corresponding to the VLAN determines the VRF. For a packet that the egress VTEP receives over the VXLAN tunnel, the VNI in the packet has to specify the VRF. For symmetric routing, this is a VNI corresponding to the tenant and is different from either the source VNI or the destination VNI. This VNI is a layer 3 VNI or interconnecting VNI. The regular VNI, which maps a VLAN, is the layer 2 VNI.
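The three models in this section differ in which VTEPs route and in which VNI the ingress VTEP places in the VXLAN header. The following Python sketch summarizes that comparison. The forwarding_plan function and its return structure are hypothetical, and the VNI choices for the centralized model (source layer 2 VNI toward the gateway) and the asymmetric model (destination layer 2 VNI) are stated as assumptions based on common EVPN behavior, not taken verbatim from this guide.

```python
from typing import Dict


def forwarding_plan(model: str, src_l2vni: int, dst_l2vni: int, l3vni: int) -> Dict[str, object]:
    """Return which VTEPs route and the VNI used on the tunnel that leaves the ingress VTEP."""
    if model == "centralized":
        # Ingress bridges to the gateway VTEP; only the gateway routes.
        return {"routing_vteps": ["gateway"], "ingress_tunnel_vni": src_l2vni}
    if model == "asymmetric":
        # Ingress routes into the destination VLAN; egress only bridges.
        return {"routing_vteps": ["ingress"], "ingress_tunnel_vni": dst_l2vni}
    if model == "symmetric":
        # Both ends route; the tunnel carries the tenant (layer 3) VNI.
        return {"routing_vteps": ["ingress", "egress"], "ingress_tunnel_vni": l3vni}
    raise ValueError(f"unknown model: {model}")


# Example values: layer 2 VNIs 10 and 20 with layer 3 VNI 4001 (illustrative).
for name in ("centralized", "asymmetric", "symmetric"):
    print(name, forwarding_plan(name, src_l2vni=10, dst_l2vni=20, l3vni=4001))
```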

In an EVPN symmetric routing configuration, when the switch announces a type-2 (MAC,IP) route, in addition to containing two VNIs (the layer 2 VNI and the layer 3 VNI), the route also contains separate RTs for layer 2 and layer 3. The layer 3 RT associates the route with the tenant VRF. By default, this is auto-derived using the layer 3 VNI instead of the layer 2 VNI; however, you can also configure it.
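As a rough illustration of the auto-derived values, the following Python sketch builds the pair of RTs carried by a type-2 route. It assumes the <asn>:<vni> pattern visible in the show command output elsewhere in this section (for example, 65101:4001 for VNI 4001) and a 2-byte ASN; the function names are hypothetical and 4-byte ASN handling is out of scope.

```python
from typing import List


def auto_rt(asn: int, vni: int) -> str:
    """Build an RT in <asn>:<vni> form (assumption: 2-byte ASN only)."""
    if not 0 < asn <= 65535:
        raise ValueError("this sketch only handles 2-byte ASNs")
    return f"{asn}:{vni}"


def type2_route_targets(asn: int, l2vni: int, l3vni: int) -> List[str]:
    """Return the layer 2 RT followed by the layer 3 RT for a type-2 route."""
    return [auto_rt(asn, l2vni), auto_rt(asn, l3vni)]


rts = type2_route_targets(65101, 10, 4001)
print(rts)
```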

For EVPN symmetric routing, you need to perform the following additional configuration. Optional configuration includes configuring RD and RTs for the tenant VRF and advertising the locally-attached subnets.

Specify the VRF to layer 3 VNI mapping. This configuration is for the BGP control plane.

cumulus@leaf01:~$ nv set vrf RED evpn vni 4001
cumulus@leaf01:~$ nv config apply

When you run the nv set vrf RED evpn vni 4001 command, NVUE:

  • Creates a layer 3 bridge called br_l3vni if a layer 3 VNI was not previously configured
  • Creates a layer 3 VNI called vni4001 in VRF RED
  • Assigns vni4001 a VLAN automatically and creates a VLAN interface with _l3 (layer 3) at the end of the interface name (for example, vlan220_l3) in VRF RED. NVUE adds the VLAN to bridge br_l3vni
  • Adds vni4001 to the VLAN-VNI map of a single VXLAN device in bridge br_l3vni

This behavior is different in an MLAG environment. If you configure MLAG and you run the nv set vrf RED evpn vni 4001 command, NVUE:

  • Creates a layer 3 VNI called vni4001 in VRF RED
  • Assigns vni4001 a VLAN automatically out of the global reserved layer 3 VNI VLAN range and creates a VLAN interface with _l3 (layer 3) at the end of the interface name (for example, vlan220_l3) in VRF RED. NVUE adds the VLAN to bridge br_default
  • Adds vni4001 to the VLAN-VNI map of the single VXLAN device in bridge br_default

The global reserved layer 3 VNI VLAN range is different than the switch internal reserved VLAN range. You can configure it with the nv set system global reserved vlan l3-vni-vlan command.

  1. Configure a per-tenant VNI interface and associated VLAN for the VNI. Configure the VNI and VLAN in the map for the VXLAN device placed in a bridge for the layer 3 VNIs. The router MAC addresses of the VTEPs install over the VNI interface, and remote host routes for symmetric routing install over the VLAN interface:

    Edit the /etc/network/interfaces file. For example:

cumulus@leaf01:~$ sudo nano /etc/network/interfaces
...
auto vni4001
iface vni4001
  bridge-access 220
  bridge-learning off
  vxlan-id 4001

auto vlan220_l3
iface vlan220_l3
  vrf RED
  vlan-raw-device br_l3vni
  vlan-id 220

auto vxlan99
iface vxlan99
  bridge-vlan-vni-map 220=4001
  bridge-learning off

auto br_l3vni
iface br_l3vni
  bridge-ports vxlan99
  hwaddress 44:38:39:22:01:7a
  bridge-vlan-aware yes
   ...

When you are using MLAG, the VNIs and VXLAN device must belong to the same bridge as your MLAG peerlink. In environments without MLAG configured, you can configure a separate bridge for L3VNIs as displayed above.

  2. Specify the VRF to layer 3 VNI mapping. This configuration is for the BGP control plane.

    Edit the /etc/frr/frr.conf file. For example:

    cumulus@leaf01:~$ sudo nano /etc/frr/frr.conf
    ...
    vrf RED
      vni 4001
    !
    ...
    

If you need to convert a layer 2 VNI to a layer 3 VNI, refer to Change a Layer 2 VNI to Layer 3.

Configure RD and RTs for the Tenant VRF

If you do not want Cumulus Linux to derive the RD and RTs (layer 3 RTs) for the tenant VRF automatically, you can configure them manually by specifying them under the l2vpn evpn address family for that specific VRF.

You can configure the RD, the RTs to attach to host or prefix routes when exporting them into EVPN, and the RTs that select which host or prefix routes to import from EVPN into the VRF.

The tenant VRF RD and RTs are different from the RD and RTs for the layer 2 VNI. To define the RD and RTs for the layer 2 VNI, see Define RDs and RTs.

cumulus@leaf01:~$ nv set vrf RED router bgp rd 10.1.20.2:5
cumulus@leaf01:~$ nv set vrf RED router bgp route-import from-evpn route-target 65102:4001
cumulus@leaf01:~$ nv set vrf RED router bgp route-export to-evpn route-target 65101:4002
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh

leaf01# configure terminal
leaf01(config)# router bgp 65101 vrf RED
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# rd 10.1.20.2:5
leaf01(config-router-af)# route-target import 65102:4001
leaf01(config-router-af)# route-target export 65101:4002
leaf01(config-router-af)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

The vtysh commands create the following configuration snippet in the /etc/frr/frr.conf file:

...
router bgp 65101 vrf RED
  address-family l2vpn evpn
    rd 10.1.20.2:5
    route-target import 65102:4001
    route-target export 65101:4002
...

Symmetric routing presents a problem in the presence of silent hosts. If the ingress VTEP does not have the destination subnet and no VTEP advertises a host route for the destination host, the ingress VTEP cannot route the packet to its destination. You can overcome this problem by having VTEPs announce the subnet prefixes corresponding to their connected subnets in addition to announcing host routes. Cumulus Linux announces these routes as EVPN prefix (type-5) routes.

To advertise locally attached subnets:

  1. Enable advertisement of EVPN prefix (type-5) routes. Refer to Prefix-based Routing - EVPN Type-5 Routes, below.
  2. Ensure that the routes corresponding to the connected subnets are in the BGP VRF routing table by injecting them using the network command or redistributing them using the redistribute connected command.

Use this configuration only if you have silent hosts and only on one VTEP per subnet (or two for redundancy).

Prefix-based Routing

EVPN in Cumulus Linux supports prefix-based routing using EVPN type-5 (prefix) routes. Type-5 routes (or prefix routes) primarily route to destinations outside of the data center fabric.

EVPN prefix routes carry the layer 3 VNI and router MAC address and follow the symmetric routing model to route to the destination prefix.

Install EVPN Type-5 Routes

For a switch to install EVPN type-5 routes into the routing table, you must configure layer 3 VNI related information. This configuration is the same as for symmetric routing. You need to:

  1. Configure a per-tenant VXLAN interface that specifies the layer 3 VNI for the tenant. This VXLAN interface is part of the bridge; router MAC addresses of remote VTEPs install over this interface.
  2. Configure an SVI (layer 3 interface) corresponding to the per-tenant VXLAN interface. This attaches to the VRF of the tenant. The remote prefix routes install over this SVI.
  3. Specify the mapping of the VRF to layer 3 VNI. This configuration is for the BGP control plane.

Announce EVPN Type-5 Routes

The tenant VRF requires the following configuration to announce IP prefixes in the BGP RIB as EVPN type-5 routes.

cumulus@leaf01:~$ nv set vrf RED router bgp address-family ipv4-unicast route-export to-evpn enable on
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh

leaf01# configure terminal
leaf01(config)# router bgp 65101 vrf RED
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# advertise ipv4 unicast
leaf01(config-router-af)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

The vtysh commands create the following snippet in the /etc/frr/frr.conf file:

...
router bgp 65101 vrf RED
  address-family l2vpn evpn
    advertise ipv4 unicast
  exit-address-family
end
...

Control RIB Routes

By default, when announcing IP prefixes in the BGP RIB as EVPN type-5 routes, the switch selects all routes in the BGP RIB to advertise as EVPN type-5 routes. You can use a route map to allow selective route advertisement from the BGP RIB.

The following commands add a route map filter to IPv4 EVPN type-5 route advertisement:

cumulus@leaf01:~$ nv set vrf RED router bgp address-family ipv4-unicast route-export to-evpn route-map map1
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh

leaf01# configure terminal
leaf01(config)# router bgp 65101 vrf RED
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# advertise ipv4 unicast route-map map1
leaf01(config-router-af)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

Originate Default EVPN Type-5 Routes

Cumulus Linux supports originating EVPN default type-5 routes. The default type-5 route originates from a border (exit) leaf and advertises to all the other leafs within the pod. Any leaf within the pod follows the default route towards the border leaf for all external traffic (towards the Internet or a different pod).

To originate a default type-5 route in EVPN:

cumulus@leaf01:~$ nv set vrf RED router bgp address-family ipv4-unicast route-export to-evpn default-route-origination on
cumulus@leaf01:~$ nv set vrf RED router bgp address-family ipv6-unicast route-export to-evpn default-route-origination on
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh

leaf01# configure terminal
leaf01(config)# router bgp 65101 vrf RED
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# default-originate ipv4
leaf01(config-router-af)# default-originate ipv6
leaf01(config-router-af)# end
leaf01# write memory

In EVPN symmetric routing configurations with VXLAN active-active (MLAG), all EVPN routes advertise with the anycast IP address as the next hop IP address and the anycast MAC address as the router MAC address. In a failure scenario, the switch might forward traffic to a leaf switch that does not have the destination routes. To prevent dropped traffic in this failure scenario, Cumulus Linux enables the Advertise Primary IP address feature by default so that the switch handles the next hop IP address of the VTEP conditionally depending on the route type: host type-2 (MAC/IP advertisement) or type-5 (IP prefix route).

For more information about VXLAN active-active, see VXLAN Active-active Mode.

Set the Anycast MAC Address

You set the anycast MAC address on both switches in the MLAG pair.

NVUE provides two commands to set the anycast MAC address globally. You can either:

If you use Linux commands to configure the switch instead of NVUE, add the address-virtual <anycast-mac> option under every VLAN interface in the /etc/network/interfaces file. Cumulus Linux does not provide a global anycast MAC address or MAC ID option in the /etc/network/interfaces file.

To set the anycast MAC address:

cumulus@leaf01:~$ nv set system global anycast-mac 44:38:39:ff:00:ff
cumulus@leaf01:~$ nv config apply

To set the anycast MAC ID:

cumulus@leaf01:~$ nv set system global anycast-id 255
cumulus@leaf01:~$ nv config apply

Edit the /etc/network/interfaces file and add address-virtual <anycast-mac> under each VLAN interface. For example:

cumulus@leaf01:~$ sudo nano /etc/network/interfaces
...
auto vlan4001
iface vlan4001
    address-virtual 44:38:39:FF:00:AA
    vrf RED
    vlan-raw-device bridge
    vlan-id 4001
...

The anycast MAC address is different from the fabric-wide VRR MAC address, which distributes the same VRR gateway on VLAN interfaces across switches fabric-wide. The following diagram shows the relationship between the anycast MAC address or ID, which is unique for each active-active pair, and the fabric MAC address or ID, which is consistent across the entire fabric.

When configuring third party networking devices using MLAG and EVPN for interoperability, you must configure and announce a single shared router MAC value for each advertised next hop IP address.

Disable Advertise Primary IP Address

Each switch in the MLAG pair advertises type-5 routes with its own system IP, which creates an additional next hop at the remote VTEPs. In a large multi-tenancy EVPN deployment, where additional resources are a concern, you can disable this feature.

To disable Advertise Primary IP Address:

cumulus@leaf01:~$ nv set evpn route-advertise nexthop-setting shared-ip-mac
cumulus@leaf01:~$ nv config apply

To reenable Advertise Primary IP Address, run the nv set evpn route-advertise nexthop-setting system-ip-mac command.

To disable Advertise Primary IP Address with vtysh:

cumulus@leaf01:~$ sudo vtysh
leaf01# configure terminal
leaf01(config)# router bgp 65101 vrf RED
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# no advertise-pip
leaf01(config-router-af)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

To reenable Advertise Primary IP Address:

cumulus@leaf01:~$ sudo vtysh

leaf01# configure terminal
leaf01(config)# router bgp 65101 vrf RED
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# advertise-pip
leaf01(config-router-af)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

Show Advertise Primary IP Address Information

To show Advertise Primary IP Address parameters, run the vtysh show bgp l2vpn evpn vni <vni> command. For example:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# show bgp l2vpn evpn vni 4001
VNI: 4001 (known to the kernel)
  Type: L3
  Tenant VRF: RED
  RD: 10.1.20.2:5
  Originator IP: 10.0.1.1
  Advertise-gw-macip : n/a
  Advertise-svi-macip : n/a
  Advertise-pip: Yes
  System-IP: 10.10.10.1
  System-MAC: 44:38:39:FF:00:aa
  Router-MAC: 44:38:39:FF:00:aa
  Import Route Target:
    65101:4001
  Export Route Target:
    65101:4001

To show EVPN routes with Primary IP Advertisement, run the vtysh show bgp l2vpn evpn route command. For example:

cumulus@leaf01:~$ sudo vtysh
leaf01# show bgp l2vpn evpn route
...
Route Distinguisher: 10.10.10.1:3
*> [2]:[0]:[48]:[00:60:08:69:97:ef]
                    10.0.1.1                           32768 i
                    ET:8 RT:65101:10 RT:65101:4001 Rmac:44:38:39:FF:00:aa
*> [2]:[0]:[48]:[26:76:e6:93:32:78]
                    10.0.1.1                           32768 i
                    ET:8 RT:65101:10 RT:65101:4001 Rmac:44:38:39:FF:00:aa
*> [2]:[0]:[48]:[26:76:e6:93:32:78]:[32]:[10.1.10.101]
                    10.0.1.1                           32768 i
                    ET:8 RT:65101:10 RT:65101:4001 Rmac:44:38:39:FF:00:aa
...

To show the learned route from an external router injected as a type-5 route, run the vtysh show bgp vrf <vrf> ipv4 unicast command.

Downstream VNI

Downstream VNI (symmetric EVPN route leaking) enables you to assign a VNI from a downstream remote VTEP through EVPN routes instead of configuring layer 3 VNIs globally across the network.

To configure a downstream VNI, you configure tenant VRFs as usual; however, to configure the desired route leaking, you define a route target import statement, a route target export statement, or both.

Configure Route Targets

The route target import or export statement is in the format route-target import|export <asn>:<vni>; for example, route-target import 65101:6000. For route target import statements, you can use route-target import ANY:<vni> for NVUE commands or route-target import *:<vni> in the /etc/frr/frr.conf file. ANY in NVUE commands and the asterisk (*) in the /etc/frr/frr.conf file are wildcards that match any ASN.
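
To make the <asn>:<vni> format and the wildcard concrete, the following Python sketch models how an import statement could be compared against the route target that a route carries. It is an illustration only, not how FRR implements matching; parse_route_target and import_matches are hypothetical names, and the sketch assumes that a route always carries a concrete (non-wildcard) route target.

```python
def parse_route_target(rt):
    """Split '<asn>:<vni>' into (asn, vni); asn is None for ANY or *."""
    asn, sep, vni = rt.partition(":")
    if not sep or not vni.isdigit():
        raise ValueError(f"expected <asn>:<vni>, got {rt!r}")
    if asn in ("ANY", "*"):
        return None, int(vni)
    if not asn.isdigit():
        raise ValueError(f"expected a numeric ASN, ANY, or *, got {asn!r}")
    return int(asn), int(vni)


def import_matches(import_rt, route_rt):
    """Return True if a route carrying route_rt satisfies import_rt."""
    want_asn, want_vni = parse_route_target(import_rt)
    have_asn, have_vni = parse_route_target(route_rt)
    if want_vni != have_vni:
        return False
    # A wildcard ASN in the import statement accepts any ASN.
    return want_asn is None or want_asn == have_asn


print(import_matches("65101:6000", "65101:6000"))  # exact match
print(import_matches("ANY:6000", "65163:6000"))    # NVUE-style wildcard
print(import_matches("*:6000", "65163:4001"))      # VNI differs
```

The three calls print True, True, and False: the wildcard only relaxes the ASN portion, so the VNI portion must still match.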

The NVUE commands are as follows:

The following example shows a configuration with downstream VNI on leaf01 through leaf04 and border01.

Traffic Flow between VRF RED and VRF10

  1. server01 forwards traffic to leaf01.
  2. leaf01 encapsulates the packet with the VNI in its route-target import statement (6000) and tunnels the traffic over to border01.
  3. border01 uses the VNI received from leaf01 to forward the packet.
  4. The reverse traffic from border01 to server01 is encapsulated with the VNI in the route-target import statement on border01 (4001) and tunneled over to leaf01, where routing occurs in VRF RED.
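
The VNI selection in the steps above can be sketched as a small Python model. This is a simplified illustration, not the switch implementation: the TenantVrf class and encap_vni function are hypothetical, and the model assumes that each tenant VRF exports routes with a route target of <asn>:<layer 3 VNI>.

```python
from dataclasses import dataclass, field


@dataclass
class TenantVrf:
    switch: str
    name: str
    asn: int
    l3vni: int
    imports: list = field(default_factory=list)  # '<asn>:<vni>' strings

    def export_rt(self):
        # Assumption: the export route target is derived as <asn>:<l3vni>.
        return f"{self.asn}:{self.l3vni}"


def encap_vni(sender, receiver):
    """Return the VNI the sender uses toward the receiver, or None.

    The sender uses the receiver's (downstream) VNI only if it imports the
    route target that the receiver exports.
    """
    if receiver.export_rt() in sender.imports:
        return receiver.l3vni
    return None


leaf01_red = TenantVrf("leaf01", "RED", 65101, 4001, ["65163:6000"])
border01_vrf10 = TenantVrf(
    "border01", "VRF10", 65163, 6000, ["65101:4001", "65101:4002"]
)

print(encap_vni(leaf01_red, border01_vrf10))  # leaf01 to border01
print(encap_vni(border01_vrf10, leaf01_red))  # border01 to leaf01
```

The model prints 6000 for traffic from leaf01 to border01 and 4001 for the reverse direction, which matches steps 2 and 4.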

The configuration for the example is below.

Because the configuration is similar on all the leafs, the example only shows configuration files for leaf01 and border01. For brevity, the example does not show the spine configuration files.

cumulus@leaf01:~$ nv set interface lo ip address 10.10.10.1/32
cumulus@leaf01:~$ nv set interface swp1-3,swp51-52
cumulus@leaf01:~$ nv set interface bond1 bond member swp1
cumulus@leaf01:~$ nv set interface bond2 bond member swp2
cumulus@leaf01:~$ nv set interface bond3 bond member swp3
cumulus@leaf01:~$ nv set interface bond1 bond lacp-bypass on
cumulus@leaf01:~$ nv set interface bond2 bond lacp-bypass on
cumulus@leaf01:~$ nv set interface bond3 bond lacp-bypass on
cumulus@leaf01:~$ nv set interface bond1 link mtu 9000
cumulus@leaf01:~$ nv set interface bond2 link mtu 9000
cumulus@leaf01:~$ nv set interface bond3 link mtu 9000
cumulus@leaf01:~$ nv set interface bond1-3 bridge domain br_default
cumulus@leaf01:~$ nv set interface bond1 bridge domain br_default access 10
cumulus@leaf01:~$ nv set interface bond2 bridge domain br_default access 20
cumulus@leaf01:~$ nv set interface bond3 bridge domain br_default access 30
cumulus@leaf01:~$ nv set bridge domain br_default vlan 10,20,30
cumulus@leaf01:~$ nv set interface vlan10 ip address 10.1.10.2/24
cumulus@leaf01:~$ nv set interface vlan10 ip vrr address 10.1.10.1/24
cumulus@leaf01:~$ nv set interface vlan10 ip vrr mac-address 00:00:00:00:00:10
cumulus@leaf01:~$ nv set interface vlan10 ip vrr state up
cumulus@leaf01:~$ nv set interface vlan20 ip address 10.1.20.2/24
cumulus@leaf01:~$ nv set interface vlan20 ip vrr address 10.1.20.1/24
cumulus@leaf01:~$ nv set interface vlan20 ip vrr mac-address 00:00:00:00:00:20
cumulus@leaf01:~$ nv set interface vlan20 ip vrr state up
cumulus@leaf01:~$ nv set interface vlan30 ip address 10.1.30.2/24
cumulus@leaf01:~$ nv set interface vlan30 ip vrr address 10.1.30.1/24
cumulus@leaf01:~$ nv set interface vlan30 ip vrr mac-address 00:00:00:00:00:30
cumulus@leaf01:~$ nv set interface vlan30 ip vrr state up
cumulus@leaf01:~$ nv set vrf RED
cumulus@leaf01:~$ nv set vrf BLUE
cumulus@leaf01:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf01:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf01:~$ nv set bridge domain br_default vlan 30 vni 30
cumulus@leaf01:~$ nv set interface vlan10 ip vrf RED
cumulus@leaf01:~$ nv set interface vlan20 ip vrf RED
cumulus@leaf01:~$ nv set interface vlan30 ip vrf BLUE
cumulus@leaf01:~$ nv set nve vxlan source address 10.10.10.1
cumulus@leaf01:~$ nv set nve vxlan arp-nd-suppress on 
cumulus@leaf01:~$ nv set vrf RED evpn vni 4001
cumulus@leaf01:~$ nv set vrf BLUE evpn vni 4002
cumulus@leaf01:~$ nv set system global anycast-mac 44:38:39:FF:00:AA
cumulus@leaf01:~$ nv set evpn enable on
cumulus@leaf01:~$ nv set router bgp autonomous-system 65101
cumulus@leaf01:~$ nv set router bgp router-id 10.10.10.1
cumulus@leaf01:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 peer-group underlay
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp52 peer-group underlay
cumulus@leaf01:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@leaf01:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf01:~$ nv set vrf RED router bgp autonomous-system 65101
cumulus@leaf01:~$ nv set vrf RED router bgp router-id 10.10.10.1
cumulus@leaf01:~$ nv set vrf RED router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf01:~$ nv set vrf RED router bgp address-family ipv4-unicast route-export to-evpn
cumulus@leaf01:~$ nv set vrf RED router bgp route-import from-evpn route-target 65163:6000
cumulus@leaf01:~$ nv set vrf BLUE router bgp autonomous-system 65101
cumulus@leaf01:~$ nv set vrf BLUE router bgp router-id 10.10.10.1
cumulus@leaf01:~$ nv set vrf BLUE router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf01:~$ nv set vrf BLUE router bgp address-family ipv4-unicast route-export to-evpn
cumulus@leaf01:~$ nv set vrf BLUE router bgp route-import from-evpn route-target 65163:6000
cumulus@leaf01:~$ nv set evpn multihoming enable on
cumulus@leaf01:~$ nv set interface bond1 evpn multihoming segment local-id 1
cumulus@leaf01:~$ nv set interface bond2 evpn multihoming segment local-id 2
cumulus@leaf01:~$ nv set interface bond3 evpn multihoming segment local-id 3
cumulus@leaf01:~$ nv set interface bond1-3 evpn multihoming segment mac-address 44:38:39:FF:00:AA
cumulus@leaf01:~$ nv set interface bond1-3 evpn multihoming segment df-preference 50000
cumulus@leaf01:~$ nv set interface swp51-52 evpn multihoming uplink on
cumulus@leaf01:~$ nv config apply
cumulus@border01:~$ nv set interface lo ip address 10.10.10.63/32
cumulus@border01:~$ nv set interface swp1-3,swp51-52
cumulus@border01:~$ nv set interface bond1 bond member swp1
cumulus@border01:~$ nv set interface bond2 bond member swp2
cumulus@border01:~$ nv set interface bond3 bond member swp3
cumulus@border01:~$ nv set interface bond1 bond lacp-bypass on
cumulus@border01:~$ nv set interface bond2 bond lacp-bypass on
cumulus@border01:~$ nv set interface bond3 bond lacp-bypass on
cumulus@border01:~$ nv set interface bond1 link mtu 9000
cumulus@border01:~$ nv set interface bond2 link mtu 9000
cumulus@border01:~$ nv set interface bond3 link mtu 9000
cumulus@border01:~$ nv set interface bond1-3 bridge domain br_default
cumulus@border01:~$ nv set interface bond1 bridge domain br_default access 2001
cumulus@border01:~$ nv set interface bond2 bridge domain br_default access 2002
cumulus@border01:~$ nv set interface bond3 bridge domain br_default access 2010
cumulus@border01:~$ nv set interface vlan2001 ip address 10.1.201.1/24
cumulus@border01:~$ nv set interface vlan2002 ip address 10.1.202.1/24
cumulus@border01:~$ nv set interface vlan2010 ip address 10.1.210.1/24
cumulus@border01:~$ nv set bridge domain br_default vlan 2001,2002,2010
cumulus@border01:~$ nv set vrf VRF10
cumulus@border01:~$ nv set vrf EXTERNAL1
cumulus@border01:~$ nv set vrf EXTERNAL2
cumulus@border01:~$ nv set bridge domain br_default vlan 2001 vni 2001
cumulus@border01:~$ nv set bridge domain br_default vlan 2002 vni 2002
cumulus@border01:~$ nv set bridge domain br_default vlan 2010 vni 2010
cumulus@border01:~$ nv set interface vlan2001 ip vrf EXTERNAL1
cumulus@border01:~$ nv set interface vlan2002 ip vrf EXTERNAL2
cumulus@border01:~$ nv set interface vlan2010 ip vrf VRF10
cumulus@border01:~$ nv set nve vxlan source address 10.10.10.63
cumulus@border01:~$ nv set nve vxlan arp-nd-suppress on 
cumulus@border01:~$ nv set vrf VRF10 evpn vni 6000
cumulus@border01:~$ nv set system global anycast-mac 44:38:39:FF:00:FF
cumulus@border01:~$ nv set evpn enable on
cumulus@border01:~$ nv set router bgp autonomous-system 65163
cumulus@border01:~$ nv set router bgp router-id 10.10.10.63
cumulus@border01:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@border01:~$ nv set vrf default router bgp neighbor swp51 peer-group underlay
cumulus@border01:~$ nv set vrf default router bgp neighbor swp52 peer-group underlay
cumulus@border01:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@border01:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@border01:~$ nv set vrf VRF10 router bgp autonomous-system 65163
cumulus@border01:~$ nv set vrf VRF10 router bgp router-id 10.10.10.63
cumulus@border01:~$ nv set vrf VRF10 router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@border01:~$ nv set vrf VRF10 router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@border01:~$ nv set vrf VRF10 router bgp address-family ipv4-unicast route-export to-evpn
cumulus@border01:~$ nv set vrf VRF10 router bgp route-import from-evpn route-target 65101:4001
cumulus@border01:~$ nv set vrf VRF10 router bgp route-import from-evpn route-target 65101:4002
cumulus@border01:~$ nv set vrf EXTERNAL1 router bgp autonomous-system 65163
cumulus@border01:~$ nv set vrf EXTERNAL1 router bgp router-id 10.10.10.63
cumulus@border01:~$ nv set vrf EXTERNAL1 router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@border01:~$ nv set vrf EXTERNAL1 router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@border01:~$ nv set vrf EXTERNAL1 router bgp address-family ipv4-unicast route-export to-evpn
cumulus@border01:~$ nv set vrf EXTERNAL2 router bgp autonomous-system 65163
cumulus@border01:~$ nv set vrf EXTERNAL2 router bgp router-id 10.10.10.63
cumulus@border01:~$ nv set vrf EXTERNAL2 router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@border01:~$ nv set vrf EXTERNAL2 router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@border01:~$ nv set vrf EXTERNAL2 router bgp address-family ipv4-unicast route-export to-evpn
cumulus@border01:~$ nv config apply
cumulus@leaf01:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
            '30':
              vni:
                '30': {}
    evpn:
      enable: on
      multihoming:
        enable: on
    interface:
      bond1:
        bond:
          lacp-bypass: on
          member:
            swp1: {}
        bridge:
          domain:
            br_default:
              access: 10
        evpn:
          multihoming:
            segment:
              df-preference: 50000
              enable: on
              local-id: 1
              mac-address: 44:38:39:FF:00:AA
        link:
          mtu: 9000
        type: bond
      bond2:
        bond:
          lacp-bypass: on
          member:
            swp2: {}
        bridge:
          domain:
            br_default:
              access: 20
        evpn:
          multihoming:
            segment:
              df-preference: 50000
              enable: on
              local-id: 2
              mac-address: 44:38:39:FF:00:AA
        link:
          mtu: 9000
        type: bond
      bond3:
        bond:
          lacp-bypass: on
          member:
            swp3: {}
        bridge:
          domain:
            br_default:
              access: 30
        evpn:
          multihoming:
            segment:
              df-preference: 50000
              enable: on
              local-id: 3
              mac-address: 44:38:39:FF:00:AA
        link:
          mtu: 9000
        type: bond
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.1/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp51:
        evpn:
          multihoming:
            uplink: on
        type: swp
      swp52:
        evpn:
          multihoming:
            uplink: on
        type: swp
      vlan10:
        ip:
          address:
            10.1.10.2/24: {}
          vrf: RED
          vrr:
            address:
              10.1.10.1/24: {}
            enable: on
            mac-address: 00:00:00:00:00:10
            state:
              up: {}
        type: svi
        vlan: 10
      vlan20:
        ip:
          address:
            10.1.20.2/24: {}
          vrf: RED
          vrr:
            address:
              10.1.20.1/24: {}
            enable: on
            mac-address: 00:00:00:00:00:20
            state:
              up: {}
        type: svi
        vlan: 20
      vlan30:
        ip:
          address:
            10.1.30.2/24: {}
          vrf: BLUE
          vrr:
            address:
              10.1.30.1/24: {}
            enable: on
            mac-address: 00:00:00:00:00:30
            state:
              up: {}
        type: svi
        vlan: 30
    nve:
      vxlan:
        arp-nd-suppress: on
        enable: on
        source:
          address: 10.10.10.1
    router:
      bgp:
        autonomous-system: 65101
        enable: on
        router-id: 10.10.10.1
      vrr:
        enable: on
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$yJf4CI.6MAcRaFk7$w4JpnsELzwZ.2IQmDCNbTOzXvn8tigF53ZQr5bev5HkZqcrvT6s/uV.NN3ejCXAEVS0B6Erm2gDAZmoZjhiiR0
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        anycast-mac: 44:38:39:FF:00:AA
        system-mac: 44:38:39:22:01:7a
      hostname: leaf01
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      BLUE:
        evpn:
          enable: on
          vni:
            '4002': {}
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
                route-export:
                  to-evpn:
                    enable: on
              l2vpn-evpn:
                enable: on
            autonomous-system: 65101
            enable: on
            peer-group:
              underlay:
                address-family:
                  l2vpn-evpn:
                    enable: on
            route-import:
              from-evpn:
                route-target:
                  65163:6000: {}
            router-id: 10.10.10.1
      RED:
        evpn:
          enable: on
          vni:
            '4001': {}
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
                route-export:
                  to-evpn:
                    enable: on
              l2vpn-evpn:
                enable: on
            autonomous-system: 65101
            enable: on
            peer-group:
              underlay:
                address-family:
                  l2vpn-evpn:
                    enable: on
            route-import:
              from-evpn:
                route-target:
                  65163:6000: {}
            router-id: 10.10.10.1
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
              l2vpn-evpn:
                enable: on
            enable: on
            neighbor:
              swp51:
                peer-group: underlay
                type: unnumbered
              swp52:
                peer-group: underlay
                type: unnumbered
            peer-group:
              underlay:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
cumulus@border01:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    bridge:
      domain:
        br_default:
          vlan:
            '2001':
              vni:
                '2001': {}
            '2002':
              vni:
                '2002': {}
            '2010':
              vni:
                '2010': {}
    evpn:
      enable: on
      multihoming:
        enable: on
    interface:
      bond1:
        bond:
          lacp-bypass: on
          member:
            swp1: {}
        bridge:
          domain:
            br_default:
              access: 2001
        evpn:
          multihoming: {}
        link:
          mtu: 9000
        type: bond
      bond2:
        bond:
          lacp-bypass: on
          member:
            swp2: {}
        bridge:
          domain:
            br_default:
              access: 2002
        evpn:
          multihoming: {}
        link:
          mtu: 9000
        type: bond
      bond3:
        bond:
          lacp-bypass: on
          member:
            swp3: {}
        bridge:
          domain:
            br_default:
              access: 2010
        evpn:
          multihoming: {}
        link:
          mtu: 9000
        type: bond
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.63/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp51:
        evpn:
          multihoming: {}
        type: swp
      swp52:
        evpn:
          multihoming: {}
        type: swp
      vlan2001:
        ip:
          address:
            10.1.201.1/24: {}
          vrf: EXTERNAL1
        type: svi
        vlan: 2001
      vlan2002:
        ip:
          address:
            10.1.202.1/24: {}
          vrf: EXTERNAL2
        type: svi
        vlan: 2002
      vlan2010:
        ip:
          address:
            10.1.210.1/24: {}
          vrf: VRF10
        type: svi
        vlan: 2010
    nve:
      vxlan:
        arp-nd-suppress: on
        enable: on
        source:
          address: 10.10.10.63
    router:
      bgp:
        autonomous-system: 65163
        enable: on
        router-id: 10.10.10.63
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$sz.v3Uf8h2bL19/6$zxzdafQL5gqGp63/t/Vmg34IuZ6ztC3ie3g08KwmhWRBnFrb52d2qzMUJxn4dUpJZSwkkDmScJwSvljH1RwYj.
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        anycast-mac: 44:38:39:FF:00:FF
        system-mac: 44:38:39:22:01:74
      hostname: border01
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      EXTERNAL1:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
                route-export:
                  to-evpn:
                    enable: on
              l2vpn-evpn:
                enable: on
            autonomous-system: 65163
            enable: on
            peer-group:
              underlay:
                address-family:
                  l2vpn-evpn:
                    enable: on
            router-id: 10.10.10.63
      EXTERNAL2:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
                route-export:
                  to-evpn:
                    enable: on
              l2vpn-evpn:
                enable: on
            autonomous-system: 65163
            enable: on
            peer-group:
              underlay:
                address-family:
                  l2vpn-evpn:
                    enable: on
            router-id: 10.10.10.63
      VRF10:
        evpn:
          enable: on
          vni:
            '6000': {}
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
                route-export:
                  to-evpn:
                    enable: on
              l2vpn-evpn:
                enable: on
            autonomous-system: 65163
            enable: on
            peer-group:
              underlay:
                address-family:
                  l2vpn-evpn:
                    enable: on
            route-import:
              from-evpn:
                route-target:
                  65101:4001: {}
                  65101:4002: {}
            router-id: 10.10.10.63
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
              l2vpn-evpn:
                enable: on
            enable: on
            neighbor:
              swp51:
                peer-group: underlay
                type: unnumbered
              swp52:
                peer-group: underlay
                type: unnumbered
            peer-group:
              underlay:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
cumulus@leaf01:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.1/32
    vxlan-local-tunnelip 10.10.10.1
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto RED
iface RED
    vrf-table auto
auto BLUE
iface BLUE
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp51
iface swp51
auto swp52
iface swp52
auto bond1
iface bond1
    mtu 9000
    es-sys-mac 44:38:39:FF:00:AA
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    bridge-access 10
auto bond2
iface bond2
    mtu 9000
    es-sys-mac 44:38:39:FF:00:AA
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    bridge-access 20
auto bond3
iface bond3
    mtu 9000
    es-sys-mac 44:38:39:FF:00:AA
    bond-slaves swp3
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    bridge-access 30
auto vlan10
iface vlan10
    address 10.1.10.2/24
    address-virtual 00:00:00:00:00:10 10.1.10.1/24
    hwaddress 44:38:39:22:01:b1
    vrf RED
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    address 10.1.20.2/24
    address-virtual 00:00:00:00:00:20 10.1.20.1/24
    hwaddress 44:38:39:22:01:b1
    vrf RED
    vlan-raw-device br_default
    vlan-id 20
auto vlan30
iface vlan30
    address 10.1.30.2/24
    address-virtual 00:00:00:00:00:30 10.1.30.1/24
    hwaddress 44:38:39:22:01:b1
    vrf BLUE
    vlan-raw-device br_default
    vlan-id 30
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20 30=30
    bridge-vids 10 20 30
    bridge-learning off
auto vlan220_l3
iface vlan220_l3
    vrf RED
    vlan-raw-device br_l3vni
    vlan-id 220
auto vlan297_l3
iface vlan297_l3
    vrf BLUE
    vlan-raw-device br_l3vni
    vlan-id 297
auto vxlan99
iface vxlan99
    bridge-vlan-vni-map 220=4001 297=4002
    bridge-vids 220 297
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond1 bond2 bond3 vxlan48
    hwaddress 44:38:39:22:01:b1
    bridge-vlan-aware yes
    bridge-vids 10 20 30
    bridge-pvid 1
auto br_l3vni
iface br_l3vni
    bridge-ports vxlan99
    hwaddress 44:38:39:22:01:b1
    bridge-vlan-aware yes
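
In the leaf01 file above, the vxlan48 interface maps VLANs 10, 20, and 30 to the layer 2 VNIs, and the vxlan99 interface maps VLANs 220 and 297 (the vlan220_l3 and vlan297_l3 interfaces in VRF RED and VRF BLUE) to the layer 3 VNIs 4001 and 4002. The following Python sketch shows one way to read a bridge-vlan-vni-map line into a lookup table. parse_vlan_vni_map is a hypothetical helper, and the sketch handles only the single <vlan>=<vni> pairs shown in this example.

```python
def parse_vlan_vni_map(line):
    """Turn 'bridge-vlan-vni-map 220=4001 297=4002' into {220: 4001, 297: 4002}."""
    keyword, *pairs = line.split()
    if keyword != "bridge-vlan-vni-map":
        raise ValueError(f"not a bridge-vlan-vni-map line: {line!r}")
    mapping = {}
    for pair in pairs:
        vlan, vni = pair.split("=")
        mapping[int(vlan)] = int(vni)
    return mapping


l2_map = parse_vlan_vni_map("    bridge-vlan-vni-map 10=10 20=20 30=30")
l3_map = parse_vlan_vni_map("    bridge-vlan-vni-map 220=4001 297=4002")

print(l2_map)
print(l3_map[220])
```

Looking up VLAN 220 returns 4001, the layer 3 VNI that leaf01 assigns to VRF RED.
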
cumulus@border01:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.63/32
    vxlan-local-tunnelip 10.10.10.63
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto EXTERNAL1
iface EXTERNAL1
    vrf-table auto
auto EXTERNAL2
iface EXTERNAL2
    vrf-table auto
auto VRF10
iface VRF10
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto bond1
iface bond1
    mtu 9000
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    bridge-access 2001
auto bond2
iface bond2
    mtu 9000
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    bridge-access 2002
auto bond3
iface bond3
    mtu 9000
    bond-slaves swp3
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    bridge-access 2010
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp51
iface swp51
auto swp52
iface swp52
auto vlan2001
iface vlan2001
    address 10.1.201.1/24
    hwaddress 44:38:39:22:01:74
    vrf EXTERNAL1
    vlan-raw-device br_default
    vlan-id 2001
auto vlan2002
iface vlan2002
    address 10.1.202.1/24
    hwaddress 44:38:39:22:01:74
    vrf EXTERNAL2
    vlan-raw-device br_default
    vlan-id 2002
auto vlan2010
iface vlan2010
    address 10.1.210.1/24
    hwaddress 44:38:39:22:01:74
    vrf VRF10
    vlan-raw-device br_default
    vlan-id 2010
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 2001=2001 2002=2002 2010=2010
    bridge-learning off
auto vlan336_l3
iface vlan336_l3
    vrf VRF10
    vlan-raw-device br_l3vni
    vlan-id 336
auto vxlan99
iface vxlan99
    bridge-vlan-vni-map 336=6000
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond1 bond2 bond3 vxlan48
    hwaddress 44:38:39:22:01:74
    bridge-vlan-aware yes
    bridge-vids 2001 2002 2010
    bridge-pvid 1
auto br_l3vni
iface br_l3vni
    bridge-ports vxlan99
    hwaddress 44:38:39:22:01:74
    bridge-vlan-aware yes
cumulus@leaf01:~$ sudo cat /etc/frr/frr.conf
...
evpn mh mac-holdtime 1080
evpn mh neigh-holdtime 1080
evpn mh startup-delay 180
interface bond1
evpn mh es-df-pref 50000
evpn mh es-id 1
evpn mh es-sys-mac 44:38:39:FF:00:AA
interface bond2
evpn mh es-df-pref 50000
evpn mh es-id 2
evpn mh es-sys-mac 44:38:39:FF:00:AA
interface bond3
evpn mh es-df-pref 50000
evpn mh es-id 3
evpn mh es-sys-mac 44:38:39:FF:00:AA
interface swp51
evpn mh uplink
interface swp52
evpn mh uplink
vrf BLUE
vni 4002
exit-vrf
vrf RED
vni 4001
exit-vrf
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65101 vrf default
bgp router-id 10.10.10.1
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor swp51 interface peer-group underlay
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface peer-group underlay
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp51 activate
neighbor swp52 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor swp51 activate
neighbor swp52 activate
neighbor underlay activate
exit-address-family
! end of router bgp 65101 vrf default
router bgp 65101 vrf RED
bgp router-id 10.10.10.1
timers bgp 3 9
bgp deterministic-med
! Neighbors
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
exit-address-family
address-family l2vpn evpn
advertise ipv4 unicast
route-target import 65163:6000
exit-address-family
! end of router bgp 65101 vrf RED
router bgp 65101 vrf BLUE
bgp router-id 10.10.10.1
timers bgp 3 9
bgp deterministic-med
! Neighbors
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
exit-address-family
address-family l2vpn evpn
advertise ipv4 unicast
route-target import 65163:6000
exit-address-family
! end of router bgp 65101 vrf BLUE
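
To confirm which VRFs import a downstream VNI, you can scan the file for route-target import statements. The following Python sketch is an illustration only; route_target_imports is a hypothetical helper, and the sketch assumes the flat router bgp <asn> vrf <name> layout shown in the leaf01 file above.

```python
def route_target_imports(frr_conf):
    """Map each 'router bgp <asn> vrf <name>' block to its route-target imports."""
    imports = {}
    current_vrf = None
    for raw in frr_conf.splitlines():
        words = raw.split()
        if words[:2] == ["router", "bgp"]:
            # 'router bgp 65101 vrf RED' selects RED; without the vrf
            # keyword the block belongs to the default VRF.
            if len(words) >= 5 and words[3] == "vrf":
                current_vrf = words[4]
            else:
                current_vrf = "default"
            imports.setdefault(current_vrf, [])
        elif (
            words[:2] == ["route-target", "import"]
            and len(words) >= 3
            and current_vrf is not None
        ):
            imports[current_vrf].append(words[2])
    return imports


# Abbreviated copy of the leaf01 file shown above.
SAMPLE = """\
router bgp 65101 vrf default
address-family l2vpn evpn
advertise-all-vni
exit-address-family
! end of router bgp 65101 vrf default
router bgp 65101 vrf RED
address-family l2vpn evpn
advertise ipv4 unicast
route-target import 65163:6000
exit-address-family
router bgp 65101 vrf BLUE
address-family l2vpn evpn
advertise ipv4 unicast
route-target import 65163:6000
exit-address-family
"""

result = route_target_imports(SAMPLE)
print(result)
```

For the leaf01 sample, the result shows that both RED and BLUE import 65163:6000 and that the default VRF has no import statements.
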
cumulus@border01:~$ sudo cat /etc/frr/frr.conf
vrf EXTERNAL1
exit-vrf
vrf EXTERNAL2
exit-vrf
vrf VRF10
vni 6000
exit-vrf
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65163 vrf default
bgp router-id 10.10.10.63
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor swp51 interface peer-group underlay
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface peer-group underlay
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp51 activate
neighbor swp52 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor swp51 activate
neighbor swp52 activate
neighbor underlay activate
exit-address-family
! end of router bgp 65163 vrf default
router bgp 65163 vrf EXTERNAL1
bgp router-id 10.10.10.63
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise ipv4 unicast
neighbor underlay activate
exit-address-family
! end of router bgp 65163 vrf EXTERNAL1
router bgp 65163 vrf EXTERNAL2
bgp router-id 10.10.10.63
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise ipv4 unicast
neighbor underlay activate
exit-address-family
! end of router bgp 65163 vrf EXTERNAL2
router bgp 65163 vrf VRF10
bgp router-id 10.10.10.63
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise ipv4 unicast
route-target import 65101:4001
route-target import 65101:4002
neighbor underlay activate
exit-address-family
! end of router bgp 65163 vrf VRF10

This simulation is running Cumulus Linux 5.11. The Cumulus Linux 5.12 simulation is coming soon.

The simulation starts with the example downstream VNI configuration. To simplify the example, only one spine is in the topology. The demo is pre-configured using NVUE commands.

  • fw1 has IP address 10.1.210.254 configured beyond border01 in VRF10.
  • server01 has IP address 10.1.10.101 as in the example.

To validate the configuration, run the verification commands shown below.

Verify Configuration

To verify the configuration, check that the routes are properly received and tagged:

The following vtysh command on leaf01 shows the route from border01 tagged with route target 6000:

cumulus@leaf01:~$ sudo vtysh
leaf01# show bgp l2vpn evpn route type prefix
...
Route Distinguisher: 10.10.10.63:3
*> [5]:[0]:[24]:[10.1.210.0]
                    10.10.10.63                            0 65222 65163 ?
                    RT:65163:6000 ET:8 Rmac:44:38:39:22:01:b3
...

The following Linux command on leaf01 shows the encapsulated ID (6000) on the routes:

cumulus@leaf01:mgmt:~$ ip route show vrf RED 10.1.210.0/24
10.1.210.0/24  encap ip id 6000 src 0.0.0.0 dst 10.10.10.63 ttl 0 tos 0 via 10.10.10.63 dev vxlan99 proto bgp metric 20 onlink

The following vtysh command on border01 shows the routes from leaf01 tagged with route targets 4001 and 4002:

cumulus@border01:~$ sudo vtysh
border01# show bgp l2vpn evpn route type prefix
...
Route Distinguisher: 10.10.10.1:2
*> [5]:[0]:[24]:[10.1.10.0]
                    10.10.10.1                             0 65222 65101 ?
                    RT:65101:4001 ET:8 Rmac:44:38:39:22:01:b1
*> [5]:[0]:[24]:[10.1.20.0]
                    10.10.10.1                             0 65222 65101 ?
                    RT:65101:4001 ET:8 Rmac:44:38:39:22:01:b1
Route Distinguisher: 10.10.10.1:3
*> [5]:[0]:[24]:[10.1.30.0]
                    10.10.10.1                             0 65222 65101 ?
                    RT:65101:4002 ET:8 Rmac:44:38:39:22:01:b1
...

The following Linux command on border01 shows the encapsulated IDs (4001 and 4002) on the routes:

cumulus@border01:mgmt:~$ ip route show vrf VRF10
10.1.10.0/24  encap ip id 4001 src 0.0.0.0 dst 10.10.10.1 ttl 0 tos 0 via 10.10.10.1 dev vxlan99 proto bgp metric 20 onlink 
10.1.20.0/24  encap ip id 4001 src 0.0.0.0 dst 10.10.10.1 ttl 0 tos 0 via 10.10.10.1 dev vxlan99 proto bgp metric 20 onlink 
10.1.30.0/24  encap ip id 4002 src 0.0.0.0 dst 10.10.10.1 ttl 0 tos 0 via 10.10.10.1 dev vxlan99 proto bgp metric 20 onlink 
...
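To check many prefixes at once, you can post-process the `ip route show vrf <vrf>` output instead of reading it by eye. The following Python sketch is an illustration only, not a Cumulus Linux tool; the function name `encap_ids` is hypothetical, and the regular expression assumes the line layout shown in the sample output above.

```python
import re

# Matches lines such as:
# 10.1.10.0/24  encap ip id 4001 src 0.0.0.0 dst 10.10.10.1 ttl 0 ...
ROUTE_RE = re.compile(
    r"^(?P<prefix>\S+)\s+encap ip id (?P<vni>\d+)\b.*?\bdst (?P<vtep>\S+)"
)


def encap_ids(route_output):
    """Map each prefix to (encapsulation ID, remote VTEP) from `ip route` text."""
    result = {}
    for line in route_output.splitlines():
        match = ROUTE_RE.match(line.strip())
        if match:  # lines without VXLAN encapsulation (or "...") are skipped
            result[match["prefix"]] = (int(match["vni"]), match["vtep"])
    return result


if __name__ == "__main__":
    sample = (
        "10.1.10.0/24  encap ip id 4001 src 0.0.0.0 dst 10.10.10.1 ttl 0 tos 0 "
        "via 10.10.10.1 dev vxlan99 proto bgp metric 20 onlink\n"
        "10.1.30.0/24  encap ip id 4002 src 0.0.0.0 dst 10.10.10.1 ttl 0 tos 0 "
        "via 10.10.10.1 dev vxlan99 proto bgp metric 20 onlink\n"
    )
    for prefix, (vni, vtep) in encap_ids(sample).items():
        print(prefix, vni, vtep)
```

On border01, every prefix learned from leaf01 should map to the downstream VNI that leaf01 advertises (4001 or 4002 in this example).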

Considerations

Centralized Routing with ARP Suppression Enabled on the Gateway

In an EVPN centralized routing configuration where the layer 2 network extends beyond the VTEPs (for example, a host with bridges), the gateway MAC address does not refresh in the network when ARP suppression is enabled on the gateway. To work around this issue, disable ARP suppression on the centralized gateway.

Symmetric Routing and the Same SVI IP Address Across Racks

In EVPN symmetric routing, if you use the same SVI IP address across racks (for example, if the SVI IP address for a specific VLAN interface (such as vlan100) is the same on all VTEPs where this SVI is present):

Host-to-host traffic does not have these issues.

EVPN Multihoming

EVPN multihoming (EVPN-MH) provides support for all-active server redundancy. It is a standards-based replacement for MLAG in data centers deploying Clos topologies. Replacing MLAG provides these benefits:

EVPN-MH uses BGP-EVPN type-1, type-2, and type-4 routes to discover Ethernet segments (ES) and to forward traffic to those Ethernet segments. The MAC and neighbor databases also synchronize between the Ethernet segment peers through these routes. An Ethernet segment is a group of switch links that attach to the same server. Each Ethernet segment has a unique Ethernet segment ID (ESI) across the entire PoD.

To configure EVPN-MH, you set an Ethernet segment MAC address and a local Ethernet segment ID on a static or LACP bond. These two parameters generate the unique MAC-based ESI value (type-3) automatically:

While you can specify a different segment MAC address on different Ethernet segments attached to the same switch, the Ethernet segment MAC address must be the same on the downlinks attached to the same server.
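As a rough illustration of how the two parameters combine, the following Python sketch builds a type-3 ESI using the RFC 7432 layout: one type byte (03), the 6-byte segment MAC address, then a 3-byte local discriminator. The function name `build_type3_esi` is hypothetical and is not part of Cumulus Linux or FRR; the switch performs this derivation itself. Treating a local ID of 0 as invalid is an assumption of this sketch.

```python
def build_type3_esi(es_sys_mac, local_id):
    """Return the 10-byte type-3 ESI string for a segment MAC and local ID."""
    octets = es_sys_mac.lower().split(":")
    if len(octets) != 6 or any(len(o) != 2 for o in octets):
        raise ValueError("not a MAC address: %r" % es_sys_mac)
    for octet in octets:
        int(octet, 16)  # raises ValueError if an octet is not hexadecimal
    # The local discriminator must fit in three bytes (assumption: 0 is unset).
    if not 1 <= local_id <= 0xFFFFFF:
        raise ValueError("local ID must be between 1 and 16777215")
    discriminator = ["%02x" % b for b in local_id.to_bytes(3, "big")]
    return ":".join(["03"] + octets + discriminator)


if __name__ == "__main__":
    # Segment MAC and local IDs taken from the bond examples in this section.
    for local_id in (1, 2, 3):
        print(build_type3_esi("44:38:39:FF:00:AA", local_id))
```

The result has the same form as the ESI values that the troubleshooting commands in this section display, such as 03:44:38:39:FF:00:aa:00:00:01 for bond1. The sketch prints all hexadecimal digits in lowercase.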

On Spectrum-2 and later, an Ethernet segment can span more than two switches. Each Ethernet segment is a distinct redundancy group. However, on Spectrum A1 switches, you can include a maximum of two switches in a redundancy group or Ethernet segment.

Required and Supported Features

This section describes the features that you must enable to use EVPN multihoming, and lists other supported and unsupported features.

Required Features

You must enable the following features to use EVPN-MH:

Cumulus Linux uses HER by default with EVPN multihoming. If you prefer to use EVPN BUM traffic handling with EVPN-PIM on multihomed sites through Type-4/ESR routes, configure EVPN-PIM as described in EVPN BUM Traffic with PIM-SM.

On Spectrum A1 switches, NVIDIA recommends that you use a PIM-SM underlay to distribute BUM traffic with EVPN multihoming for better performance. To check if you have a Spectrum A1 switch, run the sudo decode-syseeprom version | egrep -i "tlv|--|device version" command. If the command output shows the Device Version value at 16 or higher, you have a Spectrum A1 switch:

cumulus@switch:~$ sudo decode-syseeprom version | egrep -i "tlv|--|device version"
TlvInfo Header:
   Id String:    TlvInfo
TLV Name             Code Len Value
-------------------- ---- --- -----
Device Version       0x26   1 16
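If you need to run this check across many switches, you can apply the same rule to saved command output in a script. The following Python sketch is illustrative only; the function names are hypothetical, and the parser assumes the Device Version row has the column layout shown in the sample output above. The threshold of 16 comes from the rule stated in this section.

```python
import re

# Row layout assumed from the sample: name, TLV code, length, value.
DEVICE_VERSION_RE = re.compile(
    r"^Device Version\s+0x[0-9A-Fa-f]+\s+\d+\s+(\d+)\s*$"
)


def device_version(eeprom_output):
    """Return the Device Version value as an int, or None if it is absent."""
    for line in eeprom_output.splitlines():
        match = DEVICE_VERSION_RE.match(line.strip())
        if match:
            return int(match.group(1))
    return None


def is_spectrum_a1(eeprom_output):
    """Apply the rule from this section: Device Version of 16 or higher."""
    version = device_version(eeprom_output)
    return version is not None and version >= 16


if __name__ == "__main__":
    sample = (
        "TLV Name             Code Len Value\n"
        "-------------------- ---- --- -----\n"
        "Device Version       0x26   1 16\n"
    )
    print(device_version(sample), is_spectrum_a1(sample))
```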

To use EVPN-MH, you must remove any MLAG configuration on the switch:

Supported Features

Supported EVPN Route Types

EVPN multihoming supports the following route types.

Route Type Description RFC
1 Ethernet auto-discovery (A-D) route RFC 7432
2 MAC/IP advertisement route RFC 7432
3 Inclusive multicast route RFC 7432
4 Ethernet segment route RFC 7432
5 IP prefix route RFC 9136

Unsupported Features

The following features are not supported with EVPN-MH:

Basic Configuration

To configure EVPN-MH, you must complete all the following steps:

  1. Enable EVPN multihoming.
  2. Configure an ESI on each EVPN-MH bond interface.
  3. Configure multihoming uplinks.

You can associate static and LACP bonds with an ESI.

The switch selects a designated forwarder (DF) for each Ethernet segment. The DF forwards flooded traffic received through the VXLAN overlay to the locally attached Ethernet segment. Specify a preference on an Ethernet segment for the DF election, as this leads to predictable failure scenarios. The EVPN VTEP with the highest DF preference setting becomes the DF. The DF preference setting defaults to 32767.
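The preference rule can be sketched in a few lines of Python. This is a simplified model for reasoning about failure scenarios, not the FRR implementation. The function name `elect_df` is hypothetical, and breaking a tie in favor of the lowest VTEP IP address is an assumption made for the example; this section only states that the highest preference wins.

```python
from ipaddress import ip_address
from typing import Dict, Optional

DEFAULT_DF_PREFERENCE = 32767  # default stated in this section


def elect_df(candidates: Dict[str, Optional[int]]) -> str:
    """Return the VTEP IP that becomes DF for one Ethernet segment.

    candidates maps each VTEP IP on the segment to its configured DF
    preference, or None when the VTEP uses the default.
    """
    if not candidates:
        raise ValueError("an Ethernet segment needs at least one VTEP")

    def sort_key(item):
        vtep, preference = item
        if preference is None:
            preference = DEFAULT_DF_PREFERENCE
        # Highest preference first; lowest IP breaks ties (assumption).
        return (-preference, ip_address(vtep))

    return min(candidates.items(), key=sort_key)[0]


if __name__ == "__main__":
    # leaf01 sets 50000 as in the examples; leaf02 keeps the default.
    print(elect_df({"10.10.10.1": 50000, "10.10.10.2": None}))
```

Setting distinct preference values on each VTEP avoids relying on any tie-break behavior, which is one reason an explicit preference makes failures more predictable.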

NVUE generates the EVPN-MH configuration and reloads FRR and ifupdown2. The configuration appears in both the /etc/network/interfaces file and the /etc/frr/frr.conf file.

When you enable EVPN-MH, all SVI MAC addresses advertise as type-2 routes. You do not need to configure a unique SVI IP address or configure the BGP EVPN address family with advertise-svi-ip.

Enable EVPN-MH

NVIDIA recommends that you enable EVPN-MH on all VTEPs throughout the fabric to avoid duplicate packets.

cumulus@leaf01:~$ nv set evpn multihoming enable on
cumulus@leaf01:~$ nv config apply

When you enable multihoming with the nv set evpn multihoming enable on command, NVUE restarts the switchd service, which causes all network ports to reset in addition to resetting the switch hardware configuration.

Set the evpn.multihoming.enable variable in the /etc/cumulus/switchd.conf file to TRUE. Cumulus Linux disables this variable by default.

cumulus@leaf01:~$ sudo nano /etc/cumulus/switchd.conf
...
evpn.multihoming.enable = TRUE
...

On the Spectrum A1 switch, you must restart switchd with the sudo systemctl restart switchd.service command after you enable multihoming.

Configure the EVPN-MH Bonds

To configure bond interfaces for EVPN-MH:

You can either set both the local Ethernet segment ID and the segment MAC address to generate a unique ESI automatically or set the 10-byte Ethernet segment ID manually, then set the segment MAC address. You can see both options below.

The following example commands configure each bond interface with the local Ethernet segment ID and the segment MAC address to generate a unique ESI automatically:

cumulus@leaf01:~$ nv set interface bond1 bond member swp1
cumulus@leaf01:~$ nv set interface bond2 bond member swp2
cumulus@leaf01:~$ nv set interface bond3 bond member swp3
cumulus@leaf01:~$ nv set interface bond1 evpn multihoming segment local-id 1
cumulus@leaf01:~$ nv set interface bond2 evpn multihoming segment local-id 2
cumulus@leaf01:~$ nv set interface bond3 evpn multihoming segment local-id 3
cumulus@leaf01:~$ nv set interface bond1-3 evpn multihoming segment mac-address 44:38:39:FF:00:AA
cumulus@leaf01:~$ nv set interface bond1-3 evpn multihoming segment df-preference 50000
cumulus@leaf01:~$ nv config apply

The following example commands configure each bond interface with the Ethernet segment ID manually. The ID must be a 10-byte (80-bit) integer and must be unique. When you configure the 10-byte Ethernet segment ID, ensure that the local ID is not present. You must also configure the segment MAC address. The example configures a global segment MAC address for use on all the Ethernet segment bonds.

  • In Cumulus Linux 5.6 and later, NVUE no longer supports a 10-byte ESI value that starts with a hex value other than 00.
  • When setting the segment MAC address manually, NVIDIA recommends using the reserved MAC address range 44:38:39:ff:00:00 through 44:38:39:ff:ff:ff.

cumulus@leaf01:~$ nv set interface bond1 bond member swp1
cumulus@leaf01:~$ nv set interface bond2 bond member swp2
cumulus@leaf01:~$ nv set interface bond3 bond member swp3
cumulus@leaf01:~$ nv set interface bond1 evpn multihoming segment identifier 00:44:38:39:FF:00:AA:00:00:01
cumulus@leaf01:~$ nv set interface bond2 evpn multihoming segment identifier 00:44:38:39:FF:00:AA:00:00:02
cumulus@leaf01:~$ nv set interface bond3 evpn multihoming segment identifier 00:44:38:39:FF:00:AA:00:00:03
cumulus@leaf01:~$ nv set interface bond1-3 evpn multihoming segment df-preference 50000
cumulus@leaf01:~$ nv set evpn multihoming segment mac-address 44:38:39:ff:ff:01
cumulus@leaf01:~$ nv config apply
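Before you apply a manually assigned ESI, you can check it against the constraints described above. The following Python sketch is illustrative only; the helper names `check_manual_esi` and `in_reserved_mac_range` are hypothetical, and the checks cover only the rules stated in this section (10 bytes, a leading 00 byte for NVUE in Cumulus Linux 5.6 and later, and the recommended segment MAC address range). Uniqueness across the fabric is not checked.

```python
import string

_HEX_DIGITS = set(string.hexdigits)


def _is_octet(text):
    """True if text is exactly two hexadecimal digits."""
    return len(text) == 2 and all(c in _HEX_DIGITS for c in text)


def check_manual_esi(esi):
    """Return a list of problems found in a colon-separated 10-byte ESI."""
    octets = esi.split(":")
    problems = []
    if len(octets) != 10:
        problems.append("expected 10 octets, found %d" % len(octets))
    if not all(_is_octet(o) for o in octets):
        problems.append("every octet must be two hexadecimal digits")
    elif octets[0] != "00":
        problems.append("first octet must be 00 (NVUE, Cumulus Linux 5.6 and later)")
    return problems


def in_reserved_mac_range(mac):
    """True if mac is within 44:38:39:ff:00:00 through 44:38:39:ff:ff:ff."""
    octets = mac.lower().split(":")
    return (
        len(octets) == 6
        and all(_is_octet(o) for o in octets)
        and octets[:4] == ["44", "38", "39", "ff"]
    )


if __name__ == "__main__":
    print(check_manual_esi("00:44:38:39:FF:00:AA:00:00:01"))
    print(in_reserved_mac_range("44:38:39:ff:ff:01"))
```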

The following example commands configure each bond interface with the local Ethernet segment ID and the segment MAC address to generate a unique ESI automatically:

  1. Configure the ESI on each bond interface with the local Ethernet segment ID and the segment MAC address:

    cumulus@leaf01:~$ sudo vtysh
    leaf01# configure terminal
    leaf01(config)# interface bond1
    leaf01(config-if)# evpn mh es-df-pref 50000
    leaf01(config-if)# evpn mh es-id 1
    leaf01(config-if)# evpn mh es-sys-mac 44:38:39:FF:00:AA
    leaf01(config-if)# exit
    leaf01(config)# interface bond2
    leaf01(config-if)# evpn mh es-df-pref 50000
    leaf01(config-if)# evpn mh es-id 2
    leaf01(config-if)# evpn mh es-sys-mac 44:38:39:FF:00:AA
    leaf01(config-if)# exit
    leaf01(config)# interface bond3
    leaf01(config-if)# evpn mh es-df-pref 50000
    leaf01(config-if)# evpn mh es-id 3
    leaf01(config-if)# evpn mh es-sys-mac 44:38:39:FF:00:AA
    leaf01(config-if)# exit
    leaf01(config)# write memory
    leaf01(config)# exit
    leaf01# exit
    cumulus@leaf01:~$
    

    The vtysh commands create the following configuration in the /etc/frr/frr.conf file.

    cumulus@leaf01:~$ sudo cat /etc/frr/frr.conf
    ...
    !
    interface bond1
     evpn mh es-df-pref 50000
     evpn mh es-id 1
     evpn mh es-sys-mac 44:38:39:FF:00:AA
    !
    interface bond2
     evpn mh es-df-pref 50000
     evpn mh es-id 2
     evpn mh es-sys-mac 44:38:39:FF:00:AA
    !
    interface bond3
     evpn mh es-df-pref 50000
     evpn mh es-id 3
     evpn mh es-sys-mac 44:38:39:FF:00:AA
    !
    
  2. Add the segment MAC address to the bond interfaces in the /etc/network/interfaces file, then run the ifreload -a command.

    cumulus@leaf01:~$ sudo nano /etc/network/interfaces
    ...
    interface bond1
      bond-slaves swp1
      es-sys-mac 44:38:39:FF:00:AA
       
    interface bond2
      bond-slaves swp2
      es-sys-mac 44:38:39:FF:00:AA
       
    interface bond3
      bond-slaves swp3
      es-sys-mac 44:38:39:FF:00:AA
    
    cumulus@leaf01:~$ sudo ifreload -a
    

The following example commands configure each bond interface with the Ethernet segment ID manually. The ID must be a 10-byte (80-bit) integer and must be unique. When you configure the 10-byte Ethernet segment ID, ensure that the local ID is not present. You must also configure the segment MAC address separately. The example configures a global segment MAC address for use on all the Ethernet segment bonds.

In Cumulus Linux 5.6 and later, NVUE no longer supports a 10-byte ESI value that starts with a hex value other than 00.

  1. Configure each bond interface with the Ethernet segment ID manually:

    cumulus@leaf01:~$ sudo vtysh
    leaf01# configure terminal
    leaf01(config)# interface bond1
    leaf01(config-if)# evpn mh es-df-pref 50000
    leaf01(config-if)# evpn mh es-id 00:44:38:39:FF:00:AA:00:00:01
    leaf01(config-if)# exit
    leaf01(config)# interface bond2
    leaf01(config-if)# evpn mh es-df-pref 50000
    leaf01(config-if)# evpn mh es-id 00:44:38:39:FF:00:AA:00:00:02
    leaf01(config-if)# exit
    leaf01(config)# interface bond3
    leaf01(config-if)# evpn mh es-df-pref 50000
    leaf01(config-if)# evpn mh es-id 00:44:38:39:FF:00:aa:00:00:03
    leaf01(config-if)# exit
    leaf01(config)# write memory
    leaf01(config)# exit
    leaf01# exit
    cumulus@leaf01:~$
    

    The vtysh commands create the following configuration in the /etc/frr/frr.conf file.

    cumulus@leaf01:~$ sudo cat /etc/frr/frr.conf
    ...
    interface bond1
    evpn mh es-df-pref 50000
    evpn mh es-id 00:44:38:39:FF:00:AA:00:00:01
    interface bond2
    evpn mh es-df-pref 50000
    evpn mh es-id 00:44:38:39:FF:00:AA:00:00:02
    interface bond3
    evpn mh es-df-pref 50000
    evpn mh es-id 00:44:38:39:FF:00:AA:00:00:03
    ...
    
  2. Add the segment MAC address to the bond interfaces in the /etc/network/interfaces file, then run the ifreload -a command.

    cumulus@leaf01:~$ sudo nano /etc/network/interfaces
    ...
    interface bond1
      bond-slaves swp1
      es-sys-mac 44:38:39:FF:00:AA
       
    interface bond2
      bond-slaves swp2
      es-sys-mac 44:38:39:FF:00:AA
       
    interface bond3
      bond-slaves swp3
      es-sys-mac 44:38:39:FF:00:AA
    
    cumulus@leaf01:~$ sudo ifreload -a

When all uplinks go down, the VTEP loses connectivity to the VXLAN overlay. To prevent traffic loss, Cumulus Linux tracks the operational state of the uplink. When all the uplinks are down, the Ethernet segment bonds on the switch are in a protodown or error-disabled state. An MH uplink is any routed interface to which the switch routes locally encapsulated VXLAN traffic (after encapsulation) or any routed interface receiving VXLAN traffic (before decapsulation) that the local device decapsulates.

Split-horizon and Designated-Forwarder filters only apply to interfaces that are MH uplinks. If you configure EVPN-MH without MH uplinks, BUM traffic duplicates or loops back to the same ES. This can cause MAC flaps or other issues on multihomed devices.

cumulus@leaf01:~$ nv set interface swp51-54 evpn multihoming uplink on
cumulus@leaf01:~$ nv config apply

If you are configuring EVPN multihoming with EVPN-PIM, be sure to configure PIM on the interfaces.

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# interface swp51
leaf01(config-if)# evpn mh uplink
leaf01(config-if)# exit
leaf01(config)# interface swp52
leaf01(config-if)# evpn mh uplink
leaf01(config-if)# exit
leaf01(config)# interface swp53
leaf01(config-if)# evpn mh uplink
leaf01(config-if)# exit
leaf01(config)# interface swp54
leaf01(config-if)# evpn mh uplink
leaf01(config-if)# exit
leaf01(config)# write memory
leaf01(config)# exit
leaf01# exit
cumulus@leaf01:~$

The vtysh commands create the following configuration in the /etc/frr/frr.conf file:

cumulus@leaf01:~$ sudo cat /etc/frr/frr.conf
...
!
interface swp51
 evpn mh uplink
!
interface swp52
 evpn mh uplink
!
interface swp53
 evpn mh uplink
!
interface swp54
 evpn mh uplink
!
...

To show if uplinks are down, run the nv show interface status command:

cumulus@leaf01:~$ nv show interface status
Interface    Admin Status  Oper Status  Protodown  Protodown Reason
-----------  ------------  -----------  ---------  ----------------
br_default   up            up           disabled
br_l3vni     up            up           disabled
eth0         up            up           disabled
bond3        up            down         disabled
bond4        up            down         disabled
bond5        up            down         disabled
bond6        up            up           disabled
lo           up            unknown      disabled
mgmt         up            up           disabled
swp5         up            down         enabled    frr   <<<< part of bond3 
swp6         up            down         enabled    frr
swp7         up            down         enabled    frr

Optional EVPN MH Configuration

Global Settings

You can set these global settings for EVPN-MH:

To configure a MAC hold time for 1000 seconds, run the following commands:

cumulus@leaf01:~$ nv set evpn multihoming mac-holdtime 1000
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
leaf01# configure terminal
leaf01(config)# evpn mh mac-holdtime 1000
leaf01(config)# exit
leaf01# write memory

The vtysh commands create the following configuration in the /etc/frr/frr.conf file:

cumulus@leaf01:~$ sudo cat /etc/frr/frr.conf
...
evpn mh mac-holdtime 1000

To configure a neighbor hold time for 600 seconds, run the following commands:

cumulus@leaf01:~$ nv set evpn multihoming neighbor-holdtime 600
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
leaf01# configure terminal
leaf01(config)# evpn mh neigh-holdtime 600
leaf01(config)# exit
leaf01# write memory

The vtysh commands create the following configuration in the /etc/frr/frr.conf file:

cumulus@leaf01:~$ sudo cat /etc/frr/frr.conf
...
evpn mh neigh-holdtime 600

To configure a startup delay for 1800 seconds, run the following commands:

cumulus@leaf01:~$ nv set evpn multihoming startup-delay 1800
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
leaf01# configure terminal
leaf01(config)# evpn mh startup-delay 1800
leaf01(config)# exit
leaf01# write memory

The vtysh commands create the following configuration in the /etc/frr/frr.conf file:

cumulus@leaf01:~$ sudo cat /etc/frr/frr.conf
...
evpn mh startup-delay 1800

To disable fast failover of traffic destined to the access port through the VXLAN overlay (for Cumulus VX):

Cumulus Linux does not provide NVUE commands to disable fast failover.

cumulus@leaf01:~$ sudo vtysh
leaf01# configure terminal
leaf01(config)# evpn mh redirect-off
leaf01(config)# exit
leaf01# write memory

The vtysh commands create the following configuration in the /etc/frr/frr.conf file:

cumulus@leaf01:~$ sudo cat /etc/frr/frr.conf
...
evpn mh redirect-off

Enable FRR Debugging

You can add debug statements to the /etc/frr/frr.conf file to debug the Ethernet segments, routes, and routing protocols (via Zebra).

Cumulus Linux does not provide NVUE commands for FRR debugging; however, you can create a snippet to enable FRR debugging. Refer to /etc/frr/frr.conf snippets.

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# debug bgp evpn mh es
leaf01(config)# debug bgp evpn mh route
leaf01(config)# debug bgp zebra
leaf01(config)# debug zebra evpn mh es
leaf01(config)# debug zebra evpn mh mac
leaf01(config)# debug zebra evpn mh neigh
leaf01(config)# debug zebra evpn mh nh
leaf01(config)# debug zebra vxlan
leaf01(config)# write memory
leaf01(config)# exit
leaf01# exit
cumulus@leaf01:~$

The vtysh commands create the following configuration in the /etc/frr/frr.conf file:

cumulus@leaf01:~$ sudo cat /etc/frr/frr.conf
...
!
debug bgp evpn mh es
debug bgp evpn mh route
debug bgp zebra
debug zebra evpn mh es
debug zebra evpn mh mac
debug zebra evpn mh neigh
debug zebra evpn mh nh
debug zebra vxlan
!
...

Fast Failover

When an Ethernet segment link goes down, the attached VTEP notifies all other VTEPs using a single EAD-ES withdraw. Cumulus Linux uses an Ethernet segment bond redirect.

Fast failover also triggers:

Disable Next Hop Group Sharing in the ASIC

When you configure EVPN-MH, container sharing for both layer 2 and layer 3 next hop groups is on by default. You can disable container sharing for faster failover when an Ethernet segment link flaps.

To disable container sharing for layer 2 next hop groups, edit the /etc/cumulus/switchd.conf file, add the evpn.multihoming.shared_l2_groups = FALSE variable, then restart the switchd service:

cumulus@switch:~$ sudo nano /etc/cumulus/switchd.conf
...
evpn.multihoming.shared_l2_groups = FALSE
...
cumulus@switch:~$ sudo systemctl restart switchd.service

To disable container sharing for layer 3 next hop groups, create the /etc/cumulus/switchd.d/switchd_misc.conf file, add the l3_nexthop.shared_ecmp_groups = FALSE variable, then restart the switchd service:

cumulus@switch:~$ sudo nano /etc/cumulus/switchd.d/switchd_misc.conf 
l3_nexthop.shared_ecmp_groups = FALSE
...
cumulus@switch:~$ sudo systemctl restart switchd.service

Disable EAD-per-EVI Route Advertisements

RFC 7432 requires the switch to advertise type-1/EAD (Ethernet Auto-discovery) routes:

Some third-party switch vendors do not advertise EAD-per-EVI routes; they only advertise EAD-per-ES routes. To interoperate with these vendors, you need to disable EAD-per-EVI route advertisements.

To remove the dependency on EAD-per-EVI routes and activate the VTEP upon receiving the EAD-per-ES route:

cumulus@switch:~$ nv set evpn multihoming ead-evi-route rx off
cumulus@switch:~$ nv config apply

To suppress the advertisement of EAD-per-EVI routes, run:

cumulus@switch:~$ nv set evpn multihoming ead-evi-route tx off
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo vtysh
switch# configure terminal
switch(config)# router bgp
switch(config-router)# address-family l2vpn evpn 
switch(config-router-af)# disable-ead-evi-rx
switch(config-router-af)# end
switch# write memory
switch# exit
cumulus@switch:~$

To suppress the advertisement of EAD-per-EVI routes, run:

cumulus@switch:~$ sudo vtysh
switch# configure terminal
switch(config)# router bgp
switch(config-router)# address-family l2vpn evpn 
switch(config-router-af)# disable-ead-evi-tx
switch(config-router-af)# end
switch# write memory
switch# exit
cumulus@switch:~$

Troubleshooting

Use the following commands to troubleshoot your EVPN multihoming configuration.

Show Global EVPN-MH Information

To show global EVPN-MH information, such as the uplink count, startup delay timer, neighbor hold time, and MAC entry hold time, run the NVUE nv show evpn multihoming command:

cumulus@switch:~$ nv show evpn multihoming
                     operational  applied
-------------------  -----------  -------
enable                            on     
mac-holdtime         1080         1080   
neighbor-holdtime    1080         1080   
startup-delay        180          180    
ead-evi-route                            
  rx                              on     
  tx                              on     
segment                                  
  df-preference                   32767  
startup-delay-timer  --:--:--            
uplink-active        2                   
uplink-count         2  

Show Ethernet Segment Information

To show the Ethernet segments across all VNIs, run the nv show evpn multihoming esi command or the vtysh show evpn es command. For example:

cumulus@switch:~$ nv show evpn multihoming esi
ESInterface - Local interface, NHG - Nexthop group ID, DFPref - Designated
forwarder preference, VNICnt - ESI EVPN instances, MacCnt - Mac entries using
this ES as destination, RemoteVTEPs - Remote tunnel Endpoint

ESI                            ESInterface  NHG        DFPref  VNICnt  MacCnt  Flags   RemoteVTEPs
-----------------------------  -----------  ---------  ------  ------  ------  ------  -----------
03:44:38:39:FF:00:aa:00:00:01  bond1        536870913  50000   1       2       local   10.10.10.2
03:44:38:39:FF:00:aa:00:00:02  bond2        536870914  50000   1       2       local   10.10.10.2
03:44:38:39:FF:00:aa:00:00:03  bond3        536870915  50000   1       2       local   10.10.10.2
03:44:38:39:FF:00:bb:00:00:01               536870916  0       0       2       remote  10.10.10.3
       10.10.10.4
cumulus@switch:~$ sudo vtysh
...
switch# show evpn es
Type: B bypass, L local, R remote, N non-DF
ESI                            Type ES-IF                 VTEPs
03:44:38:39:FF:00:aa:00:00:01  LR   bond1                 10.10.10.2
03:44:38:39:FF:00:aa:00:00:02  LR   bond2                 10.10.10.2
03:44:38:39:FF:00:aa:00:00:03  LR   bond3                 10.10.10.2
03:44:38:39:FF:00:bb:00:00:01  R    -                     10.10.10.3,10.10.10.4

You can also show the Ethernet segments across all VNIs with NVUE in JSON format:

cumulus@switch:~$ nv show evpn multihoming esi -o json
{
  "03:44:38:39:FF:00:aa:00:00:01": {
    "df-preference": 50000,
    "flags": {
      "bridge-port": "on",
      "designated-forward": "on",
      "local": "on",
      "nexthop-group-active": "on",
      "oper-up": "on",
      "ready-for-bgp": "on",
      "remote": "on"
    },
    "local-interface": "bond1",
    "mac-count": 2,
    "nexthop-group-id": 536870913,
    "remote-vtep": {
      "10.10.10.2": {
        "df-algorithm": "preference",
        "df-preference": 50000,
        "nexthop-group-id": 268435462
      }
    },
    "vni-count": 1
  },
  "03:44:38:39:FF:00:aa:00:00:02": {
    "df-preference": 50000,
    "flags": {
      "bridge-port": "on",
      "designated-forward": "on",
      "local": "on",
      "nexthop-group-active": "on",
      "oper-up": "on",
      "ready-for-bgp": "on",
      "remote": "on"
    },
    "local-interface": "bond2",
    "mac-count": 2,
    "nexthop-group-id": 536870914,
    "remote-vtep": {
      "10.10.10.2": {
        "df-algorithm": "preference",
        "df-preference": 50000,
        "nexthop-group-id": 268435462
      }
    },
    "vni-count": 1
  },
  "03:44:38:39:FF:00:aa:00:00:03": {
    "df-preference": 50000,
    "flags": {
      "bridge-port": "on",
      "designated-forward": "on",
      "local": "on",
      "nexthop-group-active": "on",
      "oper-up": "on",
      "ready-for-bgp": "on",
      "remote": "on"
    },
    "local-interface": "bond3",
    "mac-count": 2,
    "nexthop-group-id": 536870915,
    "remote-vtep": {
      "10.10.10.2": {
        "df-algorithm": "preference",
        "df-preference": 50000,
        "nexthop-group-id": 268435462
      }
    },
    "vni-count": 1
  },
  "03:44:38:39:FF:00:bb:00:00:01": {
    "df-preference": 0,
    "flags": {
      "nexthop-group-active": "on",
      "remote": "on"
    },
    "mac-count": 2,
    "nexthop-group-id": 536870916,
    "remote-vtep": {
      "10.10.10.3": {
        "nexthop-group-id": 268435461
      },
      "10.10.10.4": {
        "nexthop-group-id": 268435463
      }
    },
    "vni-count": 0
  }
}
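JSON output is convenient for automated checks. The following Python sketch condenses the output of `nv show evpn multihoming esi -o json` into one record per Ethernet segment. It is an illustration only: the function name `summarize_esi` is hypothetical, and the key names (`flags`, `local-interface`, `remote-vtep`, and so on) are taken from the sample output above rather than from a published schema.

```python
import json


def summarize_esi(esi_json):
    """Return one summary dict per ESI from the NVUE JSON text."""
    data = json.loads(esi_json)
    rows = []
    for esi, info in sorted(data.items()):
        flags = info.get("flags", {})
        rows.append({
            "esi": esi,
            "interface": info.get("local-interface"),  # None for remote-only ES
            "local": flags.get("local") == "on",
            "designated_forwarder": flags.get("designated-forward") == "on",
            "remote_vteps": sorted(info.get("remote-vtep", {})),
        })
    return rows


if __name__ == "__main__":
    # Abbreviated copy of two entries from the sample output above.
    sample = """
    {
      "03:44:38:39:FF:00:aa:00:00:01": {
        "df-preference": 50000,
        "flags": {"designated-forward": "on", "local": "on", "remote": "on"},
        "local-interface": "bond1",
        "remote-vtep": {"10.10.10.2": {"df-preference": 50000}}
      },
      "03:44:38:39:FF:00:bb:00:00:01": {
        "df-preference": 0,
        "flags": {"remote": "on"},
        "remote-vtep": {"10.10.10.3": {}, "10.10.10.4": {}}
      }
    }
    """
    for row in summarize_esi(sample):
        print(row)
```

A local Ethernet segment with an empty remote VTEP list might indicate that the peer switch is not advertising the segment, which is worth checking against the ESI configured on the peer.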

To show information about a specific ESI:

cumulus@switch:~$ nv show evpn multihoming esi 03:44:38:39:FF:00:aa:00:00:01
                      operational
--------------------  -----------
df-preference         50000      
local-interface       bond1      
mac-count             2          
nexthop-group-id      5.369e+08  
vni-count             1          
flags                            
  bridge-port         on         
  designated-forward  on         
  local               on         
  oper-up             on         
  ready-for-bgp       on
  remote              on         
[remote-vtep]         10.10.10.2 

Show Ethernet Segment per VNI Information

To display the Ethernet segments learned for each VNI, run the vtysh show evpn es-evi command. For example:

cumulus@switch:~$ sudo vtysh
...
switch# show evpn es-evi
Type: L local, R remote
VNI      ESI                            Type
20       03:44:38:39:FF:00:aa:00:00:02  L   
30       03:44:38:39:FF:00:aa:00:00:03  L   
10       03:44:38:39:FF:00:aa:00:00:01  L 

To show the Ethernet segments for a specific VNI, run the NVUE nv show evpn vni <vni> multihoming esi command. For example:

cumulus@switch:~$ nv show evpn vni 10 multihoming esi
ESI                            Local  Remote
-----------------------------  -----  ------
03:44:38:39:FF:00:aa:00:00:01  yes    no

Show BGP Ethernet Segment Information

To show the Ethernet segments across all VNIs learned through type-1 and type-4 routes, run the NVUE nv show evpn multihoming bgp-info esi command or the vtysh show bgp l2vpn evpn es command. For example:

cumulus@switch:~$ nv show evpn multihoming bgp-info esi
SrcIP - Originator IP, VNICnt - VNI Count, VRFCnt - VRF Count, MACIPCnt - MAC IP
path count, MacGlblCnt - Mac global count, VTEP - Remote VTEP ID, FragID -
Fragments ID
ESI                            RD            SrcIP       VNICnt  VRFCnt  MACIPCnt  MacGlblCnt  Local  Remote  VTEP        FragID
-----------------------------  ------------  ----------  ------  ------  --------  ----------  -----  ------  ----------  ------------
03:44:38:39:FF:00:aa:00:00:01  10.10.10.1:3  10.10.10.1  1       1       3         6           yes    yes     10.10.10.2  10.10.10.1:3
03:44:38:39:FF:00:aa:00:00:02  10.10.10.1:4  10.10.10.1  1       1       2         4           yes    yes     10.10.10.2  10.10.10.1:4
03:44:38:39:FF:00:aa:00:00:03  10.10.10.1:5  10.10.10.1  1       1       2         4           yes    yes     10.10.10.2  10.10.10.1:5
03:44:38:39:FF:00:bb:00:00:01                0.0.0.0     1       1       0         12                 yes     10.10.10.3
                                                                                                              10.10.10.4
03:44:38:39:FF:00:bb:00:00:02                0.0.0.0     1       1       0         0                  yes
03:44:38:39:FF:00:bb:00:00:03                0.0.0.0     1       1       0         0                  yes

cumulus@switch:~$ sudo vtysh
...
switch# show bgp l2vpn evpn es
ES Flags: B - bypass, L local, R remote, I inconsistent
VTEP Flags: E ESR/Type-4, A active nexthop
ESI                            Flags RD                    #VNIs    VTEPs
03:44:38:39:FF:00:aa:00:00:01  LR    10.10.10.1:3          1        10.10.10.2(EA)
03:44:38:39:FF:00:aa:00:00:02  LR    10.10.10.1:4          1        10.10.10.2(EA)
03:44:38:39:FF:00:aa:00:00:03  LR    10.10.10.1:5          1        10.10.10.2(EA)
03:44:38:39:FF:00:bb:00:00:01  R     (null)                1        10.10.10.3(A),10.10.10.4(A)
03:44:38:39:FF:00:bb:00:00:02  R     (null)                1
03:44:38:39:FF:00:bb:00:00:03  R     (null)                1
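The flag columns in the vtysh output above are compact. As a rough illustration, the following Python sketch expands them using the two legend lines printed at the top of the output. The function name decode_es_row and the assumption that every row has whitespace-separated columns with a non-empty Flags column are illustrative choices, not part of Cumulus Linux or FRR.

```python
import re

# Legends copied from the header lines of the sample output.
ES_FLAGS = {"B": "bypass", "L": "local", "R": "remote", "I": "inconsistent"}
VTEP_FLAGS = {"E": "ESR/Type-4", "A": "active nexthop"}

# Matches entries such as 10.10.10.2(EA) in the VTEPs column.
VTEP_RE = re.compile(r"(\d+\.\d+\.\d+\.\d+)\(([A-Z]*)\)")


def decode_es_row(line):
    """Split one data row of 'show bgp l2vpn evpn es' into named fields.

    Assumes the Flags column is never empty, so the first four
    whitespace-separated fields are ESI, flags, RD, and VNI count.
    """
    fields = line.split()
    esi, flags, rd, vni_count = fields[:4]
    vteps_column = fields[4] if len(fields) > 4 else ""
    return {
        "esi": esi,
        "flags": [ES_FLAGS[f] for f in flags],
        "rd": None if rd == "(null)" else rd,
        "vni_count": int(vni_count),
        "vteps": {
            ip: [VTEP_FLAGS[f] for f in vtep_flags]
            for ip, vtep_flags in VTEP_RE.findall(vteps_column)
        },
    }


# Three rows copied from the sample output.
rows = [
    "03:44:38:39:FF:00:aa:00:00:01  LR    10.10.10.1:3          1        10.10.10.2(EA)",
    "03:44:38:39:FF:00:bb:00:00:01  R     (null)                1        10.10.10.3(A),10.10.10.4(A)",
    "03:44:38:39:FF:00:bb:00:00:02  R     (null)                1",
]
decoded = [decode_es_row(row) for row in rows]
for entry in decoded:
    print(entry["esi"], entry["flags"], entry["vteps"])
```

A row without any VTEPs, such as the last one, yields an empty vteps dictionary.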

You can also show the Ethernet segments across all VNIs learned through type-1 and type-4 routes in JSON format by adding the -o json option to the NVUE command:

cumulus@switch:~$ nv show evpn multihoming bgp-info esi -o json
{
  "03:44:38:39:FF:00:aa:00:00:01": {
    "es-df-preference": 50000,
    "flags": {
      "advertise-evi": "on",
      "up": "on"
    },
    "fragments": {
      "10.10.10.1:3": {
        "evi-count": 1
      }
    },
    "inconsistent-vni-count": 0,
    "macip-global-path-count": 8,
    "macip-path-count": 4,
    "originator-ip": "10.10.10.1",
    "rd": "10.10.10.1:3",
    "remote-vtep": {
      "10.10.10.2": {
        "df-algorithm": "preference",
        "df-preference": 50000,
        "flags": {
          "active": "on",
          "esr": "on"
        }
      }
    },
    "type": {
      "local": "on",
      "remote": "on"
    },
    "vni-count": 1,
    "vrf-count": 1
  },
  "03:44:38:39:FF:00:aa:00:00:02": {
    "es-df-preference": 50000,
    "flags": {
      "advertise-evi": "on",
      "up": "on"
    },
    "fragments": {
      "10.10.10.1:4": {
        "evi-count": 1
      }
    },
    "inconsistent-vni-count": 0,
    "macip-global-path-count": 6,
    "macip-path-count": 3,
    "originator-ip": "10.10.10.1",
    "rd": "10.10.10.1:4",
    "remote-vtep": {
      "10.10.10.2": {
        "df-algorithm": "preference",
        "df-preference": 50000,
        "flags": {
          "active": "on",
          "esr": "on"
        }
      }
    },
    "type": {
      "local": "on",
      "remote": "on"
    },
    "vni-count": 1,
    "vrf-count": 1
  },
  "03:44:38:39:FF:00:aa:00:00:03": {
    "es-df-preference": 50000,
    "flags": {
      "advertise-evi": "on",
      "up": "on"
    },
    "fragments": {
      "10.10.10.1:5": {
        "evi-count": 1
      }
    },
    "inconsistent-vni-count": 0,
    "macip-global-path-count": 6,
    "macip-path-count": 3,
    "originator-ip": "10.10.10.1",
    "rd": "10.10.10.1:5",
    "remote-vtep": {
      "10.10.10.2": {
        "df-algorithm": "preference",
        "df-preference": 50000,
        "flags": {
          "active": "on",
          "esr": "on"
        }
      }
    },
    "type": {
      "local": "on",
      "remote": "on"
    },
    "vni-count": 1,
    "vrf-count": 1
  },
  "03:44:38:39:FF:00:bb:00:00:01": {
    "inconsistent-vni-count": 0,
    "macip-global-path-count": 16,
    "macip-path-count": 0,
    "originator-ip": "0.0.0.0",
    "remote-vtep": {
      "10.10.10.3": {
        "flags": {
          "active": "on"
        }
      },
      "10.10.10.4": {
        "flags": {
          "active": "on"
        }
      }
    },
    "type": {
      "remote": "on"
    },
    "vni-count": 1,
    "vrf-count": 1
  },
  "03:44:38:39:FF:00:bb:00:00:02": {
    "inconsistent-vni-count": 0,
    "macip-global-path-count": 0,
    "macip-path-count": 0,
    "originator-ip": "0.0.0.0",
    "type": {
      "remote": "on"
    },
    "vni-count": 1,
    "vrf-count": 1
  },
  "03:44:38:39:FF:00:bb:00:00:03": {
    "inconsistent-vni-count": 0,
    "macip-global-path-count": 0,
    "macip-path-count": 0,
    "originator-ip": "0.0.0.0",
    "type": {
      "remote": "on"
    },
    "vni-count": 1,
    "vrf-count": 1
  }
}
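The JSON form is convenient for scripts. The following Python sketch shows one possible way to reduce it to a per-ESI summary. The SAMPLE string is a trimmed copy of two entries from the output above, and the summarize function and its output keys are hypothetical names chosen for this example.

```python
import json

# Trimmed copy of two entries from the sample output; only keys used below are kept.
SAMPLE = """
{
  "03:44:38:39:FF:00:aa:00:00:01": {
    "es-df-preference": 50000,
    "macip-path-count": 4,
    "rd": "10.10.10.1:3",
    "remote-vtep": {
      "10.10.10.2": {"df-preference": 50000, "flags": {"active": "on", "esr": "on"}}
    },
    "type": {"local": "on", "remote": "on"}
  },
  "03:44:38:39:FF:00:bb:00:00:02": {
    "macip-path-count": 0,
    "type": {"remote": "on"}
  }
}
"""


def summarize(es_table):
    """Return one summary dict per ESI, tolerating keys that remote-only entries omit."""
    summary = {}
    for esi, info in es_table.items():
        es_type = info.get("type", {})
        remote_vteps = info.get("remote-vtep", {})
        summary[esi] = {
            "local": es_type.get("local") == "on",
            "remote": es_type.get("remote") == "on",
            "rd": info.get("rd"),
            # Remote VTEPs whose "active" flag is on, in sorted order.
            "active_vteps": sorted(
                ip for ip, vtep in remote_vteps.items()
                if vtep.get("flags", {}).get("active") == "on"
            ),
        }
    return summary


result = summarize(json.loads(SAMPLE))
for esi, item in result.items():
    print(esi, item)
```

On a switch, the same function could take the parsed output of nv show evpn multihoming bgp-info esi -o json instead of the embedded sample.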

Show BGP Ethernet Segment per VNI Information

To display the Ethernet segments per VNI learned through type-1 and type-4 routes, run the vtysh show bgp l2vpn evpn es-evi command. For example:

cumulus@switch:~$ sudo vtysh
...
switch# show bgp l2vpn evpn es-evi
Flags: L local, R remote, I inconsistent
VTEP-Flags: E EAD-per-ES, V EAD-per-EVI
VNI      ESI                            Flags VTEPs
20       03:44:38:39:FF:00:aa:00:00:02  LR    10.10.10.2(V)
20       03:44:38:39:FF:00:bb:00:00:02  R     10.10.10.3(V),10.10.10.4(V)
30       03:44:38:39:FF:00:aa:00:00:03  LR    10.10.10.2(V)
30       03:44:38:39:FF:00:bb:00:00:03  R     10.10.10.3(V),10.10.10.4(V)
10       03:44:38:39:FF:00:aa:00:00:01  LR    10.10.10.2(V)
10       03:44:38:39:FF:00:bb:00:00:01  R     10.10.10.3(V),10.10.10.4(V)
...
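Because each VNI appears on several rows, it can help to regroup the output by VNI. The following Python sketch does that for a few rows copied from the sample above. The helper name group_by_vni is hypothetical, and the sketch assumes that data rows are the only lines that start with a numeric VNI.

```python
from collections import defaultdict


def group_by_vni(lines):
    """Map each VNI to its Ethernet segments, keeping the flags and VTEP list per ES."""
    table = defaultdict(list)
    for line in lines:
        fields = line.split()
        # Data rows start with a numeric VNI; legend and header lines do not.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        vni, esi, flags = fields[:3]
        vteps = fields[3].split(",") if len(fields) > 3 else []
        table[int(vni)].append({"esi": esi, "flags": flags, "vteps": vteps})
    return dict(table)


# Legend, header, and three data rows copied from the sample output.
output = """\
Flags: L local, R remote, I inconsistent
VTEP-Flags: E EAD-per-ES, V EAD-per-EVI
VNI      ESI                            Flags VTEPs
20       03:44:38:39:FF:00:aa:00:00:02  LR    10.10.10.2(V)
20       03:44:38:39:FF:00:bb:00:00:02  R     10.10.10.3(V),10.10.10.4(V)
10       03:44:38:39:FF:00:aa:00:00:01  LR    10.10.10.2(V)
"""
by_vni = group_by_vni(output.splitlines())
for vni in sorted(by_vni):
    for es in by_vni[vni]:
        print(vni, es["esi"], es["flags"], es["vteps"])
```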

Show EAD Route Types

To view type-1 EAD routes, run the vtysh show bgp l2vpn evpn route type ead command. For example:

cumulus@switch:~$ sudo vtysh
...
switch# show bgp l2vpn evpn route type ead
BGP table version is 3, local router ID is 10.10.10.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-1 prefix: [1]:[ESI]:[EthTag]:[IPlen]:[VTEP-IP]
EVPN type-2 prefix: [2]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
EVPN type-4 prefix: [4]:[ESI]:[IPlen]:[OrigIP]
EVPN type-5 prefix: [5]:[EthTag]:[IPlen]:[IP]

   Network          Next Hop            Metric LocPrf Weight Path
                    Extended Community
Route Distinguisher: 10.10.10.1:2
*> [1]:[0]:[03:44:38:39:FF:00:aa:00:00:02]:[128]:[0.0.0.0]
                    10.10.10.1                         32768 i
                    ET:8 RT:65101:20
Route Distinguisher: 10.10.10.1:6
*> [1]:[0]:[03:44:38:39:FF:00:aa:00:00:03]:[128]:[0.0.0.0]
                    10.10.10.1                         32768 i
                    ET:8 RT:65101:30
Route Distinguisher: 10.10.10.1:7
*> [1]:[0]:[03:44:38:39:FF:00:aa:00:00:01]:[128]:[0.0.0.0]
                    10.10.10.1                         32768 i
                    ET:8 RT:65101:10
Route Distinguisher: 10.10.10.2:2
*> [1]:[0]:[03:44:38:39:FF:00:aa:00:00:02]:[32]:[0.0.0.0]
                    10.10.10.2                             0 65199 65102 i
                    RT:65102:20 ET:8
Route Distinguisher: 10.10.10.2:6
*> [1]:[0]:[03:44:38:39:FF:00:aa:00:00:03]:[32]:[0.0.0.0]
                    10.10.10.2                             0 65199 65102 i
                    RT:65102:30 ET:8
Route Distinguisher: 10.10.10.2:7
*> [1]:[0]:[03:44:38:39:FF:00:aa:00:00:01]:[32]:[0.0.0.0]
                    10.10.10.2                             0 65199 65102 i
                    RT:65102:10 ET:8
Route Distinguisher: 10.10.10.3:2
*> [1]:[0]:[03:44:38:39:FF:00:bb:00:00:02]:[32]:[0.0.0.0]
                    10.10.10.3                             0 65199 65103 i
                    RT:65103:20 ET:8
...
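To work with these routes in a script, you can split each type-1 prefix into its fields. The Python sketch below is one way to do that. Note that the route lines in the sample output show the Ethernet tag before the ESI, although the legend line lists ESI first; the regular expression follows the route lines. The names EAD_RE and parse_ead_routes are hypothetical.

```python
import re

# Field order follows the route lines in the sample output:
# [1]:[EthTag]:[ESI]:[IPlen]:[VTEP-IP]
EAD_RE = re.compile(
    r"\[1\]:\[(?P<eth_tag>\d+)\]:\[(?P<esi>[0-9A-Fa-f:]+)\]:"
    r"\[(?P<ip_len>\d+)\]:\[(?P<vtep_ip>[0-9A-Fa-f.:]+)\]"
)


def parse_ead_routes(lines):
    """Return one dict per type-1 route, tagged with the preceding route distinguisher."""
    routes = []
    rd = None
    for line in lines:
        text = line.strip()
        if text.startswith("Route Distinguisher:"):
            # Everything after the first colon is the RD, for example 10.10.10.1:2.
            rd = text.split(":", 1)[1].strip()
            continue
        match = EAD_RE.search(text)
        if match:
            route = match.groupdict()
            route["rd"] = rd
            route["eth_tag"] = int(route["eth_tag"])
            route["ip_len"] = int(route["ip_len"])
            routes.append(route)
    return routes


# Two route entries copied from the sample output.
sample = """\
Route Distinguisher: 10.10.10.1:2
*> [1]:[0]:[03:44:38:39:FF:00:aa:00:00:02]:[128]:[0.0.0.0]
                    10.10.10.1                         32768 i
                    ET:8 RT:65101:20
Route Distinguisher: 10.10.10.2:2
*> [1]:[0]:[03:44:38:39:FF:00:aa:00:00:02]:[32]:[0.0.0.0]
                    10.10.10.2                             0 65199 65102 i
                    RT:65102:20 ET:8
"""
routes = parse_ead_routes(sample.splitlines())
for route in routes:
    print(route["rd"], route["esi"], route["ip_len"])
```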

Considerations

If you enable EVPN-MH and configure VLAN match rules in ebtables with a mark target, the ebtables rule might overwrite the mark that the EVPN-MH traffic class rules set on ingress. As a result, egress EVPN-MH traffic class rules that match the ingress traffic class mark might not get hit. To work around this issue, add ebtables rules to ACCEPT the packets that the EVPN-MH traffic class rules already marked on ingress.

Configuration Example

The following configuration examples use the topology illustrated below and configure EVPN multihoming with head end replication using single VXLAN devices. The examples provide configuration for server01 through server04. The configuration for server05 and server06 is not included for simplicity.

cumulus@leaf01:~$ nv set interface lo ip address 10.10.10.1/32
cumulus@leaf01:~$ nv set interface swp1-3,swp51-52
cumulus@leaf01:~$ nv set interface bond1 bond member swp1
cumulus@leaf01:~$ nv set interface bond2 bond member swp2
cumulus@leaf01:~$ nv set interface bond3 bond member swp3
cumulus@leaf01:~$ nv set interface bond1 bond lacp-bypass on
cumulus@leaf01:~$ nv set interface bond2 bond lacp-bypass on
cumulus@leaf01:~$ nv set interface bond3 bond lacp-bypass on
cumulus@leaf01:~$ nv set interface bond1 link mtu 9000
cumulus@leaf01:~$ nv set interface bond2 link mtu 9000
cumulus@leaf01:~$ nv set interface bond3 link mtu 9000
cumulus@leaf01:~$ nv set interface bond1-3 bridge domain br_default
cumulus@leaf01:~$ nv set interface bond1 bridge domain br_default access 10
cumulus@leaf01:~$ nv set interface bond2 bridge domain br_default access 20
cumulus@leaf01:~$ nv set interface bond3 bridge domain br_default access 30
cumulus@leaf01:~$ nv set bridge domain br_default vlan 10,20,30
cumulus@leaf01:~$ nv set interface vlan10 ip address 10.1.10.2/24
cumulus@leaf01:~$ nv set interface vlan10 ip vrr address 10.1.10.1/24
cumulus@leaf01:~$ nv set interface vlan10 ip vrr state up
cumulus@leaf01:~$ nv set interface vlan20 ip address 10.1.20.2/24
cumulus@leaf01:~$ nv set interface vlan20 ip vrr address 10.1.20.1/24
cumulus@leaf01:~$ nv set interface vlan20 ip vrr state up
cumulus@leaf01:~$ nv set interface vlan30 ip address 10.1.30.2/24
cumulus@leaf01:~$ nv set interface vlan30 ip vrr address 10.1.30.1/24
cumulus@leaf01:~$ nv set interface vlan30 ip vrr state up
cumulus@leaf01:~$ nv set vrf RED
cumulus@leaf01:~$ nv set vrf BLUE
cumulus@leaf01:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf01:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf01:~$ nv set bridge domain br_default vlan 30 vni 30
cumulus@leaf01:~$ nv set interface vlan10 ip vrf RED
cumulus@leaf01:~$ nv set interface vlan20 ip vrf RED
cumulus@leaf01:~$ nv set interface vlan30 ip vrf BLUE
cumulus@leaf01:~$ nv set nve vxlan source address 10.10.10.1
cumulus@leaf01:~$ nv set nve vxlan arp-nd-suppress on 
cumulus@leaf01:~$ nv set vrf RED evpn vni 4001
cumulus@leaf01:~$ nv set vrf BLUE evpn vni 4002
cumulus@leaf01:~$ nv set evpn enable on
cumulus@leaf01:~$ nv set router bgp autonomous-system 65101
cumulus@leaf01:~$ nv set router bgp router-id 10.10.10.1
cumulus@leaf01:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 peer-group underlay
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp52 peer-group underlay
cumulus@leaf01:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@leaf01:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf01:~$ nv set vrf RED router bgp autonomous-system 65101
cumulus@leaf01:~$ nv set vrf RED router bgp router-id 10.10.10.1
cumulus@leaf01:~$ nv set vrf RED router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf01:~$ nv set vrf RED router bgp address-family ipv4-unicast route-export to-evpn
cumulus@leaf01:~$ nv set vrf BLUE router bgp autonomous-system 65101
cumulus@leaf01:~$ nv set vrf BLUE router bgp router-id 10.10.10.1
cumulus@leaf01:~$ nv set vrf BLUE router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf01:~$ nv set vrf BLUE router bgp address-family ipv4-unicast route-export to-evpn
cumulus@leaf01:~$ nv set evpn multihoming enable on
cumulus@leaf01:~$ nv set interface bond1 evpn multihoming segment local-id 1
cumulus@leaf01:~$ nv set interface bond2 evpn multihoming segment local-id 2
cumulus@leaf01:~$ nv set interface bond3 evpn multihoming segment local-id 3
cumulus@leaf01:~$ nv set interface bond1-3 evpn multihoming segment mac-address 44:38:39:FF:00:AA
cumulus@leaf01:~$ nv set interface bond1-3 evpn multihoming segment df-preference 50000
cumulus@leaf01:~$ nv set interface swp51-52 evpn multihoming uplink on
cumulus@leaf01:~$ nv config apply
cumulus@leaf02:~$ nv set interface lo ip address 10.10.10.2/32
cumulus@leaf02:~$ nv set interface swp1-3,swp51-52
cumulus@leaf02:~$ nv set interface bond1 bond member swp1
cumulus@leaf02:~$ nv set interface bond2 bond member swp2
cumulus@leaf02:~$ nv set interface bond3 bond member swp3
cumulus@leaf02:~$ nv set interface bond1 bond lacp-bypass on
cumulus@leaf02:~$ nv set interface bond2 bond lacp-bypass on
cumulus@leaf02:~$ nv set interface bond3 bond lacp-bypass on
cumulus@leaf02:~$ nv set interface bond1 link mtu 9000
cumulus@leaf02:~$ nv set interface bond2 link mtu 9000
cumulus@leaf02:~$ nv set interface bond3 link mtu 9000
cumulus@leaf02:~$ nv set interface bond1-3 bridge domain br_default
cumulus@leaf02:~$ nv set interface bond1 bridge domain br_default access 10
cumulus@leaf02:~$ nv set interface bond2 bridge domain br_default access 20
cumulus@leaf02:~$ nv set interface bond3 bridge domain br_default access 30
cumulus@leaf02:~$ nv set bridge domain br_default vlan 10,20,30
cumulus@leaf02:~$ nv set interface vlan10 ip address 10.1.10.3/24
cumulus@leaf02:~$ nv set interface vlan10 ip vrr address 10.1.10.1/24
cumulus@leaf02:~$ nv set interface vlan10 ip vrr state up
cumulus@leaf02:~$ nv set interface vlan20 ip address 10.1.20.3/24
cumulus@leaf02:~$ nv set interface vlan20 ip vrr address 10.1.20.1/24
cumulus@leaf02:~$ nv set interface vlan20 ip vrr state up
cumulus@leaf02:~$ nv set interface vlan30 ip address 10.1.30.3/24
cumulus@leaf02:~$ nv set interface vlan30 ip vrr address 10.1.30.1/24
cumulus@leaf02:~$ nv set interface vlan30 ip vrr state up
cumulus@leaf02:~$ nv set vrf RED
cumulus@leaf02:~$ nv set vrf BLUE
cumulus@leaf02:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf02:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf02:~$ nv set bridge domain br_default vlan 30 vni 30
cumulus@leaf02:~$ nv set interface vlan10 ip vrf RED
cumulus@leaf02:~$ nv set interface vlan20 ip vrf RED
cumulus@leaf02:~$ nv set interface vlan30 ip vrf BLUE
cumulus@leaf02:~$ nv set nve vxlan source address 10.10.10.2
cumulus@leaf02:~$ nv set nve vxlan arp-nd-suppress on 
cumulus@leaf02:~$ nv set vrf RED evpn vni 4001
cumulus@leaf02:~$ nv set vrf BLUE evpn vni 4002
cumulus@leaf02:~$ nv set evpn enable on
cumulus@leaf02:~$ nv set router bgp autonomous-system 65102
cumulus@leaf02:~$ nv set router bgp router-id 10.10.10.2
cumulus@leaf02:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@leaf02:~$ nv set vrf default router bgp neighbor swp51 peer-group underlay
cumulus@leaf02:~$ nv set vrf default router bgp neighbor swp52 peer-group underlay
cumulus@leaf02:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@leaf02:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf02:~$ nv set vrf RED router bgp autonomous-system 65102
cumulus@leaf02:~$ nv set vrf RED router bgp router-id 10.10.10.2
cumulus@leaf02:~$ nv set vrf RED router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf02:~$ nv set vrf RED router bgp address-family ipv4-unicast route-export to-evpn
cumulus@leaf02:~$ nv set vrf BLUE router bgp autonomous-system 65102
cumulus@leaf02:~$ nv set vrf BLUE router bgp router-id 10.10.10.2
cumulus@leaf02:~$ nv set vrf BLUE router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf02:~$ nv set vrf BLUE router bgp address-family ipv4-unicast route-export to-evpn
cumulus@leaf02:~$ nv set evpn multihoming enable on
cumulus@leaf02:~$ nv set interface bond1 evpn multihoming segment local-id 1
cumulus@leaf02:~$ nv set interface bond2 evpn multihoming segment local-id 2
cumulus@leaf02:~$ nv set interface bond3 evpn multihoming segment local-id 3
cumulus@leaf02:~$ nv set interface bond1-3 evpn multihoming segment mac-address 44:38:39:FF:00:AA
cumulus@leaf02:~$ nv set interface bond1-3 evpn multihoming segment df-preference 50000
cumulus@leaf02:~$ nv set interface swp51-52 evpn multihoming uplink on
cumulus@leaf02:~$ nv config apply
cumulus@leaf03:~$ nv set interface lo ip address 10.10.10.3/32
cumulus@leaf03:~$ nv set interface swp1-3,swp51-52
cumulus@leaf03:~$ nv set interface bond1 bond member swp1
cumulus@leaf03:~$ nv set interface bond2 bond member swp2
cumulus@leaf03:~$ nv set interface bond3 bond member swp3
cumulus@leaf03:~$ nv set interface bond1 bond lacp-bypass on
cumulus@leaf03:~$ nv set interface bond2 bond lacp-bypass on
cumulus@leaf03:~$ nv set interface bond3 bond lacp-bypass on
cumulus@leaf03:~$ nv set interface bond1 link mtu 9000
cumulus@leaf03:~$ nv set interface bond2 link mtu 9000
cumulus@leaf03:~$ nv set interface bond3 link mtu 9000
cumulus@leaf03:~$ nv set interface bond1-3 bridge domain br_default
cumulus@leaf03:~$ nv set interface bond1 bridge domain br_default access 10
cumulus@leaf03:~$ nv set interface bond2 bridge domain br_default access 20
cumulus@leaf03:~$ nv set interface bond3 bridge domain br_default access 30
cumulus@leaf03:~$ nv set bridge domain br_default vlan 10,20,30
cumulus@leaf03:~$ nv set interface vlan10 ip address 10.1.10.4/24
cumulus@leaf03:~$ nv set interface vlan10 ip vrr address 10.1.10.1/24
cumulus@leaf03:~$ nv set interface vlan10 ip vrr state up
cumulus@leaf03:~$ nv set interface vlan20 ip address 10.1.20.4/24
cumulus@leaf03:~$ nv set interface vlan20 ip vrr address 10.1.20.1/24
cumulus@leaf03:~$ nv set interface vlan20 ip vrr state up
cumulus@leaf03:~$ nv set interface vlan30 ip address 10.1.30.4/24
cumulus@leaf03:~$ nv set interface vlan30 ip vrr address 10.1.30.1/24
cumulus@leaf03:~$ nv set interface vlan30 ip vrr state up
cumulus@leaf03:~$ nv set vrf RED
cumulus@leaf03:~$ nv set vrf BLUE
cumulus@leaf03:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf03:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf03:~$ nv set bridge domain br_default vlan 30 vni 30
cumulus@leaf03:~$ nv set interface vlan10 ip vrf RED
cumulus@leaf03:~$ nv set interface vlan20 ip vrf RED
cumulus@leaf03:~$ nv set interface vlan30 ip vrf BLUE
cumulus@leaf03:~$ nv set nve vxlan source address 10.10.10.3
cumulus@leaf03:~$ nv set nve vxlan arp-nd-suppress on 
cumulus@leaf03:~$ nv set vrf RED evpn vni 4001
cumulus@leaf03:~$ nv set vrf BLUE evpn vni 4002
cumulus@leaf03:~$ nv set evpn enable on
cumulus@leaf03:~$ nv set router bgp autonomous-system 65103
cumulus@leaf03:~$ nv set router bgp router-id 10.10.10.3
cumulus@leaf03:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@leaf03:~$ nv set vrf default router bgp neighbor swp51 peer-group underlay
cumulus@leaf03:~$ nv set vrf default router bgp neighbor swp52 peer-group underlay
cumulus@leaf03:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@leaf03:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf03:~$ nv set vrf RED router bgp autonomous-system 65103
cumulus@leaf03:~$ nv set vrf RED router bgp router-id 10.10.10.3
cumulus@leaf03:~$ nv set vrf RED router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf03:~$ nv set vrf RED router bgp address-family ipv4-unicast route-export to-evpn
cumulus@leaf03:~$ nv set vrf BLUE router bgp autonomous-system 65103
cumulus@leaf03:~$ nv set vrf BLUE router bgp router-id 10.10.10.3
cumulus@leaf03:~$ nv set vrf BLUE router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf03:~$ nv set vrf BLUE router bgp address-family ipv4-unicast route-export to-evpn
cumulus@leaf03:~$ nv set evpn multihoming enable on
cumulus@leaf03:~$ nv set interface bond1 evpn multihoming segment local-id 1
cumulus@leaf03:~$ nv set interface bond2 evpn multihoming segment local-id 2
cumulus@leaf03:~$ nv set interface bond3 evpn multihoming segment local-id 3
cumulus@leaf03:~$ nv set interface bond1-3 evpn multihoming segment mac-address 44:38:39:FF:00:BB
cumulus@leaf03:~$ nv set interface bond1-3 evpn multihoming segment df-preference 50000
cumulus@leaf03:~$ nv set interface swp51-52 evpn multihoming uplink on
cumulus@leaf03:~$ nv config apply
cumulus@leaf04:~$ nv set interface lo ip address 10.10.10.4/32
cumulus@leaf04:~$ nv set interface swp1-3,swp51-52
cumulus@leaf04:~$ nv set interface bond1 bond member swp1
cumulus@leaf04:~$ nv set interface bond2 bond member swp2
cumulus@leaf04:~$ nv set interface bond3 bond member swp3
cumulus@leaf04:~$ nv set interface bond1 bond lacp-bypass on
cumulus@leaf04:~$ nv set interface bond2 bond lacp-bypass on
cumulus@leaf04:~$ nv set interface bond3 bond lacp-bypass on
cumulus@leaf04:~$ nv set interface bond1 link mtu 9000
cumulus@leaf04:~$ nv set interface bond2 link mtu 9000
cumulus@leaf04:~$ nv set interface bond3 link mtu 9000
cumulus@leaf04:~$ nv set interface bond1-3 bridge domain br_default
cumulus@leaf04:~$ nv set interface bond1 bridge domain br_default access 10
cumulus@leaf04:~$ nv set interface bond2 bridge domain br_default access 20
cumulus@leaf04:~$ nv set interface bond3 bridge domain br_default access 30
cumulus@leaf04:~$ nv set bridge domain br_default vlan 10,20,30
cumulus@leaf04:~$ nv set interface vlan10 ip address 10.1.10.5/24
cumulus@leaf04:~$ nv set interface vlan10 ip vrr address 10.1.10.1/24
cumulus@leaf04:~$ nv set interface vlan10 ip vrr state up
cumulus@leaf04:~$ nv set interface vlan20 ip address 10.1.20.5/24
cumulus@leaf04:~$ nv set interface vlan20 ip vrr address 10.1.20.1/24
cumulus@leaf04:~$ nv set interface vlan20 ip vrr state up
cumulus@leaf04:~$ nv set interface vlan30 ip address 10.1.30.5/24
cumulus@leaf04:~$ nv set interface vlan30 ip vrr address 10.1.30.1/24
cumulus@leaf04:~$ nv set interface vlan30 ip vrr state up
cumulus@leaf04:~$ nv set vrf RED
cumulus@leaf04:~$ nv set vrf BLUE
cumulus@leaf04:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf04:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf04:~$ nv set bridge domain br_default vlan 30 vni 30
cumulus@leaf04:~$ nv set interface vlan10 ip vrf RED
cumulus@leaf04:~$ nv set interface vlan20 ip vrf RED
cumulus@leaf04:~$ nv set interface vlan30 ip vrf BLUE
cumulus@leaf04:~$ nv set nve vxlan source address 10.10.10.4
cumulus@leaf04:~$ nv set nve vxlan arp-nd-suppress on 
cumulus@leaf04:~$ nv set vrf RED evpn vni 4001
cumulus@leaf04:~$ nv set vrf BLUE evpn vni 4002
cumulus@leaf04:~$ nv set evpn enable on
cumulus@leaf04:~$ nv set router bgp autonomous-system 65104
cumulus@leaf04:~$ nv set router bgp router-id 10.10.10.4
cumulus@leaf04:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@leaf04:~$ nv set vrf default router bgp neighbor swp51 peer-group underlay
cumulus@leaf04:~$ nv set vrf default router bgp neighbor swp52 peer-group underlay
cumulus@leaf04:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@leaf04:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf04:~$ nv set vrf RED router bgp autonomous-system 65104
cumulus@leaf04:~$ nv set vrf RED router bgp router-id 10.10.10.4
cumulus@leaf04:~$ nv set vrf RED router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf04:~$ nv set vrf RED router bgp address-family ipv4-unicast route-export to-evpn
cumulus@leaf04:~$ nv set vrf BLUE router bgp autonomous-system 65104
cumulus@leaf04:~$ nv set vrf BLUE router bgp router-id 10.10.10.4
cumulus@leaf04:~$ nv set vrf BLUE router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf04:~$ nv set vrf BLUE router bgp address-family ipv4-unicast route-export to-evpn
cumulus@leaf04:~$ nv set evpn multihoming enable on
cumulus@leaf04:~$ nv set interface bond1 evpn multihoming segment local-id 1
cumulus@leaf04:~$ nv set interface bond2 evpn multihoming segment local-id 2
cumulus@leaf04:~$ nv set interface bond3 evpn multihoming segment local-id 3
cumulus@leaf04:~$ nv set interface bond1-3 evpn multihoming segment mac-address 44:38:39:FF:00:BB
cumulus@leaf04:~$ nv set interface bond1-3 evpn multihoming segment df-preference 50000
cumulus@leaf04:~$ nv set interface swp51-52 evpn multihoming uplink on
cumulus@leaf04:~$ nv config apply
cumulus@spine01:~$ nv set interface lo ip address 10.10.10.101/32
cumulus@spine01:~$ nv set interface swp1-4
cumulus@spine01:~$ nv set router bgp autonomous-system 65199
cumulus@spine01:~$ nv set router bgp router-id 10.10.10.101
cumulus@spine01:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp1 peer-group underlay
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp2 peer-group underlay
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp3 peer-group underlay
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp4 peer-group underlay
cumulus@spine01:~$ nv set vrf default router bgp address-family l2vpn-evpn enable on
cumulus@spine01:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@spine01:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@spine01:~$ nv config apply
cumulus@spine02:~$ nv set interface lo ip address 10.10.10.102/32
cumulus@spine02:~$ nv set interface swp1-4
cumulus@spine02:~$ nv set router bgp autonomous-system 65199
cumulus@spine02:~$ nv set router bgp router-id 10.10.10.102
cumulus@spine02:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp1 peer-group underlay
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp2 peer-group underlay
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp3 peer-group underlay
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp4 peer-group underlay
cumulus@spine02:~$ nv set vrf default router bgp address-family l2vpn-evpn enable on
cumulus@spine02:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@spine02:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@spine02:~$ nv config apply
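The segment local-id and mac-address values in the commands above line up with the ESIs in the earlier show command output (for example, 03:44:38:39:FF:00:aa:00:00:01 for bond1 on leaf01). This matches the type-3 ESI layout in RFC 7432: a type byte of 03, the 6-byte MAC address, then a 3-byte local discriminator. The Python sketch below reproduces that layout. The function name type3_esi is hypothetical, and the lowercase output format is a choice made for this example, not the switch's formatting.

```python
def type3_esi(mac_address, local_id):
    """Build a 10-byte type-3 ESI string from a segment MAC address and local ID."""
    mac_bytes = bytes(int(part, 16) for part in mac_address.split(":"))
    if len(mac_bytes) != 6:
        raise ValueError("expected a 6-byte MAC address")
    if not 0 <= local_id < 2 ** 24:
        raise ValueError("local ID must fit in 3 bytes")
    # Type byte 0x03, then the MAC address, then the local ID as 3 big-endian bytes.
    raw = bytes([0x03]) + mac_bytes + local_id.to_bytes(3, "big")
    return ":".join(f"{byte:02x}" for byte in raw)


# Values taken from the leaf01/leaf02 and leaf03/leaf04 commands above.
esi_bond1_rack1 = type3_esi("44:38:39:FF:00:AA", 1)
esi_bond3_rack2 = type3_esi("44:38:39:FF:00:BB", 3)
print(esi_bond1_rack1)
print(esi_bond3_rack2)
```

Because leaf01 and leaf02 use the same MAC address and local IDs, they derive the same ESI for each bond, which is what places both switches in the same Ethernet segment.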
cumulus@leaf01:~$ cat /etc/nvue.d/startup.yaml
- set:
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
            '30':
              vni:
                '30': {}
    evpn:
      enable: on
      multihoming:
        enable: on
    interface:
      bond1:
        bond:
          lacp-bypass: on
          member:
            swp1: {}
        bridge:
          domain:
            br_default:
              access: 10
        evpn:
          multihoming:
            segment:
              df-preference: 50000
              enable: on
              local-id: 1
              mac-address: 44:38:39:FF:00:AA
        link:
          mtu: 9000
        type: bond
      bond2:
        bond:
          lacp-bypass: on
          member:
            swp2: {}
        bridge:
          domain:
            br_default:
              access: 20
        evpn:
          multihoming:
            segment:
              df-preference: 50000
              enable: on
              local-id: 2
              mac-address: 44:38:39:FF:00:AA
        link:
          mtu: 9000
        type: bond
      bond3:
        bond:
          lacp-bypass: on
          member:
            swp3: {}
        bridge:
          domain:
            br_default:
              access: 30
        evpn:
          multihoming:
            segment:
              df-preference: 50000
              enable: on
              local-id: 3
              mac-address: 44:38:39:FF:00:AA
        link:
          mtu: 9000
        type: bond
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.1/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp51:
        evpn:
          multihoming:
            uplink: on
        type: swp
      swp52:
        evpn:
          multihoming:
            uplink: on
        type: swp
      vlan10:
        ip:
          address:
            10.1.10.2/24: {}
          vrf: RED
          vrr:
            address:
              10.1.10.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 10
      vlan20:
        ip:
          address:
            10.1.20.2/24: {}
          vrf: RED
          vrr:
            address:
              10.1.20.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 20
      vlan30:
        ip:
          address:
            10.1.30.2/24: {}
          vrf: BLUE
          vrr:
            address:
              10.1.30.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 30
    nve:
      vxlan:
        arp-nd-suppress: on
        enable: on
        source:
          address: 10.10.10.1
    router:
      bgp:
        autonomous-system: 65101
        enable: on
        router-id: 10.10.10.1
      vrr:
        enable: on
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$0UJ.vs.J1XC6/Kwq$jLHpbKGoLU0wI.NezCBMtHjXHSixMAgbLP3aF3vFbrjF2ZoJx5RIDoNE3v1qELWhVQ0RqB9uY/BSF6o7ypyxS0
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        anycast-mac: 44:38:39:FF:00:AA
        fabric-mac: 00:00:5E:00:01:01
        system-mac: 44:38:39:22:01:7a
      hostname: leaf01
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      BLUE:
        evpn:
          enable: on
          vni:
            '4002': {}
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
                route-export:
                  to-evpn:
                    enable: on
            autonomous-system: 65101
            enable: on
            router-id: 10.10.10.1
      RED:
        evpn:
          enable: on
          vni:
            '4001': {}
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
                route-export:
                  to-evpn:
                    enable: on
            autonomous-system: 65101
            enable: on
            router-id: 10.10.10.1
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
              l2vpn-evpn:
                enable: on
            enable: on
            neighbor:
              swp51:
                peer-group: underlay
                type: unnumbered
              swp52:
                peer-group: underlay
                type: unnumbered
            peer-group:
              underlay:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
cumulus@leaf02:~$ cat /etc/nvue.d/startup.yaml
- set:
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
            '30':
              vni:
                '30': {}
    evpn:
      enable: on
      multihoming:
        enable: on
    interface:
      bond1:
        bond:
          lacp-bypass: on
          member:
            swp1: {}
        bridge:
          domain:
            br_default:
              access: 10
        evpn:
          multihoming:
            segment:
              df-preference: 50000
              enable: on
              local-id: 1
              mac-address: 44:38:39:FF:00:AA
        link:
          mtu: 9000
        type: bond
      bond2:
        bond:
          lacp-bypass: on
          member:
            swp2: {}
        bridge:
          domain:
            br_default:
              access: 20
        evpn:
          multihoming:
            segment:
              df-preference: 50000
              enable: on
              local-id: 2
              mac-address: 44:38:39:FF:00:AA
        link:
          mtu: 9000
        type: bond
      bond3:
        bond:
          lacp-bypass: on
          member:
            swp3: {}
        bridge:
          domain:
            br_default:
              access: 30
        evpn:
          multihoming:
            segment:
              df-preference: 50000
              enable: on
              local-id: 3
              mac-address: 44:38:39:FF:00:AA
        link:
          mtu: 9000
        type: bond
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.2/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp51:
        evpn:
          multihoming:
            uplink: on
        type: swp
      swp52:
        evpn:
          multihoming:
            uplink: on
        type: swp
      vlan10:
        ip:
          address:
            10.1.10.3/24: {}
          vrf: RED
          vrr:
            address:
              10.1.10.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 10
      vlan20:
        ip:
          address:
            10.1.20.3/24: {}
          vrf: RED
          vrr:
            address:
              10.1.20.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 20
      vlan30:
        ip:
          address:
            10.1.30.3/24: {}
          vrf: BLUE
          vrr:
            address:
              10.1.30.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 30
    nve:
      vxlan:
        arp-nd-suppress: on
        enable: on
        source:
          address: 10.10.10.2
    router:
      bgp:
        autonomous-system: 65102
        enable: on
        router-id: 10.10.10.2
      vrr:
        enable: on
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$3l/mGeft8luHcK4f$IBKQ3M5rSzk/w2Czp4m0FYT3W/o8uDvqPQVN7ffy9qIfVAZuhyEdISSgbcU7ey7qD1AmfBKSNM42j0M0Nssar0
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        anycast-mac: 44:38:39:FF:00:AA
        fabric-mac: 00:00:5E:00:01:01
        system-mac: 44:38:39:22:01:78
      hostname: leaf02
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      BLUE:
        evpn:
          enable: on
          vni:
            '4002': {}
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
                route-export:
                  to-evpn:
                    enable: on
            autonomous-system: 65102
            enable: on
            router-id: 10.10.10.2
      RED:
        evpn:
          enable: on
          vni:
            '4001': {}
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
                route-export:
                  to-evpn:
                    enable: on
            autonomous-system: 65102
            enable: on
            router-id: 10.10.10.2
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
              l2vpn-evpn:
                enable: on
            enable: on
            neighbor:
              swp51:
                peer-group: underlay
                type: unnumbered
              swp52:
                peer-group: underlay
                type: unnumbered
            peer-group:
              underlay:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
cumulus@leaf03:~$ cat /etc/nvue.d/startup.yaml
- set:
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
            '30':
              vni:
                '30': {}
    evpn:
      enable: on
      multihoming:
        enable: on
    interface:
      bond1:
        bond:
          lacp-bypass: on
          member:
            swp1: {}
        bridge:
          domain:
            br_default:
              access: 10
        evpn:
          multihoming:
            segment:
              df-preference: 50000
              enable: on
              local-id: 1
              mac-address: 44:38:39:FF:00:BB
        link:
          mtu: 9000
        type: bond
      bond2:
        bond:
          lacp-bypass: on
          member:
            swp2: {}
        bridge:
          domain:
            br_default:
              access: 20
        evpn:
          multihoming:
            segment:
              df-preference: 50000
              enable: on
              local-id: 2
              mac-address: 44:38:39:FF:00:BB
        link:
          mtu: 9000
        type: bond
      bond3:
        bond:
          lacp-bypass: on
          member:
            swp3: {}
        bridge:
          domain:
            br_default:
              access: 30
        evpn:
          multihoming:
            segment:
              df-preference: 50000
              enable: on
              local-id: 3
              mac-address: 44:38:39:FF:00:BB
        link:
          mtu: 9000
        type: bond
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.3/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp51:
        evpn:
          multihoming:
            uplink: on
        type: swp
      swp52:
        evpn:
          multihoming:
            uplink: on
        type: swp
      vlan10:
        ip:
          address:
            10.1.10.4/24: {}
          vrf: RED
          vrr:
            address:
              10.1.10.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 10
      vlan20:
        ip:
          address:
            10.1.20.4/24: {}
          vrf: RED
          vrr:
            address:
              10.1.20.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 20
      vlan30:
        ip:
          address:
            10.1.30.4/24: {}
          vrf: BLUE
          vrr:
            address:
              10.1.30.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 30
    nve:
      vxlan:
        arp-nd-suppress: on
        enable: on
        source:
          address: 10.10.10.3
    router:
      bgp:
        autonomous-system: 65103
        enable: on
        router-id: 10.10.10.3
      vrr:
        enable: on
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$fXqglI7FdhhtxVQq$oFuDfEvAWHFpSpLJYuBwckXJ0TOdK6H0RkWYRf4QXXUtom3oIBrn2JIucCvMYZUW02Me6jf9FOPe.xFfKdrfl/
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        anycast-mac: 44:38:39:FF:00:AA
        fabric-mac: 00:00:5E:00:01:01
        system-mac: 44:38:39:22:01:84
      hostname: leaf03
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      BLUE:
        evpn:
          enable: on
          vni:
            '4002': {}
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
                route-export:
                  to-evpn:
                    enable: on
            autonomous-system: 65103
            enable: on
            router-id: 10.10.10.3
      RED:
        evpn:
          enable: on
          vni:
            '4001': {}
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
                route-export:
                  to-evpn:
                    enable: on
            autonomous-system: 65103
            enable: on
            router-id: 10.10.10.3
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
              l2vpn-evpn:
                enable: on
            enable: on
            neighbor:
              swp51:
                peer-group: underlay
                type: unnumbered
              swp52:
                peer-group: underlay
                type: unnumbered
            peer-group:
              underlay:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
cumulus@leaf04:~$ cat /etc/nvue.d/startup.yaml
- set:
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
            '30':
              vni:
                '30': {}
    evpn:
      enable: on
      multihoming:
        enable: on
    interface:
      bond1:
        bond:
          lacp-bypass: on
          member:
            swp1: {}
        bridge:
          domain:
            br_default:
              access: 10
        evpn:
          multihoming:
            segment:
              df-preference: 50000
              enable: on
              local-id: 1
              mac-address: 44:38:39:FF:00:BB
        link:
          mtu: 9000
        type: bond
      bond2:
        bond:
          lacp-bypass: on
          member:
            swp2: {}
        bridge:
          domain:
            br_default:
              access: 20
        evpn:
          multihoming:
            segment:
              df-preference: 50000
              enable: on
              local-id: 2
              mac-address: 44:38:39:FF:00:BB
        link:
          mtu: 9000
        type: bond
      bond3:
        bond:
          lacp-bypass: on
          member:
            swp3: {}
        bridge:
          domain:
            br_default:
              access: 30
        evpn:
          multihoming:
            segment:
              df-preference: 50000
              enable: on
              local-id: 3
              mac-address: 44:38:39:FF:00:BB
        link:
          mtu: 9000
        type: bond
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.4/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp51:
        evpn:
          multihoming:
            uplink: on
        type: swp
      swp52:
        evpn:
          multihoming:
            uplink: on
        type: swp
      vlan10:
        ip:
          address:
            10.1.10.5/24: {}
          vrf: RED
          vrr:
            address:
              10.1.10.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 10
      vlan20:
        ip:
          address:
            10.1.20.5/24: {}
          vrf: RED
          vrr:
            address:
              10.1.20.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 20
      vlan30:
        ip:
          address:
            10.1.30.5/24: {}
          vrf: BLUE
          vrr:
            address:
              10.1.30.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 30
    nve:
      vxlan:
        arp-nd-suppress: on
        enable: on
        source:
          address: 10.10.10.4
    router:
      bgp:
        autonomous-system: 65104
        enable: on
        router-id: 10.10.10.4
      vrr:
        enable: on
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$V2IH48/ZUEa5lSC3$24Gvui8RFRw24XUmnhT2BqCZa8BHkEJO2ruqZ0xqXldRXJkQUOqxx4X0q/PHWjpIx5W5MsWVSqjEpG8iw4SBW1
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        anycast-mac: 44:38:39:FF:00:AA
        fabric-mac: 00:00:5E:00:01:01
        system-mac: 44:38:39:22:01:8a
      hostname: leaf04
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      BLUE:
        evpn:
          enable: on
          vni:
            '4002': {}
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
                route-export:
                  to-evpn:
                    enable: on
            autonomous-system: 65104
            enable: on
            router-id: 10.10.10.4
      RED:
        evpn:
          enable: on
          vni:
            '4001': {}
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
                route-export:
                  to-evpn:
                    enable: on
            autonomous-system: 65104
            enable: on
            router-id: 10.10.10.4
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
              l2vpn-evpn:
                enable: on
            enable: on
            neighbor:
              swp51:
                peer-group: underlay
                type: unnumbered
              swp52:
                peer-group: underlay
                type: unnumbered
            peer-group:
              underlay:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
cumulus@spine01:~$ cat /etc/nvue.d/startup.yaml
- set:
    interface:
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.101/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp4:
        type: swp
    router:
      bgp:
        autonomous-system: 65199
        enable: on
        router-id: 10.10.10.101
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$qruUi1M0Kp3aiwbm$e5Wt0hwS7p70L5TfzVOz7YD05wFHlE7a6HEie4CtV0exC8G7WrsaQ8OUddnsN9rP4xl4fdkInFDQfoBUUhVgg1
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        fabric-mac: 00:00:5E:00:01:01
        system-mac: 44:38:39:22:01:82
      hostname: spine01
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
              l2vpn-evpn:
                enable: on
            enable: on
            neighbor:
              swp1:
                peer-group: underlay
                type: unnumbered
              swp2:
                peer-group: underlay
                type: unnumbered
              swp3:
                peer-group: underlay
                type: unnumbered
              swp4:
                peer-group: underlay
                type: unnumbered
            peer-group:
              underlay:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
cumulus@spine02:~$ cat /etc/nvue.d/startup.yaml
- set:
    interface:
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.102/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp4:
        type: swp
    router:
      bgp:
        autonomous-system: 65199
        enable: on
        router-id: 10.10.10.102
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$KXiEkc0lH0nj62X1$5AJMEw8EPgIJyq8C3KuKNwH11ykSdXEpncFAxz.I9YZCb6HeYrZRw5dLBW5oHGn5kBWyH52wUh.8gwa1w1uGh1
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        fabric-mac: 00:00:5E:00:01:01
        system-mac: 44:38:39:22:01:92
      hostname: spine02
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
              l2vpn-evpn:
                enable: on
            enable: on
            neighbor:
              swp1:
                peer-group: underlay
                type: unnumbered
              swp2:
                peer-group: underlay
                type: unnumbered
              swp3:
                peer-group: underlay
                type: unnumbered
              swp4:
                peer-group: underlay
                type: unnumbered
            peer-group:
              underlay:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
cumulus@leaf01:~$ cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.1/32
    vxlan-local-tunnelip 10.10.10.1
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto RED
iface RED
    vrf-table auto
auto BLUE
iface BLUE
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp51
iface swp51
auto swp52
iface swp52
auto bond1
iface bond1
    mtu 9000
    es-sys-mac 44:38:39:FF:00:AA
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    bridge-access 10
auto bond2
iface bond2
    mtu 9000
    es-sys-mac 44:38:39:FF:00:AA
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    bridge-access 20
auto bond3
iface bond3
    mtu 9000
    es-sys-mac 44:38:39:FF:00:AA
    bond-slaves swp3
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    bridge-access 30
auto vlan10
iface vlan10
    address 10.1.10.2/24
    address-virtual 00:00:5E:00:01:01 10.1.10.1/24
    hwaddress 44:38:39:22:01:b1
    vrf RED
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    address 10.1.20.2/24
    address-virtual 00:00:5E:00:01:01 10.1.20.1/24
    hwaddress 44:38:39:22:01:b1
    vrf RED
    vlan-raw-device br_default
    vlan-id 20
auto vlan30
iface vlan30
    address 10.1.30.2/24
    address-virtual 00:00:5E:00:01:01 10.1.30.1/24
    hwaddress 44:38:39:22:01:b1
    vrf BLUE
    vlan-raw-device br_default
    vlan-id 30
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20 30=30
    bridge-vids 10 20 30
    bridge-learning off
auto vlan220_l3
iface vlan220_l3
    vrf RED
    vlan-raw-device br_l3vni
    vlan-id 220
auto vlan297_l3
iface vlan297_l3
    vrf BLUE
    vlan-raw-device br_l3vni
    vlan-id 297
auto vxlan99
iface vxlan99
    bridge-vlan-vni-map 220=4001 297=4002
    bridge-vids 220 297
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond1 bond2 bond3 vxlan48
    hwaddress 44:38:39:22:01:b1
    bridge-vlan-aware yes
    bridge-vids 10 20 30
    bridge-pvid 1
auto br_l3vni
iface br_l3vni
    bridge-ports vxlan99
    hwaddress 44:38:39:22:01:b1
    bridge-vlan-aware yes
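The leaf01 listing above ties each tenant VRF to a layer 3 VNI in two steps: the `vlan220_l3` and `vlan297_l3` interfaces each name a VRF and a VLAN ID, and `vxlan99` maps those VLAN IDs to VNIs through `bridge-vlan-vni-map`. The following Python sketch joins the two to recover the VRF to VNI relationship. It is an illustration only: the function names and the `L3_SVIS` dictionary are hypothetical, and the values are copied by hand from the listing instead of being read from the switch.

```python
from typing import Dict, Tuple


def parse_vlan_vni_map(value: str) -> Dict[int, int]:
    """Turn a bridge-vlan-vni-map value such as '220=4001 297=4002'
    into {vlan_id: vni}."""
    pairs = (item.split("=") for item in value.split())
    return {int(vlan): int(vni) for vlan, vni in pairs}


def vrf_to_l3vni(svis: Dict[str, Tuple[str, int]], map_value: str) -> Dict[str, int]:
    """Join each layer 3 SVI's (vrf, vlan-id) with the VLAN-to-VNI map."""
    vlan_to_vni = parse_vlan_vni_map(map_value)
    return {vrf: vlan_to_vni[vlan] for vrf, vlan in svis.values()}


# Values copied from the leaf01 /etc/network/interfaces listing (hypothetical names).
L3_SVIS = {
    "vlan220_l3": ("RED", 220),
    "vlan297_l3": ("BLUE", 297),
}
VXLAN99_MAP = "220=4001 297=4002"

print(vrf_to_l3vni(L3_SVIS, VXLAN99_MAP))  # {'RED': 4001, 'BLUE': 4002}
```

The result agrees with the `vrf` section of the leaf01 `startup.yaml` file, where RED carries VNI 4001 and BLUE carries VNI 4002.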
cumulus@leaf02:~$ cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.2/32
    vxlan-local-tunnelip 10.10.10.2
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto RED
iface RED
    vrf-table auto
auto BLUE
iface BLUE
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp51
iface swp51
auto swp52
iface swp52
auto bond1
iface bond1
    mtu 9000
    es-sys-mac 44:38:39:FF:00:AA
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    bridge-access 10
auto bond2
iface bond2
    mtu 9000
    es-sys-mac 44:38:39:FF:00:AA
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    bridge-access 20
auto bond3
iface bond3
    mtu 9000
    es-sys-mac 44:38:39:FF:00:AA
    bond-slaves swp3
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    bridge-access 30
auto vlan10
iface vlan10
    address 10.1.10.3/24
    address-virtual 00:00:5E:00:01:01 10.1.10.1/24
    hwaddress 44:38:39:22:01:af
    vrf RED
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    address 10.1.20.3/24
    address-virtual 00:00:5E:00:01:01 10.1.20.1/24
    hwaddress 44:38:39:22:01:af
    vrf RED
    vlan-raw-device br_default
    vlan-id 20
auto vlan30
iface vlan30
    address 10.1.30.3/24
    address-virtual 00:00:5E:00:01:01 10.1.30.1/24
    hwaddress 44:38:39:22:01:af
    vrf BLUE
    vlan-raw-device br_default
    vlan-id 30
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20 30=30
    bridge-vids 10 20 30
    bridge-learning off
auto vlan220_l3
iface vlan220_l3
    vrf RED
    vlan-raw-device br_l3vni
    vlan-id 220
auto vlan297_l3
iface vlan297_l3
    vrf BLUE
    vlan-raw-device br_l3vni
    vlan-id 297
auto vxlan99
iface vxlan99
    bridge-vlan-vni-map 220=4001 297=4002
    bridge-vids 220 297
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond1 bond2 bond3 vxlan48
    hwaddress 44:38:39:22:01:af
    bridge-vlan-aware yes
    bridge-vids 10 20 30
    bridge-pvid 1
auto br_l3vni
iface br_l3vni
    bridge-ports vxlan99
    hwaddress 44:38:39:22:01:af
    bridge-vlan-aware yes
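In the leaf01 and leaf02 listings above, bond1 through bond3 carry the same `es-sys-mac` value on both switches. When you compare two such files by hand, a small script can flag any bond whose value differs. The following Python sketch is one way to do that, under stated assumptions: `parse_interfaces`, `es_sys_macs`, and `mismatched` are hypothetical helper names, the parser handles only the simple `auto`/`iface`/attribute layout shown in these listings, and the input strings are shortened excerpts, not complete files.

```python
from typing import Dict, List, Optional


def parse_interfaces(text: str) -> Dict[str, Dict[str, str]]:
    """Parse interfaces-style text into {interface: {attribute: value}}.

    Simplified: a repeated attribute keeps only its last value, and
    'auto' lines, comments, and the '...' elision marker are skipped.
    """
    stanzas: Dict[str, Dict[str, str]] = {}
    current: Optional[Dict[str, str]] = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line == "...":
            continue
        words = line.split()
        if words[0] == "auto":
            current = None
        elif words[0] == "iface":
            current = stanzas.setdefault(words[1], {})
        elif current is not None:
            current[words[0]] = " ".join(words[1:])
    return stanzas


def es_sys_macs(text: str) -> Dict[str, str]:
    """Return {interface: es-sys-mac}, lowercased for comparison."""
    return {
        name: attrs["es-sys-mac"].lower()
        for name, attrs in parse_interfaces(text).items()
        if "es-sys-mac" in attrs
    }


def mismatched(text_a: str, text_b: str) -> List[str]:
    """List interfaces present in both files whose es-sys-mac differs."""
    a, b = es_sys_macs(text_a), es_sys_macs(text_b)
    return sorted(name for name in a.keys() & b.keys() if a[name] != b[name])


# Shortened excerpts; the MAC values match the listings in this section.
LEAF01_EXCERPT = """\
auto bond1
iface bond1
    mtu 9000
    es-sys-mac 44:38:39:FF:00:AA
    bond-slaves swp1
auto bond2
iface bond2
    mtu 9000
    es-sys-mac 44:38:39:FF:00:AA
    bond-slaves swp2
"""
LEAF02_EXCERPT = LEAF01_EXCERPT  # leaf02 uses the same values for these bonds
LEAF03_EXCERPT = LEAF01_EXCERPT.replace("44:38:39:FF:00:AA", "44:38:39:FF:00:BB")

print(mismatched(LEAF01_EXCERPT, LEAF02_EXCERPT))  # []
print(mismatched(LEAF01_EXCERPT, LEAF03_EXCERPT))  # ['bond1', 'bond2']
```

The leaf03 excerpt uses the `44:38:39:FF:00:BB` value that appears in the leaf03 `startup.yaml` file, so the second comparison reports both bonds.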
cumulus@leaf03:~$ cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.3/32
    vxlan-local-tunnelip 10.10.10.3
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto RED
iface RED
    vrf-table auto
auto BLUE
iface BLUE
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp51
iface swp51
auto swp52
iface swp52
auto bond1
iface bond1
    mtu 9000
    es-sys-mac 44:38:39:FF:00:BB
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    bridge-access 10
auto bond2
iface bond2
    mtu 9000
    es-sys-mac 44:38:39:FF:00:BB
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    bridge-access 20
auto bond3
iface bond3
    mtu 9000
    es-sys-mac 44:38:39:FF:00:BB
    bond-slaves swp3
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    bridge-access 30
auto vlan10
iface vlan10
    address 10.1.10.4/24
    address-virtual 00:00:5E:00:01:01 10.1.10.1/24
    hwaddress 44:38:39:22:01:bb
    vrf RED
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    address 10.1.20.4/24
    address-virtual 00:00:5E:00:01:01 10.1.20.1/24
    hwaddress 44:38:39:22:01:bb
    vrf RED
    vlan-raw-device br_default
    vlan-id 20
auto vlan30
iface vlan30
    address 10.1.30.4/24
    address-virtual 00:00:5E:00:01:01 10.1.30.1/24
    hwaddress 44:38:39:22:01:bb
    vrf BLUE
    vlan-raw-device br_default
    vlan-id 30
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20 30=30
    bridge-vids 10 20 30
    bridge-learning off
auto vlan220_l3
iface vlan220_l3
    vrf RED
    vlan-raw-device br_l3vni
    vlan-id 220
auto vlan297_l3
iface vlan297_l3
    vrf BLUE
    vlan-raw-device br_l3vni
    vlan-id 297
auto vxlan99
iface vxlan99
    bridge-vlan-vni-map 220=4001 297=4002
    bridge-vids 220 297
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond1 bond2 bond3 vxlan48
    hwaddress 44:38:39:22:01:bb
    bridge-vlan-aware yes
    bridge-vids 10 20 30
    bridge-pvid 1
auto br_l3vni
iface br_l3vni
    bridge-ports vxlan99
    hwaddress 44:38:39:22:01:bb
    bridge-vlan-aware yes
cumulus@leaf04:~$ cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.4/32
    vxlan-local-tunnelip 10.10.10.4
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto RED
iface RED
    vrf-table auto
auto BLUE
iface BLUE
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp51
iface swp51
auto swp52
iface swp52
auto bond1
iface bond1
    mtu 9000
    es-sys-mac 44:38:39:FF:00:BB
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    bridge-access 10
auto bond2
iface bond2
    mtu 9000
    es-sys-mac 44:38:39:FF:00:BB
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    bridge-access 20
auto bond3
iface bond3
    mtu 9000
    es-sys-mac 44:38:39:FF:00:BB
    bond-slaves swp3
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    bridge-access 30
auto vlan10
iface vlan10
    address 10.1.10.5/24
    address-virtual 00:00:5E:00:01:01 10.1.10.1/24
    hwaddress 44:38:39:22:01:c1
    vrf RED
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    address 10.1.20.5/24
    address-virtual 00:00:5E:00:01:01 10.1.20.1/24
    hwaddress 44:38:39:22:01:c1
    vrf RED
    vlan-raw-device br_default
    vlan-id 20
auto vlan30
iface vlan30
    address 10.1.30.5/24
    address-virtual 00:00:5E:00:01:01 10.1.30.1/24
    hwaddress 44:38:39:22:01:c1
    vrf BLUE
    vlan-raw-device br_default
    vlan-id 30
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20 30=30
    bridge-vids 10 20 30
    bridge-learning off
auto vlan220_l3
iface vlan220_l3
    vrf RED
    vlan-raw-device br_l3vni
    vlan-id 220
auto vlan297_l3
iface vlan297_l3
    vrf BLUE
    vlan-raw-device br_l3vni
    vlan-id 297
auto vxlan99
iface vxlan99
    bridge-vlan-vni-map 220=4001 297=4002
    bridge-vids 220 297
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond1 bond2 bond3 vxlan48
    hwaddress 44:38:39:22:01:c1
    bridge-vlan-aware yes
    bridge-vids 10 20 30
    bridge-pvid 1
auto br_l3vni
iface br_l3vni
    bridge-ports vxlan99
    hwaddress 44:38:39:22:01:c1
    bridge-vlan-aware yes
cumulus@spine01:~$ cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.101/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp4
iface swp4
cumulus@spine02:~$ cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.102/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp4
iface swp4
cumulus@server01:~$ sudo cat /etc/network/interfaces
# The loopback network interface
auto lo
iface lo inet loopback
# The OOB network interface
auto eth0
iface eth0 inet dhcp
# The data plane network interfaces
auto eth1
iface eth1 inet manual
  # Required for Vagrant
  post-up ip link set promisc on dev eth1
auto eth2
iface eth2 inet manual
  # Required for Vagrant
  post-up ip link set promisc on dev eth2
auto uplink
iface uplink inet static
  address 10.1.10.101
  netmask 255.255.255.0
  mtu 9000
  bond-slaves eth1 eth2
  bond-mode 802.3ad
  bond-miimon 100
  bond-lacp-rate 1
  bond-min-links 1
  bond-xmit-hash-policy layer3+4
  post-up ip route add 10.0.0.0/8 via 10.1.10.1
cumulus@server02:~$ sudo cat /etc/network/interfaces
# The loopback network interface
auto lo
iface lo inet loopback
# The OOB network interface
auto eth0
iface eth0 inet dhcp
# The data plane network interfaces
auto eth1
iface eth1 inet manual
  # Required for Vagrant
  post-up ip link set promisc on dev eth1
auto eth2
iface eth2 inet manual
  # Required for Vagrant
  post-up ip link set promisc on dev eth2
auto uplink
iface uplink inet static
  address 10.1.20.102
  netmask 255.255.255.0
  mtu 9000
  bond-slaves eth1 eth2
  bond-mode 802.3ad
  bond-miimon 100
  bond-lacp-rate 1
  bond-min-links 1
  bond-xmit-hash-policy layer3+4
  post-up ip route add 10.0.0.0/8 via 10.1.20.1
cumulus@server03:~$ sudo cat /etc/network/interfaces
# The loopback network interface
auto lo
iface lo inet loopback
# The OOB network interface
auto eth0
iface eth0 inet dhcp
# The data plane network interfaces
auto eth1
iface eth1 inet manual
  # Required for Vagrant
  post-up ip link set promisc on dev eth1
auto eth2
iface eth2 inet manual
  # Required for Vagrant
  post-up ip link set promisc on dev eth2
auto uplink
iface uplink inet static
  address 10.1.30.103
  netmask 255.255.255.0
  mtu 9000
  bond-slaves eth1 eth2
  bond-mode 802.3ad
  bond-miimon 100
  bond-lacp-rate 1
  bond-min-links 1
  bond-xmit-hash-policy layer3+4
  post-up ip route add 10.0.0.0/8 via 10.1.30.1
cumulus@server04:~$ sudo cat /etc/network/interfaces
# The loopback network interface
auto lo
iface lo inet loopback
# The OOB network interface
auto eth0
iface eth0 inet dhcp
# The data plane network interfaces
auto eth1
iface eth1 inet manual
  # Required for Vagrant
  post-up ip link set promisc on dev eth1
auto eth2
iface eth2 inet manual
  # Required for Vagrant
  post-up ip link set promisc on dev eth2
auto uplink
iface uplink inet static
  address 10.1.10.104
  netmask 255.255.255.0
  mtu 9000
  bond-slaves eth1 eth2
  bond-mode 802.3ad
  bond-miimon 100
  bond-lacp-rate 1
  bond-min-links 1
  bond-xmit-hash-policy layer3+4
  post-up ip route add 10.0.0.0/8 via 10.1.10.1
cumulus@leaf01:~$ sudo cat /etc/frr/frr.conf
...
evpn mh mac-holdtime 1080
evpn mh neigh-holdtime 1080
evpn mh startup-delay 180
interface bond1
evpn mh es-df-pref 50000
evpn mh es-id 1
evpn mh es-sys-mac 44:38:39:FF:00:AA
interface bond2
evpn mh es-df-pref 50000
evpn mh es-id 2
evpn mh es-sys-mac 44:38:39:FF:00:AA
interface bond3
evpn mh es-df-pref 50000
evpn mh es-id 3
evpn mh es-sys-mac 44:38:39:FF:00:AA
interface swp51
evpn mh uplink
interface swp52
evpn mh uplink
vrf BLUE
vni 4002
exit-vrf
vrf RED
vni 4001
exit-vrf
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65101 vrf default
bgp router-id 10.10.10.1
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor swp51 interface peer-group underlay
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface peer-group underlay
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp51 activate
neighbor swp52 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor swp51 activate
neighbor swp52 activate
neighbor underlay activate
exit-address-family
! end of router bgp 65101 vrf default
router bgp 65101 vrf RED
bgp router-id 10.10.10.1
timers bgp 3 9
bgp deterministic-med
! Neighbors
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
exit-address-family
address-family l2vpn evpn
advertise ipv4 unicast
exit-address-family
! end of router bgp 65101 vrf RED
router bgp 65101 vrf BLUE
bgp router-id 10.10.10.1
timers bgp 3 9
bgp deterministic-med
! Neighbors
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
exit-address-family
address-family l2vpn evpn
advertise ipv4 unicast
exit-address-family
! end of router bgp 65101 vrf BLUE
...
cumulus@leaf02:~$ sudo cat /etc/frr/frr.conf
...
evpn mh mac-holdtime 1080
evpn mh neigh-holdtime 1080
evpn mh startup-delay 180
interface bond1
evpn mh es-df-pref 50000
evpn mh es-id 1
evpn mh es-sys-mac 44:38:39:FF:00:AA
interface bond2
evpn mh es-df-pref 50000
evpn mh es-id 2
evpn mh es-sys-mac 44:38:39:FF:00:AA
interface bond3
evpn mh es-df-pref 50000
evpn mh es-id 3
evpn mh es-sys-mac 44:38:39:FF:00:AA
interface swp51
evpn mh uplink
interface swp52
evpn mh uplink
vrf BLUE
vni 4002
exit-vrf
vrf RED
vni 4001
exit-vrf
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65102 vrf default
bgp router-id 10.10.10.2
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor swp51 interface peer-group underlay
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface peer-group underlay
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp51 activate
neighbor swp52 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor swp51 activate
neighbor swp52 activate
neighbor underlay activate
exit-address-family
! end of router bgp 65102 vrf default
router bgp 65102 vrf RED
bgp router-id 10.10.10.2
timers bgp 3 9
bgp deterministic-med
! Neighbors
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
exit-address-family
address-family l2vpn evpn
advertise ipv4 unicast
exit-address-family
! end of router bgp 65102 vrf RED
router bgp 65102 vrf BLUE
bgp router-id 10.10.10.2
timers bgp 3 9
bgp deterministic-med
! Neighbors
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
exit-address-family
address-family l2vpn evpn
advertise ipv4 unicast
exit-address-family
! end of router bgp 65102 vrf BLUE
cumulus@leaf03:~$ sudo cat /etc/frr/frr.conf
...
evpn mh mac-holdtime 1080
evpn mh neigh-holdtime 1080
evpn mh startup-delay 180
interface bond1
evpn mh es-df-pref 50000
evpn mh es-id 1
evpn mh es-sys-mac 44:38:39:FF:00:BB
interface bond2
evpn mh es-df-pref 50000
evpn mh es-id 2
evpn mh es-sys-mac 44:38:39:FF:00:BB
interface bond3
evpn mh es-df-pref 50000
evpn mh es-id 3
evpn mh es-sys-mac 44:38:39:FF:00:BB
interface swp51
evpn mh uplink
interface swp52
evpn mh uplink
vrf BLUE
vni 4002
exit-vrf
vrf RED
vni 4001
exit-vrf
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65103 vrf default
bgp router-id 10.10.10.3
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor swp51 interface peer-group underlay
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface peer-group underlay
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp51 activate
neighbor swp52 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor swp51 activate
neighbor swp52 activate
neighbor underlay activate
exit-address-family
! end of router bgp 65103 vrf default
router bgp 65103 vrf RED
bgp router-id 10.10.10.3
timers bgp 3 9
bgp deterministic-med
! Neighbors
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
exit-address-family
address-family l2vpn evpn
advertise ipv4 unicast
exit-address-family
! end of router bgp 65103 vrf RED
router bgp 65103 vrf BLUE
bgp router-id 10.10.10.3
timers bgp 3 9
bgp deterministic-med
! Neighbors
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
exit-address-family
address-family l2vpn evpn
advertise ipv4 unicast
exit-address-family
! end of router bgp 65103 vrf BLUE
cumulus@leaf04:~$ sudo cat /etc/frr/frr.conf
...
evpn mh mac-holdtime 1080
evpn mh neigh-holdtime 1080
evpn mh startup-delay 180
interface bond1
evpn mh es-df-pref 50000
evpn mh es-id 1
evpn mh es-sys-mac 44:38:39:FF:00:BB
interface bond2
evpn mh es-df-pref 50000
evpn mh es-id 2
evpn mh es-sys-mac 44:38:39:FF:00:BB
interface bond3
evpn mh es-df-pref 50000
evpn mh es-id 3
evpn mh es-sys-mac 44:38:39:FF:00:BB
interface swp51
evpn mh uplink
interface swp52
evpn mh uplink
vrf BLUE
vni 4002
exit-vrf
vrf RED
vni 4001
exit-vrf
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65104 vrf default
bgp router-id 10.10.10.4
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor swp51 interface peer-group underlay
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface peer-group underlay
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp51 activate
neighbor swp52 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor swp51 activate
neighbor swp52 activate
neighbor underlay activate
exit-address-family
! end of router bgp 65104 vrf default
router bgp 65104 vrf RED
bgp router-id 10.10.10.4
timers bgp 3 9
bgp deterministic-med
! Neighbors
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
exit-address-family
address-family l2vpn evpn
advertise ipv4 unicast
exit-address-family
! end of router bgp 65104 vrf RED
router bgp 65104 vrf BLUE
bgp router-id 10.10.10.4
timers bgp 3 9
bgp deterministic-med
! Neighbors
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
exit-address-family
address-family l2vpn evpn
advertise ipv4 unicast
exit-address-family
! end of router bgp 65104 vrf BLUE
...
cumulus@spine01:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65199 vrf default
bgp router-id 10.10.10.101
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor swp1 interface peer-group underlay
neighbor swp1 timers 3 9
neighbor swp1 timers connect 10
neighbor swp1 advertisement-interval 0
neighbor swp1 capability extended-nexthop
neighbor swp2 interface peer-group underlay
neighbor swp2 timers 3 9
neighbor swp2 timers connect 10
neighbor swp2 advertisement-interval 0
neighbor swp2 capability extended-nexthop
neighbor swp3 interface peer-group underlay
neighbor swp3 timers 3 9
neighbor swp3 timers connect 10
neighbor swp3 advertisement-interval 0
neighbor swp3 capability extended-nexthop
neighbor swp4 interface peer-group underlay
neighbor swp4 timers 3 9
neighbor swp4 timers connect 10
neighbor swp4 advertisement-interval 0
neighbor swp4 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor underlay activate
exit-address-family
! end of router bgp 65199 vrf default
cumulus@spine02:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65199 vrf default
bgp router-id 10.10.10.102
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor swp1 interface peer-group underlay
neighbor swp1 timers 3 9
neighbor swp1 timers connect 10
neighbor swp1 advertisement-interval 0
neighbor swp1 capability extended-nexthop
neighbor swp2 interface peer-group underlay
neighbor swp2 timers 3 9
neighbor swp2 timers connect 10
neighbor swp2 advertisement-interval 0
neighbor swp2 capability extended-nexthop
neighbor swp3 interface peer-group underlay
neighbor swp3 timers 3 9
neighbor swp3 timers connect 10
neighbor swp3 advertisement-interval 0
neighbor swp3 capability extended-nexthop
neighbor swp4 interface peer-group underlay
neighbor swp4 timers 3 9
neighbor swp4 timers connect 10
neighbor swp4 advertisement-interval 0
neighbor swp4 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor underlay activate
exit-address-family
! end of router bgp 65199 vrf default

This simulation is running Cumulus Linux 5.11. The Cumulus Linux 5.12 simulation is coming soon.

The simulation starts with the EVPN-MH with Head End Replication configuration. The demo is pre-configured using NVUE commands.

  • Run the vtysh show evpn es command to show the Ethernet segments across all VNIs.
  • Run the vtysh show bgp l2vpn evpn route type ead command to show the type-1 EAD routes.

To further validate the configuration, run the commands shown in the troubleshooting section below.

When you run the nv set vrf RED evpn vni 4001 and the nv set vrf BLUE evpn vni 4002 commands, NVUE creates the following in the /etc/network/interfaces file:

cumulus@leaf01:~$ sudo cat /etc/network/interfaces
...
auto vlan220_l3
iface vlan220_l3
    vrf RED
    vlan-raw-device br_l3vni
    vlan-id 220

auto vlan297_l3
iface vlan297_l3
    vrf BLUE
    vlan-raw-device br_l3vni
    vlan-id 297

auto vxlan99
iface vxlan99
    bridge-vlan-vni-map 220=4001 297=4002
    bridge-vids 220 297
    bridge-learning off

auto br_l3vni
iface br_l3vni
    bridge-ports vxlan99
    hwaddress 44:38:39:22:01:b1
    bridge-vlan-aware yes
...
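
The relationship between these stanzas can be traced programmatically: each VRF is attached to a VLAN interface on the br_l3vni bridge, and the bridge-vlan-vni-map attribute on the VXLAN device maps that VLAN to the layer 3 VNI. The following Python sketch is illustrative only and is not part of Cumulus Linux or NVUE; the function names parse_stanzas and vrf_to_l3vni are hypothetical, and the parser assumes simple vlan=vni pairs like those shown above.

```python
def parse_stanzas(text):
    """Return {interface: {attribute: value}} from ifupdown2-style text.

    Simplification: a repeated attribute keeps only its last value.
    """
    stanzas, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "..." or line.startswith(("#", "auto ")):
            continue
        if line.startswith("iface "):
            current = stanzas.setdefault(line.split()[1], {})
        elif current is not None:
            key, _, value = line.partition(" ")
            current[key] = value.strip()
    return stanzas


def vrf_to_l3vni(stanzas, l3_bridge="br_l3vni"):
    """Follow VRF -> VLAN ID -> VNI for VLAN interfaces on the given bridge."""
    vlan_to_vni = {}
    # Only use the VLAN-to-VNI maps of VXLAN devices that are ports of the bridge.
    for port in stanzas.get(l3_bridge, {}).get("bridge-ports", "").split():
        vni_map = stanzas.get(port, {}).get("bridge-vlan-vni-map", "")
        for pair in vni_map.split():  # assumes plain "vlan=vni" pairs
            vlan, vni = pair.split("=")
            vlan_to_vni[int(vlan)] = int(vni)
    result = {}
    for attrs in stanzas.values():
        if attrs.get("vlan-raw-device") != l3_bridge:
            continue
        if "vrf" not in attrs or "vlan-id" not in attrs:
            continue
        vlan = int(attrs["vlan-id"])
        if vlan in vlan_to_vni:
            result[attrs["vrf"]] = vlan_to_vni[vlan]
    return result


# Sample text copied from the stanzas shown above.
SAMPLE = """
auto vlan220_l3
iface vlan220_l3
    vrf RED
    vlan-raw-device br_l3vni
    vlan-id 220

auto vlan297_l3
iface vlan297_l3
    vrf BLUE
    vlan-raw-device br_l3vni
    vlan-id 297

auto vxlan99
iface vxlan99
    bridge-vlan-vni-map 220=4001 297=4002
    bridge-vids 220 297
    bridge-learning off

auto br_l3vni
iface br_l3vni
    bridge-ports vxlan99
    bridge-vlan-aware yes
"""

print(vrf_to_l3vni(parse_stanzas(SAMPLE)))
```

For the sample text, the result maps RED to VNI 4001 and BLUE to VNI 4002, which matches the nv set vrf commands.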

EVPN BUM Traffic with PIM-SM

Without EVPN-PIM, head end replication (HER) is the default way to replicate BUM traffic to remote VTEPs: the ingress VTEP generates one copy of each overlay BUM packet for every remote VTEP. In certain deployments, this is not optimal.
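
To make the replication cost concrete, the following Python sketch compares the number of copies the ingress VTEP sends for one BUM packet with HER and with underlay multicast. It is a back-of-the-envelope illustration, not a Cumulus Linux tool; the function names are hypothetical, and the multicast case assumes the ingress VTEP sends a single copy toward the multicast group and leaves replication to the underlay.

```python
def her_copies(total_vteps: int) -> int:
    """Copies sent by the ingress VTEP per BUM packet with HER:
    one unicast copy for every remote VTEP in the VNI."""
    if total_vteps < 1:
        raise ValueError("a VNI needs at least one VTEP")
    return total_vteps - 1


def multicast_copies(total_vteps: int) -> int:
    """Copies sent by the ingress VTEP per BUM packet when the underlay
    replicates (assumption: one copy toward the multicast group)."""
    if total_vteps < 1:
        raise ValueError("a VNI needs at least one VTEP")
    return 1 if total_vteps > 1 else 0


for vteps in (2, 4, 64):
    print(f"{vteps} VTEPs: HER sends {her_copies(vteps)} copies, "
          f"underlay multicast sends {multicast_copies(vteps)}")
```

The gap grows linearly with the number of VTEPs in the VNI, which is why HER becomes less attractive in larger deployments.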

The following example shows an EVPN-PIM configuration, where underlay multicast distributes BUM traffic. A multicast distribution tree (MDT) optimizes the flow of overlay BUM traffic in the underlay network.

In the above example, host01 sends an ARP request to resolve host03. leaf01 (in addition to flooding the packet to host02) sends an encapsulated packet over the underlay network, which the spine forwards using the MDT to leaf02 and leaf03.

For PIM-SM, type-3 routes do not result in any forwarding entries. Cumulus Linux does not advertise type-3 routes for a layer 2 VNI when BUM mode for that VNI is PIM-SM.

If you use a PIM-SM based MDT for EVPN BUM replication, NVIDIA recommends that you use EVPN multihoming.

Configure Multicast VXLAN Tunnels

To configure multicast VXLAN tunnels, you need to configure PIM-SM in the underlay. For the configuration steps, refer to Protocol Independent Multicast - PIM.

In addition to the PIM-SM configuration, you need to run the following commands on each VTEP to provide the layer 2 VNI to MDT mapping.

Run the nv set nve vxlan flooding multicast-group <ip-address> command. For example:

cumulus@switch:~$ nv set nve vxlan flooding multicast-group 224.0.0.10

If you use Linux commands instead of NVUE, edit the /etc/network/interfaces file and add vxlan-mcastgrp <ip-address> to the interface stanza. For example:

cumulus@switch:~$ sudo vi /etc/network/interfaces
...
auto vxlan10
iface vxlan10
  vxlan-id 10
  vxlan-mcastgrp 224.0.0.10
  ...

Run the ifreload -a command to load the new configuration:

cumulus@switch:~$ ifreload -a

One multicast group per layer 2 VNI is the optimal configuration for underlay bandwidth utilization. However, you can specify the same multicast group for more than one layer 2 VNI.
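
If you maintain the /etc/network/interfaces file with your own tooling, a one-group-per-VNI mapping is easy to template. The following Python sketch renders the stanza shown above for each layer 2 VNI and rejects addresses that are not multicast. The function name render_vxlan_stanza and the VNI-to-group table are illustrative assumptions; the group addresses reuse the example values in this section.

```python
import ipaddress


def render_vxlan_stanza(vni: int, group: str) -> str:
    """Render an interfaces stanza that maps a layer 2 VNI to a multicast group."""
    if not ipaddress.ip_address(group).is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    return "\n".join([
        f"auto vxlan{vni}",
        f"iface vxlan{vni}",
        f"  vxlan-id {vni}",
        f"  vxlan-mcastgrp {group}",
    ])


# Hypothetical mapping: one multicast group per layer 2 VNI.
vni_to_group = {10: "224.0.0.10", 20: "224.0.0.20", 30: "224.0.0.30"}

print("\n\n".join(
    render_vxlan_stanza(vni, group) for vni, group in sorted(vni_to_group.items())
))
```

The rendered text contains only the two VXLAN attributes discussed here; a real stanza typically carries additional attributes, as in the example configuration later in this section.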

Verify EVPN-PIM

Run the vtysh show ip mroute command to review the multicast route information in FRR. When using EVPN-PIM, every VTEP acts as both source and destination for a VNI-MDT group; therefore, mroute entries on each VTEP should look like this:

cumulus@switch:~$ sudo vtysh
...
switch# show ip mroute
IP Multicast Routing Table
Flags: S - Sparse, C - Connected, P - Pruned
       R - RP-bit set, F - Register flag, T - SPT-bit set

Source          Group           Flags    Proto  Input            Output           TTL  Uptime
*               224.0.0.10      S        IGMP   swp54            pimreg           1    23:20:54
                                                                 ipmr-lo          1            
10.10.10.1      224.0.0.10      SFT      PIM    lo               swp51            1    23:20:56
*               224.0.0.20      S        IGMP   swp53            pimreg           1    23:20:54
                                                                 ipmr-lo          1            
10.10.10.1      224.0.0.20      SFT      PIM    lo               swp52            1    23:20:56
*               224.0.0.30      S        IGMP   swp51            pimreg           1    23:20:54
                                                                 ipmr-lo          1            
10.10.10.1      224.0.0.30      SFT      PIM    lo               swp53            1    23:20:56

(*,G) entries should show ipmr-lo in the outgoing interface list (OIL). (S,G) entries should show lo as the source (incoming) interface and ipmr-lo in the OIL.

Run the ip mroute command to review the multicast route information in the kernel. The kernel information should match the FRR information.

cumulus@switch:~$ ip mroute
(10.10.10.1,224.0.0.30)          Iif: lo         Oifs: swp53  State: resolved
(10.10.10.1,224.0.0.20)          Iif: lo         Oifs: swp52  State: resolved
(10.10.10.1,224.0.0.10)          Iif: lo         Oifs: swp51  State: resolved
(0.0.0.0,224.0.0.10)             Iif: swp54      Oifs: pimreg ipmr-lo swp54  State: resolved
(0.0.0.0,224.0.0.20)             Iif: swp53      Oifs: pimreg ipmr-lo swp53  State: resolved
(0.0.0.0,224.0.0.30)             Iif: swp51      Oifs: pimreg ipmr-lo swp51  State: resolved
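
When you check many VTEPs, you can script part of this comparison. The following Python sketch parses ip mroute output in the format shown above and flags two conditions only: a (*,G) entry, which the kernel shows with source 0.0.0.0, without ipmr-lo in its outgoing interfaces, and an (S,G) entry sourced from the local VTEP whose incoming interface is not lo. The function names are hypothetical, and the regular expression assumes the exact line layout of the sample output.

```python
import re

MROUTE_RE = re.compile(
    r"\((?P<src>[^,]+),(?P<grp>[^)]+)\)\s+"
    r"Iif:\s+(?P<iif>\S+)\s+"
    r"Oifs:\s+(?P<oifs>.*?)\s+State:\s+(?P<state>\S+)"
)


def parse_kernel_mroutes(text):
    """Parse `ip mroute` lines into dictionaries; unrecognized lines are skipped."""
    entries = []
    for line in text.splitlines():
        match = MROUTE_RE.search(line)
        if match:
            entries.append({
                "source": match["src"],
                "group": match["grp"],
                "iif": match["iif"],
                "oifs": match["oifs"].split(),
                "state": match["state"],
            })
    return entries


def check_evpn_pim_mroutes(entries, local_vtep_ip):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    for entry in entries:
        label = f"({entry['source']},{entry['group']})"
        if entry["source"] == "0.0.0.0":
            if "ipmr-lo" not in entry["oifs"]:
                problems.append(f"{label}: ipmr-lo missing from Oifs")
        elif entry["source"] == local_vtep_ip and entry["iif"] != "lo":
            problems.append(f"{label}: Iif is {entry['iif']}, expected lo")
    return problems


# Lines copied from the sample output above.
SAMPLE = """
(10.10.10.1,224.0.0.30)          Iif: lo         Oifs: swp53  State: resolved
(10.10.10.1,224.0.0.10)          Iif: lo         Oifs: swp51  State: resolved
(0.0.0.0,224.0.0.10)             Iif: swp54      Oifs: pimreg ipmr-lo swp54  State: resolved
(0.0.0.0,224.0.0.30)             Iif: swp51      Oifs: pimreg ipmr-lo swp51  State: resolved
"""

print(check_evpn_pim_mroutes(parse_kernel_mroutes(SAMPLE), "10.10.10.1"))
```

For the sample lines the check reports no problems. On a switch, you would pass the captured command output instead of the SAMPLE string.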

Run the bridge fdb show | grep 00:00:00:00:00:00 command to verify that the all-zero MAC address for every VXLAN device points to the correct multicast group destination.

cumulus@switch:~$ bridge fdb show | grep 00:00:00:00:00:00
00:00:00:00:00:00 dev vxlan10 dst 224.0.0.10 self permanent
00:00:00:00:00:00 dev vxlan20 dst 224.0.0.20 self permanent
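
The same verification can be sketched in Python by collecting the destination of each all-zero MAC entry per VXLAN device and comparing it with the mapping you expect. The function name zero_mac_destinations and the expected table are illustrative assumptions; the parser relies only on the dev and dst keywords that appear in the output above.

```python
ZERO_MAC = "00:00:00:00:00:00"


def zero_mac_destinations(fdb_text):
    """Map each device to the list of destinations of its all-zero MAC entries."""
    destinations = {}
    for line in fdb_text.splitlines():
        fields = line.split()
        if not fields or fields[0] != ZERO_MAC:
            continue
        if "dev" in fields and "dst" in fields:
            device = fields[fields.index("dev") + 1]
            dst = fields[fields.index("dst") + 1]
            destinations.setdefault(device, []).append(dst)
    return destinations


# Lines copied from the sample output above.
SAMPLE = """
00:00:00:00:00:00 dev vxlan10 dst 224.0.0.10 self permanent
00:00:00:00:00:00 dev vxlan20 dst 224.0.0.20 self permanent
"""

# Hypothetical expectation: one multicast group per VXLAN device.
expected = {"vxlan10": ["224.0.0.10"], "vxlan20": ["224.0.0.20"]}

found = zero_mac_destinations(SAMPLE)
for device, groups in expected.items():
    status = "ok" if found.get(device) == groups else "MISMATCH"
    print(f"{device}: {found.get(device)} {status}")
```

A device that lists several unicast destinations instead of one multicast group is most likely still flooding with HER.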

The show ip mroute count command, often used to check multicast packet counts, does not update for encapsulated BUM traffic originating or terminating on the VTEPs.

Run the vtysh show evpn vni <vni> command to ensure that your layer 2 VNI has the correct flooding information:

cumulus@switch:~$ sudo vtysh
switch# show evpn vni 10
VNI: 10
 Type: L2
 Tenant VRF: default
 VxLAN interface: vni10
 VxLAN ifIndex: 18
 Local VTEP IP: 10.10.10.1
 Mcast group: 224.0.0.10   <<<<<<<
 Remote VTEPs for this VNI:
  10.10.10.3 flood: -
 Number of MACs (local and remote) known for this VNI: 6
 Number of ARPs (IPv4 and IPv6, local and remote) known for this VNI: 14
 Advertise-gw-macip: No

Example Configuration

The following example shows an EVPN-PIM configuration on the VTEP:

cumulus@leaf01:~$ sudo cat /etc/frr/frr.conf
...
ip pim rp 10.10.100.100
ip pim keep-alive-timer 3600
ip pim ecmp
service integrated-vtysh-config
vrf BLUE
 vni 4002
 exit-vrf
vrf RED
 vni 4001
 exit-vrf
vrf mgmt
 ip route 0.0.0.0/0 192.168.200.1
 exit-vrf
interface swp51
 ip pim
interface swp52
 ip pim
interface swp53
 ip pim
interface swp54
 ip pim
interface lo
 ip igmp
 ip pim
 ip pim use-source 10.10.10.1
router bgp 65101
 bgp router-id 10.10.10.1
 neighbor underlay peer-group
 neighbor underlay remote-as external
 neighbor swp51 interface peer-group underlay
 neighbor swp52 interface peer-group underlay
 neighbor swp53 interface peer-group underlay
 neighbor swp54 interface peer-group underlay
 !
 address-family ipv4 unicast
  redistribute connected
 exit-address-family
 !
 address-family l2vpn evpn
  neighbor underlay activate
  advertise-all-vni
 exit-address-family
 !
router bgp 65101 vrf RED
 bgp router-id 10.10.10.1
 !
 address-family ipv4 unicast
  redistribute connected
 exit-address-family
 !
 address-family l2vpn evpn
  advertise ipv4 unicast
 exit-address-family
!
router bgp 65101 vrf BLUE
 bgp router-id 10.10.10.1
 !
 address-family ipv4 unicast
  redistribute connected
 exit-address-family
 !
 address-family l2vpn evpn
  advertise ipv4 unicast
 exit-address-family
cumulus@leaf01:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.1/32
    vxlan-local-tunnelip 10.10.10.1

auto eth0
iface eth0
    vrf mgmt
    address 192.168.200.11/24

auto mgmt
iface mgmt
  vrf-table auto
  address 127.0.0.1/8
  address ::1/128

auto RED
iface RED
  vrf-table auto

auto BLUE
iface BLUE
  vrf-table auto

auto bridge
iface bridge
    bridge-ports bond1 bond2 bond3
    bridge-ports vni10 vni20 vni30 vniRED vniBLUE 
    bridge-vids 10 20 30
    bridge-vlan-aware yes

auto vni10
iface vni10
    bridge-access 10
    vxlan-id 10
    mstpctl-portbpdufilter yes
    mstpctl-bpduguard yes
    bridge-learning off
    bridge-arp-nd-suppress on
    vxlan-mcastgrp 224.0.0.10

auto vni20
iface vni20
    bridge-access 20
    vxlan-id 20
    mstpctl-portbpdufilter yes
    mstpctl-bpduguard yes
    bridge-learning off
    bridge-arp-nd-suppress on
    vxlan-mcastgrp 224.0.0.20

auto vni30
iface vni30
    bridge-access 30
    vxlan-id 30
    mstpctl-portbpdufilter yes
    mstpctl-bpduguard yes
    bridge-learning off
    bridge-arp-nd-suppress on
    vxlan-mcastgrp 224.0.0.30

auto vniRED
iface vniRED
    bridge-access 4001
    vxlan-id 4001
    mstpctl-portbpdufilter yes
    mstpctl-bpduguard yes
    bridge-learning off
    bridge-arp-nd-suppress on

auto vniBLUE
iface vniBLUE
    bridge-access 4002
    vxlan-id 4002
    mstpctl-portbpdufilter yes
    mstpctl-bpduguard yes
    bridge-learning off
    bridge-arp-nd-suppress on

auto vlan10
iface vlan10
    address 10.1.10.2/24
    address-virtual 00:00:00:00:00:10 10.1.10.1/24
    vrf RED
    vlan-raw-device bridge
    vlan-id 10

auto vlan20
iface vlan20
    address 10.1.20.2/24
    address-virtual 00:00:00:00:00:20 10.1.20.1/24
    vrf RED
    vlan-raw-device bridge
    vlan-id 20

auto vlan30
iface vlan30
    address 10.1.30.2/24
    address-virtual 00:00:00:00:00:30 10.1.30.1/24
    vrf BLUE
    vlan-raw-device bridge
    vlan-id 30

auto vlan4001
iface vlan4001
    hwaddress 44:38:39:BE:EF:AA
    vrf RED
    vlan-raw-device bridge
    vlan-id 4001

auto vlan4002
iface vlan4002
    hwaddress 44:38:39:BE:EF:AA
    vrf BLUE
    vlan-raw-device bridge
    vlan-id 4002

auto swp51
iface swp51
    alias to spine

auto swp52
iface swp52
    alias to spine

auto swp53
iface swp53
    alias to spine

auto swp54
iface swp54
    alias to spine

auto swp1
iface swp1
    alias bond member of bond1

auto bond1
iface bond1
    bond-slaves swp1 
    bridge-access 10
    mtu 9000
    bond-lacp-bypass-allow yes
    mstpctl-bpduguard yes
    mstpctl-portadminedge yes

auto swp2
iface swp2
    alias bond member of bond2

auto bond2
iface bond2
    bond-slaves swp2 
    bridge-access 20
    mtu 9000
    bond-lacp-bypass-allow yes
    mstpctl-bpduguard yes
    mstpctl-portadminedge yes

auto swp3
iface swp3
    alias bond member of bond3

auto bond3
iface bond3
    bond-slaves swp3 
    bridge-access 30
    mtu 9000
    bond-lacp-bypass-allow yes
    mstpctl-bpduguard yes
    mstpctl-portadminedge yes

Configure EVPN-PIM in VXLAN Active-active Mode

To configure EVPN-PIM with an MLAG pair in VXLAN active-active mode, enable PIM on the peer link subinterface of each MLAG peer switch (in addition to the configuration described in Configure Multicast VXLAN Tunnels, above).

Run the nv set interface <peerlink> router pim command. For example:

cumulus@switch:~$ nv set interface peerlink.4094 router pim
cumulus@switch:~$ nv config apply

If you use vtysh instead of NVUE, run the following commands in the vtysh shell:

cumulus@switch:~$ sudo vtysh

switch# configure terminal
switch(config)# interface peerlink.4094
switch(config-if)# ip pim
switch(config-if)# end
switch# write memory
switch# exit
cumulus@switch:~$

Troubleshooting EVPN

This section provides various commands to help you examine your EVPN configuration and provides troubleshooting tips.

General Commands

You can use various NVUE or Linux commands to examine interfaces, VLAN mappings and the bridge MAC forwarding database known to the Linux kernel. You can also use these commands to examine the neighbor cache and the routing table (for the underlay or for a specific tenant VRF). Some of the key commands are shown in the examples below.

The sample output below shows the ip -d link show type vxlan command output for one VXLAN interface. Relevant parameters are the VNI value, the state, the local IP address for the VXLAN tunnel, the UDP port number (4789) and the bridge of which the interface is part (bridge in the example below). The output also shows that MAC learning is off on the VXLAN interface.

cumulus@leaf01:~$ ip -d link show type vxlan
14: vni10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9216 qdisc noqueue master bridge state UP mode DEFAULT group default qlen 1000
    link/ether 42:83:73:20:46:ba brd ff:ff:ff:ff:ff:ff promiscuity 1 minmtu 68 maxmtu 65535
    vxlan id 10 local 10.0.1.1 srcport 0 0 dstport 4789 nolearning ttl 64 ageing 300 udpcsum noudp6zerocsumtx noudp6zerocsumrx
    bridge_slave state forwarding priority 8 cost 100 hairpin off guard off root_block off fastleave off learning off flood on port_id 0x8005 port_no 0x5 designated_port 32773 designated_cost 0 designated_bridge 8000.76:ed:2a:8a:67:24 designated_root 8000.76:ed:2a:8a:67:24 hold_timer    0.00 message_age_timer    0.00 forward_delay_timer    0.00 topology_change_ack 0 config_pending 0 proxy_arp off proxy_arp_wifi off mcast_router 1 mcast_fast_leave off mcast_flood on neigh_suppress on group_fwd_mask 0x0 group_fwd_mask_str 0x0 group_fwd_maskhi 0x0 group_fwd_maskhi_str 0x0 vlan_tunnel off isolated off addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
...
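If you want to check these parameters from a script, the VXLAN detail line is regular enough to parse. The following is a minimal Python sketch, not part of Cumulus Linux; the function name parse_vxlan_detail is hypothetical, and the sample line is copied from the output above.

```python
import re

# Sample line copied from the `ip -d link show type vxlan` output above.
VXLAN_LINE = (
    "vxlan id 10 local 10.0.1.1 srcport 0 0 dstport 4789 nolearning "
    "ttl 64 ageing 300 udpcsum noudp6zerocsumtx noudp6zerocsumrx"
)


def parse_vxlan_detail(line):
    """Extract the VNI, local tunnel IP, UDP port, and learning state."""
    match = re.search(r"vxlan id (\d+) local (\S+) .*dstport (\d+)", line)
    if match is None:
        raise ValueError("not a vxlan detail line")
    return {
        "vni": int(match.group(1)),
        "local_ip": match.group(2),
        "dstport": int(match.group(3)),
        # The `nolearning` token appears when MAC learning is off.
        "learning": "nolearning" not in line.split(),
    }


print(parse_vxlan_detail(VXLAN_LINE))
# {'vni': 10, 'local_ip': '10.0.1.1', 'dstport': 4789, 'learning': False}
```
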

The following shows example output for the nv show bridge domain <domain> mac-table command:

cumulus@leaf01:mgmt:~$ nv show bridge domain br_default mac-table
entry-id  MAC address        vlan  interface   remote-dst   src-vni  entry-type    last-update  age    
--------  -----------------  ----  ----------  -----------  -------  ------------  -----------  -------
1         48:b0:2d:fd:d3:bf  10    vxlan48                           extern_learn  8:06:02      8:06:02
2         48:b0:2d:4e:1c:fe  20    vxlan48                           extern_learn  8:06:02      8:06:02
3         48:b0:2d:a7:4d:ce  30    vxlan48                           extern_learn  8:06:02      8:06:02
4         48:b0:2d:53:d2:34  20    vxlan48                           extern_learn  8:06:30      8:06:30
5         44:38:39:be:ef:bb  4063  vxlan48                           extern_learn  8:06:30      8:06:30
6         48:b0:2d:2d:5f:b3  30    vxlan48                           extern_learn  8:06:32      8:06:32
7         44:38:39:be:ef:bb  4006  vxlan48                           extern_learn  8:06:32      8:06:32
8         48:b0:2d:93:a1:3e  10    vxlan48                           extern_learn  8:06:35      8:06:35
9         44:38:39:22:01:74  4006  vxlan48                           extern_learn  8:06:38      8:06:38
10        44:38:39:22:01:74  4063  vxlan48                           extern_learn  8:06:38      8:06:38
11        44:38:39:22:01:7c  4006  vxlan48                           extern_learn  8:06:39      8:06:39
12        44:38:39:22:01:7c  4063  vxlan48                           extern_learn  8:06:39      8:06:39
13        44:38:39:22:01:8a  30    vxlan48                           extern_learn  8:06:42      8:06:42
14        44:38:39:22:01:8a  20    vxlan48                           extern_learn  8:06:42      8:06:42
15        44:38:39:22:01:8a  10    vxlan48                           extern_learn  8:06:42      8:04:05
16        44:38:39:22:01:84  10    vxlan48                           extern_learn  8:06:43      8:06:43
17        44:38:39:22:01:84  30    vxlan48                           extern_learn  8:06:43      8:06:15
18        44:38:39:22:01:84  20    vxlan48                           extern_learn  8:06:43      8:06:43
19        44:38:39:22:01:8a  4006  vxlan48                           extern_learn  8:06:43      8:06:43
20        44:38:39:22:01:8a  4063  vxlan48                           extern_learn  8:06:43      8:06:43
21        44:38:39:22:01:84  4063  vxlan48                           extern_learn  8:06:43      8:06:43
22        44:38:39:22:01:84  4006  vxlan48                           extern_learn  8:06:43      8:06:43
23        44:38:39:22:01:78  4063  vxlan48                           extern_learn  8:06:43      8:06:43
24        44:38:39:22:01:78  4006  vxlan48                           extern_learn  8:06:43      8:06:43
25        02:91:8d:cf:03:b2        vxlan48                           permanent     8:06:56      8:06:56
26        00:00:00:00:00:00        vxlan48     10.0.1.34    30       permanent     8:06:43      0:28:22
27        44:38:39:22:01:78        vxlan48     10.10.10.2   4001     extern_learn  8:06:43      8:06:43
28        44:38:39:22:01:8a        vxlan48     10.0.1.34    30       static        8:06:43      8:06:43
29        48:b0:2d:fd:d3:bf        vxlan48     10.0.1.34    10       extern_learn  8:06:02      8:06:02
30        44:38:39:22:01:84        vxlan48     10.0.1.34    10       extern_learn  8:06:43      8:06:43
31        48:b0:2d:2d:5f:b3        vxlan48     10.0.1.34    30       extern_learn  8:06:32      8:06:32
...

The following example shows the nv show interface neighbor command output:

cumulus@leaf01:mgmt:~$ nv show interface neighbor
Interface      IP/IPV6                    LLADR(MAC)         State      Flag      
-------------  -------------------------  -----------------  ---------  ----------
eth0           192.168.200.1              48:b0:2d:82:3b:b3  reachable            
               192.168.200.251            48:b0:2d:00:00:01  stale                
               fe80::4ab0:2dff:fe00:1     48:b0:2d:00:00:01  reachable  router    
peerlink.4094  169.254.0.1                48:b0:2d:52:11:90  permanent            
               fe80::4ab0:2dff:fe52:1190  48:b0:2d:52:11:90  reachable  router    
swp51          169.254.0.1                48:b0:2d:b8:2b:bc  permanent            
               fe80::4ab0:2dff:feb8:2bbc  48:b0:2d:b8:2b:bc  reachable  router    
swp52          169.254.0.1                48:b0:2d:e1:08:f7  permanent            
               fe80::4ab0:2dff:fee1:8f7   48:b0:2d:e1:08:f7  reachable  router    
swp53          169.254.0.1                48:b0:2d:c0:71:8b  permanent            
               fe80::4ab0:2dff:fec0:718b  48:b0:2d:c0:71:8b  reachable  router    
swp54          169.254.0.1                48:b0:2d:18:f4:68  permanent            
               fe80::4ab0:2dff:fe18:f468  48:b0:2d:18:f4:68  reachable  router    
vlan10         10.1.10.3                  44:38:39:22:01:78  permanent            
               fe80::4638:39ff:fe22:178   44:38:39:22:01:78  permanent            
vlan20         10.1.20.3                  44:38:39:22:01:78  permanent            
               fe80::4638:39ff:fe22:178   44:38:39:22:01:78  permanent            
vlan30         10.1.30.3                  44:38:39:22:01:78  permanent            
               fe80::4638:39ff:fe22:178   44:38:39:22:01:78  permanent            
vlan4024_l3    10.10.10.63                44:38:39:22:01:74  noarp      |ext_learn
               10.10.10.64                44:38:39:22:01:7c  noarp      |ext_learn
               10.10.10.4                 44:38:39:22:01:8a  noarp      |ext_learn
               10.10.10.3                 44:38:39:22:01:84  noarp      |ext_learn
               10.10.10.2                 44:38:39:22:01:78  noarp      |ext_learn
               fe80::4638:39ff:fe22:178   44:38:39:22:01:78  permanent            
vlan4036_l3    10.10.10.63                44:38:39:22:01:74  noarp      |ext_learn
               10.10.10.64                44:38:39:22:01:7c  noarp      |ext_learn
               10.10.10.4                 44:38:39:22:01:8a  noarp      |ext_learn
               10.10.10.3                 44:38:39:22:01:84  noarp      |ext_learn
               10.10.10.2                 44:38:39:22:01:78  noarp      |ext_learn
               fe80::4638:39ff:fe22:178   44:38:39:22:01:78  permanent            
vxlan48        10.10.10.63                44:38:39:22:01:74  noarp      |ext_learn
               10.10.10.4                 44:38:39:22:01:8a  noarp      |ext_learn
               10.10.10.3                 44:38:39:22:01:84  noarp      |ext_learn
               10.10.10.2                 44:38:39:22:01:78  noarp      |ext_learn
               10.10.10.64                44:38:39:22:01:7c  noarp      |ext_learn
...

The following command shows the VLAN to VNI mapping for all bridges:

cumulus@switch:mgmt:~$ nv show bridge vlan-vni-map
br_default vlan-vni-offset: 0         
      VLAN        VNI         
      ----        -------     
      10          10          
      20          20          
      30          30

The following command shows the VLAN to VNI mapping for a specific bridge:

cumulus@switch:mgmt:~$ nv show bridge domain br_default vlan-vni-map
vlan-vni-offset: 0         
      VLAN        VNI         
      ----        -------     
      10          10          
      20          20          
      30          30   
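To compare the mapping against what you expect, you can turn the table into a dictionary. The following Python sketch is an illustration only; parse_vlan_vni_map is a hypothetical helper, and it assumes the two-column layout shown in the output above.

```python
# Sample text copied from the per-bridge output above.
SAMPLE = """\
vlan-vni-offset: 0
      VLAN        VNI
      ----        -------
      10          10
      20          20
      30          30
"""


def parse_vlan_vni_map(text):
    """Return {vlan: vni} for every data row in the table."""
    mapping = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows are exactly two integers; the offset line, the header,
        # and the dashed rule do not match and are skipped.
        if len(fields) == 2 and fields[0].isdigit() and fields[1].isdigit():
            mapping[int(fields[0])] = int(fields[1])
    return mapping


print(parse_vlan_vni_map(SAMPLE))
# {10: 10, 20: 20, 30: 30}
```
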

General BGP Commands

If you use BGP for the underlay routing, run the vtysh show bgp summary command to view a summary of the layer 3 fabric connectivity:

cumulus@leaf01:mgmt:~$ sudo vtysh
...
leaf01# show bgp summary
IPv4 Unicast Summary
BGP router identifier 10.10.10.1, local AS number 65101 vrf-id 0
BGP table version 13
RIB entries 25, using 4800 bytes of memory
Peers 5, using 106 KiB of memory
Peer groups 1, using 64 bytes of memory

Neighbor              V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd
spine01(swp51)        4      65199       814       805        0    0    0 00:37:34            7
spine02(swp52)        4      65199       814       805        0    0    0 00:37:34            7
spine03(swp53)        4      65199       814       805        0    0    0 00:37:34            7
spine04(swp54)        4      65199       814       805        0    0    0 00:37:34            7
leaf02(peerlink.4094) 4      65101       766       768        0    0    0 00:37:35           12

Total number of neighbors 5


show bgp ipv6 unicast summary
=============================
% No BGP neighbors found


show bgp l2vpn evpn summary
===========================
BGP router identifier 10.10.10.1, local AS number 65101 vrf-id 0
BGP table version 0
RIB entries 23, using 4416 bytes of memory
Peers 4, using 85 KiB of memory
Peer groups 1, using 64 bytes of memory

Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd
spine01(swp51)  4      65199       814       805        0    0    0 00:37:35           34
spine02(swp52)  4      65199       814       805        0    0    0 00:37:35           34
spine03(swp53)  4      65199       814       805        0    0    0 00:37:35           34
spine04(swp54)  4      65199       814       805        0    0    0 00:37:35           34

Total number of neighbors 4

Run the vtysh show ip route command to examine the underlay routing and determine how the switch reaches remote VTEPs. This command shows the global (underlay) routing table. Use the vrf keyword to see routes for a specific VRF where the hosts reside.

The following example shows output from a leaf switch:

cumulus@leaf01:mgmt:~$ sudo vtysh
leaf01# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, r - rejected route

C>* 10.0.1.1/32 is directly connected, lo, 00:40:02
B>* 10.0.1.2/32 [20/0] via fe80::2ef3:45ff:fef4:6f5f, swp53, weight 1, 00:40:04
  *                    via fe80::ae56:f0ff:fef3:590c, swp54, weight 1, 00:40:04
  *                    via fe80::c299:6bff:fec0:e1ca, swp52, weight 1, 00:40:04
  *                    via fe80::f208:5fff:fe12:cc8c, swp51, weight 1, 00:40:04
B>* 10.0.1.254/32 [20/0] via fe80::2ef3:45ff:fef4:6f5f, swp53, weight 1, 00:35:18
  *                      via fe80::ae56:f0ff:fef3:590c, swp54, weight 1, 00:35:18
  *                      via fe80::c299:6bff:fec0:e1ca, swp52, weight 1, 00:35:18
  *                      via fe80::f208:5fff:fe12:cc8c, swp51, weight 1, 00:35:18
C>* 10.10.10.1/32 is directly connected, lo, 00:42:58
B>* 10.10.10.2/32 [200/0] via fe80::c28a:e6ff:fe03:96d0, peerlink.4094, weight 1, 00:42:56
B>* 10.10.10.3/32 [20/0] via fe80::2ef3:45ff:fef4:6f5f, swp53, weight 1, 00:42:55
  *                      via fe80::ae56:f0ff:fef3:590c, swp54, weight 1, 00:42:55
  *                      via fe80::c299:6bff:fec0:e1ca, swp52, weight 1, 00:42:55
  *                      via fe80::f208:5fff:fe12:cc8c, swp51, weight 1, 00:42:55
B>* 10.10.10.4/32 [20/0] via fe80::2ef3:45ff:fef4:6f5f, swp53, weight 1, 00:42:55
  *                      via fe80::ae56:f0ff:fef3:590c, swp54, weight 1, 00:42:55
  *                      via fe80::c299:6bff:fec0:e1ca, swp52, weight 1, 00:42:55
  *                      via fe80::f208:5fff:fe12:cc8c, swp51, weight 1, 00:42:55
B>* 10.10.10.63/32 [20/0] via fe80::2ef3:45ff:fef4:6f5f, swp53, weight 1, 00:42:55
  *                       via fe80::ae56:f0ff:fef3:590c, swp54, weight 1, 00:42:55
  *                       via fe80::c299:6bff:fec0:e1ca, swp52, weight 1, 00:42:55
  *                       via fe80::f208:5fff:fe12:cc8c, swp51, weight 1, 00:42:55
B>* 10.10.10.64/32 [20/0] via fe80::2ef3:45ff:fef4:6f5f, swp53, weight 1, 00:38:07
  *                       via fe80::ae56:f0ff:fef3:590c, swp54, weight 1, 00:38:07
  *                       via fe80::c299:6bff:fec0:e1ca, swp52, weight 1, 00:38:07
  *                       via fe80::f208:5fff:fe12:cc8c, swp51, weight 1, 00:38:07
B>* 10.10.10.101/32 [20/0] via fe80::f208:5fff:fe12:cc8c, swp51, weight 1, 00:42:56
B>* 10.10.10.102/32 [20/0] via fe80::c299:6bff:fec0:e1ca, swp52, weight 1, 00:42:56
B>* 10.10.10.103/32 [20/0] via fe80::2ef3:45ff:fef4:6f5f, swp53, weight 1, 00:42:56
B>* 10.10.10.104/32 [20/0] via fe80::ae56:f0ff:fef3:590c, swp54, weight 1, 00:42:56
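To see at a glance how many equal-cost paths lead to each remote VTEP, you can collect the outgoing interfaces per prefix. The following Python sketch assumes the vtysh text layout shown above (a route line followed by indented continuation lines); next_hop_interfaces is a hypothetical helper, and the sample is an excerpt of the output above.

```python
import re

# Excerpt of the `show ip route` output above.
ROUTES = """\
C>* 10.0.1.1/32 is directly connected, lo, 00:40:02
B>* 10.0.1.2/32 [20/0] via fe80::2ef3:45ff:fef4:6f5f, swp53, weight 1, 00:40:04
  *                    via fe80::ae56:f0ff:fef3:590c, swp54, weight 1, 00:40:04
  *                    via fe80::c299:6bff:fec0:e1ca, swp52, weight 1, 00:40:04
  *                    via fe80::f208:5fff:fe12:cc8c, swp51, weight 1, 00:40:04
B>* 10.10.10.2/32 [200/0] via fe80::c28a:e6ff:fe03:96d0, peerlink.4094, weight 1, 00:42:56
"""


def next_hop_interfaces(text):
    """Return {prefix: [outgoing interface, ...]} for each IPv4 route."""
    paths = {}
    prefix = None
    for line in text.splitlines():
        # A new route starts with the code column (for example B>*) and a prefix.
        start = re.match(r"\S+\s+(\d+\.\d+\.\d+\.\d+/\d+)\s", line)
        if start:
            prefix = start.group(1)
            paths[prefix] = []
        # Both route lines and continuation lines carry "via <gateway>, <interface>,".
        via = re.search(r"via (\S+), (\S+),", line)
        if via and prefix is not None:
            paths[prefix].append(via.group(2))
    return paths


for prefix, interfaces in next_hop_interfaces(ROUTES).items():
    print(prefix, len(interfaces), interfaces)
```

In this excerpt, the remote VTEP address 10.0.1.2/32 resolves over four uplinks, while 10.10.10.2/32 resolves only over the peer link.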

Show EVPN Address Family Peers

Run the vtysh show bgp l2vpn evpn summary command to see the BGP peers participating in the EVPN address family and their states. The following example output from a leaf switch shows eBGP peering with four spine switches to exchange EVPN routes; all peering sessions are in the established state.

cumulus@leaf01:mgmt:~$ sudo vtysh
leaf01# show bgp l2vpn evpn summary
BGP router identifier 10.10.10.1, local AS number 65101 vrf-id 0
BGP table version 0
RIB entries 23, using 4416 bytes of memory
Peers 4, using 85 KiB of memory
Peer groups 1, using 64 bytes of memory

Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd
spine01(swp51)  4      65199       958       949        0    0    0 00:44:46           34
spine02(swp52)  4      65199       958       949        0    0    0 00:44:46           34
spine03(swp53)  4      65199       958       949        0    0    0 00:44:46           34
spine04(swp54)  4      65199       958       949        0    0    0 00:44:46           34

Total number of neighbors 4
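In FRR summary output, the State/PfxRcd column shows a prefix count for an established session and a state name (such as Idle or Active) otherwise. The following Python sketch uses that convention to list peers that are not established. The helper names are hypothetical, and the sample is abbreviated from the output above.

```python
# Abbreviated from the `show bgp l2vpn evpn summary` output above.
SUMMARY = """\
Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd
spine01(swp51)  4      65199       958       949        0    0    0 00:44:46           34
spine02(swp52)  4      65199       958       949        0    0    0 00:44:46           34

Total number of neighbors 2
"""


def peer_states(summary_text):
    """Return {neighbor: State/PfxRcd value} from a BGP summary table."""
    peers = {}
    state_col = None
    for line in summary_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "Neighbor":
            # Locate the column by name instead of assuming a fixed position.
            state_col = fields.index("State/PfxRcd")
            continue
        # Skip everything before the header and any line too short to be a peer row.
        if state_col is None or len(fields) <= state_col:
            continue
        peers[fields[0]] = fields[state_col]
    return peers


def not_established(peers):
    """A numeric value is a prefix count; anything else is a session state."""
    return sorted(name for name, state in peers.items() if not state.isdigit())


print(not_established(peer_states(SUMMARY)))
# []
```
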

Show EVPN VNIs

To display the configured VNIs on a network device participating in BGP EVPN, run the vtysh show bgp l2vpn evpn vni command. This command is only relevant on a VTEP. For symmetric routing, this command displays the special layer 3 VNIs for each tenant VRF.

cumulus@leaf01:mgmt:~$ sudo vtysh
leaf01# show bgp l2vpn evpn vni
Advertise Gateway Macip: Disabled
Advertise SVI Macip: Disabled
Advertise All VNI flag: Enabled
BUM flooding: Head-end replication
Number of L2 VNIs: 3
Number of L3 VNIs: 2
Flags: * - Kernel
  VNI        Type RD                    Import RT                 Export RT                 Tenant VRF
* 20         L2   10.10.10.1:4          65101:20                  65101:20                 RED
* 30         L2   10.10.10.1:6          65101:30                  65101:30                 BLUE
* 10         L2   10.10.10.1:3          65101:10                  65101:10                 RED
* 4002       L3   10.1.30.2:2           65101:4002                65101:4002               BLUE
* 4001       L3   10.1.20.2:5           65101:4001                65101:4001               RED
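For symmetric routing, a quick sanity check is that every tenant VRF has its layer 3 VNI alongside its layer 2 VNIs. The following Python sketch groups the table rows by tenant VRF; vnis_by_vrf is a hypothetical helper, and the parsing assumes the column layout shown above.

```python
# Table copied from the `show bgp l2vpn evpn vni` output above.
VNI_TABLE = """\
  VNI        Type RD                    Import RT                 Export RT                 Tenant VRF
* 20         L2   10.10.10.1:4          65101:20                  65101:20                 RED
* 30         L2   10.10.10.1:6          65101:30                  65101:30                 BLUE
* 10         L2   10.10.10.1:3          65101:10                  65101:10                 RED
* 4002       L3   10.1.30.2:2           65101:4002                65101:4002               BLUE
* 4001       L3   10.1.20.2:5           65101:4001                65101:4001               RED
"""


def vnis_by_vrf(text):
    """Return {vrf: {"L2": [vni, ...], "L3": [vni, ...]}}."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Drop the optional "*" flag that marks a VNI present in the kernel.
        if fields and fields[0] == "*":
            fields = fields[1:]
        # Data rows have six columns and start with a numeric VNI.
        if len(fields) != 6 or not fields[0].isdigit():
            continue
        vni, vni_type, _rd, _import_rt, _export_rt, vrf = fields
        result.setdefault(vrf, {}).setdefault(vni_type, []).append(int(vni))
    return result


for vrf, vnis in vnis_by_vrf(VNI_TABLE).items():
    print(vrf, vnis)
```
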

Run the NVUE nv show evpn vni command or the vtysh show evpn vni command to see a summary of all VNIs and the number of MAC or ARP entries associated with each VNI.

cumulus@leaf01:mgmt:~$ nv show evpn vni 
NumMacs - Number of MACs (local and remote) known for this VNI, NumArps - Number
of ARPs (IPv4 and IPv6, local and remote) known for this VNI                    
, NumRemVteps - Number of Remote Vteps, Bridge - Bridge to which the vni        
belongs, Vlan - VLAN assoicated to MAC                                          
VNI  NumMacs  NumArps  NumRemVteps  TenantVrf  Bridge      Vlan
---  -------  -------  -----------  ---------  ----------  ----
10   7        4        1            RED        br_default  10  
20   7        4        1            RED        br_default  20  
30   7        4        1            BLUE       br_default  30  

Run the NVUE nv show evpn vni <vni> command or the vtysh show evpn vni <vni> command to examine EVPN information for a specific VNI in detail. The following example output shows details for the layer 2 VNI 10. The output shows the remote VTEPs that contain that VNI.

cumulus@leaf01:mgmt:~$ nv show evpn vni 10
-----------------  -----------  -------
                   operational  applied
-----------------  -----------  -------
route-advertise                        
  svi-ip           off                 
  default-gateway  off                 
[remote-vtep]      10.0.1.34           
vlan               10                  
bridge-domain      br_default          
tenant-vrf         RED                 
vxlan-interface    vxlan48             
mac-count          7                   
host-count         4                   
remote-vtep-count  1                   
local-vtep         10.0.1.12

To show VNI BGP information, run the NVUE nv show evpn vni <id> bgp-info and nv show vrf <vrf_id> evpn bgp-info commands, or the vtysh show bgp l2vpn evpn vni <vni> command.

cumulus@border01:mgmt:~$ nv show vrf RED evpn bgp-info
                       operational      
---------------------  -----------------
rd                     10.10.10.1:3     
local-vtep             10.0.1.12        
router-mac             44:38:39:be:ef:aa
system-mac             44:38:39:22:01:7a
system-ip              10.10.10.1       
[import-route-target]  65101:4001       
[export-route-target]  65101:4001

Examine Local and Remote MAC Addresses for a VNI

Run the NVUE nv show evpn vni <vni> mac command or the vtysh show evpn mac vni <vni> command to examine all local and remote MAC addresses for a VNI. This command is only relevant for a layer 2 VNI:

cumulus@leaf01:mgmt:~$ nv show evpn vni 10 mac                                                                               
LocMobSeq - local mobility sequence, RemMobSeq - remote mobility sequence,      
RemoteVtep - Remote Vtep address, Esi - Remote Esi                              
MAC address        Type    LocMobSeq  RemMobSeq  Interface  RemoteVtep  Esi
-----------------  ------  ---------  ---------  ---------  ----------  ---
44:38:39:22:01:8a  remote  0          0                     10.0.1.34      
44:38:39:22:01:78  local   0          0          peerlink                  
44:38:39:22:01:84  remote  0          0                     10.0.1.34      
48:b0:2d:5c:8a:ee  local   0          0          bond1                     
48:b0:2d:29:c0:bb  remote  0          0                     10.0.1.34      
48:b0:2d:c9:f8:14  remote  0          0                     10.0.1.34      
48:b0:2d:fa:72:e7  local   0          0          bond      
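When a VNI has many entries, it can help to separate local MAC addresses (by interface) from remote ones (by VTEP). The following Python sketch is an illustration only. It assumes the layout shown above, where the fifth whitespace-separated field is the interface for a local entry and the remote VTEP for a remote entry; entries that carry an ESI can differ. The helper name split_macs is hypothetical.

```python
import re

# Abbreviated from the `nv show evpn vni 10 mac` output above.
MAC_TABLE = """\
MAC address        Type    LocMobSeq  RemMobSeq  Interface  RemoteVtep  Esi
-----------------  ------  ---------  ---------  ---------  ----------  ---
44:38:39:22:01:8a  remote  0          0                     10.0.1.34
44:38:39:22:01:78  local   0          0          peerlink
48:b0:2d:5c:8a:ee  local   0          0          bond1
48:b0:2d:29:c0:bb  remote  0          0                     10.0.1.34
"""

MAC_RE = re.compile(r"^([0-9a-f]{2}:){5}[0-9a-f]{2}$")


def split_macs(text):
    """Return ({local mac: interface}, {remote vtep: [mac, ...]})."""
    local, remote_by_vtep = {}, {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, the dashed rule, and any row without a location field.
        if len(fields) < 5 or not MAC_RE.match(fields[0]):
            continue
        mac, mac_type, location = fields[0], fields[1], fields[4]
        if mac_type == "local":
            local[mac] = location
        elif mac_type == "remote":
            remote_by_vtep.setdefault(location, []).append(mac)
    return local, remote_by_vtep


local, remote_by_vtep = split_macs(MAC_TABLE)
print(local)
print(remote_by_vtep)
```
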

Run the vtysh show evpn mac vni all command to examine MAC addresses for all VNIs.

You can examine the details for a specific MAC address or query all remote MAC addresses behind a specific VTEP:

cumulus@leaf01:mgmt:~$ sudo vtysh
...
leaf01# show evpn mac vni 10 mac 94:8e:1c:0d:77:93
MAC: 94:8e:1c:0d:77:93
 Remote VTEP: 10.0.1.2
 Sync-info: neigh#: 0
 Local Seq: 0 Remote Seq: 0
 Neighbors:
    No Neighbors

leaf01# show evpn mac vni 20 vtep 10.0.1.2
VNI 20

MAC               Type   FlagsIntf/Remote ES/VTEP            VLAN  Seq #'s
12:15:9a:9c:f2:e1 remote       10.0.1.2                             1/0
50:88:b2:3c:08:f9 remote       10.0.1.2                             0/0
f8:4f:db:ef:be:8b remote       10.0.1.2                             0/0
c8:7d:bc:96:71:f3 remote       10.0.1.2                             0/0

Examine Local and Remote Neighbors for a VNI

Run the vtysh show evpn arp-cache vni <vni> command to examine all local and remote neighbors (ARP entries) for a VNI. This command is only relevant for a layer 2 VNI and the output shows both IPv4 and IPv6 neighbor entries:

cumulus@leaf01:mgmt:~$ sudo vtysh
...
leaf01# show evpn arp-cache vni 10
Number of ARPs (local and remote) known for this VNI: 6
Flags: I=local-inactive, P=peer-active, X=peer-proxy
Neighbor                  Type   Flags State    MAC               Remote ES/VTEP                 Seq #'s
10.1.10.2                 local        active   76:ed:2a:8a:67:24                                0/0
fe80::968e:1cff:fe0d:7793 remote       active   68:0f:31:ae:3d:7a 10.0.1.2                       0/0
10.1.10.101               local        active   26:76:e6:93:32:78                                0/0
fe80::9465:45ff:fe6d:4890 local        active   26:76:e6:93:32:78                                0/0
10.1.10.104               remote       active   68:0f:31:ae:3d:7a 10.0.1.2                       0/0
fe80::74ed:2aff:fe8a:6724 local        active   76:ed:2a:8a:67:24                                0/0
...

Run the vtysh show evpn arp-cache vni all command to examine neighbor entries for all VNIs.

Examine Remote Router MAC Addresses

To examine the router MAC addresses corresponding to all remote VTEPs for symmetric routing, run the NVUE nv show vrf <vrf> evpn remote-router-mac command or the vtysh show evpn rmac vni all command. This command is only relevant for a layer 3 VNI:

cumulus@border01:mgmt:~$ nv show vrf RED evpn remote-router-mac
MAC address        remote-vtep
-----------------  -----------
44:38:39:22:01:7a  10.10.10.1 
44:38:39:22:01:7c  10.10.10.64
44:38:39:22:01:8a  10.10.10.4 
44:38:39:22:01:78  10.10.10.2 
44:38:39:22:01:84  10.10.10.3 
44:38:39:be:ef:aa  10.0.1.12

Examine Gateway Next Hops

To examine the gateway next hops for symmetric routing, run the NVUE nv show vrf <vrf> evpn nexthop-vtep command or the vtysh show evpn next-hops vni all command. This command is only relevant for a layer 3 VNI. The gateway next hop IP addresses correspond to the remote VTEP IP addresses. Cumulus Linux installs the remote host and prefix routes using these next hops.

cumulus@border01:mgmt:~$ nv show vrf RED evpn nexthop-vtep
Nexthop      router-mac       
-----------  -----------------
10.0.1.12    44:38:39:be:ef:aa
10.10.10.1   44:38:39:22:01:7a
10.10.10.2   44:38:39:22:01:78
10.10.10.3   44:38:39:22:01:84
10.10.10.4   44:38:39:22:01:8a
10.10.10.64  44:38:39:22:01:7c
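Because the gateway next hops correspond to the remote VTEP IP addresses, the remote router MAC table and the next hop table should describe the same VTEP-to-MAC pairs. The following Python sketch cross-checks the two. The helper two_column_rows is hypothetical, and both samples are abbreviated from the border01 outputs above.

```python
# Abbreviated from `nv show vrf RED evpn remote-router-mac`.
REMOTE_ROUTER_MAC = """\
MAC address        remote-vtep
-----------------  -----------
44:38:39:22:01:7a  10.10.10.1
44:38:39:22:01:78  10.10.10.2
44:38:39:be:ef:aa  10.0.1.12
"""

# Abbreviated from `nv show vrf RED evpn nexthop-vtep`.
NEXTHOP_VTEP = """\
Nexthop      router-mac
-----------  -----------------
10.0.1.12    44:38:39:be:ef:aa
10.10.10.1   44:38:39:22:01:7a
10.10.10.2   44:38:39:22:01:78
"""


def two_column_rows(text):
    """Return the data rows, skipping the header and the dashed rule."""
    rows = []
    for line in text.splitlines()[2:]:
        fields = line.split()
        if len(fields) == 2:
            rows.append(tuple(fields))
    return rows


# Key both tables by VTEP IP address so that they are directly comparable.
rmac_by_vtep = {vtep: mac for mac, vtep in two_column_rows(REMOTE_ROUTER_MAC)}
rmac_by_nexthop = dict(two_column_rows(NEXTHOP_VTEP))

print(rmac_by_vtep == rmac_by_nexthop)
# True
```
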

To show the router MAC address for a specific next hop, run the NVUE nv show vrf <vrf> evpn nexthop-vtep <ip-address> command:

cumulus@leaf01:mgmt:~$ nv show vrf RED evpn nexthop-vtep 10.10.10.2
            operational       
----------  -----------------
router-mac  44:38:39:22:01:78

To show the remote host and prefix routes through a specific next hop, run the vtysh show evpn next-hops vni <vni> ip <ip-address> command:

cumulus@leaf01:mgmt:~$ sudo vtysh
...
leaf01# show evpn next-hops vni 4001 ip 10.0.1.2
Ip: 10.0.1.2
  RMAC: 44:38:39:be:ef:bb
  Refcount: 2
  Prefixes:
    10.1.10.104/32
    10.1.20.105/32

To show the VTEP IP addresses for the next hop groups, run the nv show evpn l2-nhg vtep-ip command.

Show Access VLANs

To show access VLANs on the switch and their corresponding VNI, run the NVUE nv show evpn access-vlan-info command or the vtysh show evpn access-vlan command.

cumulus@border01:mgmt:~$ nv show evpn access-vlan-info
vlan
=======
    Id    MemberCnt  Vni  VniCnt  VxlanIntf  MemberIntf
    ----  ---------  ---  ------  ---------  ----------
    1     1                                  peerlink  
    10    2          10   1       vxlan48    bond1     
                                             peerlink  
    20    2          20   1       vxlan48    bond2     
                                             peerlink  
    30    2          30   1       vxlan48    bond3     
                                             peerlink  
    4006                  1       vxlan48              
    4063                  1       vxlan48    

You can drill down and show information about a specific VLAN with the nv show evpn access-vlan-info vlan <vlan> command.

Show the VRF Routing Table in FRR

Run the NVUE nv show vrf <vrf-id> router rib <address-family> route command or the vtysh show ip route vrf <vrf-name> command to examine the VRF routing table. Use this command for symmetric routing to verify that remote host and prefix routes are in the VRF routing table and point to the appropriate gateway next hop.

cumulus@leaf01:mgmt:~$ nv show vrf RED router rib ipv4 route
                                                                                
Flags - * - selected, q - queued, o - offloaded, i - installed, S - fib-        
selected, x - failed                                                            
                                                                                
Route           Protocol   Distance  Uptime                NHGId  Metric  Flags
--------------  ---------  --------  --------------------  -----  ------  -----
0.0.0.0/0       kernel     255       2024-10-25T14:02:23Z  21     8192    *Si  
10.1.10.0/24    connected  0         2024-10-25T14:02:33Z  100    1024    io   
                connected  0         2024-10-25T14:02:33Z  88     0       *Sio 
10.1.20.0/24    connected  0         2024-10-25T14:02:33Z  103    1024    io   
                connected  0         2024-10-25T14:02:33Z  92     0       *Sio 
10.1.20.105/32  bgp        20        2024-10-25T14:02:46Z  166    0       *Si  
10.1.30.0/24    bgp        20        2024-10-25T14:02:39Z  154    0       *Si

cumulus@leaf01:mgmt:~$ sudo vtysh
...
leaf01# show ip route vrf RED
show ip route vrf RED
======================
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, r - rejected route


VRF RED:
K>* 0.0.0.0/0 [255/8192] unreachable (ICMP unreachable), 00:53:46
C * 10.1.10.0/24 [0/1024] is directly connected, vlan10-v0, 00:53:46
C>* 10.1.10.0/24 is directly connected, vlan10, 00:53:46
B>* 10.1.10.104/32 [20/0] via 10.0.1.2, vlan4001 onlink, weight 1, 00:43:55
C * 10.1.20.0/24 [0/1024] is directly connected, vlan20-v0, 00:53:46
C>* 10.1.20.0/24 is directly connected, vlan20, 00:53:46
B>* 10.1.20.105/32 [20/0] via 10.0.1.2, vlan4001 onlink, weight 1, 00:20:07
...

In the output above, EVPN marks the next hops for these routes as onlink, meaning they are reachable over the specified SVI. The onlink flag is necessary because the SVI does not need to have an IP address. Even if the SVI has an IP address, the next hop is not on the same subnet because it is typically the IP address of the remote VTEP, which is part of the underlay IP network.
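A little address arithmetic illustrates the subnet point. The following Python sketch uses the standard ipaddress module to test whether the next hop from the vtysh output above (10.0.1.2) falls inside either connected tenant subnet in VRF RED. The helper name on_any_subnet is hypothetical.

```python
import ipaddress

# Next hop (remote VTEP) and connected subnets taken from the
# `show ip route vrf RED` output above.
NEXT_HOP = ipaddress.ip_address("10.0.1.2")
TENANT_SUBNETS = ["10.1.10.0/24", "10.1.20.0/24"]


def on_any_subnet(address, subnets):
    """Return True if the address belongs to at least one of the subnets."""
    return any(address in ipaddress.ip_network(subnet) for subnet in subnets)


print(on_any_subnet(NEXT_HOP, TENANT_SUBNETS))
# False
```

The next hop is an underlay address, so no connected route in the tenant VRF covers it; the onlink flag tells the kernel to treat it as directly reachable over the named interface anyway.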

Show the Global BGP EVPN Routing Table

Run the vtysh show bgp l2vpn evpn route command to display all EVPN routes, both local and remote. The output groups the routes by RD because they span VNIs and VRFs:

cumulus@leaf01:mgmt:~$ sudo vtysh
...
leaf01# show bgp l2vpn evpn route
BGP table version is 6, local router ID is 10.10.10.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-1 prefix: [1]:[ESI]:[EthTag]:[IPlen]:[VTEP-IP]
EVPN type-2 prefix: [2]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
EVPN type-4 prefix: [4]:[ESI]:[IPlen]:[OrigIP]
EVPN type-5 prefix: [5]:[EthTag]:[IPlen]:[IP]

   Network          Next Hop            Metric LocPrf Weight Path
                    Extended Community
Route Distinguisher: 10.10.10.1:3
*> [2]:[0]:[48]:[00:60:08:69:97:ef]
                    10.0.1.1                           32768 i
                    ET:8 RT:65101:10 RT:65101:4001 Rmac:44:38:39:be:ef:aa
*> [2]:[0]:[48]:[26:76:e6:93:32:78]
                    10.0.1.1                           32768 i
                    ET:8 RT:65101:10 RT:65101:4001 Rmac:44:38:39:be:ef:aa
*> [2]:[0]:[48]:[26:76:e6:93:32:78]:[32]:[10.1.10.101]
                    10.0.1.1                           32768 i
                    ET:8 RT:65101:10 RT:65101:4001 Rmac:44:38:39:be:ef:aa
*> [2]:[0]:[48]:[26:76:e6:93:32:78]:[128]:[fe80::9465:45ff:fe6d:4890]
                    10.0.1.1                           32768 i
                    ET:8 RT:65101:10
*> [2]:[0]:[48]:[c0:8a:e6:03:96:d0]
                    10.0.1.1                           32768 i
                    ET:8 RT:65101:10 RT:65101:4001 MM:0, sticky MAC Rmac:44:38:39:be:ef:aa
*> [3]:[0]:[32]:[10.0.1.1]
                    10.0.1.1                           32768 i
                    ET:8 RT:65101:10
Route Distinguisher: 10.10.10.1:4
*> [2]:[0]:[48]:[c0:8a:e6:03:96:d0]
                    10.0.1.1                           32768 i
                    ET:8 RT:65101:20 RT:65101:4001 MM:0, sticky MAC Rmac:44:38:39:be:ef:aa
*> [2]:[0]:[48]:[cc:6e:fa:8d:ff:92]
                    10.0.1.1                           32768 i
                    ET:8 RT:65101:20 RT:65101:4001 Rmac:44:38:39:be:ef:aa
*> [2]:[0]:[48]:[f0:9d:d0:59:60:5d]
                    10.0.1.1                           32768 i
                    ET:8 RT:65101:20 RT:65101:4001 Rmac:44:38:39:be:ef:aa
*> [2]:[0]:[48]:[f0:9d:d0:59:60:5d]:[128]:[fe80::ce6e:faff:fe8d:ff92]
                    10.0.1.1                           32768 i
                    ET:8 RT:65101:20
*> [3]:[0]:[32]:[10.0.1.1]
                    10.0.1.1                           32768 i
                    ET:8 RT:65101:20
Route Distinguisher: 10.10.10.1:6
*> [2]:[0]:[48]:[c0:8a:e6:03:96:d0]
                    10.0.1.1                           32768 i
                    ET:8 RT:65101:30 RT:65101:4002 MM:0, sticky MAC Rmac:44:38:39:be:ef:aa
*> [2]:[0]:[48]:[de:02:3b:17:c9:6d]
                    10.0.1.1                           32768 i
                    ET:8 RT:65101:30 RT:65101:4002 Rmac:44:38:39:be:ef:aa
*> [2]:[0]:[48]:[de:02:3b:17:c9:6d]:[128]:[fe80::dc02:3bff:fe17:c96d]
                    10.0.1.1                           32768 i
                    ET:8 RT:65101:30
*> [2]:[0]:[48]:[ea:77:bb:f1:a7:ca]
                    10.0.1.1                           32768 i
                    ET:8 RT:65101:30 RT:65101:4002 Rmac:44:38:39:be:ef:aa
*> [3]:[0]:[32]:[10.0.1.1]
                    10.0.1.1                           32768 i
                    ET:8 RT:65101:30
Route Distinguisher: 10.10.10.3:3
*> [2]:[0]:[48]:[12:15:9a:9c:f2:e1]
                    10.0.1.2                               0 65199 65102 i
                    RT:65102:20 RT:65102:4001 ET:8 Rmac:44:38:39:be:ef:bb
*  [2]:[0]:[48]:[12:15:9a:9c:f2:e1]
                    10.0.1.2                               0 65199 65102 i
                    RT:65102:20 RT:65102:4001 ET:8 Rmac:44:38:39:be:ef:bb
...
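The prefix legend at the top of the output maps directly onto the bracketed fields of each route. The following Python sketch splits a vtysh-style EVPN prefix into named fields according to that legend. The function name parse_evpn_prefix is hypothetical, and the sketch does not handle the NVUE route format, which places the RD first.

```python
import re

# Field names per route type, taken from the legend in the output above.
FIELD_NAMES = {
    "1": ["ESI", "EthTag", "IPlen", "VTEP-IP"],
    "2": ["EthTag", "MAClen", "MAC", "IPlen", "IP"],
    "3": ["EthTag", "IPlen", "OrigIP"],
    "4": ["ESI", "IPlen", "OrigIP"],
    "5": ["EthTag", "IPlen", "IP"],
}


def parse_evpn_prefix(prefix):
    """Split a prefix such as [2]:[0]:[48]:[MAC]:[32]:[IP] into named fields."""
    parts = re.findall(r"\[([^\]]*)\]", prefix)
    if not parts or parts[0] not in FIELD_NAMES:
        raise ValueError(f"unrecognized EVPN prefix: {prefix!r}")
    route_type, values = parts[0], parts[1:]
    # zip() stops at the shorter sequence, so a MAC-only type-2 prefix
    # simply has no IPlen or IP key.
    fields = dict(zip(FIELD_NAMES[route_type], values))
    fields["type"] = int(route_type)
    return fields


print(parse_evpn_prefix("[2]:[0]:[48]:[26:76:e6:93:32:78]:[32]:[10.1.10.101]"))
print(parse_evpn_prefix("[3]:[0]:[32]:[10.0.1.1]"))
```
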

You can filter the routing table based on EVPN route type. The available options are:

- ead: EAD (type-1) route
- es: Ethernet Segment (type-4) route
- macip: MAC-IP (type-2) route
- multicast: Multicast
- prefix: An IPv4 or IPv6 prefix

Show EVPN RD Routes

To show EVPN RD routes, run the nv show vrf <vrf> router bgp address-family l2vpn-evpn route command. This command shows the EVPN RD routes in brief format to improve performance in high-scale environments. To show the EVPN RD routes in more detail, run the nv show vrf <vrf> router bgp address-family l2vpn-evpn route --view=detail command. To show the information in JSON format, run the nv show vrf <vrf> router bgp address-family l2vpn-evpn route -o json command.

cumulus@leaf01:mgmt:~$ nv show vrf default router bgp address-family l2vpn-evpn route
PathCnt - number of L2VPN EVPN per (RD, route-type) paths
Route                                                                   rd             route-type  PathCnt
----------------------------------------------------------------------  -------------  ----------  -------
[10.10.10.1:2]:[5]:[0]:[10.1.30.0/24]                                   10.10.10.1:2   5           1      
[10.10.10.1:3]:[5]:[0]:[10.1.10.0/24]                                   10.10.10.1:3   5           1      
[10.10.10.1:3]:[5]:[0]:[10.1.20.0/24]                                   10.10.10.1:3   5           1      
[10.10.10.1:4]:[2]:[0]:[44:38:39:22:01:78]                              10.10.10.1:4   2           1      
[10.10.10.1:4]:[2]:[0]:[48:b0:2d:7f:74:13]                              10.10.10.1:4   2           1      
[10.10.10.1:4]:[2]:[0]:[48:b0:2d:7f:74:13]:[10.1.20.102]                10.10.10.1:4   2           1      
[10.10.10.1:4]:[2]:[0]:[48:b0:2d:7f:74:13]:[fe80::4ab0:2dff:fe7f:7413]  10.10.10.1:4   2           1      
[10.10.10.1:4]:[2]:[0]:[48:b0:2d:a4:40:62]                              10.10.10.1:4   2           1      
[10.10.10.1:4]:[3]:[0]:[10.0.1.12]                                      10.10.10.1:4   3           1      
[10.10.10.1:5]:[2]:[0]:[44:38:39:22:01:78]                              10.10.10.1:5   2           1      
[10.10.10.1:5]:[2]:[0]:[48:b0:2d:99:9e:04]                              10.10.10.1:5   2           1      
[10.10.10.1:5]:[2]:[0]:[48:b0:2d:c2:f9:21]                              10.10.10.1:5   2           1      
[10.10.10.1:5]:[2]:[0]:[48:b0:2d:c2:f9:21]:[10.1.30.103]                10.10.10.1:5   2           1      
[10.10.10.1:5]:[2]:[0]:[48:b0:2d:c2:f9:21]:[fe80::4ab0:2dff:fec2:f921]  10.10.10.1:5   2           1      
[10.10.10.1:5]:[3]:[0]:[10.0.1.12]                                      10.10.10.1:5   3           1      
[10.10.10.1:6]:[2]:[0]:[44:38:39:22:01:78]                              10.10.10.1:6   2           1      
[10.10.10.1:6]:[2]:[0]:[48:b0:2d:5c:8a:ee]                              10.10.10.1:6   2           1      
[10.10.10.1:6]:[2]:[0]:[48:b0:2d:fa:72:e7]                              10.10.10.1:6   2           1      
[10.10.10.1:6]:[2]:[0]:[48:b0:2d:fa:72:e7]:[10.1.10.101]                10.10.10.1:6   2           1      
[10.10.10.1:6]:[2]:[0]:[48:b0:2d:fa:72:e7]:[fe80::4ab0:2dff:fefa:72e7]  10.10.10.1:6   2           1      
[10.10.10.1:6]:[3]:[0]:[10.0.1.12]                                      10.10.10.1:6   3           1      
[10.10.10.2:2]:[5]:[0]:[10.1.30.0/24]                                   10.10.10.2:2   5           5      
[10.10.10.2:3]:[5]:[0]:[10.1.10.0/24]                                   10.10.10.2:3   5           5      
[10.10.10.2:3]:[5]:[0]:[10.1.20.0/24]                                   10.10.10.2:3   5           5      
[10.10.10.3:2]:[5]:[0]:[10.1.30.0/24]                                   10.10.10.3:2   5           5      
[10.10.10.3:3]:[5]:[0]:[10.1.10.0/24]                                   10.10.10.3:3   5           5      
[10.10.10.3:3]:[5]:[0]:[10.1.20.0/24]                                   10.10.10.3:3   5           5      
[10.10.10.3:4]:[2]:[0]:[44:38:39:22:01:8a]                              10.10.10.3:4   2           5      
[10.10.10.3:4]:[2]:[0]:[48:b0:2d:48:21:9d]                              10.10.10.3:4   2           5      
[10.10.10.3:4]:[2]:[0]:[48:b0:2d:82:43:48]                              10.10.10.3:4   2           5      
[10.10.10.3:4]:[2]:[0]:[48:b0:2d:82:43:48]:[10.1.20.105]                10.10.10.3:4   2           5      
[10.10.10.3:4]:[2]:[0]:[48:b0:2d:82:43:48]:[fe80::4ab0:2dff:fe82:4348]  10.10.10.3:4   2           5      
[10.10.10.3:4]:[3]:[0]:[10.0.1.34]                                      10.10.10.3:4   3           5      
[10.10.10.3:5]:[2]:[0]:[44:38:39:22:01:8a]                              10.10.10.3:5   2           5      
[10.10.10.3:5]:[2]:[0]:[48:b0:2d:d5:45:6f]                              10.10.10.3:5   2           5      
[10.10.10.3:5]:[2]:[0]:[48:b0:2d:d5:45:6f]:[10.1.30.106]                10.10.10.3:5   2           5      
[10.10.10.3:5]:[2]:[0]:[48:b0:2d:d5:45:6f]:[fe80::4ab0:2dff:fed5:456f]  10.10.10.3:5   2           5      
[10.10.10.3:5]:[2]:[0]:[48:b0:2d:df:a8:20]                              10.10.10.3:5   2           5      
[10.10.10.3:5]:[3]:[0]:[10.0.1.34]                                      10.10.10.3:5   3           5      
[10.10.10.3:6]:[2]:[0]:[44:38:39:22:01:8a]                              10.10.10.3:6   2           5      
[10.10.10.3:6]:[2]:[0]:[48:b0:2d:29:c0:bb]                              10.10.10.3:6   2           5      
[10.10.10.3:6]:[2]:[0]:[48:b0:2d:29:c0:bb]:[10.1.10.104]                10.10.10.3:6   2           5      
[10.10.10.3:6]:[2]:[0]:[48:b0:2d:29:c0:bb]:[fe80::4ab0:2dff:fe29:c0bb]  10.10.10.3:6   2           5      
[10.10.10.3:6]:[2]:[0]:[48:b0:2d:c9:f8:14]                              10.10.10.3:6   2           5      
[10.10.10.3:6]:[3]:[0]:[10.0.1.34]                                      10.10.10.3:6   3           5      
[10.10.10.4:2]:[5]:[0]:[10.1.30.0/24]                                   10.10.10.4:2   5           5      
[10.10.10.4:3]:[5]:[0]:[10.1.10.0/24]                                   10.10.10.4:3   5           5      
[10.10.10.4:3]:[5]:[0]:[10.1.20.0/24]                                   10.10.10.4:3   5           5      
[10.10.10.4:4]:[2]:[0]:[44:38:39:22:01:84]                              10.10.10.4:4   2           5      
[10.10.10.4:4]:[2]:[0]:[48:b0:2d:48:21:9d]                              10.10.10.4:4   2           5      
[10.10.10.4:4]:[2]:[0]:[48:b0:2d:82:43:48]                              10.10.10.4:4   2           5      
[10.10.10.4:4]:[2]:[0]:[48:b0:2d:82:43:48]:[10.1.20.105]                10.10.10.4:4   2           5      
[10.10.10.4:4]:[2]:[0]:[48:b0:2d:82:43:48]:[fe80::4ab0:2dff:fe82:4348]  10.10.10.4:4   2           5      
[10.10.10.4:4]:[3]:[0]:[10.0.1.34]                                      10.10.10.4:4   3           5      
[10.10.10.4:5]:[2]:[0]:[44:38:39:22:01:84]                              10.10.10.4:5   2           5      
[10.10.10.4:5]:[2]:[0]:[48:b0:2d:d5:45:6f]                              10.10.10.4:5   2           5      
[10.10.10.4:5]:[2]:[0]:[48:b0:2d:d5:45:6f]:[10.1.30.106]                10.10.10.4:5   2           5      
[10.10.10.4:5]:[2]:[0]:[48:b0:2d:d5:45:6f]:[fe80::4ab0:2dff:fed5:456f]  10.10.10.4:5   2           5      
[10.10.10.4:5]:[2]:[0]:[48:b0:2d:df:a8:20]                              10.10.10.4:5   2           5      
[10.10.10.4:5]:[3]:[0]:[10.0.1.34]                                      10.10.10.4:5   3           5      
[10.10.10.4:6]:[2]:[0]:[44:38:39:22:01:84]                              10.10.10.4:6   2           5      
[10.10.10.4:6]:[2]:[0]:[48:b0:2d:29:c0:bb]                              10.10.10.4:6   2           5      
[10.10.10.4:6]:[2]:[0]:[48:b0:2d:29:c0:bb]:[10.1.10.104]                10.10.10.4:6   2           5      
[10.10.10.4:6]:[2]:[0]:[48:b0:2d:29:c0:bb]:[fe80::4ab0:2dff:fe29:c0bb]  10.10.10.4:6   2           5      
[10.10.10.4:6]:[2]:[0]:[48:b0:2d:c9:f8:14]                              10.10.10.4:6   2           5      
[10.10.10.4:6]:[3]:[0]:[10.0.1.34]                                      10.10.10.4:6   3           5      
[10.10.10.63:2]:[5]:[0]:[10.1.10.0/24]                                  10.10.10.63:2  5           5      
[10.10.10.63:2]:[5]:[0]:[10.1.20.0/24]                                  10.10.10.63:2  5           5      
[10.10.10.63:3]:[5]:[0]:[10.1.30.0/24]                                  10.10.10.63:3  5           5      
[10.10.10.64:2]:[5]:[0]:[10.1.10.0/24]                                  10.10.10.64:2  5           5      
[10.10.10.64:2]:[5]:[0]:[10.1.20.0/24]                                  10.10.10.64:2  5           5      
[10.10.10.64:3]:[5]:[0]:[10.1.30.0/24]                                  10.10.10.64:3  5           5 
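
When the brief table is long, it can help to summarize it per RD and route type. The following Python sketch does that for text copied from the command output. It assumes the four whitespace-separated columns shown above (Route, rd, route-type, PathCnt); count_routes is a hypothetical helper name, and the text layout might differ between releases.

```python
from collections import Counter


def count_routes(table_text: str) -> Counter:
    """Count table rows per (rd, route-type) in the brief route table text."""
    counts = Counter()
    for line in table_text.splitlines():
        fields = line.split()
        # Data rows have exactly four columns and start with the bracketed RD.
        if len(fields) != 4 or not fields[0].startswith("["):
            continue
        _route, rd, rtype, _pathcnt = fields
        counts[(rd, int(rtype))] += 1
    return counts
```

The header, separator, and legend lines are skipped because they do not start with a bracket.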

Show a Specific EVPN Route

To drill down on a specific route for more information, run the vtysh show bgp l2vpn evpn route rd <rd-value> command. This command displays all EVPN routes with that RD, together with the path attribute details for each path. You can filter further by route type or by specifying the MAC address, the IP address, or both. The following example shows the specific MAC/IP route of server05. The output shows that this remote host is behind VTEP 10.10.10.3 and is reachable through four paths, one through each spine switch. This example is from a symmetric routing configuration, so the route shows both the layer 2 VNI (20) and the layer 3 VNI (4001), as well as the EVPN route target attributes corresponding to each and the associated router MAC address.

cumulus@leaf01:mgmt:~$ sudo vtysh
leaf01# show bgp l2vpn evpn route rd 10.10.10.3:3 mac 12:15:9a:9c:f2:e1 ip 10.1.20.105
BGP routing table entry for 10.10.10.3:3:[2]:[0]:[48]:[12:15:9a:9c:f2:e1]:[32]:[10.1.20.105]
Paths: (4 available, best #1)
  Advertised to non peer-group peers:
  spine01(swp51) spine02(swp52) spine03(swp53) spine04(swp54)
  Route [2]:[0]:[48]:[12:15:9a:9c:f2:e1]:[32]:[10.1.20.105] VNI 20/4001
  65199 65102
    10.0.1.2 from spine01(swp51) (10.10.10.101)
      Origin IGP, valid, external, bestpath-from-AS 65199, best (Router ID)
      Extended Community: RT:65102:20 RT:65102:4001 ET:8 Rmac:44:38:39:be:ef:bb
      Last update: Fri Jan 15 08:16:24 2021
  Route [2]:[0]:[48]:[12:15:9a:9c:f2:e1]:[32]:[10.1.20.105] VNI 20/4001
  65199 65102
    10.0.1.2 from spine04(swp54) (10.10.10.104)
      Origin IGP, valid, external
      Extended Community: RT:65102:20 RT:65102:4001 ET:8 Rmac:44:38:39:be:ef:bb
      Last update: Fri Jan 15 08:16:24 2021
  Route [2]:[0]:[48]:[12:15:9a:9c:f2:e1]:[32]:[10.1.20.105] VNI 20/4001
  65199 65102
    10.0.1.2 from spine02(swp52) (10.10.10.102)
      Origin IGP, valid, external
      Extended Community: RT:65102:20 RT:65102:4001 ET:8 Rmac:44:38:39:be:ef:bb
      Last update: Fri Jan 15 08:16:24 2021
  Route [2]:[0]:[48]:[12:15:9a:9c:f2:e1]:[32]:[10.1.20.105] VNI 20/4001
  65199 65102
    10.0.1.2 from spine03(swp53) (10.10.10.103)
      Origin IGP, valid, external
      Extended Community: RT:65102:20 RT:65102:4001 ET:8 Rmac:44:38:39:be:ef:bb
      Last update: Fri Jan 15 08:16:24 2021

Displayed 4 paths for requested prefix
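
The Extended Community line carries the route targets and the router MAC address discussed above. As a minimal sketch, the following Python function splits such a line into its parts. The function name and the returned dictionary keys are hypothetical; the token names (RT, ET, Rmac) are taken from the example output.

```python
def parse_extended_community(line: str) -> dict:
    """Split an 'Extended Community:' line into route targets, ET, and Rmac."""
    head, sep, tail = line.partition("Extended Community:")
    body = tail if sep else head  # also accept a line without the label
    result = {"route_targets": [], "encap_type": None, "router_mac": None}
    for token in body.split():
        key, _, value = token.partition(":")
        if key == "RT":
            result["route_targets"].append(value)
        elif key == "ET":
            result["encap_type"] = int(value)
        elif key == "Rmac":
            result["router_mac"] = value
        # Other tokens (for example MM:<n>) are ignored in this sketch.
    return result
```

For the route in this example, the function returns two route targets, one ending in the layer 2 VNI (20) and one ending in the layer 3 VNI (4001).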

Show the VNI EVPN Routing Table

The switch maintains the received EVPN routes in the global EVPN routing table, even if there are no appropriate local VNIs to import them into. For example, a spine maintains the global EVPN routing table even though there are no VNIs present in the table. When local VNIs are present, the switch imports received EVPN routes into the per-VNI routing tables according to the route target attributes. You can examine the per-VNI routing table with the vtysh show bgp vni <vni> command:

leaf01# show bgp vni 10
BGP table version is 351, local router ID is 10.10.10.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-1 prefix: [1]:[ESI]:[EthTag]:[IPlen]:[VTEP-IP]:[Frag-id]
EVPN type-2 prefix: [2]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
EVPN type-4 prefix: [4]:[ESI]:[IPlen]:[OrigIP]
EVPN type-5 prefix: [5]:[EthTag]:[IPlen]:[IP]

   Network          Next Hop            Metric LocPrf Weight Path
*> [2]:[0]:[48]:[44:38:39:00:00:32]:[32]:[10.1.10.101]
                    10.0.1.12 (leaf01)
                                                       32768 i
                    ET:8 RT:65101:10 RT:65101:4001 Rmac:44:38:39:be:ef:aa
*> [2]:[0]:[48]:[44:38:39:00:00:32]:[128]:[fe80::4638:39ff:fe00:32]
                    10.0.1.12 (leaf01)
                                                       32768 i
                    ET:8 RT:65101:10
*  [2]:[0]:[48]:[44:38:39:00:00:3e]:[128]:[fe80::4638:39ff:fe00:3e]
                    10.0.1.34 (leaf02)
                                                           0 65102 65199 65104 i
                    RT:65104:10 ET:8
*  [2]:[0]:[48]:[44:38:39:00:00:3e]:[128]:[fe80::4638:39ff:fe00:3e]
                    10.0.1.34 (leaf02)
                                                           0 65102 65199 65103 i
                    RT:65103:10 ET:8
*  [2]:[0]:[48]:[44:38:39:00:00:3e]:[128]:[fe80::4638:39ff:fe00:3e]
                    10.0.1.34 (spine02)
                                                           0 65199 65104 i
                    RT:65104:10 ET:8
*  [2]:[0]:[48]:[44:38:39:00:00:3e]:[128]:[fe80::4638:39ff:fe00:3e]
                    10.0.1.34 (spine02)
                                                           0 65199 65103 i
                    RT:65103:10 ET:8
*  [2]:[0]:[48]:[44:38:39:00:00:3e]:[128]:[fe80::4638:39ff:fe00:3e]
                    10.0.1.34 (spine04)
                                                           0 65199 65104 i
                    RT:65104:10 ET:8
*  [2]:[0]:[48]:[44:38:39:00:00:3e]:[128]:[fe80::4638:39ff:fe00:3e]
                    10.0.1.34 (spine04)
                                                           0 65199 65103 i
                    RT:65103:10 ET:8
*  [2]:[0]:[48]:[44:38:39:00:00:3e]:[128]:[fe80::4638:39ff:fe00:3e]
                    10.0.1.34 (spine03)
                                                           0 65199 65104 i
                    RT:65104:10 ET:8
*  [2]:[0]:[48]:[44:38:39:00:00:3e]:[128]:[fe80::4638:39ff:fe00:3e]
                    10.0.1.34 (spine03)
                                                           0 65199 65103 i
                    RT:65103:10 ET:8
*  [2]:[0]:[48]:[44:38:39:00:00:3e]:[128]:[fe80::4638:39ff:fe00:3e]
                    10.0.1.34 (spine01)
                                                           0 65199 65104 i
                    RT:65104:10 ET:8
*> [2]:[0]:[48]:[44:38:39:00:00:3e]:[128]:[fe80::4638:39ff:fe00:3e]
                    10.0.1.34 (spine01)
                                                           0 65199 65103 i
                    RT:65103:10 ET:8
...
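
The import step described above can be sketched in Python as follows. This is a simplified model, not the FRR implementation: it assumes that a route is imported into a local VNI when the last field of one of its route targets equals that VNI, which is consistent with the example output where the VNI 10 table holds routes tagged RT:65103:10 and RT:65104:10. The function name and data shapes are hypothetical.

```python
def import_into_vni_tables(routes, local_vnis):
    """Build per-VNI tables from (prefix, route_targets) pairs.

    routes: iterable of (prefix, ["ASN:VNI", ...]) tuples from the global table.
    local_vnis: VNIs configured on this switch.
    """
    tables = {vni: [] for vni in local_vnis}
    for prefix, route_targets in routes:
        for rt in route_targets:
            vni = int(rt.rsplit(":", 1)[1])
            # Import only when a matching local VNI exists; avoid duplicates.
            if vni in tables and prefix not in tables[vni]:
                tables[vni].append(prefix)
    return tables
```

With an empty local_vnis, as on a spine, the result is empty even though the routes still exist in the global table.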

To display the VNI routing table for all VNIs, run the vtysh show bgp l2vpn evpn route vni all command.

To view the EVPN RIB with NVUE, run the nv show vrf <vrf> router bgp address-family l2vpn-evpn route command.

Show the VRF BGP Routing Table

For symmetric routing, the switch imports received type-2 and type-5 routes into the VRF routing table (according to address family: IPv4 unicast or IPv6 unicast) based on a match on the route target attributes. To examine the BGP VRF routing table, run the vtysh show bgp vrf <vrf-name> ipv4 unicast and show bgp vrf <vrf-name> ipv6 unicast commands.

cumulus@leaf01:mgmt:~$ sudo vtysh
...
leaf01# show bgp vrf RED ipv4 unicast
BGP table version is 2, local router ID is 10.1.20.2, vrf id 24
Default local pref 100, local AS 65101
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*  10.1.10.104/32   10.0.1.2<                              0 65199 65102 i
*                   10.0.1.2<                              0 65199 65102 i
*                   10.0.1.2<                              0 65199 65102 i
*                   10.0.1.2<                              0 65199 65102 i
*>                  10.0.1.2<                              0 65199 65102 i
*                   10.0.1.2<                              0 65199 65102 i
*                   10.0.1.2<                              0 65199 65102 i
*                   10.0.1.2<                              0 65199 65102 i
*  10.1.20.105/32   10.0.1.2<                              0 65199 65102 i
*>                  10.0.1.2<                              0 65199 65102 i
*                   10.0.1.2<                              0 65199 65102 i
*                   10.0.1.2<                              0 65199 65102 i
*                   10.0.1.2<                              0 65199 65102 i
*                   10.0.1.2<                              0 65199 65102 i
*                   10.0.1.2<                              0 65199 65102 i
*                   10.0.1.2<                              0 65199 65102 i

Displayed  2 routes and 16 total paths

Support for EVPN Neighbor Discovery (ND) Extended Community

In EVPN VXLAN with ARP and ND suppression where you only configure the VTEPs for layer 2, EVPN needs to carry additional information for the attached devices so that proxy ND can provide the correct information to attached hosts. Without this information, hosts either cannot configure their default routers or lose their existing default router information. Cumulus Linux supports the EVPN Neighbor Discovery (ND) Extended Community with a type field value of 0x06, a subtype field value of 0x08 (ND Extended Community), and a router flag; this enables the switch to determine if a particular IPv6-MAC pair belongs to a host or a router.

The following configurations use the router flag (R-bit):

When the MAC/IP (type-2) route contains the IPv6-MAC pair with the R-bit flag, the route belongs to a router. If the R-bit is zero, the route belongs to a host. If the router is in a local LAN segment, the switch implementing the proxy ND function learns of this information by snooping on neighbor advertisement messages for the associated IPv6 address. Other EVPN peers exchange this information by using the ND extended community in BGP updates.
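
To make the encoding concrete, the following Python sketch builds and reads an 8-octet ND extended community with the type (0x06) and subtype (0x08) values given above. The bit position of the router flag is not stated in this guide; the sketch assumes it is the low-order bit of the flags octet, as RFC 9047 defines it. The function names are hypothetical.

```python
ND_TYPE = 0x06      # EVPN extended community type (from the text above)
ND_SUBTYPE = 0x08   # ND Extended Community subtype (from the text above)
ROUTER_FLAG = 0x01  # assumption: R-bit is the low-order bit of the flags octet


def encode_nd_community(is_router: bool) -> bytes:
    """Return the 8-octet community: type, subtype, flags, 5 reserved octets."""
    flags = ROUTER_FLAG if is_router else 0
    return bytes([ND_TYPE, ND_SUBTYPE, flags]) + bytes(5)


def belongs_to_router(community: bytes) -> bool:
    """Return True when the R-bit is set, False when the entry is a host."""
    if len(community) != 8:
        raise ValueError("an extended community is 8 octets long")
    if community[0] != ND_TYPE or community[1] != ND_SUBTYPE:
        raise ValueError("not an ND extended community")
    return bool(community[2] & ROUTER_FLAG)
```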

To show that the neighbor table includes the EVPN arp-cache and that the IPv6-MAC entry belongs to a router, run the vtysh show evpn arp-cache vni <vni> ip <address> command. For example:

cumulus@leaf01:mgmt:~$ sudo vtysh
...
leaf01# show evpn arp-cache vni 20 ip 10.1.20.105
IP: 10.1.20.105
 Type: remote
 State: active
 MAC: 12:15:9a:9c:f2:e1
 Sync-info: -
 Remote VTEP: 10.0.1.2
 Local Seq: 0 Remote Seq: 0

Examine MAC Moves

The first time a MAC moves from behind one VTEP to behind another, BGP associates a MAC Mobility (MM) extended community attribute with sequence number 1 with the type-2 route for that MAC. From there, each time this MAC moves to a new VTEP, the MM sequence number increments by 1. You can examine the MM sequence number associated with a MAC’s type-2 route with the vtysh show bgp l2vpn evpn route vni <vni> mac <mac> command. The example output below shows the type-2 route for a MAC that has moved three times:

cumulus@switch:~$ sudo vtysh
...
switch# show bgp l2vpn evpn route vni 10109 mac 00:02:22:22:22:02
BGP routing table entry for [2]:[0]:[0]:[48]:[00:02:22:22:22:02]
Paths: (1 available, best #1)
  Not advertised to any peer
  Route [2]:[0]:[0]:[48]:[00:02:22:22:22:02] VNI 10109
  Local
    6.0.0.184 from 0.0.0.0 (6.0.0.184)
      Origin IGP, localpref 100, weight 32768, valid, sourced, local, bestpath-from-AS Local, best
      Extended Community: RT:650184:10109 ET:8 MM:3
      AddPath ID: RX 0, TX 10350121
      Last update: Tue Feb 14 18:40:37 2017

Displayed 1 paths for requested prefix
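
The sequence numbering described above can be modeled with a short Python sketch. This is a simplified illustration under stated assumptions, not how FRR implements it: it tracks only the last known VTEP per MAC and ignores sticky MACs and duplicate address detection. The class and method names are hypothetical.

```python
class MacMobilityTracker:
    """Track the MM sequence number of each MAC as it moves between VTEPs."""

    def __init__(self):
        self._location = {}  # MAC -> VTEP it was last seen behind
        self._sequence = {}  # MAC -> current MM sequence number

    def learn(self, mac: str, vtep: str):
        """Record the MAC behind a VTEP; return its MM sequence number.

        Returns None while the MAC has never moved (no MM attribute).
        """
        previous = self._location.get(mac)
        self._location[mac] = vtep
        if previous is not None and previous != vtep:
            # First move yields 1; every later move adds 1.
            self._sequence[mac] = self._sequence.get(mac, 0) + 1
        return self._sequence.get(mac)
```

A MAC that is learned behind one VTEP and then moves three times ends with sequence number 3, which matches the MM:3 value in the example output.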

Examine Static MAC Addresses

You can identify static or sticky MACs in EVPN by the presence of MM:0, sticky MAC in the Extended Community line of the output from the vtysh show bgp l2vpn evpn route vni <vni> mac <mac> command.

cumulus@switch:~$ sudo vtysh
...
switch# show bgp l2vpn evpn route vni 10101 mac 00:02:00:00:00:01
BGP routing table entry for [2]:[0]:[0]:[48]:[00:02:00:00:00:01]
Paths: (1 available, best #1)
  Not advertised to any peer
  Route [2]:[0]:[0]:[48]:[00:02:00:00:00:01] VNI 10101
  Local
    172.16.130.18 from 0.0.0.0 (172.16.130.18)
      Origin IGP, localpref 100, weight 32768, valid, sourced, local, bestpath-from-AS Local, best
      Extended Community: ET:8 RT:60176:10101 MM:0, sticky MAC
      AddPath ID: RX 0, TX 46
      Last update: Tue Apr 11 21:44:02 2017

Displayed 1 paths for requested prefix
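
If you collect this output in a script, a small check such as the following Python sketch can pull out the MM sequence number and the sticky marker. It assumes the Extended Community line format shown in the two examples above; mac_mobility_info is a hypothetical helper name.

```python
import re

# Matches 'MM:<n>' optionally followed by the ', sticky MAC' marker.
_MM = re.compile(r"\bMM:(\d+)(, sticky MAC)?")


def mac_mobility_info(output: str):
    """Return (sequence, is_sticky) from the route output, or None if no MM."""
    for line in output.splitlines():
        if "Extended Community:" not in line:
            continue
        match = _MM.search(line)
        if match:
            return int(match.group(1)), match.group(2) is not None
    return None
```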

Enable FRR Debug Logs

To troubleshoot EVPN, enable FRR debug logs. The relevant debug options are:

Option              Description
------------------  -----------
debug zebra vxlan   Traces VNI addition and deletion (local and remote) as well as MAC and neighbor addition and deletion (local and remote).
debug zebra kernel  Traces actual netlink messages exchanged with the kernel, which includes everything, not just EVPN.
debug bgp updates   Traces BGP update exchanges, including all updates. The output also shows EVPN specific information.
debug bgp zebra     Traces interactions between BGP and zebra for EVPN (and other) routes.
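
As a hedged sketch, the following Python helper assembles a single vtysh invocation that enables a chosen set of these debug options. It assumes the options are accepted at the vtysh enable prompt and uses the vtysh -c option to pass each command; the helper only builds the argument list and does not run anything. The names EVPN_DEBUG_OPTIONS and vtysh_debug_argv are hypothetical.

```python
# The four debug options listed in the table above.
EVPN_DEBUG_OPTIONS = (
    "debug zebra vxlan",
    "debug zebra kernel",
    "debug bgp updates",
    "debug bgp zebra",
)


def vtysh_debug_argv(options=EVPN_DEBUG_OPTIONS):
    """Build an argv list such as ['sudo', 'vtysh', '-c', 'debug zebra vxlan']."""
    argv = ["sudo", "vtysh"]
    for option in options:
        if option not in EVPN_DEBUG_OPTIONS:
            raise ValueError(f"unknown EVPN debug option: {option!r}")
        argv += ["-c", option]
    return argv
```

On a switch, you could pass the returned list to subprocess.run. Because debug zebra kernel and debug bgp updates are not limited to EVPN, enabling only the options you need keeps the log volume manageable.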

ICMP echo Replies and the ping Command

When you run the ping -I command and specify an interface, you do not receive an ICMP echo reply. However, when you run the ping command without the -I option, everything works as expected.

ping -I command example:

cumulus@switch:default:~:# ping -I swp2 10.0.10.1
PING 10.0.10.1 (10.0.10.1) from 10.0.0.2 swp1.5: 56(84) bytes of data.

ping command example:

cumulus@switch:default:~:# ping 10.0.10.1
PING 10.0.10.1 (10.0.10.1) 56(84) bytes of data.
64 bytes from 10.0.10.1: icmp_req=1 ttl=63 time=4.00 ms
64 bytes from 10.0.10.1: icmp_req=2 ttl=63 time=0.000 ms
64 bytes from 10.0.10.1: icmp_req=3 ttl=63 time=0.000 ms
64 bytes from 10.0.10.1: icmp_req=4 ttl=63 time=0.000 ms
^C
--- 10.0.10.1 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3004ms
rtt min/avg/max/mdev = 0.000/1.000/4.001/1.732 ms

When you send an ICMP echo request to an IP address that is not in the same subnet using the ping -I command, Cumulus Linux creates a failed ARP entry for the destination IP address.

For more information, refer to this article.

Configuration Examples

This section shows the following EVPN configuration examples:

Layer 2 EVPN with External Routing

The following example configures a network infrastructure that creates a layer 2 extension between racks. Inter-VXLAN routed traffic routes between VXLANs on an external device.

The following images show traffic flow between tenants. For simplicity, the images do not show spines and other devices.

Traffic Flow between server01 and server04
server01 and server04 are in the same VLAN but are across different leafs.
  1. server01 makes a LACP hash decision and forwards traffic to leaf01.
  2. leaf01 does a layer 2 lookup, has the MAC address for server04, and forwards the packet out VNI10, towards leaf04.
  3. The VXLAN encapsulated frame arrives on leaf04, which does a layer 2 lookup and has the MAC address for server04 in VLAN10.
cumulus@leaf01:~$ nv set interface lo ip address 10.10.10.1/32
cumulus@leaf01:~$ nv set interface swp1-2,swp49-54
cumulus@leaf01:~$ nv set interface bond1 bond member swp1
cumulus@leaf01:~$ nv set interface bond2 bond member swp2
cumulus@leaf01:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf01:~$ nv set interface bond2 bond mlag id 2
cumulus@leaf01:~$ nv set interface bond1 bond lacp-bypass on
cumulus@leaf01:~$ nv set interface bond2 bond lacp-bypass on
cumulus@leaf01:~$ nv set interface bond1 link mtu 9000
cumulus@leaf01:~$ nv set interface bond2 link mtu 9000
cumulus@leaf01:~$ nv set interface bond1-2 bridge domain br_default
cumulus@leaf01:~$ nv set interface bond1 bridge domain br_default access 10
cumulus@leaf01:~$ nv set interface bond2 bridge domain br_default access 20
cumulus@leaf01:~$ nv set bridge domain br_default vlan 10,20
cumulus@leaf01:~$ nv set interface peerlink bond member swp49-50
cumulus@leaf01:~$ nv set mlag mac-address 44:38:39:FF:00:AA
cumulus@leaf01:~$ nv set mlag backup 10.10.10.2
cumulus@leaf01:~$ nv set mlag peer-ip linklocal
cumulus@leaf01:~$ nv set mlag priority 1000
cumulus@leaf01:~$ nv set mlag init-delay 10
cumulus@leaf01:~$ nv set interface vlan10
cumulus@leaf01:~$ nv set interface vlan20
cumulus@leaf01:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf01:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf01:~$ nv set nve vxlan mlag shared-address 10.0.1.12
cumulus@leaf01:~$ nv set nve vxlan source address 10.10.10.1
cumulus@leaf01:~$ nv set nve vxlan arp-nd-suppress on 
cumulus@leaf01:~$ nv set evpn enable on
cumulus@leaf01:~$ nv set router bgp autonomous-system 65101
cumulus@leaf01:~$ nv set router bgp router-id 10.10.10.1
cumulus@leaf01:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@leaf01:~$ nv set vrf default router bgp neighbor peerlink.4094 peer-group underlay
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 peer-group underlay
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp52 peer-group underlay
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp53 peer-group underlay
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp54 peer-group underlay
cumulus@leaf01:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@leaf01:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@leaf01:~$ nv config apply
cumulus@leaf02:~$ nv set interface lo ip address 10.10.10.2/32
cumulus@leaf02:~$ nv set interface swp1-2,swp49-54
cumulus@leaf02:~$ nv set interface bond1 bond member swp1
cumulus@leaf02:~$ nv set interface bond2 bond member swp2
cumulus@leaf02:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf02:~$ nv set interface bond2 bond mlag id 2
cumulus@leaf02:~$ nv set interface bond1 bond lacp-bypass on
cumulus@leaf02:~$ nv set interface bond2 bond lacp-bypass on
cumulus@leaf02:~$ nv set interface bond1 link mtu 9000
cumulus@leaf02:~$ nv set interface bond2 link mtu 9000
cumulus@leaf02:~$ nv set interface bond1-2 bridge domain br_default
cumulus@leaf02:~$ nv set interface bond1 bridge domain br_default access 10
cumulus@leaf02:~$ nv set interface bond2 bridge domain br_default access 20
cumulus@leaf02:~$ nv set bridge domain br_default vlan 10,20
cumulus@leaf02:~$ nv set interface peerlink bond member swp49-50
cumulus@leaf02:~$ nv set mlag mac-address 44:38:39:FF:00:AA
cumulus@leaf02:~$ nv set mlag backup 10.10.10.1
cumulus@leaf02:~$ nv set mlag peer-ip linklocal
cumulus@leaf02:~$ nv set mlag priority 2000
cumulus@leaf02:~$ nv set mlag init-delay 10
cumulus@leaf02:~$ nv set interface vlan10
cumulus@leaf02:~$ nv set interface vlan20
cumulus@leaf02:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf02:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf02:~$ nv set nve vxlan mlag shared-address 10.0.1.12
cumulus@leaf02:~$ nv set nve vxlan source address 10.10.10.2
cumulus@leaf02:~$ nv set nve vxlan arp-nd-suppress on 
cumulus@leaf02:~$ nv set evpn enable on
cumulus@leaf02:~$ nv set router bgp autonomous-system 65102
cumulus@leaf02:~$ nv set router bgp router-id 10.10.10.2
cumulus@leaf02:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@leaf02:~$ nv set vrf default router bgp neighbor peerlink.4094 peer-group underlay
cumulus@leaf02:~$ nv set vrf default router bgp neighbor swp51 peer-group underlay
cumulus@leaf02:~$ nv set vrf default router bgp neighbor swp52 peer-group underlay
cumulus@leaf02:~$ nv set vrf default router bgp neighbor swp53 peer-group underlay
cumulus@leaf02:~$ nv set vrf default router bgp neighbor swp54 peer-group underlay
cumulus@leaf02:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@leaf02:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@leaf02:~$ nv config apply
cumulus@leaf03:~$ nv set interface lo ip address 10.10.10.3/32
cumulus@leaf03:~$ nv set interface swp1-2,swp49-54
cumulus@leaf03:~$ nv set interface bond1 bond member swp1
cumulus@leaf03:~$ nv set interface bond2 bond member swp2
cumulus@leaf03:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf03:~$ nv set interface bond2 bond mlag id 2
cumulus@leaf03:~$ nv set interface bond1 bond lacp-bypass on
cumulus@leaf03:~$ nv set interface bond2 bond lacp-bypass on
cumulus@leaf03:~$ nv set interface bond1 link mtu 9000
cumulus@leaf03:~$ nv set interface bond2 link mtu 9000
cumulus@leaf03:~$ nv set interface bond1-2 bridge domain br_default
cumulus@leaf03:~$ nv set interface bond1 bridge domain br_default access 10
cumulus@leaf03:~$ nv set interface bond2 bridge domain br_default access 20
cumulus@leaf03:~$ nv set bridge domain br_default vlan 10,20
cumulus@leaf03:~$ nv set interface peerlink bond member swp49-50
cumulus@leaf03:~$ nv set mlag mac-address 44:38:39:FF:00:BB
cumulus@leaf03:~$ nv set mlag backup 10.10.10.4
cumulus@leaf03:~$ nv set mlag peer-ip linklocal
cumulus@leaf03:~$ nv set mlag priority 1000
cumulus@leaf03:~$ nv set mlag init-delay 10
cumulus@leaf03:~$ nv set interface vlan10
cumulus@leaf03:~$ nv set interface vlan20
cumulus@leaf03:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf03:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf03:~$ nv set nve vxlan mlag shared-address 10.0.1.34
cumulus@leaf03:~$ nv set nve vxlan source address 10.10.10.3
cumulus@leaf03:~$ nv set nve vxlan arp-nd-suppress on
cumulus@leaf03:~$ nv set evpn enable on
cumulus@leaf03:~$ nv set router bgp autonomous-system 65103
cumulus@leaf03:~$ nv set router bgp router-id 10.10.10.3
cumulus@leaf03:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@leaf03:~$ nv set vrf default router bgp neighbor peerlink.4094 peer-group underlay
cumulus@leaf03:~$ nv set vrf default router bgp neighbor swp51 peer-group underlay
cumulus@leaf03:~$ nv set vrf default router bgp neighbor swp52 peer-group underlay
cumulus@leaf03:~$ nv set vrf default router bgp neighbor swp53 peer-group underlay
cumulus@leaf03:~$ nv set vrf default router bgp neighbor swp54 peer-group underlay
cumulus@leaf03:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@leaf03:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@leaf03:~$ nv config apply
cumulus@leaf04:~$ nv set interface lo ip address 10.10.10.4/32
cumulus@leaf04:~$ nv set interface swp1-2,swp49-54
cumulus@leaf04:~$ nv set interface bond1 bond member swp1
cumulus@leaf04:~$ nv set interface bond2 bond member swp2
cumulus@leaf04:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf04:~$ nv set interface bond2 bond mlag id 2
cumulus@leaf04:~$ nv set interface bond1 bond lacp-bypass on
cumulus@leaf04:~$ nv set interface bond2 bond lacp-bypass on
cumulus@leaf04:~$ nv set interface bond1 link mtu 9000
cumulus@leaf04:~$ nv set interface bond2 link mtu 9000
cumulus@leaf04:~$ nv set interface bond1-2 bridge domain br_default
cumulus@leaf04:~$ nv set interface bond1 bridge domain br_default access 10
cumulus@leaf04:~$ nv set interface bond2 bridge domain br_default access 20
cumulus@leaf04:~$ nv set bridge domain br_default vlan 10,20
cumulus@leaf04:~$ nv set interface peerlink bond member swp49-50
cumulus@leaf04:~$ nv set mlag mac-address 44:38:39:FF:00:BB
cumulus@leaf04:~$ nv set mlag backup 10.10.10.3
cumulus@leaf04:~$ nv set mlag peer-ip linklocal
cumulus@leaf04:~$ nv set mlag priority 2000
cumulus@leaf04:~$ nv set mlag init-delay 10
cumulus@leaf04:~$ nv set interface vlan10
cumulus@leaf04:~$ nv set interface vlan20
cumulus@leaf04:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf04:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf04:~$ nv set nve vxlan mlag shared-address 10.0.1.34
cumulus@leaf04:~$ nv set nve vxlan source address 10.10.10.4
cumulus@leaf04:~$ nv set nve vxlan arp-nd-suppress on
cumulus@leaf04:~$ nv set evpn enable on
cumulus@leaf04:~$ nv set router bgp autonomous-system 65104
cumulus@leaf04:~$ nv set router bgp router-id 10.10.10.4
cumulus@leaf04:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@leaf04:~$ nv set vrf default router bgp neighbor peerlink.4094 peer-group underlay
cumulus@leaf04:~$ nv set vrf default router bgp neighbor swp51 peer-group underlay
cumulus@leaf04:~$ nv set vrf default router bgp neighbor swp52 peer-group underlay
cumulus@leaf04:~$ nv set vrf default router bgp neighbor swp53 peer-group underlay
cumulus@leaf04:~$ nv set vrf default router bgp neighbor swp54 peer-group underlay
cumulus@leaf04:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@leaf04:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@leaf04:~$ nv config apply
cumulus@spine01:~$ nv set interface lo ip address 10.10.10.101/32
cumulus@spine01:~$ nv set interface swp1-6
cumulus@spine01:~$ nv set router bgp autonomous-system 65199
cumulus@spine01:~$ nv set router bgp router-id 10.10.10.101
cumulus@spine01:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@spine01:~$ nv set vrf default router bgp path-selection multipath aspath-ignore on
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp1 peer-group underlay
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp2 peer-group underlay
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp3 peer-group underlay
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp4 peer-group underlay
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp5 peer-group underlay
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp6 peer-group underlay
cumulus@spine01:~$ nv set vrf default router bgp address-family l2vpn-evpn enable on
cumulus@spine01:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@spine01:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@spine01:~$ nv config apply
cumulus@spine02:~$ nv set interface lo ip address 10.10.10.102/32
cumulus@spine02:~$ nv set interface swp1-6
cumulus@spine02:~$ nv set router bgp autonomous-system 65199
cumulus@spine02:~$ nv set router bgp router-id 10.10.10.102
cumulus@spine02:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@spine02:~$ nv set vrf default router bgp path-selection multipath aspath-ignore on
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp1 peer-group underlay
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp2 peer-group underlay
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp3 peer-group underlay
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp4 peer-group underlay
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp5 peer-group underlay
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp6 peer-group underlay
cumulus@spine02:~$ nv set vrf default router bgp address-family l2vpn-evpn enable on
cumulus@spine02:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@spine02:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@spine02:~$ nv config apply
cumulus@spine03:~$ nv set interface lo ip address 10.10.10.103/32
cumulus@spine03:~$ nv set interface swp1-6
cumulus@spine03:~$ nv set router bgp autonomous-system 65199
cumulus@spine03:~$ nv set router bgp router-id 10.10.10.103
cumulus@spine03:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@spine03:~$ nv set vrf default router bgp path-selection multipath aspath-ignore on
cumulus@spine03:~$ nv set vrf default router bgp neighbor swp1 peer-group underlay
cumulus@spine03:~$ nv set vrf default router bgp neighbor swp2 peer-group underlay
cumulus@spine03:~$ nv set vrf default router bgp neighbor swp3 peer-group underlay
cumulus@spine03:~$ nv set vrf default router bgp neighbor swp4 peer-group underlay
cumulus@spine03:~$ nv set vrf default router bgp neighbor swp5 peer-group underlay
cumulus@spine03:~$ nv set vrf default router bgp neighbor swp6 peer-group underlay
cumulus@spine03:~$ nv set vrf default router bgp address-family l2vpn-evpn enable on
cumulus@spine03:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@spine03:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@spine03:~$ nv config apply
cumulus@spine04:~$ nv set interface lo ip address 10.10.10.104/32
cumulus@spine04:~$ nv set interface swp1-6
cumulus@spine04:~$ nv set router bgp autonomous-system 65199
cumulus@spine04:~$ nv set router bgp router-id 10.10.10.104
cumulus@spine04:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@spine04:~$ nv set vrf default router bgp path-selection multipath aspath-ignore on
cumulus@spine04:~$ nv set vrf default router bgp neighbor swp1 peer-group underlay
cumulus@spine04:~$ nv set vrf default router bgp neighbor swp2 peer-group underlay
cumulus@spine04:~$ nv set vrf default router bgp neighbor swp3 peer-group underlay
cumulus@spine04:~$ nv set vrf default router bgp neighbor swp4 peer-group underlay
cumulus@spine04:~$ nv set vrf default router bgp neighbor swp5 peer-group underlay
cumulus@spine04:~$ nv set vrf default router bgp neighbor swp6 peer-group underlay
cumulus@spine04:~$ nv set vrf default router bgp address-family l2vpn-evpn enable on
cumulus@spine04:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@spine04:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@spine04:~$ nv config apply
cumulus@border01:~$ nv set interface lo ip address 10.10.10.63/32
cumulus@border01:~$ nv set interface swp3,swp49-54
cumulus@border01:~$ nv set interface bond3 bond member swp3
cumulus@border01:~$ nv set interface bond3 bond mlag id 1
cumulus@border01:~$ nv set interface bond3 bond lacp-bypass on
cumulus@border01:~$ nv set interface bond3 link mtu 9000
cumulus@border01:~$ nv set interface bond3 bridge domain br_default
cumulus@border01:~$ nv set interface peerlink bond member swp49-50
cumulus@border01:~$ nv set mlag mac-address 44:38:39:FF:00:FF
cumulus@border01:~$ nv set mlag backup 10.10.10.64
cumulus@border01:~$ nv set mlag peer-ip linklocal
cumulus@border01:~$ nv set mlag priority 1000
cumulus@border01:~$ nv set mlag init-delay 10
cumulus@border01:~$ nv set interface vlan10
cumulus@border01:~$ nv set interface vlan20
cumulus@border01:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@border01:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@border01:~$ nv set interface bond3 bridge domain br_default vlan 10,20
cumulus@border01:~$ nv set nve vxlan mlag shared-address 10.0.1.255
cumulus@border01:~$ nv set nve vxlan source address 10.10.10.63
cumulus@border01:~$ nv set nve vxlan arp-nd-suppress on
cumulus@border01:~$ nv set evpn enable on
cumulus@border01:~$ nv set router bgp autonomous-system 65253
cumulus@border01:~$ nv set router bgp router-id 10.10.10.63
cumulus@border01:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@border01:~$ nv set vrf default router bgp neighbor peerlink.4094 peer-group underlay
cumulus@border01:~$ nv set vrf default router bgp neighbor swp51 peer-group underlay
cumulus@border01:~$ nv set vrf default router bgp neighbor swp52 peer-group underlay
cumulus@border01:~$ nv set vrf default router bgp neighbor swp53 peer-group underlay
cumulus@border01:~$ nv set vrf default router bgp neighbor swp54 peer-group underlay
cumulus@border01:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@border01:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@border01:~$ nv config apply
cumulus@border02:~$ nv set interface lo ip address 10.10.10.64/32
cumulus@border02:~$ nv set interface swp3,swp49-54
cumulus@border02:~$ nv set interface bond3 bond member swp3
cumulus@border02:~$ nv set interface bond3 bond mlag id 1
cumulus@border02:~$ nv set interface bond3 bond lacp-bypass on
cumulus@border02:~$ nv set interface bond3 link mtu 9000
cumulus@border02:~$ nv set interface bond3 bridge domain br_default
cumulus@border02:~$ nv set interface peerlink bond member swp49-50
cumulus@border02:~$ nv set mlag mac-address 44:38:39:FF:00:FF
cumulus@border02:~$ nv set mlag backup 10.10.10.63
cumulus@border02:~$ nv set mlag peer-ip linklocal
cumulus@border02:~$ nv set mlag priority 2000
cumulus@border02:~$ nv set mlag init-delay 10
cumulus@border02:~$ nv set interface vlan10
cumulus@border02:~$ nv set interface vlan20
cumulus@border02:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@border02:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@border02:~$ nv set interface bond3 bridge domain br_default vlan 10,20
cumulus@border02:~$ nv set nve vxlan mlag shared-address 10.0.1.255
cumulus@border02:~$ nv set nve vxlan source address 10.10.10.64
cumulus@border02:~$ nv set nve vxlan arp-nd-suppress on
cumulus@border02:~$ nv set evpn enable on
cumulus@border02:~$ nv set router bgp autonomous-system 65254
cumulus@border02:~$ nv set router bgp router-id 10.10.10.64
cumulus@border02:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@border02:~$ nv set vrf default router bgp neighbor peerlink.4094 peer-group underlay
cumulus@border02:~$ nv set vrf default router bgp neighbor swp51 peer-group underlay
cumulus@border02:~$ nv set vrf default router bgp neighbor swp52 peer-group underlay
cumulus@border02:~$ nv set vrf default router bgp neighbor swp53 peer-group underlay
cumulus@border02:~$ nv set vrf default router bgp neighbor swp54 peer-group underlay
cumulus@border02:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@border02:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@border02:~$ nv config apply
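The four spine command lists above differ only in the loopback address and router ID. The following Python sketch shows one way to generate them from a spine index. It is not part of NVUE: `spine_commands` and its parameters are hypothetical names used for illustration, and the `10.10.10.10x` addressing and AS 65199 are taken from this example topology only.

```python
# Hypothetical helper (not an NVUE feature): builds the per-spine NVUE
# command list shown above for this example topology.

def spine_commands(index: int, asn: int = 65199, ports: int = 6) -> list[str]:
    """Return the NVUE commands for spine<index> (1-4 in this example)."""
    router_id = f"10.10.10.{100 + index}"      # spine01 -> 10.10.10.101
    bgp = "nv set vrf default router bgp"
    cmds = [
        f"nv set interface lo ip address {router_id}/32",
        f"nv set interface swp1-{ports}",
        f"nv set router bgp autonomous-system {asn}",
        f"nv set router bgp router-id {router_id}",
        f"{bgp} peer-group underlay remote-as external",
        f"{bgp} path-selection multipath aspath-ignore on",
    ]
    # One BGP unnumbered neighbor per leaf- or border-facing port.
    cmds += [f"{bgp} neighbor swp{p} peer-group underlay" for p in range(1, ports + 1)]
    cmds += [
        f"{bgp} address-family l2vpn-evpn enable on",
        f"{bgp} peer-group underlay address-family l2vpn-evpn enable on",
        f"{bgp} address-family ipv4-unicast redistribute connected",
        "nv config apply",
    ]
    return cmds


if __name__ == "__main__":
    for line in spine_commands(1):
        print(line)
```

Review the generated commands against the lists above before you run them on a switch.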
cumulus@leaf01:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml 
- set:
    interface:
      lo:
        ip:
          address:
            10.10.10.1/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      swp53:
        type: swp
      swp54:
        type: swp
      bond1:
        bond:
          member:
            swp1: {}
          mlag:
            id: 1
          lacp-bypass: on
        type: bond
        link:
          mtu: 9000
        bridge:
          domain:
            br_default:
              access: 10
      bond2:
        bond:
          member:
            swp2: {}
          mlag:
            id: 2
          lacp-bypass: on
        type: bond
        link:
          mtu: 9000
        bridge:
          domain:
            br_default:
              access: 20
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        type: sub
        base-interface: peerlink
        vlan: 4094
      vlan10:
        type: svi
        vlan: 10
      vlan20:
        type: svi
        vlan: 20
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
    mlag:
      mac-address: 44:38:39:FF:00:AA
      backup:
        10.10.10.2: {}
      peer-ip: linklocal
      priority: 1000
      init-delay: 10
    nve:
      vxlan:
        enable: on
        mlag:
          shared-address: 10.0.1.12
        source:
          address: 10.10.10.1
        arp-nd-suppress: on
    evpn:
      enable: on
    router:
      bgp:
        enable: on
        autonomous-system: 65101
        router-id: 10.10.10.1
    vrf:
      default:
        router:
          bgp:
            peer-group:
              underlay:
                remote-as: external
                address-family:
                  l2vpn-evpn:
                    enable: on
            enable: on
            peer:
              swp51:
                peer-group: underlay
                type: unnumbered
              swp52:
                peer-group: underlay
                type: unnumbered
              swp53:
                peer-group: underlay
                type: unnumbered
              swp54:
                peer-group: underlay
                type: unnumbered
              peerlink.4094:
                peer-group: underlay
                type: unnumbered
            address-family:
              ipv4-unicast:
                redistribute:
                  connected:
                    enable: on
                enable: on
cumulus@leaf02:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml 
- set:
    interface:
      lo:
        ip:
          address:
            10.10.10.2/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      swp53:
        type: swp
      swp54:
        type: swp
      bond1:
        bond:
          member:
            swp1: {}
          mlag:
            id: 1
          lacp-bypass: on
        type: bond
        link:
          mtu: 9000
        bridge:
          domain:
            br_default:
              access: 10
      bond2:
        bond:
          member:
            swp2: {}
          mlag:
            id: 2
          lacp-bypass: on
        type: bond
        link:
          mtu: 9000
        bridge:
          domain:
            br_default:
              access: 20
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        type: sub
        base-interface: peerlink
        vlan: 4094
      vlan10:
        type: svi
        vlan: 10
      vlan20:
        type: svi
        vlan: 20
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
    mlag:
      mac-address: 44:38:39:FF:00:AA
      backup:
        10.10.10.1: {}
      peer-ip: linklocal
      priority: 2000
      init-delay: 10
    nve:
      vxlan:
        enable: on
        mlag:
          shared-address: 10.0.1.12
        source:
          address: 10.10.10.2
        arp-nd-suppress: on
    evpn:
      enable: on
    router:
      bgp:
        enable: on
        autonomous-system: 65102
        router-id: 10.10.10.2
    vrf:
      default:
        router:
          bgp:
            peer-group:
              underlay:
                remote-as: external
                address-family:
                  l2vpn-evpn:
                    enable: on
            enable: on
            peer:
              swp51:
                peer-group: underlay
                type: unnumbered
              swp52:
                peer-group: underlay
                type: unnumbered
              swp53:
                peer-group: underlay
                type: unnumbered
              swp54:
                peer-group: underlay
                type: unnumbered
              peerlink.4094:
                peer-group: underlay
                type: unnumbered
            address-family:
              ipv4-unicast:
                redistribute:
                  connected:
                    enable: on
                enable: on
cumulus@leaf03:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml 
- set:
    interface:
      lo:
        ip:
          address:
            10.10.10.3/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      swp53:
        type: swp
      swp54:
        type: swp
      bond1:
        bond:
          member:
            swp1: {}
          mlag:
            id: 1
          lacp-bypass: on
        type: bond
        link:
          mtu: 9000
        bridge:
          domain:
            br_default:
              access: 10
      bond2:
        bond:
          member:
            swp2: {}
          mlag:
            id: 2
          lacp-bypass: on
        type: bond
        link:
          mtu: 9000
        bridge:
          domain:
            br_default:
              access: 20
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        type: sub
        base-interface: peerlink
        vlan: 4094
      vlan10:
        type: svi
        vlan: 10
      vlan20:
        type: svi
        vlan: 20
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
    mlag:
      mac-address: 44:38:39:FF:00:BB
      backup:
        10.10.10.4: {}
      peer-ip: linklocal
      priority: 1000
      init-delay: 10
    nve:
      vxlan:
        enable: on
        mlag:
          shared-address: 10.0.1.34
        source:
          address: 10.10.10.3
        arp-nd-suppress: on
    evpn:
      enable: on
    router:
      bgp:
        enable: on
        autonomous-system: 65103
        router-id: 10.10.10.3
    vrf:
      default:
        router:
          bgp:
            peer-group:
              underlay:
                remote-as: external
                address-family:
                  l2vpn-evpn:
                    enable: on
            enable: on
            peer:
              swp51:
                peer-group: underlay
                type: unnumbered
              swp52:
                peer-group: underlay
                type: unnumbered
              swp53:
                peer-group: underlay
                type: unnumbered
              swp54:
                peer-group: underlay
                type: unnumbered
              peerlink.4094:
                peer-group: underlay
                type: unnumbered
            address-family:
              ipv4-unicast:
                redistribute:
                  connected:
                    enable: on
                enable: on
cumulus@leaf04:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml 
- set:
    interface:
      lo:
        ip:
          address:
            10.10.10.4/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      swp53:
        type: swp
      swp54:
        type: swp
      bond1:
        bond:
          member:
            swp1: {}
          mlag:
            id: 1
          lacp-bypass: on
        type: bond
        link:
          mtu: 9000
        bridge:
          domain:
            br_default:
              access: 10
      bond2:
        bond:
          member:
            swp2: {}
          mlag:
            id: 2
          lacp-bypass: on
        type: bond
        link:
          mtu: 9000
        bridge:
          domain:
            br_default:
              access: 20
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        type: sub
        base-interface: peerlink
        vlan: 4094
      vlan10:
        type: svi
        vlan: 10
      vlan20:
        type: svi
        vlan: 20
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
    mlag:
      mac-address: 44:38:39:FF:00:BB
      backup:
        10.10.10.3: {}
      peer-ip: linklocal
      priority: 2000
      init-delay: 10
    nve:
      vxlan:
        enable: on
        mlag:
          shared-address: 10.0.1.34
        source:
          address: 10.10.10.4
        arp-nd-suppress: on
    evpn:
      enable: on
    router:
      bgp:
        enable: on
        autonomous-system: 65104
        router-id: 10.10.10.4
    vrf:
      default:
        router:
          bgp:
            peer-group:
              underlay:
                remote-as: external
                address-family:
                  l2vpn-evpn:
                    enable: on
            enable: on
            peer:
              swp51:
                peer-group: underlay
                type: unnumbered
              swp52:
                peer-group: underlay
                type: unnumbered
              swp53:
                peer-group: underlay
                type: unnumbered
              swp54:
                peer-group: underlay
                type: unnumbered
              peerlink.4094:
                peer-group: underlay
                type: unnumbered
            address-family:
              ipv4-unicast:
                redistribute:
                  connected:
                    enable: on
                enable: on
cumulus@spine01:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml 
- set:
    interface:
      lo:
        ip:
          address:
            10.10.10.101/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp4:
        type: swp
      swp5:
        type: swp
      swp6:
        type: swp
    router:
      bgp:
        autonomous-system: 65199
        enable: on
        router-id: 10.10.10.101
    vrf:
      default:
        router:
          bgp:
            peer-group:
              underlay:
                remote-as: external
                address-family:
                  l2vpn-evpn:
                    enable: on
            enable: on
            peer:
              swp1:
                peer-group: underlay
                type: unnumbered
              swp2:
                peer-group: underlay
                type: unnumbered
              swp3:
                peer-group: underlay
                type: unnumbered
              swp4:
                peer-group: underlay
                type: unnumbered
              swp5:
                peer-group: underlay
                type: unnumbered
              swp6:
                peer-group: underlay
                type: unnumbered
            address-family:
              l2vpn-evpn:
                enable: on
              ipv4-unicast:
                redistribute:
                  connected:
                    enable: on
                enable: on
cumulus@spine02:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml 
- set:
    interface:
      lo:
        ip:
          address:
            10.10.10.102/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp4:
        type: swp
      swp5:
        type: swp
      swp6:
        type: swp
    router:
      bgp:
        autonomous-system: 65199
        enable: on
        router-id: 10.10.10.102
    vrf:
      default:
        router:
          bgp:
            peer-group:
              underlay:
                remote-as: external
                address-family:
                  l2vpn-evpn:
                    enable: on
            enable: on
            peer:
              swp1:
                peer-group: underlay
                type: unnumbered
              swp2:
                peer-group: underlay
                type: unnumbered
              swp3:
                peer-group: underlay
                type: unnumbered
              swp4:
                peer-group: underlay
                type: unnumbered
              swp5:
                peer-group: underlay
                type: unnumbered
              swp6:
                peer-group: underlay
                type: unnumbered
            address-family:
              l2vpn-evpn:
                enable: on
              ipv4-unicast:
                redistribute:
                  connected:
                    enable: on
                enable: on
cumulus@spine03:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml 
- set:
    interface:
      lo:
        ip:
          address:
            10.10.10.103/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp4:
        type: swp
      swp5:
        type: swp
      swp6:
        type: swp
    router:
      bgp:
        autonomous-system: 65199
        enable: on
        router-id: 10.10.10.103
    vrf:
      default:
        router:
          bgp:
            peer-group:
              underlay:
                remote-as: external
                address-family:
                  l2vpn-evpn:
                    enable: on
            enable: on
            peer:
              swp1:
                peer-group: underlay
                type: unnumbered
              swp2:
                peer-group: underlay
                type: unnumbered
              swp3:
                peer-group: underlay
                type: unnumbered
              swp4:
                peer-group: underlay
                type: unnumbered
              swp5:
                peer-group: underlay
                type: unnumbered
              swp6:
                peer-group: underlay
                type: unnumbered
            address-family:
              l2vpn-evpn:
                enable: on
              ipv4-unicast:
                redistribute:
                  connected:
                    enable: on
                enable: on
cumulus@spine04:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml 
- set:
    interface:
      lo:
        ip:
          address:
            10.10.10.104/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp4:
        type: swp
      swp5:
        type: swp
      swp6:
        type: swp
    router:
      bgp:
        autonomous-system: 65199
        enable: on
        router-id: 10.10.10.104
    vrf:
      default:
        router:
          bgp:
            peer-group:
              underlay:
                remote-as: external
                address-family:
                  l2vpn-evpn:
                    enable: on
            enable: on
            peer:
              swp1:
                peer-group: underlay
                type: unnumbered
              swp2:
                peer-group: underlay
                type: unnumbered
              swp3:
                peer-group: underlay
                type: unnumbered
              swp4:
                peer-group: underlay
                type: unnumbered
              swp5:
                peer-group: underlay
                type: unnumbered
              swp6:
                peer-group: underlay
                type: unnumbered
            address-family:
              l2vpn-evpn:
                enable: on
              ipv4-unicast:
                redistribute:
                  connected:
                    enable: on
                enable: on
cumulus@border01:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml 
- set:
    interface:
      lo:
        ip:
          address:
            10.10.10.63/32: {}
        type: loopback
      swp3:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      swp53:
        type: swp
      swp54:
        type: swp
      bond3:
        bond:
          member:
            swp3: {}
          mlag:
            id: 1
          lacp-bypass: on
        type: bond
        link:
          mtu: 9000
        bridge:
          domain:
            br_default:
              vlan:
                '10': {}
                '20': {}
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        type: sub
        base-interface: peerlink
        vlan: 4094
      vlan10:
        type: svi
        vlan: 10
      vlan20:
        type: svi
        vlan: 20
    mlag:
      mac-address: 44:38:39:FF:00:FF
      backup:
        10.10.10.64: {}
      peer-ip: linklocal
      priority: 1000
      init-delay: 10
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
    nve:
      vxlan:
        enable: on
        mlag:
          shared-address: 10.0.1.255
        source:
          address: 10.10.10.63
        arp-nd-suppress: on
    evpn:
      enable: on
    router:
      bgp:
        enable: on
        autonomous-system: 65253
        router-id: 10.10.10.63
    vrf:
      default:
        router:
          bgp:
            peer-group:
              underlay:
                remote-as: external
                address-family:
                  l2vpn-evpn:
                    enable: on
            enable: on
            peer:
              swp51:
                peer-group: underlay
                type: unnumbered
              swp52:
                peer-group: underlay
                type: unnumbered
              swp53:
                peer-group: underlay
                type: unnumbered
              swp54:
                peer-group: underlay
                type: unnumbered
              peerlink.4094:
                peer-group: underlay
                type: unnumbered
            address-family:
              ipv4-unicast:
                redistribute:
                  connected:
                    enable: on
                enable: on
cumulus@border02:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    interface:
      lo:
        ip:
          address:
            10.10.10.64/32: {}
        type: loopback
      swp3:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      swp53:
        type: swp
      swp54:
        type: swp
      bond3:
        bond:
          member:
            swp3: {}
          mlag:
            id: 1
          lacp-bypass: on
        type: bond
        link:
          mtu: 9000
        bridge:
          domain:
            br_default:
              vlan:
                '10': {}
                '20': {}
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        type: sub
        base-interface: peerlink
        vlan: 4094
      vlan10:
        type: svi
        vlan: 10
      vlan20:
        type: svi
        vlan: 20
    mlag:
      mac-address: 44:38:39:FF:00:FF
      backup:
        10.10.10.63: {}
      peer-ip: linklocal
      priority: 2000
      init-delay: 10
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
    nve:
      vxlan:
        enable: on
        mlag:
          shared-address: 10.0.1.255
        source:
          address: 10.10.10.64
        arp-nd-suppress: on
    evpn:
      enable: on
    router:
      bgp:
        enable: on
        autonomous-system: 65254
        router-id: 10.10.10.64
    vrf:
      default:
        router:
          bgp:
            peer-group:
              underlay:
                remote-as: external
                address-family:
                  l2vpn-evpn:
                    enable: on
            enable: on
            peer:
              swp51:
                peer-group: underlay
                type: unnumbered
              swp52:
                peer-group: underlay
                type: unnumbered
              swp53:
                peer-group: underlay
                type: unnumbered
              swp54:
                peer-group: underlay
                type: unnumbered
              peerlink.4094:
                peer-group: underlay
                type: unnumbered
            address-family:
              ipv4-unicast:
                redistribute:
                  connected:
                    enable: on
                enable: on
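In the startup.yaml files above, each MLAG pair shares the same `mlag mac-address` and the same `nve vxlan mlag shared-address`, and each switch lists its peer's loopback address as the `mlag backup` address. The following Python sketch checks those three relationships for a pair. The `check_mlag_pair` helper and the flat dictionary layout are assumptions made for illustration; the values are copied from the leaf01, leaf02, border01, and border02 files above.

```python
# Hypothetical consistency check (not a Cumulus Linux tool). Each dict
# holds values copied by hand from one switch's startup.yaml above.

def check_mlag_pair(a: dict, b: dict) -> list[str]:
    """Return a list of problems found between two MLAG peers."""
    problems = []
    if a["mac-address"] != b["mac-address"]:
        problems.append("mlag mac-address differs between peers")
    if a["shared-address"] != b["shared-address"]:
        problems.append("nve vxlan mlag shared-address differs between peers")
    if a["backup"] != b["loopback"] or b["backup"] != a["loopback"]:
        problems.append("mlag backup does not point at the peer loopback")
    return problems


leaf01 = {"loopback": "10.10.10.1", "backup": "10.10.10.2",
          "mac-address": "44:38:39:FF:00:AA", "shared-address": "10.0.1.12"}
leaf02 = {"loopback": "10.10.10.2", "backup": "10.10.10.1",
          "mac-address": "44:38:39:FF:00:AA", "shared-address": "10.0.1.12"}
border01 = {"loopback": "10.10.10.63", "backup": "10.10.10.64",
            "mac-address": "44:38:39:FF:00:FF", "shared-address": "10.0.1.255"}
border02 = {"loopback": "10.10.10.64", "backup": "10.10.10.63",
            "mac-address": "44:38:39:FF:00:FF", "shared-address": "10.0.1.255"}

if __name__ == "__main__":
    for name, pair in (("leaf01/leaf02", (leaf01, leaf02)),
                       ("border01/border02", (border01, border02))):
        print(name, check_mlag_pair(*pair) or "consistent")
```

Pairing switches that do not belong together, such as leaf01 with border01, reports all three problems.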
cumulus@leaf01:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.1/32
    clagd-vxlan-anycast-ip 10.0.1.12
    vxlan-local-tunnelip 10.10.10.1
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto swp53
iface swp53
auto swp54
iface swp54
auto bond1
iface bond1
    mtu 9000
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 1
    bridge-access 10
auto bond2
iface bond2
    mtu 9000
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 2
    bridge-access 20
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-priority 1000
    clagd-backup-ip 10.10.10.2
    clagd-sys-mac 44:38:39:FF:00:AA
    clagd-args --initDelay 10
auto vlan10
iface vlan10
    hwaddress 44:38:39:22:01:b1
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    hwaddress 44:38:39:22:01:b1
    vlan-raw-device br_default
    vlan-id 20
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20
    bridge-vids 10 20
    bridge-learning off

auto br_default
iface br_default
    bridge-ports bond1 bond2 peerlink vxlan48
    hwaddress 44:38:39:22:01:b1
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1

cumulus@leaf02:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.2/32
    clagd-vxlan-anycast-ip 10.0.1.12
    vxlan-local-tunnelip 10.10.10.2
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto swp53
iface swp53
auto swp54
iface swp54
auto bond1
iface bond1
    mtu 9000
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 1
    bridge-access 10
auto bond2
iface bond2
    mtu 9000
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 2
    bridge-access 20
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-priority 2000
    clagd-backup-ip 10.10.10.1
    clagd-sys-mac 44:38:39:FF:00:AA
    clagd-args --initDelay 10
auto vlan10
iface vlan10
    hwaddress 44:38:39:22:01:af
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    hwaddress 44:38:39:22:01:af
    vlan-raw-device br_default
    vlan-id 20
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20
    bridge-vids 10 20
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond1 bond2 peerlink vxlan48
    hwaddress 44:38:39:22:01:af
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1
cumulus@leaf03:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.3/32
    clagd-vxlan-anycast-ip 10.0.1.34
    vxlan-local-tunnelip 10.10.10.3
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto swp53
iface swp53
auto swp54
iface swp54
auto bond1
iface bond1
    mtu 9000
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 1
    bridge-access 10
auto bond2
iface bond2
    mtu 9000
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 2
    bridge-access 20
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-priority 1000
    clagd-backup-ip 10.10.10.4
    clagd-sys-mac 44:38:39:FF:00:BB
    clagd-args --initDelay 10
auto vlan10
iface vlan10
    hwaddress 44:38:39:22:01:bb
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    hwaddress 44:38:39:22:01:bb
    vlan-raw-device br_default
    vlan-id 20
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20
    bridge-vids 10 20
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond1 bond2 peerlink vxlan48
    hwaddress 44:38:39:22:01:bb
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1
cumulus@leaf04:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.4/32
    clagd-vxlan-anycast-ip 10.0.1.34
    vxlan-local-tunnelip 10.10.10.4
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto swp53
iface swp53
auto swp54
iface swp54
auto bond1
iface bond1
    mtu 9000
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 1
    bridge-access 10
auto bond2
iface bond2
    mtu 9000
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 2
    bridge-access 20
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-priority 2000
    clagd-backup-ip 10.10.10.3
    clagd-sys-mac 44:38:39:FF:00:BB
    clagd-args --initDelay 10
auto vlan10
iface vlan10
    hwaddress 44:38:39:22:01:c1
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    hwaddress 44:38:39:22:01:c1
    vlan-raw-device br_default
    vlan-id 20
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20
    bridge-vids 10 20
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond1 bond2 peerlink vxlan48
    hwaddress 44:38:39:22:01:c1
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1
cumulus@spine01:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.101/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp4
iface swp4
auto swp5
iface swp5
auto swp6
iface swp6
cumulus@spine02:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.102/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp4
iface swp4
auto swp5
iface swp5
auto swp6
iface swp6
cumulus@spine03:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.103/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp4
iface swp4
auto swp5
iface swp5
auto swp6
iface swp6
cumulus@spine04:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.104/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp4
iface swp4
auto swp5
iface swp5
auto swp6
iface swp6
cumulus@border01:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.63/32
    clagd-vxlan-anycast-ip 10.0.1.255
    vxlan-local-tunnelip 10.10.10.63
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp3
iface swp3
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto swp53
iface swp53
auto swp54
iface swp54
auto bond3
iface bond3
    mtu 9000
    bond-slaves swp3
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 1
    bridge-vids 10 20
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-priority 1000
    clagd-backup-ip 10.10.10.64
    clagd-sys-mac 44:38:39:FF:00:FF
    clagd-args --initDelay 10
auto vlan10
iface vlan10
    hwaddress 44:38:39:22:01:ab
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    hwaddress 44:38:39:22:01:ab
    vlan-raw-device br_default
    vlan-id 20
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20
    bridge-vids 10 20
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond3 peerlink vxlan48
    hwaddress 44:38:39:22:01:ab
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1
cumulus@border02:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.64/32
    clagd-vxlan-anycast-ip 10.0.1.255
    vxlan-local-tunnelip 10.10.10.64
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp3
iface swp3
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto swp53
iface swp53
auto swp54
iface swp54
auto bond3
iface bond3
    mtu 9000
    bond-slaves swp3
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 1
    bridge-vids 10 20
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-priority 2000
    clagd-backup-ip 10.10.10.63
    clagd-sys-mac 44:38:39:FF:00:FF
    clagd-args --initDelay 10
auto vlan10
iface vlan10
    hwaddress 44:38:39:22:01:b3
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    hwaddress 44:38:39:22:01:b3
    vlan-raw-device br_default
    vlan-id 20
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20
    bridge-vids 10 20
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond3 peerlink vxlan48
    hwaddress 44:38:39:22:01:b3
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1
    ```
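In the `/etc/network/interfaces` files above, the two switches in each MLAG pair use the same `clagd-sys-mac` and `clagd-vxlan-anycast-ip`, and each switch points `clagd-backup-ip` at its peer's `vxlan-local-tunnelip`. The following Python sketch checks those relationships on two excerpts copied from leaf01 and leaf02. The helper names `mlag_settings` and `check_mlag_pair` are hypothetical, and treating the backup IP as the peer's tunnel IP is a convention of this example, not a general requirement.

```python
# Keys compared across an MLAG pair (all appear in the files above).
MLAG_KEYS = (
    "clagd-vxlan-anycast-ip",
    "vxlan-local-tunnelip",
    "clagd-backup-ip",
    "clagd-sys-mac",
)


def mlag_settings(interfaces_text):
    """Collect the MLAG-related key/value lines from an interfaces excerpt."""
    settings = {}
    for line in interfaces_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in MLAG_KEYS:
            settings[parts[0]] = parts[1]
    return settings


def check_mlag_pair(a, b):
    """Return a list of mismatches between two MLAG peers (empty if none)."""
    problems = []
    # Both peers present one system MAC and one anycast VTEP address.
    for key in ("clagd-sys-mac", "clagd-vxlan-anycast-ip"):
        if a.get(key) != b.get(key):
            problems.append(f"{key} differs: {a.get(key)} vs {b.get(key)}")
    # Assumption from this example: the backup IP is the peer's tunnel IP.
    if a.get("clagd-backup-ip") != b.get("vxlan-local-tunnelip"):
        problems.append("first peer's clagd-backup-ip is not the second peer's tunnel IP")
    if b.get("clagd-backup-ip") != a.get("vxlan-local-tunnelip"):
        problems.append("second peer's clagd-backup-ip is not the first peer's tunnel IP")
    return problems


# Excerpts copied from the leaf01 and leaf02 files above.
LEAF01 = """
    clagd-vxlan-anycast-ip 10.0.1.12
    vxlan-local-tunnelip 10.10.10.1
    clagd-backup-ip 10.10.10.2
    clagd-sys-mac 44:38:39:FF:00:AA
"""
LEAF02 = """
    clagd-vxlan-anycast-ip 10.0.1.12
    vxlan-local-tunnelip 10.10.10.2
    clagd-backup-ip 10.10.10.1
    clagd-sys-mac 44:38:39:FF:00:AA
"""

leaf01 = mlag_settings(LEAF01)
leaf02 = mlag_settings(LEAF02)
print(check_mlag_pair(leaf01, leaf02) or "pair is consistent")
```

You can run the same check on the leaf03/leaf04 and border01/border02 pairs by pasting the corresponding excerpts.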

cumulus@leaf01:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65101 vrf default
bgp router-id 10.10.10.1
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor peerlink.4094 interface remote-as external
neighbor peerlink.4094 interface peer-group underlay
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 capability extended-nexthop
neighbor swp51 interface remote-as external
neighbor swp51 interface peer-group underlay
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface remote-as external
neighbor swp52 interface peer-group underlay
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
neighbor swp53 interface remote-as external
neighbor swp53 interface peer-group underlay
neighbor swp53 timers 3 9
neighbor swp53 timers connect 10
neighbor swp53 advertisement-interval 0
neighbor swp53 capability extended-nexthop
neighbor swp54 interface remote-as external
neighbor swp54 interface peer-group underlay
neighbor swp54 timers 3 9
neighbor swp54 timers connect 10
neighbor swp54 advertisement-interval 0
neighbor swp54 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
...
cumulus@leaf02:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65102 vrf default
bgp router-id 10.10.10.2
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor peerlink.4094 interface remote-as external
neighbor peerlink.4094 interface peer-group underlay
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 capability extended-nexthop
neighbor swp51 interface remote-as external
neighbor swp51 interface peer-group underlay
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface remote-as external
neighbor swp52 interface peer-group underlay
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
neighbor swp53 interface remote-as external
neighbor swp53 interface peer-group underlay
neighbor swp53 timers 3 9
neighbor swp53 timers connect 10
neighbor swp53 advertisement-interval 0
neighbor swp53 capability extended-nexthop
neighbor swp54 interface remote-as external
neighbor swp54 interface peer-group underlay
neighbor swp54 timers 3 9
neighbor swp54 timers connect 10
neighbor swp54 advertisement-interval 0
neighbor swp54 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
cumulus@leaf03:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65103 vrf default
bgp router-id 10.10.10.3
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor peerlink.4094 interface remote-as external
neighbor peerlink.4094 interface peer-group underlay
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 capability extended-nexthop
neighbor swp51 interface remote-as external
neighbor swp51 interface peer-group underlay
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface remote-as external
neighbor swp52 interface peer-group underlay
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
neighbor swp53 interface remote-as external
neighbor swp53 interface peer-group underlay
neighbor swp53 timers 3 9
neighbor swp53 timers connect 10
neighbor swp53 advertisement-interval 0
neighbor swp53 capability extended-nexthop
neighbor swp54 interface remote-as external
neighbor swp54 interface peer-group underlay
neighbor swp54 timers 3 9
neighbor swp54 timers connect 10
neighbor swp54 advertisement-interval 0
neighbor swp54 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
...
cumulus@leaf04:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65104 vrf default
bgp router-id 10.10.10.4
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor peerlink.4094 interface remote-as external
neighbor peerlink.4094 interface peer-group underlay
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 capability extended-nexthop
neighbor swp51 interface remote-as external
neighbor swp51 interface peer-group underlay
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface remote-as external
neighbor swp52 interface peer-group underlay
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
neighbor swp53 interface remote-as external
neighbor swp53 interface peer-group underlay
neighbor swp53 timers 3 9
neighbor swp53 timers connect 10
neighbor swp53 advertisement-interval 0
neighbor swp53 capability extended-nexthop
neighbor swp54 interface remote-as external
neighbor swp54 interface peer-group underlay
neighbor swp54 timers 3 9
neighbor swp54 timers connect 10
neighbor swp54 advertisement-interval 0
neighbor swp54 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
cumulus@spine01:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65199 vrf default
bgp router-id 10.10.10.101
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor swp1 interface remote-as external
neighbor swp1 interface peer-group underlay
neighbor swp1 timers 3 9
neighbor swp1 timers connect 10
neighbor swp1 advertisement-interval 0
neighbor swp1 capability extended-nexthop
neighbor swp2 interface remote-as external
neighbor swp2 interface peer-group underlay
neighbor swp2 timers 3 9
neighbor swp2 timers connect 10
neighbor swp2 advertisement-interval 0
neighbor swp2 capability extended-nexthop
neighbor swp3 interface remote-as external
neighbor swp3 interface peer-group underlay
neighbor swp3 timers 3 9
neighbor swp3 timers connect 10
neighbor swp3 advertisement-interval 0
neighbor swp3 capability extended-nexthop
neighbor swp4 interface remote-as external
neighbor swp4 interface peer-group underlay
neighbor swp4 timers 3 9
neighbor swp4 timers connect 10
neighbor swp4 advertisement-interval 0
neighbor swp4 capability extended-nexthop
neighbor swp5 interface remote-as external
neighbor swp5 interface peer-group underlay
neighbor swp5 timers 3 9
neighbor swp5 timers connect 10
neighbor swp5 advertisement-interval 0
neighbor swp5 capability extended-nexthop
neighbor swp6 interface remote-as external
neighbor swp6 interface peer-group underlay
neighbor swp6 timers 3 9
neighbor swp6 timers connect 10
neighbor swp6 advertisement-interval 0
neighbor swp6 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor swp5 activate
neighbor swp6 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor swp5 activate
neighbor swp6 activate
neighbor underlay activate
exit-address-family
cumulus@spine02:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65199 vrf default
bgp router-id 10.10.10.102
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor swp1 interface remote-as external
neighbor swp1 interface peer-group underlay
neighbor swp1 timers 3 9
neighbor swp1 timers connect 10
neighbor swp1 advertisement-interval 0
neighbor swp1 capability extended-nexthop
neighbor swp2 interface remote-as external
neighbor swp2 interface peer-group underlay
neighbor swp2 timers 3 9
neighbor swp2 timers connect 10
neighbor swp2 advertisement-interval 0
neighbor swp2 capability extended-nexthop
neighbor swp3 interface remote-as external
neighbor swp3 interface peer-group underlay
neighbor swp3 timers 3 9
neighbor swp3 timers connect 10
neighbor swp3 advertisement-interval 0
neighbor swp3 capability extended-nexthop
neighbor swp4 interface remote-as external
neighbor swp4 interface peer-group underlay
neighbor swp4 timers 3 9
neighbor swp4 timers connect 10
neighbor swp4 advertisement-interval 0
neighbor swp4 capability extended-nexthop
neighbor swp5 interface remote-as external
neighbor swp5 interface peer-group underlay
neighbor swp5 timers 3 9
neighbor swp5 timers connect 10
neighbor swp5 advertisement-interval 0
neighbor swp5 capability extended-nexthop
neighbor swp6 interface remote-as external
neighbor swp6 interface peer-group underlay
neighbor swp6 timers 3 9
neighbor swp6 timers connect 10
neighbor swp6 advertisement-interval 0
neighbor swp6 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor swp5 activate
neighbor swp6 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor swp5 activate
neighbor swp6 activate
neighbor underlay activate
exit-address-family
cumulus@spine03:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65199 vrf default
bgp router-id 10.10.10.103
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor swp1 interface remote-as external
neighbor swp1 interface peer-group underlay
neighbor swp1 timers 3 9
neighbor swp1 timers connect 10
neighbor swp1 advertisement-interval 0
neighbor swp1 capability extended-nexthop
neighbor swp2 interface remote-as external
neighbor swp2 interface peer-group underlay
neighbor swp2 timers 3 9
neighbor swp2 timers connect 10
neighbor swp2 advertisement-interval 0
neighbor swp2 capability extended-nexthop
neighbor swp3 interface remote-as external
neighbor swp3 interface peer-group underlay
neighbor swp3 timers 3 9
neighbor swp3 timers connect 10
neighbor swp3 advertisement-interval 0
neighbor swp3 capability extended-nexthop
neighbor swp4 interface remote-as external
neighbor swp4 interface peer-group underlay
neighbor swp4 timers 3 9
neighbor swp4 timers connect 10
neighbor swp4 advertisement-interval 0
neighbor swp4 capability extended-nexthop
neighbor swp5 interface remote-as external
neighbor swp5 interface peer-group underlay
neighbor swp5 timers 3 9
neighbor swp5 timers connect 10
neighbor swp5 advertisement-interval 0
neighbor swp5 capability extended-nexthop
neighbor swp6 interface remote-as external
neighbor swp6 interface peer-group underlay
neighbor swp6 timers 3 9
neighbor swp6 timers connect 10
neighbor swp6 advertisement-interval 0
neighbor swp6 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor swp5 activate
neighbor swp6 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor swp5 activate
neighbor swp6 activate
neighbor underlay activate
exit-address-family
cumulus@spine04:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65199 vrf default
bgp router-id 10.10.10.104
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor swp1 interface remote-as external
neighbor swp1 interface peer-group underlay
neighbor swp1 timers 3 9
neighbor swp1 timers connect 10
neighbor swp1 advertisement-interval 0
neighbor swp1 capability extended-nexthop
neighbor swp2 interface remote-as external
neighbor swp2 interface peer-group underlay
neighbor swp2 timers 3 9
neighbor swp2 timers connect 10
neighbor swp2 advertisement-interval 0
neighbor swp2 capability extended-nexthop
neighbor swp3 interface remote-as external
neighbor swp3 interface peer-group underlay
neighbor swp3 timers 3 9
neighbor swp3 timers connect 10
neighbor swp3 advertisement-interval 0
neighbor swp3 capability extended-nexthop
neighbor swp4 interface remote-as external
neighbor swp4 interface peer-group underlay
neighbor swp4 timers 3 9
neighbor swp4 timers connect 10
neighbor swp4 advertisement-interval 0
neighbor swp4 capability extended-nexthop
neighbor swp5 interface remote-as external
neighbor swp5 interface peer-group underlay
neighbor swp5 timers 3 9
neighbor swp5 timers connect 10
neighbor swp5 advertisement-interval 0
neighbor swp5 capability extended-nexthop
neighbor swp6 interface remote-as external
neighbor swp6 interface peer-group underlay
neighbor swp6 timers 3 9
neighbor swp6 timers connect 10
neighbor swp6 advertisement-interval 0
neighbor swp6 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor swp5 activate
neighbor swp6 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor swp5 activate
neighbor swp6 activate
neighbor underlay activate
exit-address-family
cumulus@border01:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65253 vrf default
bgp router-id 10.10.10.63
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor peerlink.4094 interface remote-as external
neighbor peerlink.4094 interface peer-group underlay
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 capability extended-nexthop
neighbor swp51 interface remote-as external
neighbor swp51 interface peer-group underlay
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface remote-as external
neighbor swp52 interface peer-group underlay
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
neighbor swp53 interface remote-as external
neighbor swp53 interface peer-group underlay
neighbor swp53 timers 3 9
neighbor swp53 timers connect 10
neighbor swp53 advertisement-interval 0
neighbor swp53 capability extended-nexthop
neighbor swp54 interface remote-as external
neighbor swp54 interface peer-group underlay
neighbor swp54 timers 3 9
neighbor swp54 timers connect 10
neighbor swp54 advertisement-interval 0
neighbor swp54 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
cumulus@border02:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65254 vrf default
bgp router-id 10.10.10.64
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor peerlink.4094 interface remote-as external
neighbor peerlink.4094 interface peer-group underlay
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 capability extended-nexthop
neighbor swp51 interface remote-as external
neighbor swp51 interface peer-group underlay
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface remote-as external
neighbor swp52 interface peer-group underlay
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
neighbor swp53 interface remote-as external
neighbor swp53 interface peer-group underlay
neighbor swp53 timers 3 9
neighbor swp53 timers connect 10
neighbor swp53 advertisement-interval 0
neighbor swp53 capability extended-nexthop
neighbor swp54 interface remote-as external
neighbor swp54 interface peer-group underlay
neighbor swp54 timers 3 9
neighbor swp54 timers connect 10
neighbor swp54 advertisement-interval 0
neighbor swp54 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family

EVPN Centralized Routing

The following example shows an EVPN centralized routing configuration:

The following image shows traffic flow between tenants. The spines and other devices are omitted for simplicity.

Traffic Flow between server01 and server05
server01 and server05 are in different VLANs and connect to different leafs.
  1. server01 makes a LACP hash decision and forwards traffic to leaf01.
  2. leaf01 does a layer 2 lookup and forwards traffic to server01’s default gateway (border01) out VNI10.
  3. border01 does a layer 3 lookup and routes the packet out VNI20 towards leaf04.
  4. The VXLAN-encapsulated frame arrives on leaf04, which does a layer 2 lookup, finds the MAC address for server05 in VLAN20, and forwards the frame to server05.
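The steps above can be modeled with a short Python sketch. The table names and the `trace` function are hypothetical and exist only for illustration; this is not how the switch implements forwarding. The sketch shows that the leafs only bridge traffic, while the border leaf is the single point that routes between VNI10 and VNI20.

```python
# Illustrative model of centralized routing; all names here are hypothetical.
# Each server sits in one VLAN behind one leaf (values taken from the example).
SERVERS = {
    "server01": {"vlan": 10, "leaf": "leaf01"},
    "server05": {"vlan": 20, "leaf": "leaf04"},
}
VLAN_TO_VNI = {10: 10, 20: 20}   # vlan 10 -> vni 10, vlan 20 -> vni 20
GATEWAY = "border01"             # centralized gateway for both VLANs


def trace(src: str, dst: str) -> list[str]:
    """Return the hops a packet takes from src to dst."""
    s, d = SERVERS[src], SERVERS[dst]
    hops = [src, s["leaf"]]
    if s["vlan"] == d["vlan"]:
        # Same VLAN: bridged leaf to leaf in the same VNI, no routing.
        if d["leaf"] != s["leaf"]:
            hops.append(d["leaf"])
    else:
        # Different VLAN: layer 2 to the gateway in the source VNI,
        # then routed out the destination VNI to the remote leaf.
        in_vni = VLAN_TO_VNI[s["vlan"]]
        out_vni = VLAN_TO_VNI[d["vlan"]]
        hops.append(f"{GATEWAY} (in VNI{in_vni}, out VNI{out_vni})")
        hops.append(d["leaf"])
    hops.append(dst)
    return hops


print(" -> ".join(trace("server01", "server05")))
# server01 -> leaf01 -> border01 (in VNI10, out VNI20) -> leaf04 -> server05
```

Because every inter-VLAN packet crosses the gateway, the leafs in this design need no SVI addresses of their own.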
cumulus@leaf01:~$ nv set interface lo ip address 10.10.10.1/32
cumulus@leaf01:~$ nv set interface swp1-2,swp49-54
cumulus@leaf01:~$ nv set interface bond1 bond member swp1
cumulus@leaf01:~$ nv set interface bond2 bond member swp2
cumulus@leaf01:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf01:~$ nv set interface bond2 bond mlag id 2
cumulus@leaf01:~$ nv set interface bond1 bond lacp-bypass on
cumulus@leaf01:~$ nv set interface bond2 bond lacp-bypass on
cumulus@leaf01:~$ nv set interface bond1 link mtu 9000
cumulus@leaf01:~$ nv set interface bond2 link mtu 9000
cumulus@leaf01:~$ nv set interface bond1-2 bridge domain br_default
cumulus@leaf01:~$ nv set interface bond1 bridge domain br_default access 10
cumulus@leaf01:~$ nv set interface bond2 bridge domain br_default access 20
cumulus@leaf01:~$ nv set bridge domain br_default vlan 10,20
cumulus@leaf01:~$ nv set interface peerlink bond member swp49-50
cumulus@leaf01:~$ nv set mlag mac-address 44:38:39:FF:00:AA
cumulus@leaf01:~$ nv set mlag backup 10.10.10.2
cumulus@leaf01:~$ nv set mlag peer-ip linklocal
cumulus@leaf01:~$ nv set mlag priority 1000
cumulus@leaf01:~$ nv set mlag init-delay 10
cumulus@leaf01:~$ nv set interface vlan10
cumulus@leaf01:~$ nv set interface vlan20
cumulus@leaf01:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf01:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf01:~$ nv set nve vxlan mlag shared-address 10.0.1.12
cumulus@leaf01:~$ nv set nve vxlan source address 10.10.10.1
cumulus@leaf01:~$ nv set nve vxlan arp-nd-suppress on
cumulus@leaf01:~$ nv set evpn enable on
cumulus@leaf01:~$ nv set router bgp autonomous-system 65101
cumulus@leaf01:~$ nv set router bgp router-id 10.10.10.1
cumulus@leaf01:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@leaf01:~$ nv set vrf default router bgp neighbor peerlink.4094 peer-group underlay
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 peer-group underlay
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp52 peer-group underlay
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp53 peer-group underlay
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp54 peer-group underlay
cumulus@leaf01:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@leaf01:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@leaf01:~$ nv config apply
cumulus@leaf02:~$ nv set interface lo ip address 10.10.10.2/32
cumulus@leaf02:~$ nv set interface swp1-2,swp49-54
cumulus@leaf02:~$ nv set interface bond1 bond member swp1
cumulus@leaf02:~$ nv set interface bond2 bond member swp2
cumulus@leaf02:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf02:~$ nv set interface bond2 bond mlag id 2
cumulus@leaf02:~$ nv set interface bond1 bond lacp-bypass on
cumulus@leaf02:~$ nv set interface bond2 bond lacp-bypass on
cumulus@leaf02:~$ nv set interface bond1 link mtu 9000
cumulus@leaf02:~$ nv set interface bond2 link mtu 9000
cumulus@leaf02:~$ nv set interface bond1-2 bridge domain br_default
cumulus@leaf02:~$ nv set interface bond1 bridge domain br_default access 10
cumulus@leaf02:~$ nv set interface bond2 bridge domain br_default access 20
cumulus@leaf02:~$ nv set bridge domain br_default vlan 10,20
cumulus@leaf02:~$ nv set interface peerlink bond member swp49-50
cumulus@leaf02:~$ nv set mlag mac-address 44:38:39:FF:00:AA
cumulus@leaf02:~$ nv set mlag backup 10.10.10.1
cumulus@leaf02:~$ nv set mlag peer-ip linklocal
cumulus@leaf02:~$ nv set mlag priority 2000
cumulus@leaf02:~$ nv set mlag init-delay 10
cumulus@leaf02:~$ nv set interface vlan10
cumulus@leaf02:~$ nv set interface vlan20
cumulus@leaf02:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf02:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf02:~$ nv set nve vxlan mlag shared-address 10.0.1.12
cumulus@leaf02:~$ nv set nve vxlan source address 10.10.10.2
cumulus@leaf02:~$ nv set nve vxlan arp-nd-suppress on
cumulus@leaf02:~$ nv set evpn enable on
cumulus@leaf02:~$ nv set router bgp autonomous-system 65102
cumulus@leaf02:~$ nv set router bgp router-id 10.10.10.2
cumulus@leaf02:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@leaf02:~$ nv set vrf default router bgp neighbor peerlink.4094 peer-group underlay
cumulus@leaf02:~$ nv set vrf default router bgp neighbor swp51 peer-group underlay
cumulus@leaf02:~$ nv set vrf default router bgp neighbor swp52 peer-group underlay
cumulus@leaf02:~$ nv set vrf default router bgp neighbor swp53 peer-group underlay
cumulus@leaf02:~$ nv set vrf default router bgp neighbor swp54 peer-group underlay
cumulus@leaf02:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@leaf02:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@leaf02:~$ nv config apply
cumulus@leaf03:~$ nv set interface lo ip address 10.10.10.3/32
cumulus@leaf03:~$ nv set interface swp1-2,swp49-54
cumulus@leaf03:~$ nv set interface bond1 bond member swp1
cumulus@leaf03:~$ nv set interface bond2 bond member swp2
cumulus@leaf03:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf03:~$ nv set interface bond2 bond mlag id 2
cumulus@leaf03:~$ nv set interface bond1 bond lacp-bypass on
cumulus@leaf03:~$ nv set interface bond2 bond lacp-bypass on
cumulus@leaf03:~$ nv set interface bond1 link mtu 9000
cumulus@leaf03:~$ nv set interface bond2 link mtu 9000
cumulus@leaf03:~$ nv set interface bond1-2 bridge domain br_default
cumulus@leaf03:~$ nv set interface bond1 bridge domain br_default access 10
cumulus@leaf03:~$ nv set interface bond2 bridge domain br_default access 20
cumulus@leaf03:~$ nv set bridge domain br_default vlan 10,20
cumulus@leaf03:~$ nv set interface peerlink bond member swp49-50
cumulus@leaf03:~$ nv set mlag mac-address 44:38:39:FF:00:BB
cumulus@leaf03:~$ nv set mlag backup 10.10.10.4
cumulus@leaf03:~$ nv set mlag peer-ip linklocal
cumulus@leaf03:~$ nv set mlag priority 1000
cumulus@leaf03:~$ nv set mlag init-delay 10
cumulus@leaf03:~$ nv set interface vlan10
cumulus@leaf03:~$ nv set interface vlan20
cumulus@leaf03:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf03:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf03:~$ nv set nve vxlan mlag shared-address 10.0.1.34
cumulus@leaf03:~$ nv set nve vxlan source address 10.10.10.3
cumulus@leaf03:~$ nv set nve vxlan arp-nd-suppress on
cumulus@leaf03:~$ nv set evpn enable on
cumulus@leaf03:~$ nv set router bgp autonomous-system 65103
cumulus@leaf03:~$ nv set router bgp router-id 10.10.10.3
cumulus@leaf03:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@leaf03:~$ nv set vrf default router bgp neighbor peerlink.4094 peer-group underlay
cumulus@leaf03:~$ nv set vrf default router bgp neighbor swp51 peer-group underlay
cumulus@leaf03:~$ nv set vrf default router bgp neighbor swp52 peer-group underlay
cumulus@leaf03:~$ nv set vrf default router bgp neighbor swp53 peer-group underlay
cumulus@leaf03:~$ nv set vrf default router bgp neighbor swp54 peer-group underlay
cumulus@leaf03:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@leaf03:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@leaf03:~$ nv config apply
cumulus@leaf04:~$ nv set interface lo ip address 10.10.10.4/32
cumulus@leaf04:~$ nv set interface swp1-2,swp49-54
cumulus@leaf04:~$ nv set interface bond1 bond member swp1
cumulus@leaf04:~$ nv set interface bond2 bond member swp2
cumulus@leaf04:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf04:~$ nv set interface bond2 bond mlag id 2
cumulus@leaf04:~$ nv set interface bond1 bond lacp-bypass on
cumulus@leaf04:~$ nv set interface bond2 bond lacp-bypass on
cumulus@leaf04:~$ nv set interface bond1 link mtu 9000
cumulus@leaf04:~$ nv set interface bond2 link mtu 9000
cumulus@leaf04:~$ nv set interface bond1-2 bridge domain br_default
cumulus@leaf04:~$ nv set interface bond1 bridge domain br_default access 10
cumulus@leaf04:~$ nv set interface bond2 bridge domain br_default access 20
cumulus@leaf04:~$ nv set bridge domain br_default vlan 10,20
cumulus@leaf04:~$ nv set interface peerlink bond member swp49-50
cumulus@leaf04:~$ nv set mlag mac-address 44:38:39:FF:00:BB
cumulus@leaf04:~$ nv set mlag backup 10.10.10.3
cumulus@leaf04:~$ nv set mlag peer-ip linklocal
cumulus@leaf04:~$ nv set mlag priority 2000
cumulus@leaf04:~$ nv set mlag init-delay 10
cumulus@leaf04:~$ nv set interface vlan10
cumulus@leaf04:~$ nv set interface vlan20
cumulus@leaf04:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf04:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf04:~$ nv set nve vxlan mlag shared-address 10.0.1.34
cumulus@leaf04:~$ nv set nve vxlan source address 10.10.10.4
cumulus@leaf04:~$ nv set nve vxlan arp-nd-suppress on
cumulus@leaf04:~$ nv set evpn enable on
cumulus@leaf04:~$ nv set router bgp autonomous-system 65104
cumulus@leaf04:~$ nv set router bgp router-id 10.10.10.4
cumulus@leaf04:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@leaf04:~$ nv set vrf default router bgp neighbor peerlink.4094 peer-group underlay
cumulus@leaf04:~$ nv set vrf default router bgp neighbor swp51 peer-group underlay
cumulus@leaf04:~$ nv set vrf default router bgp neighbor swp52 peer-group underlay
cumulus@leaf04:~$ nv set vrf default router bgp neighbor swp53 peer-group underlay
cumulus@leaf04:~$ nv set vrf default router bgp neighbor swp54 peer-group underlay
cumulus@leaf04:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@leaf04:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@leaf04:~$ nv config apply
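The BGP portion of the four leaf configurations above differs only in the AS number and router ID. As a rough sketch, the Python helper below builds the same `nv set` lines for a given leaf. The `leaf_bgp_commands` function and the `UPLINKS` list are hypothetical and not part of NVUE. The helper only builds strings and runs nothing on a switch.

```python
# Hypothetical helper: builds the NVUE BGP lines used in the leaf examples.
UPLINKS = ["peerlink.4094", "swp51", "swp52", "swp53", "swp54"]


def leaf_bgp_commands(asn: int, router_id: str) -> list[str]:
    """Return the nv set commands for the BGP underlay and EVPN on one leaf."""
    bgp = "nv set vrf default router bgp"
    cmds = [
        "nv set evpn enable on",
        f"nv set router bgp autonomous-system {asn}",
        f"nv set router bgp router-id {router_id}",
        f"{bgp} peer-group underlay remote-as external",
    ]
    # One unnumbered neighbor per uplink, all in the underlay peer group.
    cmds += [f"{bgp} neighbor {port} peer-group underlay" for port in UPLINKS]
    cmds += [
        f"{bgp} peer-group underlay address-family l2vpn-evpn enable on",
        f"{bgp} address-family ipv4-unicast redistribute connected",
        "nv config apply",
    ]
    return cmds


for line in leaf_bgp_commands(65104, "10.10.10.4"):
    print(line)
```

Generating the lines from two parameters makes it harder for a single neighbor line to drift from the others when the block is copied between switches.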
cumulus@spine01:~$ nv set interface lo ip address 10.10.10.101/32
cumulus@spine01:~$ nv set interface swp1-6
cumulus@spine01:~$ nv set router bgp autonomous-system 65199
cumulus@spine01:~$ nv set router bgp router-id 10.10.10.101
cumulus@spine01:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@spine01:~$ nv set vrf default router bgp path-selection multipath aspath-ignore on
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp1 peer-group underlay
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp2 peer-group underlay
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp3 peer-group underlay
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp4 peer-group underlay
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp5 peer-group underlay
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp6 peer-group underlay
cumulus@spine01:~$ nv set vrf default router bgp address-family l2vpn-evpn enable on
cumulus@spine01:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@spine01:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@spine01:~$ nv config apply
cumulus@spine02:~$ nv set interface lo ip address 10.10.10.102/32
cumulus@spine02:~$ nv set interface swp1-6
cumulus@spine02:~$ nv set router bgp autonomous-system 65199
cumulus@spine02:~$ nv set router bgp router-id 10.10.10.102
cumulus@spine02:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@spine02:~$ nv set vrf default router bgp path-selection multipath aspath-ignore on
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp1 peer-group underlay
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp2 peer-group underlay
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp3 peer-group underlay
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp4 peer-group underlay
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp5 peer-group underlay
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp6 peer-group underlay
cumulus@spine02:~$ nv set vrf default router bgp address-family l2vpn-evpn enable on
cumulus@spine02:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@spine02:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@spine02:~$ nv config apply
cumulus@spine03:~$ nv set interface lo ip address 10.10.10.103/32
cumulus@spine03:~$ nv set interface swp1-6
cumulus@spine03:~$ nv set router bgp autonomous-system 65199
cumulus@spine03:~$ nv set router bgp router-id 10.10.10.103
cumulus@spine03:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@spine03:~$ nv set vrf default router bgp path-selection multipath aspath-ignore on
cumulus@spine03:~$ nv set vrf default router bgp neighbor swp1 peer-group underlay
cumulus@spine03:~$ nv set vrf default router bgp neighbor swp2 peer-group underlay
cumulus@spine03:~$ nv set vrf default router bgp neighbor swp3 peer-group underlay
cumulus@spine03:~$ nv set vrf default router bgp neighbor swp4 peer-group underlay
cumulus@spine03:~$ nv set vrf default router bgp neighbor swp5 peer-group underlay
cumulus@spine03:~$ nv set vrf default router bgp neighbor swp6 peer-group underlay
cumulus@spine03:~$ nv set vrf default router bgp address-family l2vpn-evpn enable on
cumulus@spine03:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@spine03:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@spine03:~$ nv config apply
cumulus@spine04:~$ nv set interface lo ip address 10.10.10.104/32
cumulus@spine04:~$ nv set interface swp1-6
cumulus@spine04:~$ nv set router bgp autonomous-system 65199
cumulus@spine04:~$ nv set router bgp router-id 10.10.10.104
cumulus@spine04:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@spine04:~$ nv set vrf default router bgp path-selection multipath aspath-ignore on
cumulus@spine04:~$ nv set vrf default router bgp neighbor swp1 peer-group underlay
cumulus@spine04:~$ nv set vrf default router bgp neighbor swp2 peer-group underlay
cumulus@spine04:~$ nv set vrf default router bgp neighbor swp3 peer-group underlay
cumulus@spine04:~$ nv set vrf default router bgp neighbor swp4 peer-group underlay
cumulus@spine04:~$ nv set vrf default router bgp neighbor swp5 peer-group underlay
cumulus@spine04:~$ nv set vrf default router bgp neighbor swp6 peer-group underlay
cumulus@spine04:~$ nv set vrf default router bgp address-family l2vpn-evpn enable on
cumulus@spine04:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@spine04:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@spine04:~$ nv config apply
cumulus@border01:~$ nv set interface lo ip address 10.10.10.63/32
cumulus@border01:~$ nv set interface swp1-3,swp49-54
cumulus@border01:~$ nv set interface bond3 bond member swp3
cumulus@border01:~$ nv set interface bond3 bond mlag id 1
cumulus@border01:~$ nv set interface bond3 bond lacp-bypass on
cumulus@border01:~$ nv set interface bond3 link mtu 9000
cumulus@border01:~$ nv set interface bond3 bridge domain br_default
cumulus@border01:~$ nv set interface peerlink bond member swp49-50
cumulus@border01:~$ nv set mlag mac-address 44:38:39:FF:00:FF
cumulus@border01:~$ nv set mlag backup 10.10.10.64
cumulus@border01:~$ nv set mlag peer-ip linklocal
cumulus@border01:~$ nv set mlag priority 1000
cumulus@border01:~$ nv set mlag init-delay 10
cumulus@border01:~$ nv set interface vlan10 ip address 10.1.10.2/24
cumulus@border01:~$ nv set interface vlan10 ip vrr address 10.1.10.1/24
cumulus@border01:~$ nv set interface vlan10 ip vrr mac-address 00:00:00:00:00:10
cumulus@border01:~$ nv set interface vlan10 ip vrr state up
cumulus@border01:~$ nv set interface vlan20 ip address 10.1.20.2/24
cumulus@border01:~$ nv set interface vlan20 ip vrr address 10.1.20.1/24
cumulus@border01:~$ nv set interface vlan20 ip vrr mac-address 00:00:00:00:00:20
cumulus@border01:~$ nv set interface vlan20 ip vrr state up
cumulus@border01:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@border01:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@border01:~$ nv set interface bond3 bridge domain br_default vlan 10,20
cumulus@border01:~$ nv set nve vxlan mlag shared-address 10.0.1.255
cumulus@border01:~$ nv set nve vxlan source address 10.10.10.63
cumulus@border01:~$ nv set nve vxlan arp-nd-suppress on
cumulus@border01:~$ nv set evpn enable on
cumulus@border01:~$ nv set router bgp autonomous-system 65253
cumulus@border01:~$ nv set router bgp router-id 10.10.10.63
cumulus@border01:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@border01:~$ nv set vrf default router bgp neighbor peerlink.4094 peer-group underlay
cumulus@border01:~$ nv set vrf default router bgp neighbor swp51 peer-group underlay
cumulus@border01:~$ nv set vrf default router bgp neighbor swp52 peer-group underlay
cumulus@border01:~$ nv set vrf default router bgp neighbor swp53 peer-group underlay
cumulus@border01:~$ nv set vrf default router bgp neighbor swp54 peer-group underlay
cumulus@border01:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@border01:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@border01:~$ nv set evpn route-advertise default-gateway on
cumulus@border01:~$ nv config apply
cumulus@border02:~$ nv set interface lo ip address 10.10.10.64/32
cumulus@border02:~$ nv set interface swp1-3,swp49-54
cumulus@border02:~$ nv set interface bond3 bond member swp3
cumulus@border02:~$ nv set interface bond3 bond mlag id 1
cumulus@border02:~$ nv set interface bond3 bond lacp-bypass on
cumulus@border02:~$ nv set interface bond3 link mtu 9000
cumulus@border02:~$ nv set interface bond3 bridge domain br_default
cumulus@border02:~$ nv set interface peerlink bond member swp49-50
cumulus@border02:~$ nv set mlag mac-address 44:38:39:FF:00:FF
cumulus@border02:~$ nv set mlag backup 10.10.10.63
cumulus@border02:~$ nv set mlag peer-ip linklocal
cumulus@border02:~$ nv set mlag priority 2000
cumulus@border02:~$ nv set mlag init-delay 10
cumulus@border02:~$ nv set interface vlan10 ip address 10.1.10.3/24
cumulus@border02:~$ nv set interface vlan10 ip vrr address 10.1.10.1/24
cumulus@border02:~$ nv set interface vlan10 ip vrr mac-address 00:00:00:00:00:10
cumulus@border02:~$ nv set interface vlan10 ip vrr state up
cumulus@border02:~$ nv set interface vlan20 ip address 10.1.20.3/24
cumulus@border02:~$ nv set interface vlan20 ip vrr address 10.1.20.1/24
cumulus@border02:~$ nv set interface vlan20 ip vrr mac-address 00:00:00:00:00:20
cumulus@border02:~$ nv set interface vlan20 ip vrr state up
cumulus@border02:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@border02:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@border02:~$ nv set interface bond3 bridge domain br_default vlan 10,20
cumulus@border02:~$ nv set nve vxlan mlag shared-address 10.0.1.255
cumulus@border02:~$ nv set nve vxlan source address 10.10.10.64
cumulus@border02:~$ nv set nve vxlan arp-nd-suppress on
cumulus@border02:~$ nv set evpn enable on
cumulus@border02:~$ nv set router bgp autonomous-system 65254
cumulus@border02:~$ nv set router bgp router-id 10.10.10.64
cumulus@border02:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@border02:~$ nv set vrf default router bgp neighbor peerlink.4094 peer-group underlay
cumulus@border02:~$ nv set vrf default router bgp neighbor swp51 peer-group underlay
cumulus@border02:~$ nv set vrf default router bgp neighbor swp52 peer-group underlay
cumulus@border02:~$ nv set vrf default router bgp neighbor swp53 peer-group underlay
cumulus@border02:~$ nv set vrf default router bgp neighbor swp54 peer-group underlay
cumulus@border02:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@border02:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@border02:~$ nv set evpn route-advertise default-gateway on
cumulus@border02:~$ nv config apply
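On the border leafs, each SVI carries its own address in the VLAN subnet, and the VRR address that servers use as their default gateway is expected to sit in the same subnet without duplicating the SVI address. The Python sketch below checks one address pair with the standard `ipaddress` module. The `check_svi` function is hypothetical, written for illustration, and is not a Cumulus Linux tool.

```python
import ipaddress


def check_svi(svi: str, vrr: str) -> list[str]:
    """Return a list of problems for one SVI/VRR address pair (empty if none)."""
    svi_if = ipaddress.ip_interface(svi)
    vrr_if = ipaddress.ip_interface(vrr)
    problems = []
    if svi_if.network != vrr_if.network:
        problems.append(f"{vrr} is not in the SVI subnet {svi_if.network}")
    if svi_if.ip == vrr_if.ip:
        problems.append(f"{vrr} duplicates the SVI address")
    return problems


# vlan10 values from the border01 example above.
print(check_svi("10.1.10.2/24", "10.1.10.1/24"))
# []

# Hypothetical mistake: VRR address taken from another VLAN's subnet.
print(check_svi("10.1.10.2/24", "10.1.20.1/24"))
# ['10.1.20.1/24 is not in the SVI subnet 10.1.10.0/24']
```

Running a check like this over every `ip address` and `ip vrr address` pair before `nv config apply` can catch copy-and-paste slips between VLANs.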
cumulus@leaf01:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    interface:
      lo:
        ip:
          address:
            10.10.10.1/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      swp53:
        type: swp
      swp54:
        type: swp
      bond1:
        bond:
          member:
            swp1: {}
          mlag:
            id: 1
          lacp-bypass: on
        type: bond
        link:
          mtu: 9000
        bridge:
          domain:
            br_default:
              access: 10
      bond2:
        bond:
          member:
            swp2: {}
          mlag:
            id: 2
          lacp-bypass: on
        type: bond
        link:
          mtu: 9000
        bridge:
          domain:
            br_default:
              access: 20
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        type: sub
        base-interface: peerlink
        vlan: 4094
      vlan10:
        type: svi
        vlan: 10
      vlan20:
        type: svi
        vlan: 20
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
    mlag:
      mac-address: 44:38:39:FF:00:AA
      backup:
        10.10.10.2: {}
      peer-ip: linklocal
      priority: 1000
      init-delay: 10
    nve:
      vxlan:
        enable: on
        mlag:
          shared-address: 10.0.1.12
        source:
          address: 10.10.10.1
        arp-nd-suppress: on
    evpn:
      enable: on
    router:
      bgp:
        enable: on
        autonomous-system: 65101
        router-id: 10.10.10.1
    vrf:
      default:
        router:
          bgp:
            peer-group:
              underlay:
                remote-as: external
                address-family:
                  l2vpn-evpn:
                    enable: on
            enable: on
            peer:
              swp51:
                peer-group: underlay
                type: unnumbered
              swp52:
                peer-group: underlay
                type: unnumbered
              swp53:
                peer-group: underlay
                type: unnumbered
              swp54:
                peer-group: underlay
                type: unnumbered
              peerlink.4094:
                peer-group: underlay
                type: unnumbered
            address-family:
              ipv4-unicast:
                redistribute:
                  connected:
                    enable: on
                enable: on
cumulus@leaf02:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml 
- set:
    interface:
      lo:
        ip:
          address:
            10.10.10.2/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      swp53:
        type: swp
      swp54:
        type: swp
      bond1:
        bond:
          member:
            swp1: {}
          mlag:
            id: 1
          lacp-bypass: on
        type: bond
        link:
          mtu: 9000
        bridge:
          domain:
            br_default:
              access: 10
      bond2:
        bond:
          member:
            swp2: {}
          mlag:
            id: 2
          lacp-bypass: on
        type: bond
        link:
          mtu: 9000
        bridge:
          domain:
            br_default:
              access: 20
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        type: sub
        base-interface: peerlink
        vlan: 4094
      vlan10:
        type: svi
        vlan: 10
      vlan20:
        type: svi
        vlan: 20
    mlag:
      mac-address: 44:38:39:FF:00:AA
      backup:
        10.10.10.1: {}
      peer-ip: linklocal
      priority: 2000
      init-delay: 10
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
    nve:
      vxlan:
        enable: on
        mlag:
          shared-address: 10.0.1.12
        source:
          address: 10.10.10.2
        arp-nd-suppress: on
    evpn:
      enable: on
    router:
      bgp:
        enable: on
        autonomous-system: 65102
        router-id: 10.10.10.2
    vrf:
      default:
        router:
          bgp:
            peer-group:
              underlay:
                remote-as: external
                address-family:
                  l2vpn-evpn:
                    enable: on
            enable: on
            peer:
              swp51:
                peer-group: underlay
                type: unnumbered
              swp52:
                peer-group: underlay
                type: unnumbered
              swp53:
                peer-group: underlay
                type: unnumbered
              swp54:
                peer-group: underlay
                type: unnumbered
              peerlink.4094:
                peer-group: underlay
                type: unnumbered
            address-family:
              ipv4-unicast:
                redistribute:
                  connected:
                    enable: on
                enable: on
cumulus@leaf03:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    interface:
      lo:
        ip:
          address:
            10.10.10.3/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      swp53:
        type: swp
      swp54:
        type: swp
      bond1:
        bond:
          member:
            swp1: {}
          mlag:
            id: 1
          lacp-bypass: on
        type: bond
        link:
          mtu: 9000
        bridge:
          domain:
            br_default:
              access: 10
      bond2:
        bond:
          member:
            swp2: {}
          mlag:
            id: 2
          lacp-bypass: on
        type: bond
        link:
          mtu: 9000
        bridge:
          domain:
            br_default:
              access: 20
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        type: sub
        base-interface: peerlink
        vlan: 4094
      vlan10:
        type: svi
        vlan: 10
      vlan20:
        type: svi
        vlan: 20
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
    mlag:
      mac-address: 44:38:39:FF:00:BB
      backup:
        10.10.10.4: {}
      peer-ip: linklocal
      priority: 1000
      init-delay: 10
    nve:
      vxlan:
        enable: on
        mlag:
          shared-address: 10.0.1.34
        source:
          address: 10.10.10.3
        arp-nd-suppress: on
    evpn:
      enable: on
    router:
      bgp:
        enable: on
        autonomous-system: 65103
        router-id: 10.10.10.3
    vrf:
      default:
        router:
          bgp:
            peer-group:
              underlay:
                remote-as: external
                address-family:
                  l2vpn-evpn:
                    enable: on
            enable: on
            peer:
              swp51:
                peer-group: underlay
                type: unnumbered
              swp52:
                peer-group: underlay
                type: unnumbered
              swp53:
                peer-group: underlay
                type: unnumbered
              swp54:
                peer-group: underlay
                type: unnumbered
              peerlink.4094:
                peer-group: underlay
                type: unnumbered
            address-family:
              ipv4-unicast:
                redistribute:
                  connected:
                    enable: on
                enable: on
cumulus@leaf04:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    interface:
      lo:
        ip:
          address:
            10.10.10.4/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      swp53:
        type: swp
      swp54:
        type: swp
      bond1:
        bond:
          member:
            swp1: {}
          mlag:
            id: 1
          lacp-bypass: on
        type: bond
        link:
          mtu: 9000
        bridge:
          domain:
            br_default:
              access: 10
      bond2:
        bond:
          member:
            swp2: {}
          mlag:
            id: 2
          lacp-bypass: on
        type: bond
        link:
          mtu: 9000
        bridge:
          domain:
            br_default:
              access: 20
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        type: sub
        base-interface: peerlink
        vlan: 4094
      vlan10:
        type: svi
        vlan: 10
      vlan20:
        type: svi
        vlan: 20
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
    mlag:
      mac-address: 44:38:39:FF:00:BB
      backup:
        10.10.10.3: {}
      peer-ip: linklocal
      priority: 2000
      init-delay: 10
    nve:
      vxlan:
        enable: on
        mlag:
          shared-address: 10.0.1.34
        source:
          address: 10.10.10.4
        arp-nd-suppress: on
    evpn:
      enable: on
    router:
      bgp:
        enable: on
        autonomous-system: 65104
        router-id: 10.10.10.4
    vrf:
      default:
        router:
          bgp:
            peer-group:
              underlay:
                remote-as: external
                address-family:
                  l2vpn-evpn:
                    enable: on
            enable: on
            peer:
              swp51:
                peer-group: underlay
                type: unnumbered
              swp52:
                peer-group: underlay
                type: unnumbered
              swp53:
                peer-group: underlay
                type: unnumbered
              swp54:
                peer-group: underlay
                type: unnumbered
              peerlink.4094:
                peer-group: underlay
                type: unnumbered
            address-family:
              ipv4-unicast:
                redistribute:
                  connected:
                    enable: on
                enable: on
cumulus@spine01:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    interface:
      lo:
        ip:
          address:
            10.10.10.101/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp4:
        type: swp
      swp5:
        type: swp
      swp6:
        type: swp
    router:
      bgp:
        autonomous-system: 65199
        enable: on
        router-id: 10.10.10.101
    vrf:
      default:
        router:
          bgp:
            peer-group:
              underlay:
                remote-as: external
                address-family:
                  l2vpn-evpn:
                    enable: on
            enable: on
            peer:
              swp1:
                peer-group: underlay
                type: unnumbered
              swp2:
                peer-group: underlay
                type: unnumbered
              swp3:
                peer-group: underlay
                type: unnumbered
              swp4:
                peer-group: underlay
                type: unnumbered
              swp5:
                peer-group: underlay
                type: unnumbered
              swp6:
                peer-group: underlay
                type: unnumbered
            address-family:
              l2vpn-evpn:
                enable: on
              ipv4-unicast:
                redistribute:
                  connected:
                    enable: on
                enable: on
cumulus@spine02:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    interface:
      lo:
        ip:
          address:
            10.10.10.102/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp4:
        type: swp
      swp5:
        type: swp
      swp6:
        type: swp
    router:
      bgp:
        autonomous-system: 65199
        enable: on
        router-id: 10.10.10.102
    vrf:
      default:
        router:
          bgp:
            peer-group:
              underlay:
                remote-as: external
                address-family:
                  l2vpn-evpn:
                    enable: on
            enable: on
            peer:
              swp1:
                peer-group: underlay
                type: unnumbered
              swp2:
                peer-group: underlay
                type: unnumbered
              swp3:
                peer-group: underlay
                type: unnumbered
              swp4:
                peer-group: underlay
                type: unnumbered
              swp5:
                peer-group: underlay
                type: unnumbered
              swp6:
                peer-group: underlay
                type: unnumbered
            address-family:
              l2vpn-evpn:
                enable: on
              ipv4-unicast:
                redistribute:
                  connected:
                    enable: on
                enable: on
cumulus@spine03:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    interface:
      lo:
        ip:
          address:
            10.10.10.103/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp4:
        type: swp
      swp5:
        type: swp
      swp6:
        type: swp
    router:
      bgp:
        autonomous-system: 65199
        enable: on
        router-id: 10.10.10.103
    vrf:
      default:
        router:
          bgp:
            peer-group:
              underlay:
                remote-as: external
                address-family:
                  l2vpn-evpn:
                    enable: on
            enable: on
            peer:
              swp1:
                peer-group: underlay
                type: unnumbered
              swp2:
                peer-group: underlay
                type: unnumbered
              swp3:
                peer-group: underlay
                type: unnumbered
              swp4:
                peer-group: underlay
                type: unnumbered
              swp5:
                peer-group: underlay
                type: unnumbered
              swp6:
                peer-group: underlay
                type: unnumbered
            address-family:
              l2vpn-evpn:
                enable: on
              ipv4-unicast:
                redistribute:
                  connected:
                    enable: on
                enable: on
cumulus@spine04:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    interface:
      lo:
        ip:
          address:
            10.10.10.104/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp4:
        type: swp
      swp5:
        type: swp
      swp6:
        type: swp
    router:
      bgp:
        autonomous-system: 65199
        enable: on
        router-id: 10.10.10.104
    vrf:
      default:
        router:
          bgp:
            peer-group:
              underlay:
                remote-as: external
                address-family:
                  l2vpn-evpn:
                    enable: on
            enable: on
            peer:
              swp1:
                peer-group: underlay
                type: unnumbered
              swp2:
                peer-group: underlay
                type: unnumbered
              swp3:
                peer-group: underlay
                type: unnumbered
              swp4:
                peer-group: underlay
                type: unnumbered
              swp5:
                peer-group: underlay
                type: unnumbered
              swp6:
                peer-group: underlay
                type: unnumbered
            address-family:
              l2vpn-evpn:
                enable: on
              ipv4-unicast:
                redistribute:
                  connected:
                    enable: on
                enable: on
cumulus@border01:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    interface:
      lo:
        ip:
          address:
            10.10.10.63/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      swp53:
        type: swp
      swp54:
        type: swp
      bond3:
        bond:
          member:
            swp3: {}
          mlag:
            id: 1
          lacp-bypass: on
        type: bond
        link:
          mtu: 9000
        bridge:
          domain:
            br_default:
              vlan:
                '10': {}
                '20': {}
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        type: sub
        base-interface: peerlink
        vlan: 4094
      vlan10:
        ip:
          address:
            10.1.10.2/24: {}
          vrr:
            address:
              10.1.10.1/24: {}
            mac-address: 00:00:00:00:00:10
            state:
              up: {}
        type: svi
        vlan: 10
      vlan20:
        ip:
          address:
            10.1.20.2/24: {}
          vrr:
            address:
              10.1.20.1/24: {}
            mac-address: 00:00:00:00:00:20
            state:
              up: {}
        type: svi
        vlan: 20
    mlag:
      mac-address: 44:38:39:FF:00:FF
      backup:
        10.10.10.64: {}
      peer-ip: linklocal
      priority: 1000
      init-delay: 10
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
    nve:
      vxlan:
        enable: on
        mlag:
          shared-address: 10.0.1.255
        source:
          address: 10.10.10.63
        arp-nd-suppress: on
    evpn:
      enable: on
      route-advertise:
        default-gateway: on
    router:
      bgp:
        enable: on
        autonomous-system: 65253
        router-id: 10.10.10.63
    vrf:
      default:
        router:
          bgp:
            peer-group:
              underlay:
                remote-as: external
                address-family:
                  l2vpn-evpn:
                    enable: on
            enable: on
            peer:
              swp51:
                peer-group: underlay
                type: unnumbered
              swp52:
                peer-group: underlay
                type: unnumbered
              swp53:
                peer-group: underlay
                type: unnumbered
              swp54:
                peer-group: underlay
                type: unnumbered
              peerlink.4094:
                peer-group: underlay
                type: unnumbered
            address-family:
              ipv4-unicast:
                redistribute:
                  connected:
                    enable: on
                enable: on
cumulus@border02:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    interface:
      lo:
        ip:
          address:
            10.10.10.64/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      swp53:
        type: swp
      swp54:
        type: swp
      bond3:
        bond:
          member:
            swp3: {}
          mlag:
            id: 1
          lacp-bypass: on
        type: bond
        link:
          mtu: 9000
        bridge:
          domain:
            br_default:
              vlan:
                '10': {}
                '20': {}
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        type: sub
        base-interface: peerlink
        vlan: 4094
      vlan10:
        ip:
          address:
            10.1.10.3/24: {}
          vrr:
            address:
              10.1.10.1/24: {}
            mac-address: 00:00:00:00:00:10
            state:
              up: {}
        type: svi
        vlan: 10
      vlan20:
        ip:
          address:
            10.1.20.3/24: {}
          vrr:
            address:
              10.1.20.1/24: {}
            mac-address: 00:00:00:00:00:20
            state:
              up: {}
        type: svi
        vlan: 20
    mlag:
      mac-address: 44:38:39:FF:00:FF
      backup:
        10.10.10.63: {}
      peer-ip: linklocal
      priority: 2000
      init-delay: 10
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
    nve:
      vxlan:
        enable: on
        mlag:
          shared-address: 10.0.1.255
        source:
          address: 10.10.10.64
        arp-nd-suppress: on
    evpn:
      enable: on
      route-advertise:
        default-gateway: on
    router:
      bgp:
        enable: on
        autonomous-system: 65254
        router-id: 10.10.10.64
    vrf:
      default:
        router:
          bgp:
            peer-group:
              underlay:
                remote-as: external
                address-family:
                  l2vpn-evpn:
                    enable: on
            enable: on
            peer:
              swp51:
                peer-group: underlay
                type: unnumbered
              swp52:
                peer-group: underlay
                type: unnumbered
              swp53:
                peer-group: underlay
                type: unnumbered
              swp54:
                peer-group: underlay
                type: unnumbered
              peerlink.4094:
                peer-group: underlay
                type: unnumbered
            address-family:
              ipv4-unicast:
                redistribute:
                  connected:
                    enable: on
                enable: on
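The MLAG settings in the border01 and border02 files above mirror each other: both peers use the same `mac-address` and the same VXLAN `shared-address`, and in this example each peer's `backup` address is the other peer's loopback. The following Python sketch checks those three relationships. The dictionary layout and the `check_mlag_pair` helper are hypothetical names used for illustration only and are not part of NVUE; the values are copied by hand from the two files.

```python
# Hypothetical helper (not an NVUE API): checks that two MLAG peers mirror each other.
# Values are transcribed by hand from the border01 and border02 startup.yaml files.

border01 = {
    "loopback": "10.10.10.63",
    "mlag_mac": "44:38:39:FF:00:FF",
    "mlag_backup": "10.10.10.64",
    "vxlan_shared_address": "10.0.1.255",
}

border02 = {
    "loopback": "10.10.10.64",
    "mlag_mac": "44:38:39:FF:00:FF",
    "mlag_backup": "10.10.10.63",
    "vxlan_shared_address": "10.0.1.255",
}


def check_mlag_pair(a, b):
    """Return a list of human-readable problems; an empty list means the pair is consistent."""
    problems = []
    # Both peers must present the same MLAG system MAC address.
    if a["mlag_mac"].lower() != b["mlag_mac"].lower():
        problems.append("mlag mac-address differs between peers")
    # Both peers must share the same VXLAN anycast (shared) address.
    if a["vxlan_shared_address"] != b["vxlan_shared_address"]:
        problems.append("nve vxlan mlag shared-address differs between peers")
    # In this example, each backup address is the other peer's loopback.
    if a["mlag_backup"] != b["loopback"]:
        problems.append("first peer's backup address is not the second peer's loopback")
    if b["mlag_backup"] != a["loopback"]:
        problems.append("second peer's backup address is not the first peer's loopback")
    return problems


if __name__ == "__main__":
    for line in check_mlag_pair(border01, border02) or ["no problems found"]:
        print(line)
```

The same check applies to any other MLAG pair in the topology once its values are filled in.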
...
auto lo
iface lo inet loopback
    address 10.10.10.1/32
    clagd-vxlan-anycast-ip 10.0.1.12
    vxlan-local-tunnelip 10.10.10.1
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto swp53
iface swp53
auto swp54
iface swp54
auto bond1
iface bond1
    mtu 9000
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 1
    bridge-access 10
auto bond2
iface bond2
    mtu 9000
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 2
    bridge-access 20
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-priority 1000
    clagd-backup-ip 10.10.10.2
    clagd-sys-mac 44:38:39:FF:00:AA
    clagd-args --initDelay 10
auto vlan10
iface vlan10
    hwaddress 44:38:39:22:01:af
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    hwaddress 44:38:39:22:01:af
    vlan-raw-device br_default
    vlan-id 20
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20
    bridge-vids 10 20
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond1 bond2 peerlink vxlan48
    hwaddress 44:38:39:22:01:af
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1
...
auto lo
iface lo inet loopback
    address 10.10.10.2/32
    clagd-vxlan-anycast-ip 10.0.1.12
    vxlan-local-tunnelip 10.10.10.2
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto swp53
iface swp53
auto swp54
iface swp54
auto bond1
iface bond1
    mtu 9000
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 1
    bridge-access 10
auto bond2
iface bond2
    mtu 9000
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 2
    bridge-access 20
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-priority 2000
    clagd-backup-ip 10.10.10.1
    clagd-sys-mac 44:38:39:FF:00:AA
    clagd-args --initDelay 10
auto vlan10
iface vlan10
    hwaddress 44:38:39:22:01:af
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    hwaddress 44:38:39:22:01:af
    vlan-raw-device br_default
    vlan-id 20
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20
    bridge-vids 10 20
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond1 bond2 peerlink vxlan48
    hwaddress 44:38:39:22:01:af
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1
...
auto lo
iface lo inet loopback
    address 10.10.10.3/32
    clagd-vxlan-anycast-ip 10.0.1.34
    vxlan-local-tunnelip 10.10.10.3
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto swp53
iface swp53
auto swp54
iface swp54
auto bond1
iface bond1
    mtu 9000
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 1
    bridge-access 10
auto bond2
iface bond2
    mtu 9000
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 2
    bridge-access 20
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-priority 1000
    clagd-backup-ip 10.10.10.4
    clagd-sys-mac 44:38:39:FF:00:BB
    clagd-args --initDelay 10
auto vlan10
iface vlan10
    hwaddress 44:38:39:22:01:af
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    hwaddress 44:38:39:22:01:af
    vlan-raw-device br_default
    vlan-id 20
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20
    bridge-vids 10 20
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond1 bond2 peerlink vxlan48
    hwaddress 44:38:39:22:01:af
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1
...
auto lo
iface lo inet loopback
    address 10.10.10.4/32
    clagd-vxlan-anycast-ip 10.0.1.34
    vxlan-local-tunnelip 10.10.10.4
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto swp53
iface swp53
auto swp54
iface swp54
auto bond1
iface bond1
    mtu 9000
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 1
    bridge-access 10
auto bond2
iface bond2
    mtu 9000
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 2
    bridge-access 20
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-priority 2000
    clagd-backup-ip 10.10.10.3
    clagd-sys-mac 44:38:39:FF:00:BB
    clagd-args --initDelay 10
auto vlan10
iface vlan10
    hwaddress 44:38:39:22:01:af
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    hwaddress 44:38:39:22:01:af
    vlan-raw-device br_default
    vlan-id 20
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20
    bridge-vids 10 20
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond1 bond2 peerlink vxlan48
    hwaddress 44:38:39:22:01:af
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1
...
auto lo
iface lo inet loopback
    address 10.10.10.101/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp4
iface swp4
auto swp5
iface swp5
auto swp6
iface swp6
...
auto lo
iface lo inet loopback
    address 10.10.10.102/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp4
iface swp4
auto swp5
iface swp5
auto swp6
iface swp6

...
auto lo
iface lo inet loopback
    address 10.10.10.103/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp4
iface swp4
auto swp5
iface swp5
auto swp6
iface swp6
...
auto lo
iface lo inet loopback
    address 10.10.10.104/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp4
iface swp4
auto swp5
iface swp5
auto swp6
iface swp6
...
auto lo
iface lo inet loopback
    address 10.10.10.63/32
    clagd-vxlan-anycast-ip 10.0.1.255
    vxlan-local-tunnelip 10.10.10.63
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto swp53
iface swp53
auto swp54
iface swp54
auto bond3
iface bond3
    mtu 9000
    bond-slaves swp3
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 1
    bridge-vids 10 20
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-priority 1000
    clagd-backup-ip 10.10.10.64
    clagd-sys-mac 44:38:39:FF:00:FF
    clagd-args --initDelay 10
auto vlan10
iface vlan10
    address 10.1.10.2/24
    address-virtual 00:00:5E:00:01:01 10.1.10.1/24
    hwaddress 44:38:39:22:01:ab
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    address 10.1.20.2/24
    address-virtual 00:00:5E:00:01:01 10.1.20.1/24
    hwaddress 44:38:39:22:01:ab
    vlan-raw-device br_default
    vlan-id 20
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20
    bridge-vids 10 20
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond3 peerlink vxlan48
    hwaddress 44:38:39:22:01:ab
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1
...
auto lo
iface lo inet loopback
    address 10.10.10.64/32
    clagd-vxlan-anycast-ip 10.0.1.255
    vxlan-local-tunnelip 10.10.10.64
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto swp53
iface swp53
auto swp54
iface swp54
auto bond3
iface bond3
    mtu 9000
    bond-slaves swp3
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 1
    bridge-vids 10 20
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-priority 2000
    clagd-backup-ip 10.10.10.63
    clagd-sys-mac 44:38:39:FF:00:FF
    clagd-args --initDelay 10
auto vlan10
iface vlan10
    address 10.1.10.3/24
    address-virtual 00:00:5E:00:01:01 10.1.10.1/24
    hwaddress 44:38:39:22:01:b3
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    address 10.1.20.3/24
    address-virtual 00:00:5E:00:01:01 10.1.20.1/24
    hwaddress 44:38:39:22:01:b3
    vlan-raw-device br_default
    vlan-id 20
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20
    bridge-vids 10 20
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond3 peerlink vxlan48
    hwaddress 44:38:39:22:01:b3
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1
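The `auto`/`iface` listings above all follow the same stanza layout: an `auto` line, an `iface` line that names the interface, then indented attribute lines. To compare these listings across switches, a small parser can turn each one into a dictionary. The following is a minimal Python sketch, not an ifupdown2 API; `parse_interfaces` and `SAMPLE` are hypothetical names, and the parser handles only the keywords that appear in the listings above.

```python
# Hypothetical helper: parse ifupdown2-style stanzas into nested dictionaries.
# Only handles the "auto" / "iface" / attribute lines used in the listings above.

def parse_interfaces(text):
    """Return {interface name: {attribute: [values, ...]}}."""
    stanzas = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and the "..." elision markers.
        if not line or line.startswith("#") or line == "...":
            continue
        words = line.split()
        if words[0] == "auto":
            continue
        if words[0] == "iface":
            # "iface eth0 inet dhcp" -> the interface name is the second word.
            current = stanzas.setdefault(words[1], {})
            continue
        if current is not None:
            # An attribute can repeat (for example, two "address" lines), so keep a list.
            current.setdefault(words[0], []).append(" ".join(words[1:]))
    return stanzas


SAMPLE = """\
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-priority 2000
    clagd-backup-ip 10.10.10.63
    clagd-sys-mac 44:38:39:FF:00:FF
    clagd-args --initDelay 10
"""

if __name__ == "__main__":
    parsed = parse_interfaces(SAMPLE)
    print(parsed["mgmt"]["address"])
    print(parsed["peerlink.4094"]["clagd-backup-ip"])
```

Each attribute maps to a list because a stanza can repeat a keyword, as the `mgmt` stanza does with `address`.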
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65101 vrf default
bgp router-id 10.10.10.1
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor peerlink.4094 interface remote-as external
neighbor peerlink.4094 interface peer-group underlay
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 capability extended-nexthop
neighbor swp51 interface remote-as external
neighbor swp51 interface peer-group underlay
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface remote-as external
neighbor swp52 interface peer-group underlay
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
neighbor swp53 interface remote-as external
neighbor swp53 interface peer-group underlay
neighbor swp53 timers 3 9
neighbor swp53 timers connect 10
neighbor swp53 advertisement-interval 0
neighbor swp53 capability extended-nexthop
neighbor swp54 interface remote-as external
neighbor swp54 interface peer-group underlay
neighbor swp54 timers 3 9
neighbor swp54 timers connect 10
neighbor swp54 advertisement-interval 0
neighbor swp54 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65102 vrf default
bgp router-id 10.10.10.2
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor peerlink.4094 interface remote-as external
neighbor peerlink.4094 interface peer-group underlay
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 capability extended-nexthop
neighbor swp51 interface remote-as external
neighbor swp51 interface peer-group underlay
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface remote-as external
neighbor swp52 interface peer-group underlay
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
neighbor swp53 interface remote-as external
neighbor swp53 interface peer-group underlay
neighbor swp53 timers 3 9
neighbor swp53 timers connect 10
neighbor swp53 advertisement-interval 0
neighbor swp53 capability extended-nexthop
neighbor swp54 interface remote-as external
neighbor swp54 interface peer-group underlay
neighbor swp54 timers 3 9
neighbor swp54 timers connect 10
neighbor swp54 advertisement-interval 0
neighbor swp54 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65103 vrf default
bgp router-id 10.10.10.3
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor peerlink.4094 interface remote-as external
neighbor peerlink.4094 interface peer-group underlay
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 capability extended-nexthop
neighbor swp51 interface remote-as external
neighbor swp51 interface peer-group underlay
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface remote-as external
neighbor swp52 interface peer-group underlay
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
neighbor swp53 interface remote-as external
neighbor swp53 interface peer-group underlay
neighbor swp53 timers 3 9
neighbor swp53 timers connect 10
neighbor swp53 advertisement-interval 0
neighbor swp53 capability extended-nexthop
neighbor swp54 interface remote-as external
neighbor swp54 interface peer-group underlay
neighbor swp54 timers 3 9
neighbor swp54 timers connect 10
neighbor swp54 advertisement-interval 0
neighbor swp54 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65104 vrf default
bgp router-id 10.10.10.4
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor peerlink.4094 interface remote-as external
neighbor peerlink.4094 interface peer-group underlay
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 capability extended-nexthop
neighbor swp51 interface remote-as external
neighbor swp51 interface peer-group underlay
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface remote-as external
neighbor swp52 interface peer-group underlay
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
neighbor swp53 interface remote-as external
neighbor swp53 interface peer-group underlay
neighbor swp53 timers 3 9
neighbor swp53 timers connect 10
neighbor swp53 advertisement-interval 0
neighbor swp53 capability extended-nexthop
neighbor swp54 interface remote-as external
neighbor swp54 interface peer-group underlay
neighbor swp54 timers 3 9
neighbor swp54 timers connect 10
neighbor swp54 advertisement-interval 0
neighbor swp54 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65199 vrf default
bgp router-id 10.10.10.101
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor swp1 interface remote-as external
neighbor swp1 interface peer-group underlay
neighbor swp1 timers 3 9
neighbor swp1 timers connect 10
neighbor swp1 advertisement-interval 0
neighbor swp1 capability extended-nexthop
neighbor swp2 interface remote-as external
neighbor swp2 interface peer-group underlay
neighbor swp2 timers 3 9
neighbor swp2 timers connect 10
neighbor swp2 advertisement-interval 0
neighbor swp2 capability extended-nexthop
neighbor swp3 interface remote-as external
neighbor swp3 interface peer-group underlay
neighbor swp3 timers 3 9
neighbor swp3 timers connect 10
neighbor swp3 advertisement-interval 0
neighbor swp3 capability extended-nexthop
neighbor swp4 interface remote-as external
neighbor swp4 interface peer-group underlay
neighbor swp4 timers 3 9
neighbor swp4 timers connect 10
neighbor swp4 advertisement-interval 0
neighbor swp4 capability extended-nexthop
neighbor swp5 interface remote-as external
neighbor swp5 interface peer-group underlay
neighbor swp5 timers 3 9
neighbor swp5 timers connect 10
neighbor swp5 advertisement-interval 0
neighbor swp5 capability extended-nexthop
neighbor swp6 interface remote-as external
neighbor swp6 interface peer-group underlay
neighbor swp6 timers 3 9
neighbor swp6 timers connect 10
neighbor swp6 advertisement-interval 0
neighbor swp6 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor swp5 activate
neighbor swp6 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor swp5 activate
neighbor swp6 activate
neighbor underlay activate
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65199 vrf default
bgp router-id 10.10.10.102
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor swp1 interface remote-as external
neighbor swp1 interface peer-group underlay
neighbor swp1 timers 3 9
neighbor swp1 timers connect 10
neighbor swp1 advertisement-interval 0
neighbor swp1 capability extended-nexthop
neighbor swp2 interface remote-as external
neighbor swp2 interface peer-group underlay
neighbor swp2 timers 3 9
neighbor swp2 timers connect 10
neighbor swp2 advertisement-interval 0
neighbor swp2 capability extended-nexthop
neighbor swp3 interface remote-as external
neighbor swp3 interface peer-group underlay
neighbor swp3 timers 3 9
neighbor swp3 timers connect 10
neighbor swp3 advertisement-interval 0
neighbor swp3 capability extended-nexthop
neighbor swp4 interface remote-as external
neighbor swp4 interface peer-group underlay
neighbor swp4 timers 3 9
neighbor swp4 timers connect 10
neighbor swp4 advertisement-interval 0
neighbor swp4 capability extended-nexthop
neighbor swp5 interface remote-as external
neighbor swp5 interface peer-group underlay
neighbor swp5 timers 3 9
neighbor swp5 timers connect 10
neighbor swp5 advertisement-interval 0
neighbor swp5 capability extended-nexthop
neighbor swp6 interface remote-as external
neighbor swp6 interface peer-group underlay
neighbor swp6 timers 3 9
neighbor swp6 timers connect 10
neighbor swp6 advertisement-interval 0
neighbor swp6 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor swp5 activate
neighbor swp6 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor swp5 activate
neighbor swp6 activate
neighbor underlay activate
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65199 vrf default
bgp router-id 10.10.10.103
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor swp1 interface remote-as external
neighbor swp1 interface peer-group underlay
neighbor swp1 timers 3 9
neighbor swp1 timers connect 10
neighbor swp1 advertisement-interval 0
neighbor swp1 capability extended-nexthop
neighbor swp2 interface remote-as external
neighbor swp2 interface peer-group underlay
neighbor swp2 timers 3 9
neighbor swp2 timers connect 10
neighbor swp2 advertisement-interval 0
neighbor swp2 capability extended-nexthop
neighbor swp3 interface remote-as external
neighbor swp3 interface peer-group underlay
neighbor swp3 timers 3 9
neighbor swp3 timers connect 10
neighbor swp3 advertisement-interval 0
neighbor swp3 capability extended-nexthop
neighbor swp4 interface remote-as external
neighbor swp4 interface peer-group underlay
neighbor swp4 timers 3 9
neighbor swp4 timers connect 10
neighbor swp4 advertisement-interval 0
neighbor swp4 capability extended-nexthop
neighbor swp5 interface remote-as external
neighbor swp5 interface peer-group underlay
neighbor swp5 timers 3 9
neighbor swp5 timers connect 10
neighbor swp5 advertisement-interval 0
neighbor swp5 capability extended-nexthop
neighbor swp6 interface remote-as external
neighbor swp6 interface peer-group underlay
neighbor swp6 timers 3 9
neighbor swp6 timers connect 10
neighbor swp6 advertisement-interval 0
neighbor swp6 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor swp5 activate
neighbor swp6 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor swp5 activate
neighbor swp6 activate
neighbor underlay activate
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65199 vrf default
bgp router-id 10.10.10.104
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor swp1 interface remote-as external
neighbor swp1 interface peer-group underlay
neighbor swp1 timers 3 9
neighbor swp1 timers connect 10
neighbor swp1 advertisement-interval 0
neighbor swp1 capability extended-nexthop
neighbor swp2 interface remote-as external
neighbor swp2 interface peer-group underlay
neighbor swp2 timers 3 9
neighbor swp2 timers connect 10
neighbor swp2 advertisement-interval 0
neighbor swp2 capability extended-nexthop
neighbor swp3 interface remote-as external
neighbor swp3 interface peer-group underlay
neighbor swp3 timers 3 9
neighbor swp3 timers connect 10
neighbor swp3 advertisement-interval 0
neighbor swp3 capability extended-nexthop
neighbor swp4 interface remote-as external
neighbor swp4 interface peer-group underlay
neighbor swp4 timers 3 9
neighbor swp4 timers connect 10
neighbor swp4 advertisement-interval 0
neighbor swp4 capability extended-nexthop
neighbor swp5 interface remote-as external
neighbor swp5 interface peer-group underlay
neighbor swp5 timers 3 9
neighbor swp5 timers connect 10
neighbor swp5 advertisement-interval 0
neighbor swp5 capability extended-nexthop
neighbor swp6 interface remote-as external
neighbor swp6 interface peer-group underlay
neighbor swp6 timers 3 9
neighbor swp6 timers connect 10
neighbor swp6 advertisement-interval 0
neighbor swp6 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor swp5 activate
neighbor swp6 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor swp5 activate
neighbor swp6 activate
neighbor underlay activate
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65253 vrf default
bgp router-id 10.10.10.63
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor peerlink.4094 interface remote-as external
neighbor peerlink.4094 interface peer-group underlay
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 capability extended-nexthop
neighbor swp51 interface remote-as external
neighbor swp51 interface peer-group underlay
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface remote-as external
neighbor swp52 interface peer-group underlay
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
neighbor swp53 interface remote-as external
neighbor swp53 interface peer-group underlay
neighbor swp53 timers 3 9
neighbor swp53 timers connect 10
neighbor swp53 advertisement-interval 0
neighbor swp53 capability extended-nexthop
neighbor swp54 interface remote-as external
neighbor swp54 interface peer-group underlay
neighbor swp54 timers 3 9
neighbor swp54 timers connect 10
neighbor swp54 advertisement-interval 0
neighbor swp54 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
advertise-default-gw
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65254 vrf default
bgp router-id 10.10.10.64
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor peerlink.4094 interface remote-as external
neighbor peerlink.4094 interface peer-group underlay
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 capability extended-nexthop
neighbor swp51 interface remote-as external
neighbor swp51 interface peer-group underlay
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface remote-as external
neighbor swp52 interface peer-group underlay
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
neighbor swp53 interface remote-as external
neighbor swp53 interface peer-group underlay
neighbor swp53 timers 3 9
neighbor swp53 timers connect 10
neighbor swp53 advertisement-interval 0
neighbor swp53 capability extended-nexthop
neighbor swp54 interface remote-as external
neighbor swp54 interface peer-group underlay
neighbor swp54 timers 3 9
neighbor swp54 timers connect 10
neighbor swp54 advertisement-interval 0
neighbor swp54 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
advertise-default-gw
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family

EVPN Symmetric Routing

The following example shows an EVPN symmetric routing configuration, where:

The following images show traffic flow between tenants. The spines and other devices are omitted for simplicity.

Traffic Flow between server01 and server04
server01 and server04 are in the same VRF and the same VLAN but are located across different leafs.
  1. server01 makes an LACP hash decision and forwards traffic to leaf01.
  2. leaf01 does a layer 2 lookup and has the MAC address for server04; it then forwards the packet out VNI10, through leaf04.
  3. The VXLAN encapsulated frame arrives on leaf04, which does a layer 2 lookup and has the MAC address for server04 in VLAN10.
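The bridged case above can be sketched in Python as a MAC table lookup. This is a simplified model for illustration, not how the switch implements forwarding. The MAC addresses and the `l2_forward` helper are hypothetical. The VLAN to VNI mapping comes from the leaf configuration in this section, and using the leaf03/leaf04 MLAG shared address (10.0.1.34) as the remote VTEP is an assumption.

```python
# Illustrative model of the layer 2 lookup on leaf01 (not switch code).

# From "nv set bridge domain br_default vlan N vni N" in this section.
VLAN_TO_VNI = {10: 10, 20: 20, 30: 30}

# Hypothetical MAC table on leaf01: (vlan, mac) -> local port or remote VTEP.
# The MAC addresses are placeholders from the documentation range.
MAC_TABLE = {
    (10, "00:00:5e:00:53:01"): {"port": "bond1"},      # server01, attached locally
    (10, "00:00:5e:00:53:04"): {"vtep": "10.0.1.34"},  # server04, assumed learned through EVPN
}


def l2_forward(vlan, dst_mac):
    """Describe how this model forwards a frame received in the given VLAN."""
    entry = MAC_TABLE.get((vlan, dst_mac))
    if entry is None:
        # Handling of unknown destinations is outside this sketch.
        return {"action": "unknown"}
    if "port" in entry:
        return {"action": "bridge", "port": entry["port"]}
    # Remote MAC: encapsulate with the layer 2 VNI mapped to the VLAN.
    return {"action": "vxlan", "vni": VLAN_TO_VNI[vlan], "remote_vtep": entry["vtep"]}


print(l2_forward(10, "00:00:5e:00:53:04"))
```

No routing takes place in this flow; the frame stays in VLAN10 on both leafs and only the layer 2 VNI is used.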
Traffic Flow between server01 and server05
server01 and server05 are in the same VRF, different VLANs, and are located across different leafs.
  1. server01 makes an LACP hash decision to reach the default gateway and forwards traffic to leaf01.
  2. leaf01 does a layer 3 lookup in VRF RED and has a route out VNIRED through leaf04.
  3. The VXLAN encapsulated packet arrives on leaf04, which does a layer 3 lookup in VRF RED and has a route through VLAN20 to server05.
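The routed case above can be sketched as two longest-prefix lookups in the same VRF, one on each leaf, joined by the layer 3 VNI. This is a minimal model under stated assumptions: the server05 address (10.1.20.105), the contents of the route tables, and the helper names are hypothetical. The layer 3 VNI values come from the `nv set vrf RED evpn vni 4001` and `nv set vrf BLUE evpn vni 4002` commands in this section.

```python
import ipaddress

# From "nv set vrf RED evpn vni 4001" and "nv set vrf BLUE evpn vni 4002".
L3_VNI = {"RED": 4001, "BLUE": 4002}
VRF = "RED"

# Hypothetical VRF RED tables. The /32 entry stands for an EVPN host route
# to server05 (assumed address) that points at the remote VTEP.
LEAF01_RED = {
    "10.1.10.0/24": {"local_vlan": 10},
    "10.1.20.0/24": {"local_vlan": 20},
    "10.1.20.105/32": {"remote_vtep": "10.0.1.34"},
}
LEAF04_RED = {
    "10.1.20.0/24": {"local_vlan": 20},
}


def lookup(table, dst):
    """Return the next hop of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(dst)
    best_len, best = -1, None
    for prefix, nexthop in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and net.prefixlen > best_len:
            best_len, best = net.prefixlen, nexthop
    return best


def symmetric_route(dst):
    """Trace the ingress and egress routing decisions for one destination."""
    vni = L3_VNI[VRF]
    ingress = lookup(LEAF01_RED, dst)
    if ingress is None:
        return [f"leaf01: no route in VRF {VRF}"]
    if "local_vlan" in ingress:
        return [f"leaf01: route in VRF {VRF} to VLAN{ingress['local_vlan']}"]
    steps = [f"leaf01: route in VRF {VRF}, send on L3 VNI {vni} to VTEP {ingress['remote_vtep']}"]
    egress = lookup(LEAF04_RED, dst)
    if egress is None:
        steps.append(f"leaf04: no route in VRF {VRF}")
    else:
        steps.append(f"leaf04: route in VRF {VRF} to VLAN{egress['local_vlan']}")
    return steps


for step in symmetric_route("10.1.20.105"):
    print(step)
```

In this model both leafs route the packet, and the VXLAN header carries the VNI of the VRF rather than the VNI of either VLAN.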
Traffic Flow between server01 and server06
server01 and server06 are in different VRFs, different VLANs, and are located across different leafs.
  1. server01 makes an LACP hash decision to reach the default gateway and forwards traffic to leaf01.
  2. leaf01 does a layer 3 lookup in VRF RED and has a route out VNIRED through border01.
  3. The VXLAN encapsulated packet arrives on border01, which does a layer 3 lookup in VRF RED and has a route through VLAN101 to fw01 (the policy device).
  4. fw01 does a layer 3 lookup (without any VRFs) and has a route in VLAN40, through border02.
  5. border02 does a layer 3 lookup in VRF BLUE and has a route out VNIBLUE, through leaf04.
  6. The VXLAN encapsulated packet arrives on leaf04, which does a layer 3 lookup in VRF BLUE and has a route in VLAN30 to server06.
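The six steps above can be written down as a hop table and walked in order. The sketch below only restates the devices, lookup contexts, and egress interfaces named in the steps; the `HOPS` table and the `trace` helper are hypothetical names for illustration.

```python
# Each entry restates one step: device -> (lookup context, egress, next device).
HOPS = {
    "leaf01": ("VRF RED", "VNIRED", "border01"),
    "border01": ("VRF RED", "VLAN101", "fw01"),
    "fw01": ("no VRF", "VLAN40", "border02"),
    "border02": ("VRF BLUE", "VNIBLUE", "leaf04"),
    "leaf04": ("VRF BLUE", "VLAN30", "server06"),
}


def trace(start, destination):
    """Follow the hop table from start until the destination is reached."""
    path = []
    device = start
    while device != destination:
        if device not in HOPS or len(path) >= len(HOPS):
            raise ValueError(f"no path from {start} to {destination}")
        context, egress, next_device = HOPS[device]
        path.append((device, context, egress))
        device = next_device
    return path


for device, context, egress in trace("leaf01", "server06"):
    print(f"{device}: layer 3 lookup in {context}, forward out {egress}")
```

The trace makes the VRF boundary visible: every hop before fw01 looks up the destination in VRF RED, and every hop after it uses VRF BLUE.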
cumulus@leaf01:~$ nv set interface lo ip address 10.10.10.1/32
cumulus@leaf01:~$ nv set interface swp1-3,swp49-54
cumulus@leaf01:~$ nv set interface bond1 bond member swp1
cumulus@leaf01:~$ nv set interface bond2 bond member swp2
cumulus@leaf01:~$ nv set interface bond3 bond member swp3
cumulus@leaf01:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf01:~$ nv set interface bond2 bond mlag id 2
cumulus@leaf01:~$ nv set interface bond3 bond mlag id 3
cumulus@leaf01:~$ nv set interface bond1 bond lacp-bypass on
cumulus@leaf01:~$ nv set interface bond2 bond lacp-bypass on
cumulus@leaf01:~$ nv set interface bond3 bond lacp-bypass on
cumulus@leaf01:~$ nv set interface bond1 link mtu 9000
cumulus@leaf01:~$ nv set interface bond2 link mtu 9000
cumulus@leaf01:~$ nv set interface bond3 link mtu 9000
cumulus@leaf01:~$ nv set interface bond1-3 bridge domain br_default
cumulus@leaf01:~$ nv set interface bond1 bridge domain br_default access 10
cumulus@leaf01:~$ nv set interface bond2 bridge domain br_default access 20
cumulus@leaf01:~$ nv set interface bond3 bridge domain br_default access 30
cumulus@leaf01:~$ nv set bridge domain br_default vlan 10,20,30
cumulus@leaf01:~$ nv set interface peerlink bond member swp49-50
cumulus@leaf01:~$ nv set mlag backup 10.10.10.2
cumulus@leaf01:~$ nv set mlag peer-ip linklocal
cumulus@leaf01:~$ nv set mlag priority 1000
cumulus@leaf01:~$ nv set mlag init-delay 10
cumulus@leaf01:~$ nv set interface vlan10 ip address 10.1.10.2/24
cumulus@leaf01:~$ nv set interface vlan10 ip vrr address 10.1.10.1/24
cumulus@leaf01:~$ nv set interface vlan10 ip vrr state up
cumulus@leaf01:~$ nv set interface vlan20 ip address 10.1.20.2/24
cumulus@leaf01:~$ nv set interface vlan20 ip vrr address 10.1.20.1/24
cumulus@leaf01:~$ nv set interface vlan20 ip vrr state up
cumulus@leaf01:~$ nv set interface vlan30 ip address 10.1.30.2/24
cumulus@leaf01:~$ nv set interface vlan30 ip vrr address 10.1.30.1/24
cumulus@leaf01:~$ nv set interface vlan30 ip vrr state up
cumulus@leaf01:~$ nv set vrf RED
cumulus@leaf01:~$ nv set vrf BLUE
cumulus@leaf01:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf01:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf01:~$ nv set bridge domain br_default vlan 30 vni 30
cumulus@leaf01:~$ nv set interface vlan10 ip vrf RED
cumulus@leaf01:~$ nv set interface vlan20 ip vrf RED
cumulus@leaf01:~$ nv set interface vlan30 ip vrf BLUE
cumulus@leaf01:~$ nv set nve vxlan mlag shared-address 10.0.1.12
cumulus@leaf01:~$ nv set nve vxlan source address 10.10.10.1
cumulus@leaf01:~$ nv set nve vxlan arp-nd-suppress on
cumulus@leaf01:~$ nv set vrf RED evpn vni 4001
cumulus@leaf01:~$ nv set vrf BLUE evpn vni 4002
cumulus@leaf01:~$ nv set system global anycast-mac 44:38:39:FF:00:AA
cumulus@leaf01:~$ nv set evpn enable on
cumulus@leaf01:~$ nv set router bgp autonomous-system 65101
cumulus@leaf01:~$ nv set router bgp router-id 10.10.10.1
cumulus@leaf01:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 peer-group underlay
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp52 peer-group underlay
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp53 peer-group underlay
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp54 peer-group underlay
cumulus@leaf01:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@leaf01:~$ nv set vrf default router bgp neighbor peerlink.4094 peer-group underlay
cumulus@leaf01:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf01:~$ nv set vrf RED router bgp autonomous-system 65101
cumulus@leaf01:~$ nv set vrf RED router bgp router-id 10.10.10.1
cumulus@leaf01:~$ nv set vrf RED router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf01:~$ nv set vrf RED router bgp address-family ipv4-unicast route-export to-evpn
cumulus@leaf01:~$ nv set vrf BLUE router bgp autonomous-system 65101
cumulus@leaf01:~$ nv set vrf BLUE router bgp router-id 10.10.10.1
cumulus@leaf01:~$ nv set vrf BLUE router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf01:~$ nv set vrf BLUE router bgp address-family ipv4-unicast route-export to-evpn
cumulus@leaf01:~$ nv config apply
cumulus@leaf02:~$ nv set interface lo ip address 10.10.10.2/32
cumulus@leaf02:~$ nv set interface swp1-3,swp49-54
cumulus@leaf02:~$ nv set interface bond1 bond member swp1
cumulus@leaf02:~$ nv set interface bond2 bond member swp2
cumulus@leaf02:~$ nv set interface bond3 bond member swp3
cumulus@leaf02:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf02:~$ nv set interface bond2 bond mlag id 2
cumulus@leaf02:~$ nv set interface bond3 bond mlag id 3
cumulus@leaf02:~$ nv set interface bond1 bond lacp-bypass on
cumulus@leaf02:~$ nv set interface bond2 bond lacp-bypass on
cumulus@leaf02:~$ nv set interface bond3 bond lacp-bypass on
cumulus@leaf02:~$ nv set interface bond1 link mtu 9000
cumulus@leaf02:~$ nv set interface bond2 link mtu 9000
cumulus@leaf02:~$ nv set interface bond3 link mtu 9000
cumulus@leaf02:~$ nv set interface bond1-3 bridge domain br_default
cumulus@leaf02:~$ nv set interface bond1 bridge domain br_default access 10
cumulus@leaf02:~$ nv set interface bond2 bridge domain br_default access 20
cumulus@leaf02:~$ nv set interface bond3 bridge domain br_default access 30
cumulus@leaf02:~$ nv set bridge domain br_default vlan 10,20,30
cumulus@leaf02:~$ nv set interface peerlink bond member swp49-50
cumulus@leaf02:~$ nv set mlag backup 10.10.10.1
cumulus@leaf02:~$ nv set mlag peer-ip linklocal
cumulus@leaf02:~$ nv set mlag priority 2000
cumulus@leaf02:~$ nv set mlag init-delay 10
cumulus@leaf02:~$ nv set interface vlan10 ip address 10.1.10.3/24
cumulus@leaf02:~$ nv set interface vlan10 ip vrr address 10.1.10.1/24
cumulus@leaf02:~$ nv set interface vlan10 ip vrr state up
cumulus@leaf02:~$ nv set interface vlan20 ip address 10.1.20.3/24
cumulus@leaf02:~$ nv set interface vlan20 ip vrr address 10.1.20.1/24
cumulus@leaf02:~$ nv set interface vlan20 ip vrr state up
cumulus@leaf02:~$ nv set interface vlan30 ip address 10.1.30.3/24
cumulus@leaf02:~$ nv set interface vlan30 ip vrr address 10.1.30.1/24
cumulus@leaf02:~$ nv set interface vlan30 ip vrr state up
cumulus@leaf02:~$ nv set vrf RED
cumulus@leaf02:~$ nv set vrf BLUE
cumulus@leaf02:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf02:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf02:~$ nv set bridge domain br_default vlan 30 vni 30
cumulus@leaf02:~$ nv set interface vlan10 ip vrf RED
cumulus@leaf02:~$ nv set interface vlan20 ip vrf RED
cumulus@leaf02:~$ nv set interface vlan30 ip vrf BLUE
cumulus@leaf02:~$ nv set nve vxlan mlag shared-address 10.0.1.12
cumulus@leaf02:~$ nv set nve vxlan source address 10.10.10.2
cumulus@leaf02:~$ nv set nve vxlan arp-nd-suppress on
cumulus@leaf02:~$ nv set vrf RED evpn vni 4001
cumulus@leaf02:~$ nv set vrf BLUE evpn vni 4002
cumulus@leaf02:~$ nv set system global anycast-mac 44:38:39:FF:00:AA
cumulus@leaf02:~$ nv set evpn enable on
cumulus@leaf02:~$ nv set router bgp autonomous-system 65102
cumulus@leaf02:~$ nv set router bgp router-id 10.10.10.2
cumulus@leaf02:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@leaf02:~$ nv set vrf default router bgp neighbor swp51 peer-group underlay
cumulus@leaf02:~$ nv set vrf default router bgp neighbor swp52 peer-group underlay
cumulus@leaf02:~$ nv set vrf default router bgp neighbor swp53 peer-group underlay
cumulus@leaf02:~$ nv set vrf default router bgp neighbor swp54 peer-group underlay
cumulus@leaf02:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@leaf02:~$ nv set vrf default router bgp neighbor peerlink.4094 peer-group underlay
cumulus@leaf02:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf02:~$ nv set vrf RED router bgp autonomous-system 65102
cumulus@leaf02:~$ nv set vrf RED router bgp router-id 10.10.10.2
cumulus@leaf02:~$ nv set vrf RED router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf02:~$ nv set vrf RED router bgp address-family ipv4-unicast route-export to-evpn
cumulus@leaf02:~$ nv set vrf BLUE router bgp autonomous-system 65102
cumulus@leaf02:~$ nv set vrf BLUE router bgp router-id 10.10.10.2
cumulus@leaf02:~$ nv set vrf BLUE router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf02:~$ nv set vrf BLUE router bgp address-family ipv4-unicast route-export to-evpn
cumulus@leaf02:~$ nv config apply
cumulus@leaf03:~$ nv set interface lo ip address 10.10.10.3/32
cumulus@leaf03:~$ nv set interface swp1-3,swp49-54
cumulus@leaf03:~$ nv set interface bond1 bond member swp1
cumulus@leaf03:~$ nv set interface bond2 bond member swp2
cumulus@leaf03:~$ nv set interface bond3 bond member swp3
cumulus@leaf03:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf03:~$ nv set interface bond2 bond mlag id 2
cumulus@leaf03:~$ nv set interface bond3 bond mlag id 3
cumulus@leaf03:~$ nv set interface bond1 bond lacp-bypass on
cumulus@leaf03:~$ nv set interface bond2 bond lacp-bypass on
cumulus@leaf03:~$ nv set interface bond3 bond lacp-bypass on
cumulus@leaf03:~$ nv set interface bond1 link mtu 9000
cumulus@leaf03:~$ nv set interface bond2 link mtu 9000
cumulus@leaf03:~$ nv set interface bond3 link mtu 9000
cumulus@leaf03:~$ nv set interface bond1-3 bridge domain br_default
cumulus@leaf03:~$ nv set interface bond1 bridge domain br_default access 10
cumulus@leaf03:~$ nv set interface bond2 bridge domain br_default access 20
cumulus@leaf03:~$ nv set interface bond3 bridge domain br_default access 30
cumulus@leaf03:~$ nv set bridge domain br_default vlan 10,20,30
cumulus@leaf03:~$ nv set interface peerlink bond member swp49-50
cumulus@leaf03:~$ nv set mlag backup 10.10.10.4
cumulus@leaf03:~$ nv set mlag peer-ip linklocal
cumulus@leaf03:~$ nv set mlag priority 1000
cumulus@leaf03:~$ nv set mlag init-delay 10
cumulus@leaf03:~$ nv set interface vlan10 ip address 10.1.10.4/24
cumulus@leaf03:~$ nv set interface vlan10 ip vrr address 10.1.10.1/24
cumulus@leaf03:~$ nv set interface vlan10 ip vrr state up
cumulus@leaf03:~$ nv set interface vlan20 ip address 10.1.20.4/24
cumulus@leaf03:~$ nv set interface vlan20 ip vrr address 10.1.20.1/24
cumulus@leaf03:~$ nv set interface vlan20 ip vrr state up
cumulus@leaf03:~$ nv set interface vlan30 ip address 10.1.30.4/24
cumulus@leaf03:~$ nv set interface vlan30 ip vrr address 10.1.30.1/24
cumulus@leaf03:~$ nv set interface vlan30 ip vrr state up
cumulus@leaf03:~$ nv set vrf RED
cumulus@leaf03:~$ nv set vrf BLUE
cumulus@leaf03:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf03:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf03:~$ nv set bridge domain br_default vlan 30 vni 30
cumulus@leaf03:~$ nv set interface vlan10 ip vrf RED
cumulus@leaf03:~$ nv set interface vlan20 ip vrf RED
cumulus@leaf03:~$ nv set interface vlan30 ip vrf BLUE
cumulus@leaf03:~$ nv set nve vxlan mlag shared-address 10.0.1.34
cumulus@leaf03:~$ nv set nve vxlan source address 10.10.10.3
cumulus@leaf03:~$ nv set nve vxlan arp-nd-suppress on
cumulus@leaf03:~$ nv set vrf RED evpn vni 4001
cumulus@leaf03:~$ nv set vrf BLUE evpn vni 4002
cumulus@leaf03:~$ nv set system global anycast-mac 44:38:39:FF:00:BB
cumulus@leaf03:~$ nv set evpn enable on
cumulus@leaf03:~$ nv set router bgp autonomous-system 65103
cumulus@leaf03:~$ nv set router bgp router-id 10.10.10.3
cumulus@leaf03:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@leaf03:~$ nv set vrf default router bgp neighbor swp51 peer-group underlay
cumulus@leaf03:~$ nv set vrf default router bgp neighbor swp52 peer-group underlay
cumulus@leaf03:~$ nv set vrf default router bgp neighbor swp53 peer-group underlay
cumulus@leaf03:~$ nv set vrf default router bgp neighbor swp54 peer-group underlay
cumulus@leaf03:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@leaf03:~$ nv set vrf default router bgp neighbor peerlink.4094 peer-group underlay
cumulus@leaf03:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf03:~$ nv set vrf RED router bgp autonomous-system 65103
cumulus@leaf03:~$ nv set vrf RED router bgp router-id 10.10.10.3
cumulus@leaf03:~$ nv set vrf RED router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf03:~$ nv set vrf RED router bgp address-family ipv4-unicast route-export to-evpn
cumulus@leaf03:~$ nv set vrf BLUE router bgp autonomous-system 65103
cumulus@leaf03:~$ nv set vrf BLUE router bgp router-id 10.10.10.3
cumulus@leaf03:~$ nv set vrf BLUE router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf03:~$ nv set vrf BLUE router bgp address-family ipv4-unicast route-export to-evpn
cumulus@leaf03:~$ nv config apply
cumulus@leaf04:~$ nv set interface lo ip address 10.10.10.4/32
cumulus@leaf04:~$ nv set interface swp1-3,swp49-54
cumulus@leaf04:~$ nv set interface bond1 bond member swp1
cumulus@leaf04:~$ nv set interface bond2 bond member swp2
cumulus@leaf04:~$ nv set interface bond3 bond member swp3
cumulus@leaf04:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf04:~$ nv set interface bond2 bond mlag id 2
cumulus@leaf04:~$ nv set interface bond3 bond mlag id 3
cumulus@leaf04:~$ nv set interface bond1 bond lacp-bypass on
cumulus@leaf04:~$ nv set interface bond2 bond lacp-bypass on
cumulus@leaf04:~$ nv set interface bond3 bond lacp-bypass on
cumulus@leaf04:~$ nv set interface bond1 link mtu 9000
cumulus@leaf04:~$ nv set interface bond2 link mtu 9000
cumulus@leaf04:~$ nv set interface bond3 link mtu 9000
cumulus@leaf04:~$ nv set interface bond1-3 bridge domain br_default
cumulus@leaf04:~$ nv set interface bond1 bridge domain br_default access 10
cumulus@leaf04:~$ nv set interface bond2 bridge domain br_default access 20
cumulus@leaf04:~$ nv set interface bond3 bridge domain br_default access 30
cumulus@leaf04:~$ nv set bridge domain br_default vlan 10,20,30
cumulus@leaf04:~$ nv set interface peerlink bond member swp49-50
cumulus@leaf04:~$ nv set mlag backup 10.10.10.3
cumulus@leaf04:~$ nv set mlag peer-ip linklocal
cumulus@leaf04:~$ nv set mlag priority 2000
cumulus@leaf04:~$ nv set mlag init-delay 10
cumulus@leaf04:~$ nv set interface vlan10 ip address 10.1.10.5/24
cumulus@leaf04:~$ nv set interface vlan10 ip vrr address 10.1.10.1/24
cumulus@leaf04:~$ nv set interface vlan10 ip vrr state up
cumulus@leaf04:~$ nv set interface vlan20 ip address 10.1.20.5/24
cumulus@leaf04:~$ nv set interface vlan20 ip vrr address 10.1.20.1/24
cumulus@leaf04:~$ nv set interface vlan20 ip vrr state up
cumulus@leaf04:~$ nv set interface vlan30 ip address 10.1.30.5/24
cumulus@leaf04:~$ nv set interface vlan30 ip vrr address 10.1.30.1/24
cumulus@leaf04:~$ nv set interface vlan30 ip vrr state up
cumulus@leaf04:~$ nv set vrf RED
cumulus@leaf04:~$ nv set vrf BLUE
cumulus@leaf04:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf04:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf04:~$ nv set bridge domain br_default vlan 30 vni 30
cumulus@leaf04:~$ nv set interface vlan10 ip vrf RED
cumulus@leaf04:~$ nv set interface vlan20 ip vrf RED
cumulus@leaf04:~$ nv set interface vlan30 ip vrf BLUE
cumulus@leaf04:~$ nv set nve vxlan mlag shared-address 10.0.1.34
cumulus@leaf04:~$ nv set nve vxlan source address 10.10.10.4
cumulus@leaf04:~$ nv set nve vxlan arp-nd-suppress on
cumulus@leaf04:~$ nv set vrf RED evpn vni 4001
cumulus@leaf04:~$ nv set vrf BLUE evpn vni 4002
cumulus@leaf04:~$ nv set system global anycast-mac 44:38:39:FF:00:BB
cumulus@leaf04:~$ nv set evpn enable on
cumulus@leaf04:~$ nv set router bgp autonomous-system 65104
cumulus@leaf04:~$ nv set router bgp router-id 10.10.10.4
cumulus@leaf04:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@leaf04:~$ nv set vrf default router bgp neighbor swp51 peer-group underlay
cumulus@leaf04:~$ nv set vrf default router bgp neighbor swp52 peer-group underlay
cumulus@leaf04:~$ nv set vrf default router bgp neighbor swp53 peer-group underlay
cumulus@leaf04:~$ nv set vrf default router bgp neighbor swp54 peer-group underlay
cumulus@leaf04:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@leaf04:~$ nv set vrf default router bgp neighbor peerlink.4094 peer-group underlay
cumulus@leaf04:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf04:~$ nv set vrf RED router bgp autonomous-system 65104
cumulus@leaf04:~$ nv set vrf RED router bgp router-id 10.10.10.4
cumulus@leaf04:~$ nv set vrf RED router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf04:~$ nv set vrf RED router bgp address-family ipv4-unicast route-export to-evpn
cumulus@leaf04:~$ nv set vrf BLUE router bgp autonomous-system 65104
cumulus@leaf04:~$ nv set vrf BLUE router bgp router-id 10.10.10.4
cumulus@leaf04:~$ nv set vrf BLUE router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@leaf04:~$ nv set vrf BLUE router bgp address-family ipv4-unicast route-export to-evpn
cumulus@leaf04:~$ nv config apply
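The leaf commands above map each VLAN to a layer 2 VNI, place each SVI in a VRF, and give each VRF a layer 3 VNI. As a rough sketch only (not an NVIDIA tool), the following Python reads a hand-copied subset of those commands and builds the resulting VLAN to VNI to VRF table. The helper name `vni_table` and the regular expressions are assumptions made for this illustration; they cover only the three command forms shown here.

```python
import re

# A subset of the leaf04 commands, copied without the shell prompt.
commands = """\
nv set bridge domain br_default vlan 10 vni 10
nv set bridge domain br_default vlan 20 vni 20
nv set bridge domain br_default vlan 30 vni 30
nv set interface vlan10 ip vrf RED
nv set interface vlan20 ip vrf RED
nv set interface vlan30 ip vrf BLUE
nv set vrf RED evpn vni 4001
nv set vrf BLUE evpn vni 4002
""".splitlines()


def vni_table(lines):
    """Return {vlan: (l2_vni, vrf, l3_vni)} from the three command forms above."""
    l2_vni, svi_vrf, l3_vni = {}, {}, {}
    for line in lines:
        if m := re.fullmatch(r"nv set bridge domain \S+ vlan (\d+) vni (\d+)", line):
            l2_vni[int(m[1])] = int(m[2])
        elif m := re.fullmatch(r"nv set interface vlan(\d+) ip vrf (\S+)", line):
            svi_vrf[int(m[1])] = m[2]
        elif m := re.fullmatch(r"nv set vrf (\S+) evpn vni (\d+)", line):
            l3_vni[m[1]] = int(m[2])
    # Join the three lookups on VLAN ID; missing entries come back as None.
    return {
        vlan: (vni, svi_vrf.get(vlan), l3_vni.get(svi_vrf.get(vlan)))
        for vlan, vni in l2_vni.items()
    }


if __name__ == "__main__":
    for vlan, row in vni_table(commands).items():
        print(vlan, row)
```

A table like this can help confirm that every VLAN has a VNI and that each SVI belongs to a VRF with a layer 3 VNI before you apply the configuration.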
cumulus@spine01:~$ nv set interface lo ip address 10.10.10.101/32
cumulus@spine01:~$ nv set interface swp1-6
cumulus@spine01:~$ nv set router bgp autonomous-system 65199
cumulus@spine01:~$ nv set router bgp router-id 10.10.10.101
cumulus@spine01:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@spine01:~$ nv set vrf default router bgp path-selection multipath aspath-ignore on
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp1 peer-group underlay
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp2 peer-group underlay
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp3 peer-group underlay
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp4 peer-group underlay
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp5 peer-group underlay
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp6 peer-group underlay
cumulus@spine01:~$ nv set vrf default router bgp address-family l2vpn-evpn enable on
cumulus@spine01:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@spine01:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@spine01:~$ nv config apply
cumulus@spine02:~$ nv set interface lo ip address 10.10.10.102/32
cumulus@spine02:~$ nv set interface swp1-6
cumulus@spine02:~$ nv set router bgp autonomous-system 65199
cumulus@spine02:~$ nv set router bgp router-id 10.10.10.102
cumulus@spine02:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@spine02:~$ nv set vrf default router bgp path-selection multipath aspath-ignore on
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp1 peer-group underlay
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp2 peer-group underlay
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp3 peer-group underlay
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp4 peer-group underlay
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp5 peer-group underlay
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp6 peer-group underlay
cumulus@spine02:~$ nv set vrf default router bgp address-family l2vpn-evpn enable on
cumulus@spine02:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@spine02:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@spine02:~$ nv config apply
cumulus@spine03:~$ nv set interface lo ip address 10.10.10.103/32
cumulus@spine03:~$ nv set interface swp1-6
cumulus@spine03:~$ nv set router bgp autonomous-system 65199
cumulus@spine03:~$ nv set router bgp router-id 10.10.10.103
cumulus@spine03:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@spine03:~$ nv set vrf default router bgp path-selection multipath aspath-ignore on
cumulus@spine03:~$ nv set vrf default router bgp neighbor swp1 peer-group underlay
cumulus@spine03:~$ nv set vrf default router bgp neighbor swp2 peer-group underlay
cumulus@spine03:~$ nv set vrf default router bgp neighbor swp3 peer-group underlay
cumulus@spine03:~$ nv set vrf default router bgp neighbor swp4 peer-group underlay
cumulus@spine03:~$ nv set vrf default router bgp neighbor swp5 peer-group underlay
cumulus@spine03:~$ nv set vrf default router bgp neighbor swp6 peer-group underlay
cumulus@spine03:~$ nv set vrf default router bgp address-family l2vpn-evpn enable on
cumulus@spine03:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@spine03:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@spine03:~$ nv config apply
cumulus@spine04:~$ nv set interface lo ip address 10.10.10.104/32
cumulus@spine04:~$ nv set interface swp1-6
cumulus@spine04:~$ nv set router bgp autonomous-system 65199
cumulus@spine04:~$ nv set router bgp router-id 10.10.10.104
cumulus@spine04:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@spine04:~$ nv set vrf default router bgp path-selection multipath aspath-ignore on
cumulus@spine04:~$ nv set vrf default router bgp neighbor swp1 peer-group underlay
cumulus@spine04:~$ nv set vrf default router bgp neighbor swp2 peer-group underlay
cumulus@spine04:~$ nv set vrf default router bgp neighbor swp3 peer-group underlay
cumulus@spine04:~$ nv set vrf default router bgp neighbor swp4 peer-group underlay
cumulus@spine04:~$ nv set vrf default router bgp neighbor swp5 peer-group underlay
cumulus@spine04:~$ nv set vrf default router bgp neighbor swp6 peer-group underlay
cumulus@spine04:~$ nv set vrf default router bgp address-family l2vpn-evpn enable on
cumulus@spine04:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@spine04:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@spine04:~$ nv config apply
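The four spine configurations above are identical except for the loopback address and the BGP router ID. The following Python is a minimal sketch that generates the same command list for a given spine number. The function name `spine_commands` and its parameters are hypothetical, and the script only builds and prints the commands; it does not run them on a switch.

```python
def spine_commands(index, asn=65199, ports=6):
    """Return the NVUE commands listed above for spine<index> (1 to 4 in this example)."""
    loopback = f"10.10.10.{100 + index}"
    bgp = "nv set vrf default router bgp"
    cmds = [
        f"nv set interface lo ip address {loopback}/32",
        f"nv set interface swp1-{ports}",
        f"nv set router bgp autonomous-system {asn}",
        f"nv set router bgp router-id {loopback}",
        f"{bgp} peer-group underlay remote-as external",
        f"{bgp} path-selection multipath aspath-ignore on",
    ]
    # One unnumbered BGP neighbor per downlink port.
    cmds += [f"{bgp} neighbor swp{p} peer-group underlay" for p in range(1, ports + 1)]
    cmds += [
        f"{bgp} address-family l2vpn-evpn enable on",
        f"{bgp} peer-group underlay address-family l2vpn-evpn enable on",
        f"{bgp} address-family ipv4-unicast redistribute connected",
        "nv config apply",
    ]
    return cmds


if __name__ == "__main__":
    print("\n".join(spine_commands(3)))
```

Generating the list from one template keeps the per-switch differences (here, only the last octet of the loopback address) in a single place.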
cumulus@border01:~$ nv set interface lo ip address 10.10.10.63/32
cumulus@border01:~$ nv set interface swp3,swp49-54
cumulus@border01:~$ nv set interface bond3 bond member swp3
cumulus@border01:~$ nv set interface bond3 bond mlag id 1
cumulus@border01:~$ nv set interface bond3 bond lacp-bypass on
cumulus@border01:~$ nv set interface bond3 link mtu 9000
cumulus@border01:~$ nv set interface bond3 bridge domain br_default
cumulus@border01:~$ nv set interface bond3 bridge domain br_default vlan 101,102
cumulus@border01:~$ nv set interface peerlink bond member swp49-50
cumulus@border01:~$ nv set mlag backup 10.10.10.64
cumulus@border01:~$ nv set mlag peer-ip linklocal
cumulus@border01:~$ nv set mlag priority 1000
cumulus@border01:~$ nv set mlag init-delay 10
cumulus@border01:~$ nv set vrf RED
cumulus@border01:~$ nv set vrf BLUE
cumulus@border01:~$ nv set interface vlan101 ip address 10.1.101.64/24
cumulus@border01:~$ nv set interface vlan101 ip vrr address 10.1.101.1/24
cumulus@border01:~$ nv set interface vlan101 ip vrr state up
cumulus@border01:~$ nv set interface vlan102 ip address 10.1.102.64/24
cumulus@border01:~$ nv set interface vlan102 ip vrr address 10.1.102.1/24
cumulus@border01:~$ nv set interface vlan102 ip vrr state up
cumulus@border01:~$ nv set bridge domain br_default vlan 101,102
cumulus@border01:~$ nv set interface vlan101 ip vrf RED
cumulus@border01:~$ nv set interface vlan102 ip vrf BLUE
cumulus@border01:~$ nv set nve vxlan mlag shared-address 10.0.1.255
cumulus@border01:~$ nv set nve vxlan source address 10.10.10.63
cumulus@border01:~$ nv set nve vxlan arp-nd-suppress on
cumulus@border01:~$ nv set vrf RED evpn vni 4001
cumulus@border01:~$ nv set vrf BLUE evpn vni 4002
cumulus@border01:~$ nv set system global anycast-mac 44:38:39:FF:00:FF
cumulus@border01:~$ nv set evpn enable on
cumulus@border01:~$ nv set router bgp autonomous-system 65253
cumulus@border01:~$ nv set router bgp router-id 10.10.10.63
cumulus@border01:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@border01:~$ nv set vrf default router bgp neighbor swp51 peer-group underlay
cumulus@border01:~$ nv set vrf default router bgp neighbor swp52 peer-group underlay
cumulus@border01:~$ nv set vrf default router bgp neighbor swp53 peer-group underlay
cumulus@border01:~$ nv set vrf default router bgp neighbor swp54 peer-group underlay
cumulus@border01:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@border01:~$ nv set vrf default router bgp neighbor peerlink.4094 peer-group underlay
cumulus@border01:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@border01:~$ nv set vrf RED router bgp autonomous-system 65253
cumulus@border01:~$ nv set vrf RED router bgp router-id 10.10.10.63
cumulus@border01:~$ nv set vrf RED router static 10.1.30.0/24 via 10.1.101.4
cumulus@border01:~$ nv set vrf RED router bgp address-family ipv4-unicast redistribute static
cumulus@border01:~$ nv set vrf RED router bgp address-family ipv4-unicast route-export to-evpn
cumulus@border01:~$ nv set vrf BLUE router bgp autonomous-system 65253
cumulus@border01:~$ nv set vrf BLUE router bgp router-id 10.10.10.63
cumulus@border01:~$ nv set vrf BLUE router static 10.1.10.0/24 via 10.1.102.4
cumulus@border01:~$ nv set vrf BLUE router static 10.1.20.0/24 via 10.1.102.4
cumulus@border01:~$ nv set vrf BLUE router bgp address-family ipv4-unicast redistribute static
cumulus@border01:~$ nv set vrf BLUE router bgp address-family ipv4-unicast route-export to-evpn
cumulus@border01:~$ nv config apply
cumulus@border02:~$ nv set interface lo ip address 10.10.10.64/32
cumulus@border02:~$ nv set interface swp3,swp49-54
cumulus@border02:~$ nv set interface bond3 bond member swp3
cumulus@border02:~$ nv set interface bond3 bond mlag id 1
cumulus@border02:~$ nv set interface bond3 bond lacp-bypass on
cumulus@border02:~$ nv set interface bond3 link mtu 9000
cumulus@border02:~$ nv set interface bond3 bridge domain br_default
cumulus@border02:~$ nv set interface bond3 bridge domain br_default vlan 101,102
cumulus@border02:~$ nv set interface peerlink bond member swp49-50
cumulus@border02:~$ nv set mlag backup 10.10.10.63
cumulus@border02:~$ nv set mlag peer-ip linklocal
cumulus@border02:~$ nv set mlag priority 2000
cumulus@border02:~$ nv set mlag init-delay 10
cumulus@border02:~$ nv set vrf RED
cumulus@border02:~$ nv set vrf BLUE
cumulus@border02:~$ nv set interface vlan101 ip address 10.1.101.65/24
cumulus@border02:~$ nv set interface vlan101 ip vrr address 10.1.101.1/24
cumulus@border02:~$ nv set interface vlan101 ip vrr state up
cumulus@border02:~$ nv set interface vlan102 ip address 10.1.102.65/24
cumulus@border02:~$ nv set interface vlan102 ip vrr address 10.1.102.1/24
cumulus@border02:~$ nv set interface vlan102 ip vrr state up
cumulus@border02:~$ nv set bridge domain br_default vlan 101,102
cumulus@border02:~$ nv set interface vlan101 ip vrf RED
cumulus@border02:~$ nv set interface vlan102 ip vrf BLUE
cumulus@border02:~$ nv set nve vxlan mlag shared-address 10.0.1.255
cumulus@border02:~$ nv set nve vxlan source address 10.10.10.64
cumulus@border02:~$ nv set nve vxlan arp-nd-suppress on
cumulus@border02:~$ nv set vrf RED evpn vni 4001
cumulus@border02:~$ nv set vrf BLUE evpn vni 4002
cumulus@border02:~$ nv set system global anycast-mac 44:38:39:FF:00:FF
cumulus@border02:~$ nv set evpn enable on
cumulus@border02:~$ nv set router bgp autonomous-system 65254
cumulus@border02:~$ nv set router bgp router-id 10.10.10.64
cumulus@border02:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@border02:~$ nv set vrf default router bgp neighbor swp51 peer-group underlay
cumulus@border02:~$ nv set vrf default router bgp neighbor swp52 peer-group underlay
cumulus@border02:~$ nv set vrf default router bgp neighbor swp53 peer-group underlay
cumulus@border02:~$ nv set vrf default router bgp neighbor swp54 peer-group underlay
cumulus@border02:~$ nv set vrf default router bgp peer-group underlay address-family l2vpn-evpn enable on
cumulus@border02:~$ nv set vrf default router bgp neighbor peerlink.4094 peer-group underlay
cumulus@border02:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected enable on
cumulus@border02:~$ nv set vrf RED router bgp autonomous-system 65254
cumulus@border02:~$ nv set vrf RED router bgp router-id 10.10.10.64
cumulus@border02:~$ nv set vrf RED router static 10.1.30.0/24 via 10.1.101.4
cumulus@border02:~$ nv set vrf RED router bgp address-family ipv4-unicast redistribute static
cumulus@border02:~$ nv set vrf RED router bgp address-family ipv4-unicast route-export to-evpn
cumulus@border02:~$ nv set vrf BLUE router bgp autonomous-system 65254
cumulus@border02:~$ nv set vrf BLUE router bgp router-id 10.10.10.64
cumulus@border02:~$ nv set vrf BLUE router static 10.1.10.0/24 via 10.1.102.4
cumulus@border02:~$ nv set vrf BLUE router static 10.1.20.0/24 via 10.1.102.4
cumulus@border02:~$ nv set vrf BLUE router bgp address-family ipv4-unicast redistribute static
cumulus@border02:~$ nv set vrf BLUE router bgp address-family ipv4-unicast route-export to-evpn
cumulus@border02:~$ nv config apply
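border01 and border02 form an MLAG pair. In the commands above, both switches use the same VXLAN shared address and anycast MAC address, and each switch uses the loopback address of its peer as the MLAG backup IP. The following Python is a minimal sketch of that cross-check with the values copied by hand into dictionaries. The dictionary layout and the `check_mlag_pair` helper are hypothetical and are not part of NVUE.

```python
border01 = {
    "loopback": "10.10.10.63",
    "mlag_backup": "10.10.10.64",
    "shared_address": "10.0.1.255",
    "anycast_mac": "44:38:39:FF:00:FF",
}
border02 = {
    "loopback": "10.10.10.64",
    "mlag_backup": "10.10.10.63",
    "shared_address": "10.0.1.255",
    "anycast_mac": "44:38:39:FF:00:FF",
}


def check_mlag_pair(a, b):
    """Return a list of problems; an empty list means the pair looks consistent."""
    problems = []
    # Values that are identical on both peers in this example.
    for key in ("shared_address", "anycast_mac"):
        if a[key] != b[key]:
            problems.append(f"{key} differs: {a[key]} vs {b[key]}")
    # In this example, the backup IP is the loopback address of the peer.
    if a["mlag_backup"] != b["loopback"]:
        problems.append("backup IP on the first switch is not the peer loopback")
    if b["mlag_backup"] != a["loopback"]:
        problems.append("backup IP on the second switch is not the peer loopback")
    return problems


if __name__ == "__main__":
    print(check_mlag_pair(border01, border02) or "pair is consistent")
```

The same check applies to the leaf01/leaf02 and leaf03/leaf04 pairs if you substitute their values.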
cumulus@leaf01:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
            '30':
              vni:
                '30': {}
    evpn:
      enable: on
    interface:
      bond1:
        bond:
          lacp-bypass: on
          member:
            swp1: {}
          mlag:
            enable: on
            id: 1
        bridge:
          domain:
            br_default:
              access: 10
        link:
          mtu: 9000
        type: bond
      bond2:
        bond:
          lacp-bypass: on
          member:
            swp2: {}
          mlag:
            enable: on
            id: 2
        bridge:
          domain:
            br_default:
              access: 20
        link:
          mtu: 9000
        type: bond
      bond3:
        bond:
          lacp-bypass: on
          member:
            swp3: {}
          mlag:
            enable: on
            id: 3
        bridge:
          domain:
            br_default:
              access: 30
        link:
          mtu: 9000
        type: bond
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.1/32: {}
        type: loopback
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        base-interface: peerlink
        type: sub
        vlan: 4094
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      swp53:
        type: swp
      swp54:
        type: swp
      vlan10:
        ip:
          address:
            10.1.10.2/24: {}
          vrf: RED
          vrr:
            address:
              10.1.10.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 10
      vlan20:
        ip:
          address:
            10.1.20.2/24: {}
          vrf: RED
          vrr:
            address:
              10.1.20.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 20
      vlan30:
        ip:
          address:
            10.1.30.2/24: {}
          vrf: BLUE
          vrr:
            address:
              10.1.30.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 30
    mlag:
      backup:
        10.10.10.2: {}
      enable: on
      init-delay: 10
      peer-ip: linklocal
      priority: 1000
    nve:
      vxlan:
        arp-nd-suppress: on
        enable: on
        mlag:
          shared-address: 10.0.1.12
        source:
          address: 10.10.10.1
    router:
      bgp:
        autonomous-system: 65101
        enable: on
        router-id: 10.10.10.1
      vrr:
        enable: on
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$OCOAPC8HKzjocQN.$q.BS6./DVAq9zdSQZZ9VxDTe88u9tnYE9i7ZFohs8aDyl5.6EfVNTO9zILQ/EwDRn3LoXDEKC3fKnJA2UqB78.
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        anycast-mac: 44:38:39:FF:00:AA
        fabric-mac: 00:00:5E:00:01:01
        system-mac: 44:38:39:22:01:7a
      hostname: leaf01
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      BLUE:
        evpn:
          enable: on
          vni:
            '4002': {}
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
                route-export:
                  to-evpn:
                    enable: on
            autonomous-system: 65101
            enable: on
            router-id: 10.10.10.1
      RED:
        evpn:
          enable: on
          vni:
            '4001': {}
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
                route-export:
                  to-evpn:
                    enable: on
            autonomous-system: 65101
            enable: on
            router-id: 10.10.10.1
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
              l2vpn-evpn:
                enable: on
            enable: on
            neighbor:
              peerlink.4094:
                peer-group: underlay
                type: unnumbered
              swp51:
                peer-group: underlay
                type: unnumbered
              swp52:
                peer-group: underlay
                type: unnumbered
              swp53:
                peer-group: underlay
                type: unnumbered
              swp54:
                peer-group: underlay
                type: unnumbered
            peer-group:
              underlay:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
cumulus@leaf02:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
            '30':
              vni:
                '30': {}
    evpn:
      enable: on
    interface:
      bond1:
        bond:
          lacp-bypass: on
          member:
            swp1: {}
          mlag:
            enable: on
            id: 1
        bridge:
          domain:
            br_default:
              access: 10
        link:
          mtu: 9000
        type: bond
      bond2:
        bond:
          lacp-bypass: on
          member:
            swp2: {}
          mlag:
            enable: on
            id: 2
        bridge:
          domain:
            br_default:
              access: 20
        link:
          mtu: 9000
        type: bond
      bond3:
        bond:
          lacp-bypass: on
          member:
            swp3: {}
          mlag:
            enable: on
            id: 3
        bridge:
          domain:
            br_default:
              access: 30
        link:
          mtu: 9000
        type: bond
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.2/32: {}
        type: loopback
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        base-interface: peerlink
        type: sub
        vlan: 4094
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      swp53:
        type: swp
      swp54:
        type: swp
      vlan10:
        ip:
          address:
            10.1.10.3/24: {}
          vrf: RED
          vrr:
            address:
              10.1.10.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 10
      vlan20:
        ip:
          address:
            10.1.20.3/24: {}
          vrf: RED
          vrr:
            address:
              10.1.20.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 20
      vlan30:
        ip:
          address:
            10.1.30.3/24: {}
          vrf: BLUE
          vrr:
            address:
              10.1.30.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 30
    mlag:
      backup:
        10.10.10.1: {}
      enable: on
      init-delay: 10
      peer-ip: linklocal
      priority: 2000
    nve:
      vxlan:
        arp-nd-suppress: on
        enable: on
        mlag:
          shared-address: 10.0.1.12
        source:
          address: 10.10.10.2
    router:
      bgp:
        autonomous-system: 65102
        enable: on
        router-id: 10.10.10.2
      vrr:
        enable: on
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$URHoRipZ3lRn6IXA$WBUfFpy1V1eqywPL.kUzrcJrkSf4hyR/SWYlh0/WcVNVMhAsKiq/uCuXWfwK42hgYJe6NUNSOYVvImaL56Djg1
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        anycast-mac: 44:38:39:FF:00:AA
        fabric-mac: 00:00:5E:00:01:01
        system-mac: 44:38:39:22:01:78
      hostname: leaf02
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      BLUE:
        evpn:
          enable: on
          vni:
            '4002': {}
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
                route-export:
                  to-evpn:
                    enable: on
            autonomous-system: 65102
            enable: on
            router-id: 10.10.10.2
      RED:
        evpn:
          enable: on
          vni:
            '4001': {}
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
                route-export:
                  to-evpn:
                    enable: on
            autonomous-system: 65102
            enable: on
            router-id: 10.10.10.2
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
              l2vpn-evpn:
                enable: on
            enable: on
            neighbor:
              peerlink.4094:
                peer-group: underlay
                type: unnumbered
              swp51:
                peer-group: underlay
                type: unnumbered
              swp52:
                peer-group: underlay
                type: unnumbered
              swp53:
                peer-group: underlay
                type: unnumbered
              swp54:
                peer-group: underlay
                type: unnumbered
            peer-group:
              underlay:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
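The leaf01 and leaf02 startup files above are nearly identical; they differ only in per-switch values such as the loopback address, MLAG backup IP and priority, BGP AS number, and router ID. The following Python is a minimal sketch of a recursive comparison that reports only the paths that differ. It uses two hand-trimmed excerpts of the files as dictionaries (loading the full files needs a YAML parser, which is not assumed here), and `diff_paths` is a hypothetical helper.

```python
leaf01 = {
    "interface": {"lo": {"ip": {"address": {"10.10.10.1/32": {}}}, "type": "loopback"}},
    "mlag": {"backup": {"10.10.10.2": {}}, "init-delay": 10,
             "peer-ip": "linklocal", "priority": 1000},
    "nve": {"vxlan": {"mlag": {"shared-address": "10.0.1.12"},
                      "source": {"address": "10.10.10.1"}}},
    "router": {"bgp": {"autonomous-system": 65101, "router-id": "10.10.10.1"}},
}
leaf02 = {
    "interface": {"lo": {"ip": {"address": {"10.10.10.2/32": {}}}, "type": "loopback"}},
    "mlag": {"backup": {"10.10.10.1": {}}, "init-delay": 10,
             "peer-ip": "linklocal", "priority": 2000},
    "nve": {"vxlan": {"mlag": {"shared-address": "10.0.1.12"},
                      "source": {"address": "10.10.10.2"}}},
    "router": {"bgp": {"autonomous-system": 65102, "router-id": "10.10.10.2"}},
}


def diff_paths(a, b, prefix=""):
    """Yield (path, a_value, b_value) wherever the two trees differ."""
    if isinstance(a, dict) and isinstance(b, dict):
        # Visit the union of keys so entries present on only one side are reported.
        for key in sorted(set(a) | set(b)):
            yield from diff_paths(a.get(key), b.get(key), f"{prefix}/{key}")
    elif a != b:
        yield prefix, a, b


if __name__ == "__main__":
    for path, left, right in diff_paths(leaf01, leaf02):
        print(f"{path}: {left!r} -> {right!r}")
```

Keys that exist on only one side, such as the loopback address, show up with `None` for the switch that lacks them.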
cumulus@leaf03:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
            '30':
              vni:
                '30': {}
    evpn:
      enable: on
    interface:
      bond1:
        bond:
          lacp-bypass: on
          member:
            swp1: {}
          mlag:
            enable: on
            id: 1
        bridge:
          domain:
            br_default:
              access: 10
        link:
          mtu: 9000
        type: bond
      bond2:
        bond:
          lacp-bypass: on
          member:
            swp2: {}
          mlag:
            enable: on
            id: 2
        bridge:
          domain:
            br_default:
              access: 20
        link:
          mtu: 9000
        type: bond
      bond3:
        bond:
          lacp-bypass: on
          member:
            swp3: {}
          mlag:
            enable: on
            id: 3
        bridge:
          domain:
            br_default:
              access: 30
        link:
          mtu: 9000
        type: bond
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.3/32: {}
        type: loopback
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        base-interface: peerlink
        type: sub
        vlan: 4094
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      swp53:
        type: swp
      swp54:
        type: swp
      vlan10:
        ip:
          address:
            10.1.10.4/24: {}
          vrf: RED
          vrr:
            address:
              10.1.10.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 10
      vlan20:
        ip:
          address:
            10.1.20.4/24: {}
          vrf: RED
          vrr:
            address:
              10.1.20.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 20
      vlan30:
        ip:
          address:
            10.1.30.4/24: {}
          vrf: BLUE
          vrr:
            address:
              10.1.30.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 30
    mlag:
      backup:
        10.10.10.4: {}
      enable: on
      init-delay: 10
      peer-ip: linklocal
      priority: 1000
    nve:
      vxlan:
        arp-nd-suppress: on
        enable: on
        mlag:
          shared-address: 10.0.1.34
        source:
          address: 10.10.10.3
    router:
      bgp:
        autonomous-system: 65103
        enable: on
        router-id: 10.10.10.3
      vrr:
        enable: on
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$XVltRxaTi.fSLt5s$dYeWbVtlcwVwhbHl2urx6wFNf/43FEtzhAPWxZeOGWdlPkQvAcqaVV7kxOx4jYWwDc60tG9EFRGoGWl2Q6lpj.
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        anycast-mac: 44:38:39:FF:00:BB
        fabric-mac: 00:00:5E:00:01:01
        system-mac: 44:38:39:22:01:84
      hostname: leaf03
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      BLUE:
        evpn:
          enable: on
          vni:
            '4002': {}
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
                route-export:
                  to-evpn:
                    enable: on
            autonomous-system: 65103
            enable: on
            router-id: 10.10.10.3
      RED:
        evpn:
          enable: on
          vni:
            '4001': {}
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
                route-export:
                  to-evpn:
                    enable: on
            autonomous-system: 65103
            enable: on
            router-id: 10.10.10.3
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
              l2vpn-evpn:
                enable: on
            enable: on
            neighbor:
              peerlink.4094:
                peer-group: underlay
                type: unnumbered
              swp51:
                peer-group: underlay
                type: unnumbered
              swp52:
                peer-group: underlay
                type: unnumbered
              swp53:
                peer-group: underlay
                type: unnumbered
              swp54:
                peer-group: underlay
                type: unnumbered
            peer-group:
              underlay:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
cumulus@leaf04:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
            '30':
              vni:
                '30': {}
    evpn:
      enable: on
    interface:
      bond1:
        bond:
          lacp-bypass: on
          member:
            swp1: {}
          mlag:
            enable: on
            id: 1
        bridge:
          domain:
            br_default:
              access: 10
        link:
          mtu: 9000
        type: bond
      bond2:
        bond:
          lacp-bypass: on
          member:
            swp2: {}
          mlag:
            enable: on
            id: 2
        bridge:
          domain:
            br_default:
              access: 20
        link:
          mtu: 9000
        type: bond
      bond3:
        bond:
          lacp-bypass: on
          member:
            swp3: {}
          mlag:
            enable: on
            id: 3
        bridge:
          domain:
            br_default:
              access: 30
        link:
          mtu: 9000
        type: bond
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.4/32: {}
        type: loopback
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        base-interface: peerlink
        type: sub
        vlan: 4094
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      swp53:
        type: swp
      swp54:
        type: swp
      vlan10:
        ip:
          address:
            10.1.10.5/24: {}
          vrf: RED
          vrr:
            address:
              10.1.10.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 10
      vlan20:
        ip:
          address:
            10.1.20.5/24: {}
          vrf: RED
          vrr:
            address:
              10.1.20.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 20
      vlan30:
        ip:
          address:
            10.1.30.5/24: {}
          vrf: BLUE
          vrr:
            address:
              10.1.30.1/24: {}
            enable: on
            state:
              up: {}
        type: svi
        vlan: 30
    mlag:
      backup:
        10.10.10.3: {}
      enable: on
      init-delay: 10
      peer-ip: linklocal
      priority: 2000
    nve:
      vxlan:
        arp-nd-suppress: on
        enable: on
        mlag:
          shared-address: 10.0.1.34
        source:
          address: 10.10.10.4
    router:
      bgp:
        autonomous-system: 65104
        enable: on
        router-id: 10.10.10.4
      vrr:
        enable: on
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$i6CVk1BB1B6mX6j8$YGSjqfSQuyty2a9nY7BltGrwOnIwjH.hYu254Izy1W7QyqvUat8txjeam2PsNRwxd./mu4Ma7GziBb8wqAfgV0
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        anycast-mac: 44:38:39:FF:00:BB
        fabric-mac: 00:00:5E:00:01:01
        system-mac: 44:38:39:22:01:8a
      hostname: leaf04
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      BLUE:
        evpn:
          enable: on
          vni:
            '4002': {}
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
                route-export:
                  to-evpn:
                    enable: on
            autonomous-system: 65104
            enable: on
            router-id: 10.10.10.4
      RED:
        evpn:
          enable: on
          vni:
            '4001': {}
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
                route-export:
                  to-evpn:
                    enable: on
            autonomous-system: 65104
            enable: on
            router-id: 10.10.10.4
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
              l2vpn-evpn:
                enable: on
            enable: on
            neighbor:
              peerlink.4094:
                peer-group: underlay
                type: unnumbered
              swp51:
                peer-group: underlay
                type: unnumbered
              swp52:
                peer-group: underlay
                type: unnumbered
              swp53:
                peer-group: underlay
                type: unnumbered
              swp54:
                peer-group: underlay
                type: unnumbered
            peer-group:
              underlay:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
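The leaf04 output above gives each SVI its own address plus a VRR gateway address in the same subnet. The following is a minimal sketch, not an NVIDIA tool, of how you might sanity-check that relationship offline. The `SVIS` dictionary and the helper names `check_svi` and `check_all` are hypothetical; the addresses are copied by hand from the output above.

```python
import ipaddress

# SVI and VRR addresses copied from the leaf04 startup.yaml output above.
# The dictionary layout is an assumption made for this sketch only.
SVIS = {
    "vlan10": {"address": "10.1.10.5/24", "vrr": "10.1.10.1/24", "vrf": "RED"},
    "vlan20": {"address": "10.1.20.5/24", "vrr": "10.1.20.1/24", "vrf": "RED"},
    "vlan30": {"address": "10.1.30.5/24", "vrr": "10.1.30.1/24", "vrf": "BLUE"},
}


def check_svi(name, svi):
    """Return a list of problems for one SVI; an empty list means it looks consistent."""
    problems = []
    addr = ipaddress.ip_interface(svi["address"])
    vrr = ipaddress.ip_interface(svi["vrr"])
    # The VRR gateway should sit in the same subnet as the SVI address.
    if addr.network != vrr.network:
        problems.append(f"{name}: VRR address {vrr} is outside {addr.network}")
    # The VRR gateway is a shared address, so it should differ from the SVI address.
    if addr.ip == vrr.ip:
        problems.append(f"{name}: VRR address duplicates the SVI address")
    return problems


def check_all(svis):
    """Run check_svi over every SVI and collect all problems."""
    problems = []
    for name, svi in svis.items():
        problems.extend(check_svi(name, svi))
    return problems


if __name__ == "__main__":
    for line in check_all(SVIS) or ["all SVIs consistent"]:
        print(line)
```

The sketch only compares addresses; it does not read the switch configuration or validate VRF membership.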
cumulus@spine01:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    interface:
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.101/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp4:
        type: swp
      swp5:
        type: swp
      swp6:
        type: swp
    router:
      bgp:
        autonomous-system: 65199
        enable: on
        router-id: 10.10.10.101
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$XypepWzsnufsZdyc$.wmJPhwniCvx8AyJsKvXbFv8Ob/xkPCqEH0Gf8tqtNCOo1qFNLJSmsOEk9PUb6cG1sy3RAVXZq93CqEUxhYGy.
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        system-mac: 44:38:39:22:01:82
      hostname: spine01
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
              l2vpn-evpn:
                enable: on
            enable: on
            neighbor:
              swp1:
                peer-group: underlay
                type: unnumbered
              swp2:
                peer-group: underlay
                type: unnumbered
              swp3:
                peer-group: underlay
                type: unnumbered
              swp4:
                peer-group: underlay
                type: unnumbered
              swp5:
                peer-group: underlay
                type: unnumbered
              swp6:
                peer-group: underlay
                type: unnumbered
            path-selection:
              multipath:
                aspath-ignore: on
            peer-group:
              underlay:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
cumulus@spine02:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    interface:
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.102/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp4:
        type: swp
      swp5:
        type: swp
      swp6:
        type: swp
    router:
      bgp:
        autonomous-system: 65199
        enable: on
        router-id: 10.10.10.102
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$NSPZYaTpJLKabVCI$QyQ10dRnNz/0nf.4FVGojaJBR0wZgSADKpHB7On3bp8t/4w.1tBJIpe8tRwU4Nk3v8hXjhPxKWwscqnrIH01e1
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        system-mac: 44:38:39:22:01:92
      hostname: spine02
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
              l2vpn-evpn:
                enable: on
            enable: on
            neighbor:
              swp1:
                peer-group: underlay
                type: unnumbered
              swp2:
                peer-group: underlay
                type: unnumbered
              swp3:
                peer-group: underlay
                type: unnumbered
              swp4:
                peer-group: underlay
                type: unnumbered
              swp5:
                peer-group: underlay
                type: unnumbered
              swp6:
                peer-group: underlay
                type: unnumbered
            path-selection:
              multipath:
                aspath-ignore: on
            peer-group:
              underlay:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
cumulus@spine03:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    interface:
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.103/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp4:
        type: swp
      swp5:
        type: swp
      swp6:
        type: swp
    router:
      bgp:
        autonomous-system: 65199
        enable: on
        router-id: 10.10.10.103
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$jaJGX2fU4UkRtS4R$5Gvld7RRH5onAa/bNQRv3Y5wZJc.ap.kscKquWC.CtV2sEJp.SIqxzudjFbWe1PMElkkxM8Kjd3cdSOWEs8z61
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        system-mac: 44:38:39:22:01:70
      hostname: spine03
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
              l2vpn-evpn:
                enable: on
            enable: on
            neighbor:
              swp1:
                peer-group: underlay
                type: unnumbered
              swp2:
                peer-group: underlay
                type: unnumbered
              swp3:
                peer-group: underlay
                type: unnumbered
              swp4:
                peer-group: underlay
                type: unnumbered
              swp5:
                peer-group: underlay
                type: unnumbered
              swp6:
                peer-group: underlay
                type: unnumbered
            path-selection:
              multipath:
                aspath-ignore: on
            peer-group:
              underlay:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
cumulus@spine04:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    interface:
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.104/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp4:
        type: swp
      swp5:
        type: swp
      swp6:
        type: swp
    router:
      bgp:
        autonomous-system: 65199
        enable: on
        router-id: 10.10.10.104
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$HbweixOuqYxzQLiD$HJtHLRnP0aEoqAGXYmz0Y8zWIe13bzwMYAkXdoG7uMgLMkpr6OKN.qRyttO1g6DZk6HplX3xV14T2CsJKH3qf0
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        system-mac: 44:38:39:22:01:6c
      hostname: spine04
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
              l2vpn-evpn:
                enable: on
            enable: on
            neighbor:
              swp1:
                peer-group: underlay
                type: unnumbered
              swp2:
                peer-group: underlay
                type: unnumbered
              swp3:
                peer-group: underlay
                type: unnumbered
              swp4:
                peer-group: underlay
                type: unnumbered
              swp5:
                peer-group: underlay
                type: unnumbered
              swp6:
                peer-group: underlay
                type: unnumbered
            path-selection:
              multipath:
                aspath-ignore: on
            peer-group:
              underlay:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
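In the four spine outputs above, every spine uses autonomous system 65199 and a router ID that matches its loopback address. The following is a minimal sketch, under the assumption that you copy those values into a dictionary by hand, of how you might confirm both properties. The `SPINES` structure and the `summarize` helper are hypothetical names used only for illustration.

```python
import ipaddress

# Values copied from the spine01 through spine04 startup.yaml outputs above.
SPINES = {
    "spine01": {"loopback": "10.10.10.101/32", "asn": 65199, "router_id": "10.10.10.101"},
    "spine02": {"loopback": "10.10.10.102/32", "asn": 65199, "router_id": "10.10.10.102"},
    "spine03": {"loopback": "10.10.10.103/32", "asn": 65199, "router_id": "10.10.10.103"},
    "spine04": {"loopback": "10.10.10.104/32", "asn": 65199, "router_id": "10.10.10.104"},
}


def summarize(spines):
    """Report the shared ASN (or None), router ID uniqueness, and loopback mismatches."""
    asns = {s["asn"] for s in spines.values()}
    router_ids = [s["router_id"] for s in spines.values()]
    # A spine is "mismatched" when its router ID is not its loopback IP.
    mismatched = sorted(
        name
        for name, s in spines.items()
        if str(ipaddress.ip_interface(s["loopback"]).ip) != s["router_id"]
    )
    return {
        "shared_asn": next(iter(asns)) if len(asns) == 1 else None,
        "router_ids_unique": len(set(router_ids)) == len(router_ids),
        "router_id_mismatches": mismatched,
    }


if __name__ == "__main__":
    print(summarize(SPINES))
```

Because the `underlay` peer group uses `remote-as: external`, a check like this is mainly useful for spotting a copy-and-paste error in a router ID or AS number before you apply the configuration.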
cumulus@border01:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    bridge:
      domain:
        br_default:
          vlan:
            101-102: {}
    evpn:
      enable: on
    interface:
      bond3:
        bond:
          lacp-bypass: on
          member:
            swp3: {}
          mlag:
            enable: on
            id: 1
        bridge:
          domain:
            br_default:
              vlan:
                101-102: {}
        link:
          mtu: 9000
        type: bond
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.63/32: {}
        type: loopback
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        base-interface: peerlink
        type: sub
        vlan: 4094
      swp3:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      swp53:
        type: swp
      swp54:
        type: swp
      vlan101:
        ip:
          address:
            10.1.101.64/24: {}
          vrf: RED
          vrr:
            address:
              10.1.101.1/24: {}
            enable: on
            mac-address: 00:00:00:00:00:01
            state:
              up: {}
        type: svi
        vlan: 101
      vlan102:
        ip:
          address:
            10.1.102.64/24: {}
          vrf: BLUE
          vrr:
            address:
              10.1.102.1/24: {}
            enable: on
            mac-address: 00:00:00:00:00:02
            state:
              up: {}
        type: svi
        vlan: 102
    mlag:
      backup:
        10.10.10.64: {}
      enable: on
      init-delay: 10
      mac-address: 44:38:39:FF:00:FF
      peer-ip: linklocal
      priority: 1000
    nve:
      vxlan:
        arp-nd-suppress: on
        enable: on
        mlag:
          shared-address: 10.0.1.255
        source:
          address: 10.10.10.63
    router:
      bgp:
        autonomous-system: 65253
        enable: on
        router-id: 10.10.10.63
      vrr:
        enable: on
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$/h/sO1z7jrNI6NUy$UlAJkdi7laIJJsInFey8Cz1tb5c41i706uWJXhhyXPrx431ccsJASRcEerEm1lRRXdnOjbbrGMRpYpgxJRWoG.
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        anycast-mac: 44:38:39:FF:00:FF
        system-mac: 44:38:39:22:01:74
      hostname: border01
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      BLUE:
        evpn:
          enable: on
          vni:
            '4002': {}
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  static:
                    enable: on
                route-export:
                  to-evpn:
                    enable: on
            autonomous-system: 65253
            enable: on
            router-id: 10.10.10.63
          static:
            10.1.10.0/24:
              address-family: ipv4-unicast
              via:
                10.1.102.4:
                  type: ipv4-address
            10.1.20.0/24:
              address-family: ipv4-unicast
              via:
                10.1.102.4:
                  type: ipv4-address
      RED:
        evpn:
          enable: on
          vni:
            '4001': {}
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  static:
                    enable: on
                route-export:
                  to-evpn:
                    enable: on
            autonomous-system: 65253
            enable: on
            router-id: 10.10.10.63
          static:
            10.1.30.0/24:
              address-family: ipv4-unicast
              via:
                10.1.101.4:
                  type: ipv4-address
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
              l2vpn-evpn:
                enable: on
            enable: on
            neighbor:
              peerlink.4094:
                peer-group: underlay
                type: unnumbered
              swp51:
                peer-group: underlay
                type: unnumbered
              swp52:
                peer-group: underlay
                type: unnumbered
              swp53:
                peer-group: underlay
                type: unnumbered
              swp54:
                peer-group: underlay
                type: unnumbered
            peer-group:
              underlay:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
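The border01 output above defines static routes in the RED and BLUE VRFs whose next hops sit on the vlan101 and vlan102 subnets. The following is a minimal sketch, assuming the values are copied by hand into Python structures, that checks whether each next hop falls inside the subnet of an SVI in the same VRF. `SVIS`, `STATIC_ROUTES`, and `resolving_svi` are hypothetical names for this example.

```python
import ipaddress

# SVI addresses and static routes copied from the border01 startup.yaml output above.
SVIS = {
    "vlan101": {"address": "10.1.101.64/24", "vrf": "RED"},
    "vlan102": {"address": "10.1.102.64/24", "vrf": "BLUE"},
}
STATIC_ROUTES = [
    {"vrf": "BLUE", "prefix": "10.1.10.0/24", "via": "10.1.102.4"},
    {"vrf": "BLUE", "prefix": "10.1.20.0/24", "via": "10.1.102.4"},
    {"vrf": "RED", "prefix": "10.1.30.0/24", "via": "10.1.101.4"},
]


def resolving_svi(route, svis):
    """Return the SVI in the route's VRF whose subnet contains the next hop, or None."""
    next_hop = ipaddress.ip_address(route["via"])
    for name, svi in svis.items():
        if svi["vrf"] != route["vrf"]:
            continue  # an SVI in another VRF cannot resolve this next hop
        if next_hop in ipaddress.ip_interface(svi["address"]).network:
            return name
    return None


if __name__ == "__main__":
    for route in STATIC_ROUTES:
        svi = resolving_svi(route, SVIS)
        status = f"via {svi}" if svi else "next hop not on a connected SVI"
        print(f'{route["vrf"]} {route["prefix"]}: {status}')
```

The check is purely arithmetic on the addresses; it says nothing about whether the next-hop device is actually reachable.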
cumulus@border02:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    bridge:
      domain:
        br_default:
          vlan:
            101-102: {}
    evpn:
      enable: on
    interface:
      bond3:
        bond:
          lacp-bypass: on
          member:
            swp3: {}
          mlag:
            enable: on
            id: 1
        bridge:
          domain:
            br_default:
              vlan:
                101-102: {}
        link:
          mtu: 9000
        type: bond
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.64/32: {}
        type: loopback
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        base-interface: peerlink
        type: sub
        vlan: 4094
      swp3:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      swp53:
        type: swp
      swp54:
        type: swp
      vlan101:
        ip:
          address:
            10.1.101.65/24: {}
          vrf: RED
          vrr:
            address:
              10.1.101.1/24: {}
            enable: on
            mac-address: 00:00:00:00:00:01
            state:
              up: {}
        type: svi
        vlan: 101
      vlan102:
        ip:
          address:
            10.1.102.65/24: {}
          vrf: BLUE
          vrr:
            address:
              10.1.102.1/24: {}
            enable: on
            mac-address: 00:00:00:00:00:02
            state:
              up: {}
        type: svi
        vlan: 102
    mlag:
      backup:
        10.10.10.63: {}
      enable: on
      init-delay: 10
      mac-address: 44:38:39:FF:00:FF
      peer-ip: linklocal
      priority: 2000
    nve:
      vxlan:
        arp-nd-suppress: on
        enable: on
        mlag:
          shared-address: 10.0.1.255
        source:
          address: 10.10.10.64
    router:
      bgp:
        autonomous-system: 65254
        enable: on
        router-id: 10.10.10.64
      vrr:
        enable: on
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$jp44/edEqkxrDwRK$C2zrTa/4pjFw/ZHMsseuAQJDpyLGHms5j.9R/piRaU0b2/rFSc0GmlqikqJoftzl6awlTfULcytVUFjB.APx30
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        anycast-mac: 44:38:39:FF:00:FF
        system-mac: 44:38:39:22:01:7c
      hostname: border02
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      BLUE:
        evpn:
          enable: on
          vni:
            '4002': {}
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  static:
                    enable: on
                route-export:
                  to-evpn:
                    enable: on
            autonomous-system: 65254
            enable: on
            router-id: 10.10.10.64
          static:
            10.1.10.0/24:
              address-family: ipv4-unicast
              via:
                10.1.102.4:
                  type: ipv4-address
            10.1.20.0/24:
              address-family: ipv4-unicast
              via:
                10.1.102.4:
                  type: ipv4-address
      RED:
        evpn:
          enable: on
          vni:
            '4001': {}
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  static:
                    enable: on
                route-export:
                  to-evpn:
                    enable: on
            autonomous-system: 65254
            enable: on
            router-id: 10.10.10.64
          static:
            10.1.30.0/24:
              address-family: ipv4-unicast
              via:
                10.1.101.4:
                  type: ipv4-address
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
              l2vpn-evpn:
                enable: on
            enable: on
            neighbor:
              peerlink.4094:
                peer-group: underlay
                type: unnumbered
              swp51:
                peer-group: underlay
                type: unnumbered
              swp52:
                peer-group: underlay
                type: unnumbered
              swp53:
                peer-group: underlay
                type: unnumbered
              swp54:
                peer-group: underlay
                type: unnumbered
            peer-group:
              underlay:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
cumulus@leaf01:mgmt:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.1/32
    clagd-vxlan-anycast-ip 10.0.1.12
    vxlan-local-tunnelip 10.10.10.1
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto RED
iface RED
    vrf-table auto
auto BLUE
iface BLUE
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto swp53
iface swp53
auto swp54
iface swp54
auto bond1
iface bond1
    mtu 9000
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 1
    bridge-access 10
auto bond2
iface bond2
    mtu 9000
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 2
    bridge-access 20
auto bond3
iface bond3
    mtu 9000
    bond-slaves swp3
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 3
    bridge-access 30
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-priority 1000
    clagd-backup-ip 10.10.10.2
    clagd-sys-mac 44:38:39:FF:00:AA
    clagd-args --initDelay 10
auto vlan10
iface vlan10
    address 10.1.10.2/24
    address-virtual 00:00:5E:00:01:01 10.1.10.1/24
    hwaddress 44:38:39:22:01:b1
    vrf RED
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    address 10.1.20.2/24
    address-virtual 00:00:5E:00:01:01 10.1.20.1/24
    hwaddress 44:38:39:22:01:b1
    vrf RED
    vlan-raw-device br_default
    vlan-id 20
auto vlan30
iface vlan30
    address 10.1.30.2/24
    address-virtual 00:00:5E:00:01:01 10.1.30.1/24
    hwaddress 44:38:39:22:01:b1
    vrf BLUE
    vlan-raw-device br_default
    vlan-id 30
auto vlan4024_l3
iface vlan4024_l3
    vrf RED
    vlan-raw-device br_default
    address-virtual 44:38:39:FF:00:AA
    vlan-id 4024
auto vlan4036_l3
iface vlan4036_l3
    vrf BLUE
    vlan-raw-device br_default
    address-virtual 44:38:39:FF:00:AA
    vlan-id 4036
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20 30=30 4024=4001 4036=4002
    bridge-vids 10 20 30 4024 4036
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond1 bond2 bond3 peerlink vxlan48
    hwaddress 44:38:39:22:01:b1
    bridge-vlan-aware yes
    bridge-vids 10 20 30
    bridge-pvid 1
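The `bridge-vlan-vni-map` line on the single VXLAN device (`vxlan48`) pairs each VLAN with a VNI, including the layer 3 VNIs (4001 and 4002) used by the RED and BLUE VRFs. As a rough illustration only, the following Python sketch parses such a line and checks that every VLAN in the device's `bridge-vids` list has a VNI mapping. The helper names `parse_vlan_vni_map` and `unmapped_vids` are hypothetical and are not part of Cumulus Linux or ifupdown2.

```python
def parse_vlan_vni_map(value):
    """Turn '10=10 4024=4001' into {10: 10, 4024: 4001}."""
    mapping = {}
    for pair in value.split():
        vlan, vni = pair.split("=")
        mapping[int(vlan)] = int(vni)
    return mapping


def unmapped_vids(bridge_vids, vlan_vni_map):
    """Return VLAN IDs listed in bridge-vids that have no VNI mapping."""
    vids = {int(v) for v in bridge_vids.split()}
    return sorted(vids - set(vlan_vni_map))


# Values copied from the leaf01 vxlan48 stanza.
vni_map = parse_vlan_vni_map("10=10 20=20 30=30 4024=4001 4036=4002")
missing = unmapped_vids("10 20 30 4024 4036", vni_map)
print(vni_map[4024], vni_map[4036], missing)  # prints: 4001 4002 []
```

An empty list means that every VLAN carried on the VXLAN device maps to a VNI. This is a consistency check on the text of the file only; it does not query the running switch.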
cumulus@leaf02:mgmt:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.2/32
    clagd-vxlan-anycast-ip 10.0.1.12
    vxlan-local-tunnelip 10.10.10.2
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto RED
iface RED
    vrf-table auto
auto BLUE
iface BLUE
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto swp53
iface swp53
auto swp54
iface swp54
auto bond1
iface bond1
    mtu 9000
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 1
    bridge-access 10
auto bond2
iface bond2
    mtu 9000
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 2
    bridge-access 20
auto bond3
iface bond3
    mtu 9000
    bond-slaves swp3
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 3
    bridge-access 30
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-priority 2000
    clagd-backup-ip 10.10.10.1
    clagd-sys-mac 44:38:39:FF:00:AA
    clagd-args --initDelay 10
auto vlan10
iface vlan10
    address 10.1.10.3/24
    address-virtual 00:00:5E:00:01:01 10.1.10.1/24
    hwaddress 44:38:39:22:01:af
    vrf RED
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    address 10.1.20.3/24
    address-virtual 00:00:5E:00:01:01 10.1.20.1/24
    hwaddress 44:38:39:22:01:af
    vrf RED
    vlan-raw-device br_default
    vlan-id 20
auto vlan30
iface vlan30
    address 10.1.30.3/24
    address-virtual 00:00:5E:00:01:01 10.1.30.1/24
    hwaddress 44:38:39:22:01:af
    vrf BLUE
    vlan-raw-device br_default
    vlan-id 30
auto vlan4024_l3
iface vlan4024_l3
    vrf RED
    vlan-raw-device br_default
    address-virtual 44:38:39:FF:00:AA
    vlan-id 4024
auto vlan4036_l3
iface vlan4036_l3
    vrf BLUE
    vlan-raw-device br_default
    address-virtual 44:38:39:FF:00:AA
    vlan-id 4036
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20 30=30 4024=4001 4036=4002
    bridge-vids 10 20 30 4024 4036
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond1 bond2 bond3 peerlink vxlan48
    hwaddress 44:38:39:22:01:af
    bridge-vlan-aware yes
    bridge-vids 10 20 30
    bridge-pvid 1
cumulus@leaf03:mgmt:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.3/32
    clagd-vxlan-anycast-ip 10.0.1.34
    vxlan-local-tunnelip 10.10.10.3
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto RED
iface RED
    vrf-table auto
auto BLUE
iface BLUE
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto swp53
iface swp53
auto swp54
iface swp54
auto bond1
iface bond1
    mtu 9000
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 1
    bridge-access 10
auto bond2
iface bond2
    mtu 9000
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 2
    bridge-access 20
auto bond3
iface bond3
    mtu 9000
    bond-slaves swp3
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 3
    bridge-access 30
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-priority 2000
    clagd-backup-ip 10.10.10.4
    clagd-sys-mac 44:38:39:FF:00:BB
    clagd-args --initDelay 10
auto vlan10
iface vlan10
    address 10.1.10.4/24
    address-virtual 00:00:5E:00:01:01 10.1.10.1/24
    hwaddress 44:38:39:22:01:bb
    vrf RED
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    address 10.1.20.4/24
    address-virtual 00:00:5E:00:01:01 10.1.20.1/24
    hwaddress 44:38:39:22:01:bb
    vrf RED
    vlan-raw-device br_default
    vlan-id 20
auto vlan30
iface vlan30
    address 10.1.30.4/24
    address-virtual 00:00:5E:00:01:01 10.1.30.1/24
    hwaddress 44:38:39:22:01:bb
    vrf BLUE
    vlan-raw-device br_default
    vlan-id 30
auto vlan4024_l3
iface vlan4024_l3
    vrf RED
    vlan-raw-device br_default
    address-virtual 44:38:39:FF:00:BB
    vlan-id 4024
auto vlan4036_l3
iface vlan4036_l3
    vrf BLUE
    vlan-raw-device br_default
    address-virtual 44:38:39:FF:00:BB
    vlan-id 4036
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20 30=30 4024=4001 4036=4002
    bridge-vids 10 20 30 4024 4036
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond1 bond2 bond3 peerlink vxlan48
    hwaddress 44:38:39:22:01:bb
    bridge-vlan-aware yes
    bridge-vids 10 20 30
    bridge-pvid 1
cumulus@leaf04:mgmt:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.4/32
    clagd-vxlan-anycast-ip 10.0.1.34
    vxlan-local-tunnelip 10.10.10.4
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto RED
iface RED
    vrf-table auto
auto BLUE
iface BLUE
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto swp53
iface swp53
auto swp54
iface swp54
auto bond1
iface bond1
    mtu 9000
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 1
    bridge-access 10
auto bond2
iface bond2
    mtu 9000
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 2
    bridge-access 20
auto bond3
iface bond3
    mtu 9000
    bond-slaves swp3
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 3
    bridge-access 30
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-priority 2000
    clagd-backup-ip 10.10.10.3
    clagd-sys-mac 44:38:39:FF:00:BB
    clagd-args --initDelay 10
auto vlan10
iface vlan10
    address 10.1.10.5/24
    address-virtual 00:00:5E:00:01:01 10.1.10.1/24
    hwaddress 44:38:39:22:01:c1
    vrf RED
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    address 10.1.20.5/24
    address-virtual 00:00:5E:00:01:01 10.1.20.1/24
    hwaddress 44:38:39:22:01:c1
    vrf RED
    vlan-raw-device br_default
    vlan-id 20
auto vlan30
iface vlan30
    address 10.1.30.5/24
    address-virtual 00:00:5E:00:01:01 10.1.30.1/24
    hwaddress 44:38:39:22:01:c1
    vrf BLUE
    vlan-raw-device br_default
    vlan-id 30
auto vlan4024_l3
iface vlan4024_l3
    vrf RED
    vlan-raw-device br_default
    address-virtual 44:38:39:FF:00:BB
    vlan-id 4024
auto vlan4036_l3
iface vlan4036_l3
    vrf BLUE
    vlan-raw-device br_default
    address-virtual 44:38:39:FF:00:BB
    vlan-id 4036
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20 30=30 4024=4001 4036=4002
    bridge-vids 10 20 30 4024 4036
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond1 bond2 bond3 peerlink vxlan48
    hwaddress 44:38:39:22:01:c1
    bridge-vlan-aware yes
    bridge-vids 10 20 30
    bridge-pvid 1
cumulus@spine01:mgmt:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.101/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp4
iface swp4
auto swp5
iface swp5
auto swp6
iface swp6
cumulus@spine02:mgmt:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.102/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp4
iface swp4
auto swp5
iface swp5
auto swp6
iface swp6
cumulus@spine03:mgmt:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.103/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp4
iface swp4
auto swp5
iface swp5
auto swp6
iface swp6
cumulus@spine04:mgmt:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.104/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp4
iface swp4
auto swp5
iface swp5
auto swp6
iface swp6
cumulus@border01:mgmt:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.63/32
    clagd-vxlan-anycast-ip 10.0.1.255
    vxlan-local-tunnelip 10.10.10.63
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto RED
iface RED
    vrf-table auto
auto BLUE
iface BLUE
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp3
iface swp3
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto swp53
iface swp53
auto swp54
iface swp54
auto bond3
iface bond3
    mtu 9000
    bond-slaves swp3
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 1
    bridge-vids 101 102
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-priority 1000
    clagd-backup-ip 10.10.10.64
    clagd-sys-mac 44:38:39:FF:00:FF
    clagd-args --initDelay 10
auto vlan101
iface vlan101
    address 10.1.101.64/24
    address-virtual 00:00:5E:00:01:01 10.1.101.1/24
    hwaddress 44:38:39:22:01:ab
    vrf RED
    vlan-raw-device br_default
    vlan-id 101
auto vlan102
iface vlan102
    address 10.1.102.64/24
    address-virtual 00:00:5E:00:01:01 10.1.102.1/24
    hwaddress 44:38:39:22:01:ab
    vrf BLUE
    vlan-raw-device br_default
    vlan-id 102
auto vlan4024_l3
iface vlan4024_l3
    vrf RED
    vlan-raw-device br_default
    address-virtual 44:38:39:FF:00:FF
    vlan-id 4024
auto vlan4036_l3
iface vlan4036_l3
    vrf BLUE
    vlan-raw-device br_default
    address-virtual 44:38:39:FF:00:FF
    vlan-id 4036
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 4024=4001 4036=4002
    bridge-vids 4024 4036
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond3 peerlink vxlan48
    hwaddress 44:38:39:22:01:ab
    bridge-vlan-aware yes
    bridge-vids 101 102
    bridge-pvid 1
cumulus@border02:mgmt:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.64/32
    clagd-vxlan-anycast-ip 10.0.1.255
    vxlan-local-tunnelip 10.10.10.64
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto RED
iface RED
    vrf-table auto
auto BLUE
iface BLUE
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp3
iface swp3
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto swp53
iface swp53
auto swp54
iface swp54
auto bond3
iface bond3
    mtu 9000
    bond-slaves swp3
    bond-mode 802.3ad
    bond-lacp-bypass-allow yes
    clag-id 1
    bridge-vids 101 102
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-priority 2000
    clagd-backup-ip 10.10.10.63
    clagd-sys-mac 44:38:39:FF:00:FF
    clagd-args --initDelay 10
auto vlan101
iface vlan101
    address 10.1.101.65/24
    address-virtual 00:00:5E:00:01:01 10.1.101.1/24
    hwaddress 44:38:39:22:01:b3
    vrf RED
    vlan-raw-device br_default
    vlan-id 101
auto vlan102
iface vlan102
    address 10.1.102.65/24
    address-virtual 00:00:5E:00:01:01 10.1.102.1/24
    hwaddress 44:38:39:22:01:b3
    vrf BLUE
    vlan-raw-device br_default
    vlan-id 102
auto vlan4024_l3
iface vlan4024_l3
    vrf RED
    vlan-raw-device br_default
    address-virtual 44:38:39:FF:00:FF
    vlan-id 4024
auto vlan4036_l3
iface vlan4036_l3
    vrf BLUE
    vlan-raw-device br_default
    address-virtual 44:38:39:FF:00:FF
    vlan-id 4036
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 4024=4001 4036=4002
    bridge-vids 4024 4036
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond3 peerlink vxlan48
    hwaddress 44:38:39:22:01:b3
    bridge-vlan-aware yes
    bridge-vids 101 102
    bridge-pvid 1
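In the /etc/network/interfaces listings, the two switches of each MLAG pair share the same `clagd-sys-mac` and `clagd-vxlan-anycast-ip`, and each switch's `clagd-backup-ip` is the loopback address of its peer. The following Python sketch shows one possible way to check those three relationships from the file text. The parser is a simplified assumption that handles only the flat `iface` stanzas shown in these listings, and the names `parse_interfaces` and `mlag_pair_issues` are hypothetical helpers.

```python
def parse_interfaces(text):
    """Parse ifupdown2-style text into {iface: {attribute: [values]}}."""
    stanzas, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "auto ", "...")):
            continue
        if line.startswith("iface "):
            current = stanzas.setdefault(line.split()[1], {})
        elif current is not None:
            key, _, value = line.partition(" ")
            current.setdefault(key, []).append(value.strip())
    return stanzas


def mlag_pair_issues(a, b):
    """Compare two parsed peers; return a list of readable mismatches."""
    issues = []
    for iface, key in (("peerlink.4094", "clagd-sys-mac"),
                       ("lo", "clagd-vxlan-anycast-ip")):
        if a[iface][key] != b[iface][key]:
            issues.append(f"{key} differs")
    # In these listings, each backup IP is the other peer's loopback address.
    for me, peer in ((a, b), (b, a)):
        backup = me["peerlink.4094"]["clagd-backup-ip"][0]
        peer_lo = peer["lo"]["address"][0].split("/")[0]
        if backup != peer_lo:
            issues.append(f"backup IP {backup} is not peer loopback {peer_lo}")
    return issues


# Abbreviated stanzas using the leaf01 and leaf02 values.
LEAF01 = """
auto lo
iface lo inet loopback
    address 10.10.10.1/32
    clagd-vxlan-anycast-ip 10.0.1.12
auto peerlink.4094
iface peerlink.4094
    clagd-backup-ip 10.10.10.2
    clagd-sys-mac 44:38:39:FF:00:AA
"""
LEAF02 = LEAF01.replace("10.10.10.1/32", "10.10.10.2/32").replace(
    "clagd-backup-ip 10.10.10.2", "clagd-backup-ip 10.10.10.1")

print(mlag_pair_issues(parse_interfaces(LEAF01), parse_interfaces(LEAF02)))  # prints: []
```

An empty list means the pair is consistent on these three attributes. The sketch does not compare `clagd-priority` or any other MLAG setting.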
cumulus@leaf01:mgmt:~$ sudo cat /etc/frr/frr.conf
...
vrf BLUE
vni 4002
exit-vrf
vrf RED
vni 4001
exit-vrf
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65101 vrf default
bgp router-id 10.10.10.1
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor peerlink.4094 interface peer-group underlay
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 capability extended-nexthop
neighbor swp51 interface peer-group underlay
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface peer-group underlay
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
neighbor swp53 interface peer-group underlay
neighbor swp53 timers 3 9
neighbor swp53 timers connect 10
neighbor swp53 advertisement-interval 0
neighbor swp53 capability extended-nexthop
neighbor swp54 interface peer-group underlay
neighbor swp54 timers 3 9
neighbor swp54 timers connect 10
neighbor swp54 advertisement-interval 0
neighbor swp54 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
! end of router bgp 65101 vrf default
router bgp 65101 vrf RED
bgp router-id 10.10.10.1
timers bgp 3 9
bgp deterministic-med
! Neighbors
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
exit-address-family
address-family l2vpn evpn
advertise ipv4 unicast
exit-address-family
! end of router bgp 65101 vrf RED
router bgp 65101 vrf BLUE
bgp router-id 10.10.10.1
timers bgp 3 9
bgp deterministic-med
! Neighbors
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
exit-address-family
address-family l2vpn evpn
advertise ipv4 unicast
exit-address-family
! end of router bgp 65101 vrf BLUE
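In the frr.conf listing above, BGP neighbors are defined and activated only in the default VRF instance; the RED and BLUE instances have no neighbors and only advertise IPv4 unicast routes into EVPN. The following Python sketch is one way to summarize which neighbors are activated in each VRF and address family from the file text. It assumes the flat, unindented layout shown in these listings, and `activated_neighbors` is a hypothetical helper, not an FRR or Cumulus Linux tool.

```python
import re


def activated_neighbors(frr_text):
    """Map (vrf, address-family) -> list of neighbors with 'activate'."""
    result, vrf, family = {}, None, None
    for raw in frr_text.splitlines():
        line = raw.strip()
        m = re.match(r"router bgp \d+(?: vrf (\S+))?$", line)
        if m:
            # A new BGP instance starts; no address family is open yet.
            vrf, family = m.group(1) or "default", None
        elif line.startswith("address-family "):
            family = line[len("address-family "):]
            result.setdefault((vrf, family), [])
        elif line == "exit-address-family":
            family = None
        elif family and line.startswith("neighbor ") and line.endswith(" activate"):
            result[(vrf, family)].append(line.split()[1])
    return result


# Abbreviated sample in the same layout as the leaf01 listing.
SAMPLE = """
router bgp 65101 vrf default
neighbor underlay peer-group
address-family ipv4 unicast
neighbor swp51 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor swp51 activate
exit-address-family
router bgp 65101 vrf RED
address-family l2vpn evpn
advertise ipv4 unicast
exit-address-family
"""

for key, neighbors in activated_neighbors(SAMPLE).items():
    print(key, neighbors)
```

For the sample, the RED instance yields an empty neighbor list, which matches the listing: the tenant VRF instances rely on the default instance for all peering.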
cumulus@leaf02:mgmt:~$ sudo cat /etc/frr/frr.conf
...
vrf BLUE
vni 4002
exit-vrf
vrf RED
vni 4001
exit-vrf
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65102 vrf default
bgp router-id 10.10.10.2
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor peerlink.4094 interface peer-group underlay
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 capability extended-nexthop
neighbor swp51 interface peer-group underlay
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface peer-group underlay
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
neighbor swp53 interface peer-group underlay
neighbor swp53 timers 3 9
neighbor swp53 timers connect 10
neighbor swp53 advertisement-interval 0
neighbor swp53 capability extended-nexthop
neighbor swp54 interface peer-group underlay
neighbor swp54 timers 3 9
neighbor swp54 timers connect 10
neighbor swp54 advertisement-interval 0
neighbor swp54 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
! end of router bgp 65102 vrf default
router bgp 65102 vrf RED
bgp router-id 10.10.10.2
timers bgp 3 9
bgp deterministic-med
! Neighbors
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
exit-address-family
address-family l2vpn evpn
advertise ipv4 unicast
exit-address-family
! end of router bgp 65102 vrf RED
router bgp 65102 vrf BLUE
bgp router-id 10.10.10.2
timers bgp 3 9
bgp deterministic-med
! Neighbors
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
exit-address-family
address-family l2vpn evpn
advertise ipv4 unicast
exit-address-family
! end of router bgp 65102 vrf BLUE
cumulus@leaf03:mgmt:~$ sudo cat /etc/frr/frr.conf
...
vrf BLUE
vni 4002
exit-vrf
vrf RED
vni 4001
exit-vrf
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65103 vrf default
bgp router-id 10.10.10.3
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor peerlink.4094 interface peer-group underlay
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 capability extended-nexthop
neighbor swp51 interface peer-group underlay
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface peer-group underlay
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
neighbor swp53 interface peer-group underlay
neighbor swp53 timers 3 9
neighbor swp53 timers connect 10
neighbor swp53 advertisement-interval 0
neighbor swp53 capability extended-nexthop
neighbor swp54 interface peer-group underlay
neighbor swp54 timers 3 9
neighbor swp54 timers connect 10
neighbor swp54 advertisement-interval 0
neighbor swp54 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
! end of router bgp 65103 vrf default
router bgp 65103 vrf RED
bgp router-id 10.10.10.3
timers bgp 3 9
bgp deterministic-med
! Neighbors
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
exit-address-family
address-family l2vpn evpn
advertise ipv4 unicast
exit-address-family
! end of router bgp 65103 vrf RED
router bgp 65103 vrf BLUE
bgp router-id 10.10.10.3
timers bgp 3 9
bgp deterministic-med
! Neighbors
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
exit-address-family
address-family l2vpn evpn
advertise ipv4 unicast
exit-address-family
! end of router bgp 65103 vrf BLUE
cumulus@leaf04:mgmt:~$ sudo cat /etc/frr/frr.conf
...
vrf BLUE
vni 4002
exit-vrf
vrf RED
vni 4001
exit-vrf
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65104 vrf default
bgp router-id 10.10.10.4
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor peerlink.4094 interface peer-group underlay
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 capability extended-nexthop
neighbor swp51 interface peer-group underlay
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface peer-group underlay
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
neighbor swp53 interface peer-group underlay
neighbor swp53 timers 3 9
neighbor swp53 timers connect 10
neighbor swp53 advertisement-interval 0
neighbor swp53 capability extended-nexthop
neighbor swp54 interface peer-group underlay
neighbor swp54 timers 3 9
neighbor swp54 timers connect 10
neighbor swp54 advertisement-interval 0
neighbor swp54 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
! end of router bgp 65104 vrf default
router bgp 65104 vrf RED
bgp router-id 10.10.10.4
timers bgp 3 9
bgp deterministic-med
! Neighbors
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
exit-address-family
address-family l2vpn evpn
advertise ipv4 unicast
exit-address-family
! end of router bgp 65104 vrf RED
router bgp 65104 vrf BLUE
bgp router-id 10.10.10.4
timers bgp 3 9
bgp deterministic-med
! Neighbors
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
exit-address-family
address-family l2vpn evpn
advertise ipv4 unicast
exit-address-family
! end of router bgp 65104 vrf BLUE
cumulus@spine01:mgmt:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65199 vrf default
bgp router-id 10.10.10.101
bgp bestpath as-path multipath-relax
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor swp1 interface peer-group underlay
neighbor swp1 timers 3 9
neighbor swp1 timers connect 10
neighbor swp1 advertisement-interval 0
neighbor swp1 capability extended-nexthop
neighbor swp2 interface peer-group underlay
neighbor swp2 timers 3 9
neighbor swp2 timers connect 10
neighbor swp2 advertisement-interval 0
neighbor swp2 capability extended-nexthop
neighbor swp3 interface peer-group underlay
neighbor swp3 timers 3 9
neighbor swp3 timers connect 10
neighbor swp3 advertisement-interval 0
neighbor swp3 capability extended-nexthop
neighbor swp4 interface peer-group underlay
neighbor swp4 timers 3 9
neighbor swp4 timers connect 10
neighbor swp4 advertisement-interval 0
neighbor swp4 capability extended-nexthop
neighbor swp5 interface peer-group underlay
neighbor swp5 timers 3 9
neighbor swp5 timers connect 10
neighbor swp5 advertisement-interval 0
neighbor swp5 capability extended-nexthop
neighbor swp6 interface peer-group underlay
neighbor swp6 timers 3 9
neighbor swp6 timers connect 10
neighbor swp6 advertisement-interval 0
neighbor swp6 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor swp5 activate
neighbor swp6 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor swp5 activate
neighbor swp6 activate
neighbor underlay activate
exit-address-family
cumulus@spine02:mgmt:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65199 vrf default
bgp router-id 10.10.10.102
bgp bestpath as-path multipath-relax
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor swp1 interface peer-group underlay
neighbor swp1 timers 3 9
neighbor swp1 timers connect 10
neighbor swp1 advertisement-interval 0
neighbor swp1 capability extended-nexthop
neighbor swp2 interface peer-group underlay
neighbor swp2 timers 3 9
neighbor swp2 timers connect 10
neighbor swp2 advertisement-interval 0
neighbor swp2 capability extended-nexthop
neighbor swp3 interface peer-group underlay
neighbor swp3 timers 3 9
neighbor swp3 timers connect 10
neighbor swp3 advertisement-interval 0
neighbor swp3 capability extended-nexthop
neighbor swp4 interface peer-group underlay
neighbor swp4 timers 3 9
neighbor swp4 timers connect 10
neighbor swp4 advertisement-interval 0
neighbor swp4 capability extended-nexthop
neighbor swp5 interface peer-group underlay
neighbor swp5 timers 3 9
neighbor swp5 timers connect 10
neighbor swp5 advertisement-interval 0
neighbor swp5 capability extended-nexthop
neighbor swp6 interface peer-group underlay
neighbor swp6 timers 3 9
neighbor swp6 timers connect 10
neighbor swp6 advertisement-interval 0
neighbor swp6 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor swp5 activate
neighbor swp6 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor swp5 activate
neighbor swp6 activate
neighbor underlay activate
exit-address-family
cumulus@spine03:mgmt:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65199 vrf default
bgp router-id 10.10.10.103
bgp bestpath as-path multipath-relax
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor swp1 interface peer-group underlay
neighbor swp1 timers 3 9
neighbor swp1 timers connect 10
neighbor swp1 advertisement-interval 0
neighbor swp1 capability extended-nexthop
neighbor swp2 interface peer-group underlay
neighbor swp2 timers 3 9
neighbor swp2 timers connect 10
neighbor swp2 advertisement-interval 0
neighbor swp2 capability extended-nexthop
neighbor swp3 interface peer-group underlay
neighbor swp3 timers 3 9
neighbor swp3 timers connect 10
neighbor swp3 advertisement-interval 0
neighbor swp3 capability extended-nexthop
neighbor swp4 interface peer-group underlay
neighbor swp4 timers 3 9
neighbor swp4 timers connect 10
neighbor swp4 advertisement-interval 0
neighbor swp4 capability extended-nexthop
neighbor swp5 interface peer-group underlay
neighbor swp5 timers 3 9
neighbor swp5 timers connect 10
neighbor swp5 advertisement-interval 0
neighbor swp5 capability extended-nexthop
neighbor swp6 interface peer-group underlay
neighbor swp6 timers 3 9
neighbor swp6 timers connect 10
neighbor swp6 advertisement-interval 0
neighbor swp6 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor swp5 activate
neighbor swp6 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor swp5 activate
neighbor swp6 activate
neighbor underlay activate
exit-address-family
cumulus@spine04:mgmt:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65199 vrf default
bgp router-id 10.10.10.104
bgp bestpath as-path multipath-relax
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor swp1 interface peer-group underlay
neighbor swp1 timers 3 9
neighbor swp1 timers connect 10
neighbor swp1 advertisement-interval 0
neighbor swp1 capability extended-nexthop
neighbor swp2 interface peer-group underlay
neighbor swp2 timers 3 9
neighbor swp2 timers connect 10
neighbor swp2 advertisement-interval 0
neighbor swp2 capability extended-nexthop
neighbor swp3 interface peer-group underlay
neighbor swp3 timers 3 9
neighbor swp3 timers connect 10
neighbor swp3 advertisement-interval 0
neighbor swp3 capability extended-nexthop
neighbor swp4 interface peer-group underlay
neighbor swp4 timers 3 9
neighbor swp4 timers connect 10
neighbor swp4 advertisement-interval 0
neighbor swp4 capability extended-nexthop
neighbor swp5 interface peer-group underlay
neighbor swp5 timers 3 9
neighbor swp5 timers connect 10
neighbor swp5 advertisement-interval 0
neighbor swp5 capability extended-nexthop
neighbor swp6 interface peer-group underlay
neighbor swp6 timers 3 9
neighbor swp6 timers connect 10
neighbor swp6 advertisement-interval 0
neighbor swp6 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor swp5 activate
neighbor swp6 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor swp5 activate
neighbor swp6 activate
neighbor underlay activate
exit-address-family
cumulus@border01:mgmt:~$ sudo cat /etc/frr/frr.conf
...
vrf BLUE
ip route 10.1.10.0/24 10.1.102.4
ip route 10.1.20.0/24 10.1.102.4
vni 4002
exit-vrf
vrf RED
ip route 10.1.30.0/24 10.1.101.4
vni 4001
exit-vrf
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65253 vrf default
bgp router-id 10.10.10.63
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor peerlink.4094 interface remote-as external
neighbor peerlink.4094 interface peer-group underlay
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 capability extended-nexthop
neighbor swp51 interface remote-as external
neighbor swp51 interface peer-group underlay
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface remote-as external
neighbor swp52 interface peer-group underlay
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
neighbor swp53 interface remote-as external
neighbor swp53 interface peer-group underlay
neighbor swp53 timers 3 9
neighbor swp53 timers connect 10
neighbor swp53 advertisement-interval 0
neighbor swp53 capability extended-nexthop
neighbor swp54 interface remote-as external
neighbor swp54 interface peer-group underlay
neighbor swp54 timers 3 9
neighbor swp54 timers connect 10
neighbor swp54 advertisement-interval 0
neighbor swp54 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
! end of router bgp 65253 vrf default
router bgp 65253 vrf RED
bgp router-id 10.10.10.63
timers bgp 3 9
bgp deterministic-med
! Address families
address-family ipv4 unicast
redistribute static
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise ipv4 unicast
neighbor underlay activate
exit-address-family
! end of router bgp 65253 vrf RED
router bgp 65253 vrf BLUE
bgp router-id 10.10.10.63
timers bgp 3 9
bgp deterministic-med
! Address families
address-family ipv4 unicast
redistribute static
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise ipv4 unicast
neighbor underlay activate
exit-address-family
! end of router bgp 65253 vrf BLUE
cumulus@border02:mgmt:~$ sudo cat /etc/frr/frr.conf
...
vrf BLUE
ip route 10.1.10.0/24 10.1.102.4
ip route 10.1.20.0/24 10.1.102.4
vni 4002
exit-vrf
vrf RED
ip route 10.1.30.0/24 10.1.101.4
vni 4001
exit-vrf
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65254 vrf default
bgp router-id 10.10.10.64
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor peerlink.4094 interface remote-as external
neighbor peerlink.4094 interface peer-group underlay
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 capability extended-nexthop
neighbor swp51 interface remote-as external
neighbor swp51 interface peer-group underlay
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface remote-as external
neighbor swp52 interface peer-group underlay
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
neighbor swp53 interface remote-as external
neighbor swp53 interface peer-group underlay
neighbor swp53 timers 3 9
neighbor swp53 timers connect 10
neighbor swp53 advertisement-interval 0
neighbor swp53 capability extended-nexthop
neighbor swp54 interface remote-as external
neighbor swp54 interface peer-group underlay
neighbor swp54 timers 3 9
neighbor swp54 timers connect 10
neighbor swp54 advertisement-interval 0
neighbor swp54 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
neighbor swp53 activate
neighbor swp54 activate
neighbor underlay activate
exit-address-family
! end of router bgp 65254 vrf default
router bgp 65254 vrf RED
bgp router-id 10.10.10.64
timers bgp 3 9
bgp deterministic-med
! Address families
address-family ipv4 unicast
redistribute static
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise ipv4 unicast
neighbor underlay activate
exit-address-family
! end of router bgp 65254 vrf RED
router bgp 65254 vrf BLUE
bgp router-id 10.10.10.64
timers bgp 3 9
bgp deterministic-med
! Address families
address-family ipv4 unicast
redistribute static
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
advertise ipv4 unicast
neighbor underlay activate
exit-address-family
! end of router bgp 65254 vrf BLUE

This simulation is running Cumulus Linux 5.11. The Cumulus Linux 5.12 simulation is coming soon.

The simulation starts with the example EVPN symmetric routing configuration. The demo is pre-configured using NVUE commands.

To validate the configuration, run the commands listed in the Troubleshooting EVPN section.

VXLAN Devices

Cumulus Linux supports both single and traditional VXLAN devices.

Single VXLAN Device

With a single VXLAN device, a set of VNIs represents a single device model. The single VXLAN device has a set of attributes that belong to the VXLAN construct. Each VNI includes a VLAN to VNI mapping, and you can specify which VLANs map to the associated VNIs. A single VXLAN device simplifies the configuration and reduces overhead by replacing multiple traditional VXLAN devices with one device.

Cumulus Linux supports multiple single VXLAN devices when configured with multiple VLAN-aware bridges. You configure multiple single VXLAN devices in the same way you configure a single VXLAN device. Make sure not to duplicate VNIs across single VXLAN device configurations.

The limitations listed for multiple VLAN-aware bridges also apply to multiple single VXLAN devices.
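
One way to catch duplicated VNIs before you apply a configuration is to compare the VLAN to VNI maps of each bridge. The following Python sketch is illustrative only and is not part of Cumulus Linux; the find_duplicate_vnis helper and its input dictionaries are hypothetical, and you would populate them from your own configuration.

```python
from collections import defaultdict


def find_duplicate_vnis(bridge_vni_maps):
    """Return {vni: sorted bridge names} for VNIs used by more than one bridge.

    bridge_vni_maps is a hypothetical input: {bridge name: {vlan id: vni}}.
    """
    owners = defaultdict(set)
    for bridge, vlan_to_vni in bridge_vni_maps.items():
        for vni in vlan_to_vni.values():
            owners[vni].add(bridge)
    return {vni: sorted(names) for vni, names in owners.items() if len(names) > 1}


maps = {
    "br_default": {10: 10, 20: 20},
    "br_01": {10: 10, 30: 30},  # VNI 10 is also used by br_default
}
print(find_duplicate_vnis(maps))  # prints {10: ['br_01', 'br_default']}
```

An empty result means no VNI appears in more than one bridge.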

You can configure a single VXLAN device with NVUE or by manually editing the /etc/network/interfaces file. When you configure a single VXLAN device with NVUE, Cumulus Linux creates a unique name for the device in the format vxlan<id>. Cumulus Linux generates the ID using the bridge name as the hash key.

The following example shows a static VXLAN configuration that uses a single VXLAN device:

cumulus@leaf01:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf01:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf01:~$ nv set nve vxlan source address 10.10.10.1
cumulus@leaf01:~$ nv set bridge domain br_default vlan 10 vni 10 flooding multicast-group 239.1.1.110
cumulus@leaf01:~$ nv set bridge domain br_default vlan 20 vni 20 flooding multicast-group 239.1.1.120
cumulus@leaf01:~$ nv set interface swp1 bridge domain br_default access 10
cumulus@leaf01:~$ nv set interface swp2 bridge domain br_default access 20
cumulus@leaf01:~$ nv config apply

NVUE creates the following configuration snippet in the /etc/nvue.d/startup.yaml file:

cumulus@leaf01:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10':
                  flooding:
                    multicast-group: 239.1.1.110
                    enable: on
            '20':
              vni:
                '20':
                  flooding:
                    multicast-group: 239.1.1.120
                    enable: on
    nve:
      vxlan:
        enable: on
        source:
          address: 10.10.10.1
    interface:
      swp1:
        bridge:
          domain:
            br_default:
              access: 10
        type: swp
      swp2:
        bridge:
          domain:
            br_default:
              access: 20
        type: swp

Edit the /etc/network/interfaces file, then run the sudo ifreload -a command.

cumulus@leaf01:~$ sudo nano /etc/network/interfaces
...
auto swp1
iface swp1
    bridge-access 10

auto swp2
iface swp2
    bridge-access 20

auto vxlan48
iface vxlan48
    vxlan-mcastgrp-map 10=239.1.1.110 20=239.1.1.120
    bridge-vlan-vni-map 10=10 20=20
    bridge-vids 10 20
    bridge-learning off

auto br_default
iface br_default
    bridge-ports swp1 swp2 vxlan48
    hwaddress 44:38:39:22:01:ab
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1
cumulus@leaf01:~$ sudo ifreload -a

Traditional VXLAN Device

With a traditional VXLAN device, each VNI is a separate device (for example, vni10, vni20, vni30). You can configure traditional VXLAN devices by manually editing the /etc/network/interfaces file.

The following example shows a traditional VXLAN device configuration:

You cannot use NVUE commands to configure traditional VXLAN devices.

Edit the /etc/network/interfaces file, then run the ifreload -a command.

cumulus@leaf01:~$ sudo nano /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.1/32
    vxlan-local-tunnelip 10.10.10.1

auto mgmt
iface mgmt
    address 127.0.0.1/8
    vrf-table auto

auto swp1
iface swp1
    bridge-access 10

auto swp2
iface swp2
    bridge-access 20

auto vni10
iface vni10
    bridge-access 10
    mstpctl-bpduguard yes
    mstpctl-portbpdufilter yes
    vxlan-id 10

auto vni20
iface vni20
    bridge-access 20
    mstpctl-bpduguard yes
    mstpctl-portbpdufilter yes
    vxlan-id 20

auto bridge
iface bridge
    bridge-ports swp1 swp2 vni10 vni20
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1
cumulus@leaf01:~$ sudo ifreload -a

Automatic VLAN to VNI Mapping

In an EVPN VXLAN environment, you need to map individual VLANs to VNIs. For a single VXLAN device, you can do this with a separate NVUE command per VLAN; however, this can be cumbersome if you have to configure many VLANs or need to isolate tenants and reuse VLANs. To simplify the configuration, you can instead use two commands: one that sets the VNI for a list of VLANs to auto, and one that sets the VLAN to VNI offset (vlan-vni-offset) for the bridge.

The following commands automatically set the VNIs for VLAN 10, 20, 30, 40, and 50 on the default bridge (br_default) to 10010, 10020, 10030, 10040, and 10050, and set the VNIs for VLAN 10, 20, 30, 40, and 50 on bridge br_01 to 20010, 20020, 20030, 20040, and 20050:

cumulus@switch:mgmt:~$ nv set bridge domain br_default vlan 10,20,30,40,50 vni auto
cumulus@switch:mgmt:~$ nv set bridge domain br_default vlan-vni-offset 10000
cumulus@switch:mgmt:~$ nv set bridge domain br_01 vlan 10,20,30,40,50 vni auto
cumulus@switch:mgmt:~$ nv set bridge domain br_01 vlan-vni-offset 20000
cumulus@switch:mgmt:~$ nv config apply
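
The mapping is simple arithmetic: as the bridge-vlan-vni-map output later in this section shows (10=10010 with an offset of 10000), each VNI is the VLAN ID plus the offset. The following Python sketch only illustrates that arithmetic; auto_vni_map and vlan_vni_map_line are hypothetical helper names and are not part of NVUE.

```python
def auto_vni_map(vlans, offset):
    """Map each VLAN ID to VLAN ID + offset (the assumed auto-VNI rule)."""
    return {vlan: vlan + offset for vlan in vlans}


def vlan_vni_map_line(vni_map):
    """Format a mapping in the style of a bridge-vlan-vni-map line."""
    pairs = " ".join(f"{vlan}={vni}" for vlan, vni in sorted(vni_map.items()))
    return f"bridge-vlan-vni-map {pairs}"


print(vlan_vni_map_line(auto_vni_map([10, 20, 30], 10000)))
# prints: bridge-vlan-vni-map 10=10010 20=10020 30=10030
```

Using a different offset for each bridge keeps the resulting VNI ranges from overlapping as long as the offsets are further apart than the largest VLAN ID in use.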

You cannot use automatic NVUE VLAN to VNI mapping commands to configure static VXLAN tunnels.

The following example configures VLANs 10, 20, and 30. The VLANs map automatically to VNIs with an offset of 10000.

cumulus@switch:mgmt:~$ nv set interface lo ip address 10.10.10.1/32
cumulus@switch:mgmt:~$ nv set interface swp1-2 bridge domain br_default
cumulus@switch:mgmt:~$ nv set bridge domain br_default vlan 10,20,30
cumulus@switch:mgmt:~$ nv set interface vlan10
cumulus@switch:mgmt:~$ nv set interface vlan20
cumulus@switch:mgmt:~$ nv set interface vlan30
cumulus@switch:mgmt:~$ nv set bridge domain br_default vlan 10,20,30 vni auto
cumulus@switch:mgmt:~$ nv set bridge domain br_default vlan-vni-offset 10000
cumulus@switch:mgmt:~$ nv config apply

To unset the above configuration, run the nv unset commands in the reverse order. You must omit the bridge name from the nv unset interface swp1-2 bridge domain br_default command and auto from the nv unset bridge domain br_default vlan 10,20,30 vni auto command.

cumulus@switch:mgmt:~$ nv unset bridge domain br_default vlan-vni-offset
cumulus@switch:mgmt:~$ nv unset bridge domain br_default vlan 10,20,30 vni
cumulus@switch:mgmt:~$ nv unset interface vlan30
cumulus@switch:mgmt:~$ nv unset interface vlan20
cumulus@switch:mgmt:~$ nv unset interface vlan10
cumulus@switch:mgmt:~$ nv unset bridge domain br_default vlan 10,20,30
cumulus@switch:mgmt:~$ nv unset interface swp1-2 bridge domain
cumulus@switch:mgmt:~$ nv unset interface lo ip address 10.10.10.1/32
cumulus@switch:mgmt:~$ nv config apply
cumulus@switch:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                auto: {}
            '20':
              vni:
                auto: {}
            '30':
              vni:
                auto: {}
          vlan-vni-offset: 10000
    interface:
      lo:
        ip:
          address:
            10.10.10.1/32: {}
        type: loopback
      swp1:
        bridge:
          domain:
            br_default: {}
        type: swp
      swp2:
        bridge:
          domain:
            br_default: {}
        type: swp
      vlan10:
        type: svi
        vlan: 10
      vlan20:
        type: svi
        vlan: 20
      vlan30:
        type: svi
        vlan: 30
    nve:
      vxlan:
        enable: on
cumulus@switch:mgmt:~$ sudo cat /etc/network/interfaces
auto lo
iface lo inet loopback
    address 10.10.10.1/32
    vxlan-local-tunnelip 10.10.10.1

auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto

auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt

auto swp1
iface swp1

auto swp2
iface swp2

auto vlan10
iface vlan10
    hwaddress 44:38:39:22:01:ab
    vlan-raw-device br_default
    vlan-id 10

auto vlan20
iface vlan20
    hwaddress 44:38:39:22:01:ab
    vlan-raw-device br_default
    vlan-id 20

auto vlan30
iface vlan30
    hwaddress 44:38:39:22:01:ab
    vlan-raw-device br_default
    vlan-id 30

auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10010 20=10020 30=10030
    bridge-learning off

auto br_default
iface br_default
    bridge-ports swp1 swp2 vxlan48
    hwaddress 44:38:39:22:01:ab
    bridge-vlan-aware yes
    bridge-vids 10 20 30
    bridge-pvid 1

VXLAN UDP Port

You can change the UDP port that Cumulus Linux uses for VXLAN encapsulation. The default port is 4789.

The following example changes the UDP port for VXLAN encapsulation to 1024:

cumulus@switch:mgmt:~$ nv set nve vxlan port 1024

Reserved Field in VXLAN Header

By default, Cumulus Linux drops VXLAN packets at ingress that have reserved bits set in the header. You can change the forwarding behavior to ignore the reserved bits on ingress instead of dropping the packet.
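
To make the term "reserved bits" concrete, the following Python sketch parses the 8-byte VXLAN header as laid out in RFC 7348: an 8-bit flags field in which only the I flag (0x08) is defined, 24 reserved bits, the 24-bit VNI, and 8 more reserved bits. It illustrates the header layout and is not the switch implementation. The function names are hypothetical, and which of these bits the switch actually inspects is an assumption.

```python
import struct

I_FLAG = 0x08  # "valid VNI" flag defined by RFC 7348


def parse_vxlan_header(header):
    """Split an 8-byte VXLAN header into VNI and reserved-bit status."""
    if len(header) < 8:
        raise ValueError("a VXLAN header is 8 bytes long")
    word1, word2 = struct.unpack("!II", header[:8])
    flags = word1 >> 24            # first byte
    reserved1 = word1 & 0xFFFFFF   # next 24 bits
    vni = word2 >> 8               # 24-bit VNI
    reserved2 = word2 & 0xFF       # last byte
    # Assumption: every bit other than the I flag and the VNI counts as reserved.
    reserved_set = bool((flags & ~I_FLAG & 0xFF) or reserved1 or reserved2)
    return {"vni": vni, "vni_valid": bool(flags & I_FLAG), "reserved_set": reserved_set}


def drops_packet(header, ignore_reserved=False):
    """Model the default (drop) and the ignore behavior described above."""
    return parse_vxlan_header(header)["reserved_set"] and not ignore_reserved


clean = struct.pack("!II", I_FLAG << 24, 10 << 8)
dirty = struct.pack("!II", (I_FLAG << 24) | 0x1, 10 << 8)  # one reserved bit set
print(parse_vxlan_header(clean)["vni"], drops_packet(clean),
      drops_packet(dirty), drops_packet(dirty, ignore_reserved=True))
# prints: 10 False True False
```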

NVUE does not provide commands to configure the switch to ignore the reserved bits in a VXLAN packet.

To configure the switch to ignore the reserved bits on ingress:

  1. Create the /etc/cumulus/switchd.d/vxlan.conf file and add the vxlan_reserved_fields_ignore=True parameter. This parameter configures the switch ASIC to ignore reserved fields at ingress.

    cumulus@switch:mgmt:~$ sudo nano /etc/cumulus/switchd.d/vxlan.conf
    vxlan_reserved_fields_ignore=True
    
  2. Reload switchd with the sudo systemctl reload switchd.service command.

  3. Create the /etc/modprobe.d/vxlan.conf file and add the options vxlan reserved_fields_ignore=1 parameter. This parameter configures the switch kernel to ignore reserved fields at ingress.

    cumulus@switch:mgmt:~$ sudo nano /etc/modprobe.d/vxlan.conf
    options vxlan reserved_fields_ignore=1
    
  4. Reboot the switch for the kernel change to take effect or run the echo 1 > /sys/module/vxlan/parameters/reserved_fields_ignore command to enable the setting in real time.

To configure the switch back to the default behavior (drop VXLAN packets at ingress that have reserved bits set in the header):

  1. Edit the /etc/cumulus/switchd.d/vxlan.conf file to change the vxlan_reserved_fields_ignore parameter to False.

    cumulus@switch:mgmt:~$ sudo nano /etc/cumulus/switchd.d/vxlan.conf
    vxlan_reserved_fields_ignore=False
    
  2. Reload switchd with the sudo systemctl reload switchd.service command.

  3. Edit the /etc/modprobe.d/vxlan.conf file to change the options vxlan reserved_fields_ignore parameter to 0.

    cumulus@switch:mgmt:~$ sudo nano /etc/modprobe.d/vxlan.conf
    options vxlan reserved_fields_ignore=0
    
  4. Reboot the switch for the kernel change to take effect or run the echo 0 > /sys/module/vxlan/parameters/reserved_fields_ignore command to disable the setting in real time.
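
Because the switchd and kernel settings need to agree, it can help to generate both file contents from a single flag. The following Python sketch is a hypothetical helper, not an NVIDIA tool. It only builds the file contents given in the steps above and leaves writing the files, reloading switchd, and rebooting to you.

```python
def reserved_fields_config(ignore):
    """Return {path: contents} for the switchd and kernel module settings."""
    enabled = bool(ignore)
    return {
        "/etc/cumulus/switchd.d/vxlan.conf":
            f"vxlan_reserved_fields_ignore={enabled}\n",
        "/etc/modprobe.d/vxlan.conf":
            f"options vxlan reserved_fields_ignore={int(enabled)}\n",
    }


for path, contents in reserved_fields_config(True).items():
    print(path, "->", contents.strip())
```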

VXLAN Routing

VXLAN routing, sometimes referred to as inter-VXLAN routing, provides IP routing between VXLAN VNIs in overlay networks. Cumulus Linux routes traffic using the inner header or the overlay tenant IP address.

Because VXLAN routing is fundamentally routing, you typically deploy it with a control plane, such as Ethernet Virtual Private Network (EVPN). You can also set up static routing for MAC distribution and BUM handling.

For a detailed description of different VXLAN routing models and configuration examples, refer to EVPN.

VXLAN routing supports full layer 3 multi-tenancy; all routing occurs in the context of a VRF. Also, VXLAN routing supports dual-attached hosts where the associated VTEPs function in active-active mode.

Static VXLAN Tunnels

Static VXLAN tunnels serve to connect two VTEPs in a given environment. Static VXLAN tunnels are the simplest deployment mechanism for small scale environments and are interoperable with other vendors that adhere to VXLAN standards. Because you map which VTEPs are in a particular VNI, you can avoid the tedious process of defining connections to every VLAN on every other VTEP on every other rack.

Cumulus Linux supports more than one VXLAN ID per VLAN-aware bridge but does not support more than one VXLAN ID per traditional bridge.

Configure Static VXLAN Tunnels

To configure static VXLAN tunnels, you create VXLAN devices. Cumulus Linux supports both traditional VXLAN devices and single VXLAN devices.

The configuration examples use the following topology. Each IP address corresponds to the loopback address of the switch.

Traditional VXLAN Device

The following examples show the traditional VXLAN device configuration for each leaf, in order of loopback address:

Cumulus Linux does not provide NVUE commands for traditional VXLAN device configuration.

Edit the /etc/network/interfaces file, then run the ifreload -a command.

auto lo
iface lo inet loopback
    address 10.10.10.1/32
    vxlan-local-tunnelip 10.10.10.1

auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto

auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt

auto swp1
iface swp1
    bridge-access 10

auto swp2
iface swp2
    bridge-access 20

auto vni10
iface vni10
    bridge-access 10
    vxlan-remoteip 10.10.10.2
    vxlan-remoteip 10.10.10.3
    vxlan-remoteip 10.10.10.4
    vxlan-id 10

auto vni20
iface vni20
    bridge-access 20
    vxlan-remoteip 10.10.10.2
    vxlan-remoteip 10.10.10.3
    vxlan-remoteip 10.10.10.4
    vxlan-id 20

auto bridge
iface bridge
    bridge-ports swp1 swp2 vni10 vni20
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1

auto lo
iface lo inet loopback
    address 10.10.10.2/32
    vxlan-local-tunnelip 10.10.10.2

auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto

auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt

auto swp1
iface swp1
    bridge-access 10

auto swp2
iface swp2
    bridge-access 20

auto vni10
iface vni10
    bridge-access 10
    vxlan-remoteip 10.10.10.1
    vxlan-remoteip 10.10.10.3
    vxlan-remoteip 10.10.10.4
    vxlan-id 10

auto vni20
iface vni20
    bridge-access 20
    vxlan-remoteip 10.10.10.1
    vxlan-remoteip 10.10.10.3
    vxlan-remoteip 10.10.10.4
    vxlan-id 20

auto bridge
iface bridge
    bridge-ports swp1 swp2 vni10 vni20
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1

auto lo
iface lo inet loopback
    address 10.10.10.3/32
    vxlan-local-tunnelip 10.10.10.3

auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto

auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt

auto swp1
iface swp1
    bridge-access 10

auto swp2
iface swp2
    bridge-access 20

auto vni10
iface vni10
    bridge-access 10
    vxlan-remoteip 10.10.10.1
    vxlan-remoteip 10.10.10.2
    vxlan-remoteip 10.10.10.4
    vxlan-id 10

auto vni20
iface vni20
    bridge-access 20
    vxlan-remoteip 10.10.10.1
    vxlan-remoteip 10.10.10.2
    vxlan-remoteip 10.10.10.4
    vxlan-id 20

auto bridge
iface bridge
    bridge-ports swp1 swp2 vni10 vni20
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1

auto lo
iface lo inet loopback
    address 10.10.10.4/32
    vxlan-local-tunnelip 10.10.10.4

auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto

auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt

auto swp1
iface swp1
    bridge-access 10

auto swp2
iface swp2
    bridge-access 20

auto vni10
iface vni10
    bridge-access 10
    vxlan-remoteip 10.10.10.1
    vxlan-remoteip 10.10.10.2
    vxlan-remoteip 10.10.10.3
    vxlan-id 10

auto vni20
iface vni20
    bridge-access 20
    vxlan-remoteip 10.10.10.1
    vxlan-remoteip 10.10.10.2
    vxlan-remoteip 10.10.10.3
    vxlan-id 20

auto bridge
iface bridge
    bridge-ports swp1 swp2 vni10 vni20
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1
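
In the traditional configuration above, every VNI on a leaf lists the loopback address of every other leaf as a vxlan-remoteip. The following Python sketch generates those lines for each VTEP from one list of loopback addresses, which avoids copy and paste mistakes as the number of VTEPs grows. It is illustrative only; remote_ip_lines is a hypothetical helper.

```python
def remote_ip_lines(local_ip, all_vteps):
    """Return vxlan-remoteip lines for every VTEP except the local one."""
    return [f"vxlan-remoteip {ip}" for ip in all_vteps if ip != local_ip]


vteps = ["10.10.10.1", "10.10.10.2", "10.10.10.3", "10.10.10.4"]
for local in vteps:
    print(local)
    for line in remote_ip_lines(local, vteps):
        print("    " + line)
```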

Single VXLAN Device

The following examples show the single VXLAN device configuration for each leaf:

cumulus@leaf01:~$ nv set interface lo ip address 10.10.10.1/32
cumulus@leaf01:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf01:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf01:~$ nv set nve vxlan mac-learning on
cumulus@leaf01:~$ nv set nve vxlan source address 10.10.10.1
cumulus@leaf01:~$ nv set bridge domain br_default vlan 10 vni 10 flooding head-end-replication 10.10.10.2
cumulus@leaf01:~$ nv set bridge domain br_default vlan 10 vni 10 flooding head-end-replication 10.10.10.3
cumulus@leaf01:~$ nv set bridge domain br_default vlan 20 vni 20 flooding head-end-replication 10.10.10.4
cumulus@leaf01:~$ nv set interface swp1 bridge domain br_default access 10
cumulus@leaf01:~$ nv set interface swp2 bridge domain br_default access 20
cumulus@leaf01:~$ nv config apply
cumulus@leaf02:~$ nv set interface lo ip address 10.10.10.2/32
cumulus@leaf02:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf02:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf02:~$ nv set nve vxlan mac-learning on
cumulus@leaf02:~$ nv set nve vxlan source address 10.10.10.2
cumulus@leaf02:~$ nv set bridge domain br_default vlan 10 vni 10 flooding head-end-replication 10.10.10.1
cumulus@leaf02:~$ nv set bridge domain br_default vlan 10 vni 10 flooding head-end-replication 10.10.10.3
cumulus@leaf02:~$ nv set bridge domain br_default vlan 20 vni 20 flooding head-end-replication 10.10.10.4
cumulus@leaf02:~$ nv set interface swp1 bridge domain br_default access 10
cumulus@leaf02:~$ nv set interface swp2 bridge domain br_default access 20
cumulus@leaf02:~$ nv config apply
cumulus@leaf03:~$ nv set interface lo ip address 10.10.10.3/32
cumulus@leaf03:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf03:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf03:~$ nv set nve vxlan mac-learning on
cumulus@leaf03:~$ nv set nve vxlan source address 10.10.10.3
cumulus@leaf03:~$ nv set bridge domain br_default vlan 10 vni 10 flooding head-end-replication 10.10.10.1
cumulus@leaf03:~$ nv set bridge domain br_default vlan 10 vni 10 flooding head-end-replication 10.10.10.2
cumulus@leaf03:~$ nv set bridge domain br_default vlan 20 vni 20 flooding head-end-replication 10.10.10.4
cumulus@leaf03:~$ nv set interface swp1 bridge domain br_default access 10
cumulus@leaf03:~$ nv set interface swp2 bridge domain br_default access 20
cumulus@leaf03:~$ nv config apply
cumulus@leaf04:~$ nv set interface lo ip address 10.10.10.4/32
cumulus@leaf04:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf04:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf04:~$ nv set nve vxlan mac-learning on
cumulus@leaf04:~$ nv set nve vxlan source address 10.10.10.4
cumulus@leaf04:~$ nv set bridge domain br_default vlan 10 vni 10 flooding head-end-replication 10.10.10.1
cumulus@leaf04:~$ nv set bridge domain br_default vlan 10 vni 10 flooding head-end-replication 10.10.10.2
cumulus@leaf04:~$ nv set bridge domain br_default vlan 20 vni 20 flooding head-end-replication 10.10.10.3
cumulus@leaf04:~$ nv set interface swp1 bridge domain br_default access 10
cumulus@leaf04:~$ nv set interface swp2 bridge domain br_default access 20
cumulus@leaf04:~$ nv config apply

Edit the /etc/network/interfaces file, then run the sudo ifreload -a command.

auto lo
iface lo inet loopback
    address 10.10.10.1/32
    vxlan-local-tunnelip 10.10.10.1

auto swp1
iface swp1
    bridge-access 10

auto swp2
iface swp2
    bridge-access 20

auto vxlan48
iface vxlan48
    vxlan-remoteip-map 10=10.10.10.2 10=10.10.10.3 20=10.10.10.4
    bridge-vlan-vni-map 10=10 20=20
    bridge-vids 10 20

auto br_default
iface br_default
    bridge-ports swp1 swp2 vxlan48
    hwaddress 44:38:39:22:01:aa
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1

auto lo
iface lo inet loopback
    address 10.10.10.2/32
    vxlan-local-tunnelip 10.10.10.2

auto swp1
iface swp1
    bridge-access 10

auto swp2
iface swp2
    bridge-access 20

auto vxlan48
iface vxlan48
    vxlan-remoteip-map 10=10.10.10.1 10=10.10.10.3 20=10.10.10.4
    bridge-vlan-vni-map 10=10 20=20
    bridge-vids 10 20

auto br_default
iface br_default
    bridge-ports swp1 swp2 vxlan48
    hwaddress 44:38:39:22:01:ab
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1

auto lo
iface lo inet loopback
    address 10.10.10.3/32
    vxlan-local-tunnelip 10.10.10.3

auto swp1
iface swp1
    bridge-access 10

auto swp2
iface swp2
    bridge-access 20

auto vxlan48
iface vxlan48
    vxlan-remoteip-map 10=10.10.10.1 10=10.10.10.2 20=10.10.10.4
    bridge-vlan-vni-map 10=10 20=20
    bridge-vids 10 20

auto br_default
iface br_default
    bridge-ports swp1 swp2 vxlan48
    hwaddress 44:38:39:22:01:bb
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1

auto lo
iface lo inet loopback
    address 10.10.10.4/32
    vxlan-local-tunnelip 10.10.10.4

auto swp1
iface swp1
    bridge-access 10

auto swp2
iface swp2
    bridge-access 20

auto vxlan48
iface vxlan48
    vxlan-remoteip-map 10=10.10.10.1 10=10.10.10.2 20=10.10.10.3
    bridge-vlan-vni-map 10=10 20=20
    bridge-vids 10 20

auto br_default
iface br_default
    bridge-ports swp1 swp2 vxlan48
    hwaddress 44:38:39:22:01:c1
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1

This simulation is running Cumulus Linux 5.11. The Cumulus Linux 5.12 simulation is coming soon.

The simulation starts with the example static VXLAN configuration. The demo is pre-configured using NVUE commands.

To validate the configuration, run the verification commands shown below.

The above NVUE commands specify a different flooding list for each VNI. If you want to set the same flooding list for all VNIs, you can use the nv set nve vxlan flooding head-end-replication command; for example:

cumulus@leaf01:~$ nv set interface lo ip address 10.10.10.1/32
cumulus@leaf01:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf01:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf01:~$ nv set nve vxlan mac-learning on
cumulus@leaf01:~$ nv set nve vxlan source address 10.10.10.1
cumulus@leaf01:~$ nv set nve vxlan flooding head-end-replication 10.10.10.2
cumulus@leaf01:~$ nv set nve vxlan flooding head-end-replication 10.10.10.3
cumulus@leaf01:~$ nv set nve vxlan flooding head-end-replication 10.10.10.4
cumulus@leaf01:~$ nv set interface swp1 bridge domain br_default access 10
cumulus@leaf01:~$ nv set interface swp2 bridge domain br_default access 20
cumulus@leaf01:~$ nv config apply

The above commands create this configuration in the /etc/network/interfaces file:

...
auto vxlan48
iface vxlan48
    vxlan-remoteip-map 10=10.10.10.2 10=10.10.10.3 10=10.10.10.4 20=10.10.10.2 20=10.10.10.3 20=10.10.10.4
    bridge-vlan-vni-map 10=10 20=20
    bridge-learning on
...
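
The expansion from one flooding list to per-VNI entries can be sketched in Python. This is only an illustration of the mapping; the helper name build_remoteip_map is hypothetical and is not part of NVUE, which generates the real file.

```python
def build_remoteip_map(vnis, remote_vteps):
    """Pair every VNI with every remote VTEP, in VNI order."""
    pairs = [f"{vni}={ip}" for vni in vnis for ip in remote_vteps]
    return " ".join(pairs)


# Values taken from the example above: VNIs 10 and 20, three remote VTEPs.
line = build_remoteip_map([10, 20], ["10.10.10.2", "10.10.10.3", "10.10.10.4"])
print("vxlan-remoteip-map " + line)
```

With two VNIs and three remote VTEPs, the result contains six VNI=IP pairs, which matches the vxlan-remoteip-map line shown above.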

Verify the Configuration

After you configure all the leafs, run the following command to check for replication entries. The command output is different for traditional and single VXLAN devices.

For traditional VXLAN devices:

cumulus@leaf01:~$ sudo bridge fdb show | grep 00:00:00:00:00:00
00:00:00:00:00:00 dev vni10 dst 10.10.10.3 self permanent
00:00:00:00:00:00 dev vni10 dst 10.10.10.2 self permanent
00:00:00:00:00:00 dev vni20 dst 10.10.10.4 self permanent

For single VXLAN devices:

cumulus@leaf01:mgmt:~$ sudo bridge fdb show | grep 00:00:00:00:00:00
00:00:00:00:00:00 dev vxlan48 dst 10.10.10.2 src_vni 10 self permanent
00:00:00:00:00:00 dev vxlan48 dst 10.10.10.3 src_vni 10 self permanent
00:00:00:00:00:00 dev vxlan48 dst 10.10.10.4 src_vni 20 self permanent
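
To compare the replication entries against the intended flooding lists, you can group the output by VNI. The following Python sketch assumes the two output formats shown above and nothing more; the function name replication_lists is hypothetical.

```python
import re
from collections import defaultdict

# Sample copied from the single VXLAN device output above.
SAMPLE = """\
00:00:00:00:00:00 dev vxlan48 dst 10.10.10.2 src_vni 10 self permanent
00:00:00:00:00:00 dev vxlan48 dst 10.10.10.3 src_vni 10 self permanent
00:00:00:00:00:00 dev vxlan48 dst 10.10.10.4 src_vni 20 self permanent
"""

# The src_vni field is present only for single VXLAN devices.
ENTRY = re.compile(r"^00:00:00:00:00:00 dev (\S+) dst (\S+)(?: src_vni (\d+))?")


def replication_lists(fdb_text):
    """Group replication destinations by VNI, or by device name when no VNI is shown."""
    lists = defaultdict(list)
    for line in fdb_text.splitlines():
        match = ENTRY.match(line.strip())
        if not match:
            continue
        dev, dst, vni = match.groups()
        key = vni if vni is not None else dev
        lists[key].append(dst)
    return dict(lists)


print(replication_lists(SAMPLE))
```

For traditional VXLAN devices, the key is the device name (for example, vni10) because the output has no src_vni field.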

Cumulus Linux disables bridge learning and enables ARP suppression by default on VXLAN interfaces. You can change the default behavior to set bridge learning on and ARP suppression off for all VNIs by creating a policy file called bridge.json in the /etc/network/ifupdown2/policy.d/ directory. For example:

cumulus@leaf01:~$ sudo cat /etc/network/ifupdown2/policy.d/bridge.json
{
    "bridge": {
        "module_globals": {
            "bridge_vxlan_port_learning" : "on",
            "bridge-vxlan-arp-nd-suppress" : "off"
        }
    }
}

After you create the file, run ifreload -a to load the new configuration.
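
A hand-edited policy file can easily contain a JSON syntax error. As a minimal sketch (not part of Cumulus Linux tooling), the following Python code builds the same policy as a dictionary, serializes it, and re-parses the result to confirm that it is well-formed before you copy it to the switch. The function name render_policy is hypothetical.

```python
import json

# Same keys and values as the bridge.json example above.
POLICY = {
    "bridge": {
        "module_globals": {
            "bridge_vxlan_port_learning": "on",
            "bridge-vxlan-arp-nd-suppress": "off",
        }
    }
}


def render_policy(policy):
    """Serialize the policy and re-parse it to confirm the text is valid JSON."""
    text = json.dumps(policy, indent=4)
    json.loads(text)  # raises ValueError if the text is malformed
    return text


print(render_policy(POLICY))
```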

VXLAN Active-active Mode

VXLAN active-active mode enables a pair of MLAG switches to act as a single VTEP, providing active-active VXLAN termination for bare metal as well as virtualized workloads.

To use VXLAN active-active mode, you need to configure:

Configure VXLAN Active-Active

To configure VXLAN active-active mode, you must provision each switch in an MLAG pair with a virtual IP address for VXLAN data-path termination. The VXLAN termination address is an anycast IP address that you configure under the loopback interface. With MLAG peering, both switches use the anycast IP address for VXLAN encapsulation and decapsulation. This enables remote VTEPs to learn the host MAC addresses attached to the MLAG switches against one logical VTEP, even though the switches independently encapsulate and decapsulate layer 2 traffic originating from the host.

MLAG dynamically adds and removes the anycast IP address as the loopback interface address as follows:

  1. When the switches boot up, all VXLAN interfaces are in a PROTO_DOWN state. The anycast IP address is not in use.
  2. MLAG peering takes place and a successful VXLAN interface consistency check between the switches occurs.
  3. The clagd daemon adds the anycast address to the loopback interface as a second address. It then changes the local IP address of the VXLAN interface from a unique address to the anycast IP address and puts the interface in an UP state.
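
The three steps above can be pictured as a small state model. The Python sketch below is a teaching aid only, not clagd code; the class and method names are hypothetical, and the model covers only the transitions listed in the steps.

```python
from dataclasses import dataclass, field


@dataclass
class VtepModel:
    unique_ip: str
    anycast_ip: str
    loopback: list = field(default_factory=list)
    vxlan_local_ip: str = ""
    vxlan_state: str = "PROTO_DOWN"  # step 1: interfaces start in PROTO_DOWN

    def __post_init__(self):
        # At boot, only the unique address is in use.
        self.loopback = [self.unique_ip]
        self.vxlan_local_ip = self.unique_ip

    def peering_complete(self, consistency_ok):
        """Steps 2 and 3: after a successful consistency check, switch to the anycast IP."""
        if not consistency_ok:
            return
        self.loopback.append(self.anycast_ip)  # added as a second address
        self.vxlan_local_ip = self.anycast_ip
        self.vxlan_state = "UP"


leaf01 = VtepModel("10.10.10.1", "10.0.1.12")
print(leaf01.vxlan_state, leaf01.loopback)
leaf01.peering_complete(consistency_ok=True)
print(leaf01.vxlan_state, leaf01.loopback, leaf01.vxlan_local_ip)
```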

To configure the anycast IP address:

Run the nv set nve vxlan mlag shared-address command.

cumulus@leaf01:~$ nv set nve vxlan mlag shared-address 10.0.1.12
cumulus@leaf01:~$ nv config apply
cumulus@leaf02:~$ nv set nve vxlan mlag shared-address 10.0.1.12
cumulus@leaf02:~$ nv config apply

Add the clagd-vxlan-anycast-ip parameter under the loopback interface in the /etc/network/interfaces file:

cumulus@leaf01:~$ sudo nano /etc/network/interfaces
...
auto lo
iface lo inet loopback
  address 10.10.10.1/32
  clagd-vxlan-anycast-ip 10.0.1.12
...
cumulus@leaf02:~$ sudo nano /etc/network/interfaces
...
auto lo
iface lo inet loopback
  address 10.10.10.2/32
  clagd-vxlan-anycast-ip 10.0.1.12
...
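
Both MLAG peers must use the same anycast IP address. As a rough sketch, the following Python code extracts the clagd-vxlan-anycast-ip value from the loopback stanza of each peer's /etc/network/interfaces content and compares the two. It assumes the simple stanza layout shown above; the function name anycast_ip is hypothetical.

```python
def anycast_ip(interfaces_text):
    """Return the clagd-vxlan-anycast-ip value from the 'iface lo' stanza, or None."""
    in_lo = False
    for raw in interfaces_text.splitlines():
        parts = raw.split()
        if not parts:
            continue
        if parts[0] == "iface" and len(parts) >= 2:
            in_lo = parts[1] == "lo"  # track which stanza the following lines belong to
        elif in_lo and parts[0] == "clagd-vxlan-anycast-ip" and len(parts) >= 2:
            return parts[1]
    return None


LEAF01 = """\
auto lo
iface lo inet loopback
  address 10.10.10.1/32
  clagd-vxlan-anycast-ip 10.0.1.12
"""

LEAF02 = """\
auto lo
iface lo inet loopback
  address 10.10.10.2/32
  clagd-vxlan-anycast-ip 10.0.1.12
"""

print(anycast_ip(LEAF01) == anycast_ip(LEAF02))
```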

When you use EVPN with MLAG, EVPN might install local MAC addresses or neighbor entries as remote entries. To prevent EVPN from taking ownership of local MAC addresses or neighbor entries from MLAG, you can associate all local layer 2 VNIs with a unique site ID, which represents an MLAG pair. See Configure a Site ID for MLAG.

Troubleshooting

This section describes VXLAN active-active failure conditions and provides troubleshooting commands.

Failure Conditions

Failure Condition: The peer link goes down.
Behavior: The primary MLAG switch continues to keep all VXLAN interfaces up with the anycast IP address while the secondary switch brings down all VXLAN interfaces and places them in a PROTO_DOWN state. The secondary MLAG switch removes the anycast IP address from the loopback interface.

Failure Condition: One of the switches goes down.
Behavior: The other operational switch continues to use the anycast IP address.

Failure Condition: clagd stops.
Behavior: All VXLAN interfaces go in a PROTO_DOWN state. The switch removes the anycast IP address from the loopback interface and the local IP addresses of the VXLAN interfaces change from the anycast IP address to unique non-virtual IP addresses.

Failure Condition: MLAG peering does not establish between the switches.
Behavior: clagd brings up all the VXLAN interfaces after the reload timer expires with the configured anycast IP address. This allows the VXLAN interface to be up and running on both switches even though peering is not established.

Failure Condition: The peer link goes down but the peer switch is up (the backup link is active).
Behavior: All VXLAN interfaces go into a PROTO_DOWN state on the secondary switch.

Failure Condition: The anycast IP address is different on the MLAG peers.
Behavior: The VXLAN interface goes into a PROTO_DOWN state on the secondary switch.

Troubleshooting Commands

To show the MLAG configuration on the switch, run the NVUE nv show mlag command:

cumulus@leaf01:mgmt:~$ nv show mlag
                operational              applied            description
--------------  -----------------------  -----------------  ------------------------------------------------------
enable                                   on                 Turn the feature 'on' or 'off'.  The default is 'off'.
debug                                    off                Enable MLAG debugging
init-delay                               180                The delay, in seconds, before bonds are brought up.
mac-address     44:38:39:FF:00:aa        44:38:39:FF:00:AA  Override anycast-mac and anycast-id
peer-ip         fe80::4638:39ff:fe00:5a  linklocal          Peer Ip Address
priority        32768                    32768              Mlag Priority
[backup]        10.10.10.2               10.10.10.2         Set of MLAG backups
anycast-ip      10.0.1.12                                   Vxlan Anycast Ip address
backup-active   True                                        Mlag Backup Status
backup-reason                                               Mlag Backup Reason
local-id        44:38:39:00:00:59                           Mlag Local Unique Id
local-role      primary                                     Mlag Local Role
peer-alive      True                                        Mlag Peer Alive Status
peer-id         44:38:39:00:00:5a                           Mlag Peer Unique Id
peer-interface  peerlink.4094                               Mlag Peerlink Interface
peer-priority   32768                                       Mlag Peer Priority
peer-role       secondary                                   Mlag Peer Role

To show the MLAG neighbor information on the switch, run the NVUE nv show mlag neighbor command:

cumulus@leaf01:mgmt:~$ nv show mlag neighbor
    operational  applied  description
--  -----------  -------  -----------


dynamic
==========
        interface  ip-address  lladdr  vlan-id
    --  ---------  ----------  ------  -------


permanent
============
        address-family  interface  ip-address                lladdr             vlan-id
    --  --------------  ---------  ------------------------  -----------------  -------
    1   10              vlan10     fe80::4638:39ff:fe22:1b1  44:38:39:22:01:b1  10
    2   10              vlan20     fe80::4638:39ff:fe22:1b1  44:38:39:22:01:b1  20
    3   10              vlan10     fe80::4638:39ff:fe22:1af  44:38:39:22:01:af  10
    4   10              vlan20     fe80::4638:39ff:fe22:1af  44:38:39:22:01:af  20

To show MLAG behavior and any inconsistencies between an MLAG pair, run the clagctl command.

In the following example, no conflicts exist for this MLAG interface and the VXLAN is up and running (there is no Proto-Down). The VXLAN anycast IP address shared by the MLAG pair for VTEP termination is in use and is 10.0.1.12.

cumulus@leaf01$ clagctl
The peer is alive
     Our Priority, ID, and Role: 32768 44:38:39:00:00:59 primary
    Peer Priority, ID, and Role: 32768 44:38:39:00:00:5a secondary
          Peer Interface and IP: peerlink.4094 fe80::4638:39ff:fe00:5a (linklocal)
               VxLAN Anycast IP: 10.0.1.12
                      Backup IP: 10.10.10.2 (active)
                     System MAC: 44:38:39:FF:00:aa

CLAG Interfaces
Our Interface      Peer Interface     CLAG Id   Conflicts              Proto-Down Reason
----------------   ----------------   -------   --------------------   -----------------
           bond1   -                  1         -                      -              
         vxlan48   vxlan48            -         -                      -

In the following example, the primary switch has the wrong VXLAN anycast IP address configured. When you run the clagctl command on the secondary switch, the Proto-Down Reason shows anycast-ip-mismatch on bond1 and vxlan-single,anycast-ip-mismatch on vxlan48.

cumulus@leaf04:mgmt:~$ clagctl
The peer is alive
     Our Priority, ID, and Role: 32768 44:38:39:00:00:5e secondary
    Peer Priority, ID, and Role: 32768 44:38:39:00:00:5d primary
          Peer Interface and IP: peerlink.4094 fe80::4638:39ff:fe00:5d (linklocal)
               VxLAN Anycast IP: 10.0.1.34
                      Backup IP: 10.10.10.3 (active)
                     System MAC: 44:38:39:FF:00:bb

CLAG Interfaces
Our Interface      Peer Interface     CLAG Id   Conflicts              Proto-Down Reason
----------------   ----------------   -------   --------------------   -----------------
           bond1   -                  1         -                      anycast-ip-mismatch
         vxlan48   -                  -         -                      vxlan-single,anycast-ip-mismatch
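
When you check many switches, it can help to pull out only the interfaces that report a Proto-Down reason. The Python sketch below parses the CLAG Interfaces table from saved clagctl output. It assumes the five-column layout shown above, with each cell holding a single token; the function name proto_down_interfaces is hypothetical.

```python
# Table copied from the clagctl output above.
SAMPLE = """\
CLAG Interfaces
Our Interface      Peer Interface     CLAG Id   Conflicts              Proto-Down Reason
----------------   ----------------   -------   --------------------   -----------------
           bond1   -                  1         -                      anycast-ip-mismatch
         vxlan48   -                  -         -                      vxlan-single,anycast-ip-mismatch
"""


def proto_down_interfaces(clagctl_text):
    """Map each interface with a Proto-Down reason to its list of reasons."""
    result = {}
    in_table = False
    for line in clagctl_text.splitlines():
        if line.startswith("----"):
            in_table = True  # rows follow the dashed separator line
            continue
        if not in_table:
            continue
        fields = line.split()
        if len(fields) != 5:
            continue
        name, reason = fields[0], fields[4]
        if reason != "-":
            result[name] = reason.split(",")
    return result


print(proto_down_interfaces(SAMPLE))
```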

Configuration Example

The commands in this example configure:

cumulus@leaf01:~$ nv set interface lo ip address 10.10.10.1/32
cumulus@leaf01:~$ nv set interface swp1,swp49-52
cumulus@leaf01:~$ nv set interface bond1 bond member swp1
cumulus@leaf01:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf01:~$ nv set interface bond1 bridge domain br_default 
cumulus@leaf01:~$ nv set interface peerlink bond member swp49-50
cumulus@leaf01:~$ nv set system global anycast-mac 44:38:39:FF:00:AA
cumulus@leaf01:~$ nv set mlag backup 10.10.10.2
cumulus@leaf01:~$ nv set mlag peer-ip linklocal
cumulus@leaf01:~$ nv set interface vlan10 
cumulus@leaf01:~$ nv set interface vlan20
cumulus@leaf01:~$ nv set bridge domain br_default vlan 10,20
cumulus@leaf01:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf01:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf01:~$ nv set nve vxlan mlag shared-address 10.0.1.12
cumulus@leaf01:~$ nv set router bgp autonomous-system 65101
cumulus@leaf01:~$ nv set router bgp router-id 10.10.10.1
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 remote-as external
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp52 remote-as external
cumulus@leaf01:~$ nv set vrf default router bgp neighbor peerlink.4094 remote-as external
cumulus@leaf01:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.1/32
cumulus@leaf01:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@leaf01:~$ nv set evpn enable on
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 address-family l2vpn-evpn enable on
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp52 address-family l2vpn-evpn enable on
cumulus@leaf01:~$ nv set vrf default router bgp neighbor peerlink.4094 address-family l2vpn-evpn enable on
cumulus@leaf01:~$ nv config apply
cumulus@leaf02:~$ nv set interface lo ip address 10.10.10.2/32
cumulus@leaf02:~$ nv set interface swp1,swp49-52
cumulus@leaf02:~$ nv set interface bond1 bond member swp1
cumulus@leaf02:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf02:~$ nv set interface bond1 bridge domain br_default 
cumulus@leaf02:~$ nv set interface peerlink bond member swp49-50
cumulus@leaf02:~$ nv set system global anycast-mac 44:38:39:FF:00:AA
cumulus@leaf02:~$ nv set mlag backup 10.10.10.1
cumulus@leaf02:~$ nv set mlag peer-ip linklocal
cumulus@leaf02:~$ nv set interface vlan10 
cumulus@leaf02:~$ nv set interface vlan20
cumulus@leaf02:~$ nv set bridge domain br_default vlan 10,20
cumulus@leaf02:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf02:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf02:~$ nv set nve vxlan mlag shared-address 10.0.1.12
cumulus@leaf02:~$ nv set router bgp autonomous-system 65102
cumulus@leaf02:~$ nv set router bgp router-id 10.10.10.2
cumulus@leaf02:~$ nv set vrf default router bgp neighbor swp51 remote-as external
cumulus@leaf02:~$ nv set vrf default router bgp neighbor swp52 remote-as external
cumulus@leaf02:~$ nv set vrf default router bgp neighbor peerlink.4094 remote-as external
cumulus@leaf02:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.2/32
cumulus@leaf02:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@leaf02:~$ nv set evpn enable on
cumulus@leaf02:~$ nv set vrf default router bgp neighbor swp51 address-family l2vpn-evpn enable on
cumulus@leaf02:~$ nv set vrf default router bgp neighbor swp52 address-family l2vpn-evpn enable on
cumulus@leaf02:~$ nv set vrf default router bgp neighbor peerlink.4094 address-family l2vpn-evpn enable on
cumulus@leaf02:~$ nv config apply
cumulus@leaf03:~$ nv set interface lo ip address 10.10.10.3/32
cumulus@leaf03:~$ nv set interface swp1,swp49-52
cumulus@leaf03:~$ nv set interface bond1 bond member swp1
cumulus@leaf03:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf03:~$ nv set interface bond1 bridge domain br_default 
cumulus@leaf03:~$ nv set interface peerlink bond member swp49-50
cumulus@leaf03:~$ nv set system global anycast-mac 44:38:39:FF:00:BB
cumulus@leaf03:~$ nv set mlag backup 10.10.10.4
cumulus@leaf03:~$ nv set mlag peer-ip linklocal
cumulus@leaf03:~$ nv set interface vlan10 
cumulus@leaf03:~$ nv set interface vlan20
cumulus@leaf03:~$ nv set bridge domain br_default vlan 10,20
cumulus@leaf03:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf03:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf03:~$ nv set nve vxlan mlag shared-address 10.0.1.34
cumulus@leaf03:~$ nv set router bgp autonomous-system 65103
cumulus@leaf03:~$ nv set router bgp router-id 10.10.10.3
cumulus@leaf03:~$ nv set vrf default router bgp neighbor swp51 remote-as external
cumulus@leaf03:~$ nv set vrf default router bgp neighbor swp52 remote-as external
cumulus@leaf03:~$ nv set vrf default router bgp neighbor peerlink.4094 remote-as external
cumulus@leaf03:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.3/32
cumulus@leaf03:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@leaf03:~$ nv set evpn enable on
cumulus@leaf03:~$ nv set vrf default router bgp neighbor swp51 address-family l2vpn-evpn enable on
cumulus@leaf03:~$ nv set vrf default router bgp neighbor swp52 address-family l2vpn-evpn enable on
cumulus@leaf03:~$ nv set vrf default router bgp neighbor peerlink.4094 address-family l2vpn-evpn enable on
cumulus@leaf03:~$ nv config apply
cumulus@leaf04:~$ nv set interface lo ip address 10.10.10.4/32
cumulus@leaf04:~$ nv set interface swp1,swp49-52
cumulus@leaf04:~$ nv set interface bond1 bond member swp1
cumulus@leaf04:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf04:~$ nv set interface bond1 bridge domain br_default 
cumulus@leaf04:~$ nv set interface peerlink bond member swp49-50
cumulus@leaf04:~$ nv set system global anycast-mac 44:38:39:FF:00:BB
cumulus@leaf04:~$ nv set mlag backup 10.10.10.3
cumulus@leaf04:~$ nv set mlag peer-ip linklocal
cumulus@leaf04:~$ nv set interface vlan10 
cumulus@leaf04:~$ nv set interface vlan20
cumulus@leaf04:~$ nv set bridge domain br_default vlan 10,20
cumulus@leaf04:~$ nv set bridge domain br_default vlan 10 vni 10
cumulus@leaf04:~$ nv set bridge domain br_default vlan 20 vni 20
cumulus@leaf04:~$ nv set nve vxlan mlag shared-address 10.0.1.34
cumulus@leaf04:~$ nv set router bgp autonomous-system 65104
cumulus@leaf04:~$ nv set router bgp router-id 10.10.10.4
cumulus@leaf04:~$ nv set vrf default router bgp neighbor swp51 remote-as external
cumulus@leaf04:~$ nv set vrf default router bgp neighbor swp52 remote-as external
cumulus@leaf04:~$ nv set vrf default router bgp neighbor peerlink.4094 remote-as external
cumulus@leaf04:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.4/32
cumulus@leaf04:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@leaf04:~$ nv set evpn enable on
cumulus@leaf04:~$ nv set vrf default router bgp neighbor swp51 address-family l2vpn-evpn enable on
cumulus@leaf04:~$ nv set vrf default router bgp neighbor swp52 address-family l2vpn-evpn enable on
cumulus@leaf04:~$ nv set vrf default router bgp neighbor peerlink.4094 address-family l2vpn-evpn enable on
cumulus@leaf04:~$ nv config apply
cumulus@spine01:~$ nv set interface lo ip address 10.10.10.101/32
cumulus@spine01:~$ nv set interface swp1-4
cumulus@spine01:~$ nv set router bgp autonomous-system 65199
cumulus@spine01:~$ nv set router bgp router-id 10.10.10.101
cumulus@spine01:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp1 remote-as external
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp2 remote-as external
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp3 remote-as external
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp4 remote-as external
cumulus@spine01:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp1 address-family l2vpn-evpn enable on
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp2 address-family l2vpn-evpn enable on
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp3 address-family l2vpn-evpn enable on
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp4 address-family l2vpn-evpn enable on
cumulus@spine01:~$ nv config apply
cumulus@spine02:~$ nv set interface lo ip address 10.10.10.102/32
cumulus@spine02:~$ nv set interface swp1-4
cumulus@spine02:~$ nv set router bgp autonomous-system 65199
cumulus@spine02:~$ nv set router bgp router-id 10.10.10.102
cumulus@spine02:~$ nv set vrf default router bgp peer-group underlay remote-as external
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp1 remote-as external
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp2 remote-as external
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp3 remote-as external
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp4 remote-as external
cumulus@spine02:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp1 address-family l2vpn-evpn enable on
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp2 address-family l2vpn-evpn enable on
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp3 address-family l2vpn-evpn enable on
cumulus@spine02:~$ nv set vrf default router bgp neighbor swp4 address-family l2vpn-evpn enable on
cumulus@spine02:~$ nv config apply
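
The leaf commands above differ only in a few per-switch values. As a rough consistency check, the following Python sketch records those values and verifies that each MLAG pair shares the same VXLAN shared address and anycast MAC address. It also checks that each backup IP address is the loopback address of the peer, which is how this example is set up and not necessarily a general requirement. The data structure and the function name pair_problems are hypothetical.

```python
# Per-switch values copied from the NVUE commands above.
LEAVES = {
    "leaf01": {"loopback": "10.10.10.1", "backup": "10.10.10.2",
               "shared_address": "10.0.1.12", "anycast_mac": "44:38:39:FF:00:AA"},
    "leaf02": {"loopback": "10.10.10.2", "backup": "10.10.10.1",
               "shared_address": "10.0.1.12", "anycast_mac": "44:38:39:FF:00:AA"},
    "leaf03": {"loopback": "10.10.10.3", "backup": "10.10.10.4",
               "shared_address": "10.0.1.34", "anycast_mac": "44:38:39:FF:00:BB"},
    "leaf04": {"loopback": "10.10.10.4", "backup": "10.10.10.3",
               "shared_address": "10.0.1.34", "anycast_mac": "44:38:39:FF:00:BB"},
}


def pair_problems(leaves, a, b):
    """Return a list of mismatches between two switches that form an MLAG pair."""
    problems = []
    for key in ("shared_address", "anycast_mac"):
        if leaves[a][key].lower() != leaves[b][key].lower():
            problems.append(f"{a}/{b}: {key} differs")
    for local, peer in ((a, b), (b, a)):
        if leaves[local]["backup"] != leaves[peer]["loopback"]:
            problems.append(f"{local}: backup is not the loopback of {peer}")
    return problems


for a, b in (("leaf01", "leaf02"), ("leaf03", "leaf04")):
    print(a, b, pair_problems(LEAVES, a, b) or "consistent")
```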
cumulus@leaf01:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
    evpn:
      enable: on
    interface:
      bond1:
        bond:
          member:
            swp1: {}
          mlag:
            enable: on
            id: 1
        bridge:
          domain:
            br_default: {}
        type: bond
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.1/32: {}
        type: loopback
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        base-interface: peerlink
        type: sub
        vlan: 4094
      swp1:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      vlan10:
        type: svi
        vlan: 10
      vlan20:
        type: svi
        vlan: 20
    mlag:
      backup:
        10.10.10.2: {}
      enable: on
      peer-ip: linklocal
    nve:
      vxlan:
        enable: on
        mlag:
          shared-address: 10.0.1.12
    router:
      bgp:
        autonomous-system: 65101
        enable: on
        router-id: 10.10.10.1
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$oUZN3YNn0KEqb9JM$bR.wk.hti5DfDJg08Pvy4O3mp8Dn1zuaaGK/uRNoXpEpOUNdHdAvR5i5zb3uwP4uPYYAUx8ofd64TmRcUespA0
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        anycast-mac: 44:38:39:FF:00:AA
        fabric-mac: 00:00:5E:00:01:01
        system-mac: 44:38:39:22:01:7a
      hostname: leaf01
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                network:
                  10.10.10.1/32: {}
                redistribute:
                  connected:
                    enable: on
              l2vpn-evpn:
                enable: on
            enable: on
            neighbor:
              peerlink.4094:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
                type: unnumbered
              swp51:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
                type: unnumbered
              swp52:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
                type: unnumbered
cumulus@leaf02:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
    evpn:
      enable: on
    interface:
      bond1:
        bond:
          member:
            swp1: {}
          mlag:
            enable: on
            id: 1
        bridge:
          domain:
            br_default: {}
        type: bond
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.2/32: {}
        type: loopback
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        base-interface: peerlink
        type: sub
        vlan: 4094
      swp1:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      vlan10:
        type: svi
        vlan: 10
      vlan20:
        type: svi
        vlan: 20
    mlag:
      backup:
        10.10.10.1: {}
      enable: on
      peer-ip: linklocal
    nve:
      vxlan:
        enable: on
        mlag:
          shared-address: 10.0.1.12
    router:
      bgp:
        autonomous-system: 65102
        enable: on
        router-id: 10.10.10.2
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$LgnUK2KofdPm7n6m$gKVSvoCLGfp6NFtIzIFYNc0IT7SRjvvjJfAONmUjFrN1H7VdxnlJHnyPXivQIq.I6QoOHT2o/buwAjYI5I4rt0
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        anycast-mac: 44:38:39:FF:00:AA
        fabric-mac: 00:00:5E:00:01:01
        system-mac: 44:38:39:22:01:78
      hostname: leaf02
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                network:
                  10.10.10.2/32: {}
                redistribute:
                  connected:
                    enable: on
              l2vpn-evpn:
                enable: on
            enable: on
            neighbor:
              peerlink.4094:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
                type: unnumbered
              swp51:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
                type: unnumbered
              swp52:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
                type: unnumbered
cumulus@leaf03:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
    evpn:
      enable: on
    interface:
      bond1:
        bond:
          member:
            swp1: {}
          mlag:
            enable: on
            id: 1
        bridge:
          domain:
            br_default: {}
        type: bond
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.3/32: {}
        type: loopback
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        base-interface: peerlink
        type: sub
        vlan: 4094
      swp1:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      vlan10:
        type: svi
        vlan: 10
      vlan20:
        type: svi
        vlan: 20
    mlag:
      backup:
        10.10.10.4: {}
      enable: on
      peer-ip: linklocal
    nve:
      vxlan:
        enable: on
        mlag:
          shared-address: 10.0.1.34
    router:
      bgp:
        autonomous-system: 65103
        enable: on
        router-id: 10.10.10.3
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$s7Z8L4oTOtEMFyO1$Y2PG.Y/DxxOCULiPBwf2IbgxGoz7YVeiqNAgBfv2gR3Ey9zbXNjiVFwXINkUfHkEBEYec2FPus9s/93szZ13L.
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        anycast-mac: 44:38:39:FF:00:BB
        fabric-mac: 00:00:5E:00:01:01
        system-mac: 44:38:39:22:01:84
      hostname: leaf03
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                network:
                  10.10.10.3/32: {}
                redistribute:
                  connected:
                    enable: on
              l2vpn-evpn:
                enable: on
            enable: on
            neighbor:
              peerlink.4094:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
                type: unnumbered
              swp51:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
                type: unnumbered
              swp52:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
                type: unnumbered
cumulus@leaf04:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    bridge:
      domain:
        br_default:
          vlan:
            '10':
              vni:
                '10': {}
            '20':
              vni:
                '20': {}
    evpn:
      enable: on
    interface:
      bond1:
        bond:
          member:
            swp1: {}
          mlag:
            enable: on
            id: 1
        bridge:
          domain:
            br_default: {}
        type: bond
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.4/32: {}
        type: loopback
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        base-interface: peerlink
        type: sub
        vlan: 4094
      swp1:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      vlan10:
        type: svi
        vlan: 10
      vlan20:
        type: svi
        vlan: 20
    mlag:
      backup:
        10.10.10.3: {}
      enable: on
      peer-ip: linklocal
    nve:
      vxlan:
        enable: on
        mlag:
          shared-address: 10.0.1.34
    router:
      bgp:
        autonomous-system: 65104
        enable: on
        router-id: 10.10.10.4
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$R3sLiogPvZYI5cUo$8EJcDFHAabnAmNb2XWBS85LtjNpisAvWxwZ1Q4u3Ufiv2T4nEc7TwpYqdKYg5Yl/x7Bn2XbZKeFZ6GpvQ1nmj.
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        anycast-mac: 44:38:39:FF:00:BB
        fabric-mac: 00:00:5E:00:01:01
        system-mac: 44:38:39:22:01:8a
      hostname: leaf04
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                network:
                  10.10.10.4/32: {}
                redistribute:
                  connected:
                    enable: on
              l2vpn-evpn:
                enable: on
            enable: on
            neighbor:
              peerlink.4094:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
                type: unnumbered
              swp51:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
                type: unnumbered
              swp52:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
                type: unnumbered
cumulus@spine01:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    evpn:
      enable: on
    interface:
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.101/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp4:
        type: swp
    nve:
      vxlan:
        enable: on
    router:
      bgp:
        autonomous-system: 65199
        enable: on
        router-id: 10.10.10.101
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$n0CbAqxMRKBnnQKP$zXodkw5uKNjvpRgJJyYJbPfzeQjhYaIbVqpBgtLWrT5F/m6mgML0ghwjfFaqsqdPd4vFHGfuF66VVZrfmYeAm.
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        fabric-mac: 00:00:5E:00:01:01
        system-mac: 44:38:39:22:01:82
      hostname: spine01
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
              l2vpn-evpn:
                enable: on
            enable: on
            neighbor:
              swp1:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
                type: unnumbered
              swp2:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
                type: unnumbered
              swp3:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
                type: unnumbered
              swp4:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
                type: unnumbered
            peer-group:
              underlay:
                remote-as: external
cumulus@spine02:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    interface:
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.102/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp4:
        type: swp
    router:
      bgp:
        autonomous-system: 65199
        enable: on
        router-id: 10.10.10.102
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$7LHHn9oEA0i/Zzdw$wIgRjxG/bC7hLyJYhxkxco9wVWpJr6/z1LVQAEjN9Y2tqpzHVZhpYOzGyJ43Ht3VJlAwmj3yLwo.s9lESPA.b0
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        fabric-mac: 00:00:5E:00:01:01
        system-mac: 44:38:39:22:01:92
      hostname: spine02
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                redistribute:
                  connected:
                    enable: on
              l2vpn-evpn:
                enable: on
            enable: on
            neighbor:
              swp1:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
                type: unnumbered
              swp2:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
                type: unnumbered
              swp3:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
                type: unnumbered
              swp4:
                address-family:
                  l2vpn-evpn:
                    enable: on
                remote-as: external
                type: unnumbered
            peer-group:
              underlay:
                remote-as: external
cumulus@leaf01:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.1/32
    clagd-vxlan-anycast-ip 10.0.1.12
    vxlan-local-tunnelip 10.10.10.1
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto bond1
iface bond1
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 1
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-backup-ip 10.10.10.2
    clagd-sys-mac 44:38:39:FF:00:AA
    clagd-args --initDelay 180
auto vlan10
iface vlan10
    hwaddress 44:38:39:22:01:af
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    hwaddress 44:38:39:22:01:af
    vlan-raw-device br_default
    vlan-id 20
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond1 peerlink vxlan48
    hwaddress 44:38:39:22:01:af
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1
cumulus@leaf02:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.2/32
    clagd-vxlan-anycast-ip 10.0.1.12
    vxlan-local-tunnelip 10.10.10.2
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto bond1
iface bond1
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 1
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-backup-ip 10.10.10.1
    clagd-sys-mac 44:38:39:FF:00:AA
    clagd-args --initDelay 180
auto vlan10
iface vlan10
    hwaddress 44:38:39:22:01:af
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    hwaddress 44:38:39:22:01:af
    vlan-raw-device br_default
    vlan-id 20
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond1 peerlink vxlan48
    hwaddress 44:38:39:22:01:af
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1
cumulus@leaf03:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.3/32
    clagd-vxlan-anycast-ip 10.0.1.34
    vxlan-local-tunnelip 10.10.10.3
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto bond1
iface bond1
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 1
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-backup-ip 10.10.10.4
    clagd-sys-mac 44:38:39:FF:00:BB
    clagd-args --initDelay 180
auto vlan10
iface vlan10
    hwaddress 44:38:39:22:01:bb
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    hwaddress 44:38:39:22:01:bb
    vlan-raw-device br_default
    vlan-id 20
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond1 peerlink vxlan48
    hwaddress 44:38:39:22:01:bb
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1
cumulus@leaf04:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.4/32
    clagd-vxlan-anycast-ip 10.0.1.34
    vxlan-local-tunnelip 10.10.10.4
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto bond1
iface bond1
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 1
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-backup-ip 10.10.10.3
    clagd-sys-mac 44:38:39:FF:00:BB
    clagd-args --initDelay 180
auto vlan10
iface vlan10
    hwaddress 44:38:39:22:01:c1
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    hwaddress 44:38:39:22:01:c1
    vlan-raw-device br_default
    vlan-id 20
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 10=10 20=20
    bridge-learning off
auto br_default
iface br_default
    bridge-ports bond1 peerlink vxlan48
    hwaddress 44:38:39:22:01:c1
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1
cumulus@spine01:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.101/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp4
iface swp4
cumulus@spine02:~$ sudo cat /etc/network/interfaces
...
auto lo
iface lo inet loopback
    address 10.10.10.102/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp4
iface swp4
auto lo
iface lo inet loopback

auto lo
iface lo inet static
    address 10.0.0.31/32

auto eth0
iface eth0 inet dhcp

auto eth1
iface eth1

auto eth2
iface eth2

auto bond1
iface bond1
    bond-slaves eth1 eth2
    bond-miimon 100
    bond-min-links 1
    bond-mode 802.3ad
    bond-xmit-hash-policy layer3+4
    bond-lacp-rate 1

auto bond1.10
iface bond1.10
    address 172.16.10.101/24

auto bond1.20
iface bond1.20
    address 172.16.20.101/24

auto lo
iface lo inet loopback

auto lo
iface lo inet static
    address 10.0.0.33/32

auto eth0
iface eth0 inet dhcp

auto eth1
iface eth1

auto eth2
iface eth2

auto bond1
iface bond1
    bond-slaves eth1 eth2
    bond-miimon 100
    bond-min-links 1
    bond-mode 802.3ad
    bond-xmit-hash-policy layer3+4
    bond-lacp-rate 1

auto bond1.10
iface bond1.10
    address 172.16.10.103/24

auto bond1.20
iface bond1.20
    address 172.16.20.103/24

cumulus@leaf01:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65101 vrf default
bgp router-id 10.10.10.1
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor peerlink.4094 interface remote-as external
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 capability extended-nexthop
neighbor swp51 interface remote-as external
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface remote-as external
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
! Address families
address-family ipv4 unicast
network 10.10.10.1/32
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
exit-address-family
! end of router bgp 65101 vrf default
cumulus@leaf02:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65102 vrf default
bgp router-id 10.10.10.2
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor peerlink.4094 interface remote-as external
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 capability extended-nexthop
neighbor swp51 interface remote-as external
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface remote-as external
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
! Address families
address-family ipv4 unicast
network 10.10.10.2/32
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
exit-address-family
! end of router bgp 65102 vrf default
cumulus@leaf03:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65103 vrf default
bgp router-id 10.10.10.3
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor peerlink.4094 interface remote-as external
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 capability extended-nexthop
neighbor swp51 interface remote-as external
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface remote-as external
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
! Address families
address-family ipv4 unicast
network 10.10.10.3/32
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
exit-address-family
! end of router bgp 65103 vrf default
cumulus@leaf04:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65104 vrf default
bgp router-id 10.10.10.4
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor peerlink.4094 interface remote-as external
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 capability extended-nexthop
neighbor swp51 interface remote-as external
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface remote-as external
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
! Address families
address-family ipv4 unicast
network 10.10.10.4/32
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
exit-address-family
address-family l2vpn evpn
advertise-all-vni
neighbor peerlink.4094 activate
neighbor swp51 activate
neighbor swp52 activate
exit-address-family
! end of router bgp 65104 vrf default
cumulus@spine01:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65199 vrf default
bgp router-id 10.10.10.101
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor swp1 interface remote-as external
neighbor swp1 timers 3 9
neighbor swp1 timers connect 10
neighbor swp1 advertisement-interval 0
neighbor swp1 capability extended-nexthop
neighbor swp2 interface remote-as external
neighbor swp2 timers 3 9
neighbor swp2 timers connect 10
neighbor swp2 advertisement-interval 0
neighbor swp2 capability extended-nexthop
neighbor swp3 interface remote-as external
neighbor swp3 timers 3 9
neighbor swp3 timers connect 10
neighbor swp3 advertisement-interval 0
neighbor swp3 capability extended-nexthop
neighbor swp4 interface remote-as external
neighbor swp4 timers 3 9
neighbor swp4 timers connect 10
neighbor swp4 advertisement-interval 0
neighbor swp4 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
exit-address-family
! end of router bgp 65199 vrf default
cumulus@spine02:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65199 vrf default
bgp router-id 10.10.10.102
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor underlay peer-group
neighbor underlay remote-as external
neighbor underlay timers 3 9
neighbor underlay timers connect 10
neighbor underlay advertisement-interval 0
no neighbor underlay capability extended-nexthop
neighbor swp1 interface remote-as external
neighbor swp1 timers 3 9
neighbor swp1 timers connect 10
neighbor swp1 advertisement-interval 0
neighbor swp1 capability extended-nexthop
neighbor swp2 interface remote-as external
neighbor swp2 timers 3 9
neighbor swp2 timers connect 10
neighbor swp2 advertisement-interval 0
neighbor swp2 capability extended-nexthop
neighbor swp3 interface remote-as external
neighbor swp3 timers 3 9
neighbor swp3 timers connect 10
neighbor swp3 advertisement-interval 0
neighbor swp3 capability extended-nexthop
neighbor swp4 interface remote-as external
neighbor swp4 timers 3 9
neighbor swp4 timers connect 10
neighbor swp4 advertisement-interval 0
neighbor swp4 capability extended-nexthop
! Address families
address-family ipv4 unicast
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
neighbor underlay activate
exit-address-family
address-family l2vpn evpn
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
exit-address-family
! end of router bgp 65199 vrf default

This simulation is running Cumulus Linux 5.11. The Cumulus Linux 5.12 simulation is coming soon.

The simulation is pre-configured using NVUE commands.

To validate the configuration, run the commands shown in the troubleshooting section above.

For a full EVPN symmetric active-active configuration example, see Configuration Examples.

Bridge Layer 2 Protocol Tunneling

A VXLAN connects layer 2 domains across a layer 3 fabric; however, layer 2 protocol packets, such as LLDP, LACP, STP, and CDP, stop at the ingress VTEP. If you want the VXLAN to behave more like a wire or hub, where the switch tunnels protocol packets instead of terminating them locally, you can enable bridge layer 2 protocol tunneling.
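
As a rough illustration of which frames are involved, the following Python sketch classifies a frame by its destination MAC address. The addresses are the well-known ones for these protocols. The function name, and the idea of matching on destination MAC alone, are assumptions made for illustration; they do not describe how the switch identifies protocol packets.

```python
from typing import Optional

# Well-known destination MAC addresses for the protocols named above.
L2_PROTOCOL_DMACS = {
    "01:80:c2:00:00:00": "stp",   # bridge group address
    "01:80:c2:00:00:02": "lacp",  # slow protocols address, shared with other slow protocols
    "01:80:c2:00:00:0e": "lldp",  # nearest bridge address
    "01:00:0c:cc:cc:cc": "cdp",   # Cisco multicast address, shared with other Cisco protocols
}


def classify_l2_protocol(dst_mac: str) -> Optional[str]:
    """Return the protocol name for a destination MAC, or None for other traffic."""
    return L2_PROTOCOL_DMACS.get(dst_mac.strip().lower())
```

A frame addressed to one of these MACs is the kind of packet that stops at the ingress VTEP unless you enable tunneling.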

Configure Bridge Layer 2 Protocol Tunneling

To configure bridge layer 2 protocol tunneling for all protocols:

Cumulus Linux does not provide NVUE commands for this setting.

Add bridge-l2protocol-tunnel all to the interface stanza and the VNI stanza of the /etc/network/interfaces file:

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto swp1
iface swp1
    bridge-access 10
    bridge-l2protocol-tunnel all

auto swp2
iface swp2

auto swp3
iface swp3

auto swp4
iface swp4
...
auto vni10
iface vni10
    bridge-access 10
    bridge-l2protocol-tunnel all
    bridge-learning off
    mstpctl-bpduguard yes
    mstpctl-portbpdufilter yes
    vxlan-id 10
    vxlan-local-tunnelip 10.10.10.1

To configure bridge layer 2 protocol tunneling for a specific protocol, such as LACP:

Cumulus Linux does not provide NVUE commands for this configuration.

Add bridge-l2protocol-tunnel <protocol> to the interface stanza and the VNI stanza of the /etc/network/interfaces file:

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto swp1
iface swp1
    bridge-access 10
    bridge-l2protocol-tunnel lacp

auto swp2
iface swp2

auto swp3
iface swp3

auto swp4
iface swp4
...
auto vni10
iface vni10
    bridge-access 10
    bridge-l2protocol-tunnel lacp
    bridge-learning off
    mstpctl-bpduguard yes
    mstpctl-portbpdufilter yes
    vxlan-id 10
    vxlan-local-tunnelip 10.10.10.1

You must enable layer 2 protocol tunneling on the VXLAN link in addition to the interface so that the packets get bridged and forwarded correctly.
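
One way to catch a missing setting before you reload the configuration is to scan the file for the option in both stanzas. The Python sketch below is a minimal example that works on the text of an interfaces file and assumes standard `iface <name>` stanzas. The function names and the sample text are hypothetical and not part of Cumulus Linux.

```python
from typing import Dict


def tunnel_settings(interfaces_text: str) -> Dict[str, str]:
    """Map each iface stanza name to its bridge-l2protocol-tunnel value."""
    settings: Dict[str, str] = {}
    current = None
    for raw in interfaces_text.splitlines():
        words = raw.split()
        if not words or words[0].startswith("#"):
            continue  # skip blank lines and comments
        if words[0] == "iface" and len(words) > 1:
            current = words[1]  # a new stanza starts here
        elif words[0] == "auto":
            current = None  # the previous stanza has ended
        elif words[0] == "bridge-l2protocol-tunnel" and current is not None:
            settings[current] = " ".join(words[1:])
    return settings


def tunnel_enabled_on_both(interfaces_text: str, port: str, vni: str) -> bool:
    """True if both the port stanza and the VNI stanza enable tunneling."""
    found = tunnel_settings(interfaces_text)
    return port in found and vni in found


# Hypothetical sample that mirrors the LACP example above.
SAMPLE = """\
auto swp1
iface swp1
    bridge-access 10
    bridge-l2protocol-tunnel lacp

auto vni10
iface vni10
    bridge-access 10
    bridge-l2protocol-tunnel lacp
    vxlan-id 10
"""
```

Calling `tunnel_enabled_on_both(SAMPLE, "swp1", "vni10")` returns `True`; removing the option from either stanza makes it return `False`.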

LLDP Example

Here is an example for Link Layer Discovery Protocol (LLDP). You can verify the configuration with the lldpcli command:

cumulus@switch:~$ sudo lldpcli show neighbors
-------------------------------------------------------------------------------
LLDP neighbors:
-------------------------------------------------------------------------------
Interface: swp23, via LLDP, RID: 13, Time: 0 day, 00:58:20
  Chassis:
    ChassisID: mac e4:1d:2d:f7:d5:52
    SysName: H1
    MgmtIP: 10.0.2.207
    MgmtIP: fe80::e61d:2dff:fef7:d552
    Capability: Bridge, off
    Capability: Router, on
  Port:
    PortID: ifname swp14
    PortDesc: swp14
    TTL: 120
    PMD autoneg: support: yes, enabled: yes
      Adv: 1000Base-T, HD: no, FD: yes
      MAU oper type: 40GbaseCR4 - 40GBASE-R PCS/PMA over 4 lane shielded copper balanced cable
...

LACP Example

H2 bond0:
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer 3+4(1)

802.3ad: info
LACP rate: fast
Min links: 1
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: cc:37:ab:e7:b5:7e
Active Aggregator Info:
    Aggregator ID: 1
    Number of ports: 2

Slave Interface: eth0
...
details partner lacp pdu:
    system priority: 65535
    system MAC address: 44:38:39:00:a4:95
...
Slave Interface: eth1
...
details partner lacp pdu:
    system priority: 65535
    system MAC address: 44:38:39:00:a4:95

Pseudowire Example

In this example:

Considerations

Use caution when enabling bridge layer 2 protocol tunneling:

VXLAN Tunnel DSCP Operations

Cumulus Linux provides configuration options to control DSCP operations during VXLAN encapsulation and decapsulation, specifically for solutions that require end-to-end quality of service, such as RDMA over Converged Ethernet.

The configuration options propagate ECN between the underlay and overlay according to RFC 6040, which describes how to construct the ECN field of the IP header on both ingress to and egress from an IP-in-IP tunnel.
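
To make the RFC 6040 rules concrete, the following Python sketch models the ECN field at tunnel ingress (normal mode) and at tunnel egress. It illustrates the RFC behavior only. The function names are hypothetical, and the sketch assumes nothing about how the switch hardware implements the rules.

```python
from typing import Optional

# ECN codepoints (the two low-order bits of the ToS / Traffic Class byte).
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


def encap_outer_ecn(inner: int) -> int:
    """RFC 6040 normal mode: the outer header copies the inner ECN field."""
    return inner


def decap_inner_ecn(inner: int, outer: int) -> Optional[int]:
    """Return the ECN field of the forwarded packet, or None if it is dropped."""
    if outer == CE:
        # Congestion marked in the underlay: propagate it, unless the inner
        # packet is not ECN-capable, in which case the packet is dropped.
        return None if inner == NOT_ECT else CE
    if inner == ECT0 and outer == ECT1:
        return ECT1  # an ECT(1) outer marking overrides an ECT(0) inner marking
    return inner  # in all other cases, the inner field is kept
```

For example, `decap_inner_ecn(ECT0, CE)` returns `CE`, so a congestion mark set in the underlay reaches the overlay receiver.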

Configure DSCP Operations

You can set the following DSCP operations:

The following example sets the VXLAN encapsulation DSCP action to copy.

cumulus@switch:~$ nv set nve vxlan encapsulation dscp action copy
cumulus@switch:~$ nv config apply

The following example sets the VXLAN encapsulation DSCP value to 16.

cumulus@switch:~$ nv set nve vxlan encapsulation dscp action set
cumulus@switch:~$ nv set nve vxlan encapsulation dscp value 16
cumulus@switch:~$ nv config apply
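
For reference when you read packet captures, the DSCP value occupies the upper six bits of the IPv4 ToS (IPv6 Traffic Class) byte and ECN occupies the lower two bits. The Python sketch below shows the arithmetic for the value 16 used above; the function names are hypothetical.

```python
from typing import Tuple


def tos_byte(dscp: int, ecn: int = 0) -> int:
    """Combine a 6-bit DSCP value and a 2-bit ECN value into one header byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0-63")
    if not 0 <= ecn <= 3:
        raise ValueError("ECN must be in the range 0-3")
    return (dscp << 2) | ecn


def split_tos(tos: int) -> Tuple[int, int]:
    """Split a ToS / Traffic Class byte into (dscp, ecn)."""
    return tos >> 2, tos & 0b11
```

With these helpers, `tos_byte(16)` is `0x40`, so an outer header that carries DSCP 16 and no ECN marking shows a ToS byte of 0x40 in a capture.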

The following example sets the VXLAN decapsulation DSCP action to preserve.

cumulus@switch:~$ nv set nve vxlan decapsulation dscp action preserve
cumulus@switch:~$ nv config apply

Edit the /etc/cumulus/switchd.conf file, then reload switchd.

The following example sets the VXLAN encapsulation DSCP action to copy.

cumulus@switch:~$ sudo nano /etc/cumulus/switchd.conf
...
# vxlan encapsulation update
vxlan.def_encap_dscp_action = copy
# default vxlan encap dscp value, only applicable if action is 'set'
#vxlan.def_encap_dscp_value =

# vxlan decapsulation update
#vxlan.def_decap_dscp_action = derive

The following example sets the VXLAN encapsulation DSCP value to 16.

cumulus@switch:~$ sudo nano /etc/cumulus/switchd.conf
...
# vxlan encapsulation update
vxlan.def_encap_dscp_action = set
# default vxlan encap dscp value, only applicable if action is 'set'
vxlan.def_encap_dscp_value = 16

# vxlan decapsulation update
#vxlan.def_decap_dscp_action = derive

The following example sets the VXLAN decapsulation DSCP action to preserve.

cumulus@switch:~$ sudo nano /etc/cumulus/switchd.conf
...
# vxlan encapsulation update
#vxlan.def_encap_dscp_action = derive
# default vxlan encap dscp value, only applicable if action is 'set'
#vxlan.def_encap_dscp_value =

# vxlan decapsulation update
vxlan.def_decap_dscp_action = preserve
...

After you modify the /etc/cumulus/switchd.conf file, you must reload switchd with the sudo systemctl reload switchd.service command.
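
The switchd.conf settings shown above can be sketched as a small helper that renders the encapsulation lines and rejects invalid combinations. This is only an illustration: the function name is hypothetical, the action names are limited to those that appear in the examples in this section (the switch might accept others), and the 0 to 63 range reflects the 6-bit DSCP field.

```python
# Hypothetical helper, not part of Cumulus Linux.
# Action names are taken from the switchd.conf examples in this section.
ENCAP_ACTIONS = {"copy", "set", "derive"}


def encap_dscp_lines(action, value=None):
    """Return the switchd.conf lines for a VXLAN encapsulation DSCP setting."""
    if action not in ENCAP_ACTIONS:
        raise ValueError(f"unknown encapsulation action: {action}")
    lines = [f"vxlan.def_encap_dscp_action = {action}"]
    if action == "set":
        # The DSCP value is only applicable if the action is 'set'.
        if value is None or not 0 <= value <= 63:
            raise ValueError("action 'set' needs a DSCP value between 0 and 63")
        lines.append(f"vxlan.def_encap_dscp_value = {value}")
    elif value is not None:
        raise ValueError("a DSCP value is only applicable if the action is 'set'")
    return lines


if __name__ == "__main__":
    print("\n".join(encap_dscp_lines("set", 16)))
```

The sketch only builds text; you still edit the file and reload switchd as described above.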

Show the DSCP Setting

To show the VXLAN encapsulation DSCP setting, run the nv show nve vxlan encapsulation dscp command:

cumulus@switch:~$ nv show nve vxlan encapsulation dscp 
       operational  applied
------  -----------  -------
action  copy         copy

To show the VXLAN decapsulation DSCP setting, run the nv show nve vxlan decapsulation dscp command.

cumulus@switch:~$ nv show nve vxlan decapsulation dscp
        operational  applied
------  -----------  --------
action  preserve     preserve

Considerations

You can only set the VXLAN encapsulation and decapsulation DSCP actions globally. Cumulus Linux does not support per-VXLAN or per-tunnel settings.

QinQ and VXLANs

QinQ is an amendment to the IEEE 802.1Q specification that enables you to insert multiple VLAN tags into a single Ethernet frame.

QinQ with VXLAN is typically used by a service provider who offers multi-tenant layer 2 connectivity between different customer data centers over a virtualized layer 3 provider network. The customer VLANs are transparent to the provider network.

Cumulus Linux supports the standard 802.1ad with a VLAN-aware bridge where you map a customer (S-tag) to a VNI and preserve the inner VLAN (C-tag) inside a VXLAN packet.

Cumulus Linux also supports a special case with a VLAN-unaware bridge where you use the (S-tag, C-tag) tuple for both the forwarding lookup and the mapping to a VNI. Cumulus Linux removes both the S-tag and C-tag during VXLAN encapsulation and refers to this configuration as Double Tag Translation.

You must disable ARP and ND suppression on VXLAN bridges when using QinQ.

802.1ad with a VLAN-aware Bridge

In the standard 802.1ad QinQ model, the customer-facing interface is a QinQ access port and the outer S-tag is the PVID representing the customer. Cumulus Linux translates the S-tag to a VXLAN VNI. The inner C-tag is transparent to the provider. It is also possible that the provider has VLAN trunks connected to the same bridge, carrying traffic from different customers on the same port. In this case, the S-tag maps to a VNI. Cumulus Linux removes the S-tag during VXLAN encapsulation and adds it after decapsulation.

An example configuration in VLAN-aware bridge mode looks like this:

You configure two switches: one at the service provider edge that faces the customer (the switch on the left above), and one on the remote provider edge with a VLAN trunk (the switch on the right above).

Remote Provider Edge Switch

For the switch facing the remote provider cloud:

To configure the remote provider switch:

Cumulus Linux does not provide NVUE commands for this configuration.

Edit the /etc/network/interfaces file to add the following configuration:

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 100=1000 200=3000
    bridge-learning off

auto br_default
iface br_default
    bridge-ports swp3 vxlan48
    bridge-vids 100 200
    bridge-vlan-aware yes
    bridge-pvid 1
    bridge-vlan-protocol 802.1ad
...

Run the ifreload -a command to load the new configuration:

cumulus@switch:~$ sudo ifreload -a
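
To make the S-tag to VNI mapping concrete, the following sketch parses a bridge-vlan-vni-map value such as the one in the configuration above into a dictionary. The function name is hypothetical, and the sketch assumes only simple VLAN=VNI pairs separated by spaces; it does not handle any other syntax that ifupdown2 might accept.

```python
def parse_vlan_vni_map(value):
    """Parse 'VLAN=VNI VLAN=VNI ...' into a {vlan: vni} dictionary."""
    mapping = {}
    for pair in value.split():
        vlan_text, vni_text = pair.split("=")
        vlan, vni = int(vlan_text), int(vni_text)
        if not 1 <= vlan <= 4094:
            raise ValueError(f"VLAN ID out of range: {vlan}")
        if not 0 < vni < 2 ** 24:  # the VNI is a 24-bit field
            raise ValueError(f"VNI out of range: {vni}")
        if vlan in mapping:
            raise ValueError(f"VLAN {vlan} is mapped more than once")
        mapping[vlan] = vni
    return mapping


if __name__ == "__main__":
    # S-tag 100 maps to VNI 1000 and S-tag 200 maps to VNI 3000.
    print(parse_vlan_vni_map("100=1000 200=3000"))
```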

Customer-facing Edge Switch

For the switch facing the customer:

To configure the customer-facing switch:

Cumulus Linux does not provide NVUE commands for this configuration.

Edit the /etc/network/interfaces file to add the following configuration:

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto vxlan48
iface vxlan48
    bridge-vlan-vni-map 100=1000 200=3000
    bridge-learning off

auto swp3
iface swp3
    bridge-access 100

auto swp4
iface swp4
    bridge-access 200

auto br_default
iface br_default
    bridge-ports swp3 swp4 vxlan48
    bridge-vids 100 200
    bridge-vlan-aware yes
    bridge-pvid 1
    bridge-vlan-protocol 802.1ad
...

View the Configuration

To verify the bridge QinQ configuration, run the ip -d link show bridge command and check for vlan_protocol 802.1ad in the output:

cumulus@switch:~$ sudo ip -d link show bridge
287: bridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
    link/ether 06:a2:ae:de:e3:43 brd ff:ff:ff:ff:ff:ff promiscuity 0
    bridge forward_delay 1500 hello_time 200 max_age 2000 ageing_time 30000 stp_state 2 priority 32768 vlan_filtering 1 vlan_protocol 802.1ad bridge_id 8000.6:a2:ae:de:e3:43 designated_root 8000.6:a2:ae:de:e3:43 root_port 0 root_path_cost 0 topology_change 0 topology_change_detected 0 hello_timer    0.00 tcn_timer    0.00 topology_change_timer    0.00 gc_timer   64.29 vlan_default_pvid 1 vlan_stats_enabled 1 group_fwd_mask 0 group_address 01:80:c2:00:00:08 mcast_snooping 0 mcast_router 1 mcast_query_use_ifaddr 0 mcast_querier 0 mcast_hash_elasticity 4096 mcast_hash_max 4096 mcast_last_member_count 2 mcast_startup_query_count 2 mcast_last_member_interval 100 mcast_membership_interval 26000 mcast_querier_interval 25500 mcast_query_interval 12500 mcast_query_response_interval 1000 mcast_startup_query_interval 3125 mcast_stats_enabled 1 mcast_igmp_version 2 mcast_mld_version 1 nf_call_iptables 0 nf_call_ip6tables 0 nf_call_arptables 0 addrgenmode eui64

Example Configuration

This example shows a configuration for 802.1ad QinQ in traditional bridge mode on a leaf.

Example /etc/network/interfaces File
auto swp3.11
iface swp3.11
    vlan-protocol 802.1ad

auto vxlan1000101
iface vxlan1000101
    vxlan-id 1000101
    vxlan-local-tunnelip 10.0.0.13

auto br11
iface br11
    bridge-ports swp3.11 vxlan1000101

Double Tag Translation

Double tag translation includes a bridge with double-tagged member interfaces, where a combination of the C-tag and S-tag map to a VNI. You create the configuration only at the edge facing the public cloud. The VXLAN configuration at the customer-facing edge does not need to change.

The double tag is always a cloud connection. The customer-facing edge is either single-tagged or untagged. At the public cloud handoff point, the VNI maps to double VLAN tags, with the S-tag indicating the customer and the C-tag indicating the service.

The configuration in Cumulus Linux uses the outer tag for the customer and the inner tag for the service.

You can use double tag translation:

Double tag translation uses:

To configure a double-tagged interface, stack the VLANs as <port>.<outer tag>.<inner tag>. For example, swp1.100.10, where the outer tag is VLAN 100, which represents the customer, and the inner tag is VLAN 10, which represents the service.
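
As a quick illustration of the <port>.<outer tag>.<inner tag> naming convention, the following sketch splits a double-tagged interface name into its parts. The function name is hypothetical and the code is not part of Cumulus Linux.

```python
def parse_double_tagged(name):
    """Split '<port>.<outer tag>.<inner tag>' into (port, outer, inner)."""
    parts = name.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected <port>.<outer tag>.<inner tag>, got {name!r}")
    port, outer, inner = parts
    return port, int(outer), int(inner)


if __name__ == "__main__":
    port, customer_vlan, service_vlan = parse_double_tagged("swp1.100.10")
    # The outer tag represents the customer; the inner tag represents the service.
    print(port, customer_vlan, service_vlan)
```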

An example configuration:

NVUE does not support double tag translation.

To configure the switch for double tag translation using the above example, edit the /etc/network/interfaces file in a text editor and add the following:

auto swp3.100.10
iface swp3.100.10
    mstpctl-portbpdufilter yes
    mstpctl-bpduguard yes

auto vni1000
iface vni1000
    vxlan-local-tunnelip  10.0.0.1
    mstpctl-portbpdufilter yes
    mstpctl-bpduguard yes
    vxlan-id 1000

auto custA-10-azr
iface custA-10-azr
    bridge-ports swp3.100.10 vni1000
    bridge-vlan-aware no

To check the configuration, run the brctl show command:

cumulus@switch:~$ sudo brctl show
bridge name     bridge id               STP enabled     interfaces
custA-10-azr    8000.00020000004b       yes             swp3.100.10
                                                        vni1000
custB-20-azr    8000.00020000004b       yes             swp3.200.20
                                                        vni3000

Considerations

The Linux kernel limits interface names to 15 characters in length, which can be a problem for QinQ interfaces. To work around this issue, create two VLANs as nested VLAN raw devices, one for the outer tag and one for the inner tag. For example, you cannot create an interface called swp50s0.1001.101 because it contains 16 characters. Instead, edit the /etc/network/interfaces file to create VLANs with IDs 1001 and 101:

cumulus@switch:~$ sudo nano /etc/network/interfaces
...
auto vlan1001
iface vlan1001
    vlan-id 1001
    vlan-raw-device swp50s0

auto vlan1001-101
iface vlan1001-101
    vlan-id 101
    vlan-raw-device vlan1001

auto bridge101
iface bridge101
    bridge-ports vlan1001-101 vxlan1000101
...
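
The 15-character limit described above is easy to check before you edit the interfaces file. The following sketch is a hypothetical helper that tests the names used in this section.

```python
MAX_IFNAME_LEN = 15  # the kernel limit on interface name length described above


def fits_kernel_limit(name):
    """Return True if the interface name is short enough for the Linux kernel."""
    return len(name) <= MAX_IFNAME_LEN


if __name__ == "__main__":
    for name in ("swp50s0.1001.101", "vlan1001", "vlan1001-101", "swp3.100.10"):
        status = "ok" if fits_kernel_limit(name) else "too long"
        print(f"{name}: {len(name)} characters, {status}")
```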

Layer 3

This section describes layer 3 configuration. Read this section to understand routing protocols and learn how to configure routing on the Cumulus Linux switch:

Routing

Network routing is the process of selecting a path across one or more networks. When the switch receives a packet, it reads the packet headers to find out its intended destination. It then determines where to route the packet based on information in its routing tables, which can be static or dynamic.

Cumulus Linux supports both Static Routing, where you enter routes and specify the next hop manually, and dynamic routing such as BGP and OSPF, where you configure a routing protocol on your switch and the routing protocol learns about other routers automatically.

For the number of route table entries supported per platform, see Forwarding Table Size and Profiles.

Static Routing

You can use static routing if you do not require the complexity of a dynamic routing protocol (such as BGP or OSPF), or if you have routes that do not change frequently and for which the destination is only one or two paths away.

With static routing, you configure the switch manually to send traffic with a specific destination prefix to a specific next hop. When the switch receives a packet, it looks up the destination IP address in the routing table and forwards the packet accordingly.
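
The lookup step can be sketched in a few lines of Python. Routers select the most specific (longest) matching prefix, so the sketch below does the same over a small dictionary of static routes. The function name and the default-route next hop 192.0.2.1 are assumptions for illustration only; the sketch is not how the switch hardware or FRR implements the lookup.

```python
import ipaddress


def lookup_next_hop(routes, destination):
    """Return the next hop of the longest matching prefix, or None."""
    dest = ipaddress.ip_address(destination)
    best_net, best_hop = None, None
    for prefix, next_hop in routes.items():
        net = ipaddress.ip_network(prefix)
        if dest in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_hop = net, next_hop
    return best_hop


if __name__ == "__main__":
    static_routes = {
        "10.10.10.101/32": "10.0.1.0",  # a host route to a specific next hop
        "0.0.0.0/0": "192.0.2.1",       # hypothetical default route
    }
    print(lookup_next_hop(static_routes, "10.10.10.101"))
    print(lookup_next_hop(static_routes, "172.16.5.9"))
```

The first lookup matches both entries and selects the /32 route because it is more specific; the second lookup falls back to the default route.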

Configure a Static Route

Cumulus Linux adds static routes to the FRR routing table and then to the kernel routing table.

The following example commands configure Cumulus Linux to send traffic with the destination prefix 10.10.10.101/32 out swp51 (10.0.1.1/31) to the next hop 10.0.1.0.

cumulus@leaf01:~$ nv set interface swp51 ip address 10.0.1.1/31
cumulus@leaf01:~$ nv set vrf default router static 10.10.10.101/32 via 10.0.1.0
cumulus@leaf01:~$ nv config apply

Edit the /etc/network/interfaces file to configure an IP address for the interface on the switch that sends out traffic. For example:

cumulus@leaf01:~$ sudo nano /etc/network/interfaces
...
auto swp51
iface swp51
    address 10.0.1.1/31
...

Run vtysh commands to configure the static route (the destination prefix and next hop). For example:

cumulus@leaf01:~$ sudo vtysh

leaf01# configure terminal
leaf01(config)# ip route 10.10.10.101/32 10.0.1.0
leaf01(config)# exit
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

The vtysh commands save the static route configuration in the /etc/frr/frr.conf file. For example:

...
!
ip route 10.10.10.101/32 10.0.1.0
!
...

The following example commands configure Cumulus Linux to send traffic with the destination prefix 10.10.10.61/32 out swp3 (10.0.0.32/31) to the next hop 10.0.0.33 in vrf BLUE.

cumulus@border01:~$ nv set interface swp3 ip address 10.0.0.32/31
cumulus@border01:~$ nv set interface swp3 ip vrf BLUE
cumulus@border01:~$ nv set vrf BLUE router static 10.10.10.61/32 via 10.0.0.33
cumulus@border01:~$ nv config apply

Edit the /etc/network/interfaces file to configure an IP address for the interface on the switch that sends out traffic. For example:

cumulus@border01:~$ sudo nano /etc/network/interfaces
...
auto swp3
iface swp3
    address 10.0.0.32/31
    vrf BLUE
...

Run vtysh commands to configure the static route (the destination prefix and next hop). For example:

cumulus@border01:~$ sudo vtysh

border01# configure terminal
border01(config)# vrf BLUE
border01(config-vrf)# ip route 10.10.10.61/32 10.0.0.33
border01(config-vrf)# end
border01# write memory
border01# exit
cumulus@border01:mgmt:~$ 

The vtysh commands save the static route configuration in the /etc/frr/frr.conf file. For example:

...
vrf BLUE
 ip route 10.10.10.61/32 10.0.0.33
...

To delete a static route:

cumulus@leaf01:~$ nv unset vrf default router static 10.10.10.101/32 via 10.0.1.0
cumulus@leaf01:~$ nv config apply

Alternatively, run the equivalent vtysh commands:

cumulus@leaf01:~$ sudo vtysh

leaf01# configure terminal
leaf01(config)# no ip route 10.10.10.101/32 10.0.1.0
leaf01(config)# exit
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

To view static routes, run the vtysh show ip route command. For example:

cumulus@leaf01:mgmt:~$ sudo vtysh
leaf01# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, r - rejected route

S>* 10.10.10.101/32 [1/0] via 10.0.1.0, swp51, weight 1, 00:02:07

You can also create a static route by adding the route to a switch port configuration. For example:

Cumulus Linux does not provide NVUE commands for this configuration.

Edit the /etc/network/interfaces file and add the following post-up and post-down lines to the interface stanza:

cumulus@leaf01:~$  sudo nano /etc/network/interfaces
...
auto swp51
iface swp51
    address 10.0.1.1/31
    post-up ip route add 10.10.10.101/32 via 10.0.1.0
    post-down ip route del 10.10.10.101/32 via 10.0.1.0

The ip route command allows you to manipulate the kernel routing table directly from the Linux shell. See man ip(8) for details. FRR monitors the kernel routing table changes and updates its own routing table accordingly.

Configure a Gateway or Default Route

On each switch, consider creating a gateway or default route for traffic destined outside the switch’s subnet or local network. All such traffic passes through the gateway, which is a system on the same network that routes packets to their destination beyond the local network.

The following example configures the default route 0.0.0.0/0, which matches any destination IP address, so the switch sends traffic that has no more specific route to the gateway. The gateway is another switch with the IP address 10.0.1.0.

cumulus@leaf01:~$ nv set vrf default router static 0.0.0.0/0 via 10.0.1.0
cumulus@leaf01:~$ nv config apply

Instead of 0.0.0.0/0, you can specify default or default6.

cumulus@leaf01:~$ sudo vtysh

leaf01# configure terminal
leaf01(config)# ip route 0.0.0.0/0 10.0.1.0
leaf01(config)# exit
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

...
!
ip route 0.0.0.0/0 10.0.1.0
!
...

The default route created by the gateway parameter in ifupdown2 does not install in FRR and does not redistribute into other routing protocols. See ifupdown2 and the gateway Parameter for more information.

Considerations

Deleting Routes through the Linux Shell

To avoid incorrect routing, do not use the Linux shell to delete static routes that you added with vtysh commands. Delete the routes with the vtysh commands.

IPv4 and IPv6 Neighbor Cache Aging Timer

Cumulus Linux does not support different neighbor cache aging timer settings for IPv4 and IPv6.

The net.ipv4.neigh.default.base_reachable_time_ms and net.ipv6.neigh.default.base_reachable_time_ms settings in the /etc/sysctl.d/neigh.conf file must have the same value:

cumulus@leaf01:~$ sudo cat /etc/sysctl.d/neigh.conf
...
net.ipv4.neigh.default.base_reachable_time_ms=1080000
net.ipv6.neigh.default.base_reachable_time_ms=1080000
...
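
The following sketch shows one way to confirm that the two settings carry the same value. It parses the text of a neigh.conf-style file; the function names are hypothetical and the sample text mirrors the output above.

```python
SUFFIX = "neigh.default.base_reachable_time_ms"


def reachable_times(conf_text):
    """Collect the base_reachable_time_ms settings from sysctl-style text."""
    values = {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key.endswith(SUFFIX):
            values[key] = int(value)
    return values


def timers_match(conf_text):
    """Return True if both the IPv4 and IPv6 timers exist and are equal."""
    values = reachable_times(conf_text)
    return len(values) == 2 and len(set(values.values())) == 1


if __name__ == "__main__":
    sample = (
        "net.ipv4.neigh.default.base_reachable_time_ms=1080000\n"
        "net.ipv6.neigh.default.base_reachable_time_ms=1080000\n"
    )
    print(timers_match(sample))
```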

Forwarding Table Size and Profiles

Cumulus Linux advertises the maximum number of forwarding table entries supported on the switch, including:

To determine the current table sizes on a switch, run the NVUE nv show platform asic resource command or the Linux cl-resource-query command.

Each switching architecture has specific resources available for forwarding table entries. Cumulus Linux stores:

Cumulus Linux provides various general profiles for forwarding table resources, and, based on your network design, you might need to adjust various switch parameters to allocate resources, as needed.

The values provided in the profiles below are the maximum values that Cumulus Linux software allocates; the theoretical hardware limits might be higher. These limits refer to values that NVIDIA checks as part of unidimensional scale validation. If you try to achieve maximum scalability with multiple features enabled, results might differ from the values listed in this guide.

Spectrum 1

Forwarding resource profiles control unicast forwarding table entry allocations. On the Spectrum 1 switch, TCAM profiles control multicast forwarding table entry allocations. For more information about multicast route entry limitations, refer to Hardware Limitations for ACL Rules.

Profile         MAC Addresses  Layer 3 Neighbors          LPM
--------------  -------------  -------------------------  ----------------------------------------------
default         40k            32k (IPv4) and 8k (IPv6)   64k (IPv4) and 22k (IPv6-long)
l2-heavy        88k            48k (IPv4) and 18k (IPv6)  8k (IPv4) and 8k (IPv6-long)
l2-heavy-1      176k           4k (IPv4) and 2k (IPv6)    4k (IPv4) and 2k (IPv6-long)
l2-heavy-2      86k            86k (IPv4) and 4k (IPv6)   8k (IPv4) and 4k (IPv6-long)
v4-lpm-heavy    8k             8k (IPv4) and 16k (IPv6)   80k (IPv4) and 16k (IPv6-long)
v4-lpm-heavy-1  6k             6k (IPv4) and 2k (IPv6)    176k (IPv4) and 2k (IPv6-long)
v6-lpm-heavy    27k            8k (IPv4) and 36k (IPv6)   8k (IPv4), 32k (IPv6-long) and 32k (IPv6/64)
lpm-balanced    6k             4k (IPv4) and 3k (IPv6)    60k (IPv4), 60k (IPv6-long) and 120k (IPv6/64)
ecmp-nh-heavy   20k            32k (IPv4) and 4k (IPv6)   50k (IPv4) and 4k (IPv6-long)

The ecmp-nh-heavy profile does not support warm restart mode.

Spectrum-2 and Later

On Spectrum-2 and later, forwarding resource profiles control both unicast and multicast forwarding table entry allocations.

Profile         MAC Addresses  Layer 3 Neighbors           LPM
--------------  -------------  --------------------------  -----------------------------------------------
default         50k            41k (IPv4) and 20k (IPv6)   82k (IPv4), 74k (IPv6-short), 1k (IPv4-Mcast)
l2-heavy        115k           74k (IPv4) and 37k (IPv6)   16k (IPv4), 24k (IPv6-short), 1k (IPv4-Mcast)
l2-heavy-1      239k           16k (IPv4) and 12k (IPv6)   16k (IPv4), 16k (IPv6-short), 1k (IPv4-Mcast)
l2-heavy-2      124k           132k (IPv4) and 12k (IPv6)  16k (IPv4), 16k (IPv6-short), 1k (IPv4-Mcast)
l2-heavy-3      107k           90k (IPv4) and 80k (IPv6)   25k (IPv4), 10k (IPv6-short), 1k (IPv4-Mcast)
v4-lpm-heavy    16k            41k (IPv4) and 24k (IPv6)   124k (IPv4), 24k (IPv6-short), 1k (IPv4-Mcast)
v4-lpm-heavy-1  16k            16k (IPv4) and 4k (IPv6)    256k (IPv4), 8k (IPv6-short), 1k (IPv4-Mcast)
v6-lpm-heavy    16k            16k (IPv4) and 62k (IPv6)   16k (IPv4), 99k (IPv6-short), 1k (IPv4-Mcast)
v6-lpm-heavy-1  5k             4k (IPv4) and 4k (IPv6)     90k (IPv4), 235k (IPv6-short), 1k (IPv4-Mcast)
lpm-balanced    16k            16k (IPv4) and 12k (IPv6)   124k (IPv4), 124k (IPv6-short), 1k (IPv4-Mcast)
ipmc-heavy      57k            41k (IPv4) and 20k (IPv6)   82k (IPv4), 66k (IPv6-short), 8k (IPv4-Mcast)
ipmc-max        41k            41k (IPv4) and 20k (IPv6)   74k (IPv4), 66k (IPv6-short), 15k (IPv4-Mcast)

The IPv6 number corresponds to the /64 IPv6 prefix. The /128 IPv6 prefix number is half of the /64 IPv6 prefix number.

For the ipmc-max profile, the cl-resource-query command output displays 33K instead of 15K as the maximum number of IPv4 multicast routes in switchd. 15K is the supported and validated value. You can use the higher value of 33K to test higher multicast scale in non-production environments.

Change Forwarding Resource Profiles

You can set the profile that best suits your network architecture.

Run the nv set system forwarding profile <profile-name> command to specify the profile you want to use.

The following example command sets the l2-heavy profile:

cumulus@switch:~$ nv set system forwarding profile l2-heavy 
cumulus@switch:~$ nv config apply

To set the profile back to the default:

cumulus@switch:~$ nv unset system forwarding profile l2-heavy 
cumulus@switch:~$ nv config apply

Instead of the above command, you can run the nv set system forwarding profile default command to set the profile back to the default.

Specify the profile you want to use with the forwarding_table.profile variable in the /etc/cumulus/datapath/traffic.conf file. The following example specifies l2-heavy:

cumulus@switch:~$ sudo cat /etc/cumulus/datapath/traffic.conf
...
forwarding_table.profile = l2-heavy

After you specify a different profile, restart switchd with the sudo systemctl restart switchd.service command.

To show the different forwarding profiles that your switch supports and the MAC address, layer 3 neighbor, and LPM scale availability for each forwarding profile, run the nv show system forwarding profile-option command.

ACL and VLAN Memory Resources

In addition to forwarding table memory resources, there are limitations on other memory resources for ACLs and VLAN interfaces; refer to Hardware Limitations for ACL Rules.

Route Filtering and Redistribution

Route filtering lets you exclude routes that neighbors advertise or receive. You can use route filtering to manipulate traffic flows, reduce memory utilization, and improve security.

This section discusses the following route filtering methods:

Route map and prefix list names must start with a letter and can contain letters, digits, underscores and dashes. For example, you can name a route map MAP10 or ROUTE-MAP_10 but you cannot name a route map 10 or 10_ROUTE-MAP.
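
The naming rule above translates directly into a regular expression. The following sketch is a hypothetical helper, not an NVUE function, that checks a candidate name before you use it in a command.

```python
import re

# Starts with a letter; then letters, digits, underscores, or dashes.
POLICY_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


def is_valid_policy_name(name):
    """Return True if the name follows the route map and prefix list rule."""
    return POLICY_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for name in ("MAP10", "ROUTE-MAP_10", "10", "10_ROUTE-MAP"):
        print(name, is_valid_policy_name(name))
```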

Prefix Lists

Prefix lists are access lists for route advertisements that match routes instead of traffic. Prefix lists are typically used with route maps and other filtering methods. A prefix list can match the prefix (the network itself) and the prefix length (the length of the subnet mask).

Configure a Prefix List

The following example commands configure a prefix list that permits all prefixes in the range 10.0.0.0/16 with a subnet mask less than or equal to /30. For networks 10.0.0.0/24, 10.10.10.0/24, and 10.0.0.10/32, only 10.0.0.0/24 matches (10.10.10.0/24 has a different prefix and 10.0.0.10/32 has a greater subnet mask).

cumulus@switch:~$ nv set router policy prefix-list LIST1 rule 1 match 10.0.0.0/16 max-prefix-len 30
cumulus@switch:~$ nv set router policy prefix-list LIST1 rule 1 action permit
cumulus@switch:~$ nv config apply

For IPv6, you need to run the nv set router policy prefix-list <name> type ipv6 command to set the prefix list type to IPv6. For example:

cumulus@switch:~$ nv set router policy prefix-list prefixlistipv6 type ipv6
cumulus@switch:~$ nv set router policy prefix-list prefixlistipv6 rule 1 match 2001:100::1/64
cumulus@switch:~$ nv set router policy prefix-list prefixlistipv6 rule 1 action permit 
cumulus@switch:~$ nv config apply

The following example commands configure a prefix list that permits all prefixes in the range 10.1.1.0/24 with a prefix length from /26 through /32. For networks 10.1.1.0/25, 10.10.10.0/24, and 10.1.1.2/32, only 10.1.1.2/32 matches (10.1.1.0/25 has a shorter prefix length, and 10.10.10.0/24 has a different prefix and a shorter prefix length).

cumulus@switch:~$ nv set router policy prefix-list LIST1 rule 1 match 10.1.1.0/24 max-prefix-len 32
cumulus@switch:~$ nv set router policy prefix-list LIST1 rule 1 match 10.1.1.0/24 min-prefix-len 26
cumulus@switch:~$ nv set router policy prefix-list LIST1 rule 1 action permit
cumulus@switch:~$ nv config apply

The following example commands configure a prefix list that permits all prefixes in the range 10.0.0.0/16 with a subnet mask less than or equal to /30. For networks 10.0.0.0/24, 10.10.10.0/24, and 10.0.0.10/32, only 10.0.0.0/24 matches (10.10.10.0/24 has a different prefix and 10.0.0.10/32 has a greater subnet mask).

cumulus@switch:~$ sudo vtysh
switch# configure terminal
switch(config)# ip prefix-list LIST1 seq 1 permit 10.0.0.0/16 le 30
switch(config)# exit
switch# write memory
switch# exit
cumulus@switch:~$

The following example commands configure a prefix list that permits all prefixes in the range 10.1.1.0/24 with a prefix length from /26 through /32. For networks 10.1.1.0/25, 10.10.10.0/24, and 10.1.1.2/32, only 10.1.1.2/32 matches (10.1.1.0/25 has a shorter prefix length, and 10.10.10.0/24 has a different prefix and a shorter prefix length).

cumulus@switch:~$ sudo vtysh
switch# configure terminal
switch(config)# ip prefix-list LIST1 seq 1 permit 10.1.1.0/24 ge 26 le 32
switch(config)# exit
switch# write memory
switch# exit
cumulus@switch:~$
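
To reason about which routes a prefix list entry permits, it can help to model the match in code. The following sketch uses the Python ipaddress module and assumes the usual prefix list semantics: the candidate must fall inside the configured prefix, and its prefix length must lie between ge and le (inclusive), with an exact length match when neither is set. The function name is hypothetical and the sketch is not FRR code.

```python
import ipaddress


def prefix_list_match(entry_prefix, candidate, ge=None, le=None):
    """Return True if candidate matches a prefix list entry with optional ge/le."""
    entry = ipaddress.ip_network(entry_prefix)
    cand = ipaddress.ip_network(candidate, strict=False)
    if cand.version != entry.version or not cand.subnet_of(entry):
        return False  # different address family or a different prefix
    if ge is None and le is None:
        return cand.prefixlen == entry.prefixlen
    low = ge if ge is not None else entry.prefixlen
    high = le if le is not None else cand.max_prefixlen
    return low <= cand.prefixlen <= high


if __name__ == "__main__":
    # ip prefix-list LIST1 seq 1 permit 10.0.0.0/16 le 30
    for net in ("10.0.0.0/24", "10.10.10.0/24", "10.0.0.10/32"):
        print(net, prefix_list_match("10.0.0.0/16", net, le=30))
    # ip prefix-list LIST1 seq 1 permit 10.1.1.0/24 ge 26 le 32
    for net in ("10.1.1.0/25", "10.10.10.0/24", "10.1.1.2/32"):
        print(net, prefix_list_match("10.1.1.0/24", net, ge=26, le=32))
```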

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

cumulus@switch:~$ sudo cat /etc/frr/frr.conf
...
router ospf
 ospf router-id 10.10.10.1
 timers throttle spf 80 100 6000
 passive-interface vlan10
 passive-interface vlan20
ip prefix-list LIST1 seq 1 permit 10.0.0.0/16 le 30

To use this prefix list in a route map called MAP1:

cumulus@switch:~$ nv set router policy route-map MAP1 rule 10 action permit
cumulus@switch:~$ nv set router policy route-map MAP1 rule 10 match type ipv4
cumulus@switch:~$ nv set router policy route-map MAP1 rule 10 match ip-prefix-list LIST1
cumulus@switch:~$ nv config apply

Alternatively, run the equivalent vtysh commands:

cumulus@switch:~$ sudo vtysh
switch# configure terminal
switch(config)# route-map MAP1 permit 10
switch(config-route-map)# match ip address prefix-list LIST1
switch(config-route-map)# end
switch# write memory
switch# exit
cumulus@switch:~$

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

cumulus@switch:~$ sudo cat /etc/frr/frr.conf
...
ip prefix-list LIST1 seq 1 permit 10.0.0.0/16 le 30
route-map MAP1 permit 10
 match ip address prefix-list LIST1

Clear Matches Shown Against a Prefix List

You can clear prefix list statistics.

The following example clears the number of matches shown against prefix list LIST1:

cumulus@switch:~$ nv action clear router policy prefix-list LIST1
Action succeeded

The following example clears the number of matches shown against LIST1 rule 10 with match criteria 10.0.0.0/16:

cumulus@switch:~$ nv action clear router policy prefix-list LIST1 rule 10 match 10.0.0.0/16
Action succeeded

Route Maps

Route maps are routing policies that Cumulus Linux considers before the router examines the forwarding table. Each statement in a route map has a sequence number and includes a series of match and set statements. Cumulus Linux evaluates the route map from the lowest sequence number to the highest and stops at the first match.

Cumulus Linux supports several match and set statements. For example, you can match on an interface, prefix length, next hop or BGP AS path list. You can set the BGP metric, local-preference on routes, source IP, or the tag on the matched route. For a list of supported match and set statements, see Match and Set Statements below.

Configure a Route Map

To configure a route map:

  1. Specify one or more conditions that must match and, optionally, one or more set actions to set or modify attributes of the route. If a route map does not specify any matching conditions, it always matches.
  2. Specify the matching policy: permit (if the entry matches, carry out the set actions) or deny (if the entry matches, deny the route).

To apply the route map, see Apply a Route Map below.
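
The evaluation order described in this section can be sketched as follows. The Rule class and apply_route_map function are hypothetical names, and routes are plain dictionaries for illustration. The sketch assumes that a route that matches no rule is denied and it ignores exit policies; it is a model of the behavior, not FRR code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Route = Dict[str, object]


@dataclass
class Rule:
    seq: int
    action: str  # "permit" or "deny"
    matches: List[Callable[[Route], bool]] = field(default_factory=list)
    sets: Dict[str, object] = field(default_factory=dict)


def apply_route_map(rules: List[Rule], route: Route) -> Optional[Route]:
    """Return the (possibly modified) route if permitted, or None if denied."""
    for rule in sorted(rules, key=lambda r: r.seq):  # lowest sequence first
        # A rule with no match conditions always matches.
        if all(match(route) for match in rule.matches):
            if rule.action == "deny":
                return None
            return {**route, **rule.sets}  # carry out the set actions
    return None  # assumption: no matching rule means the route is denied


if __name__ == "__main__":
    map1 = [Rule(10, "permit", [lambda r: r.get("interface") == "swp51"], {"metric": 50})]
    print(apply_route_map(map1, {"prefix": "10.0.0.0/24", "interface": "swp51"}))
    print(apply_route_map(map1, {"prefix": "10.0.0.0/24", "interface": "swp52"}))
```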

The following example commands configure a route map that sets the BGP metric to 50 for interface swp51:

cumulus@switch:~$ nv set router policy route-map MAP1 rule 10 match interface swp51
cumulus@switch:~$ nv set router policy route-map MAP1 rule 10 set metric 50
cumulus@switch:~$ nv set router policy route-map MAP1 rule 10 action permit
cumulus@switch:~$ nv config apply

Alternatively, run the equivalent vtysh commands:

cumulus@switch:~$ sudo vtysh
switch# configure terminal
switch(config)# route-map MAP1 permit 10
switch(config-route-map)# match interface swp51
switch(config-route-map)# set metric 50
switch(config-route-map)# end
switch# write memory
switch# exit
cumulus@switch:~$

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

cumulus@switch:~$ sudo cat /etc/frr/frr.conf
...
route-map MAP1 permit 10
 match interface swp51
 set metric 50

The following example commands configure a route map to match the prefixes defined in LIST1 and set the next hop to 10.10.10.5:

cumulus@switch:~$ nv set router policy route-map MAP1 rule 10 match ip-prefix-list LIST1
cumulus@switch:~$ nv set router policy route-map MAP1 rule 10 set ip-nexthop 10.10.10.5
cumulus@switch:~$ nv set router policy route-map MAP1 rule 10 action permit
cumulus@switch:~$ nv config apply

Alternatively, run the equivalent vtysh commands:

cumulus@switch:~$ sudo vtysh
switch# configure terminal
switch(config)# route-map MAP1 permit 10
switch(config-route-map)# match ip route-source prefix-list LIST1
switch(config-route-map)# set ip next-hop 10.10.10.5
switch(config-route-map)# end
switch# write memory
switch# exit
cumulus@switch:~$

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

cumulus@switch:~$ sudo cat /etc/frr/frr.conf
...
route-map MAP1 permit 10
 match ip route-source prefix-list LIST1
 set ip next-hop 10.10.10.5

The following example commands configure a route map to set the local-preference on routes to 400:

cumulus@switch:~$ nv set router policy route-map MAP2 rule 10 set local-preference 400
cumulus@switch:~$ nv set router policy route-map MAP2 rule 10 action permit
cumulus@switch:~$ nv config apply

Alternatively, run the equivalent vtysh commands:

cumulus@switch:~$ sudo vtysh
switch# configure terminal
switch(config)# route-map MAP2 permit 10
switch(config-route-map)# set local-preference 400
switch(config-route-map)# end
switch# write memory
switch# exit
cumulus@switch:~$

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

cumulus@switch:~$ sudo cat /etc/frr/frr.conf
...
route-map MAP2 permit 10
 set local-preference 400

Match and Set Statements

Cumulus Linux supports the following match and set statements.

You can use the following list of supported match and set statements with NVUE commands. For a list of the match and set statements that vtysh supports, see the FRRouting User Guide.

Match                 Description
--------------------  -----------
as-path-list          Matches the specified AS path list.
interface             Matches the specified interface.
ip-prefix-len         Matches the specified prefix length.
origin                Matches the specified BGP origin. You can specify egp, igp, or incomplete.
type                  Matches the specified route type, such as IPv4 or IPv6.
community-list        Matches the specified community list.
ip-nexthop            Matches the specified next hop.
ip-prefix-list        Matches the specified prefix list.
peer                  Matches the specified BGP neighbor.
evpn-default-route    Matches the EVPN default route. You can specify on or off.
ip-nexthop-len        Matches the specified next hop prefix length.
large-community-list  Matches the specified large community list.
source-protocol       Matches the specified source protocol, such as BGP, OSPF, or static.
evpn-route-type       Matches the specified EVPN route type. You can specify macip, imet, or prefix.
ip-nexthop-list       Matches the specified next hop list.
local-preference      Matches the specified local preference. You can specify a value between 0 and 4294967295.
source-vrf            Matches the specified source VRF.
evpn-vni              Matches the specified EVPN VNI.
ip-nexthop-type       Matches the specified next hop type, such as blackhole.
metric                Matches the specified BGP metric.
tag                   Matches the specified tag value associated with the route. You can specify a value between 1 and 4294967295.

BGP and zebra support the source-protocol match statement. Route maps configured for other routing protocols, such as OSPF, do not support the match source-protocol statement.

Set
Description
aggregator-as Sets the aggregator AS.
ext-community-rt Sets the BGP extended community RT. See BGP Community Lists.
originator-id Sets the originator ID so that BGP chooses the preferred path.
as-path-exclude Sets BGP AS path exclude attribute to avoid considering the AS path during best path route selection.
ext-community-soo Sets the BGP extended community SOO. See BGP Community Lists.
large-community Sets the BGP large community.
source-ip Sets the source IP address.
as-path-prepend Sets the BGP AS path prepend attribute.
forwarding-address Sets the route forwarding address.
large-community-delete-list Sets the BGP large community delete list.
tag Sets a tag on the matched route. You can specify a value between 1 and 4294967295.
atomic-aggregate Sets the Atomic Aggregate attribute to inform BGP peers that the local router is using a less specific (aggregated) route to a destination.
ip-nexthop Sets the BGP next hop.
local-preference Sets the BGP local preference to local_pref.
weight Sets the route’s weight.
community Sets the BGP community attribute.
ipv6-nexthop-global Sets the IPv6 next hop global attribute.
metric Sets the BGP attribute MED to a specific value. You can specify metric-minus to subtract the specified value from the MED, metric-plus to add the specified value to the MED, rtt to set the MED to the round trip time, rtt-minus to subtract the round trip time from the MED, or rtt-plus to add the round trip time to the MED.
community-delete-list Sets the BGP community delete list.
ipv6-nexthop-local Sets the IPv6 next hop local attribute.
metric-type Sets the metric type. You can specify type-1 or type-2.
ext-community-bw Sets the BGP extended community link bandwidth.
ipv6-nexthop-prefer-global Sets IPv6 inbound routes to use the global address when both a global and a link-local next hop are available.
origin Sets the BGP route origin, such as egp or igp.

Permit Action Exit Policies

You can configure the permit action exit policy for a route map to exit further rule processing, go to the next rule, or go to a specific rule.

To configure the permit action exit policy:

The following command configures the permit action exit policy to go to the next rule when you meet the matching conditions:

cumulus@switch:~$ nv set router policy route-map MAP1 rule 10 action permit exit-policy next-rule
cumulus@switch:~$ nv config apply

The following command configures the permit action exit policy to go to rule 20 when you meet the matching conditions:

cumulus@switch:~$ nv set router policy route-map MAP1 rule 10 action permit exit-policy rule 20
cumulus@switch:~$ nv config apply

The following command configures the permit action exit policy to exit further rule processing:

cumulus@switch:~$ sudo vtysh
switch# configure terminal
switch(config)# route-map MAP1 permit 10
switch(config-route-map)# continue 30
switch(config-route-map)# end
switch# write memory
Note: this version of vtysh never writes vtysh.conf
Building Configuration...
Integrated configuration saved to /etc/frr/frr.conf
[OK]
switch# exit
cumulus@switch:mgmt:~$ 

The following command configures the permit action exit policy to go to the next rule when you meet the matching conditions:

cumulus@switch:~$ sudo vtysh
switch# configure terminal
switch(config)# route-map MAP1 permit 10
switch(config-route-map)# on-match next
switch(config-route-map)# end
switch# write memory
Note: this version of vtysh never writes vtysh.conf
Building Configuration...
Integrated configuration saved to /etc/frr/frr.conf
[OK]
switch# exit
cumulus@switch:mgmt:~$ 

The following command configures the permit action exit policy to go to rule 20 when you meet the matching conditions:

cumulus@switch:~$ sudo vtysh
switch# configure terminal
switch(config)# route-map MAP1 permit 10
switch(config-route-map)# on-match goto 20
switch(config-route-map)# end
switch# write memory
Note: this version of vtysh never writes vtysh.conf
Building Configuration...
Integrated configuration saved to /etc/frr/frr.conf
[OK]
switch# exit
cumulus@switch:mgmt:~$ 
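The following Python sketch models how the three exit policies affect rule processing. It is an illustration only, not the FRR implementation. The Rule class, the evaluate function, and the route dictionary are hypothetical names, and the sketch assumes that a route permitted by an earlier rule stays permitted when no later rule matches, and that a route matching no rule is denied.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Rule:
    seq: int                          # rule (sequence) number
    action: str                       # "permit" or "deny"
    match: Callable[[dict], bool]     # True when the route meets the match conditions
    exit_policy: str = "exit"         # "exit", "next-rule", or "rule" (hypothetical labels)
    goto: Optional[int] = None        # target rule number when exit_policy is "rule"


def evaluate(rules: List[Rule], route: dict) -> Tuple[str, List[int]]:
    """Return the final action and the rule numbers that matched the route."""
    ordered = sorted(rules, key=lambda r: r.seq)
    matched: List[int] = []
    min_seq = ordered[0].seq if ordered else 0
    for rule in ordered:
        # Skip rules that a "go to rule N" policy jumped over, and non-matching rules.
        if rule.seq < min_seq or not rule.match(route):
            continue
        if rule.action == "deny":
            return "deny", matched + [rule.seq]
        matched.append(rule.seq)
        if rule.exit_policy == "exit":
            return "permit", matched
        if rule.exit_policy == "rule":
            min_seq = rule.goto
        # "next-rule" simply falls through to the next rule in sequence.
    # Assumption: an earlier permit stands; no match at all is an implicit deny.
    return ("permit" if matched else "deny"), matched


any_route = lambda route: True
rules = [
    Rule(10, "permit", any_route, exit_policy="rule", goto=30),
    Rule(20, "deny", any_route),
    Rule(30, "permit", any_route),
]
print(evaluate(rules, {"prefix": "10.1.1.0/24"}))  # ('permit', [10, 30])
```

In this model, changing rule 10 to the next-rule policy sends the route to rule 20, which denies it; this is why the choice of exit policy matters when later rules contain deny actions.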

Apply a Route Map

To apply the route map, you specify the routing protocol and the route map name.

The following example commands apply the route map called MAP2 to BGP neighbor swp51:

cumulus@switch:~$ nv set vrf default router bgp neighbor swp51 address-family ipv4-unicast policy inbound route-map MAP2
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo vtysh
switch# configure terminal
switch(config)# router bgp 65101
switch(config-router)# address-family ipv4 unicast 
switch(config-router-af)# neighbor swp51 route-map MAP2 in
switch(config-router-af)# end
switch# write memory
Note: this version of vtysh never writes vtysh.conf
Building Configuration...
Integrated configuration saved to /etc/frr/frr.conf
[OK]
switch# exit
cumulus@switch:mgmt:~$ 

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

cumulus@switch:~$ sudo cat /etc/frr/frr.conf
...
neighbor swp51 route-map MAP2 in

The following example filters routes from Zebra (RIB) into the Linux kernel (FIB). The commands apply the route map called MAP1 to BGP routes in the RIB:

cumulus@switch:~$ nv set vrf default router rib ipv4 fib-filter protocol bgp route-map MAP1
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo vtysh
switch# configure terminal
switch(config)# ip protocol bgp route-map MAP1
switch(config)# exit
switch# write memory
switch# exit
cumulus@switch:~$

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

cumulus@switch:~$ sudo cat /etc/frr/frr.conf
...
ip protocol bgp route-map MAP1

For BGP, you can also apply a route map on route updates from BGP to the RIB. You can match on prefix, next hop, communities, and so on. You can set the metric and next hop only. Route maps do not affect the BGP internal RIB. You can use both IPv4 and IPv6 address families. Route maps work on multi-paths; however, BGP bases the metric setting on the best path only.

To apply a route map to filter route updates from BGP into the RIB:

cumulus@switch:~$ nv set vrf default router bgp address-family ipv4-unicast rib-filter MAP1
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo vtysh
switch# configure terminal
switch(config)# router bgp 65000
switch(config-router)# address-family ipv4 unicast
switch(config-router-af)# table-map MAP1
switch(config-router-af)# end
switch# write memory
switch# exit
cumulus@switch:~$

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

cumulus@switch:~$ sudo cat /etc/frr/frr.conf
...
address-family ipv4 unicast
table-map MAP1

To apply an outbound route map to a route reflector client, you must run the NVUE nv set vrf <vrf> router bgp route-reflection outbound-policy on command or the vtysh neighbor <neighbor> route-map SET_IBGP_ORIG out command under the address family, before you apply the route map.

Route Map Description

To provide a description for a route map, run the NVUE nv set router policy route-map <route-map> rule <rule> description command.

cumulus@switch:~$ nv set router policy route-map MAP1 rule 10 match interface swp51
cumulus@switch:~$ nv set router policy route-map MAP1 rule 10 set metric 50
cumulus@switch:~$ nv set router policy route-map MAP1 rule 10 action permit
cumulus@switch:~$ nv set router policy route-map MAP1 rule 10 description set-metric-swp51
cumulus@switch:~$ nv config apply

Clear Matches Against a Route Map

To clear the number of matches shown against a route map, run the nv action clear router policy route-map <route-map> command.

The following example clears the number of matches shown against route map MAP1.

cumulus@switch:~$ nv action clear router policy route-map MAP1
Running handle_clear_route_map MAP1
Action succeeded

To clear the number of matches shown against all route maps, run the nv action clear router policy route-map command.

Route Redistribution

Route redistribution allows a network to use a routing protocol to route traffic dynamically based on the information learned from a different routing protocol or from static routes. Route redistribution helps increase accessibility within networks.

The following example commands redistribute routing information from OSPF routes into BGP:

cumulus@switch:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute ospf
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo vtysh
switch# configure terminal
switch(config)# router bgp
switch(config-router)# redistribute ospf
switch(config-router)# end
switch# write memory
switch# exit
cumulus@switch:~$

To redistribute all directly connected networks, use the redistribute connected command. For example:

cumulus@switch:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo vtysh
switch# configure terminal
switch(config)# router bgp
switch(config-router)# redistribute connected
switch(config-router)# end
switch# write memory
switch# exit
cumulus@switch:~$

For OSPF, redistribution loads the database unnecessarily with type-5 LSAs. Only use this method to generate real external prefixes (type-5 LSAs).

Configuration Examples

This section provides example route map configurations. The examples do not include commands to apply a route map; for the commands to apply a route map, refer to Apply a Route Map.

Match AS Path List

The following example configures a route map to allow prefixes that pass through AS 65102.

cumulus@leaf01:~$ nv set router policy as-path-list LIST1 rule 100 action permit
cumulus@leaf01:~$ nv set router policy as-path-list LIST1 rule 100 aspath-exp _65102_
cumulus@leaf01:~$ nv set router policy route-map MAP1 rule 10 action permit
cumulus@leaf01:~$ nv set router policy route-map MAP1 rule 10 match as-path-list LIST1
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# bgp as-path access-list LIST1 seq 100 permit _65102_
leaf01(config)# route-map MAP1 permit 10
leaf01(config-route-map)# match as-path LIST1
leaf01(config-route-map)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$
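To see why the expression _65102_ matches only paths that pass through AS 65102, the following Python sketch approximates the underscore in an AS path expression. It assumes the FRR convention that an underscore matches the start or end of the path, a space, a comma, or the AS set and confederation delimiters. The function names are hypothetical, and all other characters go to Python's re module unchanged, which can differ from the regular expression engine that FRR uses.

```python
import re


def aspath_regex(expr: str) -> "re.Pattern[str]":
    """Translate the '_' delimiter of an AS path expression into a Python regex."""
    # Assumption: '_' stands for start, end, space, comma, braces, or parentheses.
    return re.compile(expr.replace("_", r"(?:^|$|[ ,{}()])"))


def passes_through(as_path: str, expr: str = "_65102_") -> bool:
    """Return True when the space-separated AS path matches the expression."""
    return aspath_regex(expr).search(as_path) is not None


for path in ("65101 65102 65103", "65102", "165102 65103", "65101 651020"):
    print(f"{path!r}: {passes_through(path)}")
```

Without the underscores, the expression 65102 would also match longer AS numbers that merely contain those digits, such as 165102.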

Match Origin

The following example configures a route map to allow prefixes originated using an interior gateway protocol (IGP) such as OSPF.

cumulus@leaf01:~$ nv set router policy route-map MAP1 rule 10 action permit
cumulus@leaf01:~$ nv set router policy route-map MAP1 rule 10 match origin igp
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# route-map MAP1 permit 10
leaf01(config-route-map)# match origin igp
leaf01(config-route-map)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

Match Tag

The following example configures a route map to allow prefixes that match tag 4.

cumulus@leaf01:~$ nv set router policy route-map MAP1 rule 10 action permit
cumulus@leaf01:~$ nv set router policy route-map MAP1 rule 10 match tag 4
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# route-map MAP1 permit 10
leaf01(config-route-map)# match tag 4
leaf01(config-route-map)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

Match Metric

The following example configures a route map to allow prefixes that match metric 10.

cumulus@leaf01:~$ nv set router policy route-map MAP1 rule 100 action permit
cumulus@leaf01:~$ nv set router policy route-map MAP1 rule 100 match metric 10
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
leaf01# configure terminal
leaf01(config)# route-map MAP1 permit 100
leaf01(config-route-map)# match metric 10
leaf01(config-route-map)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

Match Source Protocol

The following example configures a route map to allow prefixes that match BGP as the source protocol.

When you configure the match source protocol in a route map, the switch only advertises that protocol type to the peers. If you configure route leaking between VRFs and the leaked routes are learned as BGP routes, you need to match the BGP source protocol to advertise that route.

cumulus@leaf01:~$ nv set router policy route-map MAP1 rule 100 action permit
cumulus@leaf01:~$ nv set router policy route-map MAP1 rule 100 match source-protocol bgp
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# route-map MAP1 permit 100
leaf01(config-route-map)# match source-protocol bgp
leaf01(config-route-map)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

When you configure the match source protocol in a route map, the switch only advertises that protocol type to the peers. If you configure route leaking between VRFs and the leaked routes are learned as BGP routes, you need to match the BGP source protocol to advertise that route in addition to matching the connected source protocol:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# route-map MAP1 permit 100
leaf01(config-route-map)# match source-protocol bgp
leaf01(config-route-map)# match source-protocol connected
leaf01(config-route-map)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

Match Next Hop

The following example configures a route map to allow prefixes that match next hop 10.0.1.1.

cumulus@leaf01:~$ nv set router policy route-map MAP1 rule 100 action permit
cumulus@leaf01:~$ nv set router policy route-map MAP1 rule 100 match ip-nexthop 10.0.1.1
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# route-map MAP1 permit 100
leaf01(config-route-map)# match ip next-hop address 10.0.1.1
leaf01(config-route-map)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

Match Next Hop List

The following example configures a route map to allow prefixes that match the next hop prefix list called LIST2.

cumulus@leaf01:~$ nv set router policy prefix-list LIST2 rule 100 action permit
cumulus@leaf01:~$ nv set router policy prefix-list LIST2 rule 100 match 10.0.1.0/32
cumulus@leaf01:~$ nv set router policy route-map MAP1 rule 100 action permit
cumulus@leaf01:~$ nv set router policy route-map MAP1 rule 100 match ip-nexthop-list LIST2
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# ip prefix-list LIST2 seq 100 permit 10.0.1.0/32
leaf01(config)# route-map MAP1 permit 100
leaf01(config-route-map)# match ip next-hop prefix-list LIST2
leaf01(config-route-map)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

Match Next Hop Type

The following example configures a route map to allow prefixes that match blackhole as the next hop type.

cumulus@leaf01:~$ nv set router policy route-map MAP1 rule 100 action permit
cumulus@leaf01:~$ nv set router policy route-map MAP1 rule 100 match ip-nexthop-type blackhole
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# route-map MAP1 permit 100
leaf01(config-route-map)# match ip next-hop type blackhole
leaf01(config-route-map)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

Match Community List

The following example configures a route map to allow prefixes that match BGP community-list 11. For information about BGP community lists, refer to BGP Community Lists.

cumulus@leaf01:~$ nv set router policy community-list 11 rule 100 action permit
cumulus@leaf01:~$ nv set router policy community-list 11 rule 100 community 400:34
cumulus@leaf01:~$ nv set router policy route-map MAP1 rule 10 action permit
cumulus@leaf01:~$ nv set router policy route-map MAP1 rule 10 match community-list 11
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# bgp community-list 11 seq 100 permit 400:34
leaf01(config)# route-map MAP1 permit 10
leaf01(config-route-map)# match community 11
leaf01(config-route-map)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

Match Large Community List

The following example configures a route map to allow prefixes that match BGP large community-list 11.

cumulus@leaf01:~$ nv set router policy large-community-list 11 rule 10 action permit
cumulus@leaf01:~$ nv set router policy large-community-list 11 rule 10 large-community 4200857911:011:011
cumulus@leaf01:~$ nv set router policy route-map MAP1 rule 10 action permit
cumulus@leaf01:~$ nv set router policy route-map MAP1 rule 10 match large-community-list 11
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# bgp large-community-list 11 seq 10 permit 4200857911:011:011
leaf01(config)# route-map MAP1 permit 10
leaf01(config-route-map)# match large-community 11
leaf01(config-route-map)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

Set IPv6 Prefer Global

With multiple BGP peerings to the same router when adaptive routing is on, or with multiple peerings to the same router on interfaces that share the same MAC address or physical interface, you can configure a route map to prefer the global IPv6 address when a route contains both link-local and global next hop addresses.

cumulus@leaf01:~$ nv set router policy route-map IPV6-PREFER-GLOBAL rule 10 action permit
cumulus@leaf01:~$ nv set router policy route-map IPV6-PREFER-GLOBAL rule 10 set ipv6-nexthop-prefer-global on
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# route-map IPV6-PREFER-GLOBAL permit 10
leaf01(config-route-map)# set ipv6 next-hop prefer-global
leaf01(config-route-map)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

Show Route Filtering

To show route filtering results in the BGP routing table after applying inbound policies, run the NVUE nv show vrf <vrf> router bgp address-family <address-family> route command or the vtysh show ip bgp command.

cumulus@leaf01:~$ nv show vrf default router bgp address-family ipv4-unicast route
                                                                                
PathCount - Number of paths present for the prefix, MultipathCount - Number of  
paths that are part of the ECMP, DestFlags - * - bestpath-exists, w - fib-wait- 
for-install, s - fib-suppress, i - fib-installed, x - fib-install-failed        
                                                                                
Prefix           PathCount  MultipathCount  DestFlags
---------------  ---------  --------------  ---------
10.0.1.12/32     2          1               *        
10.0.1.34/32     5          4               *        
10.0.1.255/32    5          4               *        
10.10.10.1/32    1          1               *        
10.10.10.2/32    5          1               *        
10.10.10.3/32    5          4               *        
10.10.10.4/32    5          4               *        
10.10.10.63/32   5          4               *        
10.10.10.64/32   5          4               *        
10.10.10.101/32  2          1               *        
10.10.10.102/32  2          1               *        
10.10.10.103/32  2          1               *        
10.10.10.104/32  2          1               *
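If you want to check the result of a route map from a script, you can parse the table shown above. The following Python sketch is one possible approach and assumes the column layout of the example output (Prefix, PathCount, MultipathCount, DestFlags); the function name parse_route_table is hypothetical and the layout might differ in other releases.

```python
from typing import Dict, List


def parse_route_table(text: str) -> List[Dict[str, object]]:
    """Parse the rows that follow the dashed separator line into dictionaries."""
    rows: List[Dict[str, object]] = []
    seen_separator = False
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if not seen_separator:
            # Everything before the dashed line is legend or column headers.
            seen_separator = fields[0].startswith("---")
            continue
        if len(fields) < 3:
            continue  # not a data row in the assumed layout
        rows.append({
            "prefix": fields[0],
            "path_count": int(fields[1]),
            "multipath_count": int(fields[2]),
            "flags": fields[3] if len(fields) > 3 else "",
        })
    return rows


sample = """\
Prefix           PathCount  MultipathCount  DestFlags
---------------  ---------  --------------  ---------
10.0.1.12/32     2          1               *
10.0.1.34/32     5          4               *
"""
for row in parse_route_table(sample):
    print(row["prefix"], row["path_count"], row["flags"])
```

A prefix that an inbound policy denies does not appear in the parsed rows, so comparing the prefix set before and after you apply a route map shows which routes the policy filtered.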

Considerations

Match Lists

When you configure a route map to match a prefix list, community list, or aspath list, the permit or deny actions in the list determine the criteria to evaluate in each route map sequence.

NVIDIA recommends you always configure a community list as permit, and permit or deny routes using route map sequences.

Set BGP Community Additive

To set more than one community in a route map, you can run the nv set router policy route-map <route-map-id> rule <rule-id> set community additive command. The following example sets both community 100:100 and community 555:111 in the route map called ROUTEMAP1:

cumulus@leaf01:~$ nv set router policy route-map ROUTEMAP1 rule 5 action permit
cumulus@leaf01:~$ nv set router policy route-map ROUTEMAP1 rule 5 match ip-prefix-list LIST1
cumulus@leaf01:~$ nv set router policy route-map ROUTEMAP1 rule 5 match type ipv4
cumulus@leaf01:~$ nv set router policy route-map ROUTEMAP1 rule 5 set community 100:100
cumulus@leaf01:~$ nv set router policy route-map ROUTEMAP1 rule 5 set community 555:111
cumulus@leaf01:~$ nv set router policy route-map ROUTEMAP1 rule 5 set community additive
cumulus@leaf01:~$ nv config apply

When you unset the additive community with the nv unset router policy route-map <route-map-id> rule <rule-id> set community additive command, NVUE does not remove the communities. You must unset each community and the community additive to remove the communities:

cumulus@leaf01:~$ nv unset router policy route-map ROUTEMAP1 rule 5 set community 100:100
cumulus@leaf01:~$ nv unset router policy route-map ROUTEMAP1 rule 5 set community 555:111
cumulus@leaf01:~$ nv unset router policy route-map ROUTEMAP1 rule 5 set community additive
cumulus@leaf01:~$ nv config apply
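The following Python sketch illustrates the difference that the additive option makes to the communities on a matched route. It assumes the usual FRR behavior, where additive appends the configured communities to those the route already carries and omitting it replaces them. The function name and the existing community 65000:1 are hypothetical.

```python
from typing import List


def apply_set_community(existing: List[str], configured: List[str], additive: bool) -> List[str]:
    """Return the communities a route carries after the set community statement."""
    if not additive:
        # Without additive, the configured communities replace the existing ones.
        return list(configured)
    merged = list(existing)
    for community in configured:
        if community not in merged:
            merged.append(community)
    return merged


route_communities = ["65000:1"]          # hypothetical community already on the route
configured = ["100:100", "555:111"]      # communities from the ROUTEMAP1 example

print(apply_set_community(route_communities, configured, additive=True))
print(apply_set_community(route_communities, configured, additive=False))
```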

Policy-based Routing

Typical routing systems and protocols forward traffic based on the destination address in the packet, which they look up in a routing table. However, sometimes the traffic on your network requires a more hands-on approach. Sometimes, you need to forward a packet based on the source address, the packet size, or other information in the packet header.

Policy-based routing (PBR) lets you make routing decisions based on filters that change the routing behavior of specific traffic so that you can override the routing table and influence where the traffic goes. For example, you can use PBR to reach the best bandwidth utilization for business-critical applications, isolate traffic for inspection or analysis, or manually load balance outbound traffic.

Cumulus Linux applies PBR to incoming packets. All packets received on a PBR-enabled interface pass through enhanced packet filters that determine rules and specify where to forward the packets.

Configure PBR

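A PBR policy contains one or more policy maps. Each policy map has a name and a rule (sequence) number, contains one or more match rules (for example, source IP address, destination IP address, DSCP, or ECN), and contains a set rule that specifies the next hop group or VRF to use for matching traffic.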

To use PBR in Cumulus Linux, you define a PBR policy and apply it to the ingress interface (the interface must already have an IP address assigned). Cumulus Linux matches traffic against the match rules in sequential order and forwards the traffic according to the set rule in the first match. Traffic that does not match any rule passes on to the normal destination-based routing mechanism.
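The first-match behavior described above can be modeled with the following Python sketch. It is a simplified illustration, not how the switch implements PBR; the PbrRule class and the lookup function are hypothetical, and the sketch considers source and destination IP matches only.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PbrRule:
    seq: int                     # rule (sequence) number
    nexthop_group: str           # name of the next hop group in the set rule
    src: Optional[str] = None    # source prefix to match, if any
    dst: Optional[str] = None    # destination prefix to match, if any


def lookup(rules: List[PbrRule], src_ip: str, dst_ip: str) -> Optional[str]:
    """Return the next hop group of the first matching rule, or None."""
    src_addr = ipaddress.ip_address(src_ip)
    dst_addr = ipaddress.ip_address(dst_ip)
    for rule in sorted(rules, key=lambda r: r.seq):
        # strict=False accepts prefixes written with host bits set, such as 10.1.4.1/24.
        if rule.src and src_addr not in ipaddress.ip_network(rule.src, strict=False):
            continue
        if rule.dst and dst_addr not in ipaddress.ip_network(rule.dst, strict=False):
            continue
        return rule.nexthop_group
    # No rule matched: the packet follows normal destination-based routing.
    return None


rules = [
    PbrRule(1, "group1", src="10.1.4.1/24", dst="10.1.2.0/24"),
    PbrRule(2, "group2", src="0.0.0.0/0"),   # 0.0.0.0/0 matches any IPv4 source
]
print(lookup(rules, "10.1.4.7", "10.1.2.9"))
print(lookup(rules, "172.16.0.5", "10.9.9.9"))
```

Because evaluation stops at the first match, place more specific rules at lower rule numbers than catch-all rules.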

To configure a PBR policy:

When you configure PBR with NVUE commands, NVUE enables the pbrd service and restarts the FRR service. An FRR service restart might impact traffic.

  1. Configure the policy map.

    The example commands below configure a policy map called map1 with rule number 1 that matches on destination address 10.1.2.0/24 and source address 10.1.4.1/24.

    If the IP address in the rule is 0.0.0.0/0 or ::/0, any IP address is a match. You cannot mix IPv4 and IPv6 addresses in a rule.

    cumulus@switch:~$ nv set router pbr map map1 rule 1 match destination-ip 10.1.2.0/24
    cumulus@switch:~$ nv set router pbr map map1 rule 1 match source-ip 10.1.4.1/24 
    

    Instead of matching on IP address, you can match packets according to the DSCP or ECN field in the IP header. The DSCP value can be an integer between 0 and 63 or the DSCP codepoint name. The ECN value can be an integer between 0 and 3. The following example command configures a policy map called map1 with rule number 1 that matches on packets with the DSCP value 10:

    cumulus@switch:~$ nv set router pbr map map1 rule 1 match dscp 10
    

    The following example command configures a policy map called map1 with rule number 1 that matches on packets with the ECN value 2:

    cumulus@switch:~$ nv set router pbr map map1 rule 1 match ecn 2
    
  2. Apply a next hop group to the policy map. First configure the next hop group, then apply the group to the policy map. The example commands below create a next hop group called group1 that contains the next hop 192.168.0.21 on output interface swp1 and VRF RED and the next hop 192.168.0.22, then apply the next hop group group1 to the map1 policy map.

    The output interface and VRF are optional. However, you must specify the VRF if the next hop is not in the default VRF.

    cumulus@switch:~$ nv set router nexthop group group1 via 192.168.0.21 interface swp1
    cumulus@switch:~$ nv set router nexthop group group1 via 192.168.0.21 vrf RED
    cumulus@switch:~$ nv set router nexthop group group1 via 192.168.0.22
    cumulus@switch:~$ nv set router pbr map map1 rule 1 action nexthop-group group1
    

    If you want the rule to use a specific VRF table as its lookup, set the VRF. If you do not set a VRF, the rule uses the VRF table the interface is in as its lookup. The example command below sets the rule to use the dmz VRF table.

    You can set the VRF in a virtual environment only. Cumulus Linux on an NVIDIA switch does not support setting the VRF.

    cumulus@switch:~$ nv set router pbr map map1 rule 1 action vrf dmz
    
  3. Assign the PBR policy to an ingress interface. The example command below assigns the PBR policy map1 to interface swp51:

    cumulus@switch:~$ nv set interface swp51 router pbr map map1
    cumulus@switch:~$ nv config apply
    
  1. Enable the pbrd service in the /etc/frr/daemons file:

    cumulus@switch:~$ sudo nano /etc/frr/daemons
    ...
    bgpd=yes
    ospfd=no
    ospf6d=no
    ripd=no
    ripngd=no
    isisd=no
    fabricd=no
    pimd=no
    ldpd=no
    nhrpd=no
    eigrpd=no
    babeld=no
    sharpd=no
    pbrd=yes
    ...
    
  1. Restart FRR with this command:

    cumulus@switch:~$ sudo systemctl restart frr.service

    Restarting FRR restarts all the routing protocol daemons that are enabled and running.

  1. Configure the policy map.

    The example commands below configure a policy map called map1 with sequence number 1 that matches on destination address 10.1.2.0/24 and source address 10.1.4.1/24.

    cumulus@switch:~$ sudo vtysh
    
    switch# configure terminal
    switch(config)# pbr-map map1 seq 1
    switch(config-pbr-map)# match dst-ip 10.1.2.0/24
    switch(config-pbr-map)# match src-ip 10.1.4.1/24
    switch(config-pbr-map)# exit
    switch(config)# 
    

    If the IP address in the rule is 0.0.0.0/0 or ::/0, any IP address is a match. You cannot mix IPv4 and IPv6 addresses in a rule.

    Instead of matching on IP address, you can match packets according to the DSCP or ECN field in the IP header. The DSCP value can be an integer between 0 and 63 or the DSCP codepoint name. The ECN value can be an integer between 0 and 3. The following example command configures a policy map called map1 with sequence number 1 that matches on packets with the DSCP value 10:

    switch# configure terminal
    switch(config)# pbr-map map1 seq 1
    switch(config-pbr-map)# match dscp 10
    switch(config-pbr-map)# exit
    switch(config)# 
    

    The following example command configures a policy map called map1 with sequence number 1 that matches on packets with the ECN value 2:

    switch# configure terminal
    switch(config)# pbr-map map1 seq 1
    switch(config-pbr-map)# match ecn 2
    switch(config-pbr-map)# exit
    switch(config)# 
    
  2. Apply a next hop group to the policy map. First configure the next hop group, then apply the group to the policy map. The example commands below create a next hop group called group1 that contains the next hop 192.168.0.21 on output interface swp1 and VRF RED, and the next hop 192.168.0.22, then apply the next hop group group1 to the map1 policy map.

    The output interface and VRF are optional. However, you must specify the VRF if the next hop is not in the default VRF.

    switch(config)# nexthop-group group1
    switch(config-nh-group)# nexthop 192.168.0.21 swp1 nexthop-vrf RED
    switch(config-nh-group)# nexthop 192.168.0.22
    switch(config-nh-group)# exit
    switch(config)# pbr-map map1 seq 1
    switch(config-pbr-map)# set nexthop-group group1
    switch(config-pbr-map)# exit
    switch(config)#
    

    If you want the rule to use a specific VRF table as its lookup, set the VRF. If you do not set a VRF, the rule uses the VRF table the interface is in as its lookup. The example command below sets the rule to use the dmz VRF table.

    You can set the VRF in a virtual environment only. Cumulus Linux on an NVIDIA switch does not support setting the VRF.

    switch(config)# pbr-map map1 seq 1
    switch(config-pbr-map)# set vrf dmz
    switch(config-pbr-map)# exit
    switch(config)#
    
  3. Assign the PBR policy to an ingress interface. The example command below assigns the PBR policy map1 to interface swp51:

    switch(config)# interface swp51
    switch(config-if)# pbr-policy map1
    switch(config-if)# end
    switch# write memory
    switch# exit
    cumulus@switch:~$
    

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

...
nexthop-group group1
nexthop 192.168.0.21 nexthop-vrf RED swp1
nexthop 192.168.0.22
pbr-map map1 seq 1
match dst-ip 10.1.2.0/24
match src-ip 10.1.4.1/24
set nexthop-group group1
interface swp51
pbr-policy map1
...

You can only set one policy per interface.
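The DSCP and ECN match values in the steps above refer to the two parts of the same header byte (the IPv4 Type of Service byte or the IPv6 Traffic Class byte): DSCP is the upper six bits and ECN is the lower two bits. The following Python sketch shows the split, which is why DSCP ranges from 0 to 63 and ECN from 0 to 3. The function name is hypothetical, and the codepoint table lists only a few standard names as an illustration.

```python
from typing import Tuple

# A few standard DSCP codepoint names and their numeric values (not a complete list).
DSCP_NAMES = {"cs1": 8, "af11": 10, "ef": 46}


def split_traffic_class(value: int) -> Tuple[int, int]:
    """Split the 8-bit ToS or Traffic Class value into (dscp, ecn)."""
    if not 0 <= value <= 0xFF:
        raise ValueError("expected an 8-bit value")
    dscp = value >> 2      # upper six bits: 0 to 63
    ecn = value & 0x3      # lower two bits: 0 to 3
    return dscp, ecn


# A byte of 42 (binary 00101010) carries DSCP 10 and ECN 2.
print(split_traffic_class(42))  # (10, 2)
print(split_traffic_class(DSCP_NAMES["ef"] << 2))
```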

Modify PBR Rules

When you want to change or extend an existing PBR rule, you must first delete the conditions in the rule, then add the rule back with the modification or addition.

Modify an existing match/set condition

The example below shows an existing configuration.

cumulus@switch:~$ sudo vtysh
...
switch# show pbr map
Seq: 4 rule: 303 Installed: yes Reason: Valid
    SRC Match: 10.1.4.1/24
    DST Match: 10.1.2.0/24
 nexthop 192.168.0.21
    Installed: yes Tableid: 10009

The commands for the above configuration are:

cumulus@switch:~$ nv set router pbr map pbr-policy rule 4 match source-ip 10.1.4.1/24
cumulus@switch:~$ nv set router pbr map pbr-policy rule 4 match destination-ip 10.1.2.0/24
cumulus@switch:~$ nv set router nexthop group group1 via 192.168.0.21
cumulus@switch:~$ nv set router pbr map pbr-policy rule 4 action nexthop-group group1

To change the source IP match from 10.1.4.1/24 to 10.1.4.2/24, you must delete the existing sequence by explicitly specifying the match/set condition. For example:

cumulus@switch:~$ nv unset router pbr map pbr-policy rule 4 match source-ip
cumulus@switch:~$ nv unset router pbr map pbr-policy rule 4 match destination-ip
cumulus@switch:~$ nv unset router nexthop group group1 via 192.168.0.21

Add the new rule with the following commands:

cumulus@switch:~$ nv set router pbr map pbr-policy rule 4 match source-ip 10.1.4.2/24
cumulus@switch:~$ nv set router pbr map pbr-policy rule 4 match destination-ip 10.1.2.0/24
cumulus@switch:~$ nv set router nexthop group group1 via 192.168.0.21
cumulus@switch:~$ nv config apply

Run the vtysh show pbr map command to verify that the rule has the updated source IP match:

cumulus@switch:~$ sudo vtysh
...
switch# show pbr map
Seq: 4 rule: 303 Installed: yes Reason: Valid
     SRC Match: 10.1.4.2/24
     DST Match: 10.1.2.0/24
   nexthop 192.168.0.21
     Installed: yes Tableid: 10012

Run the Linux ip rule show command to verify the entry in the kernel:

cumulus@switch:~$ ip rule show

303: from 10.1.4.1/24 to 10.1.4.2 iif swp16 lookup 10012

Run the following command to verify switchd:

cumulus@switch:~$ sudo cat /cumulus/switchd/run/iprule/show | grep 303 -A 1
303: from 10.1.4.1/24 to 10.1.4.2 iif swp16 lookup 10012
     [hwstatus: unit: 0, installed: yes, route-present: yes, resolved: yes, nh-valid: yes, nh-type: nh, ecmp/rif: 0x1, action: route,  hitcount: 0]

Add a match condition to an existing rule

The example below shows an existing configuration with only one source IP match:

Seq: 3 rule: 302 Installed: yes Reason: Valid
	SRC Match: 10.1.4.1/24
nexthop 192.168.0.21
	Installed: yes Tableid: 10008

The commands for the above configuration are:

cumulus@switch:~$ nv set router pbr map pbr-policy rule 3 match source-ip 10.1.4.1/24
cumulus@switch:~$ nv set router nexthop group group1 via 192.168.0.21

To add a destination IP match to the rule, you must delete the existing rule sequence:

cumulus@switch:~$ nv unset router pbr map pbr-policy rule 3 match source-ip
cumulus@switch:~$ nv unset router nexthop group group1 via 192.168.0.21
cumulus@switch:~$ nv config apply

Add back the source IP match and next hop condition, and add the new destination IP match (dst-ip 10.1.2.0/24):

cumulus@switch:~$ nv set router pbr map pbr-policy rule 3 match source-ip 10.1.4.1/24
cumulus@switch:~$ nv set router pbr map pbr-policy rule 3 match destination-ip 10.1.2.0/24
cumulus@switch:~$ nv set router nexthop group group1 via 192.168.0.21
cumulus@switch:~$ nv config apply

Run the vtysh show pbr map command to verify the update:

cumulus@switch:~$ sudo vtysh
...
switch# show pbr map
Seq: 3 rule: 302 Installed: 1(9) Reason: Valid
    SRC Match: 10.1.4.1/24
    DST Match: 10.1.2.0/24
   nexthop 192.168.0.21
    Installed: 1(1) Tableid: 10013

Run the ip rule show command to verify the entry in the kernel:

cumulus@switch:~$ ip rule show
302:  from 10.1.4.1/24 to 10.1.2.0 iif swp16 lookup 10013

Run the following command to verify switchd:

cumulus@switch:~$ sudo cat /cumulus/switchd/run/iprule/show | grep 302 -A 1
302: from 10.1.4.1/24 to 10.1.2.0 iif swp16 lookup 10013
     [hwstatus: unit: 0, installed: yes, route-present: yes, resolved: yes, nh-valid: yes, nh-type: nh, ecmp/rif: 0x1, action: route,  hitcount: 0]

Delete PBR Rules and Policies

You can delete a PBR rule, a next hop group, or a policy. The following commands provide examples.

Use caution when deleting PBR rules and next hop groups. Do not create an incorrect configuration for the PBR policy.

The following NVUE example shows how to delete a PBR rule match:

cumulus@switch:~$ nv unset router pbr map map1 rule 1 match destination-ip
cumulus@switch:~$ nv config apply

The following NVUE example shows how to delete a next hop from a group:

cumulus@switch:~$ nv unset router nexthop group group1 via 192.168.0.22
cumulus@switch:~$ nv config apply

The following NVUE example shows how to delete a next hop group:

cumulus@switch:~$ nv unset router nexthop group group1
cumulus@switch:~$ nv config apply

The following NVUE example shows how to delete a PBR policy so that the PBR interface no longer receives PBR traffic:

cumulus@switch:~$ nv unset interface swp51 router pbr map map1
cumulus@switch:~$ nv config apply

The following NVUE example shows how to delete a PBR rule:

cumulus@switch:~$ nv unset router pbr map map1 rule 1
cumulus@switch:~$ nv config apply

To remove a PBR map and the corresponding next hop group, you must first delete the PBR map and run nv config apply, then remove the corresponding next hop group; for example:

cumulus@switch:~$ nv unset router pbr map map1
cumulus@switch:~$ nv config apply
cumulus@switch:~$ nv unset router nexthop group group1
cumulus@switch:~$ nv config apply

The following vtysh example shows how to delete a PBR rule match:

cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# pbr-map map1 seq 1
switch(config-pbr-map)# no match dst-ip 10.1.2.0/24
switch(config-pbr-map)# end
switch# write memory
switch# exit

The following vtysh example shows how to delete a next hop from a group:

cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# nexthop-group group1
switch(config-nh-group)# no nexthop 192.168.0.32 swp1 nexthop-vrf RED
switch(config-nh-group)# end
switch# write memory
switch# exit

The following vtysh example shows how to delete a next hop group:

cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# no nexthop-group group1
switch(config)# end
switch# write memory
switch# exit

The following vtysh example shows how to delete a PBR policy so that the PBR interface no longer receives PBR traffic:

cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# interface swp51
switch(config-if)# no pbr-policy map1
switch(config-if)# end
switch# write memory
switch# exit

The following vtysh example shows how to delete a PBR rule:

cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# no pbr-map map1 seq 1
switch(config)# end
switch# write memory
switch# exit

If a PBR rule has multiple conditions (for example, a source IP match and a destination IP match), but you only want to delete one condition, you have to delete all conditions first, then re-add the ones you want to keep.

The example below shows an existing configuration that has a source IP match and a destination IP match.

Seq: 6 rule: 305 Installed: yes Reason: Valid
   SRC Match: 10.1.4.1/24
   DST Match: 10.1.2.0/24
nexthop 192.168.0.21
   Installed: yes Tableid: 10011

The commands for the above configuration are:

cumulus@switch:~$ nv set router pbr map pbr-policy rule 6 match source-ip 10.1.4.1/24
cumulus@switch:~$ nv set router pbr map pbr-policy rule 6 match destination-ip 10.1.2.0/24
cumulus@switch:~$ nv set router nexthop group group1 via 192.168.0.21

To remove the destination IP match, you must first delete all existing conditions defined under this sequence:

cumulus@switch:~$ nv unset router pbr map pbr-policy rule 6 match source-ip 
cumulus@switch:~$ nv unset router pbr map pbr-policy rule 6 match destination-ip
cumulus@switch:~$ nv unset router nexthop group group1 via 192.168.0.21
cumulus@switch:~$ nv config apply

Then, add back the conditions you want to keep:

cumulus@switch:~$ nv set router pbr map pbr-policy rule 6 match source-ip 10.1.4.1/24
cumulus@switch:~$ nv set router nexthop group group1 via 192.168.0.21
cumulus@switch:~$ nv config apply
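
The delete-then-re-add sequence above can be sketched in Python. This is an illustration only: the rebuild_rule_commands helper and its parameters are hypothetical and are not part of NVUE. The sketch only builds the command strings shown in this section; it does not run them.

```python
def rebuild_rule_commands(pbr_map, rule, current, keep, group, via):
    """Build the NVUE commands that remove every condition of a rule,
    then re-add only the conditions listed in keep (hypothetical helper)."""
    match = f"router pbr map {pbr_map} rule {rule} match"
    nexthop = f"router nexthop group {group} via {via}"
    # Step 1: delete all existing conditions defined under the sequence.
    cmds = [f"nv unset {match} {name}" for name in current]
    cmds += [f"nv unset {nexthop}", "nv config apply"]
    # Step 2: add back only the conditions to keep.
    cmds += [f"nv set {match} {name} {current[name]}" for name in keep]
    cmds += [f"nv set {nexthop}", "nv config apply"]
    return cmds

# Rule 6 from the example above: keep the source IP match only.
current = {"source-ip": "10.1.4.1/24", "destination-ip": "10.1.2.0/24"}
commands = rebuild_rule_commands("pbr-policy", 6, current, ["source-ip"],
                                 "group1", "192.168.0.21")
print("\n".join(commands))
```

Review the generated commands before you run them on a switch.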

Troubleshooting

To see the policies applied to all interfaces on the switch, run the NVUE nv show router pbr -o json command:

cumulus@switch:~$ nv show router pbr -o json
{
  "map": {
    "map1": {
      "rule": {
        "1": {
          "action": {
            "nexthop-group": {
              "group1": {
                "installed": "off",
                "table-id": 10000
              }
            }
          },
          "installed": "off",
          "installed-reason": "Invalid NH-group",
          "ip-rule-id": 300,
          "match": {
            "destination-ip": "10.1.2.0/24",
            "dscp": 10,
            "source-ip": "10.1.4.1/24"
          }
        }
      }
    }
  }
}
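
If you collect this JSON output with a script, you can list the rules that are not installed together with the reported reason. The following Python sketch assumes the structure shown above; the rules_not_installed function name is hypothetical, and treating the values on and yes as installed is an assumption based on the sample outputs in this section.

```python
import json

def rules_not_installed(pbr_json):
    """Return (map, rule, reason) for each rule that is not installed."""
    data = json.loads(pbr_json)
    problems = []
    for map_name, pbr_map in data.get("map", {}).items():
        for rule_id, rule in pbr_map.get("rule", {}).items():
            # Assumption: "on" or "yes" means the rule is installed.
            if rule.get("installed") not in ("on", "yes"):
                problems.append((map_name, rule_id, rule.get("installed-reason")))
    return problems

# Trimmed copy of the output shown above.
sample = """
{"map": {"map1": {"rule": {"1": {
    "installed": "off",
    "installed-reason": "Invalid NH-group",
    "ip-rule-id": 300
}}}}}
"""
print(rules_not_installed(sample))
```

On a switch, you can pass the saved output of nv show router pbr -o json to the function instead of the embedded sample.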

To see the policies applied to a specific interface on the switch, run the NVUE nv show interface <interface> router pbr command or the vtysh show pbr interface <interface> command.

To see information about all policies, including mapped table and rule numbers, run the NVUE nv show router pbr map command or the vtysh show pbr map command. If the rule is not set, you see a reason why.

cumulus@switch:~$ sudo vtysh
switch# show pbr map
 pbr-map map1 valid: yes
  Seq: 700 rule: 999 Installed: yes Reason: Valid
      SRC Match: 10.0.0.1/32
  nexthop 192.168.0.32
      Installed: yes Tableid: 10003
  Seq: 701 rule: 1000 Installed: yes Reason: Valid
      SRC Match: 90.70.0.1/32
  nexthop 192.168.0.32
      Installed: yes Tableid: 10004

To see information about a policy, its matches, and associated interface, run the NVUE nv show router pbr map <map> -o json command or the vtysh show pbr map <map-name> command.

cumulus@switch:~$ nv show router pbr map map1 -o json
{
  "rule": {
    "1": {
      "action": {
        "nexthop-group": {
          "group1": {
            "installed": "on",
            "table-id": 10000
          }
        }
      },
      "installed": "on",
      "installed-reason": "Valid",
      "ip-rule-id": 300,
      "match": {
        "destination-ip": "10.1.2.0/24",
        "source-ip": "10.1.4.1/24"
      }
    }
  },
  "valid": "yes"
}

To see information about all next hop groups, run the NVUE nv show router pbr nexthop-group command or the vtysh show pbr nexthop-group command.

cumulus@switch:~$ nv show router pbr nexthop-group
Nexthop-groups  installed  valid    Summary         
--------------  ---------  -----    ----------------
group1          yes         yes     Nexthop-index: 1
                                    Nexthop-index: 2

To show more detailed information about the next hop groups, run the nv show router pbr nexthop-group -o json command:

cumulus@switch:~$ nv show router pbr nexthop-group -o json
{
  "group1": {
    "installed": "yes",
    "nexthop": {
      "1": {
        "nexthop": "20.1.1.2",
        "valid": "yes",
        "vrf": "swp1s0"
      }
    },
    "valid": "yes"
  }
}
...

To see information about a specific next hop group, run the NVUE nv show router pbr nexthop-group <nexthop-group> command or the vtysh show pbr nexthop-group <nexthop-group> command.

Each next hop and next hop group uses a new Linux routing table ID.

To show the reserved routing table range, run the NVUE nv show system global reserved routing-table pbr command.

cumulus@switch:~$ nv show system global reserved routing-table pbr
       operational  applied   
-----  -----------  ----------
begin  10000        10000     
end    4294966272   4294966272

Example Configuration

In the following example, the PBR-enabled switch has a PBR policy to route all traffic from the Internet to a server that performs anti-DDoS. After cleaning, the traffic returns to the PBR-enabled switch and then passes on to the regular destination-based routing mechanism.

cumulus@switch:~$ nv set router pbr map map1 rule 1 match source-ip 0.0.0.0/0
cumulus@switch:~$ nv set router nexthop group group1 via 192.168.0.32
cumulus@switch:~$ nv set router pbr map map1 rule 1 action nexthop-group group1
cumulus@switch:~$ nv set interface swp51 router pbr map map1
cumulus@switch:~$ nv config apply

The equivalent vtysh commands are:

cumulus@switch:~$ sudo vtysh

switch# configure terminal
switch(config)# nexthop-group group1
switch(config-nh-group)#  nexthop 192.168.0.32
switch(config-nh-group)# exit
switch(config)# pbr-map map1 seq 1
switch(config-pbr-map)#  match src-ip 0.0.0.0/0
switch(config-pbr-map)#  set nexthop-group group1
switch(config-pbr-map)# exit
switch(config)# interface swp51
switch(config-if)#  pbr-policy map1
switch(config-if)# end
switch# write memory
switch# exit
cumulus@switch:mgmt:~$

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

interface swp51
 pbr-policy map1
nexthop-group group1
 nexthop 192.168.0.32
pbr-map map1 seq 1
 match src-ip 0.0.0.0/0
 set nexthop-group group1
...

Equal Cost Multipath Load Sharing

Cumulus Linux enables ECMP by default. Load sharing occurs automatically for IPv4 and IPv6 routes with multiple installed next hops. The hardware or the routing protocol configuration determines the maximum number of routes for which load sharing occurs.

ECMP operates only on equal cost routes in the RIB. For Cumulus Linux to consider routes equal, the routes must:

When multiple routes are in the routing table, a hash determines which path a packet follows. To prevent out-of-order packets, ECMP hashes on a per-flow basis; all packets with the same source and destination IP addresses and the same source and destination ports always hash to the same next hop. ECMP hashing does not keep a record of packets that hash to each next hop and does not guarantee that traffic to each next hop is equal.
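
The per-flow behavior can be illustrated with a short Python sketch. This is not the hash that the switch hardware uses; the pick_next_hop function, the CRC32 hash, and the interface names are assumptions chosen only for illustration.

```python
import zlib

def pick_next_hop(src_ip, dst_ip, src_port, dst_port, protocol, next_hops):
    """Map a flow's header fields to one next hop (illustrative only)."""
    # Every packet in a flow carries the same header fields, so the key,
    # the hash, and therefore the selected next hop never change.
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{protocol}".encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]

next_hops = ["swp51", "swp52", "swp53", "swp54"]  # hypothetical uplinks
flow = ("10.1.4.1", "10.1.2.10", 20000, 80, "tcp")

# Repeated packets of the same flow always select the same next hop.
choices = {pick_next_hop(*flow, next_hops) for _ in range(1000)}
print(choices)
```

Because the selection depends only on header fields and not on traffic volume, one heavy flow and one light flow can land on different next hops without the load being equal.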

Cumulus Linux enables the BGP maximum-paths setting by default and installs multiple routes. Refer to BGP and ECMP.

Next Hop Groups

ECMP routes resolve to next hop groups, which identify one or more next hops. To view next hop information, run the NVUE nv show router nexthop rib or nv show router nexthop rib <id> commands, or the ip nexthop show or ip nexthop show <id> kernel commands.

cumulus@leaf01:mgmt:~$ nv show router nexthop rib
Installed - Install state 
ID   Installed  Uptime                Vrf      Valid  Via                        ViaIntf        ViaVrf   Depends
---  ---------  --------------------  -------  -----  -------------------------  -------------  -------  -------
12              2024-10-22T18:35:53Z  default  on     swp53                                     default         
13              2024-10-22T18:35:53Z  default  on     swp51                                     default         
14   on         2024-10-22T18:35:53Z  default  on     swp54                                     default         
15   on         2024-10-22T18:36:00Z  default  on     lo                                        default         
16   on         2024-10-22T18:35:53Z  default  on     eth0                                      mgmt            
17   on         2024-10-22T18:35:53Z  default  on     eth0                                      mgmt            
18              2024-10-22T18:36:00Z  default  on                                                               
19   on         2024-10-22T18:36:00Z  default  on     192.168.200.1              eth0           mgmt            
20   on         2024-10-22T18:36:00Z  default  on                                                               
21   on         2024-10-22T18:36:00Z  default  on                                                               
22   on         2024-10-22T18:36:00Z  default  on                                                               
24              2024-10-22T18:35:53Z  default  on     swp52                                     default         
52              2024-10-22T18:35:55Z  default  on     peerlink.4094                             default         
62   on         2024-10-22T18:36:02Z  default  on     fe80::4ab0:2dff:feb5:3daa  peerlink.4094  default         
74              2024-10-22T18:35:59Z  default  on     br_default                                default         
75              2024-10-22T18:35:59Z  default  on     vlan10v0                                  RED             
76   on         2024-10-22T18:35:59Z  default  on     vlan10                                    RED             
77              2024-10-22T18:35:59Z  default  on     vlan10v0                                  RED             
78              2024-10-22T18:35:59Z  default  on     vlan4063_l3                               RED             
79              2024-10-22T18:35:59Z  default  on     vlan20                                    RED             
80   on         2024-10-22T18:35:59Z  default  on     vlan10                                    RED             
81   on         2024-10-22T18:35:59Z  default  on     vlan20                                    RED             
82   on         2024-10-22T18:35:59Z  default  on     vlan30                                    BLUE            
83              2024-10-22T18:35:59Z  default  on     vlan4006_l3                               BLUE            
84   on         2024-10-22T18:35:59Z  default  on     vlan30                                    BLUE            
91              2024-10-22T18:35:59Z  default  on     vlan20v0                                  RED             
92              2024-10-22T18:35:59Z  default  on     vlan4063_l3v0                             RED             
93              2024-10-22T18:35:59Z  default  on     vlan20v0                                  RED             
94              2024-10-22T18:35:59Z  default  on     vlan30v0                                  BLUE            
95              2024-10-22T18:35:59Z  default  on     vlan4006_l3v0                             BLUE            
96              2024-10-22T18:35:59Z  default  on     vlan30v0                                  BLUE            
100             2024-10-22T18:36:01Z  default  on     vxlan48                                   default         
107  on         2024-10-22T18:36:04Z  default  on     fe80::4ab0:2dff:fe32:2a3f  swp52          default         
110  on         2024-10-22T18:36:04Z  default  on     10.10.10.63                vlan4063_l3    RED             
111  on         2024-10-22T18:36:04Z  default  on     10.10.10.63                vlan4006_l3    BLUE            
115  on         2024-10-22T18:36:04Z  default  on     fe80::4ab0:2dff:fe41:6b79  swp51          default         
125  on         2024-10-22T18:42:21Z  default  on                                                        107    
                                                                                                         115    
126  on         2024-10-22T18:36:04Z  default  on                                                        111    
                                                                                                         127    
127  on         2024-10-22T18:36:04Z  default  on     10.10.10.64                vlan4006_l3    BLUE            
128  on         2024-10-22T18:36:04Z  default  on                                                        110    
                                                                                                         129    
129  on         2024-10-22T18:36:04Z  default  on     10.10.10.64                vlan4063_l3    RED             
140  on         2024-10-22T18:42:34Z  default  on     10.0.1.34                  vlan4006_l3    BLUE            
142  on         2024-10-22T18:42:36Z  default  on     10.0.1.34                  vlan4063_l3    RED
...

The following example shows information for next hop group 129:

cumulus@leaf01:mgmt:~$ nv show router nexthop rib 129
                 operational         
---------------  --------------------
type             zebra               
ref-count        2                   
vrf              default             
valid            on                  
installed        on                  
interface-index  74                  
uptime           2024-10-22T18:36:04Z

Via
======
                                                                          
    Flags - u - unreachable, r - recursive, o - onlink, i - installed, d -          
    duplicate, c - connected, A - active, Type - Type of nexthop, Weight - Weight to
    be used by the nexthop for purposes of ECMP, VRF - VRF to use for egress.       
                                                                                
    Nexthop      Flags  Type        Weight  VRF  Interface  
    -----------  -----  ----------  ------  ---  -----------
    10.10.10.64  oA     ip-address  1       RED  vlan4063_l3

Via BackupNexthops
=====================
No Data

Depends
==========
No Data

Dependents
=============
    Nexthop-group
    -------------
    128

ECMP Hashing

You can configure custom hashing to specify what to include in the hash calculation during load balancing between:

For ECMP load balancing between multiple next hops of a layer 3 route, you can hash on these fields:

Field                         Default Setting  NVUE Command                                                   traffic.conf
----------------------------  ---------------  -------------------------------------------------------------  ---------------------------
IP protocol                   on               nv set system forwarding ecmp-hash ip-protocol on              hash_config.ip_prot
                                               nv set system forwarding ecmp-hash ip-protocol off
Source IP address             on               nv set system forwarding ecmp-hash source-ip on                hash_config.sip
                                               nv set system forwarding ecmp-hash source-ip off
Destination IP address        on               nv set system forwarding ecmp-hash destination-ip on           hash_config.dip
                                               nv set system forwarding ecmp-hash destination-ip off
Source port                   on               nv set system forwarding ecmp-hash source-port on              hash_config.sport
                                               nv set system forwarding ecmp-hash source-port off
Destination port              on               nv set system forwarding ecmp-hash destination-port on         hash_config.dport
                                               nv set system forwarding ecmp-hash destination-port off
IPv6 flow label               on               nv set system forwarding ecmp-hash ipv6-label on               hash_config.ip6_label
                                               nv set system forwarding ecmp-hash ipv6-label off
Ingress interface             off              nv set system forwarding ecmp-hash ingress-interface on        hash_config.ing_intf
                                               nv set system forwarding ecmp-hash ingress-interface off
TEID (see GTP Hashing)        off              nv set system forwarding ecmp-hash gtp-teid on                 hash_config.gtp_teid
                                               nv set system forwarding ecmp-hash gtp-teid off
Inner IP protocol             off              nv set system forwarding ecmp-hash inner-ip-protocol on        hash_config.inner_ip_prot
                                               nv set system forwarding ecmp-hash inner-ip-protocol off
Inner source IP address       off              nv set system forwarding ecmp-hash inner-source-ip on          hash_config.inner_sip
                                               nv set system forwarding ecmp-hash inner-source-ip off
Inner destination IP address  off              nv set system forwarding ecmp-hash inner-destination-ip on     hash_config.inner_dip
                                               nv set system forwarding ecmp-hash inner-destination-ip off
Inner source port             off              nv set system forwarding ecmp-hash inner-source-port on        hash_config.inner_sport
                                               nv set system forwarding ecmp-hash inner-source-port off
Inner destination port        off              nv set system forwarding ecmp-hash inner-destination-port on   hash_config.inner_dport
                                               nv set system forwarding ecmp-hash inner-destination-port off
Inner IPv6 flow label         off              nv set system forwarding ecmp-hash inner-ipv6-label on         hash_config.inner_ip6_label
                                               nv set system forwarding ecmp-hash inner-ipv6-label off

The following example commands omit the source port and destination port from the hash calculation:

cumulus@switch:~$ nv set system forwarding ecmp-hash source-port off
cumulus@switch:~$ nv set system forwarding ecmp-hash destination-port off
cumulus@switch:~$ nv config apply

Use the instructions below when NVUE is not enabled. If you are using NVUE to configure your switch, the NVUE commands change the settings in /etc/cumulus/datapath/nvue_traffic.conf, which takes precedence over the settings in /etc/cumulus/datapath/traffic.conf.

  1. Edit the /etc/cumulus/datapath/traffic.conf file:
    • Uncomment the hash_config.enable = true option.
    • Set the hash_config.sport and hash_config.dport options to false.
cumulus@switch:~$ sudo nano /etc/cumulus/datapath/traffic.conf
...
# HASH config for  ECMP to enable custom fields
# Fields will be applicable for ECMP hash
# calculation
#Note : Currently supported only for MLX platform
# Uncomment to enable custom fields configured below
hash_config.enable = true

#hash Fields available ( assign true to enable)
#ip protocol
hash_config.ip_prot = true
#source ip
hash_config.sip = true
#destination ip
hash_config.dip = true
#source port
hash_config.sport = false
#destination port
hash_config.dport = false
...
  2. Run the echo 1 > /cumulus/switchd/ctrl/hash_config_reload command. This command does not cause any traffic interruptions.

    cumulus@switch:~$ echo 1 > /cumulus/switchd/ctrl/hash_config_reload
    

Cumulus Linux enables symmetric hashing by default. Make sure that the settings for the source IP and destination IP fields match, and that the settings for the source port and destination port fields match; otherwise Cumulus Linux disables symmetric hashing automatically. If necessary, you can disable symmetric hashing manually in the /etc/cumulus/datapath/traffic.conf file by setting symmetric_hash_enable = FALSE.
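
Before you reload the hash configuration, you can check that the paired fields agree. The following Python sketch parses key = value lines in the style of traffic.conf and compares the source and destination settings. The function names are hypothetical, the parser is a simplification that ignores commented-out defaults, and the sample text is a trimmed copy of the file excerpt shown earlier.

```python
def parse_settings(text):
    """Parse 'key = value' lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().lower() == "true"
    return settings

def symmetric_pairs_match(settings):
    """True when sip/dip agree and sport/dport agree."""
    return (settings.get("hash_config.sip") == settings.get("hash_config.dip")
            and settings.get("hash_config.sport") == settings.get("hash_config.dport"))

sample = """
hash_config.enable = true
hash_config.sip = true
hash_config.dip = true
hash_config.sport = false
hash_config.dport = false
"""
print(symmetric_pairs_match(parse_settings(sample)))
```

A False result means that one field of a pair is enabled while the other is not, which is the condition under which symmetric hashing is disabled.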

GTP Hashing

GTP carries mobile data within the core of the mobile operator’s network. Traffic in the 5G Mobility core cluster, from cell sites to compute nodes, has the same source and destination IP address. The only way to identify individual flows is with the GTP TEID. Enabling GTP hashing adds the TEID as a hash parameter and helps the Cumulus Linux switches in the network to distribute mobile data traffic evenly across ECMP routes.

Cumulus Linux supports TEID-based ECMP hashing for:

For TEID-based load balancing for traffic on a bond, see Bonding - Link Aggregation.

GTP TEID-based ECMP hashing is only applicable if the outer header egressing the port is GTP encapsulated and if the ingress packet is either a GTP-U packet or a VXLAN encapsulated GTP-U packet.

To enable TEID-based ECMP hashing:

cumulus@switch:~$ nv set system forwarding ecmp-hash gtp-teid on
cumulus@switch:~$ nv config apply

To disable TEID-based ECMP hashing, run the nv set system forwarding ecmp-hash gtp-teid off command.

Use the instructions below when NVUE is not enabled. If you are using NVUE to configure your switch, the NVUE commands change the settings in /etc/cumulus/datapath/nvue_traffic.conf, which takes precedence over the settings in /etc/cumulus/datapath/traffic.conf.

  1. Edit the /etc/cumulus/datapath/traffic.conf file and change the hash_config.gtp_teid parameter to true:

    cumulus@switch:~$ sudo nano /etc/cumulus/datapath/traffic.conf
    ...
    #GTP-U teid
    hash_config.gtp_teid = true
    
  2. Run the echo 1 > /cumulus/switchd/ctrl/hash_config_reload command. This command does not cause any traffic interruptions.

    cumulus@switch:~$ echo 1 > /cumulus/switchd/ctrl/hash_config_reload
    

To disable TEID-based ECMP hashing, set the hash_config.gtp_teid parameter to false, then reload the configuration.

To show that TEID-based ECMP hashing is on, run the nv show system forwarding ecmp-hash command:

cumulus@switch:~$ nv show system forwarding ecmp-hash
                   applied  description
-----------------  -------  -----------------------------------
destination-ip     on       Destination IPv4/IPv6 Address
destination-port   on       TCP/UDP destination port
gtp-teid           on       GTP-U TEID
...

ECMP Hash Buckets

When there are multiple routes in the routing table, Cumulus Linux assigns each route to an ECMP bucket. When the ECMP hash executes, the result of the hash determines which bucket to use.

In the following example, four next hops exist. Three different flows hash to different hash buckets. Each next hop goes to a unique hash bucket.

The addition of a next hop creates a new hash bucket. The assignment of next hops to hash buckets, as well as the hash result, sometimes changes with the addition of next hops.

With the addition of a new next hop, there is a new hash bucket. As a result, the hash and hash bucket assignment changes, so the existing flows go to different next hops.

When you remove a next hop, the remaining hash bucket assignments can change, which can also change the next hop selected for an existing flow.

A next hop fails, which removes the next hop and hash bucket. It is possible that Cumulus Linux reassigns the remaining next hops.

In most cases, modifying hash buckets has no impact on traffic flows as the switch forwards traffic to a single end host. In deployments where multiple end hosts use the same IP address (anycast), you must use resilient hashing.
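
The effect of a changing bucket count can be sketched with a simplified model in Python. The model assumes one bucket per next hop and selects the bucket with the hash result modulo the number of buckets; the CRC32 hash, the flow keys, and the next hop names are illustrative assumptions and do not reflect the hardware implementation.

```python
import zlib

def next_hop_for(flow_id, next_hops):
    """One bucket per next hop: the hash result modulo the bucket count."""
    return next_hops[zlib.crc32(flow_id.encode()) % len(next_hops)]

flows = [f"flow-{i}" for i in range(1000)]  # hypothetical flow keys
before = ["A", "B", "C", "D"]               # four next hops
after = before + ["E"]                      # one next hop added

# Count the existing flows whose next hop changes after the addition.
moved = sum(next_hop_for(f, before) != next_hop_for(f, after) for f in flows)
print(f"{moved} of {len(flows)} flows changed next hop")
```

In this model, flows move between next hops that were present both before and after the change, which is the behavior that matters for anycast deployments.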

Unique Hash Seed

You can configure a unique hash seed for each switch to prevent hash polarization, a type of network congestion that occurs when multiple data flows try to reach a switch using the same switch ports.

You can set a hash seed value between 0 and 4294967295. If you do not specify a value, switchd creates a randomly generated seed.
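
The following Python sketch illustrates why a different seed on each switch helps. It is a simplified model: the choose function, the SHA-256 based hash, and the two-tier layout are assumptions for illustration, not the hashing that switchd or the ASIC performs.

```python
import hashlib

def choose(flow, seed, n_paths):
    """Seeded hash of the flow key, reduced to a path index (illustrative)."""
    digest = hashlib.sha256(f"{seed}:{flow}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % n_paths

flows = [f"flow-{i}" for i in range(2000)]  # hypothetical flow keys
N = 4                                       # equal cost paths on each switch

# The first switch uses seed 50 and sends these flows out path 0,
# toward the second switch.
to_second = [f for f in flows if choose(f, seed=50, n_paths=N) == 0]

# Same seed on the second switch: every flow selects path 0 again.
same_seed = {choose(f, seed=50, n_paths=N) for f in to_second}
# Different seed on the second switch: the flows spread across paths.
diff_seed = {choose(f, seed=51, n_paths=N) for f in to_second}
print(same_seed, diff_seed)
```

With identical seeds, the second switch repeats the decision of the first switch and leaves its other paths unused for these flows.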

To configure the hash seed:

cumulus@switch:~$ nv set system forwarding hash-seed 50
cumulus@switch:~$ nv config apply

If you do not enable NVUE, use the instructions below. If you are using NVUE to configure your switch, the NVUE commands change the settings in /etc/cumulus/datapath/nvue_traffic.conf, which takes precedence over the settings in /etc/cumulus/datapath/traffic.conf.

Edit the /etc/cumulus/datapath/traffic.conf file to change the ecmp_hash_seed parameter, then restart switchd.

cumulus@switch:~$ sudo nano /etc/cumulus/datapath/traffic.conf
...
#Specify the hash seed for Equal cost multipath entries
# and for custom ecmp and lag hash
# Default value: random
# Value Range: {0..4294967295}
ecmp_hash_seed = 50
...
cumulus@switch:~$ sudo systemctl restart switchd.service

Restarting the switchd service causes all network ports to reset, interrupting network services, in addition to resetting the switch hardware configuration.

cl-ecmpcalc

Run the cl-ecmpcalc command to determine a hardware hash result. For example, you can see which path a flow takes through a network. You must provide all fields in the hash, including the ingress interface, layer 3 source IP, layer 3 destination IP, layer 4 source port, and layer 4 destination port.

cl-ecmpcalc only supports input interfaces that convert to a single physical port in the port tab file, such as the physical switch ports (swp). You cannot specify virtual interfaces like bridges, bonds, or subinterfaces.

cumulus@switch:~$ sudo cl-ecmpcalc -i swp1 -s 10.0.0.1 -d 10.0.0.1 -p tcp --sport 20000 --dport 80
ecmpcalc: will query hardware
swp3

If you omit a field, cl-ecmpcalc fails.

cumulus@switch:~$ sudo cl-ecmpcalc -i swp1 -s 10.0.0.1 -d 10.0.0.1 -p tcp
ecmpcalc: will query hardware
usage: cl-ecmpcalc [-h] [-v] [-p PROTOCOL] [-s SRC] [--sport SPORT] [-d DST]
                   [--dport DPORT] [--vid VID] [-i IN_INTERFACE]
                   [--sportid SPORTID] [--smodid SMODID] [-o OUT_INTERFACE]
                   [--dportid DPORTID] [--dmodid DMODID] [--hardware]
                   [--nohardware] [-hs HASHSEED]
                   [-hf HASHFIELDS [HASHFIELDS ...]]
                   [--hashfunction {crc16-ccitt,crc16-bisync}] [-e EGRESS]
                   [-c MCOUNT]
cl-ecmpcalc: error: --sport and --dport required for TCP and UDP frames

Resilient Hashing

In Cumulus Linux, when a next hop fails or you remove the next hop from an ECMP pool, the hashing or hash bucket assignment can change. Resilient hashing is an alternate way to manage ECMP groups. Cumulus Linux assigns next hops to buckets using their hashing header fields and uses the resulting hash to index into the table of 2^n hash buckets. Because all packets in a given flow have the same header hash value, they all use the same flow bucket.

The NVIDIA Spectrum ASIC assigns packets to hash buckets and assigns hash buckets to next hops. The ASIC also runs a background thread that monitors buckets and can migrate buckets between next hops to rebalance the load.

Any flow can migrate to any next hop, depending on flow activity and load balance conditions. Over time, the flow can get pinned, which is the default setting and behavior.

When you enable resilient hashing, Cumulus Linux assigns next hops in round robin fashion to a fixed number of buckets. In this example, there are 12 buckets and four next hops.

Unlike default ECMP hashing, when you need to remove a next hop, the number of hash buckets does not change.

With 12 buckets and four next hops, instead of reducing the number of buckets, which impacts flows to known good hosts, the remaining next hops replace the failed next hop.

After you remove the failed next hop, the remaining next hops replace it. This prevents impact to any flows that hash to working next hops.

Resilient hashing does not prevent possible impact to existing flows when you add new next hops. Because there are a fixed number of buckets, a new next hop requires reassigning next hops to buckets.

As a result, some flows hash to new next hops, which can impact anycast deployments.
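
The bucket behavior described above, with 12 buckets and four next hops, can be sketched in Python. The build_table and remove_next_hop functions are hypothetical, and the order in which the remaining next hops fill the freed buckets is an assumption; the sketch only shows that buckets of working next hops stay unchanged.

```python
def build_table(next_hops, n_buckets):
    """Assign next hops to a fixed number of buckets in round robin order."""
    return [next_hops[i % len(next_hops)] for i in range(n_buckets)]

def remove_next_hop(table, failed):
    """Refill only the failed next hop's buckets from the remaining next hops."""
    remaining = []
    for hop in table:
        if hop != failed and hop not in remaining:
            remaining.append(hop)
    new_table = list(table)
    slots = [i for i, hop in enumerate(table) if hop == failed]
    for n, i in enumerate(slots):
        new_table[i] = remaining[n % len(remaining)]
    return new_table

table = build_table(["A", "B", "C", "D"], 12)
after = remove_next_hop(table, "C")
changed = [i for i in range(12) if table[i] != after[i]]
print(table)
print(after)
print(changed)  # [2, 6, 10]: only the buckets that pointed to C change
```

The table keeps 12 buckets after the removal, so flows that hash to buckets of A, B, or D continue to use the same next hop.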

Cumulus Linux does not enable resilient hashing by default. When you enable resilient hashing, all ECMP groups share 65,536 buckets. An ECMP group is a list of unique next hops that multiple ECMP routes reference.

An ECMP route counts as a single route with multiple next hops.

All ECMP routes must use the same number of buckets (you cannot configure the number of buckets per ECMP route).

A larger number of ECMP buckets reduces the impact of adding new next hops to an ECMP route. However, the system supports fewer ECMP routes. If you install the maximum number of ECMP routes, new ECMP routes log an error and do not install.

You can configure route and MAC address hardware resources depending on ECMP bucket size changes. See NVIDIA Spectrum routing resources.

To enable resilient hashing:

Cumulus Linux does not provide NVUE commands for this setting.
  1. Edit the /etc/cumulus/datapath/traffic.conf file to uncomment and set the resilient_hash_enable parameter to TRUE.

    You can also set the resilient_hash_entries_ecmp parameter to the number of hash buckets to use for all ECMP routes. On Spectrum switches, you can set the number of buckets to 64, 512, 1024, 2048, or 4096. On NVIDIA Spectrum-2 and later, you can set the number of buckets to 64, 128, 256, 512, 1024, 2048, or 4096. The default value is 64.

    # Enable resilient hashing
    resilient_hash_enable = TRUE
    
    # Resilient hashing flowset entries per ECMP group
    # 
    # Mellanox Spectrum platforms:
    # Valid values - 64, 512, 1024, 2048, 4096
    #
    # Mellanox Spectrum2/3 platforms
    # Valid values -  64, 128, 256, 512, 1024, 2048, 4096
    #
    # resilient_hash_entries_ecmp = 64
    
  2. Restart the switchd service:

cumulus@switch:~$ sudo systemctl restart switchd.service

Restarting the switchd service causes all network ports to reset, interrupting network services, in addition to resetting the switch hardware configuration.

  3. Resilient hashing in hardware does not work with next hop groups; the switch remaps flows to new next hops when the set of next hops changes. To work around this issue, configure zebra not to install next hop IDs in the kernel with the following vtysh command:
cumulus@switch:~$ sudo vtysh
switch# configure terminal
switch(config)# zebra nexthop proto only
switch(config)# exit
switch# write memory
switch# exit
cumulus@switch:~$

Considerations

When the router adds or removes ECMP paths, or when the next hop IP address, interface, or tunnel changes, the next hop information for an IPv6 prefix can change. FRR deletes the existing route to that prefix from the kernel, then adds a new route with all the relevant new information. In certain situations, Cumulus Linux does not maintain resilient hashing for IPv6 flows.

To work around this issue, you can enable IPv6 route replacement.

For certain configurations, IPv6 route replacement can lead to incorrect forwarding decisions and lost traffic. For example, a destination can have next hops that specify both a gateway address and an outbound interface, and next hops that specify only the outbound interface without a gateway address. If both types of next hops exist for the same destination, route replacement does not operate correctly; Cumulus Linux adds an additional route entry and next hop but does not delete the previous route entry and next hop.

To enable the IPv6 route replacement option:

  1. In the /etc/frr/daemons file, add the configuration option --v6-rr-semantics to the zebra daemon definition. For example:

    cumulus@switch:~$ sudo nano /etc/frr/daemons
    ...
    vtysh_enable=yes
    zebra_options=" -M cumulus_mlag -M snmp -A 127.0.0.1 --v6-rr-semantics -s 90000000"
    bgpd_options=" -M snmp  -A 127.0.0.1"
    ospfd_options=" -M snmp -A 127.0.0.1"
    ...
    
  2. Restart FRR with this command:

    cumulus@switch:~$ sudo systemctl restart frr.service

    Restarting FRR restarts all the routing protocol daemons that are enabled and running.

To verify that IPv6 route replacement is enabled, run the systemctl status frr command and check that the zebra process runs with the --v6-rr-semantics option:

cumulus@switch:~$ systemctl status frr

    ● frr.service - FRRouting
      Loaded: loaded (/lib/systemd/system/frr.service; enabled; vendor preset: enabled)
      Active: active (running) since Mon 2020-02-03 20:02:33 UTC; 3min 8s ago
        Docs: https://frrouting.readthedocs.io/en/latest/setup.html
      Process: 4675 ExecStart=/usr/lib/frr/frrinit.sh start (code=exited, status=0/SUCCESS)
      Memory: 14.4M
      CGroup: /system.slice/frr.service
            ├─4685 /usr/lib/frr/watchfrr -d zebra bgpd staticd
            ├─4701 /usr/lib/frr/zebra -d -M snmp -A 127.0.0.1 --v6-rr-semantics -s 90000000
            ├─4705 /usr/lib/frr/bgpd -d -M snmp -A 127.0.0.1
            └─4711 /usr/lib/frr/staticd -d -A 127.0.0.1

Adaptive Routing

Adaptive routing is a load balancing feature that improves network utilization for eligible IP packets by selecting forwarding paths dynamically based on the state of the switch, such as queue occupancy and port utilization.

The benefits of using adaptive routing include:

With adaptive routing, the switch forwards packets to the less loaded path on a per packet basis to best utilize the fabric resources and avoid congestion. The change decision for port selection is set to one microsecond; you cannot change it.

Cumulus Linux supports ECMP resource optimization for adaptive routing, which addresses the requirement of large numbers of ECMP groups during routing protocol convergence in transient scenarios.

Cumulus Linux supports adaptive routing with:

Cumulus Linux also supports BGP W-ECMP with adaptive routing; see BGP Weighted Equal Cost Multipath.

Enable Adaptive Routing

To enable adaptive routing:

Run the nv set interface <interface> router adaptive-routing enable on command on all the ports that are part of the same ECMP route. NVUE sets adaptive routing on the ports and enables the adaptive routing feature.

cumulus@switch:~$ nv set interface swp51 router adaptive-routing enable on
cumulus@switch:~$ nv set interface swp52 router adaptive-routing enable on
cumulus@switch:~$ nv config apply

To disable adaptive routing, run the nv set router adaptive-routing enable off command. To disable adaptive routing on a specific port, run the nv set interface <interface> router adaptive-routing enable off command.

Enabling or disabling adaptive routing restarts the switchd service, which causes all network ports to reset, interrupts network services, and resets the switch hardware configuration.

Edit the /etc/cumulus/switchd.d/adaptive_routing.conf file:

  • Set the adaptive_routing.enable parameter to TRUE to enable the adaptive routing feature.
  • Set the interface.<port>.adaptive_routing.enable parameter to TRUE in the Per-port configuration section to enable adaptive routing on all the ports that are part of the same ECMP route.
cumulus@switch:~$ sudo nano /etc/cumulus/switchd.d/adaptive_routing.conf
## Global adaptive-routing enable/disable setting
adaptive_routing.enable = TRUE
...
## Per-port configuration
interface.swp51.adaptive_routing.enable = TRUE
interface.swp51.adaptive_routing.link_util_thresh = 70
interface.swp52.adaptive_routing.enable = TRUE
interface.swp52.adaptive_routing.link_util_thresh = 70
...

Restart switchd with the sudo systemctl restart switchd.service command.

  • To disable adaptive routing, set the adaptive_routing.enable parameter to FALSE in the /etc/cumulus/switchd.d/adaptive_routing.conf file.
  • To disable adaptive routing on a specific port, set the interface.<port>.adaptive_routing.enable parameter to FALSE in the /etc/cumulus/switchd.d/adaptive_routing.conf file.

When you enable adaptive routing, Cumulus Linux uses the default profile settings for your switch ASIC type. You cannot change the default profile settings. If you need to make adjustments to the settings, contact NVIDIA Customer Support.

Link utilization, when crossing a threshold, is one of the parameters in the adaptive routing decision. The default link utilization threshold percentage on an interface is 70. You can change the percentage to a value between 1 and 100.

Link utilization is off by default; you must enable the global link utilization setting to use the link utilization thresholds set on adaptive routing interfaces. You cannot enable or disable link utilization per interface.

In Cumulus Linux 5.5 and earlier, link utilization is on by default. If you configured link utilization in a previous release, be sure to enable link utilization after you upgrade.

The following example enables link utilization and uses the default link utilization threshold percentage of 70:

cumulus@switch:~$ nv set router adaptive-routing link-utilization-threshold on
cumulus@switch:~$ nv config apply

The following example changes the link utilization threshold percentage to 100 on swp51 and enables link utilization:

cumulus@switch:~$ nv set interface swp51 router adaptive-routing link-utilization-threshold 100
cumulus@switch:~$ nv set router adaptive-routing link-utilization-threshold on
cumulus@switch:~$ nv config apply

Enabling or disabling link utilization reloads the switchd service.

Edit the /etc/cumulus/switchd.d/adaptive_routing.conf file to set:

  • interface.<interface>.adaptive_routing.link_util_thresh to a value between 1 and 100.
  • adaptive_routing.link_utilization_threshold_disabled to FALSE.
cumulus@switch:~$ sudo nano /etc/cumulus/switchd.d/adaptive_routing.conf
## Global adaptive-routing enable/disable setting
adaptive_routing.enable = TRUE

## Global Link-utilization-threshold on/off
adaptive_routing.link_utilization_threshold_disabled = FALSE

## Per-port configuration
interface.swp51.adaptive_routing.enable = TRUE
interface.swp51.adaptive_routing.link_util_thresh = 100

Reload switchd with the sudo systemctl reload switchd.service command.

Example Configuration

The following example enables adaptive routing on swp51 and swp52. Global link utilization is off (the default setting).

cumulus@switch:~$ nv set interface swp51 router adaptive-routing enable on
cumulus@switch:~$ nv set interface swp52 router adaptive-routing enable on
cumulus@switch:~$ nv config apply

The following example enables adaptive routing on swp51 and swp52, sets the link utilization threshold percentage to 100 on both swp51 and swp52, and enables global link utilization:

cumulus@switch:~$ nv set interface swp51 router adaptive-routing enable on
cumulus@switch:~$ nv set interface swp52 router adaptive-routing enable on
cumulus@switch:~$ nv set interface swp51 router adaptive-routing link-utilization-threshold 100
cumulus@switch:~$ nv set interface swp52 router adaptive-routing link-utilization-threshold 100
cumulus@switch:~$ nv set router adaptive-routing link-utilization-threshold on
cumulus@switch:~$ nv config apply 

The following example enables adaptive routing on swp51 and swp52. Global link utilization is off (the default setting).

cumulus@switch:~$ sudo nano /etc/cumulus/switchd.d/adaptive_routing.conf
## Global adaptive-routing enable/disable setting
adaptive_routing.enable = TRUE

## Global Link-utilization-threshold on/off
adaptive_routing.link_utilization_threshold_disabled = TRUE

## Per-port configuration
interface.swp51.adaptive_routing.enable = TRUE
interface.swp51.adaptive_routing.link_util_thresh = 0
interface.swp52.adaptive_routing.enable = TRUE
interface.swp52.adaptive_routing.link_util_thresh = 0
...

Reload switchd with the sudo systemctl reload switchd.service command.

The following example enables adaptive routing on swp51 and swp52, sets the link utilization threshold percentage to 100 on both swp51 and swp52, and enables global link utilization.

cumulus@switch:~$ sudo nano /etc/cumulus/switchd.d/adaptive_routing.conf
## Global adaptive-routing enable/disable setting
adaptive_routing.enable = TRUE

## Global Link-utilization-threshold on/off
adaptive_routing.link_utilization_threshold_disabled = FALSE

## Per-port configuration
interface.swp51.adaptive_routing.enable = TRUE
interface.swp51.adaptive_routing.link_util_thresh = 100
interface.swp52.adaptive_routing.enable = TRUE
interface.swp52.adaptive_routing.link_util_thresh = 100

Reload switchd with the sudo systemctl reload switchd.service command.
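To review an adaptive_routing.conf file before you reload switchd, you can summarize its settings with a short script. The following is a minimal sketch that assumes the key = value layout shown in the examples above; the function names are hypothetical, and the script only reads text, it does not talk to switchd. It also assumes that a missing link_utilization_threshold_disabled key means link utilization is off (the default).

```python
def parse_conf(text: str) -> dict:
    """Parse 'key = value' lines, skipping comments, blanks, and '...' lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        settings[key] = value
    return settings


def summarize(settings: dict) -> dict:
    """Report the global state and the effective threshold for each enabled port."""
    enabled = settings.get("adaptive_routing.enable", "FALSE").upper() == "TRUE"
    # Assumption: a missing key means link utilization is off (the default).
    disabled = settings.get(
        "adaptive_routing.link_utilization_threshold_disabled", "TRUE"
    ).upper()
    link_utilization = disabled == "FALSE"

    ports = {}
    for key, value in settings.items():
        parts = key.split(".")
        if len(parts) == 4 and parts[0] == "interface" and parts[2] == "adaptive_routing":
            ports.setdefault(parts[1], {})[parts[3]] = value

    effective = {}
    for port, values in sorted(ports.items()):
        if values.get("enable", "FALSE").upper() != "TRUE":
            continue
        # Per-port thresholds only matter when the global setting is on.
        threshold = values.get("link_util_thresh")
        effective[port] = int(threshold) if link_utilization and threshold else None
    return {"enabled": enabled, "link_utilization": link_utilization, "ports": effective}


SAMPLE = """
## Global adaptive-routing enable/disable setting
adaptive_routing.enable = TRUE

## Global Link-utilization-threshold on/off
adaptive_routing.link_utilization_threshold_disabled = FALSE

## Per-port configuration
interface.swp51.adaptive_routing.enable = TRUE
interface.swp51.adaptive_routing.link_util_thresh = 100
interface.swp52.adaptive_routing.enable = TRUE
interface.swp52.adaptive_routing.link_util_thresh = 100
"""

summary = summarize(parse_conf(SAMPLE))
print(summary)
```

A port that reports None for its threshold has adaptive routing enabled, but its link utilization threshold has no effect because the global setting is off.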

Show Adaptive Routing Settings

To show adaptive routing settings, run the nv show router adaptive-routing command:

cumulus@leaf01:mgmt:~$ nv show router adaptive-routing
                            operational   applied
--------------------------  ------------  -------
enable                      on            off

To show adaptive routing configuration for an interface, run the nv show interface <interface> router adaptive-routing command.

Considerations

IPv6 Next Hop Preference

Cumulus Linux uses IPv6 link-local addresses as BGP next hops when receiving a route with both link-local and global next hops. To configure a BGP peering to prefer global next hop addresses, configure the ipv6-nexthop-prefer-global option in an inbound route map applied to the peer. Use this configuration when there are multiple BGP peerings to the same router with adaptive routing enabled, or multiple peerings to the same router on interfaces that share the same MAC address or physical interface. Refer to Set IPv6 Prefer Global.

BGP Weighted Equal Cost Multipath

You use W-ECMP in data center networks that rely on anycast routing to provide network-based load balancing. Cumulus Linux supports BGP W-ECMP by using the BGP link bandwidth extended community to load balance traffic towards anycast services for IPv4 and IPv6 routes in a layer 3 deployment and for prefix (type-5) routes in an EVPN deployment.

W-ECMP Routing

In ECMP, the route to a destination has multiple next hops and traffic distributes across them equally. Flow-based hashing ensures that all traffic associated with a particular flow uses the same next hop and the same path across the network.

In W-ECMP, along with the ECMP flow-based hash, Cumulus Linux associates a weight with each next hop and distributes traffic across the next hops in proportion to their weight. The BGP link bandwidth extended community carries information about the anycast server distribution through the network, which maps to the weight of the corresponding next hop. The mapping factors the bandwidth value of a particular path against the total bandwidth values of all possible paths, mapped to the range 1 to 100. The BGP best path selection algorithm and the multipath computation algorithm that determines which paths you can use for load balancing do not change.
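The mapping from link bandwidth to next hop weight can be sketched as follows. This is an illustration, not the FRR implementation: the integer truncation and the minimum weight of 1 are assumptions, chosen because they reproduce the weights 66 and 33 in the sample show ip route output in the Troubleshooting section. The function name is hypothetical.

```python
from typing import Dict


def wecmp_weights(bandwidths: Dict[str, int]) -> Dict[str, int]:
    """Map each path's link bandwidth to a weight in the range 1 to 100."""
    total = sum(bandwidths.values())
    if total <= 0:
        raise ValueError("total bandwidth must be positive")
    # Each weight is the path's share of the total, as a truncated percentage.
    return {hop: max(1, bw * 100 // total) for hop, bw in bandwidths.items()}


# Link bandwidth values (bytes per second) from the Troubleshooting example
weights = wecmp_weights({"swp2": 131072000, "swp1": 65536000})
print(weights)
```

With these inputs, swp2 carries roughly twice the share of traffic that swp1 carries.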

W-ECMP Example

The above example shows how traffic towards 192.168.10.1/32 is load balanced when you use W-ECMP routing:

Now, each spine has four W-ECMP routes:

The border leafs also have four W-ECMP routes:

The border leafs balance traffic equally; all weights are equal to the spines. Only the spines have unequal load sharing based on the weight values.

Configure W-ECMP

Set the BGP link bandwidth extended community in a route map against all prefixes, a specific prefix, or a set of prefixes using the match clause of the route map. Apply the route map on the first device to receive the prefix, against the BGP neighbor that generated this prefix.

The BGP link bandwidth extended community uses bytes per second. To convert the number of ECMP paths to a bandwidth value, Cumulus Linux uses a reference bandwidth of 1024 Kbps per path. For example, if there are four ECMP paths to an anycast IP, the encoded bandwidth in the extended community is 512,000 bytes per second. The actual value is not important, as long as all routers originating the link bandwidth convert the number of ECMP paths in the same way.
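The conversion from the number of ECMP paths to the encoded value can be worked through as follows. This sketch assumes that 1 Kbps equals 1000 bits per second, which reproduces the 512,000 figure above; the function name is hypothetical.

```python
REFERENCE_BANDWIDTH_KBPS = 1024  # reference bandwidth per ECMP path


def encoded_link_bandwidth(num_paths: int) -> int:
    """Return the link bandwidth in bytes per second for a number of ECMP paths."""
    if num_paths < 1:
        raise ValueError("at least one path is required")
    bits_per_second = num_paths * REFERENCE_BANDWIDTH_KBPS * 1000
    return bits_per_second // 8  # the extended community carries bytes per second


print(encoded_link_bandwidth(4))
```

One path encodes as 128,000 bytes per second, so a router with four paths advertises four times the bandwidth of a router with one path.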

Cumulus Linux accepts the bandwidth extended community by default. You do not need to configure transit devices where W-ECMP routes are not originated.

The following example sets the BGP link bandwidth extended community against all prefixes.

cumulus@switch:~$ nv set router policy route-map ucmp-route-map rule 10 action permit 
cumulus@switch:~$ nv set router policy route-map ucmp-route-map rule 10 set ext-community-bw multipaths
cumulus@switch:~$ nv set vrf default router bgp neighbor swp51 address-family ipv4-unicast policy outbound route-map ucmp-route-map 
cumulus@switch:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# route-map ucmp-route-map permit 10
leaf01(config-route-map)# set extcommunity bandwidth num-multipaths
leaf01(config-route-map)# exit
leaf01(config)# router bgp 65011
leaf01(config-router)# address-family ipv4 unicast
leaf01(config-router)# neighbor 10.1.1.1 route-map ucmp-route-map out
leaf01(config-router)# end
leaf01# write memory
leaf01# exit

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

...
address-family ipv4 unicast
 neighbor 10.1.1.1 route-map ucmp-route-map out
!
route-map ucmp-route-map permit 10
 set extcommunity bandwidth num-multipaths
...

The following example sets the BGP link bandwidth extended community for anycast servers in the 192.168.0.0/16 IP address range.

cumulus@switch:~$ nv set router policy prefix-list anycast_ip type ipv4
cumulus@switch:~$ nv set router policy prefix-list anycast_ip rule 1 match 192.168.0.0/16 max-prefix-len 30
cumulus@switch:~$ nv set router policy prefix-list anycast_ip rule 1 action permit
cumulus@switch:~$ nv set router policy route-map ucmp-route-map rule 1 action permit 
cumulus@switch:~$ nv set router policy route-map ucmp-route-map rule 1 match ip-prefix-list anycast_ip
cumulus@switch:~$ nv set router policy route-map ucmp-route-map rule 1 set ext-community-bw multipaths
cumulus@switch:~$ nv set vrf default router bgp neighbor swp51 address-family ipv4-unicast policy outbound prefix-list anycast_ip 
cumulus@switch:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# ip prefix-list anycast_ip seq 10 permit 192.168.0.0/16 le 32
leaf01(config)# route-map ucmp-route-map permit 10
leaf01(config-route-map)# match ip address prefix-list anycast_ip
leaf01(config-route-map)# set extcommunity bandwidth num-multipaths
leaf01(config-route-map)# router bgp 65011
leaf01(config-router)# address-family ipv4 unicast
leaf01(config-router-af)# neighbor swp51 prefix-list anycast_ip out
leaf01(config-router-af)# end
leaf01# write memory
leaf01# exit

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

...
address-family ipv4 unicast
 neighbor 10.1.1.1 route-map ucmp-route-map out
!
ip prefix-list anycast_ip permit 192.168.0.0/16 le 32
route-map ucmp-route-map permit 10
 match ip address prefix-list anycast_ip
 set extcommunity bandwidth num-multipaths
...

EVPN Configuration

For EVPN configuration, make sure that you activate the commands under the EVPN address family. The following shows an example EVPN configuration that sets the BGP link bandwidth extended community against all prefixes.

cumulus@switch:~$ nv set vrf turtle router bgp autonomous-system 65011
cumulus@switch:~$ nv set vrf turtle router bgp address-family ipv4-unicast route-export to-evpn route-map ucmp-route-map
cumulus@switch:~$ nv set router policy route-map ucmp-route-map rule 10 action permit
cumulus@switch:~$ nv set router policy route-map ucmp-route-map rule 10 set ext-community-bw cumulative
cumulus@switch:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# route-map ucmp-route-map permit 10
leaf01(config-route-map)# set extcommunity bandwidth num-multipaths
leaf01(config-route-map)# router bgp 65011 vrf turtle
leaf01(config-router)# address-family l2vpn evpn
leaf01(config-router-af)# advertise ipv4 unicast route-map ucmp-route-map
leaf01(config-router-af)# end
leaf01# write memory
leaf01# exit

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

...
router bgp 65011 vrf turtle
 !
 address-family ipv4 unicast
  maximum-paths 64
  maximum-paths ibgp 64
 exit-address-family
 !
 address-family l2vpn evpn
  advertise ipv4 unicast route-map ucmp-route-map
 exit-address-family

Control W-ECMP on the Receiving Switch

To control W-ECMP on the receiving switch, you can:

Set Default Values for W-ECMP Routes

By default, if some of the multipaths do not have link bandwidth, Cumulus Linux ignores the bestpath bandwidth value in any of the multipaths and performs ECMP. However, you can set one of the following options instead:

Change this setting per BGP instance for both IPv4 and IPv6 unicast routes in the BGP instance. For EVPN, set the options on the tenant VRF.

Run the NVUE nv set vrf <vrf> router bgp path-selection multipath bandwidth ignore, nv set vrf <vrf> router bgp path-selection multipath bandwidth skip-missing, or nv set vrf <vrf> router bgp path-selection multipath bandwidth default-weight-for-missing command.

The following example sets link bandwidth processing to skip paths without link bandwidth and perform W-ECMP among the other paths:

cumulus@switch:~$ nv set vrf default router bgp path-selection multipath bandwidth skip-missing
cumulus@switch:~$ nv config apply

Run the vtysh bgp bestpath bandwidth ignore, bgp bestpath bandwidth skip-missing, or bgp bestpath bandwidth default-weight-for-missing command.

The following example sets link bandwidth processing to skip paths without link bandwidth and perform W-ECMP among the other paths:

cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# router bgp 65011
switch(config-router)# bgp bestpath bandwidth skip-missing
switch(config-router)# end
switch# write memory
switch# exit

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

router bgp 65011
  bgp bestpath as-path multipath-relax
  neighbor LEAF peer-group
  neighbor LEAF remote-as external
  neighbor swp1 interface peer-group LEAF
  neighbor swp2 interface peer-group LEAF
  neighbor swp3 interface peer-group LEAF
  neighbor swp4 interface peer-group LEAF
  bgp bestpath bandwidth skip-missing
!
  address-family ipv4 unicast
    network 10.0.0.1/32
  exit-address-family
 ...

The BGP link bandwidth extended community passes on automatically with the prefix to eBGP peers. If you do not want to pass on the BGP link bandwidth extended community outside of a particular domain, you can disable the advertisement of all BGP extended communities on specific peerings.

You cannot disable just the BGP link bandwidth extended community from advertising to a neighbor; you either send all BGP extended communities, or none.

The following example disables all BGP extended communities on a peer:

cumulus@switch:~$ nv set vrf default router bgp neighbor swp51 address-family ipv4-unicast community-advertise extended off
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# router bgp 65011
switch(config-router)# no neighbor 10.10.0.2 send-community extended
switch(config-router)# end
switch# write memory
switch# exit

Weight Normalization

The NVIDIA Spectrum switch supports weight programming for ECMP by repeating each individual path, which consumes resources. To reduce hardware utilization of ECMP resources, you can enable weight normalization.

To enable weight normalization:

cumulus@leaf01:mgmt:~$ nv set system forwarding ecmp-weight-normalisation mode enabled
cumulus@leaf01:mgmt:~$ nv config apply

To disable weight normalization, run the nv set system forwarding ecmp-weight-normalisation mode disabled command.

You can also adjust the maximum number of hardware entries for weighted ECMP by running the nv set system forwarding ecmp-weight-normalisation max-hw-weight command. You can specify a value between 8 and 4096. The default value is 32.

cumulus@leaf01:mgmt:~$ nv set system forwarding ecmp-weight-normalisation max-hw-weight 100
cumulus@leaf01:mgmt:~$ nv config apply

Exercise caution when adjusting the maximum number of hardware entries. Configuring the setting too low consumes fewer resources but provides less weight granularity. Configuring the setting too high consumes more resources but provides more weight granularity.
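The trade-off between resources and granularity can be illustrated with a simple scaling model. This is only a sketch under stated assumptions: the documentation does not describe the normalization algorithm, so the proportional scaling below is hypothetical, as are the function names. The model assumes that the hardware entries consumed by a weighted ECMP group equal the sum of its weights, because each path repeats once per unit of weight.

```python
from typing import List


def normalize_weights(weights: List[int], max_hw_weight: int = 32) -> List[int]:
    """Scale weights so that the largest weight does not exceed max_hw_weight."""
    if not 8 <= max_hw_weight <= 4096:
        raise ValueError("max_hw_weight must be between 8 and 4096")
    if not weights or min(weights) < 1:
        raise ValueError("weights must be positive integers")
    largest = max(weights)
    if largest <= max_hw_weight:
        return list(weights)
    # Assumption: scale proportionally and keep every path at a weight of 1 or more.
    return [max(1, w * max_hw_weight // largest) for w in weights]


def hardware_entries(weights: List[int]) -> int:
    """Entries consumed when each path repeats once per unit of weight."""
    return sum(weights)


original = [66, 33]
for limit in (8, 32):
    scaled = normalize_weights(original, limit)
    print(limit, scaled, hardware_entries(scaled))
```

In this model, the weights 66 and 33 consume 99 entries without normalization, 48 entries with a limit of 32, and 12 entries with a limit of 8. The 2:1 ratio survives in this case; with less evenly divisible weights, a lower limit loses more precision.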

BGP W-ECMP with Adaptive Routing

Cumulus Linux supports BGP W-ECMP with adaptive routing for high-performance Ethernet topologies, where you use adaptive routing for optimal and efficient traffic distribution. You do not need to perform any additional configuration other than the configuration specified above.

Troubleshooting

To show the extended community in a received or local route, run the vtysh show bgp command.

The following example shows that the switch receives an IPv4 unicast route with the BGP link bandwidth attribute from two peers. The link bandwidth extended community is in bytes per second and shows in megabits per second: Extended Community: LB:65002:131072000 (1000.000 Mbps) and Extended Community: LB:65001:65536000 (500.000 Mbps).

cumulus@switch:~$ sudo vtysh
...
switch# show ip bgp ipv4 unicast 192.168.10.1/32
BGP routing table entry for 192.168.10.1/32
Paths: (2 available, best #2, table default)
  Advertised to non peer-group peers:
  l1(swp1) l2(swp2) l3(swp3) l4(swp4)
  65002
    fe80::202:ff:fe00:1b from l2(swp2) (10.0.0.2)
    (fe80::202:ff:fe00:1b) (used)
      Origin IGP, metric 0, valid, external, multipath, bestpath-from-AS 65002
      Extended Community: LB:65002:131072000 (1000.000 Mbps)
      Last update: Thu Feb 20 18:34:16 2020

  65001
    fe80::202:ff:fe00:15 from l1(swp1) (110.0.0.1)
    (fe80::202:ff:fe00:15) (used)
      Origin IGP, metric 0, valid, external, multipath, bestpath-from-AS 65001, best (Older Path)
      Extended Community: LB:65001:65536000 (500.000 Mbps)
      Last update: Thu Feb 20 18:22:34 2020

The bandwidth value used by W-ECMP is only to determine the percentage of load to a given next hop and has no impact on actual link or flow bandwidth.

To show EVPN type-5 routes, run the vtysh show bgp l2vpn evpn route type prefix command.

The bandwidth shows both in bytes per second (an unsigned 32-bit value) and in Gbps, Mbps, or Kbps. For example:

cumulus@switch:~$ sudo vtysh
...
switch# show bgp l2vpn evpn route type prefix
BGP table version is 1, local router ID is 10.0.0.11
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
...
*> [5]:[0]:[32]:[192.168.10.1]
            10.0.0.5                           0 65100 65050 65200 i
            RT:65050:104001 LB:65050:134217728 (1.000 Gbps) ET:8 Rmac:36:4f:15:ea:81:90

To see weights associated with next hops for a route with multiple paths, run the vtysh show ip route command. For example:

cumulus@switch:~$ sudo vtysh
...
switch# show ip route 192.168.10.1/32
Routing entry for 192.168.10.1/32
  Known via "bgp", distance 20, metric 0, best
  Last update 00:00:32 ago
  * fe80::202:ff:fe00:1b, via swp2, weight 66
  * fe80::202:ff:fe00:15, via swp1, weight 33

Considerations

W-ECMP with BGP link bandwidth is only available for BGP-learned routes.

ECMP Resource Sharing During Next Hop Group Updates

During network events such as reboots and link flaps, and in other transient scenarios, next hop group churn might create a higher number of ECMP containers. Also, when FRR allocates a single next hop group per source, more ECMP hardware resources are required.

To configure the switch to share ECMP resources during next hop group updates with weight changes, create the /etc/cumulus/switchd.d/switchd_misc.conf file and add nhg_update_ecmp_sharing_enable = TRUE:

cumulus@leaf01:mgmt:~$ sudo nano /etc/cumulus/switchd.d/switchd_misc.conf
nhg_update_ecmp_sharing_enable = TRUE 

To disable ECMP resource sharing during next hop group updates with weight changes, set the nhg_update_ecmp_sharing_enable option to FALSE.

NVUE does not provide commands to enable or disable ECMP resource sharing during next hop group updates with weight changes.

IETF draft - BGP Link Bandwidth Extended Community

Redistribute Neighbor

Redistribute neighbor provides a way for IP subnets to span racks without forcing the end hosts to run a routing protocol by announcing individual host /32 routes in the routed fabric. Other hosts on the fabric can use this new path to access the hosts in the fabric. If ECMP is available, traffic can load balance across the available paths natively.

Hosts use ARP to resolve MAC addresses when sending to an IPv4 address. A host then builds an ARP cache table of known MAC address to IPv4 address tuples as it receives or responds to ARP requests.

For a leaf switch, where hosts within the rack use the default gateway, the ARP cache table contains a list of all hosts that ARP for their default gateway. In most cases, this table contains all the layer 3 information necessary. Redistribute neighbor formats and synchronizes this table into the routing protocol.

The current implementation of redistribute neighbor:

Target Use Cases and Best Practices

You use redistribute neighbor in these configurations:

Follow these guidelines:

How Does Redistribute Neighbor Work?

Redistribute neighbor works as follows:

  1. The leaf or ToR switch learns about connected hosts when the host sends an ARP request or ARP reply.
  2. Each leaf adds an entry for the host to its kernel neighbor table.
  3. The redistribute neighbor daemon (rdnbrd) monitors the kernel neighbor table and creates a /32 route for each neighbor entry. This /32 route is in kernel table 10.
  4. FRR imports routes from kernel table 10.
  5. A route map controls which routes to import from table 10.
  6. FRR imports these routes as table routes.
  7. You configure BGP or OSPF to redistribute the table 10 routes.
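The steps above can be modeled in a few lines. This is a conceptual sketch, not the rdnbrd or FRR implementation: the Neighbor class, the function names, and the host addresses are hypothetical. It shows how neighbor entries become /32 routes (step 3) and how a route map that matches on interfaces limits what FRR imports (steps 4 through 6).

```python
import ipaddress
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Neighbor:
    """One kernel neighbor table entry learned from ARP."""
    ip: str
    mac: str
    interface: str


def build_table_10(neighbors: Iterable[Neighbor]) -> List[Tuple[str, str]]:
    """Create a /32 host route for each neighbor entry (the rdnbrd step)."""
    return [
        (str(ipaddress.ip_network(f"{n.ip}/32")), n.interface) for n in neighbors
    ]


def import_routes(
    table: List[Tuple[str, str]], allowed_interfaces: Set[str]
) -> List[Tuple[str, str]]:
    """Keep only routes whose interface the route map permits (the FRR step)."""
    return [route for route in table if route[1] in allowed_interfaces]


# Hypothetical hosts; swp3 is not host-facing in this example
neighbors = [
    Neighbor("10.1.3.101", "00:03:00:11:11:01", "swp1"),
    Neighbor("10.1.3.102", "00:03:00:22:22:01", "swp2"),
    Neighbor("10.1.3.200", "00:03:00:33:33:01", "swp3"),
]

table_10 = build_table_10(neighbors)
imported = import_routes(table_10, {"swp1", "swp2"})
print(imported)
```

Only the routes that pass the interface filter become table routes that BGP or OSPF can redistribute.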

Example Configuration

The following example configuration uses the following topology.

Configure the Leafs

Cumulus Linux does not provide NVUE commands for redistribute neighbor configuration.
  1. Edit the /etc/network/interfaces file to configure the same IP address with a /32 prefix on both interfaces that face the host. In this example, swp1 and swp2 face server01 and server02:

    cumulus@leaf01:~$ sudo nano /etc/network/interfaces
    
    auto lo
    iface lo inet loopback
        address 10.0.0.1/32
    
    auto swp1
    iface swp1
        address 10.0.0.1/32
    
    auto swp2
    iface swp2
        address 10.0.0.1/32
    ...
    
  2. Enable the daemon to start at boot up, then start the daemon:

    cumulus@leaf01:~$ sudo systemctl enable rdnbrd.service
    cumulus@leaf01:~$ sudo systemctl restart rdnbrd.service
    
  3. Configure routing:

    1. Add the table as routes into the local routing table:

      cumulus@leaf01:~$ sudo vtysh
      
      leaf01# configure terminal
      leaf01(config)# ip import-table 10
      
    2. Define a route map that matches on the host-facing interface:

      leaf01(config)# route-map REDIST_NEIGHBOR permit 10
      leaf01(config-route-map)# match interface swp1
      leaf01(config-route-map)# route-map REDIST_NEIGHBOR permit 20
      leaf01(config-route-map)# match interface swp2
      
    3. Apply the route map to the routes imported from table 10:

      leaf01(config)# ip import-table 10 route-map REDIST_NEIGHBOR
      

      To set the administrative distance to use for the routes, add the distance option before the route map name:

      leaf01(config)# ip import-table 10 distance 20 route-map REDIST_NEIGHBOR
      
    4. Redistribute the imported table routes into the appropriate routing protocol.

      BGP:

      leaf01(config)# router bgp 65001
      leaf01(config-router)# address-family ipv4 unicast
      leaf01(config-router-af)# redistribute table 10
      leaf01(config-router-af)# exit
      leaf01(config-router)# exit
      leaf01(config)# exit
      leaf01# write memory
      leaf01# exit
      cumulus@leaf01:~$
      

      OSPF:

      leaf01(config)# router ospf
      leaf01(config-router)# redistribute table 10
      leaf01(config-router)# exit
      leaf01(config)# exit
      leaf01# write memory
      leaf01# exit
      cumulus@leaf01:~$
      

The commands save the configuration in the /etc/frr/frr.conf file.

frr defaults datacenter
ip import-table 10 route-map REDIST_NEIGHBOR
username cumulus nopassword
!
service integrated-vtysh-config
!
log syslog informational
!
router bgp 65001
 !
 address-family ipv4 unicast
  redistribute table 10
 exit-address-family
!
route-map REDIST_NEIGHBOR permit 10
 match interface swp1
!
route-map REDIST_NEIGHBOR permit 20
 match interface swp2
!
router ospf
 redistribute table 10
!
line vty
!

Configure the Hosts

This document describes dual-connected Linux hosts with static IP addresses.

Configure a host with the same /32 IP address on its loopback and uplinks so that both leafs advertise the same /32 regardless of the interface. Cumulus Linux relies on ECMP to load balance across the interfaces southbound, and an equal cost static route (see the configuration below) to load balance northbound.

The loopback hosts the primary service IP address to which you can bind services.

Configure the loopback and physical interfaces. In the example topology above:

Install ifplugd

Install and use ifplugd, which modifies the behavior of the Linux routing table when an interface undergoes a link transition (carrier up/down). By default, the Linux kernel keeps routes up even when the physical interface is unavailable (NO-CARRIER).

After you install ifplugd, edit /etc/default/ifplugd as follows, where eth1 and eth2 are the interface names that your host uses to connect to the leafs.

user@server01:$ sudo nano /etc/default/ifplugd
INTERFACES="eth1 eth2"
HOTPLUG_INTERFACES=""
ARGS="-q -f -u10 -d10 -w -I"
SUSPEND_ACTION="stop"

For complete instructions to install ifplugd on Ubuntu, follow this guide.

Troubleshooting

Check if rdnbrd is Running

rdnbrd is the redistribute neighbor daemon. To check if the daemon is running, run the systemctl status rdnbrd.service command:

cumulus@leaf01:~$ systemctl status rdnbrd.service
* rdnbrd.service - Cumulus Linux Redistribute Neighbor Service
 Loaded: loaded (/lib/systemd/system/rdnbrd.service; enabled)
 Active: active (running) since Wed 2016-05-04 18:29:03 UTC; 1h 13min ago
 Main PID: 1501 (python)
 CGroup: /system.slice/rdnbrd.service
 `-1501 /usr/bin/python /usr/sbin/rdnbrd -d

Change rdnbrd Configuration

To change the default configuration of rdnbrd, edit the /etc/rdnbrd.conf file, then run systemctl restart rdnbrd.service:

cumulus@leaf01:~$ sudo nano /etc/rdnbrd.conf
# syslog logging level CRITICAL, ERROR, WARNING, INFO, or DEBUG
loglevel = INFO

# TX an ARP request to known hosts every keepalive seconds
keepalive = 1

# If a host does not send an ARP reply for holdtime consider the host down
holdtime = 3

# Install /32 routes for each host into this table
route_table = 10

# Uncomment to enable ARP debugs on specific interfaces.
# Note that ARP debugs can be very chatty.
# debug_arp = swp1 swp2 swp3 br1
# If we already know the MAC for a host, unicast the ARP request. This is
# unusual for ARP (why ARP if you know the destination MAC) but we will be
# using ARP as a keepalive mechanism and do not want to broadcast so many ARPs
# if we do not have to. If a host cannot handle a unicasted ARP request, set
# unicast_arp_requests to False.
#
# Unicasting ARP requests is common practice (in some scenarios) for other
# networking operating systems so it is unlikely that you will need to set
# this to False.
unicast_arp_requests = True
cumulus@leaf01:~$ sudo systemctl restart rdnbrd.service

Set the Routing Table ID

The Linux kernel supports multiple routing tables with IDs from 0 through 255. However, the kernel reserves tables 0, 253, 254, and 255, and uses table 1 first, so rdnbrd only accepts table IDs between 2 and 252. Cumulus Linux uses table ID 10 by default, but you can set the ID to any value in that range. You can see all the tables specified here:

cumulus@leaf01:~$ cat /etc/iproute2/rt_tables
#
# reserved values
#
255 local
254 main
253 default
0 unspec
#
# local
#
#1  inr.ruhep

For more information, refer to Linux route tables or read the Ubuntu man pages for ip route.
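The table ID constraint described in this section can be expressed as a small check. The following is a minimal sketch, not part of rdnbrd or Cumulus Linux; the function name is_valid_rdnbrd_table and the constants are hypothetical, and the ranges come from the text above.

```python
# Hypothetical helper; not part of rdnbrd or Cumulus Linux.
# Table IDs reserved by the kernel (see /etc/iproute2/rt_tables).
RESERVED_TABLES = {0, 253, 254, 255}  # unspec, default, main, local
KERNEL_FIRST_TABLE = 1                # the text notes the kernel uses table 1 first


def is_valid_rdnbrd_table(table_id):
    """Return True if table_id falls in the 2-252 range that rdnbrd accepts."""
    return (
        0 <= table_id <= 255
        and table_id not in RESERVED_TABLES
        and table_id != KERNEL_FIRST_TABLE
    )


if __name__ == "__main__":
    for candidate in (1, 10, 252, 253):
        print(candidate, is_valid_rdnbrd_table(candidate))
```

A check like this could run before you write a new route_table value to /etc/rdnbrd.conf.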

Check /32 Redistribute Neighbor Advertised Routes

For BGP, run the vtysh show ip bgp neighbor <interface> advertised-routes command. For example:

cumulus@leaf01:~$ sudo vtysh -c 'show ip bgp neighbor swp51 advertised-routes'
BGP table version is 5, local router ID is 10.0.0.11
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
              i internal, r RIB-failure, S Stale, R Removed
Origin codes: i - IGP, e - EGP, ? - incomplete

    Network         Next Hop            Metric LocPrf Weight Path
*> 10.0.0.11/32     0.0.0.0                  0         32768 i
*> 10.0.0.12/32     ::                                     0 65020 65012 i
*> 10.0.0.21/32     ::                                     0 65020 i
*> 10.0.0.22/32     ::                                     0 65020 i

Total number of prefixes 4

Verify the Kernel Routing Table

Use the following workflow to verify that the kernel routing table populates correctly and that routes import and advertise correctly:

  1. Verify that ARP neighbor entries populate into kernel routing table 10.

    cumulus@leaf01:~$ ip route show table 10
    10.0.1.101 dev swp1 scope link
    

    If these routes do not generate, verify that the rdnbrd daemon is running and check that the /etc/rdnbrd.conf file includes the correct table number.

  2. Verify that routes import into FRR from the kernel routing table 10.

    cumulus@leaf01:~$ sudo vtysh
    leaf01# show ip route table
    Codes: K - kernel route, C - connected, S - static, R - RIP,
            O - OSPF, I - IS-IS, B - BGP, A - Babel, T - Table,
            > - selected route, * - FIB route
    T[10]>* 10.0.1.101/32 [19/0] is directly connected, swp1, 01:25:29
    

    Both the > and * must be present so that table 10 routes install as preferred into the routing table. If the routes do not install, verify the imported distance of the locally imported kernel routes with the ip import-table 10 distance X command (where X is not less than the administrative distance of the routing protocol). If the distance is too low, routes learned from the protocol can overwrite the locally imported routes. Also, verify that the routes are in the kernel routing table.

  3. Confirm that routes are in the BGP/OSPF database and that they advertise.

    leaf01# show ip bgp
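
Step 2 of the workflow above requires both the > and * flags on each table 10 route. The following minimal sketch checks a line of show ip route table output for those flags. The function name parse_route_line and the regular expression are illustrative assumptions based only on the sample output shown in this section, not an FRR API.

```python
import re

# Matches the leading code column of a route line, for example "T[10]>*" or "B>*",
# followed by an IPv4 or IPv6 prefix. Based only on the sample output in this guide.
ROUTE_LINE = re.compile(
    r"^(?P<code>[A-Z])(?:\[(?P<table>\d+)\])?(?P<flags>[>*]*)\s+"
    r"(?P<prefix>[0-9a-fA-F:.]+/\d+)"
)


def parse_route_line(line):
    """Return route details for a route line, or None for legend and other lines."""
    match = ROUTE_LINE.match(line.strip())
    if match is None:
        return None
    table = match.group("table")
    flags = match.group("flags")
    return {
        "code": match.group("code"),
        "table": int(table) if table is not None else None,
        "prefix": match.group("prefix"),
        "selected": ">" in flags,  # ">" marks the selected route
        "fib": "*" in flags,       # "*" marks the FIB route
    }


if __name__ == "__main__":
    sample = "T[10]>* 10.0.1.101/32 [19/0] is directly connected, swp1, 01:25:29"
    print(parse_route_line(sample))
```

A route that parses with selected or fib set to False is a candidate for the distance check described in step 2.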
    

Considerations

Route Scale

Redistribute neighbor adds each ARP entry as a /32 host route into the routing table of all switches within a summarization domain. Make sure the number of hosts plus fabric routes is under the allocated hardware LPM table size of the switch according to the forwarding resource profile in use.
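
The sizing rule above is simple arithmetic. The following minimal sketch illustrates it; the function name fits_lpm_budget and the numbers in the usage example are made up for illustration. Take the real LPM table size from the forwarding resource profile in use on your switch.

```python
def fits_lpm_budget(host_routes, fabric_routes, lpm_table_size):
    """Return True if host routes plus fabric routes stay under the LPM table size."""
    return host_routes + fabric_routes < lpm_table_size


if __name__ == "__main__":
    # Hypothetical values, not taken from any switch profile.
    print(fits_lpm_budget(host_routes=12000, fabric_routes=500, lpm_table_size=16384))
```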

Uneven Traffic Distribution

On most older distributions, Linux uses only the source layer 3 address to load balance.

Silent Hosts Never Receive Traffic

Sometimes, freshly provisioned hosts that have yet to send traffic do not ARP for their default gateways. The post-up arping command in the /etc/network/interfaces file on the host takes care of this. If the host does not ARP, then rdnbrd on the leaf does not learn about the host.

FRRouting

Cumulus Linux uses FRR to provide the routing protocols for dynamic routing and supports the following routing protocols:

Architecture

The FRR suite consists of various protocol-specific daemons and a protocol-independent daemon called zebra. Each protocol-specific daemon is responsible for running the relevant protocol and building the routing table based on the information exchanged.

It is not uncommon to have more than one protocol daemon running at the same time. For example, at the edge of an enterprise, protocols internal to an enterprise such as OSPF run alongside the protocols that connect an enterprise to the rest of the world such as BGP.

zebra is the daemon that resolves the routes provided by multiple protocols (including the static routes you specify) and programs these routes in the Linux kernel using netlink. The FRRouting documentation defines zebra as the IP routing manager for FRR that provides kernel routing table updates, interface lookups, and redistribution of routes between different routing protocols.

Configure FRR

The information in this section does not apply if you use NVUE to configure your switch. NVUE manages FRR daemons and configuration automatically. These instructions are only applicable if you manage FRR directly with Linux flat file configurations.

If you do not configure your system using NVUE, FRR does not start by default in Cumulus Linux. Before you run FRR, make sure you have enabled the relevant daemons that you intend to use (bgpd, ospfd, ospf6d, pimd, or pbrd) in the /etc/frr/daemons file.

NVIDIA has not tested RIP, RIPv6, IS-IS, or Babel.

Cumulus Linux enables the zebra daemon by default. You can enable the other daemons according to how you plan to route your network.

Before you start FRR, edit the /etc/frr/daemons file to enable each daemon you want to use. For example, to enable BGP, set bgpd to yes:

...
bgpd=yes
ospfd=no
ospf6d=no
ripd=no
ripngd=no
isisd=no
fabricd=no
pimd=no
ldpd=no
nhrpd=no
eigrpd=no
babeld=no
sharpd=no
pbrd=no
vrrpd=no
...
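
To confirm which daemons are set to yes before you start FRR, you can scan the file contents. The following minimal sketch assumes the name=yes|no layout shown above. The function name enabled_daemons is hypothetical, and the rule that daemon names end in "d" is an assumption used here to skip other settings in the file.

```python
def enabled_daemons(daemons_text):
    """Return the daemon names that are set to yes in /etc/frr/daemons content."""
    enabled = []
    for raw_line in daemons_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        name, value = (part.strip() for part in line.split("=", 1))
        # Assumption: daemon entries end in "d" (bgpd, ospfd, pimd, and so on).
        if name.endswith("d") and value.lower() == "yes":
            enabled.append(name)
    return enabled


if __name__ == "__main__":
    sample = "bgpd=yes\nospfd=no\nospf6d=no\npimd=no\npbrd=no\n"
    print(enabled_daemons(sample))
```

On a switch, you could pass the result of reading /etc/frr/daemons to this function instead of the sample string.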

Enable and Start FRR

The information in this section does not apply if you use NVUE to configure your switch. NVUE manages FRR daemons and configuration automatically. These instructions are only applicable if you manage FRR directly with Linux flat file configurations.

After you enable the appropriate daemons, enable and start the FRR service:

cumulus@switch:~$ sudo systemctl enable frr.service
cumulus@switch:~$ sudo systemctl start frr.service

Restore the Default Configuration

The information in this section does not apply if you use NVUE to configure your switch. NVUE manages FRR daemons and configuration automatically. These instructions are only applicable if you manage FRR directly with Linux flat file configurations.

If you need to restore the FRR configuration to the default running configuration, delete the frr.conf file and restart the frr service.

Back up frr.conf (or any configuration files you want to remove) before proceeding.

  1. Confirm that the service integrated-vtysh-config option is enabled.

  2. Remove /etc/frr/frr.conf:

    cumulus@switch:~$ sudo rm /etc/frr/frr.conf
    
  3. Restart FRR with this command:

    cumulus@switch:~$ sudo systemctl restart frr.service
    

    Restarting FRR restarts all the routing protocol daemons that are enabled and running. NVIDIA recommends that you reboot the switch instead of restarting the FRR service to minimize traffic impact when redundant switches are present with MLAG.

Interface IP Addresses and VRFs

FRR inherits the IP addresses and any associated routing tables for the network interfaces from the /etc/network/interfaces file. This is the recommended way to define the addresses; do not create interfaces using FRR. For more information, see Configure IP Addresses and Virtual Routing and Forwarding - VRF.

vtysh Modal CLI

FRR provides a command-line interface (CLI) called vtysh for configuring and displaying protocol state. To start the CLI, run the sudo vtysh command:

cumulus@switch:~$ sudo vtysh

Hello, this is FRRouting (version 8.4.3).
Copyright 1996-2005 Kunihiro Ishiguro, et al.

switch#

FRR provides different modes to the CLI and certain commands are only available within a specific mode. Configuration is available with the configure terminal command:

switch# configure terminal
switch(config)#

The prompt displays the current CLI mode. For example, when you run the interface-specific commands, the prompt changes to:

switch(config)# interface swp1
switch(config-if)#

When you run the routing protocol specific commands, the prompt changes:

switch(config)# router ospf
switch(config-router)#

? displays the list of available top-level commands:

switch(config-if)# ?
  bandwidth    Set bandwidth informational parameter
  description  Interface specific description
  end          End current mode and change to enable mode
  exit         Exit current mode and down to previous mode
  ip           IP Information
  ipv6         IPv6 Information
  isis         IS-IS commands
  link-detect  Enable link detection on interface
  list         Print command list
  mpls-te      MPLS-TE specific commands
  multicast    Set multicast flag to interface
  no           Negate a command or set its defaults
  ptm-enable   Enable neighbor check with specified topology
  quit         Exit current mode and down to previous mode
  shutdown     Shutdown the selected interface

?-based completion is also available to see the parameters that a command takes:

switch(config-if)# bandwidth ?
<1-10000000>  Bandwidth in kilobits
switch(config-if)# ip ?
address  Set the IP address of an interface
irdp     Alter ICMP Router discovery preference this interface
ospf     OSPF interface commands
rip      Routing Information Protocol
router   IP router interface commands

In addition to ?-based completion, you can use tab completion to get help with the valid keywords or options as you enter commands. For example, using tab completion with router ospf shows the possible options for the command and returns you to the command prompt to complete the command.

switch(config)# router ospf vrf<<press tab>>
BLUE     RED      default  mgmt     
switch(config)# router ospf vrf

To search for specific vtysh commands so that you can identify the correct syntax to use, run the sudo vtysh -c 'find <term>' command. For example, to show only commands that include mlag:

cumulus@leaf01:mgmt:~$ sudo vtysh -c 'find mlag'
  (view)  show ip pim [mlag] vrf all interface [detail|WORD] [json]
  (view)  show ip pim [vrf NAME] interface [mlag] [detail|WORD] [json]
  (view)  show ip pim [vrf NAME] mlag upstream [A.B.C.D [A.B.C.D]] [json]
  (view)  show ip pim mlag summary [json]
  (view)  show ip pim vrf all mlag upstream [json]
  (view)  show zebra mlag
  (enable)  [no$no] debug zebra mlag
  (enable)  debug pim mlag
  (enable)  no debug pim mlag
  (enable)  test zebra mlag <none$none|primary$primary|secondary$secondary>
  (enable)  show ip pim [mlag] vrf all interface [detail|WORD] [json]
  (enable)  show ip pim [vrf NAME] interface [mlag] [detail|WORD] [json]
  (enable)  show ip pim [vrf NAME] mlag upstream [A.B.C.D [A.B.C.D]] [json]
  (enable)  show ip pim mlag summary [json]
  (enable)  show ip pim vrf all mlag upstream [json]
  (enable)  show zebra mlag
  (config)  [no$no] debug zebra mlag
  (config)  debug pim mlag
  (config)  ip pim mlag INTERFACE role [primary|secondary] state [up|down] addr A.B.C.D
  (config)  no debug pim mlag
  (config)  no ip pim mlag

You can display the state at any level, including the top level. For example, to see the routing table as seen by zebra:

switch# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, T - Table,
       > - selected route, * - FIB route
B>* 0.0.0.0/0 [20/0] via fe80::4638:39ff:fe00:c, swp29, 00:11:57
  *                  via fe80::4638:39ff:fe00:52, swp30, 00:11:57
B>* 10.0.0.1/32 [20/0] via fe80::4638:39ff:fe00:c, swp29, 00:11:57
  *                    via fe80::4638:39ff:fe00:52, swp30, 00:11:57
B>* 10.0.0.11/32 [20/0] via fe80::4638:39ff:fe00:5b, swp1, 00:11:57
B>* 10.0.0.12/32 [20/0] via fe80::4638:39ff:fe00:2e, swp2, 00:11:58
B>* 10.0.0.13/32 [20/0] via fe80::4638:39ff:fe00:57, swp3, 00:11:59
B>* 10.0.0.14/32 [20/0] via fe80::4638:39ff:fe00:43, swp4, 00:11:59
C>* 10.0.0.21/32 is directly connected, lo
B>* 10.0.0.51/32 [20/0] via fe80::4638:39ff:fe00:c, swp29, 00:11:57
  *                     via fe80::4638:39ff:fe00:52, swp30, 00:11:57
B>* 172.16.1.0/24 [20/0] via fe80::4638:39ff:fe00:5b, swp1, 00:11:57
  *                      via fe80::4638:39ff:fe00:2e, swp2, 00:11:57
B>* 172.16.3.0/24 [20/0] via fe80::4638:39ff:fe00:57, swp3, 00:11:59
  *                      via fe80::4638:39ff:fe00:43, swp4, 00:11:59

To run the same command at a config level, prepend do:

switch(config-router)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, T - Table,
       > - selected route, * - FIB route
B>* 0.0.0.0/0 [20/0] via fe80::4638:39ff:fe00:c, swp29, 00:05:17
  *                  via fe80::4638:39ff:fe00:52, swp30, 00:05:17
B>* 10.0.0.1/32 [20/0] via fe80::4638:39ff:fe00:c, swp29, 00:05:17
  *                    via fe80::4638:39ff:fe00:52, swp30, 00:05:17
B>* 10.0.0.11/32 [20/0] via fe80::4638:39ff:fe00:5b, swp1, 00:05:17
B>* 10.0.0.12/32 [20/0] via fe80::4638:39ff:fe00:2e, swp2, 00:05:18
B>* 10.0.0.13/32 [20/0] via fe80::4638:39ff:fe00:57, swp3, 00:05:18
B>* 10.0.0.14/32 [20/0] via fe80::4638:39ff:fe00:43, swp4, 00:05:18
C>* 10.0.0.21/32 is directly connected, lo
B>* 10.0.0.51/32 [20/0] via fe80::4638:39ff:fe00:c, swp29, 00:05:17
  *                     via fe80::4638:39ff:fe00:52, swp30, 00:05:17
B>* 172.16.1.0/24 [20/0] via fe80::4638:39ff:fe00:5b, swp1, 00:05:17
  *                      via fe80::4638:39ff:fe00:2e, swp2, 00:05:17
B>* 172.16.3.0/24 [20/0] via fe80::4638:39ff:fe00:57, swp3, 00:05:18
  *                      via fe80::4638:39ff:fe00:43, swp4, 00:05:18

To run single commands with vtysh, use the -c option:

cumulus@switch:~$ sudo vtysh -c 'sh ip route'
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, A - Babel,
       > - selected route, * - FIB route

K>* 0.0.0.0/0 via 192.168.0.2, eth0
C>* 192.0.2.11/24 is directly connected, swp1
C>* 192.0.2.12/24 is directly connected, swp2
B>* 203.0.113.30/24 [200/0] via 192.0.2.2, swp1, 11:05:10
B>* 203.0.113.31/24 [200/0] via 192.0.2.2, swp1, 11:05:10
B>* 203.0.113.32/24 [200/0] via 192.0.2.2, swp1, 11:05:10
C>* 127.0.0.0/8 is directly connected, lo
C>* 192.168.0.0/24 is directly connected, eth0

To run a command multiple levels down:

cumulus@switch:~$ sudo vtysh -c 'configure terminal' -c 'router ospf' -c 'area 0.0.0.1 range 10.10.10.0/24'
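
When you script vtysh, each mode change and command becomes its own -c argument, in order. The following minimal sketch builds that argument list. The function name vtysh_argv is hypothetical, and the sketch only prints the resulting command line instead of running it.

```python
import shlex


def vtysh_argv(commands, use_sudo=True):
    """Build an argument list that passes each command to vtysh with its own -c."""
    argv = ["sudo", "vtysh"] if use_sudo else ["vtysh"]
    for command in commands:
        argv += ["-c", command]
    return argv


if __name__ == "__main__":
    argv = vtysh_argv([
        "configure terminal",
        "router ospf",
        "area 0.0.0.1 range 10.10.10.0/24",
    ])
    # Print the equivalent shell command line.
    print(shlex.join(argv))
```

Passing the list to subprocess.run on a switch avoids shell quoting problems, because each command stays a single argument.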

The commands also take a partial command name (for example, sh ip route) as long as the partial command name is not ambiguous. For example, sh ip r matches more than one command:

cumulus@switch:~$ sudo vtysh -c 'sh ip r'
% Ambiguous command.

To disable a command or feature in FRR, prepend the command with no. For example:

cumulus@switch:~$ sudo vtysh

switch# configure terminal
switch(config)# router ospf
switch(config-router)# no area 0.0.0.1 range 10.10.10.0/24
switch(config-router)# exit
switch(config)# exit
switch# write memory
switch# exit
cumulus@switch:~$

To view the current state of the configuration, run the show running-config command:

switch# show running-config
Building configuration...

Current configuration:
!
username cumulus nopassword
!
service integrated-vtysh-config
!
vrf mgmt
!
interface lo
  link-detect
!
interface swp1
  ipv6 nd ra-interval 10
  link-detect
!
interface swp2
  ipv6 nd ra-interval 10
  link-detect
!
interface swp3
  ipv6 nd ra-interval 10
  link-detect
!
interface swp4
  ipv6 nd ra-interval 10
  link-detect
!
interface swp29
  ipv6 nd ra-interval 10
  link-detect
!
interface swp30
  ipv6 nd ra-interval 10
  link-detect
!
interface swp31
  link-detect
!
interface swp32
  link-detect
!
interface vagrant
  link-detect
!
interface eth0 vrf mgmt
  ipv6 nd suppress-ra
  link-detect
!
interface mgmt vrf mgmt
  link-detect
!
router bgp 65020
  bgp router-id 10.0.0.21
  bgp bestpath as-path multipath-relax
  bgp bestpath compare-routerid
  neighbor fabric peer-group
  neighbor fabric remote-as external
  neighbor fabric description Internal Fabric Network
  neighbor fabric capability extended-nexthop
  neighbor swp1 interface peer-group fabric
  neighbor swp2 interface peer-group fabric
  neighbor swp3 interface peer-group fabric
  neighbor swp4 interface peer-group fabric
  neighbor swp29 interface peer-group fabric
  neighbor swp30 interface peer-group fabric
  !
  address-family ipv4 unicast
  network 10.0.0.21/32
  neighbor fabric activate
  neighbor fabric prefix-list dc-spine in
  neighbor fabric prefix-list dc-spine out
  exit-address-family
!
ip prefix-list dc-spine seq 10 permit 0.0.0.0/0
ip prefix-list dc-spine seq 20 permit 10.0.0.0/24 le 32
ip prefix-list dc-spine seq 30 permit 172.16.1.0/24
ip prefix-list dc-spine seq 40 permit 172.16.2.0/24
ip prefix-list dc-spine seq 50 permit 172.16.3.0/24
ip prefix-list dc-spine seq 60 permit 172.16.4.0/24
ip prefix-list dc-spine seq 500 deny any
!
ip forwarding
ipv6 forwarding
!
line vty
!
end

If you try to configure a routing protocol that is not running, vtysh ignores those commands.

NVUE Show Commands and vtysh Output

NVUE provides the --output raw option for certain NVUE show commands to show vtysh native output.

NVUE Commands that support --output raw
nv show evpn multihoming esi
nv show evpn multihoming esi <esi_id>
nv show evpn vni <vni_id> multihoming esi
nv show evpn vni <vni_id> multihoming esi <esi_id>
nv show evpn vni <vni_id>
nv show vrf <tenant vrf> evpn
nv show vrf <tenant vrf> evpn bgp-info
nv show vrf <vrf> evpn nexthop-vtep <vtep>
nv show evpn vni <vni_id> bgp-info
nv show vrf <vrf-id> router bgp neighbor
nv show vrf <vrf-id> router bgp neighbor <neighbor-id>
nv show vrf <vrf-id> router bgp nexthop
nv show vrf <vrf-id> router bgp nexthop ipv4
nv show vrf <vrf-id> router bgp nexthop ipv6
nv show vrf default router rib ipv4 route
nv show vrf default router rib ipv6 route
nv show vrf default router rib
nv show vrf default router rib ipv4
nv show vrf default router rib ipv6

Show Routes in the Routing Table

To show all the routes in the routing table, run the nv show vrf <vrf> router rib <address-family> route command:

cumulus@switch:~$ nv show vrf default router rib ipv4 route

Flags - * - selected, q - queued, o - offloaded, i - installed, S - fib-        
selected, x - failed                                                            
                                                                                
Route            Protocol   Distance  Uptime                NHGId  Metric  Flags
---------------  ---------  --------  --------------------  -----  ------  -----
10.0.1.12/32     connected  0         2024-10-22T18:36:01Z  15     0       *Sio 
10.0.1.34/32     bgp        20        2024-10-22T18:42:22Z  125    0       *Si  
10.0.1.255/32    bgp        20        2024-10-22T18:36:05Z  125    0       *Si  
10.10.10.1/32    connected  0         2024-10-22T18:35:54Z  15     0       *Sio 
10.10.10.2/32    bgp        20        2024-10-22T18:35:58Z  62     0       *Si  
10.10.10.3/32    bgp        20        2024-10-22T18:42:16Z  125    0       *Si  
10.10.10.4/32    bgp        20        2024-10-22T18:42:16Z  125    0       *Si  
10.10.10.63/32   bgp        20        2024-10-22T18:36:05Z  125    0       *Si  
10.10.10.64/32   bgp        20        2024-10-22T18:36:05Z  125    0       *Si  
10.10.10.101/32  bgp        20        2024-10-22T18:36:05Z  115    0       *Si  
10.10.10.102/32  bgp        20        2024-10-22T18:36:04Z  107    0       *Si

To show information about a specific route, run the nv show vrf <vrf> router rib <address-family> route <prefix> command:

cumulus@switch:~$ nv show vrf default router rib ipv4 route 10.0.1.34/32
route-entry
==============
                                                                                
    Protocol - Protocol name, TblId - Table Id, NHGId - Nexthop group Id, Flags - u 
    - unreachable, r - recursive, o - onlink, i - installed, d - duplicate, c -     
    connected, A - active                                                           
                                                                                
    EntryIdx  Protocol  TblId  NHGId  Distance  Metric  ResolvedVia                ResolvedViaIntf  Weight  Flags
    --------  --------  -----  -----  --------  ------  -------------------------  ---------------  ------  -----
    1         bgp       254    125    20        0       fe80::4ab0:2dff:fe32:2a3f  swp52            1       iA   
                                                        fe80::4ab0:2dff:fe41:6b79  swp51            1       iA

To show the total number of routes in the routing table, run the nv show vrf <vrf> router rib <address-family> route-count command:

cumulus@switch:~$ nv show vrf default router rib ipv4 route-count
                 operational 
------------     ----------- 
total-routes    34 
[protocol]      bgp 
[protocol]      connected 

For IPv6 run the nv show vrf <vrf> router rib ipv6 route-count command.

To show the total number of routes per protocol in the routing table, run the nv show vrf <vrf> router rib <address-family> route-count protocol command:

cumulus@switch:~$ nv show vrf default router rib ipv4 route-count protocol
Protocol   Total 
---------  ----- 
bgp        6 
connected  3 
ospf       8 
static     3 

For IPv6 run the nv show vrf <vrf> router rib ipv6 route-count protocol command.

Look Up the Route for a Destination

To look up the route in the routing table for a specific destination, run the nv action lookup vrf <vrf-id> router fib <address-family> <ip-address> command.

The following example looks up the route in the routing table for the destination with the IPv4 address 10.10.10.3:

cumulus@switch:~$ nv action lookup vrf default router fib ipv4 10.10.10.3
Action executing ... 
 [{"dst":"10.10.10.3","nhid":455,"table":"default","protocol":"bgp","metric":20,"flags":[]}] 

 Action succeeded 

The following example looks up the route for the IPv6 destination 228:35::5 in the routing table of VRF RED:

cumulus@switch:~$ nv action lookup vrf RED router fib ipv6 228:35::5
[{"dst":"228:35::5","nhid":454,"table":"RED","protocol":"bgp","metric":20,"flags":[],"pref":"medium"}] 

 Action succeeded 
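
The lookup result in the examples above is a JSON array, so a script can read its fields directly. The following minimal sketch parses the IPv4 sample line shown above. The function name summarize_lookup is hypothetical, and the sketch assumes you have already separated the JSON line from the surrounding action status text.

```python
import json

# The JSON line from the IPv4 example in this section.
SAMPLE = (
    '[{"dst":"10.10.10.3","nhid":455,"table":"default",'
    '"protocol":"bgp","metric":20,"flags":[]}]'
)


def summarize_lookup(json_line):
    """Return one short summary string per route entry in the lookup output."""
    entries = json.loads(json_line)
    return [
        f'{entry["dst"]} nhid {entry["nhid"]} ({entry["protocol"]}, table {entry["table"]})'
        for entry in entries
    ]


if __name__ == "__main__":
    for summary in summarize_lookup(SAMPLE):
        print(summary)
```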

Next Hop Tracking

Routing daemons track the validity of next hops through notifications from the zebra daemon. For example, when an interface goes down, zebra removes the associated connected route and sends a next hop tracking (NHT) notification to bgpd. FRR then uninstalls the BGP routes whose next hops resolve over that connected route.

The zebra daemon does not consider next hops that resolve to a default route as valid by default. You can configure NHT to consider a longest prefix match lookup for next hop addresses resolving to the default route as a valid next hop. The following example configures the default route to be valid for NHT in the default VRF:

cumulus@leaf01:~$ nv set vrf default router nexthop-tracking ipv4 resolved-via-default on
cumulus@leaf01:~$ nv config apply

Or, with vtysh:

cumulus@leaf01:~$ sudo vtysh
leaf01# configure terminal
leaf01(config)# ip nht resolve-via-default
leaf01(config)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

You can apply a route map to NHT for specific routing daemons to permit or deny routes from consideration as valid next hops. The following example applies ROUTEMAP1 to BGP, preventing NHT from considering next hops resolving to 10.0.0.0/8 in the default VRF as valid:

cumulus@leaf01:~$ nv set router policy prefix-list PREFIX1 type ipv4
cumulus@leaf01:~$ nv set router policy prefix-list PREFIX1 rule 1 match 10.0.0.0/8
cumulus@leaf01:~$ nv set router policy prefix-list PREFIX1 rule 1 action permit
cumulus@leaf01:~$ nv set router policy route-map ROUTEMAP1 rule 1 match ip-prefix-list PREFIX1
cumulus@leaf01:~$ nv set router policy route-map ROUTEMAP1 rule 1 action deny 
cumulus@leaf01:~$ nv set router policy route-map ROUTEMAP1 rule 2 action permit
cumulus@leaf01:~$ nv set vrf default router nexthop-tracking ipv4 route-map ROUTEMAP1 protocol bgp
cumulus@leaf01:~$ nv config apply

Or, with vtysh:

cumulus@leaf01:~$ sudo vtysh
leaf01# configure terminal
leaf01(config)# ip prefix-list PREFIX1 seq 1 permit 10.0.0.0/8
leaf01(config)# route-map ROUTEMAP1 deny 1
leaf01(config-route-map)# match ip address prefix-list PREFIX1
leaf01(config-route-map)# route-map ROUTEMAP1 permit 2
leaf01(config-route-map)# ip nht bgp route-map ROUTEMAP1
leaf01(config)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

You can show tracked next hops with the following NVUE commands:

cumulus@leaf01:~$ nv show vrf default router nexthop-tracking ipv4
                      operational  applied  pending
--------------------  -----------  -------  -------
resolved-via-default                        on

route-map
============
No Data

ip-address
=============
                                                                                
    DirectlyConnected - Indicates if nexthop is directly connected or not,          
    ResolvedProtocol - Resolved via protocol, Interface - Resolved via interface,   
    ProtocolFiltered - Indicates whether protocol filtered or not, Flags - o -      
    onlink, c - directly-connected, A - active                                      
                                                                                
    IPAddress    DirectlyConnected  ResolvedProtocol  Interface      VRF      Weight  ProtocolFiltered  Flags
    -----------  -----------------  ----------------  -------------  -------  ------  ----------------  -----
    10.0.1.34    off                bgp               swp52          default  1       off               A    
                                                      swp53          default  1                         A    
                                                      swp54          default  1                         A    
                                                      swp51          default  1                         A    
    10.10.10.2   off                bgp               peerlink.4094  default  1       off               A    
    10.10.10.3   off                bgp               swp52          default  1       off               A    
                                                      swp53          default  1                         A    
                                                      swp54          default  1                         A    
                                                      swp51          default  1                         A    
    10.10.10.4   off                bgp               swp52          default  1       off               A    
                                                      swp53          default  1                         A    
                                                      swp54          default  1                         A    
                                                      swp51          default  1                         A    
    10.10.10.63  off                bgp               swp52          default  1       off               A    
                                                      swp53          default  1                         A    
                                                      swp54          default  1                         A    
                                                      swp51          default  1                         A    
    10.10.10.64  off                bgp               swp52          default  1       off               A    
                                                      swp53          default  1                         A    
                                                      swp54          default  1                         A    
                                                      swp51          default  1                         A

You can also run the vtysh show ip nht vrf <vrf> <ip-address> command.

Reload the FRR Configuration

The information in this section does not apply if you use NVUE to configure your switch. NVUE manages FRR daemons and configuration automatically. These instructions are only applicable if you manage FRR directly with Linux flat file configurations.

If you make a change to your routing configuration, you need to reload FRR so that your changes take effect. FRR reload applies only the modifications you make to your FRR configuration, synchronizing the running state with the configuration in /etc/frr/frr.conf. This is useful for optimizing FRR automation in your environment or for applying changes made at runtime.

To reload your FRR configuration after you modify /etc/frr/frr.conf, run:

cumulus@switch:~$ sudo systemctl reload frr.service

Examine the running configuration and verify that it matches the configuration in /etc/frr/frr.conf.

If the running configuration is not what you expect, submit a support request and supply the following information:

FRR Logging

The information in this section does not apply if you use NVUE to configure your switch. NVUE manages FRR daemons and configuration automatically. These instructions are only applicable if you manage FRR directly with Linux flat file configurations.

By default, Cumulus Linux configures FRR with syslog severity level 6 (informational). Log output writes to the /var/log/frr/frr.log file.

To write debug messages to the log file, you must run the log syslog debug command to configure FRR with syslog severity 7 (debug); otherwise, when you issue a debug command such as debug bgp neighbor-events, no output goes to /var/log/frr/frr.log. However, when you manually define a log target with the log file /var/log/frr/debug.log command, FRR automatically defaults to severity 7 (debug) logging and writes the output to /var/log/frr/debug.log.
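
The reason debug output does not appear at the default level follows from the usual syslog rule: a message is written only if its severity number is less than or equal to the configured level. The following minimal sketch illustrates that rule with the two levels named above. The function name is_written is hypothetical and this is not FRR code.

```python
# Syslog severity numbers for the two levels discussed in this section.
INFORMATIONAL = 6
DEBUG = 7


def is_written(message_severity, configured_severity):
    """Return True if a message at message_severity passes the configured level."""
    # Lower numbers are more severe, so a message passes when its number
    # does not exceed the configured level.
    return message_severity <= configured_severity


if __name__ == "__main__":
    print("debug message at level 6:", is_written(DEBUG, INFORMATIONAL))
    print("debug message at level 7:", is_written(DEBUG, DEBUG))
```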

Considerations

Duplicate Hostnames

The switch can have two hostnames in the FRR configuration. For example:

cumulus@spine01:~$ sudo vtysh...
spine01# configure terminal
spine01(config)# hostname spine01-1
spine01-1(config)# do sh run
Building configuration...
Current configuration:
!
frr version 7.0+cl4u3
frr defaults datacenter
hostname spine01
hostname spine01-1
...

If you configure the same numbered BGP neighbor with both the neighbor x.x.x.x and neighbor swp# interface commands, two neighbor entries are present for the same IP address in the configuration. To correct this issue, update the configuration and restart the FRR service.

TCP Sockets and BGP Peering Sessions

The FRR startup configuration includes a setting for the maximum number of open files allowed. For BGP, open files include the TCP sockets that BGP connections use. Both BGP speakers can initiate a peering almost simultaneously; therefore, you can have two TCP sockets for a single BGP peer. These two sockets exist until BGP determines which socket to use, then the other socket closes.

The default setting of 1024 open files supports up to 512 BGP peering sessions. If you expect your network deployment to have more BGP peering sessions, you need to update this setting.

NVIDIA recommends you set the value to at least twice the maximum number of BGP peering sessions you expect.
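
The sizing rule above is simple arithmetic. The following Python sketch restates it; the function names are hypothetical, and the calculation counts only the two possible TCP sockets per peer, as the text does.

```python
SOCKETS_PER_PEER = 2  # both speakers can open a socket during session setup


def max_bgp_sessions(limit_nofile):
    """Peering sessions supported by a given open files limit."""
    return limit_nofile // SOCKETS_PER_PEER


def recommended_limit_nofile(expected_sessions):
    """Minimum open files limit for the expected number of sessions."""
    return SOCKETS_PER_PEER * expected_sessions
```

For example, max_bgp_sessions(1024) gives 512, matching the default, and recommended_limit_nofile(2048) gives 4096, the value used in the example that follows.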

To update the open files setting:

  1. Edit the /lib/systemd/system/frr.service file and change the LimitNOFILE parameter. The following example sets the LimitNOFILE parameter to 4096.

    cumulus@switch:~$ sudo cat /lib/systemd/system/frr.service
    [Unit]
    Description=FRRouting
    Documentation=https://frrouting.readthedocs.io/en/latest/setup.html
    After=networking.service csmgrd.service
       
    [Service]
    Nice=-5
    Type=forking
    NotifyAccess=all
    StartLimitInterval=3m
    StartLimitBurst=3
    TimeoutSec=2m
    WatchdogSec=60s
    RestartSec=5
    Restart=on-abnormal
    LimitNOFILE=4096
    ...
    
  2. Restart the FRR service.

    cumulus@switch:~$ sudo systemctl restart frr.service
    

gNMI Streaming

You can use gRPC Network Management Interface (gNMI) to collect system resource, interface, and counter information from Cumulus Linux and export it to your own gNMI client.

Configure the gNMI Agent

The netq-agent package includes the gNMI agent, which is disabled by default. To enable the gNMI agent:

 cumulus@switch:~$ sudo systemctl enable netq-agent.service
 cumulus@switch:~$ sudo systemctl start netq-agent.service
 cumulus@switch:~$ netq config add agent gnmi-enable true

The gNMI agent listens on port 9339. You can change the default port if another application uses that port. The /etc/netq/netq.yml file stores the configuration.

Use the following commands to adjust the settings:

  1. Disable the gNMI agent:

    cumulus@switch:~$ netq config add agent gnmi-enable false
    
  2. Change the default port over which the gNMI agent listens:

    cumulus@switch:~$ netq config add agent gnmi-port <gnmi_port>
    
  3. Restart the NetQ Agent to incorporate the configuration changes:

    cumulus@switch:~$ netq config restart agent
    

The gNMI agent relies on the data it collects from the NVUE service. For complete data collection with gNMI, you must enable the NVUE service. To check the status of the nvued service, run the sudo systemctl status nvued.service command:

cumulus@switch:mgmt:~$ sudo systemctl status nvued.service
● nvued.service - NVIDIA User Experience Daemon
   Loaded: loaded (/lib/systemd/system/nvued.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2023-03-09 20:00:17 UTC; 6 days ago

If necessary, enable and start the service:

cumulus@switch:mgmt:~$ sudo systemctl enable nvued.service
cumulus@switch:mgmt:~$ sudo systemctl start nvued.service

Use the gNMI Agent Only

NVIDIA recommends that you collect data with both the gNMI and NetQ agents. However, if you do not want to collect data with both agents or you are not streaming data to NetQ, you can disable the NetQ agent. Cumulus Linux then sends data only to the gNMI agent.

To disable the NetQ agent:

cumulus@switch:~$ netq config add agent opta-enable false

You cannot disable both the NetQ and gNMI agents. If you enable both agents on Cumulus Linux and a NetQ server is unreachable, the switch does not send the data to gNMI from the following models:

WJH, openconfig-platform, and openconfig-lldp data continue streaming to gNMI in this state. If you are only using gNMI and a NetQ telemetry server does not exist, disable the NetQ agent by setting opta-enable to false.

Supported Subscription Modes

Cumulus Linux supports the following gNMI subscription modes:

Supported Models

Cumulus Linux supports the following OpenConfig models:

| Model | Supported Data |
| --- | --- |
| openconfig-interfaces | Name, Operstatus, AdminStatus, IfIndex, MTU, LoopbackMode, Enabled, Counters (InPkts, OutPkts, InOctets, InUnicastPkts, InDiscards, InMulticastPkts, InBroadcastPkts, InErrors, OutOctets, OutUnicastPkts, OutMulticastPkts, OutBroadcastPkts, OutDiscards, OutErrors) |
| openconfig-if-ethernet | AutoNegotiate, PortSpeed, MacAddress, NegotiatedPortSpeed, Counters (InJabberFrames, InOversizeFrames, InUndersizeFrames) |
| openconfig-if-ethernet-ext | Frame size counters (InFrames_64Octets, InFrames_65_127Octets, InFrames_128_255Octets, InFrames_256_511Octets, InFrames_512_1023Octets, InFrames_1024_1518Octets) |
| openconfig-system | Memory, CPU |
| openconfig-platform | Platform data (Name, Description, Version) |
| openconfig-lldp | LLDP data (PortIdType, PortDescription, LastUpdate, SystemName, SystemDescription, ChassisId, Ttl, Age, ManagementAddress, ManagementAddressType, Capability) |

| Model | Supported Data |
| --- | --- |
| nvidia-if-wjh-drop-aggregate | Aggregated WJH drops, including layer 1, layer 2, router, ACL, tunnel, and buffer drops |
| nvidia-if-ethernet-ext | Extended Ethernet counters (AlignmentError, InAclDrops, InBufferDrops, InDot3FrameErrors, InDot3LengthErrors, InL3Drops, InPfc0Packets, InPfc1Packets, InPfc2Packets, InPfc3Packets, InPfc4Packets, InPfc5Packets, InPfc6Packets, InPfc7Packets, OutNonQDrops, OutPfc0Packets, OutPfc1Packets, OutPfc2Packets, OutPfc3Packets, OutPfc4Packets, OutPfc5Packets, OutPfc6Packets, OutPfc7Packets, OutQ0WredDrops, OutQ1WredDrops, OutQ2WredDrops, OutQ3WredDrops, OutQ4WredDrops, OutQ5WredDrops, OutQ6WredDrops, OutQ7WredDrops, OutQDrops, OutQLength, OutWredDrops, SymbolErrors, OutTxFifoFull) |

The client can use the following YANG models as a reference:

nvidia-if-ethernet-ext
module nvidia-if-ethernet-counters-ext {
    // xPath --> /interfaces/interface[name=*]/ethernet/counters/state/

   namespace "http://nvidia.com/yang/nvidia-ethernet-counters";
   prefix "nvidia-if-ethernet-counters-ext";


  // import some basic types
  import openconfig-interfaces { prefix oc-if; }
  import openconfig-if-ethernet { prefix oc-eth; }
  import openconfig-yang-types { prefix oc-yang; }


  revision "2021-10-12" {
    description
      "Initial revision";
    reference "1.0.0.";
  }

  grouping ethernet-counters-ext {

    leaf alignment-error {
      type oc-yang:counter64;
    }

    leaf in-acl-drops {
      type oc-yang:counter64;
    }

    leaf in-buffer-drops {
      type oc-yang:counter64;
    }

    leaf in-dot3-frame-errors {
      type oc-yang:counter64;
    }

    leaf in-dot3-length-errors {
      type oc-yang:counter64;
    }

    leaf in-l3-drops {
      type oc-yang:counter64;
    }

    leaf in-pfc0-packets {
      type oc-yang:counter64;
    }

    leaf in-pfc1-packets {
      type oc-yang:counter64;
    }

    leaf in-pfc2-packets {
      type oc-yang:counter64;
    }

    leaf in-pfc3-packets {
      type oc-yang:counter64;
    }

    leaf in-pfc4-packets {
      type oc-yang:counter64;
    }

    leaf in-pfc5-packets {
      type oc-yang:counter64;
    }

    leaf in-pfc6-packets {
      type oc-yang:counter64;
    }

    leaf in-pfc7-packets {
      type oc-yang:counter64;
    }

    leaf out-non-q-drops {
      type oc-yang:counter64;
    }

    leaf out-pfc0-packets {
      type oc-yang:counter64;
    }

    leaf out-pfc1-packets {
      type oc-yang:counter64;
    }

    leaf out-pfc2-packets {
      type oc-yang:counter64;
    }

    leaf out-pfc3-packets {
      type oc-yang:counter64;
    }

    leaf out-pfc4-packets {
      type oc-yang:counter64;
    }

    leaf out-pfc5-packets {
      type oc-yang:counter64;
    }

    leaf out-pfc6-packets {
      type oc-yang:counter64;
    }

    leaf out-pfc7-packets {
      type oc-yang:counter64;
    }

    leaf out-q0-wred-drops {
      type oc-yang:counter64;
    }

    leaf out-q1-wred-drops {
      type oc-yang:counter64;
    }

    leaf out-q2-wred-drops {
      type oc-yang:counter64;
    }

    leaf out-q3-wred-drops {
      type oc-yang:counter64;
    }

    leaf out-q4-wred-drops {
      type oc-yang:counter64;
    }

    leaf out-q5-wred-drops {
      type oc-yang:counter64;
    }

    leaf out-q6-wred-drops {
      type oc-yang:counter64;
    }

    leaf out-q7-wred-drops {
      type oc-yang:counter64;
    }

    leaf out-q8-wred-drops {
      type oc-yang:counter64;
    }

    leaf out-q9-wred-drops {
      type oc-yang:counter64;
    }

    leaf out-q-drops {
      type oc-yang:counter64;
    }

    leaf out-q-length {
      type oc-yang:counter64;
    }

    leaf out-wred-drops {
      type oc-yang:counter64;
    }

    leaf symbol-errors {
      type oc-yang:counter64;
    }

    leaf out-tx-fifo-full {
      type oc-yang:counter64;
    }

  }

  augment "/oc-if:interfaces/oc-if:interface/oc-eth:ethernet/" +
    "oc-eth:state/oc-eth:counters" {
      uses ethernet-counters-ext;
  }

}
nvidia-if-wjh-drop-aggregate
module nvidia-wjh {
    // Entrypoint /oc-if:interfaces/oc-if:interface
    //
    // xPath L1     --> /interfaces/interface[name=*]/wjh/aggregate/l1
    // xPath L2     --> /interfaces/interface[name=*]/wjh/aggregate/l2/reasons/reason[id=*][severity=*]
    // xPath Router --> /interfaces/interface[name=*]/wjh/aggregate/router/reasons/reason[id=*][severity=*]
    // xPath Tunnel --> /interfaces/interface[name=*]/wjh/aggregate/tunnel/reasons/reason[id=*][severity=*]
    // xPath Buffer --> /interfaces/interface[name=*]/wjh/aggregate/buffer/reasons/reason[id=*][severity=*]
    // xPath ACL    --> /interfaces/interface[name=*]/wjh/aggregate/acl/reasons/reason[id=*][severity=*]

    import openconfig-interfaces { prefix oc-if; }

    namespace "http://nvidia.com/yang/what-just-happened-config";
    prefix "nvidia-wjh";

    revision "2021-10-12" {
        description
            "Initial revision";
        reference "1.0.0.";
    }

    augment "/oc-if:interfaces/oc-if:interface" {
        uses interfaces-wjh;
    }

    grouping interfaces-wjh {
        description "Top-level grouping for What-just happened data.";
        container wjh {
            container aggregate {
                container l1 {
                    container state {
                        leaf drop {
                            type string;
                            description "Drop list based on wjh-drop-types module encoded in JSON";
                        }
                    }
                }
                container l2 {
                    uses reason-drops;
                }
                container router {
                    uses reason-drops;
                }
                container tunnel {
                    uses reason-drops;
                }
                container acl {
                    uses reason-drops;
                }
                container buffer {
                    uses reason-drops;
                }
            }
        }
    }

    grouping reason-drops {
        container reasons {
            list reason {
                key "id severity";
                leaf id {
                    type leafref {
                        path "../state/id";
                    }
                    description "reason ID";
                }
                leaf severity {
                    type leafref {
                        path "../state/severity";
                    }
                    description "Reason severity";
                }
                container state {
                    leaf id {
                        type uint32;
                        description "Reason ID";
                    }
                    leaf name {
                        type string;
                        description "Reason name";
                    }
                    leaf severity {
                        type string;
                        mandatory "true";
                        description "Reason severity";
                    }
                    leaf drop {
                        type string;
                        description "Drop list based on wjh-drop-types module encoded in JSON";
                    }
                }
            }
        }
    }
}

module wjh-drop-types {
    namespace "http://nvidia.com/yang/what-just-happened-config-types";
    prefix "wjh-drop-types";

    container l1-aggregated {
        uses l1-drops;
    }
    container l2-aggregated {
        uses l2-drops;
    }
    container router-aggregated {
        uses router-drops;
    }
    container tunnel-aggregated {
        uses tunnel-drops;
    }
    container acl-aggregated {
        uses acl-drops;
    }
    container buffer-aggregated {
        uses buffer-drops;
    }

    grouping reason-key {
        leaf id {
            type uint32;
            mandatory "true";
            description "reason ID";
        }
        leaf severity {
            type string;
            mandatory "true";
            description "Severity";
        }
    }

    grouping reason_info {
        leaf reason {
                type string;
                mandatory "true";
                description "Reason name";
        }
        leaf drop_type {
            type string;
            mandatory "true";
            description "reason drop type";
        }
        leaf ingress_port {
            type string;
            mandatory "true";
            description "Ingress port name";
        }
        leaf ingress_lag {
            type string;
            description "Ingress LAG name";
        }
        leaf egress_port {
            type string;
            description "Egress port name";
        }
        leaf agg_count {
            type uint64;
            description "Aggregation count";
        }
        leaf severity {
            type string;
            description "Severity";
        }
        leaf first_timestamp {
            type uint64;
            description "First timestamp";
        }
        leaf end_timestamp {
            type uint64;
            description "End timestamp";
        }
    }

    grouping packet_info {
        leaf smac {
            type string;
            description "Source MAC";
        }
        leaf dmac {
            type string;
            description "Destination MAC";
        }
        leaf sip {
            type string;
            description "Source IP";
        }
        leaf dip {
            type string;
            description "Destination IP";
        }
        leaf proto {
            type uint32;
            description "Protocol";
        }
        leaf sport {
            type uint32;
            description "Source port";
        }
        leaf dport {
            type uint32;
            description "Destination port";
        }
    }

    grouping l1-drops {
        description "What-just happened drops.";
        leaf ingress_port {
            type string;
            description "Ingress port";
        }
        leaf is_port_up {
            type boolean;
            description "Is port up";
        }
        leaf port_down_reason {
            type string;
            description "Port down reason";
        }
        leaf description {
            type string;
            description "Description";
        }
        leaf state_change_count {
            type uint64;
            description "State change count";
        }
        leaf symbol_error_count {
            type uint64;
            description "Symbol error count";
        }
        leaf crc_error_count {
            type uint64;
            description "CRC error count";
        }
        leaf first_timestamp {
            type uint64;
            description "First timestamp";
        }
        leaf end_timestamp {
            type uint64;
            description "End timestamp";
        }
        leaf timestamp {
            type uint64;
            description "Timestamp";
        }
    }
    grouping l2-drops {
        description "What-just happened drops.";
        uses reason_info;
        uses packet_info;
    }

    grouping router-drops {
        description "What-just happened drops.";
        uses reason_info;
        uses packet_info;
    }

    grouping tunnel-drops {
        description "What-just happened drops.";
        uses reason_info;
        uses packet_info;
    }

    grouping acl-drops {
        description "What-just happened drops.";
        uses reason_info;
        uses packet_info;
        leaf acl_rule_id {
            type uint64;
            description "ACL rule ID";
        }
        leaf acl_bind_point {
            type uint32;
            description "ACL bind point";
        }
        leaf acl_name {
            type string;
            description "ACL name";
        }
        leaf acl_rule {
            type string;
            description "ACL rule";
        }
    }

    grouping buffer-drops {
        description "What-just happened drops.";
        uses reason_info;
        uses packet_info;
        leaf traffic_class {
            type uint32;
            description "Traffic Class";
        }
        leaf original_occupancy {
            type uint32;
            description "Original occupancy";
        }
        leaf original_latency {
            type uint64;
            description "Original latency";
        }
    }
}

Collect WJH Data with gNMI

You can export What Just Happened (WJH) data from the NetQ agent to your own gNMI client. Refer to the nvidia-if-wjh-drop-aggregate reference YANG model, above.

Supported Features

The gNMI Agent supports Capabilities and STREAM subscribe requests for WJH events.

WJH Drop Reasons

The data that NetQ sends to the gNMI agent is in the form of WJH drop reasons. The SDK generates the drop reasons and Cumulus Linux stores them in the /usr/etc/wjh_lib_conf.xml file. Use this file as a guide to filter for specific reason types (L1, ACL, and so on), reason IDs, or event severities.
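
To filter on a reason type, reason ID, or severity, a client encodes them as keys in the subscription path. The following Python sketch builds a prefix and path pair from the xPath comments in the nvidia-wjh reference model. The function name wjh_subscription is hypothetical, and using `*` as a wildcard for an omitted key is an assumption taken from those xPath comments.

```python
# Drop categories that appear as containers under wjh/aggregate in the model.
REASON_DROP_TYPES = {"l2", "router", "tunnel", "acl", "buffer"}


def wjh_subscription(interface, drop_type, reason_id=None, severity=None):
    """Return (prefix, path) strings for a WJH drop subscription."""
    prefix = f"/interfaces/interface[name={interface}]/"
    if drop_type == "l1":
        # Layer 1 drops have no reason list in the model.
        return prefix, "wjh/aggregate/l1/state/drop"
    if drop_type not in REASON_DROP_TYPES:
        raise ValueError(f"unknown WJH drop type: {drop_type}")
    rid = "*" if reason_id is None else reason_id
    sev = "*" if severity is None else severity
    path = (
        f"wjh/aggregate/{drop_type}/reasons/"
        f"reason[id={rid}][severity={sev}]/state/drop"
    )
    return prefix, path
```

For example, wjh_subscription("swp8", "l2", 209, "error") returns the prefix and path for layer 2 reason 209 (source MAC is multicast) on swp8.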

Layer 1 Drop Reasons

| Reason ID | Reason | Description |
| --- | --- | --- |
| 10021 | Port admin down | Validate port configuration |
| 10022 | Auto-negotiation failure | Set port speed manually, disable auto-negotiation |
| 10023 | Logical mismatch with peer link | Check cable or transceiver |
| 10024 | Link training failure | Check cable or transceiver |
| 10025 | Peer is sending remote faults | Replace cable or transceiver |
| 10026 | Bad signal integrity | Replace cable or transceiver |
| 10027 | Cable or transceiver is not supported | Use supported cable or transceiver |
| 10028 | Cable or transceiver is unplugged | Plug cable or transceiver |
| 10029 | Calibration failure | Check cable or transceiver |
| 10030 | Cable or transceiver bad status | Check cable or transceiver |
| 10031 | Other reason | Other L1 drop reason |

Layer 2 Drop Reasons

| Reason ID | Reason | Severity | Description |
| --- | --- | --- | --- |
| 201 | MLAG port isolation | Notice | Expected behavior |
| 202 | Destination MAC is reserved (DMAC=01-80-C2-00-00-0x) | Error | Bad packet received from the peer |
| 203 | VLAN tagging mismatch | Error | Validate the VLAN tag configuration on both ends of the link |
| 204 | Ingress VLAN filtering | Error | Validate the VLAN membership configuration on both ends of the link |
| 205 | Ingress spanning tree filter | Notice | Expected behavior |
| 206 | Unicast MAC table action discard | Error | Validate MAC table for this destination MAC |
| 207 | Multicast egress port list is empty | Warning | Validate why IGMP join or multicast router port does not exist |
| 208 | Port loopback filter | Error | Validate MAC table for this destination MAC |
| 209 | Source MAC is multicast | Error | Bad packet received from peer |
| 210 | Source MAC equals destination MAC | Error | Bad packet received from peer |

Router Drop Reasons

| Reason ID | Reason | Severity | Description |
| --- | --- | --- | --- |
| 301 | Non-routable packet | Notice | Expected behavior |
| 302 | Blackhole route | Warning | Validate routing table for this destination IP |
| 303 | Unresolved neighbor or next hop | Warning | Validate ARP table for the neighbor or next hop |
| 304 | Blackhole ARP or neighbor | Warning | Validate ARP table for the next hop |
| 305 | IPv6 destination in multicast scope FFx0:/16 | Notice | Expected behavior - packet is not routable |
| 306 | IPv6 destination in multicast scope FFx1:/16 | Notice | Expected behavior - packet is not routable |
| 307 | Non-IP packet | Notice | Destination MAC is the router, packet is not routable |
| 308 | Unicast destination IP but multicast destination MAC | Error | Bad packet received from the peer |
| 309 | Destination IP is loopback address | Error | Bad packet received from the peer |
| 310 | Source IP is multicast | Error | Bad packet received from the peer |
| 311 | Source IP is in class E | Error | Bad packet received from the peer |
| 312 | Source IP is loopback address | Error | Bad packet received from the peer |
| 313 | Source IP is unspecified | Error | Bad packet received from the peer |
| 314 | Checksum or IPver or IPv4 IHL too short | Error | Bad cable or bad packet received from the peer |
| 315 | Multicast MAC mismatch | Error | Bad packet received from the peer |
| 316 | Source IP equals destination IP | Error | Bad packet received from the peer |
| 317 | IPv4 source IP is limited broadcast | Error | Bad packet received from the peer |
| 318 | IPv4 destination IP is local network (destination=0.0.0.0/8) | Error | Bad packet received from the peer |
| 320 | Ingress router interface is disabled | Warning | Validate your configuration |
| 321 | Egress router interface is disabled | Warning | Validate your configuration |
| 323 | IPv4 routing table (LPM) unicast miss | Warning | Validate routing table for this destination IP |
| 324 | IPv6 routing table (LPM) unicast miss | Warning | Validate routing table for this destination IP |
| 325 | Router interface loopback | Warning | Validate the interface configuration |
| 326 | Packet size is larger than router interface MTU | Warning | Validate the router interface MTU configuration |
| 327 | TTL value is too small | Warning | Actual path is longer than the TTL |

Tunnel Drop Reasons

| Reason ID | Reason | Severity | Description |
| --- | --- | --- | --- |
| 402 | Overlay switch - Source MAC is multicast | Error | The peer sent a bad packet |
| 403 | Overlay switch - Source MAC equals destination MAC | Error | The peer sent a bad packet |
| 404 | Decapsulation error | Error | The peer sent a bad packet |

ACL Drop Reasons

| Reason ID | Reason | Severity | Description |
| --- | --- | --- | --- |
| 601 | Ingress port ACL | Notice | Validate Access Control List configuration |
| 602 | Ingress router ACL | Notice | Validate Access Control List |
| 603 | Egress router ACL | Notice | Validate Access Control List |
| 604 | Egress port ACL | Notice | Validate Access Control List |

Buffer Drop Reasons

| Reason ID | Reason | Severity | Description |
| --- | --- | --- | --- |
| 503 | Tail drop | Warning | Monitor network congestion |
| 504 | WRED | Warning | Monitor network congestion |
| 505 | Port TC congestion threshold crossed | Notice | Monitor network congestion |
| 506 | Packet latency threshold crossed | Notice | Monitor network congestion |

gNMI Client Requests

You can use your gNMI client on a host to request capabilities and data to which the Agent subscribes. The examples below use the gNMIc client.

The following example shows a gNMIc STREAM request for WJH data:

gnmic -a 10.209.37.121:9339 -u cumulus -p ****** --skip-verify subscribe --path "wjh/aggregate/l2/reasons/reason[id=209][severity=error]/state/drop" --mode stream --prefix "/interfaces/interface[name=swp8]/" --target netq

{
  "source": "10.209.37.121:9339",
  "subscription-name": "default-1677695197",
  "timestamp": 1677695102858146800,
  "time": "2023-03-01T18:25:02.8581468Z",
  "prefix": "interfaces/interface[name=swp8]/wjh/aggregate/l2/reasons/reason[severity=error][id=209]",
  "target": "netq",
  "updates": [
    {
      "Path": "state/drop",
      "values": {
        "state/drop": "[{\"AggCount\":283,\"Dip\":\"0.0.0.0\",\"Dmac\":\"1c:34:da:17:93:7c\",\"Dport\":0,\"DropType\":\"L2\",\"EgressPort\":\"\",\"EndTimestamp\":1677695102,\"FirstTimestamp\":1677695072,\"Hostname\":\"neo-switch01\",\"IngressLag\":\"\",\"IngressPort\":\"swp8\",\"Proto\":0,\"Reason\":\"Source MAC is multicast\",\"ReasonId\":209,\"Severity\":\"Error\",\"Sip\":\"0.0.0.0\",\"Smac\":\"01:00:5e:00:00:01\",\"Sport\":0}]"
      }
    }
  ]
}
{
  "source": "10.209.37.121:9339",
  "subscription-name": "default-1677695197",
  "timestamp": 1677695132988218890,
  "time": "2023-03-01T18:25:32.98821889Z",
  "prefix": "interfaces/interface[name=swp8]/wjh/aggregate/l2/reasons/reason[severity=error][id=209]",
  "target": "netq",
  "updates": [
    {
      "Path": "state/drop",
      "values": {
        "state/drop": "[{\"AggCount\":287,\"Dip\":\"0.0.0.0\",\"Dmac\":\"1c:34:da:17:93:7c\",\"Dport\":0,\"DropType\":\"L2\",\"EgressPort\":\"\",\"EndTimestamp\":1677695132,\"FirstTimestamp\":1677695102,\"Hostname\":\"neo-switch01\",\"IngressLag\":\"\",\"IngressPort\":\"swp8\",\"Proto\":0,\"Reason\":\"Source MAC is multicast\",\"ReasonId\":209,\"Severity\":\"Error\",\"Sip\":\"0.0.0.0\",\"Smac\":\"01:00:5e:00:00:01\",\"Sport\":0}]"
      }
    }
  ]
}
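
In the output above, the state/drop value is itself a JSON document serialized as a string, so a client has to decode it a second time. The following Python sketch shows one way to do that with the standard library. The function name parse_wjh_drops is hypothetical, and the sketch assumes each notification arrives as one JSON object shaped like the sample output.

```python
import json


def parse_wjh_drops(notification_text):
    """Return the drop records carried in one gNMI notification."""
    notification = json.loads(notification_text)
    drops = []
    for update in notification.get("updates", []):
        for encoded in update.get("values", {}).values():
            # Second decode: the value is a JSON list stored as a string.
            drops.extend(json.loads(encoded))
    return drops


if __name__ == "__main__":
    # A trimmed stand-in for the first notification shown above.
    inner = json.dumps(
        [{"AggCount": 283, "IngressPort": "swp8", "ReasonId": 209}]
    )
    sample = json.dumps(
        {"updates": [{"Path": "state/drop", "values": {"state/drop": inner}}]}
    )
    for drop in parse_wjh_drops(sample):
        print(drop["IngressPort"], drop["ReasonId"], drop["AggCount"])
```

Each record is then an ordinary dictionary, so fields such as AggCount, Reason, and IngressPort can be read directly.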

The following example shows a gNMIc ONCE mode request for interface port speed:

gnmic -a 10.209.37.121:9339 -u cumulus -p ****** --skip-verify subscribe --path "ethernet/state/port-speed" --mode once --prefix "/interfaces/interface[name=swp1]/" --target netq
{
  "source": "10.209.37.123:9339",
  "subscription-name": "default-1677695151",
  "timestamp": 1677256036962254134,
  "time": "2023-02-24T16:27:16.962254134Z",
  "target": "netq",
  "updates": [
    {
      "Path": "interfaces/interface[name=swp1]/ethernet/state/port-speed",
      "values": {
        "interfaces/interface/ethernet/state/port-speed": "SPEED_1GB"
      }
    }
  ]
}

The following example shows a gNMIc POLL mode request for interface status:

gnmic -a 10.209.37.121:9339 -u cumulus -p ****** --skip-verify subscribe --path "state/oper-status" --mode poll --prefix "/interfaces/interface[name=swp1]/" --target netq
{
  "timestamp": 1677644403153198642,
  "time": "2023-03-01T04:20:03.153198642Z",
  "prefix": "interfaces/interface[name=swp1]",
  "target": "netq",
  "updates": [
    {
      "Path": "state/oper-status",
      "values": {
        "state/oper-status": "UP"
      }
    }
  ]
}
received sync response 'true' from '10.209.37.123:9339'
{
  "timestamp": 1677644403153198642,
  "time": "2023-03-01T04:20:03.153198642Z",
  "prefix": "interfaces/interface[name=swp1]",
  "target": "netq",
  "updates": [
    {
      "Path": "state/oper-status",
      "values": {
        "state/oper-status": "UP"
      }
    }
  ]
}

Border Gateway Protocol - BGP

BGP is the routing protocol that runs the Internet. It manages how packets get routed from network to network by exchanging routing and reachability information.

BGP is an increasingly popular protocol for use in the data center as it lends itself well to the rich interconnections in a Clos topology. RFC 7938 provides further details about using BGP in the data center.

How Does BGP Work?

BGP directs packets between autonomous systems (AS), each of which is a set of routers under a common administration. Each router maintains a routing table that controls how it forwards packets. The BGP process on the router generates information in the routing table based on information coming from other routers and from information in the routing information base (RIB). The RIB stores routes and continually updates the routing table as changes occur.

Autonomous System

BGP treats each independently managed enterprise and service provider as an autonomous system, responsible for a set of network addresses. Each autonomous system has a unique number called an ASN. A central authority (ICANN) hands out ASNs, but 16-bit numbers between 64512 and 65534 are for private use (RFC 6996). When you use BGP within the data center, you must use either this number space or the single ASN you own.

The ASN is central to how BGP builds a forwarding topology. A BGP route advertisement carries with it not only the ASN of the originator, but also the list of ASNs that this route advertisement passes through. When forwarding a route advertisement, a BGP speaker adds itself to this list. This list of ASNs is the AS path. BGP uses the AS path to detect and avoid loops.
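
The loop check can be sketched in a few lines of Python. This is a conceptual model, not FRR code: the AS path is a plain list with the most recent ASN first, and the function names advertise and accepts are hypothetical.

```python
def advertise(as_path, local_asn):
    """Prepend the local ASN before forwarding a route to a peer."""
    return [local_asn] + list(as_path)


def accepts(as_path, local_asn):
    """Reject a route whose AS path already contains the local ASN."""
    return local_asn not in as_path


# A route originated by AS 65101 and forwarded by AS 65199.
path = advertise(advertise([], 65101), 65199)
```

Here path is [65199, 65101]. A speaker in AS 65102 accepts the route, while a speaker in AS 65101 sees its own ASN in the path and rejects it as a loop.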

FRR supports both 16-bit and 32-bit ASNs.

Auto BGP

In a two-tier leaf and spine environment, you can use auto BGP to generate 32-bit ASNs automatically so that you do not have to think about which numbers to configure. Auto BGP helps build optimal ASN configurations in your data center to avoid suboptimal routing and path hunting, which occurs when you assign the wrong spine ASNs. Auto BGP makes no changes to standard BGP behavior or configuration.

Auto BGP assigns private ASNs in the range 4200000000 through 4294967294. This is the private space that RFC 6996 defines. Each leaf has a random and unique value in the range 4200000001 through 4294967294. Each spine has the value 4200000000, the first number in the range. For information about configuring auto BGP, refer to Basic BGP Configuration.
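
The ranges above translate directly into a small check. The following Python sketch classifies an ASN by the role auto BGP would give it. The function name auto_bgp_role is hypothetical, and the sketch only tests range membership; it does not reproduce how Cumulus Linux picks the leaf value.

```python
AUTO_BGP_FIRST = 4200000000  # start of the 32-bit private range (RFC 6996)
AUTO_BGP_LAST = 4294967294   # end of the 32-bit private range
AUTO_BGP_SPINE_ASN = AUTO_BGP_FIRST


def auto_bgp_role(asn):
    """Return 'spine', 'leaf', or None for an ASN outside the auto BGP range."""
    if asn == AUTO_BGP_SPINE_ASN:
        return "spine"
    if AUTO_BGP_FIRST < asn <= AUTO_BGP_LAST:
        return "leaf"
    return None
```

A check like this can help when you audit a configuration, because the configuration files and nv show commands display the number rather than the leaf or spine keyword.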

eBGP and iBGP

When you use BGP to peer between autonomous systems, the peering is eBGP. When you use BGP within an autonomous system, the peering is iBGP. eBGP peers have different ASNs while iBGP peers have the same ASN.

The heart of the protocol is the same when used as eBGP or iBGP but there is a key difference in the protocol behavior between eBGP and iBGP. To prevent loops, an iBGP speaker does not forward routing information learned from one iBGP peer to another iBGP peer. eBGP prevents loops using the AS_Path attribute.

You need to peer all iBGP speakers with each other in a full mesh. In a large network, this requirement can become unscalable. The most popular method to scale iBGP networks is to introduce a route reflector.
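
To see why the full mesh becomes unscalable, compare session counts. A full mesh of n speakers needs n(n-1)/2 sessions. The Python sketch below also models a common route reflector design in which every client peers with every reflector and the reflectors peer with each other. That design is an assumption for illustration, and the function names are hypothetical.

```python
def full_mesh_sessions(speakers):
    """iBGP sessions needed when every speaker peers with every other."""
    return speakers * (speakers - 1) // 2


def route_reflector_sessions(speakers, reflectors=1):
    """Sessions when clients peer only with the reflectors, and the
    reflectors form a full mesh among themselves."""
    clients = speakers - reflectors
    return clients * reflectors + full_mesh_sessions(reflectors)
```

With 100 speakers, the full mesh needs 4950 sessions, while two route reflectors bring the count down to 197.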

BGP Path Selection

BGP is a path-vector routing algorithm that does not rely on a single routing metric to determine the lowest cost route, unlike IGPs like OSPF.

The BGP path selection algorithm looks at multiple factors to determine which path is best. Cumulus Linux enables BGP multipath by default so that multiple equal cost routes install in the routing table but only a single route advertises to BGP peers.

The order of the BGP algorithm process is as follows:

To see the reason Cumulus Linux selects one path over another, run the vtysh show ip bgp command.

When you use BGP multipath, if multiple paths are equal, BGP still selects a single best path to advertise to peers. The output marks this path as best and shows the selection reason, even though BGP can install multiple paths into the routing table.

BGP Unnumbered

Historically, peers connect over IPv4 and TCP port 179, and after they establish a session, they exchange prefixes. When a BGP peer advertises an IPv4 prefix, it must include an IPv4 next hop address, which is the address of the advertising router. This requires each BGP peer to have an IPv4 address, which in a large network can consume a lot of address space and can require a separate IP address for each peer-facing interface.

The BGP unnumbered standard in RFC 5549 uses extended next hop encoding (ENHE) and does not require that you advertise an IPv4 prefix together with an IPv4 next hop. You can configure BGP peering between your Cumulus Linux switches and exchange IPv4 prefixes without having to configure an IPv4 address on each switch; BGP uses unnumbered interfaces.

The next hop address for each prefix is an IPv6 link-local address, which BGP assigns automatically to each interface. Using the IPv6 link-local address as a next hop instead of an IPv4 unicast address, BGP unnumbered saves you from having to configure IPv4 addresses on each interface.

When you use BGP unnumbered, BGP learns the prefixes, calculates the routes and installs them in IPv4 AFI to IPv6 AFI format. ENHE in Cumulus Linux does not install routes into the kernel in IPv4 prefix to IPv6 next hop format. For link-local peerings that you enable using IPv6 neighbor discovery router advertisements, BGP converts an IPv6 next hop into an IPv4 link-local address. It then installs a static neighbor entry for this IPv4 link-local address with the MAC address that it derives from the link-local address of the other end.
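
The last step, deriving a MAC address from the peer's link-local address, can be illustrated in Python. This sketch assumes the link-local address uses the modified EUI-64 format, which embeds the MAC address with ff:fe in the middle and one flipped bit. It is not the FRR implementation, and the function name mac_from_link_local is hypothetical.

```python
import ipaddress


def mac_from_link_local(address):
    """Recover the MAC address embedded in an EUI-64 link-local address."""
    ip = ipaddress.IPv6Address(address)
    if not ip.is_link_local:
        raise ValueError(f"{address} is not an IPv6 link-local address")
    interface_id = ip.packed[8:]  # low 64 bits of the address
    if interface_id[3:5] != b"\xff\xfe":
        raise ValueError(f"{address} does not use the EUI-64 format")
    # Undo the universal/local bit flip, then drop the ff:fe filler bytes.
    mac = bytes([interface_id[0] ^ 0x02]) + interface_id[1:3] + interface_id[5:8]
    return ":".join(f"{octet:02x}" for octet in mac)
```

For example, mac_from_link_local("fe80::4638:39ff:fe00:5c") returns 44:38:39:00:00:5c. Addresses that do not follow the EUI-64 layout raise an error because they do not carry a MAC address.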

Basic BGP Configuration

This section describes how to configure BGP using either BGP numbered or BGP unnumbered. With BGP unnumbered, you can set up BGP peering between your Cumulus Linux switches and exchange IPv4 prefixes without having to configure an IPv4 address on each switch.

BGP unnumbered simplifies configuration. NVIDIA recommends you use BGP unnumbered for data center deployments.

When you enable BGP for the first time, the FRR service restarts, which might impact traffic. Any time you enable or disable BGP, or change the ASN, the FRR service also restarts.

BGP Numbered

To configure BGP numbered on a BGP node, you need to:

  1. Identify the BGP node by assigning an ASN.

    • To assign an ASN manually:

      cumulus@leaf01:~$ nv set router bgp autonomous-system 65101
      
    • To use auto BGP to assign an ASN automatically on the leaf:

      cumulus@leaf01:~$ nv set router bgp autonomous-system leaf
      

      The auto BGP leaf keyword is only used to configure the ASN. The configuration files and nv show commands display the AS number.

  2. BGP automatically assigns the loopback address of the switch to be the router ID. If you do not have a loopback address configured or you do not want to use the loopback address as the router ID, you must assign the router ID either globally with the nv set router bgp router-id command or in a VRF with the nv set vrf <vrf> router bgp router-id command.

    cumulus@leaf01:~$ nv set router bgp router-id 10.10.10.1
    
  3. Specify the BGP neighbor to which you want to distribute routing information.

    cumulus@leaf01:~$ nv set vrf default router bgp neighbor 10.0.1.0 remote-as external
    

    For BGP to advertise IPv6 prefixes, you need to run an additional command to activate the BGP neighbor under the IPv6 address family. Cumulus Linux enables the IPv4 address family by default; you do not need to run the activate command for IPv4 route exchange.

    cumulus@leaf01:~$ nv set vrf default router bgp neighbor 2001:db8:0002::0a00:0002 remote-as external
    cumulus@leaf01:~$ nv set vrf default router bgp neighbor 2001:db8:0002::0a00:0002 address-family ipv6-unicast enable on
    

    For BGP to advertise IPv4 prefixes with IPv6 next hops, see Advertise IPv4 Prefixes with IPv6 Next Hops.

  4. Specify which prefixes to originate:

    cumulus@leaf01:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.1/32
    cumulus@leaf01:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.1.10.0/24
    cumulus@leaf01:~$ nv config apply
    

    IPv6 prefix example:

    cumulus@leaf01:~$ nv set vrf default router bgp address-family ipv6-unicast network 2001:db8::1/128
    cumulus@leaf01:~$ nv config apply
    

The NVUE commands create the following configuration snippet in the /etc/nvue.d/startup.yaml file:

cumulus@leaf01:~$ sudo cat /etc/nvue.d/startup.yaml
...
- set:
    router:
      bgp:
        autonomous-system: 65101
        enable: on
        router-id: 10.10.10.1
    vrf:
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                network:
                  10.1.10.0/24: {}
                  10.10.10.1/32: {}
            enable: on
            neighbor:
              10.0.1.0:
                remote-as: external
                type: numbered
  1. Identify the BGP node by assigning an ASN.

    • To assign an ASN manually:

      cumulus@spine01:~$ nv set router bgp autonomous-system 65199
      
    • To use auto BGP to assign an ASN automatically on the spine:

      cumulus@spine01:~$ nv set router bgp autonomous-system spine
      

      The auto BGP spine keyword is only used to configure the ASN. The configuration files and nv show commands display the AS number.

  2. BGP automatically assigns the loopback address of the switch to be the router ID. If you do not have a loopback address configured or you do not want to use the loopback address as the router ID, you must assign the router ID either globally with the nv set router bgp router-id command or in a VRF with the nv set vrf <vrf> router bgp router-id command.

    cumulus@spine01:~$ nv set router bgp router-id 10.10.10.101
    
  3. Specify the BGP neighbor to which you want to distribute routing information.

    cumulus@spine01:~$ nv set vrf default router bgp neighbor 10.0.1.0 remote-as external
    

    For BGP to advertise IPv6 prefixes, you need to run an additional command to activate the BGP neighbor under the IPv6 address family. Cumulus Linux enables the IPv4 address family by default; you do not need to run the activate command for IPv4 route exchange.

    cumulus@spine01:~$ nv set vrf default router bgp neighbor 2001:db8:0002::0a00:1 remote-as external
    cumulus@spine01:~$ nv set vrf default router bgp neighbor 2001:db8:0002::0a00:1 address-family ipv6-unicast enable on
    

    For BGP to advertise IPv4 prefixes with IPv6 next hops, see Advertise IPv4 Prefixes with IPv6 Next Hops.

  4. Specify which prefixes to originate:

    cumulus@spine01:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.101/32
    cumulus@spine01:~$ nv config apply
    

    IPv6 prefix example:

    cumulus@spine01:~$ nv set vrf default router bgp address-family ipv6-unicast network 2001:db8::101/128
    cumulus@spine01:~$ nv config apply
    

The NVUE commands create the following configuration snippet in the /etc/nvue.d/startup.yaml file:

cumulus@spine01:~$ sudo cat /etc/nvue.d/startup.yaml
...
- set:
    router:
      bgp:
        autonomous-system: 65199
        enable: on
        router-id: 10.10.10.101
    vrf:
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                network:
                  10.10.10.101/32: {}
            enable: on
            neighbor:
              10.0.1.0:
                remote-as: external
                type: numbered
  1. Enable the bgpd daemon as described in FRRouting.

  2. Identify the BGP node by assigning an ASN and, if necessary, the router ID.

    BGP automatically assigns the router ID using the loopback address or the highest IPv4 address for the interface. If you want to assign a specific IPv4 address for the router ID, add the router ID globally or per VRF.

    cumulus@leaf01:~$ sudo vtysh
    ...
    leaf01# configure terminal
    leaf01(config)# router bgp 65101
    leaf01(config-router)# bgp router-id 10.10.10.1
    
  3. Specify where to distribute routing information:

    leaf01(config-router)# neighbor 10.0.1.0 remote-as external
    

    For BGP to advertise IPv6 prefixes, you need to run an additional command to activate the BGP neighbor under the IPv6 address family. Cumulus Linux enables the IPv4 address family by default; you do not need to run the activate command for IPv4 route exchange.

    leaf01(config-router)# neighbor 2001:db8:0002::0a00:1 remote-as external
    leaf01(config-router)# address-family ipv6 unicast
    leaf01(config-router-af)# neighbor 2001:db8:0002::0a00:1 activate
    

    For BGP to advertise IPv4 prefixes with IPv6 next hops, see Advertise IPv4 Prefixes with IPv6 Next Hops.

  4. Specify which prefixes to originate:

    leaf01(config-router)# address-family ipv4
    leaf01(config-router-af)# network 10.10.10.1/32
    leaf01(config-router-af)# network 10.1.10.0/24
    leaf01(config-router-af)# end
    leaf01# write memory
    leaf01# exit
    cumulus@leaf01:~$
    

    IPv6 prefix example:

    leaf01(config-router)# address-family ipv6
    leaf01(config-router-af)# network 2001:db8::1/128
    leaf01(config-router-af)# end
    leaf01# write memory
    leaf01# exit
    
  1. Enable the bgpd daemon as described in FRRouting.

  2. Identify the BGP node by assigning an ASN and, if necessary, the router ID.

    BGP automatically assigns the router ID using the loopback address or the highest IPv4 address for the interface. If you want to assign a specific IPv4 address for the router ID, add the router ID globally or per VRF.

    cumulus@spine01:~$ sudo vtysh
    ...
    spine01# configure terminal
    spine01(config)# router bgp 65199
    spine01(config-router)# bgp router-id 10.10.10.101
    
  3. Specify where to distribute routing information:

    spine01(config-router)# neighbor 10.0.1.1 remote-as external
    

    For BGP to advertise IPv6 prefixes, you need to run an additional command to activate the BGP neighbor under the IPv6 address family. Cumulus Linux enables the IPv4 address family by default; you do not need to run the activate command for IPv4 route exchange.

    spine01(config-router)# neighbor 2001:db8:0002::0a00:0002 remote-as external
    spine01(config-router)# address-family ipv6 unicast
    spine01(config-router-af)# neighbor 2001:db8:0002::0a00:0002 activate
    

    For BGP to advertise IPv4 prefixes with IPv6 next hops, see Advertise IPv4 Prefixes with IPv6 Next Hops.

  4. Specify which prefixes to originate:

    spine01(config-router)# address-family ipv4
    spine01(config-router-af)# network 10.10.10.101/32
    spine01(config-router-af)# end
    spine01# write memory
    spine01# exit
    

    IPv6 prefix example:

    spine01(config-router)# address-family ipv6
    spine01(config-router-af)# network 2001:db8::101/128
    spine01(config-router-af)# end
    spine01# write memory
    spine01# exit
    

When using auto BGP, there are no references to leaf or spine in the configurations. Auto BGP determines the ASN for the system and configures it using standard vtysh commands.

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

cumulus@spine01:~$ sudo cat /etc/frr/frr.conf
...
router bgp 65199
 bgp router-id 10.10.10.101
 neighbor 10.0.1.1 remote-as external
 !
 address-family ipv4 unicast
  network 10.10.10.101/32
 exit-address-family
...
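
The numbered configuration steps in this section follow a fixed pattern, so they lend themselves to templating. The following Python sketch builds the NVUE commands for one switch from a few parameters. The function name numbered_bgp_commands and its parameters are hypothetical; the sketch only formats commands that appear in this section and does not run them on a switch.

```python
def numbered_bgp_commands(asn, router_id, neighbors, networks, vrf="default"):
    """Return the NVUE commands for a basic BGP numbered configuration.

    Hypothetical helper: it only formats the commands shown in this
    section; it does not connect to a switch or apply anything.
    """
    cmds = [
        f"nv set router bgp autonomous-system {asn}",
        f"nv set router bgp router-id {router_id}",
    ]
    for peer in neighbors:
        cmds.append(f"nv set vrf {vrf} router bgp neighbor {peer} remote-as external")
    for prefix in networks:
        cmds.append(
            f"nv set vrf {vrf} router bgp address-family ipv4-unicast network {prefix}"
        )
    cmds.append("nv config apply")
    return cmds


# Values taken from the leaf01 example in this section.
for cmd in numbered_bgp_commands(
    asn=65101,
    router_id="10.10.10.1",
    neighbors=["10.0.1.0"],
    networks=["10.10.10.1/32", "10.1.10.0/24"],
):
    print(cmd)
```

Generating the commands from one set of parameters per switch helps keep the ASN, router ID, and advertised prefixes consistent across many nodes.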

BGP Unnumbered

The following example commands show a basic BGP unnumbered configuration for two switches, leaf01 and spine01, which are eBGP peers.

The only difference between a BGP unnumbered configuration and the BGP numbered configuration shown above is that the BGP neighbor is an interface instead of an IP address. You do not need to configure an IP address on the interface between the two peers on each side.

cumulus@leaf01:~$ nv set router bgp autonomous-system 65101
cumulus@leaf01:~$ nv set router bgp router-id 10.10.10.1
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 remote-as external
cumulus@leaf01:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.1/32
cumulus@leaf01:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.1.10.0/24
cumulus@leaf01:~$ nv config apply

For BGP to advertise IPv6 prefixes, you need to run an additional command to activate the BGP neighbor under the IPv6 address family. Cumulus Linux enables the IPv4 address family by default; you do not need to run the activate command for IPv4 route exchange.

cumulus@leaf01:~$ nv set router bgp autonomous-system 65101
cumulus@leaf01:~$ nv set router bgp router-id 10.10.10.1
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 remote-as external
cumulus@leaf01:~$ nv set vrf default router bgp address-family ipv6-unicast enable on
cumulus@leaf01:~$ nv set vrf default router bgp address-family ipv6-unicast network 2001:db8::1/128
cumulus@leaf01:~$ nv config apply

The NVUE commands create the following configuration snippet in the /etc/nvue.d/startup.yaml file:

cumulus@leaf01:~$ sudo cat /etc/nvue.d/startup.yaml
...
- set:
    router:
      bgp:
        autonomous-system: 65101
        enable: on
        router-id: 10.10.10.1
    vrf:
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                network:
                  10.1.10.0/24: {}
                  10.10.10.1/32: {}
            enable: on
            neighbor:
              swp51:
                remote-as: external
                type: unnumbered
cumulus@spine01:~$ nv set router bgp autonomous-system 65199
cumulus@spine01:~$ nv set router bgp router-id 10.10.10.101
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp1 remote-as external
cumulus@spine01:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.101/32
cumulus@spine01:~$ nv config apply

For BGP to advertise IPv6 prefixes, you need to run an additional command to activate the BGP neighbor under the IPv6 address family. Cumulus Linux enables the IPv4 address family by default; you do not need to run the activate command for IPv4 route exchange.

cumulus@spine01:~$ nv set router bgp autonomous-system 65199
cumulus@spine01:~$ nv set router bgp router-id 10.10.10.101
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp1 remote-as external
cumulus@spine01:~$ nv set vrf default router bgp address-family ipv6-unicast enable on
cumulus@spine01:~$ nv set vrf default router bgp address-family ipv6-unicast network 2001:db8::101/128
cumulus@spine01:~$ nv config apply

The NVUE commands create the following configuration snippet in the /etc/nvue.d/startup.yaml file:

cumulus@spine01:~$ sudo cat /etc/nvue.d/startup.yaml
...
- set:
    router:
      bgp:
        autonomous-system: 65199
        enable: on
        router-id: 10.10.10.101
    vrf:
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                network:
                  10.10.10.101/32: {}
            enable: on
            neighbor:
              swp1:
                remote-as: external
                type: unnumbered
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# bgp router-id 10.10.10.1
leaf01(config-router)# neighbor swp51 interface remote-as external
leaf01(config-router)# address-family ipv4
leaf01(config-router-af)# network 10.10.10.1/32
leaf01(config-router-af)# network 10.1.10.0/24
leaf01(config-router-af)# end
leaf01# write memory
leaf01# exit

For BGP to advertise IPv6 prefixes, you need to run an additional command to activate the BGP neighbor under the IPv6 address family. Cumulus Linux enables the IPv4 address family by default; you do not need to run the activate command for IPv4 route exchange.

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# bgp router-id 10.10.10.1
leaf01(config-router)# neighbor swp51 interface remote-as external
leaf01(config-router)# address-family ipv6 unicast
leaf01(config-router-af)# neighbor swp51 activate
leaf01(config-router-af)# network 2001:db8::1/128
leaf01(config-router-af)# end
leaf01# write memory
leaf01# exit

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

cumulus@leaf01:~$ sudo cat /etc/frr/frr.conf
...
router bgp 65101
 bgp router-id 10.10.10.1
 neighbor swp51 interface remote-as external
 !
 address-family ipv4 unicast
  network 10.10.10.1/32
  network 10.1.10.0/24
 exit-address-family
...
cumulus@spine01:~$ sudo vtysh
...
spine01# configure terminal
spine01(config)# router bgp 65199
spine01(config-router)# bgp router-id 10.10.10.101
spine01(config-router)# neighbor swp1 interface remote-as external
spine01(config-router)# address-family ipv4
spine01(config-router-af)# network 10.10.10.101/32
spine01(config-router-af)# end
spine01# write memory
spine01# exit

For BGP to advertise IPv6 prefixes, you need to run an additional command to activate the BGP neighbor under the IPv6 address family. Cumulus Linux enables the IPv4 address family by default; you do not need to run the activate command for IPv4 route exchange.

cumulus@spine01:~$ sudo vtysh
...
spine01# configure terminal
spine01(config)# router bgp 65199
spine01(config-router)# bgp router-id 10.10.10.101
spine01(config-router)# neighbor swp1 interface remote-as external
spine01(config-router)# address-family ipv6 unicast
spine01(config-router-af)# neighbor swp1 activate
spine01(config-router-af)# network 2001:db8::101/128
spine01(config-router-af)# end
spine01# write memory
spine01# exit

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

cumulus@spine01:~$ sudo cat /etc/frr/frr.conf
...
router bgp 65199
 bgp router-id 10.10.10.101
 neighbor swp1 interface remote-as external
 !
 address-family ipv4 unicast
  network 10.10.10.101/32
 exit-address-family
...
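
The NVUE snippets in this section label each neighbor with a type of numbered or unnumbered, depending on whether you identify the neighbor by IP address or by interface. The following Python sketch shows one way to make the same distinction when you generate or audit configuration. The function name neighbor_type is hypothetical, and the sketch assumes that any value that does not parse as an IP address is an interface name; it does not check that the interface exists.

```python
import ipaddress


def neighbor_type(neighbor: str) -> str:
    """Classify a BGP neighbor as 'numbered' (IP address) or 'unnumbered' (interface)."""
    try:
        ipaddress.ip_address(neighbor)
    except ValueError:
        return "unnumbered"  # not an IP address, so treat it as an interface name
    return "numbered"


for name in ("10.0.1.0", "2001:db8:0002::0a00:1", "swp51"):
    print(name, neighbor_type(name))
```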

Verify Configuration

To verify that the switch can see its BGP neighbors, run the vtysh show ip bgp summary command:

cumulus@leaf01:mgmt:~$ sudo vtysh
...
leaf01# show ip bgp summary
IPv4 Unicast Summary (VRF default):
BGP router identifier 10.10.10.1, local AS number 65101 vrf-id 0
BGP table version 16
RIB entries 15, using 2880 bytes of memory
Peers 2, using 40 KiB of memory

Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt Desc
spine01(swp51)  4      65199       599       603        0    0    0 00:29:35            6        8 N/A
spine02(swp52)  4      65199       582       585        0    0    0 00:28:43            3        8 N/A

Total number of neighbors 2
cumulus@spine01:mgmt:~$ sudo vtysh
...
spine01# show ip bgp summary
IPv4 Unicast Summary (VRF default):
BGP router identifier 10.10.10.101, local AS number 65199 vrf-id 0
BGP table version 7
RIB entries 13, using 2496 bytes of memory
Peers 4, using 79 KiB of memory

Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt Desc
leaf01(swp1)    4      65101       637       634        0    0    0 00:31:20            4        7 N/A
leaf02(swp2)    4      65102       639       636        0    0    0 00:31:25            4        7 N/A
swp3            4          0         0         0        0    0    0    never         Idle        0 N/A
leaf04(swp4)    4      65104       636       635        0    0    0 00:31:23            1        7 N/A

Total number of neighbors 4
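
If you check many switches, you might want to scan the summary output automatically. The following Python sketch lists the peers that are not established, using the spine01 output above as sample input. It assumes that the State/PfxRcd column shows a prefix count for an established session and a state name otherwise; the function name unestablished_peers is hypothetical, and the parsing is only checked against the sample shown here.

```python
def unestablished_peers(summary: str) -> list[tuple[str, str]]:
    """Return (neighbor, state) pairs for peers that are not exchanging prefixes.

    Assumption: in the State/PfxRcd column an established session shows a
    prefix count (digits) and any other session shows a state name.
    """
    peers = []
    in_table = False
    for line in summary.splitlines():
        fields = line.split()
        if not fields:
            in_table = False  # a blank line ends the neighbor table
            continue
        if fields[0] == "Neighbor":
            in_table = True  # header row; data rows follow
            continue
        if in_table and len(fields) >= 10 and not fields[9].isdigit():
            peers.append((fields[0], fields[9]))
    return peers


SAMPLE = """\
Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt Desc
leaf01(swp1)    4      65101       637       634        0    0    0 00:31:20            4        7 N/A
leaf02(swp2)    4      65102       639       636        0    0    0 00:31:25            4        7 N/A
swp3            4          0         0         0        0    0    0    never         Idle        0 N/A
leaf04(swp4)    4      65104       636       635        0    0    0 00:31:23            1        7 N/A

Total number of neighbors 4
"""

print(unestablished_peers(SAMPLE))
```

In the sample, only the neighbor on swp3 is reported because its session is in the Idle state.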

To verify that you can see the prefixes of the other neighbor in the routing table, run the vtysh show ip route command.

cumulus@leaf01:mgmt:~$ sudo vtysh
...
leaf01# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, A - Babel, D - SHARP, F - PBR, f - OpenFabric,
       Z - FRR,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

C>* 10.1.10.0/24 is directly connected, vlan10, 00:36:43
C>* 10.1.20.0/24 is directly connected, vlan20, 00:36:43
C>* 10.1.30.0/24 is directly connected, vlan30, 00:36:43
C>* 10.10.10.1/32 is directly connected, lo, 00:38:34
B>* 10.10.10.2/32 [20/0] via fe80::4ab0:2dff:fe51:3d2a, swp52, weight 1, 00:32:23
  *                      via fe80::4ab0:2dff:feb1:8706, swp51, weight 1, 00:32:23
B>* 10.10.10.4/32 [20/0] via fe80::4ab0:2dff:fe51:3d2a, swp52, weight 1, 00:32:19
  *                      via fe80::4ab0:2dff:feb1:8706, swp51, weight 1, 00:32:19
B>* 10.10.10.101/32 [20/0] via fe80::4ab0:2dff:feb1:8706, swp51, weight 1, 00:33:18
B>* 10.10.10.102/32 [20/0] via fe80::4ab0:2dff:fe51:3d2a, swp52, weight 1, 00:32:2
cumulus@spine01:mgmt:~$ sudo vtysh
...
spine01# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, A - Babel, D - SHARP, F - PBR, f - OpenFabric,
       Z - FRR,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

B>* 10.1.10.0/24 [20/0] via fe80::4ab0:2dff:fe90:b5c8, swp2, weight 1, 00:37:24
B>* 10.1.20.0/24 [20/0] via fe80::4ab0:2dff:fe90:b5c8, swp2, weight 1, 00:37:24
B>* 10.1.30.0/24 [20/0] via fe80::4ab0:2dff:fe90:b5c8, swp2, weight 1, 00:37:24
B>* 10.10.10.1/32 [20/0] via fe80::4ab0:2dff:feb5:d65e, swp1, weight 1, 00:37:20
B>* 10.10.10.2/32 [20/0] via fe80::4ab0:2dff:fe90:b5c8, swp2, weight 1, 00:37:24
B>* 10.10.10.4/32 [20/0] via fe80::4ab0:2dff:fea4:fab6, swp4, weight 1, 00:37:22
C>* 10.10.10.101/32 is directly connected, lo, 00:37:31
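
Because BGP multipath can install more than one next hop for a prefix, it can be useful to see which BGP routes use several uplinks. The following Python sketch maps each selected BGP prefix in show ip route output to the interfaces of its next hops. The function name bgp_next_hops is hypothetical, and the parsing assumes the line layout shown in the leaf01 output above.

```python
def bgp_next_hops(route_output: str) -> dict[str, list[str]]:
    """Map each selected BGP prefix to the interfaces of its installed next hops."""
    routes: dict[str, list[str]] = {}
    current = None
    for line in route_output.splitlines():
        fields = line.replace(",", " ").split()
        if not fields:
            continue
        if fields[0].startswith("B>"):
            current = fields[1]  # a new selected BGP route starts here
            routes[current] = []
        elif fields[0] != "*":
            current = None  # any other route type ends the current BGP entry
        if current and "via" in fields:
            # Layout: via <next hop address> <interface> ...
            routes[current].append(fields[fields.index("via") + 2])
    return routes


SAMPLE = """\
C>* 10.10.10.1/32 is directly connected, lo, 00:38:34
B>* 10.10.10.2/32 [20/0] via fe80::4ab0:2dff:fe51:3d2a, swp52, weight 1, 00:32:23
  *                      via fe80::4ab0:2dff:feb1:8706, swp51, weight 1, 00:32:23
B>* 10.10.10.101/32 [20/0] via fe80::4ab0:2dff:feb1:8706, swp51, weight 1, 00:33:18
"""

for prefix, interfaces in bgp_next_hops(SAMPLE).items():
    print(prefix, interfaces)
```

A prefix with two or more interfaces, such as 10.10.10.2/32 in the sample, is a multipath route.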

Optional BGP Configuration

This section describes optional configuration. The steps provided in this section assume that you already configured basic BGP as described in Basic BGP Configuration.

Peer Groups

Instead of specifying properties of each individual peer, you can define one or more peer groups and associate all the attributes common to that peer session to a peer group. You need to attach a peer to a peer group one time; it then inherits all address families activated for that peer group.

If the peer you want to add to a group already exists in the BGP configuration, delete it first, then add it to the peer group.

The following example commands create a peer group called SPINE that includes two external peers.

cumulus@leaf01:~$ nv set vrf default router bgp peer-group SPINE
cumulus@leaf01:~$ nv set vrf default router bgp peer-group SPINE remote-as external
cumulus@leaf01:~$ nv set vrf default router bgp neighbor 10.0.1.0 peer-group SPINE
cumulus@leaf01:~$ nv set vrf default router bgp neighbor 10.0.1.12 peer-group SPINE
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# neighbor SPINE peer-group
leaf01(config-router)# neighbor SPINE remote-as external
leaf01(config-router)# neighbor 10.0.1.0 peer-group SPINE
leaf01(config-router)# neighbor 10.0.1.12 peer-group SPINE
leaf01(config-router)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

For an unnumbered configuration, you can use a single command to configure a neighbor and attach it to a peer group.

cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 peer-group SPINE
leaf01(config-router)# neighbor swp51 interface peer-group SPINE

If you unset a peer group, make sure that it is not applied to any neighbors. If the peer group is applied to neighbors, configure all parameters, such as the remote AS, directly on the neighbors before removing the peer group.
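
Before you unset a peer group, you can check which neighbors still reference it. The following Python sketch performs that check on an in-memory view of the neighbor configuration. The dictionary layout and the function name neighbors_using_group are hypothetical; the sketch does not read the configuration from a switch.

```python
def neighbors_using_group(neighbors: dict[str, dict], group: str) -> list[str]:
    """Return the neighbors that still reference the given peer group."""
    return sorted(n for n, attrs in neighbors.items() if attrs.get("peer-group") == group)


# Hypothetical in-memory view of the neighbor configuration on leaf01.
neighbors = {
    "10.0.1.0": {"peer-group": "SPINE"},
    "10.0.1.12": {"peer-group": "SPINE"},
    "swp53": {"remote-as": "external"},
}

in_use = neighbors_using_group(neighbors, "SPINE")
if in_use:
    print("Configure remote-as directly on these neighbors first:", ", ".join(in_use))
```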

BGP Dynamic Neighbors

The BGP dynamic neighbors feature provides BGP peering to remote neighbors within a specified range of IPv4 or IPv6 addresses for a BGP peer group. You can configure each range as a subnet IP address.

After you configure the dynamic neighbors, a BGP speaker can listen for, and form peer relationships with, any neighbor that is in the IP address range and maps to a peer group. You can also limit the number of dynamic peers. The default value is 100.

The following example commands configure BGP peering to remote neighbors within the address range 10.0.1.0/24 for the peer group SPINE and limit the number of dynamic peers to 5.

The peer group must already exist; otherwise, the configuration does not apply.

cumulus@leaf01:~$ nv set vrf default router bgp dynamic-neighbor listen-range 10.0.1.0/24 peer-group SPINE
cumulus@leaf01:~$ nv set vrf default router bgp dynamic-neighbor limit 5
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# bgp listen range 10.0.1.0/24 peer-group SPINE
leaf01(config-router)# bgp listen limit 5
leaf01(config-router)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

router bgp 65101
  neighbor SPINE peer-group
  neighbor SPINE remote-as external
  bgp listen limit 5
  bgp listen range 10.0.1.0/24 peer-group SPINE
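
The following Python sketch models how a listen range and a dynamic peer limit interact, using the values from the example above. It is a simplified model for illustration only, not the FRR implementation; the class name DynamicNeighbors and its methods are hypothetical.

```python
import ipaddress


class DynamicNeighbors:
    """Toy model of a BGP listen range with a dynamic peer limit."""

    def __init__(self, listen_range: str, peer_group: str, limit: int = 100):
        self.network = ipaddress.ip_network(listen_range)
        self.peer_group = peer_group
        self.limit = limit  # 100 is the default stated in this section
        self.peers: set = set()

    def accept(self, peer: str) -> bool:
        """Return True if the peer is admitted as a dynamic neighbor."""
        addr = ipaddress.ip_address(peer)
        if addr in self.peers:
            return True  # already a known dynamic peer
        if addr not in self.network:
            return False  # outside the listen range
        if len(self.peers) >= self.limit:
            return False  # dynamic peer limit reached
        self.peers.add(addr)
        return True


spine = DynamicNeighbors("10.0.1.0/24", "SPINE", limit=5)
print([spine.accept(f"10.0.1.{i}") for i in range(1, 7)])
print(spine.accept("10.0.2.1"))
```

In this model, the first five peers in 10.0.1.0/24 are admitted, the sixth is refused because of the limit, and 10.0.2.1 is refused because it is outside the listen range.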

eBGP Multihop

The eBGP multihop option lets you use BGP to exchange routes with an external peer that is more than one hop away.

The following example commands configure Cumulus Linux to establish a connection between two eBGP peers that are not directly connected and set the maximum number of hops used to reach an eBGP peer to 1.

cumulus@leaf01:~$ nv set vrf default router bgp neighbor 10.10.10.101 remote-as external
cumulus@leaf01:~$ nv set vrf default router bgp neighbor 10.10.10.101 multihop-ttl 1
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# neighbor 10.10.10.101 remote-as external
leaf01(config-router)# neighbor 10.10.10.101 ebgp-multihop
leaf01(config-router)# end
leaf01# write memory
leaf01# exit

BGP TTL Security Hop Count

You can use the TTL security hop count option to prevent attacks against eBGP, such as denial of service (DoS) attacks. By default, BGP messages to eBGP neighbors have an IP time-to-live (TTL) of 1, which requires the peer to be directly connected; otherwise, the packets expire along the way. You can adjust the TTL with the eBGP multihop option. However, an attacker can adjust the TTL of packets so that they look like they originate from a directly connected peer.

The BGP TTL security hops option inverts the direction in which BGP counts the TTL. Instead of accepting only packets with a TTL of 1, Cumulus Linux accepts BGP messages with a TTL greater than or equal to 255 minus the specified hop count.

When you use TTL security, you do not need eBGP multihop.

The following command example sets the TTL security hop count value to 200:

cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 ttl-security hops 200
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# neighbor swp51 ttl-security hops 200
leaf01(config-router)# end
leaf01# write memory
leaf01# exit

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

...
router bgp 65101
  ...
  neighbor swp51 ttl-security hops 200
...

  • When you configure ttl-security hops on a peer group instead of a specific neighbor, FRR does not add it to either the running configuration or to the /etc/frr/frr.conf file. To work around this issue, add ttl-security hops to individual neighbors instead of the peer group.
  • Enabling ttl-security hops does not program the hardware with relevant information. Cumulus Linux forwards frames to the CPU and then drops them. Use NVUE commands to explicitly add the relevant ACL entry to hardware. For more information about ACLs, see Access Control Lists.
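
The TTL rule described in this section is simple arithmetic. The following Python sketch computes the lowest TTL that the switch accepts for a given hop count. The function names are hypothetical, and the allowed hop count range of 1 through 254 is an assumption that this guide does not state.

```python
MAX_TTL = 255


def min_accepted_ttl(hops: int) -> int:
    """Lowest TTL accepted from a peer for a given ttl-security hop count."""
    if not 1 <= hops <= 254:  # assumed valid range, not stated in this guide
        raise ValueError("hop count must be between 1 and 254")
    return MAX_TTL - hops


def accepts(ttl: int, hops: int) -> bool:
    """Return True if a BGP message with this TTL passes the check."""
    return ttl >= min_accepted_ttl(hops)


print(min_accepted_ttl(200))  # hop count from the example above
print(accepts(254, 1), accepts(253, 1))
```

With a hop count of 200, the switch accepts BGP messages with a TTL of 55 or higher. With a hop count of 1, only messages with a TTL of 254 or 255 pass, which limits the session to a directly connected peer.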

MD5-enabled BGP Neighbors

You can authenticate your BGP peer connection to prevent interference with your routing tables.

To enable MD5 authentication for BGP peers, set the same password on each peer.

The following example commands set the password mypassword on BGP peers leaf01 and spine01:

cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 password mypassword
cumulus@leaf01:~$ nv config apply
cumulus@spine01:~$ nv set vrf default router bgp neighbor swp1 password mypassword
cumulus@spine01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# neighbor swp51 password mypassword
leaf01(config-router)# end
leaf01# write memory
leaf01# exit
cumulus@spine01:~$ sudo vtysh
...
spine01# configure terminal
spine01(config)# router bgp 65199
spine01(config-router)# neighbor swp1 password mypassword
spine01(config-router)# end
spine01# write memory
spine01# exit

You can confirm the configuration with the NVUE nv show vrf default router bgp neighbor <neighbor> command or the vtysh show ip bgp neighbor <neighbor> command.

The following example shows that Cumulus Linux establishes a session with the peer. The output shows Peer Authentication Enabled towards the end.

cumulus@spine01:~$ sudo vtysh
...
spine01# show ip bgp neighbor swp1
BGP neighbor on swp1: fe80::2294:15ff:fe02:7bbf, remote AS 65101, local AS 65199, external link
Hostname: leaf01
  BGP version 4, remote router ID 10.10.10.1, local router ID 10.10.10.101
  BGP state = Established, up for 00:00:39
  Last read 00:00:00, Last write 00:00:00
  Hold time is 9, keepalive interval is 3 seconds
  Neighbor capabilities:
    4 Byte AS: advertised and received
    AddPath:
      IPv4 Unicast: RX advertised IPv4 Unicast and received
    Route refresh: advertised and received(old & new)
    Address Family IPv4 Unicast: advertised and received
    Hostname Capability: advertised (name: spine01,domain name: n/a) received (name: leaf01,domain name: n/a)
    Graceful Restart Capability: advertised and received
      Remote Restart timer is 120 seconds
      Address families by peer:
        none
  Graceful restart information:
    End-of-RIB send: IPv4 Unicast
    End-of-RIB received: IPv4 Unicast
  Message statistics:
    Inq depth is 0
    Outq depth is 0
                         Sent       Rcvd
    Opens:                  2          2
    Notifications:          0          2
    Updates:              424        369
    Keepalives:           633        633
    Route Refresh:          0          0
    Capability:             0          0
    Total:               1059       1006
  Minimum time between advertisement runs is 0 seconds
  For address family: IPv4 Unicast
  Update group 1, subgroup 1
  Packet Queue length 0
  Community attribute sent to this neighbor(all)
  3 accepted prefixes
  Connections established 2; dropped 1
  Last reset 00:02:37,   Notification received (Cease/Other Configuration Change)
Local host: fe80::7c41:fff:fe93:b711, Local port: 45586
Foreign host: fe80::2294:15ff:fe02:7bbf, Foreign port: 179
Nexthop: 10.10.10.101
Nexthop global: fe80::7c41:fff:fe93:b711
Nexthop local: fe80::7c41:fff:fe93:b711
BGP connection: shared network
BGP Connect Retry Timer in Seconds: 10
Peer Authentication Enabled
Read thread: on  Write thread: on  FD used: 27

Cumulus Linux does not enforce the MD5 password configured against a BGP listen-range peer group (used to accept and create dynamic BGP neighbors) and accepts connections from peers that do not specify a password.

Remove Private BGP ASNs

If you use private ASNs in the data center, routes advertised to neighbors contain your private ASNs. The examples below show how to remove the private ASNs from routes and how to replace the private ASNs with your public ASN.

The following example command removes private ASNs from routes advertised to the neighbor on swp51 (an unnumbered interface):

cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 address-family ipv4-unicast aspath private-as remove
cumulus@leaf01:~$ nv config apply

You can replace the private ASNs with your public ASN with the following command:

cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 address-family ipv4-unicast aspath replace-peer-as on
cumulus@leaf01:~$ nv config apply

To unset the above configuration:

cumulus@leaf01:~$ nv unset vrf default router bgp neighbor swp51 address-family ipv4-unicast aspath private-as remove
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ nv unset vrf default router bgp neighbor swp51 address-family ipv4-unicast aspath replace-peer-as on
cumulus@leaf01:~$ nv config apply

Edit the /etc/frr/frr.conf file and add the line neighbor swp51 remove-private-AS to the address-family ipv4 unicast stanza:

cumulus@leaf01:~$ sudo nano /etc/frr/frr.conf
...
router bgp 65101
 bgp router-id 10.10.10.1
 neighbor underlay peer-group
 neighbor underlay remote-as external
 neighbor swp51 interface peer-group underlay
 neighbor swp52 interface peer-group underlay
 neighbor swp53 interface peer-group underlay
 neighbor swp54 interface peer-group underlay
 !
 address-family ipv4 unicast
  redistribute connected
  neighbor swp51 remove-private-AS
 exit-address-family
 !
...
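
To make the difference between removing and replacing private ASNs concrete, the following Python sketch applies both operations to an AS path. It is a simplified model: it treats every ASN in the private ranges reserved by RFC 6996 the same way, and the actual behavior of the switch might depend on the rest of the path. The function names are hypothetical, and 64500 is a documentation ASN that stands in for your public ASN.

```python
PRIVATE_ASN_RANGES = (
    (64512, 65534),            # 16-bit private range (RFC 6996)
    (4200000000, 4294967294),  # 32-bit private range (RFC 6996)
)


def is_private(asn: int) -> bool:
    """Return True if the ASN falls in a private-use range."""
    return any(low <= asn <= high for low, high in PRIVATE_ASN_RANGES)


def remove_private(as_path: list[int]) -> list[int]:
    """Drop every private ASN from the path."""
    return [asn for asn in as_path if not is_private(asn)]


def replace_private(as_path: list[int], public_asn: int) -> list[int]:
    """Replace every private ASN in the path with the given public ASN."""
    return [public_asn if is_private(asn) else asn for asn in as_path]


path = [65101, 64500, 65199]
print(remove_private(path))
print(replace_private(path, 64500))
```

Removing shortens the AS path, while replacing keeps the path length the same, which can matter when path length influences best path selection.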

Multiple BGP ASNs

Cumulus Linux supports the use of distinct ASNs for different VRF instances.

The following example configures VRF RED and VRF BLUE on border01 to use ASN 65532 towards fw1 and 65533 towards fw2:

cumulus@border01:~$ nv set vrf RED router bgp autonomous-system 65532
cumulus@border01:~$ nv set vrf RED router bgp router-id 10.10.10.63
cumulus@border01:~$ nv set vrf RED router bgp neighbor swp3 remote-as external
cumulus@border01:~$ nv set vrf BLUE router bgp autonomous-system 65533 
cumulus@border01:~$ nv set vrf BLUE router bgp router-id 10.10.10.63
cumulus@border01:~$ nv set vrf BLUE router bgp neighbor swp4 remote-as external
cumulus@border01:~$ nv config apply

The equivalent vtysh commands are:

cumulus@border01:~$ sudo vtysh
...
border01# configure terminal
border01(config)# router bgp 65532 vrf RED
border01(config-router)# bgp router-id 10.10.10.63
border01(config-router)# neighbor swp3 interface remote-as external
border01(config-router)# exit
border01(config)# router bgp 65533 vrf BLUE
border01(config-router)# bgp router-id 10.10.10.63
border01(config-router)# neighbor swp4 interface remote-as external
border01(config-router)# end
border01# write memory
border01# exit

The vtysh commands save the configuration in the /etc/frr/frr.conf file:

cumulus@border01:~$ cat /etc/frr/frr.conf
...
log syslog informational
!
vrf RED
  vni 4001
vrf BLUE
  vni 4002
!
router bgp 65132
 bgp router-id 10.10.10.63
 bgp bestpath as-path multipath-relax
 neighbor underlay peer-group
 neighbor underlay remote-as external
 neighbor peerlink.4094 interface remote-as internal
 neighbor swp51 interface peer-group underlay
 neighbor swp52 interface peer-group underlay
 !
 address-family ipv4 unicast
  redistribute connected
 exit-address-family
 !
 address-family l2vpn evpn
  neighbor underlay activate
  advertise-all-vni
 exit-address-family
!
router bgp 65532 vrf RED
 bgp router-id 10.10.10.63
 neighbor swp3 interface remote-as external
 !
 address-family ipv4 unicast
  redistribute static
 exit-address-family
 !
 address-family l2vpn evpn
  neighbor underlay activate
  advertise-all-vni
 exit-address-family
!
router bgp 65533 vrf BLUE
 bgp router-id 10.10.10.63
 neighbor swp4 interface remote-as external
 !
 address-family ipv4 unicast
  redistribute static
 exit-address-family
 !
 address-family l2vpn evpn
  neighbor underlay activate
  advertise-all-vni
 exit-address-family
!
line vty

With the above configuration, the vtysh show ip bgp vrf RED summary command output shows the local ASN as 65532.

cumulus@border01:mgmt:~$ sudo vtysh
...
border01# show ip bgp vrf RED summary
ipv4 unicast summary

BGP router identifier 10.10.10.63, local AS number 65532 vrf-id 35
BGP table version 1
RIB entries 1, using 192 bytes of memory
Peers 1, using 21 KiB of memory

Neighbor      V      AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt
fw1(swp3)     4   65199      2015      2015        0    0    0 01:40:36            1        1

Total number of neighbors 1
...

The vtysh show ip bgp summary command displays the global table, where the local ASN 65132 peers with spine01.

cumulus@border01:mgmt:~$ sudo vtysh
...
border01# show ip bgp summary
ipv4 unicast summary

BGP router identifier 10.10.10.63, local AS number 65132 vrf-id 0
BGP table version 3
RIB entries 5, using 960 bytes of memory
Peers 1, using 43 KiB of memory
Peer groups 1, using 64 bytes of memory

Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt
spine01(swp51)  4      65199      2223      2223        0    0    0 01:50:18            1        3

Total number of neighbors 1
...

BGP allowas-in

To prevent loops, the switch automatically discards BGP network prefixes if it sees its own ASN in the AS path. However, you can configure Cumulus Linux to receive and process routes even if it detects its own ASN in the AS path (allowas-in).

To enable allowas-in:

cumulus@switch:~$ nv set vrf default router bgp neighbor swp51 address-family ipv4-unicast aspath allow-my-asn enable on
cumulus@switch:~$ nv config apply

To disable allowas-in, run the nv unset command:

cumulus@switch:~$ nv unset vrf default router bgp neighbor swp51 address-family ipv4-unicast aspath allow-my-asn enable on
cumulus@switch:~$ nv config apply

To enable allowas-in with vtysh commands:

cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# router bgp 65101
switch(config-router)# address-family ipv4 unicast
switch(config-router-af)# neighbor swp51 allowas-in
switch(config-router-af)# end
switch# write memory
switch# exit

The vtysh commands save the configuration in the address-family stanza of the /etc/frr/frr.conf file. For example:

...
address-family ipv4 unicast
  network 10.10.10.1/32
  redistribute connected
  neighbor swp51 allowas-in
...

You can configure additional options:

The following example sets the maximum number of occurrences of the local system’s AS number in the received AS path to 4:

cumulus@switch:~$ nv set vrf default router bgp neighbor swp51 address-family ipv4-unicast aspath allow-my-asn occurrences 4
cumulus@switch:~$ nv config apply

To unset the above configuration, run the nv unset command:

cumulus@switch:~$ nv unset vrf default router bgp neighbor swp51 address-family ipv4-unicast aspath allow-my-asn occurrences 4
cumulus@switch:~$ nv config apply

To set the maximum number of occurrences to 4 with vtysh commands:

cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# router bgp 65101
switch(config-router)# address-family ipv4 unicast
switch(config-router-af)# neighbor swp51 allowas-in 4
switch(config-router-af)# end
switch# write memory
switch# exit

The vtysh commands save the configuration in the address-family stanza of the /etc/frr/frr.conf file. For example:

...
address-family ipv4 unicast
  network 10.10.10.1/32
  redistribute connected
  neighbor swp51 allowas-in 4
...

The following example allows a received AS path containing the ASN of the local system but only if it is the originating AS:

cumulus@switch:~$ nv set vrf default router bgp neighbor swp51 address-family ipv4-unicast aspath allow-my-asn origin on
cumulus@switch:~$ nv config apply

To unset the above configuration, run the nv unset command:

cumulus@switch:~$ nv unset vrf default router bgp neighbor swp51 address-family ipv4-unicast aspath allow-my-asn origin on
cumulus@switch:~$ nv config apply

To allow the local ASN only when it is the originating AS with vtysh commands:

cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# router bgp 65101
switch(config-router)# address-family ipv4 unicast
switch(config-router-af)# neighbor swp51 allowas-in origin
switch(config-router-af)# end
switch# write memory
switch# exit

The vtysh commands save the configuration in the address-family stanza of the /etc/frr/frr.conf file. For example:

...
address-family ipv4 unicast
  network 10.10.10.1/32
  redistribute connected
  neighbor swp51 allowas-in origin
...
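
The three allowas-in variants above can be summarized as a small acceptance check. The following Python sketch is an illustrative model under stated assumptions, not the FRR code: the function and parameter names are hypothetical, the AS path is a list with the originating AS last, and occurrences=None stands for allowas-in being disabled.

```python
def accept_as_path(as_path, local_asn, occurrences=None, origin_only=False):
    """Decide whether a received AS path passes the loop check.

    occurrences: maximum allowed appearances of local_asn (None = allowas-in off).
    origin_only: accept the local ASN only when it is the originating AS.
    """
    count = as_path.count(local_asn)
    if count == 0:
        return True  # no loop suspected, always accepted
    if origin_only:
        # The originating AS is the last (rightmost) entry in the path.
        return as_path[-1] == local_asn
    if occurrences is not None:
        return count <= occurrences
    return False  # default behavior: discard paths containing the local ASN


print(accept_as_path([65199, 65101], 65101))
print(accept_as_path([65199, 65101], 65101, occurrences=4))
print(accept_as_path([65101, 65199], 65101, origin_only=True))
```

The origin option is the most restrictive of the three in practice, because it accepts the local ASN only when the route originated in the local AS.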

Update Source

You can configure BGP to use a specific IP address when exchanging BGP updates with a neighbor. For example, in a numbered BGP configuration, you can set the source IP address to be the loopback address of the switch.

cumulus@leaf01:~$ nv set vrf default router bgp neighbor 10.10.10.10 update-source 10.10.10.1
cumulus@leaf01:~$ nv config apply

The equivalent vtysh commands are:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# neighbor 10.10.10.10 update-source 10.10.10.1
leaf01(config-router)# end
leaf01# write memory
leaf01# exit

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

...
router bgp 65101
 bgp router-id 10.10.10.1
 neighbor 10.10.10.10 remote-as 65000
 neighbor 10.10.10.10 update-source 10.10.10.1
 ...

ECMP

BGP supports equal-cost multipathing (ECMP). If a BGP node hears a certain prefix from multiple peers, it has the information necessary to program the routing table and forward traffic for that prefix through all these peers. BGP typically chooses one best path for each prefix and installs that route in the forwarding table.

Cumulus Linux enables the BGP multipath option by default and sets the maximum number of paths to 64 so that the switch can install multiple equal-cost BGP paths to the forwarding table and load balance traffic across multiple links. You can change the number of paths allowed, according to your needs.

The example commands change the maximum number of paths to 120. You can set a value between 1 and 256; a value of 1 disables the BGP multipath option.

cumulus@switch:~$ nv set vrf default router bgp address-family ipv4-unicast multipaths ibgp 120 
cumulus@switch:~$ nv config apply

The equivalent vtysh commands are:

cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# router bgp 65101
switch(config-router)# address-family ipv4
switch(config-router-af)# maximum-paths 120
switch(config-router-af)# end
switch# write memory
switch# exit

The vtysh commands save the configuration in the address-family stanza of the /etc/frr/frr.conf file. For example:

...
address-family ipv4 unicast
 network 10.1.10.0/24
 network 10.10.10.1/32
 maximum-paths 120
exit-address-family
...

When you enable BGP multipath, Cumulus Linux load balances BGP routes from the same AS. If the routes go across several different AS neighbors, even if the AS path length is the same, they are not load balanced. To load balance between multiple paths received from different AS neighbors:

cumulus@switch:~$ nv set vrf default router bgp path-selection multipath aspath-ignore on
cumulus@switch:~$ nv config apply

The equivalent vtysh commands are:

cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# router bgp 65101
switch(config-router)# bgp bestpath as-path multipath-relax
switch(config-router)# end
switch# write memory
switch# exit

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

...
router bgp 65101
  bgp router-id 10.0.0.1
  bgp bestpath as-path multipath-relax
...

When you disable the bestpath as-path multipath-relax option, EVPN type-5 routes do not use the updated configuration. Type-5 routes continue to use all available ECMP paths in the underlay fabric, regardless of ASN.
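
The interaction between the maximum number of paths and the multipath-relax (aspath-ignore) option can be sketched as follows. This Python example is a deliberately simplified model of the behavior described above, not the BGP decision process: it compares only AS path length and the neighbor AS, and the Path type, function name, and peer labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    peer: str       # hypothetical label for the advertising neighbor
    as_path: tuple  # ASNs, nearest neighbor AS first


def multipath_set(paths, max_paths=64, aspath_ignore=False):
    """Return the paths that would share traffic in this simplified model."""
    if not paths:
        return []
    # Treat the first path with the shortest AS path as the best path.
    best = min(paths, key=lambda p: len(p.as_path))
    eligible = [
        p for p in paths
        if len(p.as_path) == len(best.as_path)
        # Without aspath_ignore, the neighbor AS must match the best path.
        and (aspath_ignore or p.as_path[:1] == best.as_path[:1])
    ]
    return eligible[:max_paths]


paths = [
    Path("spine01", (65199, 65103)),
    Path("spine02", (65198, 65103)),
    Path("spine03", (65199, 65103)),
]
print([p.peer for p in multipath_set(paths)])
print([p.peer for p in multipath_set(paths, aspath_ignore=True)])
print([p.peer for p in multipath_set(paths, max_paths=1)])
```

In this model, a maximum of 1 leaves only the best path, which corresponds to disabling multipath.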

RFC 5549 defines how BGP advertises IPv4 prefixes with IPv6 next hops. The RFC does not make a distinction between whether the IPv6 peering and next hop values must be global unicast addresses (GUA) or link-local addresses. Cumulus Linux supports advertising IPv4 prefixes with IPv6 global unicast and link-local next hop addresses, with either unnumbered or numbered BGP.

When BGP peering uses IPv6 global addresses, and BGP advertises and installs IPv4 prefixes, Cumulus Linux uses IPv6 router advertisements to derive the MAC address of the peer so that FRR can create an IPv4 route with a link-local IPv4 next hop address (defined by RFC 3927). FRR configures these router advertisement settings automatically upon receiving an update from a BGP peer that uses IPv6 global addresses with an IPv4 prefix and an IPv6 next hop, and after it negotiates the extended next hop capability.

To enable advertisement of IPv4 prefixes with IPv6 next hops over global IPv6 peerings, add the extended-nexthop capability to the global IPv6 neighbor statements on each end of the BGP sessions.

cumulus@switch:~$ nv set vrf default router bgp neighbor 2001:db8:0002::0a00:0002 capabilities extended-nexthop on
cumulus@switch:~$ nv config apply

The equivalent vtysh commands are:

cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# router bgp 65101
switch(config-router)# neighbor 2001:db8:0002::0a00:0002 capability extended-nexthop
switch(config-router)# end
switch# write memory
switch# exit

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

...
router bgp 65101
  ...
  neighbor 2001:db8:0002::0a00:0002 capability extended-nexthop
...

Ensure that the IPv6 peers are active under the IPv4 unicast address family. By default, all peers activate in the IPv4 unicast address family. If you configure no bgp default ipv4-unicast, you must explicitly activate the IPv6 neighbor under the IPv4 unicast address family as shown below:

cumulus@switch:~$ nv set vrf default router bgp neighbor 2001:db8:0002::0a00:0002 capabilities extended-nexthop on
cumulus@switch:~$ nv set vrf default router bgp neighbor 2001:db8:0002::0a00:0002 address-family ipv4-unicast enable on
cumulus@switch:~$ nv config apply

The equivalent vtysh commands are:

cumulus@switch:~$ sudo vtysh
...
switch# configure terminal
switch(config)# router bgp 65101
switch(config-router)# neighbor 2001:db8:0002::0a00:0002 capability extended-nexthop
switch(config-router)# address-family ipv4 unicast
switch(config-router-af)# neighbor 2001:db8:0002::0a00:0002 activate
switch(config-router-af)# end
switch# write memory
switch# exit

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

...
router bgp 65101
 bgp router-id 10.10.10.1
 no bgp default ipv4-unicast
 neighbor 2001:db8:0002::0a00:0002 remote-as external
 neighbor 2001:db8:0002::0a00:0002 capability extended-nexthop
 !
 address-family ipv4 unicast
  neighbor 2001:db8:0002::0a00:0002 activate
 exit-address-family
...

Neighbor Maximum Prefixes

To protect against an internal network connectivity disruption caused by BGP, you can control the number of route announcements (prefixes) you want to receive from a BGP neighbor.

The following example commands set the maximum number of prefixes allowed from the BGP neighbor on swp51 to 3000:

cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 address-family ipv4-unicast prefix-limits inbound maximum 3000
cumulus@leaf01:~$ nv config apply

The equivalent vtysh commands are:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# neighbor swp51 maximum-prefix 3000
leaf01(config-router)# end
leaf01# write memory
leaf01# exit

Aggregate Addresses

To minimize the size of the routing table and save bandwidth, you can aggregate a range of networks in your routing table into a single prefix.

The following example command aggregates a range of addresses, such as 10.1.1.0/24, 10.1.2.0/24, and 10.1.3.0/24, into the single prefix 10.1.0.0/16:

cumulus@leaf01:~$ nv set vrf default router bgp address-family ipv4-unicast aggregate-route 10.1.0.0/16 
cumulus@leaf01:~$ nv config apply

The summary-only option ensures that BGP suppresses the longer prefixes inside the aggregate address before sending updates:

cumulus@leaf01:~$ nv set vrf default router bgp address-family ipv4-unicast aggregate-route 10.1.0.0/16 summary-only on
cumulus@leaf01:~$ nv config apply
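
To see which prefixes an aggregate covers, and what summary-only changes, consider the following Python sketch. It is an illustration under stated assumptions rather than the BGP implementation: the function name is hypothetical, the 10.2.1.0/24 route is added only to show a prefix outside the aggregate, and the model assumes that BGP generates the aggregate only when at least one more specific route exists.

```python
import ipaddress


def advertised_prefixes(routes, aggregate, summary_only=False):
    """Return the prefixes sent to peers in this simplified model."""
    covered = [r for r in routes if r.subnet_of(aggregate)]
    if not covered:
        # No contributing route, so the aggregate is not generated.
        return list(routes)
    if summary_only:
        # Suppress the longer prefixes that fall inside the aggregate.
        return [aggregate] + [r for r in routes if not r.subnet_of(aggregate)]
    return [aggregate] + list(routes)


aggregate = ipaddress.ip_network("10.1.0.0/16")
routes = [
    ipaddress.ip_network(p)
    for p in ("10.1.1.0/24", "10.1.2.0/24", "10.1.3.0/24", "10.2.1.0/24")
]

print([str(n) for n in advertised_prefixes(routes, aggregate)])
print([str(n) for n in advertised_prefixes(routes, aggregate, summary_only=True)])
```

With summary-only, the three /24 prefixes inside 10.1.0.0/16 are replaced by the single aggregate, while the unrelated 10.2.1.0/24 route is still advertised.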

Suppress Route Advertisement

You can configure BGP to wait for a response from the RIB indicating that the routes installed in the RIB are also installed in the ASIC before sending updates to peers.

cumulus@leaf01:~$ nv set router bgp wait-for-install on
cumulus@leaf01:~$ nv config apply

When you configure suppress route advertisement, NVUE reloads switchd.

To configure suppress route advertisement with vtysh commands and Linux configuration files:

  1. Run the following vtysh commands:

    cumulus@leaf01:~$ sudo vtysh
    ...
    leaf01# configure terminal
    leaf01(config)# router bgp 65101
    leaf01(config-router)# bgp suppress-fib-pending
    leaf01(config-router)# end
    leaf01# write memory
    leaf01# exit
    

    The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

    ...
    router bgp 65199
    bgp router-id 10.10.10.101
    neighbor swp51 remote-as external
    bgp suppress-fib-pending
    ...
    
  2. Edit the /etc/cumulus/switchd.d/kernel_route_offload_flags.conf file to set the kernel_route_offload_flags parameter to 2:

    cumulus@leaf01:~$ sudo nano /etc/cumulus/switchd.d/kernel_route_offload_flags.conf  
    # Set routing-forwarding-sync mode for routes.
    #  0: No notification on HW install success or failure (default mode)
    #  1: Notify HW install failure
    #  2: Notify HW install success/failure
    kernel_route_offload_flags = 2
    
  3. Restart switchd:

    cumulus@leaf01:~$ sudo systemctl restart switchd.service
    

ISSU suppresses route advertisement automatically when upgrading or troubleshooting an active switch so that there is minimal disruption to the network.

BGP add-path

Cumulus Linux supports both BGP add-path RX and BGP add-path TX.

BGP add-path RX

BGP add-path RX enables BGP to receive multiple paths for the same prefix. A path identifier ensures that additional paths do not override previously advertised paths. Cumulus Linux enables BGP add-path RX by default; you do not need to perform additional configuration.

To view the existing capabilities, run the vtysh show ip bgp neighbors command. You can see the existing capabilities in the subsection Add Path, below Neighbor capabilities.

The following example output shows that BGP can send and receive additional BGP paths, and that the BGP neighbor on swp51 supports both.

cumulus@leaf01:~$ sudo vtysh
...
leaf01# show ip bgp neighbors
BGP neighbor on swp51: fe80::7c41:fff:fe93:b711, remote AS 65199, local AS 65101, external link
Hostname: spine01
  BGP version 4, remote router ID 10.10.10.101, local router ID 10.10.10.1
  BGP state = Established, up for 1d12h39m
  Last read 00:00:03, Last write 00:00:01
  Hold time is 9, keepalive interval is 3 seconds
  Neighbor capabilities:
    4 Byte AS: advertised and received
    AddPath:
      IPv4 Unicast: RX advertised IPv4 Unicast and received
    Extended nexthop: advertised and received
      Address families by peer:
                   IPv4 Unicast
    Route refresh: advertised and received(old & new)
    Address Family IPv4 Unicast: advertised and received
    Hostname Capability: advertised (name: leaf01,domain name: n/a) received (name: spine01,domain name: n/a)
    Graceful Restart Capability: advertised and received
...

To view the current additional paths, run the vtysh show ip bgp <prefix> command. The example output shows that the TX node adds an additional path for receiving. Each path has a unique AddPath ID.

cumulus@leaf01:mgmt:~$ sudo vtysh
...
leaf01# show ip bgp 10.10.10.9
BGP routing table entry for 10.10.10.9/32
Paths: (2 available, best #1, table Default-IP-Routing-Table)
  Advertised to non peer-group peers:
  spine01(swp51) spine02(swp52)
  65020 65012
    fe80::4638:39ff:fe00:5c from spine01(swp51) (10.10.10.12)
    (fe80::4638:39ff:fe00:5c) (used)
      Origin incomplete, localpref 100, valid, external, multipath, bestpath-from-AS 65020, best (Older Path)
      AddPath ID: RX 0, TX 6
      Last update: Wed Nov 16 22:47:00 2016
  65020 65012
    fe80::4638:39ff:fe00:2b from spine02(swp52) (10.10.10.12)
    (fe80::4638:39ff:fe00:2b) (used)
      Origin incomplete, localpref 100, valid, external, multipath
      AddPath ID: RX 0, TX 3
      Last update: Fri Oct  2 03:56:33 2020

BGP add-path TX

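BGP add-path TX enables BGP to advertise more than just the best path for a prefix. Cumulus Linux includes two options:
  • Advertise the best path learned from each AS (best-per-as)
  • Advertise all paths learned from each AS (all-paths)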

The following example commands configure leaf01 to advertise the best path learned from each AS to the BGP neighbor on swp50:

cumulus@leaf01:~$ nv set vrf default router bgp autonomous-system 65101
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp50 address-family ipv4-unicast add-path-tx best-per-as
cumulus@leaf01:~$ nv config apply

The equivalent vtysh commands are:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# neighbor swp50 addpath-tx-bestpath-per-AS
leaf01(config-router)# end
leaf01# write memory
leaf01# exit

The following example commands configure leaf01 to advertise all paths learned from each AS to the BGP neighbor on swp50:

cumulus@leaf01:~$ nv set vrf default router bgp autonomous-system 65101
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp50 address-family ipv4-unicast add-path-tx all-paths
cumulus@leaf01:~$ nv config apply

The equivalent vtysh commands are:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# neighbor swp50 addpath-tx-all-paths
leaf01(config-router)# end
leaf01# write memory
leaf01# exit

The following example configuration shows how BGP add-path TX advertises the best path learned from each AS.

In this configuration:
  • Every leaf and every spine has a different ASN
  • eBGP is configured between:
    • leaf01 and spine01, spine02
    • leaf03 and spine01, spine02
    • leaf01 and leaf02 (leaf02 only has a single peer, which is leaf01)
  • leaf01 is configured to advertise the best path learned from each AS to BGP neighbor leaf02
  • leaf03 generates a loopback IP address (10.10.10.3/32) into BGP with a network statement

When you run the show ip bgp 10.10.10.3/32 command on leaf02, the command output shows the leaf03 loopback IP address and two BGP paths, both from leaf01:

cumulus@leaf02:mgmt:~$ sudo vtysh
...
leaf02# show ip bgp 10.10.10.3/32
BGP routing table entry for 10.10.10.3/32
Paths: (2 available, best #2, table default)
       Advertised to non peer-group peers:
       leaf01(swp50)
  65101 65199 65103
    fe80::4638:39ff:fe00:13 from leaf01(swp50) (10.10.10.1)
    (fe80::4638:39ff:fe00:13) (used)
      Origin IGP, valid, external
      AddPath ID: RX 4, TX-All 0 TX-Best-Per-AS 0
      Last update: Thu Oct 15 18:31:46 2020
  65101 65198 65103
    fe80::4638:39ff:fe00:13 from leaf01(swp50) (10.10.10.1)
    (fe80::4638:39ff:fe00:13) (used)
      Origin IGP, valid, external, bestpath-from-AS 65101, best (Nothing left to compare)
      AddPath ID: RX 3, TX-All 0 TX-Best-Per-AS 0
      Last update: Thu Oct 15 18:31:46 2020

Conditional Advertisement

Routes are typically propagated even if a different path exists. The BGP conditional advertisement feature lets you advertise certain routes only if other routes either do or do not exist.

This feature is typically used in multihomed networks where BGP advertises some prefixes to one of the providers only if information from the other provider is not present. For example, a multihomed router can use conditional advertisement to choose which upstream provider learns about the routes it provides so that it can influence which provider handles traffic destined for the downstream router. This is useful for cost of service, latency, or other policy requirements that are not natively accounted for in BGP.

Conditional advertisement uses the non-exist-map or the exist-map and the advertise-map keywords to track routes by route prefix. You configure the BGP neighbors to use the route maps.

Use caution when configuring conditional advertisement on a large number of BGP neighbors. Cumulus Linux scans the entire RIB table every 60 seconds by default; depending on the number of routes in the RIB, this can result in longer processing times. NVIDIA does not recommend that you configure conditional advertisement on more than 50 neighbors.

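The following example commands configure the switch to send a 10.0.0.0/8 summary route only if the 10.0.0.0/24 route exists in the routing table. The commands enable conditional advertisement for the neighbor on swp51, create the EXIST and ADVERTISE prefix lists and the EXISTMAP and ADVERTISEMAP route maps that match them, and then apply the advertise map and the exist map to the neighbor: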

cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 address-family ipv4-unicast conditional-advertise enable on 
cumulus@leaf01:~$ nv set router policy prefix-list EXIST rule 10 match 10.0.0.0/24
cumulus@leaf01:~$ nv set router policy prefix-list EXIST rule 10 action permit
cumulus@leaf01:~$ nv set router policy route-map EXISTMAP rule 10 match type ipv4
cumulus@leaf01:~$ nv set router policy route-map EXISTMAP rule 10 action permit
cumulus@leaf01:~$ nv set router policy route-map EXISTMAP rule 10 match ip-prefix-list EXIST
cumulus@leaf01:~$ nv set router policy prefix-list ADVERTISE rule 10 action permit
cumulus@leaf01:~$ nv set router policy prefix-list ADVERTISE rule 10 match 10.0.0.0/8
cumulus@leaf01:~$ nv set router policy route-map ADVERTISEMAP rule 10 match type ipv4
cumulus@leaf01:~$ nv set router policy route-map ADVERTISEMAP rule 10 action permit
cumulus@leaf01:~$ nv set router policy route-map ADVERTISEMAP rule 10 match ip-prefix-list ADVERTISE
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 address-family ipv4-unicast conditional-advertise advertise-map ADVERTISEMAP
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 address-family ipv4-unicast conditional-advertise exist-map EXISTMAP
cumulus@leaf01:~$ nv config apply

The equivalent vtysh commands are:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# ip prefix-list EXIST seq 10 permit 10.0.0.0/24
leaf01(config)# route-map EXISTMAP permit 10
leaf01(config-route-map)# match ip address prefix-list EXIST
leaf01(config-route-map)# exit
leaf01(config)# ip prefix-list ADVERTISE seq 10 permit 10.0.0.0/8
leaf01(config)# route-map ADVERTISEMAP permit 10
leaf01(config-route-map)# match ip address prefix-list ADVERTISE
leaf01(config-route-map)# exit
leaf01(config)# router bgp
leaf01(config-router)# neighbor swp51 advertise-map ADVERTISEMAP exist-map EXISTMAP
leaf01(config-router)# end
leaf01# write memory
leaf01# exit

The commands save the configuration in the /etc/frr/frr.conf file. For example:

cumulus@leaf01:~$ sudo cat /etc/frr/frr.conf
...
neighbor swp51 activate
neighbor swp51 advertise-map ADVERTISEMAP exist-map EXISTMAP
...
ip prefix-list ADVERTISE seq 10 permit 10.0.0.0/8
ip prefix-list EXIST seq 10 permit 10.0.0.0/24
route-map ADVERTISEMAP permit 10
match ip address prefix-list ADVERTISE
route-map EXISTMAP permit 10
match ip address prefix-list EXIST
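
The decision that the exist-map and non-exist-map keywords express can be sketched in a few lines of Python. This is a minimal model for illustration, assuming an exact-match lookup of a single condition prefix in a set that stands in for the RIB; the function name and the mode strings are hypothetical, and real route maps can match far more than one prefix.

```python
import ipaddress


def should_advertise(rib, condition_prefix, mode="exist"):
    """Return True if the advertise-map routes should be sent.

    mode="exist": advertise only while the condition prefix is in the RIB.
    mode="non-exist": advertise only while the condition prefix is absent.
    """
    present = ipaddress.ip_network(condition_prefix) in rib
    return present if mode == "exist" else not present


rib = {
    ipaddress.ip_network("10.0.0.0/24"),
    ipaddress.ip_network("10.10.10.1/32"),
}

# The 10.0.0.0/8 summary is sent because 10.0.0.0/24 is in the RIB.
print(should_advertise(rib, "10.0.0.0/24"))
# With a non-exist-map, the same RIB would withhold the summary.
print(should_advertise(rib, "10.0.0.0/24", mode="non-exist"))
```

Because the result depends on the current contents of the RIB, the check has to be repeated periodically, which is why the scan interval matters on switches with many neighbors.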

Cumulus Linux scans the entire RIB table every 60 seconds. You can set the conditional advertisement timer to increase or decrease how often you want Cumulus Linux to scan the RIB table. You can set a value between 5 and 240 seconds.

A lower value (such as 5) increases the amount of processing needed. Use caution when configuring conditional advertisement on a large number of BGP neighbors.

cumulus@leaf01:~$ nv set vrf default router bgp timers conditional-advertise 100
cumulus@leaf01:~$ nv config apply

The equivalent vtysh commands are:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp
leaf01(config-router)# bgp conditional-advertisement timer 100
leaf01(config-router)# end
leaf01# write memory
leaf01# exit

The commands save the configuration in the /etc/frr/frr.conf file. For example:

cumulus@leaf01:~$ sudo cat /etc/frr/frr.conf
...
router bgp 65101
 bgp router-id 10.10.10.1
 bgp conditional-advertisement timer 100
 neighbor swp51 interface remote-as external
 neighbor swp51 advertisement-interval 0
 neighbor swp51 timers 3 9
 neighbor swp51 timers connect 10
 neighbor swp52 interface remote-as external
 neighbor swp52 advertisement-interval 0
 neighbor swp52 timers 3 9
 neighbor swp52 timers connect 10
...

Next Hop Tracking

By default, next hop tracking does not resolve next hops through the default route. If you want BGP to peer across the default route, run the vtysh ip nht resolve-via-default command.

The following example command configures BGP to peer across the default route from the default VRF.

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# ip nht resolve-via-default
leaf01(config)# exit
leaf01# write memory
leaf01# exit

The following example command configures BGP to peer across the default route from VRF BLUE:

cumulus@leaf01:~$ sudo vtysh
leaf01# configure terminal
leaf01(config)# vrf BLUE
leaf01(config-vrf)# ip nht resolve-via-default
leaf01(config-vrf)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

BGP Prefix Independent Convergence

BGP prefix independent convergence (PIC) reduces data plane convergence times and improves unicast traffic convergence for remote link failures (when the BGP next hop fails). A remote link is a link between a spine and a remote leaf, or a spine and the super spine layer.

When you configure BGP PIC, Cumulus Linux assigns one next hop group for each source and the remote leaf advertises the router ID loopback route. The remote leaf tags prefix routes with a route-origin extended community so that the local leaf recognizes the routes. When the network topology changes, the local leaf obtains the router ID loopback route with the updated ECMP, allowing an O(1) next hop group replace operation for all prefixes from the remote leaf without waiting for individual BGP updates.

To enable PIC:

On a leaf switch, enable the BGP advertise origin option so that BGP can attach the Site-of-Origin (SOO) extended community to all routes advertised to its peers from the source where the routes originate.

The following example enables BGP advertise origin for IPv4:

cumulus@leaf01:~$ nv set vrf default router bgp address-family ipv4-unicast advertise-origin
cumulus@leaf01:~$ nv config apply

For IPv6, run the nv set vrf <vrf> router bgp address-family ipv6-unicast advertise-origin command.

On all switches (leaf, spine, and super spine), enable the next hop group per source option so that when BGP receives routes with the SOO extended community, it allocates a next hop group for each source.

The following example enables the next hop group per source option for IPv4:

cumulus@leaf01:~$ nv set vrf default router bgp address-family ipv4-unicast nhg-per-origin
cumulus@leaf01:~$ nv config apply

For IPv6, run the nv set vrf <vrf> router bgp address-family ipv6-unicast nhg-per-origin command.

To enable PIC with vtysh commands:

On a leaf switch, enable the BGP advertise origin option so that BGP can attach the Site-of-Origin (SOO) extended community to all routes advertised to its peers from the source where the routes originate.

The following example enables BGP advertise origin for IPv4:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# address-family ipv4
leaf01(config-router-af)# bgp advertise-origin
leaf01(config-router-af)# end
leaf01# write memory
leaf01# exit

On all switches (leaf, spine and super spine), enable the next hop group per source option so that when BGP receives routes with the SOO extended community, it allocates a next hop group for each source.

The following example enables the next hop group per source option for IPv4:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# address-family ipv4
leaf01(config-router-af)# bgp nhg-per-origin
leaf01(config-router-af)# end
leaf01# write memory
leaf01# exit

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

...
router bgp 65101
  ...
  bgp advertise-origin
  bgp nhg-per-origin
...

BGP Timers

BGP includes several timers that you can configure.

Keepalive Interval and Hold Time

By default, BGP exchanges periodic keepalive messages to verify that a peer is still alive and functioning. If BGP does not receive a keepalive or update message from the peer within the hold time, it declares the peer down and withdraws all routes received from this peer from the local BGP table. By default, the keepalive interval is 3 seconds and the hold time is 9 seconds. To decrease CPU load when there are many neighbors, you can increase the values of these timers or disable the exchange of keepalives. When you configure new values manually, the keepalive interval must be less than or equal to one third of the hold time, and cannot be less than 1 second. Setting both the keepalive interval and the hold time to 0 disables the exchange of keepalives.

The following example commands set the keepalive interval to 10 seconds and the hold time to 30 seconds.

cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 timers keepalive 10
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 timers hold 30
cumulus@leaf01:~$ nv config apply

To set the timers back to the default values, run the nv unset vrf <vrf> router bgp neighbor <interface> timers keepalive and the nv unset vrf <vrf> router bgp neighbor <interface> timers hold commands.

To set the keepalive interval to 10 seconds and the hold time to 30 seconds with vtysh commands:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp
leaf01(config-router)# neighbor swp51 timers 10 30
leaf01(config-router)# end
leaf01# write memory
leaf01# exit

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

...
router bgp 65101
  ...
  neighbor swp51 timers 10 30
...
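
The constraints on the keepalive interval and hold time described above can be checked before you apply a configuration. The following Python sketch encodes only the rules stated in this section; the function name is hypothetical and the check is not part of NVUE or FRR.

```python
def validate_bgp_timers(keepalive: int, hold: int) -> None:
    """Raise ValueError if the timer pair breaks the documented rules."""
    if keepalive == 0 and hold == 0:
        return  # both zero: keepalive exchange is disabled
    if keepalive < 1:
        raise ValueError("keepalive interval cannot be less than 1 second")
    if keepalive * 3 > hold:
        raise ValueError("keepalive interval must be at most one third of the hold time")


validate_bgp_timers(3, 9)    # default values
validate_bgp_timers(10, 30)  # values used in the example above
validate_bgp_timers(0, 0)    # disables keepalives

try:
    validate_bgp_timers(10, 20)
except ValueError as err:
    print(f"rejected: {err}")
```

Comparing keepalive * 3 against the hold time avoids fractional arithmetic when the hold time is not a multiple of three.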

Reconnect Interval

By default, the BGP process attempts to connect to a peer after a failure (or on startup) every 10 seconds. You can change this value to suit your needs.

The following example commands set the reconnect value to 30 seconds:

cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 timers connection-retry 30
cumulus@leaf01:~$ nv config apply

The equivalent vtysh commands are:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp
leaf01(config-router)# neighbor swp51 timers connect 30
leaf01(config-router)# end
leaf01# write memory
leaf01# exit

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

...
router bgp 65101
  ...
  neighbor swp51 timers connect 30
...

Advertisement Interval

After making a new best path decision for a prefix, BGP can insert a delay before advertising the new results to a peer. This delay rate limits the number of changes advertised to downstream peers and lowers processing requirements by slowing down convergence. By default, this interval is 0 seconds for both eBGP and iBGP sessions, which allows for fast convergence. For more information about the advertisement interval, see this IETF draft.

The following example commands set the advertisement interval to 5 seconds:

cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 timers route-advertisement 5
cumulus@leaf01:~$ nv config apply

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp
leaf01(config-router)# neighbor swp51 advertisement-interval 5
leaf01(config-router)# end
leaf01# write memory
leaf01# exit

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

...
router bgp 65101
  ...
  neighbor swp51 advertisement-interval 5
...

BGP Input and Output Message Queue Limit

You can configure the input and the output message queue limit for all peers. For both the input and output queue limit, you can set a value between 1 and 4294967295 messages. The default setting is 10000.

Only increase the input or output queue if you have enough memory to handle large queues of messages at the same time.

The following example sets the input queue limit to 2048 messages and the output queue limit to 2048 messages:

cumulus@leaf01:~$ nv set router bgp queue-limit input 2048
cumulus@leaf01:~$ nv set router bgp queue-limit output 2048
cumulus@leaf01:~$ nv config apply

The following example sets the input queue limit to 2048 messages and the output queue limit to 2048 messages:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# bgp input-queue-limit 2048
leaf01(config)# bgp output-queue-limit 2048
leaf01(config)# end
leaf01# write memory
leaf01# exit

To show the input and output message queue configuration, run the nv show router bgp queue-limit command.

Route Reflectors

iBGP rules state that BGP cannot send a route learned from an iBGP peer to another iBGP peer. In a data center spine and leaf network using iBGP, this prevents a spine from sending a route learned from a leaf to any other leaf. As a workaround, you can use a route reflector. When an iBGP speaker is a route reflector, it can send iBGP learned routes to other iBGP peers.

In the following example, spine01 is acting as a route reflector. The leaf switches, leaf01, leaf02 and leaf03 are route reflector clients. BGP sends any route that spine01 learns from a route reflector client to other route reflector clients.

To configure the BGP node as a route reflector for a BGP peer, set the neighbor route-reflector-client option. The following example sets spine01 shown in the illustration above to be a route reflector for leaf01 (on swp1), which is a route reflector client. You do not have to configure the client.

cumulus@spine01:~$ nv set vrf default router bgp neighbor swp1 address-family ipv4-unicast route-reflector-client on
cumulus@spine01:~$ nv config apply

cumulus@spine01:~$ sudo vtysh
...
spine01# configure terminal
spine01(config)# router bgp 65199
spine01(config-router)# address-family ipv4
spine01(config-router-af)# neighbor swp1 route-reflector-client
spine01(config-router-af)# end
spine01# write memory
spine01# exit

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

...
router bgp 65199
 bgp router-id 10.10.10.101
 neighbor swp1 remote-as internal
 !
 address-family ipv4 unicast
  network 10.10.10.101/32
  neighbor swp1 route-reflector-client
 exit-address-family
...
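The iBGP advertisement rule and the route reflector exception described in this section can be modeled in a few lines. The following Python sketch is a simplified illustration, not how FRR implements reflection; the Peer class and may_advertise function are hypothetical names. It uses the general reflection rule from RFC 4456 (a route passes between two iBGP peers only if at least one of them is a route reflector client), which covers the client-to-client case described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    name: str
    ibgp: bool               # True for an iBGP session, False for eBGP
    rr_client: bool = False  # True if configured as route-reflector-client


def may_advertise(learned_from: Peer, send_to: Peer) -> bool:
    """Decide whether a route learned from one peer may be sent to another."""
    # The iBGP restriction only applies when both sessions are iBGP.
    if not (learned_from.ibgp and send_to.ibgp):
        return True
    # iBGP to iBGP: allowed only by reflection, which requires at least
    # one side to be a route reflector client (RFC 4456).
    return learned_from.rr_client or send_to.rr_client


# Without a route reflector, a spine cannot pass leaf01 routes to leaf02.
plain01 = Peer("leaf01", ibgp=True)
plain02 = Peer("leaf02", ibgp=True)
print(may_advertise(plain01, plain02))  # False

# With both leafs configured as route reflector clients, it can.
client01 = Peer("leaf01", ibgp=True, rr_client=True)
client02 = Peer("leaf02", ibgp=True, rr_client=True)
print(may_advertise(client01, client02))  # True
```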

BGP Confederations

To reduce the number of iBGP peerings, configure a confederation to divide an AS into smaller sub-ASs.

To configure a BGP confederation:

The following example configures confederation ID 2 with sub-ASs 65101, 65102, 65103, and 65104.

cumulus@spine01:~$ nv set vrf default router bgp confederation id 2
cumulus@spine01:~$ nv set vrf default router bgp confederation member-as 65101-65104
cumulus@spine01:~$ nv config apply

cumulus@spine01:~$ sudo vtysh
...
spine01# configure terminal
spine01(config)# router bgp 65199
spine01(config-router)# bgp confederation identifier 2
spine01(config-router)# bgp confederation peers 65101
spine01(config-router)# bgp confederation peers 65102
spine01(config-router)# bgp confederation peers 65103
spine01(config-router)# bgp confederation peers 65104
spine01(config-router)# end
spine01# write memory
spine01# exit

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

cumulus@spine01:~$ sudo cat /etc/frr/frr.conf
...
router bgp 65199
 bgp router-id 10.10.10.101
 bgp confederation identifier 2
 bgp confederation peers 65101 65102 65103 65104
...
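The NVUE example gives the member sub-ASs as the range 65101-65104, while the vtysh example lists each sub-AS separately. The following Python sketch shows how the range corresponds to the individual vtysh commands. It is illustrative only; the helper names are hypothetical, and it handles only a single ASN or a single low-high range.

```python
from typing import List


def expand_member_as(spec: str) -> List[int]:
    """Expand "65101-65104" (or a single ASN such as "65101") into a list."""
    if "-" in spec:
        low, high = (int(part) for part in spec.split("-", 1))
        return list(range(low, high + 1))
    return [int(spec)]


def confederation_peer_commands(spec: str) -> List[str]:
    """Build one vtysh 'bgp confederation peers' line per member sub-AS."""
    return [f"bgp confederation peers {asn}" for asn in expand_member_as(spec)]


for line in confederation_peer_commands("65101-65104"):
    print(line)
```

Running the sketch prints the same four bgp confederation peers commands as the vtysh example above.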

Administrative Distance

Cumulus Linux uses the administrative distance to choose which routing protocol to use when two different protocols provide route information for the same destination. The smaller the distance, the more reliable the protocol. For example, if the switch receives a route from OSPF with an administrative distance of 110 and the same route from BGP with an administrative distance of 100, the switch chooses BGP.
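The selection rule amounts to picking the source with the smallest distance. The following Python sketch illustrates this with the values from the example above; the function name preferred_protocol is hypothetical, and the sketch ignores everything except the administrative distance.

```python
from typing import Dict


def preferred_protocol(distances: Dict[str, int]) -> str:
    """Return the protocol whose route wins: the lowest administrative distance."""
    return min(distances, key=distances.get)


# OSPF at 110 and BGP at 100: the switch chooses BGP.
print(preferred_protocol({"ospf": 110, "bgp": 100}))

# If you raise the BGP distance to 150, OSPF wins instead.
print(preferred_protocol({"ospf": 110, "bgp": 150}))
```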

The following example commands set the administrative distance for external routes to 150 and internal routes to 110:

cumulus@spine01:~$ nv set vrf default router bgp address-family ipv4-unicast admin-distance external 150
cumulus@spine01:~$ nv set vrf default router bgp address-family ipv4-unicast admin-distance internal 110
cumulus@spine01:~$ nv config apply

cumulus@spine01:~$ sudo vtysh
...
spine01# configure terminal
spine01(config)# router bgp 65199
spine01(config-router)# distance bgp 150 110
spine01(config-router)# end
spine01# write memory
spine01# exit

BGP Neighbor Shutdown

You can shut down all active BGP sessions with a neighbor and remove all associated routing information without removing its associated configuration. When shut down, the neighbor goes into an administratively idle state.

cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 shutdown on
cumulus@leaf01:~$ nv config apply

To bring BGP sessions with the neighbor back up, run the nv set vrf default router bgp neighbor swp51 shutdown off command.

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# neighbor swp51 shutdown
leaf01(config-router)# end
leaf01# write memory
leaf01# exit

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

...
router bgp 65101
  ...
  neighbor swp51 shutdown
...

To bring BGP sessions with the neighbor back up, run the no neighbor swp51 shutdown command.

Graceful BGP Shutdown

To reduce packet loss during planned switch or link maintenance, you can configure graceful BGP shutdown globally, on a peer group, or on a specific peer.

You can enable graceful BGP shutdown either globally or on a peer or peer group but not both.

Global Graceful BGP Shutdown

When you enable graceful shutdown globally on the switch, Cumulus Linux adds the graceful-shutdown community to all inbound and outbound routes from all eBGP peers and sets the local-pref for the routes to 0 (refer to RFC 8326).

To enable graceful shutdown globally on the switch:

cumulus@leaf01:~$ nv set router bgp graceful-shutdown on
cumulus@leaf01:~$ nv config apply

To disable graceful shutdown globally on the switch:

cumulus@leaf01:~$ nv set router bgp graceful-shutdown off
cumulus@leaf01:~$ nv config apply

To enable graceful shutdown globally:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# bgp graceful-shutdown
leaf01(config-router)# end
leaf01# write memory
leaf01# exit

To disable graceful shutdown globally on the switch:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# no bgp graceful-shutdown
leaf01(config-router)# end
leaf01# write memory
leaf01# exit

To show the configuration, run the vtysh show ip bgp <route> command. For example:

cumulus@leaf01:~$ sudo vtysh
leaf01# show ip bgp 10.10.10.0/24
BGP routing table entry for 10.10.10.0/24
Paths: (2 available, best #1, table Default-IP-Routing-Table)
  Advertised to non peer-group peers:
  bottom0(10.10.10.2)
  30 20
    10.10.10.2 (metric 10) from top1(10.10.10.2) (10.10.10.2)
      Origin IGP, localpref 100, valid, internal, bestpath-from-AS 30, best
      Community: 99:1
      AddPath ID: RX 0, TX 52
      Last update: Mon Sep 18 17:01:18 2017

  20
    10.10.10.3 from bottom0(10.10.10.32) (10.10.10.10)
      Origin IGP, metric 0, localpref 0, valid, external, bestpath-from-AS 20
      Community: 99:1 graceful-shutdown
      AddPath ID: RX 0, TX 2
      Last update: Mon Sep 18 17:01:18 2017

As an optional configuration, you can create a route map that prepends the local AS to routes that carry the graceful-shutdown community, so that the reduced preference propagates to other parts of the network as a longer AS path.

Example Configuration Using a Route Map
router bgp 65101
 bgp router-id 10.10.10.1
 bgp graceful-restart
 bgp bestpath as-path multipath-relax
 neighbor fabric peer-group
 neighbor swp51 interface remote-as external

 address-family ipv4 unicast
  redistribute connected
  neighbor swp51 route-map prependas out
 exit-address-family

bgp community-list standard gshut seq 5 permit graceful-shutdown

route-map prependas permit 10
 match community gshut exact-match
 set as-path prepend 65101

route-map prependas permit 20

With the above configuration, the peer sees:

cumulus@spine01:~$ sudo vtysh
...
spine01# show ip bgp 10.10.10.1/32
BGP routing table entry for 10.10.10.1/32
Paths: (1 available, best #1, table default)
Advertised to non peer-group peers:
65101 65101
10.10.10.1 from leaf01(10.10.10.1) (10.10.10.1)
Origin incomplete, metric 0, localpref 0, valid, external, bestpath-from-AS 65101, best (First path received)
Community: graceful-shutdown
Last update: Sun Dec 20 03:04:53 2020
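The effect of the prependas route map can be traced with a small model. The following Python sketch is a simplified illustration and not FRR code; the function names are hypothetical, and exact-match is modeled as the route carrying only the graceful-shutdown community. It shows why the peer sees the AS path 65101 65101 for a locally originated route.

```python
from typing import Iterable, List

GSHUT = "graceful-shutdown"
LOCAL_AS = 65101


def apply_prependas(as_path: Iterable[int], communities: Iterable[str]) -> List[int]:
    """Model of the 'prependas' route map from the example configuration."""
    # Sequence 10: match community gshut exact-match, then prepend the AS.
    if set(communities) == {GSHUT}:
        return [LOCAL_AS] + list(as_path)
    # Sequence 20: permit everything else unchanged.
    return list(as_path)


def advertise_to_ebgp_peer(as_path: Iterable[int], communities: Iterable[str]) -> List[int]:
    """Apply the outbound route map, then add the local AS as eBGP always does."""
    return [LOCAL_AS] + apply_prependas(as_path, communities)


# Locally originated route during graceful shutdown.
print(advertise_to_ebgp_peer([], [GSHUT]))  # [65101, 65101]

# The same route without the graceful-shutdown community.
print(advertise_to_ebgp_peer([], []))       # [65101]
```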

Graceful BGP Shutdown on a Peer

When you enable BGP graceful shutdown on a peer, Cumulus Linux attaches a graceful-shutdown community to the relevant routes. Neighbors receiving the graceful-shutdown community mark these routes as less preferred if alternative routes exist. If no other routes are available, neighbors continue to use the routes with the graceful-shutdown community. If you enable graceful shutdown (maintenance) in multiple parts of the network or where there are no additional routes, traffic does not stop on the routes that have the attached graceful-shutdown community.
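The following Python sketch illustrates why traffic keeps flowing when no alternative route exists. It is a deliberately reduced model, not the BGP best path algorithm: it compares only local preference, and it assumes the receiving neighbor sets local-pref to 0 for routes that carry the graceful-shutdown community, as described for global graceful shutdown. The Route class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

GSHUT = "graceful-shutdown"


@dataclass(frozen=True)
class Route:
    next_hop: str
    local_pref: int = 100
    communities: FrozenSet[str] = frozenset()


def on_receive(route: Route) -> Route:
    """Lower the preference of routes tagged with graceful-shutdown."""
    if GSHUT in route.communities:
        return Route(route.next_hop, 0, route.communities)
    return route


def best_route(routes: List[Route]) -> Route:
    """Pick the route with the highest local preference (simplified)."""
    return max((on_receive(r) for r in routes), key=lambda r: r.local_pref)


draining = Route("spine01", communities=frozenset({GSHUT}))
alternative = Route("spine02")

# An alternative exists, so the tagged route is no longer preferred.
print(best_route([draining, alternative]).next_hop)  # spine02

# No alternative exists, so the tagged route stays in use.
print(best_route([draining]).next_hop)               # spine01
```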

Before you enable graceful shutdown on a peer, make sure that global graceful shutdown is off.

To enable graceful shutdown on a peer, run the nv set vrf <vrf> router bgp neighbor <neighbor> graceful-shutdown on command:

cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 graceful-shutdown on
cumulus@leaf01:~$ nv config apply

To disable graceful shutdown on a peer, run the nv unset vrf <vrf> router bgp neighbor <neighbor> graceful-shutdown command:

cumulus@leaf01:~$ nv unset vrf default router bgp neighbor swp51 graceful-shutdown
cumulus@leaf01:~$ nv config apply

To enable graceful shutdown on a peer:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# neighbor swp51 graceful-shutdown
leaf01(config-router)# end
leaf01# write memory
leaf01# exit

To disable graceful shutdown on a peer:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# no neighbor swp51 graceful-shutdown
leaf01(config-router)# end
leaf01# write memory
leaf01# exit

To show if graceful shutdown is enabled on a peer, run the nv show vrf <vrf> router bgp neighbor <neighbor> command:

cumulus@leaf01:~$ nv show vrf default router bgp neighbor swp51
                                    operational                     applied   
----------------------------------  ------------------------------  ----------
password                                                            *         
enforce-first-as                                                    off       
passive-mode                                                        off       
nexthop-connected-check                                             on        
description                                                         none      
bfd                                                                           
  enable                                                            off       
...
graceful-shutdown                                                   on      
...

Graceful BGP Shutdown on a Peer Group

When you enable BGP graceful shutdown on a peer group, Cumulus Linux attaches a graceful-shutdown community to the relevant routes. Neighbors receiving the graceful-shutdown community mark these routes as less preferred if alternative routes exist. If no other routes are available, neighbors continue to use the routes with the graceful-shutdown community. If you enable graceful shutdown (maintenance) in multiple parts of the network or where there are no additional routes, traffic does not stop on the routes that have the attached graceful-shutdown community.

Before you enable graceful shutdown on a peer group, make sure that global graceful shutdown is off.

To enable graceful shutdown on a peer group, run the nv set vrf <vrf> router bgp peer-group <peer-group-id> graceful-shutdown on command:

cumulus@leaf01:~$ nv set vrf default router bgp peer-group underlay graceful-shutdown on
cumulus@leaf01:~$ nv config apply

To disable graceful shutdown on a peer group, run the nv unset vrf <vrf> router bgp peer-group <peer-group-id> graceful-shutdown command:

cumulus@leaf01:~$ nv unset vrf default router bgp peer-group underlay graceful-shutdown
cumulus@leaf01:~$ nv config apply

To enable graceful shutdown on a peer group:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# neighbor underlay graceful-shutdown
leaf01(config-router)# end
leaf01# write memory
leaf01# exit

To disable graceful shutdown on a peer group:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# no neighbor underlay graceful-shutdown
leaf01(config-router)# end
leaf01# write memory
leaf01# exit

To show if graceful shutdown is enabled on a peer group, run the nv show vrf <vrf> router bgp peer-group <peer-group-id> command:

cumulus@leaf01:~$ nv show vrf default router bgp peer-group underlay
                                    operational                     applied   
----------------------------------  ------------------------------  ----------
password                                                            *         
enforce-first-as                                                    off       
passive-mode                                                        off       
nexthop-connected-check                                             on        
description                                                         none      
bfd                                                                           
  enable                                                            off       
...
graceful-shutdown                                                   on      
...

Graceful BGP Restart

When BGP restarts on a switch, all BGP peers detect that the session goes down and comes back up. This session transition results in a routing flap on BGP peers that causes BGP to recompute routes, generate route updates, and add unnecessary churn to the forwarding tables. The routing flaps can create transient forwarding blackholes and loops, and also consume resources on the switches affected by the flap, which can affect overall network performance.

To minimize the negative effects that occur when BGP restarts, Cumulus Linux enables graceful BGP restart by default, which lets a BGP speaker signal to its peers that it can preserve its forwarding state and continue data forwarding during a restart. BGP graceful restart also enables a BGP speaker to continue to use routes announced by a peer even after the peer has gone down.

When BGP establishes a session, BGP peers use the BGP OPEN message to negotiate a graceful restart. If the BGP peer also supports graceful restart, it activates for that neighbor session. If the BGP session stops, the BGP peer (the restart helper) flags all routes associated with the device as stale but continues to forward packets to these routes for a certain period of time. The restarting device also continues to forward packets during the graceful restart. After the device comes back up and establishes BGP sessions again with its peers (restart helpers), it waits to learn all routes that these peers announce before selecting a cumulative path; after which, it updates its forwarding tables and re-announces the appropriate routes to its peers. These procedures ensure that if there are any routing changes while the BGP speaker is restarting, the network converges.

Restart Modes

Cumulus Linux supports graceful BGP restart full mode and helper-only mode for IPv4, IPv6 and EVPN. The default setting is helper-only mode.

You can configure graceful BGP restart globally, where all BGP peers inherit the graceful restart capability, or for a BGP peer or peer group (useful for misbehaving peers or when working with third party devices).

The switch has graceful restart enabled in helper-only mode by default. To set graceful BGP restart to full mode globally on the switch:

cumulus@leaf01:~$ nv set router bgp graceful-restart mode full
cumulus@leaf01:~$ nv config apply

To set graceful BGP restart to full mode on the BGP peer connected on swp51:

cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 graceful-restart mode full
cumulus@leaf01:~$ nv config apply

To set graceful BGP restart back to the default setting (helper-only mode), run the nv unset router bgp graceful-restart command or the nv set router bgp graceful-restart mode helper-only command.

The switch has graceful restart enabled in helper-only mode by default. To set graceful BGP restart to full mode globally on the switch:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# bgp graceful-restart
leaf01(config-router)# end
leaf01# write memory
leaf01# exit

To set graceful BGP restart to full mode on the BGP peer connected on swp51:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# neighbor swp51 graceful-restart
leaf01(config-router)# end
leaf01# write memory
leaf01# exit

To set graceful BGP restart back to the default setting (helper-only mode), run the no bgp graceful-restart command or the no neighbor <interface> graceful-restart command.

Disable Graceful Restart

If you disable graceful BGP restart, you cannot achieve a switch restart or switch software upgrade with minimal traffic loss in a BGP configuration. Refer to ISSU for more information.

To disable graceful BGP restart globally on the switch:

cumulus@leaf01:~$ nv set router bgp graceful-restart mode off
cumulus@leaf01:~$ nv config apply

To disable graceful BGP restart on a BGP peer:

cumulus@leaf01:~$ nv unset vrf default router bgp neighbor swp51 graceful-restart
cumulus@leaf01:~$ nv config apply

To disable graceful BGP restart globally on the switch:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# bgp graceful-restart-disable
leaf01(config-router)# end
leaf01# write memory
leaf01# exit

To disable graceful BGP restart on a BGP peer:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# neighbor swp51 graceful-restart-disable
leaf01(config-router)# end
leaf01# write memory
leaf01# exit

Restart Timers

You can configure the following graceful BGP restart timers.

Timer                         Description
----------------------------  -----------
restart-time                  The number of seconds to wait for a graceful restart capable peer to re-establish BGP peering. You can set a value between 1 and 4095. The default is 120 seconds.
path-selection-deferral-time  The number of seconds a restarting peer defers path-selection when waiting for the EOR marker from peers. You can set a value between 0 and 3600. The default is 360 seconds.
stale-routes-time             The number of seconds to hold stale routes for a restarting peer. You can set a value between 1 and 4095. The default is 360 seconds.

To avoid traffic loss during warm boot in an EVPN multihoming configuration with multihop BGP sessions, increase the restart-time timer to more than 180 seconds on all multihoming configured switches.

The following example commands set the restart-time to 400 seconds, path-selection-deferral-time to 300 seconds, and stale-routes-time to 400 seconds:

cumulus@leaf01:~$ nv set router bgp graceful-restart restart-time 400
cumulus@leaf01:~$ nv set router bgp graceful-restart path-selection-deferral-time 300
cumulus@leaf01:~$ nv set router bgp graceful-restart stale-routes-time 400
cumulus@leaf01:~$ nv config apply

Timer              Description
-----------------  -----------
notification       Enables graceful BGP restart support for BGP NOTIFICATION messages.
preserve-fw-state  Sets the F-bit indication to preserve the FIB during a graceful BGP restart.
restart-time       The number of seconds to wait for a graceful restart capable peer to re-establish BGP peering. You can set a value between 1 and 4095. The default is 120 seconds.
rib-stale-time     The stale route removal time in the RIB (in seconds). You can set a value between 1 and 3600.
select-defer-time  The number of seconds a restarting peer defers path-selection when waiting for the EOR marker from peers. You can set a value between 0 and 3600. The default is 360 seconds.
stalepath-time     The number of seconds to hold stale routes for a restarting peer. You can set a value between 1 and 4095. The default is 360 seconds.

The following example commands set the restart-time to 400 seconds, select-defer-time to 300 seconds, and stalepath-time to 400 seconds:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp 65101
leaf01(config-router)# bgp graceful-restart restart-time 400
leaf01(config-router)# bgp graceful-restart select-defer-time 300
leaf01(config-router)# bgp graceful-restart stalepath-time 400
leaf01(config-router)# end
leaf01# write memory
leaf01# exit

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

...
router bgp 65199
 bgp router-id 10.10.10.101
 neighbor swp51 remote-as external
 bgp graceful-restart restart-time 400
 bgp graceful-restart select-defer-time 300
 bgp graceful-restart stalepath-time 400
...
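The value ranges listed for the restart timers can be checked before you apply a change. The following Python sketch is illustrative only and is not part of Cumulus Linux; the TIMER_LIMITS table and check_timers function are hypothetical names. The keys follow the NVUE option names used in the nv set commands in this section, and the ranges and defaults come from the timer descriptions above.

```python
from typing import Dict, List

# name: (minimum, maximum, default), all in seconds
TIMER_LIMITS = {
    "restart-time": (1, 4095, 120),
    "path-selection-deferral-time": (0, 3600, 360),
    "stale-routes-time": (1, 4095, 360),
}


def check_timers(settings: Dict[str, int]) -> List[str]:
    """Return a list of problems; an empty list means every value is in range."""
    errors = []
    for name, value in settings.items():
        if name not in TIMER_LIMITS:
            errors.append(f"{name}: unknown timer")
            continue
        low, high, _default = TIMER_LIMITS[name]
        if not low <= value <= high:
            errors.append(f"{name}: {value} is outside {low}-{high}")
    return errors


# Values from the example commands in this section.
print(check_timers({
    "restart-time": 400,
    "path-selection-deferral-time": 300,
    "stale-routes-time": 400,
}))

# A value above the documented maximum.
print(check_timers({"restart-time": 5000}))
```

The first call prints an empty list. The second reports that 5000 is outside the range 1-4095.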

Show Graceful Restart Information

To show global graceful BGP restart configuration settings, run the NVUE nv show router bgp graceful-restart command:

cumulus@leaf01:mgmt:~$ nv show router bgp graceful-restart 
                              applied      pending    
----------------------------  -----------  -----------
mode                          helper-only  helper-only
restart-time                  120          120        
path-selection-deferral-time  360          360        
stale-routes-time             360          360

To show graceful BGP restart information on a specific BGP peer, run the vtysh show ip bgp neighbor <neighbor> graceful-restart command.

cumulus@leaf01:mgmt:~$ sudo vtysh
...
leaf01# show ip bgp neighbor swp51 graceful-restart
Codes: GR - Graceful Restart, * -  Inheriting Global GR Config,
       Restart - GR Mode-Restarting, Helper - GR Mode-Helper,
       Disable - GR Mode-Disable.

BGP neighbor on swp51: fe80::4638:39ff:fe00:2, remote AS 65199, local AS 65101, external link
  BGP state = Established, up for 00:15:54
  Neighbor GR capabilities:
    Graceful Restart Capability: advertised and received
      Remote Restart timer is 120 seconds
      Address families by peer:
        none
  Graceful restart information:
    End-of-RIB send: IPv4 Unicast
    End-of-RIB received: IPv4 Unicast
    Local GR Mode: Helper*
    Remote GR Mode: Helper
    R bit: False
    Timers:
      Configured Restart Time(sec): 120
      Received Restart Time(sec): 120
    IPv4 Unicast:
      F bit: False
      End-of-RIB sent: Yes
      End-of-RIB sent after update: Yes
      End-of-RIB received: Yes
      Timers:
        Configured Stale Path Time(sec): 360

Enable Read-only Mode

Sometimes, as Cumulus Linux establishes BGP peers and receives updates, it installs prefixes in the RIB and advertises them to BGP peers before receiving and processing information from all the peers. Also, depending on the timing of the updates, Cumulus Linux sometimes installs prefixes, then withdraws and replaces them with new routing information. Read-only mode minimizes this BGP route churn in both the local RIB and with BGP peers.

Enable read-only mode to reduce CPU and network usage when restarting the BGP process. Because intermediate best paths are possible for the same prefix as peers establish and start receiving updates at different times, read-only mode is useful in topologies where BGP learns a prefix from a large number of peers and the network has a high number of prefixes.

While in read-only mode, BGP does not run best-path or generate any updates to its peers.

The following example commands enable read-only mode with a maximum delay of 300 seconds and an establish wait time of 200 seconds:

cumulus@leaf01:~$ nv set router bgp convergence-wait time 300
cumulus@leaf01:~$ nv set router bgp convergence-wait establish-wait-time 200
cumulus@leaf01:~$ nv config apply

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# router bgp
leaf01(config-router)# update-delay 300 200
leaf01(config-router)# end
leaf01# write memory
leaf01# exit

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

...
router bgp 65199
 bgp router-id 10.10.10.101
 neighbor swp51 remote-as external
 bgp update-delay 300 200
...

To show the configured timers and information about the transitions when a convergence event occurs, run the vtysh show ip bgp summary command.

cumulus@leaf01:mgmt:~$ sudo vtysh
...
leaf01# show ip bgp summary
ipv4 Unicast Summary

BGP router identifier 10.10.10.1, local AS number 65101 vrf-id 0
Read-only mode update-delay limit: 300 seconds
                   Establish wait: 200 seconds
BGP table version 0
RIB entries 3, using 576 bytes of memory
Peers 1, using 21 KiB of memory

Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt
spine01(swp51)  4      65199     30798     30802        0    0    0 1d01h09m            0        0

Total number of neighbors 1
...

The vtysh show ip bgp summary json command shows the last convergence event.

BGP Community Lists

You can use community lists to define a BGP community to tag one or more routes. You can then use the communities to apply a route policy on either egress or ingress.

The BGP community list can be standard, extended, or large. The standard BGP community list is a pair of values (such as 100:100) that you can tag on a specific prefix and advertise to other neighbors, or you can apply them on route ingress. The standard BGP community list can be one of four BGP default communities:

An extended BGP community list takes a regular expression of communities and matches the listed communities.

A large community-list accommodates more identification information, including 4-byte AS numbers.

When the neighbor receives the prefix, it examines the community value and takes action accordingly, such as permitting or denying the community member in the routing policy.

The following example configures a standard community list filter:

cumulus@leaf01:~$ nv set router policy community-list COMMUNITY1 rule 10 action permit
cumulus@leaf01:~$ nv set router policy community-list COMMUNITY1 rule 10 community 100:100
cumulus@leaf01:~$ nv config apply

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# bgp community-list standard COMMUNITY1 permit 100:100
leaf01(config)# exit
leaf01# write memory
leaf01# exit

To apply the community list to a route map to define the routing policy:

cumulus@leaf01:~$ nv set router policy route-map ROUTEMAP1 rule 10 match community-list COMMUNITY1
cumulus@leaf01:~$ nv set router policy route-map ROUTEMAP1 rule 10 action permit
cumulus@leaf01:~$ nv config apply

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# route-map ROUTEMAP1 permit 10
leaf01(config-route-map)# match community COMMUNITY1
leaf01(config-route-map)# end
leaf01# write memory
leaf01# exit

The following example configures a BGP extended community RT filter and applies the extended community list to a route map.

cumulus@leaf01:~$ nv set router policy ext-community-list EXTCOMM1 rule 10 ext-community rt 11:11,22:22
cumulus@leaf01:~$ nv set router policy ext-community-list EXTCOMM1 rule 10 action permit
cumulus@leaf01:~$ nv set router policy route-map ROUTEMAP1 rule 10 match ext-community-list EXTCOMM1
cumulus@leaf01:~$ nv set router policy route-map ROUTEMAP1 rule 10 action permit
cumulus@leaf01:~$ nv config apply

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# bgp extcommunity-list standard EXTCOMM1 permit rt 11:11 rt 22:22
leaf01(config)# route-map ROUTEMAP1 permit 10
leaf01(config-route-map)# match extcommunity EXTCOMM1
leaf01(config-route-map)# end
leaf01# write memory
leaf01# exit

The following example configures a BGP extended community RT filter with a regex match and applies the extended community list to a route map.

cumulus@leaf01:~$ nv set router policy ext-community-list EXTCOMM2 rule 10 ext-community rt "\.*_65000:2002_.*","\.*_89000:2002_.*"
cumulus@leaf01:~$ nv set router policy ext-community-list EXTCOMM2 rule 10 action permit
cumulus@leaf01:~$ nv set router policy route-map ROUTEMAP3 rule 10 match ext-community-list EXTCOMM2
cumulus@leaf01:~$ nv set router policy route-map ROUTEMAP3 rule 10 action permit
cumulus@leaf01:~$ nv config apply

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# bgp extcommunity-list expanded EXTCOMM2 permit rt "\.*_65000:2002_.*","\.*_89000:2002_.*"
leaf01(config)# route-map ROUTEMAP3 permit 10
leaf01(config-route-map)# match extcommunity EXTCOMM2
leaf01(config-route-map)# end
leaf01# write memory
leaf01# exit

The following example configures a BGP extended community SOO filter and applies the extended community list to a route map.

cumulus@leaf01:~$ nv set router policy ext-community-list EXTCOMM1 rule 10 ext-community soo 66:66,77:77
cumulus@leaf01:~$ nv set router policy ext-community-list EXTCOMM1 rule 10 action permit
cumulus@leaf01:~$ nv set router policy route-map ROUTEMAP1 rule 10 match ext-community-list EXTCOMM1
cumulus@leaf01:~$ nv set router policy route-map ROUTEMAP1 rule 10 action permit
cumulus@leaf01:~$ nv config apply

cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# bgp extcommunity-list standard EXTCOMM1 permit soo 66:66 soo 77:77
leaf01(config)# route-map ROUTEMAP1 permit 10
leaf01(config-route-map)# match extcommunity EXTCOMM1
leaf01(config-route-map)# end
leaf01# write memory
leaf01# exit

The following example configures a BGP extended community SOO filter with a regex match and applies the extended community list to a route map.

cumulus@leaf01:~$ nv set router policy ext-community-list EXTCOMM2 rule 10 ext-community soo "\.*_65000:2002_.*","\.*_89000:2002_.*"
cumulus@leaf01:~$ nv set router policy ext-community-list EXTCOMM2 rule 10 action permit
cumulus@leaf01:~$ nv set router policy route-map ROUTEMAP20 rule 10 match ext-community-list EXTCOMM2
cumulus@leaf01:~$ nv set router policy route-map ROUTEMAP20 rule 10 action permit
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# bgp extcommunity expanded EXTCOMM2 permit soo "\.*_65000:2002_.*","\.*_89000:2002_.*"
leaf01(config)# route-map ROUTEMAP20 permit 10
leaf01(config-route-map)# match extcommunity EXTCOMM2
leaf01(config-route-map)# end
leaf01# write memory
leaf01# exit

To use a special character, such as a period (.), in the regular expression for a BGP extended community list, you must escape the character with a backslash (\). For example, nv set router policy ext-community-list EXTCOMM2 rule 10 ext-community rt "\.*_65000:2002_.*".

The following example configures a BGP large community list and applies the large community list to a route map.

cumulus@leaf01:~$ nv set router policy large-community-list 11 rule 10 action permit
cumulus@leaf01:~$ nv set router policy large-community-list 11 rule 10 large-community 4200857911:011:011
cumulus@leaf01:~$ nv set router policy route-map MAP1 rule 10 match large-community-list 11
cumulus@leaf01:~$ nv set router policy route-map MAP1 rule 10 action permit
cumulus@leaf01:~$ nv config apply
cumulus@leaf01:~$ sudo vtysh
...
leaf01# configure terminal
leaf01(config)# bgp large-community-list 11 seq 10 permit 4200857911:011:011
leaf01(config)# route-map MAP1 permit 10
leaf01(config-route-map)# match large-community 11
leaf01(config-route-map)# end
leaf01# write memory
leaf01# exit
cumulus@leaf01:~$

Cumulus Linux considers the full list of communities on a BGP route as a single string to evaluate. If you try to match $ (ends with), Cumulus Linux matches the last community value in the list of communities, not the individual community values within the list.

For example, if you use the regular expression ".*:(20)$", Cumulus Linux matches all the BGP routes with a list of communities ending in 20.
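
The following Python sketch illustrates the whole-string behavior described above. It is only an illustration: Python's re module is not the regular expression engine that the switch uses, and the community values and the route_matches helper are hypothetical.

```python
import re

# The expression from the example above. "$" anchors to the end of the
# whole community string, not to the end of each community value.
PATTERN = re.compile(r".*:(20)$")

def route_matches(communities):
    """Join the communities of a route into one string, then apply the regex."""
    community_string = " ".join(communities)
    return PATTERN.search(community_string) is not None

# The list of communities ends in 20, so the route matches.
ends_in_20 = route_matches(["65000:10", "65001:20"])

# 65000:20 is present but is not the last value, so "$" does not match.
has_20_but_not_last = route_matches(["65000:20", "65001:30"])

print(ends_in_20, has_20_but_not_last)
```

The second route carries a community that ends in 20, but because the expression is evaluated against the complete string, only the final value in the list can satisfy the $ anchor.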

Troubleshooting BGP

Use the following commands to troubleshoot BGP.

Basic Troubleshooting Commands

Run the following commands to help you troubleshoot BGP.

Show BGP Configuration Summary

To show a summary of the BGP configuration on the switch, run the NVUE nv show router bgp command or the vtysh show ip bgp summary command. For example:

cumulus@switch:~$ nv show router bgp 
                                applied      pending    
------------------------------  -----------  -----------
enable                          on           on         
autonomous-system               65101        65101      
router-id                       10.10.10.1   10.10.10.1 
policy-update-timer             5            5          
graceful-shutdown               off          off        
wait-for-install                off          off        
graceful-restart                                        
  mode                          helper-only  helper-only
  restart-time                  120          120        
  path-selection-deferral-time  360          360        
  stale-routes-time             360          360        
convergence-wait                                        
  time                          0            0          
  establish-wait-time           0            0          
queue-limit                                             
  input                         10000        10000      
  output                        10000        10000  
cumulus@switch:~$ sudo vtysh
...
switch# show ip bgp summary
ipv4 Unicast Summary
BGP router identifier 10.10.10.1, local AS number 65101 vrf-id 0
BGP table version 88
RIB entries 25, using 4800 bytes of memory
Peers 5, using 106 KiB of memory
Peer groups 1, using 64 bytes of memory

Neighbor              V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd
spine01(swp51)        4      65199     31122     31194        0    0    0 1d01h44m            7
spine02(swp52)        4      65199     31060     31151        0    0    0 01:47:13            7
spine03(swp53)        4      65199     31150     31207        0    0    0 01:48:31            7
spine04(swp54)        4      65199     31042     31098        0    0    0 01:46:57            7
leaf02(peerlink.4094) 4      65101     30919     30913        0    0    0 01:47:43           12

Total number of neighbors 5
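
If you collect the vtysh summary output with a script, a short parser can point out sessions that are not established. The sketch below is a minimal Python example that assumes the column layout shown above. The last row of the sample (a peer in the Active state) is hypothetical and is not part of the output above; it is there only to show a session that is down.

```python
# Trimmed copy of the neighbor table. The final row is a hypothetical
# peer that has not established a session.
SUMMARY = """\
Neighbor              V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd
spine01(swp51)        4      65199     31122     31194        0    0    0 1d01h44m            7
spine02(swp52)        4      65199     31060     31151        0    0    0 01:47:13            7
leaf02(peerlink.4094) 4      65101     30919     30913        0    0    0 01:47:43           12
spine03(swp53)        4      65199         0         0        0    0    0    never       Active
"""

def parse_summary(text):
    """Return {neighbor: last-column value} for each row under the header."""
    peers = {}
    in_table = False
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            in_table = False  # a blank line ends the neighbor table
            continue
        if fields[0] == "Neighbor":
            in_table = True
            continue
        if in_table:
            peers[fields[0]] = fields[-1]
    return peers

def not_established(peers):
    """A numeric last column is a prefix count; anything else is a session state."""
    return sorted(name for name, state in peers.items() if not state.isdigit())

peers = parse_summary(SUMMARY)
print(not_established(peers))
```

The check relies on the State/PfxRcd column showing a prefix count for an established session and a state name otherwise.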

To view the routing table as defined by BGP, run the vtysh show ip bgp ipv4 unicast command. For example:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# show ip bgp ipv4 unicast
BGP table version is 88, local router ID is 10.10.10.1, vrf id 0
Default local pref 100, local AS 65101
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
* i10.0.1.1/32      peerlink.4094            0    100      0 ?
*>                  0.0.0.0                  0         32768 ?
*= 10.0.1.2/32      swp54                                  0 65199 65102 ?
*=                  swp52                                  0 65199 65102 ?
* i                 peerlink.4094                 100      0 65199 65102 ?
*=                  swp53                                  0 65199 65102 ?
*>                  swp51                                  0 65199 65102 ?
*= 10.0.1.254/32    swp54                                  0 65199 65132 ?
*=                  swp52                                  0 65199 65132 ?
* i                 peerlink.4094                 100      0 65199 65132 ?
*=                  swp53                                  0 65199 65132 ?
*>                  swp51                                  0 65199 65132 ?
*> 10.10.10.1/32    0.0.0.0                  0         32768 ?
*>i10.10.10.2/32    peerlink.4094            0    100      0 ?
*= 10.10.10.3/32    swp54                                  0 65199 65102 ?
*=                  swp52                                  0 65199 65102 ?
* i                 peerlink.4094                 100      0 65199 65102 ?
*=                  swp53                                  0 65199 65102 ?
*>                  swp51                                  0 65199 65102 ?
...
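
To read the status prefix on each route line, you can map the characters to the legend printed at the top of the output. The following Python sketch assumes that the status codes occupy the first three characters of a line, as they do in the sample above; the decode_status helper is hypothetical.

```python
# Status codes copied from the legend in the command output.
STATUS_CODES = {
    "s": "suppressed", "d": "damped", "h": "history", "*": "valid",
    ">": "best", "=": "multipath", "i": "internal",
    "r": "RIB-failure", "S": "stale", "R": "removed",
}

def decode_status(line):
    """Decode the three-character status prefix of a route line."""
    return [STATUS_CODES[ch] for ch in line[:3] if ch in STATUS_CODES]

internal_path = decode_status("* i10.0.1.1/32      peerlink.4094            0    100      0 ?")
multipath = decode_status("*= 10.0.1.2/32      swp54                                  0 65199 65102 ?")
best_internal = decode_status("*>i10.10.10.2/32    peerlink.4094            0    100      0 ?")

print(internal_path)
print(multipath)
print(best_internal)
```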

To show a more detailed breakdown of a specific neighbor, run the vtysh show ip bgp neighbor <neighbor> command or the NVUE nv show vrf <vrf> router bgp neighbor <neighbor> command:

cumulus@switch:~$ nv show vrf default router bgp neighbor swp51
                               operational                applied   
-----------------------------  -------------------------  ----------
password                                                   *         
enforce-first-as                                           off       
passive-mode                                               off       
nexthop-connected-check                                    on        
description                                                none      
bfd                                                                  
  enable                                                   off       
ttl-security                                                                  
  enable                        on                         off       
  hops                          1                                    
local-as                                                             
  enable                                                   off       
timers                                                               
  keepalive                     3                          auto      
  hold                          9                          auto      
  connection-retry              10                         auto      
  route-advertisement           none                       auto      
address-family                                                       
  ipv4-unicast                                                       
    enable                                                 on        
    route-reflector-client                                 off       
    route-server-client                                    off       
    soft-reconfiguration                                   off       
    nexthop-setting                                        auto      
    add-path-tx                                            off       
    attribute-mod                                                    
      aspath                    off                        on        
      med                       off                        on        
      nexthop                   off                        on
...

To see details of a specific route, such as its source and destination, run the NVUE nv show vrf <vrf-id> router rib ipv4 route <route> or nv show vrf <vrf-id> router rib ipv6 route <route> command or the vtysh show ip bgp <route> command.

cumulus@switch:~$ nv show vrf default router rib ipv4 route 10.10.10.3/32
route-entry
==============
                                                                                
    Protocol - Protocol name, TblId - Table Id, NHGId - Nexthop group Id, Flags - u 
    - unreachable, r - recursive, o - onlink, i - installed, d - duplicate, c -     
    connected, A - active                                                           
                                                                                
    EntryIdx  Protocol  TblId  NHGId  Distance  Metric  ResolvedVia                ResolvedViaIntf  Weight  Flags
    --------  --------  -----  -----  --------  ------  -------------------------  ---------------  ------  -----
    1         bgp       254    132    20        0       fe80::4ab0:2dff:fe14:82a3  swp52            1       iA   
                                                        fe80::4ab0:2dff:fe53:538c  swp53            1       iA   
                                                        fe80::4ab0:2dff:fea3:d534  swp54            1       iA   
                                                        fe80::4ab0:2dff:feac:d7c4  swp51            1       iA 
cumulus@switch:~$ sudo vtysh
...
switch# show ip bgp 10.10.10.3/32
BGP routing table entry for 10.10.10.3/32
Paths: (5 available, best #5, table default)
  Advertised to non peer-group peers:
  spine01(swp51) spine02(swp52) spine03(swp53) spine04(swp54) leaf02(peerlink.4094)
  65199 65102
    fe80::8e24:2bff:fe79:7d46 from spine04(swp54) (10.10.10.104)
    (fe80::8e24:2bff:fe79:7d46) (used)
      Origin incomplete, valid, external, multipath
      Last update: Wed Oct  7 13:13:13 2020
  65199 65102
    fe80::841:43ff:fe27:caf from spine02(swp52) (10.10.10.102)
    (fe80::841:43ff:fe27:caf) (used)
      Origin incomplete, valid, external, multipath
      Last update: Wed Oct  7 13:13:14 2020
  65199 65102
    fe80::90b1:7aff:fe00:3121 from leaf02(peerlink.4094) (10.10.10.2)
      Origin incomplete, localpref 100, valid, internal
      Last update: Wed Oct  7 13:13:08 2020
  65199 65102
    fe80::48e7:fbff:fee9:5bcf from spine03(swp53) (10.10.10.103)
    (fe80::48e7:fbff:fee9:5bcf) (used)
      Origin incomplete, valid, external, multipath
      Last update: Wed Oct  7 13:13:13 2020
  65199 65102
    fe80::7c41:fff:fe93:b711 from spine01(swp51) (10.10.10.101)
    (fe80::7c41:fff:fe93:b711) (used)
      Origin incomplete, valid, external, multipath, bestpath-from-AS 65199, best (Older Path)
      Last update: Wed Oct  7 13:13:13 2020

Check BGP Timer Settings

To check BGP timers, such as the BGP keepalive interval, hold time, and advertisement interval, run the NVUE nv show vrf <vrf> router bgp neighbor <neighbor> timers command or the vtysh show ip bgp neighbor <peer> command. For example:

cumulus@leaf01:~$ nv show vrf default router bgp neighbor swp51 timers
                     operational  applied
-------------------  -----------  -------
keepalive            3            auto   
hold                 9            auto   
connection-retry     10           auto   
route-advertisement  none         auto

BGP Update Groups

You can show information about update group events or information about a specific IPv4 or IPv6 update group.

To show information about update group events, run the vtysh show bgp update-group command or run these NVUE commands:

cumulus@leaf01:~$ nv show vrf default router bgp address-family ipv4-unicast update-group
RouteMap - Outbound route map, MinAdvInterval - Minimum route advertisement     
interval, CreationTime - Time when the update group was created, LocalAsChange -
LocalAs changes for inbound route, Flags - r - replace-as, x - no-prepend       

UpdateGrp  RouteMap  MinAdvInterval  CreationTime          LocalAsChange  Flags
---------  --------  --------------  --------------------  -------------  -----
1                    0               2024-07-08T18:00:57Z                      
3                    0               2024-07-09T20:48:11Z            

To show information about a specific update group, such as the number of peer refresh events, prune events, and packet queue length, run the vtysh show bgp update-group <group-id> command or run these NVUE commands:

cumulus@leaf01:~$ nv show vrf default router bgp address-family ipv4-unicast update-group 1 -o json
{
  "create-time": "2024-10-25T14:02:24Z",
  "min-route-advertisement-interval": 0,
  "sub-group": {
    "1": {
      "adjacency-count": 13,
      "coalesce-time": 1300,
      "counters": {
        "join-events": 5,
        "merge-check-events": 0,
        "merge-events": 3,
        "peer-refresh-events": 0,
        "prune-events": 0,
        "split-events": 0,
        "switch-events": 0
      },
      "create-time": "2024-10-25T14:02:24Z",
      "needs-refresh": "off",
      "neighbor": {
        "peerlink.4094": {},
        "swp51": {},
        "swp52": {},
        "swp53": {},
        "swp54": {}
      },
      "packet-counters": {
        "queue-hwm-len": 2,
        "queue-len": 0,
        "queue-total": 14,
        "total-enqueued": 14
      },
      "sub-group-id": 1,
      "version": 18
    }
  },
  "update-group-id": "1"
}
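
Because the -o json output is structured, you can summarize it with a short script. The Python sketch below parses a trimmed copy of the JSON above and reports, for each subgroup, the neighbors, the current packet queue length, and any event counters that are not zero. The summarize helper is hypothetical, and the key names are taken from the sample output only.

```python
import json

# Trimmed copy of the update group output shown above.
UPDATE_GROUP_JSON = """
{
  "update-group-id": "1",
  "sub-group": {
    "1": {
      "adjacency-count": 13,
      "counters": {
        "join-events": 5,
        "merge-events": 3,
        "peer-refresh-events": 0,
        "prune-events": 0
      },
      "neighbor": {"peerlink.4094": {}, "swp51": {}, "swp52": {}, "swp53": {}, "swp54": {}},
      "packet-counters": {"queue-hwm-len": 2, "queue-len": 0}
    }
  }
}
"""

def summarize(doc):
    """Build one summary row per subgroup in the update group."""
    rows = []
    for sub_id, sub in doc["sub-group"].items():
        rows.append({
            "sub-group": sub_id,
            "neighbors": sorted(sub["neighbor"]),
            "queue-len": sub["packet-counters"]["queue-len"],
            # Keep only the event counters that are not zero.
            "nonzero-counters": {k: v for k, v in sub["counters"].items() if v},
        })
    return rows

summary = summarize(json.loads(UPDATE_GROUP_JSON))
for row in summary:
    print(row)
```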

Show BGP Route Information

You can run NVUE commands to show route statistics for a BGP neighbor, such as the number of routes, and information about advertised and received routes.

To show the RIB table for IPv4 routes, run the nv show vrf <vrf> router rib ipv4 route command. To show the RIB table for IPv6 routes, run the nv show vrf <vrf> router rib ipv6 route command.

cumulus@leaf01:mgmt:~$ nv show vrf default router rib ipv4 route
                                                                                
Flags - * - selected, q - queued, o - offloaded, i - installed, S - fib-        
selected, x - failed                                                            
                                                                                
Route            Protocol   Distance  Uptime                NHGId  Metric  Flags
---------------  ---------  --------  --------------------  -----  ------  -----
10.1.10.0/24     connected  0         2024-07-18T21:57:29Z  46     0       *Sio 
10.1.20.0/24     connected  0         2024-07-18T21:57:29Z  47     0       *Sio 
10.1.30.0/24     connected  0         2024-07-18T21:57:29Z  48     0       *Sio 
10.1.40.0/24     bgp        20        2024-07-18T22:02:22Z  57     0       *Si  
10.1.50.0/24     bgp        20        2024-07-18T22:02:22Z  57     0       *Si  
10.1.60.0/24     bgp        20        2024-07-18T22:02:22Z  57     0       *Si  
10.10.10.1/32    connected  0         2024-07-18T21:55:54Z  7      0       *Sio 
10.10.10.2/32    bgp        20        2024-07-18T21:57:29Z  34     0       *Si  
10.10.10.3/32    bgp        20        2024-07-18T22:02:22Z  57     0       *Si  
10.10.10.4/32    bgp        20        2024-07-18T22:02:27Z  57     0       *Si  
10.10.10.101/32  bgp        20        2024-07-18T22:01:14Z  50     0       *Si  
10.10.10.102/32  bgp        20        2024-07-18T22:02:22Z  58     0       *Si

To show the local RIB routes, run the nv show vrf <vrf> router bgp address-family ipv4-unicast route command for IPv4 or the nv show vrf <vrf> router bgp address-family ipv6-unicast route command for IPv6. You can also run the command with -o json to show the output in JSON format.

cumulus@leaf02:~$ nv show vrf default router bgp address-family ipv4-unicast route
PathCount - Number of paths present for the prefix, MultipathCount - Number of
paths that are part of the ECMP, DestFlags - * - bestpath-exists, w - fib-wait- 
for-install, s - fib-suppress, i - fib-installed, x - fib-install-failed        
                                                                                
Prefix           PathCount  MultipathCount  DestFlags
---------------  ---------  --------------  ---------
10.0.1.12/32     2          1               *        
10.0.1.34/32     5          4               *        
10.0.1.255/32    5          4               *        
10.10.10.1/32    1          1               *        
10.10.10.2/32    5          1               *        
10.10.10.3/32    5          4               *        
10.10.10.4/32    5          4               *        
10.10.10.63/32   5          4               *        
10.10.10.64/32   5          4               *        
10.10.10.101/32  2          1               *        
10.10.10.102/32  2          1               *        
10.10.10.103/32  2          1               *        
10.10.10.104/32  2          1               * 

To show information about a specific local RIB route, run the nv show vrf <vrf> router bgp address-family ipv4-unicast route <route> command for IPv4 or the nv show vrf <vrf> router bgp address-family ipv6-unicast route <route> command for IPv6.

These IPv4 and IPv6 commands show the local RIB route information in brief format to improve performance in high-scale environments. You can also run the command with -o json to show the output in JSON format.

cumulus@leaf01:~$ nv show vrf default router bgp address-family ipv4-unicast route 10.10.10.64/32
                 operational
---------------  -----------
path-count       5          
multipath-count  4

path
=======                                                                           
    Origin - Route origin, Local - Locally originated route, Sourced - Sourced      
    route, Weight - Route weight, Metric - Route metric, LocalPref - Route local    
    preference, PathFrom - Route path origin, LastUpdate - Route last update,       
    NexthopCnt - Number of nexthops, Flags - = - multipath, * - bestpath, v - valid,
    s - suppressed, R - removed, S - stale                                          
                                                                                
    Path  Origin      Local  Sourced  Weight  Metric  LocalPref  PathFrom  LastUpdate            NexthopCnt  Flags
    ----  ----------  -----  -------  ------  ------  ---------  --------  --------------------  ----------  -----
    1     incomplete                                             external  2024-10-25T14:02:33Z  2           =*v  
    2     incomplete                                             external  2024-10-25T14:02:42Z  2           =v   
    3     incomplete                                             external  2024-10-25T14:02:36Z  2           =v   
    4     incomplete                                             external  2024-10-25T14:02:36Z  2           =v   
    5     incomplete                                             external  2024-10-25T14:02:33Z  2           *v   
advertised-to
================
    Neighbor       hostname
    -------------  --------
    peerlink.4094  leaf02  
    swp51          spine01 
    swp52          spine02 
    swp53          spine03 
    swp54          spine04

To show the route count, run the nv show vrf <vrf-id> router bgp neighbor <neighbor-id> address-family ipv4-unicast route-counters command for IPv4 or the nv show vrf <vrf-id> router bgp neighbor <neighbor-id> address-family ipv6-unicast route-counters command for IPv6.

cumulus@leaf01:~$ nv show vrf default router bgp neighbor swp51 address-family ipv4-unicast route-counters
                operational
--------------  -----------
route-count     8          
adj-rib-in      0          
damped          0          
removed         0          
history         0          
stale           0          
valid           8          
all-rib         8          
routes-counted  8          
best-routes     7          
usable          8 
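
The two-column operational tables that NVUE prints are easy to turn into a dictionary for scripted checks. The Python sketch below parses the route counters above and lists any damped, removed, history, or stale counters that are not zero. The parse_operational helper is hypothetical, and treating those four counters as the ones worth checking is an assumption made for illustration.

```python
# Copy of the route counters output shown above.
COUNTERS = """\
                operational
--------------  -----------
route-count     8
adj-rib-in      0
damped          0
removed         0
history         0
stale           0
valid           8
all-rib         8
routes-counted  8
best-routes     7
usable          8
"""

def parse_operational(text):
    """Return {counter: value} for every 'name  number' row in the table."""
    values = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and separator rows, which have no numeric value.
        if len(fields) == 2 and fields[1].isdigit():
            values[fields[0]] = int(fields[1])
    return values

counters = parse_operational(COUNTERS)
needs_attention = {
    name: counters[name]
    for name in ("damped", "removed", "history", "stale")
    if counters[name]
}
print(counters["route-count"], counters["best-routes"], needs_attention)
```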

To show all advertised routes, run the nv show vrf <vrf> router bgp neighbor <neighbor> address-family ipv4-unicast advertised-routes command for IPv4 or the nv show vrf <vrf> router bgp neighbor <neighbor> address-family ipv6-unicast advertised-routes command for IPv6.

These IPv4 and IPv6 commands show advertised routes in brief format to improve performance in high-scale environments. You can also run the command with -o json to show the advertised routes in JSON format.

cumulus@leaf01:~$ nv show vrf default router bgp neighbor swp51 address-family ipv4-unicast advertised-routes 
PathCount - Number of paths present for the prefix, MultipathCount - Number of  
paths that are part of the ECMP                                                 
                                                                                
IPv4 Prefix      PathCount  MultipathCount  DestFlags      
---------------  ---------  --------------  ---------------
10.1.10.0/24     3          1               bestpath-exists
10.1.20.0/24     3          1               bestpath-exists
10.1.30.0/24     3          1               bestpath-exists
10.1.40.0/24     3          2               bestpath-exists
10.1.50.0/24     3          2               bestpath-exists
10.1.60.0/24     3          2               bestpath-exists
10.10.10.1/32    2          1               bestpath-exists
10.10.10.2/32    3          1               bestpath-exists
10.10.10.3/32    3          2               bestpath-exists
10.10.10.4/32    3          2               bestpath-exists
10.10.10.101/32  2          1               bestpath-exists
10.10.10.102/32  2          1               bestpath-exists

To show information about a specific advertised route, run the nv show vrf <vrf> router bgp neighbor <neighbor> address-family ipv4-unicast advertised-routes <route> command for IPv4 or the nv show vrf <vrf> router bgp neighbor <neighbor> address-family ipv6-unicast advertised-routes <route> command for IPv6.

cumulus@leaf01:~$ nv show vrf default router bgp neighbor swp51 address-family ipv4-unicast advertised-routes 10.10.10.1/32
                 operational
---------------  -----------
path-count       2          
multipath-count  1
path
====================
                                                                                
    Origin - Route origin, Local - Locally originated route, Sourced - Sourced      
    route, Weight - Route weight, Metric - Route metric, LocalPref - Route local    
    preference, PathFrom - Route path origin, LastUpdate - Route last update,       
    NexthopCnt - Number of nexthops, Flags - = - multipath, * - bestpath, v - valid,
    s - suppressed, R - removed, S - stale                                          
                                                                                
    Path  Origin      Local  Sourced  Weight  Metric  LocalPref  PathFrom  LastUpdate            NexthopCnt  Flags
    ----  ----------  -----  -------  ------  ------  ---------  --------  --------------------  ----------  -----
    1     IGP         on     on       32768   0                            2024-07-18T21:55:54Z  1           *v   
    2     incomplete         on       32768   0                            2024-07-18T21:55:54Z  1           v 
...

To show all the received routes, run the nv show vrf <vrf> router bgp neighbor <neighbor> address-family ipv4-unicast received-routes command for IPv4 or the nv show vrf <vrf> router bgp neighbor <neighbor> address-family ipv6-unicast received-routes command for IPv6. These commands show received routes in brief format to improve performance in high-scale environments. You can also run the command with --view=detail to see more detailed information or with -o json to show the received routes in JSON format.

To show information about a specific received route, run the nv show vrf <vrf> router bgp neighbor <neighbor> address-family ipv4-unicast received-routes <route> -o json for IPv4 or nv show vrf <vrf> router bgp neighbor <neighbor> address-family ipv6-unicast received-routes <route> -o json for IPv6.
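
The advertised and received route commands share one pattern, so a script can compose them from a few parameters. The following Python sketch builds the argument list for the commands described above; the bgp_neighbor_route_command helper is hypothetical. On a switch, you could pass the resulting list to a process runner such as subprocess.run, which this sketch does not do.

```python
def bgp_neighbor_route_command(vrf, neighbor, family, view, route=None, as_json=False):
    """Compose an nv show command for advertised or received routes."""
    if family not in ("ipv4-unicast", "ipv6-unicast"):
        raise ValueError("family must be ipv4-unicast or ipv6-unicast")
    if view not in ("advertised-routes", "received-routes"):
        raise ValueError("view must be advertised-routes or received-routes")
    args = ["nv", "show", "vrf", vrf, "router", "bgp", "neighbor", neighbor,
            "address-family", family, view]
    if route is not None:
        args.append(route)       # show a single route instead of the full list
    if as_json:
        args += ["-o", "json"]   # JSON output, as described above
    return args

cmd = bgp_neighbor_route_command(
    "default", "swp51", "ipv4-unicast", "received-routes",
    route="10.10.10.1/32", as_json=True,
)
print(" ".join(cmd))
```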

Show Next Hop Information

To show a summary of all the BGP IPv4 or IPv6 next hops, run the nv show vrf <vrf> router bgp nexthop ipv4 or nv show vrf <vrf> router bgp nexthop ipv6 command. The output shows the IGP metric, the number of paths pointing to a next hop, and the address or interface used to reach a next hop.

cumulus@leaf01:mgmt:~$ nv show vrf default router bgp nexthop ipv4
Nexthops
===========
                                                                                 
    PathCnt - Number of paths pointing to this Nexthop, ResolvedVia - Resolved via   
    address or interface, Interface - Resolved via interface                         
                                                                                 
    Address      IGPMetric  Valid  PathCnt  ResolvedVia                Interface    
    -----------  ---------  -----  -------  -------------------------  -------------
    10.0.1.34    0          on     160      fe80::4ab0:2dff:fe60:910e  swp54        
                                            fe80::4ab0:2dff:fea7:7852  swp53        
                                            fe80::4ab0:2dff:fec8:8fb9  swp52        
                                            fe80::4ab0:2dff:feff:e147  swp51        
    10.10.10.2   0          on     15       fe80::4ab0:2dff:fe2d:495c  peerlink.4094
    10.10.10.3   0          on     15       fe80::4ab0:2dff:fe60:910e  swp54        
                                            fe80::4ab0:2dff:fea7:7852  swp53        
                                            fe80::4ab0:2dff:fec8:8fb9  swp52        
                                            fe80::4ab0:2dff:feff:e147  swp51        
    10.10.10.4   0          on     15       fe80::4ab0:2dff:fe60:910e  swp54        
                                            fe80::4ab0:2dff:fea7:7852  swp53        
                                            fe80::4ab0:2dff:fec8:8fb9  swp52        
                                            fe80::4ab0:2dff:feff:e147  swp51        
    10.10.10.63  0          on     15       fe80::4ab0:2dff:fe60:910e  swp54        
                                            fe80::4ab0:2dff:fea7:7852  swp53        
                                            fe80::4ab0:2dff:fec8:8fb9  swp52        
                                            fe80::4ab0:2dff:feff:e147  swp51        
    10.10.10.64  0          on     15       fe80::4ab0:2dff:fe60:910e  swp54        
                                            fe80::4ab0:2dff:fea7:7852  swp53        
                                            fe80::4ab0:2dff:fec8:8fb9  swp52        
                                            fe80::4ab0:2dff:feff:e147  swp51    

To show information about a specific next hop, run the NVUE nv show vrf <vrf-id> router bgp nexthop ipv4 ip-address <ip-address> command for IPv4 or the nv show vrf <vrf-id> router bgp nexthop ipv6 ip-address <ip-address> command for IPv6. You can also run the vtysh show bgp vrf default nexthop <ip-address> command.

cumulus@leaf01:mgmt:~$ nv show vrf default router bgp nexthop ipv4 ip-address 10.10.10.2
                  operational              
----------------  -------------------------
valid             yes                      
complete          on                       
igp-metric        0                        
path-count        15                       
last-update-time  2024-10-25T14:02:32Z     
[resolved-via]    fe80::4ab0:2dff:fee8:57ba

To show through which address and interface BGP resolves a specific next hop, run the nv show vrf <vrf-id> router bgp nexthop ipv4 ip-address <ip-address-id> resolved-via command for IPv4 or the nv show vrf <vrf-id> router bgp nexthop ipv6 ip-address <ip-address-id> resolved-via command for IPv6.

cumulus@leaf01:~$ nv show vrf default router bgp nexthop ipv4 ip-address 10.10.10.2 resolved-via
Nexthop                    interface
-------------------------  ---------
fe80::4ab0:2dff:fe20:ac25  swp51    
fe80::4ab0:2dff:fe93:d92d  swp52

Troubleshoot BGP Unnumbered

To verify that FRR learns the neighboring link-local IPv6 address through the IPv6 neighbor discovery router advertisements on a given interface, run the vtysh show interface <interface> command.

If you do not enable ipv6 nd suppress-ra on both ends of the interface, Neighbor address(s): shows the link-local address of the other end (the address that BGP uses when it peers over that interface).

Cumulus Linux automatically enables IPv6 router advertisements (RAs) on an interface with IPv6 addresses. You do not need to run the no ipv6 nd suppress-ra command for BGP unnumbered.

cumulus@switch:~$ sudo vtysh
...
switch# show interface swp51
  Interface swp51 is up, line protocol is up
  Link ups:       0    last: (never)
  Link downs:     0    last: (never)
  PTM status: disabled
  vrf: default
  OS Description: leaf to spine
  index 8 metric 0 mtu 9216 speed 1000
  flags: <UP,BROADCAST,RUNNING,MULTICAST>
  Type: Ethernet
  HWaddr: 10:d8:68:d4:a6:81
  inet6 fe80::12d8:68ff:fed4:a681/6
  Interface Type Other
  protodown: off
  ND advertised reachable time is 0 milliseconds
  ND advertised retransmit interval is 0 milliseconds
  ND advertised hop-count limit is 64 hops
  ND router advertisements sent: 217 rcvd: 216
  ND router advertisements are sent every 10 seconds
  ND router advertisements lifetime tracks ra-interval
  ND router advertisement default router preference is medium
  Hosts use stateless autoconfig for addresses.
  Neighbor address(s):
  inet6 fe80::f208:5fff:fe12:cc8c/128
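
To check the learned neighbor address from a script, you can extract the lines that follow Neighbor address(s): and confirm that each address is link-local. The Python sketch below works on a trimmed copy of the output above; the neighbor_addresses helper is hypothetical and assumes the layout shown in the sample.

```python
import ipaddress

# Trimmed copy of the show interface output above.
SHOW_INTERFACE = """\
  Interface swp51 is up, line protocol is up
  ND router advertisements sent: 217 rcvd: 216
  Hosts use stateless autoconfig for addresses.
  Neighbor address(s):
  inet6 fe80::f208:5fff:fe12:cc8c/128
"""

def neighbor_addresses(text):
    """Collect the inet6 addresses listed after 'Neighbor address(s):'."""
    found = []
    in_section = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Neighbor address(s):"):
            in_section = True
        elif in_section and stripped.startswith("inet6 "):
            # Drop the prefix length and keep the address itself.
            found.append(ipaddress.ip_interface(stripped.split()[1]).ip)
        elif in_section:
            break  # any other line ends the neighbor section
    return found

addresses = neighbor_addresses(SHOW_INTERFACE)
for address in addresses:
    print(address, address.is_link_local)
```

An empty result suggests that the interface has not learned a neighbor address from router advertisements.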

Troubleshoot IPv4 Prefixes Learned with IPv6 Next Hops

To show IPv4 prefixes learned with IPv6 next hops, run the following commands.

The following examples show an IPv4 prefix learned from a BGP peer over an IPv6 session using IPv6 global addresses, but where the next hop installed by BGP is a link-local IPv6 address. This occurs when the session is directly between peers, and the BGP update for the prefix includes both link-local and global IPv6 addresses as next hops. If both global and link-local next hops exist, BGP prefers the link-local address for route installation.

cumulus@spine01:mgmt:~$ sudo vtysh
...
spine01# show ip bgp ipv4 unicast summary
BGP router identifier 10.10.10.101, local AS number 65199 vrf-id 0
BGP table version 3
RIB entries 3, using 576 bytes of memory
Peers 1, using 21 KiB of memory

Neighbor                   V      AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd
leaf01(2001:db8:2::a00:1) 4     65101       22        22        0    0    0  00:01:00           0

Total number of neighbors 1
cumulus@spine01:mgmt:~$ sudo vtysh
...
spine01# show ip bgp ipv4 unicast
BGP table version is 3, local router ID is 10.10.10.101, vrf id 0
Default local pref 100, local AS 65199
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete

   Network          Next Hop                Metric LocPrf Weight Path
   10.10.10.101/32   fe80::a00:27ff:fea6:b9fe      0     0   32768 i

Displayed  1 routes and 1 total paths
cumulus@spine01:~$ sudo vtysh
...
spine01# show ip bgp ipv4 unicast 10.10.10.101/32
BGP routing table entry for 10.10.10.101/32
Paths: (1 available, best #1, table default)
  Advertised to non peer-group peers:
  Leaf01(2001:db8:0002::0a00:1)
  3
    2001:db8:0002::0a00:1 from Leaf01(2001:db8:0002::0a00:1) (10.10.10.101)
    (fe80::a00:27ff:fea6:b9fe) (used)
      Origin IGP, metric 0, valid, external, bestpath-from-AS 3, best (First path received)
      AddPath ID: RX 0, TX 3
      Last update: Mon Oct 22 08:09:22 2018

The example output below shows the results of installing the route in the FRR RIB as well as the kernel FIB. The next hop installed in the FRR RIB is the link-local IPv6 address, which Cumulus Linux converts into an IPv4 link-local address, as required for installation into the kernel FIB.

cumulus@spine01:~$ sudo vtysh
...
spine01# show ip route 10.10.10.101/32
RIB entry for 10.10.10.101/32
===========================
Routing entry for 10.10.10.101/32
  Known via "bgp", distance 20, metric 0, best
  Last update 2d17h05m ago
  * fe80::a00:27ff:fea6:b9fe, via swp1

FIB entry for 10.10.10.101/32
===========================
10.10.10.101/32 via 10.0.1.0 dev swp1 proto bgp metric 20 onlink
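
To compare the next hop in the kernel FIB with the FRR RIB entry, you can read the kernel route directly. The following is a minimal sketch; fib_next_hops is a hypothetical helper, not a Cumulus Linux command, and it assumes the kernel route line format shown in the FIB entry above.

```shell
# Read kernel route lines on stdin and print "prefix next-hop device" for each.
# fib_next_hops is a hypothetical helper name used only for this illustration.
fib_next_hops() {
    awk '{
        nh = "-"; dev = "-"
        for (i = 2; i < NF; i++) {
            if ($i == "via") {
                nh = $(i + 1)
                # Some iproute2 versions print an address family keyword after "via".
                if (nh == "inet" || nh == "inet6") nh = $(i + 2)
            }
            if ($i == "dev") dev = $(i + 1)
        }
        print $1, nh, dev
    }'
}

# Sample input: the FIB line from the example above.
echo '10.10.10.101/32 via 10.0.1.0 dev swp1 proto bgp metric 20 onlink' | fib_next_hops
```

On a switch, you can pipe the kernel route into the helper, for example: ip route show 10.10.10.101/32 | fib_next_hops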

If BGP learns an IPv4 prefix with only an IPv6 global next hop address (for example, when it learns the route through a route reflector), the command output shows the IPv6 global address as the next hop value. The output also shows that the switch resolves this next hop recursively through the link-local address of the route reflector. When you use a global IPv6 address as a next hop for route installation in the FRR RIB, the switch still converts it into an IPv4 link-local address for installation into the kernel.

cumulus@leaf01:~$ sudo vtysh
...
leaf01# show ip bgp ipv4 unicast summary
BGP router identifier 10.10.10.1, local AS number 65101 vrf-id 0
BGP table version 1
RIB entries 1, using 152 bytes of memory
Peers 1, using 19 KiB of memory

Neighbor             V AS MsgRcvd  MsgSent  TblVer  InQ  OutQ  Up/Down  State/PfxRcd
Spine01(2001:db8:0002::0a00:2) 4 1   74       68         0     0     0     00:00:45      1

Total number of neighbors 1
cumulus@leaf01:~$ sudo vtysh
...
leaf01# show ip bgp ipv4 unicast
  BGP table version is 1, local router ID is 10.10.10.1
  Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
                i internal, r RIB-failure, S Stale, R Removed
  Origin codes: i - IGP, e - EGP, ? - incomplete

Network          Next Hop    Metric LocPrf Weight Path
*>i10.1.10.0/24 2001:2:2::4       0    100      0    i

Displayed 1 routes and 1 total paths
cumulus@leaf01:~$ sudo vtysh
...
leaf01# show ip bgp ipv4 unicast 10.10.10.101/32
BGP routing table entry for 10.10.10.101/32
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  Local
  2001:2:2::4 from Spine01(2001:1:1::1) (10.10.10.104)
    Origin IGP, metric 0, localpref 100, valid, internal, bestpath-from-AS Local, best (First path received)
    Originator: 10.0.0.14, Cluster list: 10.10.10.111
    AddPath ID: RX 0, TX 5
    Last update: Mon Oct 22 14:25:30 2018
cumulus@leaf01:~$ sudo vtysh
...
leaf01# show ip route 10.10.10.1/32
RIB entry for 10.10.10.1/32
===========================
Routing entry for 10.10.10.1/32
  Known via "bgp", distance 200, metric 0, best
  Last update 00:01:13 ago
  2001:2:2::4 (recursive)
  * fe80::a00:27ff:fe5a:84ae, via swp1

FIB entry for 10.10.10.1/32
===========================
10.10.10.1/32 via 10.0.1.1 dev swp1 proto bgp metric 20 onlink

To only use IPv6 global addresses for route installation into the FRR RIB, you must add an additional route map to the neighbor or peer group statement in the appropriate address family. When the route map command set ipv6 next-hop prefer-global applies to a neighbor, if both a link-local and global IPv6 address are in the BGP update for a prefix, BGP uses the IPv6 global address for route installation.
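
The next hop selection rule described in this section can be sketched in a few lines of shell. This is an illustration only, not FRR code; is_link_local and pick_next_hop are hypothetical names, and the link-local test relies on the fe80::/10 prefix reserved for IPv6 link-local addresses.

```shell
# Return success if the IPv6 address is link-local (fe80::/10, first group fe80 to febf).
is_link_local() {
    case "$1" in
        [fF][eE][89aAbB][0-9a-fA-F]:*) return 0 ;;
        *) return 1 ;;
    esac
}

# Pick the next hop to install from the next hops carried in one BGP update.
# Usage: pick_next_hop MODE ADDRESS...
#   MODE "link-local" models the default behavior.
#   MODE "global" models a neighbor with set ipv6 next-hop prefer-global.
pick_next_hop() {
    mode="$1"; shift
    global=""; linklocal=""
    for addr in "$@"; do
        if is_link_local "$addr"; then linklocal="$addr"; else global="$addr"; fi
    done
    if [ "$mode" = "global" ] && [ -n "$global" ]; then
        echo "$global"
    elif [ -n "$linklocal" ]; then
        echo "$linklocal"
    else
        # Only a global address is present, as in the route reflector case.
        echo "$global"
    fi
}

# Both next hops present: the default prefers the link-local address.
pick_next_hop link-local 2001:db8:2::a00:1 fe80::a00:27ff:fea6:b9fe
# Same update with prefer-global: the global address is used instead.
pick_next_hop global 2001:db8:2::a00:1 fe80::a00:27ff:fea6:b9fe
```

The first call prints the link-local address and the second prints the global address. When an update carries only a global address, both modes return that address.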

The following configuration applies the route map in the direct neighbor case:

router bgp 65101
  bgp router-id 10.10.10.1
  neighbor 2001:db8:2::a00:1 remote-as internal
  neighbor 2001:db8:2::a00:1 capability extended-nexthop
  !
  address-family ipv4 unicast
  neighbor 2001:db8:2::a00:1 route-map GLOBAL in
  exit-address-family
!
route-map GLOBAL permit 20
  set ipv6 next-hop prefer-global
!

The resulting FRR RIB output is as follows:

cumulus@leaf01:~$ sudo vtysh
...
leaf01# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
    O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
    T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
    F - PBR,
    > - selected route, * - FIB route

B 0.0.0.0/0 [200/0] via 2001:2:2::4, swp2, 00:01:00
K 0.0.0.0/0 [0/0] via 10.0.2.2, eth0, 1d02h29m
C>* 10.0.0.9/32 is directly connected, lo, 5d18h32m
C>* 10.0.2.0/24 is directly connected, eth0, 03:51:31
B>* 172.16.4.0/24 [200/0] via 2001:2:2::4, swp2, 00:01:00
C>* 172.16.10.0/24 is directly connected, swp3, 5d18h32m

When the switch learns the route through a route reflector, the configuration and the resulting FRR RIB output are as follows:

router bgp 65101
  bgp router-id 10.10.10.1
  neighbor 2001:db8:2::a00:2 remote-as internal
  neighbor 2001:db8:2::a00:2 capability extended-nexthop
  !
  address-family ipv6 unicast
  neighbor 2001:db8:2::a00:2 activate
  neighbor 2001:db8:2::a00:2 route-map GLOBAL in
  exit-address-family
!
route-map GLOBAL permit 10
  set ipv6 next-hop prefer-global
cumulus@leaf01:~$ sudo vtysh
...
leaf01# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR,
       > - selected route, * - FIB route

B   0.0.0.0/0 [200/0] via 2001:2:2::4, 00:00:01
K   0.0.0.0/0 [0/0] via 10.0.2.2, eth0, 3d00h26m
C>* 10.0.0.8/32 is directly connected, lo, 3d00h26m
C>* 10.0.2.0/24 is directly connected, eth0, 03:39:18
C>* 172.16.3.0/24 is directly connected, swp2, 3d00h26m
B>  172.16.4.0/24 [200/0] via 2001:2:2::4 (recursive), 00:00:01
  *                         via 2001:1:1::1, swp1, 00:00:01
C>* 172.16.10.0/24 is directly connected, swp3, 3d00h26m

Neighbor State Change Log

Cumulus Linux records the changes that a neighbor goes through in syslog and in the /var/log/frr/frr.log file. For example:

2020-10-05T15:51:32.621773-07:00 leaf01 bgpd[10104]: %NOTIFICATION: sent to neighbor peerlink.4094 6/7 (Cease/Connection collision resolution) 0 bytes
2020-10-05T15:51:32.623023-07:00 leaf01 bgpd[10104]: %ADJCHANGE: neighbor peerlink.4094(leaf02) in vrf default Up
2020-10-05T15:51:32.623156-07:00 leaf01 bgpd[10104]: %NOTIFICATION: sent to neighbor peerlink.4094 6/7 (Cease/Connection collision resolution) 0 bytes
2020-10-05T15:51:32.623496-07:00 leaf01 bgpd[10104]: %ADJCHANGE: neighbor peerlink.4094(leaf02) in vrf default Down No AFI/SAFI activated for peer
2020-10-05T15:51:33.040332-07:00 leaf01 bgpd[10104]: [EC 33554454] swp53 [Error] bgp_read_packet error: Connection reset by peer
2020-10-05T15:51:33.279468-07:00 leaf01 bgpd[10104]: [EC 33554454] swp52 [Error] bgp_read_packet error: Connection reset by peer
2020-10-05T15:51:33.339487-07:00 leaf01 bgpd[10104]: %ADJCHANGE: neighbor swp54(spine04) in vrf default Up
2020-10-05T15:51:33.340893-07:00 leaf01 bgpd[10104]: %ADJCHANGE: neighbor swp53(spine03) in vrf default Up
2020-10-05T15:51:33.341648-07:00 leaf01 bgpd[10104]: %ADJCHANGE: neighbor swp52(spine02) in vrf default Up
2020-10-05T15:51:33.342369-07:00 leaf01 bgpd[10104]: %ADJCHANGE: neighbor swp51(spine01) in vrf default Up
2020-10-05T15:51:33.627958-07:00 leaf01 bgpd[10104]: %ADJCHANGE: neighbor peerlink.4094(leaf02) in vrf default Up
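
To reduce a busy log to the adjacency changes only, you can filter for the %ADJCHANGE messages. The following is a minimal sketch; bgp_adj_changes is a hypothetical helper name, and the field positions assume the message format in the excerpt above.

```shell
# Read FRR log lines on stdin and print "timestamp neighbor state"
# for each BGP adjacency change (%ADJCHANGE) message.
bgp_adj_changes() {
    awk '/%ADJCHANGE:/ {
        nbr = ""; state = ""
        for (i = 1; i <= NF; i++) {
            # The neighbor name follows the word "neighbor".
            if ($i == "neighbor" && nbr == "") nbr = $(i + 1)
            # The first Up or Down keyword is the new session state.
            if ($i == "Up" || $i == "Down") { state = $i; break }
        }
        print $1, nbr, state
    }'
}

# Sample input: two lines copied from the log excerpt above.
printf '%s\n' \
  '2020-10-05T15:51:32.623496-07:00 leaf01 bgpd[10104]: %ADJCHANGE: neighbor peerlink.4094(leaf02) in vrf default Down No AFI/SAFI activated for peer' \
  '2020-10-05T15:51:33.040332-07:00 leaf01 bgpd[10104]: [EC 33554454] swp53 [Error] bgp_read_packet error: Connection reset by peer' \
  | bgp_adj_changes
```

On a switch, you can feed the log file to the helper, for example: sudo cat /var/log/frr/frr.log | bgp_adj_changes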

Clear BGP Routes

NVUE provides commands to clear and refresh routes in the BGP table. You can clear all routes in the BGP table or all routes for an address family (IPv4, IPv6, or EVPN) in a VRF.

The BGP clear commands do not clear counters in the kernel or hardware.

To clear and refresh all IPv4 inbound routes:

cumulus@leaf01:~$ nv action clear vrf default router bgp address-family ipv4-unicast soft in

To clear and resend all IPv6 outbound routes:

cumulus@leaf01:~$ nv action clear vrf default router bgp address-family ipv6-unicast soft out

To clear and refresh all EVPN inbound routes:

cumulus@leaf01:~$ nv action clear vrf default router bgp address-family l2vpn-evpn soft in

To clear and resend all outbound IPv4 routes:

cumulus@leaf01:~$ nv action clear vrf default router bgp address-family ipv4-unicast soft out

To clear and resend all IPv6 outbound routes to BGP neighbor 10.10.10.101:

cumulus@leaf01:~$ nv action clear vrf default router bgp neighbor 10.10.10.101 address-family ipv6-unicast out

To clear and resend outbound routes for all address families (IPv4, IPv6, and l2vpn-evpn) for the BGP peer group SPINES:

cumulus@leaf01:~$ nv action clear vrf default router bgp peer-group SPINES out

To clear and refresh all inbound routes for all VRFs and address families:

cumulus@switch:~$ nv action clear router bgp soft in
Action succeeded

To clear and refresh inbound routes for all neighbors, address families, and VRFs and to refresh the outbound route filtering prefix-list:

cumulus@switch:~$ nv action clear router bgp in prefix-filter
Action succeeded

To clear BGP sessions with all neighbors, forcing the neighbors to restart:

cumulus@switch:~$ nv action clear router bgp
Action succeeded
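
The per address family soft clear commands above follow one pattern. As a minimal sketch, the hypothetical helper below (not part of NVUE) prints the command for each address family so that you can review the commands before you run them. Only the nv action clear syntax comes from the commands above; the function name and the dry-run approach are assumptions, and you might need to confirm that a given address family and direction combination is available on your release.

```shell
# Build the NVUE soft clear command for one VRF, address family, and direction.
# bgp_soft_clear_cmd is a hypothetical helper; it only prints the command.
bgp_soft_clear_cmd() {
    vrf="$1"; af="$2"; dir="$3"
    case "$af" in
        ipv4-unicast|ipv6-unicast|l2vpn-evpn) ;;
        *) echo "unsupported address family: $af" >&2; return 1 ;;
    esac
    case "$dir" in
        in|out) ;;
        *) echo "direction must be in or out: $dir" >&2; return 1 ;;
    esac
    echo "nv action clear vrf $vrf router bgp address-family $af soft $dir"
}

# Print one inbound soft clear command per address family for review.
for af in ipv4-unicast ipv6-unicast l2vpn-evpn; do
    bgp_soft_clear_cmd default "$af" in
done
```

After you review the printed commands, you can run them individually on the switch.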

To clear and refresh all IPv4 inbound routes:

cumulus@spine01:~$ sudo vtysh
...
switch# clear bgp vrf default ipv4 unicast * soft in
switch# exit

To clear and resend all IPv6 outbound routes:

cumulus@spine01:~$ sudo vtysh
...
switch# clear bgp vrf default ipv6 unicast * soft out
switch# exit

To clear and refresh all EVPN inbound routes:

cumulus@spine01:~$ sudo vtysh
...
switch# clear bgp vrf default l2vpn evpn * soft in
switch# exit

To clear and resend all outbound IPv4 routes:

cumulus@spine01:~$ sudo vtysh
...
switch# clear bgp vrf default ipv4 unicast * soft out
switch# exit

To clear and resend all IPv6 outbound routes to BGP neighbor 10.10.10.101:

cumulus@spine01:~$ sudo vtysh
...
switch# clear bgp vrf default ipv6 unicast 10.10.10.101 out
switch# exit

To clear and resend outbound routes for all address families (IPv4, IPv6, and l2vpn-evpn) for the BGP peer group SPINES:

cumulus@spine01:~$ sudo vtysh
...
switch# clear bgp vrf default peer-group SPINES out
switch# exit

To clear and refresh inbound routes for all neighbors, address families, and VRFs and to refresh the outbound route filtering prefix-list:

cumulus@spine01:~$ sudo vtysh
...
switch# clear bgp in prefix-filter
switch# exit

To clear BGP sessions with all neighbors, forcing the neighbors to restart:

cumulus@switch:~$ sudo vtysh
...
switch# clear bgp *
switch# write memory
switch# exit

Configuration Example

This section shows a BGP configuration example based on the reference topology. The example configures BGP unnumbered on all leafs and spines, and MLAG on leaf01 and leaf02, and on leaf03 and leaf04.

cumulus@leaf01:mgmt:~$ nv set interface lo ip address 10.10.10.1/32
cumulus@leaf01:mgmt:~$ nv set interface swp1-3,swp49-52
cumulus@leaf01:mgmt:~$ nv set interface bond1 bond member swp1
cumulus@leaf01:mgmt:~$ nv set interface bond2 bond member swp2
cumulus@leaf01:mgmt:~$ nv set interface bond3 bond member swp3
cumulus@leaf01:mgmt:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf01:mgmt:~$ nv set interface bond2 bond mlag id 2
cumulus@leaf01:mgmt:~$ nv set interface bond3 bond mlag id 3
cumulus@leaf01:mgmt:~$ nv set interface bond1-3 bridge domain br_default 
cumulus@leaf01:mgmt:~$ nv set interface peerlink bond member swp49-50
cumulus@leaf01:mgmt:~$ nv set mlag mac-address 44:38:39:BE:EF:AA
cumulus@leaf01:mgmt:~$ nv set mlag backup 10.10.10.2
cumulus@leaf01:mgmt:~$ nv set mlag peer-ip linklocal
cumulus@leaf01:mgmt:~$ nv set interface vlan10 ip address 10.1.10.2/24
cumulus@leaf01:mgmt:~$ nv set interface vlan20 ip address 10.1.20.2/24
cumulus@leaf01:mgmt:~$ nv set interface vlan30 ip address 10.1.30.2/24
cumulus@leaf01:mgmt:~$ nv set bridge domain br_default vlan 10,20,30
cumulus@leaf01:mgmt:~$ nv set bridge domain br_default untagged 1
cumulus@leaf01:mgmt:~$ nv set router bgp autonomous-system 65101
cumulus@leaf01:mgmt:~$ nv set router bgp router-id 10.10.10.1
cumulus@leaf01:mgmt:~$ nv set vrf default router bgp neighbor swp51 remote-as external
cumulus@leaf01:mgmt:~$ nv set vrf default router bgp neighbor swp52 remote-as external
cumulus@leaf01:mgmt:~$ nv set vrf default router bgp neighbor peerlink.4094 remote-as external
cumulus@leaf01:mgmt:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.1/32
cumulus@leaf01:mgmt:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@leaf01:mgmt:~$ nv config apply
cumulus@leaf02:mgmt:~$ nv set interface lo ip address 10.10.10.2/32
cumulus@leaf02:mgmt:~$ nv set interface swp1-3,swp49-52
cumulus@leaf02:mgmt:~$ nv set interface bond1 bond member swp1
cumulus@leaf02:mgmt:~$ nv set interface bond2 bond member swp2
cumulus@leaf02:mgmt:~$ nv set interface bond3 bond member swp3
cumulus@leaf02:mgmt:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf02:mgmt:~$ nv set interface bond2 bond mlag id 2
cumulus@leaf02:mgmt:~$ nv set interface bond3 bond mlag id 3
cumulus@leaf02:mgmt:~$ nv set interface bond1-3 bridge domain br_default 
cumulus@leaf02:mgmt:~$ nv set interface peerlink bond member swp49-50
cumulus@leaf02:mgmt:~$ nv set mlag mac-address 44:38:39:BE:EF:AA
cumulus@leaf02:mgmt:~$ nv set mlag backup 10.10.10.1
cumulus@leaf02:mgmt:~$ nv set mlag peer-ip linklocal
cumulus@leaf02:mgmt:~$ nv set interface vlan10 ip address 10.1.10.3/24
cumulus@leaf02:mgmt:~$ nv set interface vlan20 ip address 10.1.20.3/24
cumulus@leaf02:mgmt:~$ nv set interface vlan30 ip address 10.1.30.3/24
cumulus@leaf02:mgmt:~$ nv set bridge domain br_default vlan 10,20,30
cumulus@leaf02:mgmt:~$ nv set bridge domain br_default untagged 1
cumulus@leaf02:mgmt:~$ nv set router bgp autonomous-system 65102
cumulus@leaf02:mgmt:~$ nv set router bgp router-id 10.10.10.2
cumulus@leaf02:mgmt:~$ nv set vrf default router bgp neighbor swp51 remote-as external
cumulus@leaf02:mgmt:~$ nv set vrf default router bgp neighbor swp52 remote-as external
cumulus@leaf02:mgmt:~$ nv set vrf default router bgp neighbor peerlink.4094 remote-as external
cumulus@leaf02:mgmt:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.2/32
cumulus@leaf02:mgmt:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@leaf02:mgmt:~$ nv config apply
cumulus@leaf03:mgmt:~$ nv set interface lo ip address 10.10.10.3/32
cumulus@leaf03:mgmt:~$ nv set interface swp1-3,swp49-52
cumulus@leaf03:mgmt:~$ nv set interface bond1 bond member swp1
cumulus@leaf03:mgmt:~$ nv set interface bond2 bond member swp2
cumulus@leaf03:mgmt:~$ nv set interface bond3 bond member swp3
cumulus@leaf03:mgmt:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf03:mgmt:~$ nv set interface bond2 bond mlag id 2
cumulus@leaf03:mgmt:~$ nv set interface bond3 bond mlag id 3
cumulus@leaf03:mgmt:~$ nv set interface bond1-3 bridge domain br_default 
cumulus@leaf03:mgmt:~$ nv set interface peerlink bond member swp49-50
cumulus@leaf03:mgmt:~$ nv set mlag mac-address 44:38:39:BE:EF:AA
cumulus@leaf03:mgmt:~$ nv set mlag backup 10.10.10.4
cumulus@leaf03:mgmt:~$ nv set mlag peer-ip linklocal
cumulus@leaf03:mgmt:~$ nv set interface vlan40 ip address 10.1.40.4/24
cumulus@leaf03:mgmt:~$ nv set interface vlan50 ip address 10.1.50.4/24
cumulus@leaf03:mgmt:~$ nv set interface vlan60 ip address 10.1.60.4/24
cumulus@leaf03:mgmt:~$ nv set bridge domain br_default vlan 40,50,60
cumulus@leaf03:mgmt:~$ nv set bridge domain br_default untagged 1
cumulus@leaf03:mgmt:~$ nv set router bgp autonomous-system 65103
cumulus@leaf03:mgmt:~$ nv set router bgp router-id 10.10.10.3
cumulus@leaf03:mgmt:~$ nv set vrf default router bgp neighbor swp51 remote-as external
cumulus@leaf03:mgmt:~$ nv set vrf default router bgp neighbor swp52 remote-as external
cumulus@leaf03:mgmt:~$ nv set vrf default router bgp neighbor peerlink.4094 remote-as external
cumulus@leaf03:mgmt:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.3/32
cumulus@leaf03:mgmt:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@leaf03:mgmt:~$ nv config apply
cumulus@leaf04:mgmt:~$ nv set interface lo ip address 10.10.10.4/32
cumulus@leaf04:mgmt:~$ nv set interface swp1-3,swp49-52
cumulus@leaf04:mgmt:~$ nv set interface bond1 bond member swp1
cumulus@leaf04:mgmt:~$ nv set interface bond2 bond member swp2
cumulus@leaf04:mgmt:~$ nv set interface bond3 bond member swp3
cumulus@leaf04:mgmt:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf04:mgmt:~$ nv set interface bond2 bond mlag id 2
cumulus@leaf04:mgmt:~$ nv set interface bond3 bond mlag id 3
cumulus@leaf04:mgmt:~$ nv set interface bond1-3 bridge domain br_default 
cumulus@leaf04:mgmt:~$ nv set interface peerlink bond member swp49-50
cumulus@leaf04:mgmt:~$ nv set mlag mac-address 44:38:39:BE:EF:AA
cumulus@leaf04:mgmt:~$ nv set mlag backup 10.10.10.3
cumulus@leaf04:mgmt:~$ nv set mlag peer-ip linklocal
cumulus@leaf04:mgmt:~$ nv set interface vlan40 ip address 10.1.40.5/24
cumulus@leaf04:mgmt:~$ nv set interface vlan50 ip address 10.1.50.5/24
cumulus@leaf04:mgmt:~$ nv set interface vlan60 ip address 10.1.60.5/24
cumulus@leaf04:mgmt:~$ nv set bridge domain br_default vlan 40,50,60
cumulus@leaf04:mgmt:~$ nv set bridge domain br_default untagged 1
cumulus@leaf04:mgmt:~$ nv set router bgp autonomous-system 65104
cumulus@leaf04:mgmt:~$ nv set router bgp router-id 10.10.10.4
cumulus@leaf04:mgmt:~$ nv set vrf default router bgp neighbor swp51 remote-as external
cumulus@leaf04:mgmt:~$ nv set vrf default router bgp neighbor swp52 remote-as external
cumulus@leaf04:mgmt:~$ nv set vrf default router bgp neighbor peerlink.4094 remote-as external
cumulus@leaf04:mgmt:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.4/32
cumulus@leaf04:mgmt:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected
cumulus@leaf04:mgmt:~$ nv config apply
cumulus@spine01:mgmt:~$ nv set interface lo ip address 10.10.10.101/32
cumulus@spine01:mgmt:~$ nv set interface swp1-4
cumulus@spine01:mgmt:~$ nv set router bgp autonomous-system 65199
cumulus@spine01:mgmt:~$ nv set router bgp router-id 10.10.10.101
cumulus@spine01:mgmt:~$ nv set vrf default router bgp neighbor swp1 remote-as external
cumulus@spine01:mgmt:~$ nv set vrf default router bgp neighbor swp2 remote-as external
cumulus@spine01:mgmt:~$ nv set vrf default router bgp neighbor swp3 remote-as external
cumulus@spine01:mgmt:~$ nv set vrf default router bgp neighbor swp4 remote-as external
cumulus@spine01:mgmt:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.101/32
cumulus@spine01:mgmt:~$ nv config apply
cumulus@spine02:mgmt:~$ nv set interface lo ip address 10.10.10.102/32
cumulus@spine02:mgmt:~$ nv set interface swp1-4
cumulus@spine02:mgmt:~$ nv set router bgp autonomous-system 65199
cumulus@spine02:mgmt:~$ nv set router bgp router-id 10.10.10.102
cumulus@spine02:mgmt:~$ nv set vrf default router bgp neighbor swp1 remote-as external
cumulus@spine02:mgmt:~$ nv set vrf default router bgp neighbor swp2 remote-as external
cumulus@spine02:mgmt:~$ nv set vrf default router bgp neighbor swp3 remote-as external
cumulus@spine02:mgmt:~$ nv set vrf default router bgp neighbor swp4 remote-as external
cumulus@spine02:mgmt:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.102/32
cumulus@spine02:mgmt:~$ nv config apply
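
The BGP commands in this example differ between switches only in the AS number, the router ID, and the peering interfaces. The following sketch prints those commands for one switch; bgp_unnumbered_cmds is a hypothetical helper, not an NVUE command, and it assumes that the advertised loopback prefix matches the router ID, as it does in this example.

```shell
# Print the NVUE BGP unnumbered commands for one switch in the example topology.
# Usage: bgp_unnumbered_cmds ASN ROUTER_ID INTERFACE...
bgp_unnumbered_cmds() {
    asn="$1"; router_id="$2"; shift 2
    echo "nv set router bgp autonomous-system $asn"
    echo "nv set router bgp router-id $router_id"
    # One unnumbered eBGP neighbor per peering interface.
    for intf in "$@"; do
        echo "nv set vrf default router bgp neighbor $intf remote-as external"
    done
    # Advertise the loopback address (assumed equal to the router ID).
    echo "nv set vrf default router bgp address-family ipv4-unicast network $router_id/32"
}

# leaf01 peers with both spines and with its MLAG peer.
bgp_unnumbered_cmds 65101 10.10.10.1 swp51 swp52 peerlink.4094
# spine01 peers with all four leafs.
bgp_unnumbered_cmds 65199 10.10.10.101 swp1 swp2 swp3 swp4
```

The helper covers only the commands that the leafs and spines share. The leaf configuration above also includes the redistribute connected command, and every switch needs nv config apply to apply the changes.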

NVUE saves the configuration in the /etc/nvue.d/startup.yaml file. For example:

cumulus@leaf01:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml 
- set:
    bridge:
      domain:
        br_default:
          untagged: 1
          vlan:
            10,20,30: {}
    interface:
      bond1:
        bond:
          member:
            swp1: {}
          mlag:
            enable: on
            id: 1
        bridge:
          domain:
            br_default: {}
        type: bond
      bond2:
        bond:
          member:
            swp2: {}
          mlag:
            enable: on
            id: 2
        bridge:
          domain:
            br_default: {}
        type: bond
      bond3:
        bond:
          member:
            swp3: {}
          mlag:
            enable: on
            id: 3
        bridge:
          domain:
            br_default: {}
        type: bond
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.1/32: {}
        type: loopback
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        base-interface: peerlink
        type: sub
        vlan: 4094
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      vlan10:
        ip:
          address:
            10.1.10.2/24: {}
        type: svi
        vlan: 10
      vlan20:
        ip:
          address:
            10.1.20.2/24: {}
        type: svi
        vlan: 20
      vlan30:
        ip:
          address:
            10.1.30.2/24: {}
        type: svi
        vlan: 30
    mlag:
      backup:
        10.10.10.2: {}
      enable: on
      mac-address: 44:38:39:BE:EF:AA
      peer-ip: linklocal
    router:
      bgp:
        autonomous-system: 65101
        enable: on
        router-id: 10.10.10.1
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$s0YidtKoOX/niP8T$.Kbhq.CvV1yroC6pcY89Ld7ez1q4rhK.87HIBvy/R3aOtML4uGJbK3OgN7CUHZGjl2CTME7jPaoChYiybT5YA0
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        system-mac: 44:38:39:22:01:7a
      hostname: leaf01
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                network:
                  10.10.10.1/32: {}
                redistribute:
                  connected:
                    enable: on
            enable: on
            neighbor:
              peerlink.4094:
                remote-as: external
                type: unnumbered
              swp51:
                remote-as: external
                type: unnumbered
              swp52:
                remote-as: external
                type: unnumbered
cumulus@leaf02:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
    bridge:
      domain:
        br_default:
          untagged: 1
          vlan:
            10,20,30: {}
    interface:
      bond1:
        bond:
          member:
            swp1: {}
          mlag:
            enable: on
            id: 1
        bridge:
          domain:
            br_default: {}
        type: bond
      bond2:
        bond:
          member:
            swp2: {}
          mlag:
            enable: on
            id: 2
        bridge:
          domain:
            br_default: {}
        type: bond
      bond3:
        bond:
          member:
            swp3: {}
          mlag:
            enable: on
            id: 3
        bridge:
          domain:
            br_default: {}
        type: bond
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.2/32: {}
        type: loopback
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        base-interface: peerlink
        type: sub
        vlan: 4094
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      vlan10:
        ip:
          address:
            10.1.10.3/24: {}
        type: svi
        vlan: 10
      vlan20:
        ip:
          address:
            10.1.20.3/24: {}
        type: svi
        vlan: 20
      vlan30:
        ip:
          address:
            10.1.30.3/24: {}
        type: svi
        vlan: 30
    mlag:
      backup:
        10.10.10.1: {}
      enable: on
      mac-address: 44:38:39:BE:EF:AA
      peer-ip: linklocal
    router:
      bgp:
        autonomous-system: 65102
        enable: on
        router-id: 10.10.10.2
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$fF9zaaykxuMirThP$id.eaNuuBb7A7.s1JVgFhUFQdS5KPGkmpqnK1jQZWT7m0Uk/xGGZ3GMMBkNksaWkX0.oy6FEfZOgn9zgZPCxE0
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        system-mac: 44:38:39:22:01:78
      hostname: leaf02
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                network:
                  10.10.10.2/32: {}
                redistribute:
                  connected:
                    enable: on
            enable: on
            neighbor:
              peerlink.4094:
                remote-as: external
                type: unnumbered
              swp51:
                remote-as: external
                type: unnumbered
              swp52:
                remote-as: external
                type: unnumbered
cumulus@leaf03:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml 
- set:
    bridge:
      domain:
        br_default:
          untagged: 1
          vlan:
            40,50,60: {}
    interface:
      bond1:
        bond:
          member:
            swp1: {}
          mlag:
            enable: on
            id: 1
        bridge:
          domain:
            br_default: {}
        type: bond
      bond2:
        bond:
          member:
            swp2: {}
          mlag:
            enable: on
            id: 2
        bridge:
          domain:
            br_default: {}
        type: bond
      bond3:
        bond:
          member:
            swp3: {}
          mlag:
            enable: on
            id: 3
        bridge:
          domain:
            br_default: {}
        type: bond
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.3/32: {}
        type: loopback
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        base-interface: peerlink
        type: sub
        vlan: 4094
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      vlan40:
        ip:
          address:
            10.1.40.4/24: {}
        type: svi
        vlan: 40
      vlan50:
        ip:
          address:
            10.1.50.4/24: {}
        type: svi
        vlan: 50
      vlan60:
        ip:
          address:
            10.1.60.4/24: {}
        type: svi
        vlan: 60
    mlag:
      backup:
        10.10.10.4: {}
      enable: on
      mac-address: 44:38:39:BE:EF:AA
      peer-ip: linklocal
    router:
      bgp:
        autonomous-system: 65103
        enable: on
        router-id: 10.10.10.3
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$N8YXk5gYH.wFxXxG$rEssNuUMEkTlKoED1t74zKE08vXWeJRlrpS0tS3phQAHKPrGa6HmJYOys/2d6sXWeszC5CqlvBEtQoHlgj5GO.
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        system-mac: 44:38:39:22:01:84
      hostname: leaf03
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                network:
                  10.10.10.3/32: {}
                redistribute:
                  connected:
                    enable: on
            enable: on
            neighbor:
              peerlink.4094:
                remote-as: external
                type: unnumbered
              swp51:
                remote-as: external
                type: unnumbered
              swp52:
                remote-as: external
                type: unnumbered
cumulus@leaf04:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml 
- set:
    bridge:
      domain:
        br_default:
          untagged: 1
          vlan:
            40,50,60: {}
    interface:
      bond1:
        bond:
          member:
            swp1: {}
          mlag:
            enable: on
            id: 1
        bridge:
          domain:
            br_default: {}
        type: bond
      bond2:
        bond:
          member:
            swp2: {}
          mlag:
            enable: on
            id: 2
        bridge:
          domain:
            br_default: {}
        type: bond
      bond3:
        bond:
          member:
            swp3: {}
          mlag:
            enable: on
            id: 3
        bridge:
          domain:
            br_default: {}
        type: bond
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.4/32: {}
        type: loopback
      peerlink:
        bond:
          member:
            swp49: {}
            swp50: {}
        type: peerlink
      peerlink.4094:
        base-interface: peerlink
        type: sub
        vlan: 4094
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp49:
        type: swp
      swp50:
        type: swp
      swp51:
        type: swp
      swp52:
        type: swp
      vlan40:
        ip:
          address:
            10.1.40.5/24: {}
        type: svi
        vlan: 40
      vlan50:
        ip:
          address:
            10.1.50.5/24: {}
        type: svi
        vlan: 50
      vlan60:
        ip:
          address:
            10.1.60.5/24: {}
        type: svi
        vlan: 60
    mlag:
      backup:
        10.10.10.3: {}
      enable: on
      mac-address: 44:38:39:BE:EF:AA
      peer-ip: linklocal
    router:
      bgp:
        autonomous-system: 65104
        enable: on
        router-id: 10.10.10.4
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$PzlQBAYykTbGNgG3$cp7tO7Y02Aq86A6aVYLkfi3WT.jVU3UPN/L3wsiYuQGovr65nQQEwG0GA7.q7vg0sq2SUh7kE0vNmxuJOiek9.
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        system-mac: 44:38:39:22:01:8a
      hostname: leaf04
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                network:
                  10.10.10.4/32: {}
                redistribute:
                  connected:
                    enable: on
            enable: on
            neighbor:
              peerlink.4094:
                remote-as: external
                type: unnumbered
              swp51:
                remote-as: external
                type: unnumbered
              swp52:
                remote-as: external
                type: unnumbered
cumulus@spine01:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml 
- set:
    interface:
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.101/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp4:
        type: swp
    router:
      bgp:
        autonomous-system: 65199
        enable: on
        router-id: 10.10.10.101
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$z2fhK9bF0cUg7Gpx$/W/MPFTEiymYnYO/e1FglYzoNQ2xX9cj.inmj8yGkAwjS.vohDWreWjzrtUpkgvTzDxXlW6HcwNl7v0ABVSFo/
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        system-mac: 44:38:39:22:01:82
      hostname: spine01
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                network:
                  10.10.10.101/32: {}
            enable: on
            neighbor:
              swp1:
                remote-as: external
                type: unnumbered
              swp2:
                remote-as: external
                type: unnumbered
              swp3:
                remote-as: external
                type: unnumbered
              swp4:
                remote-as: external
                type: unnumbered
cumulus@spine02:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml 
- set:
    interface:
      eth0:
        ip:
          address:
            dhcp: {}
          vrf: mgmt
        type: eth
      lo:
        ip:
          address:
            10.10.10.102/32: {}
        type: loopback
      swp1:
        type: swp
      swp2:
        type: swp
      swp3:
        type: swp
      swp4:
        type: swp
    router:
      bgp:
        autonomous-system: 65199
        enable: on
        router-id: 10.10.10.102
    service:
      ntp:
        mgmt:
          server:
            0.cumulusnetworks.pool.ntp.org: {}
            1.cumulusnetworks.pool.ntp.org: {}
            2.cumulusnetworks.pool.ntp.org: {}
            3.cumulusnetworks.pool.ntp.org: {}
    system:
      aaa:
        class:
          nvapply:
            action: allow
            command-path:
              /:
                permission: all
          nvshow:
            action: allow
            command-path:
              /:
                permission: ro
          sudo:
            action: allow
            command-path:
              /:
                permission: all
        role:
          nvue-admin:
            class:
              nvapply: {}
          nvue-monitor:
            class:
              nvshow: {}
          system-admin:
            class:
              nvapply: {}
              sudo: {}
        user:
          cumulus:
            full-name: cumulus,,,
            hashed-password: $6$AzORFSdbvMofGHPG$wT9XRvHYmhOzygKOv1fy.jLhYgtz7nqxdxDBEBfWFiR4IEjAd.dld0ATXpE417M5jswCnUqKRryHfPlA6xwVo.
            role: system-admin
      api:
        state: enabled
      config:
        auto-save:
          enable: on
      control-plane:
        acl:
          acl-default-dos:
            inbound: {}
          acl-default-whitelist:
            inbound: {}
      global:
        system-mac: 44:38:39:22:01:92
      hostname: spine02
      reboot:
        mode: cold
      ssh-server:
        state: enabled
      wjh:
        channel:
          forwarding:
            trigger:
              l2: {}
              l3: {}
              tunnel: {}
        enable: on
    vrf:
      default:
        router:
          bgp:
            address-family:
              ipv4-unicast:
                enable: on
                network:
                  10.10.10.102/32: {}
            enable: on
            neighbor:
              swp1:
                remote-as: external
                type: unnumbered
              swp2:
                remote-as: external
                type: unnumbered
              swp3:
                remote-as: external
                type: unnumbered
              swp4:
                remote-as: external
                type: unnumbered
cumulus@leaf01:mgmt:~$ sudo cat /etc/network/interfaces
auto lo
iface lo inet loopback
    address 10.10.10.1/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto bond1
iface bond1
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 1
auto bond2
iface bond2
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 2
auto bond3
iface bond3
    bond-slaves swp3
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 3
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-backup-ip 10.10.10.2
    clagd-sys-mac 44:38:39:BE:EF:AA
    clagd-args --initDelay 180
auto vlan10
iface vlan10
    address 10.1.10.2/24
    hwaddress 44:38:39:22:01:b1
    vlan-raw-device br_default
    vlan-id 10
auto vlan20
iface vlan20
    address 10.1.20.2/24
    hwaddress 44:38:39:22:01:b1
    vlan-raw-device br_default
    vlan-id 20
auto vlan30
iface vlan30
    address 10.1.30.2/24
    hwaddress 44:38:39:22:01:b1
    vlan-raw-device br_default
    vlan-id 30
auto br_default
iface br_default
    bridge-ports bond1 bond2 bond3 peerlink
    hwaddress 44:38:39:22:01:b1
    bridge-vlan-aware yes
    bridge-vids 10 20 30
    bridge-pvid 1
cumulus@leaf02:mgmt:~$ sudo cat /etc/network/interfaces
auto lo
iface lo inet loopback
    address 10.10.10.2/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto bond1
iface bond1
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 1
auto bond2
iface bond2
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 2
auto bond3
iface bond3
    bond-slaves swp3
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 3
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-backup-ip 10.10.10.1
    clagd-sys-mac 44:38:39:BE:EF:AA
    clagd-args --initDelay 180
auto vlan10
iface vlan10
    address 10.1.10.3/24
    hwaddress 44:38:39:22:01:af
    vlan-raw-device br_default
    vlan-id 10

auto vlan20
iface vlan20
    address 10.1.20.3/24
    hwaddress 44:38:39:22:01:af
    vlan-raw-device br_default
    vlan-id 20
auto vlan30
iface vlan30
    address 10.1.30.3/24
    hwaddress 44:38:39:22:01:af
    vlan-raw-device br_default
    vlan-id 30
auto br_default
iface br_default
    bridge-ports bond1 bond2 bond3 peerlink
    hwaddress 44:38:39:22:01:af
    bridge-vlan-aware yes
    bridge-vids 10 20 30
    bridge-pvid 1

cumulus@leaf03:mgmt:~$ sudo cat /etc/network/interfaces
auto lo
iface lo inet loopback
    address 10.10.10.3/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto bond1
iface bond1
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 1
auto bond2
iface bond2
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 2
auto bond3
iface bond3
    bond-slaves swp3
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 3
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-backup-ip 10.10.10.4
    clagd-sys-mac 44:38:39:BE:EF:AA
    clagd-args --initDelay 180
auto vlan40
iface vlan40
    address 10.1.40.4/24
    hwaddress 44:38:39:22:01:bb
    vlan-raw-device br_default
    vlan-id 40
auto vlan50
iface vlan50
    address 10.1.50.4/24
    hwaddress 44:38:39:22:01:bb
    vlan-raw-device br_default
    vlan-id 50
auto vlan60
iface vlan60
    address 10.1.60.4/24
    hwaddress 44:38:39:22:01:bb
    vlan-raw-device br_default
    vlan-id 60
auto br_default
iface br_default
    bridge-ports bond1 bond2 bond3 peerlink
    hwaddress 44:38:39:22:01:bb
    bridge-vlan-aware yes
    bridge-vids 40 50 60
    bridge-pvid 1
cumulus@leaf04:mgmt:~$ sudo cat /etc/network/interfaces
auto lo
iface lo inet loopback
    address 10.10.10.4/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp49
iface swp49
auto swp50
iface swp50
auto swp51
iface swp51
auto swp52
iface swp52
auto bond1
iface bond1
    bond-slaves swp1
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 1
auto bond2
iface bond2
    bond-slaves swp2
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 2
auto bond3
iface bond3
    bond-slaves swp3
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
    clag-id 3
auto peerlink
iface peerlink
    bond-slaves swp49 swp50
    bond-mode 802.3ad
    bond-lacp-bypass-allow no
auto peerlink.4094
iface peerlink.4094
    clagd-peer-ip linklocal
    clagd-backup-ip 10.10.10.3
    clagd-sys-mac 44:38:39:BE:EF:AA
    clagd-args --initDelay 180
auto vlan40
iface vlan40
    address 10.1.40.5/24
    hwaddress 44:38:39:22:01:c1
    vlan-raw-device br_default
    vlan-id 40
auto vlan50
iface vlan50
    address 10.1.50.5/24
    hwaddress 44:38:39:22:01:c1
    vlan-raw-device br_default
    vlan-id 50
auto vlan60
iface vlan60
    address 10.1.60.5/24
    hwaddress 44:38:39:22:01:c1
    vlan-raw-device br_default
    vlan-id 60
auto br_default
iface br_default
    bridge-ports bond1 bond2 bond3 peerlink
    hwaddress 44:38:39:22:01:c1
    bridge-vlan-aware yes
    bridge-vids 40 50 60
    bridge-pvid 1
cumulus@spine01:mgmt:~$ sudo cat /etc/network/interfaces
auto lo
iface lo inet loopback
    address 10.10.10.101/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp4
iface swp4
cumulus@spine02:mgmt:~$ sudo cat /etc/network/interfaces
auto lo
iface lo inet loopback
    address 10.10.10.102/32
auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
auto eth0
iface eth0 inet dhcp
    ip-forward off
    ip6-forward off
    vrf mgmt
auto swp1
iface swp1
auto swp2
iface swp2
auto swp3
iface swp3
auto swp4
iface swp4
cumulus@leaf01:mgmt:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65101 vrf default
bgp router-id 10.10.10.1
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor peerlink.4094 interface remote-as external
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor swp51 interface remote-as external
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface remote-as external
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
! Address families
address-family ipv4 unicast
network 10.1.10.0/24
network 10.10.10.1/32
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp51 activate
neighbor swp52 activate
exit-address-family
! end of router bgp 65101 vrf default
cumulus@leaf02:mgmt:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65102 vrf default
bgp router-id 10.10.10.2
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor peerlink.4094 interface remote-as external
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor swp51 interface remote-as external
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface remote-as external
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
! Address families
address-family ipv4 unicast
network 10.10.10.2/32
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp51 activate
neighbor swp52 activate
exit-address-family
! end of router bgp 65102 vrf default
cumulus@leaf03:mgmt:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65103 vrf default
bgp router-id 10.10.10.3
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor peerlink.4094 interface remote-as external
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor swp51 interface remote-as external
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface remote-as external
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
! Address families
address-family ipv4 unicast
network 10.10.10.3/32
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp51 activate
neighbor swp52 activate
exit-address-family
! end of router bgp 65103 vrf default
cumulus@leaf04:mgmt:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65104 vrf default
bgp router-id 10.10.10.4
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor peerlink.4094 interface remote-as external
neighbor peerlink.4094 advertisement-interval 0
neighbor peerlink.4094 timers 3 9
neighbor peerlink.4094 timers connect 10
neighbor swp51 interface remote-as external
neighbor swp51 timers 3 9
neighbor swp51 timers connect 10
neighbor swp51 advertisement-interval 0
neighbor swp51 capability extended-nexthop
neighbor swp52 interface remote-as external
neighbor swp52 timers 3 9
neighbor swp52 timers connect 10
neighbor swp52 advertisement-interval 0
neighbor swp52 capability extended-nexthop
! Address families
address-family ipv4 unicast
network 10.10.10.4/32
redistribute connected
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp51 activate
neighbor swp52 activate
exit-address-family
! end of router bgp 65104 vrf default
cumulus@spine01:mgmt:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65199 vrf default
bgp router-id 10.10.10.101
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor swp1 interface remote-as external
neighbor swp1 timers 3 9
neighbor swp1 timers connect 10
neighbor swp1 advertisement-interval 0
neighbor swp1 capability extended-nexthop
neighbor swp2 interface remote-as external
neighbor swp2 timers 3 9
neighbor swp2 timers connect 10
neighbor swp2 advertisement-interval 0
neighbor swp2 capability extended-nexthop
neighbor swp3 interface remote-as external
neighbor swp3 timers 3 9
neighbor swp3 timers connect 10
neighbor swp3 advertisement-interval 0
neighbor swp3 capability extended-nexthop
neighbor swp4 interface remote-as external
neighbor swp4 timers 3 9
neighbor swp4 timers connect 10
neighbor swp4 advertisement-interval 0
neighbor swp4 capability extended-nexthop
! Address families
address-family ipv4 unicast
network 10.10.10.101/32
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
exit-address-family
! end of router bgp 65199 vrf default
cumulus@spine02:mgmt:~$ sudo cat /etc/frr/frr.conf
...
vrf default
exit-vrf
vrf mgmt
exit-vrf
router bgp 65199 vrf default
bgp router-id 10.10.10.102
timers bgp 3 9
bgp deterministic-med
! Neighbors
neighbor swp1 interface remote-as external
neighbor swp1 timers 3 9
neighbor swp1 timers connect 10
neighbor swp1 advertisement-interval 0
neighbor swp1 capability extended-nexthop
neighbor swp2 interface remote-as external
neighbor swp2 timers 3 9
neighbor swp2 timers connect 10
neighbor swp2 advertisement-interval 0
neighbor swp2 capability extended-nexthop
neighbor swp3 interface remote-as external
neighbor swp3 timers 3 9
neighbor swp3 timers connect 10
neighbor swp3 advertisement-interval 0
neighbor swp3 capability extended-nexthop
neighbor swp4 interface remote-as external
neighbor swp4 timers 3 9
neighbor swp4 timers connect 10
neighbor swp4 advertisement-interval 0
neighbor swp4 capability extended-nexthop
! Address families
address-family ipv4 unicast
network 10.10.10.102/32
maximum-paths ibgp 64
maximum-paths 64
distance bgp 20 200 200
neighbor swp1 activate
neighbor swp2 activate
neighbor swp3 activate
neighbor swp4 activate
exit-address-family
! end of router bgp 65199 vrf default

This simulation is running Cumulus Linux 5.11. The Cumulus Linux 5.12 simulation is coming soon.

The simulation starts with the example BGP configuration. The demo is pre-configured using NVUE commands.

To validate the configuration, run the commands listed in the Troubleshooting-BGP section.

Open Shortest Path First - OSPF

OSPF is a link-state routing protocol that routers use to exchange information about routes and the cost to reach a destination. OSPF routers exchange information about their links, prefixes, and associated costs in link-state advertisements (LSAs). This information builds a topology database. Each router within an area has an identical database and calculates its own routing table using the shortest path first (SPF) algorithm. Cumulus Linux runs the SPF algorithm any time routing information in the network changes. OSPF uses areas to limit the size of the topology database on different routers. Routers that exist in more than one area are area border routers (ABRs), which simplify the information in LSAs when advertising them from one area to another. ABRs are the routers in OSPF that implement route filtering or route summarization.
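To illustrate the SPF calculation described above, the following Python sketch runs Dijkstra's algorithm over a small link-state database. The topology, router names, and link costs are made up for the example, and this is not how FRR implements SPF; it only shows how a router can derive lowest-cost paths once every router in the area holds the same database.

```python
import heapq


def spf(lsdb, root):
    """Return the lowest total cost from root to every reachable router.

    lsdb maps each router to a dict of {neighbor: link cost}.
    """
    cost = {root: 0}
    queue = [(0, root)]
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > cost.get(node, float("inf")):
            continue  # stale queue entry; a cheaper path was already found
        for neighbor, link_cost in lsdb.get(node, {}).items():
            candidate = dist + link_cost
            if candidate < cost.get(neighbor, float("inf")):
                cost[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return cost


# Hypothetical two-leaf, two-spine topology with illustrative link costs.
lsdb = {
    "leaf01": {"spine01": 10, "spine02": 10},
    "leaf02": {"spine01": 10, "spine02": 20},
    "spine01": {"leaf01": 10, "leaf02": 10},
    "spine02": {"leaf01": 10, "leaf02": 20},
}
costs = spf(lsdb, "leaf01")
print(costs)
```

Because every router in the area runs the same calculation over the same database, each one reaches a consistent view of the lowest-cost paths, with itself as the root.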

Cumulus Linux supports:

Open Shortest Path First v2 - OSPFv2

This topic describes OSPFv2, which is a link-state routing protocol for IPv4. For IPv6 commands, refer to Open Shortest Path First v3 - OSPFv3.

Basic OSPFv2 Configuration

You can configure OSPF using either numbered interfaces or unnumbered interfaces.

When you enable or disable OSPF, the FRR service restarts, which might impact traffic.

OSPFv2 Numbered

To configure OSPF using numbered interfaces, you specify the router ID, IP subnet prefix, and area address. You must put all the interfaces on the switch with an IP address that matches the network subnet into the specified area. OSPF attempts to discover other OSPF routers on those interfaces. Cumulus Linux adds all matching interface network addresses to a type-1 LSA and advertises it to discovered neighbors for proper reachability.

If you do not want to bring up an OSPF adjacency on certain interfaces, but want to advertise those networks in the OSPF database, you can configure the interfaces as passive interfaces. A passive interface creates a database entry but does not send or receive OSPF hello packets. For example, in a data center topology, the host-facing interfaces do not need to run OSPF; however, you need to advertise the corresponding IP addresses to neighbors.

Network statements can be as inclusive or generic as necessary to cover the interface networks.
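As a rough illustration of how specific and generic network statements cover interface networks, the following Python sketch uses the standard `ipaddress` module. It treats an interface as matching when its connected subnet falls inside the network statement; this is a simplified model, not the FRR implementation. The function name and the `vlan10` interface are assumptions added for the example.

```python
import ipaddress


def match_network_statements(statements, interfaces):
    """Return {interface: first matching network statement, or None}.

    An interface matches when its connected subnet falls inside the
    network statement (a simplified model of the matching).
    """
    parsed = [(s, ipaddress.ip_network(s, strict=False)) for s in statements]
    result = {}
    for name, address in interfaces.items():
        subnet = ipaddress.ip_interface(address).network
        result[name] = next(
            (s for s, net in parsed if subnet.subnet_of(net)), None
        )
    return result


interfaces = {
    "lo": "10.10.10.1/32",
    "swp51": "10.0.1.0/31",
    "vlan10": "10.1.10.2/24",  # assumed extra interface for illustration
}

# Specific statements cover only the loopback and the uplink.
specific = match_network_statements(["10.10.10.1/32", "10.0.1.0/31"], interfaces)
# One generic statement covers every interface in 10.0.0.0/8.
broad = match_network_statements(["10.0.0.0/8"], interfaces)
print(specific)
print(broad)
```

In this model, the two specific statements leave `vlan10` out of OSPF, while the single generic statement pulls all three interfaces into the area.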

The following example commands configure OSPF numbered on leaf01 and spine01.

leaf01
  • The loopback address is 10.10.10.1/32
  • The IP address on swp51 is 10.0.1.0/31
  • The router ID is 10.10.10.1
  • All the interfaces on the switch with an IP address that matches subnet 10.10.10.1/32 and swp51 with IP address 10.0.1.0/31 are in area 0
  • swp1 and swp2 are passive interfaces
spine01
  • The loopback address is 10.10.10.101/32
  • The IP address on swp1 is 10.0.1.1/31
  • The router ID is 10.10.10.101
  • All the interfaces on the switch with an IP address that matches subnet 10.10.10.101/32 and swp1 with IP address 10.0.1.1/31 are in area 0
cumulus@leaf01:~$ nv set interface lo ip address 10.10.10.1/32
cumulus@leaf01:~$ nv set interface swp51 ip address 10.0.1.0/31
cumulus@leaf01:~$ nv set vrf default router ospf router-id 10.10.10.1
cumulus@leaf01:~$ nv set vrf default router ospf area 0 network 10.10.10.1/32
cumulus@leaf01:~$ nv set vrf default router ospf area 0 network 10.0.1.0/31
cumulus@leaf01:~$ nv set interface swp1 router ospf passive on
cumulus@leaf01:~$ nv set interface swp2 router ospf passive on
cumulus@leaf01:~$ nv config apply
cumulus@spine01:~$ nv set interface lo ip address 10.10.10.101/32
cumulus@spine01:~$ nv set interface swp1 ip address 10.0.1.1/31
cumulus@spine01:~$ nv set vrf default router ospf router-id 10.10.10.101
cumulus@spine01:~$ nv set vrf default router ospf area 0 network 10.10.10.101/32
cumulus@spine01:~$ nv set vrf default router ospf area 0 network 10.0.1.1/31
cumulus@spine01:~$ nv config apply

When you change the router ID after initial configuration, you must run the nv action clear vrf <vrf> router ospf database command.

  1. Edit the /etc/frr/daemons file to enable the ospf daemon, then start the FRR service (see FRRouting).

  2. Edit the /etc/network/interfaces file to configure the IP address for the loopback and swp51:

cumulus@leaf01:~$ sudo nano /etc/network/interfaces
...
auto lo
iface lo inet loopback
  address 10.10.10.1/32

auto swp51
iface swp51
  address 10.0.1.0/31

  3. Run the ifreload -a command to load the new configuration:

    cumulus@leaf01:~$ sudo ifreload -a
    
  4. From the vtysh shell, configure OSPF:

    cumulus@leaf01:~$ sudo vtysh
    ...
    leaf01# configure terminal
    leaf01(config)# router ospf
    leaf01(config-router)# ospf router-id 10.10.10.1
    leaf01(config-router)# network 10.10.10.1/32 area 0
    leaf01(config-router)# network 10.0.1.0/31 area 0
    leaf01(config-router)# passive-interface swp1
    leaf01(config-router)# passive-interface swp2
    leaf01(config-router)# exit
    leaf01(config)# exit
    leaf01# write memory
    leaf01# exit
    

You can use the passive-interface default command to set all interfaces as passive and selectively bring up protocol adjacency on certain interfaces:

leaf01(config)# router ospf
leaf01(config-router)# passive-interface default
leaf01(config-router)# no passive-interface swp51

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

...
router ospf
 ospf router-id 10.10.10.1
 network 10.10.10.1/32 area 0
 network 10.0.1.0/31 area 0
 passive-interface swp1
 passive-interface swp2
...
  1. Edit the /etc/frr/daemons file to enable the ospf daemon, then start the FRR service (see FRRouting).

  2. Edit the /etc/network/interfaces file to configure the IP address for the loopback and swp1:

    cumulus@spine01:~$ sudo nano /etc/network/interfaces
    ...
    auto lo
    iface lo inet loopback
      address 10.10.10.101/32
    

    auto swp1
    iface swp1
      address 10.0.1.1/31

  3. Run the ifreload -a command to load the new configuration:

    cumulus@spine01:~$ sudo ifreload -a
    
  4. From the vtysh shell, configure OSPF:

    cumulus@spine01:~$ sudo vtysh
    ...
    spine01# configure terminal
    spine01(config)# router ospf
    spine01(config-router)# ospf router-id 10.10.10.101
    spine01(config-router)# network 10.10.10.101/32 area 0
    spine01(config-router)# network 10.0.1.1/31 area 0
    spine01(config-router)# exit
    spine01(config)# exit
    spine01# write memory
    spine01# exit
    

You can use the passive-interface default command to set all interfaces as passive and selectively bring up protocol adjacency on certain interfaces:

spine01(config)# router ospf
spine01(config-router)# passive-interface default
spine01(config-router)# no passive-interface swp1

The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

...
router ospf
 ospf router-id 10.10.10.101
 network 10.10.10.101/32 area 0
 network 10.0.1.1/31 area 0
...

OSPFv2 Unnumbered

Unnumbered interfaces are interfaces without unique IP addresses; multiple interfaces share the same IP address. In OSPFv2, unnumbered interfaces do not need unique IP addresses on leaf and spine interfaces and simplify the OSPF database, which reduces the memory footprint and improves SPF convergence times.

To configure an unnumbered interface, take the IP address of the loopback interface (called the anchor) and use it as the IP address of the unnumbered interface.

OSPF unnumbered supports point-to-point interfaces only and does not support network statements.
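The anchor rule can be sketched as a small Python helper that builds the NVUE command list for a switch: the loopback address is reused on each unnumbered interface, and each of those interfaces is set to point-to-point. The helper name and its parameters are hypothetical; the command forms mirror the NVUE examples in this section. The sketch only builds strings and does not run anything on a switch.

```python
def unnumbered_commands(anchor, interfaces, area="0"):
    """Build NVUE commands for OSPF unnumbered.

    anchor is the loopback address (for example 10.10.10.1/32);
    interfaces lists the unnumbered point-to-point interfaces.
    """
    cmds = [f"nv set interface lo ip address {anchor}"]
    # Each unnumbered interface borrows the anchor address.
    cmds += [f"nv set interface {i} ip address {anchor}" for i in interfaces]
    cmds.append(f"nv set interface lo router ospf area {area}")
    for i in interfaces:
        cmds.append(f"nv set interface {i} router ospf area {area}")
        # Unnumbered interfaces must be point-to-point.
        cmds.append(f"nv set interface {i} router ospf network-type point-to-point")
    cmds.append("nv config apply")
    return cmds


cmds = unnumbered_commands("10.10.10.1/32", ["swp51"])
print("\n".join(cmds))
```

Generating the commands from a single anchor value keeps the loopback and the unnumbered interfaces from drifting apart when the address changes.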

The following example commands configure OSPF unnumbered on leaf01 and spine01.

leaf01
  • The loopback address is 10.10.10.1/32
  • The IP address of the unnumbered interface (swp51) is 10.10.10.1/32
  • The router ID is 10.10.10.1
  • OSPF is on the loopback interface and on swp51 in area 0
  • swp1 and swp2 are passive interfaces
  • swp51 is a point-to-point interface (Cumulus Linux requires point-to-point for unnumbered interfaces)
spine01
  • The loopback address is 10.10.10.101/32
  • The IP address of the unnumbered interface (swp1) is 10.10.10.101/32
  • The router ID is 10.10.10.101
  • OSPF is on the loopback interface and on swp1 in area 0
  • swp1 is a point-to-point interface (Cumulus Linux requires point-to-point for unnumbered interfaces)

      Configure the unnumbered interface:

      cumulus@leaf01:~$ nv set interface lo ip address 10.10.10.1/32
      cumulus@leaf01:~$ nv set interface swp51 ip address 10.10.10.1/32
      cumulus@leaf01:~$ nv config apply
      

      Configure OSPF:

      cumulus@leaf01:~$ nv set vrf default router ospf router-id 10.10.10.1
      cumulus@leaf01:~$ nv set interface lo router ospf area 0
      cumulus@leaf01:~$ nv set interface swp51 router ospf area 0
      cumulus@leaf01:~$ nv set interface swp1 router ospf passive on
      cumulus@leaf01:~$ nv set interface swp2 router ospf passive on
      cumulus@leaf01:~$ nv set interface swp51 router ospf network-type point-to-point
      cumulus@leaf01:~$ nv config apply
      

      Configure the unnumbered interface:

      cumulus@spine01:~$ nv set interface lo ip address 10.10.10.101/32
      cumulus@spine01:~$ nv set interface swp1 ip address 10.10.10.101/32
      cumulus@spine01:~$ nv config apply
      

      Configure OSPF:

      cumulus@spine01:~$ nv set vrf default router ospf router-id 10.10.10.101
      cumulus@spine01:~$ nv set interface lo router ospf area 0
      cumulus@spine01:~$ nv set interface swp1 router ospf area 0
      cumulus@spine01:~$ nv set interface swp1 router ospf network-type point-to-point
      cumulus@spine01:~$ nv config apply
      

      When you change the router ID after initial configuration, you must run the nv action clear vrf <vrf> router ospf database command.

      1. Edit the /etc/frr/daemons file to enable the ospf daemon, then start the FRR service (see FRRouting).

      2. Edit the /etc/network/interfaces file to configure the loopback and unnumbered interface address:

        cumulus@leaf01:~$ sudo nano /etc/network/interfaces
        ...
        auto lo
        iface lo inet loopback
          address 10.10.10.1/32

        auto swp51
        iface swp51
          address 10.10.10.1/32

      3. Run the ifreload -a command to load the new configuration:

        cumulus@leaf01:~$ sudo ifreload -a
        
      4. From the vtysh shell, configure OSPF:

        cumulus@leaf01:~$ sudo vtysh
        ...
        leaf01# configure terminal
        leaf01(config)# router ospf
        leaf01(config-router)# ospf router-id 10.10.10.1
        leaf01(config-router)# interface swp51
        leaf01(config-if)# ip ospf area 0
        leaf01(config-if)# ip ospf network point-to-point
        leaf01(config-if)# exit
        leaf01(config)# interface lo
        leaf01(config-if)# ip ospf area 0
        leaf01(config-if)# exit
        leaf01(config)# router ospf
        leaf01(config-router)# passive-interface swp1,swp2
        leaf01(config-router)# end
        leaf01# write memory
        leaf01# exit
        

        You can use the passive-interface default command to set all interfaces as passive and selectively bring up protocol adjacency on certain interfaces:

        leaf01(config)# router ospf
        leaf01(config-router)# passive-interface default
        leaf01(config-router)# no passive-interface swp51
        

      The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

      ...
      interface lo
       ip ospf area 0
      interface swp51
       ip ospf area 0
       ip ospf network point-to-point
      router ospf
       ospf router-id 10.10.10.1
       passive-interface swp1,swp2
      ...
      
      1. Edit the /etc/frr/daemons file to enable the ospf daemon, then start the FRR service (see FRRouting).

      2. Edit the /etc/network/interfaces file to configure the loopback and unnumbered interface address:

        cumulus@spine01:~$ sudo nano /etc/network/interfaces
        ...
        auto lo
        iface lo inet loopback
          address 10.10.10.101/32

        auto swp1
        iface swp1
          address 10.10.10.101/32

      3. Run the ifreload -a command to load the new configuration:

        cumulus@spine01:~$ sudo ifreload -a
        
      4. From the vtysh shell, configure OSPF:

        cumulus@spine01:~$ sudo vtysh
        ...
        spine01# configure terminal
        spine01(config)# router ospf
        spine01(config-router)# ospf router-id 10.10.10.101
        spine01(config-router)# interface swp1
        spine01(config-if)# ip ospf area 0
        spine01(config-if)# ip ospf network point-to-point
        spine01(config-if)# exit
        spine01(config)# interface lo
        spine01(config-if)# ip ospf area 0
        spine01(config-if)# exit
        spine01(config)# end
        spine01# write memory
        spine01# exit
        

        You can use the passive-interface default command to set all interfaces as passive and selectively bring up protocol adjacency on certain interfaces:

        spine01(config)# router ospf
        spine01(config-router)# passive-interface default
        spine01(config-router)# no passive-interface swp1
        

      The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

      ...
      interface lo
       ip ospf area 0
      interface swp1
       ip ospf area 0
       ip ospf network point-to-point
      router ospf
       ospf router-id 10.10.10.101
      ...
      

      Optional OSPFv2 Configuration

      This section describes optional configuration. The steps provided in this section assume that you already configured basic OSPFv2 as described in Basic OSPF Configuration, above.

      Interface Parameters

      You can define the following OSPF parameters per interface: network type, hello interval, dead interval, and priority.

      The following command example sets the network type to point-to-point.

      cumulus@switch:~$ nv set interface swp51 router ospf network-type point-to-point
      cumulus@switch:~$ nv config apply
      
      cumulus@switch:~$ sudo vtysh
      ...
      switch# configure terminal
      switch(config)# interface swp51
      switch(config-if)# ip ospf network point-to-point
      switch(config-if)# end
      switch# write memory
      switch# exit
      

      The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

      ...
      interface swp51
       ip ospf network point-to-point
      ...
      

      The following command example sets the hello interval to 5 seconds and the dead interval to 60 seconds. The hello interval and dead interval can be any value between 1 and 65535 seconds.

      cumulus@switch:~$ nv set interface swp51 router ospf timers hello-interval 5
      cumulus@switch:~$ nv set interface swp51 router ospf timers dead-interval 60
      cumulus@switch:~$ nv config apply
      
      cumulus@switch:~$ sudo vtysh
      ...
      switch# configure terminal
      switch(config)# interface swp51
      switch(config-if)# ip ospf hello-interval 5
      switch(config-if)# ip ospf dead-interval 60
      switch(config-if)# end
      switch# write memory
      switch# exit
      

      The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

      ...
      interface swp51
       ip ospf hello-interval 5
       ip ospf dead-interval 60
      ...
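
As a rough illustration of how the two timers relate, the following Python sketch checks that both values fall in the 1 to 65535 second range stated above and reports how many consecutive hellos a neighbor can miss before the dead interval expires. The function name is hypothetical, and the sketch assumes the standard OSPF behavior in which a neighbor goes down when no hello arrives within the dead interval.

```python
TIMER_MIN, TIMER_MAX = 1, 65535  # valid range in seconds, per the text above


def hellos_per_dead_interval(hello_interval: int, dead_interval: int) -> int:
    """Return how many hello periods fit in one dead interval.
    Hypothetical helper; raises ValueError for out-of-range timers."""
    for name, value in (("hello", hello_interval), ("dead", dead_interval)):
        if not TIMER_MIN <= value <= TIMER_MAX:
            raise ValueError(f"{name} interval {value} is outside 1-65535 seconds")
    return dead_interval // hello_interval


if __name__ == "__main__":
    # Values from the example above: hello 5 seconds, dead 60 seconds.
    print(hellos_per_dead_interval(5, 60))
```

With the example values, twelve hello periods elapse before the neighbor times out.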
      

      The following command example sets the priority to 5 for swp51. The priority can be any value between 0 and 255. A priority of 0 prevents the interface from becoming the OSPF Designated Router (DR) on a broadcast interface.

      cumulus@switch:~$ nv set interface swp51 router ospf priority 5
      cumulus@switch:~$ nv config apply
      
      cumulus@switch:~$ sudo vtysh
      ...
      switch# configure terminal
      switch(config)# interface swp51
      switch(config-if)# ip ospf priority 5
      switch(config-if)# end
      switch# write memory
      switch# exit
      

      The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

      ...
      interface swp51
       ip ospf priority 5
      ...
      

      To see the configured OSPF interface parameter values, run the vtysh show ip ospf interface command.

      SPF Timer Defaults

      OSPF uses the following default timers to prevent consecutive SPF runs from overburdening the CPU:

      The following example commands change the number of milliseconds from the initial event until SPF runs to 80, the number of milliseconds between consecutive SPF runs to 100, and the maximum number of milliseconds between SPFs to 6000.

      cumulus@switch:~$ nv set router ospf timers spf delay 80
      cumulus@switch:~$ nv set router ospf timers spf holdtime 100
      cumulus@switch:~$ nv set router ospf timers spf max-holdtime 6000
      cumulus@switch:~$ nv config apply
      
      cumulus@switch:~$ sudo vtysh
      ...
      switch# configure terminal
      switch(config)# router ospf
      switch(config-router)# timers throttle spf 80 100 6000
      switch(config-router)# end
      switch# write memory
      switch# exit
      

      The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

      ...
      router ospf
       ospf router-id 10.10.10.1
       passive-interface swp1
       passive-interface swp2
       network 10.10.10.1/32 area 0
       timers throttle spf 80 100 6000
      ...
      

      To see the configured SPF timer values, run the vtysh show ip ospf command.
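
The following Python sketch illustrates how the hold time between SPF runs might grow under sustained churn. It rests on an assumption this section does not spell out: that the hold time doubles after each SPF run triggered within the current hold window, up to the configured maximum, as FRR documents for its SPF throttle timers. The function name is hypothetical.

```python
def spf_hold_times(holdtime_ms: int, max_holdtime_ms: int, runs: int) -> list[int]:
    """Return the hold time applied before each of `runs` consecutive SPF runs.
    Assumes the hold time doubles on each run and is capped at the maximum."""
    times = []
    current = holdtime_ms
    for _ in range(runs):
        times.append(current)
        current = min(current * 2, max_holdtime_ms)  # back off, bounded by the cap
    return times


if __name__ == "__main__":
    # Values from the example above: holdtime 100 ms, max-holdtime 6000 ms.
    print(spf_hold_times(100, 6000, 8))
```

Under that assumption, the example settings reach the 6000 millisecond cap on the seventh consecutive run.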

      MD5 Authentication

      To configure MD5 authentication on the switch, you need to create a key and a key ID, then enable MD5 authentication. The key ID must be a value between 1 and 255 that identifies the key used to create the message digest. This value must be consistent across all routers on a link. The key is a string of up to 16 characters (longer strings truncate) from which OSPF generates the message digest.
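
The constraints above can be sketched as a small pre-check, for example in a script that generates configuration. This is an illustrative helper with a hypothetical name, not a Cumulus Linux API; it applies only the two rules stated in this section (key ID between 1 and 255, key truncated to 16 characters).

```python
MAX_KEY_LENGTH = 16  # longer keys truncate, per the text above


def normalize_md5_key(key_id: int, key: str) -> tuple[int, str]:
    """Validate the key ID and return the key as it takes effect
    after truncation. Hypothetical helper for illustration."""
    if not 1 <= key_id <= 255:
        raise ValueError(f"key ID must be between 1 and 255, got {key_id}")
    if not key:
        raise ValueError("key must not be empty")
    return key_id, key[:MAX_KEY_LENGTH]


if __name__ == "__main__":
    print(normalize_md5_key(1, "thisisthekey"))
    # A 20-character key keeps only its first 16 characters.
    print(normalize_md5_key(1, "averyveryverylongkey"))
```

Because of the truncation, two keys that differ only after the sixteenth character behave as the same key.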

      The following example commands create key ID 1 with the key thisisthekey and enable MD5 authentication on swp51 on leaf01 and on swp1 on spine01.

      cumulus@leaf01:~$ nv set interface swp51 router ospf authentication message-digest-key 1
      cumulus@leaf01:~$ nv set interface swp51 router ospf authentication md5-key thisisthekey
      cumulus@leaf01:~$ nv set interface swp51 router ospf authentication enable on
      cumulus@leaf01:~$ nv config apply
      
      cumulus@spine01:~$ nv set interface swp1 router ospf authentication message-digest-key 1
      cumulus@spine01:~$ nv set interface swp1 router ospf authentication md5-key thisisthekey
      cumulus@spine01:~$ nv set interface swp1 router ospf authentication enable on
      cumulus@spine01:~$ nv config apply
      
      cumulus@leaf01:~$ sudo vtysh
      ...
      leaf01# configure terminal
      leaf01(config)# interface swp51
      leaf01(config-if)# ip ospf authentication message-digest
      leaf01(config-if)# ip ospf message-digest-key 1 md5 thisisthekey
      leaf01(config-if)# end
      leaf01# write memory
      leaf01# exit
      

      The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

      ...
      interface swp51
       ip ospf authentication message-digest
       ip ospf message-digest-key 1 md5 thisisthekey
      ...
      
      cumulus@spine01:~$ sudo vtysh
      ...
      spine01# configure terminal
      spine01(config)# interface swp1
      spine01(config-if)# ip ospf authentication message-digest
      spine01(config-if)# ip ospf message-digest-key 1 md5 thisisthekey
      spine01(config-if)# end
      spine01# write memory
      spine01# exit
      

      The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

      ...
      interface swp1
       ip ospf authentication message-digest
       ip ospf message-digest-key 1 md5 thisisthekey
      ...
      

      To remove existing MD5 authentication hashes, run the vtysh no ip ospf command (no ip ospf message-digest-key 1 md5 thisisthekey).

      Summarization and Prefix Range

      By default, an ABR creates a summary (type-3) LSA for each route in an area and advertises it in adjacent areas. Prefix range configuration optimizes this behavior by creating and advertising one summary LSA for multiple routes. OSPF allows route summarization between areas only on an ABR.
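
To make the effect of a prefix range concrete, the following Python sketch uses the standard `ipaddress` module to collapse every route that falls inside a configured range into one summary prefix. The route list is made up for illustration and the function name is hypothetical; the sketch models only the prefix arithmetic, not how an ABR originates LSAs.

```python
import ipaddress


def summarize(routes: list[str], area_range: str) -> list[str]:
    """Replace all routes covered by area_range with the range itself.
    Routes outside the range are advertised unchanged."""
    summary = ipaddress.ip_network(area_range)
    advertised = set()
    for route in routes:
        network = ipaddress.ip_network(route)
        # A covered route contributes only the single summary prefix.
        advertised.add(summary if network.subnet_of(summary) else network)
    return [str(network) for network in sorted(advertised)]


if __name__ == "__main__":
    # Hypothetical area routes; 172.16.1.0/24 is the range from this section.
    routes = ["172.16.1.0/26", "172.16.1.64/26", "172.16.2.0/24"]
    print(summarize(routes, "172.16.1.0/24"))
```

In this sketch, the two /26 routes collapse into the single /24, while the route outside the range is left as is.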

      The following example shows a topology divided into area 0 and area 1. border01 and border02 are ABRs that have links to multiple areas and perform a set of specialized tasks, such as SPF computation per area and summarization of routes across areas.

      On border01:

      These commands create a summary route for all the routes in the range 172.16.1.0/24 in area 0:

      cumulus@border01:~$ nv set vrf default router ospf area 0 range 172.16.1.0/24
      cumulus@border01:~$ nv config apply
      
      cumulus@border01:~$ sudo vtysh
      ...
      border01# configure terminal
      border01(config)# router ospf
      border01(config-router)# area 0 range 172.16.1.0/24
      border01(config-router)# end
      border01# write memory
      border01# exit
      

      The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

      cumulus@border01:mgmt:~$ sudo cat /etc/frr/frr.conf
      ...
      interface lo
       ip ospf area 0
      interface swp1
       ip ospf area 1
      interface swp2
       ip ospf area 1
      interface swp51
       ip ospf area 0
      interface swp52
       ip ospf area 0
      router ospf
       ospf router-id 10.10.10.63
       area 0 range 172.16.1.0/24
      

      Stub Areas

      External routes are the routes redistributed into OSPF from another protocol. They have an AS-wide flooding scope. Typically, external link states make up a large percentage of the link-state database (LSDB). Stub areas reduce the LSDB size by not flooding AS-external LSAs.

      All routers must agree that an area is a stub, otherwise they do not become OSPF neighbors.

      To configure a stub area:

      cumulus@switch:~$ nv set vrf default router ospf area 1 type stub
      cumulus@switch:~$ nv config apply
      
      cumulus@switch:~$ sudo vtysh
      ...
      switch# configure terminal
      switch(config)# router ospf
      switch(config-router)# area 1 stub
      switch(config-router)# end
      switch# write memory
      switch# exit
      

      The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

      ...
      router ospf
       ospf router-id 10.10.10.63
       area 1 stub
      ...
      

      Stub areas still receive information about networks that belong to other areas of the same OSPF domain. If summarization is not configured (or is not comprehensive), the amount of information can overwhelm the nodes. Totally stubby areas address this issue. Routers in totally stubby areas keep only information about routing within their area in their LSDB.

      To configure a totally stubby area:

      cumulus@switch:~$ nv set vrf default router ospf area 1 type totally-stub 
      cumulus@switch:~$ nv config apply
      
      cumulus@switch:~$ sudo vtysh
      ...
      switch# configure terminal
      switch(config)# router ospf
      switch(config-router)# area 1 stub no-summary
      switch(config-router)# end
      switch# write memory
      switch# exit
      

      The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

      ...
      router ospf
       ospf router-id 10.10.10.63
       area 1 stub no-summary
      ...
      

      Here is a brief summary of the area type differences:

      Type                  Behavior
      Normal non-zero area  LSA types 1, 2, 3, 4 area-scoped, type 5 externals, inter-area routes summarized
      Stub area             LSA types 1, 2, 3, 4 area-scoped, no type 5 externals, inter-area routes summarized
      Totally stubby area   LSA types 1, 2 area-scoped, default summary, no type 3, 4, 5 LSA types allowed
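
The summary above can also be expressed as a small lookup, which might be convenient when reasoning about which LSA types to expect in a given area. This is a sketch of the table only; the dictionary keys and the function name are hypothetical labels, not NVUE or FRR identifiers.

```python
# LSA types permitted per area type, taken from the summary above.
# A totally stubby area still receives a default summary route, which this
# simplified table does not model.
AREA_LSA_TYPES = {
    "normal": {1, 2, 3, 4, 5},
    "stub": {1, 2, 3, 4},
    "totally-stubby": {1, 2},
}


def lsa_allowed(area_type: str, lsa_type: int) -> bool:
    """Return True if the LSA type appears in the given area type."""
    return lsa_type in AREA_LSA_TYPES[area_type]


if __name__ == "__main__":
    for area in AREA_LSA_TYPES:
        print(area, "accepts type 5 externals:", lsa_allowed(area, 5))
```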

      Auto-cost Reference Bandwidth

      When you set the auto-cost reference bandwidth, Cumulus Linux dynamically calculates the OSPF interface cost to support higher speed links. The default value is 100000 for 100Gbps link speed. The cost of interfaces with link speeds lower than 100Gbps is higher.

      To avoid routing loops, set the bandwidth to a consistent value across all OSPF routers.
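
As a rough sketch of the arithmetic, the following Python function derives an interface cost from the reference bandwidth and the link speed. It assumes the conventional OSPF formula (reference bandwidth divided by link speed, both in Mbps), rounded to the nearest integer and clamped to the 1 to 65535 range; the exact rounding that FRR applies is an assumption here, and the function name is hypothetical.

```python
def ospf_cost(reference_bandwidth_mbps: int, link_speed_mbps: int) -> int:
    """Estimate the OSPF interface cost for a link.
    Assumes cost = reference bandwidth / link speed, rounded and clamped."""
    cost = int(reference_bandwidth_mbps / link_speed_mbps + 0.5)
    return max(1, min(cost, 65535))


if __name__ == "__main__":
    # Default reference bandwidth of 100000 (100Gbps) against common link speeds.
    for speed in (100000, 25000, 10000, 1000):
        print(f"{speed} Mbps -> cost {ospf_cost(100000, speed)}")
```

With the default reference bandwidth, slower links receive proportionally higher costs, which is why all routers need the same reference value to compare paths consistently.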

      The following example commands configure the auto-cost reference bandwidth for 90Gbps link speed:

      cumulus@switch:~$ nv set vrf default router ospf reference-bandwidth 90000
      cumulus@switch:~$ nv config apply
      
      cumulus@switch:~$ sudo vtysh
      ...
      switch# configure terminal
      switch(config)# router ospf
      switch(config-router)# auto-cost reference-bandwidth 90000
      switch(config-router)# end
      switch# write memory
      switch# exit
      

      The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

      ...
      router ospf
       ospf router-id 10.10.10.1
       auto-cost reference-bandwidth 90000
      ...
      

      Administrative Distance

      Cumulus Linux uses the administrative distance to choose which routing protocol to use when two different protocols provide route information for the same destination. The smaller the distance, the more reliable the protocol. For example, if the switch receives a route from OSPF with an administrative distance of 110 and the same route from BGP with an administrative distance of 100, the switch chooses BGP.

      Cumulus Linux provides several commands to change the distance for OSPF routes. The default value is 110.
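
The selection rule can be sketched in Python as picking the source with the lowest administrative distance. The distances below are the ones used in the example in this section; the function name is hypothetical and the sketch ignores everything else a routing table considers (such as prefix length or metric).

```python
def preferred_source(candidates: dict[str, int]) -> str:
    """Return the protocol whose route wins for one destination:
    the one with the smallest administrative distance."""
    return min(candidates, key=candidates.get)


if __name__ == "__main__":
    # Distances from the example above: OSPF 110, BGP 100.
    print(preferred_source({"ospf": 110, "bgp": 100}))
    # Raising the BGP distance above 110 would make OSPF win instead.
    print(preferred_source({"ospf": 110, "bgp": 150}))
```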

      The following example commands set the distance for an entire group of routes:

      The NVUE command is not supported.
      cumulus@switch:~$ sudo vtysh
      ...
      switch# configure terminal
      switch(config)# router ospf
      switch(config-router)# distance 254
      switch(config-router)# end
      switch# write memory
      switch# exit
      

      The following example commands change the OSPF administrative distance to 150 for internal routes and 220 for external routes:

      cumulus@switch:~$ nv set vrf default router ospf distance intra-area 150 
      cumulus@switch:~$ nv set vrf default router ospf distance inter-area 150
      cumulus@switch:~$ nv set vrf default router ospf distance external 220
      cumulus@switch:~$ nv config apply
      
      cumulus@switch:~$ sudo vtysh
      ...
      switch# configure terminal
      switch(config)# router ospf
      switch(config-router)# distance ospf intra-area 150 inter-area 150 external 220
      switch(config-router)# end
      switch# write memory
      switch# exit
      

      The following example commands change the OSPF administrative distance to 150 for internal routes to a subnet or network inside the same area as the router:

      cumulus@switch:~$ nv set vrf default router ospf distance intra-area 150 
      cumulus@switch:~$ nv config apply
      
      cumulus@switch:~$ sudo vtysh
      ...
      switch# configure terminal
      switch(config)# router ospf
      switch(config-router)# distance ospf intra-area 150
      switch(config-router)# end
      switch# write memory
      switch# exit
      

      The following example commands change the OSPF administrative distance to 150 for internal routes to a subnet in an area of which the router is not a part:

      cumulus@switch:~$ nv set vrf default router ospf distance inter-area 150
      cumulus@switch:~$ nv config apply
      
      cumulus@switch:~$ sudo vtysh
      ...
      switch# configure terminal
      switch(config)# router ospf
      switch(config-router)# distance ospf inter-area 150
      switch(config-router)# end
      switch# write memory
      switch# exit
      

      The vtysh commands save the configuration to the /etc/frr/frr.conf file. For example:

      ...
      router ospf
       ospf router-id 10.10.10.1
       distance ospf intra-area 150 inter-area 150 external 220
      ...
      

      Topology Changes and OSPF Reconvergence

      When you remove a router or OSPF interface, LSA updates trigger throughout the network to inform all routers of the topology change. When the switch receives the LSA and runs SPF, a routing update occurs. This can cause short-duration outages while the network detects the failure and updates the OSPF database.

      For a planned outage (such as a maintenance window), you can configure the OSPF router with an OSPF max-metric to notify its neighbors not to use it as part of the OSPF topology. While the network converges, the max-metric router still forwards all traffic it receives. After the network converges, the max-metric router no longer receives any traffic and you can perform the maintenance. To remove a single interface from the topology instead, configure the OSPF cost for that specific interface.

      For failure events, traffic loss can occur during reconvergence (until SPF on all nodes computes an alternative path around the failed link or node to each of the destinations).

      To configure the max-metric (for all interfaces):

      cumulus@switch:~$ nv set vrf default router ospf max-metric administrative on
      cumulus@switch:~$ nv config apply
      
      cumulus@switch:~$ sudo vtysh
      ...
      switch# configure terminal
      switch(config)# router ospf
      switch(config-router)# max-metric router-lsa administrative
      switch(config-router)# end
      switch# write memory
      switch# exit
      

      To configure the cost (for a specific interface):

      cumulus@switch:~$ nv set interface swp51 router ospf cost 65535
      cumulus@switch:~$ nv config apply
      
      cumulus@switch:~$ sudo vtysh
      ...
      switch# configure terminal
      switch(config)# interface swp51
      switch(config-if)# ip ospf cost 65535
      switch(config-if)# end
      switch# write memory
      switch# exit
      

      Troubleshooting

      NVUE provides several commands to show OSPF interface and OSPF neighbor configuration and statistics.

      NVUE Command                                                                                         Description
      nv show vrf <vrf> router ospf interface                                                              Shows all OSPF interfaces.
      nv show vrf <vrf> router ospf interface <interface>                                                  Shows information about a specific OSPF interface.
      nv show vrf <vrf> router ospf interface <interface> local-ip                                         Shows the local IP addresses for the specified OSPF interface.
      nv show vrf <vrf> router ospf interface <interface> local-ip <IPv4-address>                          Shows statistics for a specific OSPF interface local IP address.
      nv show vrf <vrf> router ospf neighbor                                                               Shows the OSPF neighbor ID and the OSPF interface for all OSPF neighbors.
      nv show vrf <vrf> router ospf neighbor <IPv4-address>                                                Shows the interface and local IP addresses for a specific OSPF neighbor.
      nv show vrf <vrf> router ospf neighbor <IPv4-address> interface                                      Shows the local IP addresses of all the interfaces for an OSPF neighbor.
      nv show vrf <vrf> router ospf neighbor <IPv4-address> interface <interface> local-ip                 Shows the local IP addresses for a specific OSPF neighbor interface.
      nv show vrf <vrf> router ospf neighbor <IPv4-address> interface <interface> local-ip <IPv4-address>  Shows statistics for a specific OSPF neighbor interface local IP address.

      The following example shows all OSPF interfaces:

      cumulus@leaf01:mgmt:~$ nv show vrf default router ospf interface
      Interface  Summary             
      ---------  --------------------
      lo         local-ip: 10.10.10.1
      swp51      local-ip:   10.0.1.0
      

      The following example shows the OSPF neighbor ID and the OSPF interface for all OSPF neighbors:

      cumulus@switch:~$ nv show vrf default router ospf neighbor
                    Summary         
      ------------  ----------------
      10.10.10.101  Interface: swp51
      

      The following example shows detailed OSPF neighbor information, which includes statistics:

      cumulus@leaf01:mgmt:~$ nv show vrf default router ospf neighbor --operational -o json
      {
        "10.10.10.101": {
          "interface": {
            "swp51": {
              "local-ip": {
                "10.0.1.0": {
                  "bdr-router-id": "10.10.10.101",
                  "dead-timer-expiry": 33519,
                  "dr-router-id": "10.10.10.1",
                  "neighbor-ip": "10.0.1.1",
                  "priority": 1,
                  "role": "BDR",
                  "state": "full",
                  "statistics": {
                    "db-summary-qlen": 0,
                    "ls-request-qlen": 0,
                    "ls-retrans-qlen": 0,
                    "state-changes": 5
                  }
                }
              }
            }
          }
        }
      }
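
The JSON output lends itself to scripted health checks. The following Python sketch walks the structure shown above and lists every adjacency whose state is not full. The embedded sample is a trimmed copy of the output above; the function name is hypothetical, and the sketch assumes you obtain the JSON text yourself (for example, by capturing the command output).

```python
import json

# Trimmed copy of the neighbor output shown above.
SAMPLE = """
{
  "10.10.10.101": {
    "interface": {
      "swp51": {
        "local-ip": {
          "10.0.1.0": {"neighbor-ip": "10.0.1.1", "role": "BDR", "state": "full"}
        }
      }
    }
  }
}
"""


def neighbors_not_full(data: dict) -> list[tuple[str, str, str]]:
    """Return (neighbor ID, interface, state) for each adjacency not in full state."""
    problems = []
    for neighbor_id, neighbor in data.items():
        for ifname, iface in neighbor.get("interface", {}).items():
            for entry in iface.get("local-ip", {}).values():
                if entry.get("state") != "full":
                    problems.append((neighbor_id, ifname, str(entry.get("state"))))
    return problems


if __name__ == "__main__":
    print(neighbors_not_full(json.loads(SAMPLE)))
```

An empty list means every adjacency in the output reached the full state.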
      

      The following example shows the interface and local IP addresses for OSPF neighbor 10.10.10.101.

      cumulus@switch:~$ nv show vrf default router ospf neighbor 10.10.10.101
      Interface  Summary             
      ---------  --------------------
      swp51      local-ip: 10.0.1.0
      

      The following example shows more detailed information for OSPF neighbor 10.10.10.101 and includes statistics:

      cumulus@switch:~$ nv show vrf default router ospf neighbor 10.10.10.101 --operational -o json
      {
        "interface": {
          "swp51": {
            "local-ip": {
              "10.0.1.0": {
                "bdr-router-id": "10.10.10.101",
                "dead-timer-expiry": 30794,
                "dr-router-id": "10.10.10.1",
                "neighbor-ip": "10.0.1.1",
                "priority": 1,
                "role": "BDR",
                "state": "full",
                "statistics": {
                  "db-summary-qlen": 0,
                  "ls-request-qlen": 0,
                  "ls-retrans-qlen": 0,
                  "state-changes": 5
                }
              }
            }
          }
        }
      }
      

      The following example shows configuration and statistics for OSPF neighbor 10.10.10.101 on interface swp51 with the local IP address 10.0.1.0:

      cumulus@leaf01:mgmt:~$ nv show vrf default router ospf neighbor 10.10.10.101 interface swp51 local-ip 10.0.1.0
                         operational   applied
      -----------------  ------------  -------
      bdr-router-id      10.10.10.101         
      dead-timer-expiry  30042                
      dr-router-id       10.10.10.1           
      neighbor-ip        10.0.1.1             
      priority           1                    
      role               BDR                  
      state              full                 
      statistics                              
        db-summary-qlen  0                    
        ls-request-qlen  0                    
        ls-retrans-qlen  0                    
        state-changes    5    
      

      FRR (vtysh) provides several OSPF troubleshooting commands:

      vtysh Command           Description
      show ip ospf neighbor   Shows OSPF neighbor information.
      show ip ospf database   Shows if the LSDB synchronizes across all routers in the network.
      show ip route ospf      Shows if Cumulus Linux does not forward an OSPF route properly.
      show ip ospf interface  Shows OSPF interfaces.
      show ip ospf            Shows information about the OSPF process.

      The following example shows OSPF neighbor information:

      cumulus@leaf01:mgmt:~$ sudo vtysh
      ...
      leaf01# show ip ospf neighbor
      Neighbor ID     Pri State           Dead Time Address         Interface                        RXmtL RqstL DBsmL
      10.10.10.101      1 Full/Backup       30.307s 10.0.1.1        swp51:10.0.1.0                       0     0     0
      

      The following example shows the OSPF routes in the routing table, which you can use to check whether Cumulus Linux forwards an OSPF route properly:

      cumulus@leaf01:mgmt:~$ sudo vtysh
      ...
      leaf01# show ip route ospf
      Codes: K - kernel route, C - connected, S - static, R - RIP,
             O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
             T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
             F - PBR, f - OpenFabric,
             > - selected route, * - FIB route, q - queued route, r - rejected route
      
      O   10.0.1.0/31 [110/100] is directly connected, swp51, weight 1, 00:02:37
      O   10.10.10.1/32 [110/0] is directly connected, lo, weight 1, 00:02:37
      O>* 10.10.10.101/32 [110/100] via 10.0.1.1, swp51, weight 1, 00:00:57
      

      To capture OSPF packets, run the sudo tcpdump -v -i swp1 ip proto ospf command.

      Clear OSPF Counters

      You can run the following commands to clear the OSPF counters shown in the NVUE show commands.

      The following example command clears all counters for OSPF interface swp51:

      cumulus@leaf01:mgmt:~$ nv action clear vrf default router ospf interface swp51
      ...
      Action succeeded
      

      Clear the OSPF Database

      To clear the OSPF database, reestablish neighborships, and reoriginate LSAs, run the nv action clear vrf <vrf> router ospf database command:

      cumulus@leaf01:mgmt:~$ nv action clear vrf default router ospf database 
      Action executing ...
      Cleared vrf default ospf database
      Action succeeded
      

      Considerations

      With NVUE, you cannot run both the nv set vrf default router ospf area <area> network command and the nv set interface <interface> router ospf area command in the same configuration; for example, if you run the following commands, NVUE shows an invalid configuration error:

      cumulus@switch:~$ nv set router ospf enable on
      cumulus@switch:~$ nv set vrf default router ospf area 0 network 10.10.10.101/32
      cumulus@switch:~$ nv set vrf default router ospf enable on
      cumulus@switch:~$ nv set vrf default router ospf router-id 10.10.10.101
      cumulus@switch:~$ nv set interface swp1 router ospf area 10
      cumulus@switch:~$ nv config apply
      Invalid config [rev_id: 3]
        Please remove all network commands from `vrf.default.router.ospf.area.42` first.
      

      Open Shortest Path First v3 - OSPFv3

      OSPFv3 is a revised version of OSPFv2 and supports the IPv6 address family.

      IETF has defined extensions to OSPFv3 to support multiple address families (both IPv6 and IPv4). FRR does not support multiple address families.

      Basic OSPFv3 Configuration

      You can configure OSPF using either numbered interfaces or unnumbered interfaces.

      When you enable or disable OSPF, the FRR service restarts, which might impact traffic.

      NVUE commands are not supported for OSPFv3.

      OSPFv3 Numbered

      To configure OSPF using numbered interfaces, you specify the router ID, IP subnet prefix, and area address. All the interfaces on the switch with an IP address that matches the network subnet go into the specified area. OSPF attempts to discover other OSPF routers on those interfaces. Cumulus Linux adds all matching interface network addresses to a Type-1 Router LSA and advertises to discovered neighbors for proper reachability.

      If you do not want to bring up an OSPF adjacency on certain interfaces, but want to advertise those networks in the OSPF database, you can configure the interfaces as passive interfaces. A passive interface creates a database entry but does not send or receive OSPF hello packets. For example, in a data center topology, the host-facing interfaces do not need to run OSPF; however, you must advertise the corresponding IP addresses to neighbors.

      The following example commands configure OSPF numbered on leaf01 and spine01.

      leaf01
      • The loopback address is 2001:db8::a0a:0a01/128
      • The IP address on swp51 is 2001:db8::a00:0101/127
      • The router ID is 10.10.10.1
      • All the interfaces on the switch with an IP address that matches subnet 2001:db8::a0a:0a01/128 and swp51 with IP address 2001:db8::a00:0101/127 are in area 0.0.0.0
      • swp1 and swp2 are passive interfaces

      spine01
      • The loopback address is 2001:db8::a0a:0a65/128
      • The IP address on swp1 is 2001:db8::a00:0100/127
      • The router ID is 10.10.10.101
      • All the interfaces on the switch with an IP address that matches subnet 2001:db8::a0a:0a65/128 and swp1 with IP address 2001:db8::a00:0100/127 are in area 0.0.0.0

      1. Edit the /etc/frr/daemons file to enable the ospf6 daemon, then start the FRR service (see FRRouting).

      2. Edit the /etc/network/interfaces file to configure the IP address for the loopback and swp51:

      cumulus@leaf01:~$ sudo nano /etc/network/interfaces
      ...
      auto lo
      iface lo inet loopback
        address 2001:db8::a0a:0a01/128
      
      auto swp51
      iface swp51
        address 2001:db8::a00:0101/127
      
      3. Run the ifreload -a command to load the new configuration:

        cumulus@leaf01:~$ sudo ifreload -a

      4. From the vtysh shell, configure OSPF:

        cumulus@leaf01:~$ sudo vtysh
        ...
        leaf01# configure terminal
        leaf01(config)# router ospf6
        leaf01(config-ospf6)# ospf6 router-id 10.10.10.1
        leaf01(config-ospf6)# interface lo area 0.0.0.0
        leaf01(config-ospf6)# interface swp51 area 0.0.0.0
        leaf01(config-ospf6)# exit
        leaf01(config)# interface swp1
        leaf01(config-if)# ipv6 ospf6 passive
        leaf01(config-if)# exit
        leaf01(config)# interface swp2
        leaf01(config-if)# ipv6 ospf6 passive
        leaf01(config-if)# end
        leaf01# write memory
        leaf01# exit
        
      1. Edit the /etc/frr/daemons file to enable the ospf6 daemon, then start the FRR service (see FRRouting).

      2. Edit the /etc/network/interfaces file to configure the IP address for the loopback and swp1:

        cumulus@spine01:~$ sudo nano /etc/network/interfaces
        ...
        auto lo
        iface lo inet loopback
          address 2001:db8::a0a:0a65/128
        
        auto swp1
        iface swp1
          address 2001:db8::a00:0100/127
        
      3. Run the ifreload -a command to load the new configuration:

        cumulus@spine01:~$ sudo ifreload -a
        
      4. From the vtysh shell, configure OSPF:

        cumulus@spine01:~$ sudo vtysh
        ...
        spine01# configure terminal
        spine01(config)# router ospf6
        spine01(config-ospf6)# ospf6 router-id 10.10.10.101
        spine01(config-ospf6)# interface lo area 0.0.0.0
        spine01(config-ospf6)# interface swp1 area 0.0.0.0
        spine01(config-ospf6)# end
        spine01# write memory
        spine01# exit
        

      The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

      ...
      router ospf6
       ospf6 router-id 10.10.10.1
       interface lo area 0.0.0.0
       interface swp51 area 0.0.0.0
      interface swp1
       ipv6 ospf6 passive
      interface swp2
       ipv6 ospf6 passive
      ...
      
      ...
      router ospf6
       ospf6 router-id 10.10.10.101
       interface lo area 0.0.0.0
       interface swp1 area 0.0.0.0
      ...
      

      OSPFv3 Unnumbered

      Unnumbered interfaces are interfaces without unique IP addresses; multiple interfaces share the same IP address.

      To configure an unnumbered interface, take the IP address of another interface (called the anchor) and use that as the IP address of the unnumbered interface. The anchor is typically the loopback interface on the switch.

      OSPFv3 unnumbered supports point-to-point interfaces only.

      The following example commands configure OSPFv3 unnumbered on leaf01 and spine01.

      leaf01:
      • The loopback address is 2001:db8::a0a:0a01/128
      • The router ID is 10.10.10.1
      • OSPF is on the loopback interface and on swp51 in area 0.0.0.0
      • swp1 and swp2 are passive interfaces
      • swp51 is a point-to-point interface (unnumbered interfaces require point-to-point)

      spine01:
      • The loopback address is 2001:db8::a0a:0a65/128
      • The router ID is 10.10.10.101
      • OSPF is on the loopback interface and on swp1 in area 0.0.0.0
      • swp1 is a point-to-point interface (unnumbered interfaces require point-to-point)

          1. Edit the /etc/frr/daemons file to enable the ospf6 daemon, then start the FRR service (see FRRouting).

          2. Edit the /etc/network/interfaces file to configure the IP address for the loopback and swp51:

          cumulus@leaf01:~$ sudo nano /etc/network/interfaces
          ...
          auto lo
          iface lo inet loopback
            address 2001:db8::a0a:0a01/128
          
          auto swp51
          iface swp51
            address 2001:db8::a0a:0a01/128
          
          3. Run the ifreload -a command to load the new configuration:

            cumulus@leaf01:~$ sudo ifreload -a

          4. From the vtysh shell, configure OSPFv3:

          cumulus@leaf01:~$ sudo vtysh
          ...
          leaf01# configure terminal
          leaf01(config)# router ospf6
          leaf01(config-ospf6)# ospf6 router-id 10.10.10.1
          leaf01(config-ospf6)# interface lo area 0.0.0.0
          leaf01(config-ospf6)# interface swp51 area 0.0.0.0
          leaf01(config-ospf6)# exit
          leaf01(config)# interface swp1
          leaf01(config-if)# ipv6 ospf6 passive
          leaf01(config-if)# exit
          leaf01(config)# interface swp2
          leaf01(config-if)# ipv6 ospf6 passive
          leaf01(config-if)# exit
          leaf01(config)# interface swp51
          leaf01(config-if)# ipv6 ospf6 network point-to-point
          leaf01(config-if)# end
          leaf01# write memory
          leaf01# exit
          
          1. Edit the /etc/frr/daemons file to enable the ospf6 daemon, then start the FRR service (see FRRouting).

          2. Edit the /etc/network/interfaces file to configure the IP address for the loopback and swp1:

          cumulus@spine01:~$ sudo nano /etc/network/interfaces
          ...
          auto lo
          iface lo inet loopback
            address 2001:db8::a0a:0a65/128
          
          auto swp1
          iface swp1
            address 2001:db8::a0a:0a65/128
          
          3. Run the ifreload -a command to load the new configuration:

            cumulus@spine01:~$ sudo ifreload -a

          4. From the vtysh shell, configure OSPFv3:

          cumulus@spine01:~$ sudo vtysh
          ...
          spine01# configure terminal
          spine01(config)# router ospf6
          spine01(config-ospf6)# ospf6 router-id 10.10.10.101
          spine01(config-ospf6)# interface lo area 0.0.0.0
          spine01(config-ospf6)# interface swp1 area 0.0.0.0
          spine01(config-ospf6)# exit
          spine01(config)# interface swp1
          spine01(config-if)# ipv6 ospf6 network point-to-point
          spine01(config-if)# end
          spine01# write memory
          spine01# exit
          

          The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

          ...
          router ospf6
           ospf6 router-id 10.10.10.1
           interface lo area 0.0.0.0
           interface swp51 area 0.0.0.0
          interface swp1
           ipv6 ospf6 passive
          interface swp2
           ipv6 ospf6 passive
          interface swp51
           ipv6 ospf6 network point-to-point
          ...
          
          ...
          router ospf6
           ospf6 router-id 10.10.10.101
           interface lo area 0.0.0.0
           interface swp1 area 0.0.0.0
          interface swp1
           ipv6 ospf6 network point-to-point
          ...
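
The frr.conf stanzas above follow a regular pattern, so you can generate them from a few inputs. The following Python sketch is illustrative only and is not part of Cumulus Linux or FRR; the function name render_ospf6_config and its parameters are assumptions made for this example. It only builds text and does not modify /etc/frr/frr.conf.

```python
def render_ospf6_config(router_id, area, ospf_interfaces,
                        passive=(), point_to_point=()):
    """Return frr.conf-style text for a basic OSPFv3 setup (hypothetical helper)."""
    lines = ["router ospf6", f" ospf6 router-id {router_id}"]
    # Each OSPFv3-enabled interface is attached to the area under 'router ospf6'.
    for name in ospf_interfaces:
        lines.append(f" interface {name} area {area}")
    # Passive and point-to-point settings are per-interface commands.
    for name in passive:
        lines += [f"interface {name}", " ipv6 ospf6 passive"]
    for name in point_to_point:
        lines += [f"interface {name}", " ipv6 ospf6 network point-to-point"]
    return "\n".join(lines)


# Values taken from the leaf01 unnumbered example.
leaf01 = render_ospf6_config(
    router_id="10.10.10.1",
    area="0.0.0.0",
    ospf_interfaces=["lo", "swp51"],
    passive=["swp1", "swp2"],
    point_to_point=["swp51"],
)
print(leaf01)
```

Comparing the generated text with the saved configuration is one way to spot a command that was entered on the wrong interface.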
          

          Optional OSPFv3 Configuration

          This section describes optional configuration. The steps provided in this section assume that you already configured basic OSPFv3 as described in Basic OSPFv3 Configuration, above.

          Interface Parameters

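          You can define the following OSPF parameters per interface:
          • Network type
          • Hello interval
          • Dead interval
          • Priority for becoming the OSPF Designated Router (DR) on a broadcast interface
          • Advertise prefix list
          • Cost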

          The following command example sets the network type to point-to-point on swp51.

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# interface swp51
          switch(config-if)# ipv6 ospf6 network point-to-point
          switch(config-if)# end
          switch# write memory
          switch# exit
          

          The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

          ...
          interface swp51
           ipv6 ospf6 network point-to-point
          ...
          

          The following command example sets the hello interval to 5 seconds, the dead interval to 60 seconds, and the priority to 5 for swp51. The hello interval and dead interval can be any value between 1 and 65535 seconds. The priority can be any value between 0 and 255 (0 configures the interface to never become the OSPF Designated Router (DR) on a broadcast interface).

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# interface swp51
          switch(config-if)# ipv6 ospf6 hello-interval 5
          switch(config-if)# ipv6 ospf6 dead-interval 60
          switch(config-if)# ipv6 ospf6 priority 5
          switch(config-if)# end
          switch# write memory
          switch# exit
          

          The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

          ...
          interface swp51
           ipv6 ospf6 hello-interval 5
           ipv6 ospf6 dead-interval 60
           ipv6 ospf6 priority 5
          ...
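
Before you enter the timer and priority commands, you can check the values against the ranges stated above. The following Python sketch is an illustration under that assumption; the helper name ospf6_interface_stanza is hypothetical, and the ranges (1 to 65535 seconds for the hello and dead intervals, 0 to 255 for the priority) come from this section.

```python
def ospf6_interface_stanza(ifname, hello=None, dead=None, priority=None):
    """Validate per-interface OSPFv3 values and return frr.conf-style lines."""
    checks = [
        # (keyword, value, lowest allowed, highest allowed)
        ("hello-interval", hello, 1, 65535),
        ("dead-interval", dead, 1, 65535),
        ("priority", priority, 0, 255),
    ]
    lines = [f"interface {ifname}"]
    for keyword, value, low, high in checks:
        if value is None:
            continue  # parameter not set; leave the default in place
        if not low <= value <= high:
            raise ValueError(
                f"{keyword} must be between {low} and {high}, got {value}"
            )
        lines.append(f" ipv6 ospf6 {keyword} {value}")
    return lines


for line in ospf6_interface_stanza("swp51", hello=5, dead=60, priority=5):
    print(line)
```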
          

          The following example command configures interface swp51 with the IPv6 advertise prefix list named myfilter:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# interface swp51
          switch(config-if)# ipv6 ospf6 advertise prefix-list myfilter
          switch(config-if)# end
          switch# write memory
          switch# exit
          

          The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

          ...
          interface swp51
            ipv6 ospf6 advertise prefix-list myfilter
          ...
          

          The following example command configures the cost for swp51.

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# interface swp51
          switch(config-if)# ipv6 ospf6 cost 1
          switch(config-if)# end
          switch# write memory
          switch# exit
          

          The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

          ...
          interface swp51
            ipv6 ospf6 cost 1
          ...
          

          To show the configured OSPF interface parameter values, run the vtysh show ipv6 ospf6 interface command.

          SPF Timer Defaults

          OSPFv3 uses the following default timers to prevent consecutive SPF calculations from overburdening the CPU:

          The following example commands change the number of milliseconds from the initial event until SPF runs to 80, the number of milliseconds between consecutive SPF runs to 100, and the maximum number of milliseconds between SPFs to 6000.

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# router ospf6
          switch(config-ospf6)# timers throttle spf 80 100 6000
          switch(config-ospf6)# end
          switch# write memory
          switch# exit
          

          The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

          ...
          router ospf6
           ospf6 router-id 10.10.10.1
           interface lo area 0.0.0.0
           interface swp51 area 0.0.0.0
           timers throttle spf 80 100 6000
          ...
          

          To see the configured SPF timer values, run the vtysh show ipv6 ospf6 command.
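
To get a feel for how the three throttle values interact, the following Python sketch models the wait between consecutive SPF runs. It is a simplified model, not the FRR implementation: it assumes that each SPF-triggering event arriving within the current hold time raises the hold time by the initial hold time, up to the maximum. The function name spf_hold_times is hypothetical, and the actual back-off behavior in your FRR version might differ.

```python
def spf_hold_times(initial_hold_ms, max_hold_ms, back_to_back_events):
    """Hold time before each consecutive SPF run under an ASSUMED linear back-off."""
    holds = []
    hold = initial_hold_ms
    for _ in range(back_to_back_events):
        # The hold time never exceeds the configured maximum.
        holds.append(min(hold, max_hold_ms))
        # Assumption: every further event inside the hold window adds one step.
        hold += initial_hold_ms
    return holds


# Values from 'timers throttle spf 80 100 6000': hold starts at 100 ms.
print(spf_hold_times(100, 6000, 5))
```

With the example values, the model keeps the first SPF runs close together and caps the gap at 6000 milliseconds if the network keeps changing.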

          Configure the OSPFv3 Area

          You can use different areas to control routing. You can:

          The following section provides command examples.

          The following example command removes the 3:3::/64 route from the routing table. Without a route in the table, any destinations in that network are not reachable.

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# router ospf6
          switch(config-ospf6)# area 0.0.0.0 range 3:3::/64 not-advertise
          switch(config-ospf6)# end
          switch# write memory
          switch# exit
          

          The following example command creates a summary route for all the routes in the range 2001::/64:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# router ospf6
          switch(config-ospf6)# area 0.0.0.0 range 2001::/64 advertise
          switch(config-ospf6)# end
          switch# write memory
          switch# exit
          

          You can also configure the cost for a summary route, which Cumulus Linux uses to determine the shortest paths to the destination. The value for cost must be between 0 and 16777215.

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# router ospf6
          switch(config-ospf6)# area 0.0.0.0 range 2001::/64 cost 160
          switch(config-ospf6)# end
          switch# write memory
          switch# exit
          

          The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

          ...
          router ospf6
            ospf6 router-id 10.10.10.1
            area 0.0.0.0 range 3:3::/64 not-advertise
            area 0.0.0.0 range 2001::/64 advertise
            area 0.0.0.0 range 2001::/64 cost 160
          ...
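
An area range applies only to prefixes that fall inside the configured range. The following Python sketch uses the standard ipaddress module to check which prefixes a range such as 2001::/64 covers. The helper name covered_by_range and the sample prefixes are assumptions for illustration; the check does not query the switch.

```python
import ipaddress


def covered_by_range(prefix, area_range):
    """Return True if prefix lies entirely inside area_range."""
    candidate = ipaddress.ip_network(prefix)
    summary = ipaddress.ip_network(area_range)
    # subnet_of() raises TypeError for mixed IPv4/IPv6, so compare versions first.
    return candidate.version == summary.version and candidate.subnet_of(summary)


# Hypothetical prefixes checked against the summary range used in this section.
for prefix in ("2001::10:0/112", "2001:0:0:1::/64"):
    print(prefix, covered_by_range(prefix, "2001::/64"))
```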
          

          Stub Areas

          External routes are the routes redistributed into OSPF from another protocol. They have an AS-wide flooding scope. Typically, external link states make up a large percentage of the LSDB. Stub areas reduce the LSDB size by not flooding AS-external LSAs.

          All routers must agree that an area is a stub, otherwise they do not become OSPF neighbors.

          To configure a stub area:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# router ospf6
          switch(config-ospf6)# area 0.0.0.1 stub
          switch(config-ospf6)# end
          switch# write memory
          switch# exit
          

          The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

          ...
          router ospf6
           ospf6 router-id 10.10.10.63
           area 0.0.0.1 stub
          ...
          

          Stub areas still receive information about networks that belong to other areas of the same OSPF domain. If summarization is not configured (or is not comprehensive), the information can be overwhelming for the nodes. Totally stubby areas address this issue. Routers in totally stubby areas keep only information about routing within their own area, plus a default summary route, in their LSDB.

          To configure a totally stubby area:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# router ospf6
          switch(config-ospf6)# area 0.0.0.1 stub no-summary
          switch(config-ospf6)# end
          switch# write memory
          switch# exit
          

          The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

          ...
          router ospf6
           ospf6 router-id 10.10.10.63
           area 0.0.0.1 stub no-summary
          ...
          

          Here is a brief summary of the area type differences:

          Type                  Behavior
          Normal non-zero area  LSA types 1, 2, 3, 4 area-scoped, type 5 externals, inter-area routes summarized
          Stub area             LSA types 1, 2, 3, 4 area-scoped, no type 5 externals, inter-area routes summarized
          Totally stubby area   LSA types 1, 2 area-scoped, default summary, no type 3, 4, 5 LSA types allowed
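
The summary above can also be written as a small lookup. The following Python sketch encodes the LSA type numbers exactly as the summary lists them; the dictionary and function names are hypothetical, and the default summary route that a totally stubby area receives is noted in a comment rather than modeled.

```python
# LSA types permitted in each area type, as listed in the summary above.
AREA_LSA_TYPES = {
    "normal": {1, 2, 3, 4, 5},
    "stub": {1, 2, 3, 4},
    # A totally stubby area also receives a default summary route,
    # which this simple lookup does not model.
    "totally-stubby": {1, 2},
}


def lsa_allowed(area_type, lsa_type):
    """Return True if the summary permits lsa_type in area_type."""
    return lsa_type in AREA_LSA_TYPES[area_type]


for area in AREA_LSA_TYPES:
    print(area, "accepts type 5 externals:", lsa_allowed(area, 5))
```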

          Auto-cost Reference Bandwidth

          When you set the auto-cost reference bandwidth, Cumulus Linux dynamically calculates the OSPF interface cost to support higher speed links. The default value is 100000 for 100Gbps link speed. The cost of interfaces with link speeds lower than 100Gbps is higher.

          To avoid routing loops, set the bandwidth to a consistent value across all OSPF routers.
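
As a rough illustration of the calculation, the following Python sketch divides the reference bandwidth by the link speed, both in Mbps. The rounding to the nearest integer and the minimum cost of 1 are assumptions of this sketch, and the function name ospf_interface_cost is hypothetical; check the costs on the switch with the vtysh show ipv6 ospf6 interface command.

```python
def ospf_interface_cost(link_speed_mbps, reference_bandwidth_mbps=100000):
    """Estimate the OSPF cost of a link from the auto-cost reference bandwidth."""
    if link_speed_mbps <= 0:
        raise ValueError("link speed must be positive")
    # Assumption: round to the nearest integer and never go below 1.
    cost = int(reference_bandwidth_mbps / link_speed_mbps + 0.5)
    return max(cost, 1)


# Default reference bandwidth of 100000 (100Gbps).
for speed in (100000, 10000, 1000):
    print(speed, "Mbps ->", ospf_interface_cost(speed))
```

Under these assumptions, a 1Gbps link has a cost of 100 with the default reference bandwidth, which matches the [110/100] entry for the directly connected swp51 route in the Troubleshooting output.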

          The following example commands configure the auto-cost reference bandwidth for 90Gbps link speed:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# router ospf6
          switch(config-ospf6)# auto-cost reference-bandwidth 90000
          switch(config-ospf6)# end
          switch# write memory
          switch# exit
          

          The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

          ...
          router ospf6
           ospf6 router-id 10.10.10.1
           interface lo area 0.0.0.0
           interface swp51 area 0.0.0.0
           auto-cost reference-bandwidth 90000
          ...
          

          Administrative Distance

          Cumulus Linux uses the administrative distance to choose which routing protocol to use when two different protocols provide route information for the same destination. The smaller the distance, the more reliable the protocol. For example, if the switch receives a route from OSPFv3 with an administrative distance of 110 and the same route from BGP with an administrative distance of 100, the switch chooses BGP.
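
The selection rule can be sketched in a few lines of Python. This is a minimal illustration that uses the distances from the example in this section, not the switch's route selection code; the function name best_route is hypothetical.

```python
def best_route(candidates):
    """Pick the (protocol, distance) pair with the lowest administrative distance."""
    return min(candidates, key=lambda candidate: candidate[1])


# The same destination learned from two protocols, as in the example above.
candidates = [("OSPFv3", 110), ("BGP", 100)]
print(best_route(candidates))
```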

          Cumulus Linux provides several commands to change the administrative distance for OSPF routes. The default value is 110.

          This example command sets the distance for an entire group of routes, rather than a specific route.

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# router ospf6
          switch(config-ospf6)# distance 254
          switch(config-ospf6)# end
          switch# write memory
          switch# exit
          

          This example command changes the OSPF administrative distance to 150 for internal routes and 220 for external routes:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# router ospf6
          switch(config-ospf6)# distance ospf6 intra-area 150 inter-area 150 external 220
          switch(config-ospf6)# end
          switch# write memory
          switch# exit
          

          This example command changes the OSPF administrative distance to 150 for internal routes to a subnet or network inside the same area as the router:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# router ospf6
          switch(config-ospf6)# distance ospf6 intra-area 150
          switch(config-ospf6)# end
          switch# write memory
          switch# exit
          

          This example command changes the OSPF administrative distance to 150 for internal routes to a subnet in an area of which the router is not a part:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# router ospf6
          switch(config-ospf6)# distance ospf6 inter-area 150
          switch(config-ospf6)# end
          switch# write memory
          switch# exit
          

          The vtysh commands save the configuration to the /etc/frr/frr.conf file. For example:

          ...
          router ospf6
           ospf6 router-id 10.10.10.1
           interface lo area 0.0.0.0
           distance ospf6 intra-area 150 inter-area 150 external 220
          ...
          

          Troubleshooting

          Cumulus Linux provides several OSPFv3 troubleshooting commands:

          To                                                                    vtysh Command
          Show neighbor states                                                  show ipv6 ospf6 neighbor
          Verify that the LSDB is the same across all routers in the network    show ipv6 ospf6 database
          Determine why Cumulus Linux does not forward an OSPF route correctly  show ipv6 ospf6 route
          Show OSPF interfaces                                                  show ipv6 ospf6 interface
          Help visualize the network view                                       show ipv6 ospf6 spf tree
          Show information about the OSPFv3 process                             show ipv6 ospf6

          The following example shows the vtysh show ipv6 ospf6 neighbor command output:

          cumulus@leaf01:mgmt:~$ sudo vtysh
          ...
          leaf01# show ipv6 ospf6 neighbor
          Neighbor ID     Pri    DeadTime    State/IfState         Duration I/F[State]
          10.10.10.101      1    00:00:34     Full/BDR             00:02:58 swp51[DR]
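
If you collect this output from many switches, a short script can flag neighbors that are not in the Full state. The following Python sketch parses the column layout shown above; it assumes the output keeps that layout, and the function names parse_neighbors and not_full are hypothetical.

```python
SAMPLE = """\
Neighbor ID     Pri    DeadTime    State/IfState         Duration I/F[State]
10.10.10.101      1    00:00:34     Full/BDR             00:02:58 swp51[DR]
"""


def parse_neighbors(output):
    """Parse 'show ipv6 ospf6 neighbor' text into a list of dictionaries."""
    neighbors = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows have six whitespace-separated fields; the header has seven.
        if len(fields) != 6 or "/" not in fields[3]:
            continue
        state, neighbor_if_state = fields[3].split("/", 1)
        neighbors.append({
            "neighbor_id": fields[0],
            "priority": int(fields[1]),
            "state": state,
            "neighbor_if_state": neighbor_if_state,
            "interface": fields[5].split("[", 1)[0],
        })
    return neighbors


def not_full(neighbors):
    """Return the IDs of neighbors whose adjacency is not Full."""
    return [n["neighbor_id"] for n in neighbors if n["state"] != "Full"]


parsed = parse_neighbors(SAMPLE)
print(parsed)
print("Not Full:", not_full(parsed))
```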
          

          The following example shows the vtysh show ipv6 ospf6 route command output:

          cumulus@leaf01:mgmt:~$ sudo vtysh
          ...
          leaf01# show ipv6 ospf6 route
          Codes: K - kernel route, C - connected, S - static, R - RIPng,
                 O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
                 v - VNC, V - VNC-Direct, A - Babel, D - SHARP, F - PBR,
                 f - OpenFabric,
                 > - selected route, * - FIB route, q - queued route, r - rejected route
          
          O   2001:db8::a00:100/127 [110/100] is directly connected, swp51, weight 1, 00:00:20
          O   2001:db8::a0a:a01/128 [110/10] is directly connected, lo, weight 1, 00:01:40
          O>* 2001:db8::a0a:a65/128 [110/110] via fe80::4638:39ff:fe00:2, swp51, weight 1, 00:00:15
          

          To capture OSPFv3 packets, run the sudo tcpdump -v -i swp1 ip6 proto 89 command (OSPFv3 runs directly over IPv6 as protocol number 89).

          OSPF Configuration Example

          This section shows an OSPF configuration example based on the reference topology.

          The example configuration configures:

          cumulus@leaf01:~$ nv set interface lo ip address 10.10.10.1/32
          cumulus@leaf01:~$ nv set interface swp51 ip address 10.10.10.1/32
          cumulus@leaf01:~$ nv set interface swp52 ip address 10.10.10.1/32
          cumulus@leaf01:~$ nv set interface bond1 bond member swp1
          cumulus@leaf01:~$ nv set interface bond2 bond member swp2
          cumulus@leaf01:~$ nv set interface bond3 bond member swp3
          cumulus@leaf01:~$ nv set interface bond1 bond mlag id 1
          cumulus@leaf01:~$ nv set interface bond2 bond mlag id 2
          cumulus@leaf01:~$ nv set interface bond3 bond mlag id 3
          cumulus@leaf01:~$ nv set interface bond1 bond lacp-bypass on
          cumulus@leaf01:~$ nv set interface bond2 bond lacp-bypass on
          cumulus@leaf01:~$ nv set interface bond3 bond lacp-bypass on
          cumulus@leaf01:~$ nv set interface bond1-3 bridge domain br_default
          cumulus@leaf01:~$ nv set interface peerlink bond member swp49-50
          cumulus@leaf01:~$ nv set mlag mac-address 44:38:39:FF:00:AA
          cumulus@leaf01:~$ nv set mlag backup 10.10.10.2
          cumulus@leaf01:~$ nv set mlag peer-ip linklocal
          cumulus@leaf01:~$ nv set interface vlan10 ip address 10.1.10.2/24
          cumulus@leaf01:~$ nv set interface vlan20 ip address 10.1.20.2/24
          cumulus@leaf01:~$ nv set interface vlan30 ip address 10.1.30.2/24
          cumulus@leaf01:~$ nv set interface vlan10 ip vrr address 10.1.10.1/24
          cumulus@leaf01:~$ nv set interface vlan10 ip vrr mac-address 00:00:5e:00:01:00
          cumulus@leaf01:~$ nv set interface vlan10 ip vrr state up
          cumulus@leaf01:~$ nv set interface vlan20 ip vrr address 10.1.20.1/24
          cumulus@leaf01:~$ nv set interface vlan20 ip vrr mac-address 00:00:5e:00:01:00
          cumulus@leaf01:~$ nv set interface vlan20 ip vrr state up
          cumulus@leaf01:~$ nv set interface vlan30 ip vrr address 10.1.30.1/24
          cumulus@leaf01:~$ nv set interface vlan30 ip vrr mac-address 00:00:5e:00:01:00
          cumulus@leaf01:~$ nv set interface vlan30 ip vrr state up
          cumulus@leaf01:~$ nv set bridge domain br_default vlan 10,20,30
          cumulus@leaf01:~$ nv set bridge domain br_default untagged 1
          cumulus@leaf01:~$ nv set interface bond1 bridge domain br_default access 10
          cumulus@leaf01:~$ nv set interface bond2 bridge domain br_default access 20
          cumulus@leaf01:~$ nv set interface bond3 bridge domain br_default access 30
          cumulus@leaf01:~$ nv set vrf default router ospf router-id 10.10.10.1
          cumulus@leaf01:~$ nv set interface lo router ospf area 0
          cumulus@leaf01:~$ nv set interface swp51 router ospf area 0
          cumulus@leaf01:~$ nv set interface swp52 router ospf area 0
          cumulus@leaf01:~$ nv set interface swp51 router ospf network-type point-to-point
          cumulus@leaf01:~$ nv set interface swp52 router ospf network-type point-to-point
          cumulus@leaf01:~$ nv set interface swp51 router ospf timers hello-interval 5
          cumulus@leaf01:~$ nv set interface swp51 router ospf timers dead-interval 60
          cumulus@leaf01:~$ nv set interface swp52 router ospf timers hello-interval 5
          cumulus@leaf01:~$ nv set interface swp52 router ospf timers dead-interval 60
          cumulus@leaf01:~$ nv set interface vlan10 router ospf area 0
          cumulus@leaf01:~$ nv set interface vlan20 router ospf area 0
          cumulus@leaf01:~$ nv set interface vlan30 router ospf area 0
          cumulus@leaf01:~$ nv set interface vlan10 router ospf passive on
          cumulus@leaf01:~$ nv set interface vlan20 router ospf passive on
          cumulus@leaf01:~$ nv set interface vlan30 router ospf passive on
          cumulus@leaf01:~$ nv set router ospf timers spf delay 80
          cumulus@leaf01:~$ nv set router ospf timers spf holdtime 100
          cumulus@leaf01:~$ nv set router ospf timers spf max-holdtime 6000
          cumulus@leaf01:~$ nv config apply
          
          cumulus@leaf02:~$ nv set interface lo ip address 10.10.10.2/32
          cumulus@leaf02:~$ nv set interface swp51 ip address 10.10.10.2/32
          cumulus@leaf02:~$ nv set interface swp52 ip address 10.10.10.2/32
          cumulus@leaf02:~$ nv set interface bond1 bond member swp1
          cumulus@leaf02:~$ nv set interface bond2 bond member swp2
          cumulus@leaf02:~$ nv set interface bond3 bond member swp3
          cumulus@leaf02:~$ nv set interface bond1 bond mlag id 1
          cumulus@leaf02:~$ nv set interface bond2 bond mlag id 2
          cumulus@leaf02:~$ nv set interface bond3 bond mlag id 3
          cumulus@leaf02:~$ nv set interface bond1 bond lacp-bypass on
          cumulus@leaf02:~$ nv set interface bond2 bond lacp-bypass on
          cumulus@leaf02:~$ nv set interface bond3 bond lacp-bypass on
          cumulus@leaf02:~$ nv set interface bond1-3 bridge domain br_default
          cumulus@leaf02:~$ nv set interface peerlink bond member swp49-50
          cumulus@leaf02:~$ nv set mlag mac-address 44:38:39:FF:00:AA
          cumulus@leaf02:~$ nv set mlag backup 10.10.10.1
          cumulus@leaf02:~$ nv set mlag peer-ip linklocal
          cumulus@leaf02:~$ nv set interface vlan10 ip address 10.1.10.3/24
          cumulus@leaf02:~$ nv set interface vlan20 ip address 10.1.20.3/24
          cumulus@leaf02:~$ nv set interface vlan30 ip address 10.1.30.3/24
          cumulus@leaf02:~$ nv set interface vlan10 ip vrr address 10.1.10.1/24
          cumulus@leaf02:~$ nv set interface vlan10 ip vrr mac-address 00:00:5e:00:01:00
          cumulus@leaf02:~$ nv set interface vlan10 ip vrr state up
          cumulus@leaf02:~$ nv set interface vlan20 ip vrr address 10.1.20.1/24
          cumulus@leaf02:~$ nv set interface vlan20 ip vrr mac-address 00:00:5e:00:01:00
          cumulus@leaf02:~$ nv set interface vlan20 ip vrr state up
          cumulus@leaf02:~$ nv set interface vlan30 ip vrr address 10.1.30.1/24
          cumulus@leaf02:~$ nv set interface vlan30 ip vrr mac-address 00:00:5e:00:01:00
          cumulus@leaf02:~$ nv set interface vlan30 ip vrr state up
          cumulus@leaf02:~$ nv set bridge domain br_default vlan 10,20,30
          cumulus@leaf02:~$ nv set bridge domain br_default untagged 1
          cumulus@leaf02:~$ nv set interface bond1 bridge domain br_default access 10
          cumulus@leaf02:~$ nv set interface bond2 bridge domain br_default access 20
          cumulus@leaf02:~$ nv set interface bond3 bridge domain br_default access 30
          cumulus@leaf02:~$ nv set vrf default router ospf router-id 10.10.10.2
          cumulus@leaf02:~$ nv set interface lo router ospf area 0
          cumulus@leaf02:~$ nv set interface swp51 router ospf area 0
          cumulus@leaf02:~$ nv set interface swp52 router ospf area 0
          cumulus@leaf02:~$ nv set interface swp51 router ospf network-type point-to-point
          cumulus@leaf02:~$ nv set interface swp52 router ospf network-type point-to-point
          cumulus@leaf02:~$ nv set interface swp51 router ospf timers hello-interval 5
          cumulus@leaf02:~$ nv set interface swp51 router ospf timers dead-interval 60
          cumulus@leaf02:~$ nv set interface swp52 router ospf timers hello-interval 5
          cumulus@leaf02:~$ nv set interface swp52 router ospf timers dead-interval 60
          cumulus@leaf02:~$ nv set interface vlan10 router ospf area 0
          cumulus@leaf02:~$ nv set interface vlan20 router ospf area 0
          cumulus@leaf02:~$ nv set interface vlan30 router ospf area 0
          cumulus@leaf02:~$ nv set interface vlan10 router ospf passive on
          cumulus@leaf02:~$ nv set interface vlan20 router ospf passive on
          cumulus@leaf02:~$ nv set interface vlan30 router ospf passive on
          cumulus@leaf02:~$ nv set router ospf timers spf delay 80
          cumulus@leaf02:~$ nv set router ospf timers spf holdtime 100
          cumulus@leaf02:~$ nv set router ospf timers spf max-holdtime 6000
          cumulus@leaf02:~$ nv config apply
          
          cumulus@spine01:~$ nv set interface lo ip address 10.10.10.101/32
          cumulus@spine01:~$ nv set interface swp1 ip address 10.10.10.101/32
          cumulus@spine01:~$ nv set interface swp2 ip address 10.10.10.101/32
          cumulus@spine01:~$ nv set interface swp5 ip address 10.10.10.101/32
          cumulus@spine01:~$ nv set interface swp6 ip address 10.10.10.101/32
          cumulus@spine01:~$ nv set vrf default router ospf router-id 10.10.10.101
          cumulus@spine01:~$ nv set interface lo router ospf area 0
          cumulus@spine01:~$ nv set interface swp1 router ospf area 0
          cumulus@spine01:~$ nv set interface swp1 router ospf network-type point-to-point
          cumulus@spine01:~$ nv set interface swp1 router ospf timers hello-interval 5
          cumulus@spine01:~$ nv set interface swp1 router ospf timers dead-interval 60
          cumulus@spine01:~$ nv set interface swp2 router ospf area 0
          cumulus@spine01:~$ nv set interface swp2 router ospf network-type point-to-point
          cumulus@spine01:~$ nv set interface swp2 router ospf timers hello-interval 5
          cumulus@spine01:~$ nv set interface swp2 router ospf timers dead-interval 60
          cumulus@spine01:~$ nv set interface swp5 router ospf area 0
          cumulus@spine01:~$ nv set interface swp5 router ospf network-type point-to-point
          cumulus@spine01:~$ nv set interface swp5 router ospf timers hello-interval 5
          cumulus@spine01:~$ nv set interface swp5 router ospf timers dead-interval 60
          cumulus@spine01:~$ nv set interface swp6 router ospf area 0
          cumulus@spine01:~$ nv set interface swp6 router ospf network-type point-to-point
          cumulus@spine01:~$ nv set interface swp6 router ospf timers hello-interval 5
          cumulus@spine01:~$ nv set interface swp6 router ospf timers dead-interval 60
          cumulus@spine01:~$ nv set router ospf timers spf delay 80
          cumulus@spine01:~$ nv set router ospf timers spf holdtime 100
          cumulus@spine01:~$ nv set router ospf timers spf max-holdtime 6000
          cumulus@spine01:~$ nv config apply
          
          cumulus@spine02:~$ nv set interface lo ip address 10.10.10.102/32
          cumulus@spine02:~$ nv set interface swp1 ip address 10.10.10.102/32
          cumulus@spine02:~$ nv set interface swp2 ip address 10.10.10.102/32
          cumulus@spine02:~$ nv set interface swp5 ip address 10.10.10.102/32
          cumulus@spine02:~$ nv set interface swp6 ip address 10.10.10.102/32
          cumulus@spine02:~$ nv set vrf default router ospf router-id 10.10.10.102
          cumulus@spine02:~$ nv set interface lo router ospf area 0
          cumulus@spine02:~$ nv set interface swp1 router ospf area 0
          cumulus@spine02:~$ nv set interface swp1 router ospf network-type point-to-point
          cumulus@spine02:~$ nv set interface swp1 router ospf timers hello-interval 5
          cumulus@spine02:~$ nv set interface swp1 router ospf timers dead-interval 60
          cumulus@spine02:~$ nv set interface swp2 router ospf area 0
          cumulus@spine02:~$ nv set interface swp2 router ospf network-type point-to-point
          cumulus@spine02:~$ nv set interface swp2 router ospf timers hello-interval 5
          cumulus@spine02:~$ nv set interface swp2 router ospf timers dead-interval 60
          cumulus@spine02:~$ nv set interface swp5 router ospf area 0
          cumulus@spine02:~$ nv set interface swp5 router ospf network-type point-to-point
          cumulus@spine02:~$ nv set interface swp5 router ospf timers hello-interval 5
          cumulus@spine02:~$ nv set interface swp5 router ospf timers dead-interval 60
          cumulus@spine02:~$ nv set interface swp6 router ospf area 0
          cumulus@spine02:~$ nv set interface swp6 router ospf network-type point-to-point
          cumulus@spine02:~$ nv set interface swp6 router ospf timers hello-interval 5
          cumulus@spine02:~$ nv set interface swp6 router ospf timers dead-interval 60
          cumulus@spine02:~$ nv set router ospf timers spf holdtime 100
          cumulus@spine02:~$ nv set router ospf timers spf max-holdtime 6000
          cumulus@spine02:~$ nv config apply
          
          cumulus@border01:~$ nv set interface lo ip address 10.10.10.63/32
          cumulus@border01:~$ nv set interface swp51 ip address 10.10.10.63/32
          cumulus@border01:~$ nv set interface swp52 ip address 10.10.10.63/32
          cumulus@border01:~$ nv set interface bond1 bond member swp1
          cumulus@border01:~$ nv set interface bond2 bond member swp2
          cumulus@border01:~$ nv set interface bond1 bond mlag id 1
          cumulus@border01:~$ nv set interface bond2 bond mlag id 2
          cumulus@border01:~$ nv set interface bond1 bond lacp-bypass on
          cumulus@border01:~$ nv set interface bond2 bond lacp-bypass on
          cumulus@border01:~$ nv set interface bond1 bridge domain br_default access 2001
          cumulus@border01:~$ nv set interface bond2 bridge domain br_default access 2001
          cumulus@border01:~$ nv set interface bond1-2 bridge domain br_default
          cumulus@border01:~$ nv set interface vlan2001
          cumulus@border01:~$ nv set interface vlan2001 ip address 10.1.201.2/24
          cumulus@border01:~$ nv set interface vlan2001 ip vrr address 10.1.201.1/24
          cumulus@border01:~$ nv set interface vlan2001 ip vrr mac-address 00:00:5e:00:01:00
          cumulus@border01:~$ nv set interface vlan2001 ip vrr state up
          cumulus@border01:~$ nv set interface peerlink bond member swp49-50
          cumulus@border01:~$ nv set mlag mac-address 44:38:39:FF:00:FF
          cumulus@border01:~$ nv set mlag backup 10.10.10.64
          cumulus@border01:~$ nv set mlag peer-ip linklocal
          cumulus@border01:~$ nv set bridge domain br_default untagged 1
          cumulus@border01:~$ nv set vrf default router ospf router-id 10.10.10.63
          cumulus@border01:~$ nv set interface lo router ospf area 0
          cumulus@border01:~$ nv set interface swp51 router ospf area 0
          cumulus@border01:~$ nv set interface swp51 router ospf network-type point-to-point
          cumulus@border01:~$ nv set interface swp51 router ospf timers hello-interval 5
          cumulus@border01:~$ nv set interface swp51 router ospf timers dead-interval 60
          cumulus@border01:~$ nv set interface swp52 router ospf area 0
          cumulus@border01:~$ nv set interface swp52 router ospf network-type point-to-point
          cumulus@border01:~$ nv set interface swp52 router ospf timers hello-interval 5
          cumulus@border01:~$ nv set interface swp52 router ospf timers dead-interval 60
          cumulus@border01:~$ nv set interface vlan2001 router ospf area 1
          cumulus@border01:~$ nv set router ospf timers spf holdtime 100
          cumulus@border01:~$ nv set router ospf timers spf max-holdtime 6000
          cumulus@border01:~$ nv config apply
          
          cumulus@border02:~$ nv set interface lo ip address 10.10.10.64/32
          cumulus@border02:~$ nv set interface swp51 ip address 10.10.10.64/32
          cumulus@border02:~$ nv set interface swp52 ip address 10.10.10.64/32
          cumulus@border02:~$ nv set interface bond1 bond member swp1
          cumulus@border02:~$ nv set interface bond2 bond member swp2
          cumulus@border02:~$ nv set interface bond1 bond mlag id 1
          cumulus@border02:~$ nv set interface bond2 bond mlag id 2
          cumulus@border02:~$ nv set interface bond1 bond lacp-bypass on
          cumulus@border02:~$ nv set interface bond2 bond lacp-bypass on
          cumulus@border02:~$ nv set interface bond1 bridge domain br_default access 2001
          cumulus@border02:~$ nv set interface bond2 bridge domain br_default access 2001
          cumulus@border02:~$ nv set interface bond1-2 bridge domain br_default
          cumulus@border02:~$ nv set interface vlan2001
          cumulus@border02:~$ nv set interface vlan2001 ip address 10.1.201.3/24
          cumulus@border02:~$ nv set interface vlan2001 ip vrr address 10.1.201.1/24
          cumulus@border02:~$ nv set interface vlan2001 ip vrr mac-address 00:00:5e:00:01:00
          cumulus@border02:~$ nv set interface vlan2001 ip vrr state up
          cumulus@border02:~$ nv set interface peerlink bond member swp49-50
          cumulus@border02:~$ nv set mlag mac-address 44:38:39:FF:00:FF
          cumulus@border02:~$ nv set mlag backup 10.10.10.63
          cumulus@border02:~$ nv set mlag peer-ip linklocal
          cumulus@border02:~$ nv set bridge domain br_default untagged 1
          cumulus@border02:~$ nv set vrf default router ospf router-id 10.10.10.64
          cumulus@border02:~$ nv set interface lo router ospf area 0
          cumulus@border02:~$ nv set interface swp51 router ospf area 0
          cumulus@border02:~$ nv set interface swp51 router ospf network-type point-to-point
          cumulus@border02:~$ nv set interface swp51 router ospf timers hello-interval 5
          cumulus@border02:~$ nv set interface swp51 router ospf timers dead-interval 60
          cumulus@border02:~$ nv set interface swp52 router ospf area 0
          cumulus@border02:~$ nv set interface swp52 router ospf network-type point-to-point
          cumulus@border02:~$ nv set interface swp52 router ospf timers hello-interval 5
          cumulus@border02:~$ nv set interface swp52 router ospf timers dead-interval 60
          cumulus@border02:~$ nv set interface vlan2001 router ospf area 1
          cumulus@border02:~$ nv set router ospf timers spf holdtime 100
          cumulus@border02:~$ nv set router ospf timers spf max-holdtime 6000
          cumulus@border02:~$ nv config apply
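
The per-interface OSPF commands above repeat the same four settings (area, network type, hello interval, dead interval) for every fabric port. The following is a minimal sketch of one way to generate them, not part of NVUE itself: `ospf_p2p_cmds` is a hypothetical helper name, and it only prints the `nv set` commands shown in this example so that you can review them before running anything.

```shell
#!/bin/sh
# Hypothetical dry-run helper (an assumption for illustration, not an NVUE feature).
# It prints the per-interface OSPF commands from the example instead of running them.
# Usage: ospf_p2p_cmds AREA HELLO_INTERVAL DEAD_INTERVAL INTERFACE...
ospf_p2p_cmds() {
    area=$1
    hello=$2
    dead=$3
    shift 3
    for intf in "$@"; do
        echo "nv set interface $intf router ospf area $area"
        echo "nv set interface $intf router ospf network-type point-to-point"
        echo "nv set interface $intf router ospf timers hello-interval $hello"
        echo "nv set interface $intf router ospf timers dead-interval $dead"
    done
}

# Spine fabric ports from the example: area 0, hello-interval 5, dead-interval 60.
ospf_p2p_cmds 0 5 60 swp1 swp2 swp5 swp6
```

After you review the printed commands, run them on the switch, then run `nv config apply` as in the listings above. The helper deliberately leaves out the address, router ID, and SPF timer commands, which differ for each switch.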
          
          cumulus@leaf01:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml 
          - set:
              bridge:
                domain:
                  br_default:
                    untagged: 1
                    vlan:
                      10,20,30: {}
              interface:
                bond1:
                  bond:
                    lacp-bypass: on
                    member:
                      swp1: {}
                    mlag:
                      enable: on
                      id: 1
                  bridge:
                    domain:
                      br_default:
                        access: 10
                  type: bond
                bond2:
                  bond:
                    lacp-bypass: on
                    member:
                      swp2: {}
                    mlag:
                      enable: on
                      id: 2
                  bridge:
                    domain:
                      br_default:
                        access: 20
                  type: bond
                bond3:
                  bond:
                    lacp-bypass: on
                    member:
                      swp3: {}
                    mlag:
                      enable: on
                      id: 3
                  bridge:
                    domain:
                      br_default:
                        access: 30
                  type: bond
                eth0:
                  ip:
                    address:
                      dhcp: {}
                    vrf: mgmt
                  type: eth
                lo:
                  ip:
                    address:
                      10.10.10.1/32: {}
                  type: loopback
                peerlink:
                  bond:
                    member:
                      swp49: {}
                      swp50: {}
                  type: peerlink
                peerlink.4094:
                  base-interface: peerlink
                  type: sub
                  vlan: 4094
                swp51:
                  ip:
                    address:
                      10.10.10.1/32: {}
                  router:
                    ospf:
                      area: 0
                      enable: on
                      network-type: point-to-point
                      timers:
                        dead-interval: 60
                        hello-interval: 5
                  type: swp
                swp52:
                  ip:
                    address:
                      10.10.10.1/32: {}
                  router:
                    ospf:
                      area: 0
                      enable: on
                      network-type: point-to-point
                      timers:
                        dead-interval: 60
                        hello-interval: 5
                  type: swp
                vlan10:
                  ip:
                    address:
                      10.1.10.2/24: {}
                    vrr:
                      address:
                        10.1.10.1/24: {}
                      enable: on
                      mac-address: 00:00:5e:00:01:00
                      state:
                        up: {}
                  router:
                    ospf:
                      area: 0
                      enable: on
                      passive: on
                  type: svi
                  vlan: 10
                vlan20:
                  ip:
                    address:
                      10.1.20.2/24: {}
                    vrr:
                      address:
                        10.1.20.1/24: {}
                      enable: on
                      mac-address: 00:00:5e:00:01:00
                      state:
                        up: {}
                  router:
                    ospf:
                      area: 0
                      enable: on
                      passive: on
                  type: svi
                  vlan: 20
                vlan30:
                  ip:
                    address:
                      10.1.30.2/24: {}
                    vrr:
                      address:
                        10.1.30.1/24: {}
                      enable: on
                      mac-address: 00:00:5e:00:01:00
                      state:
                        up: {}
                  router:
                    ospf:
                      area: 0
                      enable: on
                      passive: on
                  type: svi
                  vlan: 30
              mlag:
                backup:
                  10.10.10.2: {}
                enable: on
                init-delay: 5
                mac-address: 44:38:39:FF:00:AA
                peer-ip: linklocal
              router:
                ospf:
                  enable: on
                  timers:
                    spf:
                      delay: 80
                      holdtime: 100
                      max-holdtime: 6000
                vrr:
                  enable: on
              service:
                ntp:
                  mgmt:
                    server:
                      0.cumulusnetworks.pool.ntp.org: {}
                      1.cumulusnetworks.pool.ntp.org: {}
                      2.cumulusnetworks.pool.ntp.org: {}
                      3.cumulusnetworks.pool.ntp.org: {}
              system:
                aaa:
                  class:
                    nvapply:
                      action: allow
                      command-path:
                        /:
                          permission: all
                    nvshow:
                      action: allow
                      command-path:
                        /:
                          permission: ro
                    sudo:
                      action: allow
                      command-path:
                        /:
                          permission: all
                  role:
                    nvue-admin:
                      class:
                        nvapply: {}
                    nvue-monitor:
                      class:
                        nvshow: {}
                    system-admin:
                      class:
                        nvapply: {}
                        sudo: {}
                  user:
                    cumulus:
                      full-name: cumulus,,,
                      hashed-password: $6$LVtX8JO1GJbiiVfq$Lqn/7MDaxbfgkKbDETAB.2sPuqvXJxGFnldbuJqMUBqczlMM1nNTrV5Kld7KwBvAkky6vJlQziYPqJS/ge88n.
                      role: system-admin
                api:
                  state: enabled
                config:
                  auto-save:
                    enable: on
                control-plane:
                  acl:
                    acl-default-dos:
                      inbound: {}
                    acl-default-whitelist:
                      inbound: {}
                global:
                  system-mac: 44:38:39:22:01:7a
                hostname: leaf01
                reboot:
                  mode: cold
                ssh-server:
                  state: enabled
                wjh:
                  channel:
                    forwarding:
                      trigger:
                        l2: {}
                        l3: {}
                        tunnel: {}
                  enable: on
              vrf:
                default:
                  router:
                    ospf:
                      area:
                        '0':
                          network:
                            10.10.10.1/32: {}
                      enable: on
                      router-id: 10.10.10.1
          
          cumulus@leaf02:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml 
          - set:
              bridge:
                domain:
                  br_default:
                    untagged: 1
                    vlan:
                      10,20,30: {}
              interface:
                bond1:
                  bond:
                    lacp-bypass: on
                    member:
                      swp1: {}
                    mlag:
                      enable: on
                      id: 1
                  bridge:
                    domain:
                      br_default:
                        access: 10
                  type: bond
                bond2:
                  bond:
                    lacp-bypass: on
                    member:
                      swp2: {}
                    mlag:
                      enable: on
                      id: 2
                  bridge:
                    domain:
                      br_default:
                        access: 20
                  type: bond
                bond3:
                  bond:
                    lacp-bypass: on
                    member:
                      swp3: {}
                    mlag:
                      enable: on
                      id: 3
                  bridge:
                    domain:
                      br_default:
                        access: 30
                  type: bond
                eth0:
                  ip:
                    address:
                      dhcp: {}
                    vrf: mgmt
                  type: eth
                lo:
                  ip:
                    address:
                      10.10.10.2/32: {}
                  type: loopback
                peerlink:
                  bond:
                    member:
                      swp49: {}
                      swp50: {}
                  type: peerlink
                peerlink.4094:
                  base-interface: peerlink
                  type: sub
                  vlan: 4094
                swp51:
                  ip:
                    address:
                      10.10.10.2/32: {}
                  router:
                    ospf:
                      area: 0
                      enable: on
                      network-type: point-to-point
                      timers:
                        dead-interval: 60
                        hello-interval: 5
                  type: swp
                swp52:
                  ip:
                    address:
                      10.10.10.2/32: {}
                  router:
                    ospf:
                      area: 0
                      enable: on
                      network-type: point-to-point
                      timers:
                        dead-interval: 60
                        hello-interval: 5
                  type: swp
                vlan10:
                  ip:
                    address:
                      10.1.10.3/24: {}
                    vrr:
                      address:
                        10.1.10.1/24: {}
                      enable: on
                      mac-address: 00:00:5e:00:01:00
                      state:
                        up: {}
                  router:
                    ospf:
                      area: 0
                      enable: on
                      passive: on
                  type: svi
                  vlan: 10
                vlan20:
                  ip:
                    address:
                      10.1.20.3/24: {}
                    vrr:
                      address:
                        10.1.20.1/24: {}
                      enable: on
                      mac-address: 00:00:5e:00:01:00
                      state:
                        up: {}
                  router:
                    ospf:
                      area: 0
                      enable: on
                      passive: on
                  type: svi
                  vlan: 20
                vlan30:
                  ip:
                    address:
                      10.1.30.3/24: {}
                    vrr:
                      address:
                        10.1.30.1/24: {}
                      enable: on
                      mac-address: 00:00:5e:00:01:00
                      state:
                        up: {}
                  router:
                    ospf:
                      area: 0
                      enable: on
                      passive: on
                  type: svi
                  vlan: 30
              mlag:
                backup:
                  10.10.10.1: {}
                enable: on
                init-delay: 5
                mac-address: 44:38:39:FF:00:AA
                peer-ip: linklocal
              router:
                ospf:
                  enable: on
                  timers:
                    spf:
                      delay: 80
                      holdtime: 100
                      max-holdtime: 6000
                vrr:
                  enable: on
              service:
                ntp:
                  mgmt:
                    server:
                      0.cumulusnetworks.pool.ntp.org: {}
                      1.cumulusnetworks.pool.ntp.org: {}
                      2.cumulusnetworks.pool.ntp.org: {}
                      3.cumulusnetworks.pool.ntp.org: {}
              system:
                aaa:
                  class:
                    nvapply:
                      action: allow
                      command-path:
                        /:
                          permission: all
                    nvshow:
                      action: allow
                      command-path:
                        /:
                          permission: ro
                    sudo:
                      action: allow
                      command-path:
                        /:
                          permission: all
                  role:
                    nvue-admin:
                      class:
                        nvapply: {}
                    nvue-monitor:
                      class:
                        nvshow: {}
                    system-admin:
                      class:
                        nvapply: {}
                        sudo: {}
                  user:
                    cumulus:
                      full-name: cumulus,,,
                      hashed-password: $6$VYY4ykwe0LrdedRG$MNfa/eX7COUh57bGG2pZJROnvBWDfOQCnowaOiuKumvVyno/4fvWbEMEbaACLqsAQMGw5SYtgtTn.5WU5USFo.
                      role: system-admin
                api:
                  state: enabled
                config:
                  auto-save:
                    enable: on
                control-plane:
                  acl:
                    acl-default-dos:
                      inbound: {}
                    acl-default-whitelist:
                      inbound: {}
                global:
                  system-mac: 44:38:39:22:01:78
                hostname: leaf02
                reboot:
                  mode: cold
                ssh-server:
                  state: enabled
                wjh:
                  channel:
                    forwarding:
                      trigger:
                        l2: {}
                        l3: {}
                        tunnel: {}
                  enable: on
              vrf:
                default:
                  router:
                    ospf:
                      area:
                        '0':
                          network:
                            10.10.10.2/32: {}
                      enable: on
                      router-id: 10.10.10.2
          
          cumulus@spine01:~$ sudo cat /etc/nvue.d/startup.yaml
          - set:
              interface:
                eth0:
                  ip:
                    address:
                      dhcp: {}
                    vrf: mgmt
                  type: eth
                lo:
                  ip:
                    address:
                      10.10.10.101/32: {}
                  type: loopback
                swp1:
                  ip:
                    address:
                      10.10.10.101/32: {}
                  router:
                    ospf:
                      area: 0
                      enable: on
                      network-type: point-to-point
                      timers:
                        dead-interval: 60
                        hello-interval: 5
                  type: swp
                swp2:
                  ip:
                    address:
                      10.10.10.101/32: {}
                  router:
                    ospf:
                      area: 0
                      enable: on
                      network-type: point-to-point
                      timers:
                        dead-interval: 60
                        hello-interval: 5
                  type: swp
                swp5:
                  ip:
                    address:
                      10.10.10.101/32: {}
                  router:
                    ospf:
                      area: 0
                      enable: on
                      network-type: point-to-point
                      timers:
                        dead-interval: 60
                        hello-interval: 5
                  type: swp
                swp6:
                  ip:
                    address:
                      10.10.10.101/32: {}
                  router:
                    ospf:
                      area: 0
                      enable: on
                      network-type: point-to-point
                      timers:
                        dead-interval: 60
                        hello-interval: 5
                  type: swp
              router:
                ospf:
                  enable: on
                  timers:
                    spf:
                      holdtime: 100
                      max-holdtime: 6000
              service:
                ntp:
                  mgmt:
                    server:
                      0.cumulusnetworks.pool.ntp.org: {}
                      1.cumulusnetworks.pool.ntp.org: {}
                      2.cumulusnetworks.pool.ntp.org: {}
                      3.cumulusnetworks.pool.ntp.org: {}
              system:
                aaa:
                  class:
                    nvapply:
                      action: allow
                      command-path:
                        /:
                          permission: all
                    nvshow:
                      action: allow
                      command-path:
                        /:
                          permission: ro
                    sudo:
                      action: allow
                      command-path:
                        /:
                          permission: all
                  role:
                    nvue-admin:
                      class:
                        nvapply: {}
                    nvue-monitor:
                      class:
                        nvshow: {}
                    system-admin:
                      class:
                        nvapply: {}
                        sudo: {}
                  user:
                    cumulus:
                      full-name: cumulus,,,
                      hashed-password: $6$m.snt3F/unawCsit$8frw1.klD4wdYPMjb/chqYLihsDvjLtoT2913fZ/3p9vZfXRsAkcjV0O2mpOoLrvrM2uZlLIYVgxqoHZH7c6t/
                      role: system-admin
                api:
                  state: enabled
                config:
                  auto-save:
                    enable: on
                control-plane:
                  acl:
                    acl-default-dos:
                      inbound: {}
                    acl-default-whitelist:
                      inbound: {}
                global:
                  system-mac: 44:38:39:22:01:82
                hostname: spine01
                reboot:
                  mode: cold
                ssh-server:
                  state: enabled
                wjh:
                  channel:
                    forwarding:
                      trigger:
                        l2: {}
                        l3: {}
                        tunnel: {}
                  enable: on
              vrf:
                default:
                  router:
                    ospf:
                      area:
                        '0':
                          network:
                            10.10.10.101/32: {}
                      enable: on
                      router-id: 10.10.10.101
          
          cumulus@spine02:~$ sudo cat /etc/nvue.d/startup.yaml
          - set:
              interface:
                eth0:
                  ip:
                    address:
                      dhcp: {}
                    vrf: mgmt
                  type: eth
                lo:
                  ip:
                    address:
                      10.10.10.102/32: {}
                  type: loopback
                swp1:
                  ip:
                    address:
                      10.10.10.102/32: {}
                  router:
                    ospf:
                      area: 0
                      enable: on
                      network-type: point-to-point
                      timers:
                        dead-interval: 60
                        hello-interval: 5
                  type: swp
                swp2:
                  ip:
                    address:
                      10.10.10.102/32: {}
                  router:
                    ospf:
                      area: 0
                      enable: on
                      network-type: point-to-point
                      timers:
                        dead-interval: 60
                        hello-interval: 5
                  type: swp
                swp5:
                  ip:
                    address:
                      10.10.10.102/32: {}
                  router:
                    ospf:
                      area: 0
                      enable: on
                      network-type: point-to-point
                      timers:
                        dead-interval: 60
                        hello-interval: 5
                  type: swp
                swp6:
                  ip:
                    address:
                      10.10.10.102/32: {}
                  router:
                    ospf:
                      area: 0
                      enable: on
                      network-type: point-to-point
                      timers:
                        dead-interval: 60
                        hello-interval: 5
                  type: swp
              router:
                ospf:
                  enable: on
                  timers:
                    spf:
                      holdtime: 100
                      max-holdtime: 6000
              service:
                ntp:
                  mgmt:
                    server:
                      0.cumulusnetworks.pool.ntp.org: {}
                      1.cumulusnetworks.pool.ntp.org: {}
                      2.cumulusnetworks.pool.ntp.org: {}
                      3.cumulusnetworks.pool.ntp.org: {}
              system:
                aaa:
                  class:
                    nvapply:
                      action: allow
                      command-path:
                        /:
                          permission: all
                    nvshow:
                      action: allow
                      command-path:
                        /:
                          permission: ro
                    sudo:
                      action: allow
                      command-path:
                        /:
                          permission: all
                  role:
                    nvue-admin:
                      class:
                        nvapply: {}
                    nvue-monitor:
                      class:
                        nvshow: {}
                    system-admin:
                      class:
                        nvapply: {}
                        sudo: {}
                  user:
                    cumulus:
                      full-name: cumulus,,,
                      hashed-password: $6$UWQi/FawiF0WBP.8$zlLS2.FiUHsZ37L6v/8MmV9W0CVjdbyn4PSDwm5Cr6Ct02EtvAihYXgUy9owXAx0jQYIm2XbKBunxN6VpEr4X1
                      role: system-admin
                api:
                  state: enabled
                config:
                  auto-save:
                    enable: on
                control-plane:
                  acl:
                    acl-default-dos:
                      inbound: {}
                    acl-default-whitelist:
                      inbound: {}
                global:
                  system-mac: 44:38:39:22:01:92
                hostname: spine02
                reboot:
                  mode: cold
                ssh-server:
                  state: enabled
                wjh:
                  channel:
                    forwarding:
                      trigger:
                        l2: {}
                        l3: {}
                        tunnel: {}
                  enable: on
              vrf:
                default:
                  router:
                    ospf:
                      area:
                        '0':
                          network:
                            10.10.10.102/32: {}
                      enable: on
                      router-id: 10.10.10.102
          
          cumulus@border01:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml 
          - set:
              bridge:
                domain:
                  br_default:
                    untagged: 1
                    vlan:
                      '2001': {}
              interface:
                bond1:
                  bond:
                    lacp-bypass: on
                    member:
                      swp1: {}
                    mlag:
                      enable: on
                      id: 1
                  bridge:
                    domain:
                      br_default:
                        access: 2001
                  type: bond
                bond2:
                  bond:
                    lacp-bypass: on
                    member:
                      swp2: {}
                    mlag:
                      enable: on
                      id: 2
                  bridge:
                    domain:
                      br_default:
                        access: 2001
                  type: bond
                eth0:
                  ip:
                    address:
                      dhcp: {}
                    vrf: mgmt
                  type: eth
                lo:
                  ip:
                    address:
                      10.10.10.63/32: {}
                  type: loopback
                peerlink:
                  bond:
                    member:
                      swp49: {}
                      swp50: {}
                  type: peerlink
                peerlink.4094:
                  base-interface: peerlink
                  type: sub
                  vlan: 4094
                swp51:
                  ip:
                    address:
                      10.10.10.63/32: {}
                  router:
                    ospf:
                      area: 0
                      enable: on
                      network-type: point-to-point
                      timers:
                        dead-interval: 60
                        hello-interval: 5
                  type: swp
                swp52:
                  ip:
                    address:
                      10.10.10.63/32: {}
                  router:
                    ospf:
                      area: 0
                      enable: on
                      network-type: point-to-point
                      timers:
                        dead-interval: 60
                        hello-interval: 5
                  type: swp
                vlan2001:
                  ip:
                    address:
                      10.1.201.2/24: {}
                    vrr:
                      address:
                        10.1.201.1/24: {}
                      enable: on
                      mac-address: 00:00:5e:00:01:00
                      state:
                        up: {}
                  router:
                    ospf:
                      area: 1
                      enable: on
                  type: svi
                  vlan: 2001
              mlag:
                backup:
                  10.10.10.64: {}
                enable: on
                init-delay: 5
                mac-address: 44:38:39:FF:00:FF
                peer-ip: linklocal
              router:
                ospf:
                  enable: on
                  timers:
                    spf:
                      holdtime: 100
                      max-holdtime: 6000
                vrr:
                  enable: on
              service:
                ntp:
                  mgmt:
                    server:
                      0.cumulusnetworks.pool.ntp.org: {}
                      1.cumulusnetworks.pool.ntp.org: {}
                      2.cumulusnetworks.pool.ntp.org: {}
                      3.cumulusnetworks.pool.ntp.org: {}
              system:
                aaa:
                  class:
                    nvapply:
                      action: allow
                      command-path:
                        /:
                          permission: all
                    nvshow:
                      action: allow
                      command-path:
                        /:
                          permission: ro
                    sudo:
                      action: allow
                      command-path:
                        /:
                          permission: all
                  role:
                    nvue-admin:
                      class:
                        nvapply: {}
                    nvue-monitor:
                      class:
                        nvshow: {}
                    system-admin:
                      class:
                        nvapply: {}
                        sudo: {}
                  user:
                    cumulus:
                      full-name: cumulus,,,
                      hashed-password: $6$siKWEoNyDqJpzgTg$kjQ12uQTIHRnsbF0hYbbPfRP6PRuCSk66Q79KHKEJVcx.raueCfL3hiW4FxqgDBOxWxLTC.U8fYeASiKvBS7A0
                      role: system-admin
                api:
                  state: enabled
                config:
                  auto-save:
                    enable: on
                control-plane:
                  acl:
                    acl-default-dos:
                      inbound: {}
                    acl-default-whitelist:
                      inbound: {}
                global:
                  system-mac: 44:38:39:22:01:74
                hostname: border01
                reboot:
                  mode: cold
                ssh-server:
                  state: enabled
                wjh:
                  channel:
                    forwarding:
                      trigger:
                        l2: {}
                        l3: {}
                        tunnel: {}
                  enable: on
              vrf:
                default:
                  router:
                    ospf:
                      area:
                        '0':
                          network:
                            10.10.10.63/32: {}
                      enable: on
                      router-id: 10.10.10.63
          
          cumulus@border02:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml 
          - set:
              bridge:
                domain:
                  br_default:
                    untagged: 1
                    vlan:
                      '2001': {}
              interface:
                bond1:
                  bond:
                    lacp-bypass: on
                    member:
                      swp1: {}
                    mlag:
                      enable: on
                      id: 1
                  bridge:
                    domain:
                      br_default:
                        access: 2001
                  type: bond
                bond2:
                  bond:
                    lacp-bypass: on
                    member:
                      swp2: {}
                    mlag:
                      enable: on
                      id: 2
                  bridge:
                    domain:
                      br_default:
                        access: 2001
                  type: bond
                eth0:
                  ip:
                    address:
                      dhcp: {}
                    vrf: mgmt
                  type: eth
                lo:
                  ip:
                    address:
                      10.10.10.64/32: {}
                  type: loopback
                peerlink:
                  bond:
                    member:
                      swp49: {}
                      swp50: {}
                  type: peerlink
                peerlink.4094:
                  base-interface: peerlink
                  type: sub
                  vlan: 4094
                swp51:
                  ip:
                    address:
                      10.10.10.64/32: {}
                  router:
                    ospf:
                      area: 0
                      enable: on
                      network-type: point-to-point
                      timers:
                        dead-interval: 60
                        hello-interval: 5
                  type: swp
                swp52:
                  ip:
                    address:
                      10.10.10.64/32: {}
                  router:
                    ospf:
                      area: 0
                      enable: on
                      network-type: point-to-point
                      timers:
                        dead-interval: 60
                        hello-interval: 5
                  type: swp
                vlan2001:
                  ip:
                    address:
                      10.1.201.3/24: {}
                    vrr:
                      address:
                        10.1.201.1/24: {}
                      enable: on
                      mac-address: 00:00:5e:00:01:00
                      state:
                        up: {}
                  router:
                    ospf:
                      area: 1
                      enable: on
                  type: svi
                  vlan: 2001
              mlag:
                backup:
                  10.10.10.63: {}
                enable: on
                init-delay: 5
                mac-address: 44:38:39:FF:00:FF
                peer-ip: linklocal
              router:
                ospf:
                  enable: on
                  timers:
                    spf:
                      holdtime: 100
                      max-holdtime: 6000
                vrr:
                  enable: on
              service:
                ntp:
                  mgmt:
                    server:
                      0.cumulusnetworks.pool.ntp.org: {}
                      1.cumulusnetworks.pool.ntp.org: {}
                      2.cumulusnetworks.pool.ntp.org: {}
                      3.cumulusnetworks.pool.ntp.org: {}
              system:
                aaa:
                  class:
                    nvapply:
                      action: allow
                      command-path:
                        /:
                          permission: all
                    nvshow:
                      action: allow
                      command-path:
                        /:
                          permission: ro
                    sudo:
                      action: allow
                      command-path:
                        /:
                          permission: all
                  role:
                    nvue-admin:
                      class:
                        nvapply: {}
                    nvue-monitor:
                      class:
                        nvshow: {}
                    system-admin:
                      class:
                        nvapply: {}
                        sudo: {}
                  user:
                    cumulus:
                      full-name: cumulus,,,
                      hashed-password: $6$tJNymcft48141Lz5$cEJBzLJTIQSgIIPOLRSLFPgVPR0QkBUXY1pVAPraVuatKWGS9s.AdUZCd0ayHqgCfwvYyECf9e93VYkdl4wgM0
                      role: system-admin
                api:
                  state: enabled
                config:
                  auto-save:
                    enable: on
                control-plane:
                  acl:
                    acl-default-dos:
                      inbound: {}
                    acl-default-whitelist:
                      inbound: {}
                global:
                  system-mac: 44:38:39:22:01:7c
                hostname: border02
                reboot:
                  mode: cold
                ssh-server:
                  state: enabled
                wjh:
                  channel:
                    forwarding:
                      trigger:
                        l2: {}
                        l3: {}
                        tunnel: {}
                  enable: on
              vrf:
                default:
                  router:
                    ospf:
                      area:
                        '0':
                          network:
                            10.10.10.64/32: {}
                      enable: on
                      router-id: 10.10.10.64
          
          cumulus@leaf01:mgmt:~$ sudo cat /etc/network/interfaces
          ...
          auto lo
          iface lo inet loopback
              address 10.10.10.1/32
          

          auto mgmt
          iface mgmt
              address 127.0.0.1/8
              address ::1/128
              vrf-table auto

          auto eth0
          iface eth0 inet dhcp
              ip-forward off
              ip6-forward off
              vrf mgmt

          auto swp51
          iface swp51
              address 10.10.10.1/32

          auto swp52
          iface swp52
              address 10.10.10.1/32

          auto bond1
          iface bond1
              bond-slaves swp1
              bond-mode 802.3ad
              bond-lacp-bypass-allow yes
              clag-id 1
              bridge-access 10

          auto bond2
          iface bond2
              bond-slaves swp2
              bond-mode 802.3ad
              bond-lacp-bypass-allow yes
              clag-id 2
              bridge-access 20

          auto bond3
          iface bond3
              bond-slaves swp3
              bond-mode 802.3ad
              bond-lacp-bypass-allow yes
              clag-id 3
              bridge-access 30

          auto peerlink
          iface peerlink
              bond-slaves swp49 swp50
              bond-mode 802.3ad
              bond-lacp-bypass-allow no

          auto peerlink.4094
          iface peerlink.4094
              clagd-peer-ip linklocal
              clagd-backup-ip 10.10.10.2
              clagd-sys-mac 44:38:39:FF:00:AA
              clagd-args --initDelay 180

          auto vlan10
          iface vlan10
              address 10.1.10.2/24
              address-virtual 00:00:5e:00:01:00 10.1.10.1/24
              hwaddress 44:38:39:22:01:b1
              vlan-raw-device br_default
              vlan-id 10

          auto vlan20
          iface vlan20
              address 10.1.20.2/24
              address-virtual 00:00:5e:00:01:00 10.1.20.1/24
              hwaddress 44:38:39:22:01:b1
              vlan-raw-device br_default
              vlan-id 20

          auto vlan30
          iface vlan30
              address 10.1.30.2/24
              address-virtual 00:00:5e:00:01:00 10.1.30.1/24
              hwaddress 44:38:39:22:01:b1
              vlan-raw-device br_default
              vlan-id 30

          auto br_default
          iface br_default
              bridge-ports bond1 bond2 bond3 peerlink
              hwaddress 44:38:39:22:01:b1
              bridge-vlan-aware yes
              bridge-vids 10 20 30
              bridge-pvid 1

          cumulus@leaf02:~$ sudo cat /etc/network/interfaces
          ...
          auto lo
          iface lo inet loopback
              address 10.10.10.2/32
          

          auto mgmt
          iface mgmt
              address 127.0.0.1/8
              address ::1/128
              vrf-table auto

          auto eth0
          iface eth0 inet dhcp
              ip-forward off
              ip6-forward off
              vrf mgmt

          auto swp51
          iface swp51
              address 10.10.10.2/32

          auto swp52
          iface swp52
              address 10.10.10.2/32

          auto bond1
          iface bond1
              bond-slaves swp1
              bond-mode 802.3ad
              bond-lacp-bypass-allow yes
              clag-id 1
              bridge-access 10

          auto bond2
          iface bond2
              bond-slaves swp2
              bond-mode 802.3ad
              bond-lacp-bypass-allow yes
              clag-id 2
              bridge-access 20

          auto bond3
          iface bond3
              bond-slaves swp3
              bond-mode 802.3ad
              bond-lacp-bypass-allow yes
              clag-id 3
              bridge-access 30

          auto peerlink
          iface peerlink
              bond-slaves swp49 swp50
              bond-mode 802.3ad
              bond-lacp-bypass-allow no

          auto peerlink.4094
          iface peerlink.4094
              clagd-peer-ip linklocal
              clagd-backup-ip 10.10.10.1
              clagd-sys-mac 44:38:39:FF:00:AA
              clagd-args --initDelay 180

          auto vlan10
          iface vlan10
              address 10.1.10.3/24
              address-virtual 00:00:5e:00:01:00 10.1.10.1/24
              hwaddress 44:38:39:22:01:af
              vlan-raw-device br_default
              vlan-id 10

          auto vlan20
          iface vlan20
              address 10.1.20.3/24
              address-virtual 00:00:5e:00:01:00 10.1.20.1/24
              hwaddress 44:38:39:22:01:af
              vlan-raw-device br_default
              vlan-id 20

          auto vlan30
          iface vlan30
              address 10.1.30.3/24
              address-virtual 00:00:5e:00:01:00 10.1.30.1/24
              hwaddress 44:38:39:22:01:af
              vlan-raw-device br_default
              vlan-id 30

          auto br_default
          iface br_default
              bridge-ports bond1 bond2 bond3 peerlink
              hwaddress 44:38:39:22:01:af
              bridge-vlan-aware yes
              bridge-vids 10 20 30
              bridge-pvid 1

          cumulus@spine01:~$ cat /etc/network/interfaces
          ...
          auto lo
          iface lo inet loopback
              address 10.10.10.101/32
          auto mgmt
          iface mgmt
              address 127.0.0.1/8
              address ::1/128
              vrf-table auto
          auto eth0
          iface eth0 inet dhcp
              ip-forward off
              ip6-forward off
              vrf mgmt
          auto swp1
          iface swp1
              address 10.10.10.101/32
          auto swp2
          iface swp2
              address 10.10.10.101/32
          auto swp5
          iface swp5
              address 10.10.10.101/32
          auto swp6
          iface swp6
              address 10.10.10.101/32
          
          cumulus@spine02:~$ cat /etc/network/interfaces
          ...
          auto lo
          iface lo inet loopback
              address 10.10.10.102/32
          auto mgmt
          iface mgmt
              address 127.0.0.1/8
              address ::1/128
              vrf-table auto
          auto eth0
          iface eth0 inet dhcp
              ip-forward off
              ip6-forward off
              vrf mgmt
          auto swp1
          iface swp1
              address 10.10.10.102/32
          auto swp2
          iface swp2
              address 10.10.10.102/32
          auto swp5
          iface swp5
              address 10.10.10.102/32
          auto swp6
          iface swp6
              address 10.10.10.102/32
          
          cumulus@border01:~$ sudo cat /etc/network/interfaces
          ...
          auto lo
          iface lo inet loopback
              address 10.10.10.63/32
          

          auto mgmt
          iface mgmt
              address 127.0.0.1/8
              address ::1/128
              vrf-table auto

          auto eth0
          iface eth0 inet dhcp
              ip-forward off
              ip6-forward off
              vrf mgmt

          auto swp51
          iface swp51
              address 10.10.10.63/32

          auto swp52
          iface swp52
              address 10.10.10.63/32

          auto bond1
          iface bond1
              bond-slaves swp1
              bond-mode 802.3ad
              bond-lacp-bypass-allow yes
              clag-id 1
              bridge-access 2001

          auto bond2
          iface bond2
              bond-slaves swp2
              bond-mode 802.3ad
              bond-lacp-bypass-allow yes
              clag-id 2
              bridge-access 2001

          auto vlan2001
          iface vlan2001
              address 10.1.201.2/24
              address-virtual 00:00:5e:00:01:00 10.1.201.1/24
              hwaddress 44:38:39:22:01:ab
              vlan-raw-device br_default
              vlan-id 2001

          auto peerlink
          iface peerlink
              bond-slaves swp49 swp50
              bond-mode 802.3ad
              bond-lacp-bypass-allow no

          auto peerlink.4094
          iface peerlink.4094
              clagd-peer-ip linklocal
              clagd-backup-ip 10.10.10.64
              clagd-sys-mac 44:38:39:FF:00:FF
              clagd-args --initDelay 180

          auto br_default
          iface br_default
              bridge-ports bond1 bond2 peerlink
              hwaddress 44:38:39:22:01:ab
              bridge-vlan-aware yes
              bridge-vids 1
              bridge-pvid 1

          cumulus@border02:~$ sudo cat /etc/network/interfaces
          ...
          auto lo
          iface lo inet loopback
              address 10.10.10.64/32
          

          auto mgmt
          iface mgmt
              address 127.0.0.1/8
              address ::1/128
              vrf-table auto

          auto eth0
          iface eth0 inet dhcp
              ip-forward off
              ip6-forward off
              vrf mgmt

          auto swp51
          iface swp51
              address 10.10.10.64/32

          auto swp52
          iface swp52
              address 10.10.10.64/32

          auto bond1
          iface bond1
              bond-slaves swp1
              bond-mode 802.3ad
              bond-lacp-bypass-allow yes
              clag-id 1
              bridge-access 2001

          auto bond2
          iface bond2
              bond-slaves swp2
              bond-mode 802.3ad
              bond-lacp-bypass-allow yes
              clag-id 2
              bridge-access 2001

          auto vlan2001
          iface vlan2001
              address 10.1.201.3/24
              address-virtual 00:00:5e:00:01:00 10.1.201.1/24
              hwaddress 44:38:39:22:01:b3
              vlan-raw-device br_default
              vlan-id 2001

          auto peerlink
          iface peerlink
              bond-slaves swp49 swp50
              bond-mode 802.3ad
              bond-lacp-bypass-allow no

          auto peerlink.4094
          iface peerlink.4094
              clagd-peer-ip linklocal
              clagd-backup-ip 10.10.10.63
              clagd-sys-mac 44:38:39:FF:00:FF
              clagd-args --initDelay 180

          auto br_default
          iface br_default
              bridge-ports bond1 bond2 peerlink
              hwaddress 44:38:39:22:01:b3
              bridge-vlan-aware yes
              bridge-vids 1
              bridge-pvid 1

          cumulus@leaf01:~$ sudo cat /etc/frr/frr.conf
          ...
          interface lo
          ip ospf area 0
          interface swp51
          ip ospf area 0
          ip ospf network point-to-point
          ip ospf hello-interval 5
          ip ospf dead-interval 60
          interface swp52
          ip ospf area 0
          ip ospf network point-to-point
          ip ospf hello-interval 5
          ip ospf dead-interval 60
          interface vlan10
          ip ospf area 0
          router ospf
          passive-interface vlan10
          interface vlan20
          ip ospf area 0
          router ospf
          passive-interface vlan20
          interface vlan30
          ip ospf area 0
          router ospf
          passive-interface vlan30
          vrf default
          exit-vrf
          vrf mgmt
          exit-vrf
          router ospf
          ospf router-id 10.10.10.1
          timers throttle spf 80 100 6000
          ! end of router ospf block
          
          cumulus@leaf02:~$ sudo cat /etc/frr/frr.conf
          ...
          interface lo
          ip ospf area 0
          interface swp51
          ip ospf area 0
          ip ospf network point-to-point
          ip ospf hello-interval 5
          ip ospf dead-interval 60
          interface swp52
          ip ospf area 0
          ip ospf network point-to-point
          ip ospf hello-interval 5
          ip ospf dead-interval 60
          interface vlan10
          ip ospf area 0
          router ospf
          passive-interface vlan10
          interface vlan20
          ip ospf area 0
          router ospf
          passive-interface vlan20
          interface vlan30
          ip ospf area 0
          router ospf
          passive-interface vlan30
          vrf default
          exit-vrf
          vrf mgmt
          exit-vrf
          router ospf
          ospf router-id 10.10.10.2
          timers throttle spf 80 100 6000
          ! end of router ospf block
          
          cumulus@spine01:~$ sudo cat /etc/frr/frr.conf
          ...
          interface lo
          ip ospf area 0
          interface swp1
          ip ospf area 0
          ip ospf network point-to-point
          ip ospf hello-interval 5
          ip ospf dead-interval 60
          interface swp2
          ip ospf area 0
          ip ospf network point-to-point
          ip ospf hello-interval 5
          ip ospf dead-interval 60
          interface swp5
          ip ospf area 0
          ip ospf network point-to-point
          ip ospf hello-interval 5
          ip ospf dead-interval 60
          interface swp6
          ip ospf area 0
          ip ospf network point-to-point
          ip ospf hello-interval 5
          ip ospf dead-interval 60
          vrf default
          exit-vrf
          vrf mgmt
          exit-vrf
          router ospf
          ospf router-id 10.10.10.101
          timers throttle spf 0 100 6000
          ! end of router ospf block
          
          cumulus@spine02:~$ sudo cat /etc/frr/frr.conf
          ...
          interface lo
          ip ospf area 0
          interface swp1
          ip ospf area 0
          ip ospf network point-to-point
          ip ospf hello-interval 5
          ip ospf dead-interval 60
          interface swp2
          ip ospf area 0
          ip ospf network point-to-point
          ip ospf hello-interval 5
          ip ospf dead-interval 60
          interface swp5
          ip ospf area 0
          ip ospf network point-to-point
          ip ospf hello-interval 5
          ip ospf dead-interval 60
          interface swp6
          ip ospf area 0
          ip ospf network point-to-point
          ip ospf hello-interval 5
          ip ospf dead-interval 60
          vrf default
          exit-vrf
          vrf mgmt
          exit-vrf
          router ospf
          ospf router-id 10.10.10.102
          timers throttle spf 0 100 6000
          ! end of router ospf block
          
          cumulus@border01:~$ sudo cat /etc/frr/frr.conf
          ...
          interface lo
          ip ospf area 0
          interface swp51
          ip ospf area 0
          ip ospf network point-to-point
          ip ospf hello-interval 5
          ip ospf dead-interval 60
          interface swp52
          ip ospf area 0
          ip ospf network point-to-point
          ip ospf hello-interval 5
          ip ospf dead-interval 60
          interface vlan2001
          ip ospf area 1
          vrf default
          exit-vrf
          vrf mgmt
          exit-vrf
          router ospf
          ospf router-id 10.10.10.63
          timers throttle spf 0 100 6000
          ! end of router ospf block
          
          cumulus@border02:~$ sudo cat /etc/frr/frr.conf
          ...
          interface lo
          ip ospf area 0
          interface swp51
          ip ospf area 0
          ip ospf network point-to-point
          ip ospf hello-interval 5
          ip ospf dead-interval 60
          interface swp52
          ip ospf area 0
          ip ospf network point-to-point
          ip ospf hello-interval 5
          ip ospf dead-interval 60
          interface vlan2001
          ip ospf area 1
          vrf default
          exit-vrf
          vrf mgmt
          exit-vrf
          router ospf
          ospf router-id 10.10.10.64
          timers throttle spf 0 100 6000
          ! end of router ospf block
          

          This simulation is running Cumulus Linux 5.11. The Cumulus Linux 5.12 simulation is coming soon.

          The simulation starts with the example OSPF configuration. The demo is pre-configured using NVUE commands.

          To validate the configuration, run the commands listed in the Troubleshooting section.

          VRFs

          This section discusses:

          Virtual Routing and Forwarding - VRF

          Virtual routing and forwarding (VRF) enables you to use multiple independent routing tables that work simultaneously on the same switch. Other implementations call this feature VRF-Lite.

          You typically use VRFs in the data center to carry multiple isolated traffic streams for multi-tenant environments. The traffic streams can cross over only at configured boundary points, such as a firewall or IDS. You can also use VRFs to burst traffic from private clouds to enterprise networks where the burst point is at layer 3.

          VRF is fully supported in the Linux kernel and has the following characteristics:

          Configure a VRF

          Cumulus Linux calls each routing table a VRF table, which has its own table ID.

          To configure VRF, you associate a subset of interfaces to a VRF routing table and configure an instance of the routing protocol (BGP or OSPFv2) for each routing table. Configuring a VRF is similar to configuring other network interfaces. Keep in mind the following:

          The following example commands configure VRF BLUE and assign a table ID automatically.

          cumulus@switch:~$ nv set vrf BLUE table auto
          cumulus@switch:~$ nv set interface swp1 ip vrf BLUE
          cumulus@switch:~$ nv config apply
          

          Edit the /etc/network/interfaces file to add the VRF and assign a table ID automatically:

          ...
          auto swp1
          iface swp1
            vrf BLUE
          
          auto BLUE
          iface BLUE
            vrf-table auto
          ...
          

          To load the new configuration, run ifreload -a:

          cumulus@switch:~$ sudo ifreload -a
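
The two stanzas in the /etc/network/interfaces example follow a fixed pattern, so you can generate them for any interface and VRF. The following is a minimal sketch, assuming a POSIX shell. The function name render_vrf_stanzas is hypothetical (it is not a Cumulus Linux command), and the function only prints text for you to review; it does not edit any file.

```shell
# Hypothetical helper, not part of Cumulus Linux.
# Prints an interface stanza that joins the VRF, then the VRF stanza.
#   $1 = interface name, $2 = VRF name, $3 = table value (defaults to "auto")
render_vrf_stanzas() {
    printf 'auto %s\niface %s\n  vrf %s\n\n' "$1" "$1" "$2"
    printf 'auto %s\niface %s\n  vrf-table %s\n' "$2" "$2" "${3:-auto}"
}

# Print the stanzas for swp1 in VRF BLUE with an automatic table ID.
render_vrf_stanzas swp1 BLUE auto
```

After you review the printed text and add it to the file, load the configuration with ifreload -a as described above.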
          

          Specify a Table ID

          Instead of assigning a table ID for the VRF automatically, you can specify your own table ID in the configuration. Cumulus Linux saves the table ID to name mapping in the /etc/iproute2/rt_tables.d/ directory. Instead of using the auto option as shown above, specify the table ID. For example:

          cumulus@switch:~$ nv set vrf BLUE table 1016
          cumulus@switch:~$ nv config apply
          

          Edit the /etc/network/interfaces file:

          ...
          auto swp1
          iface swp1
            vrf BLUE
          
          auto BLUE
          iface BLUE
            vrf-table 1016
          ...
          

          To load the new configuration, run ifreload -a:

          cumulus@switch:~$ sudo ifreload -a
          

          The table ID must be between 1001 and 1255. Cumulus Linux reserves this range for VRF table IDs.
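
If you script VRF creation, you can check a table ID against the reserved range before you pass it to NVUE. The following is a minimal sketch, assuming a POSIX shell. The function name valid_vrf_table_id is hypothetical, and the example prints the NVUE command instead of running it.

```shell
# Hypothetical helper, not part of Cumulus Linux.
# Succeeds only if $1 is a whole number in the VRF table ID range 1001-1255.
valid_vrf_table_id() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
    esac
    [ "$1" -ge 1001 ] && [ "$1" -le 1255 ]
}

# Dry run: print the NVUE command only when the table ID passes the check.
if valid_vrf_table_id 1016; then
    echo "nv set vrf BLUE table 1016"
fi
```

The check rejects the auto keyword on purpose; use it only when you specify a numeric table ID.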

          Bring a VRF Up After You Run ifdown

          If you take down a VRF using ifdown, run one of the following commands to bring the VRF back up:

          For example:

          cumulus@switch:~$ sudo ifdown BLUE
          cumulus@switch:~$ sudo ifup --with-depends BLUE
          

          Use the vrf Command

          Run the vrf command to show information about VRF tables not available in other Linux commands, such as iproute.

          To show a list of VRF tables, run the vrf list command:

          cumulus@switch:~$ vrf list
          VRF              Table
          ---------------- -----
          BLUE             1016
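
To look up the table ID for one VRF in a script, you can filter the vrf list output. The following is a minimal sketch, assuming a POSIX shell with awk and assuming the two-column layout shown in the example output. The function name vrf_table_id is hypothetical.

```shell
# Hypothetical helper, not part of Cumulus Linux.
# Reads `vrf list`-style output on stdin and prints the table ID of VRF $1.
# Returns a nonzero status if the VRF is not in the list.
vrf_table_id() {
    awk -v name="$1" '
        $1 == name && $2 ~ /^[0-9]+$/ { print $2; found = 1 }
        END { exit !found }
    '
}

# Sample input copied from the example output. On a switch, pipe the real
# command instead: vrf list | vrf_table_id BLUE
vrf_table_id BLUE <<'EOF'
VRF              Table
---------------- -----
BLUE             1016
EOF
```

Matching on a numeric second column skips the header and separator rows without relying on their line positions.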
          

          To show a list of processes and PIDs for a specific VRF table, run the ip vrf pids <vrf-name> command. For example:

          cumulus@switch:~$ ip vrf pids BLUE
          VRF: BLUE
          -----------------------
          dhclient           2508
          sshd               2659
          bash               2681
          su                 2702
          bash               2720
          vrf                2829
          

          To determine which VRF table associates with a particular PID, run the ip vrf identify <pid> command. For example:

          cumulus@switch:~$ ip vrf identify 2829
          BLUE
          

          IPv4 and IPv6 Commands in a VRF Context

          You can execute non-VRF-specific Linux commands and perform other tasks against a given VRF table. This typically applies to single-use commands started from a login shell, because the VRF context affects only the AF_INET and AF_INET6 sockets that the executed command opens; it has no impact on netlink sockets associated with the ip command.

          To execute such a command against a VRF table, run ip vrf exec <vrf-name> <command>. For example, to SSH from the switch to a device accessible through VRF BLUE:

          cumulus@switch:~$ sudo ip vrf exec BLUE ssh user@host
          

          Services in VRFs

          For services that need to run against a specific VRF, Cumulus Linux uses systemd instances, where the instance is the VRF. You start a service within a VRF with the systemctl start <service>@<vrf-name> command. For example, to run the dhcpd service in the BLUE VRF:

          cumulus@switch:~$ sudo systemctl start dhcpd@BLUE
          

          In most cases, you need to stop the instance running in the default VRF before a VRF instance can start. This is because the instance running in the default VRF owns the port across all VRFs (it is VRF global). Cumulus Linux stops systemd-based services when you restart networking or run an ifdown/ifup sequence. Refer to management VRF for details.

          The following services work with VRF instances:

          If systemd instances do not work, use a service-specific configuration option instead. For example, to configure rsyslogd to send messages to remote systems over a VRF:

          action(type="omfwd" Target="hostname or ip here" Device="mgmt" Port=514
          Protocol="udp")
          

          VRF Route Leaking

          You typically use VRFs when you want multiple independent routing and forwarding tables; however, you might want to reach destinations in one VRF from another VRF, as in the following cases:

          Cumulus Linux supports dynamic VRF route leaking (not static route leaking).

          Configure Route Leaking

          With route leaking, a destination VRF wants to know the routes of a source VRF. As routes come and go in the source VRF, they dynamically leak to the destination VRF through BGP. If BGP learns the routes in the source VRF, you do not need to perform any additional configuration. If OSPF learns the routes in the source VRF, if you configure the routes statically, or if you need to reach directly connected networks, you must first redistribute the routes into BGP in the source VRF.

          You can use route leaking to reach both remote destinations and directly connected destinations in another VRF. Multiple VRFs can import routes from a single source VRF and a VRF can import routes from multiple source VRFs. You can use this method when a single VRF provides connectivity to external networks or a shared service for other VRFs. You can control the routes leaked dynamically across VRFs with a route map.

          Because route leaking happens through BGP, the underlying mechanism relies on the BGP constructs of the Route Distinguisher (RD) and Route Targets (RTs). However, you do not need to configure these parameters; Cumulus Linux derives them automatically when you enable route leaking between a pair of VRFs.

          When you use route leaking:

          In the following example commands, routes in the BGP routing table of VRF BLUE dynamically leak into VRF RED.

          cumulus@switch:~$ nv set vrf RED router bgp address-family ipv4-unicast route-import from-vrf list BLUE
          cumulus@switch:~$ nv config apply
          
          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# router bgp 65001 vrf RED
          switch(config-router)# address-family ipv4 unicast
          switch(config-router-af)# import vrf BLUE
          switch(config-router-af)# end
          switch# write memory
          switch# exit
          

          The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

          ...
          router bgp 65001 vrf RED
           !
           address-family ipv4 unicast
            import vrf BLUE
          ...
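
          Because a VRF can import routes from more than one source VRF, a script can generate the NVUE commands for review before you run them. The following is a sketch under stated assumptions: the leak_cmds function name is hypothetical, and it assumes that you add each source VRF with its own nv set command in the form shown above. The function only prints commands; it does not change the switch.

```shell
# Hypothetical helper: print (do not run) the NVUE commands that leak
# IPv4 routes from each source VRF into the destination VRF.
# Usage: leak_cmds <destination-vrf> <source-vrf>...
leak_cmds() {
    dst="$1"
    shift
    for src in "$@"; do
        # Assumption: one `nv set ... from-vrf list` command per source VRF.
        echo "nv set vrf $dst router bgp address-family ipv4-unicast route-import from-vrf list $src"
    done
    echo "nv config apply"
}

# Example:
#   leak_cmds RED BLUE
```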
          

          Exclude Certain Prefixes

          To exclude certain prefixes from the import process, configure the prefixes in a route map.

          The following example configures a route map to match the source protocol BGP and imports the routes from VRF BLUE to VRF RED. For the imported routes, the community is 11:11 in VRF RED.

          cumulus@switch:~$ nv set vrf RED router bgp address-family ipv4-unicast route-import from-vrf list BLUE
          cumulus@switch:~$ nv set router policy route-map BLUEtoRED rule 10 match type ipv4
          cumulus@switch:~$ nv set router policy route-map BLUEtoRED rule 10 match source-protocol bgp 
          cumulus@switch:~$ nv set router policy route-map BLUEtoRED rule 10 action permit
          cumulus@switch:~$ nv set router policy route-map BLUEtoRED rule 10 set community 11:11
          cumulus@switch:~$ nv set vrf RED router bgp address-family ipv4-unicast route-import from-vrf route-map BLUEtoRED
          cumulus@switch:~$ nv config apply
          
          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# router bgp 65001 vrf RED
          switch(config-router)# address-family ipv4 unicast
          switch(config-router-af)# import vrf BLUE
          switch(config-router-af)# route-map BLUEtoRED permit 10
          switch(config-route-map)# match source-protocol bgp
          switch(config-route-map)# set community 11:11
          switch(config-route-map)# exit
          switch(config)# router bgp 65001 vrf RED
          switch(config-router)# address-family ipv4 unicast
          switch(config-router-af)# import vrf route-map BLUEtoRED
          switch(config-router-af)# end
          switch# write memory
          switch# exit
          

          Routes from eBGP Multihop Neighbors

          If the routes you want to leak are connected routes sourced from an eBGP multihop neighbor, you must disable the next hop connection verification process for eBGP multihop peering sessions in the target VRF so that Cumulus Linux can add these routes to the routing table.

          To disable the next hop connection verification process, you need to run vtysh commands; NVUE does not provide commands for this option.

          The following example disables the next hop connection verification process for eBGP multihop peering sessions in the target VRF BLUE:

          cumulus@leaf01:~$ sudo vtysh
          ...
          leaf01# configure terminal
          leaf01(config)# router bgp 65101 vrf BLUE
          leaf01(config-router)# bgp disable-ebgp-connected-route-check
          leaf01(config-router)# end
          leaf01# write memory
          leaf01# exit
          

          If you need to force Cumulus Linux to reimport the routes into the target VRF, run the clear ip bgp vrf <source-vrf> * command on the VRF from which you are leaking routes.

          Verify Route Leaking Configuration

          To check the status of VRF route leaking, run the NVUE nv show vrf <vrf-name> router bgp address-family ipv4-unicast route-import command or the vtysh show ip bgp vrf <vrf-name> ipv4|ipv6 unicast route-leak command. For example:

          cumulus@switch:~$ nv show vrf RED router bgp address-family ipv4-unicast route-import
                          operational   applied  
          --------------  ------------  ---------
          from-vrf                               
            enable                      on       
            route-map                   BLUEtoRED
            [list]        BLUE          BLUE     
          [route-target]  10.10.10.1:3    
          

          To show more detailed status information, you can run the following NVUE commands:

          To view the BGP routing table, run the NVUE nv show vrf <vrf-name> router bgp address-family ipv4-unicast command or the vtysh show ip bgp vrf <vrf-name> ipv4|ipv6 unicast command.

          To view the FRR IP routing table, run the vtysh show ip route vrf <vrf-name> command. These commands show all routes, including routes leaked from other VRFs.

          The following example commands show all routes in VRF RED, including routes leaked from VRF BLUE:

          cumulus@switch:~$ sudo vtysh
          switch# show ip route vrf RED
          Codes: K - kernel route, C - connected, S - static, R - RIP,
                 O - OSPF, I - IS-IS, B - BGP, P - PIM, E - EIGRP, N - NHRP,
                 T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
                 F - PBR,
                 > - selected route, * - FIB route
          
          VRF RED:
          K * 0.0.0.0/0 [255/8192] unreachable (ICMP unreachable), 6d07h01m
          C>* 10.1.1.1/32 is directly connected, BLUE, 6d07h01m
          B>* 10.0.100.1/32 [200/0] is directly connected, RED(vrf RED), 6d05h10m
          B>* 10.0.200.0/24 [20/0] via 10.10.2.2, swp1.11, 5d05h10m
          B>* 10.0.300.0/24 [200/0] via 10.20.2.2, swp1.21(vrf RED), 5d05h10m
          C>* 10.10.2.0/30 is directly connected, swp1.11, 6d07h01m
          C>* 10.10.3.0/30 is directly connected, swp2.11, 6d07h01m
          C>* 10.10.4.0/30 is directly connected, swp3.11, 6d07h01m
          B>* 10.20.2.0/30 [200/0] is directly connected, swp1.21(vrf RED), 6d05h10m
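
          To pick out the BGP routes from a long routing table, you can filter the output on the route code in the first column. The following is a minimal sketch; the bgp_prefixes function name is hypothetical, and the filter assumes the column layout shown in the output above.

```shell
# Hypothetical helper: print the prefix of every BGP route (code B).
# Reads `show ip route` output on stdin. The second test skips the
# "B - BGP" entry in the codes legend, which has no prefix in column 2.
bgp_prefixes() {
    awk '$1 ~ /^B[>*]*$/ && $2 ~ /\// { print $2 }'
}

# Example on a switch:
#   sudo vtysh -c 'show ip route vrf RED' | bgp_prefixes
```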
          

          Delete Route Leaking Configuration

          The following example commands delete leaked routes from VRF BLUE to VRF RED:

          cumulus@switch:~$ nv unset vrf RED router bgp address-family ipv4-unicast route-import from-vrf list BLUE
          cumulus@switch:~$ nv config apply
          
          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# router bgp 65001 vrf RED
          switch(config-router)# address-family ipv4 unicast
          switch(config-router-af)# no import vrf BLUE
          switch(config-router-af)# end
          switch# write memory
          switch# exit
          

          Cumulus Linux no longer supports kernel commands. To avoid issues with VRF route leaking in FRR, do not use the kernel commands.

          FRRouting in a VRF

          Cumulus Linux supports BGP, OSPFv2, and static routing for both IPv4 and IPv6 within a VRF context. Various FRRouting (FRR) routing constructs, such as routing tables, next hops, router IDs, and related processing, are also VRF-aware.

          FRR learns of VRFs on the system as well as interface attachment to a VRF through notifications from the kernel.

          The following sections show example VRF configurations with BGP and OSPF. For an example VRF configuration with static routing, see static routing.

          BGP

          Because BGP is VRF-aware, Cumulus Linux supports per-VRF neighbors, both iBGP and eBGP, as well as numbered and unnumbered interfaces. Non-interface-based VRF neighbors bind to the VRF, so you can have overlapping address spaces in different VRFs. Each VRF can have its own parameters, such as address families and redistribution. Incoming connections rely on the Linux kernel for VRF-global sockets. You can track BGP neighbors with BFD, both for single and multiple hops. You can configure multiple BGP instances, associating each with a VRF.

          The following example shows a BGP unnumbered interface configuration in VRF RED. In BGP unnumbered, there are no addresses on any interface. However, debugging tools like traceroute need at least a single IP address per node as the source IP address. Typically, this address is the loopback device. With VRF, you can associate an IP address with the VRF device, which acts as the loopback interface for that VRF.

          cumulus@switch:~$ nv set vrf RED table auto
          cumulus@switch:~$ nv set vrf RED loopback ip address 10.10.10.1/32
          cumulus@switch:~$ nv set interface swp51 ip vrf RED
          cumulus@switch:~$ nv set vrf RED router bgp router-id 10.10.10.1
          cumulus@switch:~$ nv set vrf RED router bgp autonomous-system 65001
          cumulus@switch:~$ nv set vrf RED router bgp neighbor swp51 remote-as external 
          cumulus@switch:~$ nv set vrf RED router bgp address-family ipv4-unicast redistribute connected enable on
          cumulus@switch:~$ nv set vrf RED router bgp neighbor swp51 address-family ipv4-unicast enable on
          cumulus@switch:~$ nv config apply
          

          /etc/network/interfaces file configuration:

          cumulus@switch:~$ sudo nano /etc/network/interfaces
          ...
          auto RED 
          iface RED
              address 10.10.10.1/32
              vrf-table auto
          auto swp51
          iface swp51
              vrf RED
          ...
          

          vtysh commands:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# router bgp 65001 vrf RED
          switch(config-router)# bgp router-id 10.10.10.1
          switch(config-router)# neighbor swp51 interface remote-as external
          switch(config-router)# address-family ipv4 unicast
          switch(config-router-af)# redistribute connected
          switch(config-router-af)# end
          switch# write memory
          switch# exit
          

          The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

          ...
          router bgp 65001 vrf RED
           bgp router-id 10.10.10.1
           neighbor swp51 interface remote-as external
           !
           address-family ipv4 unicast
            redistribute connected
            exit-address-family
          ...
          

          OSPF

          A VRF-aware OSPFv2 configuration supports numbered and unnumbered interfaces, and layer 3 interfaces such as SVIs, subinterfaces and physical interfaces. The VRF supports types 1 through 5 (ABR and ASBR - external LSAs) and types 9 through 11 (opaque LSAs) link state advertisements, redistribution of other routing protocols, connected and static routes, and route maps. You can track OSPF neighbors with BFD.

          Cumulus Linux does not support multiple VRFs in multi-instance OSPF.

          The following example shows an OSPF configuration in VRF RED.

          cumulus@switch:~$ nv set vrf RED loopback ip address 10.10.10.1/32
          cumulus@switch:~$ nv set interface swp51 ip address 10.0.1.0/31
          cumulus@switch:~$ nv set vrf RED router ospf enable on
          cumulus@switch:~$ nv set vrf RED router ospf router-id 10.10.10.1
          cumulus@switch:~$ nv set vrf RED router ospf redistribute connected
          cumulus@switch:~$ nv set vrf RED router ospf redistribute bgp
          cumulus@switch:~$ nv set vrf RED router ospf area 0.0.0.0 network 10.10.10.1/32
          cumulus@switch:~$ nv set vrf RED router ospf area 0.0.0.0 network 10.0.1.0/31
          cumulus@switch:~$ nv config apply
          

          The /etc/network/interfaces file configuration:

          cumulus@switch:~$ sudo nano /etc/network/interfaces
          ...
          auto RED
          iface RED
              address 10.10.10.1/32
              vrf-table auto
          auto swp51
          iface swp51
              address 10.0.1.0/31
          

          vtysh commands:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# router ospf vrf RED
          switch(config-router)# ospf router-id 10.10.10.1
          switch(config-router)# redistribute connected
          switch(config-router)# redistribute bgp
          switch(config-router)# network 10.10.10.1/32 area 0.0.0.0
          switch(config-router)# network 10.0.1.0/31 area 0.0.0.0
          switch(config-router)# end
          switch# write memory
          switch# exit
          

          The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

          ...
          router ospf vrf RED
            ospf router-id 10.10.10.1
            network 10.10.10.1/32 area 0.0.0.0
            network 10.0.1.0/31 area 0.0.0.0
            redistribute connected
            redistribute bgp
          ...
          

          DHCP with VRF

          Because you can use VRF to bind IPv4 and IPv6 sockets to non-default VRF tables, you can start DHCP servers and relays in any non-default VRF table using the dhcpd and dhcrelay services. systemd must manage these services and the /etc/vrf/systemd.conf file must list the services. By default, this file already lists these two services, as well as others. You can add more services as needed, such as dhcpd6 and dhcrelay6 for IPv6.

          If you edit /etc/vrf/systemd.conf, run sudo systemctl daemon-reload to generate the systemd instance files for the newly added services. Then you can start the service in the VRF using systemctl start <service>@<vrf-name>.service, where <service> is the name of the service (such as dhcpd or dhcrelay) and <vrf-name> is the name of the VRF.

          For example, to start the dhcrelay service after you configure a VRF named BLUE, run:

          cumulus@switch:~$ sudo systemctl start dhcrelay@BLUE.service
          

          To enable the service at boot time, you must also enable the service:

          cumulus@switch:~$ sudo systemctl enable dhcrelay@BLUE.service
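
          Before you start an instance, you might confirm that the service appears in /etc/vrf/systemd.conf. The following is a sketch only; the vrf_service_listed function name is hypothetical, and it assumes that the file lists one service name per line and marks comments with #.

```shell
# Hypothetical helper: succeed if a service is listed in the VRF
# systemd configuration file.
# Usage: vrf_service_listed <service> [config-file]
vrf_service_listed() {
    svc="$1"
    conf="${2:-/etc/vrf/systemd.conf}"
    # Assumption: one service name per line; comment lines start with '#',
    # so their first field never equals a service name.
    awk -v svc="$svc" '$1 == svc { found = 1 } END { exit !found }' "$conf"
}

# Example on a switch:
#   vrf_service_listed dhcrelay && sudo systemctl start dhcrelay@BLUE.service
```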
          

          In addition, you need to create a separate default file in the /etc/default directory for every instance of a DHCP server or relay in a non-default VRF. To run multiple instances of any of these services, you need a separate file for each instance. The files must have the following names:

          See the example configuration below for more details.

          Example Configuration

          In the following example, there is one IPv4 network with a VRF named RED and one IPv6 network with a VRF named BLUE.

          [Figures: IPv4 DHCP server/relay network; IPv6 DHCP server/relay network]

          Configure each DHCP server and relay as follows:

          1. Create the file isc-dhcp-server-RED in /etc/default/. Here is sample content:

            # Defaults for isc-dhcp-server initscript
            # sourced by /etc/init.d/isc-dhcp-server
            # installed at /etc/default/isc-dhcp-server by the maintainer scripts
            #
            # This is a POSIX shell fragment
            #
            # Path to dhcpd's config file (default: /etc/dhcp/dhcpd.conf).
            DHCPD_CONF="-cf /etc/dhcp/dhcpd-RED.conf"
            # Path to dhcpd's PID file (default: /var/run/dhcpd.pid).
            DHCPD_PID="-pf /var/run/dhcpd-RED.pid"
            # Additional options to start dhcpd with.
            # Don't use options -cf or -pf here; use DHCPD_CONF/ DHCPD_PID instead
            #OPTIONS=""
            # On what interfaces should the DHCP server (dhcpd) serve DHCP requests?
            # Separate multiple interfaces with spaces, e.g. "eth0 eth1".
            INTERFACES="swp2"
            
          2. Enable the DHCP server:

            cumulus@switch:~$ sudo systemctl enable dhcpd@RED.service
            
          3. Start the DHCP server:

            cumulus@switch:~$ sudo systemctl start dhcpd@RED.service
            
          4. Check status:

            cumulus@switch:~$ sudo systemctl status dhcpd@RED.service
            

          You can create this configuration using the ip vrf exec command (see IPv4 and IPv6 Commands in a VRF Context above for more details):

          cumulus@switch:~$ sudo ip vrf exec RED /usr/sbin/dhcpd -f -q -cf \
              /etc/dhcp/dhcpd-RED.conf -pf /var/run/dhcpd-RED.pid swp2
          
          1. Create the file isc-dhcp-server6-BLUE in /etc/default/. Here is sample content:

            # Defaults for isc-dhcp-server initscript
            # sourced by /etc/init.d/isc-dhcp-server
            # installed at /etc/default/isc-dhcp-server by the maintainer scripts
            #
            # This is a POSIX shell fragment
            #
            # Path to dhcpd's config file (default: /etc/dhcp/dhcpd.conf).
            DHCPD_CONF="-cf /etc/dhcp/dhcpd6-BLUE.conf"
            # Path to dhcpd's PID file (default: /var/run/dhcpd.pid).
            DHCPD_PID="-pf /var/run/dhcpd6-BLUE.pid"
            # Additional options to start dhcpd with.
            # Don't use options -cf or -pf here; use DHCPD_CONF/ DHCPD_PID instead
            #OPTIONS=""
            # On what interfaces should the DHCP server (dhcpd) serve DHCP requests?
            # Separate multiple interfaces with spaces, e.g. "eth0 eth1".
            INTERFACES="swp3"
            
          2. Enable the DHCP server:

            cumulus@switch:~$ sudo systemctl enable dhcpd6@BLUE.service
            
          3. Start the DHCP server:

            cumulus@switch:~$ sudo systemctl start dhcpd6@BLUE.service
            
          4. Check status:

            cumulus@switch:~$ sudo systemctl status dhcpd6@BLUE.service
            

          You can create this configuration using the ip vrf exec command (see IPv4 and IPv6 Commands in a VRF Context above for more details):

          cumulus@switch:~$ sudo ip vrf exec BLUE dhcpd -6 -q -cf \
            /etc/dhcp/dhcpd6-BLUE.conf -pf /var/run/dhcpd6-BLUE.pid swp3
          
          1. Create the file isc-dhcp-relay-RED in /etc/default/. Here is sample content:

            # Defaults for isc-dhcp-relay initscript
            # sourced by /etc/init.d/isc-dhcp-relay
            # installed at /etc/default/isc-dhcp-relay by the maintainer scripts
            #
            # This is a POSIX shell fragment
            #
            # What servers should the DHCP relay forward requests to?
            SERVERS="102.0.0.2"
            # On what interfaces should the DHCP relay (dhrelay) serve DHCP requests?
            # Always include the interface towards the DHCP server.
            # This variable requires a -i for each interface configured above.
            # This will be used in the actual dhcrelay command
            # For example, "-i eth0 -i eth1"
            INTF_CMD="-i swp2s2 -i swp2s3"
            # Additional options that are passed to the DHCP relay daemon?
            OPTIONS=""
            
          2. Enable the DHCP relay:

            cumulus@switch:~$ sudo systemctl enable dhcrelay@RED.service
            
          3. Start the DHCP relay:

            cumulus@switch:~$ sudo systemctl start dhcrelay@RED.service
            
          4. Check status:

            cumulus@switch:~$ sudo systemctl status dhcrelay@RED.service
            

          You can create this configuration using the ip vrf exec command (see IPv4 and IPv6 Commands in a VRF Context above for more details):

          cumulus@switch:~$ sudo ip vrf exec RED /usr/sbin/dhcrelay -d -q -i \
              swp2s2 -i swp2s3 102.0.0.2
          
          1. Create the file isc-dhcp-relay6-BLUE in /etc/default/. Here is sample content:

            # Defaults for isc-dhcp-relay initscript
            # sourced by /etc/init.d/isc-dhcp-relay
            # installed at /etc/default/isc-dhcp-relay by the maintainer scripts
            #
            # This is a POSIX shell fragment
            #
            # What servers should the DHCP relay forward requests to?
            #SERVERS="103.0.0.2"
            # On what interfaces should the DHCP relay (dhrelay) serve DHCP requests?
            # Always include the interface towards the DHCP server.
            # This variable requires a -i for each interface configured above.
            # This will be used in the actual dhcrelay command
            # For example, "-i eth0 -i eth1"
            INTF_CMD="-l swp18s0 -u swp18s1"
            # Additional options that are passed to the DHCP relay daemon?
            OPTIONS="-pf /var/run/dhcrelay6@BLUE.pid"
            
          2. Enable the DHCP relay:

            cumulus@switch:~$ sudo systemctl enable dhcrelay6@BLUE.service
            
          3. Start the DHCP relay:

            cumulus@switch:~$ sudo systemctl start dhcrelay6@BLUE.service
            
          4. Check status:

            cumulus@switch:~$ sudo systemctl status dhcrelay6@BLUE.service
            

          You can create this configuration using the ip vrf exec command (see IPv4 and IPv6 Commands in a VRF Context above for more details):

          cumulus@switch:~$ sudo ip vrf exec BLUE /usr/sbin/dhcrelay -d -q -6 -l \
              swp18s0 -u swp18s1 -pf /var/run/dhcrelay6@BLUE.pid
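
          The four examples above use one default file per service instance, named after the service and the VRF. The following sketch derives the file path from those examples only; the vrf_default_file function name is hypothetical, and the mapping covers just the four services shown in this section.

```shell
# Hypothetical helper: print the /etc/default file path for a DHCP
# service instance in a VRF, following the names used in the examples.
# Usage: vrf_default_file <service> <vrf-name>
vrf_default_file() {
    service="$1"
    vrf="$2"
    case "$service" in
        dhcpd)     base="isc-dhcp-server" ;;
        dhcpd6)    base="isc-dhcp-server6" ;;
        dhcrelay)  base="isc-dhcp-relay" ;;
        dhcrelay6) base="isc-dhcp-relay6" ;;
        *)
            echo "unknown service: $service" >&2
            return 1
            ;;
    esac
    echo "/etc/default/$base-$vrf"
}

# Example:
#   vrf_default_file dhcrelay6 BLUE
```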
          

          Use ping or traceroute on a VRF

          You can run ping or traceroute on a VRF from the default VRF.

          To ping a destination in a VRF from the default VRF, run the ping -I <vrf-name> <destination> command. For example:

          cumulus@switch:~$ ping -I BLUE <destination>
          

          To run traceroute to a destination in a VRF from the default VRF, run the traceroute -i <vrf-name> <destination> command. For example:

          cumulus@switch:~$ sudo traceroute -i BLUE <destination>
          

          Troubleshooting

          You can use vtysh or Linux show commands to troubleshoot VRFs.

          To show all VRFs learned by FRR from the kernel, run the show vrf command. The table ID shows the corresponding routing table in the kernel.

          cumulus@switch:~$ sudo vtysh
          ...
          switch# show vrf
          vrf RED id 14 table 1012
          vrf BLUE id 21 table 1013
          

          To show the VRFs configured in BGP (including the default VRF), run the show bgp vrfs command. A non-zero ID is a VRF that you define in the /etc/network/interfaces file.

          cumulus@switch:~$ sudo vtysh
          ...
          switch# show bgp vrfs
          Type  Id     RouterId       #PeersCfg  #PeersEstb  Name
          DFLT  0      6.0.0.7                0           0  Default
           VRF  14     6.0.2.7                6           6  RED
           VRF  21     6.0.3.7                6           6  BLUE
          
          Total number of VRFs (including default): 3
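
          A quick health check is to compare the configured and established peer counts for each VRF. The following is a minimal sketch; the bgp_vrfs_with_down_peers function name is hypothetical, and the filter assumes the column order shown in the output above.

```shell
# Hypothetical helper: print the name of each VRF whose established
# peer count (column 5) differs from its configured peer count (column 4).
# Reads `show bgp vrfs` output on stdin.
bgp_vrfs_with_down_peers() {
    awk '($1 == "DFLT" || $1 == "VRF") && $4 != $5 { print $6 }'
}

# Example on a switch:
#   sudo vtysh -c 'show bgp vrfs' | bgp_vrfs_with_down_peers
```

          With the sample output above, the function prints nothing because every VRF has all of its configured peers established.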
          

          To show interfaces known to FRR and attached to a specific VRF, run the show interface vrf <vrf-name> command. For example:

          cumulus@switch:~$ sudo vtysh
          
          switch# show interface vrf vrf1012
          Interface br2 is up, line protocol is down
            PTM status: disabled
            vrf: RED
            index 13 metric 0 mtu 1500
            flags: <UP,BROADCAST,MULTICAST>
            inet 20.7.2.1/24
          
            inet6 fe80::202:ff:fe00:a/64
            ND advertised reachable time is 0 milliseconds
            ND advertised retransmit interval is 0 milliseconds
            ND router advertisements are sent every 600 seconds
            ND router advertisements lifetime tracks ra-interval
            ND router advertisement default router preference is medium
            Hosts use stateless autoconfig for addresses.
          

          To show VRFs configured in OSPF, run the show ip ospf vrfs command. For example:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# show ip ospf vrfs
          Name                            Id     RouterId
          Default-IP-Routing-Table        0      0.0.0.0
          RED                             57     0.0.0.10
          BLUE                            58     0.0.0.20
          Total number of OSPF VRFs (including default): 3
          

          To show the OSPF routes in all VRFs, run the show ip ospf vrf all route command. For example:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# show ip ospf vrf all route
          ============ OSPF network routing table ============
          N    7.0.0.0/24            [10] area: 0.0.0.0
                                     directly attached to swp2
          
          ============ OSPF router routing table =============
          
          ============ OSPF external routing table ===========
          
          ============ OSPF network routing table ============
          N    8.0.0.0/24            [10] area: 0.0.0.0
                                     directly attached to swp1
          
          ============ OSPF router routing table =============
          
          ============ OSPF external routing table ===========
          

          To see the routing table for each VRF, use the show ip route vrf all command. The OSPF route is in the row that starts with O.

          cumulus@switch:~$ sudo vtysh
          ...
          switch# show ip route vrf all
          Codes: K - kernel route, C - connected, S - static, R - RIP,
                 O - OSPF, I - IS-IS, B - BGP, P - PIM, E - EIGRP, N - NHRP,
                 T - Table, v - VNC, V - VNC-Direct, A - Babel,
                 > - selected route, * - FIB route
          VRF BLUE:
          K>* 0.0.0.0/0 [0/8192] unreachable (ICMP unreachable)
          O   7.0.0.0/24 [110/10] is directly connected, swp2, 00:28:35
          C>* 7.0.0.0/24 is directly connected, swp2
          C>* 7.0.0.5/32 is directly connected, BLUE
          C>* 7.0.0.100/32 is directly connected, BLUE
          C>* 50.1.1.0/24 is directly connected, swp31s1
          VRF RED:
          K>* 0.0.0.0/0 [0/8192] unreachable (ICMP unreachable)
          O   8.0.0.0/24 [110/10] is directly connected, swp1, 00:23:26
          C>* 8.0.0.0/24 is directly connected, swp1
          C>* 8.0.0.5/32 is directly connected, RED
          C>* 8.0.0.100/32 is directly connected, RED
          C>* 50.0.1.0/24 is directly connected, swp31s0
          

          To list all VRFs, and include the VRF ID and table ID, run the ip -d link show type vrf command. For example:

          cumulus@switch:~$ ip -d link show type vrf
          14: vrf1012: <NOARP,MASTER,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN mode DEFAULT group default qlen 1000
              link/ether 46:96:c7:64:4d:fa brd ff:ff:ff:ff:ff:ff promiscuity 0
              vrf table 1012 addrgenmode eui64
          21: vrf1013: <NOARP,MASTER,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN mode DEFAULT group default qlen 1000
              link/ether 7a:8a:29:0f:5e:52 brd ff:ff:ff:ff:ff:ff promiscuity 0
              vrf table 1013 addrgenmode eui64
          28: vrf1014: <NOARP,MASTER,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN mode DEFAULT group default qlen 1000
              link/ether e6:8c:4d:fc:eb:b1 brd ff:ff:ff:ff:ff:ff promiscuity 0
              vrf table 1014 addrgenmode eui64
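
          To reduce this output to one line per VRF device, you can pair each device name with its table ID. The following is a sketch only; the vrf_tables function name is hypothetical, and the filter assumes the line layout shown above, where each device starts with an index and a later line starts with vrf table.

```shell
# Hypothetical helper: print "<vrf-device> <table-id>" for each VRF.
# Reads `ip -d link show type vrf` output on stdin.
vrf_tables() {
    awk '
        # Device header line, for example "14: vrf1012: <...>".
        /^[0-9]+: /                  { name = $2; sub(/:$/, "", name) }
        # Detail line, for example "vrf table 1012 addrgenmode eui64".
        $1 == "vrf" && $2 == "table" { print name, $3 }
    '
}

# Example on a switch:
#   ip -d link show type vrf | vrf_tables
```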
          

          To show the interfaces attached to a specific VRF, run the ip -d link show vrf <vrf-name> command. For example:

          cumulus@switch:~$ ip -d link show vrf vrf1012
          8: swp1.2@swp1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vrf1012 state UP mode DEFAULT group default
              link/ether 00:02:00:00:00:07 brd ff:ff:ff:ff:ff:ff promiscuity 0
              vlan protocol 802.1Q id 2 <REORDER_HDR>
              vrf_slave addrgenmode eui64
          9: swp2.2@swp2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vrf1012 state UP mode DEFAULT group default
              link/ether 00:02:00:00:00:08 brd ff:ff:ff:ff:ff:ff promiscuity 0
              vlan protocol 802.1Q id 2 <REORDER_HDR>
              vrf_slave addrgenmode eui64
          10: swp3.2@swp3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vrf1012 state UP mode DEFAULT group default
              link/ether 00:02:00:00:00:09 brd ff:ff:ff:ff:ff:ff promiscuity 0
              vlan protocol 802.1Q id 2 <REORDER_HDR>
              vrf_slave addrgenmode eui64
          11: swp4.2@swp4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vrf1012 state UP mode DEFAULT group default
              link/ether 00:02:00:00:00:0a brd ff:ff:ff:ff:ff:ff promiscuity 0
              vlan protocol 802.1Q id 2 <REORDER_HDR>
              vrf_slave addrgenmode eui64
          12: swp5.2@swp5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vrf1012 state UP mode DEFAULT group default
              link/ether 00:02:00:00:00:0b brd ff:ff:ff:ff:ff:ff promiscuity 0
              vlan protocol 802.1Q id 2 <REORDER_HDR>
              vrf_slave addrgenmode eui64
          13: br2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue master vrf1012 state DOWN mode DEFAULT group default
              link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff promiscuity 0
              bridge forward_delay 100 hello_time 200 max_age 2000 ageing_time 30000 stp_state 0 priority 32768
              vlan_filtering 0 vlan_protocol 802.1Q bridge_id 8000.0:0:0:0:0:0 designated_root 8000.0:0:0:0:0:0
              root_port 0 root_path_cost 0 topology_change 0 topology_change_detected 0 hello_timer    0.00
              tcn_timer    0.00 topology_change_timer    0.00 gc_timer  202.23 vlan_default_pvid 1 group_fwd_mask 0
              group_address 01:80:c2:00:00:00 mcast_snooping 1 mcast_router 1 mcast_query_use_ifaddr 0 mcast_querier 0
              mcast_hash_elasticity 4096 mcast_hash_max 4096 mcast_last_member_count 2 mcast_startup_query_count 2
              mcast_last_member_interval 100 mcast_membership_interval 26000 mcast_querier_interval 25500
              mcast_query_interval 12500 mcast_query_response_interval 1000 mcast_startup_query_interval 3125
              nf_call_iptables 0 nf_call_ip6tables 0 nf_call_arptables 0
              vrf_slave addrgenmode eui64
          

          To show IPv4 routes in a VRF, run the ip route show table <vrf-name> command. For example:

          cumulus@switch:~$ ip route show table RED
          unreachable default  metric 240
          broadcast 20.7.2.0 dev br2  proto kernel  scope link  src 20.7.2.1 dead linkdown
          20.7.2.0/24 dev br2  proto kernel  scope link  src 20.7.2.1 dead linkdown
          local 20.7.2.1 dev br2  proto kernel  scope host  src 20.7.2.1
          broadcast 20.7.2.255 dev br2  proto kernel  scope link  src 20.7.2.1 dead linkdown
          broadcast 169.254.2.8 dev swp1.2  proto kernel  scope link  src 169.254.2.9
          169.254.2.8/30 dev swp1.2  proto kernel  scope link  src 169.254.2.9
          local 169.254.2.9 dev swp1.2  proto kernel  scope host  src 169.254.2.9
          broadcast 169.254.2.11 dev swp1.2  proto kernel  scope link  src 169.254.2.9
          broadcast 169.254.2.12 dev swp2.2  proto kernel  scope link  src 169.254.2.13
          169.254.2.12/30 dev swp2.2  proto kernel  scope link  src 169.254.2.13
          local 169.254.2.13 dev swp2.2  proto kernel  scope host  src 169.254.2.13
          broadcast 169.254.2.15 dev swp2.2  proto kernel  scope link  src 169.254.2.13
          broadcast 169.254.2.16 dev swp3.2  proto kernel  scope link  src 169.254.2.17
          169.254.2.16/30 dev swp3.2  proto kernel  scope link  src 169.254.2.17
          local 169.254.2.17 dev swp3.2  proto kernel  scope host  src 169.254.2.17
          broadcast 169.254.2.19 dev swp3.2  proto kernel  scope link  src 169.254.2.17
          

          To show IPv6 routes in a VRF, run the ip -6 route show table <vrf-name> command. For example:

          cumulus@switch:~$ ip -6 route show table RED
          local fe80:: dev lo  proto none  metric 0  pref medium
          local fe80:: dev lo  proto none  metric 0  pref medium
          local fe80:: dev lo  proto none  metric 0  pref medium
          local fe80:: dev lo  proto none  metric 0  pref medium
          local fe80::202:ff:fe00:7 dev lo  proto none  metric 0  pref medium
          local fe80::202:ff:fe00:8 dev lo  proto none  metric 0  pref medium
          local fe80::202:ff:fe00:9 dev lo  proto none  metric 0  pref medium
          local fe80::202:ff:fe00:a dev lo  proto none  metric 0  pref medium
          fe80::/64 dev br2  proto kernel  metric 256 dead linkdown  pref medium
          fe80::/64 dev swp1.2  proto kernel  metric 256  pref medium
          fe80::/64 dev swp2.2  proto kernel  metric 256  pref medium
          fe80::/64 dev swp3.2  proto kernel  metric 256  pref medium
          ff00::/8 dev br2  metric 256 dead linkdown  pref medium
          ff00::/8 dev swp1.2  metric 256  pref medium
          ff00::/8 dev swp2.2  metric 256  pref medium
          ff00::/8 dev swp3.2  metric 256  pref medium
          unreachable default dev lo  metric 240  error -101 pref medium
          

          To see a list of links associated with a particular VRF table, run the ip link list <vrf-name> command. For example:

          cumulus@switch:~$ ip link list RED
          
          VRF: RED
          --------------------
          swp1.10@swp1     UP             6c:64:1a:00:5a:0c <BROADCAST,MULTICAST,UP,LOWER_UP>
          swp2.10@swp2     UP             6c:64:1a:00:5a:0d <BROADCAST,MULTICAST,UP,LOWER_UP>
          

          To see a list of routes associated with a particular VRF table, run the ip route list <vrf-name> command. For example:

          cumulus@switch:~$ ip route list RED
          
          VRF: RED
          --------------------
          unreachable default  metric 8192
          10.1.1.0/24 via 10.10.1.2 dev swp2.10
          10.1.2.0/24 via 10.99.1.2 dev swp1.10
          broadcast 10.10.1.0 dev swp2.10  proto kernel  scope link  src 10.10.1.1
          10.10.1.0/28 dev swp2.10  proto kernel  scope link  src 10.10.1.1
          local 10.10.1.1 dev swp2.10  proto kernel  scope host  src 10.10.1.1
          broadcast 10.10.1.15 dev swp2.10  proto kernel  scope link  src 10.10.1.1
          broadcast 10.99.1.0 dev swp1.10  proto kernel  scope link  src 10.99.1.1
          10.99.1.0/30 dev swp1.10  proto kernel  scope link  src 10.99.1.1
          local 10.99.1.1 dev swp1.10  proto kernel  scope host  src 10.99.1.1
          broadcast 10.99.1.3 dev swp1.10  proto kernel  scope link  src 10.99.1.1
          
          local fe80:: dev lo  proto none  metric 0  pref medium
          local fe80:: dev lo  proto none  metric 0  pref medium
          local fe80::6e64:1aff:fe00:5a0c dev lo  proto none  metric 0  pref medium
          local fe80::6e64:1aff:fe00:5a0d dev lo  proto none  metric 0  pref medium
          fe80::/64 dev swp1.10  proto kernel  metric 256  pref medium
          fe80::/64 dev swp2.10  proto kernel  metric 256  pref medium
          ff00::/8 dev swp1.10  metric 256  pref medium
          ff00::/8 dev swp2.10  metric 256  pref medium
          unreachable default dev lo  metric 8192  error -101 pref medium
          

          You can also show routes in a VRF using the ip [-6] route show vrf <vrf-name> command. This command omits local and broadcast routes, which can clutter the output.
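
          As a rough illustration of the difference, the following Python sketch filters local and broadcast entries out of saved ip route show table output. The sample text is shortened from the IPv4 example above, and the function name is hypothetical. It only approximates the filtering described here and is not how the ip command is implemented.

```python
# Shortened sample of "ip route show table <vrf-name>" output.
SAMPLE = """\
unreachable default  metric 240
broadcast 20.7.2.0 dev br2  proto kernel  scope link  src 20.7.2.1 dead linkdown
20.7.2.0/24 dev br2  proto kernel  scope link  src 20.7.2.1 dead linkdown
local 20.7.2.1 dev br2  proto kernel  scope host  src 20.7.2.1
169.254.2.8/30 dev swp1.2  proto kernel  scope link  src 169.254.2.9
"""


def drop_local_and_broadcast(table_output):
    """Return the route lines whose type is neither local nor broadcast."""
    kept = []
    for line in table_output.splitlines():
        fields = line.split()
        # Skip blank lines and routes that start with a local/broadcast type.
        if not fields or fields[0] in ("local", "broadcast"):
            continue
        kept.append(line)
    return kept


filtered = drop_local_and_broadcast(SAMPLE)
for route in filtered:
    print(route)
```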

          Considerations

          Management VRF

          Management VRF is a subset of VRF (virtual routing and forwarding) that separates the out-of-band management network from the in-band data plane network. For VRFs, the main routing table is the default table for the data plane switch ports. With management VRF, the switch uses a second table, mgmt, for routing through the Ethernet ports of the switch. The mgmt name is special-cased to distinguish the management VRF from a data plane VRF.

          Cumulus Linux only supports eth0 (or eth1, depending on the switch platform) for out-of-band management. The Ethernet ports are software-only ports that are not hardware accelerated by switchd. VLAN subinterfaces, bonds, bridges, and the front panel switch ports are not supported as OOB management interfaces.

          In-band management of Cumulus Linux is possible using loopbacks and SVIs (switch virtual interfaces).

          Cumulus Linux enables Management VRF by default. IPv4 and IPv6 networking applications (for example, Ansible, Chef, and apt-get) run by an administrator communicate out the management network by default. This default context does not impact services run through systemd and the systemctl command, and does not impact commands examining the state of the switch, such as the ip command to list links, neighbors, or routes.

          The management VRF configurations in this section contain a localhost loopback IPv4 address of 127.0.0.1/8 and IPv6 address of ::1/128. Management VRF must have an IPv6 address as well as an IPv4 address to work correctly. Adding the loopback address to the layer 3 domain of the management VRF prevents issues with applications that expect the loopback IP address to exist in the VRF, such as NTP.

          Bring Up the Management VRF

          If you take down the management VRF using ifdown, bring it back up with either the ifup --with-depends mgmt command or the ifreload -a command.

          The following command example brings down the management VRF, then brings it back up with the ifup --with-depends mgmt command:

          cumulus@switch:~$ sudo ifdown mgmt
          cumulus@switch:~$ sudo ifup --with-depends mgmt
          

          Running ifreload -a disconnects the session for any interface configured as auto.

          Run Services within the Management VRF

          Most default services in Cumulus Linux are VRF aware. If you want to run a service within the management VRF instead of the default VRF, run the following commands:

          1. If the service is running, stop the service:

            cumulus@switch:~$ sudo systemctl stop <service>.service
            
          2. Disable the service from starting automatically in the default VRF:

            cumulus@switch:~$ sudo systemctl disable <service>.service
            
          3. Start the service in the management VRF:

            cumulus@switch:~$ sudo systemctl start <service>@mgmt.service
            
          4. Enable the service in the management VRF so that it starts when the switch boots:

            cumulus@switch:~$ sudo systemctl enable <service>@mgmt.service
            
          5. Verify that the service is running in the management VRF with the ps aux | grep <service> command.
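
          The steps above follow a fixed pattern, so they can be sketched as a small Python helper that builds the systemctl command lines for a given service. This is a minimal sketch: the helper name and the service name myservice are placeholders, and the helper only prints the commands instead of running them.

```python
def move_service_to_vrf(service, vrf="mgmt"):
    """Build the systemctl commands that move a service into a VRF."""
    return [
        ["sudo", "systemctl", "stop", f"{service}.service"],
        ["sudo", "systemctl", "disable", f"{service}.service"],
        ["sudo", "systemctl", "start", f"{service}@{vrf}.service"],
        ["sudo", "systemctl", "enable", f"{service}@{vrf}.service"],
    ]


# "myservice" is a placeholder, not a real Cumulus Linux service name.
commands = move_service_to_vrf("myservice")
for cmd in commands:
    print(" ".join(cmd))
```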

          Run the following command to show the process IDs associated with the management VRF:

          cumulus@switch:~$ ip vrf pids mgmt
           2559  login
           2753  bash
           2045  dhclient
           5421  sshd
           5462  sshd
           5463  bash
          37691  sshd
          37732  sshd
          37735  bash
          55679  sshd
          55720  sshd
          55721  bash
          55993  ip
           3834  ntpd
           2023  python3
           2563  netqd
           1855  login
           2770  bash
          

          Run the following command to show the VRF association of the specified process:

          cumulus@switch:~$ ip vrf identify 2045
          mgmt
          

          Run ip vrf help for additional ip vrf commands.
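
          If you collect the ip vrf pids output for monitoring, a short Python sketch like the following can group the process IDs by process name. The sample text is a shortened copy of the output above, and the function name is hypothetical.

```python
# Shortened sample of "ip vrf pids mgmt" output.
SAMPLE = """\
 2559  login
 2045  dhclient
 5421  sshd
 5462  sshd
 3834  ntpd
"""


def parse_vrf_pids(output):
    """Map each process name to the list of PIDs reported for it."""
    by_name = {}
    for line in output.splitlines():
        fields = line.split(None, 1)
        # Expect "<pid> <name>"; ignore anything else.
        if len(fields) != 2 or not fields[0].isdigit():
            continue
        by_name.setdefault(fields[1].strip(), []).append(int(fields[0]))
    return by_name


pids = parse_vrf_pids(SAMPLE)
print(pids)
```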

          Enable Polling with snmpd in a Management VRF

          When you enable snmpd to run in the management VRF, you need to specify that VRF so that snmpd listens on eth0 in the management VRF; you can also configure snmpd to listen on other ports. In Cumulus Linux, SNMP configuration is VRF aware so snmpd can bind to multiple IP addresses each configured with a particular VRF (routing table). The snmpd daemon responds to polling requests on the interfaces of the VRF on which the request comes in. For information about configuring SNMP version 1, 2c, and 3 Traps and (v3) Inform messages, refer to Simple Network Management Protocol - SNMP.

          After you start snmpd in the management VRF, the following message displays: "Duplicate IPv4 address detected, some interfaces may not be visible in IP-MIB". The IP-MIB is not VRF aware and assumes that you cannot use the same IP address twice on the same device. The message is a warning that the SNMP IP-MIB detects overlapping IP addresses on the system; it does not indicate a problem and does not impact the operation of the switch.

          ping or traceroute on the Management VRF

          By default, when you issue a ping or traceroute, the packet goes to the data plane network (the main routing table). To use ping or traceroute on the management network, use ping -I mgmt or traceroute -i mgmt. To select a source address within the management VRF, use the -s flag for traceroute.

          cumulus@switch:~$ ping -I mgmt <destination-ip>
          
          cumulus@switch:~$ sudo traceroute -i mgmt -s <source-ip> <destination-ip>
          

          For additional information on using ping and traceroute, see Network Troubleshooting.

          Run Services as a Non-root User

          To run services in the management VRF as a non-root user, you need to create a custom service based on the original service file. The following example commands configure the SSH service to run in the management VRF as a non-root user.

          1. Run the following command to create a custom service file in the /etc/systemd/system directory.

            cumulus@switch:~$ sudo -E systemctl edit --full ssh.service
            
          2. If a User directive exists under [Service], comment it out.

            cumulus@switch:~$ sudo nano /etc/systemd/system/ssh.service
            ...
            [Service]
            #User=username
            ExecStart=/usr/local/bin/ssh agent -data-dir=/tmp/ssh -bind=192.168.0.11
            ...
            
          3. Modify the ExecStart line to /usr/bin/ip vrf exec mgmt /sbin/runuser -u USER -- ssh:

            ...
            [Service]
            #User=username
            ExecStart=/usr/bin/ip vrf exec mgmt /sbin/runuser -u cumulus -- ssh
            ...
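
          The two edits in steps 2 and 3 can be sketched in Python as a text transformation on the unit file. This is a minimal sketch that works on a string rather than on /etc/systemd/system/ssh.service, the function name is hypothetical, and it rewrites every User and ExecStart line it finds without checking which section the line belongs to.

```python
# Sample [Service] section taken from the example above.
UNIT = """\
[Service]
User=username
ExecStart=/usr/local/bin/ssh agent -data-dir=/tmp/ssh -bind=192.168.0.11
"""


def run_in_vrf_as_user(unit_text, user, command, vrf="mgmt"):
    """Comment out User= and wrap ExecStart in 'ip vrf exec' plus runuser."""
    out = []
    for line in unit_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("User="):
            out.append("#" + stripped)
        elif stripped.startswith("ExecStart="):
            out.append(
                f"ExecStart=/usr/bin/ip vrf exec {vrf} "
                f"/sbin/runuser -u {user} -- {command}"
            )
        else:
            out.append(line)
    return "\n".join(out) + "\n"


new_unit = run_in_vrf_as_user(UNIT, user="cumulus", command="ssh")
print(new_unit)
```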
            

          OSPF and BGP

          FRR is VRF-aware and sends packets based on the switch port routing table. This includes BGP peering through loopback interfaces. BGP looks up routes in the default table. However, depending on how you redistribute your routes, you can perform the following modification.

          Management VRF uses the mgmt table, including local routes. This does not affect route redistribution when you use routing protocols, such as OSPF and BGP.

          To redistribute the routes in your network, use the redistribute connected command under BGP or OSPF. This enables the directly connected network out of eth0 to advertise to its neighbor.

          This also creates a route on the neighbor device to the management network through the data plane.

          NVIDIA recommends route maps to control advertised networks that you redistribute with the redistribute connected command.

          cumulus@switch:~$ nv set router policy route-map REDISTRIBUTE rule 10 match type ipv4
          cumulus@switch:~$ nv set router policy route-map REDISTRIBUTE rule 10 match interface eth0
          cumulus@switch:~$ nv set router policy route-map REDISTRIBUTE rule 10 action deny
          cumulus@switch:~$ nv set vrf default router bgp address-family ipv4-unicast redistribute connected route-map REDISTRIBUTE
          cumulus@switch:~$ nv config apply
          
          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# route-map REDISTRIBUTE-CONNECTED deny 10
          switch(config-route-map)# match interface eth0
          switch(config-route-map)# exit
          switch(config)# route-map REDISTRIBUTE-CONNECTED permit 100
          switch(config-route-map)# exit
          switch(config)# router bgp
          switch(config-router)# address-family ipv4 unicast
          switch(config-router-af)# redistribute connected route-map REDISTRIBUTE-CONNECTED
          switch(config-router-af)# end
          switch# write memory
          switch# exit
          

          The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

          ...
          router bgp 65101
           bgp router-id 10.10.10.1
           neighbor swp51 interface remote-as external
           neighbor swp52 interface remote-as external
           !
           address-family ipv4 unicast
            network 10.1.10.0/24
            network 10.10.10.1/32
            redistribute connected route-map REDISTRIBUTE-CONNECTED
            maximum-paths 64
            maximum-paths ibgp 64
           exit-address-family
          !
          route-map REDISTRIBUTE-CONNECTED deny 10
           match interface eth0
          !
          route-map REDISTRIBUTE-CONNECTED permit 100
          ...
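
          To see why the route map needs both a deny entry and a permit entry, the following Python sketch models how a route map evaluates a connected route: entries run in sequence order, the first matching entry decides the result, and a route that matches no entry is denied. The data structure and function name are illustrative only and the sequence numbers follow the vtysh commands above; this is not FRR code.

```python
# Each entry: (sequence, action, interface to match or None for "match all").
ROUTE_MAP = [
    (10, "deny", "eth0"),
    (100, "permit", None),
]


def route_map_permits(route_map, interface):
    """Return True if a connected route on the interface is redistributed."""
    for _seq, action, match_if in sorted(route_map, key=lambda e: e[0]):
        if match_if is None or match_if == interface:
            return action == "permit"  # first match wins
    return False  # implicit deny when no entry matches


print(route_map_permits(ROUTE_MAP, "eth0"))
print(route_map_permits(ROUTE_MAP, "swp1"))
```

          Without the final permit entry, the implicit deny would also block the connected routes on the switch ports.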
          

          SSH

          View the Routing Tables

          When you use ip route get to return information about a single route, the command resolves over the mgmt table by default. To show information about the route in the switching silicon, run this command:

          cumulus@switch:~$ ip route get <ip-address>
          

          To get the route for any VRF, run the ip route get <ip-address> oif <vrf-name> command. For example, to show the route for the management VRF, run:

          cumulus@switch:~$ ip route get <ip-address> oif mgmt
          

          mgmt Interface Class

          ifupdown2 uses interface classes to create a user-defined grouping for interfaces. The special class mgmt is available to separate the management interfaces of the switch from the data interfaces. This allows you to manage the data interfaces by default using ifupdown2 commands. Performing operations on the mgmt interfaces requires specifying the --allow=mgmt option, which prevents inadvertent outages on the management interfaces. By default, Cumulus Linux brings up all interfaces in both the auto (default) class and the mgmt interface class when the switch boots.

          You configure the management interface in the /etc/network/interfaces file. The example below adds the management interface eth0 and the management VRF stanzas to the mgmt interface class:

          ...
          auto lo
          iface lo inet loopback
          
          allow-mgmt eth0
          iface eth0 inet dhcp
              vrf mgmt
          
          allow-mgmt mgmt
          iface mgmt
              address 127.0.0.1/8
              address ::1/128
              vrf-table auto
          ...
          

          When you run ifupdown2 commands against the interfaces in the mgmt class, include --allow=mgmt with the commands. For example, to see which interfaces are in the mgmt interface class, run:

          cumulus@switch:~$ ifquery -l --allow=mgmt
          eth0
          mgmt
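
          The class membership comes from the allow-mgmt lines in the /etc/network/interfaces file. As a minimal sketch, assuming the stanza layout shown above, the following Python function lists the interfaces in a class from the file text. The function name is hypothetical and the parsing is far simpler than what ifupdown2 does.

```python
# Copy of the example /etc/network/interfaces stanzas above.
INTERFACES = """\
auto lo
iface lo inet loopback

allow-mgmt eth0
iface eth0 inet dhcp
    vrf mgmt

allow-mgmt mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto
"""


def interfaces_in_class(text, class_name):
    """Return interface names declared with 'auto' or 'allow-<class>'."""
    keyword = "auto" if class_name == "auto" else f"allow-{class_name}"
    names = []
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == keyword:
            names.extend(fields[1:])
    return names


mgmt_class = interfaces_in_class(INTERFACES, "mgmt")
print(mgmt_class)
```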
          

          To reload the configurations for interfaces in the mgmt class, run:

          cumulus@switch:~$ sudo ifreload --allow=mgmt
          

          You can still bring the management interface up and down using ifup eth0 and ifdown eth0.

          Management VRF and DNS

          Cumulus Linux supports both DHCP and static DNS entries over management VRF through IP FIB rules, which it adds to direct lookups to the DNS addresses out of the management VRF.

          For DNS to use the management VRF, the static DNS entries must reference the management VRF in the /etc/resolv.conf file. You cannot specify the same DNS server address twice to associate it with different VRFs.

          For example, to specify DNS servers and associate some of them with the management VRF, run the following commands:

          cumulus@switch:~$ nv set service dns default server 192.0.2.1
          cumulus@switch:~$ nv set service dns mgmt server 198.51.100.31
          cumulus@switch:~$ nv set service dns mgmt server 203.0.113.13
          cumulus@switch:~$ nv config apply
          

          Edit the /etc/resolv.conf file to add the DNS servers and associate some of them with the management VRF. For example:

          cumulus@switch:~$ sudo nano /etc/resolv.conf
          nameserver 192.0.2.1
          nameserver 198.51.100.31 # vrf mgmt
          nameserver 203.0.113.13 # vrf mgmt
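
          The VRF association is carried in the trailing comment on each nameserver line. The following Python sketch shows one way to read that convention back out of the file text. The function name is hypothetical, and labeling servers without a comment as default is an assumption of this sketch.

```python
# Copy of the example /etc/resolv.conf entries above.
RESOLV = """\
nameserver 192.0.2.1
nameserver 198.51.100.31 # vrf mgmt
nameserver 203.0.113.13 # vrf mgmt
"""


def nameservers_by_vrf(text, default_vrf="default"):
    """Group nameserver addresses by the VRF named in '# vrf <name>'."""
    result = {}
    for line in text.splitlines():
        entry, _, comment = line.partition("#")
        fields = entry.split()
        if len(fields) < 2 or fields[0] != "nameserver":
            continue
        words = comment.split()
        has_vrf = len(words) >= 2 and words[0] == "vrf"
        vrf = words[1] if has_vrf else default_vrf
        result.setdefault(vrf, []).append(fields[1])
    return result


servers = nameservers_by_vrf(RESOLV)
print(servers)
```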
          

          Run the ifreload -a command to load the new configuration:

          cumulus@switch:~$ sudo ifreload -a
          

          Protocol Independent Multicast - PIM

          PIM is a multicast control plane protocol that advertises multicast sources and receivers over a routed layer 3 network. Layer 3 multicast relies on PIM to advertise information about multicast capable routers, and the location of multicast senders and receivers. Multicast does not go through a routed network without PIM.

          PIM operates in PIM-SM or PIM-DM mode. Cumulus Linux supports PIM-SM only.

          PIM-SM is a pull multicast distribution method; multicast traffic only goes through the network if receivers explicitly ask for it. When a receiver pulls multicast traffic, it must notify the network periodically that it wants to continue the multicast stream.

          PIM-SM has three configuration options: any-source multicast (ASM), source-specific multicast (SSM), and bidirectional PIM (BiDir).

          Cumulus Linux supports ASM and SSM only.

          For additional information on PIM-SM, refer to RFC 7761 - Protocol Independent Multicast - Sparse Mode. For a brief description of how PIM works, refer to PIM Overview.

          Example PIM Topology

          The following illustration shows a basic PIM ASM configuration:

          Basic PIM Configuration

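          To configure PIM, enable PIM on all interfaces facing multicast sources or multicast receivers and on the interface with the RP address, enable IGMP on all interfaces that have attached hosts, and, for ASM, configure a group mapping for a static RP.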

          When you enable or disable PIM, the FRR service restarts, which might impact traffic.

          These example commands configure leaf01, leaf02 and spine01 as shown in the topology example above.

          cumulus@leaf01:~$ nv set router pim enable on
          cumulus@leaf01:~$ nv set interface vlan10 router pim
          cumulus@leaf01:~$ nv set interface vlan10 ip igmp
          cumulus@leaf01:~$ nv set interface swp51 router pim
          cumulus@leaf01:~$ nv set vrf default router pim address-family ipv4 rp 10.10.10.101
          cumulus@leaf01:~$ nv config apply
          
          cumulus@leaf02:~$ nv set router pim enable on
          cumulus@leaf02:~$ nv set interface vlan20 router pim
          cumulus@leaf02:~$ nv set interface vlan20 ip igmp
          cumulus@leaf02:~$ nv set interface swp51 router pim
          cumulus@leaf02:~$ nv set vrf default router pim address-family ipv4 rp 10.10.10.101
          cumulus@leaf02:~$ nv config apply
          
          cumulus@spine01:~$ nv set router pim enable on
          cumulus@spine01:~$ nv set interface swp1 router pim
          cumulus@spine01:~$ nv set interface swp2 router pim
          cumulus@spine01:~$ nv set vrf default router pim address-family ipv4 rp 10.10.10.101 
          cumulus@spine01:~$ nv config apply
          

          The FRR package includes PIM. For proper PIM operation, PIM depends on Zebra. You must configure unicast routing and a routing protocol or static routes.

          1. Edit the /etc/frr/daemons file and add pimd=yes to the end of the file:

            cumulus@leaf01:~$ sudo nano /etc/frr/daemons
            ...
            pimd=yes
            ...
            
          2. Restart FRR with this command:

            cumulus@switch:~$ sudo systemctl restart frr.service

            Restarting FRR restarts all the routing protocol daemons that are enabled and running.

          3. In the vtysh shell, run the following commands to configure the PIM interfaces. PIM must be on all interfaces facing multicast sources or multicast receivers, as well as on the interface with the RP address.

            cumulus@leaf01:~$ sudo vtysh
            ...
            leaf01# configure terminal
            leaf01(config)# interface vlan10
            leaf01(config-if)# ip pim
            leaf01(config-if)# exit
            leaf01(config)# interface swp51
            leaf01(config-if)# ip pim
            leaf01(config-if)# exit
            
          4. Enable IGMP on all interfaces that have attached hosts.

            leaf01(config)# interface vlan10
            leaf01(config-if)# ip igmp
            leaf01(config-if)# exit
            
          5. For ASM, configure a group mapping for a static RP:

            leaf01(config)# ip pim rp 10.10.10.101
            leaf01(config)# exit
            leaf01# write memory
            leaf01# exit
            
          1. Edit the /etc/frr/daemons file and add pimd=yes to the end of the file:

            cumulus@leaf02:~$ sudo nano /etc/frr/daemons
            ...
            pimd=yes
            ...
            
          2. Restart FRR with this command:

            cumulus@switch:~$ sudo systemctl restart frr.service

            Restarting FRR restarts all the routing protocol daemons that are enabled and running.

          3. In the vtysh shell, run the following commands to configure the PIM interfaces. PIM must be on all interfaces facing multicast sources or multicast receivers, as well as on the interface with the RP address.

            cumulus@leaf02:~$ sudo vtysh
            ...
            leaf02# configure terminal
            leaf02(config)# interface vlan20
            leaf02(config-if)# ip pim
            leaf02(config-if)# exit
            leaf02(config)# interface swp51
            leaf02(config-if)# ip pim
            leaf02(config-if)# exit
            
          4. Enable IGMP on all interfaces that have attached hosts.

            leaf02(config)# interface vlan20
            leaf02(config-if)# ip igmp
            leaf02(config-if)# exit
            
          5. For ASM, configure a group mapping for a static RP:

            leaf02(config)# ip pim rp 10.10.10.101
            leaf02(config)# exit
            leaf02# write memory
            leaf02# exit
            
          1. Edit the /etc/frr/daemons file and add pimd=yes to the end of the file:

            cumulus@spine01:~$ sudo nano /etc/frr/daemons
            ...
            pimd=yes
            ...
            
          2. Restart FRR with this command:

            cumulus@switch:~$ sudo systemctl restart frr.service

            Restarting FRR restarts all the routing protocol daemons that are enabled and running.

          3. In the vtysh shell, run the following commands to configure the PIM interfaces. PIM must be on all interfaces facing multicast sources or multicast receivers, as well as on the interface with the RP address.

            cumulus@spine01:~$ sudo vtysh
            ...
            spine01# configure terminal
            spine01(config)# interface swp1
            spine01(config-if)# ip pim
            spine01(config-if)# exit
            spine01(config)# interface swp2
            spine01(config-if)# ip pim
            spine01(config-if)# exit
            
          4. For ASM, configure a group mapping for a static RP:

            spine01(config)# ip pim rp 10.10.10.101
            spine01(config)# end
            spine01# write memory
            spine01# exit
            

          The above commands configure the switch to send all multicast traffic to RP 10.10.10.101. The following commands configure PIM to send traffic from multicast group 224.10.0.0/16 to RP 10.10.10.101 and traffic from multicast group 224.10.2.0/24 to RP 10.10.10.102:

          cumulus@leaf01:~$ nv set vrf default router pim address-family ipv4 rp 10.10.10.101 group-range 224.10.0.0/16
          cumulus@leaf01:~$ nv set vrf default router pim address-family ipv4 rp 10.10.10.102 group-range 224.10.2.0/24
          cumulus@leaf01:~$ nv config apply
          
          cumulus@leaf01:~$ sudo vtysh
          ...
          leaf01# configure terminal
          leaf01(config)# ip pim rp 10.10.10.101 224.10.0.0/16
          leaf01(config)# ip pim rp 10.10.10.102 224.10.2.0/24
          leaf01(config)# end
          leaf01# exit
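
          Because 224.10.2.0/24 falls inside 224.10.0.0/16, a group such as 224.10.2.5 matches both ranges. The following Python sketch shows how the two mappings resolve, assuming the most specific (longest) group range wins, which is the general PIM-SM group-to-RP mapping rule. The function name is hypothetical and the sketch does not reproduce the FRR implementation.

```python
import ipaddress

# Group range to RP mappings from the example above.
RP_MAP = {
    "224.10.0.0/16": "10.10.10.101",
    "224.10.2.0/24": "10.10.10.102",
}


def rp_for_group(group, rp_map):
    """Pick the RP whose group range is the longest match for the group."""
    addr = ipaddress.ip_address(group)
    best = None  # (prefix length, RP address)
    for prefix, rp in rp_map.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, rp)
    return best[1] if best else None


for group in ("224.10.2.5", "224.10.5.5", "239.1.1.1"):
    print(group, rp_for_group(group, RP_MAP))
```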
          

          The following commands use a prefix list to configure PIM to send traffic from multicast group 224.10.0.0/16 to RP 10.10.10.101 and traffic from multicast group 224.10.2.0/24 to RP 10.10.10.102:

          cumulus@leaf01:~$ nv set router policy prefix-list MCAST1 rule 1 action permit
          cumulus@leaf01:~$ nv set router policy prefix-list MCAST1 rule 1 match 224.10.0.0/16
          cumulus@leaf01:~$ nv set router policy prefix-list MCAST2 rule 1 action permit
          cumulus@leaf01:~$ nv set router policy prefix-list MCAST2 rule 1 match 224.10.2.0/24
          cumulus@leaf01:~$ nv config apply
          cumulus@leaf01:~$ nv set vrf default router pim address-family ipv4 rp 10.10.10.101 prefix-list MCAST1
          cumulus@leaf01:~$ nv set vrf default router pim address-family ipv4 rp 10.10.10.102 prefix-list MCAST2
          cumulus@leaf01:~$ nv config apply
          
          cumulus@leaf01:~$ sudo vtysh
          ...
          leaf01# configure terminal
          leaf01(config)# ip prefix-list MCAST1 seq 1 permit 224.10.0.0/16
          leaf01(config)# ip prefix-list MCAST2 seq 1 permit 224.10.2.0/24
          leaf01(config)# ip pim rp 10.10.10.101 prefix-list MCAST1
          leaf01(config)# ip pim rp 10.10.10.102 prefix-list MCAST2
          leaf01(config)# end
          leaf01# exit
          

          Optional PIM Configuration

          This section describes optional configuration procedures.

          ASM SPT Infinity

          When the LHR receives the first multicast packet, it sends a PIM (S,G) join towards the FHR to forward traffic through the network. This builds the SPT, or the tree that is the shortest path to the source. When the traffic arrives over the SPT, a PIM (S,G) RPT prune goes up the shared tree towards the RP. This removes multicast traffic from the shared tree; multicast data only goes over the SPT.

          You can configure SPT switchover per group (SPT infinity), which allows for some groups to never switch to a shortest path tree. The LHR now sends both (*,G) joins and (S,G) RPT prune messages towards the RP.

          When you use a prefix list in Cumulus Linux to match a multicast group destination address (GDA) range, you must include the /32 operator. In the NVUE command example below, max-prefix-len 32 after the group match range specifies the /32 operator. In the vtysh command example, ge 32 after the group permit range specifies the /32 operator.

          To configure a group to never follow the SPT, create the necessary prefix lists, then configure SPT switchover for the prefix list:

          cumulus@switch:~$ nv set router policy prefix-list SPTrange rule 1 match 235.0.0.0/8 max-prefix-len 32
          cumulus@switch:~$ nv set router policy prefix-list SPTrange rule 1 action permit
          cumulus@switch:~$ nv set router policy prefix-list SPTrange rule 2 match 238.0.0.0/8 max-prefix-len 32
          cumulus@switch:~$ nv set router policy prefix-list SPTrange rule 2 action permit
          cumulus@switch:~$ nv set vrf default router pim address-family ipv4 spt-switchover prefix-list SPTrange
          cumulus@switch:~$ nv set vrf default router pim address-family ipv4 spt-switchover action infinity
          cumulus@switch:~$ nv config apply
          
          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# ip prefix-list spt-range permit 235.0.0.0/8 ge 32
          switch(config)# ip prefix-list spt-range permit 238.0.0.0/8 ge 32
          switch(config)# ip pim spt-switchover infinity prefix-list spt-range
          switch(config)# end
          switch# exit
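
          The need for ge 32 follows from how prefix lists compare prefix lengths. The following Python sketch models a single prefix-list entry, assuming the usual prefix-list rules (an entry without ge or le matches only the exact prefix length) and assuming the group address is evaluated as a /32 host prefix. The function name is hypothetical.

```python
import ipaddress


def entry_matches(entry_prefix, candidate, ge=None, le=None):
    """Return True if the candidate prefix matches a prefix-list entry."""
    net = ipaddress.ip_network(entry_prefix)
    cand = ipaddress.ip_network(candidate)
    if not cand.subnet_of(net):
        return False
    if ge is None and le is None:
        # No ge/le: only the exact prefix length matches.
        return cand.prefixlen == net.prefixlen
    low = ge if ge is not None else net.prefixlen
    high = le if le is not None else cand.max_prefixlen
    return low <= cand.prefixlen <= high


group = "235.1.2.3/32"  # a multicast group treated as a host prefix
print(entry_matches("235.0.0.0/8", group))
print(entry_matches("235.0.0.0/8", group, ge=32))
```

          In this model, the entry without ge 32 matches only the 235.0.0.0/8 prefix itself, so an individual group address inside the range does not match.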
          

          To view the configured prefix list, run the vtysh show ip mroute command. The following command shows that SPT switchover (pimreg) is on 235.0.0.0.

          cumulus@switch:~$ sudo vtysh
          ...
          switch# show ip mroute
          Source          Group           Proto   Input     Output     TTL  Uptime
          *               235.0.0.0       IGMP     swp1     pimreg     1    00:03:38
                                          IGMP              vlan10     1    00:03:38
          *               238.0.0.0       IGMP     swp1     vlan10     1    00:02:08
          

          SSM Multicast Group Ranges

          232.0.0.0/8 is the default multicast group range reserved for SSM. To modify the SSM multicast group range, define a prefix list and apply it. You can change (expand) the default group or add additional groups to this range.

          You must include 232.0.0.0/8 in the prefix list as this is the reserved SSM range. Using a prefix-list, you can expand the SSM range but all devices in the source tree must agree on the SSM range. When you use a prefix list in Cumulus Linux to match a multicast group destination address (GDA) range, you must include the /32 operator. In the NVUE command example below, max-prefix-len 32 after the group match range specifies the /32 operator. In the vtysh command example, ge 32 after the group permit range specifies the /32 operator.

          Create a prefix list with the permit keyword to match address ranges that you want to treat as multicast groups and the deny keyword for the address ranges you do not want to treat as multicast groups:

          cumulus@switch:~$ nv set router policy prefix-list MyCustomSSMrange rule 5 match 232.0.0.0/8 max-prefix-len 32
          cumulus@switch:~$ nv set router policy prefix-list MyCustomSSMrange rule 5 action permit
          cumulus@switch:~$ nv set router policy prefix-list MyCustomSSMrange rule 10 match 238.0.0.0/8 max-prefix-len 32
          cumulus@switch:~$ nv set router policy prefix-list MyCustomSSMrange rule 10 action permit
          

          Apply the custom prefix list:

          cumulus@switch:~$ nv set vrf default router pim address-family ipv4 ssm-prefix-list MyCustomSSMrange
          cumulus@switch:~$ nv config apply
          

          Create a prefix list with the permit keyword to match address ranges that you want to treat as multicast groups and the deny keyword for the address ranges you do not want to treat as multicast groups:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# ip prefix-list ssm-range seq 5 permit 232.0.0.0/8 ge 32
          switch(config)# ip prefix-list ssm-range seq 10 permit 238.0.0.0/8 ge 32
          

          Apply the custom prefix list as an ssm-range:

          switch(config)# ip pim ssm prefix-list ssm-range
          switch(config)# exit
          switch# write memory
          switch# exit
          

          To view a configured prefix list, run the vtysh show ip prefix-list <name> command. The following example shows a prefix list called my-custom-ssm-range:

          switch#  show ip prefix-list my-custom-ssm-range
          ZEBRA: ip prefix-list my-custom-ssm-range: 1 entries
             seq 5 permit 232.0.0.0/8 ge 32
          PIM: ip prefix-list my-custom-ssm-range: 1 entries
             seq 10 permit 232.0.0.0/8 ge 32
          

          PIM and ECMP

          PIM uses RPF to choose an upstream interface to build a forwarding state. If you configure ECMP, PIM chooses the RPF based on the ECMP hash algorithm.

          You can configure PIM to use all the available next hops when installing mroutes. For example, if you have four-way ECMP, PIM spreads the S,G and *,G mroutes across the four different paths.

          You can also configure PIM to recalculate all stream paths over one of the ECMP paths if the switch loses a path. Otherwise, only the streams that are using the lost path move to alternate ECMP paths. This recalculation does not affect existing groups.

          Recalculating all stream paths over one of the ECMP paths can cause some packet loss.
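The difference between the two behaviors can be sketched in Python. This is only a model: the hash function (SHA-256 over the source and group), the interface names, and the helper names `pick_path`, `assign`, and `lose_path` are all assumptions for illustration and do not reflect the ECMP hash that the switch actually uses.

```python
import hashlib

def pick_path(stream, paths):
    """Hash a (source, group) pair onto one of the available paths."""
    key = f"{stream[0]},{stream[1]}".encode()
    index = int.from_bytes(hashlib.sha256(key).digest()[:4], "big") % len(paths)
    return paths[index]

def assign(streams, paths):
    """Spread every stream across the given ECMP paths."""
    return {stream: pick_path(stream, paths) for stream in streams}

def lose_path(assignment, paths, lost, rebalance):
    """Return the new stream-to-path mapping after one path goes down."""
    remaining = [p for p in paths if p != lost]
    if rebalance:
        # Rebalance: recalculate every stream over the remaining paths.
        return assign(assignment, remaining)
    # No rebalance: only streams that used the lost path move.
    return {s: (pick_path(s, remaining) if p == lost else p)
            for s, p in assignment.items()}

paths = ["swp51", "swp52", "swp53", "swp54"]  # hypothetical four-way ECMP
streams = [(f"10.1.10.{i}", "239.1.1.1") for i in range(1, 21)]
before = assign(streams, paths)
sticky = lose_path(before, paths, "swp51", rebalance=False)
rebalanced = lose_path(before, paths, "swp51", rebalance=True)
print(sum(sticky[s] != before[s] for s in streams), "streams moved without rebalance")
print(sum(rebalanced[s] != before[s] for s in streams), "streams moved with rebalance")
```

Without rebalance, the only streams that change path are the ones that were on the lost path. With rebalance, streams on healthy paths can also move, which is why some packet loss is possible.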

          To configure PIM to use all the available next hops when installing mroutes:

          cumulus@switch:~$ nv set vrf default router pim ecmp enable on
          cumulus@switch:~$ nv config apply
          

          To recalculate all stream paths over one of the ECMP paths if the switch loses a path:

          cumulus@switch:~$ nv set vrf default router pim ecmp rebalance on
          cumulus@switch:~$ nv config apply
          

          To configure PIM to use all the available next hops when installing mroutes:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# ip pim ecmp
          switch(config)# exit
          switch# write memory
          switch# exit
          

          To recalculate all stream paths over one of the ECMP paths if the switch loses a path:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# ip pim ecmp rebalance
          switch(config)# exit
          switch# write memory
          switch# exit
          

          To show the next hop for a specific source or group, run the vtysh show ip pim nexthop command:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# show ip pim nexthop
          Number of registered addresses: 3
          Address         Interface      Nexthop
          -------------------------------------------
          6.0.0.9         swp31s0        169.254.0.9
          6.0.0.9         swp31s1        169.254.0.25
          6.0.0.11        lo             0.0.0.0
          6.0.0.10        swp31s0        169.254.0.9
          6.0.0.10        swp31s1        169.254.0.25
          

          IP Multicast Boundaries

          Use multicast boundaries to limit the distribution of multicast traffic and push multicast to a subset of the network. With boundaries in place, the switch drops or accepts incoming IGMP or PIM joins according to a prefix list. To configure the boundary, apply an IP multicast boundary OIL (outgoing interface list) on an interface.

          First create a prefix list consisting of multicast group addresses, then run the following commands:

          cumulus@switch:~$ nv set interface swp1 router pim address-family ipv4-unicast multicast-boundary-oil MyPrefixList
          cumulus@switch:~$ nv config apply
          
          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# interface swp1
          switch(config-if)# ip multicast boundary oil my-prefix-list
          switch(config-if)# end
          switch# write memory
          switch# exit
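
The filtering decision that a boundary makes can be pictured with a short Python sketch. The function name `boundary_accepts`, the example rules, and the implicit deny for groups that match no rule are assumptions for illustration; check the behavior on your switch before relying on it.

```python
import ipaddress

def boundary_accepts(group, prefix_list):
    """Return True if a join for this group passes the boundary prefix list.

    prefix_list is an ordered list of (action, prefix) tuples; the first
    rule that contains the group decides the outcome.
    """
    address = ipaddress.ip_address(group)
    for action, prefix in prefix_list:
        if address in ipaddress.ip_network(prefix):
            return action == "permit"
    return False  # assumption: groups that match no rule are denied

# Hypothetical list: block 239.1.1.0/24, allow every other multicast group.
rules = [("deny", "239.1.1.0/24"), ("permit", "224.0.0.0/4")]
for group in ("239.1.1.5", "239.2.2.2"):
    verdict = "accept" if boundary_accepts(group, rules) else "drop"
    print(group, verdict)
```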
          

          MSDP

          You can use MSDP to connect multiple PIM-SM multicast domains using the PIM-SM RPs. If you configure anycast RPs with the same IP address on multiple multicast switches (on the loopback interface), you can use more than one RP per multicast group.

          When an RP discovers a new source (a PIM-SM register message), it sends an SA message to each MSDP peer. The peer then determines if there are any interested receivers.

          The following steps configure a Cumulus switch to use MSDP:

          1. Add an anycast IP address to the loopback interface for each RP in the domain:

            cumulus@rp01:~$ nv set interface lo ip address 10.10.10.101/32
            cumulus@rp01:~$ nv set interface lo ip address 10.100.100.100/32
            
          2. On every multicast switch, configure the group to RP mapping using the anycast address:

            cumulus@switch:~$ nv set vrf default router pim address-family ipv4 rp 10.100.100.100 group-range 224.0.0.0/4
            cumulus@switch:~$ nv config apply
            
          3. Configure the MSDP mesh group for all active RPs. The following example uses three RPs:

            The mesh group must include all RPs in the domain as members, with a unique address as the source. This configuration results in MSDP peerings between all RPs.

            cumulus@rp01:~$ nv set vrf default router pim msdp-mesh-group cumulus member-address 100.1.1.2
            cumulus@rp01:~$ nv set vrf default router pim msdp-mesh-group cumulus member-address 100.1.1.3

            cumulus@rp02:~$ nv set vrf default router pim msdp-mesh-group cumulus member-address 100.1.1.1
            cumulus@rp02:~$ nv set vrf default router pim msdp-mesh-group cumulus member-address 100.1.1.3

            cumulus@rp03:~$ nv set vrf default router pim msdp-mesh-group cumulus member-address 100.1.1.1
            cumulus@rp03:~$ nv set vrf default router pim msdp-mesh-group cumulus member-address 100.1.1.2

          4. Pick the local loopback address as the source of the MSDP control packets:

            cumulus@rp01:~$ nv set vrf default router pim msdp-mesh-group cumulus source-address 10.10.10.101

            cumulus@rp02:~$ nv set vrf default router pim msdp-mesh-group cumulus source-address 10.10.10.102

            cumulus@rp03:~$ nv set vrf default router pim msdp-mesh-group cumulus source-address 10.10.10.103
            
          5. Inject the anycast IP address into the IGP of the domain. If the network uses unnumbered BGP as the IGP, avoid using the anycast IP address to establish unicast or multicast peerings. For PIM-SM, ensure that you use the unique address as the PIM hello source by setting the source:

            cumulus@rp01:~$ nv set interface lo router pim address-family ipv4-unicast use-source 10.10.10.101
            cumulus@rp01:~$ nv config apply
            
          1. Edit the /etc/network/interfaces file to add an anycast IP address to the loopback interface for each RP in the domain. For example:

            cumulus@rp01:~$ sudo nano /etc/network/interfaces
            auto lo
            iface lo inet loopback
               address 10.10.10.101/32
               address 10.100.100.100/32
            ...
            
          2. Run the ifreload -a command to load the new configuration:

            cumulus@rp01:~$ sudo ifreload -a
            
          3. On every multicast switch, configure the group to RP mapping using the anycast address:

            cumulus@rp01:~$ sudo vtysh
            ...
            rp01# configure terminal
            rp01(config)# ip pim rp 10.100.100.100 224.0.0.0/4
            
          4. Configure the MSDP mesh group for all active RPs (the following example uses three RPs):

            The mesh group must include all RPs in the domain as members, with a unique address as the source. This configuration results in MSDP peerings between all RPs.

            rp01(config)# ip msdp mesh-group cumulus member 100.1.1.2
            rp01(config)# ip msdp mesh-group cumulus member 100.1.1.3
            
            rp02(config)# ip msdp mesh-group cumulus member 100.1.1.1
            rp02(config)# ip msdp mesh-group cumulus member 100.1.1.3
            
            rp03(config)# ip msdp mesh-group cumulus member 100.1.1.1
            rp03(config)# ip msdp mesh-group cumulus member 100.1.1.2
            
          5. Pick the local loopback address as the source of the MSDP control packets:

            rp01(config)# ip msdp mesh-group cumulus source 10.10.10.101
            rp02(config)# ip msdp mesh-group cumulus source 10.10.10.102
            rp03(config)# ip msdp mesh-group cumulus source 10.10.10.103
            
          6. Inject the anycast IP address into the IGP of the domain. If the network uses unnumbered BGP as the IGP, avoid using the anycast IP address to establish unicast or multicast peerings. For PIM-SM, ensure that you use the unique address as the PIM hello source by setting the source:

            rp01(config)# interface lo
            rp01(config-if)# ip pim use-source 10.10.10.101
            rp01(config-if)# end
            rp01# write memory
            rp01# exit
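
Because every RP needs the other RPs as members and its own unique address as the source, the per-switch commands follow a simple pattern. The Python sketch below generates them; the helper name `msdp_mesh_commands` is hypothetical, the NVUE syntax is copied from the steps above, and using the unique loopback addresses 10.10.10.101 through 10.10.10.103 as both member and source addresses is an assumption made for the example.

```python
def msdp_mesh_commands(group, rps):
    """Build the NVUE mesh-group commands for each RP.

    rps maps a hostname to the unique (non-anycast) address of that RP.
    """
    base = f"nv set vrf default router pim msdp-mesh-group {group}"
    commands = {}
    for host, address in rps.items():
        # Every other RP becomes a member of the mesh group on this switch.
        lines = [f"{base} member-address {peer}"
                 for other, peer in rps.items() if other != host]
        # The RP's own unique address sources the MSDP control packets.
        lines.append(f"{base} source-address {address}")
        commands[host] = lines
    return commands

rps = {"rp01": "10.10.10.101", "rp02": "10.10.10.102", "rp03": "10.10.10.103"}
for host, lines in msdp_mesh_commands("cumulus", rps).items():
    print(host)
    for line in lines:
        print("  " + line)
```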
            

          PIM in a VRF

          VRFs divide the routing table on a per-tenant basis to provide separate layer 3 networks over a single layer 3 infrastructure. With a VRF, each tenant has its own virtualized layer 3 network so IP addresses can overlap between tenants.

          PIM in a VRF enables PIM trees and multicast data traffic to run inside a layer 3 virtualized network, with a separate tree per domain or tenant. Each VRF has its own multicast tree with its own RPs, sources, and so on. Therefore, you can have one tenant per corporate division, client, or product.

          If you do not enable MP-BGP MPLS VPN, VRFs on different switches typically connect or peer over subinterfaces, where each subinterface is in its own VRF.

          To configure PIM in a VRF:

          Add the VRFs and associate them with switch ports:

          cumulus@switch:~$ nv set vrf RED
          cumulus@switch:~$ nv set vrf BLUE
          cumulus@switch:~$ nv set interface swp1 ip vrf RED
          cumulus@switch:~$ nv set interface swp2 ip vrf BLUE
          

          Add PIM configuration:

          cumulus@switch:~$ nv set interface swp1 router pim
          cumulus@switch:~$ nv set interface swp2 router pim
          cumulus@switch:~$ nv set vrf RED router bgp autonomous-system 65001
          cumulus@switch:~$ nv set vrf BLUE router bgp autonomous-system 65000
          cumulus@switch:~$ nv set vrf RED router bgp router-id 10.1.1.1
          cumulus@switch:~$ nv set vrf BLUE router bgp router-id 10.1.1.2
          cumulus@switch:~$ nv set vrf RED router bgp neighbor swp1 remote-as external
          cumulus@switch:~$ nv set vrf BLUE router bgp neighbor swp2 remote-as external
          cumulus@switch:~$ nv config apply
          

          Edit the /etc/network/interfaces file to add the VRFs and associate them with switch ports, then run ifreload -a to reload the configuration.

          cumulus@switch:~$ sudo nano /etc/network/interfaces
          ...
          auto swp1
          iface swp1
              vrf RED
          
          auto swp2
          iface swp2
              vrf BLUE
          
          auto RED
          iface RED
              vrf-table auto
          
          auto BLUE
          iface BLUE
              vrf-table auto
          ...
          

          Add the PIM configuration:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# interface swp1
          switch(config-if)# ip pim
          switch(config-if)# exit
          switch(config)# interface swp2
          switch(config-if)# ip pim
          switch(config-if)# exit
          switch(config)# router bgp 65001 vrf RED
          switch(config-router)# bgp router-id 10.1.1.1
          switch(config-router)# neighbor swp1 interface remote-as external
          switch(config-router)# exit
          switch(config)# router bgp 65000 vrf BLUE
          switch(config-router)# bgp router-id 10.1.1.2
          switch(config-router)# neighbor swp2 interface remote-as external
          switch(config-router)# end
          switch# write memory
          switch# exit
          

          BFD for PIM Neighbors

          You can use BFD for PIM neighbors to detect link failures. When you configure an interface, include the pim bfd option. The following example commands configure BFD between leaf01 and spine01:

          cumulus@leaf01:~$ nv set interface swp51 router pim bfd enable on
          cumulus@leaf01:~$ nv config apply
          
          cumulus@spine01:~$ nv set interface swp1 router pim bfd enable on
          cumulus@spine01:~$ nv config apply
          
          cumulus@leaf01:~$ sudo vtysh
          ...
          leaf01# configure terminal
          leaf01(config)# interface swp51
          leaf01(config-if)# ip pim bfd
          leaf01(config-if)# end
          leaf01# write memory
          leaf01# exit
          
          cumulus@spine01:~$ sudo vtysh
          ...
          spine01# configure terminal
          spine01(config)# interface swp1
          spine01(config-if)# ip pim bfd
          spine01(config-if)# end
          spine01# write memory
          spine01# exit
          

          Allow RP

          To begin receiving multicast traffic for a group, a receiver expresses its interest in the group by sending an IGMP membership report on its connected LAN. The LHR receives this report and begins to build a multicast routing tree back towards the source. To build this tree, another router known to both the LHR and the multicast source must act as an RP for senders and receivers. The LHR looks up the RP for the group specified by the receiver and sends a PIM Join message towards the RP. Per RFC 7761, intermediary routers between the LHR and the RP must check that the RP for the group matches the one in the PIM Join and, if it does not, drop the Join.

          In some configurations, it is desirable to configure the LHR with an RP address that does not match the actual RP address for the group. In this case, you must configure the upstream routers to accept the Join and propagate it towards the appropriate RP for the group; each upstream router ignores the mismatched RP address in the PIM Join and replaces it with its own RP for the group.

          You can configure the switch to ignore the RP check for joins from all upstream neighbors, or you can provide a prefix list so that the switch ignores the RP check only for the RP addresses in that list.
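
The decision described above can be summarized in a few lines of Python. This is a simplified model under stated assumptions, not the switch implementation: the function name `accept_join` and its parameters are hypothetical, and the prefix list is reduced to a plain list of permitted prefixes.

```python
import ipaddress

def accept_join(join_rp, local_rp, allow_rp=False, rp_list=None):
    """Decide whether to accept a PIM Join that names join_rp as the RP.

    local_rp is the RP this router has for the group. allow_rp models the
    allow-rp setting; rp_list is an optional list of permitted RP prefixes.
    """
    if ipaddress.ip_address(join_rp) == ipaddress.ip_address(local_rp):
        return True   # normal case: the RP check passes
    if not allow_rp:
        return False  # mismatched RP and no allow-rp: drop the Join
    if rp_list is None:
        return True   # allow-rp without a list: ignore the RP check
    address = ipaddress.ip_address(join_rp)
    return any(address in ipaddress.ip_network(prefix) for prefix in rp_list)

print(accept_join("10.10.10.200", "10.10.10.101"))
print(accept_join("10.10.10.200", "10.10.10.101", allow_rp=True))
print(accept_join("10.10.10.200", "10.10.10.101", allow_rp=True,
                  rp_list=["10.10.10.200/32"]))
```

When the Join is accepted despite a mismatch, the router forwards it towards its own RP for the group.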

          The following example command configures PIM to ignore the RP check for all upstream neighbors:

          cumulus@switch:~$ nv set interface swp50 router pim address-family ipv4-unicast allow-rp enable on
          cumulus@switch:~$ nv config apply
          

          The following example command configures PIM to only ignore the RP check for RP addresses in the prefix list called allowRP:

          cumulus@switch:~$ nv set interface swp50 router pim address-family ipv4-unicast allow-rp rp-list allowRP
          cumulus@switch:~$ nv config apply
          

          The following example command configures PIM to ignore the RP check for all upstream neighbors:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# interface swp50
          switch(config-if)# ip pim allow-rp
          switch(config-if)# end
          switch# write memory
          switch# exit
          

          The following example command configures PIM to only ignore the RP check for RP addresses in the prefix list called allowRP:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# interface swp50
          switch(config-if)# ip pim allow-rp rp-list allowRP
          switch(config-if)# end
          switch# write memory
          switch# exit
          

          PIM Timers

          Cumulus Linux provides the following PIM timers:

          | Timer | Description |
          | --- | --- |
          | hello-interval | The interval in seconds at which the PIM router sends hello messages to discover PIM neighbors and maintain PIM neighbor relationships. You can specify a value between 1 and 180. The default setting is 30 seconds. With vtysh, you set the hello interval for a specific PIM enabled interface. With NVUE, you can set the hello interval globally for all PIM enabled interfaces or for a specific PIM enabled interface. |
          | holdtime | The number of seconds during which the neighbor must be in a reachable state. auto (the default setting) uses three and a half times the hello-interval. You can specify a value between 1 and 180. With vtysh, you set the holdtime for a specific PIM enabled interface. With NVUE, you can set the holdtime globally for all PIM enabled interfaces or for a specific PIM enabled interface. |
          | join-prune-interval | The interval in seconds at which a PIM router sends join/prune messages to its upstream neighbors for a state update. You can specify a value between 60 and 600. The default setting is 60 seconds. You set the join-prune-interval globally for all PIM enabled interfaces. NVUE also provides the option of setting the join-prune-interval for a specific VRF. |
          | keepalive | The timeout value for the S,G stream in seconds. You can specify a value between 31 and 60000. The default setting is 210 seconds. You can set the keepalive timer globally for all PIM enabled interfaces or for a specific VRF. In vtysh, the timer is keep-alive-timer. |
          | register-suppress | The number of seconds during which to stop sending register messages to the RP. You can specify a value between 5 and 60000. The default setting is 60 seconds. You can set the register-suppress timer globally for all PIM enabled interfaces or for a specific VRF. In vtysh, the timer is register-suppress-time. |
          | rp-keepalive | NVUE only. The timeout value for the RP in seconds. You can specify a value between 31 and 60000. The default setting is 185 seconds. You set the rp-keepalive timer globally for all PIM enabled interfaces or for a specific VRF. In vtysh, the timer is rp-keep-alive. |
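
The ranges in the table lend themselves to a quick check before you push a value to a switch. The following Python sketch is one way to do that; the names `TIMER_RANGES`, `validate_timer`, and `auto_holdtime` are hypothetical, and the ranges are copied from the table above.

```python
# Allowed ranges in seconds, taken from the timer table.
TIMER_RANGES = {
    "hello-interval": (1, 180),
    "holdtime": (1, 180),
    "join-prune-interval": (60, 600),
    "keepalive": (31, 60000),
    "register-suppress": (5, 60000),
    "rp-keepalive": (31, 60000),
}

def validate_timer(name, value):
    """Return True if value is inside the documented range for the timer."""
    low, high = TIMER_RANGES[name]
    return low <= value <= high

def auto_holdtime(hello_interval):
    """The auto holdtime is three and a half times the hello interval."""
    return 3.5 * hello_interval

print(validate_timer("join-prune-interval", 100))
print(validate_timer("keepalive", 30))
print(auto_holdtime(30))
```

With the default hello-interval of 30 seconds, the auto holdtime works out to 105 seconds.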

          The following example commands set the join-prune-interval to 100 seconds, the keepalive timer to 10000 seconds, and the register-suppress time to 20000 seconds globally for all PIM enabled interfaces:

          cumulus@switch:~$ nv set router pim timers join-prune-interval 100
          cumulus@switch:~$ nv set router pim timers keepalive 10000
          cumulus@switch:~$ nv set router pim timers register-suppress 20000
          cumulus@switch:~$ nv config apply
          

          The following example commands set the hello-interval to 60 seconds for swp51:

          cumulus@switch:~$ nv set interface swp51 router pim timers hello-interval 60
          cumulus@switch:~$ nv config apply
          

          The following example commands set the rp-keepalive to 10000 seconds for VRF RED:

          cumulus@switch:~$ nv set vrf RED router pim timers rp-keepalive 10000
          cumulus@switch:~$ nv config apply
          

          The following example commands set the join-prune-interval to 100 seconds, the keep-alive timer to 10000 seconds, and the register-suppress time to 20000 seconds globally for all PIM enabled interfaces:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# ip pim join-prune-interval 100
          switch(config)# ip pim keep-alive-timer 10000
          switch(config)# ip pim register-suppress-time 20000
          switch(config)# end
          switch# write memory
          switch# exit
          

          The following example command sets the hello-interval to 60 seconds and the holdtime to 120 seconds for swp51:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# interface swp51
          switch(config-if)# ip pim hello 60 120
          switch(config-if)# end
          switch# write memory
          switch# exit
          

          The following example command sets the keep-alive-timer to 10000 seconds for VRF RED:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# vrf RED
          switch(config-vrf)# ip pim keep-alive-timer 10000
          switch(config-vrf)# end
          switch# write memory
          switch# exit
          

          Improve Multicast Convergence

          For large multicast environments, the default CoPP policer might be too restrictive. You can adjust the policer to improve multicast convergence.
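
To get a feel for how a rate and a burst value interact, the Python sketch below models a policer as a token bucket. This is a generic textbook model used here as an assumption; the class name `Policer` is hypothetical and the sketch does not claim to match how the switch hardware enforces CoPP.

```python
class Policer:
    """Token bucket: refills at `rate` packets per second, holds up to `burst`."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)  # the bucket starts full
        self.last = 0.0

    def allow(self, now):
        """Return True if a packet arriving at time `now` (seconds) passes."""
        elapsed = now - self.last
        self.last = now
        # Refill for the elapsed time, never beyond the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

policer = Policer(rate=400, burst=400)
# 500 packets arrive at the same instant: only the burst size gets through.
passed = sum(policer.allow(0.0) for _ in range(500))
print(passed, "of 500 packets passed")
```

In this model, a larger burst absorbs a sudden wave of joins or registers, while the rate bounds the sustained load on the control plane.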

          To adjust the policer:

          The following example commands set the PIM forwarding and burst rate to 400 packets per second:

          cumulus@switch:~$ nv set system control-plane policer pim-ospf-rip rate 400
          cumulus@switch:~$ nv set system control-plane policer pim-ospf-rip burst 400
          cumulus@switch:~$ nv config apply 
          

          The following example commands set the IGMP forwarding rate to 400 and the IGMP burst rate to 200 packets per second:

          cumulus@switch:~$ nv set system control-plane policer igmp rate 400
          cumulus@switch:~$ nv set system control-plane policer igmp burst 200
          cumulus@switch:~$ nv config apply 
          
          1. Edit the /etc/cumulus/control-plane/policers.conf file:

            • To tune the PIM forwarding and burst rate, change the copp.pim_ospf_rip.rate and copp.pim_ospf_rip.burst parameters.

            • To tune the IGMP forwarding and burst rate, change the copp.igmp.rate and copp.igmp.burst parameters.

              The following example changes the PIM forwarding rate and the PIM burst rate to 400 packets per second, the IGMP forwarding rate to 400 packets per second and the IGMP burst rate to 200 packets per second:

              cumulus@switch:~$ sudo nano /etc/cumulus/control-plane/policers.conf
              ...
              copp.pim_ospf_rip.enable = TRUE
              copp.pim_ospf_rip.rate = 400
              copp.pim_ospf_rip.burst = 400
              ...
              copp.igmp.enable = TRUE
              copp.igmp.rate = 400
              copp.igmp.burst = 200
              ...
              
          2. Run the following command:

            cumulus@switch:~$ /usr/lib/cumulus/switchdctl --load /etc/cumulus/control-plane/policers.conf
            

          IGMP Settings

          You can set the following optional IGMP settings on a PIM interface:

          The following example sets the last member query interval to 80, the maximum response time for IGMP general queries to 120 seconds, the number of group-specific queries that a querier can send to 5, and configures IGMP to send query-host messages every 180 seconds:

          cumulus@switch:~$ nv set interface swp1 ip igmp last-member-query-interval 80
          cumulus@switch:~$ nv set interface swp1 ip igmp query-max-response-time 120
          cumulus@switch:~$ nv set interface swp1 ip igmp last-member-query-count 5
          cumulus@switch:~$ nv set interface swp1 ip igmp query-interval 180
          cumulus@switch:~$ nv config apply
          

          The following example enables fast leave processing:

          cumulus@switch:~$ nv set interface swp1 ip igmp fast-leave on
          cumulus@switch:~$ nv config apply
          

          To disable fast leave processing, run the nv set interface <interface> ip igmp fast-leave off command.

          The following example sets the last member query interval to 80, the maximum response time for IGMP general queries to 120 seconds, the number of group-specific queries that a querier sends to 5, and configures IGMP to send query-host messages every 180 seconds:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# interface vlan10
          switch(config-if)# ip igmp last-member-query-interval 80
          switch(config-if)# ip igmp query-max-response-time 120
          switch(config-if)# ip igmp last-member-query-count 5
          switch(config-if)# ip igmp query-interval 180
          switch(config-if)# end
          switch# write memory
          switch# exit
          

          The vtysh commands save the configuration in the /etc/frr/frr.conf file:

          cumulus@switch:~$ sudo nano /etc/frr/frr.conf
          ...
          ip igmp
          ip igmp version 3
          ip igmp query-interval 180
          ip igmp last-member-query-interval 80
          ip igmp last-member-query-count 5
          ip igmp query-max-response-time 120
          ...
          

          To enable fast leave processing, edit the /etc/network/interfaces file and add the bridge-portmcfl yes parameter under the interface stanza:

          cumulus@switch:~$ sudo nano /etc/network/interfaces
          ...
          auto vlan10
          iface vlan10
              address 10.1.10.1/24
              hwaddress 44:38:39:22:01:b1
              bridge-portmcfl yes
              vlan-raw-device br_default
              vlan-id 10
          ...
          

          To disable fast leave processing, edit the /etc/network/interfaces file and set the bridge-portmcfl no parameter under the interface stanza.

          PIM Active-active with MLAG

          When a multicast sender attaches to an MLAG bond, the sender hashes the outbound multicast traffic over a single member of the bond. Traffic arrives on one of the MLAG-enabled switches. Regardless of which switch receives the traffic, it goes over the MLAG peerlink to the other MLAG-enabled switch, because the peerlink is always the multicast router port and always receives the multicast stream.

          Traffic from multicast sources attached to an MLAG bond always goes over the MLAG peerlink. Be sure to size the peerlink appropriately to accommodate this traffic.

          The PIM DR for the VLAN where the source resides sends the PIM register towards the RP. The PIM DR is the PIM speaker with the highest IP address on the segment. After the PIM register process is complete and traffic is flowing along the SPT, either MLAG switch forwards traffic towards the receivers.
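
The DR choice can be sketched in Python as follows. The function name `elect_dr` and the neighbor addresses are hypothetical. The sketch also considers the DR priority before the IP address, as RFC 7761 describes when all routers on the segment advertise a priority; with equal priorities it reduces to the highest IP address rule stated above.

```python
import ipaddress

def elect_dr(speakers):
    """Pick the DR from a list of (ip_address, dr_priority) tuples.

    The highest priority wins; the highest IP address breaks ties.
    Addresses are compared numerically, not as strings.
    """
    winner = max(speakers, key=lambda s: (s[1], ipaddress.ip_address(s[0])))
    return winner[0]

# Hypothetical VLAN segment with two PIM speakers at the default priority.
print(elect_dr([("10.1.10.2", 1), ("10.1.10.3", 1)]))
# A higher priority overrides the address comparison.
print(elect_dr([("10.1.10.2", 10), ("10.1.10.3", 1)]))
```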

          PIM joins sent towards the source can be ECMP load shared by upstream PIM neighbors. Either MLAG member can receive the PIM join and forward traffic, regardless of DR status.

          A dual-attached multicast receiver sends an IGMP join on the attached VLAN. One of the MLAG switches receives the IGMP join, then adds the IGMP join to the IGMP Join table and layer 2 MDB table. The layer 2 MDB table, like the unicast MAC address table, synchronizes through MLAG control messages over the peerlink. This allows both MLAG switches to program IGMP and MDB table forwarding information. Both switches send *,G PIM Join messages towards the RP. If the source is already sending, both MLAG switches receive the multicast stream.

          Traditionally, the PIM DR is the only node to send the PIM *,G Join. To provide resiliency in case of failure, both MLAG switches send PIM *,G Joins towards the RP to receive the multicast stream.

          To prevent duplicate multicast packets, PIM elects a DF, which is the primary member of the MLAG pair. The MLAG secondary switch puts the VLAN in the OIL, preventing duplicate multicast traffic.

          Example Traffic Flow

          The examples below show the flow of traffic between server02 and server03:

          Step 1
          1. server02 sends traffic to leaf02.

          2. leaf02 forwards traffic to leaf01 because the peerlink is a multicast router port.

          3. spine01 receives a PIM register from leaf01, the DR.

          4. leaf02 syncs the *,G table from leaf01 as an MLAG active-active peer.
          Step 2
          1. leaf02 has the *,G route indicating that it must forward traffic towards spine01.

          2. Either leaf02 or leaf01 sends this traffic directly based on which MLAG switch receives it from the attached source.

          3. In this case, leaf02 receives the traffic on the MLAG bond and forwards it directly upstream.

          Configure PIM with MLAG

          You can use a multicast sender or receiver over a dual-attached MLAG bond. On the VLAN interface where multicast sources or receivers exist, configure PIM active-active and IGMP. Enabling PIM active-active automatically enables PIM on that interface.

          cumulus@leaf01:~$ nv set interface vlan10 router pim active-active on
          cumulus@leaf01:~$ nv set interface vlan10 ip igmp
          cumulus@leaf01:~$ nv config apply
          
          cumulus@leaf01:~$ sudo vtysh
          ...
          leaf01# configure terminal
          leaf01(config)# interface vlan10
          leaf01(config-if)# ip pim active-active
          leaf01(config-if)# ip igmp
          leaf01(config-if)# end
          leaf01# write memory
          leaf01# exit
          

          To verify PIM active-active configuration, run the vtysh show ip pim mlag summary command:

          cumulus@leaf01:mgmt:~$ sudo vtysh
          ...
          leaf01# show ip pim mlag summary
          MLAG daemon connection: up
          MLAG peer state: up
          Zebra peer state: up
          MLAG role: PRIMARY
          Local VTEP IP: 0.0.0.0
          Anycast VTEP IP: 0.0.0.0
          Peerlink: peerlink.4094
          Session flaps: mlagd: 0 mlag-peer: 0 zebra-peer: 0
          Message Statistics:
          mroute adds: rx: 5, tx: 5
          mroute dels: rx: 0, tx: 0
          peer zebra status updates: 1
          PIM status updates: 0
          VxLAN updates: 0
          

          Troubleshooting

          This section provides commands to examine your PIM configuration and provides troubleshooting tips.

          PIM Show Commands

          To show the contents of the IP multicast routing table, run the vtysh show ip mroute command. You can verify the (S,G) and (*,G) state entries from the flags and check that the incoming and outgoing interfaces are correct:

          cumulus@fhr:~$ sudo vtysh
          ...
          fhr# show ip mroute
          IP Multicast Routing Table
          Flags: S - Sparse, C - Connected, P - Pruned
                 R - RP-bit set, F - Register flag, T - SPT-bit set
          
          Source          Group           Flags    Proto  Input            Output           TTL  Uptime
          10.1.10.101     239.1.1.1       SFP      none   vlan10           none             0    --:--:-- 
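
If you collect this output from many switches, a small script can flag entries that have no outgoing interface. The Python sketch below parses the table format shown above; the names `parse_mroute` and `SAMPLE` are hypothetical, and the parser assumes every data row has one value per column, which is not true of every show ip mroute layout.

```python
SAMPLE = """\
IP Multicast Routing Table
Flags: S - Sparse, C - Connected, P - Pruned
       R - RP-bit set, F - Register flag, T - SPT-bit set

Source          Group           Flags    Proto  Input            Output           TTL  Uptime
10.1.10.101     239.1.1.1       SFP      none   vlan10           none             0    --:--:--
"""

def parse_mroute(output):
    """Turn the mroute table into a list of dictionaries keyed by column name."""
    rows = []
    header = None
    for line in output.splitlines():
        fields = line.split()
        if fields[:2] == ["Source", "Group"]:
            header = fields  # remember the column names
        elif header and len(fields) == len(header):
            rows.append(dict(zip(header, fields)))
    return rows

for row in parse_mroute(SAMPLE):
    if row["Output"] == "none":
        print(f"({row['Source']},{row['Group']}) has no outgoing interface")
```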
          

          To see the active source on the switch, run the vtysh show ip pim upstream command.

          cumulus@fhr:~$ sudo vtysh
          ...
          fhr# show ip pim upstream
          Iif    Source        Group     State   Uptime    JoinTimer  RSTimer   KATimer   RefCnt
          vlan10 10.1.10.101   239.1.1.1 Prune   00:07:40  --:--:--   00:00:36  00:02:50  1
          

          To show upstream information for each (S,G) and whether the switch wants to join the multicast tree, run the vtysh show ip pim upstream-join-desired command.

          cumulus@fhr:~$ sudo vtysh
          ...
          fhr# show ip pim upstream-join-desired
          Source          Group           EvalJD
          10.1.10.101     239.1.1.1       yes 
          

          To show the PIM interfaces on the switch, run the vtysh show ip pim interface command.

          cumulus@fhr:mgmt:~$ sudo vtysh
          ...
          fhr# show ip pim interface
          Interface         State          Address  PIM Nbrs           PIM DR  FHR IfChannels
          lo                   up       10.10.10.1         0            local    0          0
          swp51                up       10.10.10.1         1     10.10.10.101    0          0
          vlan10               up        10.1.10.1         0            local    1          0
          

          The vtysh show ip pim interface detail command shows more detail about the PIM interfaces on the switch:

          cumulus@fhr:~$ sudo vtysh
          ...
          fhr# show ip pim interface detail
          ...
          Interface  : vlan10
          State      : up
          Address    : 10.1.10.1 (primary)
                       fe80::4638:39ff:fe00:31/64
          
          Designated Router
          -----------------
          Address   : 10.1.10.1
          Priority  : 1(0)
          Uptime    : --:--:--
          Elections : 1
          Changes   : 0
          
          FHR - First Hop Router
          ----------------------
          239.1.1.1 : 10.1.10.101 is a source, uptime is 00:03:08
          ...
          

          To show local membership information for a PIM interface, run the vtysh show ip pim local-membership command.

          cumulus@lhr:~$ sudo vtysh
          ...
          lhr# show ip pim local-membership
          Interface         Address          Source           Group            Membership
          vlan20            10.2.10.1        *                239.1.1.1        INCLUDE 
          

          To show information about known S,Gs, the IIF and the OIL, run the vtysh show ip pim state command.

          cumulus@fhr:~$ sudo vtysh
          ...
          fhr# show ip pim state
          Codes: J -> Pim Join, I -> IGMP Report, S -> Source, * -> Inherited from (*,G), V -> VxLAN, M -> Muted
          Active Source           Group            RPT  IIF               OIL
          1      10.1.10.101      239.1.1.1        n    vlan10 
          

          To show the IGMP configuration settings for an interface, run the nv show interface <interface> ip igmp command:

          cumulus@lhr:~$ nv show interface swp3 ip igmp
                                      operational  applied
          ---------------------------  -----------  -------
          enable                                    on
          version                      3            3
          fast-leave                                off
          query-interval               125          125
          query-max-response-time      100          100
          last-member-query-interval                10
          last-member-query-count      2            2
          [static-group]
          interface-state              up
          ip-address                   33.1.1.10
          ifindex                      3
          querier                      local
          querier-ip                   33.1.1.10
          query-start-count            0
          group-membership-interval    350
          older-host-present-interval  350
          other-querier-interval       300
          robustness-variable          2
          startup-query-interval       31
          last-member-query-time       20
          timers
            query-timer                00:00:12
            query-other-timer          --:--:--
          flags
            multicast                  on
            broadcast                  on
            lan-delay                  on             
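
Several of the values in this output are consistent with the derived timer formulas in RFC 3376 (IGMPv3). The following Python sketch reproduces that arithmetic with the numbers exactly as displayed above. It is an assumption that the switch derives the values this way, and the sketch does not convert units, which the output does not state.

```python
def igmp_derived_timers(query_interval, query_max_response_time,
                        last_member_query_interval, last_member_query_count,
                        robustness):
    """Apply the RFC 3376 section 8 formulas to the raw displayed values."""
    return {
        # Robustness Variable x Query Interval + Query Response Interval
        "group-membership-interval": robustness * query_interval + query_max_response_time,
        # RFC 3376 defines the older host present interval with the same formula
        "older-host-present-interval": robustness * query_interval + query_max_response_time,
        # Robustness Variable x Query Interval + half of the Query Response Interval
        "other-querier-interval": robustness * query_interval + query_max_response_time // 2,
        # One quarter of the Query Interval, rounded down
        "startup-query-interval": query_interval // 4,
        # Last Member Query Interval x Last Member Query Count
        "last-member-query-time": last_member_query_interval * last_member_query_count,
    }


timers = igmp_derived_timers(
    query_interval=125,
    query_max_response_time=100,
    last_member_query_interval=10,
    last_member_query_count=2,
    robustness=2,
)
for name, value in timers.items():
    print(name, value)
```

With the sample inputs, the results match the group-membership-interval (350), older-host-present-interval (350), other-querier-interval (300), startup-query-interval (31), and last-member-query-time (20) rows in the output.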
          

          To show IGMP operational data for an interface, run the NVUE nv show interface <interface> ip igmp -o json command or the vtysh show ip igmp statistics command.

          To verify that the receiver is sending IGMP reports (joins) for the group, run the NVUE nv show interface <interface> ip igmp group command or the vtysh show ip igmp groups command.

          cumulus@lhr:~$ nv show interface swp3 ip igmp group
          StaticGroupID  filter-mode  source-count  timer     uptime    version  Summary
          -------------  -----------  ------------  --------  --------  -------  -------------------------
          225.1.101.1    exclude      1             00:02:43  00:02:56  3        source-address:         *
          225.1.101.2    exclude      1             00:02:43  00:02:56  3        source-address:         *
          225.1.101.3    exclude      1             00:02:43  00:02:56  3        source-address:         *
          225.1.101.4    exclude      1             00:02:43  00:02:56  3        source-address:         *
          225.1.101.5    exclude      1             00:02:43  00:02:56  3        source-address:         *
          232.1.1.99     include      1             --:--:--  00:00:02  3        source-address: 10.1.10.1
          

          To show IGMP source information, run the vtysh show ip igmp sources command.

          cumulus@lhr:~$ sudo vtysh
          ...
          lhr# show ip igmp sources
          Interface        Address         Group           Source          Timer Fwd Uptime  
          vlan20           10.2.10.1       239.1.1.1       *               03:13   Y 05:28:42 
          

          FHR Stuck in the Registering Process

          When a multicast source starts, the FHR sends unicast PIM register messages to the RP, sourced from the RPF interface towards the source. After the RP receives the PIM register, it sends a PIM register stop message to the FHR to end the register process. If an issue occurs with this communication, the FHR becomes stuck in the registering process, which can result in high CPU usage (the FHR CPU generates and sends PIM register packets to the RP CPU).

          To assess this issue, review the multicast routing table on the FHR. In the following output, the output interface is pimreg. If pimreg does not change to an interface within a couple of seconds, the FHR is likely stuck in the registering process.

          cumulus@fhr:~$ sudo vtysh
          ...
          fhr# show ip mroute
          Source          Group           Proto  Input      Output     TTL  Uptime
          10.1.10.101     239.2.2.3       PIM    vlan10     pimreg     1    00:03:59
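
The check described above can be sketched in Python as follows. The sketch assumes the seven-column layout of this particular output, and the five second threshold is a hypothetical stand-in for "a couple of seconds"; neither the function names nor the threshold come from Cumulus Linux.

```python
def uptime_seconds(uptime):
    """Convert hh:mm:ss to seconds; placeholders such as --:--:-- count as 0."""
    parts = uptime.split(":")
    if len(parts) != 3 or not all(part.isdigit() for part in parts):
        return 0
    hours, minutes, seconds = (int(part) for part in parts)
    return hours * 3600 + minutes * 60 + seconds


def stuck_registering(rows, threshold_seconds=5):
    """Return (source, group) pairs whose output is still pimreg after the threshold."""
    stuck = []
    for row in rows:
        fields = row.split()
        if len(fields) != 7:
            continue  # skip headers or rows in a different layout
        source, group, proto, iif, oif, ttl, uptime = fields
        if oif == "pimreg" and uptime_seconds(uptime) > threshold_seconds:
            stuck.append((source, group))
    return stuck


rows = ["10.1.10.101     239.2.2.3       PIM    vlan10     pimreg     1    00:03:59"]
print(stuck_registering(rows))
```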
          

          To troubleshoot the issue:

          1. Validate that the FHR can reach the RP. If the RP and FHR cannot communicate, the registration process fails:

            cumulus@fhr:~$ ping 10.10.10.101
            PING 10.10.10.101 (10.10.10.101) from 10.1.10.1: 56(84) bytes of data.
            ^C
            --- 10.10.10.101 ping statistics ---
            4 packets transmitted, 0 received, 100% packet loss, time 3000ms
            
          2. On the RP, use tcpdump to see if the PIM register packets arrive:

            cumulus@rp01:~$ sudo tcpdump -i swp1
            tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
            listening on swp1, link-type EN10MB (Ethernet), capture size 262144 bytes
            23:33:17.524982 IP 10.1.10.101 > 10.10.10.101: PIMv2, Register, length 66
            
          3. If the RP is receiving PIM register packets, verify that PIM sees them by running the vtysh debug pim packets command on the RP:

            cumulus@rp01:~$ sudo vtysh -c "debug pim packets"
            PIM Packet debugging is on
            
            cumulus@rp01:~$ sudo tail /var/log/frr/frr.log
            2016/10/19 23:46:51 PIM: Recv PIM REGISTER packet from 172.16.5.1 to 10.0.0.21 on swp30: ttl=255 pim_version=2 pim_msg_size=64 checksum=a681
            
          4. Repeat the process on the FHR to see that it receives PIM register stop messages and passes them to the PIM process:

            cumulus@fhr:~$ sudo tcpdump -i swp51
            23:58:59.841625 IP 172.16.5.1 > 10.0.0.21: PIMv2, Register, length 28
            23:58:59.842466 IP 10.0.0.21 > 172.16.5.1: PIMv2, Register Stop, length 18
            
            cumulus@fhr:~$ sudo vtysh -c "debug pim packets"
            PIM Packet debugging is on
            
            cumulus@fhr:~$ sudo tail -f /var/log/frr/frr.log
            2016/10/19 23:59:38 PIM: Recv PIM REGSTOP packet from 10.10.10.101 to 10.10.10.1 on swp51: ttl=255 pim_version=2 pim_msg_size=18 checksum=5a39
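
When the log is busy, it can help to filter it for the register and register stop messages from the steps above. The following Python sketch is one possible way to do that. The regular expression is inferred only from the two sample log lines in this section, so treat the log format as an assumption; the function names are hypothetical.

```python
import re

# Pattern inferred from the sample frr.log lines in this section.
RECV_PATTERN = re.compile(
    r"PIM: Recv PIM (?P<type>\w+) packet "
    r"from (?P<src>\S+) to (?P<dst>\S+) on (?P<ifname>\S+):"
)


def parse_pim_recv(line):
    """Return the message type, addresses, and interface, or None if no match."""
    match = RECV_PATTERN.search(line)
    return match.groupdict() if match else None


def saw_register_stop(lines, rp_address):
    """True if any line records a REGSTOP received from the given RP address."""
    for line in lines:
        fields = parse_pim_recv(line)
        if fields and fields["type"] == "REGSTOP" and fields["src"] == rp_address:
            return True
    return False


log_lines = [
    "2016/10/19 23:59:38 PIM: Recv PIM REGSTOP packet from 10.10.10.101 to 10.10.10.1 "
    "on swp51: ttl=255 pim_version=2 pim_msg_size=18 checksum=5a39",
]
print(parse_pim_recv(log_lines[0]))
print(saw_register_stop(log_lines, "10.10.10.101"))
```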
            

          LHR Does Not Build *,G

          If you do not enable both PIM and IGMP on an interface facing a receiver, the LHR does not build *,G.

          cumulus@lhr:~$ sudo vtysh
          ...
          lhr# show run
          !
          interface vlan20
           ip igmp
           ip pim
          

          To troubleshoot this issue, ensure that the receiver sends IGMPv3 joins when you enable both PIM and IGMP:

          cumulus@lhr:~$ sudo tcpdump -i vlan20 igmp
          tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
          listening on vlan20, link-type EN10MB (Ethernet), capture size 262144 bytes
          00:03:55.789744 IP 10.2.10.1 > igmp.mcast.net: igmp v3 report, 1 group record(s)
          

          No mroute Created on the FHR

          To troubleshoot this issue:

          1. Verify that the FHR is receiving multicast traffic:

            cumulus@fhr:~$ sudo tcpdump -i vlan10
            tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
            listening on vlan10, link-type EN10MB (Ethernet), capture size 262144 bytes
            19:57:58.429632 IP 10.1.10.101.42420 > 239.1.1.1.1000: UDP, length 8
            19:57:59.431250 IP 10.1.10.101.42420 > 239.1.1.1.1000: UDP, length 8
            
          2. Verify PIM configuration on the interface facing the source:

            cumulus@fhr:~$ sudo vtysh
            ...
            fhr# show run
            !
            interface vlan10
             ip igmp
             ip pim
            !
            
          3. Verify that the RPF interface for the source matches the interface that receives multicast traffic:

            fhr# show ip rpf 10.1.10.101
            Routing entry for 10.1.10.0/24 using Unicast RIB
            Known via "connected", distance 0, metric 0, best
            Last update 1d00h26m ago
            * directly connected, vlan10
            
          4. Verify RP configuration for the multicast group:

            fhr# show ip pim rp-info
            RP address       group/prefix-list   OIF               I am RP    Source      Group-Type
            10.10.10.101     224.0.0.0/4         swp51             no         Static      ASM
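
Steps 3 and 4 both come down to prefix membership: the source must fall inside the prefix that the RPF lookup returns, and the group must fall inside the group range that the RP serves. The following Python sketch shows that arithmetic with the standard ipaddress module, using the addresses from this section. It does not query the switch, and the function names are hypothetical.

```python
import ipaddress


def source_in_prefix(source, prefix):
    """True if the source address belongs to the prefix from the RPF lookup."""
    return ipaddress.ip_address(source) in ipaddress.ip_network(prefix)


def group_covered_by_rp(group, rp_group_range):
    """True if the multicast group falls inside the RP group range."""
    return ipaddress.ip_address(group) in ipaddress.ip_network(rp_group_range)


# Source and connected prefix from steps 1 and 3; group and RP range from steps 1 and 4.
print(source_in_prefix("10.1.10.101", "10.1.10.0/24"))
print(group_covered_by_rp("239.1.1.1", "224.0.0.0/4"))
```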
            

          No S,G on the RP for an Active Group

          An RP does not build an mroute when there are no active receivers for a multicast group, even though the FHR creates the mroute.

          cumulus@rp01:~$ sudo vtysh
          ...
          rp01# show ip mroute
          Source          Group           Flags    Proto  Input            Output           TTL  Uptime
          

          You can see the active source on the RP with the vtysh show ip pim upstream command.

          cumulus@rp01:~$ sudo vtysh
          ...
          rp01# show ip pim upstream
          Iif             Source          Group           State       Uptime   JoinTimer RSTimer   KATimer   RefCnt
          vlan10          10.1.10.101     239.1.1.1       Prune       00:08:03 --:--:--  --:--:--  00:02:20       1
          

          No mroute Entry in Hardware

          To check whether the number of hardware IP multicast entries has reached the maximum value, run the cl-resource-query | grep Mcast command.

          cumulus@switch:~$ cl-resource-query  | grep Mcast
          Total Mcast Routes:         450,   0% of maximum value    450
          

          Refer to Forwarding Table Size and Profiles.

          Verify the MSDP Session State

          To verify the state of MSDP sessions, run the vtysh show ip msdp mesh-group and show ip msdp peer commands.

          cumulus@switch:~$ sudo vtysh
          ...
          switch# show ip msdp mesh-group
          Mesh group : pod1
            Source : 10.1.10.101
            Member                 State
            10.1.10.102        established
            10.1.10.103        established
          
          cumulus@switch:~$ sudo vtysh
          switch# show ip msdp peer
          Peer                    Local         State     Uptime    SaCnt
          10.1.10.102       10.1.10.101   established    00:07:21       0
          10.1.10.103       10.1.10.101   established    00:07:21       0
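
If you collect the show ip msdp peer output from several switches, a small script can flag sessions that are not established. The following Python sketch assumes the five-column layout shown above. The second sample row is invented for illustration and does not come from a real switch; the function name is hypothetical.

```python
def down_msdp_peers(rows):
    """Return (peer, state) for every peer row whose state is not 'established'."""
    down = []
    for row in rows:
        fields = row.split()
        if len(fields) != 5:
            continue  # skip the header or rows in a different layout
        peer, local, state, uptime, sa_count = fields
        if state != "established":
            down.append((peer, state))
    return down


rows = [
    "10.1.10.102       10.1.10.101   established    00:07:21       0",
    # Invented row for illustration only: a peer that is not established.
    "10.1.10.103       10.1.10.101   listen         00:00:00       0",
]
print(down_msdp_peers(rows))
```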
          

          View the Active Sources

          To review the active sources that the switch learns locally (through PIM registers) and from MSDP peers, run the vtysh show ip msdp sa command.

          cumulus@switch:~$ sudo vtysh
          ...
          switch# show ip msdp sa
          Source                Group               RP   Local    SPT      Uptime
          10.1.10.101       239.1.1.1     10.10.10.101       n      n    00:00:40
          10.1.10.101       239.1.1.2    100.10.10.101       n      n    00:00:25
          

          Clear PIM State and Statistics

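          If you are troubleshooting or making changes to your multicast environment, you can clear PIM and IGMP state and statistics. The NVUE commands are shown first, followed by the vtysh commands.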

          To clear PIM neighbors for all PIM interfaces in a VRF:

          cumulus@switch:~$ nv action clear vrf default router pim interfaces
          Action succeeded
          

          To clear traffic statistics for all PIM interfaces in a VRF:

          cumulus@switch:~$ nv action clear vrf default router pim interface-traffic
          Action succeeded
          

          To clear the IGMP interface state:

          cumulus@switch:~$ nv action clear router igmp interfaces
          Action succeeded
          

          To clear PIM neighbors for all PIM interfaces in a VRF:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# clear ip pim vrf default interfaces
          switch# exit
          

          To clear traffic statistics for all PIM interfaces in a VRF:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# clear ip pim vrf default interface traffic
          switch# exit
          

          To rescan the PIM OIL to update the output interface list in a VRF:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# clear ip pim vrf default oil
          switch# exit
          

          To clear all PIM process statistics in a VRF:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# clear ip pim statistics vrf default
          switch# exit
          

          To clear all PIM process statistics:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# clear ip pim statistics
          switch# exit
          

          To clear the IGMP interface state:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# clear ip igmp interfaces
          switch# exit
          

          Configuration Example

          The following example configures PIM and BGP on leaf01, leaf02, and spine01.

          Traffic Flow along the Shared Tree




          1. The FHR receives a multicast data packet from the source, encapsulates the packet in a unicast PIM register message, then sends it to the RP.

          2. The RP builds an (S,G) mroute, decapsulates the multicast packet, then forwards it along the (*,G) tree towards the receiver.

          3. The LHR receives multicast traffic and sees that it has a shorter path to the source. It requests the multicast stream from leaf01 and simultaneously sends the multicast stream to the receiver.

          Traffic Flow for the Shortest Path Tree




          1. The FHR hears a PIM join directly from the LHR and forwards multicast traffic directly to it.

          2. The LHR receives the multicast packet both from the FHR and the RP. The LHR discards the packet from the RP and prunes itself from the RP.

          3. The RP receives a prune message from the LHR and instructs the FHR to stop sending PIM register messages.

          4. Traffic continues directly between the FHR and the LHR.

          cumulus@leaf01:~$ nv set router pim enable on
          cumulus@leaf01:~$ nv set interface lo ip address 10.10.10.1/32
          cumulus@leaf01:~$ nv set interface swp1,swp49,swp51
          cumulus@leaf01:~$ nv set interface swp1 bridge domain br_default
          cumulus@leaf01:~$ nv set interface swp1 bridge domain br_default access 10
          cumulus@leaf01:~$ nv set bridge domain br_default vlan 10
          cumulus@leaf01:~$ nv set interface vlan10 ip address 10.1.10.1/24
          cumulus@leaf01:~$ nv set router bgp autonomous-system 65101
          cumulus@leaf01:~$ nv set router bgp router-id 10.10.10.1
          cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 remote-as external
          cumulus@leaf01:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.1/32
          cumulus@leaf01:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.1.10.0/24
          cumulus@leaf01:~$ nv set interface lo router pim
          cumulus@leaf01:~$ nv set interface swp51 router pim
          cumulus@leaf01:~$ nv set interface vlan10 router pim
          cumulus@leaf01:~$ nv set interface vlan10 ip igmp
          cumulus@leaf01:~$ nv set vrf default router pim address-family ipv4 rp 10.10.10.101
          cumulus@leaf01:~$ nv config apply
          
          cumulus@leaf02:~$ nv set router pim enable on
          cumulus@leaf02:~$ nv set interface lo ip address 10.10.10.2/32
          cumulus@leaf02:~$ nv set interface swp2,swp49,swp51
          cumulus@leaf02:~$ nv set interface swp2 bridge domain br_default
          cumulus@leaf02:~$ nv set interface swp2 bridge domain br_default access 20
          cumulus@leaf02:~$ nv set bridge domain br_default vlan 20
          cumulus@leaf02:~$ nv set interface vlan20 ip address 10.2.10.1/24
          cumulus@leaf02:~$ nv set router bgp autonomous-system 65102
          cumulus@leaf02:~$ nv set router bgp router-id 10.10.10.2
          cumulus@leaf02:~$ nv set vrf default router bgp neighbor swp51 remote-as external
          cumulus@leaf02:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.2/32
          cumulus@leaf02:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.2.10.0/24
          cumulus@leaf02:~$ nv set interface lo router pim
          cumulus@leaf02:~$ nv set interface swp51 router pim
          cumulus@leaf02:~$ nv set interface vlan20 router pim
          cumulus@leaf02:~$ nv set interface vlan20 ip igmp
          cumulus@leaf02:~$ nv set vrf default router pim address-family ipv4 rp 10.10.10.101
          cumulus@leaf02:~$ nv config apply
          
          cumulus@spine01:~$ nv set router pim enable on
          cumulus@spine01:~$ nv set interface lo ip address 10.10.10.101/32
          cumulus@spine01:~$ nv set router bgp autonomous-system 65199
          cumulus@spine01:~$ nv set router bgp router-id 10.10.10.101
          cumulus@spine01:~$ nv set vrf default router bgp neighbor swp1 remote-as external
          cumulus@spine01:~$ nv set vrf default router bgp neighbor swp2 remote-as external
          cumulus@spine01:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.101/32
          cumulus@spine01:~$ nv set interface lo router pim
          cumulus@spine01:~$ nv set interface swp1 router pim
          cumulus@spine01:~$ nv set interface swp2 router pim
          cumulus@spine01:~$ nv set vrf default router pim address-family ipv4 rp 10.10.10.101 
          cumulus@spine01:~$ nv config apply
          
          cumulus@leaf01:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
          - set:
              bridge:
                domain:
                  br_default:
                    vlan:
                      '10': {}
              interface:
                eth0:
                  ip:
                    address:
                      dhcp: {}
                    vrf: mgmt
                  type: eth
                lo:
                  ip:
                    address:
                      10.10.10.1/32: {}
                  router:
                    pim:
                      enable: on
                  type: loopback
                swp1:
                  bridge:
                    domain:
                      br_default:
                        access: 10
                  type: swp
                swp49:
                  type: swp
                swp51:
                  router:
                    pim:
                      enable: on
                  type: swp
                vlan10:
                  ip:
                    address:
                      10.1.10.1/24: {}
                    igmp:
                      enable: on
                  router:
                    pim:
                      enable: on
                  type: svi
                  vlan: 10
              router:
                bgp:
                  autonomous-system: 65101
                  enable: on
                  router-id: 10.10.10.1
                pim:
                  enable: on
              service:
                ntp:
                  mgmt:
                    server:
                      0.cumulusnetworks.pool.ntp.org: {}
                      1.cumulusnetworks.pool.ntp.org: {}
                      2.cumulusnetworks.pool.ntp.org: {}
                      3.cumulusnetworks.pool.ntp.org: {}
              system:
                aaa:
                  class:
                    nvapply:
                      action: allow
                      command-path:
                        /:
                          permission: all
                    nvshow:
                      action: allow
                      command-path:
                        /:
                          permission: ro
                    sudo:
                      action: allow
                      command-path:
                        /:
                          permission: all
                  role:
                    nvue-admin:
                      class:
                        nvapply: {}
                    nvue-monitor:
                      class:
                        nvshow: {}
                    system-admin:
                      class:
                        nvapply: {}
                        sudo: {}
                  user:
                    cumulus:
                      full-name: cumulus,,,
                      hashed-password: $6$Cir2YG.pLVeUZFGi$txBVny7YpjZDGE2gOIz0G.bbs3CzYQ1P9T9XgCqV7oRkfPUQ2gWoJvnnhMH3NtmVxA2.40P5bgaMydfBsIzYP0
                      role: system-admin
                api:
                  state: enabled
                config:
                  auto-save:
                    enable: on
                control-plane:
                  acl:
                    acl-default-dos:
                      inbound: {}
                    acl-default-whitelist:
                      inbound: {}
                global:
                  system-mac: 44:38:39:22:01:7a
                hostname: leaf01
                reboot:
                  mode: cold
                ssh-server:
                  state: enabled
                wjh:
                  channel:
                    forwarding:
                      trigger:
                        l2: {}
                        l3: {}
                        tunnel: {}
                  enable: on
              vrf:
                default:
                  router:
                    bgp:
                      address-family:
                        ipv4-unicast:
                          enable: on
                          network:
                            10.1.10.0/24: {}
                            10.10.10.1/32: {}
                      enable: on
                      neighbor:
                        swp51:
                          remote-as: external
                          type: unnumbered
                    pim:
                      address-family:
                        ipv4:
                          rp:
                            10.10.10.101: {}
                      enable: on
          
          cumulus@leaf02:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
          - set:
              bridge:
                domain:
                  br_default:
                    vlan:
                      '20': {}
              interface:
                eth0:
                  ip:
                    address:
                      dhcp: {}
                    vrf: mgmt
                  type: eth
                lo:
                  ip:
                    address:
                      10.10.10.2/32: {}
                  router:
                    pim:
                      enable: on
                  type: loopback
                swp2:
                  bridge:
                    domain:
                      br_default:
                        access: 20
                  type: swp
                swp49:
                  type: swp
                swp51:
                  router:
                    pim:
                      enable: on
                  type: swp
                vlan20:
                  ip:
                    address:
                      10.2.10.1/24: {}
                    igmp:
                      enable: on
                  router:
                    pim:
                      enable: on
                  type: svi
                  vlan: 20
              router:
                bgp:
                  autonomous-system: 65102
                  enable: on
                  router-id: 10.10.10.2
                pim:
                  enable: on
              service:
                ntp:
                  mgmt:
                    server:
                      0.cumulusnetworks.pool.ntp.org: {}
                      1.cumulusnetworks.pool.ntp.org: {}
                      2.cumulusnetworks.pool.ntp.org: {}
                      3.cumulusnetworks.pool.ntp.org: {}
              system:
                aaa:
                  class:
                    nvapply:
                      action: allow
                      command-path:
                        /:
                          permission: all
                    nvshow:
                      action: allow
                      command-path:
                        /:
                          permission: ro
                    sudo:
                      action: allow
                      command-path:
                        /:
                          permission: all
                  role:
                    nvue-admin:
                      class:
                        nvapply: {}
                    nvue-monitor:
                      class:
                        nvshow: {}
                    system-admin:
                      class:
                        nvapply: {}
                        sudo: {}
                  user:
                    cumulus:
                      full-name: cumulus,,,
                      hashed-password: $6$02mrE7tXtzp0MzJV$Ou9Vo4jcCC5ztEzb8ChYrDaGiqGwLKPQj2VRPEDYt0/EuTjmVDXM65TpJ06cmPGQZ0bf5NEmaMAH7cXkTQ4j9/
                      role: system-admin
                api:
                  state: enabled
                config:
                  auto-save:
                    enable: on
                control-plane:
                  acl:
                    acl-default-dos:
                      inbound: {}
                    acl-default-whitelist:
                      inbound: {}
                global:
                  system-mac: 44:38:39:22:01:78
                hostname: leaf02
                reboot:
                  mode: cold
                ssh-server:
                  state: enabled
                wjh:
                  channel:
                    forwarding:
                      trigger:
                        l2: {}
                        l3: {}
                        tunnel: {}
                  enable: on
              vrf:
                default:
                  router:
                    bgp:
                      address-family:
                        ipv4-unicast:
                          enable: on
                          network:
                            10.2.10.0/24: {}
                            10.10.10.2/32: {}
                      enable: on
                      neighbor:
                        swp51:
                          remote-as: external
                          type: unnumbered
                    pim:
                      address-family:
                        ipv4:
                          rp:
                            10.10.10.101: {}
                      enable: on
          
          cumulus@spine01:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
          - set:
              interface:
                eth0:
                  ip:
                    address:
                      dhcp: {}
                    vrf: mgmt
                  type: eth
                lo:
                  ip:
                    address:
                      10.10.10.101/32: {}
                  router:
                    pim:
                      enable: on
                  type: loopback
                swp1:
                  router:
                    pim:
                      enable: on
                  type: swp
                swp2:
                  router:
                    pim:
                      enable: on
                  type: swp
              router:
                bgp:
                  autonomous-system: 65199
                  enable: on
                  router-id: 10.10.10.101
                pim:
                  enable: on
              service:
                ntp:
                  mgmt:
                    server:
                      0.cumulusnetworks.pool.ntp.org: {}
                      1.cumulusnetworks.pool.ntp.org: {}
                      2.cumulusnetworks.pool.ntp.org: {}
                      3.cumulusnetworks.pool.ntp.org: {}
              system:
                aaa:
                  class:
                    nvapply:
                      action: allow
                      command-path:
                        /:
                          permission: all
                    nvshow:
                      action: allow
                      command-path:
                        /:
                          permission: ro
                    sudo:
                      action: allow
                      command-path:
                        /:
                          permission: all
                  role:
                    nvue-admin:
                      class:
                        nvapply: {}
                    nvue-monitor:
                      class:
                        nvshow: {}
                    system-admin:
                      class:
                        nvapply: {}
                        sudo: {}
                  user:
                    cumulus:
                      full-name: cumulus,,,
                      hashed-password: $6$ojEWGEv5NjLUIQ2T$3YJ2PBG0ekWj0nUoY5Psn8wzd6lsxW8KxDTMTXGZiZwoT8VMSYF0zqF/3AVjx3NhIJ8x10YJ5aCTeBz7kR7Ns1
                      role: system-admin
                api:
                  state: enabled
                config:
                  auto-save:
                    enable: on
                control-plane:
                  acl:
                    acl-default-dos:
                      inbound: {}
                    acl-default-whitelist:
                      inbound: {}
                global:
                  system-mac: 44:38:39:22:01:82
                hostname: spine01
                reboot:
                  mode: cold
                ssh-server:
                  state: enabled
                wjh:
                  channel:
                    forwarding:
                      trigger:
                        l2: {}
                        l3: {}
                        tunnel: {}
                  enable: on
              vrf:
                default:
                  router:
                    bgp:
                      address-family:
                        ipv4-unicast:
                          enable: on
                          network:
                            10.10.10.101/32: {}
                      enable: on
                      neighbor:
                        swp1:
                          remote-as: external
                          type: unnumbered
                        swp2:
                          remote-as: external
                          type: unnumbered
                    pim:
                      address-family:
                        ipv4:
                          rp:
                            10.10.10.101: {}
                      enable: on
          
          cumulus@leaf01:mgmt:~$ sudo cat /etc/network/interfaces
          ...
          auto lo
          iface lo inet loopback
              address 10.10.10.1/32
          auto mgmt
          iface mgmt
              address 127.0.0.1/8
              address ::1/128
              vrf-table auto
          auto eth0
          iface eth0 inet dhcp
              ip-forward off
              ip6-forward off
              vrf mgmt
          auto swp1
          iface swp1
              bridge-access 10
          auto swp49
          iface swp49
          auto swp51
          iface swp51
          auto vlan10
          iface vlan10
              address 10.1.10.1/24
              hwaddress 44:38:39:22:01:b1
              vlan-raw-device br_default
              vlan-id 10
          auto br_default
          iface br_default
              bridge-ports swp1
              hwaddress 44:38:39:22:01:b1
              bridge-vlan-aware yes
              bridge-vids 10
              bridge-pvid 1
          
          cumulus@leaf02:mgmt:~$ sudo cat /etc/network/interfaces
          ...
          auto lo
          iface lo inet loopback
              address 10.10.10.2/32
          auto mgmt
          iface mgmt
              address 127.0.0.1/8
              address ::1/128
              vrf-table auto
          auto eth0
          iface eth0 inet dhcp
              ip-forward off
              ip6-forward off
              vrf mgmt
          auto swp2
          iface swp2
              bridge-access 20
          auto swp49
          iface swp49
          auto swp51
          iface swp51
          auto vlan20
          iface vlan20
              address 10.2.10.1/24
              hwaddress 44:38:39:22:01:af
              vlan-raw-device br_default
              vlan-id 20
          auto br_default
          iface br_default
              bridge-ports swp2
              hwaddress 44:38:39:22:01:af
              bridge-vlan-aware yes
              bridge-vids 20
              bridge-pvid 1
          
          cumulus@spine01:mgmt:~$ sudo cat /etc/network/interfaces
          ...
          auto lo
          iface lo inet loopback
              address 10.10.10.101/32
          auto mgmt
          iface mgmt
              address 127.0.0.1/8
              address ::1/128
              vrf-table auto
          auto eth0
          iface eth0 inet dhcp
              ip-forward off
              ip6-forward off
              vrf mgmt
          auto swp1
          iface swp1
          auto swp2
          iface swp2
          
          cumulus@server01:~$ sudo cat /etc/network/interfaces
          # The loopback network interface
          auto lo
          iface lo inet loopback
          # The OOB network interface
          auto eth0
          iface eth0 inet dhcp
          # The data plane network interfaces
          auto eth1
          iface eth1 inet manual
            address 10.1.10.101
            netmask 255.255.255.0
            mtu 9000
            post-up ip route add 10.0.0.0/8 via 10.1.10.1
          
          cumulus@server02:~$ sudo cat /etc/network/interfaces
          auto lo
          iface lo inet loopback
          # The OOB network interface
          auto eth0
          iface eth0 inet dhcp
          # The data plane network interfaces
          auto eth2
          iface eth2 inet manual
            address 10.2.10.102
            netmask 255.255.255.0
            mtu 9000
            post-up ip route add 10.0.0.0/8 via 10.2.10.1
          
          cumulus@leaf01:mgmt:~$ sudo cat /etc/frr/frr.conf
          ...
          

          vrf default
          ip pim rp 10.10.10.101 224.0.0.0/4
          exit-vrf
          vrf mgmt
          exit-vrf
          interface lo
          ip pim
          interface swp51
          ip pim
          interface vlan10
          ip igmp
          ip igmp version 3
          ip igmp query-interval 125
          ip igmp last-member-query-interval 100
          ip igmp last-member-query-count 2
          ip igmp query-max-response-time 1000
          ip pim
          router bgp 65101 vrf default
          bgp router-id 10.10.10.1
          timers bgp 3 9
          bgp deterministic-med
          ! Neighbors
          neighbor swp51 interface remote-as external
          neighbor swp51 timers 3 9
          neighbor swp51 timers connect 10
          neighbor swp51 advertisement-interval 0
          neighbor swp51 capability extended-nexthop
          ! Address families
          address-family ipv4 unicast
          network 10.1.10.0/24
          network 10.10.10.1/32
          maximum-paths ibgp 64
          maximum-paths 64
          distance bgp 20 200 200
          neighbor swp51 activate
          exit-address-family
          ! end of router bgp 65101 vrf default

          cumulus@leaf02:mgmt:~$ sudo cat /etc/frr/frr.conf
          ...
          vrf default
          ip pim rp 10.10.10.101 224.0.0.0/4
          exit-vrf
          vrf mgmt
          exit-vrf
          interface lo
          ip pim
          interface swp51
          ip pim
          interface vlan20
          ip igmp
          ip igmp version 3
          ip igmp query-interval 125
          ip igmp last-member-query-interval 100
          ip igmp last-member-query-count 2
          ip igmp query-max-response-time 1000
          ip pim
          router bgp 65102 vrf default
          bgp router-id 10.10.10.2
          timers bgp 3 9
          bgp deterministic-med
          ! Neighbors
          neighbor swp51 interface remote-as external
          neighbor swp51 timers 3 9
          neighbor swp51 timers connect 10
          neighbor swp51 advertisement-interval 0
          neighbor swp51 capability extended-nexthop
          ! Address families
          address-family ipv4 unicast
          network 10.10.10.2/32
          network 10.2.10.0/24
          maximum-paths ibgp 64
          maximum-paths 64
          distance bgp 20 200 200
          neighbor swp51 activate
          exit-address-family
          ! end of router bgp 65102 vrf default
          
          cumulus@spine01:mgmt:~$ sudo cat /etc/frr/frr.conf
          ...
          vrf default
          ip pim rp 10.10.10.101 224.0.0.0/4
          exit-vrf
          vrf mgmt
          exit-vrf
          interface lo
          ip pim
          interface swp1
          ip pim
          interface swp2
          ip pim
          router bgp 65199 vrf default
          bgp router-id 10.10.10.101
          timers bgp 3 9
          bgp deterministic-med
          ! Neighbors
          neighbor swp1 interface remote-as external
          neighbor swp1 timers 3 9
          neighbor swp1 timers connect 10
          neighbor swp1 advertisement-interval 0
          neighbor swp1 capability extended-nexthop
          neighbor swp2 interface remote-as external
          neighbor swp2 timers 3 9
          neighbor swp2 timers connect 10
          neighbor swp2 advertisement-interval 0
          neighbor swp2 capability extended-nexthop
          ! Address families
          address-family ipv4 unicast
          network 10.10.10.101/32
          maximum-paths ibgp 64
          maximum-paths 64
          distance bgp 20 200 200
          neighbor swp1 activate
          neighbor swp2 activate
          exit-address-family
          ! end of router bgp 65199 vrf default
          

          This simulation is running Cumulus Linux 5.11. The Cumulus Linux 5.12 simulation is coming soon.

          The simulation starts with the example PIM configuration. To simplify the example, only one spine and two leafs are in the topology. The demo is pre-configured using NVUE commands.

          To validate the configuration, run the PIM show commands listed in the troubleshooting section above.

          Considerations

          Virtual Router Redundancy Protocol - VRRP

          VRRP allows two or more network devices in an active-standby configuration to share a single virtual default gateway. The VRRP router that forwards packets at any given time is the master. If this VRRP router fails, another VRRP standby router automatically takes over as master. The master sends VRRP advertisements, which include its priority and state, to the other VRRP routers in the same virtual router group. VRRP router priority determines the role that each virtual router plays and which router becomes the new master if the master fails.
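
          As a rough illustration of the election described above, the following Python sketch picks the master from a group of routers by highest priority. The router names, priorities, and addresses are illustrative values. The tie-break on the higher primary IP address follows RFC 5798 and is not described in this guide. The sketch models only the outcome of the election, not the advertisement exchange or its timers.

```python
import ipaddress


def elect_master(routers):
    """Return the name of the router that would become master.

    routers maps a router name to a (priority, primary_ip) tuple.
    The highest priority wins; equal priorities fall back to the
    higher primary IP address (RFC 5798 tie-break, an assumption here).
    """
    return max(
        routers,
        key=lambda name: (routers[name][0], ipaddress.ip_address(routers[name][1])),
    )


# Illustrative group: one router with priority 254, one with the default 100.
group = {"spine01": (254, "10.0.0.2"), "spine02": (100, "10.0.0.3")}
print(elect_master(group))

# If spine01 fails, the remaining router takes over as master.
survivors = {name: info for name, info in group.items() if name != "spine01"}
print(elect_master(survivors))
```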

          Use VRRP when you have multiple distinct devices that connect to a layer 2 segment through multiple logical connections (not through a single bond). VRRP elects a single active forwarder that owns the virtual MAC address while it is active. This prevents the forwarding database of the layer 2 domain from continuously updating in response to MAC flaps because the switch receives frames sourced from the virtual MAC address from discrete logical connections.

          All virtual routers use 00:00:5E:00:01:XX for IPv4 gateways or 00:00:5E:00:02:XX for IPv6 gateways as their MAC address. The last byte of the address is the Virtual Router IDentifier (VRID), which is different for each virtual router in the network. Only one physical router uses this MAC address at a time. The router replies with this address when it receives ARP requests or neighbor solicitation packets for the IP addresses of the virtual router.
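
          The mapping from VRID to virtual MAC address can be sketched in a few lines of Python. The function name is hypothetical, and the 1 to 255 VRID range is taken from RFC 5798 rather than from this guide.

```python
def vrrp_virtual_mac(vrid: int, ip_version: int = 4) -> str:
    """Return the VRRP virtual MAC address for a VRID.

    IPv4 virtual routers use 00:00:5e:00:01:XX and IPv6 virtual routers
    use 00:00:5e:00:02:XX, where XX is the VRID as one hexadecimal byte.
    """
    if not 1 <= vrid <= 255:  # range per RFC 5798 (assumption)
        raise ValueError("VRID must be between 1 and 255")
    if ip_version not in (4, 6):
        raise ValueError("ip_version must be 4 or 6")
    family_byte = 0x01 if ip_version == 4 else 0x02
    return f"00:00:5e:00:{family_byte:02x}:{vrid:02x}"


print(vrrp_virtual_mac(44, 4))  # 00:00:5e:00:01:2c
print(vrrp_virtual_mac(44, 6))  # 00:00:5e:00:02:2c
```

          VRID 44 is 0x2c in hexadecimal, which is why the virtual MAC addresses for virtual router 44 end in 2c.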

          RFC 5798 describes VRRP in detail.

          The following example illustrates a basic VRRP configuration.

          Configure VRRP

          To configure VRRP, specify the following information on each switch:

          You can also set these optional parameters:

          Optional Parameter | Default Value | Description
          --- | --- | ---
          priority | 100 | The priority level of the virtual router within the virtual router group, which determines the role that each virtual router plays and what happens if the master fails. Virtual routers have a priority between 1 and 254; the router with the highest priority becomes the master.
          advertisement interval | 1000 milliseconds | The advertisement interval is the interval between successive advertisements by the master in a virtual router group. You can specify a value between 10 and 40950.
          preempt | enabled | Preempt mode lets the router take over as master for a virtual router group if it has a higher priority than the current master. Preempt mode is on by default. To disable preempt mode, edit the /etc/frr/frr.conf file to add the line no vrrp <VRID> preempt to the interface stanza, then restart the FRR service.
          version | 3 | The VRRP protocol version. You can specify a value of either 2 or 3.
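
          Before you apply a configuration, you might want to check the optional parameters against the ranges listed above. The following Python sketch is one way to do that; the function and argument names are hypothetical and are not part of NVUE or FRR.

```python
def validate_vrrp_options(priority=100, advertisement_interval=1000, version=3):
    """Return a list of problems; an empty list means every value is in range.

    The defaults and ranges mirror the optional parameters described above.
    advertisement_interval is in milliseconds.
    """
    problems = []
    if not 1 <= priority <= 254:
        problems.append(f"priority {priority} is outside 1-254")
    if not 10 <= advertisement_interval <= 40950:
        problems.append(
            f"advertisement interval {advertisement_interval} is outside 10-40950"
        )
    if version not in (2, 3):
        problems.append(f"version {version} is not 2 or 3")
    return problems


print(validate_vrrp_options(priority=254, advertisement_interval=5000))
print(validate_vrrp_options(priority=255, version=4))
```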

          The following example commands configure two switches (spine01 and spine02) that form one virtual router group (VRID 44) with IPv4 address 10.0.0.1/24 and IPv6 address 2001:0db8::1/64. spine01 is the master; it has a priority of 254. spine02 is the backup VRRP router.

          The parent interface must use a primary address as the source address on VRRP advertisement packets.

          When you configure VRRP with NVUE commands, NVUE enables the vrrpd service and restarts the FRR service. An FRR service restart might impact traffic.

          cumulus@spine01:~$ nv set interface swp1 ip address 10.0.0.2/24
          cumulus@spine01:~$ nv set interface swp1 ip address 2001:0db8::2/64
          cumulus@spine01:~$ nv set interface swp1 ip vrrp virtual-router 44 address 10.0.0.1
          cumulus@spine01:~$ nv set interface swp1 ip vrrp virtual-router 44 address 2001:0db8::1
          cumulus@spine01:~$ nv set interface swp1 ip vrrp virtual-router 44 priority 254
          cumulus@spine01:~$ nv set interface swp1 ip vrrp virtual-router 44 advertisement-interval 5000
          cumulus@spine01:~$ nv config apply
          
          cumulus@spine02:~$ nv set interface swp1 ip address 10.0.0.3/24
          cumulus@spine02:~$ nv set interface swp1 ip address 2001:0db8::3/64
          cumulus@spine02:~$ nv set interface swp1 ip vrrp virtual-router 44 address 10.0.0.1
          cumulus@spine02:~$ nv set interface swp1 ip vrrp virtual-router 44 address 2001:0db8::1
          cumulus@spine02:~$ nv config apply
          
          1. Edit the /etc/network/interfaces file to assign an IP address to the parent interface; for example:

            cumulus@spine01:~$ sudo vi /etc/network/interfaces
            ...
            auto swp1
            iface swp1
                address 10.0.0.2/24
                address 2001:0db8::2/64
            
          2. Enable the vrrpd daemon, then start the FRR service. See FRRouting.

          3. From the vtysh shell, configure VRRP.

            cumulus@spine01:~$ sudo vtysh
            ...
            spine01# configure terminal
            spine01(config)# interface swp1
            spine01(config-if)# vrrp 44 ip 10.0.0.1
            spine01(config-if)# vrrp 44 ipv6 2001:0db8::1
            spine01(config-if)# vrrp 44 priority 254
            spine01(config-if)# vrrp 44 advertisement-interval 5000
            spine01(config-if)# end
            spine01# write memory
            spine01# exit
            
          1. Edit the /etc/network/interfaces file to assign an IP address to the parent interface; for example:

            cumulus@spine02:~$ sudo vi /etc/network/interfaces
            ...
            auto swp1
            iface swp1
                address 10.0.0.3/24
                address 2001:0db8::3/64
            
          2. Enable the vrrpd daemon, then start the FRR service. See FRRouting.

          3. From the vtysh shell, configure VRRP.

            cumulus@spine02:~$ sudo vtysh
            ...
            spine02# configure terminal
            spine02(config)# interface swp1
            spine02(config-if)# vrrp 44 ip 10.0.0.1
            spine02(config-if)# vrrp 44 ipv6 2001:0db8::1
            spine02(config-if)# end
            spine02# write memory
            spine02# exit
            

          The vtysh commands save the configuration in the /etc/network/interfaces file and the /etc/frr/frr.conf file. For example:

          cumulus@spine01:~$ sudo cat /etc/network/interfaces
          ...
          auto swp1
          iface swp1
              address 10.0.0.2/24
              address 2001:0db8::2/64
              vrrp 44 10.0.0.1/24 2001:0db8::1/64
          ...
          
          cumulus@spine01:~$ sudo cat /etc/frr/frr.conf
          ...
          interface swp1
          vrrp 44
          vrrp 44 advertisement-interval 5000
          vrrp 44 priority 254
          vrrp 44 ip 10.0.0.1
          vrrp 44 ipv6 2001:0db8::1
          ...
          

          Show VRRP Configuration

          To show global VRRP configuration, run the NVUE nv show router vrrp command:

          cumulus@switch:~$ nv show router vrrp
                                  applied
          ----------------------  -------
          enable                  on     
          advertisement-interval  1000   
          preempt                 on     
          priority                100    
          

          The vtysh show vrrp command shows VRRP configuration and operational data:

          ...
          switch# show vrrp
           Virtual Router ID                       44                          
           Protocol Version                        3                           
           Autoconfigured                          No                          
           Shutdown                                No                          
           Interface                               swp1                        
           VRRP interface (v4)                     vrrp4-3-44                  
           VRRP interface (v6)                     vrrp6-3-44                  
           Primary IP (v4)                         10.0.0.2                    
           Primary IP (v6)                         fe80::14a8:c009:2597:9854   
           Virtual MAC (v4)                        00:00:5e:00:01:2c           
           Virtual MAC (v6)                        00:00:5e:00:02:2c           
           Status (v4)                             Master                      
           Status (v6)                             Master                      
           Priority                                254                         
           Effective Priority (v4)                 254                         
           Effective Priority (v6)                 254                         
           Preempt Mode                            Yes                         
           Accept Mode                             Yes                         
           Advertisement Interval                  5000 ms                     
           Master Advertisement Interval (v4) Rx   5000 ms (stale)             
           Master Advertisement Interval (v6) Rx   5000 ms (stale)             
           Advertisements Tx (v4)                  4                           
           Advertisements Tx (v6)                  3                           
           Advertisements Rx (v4)                  0                           
           Advertisements Rx (v6)                  0                           
           Gratuitous ARP Tx (v4)                  1                           
           Neigh. Adverts Tx (v6)                  1                           
           State transitions (v4)                  2                           
           State transitions (v6)                  2                           
           Skew Time (v4)                          30 ms                       
           Skew Time (v6)                          30 ms                       
           Master Down Interval (v4)               15030 ms                    
           Master Down Interval (v6)               15030 ms                    
           IPv4 Addresses                          1                           
           ..................................      10.0.0.1                    
           IPv6 Addresses                          1                           
           ..................................      2001:db8::1
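
          The Skew Time and Master Down Interval values in the show vrrp output above follow from the priority and the advertisement interval. The following Python sketch reproduces them with the RFC 5798 formulas, which work in centiseconds. The function name is hypothetical, and the integer truncation is an assumption chosen because it matches the values shown; it is not taken from the FRR source.

```python
def vrrp_v3_timers(priority: int, master_adv_interval_ms: int) -> dict:
    """Compute the skew time and master down interval in milliseconds.

    RFC 5798 defines, in centiseconds:
      Skew_Time            = ((256 - priority) * Master_Adver_Interval) / 256
      Master_Down_Interval = (3 * Master_Adver_Interval) + Skew_Time
    """
    adv_cs = master_adv_interval_ms // 10         # milliseconds to centiseconds
    skew_cs = ((256 - priority) * adv_cs) // 256  # truncated to whole centiseconds
    down_cs = 3 * adv_cs + skew_cs
    return {
        "skew_time_ms": skew_cs * 10,
        "master_down_interval_ms": down_cs * 10,
    }


# Priority 254 with a 5000 ms advertisement interval, as in the output above.
print(vrrp_v3_timers(254, 5000))
```

          With priority 254 and a 5000 ms interval, the sketch returns a skew time of 30 ms and a master down interval of 15030 ms. A lower priority produces a longer skew time, so lower-priority backups wait longer before they declare the master down.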
          

          To show configuration and operational information about all configured VRRP virtual routers, run the NVUE nv show interface <interface-id> ip vrrp virtual-router command or the vtysh show vrrp command.

          Add -o json at the end of the NVUE command to see the output in a more readable format:

          cumulus@switch:~$ nv show interface swp1 ip vrrp virtual-router -o json
          {
            "44": {
              "accept-mode": "on",
              "address-family": {
                "ipv4": {
                  "counters": {
                    "adv-rx": 0,
                    "adv-tx": 4663,
                    "garp-tx": 1,
                    "state-transitions": 2
                  },
                  "down-interval": 15030,
                  "master-adv-interval": 5000,
                  "primary-addr": "10.0.0.2",
                  "priority": 254,
                  "skew-time": 30,
                  "status": "Master",
                  "virtual-addresses": {
                    "10.0.0.1": {}
                  },
                  "vmac": "00:00:5e:00:01:2c",
                  "vrrp-interface": "vrrp4-3-44"
                },
                "ipv6": {
                  "counters": {
                    "adv-rx": 0,
                    "adv-tx": 4662,
                    "neigh-adv-tx": 1,
                    "state-transitions": 2
                  },
                  "down-interval": 15030,
                  "master-adv-interval": 5000,
                  "primary-addr": "fe80::42cc:fd5c:fb48:76a8",
                  "priority": 254,
                  "skew-time": 30,
                  "status": "Master",
                  "virtual-addresses": {
                    "2001:db8::1": {}
                  },
                  "vmac": "00:00:5e:00:02:2c",
                  "vrrp-interface": "vrrp6-3-44"
                }
              },
              "advertisement-interval": 5000,
              "auto-config": "off",
              "interface": "swp1",
              "is-shutdown": "off",
              "preempt": "on",
              "priority": 254,
              "version": 3
            }
          }
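
          The JSON output is convenient for scripted checks. The following Python sketch reads the status of each address family from a trimmed copy of the output above. The function name is hypothetical, and the sketch assumes the key layout shown in this example.

```python
import json


def vrrp_status(nv_json: str) -> dict:
    """Map (VRID, address family) to the reported VRRP status."""
    data = json.loads(nv_json)
    return {
        (vrid, family): details["status"]
        for vrid, router in data.items()
        for family, details in router["address-family"].items()
    }


# Trimmed copy of the nv show ... -o json output shown above.
sample = """
{
  "44": {
    "address-family": {
      "ipv4": {"priority": 254, "status": "Master"},
      "ipv6": {"priority": 254, "status": "Master"}
    },
    "interface": "swp1",
    "priority": 254,
    "version": 3
  }
}
"""

for (vrid, family), status in vrrp_status(sample).items():
    print(f"VRID {vrid} {family}: {status}")
```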
          

          To show configuration information about a specific VRRP virtual router, run the NVUE nv show interface <interface-id> ip vrrp virtual-router <virtual-router-id> command or the vtysh show vrrp <virtual-router-id> command:

          cumulus@switch:~$ nv show interface swp1 ip vrrp virtual-router 44
                                  operational  applied
          ----------------------  -----------  ------------
          advertisement-interval  5000         5000
          preempt                 on           auto
          priority                254          254
          version                 3            3
          [address]                            10.0.0.1
          [address]                            2001:0db8::1
          accept-mode             on
          auto-config             off
          interface               swp1
          is-shutdown             off
          [address-family]        ipv4
          [address-family]        ipv6
          

          The vtysh show vrrp <virtual-router-id> command shows operational information in addition to configuration information:

          ...
          switch# show vrrp 44
          
           Virtual Router ID                       44                          
           Protocol Version                        3                           
           Autoconfigured                          No                          
           Shutdown                                No                          
           Interface                               swp1                        
           VRRP interface (v4)                     vrrp4-3-44                  
           VRRP interface (v6)                     vrrp6-3-44                  
           Primary IP (v4)                         10.0.0.2                    
           Primary IP (v6)                         fe80::42cc:fd5c:fb48:76a8   
           Virtual MAC (v4)                        00:00:5e:00:01:2c           
           Virtual MAC (v6)                        00:00:5e:00:02:2c           
           Status (v4)                             Master                      
           Status (v6)                             Master                      
           Priority                                254                         
           Effective Priority (v4)                 254                         
           Effective Priority (v6)                 254                         
           Preempt Mode                            Yes                         
           Accept Mode                             Yes                         
           Advertisement Interval                  5000 ms                     
           Master Advertisement Interval (v4) Rx   5000 ms (stale)             
           Master Advertisement Interval (v6) Rx   5000 ms (stale)             
           Advertisements Tx (v4)                  4710                        
           Advertisements Tx (v6)                  4709                        
           Advertisements Rx (v4)                  0                           
           Advertisements Rx (v6)                  0                           
           Gratuitous ARP Tx (v4)                  1                           
           Neigh. Adverts Tx (v6)                  1                           
           State transitions (v4)                  2                           
           State transitions (v6)                  2                           
           Skew Time (v4)                          30 ms                       
           Skew Time (v6)                          30 ms                       
           Master Down Interval (v4)               15030 ms                    
           Master Down Interval (v6)               15030 ms                    
           IPv4 Addresses                          1                           
           ..................................      10.0.0.1                    
           IPv6 Addresses                          1                           
           ..................................      2001:db8::1 
          

          GRE Tunneling

          Generic Routing Encapsulation (GRE) is a tunneling protocol that encapsulates network layer protocols inside virtual point-to-point links over an Internet Protocol network. The tunnel source and tunnel destination addresses on each side identify the two endpoints.

          GRE packets travel directly between the two endpoints through a virtual tunnel. As a packet comes across other routers, there is no interaction with its payload; the routers only parse the outer IP packet. When the packet reaches the endpoint of the GRE tunnel, the switch de-encapsulates the outer packet, parses the payload, then forwards it to its ultimate destination.

          GRE carries multiple protocols over a single-protocol backbone and is less demanding than some of the alternative solutions, such as VPN. You can use GRE to transport protocols that the underlying network does not support, work around networks with limited hops, connect non-contiguous subnets, and allow VPNs across wide area networks.

          The following example shows two sites that use IPv4 addresses. Using GRE tunneling, the two endpoints can encapsulate an IPv4 or IPv6 payload inside an IPv4 packet. The switch routes the packet based on the destination in the outer IPv4 header.
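
          To make the encapsulation concrete, the following Python sketch builds the 4-byte base GRE header defined in RFC 2784 and computes the per-packet overhead of GRE over IPv4. The function and constant names are hypothetical. The sketch assumes no optional GRE fields (checksum, key, or sequence number) and an outer IPv4 header without options; it does not describe how the switch hardware builds the header.

```python
import struct

ETHERTYPE_IPV4 = 0x0800  # protocol type for an IPv4 payload
ETHERTYPE_IPV6 = 0x86DD  # protocol type for an IPv6 payload


def gre_base_header(protocol_type: int) -> bytes:
    """Return the 4-byte base GRE header: 16 bits of flags and version, then the protocol type."""
    flags_and_version = 0x0000  # no optional fields, version 0
    return struct.pack("!HH", flags_and_version, protocol_type)


def gre_over_ipv4_overhead(outer_ipv4_header_len: int = 20) -> int:
    """Bytes added to each packet: the outer IPv4 header plus the GRE header."""
    return outer_ipv4_header_len + len(gre_base_header(ETHERTYPE_IPV4))


print(gre_base_header(ETHERTYPE_IPV4).hex())  # 00000800
print(gre_base_header(ETHERTYPE_IPV6).hex())  # 000086dd
print(gre_over_ipv4_overhead())               # 24
```

          Under these assumptions, each tunneled packet is 24 bytes larger than its payload, which is worth keeping in mind when you choose the MTU for the tunnel and transit interfaces.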

          Configure GRE Tunneling

          To configure GRE tunneling, you create a GRE tunnel interface with routes for tunneling on both endpoints as follows:

          The following configuration example shows the commands used to set up a bidirectional GRE tunnel between two endpoints: tunnelR1 and tunnelR2. The local tunnel endpoint for tunnelR1 is 10.10.10.1 and the remote endpoint is 10.10.10.3. The local tunnel endpoint for tunnelR2 is 10.10.10.3 and the remote endpoint is 10.10.10.1.

          In NVUE, if you create the GRE interface with a name that starts with tunnel, NVUE automatically sets the interface type to tunnel. If you create a GRE interface with a name that does not start with tunnel, you must set the interface type to tunnel with the nv set interface <interface-name> type tunnel command.

          cumulus@leaf01:~$ nv set interface lo ip address 10.10.10.1/32
          cumulus@leaf01:~$ nv set interface swp1 ip address 10.2.1.1/24
          cumulus@leaf01:~$ nv set interface tunnelR2 ip address 10.1.100.1/30
          cumulus@leaf01:~$ nv set interface tunnelR2 tunnel mode gre
          cumulus@leaf01:~$ nv set interface tunnelR2 tunnel dest-ip 10.10.10.3
          cumulus@leaf01:~$ nv set interface tunnelR2 tunnel source-ip 10.10.10.1
          cumulus@leaf01:~$ nv set interface tunnelR2 tunnel ttl 255
          cumulus@leaf01:~$ nv set vrf default router static 10.1.1.0/24 via tunnelR2
          cumulus@leaf01:~$ nv config apply
          
          cumulus@leaf03:~$ nv set interface lo ip address 10.10.10.3/32
          cumulus@leaf03:~$ nv set interface swp1 ip address 10.1.1.1/24
          cumulus@leaf03:~$ nv set interface tunnelR1 ip address 10.1.100.2/30
          cumulus@leaf03:~$ nv set interface tunnelR1 tunnel mode gre
          cumulus@leaf03:~$ nv set interface tunnelR1 tunnel dest-ip 10.10.10.1
          cumulus@leaf03:~$ nv set interface tunnelR1 tunnel source-ip 10.10.10.3
          cumulus@leaf03:~$ nv set interface tunnelR1 tunnel ttl 255
          cumulus@leaf03:~$ nv set vrf default router static 10.2.1.0/24 via tunnelR1
          cumulus@leaf03:~$ nv config apply
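
          Because both endpoints use the same set of NVUE commands with the source and destination swapped, you can generate them from a small template. The following Python sketch is one way to do that. The function name and the greR2 interface name are hypothetical, and the sketch only prints the nv set commands shown above; it does not run them. It adds the type tunnel command only when the interface name does not start with tunnel.

```python
def gre_tunnel_commands(name, address, source_ip, dest_ip, ttl=255):
    """Return the NVUE commands that define one GRE tunnel interface."""
    commands = []
    if not name.startswith("tunnel"):
        # NVUE only infers the tunnel type from names that start with "tunnel".
        commands.append(f"nv set interface {name} type tunnel")
    commands += [
        f"nv set interface {name} ip address {address}",
        f"nv set interface {name} tunnel mode gre",
        f"nv set interface {name} tunnel dest-ip {dest_ip}",
        f"nv set interface {name} tunnel source-ip {source_ip}",
        f"nv set interface {name} tunnel ttl {ttl}",
    ]
    return commands


# leaf01 side of the example above.
for command in gre_tunnel_commands("tunnelR2", "10.1.100.1/30", "10.10.10.1", "10.10.10.3"):
    print(command)

# A name that does not start with "tunnel" needs the explicit type command.
print(gre_tunnel_commands("greR2", "10.1.100.1/30", "10.10.10.1", "10.10.10.3")[0])
```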
          
          1. Edit the /etc/network/interfaces file to add the tunnel interface:

            cumulus@leaf01:~$ sudo nano /etc/network/interfaces
            ...
            auto lo
            iface lo inet loopback
               address 10.10.10.1/32
            auto swp1
            iface swp1
               address 10.2.1.1/24
            auto tunnelR2
            iface tunnelR2
               address 10.1.100.1/30
               tunnel-mode gre
               tunnel-local 10.10.10.1
               tunnel-endpoint 10.10.10.3
               tunnel-ttl 255
            
          2. Run the ifreload -a command to load the configuration:

            cumulus@leaf01:mgmt:~$ sudo ifreload -a
            
          3. Run vtysh commands to configure the static route:

            cumulus@leaf01:mgmt:~$ sudo vtysh
            ...
            leaf01# configure terminal
            leaf01(config)# ip route 10.1.1.0/24 tunnelR2
            leaf01(config)# exit
            leaf01# write memory
            leaf01# exit
            cumulus@leaf01:mgmt:~$
            

            The vtysh commands save the static route configuration in the /etc/frr/frr.conf file. For example:

            cumulus@leaf01:mgmt:~$ sudo cat /etc/frr/frr.conf
            ...
            vrf default
            ip route 10.1.1.0/24 tunnelR2
            exit-vrf
            ...
            
          1. Edit the /etc/network/interfaces file to add the tunnel interface:

            cumulus@leaf03:~$ sudo nano /etc/network/interfaces
            ...
            auto lo
            iface lo inet loopback
               address 10.10.10.3/32
            auto swp1
            iface swp1
               address 10.1.1.1/24
            auto tunnelR1
            iface tunnelR1
               address 10.1.100.2/30
               tunnel-mode gre
               tunnel-local 10.10.10.3
               tunnel-endpoint 10.10.10.1
               tunnel-ttl 255
            
          2. Run the ifreload -a command to load the configuration:

            cumulus@leaf03:mgmt:~$ sudo ifreload -a
            
          3. Run vtysh commands to configure the static route:

            cumulus@leaf03:mgmt:~$ sudo vtysh
            ...
            leaf03# configure terminal
            leaf03(config)# ip route 10.2.1.0/24 tunnelR1
            leaf03(config)# exit
            leaf03# write memory
            leaf03# exit
            cumulus@leaf03:mgmt:~$
            

            The vtysh commands save the static route configuration in the /etc/frr/frr.conf file. For example:

            cumulus@leaf03:mgmt:~$ sudo cat /etc/frr/frr.conf
            ...
            vrf default
            ip route 10.2.1.0/24 tunnelR1
            exit-vrf
            vrf mgmt
            exit-vrf
            ...
            

          To delete a GRE tunnel, remove the tunnel interface, and remove the routes configured with the tunnel interface. Either run the NVUE nv unset commands or remove the tunnel configuration from the /etc/network/interfaces file and run the ifreload -a command.

          Troubleshooting

          To check GRE tunnel settings, run the NVUE nv show interface <interface> tunnel command, or run the Linux ip tunnel show or ifquery --check command. For example:

          cumulus@leaf01:mgmt:~$ nv show interface tunnelR2 tunnel
                     operational  applied     description
          ---------  -----------  ----------  -------------------------------
          dest-ip    10.10.10.3   10.10.10.3  Destination underlay IP address
          mode       gre          gre         tunnel mode
          source-ip  10.10.10.1   10.10.10.1  Source underlay IP address
          ttl                     255         time to live
          
          cumulus@leaf01:mgmt:~$ ip tunnel show
          gre0: gre/ip remote any local any ttl inherit nopmtudisc
          tunnelR2: gre/ip remote 10.10.10.3 local 10.10.10.1 ttl 255
          
          cumulus@leaf01:mgmt:~$ ifquery --check tunnelR2
          auto tunnelR2
          iface tunnelR2                                                      [pass]
                  tunnel-mode gre                                             [pass]
                  tunnel-local 10.10.10.1/32                                  [pass]
                  tunnel-endpoint 10.10.10.3/32                               [pass]
                  tunnel-ttl 255                                              [pass]
                  address 10.1.100.1/30                                       [pass]
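
To compare tunnel settings across several switches, you can parse the ip tunnel show output in a script. The following Python sketch is illustrative only: parse_tunnel_line and check_tunnel are hypothetical helper names, not part of Cumulus Linux, and the regular expression assumes the line layout shown in the sample output above.

```python
import re

# Sample line copied from the `ip tunnel show` output above.
SAMPLE = "tunnelR2: gre/ip remote 10.10.10.3 local 10.10.10.1 ttl 255"

# Assumed line layout: "<name>: <mode> remote <ip> local <ip> ttl <value> ..."
TUNNEL_RE = re.compile(
    r"^(?P<name>\S+): (?P<mode>\S+) remote (?P<remote>\S+) "
    r"local (?P<local>\S+) ttl (?P<ttl>\S+)"
)


def parse_tunnel_line(line):
    """Split one line of `ip tunnel show` output into named fields."""
    match = TUNNEL_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized tunnel line: {line!r}")
    return match.groupdict()


def check_tunnel(line, expected_local, expected_remote):
    """Return a list of mismatches between the line and the expected endpoints."""
    fields = parse_tunnel_line(line)
    problems = []
    if fields["local"] != expected_local:
        problems.append(f"local is {fields['local']}, expected {expected_local}")
    if fields["remote"] != expected_remote:
        problems.append(f"remote is {fields['remote']}, expected {expected_remote}")
    return problems


if __name__ == "__main__":
    print(parse_tunnel_line(SAMPLE))
    print(check_tunnel(SAMPLE, "10.10.10.1", "10.10.10.3"))
```

An empty list from check_tunnel means the local and remote underlay addresses match what you expect for that switch.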
          

          Configuration Example

          This example uses the reference topology, and uses spine01 and spine02 to represent the transit IPv4 network to connect the GRE endpoints.

          cumulus@leaf01:~$ nv set interface lo ip address 10.10.10.1/32
          cumulus@leaf01:~$ nv set interface swp1 ip address 10.2.1.1/24
          cumulus@leaf01:~$ nv set interface swp1,51-52
          cumulus@leaf01:~$ nv set interface tunnelR2 ip address 10.1.100.1/30
          cumulus@leaf01:~$ nv set interface tunnelR2 tunnel mode gre
          cumulus@leaf01:~$ nv set interface tunnelR2 tunnel dest-ip 10.10.10.3
          cumulus@leaf01:~$ nv set interface tunnelR2 tunnel source-ip 10.10.10.1
          cumulus@leaf01:~$ nv set interface tunnelR2 tunnel ttl 255
          cumulus@leaf01:~$ nv set vrf default router static 10.1.1.0/24 via tunnelR2
          cumulus@leaf01:~$ nv set router bgp autonomous-system 65101
          cumulus@leaf01:~$ nv set router bgp router-id 10.10.10.1
          cumulus@leaf01:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.1/32
          cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 remote-as external
          cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp52 remote-as external
          cumulus@leaf01:~$ nv config apply
          
          cumulus@leaf03:~$ nv set interface lo ip address 10.10.10.3/32
          cumulus@leaf03:~$ nv set interface swp1 ip address 10.1.1.1/24
          cumulus@leaf03:~$ nv set interface swp1,51-52
          cumulus@leaf03:~$ nv set interface tunnelR1 ip address 10.1.100.2/30
          cumulus@leaf03:~$ nv set interface tunnelR1 tunnel mode gre
          cumulus@leaf03:~$ nv set interface tunnelR1 tunnel dest-ip 10.10.10.1
          cumulus@leaf03:~$ nv set interface tunnelR1 tunnel source-ip 10.10.10.3
          cumulus@leaf03:~$ nv set interface tunnelR1 tunnel ttl 255
          cumulus@leaf03:~$ nv set vrf default router static 10.2.1.0/24 via tunnelR1
          cumulus@leaf03:~$ nv set router bgp autonomous-system 65103
          cumulus@leaf03:~$ nv set router bgp router-id 10.10.10.3
          cumulus@leaf03:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.3/32
          cumulus@leaf03:~$ nv set vrf default router bgp neighbor swp51 remote-as external
          cumulus@leaf03:~$ nv set vrf default router bgp neighbor swp52 remote-as external
          cumulus@leaf03:~$ nv config apply
          
          cumulus@spine01:~$ nv set interface lo ip address 10.10.10.101/32
          cumulus@spine01:~$ nv set interface swp1,3
          cumulus@spine01:~$ nv set router bgp autonomous-system 65199
          cumulus@spine01:~$ nv set router bgp router-id 10.10.10.101
          cumulus@spine01:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.101/32
          cumulus@spine01:~$ nv set vrf default router bgp neighbor swp1 remote-as external
          cumulus@spine01:~$ nv set vrf default router bgp neighbor swp3 remote-as external
          cumulus@spine01:~$ nv config apply
          
          cumulus@spine02:~$ nv set interface lo ip address 10.10.10.102/32
          cumulus@spine02:~$ nv set interface swp1,3
          cumulus@spine02:~$ nv set router bgp autonomous-system 65199
          cumulus@spine02:~$ nv set router bgp router-id 10.10.10.102
          cumulus@spine02:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.102/32
          cumulus@spine02:~$ nv set vrf default router bgp neighbor swp1 remote-as external
          cumulus@spine02:~$ nv set vrf default router bgp neighbor swp3 remote-as external
          cumulus@spine02:~$ nv config apply
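
The leaf01 and leaf03 commands above must mirror each other: the source of one tunnel is the destination of the other, both tunnel interfaces sit in the same /30, and each static route points at the subnet behind the far endpoint. The following Python sketch checks those relationships with the standard ipaddress module. The dictionary layout and the check_pair function are assumptions made for illustration; they are not an NVUE data format or a Cumulus Linux tool.

```python
import ipaddress

# Values copied from the leaf01 and leaf03 NVUE commands above.
leaf01 = {
    "tunnel_ip": "10.1.100.1/30",
    "source_ip": "10.10.10.1",
    "dest_ip": "10.10.10.3",
    "local_subnet": "10.2.1.1/24",  # swp1
    "static_route": "10.1.1.0/24",  # via tunnelR2
}
leaf03 = {
    "tunnel_ip": "10.1.100.2/30",
    "source_ip": "10.10.10.3",
    "dest_ip": "10.10.10.1",
    "local_subnet": "10.1.1.1/24",  # swp1
    "static_route": "10.2.1.0/24",  # via tunnelR1
}


def check_pair(a, b):
    """Return a list of inconsistencies between two GRE endpoint configurations."""
    problems = []
    # The underlay addresses must be swapped between the two ends.
    if a["source_ip"] != b["dest_ip"] or b["source_ip"] != a["dest_ip"]:
        problems.append("source-ip and dest-ip do not mirror each other")
    # Both tunnel interfaces must be distinct hosts in the same subnet.
    if_a = ipaddress.ip_interface(a["tunnel_ip"])
    if_b = ipaddress.ip_interface(b["tunnel_ip"])
    if if_a.network != if_b.network:
        problems.append("tunnel interfaces are in different subnets")
    if if_a.ip == if_b.ip:
        problems.append("tunnel interfaces use the same address")
    # Each static route must cover the subnet attached to the far end.
    for near, far in ((a, b), (b, a)):
        route = ipaddress.ip_network(near["static_route"])
        far_subnet = ipaddress.ip_interface(far["local_subnet"]).network
        if route != far_subnet:
            problems.append(f"route {route} does not match remote subnet {far_subnet}")
    return problems


if __name__ == "__main__":
    print(check_pair(leaf01, leaf03))
```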
          
          cumulus@leaf01:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
          - set:
              interface:
                eth0:
                  ip:
                    address:
                      dhcp: {}
                    vrf: mgmt
                  type: eth
                lo:
                  ip:
                    address:
                      10.10.10.1/32: {}
                  type: loopback
                swp1:
                  ip:
                    address:
                      10.2.1.1/24: {}
                  type: swp
                swp51:
                  type: swp
                swp52:
                  type: swp
                tunnelR2:
                  ip:
                    address:
                      10.1.100.1/30: {}
                  tunnel:
                    dest-ip: 10.10.10.3
                    mode: gre
                    source-ip: 10.10.10.1
                    ttl: 255
                  type: tunnel
              router:
                bgp:
                  autonomous-system: 65101
                  enable: on
                  router-id: 10.10.10.1
              service:
                ntp:
                  mgmt:
                    server:
                      0.cumulusnetworks.pool.ntp.org: {}
                      1.cumulusnetworks.pool.ntp.org: {}
                      2.cumulusnetworks.pool.ntp.org: {}
                      3.cumulusnetworks.pool.ntp.org: {}
              system:
                aaa:
                  class:
                    nvapply:
                      action: allow
                      command-path:
                        /:
                          permission: all
                    nvshow:
                      action: allow
                      command-path:
                        /:
                          permission: ro
                    sudo:
                      action: allow
                      command-path:
                        /:
                          permission: all
                  role:
                    nvue-admin:
                      class:
                        nvapply: {}
                    nvue-monitor:
                      class:
                        nvshow: {}
                    system-admin:
                      class:
                        nvapply: {}
                        sudo: {}
                  user:
                    cumulus:
                      full-name: cumulus,,,
                      hashed-password: $6$Q1oWhPxoShG7XD.5$OaVCPFxz.8pxCNTIBP6j5mqKskt9x6pZVFvBpvrB2GDChmH0zLa8FdWP6D8y/QBp577ylmKnoL1cOyI9L4mMm0
                      role: system-admin
                api:
                  state: enabled
                config:
                  auto-save:
                    enable: on
                control-plane:
                  acl:
                    acl-default-dos:
                      inbound: {}
                    acl-default-whitelist:
                      inbound: {}
                global:
                  system-mac: 44:38:39:22:01:7a
                hostname: leaf01
                reboot:
                  mode: cold
                ssh-server:
                  state: enabled
                wjh:
                  channel:
                    forwarding:
                      trigger:
                        l2: {}
                        l3: {}
                        tunnel: {}
                  enable: on
              vrf:
                default:
                  router:
                    bgp:
                      address-family:
                        ipv4-unicast:
                          enable: on
                          network:
                            10.10.10.1/32: {}
                      enable: on
                      neighbor:
                        swp51:
                          remote-as: external
                          type: unnumbered
                        swp52:
                          remote-as: external
                          type: unnumbered
                    static:
                      10.1.1.0/24:
                        address-family: ipv4-unicast
                        via:
                          tunnelR2:
                            type: interface
          
          cumulus@leaf03:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
          - set:
              interface:
                eth0:
                  ip:
                    address:
                      dhcp: {}
                    vrf: mgmt
                  type: eth
                lo:
                  ip:
                    address:
                      10.10.10.3/32: {}
                  type: loopback
                swp1:
                  ip:
                    address:
                      10.1.1.1/24: {}
                  type: swp
                swp51:
                  type: swp
                swp52:
                  type: swp
                tunnelR1:
                  ip:
                    address:
                      10.1.100.2/30: {}
                  tunnel:
                    dest-ip: 10.10.10.1
                    mode: gre
                    source-ip: 10.10.10.3
                    ttl: 255
                  type: tunnel
              router:
                bgp:
                  autonomous-system: 65103
                  enable: on
                  router-id: 10.10.10.3
              service:
                ntp:
                  mgmt:
                    server:
                      0.cumulusnetworks.pool.ntp.org: {}
                      1.cumulusnetworks.pool.ntp.org: {}
                      2.cumulusnetworks.pool.ntp.org: {}
                      3.cumulusnetworks.pool.ntp.org: {}
              system:
                aaa:
                  class:
                    nvapply:
                      action: allow
                      command-path:
                        /:
                          permission: all
                    nvshow:
                      action: allow
                      command-path:
                        /:
                          permission: ro
                    sudo:
                      action: allow
                      command-path:
                        /:
                          permission: all
                  role:
                    nvue-admin:
                      class:
                        nvapply: {}
                    nvue-monitor:
                      class:
                        nvshow: {}
                    system-admin:
                      class:
                        nvapply: {}
                        sudo: {}
                  user:
                    cumulus:
                      full-name: cumulus,,,
                      hashed-password: $6$1nU9GYzwvSQWSXxk$lvpOd0vZVFZ4ksBO6/CdTFVSI7Rf02t5EDnwWLrrTKxWKBulMGfSxZxnKDKLaeAkaIgSaeZq.qHKzhFtNpeW..
                      role: system-admin
                api:
                  state: enabled
                config:
                  auto-save:
                    enable: on
                control-plane:
                  acl:
                    acl-default-dos:
                      inbound: {}
                    acl-default-whitelist:
                      inbound: {}
                global:
                  system-mac: 44:38:39:22:01:84
                hostname: leaf03
                reboot:
                  mode: cold
                ssh-server:
                  state: enabled
                wjh:
                  channel:
                    forwarding:
                      trigger:
                        l2: {}
                        l3: {}
                        tunnel: {}
                  enable: on
              vrf:
                default:
                  router:
                    bgp:
                      address-family:
                        ipv4-unicast:
                          enable: on
                          network:
                            10.10.10.3/32: {}
                      enable: on
                      neighbor:
                        swp51:
                          remote-as: external
                          type: unnumbered
                        swp52:
                          remote-as: external
                          type: unnumbered
                    static:
                      10.2.1.0/24:
                        address-family: ipv4-unicast
                        via:
                          tunnelR1:
                            type: interface
          
          cumulus@spine01:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
          - set:
              interface:
                eth0:
                  ip:
                    address:
                      dhcp: {}
                    vrf: mgmt
                  type: eth
                lo:
                  ip:
                    address:
                      10.10.10.101/32: {}
                  type: loopback
                swp1:
                  type: swp
                swp3:
                  type: swp
              router:
                bgp:
                  autonomous-system: 65199
                  enable: on
                  router-id: 10.10.10.101
              service:
                ntp:
                  mgmt:
                    server:
                      0.cumulusnetworks.pool.ntp.org: {}
                      1.cumulusnetworks.pool.ntp.org: {}
                      2.cumulusnetworks.pool.ntp.org: {}
                      3.cumulusnetworks.pool.ntp.org: {}
              system:
                aaa:
                  class:
                    nvapply:
                      action: allow
                      command-path:
                        /:
                          permission: all
                    nvshow:
                      action: allow
                      command-path:
                        /:
                          permission: ro
                    sudo:
                      action: allow
                      command-path:
                        /:
                          permission: all
                  role:
                    nvue-admin:
                      class:
                        nvapply: {}
                    nvue-monitor:
                      class:
                        nvshow: {}
                    system-admin:
                      class:
                        nvapply: {}
                        sudo: {}
                  user:
                    cumulus:
                      full-name: cumulus,,,
                      hashed-password: $6$U5LDz6062WliqcV/$BUodYzPhxdHcCt9v2aN59Y25RshkXq7zpKhNEBl5klEVzlx9x6oSyDWUjkRaQeUg8yVRhb37cl4.tyU5Shcy5.
                      role: system-admin
                api:
                  state: enabled
                config:
                  auto-save:
                    enable: on
                control-plane:
                  acl:
                    acl-default-dos:
                      inbound: {}
                    acl-default-whitelist:
                      inbound: {}
                global:
                  system-mac: 44:38:39:22:01:82
                hostname: spine01
                reboot:
                  mode: cold
                ssh-server:
                  state: enabled
                wjh:
                  channel:
                    forwarding:
                      trigger:
                        l2: {}
                        l3: {}
                        tunnel: {}
                  enable: on
              vrf:
                default:
                  router:
                    bgp:
                      address-family:
                        ipv4-unicast:
                          enable: on
                          network:
                            10.10.10.101/32: {}
                      enable: on
                      neighbor:
                        swp1:
                          remote-as: external
                          type: unnumbered
                        swp3:
                          remote-as: external
                          type: unnumbered
          
          cumulus@spine02:mgmt:~$ sudo cat /etc/nvue.d/startup.yaml
          - set:
              interface:
                eth0:
                  ip:
                    address:
                      dhcp: {}
                    vrf: mgmt
                  type: eth
                lo:
                  ip:
                    address:
                      10.10.10.102/32: {}
                  type: loopback
                swp1:
                  type: swp
                swp3:
                  type: swp
              router:
                bgp:
                  autonomous-system: 65199
                  enable: on
                  router-id: 10.10.10.102
              service:
                ntp:
                  mgmt:
                    server:
                      0.cumulusnetworks.pool.ntp.org: {}
                      1.cumulusnetworks.pool.ntp.org: {}
                      2.cumulusnetworks.pool.ntp.org: {}
                      3.cumulusnetworks.pool.ntp.org: {}
              system:
                aaa:
                  class:
                    nvapply:
                      action: allow
                      command-path:
                        /:
                          permission: all
                    nvshow:
                      action: allow
                      command-path:
                        /:
                          permission: ro
                    sudo:
                      action: allow
                      command-path:
                        /:
                          permission: all
                  role:
                    nvue-admin:
                      class:
                        nvapply: {}
                    nvue-monitor:
                      class:
                        nvshow: {}
                    system-admin:
                      class:
                        nvapply: {}
                        sudo: {}
                  user:
                    cumulus:
                      full-name: cumulus,,,
                      hashed-password: $6$VBrmD8yYqxPPlyV7$xnsnH.LHtqVsaC2rqvMgs5ePmCt6dBX11qgkLAvovBtTiq5La/sHbwyPOJ4Zyia4CdAQTYEcMzthz4IB4ZW.i0
                      role: system-admin
                api:
                  state: enabled
                config:
                  auto-save:
                    enable: on
                control-plane:
                  acl:
                    acl-default-dos:
                      inbound: {}
                    acl-default-whitelist:
                      inbound: {}
                global:
                  system-mac: 44:38:39:22:01:92
                hostname: spine02
                reboot:
                  mode: cold
                ssh-server:
                  state: enabled
                wjh:
                  channel:
                    forwarding:
                      trigger:
                        l2: {}
                        l3: {}
                        tunnel: {}
                  enable: on
              vrf:
                default:
                  router:
                    bgp:
                      address-family:
                        ipv4-unicast:
                          enable: on
                          network:
                            10.10.10.102/32: {}
                      enable: on
                      neighbor:
                        swp1:
                          remote-as: external
                          type: unnumbered
                        swp3:
                          remote-as: external
                          type: unnumbered
          
          cumulus@leaf01:mgmt:~$ sudo cat /etc/network/interfaces
          auto lo
          iface lo inet loopback
              address 10.10.10.1/32
          auto mgmt
          iface mgmt
              address 127.0.0.1/8
              address ::1/128
              vrf-table auto
          auto eth0
          iface eth0 inet dhcp
              ip-forward off
              ip6-forward off
              vrf mgmt
          auto swp1
          iface swp1
              address 10.2.1.1/24
          auto swp51
          iface swp51
          auto swp52
          iface swp52
          auto tunnelR2
          iface tunnelR2
              address 10.1.100.1/30
              tunnel-mode gre
              tunnel-local 10.10.10.1
              tunnel-endpoint 10.10.10.3
              tunnel-ttl 255
          
          cumulus@leaf03:mgmt:~$ sudo cat /etc/network/interfaces
          auto lo
          iface lo inet loopback
              address 10.10.10.3/32
          auto mgmt
          iface mgmt
              address 127.0.0.1/8
              address ::1/128
              vrf-table auto
          auto eth0
          iface eth0 inet dhcp
              ip-forward off
              ip6-forward off
              vrf mgmt
          auto swp1
          iface swp1
              address 10.1.1.1/24
          auto swp51
          iface swp51
          auto swp52
          iface swp52
          auto tunnelR1
          iface tunnelR1
              address 10.1.100.2/30
              tunnel-mode gre
              tunnel-local 10.10.10.3
              tunnel-endpoint 10.10.10.1
              tunnel-ttl 255
          
          cumulus@spine01:mgmt:~$ sudo cat /etc/network/interfaces
          auto lo
          iface lo inet loopback
              address 10.10.10.101/32
          auto mgmt
          iface mgmt
              address 127.0.0.1/8
              address ::1/128
              vrf-table auto
          auto eth0
          iface eth0 inet dhcp
              ip-forward off
              ip6-forward off
              vrf mgmt
          auto swp1
          iface swp1
          auto swp3
          iface swp3
          
          cumulus@spine02:mgmt:~$ sudo cat /etc/network/interfaces
          auto lo
          iface lo inet loopback
              address 10.10.10.102/32
          auto mgmt
          iface mgmt
              address 127.0.0.1/8
              address ::1/128
              vrf-table auto
          auto eth0
          iface eth0 inet dhcp
              ip-forward off
              ip6-forward off
              vrf mgmt
          auto swp1
          iface swp1
          auto swp3
          iface swp3
          
          cumulus@server01:mgmt:~$ sudo cat /etc/network/interfaces
          auto eth0
          iface eth0 inet dhcp
            post-up sysctl -w net.ipv6.conf.eth0.accept_ra=2
          auto eth1
          iface eth1
           address 10.2.1.2/24
           post-up ip route add 10.0.0.0/8 via 10.2.1.1
          
          cumulus@server04:mgmt:~$ sudo cat /etc/network/interfaces
          auto eth0
          iface eth0 inet dhcp
            post-up sysctl -w net.ipv6.conf.eth0.accept_ra=2
          auto eth1
          iface eth1
           address 10.1.1.2/24
           post-up ip route add 10.0.0.0/8 via 10.1.1.1
          
          cumulus@leaf01:mgmt:~$ sudo cat /etc/frr/frr.conf
          ...
          vrf default
          ip route 10.1.1.0/24 tunnelR2
          exit-vrf
          vrf mgmt
          exit-vrf
          router bgp 65101 vrf default
          bgp router-id 10.10.10.1
          timers bgp 3 9
          bgp deterministic-med
          ! Neighbors
          neighbor swp51 interface remote-as external
          neighbor swp51 timers 3 9
          neighbor swp51 timers connect 10
          neighbor swp51 advertisement-interval 0
          neighbor swp51 capability extended-nexthop
          neighbor swp52 interface remote-as external
          neighbor swp52 timers 3 9
          neighbor swp52 timers connect 10
          neighbor swp52 advertisement-interval 0
          neighbor swp52 capability extended-nexthop
          ! Address families
          address-family ipv4 unicast
          network 10.10.10.1/32
          maximum-paths ibgp 64
          maximum-paths 64
          distance bgp 20 200 200
          neighbor swp51 activate
          neighbor swp52 activate
          exit-address-family
          ! end of router bgp 65101 vrf default
          
          cumulus@leaf03:mgmt:~$ sudo cat /etc/frr/frr.conf
          ...
          vrf default
          ip route 10.2.1.0/24 tunnelR1
          exit-vrf
          vrf mgmt
          exit-vrf
          router bgp 65103 vrf default
          bgp router-id 10.10.10.3
          timers bgp 3 9
          bgp deterministic-med
          ! Neighbors
          neighbor swp51 interface remote-as external
          neighbor swp51 timers 3 9
          neighbor swp51 timers connect 10
          neighbor swp51 advertisement-interval 0
          neighbor swp51 capability extended-nexthop
          neighbor swp52 interface remote-as external
          neighbor swp52 timers 3 9
          neighbor swp52 timers connect 10
          neighbor swp52 advertisement-interval 0
          neighbor swp52 capability extended-nexthop
          ! Address families
          address-family ipv4 unicast
          network 10.10.10.3/32
          maximum-paths ibgp 64
          maximum-paths 64
          distance bgp 20 200 200
          neighbor swp51 activate
          neighbor swp52 activate
          exit-address-family
          ! end of router bgp 65103 vrf default
          
          cumulus@spine01:mgmt:~$ sudo cat /etc/frr/frr.conf
          ...
          vrf default
          exit-vrf
          vrf mgmt
          exit-vrf
          router bgp 65199 vrf default
          bgp router-id 10.10.10.101
          timers bgp 3 9
          bgp deterministic-med
          ! Neighbors
          neighbor swp1 interface remote-as external
          neighbor swp1 timers 3 9
          neighbor swp1 timers connect 10
          neighbor swp1 advertisement-interval 0
          neighbor swp1 capability extended-nexthop
          neighbor swp3 interface remote-as external
          neighbor swp3 timers 3 9
          neighbor swp3 timers connect 10
          neighbor swp3 advertisement-interval 0
          neighbor swp3 capability extended-nexthop
          ! Address families
          address-family ipv4 unicast
          network 10.10.10.101/32
          maximum-paths ibgp 64
          maximum-paths 64
          distance bgp 20 200 200
          neighbor swp1 activate
          neighbor swp3 activate
          exit-address-family
          ! end of router bgp 65199 vrf default
          
          cumulus@spine02:mgmt:~$ sudo cat /etc/frr/frr.conf
          ...
          vrf default
          exit-vrf
          vrf mgmt
          exit-vrf
          router bgp 65199 vrf default
          bgp router-id 10.10.10.102
          timers bgp 3 9
          bgp deterministic-med
          ! Neighbors
          neighbor swp1 interface remote-as external
          neighbor swp1 timers 3 9
          neighbor swp1 timers connect 10
          neighbor swp1 advertisement-interval 0
          neighbor swp1 capability extended-nexthop
          neighbor swp3 interface remote-as external
          neighbor swp3 timers 3 9
          neighbor swp3 timers connect 10
          neighbor swp3 advertisement-interval 0
          neighbor swp3 capability extended-nexthop
          ! Address families
          address-family ipv4 unicast
          network 10.10.10.102/32
          maximum-paths ibgp 64
          maximum-paths 64
          distance bgp 20 200 200
          neighbor swp1 activate
          neighbor swp3 activate
          exit-address-family
          ! end of router bgp 65199 vrf default
          

          This simulation is running Cumulus Linux 5.11. The Cumulus Linux 5.12 simulation is coming soon.

          The simulation starts with the example GRE configuration. The demo is pre-configured using NVUE commands.

          To validate the configuration, run the commands listed in the troubleshooting section.

          Network Address Translation - NAT

          Network Address Translation (NAT) enables your network to use one set of IP addresses for internal traffic and a second set of addresses for external traffic.

          NAT helps overcome the address shortage caused by the rapid growth of the Internet. In addition to conserving IPv4 addresses, NAT enables you to use the private address space internally and still have a way to access the Internet.

          Cumulus Linux supports both static NAT and dynamic NAT. Static NAT provides a permanent mapping between one private IP address and a single public address. Dynamic NAT maps private IP addresses to public addresses; these public IP addresses come from a pool. Cumulus Linux creates the translations as needed dynamically, so that a large number of private addresses can share a smaller pool of public addresses. You can enable both static NAT and dynamic NAT at the same time.
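
As a rough mental model of dynamic NAT, the following Python sketch hands out public addresses from a pool on first use and returns them to the pool on release. It is a toy model, not how the switch implements translation: the DynamicNatPool class is hypothetical, the pool addresses are made up for the example, and ports and translation timeouts are ignored.

```python
import ipaddress


class DynamicNatPool:
    """Toy model: assign public addresses on demand and reuse them when released."""

    def __init__(self, public_addresses):
        self.free = [ipaddress.ip_address(a) for a in public_addresses]
        self.active = {}  # private address -> public address

    def translate(self, private):
        """Return the public address for a private address, allocating if needed."""
        private = ipaddress.ip_address(private)
        if private not in self.active:
            if not self.free:
                raise RuntimeError("public address pool exhausted")
            self.active[private] = self.free.pop(0)
        return self.active[private]

    def release(self, private):
        """Remove a translation and return its public address to the pool."""
        public = self.active.pop(ipaddress.ip_address(private))
        self.free.append(public)


if __name__ == "__main__":
    pool = DynamicNatPool(["172.30.58.80", "172.30.58.81"])  # hypothetical pool
    print(pool.translate("10.0.0.1"))
    print(pool.translate("10.0.0.2"))
```

A static NAT entry, by contrast, is a fixed pair that exists whether or not traffic is flowing.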

          Static and dynamic NAT both support:

          Static NAT supports double NAT (also known as twice NAT) where the switch translates both the source and destination IP addresses as a packet crosses address realms. You use double NAT when the address space in a private network overlaps with IP addresses in the public space.
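
The following Python sketch illustrates the idea of double NAT on a single packet: one lookup rewrites the source address and a second lookup rewrites the destination address. All addresses and both mapping tables are hypothetical values chosen for the example, and the double_nat function is a conceptual model rather than switch behavior.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


# Hypothetical static mappings for the example.
SOURCE_MAP = {"10.0.0.1": "172.30.58.80"}  # inside host -> address seen outside
DEST_MAP = {"192.168.100.5": "10.0.0.5"}   # alias used inside -> real remote host


def double_nat(packet):
    """Translate both the source and the destination; unmatched fields pass through."""
    return replace(
        packet,
        src=SOURCE_MAP.get(packet.src, packet.src),
        dst=DEST_MAP.get(packet.dst, packet.dst),
    )


if __name__ == "__main__":
    print(double_nat(Packet(src="10.0.0.1", dst="192.168.100.5")))
```

In this model the inside host never sees the overlapping remote address 10.0.0.5; it sends to the alias, and the translation happens as the packet crosses realms.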

          The following illustration shows a basic NAT configuration.

          Static NAT

          Static NAT provides a one-to-one mapping between a private IP address inside your network and a public IP address. For example, if you have a web server with the private IP address 10.0.0.10 and you want a remote host to make a request to the web server using the IP address 172.30.58.80, you configure a static NAT mapping between the two IP addresses.
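
The web server example can be modeled as a fixed pair of lookups, one per direction. This Python sketch is a conceptual illustration only; the inbound and outbound function names and the client address 198.51.100.7 are assumptions that do not come from the switch configuration.

```python
# One-to-one mapping from the example: private web server <-> public address.
PRIVATE_TO_PUBLIC = {"10.0.0.10": "172.30.58.80"}
PUBLIC_TO_PRIVATE = {pub: priv for priv, pub in PRIVATE_TO_PUBLIC.items()}


def inbound(src, dst):
    """Request from outside: rewrite the public destination to the private address."""
    return src, PUBLIC_TO_PRIVATE.get(dst, dst)


def outbound(src, dst):
    """Reply from inside: rewrite the private source to the public address."""
    return PRIVATE_TO_PUBLIC.get(src, src), dst


if __name__ == "__main__":
    client = "198.51.100.7"  # hypothetical remote host
    print(inbound(client, "172.30.58.80"))
    print(outbound("10.0.0.10", client))
```

Because the mapping is fixed, the remote host always reaches the same server at 172.30.58.80, and replies always appear to come from that address.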

          Static NAT entries do not time out from the translation table.

          Cumulus Linux also supports MAC address translation, which operates on Ethernet packets at layer 2. For more information, refer to MAC Address Translation.

          Configure Static NAT

          NVUE commands require you to configure an inbound or outbound interface for static NAT rules. However, rules you configure in a rules file in the /etc/cumulus/acl/policy.d/ directory do not require an inbound or outbound interface.

          The following rule matches TCP packets with source IP address 10.0.0.1 coming in on interface swp51 and translates the IP address to 172.30.58.80:

          cumulus@switch:~$ nv set acl acl_1 type ipv4
          cumulus@switch:~$ nv set acl acl_1 rule 1 match ip protocol tcp 
          cumulus@switch:~$ nv set acl acl_1 rule 1 match ip source-ip 10.0.0.1
          cumulus@switch:~$ nv set acl acl_1 rule 1 action source-nat translate-ip 172.30.58.80
          cumulus@switch:~$ nv set interface swp51 acl acl_1 inbound 
          cumulus@switch:~$ nv config apply 
          

          The following rule matches ICMP packets with destination IP address 172.30.58.80 coming in on interface swp51 and translates the IP address to 10.0.0.1:

          cumulus@switch:~$ nv set acl acl_2 type ipv4
          cumulus@switch:~$ nv set acl acl_2 rule 1 match ip protocol icmp 
          cumulus@switch:~$ nv set acl acl_2 rule 1 match ip dest-ip 172.30.58.80
          cumulus@switch:~$ nv set acl acl_2 rule 1 action dest-nat translate-ip 10.0.0.1
          cumulus@switch:~$ nv set interface swp51 acl acl_2 inbound 
          cumulus@switch:~$ nv config apply 
          

          The following rule matches UDP packets with source IP address 10.0.0.1 and source port 5000 going out of swp6, and translates the IP address to 172.30.58.80 and the port to 6000.

          cumulus@switch:~$ nv set acl acl_3 type ipv4 
          cumulus@switch:~$ nv set acl acl_3 rule 1 match ip protocol udp 
          cumulus@switch:~$ nv set acl acl_3 rule 1 match ip source-ip 10.0.0.1
          cumulus@switch:~$ nv set acl acl_3 rule 1 match ip udp source-port 5000
          cumulus@switch:~$ nv set acl acl_3 rule 1 action source-nat translate-ip 172.30.58.80
          cumulus@switch:~$ nv set acl acl_3 rule 1 action source-nat translate-port 6000
          cumulus@switch:~$ nv set interface swp6 acl acl_3 outbound 
          cumulus@switch:~$ nv config apply
          

          The following rule matches UDP packets with destination IP address 172.30.58.80 and destination port 6000 coming in on interface swp51, and translates the IP address to 10.0.0.1 and the port to 5000.

          cumulus@switch:~$ nv set acl acl_4 type ipv4
          cumulus@switch:~$ nv set acl acl_4 rule 1 match ip protocol udp
          cumulus@switch:~$ nv set acl acl_4 rule 1 match ip dest-ip 172.30.58.80
          cumulus@switch:~$ nv set acl acl_4 rule 1 match ip udp dest-port 6000
          cumulus@switch:~$ nv set acl acl_4 rule 1 action dest-nat translate-ip 10.0.0.1
          cumulus@switch:~$ nv set acl acl_4 rule 1 action dest-nat translate-port 5000
          cumulus@switch:~$ nv set interface swp51 acl acl_4 inbound
          cumulus@switch:~$ nv config apply 
          

          To create rules, use cl-acltool.

          To add NAT rules using cl-acltool, either edit an existing file in the /etc/cumulus/acl/policy.d directory and add rules under [iptables] or create a new file in the /etc/cumulus/acl/policy.d directory and add rules under an [iptables] section. For example:

          cumulus@switch:~$ sudo nano /etc/cumulus/acl/policy.d/60_nat.rules
          [iptables]
          
           #Add rule
          

          Example Rules

          The following rule matches TCP packets with source IP address 10.0.0.1 and translates the IP address to 172.30.58.80:

          -t nat -A POSTROUTING -s 10.0.0.1 -p tcp -j SNAT --to-source 172.30.58.80
          

          The following rule matches ICMP packets with destination IP address 172.30.58.80 on interface swp51 and translates the IP address to 10.0.0.1:

          -t nat -A PREROUTING -d 172.30.58.80 -p icmp --in-interface swp51 -j DNAT --to-destination 10.0.0.1
          

          The following rule matches UDP packets with source IP address 10.0.0.1 and source port 5000, and translates the IP address to 172.30.58.80 and the port to 6000.

          -t nat -A POSTROUTING -s 10.0.0.1 -p udp --sport 5000 -j SNAT --to-source 172.30.58.80:6000
          

          The following rule matches UDP packets with destination IP address 172.30.58.80 and destination port 6000 on interface swp51, and translates the IP address to 10.0.0.1 and the port to 5000.

          -t nat -A PREROUTING -d 172.30.58.80 -p udp --dport 6000 --in-interface swp51  -j DNAT --to-destination 10.0.0.1:5000
          

          When you configure a static SNAT rule for outgoing traffic, you must also configure a static DNAT rule for the reverse traffic so that traffic goes in both directions.
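The pairing requirement above can be illustrated with a minimal Python sketch that formats both rule lines for one mapping, using the same iptables syntax as the example rules. The function name static_nat_pair and its parameters are hypothetical and are not part of Cumulus Linux or cl-acltool; the sketch only builds text and does not install anything on a switch.

```python
def static_nat_pair(private_ip, public_ip, protocol, in_interface=None):
    """Return (snat_rule, dnat_rule) strings for one static mapping.

    Hypothetical helper: it only formats text in the same form as the
    example rules above.
    """
    # Outgoing traffic: rewrite the private source address.
    snat = (f"-t nat -A POSTROUTING -s {private_ip} -p {protocol} "
            f"-j SNAT --to-source {public_ip}")
    # Reverse traffic: rewrite the public destination address back.
    dnat = f"-t nat -A PREROUTING -d {public_ip} -p {protocol} "
    if in_interface:
        dnat += f"--in-interface {in_interface} "
    dnat += f"-j DNAT --to-destination {private_ip}"
    return snat, dnat


snat_rule, dnat_rule = static_nat_pair("10.0.0.1", "172.30.58.80", "tcp", "swp51")
print(snat_rule)
print(dnat_rule)
```

You can paste the two printed lines under the [iptables] section of a rules file, so that the mapping covers both directions.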

          Delete a Static NAT Rule

          To delete a static NAT rule:

          Run the nv unset acl <acl> command.

          cumulus@switch:~$ nv unset acl acl_1
          cumulus@switch:~$ nv config apply 
          
          Remove the rule from the policy file in the /etc/cumulus/acl/policy.d directory, then run the sudo cl-acltool -i command.

          Dynamic NAT

          Dynamic NAT maps private IP addresses and ports to a public IP address and port range or a public IP address range and port range. Cumulus Linux assigns IP addresses from a pool of addresses dynamically. When the switch releases entries after a period of inactivity, it maps new incoming connections dynamically to the freed up addresses and ports.
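The allocate-and-release behavior described above can be pictured with a small Python model. This is a toy sketch, not the switch implementation: the class name, the choice of which free slot to reuse first, and the use of whole minutes as timestamps are all assumptions made for illustration.

```python
import itertools


class DynamicNatPool:
    """Toy model of a dynamic NAT translation table (illustrative only)."""

    def __init__(self, public_ips, ports, age_minutes=5):
        # Every (public IP, port) pair is a slot that one flow can hold.
        self.free = [(ip, port) for ip, port in itertools.product(public_ips, ports)]
        self.age_minutes = age_minutes
        self.entries = {}  # private (ip, port) -> [public slot, last-used minute]

    def translate(self, private, now):
        """Return the public slot for a private endpoint, allocating if needed."""
        self.expire(now)
        if private not in self.entries:
            if not self.free:
                raise RuntimeError("translation pool exhausted")
            self.entries[private] = [self.free.pop(0), now]
        self.entries[private][1] = now  # refresh the inactivity timer
        return self.entries[private][0]

    def expire(self, now):
        """Release entries idle for at least age_minutes back to the pool."""
        for private, (slot, last_used) in list(self.entries.items()):
            if now - last_used >= self.age_minutes:
                del self.entries[private]
                self.free.append(slot)


pool = DynamicNatPool(["172.30.58.80"], range(1024, 1026), age_minutes=5)
first = pool.translate(("10.0.0.1", 5000), now=0)
second = pool.translate(("10.0.0.2", 5000), now=1)
# At minute 5 the first entry has been idle for five minutes, so its slot is freed
# and a new connection can take it over.
third = pool.translate(("10.0.0.3", 5000), now=5)
print(first, second, third)
```

In this model the third connection reuses the address and port that the first connection held, because that entry aged out.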

          Enable Dynamic NAT

          To use dynamic NAT, you must enable dynamic mode.

          cumulus@switch:~$ nv set system nat mode dynamic
          

          Edit the /etc/cumulus/switchd.conf file and uncomment the nat.dynamic_enable = TRUE option, then restart switchd:

          cumulus@switch:~$ sudo nano /etc/cumulus/switchd.conf
          ...
          # NAT configuration
          # Enables NAT
          nat.dynamic_enable = TRUE
          ...
          
          cumulus@switch:~$ sudo systemctl restart switchd.service

          Restarting the switchd service causes all network ports to reset, interrupting network services, in addition to resetting the switch hardware configuration.

          Optional Dynamic NAT Settings

          You can customize the following dynamic NAT settings.

          Setting | Description
          age-poll-interval | The period of inactivity (in minutes) before Cumulus Linux releases a NAT entry from the translation table. You can set a value between 1 and 1440. The default value is 5.
          translate-table-size | The maximum number of dynamic snat and dnat entries in the translation table. You can set a value between 1024 and 8192. The default value is 1024.
          rule-table-size | The maximum number of rules allowed. You can set a value between 64 and 1024. The default value is 64.

          The following example sets:

          • The period of inactivity before Cumulus Linux releases a NAT entry from the translation table to 10 minutes.
          • The maximum number of dynamic snat and dnat entries in the translation table to 2048.
          • The maximum number of rules allowed to 100.
          cumulus@switch:~$ nv set system nat age-poll-interval 10
          cumulus@switch:~$ nv set system nat translate-table-size 2048
          cumulus@switch:~$ nv set system nat rule-table-size 100
          cumulus@switch:~$ nv config apply
          

          The /etc/cumulus/switchd.conf file includes the following configuration options for dynamic NAT. Only change these options if you enable dynamic NAT.

          Setting | Description
          nat.age_poll_interval | The period of inactivity (in minutes) before switchd releases a NAT entry from the translation table. You can set a value between 1 and 1440. The default value is 5.
          nat.table_size | The maximum number of dynamic snat and dnat entries in the translation table. You can set a value between 512 and 8192. The default value is 1024.
          nat.config_table_size | The maximum number of rules allowed. You can set a value between 64 and 1024. The default value is 64.
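Because a change to these options requires a switchd restart, it can help to check values against the documented ranges before you edit the file. The following Python sketch does only that. The function name and the dictionary are hypothetical; the ranges are copied from the table above.

```python
# Allowed ranges copied from the switchd.conf settings table above.
NAT_SETTING_RANGES = {
    "nat.age_poll_interval": (1, 1440),
    "nat.table_size": (512, 8192),
    "nat.config_table_size": (64, 1024),
}


def check_nat_settings(settings):
    """Return a list of error strings for out-of-range or unknown settings."""
    errors = []
    for name, value in settings.items():
        if name not in NAT_SETTING_RANGES:
            errors.append(f"{name}: unknown setting")
            continue
        low, high = NAT_SETTING_RANGES[name]
        if not low <= value <= high:
            errors.append(f"{name}: {value} is outside {low}-{high}")
    return errors


ok = check_nat_settings({"nat.age_poll_interval": 10, "nat.table_size": 2048})
bad = check_nat_settings({"nat.config_table_size": 32})
print(ok)
print(bad)
```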

          After you change any of the dynamic NAT configuration options, restart switchd.

          cumulus@switch:~$ sudo systemctl restart switchd.service

          Restarting the switchd service causes all network ports to reset, interrupting network services, in addition to resetting the switch hardware configuration.

          Configure Dynamic NAT

          For dynamic NAT, create a rule that matches an IP address in CIDR notation and translates the address to a public IP address or IP address range.

          For dynamic PAT, create a rule that matches an IP address in CIDR notation and translates the address to a public IP address and port range or an IP address range and port range. You can also match on an IP address in CIDR notation and port.

          NVUE commands require that you configure an inbound or outbound interface for dynamic NAT rules. However, rules that you configure in a rules file in the /etc/cumulus/acl/policy.d/ directory do not require an inbound or outbound interface.

          Example Rules

          The following rule matches TCP packets with source IP address in the range 10.0.0.0/24 going out of swp5 and translates the address dynamically to an IP address in the range 172.30.58.0-172.30.58.80.

          cumulus@switch:~$ nv set acl acl_1 type ipv4
          cumulus@switch:~$ nv set acl acl_1 rule 1 match ip protocol tcp 
          cumulus@switch:~$ nv set acl acl_1 rule 1 match ip source-ip 10.0.0.0/24 
          cumulus@switch:~$ nv set acl acl_1 rule 1 action source-nat translate-ip 172.30.58.0 to 172.30.58.80
          cumulus@switch:~$ nv set interface swp5 acl acl_1 outbound 
          cumulus@switch:~$ nv config apply 
          

          The following rule matches UDP packets with source IP address in the range 10.0.0.0/24 going out of swp5 and translates the addresses dynamically to IP address 172.30.58.80 with layer 4 ports in the range 1024-1200:

          cumulus@switch:~$ nv set acl acl_2 type ipv4
          cumulus@switch:~$ nv set acl acl_2 rule 1 match ip protocol udp
          cumulus@switch:~$ nv set acl acl_2 rule 1 match ip source-ip 10.0.0.0/24 
          cumulus@switch:~$ nv set acl acl_2 rule 1 action source-nat translate-ip 172.30.58.80
          cumulus@switch:~$ nv set acl acl_2 rule 1 action source-nat translate-port 1024-1200
          cumulus@switch:~$ nv set interface swp5 acl acl_2 outbound
          cumulus@switch:~$ nv config apply 
          

          The following rule matches UDP packets with source IP address in the range 10.0.0.0/24 on source port 5000 coming in on swp6 and translates the addresses dynamically to IP address 172.30.58.80 with layer 4 ports in the range 1024-1200:

          cumulus@switch:~$ nv set acl acl_3 type ipv4
          cumulus@switch:~$ nv set acl acl_3 rule 1 match ip protocol udp 
          cumulus@switch:~$ nv set acl acl_3 rule 1 match ip source-ip 10.0.0.0/24
          cumulus@switch:~$ nv set acl acl_3 rule 1 match ip udp source-port 5000
          cumulus@switch:~$ nv set acl acl_3 rule 1 action source-nat translate-ip 172.30.58.80
          cumulus@switch:~$ nv set acl acl_3 rule 1 action source-nat translate-port 1024-1200
          cumulus@switch:~$ nv set interface swp6 acl acl_3 inbound 
          cumulus@switch:~$ nv config apply 
          

          The following rule matches TCP packets with destination IP address in the range 10.1.0.0/24 coming in on swp6 and translates the address dynamically to IP address range 172.30.58.0-172.30.58.80 with layer 4 ports in the range 1024-1200:

          cumulus@switch:~$ nv set acl acl_4 type ipv4
          cumulus@switch:~$ nv set acl acl_4 rule 1 match ip protocol tcp 
          cumulus@switch:~$ nv set acl acl_4 rule 1 match ip dest-ip 10.1.0.0/24 
          cumulus@switch:~$ nv set acl acl_4 rule 1 action dest-nat translate-ip 172.30.58.0 to 172.30.58.80
          cumulus@switch:~$ nv set acl acl_4 rule 1 action dest-nat translate-port 1024-1200
          cumulus@switch:~$ nv set interface swp6 acl acl_4 inbound 
          cumulus@switch:~$ nv config apply
          

          The following rule matches ICMP packets with source IP address in the range 10.0.0.0/24 and destination IP address in the range 10.1.0.0/24 coming in on swp6. The rule translates the address dynamically to IP address range 172.30.58.0-172.30.58.80 with layer 4 ports in the range 1024-1200:

          cumulus@switch:~$ nv set acl acl_5 type ipv4 
          cumulus@switch:~$ nv set acl acl_5 rule 1 match ip protocol icmp 
          cumulus@switch:~$ nv set acl acl_5 rule 1 match ip source-ip 10.0.0.0/24 
          cumulus@switch:~$ nv set acl acl_5 rule 1 match ip dest-ip 10.1.0.0/24 
          cumulus@switch:~$ nv set acl acl_5 rule 1 action source-nat translate-ip 172.30.58.0 to 172.30.58.80
          cumulus@switch:~$ nv set acl acl_5 rule 1 action source-nat translate-port 1024-1200
          cumulus@switch:~$ nv set interface swp6 acl acl_5 inbound
          cumulus@switch:~$ nv config apply
          

          To add NAT rules using cl-acltool, either edit an existing file in the /etc/cumulus/acl/policy.d directory and add rules under [iptables] or create a new file in the /etc/cumulus/acl/policy.d directory and add rules under an [iptables] section. For example:

          cumulus@switch:~$ sudo nano /etc/cumulus/acl/policy.d/60_nat.rules
          [iptables]
          
           #Add rule
          

          Example Rules

          The following rule matches TCP packets with source IP address in the range 10.0.0.0/24 on outbound interface swp5 and translates the address dynamically to an IP address in the range 172.30.58.0-172.30.58.80.

          -t nat -A POSTROUTING -s 10.0.0.0/24 --out-interface swp5 -p tcp -j SNAT --to-source 172.30.58.0-172.30.58.80
          

          The following rule matches UDP packets with source IP address in the range 10.0.0.0/24 and translates the addresses dynamically to IP address 172.30.58.80 with layer 4 ports in the range 1024-1200:

          -t nat -A POSTROUTING -s 10.0.0.0/24 -p udp -j SNAT --to-source 172.30.58.80:1024-1200
          

          The following rule matches UDP packets with source IP address in the range 10.0.0.0/24 on source port 5000 and translates the addresses dynamically to IP address 172.30.58.80 with layer 4 ports in the range 1024-1200:

          -t nat -A POSTROUTING -s 10.0.0.0/24 -p udp --sport 5000 -j SNAT --to-source 172.30.58.80:1024-1200
          

          The following rule matches TCP packets with destination IP address in the range 10.1.0.0/24 and translates the address dynamically to IP address range 172.30.58.0-172.30.58.80 with layer 4 ports in the range 1024-1200:

          -t nat -A PREROUTING -d 10.1.0.0/24 -p tcp -j DNAT --to-destination 172.30.58.0-172.30.58.80:1024-1200
          

          The following rule matches ICMP packets with source IP address in the range 10.0.0.0/24 and destination IP address in the range 10.1.0.0/24. The rule translates the address dynamically to IP address range 172.30.58.0-172.30.58.80 with layer 4 ports in the range 1024-1200:

          -t nat -A POSTROUTING -s 10.0.0.0/24 -d 10.1.0.0/24 -p icmp -j SNAT --to-source 172.30.58.0-172.30.58.80:1024-1200
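To see how many address and port combinations a translation target such as the ones above offers, you can split the --to-source or --to-destination value into its parts. The following Python sketch is one way to do that for IPv4 targets; the function names are hypothetical. The count describes the rule text only; the translation table size setting separately limits how many entries can exist at once.

```python
import ipaddress


def parse_nat_target(spec):
    """Split 'IP[-IP][:PORT[-PORT]]' into (first_ip, last_ip, first_port, last_port).

    Ports are None when the spec has no port part. IPv4 only.
    """
    addr_part, _, port_part = spec.partition(":")
    first_ip, _, last_ip = addr_part.partition("-")
    first = ipaddress.IPv4Address(first_ip)
    last = ipaddress.IPv4Address(last_ip or first_ip)
    if port_part:
        lo, _, hi = port_part.partition("-")
        ports = (int(lo), int(hi or lo))
    else:
        ports = (None, None)
    return first, last, ports[0], ports[1]


def pool_size(spec):
    """Count the address (or address and port) combinations a target offers."""
    first, last, lo, hi = parse_nat_target(spec)
    addresses = int(last) - int(first) + 1
    ports = 1 if lo is None else hi - lo + 1
    return addresses * ports


print(pool_size("172.30.58.0-172.30.58.80"))
print(pool_size("172.30.58.80:1024-1200"))
print(pool_size("172.30.58.0-172.30.58.80:1024-1200"))
```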
          

          Delete a Dynamic NAT Rule

          To delete a dynamic NAT rule:

          Run the nv unset acl <acl> command:

          cumulus@switch:~$ nv unset acl acl_1
          cumulus@switch:~$ nv config apply
          
          Remove the rule from the policy file in the /etc/cumulus/acl/policy.d directory, then run the sudo cl-acltool -i command.

          Show Configured NAT Rules

          To see the NAT rules configured on the switch, run the NVUE nv show acl <acl> --applied -o=json command, or the Linux sudo iptables -t nat -v -L or sudo cl-acltool -L ip -v commands. For example:

          cumulus@switch:~$ nv show acl acl_5 --applied -o=json
          {
            "rule": {
              "1": {
                "action": {
                  "source-nat": {
                    "translate-ip": {
                      "172.30.58.0": {
                        "to": "172.30.58.80"
                      }
                    },
                    "translate-port": {
                      "1024-1200": {}
                    }
                  }
                },
                "match": {
                  "ip": {
                    "dest-ip": "10.1.0.0/24",
                    "protocol": "icmp",
                    "source-ip": "10.0.0.0/24"
                  }
                }
              }
            },
            "type": "ipv4"
          }
          
          cumulus@switch:~$ sudo iptables -t nat -v -L -n
          ...
           pkts bytes target     prot opt in     out     source               destination         
              0     0 SNAT       icmp --  *      swp6    10.0.0.0/24          10.1.0.0/24          /* rule_id:1,acl_name:acl_5,dir:outbound,interface_id:swp6 */ to:172.30.58.0-172.30.58.80:1024-1200
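If you collect the JSON output from several ACLs, a short script can condense it to one line per NAT action. The following Python sketch works on a compact copy of the sample output above. The function name is hypothetical, and the handling of a translate-ip entry without a "to" key (a single address) is an assumption, because the sample shows only a range.

```python
import json

# Compact copy of the sample "nv show acl acl_5 --applied -o=json" output above.
SAMPLE_OUTPUT = """
{"rule": {"1": {
    "action": {"source-nat": {
        "translate-ip": {"172.30.58.0": {"to": "172.30.58.80"}},
        "translate-port": {"1024-1200": {}}}},
    "match": {"ip": {"dest-ip": "10.1.0.0/24", "protocol": "icmp",
                     "source-ip": "10.0.0.0/24"}}}},
 "type": "ipv4"}
"""


def summarize_nat_rules(acl_json):
    """Return one 'rule N: action ip[:ports]' string per NAT action found."""
    acl = json.loads(acl_json)
    lines = []
    for rule_id, rule in acl.get("rule", {}).items():
        for action, detail in rule.get("action", {}).items():
            if action not in ("source-nat", "dest-nat"):
                continue
            for start, extra in detail.get("translate-ip", {}).items():
                # Assumption: a single address has no "to" key.
                target = f"{start}-{extra['to']}" if "to" in extra else start
                ports = ",".join(detail.get("translate-port", {}))
                if ports:
                    target += f":{ports}"
                lines.append(f"rule {rule_id}: {action} {target}")
    return lines


summary = summarize_nat_rules(SAMPLE_OUTPUT)
print(summary)
```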
          

          Show Conntrack Flows

          To see the active connection tracking (conntrack) flows, run the sudo cat /proc/net/nf_conntrack command. The hardware offloaded flows contain [OFFLOAD] in the output.

          cumulus@switch:~$ sudo cat /proc/net/nf_conntrack
          ipv4     2 udp      17 src=172.30.10.5 dst=10.0.0.2 sport=5001 dport=5000 src=10.0.0.2 dst=10.1.0.10 sport=6000 dport=1026 [OFFLOAD] mark=0 zone=0 use=2
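To filter a long conntrack listing for hardware offloaded flows, you can parse each line into its fields. The following Python sketch handles lines in the form shown above; the function name and the layout of the returned dictionary are hypothetical. In conntrack output, the first src, dst, sport, and dport values describe the original direction and the second set describes the reply direction.

```python
def parse_conntrack_line(line):
    """Parse one /proc/net/nf_conntrack line into a dict.

    Keys that appear twice (src, dst, sport, dport) are collected in order:
    the first value belongs to the original direction, the second to the reply.
    """
    fields = line.split()
    flow = {"l3proto": fields[0], "l4proto": fields[2],
            "flags": [], "tuples": {}, "attrs": {}}
    for token in fields[4:]:
        if token.startswith("[") and token.endswith("]"):
            flow["flags"].append(token.strip("[]"))
        elif "=" in token:
            key, value = token.split("=", 1)
            if key in ("src", "dst", "sport", "dport"):
                flow["tuples"].setdefault(key, []).append(value)
            else:
                flow["attrs"][key] = value
        # Tokens without "=" (for example a timeout or state field) are skipped.
    flow["offloaded"] = "OFFLOAD" in flow["flags"]
    return flow


SAMPLE_LINE = ("ipv4     2 udp      17 src=172.30.10.5 dst=10.0.0.2 sport=5001 "
               "dport=5000 src=10.0.0.2 dst=10.1.0.10 sport=6000 dport=1026 "
               "[OFFLOAD] mark=0 zone=0 use=2")
flow = parse_conntrack_line(SAMPLE_LINE)
print(flow["offloaded"], flow["tuples"]["src"])
```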
          

          Considerations

          When using NAT, you must enable proxy ARP for intra-subnet ARP requests when:

          To enable proxy ARP for intra-subnet ARP requests:

          NVUE does not provide commands for this setting.

          Edit the /etc/network/interfaces file to set /proc/sys/net/ipv4/conf/<interface>/proxy_arp_pvlan to 1 in the interface stanza, then run the ifreload -a command.

          cumulus@switch:~$ sudo nano /etc/network/interfaces
          ...
          auto swp1
          iface swp1
              post-up echo 1 > /proc/sys/net/ipv4/conf/swp1/proxy_arp_pvlan
          ...
          
          cumulus@switch:~$ sudo ifreload -a
          

          Bidirectional Forwarding Detection - BFD

          BFD provides low overhead and rapid detection of failures in the paths between two network devices. It provides a unified mechanism for link detection over all media and protocol layers. Use BFD to detect failures for IPv4 and IPv6 single or multihop paths between any two network devices, including unidirectional path failure detection.

          Cumulus Linux does not support:

          BFD Multihop Routed Paths

          BFD multihop sessions build over arbitrary paths between two systems, which results in some complexity that does not exist for single hop sessions. To avoid spoofing with multihop paths, configure the maximum hop count (max_hop_cnt) for each peer, which limits the number of hops for a BFD session. The switch drops all BFD packets exceeding the maximum hop count.

          Cumulus Linux supports multihop BFD sessions for both IPv4 and IPv6 peers.

          Configure BFD

          You can configure BFD with NVUE or vtysh commands or by specifying the configuration in the PTM topology.dot file. However, the topology file has some limitations:

          Use FRR to register multihop peers with PTM and BFD, and monitor the connectivity to the remote BGP multihop peer. FRR can dynamically register and unregister both IPv4 and IPv6 peers with BFD when the BFD-enabled peer connectivity starts or stops. Also, you can configure BFD parameters for each BGP or OSPF peer.

          The BFD parameter in the topology file takes precedence over the client-configured BFD parameters for a BFD session that both the topology file and FRR create.

          Every BFD interface requires an IP address. The neighbor IP address for a single hop BFD session must exist in the ARP table before BFD can start sending control packets.

          When you configure BFD, you can set the following parameters for both IPv4 and IPv6 sessions. If you do not set these parameters, Cumulus Linux uses the default values.

          BFD in BGP

          When you configure BFD in BGP, PTM registers and de-registers neighbors dynamically.

          To configure BFD in BGP, run the following commands.

          You can configure BFD for a peer group or for an individual neighbor.

          The following example configures BFD for swp51 and uses the default intervals.

          cumulus@switch:~$ nv set vrf default router bgp neighbor swp51 bfd enable on
          cumulus@switch:~$ nv config apply
          

          The following example configures BFD for the peer group fabric and sets the interval multiplier to 4, the minimum interval between received BFD control packets to 400, and the minimum interval for sending BFD control packets to 400.

          cumulus@switch:~$ nv set vrf default router bgp neighbor fabric bfd enable on
          cumulus@switch:~$ nv set vrf default router bgp neighbor fabric bfd detect-multiplier 4 
          cumulus@switch:~$ nv set vrf default router bgp neighbor fabric bfd min-rx-interval 400 
          cumulus@switch:~$ nv set vrf default router bgp neighbor fabric bfd min-tx-interval 400
          cumulus@switch:~$ nv config apply
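For a sense of what these values mean in practice, the following Python sketch computes the failure detection time for the peer group settings above. It uses the asynchronous-mode formula from RFC 5880: the detect multiplier received from the peer, multiplied by the larger of the local required minimum receive interval and the peer's desired minimum transmit interval. The sketch assumes that both peers use the same settings and that the intervals are in milliseconds; the function name is hypothetical.

```python
def detection_time_ms(remote_detect_mult, local_min_rx_ms, remote_min_tx_ms):
    """Asynchronous-mode detection time as defined in RFC 5880, section 6.8.4."""
    # The negotiated receive rate is the slower (larger) of the two intervals.
    return remote_detect_mult * max(local_min_rx_ms, remote_min_tx_ms)


# Peer group "fabric": multiplier 4, minimum receive and transmit intervals 400.
fabric_detection = detection_time_ms(4, 400, 400)
print(fabric_detection)
```

Under these assumptions, the local system declares the session down after about 1.6 seconds without a control packet from the peer.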
          

          The following example configures BFD for swp1 and uses the default intervals:

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# router bgp 65000
          switch(config-router)# neighbor swp1 bfd
          switch(config-router)# exit
          switch(config)# exit
          switch# write memory
          switch# exit
          

          The following example configures BFD for the peer group fabric and sets the interval multiplier to 4, the minimum interval between received BFD control packets to 400, and the minimum interval for sending BFD control packets to 400.

          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# router bgp 65000
          switch(config-router)# neighbor fabric bfd 4 400 400
          switch(config-router)# exit
          switch(config)# exit
          switch# write memory
          switch# exit
          

          The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

          ...
          router bgp 65101 vrf default
          bgp router-id 10.10.10.1
          ! Neighbors
          neighbor fabric peer-group
          neighbor fabric remote-as external
          neighbor fabric bfd 4 400 400
          ...
          

          To see neighbor information in BGP, including BFD status, run the vtysh show ip bgp neighbor <interface> command. For example:

          cumulus@switch:~$ sudo vtysh 
          switch# show ip bgp neighbor swp51
          ...
          BFD: Type: single hop
            Detect Mul: 4, Min Rx interval: 400, Min Tx interval: 400
            Status: Down, Last update: 0:00:00:08
          ...
          

          BFD in OSPF

          When you enable or disable BFD in OSPF, PTM registers and de-registers neighbors dynamically. When you enable BFD on the interface, a neighbor registers with BFD when two-way adjacency starts and de-registers when adjacency goes down. The BFD configuration is per interface and any IPv4 and IPv6 neighbors discovered on that interface inherit the configuration.

          The following example configures BFD in OSPF for interface swp1 and sets the interval multiplier to 4, the minimum interval between received BFD control packets to 400, and the minimum interval for sending BFD control packets to 400.

          cumulus@switch:~$ nv set interface swp1 router ospf bfd detect-multiplier 4
          cumulus@switch:~$ nv set interface swp1 router ospf bfd min-receive-interval 400
          cumulus@switch:~$ nv set interface swp1 router ospf bfd min-transmit-interval 400
          cumulus@switch:~$ nv config apply
          
          cumulus@switch:~$ sudo vtysh
          ...
          switch# configure terminal
          switch(config)# interface swp1
          switch(config-if)# ipv6 ospf6 bfd 4 400 400
          switch(config-if)# exit
          switch(config)# exit
          switch# write memory
          switch# exit
          

          The vtysh commands save the configuration in the /etc/frr/frr.conf file. For example:

          ...
          interface swp1
            ipv6 ospf6 bfd 4 400 400
            ...
          

          You can run different commands to show neighbor information in OSPF, including BFD status.

          Scripts

          ptmd executes the scripts at /etc/ptm.d/bfd-sess-down when BFD sessions go down and /etc/ptm.d/bfd-sess-up when BFD sessions go up. Modify these default scripts as needed.

          Echo Function

          Cumulus Linux supports the echo function for IPv4 single hops only, and with the asynchronous operating mode only (Cumulus Linux does not support demand mode).

          Use the echo function to test the forwarding path on a remote system. To enable the echo function, set echoSupport to 1 in the topology file.

          After the remote system loops the echo packets, the switch can send BFD control packets at a much lower rate. To configure this lower rate, set the slowMinTx parameter in the topology file to a non-zero value in milliseconds.

          You can use more aggressive detection times for echo packets because the round-trip time is shorter; echo packets use only the forwarding path. To configure the detection interval, set the echoMinRx parameter in the topology file. The minimum setting is 50 milliseconds. After you configure this setting, BFD control packets advertise this value as the required minimum echo Rx interval, which indicates to the peer that the local system can loop back the echo packets. The switch transmits echo packets only if the peer supports receiving them.

          About the Echo Packet

          Cumulus Linux encapsulates BFD echo packets into UDP packets over destination and source UDP port number 3785. The BFD echo packet format is vendor-specific. BFD echo packets that originate from Cumulus Linux are eight bytes long and have the following format:

          Byte 0 | Byte 1 | Byte 2 | Byte 3
          Version | Length | Reserved | Reserved
          My Discriminator (bytes 4-7)

          Where:

          Transmit and Receive Echo Packets

          Cumulus Linux transmits BFD echo packets for a BFD session only when the peer advertises a non-zero value for the required minimum echo receive interval (the echoMinRx setting) in the BFD control packet when the BFD session starts. The switch bases the transmit rate of the echo packets on the peer advertised echo receive value in the control packet.

          Cumulus Linux loops BFD echo packets back to the originating node for a BFD session only if you configure both echoMinRx and echoSupport locally to non-zero values.
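The eight-byte echo packet layout and the two conditions above can be sketched in Python as follows. This is an illustration under stated assumptions, not the ptmd implementation: it assumes one byte for each of the Version, Length, and Reserved fields, four bytes for My Discriminator, and network byte order. The version value 1 and all function names are hypothetical.

```python
import struct

ECHO_PACKET_LENGTH = 8  # bytes, as stated in the echo packet description


def build_echo_packet(version, my_discriminator):
    """Pack Version, Length, two Reserved bytes, and My Discriminator.

    Assumes one byte per header field and network byte order.
    """
    return struct.pack("!BBBBI", version, ECHO_PACKET_LENGTH, 0, 0, my_discriminator)


def should_transmit_echo(peer_echo_min_rx):
    """Transmit only when the peer advertises a non-zero echo receive interval."""
    return peer_echo_min_rx != 0


def should_loop_back_echo(local_echo_min_rx, local_echo_support):
    """Loop packets back only when both local settings are non-zero."""
    return local_echo_min_rx != 0 and local_echo_support != 0


# Hypothetical values: version 1 and an arbitrary discriminator.
packet = build_echo_packet(version=1, my_discriminator=0x0A0B0C0D)
print(packet.hex(), should_transmit_echo(50), should_loop_back_echo(50, 1))
```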

          Echo Function Parameters

          You configure the echo function by setting the following parameters in the topology file at the global, template and port level:

          Troubleshooting

          To troubleshoot BFD, run the Linux ptmctl -b command.

          cumulus@switch:~$ ptmctl -b
          
          ----------------------------------------------------------------------------------------
          port  peer                 state  local  type       diag  det   tx_timeout  rx_timeout
                                                                    mult
          ----------------------------------------------------------------------------------------
          swp1  fe80::202:ff:fe00:1  Up     N/A    singlehop  N/A   3     300         900
          swp1  3101:abc:bcad::2     Up     N/A    singlehop  N/A   3     300         900
          
          #continuation of output
          ---------------------------------------------------------------------
          echo        echo        max      rx_ctrl  tx_ctrl  rx_echo  tx_echo
          tx_timeout  rx_timeout  hop_cnt
          ---------------------------------------------------------------------
          0           0           N/A      187172   185986   0        0
          0           0           N/A      501      533      0        0
          

          Address Resolution Protocol - ARP

          ARP is a communication protocol that discovers the link layer address, such as a MAC address, associated with a network layer address. The Cumulus Linux ARP implementation differs from standard Debian Linux ARP behavior because Cumulus Linux is an operating system for routers and switches, not servers.

          For a definition of ARP, refer to RFC 826.

          Standard Debian ARP Behavior and the Tunable ARP Parameters

          Debian has these five tunable ARP parameters:

          For a full description of these parameters, refer to the Linux documentation.

          The standard Debian installation sets these ARP parameters to 0, leaving the router as wide open and unrestricted as possible. The Linux IP addresses are a property of the device, not an individual interface. Therefore, you can send an ARP request or reply on one interface with an address that resides on a different interface. While this unrestricted behavior makes sense for a server, it is not the normal behavior of a router. Routers expect the MAC and IP address mappings that ARP provides to match the physical topology, so that the IP addresses match the interfaces on which they reside. With these tunable ARP parameters, Cumulus Linux is able to specify the behavior to match the expectations of a router.
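The difference between the unrestricted server behavior and the behavior that a router expects can be sketched with the arp_ignore parameter. In the Linux kernel, a value of 0 replies for any local address on any interface, and a value of 1 replies only when the target address is configured on the interface that receives the request. The following Python sketch models only these two values; the function name, interface names, and addresses are hypothetical.

```python
def replies_to_arp(arp_ignore, target_ip, incoming_interface, addresses):
    """Decide whether to answer an ARP request under arp_ignore 0 or 1.

    addresses maps interface name -> set of local IP addresses (hypothetical data).
    """
    if arp_ignore == 0:
        # Any local address, regardless of the interface it is configured on.
        return any(target_ip in ips for ips in addresses.values())
    if arp_ignore == 1:
        # Only addresses configured on the interface that received the request.
        return target_ip in addresses.get(incoming_interface, set())
    raise ValueError("this sketch models arp_ignore values 0 and 1 only")


local = {"swp1": {"10.0.1.1"}, "swp2": {"10.0.2.1"}}
# A request for the swp2 address arrives on swp1.
server_like = replies_to_arp(0, "10.0.2.1", "swp1", local)
router_like = replies_to_arp(1, "10.0.2.1", "swp1", local)
print(server_like, router_like)
```

With a value of 0, the request that arrives on swp1 for the swp2 address gets a reply; with a value of 1, it does not.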

          ARP Tunable Parameter Settings in Cumulus Linux

          Parameter Default Setting Type Description
          arp_accept 0 BOOL Defines the behavior for gratuitous ARP frames when the IP address is not already in the ARP table:
          • 0: Do not create new entries in the ARP table.
          • 1: Create new entries in the ARP table.

          You can set arp_accept on an individual interface which differs from the rest of the switch (see below).
          arp_announce 2 INT Defines different restriction levels for announcing the local source IP address from IP packets in ARP requests sent on an interface:
          • 0: Use any local address configured on any interface.
          • 1: Avoid local addresses that are not in the target subnet for this interface. You can use this mode when target hosts reachable through this interface require the source IP address in ARP requests to be part of their logical network configured on the receiving interface. When Cumulus Linux generates the request, it checks all subnets that include the target IP address and preserves the source address if it is from such a subnet. If there is no such subnet, Cumulus Linux selects the source address according to the rules for level 2.
          • 2: Always use the best local address for this target. In this mode, Cumulus Linux ignores the source address in the IP packet and tries to select the local address preferred for talks with the target host. To select the local address, Cumulus Linux looks for primary IP addresses on all the subnets on the outgoing interface that include the target IP address. If there is no suitable local address, Cumulus Linux selects the first local address on the outgoing interface or on all other interfaces, so that it receives a reply for the request regardless of the announced source IP address.
          The default Debian behavior (arp_announce is 0) sends gratuitous ARPs or ARP requests using any local source IP address and does not limit the IP source of the ARP packet to an address residing on the interface that sends the packet.

          Routers expect a different relationship between the IP address and the physical network. Adjoining routers look for MAC and IP addresses to reach a next hop residing on a connecting interface for transiting traffic. By setting the arp_announce parameter to 2, Cumulus Linux uses the best local address for each ARP request, preferring the primary addresses on the interface that sends the ARP.
          arp_filter 0 BOOL
          • 0: The kernel can respond to ARP requests with addresses from other interfaces to increase the chance of successful communication. The complete host on Linux (not specific interfaces) owns the IP addresses. For more complex configurations, such as load balancing, this behavior can cause problems.
          • 1: Allows you to have multiple network interfaces on the same subnet and to answer the ARPs for each interface based on whether the kernel routes a packet from the ARPd IP address out of that interface (you must use source based routing).
          arp_filter for the interface is on if at least one of conf/{all,interface}/arp_filter is TRUE, it is off otherwise.

          Cumulus Linux uses the default Debian Linux arp_filter setting of 0.
          The switch uses arp_filter when multiple interfaces reside in the same subnet and allows certain interfaces to respond to ARP requests. For OSPF with IP unnumbered interfaces, multiple interfaces appear in the same subnet and contain the same address. If you use multiple interfaces between a pair of routers and set arp_filter to 1, forwarding can fail.

          The arp_filter parameter allows a response on any interface in the subnet, where the arp_ignore setting (below) limits cross-interface ARP behavior.
          arp_ignore 1 INT Defines different modes for sending replies in response to received ARP requests that resolve local target IP addresses:
          • 0: Reply for any local target IP address on any interface.
          • 1: Reply only if the target IP address is the local address on the incoming interface.
          • 2: Reply only if the target IP address is the local address on the incoming interface and the sender IP address is part of same subnet on this interface.
          • 3: Do not reply for local addresses with scope host; the switch replies only for global and link addresses.
          • 4-7: Reserved.
          • 8: Do not reply for all local addresses.
          The switch uses the maximum value from conf/{all,interface}/arp_ignore when the {interface} receives the ARP request.

          The default Debian arp_ignore setting of 0 allows the device to reply to an ARP request for any IP address on any interface. While this matches the expectation that an IP address belongs to the device, not an interface, it can cause some unexpected behavior on a router.

          For example, if arp_ignore is 0 and the switch receives an ARP request on one interface for the IP address residing on a different interface, the switch responds with an ARP reply even if the interface of the target address is down. This can cause traffic loss because the switch does not know if it can reach the next hops and results in troubleshooting challenges for failure conditions.

          If you set arp_ignore to 2, the switch only replies to ARP requests if the target IP address is a local address and both the sender and target IP addresses are part of the same subnet on the incoming interface. With this setting, the router does not create stale neighbor entries when a peer device sends an ARP request from a source IP address that is not on the connected subnet. Without it, the switch eventually sends ARP requests to the host to try to keep such an entry fresh and, if the host responds, the switch then has reachable neighbor entries for hosts that are not on the connected subnet.
          arp_notify 1 BOOL Defines the mode to notify address and device changes.
          • 0: Do nothing.
          • 1: Generate gratuitous ARP requests when the device comes up or the hardware address changes.
          The default Debian arp_notify setting is to remain silent when an interface comes up or the hardware address changes. Because Cumulus Linux often acts as a next hop for several end hosts, it notifies attached devices when an interface comes up or the address changes, which speeds up new information convergence and provides the most rapid support for changes.

          Change Tunable ARP Parameters

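          You can change the ARP parameter settings in several places, including:
          • /proc/sys/net/ipv4/conf/all/arp* (all interfaces)
          • /proc/sys/net/ipv4/conf/default/arp* (the default for future interfaces)
          • /proc/sys/net/ipv4/conf/swp*/arp* (port-specific settings)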

          The ARP parameter changes in Cumulus Linux use the default file locations.

          The all and default locations sound similar but they operate in different ways. The all location can potentially change the value for all interfaces running IP, both now and in the future. The all value applies to each parameter using either MAX or OR logic between the all and any port-specific settings, as the following table shows:

          ARP Parameter Condition
          arp_accept OR
          arp_announce MAX
          arp_filter OR
          arp_ignore MAX
          arp_notify MAX

          For example, if you set the /proc/sys/net/ipv4/conf/all/arp_ignore value to 1 and the /proc/sys/net/ipv4/conf/swp1/arp_ignore value to 0 to try to disable it on a per-port basis, interface swp1 still uses the value of 1; the port-specific setting does not override the global all setting. Instead, the MAX value between the all value and port-specific value defines the actual behavior.
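
The MAX and OR rules in the table can be illustrated with a small shell sketch. The function name effective_arp_value is hypothetical (it is not a Cumulus Linux command); it only models how the all value and a port-specific value combine, using the conditions listed in the table.

```shell
# Hypothetical helper: combine the "all" value and a port-specific value
# for an ARP parameter, using the MAX or OR rule from the table.
effective_arp_value() {
    param=$1 all_value=$2 port_value=$3
    case "$param" in
        arp_announce|arp_ignore|arp_notify)
            # MAX: the larger of the two values wins
            if [ "$all_value" -gt "$port_value" ]; then
                echo "$all_value"
            else
                echo "$port_value"
            fi
            ;;
        arp_accept|arp_filter)
            # OR: on (1) if either value is nonzero
            if [ "$all_value" -ne 0 ] || [ "$port_value" -ne 0 ]; then
                echo 1
            else
                echo 0
            fi
            ;;
        *)
            echo "unknown ARP parameter: $param" >&2
            return 1
            ;;
    esac
}

# all=1, swp1=0: swp1 still behaves as arp_ignore 1
effective_arp_value arp_ignore 1 0
```

Because the result for a MAX parameter can never be lower than the all value, lowering a parameter on one port only works when the all value is already at or below the value you want.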

          The default location /proc/sys/net/ipv4/conf/default/arp* defines the values for all future IP interfaces. Changing the default setting of an ARP parameter does not impact interfaces that already have an IP address. If you make changes to a running system that already has assigned IP addresses, use port-specific settings instead.

          Cumulus Linux copies the value of the default parameter to every port-specific location, excluding those that already have an IP address. There is no complicated logic between the default setting and the port-specific setting (unlike the all location).

          To determine the current ARP parameter settings for each of the locations, run the following commands:

          cumulus@switch:~$ sudo grep . /proc/sys/net/ipv4/conf/all/arp*
          /proc/sys/net/ipv4/conf/all/arp_accept:0
          /proc/sys/net/ipv4/conf/all/arp_announce:0
          /proc/sys/net/ipv4/conf/all/arp_filter:0
          /proc/sys/net/ipv4/conf/all/arp_ignore:0
          /proc/sys/net/ipv4/conf/all/arp_notify:1
          
          cumulus@switch:~$ sudo grep . /proc/sys/net/ipv4/conf/default/arp*
          /proc/sys/net/ipv4/conf/default/arp_accept:0
          /proc/sys/net/ipv4/conf/default/arp_announce:2
          /proc/sys/net/ipv4/conf/default/arp_filter:0
          /proc/sys/net/ipv4/conf/default/arp_ignore:1
          /proc/sys/net/ipv4/conf/default/arp_notify:1
          
          cumulus@switch:~$ sudo grep . /proc/sys/net/ipv4/conf/swp1/arp*
          /proc/sys/net/ipv4/conf/swp1/arp_accept:0
          /proc/sys/net/ipv4/conf/swp1/arp_announce:2
          /proc/sys/net/ipv4/conf/swp1/arp_filter:0
          /proc/sys/net/ipv4/conf/swp1/arp_ignore:1
          /proc/sys/net/ipv4/conf/swp1/arp_notify:1
          

          Cumulus Linux applies its ARP parameter changes at boot time using the arp.conf file in the following location:

          cumulus@switch:~$ cat /etc/sysctl.d/arp.conf
          net.ipv4.conf.default.arp_announce = 2
          net.ipv4.conf.all.arp_notify = 1
          net.ipv4.conf.default.arp_notify = 1
          net.ipv4.conf.default.arp_ignore=1
          

          Change Port-specific ARP Parameters

          To configure port-specific ARP parameters in a running device, run the following command:

          cumulus@switch:~$ sudo sh -c "echo 0 > /proc/sys/net/ipv4/conf/swp1/arp_ignore"
          cumulus@switch:~$ sudo grep . /proc/sys/net/ipv4/conf/swp1/arp*
          /proc/sys/net/ipv4/conf/swp1/arp_accept:0
          /proc/sys/net/ipv4/conf/swp1/arp_announce:2
          /proc/sys/net/ipv4/conf/swp1/arp_filter:0
          /proc/sys/net/ipv4/conf/swp1/arp_ignore:0
          /proc/sys/net/ipv4/conf/swp1/arp_notify:1
          
          

          To make the change persist through reboots, edit the /etc/sysctl.d/arp.conf file and add your port-specific ARP setting.
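
As a sketch of what such an entry can look like, the following hypothetical helper (sysctl_arp_line is not a Cumulus Linux command) prints a line in the same net.ipv4.conf.<location>.<parameter> = <value> form that the arp.conf file already uses. It assumes a simple interface name with no dots, such as swp1.

```shell
# Hypothetical helper: print a sysctl-style line for a port-specific
# ARP parameter, matching the format of the existing arp.conf entries.
sysctl_arp_line() {
    iface=$1 param=$2 value=$3
    printf 'net.ipv4.conf.%s.%s = %s\n' "$iface" "$param" "$value"
}

# Line to add to /etc/sysctl.d/arp.conf to keep arp_ignore at 0 on swp1
sysctl_arp_line swp1 arp_ignore 0
```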

          Configure Proxy ARP

          When you enable proxy ARP, if the switch receives an ARP request for which it has a route to the destination IP address, the switch sends a proxy ARP reply that contains its own MAC address. The host that sent the ARP request then sends its packets to the switch and the switch forwards the packets to the intended host.

          Proxy ARP works with IPv4 only; ARP is an IPv4-only protocol.

          The following example commands enable proxy ARP on swp1.

          NVUE does not provide commands for this setting.

          Edit the /etc/network/interfaces file to set /proc/sys/net/ipv4/conf/<interface>/proxy_arp to 1 in the interface stanza, then run the ifreload -a command.

          cumulus@switch:~$ sudo nano /etc/network/interfaces
          ...
          auto swp1
          iface swp1
              post-up echo 1 > /proc/sys/net/ipv4/conf/swp1/proxy_arp
          ...
          
          cumulus@switch:~$ sudo ifreload -a
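
To confirm that the post-up line took effect, you can read the value back from the same /proc path. The sketch below wraps that check in a hypothetical function, proxy_arp_enabled; the PROC_ROOT variable is an assumption added only so that you can try the function against a scratch directory instead of a live switch.

```shell
# Hypothetical helper: succeed if proxy_arp is 1 for the given interface.
# PROC_ROOT defaults to /proc; override it to test without a switch.
proxy_arp_enabled() {
    iface=$1
    file="${PROC_ROOT:-/proc}/sys/net/ipv4/conf/$iface/proxy_arp"
    [ -r "$file" ] && [ "$(cat "$file")" = "1" ]
}

if proxy_arp_enabled swp1; then
    echo "proxy ARP is on for swp1"
else
    echo "proxy ARP is off for swp1 (or swp1 does not exist)"
fi
```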
          

          If you are running two interfaces in the same broadcast domain (typically seen when using VRR, which creates a -v0 interface in the same broadcast domain), set /proc/sys/net/ipv4/conf/<INTERFACE>/medium_id to 2 on both the base SVI interface and the -v0 interface. In this case only one of the two interfaces replies when getting an ARP request. This prevents the v0 interface from proxy replying on behalf of the SVI (and the SVI from proxy replying on behalf of the v0 interface). You can only prevent duplicate replies when the ARP request is for the SVI or the v0 interface directly.

          Cumulus Linux does not provide NVUE commands for this setting.

          Edit the /etc/network/interfaces file, then run the ifreload -a command. For example:

          cumulus@switch:~$ sudo nano /etc/network/interfaces
          ...
          auto swp1
          iface swp1
              post-up echo 1 > /proc/sys/net/ipv4/conf/swp1/proxy_arp
              post-up echo 2 > /proc/sys/net/ipv4/conf/swp1/medium_id
          
          auto swp1-v0
          iface swp1-v0
              post-up echo 1 > /proc/sys/net/ipv4/conf/swp1-v0/proxy_arp
              post-up echo 2 > /proc/sys/net/ipv4/conf/swp1-v0/medium_id
          ...
          
          cumulus@switch:~$ sudo ifreload -a
          

          If you are running proxy ARP on a VRR interface, add a post-up line to the VRR interface stanza similar to the following. For example, if vlan100 is the VRR interface for the configuration above:

          Cumulus Linux does not provide NVUE commands for this setting.

          Edit the /etc/network/interfaces file, then run the ifreload -a command. For example:

          cumulus@switch:~$ sudo nano /etc/network/interfaces
          ...
          auto vlan100
          iface vlan100
              post-up echo 1 > /proc/sys/net/ipv4/conf/swp1-v0/proxy_arp
              post-up echo 1 > /proc/sys/net/ipv4/conf/swp1/proxy_arp
              post-up echo 2 > /proc/sys/net/ipv4/conf/swp1-v0/medium_id
              post-up echo 2 > /proc/sys/net/ipv4/conf/swp1/medium_id
              vlan-id 100
          ...
          
          cumulus@switch:~$ sudo ifreload -a
          

          Duplicate Address Detection (Windows Hosts)

          In centralized VXLAN environments with ARP and ND suppression, problems with the Duplicate Address Detection process on Microsoft Windows hosts occur if the SVIs exist on the leafs but do not have an IP address within the subnet. For example, in a pure layer 2 scenario or with SVIs that have the ip-forward option off, the SVI does not have an IP address. The neighmgrd service selects a source IP address for an ARP probe based on the subnet match on the neighbor IP address. Because the SVI that learns this neighbor does not have an IP address, the subnet match fails and neighmgrd uses UNSPEC (0.0.0.0 for IPv4) as the source IP address in the ARP probe.

          To work around this issue, run the neighmgrctl setsrcipv4 <ipaddress> command to specify a non-0.0.0.0 address for the source; for example:

          cumulus@switch:~$ neighmgrctl setsrcipv4 10.1.0.2
          

          The configuration above does not persist if you reboot the switch. To make the changes apply persistently:

          1. Create a new file called /etc/cumulus/neighmgr.conf and add the setsrcipv4 <ipaddress> option; for example:

            cumulus@switch:~$ sudo nano /etc/cumulus/neighmgr.conf
            
            [main]
            setsrcipv4: 10.1.0.2
            
          2. Restart the neighmgrd service:

            cumulus@switch:~$ sudo systemctl restart neighmgrd
            

          Neighbor Base Reachable Timer

          You can set how long a neighbor cache entry is valid with the NVUE nv set system global arp base-reachable-time command. The entry is valid for at least the value between the base reachable time divided by two and three times the base reachable time divided by two. You can specify a value between 30 and 2147483 seconds. The default value is auto; NVUE derives the value for auto from the /etc/sysctl.d/neigh.conf file.

          The following example configures the neighbor base reachable timer to 50 seconds.

          cumulus@leaf01:~$ nv set system global arp base-reachable-time 50
          cumulus@leaf01:~$ nv config apply
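
As a worked example of the range described above, a base reachable time of 50 seconds gives a window between 50 / 2 = 25 seconds and 3 * 50 / 2 = 75 seconds. The hypothetical reachable_window function below does the same arithmetic for any value; it uses integer division, so odd values round down.

```shell
# Hypothetical helper: print the lower and upper bounds (in seconds)
# derived from a base reachable time: base/2 and 3*base/2.
reachable_window() {
    base=$1
    echo "$(( base / 2 )) $(( 3 * base / 2 ))"
}

reachable_window 50   # prints: 25 75
```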
          

          To reset the neighbor base reachable timer to the default setting, run the nv unset system global arp base-reachable-time command.

          NVIDIA recommends that you run the NVUE command to change the neighbor base reachable timer instead of modifying the /etc/sysctl.d/neigh.conf file manually.

          To show the neighbor base reachable timer setting, run the nv show system global arp command:

          cumulus@leaf02:mgmt:~$ nv show system global arp
                                        operational  applied
          ----------------------------  -----------  -------
          base-reachable-time           50           50   
          garbage-collection-threshold                      
            effective                   35840               
            maximum                     40960               
            minimum                     128            
          

          ARP Refresh

          Cumulus Linux does not interact directly with end systems as much as end systems interact with one another. Therefore, after ARP places a neighbor into a reachable state, if Cumulus Linux does not interact with the client again for a long enough period of time, the neighbor can move into a stale state. To keep neighbors in the reachable state, Cumulus Linux includes a background process (/usr/bin/neighmgrd). The background process can track neighbors that move into a stale, delay, or probe state, and attempt to refresh their state before removing them from the Linux kernel and from hardware forwarding.

          If you want the neighmgrd process to add a neighbor only if the sender IP address in the ARP packet is in one of the SVI’s subnets, create the /etc/cumulus/neighmgr.conf file and add the subnet_checks=1 parameter under the [snooper] header. By default, the subnet_checks option is set to 0 (disabled) so that neighmgrd allows SVIs to process out-of-network neighbors.

          The ARP refresh timer defaults to 1080 seconds (18 minutes).

          cumulus@leaf02:mgmt:~$ sudo nano /etc/cumulus/neighmgr.conf
          [snooper]
          subnet_checks=1
          

          Add Static ARP Table Entries

          You can add static ARP table entries for easy management or as a security measure to prevent spoofing and other nefarious activities.

          To create a static ARP entry for an interface with an IPv4 address associated with a MAC address, run the nv set interface <interface> neighbor ipv4 <ip-address> lladdr <mac-address> command.

          cumulus@leaf01:mgmt:~$ nv set interface swp51 neighbor ipv4 10.5.5.51 lladdr 00:00:5E:00:53:51
          cumulus@leaf01:mgmt:~$ nv config apply
          

          You can also set a flag to indicate that the neighbor is a router (is-router) or learned externally (ext_learn) and set the neighbor state (delay, failed, incomplete, noarp, permanent, probe, reachable, or stale).

          cumulus@leaf01:mgmt:~$ nv set interface swp51 neighbor ipv4 10.5.5.51 lladdr 00:00:5E:00:53:51 flag is-router
          cumulus@leaf01:mgmt:~$ nv set interface swp51 neighbor ipv4 10.5.5.51 lladdr 00:00:5E:00:53:51 state permanent
          cumulus@leaf01:mgmt:~$ nv config apply
          

          To delete an entry in the ARP table, run the nv unset interface <interface> neighbor ipv4 <ip-address> command:

          cumulus@leaf01:mgmt:~$ nv unset interface swp51 neighbor ipv4 10.5.5.51
          cumulus@leaf01:mgmt:~$ nv config apply
          

          To create a static ARP entry for an interface with an IPv4 address associated with a MAC address, add post-up ip neigh add <ipv4-address> lladdr <mac-address> dev <interface> to the interface stanza of the /etc/network/interfaces file, then run the ifreload -a command:

          cumulus@leaf01:mgmt:~$ sudo nano /etc/network/interfaces
          ...
          auto swp51
          iface swp51
              address 10.5.5.1/24
              post-up ip neigh add 10.5.5.51 lladdr 00:00:5E:00:53:51 dev swp51
          ...
          
          cumulus@leaf01:mgmt:~$ sudo ifreload -a
          

          You can also set a flag to indicate that the neighbor is a router (router) or learned externally (extern_learn) and set the neighbor state (delay, failed, incomplete, noarp, permanent, probe, reachable, or stale).

          cumulus@leaf01:mgmt:~$ sudo nano /etc/network/interfaces
          ...
          auto swp51
          iface swp51
              address 10.5.5.1/24
              post-up ip neigh add 10.5.5.51 lladdr 00:00:5E:00:53:51 dev swp51 nud permanent router
          ...
          
          cumulus@leaf01:mgmt:~$ sudo ifreload -a
          

          To delete an entry in the ARP table, remove the post-up ip neigh add line from the interface stanza of the /etc/network/interfaces file.

          Show the ARP Table

          To show all the entries in the IP neighbor table, run the nv show interface neighbor command or the Linux ip neighbor show command:

          cumulus@leaf01:mgmt:~$ nv show interface neighbor
          Interface      IP/IPV6                    LLADR(MAC)         State      Flag      
          -------------  -------------------------  -----------------  ---------  ----------
          eth0           192.168.200.251            48:b0:2d:00:00:01  stale                
                         192.168.200.1              48:b0:2d:aa:8b:45  reachable            
                         fe80::4ab0:2dff:fe00:1     48:b0:2d:00:00:01  reachable  router    
          peerlink.4094  169.254.0.1                48:b0:2d:3f:69:d6  permanent            
                         fe80::4ab0:2dff:fe3f:69d6  48:b0:2d:3f:69:d6  reachable  router    
          swp51          169.254.0.1                48:b0:2d:a2:4c:79  permanent            
                         fe80::4ab0:2dff:fea2:4c79  48:b0:2d:a2:4c:79  reachable  router    
          swp52          169.254.0.1                48:b0:2d:48:f1:ae  permanent            
                         fe80::4ab0:2dff:fe48:f1ae  48:b0:2d:48:f1:ae  reachable  router    
          swp53          169.254.0.1                48:b0:2d:2d:de:93  permanent            
                         fe80::4ab0:2dff:fe2d:de93  48:b0:2d:2d:de:93  reachable  router    
          swp54          169.254.0.1                48:b0:2d:80:8c:21  permanent            
                         fe80::4ab0:2dff:fe80:8c21  48:b0:2d:80:8c:21  reachable  router    
          vlan10         10.1.10.3                  44:38:39:22:01:78  permanent            
                         10.1.10.101                48:b0:2d:a1:3f:4b  reachable            
                         10.1.10.104                48:b0:2d:1d:d7:e8  noarp      |ext_learn
                         fe80::4ab0:2dff:fea1:3f4b  48:b0:2d:a1:3f:4b  reachable            
                         fe80::4ab0:2dff:fe1d:d7e8  48:b0:2d:1d:d7:e8  noarp      |ext_learn
                         fe80::4638:39ff:fe22:178   44:38:39:22:01:78  permanent            
          vlan10-v0      10.1.10.101                48:b0:2d:a1:3f:4b  stale                
                         fe80::4ab0:2dff:fea1:3f4b  48:b0:2d:a1:3f:4b  stale                
                         fe80::4ab0:2dff:fe1d:d7e8  48:b0:2d:1d:d7:e8  stale                
          vlan20         10.1.20.105                48:b0:2d:75:bf:9e  noarp      |ext_learn
                         10.1.20.102                48:b0:2d:00:e9:05  reachable            
                         10.1.20.3                  44:38:39:22:01:78  permanent            
                         fe80::4638:39ff:fe22:178   44:38:39:22:01:78  permanent            
                         fe80::4ab0:2dff:fe75:bf9e  48:b0:2d:75:bf:9e  noarp      |ext_learn
                         fe80::4ab0:2dff:fe00:e905  48:b0:2d:00:e9:05  reachable
          ...
          
          cumulus@leaf01:mgmt:~$ ip neighbor show
          192.168.200.251 dev eth0 lladdr 48:b0:2d:00:00:01 STALE 
          10.5.5.51 dev swp51 lladdr 00:00:5e:00:53:51 router PERMANENT 
          192.168.200.1 dev eth0 lladdr 48:b0:2d:b1:48:ef REACHABLE 
          fe80::4ab0:2dff:fe00:1 dev eth0 lladdr 48:b0:2d:00:00:01 router REACHABLE
          ...
          

          To show IPv4 entries only, run the Linux ip -4 neighbor command:

          cumulus@leaf01:mgmt:~$ ip -4 neighbor
          169.254.0.1 dev swp54 lladdr 48:b0:2d:80:8c:21 PERMANENT proto zebra 
          169.254.0.1 dev peerlink.4094 lladdr 48:b0:2d:3f:69:d6 PERMANENT proto zebra 
          10.10.10.3 dev vxlan48 lladdr 44:38:39:22:01:84 extern_learn  NOARP proto zebra 
          10.10.10.64 dev vlan4024_l3 lladdr 44:38:39:22:01:7c extern_learn  NOARP proto zebra 
          10.1.20.102 dev vlan20-v0 lladdr 48:b0:2d:00:e9:05 STALE
          192.168.200.251 dev eth0 lladdr 48:b0:2d:00:00:01 STALE
          10.10.10.4 dev vlan4024_l3 lladdr 44:38:39:22:01:8a extern_learn  NOARP proto zebra 
          10.10.10.64 dev vlan4036_l3 lladdr 44:38:39:22:01:7c extern_learn  NOARP proto zebra 
          169.254.0.1 dev swp53 lladdr 48:b0:2d:2d:de:93 PERMANENT proto zebra 
          10.10.10.4 dev vlan4036_l3 lladdr 44:38:39:22:01:8a extern_learn  NOARP proto zebra 
          10.1.10.3 dev vlan10 lladdr 44:38:39:22:01:78 PERMANENT
          169.254.0.1 dev swp52 lladdr 48:b0:2d:48:f1:ae PERMANENT proto zebra 
          10.10.10.2 dev vlan4024_l3 lladdr 44:38:39:22:01:78 extern_learn  NOARP proto zebra 
          10.1.20.105 dev vlan20 lladdr 48:b0:2d:75:bf:9e extern_learn  NOARP proto zebra 
          10.10.10.64 dev vxlan48 lladdr 44:38:39:22:01:7c extern_learn  NOARP proto zebra 
          10.0.1.34 dev vxlan48 lladdr 44:38:39:be:ef:bb extern_learn  NOARP proto zebra 
          10.10.10.2 dev vlan4036_l3 lladdr 44:38:39:22:01:78 extern_learn  NOARP proto zebra 
          10.1.10.101 dev vlan10-v0 lladdr 48:b0:2d:a1:3f:4b STALE
          10.1.10.101 dev vlan10 lladdr 48:b0:2d:a1:3f:4b REACHABLE
          ...
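
When the table is long, it can help to filter the ip neighbor output by state. The following sketch defines a hypothetical neigh_in_state function that reads ip neighbor output on standard input and prints the IP address and device of each entry in the requested state. It assumes the line layout shown above, where the address is the first field and the device name is the third.

```shell
# Hypothetical helper: print "<ip> <device>" for each neighbor entry
# whose line contains the given state keyword (for example, STALE).
neigh_in_state() {
    awk -v state="$1" '{
        for (i = 1; i <= NF; i++) {
            if ($i == state) { print $1, $3; next }
        }
    }'
}

# On a switch: ip -4 neighbor | neigh_in_state STALE
# Sample lines taken from the output above:
printf '%s\n' \
    '169.254.0.1 dev swp54 lladdr 48:b0:2d:80:8c:21 PERMANENT proto zebra' \
    '10.1.20.102 dev vlan20-v0 lladdr 48:b0:2d:00:e9:05 STALE' \
    '10.1.10.101 dev vlan10 lladdr 48:b0:2d:a1:3f:4b REACHABLE' |
    neigh_in_state STALE
```

The function scans every field instead of only the last one because some entries carry trailing text such as proto zebra after the state.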
          

          To show all table entries for a specific interface, run the nv show interface <interface_id> neighbor command:

          cumulus@leaf01:mgmt:~$ nv show interface swp51 neighbor
          ipv4
          =========
              IPV4         LLADR(MAC)         State      Flag
              -----------  -----------------  ---------  ----
              10.5.5.51    00:00:5e:00:53:51  permanent      
              169.254.0.1  48:b0:2d:a2:4c:79  permanent
          ipv6
          =========
              IPV6                       LLADR(MAC)         State      Flag     
              -------------------------  -----------------  ---------  ---------
              fe80::4ab0:2dff:fea2:4c79  48:b0:2d:a2:4c:79  reachable  is-router
          

          To show all IPv4 table entries for an interface, run the nv show interface <interface> neighbor ipv4 command:

          cumulus@leaf01:mgmt:~$ nv show interface swp1 neighbor ipv4
          IPV4         LLADR(MAC)         State      Flag
          -----------  -----------------  ---------  ----
          10.188.52.1  00:00:5e:00:01:22  reachable
          10.188.52.2  1c:34:da:e8:1d:c8  stale
          

          To show table entries for an interface with a specific IPv4 address, run the nv show interface <interface_id> neighbor ipv4 <ip-address> command.

          cumulus@leaf01:mgmt:~$ nv show interface swp51 neighbor ipv4 169.254.0.1
          lladdr
          =========
              LLADR(MAC)         State      Flag
              -----------------  ---------  ----
              48:b0:2d:a2:4c:79  permanent
          

          Neighbor Discovery - ND

          ND allows different devices on the same link to advertise their existence to their neighbors and to learn about the existence of their neighbors. ND is the IPv6 equivalent of IPv4 ARP for layer 2 address resolution.

          ND is on by default. Cumulus Linux provides a set of configuration options to support IPv6 networks and adjust your security settings.

          Cumulus Linux provides options to configure:

          Router Advertisement

          Router Advertisement is disabled by default. To enable Router Advertisement for an interface:

          cumulus@leaf01:mgmt:~$ nv set interface swp1 ip neighbor-discovery router-advertisement enable on
          cumulus@leaf01:mgmt:~$ nv config apply
          
          cumulus@leaf01:mgmt:~$ sudo vtysh
          ...
          leaf01# configure terminal
          leaf01(config)# interface swp1
          leaf01(config-if)# no ipv6 nd suppress-ra
          

          For Stateless Address Auto-Configuration (SLAAC), you must enable Router Advertisement on the interface. The prefix advertised in Router Advertisement must be a /64 prefix.

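          You can also configure optional settings, such as the Router Advertisement interval, router preference, reachable time, retransmit time, hop limit, and lifetime.

          The following example commands set each of these options: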

          cumulus@leaf01:mgmt:~$ nv set interface swp1 ip neighbor-discovery router-advertisement interval 60000
          cumulus@leaf01:mgmt:~$ nv set interface swp1 ip neighbor-discovery router-advertisement router-preference high
          cumulus@leaf01:mgmt:~$ nv set interface swp1 ip neighbor-discovery router-advertisement reachable-time 3600000
          cumulus@leaf01:mgmt:~$ nv set interface swp1 ip neighbor-discovery router-advertisement retransmit-time 4294967295
          cumulus@leaf01:mgmt:~$ nv set interface swp1 ip neighbor-discovery router-advertisement hop-limit 100
          cumulus@leaf01:mgmt:~$ nv set interface swp1 ip neighbor-discovery router-advertisement lifetime 4000
          cumulus@leaf01:mgmt:~$ nv config apply
          
          cumulus@leaf01:mgmt:~$ sudo vtysh
          ...
          leaf01# configure terminal
          leaf01(config)# interface swp1
          leaf01(config-if)# ipv6 nd ra-interval 60
          leaf01(config-if)# ipv6 nd router-preference high
          leaf01(config-if)# ipv6 nd reachable-time 3600000
          leaf01(config-if)# ipv6 nd ra-retrans-interval 4294967295
          leaf01(config-if)# ipv6 nd ra-hop-limit 100
          leaf01(config-if)# ipv6 nd ra-lifetime 4000
          leaf01(config-if)# end
          leaf01# write memory
          leaf01# exit
          cumulus@leaf01:mgmt:~$ 
          

          The vtysh commands save the configuration in the /etc/frr/frr.conf file:

          cumulus@leaf01:mgmt:~$ sudo cat /etc/frr/frr.conf
          ...
          interface swp1
           ipv6 nd ra-hop-limit 100
           ipv6 nd ra-interval 60
           ipv6 nd ra-lifetime 4000
           ipv6 nd ra-retrans-interval 4294967295
           ipv6 nd reachable-time 3600000
           ipv6 nd router-preference high
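
Note that the NVUE example sets the interval to 60000 while the vtysh example uses ra-interval 60. This is consistent with NVUE taking the interval in milliseconds and the vtysh ra-interval command taking seconds; that reading is inferred from the two examples and is not stated explicitly here. Under that assumption, the conversion is a division by 1000, sketched with the hypothetical ms_to_seconds helper:

```shell
# Hypothetical helper: convert milliseconds to whole seconds.
ms_to_seconds() {
    echo "$(( $1 / 1000 ))"
}

ms_to_seconds 60000   # prints: 60
```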
          

          The following example commands set fast retransmit to off and managed configuration to on:

          cumulus@leaf01:mgmt:~$ nv set interface swp1 ip neighbor-discovery router-advertisement fast-retransmit off
          cumulus@leaf01:mgmt:~$ nv set interface swp1 ip neighbor-discovery router-advertisement managed-config on
          cumulus@leaf01:mgmt:~$ nv config apply
          
          cumulus@leaf01:mgmt:~$ sudo vtysh
          ...
          leaf01# configure terminal
          leaf01(config)# interface swp1
          leaf01(config-if)# no ipv6 nd ra-fast-retrans
          leaf01(config-if)# ipv6 nd managed-config-flag
          leaf01(config-if)# end
          leaf01# write memory
          leaf01# exit
          cumulus@leaf01:mgmt:~$ 
          

          The vtysh commands save the configuration in the /etc/frr/frr.conf file:

          cumulus@leaf01:mgmt:~$ sudo cat /etc/frr/frr.conf
          ...
          interface swp1
           no ipv6 nd ra-fast-retrans
           ipv6 nd managed-config-flag
          

          IPv6 Prefixes

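          To configure IPv6 prefixes, you must specify the IPv6 prefixes you want to include in router advertisements. In addition, you can configure optional settings for each prefix, such as the valid lifetime, preferred lifetime, and the off-link, autoconfig, and router-address options.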

          The following example commands set the IPv6 prefix to 2001:db8:1::100/32, the amount of time that the prefix is valid for on-link determination to 2000000000, and the amount of time that addresses generated from a prefix remain preferred to 1000000000.

          cumulus@leaf01:mgmt:~$ nv set interface swp1 ip neighbor-discovery prefix 2001:db8:1::100/32 valid-lifetime 2000000000
          cumulus@leaf01:mgmt:~$ nv set interface swp1 ip neighbor-discovery prefix 2001:db8:1::100/32 preferred-lifetime 1000000000
          cumulus@leaf01:mgmt:~$ nv config apply
          
          cumulus@leaf01:mgmt:~$ sudo vtysh
          ...
          leaf01# configure terminal
          leaf01(config)# interface swp1
          leaf01(config-if)# ipv6 nd prefix 2001:db8:1::100/32 2000000000 1000000000
          leaf01(config-if)# end
          leaf01# write memory
          leaf01# exit
          cumulus@leaf01:mgmt:~$ 
          

          The vtysh commands write to the /etc/frr/frr.conf file:

          cumulus@leaf01:mgmt:~$ sudo cat /etc/frr/frr.conf
          ...
          interface swp1
           ipv6 nd prefix 2001:db8::/32 2000000000 1000000000
           ...
          

          The following example commands set advertisement to make no statement about prefix on-link or off-link properties, enable the specified prefix to use IPv6 autoconfiguration, and indicate to hosts on the local link that the specified prefix contains a complete IP address.

          cumulus@leaf01:mgmt:~$ nv set interface swp1 ip neighbor-discovery prefix 2001:db8:1::100/32 off-link on
          cumulus@leaf01:mgmt:~$ nv set interface swp1 ip neighbor-discovery prefix 2001:db8:1::100/32 autoconfig on
          cumulus@leaf01:mgmt:~$ nv set interface swp1 ip neighbor-discovery prefix 2001:db8:1::100/32 router-address on
          cumulus@leaf01:mgmt:~$ nv config apply
          
          cumulus@leaf01:mgmt:~$ sudo vtysh
          ...
          leaf01# configure terminal
          leaf01(config)# interface swp1
          leaf01(config-if)# ipv6 nd prefix 2001:db8:1::100/32 off-link
          leaf01(config-if)# ipv6 nd prefix 2001:db8:1::100/32 no-autoconfig
          leaf01(config-if)# ipv6 nd prefix 2001:db8:1::100/32 router-address
          leaf01(config-if)# end
          leaf01# write memory
          leaf01# exit
          cumulus@leaf01:mgmt:~$ 
          

          The vtysh commands write to the /etc/frr/frr.conf file:

          cumulus@leaf01:mgmt:~$ sudo cat /etc/frr/frr.conf
          ...
          interface swp1
           ipv6 nd prefix 2001:db8::/32 off-link
           ipv6 nd prefix 2001:db8::/32 router-address
           ipv6 nd prefix 2001:db8::/32 no-autoconfig
           ...
          

          Recursive DNS Servers

          To configure recursive DNS servers (RDNSS), you must specify the IPv6 address of each RDNSS you want to advertise.

          An optional parameter lets you set the maximum amount of time you want to use the RDNSS for domain name resolution. You can set a value between 0 and 4294967295 seconds or use the keyword infinite to set the time to never expire. If you set the value to 0, Cumulus Linux no longer advertises the RDNSS address.

          The following example commands set the RDNSS address to 2001:db8:1::100 and the lifetime to infinite:

          cumulus@leaf01:mgmt:~$ nv set interface swp1 ip neighbor-discovery rdnss 2001:db8:1::100 lifetime infinite
          cumulus@leaf01:mgmt:~$ nv config apply
          
          cumulus@leaf01:mgmt:~$ sudo vtysh
          ...
          leaf01# configure terminal
          leaf01(config)# interface swp1
          leaf01(config-if)# ipv6 nd rdnss 2001:db8:1::100 infinite
          leaf01(config-if)# end
          leaf01# write memory
          leaf01# exit
          cumulus@leaf01:mgmt:~$ 
          

          The vtysh commands write to the /etc/frr/frr.conf file:

          cumulus@leaf01:mgmt:~$ sudo cat /etc/frr/frr.conf
          ...
          interface swp1
           ipv6 nd rdnss 2001:db8:1::100 infinite
           ...
          

          DNS Search Lists

          To configure DNS search lists (DNSSL), you must specify the domain suffix you want to advertise.

          An optional parameter lets you set the maximum amount of time you want to use the domain suffix for domain name resolution. You can set a value between 0 and 4294967295 seconds or use the keyword infinite to set the time to never expire. If you set the value to 0, the host does not use the DNSSL.

          The following example commands set the domain suffix to accounting.nvidia.com and the lifetime of the domain suffix to infinite:

          cumulus@leaf01:mgmt:~$ nv set interface swp1 ip neighbor-discovery dnssl accounting.nvidia.com lifetime infinite
          cumulus@leaf01:mgmt:~$ nv config apply
          
          cumulus@leaf01:mgmt:~$ sudo vtysh
          ...
          leaf01# configure terminal
          leaf01(config)# interface swp1
          leaf01(config-if)# ipv6 nd dnssl accounting.nvidia.com infinite
          leaf01(config-if)# end
          leaf01# write memory
          leaf01# exit
          cumulus@leaf01:mgmt:~$ 
          

          The vtysh commands write to the /etc/frr/frr.conf file:

          cumulus@leaf01:mgmt:~$ sudo cat /etc/frr/frr.conf
          ...
          interface swp1
           ipv6 nd dnssl accounting.nvidia.com infinite
          ...
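
          The RDNSS and DNSSL lifetime rules described above are the same (0 to 4294967295 seconds, or the keyword infinite), so you can check a value before you configure it. The following is a minimal Python sketch, not part of Cumulus Linux; the helper names normalize_lifetime and rdnss_command are hypothetical, and the sketch only builds the NVUE command string from the RDNSS example without running it.

```python
MAX_LIFETIME = 4294967295  # upper bound quoted in the text above (2**32 - 1)


def normalize_lifetime(value):
    """Return the lifetime as a string, or raise ValueError if it is not allowed."""
    if isinstance(value, str) and value.strip().lower() == "infinite":
        return "infinite"
    seconds = int(value)  # raises ValueError for non-numeric strings
    if not 0 <= seconds <= MAX_LIFETIME:
        raise ValueError(
            f"lifetime must be 0-{MAX_LIFETIME} or 'infinite', got {value!r}"
        )
    return str(seconds)


def rdnss_command(interface, address, lifetime):
    """Build (but do not run) the NVUE command shown in the RDNSS example."""
    return (
        f"nv set interface {interface} ip neighbor-discovery "
        f"rdnss {address} lifetime {normalize_lifetime(lifetime)}"
    )


print(rdnss_command("swp1", "2001:db8:1::100", "infinite"))
```

          A misspelled keyword or a value above the documented maximum raises ValueError instead of producing a command.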
          

          Home Agents

          Mobile IPv6 defines an additional flag in the router advertisement message that indicates if the advertising router is capable of being a Home Agent. Each Home Agent on the home link sets this flag when it sends router advertisements.

          You can configure the switch to be a Home Agent with these settings:

          The following example commands configure the switch as a Home Agent by setting the maximum amount of time the router acts as a Home Agent to 20000 seconds and the router preference to 100:

          cumulus@leaf01:mgmt:~$ nv set interface swp1 ip neighbor-discovery home-agent preference 100
          cumulus@leaf01:mgmt:~$ nv set interface swp1 ip neighbor-discovery home-agent lifetime 20000
          cumulus@leaf01:mgmt:~$ nv config apply
          

          When you run the above commands, NVUE adds the ipv6 nd home-agent-config-flag line under the interface stanza in the /etc/frr/frr.conf file in addition to the ipv6 nd home-agent-preference and ipv6 nd home-agent-lifetime lines.

          cumulus@leaf01:mgmt:~$ sudo vtysh
          ...
          leaf01# configure terminal
          leaf01(config)# interface swp1
          leaf01(config-if)# ipv6 nd home-agent-config-flag
          leaf01(config-if)# ipv6 nd home-agent-preference 100
          leaf01(config-if)# ipv6 nd home-agent-lifetime 20000
          leaf01(config-if)# end
          leaf01# write memory
          leaf01# exit
          cumulus@leaf01:mgmt:~$ 
          

          The vtysh commands write to the /etc/frr/frr.conf file:

          cumulus@leaf01:mgmt:~$ sudo cat /etc/frr/frr.conf
          ...
          interface swp1
           ipv6 nd home-agent-config-flag
           ipv6 nd home-agent-lifetime 20000
           ipv6 nd home-agent-preference 100
          ...
          

          MTU

          You can set the MTU for neighbor discovery messages on an interface. You can configure a value between 1 and 65535.

          The following example commands set the MTU on swp1 to 1500:

          cumulus@leaf01:mgmt:~$ nv set interface swp1 ip neighbor-discovery mtu 1500
          cumulus@leaf01:mgmt:~$ nv config apply
          
          cumulus@leaf01:mgmt:~$ sudo vtysh
          ...
          leaf01# configure terminal
          leaf01(config)# interface swp1
          leaf01(config-if)# ipv6 nd mtu 1500
          leaf01(config-if)# end
          leaf01# write memory
          leaf01# exit
          cumulus@leaf01:mgmt:~$ 
          

          The vtysh commands write to the /etc/frr/frr.conf file:

          cumulus@leaf01:mgmt:~$ sudo cat /etc/frr/frr.conf
          ...
          interface swp1
           ipv6 nd mtu 1500
          ...
          

          Neighbor Base Reachable Timer

          You can set how long a neighbor cache entry is valid with the NVUE nv set system global nd base-reachable-time command. The entry is valid for at least a value between half the base reachable time and one and a half times the base reachable time (base-reachable-time / 2 and 3 * base-reachable-time / 2). You can specify a value between 30 and 2147483 seconds. The default value is auto; NVUE derives the value for auto from the /etc/sysctl.d/neigh.conf file.

          The following example configures the neighbor base reachable timer to 50 seconds.

          cumulus@leaf01:~$ nv set system global nd base-reachable-time 50
          cumulus@leaf01:~$ nv config apply
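
          To make the validity window concrete, the arithmetic from the description above can be sketched in Python. This is an illustrative calculation only; the function name reachable_window is hypothetical, and the range check uses the limits quoted in this section.

```python
def reachable_window(base_seconds):
    """Return (lower, upper) bounds in seconds for a base reachable time."""
    if not 30 <= base_seconds <= 2147483:
        raise ValueError("base reachable time must be between 30 and 2147483 seconds")
    # Lower bound is base / 2, upper bound is 3 * base / 2.
    return base_seconds / 2, 3 * base_seconds / 2


low, high = reachable_window(50)
print(f"base 50 s -> entry valid for a value between {low} s and {high} s")
```

          With the 50 second setting in the example, the window is 25 to 75 seconds.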
          

          To reset the neighbor base reachable timer to the default setting, run the nv unset system global nd base-reachable-time command.

          NVIDIA recommends that you run the NVUE command to change the neighbor base reachable timer instead of modifying the /etc/sysctl.d/neigh.conf file manually.

          To show the neighbor base reachable timer setting, run the nv show system global nd command:

          cumulus@leaf01:~$ nv show system global nd
                                        operational  applied  
          ----------------------------  -----------  ------- 
          base-reachable-time           50           50      
          garbage-collection-threshold                               
            effective                   17920                        
            maximum                     20480                        
            minimum                     128       
          

          Disable ND

          To disable ND, run the NVUE nv set interface <interface> ip neighbor-discovery enable off command:

          cumulus@leaf01:mgmt:~$ nv set interface swp1 ip neighbor-discovery enable off
          cumulus@leaf01:mgmt:~$ nv config apply
          

          Add Static IP Neighbor Table Entries

          You can add static IPv6 neighbor table entries for easy management or as a security measure to prevent spoofing and other nefarious activities.

          To create a static neighbor entry for an interface with an IPv6 address associated with a MAC address, run the nv set interface <interface> neighbor ipv6 <ip-address> lladdr <mac-address> command.

          cumulus@leaf01:mgmt:~$ nv set interface swp51 neighbor ipv6 fe80::4ab0:2dff:fea2:4c79 lladdr 00:00:5E:00:53:51
          cumulus@leaf01:mgmt:~$ nv config apply
          

          You can also set a flag to indicate that the neighbor is a router (is-router) or learned externally (ext_learn) and set the neighbor state (delay, failed, incomplete, noarp, permanent, probe, reachable, or stale).

          cumulus@leaf01:mgmt:~$ nv set interface swp51 neighbor ipv6 fe80::4ab0:2dff:fea2:4c79 lladdr 00:00:5E:00:53:51 flag is-router
          cumulus@leaf01:mgmt:~$ nv set interface swp51 neighbor ipv6 fe80::4ab0:2dff:fea2:4c79 lladdr 00:00:5E:00:53:51 state permanent
          cumulus@leaf01:mgmt:~$ nv config apply
          

          To delete an entry in the IP neighbor table, run the nv unset interface <interface> neighbor ipv6 <ip-address> command:

          cumulus@leaf01:mgmt:~$ nv unset interface swp51 neighbor ipv6 fe80::4ab0:2dff:fea2:4c79
          cumulus@leaf01:mgmt:~$ nv config apply
          

          To create a static neighbor entry for an interface with an IPv6 address associated with a MAC address, add post-up ip neigh add <ipv6-address> lladdr <mac-address> dev <interface> to the interface stanza of the /etc/network/interfaces file, then run the ifreload -a command:

          cumulus@leaf01:mgmt:~$ sudo nano /etc/network/interfaces
          ...
          auto swp51
          iface swp51
              post-up ip neigh add fe80::4ab0:2dff:fea2:4c79 lladdr 00:00:5E:00:53:51 dev swp51
          ...
          
          cumulus@leaf01:mgmt:~$ sudo ifreload -a
          

          You can also set a flag to indicate that the IPv6 neighbor is a router (router) or learned externally (extern_learn) and set the neighbor state (delay, failed, incomplete, noarp, permanent, probe, reachable, or stale).

          cumulus@leaf01:mgmt:~$ sudo nano /etc/network/interfaces
          ...
          auto swp51
          iface swp51
              post-up ip neigh add fe80::4ab0:2dff:fea2:4c79 lladdr 00:00:5E:00:53:51 dev swp51 nud permanent router
          ...
          
          cumulus@leaf01:mgmt:~$ sudo ifreload -a
          

          To delete a static neighbor entry, remove the post-up ip neigh add line from the interface stanza of the /etc/network/interfaces file.

          Show the IP Neighbor Table

          To show all the entries in the IP neighbor table, run the nv show interface neighbor command or the Linux ip neighbor command:

          cumulus@leaf01:mgmt:~$ nv show interface neighbor
          Interface      IP/IPV6                    LLADR(MAC)         State      Flag      
          -------------  -------------------------  -----------------  ---------  ----------
          eth0           192.168.200.251            48:b0:2d:00:00:01  stale                
                         192.168.200.1              48:b0:2d:aa:8b:45  reachable            
                         fe80::4ab0:2dff:fe00:1     48:b0:2d:00:00:01  reachable  router    
          peerlink.4094  169.254.0.1                48:b0:2d:3f:69:d6  permanent            
                         fe80::4ab0:2dff:fe3f:69d6  48:b0:2d:3f:69:d6  reachable  router    
          swp51          169.254.0.1                48:b0:2d:a2:4c:79  permanent            
                         fe80::4ab0:2dff:fea2:4c79  48:b0:2d:a2:4c:79  reachable  router    
          swp52          169.254.0.1                48:b0:2d:48:f1:ae  permanent            
                         fe80::4ab0:2dff:fe48:f1ae  48:b0:2d:48:f1:ae  reachable  router    
          swp53          169.254.0.1                48:b0:2d:2d:de:93  permanent            
                         fe80::4ab0:2dff:fe2d:de93  48:b0:2d:2d:de:93  reachable  router    
          swp54          169.254.0.1                48:b0:2d:80:8c:21  permanent            
                         fe80::4ab0:2dff:fe80:8c21  48:b0:2d:80:8c:21  reachable  router    
          vlan10         10.1.10.3                  44:38:39:22:01:78  permanent            
                         10.1.10.101                48:b0:2d:a1:3f:4b  reachable            
                         10.1.10.104                48:b0:2d:1d:d7:e8  noarp      |ext_learn
                         fe80::4ab0:2dff:fea1:3f4b  48:b0:2d:a1:3f:4b  reachable            
                         fe80::4ab0:2dff:fe1d:d7e8  48:b0:2d:1d:d7:e8  noarp      |ext_learn
                         fe80::4638:39ff:fe22:178   44:38:39:22:01:78  permanent            
          vlan10-v0      10.1.10.101                48:b0:2d:a1:3f:4b  stale                
                         fe80::4ab0:2dff:fea1:3f4b  48:b0:2d:a1:3f:4b  stale                
                         fe80::4ab0:2dff:fe1d:d7e8  48:b0:2d:1d:d7:e8  stale                
          vlan20         10.1.20.105                48:b0:2d:75:bf:9e  noarp      |ext_learn
                         10.1.20.102                48:b0:2d:00:e9:05  reachable            
                         10.1.20.3                  44:38:39:22:01:78  permanent            
                         fe80::4638:39ff:fe22:178   44:38:39:22:01:78  permanent            
                         fe80::4ab0:2dff:fe75:bf9e  48:b0:2d:75:bf:9e  noarp      |ext_learn
                         fe80::4ab0:2dff:fe00:e905  48:b0:2d:00:e9:05  reachable
          ...
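
          The learned link-local entries in the sample output above pair each MAC address with an fe80:: address that matches the modified EUI-64 convention: flip the universal/local bit of the first MAC octet and insert ff:fe between the third and fourth octets. The following Python sketch reproduces that mapping so you can cross-check a table entry; the function name mac_to_link_local is hypothetical, and statically configured entries do not have to follow this pattern.

```python
import ipaddress


def mac_to_link_local(mac):
    """Derive the modified EUI-64 link-local IPv6 address for a MAC address."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6 or any(not 0 <= o <= 0xFF for o in octets):
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    octets[0] ^= 0x02  # flip the universal/local bit
    interface_id = octets[:3] + [0xFF, 0xFE] + octets[3:]
    # fe80::/64 prefix followed by the 64-bit interface identifier
    packed = bytes([0xFE, 0x80] + [0] * 6 + interface_id)
    return str(ipaddress.IPv6Address(packed))


print(mac_to_link_local("48:b0:2d:a2:4c:79"))
```

          For the swp51 neighbor in the table, the result is fe80::4ab0:2dff:fea2:4c79.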
          

          To show IPv6 entries only, run the Linux ip -6 neighbor command:

          cumulus@leaf01:mgmt:~$ ip -6 neighbor
          fe80::4ab0:2dff:fe4e:c76a dev vlan30 lladdr 48:b0:2d:4e:c7:6a extern_learn  NOARP proto zebra 
          fe80::4ab0:2dff:fea1:3f4b dev vlan10 lladdr 48:b0:2d:a1:3f:4b REACHABLE
          fe80::4ab0:2dff:fee9:d399 dev vlan30-v0 lladdr 48:b0:2d:e9:d3:99 STALE
          fe80::4ab0:2dff:fe75:bf9e dev vlan20-v0 lladdr 48:b0:2d:75:bf:9e STALE
          fe80::4638:39ff:fe22:178 dev vlan20 lladdr 44:38:39:22:01:78 PERMANENT
          fe80::4ab0:2dff:fea2:4c79 dev swp51 lladdr 48:b0:2d:a2:4c:79 router REACHABLE
          fe80::4ab0:2dff:fe00:1 dev eth0 lladdr 48:b0:2d:00:00:01 router REACHABLE
          fe80::4ab0:2dff:fee9:d399 dev vlan30 lladdr 48:b0:2d:e9:d3:99 REACHABLE
          fe80::4ab0:2dff:fe48:f1ae dev swp52 lladdr 48:b0:2d:48:f1:ae router REACHABLE
          fe80::4ab0:2dff:fe1d:d7e8 dev vlan10 lladdr 48:b0:2d:1d:d7:e8 extern_learn  NOARP proto zebra 
          fe80::4ab0:2dff:fea1:3f4b dev vlan10-v0 lladdr 48:b0:2d:a1:3f:4b STALE
          fe80::4ab0:2dff:fe80:8c21 dev swp54 lladdr 48:b0:2d:80:8c:21 router REACHABLE
          fe80::4ab0:2dff:fe75:bf9e dev vlan20 lladdr 48:b0:2d:75:bf:9e extern_learn  NOARP proto zebra 
          fe80::4638:39ff:fe22:178 dev vlan4024_l3 lladdr 44:38:39:22:01:78 PERMANENT
          fe80::4ab0:2dff:fe00:e905 dev vlan20-v0 lladdr 48:b0:2d:00:e9:05 STALE
          fe80::4ab0:2dff:fe3f:69d6 dev peerlink.4094 lladdr 48:b0:2d:3f:69:d6 router REACHABLE
          ...
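
          If you want to process the ip -6 neighbor output in a script, each line can be split into an address, keyword/value pairs (dev, lladdr, proto), optional flags, and a state. The following Python sketch parses lines in the format shown above; the function name and the dictionary layout are assumptions made for illustration, and the set of state names is limited to the states listed in this section plus the uppercase spelling that the command prints.

```python
STATES = {
    "DELAY", "FAILED", "INCOMPLETE", "NOARP",
    "PERMANENT", "PROBE", "REACHABLE", "STALE",
}
KEYWORDS = ("dev", "lladdr", "proto")


def parse_neighbor_line(line):
    """Parse one line of `ip -6 neighbor` output into a dictionary."""
    tokens = line.split()
    entry = {"address": tokens[0], "dev": None, "lladdr": None,
             "proto": None, "flags": [], "state": None}
    i = 1
    while i < len(tokens):
        token = tokens[i]
        if token in KEYWORDS and i + 1 < len(tokens):
            entry[token] = tokens[i + 1]  # keyword followed by its value
            i += 2
            continue
        if token in STATES:
            entry["state"] = token
        else:
            entry["flags"].append(token)  # for example router or extern_learn
        i += 1
    return entry


sample = "fe80::4ab0:2dff:fea2:4c79 dev swp51 lladdr 48:b0:2d:a2:4c:79 router REACHABLE"
print(parse_neighbor_line(sample))
```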
          

          To show all table entries for a specific interface, run the nv show interface <interface_id> neighbor command:

          cumulus@leaf01:mgmt:~$ nv show interface swp51 neighbor
          ipv4
          =========
              IPV4         LLADR(MAC)         State      Flag
              -----------  -----------------  ---------  ----
              10.5.5.51    00:00:5e:00:53:51  permanent      
              169.254.0.1  48:b0:2d:a2:4c:79  permanent
          ipv6
          =========
              IPV6                       LLADR(MAC)         State      Flag     
              -------------------------  -----------------  ---------  ---------
              fe80::4ab0:2dff:fea2:4c79  48:b0:2d:a2:4c:79  reachable  is-router
          

          To show all IPv6 table entries for an interface, run the nv show interface <interface> neighbor ipv6 command:

          cumulus@leaf01:mgmt:~$ nv show interface swp1 neighbor ipv6
          IPV6                       LLADR(MAC)         State      Flag
          -------------------------  -----------------  ---------  ---------
          fe80::1e34:daff:fe6c:dd8   1c:34:da:6c:0d:d8  stale
          fe80::3e2c:30ff:fe4b:800   3c:2c:30:4b:08:00  reachable
          

          To show table entries for an interface with a specific IPv6 address, run the nv show interface <interface_id> neighbor ipv6 <ip-address> command:

          cumulus@leaf01:mgmt:~$ nv show interface swp51 neighbor ipv6 fe80::4ab0:2dff:fea2:4c79
          lladdr
          =========
              LLADR(MAC)         State      Flag
              -----------------  ---------  ----
              00:00:5E:00:53:51  permanent
          

          Troubleshooting

          To show the ND configuration settings for an interface, run the NVUE nv show interface <interface-id> ip neighbor-discovery command:

          cumulus@leaf01:mgmt:~$ nv show interface swp1 ip neighbor-discovery
                                applied             description
          --------------------  ------------------  ----------------------------------------------------------------------
          enable                on                  Turn the feature 'on' or 'off'.  The default is 'on'.
          home-agent
            lifetime            0                   Lifetime of a home agent in seconds
            preference          0                   Home agent's preference value that is used to order the addresses r...
          [prefix]              2001:db8:1::100/32  IPv6 prefix configuration
          router-advertisement
            enable              on                  Turn the feature 'on' or 'off'.  The default is 'on'.
            fast-retransmit     off                 Allow consecutive RA packets more frequently than every 3 seconds
            hop-limit           100                 Value in hop count field in IP header of the outgoing router advert...
            interval            6000                Maximum time in milliseconds allowed between sending unsolicited mu...
            interval-option     on                  Indicates hosts that the router will use advertisement interval to...
            lifetime            4000                Maximum time in seconds that the router can be treated as default g...
            managed-config      on                  Knob to allow dynamic host to use managed (stateful) protocol for a...
            other-config        off                 Knob to allow dynamic host to use managed (stateful) protocol for a...
            reachable-time      3600000             Time in milliseconds that a IPv6 node is considered reachable
            retransmit-time     4294967295          Time in milliseconds between retransmission of neighbor solicitatio...
            router-preference   high                Hosts use router preference in selection of the default router
          

          To show prefix configuration for an interface, run the nv show interface <interface> ip neighbor-discovery prefix <prefix> command.

          cumulus@leaf01:mgmt:~$ nv show interface swp1 ip neighbor-discovery prefix 2001:db8:1::100/32
                              applied     description
          ------------------  -------     ----------------------------------------------------------------------
          autoconfig          on          Indicates to hosts on the local link that the specified prefix can...
          off-link            on          Indicates that adverisement makes no statement about on-link or off...
          preferred-lifetime  1000000000  Time in seconds that addresses generated from a prefix remain prefe...
          router-address      on          Indicates to hosts on the local link that the specified prefix cont...
          valid-lifetime      2000000000  Time in seconds the prefix is valid for on-link determination
          

          To show Home Agent configuration for an interface, run the nv show interface <interface> ip neighbor-discovery home-agent command:

          cumulus@leaf01:mgmt:~$ nv show interface swp1 ip neighbor-discovery home-agent
                      applied  description
          ----------  -------  ----------------------------------------------------------------------
          lifetime    20000    Lifetime of a home agent in seconds
          preference  100      Home agent's preference value that is used to order the addresses r...
          

          To show router advertisement configuration for an interface, run the nv show interface <interface> ip neighbor-discovery router-advertisement command. The command also shows the number of router advertisement packets sent on the interface and the number of router advertisement and router solicitation packets received on the interface.

          cumulus@leaf01:mgmt:~$ nv show interface swp1 ip neighbor-discovery router-advertisement
                                applied
          -----------------     -----------------
          enable                on
          interval              10000
          interval-option       off
          fast-retransmit       on
          lifetime              1800
          reachable-time        0
          retransmit-time       0
          managed-config        off
          other-config          off
          hop-limit             64
          router-preference     medium
          ra-sent               218
          ra-received           2
          rs-received           1
          

          To show RDNSS configuration for an interface, run the nv show interface <interface> ip neighbor-discovery rdnss <address> command:

          cumulus@leaf01:mgmt:~$ nv show interface swp1 ip neighbor-discovery rdnss 2001:db8:1::100
                    applied   description
          --------  --------  ----------------------------------------------------------------------
          lifetime  infinite  Maximum time in seconds for which the server may be used for domain...
          

          To show DNSSL configuration for an interface, run the nv show interface <interface> ip neighbor-discovery dnssl <domain-suffix> command:

          cumulus@leaf01:mgmt:~$ nv show interface swp1 ip neighbor-discovery dnssl accounting.nvidia.com
                    applied   description
          --------  --------  ----------------------------------------------------------------------
          lifetime  infinite  Maximum time in seconds for which the domain suffix may be used for...
          

          Monitoring and Troubleshooting

          This chapter introduces the basics for monitoring and troubleshooting Cumulus Linux.

          Serial Console

          Use the serial console to debug issues if you reboot the switch often or if you do not have a reliable network connection.

          The default serial console baud rate is 115200, which is the baud rate ONIE uses.

          Configure the Serial Console

          On x86 switches, you configure serial console baud rate by editing grub.

          Incorrect configuration settings in grub cause the switch to be inaccessible through the console. Review grub changes before you implement them.

          The valid values for the baud rate are:

          To change the serial console baud rate:

          1. Edit the /etc/default/grub file and provide a valid value for the --speed and console variables:

            GRUB_SERIAL_COMMAND="serial --port=0x2f8 --speed=115200 --word=8 --parity=no --stop=1"
            GRUB_CMDLINE_LINUX="console=ttyS1,115200n8 cl_platform=accton_as5712_54x"
            
          2. After you save your changes to the grub configuration, type the following at the command prompt:

            cumulus@switch:~$ sudo update-grub
            
          3. If you plan on accessing the switch BIOS over the serial console, you need to update the baud rate in the switch BIOS. For more information, see this knowledge base article.

          4. Reboot the switch.
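
          Because an incorrect grub setting can make the console inaccessible, it can help to check the file before you run update-grub. The following Python sketch reads the two variables from step 1 and confirms that the --speed value and the console baud rate agree, as they do in the example. The function names are hypothetical, the regular expressions assume the line format shown in step 1, and the sketch does not verify that the rate itself is one of the valid values.

```python
import re


def grub_baud_rates(grub_text):
    """Return (serial --speed value, console= baud value); None if not found."""
    speed = re.search(r"--speed=(\d+)", grub_text)
    console = re.search(r"console=ttyS\d+,(\d+)", grub_text)
    return (
        int(speed.group(1)) if speed else None,
        int(console.group(1)) if console else None,
    )


def baud_rates_match(grub_text):
    """True only when both values are present and equal."""
    speed, console = grub_baud_rates(grub_text)
    return speed is not None and speed == console


SAMPLE = '''
GRUB_SERIAL_COMMAND="serial --port=0x2f8 --speed=115200 --word=8 --parity=no --stop=1"
GRUB_CMDLINE_LINUX="console=ttyS1,115200n8 cl_platform=accton_as5712_54x"
'''
print(grub_baud_rates(SAMPLE), baud_rates_match(SAMPLE))
```

          To check the real file, pass the contents of /etc/default/grub to baud_rates_match instead of the sample string.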

          Change the Console Log Level

          By default, the console prints all log messages except debug messages. To tune console logging to be less verbose so that certain levels of messages do not print, run the dmesg -n <level> command, where the log levels are:

          Level Description
          0 Emergency messages (the system is about to crash or is unstable).
          1 Serious conditions; you must take action immediately.
          2 Critical conditions (serious hardware or software failures).
          3 Error conditions (often used by drivers to indicate difficulties with the hardware).
          4 Warning messages (nothing serious but might indicate problems).
          5 Message notifications for many conditions, including security events.
          6 Informational messages.
          7 Debug messages.

          Only messages with a value lower than the level specified print to the console. For example, if you specify level 3, only level 2 (critical conditions), level 1 (serious conditions), and level 0 (emergency messages) print to the console:

          cumulus@switch:~$ sudo dmesg -n 3
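
          The rule above (only levels numerically lower than the configured level print) can be sketched in Python as follows. This only models the numeric dmesg -n <level> form described in the table; the function name and the short labels are illustrative.

```python
LEVELS = [
    "emergency", "serious", "critical", "error",
    "warning", "notice", "informational", "debug",
]


def printed_levels(console_level):
    """Return the (number, label) pairs that print for `dmesg -n <console_level>`."""
    if not 0 <= console_level <= 8:
        raise ValueError("console level must be between 0 and 8")
    # Only messages with a value lower than the specified level print.
    return [(n, label) for n, label in enumerate(LEVELS) if n < console_level]


print(printed_levels(3))
```

          For level 3, the result contains levels 0, 1, and 2 only, which matches the example.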
          

          You can also run the dmesg --console-level <level> command, where the log levels are emerg, alert, crit, err, warn, notice, info, or debug. For example, to print critical conditions, run the following command:

          cumulus@switch:~$ sudo dmesg --console-level crit
          

          The dmesg command applies until the next reboot.

          For more details about the dmesg command, run man dmesg.

          Show System Information

          Cumulus Linux provides commands to obtain system information and to show the version of Cumulus Linux you are running. Use these commands when performing system diagnostics, troubleshooting performance, or submitting a support request.

          To show information about the version of Cumulus Linux running on the switch, run the nv show system command:

          cumulus@switch:~$ nv show system
                      operational          applied
          -----------  -------------------  -------
          hostname     leaf01                
          build        Cumulus Linux 5.12        
          uptime       0:02:50                     
          timezone     Etc/UTC                     
          maintenance                              
            mode       disabled                    
            ports      enabled
          

          To show system memory information in bytes, run the nv show system memory command:

          cumulus@switch:~$ nv show system memory
          Type      Buffers     Cache        Free         Total         Used         Utilization
          --------  ----------  -----------  -----------  ------------  -----------  -----------
          Physical  81661952 B  571834368 B  373276672 B  1813528576 B  786755584 B  79.4%
          Swap                               0 B          0 B           0 B          0.0%
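
          The Utilization value in the sample output above is consistent with (Total - Free) / Total. This is an observation about the sample numbers, not a documented formula, so treat the following Python sketch as an assumption; the function name is hypothetical.

```python
def utilization_percent(total_bytes, free_bytes):
    """Percentage of memory that is not free, rounded to one decimal place."""
    if total_bytes == 0:
        return 0.0  # matches the Swap row, where Total is 0 B
    return round(100 * (total_bytes - free_bytes) / total_bytes, 1)


# Values from the Physical and Swap rows of the sample output
print(utilization_percent(1813528576, 373276672))
print(utilization_percent(0, 0))
```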
          

          To show system CPU information, run the nv show system cpu command:

          cumulus@switch:~$ nv show system cpu
                       operational                  
          -----------  -----------------------------
          model        QEMU Virtual CPU version 2.5+
          core-count   1                            
          utilization  0.3%
          

          To show general information about the switch, run the nv show platform command:

          cumulus@switch:~$ nv show platform
                        operational                            
          ------------  ---------------------------------------
          system-mac    44:38:39:22:01:b1                      
          manufacturer  Accton                                 
          cpu           x86_64 QEMU Virtual CPU version 2.5+ x1
          memory        1751856 kB                             
          disk-size     n/a                                    
          port-layout   n/a                                    
          asic-model    n/a                                    
          system-uuid   a6bfbd6d-70ac-426f-b46d-3743e16e1f4b
          

          Diagnostics Using a cl-support File

          You can generate a single cl-support file that contains various details about the switch configuration and is useful for remote debugging and troubleshooting.

          Generate a cl-support file to investigate issues before you submit a support request. You can run either the NVUE nv action generate system tech-support command or the Linux sudo cl-support command:

          cumulus@switch:~$ nv action generate system tech-support
          ...
          

          For more information, refer to Understanding the cl-support Output File.

          Send Log Files to a syslog Server

          You can configure Cumulus Linux to send log files to one or more remote syslog servers.

          The following example configures Cumulus Linux to send log files to the remote syslog server with the 192.168.0.254 address in the default VRF on port 514 using UDP.

          You must specify a VRF in the command.

          cumulus@switch:~$ nv set service syslog default server 192.168.0.254 port 514
          cumulus@switch:~$ nv set service syslog default server 192.168.0.254 protocol udp
          cumulus@switch:~$ nv config apply
          

          The configuration creates the /etc/rsyslog.d/11-remotesyslog-default.conf file. The file has the following content:

          cumulus@switch:~$ sudo cat /etc/rsyslog.d/11-remotesyslog-default.conf
          # Auto-generated by NVUE!
          # Any local modifications will prevent NVUE from re-generating this file.
          # md5sum: c8e094c868c7f9be4cfa6ccec752b44b
          #
          # Remote syslog servers configured through CUE
          #
          action(type="omfwd" Target="192.168.0.254" Port="514" Protocol="udp")
          

          Log Technical Details

          rsyslog performs logging on Cumulus Linux. rsyslog provides both local logging to the syslog file and the ability to export logs to an external syslog server. All rsyslog log files use high precision timestamps:

          2015-08-14T18:21:43.337804+00:00 cumulus switchd[3629]: switchd.c:1409 switchd version 1.0-cl2.5+5
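
          The high precision timestamp is in ISO 8601 format with microseconds and a UTC offset, so standard library parsers can read it. The following Python sketch splits a log line like the one above into its timestamp, host, tag, and message; the function name is hypothetical and the split assumes the field layout of this sample line.

```python
from datetime import datetime


def parse_log_line(line):
    """Split a log line into (timestamp, host, tag, message)."""
    timestamp, host, rest = line.split(" ", 2)
    tag, _, message = rest.partition(": ")  # tag is the program name and PID
    return datetime.fromisoformat(timestamp), host, tag, message


line = ("2015-08-14T18:21:43.337804+00:00 cumulus switchd[3629]: "
        "switchd.c:1409 switchd version 1.0-cl2.5+5")
when, host, tag, message = parse_log_line(line)
print(when.isoformat(), host, tag, message, sep=" | ")
```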
          

          Cumulus Linux includes applications that write directly to a log file in the /var/log/ directory without going through rsyslog.

          All Cumulus Linux rules are in separate files in the /etc/rsyslog.d/ directory, which rsyslog calls at the end of the GLOBAL DIRECTIVES section of the /etc/rsyslog.conf file. rsyslog ignores the RULES section at the end of the rsyslog.conf file; the rules in the /etc/rsyslog.d/ directory must process the messages because the last line in the /etc/rsyslog.d/99-syslog.conf file drops them.

          Local Logging

          Cumulus Linux sends logs through rsyslog, which writes them to files in the /var/log directory. There are default rules in the /etc/rsyslog.d/ directory that define where the logs write:

          Rule Purpose
          10-rules.conf Sets defaults for log messages, including the log format and log rate limits.
          15-crit.conf Logs crit, alert or emerg log messages to /var/log/crit.log to ensure they do not rotate away.
          20-clagd.conf Logs clagd messages to /var/log/clagd.log for MLAG.
          22-linkstate.conf Logs link state changes for all physical and logical network links to /var/log/linkstate.
          25-switchd.conf Logs switchd messages to /var/log/switchd.log.
          30-ptmd.conf Logs ptmd messages to /var/log/ptmd.log for Prescription Topology Manager.
          35-rdnbrd.conf Logs rdnbrd messages to /var/log/rdnbrd.log for Redistribute Neighbor.
          42-nvued.conf Logs nvued messages to /var/log/nvued.log for NVUE.
          45-frr.conf Logs routing protocol messages to /var/log/frr/frr.log. This includes BGP and OSPF log messages.
          50-netq-agent.conf Logs NetQ agent messages to /var/log/netq-agent.log.
          50-netqd.conf Logs netqd messages to /var/log/netqd.log.
          55-dhcpsnoop.conf Logs DHCP snooping messages to /var/log/dhcpsnoop.log.
          66-ptp4l.conf Logs PTP messages to /var/log/ptp4l.log.
          99-syslog.conf Sends all remaining processes that use rsyslog to /var/log/syslog.

          Cumulus Linux rotates and compresses log files into an archive. Processes that do not use rsyslog write to their own log files within the /var/log directory. For more information on specific log files, see Troubleshooting Log Files.

          Enable Remote syslog

          Cumulus Linux does not send all log messages to a remote server. To send other log files (such as switchd logs) to a syslog server, follow these steps:

          1. Create a file in the /etc/rsyslog.d/ directory. Make sure the filename starts with a number lower than 99 so that it runs before the files that consume log messages, such as 20-clagd.conf or 25-switchd.conf. The name of the example file below is /etc/rsyslog.d/11-remotesyslog.conf. Add content similar to the following:

            ## Logging switchd messages to remote syslog server
            
            @192.168.1.2:514
            

            This configuration sends log messages to a remote syslog server for the following processes: clagd, switchd, ptmd, rdnbrd, nvued and syslog. It follows the same syntax as the /var/log/syslog file, where @ indicates UDP, 192.168.1.2 is the IP address of the syslog server, and 514 is the UDP port.

            • For TCP-based syslog, use two @ characters before the IP address; for example, @@192.168.1.2:514.
            • The file numbering in /etc/rsyslog.d/ dictates how the rules install into rsyslog.d. Lower numbered rules process first and rsyslog processing terminates with the stop keyword. For example, the rsyslog configuration for FRR is in the 45-frr.conf file with an explicit stop at the bottom of the file. FRR messages log to the /var/log/frr/frr.log file on the local disk only (these messages do not go to a remote server using the default configuration). To log FRR messages remotely in addition to writing FRR messages to the local disk, rename the 99-syslog.conf file to 11-remotesyslog.conf. The 11-remotesyslog.conf rule (transmit to remote server) processes FRR messages first, then the 45-frr.conf file continues to process the messages (write to local disk in the /var/log/frr/frr.log file).
            • Do not use the imfile module with any file written by rsyslogd.

          2. Restart rsyslog.

            cumulus@switch:~$ sudo systemctl restart rsyslog.service
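
The ordering rule described in step 1 can be modeled with a short sketch. The Python snippet below is illustrative only: the function name files_processing_message is hypothetical, and it assumes that rsyslog reads the files in /etc/rsyslog.d/ in sorted filename order and that a matching stop ends processing for that message.

```python
def files_processing_message(conf_files, stop_file):
    """Return the rsyslog.d files that see a message, in processing order.

    conf_files: file names in /etc/rsyslog.d/.
    stop_file:  the file whose rule matches the message and ends with stop.
    """
    processed = []
    for name in sorted(conf_files):  # lower-numbered files run first
        processed.append(name)
        if name == stop_file:        # stop discards the message here
            break
    return processed


# An FRR message with the default files: the catch-all rule never sees it.
print(files_processing_message(["99-syslog.conf", "45-frr.conf"], "45-frr.conf"))

# With the forwarding rule in 11-remotesyslog.conf, the remote rule runs first.
print(files_processing_message(["11-remotesyslog.conf", "45-frr.conf"], "45-frr.conf"))
```

The second call shows why a low file number matters: the forwarding rule sees the FRR message before 45-frr.conf writes it to disk and stops processing.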
            

          Write to syslog with Management VRF Enabled

          You can write to syslog with management VRF enabled by applying the following configuration; the /etc/rsyslog.d/11-remotesyslog.conf file comments out this configuration.

          cumulus@switch:~$ cat /etc/rsyslog.d/11-remotesyslog.conf
          ## Copy all messages to the remote syslog server at 192.168.0.254 port 514
          action(type="omfwd" Target="192.168.0.254" Device="mgmt" Port="514" Protocol="udp")
          

          For each syslog server, configure a unique action line. For example, to configure two syslog servers at 192.168.0.254 and 10.0.0.1:

          cumulus@switch:~$ cat /etc/rsyslog.d/11-remotesyslog.conf
          ## Copy all messages to the remote syslog servers at 192.168.0.254 and 10.0.0.1 port 514
          action(type="omfwd" Target="192.168.0.254" Device="mgmt" Port="514" Protocol="udp")
          action(type="omfwd" Target="10.0.0.1" Device="mgmt" Port="514" Protocol="udp")
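
If you maintain many syslog servers, you might generate the action lines instead of typing them. The following Python sketch is one possible approach; the helper name omfwd_action and its default values are assumptions for illustration, and the output only reproduces the line format shown above.

```python
def omfwd_action(target, port=514, protocol="udp", device="mgmt"):
    """Build one rsyslog omfwd action line in the format shown above."""
    return (
        f'action(type="omfwd" Target="{target}" Device="{device}" '
        f'Port="{port}" Protocol="{protocol}")'
    )


# One action line per syslog server, as the text above requires.
servers = ["192.168.0.254", "10.0.0.1"]
for server in servers:
    print(omfwd_action(server))
```

You can redirect the printed lines into /etc/rsyslog.d/11-remotesyslog.conf and then validate the file before restarting rsyslog.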
          

          If you configure remote logging to use the TCP protocol, local logging might stop when the remote syslog server is unreachable. Also, if you configure remote logging to use the UDP protocol, local logging might stop if the UDP servers are unreachable because there are no routes available for the destination IP addresses.

          To avoid this behavior, configure a disk queue size and maximum retry count in your rsyslog configuration:

          action(type="omfwd" Target="172.28.240.15" Device="mgmt" Port="1720" Protocol="tcp" action.resumeRetryCount="100" queue.type="linkedList" queue.size="10000")
          
          action(type="omfwd" Target="172.28.240.15" Device="mgmt" Port="540" Protocol="udp" action.resumeRetryCount="100" queue.type="linkedList" queue.size="10000")
          

          Rate-limit syslog Messages

          If you want to limit the number of syslog messages that write to the syslog file from individual processes, add the following configuration to the /etc/rsyslog.conf file. Adjust the interval and burst values to rate-limit messages to the appropriate levels required by your environment. For more information, read the rsyslog documentation.

          module(load="imuxsock"
                SysSock.RateLimit.Interval="2" SysSock.RateLimit.Burst="50")
          

          The following test script shows an example of rate-limit output.

          Example test script
          root@leaf1:mgmt-vrf:/home/cumulus# cat ./syslog.py 
          #!/usr/bin/python3
          import syslog
          message_count = 100
          print("Sending %s Messages..." % message_count)
          for i in range(message_count):
              syslog.syslog("Message Number:%s" % i)
          print("DONE.")
          
          root@leaf1:mgmt-vrf:/home/cumulus# ./syslog.py
          Sending 100 Messages...
          DONE.
          
          root@leaf1:mgmt-vrf:/home/cumulus# tail -n 60 /var/log/syslog
          2017-02-22T19:59:50.043342+00:00 leaf1 syslog.py[22830]: Message Number:0
          2017-02-22T19:59:50.043723+00:00 leaf1 syslog.py[22830]: Message Number:1
          2017-02-22T19:59:50.043941+00:00 leaf1 syslog.py[22830]: Message Number:2
          2017-02-22T19:59:50.044565+00:00 leaf1 syslog.py[22830]: Message Number:3
          2017-02-22T19:59:50.044830+00:00 leaf1 syslog.py[22830]: Message Number:4
          2017-02-22T19:59:50.045680+00:00 leaf1 syslog.py[22830]: Message Number:5
          <...snip...>
          2017-02-22T19:59:50.056727+00:00 leaf1 syslog.py[22830]: Message Number:45
          2017-02-22T19:59:50.057599+00:00 leaf1 syslog.py[22830]: Message Number:46
          2017-02-22T19:59:50.057741+00:00 leaf1 syslog.py[22830]: Message Number:47
          2017-02-22T19:59:50.057936+00:00 leaf1 syslog.py[22830]: Message Number:48
          2017-02-22T19:59:50.058125+00:00 leaf1 syslog.py[22830]: Message Number:49
          2017-02-22T19:59:50.058324+00:00 leaf1 rsyslogd-2177: imuxsock[pid 22830]: begin to drop messages due to rate-limiting
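
To reason about interval and burst values before you change them, you can model the limit offline. The following Python sketch is a simplified model, not the rsyslog implementation: it assumes a fixed window that restarts when a message arrives after the interval expires, and the function name accepted_indexes is hypothetical.

```python
def accepted_indexes(timestamps, interval=2.0, burst=50):
    """Return the indexes of messages that pass a simple interval/burst limit.

    timestamps: send times in seconds, in ascending order.
    interval:   window length in seconds (SysSock.RateLimit.Interval).
    burst:      messages allowed per window (SysSock.RateLimit.Burst).
    """
    accepted = []
    window_start = None
    count = 0
    for index, now in enumerate(timestamps):
        # Start a new window on the first message or after the interval expires.
        if window_start is None or now - window_start >= interval:
            window_start = now
            count = 0
        if count < burst:
            count += 1
            accepted.append(index)
    return accepted


# 100 messages sent within a few milliseconds, as in the test script above.
sent = [i * 0.0002 for i in range(100)]
kept = accepted_indexes(sent)
print(len(kept), kept[-1])
```

With an interval of 2 and a burst of 50, the model keeps messages 0 through 49, which matches the point where the log output above starts to drop messages.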
          

          Harmless syslog Error: Failed to reset devices.list

          The following message logs to /var/log/syslog when you run systemctl daemon-reload and during system boot:

          systemd[1]: Failed to reset devices.list on /system.slice: Invalid argument
          

          This message is harmless; you can ignore it. It logs when systemd attempts to change read-only group attributes. Cumulus Linux modifies the upstream version of systemd to not log this message by default.

          The systemctl daemon-reload command runs when you install Debian packages. You see the message multiple times when upgrading packages.

          Troubleshoot syslog

          You can use the following commands to troubleshoot syslog issues.

          Verify that rsyslog is Running

          To verify that the rsyslog service is running, use the sudo systemctl status rsyslog.service command:

          cumulus@leaf01:mgmt-vrf:~$ sudo systemctl status rsyslog.service
          rsyslog.service - System Logging Service
            Loaded: loaded (/lib/systemd/system/rsyslog.service; enabled)
            Active: active (running) since Sat 2017-12-09 00:48:58 UTC; 7min ago
              Docs: man:rsyslogd(8)
                    http://www.rsyslog.com/doc/
          Main PID: 11751 (rsyslogd)
             Tasks: 4 (limit: 2032)
               Memory: 1.1M
                  CPU: 20ms
               CGroup: /system.slice/rsyslog.service
                       └─8587 /usr/sbin/rsyslogd -n -iNONE
          
          Dec 09 00:48:58 leaf01 systemd[1]: Started System Logging Service.
          

          Verify your rsyslog Configuration

          After making manual changes to any files in the /etc/rsyslog.d directory, use the sudo rsyslogd -N1 command to identify any errors in the configuration files that prevent the rsyslog service from starting.

          In the following example, a closing parenthesis is missing in the 11-remotesyslog.conf file, which configures syslog for management VRF:

          cumulus@leaf01:mgmt-vrf:~$ cat /etc/rsyslog.d/11-remotesyslog.conf
          action(type="omfwd" Target="192.168.0.254" Device="mgmt" Port="514" Protocol="udp"
          
          cumulus@leaf01:mgmt-vrf:~$ sudo rsyslogd -N1
          rsyslogd: version 8.4.2, config validation run (level 1), master config /etc/rsyslog.conf
          rsyslogd: error during parsing file /etc/rsyslog.d/15-crit.conf, on or before line 3: invalid character '$' in object definition - is there an invalid escape sequence somewhere? [try http://www.rsyslog.com/e/2207 ]
          rsyslogd: error during parsing file /etc/rsyslog.d/15-crit.conf, on or before line 3: syntax error on token 'crit_log' [try http://www.rsyslog.com/e/2207 ]
          

          After correcting the invalid syntax, issuing the sudo rsyslogd -N1 command produces the following output.

          cumulus@leaf01:mgmt-vrf:~$ cat /etc/rsyslog.d/11-remotesyslog.conf
          action(type="omfwd" Target="192.168.0.254" Device="mgmt" Port="514" Protocol="udp")
          cumulus@leaf01:mgmt-vrf:~$ sudo rsyslogd -N1
          rsyslogd: version 8.4.2, config validation run (level 1), master config /etc/rsyslog.conf
          rsyslogd: End of config validation run. Bye.
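
If you validate configuration from a script, you can scan the rsyslogd -N1 output for parse errors. The following Python sketch assumes the error wording shown in the example above; the function name config_errors is hypothetical.

```python
import re

# Matches the "error during parsing file <path>, on or before line <n>" wording.
ERROR_RE = re.compile(r"error during parsing file (\S+), on or before line (\d+)")


def config_errors(validation_output):
    """Return unique (file, line) pairs reported by rsyslogd -N1, in order."""
    found = []
    for match in ERROR_RE.finditer(validation_output):
        entry = (match.group(1), int(match.group(2)))
        if entry not in found:
            found.append(entry)
    return found


sample = (
    "rsyslogd: error during parsing file /etc/rsyslog.d/15-crit.conf, "
    "on or before line 3: syntax error on token 'crit_log'\n"
)
print(config_errors(sample))
```

As the example above shows, rsyslogd can report the error in the file that follows the one with the actual mistake (here, 15-crit.conf instead of 11-remotesyslog.conf), so also check the file that sorts immediately before the reported one.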
          

          tcpdump

          If a syslog server is not accessible to validate that syslog messages are exporting, you can use tcpdump.

          In the following example, a syslog server uses 192.168.0.254 for UDP syslog messages on port 514:

          cumulus@leaf01:mgmt-vrf:~$ sudo tcpdump -i eth0 host 192.168.0.254 and udp port 514
          

          To generate syslog messages, run a command with sudo in another session, such as sudo date. Using sudo generates an authpriv log.

          cumulus@leaf01:mgmt-vrf:~$ sudo tcpdump -i eth0 host 192.168.0.254 and udp port 514
          tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
          listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
          00:57:15.356836 IP leaf01.lab.local.33875 > 192.168.0.254.syslog: SYSLOG authpriv.notice, length: 105
          00:57:15.364346 IP leaf01.lab.local.33875 > 192.168.0.254.syslog: SYSLOG authpriv.info, length: 103
          00:57:15.369476 IP leaf01.lab.local.33875 > 192.168.0.254.syslog: SYSLOG authpriv.info, length: 85
          

          To see the contents of the syslog messages, use the tcpdump -X option:

          cumulus@leaf01:mgmt-vrf:~$ sudo tcpdump -i eth0 host 192.168.0.254 and udp port 514 -X -c 3
          tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
          listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
          00:59:15.980048 IP leaf01.lab.local.33875 > 192.168.0.254.syslog: SYSLOG authpriv.notice, length: 105
          0x0000: 4500 0085 33ee 4000 4011 8420 c0a8 000b E...3.@.@.......
          0x0010: c0a8 00fe 8453 0202 0071 9d18 3c38 353e .....S...q..<85>
          0x0020: 4465 6320 2039 2030 303a 3539 3a31 3520 Dec..9.00:59:15.
          0x0030: 6c65 6166 3031 2073 7564 6f3a 2020 6375 leaf01.sudo:..cu
          0x0040: 6d75 6c75 7320 3a20 5454 593d 7074 732f mulus.:.TTY=pts/
          0x0050: 3120 3b20 5057 443d 2f68 6f6d 652f 6375 1.;.PWD=/home/cu
          0x0060: 6d75 6c75 7320 3b20 5553 4552 3d72 6f6f mulus.;.USER=roo
          0x0070: 7420 3b20 434f 4d4d 414e 443d 2f62 696e t.;.COMMAND=/bin
          0x0080: 2f64 6174 65 /date
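
The <85> at the start of the payload is the syslog priority value, which encodes the facility and severity as facility * 8 + severity. The following Python sketch decodes it; the function name decode_pri is hypothetical, and the name tables list only the standard facility and severity keywords.

```python
# Standard syslog facility names for codes 0 through 11; 16 to 23 are local0 to local7.
FACILITIES = [
    "kern", "user", "mail", "daemon", "auth", "syslog",
    "lpr", "news", "uucp", "cron", "authpriv", "ftp",
]
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def decode_pri(pri):
    """Split a syslog priority value into (facility, severity) names."""
    facility, severity = divmod(pri, 8)
    if 16 <= facility <= 23:
        name = f"local{facility - 16}"
    elif facility < len(FACILITIES):
        name = FACILITIES[facility]
    else:
        name = str(facility)  # codes without a common keyword stay numeric
    return name, SEVERITIES[severity]


print(decode_pri(85))
```

A value of 85 decodes to facility 10 (authpriv) and severity 5 (notice), which agrees with the SYSLOG authpriv.notice label that tcpdump prints for the same packet.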
          

          Monitoring System Hardware

          You can monitor system hardware with the following commands and utilities:

          NVUE Commands

          You can run NVUE commands to monitor your system hardware.

          Command Description
          nv show system health Shows information about the health of the switch and describes any issues.
          nv show platform Shows platform hardware information on the switch, such as the model and manufacturer, memory, serial number and system MAC address.
          nv show platform environment fan Shows information about the fans on the switch, such as the minimum, maximum and current speed, the fan state, and the fan direction.
          nv show platform environment led Shows information about the LEDs on the switch, such as the LED name and color.
          nv show platform environment psu Shows information about the PSUs on the switch, such as the PSU name and state.
          nv show platform environment temperature Shows information about the sensors on the switch, such as the critical, maximum, minimum and current temperature and the current state of the sensor.
          nv show platform environment voltage Shows the list of voltage sensors on the switch.
          nv show platform inventory Shows the switch inventory, which includes fan and PSU hardware version, model, serial number, state, and type. For information about a specific fan or PSU, run the nv show platform inventory <inventory-name> command.

          The following example shows the nv show platform command output:

          cumulus@switch:~$ nv show platform
                         operational      
          -------------  -----------------
          system-mac     44:38:39:22:01:b1                      
          manufacturer   Cumulus                                
          product-name   VX                                     
          cpu            x86_64 QEMU Virtual CPU version 2.5+ x1
          memory         1756460 kB                             
          disk-size      n/a                                    
          port-layout    n/a                                    
          part-number    5.12                                 
          serial-number  44:38:39:22:01:7a                      
          asic-model     n/a                                    
          system-uuid    e928ee83-20f7-4515-bfab-c204db3e604c
          

          The following example shows the nv show platform environment fan command output. The airflow direction must be the same for all fans. If Cumulus Linux detects that the fan airflow direction is not uniform, it logs a message in the /var/log/syslog file.

          cumulus@switch:~$ nv show platform environment fan
          Name      Fan State  Current Speed (RPM)  Max Speed  Min Speed  Fan Direction
          --------  ---------  -------------------  ---------  ---------  -------------
          FAN1/1    ok         6000                 29000      2500       F2B         
          FAN1/2    ok         6000                 29000      2500       F2B         
          FAN2/1    ok         6000                 29000      2500       F2B         
          FAN2/2    ok         6000                 29000      2500       F2B         
          FAN3/1    ok         6000                 29000      2500       F2B         
          FAN3/2    ok         6000                 29000      2500       F2B         
          PSU1/FAN  ok         6000                 29000      2500       F2B         
          PSU2/FAN  ok         6000                 29000      2500       F2B   
          

          If the airflow direction is not the same for all fans (front to back or back to front), cooling is suboptimal for the switch, rack, and even the entire data center.
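
To check airflow uniformity from a script, you can parse the table that nv show platform environment fan prints. The following Python sketch assumes the six-column layout shown above; the function names are hypothetical, and the B2F value in the usage note is an assumed label for back-to-front airflow.

```python
def fan_directions(fan_table):
    """Map each fan name to its direction from the table text shown above."""
    directions = {}
    for line in fan_table.splitlines():
        fields = line.split()
        # Data rows have six columns and a numeric current speed;
        # the header and separator rows do not match and are skipped.
        if len(fields) == 6 and fields[2].isdigit():
            directions[fields[0]] = fields[5]
    return directions


def airflow_is_uniform(fan_table):
    """Return True when every fan reports the same airflow direction."""
    return len(set(fan_directions(fan_table).values())) <= 1


sample = """\
Name      Fan State  Current Speed (RPM)  Max Speed  Min Speed  Fan Direction
--------  ---------  -------------------  ---------  ---------  -------------
FAN1/1    ok         6000                 29000      2500       F2B
PSU1/FAN  ok         6000                 29000      2500       F2B
"""
print(airflow_is_uniform(sample))
```

If one row reported a different direction, such as B2F, airflow_is_uniform would return False.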

          decode-syseeprom Command

          Use the decode-syseeprom command to retrieve information about the switch EEPROM. If the EEPROM is writable, you can set values on the EEPROM.

          The following is example decode-syseeprom command output. The output is different on different switches:

          cumulus@switch:~$ decode-syseeprom
          TlvInfo Header:
             Id String:    TlvInfo
             Version:      1
             Total Length: 69
          TLV Name             Code Len Value
          -------------------- ---- --- -----
          Vendor Name          0x2D  16 Cumulus Networks
          Product Name         0x21   2 VX
          Device Version       0x26   1 3
          Part Number          0x22   5 5.12
          MAC Addresses        0x2A   2 55
          Base MAC Address     0x24   6 44:38:39:22:01:7A
          Serial Number        0x23  17 44:38:39:22:01:7a
          CRC-32               0xFE   4 0xF305A73F
          (checksum valid)
          

          The decode-syseeprom command includes the following options:

          Option Description
          -h, --help Displays the help message and exits.
          -a Prints the base MAC address for switch interfaces.
          -r Prints the number of MAC addresses allocated for the switch interfaces.
          -s Sets the EEPROM content (if the EEPROM is writable). You can provide arguments in the command line in a comma separated list in the form <field>=<value>.
          • . , and = are not allowed in field names and values.
          • Any field not specified defaults to the current value.

          NVIDIA Spectrum switches do not support this option.
          -j, --json Displays JSON output.
          -t <target> Prints the target EEPROM information (board, psu2, psu1).
          --serial, -e Prints the device serial number.
          -m Prints the base MAC address for the management interfaces.
          --init Clears and initializes the board EEPROM cache.

          Run the sudo dmidecode command to retrieve hardware configuration information populated in the BIOS.

          smond

          The smond service monitors system units like power supply and fan, updates the corresponding LEDs, and logs the change in state. The CPLD registers detect changes in system unit state. smond reads these registers to determine the health of each unit and update the system LEDs.

          Run the sudo smonctl command to display sensor information for the various system units:

          cumulus@switch:~$ sudo smonctl
          Fan1      (Fan Tray 1, Fan 1                     ):  OK
          Fan2      (Fan Tray 1, Fan 2                     ):  OK
          Fan3      (Fan Tray 2, Fan 1                     ):  OK
          Fan4      (Fan Tray 2, Fan 2                     ):  OK
          Fan5      (Fan Tray 3, Fan 1                     ):  OK
          Fan6      (Fan Tray 3, Fan 2                     ):  OK
          PSU1                                              :  OK
          PSU2                                              :  OK
          PSU1Fan1  (PSU1 Fan                              ):  OK
          PSU1Temp1 (PSU1 Temp Sensor                      ):  OK
          PSU2Fan1  (PSU2 Fan                              ):  OK
          PSU2Temp1 (PSU2 Temp Sensor                      ):  OK
          Temp1     (Board Sensor near CPU                 ):  OK
          Temp2     (Board Sensor Near Virtual Switch      ):  OK
          Temp3     (Board Sensor at Front Left Corner     ):  OK
          Temp4     (Board Sensor at Front Right Corner    ):  OK
          Temp5     (Board Sensor near Fan                 ):  OK
          

          When the switch is not powered on, smonctl shows the PSU status as BAD instead of POWERED OFF or NOT DETECTED. This is a known limitation.
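
To flag unhealthy units automatically, you can filter the smonctl output for any status other than OK. The following Python sketch assumes the name, optional description, colon, and status layout shown above; the function name failing_units is hypothetical.

```python
def failing_units(smonctl_output):
    """Return {unit name: status} for every unit whose status is not OK."""
    failing = {}
    for line in smonctl_output.splitlines():
        # The status is the text after the last colon on the line.
        left, separator, status = line.rpartition(":")
        if not separator or not left.strip():
            continue  # skip blank or unrecognized lines
        name = left.split()[0]
        status = status.strip()
        if status != "OK":
            failing[name] = status
    return failing


sample = """\
Fan1      (Fan Tray 1, Fan 1                     ):  OK
PSU1                                              :  OK
PSU2                                              :  BAD
"""
print(failing_units(sample))
```

Because of the limitation described above, a BAD status for a PSU can also mean that the unit has no power, so confirm the physical state before replacing hardware.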

          The smonctl command includes the following options:

          Option Description
          -s <sensor>, --sensor <sensor> Displays data for the specified sensor.
          -v, --verbose Displays detailed hardware sensors data.

          The following command example shows information about FAN6 on the switch:

          cumulus@switch:~$ smonctl -s FAN6 -v
          Fan6      (Fan Tray 3, Fan 2                     ):  OK
          

          For more information, read man smond and man smonctl.

          sensors Command

          Run the sensors command to monitor the health of your switch hardware, such as power, temperature and fan speeds. This command executes lm-sensors.

          Even though you can use the sensors command to monitor the health of your switch hardware, NVIDIA recommends you use the smond daemon to monitor hardware health. See smond Daemon above.

          For example:

          cumulus@switch:~$ sensors
          cumulus_vx_cpld-isa-0000
          Adapter: ISA adapter
          fan1:        6000 RPM  (min = 2500 RPM, max = 29000 RPM)
          fan2:        6000 RPM  (min = 2500 RPM, max = 29000 RPM)
          fan3:        6000 RPM  (min = 2500 RPM, max = 29000 RPM)
          fan4:        6000 RPM  (min = 2500 RPM, max = 29000 RPM)
          fan5:        6000 RPM  (min = 2500 RPM, max = 29000 RPM)
          fan6:        6000 RPM  (min = 2500 RPM, max = 29000 RPM)
          fan7:        6000 RPM  (min = 2500 RPM, max = 29000 RPM)
          fan8:        6000 RPM  (min = 2500 RPM, max = 29000 RPM)
          temp1:        +25.0°C  (low  =  +5.0°C, high = +80.0°C)
                                 (crit low =  +0.0°C, crit = +85.0°C)
          temp2:        +25.0°C  (low  =  +5.0°C, high = +80.0°C)
                                 (crit low =  +0.0°C, crit = +85.0°C)
          temp3:        +25.0°C  (low  =  +5.0°C, high = +80.0°C)
                                 (crit low =  +0.0°C, crit = +85.0°C)
          temp4:        +25.0°C  (low  =  +5.0°C, high = +80.0°C)
                                 (crit low =  +0.0°C, crit = +85.0°C)
          temp5:        +25.0°C  (low  =  +5.0°C, high = +80.0°C)
                                 (crit low =  +0.0°C, crit = +85.0°C)
          temp6:        +25.0°C  (low  =  +5.0°C, high = +80.0°C)
                                 (crit low =  +0.0°C, crit = +85.0°C)
          temp7:        +25.0°C  (low  =  +5.0°C, high = +80.0°C)
                                 (crit low =  +0.0°C, crit = +85.0°C)
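
To compare readings against their thresholds in a script, you can parse the temperature lines of the sensors output. The following Python sketch assumes the line format shown above; the function name temps_out_of_range is hypothetical.

```python
import re

# Matches lines such as: temp1:  +25.0°C  (low  =  +5.0°C, high = +80.0°C)
TEMP_RE = re.compile(
    r"^\s*(temp\d+):\s+([+-]?[\d.]+)°C\s+"
    r"\(low\s+=\s+([+-]?[\d.]+)°C,\s+high\s+=\s+([+-]?[\d.]+)°C\)",
    re.MULTILINE,
)


def temps_out_of_range(sensors_output):
    """Return {sensor: reading} for readings below low or above high."""
    flagged = {}
    for name, current, low, high in TEMP_RE.findall(sensors_output):
        reading = float(current)
        if reading < float(low) or reading > float(high):
            flagged[name] = reading
    return flagged


sample = """\
temp1:        +25.0°C  (low  =  +5.0°C, high = +80.0°C)
                       (crit low =  +0.0°C, crit = +85.0°C)
temp2:        +82.5°C  (low  =  +5.0°C, high = +80.0°C)
                       (crit low =  +0.0°C, crit = +85.0°C)
"""
print(temps_out_of_range(sample))
```

The sample reading for temp2 is invented to show a value above the high threshold; the sketch ignores the critical thresholds on the continuation lines.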
          

          The following table shows the sensors command options.

          Option Description
          -c --config-file Specify a configuration file; use - after -c to read the configuration file from stdin; by default, sensors references the configuration file in /etc/sensors.d/.
          -s --set Execute set statements in the configuration file (root only); sensors -s runs one time at boot and applies all the settings to the boot drivers.
          -f --fahrenheit Show temperatures in degrees Fahrenheit.
          -A --no-adapter Do not show the adapter for each chip.
          --bus-list Generate bus statements for sensors.conf.
          -u Generate raw output.
          -j Generate json output.
          -v Show the program version.

          Hardware Watchdog

          Cumulus Linux includes a simplified version of the wd_keepalive(8) daemon instead of the one in the standard watchdog Debian package. wd_keepalive writes to a file called /dev/watchdog periodically (at least one time per minute) to prevent the switch from resetting. Each write delays the reboot time by another minute. After one minute of inactivity, where wd_keepalive does not write to /dev/watchdog, the switch resets itself.
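
The keepalive behavior can be illustrated with a small model. The following Python sketch is a simplification that assumes the one-minute timeout described above and looks only at the gaps between recorded writes; the function name first_reset_time is hypothetical.

```python
def first_reset_time(write_times, timeout=60.0):
    """Return when the switch would reset, or None if each write came in time.

    write_times: ascending times (in seconds) of writes to /dev/watchdog.
    timeout:     seconds of inactivity after which the switch resets.
    """
    previous = None
    for now in write_times:
        # A gap longer than the timeout means the reset fired before this write.
        if previous is not None and now - previous > timeout:
            return previous + timeout
        previous = now
    return None


print(first_reset_time([0, 30, 60, 90]))
print(first_reset_time([0, 30, 100]))
```

In the second call, the gap between the writes at 30 and 100 seconds exceeds one minute, so the model reports a reset 60 seconds after the write at 30 seconds.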

          Cumulus Linux enables the watchdog by default, which starts when you boot the switch (before switchd starts).

          To disable the watchdog, disable and stop the wd_keepalive service:

          cumulus@switch:~$ sudo systemctl disable wd_keepalive ; sudo systemctl stop wd_keepalive
          

          You can modify the settings for the watchdog, such as the timeout and the scheduler priority, in the /etc/watchdog.conf configuration file.

          cumulus@switch:~$ sudo nano /etc/watchdog.conf
          watchdog-device	= /dev/watchdog
          # Set the hardware watchdog timeout in seconds
          watchdog-timeout = 30
          # Kick the hardware watchdog every 'interval' seconds
          interval = 5
          # Log a status message every (interval * logtick) seconds.  Requires
          # --verbose option to enable.
          logtick = 240
          # Run the daemon using default scheduler SCHED_OTHER with slightly
          # elevated process priority.  See man setpriority(2).
          realtime = no
          priority = -2
          

          Network Switch Port LED and Status LED Guidelines

          Data centers today have a large number of network switches manufactured by different hardware vendors running network operating systems from different providers. This section provides a set of guidelines for how network port and status LEDs appear on the front panel of a network switch.

          Network Port LEDs

          A network port LED indicates the state of the link, such as link UP or transmit and receive activity. Here are the requirements for these LEDs:

          | Activity            | Max Speed Indication | Lower Speed Indication |
          | ------------------- | -------------------- | ---------------------- |
          | Physical Link Down  | Off                  | Off                    |
          | Physical Link UP    | Solid Green          | Solid Amber            |
          | Link Tx/Rx Activity | Blinking Green       | Blinking Amber         |
          | Beaconing           | Slow Blinking Amber  | Slow Blinking Amber    |
          | Fault               | Slow Blinking Amber  | Slow Blinking Amber    |
          

          Status LEDs

          One side of a network switch has a set of status LEDs. The status LEDs provide a visual indication of what is physically wrong with the network switch. Typical LEDs on the front panel are for PSUs (power supply units), fans, and system. Locator LEDs are also on the front panel of a switch. Each component that has an LED is a unit.

          Understanding the cl-support Output File

          The cl-support script generates a compressed archive file of useful information for troubleshooting. The system either creates the file automatically or you can create the file manually.

          Automatic cl-support File

          The system creates the cl-support file automatically:

          Manual cl-support File

          To create the cl-support file manually, run the nv action generate system tech-support command:

          cumulus@switch:~$ nv action generate system tech-support
          Action executing ...
          Generating system tech-support file, it might take a few minutes...
          Action executing ...
          Generated tech-support
          Action succeeded
          

          Cumulus Linux saves the cl-support file in the /var/support directory. The file name starts with cl_support and ends with the date and time of creation.
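
To sort or filter archives by creation time, you can parse the file name. The following Python sketch assumes the <prefix>_support_<hostname>_<YYYYMMDD>_<HHMMSS>.txz pattern inferred from the example file names in this section; the function name parse_support_name is hypothetical.

```python
import re
from datetime import datetime

# Pattern inferred from names such as cl_support_leaf01_20240214_183635.txz.
NAME_RE = re.compile(
    r"^(?P<prefix>.+?)_support_(?P<host>.+)_(?P<stamp>\d{8}_\d{6})\.txz$"
)


def parse_support_name(file_name):
    """Return (prefix, hostname, creation time) from a cl-support file name."""
    match = NAME_RE.match(file_name)
    if match is None:
        raise ValueError(f"unexpected cl-support file name: {file_name}")
    created = datetime.strptime(match.group("stamp"), "%Y%m%d_%H%M%S")
    return match.group("prefix"), match.group("host"), created


print(parse_support_name("cl_support_leaf01_20240214_183635.txz"))
```

The prefix is cl by default; the sketch keeps it as a separate field because the file name can start with a different prefix.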

          The Linux command to generate the cl-support file includes more options; for example, you can include security sensitive information, include debugging information, only run certain modules, and provide a reason for running the script in the file.

          To create the cl-support file manually, run the cl-support command:

          cumulus@switch:~$ sudo cl-support
          

          If the Cumulus Linux support team requests that you submit the output from cl-support to investigate issues you experience, and you need to include security-sensitive information, such as the sudoers file, use the -s option:

          cumulus@switch:~$ sudo cl-support -s
          

          cl-support Script Options

          Option Description
          -h: Displays the available cl-support script options with a description.
          -c: Runs only modules matching core files (if no -e modules).
          -D: Displays debugging information.
          -d: Does not run modules in the provided comma separated list.
          -e: Only runs modules in the provided comma separated list. -e all runs all modules and submodules, including all optional modules.
          -j: Creates json output files for modules, where supported.
          -l: Lists the available modules, then exits.
          -M: Does not set a timeout for modules. Use this option with -T.
          -m: Runs modules serially and sets the module memory limit in MB; -m 0 runs serially without limits.
          -p: Adds a prefix to the cl-support archive file name.
          -r: Provides the reason for running the cl-support script. You must enclose the reason in quotes.
          -S: Uses a different output directory than the default /var/support.
          -s: Includes security sensitive information, such as the sudoers file.
          -T: Sets the timeout in seconds for creating the cl-support file. 0 disables the timeout.
          -t: Provides a tag string as part of the cl-support file name.
          -v: Runs in verbose mode to display status messages.

          cl-support Examples

          The following example runs the cl-support script but skips the ptp4l.ptp4l and what-just-happened.wjh modules.

          cumulus@switch:~$ sudo cl-support -d ptp4l.ptp4l,what-just-happened.wjh
          cl-support: cl-support is running without memory limits
          Please send /var/support/cl_support_leaf01_20240214_183635.txz to Cumulus support.
          

          The following example runs the cl-support script and displays debugging information:

          cumulus@switch:~$ sudo cl-support -D
          DEBUG: Memory headroom set as 256MB
          DEBUG: Available memory 576MB
          DEBUG: Allowed memory consumption calculated at 320MB
          DEBUG: Using calculated memory limit
          DEBUG: Last parallel mode archive creation used 4MB
          DEBUG: /usr/bin/systemd-run -q -P -G -p MemoryMax=320M /usr/bin/time -v -o /tmp/tmp.f8L5l6odWn /usr/lib/cumulus/cl-support -D
          DEBUG: run_timeout 90 synced
          ...
          

          The following example runs the cl-support script, lists available modules, then exits.

          cumulus@switch:~$ sudo cl-support -l
          Default modules: synced.synced ptp4l.ptp4l what-just-happened.wjh
             gdb.coreinfo openvswitch.dump ptmd.ptm switchd.mlx switchd.stack
             switchd.fuse clag.clag network.kernel network.ifquery network.sfp
             network.sfphex network.net_use network.ifupdown2_policy dot1x.config
             system.versions system.logs system.systemd system.dmesg system.hwinfo
             system.memory_use system.configs system.pkg system.misc system.uefi
             system.time frr.frr neighmgr.neighmgr nvue.config lldp.lldp
          Optional modules: switchd.verbose clag.clagkerneldB system.pkgverify
             frr.ospftable frr.ospf6table frr.evpntable frr.bgptable nclu.config
          

          The following example adds a prefix to the generated cl-support file name:

          cumulus@switch:~$ sudo cl-support -p myprefix
          Please send /var/support/myprefix_support_leaf01_20240214_184135.txz to Cumulus support.
          

          The following example provides the reason for running the cl-support script:

          cumulus@switch:~$ sudo cl-support -r "switchd crash"
          Please send /var/support/cl_support_leaf01_20240214_184806.txz to Cumulus support.
          

          Delete cl-support Files

          To delete a cl-support file from the switch, run the NVUE nv action delete system tech-support files <file-name> command. You can also use the Linux sudo rm /var/support/<file-name> command.

          cumulus@switch:~$ nv action delete system tech-support files /var/support/cl_support_leaf01_20240725_221237.txz
          Action executing ...
          File Delete Succeeded
          Action succeeded
          

          Show cl-support Files

          To show the cl-support files on the switch, run the nv show system tech-support files command. You can also run the Linux ls command on the /var/support directory (ls /var/support).

          cumulus@switch:~$ nv show system tech-support files
          File name                              File path                                         
          -------------------------------------  --------------------------------------------------
          cl_support_leaf01_20240725_225811.txz  /var/support/cl_support_leaf01_20240725_225811.txz
          

          Upload cl-support Files

          To upload a cl-support file off the switch to an external location, run the nv action upload system tech-support files <file-name> <remote-url> command.

          For information on the directories included in the cl-support archive, see:

          Troubleshooting Log Files

          The only log file that is unique to Cumulus Linux, compared to other Linux distributions, is switchd.log, which logs the HAL (hardware abstraction layer) activity from the hardware.

          Read this guide on NixCraft to understand how /var/log works.

          Log File Descriptions

          Log                              Description
          -------------------------------  -----------
          /var/log/apt                     Information from the apt utility. For example, from apt-get install and apt-get remove.
          /var/log/audit/*                 Information stored by the Linux audit daemon, auditd.
          /var/log/autoprovision           Output generated by running the zero touch provisioning (ZTP) script.
          /var/log/boot.log                Information that the system logs when the switch boots.
          /var/log/btmp                    Information about failed login attempts. Use the last command to view the btmp file. For example, last -f /var/log/btmp | more
          /var/log/cl-system-services.log  Information about system services, such as switchd.
          /var/log/crit.log                Log messages with a critical severity level.
          csmgrd.log                       ISSU errors and information.
          /var/log/dpkg.log                Information that the system logs when you install or remove a package with the dpkg command.
          /var/log/frr/*                   Information from FRR that helps you troubleshoot routing, such as an MD5 or MTU mismatch with OSPF.
          ifupdown2                        Information that the system logs from the network interface manager (ifupdown2).
          /var/log/installer/*             Information about Cumulus Linux installation.
          /var/log/lastlog                 Information about the last login of each user. Use the lastlog command to format and print the contents.
          /var/log/lttng-traces            Information about LTTng sessions.
          mstpd                            Spanning Tree Protocol service errors and information.
          /var/log/netqd.log               Information about the NetQ agent.
          /var/log/nginx                   Errors and processed requests in NGINX.
          /var/log/nv-cli.log              Information about the NVUE CLI.
          /var/log/ntpstats                Logs for the Network Time Protocol (NTP).
          /var/log/nvued.log               Log file for NVUE.
          /var/log/ptmd                    Prescriptive Topology Manager (PTM) errors and information.
          /var/log/switchd.log             The HAL log, which is specific to Cumulus Linux. The system logs switchd crashes here.
          /var/log/syslog                  The main system log, which logs everything except auth-related messages. This is the primary log; grep this file to see what problem occurred.
          /var/log/wtmp                    Login records file.
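
When you search a large log such as /var/log/syslog, a small script can stand in for grep and report line numbers. The following Python sketch is one way to do that; `find_log_lines` is a hypothetical helper, and the sample lines are invented placeholders, not real switch output.

```python
import re


def find_log_lines(lines, pattern, ignore_case=True):
    """Yield (line_number, line) for each line that matches the regular expression."""
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    for number, line in enumerate(lines, start=1):
        if regex.search(line):
            yield number, line.rstrip("\n")


# Placeholder lines for illustration only; real entries depend on your system.
sample = [
    "service-a: started\n",
    "service-b: ERROR example failure\n",
    "service-c: recovered from error\n",
]

for number, line in find_log_lines(sample, r"error"):
    print(number, line)
```

To scan a real file, pass an open file object instead of the list, for example `find_log_lines(open("/var/log/syslog"), r"switchd")`. Reading /var/log/syslog typically requires elevated privileges.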

          Troubleshooting the etc Directory

          The cl-support script replicates the /etc directory but excludes certain files, such as /etc/nologin, which prevents unprivileged users from logging into the system.

          The following shows example output from ls -l on the /etc directory structure, which cl-support creates.

          cumulus@leaf02:mgmt:~$ ls -l /etc
          total 1040
          drwxr-xr-x 3 root root    4096 Apr 26 10:08 acpi
          -rw-r--r-- 1 root root    3040 May 25  2023 adduser.conf
          drwxr-xr-x 2 root root    4096 Apr 26 11:21 alternatives
          drwxr-xr-x 3 root root    4096 Apr 26 10:04 apparmor
          drwxr-xr-x 6 root root    4096 Apr 26 10:09 apparmor.d
          drwxr-xr-x 8 root root    4096 Apr 26 11:26 apt
          drwxr-x--- 4 root root    4096 Apr 26 10:09 audit
          -rw-r--r-- 1 root root    2119 Apr 26 10:08 bash.bashrc
          -rw-r--r-- 1 root root      45 Jan 24  2020 bash_completion
          drwxr-xr-x 2 root root    4096 Apr 26 10:10 bash_completion.d
          -rw-r--r-- 1 root root     367 Apr 10 07:01 bindresvport.blacklist
          drwxr-xr-x 2 root root    4096 Jan 26 21:48 binfmt.d
          drwxr-xr-x 3 root root    4096 Apr 26 10:05 ca-certificates
          -rw-r--r-- 1 root root    5989 Apr 26 10:08 ca-certificates.conf
          drwxr-xr-x 2 root root    4096 Apr 26 10:08 console-setup
          drwxr-xr-x 2 root root    4096 Apr 26 10:08 containerd
          drwxr-xr-x 2 root root    4096 Apr 26 10:08 cracklib
          drwxr-xr-x 2 root root    4096 Apr 26 10:09 cron.d
          drwxr-xr-x 2 root root    4096 Apr 26 10:08 cron.daily
          drwxr-xr-x 2 root root    4096 Apr 26 10:03 cron.hourly
          drwxr-xr-x 2 root root    4096 Apr 26 10:08 cron.monthly
          -rw-r--r-- 1 root root    1042 Mar  2  2023 crontab
          drwxr-xr-x 2 root root    4096 Apr 26 10:08 cron.weekly
          drwxr-xr-x 2 root root    4096 Apr 26 10:03 cron.yearly
          drwxr-xr-x 7 root root    4096 Apr 27 21:43 cumulus
          -rw-r--r-- 1 root root       0 Apr 26 16:07 cumulus-firstboot-after-networking-done
          -rw-r--r-- 1 root root       0 Apr 26 11:22 cumulus-firstboot-done
          drwxr-xr-x 4 root root    4096 Apr 26 10:04 dbus-1
          -rw-r--r-- 1 root root    2969 Jan  8  2023 debconf.conf
          -rw-r--r-- 1 root root       5 Jan 28 21:20 debian_version
          drwxr-xr-x 4 root root    4096 Apr 26 11:25 default
          -rw-r--r-- 1 root root    1706 May 25  2023 deluser.conf
          drwxr-xr-x 4 root root    4096 Apr 26 10:08 dhcp
          drwxr-xr-x 3 root root    4096 Apr 26 10:08 dhcpsnoop
          drwxr-xr-x 2 root root    4096 Apr 26 10:08 discover.conf.d
          -rw-r--r-- 1 root root     346 Jul 16  2005 discover-modprobe.conf
          -rw-r--r-- 1 root root   27885 Jan 13  2023 dnsmasq.conf
          drwxr-xr-x 2 root root    4096 Apr 26 10:06 dnsmasq.d
          drwxr-xr-x 2 root root    4096 Apr 26 10:09 docker
          drwxr-xr-x 4 root root    4096 Apr 26 10:03 dpkg
          -rw-r--r-- 1 root root      85 Apr 19 12:05 e2fsck.conf
          -rw-r--r-- 1 root root     685 Mar  5  2023 e2scrub.conf
          -rw-r--r-- 1 root root       0 Apr 26 10:03 environment
          drwxr-xr-x 3 root root    4096 Apr 26 10:04 etc
          -rw-r--r-- 1 root root    1853 Oct 17  2022 ethertypes
          drwxr-xr-x 2 root root    4096 Apr 26 11:24 firefly_servo
          drwxr-xr-x 4 root root    4096 Apr 26 10:08 fonts
          drwxr-xr-x 2 root root    4096 Apr 26 10:08 freeipmi
          drwxr-x--- 2 frr  frr     4096 Apr 27 21:43 frr
          -rw------- 1 root root     471 Apr 26 11:21 fstab
          -rw-r--r-- 1 root root    2584 Jul 29  2022 gai.conf
          -rw-r--r-- 1 root root    3886 Jan 14  2023 gprofng.rc
          drwxr-xr-x 2 root root    4096 Apr 26 10:03 groff
          -rw-r--r-- 1 root root     852 Apr 26 11:24 group
          -rw-r--r-- 1 root root     894 Apr 26 10:10 group-
          drwxr-xr-x 2 root root    4096 Apr 26 11:21 grub.d
          -rw-r----- 1 root shadow   705 Apr 26 11:24 gshadow
          -rw-r----- 1 root shadow   747 Apr 26 10:10 gshadow-
          drwxr-xr-x 3 root root    4096 Apr 26 10:03 gss
          -rw-r--r-- 1 root root    4436 Oct  6  2022 hdparm.conf
          drwxr-xr-x 2 root root    4096 Apr 26 10:09 hostapd
          -rw-r----- 1 root root     669 Apr 26 11:24 hostapd.conf
          -rw-r--r-- 1 root root       9 Aug  7  2006 host.conf
          -rw-r--r-- 1 root root     150 Apr 26 16:06 hostname
          -rw-r--r-- 1 root root     306 Apr 26 16:06 hosts
          -rw-r--r-- 1 root root     411 Apr 26 10:08 hosts.allow
          -rw-r--r-- 1 root root     711 Apr 26 10:08 hosts.deny
          drwxr-xr-x 3 root root    4096 Apr 26 10:06 hsflowd
          -rw-r--r-- 1 root root    1010 Mar 15 06:40 hsflowd.conf
          drwxr-xr-x 2 root root    4096 Apr 26 11:21 hw_init.d
          -rw-r--r-- 1 root root     258 Apr 26 10:21 image-release
          drwxr-xr-x 2 root root    4096 Apr 26 10:09 init
          drwxr-xr-x 2 root root    4096 Apr 26 10:10 init.d
          drwxr-xr-x 5 root root    4096 Apr 26 10:09 initramfs-tools
          -rw-r--r-- 1 root root    1875 Jan  3  2023 inputrc
          drwxr-xr-x 3 root root    4096 Apr 26 10:06 insserv
          -rw-r--r-- 1 root root     874 Feb 22  2022 insserv.conf
          drwxr-xr-x 2 root root    4096 Apr 26 10:09 insserv.conf.d
          drwxr-xr-x 5 root root    4096 Apr 26 10:03 iproute2
          -rw-r--r-- 1 root root      27 Jan 28 21:20 issue
          -rw-r--r-- 1 root root      20 Jan 28 21:20 issue.net
          drwxr-xr-x 2 root root    4096 Apr 26 10:09 kdump
          drwxr-xr-x 5 root root    4096 Apr 26 10:04 kernel
          -rw-r--r-- 1 root root   29522 Apr 26 11:25 ld.so.cache
          -rw-r--r-- 1 root root      34 Apr 10 07:01 ld.so.conf
          drwxr-xr-x 2 root root    4096 Apr 26 10:03 ld.so.conf.d
          drwxr-xr-x 3 root root    4096 Apr 26 10:04 letsencrypt
          -rw-r--r-- 1 root root     191 Feb  9  2023 libaudit.conf
          drwxr-xr-x 2 root root    4096 Apr 26 10:08 libnl
          drwxr-xr-x 2 root root    4096 Apr 26 11:24 linuxptp
          drwxr-xr-x 2 root root    4096 Apr 26 11:24 lldpd.d
          -rw-r--r-- 1 root root    2996 Apr 19 16:34 locale.alias
          -rw-r--r-- 1 root root    9449 Apr 26 10:04 locale.gen
          lrwxrwxrwx 1 root root      27 Apr 26 10:03 localtime -> /usr/share/zoneinfo/Etc/UTC
          drwxr-xr-x 4 root root    4096 Apr 26 10:05 logcheck
          -rw-r--r-- 1 root root   10216 Apr 26 11:23 login.defs
          -rw-r--r-- 1 root root   10217 Apr 19 12:05 login.defs.cumulus
          -rw-r--r-- 1 root root   12569 Nov 11  2022 login.defs.cumulus-orig
          lrwxrwxrwx 1 root root      22 Apr 26 10:10 logrotate.conf -> logrotate.conf.cumulus
          -rw-r--r-- 1 root root     474 Apr 19 12:05 logrotate.conf.cumulus
          -rw-r--r-- 1 root root     494 Dec 14  2022 logrotate.conf.cumulus-orig
          drwxr-xr-x 2 root root    4096 Apr 26 10:10 logrotate.d
          -rw-r--r-- 1 root root      91 Apr 20 15:39 lsb-release
          drwxr-xr-x 3 root root    4096 Apr 26 10:04 lttng
          drwxr-xr-x 3 root root    4096 Apr 26 10:10 lvm
          -r--r--r-- 1 root root      33 Apr 26 10:03 machine-id
          -rw-r--r-- 1 root root     111 Jan 28  2023 magic
          -rw-r--r-- 1 root root     111 Jan 28  2023 magic.mime
          -rw-r--r-- 1 root root    3310 Apr 26 11:21 mailcap
          -rw-r--r-- 1 root root     449 Nov 29  2021 mailcap.order
          -rw-r--r-- 1 root root      13 Apr 26 10:08 mailname
          -rw-r--r-- 1 root root     125 Apr 14  2022 mail.rc
          -rw-r--r-- 1 root root    5230 Mar 12  2023 manpath.config
          -rw-r--r-- 1 root root   73816 Feb 11  2023 mime.types
          -rw-r--r-- 1 root root     782 Mar  5  2023 mke2fs.conf
          drwxr-xr-x 3 root root    4096 Apr 26 10:06 mlx
          drwxr-xr-x 2 root root    4096 Apr 26 16:07 modprobe.d
          -rw-r--r-- 1 root root     248 Apr 26 10:03 modules
          drwxr-xr-x 2 root root    4096 Apr 26 11:21 modules-load.d
          -rw-r--r-- 1 root root     456 Apr 26 11:21 motd.distrib
          lrwxrwxrwx 1 root root      19 Apr 26 11:22 mtab -> ../proc/self/mounts
          drwxr-xr-x 4 root root    4096 Apr 26 10:08 mysql
          -rw-r--r-- 1 root root   11399 Jan 18  2023 nanorc
          -rw-r--r-- 1 root root     767 Aug 11  2022 netconfig
          drwxr-xr-x 4 root root    4096 Apr 26 10:09 netq
          drwxr-xr-x 2 root root    4096 Apr 26 10:09 netsniff-ng
          drwxr-xr-x 7 root root    4096 Apr 27 18:12 network
          drwxr-xr-x 3 root root    4096 Apr 26 10:04 NetworkManager
          -rw-r--r-- 1 root root      60 Apr 26 10:03 networks
          drwxr-xr-x 9 root root    4096 Apr 27 21:43 nginx
          -rw-r--r-- 1 root root     636 Apr 26 11:24 nsswitch.conf
          drwxr-xr-x 2 root root    4096 Apr 26 11:24 ntpsec
          drwxr-xr-x 3 root root    4096 Apr 26 10:05 nvue
          -rw-r--r-- 1 root root     978 Apr 26 02:46 nvue-auth.yaml
          drwxr-xr-x 3 root root    4096 Apr 27 21:43 nvue.d
          drwxr-xr-x 2 root root    4096 Apr 26 10:02 opt
          lrwxrwxrwx 1 root root      21 Jan 28 21:20 os-release -> ../usr/lib/os-release
          -rw-r--r-- 1 root root     552 Sep 21  2023 pam.conf
          drwxr-xr-x 2 root root    4096 Apr 26 11:24 pam.d
          -rw-r----- 1 root shadow  2997 Apr 26 11:24 pam_radius_auth.conf
          -rw-r--r-- 1 root root    1544 Apr 26 11:24 passwd
          -rw-r--r-- 1 root root    1554 Apr 26 10:10 passwd-
          drwxr-xr-x 3 root root    4096 Apr 26 10:04 perl
          lrwxrwxrwx 1 root root      15 Apr 26 10:10 profile -> profile.cumulus
          -rw-r--r-- 1 root root     746 Apr 19 12:05 profile.cumulus
          -rw-r--r-- 1 root root     769 Apr 10  2021 profile.cumulus-orig
          drwxr-xr-x 2 root root    4096 Apr 26 11:24 profile.d
          -rw-r--r-- 1 root root    3144 Oct 17  2022 protocols
          drwxr-xr-x 2 root root    4096 Apr 26 10:10 ptm.d
          -rw-r--r-- 1 root root     343 Apr 26 11:24 ptp4l.conf
          drwxr-xr-x 2 root root    4096 Apr 26 10:09 python3
          drwxr-xr-x 2 root root    4096 Apr 26 10:04 python3.11
          drwxr-xr-x 3 root root    4096 Apr 26 10:04 ras
          -rw-r--r-- 1 root root     985 Apr  9 05:47 rdnbrd.conf
          -rw-r--r-- 1 root root      72 Apr 28 18:48 resolv.conf
          drwxr-xr-x 3 root root    4096 Apr 26 10:06 resolvconf
          -rw-r--r-- 1 root root      61 Apr 28 18:48 resolv.conf.bak
          lrwxrwxrwx 1 root root      13 Jan 20 09:27 rmt -> /usr/sbin/rmt
          -rw-r--r-- 1 root root     911 Oct 17  2022 rpc
          lrwxrwxrwx 1 root root      20 Apr 26 10:10 rsyslog.conf -> rsyslog.conf.cumulus
          -rw-r--r-- 1 root root    1483 Apr 19 12:05 rsyslog.conf.cumulus
          -rw-r--r-- 1 root root    1430 Feb 22  2023 rsyslog.conf.cumulus-orig
          drwxr-xr-x 2 root root    4096 Apr 26 10:10 rsyslog.d
          drwxr-xr-x 3 root root    4096 Apr 26 10:04 runit
          -rw-r--r-- 1 root root    3663 Jun  9  2015 screenrc
          drwxr-xr-x 4 root root    4096 Apr 26 16:07 security
          drwxr-xr-x 2 root root    4096 Apr 26 10:03 selinux
          -rw-r--r-- 1 root root   10593 Oct 15  2022 sensors3.conf
          drwxr-xr-x 2 root root    4096 Apr 26 11:21 sensors.d
          -rw-r--r-- 1 root root   12813 Mar 27  2021 services
          -rw-r----- 1 root shadow  1083 Apr 26 16:07 shadow
          -rw-r----- 1 root shadow   945 Apr 26 11:24 shadow-
          -rw-r--r-- 1 root root     158 Apr 26 10:10 shells
          drwxr-xr-x 2 root root    4096 Apr 26 10:10 skel
          -rw-r--r-- 1 root root    7042 Oct 16  2022 smartd.conf
          drwxr-xr-x 4 root root    4096 Apr 26 10:07 smartmontools
          -rw-r--r-- 1 root root    1201 Dec  2  2018 smi.conf
          drwxr-xr-x 3 root root    4096 Apr 26 10:10 snmp
          drwxr-xr-x 4 root root    4096 Apr 26 11:24 ssh
          drwxr-xr-x 4 root root    4096 Apr 26 10:08 ssl
          drwxr-x--- 2 root mail    4096 Apr 26 10:08 ssmtp
          -rw-r--r-- 1 root root      21 Apr 26 10:10 subgid
          -rw-r--r-- 1 root root       0 Apr 26 10:03 subgid-
          -rw-r--r-- 1 root root      21 Apr 26 10:10 subuid
          -rw-r--r-- 1 root root       0 Apr 26 10:03 subuid-
          -rw-r--r-- 1 root root    4343 Dec 29 22:00 sudo.conf
          -r--r----- 1 root root    4233 Dec 29 22:00 sudoers
          drwxr-xr-x 2 root root    4096 Apr 26 16:03 sudoers.d
          drwxr-xr-x 6 root root    4096 Apr 26 10:06 sv
          drwxr-xr-x 2 root root    4096 Apr 26 11:24 synced
          -rw-r--r-- 1 root root    2355 Dec 19  2022 sysctl.conf
          drwxr-xr-x 2 root root    4096 Apr 26 11:23 sysctl.d
          drwxr-xr-x 6 root root    4096 Apr 26 10:10 systemd
          drwxr-xr-x 2 root root    4096 Apr 26 10:03 terminfo
          -rw-r--r-- 1 root root       8 Apr 26 10:03 timezone
          drwxr-xr-x 2 root root    4096 Apr 26 10:08 tmpfiles.d
          -rw-r--r-- 1 root root    1260 Jan 27  2023 ucf.conf
          drwxr-xr-x 4 root root    4096 Apr 26 10:03 udev
          drwxr-xr-x 3 root root    4096 Apr 26 10:04 ufw
          drwxr-xr-x 2 root root    4096 Apr 26 10:03 update-motd.d
          drwxr-xr-x 2 root root    4096 Apr 26 10:10 vim
          drwxr-xr-x 2 root root    4096 Apr 26 10:08 vrf
          -rw-r--r-- 1 root root     435 Mar 14 17:01 watchdog.conf
          -rw-r--r-- 1 root root    4942 May 14  2022 wgetrc
          drwxr-xr-x 2 root root    4096 Apr 26 11:24 what-just-happened
          drwxr-xr-x 2 root root    4096 Apr 26 11:25 wireshark
          drwxr-xr-x 4 root root    4096 Apr 26 10:04 X11
          -rw-r--r-- 1 root root     681 Jan 17  2023 xattr.conf
          drwxr-xr-x 3 root root    4096 Apr 26 10:03 xdg
          

          Troubleshooting Network Interfaces

          The following sections describe various ways you can troubleshoot ifupdown2 and network interfaces.

          Monitor Interface Traffic Rate and PPS

          Monitoring the traffic rate and PPS for an interface ensures optimal network performance and reliability. You can use the data provided to allocate and utilize network resources efficiently, ensuring quality of service and preventing network bottlenecks. The data helps you to obtain a comprehensive view of network health, detect any DDoS attacks, and see if the current network can handle peak loads or if you need future network capacity expansion and upgrades.

          By monitoring both the traffic rate and PPS, you can identify peak usage times and adjust bandwidth allocation or optimize packet paths to ensure low latency and high throughput.

          To show a summary view of the traffic rate and PPS for all interfaces, run the nv show interface rates command.

          cumulus@switch:~$ nv show interface rates
          Interface    Intvl     In-Bits Rate    In-Util   In-Pkts Rate    Out-Bits Rate  Out-Util     Out-Packets Rate
          ---------    ------    ------------    -------   ------------    -------------   ---------   ---------------
          swp1         50        153.40 Gbps     0.1%      1.00 Mpps       253.40 Gbps     2.0%        20.00 Mpps
          swp2         50        2.00 kbps       0.0%      20.00 kpps      3.00 kbps       3.0%        30.00 Mpps
          swp3         50        3.00 Mbps       0.0%      30.00 kpps      4.00 kbps       0.0%        40.00 kpps
          swp4         50        300 bps         0.0%      30 pps          400 bps         0.0%        40 pps
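
The summary view mixes unit prefixes (bps, kbps, Mbps, Gbps, and pps, kpps, Mpps), which makes the columns awkward to compare or sort in a script. The following Python sketch normalizes one rate string to a base unit. It assumes decimal (power of 1000) prefixes and the unit spellings shown in the sample output; `parse_rate` is a hypothetical helper, not part of NVUE.

```python
from decimal import Decimal

# Assumed decimal prefixes, matching the units in the sample output.
_MULTIPLIERS = {"": 1, "k": 1_000, "M": 1_000_000, "G": 1_000_000_000}


def parse_rate(text):
    """Convert a string such as '153.40 Gbps' or '30 pps' to (value, base_unit)."""
    number, unit = text.split()
    for base in ("bps", "pps"):
        if unit.endswith(base):
            prefix = unit[: -len(base)]
            if prefix not in _MULTIPLIERS:
                break
            return Decimal(number) * _MULTIPLIERS[prefix], base
    raise ValueError(f"unrecognized rate unit: {unit!r}")


print(parse_rate("153.40 Gbps"))
print(parse_rate("30 pps"))
```

Decimal avoids floating-point rounding when you scale values such as 153.40 by large multipliers.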
          

          To show the traffic rate and PPS for a specific interface, run the nv show interface <interface> rates command.

          cumulus@switch:~$ nv show interface swp1 rates
                           operational  applied
          ---------------  -----------  -------
          load-interval    30
          in-bits-rate     6000
          in-pkts-rate     200
          in-utilization   20.00%
          out-bits-rate    8000
          out-pkts-rate    100
          out-utilization  10.00%
          

          You must specify a single interface; the nv show interface <interface> rates command does not support a range of interfaces.

          You can configure the load interval that Cumulus Linux uses to calculate interface rates with the nv set system counter rates load-interval command. Cumulus Linux measures and averages the rate counters over the load interval to smooth out short-term fluctuations. You can specify a value between 1 and 600 seconds. The default load interval is 60 seconds.

          cumulus@switch:~$ nv set system counter rates load-interval 30
          cumulus@switch:~$ nv config apply
          

          To view the configured load interval, run the nv show system counter rates command.
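
To see how a load interval turns raw counters into rates, consider the following Python sketch. This guide does not describe the exact averaging algorithm, so the sketch assumes a plain average between two counter samples taken one load interval apart; the function names and the counter values are hypothetical.

```python
def average_rates(bytes_start, bytes_end, pkts_start, pkts_end, interval_s):
    """Return (bits per second, packets per second) averaged over the interval."""
    if not 1 <= interval_s <= 600:
        raise ValueError("load interval must be between 1 and 600 seconds")
    bits_per_s = (bytes_end - bytes_start) * 8 / interval_s
    pkts_per_s = (pkts_end - pkts_start) / interval_s
    return bits_per_s, pkts_per_s


def utilization_percent(bits_per_s, link_speed_bps):
    """Return the share of the link speed in use, as a percentage."""
    return 100 * bits_per_s / link_speed_bps


# 22,500 bytes and 6,000 packets counted over a 30 second interval.
print(average_rates(0, 22_500, 0, 6_000, 30))
```

A longer interval averages over more traffic, so short bursts have less effect on the reported rate; a shorter interval reacts faster but fluctuates more.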

          Enable Network Logging

          To obtain verbose logs when you run systemctl start networking.service or systemctl restart networking.service as well as when the switch boots, create an overrides file with the systemctl edit networking.service command and add the following lines:

          [Service]
          # remove existing ExecStart rule
          ExecStart=
          # start ifup with verbose option
          ExecStart=/sbin/ifup -av
          

          When you run the systemctl edit command, you do not need to run systemctl daemon-reload.

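          To disable logging, either remove the overrides file or run the systemctl edit networking.service command again and remove the lines you added.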

          Exclude Certain Interfaces from Coming Up

          To exclude an interface so that it does not come up when you boot the switch or when you start, stop, or reload the networking service:

          1. Create a file in the /etc/systemd/system/networking.service.d directory (for example, /etc/systemd/system/networking.service.d/override.conf).

          2. Add the lines ExecStart=/sbin/ifup -a -X <interface> and ExecStop=/sbin/ifdown -a -X <interface> to the file. The following example stops eth0 from coming up:

            [Service]
            ExecStart=
            ExecStart=/sbin/ifup -a -X eth0
            ExecStop=
            ExecStop=/sbin/ifdown -a -X eth0
            

          You can exclude any interface specified in the /etc/network/interfaces file.
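
If you generate the override file from automation, the following Python sketch renders the same content as the eth0 example above for a given interface. The function name and the interface name check are assumptions added for illustration; the sketch only builds the text and does not write the file or reload systemd.

```python
import re


def render_override(interface):
    """Return override.conf text that excludes one interface from ifup and ifdown."""
    # Conservative name check (an assumption, not an ifupdown2 rule).
    if not re.fullmatch(r"[A-Za-z0-9_.\-]+", interface):
        raise ValueError(f"unexpected interface name: {interface!r}")
    return "\n".join(
        [
            "[Service]",
            "ExecStart=",  # clear the existing ExecStart rule
            f"ExecStart=/sbin/ifup -a -X {interface}",
            "ExecStop=",  # clear the existing ExecStop rule
            f"ExecStop=/sbin/ifdown -a -X {interface}",
            "",
        ]
    )


print(render_override("eth0"))
```

The empty ExecStart= and ExecStop= lines matter: they clear the commands from the original unit before the override adds its own.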

          Use ifquery to Validate and Debug Interface Configurations

          You use ifquery to print parsed interfaces file entries.

          To use ifquery to pretty print iface entries from the interfaces file, run:

          cumulus@switch:~$ sudo ifquery bond0
          auto bond0
          iface bond0
              address 14.0.0.9/30
              address 2001:ded:beef:2::1/64
              bond-slaves swp25 swp26
          

          Use ifquery --check to compare the current running state of an interface against its configuration in the interfaces file. The command returns exit code 0 if the running state matches the configuration, or 1 if it does not. In the example below, the line bond-xmit-hash-policy layer3+7 fails because it should read bond-xmit-hash-policy layer3+4.

          cumulus@switch:~$ sudo ifquery --check bond0
          iface bond0
              bond-xmit-hash-policy layer3+7  [fail]
              bond-slaves swp25 swp26         [pass]
              address 14.0.0.9/30             [pass]
              address 2001:ded:beef:2::1/64   [pass]
          

          ifquery --check is an experimental feature.
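
To pick out only the mismatched attributes from ifquery --check output in a script, you can filter on the [fail] marker. The following Python sketch assumes the output layout shown in the example above; `failed_attributes` is a hypothetical helper.

```python
def failed_attributes(check_output):
    """Return the attribute lines that ifquery --check marked with [fail]."""
    failures = []
    for line in check_output.splitlines():
        stripped = line.strip()
        if stripped.endswith("[fail]"):
            failures.append(stripped[: -len("[fail]")].rstrip())
    return failures


# Output copied from the example above.
sample = """iface bond0
    bond-xmit-hash-policy layer3+7  [fail]
    bond-slaves swp25 swp26         [pass]
    address 14.0.0.9/30             [pass]
    address 2001:ded:beef:2::1/64   [pass]
"""

print(failed_attributes(sample))
```

On a switch, you can capture the output of sudo ifquery --check bond0 and pass it to the function instead of the sample string.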

          Use ifquery --running to print the running state of interfaces in the interfaces file format:

          cumulus@switch:~$ sudo ifquery --running swp1
          auto swp1
          iface swp1
          	mtu 9000
          	hwaddress 48:b0:2d:01:46:04
          

          ifquery --syntax-help provides help on all possible attributes supported in the interfaces file. For complete syntax on the interfaces file, see man interfaces and man ifupdown-addons-interfaces.

          You can use ifquery --print-savedstate to check the ifupdown2 state database. ifdown works only on interfaces present in this state database.

          cumulus@leaf1$ sudo ifquery --print-savedstate eth0  
          auto eth0
          iface eth0 inet dhcp
          

          Mako Template Errors

          An easy way to debug and get details about template errors is to use the mako-render command on your interfaces template file or on /etc/network/interfaces itself.

          cumulus@switch:~$ sudo mako-render /etc/network/interfaces
          iface swp51
          
          cumulus@leaf02:mgmt:~$ sudo mako-render /etc/network/interfaces
          # Auto-generated by NVUE!
          # Any local modifications will prevent NVUE from re-generating this file.
          # md5sum: ac1ff9d35c3cd51f7aa3073ae32debf2
          # This file describes the network interfaces available on your system
          # and how to activate them. For more information, see interfaces(5).
          
          source /etc/network/interfaces.d/*.intf
          
          auto lo
          iface lo inet loopback
              address 10.10.10.1/32
              vxlan-local-tunnelip 10.10.10.1
          
          auto mgmt
          iface mgmt
              address 127.0.0.1/8
              address ::1/128
              vrf-table auto
          
          auto RED
          iface RED
              address 127.0.0.1/8
              address ::1/128
              vrf-table auto
          
          auto BLUE
          iface BLUE
              address 127.0.0.1/8
              address ::1/128
              vrf-table auto
          
          auto eth0
          iface eth0 inet dhcp
              ip-forward off
              ip6-forward off
              vrf mgmt
          
          auto swp1
          iface swp1
          
          auto swp2
          iface swp2
          
          auto swp3
          iface swp3
          
          auto swp51
          iface swp51
          
          auto swp52
          iface swp52
          
          auto bond1
          iface bond1
              mtu 9000
              bond-slaves swp1
              bond-mode 802.3ad
              bond-lacp-bypass-allow yes
              bridge-access 10
          
          auto bond2
          iface bond2
              mtu 9000
              bond-slaves swp2
              bond-mode 802.3ad
              bond-lacp-bypass-allow yes
              bridge-access 20
          
          auto bond3
          iface bond3
              mtu 9000
              bond-slaves swp3
              bond-mode 802.3ad
              bond-lacp-bypass-allow yes
              bridge-access 30
          
          auto vlan10
          iface vlan10
              address 10.1.10.2/24
              address-virtual 00:00:5e:00:01:01 10.1.10.1/24
              hwaddress 44:38:39:22:01:78
              vrf RED
              vlan-raw-device br_default
              vlan-id 10
          
          auto vlan20
          iface vlan20
              address 10.1.20.2/24
              address-virtual 00:00:5e:00:01:01 10.1.20.1/24
              hwaddress 44:38:39:22:01:78
              vrf RED
              vlan-raw-device br_default
              vlan-id 20
          
          auto vlan30
          iface vlan30
              address 10.1.30.2/24
              address-virtual 00:00:5e:00:01:01 10.1.30.1/24
              hwaddress 44:38:39:22:01:78
              vrf BLUE
              vlan-raw-device br_default
              vlan-id 30
          
          auto vxlan48
          iface vxlan48
              bridge-vlan-vni-map 10=10 20=20 30=30
              bridge-learning off
          
          auto vlan3159_l3
          iface vlan3159_l3
              vrf RED
              vlan-raw-device br_l3vni
              vlan-id 3159
          
          auto vlan3607_l3
          iface vlan3607_l3
              vrf BLUE
              vlan-raw-device br_l3vni
              vlan-id 3607
          
          auto vxlan99
          iface vxlan99
              bridge-vlan-vni-map 3159=4001 3607=4002
              bridge-learning off
          
          auto br_default
          iface br_default
              bridge-ports bond1 bond2 bond3 vxlan48
              hwaddress 44:38:39:22:01:78
              bridge-vlan-aware yes
              bridge-vids 10 20 30
              bridge-pvid 1
              bridge-stp yes
              bridge-mcsnoop no
              mstpctl-forcevers rstp
          
          auto br_l3vni
          iface br_l3vni
              bridge-ports vxlan99
              hwaddress 44:38:39:22:01:78
              bridge-vlan-aware yes
          

          ifdown Cannot Find an Interface that Exists

          If ifdown cannot find an interface that you know exists, run ifdown with the --use-current-config option to force ifdown to check the current /etc/network/interfaces file to find the interface. For example:

          cumulus@switch:~$ sudo ifdown br0
          error: cannot find interfaces: br0 (interface was probably never up ?)
          
          cumulus@switch:~$ sudo brctl show
          bridge name   bridge id      STP enabled interfaces
          br0      8000.44383900279f   yes     downlink
                                       peerlink
          
          cumulus@switch:~$ sudo ifdown br0 --use-current-config 
          

          Remove All References to a Child Interface

          If you have a configuration with a child interface, whether it is a VLAN, bond, or another physical interface, and you remove that interface from a running configuration, you must remove every reference to it in the configuration. Otherwise, the parent interface continues to use the interface.

          For example, consider the following configuration:

          auto lo
          iface lo inet loopback
          
          auto eth0
          iface eth0 inet dhcp
          
          auto bond1
          iface bond1
              bond-slaves swp2 swp1
          
          auto bond3
          iface bond3
              bond-slaves swp8 swp6 swp7
          
          auto br0
          iface br0
              bridge-ports swp3 swp5 bond1 swp4 bond3
              bridge-pathcosts  swp3=4 swp5=4 swp4=4
              address 11.0.0.10/24
              address 2001::10/64
          

          bond1 is a member of br0. If you remove bond1, you must remove the reference to it from the br0 configuration. Otherwise, if you reload the configuration with ifreload -a, bond1 remains part of br0.
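
Before you remove an interface, you can search the configuration for stanzas that still refer to it. The following Python sketch does a simple token match on an interfaces file; `find_references` is a hypothetical helper, and it does not expand globs, ranges, Mako templates, or source includes.

```python
def find_references(interfaces_text, name):
    """Return (iface, attribute) pairs in other stanzas whose value names the interface."""
    references = []
    current = None
    for raw in interfaces_text.splitlines():
        words = raw.split()
        if not words or words[0].startswith("#") or words[0] == "auto":
            continue
        if words[0] == "iface" and len(words) > 1:
            current = words[1]
            continue
        if current is None or current == name:
            continue
        # Also match key=value tokens such as swp3=4 in bridge-pathcosts.
        if any(token.split("=")[0] == name for token in words[1:]):
            references.append((current, words[0]))
    return references


# Stanzas copied from the example configuration above.
config = """
auto bond1
iface bond1
    bond-slaves swp2 swp1

auto br0
iface br0
    bridge-ports swp3 swp5 bond1 swp4 bond3
    bridge-pathcosts  swp3=4 swp5=4 swp4=4
"""

print(find_references(config, "bond1"))
```

Each pair in the result names a stanza and attribute that you need to edit before you reload the configuration.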

          MTU Numerical Result Out of Range Error

          The MTU Numerical result out of range error occurs when the MTU you are trying to set on an interface is higher than the MTU of the lower interface or dependent interface. Linux expects the upper interface to have an MTU less than or equal to the MTU on the lower interface.

          In the example below, the swp1.100 VLAN interface is an upper interface to physical interface swp1. If you want to change the MTU to 9000 on the VLAN interface, you must include the new MTU on the lower interface swp1 as well.

          auto swp1.100
          iface swp1.100
              mtu 9000
          
          auto swp1 
          iface swp1  
              mtu 9000
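
You can catch this condition before you apply a configuration by comparing each VLAN subinterface MTU with the MTU of its lower interface. The following Python sketch assumes that the lower interface is the part of the name before the last dot (swp1 for swp1.100) and that you supply the MTU values yourself; `mtu_violations` is a hypothetical helper, and it does not cover other upper and lower relationships such as bridges or bonds.

```python
def mtu_violations(mtus):
    """Return (upper, upper_mtu, lower, lower_mtu) where the upper MTU is too large.

    mtus maps interface names to configured MTU values.
    """
    problems = []
    for name, mtu in mtus.items():
        lower, dot, _ = name.rpartition(".")
        if dot and lower in mtus and mtu > mtus[lower]:
            problems.append((name, mtu, lower, mtus[lower]))
    return problems


print(mtu_violations({"swp1": 1500, "swp1.100": 9000}))
print(mtu_violations({"swp1": 9000, "swp1.100": 9000}))
```

The first call reports swp1.100 because its MTU exceeds the MTU on swp1; the second call returns an empty list because both interfaces use 9000.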
          

          iproute2 batch Command Failures

          ifupdown2 batches iproute2 commands for performance reasons. A batch command contains ip -force -batch - in the error message. The command number that fails is at the end of this line: Command failed -:1.

          Below is a sample error for command 1: link set dev host2 master bridge. The command fails to add the bond host2 to the bridge named bridge because host2 does not have a valid hardware address.

          error: failed to execute cmd 'ip -force -batch - [link set dev host2 master bridge
          addr flush dev host2
          link set dev host1 master bridge
          addr flush dev host1
          ]'(RTNETLINK answers: Invalid argument
          Command failed -:1)
          warning: bridge configuration failed (missing ports)
          

          This error can occur when the bridge port does not have a valid hardware address or when the interface you add to the bridge is an incomplete bond; a bond without slaves is incomplete and does not have a valid hardware address.
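
To map the number in Command failed -:N back to the command that failed, you can split the batch on line breaks and index into it. The following Python sketch assumes the error layout shown in the sample above and that N counts batch lines starting at 1, which matches the sample; `failed_batch_command` is a hypothetical helper.

```python
import re


def failed_batch_command(error_text):
    """Return (number, command) for the failing line of an ip batch error, or None."""
    number = re.search(r"Command failed -:(\d+)", error_text)
    batch = re.search(r"ip -force -batch - \[(.*?)\]", error_text, re.DOTALL)
    if number is None or batch is None:
        return None
    commands = [line.strip() for line in batch.group(1).splitlines() if line.strip()]
    index = int(number.group(1))
    if not 1 <= index <= len(commands):
        return None
    return index, commands[index - 1]


# Error text copied from the sample above.
sample = """error: failed to execute cmd 'ip -force -batch - [link set dev host2 master bridge
addr flush dev host2
link set dev host1 master bridge
addr flush dev host1
]'(RTNETLINK answers: Invalid argument
Command failed -:1)
warning: bridge configuration failed (missing ports)
"""

print(failed_batch_command(sample))
```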

          MLAG Interface Drops Packets

          Losing a large number of packets across an MLAG peerlink interface is often not a problem. The switch can drop these packets to prevent BUM (broadcast, unknown unicast, and multicast) packet looping. For more details, and for information on how to detect these drops, refer to the MLAG section.

          Troubleshoot Layer 1

          This chapter describes how to troubleshoot layer 1 issues that can affect the port modules connecting a switch to a network.

          High Speed Ethernet Technologies

          Specifications

          The following specifications are useful in understanding and troubleshooting layer 1 problems:

          Form Factors

          Modern Ethernet modules come in one of two form factors:

          Each form factor contains an EEPROM with information about the capabilities of the module and various groups of required or optional registers to query or control aspects of the module. The output from the ethtool -m <swp> command decodes the main values.

          The SFF MSA specifications define the memory locations for the fields in the EEPROM and the common registers:

          Identifiers are in the first byte of the module memory map:

          Encoding

          Two parts of high-speed Ethernet are under encoding in the output from the ethtool -m <swp> command:

          The relationship between lane speed and encoding methods is described in this table:

          Lane Speed Encoding
          10G Uses 64B/66B framing then encoded in NRZ — actually 10.3125 Gbps on the wire.
          25G Uses 64B/66B framing then encoded in NRZ — actually 25.78125 Gbps on the wire. Can also use RS-FEC (528,514) or Base-R FEC.
          50G Uses PAM4 encoding and RS-FEC (544,514).
          100G Uses PAM4 encoding.
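
The on-the-wire figures in the NRZ rows follow from the 64B/66B framing overhead: every 64 data bits travel as a 66-bit block. The following is a minimal Python sketch of that arithmetic; the function name is illustrative and is not part of any Cumulus Linux tool.

```python
from fractions import Fraction

def line_rate_gbps(data_rate_gbps: int) -> float:
    """Return the on-the-wire rate of a lane that uses 64B/66B framing."""
    # 64 payload bits travel as a 66-bit block, so the wire rate is 66/64 of the data rate.
    return float(Fraction(data_rate_gbps) * Fraction(66, 64))

print(line_rate_gbps(10))  # 10.3125
print(line_rate_gbps(25))  # 25.78125
```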

          The SerDes (Serializer/Deserializer) is the component in the port that converts byte data to and from a set of bit streams (lanes), where:

          On the ASIC, the 40G, 100G and 200G SerDes devices are 4 lanes; 400G SerDes uses 8 lanes. So an SFP port is actually one lane on a four lane SerDes. Depending on the platform design, this sometimes affects how you can configure and break out SFP ports.

          Port speeds are created using the following formulas:

          Port Speed Number of Lanes
          1G One 10G or 25G lane clocked at 1G. Or, on a 1G fixed copper switch, a 1G lane.
          10G One 10G lane.
          25G One 25G lane.
          40G Four 10G lanes.
          50G Two 25G lanes (NRZ) or one 50G lane (PAM4).
          100G Four 25G lanes (100G-SR4/CR4 NRZ), two 50G lanes (100G-CR2 PAM4), or one 100G lane.
          200G Four 50G lanes or two 100G lanes.
          400G Eight 50G lanes or four 100G lanes.
          800G Eight 100G lanes.
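
As a quick cross-check of the table above, each lane combination should multiply out to the port speed. The sketch below transcribes the table into a Python dictionary (the names are illustrative) and verifies that property. The 1G row is left out because it describes a faster lane clocked down rather than a lane count.

```python
# Lane combinations per port speed, transcribed from the table above.
# Each entry is (number_of_lanes, lane_speed_in_gbps).
PORT_LANES = {
    10: [(1, 10)],
    25: [(1, 25)],
    40: [(4, 10)],
    50: [(2, 25), (1, 50)],
    100: [(4, 25), (2, 50), (1, 100)],
    200: [(4, 50), (2, 100)],
    400: [(8, 50), (4, 100)],
    800: [(8, 100)],
}

def is_consistent(port_speed: int) -> bool:
    """Check that every listed combination multiplies out to the port speed."""
    return all(lanes * lane_speed == port_speed
               for lanes, lane_speed in PORT_LANES[port_speed])

for speed in PORT_LANES:
    print(speed, PORT_LANES[speed], is_consistent(speed))
```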

          Active and Passive Modules and Cables

          From the point of view of the port, modules and cables can be classified as either active or passive.

          Active cables and modules contain transmitters that regenerate the bit signals over the cable. All optical modules are active. 10/100/1000BaseT and 10GBaseT are active modules and contain an onboard PHY that handles the BaseT auto-negotiation and TX/RX to the remote BaseT device. For active modules, the port only has to provide a TX signal with a base level of power to the module and the module uses the power it receives on the port power bus to regenerate the signal to the remote side.

          Although some copper cable assemblies are active, they are extremely rare.

          Passive cables (copper DACs) connect the port side of the module directly to the copper twinax media on the other side of the module in the assembly. The port TX lines provide the power to drive the signal to the remote end. The port goes through a training sequence with the remote end port to tune the power TX and RX parameters to optimize the received signal and ensure correct clock and data recovery at each RX end.

          Compliance Codes, Ethernet Type, Ethmode Type, Interface Type

          Compliance codes, Ethernet type, Ethmode type, and interface type are all terms for the type of Ethernet technology that the module implements.

          For the port to know the characteristics of the module that is inserted, the SFP or QSFP module EEPROMs have a standardized set of data to describe the module characteristics. These values appear in the output of ethtool -m <swp>.

          The compliance codes describe the type of Ethernet technology the module implements, such as 1000Base-T, 10GBase-SR, 10GBase-CR, 40GBase-SR4, and 100GBase-CR4.

          The first part of the compliance code gives the full line rate speed of the technology. The last part of the compliance code specifies the Ethernet technology and the number of lanes used:

          An active module with a passive module compliance code or a passive module with an active module compliance code causes the port to be set up incorrectly and may affect signal integrity.

          Some modules have vendor specific coding, are older, or use a proprietary vendor technology that is not listed in the standards. As a result, they are not recognized by default and need to be overridden to the correct compliance code. On NVIDIA switches, the port firmware automatically overrides certain supported modules to the correct compliance code.

          Digital Diagnostic Monitoring/Digital Optical Monitoring (DDM/DOM)

          DDM/DOM is an optional capability that vendors can implement on their optical transceivers to display measurements about the optical power. The values are generally reliable within a 10% tolerance. A value of 0.0000 generally indicates the value is not implemented by the vendor.

          The most useful DDM/DOM values when troubleshooting a problem link are:

          The locations of the DDM/DOM fields are standardized. If DDM/DOM capability is present on a module, the values are displayed in the output of the NVUE nv show platform transceiver <interface> command or the Linux ethtool -m <swp> command.

          For each DDM/DOM value there can be thresholds to mark a high or low warning or an alarm when the value exceeds that threshold.

          An alarm value indicates the level required for the signal to be within the vendor’s design tolerance, and the warning level is a little bit closer to expected norms.
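
Optical power is commonly reported in both mW and dBm, and thresholds are compared against the reading. The following Python sketch shows the conversion and a simple alarm comparison. The function names, the returned strings, and the threshold numbers are assumptions chosen for illustration, not the logic of any NVIDIA tool.

```python
import math

def mw_to_dbm(power_mw: float) -> float:
    """Convert optical power from milliwatts to dBm (0 dBm is 1 mW)."""
    if power_mw <= 0:
        return float("-inf")  # zero power has no finite dBm value
    return 10 * math.log10(power_mw)

def alarm_state(power_dbm: float, low_alarm_dbm: float, high_alarm_dbm: float) -> str:
    """Return 'low', 'high', or 'Off' by comparing a reading with its alarm thresholds."""
    if power_dbm < low_alarm_dbm:
        return "low"
    if power_dbm > high_alarm_dbm:
        return "high"
    return "Off"

# Example thresholds only; real values come from the module EEPROM.
print(round(mw_to_dbm(0.5427), 2))
print(alarm_state(mw_to_dbm(0.0), -13.31, 5.40))  # low
```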

          When a warning or alarm is triggered, the flag flips from Off to On. Reading the flag with ethtool -m, NVIDIA NetQ, or other monitoring software resets it to Off.

          Auto-negotiation

          There are 3 different types of auto-negotiation (IEEE 802.3 clauses 28, 37, 73), which apply to various Ethernet technologies that Cumulus Linux supports:

          Many Ethernet technologies used in Cumulus Linux switches do not have auto-negotiation capability:

          Only about half of all modern link types support auto-negotiation. The next subsections provide guidance on when and how to enable auto-negotiation.

          1000BASE-T and 10GBASE-T fixed copper ports require auto-negotiation for 1G and 10G speeds. This is the default setting; you cannot disable auto-negotiation for 1G speeds. Disabling auto-negotiation on these ports requires setting the speed to 100Mbps or 10Mbps and the correct duplex setting.

          1000BASE-T SFPs have an onboard PHY that performs auto-negotiation automatically on the RJ45 side without involving the port. Do not change the default auto-negotiation setting on these ports; on NVIDIA switches, auto-negotiation is ON.

          For 1000BASE-X, auto-negotiation is highly recommended on 1G optical links to detect unidirectional link failures.

          For all other optical modules except for 1000BASE-X, there is no auto-negotiation standard.

          For 10G DACs, there is no auto-negotiation.

          For DAC cables on speeds higher than 25G, auto-negotiation is unnecessary, but is useful because it can improve signal integrity through link training. It also negotiates speed and FEC, which is less useful because the neighbor speed and FEC are usually known.

          General Auto-negotiation Guidance

          Autodetect

          As a result of the confusion about when auto-negotiation applies to a link type, many Ethernet software vendors, including Cumulus Linux, allow auto-negotiation ON to be configured on every interface type. When auto-negotiation is ON, but is not supported on a link type, the port software tries to determine the most likely link settings to bring the link up. Cumulus Linux calls this feature autodetect, but it is not directly configurable.

          When auto-negotiation is enabled on a port, the behavior is as follows:

          Autodetect is a local feature. The neighbor is assumed either to be configured with auto-negotiation off and speed, duplex, and FEC set manually, or to use an equivalent algorithm to determine the correct speed, duplex, and FEC settings.

          To see the user configured settings for auto-negotiation, speed, duplex and FEC compared to the actual operational state on the port hardware, use the l1-show command.

          The autodetect feature is usually successful, but if the link does not come up, disable auto-negotiation and configure the link settings manually.

          FEC

          Forward Error Correction (FEC) is an algorithm used to correct bit errors along a medium. FEC encodes the data stream so that the remote device can correct a certain number of bit errors by decoding the stream.

          The target IEEE bit error rate (BER) in high-speed Ethernet is 10⁻¹². At 25G lane speeds and above, this might not be achievable without error correction, depending on the media type and length. See Switch Port Attributes for a more detailed discussion of FEC requirements for certain cable types.
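
To put the target BER in perspective, the following Python sketch estimates the average time between bit errors on a single lane that runs exactly at the target rate. The function name is illustrative, and the calculation assumes errors are spread evenly over time.

```python
def seconds_between_errors(line_rate_gbps: float, ber: float = 1e-12) -> float:
    """Average seconds between bit errors on one lane at the given bit error rate."""
    bits_per_second = line_rate_gbps * 1e9
    return 1.0 / (bits_per_second * ber)

# A 25G NRZ lane runs at 25.78125 Gbps on the wire; a 10G lane at 10.3125 Gbps.
print(round(seconds_between_errors(25.78125), 1))  # 38.8
print(round(seconds_between_errors(10.3125), 1))   # 97.0
```

At the target BER, a 25G lane therefore sees roughly one bit error every 39 seconds on average before any FEC correction.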

          Both sides of a link must have the same FEC encoding algorithm enabled for the link to come up. If both sides appear to have a working signal path but the link is down, there might be an auto-negotiation mismatch or FEC mismatch in the configuration.

          FEC Encoding Algorithms and Settings

          In some cases, the configured value might be different than the operational value. In such cases, the l1-show command displays both values. For example:

          Signal Integrity

          The goal of Ethernet protocols and technologies is to enable the bits generated on one side of a link to be received correctly on the other side. The next two sections provide information about what might be happening on the link level when the link is down or bits are not received correctly.

          Various characteristics show the state of a link. All characteristics might not be available to display on all platforms.

          Eyes

          When a 1 or a 0 bit is transmitted across a link, it is represented on the electrical side of the port as either a high voltage level or a low voltage level. If an oscilloscope is attached to those leads, as the bit stream is transmitted across it, the transitions between 1 and 0 form a pattern in the shape of an eye.

          The farther the distance between the 1 and 0, the more open the eye appears. The more open the eye is, the less likely it is for a bit to be misread. When a bit is misread, it causes a bit-error, which results in an FCS error on the entire packet being received. A lower eye measurement generally translates to a larger bit error rate (BER). FEC can correct bit errors up to a point.

          Eyes are not measured on fixed copper ports and are not measured when a link is down.

          Each hardware vendor implements some quantitative measurement of eyes and some kind of qualitative measurement.

          On an NVIDIA switch, the eyes are assigned a height in mV and a grade. For speeds below 100G (NRZ encoding), when the grade goes below 4000, the error rate or stability of the link might be negatively impacted.

          A link might have no stability problems with a measurement below these values, and FEC might correct all errors presented on such a link. For some interface types, FEC is required to remove errors up to BER levels that are expected on the media.
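
The grade guidance above can be applied mechanically to captured l1-show output. The following Python sketch assumes the Eyes and Grade line layout that l1-show prints on NVIDIA switches. The helper names are hypothetical, and the 4000 floor is the figure quoted above for NRZ speeds only.

```python
import re

NRZ_GRADE_FLOOR = 4000  # threshold quoted above for speeds below 100G (NRZ)

def eye_grade(l1_show_text: str):
    """Return the integer after 'Grade:' in l1-show output, or None if absent."""
    match = re.search(r"Grade:\s*(\d+)", l1_show_text)
    return int(match.group(1)) if match else None

def grade_is_marginal(grade: int) -> bool:
    """True when an NRZ link's grade is under the floor mentioned above."""
    return grade < NRZ_GRADE_FLOOR

sample = "Eyes: 79                            Grade: 5451"
grade = eye_grade(sample)
print(grade, grade_is_marginal(grade))  # 5451 False
```

A grade under the floor is a prompt to look closer, not proof of a fault, because FEC might still correct every error on the link.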

          For 50G lanes (200G- and 400G-capable ports), the link uses PAM4 encoding, which has 3 eyes stacked on top of each other and therefore much smaller eye measurements. FEC is required on these links.

          Show Layer 1 Information

          Use the NVUE nv show platform transceiver <interface> command or the Linux l1-show command to show all layer 1 aspects of a Cumulus Linux port and link.

          cumulus@switch:~$ nv show platform transceiver swp2
          cable-type             : Active cable 
          cable-length           : 3m 
          supported-cable-length : 0 om1, 0 om2, 0 om3, 3 om4, 0 om5 
          diagnostics-status     : Diagnostic Data Available 
          status                 : plugged_enabled 
          error-status           : Power_Budget_Exceeded 
          vendor-data-code       : 210215__ 
          identifier             : QSFP28 
          vendor-rev             : B2 
          vendor-name            : Mellanox 
          vendor-pn              : MFA1A00-C003 
          vendor-sn              : MT2108FT02204 
          temperature: 
            temperature         : 48.74 C 
            high-alarm-threshold: 80.00 C 
            low-alarm-threshold : -10.00 C 
            alarm               : Off 
          voltage: 
            voltage             : 3.2692 V 
            high-alarm-threshold: 3.5000 V 
            low-alarm-threshold : 3.1000 V 
            alarm               : Off 
          channel: 
            channel-1: 
              rx-power: 
                  power            : 0.0000 mW / -inf dBm 
                  high-alarm-thresh: 5.40 dBm 
                  low-alarm-thresh : -13.31 dBm 
                  alarm            : low 
              tx-power: 
                  power            : 0.0000 mW / -inf dBm 
                  high-alarm-thresh: 5.40 dBm 
                  low-alarm-thresh : -11.40 dBm 
                  alarm            : Off 
              tx-bias-current: 
                  current          : 0.000 mA 
                  high-alarm-thresh: 8.500 mA 
                  low-alarm-thresh : 5.492 mA 
                  alarm            : low 
            channel-2: 
              rx-power: 
                  power            : 0.0000 mW / -inf dBm 
                  high-alarm-thresh: 5.40 dBm 
                  low-alarm-thresh : -13.31 dBm 
                  alarm            : low 
              tx-power: 
                  power            : 0.0000 mW / -inf dBm 
                  high-alarm-thresh: 5.40 dBm 
                  low-alarm-thresh : -11.40 dBm 
                  alarm            : low 
              tx-bias-current: 
                  current          : 0.000 mA 
                  high-alarm-thresh: 8.500 mA 
                  low-alarm-thresh : 5.492 mA 
                  alarm            : low 
            channel-3: 
              rx-power: 
                  power            : 0.0000 mW / -inf dBm 
                  high-alarm-thresh: 5.40 dBm 
                  low-alarm-thresh : -13.31 dBm 
                  alarm            : low 
              tx-power: 
                  power            : 0.0000 mW / -inf dBm 
                  high-alarm-thresh: 5.40 dBm 
                  low-alarm-thresh : -11.40 dBm 
                  alarm            : low 
              tx-bias-current: 
                  current          : 0.000 mA 
                  high-alarm-thresh: 8.500 mA 
                  low-alarm-thresh : 5.492 mA 
                  alarm            : low 
            channel-4: 
              rx-power: 
                  power            : 0.0000 mW / -inf dBm 
                  high-alarm-thresh: 5.40 dBm 
                  low-alarm-thresh : -13.31 dBm 
                  alarm            : low 
              tx-power: 
                  power            : 0.0000 mW / -inf dBm 
                  high-alarm-thresh: 5.40 dBm 
                  low-alarm-thresh : -11.40 dBm 
                  alarm            : low 
              tx-bias-current: 
                  current          : 0.000 mA 
                  high-alarm-thresh: 8.500 mA 
                  low-alarm-thresh : 5.492 mA 
                  alarm            : low
          

          Because Linux Ethernet tools do not have a unified approach to the various vendor driver implementations and areas that affect layer 1, Cumulus Linux uses the l1-show command to show all layer 1 aspects of a Cumulus Linux port and link.

          You must run the l1-show command as root. The syntax for the command is:

          cumulus@switch:~$ sudo l1-show PORTLIST
          

          Here is the output from the NVIDIA SN2410 switch on the other side of the same link:

          cumulus@2410-switch:~$ sudo l1-show swp43
          Port:  swp43
            Module Info
                Vendor Name: Mellanox               PN: MCP2M00-A003
                Identifier: 0x03 (SFP)              Type: 25g-cr
            Configured State
                Admin: Admin Up     Speed: 25G      MTU: 9216
                Autoneg: On                         FEC: Auto
            Operational State
                Link Status: Kernel: Up             Hardware: Up
                Speed: Kernel: 25G                  Hardware: 25G
                Autoneg: On (Autodetect enabld)     FEC: None
                TX Power (mW): None
                RX Power (mW): None
                Topo File Neighbor: bcm-switch-1, swp43
                LLDP Neighbor:      bcm-switch-1, swp43
            Port Hardware State:
                Compliance Code: 100GBASE-CR4 or 25GBASE-CR CA-L
                Cable Type: Passive copper cable
                Speed: 25G                          Autodetect: Enabled
                Eyes: 79                            Grade: 5451
                Troubleshooting Info: No issue was observed.
          

          On Spectrum-4 switches the l1-show command output shows the same values for eyes and grades.

          The output is in the following sections:

          • Module Info: Shows basic information about the module, according to the module EEPROM.
          • Configured State: Shows configuration information of the port, as defined in the kernel.
          • Operational State: Shows high level details of the actual link status of the port in the hardware and kernel.
          • Port Hardware State: Shows low level port information from the port on the switch ASIC.
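
If you capture l1-show output for later comparison, it can help to group the lines by the four sections listed above. The following Python sketch is one way to do that. The function name is hypothetical, and the sample is an abbreviated copy of the output format, not a complete capture.

```python
SECTION_NAMES = ("Module Info", "Configured State",
                 "Operational State", "Port Hardware State")

def split_sections(l1_show_text: str) -> dict:
    """Group l1-show output lines under the section header they follow."""
    sections = {}
    current = None
    for line in l1_show_text.splitlines():
        stripped = line.strip()
        name = stripped.rstrip(":")  # "Port Hardware State:" ends with a colon
        if name in SECTION_NAMES:
            current = name
            sections[current] = []
        elif current is not None and stripped:
            sections[current].append(stripped)
    return sections

sample = """\
Port:  swp43
  Module Info
      Vendor Name: Mellanox               PN: MCP2M00-A003
      Identifier: 0x03 (SFP)              Type: 25g-cr
  Configured State
      Admin: Admin Up     Speed: 25G      MTU: 9216
      Autoneg: On                         FEC: Auto
  Operational State
      Link Status: Kernel: Up             Hardware: Up
  Port Hardware State:
      Eyes: 79                            Grade: 5451
"""
for name, lines in split_sections(sample).items():
    print(name, len(lines))
```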

          Module Information

          The vendor name, vendor part number, identifier (QSFP/SFP type), and type (compliance codes) come from the vendor EEPROM. See Compliance Codes, Ethernet Type, Ethmode Type, Interface Type above for an explanation.

          Module Info
              Vendor Name: Mellanox               PN: MCP2M00-A003
              Identifier: 0x03 (SFP)              Type: 25g-cr
          

          Configured State

          The configured state reflects the configuration you apply to the kernel with ifupdown2. The switchd daemon translates the kernel state to the platform hardware state and keeps them in sync.

          Configured State
              Admin: Admin Up     Speed: 25G      MTU: 9216
              Autoneg: On                         FEC: Auto
          
          • Admin state:
            • Admin Up means the port is enabled in the kernel, either with NVUE or ifupdown2, or temporarily with ip link set <swp> up.
            • Admin Down means the kernel has disabled the port.
          • Speed:
            • The configured speed in the kernel.
            • You can lower the speed with NVUE or ifupdown2.
            • If you enable auto-negotiation, this output displays the negotiated or auto-detected speed.
          • MTU: The configured MTU setting in the kernel.
          • Autoneg: The configured auto-negotiation state in the kernel. See Auto-negotiation for more information.
          • FEC: The configured state of FEC in the kernel. See FEC above for more information.

          Operational State

          The operational state shows the current state of the link in the kernel and in the switch hardware.

            Operational State
                Link Status: Kernel: Up             Hardware: Up
                Speed: Kernel: 25G                  Hardware: 25G
                Autoneg: On (Autodetect enabld)     FEC: None
                TX Power (mW): None
                RX Power (mW): None
                Topo File Neighbor: switch-1, swp43
                LLDP Neighbor:      switch-1, swp43
          
          • Link Status and Speed: The kernel state and hardware state match, unless the link is in an unstable or transitory state.
          • Autoneg and Autodetect: See Auto-negotiation above for more information.
          • FEC: The operational state of FEC on the link. See FEC above for more information.
          • TX Power and RX Power: These values come from the module DDM/DOM fields and indicate the optical power strength measured on the module, if the module implements the feature. Support varies: both values, only RX power, or neither can be available. This does not apply to DAC and twisted pair interfaces.
          • Topo File Neighbor: If you populate the /etc/ptm.d/topology.dot file and the ptmd daemon is active, the entry matching this interface shows.
          • LLDP Neighbor: If the lldpd daemon is running and the switch receives LLDP data from the neighbor, the neighbor information shows here.

          Port Hardware State

          The port hardware state shows additional low level port information. The output varies between vendors.

          Here is the output on NVIDIA platforms:

            Port Hardware State:
                Compliance Code: 100GBASE-CR4 or 25GBASE-CR CA-L
                Cable Type: Passive copper cable
                Speed: 25G                          Autodetect: Enabled
                Eyes: 79                            Grade: 5451
                Troubleshooting Info: No issue was observed.
          

          The NVIDIA port firmware automatically troubleshoots link problems and displays items of concern in the Troubleshooting Info section of this output.

          See FEC, Auto-negotiation and Signal Integrity above for more details.

          Troubleshoot Layer 1 Problems

          This section provides a troubleshooting process and checklist to help resolve layer 1 issues for modules.

          The root cause of a layer 1 problem falls into one of these three categories:

          To resolve a layer 1 problem, follow these steps:

          Classify the Layer 1 Problem

          You can classify layer 1 problems as follows:

          See the sections below for specific guidance for each problem type.

          Isolate Faulty Hardware

          When you suspect that one of the components in a link is faulty, use the following approach to determine which component is faulty.

          First, identify the faulty behavior at the lowest level possible, then design a test that best displays that behavior. Use the hierarchy output of l1-show to find the best indicator. Here are some examples of tests you can use:

          Try swapping the modules and fibers to determine which component is bad:

          A down or flapping link can exhibit any or all the following symptoms:

          To begin troubleshooting, examine the output of l1-show on both ends of the link if possible. The output contains all the pertinent information to help troubleshoot the link.

          cumulus@switch:~$ sudo l1-show swp10
          Port:  swp10
            Module Info
                Vendor Name: FINISAR CORP.          PN: FTLX8574D3BCL
                Identifier: 0x03 (SFP)              Type: 10g-sr
            Configured State
                Admin: Admin Up     Speed: 10G      MTU: 9216
                Autoneg: On                         FEC: Auto
            Operational State
                Link Status: Kernel: Up             Hardware: Up
                Speed: Kernel: 10G                  Hardware: 10G
                Autoneg: On (Autodetect enabld)     FEC: None
                TX Power (mW): [0.5267]
                RX Power (mW): [0.5427]
                Topo File Neighbor: qct-ix8-51, swp3
                LLDP Neighbor:      qct-ix8-51, swp3
            Port Hardware State:
                Compliance Code: 10G Base-SR
                Cable Type: Optical Module (separated)
                Speed: 10G                          Autodetect: Enabled
                Eyes: 411                           Grade: 41609
                Troubleshooting Info: No issue was observed.
          

          Working from top to bottom of the l1-show output on both sides of the link, ask the questions listed below.

          Examine Module Information

          Module Info
              Vendor Name: FINISAR CORP.          PN: FTLX8574D3BCL
              Identifier: 0x03 (SFP)              Type: 10g-sr
          

          Examine Configured State

          Configured State
              Admin: Admin Up     Speed: 10G      MTU: 9216
              Autoneg: On                         FEC: Auto
          

          Examine Operational State

          Operational State
              Link Status: Kernel: Up             Hardware: Up
              Speed: Kernel: 10G                  Hardware: 10G
              Autoneg: On (Autodetect enabld)     FEC: None
              TX Power (mW): [0.5267]
              RX Power (mW): [0.5427]
              Topo File Neighbor: qct-ix8-51, swp3
              LLDP Neighbor:      qct-ix8-51, swp3
          

          Examine Port Hardware State

          The following values come from the NVIDIA port firmware:

          Port Hardware State:
              Compliance Code: 25GBASE-CR CA-S
              Cable Type: Passive copper cable
              Speed: N/A                          Autodetect: Enabled
              Eyes: 0                             Grade: 0
              Troubleshooting Info: Auto-negotiation no partner detected.
          

          RX Signal Failure Examples

          Here is the output from l1-show for an AOC (on swp6) with failed RX on lane 3. Because an AOC is an integrated fiber assembly, you must replace the entire assembly.

            Port:  swp6
            Module Info
                Vendor Name: XXXXX                  PN: AOC-XXXX
                Identifier: 0x0d (QSFP+)            Type: 40g-sr4
            Configured State
                Admin: Admin Up     Speed: 40G      MTU: 9216
                Autoneg: Off                        FEC: Off
            Operational State
                Link Status: Kernel: Down           Hardware: Down <=Link is down, Kernel and Hardware
                Speed: Kernel: 40G                  Hardware: 40G
                Autoneg: Off                        FEC: None (down)
                TX Power (mW): [1.1645, 1.171, 1.1155, 1.0945]
                RX Power (mW): [0.159, 0.1732, 0.153, 0.0067]  <=Low power on lane 3
                Topo File Neighbor: switch_1, swp6
                LLDP Neighbor:      None, None
            Port Hardware State:
                Rx Fault: Local  <=Local RX Failed  Carrier Detect: no <=No bi-directional communication
                Rx Signal: Detect: YYYY             Signal Lock: YYYN  <=No signal lock on lane 3
                Ethmode Type: 40g-sr4               Interface Type: SR4
                Speed: 40G                          Autoneg: Off
                MDIX: ForcedNormal, Normal          FEC: Off
                Local Advrtsd: None                 Remote Advrtsd: None
                Eyes: L: 357, R: 326, U: 211, D: 219, L: 328, R: 312, U: 206, D: 211,
                      L: 359, R: 343, U: 211, D: 200, L: 0, R: 0, U: 0, D: 0 <= No valid eye on lane 3
          

          Here is the l1-show output for an AOC with failed lanes 0 and 1. Note that signal lock is bouncing, and sometimes shows Y. You must replace the AOC.

          Port:  swp8
            Module Info
                Vendor Name: XXXX                   PN: AOC-XXXX
                Identifier: 0x0d (QSFP+)            Type: 40g-sr4
            Configured State
                Admin: Admin Up     Speed: 40G      MTU: 9216
                Autoneg: Off                        FEC: Off
            Operational State
                Link Status: Kernel: Down           Hardware: Down <=Link is down, Kernel and Hardware
                Speed: Kernel: 40G                  Hardware: 40G
                Autoneg: Off                        FEC: None (down)
                TX Power (mW): [1.1762, 1.1827, 1.1272, 1.1062]
                RX Power (mW): [0.0001, 0.0001, 0.5255, 0.64]  <=Low power on lanes 0,1
                Topo File Neighbor: switch_2, swp10
                LLDP Neighbor:      None, None
            Port Hardware State:
                Rx Fault: Local  <=Local RX Failed  Carrier Detect: no <=No bi-directional communication
                Rx Signal: Detect: YYYY             Signal Lock: YNYY  <=No lock on lane 1 at moment of capture
                Ethmode Type: 40g-sr4               Interface Type: SR4
                Speed: 40G                          Autoneg: Off
                MDIX: ForcedNormal, Normal          FEC: Off
                Local Advrtsd: None                 Remote Advrtsd: None
                Eyes: L: 0, R: 0, U: 0, D: 0, L: 0, R: 0, U: 0, D: 0,  <=No valid eyes on lanes 0,1
                      L: 359, R: 359, U: 214, D: 226, L: 359, R: 359, U: 243, D: 264
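
The lane-level indicators in the two outputs above can be read programmatically. The following Python sketch uses hypothetical helper names, and the 0.05 mW floor is an arbitrary illustration rather than a vendor threshold. The sample values are copied from the swp6 and swp8 outputs.

```python
def lanes_without_lock(signal_lock: str) -> list:
    """Return zero-based lane numbers whose Signal Lock flag is not 'Y'."""
    return [lane for lane, flag in enumerate(signal_lock) if flag != "Y"]

def lanes_below(rx_power_mw: list, floor_mw: float) -> list:
    """Return zero-based lane numbers whose RX power is under floor_mw."""
    return [lane for lane, power in enumerate(rx_power_mw) if power < floor_mw]

# swp6: failed RX on lane 3
print(lanes_without_lock("YYYN"))                                   # [3]
print(lanes_below([0.159, 0.1732, 0.153, 0.0067], floor_mw=0.05))   # [3]
# swp8: failed lanes 0 and 1, with lock bouncing at the moment of capture
print(lanes_without_lock("YNYY"))                                   # [1]
print(lanes_below([0.0001, 0.0001, 0.5255, 0.64], floor_mw=0.05))   # [0, 1]
```

The swp8 case shows why it is worth checking more than one indicator: the signal lock snapshot reports only lane 1, while RX power points at both failed lanes.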
          

          Physical errors on a link occur if you have signal integrity issues or you do not configure the required FEC type on a particular module or cable type.

          The target bit error rate (BER) in high-speed Ethernet is 10⁻¹². When BER exceeds this value, either configure the correct FEC setting or replace a marginal module, cable, or fiber patch. If the resulting BER on a link with correctly configured FEC is still unacceptable, you need to replace one of the hardware components in the link to resolve the errors.

          See FEC and Troubleshoot Signal Integrity Issues for more details.

          To see error counters for a port, run the ethtool -S <swp> | grep Errors command. If FEC is on, these counters only count errors that FEC does not correct.

          On NVIDIA switches, to see the bit error count that FEC corrects on a link, run the sudo l1-show <swp> --pcs-errors command.

          Because errors can occur during link up and down transitions, it is best to check error counters over a period of time to ensure they are incrementing regularly instead of displaying stale counts from when the link last transitioned up or down. The /var/log/linkstate log files show historical link up and link down transitions on a switch.
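
One way to tell live errors from stale counts is to capture the counters twice and compare them. The following Python sketch assumes you saved the text of ethtool -S <swp> | grep Errors at two points in time. The counter names in the sample are placeholders, not real driver counter names.

```python
def parse_error_counters(ethtool_stats: str) -> dict:
    """Parse 'name: value' lines whose name contains 'Errors' into a dict."""
    counters = {}
    for line in ethtool_stats.splitlines():
        name, sep, value = line.partition(":")
        if sep and "Errors" in name and value.strip().isdigit():
            counters[name.strip()] = int(value.strip())
    return counters

def incrementing(before: dict, after: dict) -> dict:
    """Return counters that grew between two samples, with their deltas."""
    return {name: after[name] - before[name]
            for name in after
            if name in before and after[name] > before[name]}

# Hypothetical samples taken a few minutes apart; counter names are illustrative.
first = parse_error_counters("ExampleInErrors: 120\nExampleOutErrors: 0\n")
second = parse_error_counters("ExampleInErrors: 175\nExampleOutErrors: 0\n")
print(incrementing(first, second))  # {'ExampleInErrors': 55}
```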

          Troubleshoot Signal Integrity Issues

          Signal integrity issues are often a root cause of different types of symptoms:

          To see error counters for a port, run the ethtool -S <swp> | grep Errors command. If FEC is on, these counters only count errors that FEC does not correct.

          To see counts of bit errors that FEC corrects on a link, run the sudo l1-show <swp> --pcs-errors command.

          Signal integrity issues are physical issues and usually, you must replace some hardware component in the link to fix the link. Follow the steps in Isolate Faulty Hardware to isolate and replace the failed hardware component.

          On rare occasions, if the switch does not recognize a module correctly and treats it as the wrong type (for example, active instead of passive), it can cause a signal integrity issue.

          See Active and Passive Modules and Cables, Compliance Codes, Ethernet Type, Ethmode Type, Interface Type and Examine Module Information for more details.

          Troubleshoot MTU Size Mismatches

          An MTU size mismatch usually shows up as failures in higher layer protocols, such as OSPF adjacencies that do not form, or as the loss of non-fragmentable packets. Generally, an MTU settings mismatch does not affect link operational status.

          To troubleshoot a suspected MTU problem, review the Configured State section in the output of l1-show:

          Configured State
              Admin: Admin Up     Speed: 10G      MTU: 9216 <===
              Autoneg: On                         FEC: Auto
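
When you have l1-show output from both ends of the link, the MTU comparison can be scripted. The following Python sketch uses hypothetical helper names, and the remote line is an invented neighbor value for illustration.

```python
import re

def configured_mtu(l1_show_text: str):
    """Return the MTU from the Configured State line of l1-show output, or None."""
    match = re.search(r"MTU:\s*(\d+)", l1_show_text)
    return int(match.group(1)) if match else None

def mtu_mismatch(local_text: str, remote_text: str) -> bool:
    """True when both ends report an MTU and the two values differ."""
    local, remote = configured_mtu(local_text), configured_mtu(remote_text)
    return local is not None and remote is not None and local != remote

local = "Admin: Admin Up     Speed: 10G      MTU: 9216"
remote = "Admin: Admin Up     Speed: 10G      MTU: 1500"  # hypothetical neighbor
print(mtu_mismatch(local, remote))  # True
```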
          

          Troubleshoot High Power Module Issues

          The SFF specifications allow for modules of different power consumption levels along with a request and grant procedure to enable higher levels.

          An SFP module can have 3 different power classes:

          1. 1.0W
          2. 1.5W
          3. 2.0W

          Cumulus Linux enables power class 2 (1.5W) by default. All Cumulus Linux switches support 1.5W across all SFP ports simultaneously.

          A QSFP module can have 8 different power classes:

          1. 1.5W
          2. 2.0W
          3. 2.5W
          4. 3.5W
          5. 4.0W
          6. 4.5W
          7. 5.0W
          8. 10.0W

          Low power mode is power class 1 (1.5W). This is the state during initial boot.

          After hardware initialization, Cumulus Linux enables normal power mode on QSFP modules by default — power classes 2-4, 2.0W to 3.5W.

          All Cumulus Linux switches support 3.5W across all QSFP ports simultaneously.

          Some modules require high power modes for driving long distance lasers. Power classes 5-8 — 4.0W, 4.5W, 5.0W, 10.0W — are high power modes. If a module needs a high power mode, it can request it, which the switch grants if the port supports it.
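
The QSFP power classes and modes described above can be summarized in a small lookup. This is only a sketch of the classification in this section; the names qsfp_power_mode and port_can_grant are hypothetical helpers, not part of Cumulus Linux, and the maximum class a port supports must come from the hardware specifications.

```python
# QSFP power classes and their maximum power, as listed in this section.
QSFP_POWER_CLASS_WATTS = {
    1: 1.5, 2: 2.0, 3: 2.5, 4: 3.5,
    5: 4.0, 6: 4.5, 7: 5.0, 8: 10.0,
}

def qsfp_power_mode(power_class: int) -> str:
    """Map a QSFP power class to the mode names used in this section."""
    if power_class not in QSFP_POWER_CLASS_WATTS:
        raise ValueError(f"unknown QSFP power class: {power_class}")
    if power_class == 1:
        return "low"      # state during initial boot
    if power_class <= 4:
        return "normal"   # enabled by default after hardware initialization
    return "high"         # module must request it; the port must support it

def port_can_grant(requested_class: int, max_port_class: int) -> bool:
    """Assumption: a port grants any class up to the maximum it supports."""
    return requested_class <= max_port_class

print(qsfp_power_mode(7), QSFP_POWER_CLASS_WATTS[7], port_can_grant(7, 6))
```

For example, a power class 7 (5.0W) module in a port that supports up to power class 6 is not granted high power mode.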

          To determine if a switch supports higher power modes, consult the hardware manufacturer specifications for power limitations for a switch.

          NVIDIA switches vary in their support of high power modules. For example, on some NVIDIA Spectrum 1 switches, only the first and last two QSFP ports support up to QSFP power class 6 (4.5W) and only the first and last two SFP ports support SFP power class 3 (2.0W) modules. Other Spectrum 1 switches do not support high power ports at all. Consult the hardware manufacturer specifications for exact details of which ports support high power modules.

          The total bus power rating is the default power rating per port type (SFP: 1.5W, QSFP: 3.5W) multiplied by the number of ports of each type present on the bus.
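
As a worked example of this calculation, the sketch below multiplies the default per-port rating by the port count for each type. The port counts (48 SFP and 6 QSFP) are hypothetical and do not describe a specific switch model.

```python
from typing import Mapping

# Default power rating per port type, in watts, from the paragraph above.
DEFAULT_PORT_WATTS = {"SFP": 1.5, "QSFP": 3.5}

def total_bus_power(port_counts: Mapping[str, int]) -> float:
    """Sum of (default rating x number of ports) over each port type."""
    return sum(DEFAULT_PORT_WATTS[kind] * count
               for kind, count in port_counts.items())

# Hypothetical bus with 48 SFP ports and 6 QSFP ports:
# 48 x 1.5W + 6 x 3.5W = 72W + 21W = 93W
print(total_bus_power({"SFP": 48, "QSFP": 6}))
```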

          To see the requested and enabled status for a high power module, review the output of sudo ethtool -m. The following output is from a module with a power class between 1 and 4 (1.5W to 3.5W). Either the module does not request a high power class or the switch does not enable it.

          cumulus@switch:mgmt:~# sudo ethtool -m swp53
                  Identifier                                : 0x11 (QSFP28)
                  Extended identifier                       : 0x00
                  Extended identifier description           : 1.5W max. Power consumption <= ignore for high power modules
                  Extended identifier description           : No CDR in TX, No CDR in RX
                  Extended identifier description           : High Power Class (> 3.5 W) not enabled <= high power mode not requested or enabled
          

          The following is the output from a power class 7 (5.0W) module. The module is requesting power class 7, but the switch does not support or enable it. The switch only supports power class 6 on this port.

          cumulus@switch:mgmt:~# sudo ethtool -m swp49
          [sudo] password for cumulus:
                 Identifier                                : 0x11 (QSFP28)
                 Extended identifier                       : 0xcf
                 Extended identifier description           : 3.5W max. Power consumption  <= ignore for high power modules
                 Extended identifier description           : CDR present in TX, CDR present in RX
                 Extended identifier description           : 5.0W max. Power consumption,  High Power Class (> 3.5 W) not enabled <= Request 5.0W, not enabled
           
          

          The following is the output from a power class 6 (4.5W) module. The module is requesting power class 6 and the switch enables it.

          cumulus@switch:mgmt:~# sudo ethtool -m swp3
                  Identifier                                : 0x11 (QSFP28)
                  Extended identifier                       : 0xce
                  Extended identifier description           : 3.5W max. Power consumption <= ignore for high power modules
                  Extended identifier description           : CDR present in TX, CDR present in RX
                  Extended identifier description           : 4.5W max. Power consumption,  High Power Class (> 3.5 W) enabled <= Request 4.5W, enabled
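
The three outputs above differ only in the last Extended identifier description line. The following Python sketch, which assumes the line layout shown in these examples, extracts the requested high power wattage and whether the switch enabled it. The function name is hypothetical, and the text after "<=" is a documentation annotation that the sketch strips; it does not appear in real ethtool output.

```python
import re
from typing import Optional, Tuple

def high_power_status(ethtool_output: str) -> Tuple[Optional[float], Optional[bool]]:
    """Return (requested_watts, enabled) from the 'High Power Class' line.

    requested_watts is None when the module does not request a high power
    class; both values are None when no 'High Power Class' line is present.
    """
    requested = None
    enabled = None
    for line in ethtool_output.splitlines():
        if "Extended identifier description" not in line:
            continue
        _, _, value = line.partition(":")
        value = value.split("<=")[0].strip()  # drop the documentation annotation
        if "High Power Class" not in value:
            continue
        enabled = "not enabled" not in value
        match = re.search(r"(\d+(?:\.\d+)?)W max", value)
        requested = float(match.group(1)) if match else None
    return requested, enabled

# Lines copied from the swp49 example above.
SWP49 = """Extended identifier description           : 3.5W max. Power consumption  <= ignore for high power modules
Extended identifier description           : CDR present in TX, CDR present in RX
Extended identifier description           : 5.0W max. Power consumption,  High Power Class (> 3.5 W) not enabled <= Request 5.0W, not enabled
"""

print(high_power_status(SWP49))
```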
          

          Troubleshoot I2C Issues

          Ethernet switches contain multiple I2C buses set up for the switch CPU to communicate low speed control information with the port modules, fans, and power supplies within the system.

          On rare occasions, a port module with a defective I2C component or firmware can fail and lock up one or more I2C buses. Depending on the particular hardware design of a switch and the way in which the failure occurs, different symptoms of this failure display. Often traffic continues to work for a while in this failed condition, but sometimes the failure can cause modules to be incorrectly configured, resulting in link failures or increased error rates on a link. In the worst case, the switch reboots or locks up.

          Because I2C issues are in the low speed control circuitry of a module, they are unrelated to the high speed traffic rates on the data side of the module. Software bugs in Cumulus Linux do not cause these issues.

          When the I2C bus has issues or lockups, installed port modules might no longer show up in the output of sudo l1-show <swp> or sudo ethtool -m <swp>. A significant number of smbus or i2c or EEPROM read errors might be present in /var/log/syslog. After one module locks up the bus, some or all the other modules then exhibit problems, making it nearly impossible to tell which module is causing the failure.

          Failed I2C components or defective designs in port modules cause the overwhelming majority of I2C lockups. Low priced vendor modules cause most failures. High priced, high quality modules have a higher MTBF rating, but they can still fail, only with a much lower incidence.

          You might resolve the issue if you remove each port module one by one until the problem clears; this might indicate which module causes the failure. However, often the bus blocks in a way that requires a reboot or power cycle to clear the I2C failure. Clearing the failure in one of these ways works for a while, but when the conditions are right again, hours, days or months later, the marginal I2C component might fail again and lock up the switch.

          In the worst situations, a switch might have multiple bad or marginal I2C modules from the same vendor batch, making it difficult to determine which module or modules are bad.

          Because I2C problems can be very pernicious, often showing up again much later after the problem clears, deal with them quickly and forcefully.

          To verify that an I2C failure is occurring, run sudo tail -F /var/log/syslog and look for smbus or i2c or EEPROM read errors that continue to appear or appear in bursts.
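
If you prefer to filter a saved copy of the log rather than watch it live, a sketch like the following selects the lines that mention smbus, i2c, or EEPROM. The sample lines are invented placeholders and do not reproduce real Cumulus Linux log messages; the pattern is also broad and can match unrelated lines, so treat the result as a starting point.

```python
import re
from typing import Iterable, List

# Case-insensitive match on the three keywords named in this section.
I2C_ERROR_RE = re.compile(r"smbus|i2c|eeprom", re.IGNORECASE)

def i2c_error_lines(lines: Iterable[str]) -> List[str]:
    """Return the log lines that mention smbus, i2c, or EEPROM."""
    return [line.rstrip("\n") for line in lines if I2C_ERROR_RE.search(line)]

# Hypothetical log lines, for illustration only.
SAMPLE_LINES = [
    "placeholder: i2c transfer failed on bus (invented message)\n",
    "placeholder: interface swp1 link up (invented message)\n",
    "placeholder: EEPROM read error on port module (invented message)\n",
]

matches = i2c_error_lines(SAMPLE_LINES)
print(len(matches))
```

To use the sketch on a switch, pass an open file object for /var/log/syslog instead of the sample list; reading that file typically requires elevated privileges.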

          Based on the failure scenario when you discover the issue, choose when to address it: immediately or during a maintenance window.

          If the switch is operational again due to one of the above methods, but you have not identified the module that caused the failure, try the following approach:

          If needed, contact the NVIDIA Enterprise Support team for additional help.

          Monitoring Interfaces and Transceivers with NVUE

          NVUE enables you to check the status of an interface, and view and clear interface counters. Interface counters provide information about an interface, such as the number of packets dropped, the number of inbound and outbound packets not transmitted because of errors, and so on.

          Show Interface Configuration and Statistics

          To check the configuration and statistics for an interface, run the nv show interface <interface> command:

          cumulus@switch:~$ nv show interface swp1
                                    operational        applied  pending
          ------------------------  -----------------  -------  -------
          type                       swp                swp      
          [acl]                                                  
          evpn                                                   
            multihoming                                          
              uplink                                    off      
          ptp                                                    
            enable                                      off      
          router                                                 
            adaptive-routing                                     
              enable                                    off      
            ospf                                                 
              enable                                    on       
              area                                      none     
              cost                                      auto     
              mtu-ignore                                off      
              network-type                              broadcast
              passive                                   on       
              priority                                  1        
              authentication                                     
                enable                                  off      
              bfd                                                
                enable                                  off      
              timers                                             
                dead-interval                           40       
                hello-interval                          10       
                retransmit-interval                     5        
                transmit-delay                          1        
            ospf6                                                
              enable                                    off      
            pbr                                                  
              [map]                                              
            pim                                                  
              enable                                    off      
          synce                                                  
            enable                                      off      
          ip                                                     
            igmp                                                 
              enable                                    off      
            ipv4                                                 
              forward                                   on       
            ipv6                                                 
              enable                                    on       
              forward                                   on       
            neighbor-discovery                                   
              enable                                    on       
              [dnssl]                                            
              home-agent                                         
                enable                                  off      
              [prefix]                                           
              [rdnss]                                            
              router-advertisement                               
                enable                                  on       
                fast-retransmit                         on       
                hop-limit                               64       
                interval                                600000   
                interval-option                         off      
                lifetime                                1800     
                managed-config                          off      
                other-config                            off      
                reachable-time                          0        
                retransmit-time                         0        
                router-preference                       medium   
            vrrp                                                 
              enable                                    off      
            vrf                                         default  
            [gateway]                                            
          link                                                   
            auto-negotiate           off                on       
            duplex                   full               full     
            speed                    1G                 auto     
            fec                                         auto     
            mtu                      9216               9216     
            [breakout]                                           
            state                    up                 up       
            stats                                                
              carrier-transitions    4                           
              in-bytes               300 Bytes                   
              in-drops               5                           
              in-errors              0                           
              in-pkts                5                           
              out-bytes              9.73 MB
              out-drops              0                           
              out-errors             0                           
              out-pkts               140188                      
            mac                      48:b0:2d:ef:52:b8           
          ifindex                    3
          

          Show Interface Counters

          NVUE provides the following commands to show counters (statistics) for the interfaces on the switch.

          NVUE Command | Description
          nv show interface --view counters | Shows all statistics for all the interfaces configured on the switch, such as the total number of received and transmitted packets, and the number of received and transmitted dropped packets and error packets.
          nv show interface <interface> counters | Shows all statistics for a specific interface, such as the number of received and transmitted unicast, multicast and broadcast packets, the number of received and transmitted dropped packets and error packets, and the number of received and transmitted packets of a certain size.
          nv show interface <interface> counters errors | Shows the number of error packets for a specific interface, such as the number of received and transmitted packet alignment, oversize, undersize, and jabber errors.
          nv show interface <interface> counters drops | Shows the number of received and transmitted packet drops for a specific interface, such as ACL drops, buffer drops, queue drops, and non-queue drops.
          nv show interface <interface> counters pktdist | Shows the number of received and transmitted packets of a certain size for a specific interface.
          nv show interface <interface> counters qos | Shows QoS statistics for the specified interface. See Show Qos Counters.
          nv show interface <interface> counters ptp | Shows PTP statistics for a specific interface. See Show PTP Counters.

          The following example shows all statistics for all the interfaces configured on the switch:

          cumulus@switch$ nv show interface --view counters
          Interface       MTU    RX_OK  RX_ERR  RX_DRP  RX_OVR  TX_OK  TX_ERR  TX_DRP  TX_OVR  Flg  
          --------------  -----  -----  ------  ------  ------  -----  ------  ------  ------  -----
          BLUE            65575  1      0       0       0       0      0       4       0       OmRU 
          RED             65575  1      0       0       0       0      0       4       0       OmRU 
          bond1           9000   718    0       0       0       1091   0       0       0       BMmRU
          bond2           9000   727    0       0       0       1088   0       0       0       BMmRU
          bond3           9000   722    0       0       0       1089   0       0       0       BMmRU
          br_default      9216   360    0       10      0       475    0       0       0       BMRU 
          eth0            1500   946    0       0       0       299    0       0       0       BMRU 
          lo              65536  651    0       0       0       651    0       0       0       LRU  
          mgmt            65575  283    0       0       0       0      0       4       0       OmRU 
          peerlink        9216   4972   0       0       0       5028   0       0       0       BMmRU
          peerlink.4094   9216   3263   0       0       0       3224   0       0       0       BMRU 
          swp1            9000   721    0       0       0       1091   0       0       0       BMsRU
          swp2            9000   730    0       0       0       1088   0       0       0       BMsRU
          swp3            9000   725    0       0       0       1089   0       0       0       BMsRU
          swp49           9216   2807   0       0       0       2691   0       0       0       BMsRU
          swp50           9216   2165   0       0       0       2337   0       0       0       BMsRU
          swp51           9216   685    0       0       0       690    0       0       0       BMRU 
          swp52           9216   703    0       0       0       722    0       0       0       BMRU 
          swp53           9216   738    0       0       0       710    0       0       0       BMRU 
          swp54           9216   682    0       0       0       730    0       0       0       BMRU 
          vlan10          9216   108    0       20      0       91     0       0       0       BMRU 
          vlan10-v0       9216   63     0       20      0       45     0       0       0       BMRU 
          vlan20          9216   104    0       20      0       88     0       0       0       BMRU 
          vlan20-v0       9216   58     0       20      0       44     0       0       0       BMRU 
          vlan30          9216   112    0       20      0       94     0       0       0       BMRU 
          vlan30-v0       9216   61     0       20      0       44     0       0       0       BMRU 
          vlan4024_l3     9216   1      0       0       0       82     0       0       0       BMRU 
          vlan4024_l3-v0  9216   0      0       0       0       36     0       0       0       BMRU 
          vlan4036_l3     9216   1      0       0       0       85     0       0       0       BMRU 
          vlan4036_l3-v0  9216   0      0       0       0       37     0       0       0       BMRU 
          vxlan48         9216   45     0       0       0       21     0       0       0       BMRU
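
With many interfaces, it can help to filter this table down to the rows that have non-zero error or drop counters. The following Python sketch assumes the column layout shown above (a header row, a dashed separator row, then one row per interface) and uses a few rows copied from the example; the function name is hypothetical.

```python
from typing import Dict, List, Tuple

PROBLEM_COLUMNS = ("RX_ERR", "RX_DRP", "TX_ERR", "TX_DRP")

def interfaces_with_problems(output: str) -> List[Tuple[str, Dict[str, int]]]:
    """Return (interface, non-zero error/drop counters) for each affected row."""
    lines = [line for line in output.splitlines() if line.strip()]
    header = lines[0].split()
    results = []
    for line in lines[2:]:  # skip the header row and the dashed separator
        row = dict(zip(header, line.split()))
        bad = {col: int(row[col]) for col in PROBLEM_COLUMNS if int(row[col])}
        if bad:
            results.append((row["Interface"], bad))
    return results

# Rows copied from the example output above.
SAMPLE = """Interface       MTU    RX_OK  RX_ERR  RX_DRP  RX_OVR  TX_OK  TX_ERR  TX_DRP  TX_OVR  Flg
--------------  -----  -----  ------  ------  ------  -----  ------  ------  ------  -----
BLUE            65575  1      0       0       0       0      0       4       0       OmRU
bond1           9000   718    0       0       0       1091   0       0       0       BMmRU
br_default      9216   360    0       10      0       475    0       0       0       BMRU
swp1            9000   721    0       0       0       1091   0       0       0       BMsRU
"""

for name, counters in interfaces_with_problems(SAMPLE):
    print(name, counters)
```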
          

          The following example shows all statistics for swp1:

          cumulus@switch$ nv show interface swp1 counters
                               operational  applied
          -------------------  -----------  -------
          carrier-transitions  4                   
          
          Detailed Counters
          ====================
              Counter            Receive  Transmit
              -----------------  -------  --------
              Broadcast Packets  0        0       
              Multicast Packets  0        0       
              Total Octets       0        0       
              Total Packets      0        0       
              Unicast Packets    0        0       
          
          Drop Counters
          ================
              Counter          Receive  Transmit
              ---------------  -------  --------
              ACL Drops        0        n/a     
              Buffer Drops     0        n/a     
              Non-Queue Drops  n/a      0       
              Queue Drops      n/a      0       
              Total Drops      0        0       
          
          Error Counters
          =================
              Counter           Receive  Transmit
              ----------------  -------  --------
              Alignment Errors  0        n/a     
              FCS Errors        0        n/a     
              Jabber Errors     0        n/a     
              Length Errors     0        n/a     
              Oversize Errors   0        n/a     
              Symbol Errors     0        n/a     
              Total Errors      0        0       
              Undersize Errors  0        n/a     
          
          Packet Size Statistics
          =========================
              Counter     Receive  Transmit
              ----------  -------  --------
              64          0        0       
              65-127      0        0       
              128-255     0        0       
              256-511     0        0       
              512-1023    0        0       
              1024-1518   0        0       
              1519-2047   0        0       
              2048-4095   0        0       
              4096-16383  0        0
          
          Ingress Buffer Statistics
          ============================
              priority-group  rx-frames  rx-buffer-discards  rx-shared-buffer-discards
              --------------  ---------  ------------------  -------------------------
              0               0          0 Bytes             0 Bytes                  
              1               0          0 Bytes             0 Bytes                  
              2               0          0 Bytes             0 Bytes                  
              3               0          0 Bytes             0 Bytes                  
              4               0          0 Bytes             0 Bytes                  
              5               0          0 Bytes             0 Bytes                  
              6               0          0 Bytes             0 Bytes                  
              7               0          0 Bytes             0 Bytes 
          ...
          

          The following example shows error counters for swp1:

          cumulus@switch$ nv show interface swp1 counters errors
          Counter           Receive  Transmit
          ----------------  -------  --------
          Alignment Errors  0        n/a     
          FCS Errors        0        n/a     
          Jabber Errors     0        n/a     
          Length Errors     0        n/a     
          Oversize Errors   0        n/a     
          Symbol Errors     0        n/a     
          Total Errors      0        0       
          Undersize Errors  0        n/a   
          

          AmBER PHY Health Management

          To show physical layer information, such as the error counters for each lane on a port, run the nv show interface <interface> link phy-detail command. This command highlights link integrity issues.

          The effective-ber in the command output represents the uncorrectable bit error rate, which is the same as uncorrected FEC errors.

          cumulus@switch$ nv show interface swp1 link phy-detail 
                                    operational
          -------------------------  -----------------
          time-since-last-clear-min  324
          phy-received-bits          15561574400000000
          symbol-errors              0
          effective-errors           0
          phy-raw-errors-lane0       747567424
          phy-raw-errors-lane1       215603747
          phy-raw-errors-lane2       158456437
          phy-raw-errors-lane3       30578923
          phy-raw-errors-lane4       121708834
          phy-raw-errors-lane5       29244642
          phy-raw-errors-lane6       79102523
          phy-raw-errors-lane7       96656135
          raw-ber                    1E-7
          symbol-ber                 15E-255
          effective-ber              15E-255
          raw-ber-lane0              3E-6
          raw-ber-lane1              9E-7
          raw-ber-lane2              6E-7
          raw-ber-lane3              1E-7
          raw-ber-lane4              5E-7
          raw-ber-lane5              1E-7
          raw-ber-lane6              3E-7
          raw-ber-lane7              4E-7
          rs-num-corr-err-bin0       757956054591
          rs-num-corr-err-bin1       598244758
          rs-num-corr-err-bin2       807002
          rs-num-corr-err-bin3       3371
          rs-num-corr-err-bin4       180
          rs-num-corr-err-bin5       1
          rs-num-corr-err-bin6       0
          rs-num-corr-err-bin7       0
          rs-num-corr-err-bin8       1
          rs-num-corr-err-bin9       0
          rs-num-corr-err-bin10      0
          rs-num-corr-err-bin11      0
          rs-num-corr-err-bin12      0
          rs-num-corr-err-bin13      0
          rs-num-corr-err-bin14      0
          rs-num-corr-err-bin15      0
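
To compare lanes quickly, the output above can be read into a dictionary and the lane with the highest raw BER picked out. This Python sketch assumes the two-column layout shown above and uses a subset of the example values; the function names are hypothetical, and the sketch does not apply any pass or fail threshold.

```python
from typing import Dict, Optional

def parse_phy_detail(output: str) -> Dict[str, str]:
    """Parse 'name value' rows, skipping the header and dashed separator."""
    values = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2 and not parts[0].startswith("-"):
            values[parts[0]] = parts[1]
    return values

def worst_lane(values: Dict[str, str]) -> Optional[str]:
    """Return the raw-ber-laneN key with the highest bit error rate."""
    lanes = {key: float(val) for key, val in values.items()
             if key.startswith("raw-ber-lane")}
    return max(lanes, key=lanes.get) if lanes else None

# Subset of the example output above.
SAMPLE = """                           operational
-------------------------  -----------------
effective-errors           0
effective-ber              15E-255
raw-ber-lane0              3E-6
raw-ber-lane1              9E-7
raw-ber-lane2              6E-7
raw-ber-lane3              1E-7
"""

values = parse_phy_detail(SAMPLE)
print(worst_lane(values), int(values["effective-errors"]))
```

In the example, lane 0 has the highest raw BER, while effective-errors is 0, which means FEC corrected the raw errors.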
          

          To show physical layer diagnostic information for a port, run the nv show interface <interface> link phy-diag command:

          cumulus@switch$ nv show interface swp20 link phy-diag 
                                            operational
          --------------------------------  -----------
          pd-fsm-state                      0x7
          eth-an-fsm-state                  0x6
          phy-hst-fsm-state                 0x8
          psi-fsm-state                     0x0
          phy-manager-link-enabled          0x9bff0
          core-to-phy-link-enabled          0x9b800
          cable-proto-cap-ext               0x0
          loopback-mode                     0x0
          retran-mode-request               0x0
          retran-mode-active                0x0
          fec-mode-request                  0x1
          profile-fec-in-use                0x4
          pd-link-enabled                   0x80000
          phy-hst-link-enabled              0x80000
          eth-an-link-enabled               0x0
          phy-manager-state                 0x3
          eth-proto-admin                   0x0
          ext-eth-proto-admin               0x0
          eth-proto-capability              0x0
          ext-eth-proto-capability          0x0
          data-rate-oper                    0x0
          an-status                         0x0
          an-disable-admin                  0x0
          proto-mask                        0x2
          module-info-ext                   0x0
          ethernet-compliance-code          0x1c
          ext-ethernet-compliance-code      0x32
          memory-map-rev                    0x40
          linear-direct-drive               0x0
          cable-breakout                    0x0
          cable-rx-amp                      0x1
          cable-rx-pre-emphasis             0x0
          cable-rx-post-emphasis            0x0
          cable-tx-equalization             0x0
          cable-attenuation-53g             0x0
          cable-attenuation-25g             0x0
          cable-attenuation-12g             0x0
          cable-attenuation-7g              0x0
          cable-attenuation-5g              0x0
          tx-input-freq-sync                0x0
          tx-cdr-state                      0xff
          rx-cdr-state                      0xff
          module-fw-version                 0x2e820043
          module-st                         0x3
          dp-st-lane0                       0x4
          dp-st-lane1                       0x4
          dp-st-lane2                       0x4
          dp-st-lane3                       0x4
          dp-st-lane4                       0x4
          dp-st-lane5                       0x4
          dp-st-lane6                       0x4
          dp-st-lane7                       0x4
          rx-output-valid                   0x0
          rx-power-type                     0x1
          active-set-host-compliance-code   0x52
          active-set-media-compliance-code  0x1c
          error-code-response               0x0
          temp-flags                        0x0
          vcc-flags                         0x0
          mod-fw-fault                      0x0
          dp-fw-fault                       0x0
          rx-los-cap                        0x0
          tx-fault                          0x0
          tx-los                            0x0
          tx-cdr-lol                        0x0
          tx-ad-eq-fault                    0x0
          rx-los                            0x0
          rx-cdr-lol                        0x0
          rx-output-valid-change            0x0
          flag-in-use                       0x0
          

          Switches with the Spectrum 1 ASIC do not support the nv show interface <interface> link phy-detail command or the nv show interface <interface> link phy-diag command.

          Clear Interface Counters

          To clear counters (statistics) for all interfaces, run the nv action clear interface counters command:

          cumulus@switch$ nv action clear interface counters
          all interface counters cleared
          Action succeeded
          

          To clear the counters for an interface, run the nv action clear interface <interface> counters command:

          cumulus@switch$ nv action clear interface swp1 counters
          swp1 counters cleared
          Action succeeded
          

          The nv action clear interface <interface> counters command does not clear counters in the hardware.

          Reset a Transceiver

          NVUE provides a command to reset a specific transceiver to its initial, stable state without having to be present physically in the data center to pull the transceiver.

          The following example resets the transceiver in swp1:

          cumulus@switch:~$ nv action reset platform transceiver swp1 
          Action executing ... 
          Resetting module swp1 ... OK 
          Action succeeded 
          

          The following example resets a range of transceivers:

          cumulus@switch:~$ nv action reset platform transceiver swp1-swp5
          Action executing ... 
          Resetting module swp1-swp5 ... OK 
          Action succeeded 
          

          When the reset completes successfully, you see the following switchd.log messages:

          hal_mlx_host_ifc.c:3392 port [104] module state has changed to [Unplugged] 
          hal_mlx_host_ifc.c:3392 port [104] module state has changed to [Plugged] 
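
If you collect switchd.log remotely, you can confirm the Unplugged then Plugged sequence with a short script. The following Python sketch assumes the message format shown above; the function name is hypothetical, and the source line number in the message (3392) can differ between releases, so the pattern does not depend on it.

```python
import re
from typing import Iterable, Set

STATE_RE = re.compile(r"port \[(\d+)\] module state has changed to \[(\w+)\]")

def ports_reset_ok(log_lines: Iterable[str]) -> Set[str]:
    """Return the ports that logged Unplugged and then Plugged."""
    last_state = {}
    completed = set()
    for line in log_lines:
        match = STATE_RE.search(line)
        if not match:
            continue
        port, state = match.groups()
        if state == "Plugged" and last_state.get(port) == "Unplugged":
            completed.add(port)
        last_state[port] = state
    return completed

# Messages copied from the example above.
SAMPLE = [
    "hal_mlx_host_ifc.c:3392 port [104] module state has changed to [Unplugged]",
    "hal_mlx_host_ifc.c:3392 port [104] module state has changed to [Plugged]",
]

print(ports_reset_ok(SAMPLE))
```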
          

          If a cable is faulty, the nv action reset platform transceiver <transceiver-id> command completes successfully, but the module does not come back unless you resolve the issue or, if necessary, reboot the system.

          Show Transceiver Information

          To show the identifier, vendor name, part number, serial number, and revision for all modules, run the nv show platform transceiver command:

          cumulus@switch:~$ nv show platform transceiver 
          Transceiver  Identifier  Vendor name  Vendor PN         Vendor SN      Vendor revision
          -----------  ----------  -----------  ----------------  -------------  --------------- 
          swp1         QSFP28      Mellanox     MCP1600-C001E30N  MT2039VB01185  A3 
          swp10        QSFP28      Mellanox     MCP1600-C001E30N  MT2211VS01792  A3 
          swp11        QSFP28      Mellanox     MCP1600-C001E30N  MT2211VS01792  A3 
          swp12        QSFP28      Mellanox     MCP1650-V00AE30   MT2122VB02220  A2 
          swp13        QSFP28      Mellanox     MCP1650-V00AE30   MT2122VB02220  A2 
          swp14        QSFP-DD     Mellanox     MCP1660-W00AE30   MT2121VS01645  A3 
          swp15        QSFP-DD     Mellanox     MCP1660-W00AE30   MT2121VS01645  A3 
          swp18        QSFP28      Mellanox     MCP1600-C001E30N  MT2211VS01967  A3 
          swp20        QSFP28      Mellanox     MFA1A00-C003      MT2108FT02204  B2 
          swp21        QSFP28      Mellanox     MFA1A00-C003      MT2108FT02204  B2 
          swp22        QSFP28      Mellanox     MFA1A00-C003      MT2108FT02194  B2 
          swp23        QSFP28      Mellanox     MFA1A00-C003      MT2108FT02194  B2 
          swp31        QSFP28      Mellanox     MCP1600-C001E30N  MT2039VB01191  A3 
          

          To show a detailed view of module information for all ports that includes cable length, type, and diagnostics, current status and error status, run the nv show platform transceiver details command.

          To show hardware capabilities and measurement information on the module in a particular port, run the nv show platform transceiver <interface> command:

          cumulus@switch:~$ nv show platform transceiver swp2
          cable-type             : Active cable 
          cable-length           : 3m 
          supported-cable-length : 0m om1, 0m om2, 0m om3, 3m om4, 0m om5 
          diagnostics-status     : Diagnostic Data Available 
          status                 : plugged_enabled 
          error-status           : N/A 
          vendor-date-code       : 210215__ 
          identifier             : QSFP28 
          vendor-rev             : B2 
          vendor-name            : Mellanox 
          vendor-pn              : MFA1A00-C003 
          vendor-sn              : MT2108FT02204 
          temperature: 
            temperature           : 42.56 C 
            high-alarm-threshold  : 80.00 C 
            low-alarm-threshold   : -10.00 C 
            high-warning-threshold: 70.00 C 
            low-warning-threshold : 0.00 C 
            alarm                 : Off 
          voltage: 
            voltage               : 3.2862 V 
            high-alarm-threshold  : 3.5000 V 
            low-alarm-threshold   : 3.1000 V 
            high-warning-threshold: 3.4650 V 
            low-warning-threshold : 3.1350 V 
            alarm                 : Off 
          channel: 
            channel-1: 
              rx-power: 
                  power                 : 0.8625 mW / -0.64 dBm 
                  high-alarm-threshold  : 5.40 dBm 
                  low-alarm-threshold   : -13.31 dBm 
                  high-warning-threshold: 2.40 dBm 
                  low-warning-threshold : -10.30 dBm 
                  alarm                 : Off 
              tx-power: 
                  power                 : 0.8988 mW / -0.46 dBm 
                  high-alarm-threshold  : 5.40 dBm 
                  low-alarm-threshold   : -11.40 dBm 
                  high-warning-threshold: 2.40 dBm 
                  low-warning-threshold : -8.40 dBm 
                  alarm                 : Off 
              tx-bias-current: 
                  current               : 6.750 mA 
                  high-alarm-threshold  : 8.500 mA 
                  low-alarm-threshold   : 5.492 mA 
                  high-warning-threshold: 8.000 mA 
                  low-warning-threshold : 6.000 mA 
                  alarm                 : Off
          ...  
          

          You can also show transceiver data in a more condensed format with the nv show interface <interface> transceiver command:

          cumulus@switch:~$ nv show interface swp1 transceiver
          cable-type             : Active cable 
          cable-length           : 3m 
          supported-cable-length : 0m om1, 0m om2, 0m om3, 3m om4, 0m om5 
          diagnostics-status     : Diagnostic Data Available 
          status                 : plugged_enabled 
          error-status           : N/A 
          revision-compliance    : SFF-8636 Rev 2.5/2.6/2.7 
          vendor-date-code       : 210215__ 
          identifier             : QSFP28 
          vendor-rev             : B2 
          vendor-oui             : 00:02:c9 
          vendor-name            : Mellanox 
          vendor-pn              : MFA1A00-C003 
          vendor-sn              : MT2108FT02204 
          temperature            : 42.56 degrees C / 108.61 degrees F 
          voltage                : 3.2888 V 
          ch-1-rx-power          : 0.8625 mW / -0.64 dBm 
          ch-1-tx-power          : 0.8988 mW / -0.46 dBm 
          ch-1-tx-bias-current   : 6.750 mA 
          ch-2-rx-power          : 0.8385 mW / -0.76 dBm 
          ch-2-tx-power          : 0.9154 mW / -0.38 dBm 
          ch-2-tx-bias-current   : 6.750 mA 
          ch-3-rx-power          : 0.8556 mW / -0.68 dBm 
          ch-3-tx-power          : 0.9537 mW / -0.21 dBm 
          ch-3-tx-bias-current   : 6.750 mA 
          ch-4-rx-power          : 0.8576 mW / -0.67 dBm 
          ch-4-tx-power          : 0.9695 mW / -0.13 dBm 
          ch-4-tx-bias-current   : 6.750 mA 
          

          To show channel information for the module in a particular port, run the nv show platform transceiver <interface> channel command. To show specific channel information for the module in a particular port, run the nv show platform transceiver <interface> channel <channel> command.

          cumulus@switch:~$ nv show platform transceiver swp25 channel 
          channel: 
            channel-1: 
              rx-power: 
                  power                 : 0.8625 mW / -0.64 dBm 
                  high-alarm-threshold  : 5.40 dBm 
                  low-alarm-threshold   : -13.31 dBm 
                  high-warning-threshold: 2.40 dBm 
                  low-warning-threshold : -10.30 dBm 
                  alarm                 : Off 
              tx-power: 
                  power                 : 0.8988 mW / -0.46 dBm 
                  high-alarm-threshold  : 5.40 dBm 
                  low-alarm-threshold   : -11.40 dBm 
                  high-warning-threshold: 2.40 dBm 
                  low-warning-threshold : -8.40 dBm 
                  alarm                 : Off 
              tx-bias-current: 
                  current               : 6.750 mA 
                  high-alarm-threshold  : 8.500 mA 
                  low-alarm-threshold   : 5.492 mA 
                  high-warning-threshold: 8.000 mA 
                  low-warning-threshold : 6.000 mA 
                  alarm                 : Off 
            channel-2: 
              rx-power: 
                  power                 : 0.8385 mW / -0.76 dBm 
                  high-alarm-threshold  : 5.40 dBm 
                  low-alarm-threshold   : -13.31 dBm 
                  high-warning-threshold: 2.40 dBm 
                  low-warning-threshold : -10.30 dBm 
                  alarm                 : Off 
              tx-power: 
                  power                 : 0.9154 mW / -0.38 dBm 
                  high-alarm-threshold  : 5.40 dBm 
                  low-alarm-threshold   : -11.40 dBm 
                  high-warning-threshold: 2.40 dBm 
                  low-warning-threshold : -8.40 dBm 
                  alarm                 : Off
          ...
          

          To show the thresholds for the module for a specific interface, run the nv show interface <interface> transceiver thresholds command:

          cumulus@switch:~$ nv show interface swp3 transceiver thresholds
                               Ch    Value          High Alarm       High Warn        Low Warn       Low Alarm       Alt Value 
                                                    Threshold        Threshold       Threshold       Threshold 
          ------------------------------------------------------------------------------------------------------------------------ 
          temperature          -     42.74 C         80.00 C         70.00 C         0.00 C          -10.00 C        108.94F 
          voltage              -     3.2862 V        3.5000 V        3.4650 V        3.1350 V        3.1000 V 
          rx-power             1     -0.64 dBm       5.40 dBm        2.40 dBm        -10.30 dBm      -13.31 dBm      0.8625 mW 
                               2     -0.70 dBm       5.40 dBm        2.40 dBm        -10.30 dBm      -13.31 dBm      0.8514 mW 
                               3     -0.68 dBm       5.40 dBm        2.40 dBm        -10.30 dBm      -13.31 dBm      0.8556 mW 
                               4     -0.60 dBm       5.40 dBm        2.40 dBm        -10.30 dBm      -13.31 dBm      0.8704 mW 
          tx-power             1     -0.48 dBm       5.40 dBm        2.40 dBm        -8.40 dBm       -11.40 dBm      0.8963 mW 
                               2     -0.38 dBm       5.40 dBm        2.40 dBm        -8.40 dBm       -11.40 dBm      0.9154 mW 
                               3     -0.19 dBm       5.40 dBm        2.40 dBm        -8.40 dBm       -11.40 dBm      0.9562 mW 
                               4     -0.13 dBm       5.40 dBm        2.40 dBm        -8.40 dBm       -11.40 dBm      0.9695 mW 
          tx-bias-current      1     6.750 mA        8.500 mA        8.000 mA        6.000 mA        5.492 mA 
                               2     6.750 mA        8.500 mA        8.000 mA        6.000 mA        5.492 mA 
                               3     6.750 mA        8.500 mA        8.000 mA        6.000 mA        5.492 mA 
                               4     6.750 mA        8.500 mA        8.000 mA        6.000 mA        5.492 mA 
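
          To interpret the threshold table, compare each value against its four limits. The following Python sketch shows one way to do that for values you have already extracted from the output. The class and function names are hypothetical, and the sketch assumes that a limit counts as crossed only when the value goes strictly past it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Alarm and warning limits for one measurement, in the same unit as the value."""
    high_alarm: float
    high_warn: float
    low_warn: float
    low_alarm: float


def classify(value: float, limits: Thresholds) -> str:
    """Return the most severe condition the value triggers, or "ok"."""
    # Assumption: a limit is crossed only when the value goes strictly past it.
    if value > limits.high_alarm:
        return "high-alarm"
    if value < limits.low_alarm:
        return "low-alarm"
    if value > limits.high_warn:
        return "high-warning"
    if value < limits.low_warn:
        return "low-warning"
    return "ok"


# Limits copied from the rx-power and temperature rows of the sample output.
rx_power = Thresholds(high_alarm=5.40, high_warn=2.40, low_warn=-10.30, low_alarm=-13.31)
temperature = Thresholds(high_alarm=80.00, high_warn=70.00, low_warn=0.00, low_alarm=-10.00)

print(classify(-0.64, rx_power))     # channel 1 reading from the sample: ok
print(classify(42.74, temperature))  # temperature reading from the sample: ok
print(classify(-11.00, rx_power))    # hypothetical weak signal: low-warning
```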
          

          Monitoring Interfaces and Transceivers with ethtool

          The ethtool command enables you to query or control the network driver and hardware settings and takes the device name, such as swp1, as an argument. When the device name is the only argument, ethtool prints the network device settings. See man ethtool(8) for details.

          NVIDIA recommends using the l1-show command to monitor Ethernet data; refer to Troubleshoot Layer 1.

          Monitor Interface Status

          To check the status of an interface, run the ethtool <interface> command:

          cumulus@switch:~$ ethtool swp1
          Settings for swp1:
                  Supported ports: [ FIBRE ]
                  Supported link modes:   1000baseT/Full
                                          10000baseT/Full
                  Supported pause frame use: No
                  Supports auto-negotiation: No
                  Advertised link modes:  1000baseT/Full
                  Advertised pause frame use: No
                  Advertised auto-negotiation: No
                  Speed: 10000Mb/s
                  Duplex: Full
                  Port: FIBRE
                  PHYAD: 0
                  Transceiver: external
                  Auto-negotiation: off
                  Current message level: 0x00000000 (0)
          
          Link detected: yes
          

          The switch hardware includes the active port settings. The output of ethtool <interface> shows the port settings in the kernel. The switchd process keeps the hardware and kernel in sync for the important port settings (speed, auto-negotiation, and link detected). However, some fields in ethtool, such as Supported Link Modes and Advertised Link Modes, do not update based on the actual module in the port and might show incorrect or misleading results.

          To query interface statistics, run the ethtool -S <interface> command:

          cumulus@switch:~$ ethtool -S swp1
          NIC statistics:
               rx_queue_0_packets: 5
               rx_queue_0_bytes: 300
               rx_queue_0_drops: 0
               rx_queue_0_xdp_packets: 0
               rx_queue_0_xdp_tx: 0
               rx_queue_0_xdp_redirects: 0
               rx_queue_0_xdp_drops: 0
               rx_queue_0_kicks: 1
               tx_queue_0_packets: 144957
               tx_queue_0_bytes: 10546468
               tx_queue_0_xdp_tx: 0
               tx_queue_0_xdp_tx_drops: 0
               tx_queue_0_kicks: 144950
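
          If you collect ethtool -S output in a script, the name: value layout is straightforward to parse. The following Python sketch is one possible approach; the function name is hypothetical and the sample text is an excerpt of the output above.

```python
def parse_ethtool_stats(text: str) -> dict:
    """Parse `ethtool -S` output into a dict that maps counter name to integer value."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        # Skip the "NIC statistics:" title and any line without an integer value.
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


SAMPLE = """\
NIC statistics:
     rx_queue_0_packets: 5
     rx_queue_0_bytes: 300
     rx_queue_0_drops: 0
     tx_queue_0_packets: 144957
     tx_queue_0_bytes: 10546468
"""

stats = parse_ethtool_stats(SAMPLE)
print(stats["tx_queue_0_packets"])                               # 144957
print([name for name in stats if name.endswith("_drops") and stats[name] > 0])  # []
```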
          

          View and Clear Interface Counters

          Interface counters provide information about an interface. You can view this information when you run cl-netstat, ifconfig, or cat /proc/net/dev. You can also run sudo cl-netstat -c to save or clear the interface counters.

          cumulus@switch:~$ sudo cl-netstat
          Kernel Interface table
          Iface            MTU    RX_OK    RX_ERR    RX_DRP    RX_OVR    TX_OK    TX_ERR    TX_DRP    TX_OVR  Flg
          -------------  -----  -------  --------  --------  --------  -------  --------  --------  --------  -----
          lo             65536   185932         0         0         0   185932         0         0         0  LRU
          eth0            1500   151883         0         0         0    13504         0         0         0  BMRU
          swp1            9216        5         0         5         0   144986         0         0         0  BMsRU
          swp2            9216        5         0         5         0   144988         0         0         0  BMsRU
          swp3            9216        5         0         5         0   144944         0         0         0  BMsRU
          swp49           9216   502662         0         5         0   502629         0         0         0  BMsRU
          swp50           9216   507636         0         5         0   507666         0         0         0  BMsRU
          swp51           9216   749122         0         5         0   794080         0         0         0  BMRU
          swp52           9216   216057         0         5         0   212567         0         0         0  BMRU
          bond1           9216        0         0         0         0   144942         0         0         0  BMmRU
          bond2           9216        0         0         0         0   144944         0         0         0  BMmRU
          bond3           9216        0         0         0         0   144944         0         0         0  BMmRU
          br_default      9216     5072         0         0         0     5074         0         0         0  BMRU
          mgmt           65575     3365         0         0         0        0         0       936         0  OmRU
          peerlink        9216  1010288         0         0         0  1010295         0         0         0  BMmRU
          peerlink.4094   9216   506672         0         0         0   506668         0         0         0  BMRU
          vlan10          9216     1687         0         0         0     1687         0         0         0  BMRU
          vlan10-v0       9216     1678         0         0         0     1677         0         0         0  BMRU
          vlan20          9216     1688         0         0         0     1688         0         0         0  BMRU
          vlan20-v0       9216     1678         0         0         0     1677         0         0         0  BMRU
          vlan30          9216     1687         0         0         0     1689         0         0         0  BMRU
          vlan30-v0       9216     1678         0         0         0     1678         0         0         0  BMRU
          
          cumulus@switch:~$ sudo cl-netstat -c
          Cleared counters
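
          To find interfaces with non-zero drop counters in a long cl-netstat table, you can parse the output in a script. The following Python sketch assumes the column layout shown above; the function name is hypothetical and the sample text contains four rows copied from the output.

```python
def parse_cl_netstat(text):
    """Turn cl-netstat table output into a list of per-interface dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("-"):
            continue  # blank line or the dashed separator
        if fields[0] == "Iface":
            header = fields
            continue
        if header is None or len(fields) != len(header):
            continue  # title line or anything that does not fit the table
        row = dict(zip(header, fields))
        for name in header[1:-1]:  # every column except Iface and Flg is numeric
            row[name] = int(row[name])
        rows.append(row)
    return rows


SAMPLE = """\
Kernel Interface table
Iface            MTU    RX_OK    RX_ERR    RX_DRP    RX_OVR    TX_OK    TX_ERR    TX_DRP    TX_OVR  Flg
-------------  -----  -------  --------  --------  --------  -------  --------  --------  --------  -----
lo             65536   185932         0         0         0   185932         0         0         0  LRU
eth0            1500   151883         0         0         0    13504         0         0         0  BMRU
swp1            9216        5         0         5         0   144986         0         0         0  BMsRU
mgmt           65575     3365         0         0         0        0         0       936         0  OmRU
"""

rows = parse_cl_netstat(SAMPLE)
print([r["Iface"] for r in rows if r["RX_DRP"] > 0])  # ['swp1']
print([r["Iface"] for r in rows if r["TX_DRP"] > 0])  # ['mgmt']
```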
          

          To see the cl-netstat command options, run the cl-netstat -h command.

          Some services, such as MLAG and DHCP, can cause drop counters to increment as expected; these increments do not indicate a problem on the switch.

          Monitor Switch Port Hardware Information

          To see hardware capabilities and measurement information on the module in a particular port, use the ethtool -m command. If the module supports Digital Optical Monitoring (the Optical diagnostics support field is Yes in the output below), the optical power levels and thresholds also show below the standard hardware details.

          In the sample output below, you can see that this module is a 1000BASE-SX short-range optical module, manufactured by JDSU, part number PLRXPL-VI-S24-22. The second half of the output displays the current readings of the Tx power levels (Laser output power) and Rx power (Receiver signal average optical power), temperature, voltage and alarm threshold settings.

          cumulus@switch:~$ ethtool -m swp3
                  Identifier                                : 0x03 (SFP)
                  Extended identifier                       : 0x04 (GBIC/SFP defined by 2-wire interface ID)
                  Connector                                 : 0x07 (LC)
                  Transceiver codes                         : 0x00 0x00 0x00 0x01 0x20 0x40 0x0c 0x05
                  Transceiver type                          : Ethernet: 1000BASE-SX
                  Transceiver type                          : FC: intermediate distance (I)
                  Transceiver type                          : FC: Shortwave laser w/o OFC (SN)
                  Transceiver type                          : FC: Multimode, 62.5um (M6)
                  Transceiver type                          : FC: Multimode, 50um (M5)
                  Transceiver type                          : FC: 200 MBytes/sec
                  Transceiver type                          : FC: 100 MBytes/sec
                  Encoding                                  : 0x01 (8B/10B)
                  BR, Nominal                               : 2100MBd
                  Rate identifier                           : 0x00 (unspecified)
                  Length (SMF,km)                           : 0km
                  Length (SMF)                              : 0m
                  Length (50um)                             : 300m
                  Length (62.5um)                           : 150m
                  Length (Copper)                           : 0m
                  Length (OM3)                              : 0m
                  Laser wavelength                          : 850nm
                  Vendor name                               : JDSU
                  Vendor OUI                                : 00:01:9c
                  Vendor PN                                 : PLRXPL-VI-S24-22
                  Vendor rev                                : 1
                  Optical diagnostics support               : Yes
                  Laser bias current                        : 21.348 mA
                  Laser output power                        : 0.3186 mW / -4.97 dBm
                  Receiver signal average optical power     : 0.3195 mW / -4.96 dBm
                  Module temperature                        : 41.70 degrees C / 107.05 degrees F
                  Module voltage                            : 3.2947 V
                  Alarm/warning flags implemented           : Yes
                  Laser bias current high alarm             : Off
                  Laser bias current low alarm              : Off
                  Laser bias current high warning           : Off
                  Laser bias current low warning            : Off
                  Laser output power high alarm             : Off
                  Laser output power low alarm              : Off
                  Laser output power high warning           : Off
                  Laser output power low warning            : Off
                  Module temperature high alarm             : Off
                  Module temperature low alarm              : Off
                  Module temperature high warning           : Off
                  Module temperature low warning            : Off
                  Module voltage high alarm                 : Off
                  Module voltage low alarm                  : Off
                  Module voltage high warning               : Off
                  Module voltage low warning                : Off
                  Laser rx power high alarm                 : Off
                  Laser rx power low alarm                  : Off
                  Laser rx power high warning               : Off
                  Laser rx power low warning                : Off
                  Laser bias current high alarm threshold   : 10.000 mA
                  Laser bias current low alarm threshold    : 1.000 mA
                  Laser bias current high warning threshold : 9.000 mA
                  Laser bias current low warning threshold  : 2.000 mA
                  Laser output power high alarm threshold   : 0.8000 mW / -0.97 dBm
                  Laser output power low alarm threshold    : 0.1000 mW / -10.00 dBm
                  Laser output power high warning threshold : 0.6000 mW / -2.22 dBm
                  Laser output power low warning threshold  : 0.2000 mW / -6.99 dBm
                  Module temperature high alarm threshold   : 90.00 degrees C / 194.00 degrees F
                  Module temperature low alarm threshold    : -40.00 degrees C / -40.00 degrees F
                  Module temperature high warning threshold : 85.00 degrees C / 185.00 degrees F
                  Module temperature low warning threshold  : -40.00 degrees C / -40.00 degrees F
                  Module voltage high alarm threshold       : 4.0000 V
                  Module voltage low alarm threshold        : 0.0000 V
                  Module voltage high warning threshold     : 3.6450 V
                  Module voltage low warning threshold      : 2.9550 V
                  Laser rx power high alarm threshold       : 1.6000 mW / 2.04 dBm
                  Laser rx power low alarm threshold        : 0.0100 mW / -20.00 dBm
                  Laser rx power high warning threshold     : 1.0000 mW / 0.00 dBm
                  Laser rx power low warning threshold      : 0.0200 mW / -16.99 dBm
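
          In the sample output, every alarm and warning flag is Off. To pick out any flag that is On from a long ethtool -m listing, you can filter the output in a script. The following Python sketch is one way to do that; the function name is hypothetical, and the two On values in the sample text are invented for illustration and do not come from the output above.

```python
def active_flags(text):
    """Return the names of alarm and warning flags that ethtool -m reports as On."""
    active = []
    for line in text.splitlines():
        # Split on the first colon only, because values such as the vendor OUI contain colons.
        name, sep, value = line.partition(":")
        if not sep:
            continue
        name, value = name.strip(), value.strip()
        # Flag lines end in "alarm" or "warning"; threshold lines end in "threshold".
        if name.endswith(("alarm", "warning")) and value == "On":
            active.append(name)
    return active


# Hypothetical excerpt: two flags are set to On for illustration only.
SAMPLE = """\
        Alarm/warning flags implemented           : Yes
        Laser bias current high alarm             : Off
        Laser output power low warning            : On
        Module temperature high alarm             : Off
        Laser rx power low alarm                  : On
        Laser rx power low alarm threshold        : 0.0100 mW / -20.00 dBm
"""

print(active_flags(SAMPLE))
# ['Laser output power low warning', 'Laser rx power low alarm']
```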
          

          Network Troubleshooting

          Cumulus Linux includes command line and analytical tools to help you troubleshoot issues with your network.

          ping

          Use the ping tool to check that a destination on a network is reachable. Ping sends ICMP Echo Request packets to the specified destination and listens for Echo Reply packets.

          You send Echo Request packets to a destination (IP address or a hostname) with the nv action ping system <destination> command to check if it is reachable. You can specify the following options:

          Option Description
          count The number of Echo Request packets to send. You can specify a value between 1 and 10. The default packet count is 3.
          interval How often to send Echo Request packets. You can specify a value between 0.1 and 5 seconds. The default value is 4.
          size The Echo Request packet size in bytes. You can specify a value between 1 and 9216. The default value is 64.
          time The number of seconds to wait for an Echo Reply packet before the ping request times out. You can specify a value between 0.1 and 10. The default value is 10.
          source The source IP address from which to send the Echo Request packets.
          do-not-fragment Do not fragment. If the packet is larger than the maximum transmission unit (MTU) of any network segment it traverses, drop the packet instead of fragmenting the packet.
          l3protocol The layer 3 protocol you want to use to send the Echo Request packets. You can specify IPv4 or IPv6. If you don’t specify either IPv4 or IPv6, ping uses IPv4.
          vrf The VRF you want to use.
          source-interface The source interface from which to send Echo Request packets for a link local address. IPv6 only.

          The following example sends Echo Request packets to destination 10.10.10.10 to check if it is reachable.

          cumulus@switch:~$ nv action ping system 10.10.10.10
          

          The following example sends Echo Request packets to IPv6 destination fe80::a00:27ff:fe00:0 to check if it is reachable.

          cumulus@switch:~$ nv action ping system fe80::a00:27ff:fe00:0 l3protocol ipv6
          

          The following example sends 5 Echo Request packets every 2 seconds to check if destination 10.10.10.10 is reachable and waits for 3 seconds for an Echo Reply packet before timing out.

          cumulus@switch:~$ nv action ping system 10.10.10.10 count 5 interval 2 time 3
          

          The following example sends 50-byte Echo Request packets to check if destination 10.10.10.10 is reachable.

          cumulus@switch:~$ nv action ping system 10.10.10.10 size 50
          

          The following example checks if destination 10.10.10.10 is reachable and drops the packet instead of fragmenting it if the packet is larger than the maximum transmission unit (MTU) of any network segment it traverses.

          cumulus@switch:~$ nv action ping system 10.10.10.10 do-not-fragment
          

          The following example sends Echo Request packets to destination 10.10.10.10 from the source IP address 10.10.5.1.

          cumulus@switch:~$ nv action ping system 10.10.10.10 source 10.10.5.1
          

          The following example sends Echo Request packets to destination 10.10.10.10 for the management VRF.

          cumulus@switch:~$ nv action ping system 10.10.10.10 vrf mgmt
          

          The following example sends Echo Request packets to destination fe80::a00:27ff:fe00:0 from source interface eth0.

          cumulus@switch:~$ nv action ping system fe80::a00:27ff:fe00:0 source-interface eth0 
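
          If you generate these commands from a script, you can check option values against the documented ranges before you run anything on the switch. The following Python sketch builds the argument list for the command; the function names are hypothetical, and the sketch assumes that the command accepts the options in the order they appear in the option table.

```python
import shlex


def _check_range(name, value, low, high):
    """Raise ValueError when value falls outside the documented range."""
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")


def build_nv_ping(destination, count=None, interval=None, size=None, time=None,
                  source=None, do_not_fragment=False, l3protocol=None,
                  vrf=None, source_interface=None):
    """Return the argument list for an `nv action ping system` command."""
    args = ["nv", "action", "ping", "system", destination]
    if count is not None:
        _check_range("count", count, 1, 10)
        args += ["count", str(count)]
    if interval is not None:
        _check_range("interval", interval, 0.1, 5)
        args += ["interval", f"{interval:g}"]
    if size is not None:
        _check_range("size", size, 1, 9216)
        args += ["size", str(size)]
    if time is not None:
        _check_range("time", time, 0.1, 10)
        args += ["time", f"{time:g}"]
    if source is not None:
        args += ["source", source]
    if do_not_fragment:
        args.append("do-not-fragment")
    if l3protocol is not None:
        if l3protocol not in ("ipv4", "ipv6"):
            raise ValueError("l3protocol must be ipv4 or ipv6")
        args += ["l3protocol", l3protocol]
    if vrf is not None:
        args += ["vrf", vrf]
    if source_interface is not None:
        args += ["source-interface", source_interface]
    return args


print(shlex.join(build_nv_ping("10.10.10.10", count=5, interval=2, time=3)))
# nv action ping system 10.10.10.10 count 5 interval 2 time 3
```

          Returning a list instead of a single string lets you pass the result directly to a process launcher without shell quoting issues.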
          

          You send Echo Request packets to a destination (IP address or a hostname) with the ping command to check if it is reachable. You can specify the following options:

          Option Description
          -c The number of Echo Request packets to send. You can specify a value between 1 and 10. The default packet count is 3.
          -i How often to send Echo Request packets. You can specify a value between 0.1 and 5 seconds. The default value is 4.
          -s The packet size in bytes. You can specify a value between 1 and 9216. The default value is 64.
          -W The number of seconds to wait for an Echo Reply packet before the ping request times out. You can specify a value between 0.1 and 10. The default value is 10.
          -I <ip-address> The source IP address from which to send the Echo Request packets.
          -M do Do not fragment. If the packet is larger than the maximum transmission unit (MTU) of any network segment it traverses, drop the packet instead of fragmenting the packet.
          <l3protocol> The layer 3 protocol you want to use to send the Echo Request packets. You can specify -4 for IPv4 or -6 for IPv6. If you don’t specify either IPv4 or IPv6, ping uses IPv4.
          -I <vrf-name> The VRF you want to use.
          -6 <ipv6-address>%<interface> The source interface from which to send Echo Request packets for a link local address. IPv6 only.

          The following example checks if destination 10.10.10.10 is reachable on the network.

          cumulus@switch:~$ ping 10.10.10.10
          

          The following example sends Echo Request packets to destination fe80::a00:27ff:fe00:0 for IPv6.

          cumulus@switch:~$ ping -6 fe80::a00:27ff:fe00:0
          

          The following example sends 5 Echo Request packets every two seconds to check if destination 10.10.10.10 is reachable on the network and waits for 3 seconds for an Echo Reply packet before timing out.

          cumulus@switch:~$ ping -c 5 -i 2 -W 3 10.10.10.10
          

          The following example sends 50-byte Echo Request packets to check if destination 10.10.10.10 is reachable on the network.

          cumulus@switch:~$ ping -s 50 10.10.10.10
          

          The following example checks if destination 10.10.10.10 is reachable and sets the do not fragment bit for IPv4.

          cumulus@switch:~$ ping -M do 10.10.10.10
          

          The following example sends Echo Request packets to destination 10.10.10.10 from the source IP address 10.10.5.1 for the management VRF.

          cumulus@switch:~$ ping -I mgmt -I 10.10.5.1 10.10.10.10
          

          The following example sends Echo Request packets to destination 10.10.10.10 for the management VRF.

          cumulus@switch:~$ ping -I mgmt 10.10.10.10
          

          The following example sends Echo Request packets to destination fe80::a00:27ff:fe00:0 from source interface eth0.

          cumulus@switch:~$ ping -6 fe80::a00:27ff:fe00:0%eth0 
          

          traceroute

          The traceroute tool traces the path of data packets from a source to a destination across an IP network, showing the sequence of routers (hops) the data passes through.

          By measuring the round-trip time for each hop, traceroute helps identify latency problems, routing inefficiencies, or potential network failures, enabling you to troubleshoot and optimize network performance.

          You send traceroute packets to a destination with the nv action traceroute system <destination> command. The destination can be either an IP address or a domain name. You can specify the following options:

          Option Description
          max-ttl The maximum number of hops to reach the destination. You can specify a value between 1 and 30. The default is 30.
          initial-ttl The minimum number of hops to reach the destination. You can specify a value between 1 and 30. The default is 1. The minimum number of hops must be less than or equal to the maximum number of hops.
          wait The maximum number of seconds to wait for a response from each hop. You can specify a value between 0.1 and 10.
          vrf The VRF to use.
          source The source IP address from which the route originates.
          l3protocol The layer 3 protocol; ipv4 or ipv6. The default is ipv4.
          l4protocol The layer 4 protocol; icmp, tcp, or udp. The default is icmp.
          do-not-fragment Do not fragment. Trace the route to the destination without fragmentation.

          The following example validates the route path to IPv4 destination 10.10.10.10.

          cumulus@switch:~$ nv action traceroute system 10.10.10.10
          

          The following example validates the route path to IPv6 destination fe80::a00:27ff:fe00:0.

          cumulus@switch:~$ nv action traceroute system fe80::a00:27ff:fe00:0 l3protocol ipv6
          

          The following example validates the path to destination 10.10.10.10 with 5 minimum hops and 10 maximum hops.

          cumulus@switch:~$ nv action traceroute system 10.10.10.10 initial-ttl 5 max-ttl 10
          

          The following example sends UDP packets to validate the path to destination 10.10.10.10 and waits 2 seconds for a response.

          cumulus@switch:~$ nv action traceroute system 10.10.10.10 l4protocol udp wait 2
          

          The following example validates the path to destination 10.10.10.10 from the source IP address 10.10.5.1.

          cumulus@switch:~$ nv action traceroute system 10.10.10.10 source 10.10.5.1
          

          The following example validates the path to destination 10.10.10.10 from the source IP address 10.10.5.1 in VRF RED.

          cumulus@switch:~$ nv action traceroute system 10.10.10.10 source 10.10.5.1 vrf RED 
          

          You send traceroute packets to a destination with the traceroute command. The destination can be either an IP address or a domain name. You can specify the following options:

          Option Description
          -m The maximum number of hops to reach the destination. You can specify a value between 1 and 30. The default is 30.
          -s The source IP address from which the route originates.
          -f The minimum number of hops to reach the destination. You can specify a value between 1 and 30. The default is 1. The minimum number of hops must be less than or equal to the maximum number of hops.
          -w The maximum number of seconds to wait for a response from each hop. You can specify a value between 0.1 and 10.
          -i The VRF to use.
          <layer3-protocol> The layer 3 protocol; -4 for IPv4 or -6 for IPv6. The default is IPv4.
          <layer4-protocol> The layer 4 protocol packets to send; -I for ICMP, -T for TCP, or -U for UDP.
          -F Do not fragment. Trace the route to the destination without fragmentation.

          The following example validates the route path to IPv4 destination 10.10.10.10.

          cumulus@switch:~$ traceroute 10.10.10.10
          

          The following example validates the route path to IPv6 destination fe80::a00:27ff:fe00:0.

          cumulus@switch:~$ traceroute fe80::a00:27ff:fe00:0 -6
          

          The following example validates the path to destination 10.10.10.10 with 5 minimum hops and 10 maximum hops.

          cumulus@switch:~$ traceroute 10.10.10.10 -f 5 -m 10
          

          The following example sends UDP packets to validate the path to destination 10.10.10.10 and waits 2 seconds for a response.

          cumulus@switch:~$ traceroute 10.10.10.10 -U -w 2
          

          The following example validates the path to destination 10.10.10.10 from the source IP address 10.10.5.1.

          cumulus@switch:~$ traceroute 10.10.10.10 -s 10.10.5.1
          

          The following example validates the path to destination 10.10.10.10 from the source IP address 10.10.5.1 in VRF RED.

          cumulus@switch:~$ traceroute 10.10.10.10 -s 10.10.5.1 -i RED
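
          The option table requires the minimum number of hops to be less than or equal to the maximum, and both values to fall between 1 and 30. The following Python sketch enforces those constraints while it assembles a traceroute argument list from the flags in the table. The function name and parameter names are hypothetical.

```python
import shlex

# Layer 4 protocol flags from the option table above.
L4_FLAGS = {"icmp": "-I", "tcp": "-T", "udp": "-U"}


def build_traceroute(destination, first_ttl=None, max_ttl=None, wait=None,
                     source=None, vrf=None, ipv6=False, l4protocol=None,
                     do_not_fragment=False):
    """Return the argument list for a Linux traceroute command."""
    # Use the documented defaults (1 and 30) when checking the hop limits.
    first = 1 if first_ttl is None else first_ttl
    last = 30 if max_ttl is None else max_ttl
    if not 1 <= first <= last <= 30:
        raise ValueError("hop limits must satisfy 1 <= minimum <= maximum <= 30")
    if wait is not None and not 0.1 <= wait <= 10:
        raise ValueError("wait must be between 0.1 and 10")
    if l4protocol is not None and l4protocol not in L4_FLAGS:
        raise ValueError("l4protocol must be icmp, tcp, or udp")

    args = ["traceroute", destination]
    if ipv6:
        args.append("-6")
    if l4protocol is not None:
        args.append(L4_FLAGS[l4protocol])
    if first_ttl is not None:
        args += ["-f", str(first_ttl)]
    if max_ttl is not None:
        args += ["-m", str(max_ttl)]
    if wait is not None:
        args += ["-w", f"{wait:g}"]
    if source is not None:
        args += ["-s", source]
    if vrf is not None:
        args += ["-i", vrf]
    if do_not_fragment:
        args.append("-F")
    return args


print(shlex.join(build_traceroute("10.10.10.10", first_ttl=5, max_ttl=10)))
# traceroute 10.10.10.10 -f 5 -m 10
print(shlex.join(build_traceroute("10.10.10.10", l4protocol="udp", wait=2)))
# traceroute 10.10.10.10 -U -w 2
```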
          

          tcpdump

          You can use tcpdump to monitor control plane traffic (traffic sent to and coming from the switch CPUs). tcpdump does not monitor data plane traffic; use cl-acltool instead (see above).

          For more information on tcpdump, read the documentation and the man page.

          The following example incorporates tcpdump options:

          cumulus@switch:~$ sudo tcpdump -i bond0 host 169.254.0.2 -c 10
          tcpdump: WARNING: bond0: no IPv4 address assigned
          tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
          listening on bond0, link-type EN10MB (Ethernet), capture size 65535 bytes
          16:24:42.532473 IP 169.254.0.2 > 169.254.0.1: ICMP echo request, id 27785, seq 6, length 64
          16:24:42.532534 IP 169.254.0.1 > 169.254.0.2: ICMP echo reply, id 27785, seq 6, length 64
          16:24:42.804155 IP 169.254.0.2.40210 > 169.254.0.1.5342: Flags [.], seq 266275591:266277039, ack 3813627681, win 58, options [nop,nop,TS val 590400681 ecr 530346691], length 1448
          16:24:42.804228 IP 169.254.0.1.5342 > 169.254.0.2.40210: Flags [.], ack 1448, win 166, options [nop,nop,TS val 530348721 ecr 590400681], length 0
          16:24:42.804267 IP 169.254.0.2.40210 > 169.254.0.1.5342: Flags [P.], seq 1448:1836, ack 1, win 58, options [nop,nop,TS val 590400681 ecr 530346691], length 388
          16:24:42.804293 IP 169.254.0.1.5342 > 169.254.0.2.40210: Flags [.], ack 1836, win 165, options [nop,nop,TS val 530348721 ecr 590400681], length 0
          16:24:43.532389 IP 169.254.0.2 > 169.254.0.1: ICMP echo request, id 27785, seq 7, length 64
          16:24:43.532447 IP 169.254.0.1 > 169.254.0.2: ICMP echo reply, id 27785, seq 7, length 64
          16:24:43.838652 IP 169.254.0.1.59951 > 169.254.0.2.5342: Flags [.], seq 2555144343:2555145791, ack 2067274882, win 58, options [nop,nop,TS val 530349755 ecr 590399688], length 1448
          16:24:43.838692 IP 169.254.0.1.59951 > 169.254.0.2.5342: Flags [P.], seq 1448:1838, ack 1, win 58, options [nop,nop,TS val 530349755 ecr 590399688], length 390
          10 packets captured
          12 packets received by filter
          0 packets dropped by kernel
          

          Run Commands in a Non-default VRF

          You can use ip vrf exec to run commands in a non-default VRF context, which is useful for network utilities like ping, traceroute, and nslookup.

          The full syntax is ip vrf exec <vrf-name> <command> <arguments>. For example:

          cumulus@switch:~$ sudo ip vrf exec Tenant1 nslookup google.com - 8.8.8.8
          

          By default, ping, ping6, traceroute, and traceroute6 use the default VRF. These commands rely on a mechanism that checks the VRF context of the current shell, which you can see when you run ip vrf id. If the VRF context of the shell is mgmt, these commands run in the default VRF context.

          ping and traceroute have additional arguments that you can use to specify an egress interface or a source address. In the default VRF, the source interface flag (ping -I or traceroute -i) specifies the egress interface for the ping or traceroute operation. However, you can use the source interface flag instead to specify a non-default VRF to use for the command. Doing so causes the routing lookup for the destination address to occur in that VRF.

          With ping -I, you can specify the source interface or the source IP address but you cannot use the flag more than once. Either choose an egress interface/VRF or a source IP address. For traceroute, you can use traceroute -s to specify the source IP address.

          You gain additional flexibility if you run ip vrf exec in combination with ping/ping6 or traceroute/traceroute6, as the VRF context is outside of the ping and traceroute commands. This allows for the most granular control of ping and traceroute, as you can specify both the VRF and the source interface flag.

          For ping, use the following syntax:

          ip vrf exec <vrf-name> [ping|ping6] -I [<egress_interface> | <source_ip>] <destination_ip>
          

          For example:

          cumulus@switch:~$ sudo ip vrf exec Tenant1 ping -I swp1 8.8.8.8
          cumulus@switch:~$ sudo ip vrf exec Tenant1 ping -I 192.0.1.1 8.8.8.8
          cumulus@switch:~$ sudo ip vrf exec Tenant1 ping6 -I swp1 2001:4860:4860::8888
          cumulus@switch:~$ sudo ip vrf exec Tenant1 ping6 -I 2001:db8::1 2001:4860:4860::8888
          

          For traceroute, use the following syntax:

          ip vrf exec <vrf-name> [traceroute|traceroute6] -i <egress_interface> -s <source_ip> <destination_ip>
          

          For example:

          cumulus@switch:~$ sudo ip vrf exec Tenant1 traceroute -i swp1 -s 192.0.1.1 8.8.8.8
          cumulus@switch:~$ sudo ip vrf exec Tenant1 traceroute6 -i swp1 -s 2001:db8::1 2001:4860:4860::8888
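
          The two syntax lines above can be sketched as small Python helpers. The function names vrf_exec_ping and vrf_exec_traceroute are hypothetical; the helpers only build argument lists and do not run anything on the switch.

```python
def vrf_exec_ping(vrf, destination, source=None, ipv6=False):
    """Build: ip vrf exec <vrf> ping|ping6 -I <interface-or-ip> <destination>.

    'source' is either an egress interface or a source IP address. ping -I
    accepts only one of the two, so a single parameter models that limit.
    """
    cmd = ["ip", "vrf", "exec", vrf, "ping6" if ipv6 else "ping"]
    if source is not None:
        cmd += ["-I", source]
    return cmd + [destination]


def vrf_exec_traceroute(vrf, destination, interface=None, source_ip=None, ipv6=False):
    """Build: ip vrf exec <vrf> traceroute|traceroute6 -i <interface> -s <source_ip> <destination>."""
    cmd = ["ip", "vrf", "exec", vrf, "traceroute6" if ipv6 else "traceroute"]
    if interface is not None:
        cmd += ["-i", interface]
    if source_ip is not None:
        cmd += ["-s", source_ip]
    return cmd + [destination]


print(" ".join(vrf_exec_ping("Tenant1", "8.8.8.8", source="swp1")))
print(" ".join(vrf_exec_traceroute("Tenant1", "8.8.8.8", "swp1", "192.0.1.1")))
```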
          

          The VRF context for ping and traceroute commands moves automatically to the default VRF context; therefore, you must use the source interface flag to specify the management VRF. Typically, there is only a single interface in the management VRF (eth0) and only a single IPv4 address or IPv6 global unicast address assigned to it. You cannot specify both a source interface and a source IP address with ping -I.

          What Just Happened (WJH)

          What Just Happened (WJH) provides real time visibility into network problems and has two components:

          Configure WJH

          You can choose which packet drops you want to monitor by creating channels and setting the packet drop categories (layer 1, layer 2, layer 3, tunnel, buffer, and ACL) for each channel.

          NVUE does not provide commands to set the buffer and ACL packet drop categories. You must edit the /etc/what-just-happened/what-just-happened.json file. See the Linux Commands tab.

          The following example configures two separate channels:

          • The forwarding channel monitors layer 2, layer 3, and tunnel packet drops.
          • The layer-1 channel monitors layer 1 packet drops.
          cumulus@switch:~$ nv set system wjh channel forwarding trigger l2
          cumulus@switch:~$ nv set system wjh channel forwarding trigger l3
          cumulus@switch:~$ nv set system wjh channel forwarding trigger tunnel
          cumulus@switch:~$ nv set system wjh channel layer-1 trigger l1
          cumulus@switch:~$ nv config apply
          

          You can stop monitoring specific packet drops by unsetting a category in the channel list. The following command example stops monitoring layer 2 packet drops that are in the forwarding channel:

          cumulus@switch:~$ nv unset system wjh channel forwarding trigger l2
          cumulus@switch:~$ nv config apply
          

          To remove a channel, run the nv unset system wjh channel <channel> command. The following command example removes the layer-1 channel:

          cumulus@switch:~$ nv unset system wjh channel layer-1 
          cumulus@switch:~$ nv config apply
          

          Edit the /etc/what-just-happened/what-just-happened.json file:

          • For each drop category you want to monitor, include the drop category value inside the square brackets ([]).
          • For each drop category you do not want to monitor, remove the drop category value from inside the square brackets.

          After you edit the file, you must restart the WJH service with the sudo systemctl restart what-just-happened command.

          The following example configures a channel to monitor layer 2, layer 3, and tunnel packet drops and a channel to monitor layer 1 packet drops.

          cumulus@switch:~$ sudo nano /etc/what-just-happened/what-just-happened.json
          {
              "what-just-happened": {
                  "channels": {
                      "forwarding": {
                          "drop_category_list": [
                              "l2",
                              "l3",
                              "tunnel"
                          ]
                      },
                      "layer-1": {
                          "drop_category_list": [
                              "l1"
                          ]
                      }
                  }
              }
          }
          
          cumulus@switch:~$ sudo systemctl restart what-just-happened
          

          The following example configures a channel to monitor buffer packet drops and a channel to monitor ACL packet drops.

          cumulus@switch:~$ sudo nano /etc/what-just-happened/what-just-happened.json
          {
              "what-just-happened": {
                  "channels": {
                      "buffer": {
                          "drop_category_list": ["buffer"]
                      },
                      "acl": {
                          "drop_category_list": ["acl"]
                      }
                  }
              }
          }
          
          cumulus@switch:~$ sudo systemctl restart what-just-happened
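
          If you generate the file from a script instead of editing it by hand, the structure shown above can be produced as follows. This is a minimal Python sketch, not an NVIDIA tool: the function name build_wjh_config is hypothetical, and the list of valid categories is taken from the category names used in this section.

```python
import json

# Drop category values used in this section.
VALID_CATEGORIES = {"l1", "l2", "l3", "tunnel", "buffer", "acl"}


def build_wjh_config(channels):
    """Return JSON text for a mapping of channel name -> list of drop categories."""
    for name, categories in channels.items():
        unknown = set(categories) - VALID_CATEGORIES
        if unknown:
            raise ValueError(f"channel {name!r}: unknown categories {sorted(unknown)}")
    body = {
        "what-just-happened": {
            "channels": {
                name: {"drop_category_list": list(categories)}
                for name, categories in channels.items()
            }
        }
    }
    return json.dumps(body, indent=4)


print(build_wjh_config({"forwarding": ["l2", "l3", "tunnel"], "layer-1": ["l1"]}))
```

          Writing the result to /etc/what-just-happened/what-just-happened.json still requires root privileges and a restart of the WJH service, as described above.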
          

          Show Information about Dropped Packets

          You can run the following commands to show information about dropped packets and diagnose problems.

          To show information about packet drops for all the channels you configure, run the nv show system wjh packet-buffer command. The command output includes the reason for the drop and the recommended action to take.

          You can also show the WJH configuration on the switch:

          • To show the configuration for a channel, run the nv show system wjh channel <channel> command. For example, nv show system wjh channel forwarding.
          • To show the configuration for packet drop categories in a channel, run the nv show system wjh channel <channel> trigger command. For example, nv show system wjh channel forwarding trigger.

          The following example shows information about layer 1 packet drops:

          cumulus@switch:~$ nv show system wjh packet-buffer
          #   dMAC  dPort  Dst IP:Port  EthType  Drop group  IP Proto  Drop reason - Recommended action                         Severity  sMAC  sPort    Src IP:Port  Timestamp              VLAN
          --  ----  -----  -----------  -------  ----------  --------  -------------------------------------------------------  --------  ----  -------  -----------  ---------------------  ----
          1   N/A   N/A    N/A          N/A      L1          N/A       Generic L1 event - Check layer 1 aggregated information  Warn      N/A   swp17    N/A          22/11/03 01:00:35.458  N/A
          2   N/A   N/A    N/A          N/A      L1          N/A       Generic L1 event - Check layer 1 aggregated information  Warn      N/A   swp18    N/A          22/11/03 01:00:35.458  N/A
          3   N/A   N/A    N/A          N/A      L1          N/A       Generic L1 event - Check layer 1 aggregated information  Warn      N/A   swp19    N/A          22/11/03 01:00:35.458  N/A
          4   N/A   N/A    N/A          N/A      L1          N/A       Generic L1 event - Check layer 1 aggregated information  Warn      N/A   swp20    N/A          22/11/03 01:00:35.458  N/A
          

          You can run the following commands from the command line.

          Command Description
          what-just-happened poll Shows information about packet drops for all the channels you configure. The output includes the reason for the drop and the recommended action to take. The what-just-happened poll <channel> command shows information for the channel you specify.
          what-just-happened poll --aggregate Shows information about dropped packets aggregated by the reason for the drop. This command also shows the number of times the dropped packet occurs. The what-just-happened poll <channel> --aggregate command shows information for the channel you specify.
          what-just-happened poll --export Saves information about dropped packets to a file in PCAP format. The what-just-happened poll <channel> --export command saves information for the channel you specify.
          what-just-happened poll --export --no_metadata Saves information about dropped packets to a file in PCAP format without metadata. The what-just-happened poll <channel> --export --no_metadata command saves information for the channel you specify.
          what-just-happened dump Displays all diagnostic information on the command line.

          Run the what-just-happened -h command to see all the WJH command options.

          To show all dropped packets and the reason for the drop, run the NVUE nv show system wjh packet-buffer command or the what-just-happened poll command.

          The following example shows that packets drop 100 times on each of two ports (swp4 and swp1) because the source MAC address equals the destination MAC address:

          cumulus@switch:~$ what-just-happened poll --aggregate
          Sample Window : 2021/06/16 12:57:23.046 - 2021/06/16 14:46:17.701
          
          #  sPort  VLAN  sMAC               dMAC               EthType  Src IP:Port  Dst IP:Port  IP Proto  Count  Severity  Drop reason - Recommended action
          -- ------ ----- ------------------ ------------------ -------- ------------ ------------ --------- ------ --------- -----------------------------------------------
          1  swp4   N/A   44:38:39:00:a4:87  44:38:39:00:a4:87  IPv4     0.0.0.0:0    0.0.0.0:0    ip        100    Error     Source MAC equals destination MAC - Bad packet was received from peer
          2  swp1   N/A   44:38:39:00:a4:80  44:38:39:00:a4:80  IPv4     0.0.0.0:0    0.0.0.0:0    ip        100    Error     Source MAC equals destination MAC - Bad packet was received from peer
          

          The following command saves dropped packets to a file in PCAP format:

          cumulus@switch:~$ what-just-happened poll --export --no_metadata
          PCAP file path : /var/log/mellanox/wjh/wjh_user_2021_06_16_12_03_15.pcap
          
          #    Timestamp              sPort  dPort  VLAN  sMAC               dMAC               EthType  Src IP:Port  Dst IP:Port  IP Proto  Drop   Severity  Drop reason - Recommended action
                                                                                                                                             Group
          ---- ---------------------- ------ ------ ----- ------------------ ------------------ -------- ------------ ------------ --------- ------ --------- -----------------------------------------------
          1    21/06/16 12:03:12.728  swp1   N/A    N/A   44:38:39:00:a4:84  44:38:39:00:a4:84  IPv4     N/A          N/A          N/A       L2     Error     Source MAC equals destination MAC - Bad packet was received from peer
          2    21/06/16 12:03:12.728  swp1   N/A    N/A   44:38:39:00:a4:84  44:38:39:00:a4:84  IPv4     N/A          N/A          N/A       L2     Error     Source MAC equals destination MAC - Bad packet was received from peer
          3    21/06/16 12:03:12.745  swp1   N/A    N/A   44:38:39:00:a4:84  44:38:39:00:a4:84  IPv4     N/A          N/A          N/A       L2     Error     Source MAC equals destination MAC - Bad packet was received from peer
          4    21/06/16 12:03:12.745  swp1   N/A    N/A   44:38:39:00:a4:84  44:38:39:00:a4:84  IPv4     N/A          N/A          N/A       L2     Error     Source MAC equals destination MAC - Bad packet was received from peer
          

          Considerations

          Buffer Packet Drop Monitoring

          Cumulus Linux and Docker

          WJH runs in a Docker container. By default, when Docker starts, it creates a bridge called docker0. However, for compatibility reasons, Cumulus Linux disables the docker0 bridge in the /etc/docker/daemon.json file with the attribute "bridge": "none".

          WJH and the NVIDIA NetQ Agent

          When you enable the NVIDIA NetQ agent on the switch, the WJH service stops and does not run. If you disable the NVIDIA NetQ service and want to use WJH, run the following commands to enable and start the WJH service:

          cumulus@switch:~$ nv set system wjh enable on
          cumulus@switch:~$ nv config apply
          
          cumulus@switch:~$ sudo systemctl enable what-just-happened
          cumulus@switch:~$ sudo systemctl start what-just-happened
          

          Monitoring System Statistics and Network Traffic with sFlow

          sFlow is a monitoring protocol that samples network packets, application operations, and system counters. sFlow collects both interface counters and sampled 5-tuple packet information so that you can monitor your network traffic as well as your switch state and performance metrics. To collect and analyze this data, you need an outside server: an sFlow collector.

          If you intend to run this service within a VRF, including the management VRF, follow these steps to configure the service.

          Configure sFlow

          To configure sFlow:

          Cumulus Linux provides different sampling rate configurations. The value represents the sampling ratio; for example, if you specify a value of 400, sFlow samples one in every 400 packets.

          Sampling Rate Default Value Description
          speed-100m 100 The sampling rate on a 100Mbps port.
          speed-1g 1000 The sampling rate on a 1Gbps port.
          speed-10g 10000 The sampling rate on a 10Gbps port.
          speed-40g 40000 The sampling rate on a 40Gbps port.
          speed-50g 50000 The sampling rate on a 50Gbps port.
          speed-100g 100000 The sampling rate on a 100Gbps port.
          speed-200g 200000 The sampling rate on a 200Gbps port.
          speed-400g 400000 The sampling rate on a 400Gbps port.
          speed-800g 800000 The sampling rate on an 800Gbps port.
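
          To make the sampling ratio concrete, the following Python sketch estimates how many samples per second a port produces at a given packet rate. The function name and the packet rates are illustrative assumptions, not measured values.

```python
def expected_samples_per_second(packets_per_second, sampling_rate):
    """With a sampling rate of N, sFlow samples one in every N packets on average."""
    if sampling_rate <= 0:
        raise ValueError("sampling_rate must be a positive integer")
    return packets_per_second / sampling_rate


# Hypothetical load: 2,000,000 packets per second on a 10G port (default rate 10000).
print(expected_samples_per_second(2_000_000, 10000))  # 200.0
# The same hypothetical load with a sampling rate of 400.
print(expected_samples_per_second(2_000_000, 400))    # 5000.0
```

          A lower sampling rate value means more samples for the same traffic, so the value trades visibility against the load on the switch and the collector.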

          Some collectors require each source to transmit on a different port, others listen on only one port. Refer to the documentation for your collector for more information.

          Configure Designated Collectors

          Specify the IP address, UDP port number, and interface for the designated collectors. The port number and interface are optional. If you do not specify a port number, Cumulus Linux uses the default port 6343.

          The following example configures sFlow to send data to collector 192.0.2.100 on port 6343 and collector 192.0.2.200 on eth0:

          cumulus@switch:~$ nv set system sflow collector 192.0.2.100 port 6343
          cumulus@switch:~$ nv set system sflow collector 192.0.2.200 interface eth0
          cumulus@switch:~$ nv config apply
          

          Configure the sFlow sampling rate in number of packets if you do not want to use the default rate, and the polling interval in seconds.

          The following example polls the counters every 20 seconds and samples one in every 40000 packets for 40G interfaces:

          cumulus@switch:~$ nv set system sflow sampling-rate speed-40g 40000
          cumulus@switch:~$ nv set system sflow poll-interval 20
          cumulus@switch:~$ nv config apply
          

          Edit the /etc/hsflowd.conf file to set up the collectors, sampling rates, and polling interval in seconds, then restart the hsflowd service with the sudo systemctl start hsflowd command.

          The following example polls the counters every 20 seconds, samples 1 of every 40000 packets for 40G interfaces, and sends this information to a collector at 192.0.2.100 on port 6343 and to another collector at 192.0.2.200 on interface eth0.

          cumulus@switch:~$ sudo nano /etc/hsflowd.conf
          sflow {
          # ====== Sampling/Polling/Collectors ======
            # EITHER: automatic (DNS SRV+TXT from _sflow._udp):
            #   DNS-SD { }
            # OR: manual:
            #   Counter Polling:
                  polling = 20
            #   default sampling N:
            #     sampling = 400
            #   sampling N on interfaces with ifSpeed:
                  sampling.100M = 100
                  sampling.1G = 1000
                  sampling.10G = 10000
                  sampling.40G = 40000
            #   sampling N for apache, nginx:
            #     sampling.http = 50
            #     sampling N for application (requires json):
            #     sampling.app.myapp = 100
            #   collectors:
            collector { ip=192.0.2.100 udpport=6343 }
            collector { ip=192.0.2.200 interface=eth0 }
          }
          
          cumulus@switch:~$ sudo systemctl start hsflowd
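
          The active (uncommented) settings in the file above can also be generated from a script. The following Python sketch is an illustration under stated assumptions: the function name render_hsflowd_conf is hypothetical, and it emits only the polling, sampling.<speed>, and collector settings that appear in the example.

```python
def render_hsflowd_conf(polling, sampling_by_speed, collectors):
    """Render a minimal sflow { } block for /etc/hsflowd.conf.

    sampling_by_speed maps a speed suffix such as "40G" to a sampling rate.
    collectors is a list of dicts with "ip" and optional "udpport"/"interface".
    """
    lines = ["sflow {", f"  polling = {polling}"]
    for speed, rate in sampling_by_speed.items():
        lines.append(f"  sampling.{speed} = {rate}")
    for collector in collectors:
        fields = [f"ip={collector['ip']}"]
        if "udpport" in collector:
            fields.append(f"udpport={collector['udpport']}")
        if "interface" in collector:
            fields.append(f"interface={collector['interface']}")
        lines.append("  collector { " + " ".join(fields) + " }")
    lines.append("}")
    return "\n".join(lines) + "\n"


print(render_hsflowd_conf(
    polling=20,
    sampling_by_speed={"100M": 100, "1G": 1000, "10G": 10000, "40G": 40000},
    collectors=[
        {"ip": "192.0.2.100", "udpport": 6343},
        {"ip": "192.0.2.200", "interface": "eth0"},
    ],
))
```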
          

          Configure the sFlow Agent

          Provide the IP address or prefix, or the interface for the sFlow agent.

          The following example configures the sFlow agent prefix to 10.0.0.0/8:

          cumulus@switch:~$ nv set system sflow agent ip 10.0.0.0/8
          cumulus@switch:~$ nv config apply
          

          The following example configures the sFlow agent interface to eth0:

          cumulus@switch:~$ nv set system sflow agent interface eth0
          cumulus@switch:~$ nv config apply
          

          To provide the IP address or prefix for the sFlow agent, edit the /etc/hsflowd.conf file to set the agent.CIDR parameter, then restart the hsflowd service with the sudo systemctl start hsflowd command.

          cumulus@switch:~$ sudo nano /etc/hsflowd.conf
          ...
          sflow { 
            agent.CIDR = 10.0.0.0/8 
          } 
          
          cumulus@switch:~$ sudo systemctl start hsflowd
          

          To provide an interface for the sFlow agent, edit the /etc/hsflowd.conf file to set the agent parameter, then restart the hsflowd service with the sudo systemctl start hsflowd command:

          cumulus@switch:~$ sudo nano /etc/hsflowd.conf
          ...
          sflow { 
            agent = eth0 
          } 
          
          cumulus@switch:~$ sudo systemctl start hsflowd
          

          Configure sFlow Policer Rate and Burst Size

          You can limit the number of sFlow samples per second and the sample burst size per second that the switch sends.

          The default value for both the number of sFlow samples per second and the sample burst size is 16384. You can specify a value between 0 and 16384 for each setting.

          The following example sets the number of sFlow samples to 8000 and the sample burst size to 9000:

          cumulus@switch:~$ nv set system sflow policer rate 8000
          cumulus@switch:~$ nv set system sflow policer burst 9000
          cumulus@switch:~$ nv config apply
          

          Edit the /etc/cumulus/datapath/traffic.conf file to change the sflow.rate and sflow.burst parameters, then reload switchd with the sudo systemctl reload switchd.service command.

          cumulus@switch:~$ sudo nano /etc/cumulus/datapath/traffic.conf
          # Set sflow/sample ingress cpu packet rate and burst in packets/sec 
          # Values: {0..16384} 
          sflow.rate = 8000
          sflow.burst = 9000 
          
          cumulus@switch:~$ sudo systemctl reload switchd.service 
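
          The following Python sketch checks policer values against the 0 to 16384 range stated above before producing the NVUE commands. The function name policer_commands is hypothetical; the command strings are the ones shown in this section.

```python
POLICER_MIN, POLICER_MAX = 0, 16384  # range documented for rate and burst


def policer_commands(rate, burst):
    """Validate rate and burst, then return the NVUE commands as strings."""
    for name, value in (("rate", rate), ("burst", burst)):
        if not POLICER_MIN <= value <= POLICER_MAX:
            raise ValueError(f"{name} must be between {POLICER_MIN} and {POLICER_MAX}")
    return [
        f"nv set system sflow policer rate {rate}",
        f"nv set system sflow policer burst {burst}",
        "nv config apply",
    ]


for command in policer_commands(8000, 9000):
    print(command)
```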
          

          Enable sFlow

          To enable sFlow:

          cumulus@switch:~$ nv set system sflow state enabled 
          cumulus@switch:~$ nv config apply
          

          To disable sFlow, run the nv set system sflow state disabled command.

          By default, the hsflowd service is disabled and does not start automatically when the switch boots up.

          To enable and start the hsflowd service:

          cumulus@switch:~$ sudo systemctl enable hsflowd
          cumulus@switch:~$ sudo systemctl start hsflowd
          

          To disable the hsflowd service:

          cumulus@switch:~$ sudo systemctl stop hsflowd
          cumulus@switch:~$ sudo systemctl disable hsflowd
          

          Interface Configuration

          By default, sFlow is enabled on interfaces that are operationally UP. To disable sFlow on an interface:

          cumulus@switch:~$ nv set interface swp1 sflow state disabled 
          cumulus@switch:~$ nv config apply
          

          To enable sFlow on an interface, run the nv set interface <interface> sflow state enabled command.

          By default, sFlow is enabled on interfaces that are operationally UP. To disable sFlow on a specific interface, edit the /etc/cumulus/switchd.conf file and set the interface.<interface>.sflow.enable parameter to FALSE:

          cumulus@switch:~$ sudo nano /etc/cumulus/switchd.conf
          interface.swp1.sflow.enable = FALSE 
          

          To enable sFlow on an interface, set the interface.<interface>.sflow.enable parameter to TRUE.

          To configure the sFlow sample rate on an interface:

          cumulus@switch:~$ nv set interface swp1 sflow sample-rate 100000
          cumulus@switch:~$ nv config apply
          

          Edit the /etc/cumulus/switchd.conf file and set the interface.<interface-id>.sflow.sample_rate.ingress parameter:

          cumulus@switch:~$ sudo nano /etc/cumulus/switchd.conf
          interface.swp1.sflow.sample_rate.ingress = 100000 
          

          Monitor Dropped Packets

          You can configure sFlow to monitor dropped packets in hardware.

          cumulus@switch:~$ nv set system sflow dropmon hw 
          cumulus@switch:~$ nv config apply
          

          Edit the /etc/hsflowd.conf file to change start to on in the dropmon { group=1 start=off limit=1000 } line.

          cumulus@switch:~$ sudo nano /etc/hsflowd.conf
          dropmon { group=1 start=on limit=1000 }
          

          Restart the hsflowd service with the sudo systemctl start hsflowd command.

          Configure sFlow Visualization Tools

          For information on configuring various sFlow visualization tools, read this knowledge base article.

          Show sFlow Configuration

          To show all sFlow configuration on the switch:

          cumulus@switch:~$ nv show system sflow
                          operational  applied    
          -------------  -----------  -----------
          poll-interval               20         
          state                       enabled    
          [collector]                 192.0.2.100
          [collector]                 192.0.2.200
          sampling-rate                          
            default                   400        
            speed-100m                100        
            speed-1g                  1000       
            speed-10g                 10000      
            speed-25g                 25000      
            speed-40g                 40000      
            speed-50g                 50000      
            speed-100g                100000     
            speed-200g                200000     
            speed-400g                400000     
            speed-800g                800000     
          agent                                  
            ip                        10.0.0.0/8 
            interface                 eth0       
          policer                                
            rate                      8000       
            burst                     9000       
          [dropmon]                   sw
          

          To show sFlow collector configuration:

          cumulus@switch:~$ nv show system sflow collector
          Ip                    Port 
          --------------------------------- 
          192.0.2.100           6343 
          192.0.2.200           6344
          

          To show the sFlow sampling rate configuration:

          cumulus@switch:~$ nv show system sflow sampling-rate
                      applied
          ----------  -------
          default     400    
          speed-100m  100    
          speed-1g    1000   
          speed-10g   10000  
          speed-25g   25000  
          speed-40g   40000  
          speed-50g   50000  
          speed-100g  100000 
          speed-200g  200000 
          speed-400g  400000 
          speed-800g  800000 
          

          To show sFlow agent configuration:

          cumulus@switch:~$ nv show system sflow agent
                     operational  applied   
          ---------  -----------  ----------
          ip                      10.0.0.0/8
          interface               eth0
          

          To show the number of samples per second and the sample burst size per second that the switch sends out:

          cumulus@switch:~$ nv show system sflow policer
          ---------------------- 
                 applied
          -----  -------
          rate   8000   
          burst  9000
          

          To show sFlow configuration on a specific interface:

          cumulus@switch:~$ nv show interface swp1 sflow
          ---------------------- 
                       operational  applied
          -----------  -----------  -------
          sample-rate  0            100000 
          state        disabled     enabled
          

          Considerations

          Cumulus Linux does not support sFlow egress sampling.

          SPAN and ERSPAN

          Cumulus Linux supports both SPAN and ERSPAN.

          SPAN

          To configure SPAN to mirror ports on your switch, you create a port mirror session. The session ID is a number between 0 and 7.

          You set the following SPAN options:

          Run the nv set system port-mirror session <session-id> span <option> command. The NVUE commands save the configuration in the /etc/cumulus/switchd.d/port-mirror.conf file.

          To reduce the volume of data, you can truncate the mirrored frames at a specified number of bytes. The size must be between 4 and 4088 bytes and a multiple of 4.

          Example Commands

          To mirror all packets received on swp1, and copy and transmit the packets to swp2 for monitoring:

          cumulus@switch:~$ nv set system port-mirror session 1 span direction ingress
          cumulus@switch:~$ nv set system port-mirror session 1 span source-port swp1
          cumulus@switch:~$ nv set system port-mirror session 1 span destination swp2
          cumulus@switch:~$ nv config apply
          

          To mirror all packets that go out of swp1, and copy and transmit the packets to swp2 for monitoring:

          cumulus@switch:~$ nv set system port-mirror session 1 span direction egress
          cumulus@switch:~$ nv set system port-mirror session 1 span source-port swp1
          cumulus@switch:~$ nv set system port-mirror session 1 span destination swp2
          cumulus@switch:~$ nv config apply
          

          SPAN sessions that reference an outgoing interface create the mirrored packets according to the ingress interface, before the routing decision. For example, the above commands capture traffic that is ultimately destined to leave swp1 but mirror the packets when they arrive on swp2. The mirrored packets carry the original VLAN tag and the source and destination MAC addresses that they had when swp2 originally received them.

          To mirror packets from all ports to swp53:

          cumulus@switch:~$ nv set system port-mirror session 1 span direction ingress
          cumulus@switch:~$ nv set system port-mirror session 1 span source-port swp1-54
          cumulus@switch:~$ nv set system port-mirror session 1 span destination swp53
          cumulus@switch:~$ nv config apply
          

          To mirror all packets received on bond1, and copy and transmit the packets to swp53 for monitoring:

          cumulus@switch:~$ nv set system port-mirror session 1 span direction ingress
          cumulus@switch:~$ nv set system port-mirror session 1 span source-port bond1
          cumulus@switch:~$ nv set system port-mirror session 1 span destination swp53
          cumulus@switch:~$ nv config apply
          

          To truncate the mirrored frames at 40 bytes:

          cumulus@switch:~$ nv set system port-mirror session 1 span truncate size 40
          cumulus@switch:~$ nv config apply
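
          The constraints described above (a session ID between 0 and 7, and a truncate size between 4 and 4088 bytes that is a multiple of 4) can be checked before you apply a configuration. The following Python sketch is illustrative; the function name span_commands is hypothetical, and it only returns the NVUE command strings used in the examples above.

```python
def span_commands(session_id, direction, source_port, destination, truncate_size=None):
    """Validate SPAN settings and return the matching NVUE commands."""
    if not 0 <= session_id <= 7:
        raise ValueError("session_id must be between 0 and 7")
    if direction not in ("ingress", "egress"):
        raise ValueError("direction must be 'ingress' or 'egress'")
    if truncate_size is not None:
        if not 4 <= truncate_size <= 4088 or truncate_size % 4 != 0:
            raise ValueError("truncate_size must be a multiple of 4 between 4 and 4088")

    base = f"nv set system port-mirror session {session_id} span"
    commands = [
        f"{base} direction {direction}",
        f"{base} source-port {source_port}",
        f"{base} destination {destination}",
    ]
    if truncate_size is not None:
        commands.append(f"{base} truncate size {truncate_size}")
    commands.append("nv config apply")
    return commands


for command in span_commands(1, "ingress", "swp1", "swp2", truncate_size=40):
    print(command)
```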
          

          Delete SPAN Sessions

          You can delete all SPAN sessions with the nv unset system port-mirror command. For example:

          cumulus@switch:~$ nv unset system port-mirror
          cumulus@switch:~$ nv config apply
          

          To delete a specific SPAN session, run the nv unset system port-mirror session <session-id> command. For example:

          cumulus@switch:~$ nv unset system port-mirror session 1
          cumulus@switch:~$ nv config apply
          

          Edit the /etc/cumulus/switchd.d/port-mirror.conf file, then load the configuration.

          The following example configuration mirrors all packets received on swp1, and copies and transmits the packets to swp2 for monitoring:

          cumulus@switch:~$ sudo nano /etc/cumulus/switchd.d/port-mirror.conf
          # Copyright © 2021 NVIDIA CORPORATION & AFFILIATES. ALL RIGHTS RESERVED.
          #
          # This software product is a proprietary product of Nvidia Corporation and its affiliates
          # (the "Company") and all right, title, and interest in and to the software
          # product, including all associated intellectual property rights, are and
          # shall remain exclusively with the Company.
          #
          # This software product is governed by the End User License Agreement
          # provided with the software product.
          #
          # [session_n]
          # session-id = n
          # mirror.session.n.direction = (ingress | egress)
          # mirror.session.n.src = <swpx, bond>
          # mirror.session.n.dest = (swpx | <src-ip> <dst-ip>)
          # mirror.session.n.type = (span | erspan | none)
          #
          # Default is all sessions off
          # mirror.session.all.type = none
          [session_1]
          session-id = 1
          mirror.session.1.direction = ingress
          mirror.session.1.src = swp1
          mirror.session.1.dest = swp2
          mirror.session.1.type = span
          

          Run the following command to load the configuration:

          cumulus@switch:~$ /usr/lib/cumulus/switchdctl --load /etc/cumulus/switchd.d/port-mirror.conf -prefix mirror
          

          SPAN sessions that reference an outgoing interface create the mirrored packets according to the ingress interface before the routing decision. For example, the following rule captures traffic that is ultimately destined to leave swp1 but mirrors the packets when they arrive on swp49. The mirrored packets carry the original VLAN tag and the source and destination MAC addresses that they had when swp49 originally received them.

          [session_1]
          session-id = 1
          mirror.session.1.direction = egress
          mirror.session.1.src = swp1
          mirror.session.1.dest = swp49
          mirror.session.1.type = span
          
          cumulus@switch:~$ /usr/lib/cumulus/switchdctl --load /etc/cumulus/switchd.d/port-mirror.conf -prefix mirror
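
The session stanza follows a fixed layout, so it can be generated rather than typed by hand. The following Python sketch renders one stanza in the format shown in the comment template of port-mirror.conf. The function name render_mirror_session and its parameters are hypothetical and used only for illustration; the sketch produces text and does not write the file or load it into switchd.

```python
def render_mirror_session(session_id, direction, src, dest, kind):
    """Render one [session_n] stanza in the layout shown in port-mirror.conf."""
    if direction not in ("ingress", "egress"):
        raise ValueError(f"direction must be ingress or egress, got {direction!r}")
    if kind not in ("span", "erspan", "none"):
        raise ValueError(f"type must be span, erspan, or none, got {kind!r}")
    prefix = f"mirror.session.{session_id}"
    lines = [
        f"[session_{session_id}]",
        f"session-id = {session_id}",
        f"{prefix}.direction = {direction}",
        f"{prefix}.src = {src}",
        f"{prefix}.dest = {dest}",
        f"{prefix}.type = {kind}",
    ]
    return "\n".join(lines) + "\n"


# Reproduce the egress example: mirror traffic leaving swp1 to swp49.
print(render_mirror_session(1, "egress", "swp1", "swp49", "span"), end="")
```

Rejecting unknown direction and type values before rendering keeps a typo from reaching the configuration file.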
          

          Selective SPAN with ACLs

          You can configure selective SPAN with ACLs to mirror a subset of traffic according to:

          To match swp1 ingress traffic that has the source IP address 10.10.1.1 and mirror the traffic to swp2 when a match occurs:

          cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
          cumulus@switch:~$ nv set acl EXAMPLE1 rule 1 match ip source-ip 10.10.1.1
          cumulus@switch:~$ nv set acl EXAMPLE1 rule 1 action span swp2
          cumulus@switch:~$ nv set interface swp1 acl EXAMPLE1 inbound
          cumulus@switch:~$ nv config apply
          

          To match OSPF packets coming in on swp1 and mirror the traffic to swp2 when a match occurs:

          cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
          cumulus@switch:~$ nv set acl EXAMPLE1 rule 1 match ip protocol ospf
          cumulus@switch:~$ nv set acl EXAMPLE1 rule 1 action span swp2
          cumulus@switch:~$ nv set interface swp1 acl EXAMPLE1 inbound
          cumulus@switch:~$ nv config apply
          

          To match UDP packets coming in on bond1 and mirror the traffic to swp53 when a match occurs:

          cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
          cumulus@switch:~$ nv set acl EXAMPLE1 rule 1 match ip protocol udp
          cumulus@switch:~$ nv set acl EXAMPLE1 rule 1 action span swp53
          cumulus@switch:~$ nv set interface bond1 acl EXAMPLE1 inbound
          cumulus@switch:~$ nv config apply
          

          • Always place your rule files in the /etc/cumulus/acl/policy.d/ directory.
          • Using cl-acltool with the --out-interface rule applies to transit traffic only; it does not apply to traffic sourced from the switch.
          • --out-interface rules cannot target bond interfaces, only the bond members tied to them. For example, to mirror all packets going out of bond1 to swp53, where bond1 members are swp1 and swp2, create the rule -A FORWARD --out-interface swp1,swp2 -j SPAN --dport swp53.
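
Because --out-interface rules must name bond members rather than the bond itself, a small helper can expand a bond before the rule is written. The following Python sketch assumes you supply the bond membership yourself; the bonds dictionary and the span_out_rule function are hypothetical names and do not query the switch.

```python
def span_out_rule(out_interface, dest_port, bond_members):
    """Build an egress SPAN rule, expanding a bond into its member ports."""
    # A name that is not a known bond is used as-is.
    members = bond_members.get(out_interface, [out_interface])
    return f"-A FORWARD --out-interface {','.join(members)} -j SPAN --dport {dest_port}"


# Hypothetical membership table: bond1 consists of swp1 and swp2.
bonds = {"bond1": ["swp1", "swp2"]}

print(span_out_rule("bond1", "swp53", bonds))
print(span_out_rule("swp4", "swp53", bonds))
```

The first call yields the same rule as the bond1 example in the list above; the second shows that a plain switch port passes through unchanged.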

          1. Create a rules file in the /etc/cumulus/acl/policy.d/ directory. The following example rules mirror ICMP packets that ingress swp1 to swp54 and UDP packets that egress swp4 to swp53:

            cumulus@switch:~$ sudo nano /etc/cumulus/acl/policy.d/span.rules
            [iptables]
            -A FORWARD --in-interface swp1 -p icmp -j SPAN --dport swp54
            -A FORWARD --out-interface swp4 -p udp -j SPAN --dport swp53
            
          2. Install the rules:

            cumulus@switch:~$ sudo cl-acltool -i
            

          Do not run the cl-acltool -i command with the -P option. The -P option removes all existing control plane rules or other installed rules and only installs the rules defined in the specified file.

          3. Verify that you installed the SPAN rules:

            cumulus@switch:~$ sudo cl-acltool -L all | grep SPAN
            38025 7034K SPAN       icmp --  swp1   any     anywhere             anywhere             dport:swp54
            50832   55M SPAN       udp  --  any    swp4    anywhere             anywhere             dport:swp53
            

          Example Rules

          To mirror forwarded packets from all ports matching source IP address 20.0.0.2 and destination IP address 20.0.1.2 to port swp1:

          -A FORWARD --in-interface swp+ -s 20.0.0.2 -d 20.0.1.2 -j SPAN --dport swp1
          

          To mirror ICMP packets with source IP address 20.0.0.2 from all ports to swp1:

          -A FORWARD --in-interface swp+ -s 20.0.0.2 -p icmp -j SPAN --dport swp1
          

          To mirror forwarded UDP packets received from port swp1, towards destination IP address 20.0.1.2 and destination port 53:

          -A FORWARD --in-interface swp1 -d 20.0.1.2 -p udp --dport 53 -j SPAN --dport swp1
          

          To mirror all forwarded TCP packets with only SYN set:

          -A FORWARD --in-interface swp+ -p tcp --tcp-flags ALL SYN -j SPAN --dport swp1
          

          To mirror all forwarded TCP packets with only FIN set:

          -A FORWARD --in-interface swp+ -p tcp --tcp-flags ALL FIN -j SPAN --dport swp1
          

          CPU port as the SPAN Destination

          You can set the CPU port as a SPAN destination interface to mirror data plane traffic to the CPU. The SPAN traffic goes to a separate network interface called mirror, where you can analyze it with tcpdump. This is useful if you do not have any free external ports on the switch for monitoring. SPAN traffic does not appear on switch ports.

          Cumulus Linux controls how much traffic reaches the CPU so that mirrored traffic does not overwhelm the CPU.

          You configure the CPU port as the SPAN destination with ACLs.

          To monitor traffic mirrored to the CPU, run the tcpdump -i mirror command.

          To match swp1 ingress traffic that has the source IP address 10.10.1.1 and mirror the traffic to the CPU when a match occurs:

          cumulus@switch:~$ nv set acl EXAMPLE1 rule 1 action span cpu
          cumulus@switch:~$ nv set acl EXAMPLE1 rule 1 match ip source-ip 10.10.1.1
          cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
          cumulus@switch:~$ nv set interface swp1 acl EXAMPLE1 inbound
          cumulus@switch:~$ nv config apply
          

          To match swp1 egress traffic that has the source IP address 10.10.1.1 and mirror the traffic to the CPU when a match occurs:

          cumulus@switch:~$ nv set acl EXAMPLE1 rule 1 action span cpu
          cumulus@switch:~$ nv set acl EXAMPLE1 rule 1 match ip source-ip 10.10.1.1
          cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
          cumulus@switch:~$ nv set interface swp1 acl EXAMPLE1 outbound
          cumulus@switch:~$ nv config apply
          
          1. Create a file in the /etc/cumulus/acl/policy.d/ directory and add rules.

            To match swp1 ingress traffic that has the source IP address 10.10.1.1 and mirror the traffic to the CPU when a match occurs:

            cumulus@switch:~$ sudo nano /etc/cumulus/acl/policy.d/span-cpu.rules
            [iptables]
              -A FORWARD -i swp1 -s 10.10.1.1 -j SPAN --dport cpu
            

            To match swp1 egress traffic that has the source IP address 10.10.1.1 and mirror the traffic to the CPU when a match occurs:

            -A FORWARD -o swp1 -s 10.10.1.1 -j SPAN --dport cpu
            
          2. Install the rule:

            cumulus@switch:~$ sudo cl-acltool -i
            

          Do not run the cl-acltool -i command with the -P option. The -P option removes all existing control plane rules or other installed rules and only installs the rules defined in the specified file.

          ERSPAN

          To configure ERSPAN to mirror ports on your switch, you create a port mirror session. The session ID is a number between 0 and 7.

          You can set the following ERSPAN options:

          Run the nv set system port-mirror session <session-id> erspan <option> command. The NVUE commands save the configuration in the /etc/cumulus/switchd.d/port-mirror.conf file.

          To reduce the volume of data, you can truncate the mirrored frames at a specified number of bytes. The size must be between 4 and 4088 bytes and a multiple of 4.
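
The numeric limits above (a session ID between 0 and 7, and a truncation size between 4 and 4088 bytes that is a multiple of 4) are easy to check before you run any command. The following Python sketch is one way to do that; the function names are hypothetical, and the checks restate only the limits given in this section.

```python
def check_session_id(session_id):
    """Return session_id if it is within the documented 0 to 7 range."""
    if not 0 <= session_id <= 7:
        raise ValueError(f"session ID must be between 0 and 7, got {session_id}")
    return session_id


def check_truncate_size(size):
    """Return size if it is between 4 and 4088 bytes and a multiple of 4."""
    if not 4 <= size <= 4088:
        raise ValueError(f"truncate size must be between 4 and 4088, got {size}")
    if size % 4 != 0:
        raise ValueError(f"truncate size must be a multiple of 4, got {size}")
    return size


for candidate in (40, 42, 4092):
    try:
        check_truncate_size(candidate)
        print(candidate, "is a valid truncate size")
    except ValueError as err:
        print(candidate, "rejected:", err)
```

Here 40 is accepted, 42 is rejected because it is not a multiple of 4, and 4092 is rejected because it exceeds 4088.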

          Example Commands

          The following examples configure ERSPAN encapsulation from source IP address 10.10.10.1 to destination IP address 10.10.10.234.

          To mirror all packets that arrive on swp1:

          cumulus@switch:~$ nv set system port-mirror session 1 erspan direction ingress
          cumulus@switch:~$ nv set system port-mirror session 1 erspan source-port swp1
          cumulus@switch:~$ nv set system port-mirror session 1 erspan destination source-ip 10.10.10.1
          cumulus@switch:~$ nv set system port-mirror session 1 erspan destination dest-ip 10.10.10.234
          cumulus@switch:~$ nv config apply
          

          To mirror all packets that go out of swp1:

          cumulus@switch:~$ nv set system port-mirror session 1 erspan direction egress
          cumulus@switch:~$ nv set system port-mirror session 1 erspan source-port swp1
          cumulus@switch:~$ nv set system port-mirror session 1 erspan destination source-ip 10.10.10.1
          cumulus@switch:~$ nv set system port-mirror session 1 erspan destination dest-ip 10.10.10.234
          cumulus@switch:~$ nv config apply
          

          Delete ERSPAN Sessions

          You can delete all ERSPAN sessions with the nv unset system port-mirror command. For example:

          cumulus@switch:~$ nv unset system port-mirror
          cumulus@switch:~$ nv config apply
          

          To delete a specific ERSPAN session, run the nv unset system port-mirror session <session-id> command. For example:

          cumulus@switch:~$ nv unset system port-mirror session 1
          cumulus@switch:~$ nv config apply
          

          Edit the /etc/cumulus/switchd.d/port-mirror.conf file, then load the configuration.

          The following example ERSPAN configuration mirrors all packets received on swp1:

          cumulus@switch:~$ sudo nano /etc/cumulus/switchd.d/port-mirror.conf
          # Copyright © 2021 NVIDIA CORPORATION & AFFILIATES. ALL RIGHTS RESERVED.
          #
          # This software product is a proprietary product of Nvidia Corporation and its affiliates
          # (the "Company") and all right, title, and interest in and to the software
          # product, including all associated intellectual property rights, are and
          # shall remain exclusively with the Company.
          #
          # This software product is governed by the End User License Agreement
          # provided with the software product.
          #
          # [session_n]
          # session-id = n
          # mirror.session.n.direction = (ingress | egress)
          # mirror.session.n.src = <swpx, bond>
          # mirror.session.n.dest = (swpx | <src-ip> <dst-ip>)
          # mirror.session.n.type = (span | erspan | none)
          #
          # Default is all sessions off
          # mirror.session.all.type = none
          [session_1]
          session-id = 1
          mirror.session.1.direction = ingress
          mirror.session.1.src = swp1
          mirror.session.1.dest = 10.10.10.1 10.10.10.234
          mirror.session.1.type = erspan
          

          Run the following command to load the configuration:

          cumulus@switch:~$ /usr/lib/cumulus/switchdctl --load /etc/cumulus/switchd.d/port-mirror.conf -prefix mirror
          

          Selective ERSPAN with ACLs

          You can configure selective ERSPAN with ACLs to mirror a subset of traffic according to:

          The following command mirrors inbound ICMP packets from all swp interfaces. The source IP address for ERSPAN encapsulation is 10.10.10.1 and the destination IP address for ERSPAN encapsulation is 10.10.10.234.

          cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
          cumulus@switch:~$ nv set acl EXAMPLE1 rule 1 match ip protocol icmp
          cumulus@switch:~$ nv set acl EXAMPLE1 rule 1 action erspan source-ip 10.10.10.1
          cumulus@switch:~$ nv set acl EXAMPLE1 rule 1 action erspan dest-ip 10.10.10.234
          cumulus@switch:~$ nv set interface swp1-54 acl EXAMPLE1 inbound
          cumulus@switch:~$ nv config apply
          
          1. Create a rules file in /etc/cumulus/acl/policy.d/. The following rule configures ERSPAN for all ICMP packets that ingress swp1. The source IP address for ERSPAN encapsulation is 10.10.10.1 and the destination IP address for ERSPAN encapsulation is 10.10.10.234.

            cumulus@switch:~$ sudo nano /etc/cumulus/acl/policy.d/erspan.rules
            [iptables]
            -A FORWARD --in-interface swp1 -p icmp -j ERSPAN --src-ip 10.10.10.1 --dst-ip 10.10.10.234
            

            src-ip can be any IP address, even if it does not exist in the routing table.

            dst-ip must be an IP address reachable through the routing table and front-panel port (not the management port) or SVI. Use ping or ip route get to verify that the destination IP address is reachable.

          2. Install the rules:

            cumulus@switch:~$ sudo cl-acltool -i
            

          Do not run the cl-acltool -i command with the -P option. The -P option removes all existing control plane rules or other installed rules and only installs the rules defined in the specified file.

          3. Verify that you installed the ERSPAN rules:

            cumulus@switch:~$ sudo iptables -L -v | grep ERSPAN
            29     0 ERSPAN     icmp --  swp1   any     anywhere             anywhere             ERSPAN src-ip:10.10.10.1 dst-ip:10.10.10.234
            

          Example Rules

          In the following example rules, the source IP address for ERSPAN encapsulation is 10.10.10.1 and the destination IP address for ERSPAN encapsulation is 10.10.10.234.

          To mirror forwarded packets from all ports matching the source IP address 20.0.0.2 and the destination IP address 20.0.1.2:

          -A FORWARD --in-interface swp+ -s 20.0.0.2 -d 20.0.1.2 -j ERSPAN --src-ip 10.10.10.1 --dst-ip 10.10.10.234
          

          To mirror ICMP packets from all ports:

          -A FORWARD --in-interface swp+ -p icmp -j ERSPAN --src-ip 10.10.10.1 --dst-ip 10.10.10.234
          

          To mirror forwarded UDP packets with destination port 53 arriving on swp1:

          -A FORWARD --in-interface swp1 -p udp --dport 53 -j ERSPAN --src-ip 10.10.10.1 --dst-ip 10.10.10.234
          

          To mirror all forwarded TCP packets with only SYN set:

          -A FORWARD --in-interface swp+ -p tcp --tcp-flags ALL SYN -j ERSPAN --src-ip 10.10.10.1 --dst-ip 10.10.10.234
          

          To mirror all forwarded TCP packets with only FIN set:

          -A FORWARD --in-interface swp+ -p tcp --tcp-flags ALL FIN -j ERSPAN --src-ip 10.10.10.1 --dst-ip 10.10.10.234
          

          Show SPAN and ERSPAN Configuration

          To show SPAN and ERSPAN configuration for a specific session, run the NVUE nv show system port-mirror session <session-id> command. To show SPAN and ERSPAN configuration for all sessions, run the NVUE nv show system port-mirror command.

          cumulus@switch:~$ nv show system port-mirror session 1
                           operational  applied  pending
          ---------------  -----------  -------  -------
          erspan                                        
            enable                               off    
          span                                          
            enable                               on     
            direction                            ingress
            [destination]                               
            [source-port]                        swp1   
            truncate                                    
              enable                             off  
          

          You can also run the sudo cl-acltool -L all | grep SPAN or sudo cl-acltool -L all | grep ERSPAN command.

          cumulus@switch:~$ sudo cl-acltool -L all | grep SPAN
              0     0 SPAN       all  --  any    swp1    10.10.10.1    anywhere    /* rule_id:1,acl_name:EXAMPLE1,dir:outbound,interface_id:swp1 */ dport:cpu
          

          Limitations

          Simple Network Management Protocol - SNMP

          SNMP is an IETF standards-based network management architecture and protocol. Cumulus Linux uses the open source Net-SNMP agent snmpd, which provides support for most of the common industry-wide MIBs, including interface counters, and TCP and UDP IP stack data. The SNMP version in Cumulus Linux adds custom MIBs, and pass-through and pass-persist scripts.

          SNMP Components

          The main components of SNMP in Cumulus Linux include:

          SNMP Network Management System

          An SNMP network management system (NMS) is a system configured to poll SNMP agents (such as Cumulus Linux switches or routers), which respond with data. A variety of command line tools exist to poll agents, such as snmpget, snmpgetnext, snmpwalk, snmpbulkget, and snmpbulkwalk. SNMP agents can also send unsolicited traps and inform messages to the NMS based on predefined criteria, such as link changes.

          SNMP Agent

          The SNMP agent (snmpd) running on a Cumulus Linux switch gathers information about the local system and stores the data in a MIB. Parts of the MIB tree are available and provided to incoming requests originating from an NMS host that has authenticated with the correct credentials. You can configure the Cumulus Linux switch with usernames and credentials to provide authenticated and encrypted responses to NMS requests. The snmpd agent can also proxy requests and act as a master agent to sub-agents running on other daemons, such as FRR or LLDP.

          Management Information Base (MIB)

          The MIB is a database for the snmpd service that runs on the agent. MIBs adhere to IETF standards but are flexible enough to allow vendor-specific additions. Cumulus Linux includes custom enterprise MIB tables in a set of text files on the switch. The files are in /usr/share/snmp/mibs/ and their names all start with Cumulus; for example, Cumulus-Counters-MIB.txt.

          The MIB is a top-down hierarchical tree. Each branch that forks off has both an identifying number (starting with 1) and an identifying string that is unique for that level of the hierarchy. You can use the strings and numbers interchangeably. The parent IDs (numbers or strings) combine, starting with the most general, to form an address for the MIB object. A dot in this notation represents each junction in the hierarchy, so the address is a series of ID strings or numbers separated by dots. This entire address is an object identifier (OID).

          You can use various online and command line tools to translate between numbers and strings and to also provide definitions for the various MIB objects. For example, you can view the sysLocation object (in SNMPv2-MIB.txt) in the system table as either a series of numbers 1.3.6.1.2.1.1.6 or as the string iso.org.dod.internet.mgmt.mib-2.system.sysLocation. You view the definition with the snmptranslate command, which is part of the snmp Debian package in Cumulus Linux.

          cumulus@switch:~$ snmptranslate -Td -On SNMPv2-MIB::sysLocation
          .1.3.6.1.2.1.1.6
          sysLocation OBJECT-TYPE
            -- FROM       SNMPv2-MIB
            -- TEXTUAL CONVENTION DisplayString
            SYNTAX        OCTET STRING (0..255)
            DISPLAY-HINT  "255a"
            MAX-ACCESS    read-write
            STATUS        current
            DESCRIPTION   "The physical location of this node (e.g., 'telephone
                      closet, 3rd floor').  If the location is unknown, the
                      value is the zero-length string."
          ::= { iso(1) org(3) dod(6) internet(1) mgmt(2) mib-2(1) system(1) 6 }
          

          In the last line above, the section 1.3.6.1 or iso.org.dod.internet is the OID that defines internet resources. The 2 or mgmt that follows is for a management subcategory. The 1 or mib-2 under that defines the MIB-2 specification. The 1 or system is the parent for child objects sysDescr, sysObjectID, sysUpTime, sysContact, sysName, sysLocation, sysServices, and so on, as you see in the tree output from the second snmptranslate command below, where sysLocation is 6.

          cumulus@leaf01:mgmt:~$ snmptranslate -Tp -IR system
          +--system(1)
             |
             +-- -R-- String    sysDescr(1)
             |        Textual Convention: DisplayString
             |        Size: 0..255
             +-- -R-- ObjID     sysObjectID(2)
             +-- -R-- TimeTicks sysUpTime(3)
             |  |
             |  +--sysUpTimeInstance(0)
             |
             +-- -RW- String    sysContact(4)
             |        Textual Convention: DisplayString
             |        Size: 0..255
             +-- -RW- String    sysName(5)
             |        Textual Convention: DisplayString
             |        Size: 0..255
             +-- -RW- String    sysLocation(6)
             |        Textual Convention: DisplayString
             |        Size: 0..255
             +-- -R-- INTEGER   sysServices(7)
             |        Range: 0..127
             +-- -R-- TimeTicks sysORLastChange(8)
             |        Textual Convention: TimeStamp
             |
             +--sysORTable(9)
                |
                +--sysOREntry(1)
                   |  Index: sysORIndex
                   |
                   +-- ---- INTEGER   sysORIndex(1)
                   |        Range: 1..2147483647
                   +-- -R-- ObjID     sysORID(2)
                   +-- -R-- String    sysORDescr(3)
                   |        Textual Convention: DisplayString
                   |        Size: 0..255
                   +-- -R-- TimeTicks sysORUpTime(4)
                            Textual Convention: TimeStamp
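
The relationship between the numeric and string forms of an OID can be sketched in a few lines of Python. The name and number pairs below are copied from the last line of the snmptranslate -Td output for sysLocation; the function names are hypothetical, and the sketch handles only this one path rather than parsing MIB files.

```python
# (name, number) for each level on the path to sysLocation, most general first.
SYS_LOCATION_PATH = [
    ("iso", 1), ("org", 3), ("dod", 6), ("internet", 1),
    ("mgmt", 2), ("mib-2", 1), ("system", 1), ("sysLocation", 6),
]


def numeric_oid(path):
    """Join the identifying numbers of each level with dots."""
    return ".".join(str(number) for _, number in path)


def named_oid(path):
    """Join the identifying strings of each level with dots."""
    return ".".join(name for name, _ in path)


def mixed_oid(path, named_levels):
    """Use strings for the first named_levels levels and numbers for the rest."""
    parts = [
        name if index < named_levels else str(number)
        for index, (name, number) in enumerate(path)
    ]
    return ".".join(parts)


print(numeric_oid(SYS_LOCATION_PATH))   # 1.3.6.1.2.1.1.6
print(named_oid(SYS_LOCATION_PATH))     # iso.org.dod.internet.mgmt.mib-2.system.sysLocation
print(mixed_oid(SYS_LOCATION_PATH, 4))  # iso.org.dod.internet.2.1.1.6
```

The mixed form illustrates that strings and numbers are interchangeable at each level of the hierarchy.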
          

          Configure SNMP

          The most basic SNMP configuration requires you to:

          By default, the SNMP configuration has a listening address of localhost (127.0.0.1), which allows the agent (the snmpd service) to respond to SNMP requests originating on the switch itself. This is a secure method that allows checking the SNMP configuration without exposing the switch to outside attacks. For an external SNMP NMS to poll a Cumulus Linux switch, you must configure the snmpd service running on the switch to listen to one or more IP addresses on interfaces that have a link state UP.

          Use the SNMPv3 username instead of the read-only community name. The SNMPv3 username does not expose the user credentials and can encrypt packet contents. However, SNMPv1 and SNMPv2c environments require read-only community passwords so that the snmpd daemon can respond to requests. The read-only community string enables you to poll various MIB objects on the device.

          Basic Configuration

          Before you can use SNMP, you need to enable and start the snmpd service, and configure a listening address.

          cumulus@switch:~$ nv set system snmp-server state enabled
          cumulus@switch:~$ nv set system snmp-server listening-address localhost
          cumulus@switch:~$ nv config apply
          
          1. Start the snmpd service:

            cumulus@switch:~$ sudo systemctl start snmpd.service
            
          2. Enable the snmpd service to start automatically after reboot:

            cumulus@switch:~$ sudo systemctl enable snmpd.service
            

            To enable the snmpd service to restart automatically after failure, create a file called /etc/systemd/system/snmpd.service.d/restart.conf and add the following lines:

            [Service]
            Restart=always
            RestartSec=60
            
          3. Edit the /etc/snmp/snmpd.conf file and add the IP address, protocol and port for snmpd to listen for incoming requests. You can use multiple lines to define multiple listening addresses or use a comma-separated list on a single line.

          cumulus@switch:~$ sudo nano /etc/snmp/snmpd.conf
          ...
          agentAddress 192.168.200.11@mgmt
          agentAddress udp:66.66.66.66:161,udp:77.77.77.77:161,udp6:[2001::1]:161
          ...
          
          4. Run the sudo systemctl daemon-reload command.

          Listening IP Addresses

          The listening address is localhost by default so that the SNMP agent only responds to requests originating on the switch itself in the default VRF. To configure the switch to respond to requests sent to localhost in a mgmt VRF shell, see SNMP and VRFs. You can also configure listening only on the IPv6 localhost address. When using IPv6 addresses or localhost, you can use a readonly-community-v6 for SNMPv1 and SNMPv2c requests. For SNMPv3 requests, you can use the username command to restrict access. See Configure the SNMPv3 Username below.

          The IP address must exist on an interface that has link UP on the switch where you use snmpd. By default, the IP address is udp:127.0.0.1:161, so snmpd only responds to requests (such as snmpwalk, snmpget, snmpgetnext) that originate from the switch. A wildcard setting of udp:161,udp6:161 forces snmpd to listen on all IPv4 and IPv6 interfaces for incoming SNMP requests.

          You can configure multiple IP addresses and bind to a particular IP address within a particular VRF table.

          To configure the listening IP addresses:

          cumulus@switch:~$ nv set system snmp-server listening-address localhost
          cumulus@switch:~$ nv set system snmp-server listening-address localhost-v6
          cumulus@switch:~$ nv config apply
          

          To configure the snmpd daemon to listen on all interfaces for either IPv4 or IPv6 UDP port 161 SNMP requests, run the following commands, which remove all other individual IP addresses configured:

          cumulus@switch:~$ nv set system snmp-server listening-address all
          cumulus@switch:~$ nv set system snmp-server listening-address all-v6
          cumulus@switch:~$ nv config apply
          

          To configure snmpd to listen to a specific IPv4 or IPv6 address:

          cumulus@switch:~$ nv set system snmp-server listening-address 192.168.200.11
          cumulus@switch:~$ nv config apply
          

          To configure snmpd to listen to multiple addresses for incoming SNMP queries, separate the addresses with a space:

          cumulus@switch:~$ nv set system snmp-server listening-address 192.168.200.11 192.168.200.21
          cumulus@switch:~$ nv config apply
          

          Edit the /etc/snmp/snmpd.conf file and add the IP address, protocol and port for snmpd to listen for incoming requests. You can use multiple lines to define multiple listening addresses or use a comma-separated list on a single line.

          cumulus@switch:~$ sudo nano /etc/snmp/snmpd.conf
          ...
          agentAddress 192.168.200.11@mgmt
          agentAddress udp:66.66.66.66:161,udp:77.77.77.77:161,udp6:[2001::1]:161
          ...
          

          SNMP and VRFs

          Cumulus Linux provides a listening address for VRFs together with trap and inform support. You can configure snmpd to listen to a specific IPv4 or IPv6 address on an interface within a particular VRF. With VRFs, identical IP addresses can exist in different VRF tables. This command restricts listening to a particular IP address within a particular VRF. If you do not provide a VRF name, Cumulus Linux uses the default VRF.

          The following command configures snmpd to listen to IP address 10.10.10.10 on eth0, the management interface in the management VRF:

          cumulus@switch:~$ nv set system snmp-server listening-address 10.10.10.10 vrf mgmt
          cumulus@switch:~$ nv config apply
          

          By default, snmpd does not cross VRF table boundaries. To listen on IP addresses in different VRF tables, use multiple listening-address commands each with a VRF name:

          cumulus@switch:~$ nv set system snmp-server listening-address 10.10.10.10 vrf rocket
          cumulus@switch:~$ nv set system snmp-server listening-address 10.10.10.20 vrf turtle
          cumulus@switch:~$ nv config apply
          

          By default, snmpd only responds to localhost requests in the default VRF. You can configure the switch to respond to requests sent to localhost in a mgmt VRF shell. To configure the snmpd daemon to listen on localhost in the mgmt VRF, run:

          cumulus@switch:~$ nv set system snmp-server listening-address localhost vrf mgmt
          cumulus@switch:~$ nv config apply
          

          To bind to a particular IP address within a particular VRF table, edit the /etc/snmp/snmpd.conf file and append @ and the name of the VRF table to the IP address (for example, 192.168.200.11@mgmt).

          cumulus@switch:~$ sudo nano /etc/snmp/snmpd.conf
          ...
          agentAddress 192.168.200.11@mgmt
          agentAddress udp:66.66.66.66:161,udp:77.77.77.77:161,udp6:[2001::1]:161
          ...
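
The agentAddress entries in the example above follow two patterns: a protocol, address, and port form (with IPv6 addresses in square brackets), and an address with @ and a VRF name appended. The following Python sketch builds both forms with the standard ipaddress module. The function names are hypothetical, and the sketch reproduces only the two forms shown in this section; it does not validate that an address exists on the switch.

```python
import ipaddress


def agent_address(ip, port=161):
    """Format one listener as udp:ADDR:PORT or udp6:[ADDR]:PORT."""
    addr = ipaddress.ip_address(ip)  # raises ValueError on a malformed address
    if addr.version == 6:
        return f"udp6:[{addr}]:{port}"
    return f"udp:{addr}:{port}"


def vrf_agent_address(ip, vrf):
    """Format a listener bound to a VRF, in the ADDR@VRF form shown above."""
    addr = ipaddress.ip_address(ip)
    return f"{addr}@{vrf}"


def agent_address_line(entries):
    """Join already formatted listeners into one agentAddress line."""
    return "agentAddress " + ",".join(entries)


print(agent_address_line([vrf_agent_address("192.168.200.11", "mgmt")]))
print(agent_address_line([agent_address(ip) for ip in ("66.66.66.66", "77.77.77.77", "2001::1")]))
```

The two printed lines match the agentAddress lines in the snmpd.conf example.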
          

          By default, snmpd only responds to localhost requests in the default VRF. You can configure the switch to respond to requests sent to localhost in a mgmt VRF shell. Edit the /etc/snmp/snmpd.conf file and add @mgmt to the agentaddress configuration:

          cumulus@switch:~$ sudo nano /etc/snmp/snmpd.conf
          ...
          agentaddress 127.0.0.1@mgmt
          ...
          

          Then restart snmpd with the sudo systemctl restart snmpd command.

          Configure the SNMPv3 Username

          NVIDIA recommends you use an SNMPv3 username and password instead of the read-only community; SNMPv3 does not expose the password in the GetRequest and GetResponse packets and can also encrypt packet contents. You can configure multiple usernames for different user roles with different levels of access to various MIBs.

          The default snmpd.conf file contains the default user _snmptrapusernameX. You cannot use this username for authentication. SNMP traps require this username.

          You can authenticate the user in the following ways:

          The following example command requires no authentication password for the user testusernoauth:

          cumulus@switch:~$ nv set system snmp-server username testusernoauth auth-none
          cumulus@switch:~$ nv config apply
          

          The following example command configures MD5 authentication for the user testuserauth:

          cumulus@switch:~$ nv set system snmp-server username testuserauth auth-md5 myauthmd5password
          cumulus@switch:~$ nv config apply
          

          The following example command configures SHA authentication for the user limiteduser1:

          cumulus@switch:~$ nv set system snmp-server username limiteduser1 auth-sha SHApassword1
          cumulus@switch:~$ nv config apply
          

          If you specify MD5 or SHA authentication, you can also specify an AES or DES encryption password to encrypt the contents of the request and response packets.

          cumulus@switch:~$ nv set system snmp-server username testuserauth auth-md5 myauthmd5password encrypt-aes myencryptsecret
          cumulus@switch:~$ nv config apply
          

          You can restrict a user to a particular OID tree. The OID can be either a string of decimal numbers separated by periods or a unique text string that identifies an SNMP MIB object. The MIBs that Cumulus Linux includes are in the /usr/share/snmp/mibs/ directory. If the MIB you want to use does not install by default, you can install it with the latest Debian snmp-mibs-downloader package.

          cumulus@switch:~$ nv set system snmp-server username testuserauth auth-md5 myauthmd5password encrypt-aes myaessecret oid 1.3.6.1.2.1.1
          cumulus@switch:~$ nv config apply
          

          You can restrict a user to a predefined view:

          cumulus@switch:~$ nv set system snmp-server username testuserauth auth-md5 myauthmd5password encrypt-aes myaessecret view rocket
          cumulus@switch:~$ nv config apply
          

          The example below defines five users, each with a different combination of authentication and encryption:

          cumulus@switch:~$ nv set system snmp-server username user1 auth-none
          cumulus@switch:~$ nv set system snmp-server username user2 auth-md5 user2password
          cumulus@switch:~$ nv set system snmp-server username user3 auth-md5 user3password encrypt-des user3encryption
          cumulus@switch:~$ nv set system snmp-server username user666 auth-sha user666password encrypt-aes user666encryption
          cumulus@switch:~$ nv set system snmp-server username user999 auth-md5 user999password encrypt-des user999encryption
          cumulus@switch:~$ nv set system snmp-server username user1 auth-none oid 1.3.6.1.2.1
          cumulus@switch:~$ nv set system snmp-server username user3 auth-sha testshax encrypt-aes testaesx oid 1.3.6.1.2.1
          cumulus@switch:~$ nv config apply
          

          Three directives define an internal SNMPv3 username that you need for snmpd to retrieve information and send built-in traps or for traps you configure with the monitor command (see below):

          • createuser creates the default SNMPv3 username.
          • iquerysecName specifies the default SNMPv3 username to use for internal queries that retrieve monitored expressions, either to evaluate the monitored expression or to build a notification payload. These internal queries always use SNMPv3, even if you query the agent using SNMPv1 or SNMPv2c. The iquerysecName directive only defines which user to use.
          • rouser grants that username read-only access for these SNMPv3 queries.

          Edit the /etc/snmp/snmpd.conf file and add the createuser, iquerysecName, and rouser directives. The following example configures snmptrapusernameX as the username with the createuser directive.

          cumulus@switch:~$ sudo nano /etc/snmp/snmpd.conf
          ...
          createuser snmptrapusernameX
          iquerysecname snmptrapusernameX
          rouser snmptrapusernameX
          ...
          

          The example below defines five users, each with a different combination of authentication and encryption:

          cumulus@switch:~$ sudo nano /etc/snmp/snmpd.conf
          ...
          # simple no auth user
          #createuser user1
          
          # user with MD5 authentication
          #createuser user2 MD5 user2password
          
          # user with MD5 for auth and DES for encryption
          #createuser user3 MD5 user3password DES user3encryption
          
          # user666 with SHA for authentication and AES for encryption
          createuser user666 SHA user666password AES user666encryption
          
          # user999 with MD5 for authentication and DES for encryption
          createuser user999 MD5 user999password DES user999encryption
          
          # restrict users to certain OIDs
          # (Note: creating rouser or rwuser will give
          # access regardless of the createUser command above. However,
          # createUser without rouser or rwuser will not provide any access).
          rouser user1 noauth 1.3.6.1.2.1
          rouser user2 auth 1.3.6.1.2.1
          rwuser user3 priv 1.3.6.1.2.1
          rwuser user666
          rwuser user999
          ...
          

          The following example shows a more advanced but slightly more secure method of configuring SNMPv3 users without creating cleartext passwords:

          1. Install the net-snmp-config script that is in the libsnmp-dev package:

            cumulus@switch:~$ sudo -E apt-get update
            cumulus@switch:~$ sudo -E apt-get install libsnmp-dev
            
          2. Stop the snmpd daemon:

            cumulus@switch:~$ sudo systemctl stop snmpd.service
            
          3. Use the net-snmp-config command to create two users, one with MD5 and DES, and the next with SHA and AES.

          The minimum password length is eight characters, and the arguments -a and -x have different meanings in net-snmp-config than in snmpwalk.

          cumulus@switch:~$ sudo net-snmp-config --create-snmpv3-user -a md5authpass -x desprivpass -A MD5 -X DES userMD5withDES
          cumulus@switch:~$ sudo net-snmp-config --create-snmpv3-user -a shaauthpass -x aesprivpass -A SHA -X AES userSHAwithAES
          cumulus@switch:~$ sudo systemctl start snmpd.service
          

          This adds a createUser command in /var/lib/snmp/snmpd.conf. Do not edit this file by hand unless you are removing usernames. You can edit this file and restrict access to certain parts of the MIB by adding noauth, auth or priv to allow unauthenticated access, require authentication, or to enforce use of encryption.

          The snmpd daemon reads the information from the /var/lib/snmp/snmpd.conf file, then removes the line (so that Cumulus Linux does not store the master password for that user) and replaces it with the key it derives (using the EngineID). The key is a localized key so that if someone steals the password, they cannot use it to access other agents. To remove the two users userMD5withDES and userSHAwithAES, stop the snmpd daemon and edit the /var/lib/snmp/snmpd.conf file. Remove the lines containing the username, then restart the snmpd daemon as in step 3 above.
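
          The localized key mentioned above corresponds to the password-to-key procedure that RFC 3414 describes for SNMPv3 users. The following Python sketch is illustrative only: snmpd performs this step internally, the function names are hypothetical, and the second engine ID is made up. It shows why a key derived for one EngineID does not match the key for another agent.

```python
import hashlib

def password_to_key(password: bytes, algo: str = "sha1") -> bytes:
    """Stretch the password to 1,048,576 bytes and hash it (RFC 3414, A.2)."""
    if not password:
        raise ValueError("password must not be empty")
    reps, rem = divmod(1048576, len(password))
    digest = hashlib.new(algo)
    digest.update(password * reps + password[:rem])
    return digest.digest()

def localize_key(password: bytes, engine_id: bytes, algo: str = "sha1") -> bytes:
    """Bind the master key Ku to one engine: H(Ku + engineID + Ku)."""
    ku = password_to_key(password, algo)
    return hashlib.new(algo, ku + engine_id + ku).digest()

if __name__ == "__main__":
    # Example password from the steps above; the second engine ID is hypothetical
    engine_a = bytes.fromhex("80001f888070939b14a514da5a00000000")
    engine_b = bytes.fromhex("80001f8880") + bytes(12)
    key_a = localize_key(b"shaauthpass", engine_a)
    key_b = localize_key(b"shaauthpass", engine_b)
    print(key_a.hex())
    print(key_b.hex())
    print("keys differ:", key_a != key_b)
```

          Because the EngineID is mixed into the hash, the same password produces a different key on every agent.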

          Configure an SNMP View Definition

          To restrict MIB tree exposure, you can define a view for an SNMPv3 username or community password, and a host from a restricted subnet. In doing so, any SNMP request with that username and password must have a source IP address within the configured subnet.

          You can define a specific view multiple times and fine tune to provide or restrict access using the included or excluded command to specify branches of certain MIB trees.

          By default, the snmpd.conf file contains many views within the systemonly view.

          cumulus@switch:~$ nv set system snmp-server viewname cumulusOnly included .1.3.6.1.4.1.40310
          cumulus@switch:~$ nv set system snmp-server viewname cumulusCounters included .1.3.6.1.4.1.40310.2
          cumulus@switch:~$ nv set system snmp-server readonly-community simplepassword access any view cumulusOnly
          cumulus@switch:~$ nv set system snmp-server username testusernoauth auth-none view cumulusOnly
          cumulus@switch:~$ nv set system snmp-server username limiteduser1 auth-md5 md5password1 encrypt-aes myaessecret view cumulusCounters
          cumulus@switch:~$ nv config apply
          

          Edit the /etc/snmp/snmpd.conf file and add the view command.

          rocommunity uses the systemonly view to create a password that can only access these branches of the OID tree.

          cumulus@switch:~$ sudo nano /etc/snmp/snmpd.conf
          ...
          view systemonly included .1.3.6.1.2.1.1
          view systemonly included .1.3.6.1.2.1.2
          view systemonly included .1.3.6.1.2.1.3
          ...
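
          To picture how included and excluded branches combine in a view, the following Python sketch models a view as a list of (action, subtree) pairs and lets the most specific matching subtree decide. It is a simplified model, not the snmpd implementation: the helper names are hypothetical, the excluded entry is an assumption added for illustration, and the optional mask that the view directive accepts is ignored.

```python
def parse_oid(text: str) -> tuple:
    """Convert '.1.3.6.1' or '1.3.6.1' into a tuple of integers."""
    return tuple(int(part) for part in text.strip(".").split("."))

def in_view(oid: str, view: list) -> bool:
    """Return True if the most specific matching subtree is 'included'."""
    target = parse_oid(oid)
    best_len, allowed = -1, False
    for action, subtree in view:
        prefix = parse_oid(subtree)
        # Compare whole sub-identifiers, so 1.3.6.1.2.1.1 does not match 1.3.6.1.2.1.15
        if target[:len(prefix)] == prefix and len(prefix) > best_len:
            best_len, allowed = len(prefix), action == "included"
    return allowed

# The systemonly entries from the example, plus a hypothetical exclusion
systemonly = [
    ("included", ".1.3.6.1.2.1.1"),
    ("included", ".1.3.6.1.2.1.2"),
    ("included", ".1.3.6.1.2.1.3"),
    ("excluded", ".1.3.6.1.2.1.2.2.1.6"),  # assumption: hide ifPhysAddress
]

print(in_view("1.3.6.1.2.1.1.5.0", systemonly))      # sysName.0 -> True
print(in_view("1.3.6.1.2.1.2.2.1.6.3", systemonly))  # excluded branch -> False
print(in_view("1.3.6.1.2.1.15.1.0", systemonly))     # BGP4-MIB, not in the view -> False
```

          The comparison works on whole sub-identifiers rather than on string prefixes, which is why the BGP4 MIB (1.3.6.1.2.1.15) does not slip into a view that only includes 1.3.6.1.2.1.1.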
          

          Configure the Community String

          Cumulus Linux disables snmpd authentication for SNMPv1 and SNMPv2c by default. To enable authentication, provide a password (community string) for SNMPv1 or SNMPv2c environments so that the snmpd daemon can respond to requests. By default, this provides access to the full OID tree for such requests, regardless of their source. Cumulus Linux does not set a default password so snmpd does not respond to any requests that arrive unless you set the read-only community password.

          For SNMPv1 and SNMPv2c, you can specify a read-only community string. For SNMPv3, you can specify a read-only or a read-write community string (as long as you are not using the preferred username method; see above).

          You can specify a source IP address token to restrict access to only that host or network.

          You can also specify a view to restrict the subset of the OID tree.

          The following example configuration:

          • Sets the read-only community string to simplepassword for SNMP requests.
          • Restricts requests to only those that come from hosts in the 192.168.200.10/24 subnet.
          • Restricts viewing to the mysystem view, which you define with the view command.

          cumulus@switch:~$ nv set system snmp-server viewname mysystem included 1.3.6.1.2.1.1
          cumulus@switch:~$ nv set system snmp-server readonly-community simplepassword access 192.168.200.10/24 view mysystem
          cumulus@switch:~$ nv config apply
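
          The source restriction in this example amounts to a subnet membership test. The following Python sketch models the intent described above, not the parser that snmpd or NVUE uses; the function name is hypothetical, and it assumes that a token such as 192.168.200.10/24 stands for the network that contains that address.

```python
import ipaddress

def source_allowed(source_ip: str, access: str) -> bool:
    """Return True if source_ip matches the access token ('any', a host, or a subnet)."""
    if access in ("any", "default"):
        return True
    # strict=False accepts a host address with a prefix length, such as 192.168.200.10/24
    network = ipaddress.ip_network(access, strict=False)
    return ipaddress.ip_address(source_ip) in network

print(ipaddress.ip_network("192.168.200.10/24", strict=False))  # 192.168.200.0/24
for source in ("192.168.200.21", "192.168.201.5"):
    print(source, source_allowed(source, "192.168.200.10/24"))
```

          Under this assumption, a request from 192.168.200.21 is inside the subnet and a request from 192.168.201.5 is not.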
          

          This example creates a read-only community password showitall that allows access to the entire OID tree for requests originating from any source IP address.

          cumulus@switch:~$ nv set system snmp-server readonly-community showitall access any
          cumulus@switch:~$ nv config apply
          

          To enable the community string, provide a community string, then set:

          • rocommunity or rwcommunity: rocommunity is for a read-only community; rwcommunity is for read-write access. Specify one or the other.
          • public: The plain text password or community string.

          NVIDIA strongly recommends you change this password to something else.

          • default allows connections from any system.
          • localhost allows requests only from the local host. A restricted source can either be a specific hostname (or address), or a subnet, represented as IP/MASK (like 10.10.10.0/255.255.255.0), or IP/BITS (like 10.10.10.0/24), or the IPv6 equivalents.
          • -V restricts viewing to a specific view. For example, systemonly is one SNMP view. This is a user-defined value.

          Edit the /etc/snmp/snmpd.conf file and add the community string.

          In the following example, the first line sets the read-only community string to turtle for SNMP requests sourced from the 192.168.200.10/24 subnet and restricts viewing to the systemonly view defined with the -V option. The second line creates a read-only community string that allows access to the entire OID tree from any source IP address.

          cumulus@switch:~$ sudo nano /etc/snmp/snmpd.conf
          ...
          rocommunity turtle 192.168.200.10/24 -V systemonly
          rocommunity cumuluspassword
          ...
          

          Restart snmpd for the changes to take effect:

          cumulus@switch:~$ sudo systemctl restart snmpd.service
          

          Configure System Settings

          You can configure system settings for the SNMPv2 MIB, as the following examples show.

          To set the system physical location for the node in the SNMPv2-MIB system table:

          cumulus@switch:~$ nv set system snmp-server system-location my-private-bunker
          cumulus@switch:~$ nv config apply
          

          To set the username and email address of the contact person for this managed node:

          cumulus@switch:~$ nv set system snmp-server system-contact myemail@example.com
          cumulus@switch:~$ nv config apply
          

          To set an administratively assigned name for the managed node, run the following command. Typically, this is the fully qualified domain name of the node.

          cumulus@switch:~$ nv set system snmp-server system-name CumulusBox-1,543,567
          cumulus@switch:~$ nv config apply
          

          Edit the /etc/snmp/snmpd.conf file and add the following configuration:

          cumulus@switch:~$ sudo nano /etc/snmp/snmpd.conf
          ...
          syscontact myemail@example.com
          syslocation My-private-bunker
          sysname CumulusBox-1,543,567
          ...
          

          Enable SNMP Support for FRR

          SNMP supports routing MIBs in FRR. If you are running Linux commands to configure the switch, you need to configure AgentX (ASX) access in FRR.

          The NVUE nv set system snmp-server state enable command automatically configures AgentX (ASX) access in FRR; you do not need to run any additional commands.

          1. Enable AgentX:

            cumulus@switch:~$ sudo vtysh
            ...
            switch# configure terminal
            switch(config)# agentx
            switch(config)# end
            switch# write memory
            switch# exit
            
          2. If your SNMP view restricts MIB access, expose the following MIBs for the protocols you are using:

            • For the BGP4 MIB, allow access to 1.3.6.1.2.1.15
            • For the OSPF MIB, allow access to 1.3.6.1.2.1.14
            • For the OSPFV3 MIB, allow access to 1.3.6.1.2.1.191

          To verify the configuration, you can run snmpwalk.

          cumulus@switch:~$ sudo snmpwalk -v2c -cpublic localhost 1.3.6.1.2.1.14
          

          If you disable the SNMP server with AgentX enabled, the FRR service restarts, which might impact traffic.

          Enable the .1.3.6.1.2.1 Range

          The snmpd.conf file in Cumulus Linux does not include certain MIBs by default. As a result, some default views in common network tools (like librenms) return less than optimal data. To include more MIBs, enable the complete .1.3.6.1.2.1 range. The default SNMPv3 configuration includes:

          This configuration grants access to a large number of MIBs, including all SNMPv2-MIB, which shows more data than you expect. In addition to being a security vulnerability, it consumes more CPU resources.

          To enable the .1.3.6.1.2.1 range, make sure the view commands include the required MIB objects.

          Set up the Custom MIBs on the NMS

          You do not need to change the /etc/snmp/snmpd.conf file on the switch to support the custom MIBs. The file includes the following lines by default and provides support for both the Cumulus Counters and the Cumulus Resource Query MIBs.

          cumulus@switch:~$ cat /etc/snmp/snmpd.conf
          ...
          sysObjectID 1.3.6.1.4.1.40310
          pass_persist .1.3.6.1.4.1.40310.1 /usr/share/snmp/resq_pp.py
          pass_persist .1.3.6.1.4.1.40310.2 /usr/share/snmp/cl_drop_cntrs_pp.py
          ...
          

          You need to copy several files to the NMS server for it to recognize the custom Cumulus MIB.

          Pass Persist Scripts

          The pass persist scripts in Cumulus Linux use the pass_persist extension to Net-SNMP. The scripts are in /usr/share/snmp and include:

          Cumulus Linux enables all the scripts by default except for bgp4_pp.py, which FRR uses.

          Disable SNMP

          To disable SNMP, run the nv set system snmp-server state disable command:

          cumulus@switch:~$ nv set system snmp-server state disable
          cumulus@switch:~$ nv config apply
          

          When you disable SNMP, the FRR service restarts, which might impact traffic.

          Example Configuration

          The following example configuration:

          cumulus@switch:~$ nv set system snmp-server listening-address all
          cumulus@switch:~$ nv set system snmp-server readonly-community tempPassword access any
          cumulus@switch:~$ nv set system snmp-server trap-destination 1.1.1.1 community-password mypassword version 2c
          cumulus@switch:~$ nv set system snmp-server trap-link-up check-frequency 15
          cumulus@switch:~$ nv set system snmp-server trap-link-down check-frequency 10
          cumulus@switch:~$ nv set system snmp-server trap-snmp-auth-failures
          cumulus@switch:~$ nv config apply
          

          Edit the /etc/snmp/snmpd.conf file and apply the following configuration (add every line starting with a +):

          cumulus@switch:~$ sudo nano /etc/snmp/snmpd.conf
          +agentaddress udp:161
          agentxperms 777 777 snmp snmp
          agentxsocket /var/agentx/master
          +authtrapenable 1
          createuser _snmptrapusernameX
          iquerysecname _snmptrapusernameX
          +load 7.45 5.14 0
          master agentx
          monitor -r 60 -o laNames -o laErrMessage "laTable" laErrorFlag != 0
          +monitor CumulusLinkDOWN -S -r 10 -o ifName -o ifIndex -o ifAdminStatus -o ifOperStatus ifOperStatus == 2
          +monitor CumulusLinkUP -S -r 15 -o ifName -o ifIndex -o ifAdminStatus -o ifOperStatus ifOperStatus != 2
          pass -p 10 1.3.6.1.2.1.1.1 /usr/share/snmp/sysDescr_pass.py
          pass_persist 1.2.840.10006.300.43 /usr/share/snmp/ieee8023_lag_pp.py
          pass_persist 1.3.6.1.2.1.17 /usr/share/snmp/bridge_pp.py
          pass_persist 1.3.6.1.2.1.31.1.1.1.18 /usr/share/snmp/snmpifAlias_pp.py
          pass_persist 1.3.6.1.2.1.47 /usr/share/snmp/entity_pp.py
          pass_persist 1.3.6.1.2.1.99 /usr/share/snmp/entity_sensor_pp.py
          pass_persist 1.3.6.1.4.1.40310.1 /usr/share/snmp/resq_pp.py
          pass_persist 1.3.6.1.4.1.40310.2 /usr/share/snmp/cl_drop_cntrs_pp.py
          pass_persist 1.3.6.1.4.1.40310.3 /usr/share/snmp/cl_poe_pp.py
          +rocommunity neteng default
          +rocommunity tempPassword default
          rouser _snmptrapusernameX
          +syslocation leaf01
          sysobjectid 1.3.6.1.4.1.40310
          sysservices 72
          +trap2sink 1.1.1.1 mypassword
          

          Configure SNMP Traps

          SNMP traps are alert notification messages from SNMP agents to the SNMP manager. An agent generates these messages whenever a failure or fault occurs in a monitored device or service. An SNMPv3 inform is an acknowledged SNMPv3 trap.

          You configure the following for SNMPv3 trap and inform messages:

          Generate Event Notification Traps

          The Net-SNMP agent provides a method to generate SNMP trap events using the Distributed Management (DisMan) Event MIB for various system events, including:

          To enable specific types of traps, create the following configurations in /etc/snmp/snmpd.conf.

          Define Access Credentials

          Although the traps are sent to an SNMPv2c receiver, the SNMPv3 username is still required to authorize the DisMan service. Starting with Net-SNMP 5.3, snmptrapd no longer accepts all traps by default. You must configure snmptrapd with authorized SNMPv1 and SNMPv2c community strings, SNMPv3 users, or both. Non-authorized traps and informs are dropped.

          Follow the steps in Configure SNMP to define the username. You can refer to the snmptrapd.conf(5) manual page for more information.

          If not already on the system, install the snmptrapd Debian package with the sudo apt-get install snmptrapd command before you configure the username.

          Define Trap Receivers

          The following configuration defines the trap receiver IP address for SNMPv1 and SNMPv2c traps. For SNMP versions 1 and 2c, you must set at least one SNMP trap destination IP address; multiple destinations can exist. Removing all settings disables SNMP traps. The default version is 2c. You must include a VRF name with the IP address to force traps to send in a non-default VRF table.

          cumulus@switch:~$ nv set system snmp-server trap-destination localhost vrf rocket community-password mymanagementvrfpassword version 1
          cumulus@switch:~$ nv set system snmp-server trap-destination localhost-v6 community-password mynotsosecretpassword version 2c
          cumulus@switch:~$ nv config apply
          

          To define the IP address of the notification (or trap) receiver for either SNMPv1 traps or SNMPv2c traps, use the trapsink (SNMPv1) or trap2sink (SNMPv2c) directive. Specifying more than one sink directive generates multiple copies of each notification (in the appropriate formats). You must configure a trap server to receive and decode these trap messages (for example, snmptrapd). You can configure the address of the trap receiver with a different protocol and port, but this is most often left out. By default, traps use UDP and the well-known port 162.

          Edit the /etc/snmp/snmpd.conf file and configure the trap settings.

          cumulus@switch:~$ sudo nano /etc/snmp/snmpd.conf
          ...
          trap2sink [::1] mynotsosecretpassword
          trapsink 127.0.0.1@rocket mymanagementvrfpassword
          ...
          

          Restart the snmpd service to apply the changes.

          cumulus@switch:~$ sudo systemctl restart snmpd.service
          

          SNMPv3 Trap and Inform Messages

          The SNMP trap receiving daemon must have usernames, authentication passwords, and encryption passwords created with its own EngineID. You must configure this trap server EngineID in the switch snmpd daemon sending the trap and inform messages.

          cumulus@switch:~$ nv set system snmp-server trap-destination localhost username myv3user auth-md5 md5password1 encrypt-aes myaessecret engine-id  0x80001f888070939b14a514da5a00000000 inform
          cumulus@switch:~$ nv set system snmp-server trap-destination localhost vrf mgmt username mymgmtvrfusername auth-md5 md5password2 encrypt-aes myaessecret2 engine-id  0x80001f888070939b14a514da5a00000000 inform
          cumulus@switch:~$ nv config apply
          

          You can configure SNMPv3 trap and inform messages with the trapsess configuration command. Inform messages are traps that the receiving trap daemon acknowledges. You configure inform messages with the -Ci parameter. You must specify the EngineID of the receiving trap server with the -e field.

          cumulus@switch:~$ sudo nano /etc/snmp/snmpd.conf
          ...
          trapsess -Ci -e 0x80001f888070939b14a514da5a00000000 -v3 -l authPriv -u mymgmtvrfusername -a MD5 -A md5password2 -x AES -X myaessecret2 127.0.0.1@mgmt
          trapsess -Ci -e 0x80001f888070939b14a514da5a00000000 -v3 -l authPriv -u myv3user -a MD5 -A md5password1 -x AES -X myaessecret 127.0.0.1
          ...
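
          The EngineID in these examples follows the SnmpEngineID layout from RFC 3411: four octets that carry the enterprise number (with the high bit set), then one octet that identifies the format of the remaining data. The following Python sketch, with a hypothetical function name, splits an EngineID into those fields so that you can sanity check the value you copy from the trap server.

```python
def parse_engine_id(text: str) -> dict:
    """Split an SNMPv3 EngineID (hex string, optional 0x prefix) into its header fields."""
    raw = bytes.fromhex(text[2:] if text.lower().startswith("0x") else text)
    if not 5 <= len(raw) <= 32:
        raise ValueError("EngineID must be 5 to 32 octets long")
    header = int.from_bytes(raw[:4], "big")
    rfc3411_format = bool(header & 0x80000000)
    return {
        "length": len(raw),
        "rfc3411_format": rfc3411_format,
        "enterprise": header & 0x7FFFFFFF,
        # The fifth octet identifies the layout only when the high bit is set
        "format": raw[4] if rfc3411_format else None,
        "data": raw[5:].hex() if rfc3411_format else raw[4:].hex(),
    }

info = parse_engine_id("0x80001f888070939b14a514da5a00000000")
print(info["enterprise"])  # 8072, the enterprise number that Net-SNMP uses
print(info["format"])      # 128, an enterprise-specific format
print(info["data"])
```

          A value that fails to parse, or that has an odd number of hex digits, is a sign that the EngineID was truncated when you copied it.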
          

          You can define multiple trap receivers and use the domain name instead of an IP address in the trap2sink directive.

          Restart the snmpd service to apply the changes:

          cumulus@switch:~$ sudo systemctl restart snmpd.service
          

          Source Traps from a Different Source IP Address

          When you run client SNMP programs (such as snmpget, snmpwalk, or snmptrap) from the command line, or when you configure snmpd to send a trap (based on snmpd.conf), you can configure a clientaddr in snmpd.conf that allows the SNMP client programs or snmpd (for traps) to source requests from a different source IP address.

          For more information about clientaddr, see the snmpd.conf man page.

          snmptrap, snmpget, snmpwalk and snmpd itself must be able to bind to this address.

          Edit the /etc/snmp/snmpd.conf file and add the clientaddr option. In the following example, spine01 is the client (IP address 192.168.200.21).

          cumulus@switch:~$ sudo nano /etc/snmp/snmpd.conf
          ...
          trapsess -Ci --clientaddr=192.168.200.21 -v 2c
          ...
          

          Restart the snmpd service to apply the changes.

          cumulus@switch:~$ sudo systemctl restart snmpd.service
          

          NVUE does not provide commands for this configuration.

          Monitor Fans, Power Supplies, Temperature and Transformers

          An SNMP agent (snmpd) waits for incoming SNMP requests and responds to them. If the agent does not receive any requests, it does not start any actions. However, various commands can configure snmpd to send traps according to preconfigured settings (load, file, proc, disk, or swap commands), or customized monitor directives.

          See the snmpd.conf man page for details on the monitor directive.

          You can configure snmpd to monitor the operational status of either the Entity MIB or Entity-Sensor MIB by adding the monitor directive to the snmpd.conf file. After you know the OID, you can determine the operational status, which can be a value of ok(1), unavailable(2) or nonoperational(3). Add a configuration like the following example to /etc/snmp/snmpd.conf and adjust the values:

          cumulus@switch:~$ sudo nano /etc/snmp/snmpd.conf
          ...
          # without installing extra MIBs we can check the Fan1 status
          # if the Fan1 index is 100011001, monitor this specific OID (-I) every 10 seconds (-r), and define additional information to include in the trap (-o).
          monitor -I -r 10  -o 1.3.6.1.2.1.47.1.1.1.1.7.100011001 "Fan1 Not OK"  1.3.6.1.2.1.99.1.1.1.5.100011001 > 1
          # Any Entity Status non OK (greater than 1)
          monitor  -r 10  -o 1.3.6.1.2.1.47.1.1.1.1.7  "Sensor Status Failure"  1.3.6.1.2.1.99.1.1.1.5 > 1

          If you install the MIBs, you can use object names instead of numeric OIDs:
          
          cumulus@switch:~$ sudo nano /etc/snmp/snmpd.conf
          ...
          # for a specific fan called Fan1 with an index 100011001
          monitor -I -r 10  -o entPhysicalName.100011001 "Fan1 Not OK"  entPhySensorOperStatus.100011001 > 1
          # for any Entity Status not OK ( greater than 1)
          monitor  -r 10  -o entPhysicalName  "Sensor Status Failure"  entPhySensorOperStatus > 1
          

          You can find the entPhySensorOperStatus integer by walking the entPhysicalName table.

          To get all sensor information, run snmpwalk on the entPhysicalName table. For example:

          cumulus@leaf01:~$ snmpwalk -v 2c -cpublic localhost .1.3.6.1.2.1.47.1.1.1.1.7
          iso.3.6.1.2.1.47.1.1.1.1.7.100000001 = STRING: "PSU1Temp1"
          iso.3.6.1.2.1.47.1.1.1.1.7.100000002 = STRING: "PSU2Temp1"
          iso.3.6.1.2.1.47.1.1.1.1.7.100000003 = STRING: "Temp1"
          iso.3.6.1.2.1.47.1.1.1.1.7.100000004 = STRING: "Temp2"
          iso.3.6.1.2.1.47.1.1.1.1.7.100000005 = STRING: "Temp3"
          iso.3.6.1.2.1.47.1.1.1.1.7.100000006 = STRING: "Temp4"
          iso.3.6.1.2.1.47.1.1.1.1.7.100000007 = STRING: "Temp5"
          iso.3.6.1.2.1.47.1.1.1.1.7.100011001 = STRING: "Fan1"
          iso.3.6.1.2.1.47.1.1.1.1.7.100011002 = STRING: "Fan2"
          iso.3.6.1.2.1.47.1.1.1.1.7.100011003 = STRING: "Fan3"
          iso.3.6.1.2.1.47.1.1.1.1.7.100011004 = STRING: "Fan4"
          iso.3.6.1.2.1.47.1.1.1.1.7.100011005 = STRING: "Fan5"
          iso.3.6.1.2.1.47.1.1.1.1.7.100011006 = STRING: "Fan6"
          iso.3.6.1.2.1.47.1.1.1.1.7.100011007 = STRING: "PSU1Fan1"
          iso.3.6.1.2.1.47.1.1.1.1.7.100011008 = STRING: "PSU2Fan1"
          iso.3.6.1.2.1.47.1.1.1.1.7.110000001 = STRING: "PSU1"
          iso.3.6.1.2.1.47.1.1.1.1.7.110000002 = STRING: "PSU2"
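
          Looking up the index by hand becomes tedious when there are many sensors. The following Python sketch parses snmpwalk output in the format shown above, maps each name to its index, and builds a numeric-OID monitor directive in the same shape as the Fan1 example. The function names are hypothetical, and the sketch assumes that every line uses the STRING output format.

```python
import re

# Captures the last sub-identifier (the index) and the quoted sensor name
WALK_LINE = re.compile(r'^\S*\.(\d+) = STRING: "(.*)"$')

def sensor_indexes(walk_output: str) -> dict:
    """Map each entPhysicalName string to its table index."""
    indexes = {}
    for line in walk_output.splitlines():
        match = WALK_LINE.match(line.strip())
        if match:
            indexes[match.group(2)] = match.group(1)
    return indexes

def monitor_line(name: str, index: str, interval: int = 10) -> str:
    """Build a monitor directive for one sensor, using numeric OIDs."""
    return (f'monitor -I -r {interval} -o 1.3.6.1.2.1.47.1.1.1.1.7.{index} '
            f'"{name} Not OK" 1.3.6.1.2.1.99.1.1.1.5.{index} > 1')

sample = '''
iso.3.6.1.2.1.47.1.1.1.1.7.100011001 = STRING: "Fan1"
iso.3.6.1.2.1.47.1.1.1.1.7.100011002 = STRING: "Fan2"
'''
indexes = sensor_indexes(sample)
print(monitor_line("Fan2", indexes["Fan2"]))
```

          Review the generated line before you add it to /etc/snmp/snmpd.conf.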
          

          Restart the snmpd service to apply the changes.

          cumulus@switch:~$ sudo systemctl restart snmpd.service
          

          Cumulus Linux no longer uses the LM-SENSORS MIB to monitor temperature.

          You can configure the switch to trigger link up and link down notifications when the operational status of the link changes.

          The following example commands enable the Disman Event MIB (.1.3.6.1.2.1.88.2.0.1) to check the ifTable every 15 seconds for network interfaces that come up and every 10 seconds for interfaces that go down, and to trigger the CumulusLinkUP and CumulusLinkDOWN named notifications.

          The default check frequency is 60 seconds, with a minimum of 5 and a maximum of 300 seconds.

          These notifications include the following information.

          • ifName
          • ifIndex
          • ifAdminStatus
          • ifOperStatus

          cumulus@switch:~$ nv set system snmp-server trap-link-down check-frequency 10
          cumulus@switch:~$ nv set system snmp-server trap-link-up check-frequency 15
          cumulus@switch:~$ nv config apply
          

          Edit the /etc/snmp/snmpd.conf file and configure the trap settings.

          The following example configuration enables the Disman Event MIB (.1.3.6.1.2.1.88.2.0.1) to check the ifTable every 15 seconds for network interfaces that come up and every 10 seconds for interfaces that go down, and to trigger the CumulusLinkUP and CumulusLinkDOWN named notifications.

          These notifications include the following information.

          • ifName
          • ifIndex
          • ifAdminStatus
          • ifOperStatus

          cumulus@switch:~$ sudo nano /etc/snmp/snmpd.conf
          ...
          monitor CumulusLinkDOWN -S -r 10 -o ifName -o ifIndex -o ifAdminStatus -o ifOperStatus ifOperStatus == 2
          monitor CumulusLinkUP -S -r 15 -o ifName -o ifIndex -o ifAdminStatus -o ifOperStatus ifOperStatus != 2
          

          The following example adds linkUpTrap and linkDownTrap traps as defined in RFC 3418:

          cumulus@switch:~$ sudo nano /etc/snmp/snmpd.conf
          ...
          linkUpDownNotifications yes
          
          notificationEvent  linkUpTrap    linkUp   ifIndex ifAdminStatus ifOperStatus
          notificationEvent  linkDownTrap  linkDown ifIndex ifAdminStatus ifOperStatus
          monitor  -r 15 -e linkUpTrap   "Generate linkUp" ifOperStatus != 2
          monitor  -r 10 -e linkDownTrap "Generate linkDown" ifOperStatus == 2
          ...
          

          Restart the snmpd service to apply the changes.

          cumulus@switch:~$ sudo systemctl restart snmpd.service
          

          For more information or additional options, refer to the snmpd.conf man page.

          Configure Free Memory Notifications

          You can monitor free memory and configure the switch to generate a trap when free memory drops below a certain size.

          The following example generates a trap when free memory drops below 1,000,000 KB. The free memory trap also includes the amount of total real memory:

          cumulus@switch:~$ sudo nano /etc/snmp/snmpd.conf
          ...
          monitor MemFreeTotal -o memTotalReal memTotalFree <  1000000
          ...
          

          Restart the snmpd service to apply the changes.

          cumulus@switch:~$ sudo systemctl restart snmpd.service
          

          NVUE does not provide commands to configure free memory notifications.

          Configure Processor Load Notifications

          To generate a trap when the CPU load average exceeds a certain threshold, run the following commands. You can only use integers or floating point numbers.

          The following example generates a trap when the 1 minute interval reaches 12%, the 5 minute interval reaches 10%, or the 15 minute interval reaches 5%.

          cumulus@switch:~$ nv set system snmp-server trap-cpu-load-average one-minute 12 five-minute 10 fifteen-minute 5
          cumulus@switch:~$ nv config apply
          

          Edit the /etc/snmp/snmpd.conf file and configure the CPU load settings. To monitor CPU load for 1, 5, or 15 minute intervals, use the load directive with the monitor directive.

          cumulus@switch:~$ sudo nano /etc/snmp/snmpd.conf
          ...
          load 12 10 5
          ...
          
          

          Restart the snmpd service to apply the changes.

          cumulus@switch:~$ sudo systemctl restart snmpd.service
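To check the current load averages against the same three thresholds by hand, you can compare them with the values in /proc/loadavg. This is a minimal sketch under stated assumptions: the function name is illustrative, and it treats a load average that reaches the threshold as a breach.

```shell
#!/bin/sh
# Sketch: read one /proc/loadavg-style line on stdin and compare the
# 1, 5, and 15 minute averages against the three thresholds passed as
# arguments (same order as the load directive).
# Assumption: reaching a threshold counts as exceeding it.
load_exceeded() {
    awk -v m1="$1" -v m5="$2" -v m15="$3" '
        {
            hit = ""
            if ($1 + 0 >= m1 + 0)  hit = hit " 1min"
            if ($2 + 0 >= m5 + 0)  hit = hit " 5min"
            if ($3 + 0 >= m15 + 0) hit = hit " 15min"
            if (hit == "") print "OK"
            else           print "EXCEEDED:" hit
        }'
}

# On a switch you might run:
#   load_exceeded 12 10 5 < /proc/loadavg
```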
          

          Configure Disk Utilization Notifications

          To monitor disk utilization for all disks, use the includeAllDisks directive together with the monitor directive. The example code below generates a trap when a disk is 99% full:

          cumulus@switch:~$ sudo nano /etc/snmp/snmpd.conf
          ...
          includeAllDisks 1%
          monitor -r 60 -o dskPath -o dskErrorMsg "dskTable" dskErrorFlag != 0
          ...
          

          Restart the snmpd service to apply the changes.

          cumulus@switch:~$ sudo systemctl restart snmpd.service
          

          NVUE does not provide commands to configure disk utilization notifications.
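To see which file systems would currently trip a minimum free space setting such as 1%, you can apply the same percentage to df output. This is a rough sketch, not the snmpd implementation: the function name is illustrative, and it computes free space as Available / (Used + Available) from POSIX-format df output.

```shell
#!/bin/sh
# Sketch: read `df -P` output on stdin and print the mount point of every
# file system whose free space is below the given minimum percentage.
# Assumption: free percentage = Available / (Used + Available).
disks_below_min_free() {
    min_free_pct="$1"
    awk -v min="$min_free_pct" '
        NR > 1 && ($3 + $4) > 0 {
            free_pct = 100 * $4 / ($3 + $4)
            if (free_pct < min) print $6
        }'
}

# On a switch you might run:
#   df -P | disks_below_min_free 1
```

Mount points that contain spaces are not handled; the sketch prints only the first word of the path.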

          Configure Authentication Notifications

          To generate SNMP trap notifications for every SNMP authentication failure, run the following commands.

          cumulus@switch:~$ nv set system snmp-server trap-snmp-auth-failures
          cumulus@switch:~$ nv config apply
          

          In the /etc/snmp/snmpd.conf file, add the authtrapenable directive:

          cumulus@switch:~$ sudo nano /etc/snmp/snmpd.conf
          ...
          authtrapenable 1
          ...
          

          Restart the snmpd service to apply the changes.

          cumulus@switch:~$ sudo systemctl restart snmpd.service
          

          Monitor UCD-SNMP-MIB Tables

          To configure the Event MIB tables to monitor the UCD-SNMP-MIB tables for problems (the xxErrFlag column objects) and send a trap, add defaultMonitors yes to the snmpd.conf file and provide a configuration. You must first install the snmp-mibs-downloader Debian package and make sure that the mibs : line in the /etc/snmp/snmp.conf file is commented out (see Enable MIB to OID Translation). Then add a configuration like the following example:

          cumulus@switch:~$ sudo nano /etc/snmp/snmpd.conf
          ...
          defaultMonitors yes
          
          monitor   -o prNames -o prErrMessage "process table" prErrorFlag != 0
          monitor   -o memErrorName -o memSwapErrorMsg "memory" memSwapError != 0
          monitor   -o extNames -o extOutput "extTable" extResult != 0
          monitor   -o dskPath -o dskErrorMsg "dskTable" dskErrorFlag != 0
          monitor   -o laNames -o laErrMessage  "laTable" laErrorFlag != 0
          monitor   -o fileName -o fileErrorMsg  "fileTable" fileErrorFlag != 0
          ...
          

          Restart the snmpd service to apply the changes.

          cumulus@switch:~$ sudo systemctl restart snmpd.service
          

          Enable MIB to OID Translation

          You can use MIB names instead of OIDs, which greatly improves the readability of the snmpd.conf file. You enable this by installing the snmp-mibs-downloader, which downloads SNMP MIBs to the switch before enabling traps.

          1. Open /etc/apt/sources.list in a text editor, add the non-free repository, then save the file:

            cumulus@switch:~$ sudo nano /etc/apt/sources.list
            ...
            deb  http://deb.debian.org/debian bookworm main non-free
            ...
            
          2. Update the switch:

            cumulus@switch:~$ sudo -E apt-get update
            
          3. Install the snmp-mibs-downloader:

            cumulus@switch:~$ sudo -E apt-get install snmp-mibs-downloader
            
          4. Open the /etc/snmp/snmp.conf file and verify that the mibs : line is commented out:

            #
            # As the snmp packages come without MIB files due to license reasons, loading
            # of MIBs is disabled by default. If you added the MIBs you can reenable
            # loading them by commenting out the following line.
            #mibs :
            
          5. Open the /etc/default/snmpd file and verify that the export MIBS= line is commented out:

            # This file controls the activity of snmpd and snmptrapd
            
            # Don't load any MIBs by default.
            # You might comment this lines after you have the MIBs Downloaded.
            #export MIBS=
            
          6. After you confirm the configuration, remove or comment out the non-free repository in /etc/apt/sources.list.

            #deb  http://deb.debian.org/debian bookworm main non-free
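The checks in steps 4 and 5 can also be scripted. The following is a minimal sketch; the helper name is hypothetical, and it only tests whether an uncommented line matching the directive exists in the file.

```shell
#!/bin/sh
# Sketch: succeed when FILE contains no active (uncommented) line that
# starts with the given directive pattern.
# $1 = extended regular expression for the directive, $2 = file to inspect.
# Note: a missing file also counts as "no active line".
no_active_line() {
    ! grep -Eq "^[[:space:]]*$1" "$2"
}

# On a switch you might run:
#   no_active_line 'mibs[[:space:]]*:' /etc/snmp/snmp.conf && echo "mibs line is commented out"
#   no_active_line 'export[[:space:]]+MIBS=' /etc/default/snmpd && echo "MIBS line is commented out"
```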
            

          Configure Incoming SNMP Traps

          The Net-SNMP trap daemon (snmptrapd) receives SNMP traps. You configure how the daemon processes incoming traps in the /etc/snmp/snmptrapd.conf file. With Net-SNMP release 5.3 and later, you must specify who can send traps and informs to the notification receiver, and which types of processing those notifications can trigger. You can specify three processing types: log, execute, and net.

          Typically, you configure all three (log,execute,net) to cover any style of processing for a particular category of notification. You can also limit certain notification sources to specific processing types.

          authCommunity TYPES COMMUNITY [SOURCE [OID | -v VIEW ]] authorizes traps and SNMPv2c INFORM requests with the community you specify to trigger the types of processing you list. By default, any notification that uses this community is processed. Use the SOURCE field to apply the configuration only to notifications from particular sources. For more information about specific configuration options within the file, run man 5 snmptrapd.conf to see the snmptrapd.conf(5) man page.

          If not already on the system, install the snmptrapd Debian package before you configure incoming traps:

          cumulus@switch:~$ sudo apt-get install snmptrapd
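As an illustration of the authCommunity syntax described above, the following /etc/snmp/snmptrapd.conf fragment is a sketch only. The community strings and the source subnet are placeholders, not values from this guide.

```text
# /etc/snmp/snmptrapd.conf (sketch; community names and subnet are placeholders)

# Allow all three processing types for notifications that use this community
authCommunity log,execute,net mytrapcommunity

# Only log notifications that use this community and come from 192.0.2.0/24
authCommunity log labcommunity 192.0.2.0/24
```

The second line shows how the optional SOURCE field limits a community to a particular set of senders and to a single processing type.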
          

          Supported MIBs

          The following table lists the relevant Cumulus Linux network monitoring MIBs:

          MIB Name  Suggested Uses
          BGP4-MIB, OSPFv2-MIB, OSPFv3-MIB, RIPv2-MIB You can enable FRR SNMP support to provide support for OSPF-MIB (RFC-1850), OSPFV3-MIB (RFC-5643), and BGP4-MIB (RFC-1657).
          CUMULUS-BGPVRF-MIB Provides monitoring for all BGP peer types (unnumbered, IPv4, and IPv6) in all VRFs. /usr/share/snmp/mibs/Cumulus-BGPVRF-MIB.txt defines this MIB.
          CUMULUS-COUNTERS-MIB Discard counters and interface counters. /usr/share/snmp/mibs/Cumulus-Counters-MIB.txt defines this MIB, which has the OID .1.3.6.1.4.1.40310.2.
          CUMULUS-RESOURCE-QUERY-MIB Cumulus Linux includes its own resource utilization MIB, which is similar to using cl-resource-query. This MIB monitors layer 3 entries by host, route (such as the total number of IPv4 routes in the FIB), nexthops, ECMP groups, and layer 2 MAC and BPDU entries. /usr/share/snmp/mibs/Cumulus-Resource-Query-MIB.txt defines this MIB, which has the OID .1.3.6.1.4.1.40310.1.
          CUMULUS-SNMP-MIB SNMP counters. For information on exposing CPU and memory information with SNMP, see this knowledge base article.
          DISMAN-EVENT-MIB Trap monitoring.
          ENTITY-MIB Cumulus Linux supports the temperature sensors, fan sensors, power sensors, and ports from RFC 4133.

          Note: The ENTITY-MIB does not show the chassis information in Cumulus Linux.
          ENTITY-SENSOR-MIB Physical sensor information (temperature, fan, and power supply) from RFC 3433.
          HOST-RESOURCES-MIB Users, storage, interfaces, process info, run parameters.
          BRIDGE-MIB, Q-BRIDGE-MIB The dot1dBasePortEntry and dot1dBasePortIfIndex tables in the BRIDGE-MIB, and the dot1qBase, dot1qFdbEntry, dot1qTpFdbEntry, dot1qTpFdbStatus, and dot1qVlanStaticName tables in the Q-BRIDGE-MIB. You must uncomment the bridge_pp.py pass_persist script in /etc/snmp/snmpd.conf.
          IEEE8023-LAG-MIB Implementation of the IEEE 8023-LAG-MIB includes the dot3adAggTable and dot3adAggPortListTable tables. To enable this, edit /etc/snmp/snmpd.conf and uncomment or add the following lines:
          view systemonly included .1.2.840.10006.300.43
          pass_persist .1.2.840.10006.300.43 /usr/share/snmp/ieee8023_lag_pp.py
          IF-MIB Interface description, type, MTU, speed, MAC, admin, operation status, counters.

          Note: Cumulus Linux disables the IF-MIB cache by default. The non-caching code path in the IF-MIB treats 64-bit counters like 32-bit counters (a 64-bit counter rolls over after the value increments to a value that extends beyond 32 bits). To enable the counter to reflect traffic statistics using 64-bit counters, remove the -y option from the SNMPDOPTS line in the /etc/default/snmpd file. The example below first shows the original line, commented out, then the modified line without the -y option:
          cumulus@switch:~$ cat /etc/default/snmpd
          # SNMPDOPTS='-y -LS 0-4 d -Lf /dev/null -u snmp -g snmp -I -smux -p /run/snmpd.pid'
          SNMPDOPTS='-LS 0-4 d -Lf /dev/null -u snmp -g snmp -I -smux -p /run/snmpd.pid'
          IP-FORWARD-MIB IP routing table.
          IP-MIB (includes ICMP) IPv4, IPv4 addresses counters, netmasks.
          IPv6-MIB IPv6 counters.
          LLDP-MIB Layer 2 neighbor information from lldpd (you need to enable the SNMP subagent in LLDP). You need to start lldpd with the -x option to enable connectivity to snmpd(AgentX).
          LM-SENSORS-MIB Fan speed, temperature sensor values, voltages. The ENTITY-SENSOR-MIB replaces this MIB.
          NET-SNMP-AGENT-MIB Agent timers, user, group config.
          NET-SNMP-VACM-MIB Agent timers, user, group config.
          NOTIFICATION-LOG-MIB Local logging.
          SNMP-FRAMEWORK-MIB Users, access.
          SNMP-MPD-MIB Users, access.
          SNMP-TARGET-MIB SNMP-TARGET-MIB.
          SNMP-USER-BASED-SM-MIB Users, access.
          SNMP-VIEW-BASED-ACM-MIB Users, access.
          TCP-MIB TCP-related information.
          UCD-SNMP-MIB System memory, load, CPU, disk IO.
          UDP-MIB UDP-related information.

          List All Installed MIBs

          Due to licensing restrictions, Cumulus Linux does not install all MIBs. For the MIBs that Cumulus Linux does not install, you must add the “non-free” archive to /etc/apt/sources.list. To see which MIBs are on your switch, run ls /usr/share/snmp/mibs/.

          To install more MIBs, install snmp-mibs-downloader, then either remove or comment out the “non-free” repository in /etc/apt/sources.list. Refer to Enable MIB to OID Translation.

          Installed MIBs
          cumulus@switch:~$ ls /usr/share/snmp/mibs/
          AGENTX-MIB.txt                       IP-MIB.txt                        SNMP-MPD-MIB.txt
          BRIDGE-MIB.txt                       IPV6-FLOW-LABEL-MIB.txt           SNMP-NOTIFICATION-MIB.txt
          Cumulus-BGPVRF-MIB.txt               IPV6-ICMP-MIB.txt                 SNMP-PROXY-MIB.txt
          Cumulus-Counters-MIB.txt             IPV6-MIB.txt                      SNMP-TARGET-MIB.txt
          Cumulus-POE-MIB.txt                  IPV6-TCP-MIB.txt                  SNMP-TLS-TM-MIB.txt
          Cumulus-Resource-Query-MIB.txt       IPV6-TC.txt                       SNMP-TSM-MIB.txt
          Cumulus-Sensor-MIB.txt               IPV6-UDP-MIB.txt                  SNMP-USER-BASED-SM-MIB.txt
          Cumulus-Snmp-MIB.txt                 LM-SENSORS-MIB.txt                SNMP-USM-AES-MIB.txt
          Cumulus-Status-MIB.txt               MTA-MIB.txt                       SNMP-USM-DH-OBJECTS-MIB.txt
          DISMAN-EVENT-MIB.txt                 NET-SNMP-AGENT-MIB.txt            SNMP-USM-HMAC-SHA2-MIB.txt
          DISMAN-EXPRESSION-MIB.txt            NET-SNMP-EXAMPLES-MIB.txt         SNMPv2-CONF.txt
          DISMAN-NSLOOKUP-MIB.txt              NET-SNMP-EXTEND-MIB.txt           SNMPv2-MIB.txt
          DISMAN-PING-MIB.txt                  NET-SNMP-MIB.txt                  SNMPv2-SMI.txt
          DISMAN-SCHEDULE-MIB.txt              NET-SNMP-MONITOR-MIB.txt          SNMPv2-TC.txt
          DISMAN-SCRIPT-MIB.txt                NET-SNMP-PASS-MIB.txt             SNMPv2-TM.txt
          DISMAN-TRACEROUTE-MIB.txt            NET-SNMP-PERIODIC-NOTIFY-MIB.txt  SNMP-VIEW-BASED-ACM-MIB.txt
          EtherLike-MIB.txt                    NET-SNMP-SYSTEM-MIB.txt           TCP-MIB.txt
          GNOME-SMI.txt                        NET-SNMP-TC.txt                   TRANSPORT-ADDRESS-MIB.txt
          HCNUM-TC.txt                         NET-SNMP-VACM-MIB.txt             TUNNEL-MIB.txt
          HOST-RESOURCES-MIB.txt               NETWORK-SERVICES-MIB.txt          UCD-DEMO-MIB.txt
          HOST-RESOURCES-TYPES.txt             NOTIFICATION-LOG-MIB.txt          UCD-DISKIO-MIB.txt
          IANA-ADDRESS-FAMILY-NUMBERS-MIB.txt  RFC1155-SMI.txt                   UCD-DLMOD-MIB.txt
          IANAifType-MIB.txt                   RFC1213-MIB.txt                   UCD-IPFILTER-MIB.txt
          IANA-LANGUAGE-MIB.txt                RFC-1215.txt                      UCD-IPFWACC-MIB.txt
          IANA-RTPROTO-MIB.txt                 RMON-MIB.txt                      UCD-SNMP-MIB-OLD.txt
          IF-INVERTED-STACK-MIB.txt            SCTP-MIB.txt                      UCD-SNMP-MIB.txt
          IF-MIB.txt                           SMUX-MIB.txt                      UDP-MIB.txt
          INET-ADDRESS-MIB.txt                 SNMP-COMMUNITY-MIB.txt
          IP-FORWARD-MIB.txt                   SNMP-FRAMEWORK-MIB.txt
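To check for one specific MIB instead of reading the whole listing, you can test for its file directly. This is a minimal sketch; the function name is illustrative, and it assumes that each MIB is stored as NAME.txt in the directory shown above.

```shell
#!/bin/sh
# Sketch: succeed when the named MIB file exists.
# $1 = MIB name without the .txt suffix
# $2 = directory to search (optional; defaults to /usr/share/snmp/mibs)
mib_installed() {
    dir="${2:-/usr/share/snmp/mibs}"
    [ -f "$dir/$1.txt" ]
}

# On a switch you might run:
#   mib_installed IF-MIB && echo "IF-MIB is installed"
#   mib_installed LLDP-MIB || echo "LLDP-MIB is not installed"
```

File names are case sensitive, so use the name exactly as it appears in the listing (for example, Cumulus-Counters-MIB).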
          

          Considerations

          The snmpd service might cache SNMP MIB object values for performance reasons and update the values periodically. When you poll SNMP objects, the values returned might not reflect real-time status changes for some period of time.

          Troubleshoot SNMP

          Use the following commands to troubleshoot potential SNMP issues.

          To show a summary of the SNMP configuration settings on the switch:

          cumulus@switch:~$ nv show service snmp-server
                               applied         description
          -------------------  --------------  ---------------------------------------------------------------------
          enable               on              Turn the feature 'on' or 'off'.  This feature is disabled by default.
          [listening-address]  localhost       Collection of listening addresses
          trap-link-down
            check-frequency    60              Link up or link down checking frequency in seconds
          trap-link-up
            check-frequency    60              Link up or link down checking frequency in seconds
          [username]           testusernoauth  Usernames
          [username]           user1
          [username]           user2
          [username]           user3
          [username]           user666
          [username]           user999
          

          To show a summary of the SNMP configuration settings in JSON format, run the nv show service snmp-server --output json --applied command.

          To show the SNMP trap CPU load average, run the nv show service snmp-server trap-cpu-load-average command.

          To show SNMP trap authentication failures, run the nv show service snmp-server trap-snmp-auth-failures command.

          To see all the show commands for SNMP troubleshooting, run nv show service snmp-server and press the Tab key:

          cumulus@switch:~$ nv show service snmp-server  <<press Tab>>
          listening-address        readonly-community-v6    trap-link-down           username
          mibs                     trap-cpu-load-average    trap-link-up             viewname
          readonly-community       trap-destination         trap-snmp-auth-failures  
          

          Single User Mode - Password Recovery

          Use single user mode to assist in troubleshooting system boot issues or for password recovery.

          To enter single user mode:

          1. Boot the switch, then as soon as you see the GRUB menu, use the arrow keys to select Advanced options for Cumulus Linux GNU/Linux.

            Before the GRUB menu appears, the switch goes through the boot cycle. Do not interrupt this autoboot process when you see the following lines; wait until you see the GRUB menu.

            ...
            USB0:  Bringing USB2 host out of reset...
            Net:   eth-0
            SF:    MX25L6405D with page size 4 KiB, total 8 MiB
            Hit any key to stop autoboot:  2
            

                         GNU GRUB  version 2.02+dfsg1-20
            
            +----------------------------------------------------------------------------+
            |*Cumulus Linux GNU/Linux                                                    |
            | Advanced options for Cumulus Linux GNU/Linux                               |
            | ONIE                                                                       |
            |                                                                            |
            +----------------------------------------------------------------------------+
            
          2. Select Cumulus Linux GNU/Linux, with Linux 4.19.0-cl-1-amd64 (recovery mode).

                         GNU GRUB  version 2.02+dfsg1-20
            
            +----------------------------------------------------------------------------+
            | Cumulus Linux GNU/Linux, with Linux 4.19.0-cl-1-amd64                       |
            |*Cumulus Linux GNU/Linux, with Linux 4.19.0-cl-1-amd64 (recovery mode)       |
            |                                                                            |
            +----------------------------------------------------------------------------+  
            
          3. After the system reboots, set a new root password. The root user provides complete control over the switch.

            root@switch:~# passwd
            Enter new UNIX password:
            Retype new UNIX password:
            passwd: password updated successfully
            

            You can take this opportunity to reset the password for the cumulus account.

            root@switch:~# passwd cumulus
            Enter new UNIX password:
            Retype new UNIX password:
            passwd: password updated successfully
            

            In Cumulus Linux 5.9 and later, user passwords must include at least one lowercase character, one uppercase character, one digit, one special character, and cannot be usernames. In addition, passwords must be a minimum of eight characters long, expire in 365 days, and provide a warning 15 days before expiration. For more information about the password security policy, refer to Password Security.

          4. Sync the /etc directory, then reboot the system:

            root@switch:~# sync
            root@switch:~# reboot -f
            Restarting the system.
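The password rules listed in step 3 can be pre-checked before you type a new password at the passwd prompt. The following is a minimal sketch, not the mechanism the switch uses: the function name is illustrative, and it interprets "cannot be usernames" as "must not equal the account name". The policy enforced on the switch remains authoritative.

```shell
#!/bin/sh
# Sketch: succeed when the candidate password satisfies the complexity rules
# described in step 3 (length, character classes, not equal to the username).
# $1 = username, $2 = candidate password
password_meets_policy() {
    user="$1"
    pw="$2"
    [ "${#pw}" -ge 8 ] || return 1                         # minimum length
    [ "$pw" != "$user" ] || return 1                       # not the username
    case "$pw" in *[[:lower:]]*) ;; *) return 1 ;; esac    # one lowercase
    case "$pw" in *[[:upper:]]*) ;; *) return 1 ;; esac    # one uppercase
    case "$pw" in *[[:digit:]]*) ;; *) return 1 ;; esac    # one digit
    case "$pw" in *[![:alnum:]]*) ;; *) return 1 ;; esac   # one special character
    return 0
}

if password_meets_policy cumulus 'Example#Passw0rd'; then
    echo "meets policy"
else
    echo "rejected"
fi
```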
            

          Resource Diagnostics

          Cumulus Linux synchronizes routes between the kernel and the switching silicon. If the required resource pools in hardware fill up, new kernel routes can cause existing routes to move from being fully allocated to being partially allocated. To avoid this issue, monitor the routes in the hardware to keep them below the ASIC limits.

          You can retrieve information about host entries, MAC entries, layer 2 and layer 3 routes, and ECMP routes that are in use.

          To monitor the routes in Cumulus Linux hardware, you can use NVUE commands or the Linux cl-resource-query command.

          To show both global and ACL ASIC resources, run the nv show platform asic resource command.

          cumulus@switch:~$ nv show platform asic resource
          Global 
          ========= 
              Resource Name                 Count   Max       Percentage
              ------------------            -----   ----      ----------
              IPv4-host-entries             4       32768     0%
              IPv6-host-entries             4       8192      0%
              IPv4-neighbors                4                 0%
              IPv6-neighbors                4                 0%
              IPv4-route-entries            22      65536     0%
              IPv6-route-entries            21      45056     0%
              IPv4-Routes                   22                0%
              IPv6-Routes                   13                0%
              MAC-entries                   36      40960     0%
              Total-Mcast-Routes            0       1000      0%
              Ingress-ACL-entries           0                 0%
              Egress-ACL-entries            0                 0%
              Total-Routes                  43      110592    0%
              ACL-Regions                   2       400       0%
              ACL-18B-Rules-Key             2       3792      0%
              ACL-36B-Rules-Key             0       1536      0%
              ACL-54B-Rules-Key             0       1024      0%
              ECMP-entries                  5                 0%
              ECMP-nexthops                 8       7808      0%
              Flow-Counters                 10      16196     0%
              RIF-Basic-Counters            36      1000      3%
              RIF-Enhanced-Counters         0       964       0%
              Downstream-VNI-FID-count      0                 0%
              Total-FID-count               3       6143      0%
              Vport-FID-count               3                 0%
          Acl
          ======
              Resource Name                    18B Rule     36B Rule     54B Rule     Rule Count
              ----------------------------     ----------   ----------   ----------   ----------
              Egress-ACL-ipv4-filter-table     0            0            0            0
              Egress-ACL-mac-filter-table      0            0            0            0
              Ingress-ACL-mac-filter-table     0            0            0            0
              Ingress-ACL-ipv4-filter-table    0            0            0            0
              Ingress-ACL-ipv6-filter-table    0            0            0            0
              Ingress-ACL-ipv4-mangle-table    1            0            0            1
              Ingress-ACL-ipv6-mangle-table    0            0            0            0
              Egress ACL-ipv4-mangle-table     1            0            0            1
              Egress-ACL-ipv6-mangle-table     0            0            0            0
              Ingress-PBR-ipv4-filter-table    0            0            0            0
              Ingress-PBR-ipv6-filter-tabl     0            0            0            0
          

          To show global ASIC resources on the switch in tabular format, run the nv show platform asic resource global command.

          cumulus@switch:~$ nv show platform asic resource global
              Resource Name                 Count   Max       Percentage
              ------------------            -----   ----      ---------- 
              IPv4-host-entries             4       32768     0%
              IPv6-host-entries             4       8192      0% 
              IPv4-neighbors                4                 0% 
              IPv6-neighbors                4                 0% 
              IPv4-route-entries            22      65536     0% 
              IPv6-route-entries            21      45056     0% 
              IPv4-Routes                   22                0% 
              IPv6-Routes                   13                0% 
              MAC-entries                   36      40960     0% 
              Total-Mcast-Routes            0       1000      0% 
              Ingress-ACL-entries           0                 0% 
              Egress-ACL-entries            0                 0% 
              Total-Routes                  43      110592    0% 
              ACL-Regions                   2       400       0% 
              ACL-18B-Rules-Key             2       3792      0% 
              ACL-36B-Rules-Key             0       1536      0% 
              ACL-54B-Rules-Key             0       1024      0% 
              ECMP-entries                  5                 0% 
              ECMP-nexthops                 8       7808      0% 
              Flow-Counters                 10      16196     0% 
              Ingress-ACL-entries           0                 0% 
              RIF-Basic-Counters            36      1000      3% 
              RIF-Enhanced-Counters         0       964       0% 
              Downstream-VNI-FID-count      0                 0% 
              Total-FID-count               3       6143      0% 
              Vport-FID-count               3                 0%
              Dynamic-Config-DNAT-entries   0       64        0.0% 
              Dynamic-Config-SNAT-entries   0       64        0.0%
              Dynamic-DNAT-entries          0       1024      0.0% 
              Dynamic-SNAT-entries          0       1024      0.0% 
          

          To show only ACL ASIC resources in tabular format, run the nv show platform asic resource acl command.

          cumulus@switch:~$ nv show platform asic resource acl
              Resource Name                    18B Rule     36B Rule     54B Rule     Rule Count
              ----------------------------     ----------   ----------   ----------   ----------
              Egress-ACL-ipv4-filter-table     0            0            0            0
              Egress-ACL-mac-filter-table      0            0            0            0
              Ingress-ACL-mac-filter-table     0            0            0            0
              Ingress-ACL-ipv4-filter-table    0            0            0            0
              Ingress-ACL-ipv6-filter-table    0            0            0            0
              Ingress-ACL-ipv4-mangle-table    1            0            0            1
              Ingress-ACL-ipv6-mangle-table    0            0            0            0
              Egress ACL-ipv4-mangle-table     1            0            0            1
              Egress-ACL-ipv6-mangle-table     0            0            0            0
              Ingress-PBR-ipv4-filter-table    0            0            0            0
              Ingress-PBR-ipv6-filter-tabl     0            0            0            0
              Egress-ACL-ipv6-filter-table     0            0            0            0
          

          The example below shows cl-resource-query results for an NVIDIA Spectrum-2 switch:

          cumulus@switch:~$ sudo cl-resource-query
          IPv4 host entries:                      0,   0% of maximum value  41360
          IPv6 host entries:                      0,   0% of maximum value  20680
          IPv4 neighbors:                         0
          IPv6 neighbors:                         0
          IPv4 route entries:                     0,   0% of maximum value  82720
          IPv6 route entries:                    22,   0% of maximum value  74446
          IPv4 Routes:                            0
          IPv6 Routes:                           12
          Total Routes:                          22,   0% of maximum value 157166
          Unicast Adjacency entries:              0,   0% of maximum value  33087
          ECMP entries:                           0,   0% of maximum value   8571
          MAC entries:                           38,   0% of maximum value  57903
          Total Mcast Routes:                     0,   0% of maximum value   1000
          Ingress ACL entries:                    0
          Egress ACL entries:                     0
          ACL Regions:                            4,   1% of maximum value    400
          ACL 18B Rules Key:                      1,   0% of maximum value  57476
          ACL 36B Rules Key:                      0,   0% of maximum value  57475
          ACL 54B Rules Key:                      0,   0% of maximum value  34485
          Ingress ACL mac filter table:           0    18B : 0 36B : 0 54B : 0 
          Ingress ACL ipv4 filter table:          0    18B : 0 36B : 0 54B : 0 
          Ingress ACL ipv6 filter table:          0    18B : 0 36B : 0 54B : 0 
          Egress ACL mac filter table:            0    18B : 0 36B : 0 54B : 0 
          Egress ACL ipv4 filter table:           0    18B : 0 36B : 0 54B : 0 
          Egress ACL ipv6 filter table:           0    18B : 0 36B : 0 54B : 0 
          Ingress ACL ipv4 mangle table:          0    18B : 0 36B : 0 54B : 0 
          Ingress ACL ipv6 mangle table:          0    18B : 0 36B : 0 54B : 0 
          Ingress PBR ipv4 filter table:          0    18B : 0 36B : 0 54B : 0 
          Ingress PBR ipv6 filter table:          0    18B : 0 36B : 0 54B : 0 
          Flow Counters:                          2,   0% of maximum value  39430
          RIF Basic Counters:                     0,   0% of maximum value   7885
          RIF Enhanced Counters:                 38,   1% of maximum value   2666
          Dynamic SNAT entries:                   0,   0% of maximum value   1024
          Dynamic DNAT entries:                   0,   0% of maximum value   1024
          Dynamic Config SNAT entries:            0,   0% of maximum value     64
          Dynamic Config DNAT entries:            0,   0% of maximum value     64
          

          Ingress ACL and Egress ACL entries show the counts in single-wide entries (not double-wide). For information about ACL entries, see Estimate the Number of ACL Rules.
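To keep hardware resources below the ASIC limits, you can filter the cl-resource-query output for entries at or above a utilization percentage. This is a minimal sketch; the function name is illustrative, and it relies only on the "N, P% of maximum value M" line format shown in the example output above.

```shell
#!/bin/sh
# Sketch: read cl-resource-query output on stdin and print every resource
# whose reported utilization is at or above the given percentage.
# Lines without "of maximum value" (no limit reported) are skipped.
resources_over() {
    awk -v limit="$1" -F': *' '
        / of maximum value / {
            # $2 looks like "38,   1% of maximum value   2666"
            split($2, part, /[ ,%]+/)
            if (part[2] + 0 >= limit + 0) print $1 ": " part[2] "%"
        }'
}

# On a switch you might run:
#   sudo cl-resource-query | resources_over 80
```

With the example output above and a limit of 1, the sketch reports ACL Regions and RIF Enhanced Counters, the only two entries that show 1%.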

          ASIC Monitoring

          Cumulus Linux provides several ASIC monitoring tools that collect and distribute data about the state of the ASIC.

          Enable ASIC Monitoring

          To enable ASIC monitoring for histogram collection and high frequency telemetry, run the following commands.

          cumulus@switch:~$ nv set system telemetry enable on
          cumulus@switch:~$ nv config apply
          

          The asic-monitor service manages both histogram collection and high frequency telemetry. systemd manages the asic-monitor service.

          The asic-monitor service reads:

          • The /etc/cumulus/datapath/monitor.conf configuration file to determine what statistics to collect and when to trigger. The service always starts; however, if the configuration file is empty, the service exits.
          • The /etc/cumulus/telemetry/hft/hft_job.conf and /etc/cumulus/telemetry/hft/hft.conf files for high frequency telemetry.

          Restarting the asic-monitor service does not disrupt traffic or require you to restart switchd.

          Histogram Collection

          The histogram collection monitoring tool polls for data at specific intervals and takes certain actions so that you can identify and respond to problems, such as:

          Cumulus Linux provides several histograms:

          Cumulus Linux supports:

          Histogram Collection Example

          The NVIDIA Spectrum ASIC provides a mechanism to measure and report ingress and egress queue lengths, counters and latency in histograms (a graphical representation of data, which it divides into intervals or bins). Each queue reports through a histogram with 10 bins, where each bin represents a range of queue lengths.

          You configure the histogram with a minimum size boundary (Min) and a histogram size. You then derive the maximum size boundary (Max) by adding the minimum size boundary and the histogram size.

          The 10 bins have numbers 0 through 9. Bin 0 represents queue lengths up to the Min specified, including queue length 0. Bin 9 represents queue lengths of Max and above. Bins 1 through 8 represent equal-sized ranges between the Min and Max (by dividing the histogram size by 8).

          For example, consider the following histogram queue length ranges, in bytes:

          The following illustration demonstrates a histogram showing how many times the queue length for a port was in the ranges specified by each bin. The example shows that the queue length was between 960 and 2495 bytes 125 times within one second.
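The bin layout described above can be sketched in a few lines of Python. This is an illustrative sketch only, not Cumulus Linux code: the function name histogram_bins is hypothetical, and the sketch assumes that bin 0 ends one byte below Min, that the histogram size divides evenly by 8, and that no cell-size rounding applies.

```python
def histogram_bins(min_boundary, histogram_size, inner_bins=8):
    """Return a list of (low, high) byte ranges for bins 0 through 9.

    The last bin is open-ended, so its high value is None.
    """
    width = histogram_size // inner_bins          # size of bins 1 through 8
    max_boundary = min_boundary + histogram_size  # Max = Min + histogram size
    bins = [(0, min_boundary - 1)]                # bin 0: below Min, includes 0
    for i in range(inner_bins):
        low = min_boundary + i * width
        bins.append((low, low + width - 1))
    bins.append((max_boundary, None))             # bin 9: Max and above
    return bins


# Min of 960 bytes and histogram size of 12288 bytes, as in this section.
for number, (low, high) in enumerate(histogram_bins(960, 12288)):
    print(number, low, "and above" if high is None else high)
```

With a minimum boundary of 960 bytes and a histogram size of 12288 bytes, bin 1 covers 960 through 2495 bytes, which matches the range in the illustration above.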

          Configure Histogram Collection

          To configure Histogram Collection, you specify:

          Histogram Settings

          Histogram settings include the type of data you want to collect, the ports you want the histogram to monitor, the sampling time of the histogram, the histogram size, and the minimum boundary size for the histogram.

          When you configure minimum boundary and histogram sizes, Cumulus Linux rounds down the configured byte value to the nearest multiple of the switch ASIC cell size before programming it into hardware. The cell size is a fixed number of bytes on each switching ASIC:

          The histogram type can be egress-buffer, ingress-buffer, counter, or latency.

          • To change global histogram settings, run the nv set system telemetry histogram <type> command.
          • To enable histograms on interfaces or to change interface level settings, run the nv set interface <interface> telemetry histogram <type> command.
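The rounding of the minimum boundary and histogram size to a multiple of the ASIC cell size can be sketched as follows. This is a minimal illustration, not the switch implementation: round_down_to_cell is a hypothetical helper, 96 bytes is the multiple this guide states for a Spectrum switch, and 144 bytes is an assumed cell size used only to show the rounding effect.

```python
def round_down_to_cell(value_bytes, cell_size_bytes):
    """Round a configured byte value down to the nearest multiple of the cell size."""
    if cell_size_bytes <= 0:
        raise ValueError("cell size must be a positive number of bytes")
    return (value_bytes // cell_size_bytes) * cell_size_bytes


print(round_down_to_cell(960, 96))    # already a multiple of 96, unchanged
print(round_down_to_cell(1000, 96))   # rounds down to 960
print(round_down_to_cell(960, 144))   # assumed 144-byte cell: rounds down to 864
```

Choosing boundary and size values that are already multiples of the cell size keeps the programmed values identical to the configured ones.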

          The following example configures the egress queue length histogram and sets the minimum boundary size to 960 bytes, the histogram size to 12288 bytes, and the sampling interval to 1024 nanoseconds. These settings apply to interfaces that have the egress-buffer histogram enabled and do not have different values configured for these settings at the interface level:

          cumulus@switch:~$ nv set system telemetry histogram egress-buffer bin-min-boundary 960 
          cumulus@switch:~$ nv set system telemetry histogram egress-buffer histogram-size 12288 
          cumulus@switch:~$ nv set system telemetry histogram egress-buffer sample-interval 1024
          cumulus@switch:~$ nv config apply
          

          The following example enables the egress queue length histogram for traffic class 0 on swp1 through swp8 with the globally applied minimum boundary, histogram size, and sample interval. The example also enables the egress queue length histogram for traffic class 1 on swp9 through swp16 and sets the minimum boundary to 768 bytes, the histogram size to 9600 bytes, and the sampling interval to 2048 nanoseconds.

          cumulus@switch:~$ nv set system telemetry enable on
          cumulus@switch:~$ nv set interface swp1-8 telemetry histogram egress-buffer traffic-class 0
          cumulus@switch:~$ nv set interface swp9-16 telemetry histogram egress-buffer traffic-class 1 bin-min-boundary 768
          cumulus@switch:~$ nv set interface swp9-16 telemetry histogram egress-buffer traffic-class 1 histogram-size 9600
          cumulus@switch:~$ nv set interface swp9-16 telemetry histogram egress-buffer traffic-class 1 sample-interval 2048
          cumulus@switch:~$ nv config apply
          

          The following example configures the ingress queue length histogram and sets the minimum boundary size to 960 bytes, the histogram size to 12288 bytes, and the sampling interval to 1024 nanoseconds. These settings apply to interfaces that have the ingress-buffer histogram enabled and do not have different values configured for these settings at the interface level:

          cumulus@switch:~$ nv set system telemetry enable on
          cumulus@switch:~$ nv set system telemetry histogram ingress-buffer bin-min-boundary 960 
          cumulus@switch:~$ nv set system telemetry histogram ingress-buffer histogram-size 12288 
          cumulus@switch:~$ nv set system telemetry histogram ingress-buffer sample-interval 1024
          cumulus@switch:~$ nv config apply
          

          The following example enables the ingress queue length histogram for priority group 0 on swp1 through swp8 with the globally applied minimum boundary, histogram size, and sample interval. The example also enables the ingress queue length histogram for priority group 1 on swp9 through swp16 and sets the minimum boundary to 768 bytes, the histogram size to 9600 bytes, and the sampling interval to 2048 nanoseconds.

          cumulus@switch:~$ nv set interface swp1-8 telemetry histogram ingress-buffer priority-group 0
          cumulus@switch:~$ nv set interface swp9-16 telemetry histogram ingress-buffer priority-group 1 bin-min-boundary 768
          cumulus@switch:~$ nv set interface swp9-16 telemetry histogram ingress-buffer priority-group 1 histogram-size 9600
          cumulus@switch:~$ nv set interface swp9-16 telemetry histogram ingress-buffer priority-group 1 sample-interval 2048
          cumulus@switch:~$ nv config apply
          

          The following example configures the counter rate histogram and sets the minimum boundary size to 960, the histogram size to 12288, and the sampling interval to 1024. The histogram monitors all counter types and reports the changes in counter data between samples. These settings apply to interfaces that have the counter histogram enabled and do not have different values configured for these settings at the interface level:

          cumulus@switch:~$ nv set system telemetry histogram counter bin-min-boundary 960
          cumulus@switch:~$ nv set system telemetry histogram counter histogram-size 12288
          cumulus@switch:~$ nv set system telemetry histogram counter sample-interval 1024
          cumulus@switch:~$ nv config apply
          

          The following example enables the counter rate histogram on swp1 through swp8 and uses the global settings for the minimum boundary size, histogram size, and the sampling interval. The histogram monitors all received packet counters on ports 1 through 8 and reports the changes in counter data between samples.

          cumulus@switch:~$ nv set interface swp1-8 telemetry histogram counter counter-type rx-packet
          cumulus@switch:~$ nv config apply
          

          The following example configures the latency histogram and sets the minimum boundary size to 960 and the histogram size to 12288. These settings apply to interfaces that have the latency histogram enabled and do not have different values configured for these settings at the interface level:

          cumulus@switch:~$ nv set system telemetry histogram latency bin-min-boundary 960 
          cumulus@switch:~$ nv set system telemetry histogram latency histogram-size 12288 
          cumulus@switch:~$ nv config apply
          

          The following example enables the latency histogram for traffic class 0 on swp1 through swp8 with the globally applied minimum boundary and histogram size. The example also enables the latency histogram for traffic class 1 on swp9 through swp16 and sets the minimum boundary to 768 bytes and the histogram size to 9600 bytes.

          cumulus@switch:~$ nv set system telemetry enable on
          cumulus@switch:~$ nv set interface swp1-8 telemetry histogram latency traffic-class 0
          cumulus@switch:~$ nv set interface swp9-16 telemetry histogram latency traffic-class 1 bin-min-boundary 768
          cumulus@switch:~$ nv set interface swp9-16 telemetry histogram latency traffic-class 1 histogram-size 9600
          cumulus@switch:~$ nv config apply
          

          Edit settings in the /etc/cumulus/datapath/monitor.conf file, then restart the asic-monitor service with the systemctl restart asic-monitor.service command. The asic-monitor service reads the new configuration file and then runs until you stop the service with the systemctl stop asic-monitor.service command.

          The following table describes the ASIC monitor settings.

          Setting Description
          port_group_list Specifies the names of the monitors (port groups) you want to use to collect data, such as histogram_pg. You can provide any name you want for the port group. You must use the same name for all the port group settings.

          Example:
          monitor.port_group_list = [histogram_pg,discards_pg,buffers_pg,all_packets_pg]
          Note: You must specify at least one port group. If the port group list is empty, systemd shuts down the asic-monitor service.
          <port_group_name>.port_set Specifies the range of ports you want to monitor, such as swp4,swp8,swp10-swp50. To specify all ports, use the all_ports option.

          Example:
          monitor.histogram_pg.port_set = swp1-swp50
          monitor.histogram_pg.port_set = all_ports
          <port_group_name>.stat_type Specifies the type of data that the port group collects.

          For egress queue length histograms, specify histogram_tc. For example:
          monitor.histogram_pg.stat_type = histogram_tc
          For ingress queue length histograms, specify histogram_pg. For example:
          monitor.histogram_pg.stat_type = histogram_pg
          For counter rate histograms, specify histogram_counter. For example:
          monitor.histogram_pg.stat_type = histogram_counter
          For latency histograms, specify histogram_latency. For example:
          monitor.histogram_pg.stat_type = histogram_latency
          <port_group_name>.cos_list For histogram monitoring, each CoS (Class of Service) value in the list has its own histogram on each port. The global limit on the number of histograms is an average of one histogram for each port.

          Example:
          monitor.histogram_pg.cos_list = [0]
          <port_group_name>.counter_type Specifies the counter type for counter rate histogram monitoring. The counter types can be tx-pkt,rx-pkt,tx-byte,rx-byte.

          Example:
          monitor.histogram_pg.counter_type = [rx-byte]
          <port_group_name>.trigger_type Specifies the type of trigger that initiates data collection. The only option is timer. At least one port group must have a configured timer; otherwise, the monitor never collects data.

          Example:
          monitor.histogram_pg.trigger_type = timer
          <port_group_name>.timer Specifies how often the monitor collects data; for example, a setting of 1s collects data once each second. You can set the timer to the following:
          1 to 60 seconds: 1s, 2s, and so on up to 60s
          1 to 60 minutes: 1m, 2m, and so on up to 60m
          1 to 24 hours: 1h, 2h, and so on up to 24h
          1 to 7 days: 1d, 2d, and so on up to 7d

          Example:
          monitor.histogram_pg.timer = 4s
          <port_group_name>.histogram.minimum_bytes_boundary For histogram monitoring.

          The minimum boundary size for the histogram in bytes. On a Spectrum switch, this number must be a multiple of 96. Adding this number to the size of the histogram produces the maximum boundary size. These values represent the range of queue lengths for each bin.

          Example:
          monitor.histogram_pg.histogram.minimum_bytes_boundary = 960
          <port_group_name>.histogram.histogram_size_bytes For histogram monitoring.

          The size of the histogram in bytes. Adding this number and the minimum_bytes_boundary value together produces the maximum boundary size. These values represent the range of queue lengths for each bin.

          Example:
          monitor.histogram_pg.histogram.histogram_size_bytes = 12288
          <port_group_name>.histogram.sample_time_ns For histogram monitoring.

          The sampling time of the histogram in nanoseconds.

          Example:
          monitor.histogram_pg.histogram.sample_time_ns = 1024
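The timer ranges listed in the table above can be checked before you restart the asic-monitor service. The following sketch is a hypothetical helper, not part of Cumulus Linux: the name timer_to_seconds and the choice to reject any other format are assumptions, and only the ranges themselves come from the table.

```python
import re

# Upper limit for each unit, taken from the timer setting description.
TIMER_LIMITS = {"s": 60, "m": 60, "h": 24, "d": 7}
SECONDS_PER_UNIT = {"s": 1, "m": 60, "h": 3600, "d": 86400}
TIMER_PATTERN = re.compile(r"(\d+)([smhd])")


def timer_to_seconds(value):
    """Validate a timer string such as '4s' and return its length in seconds."""
    match = TIMER_PATTERN.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"unrecognized timer format: {value!r}")
    amount, unit = int(match.group(1)), match.group(2)
    if not 1 <= amount <= TIMER_LIMITS[unit]:
        raise ValueError(f"timer {value!r} is outside 1 to {TIMER_LIMITS[unit]}{unit}")
    return amount * SECONDS_PER_UNIT[unit]


print(timer_to_seconds("4s"))
print(timer_to_seconds("2m"))
```

A value such as 61s or 0h raises ValueError in this sketch, which makes a typo visible before it reaches the configuration file.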

          The following example configures the egress queue length histogram and sets the minimum boundary size to 960 bytes, the histogram size to 12288 bytes, and the sampling interval to 1024 nanoseconds. The histogram collects data every second for traffic class 0 through 15 on all ports:

          cumulus@switch:~$ sudo nano /etc/cumulus/datapath/monitor.conf
          ...
          monitor.port_group_list                               = [histogram_pg] 
          monitor.histogram_pg.port_set                         = allports
          monitor.histogram_pg.stat_type                        = histogram_tc
          monitor.histogram_pg.cos_list                         = [0-15]
          monitor.histogram_pg.trigger_type                     = timer
          monitor.histogram_pg.timer                            = 1s
          ...
          monitor.histogram_pg.histogram.minimum_bytes_boundary = 960
          monitor.histogram_pg.histogram.histogram_size_bytes   = 12288
          monitor.histogram_pg.histogram.sample_time_ns         = 1024
          

          The following example configures the egress queue length histogram and sets the minimum boundary to 960 bytes, the histogram size to 12288 bytes, and the sampling interval to 1024 nanoseconds. The histogram collects data every second for traffic class 0 on swp1 through swp8, and for traffic class 1 on swp9 through swp16.

          cumulus@switch:~$ sudo nano /etc/cumulus/datapath/monitor.conf
          ...
          monitor.port_group_list                                = [histogram_gr1, histogram_gr2] 
          monitor.histogram_gr1.port_set                         = swp1-swp8
          monitor.histogram_gr1.stat_type                        = histogram_tc
          monitor.histogram_gr1.cos_list                         = [0]
          monitor.histogram_gr1.trigger_type                     = timer
          monitor.histogram_gr1.timer                            = 1s
          ...
          monitor.histogram_gr1.histogram.minimum_bytes_boundary = 960
          monitor.histogram_gr1.histogram.histogram_size_bytes   = 12288
          monitor.histogram_gr1.histogram.sample_time_ns         = 1024
          

          monitor.histogram_gr2.port_set                         = swp9-swp16
          monitor.histogram_gr2.stat_type                        = histogram_tc
          monitor.histogram_gr2.cos_list                         = [1]
          monitor.histogram_gr2.trigger_type                     = timer
          monitor.histogram_gr2.timer                            = 1s
          ...
          monitor.histogram_gr2.histogram.minimum_bytes_boundary = 960
          monitor.histogram_gr2.histogram.histogram_size_bytes   = 12288
          monitor.histogram_gr2.histogram.sample_time_ns         = 1024
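Because every name in monitor.port_group_list must match the name used in that port group's settings, a quick consistency check can catch a mistyped name. The following sketch is a hypothetical helper, not a Cumulus Linux tool: check_port_groups and the sample text are assumptions, and the parser handles only simple key = value lines like the ones shown in this section.

```python
def check_port_groups(conf_text):
    """Return (missing, unlisted) port group names for monitor.conf-style text.

    missing:  names in monitor.port_group_list that have no settings
    unlisted: names that have settings but are not in monitor.port_group_list
    """
    listed, configured = [], set()
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line.startswith("monitor.") or "=" not in line:
            continue  # skip comments, blank lines, and elided "..." lines
        key, value = (part.strip() for part in line.split("=", 1))
        parts = key.split(".")
        if key == "monitor.port_group_list":
            listed = [n.strip() for n in value.strip("[]").split(",") if n.strip()]
        elif len(parts) >= 3:
            configured.add(parts[1])  # monitor.<port_group_name>.<setting>
    missing = [name for name in listed if name not in configured]
    unlisted = sorted(configured - set(listed))
    return missing, unlisted


# Hypothetical sample with a mistyped group name in the list.
sample = """
monitor.port_group_list         = [histogram_gr1, histogram_g2]
monitor.histogram_gr1.port_set  = swp1-swp8
monitor.histogram_gr1.timer     = 1s
monitor.histogram_gr2.port_set  = swp9-swp16
monitor.histogram_gr2.timer     = 1s
"""
print(check_port_groups(sample))
```

Port groups that are referenced only through a collect.port_group_list setting also show up as unlisted in this sketch, so treat that half of the result as informational.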

          The following example configures the ingress queue length histogram and sets the minimum boundary size to 960 bytes, the histogram size to 12288 bytes, and the sampling interval to 1024 nanoseconds. The histogram collects data every second for priority group 0 through 15 on all ports.

          cumulus@switch:~$ sudo nano /etc/cumulus/datapath/monitor.conf
          ...
          monitor.port_group_list                               = [histogram_pg] 
          monitor.histogram_pg.port_set                         = allports
          monitor.histogram_pg.stat_type                        = histogram_pg
          monitor.histogram_pg.cos_list                         = [0-15]
          monitor.histogram_pg.trigger_type                     = timer
          monitor.histogram_pg.timer                            = 1s
          ...
          monitor.histogram_pg.histogram.minimum_bytes_boundary = 960
          monitor.histogram_pg.histogram.histogram_size_bytes   = 12288
          monitor.histogram_pg.histogram.sample_time_ns         = 1024
          

          The following example configures the ingress queue length histogram and sets the minimum boundary size to 960 bytes, the histogram size to 12288 bytes, and the sampling interval to 1024 nanoseconds. The histogram monitors priority group 0 on ports 1 through 8 and priority group 1 on ports 9 through 16:

          cumulus@switch:~$ sudo nano /etc/cumulus/datapath/monitor.conf
          ...
          monitor.port_group_list                                = [histogram_gr1, histogram_gr2] 
          monitor.histogram_gr1.port_set                         = swp1-swp8
          monitor.histogram_gr1.stat_type                        = histogram_pg
          monitor.histogram_gr1.cos_list                         = [0]
          monitor.histogram_gr1.trigger_type                     = timer
          monitor.histogram_gr1.timer                            = 1s
          ...
          monitor.histogram_gr1.histogram.minimum_bytes_boundary = 960
          monitor.histogram_gr1.histogram.histogram_size_bytes   = 12288
          monitor.histogram_gr1.histogram.sample_time_ns         = 1024
          

          monitor.histogram_gr2.port_set                         = swp9-swp16
          monitor.histogram_gr2.stat_type                        = histogram_pg
          monitor.histogram_gr2.cos_list                         = [1]
          monitor.histogram_gr2.trigger_type                     = timer
          monitor.histogram_gr2.timer                            = 1s
          ...
          monitor.histogram_gr2.histogram.minimum_bytes_boundary = 960
          monitor.histogram_gr2.histogram.histogram_size_bytes   = 12288
          monitor.histogram_gr2.histogram.sample_time_ns         = 1024

          The following example configures the counter rate histogram and sets the minimum boundary size to 960, the histogram size to 12288, and the sampling interval to 1024. The histogram monitors all counter types:

          cumulus@switch:~$ sudo nano /etc/cumulus/datapath/monitor.conf
          ...
          monitor.port_group_list                               = [histogram_pg] 
          monitor.histogram_pg.port_set                         = allports
          monitor.histogram_pg.stat_type                        = histogram_counter
          monitor.histogram_pg.counter_type                     = [tx-pkt,rx-pkt,tx-byte,rx-byte]
          monitor.histogram_pg.trigger_type                     = timer
          monitor.histogram_pg.timer                            = 1s
          ...
          monitor.histogram_pg.histogram.minimum_bytes_boundary = 960
          monitor.histogram_pg.histogram.histogram_size_bytes   = 12288
          monitor.histogram_pg.histogram.sample_time_ns         = 1024
          

          The following example configures the counter rate histogram and sets the minimum boundary size to 960, the histogram size to 12288, and the sampling interval to 1024. The histogram monitors all received packets on ports 1 through 8:

          cumulus@switch:~$ sudo nano /etc/cumulus/datapath/monitor.conf
          ...
          monitor.port_group_list                               = [histogram_pg] 
          monitor.histogram_pg.port_set                         = swp1-swp8
          monitor.histogram_pg.stat_type                        = histogram_counter
          monitor.histogram_pg.counter_type                     = [rx-pkt]
          monitor.histogram_pg.trigger_type                     = timer
          monitor.histogram_pg.timer                            = 1s
          ...
          monitor.histogram_pg.histogram.minimum_bytes_boundary = 960
          monitor.histogram_pg.histogram.histogram_size_bytes   = 12288
          monitor.histogram_pg.histogram.sample_time_ns         = 1024
          

          The following example configures the latency histogram and sets the minimum boundary size to 960 and the histogram size to 12288. The histogram collects data every second for traffic class 0 through 15 on all ports:

          cumulus@switch:~$ sudo nano /etc/cumulus/datapath/monitor.conf
          ...
          monitor.port_group_list                               = [histogram_pg] 
          monitor.histogram_pg.port_set                         = allports
          monitor.histogram_pg.stat_type                        = histogram_latency
          monitor.histogram_pg.cos_list                         = [0-15]
          monitor.histogram_pg.trigger_type                     = timer
          monitor.histogram_pg.timer                            = 1s
          ...
          monitor.histogram_pg.histogram.minimum_bytes_boundary = 960
          monitor.histogram_pg.histogram.histogram_size_bytes   = 12288
          

          The following example configures the latency histogram and sets the minimum boundary to 960 bytes and the histogram size to 12288 bytes. The histogram collects data every second for traffic class 0 on swp1 through swp8, and for traffic class 1 on swp9 through swp16.

          cumulus@switch:~$ sudo nano /etc/cumulus/datapath/monitor.conf
          ...
          monitor.port_group_list                                = [histogram_gr1, histogram_gr2] 
          monitor.histogram_gr1.port_set                         = swp1-swp8
          monitor.histogram_gr1.stat_type                        = histogram_latency
          monitor.histogram_gr1.cos_list                         = [0]
          monitor.histogram_gr1.trigger_type                     = timer
          monitor.histogram_gr1.timer                            = 1s
          ...
          monitor.histogram_gr1.histogram.minimum_bytes_boundary = 960
          monitor.histogram_gr1.histogram.histogram_size_bytes   = 12288
          

          monitor.histogram_gr2.port_set                         = swp9-swp16
          monitor.histogram_gr2.stat_type                        = histogram_latency
          monitor.histogram_gr2.cos_list                         = [1]
          monitor.histogram_gr2.trigger_type                     = timer
          monitor.histogram_gr2.timer                            = 1s
          ...
          monitor.histogram_gr2.histogram.minimum_bytes_boundary = 960
          monitor.histogram_gr2.histogram.histogram_size_bytes   = 12288

          In the following example:

          • The monitor collects packet drops on swp1 through swp50 every two seconds.
          • If the number of packet drops is greater than 100, the monitor writes the results to the /var/lib/cumulus/discard_stats snapshot file and sends a message to the /var/log/syslog file.
          monitor.port_group_list                            = [discards_pg]
          monitor.discards_pg.port_set                       = swp1-swp50
          monitor.discards_pg.stat_type                      = packet
          monitor.discards_pg.action_list                    = [snapshot,log]
          monitor.discards_pg.trigger_type                   = timer
          monitor.discards_pg.timer                          = 2s
          monitor.discards_pg.log.packet_error_drops         = 100
          monitor.discards_pg.snapshot.packet_error_drops    = 100
          monitor.discards_pg.snapshot.file                  = /var/lib/cumulus/discard_stats
          monitor.discards_pg.snapshot.file_count            = 16
          

          A collect action triggers the collection of additional information. You can daisy chain multiple monitors (port groups) into a single collect action.

          In the following example:

          • The monitor collects queue length histograms for swp1 through swp50 every second.
          • The monitor writes the results to the /var/run/cumulus/histogram_stats snapshot file.
          • When the queue length reaches 500 bytes, the system sends a message to the /var/log/syslog file and collects additional data: buffer occupancy and all packets for each port.
          • The monitor writes buffer occupancy data to the /var/lib/cumulus/buffer_stats snapshot file and all packets for each port data to the /var/lib/cumulus/all_packet_stats snapshot file.
          • In addition, packet drops on swp1 through swp50 collect every two seconds. If the number of packet drops is greater than 100, the monitor writes the results to the /var/lib/cumulus/discard_stats snapshot file and sends a message to the /var/log/syslog file.
          monitor.port_group_list                               = [histogram_pg,discards_pg]
          

          monitor.histogram_pg.port_set                         = swp1-swp50
          monitor.histogram_pg.stat_type                        = buffer
          monitor.histogram_pg.cos_list                         = [0]
          monitor.histogram_pg.trigger_type                     = timer
          monitor.histogram_pg.timer                            = 1s
          monitor.histogram_pg.action_list                      = [snapshot,collect,log]
          monitor.histogram_pg.snapshot.file                    = /var/run/cumulus/histogram_stats
          monitor.histogram_pg.snapshot.file_count              = 64
          monitor.histogram_pg.histogram.minimum_bytes_boundary = 960
          monitor.histogram_pg.histogram.histogram_size_bytes   = 12288
          monitor.histogram_pg.histogram.sample_time_ns         = 1024
          monitor.histogram_pg.log.queue_bytes                  = 500
          monitor.histogram_pg.collect.queue_bytes              = 500
          monitor.histogram_pg.collect.port_group_list          = [buffers_pg,all_packet_pg]

          monitor.buffers_pg.port_set                           = swp1-swp50
          monitor.buffers_pg.stat_type                          = buffer
          monitor.buffers_pg.action_list                        = [snapshot]
          monitor.buffers_pg.snapshot.file                      = /var/lib/cumulus/buffer_stats
          monitor.buffers_pg.snapshot.file_count                = 8

          monitor.all_packet_pg.port_set                        = swp1-swp50
          monitor.all_packet_pg.stat_type                       = packet_all
          monitor.all_packet_pg.action_list                     = [snapshot]
          monitor.all_packet_pg.snapshot.file                   = /var/lib/cumulus/all_packet_stats
          monitor.all_packet_pg.snapshot.file_count             = 8

          monitor.discards_pg.port_set                          = swp1-swp50
          monitor.discards_pg.stat_type                         = packet
          monitor.discards_pg.action_list                       = [snapshot,log]
          monitor.discards_pg.trigger_type                      = timer
          monitor.discards_pg.timer                             = 2s
          monitor.discards_pg.log.packet_error_drops            = 100
          monitor.discards_pg.snapshot.packet_error_drops       = 100
          monitor.discards_pg.snapshot.file                     = /var/lib/cumulus/discard_stats
          monitor.discards_pg.snapshot.file_count               = 16

          Bandwidth Gauge

          Cumulus Linux supports the bandwidth gauge option on the Spectrum-4 switch only.

          To track bandwidth usage for an interface, you can enable the bandwidth gauge option with the nv set interface <interface-id> telemetry bw-gauge enable on command:

          cumulus@switch:~$ nv set interface swp1 telemetry bw-gauge enable on
          cumulus@switch:~$ nv config apply
          

          To disable the bandwidth gauge setting, run the nv set interface <interface-id> telemetry bw-gauge enable off command.

          To show the bandwidth gauge setting for an interface, run the nv show interface <interface> telemetry bw-gauge command:

          cumulus@switch:~$ nv show interface swp1 telemetry bw-gauge
                  operational  applied
          ------  -----------  -------
          enable  on           on
          

          To show a summary of the bandwidth for an interface, run the nv show system telemetry bw-gauge interface command:

          cumulus@switch:~$ nv show system telemetry bw-gauge interface
          Interface  Tx (Mbps)  Rx (Mbps)
          ---------  ---------  ---------
          swp1       4          4
          

          Snapshots

          To create a snapshot:

          Snapshots provide you with more data; however, they can occupy a lot of disk space on the switch. To reduce disk usage, you can use a volatile partition for the snapshot files; for example, /var/run/cumulus/histogram_stats.

          The following example takes a snapshot every 5 seconds and writes it to /var/run/cumulus/histogram_stats. The switch keeps up to 30 snapshot files before it overwrites the first snapshot file.

          cumulus@switch:~$ nv set system telemetry snapshot-file name /var/run/cumulus/histogram_stats
          cumulus@switch:~$ nv set system telemetry snapshot-file count 30
          cumulus@switch:~$ nv set system telemetry snapshot-interval 5
          cumulus@switch:~$ nv config apply
          

          Edit the snapshot.file settings in the /etc/cumulus/datapath/monitor.conf file, then restart the asic-monitor service with the systemctl restart asic-monitor.service command. The asic-monitor service reads the new configuration file and then runs until you stop the service with the systemctl stop asic-monitor.service command.

          Setting Description
          <port_group_name>.action_list Specifies one or more actions that occur when data collects:
          snapshot writes a snapshot of the data collection results to a file. If you specify this action, you must also specify a snapshot file (described below). You can also specify a threshold that initiates the snapshot action.

          Example:
          monitor.histogram_pg.action_list = [snapshot]
          collect gathers additional data. If you specify this action, you must also specify the port groups for the additional data you want to collect.

          Example:
          monitor.histogram_pg.action_list = [collect]
          monitor.histogram_pg.collect.port_group_list = [buffers_pg,all_packet_pg]
          log sends a message to the /var/log/syslog file. If you specify this action, you must also specify a threshold that initiates the log action.
          Example:
          monitor.histogram_pg.action_list = [log]
          monitor.histogram_pg.log.queue_bytes = 500
          You can use all three of these actions in one monitoring step. For example:
          monitor.histogram_pg.action_list = [snapshot,collect,log]
          Note: If an action appears in the action list but does not have the required settings (such as a threshold for the log action), the ASIC monitor stops and reports an error.
          <port_group_name>.snapshot.file Specifies the name for the snapshot file. All snapshots use this name, with a sequential number appended to it. See the snapshot.file_count setting.

          Example:
          monitor.histogram_pg.snapshot.file = /var/run/cumulus/histogram_stats
          <port_group_name>.snapshot.file_count Specifies the number of snapshots you can create before Cumulus Linux overwrites the first snapshot file. In the following example, because the snapshot file count is 64, the first snapshot file is histogram_stats_0 and the 64th snapshot file is histogram_stats_63. The 65th snapshot overwrites the original snapshot file (histogram_stats_0) and the sequence restarts.

          Example:
          monitor.histogram_pg.snapshot.file_count = 64
          Note: While more snapshots provide you with more data, they can occupy a lot of disk space on the switch.
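The file naming and overwrite sequence described for snapshot.file and snapshot.file_count can be sketched as follows. This is an illustration of the numbering only: snapshot_file_name is a hypothetical helper, and the sketch assumes snapshots are counted from 1 and that the suffix is joined to the base name with an underscore, as in the histogram_stats_0 example.

```python
def snapshot_file_name(base_path, file_count, snapshot_number):
    """Return the file name used for the Nth snapshot (counting from 1)."""
    if file_count < 1 or snapshot_number < 1:
        raise ValueError("file_count and snapshot_number must be at least 1")
    # The suffix cycles from 0 to file_count - 1, then wraps back to 0.
    return f"{base_path}_{(snapshot_number - 1) % file_count}"


base = "/var/run/cumulus/histogram_stats"
print(snapshot_file_name(base, 64, 1))
print(snapshot_file_name(base, 64, 64))
print(snapshot_file_name(base, 64, 65))
```

With a file count of 64, the first and the 65th snapshot map to the same file name, which is why the oldest data is lost first.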

          The following example shows an ingress queue snapshot:

          cumulus@switch:~$ nv show interface swp1 telemetry histogram ingress-buffer priority-group 0 snapshot
          Sl.No  Date-Time            Bin-0   Bin-1    Bin-2    Bin-3    Bin-4    Bin-5    Bin-6    Bin-7     Bin-8     Bin-9
          -----  -------------------  ------  -------  -------  -------  -------  -------  -------  --------  --------  ---------
          0      -                    (<864)  (<2304)  (<3744)  (<5184)  (<6624)  (<8064)  (<9504)  (<10944)  (<12384)  (>=12384)
          1      2023-12-13 11:02:44  980318  0        0        0        0        0        0        0         0         0
          2      2023-12-13 11:02:43  980318  0        0        0        0        0        0        0         0         0
          3      2023-12-13 11:02:42  980318  0        0        0        0        0        0        0         0         0
          4      2023-12-13 11:02:41  980318  0        0        0        0        0        0        0         0         0
          5      2023-12-13 11:02:40  980488  0        0        0        0        0        0        0         0         0
          6      2023-12-13 11:02:39  980149  0        0        0        0        0        0        0         0         0
          7      2023-12-13 11:02:38  979809  0        0        0        0        0        0        0         0         0
          8      2023-12-13 11:02:37  980488  0        0        0        0        0        0        0         0         0
          9      2023-12-13 11:02:36  980318  0        0        0        0        0        0        0         0         0
          

          Parsing the snapshot file and finding the information you need can be tedious; use a third-party analysis tool to analyze the data in the file.
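For quick checks, tabular output like the ingress queue snapshot above can also be read with a short script. The following sketch is not a supported tool: parse_snapshot_rows is a hypothetical helper, and it assumes the column layout of the example output in this section (a serial number, a date, a time, and ten bin counts on each data row).

```python
from datetime import datetime


def parse_snapshot_rows(output):
    """Return a list of (timestamp, [ten bin counts]) from snapshot table text."""
    rows = []
    for line in output.splitlines():
        tokens = line.split()
        # Data rows have 13 fields; header, separator, and boundary rows do not.
        if len(tokens) != 13 or not tokens[0].isdigit():
            continue
        try:
            stamp = datetime.strptime(f"{tokens[1]} {tokens[2]}", "%Y-%m-%d %H:%M:%S")
            counts = [int(token) for token in tokens[3:]]
        except ValueError:
            continue  # not a data row after all
        rows.append((stamp, counts))
    return rows


# Two data rows copied from the example output, plus the boundary row.
sample_output = """
0      -                    (<864)  (<2304)  (<3744)  (<5184)  (<6624)  (<8064)  (<9504)  (<10944)  (<12384)  (>=12384)
1      2023-12-13 11:02:44  980318  0        0        0        0        0        0        0         0         0
2      2023-12-13 11:02:43  980318  0        0        0        0        0        0        0         0         0
"""
rows = parse_snapshot_rows(sample_output)
# Report any sample where the queue length left bin 0.
busy = [stamp for stamp, counts in rows if any(counts[1:])]
print(len(rows), len(busy))
```

Filtering on the bins above bin 0, as in the last lines, is one simple way to find the samples where the queue grew past the minimum boundary.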

          Log files

          In addition to snapshots, you can configure the switch to send log messages to the /var/log/syslog file when the queue length reaches a specified number of bytes, a counter reaches a specified value, or the latency reaches a specified number of nanoseconds.

          The following example sends a message to the /var/log/syslog file after the ingress queue length for priority group 1 on swp9 through swp16 reaches 5000 bytes:

          cumulus@switch:~$ nv set interface swp9-16 telemetry histogram ingress-buffer priority-group 1 threshold action log
          cumulus@switch:~$ nv set interface swp9-16 telemetry histogram ingress-buffer priority-group 1 threshold value 5000
          cumulus@switch:~$ nv config apply
          

          The following example sends a message to the /var/log/syslog file after the number of received packets on swp1 through swp8 reaches 500:

          cumulus@switch:~$ nv set interface swp1-8 telemetry histogram counter counter-type rx-packet threshold action log
          cumulus@switch:~$ nv set interface swp1-8 telemetry histogram counter counter-type rx-packet threshold value 500
          cumulus@switch:~$ nv config apply
          

          The following example sends a message to the /var/log/syslog file after packet latency for traffic class 0 on swp1 through swp8 reaches 500 nanoseconds:

          cumulus@switch:~$ nv set interface swp1-8 telemetry histogram latency traffic-class 0 threshold action log
          cumulus@switch:~$ nv set interface swp1-8 telemetry histogram latency traffic-class 0 threshold value 500
          cumulus@switch:~$ nv config apply
          

          Set the log options in the /etc/cumulus/datapath/monitor.conf file, then restart the asic-monitor service with the systemctl restart asic-monitor.service command. The asic-monitor service reads the new configuration file and then runs until you stop the service with the systemctl stop asic-monitor.service command.

          Setting Description
          <port_group_name>.log.action_list Set this option to log to create a log message when the queue length or counter number reaches the threshold set.
          <port_group_name>.log.queue_bytes Specifies the length of the queue in bytes after which the switch sends a log message.
          <port_group_name>.log.count Specifies the counter value after which the switch sends a log message.
          <port_group_name>.log.value Specifies the latency in nanoseconds after which the switch sends a log message.

          The following example sends a message to the /var/log/syslog file after the ingress queue length reaches 5000 bytes:

          ...
          monitor.histogram_pg.action_list  = [log]
          ...
          monitor.histogram_pg.log.queue_bytes  = 5000
          

          The following example sends a message to the /var/log/syslog file after the number of packets reaches 500:

          ...
          monitor.histogram_pg.action_list  = [log]
          ...
          monitor.histogram_pg.log.count  = 500
          

          The following example sends a message to the /var/log/syslog file after packet latency reaches 500 nanoseconds:

          ...
          monitor.histogram_pg.action_list  = [log]
          ...
          monitor.histogram_pg.log.value  = 500
          

          The following shows an example syslog message:

          2018-02-26T20:14:41.560840+00:00 cumulus asic-monitor-module INFO:  2018-02-26 20:14:41.559967: Egress queue(s) greater than 500 bytes in monitor port group histogram_pg.
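
If you want to collect these messages programmatically, the following Python sketch extracts the threshold and port group from a syslog line. The pattern is modeled only on the single example message above, so messages for counters or latency might be worded differently and would not match; the function name parse_asic_monitor_line is illustrative.

```python
import re

# Pattern derived from the one example message shown above (an assumption;
# other asic-monitor messages might use different wording).
LOG_RE = re.compile(
    r"^(?P<logged>\S+) (?P<host>\S+) asic-monitor-module (?P<level>\w+):\s+"
    r"(?P<event>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"(?P<direction>\w+) queue\(s\) greater than (?P<bytes>\d+) bytes "
    r"in monitor port group (?P<port_group>[\w-]+)\.$"
)

EXAMPLE = (
    "2018-02-26T20:14:41.560840+00:00 cumulus asic-monitor-module INFO:  "
    "2018-02-26 20:14:41.559967: Egress queue(s) greater than 500 bytes "
    "in monitor port group histogram_pg."
)


def parse_asic_monitor_line(line):
    """Return a dict of message fields, or None if the line does not match."""
    match = LOG_RE.match(line.strip())
    if match is None:
        return None
    info = match.groupdict()
    info["bytes"] = int(info["bytes"])  # threshold in bytes as an integer
    return info


info = parse_asic_monitor_line(EXAMPLE)
print(info["port_group"], info["direction"], info["bytes"])
```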
          

          When collecting data, the switch uses both the CPU and SDK process, which can affect switchd. Snapshots and logs can occupy a lot of disk space if you do not limit their number.

          Show Histogram Information

          To show a list of the interfaces with enabled histograms, run the nv show system telemetry histogram interface command:

          cumulus@switch:~$ nv show system telemetry histogram interface
          Interface         ingress-buffer          egress-buffer            counter 
          --------------------------------------------------------------------------------------- 
          swp1              0,1,2                   -                        tx-byte,rx-byte 
          swp2              -                       0,1,8                    tx-byte,tx-byte
          

          To show the egress queue depth histogram samples collected at the configured interval for a traffic class for a port, run the nv show interface <interface> telemetry histogram egress-buffer traffic-class <traffic-class> command.

          cumulus@switch:~$ nv show interface swp1 telemetry histogram egress-buffer traffic-class 0
          Time      0-863   864:2303  2304:3743  3744:5183  5184:6623  6624:8063  8064:9503  9504:10943  10944:12383  12384:*
          --------  ------  --------  ---------  ---------  ---------  ---------  ---------  ----------  -----------  -------
          08:56:19  978065  0         0          0          0          0          0          0           0
          08:56:20  978532  0         0          0          0          0          0          0           0
          

          To show the ingress queue depth histogram samples collected at the configured interval for a priority group for a port, run the nv show interface <interface> telemetry histogram ingress-buffer priority-group <priority-group> command.

          cumulus@switch:~$ nv show interface swp1 telemetry histogram ingress-buffer priority-group 0
          Time      0-863   864:2303  2304:3743  3744:5183  5184:6623  6624:8063  8064:9503  9504:10943  10944:12383  12384:*
          --------  ------  --------  ---------  ---------  ---------  ---------  ---------  ----------  -----------  -------
          08:56:19  978065  0         0          0          0          0          0          0           0
          08:56:20  978532  0         0          0          0          0          0          0           0
          

          Interface Packet and Buffer Statistics

          Interface packet and buffer statistics show information about all, good, and dropped packets, and interface ingress and egress buffer occupancy.

          Interface Packet and Buffer Statistics Collection

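          To monitor interface packet and buffer statistics, you specify the interfaces to monitor, the type of statistics to collect, and how often the switch sends the data to the snapshot file.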

          The switch limits statistics collection to 128 ports every 10 seconds or 13 ports every second.
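
The two limits describe roughly the same sampling rate. The following Python sketch works through the arithmetic. As an assumption that the documentation does not state, it then treats about 13 port samples per second as the ceiling for other intervals; the constant ASSUMED_MAX_RATE and the function exceeds_assumed_rate are illustrative only.

```python
# Documented collection limits as (number of ports, interval in seconds).
DOCUMENTED_LIMITS = [(128, 10), (13, 1)]

for ports, interval in DOCUMENTED_LIMITS:
    rate = ports / interval
    print(f"{ports} ports every {interval}s = {rate:.1f} port samples per second")

# Assumption: the limit scales linearly with the timer interval.
ASSUMED_MAX_RATE = 13.0  # port samples per second


def exceeds_assumed_rate(port_count, interval_seconds):
    """Return True if a port group would sample faster than the assumed ceiling."""
    return port_count / interval_seconds > ASSUMED_MAX_RATE


# Example: 64 ports every second is well above either documented limit.
print(exceeds_assumed_rate(64, 1))
```

Treat the result as a rough planning aid; only the two documented combinations are confirmed limits.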

          The following example enables packet and buffer data collection on all interfaces. The switch sends the interface statistics about all, good, and dropped packets, in addition to ingress and egress queue occupancy to the default snapshot file every fifteen seconds.

          cumulus@switch:~$ nv set system telemetry enable on
          cumulus@switch:~$ nv set system telemetry snapshot port-group packet-all-pg interface all 
          cumulus@switch:~$ nv set system telemetry snapshot port-group packet-all-pg stats-type packet-all 
          cumulus@switch:~$ nv set system telemetry snapshot port-group packet-all-pg timer-interval 15
          cumulus@switch:~$ nv config apply
          

          The following example enables packet and buffer data collection on swp1 through swp8. The switch sends the interface statistics about ingress and egress queue occupancy to the default snapshot file every ten seconds.

          cumulus@switch:~$ nv set system telemetry enable on
          cumulus@switch:~$ nv set system telemetry snapshot port-group packet-all-pg interface swp1-8 
          cumulus@switch:~$ nv set system telemetry snapshot port-group packet-all-pg stats-type buffer
          cumulus@switch:~$ nv set system telemetry snapshot port-group packet-all-pg timer-interval 10
          cumulus@switch:~$ nv config apply
          

          The following example enables packet and buffer data collection on all interfaces. The switch sends the interface statistics about all and good packets to the default snapshot file every fifteen seconds.

          cumulus@switch:~$ nv set system telemetry enable on
          cumulus@switch:~$ nv set system telemetry snapshot port-group packet-all-pg interface all
          cumulus@switch:~$ nv set system telemetry snapshot port-group packet-all-pg stats-type packet
          cumulus@switch:~$ nv set system telemetry snapshot port-group packet-all-pg timer-interval 15
          cumulus@switch:~$ nv config apply
          

          The following example enables packet and buffer data collection on all interfaces. The switch sends the interface statistics about all, good, and dropped packets to the default snapshot file every fifteen seconds.

          cumulus@switch:~$ nv set system telemetry enable on
          cumulus@switch:~$ nv set system telemetry snapshot port-group packet-all-pg interface all
          cumulus@switch:~$ nv set system telemetry snapshot port-group packet-all-pg stats-type packet-extended
          cumulus@switch:~$ nv set system telemetry snapshot port-group packet-all-pg timer-interval 15
          cumulus@switch:~$ nv config apply
          

          Edit settings in the /etc/cumulus/datapath/monitor.conf file, then restart the asic-monitor service with the systemctl restart asic-monitor.service command.

          The following table describes the ASIC monitor settings.

          Setting Description
          port_group_list Specifies the name of the monitor (port groups) you want to use to collect data, such as buffers_pg. You can provide any name you want for the port group. You must use the same name for all the port group settings. You must specify at least one port group. If the port group list is empty, systemd shuts down the asic-monitor service.
          <port_group_name>.port_set Specifies the range of ports you want to monitor, such as swp4,swp8,swp10-swp50 or all.
          <port_group_name>.stat_type Specifies the type of data that the port group collects: packet_all, buffer, packet, or packet_extended.
          <port_group_name>.timer Specifies how often the switch sends the data to the snapshot file; for example, if you specify 1s, the switch sends the data one time each second.

          The following example enables packet and buffer statistics on all interfaces. The switch sends all interface statistics to the default snapshot file every fifteen seconds.

          cumulus@switch:~$ sudo nano /etc/cumulus/datapath/monitor.conf
          ...
          monitor.packet-all-pg_packet_all.port_set            = all
          monitor.packet-all-pg_packet_all.stat_type           = packet_all
          monitor.packet-all-pg_packet_all.trigger_type        = timer
          monitor.packet-all-pg_packet_all.timer               = 15s
          monitor.packet-all-pg_packet_all.action_list         = [snapshot]
          monitor.packet-all-pg_packet_all.snapshot.file       = /var/run/cumulus/intf_stats_packet-all-pg
          monitor.packet-all-pg_packet_all.snapshot.file_count = 64
          

          The following example enables packet and buffer data collection on swp1 through swp8. The switch sends the interface statistics about ingress and egress queue occupancy to the default snapshot file every ten seconds.

          cumulus@switch:~$ sudo nano /etc/cumulus/datapath/monitor.conf
          ...
          monitor.packet-all-pg_buffer.port_set            = swp1,swp2,swp3,swp4,swp5,swp6,swp7,swp8
          monitor.packet-all-pg_buffer.stat_type           = buffer
          monitor.packet-all-pg_buffer.trigger_type        = timer
          monitor.packet-all-pg_buffer.timer               = 10s
          monitor.packet-all-pg_buffer.action_list         = [snapshot]
          monitor.packet-all-pg_buffer.snapshot.file       = /var/run/cumulus/intf_stats_packet-all-pg
          monitor.packet-all-pg_buffer.snapshot.file_count = 120
          

          The following example enables packet and buffer data collection on all interfaces. The switch sends the interface statistics about all and good packets to the default snapshot file every fifteen seconds.

          cumulus@switch:~$ sudo nano /etc/cumulus/datapath/monitor.conf
          ...
          monitor.packet-all-pg_packet.port_set            = all
          monitor.packet-all-pg_packet.stat_type           = packet
          monitor.packet-all-pg_packet.trigger_type        = timer
          monitor.packet-all-pg_packet.timer               = 15s
          monitor.packet-all-pg_packet.action_list         = [snapshot]
          monitor.packet-all-pg_packet.snapshot.file       = /var/run/cumulus/intf_stats_packet-all-pg
          monitor.packet-all-pg_packet.snapshot.file_count = 64
          

          The following example enables packet and buffer data collection on all interfaces. The switch sends the interface statistics about all, good, and dropped packets to the default snapshot file every fifteen seconds.

          cumulus@switch:~$ sudo nano /etc/cumulus/datapath/monitor.conf
          ...
          monitor.packet-all-pg_packet_extended.port_set            = all
          monitor.packet-all-pg_packet_extended.stat_type           = packet_extended
          monitor.packet-all-pg_packet_extended.trigger_type        = timer
          monitor.packet-all-pg_packet_extended.timer               = 15s
          monitor.packet-all-pg_packet_extended.action_list         = [snapshot]
          monitor.packet-all-pg_packet_extended.snapshot.file       = /var/run/cumulus/intf_stats_packet-all-pg
          monitor.packet-all-pg_packet_extended.snapshot.file_count = 64
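
The four file examples above differ only in the port set, statistics type, timer, and file count. The following Python sketch renders the same block of settings from those parameters. The key prefix (port group name, underscore, statistics type) is copied from the pattern in the examples, and the function name render_port_group is illustrative. The sketch does not write the port_group_list setting, which the examples above also omit.

```python
def render_port_group(name, stat_type, port_set="all", timer="15s", file_count=64):
    """Return monitor.conf lines for one timer-driven snapshot port group."""
    # Prefix pattern taken from the examples: monitor.<name>_<stat_type>
    prefix = f"monitor.{name}_{stat_type}"
    settings = [
        ("port_set", port_set),
        ("stat_type", stat_type),
        ("trigger_type", "timer"),
        ("timer", timer),
        ("action_list", "[snapshot]"),
        ("snapshot.file", f"/var/run/cumulus/intf_stats_{name}"),
        ("snapshot.file_count", str(file_count)),
    ]
    # Pad the keys so that the equals signs line up, as in the examples.
    width = max(len(f"{prefix}.{key}") for key, _ in settings)
    return [f"{prefix}.{key}".ljust(width) + " = " + value for key, value in settings]


for line in render_port_group("packet-all-pg", "buffer",
                              port_set="swp1,swp2,swp3,swp4,swp5,swp6,swp7,swp8",
                              timer="10s", file_count=120):
    print(line)
```

After you add the generated lines to the /etc/cumulus/datapath/monitor.conf file, restart the asic-monitor service as described above.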
          

          Snapshots

          Cumulus Linux saves packet and buffer statistics to the /var/run/cumulus/intf_stats_<port-group> file by default when you configure packet and buffer statistics collection and set the timer in seconds.

          You can change the snapshot directory and file name. You can also change the number of snapshots to create before Cumulus Linux overwrites the first snapshot file. For example, if you set the snapshot file count to 30, the first snapshot file is intf_stats_<port-group>_0 and the 30th snapshot is intf_stats_<port-group>_29. After the 30th snapshot, Cumulus Linux overwrites the original snapshot file (intf_stats_<port-group>_0) and the sequence restarts. The default value is 64.

          Snapshots provide you with more data; however, they can occupy a lot of disk space on the switch. To reduce disk usage, use a volatile partition for the snapshot files.

          The following example creates the /var/run/cumulus/all_packet_stats1 snapshot for all interface packet and buffer statistics. The number of snapshots that you can create before the first snapshot file is overwritten is set to 80.

          cumulus@switch:~$ nv set system telemetry snapshot port-group packet-all-pg snapshot-file name /var/run/cumulus/all_packet_stats1 
          cumulus@switch:~$ nv set system telemetry snapshot port-group packet-all-pg snapshot-file count 80 
          cumulus@switch:~$ nv config apply
          

          Edit the snapshot.file settings in the /etc/cumulus/datapath/monitor.conf file, then restart the asic-monitor service with the systemctl restart asic-monitor.service command. The asic-monitor service reads the new configuration file and then runs until you stop the service with the systemctl stop asic-monitor.service command.

          Setting Description
          <port_group_name>.snapshot.file Specifies the name and directory for the snapshot file. The default snapshot file is /var/run/cumulus/intf_stats_<port_group_name>.
          <port_group_name>.snapshot.file_count Specifies the number of snapshots you can create before Cumulus Linux overwrites the first snapshot file.

          The following example sets the snapshot file name to all_packet_stats1, the directory to /var/run/cumulus/packet_buffer, and the snapshot file count to 80:

          cumulus@switch:~$ sudo nano /etc/cumulus/datapath/monitor.conf
          ...
          monitor.packet-all-pg_packet_extended.action_list         = [snapshot]
          monitor.packet-all-pg_packet_extended.snapshot.file       = /var/run/cumulus/packet_buffer/all_packet_stats1 
          monitor.packet-all-pg_packet_extended.snapshot.file_count = 80
          

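          To show a packet and buffer statistics snapshot, run the nv show system telemetry snapshot port-group <port-group> stats interface <interface> command with the packet or buffer options shown in the following examples.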

          The following example shows a snapshot for good packets transmitted on swp1:

          cumulus@switch:~$ nv show system telemetry snapshot port-group all-packet-pg stats interface swp1 packet good tx 
          Id     Date-Time            Packet  Byte  Mcast  Bcast  Mac Ctrl  Pause Mac Ctrl
          -----  -------------------  ------  ----  -----  -----  --------  --------------
          1      2023-12-13 11:02:44  2       268   0      0      0         0
          2      2023-12-13 11:02:43  2       268   0      0      0         0
          3      2023-12-13 11:02:42  2       268   0      0      0         0
          

          The following example shows a snapshot for dropped packets received on swp1:

          cumulus@switch:~$ nv show system telemetry snapshot port-group all-packet-pg stats interface swp1 packet discard rx 
          Id     Date-Time            General  Policy  Vlan  Tag Type  Opcode  Buffer  Runt  Other
          -----  -------------------  -------  ------  ----  --------  ------  ------  ----  -----
          1      2023-12-13 11:02:44  2        0       0     0         0       0       0     0
          2      2023-12-13 11:02:43  2        0       0     0         0       0       0     0
          3      2023-12-13 11:02:42  2        0       0     0         0       0       0     0
          

          The following example shows a snapshot of the pause packets transmitted for ingress queue (priority group) 0 on swp1:

          cumulus@switch:~$ nv show system telemetry snapshot port-group all-packet-pg stats interface swp1 packet pg 0 tx
          Id     Date-Time            Pause Packet  Pause Duration
          -----  -------------------  ------------  --------------
          1      2023-12-13 11:02:44  0             0
          2      2023-12-13 11:02:43  0             0
          3      2023-12-13 11:02:42  0             0
          

          The following example shows a snapshot for buffer occupancy on swp1. The current value is the number of bytes buffered at the time of the sample, and the watermark value represents the highest historical number of bytes buffered during a sample.

          cumulus@switch:~$ nv show system telemetry snapshot port-group all-packet-pg stats interface swp1 buffer pg 0 
          Id       Date-Time                 Current Value        Watermark        
          -----    -------------------       ------------         -------------
          1        2023-12-13 11:02:44       0                    0                           
          2        2023-12-13 11:02:43       0                    0              
          3        2023-12-13 11:02:42       0                    0   
          

          Parsing the snapshot file and finding the information you need can be tedious; use a third-party analysis tool to analyze the data in the file.
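
As a minimal sketch of such an analysis, the following Python code reads the rows of any of the snapshot tables above and finds the sample with the highest watermark. It assumes each data row starts with a numeric Id, a date, and a time, followed only by integer columns; the function name parse_snapshot_rows and the positional handling of columns are illustrative.

```python
# Buffer occupancy output copied from the example above.
SAMPLE = """\
Id       Date-Time                 Current Value        Watermark
-----    -------------------       ------------         -------------
1        2023-12-13 11:02:44       0                    0
2        2023-12-13 11:02:43       0                    0
3        2023-12-13 11:02:42       0                    0
"""


def parse_snapshot_rows(text):
    """Return one dict per data row: id, timestamp string, integer values."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4 or not fields[0].isdigit():
            continue  # skip the header, the dashed rule, and blank lines
        rows.append({
            "id": int(fields[0]),
            "time": f"{fields[1]} {fields[2]}",
            "values": [int(v) for v in fields[3:]],
        })
    return rows


rows = parse_snapshot_rows(SAMPLE)
# For the buffer table, values[0] is Current Value and values[1] is Watermark.
peak = max(rows, key=lambda row: row["values"][1])
print("highest watermark:", peak["values"][1], "bytes at", peak["time"])
```

Because the columns are read by position, the same function works for the packet tables; only the meaning of each value changes.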

          Log files

          In addition to snapshots, you can configure the switch to send log messages to the /var/log/syslog file when dropped error packets or dropped congested packets reach a specific number.

          The following example sends a message to the /var/log/syslog file after the number of dropped error packets collected in the packet-all-pg port group reaches 100:

          cumulus@switch:~$ nv set system telemetry snapshot port-group packet-all-pg threshold packet-error-drops value 100 
          cumulus@switch:~$ nv set system telemetry snapshot port-group packet-all-pg threshold packet-error-drops action log  
          cumulus@switch:~$ nv config apply
          

          The following example sends a message to the /var/log/syslog file after the number of dropped congested packets collected in the packet-all-pg port group reaches 100:

          cumulus@switch:~$ nv set system telemetry snapshot port-group packet-all-pg threshold packet-congestion-drops value 100 
          cumulus@switch:~$ nv set system telemetry snapshot port-group packet-all-pg threshold packet-congestion-drops action log 
          cumulus@switch:~$ nv config apply
          

          You cannot set a threshold for buffer occupancy.

          Set the log options in the /etc/cumulus/datapath/monitor.conf file, then restart the asic-monitor service with the systemctl restart asic-monitor.service command. The asic-monitor service reads the new configuration file and then runs until you stop the service with the systemctl stop asic-monitor.service command.

          Setting Description
          <port_group_name>.log.action_list Set this option to log to create a log message when dropped error packets or dropped congested packets reach a specific number.
          <port_group_name>.log.value Specifies the number of dropped packets after which the switch sends a log message.

          The following example sends a message to the /var/log/syslog file after the number of dropped congested packets collected in the packet-all-pg port group reaches 100:

          ...
          monitor.packet-all-pg.action_list  = [log]
          ...
          monitor.packet-all-pg.log.value  = 100
          

          When collecting data, the switch uses both the CPU and SDK process, which can affect switchd. Snapshots and logs can occupy a lot of disk space if you do not limit their number.

          Collect Action

          A collect action triggers the collection of additional information. You can link multiple monitors (port groups) together into a single collect action.

          The following example configures the switch to collect ingress and egress queue occupancy statistics when the number of dropped error packets reaches 100:

          cumulus@switch:~$ nv set system telemetry snapshot port-group packet-all-pg threshold packet-error-drops value 100
          cumulus@switch:~$ nv set system telemetry snapshot port-group packet-all-pg threshold packet-error-drops action collect port-group buffer-pg
          cumulus@switch:~$ nv config apply

          Edit settings in the /etc/cumulus/datapath/monitor.conf file, then restart the asic-monitor service with the systemctl restart asic-monitor.service command:
          
          cumulus@switch:~$ sudo nano /etc/cumulus/datapath/monitor.conf
          ...
          monitor.packet-all-pg_packet_all.port_set                     = all
          monitor.packet-all-pg_packet_all.stat_type                    = packet_all
          monitor.packet-all-pg_packet_all.trigger_type                 = timer
          monitor.packet-all-pg_packet_all.timer                        = 5s
          monitor.packet-all-pg_packet_all.action_list                  = [snapshot,collect]
          monitor.packet-all-pg_packet_all.snapshot.file                = /var/run/cumulus/intf_stats_packet-all-pg
          monitor.packet-all-pg_packet_all.snapshot.file_count          = 64
          monitor.packet-all-pg_packet_all.collect.packet_error_drops   = 100
          monitor.packet-all-pg_packet_all.collect.port_group_list      = [buffer-pg_buffer]
          
          
          monitor.buffer-pg_buffer.port_set                             = swp1,swp2,swp3,swp4,swp5,swp6,swp7,swp8
          monitor.buffer-pg_buffer.stat_type                            = buffer
          monitor.buffer-pg_buffer.action_list                          = [snapshot]
          monitor.buffer-pg_buffer.snapshot.file                        = /var/run/cumulus/intf_stats_buffer-pg
          monitor.buffer-pg_buffer.snapshot.file_count                  = 64
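
To check which port groups a collect action links together, you can read the settings back from the file. The following Python sketch parses monitor.conf text into one dictionary per port group and lists the collect links. It assumes the key layout shown in the example above (monitor.<port group>.<setting>) and that a # character starts a comment; the function names are illustrative.

```python
# A subset of the settings from the example above.
CONF = """\
monitor.packet-all-pg_packet_all.stat_type                    = packet_all
monitor.packet-all-pg_packet_all.action_list                  = [snapshot,collect]
monitor.packet-all-pg_packet_all.collect.packet_error_drops   = 100
monitor.packet-all-pg_packet_all.collect.port_group_list      = [buffer-pg_buffer]
monitor.buffer-pg_buffer.stat_type                            = buffer
monitor.buffer-pg_buffer.action_list                          = [snapshot]
"""


def parse_monitor_conf(text):
    """Return {port_group: {setting: value}} for monitor.<group>.<setting> lines."""
    groups = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # assumption: '#' starts a comment
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        parts = key.split(".", 2)
        if len(parts) != 3 or parts[0] != "monitor":
            continue  # not a per-port-group setting
        groups.setdefault(parts[1], {})[parts[2]] = value
    return groups


def as_list(value):
    """Turn a bracketed value such as "[snapshot,collect]" into a list."""
    return [item.strip() for item in value.strip("[]").split(",") if item.strip()]


def collect_links(groups):
    """Map each port group with a collect action to the groups it triggers."""
    links = {}
    for name, settings in groups.items():
        if "collect" in as_list(settings.get("action_list", "")):
            links[name] = as_list(settings.get("collect.port_group_list", ""))
    return links


groups = parse_monitor_conf(CONF)
for source, targets in collect_links(groups).items():
    missing = [t for t in targets if t not in groups]
    print(source, "->", targets, "missing:", missing)
```

A non-empty missing list would indicate a collect action that points at a port group with no settings in the file.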
          

          High Frequency Telemetry

          High frequency telemetry enables you to collect counters at very short sampling intervals (single digit milliseconds to microseconds). The data can help you detect short duration events like microbursts, and provides information about where in time the events happen and for how long.

          High frequency telemetry data provides time-series data that traditional histograms cannot provide. This data can help you understand the shape of the traffic pattern and identify any spikes or dips, or jitter in the traffic.

          Cumulus Linux collects high frequency telemetry data in a json format file. You can upload the file to an external location, then process the data, plot it into a time-series graph and see how the network behaves with high precision.

          Cumulus Linux provides two options to configure high frequency telemetry; you can run NVUE commands or use the Cumulus Linux job management tool (cl-hft-tool). You can see all the cl-hft-tool command options with cl-hft-tool -h. Cumulus Linux recommends that you use NVUE commands.

          To configure high frequency telemetry:

          1. Enable telemetry with the nv set system telemetry enable on command.
          2. Configure data collection.
          3. Configure data export.
          4. Configure the schedule.

          Configure Data Collection

          High frequency telemetry uses profiles for data collection. A profile is a set of configurations. Cumulus Linux provides a default profile called standard. You can create a maximum of four new profiles (four profiles in addition to the default profile).

          You cannot delete or modify a profile if data collection jobs are already running or scheduled.

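          To configure data collection, set the sampling interval, the type of data to collect, and, where applicable, the traffic classes for the profile.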

          Use commas (no spaces) to separate the list of traffic classes. For example, to set traffic class 1, 3, and 6, specify 1,3,6.

          The following example configures profile1 and sets the sampling interval to 1000, the traffic class to 0, 3, and 7, and the type of data to collect to traffic class buffer occupancy (tc-occupancy):

          cumulus@switch:~$ nv set system telemetry hft profile profile1 sample-interval 1000
          cumulus@switch:~$ nv set system telemetry hft profile profile1 counter tc-occupancy
          cumulus@switch:~$ nv set system telemetry hft profile profile1 traffic-class 0,3,7 
          cumulus@switch:~$ nv config apply
          

          The following example configures profile2 and sets the sampling interval to 1000, and the type of data to collect to received bytes (rx-byte) and transmitted bytes (tx-byte).

          You must specify the nv set system telemetry hft profile <profile-id> counter command for each data type you want to collect.

          cumulus@switch:~$ nv set system telemetry hft profile profile2 sample-interval 1000
          cumulus@switch:~$ nv set system telemetry hft profile profile2 counter rx-byte
          cumulus@switch:~$ nv set system telemetry hft profile profile2 counter tx-byte
          cumulus@switch:~$ nv config apply
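
If you generate profile configuration from a script, the following Python sketch builds the NVUE commands shown in the two examples above: one counter command per data type and a comma-separated traffic class list with no spaces. The function name hft_profile_commands is illustrative; the sketch only produces the command strings and does not run them.

```python
def hft_profile_commands(profile, sample_interval, counters, traffic_classes=()):
    """Return the NVUE commands that configure one data collection profile."""
    base = f"nv set system telemetry hft profile {profile}"
    commands = [f"{base} sample-interval {sample_interval}"]
    # NVUE needs a separate counter command for each data type.
    commands += [f"{base} counter {counter}" for counter in counters]
    if traffic_classes:
        # Traffic classes are separated by commas with no spaces, such as 0,3,7.
        tc_list = ",".join(str(tc) for tc in traffic_classes)
        commands.append(f"{base} traffic-class {tc_list}")
    commands.append("nv config apply")
    return commands


for command in hft_profile_commands("profile1", 1000, ["tc-occupancy"], [0, 3, 7]):
    print(command)
```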
          

          To delete a profile, run the nv unset system telemetry hft profile <profile-id> command.

          The following example configures profile1 and sets the sampling interval to 1000, the traffic class to 0, 3, and 7, and the type of data to collect to traffic class buffer occupancy (tc_curr_occupancy):

          cumulus@switch:~$ cl-hft-tool profile-add --name profile1 --counter tc_curr_occupancy --tc 0,3,7 --interval 1000 
          

          The following example configures profile2, and sets the sampling interval to 1000 and the type of data to collect to received bytes (if_in_octets) and transmitted bytes (if_out_octets):

          cumulus@switch:~$ cl-hft-tool profile-add --name profile2 --counter if_in_octets,if_out_octets --interval 1000 
          

          To delete a profile, run the cl-hft-tool profile-delete --name <profile-id> command:

          cumulus@switch:~$ cl-hft-tool profile-delete --name profile1 
          

          To delete all profiles, run the cl-hft-tool profile-delete --name all command.

          Configure Data Export

          You can save the collected data locally to a json file in the /var/run/cumulus/hft directory, then export the json file to an external location with NVUE commands (or the API). The json format file includes the data for each sampling interval and a timestamp for the collected data.

          To save the collected data locally to a json file, run the nv set system telemetry hft target local command:

          cumulus@switch:~$ nv set system telemetry hft target local
          cumulus@switch:~$ nv config apply
          

          The following example saves the collected data locally to a json file:

          cumulus@switch:~$ cl-hft-tool target-add --target local
          

          To delete a target, run the cl-hft-tool target-delete --target local command:

          cumulus@switch:~$ cl-hft-tool target-delete --target local 
          

          To export a json file to an external location, run the NVUE nv action upload system telemetry hft job <job-id> <remote-url> command. Cumulus Linux supports FTP, SCP, and SFTP. You can see the list of jobs with the nv show system telemetry hft job command.

          cumulus@switch:~$ nv action upload system telemetry hft job 1 scp://root@host1:/home/telemetry/
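
Before you run the upload from a script, you can check that the remote URL uses one of the supported protocols. The following Python sketch does this with the standard urllib.parse module and then builds the command string; the function name upload_command is illustrative, and the sketch does not contact the remote host.

```python
from urllib.parse import urlparse

# Protocols that the upload action supports, according to the text above.
SUPPORTED_SCHEMES = {"ftp", "scp", "sftp"}


def upload_command(job_id, remote_url):
    """Return the NVUE upload command, or raise ValueError for other protocols."""
    scheme = urlparse(remote_url).scheme.lower()
    if scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported protocol {scheme!r} in {remote_url!r}")
    return f"nv action upload system telemetry hft job {job_id} {remote_url}"


print(upload_command(1, "scp://root@host1:/home/telemetry/"))
```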
          

          Configure the Schedule

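          To configure the schedule for a data collection profile, set the start date and time, the duration of the session in seconds, the profile, and the ports on which to collect data.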

          The following example configures profile1 to start on 2024-07-17 at 10:00:00, run for 30 seconds, and collect data on swp1s0 through swp9s0.

          Specify the date and time in YYYY-MM-DD HH:MM:SS format.

          cumulus@switch:~$ nv action schedule system telemetry hft job 2024-07-17 10:00:00 duration 30 profile profile1 ports swp1s0-swp9s0
          Action executing ...
          Job schedule successfull.
          HFT job schedule successful: job-id 1
          
          Action succeeded
          

          You can provide a short reason why you are collecting the data. If the description contains more than one word, you must enclose the description in quotes. A description is optional.

          cumulus@switch:~$ nv action schedule system telemetry hft job 2024-07-17 10:00:00 duration 30 profile profile1 ports swp1s0-swp9s0 description "bandwidth profiling"
          Action executing ...
          Job schedule successfull.
          HFT job schedule successful: job-id 1
          
          Action succeeded
          

          The following example configures profile2 to start immediately, run for 30 seconds, and collect data on swp2s0.

          cumulus@switch:~$ nv action schedule system telemetry hft job now now duration 30 profile profile2 ports swp2s0
          Action executing ...
          Job schedule successfull.
          HFT job schedule successful: job-id 2
          
          Action succeeded
          

          The following example configures profile1 to start on 2024-07-17 at 10:00:00, run for 30 seconds, and collect data on swp1s0 through swp9s0.

          Specify the date and time in YYYY-MM-DD-HH:MM:SS format.

          cumulus@switch:~$ cl-hft-tool job-schedule --time 2024-07-17-10:00:00 --duration 30 --profile profile1 --ports swp1s0-swp9s0
          

          You can provide a short reason why you are collecting the data. If the description contains more than one word, you must enclose the description in quotes. A description is optional.

          cumulus@switch:~$ cl-hft-tool job-schedule --time 2024-07-17-10:00:00 --duration 30 --profile profile1 --ports swp1s0-swp9s0 --description "bandwidth profiling"
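
The NVUE and Linux scheduling commands above take the start time in slightly different formats: NVUE expects the date and time as two arguments, while cl-hft-tool expects them joined with a hyphen. The following Python sketch builds both command strings from one datetime value so the two forms stay consistent. The function names are hypothetical helpers, not part of Cumulus Linux, and the sketch uses only the arguments shown in the examples above.

```python
from datetime import datetime
import shlex


def nvue_schedule_cmd(start, duration, profile, ports, description=None):
    """Build the NVUE command; date and time are separate arguments."""
    parts = [
        "nv", "action", "schedule", "system", "telemetry", "hft", "job",
        start.strftime("%Y-%m-%d"), start.strftime("%H:%M:%S"),
        "duration", str(duration), "profile", profile, "ports", ports,
    ]
    if description is not None:
        parts += ["description", description]
    # shlex.join quotes any argument that contains spaces.
    return shlex.join(parts)


def linux_schedule_cmd(start, duration, profile, ports, description=None):
    """Build the cl-hft-tool command; date and time are joined by a hyphen."""
    parts = [
        "cl-hft-tool", "job-schedule",
        "--time", start.strftime("%Y-%m-%d-%H:%M:%S"),
        "--duration", str(duration), "--profile", profile, "--ports", ports,
    ]
    if description is not None:
        parts += ["--description", description]
    return shlex.join(parts)


start = datetime(2024, 7, 17, 10, 0, 0)
print(nvue_schedule_cmd(start, 30, "profile1", "swp1s0-swp9s0"))
print(linux_schedule_cmd(start, 30, "profile1", "swp1s0-swp9s0", "bandwidth profiling"))
```

The sketch only prints the commands. A multi-word description comes out in single quotes, which the shell treats the same way as the double quotes in the examples above.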
          

          Cancel Data Collection

          You can cancel a specific data collection job or all jobs, and you can limit the cancellation to the jobs for a specific profile.

          To cancel a scheduled telemetry job, run the nv action cancel system telemetry hft job <job-id> profile <profile-id> command. Run the nv show system telemetry hft job command to see the list of job IDs.

          The following example cancels all jobs for profile profile1:

          cumulus@switch:~$ nv action cancel system telemetry hft job all profile profile1
          Action executing ...
          Action succeeded
          

          The following example cancels job 6:

          cumulus@switch:~$ nv action cancel system telemetry hft job 6
          Action executing ...
          Action succeeded
          

          To cancel a scheduled telemetry job, run the cl-hft-tool job-cancel --job <job-id> command.

          The following example cancels job 6:

          cumulus@switch:~$ cl-hft-tool job-cancel --job 6
          

          Show Session Information

          To show a summary of high frequency telemetry configuration and data:

          cumulus@switch:~$ nv show system telemetry hft
          profile
          ==========
              Profile        traffic-class  counter       sample-interval
              -------------  -------------  ------------  ---------------
              standard       3              rx-byte       5000
                                            tc-occupancy
                                            tx-byte
              user_profile1  0              rx-byte       1000
                             1              tc-occupancy
                             2              tx-byte
          
          job
          ======
              Job  Counter                       duration  sample-interval  Start Time            Traffic Class  Status     Description
              ---  ----------------------------  --------  ---------------  --------------------  -------------  ---------  -----------
              1    tx-byte,rx-byte,tc-occupancy  20        5000             2024-07-30T05:34:23Z  3              completed  NA
              2    tx-byte,rx-byte,tc-occupancy  20        1000             2024-07-30T05:35:17Z  0-2            completed  NA
          ...
          

          To show the high frequency telemetry profiles configured on the switch:

          cumulus@switch:~$ nv show system telemetry hft profile
          Profile        traffic-class  counter       sample-interval
          -------------  -------------  ------------  ---------------
          standard       3              rx-byte       5000
                                        tc-occupancy
                                        tx-byte
          user_profile1  0              rx-byte       1000
                         1              tc-occupancy
                         2              tx-byte
          

          To show the settings for a specific profile:

          cumulus@switch:~$ nv show system telemetry hft profile profile1
                           operational  applied
          ---------------  -----------  -------
          sample-interval  1000         1000   
          [traffic-class]  0            0      
          [traffic-class]  1            1      
          [traffic-class]  2            2      
          [traffic-class]  3            3      
          [traffic-class]  4            4      
          [traffic-class]  5            5      
          [traffic-class]  6            6      
          [traffic-class]  7            7      
          [traffic-class]  8            8      
          [traffic-class]  9            9      
          [counter]        rx-byte      rx-byte
          [counter]        tx-byte      tx-byte
          

          To show configured targets:

          cumulus@switch:~$ nv show system telemetry hft target
          applied
          -------
          local  
          

          To show information for all data collection jobs:

          cumulus@switch:~$ nv show system telemetry hft job
          Job  Counter                       duration  sample-interval  Start Time            Traffic Class  Status     Description
          ---  ----------------------------  --------  ---------------  --------------------  -------------  ---------  -----------
          1    tx-byte,rx-byte,tc-occupancy  20        5000             2024-07-30T05:34:23Z  3              completed  NA
          2    tx-byte,rx-byte,tc-occupancy  20        1000             2024-07-30T05:35:17Z  0-2            completed  NA
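
If you want to process the job list in a script, the following Python sketch turns the tabular output shown above into a list of dictionaries. It assumes that columns are separated by two or more spaces and that no cell is empty, which holds for the sample output here but is not guaranteed for every release. The parse_job_table function name is hypothetical. If your NVUE version offers structured output for this command, parsing that is likely more robust than parsing text.

```python
import re

# Sample text copied from the nv show system telemetry hft job output above.
SAMPLE = """\
Job  Counter                       duration  sample-interval  Start Time            Traffic Class  Status     Description
---  ----------------------------  --------  ---------------  --------------------  -------------  ---------  -----------
1    tx-byte,rx-byte,tc-occupancy  20        5000             2024-07-30T05:34:23Z  3              completed  NA
2    tx-byte,rx-byte,tc-occupancy  20        1000             2024-07-30T05:35:17Z  0-2            completed  NA
"""


def parse_job_table(text):
    """Return one dict per job row, keyed by the column headers."""
    lines = [line for line in text.splitlines() if line.strip()]
    # Headers such as "Start Time" contain a single space, so split on 2+ spaces.
    header = re.split(r"\s{2,}", lines[0].strip())
    rows = []
    for line in lines[2:]:  # skip the header and the dashed separator
        cells = re.split(r"\s{2,}", line.strip())
        rows.append(dict(zip(header, cells)))
    return rows


for job in parse_job_table(SAMPLE):
    print(job["Job"], job["Status"], job["Start Time"], job["Traffic Class"])
```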
          

          To show information about a specific data collection job:

          cumulus@switch:~$ nv show system telemetry hft job 1
          duration      : 20                sample_interval : 5000
          status        : completed         start_time      : 2024-07-30T05:34:23Z
          traffic_class : 3                 counter         : tx-byte,rx-byte,tc-occupancy
          description   : NA
          target        : /var/run/cumulus/hft
          port          : swp9s0
          

          Open Telemetry Export

          Telemetry enables you to collect, send, and analyze large amounts of data, such as traffic statistics, port status, device health and configuration, and events. This data helps you monitor switch performance, health and behavior, traffic patterns, and QoS.

          Configure Open Telemetry

          Cumulus Linux supports open telemetry (OTEL) export. You can use the OpenTelemetry Protocol (OTLP) to export metrics, such as interface counters, histogram collection, and platform statistic data, to an external collector for analysis and visualization.

          Cumulus Linux supports open telemetry export on switches with the Spectrum-2 ASIC and later.

          To enable open telemetry:

          cumulus@switch:~$ nv set system telemetry export otlp state enabled 
          cumulus@switch:~$ nv config apply
          

          You can enable open telemetry for interface statistics, control plane statistics, histogram data, platform statistics, and layer 3 router statistics.

          Interface Statistics

          When you enable open telemetry for interface statistics, the switch exports counters on all configured interfaces:

          cumulus@switch:~$ nv set system telemetry interface-stats export state enabled
          cumulus@switch:~$ nv config apply
          

          You can enable additional interface statistic collection per interface for specific ingress buffer priority groups (0 through 7) and egress buffer traffic classes (0 through 15). When you enable these settings, the switch exports interface_pg and interface_tc counters for the defined priority groups and traffic classes:

          cumulus@switch:~$ nv set system telemetry interface-stats ingress-buffer priority-group 4
          cumulus@switch:~$ nv set system telemetry interface-stats egress-buffer traffic-class 12
          cumulus@switch:~$ nv config apply
          

          You can enable additional switch priority interface statistic collection on all configured interfaces for specific switch priority values:

          cumulus@switch:~$ nv set system telemetry interface-stats switch-priority 4
          cumulus@switch:~$ nv config apply
          

          You can adjust the interface statistics sample interval (in seconds). You can specify a value between 1 and 86400. The default value is 1.

          cumulus@switch:~$ nv set system telemetry interface-stats sample-interval 100
          cumulus@switch:~$ nv config apply
          

          Control Plane Statistics

          When you enable open telemetry for control plane statistics, the switch exports additional counters for control plane packets:

          cumulus@switch:~$ nv set system telemetry control-plane-stats export state enabled
          cumulus@switch:~$ nv config apply
          

          You can adjust the control plane statistics sample interval (in seconds). You can specify a value between 1 and 86400. The default value is 1.

          cumulus@switch:~$ nv set system telemetry control-plane-stats sample-interval 100
          cumulus@switch:~$ nv config apply
          

          Histogram Data

          When you enable open telemetry for histogram data, your buffer, counter, and latency histogram collection configuration defines the data that the switch exports:

          cumulus@switch:~$ nv set system telemetry histogram export state enabled
          cumulus@switch:~$ nv config apply
          

          Platform Statistics

          When you enable open telemetry for platform statistics, the switch exports data related to CPU, disk, filesystem, memory, and sensor health. To enable all platform statistics globally:

          cumulus@switch:~$ nv set system telemetry platform-stats export state enabled
          cumulus@switch:~$ nv config apply
          

          If you do not want to enable all platform statistics, you can enable or disable individual platform telemetry components or adjust the sample interval for individual components. The default sample interval is 60 seconds.

          cumulus@switch:~$ nv set system telemetry platform-stats class cpu state enabled
          cumulus@switch:~$ nv config apply
          
          cumulus@switch:~$ nv set system telemetry platform-stats class cpu sample-interval 100
          cumulus@switch:~$ nv config apply
          
          cumulus@switch:~$ nv set system telemetry platform-stats class disk state enabled
          cumulus@switch:~$ nv config apply
          
          cumulus@switch:~$ nv set system telemetry platform-stats class disk sample-interval 100
          cumulus@switch:~$ nv config apply
          
          cumulus@switch:~$ nv set system telemetry platform-stats class file-system state enabled
          cumulus@switch:~$ nv config apply
          
          cumulus@switch:~$ nv set system telemetry platform-stats class file-system sample-interval 100
          cumulus@switch:~$ nv config apply
          
          cumulus@switch:~$ nv set system telemetry platform-stats class memory state enabled
          cumulus@switch:~$ nv config apply
          
          cumulus@switch:~$ nv set system telemetry platform-stats class memory sample-interval 100
          cumulus@switch:~$ nv config apply
          
          cumulus@switch:~$ nv set system telemetry platform-stats class environment-sensors state enabled
          cumulus@switch:~$ nv config apply
          
          cumulus@switch:~$ nv set system telemetry platform-stats class environment-sensors sample-interval 100
          cumulus@switch:~$ nv config apply
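
The per-class commands above all follow one pattern, so you can generate them for the classes you need instead of typing each one. The following Python sketch is one way to do that. The class names come from the commands above, the platform_stats_cmds helper is hypothetical, and the sketch only builds the command strings without running them.

```python
# Platform telemetry classes taken from the commands shown above.
PLATFORM_CLASSES = ["cpu", "disk", "file-system", "memory", "environment-sensors"]


def platform_stats_cmds(intervals):
    """Build NVUE commands for the given classes.

    intervals maps a class name to a sample interval in seconds,
    or to None to leave the default interval unchanged.
    """
    cmds = []
    for cls, interval in intervals.items():
        if cls not in PLATFORM_CLASSES:
            raise ValueError(f"unknown platform class: {cls}")
        base = f"nv set system telemetry platform-stats class {cls}"
        cmds.append(f"{base} state enabled")
        if interval is not None:
            cmds.append(f"{base} sample-interval {interval}")
    # A single apply commits all of the pending nv set changes.
    cmds.append("nv config apply")
    return cmds


for cmd in platform_stats_cmds({"cpu": 100, "memory": None}):
    print(cmd)
```

Grouping several nv set commands before one nv config apply avoids applying the configuration once per setting.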
          

          Layer 3 Router Statistics

          When you enable open telemetry for layer 3 router statistics, the switch exports data related to the routing table, BGP peers, BGP advertised routes, and the BGP packet input and output queue.

          To enable BGP peer state statistic open telemetry:

          cumulus@switch:~$ nv set system telemetry router bgp export state enabled
          cumulus@switch:~$ nv config apply
          

          To enable BGP statistic open telemetry for all peers under a VRF:

          cumulus@switch:~$ nv set system telemetry router bgp vrf <vrf_id> export state enabled
          cumulus@switch:~$ nv config apply
          

          To enable BGP statistic open telemetry for a specific peer under a VRF:

          cumulus@switch:~$ nv set system telemetry router bgp vrf <vrf_id> peer <peer_id> export state enabled
          cumulus@switch:~$ nv config apply
          

          To enable statistic open telemetry for the routing table:

          cumulus@switch:~$ nv set system telemetry router rib export state enabled
          cumulus@switch:~$ nv config apply
          

          To enable statistic open telemetry for the routing table for a VRF:

          cumulus@switch:~$ nv set system telemetry router vrf <vrf_id> rib export state enabled
          cumulus@switch:~$ nv config apply
          

          gRPC OTLP Export

          To configure the open telemetry export destination:

          1. Configure gRPC to communicate with the collector by providing the collector destination IP address or hostname. Specify the port to use for communication if it is different from the default port 8443:

            cumulus@switch:~$ nv set system telemetry export otlp grpc destination 10.1.1.100 port 4317
            cumulus@switch:~$ nv config apply
            
          2. Configure an X.509 certificate to secure the gRPC connection:

            cumulus@switch:~$ nv set system telemetry export otlp grpc cert-id <certificate>
            cumulus@switch:~$ nv config apply
            

          By default, OTLP export is in secure mode that requires a certificate. For connections without a configured certificate, you must enable insecure mode with the nv set system telemetry export otlp grpc insecure enabled command.
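
The choice between secure and insecure mode described above can be sketched as a small Python helper that assembles the export commands: it adds the certificate command when a certificate ID is given and otherwise enables insecure mode. The otlp_grpc_cmds function and its parameters are hypothetical, and the sketch assumes that you can omit the port argument when the collector uses the default port. It builds command strings only and does not run them.

```python
def otlp_grpc_cmds(destination, port=None, cert_id=None):
    """Build the NVUE commands for an OTLP gRPC export destination."""
    cmds = ["nv set system telemetry export otlp state enabled"]
    dest = f"nv set system telemetry export otlp grpc destination {destination}"
    if port is not None:
        # Only needed when the collector does not listen on the default port.
        dest += f" port {port}"
    cmds.append(dest)
    if cert_id is not None:
        # Secure mode (the default) requires a configured certificate.
        cmds.append(f"nv set system telemetry export otlp grpc cert-id {cert_id}")
    else:
        # Without a certificate, insecure mode must be enabled explicitly.
        cmds.append("nv set system telemetry export otlp grpc insecure enabled")
    cmds.append("nv config apply")
    return cmds


for cmd in otlp_grpc_cmds("10.1.1.100", port=4317):
    print(cmd)
```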

          Show Telemetry Export Configuration

          To show the telemetry export configuration, run the nv show system telemetry export command:

          cumulus@switch:~$ nv show system telemetry export
                              applied   pending 
          ------------------  --------  --------
          vrf                 default   default 
          otlp                                  
            state             disabled  disabled
            grpc                                
              insecure        disabled  disabled
              port            8443      8443    
              [destination]             
          

          To show the OTLP gRPC destination configuration, run the nv show system telemetry export otlp grpc destination command.

          Static Labels

          You can apply static labels to switches and individual interfaces to configure descriptions for devices and interface roles. Exported OTLP data includes these label names and descriptions.

          To configure a switch device label named Data Center Location with the description Data Center B:

          cumulus@switch:~$ nv set system telemetry label "Data Center Location" description "Data Center B"
          cumulus@switch:~$ nv config apply
          

          Validate device label configuration with the nv show system telemetry label command:

          cumulus@switch:~$ nv show system telemetry label
                                description  
          --------------------  -------------
          Data Center Location  Data Center B
          

          To configure a switch interface label interface_swp10_label with the description Server 10 connection:

          cumulus@switch:~$ nv set interface swp10 telemetry label "interface_swp10_label" description "Server 10 connection"
          cumulus@switch:~$ nv config apply
          

          Validate interface label configuration with the nv show interface <interface> telemetry label command:

          cumulus@switch:~$ nv show interface swp10 telemetry label
                                 description         
          ---------------------  --------------------
          interface_swp10_label  Server 10 connection
          

          Telemetry Data Format

          Cumulus Linux exports statistics and histogram data in the formats defined in this section.

          Interface Statistic Format

          The interface statistic data samples that the switch exports to the OTEL collector are gauge streams that include the interface name as an attribute and the statistics value reported in the asDouble exemplar.

          Name Description
          nvswitch_interface_oper_state Interface operational state as a bitmap: (None[0], Up[1], Down[2], Invalid[4], Error[8])
          nvswitch_interface_dot3_control_in_unknown_opcodes Input 802.3 unknown opcode counter.
          nvswitch_interface_dot3_in_pause_frames Input 802.3 pause frame counter.
          nvswitch_interface_dot3_out_pause_frames Output 802.3 pause frame counter.
          nvswitch_interface_dot3_stats_alignment_errors 802.3 alignment error counter.
          nvswitch_interface_dot3_stats_carrier_sense_errors 802.3 interface carrier sense error counter.
          nvswitch_interface_dot3_stats_deferred_transmissions 802.3 deferred transmission counter.
          nvswitch_interface_dot3_stats_excessive_collisions 802.3 excessive collisions counter.
          nvswitch_interface_dot3_stats_fcs_errors 802.3 FCS error counter.
          nvswitch_interface_dot3_stats_frame_too_longs 802.3 excessive frame size counter.
          nvswitch_interface_dot3_stats_internal_mac_receive_errors 802.3 internal MAC receive error counter.
          nvswitch_interface_dot3_stats_internal_mac_transmit_errors 802.3 internal MAC transmit error counter.
          nvswitch_interface_dot3_stats_late_collisions 802.3 late collisions counter.
          nvswitch_interface_dot3_stats_multiple_collision_frames 802.3 multiple collision frames counter.
          nvswitch_interface_dot3_stats_single_collision_frames 802.3 single collision frames counter.
          nvswitch_interface_dot3_stats_sqe_test_errors 802.3 SQE test error counter.
          nvswitch_interface_dot3_stats_symbol_errors 802.3 symbol error counter.
          nvswitch_interface_performance_marked_packets Interface performance marked packets, with marking as ece or ecn.
          nvswitch_interface_discards_ingress_general Interface ingress general discards counter.
          nvswitch_interface_discards_ingress_policy_engine Interface ingress policy engine discards counter.
          nvswitch_interface_discards_ingress_vlan_membership Interface ingress VLAN membership filter discards counter.
          nvswitch_interface_discards_ingress_tag_frame_type Interface ingress VLAN tag filter discards counter.
          nvswitch_interface_discards_egress_vlan_membership Interface egress VLAN membership filter discards counter.
          nvswitch_interface_discards_loopback_filter Interface loopback filter discards counter.
          nvswitch_interface_discards_egress_general Interface egress general discards counter.
          nvswitch_interface_discards_egress_link_down Interface egress link down discards counter.
          nvswitch_interface_discards_egress_hoq Interface egress head-of-queue timeout discards.
          nvswitch_interface_discards_port_isolation Interface port isolation filter discards.
          nvswitch_interface_discards_egress_policy_engine Interface egress policy engine discards.
          nvswitch_interface_discards_ingress_tx_link_down Interface ingress transmit link down discards.
          nvswitch_interface_discards_egress_stp_filter Interface egress spanning tree filter discards.
          nvswitch_interface_discards_egress_hoq_stall Interface egress head-of-queue stall discards.
          nvswitch_interface_discards_egress_sll Interface egress switch lifetime limit discards.
          nvswitch_interface_discards_ingress_discard_all Interface total ingress discards.
          nvswitch_interface_tx_stats_pkts64octets Total packets transmitted, 64 octets in length.
          nvswitch_interface_tx_stats_pkts65-to127octets Total packets transmitted, 65-127 octets in length.
          nvswitch_interface_tx_stats_pkts256-to511octets Total packets transmitted, 256-511 octets in length.
          nvswitch_interface_tx_stats_pkts512-to1023octets Total packets transmitted, 512-1023 octets in length.
          nvswitch_interface_tx_stats_pkts1024-to1518octets Total packets transmitted, 1024-1518 octets in length.
          nvswitch_interface_tx_stats_pkts1519-to2047octets Total packets transmitted, 1519-2047 octets in length.
          nvswitch_interface_tx_stats_pkts2048-to4095octets Total packets transmitted, 2048-4095 octets in length.
          nvswitch_interface_tx_stats_pkts4096-to8191octets Total packets transmitted, 4096-8191 octets in length.
          nvswitch_interface_tx_stats_pkts8192-to10239octets Total packets transmitted, 8192-10239 octets in length.
          nvswitch_interface_ether_stats_pkts64octets Total packets received, 64 octets in length.
          nvswitch_interface_ether_stats_pkts65to127octets Total packets received, 65-127 octets in length.
          nvswitch_interface_ether_stats_pkts128to255octets Total packets received, 128-255 octets in length.
          nvswitch_interface_ether_stats_pkts256to511octets Total packets received, 256-511 octets in length.
          nvswitch_interface_ether_stats_pkts512to1023octets Total packets received, 512-1023 octets in length.
          nvswitch_interface_ether_stats_pkts1024to1518octets Total packets received, 1024-1518 octets in length.
          nvswitch_interface_ether_stats_pkts1519to2047octets Total packets received, 1519-2047 octets in length.
          nvswitch_interface_ether_stats_pkts2048to4095octets Total packets received, 2048-4095 octets in length.
          nvswitch_interface_ether_stats_pkts4096to8191octets Total packets received, 4096-8191 octets in length.
          nvswitch_interface_ether_stats_pkts8192to10239octets Total packets received, 8192-10239 octets in length.
          nvswitch_interface_carrier_up_changes_total Total number of carrier up transitions for the interface.
          nvswitch_interface_carrier_last_change_time_ms Time of last carrier change for the interface as Unix epoch timestamp, with millisecond granularity.
          nvswitch_interface_carrier_down_changes_total Total number of carrier down transitions for the interface.
          nvswitch_interface_carrier_changes_total Total number of carrier changes for the interface.
          nvswitch_interface_mtu_bytes Operational MTU for the interface in bytes.
          nvswitch_interface_info Provides information about the interface: MAC address, duplex, ifalias, interface name, operstate.
          nvswitch_interface_iface_id The ifindex for the interface.
          nvswitch_interface_flags Kernel device flags set for an interface as an integer representing the kernel net_device flags bitmask.
          nvswitch_interface_proto_down Interface protocol down status.

          The following additional interface traffic class statistics are collected and exported when you configure the nv set system telemetry interface-stats egress-buffer traffic-class <class> command:

          Name Description
          nvswitch_interface_tc_tx_bc_frames Interface egress traffic class transmit broadcast frames counter.
          nvswitch_interface_tc_tx_ecn_marked_tc Interface egress traffic class transmit ECN marked counter.
          nvswitch_interface_tc_tx_frames Interface egress traffic class transmit frames counter.
          nvswitch_interface_tc_tx_mc_frames Interface egress traffic class transmit multicast frames counter.
          nvswitch_interface_tc_tx_no_buffer_discard_uc Interface egress traffic class transmit unicast no buffer discard counter.
          nvswitch_interface_tc_tx_octet Interface egress traffic class transmit bytes counter.
          nvswitch_interface_tc_tx_queue Interface egress traffic class transmit queue counter.
          nvswitch_interface_tc_tx_uc_frames Interface egress traffic class transmit unicast frames counter.
          nvswitch_interface_tc_tx_wred_discard Interface egress traffic class transmit WRED discard counter.

          The following additional interface priority group statistics are collected and exported when you configure the nv set system telemetry interface-stats ingress-buffer priority-group <priority> command:

          Name Description
          nvswitch_interface_pg_rx_buffer_discard Interface ingress priority group receive buffer discard counter.
          nvswitch_interface_pg_rx_frames Interface ingress priority group receive frames counter.
          nvswitch_interface_pg_rx_octets Interface ingress priority group receive bytes counter.
          nvswitch_interface_pg_rx_shared_buffer_discard Interface ingress priority group receive shared buffer discard counter.
          nvswitch_interface_pg_rx_uc_frames Interface receive priority group unicast frames counter.
          nvswitch_interface_pg_rx_mc_frames Interface receive priority group multicast frames counter.
          nvswitch_interface_pg_rx_bc_frames Interface receive priority group broadcast frames counter.
          nvswitch_interface_pg_tx_octets Interface receive priority group transmit bytes counter.
          nvswitch_interface_pg_tx_uc_frames Interface receive priority group transmit unicast frames counter.
          nvswitch_interface_pg_tx_mc_frames Interface receive priority group transmit multicast frames counter.
          nvswitch_interface_pg_tx_bc_frames Interface receive priority group transmit broadcast frames counter.
          nvswitch_interface_pg_tx_frames Interface receive priority group transmit frames counter.
          nvswitch_interface_pg_rx_pause Interface receive priority group receive pause counter.
          nvswitch_interface_pg_rx_pause_duration Interface receive priority group receive pause duration counter.
          nvswitch_interface_pg_tx_pause Interface receive priority group transmit pause counter.
          nvswitch_interface_pg_tx_pause_duration Interface receive priority group transmit pause duration counter.
          nvswitch_interface_pg_rx_pause_transition Interface receive priority group receive pause transition counter.
          nvswitch_interface_pg_rx_discard Interface receive priority group receive discard counter.

          The following additional interface switch priority statistics are collected and exported when you configure the nv set system telemetry interface-stats switch-priority <priority> command:

          Name Description
          nvswitch_interface_sp_rx_bc_frames Receive broadcast frame counter for the switch priority.
          nvswitch_interface_sp_rx_discard Receive discard counter for the switch priority.
          nvswitch_interface_sp_rx_frames Receive frame counter for the switch priority.
          nvswitch_interface_sp_rx_mc_frames Receive multicast frame counter for the switch priority.
          nvswitch_interface_sp_rx_octets Receive octets counter for the switch priority.
          nvswitch_interface_sp_rx_pause Receive pause counter for the switch priority.
          nvswitch_interface_sp_rx_pause_duration Receive pause duration counter for the switch priority.
          nvswitch_interface_sp_rx_pause_transition Receive pause transition counter for the switch priority.
          nvswitch_interface_sp_rx_uc_frames Receive unicast frame counter for the switch priority.
          nvswitch_interface_sp_tx_bc_frames Transmit broadcast frame counter for the switch priority.
          nvswitch_interface_sp_tx_frames Transmit frame counter for the switch priority.
          nvswitch_interface_sp_tx_mc_frames Transmit multicast frame counter for the switch priority.
          nvswitch_interface_sp_tx_octets Transmit octets counter for the switch priority.
          nvswitch_interface_sp_tx_pause Transmit pause counter for the switch priority.
          nvswitch_interface_sp_tx_pause_duration Transmit pause duration for the switch priority.
          nvswitch_interface_sp_tx_uc_frames Transmit unicast frame counter for the switch priority.

          Example JSON data for interface_oper_state:
                      {
                        "name": "nvswitch_interface_oper_state",
                        "description": "NVIDIA Ethernet Switch Interface operational state",
                        "gauge": {
                          "dataPoints": [
                            {
                              "attributes": [
                                {
                                  "key": "interface",
                                  "value": {
                                    "stringValue": "swp61s0"
                                  }
                                }
                              ],
                              "timeUnixNano": "1722458198491000000",
                              "asDouble": 1
                            },
                            {
                              "attributes": [
                                {
                                  "key": "interface",
                                  "value": {
                                    "stringValue": "swp1"
                                  }
                                }
                              ],
                              "timeUnixNano": "1722458198491000000",
                              "asDouble": 2
                            }
                          ]
                        },
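
Because nvswitch_interface_oper_state is a bitmap, a collector-side script has to decode the asDouble value before it can display a state name. The following Python sketch decodes the bit values listed in the table above (None 0, Up 1, Down 2, Invalid 4, Error 8) for each data point. The metric dictionary is a closed, shortened copy of the excerpt above, and the function names are hypothetical.

```python
# Bit values from the nvswitch_interface_oper_state description above.
OPER_STATE_BITS = {1: "Up", 2: "Down", 4: "Invalid", 8: "Error"}


def decode_oper_state(value):
    """Return the list of state names set in the bitmap value."""
    bits = int(value)
    if bits == 0:
        return ["None"]
    return [name for bit, name in OPER_STATE_BITS.items() if bits & bit]


def oper_states(metric):
    """Map each interface name to its decoded operational state."""
    result = {}
    for point in metric["gauge"]["dataPoints"]:
        attrs = {a["key"]: a["value"]["stringValue"] for a in point["attributes"]}
        result[attrs["interface"]] = decode_oper_state(point["asDouble"])
    return result


# Closed, shortened copy of the JSON excerpt above.
metric = {
    "name": "nvswitch_interface_oper_state",
    "gauge": {
        "dataPoints": [
            {
                "attributes": [{"key": "interface", "value": {"stringValue": "swp61s0"}}],
                "timeUnixNano": "1722458198491000000",
                "asDouble": 1,
            },
            {
                "attributes": [{"key": "interface", "value": {"stringValue": "swp1"}}],
                "timeUnixNano": "1722458198491000000",
                "asDouble": 2,
            },
        ]
    },
}

print(oper_states(metric))
```

For the sample data, swp61s0 decodes to Up and swp1 decodes to Down.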
          

          Example JSON data for interface_dot3_stats_fcs_errors:
                      {
                        "name": "nvswitch_interface_dot3_stats_fcs_errors",
                        "description": "NVIDIA Ethernet Switch Interface dot3 stats fcs errors counter",
                        "gauge": {
                          "dataPoints": [
                            {
                              "attributes": [
                                {
                                  "key": "interface",
                                  "value": {
                                    "stringValue": "swp1"
                                  }
                                }
                              ],
                              "timeUnixNano": "1722458205491000000",
                              "asDouble": 0
                            }
                          ]
                        },
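
The timeUnixNano field in these examples is a string that holds nanoseconds since the Unix epoch. The following Python sketch converts it to a timezone-aware UTC datetime with integer arithmetic, which avoids the rounding that a floating-point division can introduce. The unix_nano_to_utc function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def unix_nano_to_utc(time_unix_nano):
    """Convert an OTLP timeUnixNano string to a UTC datetime."""
    seconds, nanos = divmod(int(time_unix_nano), 1_000_000_000)
    # datetime resolution is microseconds, so the last three digits are dropped.
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(
        microseconds=nanos // 1000
    )


# Timestamp taken from the fcs_errors example above.
print(unix_nano_to_utc("1722458205491000000").isoformat())
# prints 2024-07-31T20:36:45.491000+00:00
```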
          

          Control Plane Statistic Format

          When you enable control plane statistic telemetry, the following statistics are exported:

          Name Description
          nvswitch_control_plane_tx_packets Control plane transmit packets.
          nvswitch_control_plane_tx_bytes Control plane transmit bytes.
          nvswitch_control_plane_rx_packets Control plane receive packets.
          nvswitch_control_plane_rx_bytes Control plane receive bytes.
          nvswitch_control_plane_rx_buffer_drops Control plane receive buffer drops.
          nvswitch_control_plane_trap_rx_packets Control plane trap group receive packets.
          nvswitch_control_plane_trap_rx_event_count Control plane trap group receive events.
          nvswitch_control_plane_trap_rx_drop Control plane trap group receive drops.
          nvswitch_control_plane_trap_rx_bytes Control plane trap group receive bytes.
          nvswitch_control_plane_trap_group_rx_packets Control plane trap group receive packets.
          nvswitch_control_plane_trap_group_rx_bytes Control plane trap group receive bytes.
          nvswitch_control_plane_trap_group_pkt_violations Control plane trap group packet violations.

          Example JSON data for nvswitch_control_plane_trap_rx_drop:
                      {
                        "name": "nvswitch_control_plane_trap_rx_drop",
                        "description": "NVIDIA Ethernet Switch trap t_drops counter",
                        "sum": {
                          "dataPoints": [
                            {
                              "attributes": [
                                {
                                  "key": "group",
                                  "value": {
                                    "stringValue": "25"
                                  }
                                }
                              ],
                              "startTimeUnixNano": "1729836350747000000",
                              "timeUnixNano": "1729839232747000000",
                              "asDouble": 0
                            },
                            {
                              "attributes": [
                                {
                                  "key": "group",
                                  "value": {
                                    "stringValue": "3"
                                  }
                                }
                              ],
                              "startTimeUnixNano": "1729836350747000000",
                              "timeUnixNano": "1729839232747000000",
                              "asDouble": 0
                            },
                            {
                              "attributes": [
                                {
                                  "key": "group",
                                  "value": {
                                    "stringValue": "5"
                                  }
                                }
                              ],
                              "startTimeUnixNano": "1729836350747000000",
                              "timeUnixNano": "1729839232747000000",
                              "asDouble": 0
                            },
                            {
                              "attributes": [
                                {
                                  "key": "group",
                                  "value": {
                                    "stringValue": "53"
                                  }
                                }
                              ],
                              "startTimeUnixNano": "1729836350747000000",
                              "timeUnixNano": "1729839232747000000",
                              "asDouble": 0
                            },
                            {
                              "attributes": [
                                {
                                  "key": "group",
                                  "value": {
                                    "stringValue": "78"
                                  }
                                }
                              ],
                              "startTimeUnixNano": "1729836350747000000",
                              "timeUnixNano": "1729839232747000000",
                              "asDouble": 1
                            },
                            {
                              "attributes": [
                                {
                                  "key": "group",
                                  "value": {
                                    "stringValue": "80"
                                  }
                                }
                              ],
                              "startTimeUnixNano": "1729836350747000000",
                              "timeUnixNano": "1729839232747000000",
                              "asDouble": 1
                            }
                          ],
                          "aggregationTemporality": 2,
                          "isMonotonic": true
                        },
                        "metadata": [
                          {
                            "key": "prometheus.type",
                            "value": {
                              "stringValue": "counter"
                            }
                          }
                        ]
                      }
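
A collector or script that receives this payload usually needs the per-group values rather than the raw structure. The following is a minimal Python sketch, assuming the metric has already been decoded into JSON shaped like the example above. The helper names `drops_by_group` and `nonzero_groups` are hypothetical and not part of Cumulus Linux, and `SAMPLE` is a trimmed copy of two data points from the example.

```python
import json

# Trimmed copy of the example above: two data points from the
# nvswitch_control_plane_trap_rx_drop metric.
SAMPLE = """
{
  "name": "nvswitch_control_plane_trap_rx_drop",
  "sum": {
    "dataPoints": [
      {"attributes": [{"key": "group", "value": {"stringValue": "25"}}],
       "timeUnixNano": "1729839232747000000", "asDouble": 0},
      {"attributes": [{"key": "group", "value": {"stringValue": "78"}}],
       "timeUnixNano": "1729839232747000000", "asDouble": 1}
    ],
    "aggregationTemporality": 2,
    "isMonotonic": true
  }
}
"""


def drops_by_group(metric: dict) -> dict:
    """Map each trap group ID (hypothetical helper) to its counter value."""
    result = {}
    for point in metric["sum"]["dataPoints"]:
        # Flatten the attribute list into a simple key -> string mapping.
        attrs = {a["key"]: a["value"]["stringValue"] for a in point["attributes"]}
        result[attrs["group"]] = point["asDouble"]
    return result


def nonzero_groups(metric: dict) -> list:
    """Return the trap group IDs that have recorded at least one drop."""
    return sorted(g for g, v in drops_by_group(metric).items() if v > 0)


metric = json.loads(SAMPLE)
print(drops_by_group(metric))   # {'25': 0, '78': 1}
print(nonzero_groups(metric))   # ['78']
```

In OTLP, an `aggregationTemporality` of 2 denotes a cumulative counter, so a drop rate comes from differencing two successive exports rather than from a single value.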
          

          Platform Statistic Format

          When you enable platform statistic telemetry globally, or when you enable telemetry for the individual components, the following statistics are exported:

          CPU statistics include the CPU core number and operation mode (user, system, idle, iowait, irq, softirq, steal, guest, guest_nice).

          Name Description
          node_cpu_core_throttles_total Number of times a CPU core has been throttled.
          node_cpu_frequency_max_hertz Maximum CPU thread frequency in hertz.
          node_cpu_frequency_min_hertz Minimum CPU thread frequency in hertz.
          node_cpu_guest_seconds_total Seconds the CPUs spent in guests for each mode.
          node_cpu_package_throttles_total Number of times the CPU package has been throttled.
          node_cpu_scaling_frequency_hertz Current scaled CPU thread frequency in hertz.
          node_cpu_scaling_frequency_max_hertz Maximum scaled CPU thread frequency in hertz.
          node_cpu_scaling_frequency_min_hertz Minimum scaled CPU thread frequency in hertz.
          node_cpu_seconds_total Seconds the CPU spent in each mode.
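
Because node_cpu_seconds_total is a cumulative counter per core and mode, utilization comes from the difference between two samples. The sketch below is one way to compute it, assuming you have already collected the per-mode values for a single core into dictionaries. The function name `cpu_busy_fraction` and the sample numbers are illustrative only, and treating idle and iowait as not busy is a policy choice, not something the exporter defines.

```python
def cpu_busy_fraction(prev: dict, curr: dict) -> float:
    """Fraction of elapsed CPU time that was not idle or iowait.

    prev and curr map a mode name (user, system, idle, ...) to the
    node_cpu_seconds_total value for one core at two sample times.
    """
    deltas = {mode: curr[mode] - prev.get(mode, 0.0) for mode in curr}
    total = sum(deltas.values())
    if total <= 0:
        # No time elapsed between samples (or a counter reset).
        return 0.0
    not_busy = deltas.get("idle", 0.0) + deltas.get("iowait", 0.0)
    return (total - not_busy) / total


# Illustrative samples for one core (seconds spent per mode).
prev = {"user": 100.0, "system": 50.0, "idle": 800.0, "iowait": 50.0}
curr = {"user": 106.0, "system": 52.0, "idle": 810.0, "iowait": 52.0}
print(cpu_busy_fraction(prev, curr))  # 0.4
```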
          Name Description
          node_disk_ata_rotation_rate_rpm ATA disk rotation rate in RPM (0 for SSDs).
          node_disk_ata_write_cache ATA disk write cache presence.
          node_disk_ata_write_cache_enabled ATA disk write cache status (enabled or disabled).
          node_disk_discard_time_seconds_total Total number of seconds spent by all discards.
          node_disk_discarded_sectors_total Total number of sectors discarded successfully.
          node_disk_discards_completed_total Total number of discards completed.
          node_disk_discards_merged_total Total number of discards merged.
          node_disk_flush_requests_time_seconds_total Total number of seconds spent by all flush requests.
          node_disk_flush_requests_total The total number of flush requests completed successfully.
          node_disk_info Disk information from /sys/block/<block_device>.
          node_disk_io_now Number of I/Os in progress.
          node_disk_io_time_seconds_total Total seconds spent during I/O.
          node_disk_io_time_weighted_seconds_total Weighted number of seconds spent during I/O.
          node_disk_read_bytes_total Total number of bytes read successfully.
          node_disk_read_time_seconds_total Total number of seconds spent by all reads.
          node_disk_reads_completed_total Total number of reads completed successfully.
          node_disk_reads_merged_total Total number of reads merged.
          node_disk_write_time_seconds_total Total number of seconds spent by all writes.
          node_disk_writes_completed_total Total number of writes completed successfully.
          node_disk_writes_merged_total Number of writes merged.
          node_disk_written_bytes_total Total number of bytes written successfully.
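
The disk counters are cumulative, so they are most useful in pairs. As a rough sketch, the average read latency over an interval can be estimated by dividing the change in node_disk_read_time_seconds_total by the change in node_disk_reads_completed_total. The function name `avg_read_latency_seconds` and the sample values are hypothetical.

```python
from typing import Optional


def avg_read_latency_seconds(prev_time: float, prev_reads: float,
                             curr_time: float, curr_reads: float) -> Optional[float]:
    """Estimate mean seconds per completed read between two samples.

    *_time  -> node_disk_read_time_seconds_total
    *_reads -> node_disk_reads_completed_total
    Returns None when no reads completed in the interval.
    """
    reads = curr_reads - prev_reads
    if reads <= 0:
        return None
    return (curr_time - prev_time) / reads


# Illustrative samples: 500 reads took 0.5 seconds in total.
latency = avg_read_latency_seconds(12.0, 4000, 12.5, 4500)
print(latency)
```

The same pattern applies to the write counters (node_disk_write_time_seconds_total and node_disk_writes_completed_total).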
          Name Description
          node_filesystem_avail_bytes Filesystem space available to non-root users in bytes.
          node_filesystem_device_error Whether an error occurred while getting statistics for the given device.
          node_filesystem_files Filesystem total file nodes.
          node_filesystem_files_free Filesystem total free file nodes.
          node_filesystem_free_bytes Filesystem free space in bytes.
          node_filesystem_readonly Filesystem read-only status.
          node_filesystem_size_bytes Filesystem size in bytes.
          Name Description
          node_memory_Active_anon_bytes /proc/meminfo Active_anon bytes.
          node_memory_Active_bytes /proc/meminfo Active bytes.
          node_memory_Active_file_bytes /proc/meminfo Active_file bytes.
          node_memory_AnonHugePages_bytes /proc/meminfo AnonHugePages bytes.
          node_memory_AnonPages_bytes /proc/meminfo AnonPages bytes.
          node_memory_Bounce_bytes /proc/meminfo Bounce bytes.
          node_memory_Buffers_bytes /proc/meminfo Buffers bytes.
          node_memory_Cached_bytes /proc/meminfo Cached bytes.
          node_memory_CommitLimit_bytes /proc/meminfo CommitLimit bytes.
          node_memory_Committed_AS_bytes /proc/meminfo Committed_AS bytes.
          node_memory_DirectMap1G_bytes /proc/meminfo DirectMap1G bytes.
          node_memory_DirectMap2M_bytes /proc/meminfo DirectMap2M bytes.
          node_memory_DirectMap4k_bytes /proc/meminfo DirectMap4k bytes.
          node_memory_Dirty_bytes /proc/meminfo Dirty bytes.
          node_memory_FileHugePages_bytes /proc/meminfo FileHugePages bytes.
          node_memory_FilePmdMapped_bytes /proc/meminfo FilePmdMapped bytes.
          node_memory_HardwareCorrupted_bytes /proc/meminfo HardwareCorrupted bytes.
          node_memory_HugePages_Free /proc/meminfo HugePages_Free.
          node_memory_HugePages_Rsvd /proc/meminfo HugePages_Rsvd.
          node_memory_HugePages_Surp /proc/meminfo HugePages_Surp.
          node_memory_HugePages_Total /proc/meminfo HugePages_Total.
          node_memory_Hugepagesize_bytes /proc/meminfo Hugepagesize bytes.
          node_memory_Hugetlb_bytes /proc/meminfo Hugetlb bytes.
          node_memory_Inactive_anon_bytes /proc/meminfo Inactive_anon bytes.
          node_memory_Inactive_bytes /proc/meminfo Inactive bytes.
          node_memory_Inactive_file_bytes /proc/meminfo Inactive_file bytes.
          node_memory_KReclaimable_bytes /proc/meminfo KReclaimable bytes.
          node_memory_KernelStack_bytes /proc/meminfo KernelStack bytes.
          node_memory_Mapped_bytes /proc/meminfo Mapped bytes.
          node_memory_MemAvailable_bytes /proc/meminfo MemAvailable bytes.
          node_memory_MemFree_bytes /proc/meminfo MemFree bytes.
          node_memory_MemTotal_bytes /proc/meminfo MemTotal bytes.
          node_memory_Mlocked_bytes /proc/meminfo Mlocked bytes.
          node_memory_NFS_Unstable_bytes /proc/meminfo NFS_Unstable bytes.
          node_memory_PageTables_bytes /proc/meminfo PageTables bytes.
          node_memory_Percpu_bytes /proc/meminfo Percpu bytes.
          node_memory_SReclaimable_bytes /proc/meminfo SReclaimable bytes.
          node_memory_SUnreclaim_bytes /proc/meminfo SUnreclaim bytes.
          node_memory_SecPageTables_bytes /proc/meminfo SecPageTables bytes.
          node_memory_ShmemHugePages_bytes /proc/meminfo ShmemHugePages bytes.
          node_memory_ShmemPmdMapped_bytes /proc/meminfo ShmemPmdMapped bytes.
          node_memory_Shmem_bytes /proc/meminfo Shmem bytes.
          node_memory_Slab_bytes /proc/meminfo Slab bytes.
          node_memory_SwapCached_bytes /proc/meminfo SwapCached bytes.
          node_memory_SwapFree_bytes /proc/meminfo SwapFree bytes.
          node_memory_SwapTotal_bytes /proc/meminfo SwapTotal bytes.
          node_memory_Unevictable_bytes /proc/meminfo Unevictable bytes.
          node_memory_VmallocChunk_bytes /proc/meminfo VmallocChunk bytes.
          node_memory_VmallocTotal_bytes /proc/meminfo VmallocTotal bytes.
          node_memory_VmallocUsed_bytes /proc/meminfo VmallocUsed bytes.
          node_memory_WritebackTmp_bytes /proc/meminfo WritebackTmp bytes.
          node_memory_Writeback_bytes /proc/meminfo Writeback bytes.
          node_memory_Zswap_bytes /proc/meminfo Zswap bytes.
          node_memory_Zswapped_bytes /proc/meminfo Zswapped bytes.
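
A common use of the memory statistics is a single utilization figure. The sketch below assumes the usual convention of deriving used memory from node_memory_MemTotal_bytes and node_memory_MemAvailable_bytes. The function name `memory_used_percent` and the sample values are illustrative, not part of the exported telemetry.

```python
def memory_used_percent(mem_total_bytes: float, mem_available_bytes: float) -> float:
    """Percentage of memory in use, based on MemTotal and MemAvailable."""
    if mem_total_bytes <= 0:
        raise ValueError("mem_total_bytes must be positive")
    return 100.0 * (1.0 - mem_available_bytes / mem_total_bytes)


GIB = 1024 ** 3
# Illustrative values: 8 GiB total, 2 GiB available.
print(memory_used_percent(8 * GIB, 2 * GIB))  # 75.0
```

MemAvailable is generally a better basis than MemFree here, because it accounts for page cache and reclaimable memory that the kernel can release under pressure.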
          Name Description
          nvswitch_env_fan_cur_speed Current fan speed in RPM.
          nvswitch_env_fan_dir Fan direction (0: Front2Back, 1: Back2Front).
          nvswitch_env_fan_max_speed Fan maximum speed in RPM.
          nvswitch_env_fan_min_speed Fan minimum speed in RPM.
          nvswitch_env_fan_state Fan status (0: ABSENT, 1: OK, 2: FAILED, 3: BAD).
          nvswitch_env_psu_capacity PSU capacity in watts.
          nvswitch_env_psu_current PSU current in amperes.
          nvswitch_env_psu_power PSU power in watts.
          nvswitch_env_psu_state PSU state (0: ABSENT, 1: OK, 2: FAILED, 3: BAD).
          nvswitch_env_psu_voltage PSU voltage in volts.
          nvswitch_env_temp_crit Critical temperature threshold in centigrade.
          nvswitch_env_temp_current Current temperature in centigrade.
          nvswitch_env_temp_max Maximum temperature threshold in centigrade.
          nvswitch_env_temp_min Minimum temperature threshold in centigrade.
          nvswitch_env_temp_state Temperature sensor status (0: ABSENT, 1: OK, 2: FAILED, 3: BAD).
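
The fan, PSU, and temperature sensor state metrics share the same numeric encoding (0: ABSENT, 1: OK, 2: FAILED, 3: BAD). A consumer can decode the values with a small enumeration, as in the sketch below. The names `SensorState` and `unhealthy` are hypothetical, and the input is assumed to be a mapping from sensor name to the exported value.

```python
from enum import IntEnum


class SensorState(IntEnum):
    """State codes listed in the table above."""
    ABSENT = 0
    OK = 1
    FAILED = 2
    BAD = 3


def unhealthy(states: dict) -> dict:
    """Return the sensors whose state is anything other than OK."""
    result = {}
    for name, value in states.items():
        # Values arrive as numbers (possibly floats), so convert to int first.
        state = SensorState(int(value))
        if state is not SensorState.OK:
            result[name] = state.name
    return result


# Illustrative input: PSU1 reports OK, PSU2 reports BAD.
print(unhealthy({"PSU1": 1, "PSU2": 3}))  # {'PSU2': 'BAD'}
```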

          Example JSON data for PSU and temperature sensor telemetry:
                  {
                    "name": "nvswitch_platform_environment_psu_state",
                    "description": "PSU state. 0:ABSENT 1:OK 2:FAILED 3:BAD",
                    "gauge": {
                      "dataPoints": [
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "Power Supply Unit 1"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "PSU1"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 1
                        },
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "Power Supply Unit 2"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "PSU2"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 3
                        }
                      ]
                    },
                    "metadata": [
                      {
                        "key": "prometheus.type",
                        "value": {
                          "stringValue": "gauge"
                        }
                      }
                    ]
                  },
                  {
                    "name": "nvswitch_platform_environment_temp_crit",
                    "description": "Critical temperature in Centigrade.",
                    "gauge": {
                      "dataPoints": [
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "Asic Temp Sensor"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "Temp4"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 120
                        },
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "CPU Core Sensor 0"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "Temp5"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 100
                        },
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "CPU Core Sensor 1"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "Temp6"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 100
                        },
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "CPU Core Sensor 2"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "Temp7"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 100
                        },
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "CPU Core Sensor 3"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "Temp8"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 100
                        },
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "CPU Core Sensor 4"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "Temp9"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 100
                        },
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "CPU Core Sensor 5"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "Temp10"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 100
                        },
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "CPU Package Sensor"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "Temp1"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 100
                        },
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "Main Board Ambient Sensor"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "Temp3"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 85
                        },
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "PSU1 Temp Sensor"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "PSU1Temp1"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 85
                        },
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "PSU2 Temp Sensor"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "PSU2Temp1"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 0
                        },
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "Port Ambient Sensor"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "Temp2"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 85
                        }
                      ]
                    },
                    "metadata": [
                      {
                        "key": "prometheus.type",
                        "value": {
                          "stringValue": "gauge"
                        }
                      }
                    ]
                  },
                  {
                    "name": "nvswitch_platform_environment_temp_current",
                    "description": "Current temperature in Centigrade.",
                    "gauge": {
                      "dataPoints": [
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "Asic Temp Sensor"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "Temp4"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 50
                        },
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "CPU Core Sensor 0"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "Temp5"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 52
                        },
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "CPU Core Sensor 1"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "Temp6"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 69
                        },
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "CPU Core Sensor 2"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "Temp7"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 55
                        },
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "CPU Core Sensor 3"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "Temp8"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 54
                        },
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "CPU Core Sensor 4"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "Temp9"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 52
                        },
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "CPU Core Sensor 5"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "Temp10"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 52
                        },
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "CPU Package Sensor"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "Temp1"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 69
                        },
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "Main Board Ambient Sensor"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "Temp3"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 24.687
                        },
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "PSU1 Temp Sensor"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "PSU1Temp1"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 24.531
                        },
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "PSU2 Temp Sensor"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "PSU2Temp1"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 0
                        },
                        {
                          "attributes": [
                            {
                              "key": "description",
                              "value": {
                                "stringValue": "Port Ambient Sensor"
                              }
                            },
                            {
                              "key": "name",
                              "value": {
                                "stringValue": "Temp2"
                              }
                            }
                          ],
                          "timeUnixNano": "1729113543218000000",
                          "asDouble": 22.312
                        }
                      ]
                    },
                    "metadata": [
                      {
                        "key": "prometheus.type",
                        "value": {
                          "stringValue": "gauge"
                        }
                      }
                    ]
                  },
          

          Router Data Format

          When you enable Router statistic telemetry, the following statistics are exported:

          Name Description
          nvswitch_routing_bgp_peer_state BGP peer state: Established, Idle, Connect, Active, OpenSent.
          nvswitch_routing_bgp_peer_fsm_established_transitions BGP peer state transitions to the Established state.
          nvswitch_routing_bgp_peer_rib_in_total_routes_ipv4 Total number of routes advertised to a specific IPv4 BGP peer.
          nvswitch_routing_bgp_peer_rib_in_total_routes_ipv6 Total number of routes advertised to a specific IPv6 BGP peer.
          nvswitch_routing_bgp_in_queue_socket Total number of BGP messages in the input queue.
          nvswitch_routing_bgp_out_queue_socket Total number of BGP messages in the output queue.
          nvswitch_routing_bgp_rx_updates Total number of BGP received packets.
          nvswitch_routing_bgp_tx_updates Total number of BGP sent packets.
          nvswitch_routing_rib_count Total route counts in the routing table.
          nvswitch_routing_bgp_peer_rib_count Total number of routes for each Address Family Indicator (AFI) and Subsequent Address Family Indicator (SAFI).
          

          Histogram Data Format

          The histogram data samples that the switch exports to the OTEL collector are histogram data points. Each data point includes the histogram bucket (bin) counts and the queue length boundaries for each bucket. If configured, the switch also exports latency and counter histogram data.

          Latency histogram bucket counts do not increment in exported telemetry data if there are no packets transmitted in the traffic class during the sample interval.

          The switch sends a sample with the following names for each interface enabled for ingress and egress buffer, latency, and/or counter histogram collection:

          Name Description
          nvswitch_histogram_interface_egress_buffer Histogram interface egress buffer queue depth.
          nvswitch_histogram_interface_ingress_buffer Histogram interface ingress buffer queue depth.
          nvswitch_histogram_interface_counter Histogram interface counter data.
          nvswitch_histogram_interface_latency Histogram interface latency data.

          Example JSON data for interface_ingress_buffer:
                      {
                        "name": "nvswitch_histogram_interface_ingress_buffer",
                        "description": "NVIDIA Ethernet Switch Histogram Interface Ingress Buffer Queue Depth",
                        "unit": "bytes",
                        "histogram": {
                          "dataPoints": [
                            {
                              "attributes": [
                                {
                                  "key": "interface",
                                  "value": {
                                    "stringValue": "swp1s1"
                                  }
                                },
                                {
                                  "key": "pg",
                                  "value": {
                                    "intValue": "0"
                                  }
                                }
                              ],
                              "startTimeUnixNano": "1729839231624809212",
                              "timeUnixNano": "1729839231628434909",
                              "count": "1019165",
                              "bucketCounts": [
                                "1019165",
                                "0",
                                "0",
                                "0",
                                "0",
                                "0",
                                "0",
                                "0",
                                "0",
                                "0"
                              ],
                              "explicitBounds": [
                                863,
                                295775,
                                590687,
                                885599,
                                1180511,
                                1475423,
                                1770335,
                                2065247,
                                2360159
                              ]
                            },
          

          Example JSON data for interface_egress_buffer:
          {
                        "name": "nvswitch_histogram_interface_egress_buffer",
                        "description": "NVIDIA Ethernet Switch Histogram Interface Egress Buffer Queue Depth",
                        "unit": "bytes",
                        "histogram": {
                          "dataPoints": [
                            {
                              "attributes": [
                                {
                                  "key": "interface",
                                  "value": {
                                    "stringValue": "swp1s1"
                                  }
                                },
                                {
                                  "key": "tc",
                                  "value": {
                                    "intValue": "0"
                                  }
                                }
                              ],
                              "startTimeUnixNano": "1729839232707032279",
                              "timeUnixNano": "1729839232709312158",
                              "count": "1077334",
                              "bucketCounts": [
                                "1077334",
                                "0",
                                "0",
                                "0",
                                "0",
                                "0",
                                "0",
                                "0",
                                "0",
                                "0"
                              ],
                              "explicitBounds": [
                                863,
                                1180511,
                                2360159,
                                3539807,
                                4719455,
                                5899103,
                                7078751,
                                8258399,
                                9438047
                              ]
                            }
          

          Example JSON data for interface_counter:
                      {
                        "name": "nvswitch_histogram_interface_counter",
                        "description": "NVIDIA Ethernet Switch Histogram Interface Counter",
                        "unit": "counter",
                        "histogram": {
                          "dataPoints": [
                            {
                              "attributes": [
                                {
                                  "key": "interface",
                                  "value": {
                                    "stringValue": "swp1s1"
                                  }
                                },
                                {
                                  "key": "type",
                                  "value": {
                                    "stringValue": "crc"
                                  }
                                }
                              ],
                              "startTimeUnixNano": "1729839235935525147",
                              "timeUnixNano": "1729839235937099838",
                              "count": "1033926",
                              "bucketCounts": [
                                "1033926",
                                "0",
                                "0",
                                "0",
                                "0",
                                "0",
                                "0",
                                "0",
                                "0",
                                "0"
                              ],
                              "explicitBounds": [
                                99999,
                                1337499,
                                2574999,
                                3812499,
                                5049999,
                                6287499,
                                7524999,
                                8762499,
                                9999999
                              ]
                            },  
          

          Example JSON data for interface_latency:
                      {
                        "name": "nvswitch_histogram_interface_latency",
                        "description": "NVIDIA Ethernet Switch Histogram Interface Latency",
                        "unit": "packets",
                        "histogram": {
                          "dataPoints": [
                            {
                              "attributes": [
                                {
                                  "key": "interface",
                                  "value": {
                                    "stringValue": "swp1s1"
                                  }
                                },
                                {
                                  "key": "tc",
                                  "value": {
                                    "intValue": "0"
                                  }
                                }
                              ],
                              "startTimeUnixNano": "1729839233815168456",
                              "timeUnixNano": "1729839233818493910",
                              "bucketCounts": [
                                "0",
                                "0",
                                "0",
                                "0",
                                "0",
                                "0",
                                "0",
                                "0",
                                "0",
                                "0"
                              ],
                              "explicitBounds": [
                                319,
                                831,
                                1343,
                                1855,
                                2367,
                                2879,
                                3391,
                                3903,
                                4415
                              ]
                            },
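
In the samples above, each data point carries ten bucket counts but only nine explicit bounds. Under the OTLP histogram convention, the first bucket covers values up to and including the first bound, and the last bucket covers values above the last bound. The following Python sketch pairs each count with its range; the function name `label_buckets` is hypothetical, and the sample values come from the ingress buffer example.

```python
def label_buckets(data_point):
    """Pair each bucket count with its (lower, upper] range.

    None marks an open-ended side: the first bucket has no lower bound
    and the last bucket has no upper bound.
    """
    bounds = list(data_point["explicitBounds"])
    # OTLP JSON encodes 64-bit integers as strings, so convert the counts.
    counts = [int(c) for c in data_point["bucketCounts"]]
    if len(counts) != len(bounds) + 1:
        raise ValueError("expected one more bucket count than bounds")
    lowers = [None] + bounds
    uppers = bounds + [None]
    return list(zip(lowers, uppers, counts))


# Values copied from the interface_ingress_buffer sample.
sample = {
    "bucketCounts": ["1019165", "0", "0", "0", "0", "0", "0", "0", "0", "0"],
    "explicitBounds": [863, 295775, 590687, 885599, 1180511,
                       1475423, 1770335, 2065247, 2360159],
}
buckets = label_buckets(sample)
for lower, upper, count in buckets:
    print(lower, upper, count)
```

Here every sample falls in the first bucket, which means the queue depth stayed at or below 863 bytes during the interval.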
          

          Static Label Format

          Device static labels are exported in the resource metric section of OTLP data:

          Example JSON data for static device label:
          {
            "resourceMetrics": [
              {
                "resource": {
                  "attributes": [
                    {
                      "key": "net.host.name",
                      "value": {
                        "stringValue": "switch-hostname"
                      }
                    },
                    {
                      "key": "static_label_1",
                      "value": {
                        "stringValue": "label_1_string"
                      }
                    }
                  ]
                }
              }
            ]
          }
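
A collector-side script can read the device labels back out of the resource attributes. The sketch below assumes the payload arrives as OTLP JSON text shaped like the device label sample; the function name `resource_labels` is hypothetical.

```python
import json


def resource_labels(otlp_json):
    """Return the resource attributes of an OTLP JSON payload as a dict."""
    payload = json.loads(otlp_json)
    labels = {}
    for resource_metric in payload.get("resourceMetrics", []):
        attributes = resource_metric.get("resource", {}).get("attributes", [])
        for attribute in attributes:
            # Only string values appear in the sample; other types return None.
            labels[attribute["key"]] = attribute["value"].get("stringValue")
    return labels


# Same content as the static device label sample.
sample = """
{"resourceMetrics": [{"resource": {"attributes": [
  {"key": "net.host.name", "value": {"stringValue": "switch-hostname"}},
  {"key": "static_label_1", "value": {"stringValue": "label_1_string"}}
]}}]}
"""
labels = resource_labels(sample)
print(labels)
```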

          Interface static labels are exported as attributes in the gauge metrics for each interface.

          Example JSON data for static interface label:
                "metrics": [ 
                    "name": "nvswitch_interface_iface_id", 
                    "description": "Network device property: iface_id", 
                    "gauge": { 
                      "dataPoints": [ 
                        { 
                          "attributes": [ 
                            { 
                              "key": "interface", 
                              "value": { 
                                "stringValue": "swp10" 
                              } 
                            } 
                          ], 
                          "timeUnixNano": "1727942163835000000", 
                          "asDouble": 13 
                        }, 
                        { 
                          "attributes": [ 
                            { 
                              "key": "interface_swp10_label", 
                              "value": { 
                                "stringValue": "swp10_label_string" 
                              } 
                            }
          

          Monitoring Best Practices

          The following monitoring processes are best practices for reviewing and troubleshooting potential issues with Cumulus Linux environments.


          Trend Analysis Using Metrics

          A metric is a quantifiable measure that tracks and assesses the status of a specific infrastructure component. Examples of metrics include bytes on an interface, CPU utilization, and total number of routes.

          Metrics are more valuable when you use them for trend analysis.
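
As an illustration of trend analysis, the sketch below turns successive readings of a monotonically increasing counter (for example, bytes on an interface) into per-second rates that you can compare over time. The function name and the sample readings are hypothetical.

```python
def rates_per_second(samples):
    """Convert (unix_seconds, counter_value) readings into per-second rates.

    Samples must be in time order. Pairs where the counter went backwards
    (for example, after a counter reset) are skipped.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t1 <= t0 or v1 < v0:
            continue
        rates.append((t1, (v1 - v0) / (t1 - t0)))
    return rates


# Hypothetical byte counter readings taken every 10 seconds.
samples = [(0, 1000), (10, 6000), (20, 16000), (30, 31000)]
rates = rates_per_second(samples)
print(rates)
```

A rate that keeps climbing from one interval to the next, as in this sample, is the kind of trend that a single counter reading does not reveal.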

          Generate Alerts with Triggered Logging

          Cumulus Linux typically sends triggered issues to syslog, but can send issues to another log file depending on the feature. rsyslog handles all logging, including local and remote logging. Logs are the best method to use for generating alerts when the system transitions from a stable steady state.

          An optimal solution is to send logs to a centralized collector, then create alerts based on critical log entries.

          Log Formatting

          Most log files in Cumulus Linux use a standard presentation format. For example:

          2017-03-08T06:26:43.569681+00:00 leaf01 sysmonitor: Critically high CPU use: 99%
          

          For brevity and legibility, this section omits the timestamp and hostname from examples.
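
A collector that builds alerts from these logs first has to split each line into its fields. The following sketch parses the standard format shown above with a regular expression; the pattern and the function name `parse_log_line` are assumptions based on that single example, and lines in other formats return None.

```python
import re
from datetime import datetime

# timestamp, hostname, process name, optional [pid], then the message.
LOG_LINE = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<host>\S+)\s+"
    r"(?P<process>[^:\[\s]+)(?:\[(?P<pid>\d+)\])?:\s*(?P<message>.*)$"
)


def parse_log_line(line):
    """Split one log line into fields, or return None if it does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["timestamp"] = datetime.fromisoformat(fields["timestamp"])
    return fields


entry = parse_log_line(
    "2017-03-08T06:26:43.569681+00:00 leaf01 sysmonitor: Critically high CPU use: 99%"
)
print(entry["host"], entry["process"], entry["message"])
```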

          Hardware

          NVUE provides commands to monitor various switch hardware elements.

          Command Description
          nv show platform environment fan Shows information about the fans on the switch, such as the minimum, maximum and current speed, the fan state, and the fan direction.
          nv show platform environment led Shows information about the LEDs on the switch, such as the LED name and color.
          nv show platform environment psu Shows information about the PSUs on the switch, such as the PSU name and state.
          nv show platform environment temperature Shows information about the sensors on the switch, such as the critical, maximum, minimum and current temperature and the current state of the sensor.
          nv show platform environment voltage Shows the list of voltage sensors on the switch.
          nv show platform inventory Shows the switch inventory, which includes fan and PSU hardware version, model, serial number, state, and type. For information about a specific fan or PSU, run the nv show platform inventory <inventory-name> command.

          The following example shows the nv show platform environment fan command output. The airflow direction must be the same for all fans. If Cumulus Linux detects that the fan airflow direction is not uniform, it logs a message in the /var/log/syslog file.

          cumulus@switch:~$ nv show platform environment fan
          Name      Fan State  Current Speed (RPM)  Max Speed  Min Speed  Fan Direction
          --------  ---------  -------------------  ---------  ---------  -------------
          FAN1/1    ok         6000                 29000      2500       F2B         
          FAN1/2    ok         6000                 29000      2500       F2B         
          FAN2/1    ok         6000                 29000      2500       F2B         
          FAN2/2    ok         6000                 29000      2500       F2B         
          FAN3/1    ok         6000                 29000      2500       F2B         
          FAN3/2    ok         6000                 29000      2500       F2B         
          PSU1/FAN  ok         6000                 29000      2500       F2B         
          PSU2/FAN  ok         6000                 29000      2500       F2B   
          

          If the airflow direction is not the same for all fans (front to back or back to front), cooling is suboptimal for the switch, rack, and even the entire data center.
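
You can also check airflow uniformity from a script that reads the text output of nv show platform environment fan. The sketch below assumes the six-column layout shown above, with the direction (F2B or B2F) in the last column; the function names are hypothetical, and the sample changes one fan to B2F to show a mismatch.

```python
def fan_directions(output):
    """Map each fan name to its airflow direction from the command output."""
    directions = {}
    for line in output.splitlines():
        fields = line.split()
        # Data rows look like: FAN1/1 ok 6000 29000 2500 F2B
        # The header and the dashed separator do not end in F2B or B2F.
        if len(fields) == 6 and fields[-1] in ("F2B", "B2F"):
            directions[fields[0]] = fields[-1]
    return directions


def airflow_is_uniform(directions):
    """Return True when every fan reports the same direction."""
    return len(set(directions.values())) <= 1


# Shortened sample; PSU2/FAN is changed to B2F for illustration only.
SAMPLE = """\
Name      Fan State  Current Speed (RPM)  Max Speed  Min Speed  Fan Direction
--------  ---------  -------------------  ---------  ---------  -------------
FAN1/1    ok         6000                 29000      2500       F2B
FAN1/2    ok         6000                 29000      2500       F2B
PSU1/FAN  ok         6000                 29000      2500       F2B
PSU2/FAN  ok         6000                 29000      2500       B2F
"""
directions = fan_directions(SAMPLE)
uniform = airflow_is_uniform(directions)
print(directions, uniform)
```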

          The smond process provides monitoring for various switch hardware elements. Minimum or maximum values depend on the flags you apply to the basic command. The table below lists the hardware elements and applicable commands and flags.

          Hardware Element   Monitoring Commands                        Interval Poll
          Temperature        smonctl -j, smonctl -j -s TEMP[X]          10 seconds
          Fan                smonctl -j, smonctl -j -s FAN[X]           10 seconds
          PSU                smonctl -j, smonctl -j -s PSU[X]           10 seconds
          PSU Fan            smonctl -j, smonctl -j -s PSU[X]Fan[X]     10 seconds
          PSU Temperature    smonctl -j, smonctl -j -s PSU[X]Temp[X]    10 seconds
          Voltage            smonctl -j, smonctl -j -s Volt[X]          10 seconds
          Front Panel LED    ledmgrd -d, ledmgrd -j                     5 seconds

          Not all switch models include a sensor for monitoring power consumption and voltage. See this note for details.

          Hardware Logs Log Location Log Entries
          High temperature
          /var/log/syslog
          /usr/sbin/smond : : Temp1(Board Sensor near CPU): state changed from UNKNOWN to OK
          /usr/sbin/smond : : Temp2(Board Sensor Near Virtual Switch): state changed from UNKNOWN to OK
          /usr/sbin/smond : : Temp3(Board Sensor at Front Left Corner): state changed from UNKNOWN to OK
          /usr/sbin/smond : : Temp4(Board Sensor at Front Right Corner): state changed from UNKNOWN to OK
          /usr/sbin/smond : : Temp5(Board Sensor near Fan): state changed from UNKNOWN to OK
          Fan speed issues
          /var/log/syslog
          /usr/sbin/smond : : Fan1(Fan Tray 1, Fan 1): state changed from UNKNOWN to OK
          /usr/sbin/smond : : Fan2(Fan Tray 1, Fan 2): state changed from UNKNOWN to OK
          /usr/sbin/smond : : Fan3(Fan Tray 2, Fan 1): state changed from UNKNOWN to OK
          /usr/sbin/smond : : Fan4(Fan Tray 2, Fan 2): state changed from UNKNOWN to OK
          /usr/sbin/smond : : Fan5(Fan Tray 3, Fan 1): state changed from UNKNOWN to OK
          /usr/sbin/smond : : Fan6(Fan Tray 3, Fan 2): state changed from UNKNOWN to OK
          Fan direction issue
          /var/log/syslog
          /usr/sbin/smond : : Fan direction mismatch: 12 fans B2F; 1 fans F2B!
          PSU failure
          /var/log/syslog
          /usr/sbin/smond : : PSU1Fan1(PSU1 Fan): state changed from UNKNOWN to OK
          /usr/sbin/smond : : PSU2Fan1(PSU2 Fan): state changed from UNKNOWN to BAD

          System Data

          Cumulus Linux includes several ways to monitor system data. In addition, you can receive alerts in high risk situations.

          CPU Idle Time

          When a CPU reports five high CPU alerts within a span of five minutes, the switch logs an alert.

          Short bursts of high CPU can occur during switchd churn or routing protocol startup. Do not set alerts for these short bursts.
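
The "five alerts within five minutes" rule is a sliding-window count, which is also a useful way to keep your own alerting from reacting to short bursts. The sketch below shows the idea only. It is not how the switch implements the rule, and the class name is hypothetical.

```python
from collections import deque


class HighCpuWindow:
    """Count high-CPU events inside a sliding time window."""

    def __init__(self, threshold=5, window_seconds=300):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.events = deque()

    def record(self, timestamp):
        """Record one event; return True if the window now holds
        at least `threshold` events."""
        self.events.append(timestamp)
        # Drop events that fell out of the window.
        while self.events and timestamp - self.events[0] > self.window_seconds:
            self.events.popleft()
        return len(self.events) >= self.threshold


window = HighCpuWindow()
# Event times in seconds: five events one minute apart, then one much later.
results = [window.record(t) for t in (0, 60, 120, 180, 240, 900)]
print(results)
```

Only the fifth event raises the alert; the late sixth event starts a new window.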

          System Element    Monitoring Commands                                                  Interval Poll
          CPU utilization   NVUE: nv show system cpu; Linux: sudo cat /proc/stat, top -b -n 1    30 seconds

          CPU Logs Log Location Log Entries
          High CPU
          /var/log/syslog
          sysmonitor: Critically high CPU use: 99%
          systemd[1]: Starting Monitor system resources (cpu, memory, disk)…
          systemd[1]: Started Monitor system resources (cpu, memory, disk).
          sysmonitor: High CPU use: 89%
          systemd[1]: Starting Monitor system resources (cpu, memory, disk)…
          systemd[1]: Started Monitor system resources (cpu, memory, disk).
          sysmonitor: CPU use no longer high: 77%

          Cumulus Linux monitors CPU, memory, and disk space with sysmonitor. The configurations for the thresholds are in /etc/cumulus/sysmonitor.conf. For more information, see man sysmonitor.

          CPU measure Thresholds
          Use Alert: 90% Crit: 95%
          Process Load Alarm: 95% Crit: 125%
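
To relate the Use thresholds to the /proc/stat counters, the sketch below computes CPU use between two readings of the aggregate cpu line and classifies the result against the 90% and 95% values from the table. The function names and the sample readings are hypothetical, the greater-than-or-equal comparison is an assumption, and this is not the sysmonitor implementation.

```python
def cpu_times(stat_line):
    """Return (idle, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    fields = [int(value) for value in stat_line.split()[1:]]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice
    idle = fields[3] + (fields[4] if len(fields) > 4 else 0)
    # guest time is already included in user time, so sum the first eight fields.
    total = sum(fields[:8])
    return idle, total


def cpu_use_percent(before, after):
    """Percentage of non-idle time between two readings."""
    idle0, total0 = cpu_times(before)
    idle1, total1 = cpu_times(after)
    delta_total = total1 - total0
    if delta_total <= 0:
        return 0.0
    return 100.0 * (delta_total - (idle1 - idle0)) / delta_total


def classify(use_percent, alert=90.0, crit=95.0):
    """Map a CPU use percentage to ok, alert, or critical."""
    if use_percent >= crit:
        return "critical"
    if use_percent >= alert:
        return "alert"
    return "ok"


# Hypothetical readings taken some seconds apart.
before = "cpu 1000 0 500 8000 500 0 0 0 0 0"
after = "cpu 1900 0 560 8030 510 0 0 0 0 0"
use = cpu_use_percent(before, after)
level = classify(use)
print(use, level)
```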

          Spectrum 1 CPUs can become overloaded at moderate to high network scale. If your Spectrum 1 switch is not able to process CPU-destined traffic or is running continually at high CPU, either reduce the scale of the network where you deploy Spectrum 1 switches or replace the switch with a newer generation switch that offers stronger compute resources.

          Disk Usage

          To monitor disk utilization, run the NVUE nv show system disk usage command or the Linux sudo df command. The output shows the total storage capacity of each filesystem, the space currently in use, the free space available, the percentage of total capacity in use, and the directory or mount point where the filesystem is attached.

          cumulus@switch:~$ nv show system disk usage 
          Mount Point   Filesystem   Size   Used         Avail   Use% 
          -----------   ----------   --     ---------    ----    ---- 
          /             /dev/sda5    5.4G    3.0G        2.2G     58% 
          /dev          udev         2.0G    0           2.0G     0% 
          /dev/shm      tmpfs        2.1G    61M         2.0G     3% 
          /run          tmpfs        411M    38M         374M     10% 
          /run/lock     tmpfs        5.0M    0           5.0M     0% 
          /tmp          tmpfs        2.1G    12K         2.1G     1% 
          /vagrant      vagrant      4.3T    3.1T        1.3T     72% 
          

          When monitoring disk utilization with the Linux command, you can exclude the tmpfs filesystem with sudo df -x tmpfs.

          cumulus@switch:~$ sudo df -x tmpfs
          Filesystem     1K-blocks     Used  Available  Use%  Mounted on
          udev              867272        0     867272    0%  /dev
          /dev/vda5        5646348  2417272    2921624   46%  /
          /dev/vdb             354      354          0  100%  /mnt/air
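
For periodic checks, a script can parse the df output and flag filesystems that are nearly full. The sketch below reads the text from the example above; the function names and the 90% limit are hypothetical, so choose a limit that suits your environment.

```python
def parse_df(output):
    """Parse df output into a list of dicts, skipping the header line."""
    rows = []
    for line in output.strip().splitlines()[1:]:
        fields = line.split()
        if len(fields) < 6:
            continue
        rows.append({
            "filesystem": fields[0],
            "use_percent": int(fields[4].rstrip("%")),
            # Rejoin in case the mount point contains spaces.
            "mount": " ".join(fields[5:]),
        })
    return rows


def over_limit(rows, limit=90):
    """Return the mount points whose use is at or above the limit."""
    return [row["mount"] for row in rows if row["use_percent"] >= limit]


DF_OUTPUT = """\
Filesystem     1K-blocks     Used  Available  Use%  Mounted on
udev              867272        0     867272    0%  /dev
/dev/vda5        5646348  2417272    2921624   46%  /
/dev/vdb             354      354          0  100%  /mnt/air
"""
rows = parse_df(DF_OUTPUT)
full = over_limit(rows)
print(full)
```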
          

          Process Restart

          In Cumulus Linux, systemd monitors and restarts processes.

          To view processes that systemd monitors, run the systemctl status command.

          cumulus@switch:~$ systemctl status
          ● leaf01
              State: running
              Units: 521 loaded (incl. loaded aliases)
               Jobs: 0 queued
             Failed: 0 units
              Since: Wed 2024-11-13 19:16:28 UTC; 4 weeks 0 days ago
            systemd: 252.30-1~deb12u2
             CGroup: /
                     ├─1001 bpfilter_umh
                     ├─init.scope
                     │ └─1 /sbin/init
                     └─system.slice
                       ├─acpid.service
                       │ └─850 /usr/sbin/acpid
                       ├─auditd.service
                       │ └─373 /sbin/auditd
                       ├─cl-system-services.service
                       │ └─1182 /usr/sbin/cl_system_services -l INFO
                       ├─clagd.service
                       │ └─2550 /usr/bin/python3 -u /usr/sbin/clagd --daemon linklocal pe>
                       ├─cron.service
                       │ └─869 /usr/sbin/cron -f -L 38
                       ├─csmgrd.service
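
The first lines of the systemctl status output summarize the overall state and the number of failed units, which makes them a convenient health signal. The sketch below extracts both values from the text; it assumes the header layout shown above, and the function name is hypothetical.

```python
import re


def systemd_summary(status_output):
    """Extract the overall state and failed unit count from `systemctl status`."""
    state = re.search(r"^\s*State:\s*(\S+)", status_output, re.MULTILINE)
    failed = re.search(r"^\s*Failed:\s*(\d+)\s+units", status_output, re.MULTILINE)
    return {
        "state": state.group(1) if state else None,
        "failed_units": int(failed.group(1)) if failed else None,
    }


# Header lines copied from the example output.
STATUS = """\
    State: running
    Units: 521 loaded (incl. loaded aliases)
     Jobs: 0 queued
   Failed: 0 units
"""
summary = systemd_summary(STATUS)
print(summary)
```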
          

          Layer 1 Protocols and Interfaces

          Link and port state interface transitions log to /var/log/syslog and /var/log/switchd.log.

          Interface Element Monitoring Commands
          Link state NVUE: nv show interface <interface>

          Linux: sudo cat /sys/class/net/<interface>/operstate
          Link speed NVUE: nv show interface <interface>

          Linux: sudo cat /sys/class/net/<interface>/speed
          Port state NVUE: nv show interface

          Linux: ip link show
          Bond state NVUE: nv show interface <bond>

          Linux: sudo cat /proc/net/bonding/<bond>
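
The sysfs files in the table are plain text, so a poller can read them directly. The sketch below reads operstate and speed for one interface; the function name and the `sysfs_root` parameter are hypothetical, and the demo builds a temporary directory in place of /sys/class/net so that it runs on any machine.

```python
import tempfile
from pathlib import Path


def read_link(interface, sysfs_root="/sys/class/net"):
    """Read link state and speed (Mbps) for one interface from sysfs."""
    base = Path(sysfs_root) / interface
    state = (base / "operstate").read_text().strip()
    try:
        speed = int((base / "speed").read_text().strip())
    except (OSError, ValueError):
        # The speed file can be unreadable, for example when the link is down.
        speed = None
    return {"interface": interface, "operstate": state, "speed_mbps": speed}


# Demo against a temporary directory that mimics the sysfs layout.
with tempfile.TemporaryDirectory() as root:
    iface_dir = Path(root) / "swp1"
    iface_dir.mkdir()
    (iface_dir / "operstate").write_text("up\n")
    (iface_dir / "speed").write_text("1000\n")
    link = read_link("swp1", sysfs_root=root)

print(link)
```

On a switch, call `read_link("swp1")` without the `sysfs_root` argument to read the real files.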

          You obtain interface counters by querying either the hardware or the Linux kernel. The Linux kernel aggregates the output from the hardware.

          Interface Counter Element Monitoring Commands Interval Poll
          Interface counters NVUE: nv show interface <interface> counters

          Linux: cat /sys/class/net/<interface>/statistics/<statistic-name>
          cl-netstat -j
          ethtool -S <interface>
          10 seconds
          Layer 1 Logs Log Location Log Entries
          Link failure/Link flap
          /var/log/switchd.log
          switchd[5692]: nic.c:213 nic_set_carrier: swp17: setting kernel carrier: down
          switchd[5692]: netlink.c:291 libnl: swp1, family 0, ifi 20, oper down
          switchd[5692]: nic.c:213 nic_set_carrier: swp1: setting kernel carrier: up
          switchd[5692]: netlink.c:291 libnl: swp17, family 0, ifi 20, oper up
          Unidirectional link
          /var/log/switchd.log
          /var/log/ptm.log
          ptmd[7146]: ptm_bfd.c:2471 Created new session 0x1 with peer 10.255.255.11 port swp1
          ptmd[7146]: ptm_bfd.c:2471 Created new session 0x2 with peer fe80::4638:39ff:fe00:5b port swp1
          ptmd[7146]: ptm_bfd.c:2471 Session 0x1 down to peer 10.255.255.11, Reason 8
          ptmd[7146]: ptm_bfd.c:2471 Detect timeout on session 0x1 with peer 10.255.255.11, in state 1
          Bond Negotiation Working
          /var/log/syslog
          kernel: [85412.763193] bonding: bond0 is being created…
          kernel: [85412.770014] bond0: Enslaving swp2 as a backup interface with an up link
          kernel: [85412.775216] bond0: Enslaving swp1 as a backup interface with an up link
          kernel: [85412.797393] IPv6: ADDRCONF(NETDEV_UP): bond0: link is not ready
          kernel: [85412.799425] IPv6: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
          Bond Negotiation Failing
          /var/log/syslog
          kernel: [85412.763193] bonding: bond0 is being created…
          kernel: [85412.770014] bond0: Enslaving swp2 as a backup interface with an up link
          kernel: [85412.775216] bond0: Enslaving swp1 as a backup interface with an up link
          kernel: [85412.797393] IPv6: ADDRCONF(NETDEV_UP): bond0: link is not ready
          MLAG peerlink negotiation Working
          /var/log/syslog
          lldpd[998]: error while receiving frame on swp50: Network is down
          lldpd[998]: error while receiving frame on swp49: Network is down
          kernel: [76174.262893] peerlink: Setting ad_actor_system to 44:38:39:00:00:11
          kernel: [76174.264205] 8021q: adding VLAN 0 to HW filter on device peerlink
          mstpd: one_clag_cmd: setting (1) peer link: peerlink
          mstpd: one_clag_cmd: setting (1) clag state: up
          mstpd: one_clag_cmd: setting system-mac 44:38:39:ff:40:94
          mstpd: one_clag_cmd: setting clag-role secondary
          /var/log/clagd.log
          clagd[14003]: Cleanup is executing.
          clagd[14003]: Cannot open file "/tmp/pre-clagd.q7XiO
          clagd[14003]: Cleanup is finished
          clagd[14003]: Beginning execution of clagd version 1
          clagd[14003]: Invoked with: /usr/sbin/clagd --daemon
          clagd[14003]: Role is now secondary
          clagd[14003]: HealthCheck: role via backup is second
          clagd[14003]: HealthCheck: backup active
          clagd[14003]: Initial config loaded
          clagd[14003]: The peer switch is active.
          clagd[14003]: Initial data sync from peer done.
          clagd[14003]: Initial handshake done.
          clagd[14003]: Initial data sync to peer done.
          MLAG peerlink negotiation Failing
          /var/log/syslog
          lldpd[998]: error while receiving frame on swp50: Network is down
          lldpd[998]: error while receiving frame on swp49: Network is down
          kernel: [76174.262893] peerlink: Setting ad_actor_system to 44:38:39:00:00:11
          kernel: [76174.264205] 8021q: adding VLAN 0 to HW filter on device peerlink
          mstpd: one_clag_cmd: setting (1) peer link: peerlink
          mstpd: one_clag_cmd: setting (1) clag state: down
          mstpd: one_clag_cmd: setting system-mac 44:38:39:ff:40:94
          mstpd: one_clag_cmd: setting clag-role secondary
          /var/log/clagd.log
          clagd[26916]: Cleanup is executing.
          clagd[26916]: Cannot open file "/tmp/pre-clagd.6M527vvGX0/brbatch" for reading: No such file or directory
          clagd[26916]: Cleanup is finished
          clagd[26916]: Beginning execution of clagd version 1.3.0
          clagd[26916]: Invoked with: /usr/sbin/clagd --daemon 169.254.1.2 peerlink.4094 44:38:39:FF:01:01 --priority 1000 --backupIp 10.0.0.2
          clagd[26916]: Role is now secondary
          clagd[26916]: Initial config loaded
          MLAG port negotiation Working
          /var/log/syslog
          kernel: [77419.112195] bonding: server01 is being created…
          lldpd[998]: error while receiving frame on swp1: Network is down
          kernel: [77419.122707] 8021q: adding VLAN 0 to HW filter on device swp1
          kernel: [77419.126408] server01: Enslaving swp1 as a backup interface with a down link
          kernel: [77419.177175] server01: Setting ad_actor_system to 44:38:39:ff:40:94
          kernel: [77419.190874] server01: Warning: No 802.3ad response from the link partner for any adapters in the bond
          kernel: [77419.191448] IPv6: ADDRCONF(NETDEV_UP): server01: link is not ready
          kernel: [77419.191452] 8021q: adding VLAN 0 to HW filter on device server01
          kernel: [77419.192060] server01: link status definitely up for interface swp1, 1000 Mbps full duplex
          kernel: [77419.192065] server01: now running without any active interface!
          kernel: [77421.491811] IPv6: ADDRCONF(NETDEV_CHANGE): server01: link becomes ready
          mstpd: one_clag_cmd: setting (1) mac 44:38:39:00:00:17 <server01, None>
          /var/log/clagd.log
          clagd[14003]: server01 is now dual connected.
          MLAG port negotiation Failing
          /var/log/syslog
          kernel: [79290.290999] bonding: server01 is being created…
          kernel: [79290.299645] 8021q: adding VLAN 0 to HW filter on device swp1
          kernel: [79290.301790] server01: Enslaving swp1 as a backup interface with a down link
          kernel: [79290.358294] server01: Setting ad_actor_system to 44:38:39:ff:40:94
          kernel: [79290.373590] server01: Warning: No 802.3ad response from the link partner for any adapters in the bond
          kernel: [79290.374024] IPv6: ADDRCONF(NETDEV_UP): server01: link is not ready
          kernel: [79290.374028] 8021q: adding VLAN 0 to HW filter on device server01
          kernel: [79290.375033] server01: link status definitely up for interface swp1, 1000 Mbps full duplex
          kernel: [79290.375037] server01: now running without any active interface!
          /var/log/clagd.log
          clagd[14291]: Conflict (server01): matching clag-id (1) not configured on peer…
          clagd[14291]: Conflict cleared (server01): matching clag-id (1) detected on peer
          MLAG port negotiation Flapping
          /var/log/syslog
          mstpd: one_clag_cmd: setting (0) mac 00:00:00:00:00:00 <server01, None>
          mstpd: one_clag_cmd: setting (1) mac 44:38:39:00:00:03 <server01, None>
          /var/log/clagd.log
          clagd[14291]: server01 is no longer dual connected
          clagd[14291]: server01 is now dual connected.
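
The flapping pattern in the clagd.log entries above can be detected by counting dual-connected state changes per bond. The following is a minimal sketch, not a supported tool: the regular expression is derived only from the sample entries shown here, and the function names and the threshold of two changes are hypothetical choices for illustration.

```python
import re
from collections import Counter

# Pattern derived from the clagd.log sample entries in this section.
DUAL_STATE = re.compile(r"clagd\[\d+\]: (\S+) is (now|no longer) dual connected")


def count_dual_state_changes(lines):
    """Count dual-connected state changes per bond in clagd.log lines."""
    changes = Counter()
    for line in lines:
        match = DUAL_STATE.search(line)
        if match:
            changes[match.group(1)] += 1
    return changes


def flapping_bonds(lines, threshold=2):
    """Return bonds whose change count meets the threshold (an assumed value)."""
    changes = count_dual_state_changes(lines)
    return sorted(bond for bond, count in changes.items() if count >= threshold)


sample = [
    "clagd[14291]: server01 is no longer dual connected",
    "clagd[14291]: server01 is now dual connected.",
]
print(flapping_bonds(sample))
```

In practice you would feed the function only the lines from a recent time window, so that a single planned maintenance event is not reported as flapping.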

          PTM uses LLDP information to compare against a topology.dot file that describes the network. It has built-in alerting capabilities. Use PTM on the switch instead of polling LLDP information regularly. You can install PTM from the Cumulus Linux GitHub repository.

          Consider tracking peering information through PTM. For more information, refer to the Prescriptive Topology Manager documentation.

          Neighbor Element | Monitoring Commands | Interval Poll
          LLDP Neighbor | sudo lldpctl -f json | 300 seconds
          Prescriptive Topology Manager | ptmctl -j | Triggered
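
If you poll LLDP instead of using PTM, the useful signal is the difference between two consecutive polls. The sketch below compares two neighbor snapshots. It assumes you have already reduced the `lldpctl -f json` output to a simple mapping of local port to neighbor name; the JSON layout itself is not assumed here, and the function name is hypothetical.

```python
def neighbor_changes(previous, current):
    """Compare two {local_port: neighbor_name} snapshots.

    Returns {port: (before, after)} for every port whose neighbor changed.
    A value of None means no neighbor was seen on that port in that poll.
    """
    changes = {}
    for port in sorted(set(previous) | set(current)):
        before = previous.get(port)
        after = current.get(port)
        if before != after:
            changes[port] = (before, after)
    return changes


# Hypothetical snapshots taken 300 seconds apart.
earlier = {"swp1": "server01", "swp49": "leaf02"}
later = {"swp1": "server01", "swp50": "leaf02"}
print(neighbor_changes(earlier, later))
```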

          Layer 2 Protocols

          Spanning tree is a protocol that prevents loops in a layer 2 infrastructure. In a stable state, the spanning tree protocol converges. Monitor the Topology Change Notifications (TCN) in STP to identify when new BPDUs arrive.

          Interface Counter Element | Monitoring Commands | Interval Poll
          STP TCN Transitions | NVUE: nv show bridge domain <bridge> stp; Linux: mstpctl showbridge json, mstpctl showport | 60 seconds
          MLAG peer state | NVUE: nv show mlag; Linux: clagctl status, sudo clagd -j, sudo cat /var/log/clagd.log | 60 seconds
          MLAG peer MACs | NVUE: nv show mlag; Linux: clagctl dumppeermacs, clagctl dumpourmacs | 300 seconds
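
The poll intervals above can drive a simple scheduler that decides which commands are due on each pass. This is a sketch only: the `POLL_SCHEDULE` mapping picks one Linux command per element from the table, and the function name and data shapes are assumptions for illustration. Running the commands themselves is left to the caller.

```python
# Command -> poll interval in seconds, taken from the table in this section.
POLL_SCHEDULE = {
    "mstpctl showbridge json": 60,
    "clagctl status": 60,
    "clagctl dumppeermacs": 300,
}


def due_commands(schedule, last_run, now):
    """Return the commands whose interval has elapsed since their last run.

    last_run maps a command to the timestamp (in seconds) of its last run.
    Commands that have never run are always due.
    """
    due = []
    for command, interval in schedule.items():
        previous = last_run.get(command)
        if previous is None or now - previous >= interval:
            due.append(command)
    return due


# All three commands ran at t=0; at t=60 only the 60-second polls are due.
ran_at_zero = {command: 0 for command in POLL_SCHEDULE}
print(due_commands(POLL_SCHEDULE, ran_at_zero, now=60))
```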
          Layer 2 Logs Log Location Log Entries
          Spanning Tree Working
          /var/log/syslog
          kernel: [1653877.190724] device swp1 entered promiscuous mode
          kernel: [1653877.190796] device swp2 entered promiscuous mode
          mstpd: create_br: Add bridge bridge
          mstpd: clag_set_sys_mac_br: set bridge mac 00:00:00:00:00:00
          mstpd: create_if: Add iface swp1 as port#2 to bridge bridge
          mstpd: set_if_up: Port swp1 : up
          mstpd: create_if: Add iface swp2 as port#1 to bridge bridge
          mstpd: set_if_up: Port swp2 : up
          mstpd: set_br_up: Set bridge bridge up
          mstpd: MSTP_OUT_set_state: bridge:swp1:0 entering blocking state(Disabled)
          mstpd: MSTP_OUT_set_state: bridge:swp2:0 entering blocking state(Disabled)
          mstpd: MSTP_OUT_flush_all_fids: bridge:swp1:0 Flushing forwarding database
          mstpd: MSTP_OUT_flush_all_fids: bridge:swp2:0 Flushing forwarding database
          mstpd: MSTP_OUT_set_state: bridge:swp1:0 entering learning state(Designated)
          mstpd: MSTP_OUT_set_state: bridge:swp2:0 entering learning state(Designated)
          sudo: pam_unix(sudo:session): session closed for user root
          mstpd: MSTP_OUT_set_state: bridge:swp1:0 entering forwarding state(Designated)
          mstpd: MSTP_OUT_set_state: bridge:swp2:0 entering forwarding state(Designated)
          mstpd: MSTP_OUT_flush_all_fids: bridge:swp2:0 Flushing forwarding database
          mstpd: MSTP_OUT_flush_all_fids: bridge:swp1:0 Flushing forwarding database
          Spanning Tree Blocking
          /var/log/syslog
          mstpd: MSTP_OUT_set_state: bridge:swp2:0 entering blocking state(Designated)
          mstpd: MSTP_OUT_set_state: bridge:swp2:0 entering learning state(Designated)
          mstpd: MSTP_OUT_set_state: bridge:swp2:0 entering forwarding state(Designated)
          mstpd: MSTP_OUT_flush_all_fids: bridge:swp2:0 Flushing forwarding database
          mstpd: MSTP_OUT_flush_all_fids: bridge:swp2:0 Flushing forwarding database
          mstpd: MSTP_OUT_set_state: bridge:swp2:0 entering blocking state(Alternate)
          mstpd: MSTP_OUT_flush_all_fids: bridge:swp2:0 Flushing forwarding database
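
To turn the mstpd entries above into an alert, you can track the most recent state of each port and report the ports that ended up blocking. The following sketch relies only on the message shape shown in these samples. The function names are hypothetical, and treating the third field (the `0` in `bridge:swp2:0`) as a spanning tree instance number is an assumption.

```python
import re

# bridge:port:instance entering <state> state(<role>), as in the samples.
STP_STATE = re.compile(
    r"MSTP_OUT_set_state: ([^:\s]+):([^:\s]+):(\d+) entering (\w+) state\((\w+)\)"
)


def last_port_states(lines):
    """Return {port: (state, role)} from the most recent entry for each port."""
    states = {}
    for line in lines:
        # findall also copes with several entries fused onto one line.
        for _bridge, port, _instance, state, role in STP_STATE.findall(line):
            states[port] = (state, role)
    return states


def blocked_ports(lines):
    """Return the ports whose most recent logged state is blocking."""
    states = last_port_states(lines)
    return sorted(port for port, (state, _role) in states.items() if state == "blocking")


sample = [
    "mstpd: MSTP_OUT_set_state: bridge:swp1:0 entering forwarding state(Designated)",
    "mstpd: MSTP_OUT_set_state: bridge:swp2:0 entering forwarding state(Designated)",
    "mstpd: MSTP_OUT_set_state: bridge:swp2:0 entering blocking state(Alternate)",
]
print(blocked_ports(sample))
```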

          Layer 3 Protocols

          When FRR boots up for the first time, there is a different log file for each activated daemon. If you edit the log file (for example, through vtysh or frr.conf), the integrated configuration sends all logs to the same file.

          To send FRR logs to syslog, apply the configuration log syslog in vtysh.

          BGP

          When monitoring BGP, check if BGP peers are operational. There is not much value in alerting on the current operational state of the peer; monitoring the transition is more valuable, which you can do by monitoring syslog.

          Monitoring the routing table provides trending on the size of the infrastructure. This is useful when you integrate with host-based solutions (such as Routing on the Host), where the number of routes tracks the number of available applications.

          BGP Element | Monitoring Commands | Interval Poll
          BGP peer failure | sudo vtysh -c "show ip bgp summary json" | 60 seconds
          BGP route table | sudo vtysh -c "show ip bgp json" | 600 seconds
          BGP Logs Log Location Log Entries
          BGP peer down
          /var/log/syslog
          /var/log/frr/*.log
          bgpd[3000]: %NOTIFICATION: sent to neighbor swp1 4/0 (Hold Timer Expired) 0 bytes
          bgpd[3000]: %ADJCHANGE: neighbor swp1 Down BGP Notification send
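
Because the transition matters more than the current state, a log watcher can extract `%ADJCHANGE` events from syslog. The sketch below is based only on the sample entry shown above. It assumes that Up transitions use the same message shape as the Down example, which you should confirm against your own logs; the function name is hypothetical.

```python
import re

# neighbor <name> <Up|Down> <optional reason>, as in the sample entry.
ADJCHANGE = re.compile(r"%ADJCHANGE: neighbor (\S+) (Up|Down)\b\s*(.*)")


def bgp_transitions(lines):
    """Return (neighbor, direction, reason) for each ADJCHANGE entry."""
    events = []
    for line in lines:
        match = ADJCHANGE.search(line)
        if match:
            neighbor, direction, reason = match.groups()
            events.append((neighbor, direction, reason.strip()))
    return events


sample = [
    "bgpd[3000]: %NOTIFICATION: sent to neighbor swp1 4/0 (Hold Timer Expired) 0 bytes",
    "bgpd[3000]: %ADJCHANGE: neighbor swp1 Down BGP Notification send",
]
print(bgp_transitions(sample))
```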

          OSPF

          When monitoring OSPF, check if OSPF peers are operational. There is not much value in alerting on the current operational state of the peer; monitoring the transition is more valuable, which you can do by monitoring syslog.

          Monitoring the routing table provides trending on the size of the infrastructure. This is useful when you integrate with host-based solutions (such as Routing on the Host), where the number of routes tracks the number of available applications.

          OSPF Element | Monitoring Commands | Interval Poll
          OSPF protocol peer failure | sudo vtysh -c "show ip ospf neighbor all json", cl-ospf summary show json | 60 seconds
          OSPF link state database | sudo vtysh -c "show ip ospf database" | 600 seconds

          Route and Host Entries

          Route Element | Monitoring Commands | Interval Poll
          Host Entries | cl-resource-query, cl-resource-query -k | 600 seconds
          Route Entries | cl-resource-query, cl-resource-query -k | 600 seconds

          Routing Logs

          Layer 3 Logs Log Location Log Entries
          Routing protocol process crash
          /var/log/syslog
          frrouting[1824]: Starting FRRouting daemons (prio:10):. zebra. bgpd.
          bgpd[1847]: BGPd 1.0.0+cl3u7 starting: vty@2605, bgp@:179
          zebra[1840]: client 12 says hello and bids fair to announce only bgp routes
          watchfrr[1853]: watchfrr 1.0.0+cl3u7 watching [zebra bgpd], mode [phased zebra restart]
          watchfrr[1853]: bgpd state -> up : connect succeeded
          watchfrr[1853]: bgpd state -> down : read returned EOF
          cumulus-core: Running cl-support for core files bgpd.3030.1470341944.core.core_helper
          core_check.sh[4992]: Please send /var/support/cl_support__spine01_20160804_201905.tar.xz to Cumulus support
          watchfrr[1853]: Forked background command [pid 6665]: /usr/sbin/service frr restart bgpd
          watchfrr[1853]: watchfrr 0.99.24+cl3u2 watching [zebra bgpd ospfd], mode [phased zebra restart]
          watchfrr[1853]: zebra state -> up : connect succeeded
          watchfrr[1853]: bgpd state -> up : connect succeeded
          watchfrr[1853]: watchfrr: Notifying Systemd we are up and running
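
A routing daemon crash shows up in the entries above as a watchfrr state change to down. The following sketch pulls those events out of syslog lines. The pattern is derived from the sample entries only, and the function name is hypothetical.

```python
import re

# <daemon> state -> <up|down> : <reason>, as logged by watchfrr in the samples.
WATCHFRR_STATE = re.compile(r"watchfrr\[\d+\]: (\w+) state -> (up|down) : (.+)")


def daemon_down_events(lines):
    """Return (daemon, reason) for every watchfrr transition to down."""
    events = []
    for line in lines:
        match = WATCHFRR_STATE.search(line)
        if match and match.group(2) == "down":
            events.append((match.group(1), match.group(3).strip()))
    return events


sample = [
    "watchfrr[1853]: bgpd state -> up : connect succeeded",
    "watchfrr[1853]: bgpd state -> down : read returned EOF",
    "watchfrr[1853]: zebra state -> up : connect succeeded",
]
print(daemon_down_events(sample))
```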

          Logging

          The table below describes the various log files.

          Logging Element Monitoring Commands Log Location
          syslog Catch all log file. Identifies memory leaks and CPU spikes.
          /var/log/syslog
          switchd functionality Hardware Abstraction Layer (HAL).
          /var/log/switchd.log
          Routing daemons FRR zebra daemon details.
          /var/log/daemon.log
          Routing protocol The log file is configurable in FRR. When FRR first boots, it uses the non-integrated configuration so each routing protocol has its own log file. After booting up, FRR switches over to using the integrated configuration, so that all logs go to a single place.

          To edit the location of the log files, use the log file command. By default, Cumulus Linux does not send FRR logs to syslog. Use the log syslog command to send logs through rsyslog and into /var/log/syslog.

          Note: To write syslog debug messages to the log file, you must run the log syslog debug command to configure FRR with syslog severity 7 (debug); otherwise, when you issue a debug command such as debug bgp neighbor-events, no output logs to /var/log/frr/frr.log.

          However, when you manually define a log target with the log file /var/log/frr/debug.log command, FRR automatically defaults to severity 7 (debug) logging and the output logs to /var/log/frr/debug.log.
          /var/log/frr/zebra.log
          /var/log/frr/.log
          /var/log/frr/frr.log

          Device Management

          Device Access Logs

          Access Logs Log Location Log Entries
          User Authentication and Remote Login
          /var/log/syslog
          sshd[31830]: Accepted publickey for cumulus from 192.168.0.254 port 45582 ssh2: RSA 38:e6:3b:cc:04:ac:41:5e:c9:e3:93:9d:cc:9e:48:25
          sshd[31830]: pam_unix(sshd:session): session opened for user cumulus by (uid=0)
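
For access auditing, the accepted-login entries above can be reduced to structured records. This is a sketch based on the sample sshd entry; the function name and the dictionary keys are hypothetical.

```python
import re

# Accepted <method> for <user> from <address> port <port>, as in the sample.
ACCEPTED = re.compile(
    r"sshd\[\d+\]: Accepted (\S+) for (\S+) from (\S+) port (\d+)"
)


def accepted_logins(lines):
    """Return one record per accepted SSH login found in the given lines."""
    logins = []
    for line in lines:
        match = ACCEPTED.search(line)
        if match:
            method, user, address, port = match.groups()
            logins.append(
                {"method": method, "user": user, "address": address, "port": int(port)}
            )
    return logins


sample = [
    "sshd[31830]: Accepted publickey for cumulus from 192.168.0.254 port 45582 ssh2: RSA 38:e6:3b:cc:04:ac:41:5e:c9:e3:93:9d:cc:9e:48:25",
    "sshd[31830]: pam_unix(sshd:session): session opened for user cumulus by (uid=0)",
]
print(accepted_logins(sample))
```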

          Device Super User Command Logs

          Super User Command Logs Log Location Log Entries
          Executing commands using sudo
          /var/log/syslog
          sudo: cumulus: TTY=unknown ; PWD=/home/cumulus ; USER=root ; COMMAND=/tmp/script_9938.sh -v
          sudo: pam_unix(sudo:session): session opened for user root by (uid=0)
          sudo: pam_unix(sudo:session): session closed for user root
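
The sudo command entry above carries the invoking user, the target user, and the full command line. The sketch below extracts those fields. It is derived from the sample entry only; the function name and the dictionary keys are hypothetical, and the pattern tolerates the extra spacing that some syslog configurations put around the user name.

```python
import re

# sudo: <user>: TTY=... ; PWD=... ; USER=... ; COMMAND=..., as in the sample.
SUDO_COMMAND = re.compile(
    r"sudo:\s+([^\s:]+)\s*: TTY=(\S+) ; PWD=(\S+) ; USER=(\S+) ; COMMAND=(.+)"
)


def sudo_commands(lines):
    """Return one record per sudo command entry found in the given lines."""
    records = []
    for line in lines:
        match = SUDO_COMMAND.search(line)
        if match:
            user, tty, pwd, run_as, command = match.groups()
            records.append(
                {"user": user, "tty": tty, "pwd": pwd, "run_as": run_as, "command": command.strip()}
            )
    return records


sample = [
    "sudo: cumulus: TTY=unknown ; PWD=/home/cumulus ; USER=root ; COMMAND=/tmp/script_9938.sh -v",
    "sudo: pam_unix(sudo:session): session opened for user root by (uid=0)",
    "sudo: pam_unix(sudo:session): session closed for user root",
]
print(sudo_commands(sample))
```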

          switchd Log Message Reference

          The following table lists the log messages generated by switchd, organized by severity, then message text. These messages appear in /var/log/switchd.log.

          Severity Message Text Explanation Recommended Action
          CRITICAL _port_group_config_values_get: hal_list_get failed on [str] List create failed. File a ticket with Cumulus Support.
          CRITICAL _range_limits_get: start linux interface name buffer is NULL Invalid parameter. File a ticket with Cumulus Support.
          CRITICAL _range_limits_get: end linux interface name buffer is NULL Invalid parameter. File a ticket with Cumulus Support.
          CRITICAL _range_limits_get: [str]-[str] not recognized Invalid port set configuration. Check QoS configuration file.
          CRITICAL _range_limits_get: port range [str] not recognized Invalid port set configuration. Check QoS configuration file.
          CRITICAL _port_group_ports_set: hal_list_get failed on [str] Port set list create failed. Check QoS configuration file.
          CRITICAL _port_group_name_list_get: hal_list_get failed on [str] List create failed. Check QoS configuration file.
          CRITICAL _port_group_range_translate: _get_range_limits failed on [str] Invalid port set configuration. Check QoS configuration file.
          CRITICAL _priority_group_config_get: hal_list_get failed on [str] Configuration list create failed. File a ticket with Cumulus Support.
          CRITICAL hal_list_get: list string [str] contains more elements than the maximum allowed ([int]) List capacity exceeded. File a ticket with Cumulus Support.
          CRITICAL hal_sh_datapath_file_read: could not load config file [str] Could not load the back end QoS configuration file. Check backend QoS configuration file.
          CRITICAL Unable to reallocate [int] bytes of memory Memory allocation failed. File a ticket with Cumulus Support.
          CRITICAL No backends found. No back ends found. File a ticket with Cumulus Support.
          CRITICAL License: email is longer than [int] characters Email length exceeds maximum. Modify email address.
          CRITICAL License: license data is longer than [int] License data exceeds maximum. Check license.
          CRITICAL License: Invalid format Invalid license format. Check license.
          CRITICAL No license file. No license file found. Check license file.
          CRITICAL The Cumulus Linux license appears to be invalid.
          This WILL NOT affect your system operations at the moment. Future versions will enforce fully valid licenses on the system.
          Please contact licensing@cumulusnetworks.com at your convenience so we can validate and assist you with this licensing issue.
          Invalid license. Check license.
          CRITICAL No license file. No license file found. Check license file.
          CRITICAL Incomplete license. Incomplete license. Check license.
          CRITICAL License is expired! License is expired. Renew license.
          CRITICAL unable to get tap_name for port [uint] Port config failed: no port name. File a ticket with Cumulus Support.
          CRITICAL Voluntary restart by timestamp check requested Voluntary switchd restart. None.
          CRITICAL Couldn’t write ready file [str] Could not mark switchd startup complete. File a ticket with Cumulus Support.
          CRITICAL Could not open [str] to record error type Could not report restart reason. File a ticket with Cumulus Support.
          CRITICAL Error setting signal handlers. Signal handler initialization failed. File a ticket with Cumulus Support.
          CRITICAL No license to run switchd! No switchd license is installed. Install switchd license.
          CRITICAL daemon call failed with rv [int] switchd could not be daemonized. File a ticket with Cumulus Support.
          CRITICAL Couldn’t write pid file [str] Could not write out the process ID. File a ticket with Cumulus Support.
          CRITICAL Switchd fs init failed. Failed to initialize the switchd file system. File a ticket with Cumulus Support.
          CRITICAL Switchd config failed. Could not load the switchd configuration file. Check switchd configuration file.
          CRITICAL Netlink init failed. Netlink initialization failed. File a ticket with Cumulus Support.
          CRITICAL HAL init failed. HAL initialization failed. File a ticket with Cumulus Support.
          CRITICAL NIC init failed. NIC initialization failed. File a ticket with Cumulus Support.
          CRITICAL Port init failed. Port initialization failed. File a ticket with Cumulus Support.
          CRITICAL Bridges init failed. Bridges initialization failed. File a ticket with Cumulus Support.
          CRITICAL Bonds init failed. Bonds initialization failed. File a ticket with Cumulus Support.
          CRITICAL Logical networks init failed. Logical networks initialization failed. File a ticket with Cumulus Support.
          CRITICAL Interface list init failed. Interface list initialization failed. File a ticket with Cumulus Support.
          CRITICAL Switchd fs mount failed. Could not mount switchd file system. File a ticket with Cumulus Support.
          CRITICAL Failed to add route [str] Failure in VRF route leak feature. This message notifies that a route entry could not be properly added to one of the software tables. File a ticket with Cumulus Support.
          CRITICAL MAC address [str] couldn’t be added to or retrieved from hash Relates to merging MAC tables. The message notifies that an entry expected in a MAC address software table is not found therein. This should never be seen. File a ticket with Cumulus Support.
          CRITICAL Failed to add route [str] Add MPLS transit LSP to a software table failed. File a ticket with Cumulus Support.
          CRITICAL [str]: hal port list malloc failed Memory exhausted. File a ticket with Cumulus Support.
          CRITICAL Failed to add [str] to [str] Adding the port to a software table failed. File a ticket with Cumulus Support.
          CRITICAL Failed to add hal mroute for [str] Failed to add multicast route to a software table. File a ticket with Cumulus Support.
          CRITICAL Failed to add [str] to [str] Adding the port to a software table failed. File a ticket with Cumulus Support.
          CRITICAL Failed to add grp [str] to mroute A multicast route for a group could not be added to a software hash table. File a ticket with Cumulus Support.
          CRITICAL Maximum number of bonds exceeded, max is [int] Maximum number of bonds exceeded. Reduce the number of bonds on the switch.
          CRITICAL Maximum number of slaves per bond exceeded, max is [int] Maximum number of bond members per bond exceeded. Reduce the number of bond members configured for a bond.
          CRITICAL rtnl slave state get failed for bond:[int] port: [int] Could not get the bond member state from Netlink. File a ticket with Cumulus Support.
          CRITICAL Failed to add route [str] Failed to add route to a software table. File a ticket with Cumulus Support.
          CRITICAL Failed to add [str] to bridge [int] Failed to add a port to the bridge. File a ticket with Cumulus Support.
          CRITICAL Failed to add [str] to grp [str] Failed to add a port to the MDB group. File a ticket with Cumulus Support.
          CRITICAL Failed to add [str] to bridge [int] Failed to add an MDB group to the bridge. File a ticket with Cumulus Support.
          CRITICAL Failed to add bridge [int] to mdb Failed to add a given bridge to MDB. File a ticket with Cumulus Support.
          CRITICAL Failed to add port [str] in grp [str], bridge [int] Failed to add a port to a group for the specific bridge. File a ticket with Cumulus Support.
          CRITICAL Failed to add [str] to bridge [int] Failed to add a port to the bridge. File a ticket with Cumulus Support.
          CRITICAL Failed to add port [str] in grp [str] Failed to add a port to the MDB group. File a ticket with Cumulus Support.
          CRITICAL Failed to add [str] to bridge [int] Failed to add a port to the bridge. File a ticket with Cumulus Support.
          CRITICAL Failed to add bridge [str] to mdb Failed to add a given bridge to MDB. File a ticket with Cumulus Support.
          CRITICAL Failed to add [str] to bridge [int] Failed to add a port to the bridge. File a ticket with Cumulus Support.
          CRITICAL arptables: Memory allocation for rules failed,malloc: [str] ACL out of memory resource. File a ticket with Cumulus Support.
          CRITICAL Failed to create kernel bridge L2: Bridge hash table add failed. File a ticket with Cumulus Support.
          CRITICAL Open of /dev/net/tun failed: [str] Failed to create a net device. File a ticket with Cumulus Support.
          CRITICAL TUNSETIFF failed: [str] Failed to create a net device. File a ticket with Cumulus Support.
          CRITICAL SIOCGIFHWADDR failed: [str] Failed to create a net device. File a ticket with Cumulus Support.
          CRITICAL SIOCSIFHWADDR failed: [str] Failed to create a net device. File a ticket with Cumulus Support.
          CRITICAL TUNSETPERSIST failed: [str] Failed to create a net device. File a ticket with Cumulus Support.
          CRITICAL TUNSETOFFLOAD failed: [str] Failed to create a net device. File a ticket with Cumulus Support.
          CRITICAL Couldn’t create tuntap ioctl socket. Failed to create a net device. File a ticket with Cumulus Support.
          CRITICAL Couldn’t get netdev flags. Failed to create a net device. File a ticket with Cumulus Support.
          CRITICAL Couldn’t Set netdev flags. Failed to create a net device. File a ticket with Cumulus Support.
          CRITICAL [str]: rtnl_link_alloc failed for family [int] Failed to create filters. File a ticket with Cumulus Support.
          CRITICAL [str]: rtnl_neigh_alloc failed for family [int] Failed to create filters. File a ticket with Cumulus Support.
          CRITICAL Failed to blacklist interface [int] Failed to block interfaces. Check block interfaces.
          CRITICAL Failed to blacklist interface [int] Failed to block interfaces. Check block interfaces.
          CRITICAL Failed to add [str], ifindex [int] to sw_intfs Failed to create interfaces. Recreate the interface.
          CRITICAL Failed to delete ifindex [int] from sw_intfs Failed to create interfaces. Recreate the interface.
          CRITICAL [str]: could not load interface config Syncing database failed between kernel and switchd. File a ticket with Cumulus Support.
          CRITICAL bogus filesystem path: [str] Failed to add a file in SFS (simple file system). File a ticket with Cumulus Support.
          CRITICAL Need file spec Failed to add a file in SFS (simple file system). File a ticket with Cumulus Support.
          CRITICAL can’t replace existing directory with file: [str] Failed to add a file in SFS (simple file system). File a ticket with Cumulus Support.
          CRITICAL filesystem already initialized Failed to initialize in SFS (simple file system). File a ticket with Cumulus Support.
          CRITICAL filesystem hash table alloc failed Failed to allocate hash table in SFS init. File a ticket with Cumulus Support.
          CRITICAL filesystem mount failed Failed to mount SFS in switchd init. File a ticket with Cumulus Support.
          CRITICAL filesystem new failed Failed to mount SFS in switchd init. File a ticket with Cumulus Support.
          CRITICAL bogus filesystem path: [str] Failed to delete filesystem in SFS. File a ticket with Cumulus Support.
          CRITICAL pthread_create failed: [str] Failed to create a thread in NIC init. File a ticket with Cumulus Support.
          CRITICAL pthread_detach failed: [str] Failed to detach a thread in NIC init. File a ticket with Cumulus Support.
          CRITICAL TX Ring allocation failed: [str] Failed to allocate packet buffer in NIC init. File a ticket with Cumulus Support.
          CRITICAL Couldn’t increase netlink rbuf size: [str] Failed to init buffer size in Netlink socket. File a ticket with Cumulus Support.
          CRITICAL Couldn’t increase netlink wbuf size: [str] Failed to init buffer size in Netlink socket. File a ticket with Cumulus Support.
          CRITICAL Couldn’t allocate netlink socket. Failed to create a Netlink socket. File a ticket with Cumulus Support.
          CRITICAL Couldn’t connect netlink socket: [str] Failed to create a Netlink socket. File a ticket with Cumulus Support.
          CRITICAL nl_resync_route failed for cache [int]: [str] Failed to create a resync router function in Netlink init. File a ticket with Cumulus Support.
          CRITICAL Couldn’t set bufsize for manager netlink socket. Failed to reinitialize buffer size in Netlink socket. File a ticket with Cumulus Support.
          CRITICAL invalid cache mngrinfo. Failed to create a resync in Netlink idle callback. File a ticket with Cumulus Support.
          CRITICAL [str]: failed to close socket: [str] Failed to configure FD in Netlink socket. File a ticket with Cumulus Support.
          CRITICAL [str]: nl_cache_mngr_data_ready failed: [str] Failed to configure FD in Netlink socket. File a ticket with Cumulus Support.
          CRITICAL Couldn’t allocate netlink socket. Failed to create a socket in Netlink init. File a ticket with Cumulus Support.
          CRITICAL Couldn’t allocate netlink socket. Failed to create a socket in Netlink init. File a ticket with Cumulus Support.
          CRITICAL Couldn’t allocate manager netlink socket. Failed to create a socket in Netlink init. File a ticket with Cumulus Support.
          CRITICAL Couldn’t create cache manager: [str] Failed to allocate a cache in Netlink init. File a ticket with Cumulus Support.
          CRITICAL Couldn’t set bufsize for manager netlink socket. Failed to allocate a cache in Netlink init. File a ticket with Cumulus Support.
          CRITICAL Couldn’t add link cache: [str] Failed to allocate a cache in Netlink init. File a ticket with Cumulus Support.
          CRITICAL Couldn’t add link cache: [str] Failed to allocate a cache in Netlink init. File a ticket with Cumulus Support.
          CRITICAL Couldn’t add route cache: [str] Failed to allocate a cache in Netlink init. File a ticket with Cumulus Support.
          CRITICAL Couldn’t add mdb cache: [str] Failed to allocate a cache in Netlink init. File a ticket with Cumulus Support.
          CRITICAL Couldn’t alloc neigh cache: [str] Failed to allocate a cache in Netlink init. File a ticket with Cumulus Support.
          CRITICAL Couldn’t add mroute cache: [str] Failed to allocate a cache in Netlink init. File a ticket with Cumulus Support.
          CRITICAL Couldn’t alloc tcqdisc cache: [str] Failed to allocate a cache in Netlink init. File a ticket with Cumulus Support.
          CRITICAL Couldn’t add tcqdisc cache: [str] Failed to allocate a cache in Netlink init. File a ticket with Cumulus Support.
          CRITICAL Couldn’t alloc tcclass cache: [str] Failed to allocate a cache in Netlink init. File a ticket with Cumulus Support.
          CRITICAL Couldn’t add tcclass cache: [str] Failed to allocate a cache in Netlink init. File a ticket with Cumulus Support.
          CRITICAL Couldn’t alloc tccls cache: [str] Failed to allocate a cache in Netlink init. File a ticket with Cumulus Support.
          CRITICAL Couldn’t add tccls cache: [str] Failed to allocate a cache in Netlink init. File a ticket with Cumulus Support.
          CRITICAL Couldn’t alloc tcact cache: [str] Failed to allocate a cache in Netlink init. File a ticket with Cumulus Support.
          CRITICAL Couldn’t add tcact cache: [str] Failed to allocate a cache in Netlink init. File a ticket with Cumulus Support.
          CRITICAL Couldn’t alloc rule cache: [str] Failed to allocate a cache in Netlink init. File a ticket with Cumulus Support.
          CRITICAL Couldn’t add rule cache: [str] Failed to allocate a cache in Netlink init. File a ticket with Cumulus Support.
          CRITICAL Couldn’t add neigh cache: [str] Failed to allocate a cache in Netlink init. File a ticket with Cumulus Support.
          CRITICAL Couldn’t alloc netconf cache: [str] Failed to allocate a cache in Netlink init. File a ticket with Cumulus Support.
          CRITICAL Couldn’t initialize genl/port interface Failed to initialize port interface in Netlink init. File a ticket with Cumulus Support.
          CRITICAL Failed to create kernel bridge Failed to allocate a hash entry in bridge sync. File a ticket with Cumulus Support.
          CRITICAL Port msg [str] failure: err [int] Failed to configure PORT FEC parameter. File a ticket with Cumulus Support.
          CRITICAL Port msg [str] reply failure: err [int] Failed to configure PORT FEC parameter. File a ticket with Cumulus Support.
          CRITICAL Failed run recvmsg_default on port socket, err [int], [str] Failed to initialize PORT receiving messages. File a ticket with Cumulus Support.
          CRITICAL vlan stats send failure: err [int] Failed to configure status in VLAN. File a ticket with Cumulus Support.
          CRITICAL mroute hitbits send failure: err [int] Failed to configure hit bit status in MCAST router. File a ticket with Cumulus Support.
          CRITICAL Port send stats failure: err [int] Failed to configure status in PORT. File a ticket with Cumulus Support.
          CRITICAL Port send settings failure: err [int] Failed to configure settings in PORT. File a ticket with Cumulus Support.
          CRITICAL Port send carrier failure: err [int] Failed to configure carrier in PORT. File a ticket with Cumulus Support.
          CRITICAL ifindex [int] already registered for port ops Failed to register interface options in PORT. File a ticket with Cumulus Support.
          CRITICAL ifindex [int] not registered for port ops Failed to unregister interface options in PORT. File a ticket with Cumulus Support.
          CRITICAL Failed to allocate port hash table Failed to initialize PORT. File a ticket with Cumulus Support.
          CRITICAL Failed to allocate port socket Failed to initialize PORT. File a ticket with Cumulus Support.
          CRITICAL Failed to genl connect to port socket Failed to initialize PORT. File a ticket with Cumulus Support.
          CRITICAL Failed to allocate port sync socket Failed to initialize PORT. File a ticket with Cumulus Support.
          CRITICAL Failed to genl connect to port socket Failed to initialize PORT. File a ticket with Cumulus Support.
          CRITICAL Failed to set genl port sync socket to non-blocking Failed to initialize PORT. File a ticket with Cumulus Support.
          CRITICAL Failed to resolve port ops, err [int] Failed to initialize PORT. File a ticket with Cumulus Support.
          CRITICAL Failed to resolve port multicast group Failed to initialize PORT. File a ticket with Cumulus Support.
          CRITICAL Failed to register port ops, err [int] Failed to initialize PORT. File a ticket with Cumulus Support.
          CRITICAL Failed to add port group membership, err [int] Failed to initialize PORT. File a ticket with Cumulus Support.
          CRITICAL Failed to modify port socket notify cb, err [int] Failed to initialize PORT. File a ticket with Cumulus Support.
          CRITICAL [str]:[int]: [str][str]Assertion [str] failed Failed to find the function name in switchd. File a ticket with Cumulus Support.
          ERROR priority group [int] headroom count [int] exceeds the maximum value [int] Port headroom buffers exceed ASIC limit. File a ticket with Cumulus Support.
          ERROR shared buffer type [int] not recognized Invalid buffer type. File a ticket with Cumulus Support.
          ERROR sx_api_cos_prio_to_ieeeprio_set failed: [str] IEEE priority map configuration write failed. File a ticket with Cumulus Support.
          ERROR _hal_mlx_packet_2_switch: priority field [int] not supported Packet priority field not supported for source. Check QoS configuration file.
          ERROR priority field [int] not supported Packet priority field not supported for remark. Check QoS configuration file.
          ERROR cos list length [int] is longer than maximum value [int] ECN/RED configuration: list is too long. Check QoS configuration file.
          ERROR hash params get failed: [str] ASIC ECMP hash seed configuration read failed. File a ticket with Cumulus Support.
          ERROR hash params set failed: [str] ASIC ECMP hash seed configuration write failed. File a ticket with Cumulus Support.
          ERROR hal_sh_datapath_pfc_set: PFC configuration not supported on the CPU port CPU port does not support priority flow control. File a ticket with Cumulus Support.
          ERROR hal_sh_datapath_init: datapath init failed: rv [int]: [str] Back end QoS initialization failed. Check for detailed log messages.
          ERROR _priority_field_list_get: Packet priority field [str] not supported Invalid packet priority field. Check QoS configuration file.
          ERROR _sfs_init: could not load traffic config file [str] Traffic configuration file missing or is unreadable. Check QoS configuration file.
          ERROR _sfs_port_init: could not load traffic config file [str] Traffic configuration file missing or is unreadable. Check QoS configuration file.
          ERROR _add_port_group: port group [str] exceeds max port group count [int] Too many port groups configured. Check QoS configuration file.
          ERROR _add_port_group: memory allocation failed for port group [str] Memory allocation failed. File a ticket with Cumulus Support.
          ERROR _switch_priority_config: [str] ASIC scheduler configuration failed. File a ticket with Cumulus Support.
          ERROR _priority_map_config map function, hal port [int]: [str] ASIC priority map configuration failed. File a ticket with Cumulus Support.
          ERROR _priority_map_config enable function: [str] ASIC priority map enable configuration failed. File a ticket with Cumulus Support.
          ERROR _port_group_range_translate: invalid port list: range length is 0, id list is 0 Invalid port set configuration. Check QoS configuration file.
          ERROR _port_group_range_translate: failed: port list not created from range [str] to [str] Invalid port set configuration. Check QoS configuration file.
          ERROR hal_datapath_init: priority field initialization expects 3 three priority fields, got [int] Invalid inputs. File a ticket with Cumulus Support.
          ERROR hal_datapath_init: priority map direction initialization expects two priority map directions, got [int] Invalid inputs. File a ticket with Cumulus Support.
          ERROR hal_datapath_init: DOS config failed: [str] ASIC DoS config failed. File a ticket with Cumulus Support.
          ERROR _cutthrough_config: Cutthrough config failed on HAL port [int]: [str] ASIC cut-through configuration failed. File a ticket with Cumulus Support.
          ERROR _source_priority_map_init: packet priority map size [int] is larger than array length [int] Configured priority map list is too large. Check QoS configuration file.
          ERROR _source_priority_map_populate: packet priority map entry index [int] is larger than array length [int] Configured priority map list is too large. Check QoS configuration file.
          ERROR _remark_priority_map_init: packet priority map entry index [int] is larger than array length [int] Configured priority map list is too large. Check QoS configuration file.
          ERROR _remark_priority_map_populate: packet priority map entry index [int] is larger than array length [int] Configured priority map list is too large. Check QoS configuration file.
          ERROR _priority_map_get: field index [int] is out of bounds: [int] available field entries Invalid packet priority field. Check QoS configuration file.
          ERROR _priority_map_get: cos ID [int] is out of bounds: [int] cos ID values Invalid priority value. Check QoS configuration file.
          ERROR _priority_map_get: old syntax flag [int] should be less than [int] Invalid configuration syntax. File a ticket with Cumulus Support.
          ERROR hal_list_get: strdup returned NULL Memory allocation failed. File a ticket with Cumulus Support.
          ERROR hal_port_pause_set: RX pause not allowed on port [int] Invalid operation for the current port configuration. Check QoS configuration file.
          ERROR vlan_range_confi: incorrect format, revert to default Invalid configuration. Correct configured VLAN range.
          ERROR vlan_range_confi: incorrect format, revert to default Invalid configuration. Correct configured VLAN range.
          ERROR vlan_range_confi: incorrect range, revert to default Invalid configuration. Correct configured VLAN range.
          ERROR vlan_range_confi: minimum range is [int], revert to default Invalid configuration. Correct configured VLAN range.
          ERROR backend_enum_info_key unsupported type [uint] Invalid backend type. File a ticket with Cumulus Support.
          ERROR hal_init [str] enum_fn [str]: dlerror [str] No back end enum function found. File a ticket with Cumulus Support.
          ERROR hal_init failed to open [str]: [str] Could not open the back end DLL. File a ticket with Cumulus Support.
          ERROR hal_init unsupported type [uint] Invalid back end type. File a ticket with Cumulus Support.
          ERROR hal_init: backend function at offset [int] non populated: [address] Back end function pointer is not set. File a ticket with Cumulus Support.
          ERROR Unable to setup handling of SIGHUP for log rotation: [str] Signal handler initialization failed. File a ticket with Cumulus Support.
          ERROR Couldn’t delete pid file [str], [str] Could not delete process ID file. File a ticket with Cumulus Support.
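          ERROR Failed to update VRF [str] to table id [uint] Failed to update the VRF table ID. File a ticket with Cumulus Support.
          ERROR Failed to add VRF [str] with table id [uint] Failed to add the VRF. File a ticket with Cumulus Support.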
          ERROR Failed to get VRF table id for index [int] VRF table ID not found for the master link. File a ticket with Cumulus Support.
          ERROR [int] hosts were ignored due to capacity. Kernel neighbors did not fit in the hardware table. Modify neighbor configuration.
          ERROR [int] routes were ignored due to total capacity. Kernel routes did not fit in the hardware table. Modify route configuration.
          ERROR [int] [str] routes were ignored due to capacity. Kernel routes did not fit in the hardware table. Modify route configuration.
          ERROR Ignoring VRF [str]; table id 0 is reserved for default VRF Invalid table ID for the VRF. Modify VRF configuration.
          ERROR Failed to update VRF [str] to table id [uint] Failed to update the VRF table ID. File a ticket with Cumulus Support.
          ERROR Failed to add VRF [str] with table id [uint] Failed to add the VRF. File a ticket with Cumulus Support.
          ERROR Failed to set type to ‘vrf’ in link filter Failed to update the Netlink cache link type. File a ticket with Cumulus Support.
          ERROR Found [int] VRF entries after sync. cleaning up. VRF entries remaining after sync operation. File a ticket with Cumulus Support.
          ERROR Ignoring attempts to delete default route for table 0 Failed to remove the default route. File a ticket with Cumulus Support.
          ERROR Master interface not found in nl cache for index [int] Link object not found in Netlink cache. File a ticket with Cumulus Support.
          ERROR Maximum number of VRFs already exist. Can not add VRF for table [uint] More VRFs have been configured than the maximum number supported by the platform. Reduce the number of configured VRFs.
          ERROR Failed to delete REPL route from Hash Table Failure in VRF route leak feature. This message notifies that an entry could not be properly deleted from one of the software tables. File a ticket with Cumulus Support.
          ERROR Failed to add route [str] Failure in VRF route leak feature. This message notifies that a route entry could not be properly added to one of the software tables. File a ticket with Cumulus Support.
          ERROR Route [str] in HW, but not in HAL cache. Adding. Happens when HAL resync is triggered. Route discovered in hardware but it is not in the software database. No action is required as software auto-corrects.
          ERROR HW route [str] doesn’t match HAL route [str]. Updating. Happens when HAL resync is triggered. Route discovered in hardware but it is not matching the software database. No action is required as software auto-corrects.
          ERROR Route [str] in HAL cache, but not in HW. Deleting. Happens when HAL resync is triggered. Route discovered in software but it is not seen in the hardware. No action is required as software auto-corrects.
          ERROR No parent interface for [str] The parent interface of an interface could not be derived. This is a software issue. File a ticket with Cumulus Support.
          ERROR [str]: sfs backing pointer is NULL FUSE file pointer is NULL. File a ticket with Cumulus Support.
          ERROR port sample rate string not found Port sampling rate is not set. Correct the configuration.
          ERROR [str]: hal port pointer is NULL for port sample string [str] Port pointer is incorrect when port sampling is set. File a ticket with Cumulus Support.
          ERROR port sample rate string [str] not recognized Port sampling rate is not configured properly. Correct the configuration.
          ERROR port sample interface [str] in port sample string ‘[str]’ not recognized Port sampling port-name is not configured properly. Correct the configuration.
          ERROR port sample rate value not recognized in string [str] Port sampling value is not configured properly. Correct the configuration.
          ERROR port sample rate value [str] does not contain a valid integer Port sampling value is not configured properly using integer values. Correct the configuration.
          ERROR [int] mroutes were ignored due to total capacity. Maximum number of multicast routes exceeded. Reduce the number of multicast routes in the switch/fabric.
          ERROR local_ip format in fuse node is incorrect [str] Local_IP value formatting in FUSE file is incorrect. Correct the configuration.
          ERROR local_ip fuse node read failed Local_IP value formatting in FUSE file is invalid. Correct the configuration.
          ERROR vxlan encap dscp action invalid VXLAN encap DSCP action in FUSE file is invalid. Correct the configuration.
          ERROR vxlan encap dscp action [[str]] invalid VXLAN encap DSCP action formatting in FUSE file is invalid. Correct the configuration.
          ERROR vxlan decap dscp action invalid VXLAN decap DSCP action in FUSE file is invalid. Correct the configuration.
          ERROR vxlan decap dscp action invalid VXLAN decap DSCP action formatting in FUSE file is invalid. Correct the configuration.
          ERROR sx_api_lag_hash_flow_params_set failed: [str] Set hash flow parameters failed in the SDK.
          ERROR bond IDs exhausted Maximum number of bonds exhausted. Reduce the number of configured bonds.
          ERROR bond_id [uint] swid [uint] lag create failed: [str] Setting a LAG port group failed in the SDK.
          ERROR bond_id [uint] lag_id 0x%x port state set failed: [str] Port state could not be set to ADMIN_UP in the SDK.
          ERROR bond_id [uint] lag_id 0x%x ingr_filter set failed: [str] Ingress filter set failed for specified bond_id and lag_id in the SDK.
          ERROR bond_id [uint] old lag_id 0x%x not cleaned up Could not find the mapping of the bond_id with the old lag_id, hence cleanup was not successful.
          ERROR lag_id 0x%x swid [uint] failed: [str] Removal of the LAG port group for the specified lag_id failed.
          ERROR cannot find bond slave port [uint] During addition of a bond slave port to a bond, the slave port entry in the software tables could not be found.
          ERROR ifp not found for bond_id [uint] Software entry for a bond_id could not be found in the database.
          ERROR [str] member [str] add failed: [str] Adding a member to a LAG group failed in the SDK.
          ERROR [str] member [str] delete failed: [str] Deleting a member in a LAG group failed in the SDK.
          ERROR invalid port_storm_ctrl_type [uint] Invalid Storm-Control-Type (invalid value) detected in the software. Internal error.
          ERROR unexpected duplicate bond if_key [str] Existing duplicate bond key found in the software table.
          ERROR lag_id 0x%x with no corresponding bond id bond_id could not be located for the given lag_id.
          ERROR [str] collector set failed for [str]: [str] Adding the specified port to the collector set for the bond failed.
          ERROR [str] distributor set failed for [str]: [str] Adding the specified port to the distributor set for the bond failed.
          ERROR cannot find base bond slave [str] for bond_id [int] During update of a bond, the slave port entry in the software tables could not be found.
          ERROR unexpected duplicate bond interface [str] Existing duplicate bond interface found in the software table.
          ERROR unexpected duplicate lag_id 0x%x Existing duplicate lag_id found in the software table.
          ERROR info not found for bond_id [uint] An expected entry could not be found for a given bond_id in a software table.
          ERROR [str] unexpected duplicate member [str] Duplicate entry for the specified interface found for a bond in a software table.
          ERROR info not found for bond_id [uint] An expected entry could not be found for a given bond_id in a software table.
          ERROR initialization failed: [str] SDK API call for tunnel initialization failed. File a ticket with Cumulus Support.
          ERROR logical network type [uint] key [uint] not found Logical network of given type and key was not found in the software tables. File a ticket with Cumulus Support.
          ERROR logical network type [uint] key [uint] not found Logical network of given type and key was not found in the software tables. File a ticket with Cumulus Support.
          ERROR failed to create keys tunnel type [uint] and key [uint] Failed to create logical network keys of given tunnel type and interface key in the software tables. File a ticket with Cumulus Support.
          ERROR failed to update the decap key for gre entry tunnel_id: Failed to update the GRE decap keys for the given tunnel. File a ticket with Cumulus Support.
          ERROR failed to find the decap key for tunnel_id : (0x%x) Failed to find the GRE decap keys for the given tunnel. File a ticket with Cumulus Support.
          ERROR failed to update a decap entry tunnel_id : (0x%x) Failed to update the GRE decap keys for the given tunnel. File a ticket with Cumulus Support.
          ERROR unexpected duplicate entry in gre_tunnel_key_ht GRE tunnel key table has an existing duplicate entry. File a ticket with Cumulus Support.
          ERROR unexpected duplicate entry in gre_tunnel_id_ht GRE tunnel ID table has an existing duplicate entry. File a ticket with Cumulus Support.
          ERROR unexpected duplicate entry in gre_olay_ulay_ht GRE tunnel overlay/underlay table has an existing duplicate entry. File a ticket with Cumulus Support.
          ERROR failed to create the hw decap key for gre entry tunnel_id: Failed to create the GRE decap keys for the given tunnel. File a ticket with Cumulus Support.
          ERROR failed to create a decap entry tunnel_id : (0x%x) Failed to create a decap entry in the software table for the given tunnel_id. File a ticket with Cumulus Support.
          ERROR unexpected duplicate entry in gre_decap_ht tunnel_id GRE decap table has an existing duplicate entry for the given tunnel. File a ticket with Cumulus Support.
          ERROR failed to create Failed to set up a tunnel between the given local and remote IP addresses. File a ticket with Cumulus Support.
          ERROR unable to find overlay overlay info from overlay ifindex: While removing a GRE tunnel, the overlay info could not be retrieved from the software tables for the given ifindex. File a ticket with Cumulus Support.
          ERROR failed to create key for type [uint] Failed to create the GRE keys for the given tunnel type. File a ticket with Cumulus Support.
          ERROR failed to find gre entry Failed to find a logical network entry for a tunnel in the software tables during tunnel removal. File a ticket with Cumulus Support.
          ERROR failed to find gre entry Failed to find a GRE decap entry for a tunnel during tunnel removal. File a ticket with Cumulus Support.
          ERROR Failed to open SX-API, error: [str] SDK API open failed. File a ticket with Cumulus Support.
          ERROR Failed to set SDK VERBOSITY level, error: [str] Failed to set SDK VERBOSITY level. File a ticket with Cumulus Support.
          ERROR num_devices [uint] is invalid SDK found zero ASIC devices. File a ticket with Cumulus Support.
          ERROR Failed to initialize SDK ([str]) Failed to initialize SDK. File a ticket with Cumulus Support.
          ERROR *** failed to configure the requested setup *** The board configuration could not be enforced. File a ticket with Cumulus Support.
          ERROR more devices encountered than configured [uint] More devices encountered than configured. File a ticket with Cumulus Support.
          ERROR duplicate device ID [uint] Found an existing duplicate device when adding a new device. File a ticket with Cumulus Support.
          ERROR invalid unit [uint] num_devices [uint] Internal error: given unit number is out of range. File a ticket with Cumulus Support.
          ERROR invalid port [uint] num_ports [uint] Internal error: given port number is out of range. File a ticket with Cumulus Support.
          ERROR ERROR: Fail to extract data from XML file The topology map could not be extracted from the XML file. File a ticket with Cumulus Support.
          ERROR Device ID already added A device is already added to the topology database. File a ticket with Cumulus Support.
          ERROR Device ID [uint] NOT found in the XML file A device could not be found in the database. File a ticket with Cumulus Support.
          ERROR Failed to add device [uint] to the SDK (sx_api_topo_device_set) Failed to add a device to the topology database. File a ticket with Cumulus Support.
          ERROR Failed to allocate memory for tree tree_info array, Memory exhausted. Restart switchd, if possible. File a ticket with Cumulus Support.
          ERROR ERROR: Fail to add topo tree Topology tree could not be added. File a ticket with Cumulus Support.
          ERROR Failed to set topo device ready for device [uint]: [str] Could not set DEVICE_READY in the SDK. File a ticket with Cumulus Support.
          ERROR Unable to load file [str] (file not exists or corrupted ) The specified XML configuration file could not be loaded as it either does not exist or is possibly corrupted. File a ticket with Cumulus Support.
          ERROR Unable to parse file (error #[int]: [str]) The specified XML configuration file could not be parsed. File a ticket with Cumulus Support.
          ERROR Expat error #[int] (line [int] , column [int]): [str] A parsing error encountered when parsing the configuration file. File a ticket with Cumulus Support.
          ERROR Error parsing number of devices Error parsing the number of child devices in the XML configuration file. File a ticket with Cumulus Support.
          ERROR Error parsing device mac address Error parsing the device MAC address in the XML configuration file. File a ticket with Cumulus Support.
          ERROR Error parsing device MAC address Error parsing the device MAC address in the XML configuration file. File a ticket with Cumulus Support.
          ERROR Error mac cannot be 00:00:00:00:00:00 Found a device MAC address of all zeros in the XML configuration file. File a ticket with Cumulus Support.
          ERROR Failed running [str] Specified system command failed. File a ticket with Cumulus Support.
          ERROR Error parsing dev_number Error parsing the device number in the XML configuration file. File a ticket with Cumulus Support.
          ERROR Error parsing device mac address Error parsing the device MAC address in the XML configuration file. File a ticket with Cumulus Support.
          ERROR Error parsing device MAC address Error parsing the device MAC address in the XML configuration file. File a ticket with Cumulus Support.
          ERROR Error parsing number of physical ports value Error parsing the number of PHY ports in the XML configuration file. File a ticket with Cumulus Support.
          ERROR Invalid number of physical ports [uint] Found zero or more than the allowed maximum port numbers in the XML configuration file. File a ticket with Cumulus Support.
          ERROR Error parsing ports list section Error parsing the port list section in the XML configuration file. File a ticket with Cumulus Support.
          ERROR Error parsing local port number Error parsing the local port number in the XML configuration file. File a ticket with Cumulus Support.
          ERROR Error parsing mapping mode Error parsing the mapping mode in the XML configuration file. File a ticket with Cumulus Support.
          ERROR Error parsing label port Error parsing the label port in the XML configuration file. File a ticket with Cumulus Support.
          ERROR Error parsing width value Error parsing the port width in the XML configuration file. File a ticket with Cumulus Support.
          ERROR Error parsing RX lanes value , local port: [[int]] Error parsing the RX lanes value in the XML configuration file. File a ticket with Cumulus Support.
          ERROR Error parsing lanes value , local port: [[int]] Error parsing the lanes value in the XML configuration file. File a ticket with Cumulus Support.
          ERROR Error parsing lanes value, local port: [[int]] Error parsing the lanes value in the XML configuration file. File a ticket with Cumulus Support.
          ERROR Error parsing lane to module value, local port: [[int]] Error parsing the lane to module mapping value in the XML configuration file. File a ticket with Cumulus Support.
          ERROR Error parsing lane to module value , local port: [[int]] Error parsing the lane to module mapping value in the XML configuration file. File a ticket with Cumulus Support.
          ERROR Error parsing port mode value Error parsing the port mode value in the XML configuration file. File a ticket with Cumulus Support.
          ERROR Error parsing port speed value Error parsing the port speed value in the XML configuration file. File a ticket with Cumulus Support.
          ERROR Error parsing swid value Error parsing the swid value in the XML configuration file. File a ticket with Cumulus Support.
          ERROR Error parsing autoneg value Error parsing the autoneg value in the XML configuration file. File a ticket with Cumulus Support.
          ERROR Conflict detected in lid 0x%0x port [uint] dev_id [uint] lid already detected in a software table. File a ticket with Cumulus Support.
          ERROR port_device_set has failed for device [uint] ([str]) Could not set a port device to the SDK. File a ticket with Cumulus Support.
          ERROR topo_xml_device_add failed for device [uint] ([str]) A device could not be added to a topo_xml_device database. File a ticket with Cumulus Support.
          ERROR mode set dev [int] port 0x%x failed: [str] Port mode could not be set to STACKING mode in the SDK. File a ticket with Cumulus Support.
          ERROR binding dev [int] port 0x%x to swid [int] failed: [str] Binding the device and port to the swid failed in the SDK. File a ticket with Cumulus Support.
          ERROR port_init dev [int] port 0x%x failed: [str] Specified device/port could not be initialized in the SDK. File a ticket with Cumulus Support.
          ERROR swid set failed: [str] swid set failed in the SDK. File a ticket with Cumulus Support.
          ERROR Failed to set port 0x%x mapping: [str] Port mapping failed for the specified port in the SDK. File a ticket with Cumulus Support.
          ERROR Failed to set port 0x%x mapping: [str] Port mapping failed for the specified port in the SDK. File a ticket with Cumulus Support.
          ERROR invalid bridge_vlan [uint] for bridge_id [int] The specified bridge_vlan is not valid for the given bridge_id. File a ticket with Cumulus Support.
          ERROR vfid not set for vlan [uint] VFID is not set for the specified VLAN. File a ticket with Cumulus Support.
          ERROR new group check failed for vlan [uint] mac [str]: [str] The specified MAC address and VLAN already exist in a group. File a ticket with Cumulus Support.
          ERROR old port list get failed for vlan [uint] mac [str]: [str] Could not find the existing port list for the given MAC address and VLAN. File a ticket with Cumulus Support.
          ERROR port delete failed for vlan [uint] mac [str]: [str] Could not delete the existing port list for the given MAC address and VLAN. File a ticket with Cumulus Support.
          ERROR create failed for vlan [uint] mac [str]: [str] Could not create a new multicast group for the given MAC address and VLAN. File a ticket with Cumulus Support.
          ERROR port add failed for vlan [uint] mac [str]: [str] Could not add a port to the port list for the given MAC address and VLAN. File a ticket with Cumulus Support.
          ERROR invalid bridge_vlan [uint] for bridge_id [int] The specified bridge_vlan is not valid for the given bridge_id. File a ticket with Cumulus Support.
          ERROR vfid not set for vlan [uint] VFID is not set for the specified VLAN. File a ticket with Cumulus Support.
          ERROR group delete failed for vlan [uint] mac [str]: [str] Could not delete a multicast group for the given MAC address and VLAN. File a ticket with Cumulus Support.
          ERROR invalid bridge_vlan [uint] for bridge_id [int] The specified bridge_vlan is not valid for the given bridge_id. File a ticket with Cumulus Support.
          ERROR vfid not set for vlan [uint] VFID is not set for the specified VLAN. File a ticket with Cumulus Support.
          ERROR flood_mode_get failed for swid [int] vfid [int] [str] Flood mode value could not be retrieved for the given swid/port/vfid. File a ticket with Cumulus Support.
          ERROR unreg_mc_flood_ports fail for swid [int], vfid [int], [str] Unregistered multicast flood ports setting for given swid/port/vfid failed in the SDK. File a ticket with Cumulus Support.
          ERROR mroute ports [int] exceeds [int] A multicast route has a larger number of ports than the RIFs on the switch. File a ticket with Cumulus Support.
          ERROR no container id retrieved for [str] Egress container could not be retrieved. File a ticket with Cumulus Support.
          ERROR route cmd [str] failed: [str] Setting multicast route in the SDK failed. File a ticket with Cumulus Support.
          ERROR router table_id [uint] vrid [int] set failed: [str] Setting a VRID for a router failed. File a ticket with Cumulus Support.
          ERROR unexpected duplicate key list size [uint] Unexpected duplicate entry found in next-hop list. File a ticket with Cumulus Support.
          ERROR unexpected duplicate key type [str] min_mtu [uint] fid [uint] Unexpected duplicate entry in container anchor. File a ticket with Cumulus Support.
          ERROR unexpected duplicate nh_list num_elems [uint] Unexpected duplicate next-hop list in container. File a ticket with Cumulus Support.
          ERROR failed for nh_list num_elems [uint]: [str] Could not create a new container for the next-hop list in the SDK. File a ticket with Cumulus Support.
          ERROR failed for type [str] container_id [uint] num_elems [uint]: [str] Could not free a container for the next-hop list in the SDK. File a ticket with Cumulus Support.
          ERROR unsupported chip type [uint] Chip type in table is not supported. File a ticket with Cumulus Support.
          ERROR invalid parse depth VXLAN parsing depth setting is invalid. File a ticket with Cumulus Support.
          ERROR unexpected duplicate ln_type [uint] ln_key [uint] Unexpected duplicate entry in logical VPN key table. File a ticket with Cumulus Support.
          ERROR unexpected duplicate ID 0x%x Unexpected duplicate entry in logical VPN ID table. File a ticket with Cumulus Support.
          ERROR unsupported ln_type [int] or ln_key [int] Unsupported logical network type encountered. File a ticket with Cumulus Support.
          ERROR unexpected duplicate ln_type [uint] ln_key [uint] Unexpected duplicate entry in logical network table. File a ticket with Cumulus Support.
          ERROR lid 0x%x remote_ip [str] not found Local ID to remote_ip mapping retrieval failed. File a ticket with Cumulus Support.
          ERROR unexpected duplicate key [uint] Unexpected duplicate entry in the VPN map table. File a ticket with Cumulus Support.
          ERROR invalid VPN or vlan [uint] Invalid VPN/VLAN value provided for creating a VPN map entry. File a ticket with Cumulus Support.
          ERROR unexpected duplicate entry Unexpected duplicate entry in VPN decap table. File a ticket with Cumulus Support.
          ERROR invalid VPN Invalid VPN value provided for creating a VPN decap entry. File a ticket with Cumulus Support.
          ERROR unexpected duplicate key [str] type [uint] Unexpected duplicate entry in VPN next-hop table. File a ticket with Cumulus Support.
          ERROR tunnel get failed: [str] Tunnel attribute retrieval from the SDK failed. File a ticket with Cumulus Support.
          ERROR tunnel update failed: [str] Tunnel attribute update in the SDK failed. File a ticket with Cumulus Support.
          ERROR creation failed: [str] VXLAN tunnel creation failed. File a ticket with Cumulus Support.
          ERROR tunnel_id 0x%x hash set failed: [str] VXLAN UDP header src port control setting in SDK for enabling better entropy failed. File a ticket with Cumulus Support.
          ERROR unsupported type [uint] Unsupported tunnel type encountered. File a ticket with Cumulus Support.
          ERROR unexpected duplicate key for tunnel_id 0x%x Unexpected duplicate entry in logical VPN tunnel key table. File a ticket with Cumulus Support.
          ERROR unexpected duplicate ID for tunnel_id 0x%x Unexpected duplicate entry in logical VPN ID table. File a ticket with Cumulus Support.
          ERROR unexpected duplicate ln_type [uint] ln_key [uint] Unexpected duplicate entry in logical network table. File a ticket with Cumulus Support.
          ERROR update failed: [str] VXLAN tunnel update failed. File a ticket with Cumulus Support.
          ERROR unsupported type [uint] Unsupported tunnel type encountered. File a ticket with Cumulus Support.
          ERROR tunnnel_id 0x%x failed: [str] VXLAN tunnel destroy failed. File a ticket with Cumulus Support.
          ERROR tunnel_id 0x%x ttl [uint] failed: [str] VXLAN tunnel TTL set failed. File a ticket with Cumulus Support.
          ERROR failed: [str] VXLAN tunnel map set failed. File a ticket with Cumulus Support.
          ERROR vfid not available for vlan [uint] VFID is not set for the specified VLAN. File a ticket with Cumulus Support.
          ERROR failed: [str] VXLAN tunnel map set failed. File a ticket with Cumulus Support.
          ERROR failed: [str] Tunnel decap entry could not be set. File a ticket with Cumulus Support.
          ERROR failed: [str] Tunnel decap entry could not be set. File a ticket with Cumulus Support.
          ERROR [str] group_id [uint] vid [uint] failed: [str] FDB flood set for a group failed. File a ticket with Cumulus Support.
          ERROR group_id [uint] vid [uint] failed: [str] FDB flood set for a group failed. File a ticket with Cumulus Support.
          ERROR creation failed: [str] VPN device addition to the SDK failed. File a ticket with Cumulus Support.
          ERROR unexpected duplicate entry VPN port table has an existing duplicate entry. File a ticket with Cumulus Support.
          ERROR vpn_port 0x%x failed: [str] VPN device deletion from the SDK failed. File a ticket with Cumulus Support.
          ERROR tunnel encap dscp action PRESERVE is invalid PRESERVE is not a valid action. Check configuration.
          ERROR tunnel_id 0x%x cos set failed: [str] COS set failed for a tunnel. File a ticket with Cumulus Support.
          ERROR tunnel decap dscp action SET is invalid SET is not a valid DSCP action. Check configuration.
          ERROR tunnel_id 0x%x cos set failed: [str] COS set failed for a tunnel. File a ticket with Cumulus Support.
          ERROR tunnel id not found: ln_type [uint] ln_key [uint] Specified tunnel ID was not found in the VPN table. File a ticket with Cumulus Support.
          ERROR Error opening socket for grat arp [str], ifi [int]: [str] Gratuitous ARP socket could not be opened. File a ticket with Cumulus Support.
          ERROR Error sending grat arp [str], ifi [int]: [str] Gratuitous ARP could not be sent. File a ticket with Cumulus Support.
          ERROR Failed to update table id for port [int], interface [str]/[int] table_id could not be updated for a port. File a ticket with Cumulus Support.
          ERROR interface not found at index [int] Interface information for a given ifindex could not be found. File a ticket with Cumulus Support.
          ERROR Failed to open IPv4 socket [str] IPv4 socket open failed. File a ticket with Cumulus Support.
          ERROR Failed to set SO_RCVBUF [str] Socket option SO_RCVBUF set failed. File a ticket with Cumulus Support.
          ERROR No source MAC address ifindex [int], [str] No source MAC address found for given ifindex. File a ticket with Cumulus Support.
          ERROR Failed to open IPv6 socket [str] IPv6 socket open failed. File a ticket with Cumulus Support.
          ERROR Failed to set SO_RCVBUF [str] Socket option SO_RCVBUF set failed. File a ticket with Cumulus Support.
          ERROR No source MAC address ifindex [int], [str] No source MAC address found for given ifindex. File a ticket with Cumulus Support.
          ERROR Adding ctx to hash table failed Adding ctx to hash table failed. File a ticket with Cumulus Support.
          ERROR Getting pktinj failed Getting pktinj failed. File a ticket with Cumulus Support.
          ERROR ERSPAN target is supported with the following field(s): --src-ip --dst-ip. ACL unsupported ERSPAN target. See ACL user documentation.
          ERROR Inverse flags not supported for UDP ACL inverse match not supported. See ACL user documentation.
          ERROR Specified Match [str] not supported ACL unsupported match. See ACL user documentation.
          ERROR Specified Target [str] not supported ACL unsupported target. See ACL user documentation.
          ERROR Inverse flags not supported for ARP ACL inverse match not supported. See ACL user documentation.
          ERROR Inverse flags not supported for IP ACL inverse match not supported. See ACL user documentation.
          ERROR Inverse flags not supported for IPv6 ACL inverse match not supported. See ACL user documentation.
          ERROR Traffic class [hex] not supported for IP ACL unsupported match. See ACL user documentation.
          ERROR Range for ICMP Types i.e [hex]-[hex] not supported ACL unsupported match. See ACL user documentation.
          ERROR Range for ICMP codes i.e [hex]-[hex] not supported ACL unsupported match. See ACL user documentation.
          ERROR Invert flags not supported for mark_m ACL unsupported match. See ACL user documentation.
          ERROR OR bitmask [hex] not supported for mark_m match ACL unsupported match. See ACL user documentation.
          ERROR Inverse flags not supported for VLAN ACL inverse match not supported. See ACL user documentation.
          ERROR Inverse flags not supported for ICMP ACL inverse match not supported. See ACL user documentation.
          ERROR vlanid match not supported for VLAN ACL vlanid unsupported match. See ACL user documentation.
          ERROR vlan prio match not supported for VLAN ACL vlan prio unsupported match. See ACL user documentation.
          ERROR IP, IPv6, ARP options are not supported for LOG ACL unsupported match for LOG action. See ACL user documentation.
          ERROR Interface [str] not supported for SPAN ACL unsupported target. See ACL user documentation.
          ERROR SPAN target is supported with the following ACL unsupported SPAN target. See ACL user documentation.
          ERROR ERSPAN target is supported with the following ACL unsupported match. See ACL user documentation.
          ERROR Inverse flags not supported for ICMPv6 ACL inverse match not supported. See ACL user documentation.
          ERROR target bitmask [hex] not supported for mark ACL unsupported verdicts continue/return/etc. See ACL user documentation.
          ERROR Specified Match [str] not supported ACL unsupported match. See ACL user documentation.
          ERROR Specified watcher [str] not supported ACL unsupported watcher action. See ACL user documentation.
          ERROR Specified target [str] not supported ACL unsupported target action. See ACL user documentation.
          ERROR Inverse flags not supported for Multiport ACL inverse match not supported. See ACL user documentation.
          ERROR Inverse flags not supported for TOS ACL inverse match not supported. See ACL user documentation.
          ERROR Inverse flags not supported for DSCP ACL inverse match not supported. See ACL user documentation.
          ERROR Inverse flags or Limit Interface or Source not supported for addrtype ACL inverse match not supported. See ACL user documentation.
          ERROR Dest addrtypes other than IPROUTER/LOCAL not supported ACL unsupported match for addrtype. See ACL user documentation.
          ERROR IP6 Fragment IDs not supported ACL unsupported match. See ACL user documentation.
          ERROR IP6 Length/Reserved or Last or More fragment Fields not supported ACL unsupported match. See ACL user documentation.
          ERROR Greater/Lesser/Not equal to mode not supported for TTL field ACL unsupported match. See ACL user documentation.
          ERROR Greater/Lesser/Not equal mode not supported for HL field ACL unsupported match. See ACL user documentation.
          ERROR Inverse flags not supported for match mark ACL inverse match not supported. See ACL user documentation.
          ERROR TCP SEQ, TCP Options or IP options, UID are not supported for LOG ACL unsupported match. See ACL user documentation.
          ERROR TCP SEQ, TCP Options or IP options, UID are not supported for LOG ACL unsupported match. See ACL user documentation.
          ERROR Inverse flags or options not supported for TCP ACL inverse match not supported. See ACL user documentation.
          ERROR Interface [str] not supported for SPAN ACL unsupported target interface in SPAN rule. See ACL user documentation.
          ERROR SPAN target is supported with the following field(s): --dport swp. ACL unsupported SPAN target. See ACL user documentation.
          ERROR Interface [str] not supported for SPAN ACL unsupported target interface. See ACL user documentation.
          ERROR SPAN target is supported with the following field(s): --dport swp. ACL unsupported SPAN target. See ACL user documentation.
          ERROR resource region [uint] destroy failed: [str] ACL: Mellanox API for region destroy failed. File a ticket with Cumulus Support.
          ERROR resource region [uint] size [uint] create failed: [str] ACL: Mellanox API for TCAM region create failed. Check if there are too many rules/tables.
          ERROR unexpected duplicate key [str] ACL: Internal interface cache has duplicate. File a ticket with Cumulus Support.
          ERROR unexpected duplicate user: [str] key_idx [val] offset [val] ACL: Duplicate entry in interface rule hash. File a ticket with Cumulus Support.
          ERROR expected trap_id [uint](actual [uint]) type [uint] (actual [val]) ACL: INPUT chain trap counter get ID invalid. File a ticket with Cumulus Support.
          ERROR table [str] chain [str] region [str] [str] size [uint] creation failed: [str] ACL: Mellanox API for TCAM region create failed. Check if there are too many rules/tables.
          ERROR table [str] chain [str] region [str] [str] size [uint] acl_id creation failed: [str] ACL: Mellanox API for ACL ID create failed. Check if there are too many rules/tables.
          ERROR table [str] chain [str] region [str] [str] rules del @offset [val] num_rules [val] failed: [str] ACL: Mellanox API for setting region to ACL ID failed. File a ticket with Cumulus Support.
          ERROR table [str] chain [str] region [str] [str] size [uint] acl_id destroy failed: [str] ACL: Mellanox API for unsetting region from ACL ID failed. File a ticket with Cumulus Support.
          ERROR table [str] chain [str] region [str] [str] size [uint] destroy failed: [str] ACL: Mellanox API for region destroy failed. File a ticket with Cumulus Support.
          ERROR table [str] chain [str] region [str] [str] rules add @offset [val] num_rules [val] failed: [str] ACL: Mellanox API for setting rule in a region failed. File a ticket with Cumulus Support.
          ERROR table [str] chain [str] failed to allocate Mark value [val] ACL: Mark value alloc failed. File a ticket with Cumulus Support.
          ERROR table [str] chain [str] Mark values must be less than [val] when partial mask is in use ACL: Masked mark use limitation. Change mask match rules.
          ERROR PBR: Unsupported route type ecmp PBR: Invalid ECMP route. File a ticket with Cumulus Support.
          ERROR PBR: Unsupported route type singe hop PBR: Invalid non-ECMP route. File a ticket with Cumulus Support.
          ERROR table [str] chain [str] region [str] action [str] is not supported ACL: Unsupported action. Remove rule with the action.
          ERROR PBR: couldn’t set default forward action for rule PBR: Default action accept couldn’t be set for PBR rules. File a ticket with Cumulus Support.
          ERROR [str] offset [uint] or key_idx [uint] exceeds rule list size [uint] or descriptor size [val] ACL: Too many rules in a region. File a ticket with Cumulus Support.
          ERROR [str] offset [uint] key_idx [uint] invalid key_id [uint] ACL: Invalid src/dst port/intf in rule. Remove the rule.
          ERROR acl group [str] creation failed: [str] ACL: Mellanox API for ACL group create failed. Check for too many ACL tables.
          ERROR table [str] chain [str] key handle create failed: [str] ACL: Mellanox API to create a key list handle failed. File a ticket with Cumulus Support.
          ERROR table [str] chain [str] key attr query failed: [str] ACL: Mellanox API to get key attributes failed. File a ticket with Cumulus Support.
          ERROR table [str] chain [str] key handle delete failed: [str] ACL: Mellanox API for key handle delete failed. File a ticket with Cumulus Support.
          ERROR table [str] chain [str] region [str] [str] size [uint] offset too large ACL: Rule offset in region larger than region size. File a ticket with Cumulus Support.
          ERROR region set failed: [str] ACL: Mellanox API for TCAM region create failed. Check if there are too many rules/tables.
          ERROR anv set failed: [str] ACL: Mellanox API for ACL ID create failed. Check if there are too many rules/tables.
          ERROR rules set failed: [str] ACL: Mellanox API for rule set failed. Check if there are too many rules.
          ERROR [str] malloc failed ACL: Memory resource allocation error. File a ticket with Cumulus Support.
          ERROR table [str] chain [str] too many keys in rule ACL: Too many keys used in a rule. Delete the rule.
          ERROR table [str] chain [str] too many actions in rule ACL: Too many actions in rule. Modify the rule.
          ERROR table [str] chain [str] rule can match on a single output interface only ACL: Rule match limitation. Delete the rule.
          ERROR table [str] chain [str] number of input interfaces ([uint]) cannot be less than number of output interfaces ([val]) ACL: Rule replication limitation. Remove the rule.
          ERROR table [str] chain [str] key classification missing for [val] input bridge interface(s) ACL: Ingress/egress intf missing. Check ACL rules.
          ERROR table [str] chain [str] key classification missing for [val] input interface(s) ACL: Ingress/egress port/bond missing. Check ACL rules.
          ERROR analyzer set failed: [str] ACL: Mellanox API for setting ERSPAN analyzer port failed. File a ticket with Cumulus Support.
          ERROR session [uint] [str] failed: [str] ACL: Mellanox API for en/dis of SPAN session failed. File a ticket with Cumulus Support.
          ERROR session [uint] edit failed: [str] ACL: Mellanox API for ERSPAN session modify failed. File a ticket with Cumulus Support.
          ERROR session [uint] [str] lid 0x%x add failed: [str] ACL: Mellanox API to set SPAN mirror source failed. File a ticket with Cumulus Support.
          ERROR session [uint] [str] lid 0x%x delete failed: [str] ACL: Mellanox API to unset SPAN mirror sources failed. File a ticket with Cumulus Support.
          ERROR span init failed: [str] ACL: Mellanox API for SPAN engine initialization failed. File a ticket with Cumulus Support.
          ERROR Unexpected duplicate session key ACL: Duplicate SPAN session key. File a ticket with Cumulus Support.
          ERROR session create failed: [str] ACL: Mellanox API for session create failed. File a ticket with Cumulus Support.
          ERROR out of SPAN/ERSPAN sessions ACL: Too many SPAN/ERSPAN session targets. Check and reduce SPAN sessions.
          ERROR session [uint] analyzer delete failed: [str] ACL: Mellanox API for session delete failed. File a ticket with Cumulus Support.
          ERROR session [uint] destroy failed: [str] ACL: Mellanox API for session destroy failed. File a ticket with Cumulus Support.
          ERROR Non flow-based SPAN does not support router interface ACL: Non-flow-based SPAN cannot mirror from a routed interface. Remove these SPAN rules.
          ERROR Non flow-based SPAN does not support sub-interface ACL: Non-flow-based SPAN cannot mirror from a routed sub-interface. Remove these SPAN rules.
          ERROR unsupported session_type [uint] ACL: Mirror session has to be SPAN or ERSPAN. Remove these rules.
          ERROR router interface ([str]) is not supported as SPAN dport ACL: SPAN destination cannot be router interface. Remove these rules.
          ERROR unsupported chip type [uint] ACL: Wrong platform/chip type. File a ticket with Cumulus Support.
          ERROR policer creation failed: [str] ACL: Mellanox internal API to allocate a policer resource failed. Check number of ACL rules using policers.
          ERROR unexpected duplicate ID [val] ACL: Duplicate policer ID allocated. File a ticket with Cumulus Support.
          ERROR policer [val] delete failed [str] ACL: Mellanox internal API to delete policer resource failed. File a ticket with Cumulus Support.
          ERROR tricolor conversion failed pir [uint] cir [uint] cbs [val] ebs [val] ACL: Converting tricolor policer rates to kbps failed. File a ticket with Cumulus Support.
          ERROR conversion failed mode [uint] rate [uint] burst [uint] ACL: Converting color bind policer rates to kbps failed. File a ticket with Cumulus Support.
          ERROR counter creation failed: [str] ACL: Mellanox internal API for allocating a counter failed. Check number of ACL rules.
          ERROR counter [uint] delete failed: [str] ACL: Mellanox internal API for counter delete failed. File a ticket with Cumulus Support.
          ERROR counter [uint] failed: [str] ACL: Mellanox internal API for counter read failed. File a ticket with Cumulus Support.
          ERROR policer [val] counter failed: [str] ACL: Mellanox internal API for policer counter read failed. File a ticket with Cumulus Support.
          ERROR invalid interface [str] ACL: Invalid interface specified in ACL rule. Check ACL rule set.
          ERROR bond id [uint] not fully established ACL: Bond interface not fully up. Check bond interface configuration on local/remote ends.
          ERROR unsupported interface type: [str] ACL: An ACL rule has an unsupported type of interface specified in match. Check ACL rule set for interfaces specified.
          ERROR mixing PBS ports in different swids [uint] and [uint] is not allowed ACL: SPAN using policy-based switching doesn’t support target ports in different units. Change SPAN rule configuration.
          ERROR unexpected duplicate PBS key with [uint] port(s) ACL: Duplicate PBS key used for SPAN. File a ticket with Cumulus Support.
          ERROR create failed for PBS record with [uint] port(s): [str] ACL: Mellanox internal API for PBS set failed. File a ticket with Cumulus Support.
          ERROR pbs_id [uint] delete failed: [str] ACL: Mellanox internal API for PBS ID delete failed. File a ticket with Cumulus Support.
          ERROR [str] failed for pbs_id [uint]: [str] ACL: Mellanox internal API for PBS ID delete failed. File a ticket with Cumulus Support.
          ERROR group [str] set failed: [str] ACL: Mellanox internal API that binds an ACL group to a hardware resource like port/VLAN failed. File a ticket with Cumulus Support.
          ERROR user tokens exhausted ACL: Mellanox internal API for user token alloc failed. Check number of ACL rules.
          ERROR hardware platform does not support user tokens ACL: Mellanox platform doesn’t support user tokens. File a ticket with Cumulus Support.
          ERROR unexpected duplicate mark key [uint] ACL: User token duplicate allocated. File a ticket with Cumulus Support.
          ERROR bind [str] set failed on port 0x%x: [str] ACL: Mellanox internal API that binds an ACL group to a port failed. File a ticket with Cumulus Support.
          ERROR bind [str] unset failed on port 0x%x: [str] ACL: Mellanox internal API that unbinds an ACL group from a port failed. File a ticket with Cumulus Support.
          ERROR bind [str] cmd [uint] failed on bond 0x%x: [str] ACL: Mellanox internal API that un/binds an ACL group with a bond interface failed. File a ticket with Cumulus Support.
          ERROR bind [str] set failed on port 0x%x: [str] ACL: Mellanox internal API that binds an ACL group to a bond member failed. File a ticket with Cumulus Support.
          ERROR bind [str] cmd [uint] failed on RIF 0x[uint]: [str] ACL: Mellanox internal API that binds an ACL group to an L3 ingress interface failed. File a ticket with Cumulus Support.
          ERROR group set failed: [str] ACL: Mellanox internal API that sets the group of an ACL failed. File a ticket with Cumulus Support.
          ERROR range creation failed: [str] ACL: Mellanox internal API to allocate an L4 port range resource failed. Check number of ranges being used in ACLs.
          ERROR range delete failed: [str] ACL: Mellanox internal API to free an L4 port range resource failed. File a ticket with Cumulus Support.
          ERROR table [str] chain [str] region [str] [str] size [uint] creation failed: [str] ACL: Mellanox API for TCAM region create failed. Check if there are too many rules/tables.
          ERROR table [str] chain [str] region [str] [str] size [uint] acl_id creation failed: [str] ACL: Mellanox API for ACL ID create failed. Check if there are too many rules/tables.
          ERROR table [str] chain [str] region [str] [str] rules del @offset [val] num_rules [val] failed: [str] ACL: Mellanox API for setting region to ACL ID failed. File a ticket with Cumulus Support.
          ERROR table [str] [str] chain [str] region [str] [str] rules del @offset [val] num_rules [val] failed: [str] ACL: Mellanox API for setting region to ACL ID failed. File a ticket with Cumulus Support.
          ERROR table [str] [str] chain [str] region [str] [str] size [uint] acl_id destroy failed: [str] ACL: Mellanox API for unsetting region from ACL ID failed. File a ticket with Cumulus Support.
          ERROR table [str] [str] chain [str] region [str] [str] size [uint] destroy failed: [str] ACL: Mellanox API for region destroy failed. File a ticket with Cumulus Support.
          ERROR acl group [str] creation failed: [str] ACL: Mellanox API for ACL group create failed. Check for too many ACL tables.
          ERROR table [str] [str] chain [str] region [str] [str] size [uint] offset too large ACL: Rule offset in region larger than region size. File a ticket with Cumulus Support.
          ERROR region set failed: [str] ACL: Mellanox API for TCAM region create failed. Check if there are too many rules/tables.
          ERROR anv set failed: [str] ACL: Mellanox API for ACL ID create failed. Check if there are too many rules/tables.
          ERROR rules set failed: [str] ACL: Mellanox API for rule set failed. Check if there are too many rules.
          ERROR iACL action cannot be satisfied with eACL key ACL: Invalid dependency between ingress and egress ACLs. Check ACL rules.
          ERROR eACL action cannot be satisfied with iACL key ACL: Invalid dependency between ingress and egress ACLs. Check ACL rules.
          ERROR ACL can match one single output interface only ACL: Matching on multiple output interfaces is not supported. Check ACL rules.
          ERROR expected trap_id [uint] (actual [uint]) type [uint] (actual [val]) ACL: INPUT chain trap counter get ID invalid. File a ticket with Cumulus Support.
          ERROR [str] [str] API [str]: dlerror [str] nftables unsupported. Ignore this error message.
          ERROR Memory allocation failed ACL out of memory resource. File a ticket with Cumulus Support.
          ERROR Rule with LOG must be followed by same rule with DROP An ACL rule with a LOG action must be followed by same rule with a DROP action. See ACL user documentation for more info.
          ERROR Rule with LOG must be followed by same rule with DROP An ACL rule with a LOG action must be followed by same rule with a DROP action. See ACL user documentation for more info.
          ERROR Rule with log watcher must be have DROP action An ACL rule with a watcher action must be followed by same rule with a DROP action. See ACL user documentation for more info.
          ERROR Rule with LOG must be followed by same rule with DROP An ACL rule with a LOG action must be followed by same rule with a DROP action. See ACL user documentation for more info.
          ERROR [str] not found in hal_bonds ACL: Bond interface in rule not present. Check bond configuration.
          ERROR Inverse flags for SRC/DST IP, IN/OUT interface, TOS, Protocol not supported ACL inverse match flags are not supported. See ACL user documentation.
          ERROR Target Verdict :[str] not supported ACL rule target verdict queue/stop/return/etc not supported. See ACL user documentation for supported targets.
          ERROR Fall through target not supported ACL fall through action not supported. See ACL documentation for supported actions.
          ERROR Jump, target:[val] not supported ACL jump action not supported. See ACL documentation for supported actions.
          ERROR Module, target:[str] not supported Specified ACL action not supported. See ACL documentation for supported actions.
          ERROR iptables: Invalid argument iptables rules likely empty. Check iptables rules list.
          ERROR iptables:Could not open raw IPv4,socket. Internal socket error. File a ticket with Cumulus Support.
          ERROR iptables:Error retrieving getsockopt SO_GET_INFO: Internal socket error. File a ticket with Cumulus Support.
          ERROR iptables: Memory allocation for counters failed for size ACL out of memory resource. File a ticket with Cumulus Support.
          ERROR iptables: Error retrieving getsockopt SO_GET_ENTRIES: Internal socket error. File a ticket with Cumulus Support.
          ERROR iptables: Memory allocation for rules failed for size ACL out of memory resource. File a ticket with Cumulus Support.
          ERROR Inverse flags for SRC/DST IP, IN/OUT interface, TOS, Protocol not supported ACL inverse match flags are not supported. See ACL user documentation.
          ERROR Target Verdict :[str] not supported ACL rule target verdict queue/stop/return/etc not supported. See ACL user documentation for supported targets.
          ERROR Fall through target not supported ACL fall through action not supported. See ACL documentation for supported actions.
          ERROR Jump, target:[val] not supported ACL jump action not supported. See ACL documentation for supported actions.
          ERROR Module, target:[str] not supported Specified ACL action not supported. See ACL documentation for supported actions.
          ERROR iptables: Invalid argument iptables rules likely empty. Check iptables rules list.
          ERROR ip6tables:Could not open raw IPv6,socket. Internal socket error. File a ticket with Cumulus Support.
          ERROR ip6tables:Error retrieving getsockopt SO_GET_INFO: Internal socket error. File a ticket with Cumulus Support.
          ERROR ip6tables: Memory allocation for rules failed for size ACL out of memory resource. File a ticket with Cumulus Support.
          ERROR ip6tables: Error retrieving getsockopt SO_GET_ENTRIES: Internal socket error. File a ticket with Cumulus Support.
          ERROR iptables: Memory allocation for rules failed for size ACL out of memory resource. File a ticket with Cumulus Support.
          ERROR arptables:Could not open raw IPv6 socket. Internal socket error. File a ticket with Cumulus Support.
          ERROR arptables:Error retrieving getsockopt SO_GET_INFO: [str] Internal socket error. File a ticket with Cumulus Support.
          ERROR arptables: Error retrieving getsockopt SO_GET_ENTRIES: [str] Internal socket error. File a ticket with Cumulus Support.
          ERROR Inverse flags for SRC/DST MAC, IN/OUT/logical interface, Protocol not supported ACL unsupported match. See ACL documentation for supported matches.
          ERROR logical interface in:[str] out:[str] not supported ACL unsupported match. See ACL documentation for supported matches.
          ERROR Protocol field: LENGTH not supported ACL unsupported LENGTH field match for Ethernet packets. See ACL documentation for supported matches.
          ERROR Policy not supported ACL unsupported ebtables action continue/return/etc. See ACL documentation for supported actions.
          ERROR Target verdict: [str] not supported ACL unsupported ebtables verdict. See ACL documentation for supported actions.
          ERROR Fall through or Jump target not supported ACL fall through/jump action not supported. See ACL documentation for supported actions.
          ERROR Target verdict: [str] not supported ACL target verdict queue/stop/return/etc not supported. See ACL user documentation for supported targets.
          ERROR ebtables: Invalid argument ebtables rules likely empty or read from kernel failed. Check ebtables rules list or file a ticket with Cumulus Support.
          ERROR ebtables:Could not open raw IPv4,socket. Internal socket error. File a ticket with Cumulus Support.
          ERROR ebtables:Error retrieving getsockopt SO_GET_INFO: Internal socket error. File a ticket with Cumulus Support.
          ERROR ebtables: Memory allocation for rules failed for size ACL out of memory resource. File a ticket with Cumulus Support.
          ERROR ebtables: Error retrieving getsockopt SO_GET_ENTRIES: Internal socket error. File a ticket with Cumulus Support.
          ERROR sfs_add: [str] failed ACL: FUSE file system node creation failed. File a ticket with Cumulus Support.
          ERROR MAX retries reached, stats sync acl failed - %d ACL: Updating stats in the kernel failed after retries. File a ticket with Cumulus Support.
          ERROR hal_mlx_sdk_counter_wrap.c:357 ERR sx_api_flow_counter_bulk_set create failed with: No More Resources Hardware offload of ACL rule set failed, typically due to TCAM resource exhaustion. Refer to Troubleshooting ACL Rule Installation Failures.
          ERROR hal_mlx_flx_acl.c:9588 ERR flow_counter_bulk_set create failed with: No More Resources Hardware offload of ACL rule set failed, typically due to TCAM resource exhaustion. Refer to Troubleshooting ACL Rule Installation Failures.
          ERROR hal_mlx_flx_acl.c:3246 ERR BULK counter init failed with No More Resources Hardware offload of ACL rule set failed, typically due to TCAM resource exhaustion. Refer to Troubleshooting ACL Rule Installation Failures.
          ERROR hal_mlx_flx_acl.c:2765 hal_mlx_flx_chain_desc_install returned 0 Hardware offload of ACL rule set failed, typically due to TCAM resource exhaustion. Refer to Troubleshooting ACL Rule Installation Failures.
          ERROR hal_mlx_flx_acl.c:1981 ERR acl_plan_install returned 0 Hardware offload of ACL rule set failed, typically due to TCAM resource exhaustion. Refer to Troubleshooting ACL Rule Installation Failures.
          ERROR sync_acl hardware installation failed Hardware offload of ACL rule set failed, typically due to TCAM resource exhaustion or unsupported rules. Refer to Troubleshooting ACL Rule Installation Failures.
          ERROR sync_acl.c:225 ERR BULK counter init failed with No More Resources Hardware offload of ACL rule set failed, typically due to TCAM resource exhaustion. Refer to Troubleshooting ACL Rule Installation Failures.
          ERROR sync_acl.c:6669 ERR BULK counter init failed with No More Resources Hardware offload of ACL rule set failed, typically due to TCAM resource exhaustion. Refer to Troubleshooting ACL Rule Installation Failures.
          ERROR ACL: Restore of current table failed: sync_acl hardware installation failed Hardware restore of previously installed ACL rule set failed. File a ticket with Cumulus Support.
          ERROR kernel tunnel not found for if_key [str] L2: Failed to lookup a tunnel interface. Check configuration.
          ERROR [str] duplicate member [str] for bridge [int] L2: Duplicate bridge in hash table. File a ticket with Cumulus Support.
          ERROR tc: u32 ip: unknown match: handle: [hex] index:[int] off:[int] offmask:[hex] val:[hex] mask: [hex] TC rule hardware offload unsupported. Remove TC rules.
          ERROR tc: u32 ip: match parse failed: handle: [hex], index:[int], rv: [int] TC rule hardware offload unsupported. Remove TC rule.
          ERROR TC: [str] TC rule hardware offload unsupported. Remove TC rule.
          ERROR TC: sync_clss hardware installation failed TC rule hardware offload unsupported. Remove TC rule.
          ERROR IPRULE: sync to h/w failed in non atomic mode. IPRULE rules deleted. Please retry Failed to install ACL rules. Remove ACL rules and reinstall ACL rules.
          ERROR IPRULE: event handler failed Failed to install ACL rules. Remove ACL rules and reinstall ACL rules.
          ERROR TC: sync to h/w failed in non atomic mode. TC rules deleted. Please retry Failed to install ACL rules. Remove ACL rules and reinstall ACL rules.
          ERROR TC: event handler failed Syncing database failed between kernel and switchd. File a ticket with Cumulus Support.
          ERROR sigaction failed for signal [int], [str] Failed to initialize signal handler. File a ticket with Cumulus Support.
          ERROR Ignoring VRF [str]; table id 0 is reserved for default VRF Failed to add a new entry in the ARP table.
          ERROR Unable to setup handling of SIGHUP for log rotation: [str] Failed to create a polling thread in NIC init. File a ticket with Cumulus Support.
          ERROR read error on fd errno [int] Failed to attach a port in NIC. File a ticket with Cumulus Support.
          ERROR [str] duplicate member [str] for bridge [int] Failed to allocate a member in the kernel bridge. Please check there are no duplicate bridge members.
          ERROR [str]: no context Failed to append the CSV command in Prescriptive Topology Manager. File a ticket with Cumulus Support.
          ERROR [str]: Could not append key Failed to append the CSV command in Prescriptive Topology Manager. File a ticket with Cumulus Support.
          ERROR [str]: Could not append val Failed to append the CSV command in Prescriptive Topology Manager. File a ticket with Cumulus Support.
          ERROR [str]: Could not allocate csv Failed to initialize the CSV command in Prescriptive Topology Manager. File a ticket with Cumulus Support.
          ERROR [str]: Could not allocate record Failed to initialize the CSV command in Prescriptive Topology Manager. File a ticket with Cumulus Support.
          ERROR [str]: Could not allocate context Failed to initialize the CSV command in Prescriptive Topology Manager. File a ticket with Cumulus Support.
          ERROR [str]: no context Failed to complete the CSV command in Prescriptive Topology Manager. File a ticket with Cumulus Support.
          ERROR [str]: cannot serialize Failed to complete the CSV command in Prescriptive Topology Manager. File a ticket with Cumulus Support.
          ERROR fatal recv error([str]), closing connection, rc [int] Failed to complete the CSV command in Prescriptive Topology Manager. File a ticket with Cumulus Support.
          ERROR Cannot allocate csv for msg Failed to read the CSV command in Prescriptive Topology Manager. File a ticket with Cumulus Support.
          ERROR [str]: Could not allocate context Failed to decode the CSV command in Prescriptive Topology Manager. File a ticket with Cumulus Support.
          ERROR STP mode_set failed for port [int]: [str] Failed to set spanning tree mode in port [uint], error msg [str]. Forwarding behavior would be impacted by this failure. File a ticket and contact Cumulus Support.
          ERROR failed to set port [int] vlan_ingress_filter enable Failed to set VLAN ingress filter for port [uint], error msg [str]. File a ticket and contact Cumulus Support.
          ERROR failed to set FDB polling interval swid [uint]: [str] Failed to set FDB polling interval for Mellanox SDK switchd id [int], error msg [str]. Failure to do this impacts MAC address learning behavior. File a ticket and contact Cumulus Support.
          ERROR failed to set FDB notify_params swid [uint]: [str] Failed to set FDB MAC address learning notification in Mellanox SDK for switch id [uint], error msg [str]. This error impacts the capability of the switch to learn MAC address. File a ticket and contact Cumulus Support.
          ERROR failed to create trap group [uint] trap id [uint] swid [uint] group_attr.prio : [int] error: [str] Failed to create the TRAP groups in the Mellanox SDK. Traps groups are used for policing trap IDs, which are used to punt control packets to OS stack. This failure impacts packet forwarding. File a ticket and contact Cumulus Support.
          ERROR failed to open host ifc group [uint] trap id [uint] swid [uint] error [str] Failed to retrieve the file descriptor of the current open channel to the Mellanox SDK, for ifc group [uint] trap ID [uint] swid [uint], error msg [str]. The error is not recoverable. File a ticket and contact Cumulus Support.
          ERROR failed to obtain group [uint] FD for polling Failed to retrieve the FD for a trap group [id]. The error is not recoverable. File a ticket and contact Cumulus Support.
          ERROR failed to define trap [uint] group [uint] swid [uint] error: [str] Failed to set trap ID [uint], trap group [uint], switch ID [uint], for user defined trap, error msg [str]. File a ticket and contact Cumulus Support.
          ERROR failed to set trap [uint] group [uint] swid [uint] error: [str] Failed to set trap ID [uint], trap group [uint], switch ID [uint], for user defined trap, error msg [str]. File a ticket and contact Cumulus Support.
          ERROR failed to register trap [uint] swid [uint] error: [str] Failed to register trap ID [uint] in switch ID [uint] in Mellanox SDK, error msg [str]. File a ticket and contact Cumulus Support.
          ERROR trap_id [uint] was not installed Trap ID [uint] was not installed in the Mellanox SDK. This would impact packet forwarding from the switch ASIC to the control plane. File a ticket and contact Cumulus Support.
          ERROR trap_id [uint] was not installed Trap ID [uint] was not installed in the Mellanox SDK. This would impact packet forwarding from the switch ASIC to the control plane. File a ticket and contact Cumulus Support.
          ERROR dflt_trap_parsing_depth get failed: [str] Failed to retrieve the Mellanox Spectrum chip parsing depth from Mellanox SDK, error msg [str]. Possibly the parsing depth has not been set correctly. This would impact hardware packet forwarding. File a ticket and contact Cumulus Support.
          ERROR new_depth [uint] failed: [str] Failed to set the packet parsing depth [uint] in Mellanox SDK, error msg [str]. This failure impacts hardware packet forwarding. File a ticket and contact Cumulus Support.
          ERROR failed to set trap [uint] group [uint] swid [uint] action [uint] error: [str] Failed to set trap ID [uint], trap group [uint], switch ID [uint], trap action. Failure would lead to the respective control packet not reaching the CPU. File a ticket and contact Cumulus Support.
          ERROR [str] failed to convert trap policer attributes Failed to get the policer unit for policer group name [str]. The policer can be metered in units of packets or bytes. File a ticket and contact Cumulus Support.
          ERROR [str] failed to create policer: [str] Failed to create policer for policer group [str], error msg [str]. Failure to set policer would impact packet forwarding from hardware data path to CPU. File a ticket and contact Cumulus Support.
          ERROR [str] sw_rate_limiter set failed: [str] Failed to set the software rate limiter for policy group [str] in Mellanox SDK, error msg [str]. This failure could impact rate limiting for packets forwarded to CPU. File a ticket and contact Cumulus Support.
          ERROR group [str] failed to edit policer: [str] Failed to modify policer for policer group [str], error msg [str]. Failure to set policer would impact packet forwarding from hardware data path to CPU. File a ticket and contact Cumulus Support.
          ERROR unknown trap group [uint] A trap group ID [uint] unknown to the Mellanox SDK is being used to configure the Mellanox SDK policer. This is an internal configuration error. File a ticket and contact Cumulus Support.
          ERROR group [str] failed to bind policer [uint64]: [str] Policer group [str] with policer ID [uint64] failed to bind in the Mellanox SDK, error msg [str]. This error would impact policing of packets being forwarded from hardware to CPU. File a ticket and contact Cumulus Support.
          ERROR group [str] failed to unbind policer [uint64]: [str] Policer group [str] with policer ID [uint64] failed to unbind in the Mellanox SDK, error msg [str]. This error would impact policing of packets being forwarded from hardware to CPU. File a ticket and contact Cumulus Support.
          ERROR unsupported type [uint] Failed to create a trap counter type as the trap counter type [uint] does not match one of the well-defined ones in the Mellanox SDK. This is an internal configuration error. File a ticket and contact Cumulus Support.
          ERROR unsupported type [uint] Failed to create a trap counter type as the trap counter type [uint] does not match one of the well-defined ones in Mellanox SDK. This is an internal configuration error. File a ticket and contact Cumulus Support.
          ERROR type [uint] failed: [str] Failed to retrieve the host IFC counter for the counter type [int] from the Mellanox SDK, error msg [str]. File a ticket and contact Cumulus Support.
          ERROR unknown meter_type [uint] Incorrect policer unit [uint] used to determine the policer group meter unit. The policer can be metered in units of packets or bytes. This is an internal error. File a ticket and contact Cumulus Support.
          ERROR unrecognized lid [hex] Failed to retrieve the interface key from logical port id [uint]. File a ticket and contact Cumulus Support.
          ERROR [str] unexpected duplicate key [uint] Failed to add an interface [str] vport with internal VLAN ID [uint] in external VLAN vport hash table because of duplicate entry [uint]. File a ticket and contact Cumulus Support.
          ERROR [str] int_vid [uint] ext_vid [uint]: [str] Failed to create a virtual port from logical port, interface [str], internal vlan id [uint] and external vlan id[uint], error msg [str] File a ticket and contact Cumulus Support.
          ERROR Unexpected duplicate vport_lid [hex] for [str] Failed to add vport logical interface id [uint], interface [str], in vlan vport hash table because of duplicate entry [uint] File a ticket and contact Cumulus Support.
          ERROR delete failed for [str] int_vid [uint] ext_vid [uint]: [str] Failed to delete a virtual port from logical port, interface [str], internal vlan id [uint] and external vlan id [uint], error msg [str] File a ticket and contact Cumulus Support.
          ERROR [str] vrid not found for table [uint] virtual router id not found in virtual id table [id] in software File a ticket and contact Cumulus Support.
          ERROR delete failed for int_vid [uint] ext_vid [uint] vport_lid [hex] : [str] Failed to delete a virtual port from logical port, internal vlan id [uint], virtual port logical if [uint] and external vlan id [uint], error msg [str] File a ticket and contact Cumulus Support.
          ERROR [str] vrid not found for table [uint] virtual router id not found in virtual id table [id] in software File a ticket and contact Cumulus Support.
          ERROR port [int] ext_vlan [int] already exists port [uint] and external vlan [uint] already exists in the e2i table File a ticket and contact Cumulus Support.
          ERROR [str] int_vlan [int] already assigned to [str] interface [str] with internal vlan [uint] is already assigned to interface [str] in the e2i table. This is an internal configuration error. File a ticket and contact Cumulus Support.
          ERROR failed to get base bond for [str] Failed to get the bond interface for interface [str] File a ticket and contact Cumulus Support.
          ERROR failed to add to interface ht s[str] Failed to add interface [str] to the ifp hash table because an entry already exists File a ticket and contact Cumulus Support.
          ERROR [str] old_int_vlan [int] inconsistent interface [str] with vlan [uint] is inconsistent in the e2i table File a ticket and contact Cumulus Support.
          ERROR [str] new_int_vlan [int] already assigned to [str] interface [str] with internal vlan [uint] is already assigned to interface [str] in the e2i table. This is an internal configuration error. File a ticket and contact Cumulus Support.
          ERROR UC flood block [uint] failed for [str] vlan [uint]: [str] unicast flood block [uint] failed for interface [str] for vlan id [uint], error msg [str] File a ticket and contact Cumulus Support.
          ERROR learn mode [uint] failed for [str] vlan [uint]: [str] learn mode [uint] failed for interface [str] and internal vlan [uint], error msg [str] File a ticket and contact Cumulus Support.
          ERROR error processing bridge vlan information error processing bridge vlan information File a ticket and contact Cumulus Support.
          ERROR bond_mbrs_vlan_port_set failed for bond: [int] failed to set vlan for bond members for bond id [uint] File a ticket and contact Cumulus Support.
          ERROR unsupported interface type: [uint] unsupported interface type [uint] File a ticket and contact Cumulus Support.
          ERROR cannot find STG for bridge_vlan [uint] vid [uint] cannot find the spanning tree group for bridge vlan [uint] and vlan id [uint] File a ticket and contact Cumulus Support.
          ERROR flood_mode_set failed for swid [int] vid [int] Flood mode could not be set for unregistered multicast in swid [uint] vlan id [uint] File a ticket and contact Cumulus Support.
          ERROR vlans set failed for [str]: [str] setting of vlan failed for interface [str], error msg [str] File a ticket and contact Cumulus Support.
          ERROR qinq mode set failed for [str]: [str] failed to set qinq mode for interface [str], error msg [str] File a ticket and contact Cumulus Support.
          ERROR qinq mode set failed for [str]: [str] failed to set qinq mode for bond for interface [str], error msg [str] File a ticket and contact Cumulus Support.
          ERROR unsupported interface type: [uint] unsupported interface type [uint] File a ticket and contact Cumulus Support.
          ERROR bond id [uint] not fully created bond id [uint] creation is not complete File a ticket and contact Cumulus Support.
          ERROR cannot find bridge vlan for bridge: [int] unable to find bridge vlan for the bridge id [uint] File a ticket and contact Cumulus Support.
          ERROR cannot find bond vlan for bond cannot find bond vlan for the bond File a ticket and contact Cumulus Support.
          ERROR cannot allocate vlan for bond interface Failed to allocate vlan for bond interface File a ticket and contact Cumulus Support.
          ERROR cannot allocate vlan for sub-interface Failed to allocate vlan for sub interface File a ticket and contact Cumulus Support.
          ERROR gre tunnel decap entry creation failed : [str] Failed to create decapsulation entry in Mellanox SDK. Decapsulation of GRE packets would not be operational, error msg [str]. File a ticket and contact Cumulus Support.
          ERROR gre tunnel decap destroy failed : [str] Failed to delete the GRE decapsulation entry in Mellanox SDK, error msg [str] File a ticket and contact Cumulus Support.
          ERROR gre tunnel curr decap entry delete failed : [str] Failed to delete the GRE decapsulation entry in Mellanox SDK, error msg [str] File a ticket and contact Cumulus Support.
          ERROR gre tunnel new decap entry update failed : [str] Failed to create decapsulation entry in Mellanox SDK. Decapsulation of GRE packets would not be operational, error msg [str]. File a ticket and contact Cumulus Support.
          ERROR failed to make logical gre key Failed to form the logical GRE key from the interface information provided File a ticket and contact Cumulus Support.
          ERROR failed to make gre decap key Failed to form the logical decap GRE key from the information provided File a ticket and contact Cumulus Support.
          ERROR failed to make overlay key from underlay key Failed to create overlay gre key from underlay information File a ticket and contact Cumulus Support.
          ERROR unable to find gre entry for tunnel id ([hex] Failed to find gre entry from tunnel id in the gre tunnel key hash table, using tunnel id [uint] File a ticket and contact Cumulus Support.
          ERROR duplicate entry in overlay ht : ifindex ([int] Unable to add a duplicate gre entry with ifindex [uint] in the gre overlay hash table. A duplicate config is being attempted File a ticket and contact Cumulus Support.
          ERROR failed to create overlay rif : ifindex : [int] tunnel type [uint] key [uint] Unable to create an overlay router interface with ifindex [uint], tunnel type [uint] and tunnel key [uint]. File a ticket and contact Cumulus Support.
          ERROR gre tunnel creation failed: [str] : Failed to create tunnel id for GRE in Mellanox SDK, error msg [str] File a ticket and contact Cumulus Support.
          ERROR invalid argument GRE update was called with invalid GRE information; no operation is performed. File a ticket and contact Cumulus Support.
          ERROR gre tunnel ([hex]) update failed: [str] : Failed to update tunnel id [hex] for GRE in Mellanox SDK, error msg [str] File a ticket and contact Cumulus Support.
          ERROR gre tunnel destroy failed: [str] Failed to delete tunnel id for GRE in Mellanox SDK, error msg [str] File a ticket and contact Cumulus Support.
          ERROR loopback rif for ifindex ([int]) : [str] Failed to add the loopback router interface, interface ifindex [uint] in Mellanox SDK, error [str] File a ticket and contact Cumulus Support.
          ERROR ifindex ([int]) overlay rif ([int]) : [str] Failed to delete the loopback router interface, interface ifindex [uint], overlay router interface [uint] in Mellanox SDK, error [str] File a ticket and contact Cumulus Support.
          ERROR cannot allocate bridge vlan for bridge id [int] Failed to allocate bridge vlan for bridge id [uint] Check The Cumulus Linux Configuration guide
          ERROR flood_mode_set failed for swid [int] vid [int] Flood mode could not be set for unregistered multicast in swid [uint] vlan id [uint] File a ticket and contact Cumulus Support.
          ERROR cannot allocate ln_vlan [uint] for bridge_id [int] Failed to allocate vlan [uint] for bridge id [uint] Check The Cumulus Linux Configuration guide
          ERROR flood_mode_set failed for swid [int] vid [int] Flood mode could not be set for unregistered multicast in swid [uint] vlan id [uint] File a ticket and contact Cumulus Support.
          ERROR vlan [uint] not yet allocated vlan [uint] does not exist for the bridge and is not allocated. Check The Cumulus Linux Configuration guide
          ERROR [str] bridge_id [uint] vlan [uint] port [hex] failed: [str] failed to add a unicast mac address [str] on bridge id [uint] vlan [uint] port [uint]. This could be an internal error or could be because of configuration error File a ticket and contact Cumulus Support.
          ERROR [str] bridge_id [uint] vlan [uint] port [hex] failed: [str] failed to delete a unicast mac address [str] on bridge id [uint] vlan [uint] port [uint]. This could be an internal error or could be because of configuration error File a ticket and contact Cumulus Support.
          ERROR num_macs [uint] num_failed_macs [uint] delete failed: [str] failed to delete a number [uint] of unicast mac address, error msg [str]. This could be an internal error or could be because of configuration error File a ticket and contact Cumulus Support.
          ERROR num_macs [uint] learn set failed: [str] failed to add [uint] unicast mac addresses, error msg [str]. This could be because of resource exhaustion. File a ticket and contact Cumulus Support.
          ERROR num_macs [uint] delete failed: [str] failed to delete a number [uint] of unicast mac address, error message [str]. This could be an internal error or could be because of configuration error File a ticket and contact Cumulus Support.
          ERROR age_time set failed [str] on swid [uint] Failed to set fdb ageing time. This would cause the mac addresses in FDB not to age out. File a ticket and contact Cumulus Support.
          ERROR cannot find vlan for brmac [str] vfid [uint] vlan [uint] does not exist for the bridge, so the vlan for the bridge mac address could not be found. Check The Cumulus Linux Configuration guide
          ERROR vfid not set for vlan [uint] failed to return a translated vlan id for vlan [uint] File a ticket and contact Cumulus Support.
          ERROR num_macs [uint] delete failed: [str] failed to delete a number [uint] of unicast mac address, error msg [str]. This could be an internal error or could be because of configuration error File a ticket and contact Cumulus Support.
          ERROR bridge_vlan [uint] expected swid [uint] but found [uint] bridge vlan id [uint] expected switch id [uint], but the vlan has switch id [uint]. Check The Cumulus Linux Configuration guide
          ERROR num_macs [uint] delete failed: [str] failed to delete a number [uint] of unicast mac address, error msg [str]. This could be an internal error or could be because of configuration error File a ticket and contact Cumulus Support.
          ERROR bridge_vlan [uint] expected swid [uint] but found [uint] bridge vlan id [uint] expected switch id [uint], but the vlan has switch id [uint]. Check The Cumulus Linux Configuration guide
          ERROR get failed: [str] Failed to get fdb unicast mac address from Mellanox SDK, error msg [str] File a ticket and contact Cumulus Support.
          ERROR num_macs [uint] delete failed: [str] failed to delete a unicast mac address [str] on bridge id [uint] vlan [uint] port [uint]. This could be an internal error or could be because of configuration error File a ticket and contact Cumulus Support.
          ERROR internal vlans exhausted The total number of internal vlans has been exhausted. No more vlans can be added. Check The Cumulus Linux Configuration guide
          ERROR identity map failed for vlan [uint]: [str] Failed to map the forwarding id to the vlan id [uint] in Mellanox SDK, error msg [str] File a ticket and contact Cumulus Support.
          ERROR learn mode_failed for vlan [uint]: [str] failed to set the learning mode for vlan id [uint], error message [str] File a ticket and contact Cumulus Support.
          ERROR failed to get members for vlan [uint]: [str] Failed to get member port for vlan [uint], error message [uint] File a ticket and contact Cumulus Support.
          ERROR vlan [uint] is not an L3 vlan vlan [uint] entry is not representing a l3 interface Check The Cumulus Linux Configuration guide
          ERROR unsupported interface type: [uint] unsupported interface type [uint] File a ticket and contact Cumulus Support.
          ERROR [hex] int_vlan [uint] failed: [str] failed to set the internal vlan [uint] for the logical port id [uint], error msg [str] Check The Cumulus Linux Configuration guide
          ERROR [hex] pvid [uint] failed: [str] failed to set the logical interface [uint] to vlan id [uint], error msg [str] Check The Cumulus Linux Configuration guide
          ERROR [hex] int_vlan [uint] failed: [str] failed to unset the internal vlan [uint] for the logical port id [uint], error msg [str] Check The Cumulus Linux Configuration guide
          ERROR [hex] revert pvid: [str] failed to delete the logical interface [uint] to vlan id [uint], error msg [str] Check The Cumulus Linux Configuration guide
          ERROR unsupported interface type: [uint] unsupported interface type [uint] File a ticket and contact Cumulus Support.
          ERROR unsupported interface type: [uint] unsupported interface type [uint] File a ticket and contact Cumulus Support.
          ERROR failed for lid [hex] int_vlan [uint] STG [uint]: [str] Failed to set the spanning tree group for logical interface [uint], internal vlan [uint] spanning tree group [uint], error msg [str]. File a ticket and contact Cumulus Support.
          ERROR unsupported if_type [uint] unsupported interface type [uint] File a ticket and contact Cumulus Support.
          ERROR port [str] not established port interface [str] has not been established yet File a ticket and contact Cumulus Support.
          ERROR failed for [str] lid [hex]: [str] Failed to set the port, logical id [uint], interface [str], to accept the frame type, error msg [str] File a ticket and contact Cumulus Support.
          ERROR list allocation failed failed to allocate memory for the ports File a ticket and contact Cumulus Support.
          ERROR STGs exhausted The total number of spanning tree groups has been exhausted. Please consult the configuration manual. Check The Cumulus Linux Configuration guide
          ERROR MSTP instance set failed for STG [int]: [str] Failed to set the MSTP instance for the spanning tree group [uint] in Mellanox SDK, error msg [str] Check The Cumulus Linux Configuration guide
          ERROR failed to delete STG [uint]: [str] Failed to delete the MSTP instance for the spanning tree group [uint] in Mellanox SDK, error msg [str] Check The Cumulus Linux Configuration guide
          ERROR failed to add vlan [int] to STG [int]: [str] Failed to add vlan [uint] to spanning tree group [uint] in Mellanox SDK, error msg [str] File a ticket and contact Cumulus Support.
          ERROR failed to remove vlan [int] from STG [int]: [str] Failed to delete vlan [uint] from spanning tree group [uint] in Mellanox SDK, error msg [str]. File a ticket and contact Cumulus Support.
          ERROR vlan [uint] not yet created vlan [uint] does not exist and is not allocated Check The Cumulus Linux Configuration guide
          ERROR STG [int] not yet created spanning tree group id [uint] is not created File a ticket and contact Cumulus Support.
          ERROR Duplicate vfid [uint] Failed to add virtual forwarding id [uint] in hash table because of a duplicate entry File a ticket and contact Cumulus Support.
          ERROR fdb_uc_mac_addr_get failed: [str] Failed to get fdb unicast mac address from Mellanox SDK, error msg [str] File a ticket and contact Cumulus Support.
          ERROR failed to allocate mac_list Failed to allocate mac address list Check The Cumulus Linux Configuration guide
          ERROR init set failed: [str] Initialization of router module in Mellanox SDK failed. File a ticket and contact Cumulus Support.
          ERROR hash params set failed: [str] Initialization of router ecmp hash module in Mellanox SDK failed. File a ticket and contact Cumulus Support.
          ERROR router #[uint] set failed: [str] Initialization of virtual router id [uint] failed, because of error [str] in Mellanox SDK. Check The Cumulus Linux Configuration guide
          ERROR [str] failed cmd [str] vlan [uint] mac [str] fwd_state [str]: [str] Routed interface description [str] failed to add/update/delete [str] with vlan [uint] mac [str] forwarding state [str] Check The Cumulus Linux Configuration guide
          ERROR [str] cmd [str] failed for vlan [uint] mac [str]: [str] Routed interface description [str] failed to add/update/delete [str] with vlan [uint] mac [str] error [str] Check The Cumulus Linux Configuration guide
          ERROR failed for intf [uint]: [str] Failed to retrieve router interface id [uint], error [str]. File a ticket and contact Cumulus Support.
          ERROR failed for vlan [uint] intf [uint]: [str] Failed to delete interface for vlan [uint] interface id [uint] error [str] File a ticket and contact Cumulus Support.
          ERROR neigh delete all failed for intf [uint]: [str] Deletion of all the neighbor entries for the interface id [uint] failed, error [str]. File a ticket and contact Cumulus Support.
          ERROR interface state set failed for l3_intf_id [uint]: [str] Failed to set state for l3 interface id [uint] error [str] File a ticket and contact Cumulus Support.
          ERROR invalid router mac [str] check for a unicast mac address [str] failed Check The Cumulus Linux Configuration guide
          ERROR failed for l3_intf_id [int] mac [str] vlan [uint]: [str] Failed to add mac address [str] for interface id [uint] vlan [uint] error [str] File a ticket and contact Cumulus Support.
          ERROR failed for l3_intf_id [int] mac [str] vlan [uint]: [str] Failed to delete mac address [str] for interface id [uint] vlan [uint] error [str] File a ticket and contact Cumulus Support.
          ERROR Invalid table id: must be between [int] and [int] An attempt was made to program an invalid vrf id; the vrf id must be in the range [uint] - [uint]. File a ticket and contact Cumulus Support.
          ERROR calloc failed to allocate a new ECMP entry Failed to allocate a new entry for ECMP File a ticket and contact Cumulus Support.
          ERROR unexpected duplicate SDK ecmp id [uint] (HAL ECMP id [int] Failed to add an ECMP entry id [uint] in hash table because of duplicate ECMP ID [uint] File a ticket and contact Cumulus Support.
          ERROR unable to reconstruct original route [str] Failed to construct a route [str] from a hardware entry to a hardware abstraction layer entry, for the purpose of updating the existing entry. File a ticket and contact Cumulus Support.
          ERROR unable to obtain HW info for original route [str] Failed to retrieve a route [str] from the hardware, for the purpose of updating the existing entry File a ticket and contact Cumulus Support.
          ERROR activity get failed: [str] Failed to check the neighbor activity information, error [str] File a ticket and contact Cumulus Support.
          ERROR route [str] not found in hal_routes Failed to find the route [str] in software File a ticket and contact Cumulus Support.
          ERROR unexpected duplicate l3_intf_id [uint] Found duplicate l3 interface id [uint] in software File a ticket and contact Cumulus Support.
          ERROR unexpected duplicate RIF param [uint] Found duplicate l3 interface id [uint] parameters in software File a ticket and contact Cumulus Support.
          ERROR unexpected duplicate l3_intf_id [uint] vlan [uint] Found duplicate l3 interface id [uint] having vlan [uint] in software File a ticket and contact Cumulus Support.
          ERROR failed for vlan [uint] l3_intf_id [uint]: [str] Failed to delete the router interface, vlan [uint] interface if [uint] in Mellanox SDK, error [str] File a ticket and contact Cumulus Support.
          ERROR unexpected duplicate entry Failed to add the router interface because of duplicate entry File a ticket and contact Cumulus Support.
          ERROR vlan not found for l3_intf_id [uint] Failed to retrieve l3 interface [uint] param. File a ticket and contact Cumulus Support.
          ERROR [str] vlan [uint] does not exist for bridge_id [int] vlan [uint] does not exist for the bridge id [uint]. Check The Cumulus Linux Configuration guide
          ERROR [str] no bridge exists for bridge_id [int] bridge id [uint] does not exist in the software database. Check The Cumulus Linux Configuration guide
          ERROR [str] cannot allocate vlan [uint] for bridge_id [int] failed to allocate bridge id [uint] for vlan id [uint] in Mellanox SDK. File a ticket and contact Cumulus Support.
          ERROR interface [str] not an svi Interface [str] type is expected to be SVI, but it is not SVI File a ticket and contact Cumulus Support.
          ERROR [str] vrid not found for table [uint] virtual router id not found in virtual id table [id] in software File a ticket and contact Cumulus Support.
          ERROR invalid interface: [str] An interface check has found that the interface [str] is invalid. File a ticket and contact Cumulus Support.
          ERROR [str] vrid not found for table [uint] virtual router id not found in virtual id table [id] in software File a ticket and contact Cumulus Support.
          ERROR neighbor set failed for netdev_rif [uint]: [str] Setting up of a neighbor failed on the netdev router interface [uint] File a ticket and contact Cumulus Support.
          ERROR neighbor delete failed: [str] Deletion of the neighbor entry failed, error [str] File a ticket and contact Cumulus Support.
          ERROR unexpected duplicate: Addition of the interface to the software table failed because of an already existing interface File a ticket and contact Cumulus Support.
          ERROR Failed to get vrid for table [uint] virtual router id not found in virtual id table [id] in software File a ticket and contact Cumulus Support.
          ERROR Failed to get vrid for table [uint] virtual router id not found in virtual id table [id] in software File a ticket and contact Cumulus Support.
          ERROR route_op [uint] neighbor route must have a valid netdev: route operation to add/del [uint] failed as the route does not have a valid next hop net device File a ticket and contact Cumulus Support.
          ERROR cannot find vlan_if for next hop [str] vlan interface [str] could not be found for the next hop, as the next hop programming is being done on a vlan interface File a ticket and contact Cumulus Support.
          ERROR cannot find parent bond info Failed to retrieve the master bond interface descriptor for port id [uint] and vlan id [uint] File a ticket and contact Cumulus Support.
          ERROR no RIF found for [str] Router interface [str] not found for neighbor File a ticket and contact Cumulus Support.
          ERROR unexpected rif type [str] Router interface type [str] is not a valid router interface type. File a ticket and contact Cumulus Support.
          ERROR neigh_get vrid [uint] failed: [str] Failed to retrieve the neighbor from the virtual router id [uint] File a ticket and contact Cumulus Support.
          ERROR failed to allocate neigh_entry_list Failed to allocate memory for a list of neighbors File a ticket and contact Cumulus Support.
          ERROR route cmd [str] failed for vrid [int]: [str] Failed operation [str] on a unicast route on vrid [uint] File a ticket and contact Cumulus Support.
          ERROR for [str] Failed to set a unicast route [str] in Mellanox SDK. File a ticket and contact Cumulus Support.
          ERROR route delete failed for [str]: [str] Failed to delete route [str] in the Mellanox SDK, error msg [str]. File a ticket and contact Cumulus Support.
          ERROR unexpected duplicate: Found duplicate route in software File a ticket and contact Cumulus Support.
          ERROR unknown route type [uint] Route inspected is an unknown route type [uint], unsupported by Mellanox SDK. File a ticket and contact Cumulus Support.
          ERROR ECMP SDK id [uint] not found ECMP id [uint] for the route not found in the Mellanox SDK. File a ticket and contact Cumulus Support.
          ERROR Failed to get vrid for table [uint] virtual router id not found in virtual id table [id] in software. File a ticket and contact Cumulus Support.
          ERROR too many next hops [int] for hal_route [str] Number of next hops [uint] exceeded the maximum limit for the route [str]. File a ticket and contact Cumulus Support.
          ERROR Cannot allocate an ECMP key Failed to allocate an ECMP key to program a new route. File a ticket and contact Cumulus Support.
          ERROR Failed to get vrid for table [uint] virtual router id not found in virtual id table [id] in software. File a ticket and contact Cumulus Support.
          ERROR unknown route type [uint] Route inspected is an unknown route type [uint], unsupported by Mellanox SDK. File a ticket and contact Cumulus Support.
          ERROR route get failed for prefix_type [uint]: [str] Failed to retrieve a unicast route from Mellanox SDK, error reason [str]. File a ticket and contact Cumulus Support.
          ERROR Failed to get table_id for vrid [uint] Failed to retrieve virtual router table id for virtual router id [uint]. File a ticket and contact Cumulus Support.
          ERROR sx_api_router_uc_route_get_all failed: [str] Failed to retrieve a unicast route from Mellanox SDK, error reason [str]. File a ticket and contact Cumulus Support.
          ERROR failed to allocate uc_route_entry_list Failed to allocate a list for unicast route entries. File a ticket and contact Cumulus Support.
          ERROR ECMP hal_route_to_hw_ecmp_key failed Failed to retrieve an ECMP key for a route from Mellanox SDK. File a ticket and contact Cumulus Support.
          ERROR unable to find gre entry for tunnel Failed to find a GRE tunnel entry. File a ticket and contact Cumulus Support.
          ERROR unable to make logical gre key from if_key Failed to create a logical GRE key from the interface key. File a ticket and contact Cumulus Support.
          ERROR ECMP route contains one or more unresolvable nexthops ECMP route has one or more unresolved next hops, so the route is not programmed. File a ticket and contact Cumulus Support.
          ERROR ecmp: can’t hold the ECMP entry ECMP container is NULL or invalid, so the route entry is not programmed. File a ticket and contact Cumulus Support.
          ERROR onlink host route key setup failed Failed to create onlink host route key. File a ticket and contact Cumulus Support.
          ERROR onlink host route creation failed Failed to create onlink host route. File a ticket and contact Cumulus Support.
          ERROR sx_api_router_ecmp_clone_set failed on parent SDK ECMP ID [int]: [str] Failed to clone ecmp route to the parent ecmp File a ticket and contact Cumulus Support.
          ERROR ecmp: empty ECMP container add failed: [str] Failed to create a new, empty ECMP container with attributes in the Mellanox SDK, status msg [str]. File a ticket and contact Cumulus Support.
          ERROR ecmp: ECMP [str] failed: [str] num_next_hops is [int]. Failed operation [str] (create or update) on an ECMP container, status msg [str], with number of next hops [uint]. File a ticket and contact Cumulus Support.
          ERROR unexpected duplicate ECMP key Failed to add an ECMP key in ecmp key hash table, as a duplicate entry already exists. File a ticket and contact Cumulus Support.
          ERROR unexpected duplicate ECMP SDK id [uint] Failed to add an ECMP id in ecmp id hash table, as a duplicate entry already exists. File a ticket and contact Cumulus Support.
          ERROR ecmp_id [uint] delete failed: [str] Failed to delete an ECMP entry, ECMP id [uint], status msg [str]. File a ticket and contact Cumulus Support.
          ERROR onlink host route key setup failed. Failed to create onlink host route key File a ticket and contact Cumulus Support.
          ERROR ecmp pbr refcount: can’t hold the ECMP sdkid: [int] entry. ecmp entry not found in the software while programming entries for PBR. File a ticket and contact Cumulus Support.
          ERROR ecmp pbr refcount: can’t put the ECMP sdkid: [int] entry. ecmp entry not found in the software while programming entries for PBR. File a ticket and contact Cumulus Support.
          ERROR SDK ecmp_id [uint] failed: [str] Failed to set ECMP attributes with ECMP id [uint] in the Mellanox SDK, error msg [str]. File a ticket and contact Cumulus Support.
          ERROR unexpected duplicate ECMP clone ID key Failed to add an ECMP clone id to the clone id hash table because a duplicate entry already exists. File a ticket and contact Cumulus Support.
          ERROR onlink host route not suppported on non-Spectrum backend Onlink host route not supported on non-spectrum backend. File a ticket and contact Cumulus Support.
          ERROR onlink host route not suppported on non-Spectrum backend Onlink host route not supported on non-spectrum backend. File a ticket and contact Cumulus Support.
          ERROR unexpected duplicate: Failed to add an ECMP entry in hash table because a duplicate entry already exists. File a ticket and contact Cumulus Support.
          ERROR cannot find vlan_if for next hop [str] vlan interface [str] could not be found for the next hop, as the next hop programming is being done on a vlan interface. File a ticket and contact Cumulus Support.
          ERROR invalid rif for [str] An interface check found that routed interface [str] is invalid. File a ticket and contact Cumulus Support.
          ERROR hal_clag_set_port_egress_mask failed in backend[[int]] for Failed to install egress mask on MLAG port. File a ticket with Cumulus Support.
          ERROR hal_clag_set_ln_egress_mask failed in backend[[int]] for Failed to install egress mask on VXLAN device. File a ticket with Cumulus Support.
          ERROR ln_key [int] anycast_ip not set MLAG anycast IP is not set. File a ticket with Cumulus Support.
          WARNING sx_api_cos_port_tc_prio_map_get hal port [int] returned [str] ASIC egress queue map configuration read failed. File a ticket with Cumulus Support.
          WARNING sx_api_cos_port_tc_prio_map_set logical port 0x%x returned [str] ASIC egress queue map configuration write failed. File a ticket with Cumulus Support.
          WARNING hal_mlx_priority_source_trust_get HAL port [int] logical port 0x%x returned [str] ASIC priority source trust configuration read failed. File a ticket with Cumulus Support.
          WARNING hal_mlx_priority_source_trust_set HAL port [int] logical port 0x%x returned [str] ASIC priority source trust configuration write failed. File a ticket with Cumulus Support.
          WARNING hal_mlx_rewrite_enable_get HAL port [int] logical port 0x%x returned [str] ASIC priority rewrite enable configuration read failed. File a ticket with Cumulus Support.
          WARNING hal_mlx_rewrite_enable_set HAL port [int] logical port 0x%x returned [str] ASIC priority rewrite enable configuration write failed. File a ticket with Cumulus Support.
          WARNING sx_api_cos_port_tc_mcaware_get hal port [int] returned [str] ASIC MC buffer configuration read failed. File a ticket with Cumulus Support.
          WARNING sx_api_cos_port_tc_mcaware_set hal port [int] returned [str] ASIC MC buffer configuration write failed. File a ticket with Cumulus Support.
          WARNING sx_api_cos_port_tc_prio_map_set hal port [int] returned [str] ASIC egress queue map configuration write failed. File a ticket with Cumulus Support.
          WARNING buffer pool [int] size 0 is invalid: this pool was not created Invalid buffer pool size. Check back end QoS configuration file.
          WARNING Pool configuration mode [int] not recognized: defaulting to buffer units Invalid parameter. Check back end QoS configuration file.
          WARNING sx_api_cos_shared_buff_pool_set for sw pool id [int] to size [int] (mode [int]) failed: [str], ASIC buffer pool configuration write failed. File a ticket with Cumulus Support.
          WARNING Hardware buffer pool index [int] too large (max is [int]) Too many buffer pools for ASIC limit. File a ticket with Cumulus Support.
          WARNING sx_api_cos_port_prio_buff_map_get failed for MLX port [int]: [str] Priority group buffer map read failed. File a ticket with Cumulus Support.
          WARNING switch priority [int] greater than max value [int] Invalid switch priority. Check QoS configuration file.
          WARNING sx_api_cos_port_prio_buff_map_set failed for MLX port [int]: [str] ASIC packet buffer configuration write failed. File a ticket with Cumulus Support.
          WARNING reserved buffer type [int] not recognized Invalid buffer type. File a ticket with Cumulus Support.
          WARNING cos ID [int] larger than maximum switch priority value [int] Invalid internal switch priority value. File a ticket with Cumulus Support.
          WARNING profile element index [int] too large for array size [int]: [int] map entries, priority field [int] Priority profile index is too large for the array. File a ticket with Cumulus Support.
          WARNING [str] failed for MLX port 0x%x, buffer count [int]: [str] ASIC packet buffer configuration write failed. File a ticket with Cumulus Support.
          WARNING sx_api_cos_shared_buff_pool_get failed, cannot get pool size or mode : [str] ASIC buffer pool configuration read failed. File a ticket with Cumulus Support.
          WARNING pool [int] mode [int] not recognized Invalid buffer pool mode. File a ticket with Cumulus Support.
          WARNING MLX logical port 0x%x: cos ID [int] larger than maximum switch priority value [int] Invalid switch priority value. File a ticket with Cumulus Support.
          WARNING unlimited egress buffer for flow controlled switch priority [int]: unicast config may not match multicast config on some ports Possible buffer configuration conflict. File a ticket with Cumulus Support.
          WARNING sx_api_cos_shared_buff_pool_get failed, cannot report pool configurations: [str] ASIC buffer pool configuration read failed. File a ticket with Cumulus Support.
          WARNING sx_api_cos_pools_list_get, pool count == 0, failed: [str] Buffer pool configuration read failed. File a ticket with Cumulus Support.
          WARNING sx_api_cos_pools_list_get, pool count == [uint] failed: [str] Buffer pool configuration read failed. File a ticket with Cumulus Support.
          WARNING _pool_buffer_list_get failed: [str] Buffer pool configuration read failed. File a ticket with Cumulus Support.
          WARNING sx_api_cos_shared_buff_pool_get failed, cannot report pool configurations: [str] ASIC packet buffer configuration write failed. File a ticket with Cumulus Support.
          WARNING sx_api_cos_port_pcpdei_to_prio_get port [int] (0x%x) returned [str] ASIC L2 priority source map get operation failed. File a ticket with Cumulus Support.
          WARNING sx_api_cos_port_buff_type_set failed for HAL port [int]/MLX port [int]: [str] ASIC packet buffer configuration write failed. File a ticket with Cumulus Support.
          WARNING sx_api_cos_port_shared_buff_type_set failed for HAL port [int]/MLX port [int]: [str] ASIC packet buffer configuration write failed. File a ticket with Cumulus Support.
          WARNING sx_api_cos_port_pcpdei_to_prio_set port [int] logical port 0x%x returned [str] ASIC L2 priority source map set operation failed. File a ticket with Cumulus Support.
          WARNING sx_api_cos_port_dscp_to_prio_get port [int] (0x%x) returned [str] ASIC L3 priority source map get operation failed. File a ticket with Cumulus Support.
          WARNING sx_api_cos_port_dscp_to_prio_set port [int] returned [str] ASIC L3 priority source map set operation failed. File a ticket with Cumulus Support.
          WARNING sx_api_cos_port_prio_to_pcpdei_rewrite_set hal port [int] element count [int]: returned [str] ASIC L2 priority remark map set operation failed. File a ticket with Cumulus Support.
          WARNING switch priority [int] color [int] pcp [int] deci [int] failed L2 priority remark map. File a ticket with Cumulus Support.
          WARNING sx_api_cos_port_prio_to_dscp_rewrite_set hal port [int] logical_port [int] element count [int]: returned [str] ASIC L3 priority remark map set operation failed. File a ticket with Cumulus Support.
          WARNING switch priority [int] color [int] dscp [int] failed L3 priority remark map. File a ticket with Cumulus Support.
          WARNING sx_api_cos_port_ets_element_get logical port 0x%x returned [str] ASIC scheduler configuration read failed. File a ticket with Cumulus Support.
          WARNING sx_api_cos_port_ets_element_set (destroy) logical port 0x%x returned [str] ASIC scheduler configuration write failed. File a ticket with Cumulus Support.
          WARNING sx_api_cos_port_ets_element_get hal port [int] returned [str] ASIC scheduler configuration read failed. File a ticket with Cumulus Support.
          WARNING sx_api_cos_port_ets_element_set logical port 0x%x level [int] index [int] returned [str] ASIC scheduler configuration write failed. File a ticket with Cumulus Support.
          WARNING sx_api_cos_port_ets_element_get hal port [int] returned [str] ASIC scheduler configuration read failed. File a ticket with Cumulus Support.
          WARNING sx_api_cos_port_ets_element_set (destroy) hal port [int] returned [str] ASIC scheduler configuration write failed. File a ticket with Cumulus Support.
          WARNING sx_api_cos_port_ets_element_set hal port [int] level [int] index [int] returned [str] ASIC scheduler configuration write failed. File a ticket with Cumulus Support.
          WARNING sx_api_port_pfc_enable_set hal port [int] returned [str] ASIC priority flow control configuration failed. File a ticket with Cumulus Support.
          WARNING switch priority [int] is not supported for MLX unit Internal switch priority not supported. Check QoS configuration file.
          WARNING sx_api_cos_redecn_general_param_get returned [str] ASIC ECN configuration failed. File a ticket with Cumulus Support.
          WARNING hal_mlx_ecn_red_set HAL port [int] min_threshold_bytes [int] is less than minimum size, using [int] bytes Invalid parameter. Check QoS configuration file.
          WARNING hal_mlx_ecn_red_set HAL port [int] max_threshold_bytes [int] is greater than maximum size, using [int] bytes Invalid parameter. Check QoS configuration file.
          WARNING sx_api_cos_redecn_profile_set returned [str] ASIC ECN configuration failed. File a ticket with Cumulus Support.
          WARNING sx_api_cos_redecn_tc_enable_set returned [str] ASIC ECN configuration failed. File a ticket with Cumulus Support.
          WARNING sx_api_cos_redecn_profle_tc_bind_set for hal port [int] flow type [int] returned [str] ASIC ECN configuration failed. File a ticket with Cumulus Support.
          WARNING hal_sh_datapath_file_read: egress port MC buffer percent [perc] reduced to 100.0 Invalid egress port MC buffer value. Check back end QoS configuration file.
          WARNING priority group PG ID [int] is larger than the PG ID mask size [int] Invalid priority group ID value configured. Check back end QoS configuration file.
          WARNING No priority group ID found for lossless traffic No priority group ID found for lossless traffic. File a ticket with Cumulus Support.
          WARNING _queue_info_set: port_q_count_get failed for hal port [int] Could not find port queue limits. File a ticket with Cumulus Support.
          WARNING hal_sh_datapath_packet_buffer_set: [str] Back end packet buffer config failed. Check for detailed log messages.
          WARNING unable to set FEC parameters while autoneg is enabled Invalid operation for the current port configuration. Disable auto-negotiation on the port.
          WARNING ethtool settings nwords too large: [int] Invalid parameter: using default. File a ticket with Cumulus Support.
          WARNING _port_group_priority_map_get: arg is NULL Invalid parameter. File a ticket with Cumulus Support.
          WARNING _port_group_config_values_get: _port_group_find failed on [str] [str] Port group not found or created. File a ticket with Cumulus Support.
          WARNING _port_group_set_get: [str] port set not found Port group port set not found. Check QoS configuration file.
          WARNING _port_pause_config: config_port_pause failed: [str] ASIC port pause configuration failed. File a ticket with Cumulus Support.
          WARNING _priority_flow_control_config: hal_port_pfc_set failed on hal port [int]: [str] ASIC priority flow control configuration failed. File a ticket with Cumulus Support.
          WARNING _config_port_packet_buffers: [str] ASIC packet buffer config failed. File a ticket with Cumulus Support.
          WARNING _switch_priority_config: hal port [int]: [str] _switch_priority_config: hal port [int]: [str] File a ticket with Cumulus Support.
          WARNING _config_port_packet_buffers: [str] ASIC packet buffer config failed. File a ticket with Cumulus Support.
          WARNING _priority_map_config: priority map direction [int] is larger then max value HAL_DATAPATH_PRIORITY_DIRECTION_MAX Invalid parameter. File a ticket with Cumulus Support.
          WARNING _priority_map_config: packet priority field [int] not supported Invalid packet priority field(s). Check QoS configuration file.
          WARNING _hash_config: route_ecmp_max_paths_set failed: [str] ASIC ECMP configuration failed. File a ticket with Cumulus Support.
          WARNING _hash_config: hash config failed: [str] ASIC symmetric hash configuration failed. File a ticket with Cumulus Support.
          WARNING _hash_config: ecmp hash seed config faild: [str] ASIC ECMP hash configuration failed. File a ticket with Cumulus Support.
          WARNING _hash_config: hash config failed: [str] ASIC resilient hash configuration failed. File a ticket with Cumulus Support.
          WARNING _mpls_config: mpls enable config failed: [str] ASIC MPLS configuration failed. File a ticket with Cumulus Support.
          WARNING _ecn_red_config: hal_datapath_ecn_red_set failed on hal port [int]: [str] ASIC ECN/RED configuration failed. File a ticket with Cumulus Support.
          WARNING _port_attribute_mark: flow control configuration conflict on hal port [int]: Flow control configuration conflict. Check QoS configuration file.
          WARNING hal_datapath_forwarding_profile_get: forwarding table profile path was NULL Memory allocation failed. File a ticket with Cumulus Support.
          WARNING hal_datapath_forwarding_profile_get: sfs_config_get failed for [str] Missing forwarding table profile configuration. Check QoS configuration file.
          WARNING hal_datapath_init: packet priority source mapping configuration failed Packet priority source map configuration failed. Check for detailed log messages.
          WARNING hal_datapath_init: packet priority remark configuration failed Packet priority remark map configuration failed. Check for detailed log messages.
          WARNING _config_value_read: sfs path is null Invalid parameter. File a ticket with Cumulus Support.
          WARNING _config_value_read: sfs_config_get [str] failed Configuration parameter not found. File a ticket with Cumulus Support.
          WARNING _config_value_read: sfs_config_get [str] returned NULL configuration Configuration value not found. File a ticket with Cumulus Support.
          WARNING _cos_show_node_create: sfs_add failed for CoS node switchd fuse node create failed. File a ticket with Cumulus Support.
          WARNING _priority_map_get: remark list has more than one packet priority value: configuring the first value Ignoring surplus remark values. Check QoS configuration file.
          WARNING _priority_map_list_get: [str] ASIC priority profile create failed. File a ticket with Cumulus Support.
          WARNING _switch_priority_config_values_get: scheduling algorithm [str] not recognized Invalid scheduling algorithm. Check QoS configuration file.
          WARNING hal_list_get: list type [int] is not supported Invalid list type. File a ticket with Cumulus Support.
          WARNING Couldn’t read a random number [int] setting seed to [uint] No ECMP hash seed found in file. Check file or accept default random seed.
          WARNING Couldn’t read a random number [int] setting seed1 to [uint] No ECMP hash seed1 found in file. Check file or accept default random seed.
          WARNING Neighbor entry is not IPv4 or v6: [int]! Netlink neighbor object has invalid family. File a ticket with Cumulus Support.
          WARNING Neighbor entry had unexpected flags [int] Netlink neighbor object has unsupported flags. File a ticket with Cumulus Support.
          WARNING [str]: route table mode [int] not supported Invalid route table ASIC mode. File a ticket with Cumulus Support.
          WARNING [str]: host table mode [int] not supported Invalid neighbor table ASIC mode. File a ticket with Cumulus Support.
          WARNING Route [[str]] is not IPv4 or v6 or MPLS, family: [int] Netlink route object has invalid family. File a ticket with Cumulus Support.
          WARNING Route [[str]] has unexpected flags: [int] Netlink route object has unsupported flags. File a ticket with Cumulus Support.
          WARNING Route [[str]] has unexpected type: [int] Netlink route object has unsupported type. File a ticket with Cumulus Support.
          WARNING Route [[str]] has unexpected tos: [int] Netlink route object has unsupported TOS value. File a ticket with Cumulus Support.
          WARNING Route [[str]] has non-NULL src. Netlink route object has non-NULL source. File a ticket with Cumulus Support.
          WARNING Route [[str]] has non-zero iif: [int] Netlink route object has non-NULL interface. File a ticket with Cumulus Support.
          WARNING [int] routes exceeded [int] ecmp NHs, and were truncated. Routes exceeded per-route nexthop limit. Modify route next-hop configuration.
          WARNING [int] routes reverted to non-ECMP due to NH table Total number of nexthops did not fit in hardware table. Modify route next-hop configuration.
          WARNING [str]: route table mode [int] not supported Hardware route table is set to an incorrect mode. File a ticket with Cumulus Support.
          WARNING [str]: host table mode [int] not supported Hardware host table is set to an incorrect mode. File a ticket with Cumulus Support.
          WARNING vpn_id 0x%x for ln_type [uint] ln_key [uint] tunnel_id 0x%x invalid local_ip [str] Local IP for a tunnel is invalid. File a ticket with Cumulus Support.
          WARNING Error removing isolated port [int] from [int]. Error: [str] Removing isolated VPN port from the SDK failed. File a ticket with Cumulus Support.
          WARNING Error adding isolated port [int] to [int]. Error: [str] Adding isolated VPN port to the SDK failed. File a ticket with Cumulus Support.
          WARNING Error removing isolated port [int] from [int]. Removing isolated VPN port from the SDK failed. File a ticket with Cumulus Support.
          WARNING [str]: failed to push port settings to hal. err = [int] table_id could not be set for a port. File a ticket with Cumulus Support.
          WARNING [str] not found in grp [str], bridge [int] A port not found in the given group for the specific bridge. File a ticket with Cumulus Support.
          WARNING grp [str] not found in bridge [int] During deletion, an MDB group was not found for a specific bridge. File a ticket with Cumulus Support.
          WARNING lid 0x%x cannot be both SPAN source ACL: SPAN source and target cannot be the same. Remove the rule.
          WARNING CPU not supported as mirror port ACL: Mirror target port cannot be CPU. Remove these SPAN rules.
          WARNING table [str] [str] chain [str] L2 header field match not supported with IPv6 key ACL: Specified match unsupported. Remove the rule.
          WARNING table [str] [str] chain [str] IP TTL not supported with MAC+IPv4 key ACL: Unsupported match. Remove the rule.
          WARNING table [str] [str] chain [str] requires hardware IPv6 rule format but platform does not support MAC+IPv6 key combination ACL: Unsupported match. Remove the rule.
          WARNING table [str] [str] chain [str] requires hardware IPv4 rule but platform does not support IPv4 key with ACL: Unsupported match. Remove the rule.
          WARNING logical network type not supported ACL: Unsupported interface type in internal VXLAN rules. File a ticket with Cumulus Support.
          WARNING Detected excessive moves of mac address [str] on bridge [str], last seen on [str] and [str]. L2: Too many MAC moves seen. Check network topology for loops or intrusion.
          WARNING Memory allocation failed Memory exhausted. File a ticket with Cumulus Support.
          WARNING Can’t open configuration file [str]: [int] Failed to read configuration file in SFS. File a ticket with Cumulus Support.
          WARNING tx failed with count [int], start %p Failed to transfer packets in NIC. File a ticket with Cumulus Support.
          WARNING Detected excessive moves of mac address [str] on bridge [str], Moved MAC addresses over threshold.
          WARNING Unsupported command [str] Wrong FEC command. File a ticket with Cumulus Support.
          WARNING genl_talk returned error for ifindex [int] ([str]) Failed to read cached settings in PORT. File a ticket with Cumulus Support.
          WARNING new tag_state [str] mismatches with [str] for [str] int_vlan [uint] The newly configured tag state [str] does not match the old tag state [str] for the internal VLAN [id]. File a ticket and contact Cumulus Support.
          WARNING verbosity level for SDK module [uint] not present An incorrect verbosity level is being configured for the Mellanox SDK module. This is an internal error. File a ticket and contact Cumulus Support.
          WARNING legacy SX2 nexthop route type [uint] not handled. For the legacy Mellanox SX2 chip, next hop of route type [uint] is not handled. File a ticket and contact Cumulus Support.
          WARNING hash_table_delete of clone parent from id_ht %p failed. Failed to delete an ECMP clone id from the clone id hash table because the entry does not exist. File a ticket and contact Cumulus Support.
          WARNING [str]: no parent for [str] Missing parent interface. File a ticket with Cumulus Support.
          WARNING [str]: no parent for [str] Missing parent interface. File a ticket with Cumulus Support.
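
The message text in the rows above uses placeholders such as [str], [uint], [int], and 0x%x for values that are filled in at run time, so a literal search for the text as printed here does not match a real log line. The following is a minimal sketch of one way to turn a message template into a regular expression for searching logs. The helper name `template_to_regex` and the value patterns chosen for each placeholder are assumptions made for illustration; they are not part of Cumulus Linux.

```python
import re

# Placeholder tokens as they appear in the message text, each mapped to a
# regex for the run-time value. These value patterns are assumptions about
# how each type is printed, not a documented format.
PLACEHOLDERS = {
    "[str]": r".+?",
    "[uint]": r"\d+",
    "[int]": r"-?\d+",
    "0x%x": r"0x[0-9a-fA-F]+",
}


def template_to_regex(template):
    """Compile a message template into a regex that matches concrete messages."""
    token_re = "|".join(re.escape(token) for token in PLACEHOLDERS)
    # Splitting on a capturing group keeps the placeholder tokens in the result.
    parts = re.split(f"({token_re})", template)
    # Placeholders become value patterns; all other text is matched literally.
    pattern = "".join(PLACEHOLDERS.get(part, re.escape(part)) for part in parts)
    return re.compile(pattern)


# Hypothetical log fragments, made up to exercise the matcher.
matcher = template_to_regex("neigh_get vrid [uint] failed: [str]")
print(bool(matcher.search("neigh_get vrid 3 failed: Entry Not Found")))
print(bool(matcher.search("neigh_get vrid x failed: Entry Not Found")))
```

The first call prints True and the second prints False, because `[uint]` only accepts digits. Matching the literal parts with `re.escape` keeps characters such as brackets and colons in the message text from being interpreted as regex syntax.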

          FRRouting Log Message Reference

          The following table lists the HIGH severity ERROR log messages generated by FRR. These messages appear in /var/log/frr/frr.log.
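
The message numbers in the table that follows are easier to read in hexadecimal: 16777217 is 0x01000001 and 33554433 is 0x02000001, so in the rows shown here the top byte tracks the category (0x01 for Babel, 0x02 for BGP) and the low 24 bits count messages within that category. The sketch below decodes a number on that basis and pulls numbers out of log lines. It assumes that a log line carries the number in a tag of the form `[EC <number>]`. The sample line, the function names, and the two-entry category map are illustrative only; no other categories are inferred.

```python
import re

# Assumption: coded errors appear in a log line as "[EC <number>]".
EC_TAG = re.compile(r"\[EC (\d+)\]")

# Only the two categories whose numbers are visible in the rows shown here.
CATEGORY_BY_HIGH_BYTE = {0x01: "Babel", 0x02: "BGP"}


def decode_message_number(number):
    """Split a message number into (category name, index within category)."""
    high_byte = number >> 24      # top 8 bits
    index = number & 0xFFFFFF     # low 24 bits
    name = CATEGORY_BY_HIGH_BYTE.get(high_byte, f"unknown (0x{high_byte:02x})")
    return name, index


def find_error_codes(line):
    """Return every message number tagged in a single log line."""
    return [int(value) for value in EC_TAG.findall(line)]


# Made-up log line used only to exercise the functions.
sample = "bgpd[2143]: [EC 33554454] example message text"
for code in find_error_codes(sample):
    print(code, decode_message_number(code))
```

For the sample line, the decoded pair is ("BGP", 22), which corresponds to the row numbered 33554454 in the table.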

          Category Severity Message # Message Text Explanation Recommended Action
          Babel HIGH 16777217 BABEL Memory Errors Babel has failed to allocate memory. The system is about to run out of memory. Find the process that is causing memory shortages and remediate that process. Restart FRR.
          Babel HIGH 16777218 BABEL Packet Error Babel has detected a packet encode/decode problem. Collect the relevant log files and report the issue for troubleshooting.
          Babel HIGH 16777219 BABEL Configuration Error Babel has detected a configuration error of some sort. Ensure that the configuration is correct.
          Babel HIGH 16777220 BABEL Route Error Babel has detected a routing error and is in an inconsistent state. Gather data to report the issue for troubleshooting. Restart FRR.
          BGP HIGH 33554433 BGP attribute flag is incorrect BGP attribute flag is set to the wrong value (Optional/Transitive/Partial). Determine the source of the attribute and determine why the attribute flag has been set incorrectly.
          BGP HIGH 33554434 BGP attribute length is incorrect BGP attribute length is incorrect. Determine the source of the attribute and determine why the attribute length has been set incorrectly.
          BGP HIGH 33554435 BGP attribute origin value invalid BGP attribute origin value is invalid. Determine the source of the attribute and determine why the origin attribute has been set incorrectly.
          BGP HIGH 33554436 BGP as path is invalid BGP AS path has been malformed. Determine the source of the update and determine why the AS path has been set incorrectly.
          BGP HIGH 33554437 BGP as path first as is invalid BGP update has invalid first AS in AS path. Determine the source of the update and determine why the AS path first AS value has been set incorrectly.
          BGP HIGH 33554439 BGP PMSI tunnel attribute type is invalid BGP update has invalid type for PMSI tunnel. Determine the source of the update and determine why the PMSI tunnel attribute type has been set incorrectly.
          BGP HIGH 33554440 BGP PMSI tunnel attribute length is invalid BGP update has invalid length for PMSI tunnel. Determine the source of the update and determine why the PMSI tunnel attribute length has been set incorrectly.
          BGP HIGH 33554442 BGP peergroup operated on in error BGP operating on peer-group instead of peers included. Ensure the configuration doesn’t contain peer-groups contained within peer-groups.
          BGP HIGH 33554443 BGP failed to delete peer structure BGP was unable to delete the peer structure when the address-family was removed. Determine if all expected peers are removed and restart FRR if not. This is most likely a bug.
          BGP HIGH 33554444 BGP failed to get table chunk memory BGP unable to get chunk memory for table manager. Ensure there is adequate memory on the device to support the table requirements.
          BGP HIGH 33554445 BGP received MACIP with invalid IP addr len BGP received MACIP with invalid IP address length from Zebra. Verify the MACIP entries inserted in Zebra are correct. This is most likely a bug.
          BGP HIGH 33554446 BGP received invalid label manager message BGP received an invalid label manager message from the label manager. Label manager sent an invalid message to BGP for the wrong protocol instance. This is most likely a bug.
          BGP HIGH 33554447 BGP unable to allocate memory for JSON output BGP attempted to generate JSON output and was unable to allocate the memory required. Ensure that the device has adequate memory to support the required functions.
          BGP HIGH 33554448 BGP update had attributes too long to send BGP attempted to send an update but the attributes were too long to fit. This is most likely a bug. If the problem persists, report it for troubleshooting.
          BGP HIGH 33554449 BGP update group creation failed BGP attempted to create an update group but was unable to do so. This is most likely a bug. If the problem persists, report it for troubleshooting.
          BGP HIGH 33554450 BGP error creating update packet BGP attempted to create an update packet but was unable to do so. This is most likely a bug. If the problem persists, report it for troubleshooting.
          BGP HIGH 33554451 BGP error receiving open packet BGP received an open from a peer that was invalid. Determine the sending peer and correct its invalid open packet.
          BGP HIGH 33554452 BGP error sending to peer BGP attempted to respond to open from a peer and failed. BGP attempted to respond to an open and could not send the packet. Check the local IP address for the source.
          BGP HIGH 33554453 BGP error receiving from peer BGP received an update from a peer but the status was incorrect. This is most likely a bug. If the problem persists, report it for troubleshooting.
          BGP HIGH 33554454 BGP error receiving update packet BGP received an invalid update packet. Determine the source of the update and resolve the invalid update being sent.
          BGP HIGH 33554455 BGP error due to capability not enabled BGP attempted a function that did not have the capability enabled. Enable the capability if this functionality is desired.
          BGP HIGH 33554456 BGP error receiving notify message BGP unable to process the notification message. BGP notify received while in a stopped state. If the problem persists, report it for troubleshooting.
          BGP HIGH 33554457 BGP error receiving keepalive packet BGP unable to process a keepalive packet. BGP keepalive received while in a stopped state. If the problem persists, report it for troubleshooting.
          BGP HIGH 33554458 BGP error receiving route refresh message BGP unable to process route refresh message. BGP route refresh received while in a stopped state. If the problem persists, report it for troubleshooting.
          BGP HIGH 33554459 BGP error capability message BGP unable to process received capability. BGP capability message received while in a stopped state. If the problem persists, report it for troubleshooting.
          BGP HIGH 33554460 BGP error with nexthop update BGP unable to process nexthop update. BGP received the nexthop update but the nexthop is not reachable in this BGP instance. Report the problem for troubleshooting.
          BGP HIGH 33554461 Failure to apply label BGP attempted to apply a label but could not do so. This is most likely a bug. If the problem persists, report it for troubleshooting.
          BGP HIGH 33554462 Multipath specified is invalid BGP was started with an invalid ECMP/multipath value. Correct the ECMP/multipath value supplied when starting the BGP daemon.
          BGP HIGH 33554463 Failure to process a packet BGP attempted to process a received packet but could not do so. This is most likely a bug. If the problem persists, report it for troubleshooting.
          BGP HIGH 33554464 Failure to connect to peer BGP attempted to send open to a peer but couldn’t connect. This is most likely a bug. If the problem persists, report it for troubleshooting.
          BGP HIGH 33554465 BGP FSM issue BGP neighbor transition problem. This is most likely a bug. If the problem persists, report it for troubleshooting.
          BGP HIGH 33554466 BGP VNI creation issue BGP could not create a new VNI. This is most likely a bug. If the problem persists, report it for troubleshooting.
          BGP HIGH 33554467 BGP default instance missing BGP could not find the default instance. Define a default instance of BGP; some features require it.
          BGP HIGH 33554468 BGP remote VTEP invalid BGP remote VTEP is invalid and cannot be used. Correct the remote VTEP configuration or resolve the source of the problem.
          BGP HIGH 33554469 BGP ES route error BGP ES route is incorrect because it was learned as both a local and a remote route. Correct the configuration or otherwise resolve the issue so that the same route is not learned both locally and remotely.
          BGP HIGH 33554470 BGP EVPN route delete error BGP attempted to delete an EVPN route and failed. This is most likely a bug. If the problem persists, report it for troubleshooting.
          BGP HIGH 33554471 BGP EVPN install/uninstall error BGP attempted to install or uninstall an EVPN prefix and failed. This is most likely a bug. If the problem persists, report it for troubleshooting.
          BGP HIGH 33554472 BGP EVPN route received with invalid contents BGP received an EVPN route with invalid contents. Determine the source of the EVPN route and resolve whatever is causing the invalid content.
          BGP HIGH 33554473 BGP EVPN route create error BGP attempted to create an EVPN route and failed. This is most likely a bug. If the problem persists, report it for troubleshooting.
          BGP HIGH 33554474 BGP EVPN ES entry create error BGP attempted to create an EVPN ES entry and failed. This is most likely a bug. If the problem persists, report it for troubleshooting.
          BGP HIGH 33554475 BGP config multi-instance issue BGP configuration attempting multiple instances without enabling the feature. Correct the configuration so that BGP multiple-instance is enabled if desired.
          BGP HIGH 33554476 BGP AS configuration issue BGP configuration attempted for a different AS than is currently configured. Correct the configuration so that the correct BGP AS number is used.
          BGP HIGH 33554477 BGP EVPN AS and process name mismatch BGP configuration has AS and process name mismatch. Correct the configuration so that the BGP AS number and instance name are consistent.
          BGP HIGH 33554478 BGP Flowspec packet processing error The BGP flowspec subsystem has detected an error in the sending or receiving of a packet. Gather log files from both sides of the peering relationship and report the issue for troubleshooting.
          BGP HIGH 33554479 BGP Flowspec Installation/removal Error The BGP flowspec subsystem has detected that there was a failure for installation/removal/modification of Flowspec from the dataplane. Gather log files from the router and report the issue for troubleshooting. Restart FRR.
          EIGRP HIGH 50331649 EIGRP Packet Error EIGRP has a packet that does not correctly decode or encode. Gather log files from both sides of the neighbor relationship and report the issue for troubleshooting.
          EIGRP HIGH 50331650 EIGRP Configuration Error EIGRP has detected a configuration error. Correct the configuration issue. If it still persists, report the issue for troubleshooting.
          General HIGH 100663297 Failure to raise or lower privileges FRR attempted to raise or lower its privileges and was unable to do so. Ensure that you are running FRR as the frr user and that the user has sufficient privileges to properly access root privileges.
          General HIGH 100663298 VRF Failure on Start Upon startup, FRR failed to properly initialize and start up the VRF subsystem. Ensure that there is sufficient memory to start processes, then restart FRR.
          General HIGH 100663299 Socket Error When attempting to access a socket, a system error occurred and FRR was unable to properly complete the request. Ensure that there are sufficient system resources available and ensure that the frr user has sufficient permissions to work.
          General HIGH 100663303 System Call Error FRR has detected an error from using a vital system call and has probably already exited. Ensure permissions are correct for FRR users and groups. Additionally, check that sufficient system resources are available.
          General HIGH 100663304 VTY Subsystem Error FRR has detected a problem with the specified configuration file. Ensure the configuration file exists and has the correct permissions for operations. Additionally, ensure that all config lines are correct as well.
          General HIGH 100663305 SNMP Subsystem Error FRR has detected a problem with the SNMP library it uses. A callback from this subsystem has indicated some error. Examine the callback message and ensure SNMP is properly set up and working.
          General HIGH 100663306 Interface Subsystem Error FRR has detected a problem with interface data from the kernel as it deviates from what we would expect to happen via normal netlink messaging. Open an issue with all relevant log files and restart FRR.
          General HIGH 100663307 NameSpace Subsystem Error FRR has detected a problem with namespace data from the kernel as it deviates from what we would expect to happen via normal kernel messaging. Open an issue with all relevant log files and restart FRR.
          General HIGH 4043309068 A necessary work queue does not exist. A necessary work queue does not exist. Notify a developer.
          General HIGH 100663308 Developmental Escape Error FRR has detected an issue where new development has not properly updated all code paths. Open an issue with all relevant log files.
          General HIGH 100663309 ZMQ Subsystem Error FRR has detected an issue with the ZeroMQ subsystem and ZeroMQ is not working properly now. Open an issue with all relevant log files and restart FRR.
          General HIGH 100663310 Feature or system unavailable FRR was not compiled with support for a particular feature or it is not available on the current platform. Recompile FRR with the feature enabled or find out what platforms support the feature.
          General HIGH 4043309071 IRDP message length mismatch The length encoded in the IP TLV does not match the length of the packet received. Notify a developer.
          General HIGH 4043309073 Dataplane installation failure Installation of routes to the underlying dataplane failed. Check all configuration parameters for correctness.
          General HIGH 4043309075 Netlink backend not available FRR was not compiled with support for Netlink. Any operations that require Netlink will fail. Recompile FRR with Netlink or install a package that supports this feature.
          General HIGH 4043309076 Protocol Buffers backend not available FRR was not compiled with support for protocol buffers. Any operations that require protobuf will fail. Recompile FRR with protobuf support or install a package that supports this feature.
          General HIGH 4043309087 Cannot set receive buffer size The socket receive buffer size could not be set in the kernel. Ignore this error.
          General HIGH 4043309089 Receive buffer overrun The kernel’s buffer for a socket has been overrun, rendering the socket invalid. Zebra will restart itself. Notify a developer if this issue shows up frequently.
          General HIGH 4043309091 Received unexpected response from kernel Received unexpected response from the kernel via Netlink. Notify a developer.
          General HIGH 4043309094 String could not be parsed as IP prefix There was an attempt to parse a string as an IPv4 or IPv6 prefix, but the string could not be parsed and this operation failed. Notify a developer.
          General HIGH 268435457 WATCHFRR Connection Error WATCHFRR has detected a connectivity issue with one of the FRR daemons. Ensure that FRR is still running. If it isn’t, report the issue for troubleshooting.
          ISIS HIGH 67108865 ISIS Packet Error ISIS has detected an error with a packet from a peer. Gather log information and report the issue for troubleshooting. Restart FRR.
          ISIS HIGH 67108866 ISIS Configuration Error ISIS has detected an error within the configuration for the router. Ensure configuration is correct.
          OSPF HIGH 134217729 Failure to process a packet OSPF attempted to process a received packet but could not do so. This is most likely a bug. If the problem persists, report it for troubleshooting.
          OSPF HIGH 134217730 Failure to process Router LSA OSPF attempted to process a router LSA, but there was an advertising ID mismatch with the link ID. Check the OSPF network configuration for any configuration issue. If the problem persists, report it for troubleshooting.
          OSPF HIGH 134217731 OSPF Domain Corruption OSPF attempted to process a router LSA, but there was an advertising ID mismatch with the link ID. Check the OSPF network database for a corrupted LSA. If the problem persists, shut down the OSPF domain and report the problem for troubleshooting.
          OSPF HIGH 134217732 OSPF Initialization failure OSPF failed to initialize the OSPF default instance. Ensure there is adequate memory on the device. If the problem persists, report it for troubleshooting.
          OSPF HIGH 134217733 OSPF SR Invalid DB OSPF segment routing database is invalid. This is most likely a bug. If the problem persists, report it for troubleshooting.
          OSPF HIGH 134217734 OSPF SR hash node creation failed OSPF segment routing node creation failed. This is most likely a bug. If the problem persists, report it for troubleshooting.
          OSPF HIGH 134217735 OSPF SR Invalid lsa id OSPF segment routing invalid LSA ID. Restart the OSPF instance. If the problem persists, report it for troubleshooting.
          OSPF HIGH 134217736 OSPF SR Invalid Algorithm OSPF segment routing invalid algorithm. This is most likely a bug. If the problem persists, report it for troubleshooting.
          PIM HIGH 184549377 PIM MSDP Packet Error PIM has received a packet from a peer that does not correctly decode. Check the MSDP peer and ensure it is correctly working.
          PIM HIGH 184549378 PIM Configuration Error PIM has detected a configuration error. Ensure the configuration is correct and apply the correct configuration.
          RIP HIGH 201326593 RIP Packet Error RIP has detected a packet encode/decode issue. Gather log files from both sides and open an issue.
          Zebra HIGH 4043309057 Error reading response from label manager Zebra could not read the ZAPI header from the label manager. Wait for the error to resolve on its own. If it does not resolve, restart Zebra.
          Zebra HIGH 4043309058 Label manager could not find ZAPI client Zebra was unable to find a ZAPI client matching the given protocol and instance number. Ensure that clients that use the label manager are properly configured and running.
          Zebra HIGH 4043309059 Zebra could not relay label manager response Zebra found the client and instance to relay the label manager response or request, but was unable to do so, possibly because the connection was closed. Ensure that clients that use the label manager are properly configured and running.
          Zebra HIGH 100663300 ZAPI Error A version mismatch has been detected between Zebra and a client protocol. Two different versions of FRR have been installed and the install is not properly set up. Completely stop FRR, remove it from the system and reinstall. Typically, only developers should see this issue.
          Zebra HIGH 4043309061 Mismatch between ZAPI instance and encoded message instance While relaying a request to the external label manager, Zebra noticed that the instance number encoded in the message did not match the client instance number. Notify a developer.
          Zebra HIGH 100663301 ZAPI Error The ZAPI subsystem has detected an encoding issue between Zebra and a client protocol. Restart FRR.
          Zebra HIGH 100663302 ZAPI Error The ZAPI subsystem has detected a socket error between Zebra and a client. Restart FRR.
          Zebra HIGH 4043309064 Zebra label manager used all available labels Zebra is unable to assign additional label chunks because it has exhausted its assigned label range. Make the label range bigger and restart Zebra.
          Zebra HIGH 4043309065 Daemon mismatch when releasing label chunks Zebra noticed a mismatch between a label chunk and a protocol daemon number or instance when releasing unused label chunks. Ignore this error.
          Zebra HIGH 4043309066 Zebra did not free any label chunks Zebra’s chunk cleanup procedure ran but no label chunks were released. Ignore this error.
          Zebra HIGH 4043309067 Dataplane returned invalid status code The underlying dataplane responded to a Zebra message or other interaction with an unrecognized, unknown, or invalid status code. Notify a developer.
          Zebra HIGH 4043309069 Failed to add FEC for MPLS client A client requested a label binding for a new FEC but Zebra was unable to add the FEC to its internal table. Notify a developer.
          Zebra HIGH 4043309070 Failed to remove FEC for MPLS client Zebra was unable to find and remove an FEC in its internal table. Notify a developer.
          Zebra HIGH 4043309072 Attempted to perform nexthop update for unknown address family Zebra attempted to perform a nexthop update for unknown address family. Notify a developer.
          Zebra HIGH 4043309074 Zebra table lookup failed Zebra attempted to look up a table for a particular address family and a subsequent address family but didn’t find anything. If you entered a command to trigger this error, make sure you entered the arguments correctly. Check your configuration file for any potential errors. If these look correct, notify a developer.
          Zebra HIGH 4043309077 Table manager used all available IDs Zebra’s table manager used up all IDs available to it and can’t assign any more. Reconfigure Zebra with a larger range of table IDs.
          Zebra HIGH 4043309078 Daemon mismatch when releasing table chunks Zebra noticed a mismatch between a table ID chunk and a protocol daemon number or instance when releasing unused table chunks. Ignore this error.
          Zebra HIGH 4043309079 Zebra did not free any table chunks Zebra’s table chunk cleanup procedure ran but no table chunks were released. Ignore this error.
          Zebra HIGH 4043309080 Address family specifier unrecognized Zebra attempted to process information from somewhere that included an address family specifier but did not recognize the provided specifier. Ensure that your configuration is correct. If it is, notify a developer.
          Zebra HIGH 4043309081 Incorrect protocol for table manager client Zebra’s table manager only accepts connections from daemons managing dynamic routing protocols, but received a connection attempt from a daemon that does not meet this criterion. Notify a developer.
          Zebra HIGH 4043309082 Mismatch between message and client protocol and/or instance Zebra detected a mismatch between a client’s protocol and/or instance numbers versus those stored in a message transiting its socket. Notify a developer.
          Zebra HIGH 4043309083 Label manager unable to assign label chunk Zebra’s label manager was unable to assign a label chunk to a client. Ensure that Zebra has a sufficient label range available and that there is not a range collision.
          Zebra HIGH 4043309084 Label request from unidentified client Zebra’s label manager received a label request from an unidentified client. Notify a developer.
          Zebra HIGH 4043309085 Table manager unable to assign table chunk Zebra’s table manager was unable to assign a table chunk to a client. Ensure that Zebra has sufficient table ID range available and that there is not a range collision.
          Zebra HIGH 4043309086 Table request from unidentified client Zebra’s table manager received a table request from an unidentified client. Notify a developer.
          Zebra HIGH 4043309088 Unknown Netlink message type Zebra received a Netlink message with an unrecognized type field. Verify that you are running the latest version of FRR to ensure kernel compatibility. If the problem persists, notify a developer.
          Zebra HIGH 4043309090 Netlink message length mismatch Zebra received a Netlink message with incorrect length fields. Notify a developer.
          Zebra HIGH 4043309092 Bad sequence number in Netlink message Zebra received a Netlink message with a bad sequence number. Notify a developer.
          Zebra HIGH 4043309093 Multipath number was out of valid range The multipath number specified to Zebra must be in the appropriate range. Provide a multipath number that is within its accepted range.
          Zebra HIGH 4043309095 Failed to add MAC address to interface Zebra attempted to assign a MAC address to a VXLAN interface but failed. Notify a developer.
          Zebra HIGH 4043309096 Failed to delete VNI Zebra attempted to delete a VNI entry and failed. Notify a developer.
          Zebra HIGH 4043309097 Adding remote VTEP failed Zebra attempted to add a remote VTEP and failed. Notify a developer.
          Zebra HIGH 4043309098 Adding VNI failed Zebra attempted to add a VNI hash to an interface and failed. Notify a developer.
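
The numeric codes in the table above appear to follow a pattern: the top byte of each code identifies a group of related errors, and the low 24 bits count entries within that group. For example, 33554440 is 0x02000008. The following Python sketch splits a code along those lines. The pattern is inferred from the values listed here, not from a documented contract. The function names and the `OBSERVED_PREFIXES` mapping are hypothetical and cover only the rows in this table. The Category column does not always line up with the prefix: for example, 100663300 appears under Zebra but shares the 0x06 prefix with most General rows.

```python
def split_error_code(code: int) -> tuple[int, int]:
    """Split a code from the table into (top byte, low 24 bits)."""
    return code >> 24, code & 0xFFFFFF


# Top-byte values observed in the rows above, labeled with the category
# that most rows sharing that prefix fall under. This is an assumption
# drawn from the table, not an official mapping.
OBSERVED_PREFIXES = {
    0x02: "BGP",
    0x03: "EIGRP",
    0x04: "ISIS",
    0x06: "General",
    0x08: "OSPF",
    0x0B: "PIM",
    0x0C: "RIP",
    0x10: "WATCHFRR",
    0xF1: "Zebra",
}


def describe(code: int) -> str:
    """Return a short label for a code, using the observed prefixes."""
    prefix, index = split_error_code(code)
    name = OBSERVED_PREFIXES.get(prefix, "unknown")
    return f"{name}: prefix 0x{prefix:02X}, index {index}"


if __name__ == "__main__":
    for code in (33554440, 134217736, 4043309057):
        print(code, "->", describe(code))
```

Splitting a code this way can help when you search logs for all errors in one group, because you can match on the prefix rather than listing every code.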

          Try It Pre-built Demos

          This documentation includes pre-built Try It demos for certain Cumulus Linux features. The Try It demos run a simulation in NVIDIA Air, a cloud-hosted platform that works exactly like a real-world production deployment. All the Try It demos use the NVIDIA Cumulus Linux reference topology.

          The following Try It demos are available:

          Access a Try It Demo

          To access a demo, click the Try It tab in a Configuration Example section of the documentation. Acknowledge the captcha and select the Launch Simulation button:

          NVIDIA Air starts building the simulation and boots the nodes:

          The simulation can take a few minutes to build and might display a grey screen before loading.

          Run Commands

          When the simulation is ready, you can log into the leaf and spine switches. The switches are pre-configured with the configuration commands shown in the documentation. You can run any Cumulus Linux commands to learn more about the feature and configure additional settings.

          Launch in AIR

          If you want to save the simulation or extend the run time, click LAUNCH IN AIR to access the network simulation platform. From this platform, you can run additional pre-built demos and even build your own simulations. Refer to the NVIDIA Air User Guide.

          Reference Topology

          The Cumulus Linux documentation uses this reference topology for configuration examples.

          Cumulus Linux in a Virtual Environment

          Cumulus Linux in a virtual environment enables you to become familiar with NVIDIA networking technology, learn and test Cumulus Linux in your own environment, and create a digital twin of your IT infrastructure to validate configurations, features, and automation code.

          Virtual Environments

          NVIDIA provides these virtual environments:

          Cumulus Linux in a virtual environment contains the same Cumulus Linux operating system as NVIDIA Ethernet switches and contains the same software features. You have the full data plane functionality through the Linux kernel, as well as layer 2 VLANs and both VXLAN bridging and VXLAN routing capabilities.

          Unsupported Features in a Virtual Environment

          Due to hardware specific implementations, virtual environments do not support certain Cumulus Linux features.

          Feature Supported in a Virtual Environment
          Access Control Lists No
          In-Service System Upgrade (ISSU) No
          Precision Time Protocol (PTP) No
          Port Security No
          SPAN and ERSPAN No
          Temperature and sensor outputs Artificial temperature and sensor outputs for simulation.
          Packet marking and remarking No
          QoS buffer management and buffer monitoring No
          QoS shaping No
          PFC watchdog No
          What Just Happened (WJH) No
          Network Address Translation (NAT) No
          Adaptive Routing No
          Storm control No

          NVUE Command Reference

          The NVUE command reference for Cumulus Linux provides descriptions and examples for the following commands: