Configure Threshold-Based Event Notifications

NetQ supports TCA events, which trigger when a monitored value crosses a user-defined threshold. These events help you detect and prevent network failures for selected ACL resources, digital optics, forwarding resources, interface errors and statistics, link flaps, resource utilization, and sensor events. You can find a complete list in the TCA Event Messages Reference.

A notification configuration must contain one rule. Each rule must contain a scope and a threshold. Optionally, you can specify an associated channel. If you want to deliver events to one or more notification channels (email, syslog, Slack, or PagerDuty), create them by following the instructions in Create a Channel, and then return here to define your rule.

If a rule is not associated with a channel, the event information is only reachable from the database.

Define a Scope

You use a scope to filter the events generated by a given rule. You set the scope values on a per TCA rule basis. You can filter all rules on the hostname. You can also filter some rules by other parameters.

Select Filter Parameters

For each event type, you can filter rules based on the following filter parameters.

ACL resources events:

Event ID Scope Parameters
TCA_TCAM_IN_ACL_V4_FILTER_UPPER Hostname
TCA_TCAM_EG_ACL_V4_FILTER_UPPER Hostname
TCA_TCAM_IN_ACL_V4_MANGLE_UPPER Hostname
TCA_TCAM_EG_ACL_V4_MANGLE_UPPER Hostname
TCA_TCAM_IN_ACL_V6_FILTER_UPPER Hostname
TCA_TCAM_EG_ACL_V6_FILTER_UPPER Hostname
TCA_TCAM_IN_ACL_V6_MANGLE_UPPER Hostname
TCA_TCAM_EG_ACL_V6_MANGLE_UPPER Hostname
TCA_TCAM_IN_ACL_8021x_FILTER_UPPER Hostname
TCA_TCAM_ACL_L4_PORT_CHECKERS_UPPER Hostname
TCA_TCAM_ACL_REGIONS_UPPER Hostname
TCA_TCAM_IN_ACL_MIRROR_UPPER Hostname
TCA_TCAM_ACL_18B_RULES_UPPER Hostname
TCA_TCAM_ACL_32B_RULES_UPPER Hostname
TCA_TCAM_ACL_54B_RULES_UPPER Hostname
TCA_TCAM_IN_PBR_V4_FILTER_UPPER Hostname
TCA_TCAM_IN_PBR_V6_FILTER_UPPER Hostname

Digital optics events:

Event ID Scope Parameters
TCA_DOM_RX_POWER_ALARM_UPPER Hostname, Interface
TCA_DOM_RX_POWER_ALARM_LOWER Hostname, Interface
TCA_DOM_RX_POWER_WARNING_UPPER Hostname, Interface
TCA_DOM_RX_POWER_WARNING_LOWER Hostname, Interface
TCA_DOM_BIAS_CURRENT_ALARM_UPPER Hostname, Interface
TCA_DOM_BIAS_CURRENT_ALARM_LOWER Hostname, Interface
TCA_DOM_BIAS_CURRENT_WARNING_UPPER Hostname, Interface
TCA_DOM_BIAS_CURRENT_WARNING_LOWER Hostname, Interface
TCA_DOM_OUTPUT_POWER_ALARM_UPPER Hostname, Interface
TCA_DOM_OUTPUT_POWER_ALARM_LOWER Hostname, Interface
TCA_DOM_OUTPUT_POWER_WARNING_UPPER Hostname, Interface
TCA_DOM_OUTPUT_POWER_WARNING_LOWER Hostname, Interface
TCA_DOM_MODULE_TEMPERATURE_ALARM_UPPER Hostname, Interface
TCA_DOM_MODULE_TEMPERATURE_ALARM_LOWER Hostname, Interface
TCA_DOM_MODULE_TEMPERATURE_WARNING_UPPER Hostname, Interface
TCA_DOM_MODULE_TEMPERATURE_WARNING_LOWER Hostname, Interface
TCA_DOM_MODULE_VOLTAGE_ALARM_UPPER Hostname, Interface
TCA_DOM_MODULE_VOLTAGE_ALARM_LOWER Hostname, Interface
TCA_DOM_MODULE_VOLTAGE_WARNING_UPPER Hostname, Interface
TCA_DOM_MODULE_VOLTAGE_WARNING_LOWER Hostname, Interface

Forwarding resources events:

Event ID Scope Parameters
TCA_TCAM_TOTAL_ROUTE_ENTRIES_UPPER Hostname
TCA_TCAM_TOTAL_MCAST_ROUTES_UPPER Hostname
TCA_TCAM_MAC_ENTRIES_UPPER Hostname
TCA_TCAM_ECMP_NEXTHOPS_UPPER Hostname
TCA_TCAM_IPV4_ROUTE_UPPER Hostname
TCA_TCAM_IPV4_HOST_UPPER Hostname
TCA_TCAM_IPV6_ROUTE_UPPER Hostname
TCA_TCAM_IPV6_HOST_UPPER Hostname

Interface errors events:

Event ID Scope Parameters
TCA_HW_IF_OVERSIZE_ERRORS Hostname, Interface
TCA_HW_IF_UNDERSIZE_ERRORS Hostname, Interface
TCA_HW_IF_ALIGNMENT_ERRORS Hostname, Interface
TCA_HW_IF_JABBER_ERRORS Hostname, Interface
TCA_HW_IF_SYMBOL_ERRORS Hostname, Interface

Interface statistics events:

Event ID Scope Parameters
TCA_RXBROADCAST_UPPER Hostname, Interface
TCA_RXBYTES_UPPER Hostname, Interface
TCA_RXMULTICAST_UPPER Hostname, Interface
TCA_TXBROADCAST_UPPER Hostname, Interface
TCA_TXBYTES_UPPER Hostname, Interface
TCA_TXMULTICAST_UPPER Hostname, Interface

Link flaps events:

Event ID Scope Parameters
TCA_LINK Hostname, Interface

Resource utilization events:

Event ID Scope Parameters
TCA_CPU_UTILIZATION_UPPER Hostname
TCA_DISK_UTILIZATION_UPPER Hostname
TCA_MEMORY_UTILIZATION_UPPER Hostname

RoCE events:

Event ID Scope Parameters
Tx CNP Unicast No Buffer Discard Hostname, Interface
Rx RoCE PFC Pause Duration Hostname
Rx RoCE PG Usage Cells Hostname, Interface
Tx RoCE TC Usage Cells Hostname, Interface
Rx RoCE No Buffer Discard Hostname, Interface
Tx RoCE PFC Pause Duration Hostname, Interface
Tx CNP Buffer Usage Cells Hostname, Interface
Tx ECN Marked Packets Hostname, Interface
Tx RoCE PFC Pause Packets Hostname, Interface
Rx CNP No Buffer Discard Hostname, Interface
Rx CNP PG Usage Cells Hostname, Interface
Tx CNP TC Usage Cells Hostname, Interface
Rx RoCE Buffer Usage Cells Hostname, Interface
Tx RoCE Unicast No Buffer Discard Hostname, Interface
Rx CNP Buffer Usage Cells Hostname, Interface
Rx RoCE PFC Pause Packets Hostname, Interface
Tx RoCE Buffer Usage Cells Hostname, Interface

Sensor events:

Event ID Scope Parameters
TCA_SENSOR_FAN_UPPER Hostname, Sensor Name
TCA_SENSOR_POWER_UPPER Hostname, Sensor Name
TCA_SENSOR_TEMPERATURE_UPPER Hostname, Sensor Name
TCA_SENSOR_VOLTAGE_UPPER Hostname, Sensor Name

What Just Happened (WJH) events:

Event ID Scope Parameters
TCA_WJH_DROP_AGG_UPPER Hostname, Reason
TCA_WJH_ACL_DROP_AGG_UPPER Hostname, Reason, Ingress port
TCA_WJH_BUFFER_DROP_AGG_UPPER Hostname, Reason
TCA_WJH_SYMBOL_ERROR_UPPER Hostname, Port down reason
TCA_WJH_CRC_ERROR_UPPER Hostname, Port down reason

Specify the Scope

Rules require a scope. The scope can be the entire complement of monitored devices or a subset. You define scopes as regular expressions, and they appear as regular expressions in NetQ. Each event has a set of attributes you can use to apply the rule to a subset of all devices. The definition and display are slightly different between the NetQ UI and the NetQ CLI, but the results are the same.

In the NetQ UI, you define the scope in the Choose Attributes step when creating a TCA event rule. You can choose to apply the rule to all devices or narrow the scope using attributes. If you choose to narrow the scope but do not enter any values for the available attributes, the rule applies to all devices and attributes.

Scopes appear in TCA rule cards using the following format: Attribute, Operation, Value.

In this example, three attributes are available. For one or more of these attributes, select the operation (equals or starts with) and enter a value. For drop reasons, click in the value field to open a list of reasons, and select one from the list.

Note that you should leave the drop type attribute blank.

Create rule to show events from a … Attribute Operation Value
Single device hostname Equals <hostname> such as spine01
Single interface ifname Equals <interface-name> such as swp6
Single sensor s_name Equals <sensor-name> such as fan2
Single WJH drop reason reason or port_down_reason Equals <drop-reason> such as WRED
Single WJH ingress port ingress_port Equals <port-name> such as 47
Set of devices hostname Starts with <partial-hostname> such as leaf
Set of interfaces ifname Starts with <partial-interface-name> such as swp or eth
Set of sensors s_name Starts with <partial-sensor-name> such as fan, temp, or psu

Refer to WJH Event Messages Reference for WJH drop types and reasons. Leaving an attribute value blank defaults to all: all hostnames, interfaces, sensors, forwarding resources, ACL resources, and so forth.

Each attribute is displayed on the rule card as a regular expression equivalent to your choices above:

  • Equals is displayed as an equals sign (=)
  • Starts with is displayed as a caret (^)
  • Blank (all) is displayed as an asterisk (*)

In the NetQ CLI, scopes are defined with regular expressions. When more than one scoping parameter is available, separate the parameters with a comma (without spaces) and define all of them in order. When you use an asterisk (*) alone, enter it inside either single or double quotes. Single quotes are used here.

The single hostname scope parameter is used by the ACL resources, forwarding resources, and resource utilization events.

Scope Value Example Result
<hostname> leaf01 Deliver events for the specified device
<partial-hostname>* leaf* Deliver events for devices with hostnames starting with specified text (leaf)

The hostname and interface scope parameters are used by the digital optics, interface errors, interface statistics, and link flaps events.

Scope Value Example Result
<hostname>,<interface> leaf01,swp9 Deliver events for the specified interface (swp9) on the specified device (leaf01)
<hostname>,'*' leaf01,'*' Deliver events for all interfaces on the specified device (leaf01)
'*',<interface> '*',swp9 Deliver events for the specified interface (swp9) on all devices
<partial-hostname>*,<interface> leaf*,swp9 Deliver events for the specified interface (swp9) on all devices with hostnames starting with the specified text (leaf)
<hostname>,<partial-interface>* leaf01,swp* Deliver events for all interfaces with names starting with the specified text (swp) on the specified device (leaf01)

The hostname and sensor name scope parameters are used by the sensor events.

Scope Value Example Result
<hostname>,<sensorname> leaf01,fan1 Deliver events for the specified sensor (fan1) on the specified device (leaf01)
'*',<sensorname> '*',fan1 Deliver events for the specified sensor (fan1) for all devices
<hostname>,'*' leaf01,'*' Deliver events for all sensors on the specified device (leaf01)
<partial-hostname>*,<sensorname> leaf*,fan1 Deliver events for the specified sensor (fan1) on all devices with hostnames starting with the specified text (leaf)
<hostname>,<partial-sensorname>* leaf01,fan* Deliver events for all sensors with names starting with the specified text (fan) on the specified device (leaf01)

The hostname, reason/port down reason, ingress port, and drop type scope parameters are used by the What Just Happened events.

Scope Value Example Result
<hostname>,<reason>,<ingress_port>,<drop_type> leaf01,ingress-port-acl,'*','*' Deliver WJH events for all ports on the specified device (leaf01) with the specified reason triggered (ingress-port-acl exceeded the threshold)
'*',<reason>,'*' '*',tail-drop,'*' Deliver WJH events for the specified reason (tail-drop) for all devices
<partial-hostname>*,<port_down_reason>,<drop_type> leaf*,calibration-failure,'*' Deliver WJH events for the specified reason (calibration-failure) on all devices with hostnames starting with the specified text (leaf)
<hostname>,<partial-reason>*,<drop_type> leaf01,blackhole*,'*' Deliver WJH events for reasons starting with the specified text (blackhole [route]) on the specified device (leaf01)
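
The relationship between the UI choices (Equals, Starts with, or blank) and the CLI scope strings in the tables above can be sketched in Python. This is an illustrative helper only, not part of NetQ: the function names and the "equals" and "starts_with" labels are assumptions made for this example.

```python
from typing import Optional, Sequence, Tuple

# One (operation, value) choice per scope parameter, listed in the order
# the event expects (for example hostname first, then interface).
# A value of None or "" means "all".
Choice = Tuple[str, Optional[str]]


def scope_token(operation: str, value: Optional[str]) -> str:
    """Turn one attribute choice into the text used in a CLI scope."""
    if not value:
        # A lone asterisk must be quoted when typed at the shell.
        return "'*'"
    if operation == "equals":
        return value
    if operation == "starts_with":
        # A partial name is followed by an asterisk.
        return value + "*"
    raise ValueError(f"unknown operation: {operation!r}")


def build_scope(choices: Sequence[Choice]) -> str:
    """Join the tokens with commas and no spaces, keeping their order."""
    return ",".join(scope_token(op, val) for op, val in choices)


if __name__ == "__main__":
    print(build_scope([("equals", "leaf01"), ("equals", "swp9")]))
    print(build_scope([("starts_with", "leaf"), ("equals", "swp9")]))
    print(build_scope([("equals", "leaf01"), ("equals", None)]))
```

The three calls correspond to the leaf01,swp9, leaf*,swp9, and leaf01,'*' rows of the hostname and interface table.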

Create a TCA Rule

Now that you know which events are supported and how to set the scope, you can create a basic rule to deliver one of the TCA events to a notification channel. This can be done using either the NetQ UI or the NetQ CLI.

To create a TCA rule:

  1. Click to open the Main Menu.

  2. Click Threshold Crossing Rules under Notifications.

  3. Select the event type for the rule you want to create. Note that the total count of rules for each event type is also shown.

  4. Click Create a Rule or (Add rule) to add a rule.

    The Create TCA Rule dialog opens. Four steps create the rule.

    You can move forward and backward until you are satisfied with your rule definition.

  5. On the Enter Details step, enter a name for your rule and assign a severity. Verify the event type.

    The rule name has a maximum of 20 characters (including spaces).

  6. Click Next.

  7. On the Choose Attribute step, select the attribute to measure against.

    The attributes presented depend on the event type chosen in the Enter Details step. This example shows the attributes available when Resource Utilization was selected.

  8. Click Next.

  9. On the Set Threshold step, enter a threshold value.

    For Digital Optics, you can choose to use the thresholds defined by the optics vendor (default) or specify your own.

  10. Define the scope of the rule.

    • If you want to restrict the rule based on a particular parameter, enter values for one or more of the available attributes. For What Just Happened rules, select a reason from the available list.

    • If you want the rule to apply to all devices, click the scope toggle.

  11. Click Next.

  12. Optionally, select a notification channel where you want the events to be sent.

    Only previously created channels are available for selection. If no channel is available or selected, the notifications can only be retrieved from the database. You can add a channel at a later time and then add it to the rule. Refer to Create a Channel and Modify TCA Rules.

  13. Click Finish.

This example shows two interface statistics rules. The rule on the left triggers an informational event when the total received bytes exceeds the upper threshold of 5 M on any switch. The rule on the right triggers an alarm event when the total received broadcast bytes on any switch exceeds 560 K, indicating a broadcast storm. Note that the cards indicate both rules are currently Active.

The simplest configuration you can create is one that sends a TCA event generated by all devices and all interfaces to a single notification application. Use the netq add tca command to configure the event. Its syntax is:

netq add tca [event_id <text-event-id-anchor>] [scope <text-scope-anchor>] [tca_id <text-tca-id-anchor>] [severity info | severity critical] [is_active true | is_active false] [suppress_until <text-suppress-ts>] [threshold_type user_set | threshold_type vendor_set] [threshold <text-threshold-value>] [channel <text-channel-name-anchor> | channel drop <text-drop-channel-name>]

Note that the event ID is case sensitive and must be in all uppercase.

For example, this rule tells NetQ to deliver an event notification to the tca_slack_ifstats pre-configured Slack channel when the CPU utilization exceeds 95% of its capacity on any monitored switch:

cumulus@switch:~$ netq add tca event_id TCA_CPU_UTILIZATION_UPPER scope '*' channel tca_slack_ifstats threshold 95

This rule tells NetQ to deliver an event notification to the tca_pd_ifstats PagerDuty channel when the number of transmit bytes per second (Bps) on the leaf12 switch exceeds 20,000 Bps on any interface:

cumulus@switch:~$ netq add tca event_id TCA_TXBYTES_UPPER scope leaf12,'*' channel tca_pd_ifstats threshold 20000

This rule tells NetQ to deliver an event notification to the syslog-netq syslog channel when the temperature on sensor temp1 on the leaf12 switch exceeds 32 degrees Celsius:

cumulus@switch:~$ netq add tca event_id TCA_SENSOR_TEMPERATURE_UPPER scope leaf12,temp1 channel syslog-netq threshold 32

This rule tells NetQ to deliver an event notification to the tca-slack channel when the total number of ACL drops on the leaf04 switch exceeds 20,000 for any reason, ingress port, or drop type:

cumulus@switch:~$ netq add tca event_id TCA_WJH_ACL_DROP_AGG_UPPER scope leaf04,'*','*','*' channel tca-slack threshold 20000

Set the Severity of a Threshold-based Event

In addition to defining a scope for a TCA rule, you can also set a severity of either info or critical. To add a severity to a rule, use the severity option.

For example, if you want to add a critical severity to the CPU utilization rule you created earlier:

cumulus@switch:~$ netq add tca event_id TCA_CPU_UTILIZATION_UPPER scope '*' severity critical channel tca_slack_resources threshold 95

Or, if an event is important but not critical, set the severity to info:

cumulus@switch:~$ netq add tca event_id TCA_TXBYTES_UPPER scope leaf12,'*' severity info channel tca_pd_ifstats threshold 20000

Set the Threshold for Digital Optics Events

Digital optics have the additional option of applying user- or vendor-defined thresholds, using the threshold_type and threshold options.

This example shows how to send an alarm event on channel ch1 when the module voltage exceeds the vendor-defined upper threshold for interface swp31 on the mlx-2700-04 switch.

cumulus@switch:~$ netq add tca event_id TCA_DOM_MODULE_VOLTAGE_ALARM_UPPER scope 'mlx-2700-04,swp31' severity critical is_active true threshold_type vendor_set channel ch1
Successfully added/updated tca

This example shows how to send an alarm event on channel ch1 when the module voltage exceeds the user-defined upper threshold of 3V for interface swp31 on the mlx-2700-04 switch.

cumulus@switch:~$ netq add tca event_id TCA_DOM_MODULE_VOLTAGE_ALARM_UPPER scope 'mlx-2700-04,swp31' severity critical is_active true threshold_type user_set threshold 3 channel ch1
Successfully added/updated tca
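
If you script rule creation, the option rules described so far can be checked before the command is sent. The following Python sketch assembles the argument list for netq add tca using only options that appear in the syntax above. The function name, the option ordering (which follows the syntax line), and the requirement that user_set be accompanied by a threshold are assumptions made for this example, not documented NetQ behavior.

```python
import shlex
from typing import List, Optional

SEVERITIES = {"info", "critical"}
THRESHOLD_TYPES = {"user_set", "vendor_set"}


def netq_add_tca_args(
    event_id: str,
    scope: str,
    channel: Optional[str] = None,
    threshold: Optional[int] = None,
    severity: Optional[str] = None,
    threshold_type: Optional[str] = None,
) -> List[str]:
    """Build the argument list for a 'netq add tca' call (hypothetical helper).

    Pass the scope unquoted (for example "*" or "leaf12,*");
    shlex.join adds shell quoting when the list is rendered as text.
    """
    if severity is not None and severity not in SEVERITIES:
        raise ValueError(f"severity must be one of {sorted(SEVERITIES)}")
    if threshold_type is not None and threshold_type not in THRESHOLD_TYPES:
        raise ValueError(f"threshold_type must be one of {sorted(THRESHOLD_TYPES)}")
    # Assumption: a user-defined threshold type needs an explicit value.
    if threshold_type == "user_set" and threshold is None:
        raise ValueError("user_set requires a threshold value")

    args = ["netq", "add", "tca", "event_id", event_id, "scope", scope]
    if severity is not None:
        args += ["severity", severity]
    if threshold_type is not None:
        args += ["threshold_type", threshold_type]
    if threshold is not None:
        args += ["threshold", str(threshold)]
    if channel is not None:
        args += ["channel", channel]
    return args


if __name__ == "__main__":
    args = netq_add_tca_args(
        "TCA_DOM_MODULE_VOLTAGE_ALARM_UPPER",
        "mlx-2700-04,swp31",
        channel="ch1",
        threshold=3,
        severity="critical",
        threshold_type="user_set",
    )
    print(shlex.join(args))
```

The list form can be handed to subprocess.run without going through a shell, which avoids having to quote the asterisk by hand.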

Create Multiple Rules for a TCA Event

You are likely to want more than one rule around a particular event. For example, you might want to:

  • Monitor the same event but for a different interface, sensor, or device
  • Send the event notification to more than one channel
  • Change the threshold for a particular device that you are troubleshooting

And so forth.

In the NetQ UI you create multiple rules by adding multiple rule cards. Refer to Create a TCA Rule.

In the NetQ CLI, you can also add multiple rules. This example shows the creation of three additional rules for the max temperature sensor.

netq add tca event_id TCA_SENSOR_TEMPERATURE_UPPER scope leaf*,temp1 channel syslog-netq threshold 32

netq add tca event_id TCA_SENSOR_TEMPERATURE_UPPER scope '*',temp1 channel tca_sensors,tca_pd_sensors threshold 32

netq add tca event_id TCA_SENSOR_TEMPERATURE_UPPER scope leaf03,temp1 channel syslog-netq threshold 29

Now you have four rules created (the original one, plus these three new ones) all based on the TCA_SENSOR_TEMPERATURE_UPPER event. To identify the various rules, NetQ automatically generates a TCA name for each rule. As you create each rule, NetQ adds an _# to the event name. The TCA Name for the first rule created is then TCA_SENSOR_TEMPERATURE_UPPER_1, the second rule created for this event is TCA_SENSOR_TEMPERATURE_UPPER_2, and so forth.
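
The naming pattern described above can be sketched as a per-event counter. This Python example only illustrates the _# suffix; the class name is hypothetical, and how NetQ numbers rules after one has been deleted is not covered here.

```python
from collections import defaultdict
from typing import DefaultDict


class TcaNamer:
    """Illustrative counter that mimics the <event_id>_<n> naming pattern."""

    def __init__(self) -> None:
        # Number of rules created so far for each event ID.
        self._counts: DefaultDict[str, int] = defaultdict(int)

    def next_name(self, event_id: str) -> str:
        self._counts[event_id] += 1
        return f"{event_id}_{self._counts[event_id]}"


if __name__ == "__main__":
    namer = TcaNamer()
    for _ in range(4):
        print(namer.next_name("TCA_SENSOR_TEMPERATURE_UPPER"))
```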

Manage Threshold-based Event Notifications

After you create some rules, you might want to modify them, view a list of them, disable a rule, delete a rule, and so forth.

View TCA Rules

You can view all the threshold-crossing event rules you have created in the NetQ UI or the NetQ CLI.

  1. Click to open the Main Menu.

  2. Select Threshold Crossing Rules under Notifications.

    A card appears for every rule.

When you have at least one rule created, you can use the filters that appear above the rule cards to find the rules of interest. Filter by status, severity, channel, and/or events. When a filter is applied, a badge indicating the number of items in the filter is shown on the filter dropdown.

To view TCA rules, run:

netq show tca [tca_id <text-tca-id-anchor>] [json]

This example displays all TCA rules:

cumulus@switch:~$ netq show tca
Matching config_tca records:
TCA Name                     Event Name           Scope                      Severity Channel/s          Active Threshold          Unit     Threshold Type Suppress Until
---------------------------- -------------------- -------------------------- -------- ------------------ ------ ------------------ -------- -------------- ----------------------------
TCA_CPU_UTILIZATION_UPPER_1  TCA_CPU_UTILIZATION_ {"hostname":"leaf01"}      info     pd-netq-events,slk True   87                 %        user_set       Fri Oct  9 15:39:35 2020
                             UPPER                                                    -netq-events
TCA_CPU_UTILIZATION_UPPER_2  TCA_CPU_UTILIZATION_ {"hostname":"*"}           critical slk-netq-events    True   93                 %        user_set       Fri Oct  9 15:39:56 2020
                             UPPER
TCA_DOM_BIAS_CURRENT_ALARM_U TCA_DOM_BIAS_CURRENT {"hostname":"leaf*","ifnam critical slk-netq-events    True   0                  mA       vendor_set     Fri Oct  9 16:02:37 2020
PPER_1                       _ALARM_UPPER         e":"*"}
TCA_DOM_RX_POWER_ALARM_UPPER TCA_DOM_RX_POWER_ALA {"hostname":"*","ifname":" info     slk-netq-events    True   0                  mW       vendor_set     Fri Oct  9 15:25:26 2020
_1                           RM_UPPER             *"}
TCA_SENSOR_TEMPERATURE_UPPER TCA_SENSOR_TEMPERATU {"hostname":"leaf","s_name critical slk-netq-events    True   32                 degreeC  user_set       Fri Oct  9 15:40:18 2020
_1                           RE_UPPER             ":"temp1"}
TCA_TCAM_IPV4_ROUTE_UPPER_1  TCA_TCAM_IPV4_ROUTE_ {"hostname":"*"}           critical pd-netq-events     True   20000              %        user_set       Fri Oct  9 16:13:39 2020
                             UPPER

This example displays a specific TCA rule:

cumulus@switch:~$ netq show tca tca_id TCA_TXMULTICAST_UPPER_1
Matching config_tca records:
TCA Name                     Event Name           Scope                      Severity         Channel/s          Active Threshold          Suppress Until
---------------------------- -------------------- -------------------------- ---------------- ------------------ ------ ------------------ ----------------------------
TCA_TXMULTICAST_UPPER_1      TCA_TXMULTICAST_UPPE {"ifname":"swp3","hostname info             tca-tx-bytes-slack True   0                  Sun Dec  8 16:40:14 2269
                             R                    ":"leaf01"}

Change the Threshold on a TCA Rule

After receiving notifications based on a rule, you might find that you want to increase or decrease the threshold value to limit or increase the events you receive. You can accomplish this with the NetQ UI or the NetQ CLI.

To modify the threshold:

  1. Click to open the Main Menu.

  2. Click Threshold Crossing Rules under Notifications.

  3. Locate the rule you want to modify and hover over the card.

  4. Click the edit icon.

  5. Enter a new threshold value.

  6. Click Update Rule.

To modify the threshold, run:

netq add tca tca_id <text-tca-id-anchor> threshold <text-threshold-value>

This example changes the threshold for the rule TCA_CPU_UTILIZATION_UPPER_1 to a value of 96 percent. This overwrites the existing threshold value.

cumulus@switch:~$ netq add tca tca_id TCA_CPU_UTILIZATION_UPPER_1 threshold 96

Change the Scope of a TCA Rule

After receiving notifications based on a rule, you might find that you want to narrow or widen the scope value to limit or increase the events you receive. You can accomplish this with the NetQ UI or the NetQ CLI.

To modify the scope:

  1. Click to open the Main Menu.

  2. Click Threshold Crossing Rules under Notifications.

  3. Locate the rule you want to modify and hover over the card.

  4. Click the edit icon.

  5. Change the scope, applying the rule to all devices or broadening or narrowing the scope. Refer to Specify the Scope for details.

    In this example, the scope is across the entire network. Toggle the scope and select one or more hosts on which to apply this rule.

  6. Click Update Rule.

To modify the scope, run:

netq add tca event_id <text-event-id-anchor> scope <text-scope-anchor> threshold <text-threshold-value>

This example changes the scope for the rule TCA_CPU_UTILIZATION_UPPER to apply only to switches whose hostnames begin with leaf. You must also provide a threshold value. This example uses a value of 95 percent. Note that this overwrites the existing scope and threshold values.

cumulus@switch:~$ netq add tca event_id TCA_CPU_UTILIZATION_UPPER scope hostname^leaf threshold 95
Successfully added/updated tca

cumulus@switch:~$ netq show tca

Matching config_tca records:
TCA Name                     Event Name           Scope                      Severity         Channel/s          Active Threshold          Suppress Until
---------------------------- -------------------- -------------------------- ---------------- ------------------ ------ ------------------ ----------------------------
TCA_CPU_UTILIZATION_UPPER_1  TCA_CPU_UTILIZATION_ {"hostname":"*"}           critical         onprem-email       True   93                 Mon Aug 31 20:59:57 2020
                             UPPER
TCA_CPU_UTILIZATION_UPPER_2  TCA_CPU_UTILIZATION_ {"hostname":"hostname^leaf info                                True   95                 Tue Sep  1 18:47:24 2020
                             UPPER                "}

Change, Add, or Remove the Channels on a TCA Rule

You can change the channels associated with a TCA rule, add more channels to receive the same events, or remove channels that you no longer want to receive the associated events.

  1. Click to open the Main Menu.

  2. Click Threshold Crossing Rules under Notifications.

  3. Locate the rule you want to modify and hover over the card.

  4. Click the edit icon.

  5. Click Channels.

  6. Select one or more channels.

    Click a channel to select it. Click again to unselect a channel.

  7. Click Update Rule.

To change a channel association, run:

netq add tca tca_id <text-tca-id-anchor> channel <text-channel-name-anchor>

This overwrites the existing channel association.

This example changes the channel for the TCA_DISK_UTILIZATION_UPPER_1 rule to the PagerDuty channel pd-netq-events.

cumulus@switch:~$ netq add tca tca_id TCA_DISK_UTILIZATION_UPPER_1 channel pd-netq-events
Successfully added/updated tca TCA_DISK_UTILIZATION_UPPER_1

To remove a channel association (stop sending events to a particular channel), run:

netq add tca tca_id <text-tca-id-anchor> channel drop <text-drop-channel-name>

This example removes the tca_slack_resources channel from the TCA_DISK_UTILIZATION_UPPER_1 rule.

cumulus@switch:~$ netq add tca tca_id TCA_DISK_UTILIZATION_UPPER_1 channel drop tca_slack_resources
Successfully added/updated tca TCA_DISK_UTILIZATION_UPPER_1

Change the Name of a TCA Rule

You cannot change the name of a TCA rule using the NetQ CLI because the rules do not have names. They receive identifiers (the tca_id) automatically. In the NetQ UI, to change a rule name, you must delete the rule and re-create it with the new name. Refer to Delete a TCA Rule and then Create a TCA Rule.

Change the Severity of a TCA Rule

TCA rules have either an informational or critical severity.

In the NetQ UI, you cannot change the severity directly; you must delete the rule and re-create it with the new severity. Refer to Delete a TCA Rule and then Create a TCA Rule.

In the NetQ CLI, to change the severity, run:

netq add tca tca_id <text-tca-id-anchor> (severity info | severity critical)

This example changes the severity of the TCA_CPU_UTILIZATION_UPPER_1 rule from critical to info:

cumulus@switch:~$ netq add tca tca_id TCA_CPU_UTILIZATION_UPPER_1 severity info
Successfully added/updated tca TCA_CPU_UTILIZATION_UPPER_1

Suppress a TCA Rule

During troubleshooting or maintenance of switches you might want to suppress a rule to prevent erroneous event messages. You accomplish this using the NetQ UI or the NetQ CLI.

The TCA rules have three possible states in the NetQ UI:

  • Active: Rule is operating, delivering events. This is the normal operating state.
  • Suppressed: Rule is disabled until a designated date and time. When that time occurs, the rule is automatically reenabled. This state is useful during troubleshooting or maintenance of a switch when you do not want erroneous events being generated.
  • Disabled: Rule is disabled until a user manually reenables it. This state is useful when you are unsure when you want the rule to be reenabled. This is not the same as deleting the rule.

To suppress a rule for a designated amount of time, you must change the state of the rule.

To suppress a rule:

  1. Click to open the Main Menu.

  2. Click Threshold Crossing Rules under Notifications.

  3. Locate the rule you want to suppress.

  4. Click Disable.

  5. Click in the Date/Time field to set when you want the rule to be automatically reenabled.

  6. Click Disable.

Note the changes in the card:
  • The state is now marked as Inactive, but remains green
  • The date and time that the rule will be enabled is noted in the Suppressed field
  • The Disable option has changed to Disable Forever. Refer to Disable a TCA Rule for information about this change.

The suppress_until option prevents the rule from being applied for a designated amount of time (in seconds). When this time has passed, the rule is automatically reenabled.

To suppress a rule, run:

netq add tca tca_id <text-tca-id-anchor> suppress_until <text-suppress-ts>

This example suppresses the maximum CPU utilization event for 24 hours:

cumulus@switch:~$ netq add tca tca_id TCA_CPU_UTILIZATION_UPPER_2 suppress_until 86400
Successfully added/updated tca TCA_CPU_UTILIZATION_UPPER_2
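
Because suppress_until takes a number of seconds, it can help to compute the value from a readable duration instead of typing it by hand. This Python sketch shows the arithmetic behind the 24-hour example; the function names are hypothetical, and the command is only formatted as text, not executed.

```python
from datetime import timedelta


def suppress_seconds(duration: timedelta) -> int:
    """Convert a duration to whole seconds for the suppress_until option."""
    seconds = int(duration.total_seconds())
    if seconds <= 0:
        raise ValueError("suppression duration must be positive")
    return seconds


def suppress_command(tca_id: str, duration: timedelta) -> str:
    """Format the CLI command text for suppressing a rule (illustrative)."""
    return f"netq add tca tca_id {tca_id} suppress_until {suppress_seconds(duration)}"


if __name__ == "__main__":
    # 24 hours * 60 minutes * 60 seconds = 86400 seconds
    print(suppress_command("TCA_CPU_UTILIZATION_UPPER_2", timedelta(hours=24)))
```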

Disable a TCA Rule

Whereas suppression temporarily disables a rule, you can deactivate a rule to disable it indefinitely. You can disable a rule using the NetQ UI or the NetQ CLI.

The TCA rules have three possible states in the NetQ UI:

  • Active: Rule is operating, delivering events. This is the normal operating state.
  • Suppressed: Rule is disabled until a designated date and time. When that time occurs, the rule is automatically reenabled. This state is useful during troubleshooting or maintenance of a switch when you do not want erroneous events being generated.
  • Disabled: Rule is disabled until a user manually reenables it. This state is useful when you are unsure when you want the rule to be reenabled. This is not the same as deleting the rule.

To disable a rule that is currently active:

  1. Click to open the Main Menu.

  2. Click Threshold Crossing Rules under Notifications.

  3. Locate the rule you want to disable.

  4. Click Disable.

  5. Leave the Date/Time field blank.

  6. Click Disable.

Note the changes in the card:
  • The state is now marked as Inactive and is red
  • The rule definition is grayed out
  • The Disable option has changed to Enable to reactivate the rule when you are ready

To disable a rule that is currently suppressed:

  1. Click to open the Main Menu.

  2. Click Threshold Crossing Rules under Notifications.

  3. Locate the rule you want to disable.

  4. Click Disable Forever.

    Note the changes in the card:

    • The state is now marked as Inactive and is red
    • The rule definition is grayed out
    • The Disable option has changed to Enable to reactivate the rule when you are ready

To disable a rule, run:

netq add tca tca_id <text-tca-id-anchor> is_active false

This example disables the TCA_DISK_UTILIZATION_UPPER_1 rule:

cumulus@switch:~$ netq add tca tca_id TCA_DISK_UTILIZATION_UPPER_1 is_active false
Successfully added/updated tca TCA_DISK_UTILIZATION_UPPER_1

To reenable the rule, set the is_active option to true.

Delete a TCA Rule

You might find that you no longer want to receive event notifications for a particular TCA event. In that case, you can either disable the rule, if you think you might want to receive the notifications again, or delete the rule altogether. Refer to Disable a TCA Rule for the first case. Follow the instructions here to remove the rule using either the NetQ UI or NetQ CLI.

The rule can be in any of the three states: active, suppressed, or disabled.

To delete a rule:

  1. Click to open the Main Menu.

  2. Click Threshold Crossing Rules under Notifications.

  3. Locate the rule you want to remove and hover over the card.

  4. Click the delete icon.

To remove a rule altogether, run:

netq del tca tca_id <text-tca-id-anchor>

This example deletes the maximum receive bytes rule:

cumulus@switch:~$ netq del tca tca_id TCA_RXBYTES_UPPER_1
Successfully deleted TCA TCA_RXBYTES_UPPER_1

Resolve Scope Conflicts

The scopes defined by multiple rules for a given TCA event can overlap. In such cases, NetQ generates the event using the TCA rule with the most specific scope that still matches.

To clarify this, consider this example. Three events occurred:

  • First event on switch leaf01, interface swp1
  • Second event on switch leaf01, interface swp3
  • Third event on switch spine01, interface swp1

NetQ attempts to match the hostname and interface name of each event against three TCA rules with different scopes:

  • Scope 1 sends events for the swp1 interface on switch leaf01 (very specific)
  • Scope 2 sends events for all interfaces on switches that start with leaf (moderately specific)
  • Scope 3 sends events for all switches and interfaces (very broad)

The result is:

  • For the first event, NetQ applies the scope from rule 1 because it matches scope 1 exactly
  • For the second event, NetQ applies the scope from rule 2 because it does not match scope 1, but does match scope 2
  • For the third event, NetQ applies the scope from rule 3 because it does not match either scope 1 or scope 2

In summary:

Input Event Scope Parameters TCA Scope 1 TCA Scope 2 TCA Scope 3 Scope Applied
leaf01,swp1 Hostname, Interface leaf01,swp1 leaf*,'*' '*','*' Scope 1
leaf01,swp3 Hostname, Interface leaf01,swp1 leaf*,'*' '*','*' Scope 2
spine01,swp1 Hostname, Interface leaf01,swp1 leaf*,'*' '*','*' Scope 3

Modify your TCA rules to remove the conflict.
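
To check which rule would win before you change anything, the "most specific scope that still matches" idea can be modeled in a few lines of Python. This is a simplified sketch, not NetQ's actual matching code: it assumes each scope token is an exact name, a prefix followed by an asterisk, or a lone asterisk, and it ranks exact above prefix above wildcard, comparing parameters from left to right. The function names are hypothetical.

```python
from typing import Dict, Optional, Tuple

Scope = Tuple[str, ...]


def token_matches(token: str, value: str) -> bool:
    """Return True if one scope token accepts one event value."""
    if token == "*":
        return True
    if token.endswith("*"):
        return value.startswith(token[:-1])
    return token == value


def token_rank(token: str) -> int:
    """Assumed specificity order: exact (2) > prefix (1) > wildcard (0)."""
    if token == "*":
        return 0
    if token.endswith("*"):
        return 1
    return 2


def most_specific(scopes: Dict[str, Scope], event: Scope) -> Optional[str]:
    """Pick the name of the matching scope with the highest rank tuple."""
    matching = [
        (tuple(token_rank(t) for t in tokens), name)
        for name, tokens in scopes.items()
        if len(tokens) == len(event)
        and all(token_matches(t, v) for t, v in zip(tokens, event))
    ]
    return max(matching)[1] if matching else None


if __name__ == "__main__":
    scopes = {
        "Scope 1": ("leaf01", "swp1"),
        "Scope 2": ("leaf*", "*"),
        "Scope 3": ("*", "*"),
    }
    for event in [("leaf01", "swp1"), ("leaf01", "swp3"), ("spine01", "swp1")]:
        print(event, "->", most_specific(scopes, event))
```

With the three scopes from the example, the three events resolve to Scope 1, Scope 2, and Scope 3 respectively.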