Monitor Informational Events
Two event workflows, the Alarms card workflow and the Info card workflow, provide a view into the events occurring in the network. The Alarms card workflow tracks critical severity events, whereas the Info card workflow tracks all warning, info, and debug severity events.
To focus on events from a single device perspective, refer to Monitor Switches. To monitor alarms, refer to Monitor Alarm Events.
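The split between the two workflows can be pictured as a simple mapping from severity to workflow. The following is a minimal sketch, not NetQ code: the names `ALARM_SEVERITIES`, `INFO_SEVERITIES`, and `workflow_for` are hypothetical and only restate the rule described above.

```python
# Hypothetical illustration only; these names are not part of NetQ.
ALARM_SEVERITIES = {"critical"}
INFO_SEVERITIES = {"warning", "info", "debug"}


def workflow_for(severity: str) -> str:
    """Return the card workflow that would track an event of this severity."""
    level = severity.strip().lower()
    if level in ALARM_SEVERITIES:
        return "Alarms"
    if level in INFO_SEVERITIES:
        return "Info"
    raise ValueError(f"unknown severity: {severity!r}")


for severity in ("Critical", "Warning", "Info", "Debug"):
    print(severity, "->", workflow_for(severity))
```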
Info Card Workflow Summary
The Info card workflow enables you to easily view and track informational events occurring anywhere in your network.
The small Info card displays:
- total number and distribution of info events
- total number and distribution of alarms
- indication of when configuration changes occurred and their relative number
The medium Info card displays:
- types of info events that occurred
- total number and distribution of info events
- total number and distribution of alarms
- total number and distribution of configuration changes
<insert image>
The large Info card displays:
- types of info events that occurred
- total number and distribution of info events
- total number and distribution of alarms
- total number and distribution of configuration changes
- info events by most recent
- devices by most issues
<insert images>
The full screen Info card provides tabs for all events and all devices.
<insert image>
View Info Events Summary
A summary of the informational events occurring in the network can be found on the small, medium, and large Info cards. Additional details are available as you increase the size of the card.
To view the summary with the small Info card, simply open the card. This card gives you a high-level view in a condensed visual, including the number and distribution of the info events along with the alarms and configuration changes that have occurred during the same time period.
To view the summary with the medium Info card, simply open the card. This card gives you the same count and distribution of info and alarm events and configuration changes, but it also provides information about the sources of the info events and enables you to view a small slice of time using the distribution charts.
- Use the chart at the top of the card to view the various sources, or types, of info events. Hover over the chart to view the number of each type. The four or so types with the most info events are called out separately, with all others collected together into an Other category. Hovering over the Other segment of the chart provides a listing of the additional types and their counts.
- Hover over the distribution charts to view the count of info and alarm events and configuration changes during a smaller time slice. For example, in a card showing results for the last 24 hours, you can view the counts for a single hour. For a card showing results for the last week, you can view the counts for a single day. Hovering updates the data in the table. To persist the changes, click the charts.
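The time slices described above amount to grouping event timestamps into fixed-width buckets: one hour wide for a 24-hour card, one day wide for a one-week card. The sketch below shows that grouping under the assumption that each event carries a timestamp. The function name `counts_per_slice` and the sample data are hypothetical, and this is not how the card itself is implemented.

```python
from collections import Counter
from datetime import datetime, timedelta


def counts_per_slice(timestamps, start, slice_width):
    """Count events per time slice; slice 0 begins at `start`."""
    counts = Counter()
    for ts in timestamps:
        if ts >= start:
            # Dividing one timedelta by another with // yields an integer index.
            counts[(ts - start) // slice_width] += 1
    return counts


# Hypothetical sample: three info events within a 24-hour window.
window_start = datetime(2019, 3, 1)
info_events = [
    datetime(2019, 3, 1, 0, 10),
    datetime(2019, 3, 1, 0, 50),
    datetime(2019, 3, 1, 3, 5),
]
hourly = counts_per_slice(info_events, window_start, timedelta(hours=1))
print(dict(hourly))
```

For a one-week card, the same function would be called with `timedelta(days=1)` as the slice width.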
To view the summary with the large Info card, open the card. The left side of the card provides the same capabilities as the medium Info card.
Compare Timing of Info and Alarm Events
While you can see the relationship between info and alarm events on the small Info card, the medium and large cards provide considerably more information. Open either of these to view individual line charts for the events. Generally, alarms have some corresponding info events. For example, when a network service becomes unavailable, a critical alarm is often issued, and when the service becomes available again, an info event of severity warning is generated. For this reason, you might see some level of tracking between the info and alarm counts and distributions. Some other possible scenarios:
- When a critical alarm is resolved, you may see a temporary increase in info events as a result.
- When you get a burst of info events, you may see a follow-on increase in critical alarms, as the info events may have been warning you of something beginning to go wrong.
- When you set logging to debug, you may see a large number of info events of severity debug. You would not expect to see an increase in critical alarms.
Compare Timing of Info and Alarm Events with Configuration Changes
While you can see the relationship between info and alarm events and configuration changes on the small Info card, the medium and large cards provide a better view. Open either the medium or large card. Focusing on the distribution charts, you can see the relationship between the info and alarm events and any configuration changes that have been made during this time period. Significant configuration changes, such as x, y, or z, are likely to create a number of info events. Smaller configuration changes, such as a, b, or c, are not as likely to create a large number of info events. (what are the config changes that we track, which category do they fit in?)
View the Source of Info Events
Using the medium or large Info card, you can view the sources of info events and the number of events from each source. The Types of Info chart is divided into segments that represent each source type that generated info events. Hover over the segments to view the number of info events for each source type. The four or so types with the most info events are called out separately, with all others collected together into an Other category. Hovering over the Other segment of the chart provides a listing of the additional types and their counts.
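The grouping into a few top types plus an Other category can be sketched as follows. This is an illustration only: the function name `summarize_types`, the cutoff of four, and the sample data are assumptions, since the text says only "four or so".

```python
from collections import Counter


def summarize_types(event_types, top_n=4):
    """Keep the top_n most frequent types and fold the rest into 'Other'."""
    counts = Counter(event_types)
    top = counts.most_common(top_n)
    other = sum(counts.values()) - sum(count for _, count in top)
    summary = dict(top)
    if other:
        summary["Other"] = other
    return summary


# Hypothetical sample of event types seen during the card's time period.
sample = ["bgp"] * 5 + ["clag"] * 4 + ["evpn"] * 3 + ["ntp"] * 2 + ["lldp", "link"]
print(summarize_types(sample))
```

Here the `lldp` and `link` events fall outside the top four, so they are counted together under Other.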
View All Info Events Sorted by Time of Occurrence
You can view all info events using the large Info card. Open the large card and confirm the Events By Most Recent option is selected in the dropdown above the table on the right. When this option is selected, all of the info events are listed with the most recently occurring event at the top. Scrolling down shows you the info events that have occurred at an earlier time within the selected time period for the card.
You can also re-sort the table to show the oldest info events at the top, or sort it by message if you are looking for how often a particular info event is occurring. Hover over the relevant column header and click the sort icon. (only avail on full screen?)
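The sort orders described in this section can be pictured with a short sketch. The event records, their field names (`timestamp`, `message`), and the `sort_events` helper are hypothetical. They stand in for the rows of the table, not for any NetQ data structure.

```python
from datetime import datetime


def sort_events(events, column="timestamp", descending=True):
    """Return a new list of events ordered by the given column."""
    return sorted(events, key=lambda event: event[column], reverse=descending)


# Hypothetical rows; the messages are taken from the reference table in this topic.
events = [
    {"timestamp": datetime(2019, 3, 1, 9, 0), "message": "Peer state changed to up"},
    {"timestamp": datetime(2019, 3, 1, 11, 30), "message": "Backup IP not configured"},
    {"timestamp": datetime(2019, 3, 1, 10, 15), "message": "Peer state changed to up"},
]

most_recent_first = sort_events(events)                        # default view
oldest_first = sort_events(events, descending=False)           # oldest at the top
by_message = sort_events(events, "message", descending=False)  # groups repeats together
```

Sorting by message places identical messages next to each other, which makes it easier to see how often a particular info event occurs.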
View Devices with the Most Info Events
By default, the list of all info events is displayed when viewing the large Info card. You can filter instead for the devices that have the most info events by selecting the Devices by Most Issues option from the dropdown above the table. To view all devices with info events, click Show All Devices to open the Device Inventory tab on the full screen Info card.
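Ranking devices by most issues is, in effect, a count of info events per device sorted in descending order. The sketch below assumes each event records the hostname that raised it. The field name `hostname` and the function `devices_by_most_issues` are illustrative, not part of NetQ.

```python
from collections import Counter


def devices_by_most_issues(events, limit=None):
    """Return (hostname, event_count) pairs, busiest device first."""
    counts = Counter(event["hostname"] for event in events)
    return counts.most_common(limit)


# Hypothetical sample: six info events across three devices.
events = [
    {"hostname": "leaf01"}, {"hostname": "spine01"}, {"hostname": "leaf01"},
    {"hostname": "leaf02"}, {"hostname": "leaf01"}, {"hostname": "spine01"},
]
print(devices_by_most_issues(events, limit=2))
```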
View All Events
You can view all alarms and info events using the full screen Info card. Simply open the full screen Info card, and click the All Events tab.
If you already have the large Info card open, you can open the All Events tab by clicking the See Alarms link in the bottom right corner of the card.
To return to your workbench, click the icon next to the title of the full screen card.
View All Devices
You can view all devices using the full screen Info card. Simply open the full screen Info card, and click the All Devices tab.
Informational Events Reference
The following table lists informational event messages organized by message type and then severity, by default. Click a column header to sort the list by that characteristic. Click the sort icon in any column header to toggle the sort order between A-Z and Z-A. Recommended actions suggest NetQ GUI cards, CLI commands, and Cumulus Linux NCLU commands for further investigation.
| Type | Trigger | Severity | Message Format | Example |
| --- | --- | --- | --- | --- |
| bgp | BGP Session state changed from Failed to Established | Info | BGP session with peer @peer @peerhost @neighbor vrf @vrf session state changed from Failed to Established | BGP session with peer swp5 spine02 spine03 vrf default session state changed from Failed to Established |
| bgp | BGP Session state changed from Established to Failed | Info | BGP session with peer @peer @neighbor vrf @vrf state changed from Established to Failed | BGP session with peer leaf03 leaf04 vrf mgmt state changed from Established to Failed |
| bgp | The reset time for a BGP session changed | Info | BGP session with peer @peer @neighbor vrf @vrf reset time changed from @old_last_reset_time to @new_last_reset_time | BGP session with peer spine03 swp9: reset time changed from time to time (format of times?) |
| cable | The speed setting for a given port changed | Info | @ifname speed changed from @old_speed to @new_speed | swp9 speed changed from 10 to 40 (units?) |
| cable | The transceiver status for a given port changed | Info | @ifname transreceiver changed from @old_transareceiver to @new_transareceiver | swp4 transceiver changed from down to up/disabled to enabled/disconnected to connected? |
| cable | The vendor of a given port (transceiver?) changed | Info | @ifname vendor name changed from @old_vendor_name to @new_vendor_name | swp23 vendor name changed from Broadcom to Mellanox |
| cable | The part number of a given port (transceiver?) changed | Info | @ifname part number changed from @old_part_number to @new_part_number | swp7 part number changed from FP1ZZ5654002A to MSN2700-CS2F0 |
| cable | The serial number of a given port changed | Info | @ifname serial number changed from @old_serial_number to @new_serial_number | swp4 serial number changed from 571254X1507020 to MT1552X12041 |
| cable | The status of forward error correction (FEC) support for a given port changed | Info | @ifname supported fec changed from @old_supported_fec to @new_supported_fec | swp12 supported fec changed from supported to unsupported; swp12 supported fec changed from unsupported to supported |
| cable | The advertised support for FEC for a given port changed | Info | @ifname supported fec changed from @old_advertised_fec to @new_advertised_fec | swp24 supported FEC changed from advertised to not advertised (or enabled/disabled?) |
| cable | The FEC status for a given port changed | Info | @ifname fec changed from @old_fec to @new_fec | swp15 fec changed from disabled to enabled |
| clag | Backup IP address has not been specified in case the peerlink goes down | Warning | Backup IP not configured | Backup IP not configured |
| clag | Backup IP address is not reachable | Warning | Backup IP connectivity failed | Backup IP connectivity failed |
| clag | Bond has only one connection | Warning | Bond @bond is singly connected | Bond 2 is singly connected |
| clag | Bond state changed from up to down | Warning | Bond @bond is down | Bond 1 is down |
| clag | CLAG remote peer state changed from down to up | Info | Peer state changed to up | Peer state changed to up |
| clag | Local CLAG host state changed from down to up | Info | Clag state changed from down to up | Clag state changed from down to up |
| clag | CLAG bond in Conflicted state was updated with new bonds | Info | Clag conflicted bond changed from @old_conflicted_bonds to @new_conflicted_bonds | Clag conflicted bond changed from swp7 swp8 to swp9 swp10 |
| clag | CLAG bond changed state from protodown to up state | Info | Clag conflicted bond changed from @old_state_protodownbond to @new_state_protodownbond | Clag conflicted bond changed from protodown to up |
| configdiff | Configuration file has been modified | Info | @hostname config file @type was modified | spine03 config file /etc/frr/frr.conf was modified |
| configdiff | Configuration file has been created | Info | @hostname config file @type was created | leaf12 config file /etc/lldp.d/README.conf was created |
| evpn | The type of VNI, layer 2 versus layer 3, is inconsistent across the network fabric | Warning | VNI @vni type inconsistent | VNI 12 type inconsistent |
| evpn | The forwarding database learning feature was enabled on the specified VNI | Warning | VNI @vni fdb learning enabled on network port | VNI 4 fdb learning enabled on network port |
| evpn | An incorrect VNI to VRF mapping has occurred | Warning | VNI @vni mapped to VRF @vrf1 instead of @vrf2 | VNI 21 mapped to VRF mgmt instead of default |
| evpn | An incorrect VNI to VLAN mapping has occurred | Warning | VNI @vni mapped to VLAN @vlan1 instead of @vlan2 | VNI 7 mapped to VLAN 1005 instead of 1003 |
| evpn | A VNI was configured and moved from the down state to the up state | Info | VNI @vni state changed from down to up | VNI 36 state changed from down to up |
| evpn | The kernel state changed from x to y on a VNI | Info | VNI @vni kernel state changed from @old_in_kernel_state to @new_in_kernel_state | VNI 3 kernel state changed from something to something else |
| evpn | A VNI state changed from not advertising all VNIs to advertising all VNIs | Info | VNI @vni vni state changed from @old_adv_all_vni_state to @new_adv_all_vni_state | VNI 11 vni state changed from something to something else |
| link | Link operational state changed from down to up | Info | HostName @hostname changed state from @old_state to @new_state Interface:@ifname | HostName leaf04 changed state from down to up Interface:swp11 |
| lldp | Local LLDP host has new neighbor information | Info | LLDP Session with host @hostname and @ifname modified fields @changed_fields | LLDP Session with host leaf02 swp6 modified fields leaf06 swp21 |
| lldp | Local LLDP host has new peer interface name | Info | LLDP Session with host @hostname and @ifname @old_peer_ifname changed to @new_peer_ifname | LLDP Session with host spine01 and swp5 swp12 changed to port12 |
| lldp | Local LLDP host has new peer hostname | Info | LLDP Session with host @hostname and @ifname @old_peer_hostname changed to @new_peer_hostname | LLDP Session with host leaf03 and swp2 leaf07 changed to exit01 |
| ntp | NTP sync state has been lost or never acquired | Warning | Sync state unknown for @hostname | Sync state unknown for spine01 |
| ntp | NTP sync state changed from not in sync to in sync | Info | Sync state changed from @old_state to @new_state for @hostname | Sync state changed from not sync to in sync for leaf06 |
| sensor | A temperature, fan, or power supply sensor state changed from a higher severity to a lower severity | Info | Sensor @sensor state changed from @old_state to @new_state | Sensor temperature state changed from critical to low |
| sensor | A temperature, fan, or power supply sensor state changed from a low or medium severity to a medium or low severity | Info | Sensor @sensor state changed from @old_state to @new_state | Sensor temperature state changed from medium to low |
| services | A service changed state from down to up | Info | Service @name changed state from down to up | Service bgp changed state from down to up; Service lldp changed state from down to up |
| trace | The underlay network MTU does not provide enough headroom to encapsulate (some message?) for a given interface | Warning | Underlay mtu @mtu at @hostname:@link not enough encap headroom | Underlay mtu 9600 at leaf23:swp6 not enough encap headroom |
| trace | Path MTU is inconsistent along trace path | Warning | Inconsistent PMTU among paths | Inconsistent PMTU among paths |
| trace | Tunnel MTU is different on the incoming and outgoing interfaces | Warning | @link tunnel mtu mismatch between ingress @hostname1 and egress @hostname2 | swp9 tunnel mtu mismatch between ingress spine6 and egress leaf31 |
| trace | Interface MTU is different on each end of the link | Warning | Link mtu mismatch between @hostname:@link1 (@mtu1) and @hostname2:@link2 (@mtu2) | Link mtu mismatch between leaf09:swp4 (9600) and leaf10:swp5 (1500) |
| vlan | Interface has no peer bond on remote switch | Info | @link No peer bond | swp49 No peer bond |
| vlan | Interface has no CLAG peer information | Info | @link no clag peer info | swp25 no clag peer info |
| vlan | Interface has no CLAG peerlink information | Info | @link no clag peerlink info | @link no clag peerlink info |
| vlan | Peer link on a given interface is down | Info | Peer link @ifname is down | Peer link swp17 is down |
| vlan | CLAG peerlink for a given interface is unknown | Info | @link clag peer link unknown | swp4 clag peer link unknown |
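The Message Format column uses `@name` placeholders that are replaced with actual values in the Example column. The sketch below shows one way to read that relationship by substituting placeholder values into a format string. It assumes placeholders consist of `@` followed by letters, digits, or underscores. The function `render_message` is hypothetical and does not represent how NetQ generates its messages.

```python
import re

# Assumption: a placeholder is "@" followed by letters, digits, or underscores.
PLACEHOLDER = re.compile(r"@(\w+)")


def render_message(template, values):
    """Replace each @name in the template with values[name], if provided."""
    def substitute(match):
        name = match.group(1)
        # Leave unknown placeholders untouched so gaps remain visible.
        return str(values.get(name, match.group(0)))
    return PLACEHOLDER.sub(substitute, template)


print(render_message(
    "VNI @vni mapped to VRF @vrf1 instead of @vrf2",
    {"vni": 21, "vrf1": "mgmt", "vrf2": "default"},
))
```

Leaving unmatched placeholders in place, rather than raising an error, is a design choice for this illustration: it makes a missing value easy to spot in the rendered text.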