Tips for Network Engineer

Thursday, October 9, 2025

SQL ? no ! AQL (Advanced Query Language)

Hello !

I'm still alive... Let's continue with Arista networking...

In CloudVision, create a new Dashboard and choose Metrics > AQL > Table

1) Dashboard to display the cvptemp password

Sometimes when you install a switch via ZTP, it gets stuck in a weird state: the switch is no longer reachable with just "admin" and an empty password, but your configuration is not working either. If the switch reached CloudVision at some point, you can use the temporary account "cvptemp" to log in to the switch. Here is a CloudVision dashboard that lists the switches that registered through ZTP, with the matching "cvptemp" password for each.

Commands

# Temporary credentials stored by CloudVision, keyed by device MAC address
let data = `cvp:/netElementCredentials/ids/*`
# Device inventory, used to map a MAC address to a hostname
let inventory = `analytics:/DatasetInfo/Devices`
let devices = merge(inventory)
let res = newDict()
for macKey, macPass in data {
    res[macKey] = newDict()
    res[macKey]["Password"] = merge(macPass)[macKey]
    for deviceKey, deviceVal in devices {
        if deviceVal["mac"] == macKey {
            res[macKey]["hostname"] = deviceVal["hostname"]
        }
    }
}
res

Example output

Then you can ssh into your switch with cvptemp and the password.

2) Dashboard to display all BGP down per VRF

Another dashboard to build in CloudVision. The goal is to display all BGP sessions that are down. I added a filter on devices whose hostname contains "sw", but you can remove that part.

Commands


let devicesInfo = `analytics:/Devices/*/versioned-data/Device` 
let devices = newDict()
for deviceSerial, ts_details in devicesInfo {
    let device_info = merge(ts_details)
    # Keep only devices whose hostname contains "sw"
    if strContains(device_info["hostname"], "sw") {
        devices[deviceSerial] = device_info
    }
}
let bgp = `analytics:/Devices/*/versioned-data/routing/bgp/status/vrf/*/bgpPeerInfoStatusEntry/*`
let bgp_per_devices = newDict()
for device_serial, device in devices{
    if dictHasKey(bgp, device_serial){
        bgp_per_devices[device["hostname"]] = bgp[device_serial]
    }
}
let bgp_down_per_devices = newDict()
for switch_name, vrf in bgp_per_devices{
    for vrf_name, bgp_peers in vrf {
        for bgp_peer_ip, ts_bgp_peer in bgp_peers{
            let bgp_peer = merge(ts_bgp_peer)
            if bgp_peer["bgpState"]["Name"] != "Established"{
                # check if the switch name exist
                if !dictHasKey(bgp_down_per_devices, switch_name){
                    bgp_down_per_devices[switch_name] = newDict()

                }
                # check if the vrf exist
                if !dictHasKey(bgp_down_per_devices[switch_name], vrf_name){
                    bgp_down_per_devices[switch_name][vrf_name] = newDict()

                }
                bgp_down_per_devices[switch_name][vrf_name][bgp_peer_ip] = bgp_peer

            }
        }

    }
}

let dict_for_viewing = newDict()
for device_name, vrf in bgp_down_per_devices {
    for vrf_name , bgp_peers in vrf {
        for bgp_peer_ip, bgp_peer in bgp_peers {
            if bgp_peer["bgpPeerAdminShutDown"] == false{
                let detail_problemes = newDict()
                detail_problemes["switch"] = device_name
                detail_problemes["vrf"] = vrf_name
                detail_problemes["description"] = bgp_peer["bgpPeerDescription"]
                detail_problemes["admin_shut"] = bgp_peer["bgpPeerAdminShutDown"]
                # Rows are keyed by peer IP: a peer IP seen on several switches or VRFs keeps only the last entry
                dict_for_viewing[bgp_peer_ip] = detail_problemes
            }

        }
    }
}
dict_for_viewing
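
If the nested AQL loops are hard to follow, here is the same filtering logic as a minimal Python sketch. The sample data is hypothetical and only mimics the shape used above (hostname, then VRF, then peer IP); it is not CloudVision output. Unlike the query, the sketch keys each row by (switch, VRF, peer IP), so a peer address reused in two VRFs does not overwrite an earlier row.

```python
# Hypothetical sample data shaped like bgp_per_devices in the AQL query:
# hostname -> VRF -> peer IP -> peer attributes. Not real CloudVision output.
SAMPLE = {
    "sw-leaf1": {
        "default": {
            "10.0.0.1": {"bgpState": {"Name": "Established"},
                         "bgpPeerAdminShutDown": False,
                         "bgpPeerDescription": "spine1"},
            "10.0.0.2": {"bgpState": {"Name": "Active"},
                         "bgpPeerAdminShutDown": False,
                         "bgpPeerDescription": "spine2"},
        },
        "red": {
            "10.0.0.2": {"bgpState": {"Name": "Idle"},
                         "bgpPeerAdminShutDown": True,
                         "bgpPeerDescription": "fw1"},
        },
    },
}


def down_peers(bgp_per_device):
    """Return peers that are not Established and not administratively shut."""
    rows = {}
    for switch, vrfs in bgp_per_device.items():
        for vrf, peers in vrfs.items():
            for peer_ip, peer in peers.items():
                if peer["bgpState"]["Name"] == "Established":
                    continue  # session is up, nothing to report
                if peer["bgpPeerAdminShutDown"]:
                    continue  # down on purpose, skip it
                rows[(switch, vrf, peer_ip)] = {
                    "description": peer["bgpPeerDescription"],
                    "state": peer["bgpState"]["Name"],
                }
    return rows


for key, row in down_peers(SAMPLE).items():
    print(key, row)
```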

Example output

Tuesday, August 20, 2024

Advanced tips for a good Arista network debugging


Hello !

Long time since I posted... In my professional journey I discovered Arista networking... I had to deploy and engineer over 400 switches (new datacenter) so I did a LOT of troubleshooting and I wanted to share some advanced tips.

Arista EOS is close to Cisco IOS but not identical. It is also built entirely on a Linux base (CentOS 7 at the time of writing), so we can use the power of Linux on top of the particularities of EOS.

1) Check if your route-map works

These commands show whether a route is received/sent through the currently configured route-map, and which statement it matches:

Commands

show bgp debug policy inbound neighbor { <neighbor address> | all } ipv4 unicast [ vrf <vrf name> ] [ route-map <route-map> ] <prefix> 

show bgp debug policy outbound neighbor <neighbor address> ipv4 unicast [ vrf <vrf name> ] [ route-map <route-map> ] <prefix>

Example output

% show bgp debug policy inbound neighbor 10.1.2.1 ipv4 unicast vrf red route-map foo 10.100.20.0/24 

NLRI 10.100.20.0/24, received from 10.1.2.1

route-map foo
  seq 10 permit
     match as 1 (failed)
     Seq result: fall through to next sequence
  seq 20 permit
     match as 1 (matched)
     sub-route-map sub_foo (permit)
        seq 10 permit
           match as 1 (matched)
           Seq result: permit
        Route map result: permit, matching sequence 10
     Seq result: permit
  Route map result: permit, matching sequence 20

Associated link (you need an account) : https://www.arista.com/en/support/toi/eos-4-22-1f/14267-route-map-debugging-cli
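
The output reads top to bottom: each sequence is tried in order, the first one whose match succeeds decides the result, and a route that matches nothing is denied. Here is a minimal Python model of that first-match logic, as a sketch only. The Sequence class, the evaluate function and the sample route are hypothetical, and sub-route-maps and continue clauses are deliberately left out.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Sequence:
    number: int
    action: str                    # "permit" or "deny"
    match: Callable[[dict], bool]  # returns True when the route matches


def evaluate(route_map: List[Sequence], route: dict) -> Tuple[str, Optional[int]]:
    """Return (result, matching sequence number) using first-match semantics."""
    for seq in sorted(route_map, key=lambda s: s.number):
        if seq.match(route):
            return seq.action, seq.number
        # no match: fall through to the next sequence
    return "deny", None  # implicit deny when nothing matched


# Hypothetical route-map: seq 10 looks for AS 2, seq 20 looks for AS 1.
foo = [
    Sequence(10, "permit", lambda r: 2 in r["as_path"]),
    Sequence(20, "permit", lambda r: 1 in r["as_path"]),
]
route = {"prefix": "10.100.20.0/24", "as_path": [1]}
print(evaluate(foo, route))
```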

2) Check if your configuration is coherent

These commands will help you check and verify if your configuration is coherent.

2.1 Consistency

Starting with EOS 4.30.2, you can run the consistency check globally to verify that what you configured is actually used and correctly linked to the other parts of the configuration (for example, prefix-lists and route-maps).

Commands

show configuration consistency policy

Example output

% SWLAB#show configuration consistency policy

Undefined references

Feature                Result       Detail
---------------------- ------------ -----------------------------------------
IPv6 access list       warn         lab-control-plane-acl is undefined
Route map              warn         ROUTE_LEAKING_PFX is undefined

2.2 config-sanity

The config-sanity commands have always been part of EOS and are very useful: for a given feature, they check, category by category, whether everything is configured.

Commands


show vxlan config-sanity
show route-map config-sanity
show mlag config-sanity

Example output

% SWLAB#show vxlan config-sanity

category                            result  detail                                    
---------------------------------- -------- ----------------------------------------- 
Local VTEP Configuration Check       WARN                                      
  VLAN-VNI Map                       WARN   VLAN 20 does not exist
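
When you collect this output from many switches, it helps to pull out only the rows that need attention. Here is a minimal Python sketch, assuming the output was saved as text and that columns are separated by at least two spaces, as in the sample above. The FAIL value is an assumption (only WARN appears in this post), and find_problems is a hypothetical helper.

```python
import re

# Sample copied from the "show vxlan config-sanity" output above.
SAMPLE = """\
category                            result  detail
---------------------------------- -------- -----------------------------------------
Local VTEP Configuration Check       WARN
  VLAN-VNI Map                       WARN   VLAN 20 does not exist
"""

# category, at least two spaces, result word, then an optional detail column
ROW = re.compile(r"^(?P<category>.+?)\s{2,}(?P<result>[A-Za-z]+)(?:\s+(?P<detail>.*))?$")


def find_problems(output, bad_results=("WARN", "FAIL")):
    """Return (category, result, detail) for every row whose result is in bad_results."""
    problems = []
    for line in output.splitlines():
        row = ROW.match(line.strip())
        if row is None:
            continue  # separator or empty line
        result = row.group("result").upper()  # the consistency check prints "warn" in lowercase
        if result in bad_results:
            problems.append((row.group("category"), result, row.group("detail") or ""))
    return problems


for category, result, detail in find_problems(SAMPLE):
    print(f"{result}: {category} {detail}".rstrip())
```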

3) Check the control plane

Thanks to the linux base, it is possible to do live packet capturing on the switch.

Warning - depending on your EOS version and hardware model, packet capturing can trigger bugs, such as freezing the switch completely (a hard power cycle is then needed). See bug 836750

First, enter the VRF environment you want to analyze, then drop to the Linux bash shell and capture on the interface.

Commands


cli vrf MYVRF
bash
sudo su -
ip a (check the interfaces within the VRF environment) 
tcpdump -nni vlan971

Linux is a full operating system, so you can use any other command you like, such as top, netstat...

4) Capture packets

Capturing packets is never without risk: for example, too much throughput could trigger a bug or overload the switch, so beware !

A good solution is to use the Recirculation Channel feature to help with the CPU load.

Let's say I want to capture packets (data) on interface Ethernet3 (Tx and Rx), and that Ethernet34 is not used.

Commands

conf t
!
interface recirc-channel 1
switchport recirculation features cpu-mirror
!
interface et34
description Recirc-channel1
traffic-loopback source system device mac
channel-group recirculation 1
no shut
!
monitor session MonitorAvecRecirc source Ethernet3 both
monitor session MonitorAvecRecirc destination cpu
end
!
! Launch tcpdump 
tcpdump monitor MonitorAvecRecirc > tcpdump_data.pcap

tcpdump via the CLI is limited, and filters do not work, but it is very helpful for troubleshooting some cases.

---

That's it for my post on Arista. See you soon !

Friday, August 27, 2021

Remove encapsulation from pcap packets

Hello,

Recently I was troubleshooting some flows, and I needed to know what went through a GRE tunnel. After reading some tutorials, including ones based on Wireshark (which did not work), I found the perfect tool to de-encapsulate packets : ipdecap ( website - github )

1) Install

In theory, installation is simple. In practice, you may need to tweak a few things :

Packages needed = autoconf, automake, libtool, openssl, libpcap, libpcap-devel

Install procedure =

wget https://loicpefferkorn.net/ipdecap/ipdecap-0.7.tar.gz
tar xvzf ipdecap-0.7.tar.gz
cd ipdecap-0.7
sh autogen.sh
./configure
make
make install

But you might encounter errors such as :

ipdecap.c:28:45: fatal error: pcap/dlt.h: No such file or directory

I could not find the requested file on my server, so I went ahead in the src/ipdecap.c file and deleted the line. Afterwards it worked fine.

2) Use

It is relatively easy to use

Remove GRE encapsulation from packets located in the gre.cap file, and write them to output.cap
```
$ ipdecap -i gre.cap -o output.cap
```
If you have multiple tunnels encapsulated, just repeat the previous step.
Remove ESP encapsulation, configuration in esp.conf
```
$ ipdecap -i esp.cap -o output.cap -c esp.conf
```
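
To see what the tool does for GRE, here is a minimal Python sketch of the idea, not ipdecap's actual code: skip the outer IPv4 header, read the GRE flags to find the header length, and keep what follows. It assumes a raw IPv4 packet (no Ethernet header, no pcap file parsing) and handles only the checksum, key and sequence-number options. strip_gre is a hypothetical helper name.

```python
import struct

GRE_PROTOCOL = 47  # IP protocol number for GRE


def strip_gre(ip_packet: bytes) -> bytes:
    """Return the payload carried inside an IPv4 + GRE packet."""
    if len(ip_packet) < 20 or ip_packet[0] >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    ihl = (ip_packet[0] & 0x0F) * 4          # outer IPv4 header length in bytes
    if ip_packet[9] != GRE_PROTOCOL:
        raise ValueError("outer packet does not carry GRE")
    if len(ip_packet) < ihl + 4:
        raise ValueError("truncated GRE header")
    flags, _inner_proto = struct.unpack("!HH", ip_packet[ihl:ihl + 4])
    gre_len = 4                              # flags/version + protocol type
    if flags & 0x8000:                       # C bit: checksum + reserved field
        gre_len += 4
    if flags & 0x2000:                       # K bit: key field
        gre_len += 4
    if flags & 0x1000:                       # S bit: sequence number field
        gre_len += 4
    return ip_packet[ihl + gre_len:]


# Hand-built example: 20-byte outer IPv4 header, basic GRE header, fake payload.
outer = bytes([0x45, 0, 0, 0, 0, 0, 0, 0, 64, GRE_PROTOCOL]) + bytes(10)
gre = bytes([0x00, 0x00, 0x08, 0x00])        # no options, inner protocol IPv4
print(strip_gre(outer + gre + b"inner-packet"))
```

For stacked tunnels, the same function can be applied again to its own output, which mirrors repeating the ipdecap step.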

Thanks to loicpefferkorn for this package !

Friday, August 13, 2021

Update F5 chassis licence

Updating a licence on an F5 chassis can be tricky. Let's review the necessary steps:

0) PREPARE

0.1) VCMP cluster sync

Via tmsh/cli [ ACTIVE node only ]

show cm sync-status

show cm failover-status

run /cm config-sync to-group CLUSTERGROUP

0.2) Check VCMP Host status

show vcmp guest all-properties | grep "Comment\|deployed\|Prompt"

0.3) Take a copy of the licence, just in case

bash

tmsh show sys license

cd /config

cp bigip.license bigip.license.DATE

ls -la | grep license
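
The same backup can be scripted so that DATE is filled in automatically. Here is a minimal Python sketch, assuming Python is available where you run it. backup_license is a hypothetical helper, and the YYYYMMDD suffix is just one possible choice for DATE.

```python
import shutil
from datetime import date
from pathlib import Path


def backup_license(license_path, today=None):
    """Copy the license file next to itself with a date suffix and return the new path."""
    src = Path(license_path)
    stamp = (today or date.today()).strftime("%Y%m%d")
    dst = src.with_name(f"{src.name}.{stamp}")
    shutil.copy2(src, dst)  # copy2 also keeps timestamps and permissions
    return dst


# Usage on the unit (path taken from the steps above):
# backup_license("/config/bigip.license")
```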

0.4) If GTM is used with standalone units, remember to disable the nodes whose licence will be updated, since wide IPs may point to multiple standalone units.

1) EXECUTION

1.1) On the standby unit

check standby units : show vcmp guest all-properties | grep "Comment\|deployed\|Prompt"

if you need to failover an ACTIVE node : run sys failover standby

F5 procedure:

To re-activate the license with the Add-On registration using the manual activation method, perform the following procedure:

1. Log in to the Configuration utility.

2. Navigate to System > License.

3. Click Re-activate.

4. Paste the Add-On registration key into the Add-On Key field and click Add.

5. Click Manual.

6. Click Next.

7. Copy the dossier and connect to the F5 Product Licensing page at the following address:

https://secure.f5.com

8. On the F5 Licensing Tools page, click Activate F5 product registration key for BIG-IP 9.x and later.

9. Paste the dossier into the Enter your dossier field, and click Next.

10. Copy the license returned by the F5 Product Licensing page and paste it into the License field in the Configuration utility.

11. Click Next.

1.2) Failover the nodes within clusters

show cm failover-status

! If ACTIVE, failover

run sys failover standby

! Check status : STANDBY

show cm failover-status

1.3) The ARP situation

You might want to monitor more closely what happens to the ARP entries of all your VS on the equipment that handles L3 when the failover is issued. If you still have ARP entries pointing to the STANDBY node, go to the L3 switch and clear them :

clear ip arp <ip_address> vrf <VRF_Name>

You should only be left with SELF-IPs after this step.

With all these steps you should be good to go. Of course your infrastructure will have differences, but that is what they pay you ! 😅

Thursday, April 22, 2021

Cisco Modeling Labs (new VIRL name) tips

Hello,

Network modeling is still in rough shape and exists in many different products. Today we focus on the Cisco tool, which I found very practical once you manage to set it up correctly and gain some experience.

1) Gain access

Whether you are a Cisco Partner or an individual, you can get access to CML. The price may vary. You then get an ISO that you can install wherever you want, followed by a short setup where you define the IP of the tool and the users that will have access to it.

2) Use the lab

Predefined Cisco nodes with various IOS versions are available. Just drag and drop one, then press the green arrow, just like a VM in VMware, and it will boot.

Remember to define the number of interfaces needed before you boot the node, as it cannot be changed afterwards.

3) Reach it

There is a functionality called "breakout tool" that allows you to reach each device in your lab via SSH or automation. Here is how to deploy it on a Linux VM :

From your CML main dashboard, click on "Tools", then "Breakout tool" to have access to the full documentation.

From your linux VM get the package, for example if your lab can be reached on 10.10.10.10:

wget https://10.10.10.10/breakout/breakout-linux-x86_amd64 --no-check-certificate

Then run the default UI of the tool:

./breakout-linux-x86_amd64 -listen 0.0.0.0 ui

Then you can access the UI and fill all the details such as username/password to connect to your CML.

If you encounter an error like:

Can't refresh data
Get "10.10.10.10/api/v0/keys/vnc": unsupported protocol scheme ""

Just add https:// before the IP address of the lab
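
The error means the address was given without a scheme, so the HTTP client cannot tell which protocol to use. Here is a minimal Python sketch of the fix, for example if you generate the controller address from a script. with_https is a hypothetical helper and is not part of the breakout tool.

```python
def with_https(address: str) -> str:
    """Prefix the address with https:// unless it already has a scheme."""
    address = address.strip()
    if "://" in address:
        return address  # a scheme is already present, leave it alone
    return "https://" + address


print(with_https("10.10.10.10"))
```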

Then you can fetch your lab details (refresh button) and connect to running equipment from your Linux VM using the 127.0.0.1 address and the custom ports.

4) Import and Export

You can easily import an already defined lab by using the import button available on the CML welcome screen once logged in.

If you want to export your current lab you have some steps to check :

- For each equipment in the lab, click on it, then go to "Edit configuration" and click on "Fetch from device"

- Leave the equipment focus but stay within your lab, click on the "Simulate" tab, then on "Download lab".

You will now have a .yaml file that you will be able to exchange or reuse elsewhere.

5) Add a VM from another technology

You can add other technologies in your lab in order to fully simulate a part of your network. Here are the steps you must go through :

- Download virtual version of your technology, for example ISO or OVA.

- Download and install QEMU to convert the vmdk file to .qcow2 format (only format recognized by CML 2.0)

- copy (SCP) your .qcow2 to the CML server. Make sure you have enough disk space for this upload.

- Create a node definition matching your technology (you often can find .yaml already defined on internet) selecting your .qcow2 disk image

- The node is now available to use in your lab
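
The conversion step can be sketched as follows. qemu-img convert with -f (input format) and -O (output format) is the standard QEMU command for this. The file name appliance-disk1.vmdk and the helper qemu_convert_command are hypothetical. If you downloaded an OVA, it is a tar archive, so extract it first to get the .vmdk.

```python
import shlex
from pathlib import Path


def qemu_convert_command(vmdk_path):
    """Build the qemu-img command that converts a .vmdk disk to .qcow2."""
    src = Path(vmdk_path)
    dst = src.with_suffix(".qcow2")  # same name, new extension
    return ["qemu-img", "convert", "-f", "vmdk", "-O", "qcow2", str(src), str(dst)]


# Print the command so it can be reviewed before running it.
# To run it from Python instead: subprocess.run(command, check=True)
command = qemu_convert_command("appliance-disk1.vmdk")
print(shlex.join(command))
```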

6) Useful links:

FAQ: https://developer.cisco.com/docs/modeling-labs/#!faq

QEMU: https://www.qemu.org/download/

Monday, July 13, 2020

DNS update ? Tips to check, flush, manage

Hello,

We all use different software to manage DNS (Efficient IP, Infoblox, Solarwinds, Netbox...) but we all share the same issues : changing DNS entries (A record, IP...) can have a non-negligible impact, and sometimes we cannot or do not want to simply lower the TTL of all impacted entries to 5 min. Finally, we all know that a DNS entry might take up to 24h to be updated everywhere in the world, whatever you do.

So when you have to make a big change, you might want to rely on the following - very handy - websites :

1) Check the status

https://dnschecker.org/

Check live what data the entry currently returns. It has the advantage of checking from all around the globe.
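
The website shows the view from public resolvers around the world. To check what your own resolver currently returns, here is a minimal Python sketch using only the standard library. www.example.com and 192.0.2.10 are placeholder values, and has_propagated is a hypothetical helper. The resolve parameter exists only so that the check can be tested without network access.

```python
import socket


def resolve_ipv4(name):
    """Resolve a name through the system resolver and return its IPv4 addresses."""
    infos = socket.getaddrinfo(name, None, family=socket.AF_INET, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def has_propagated(name, expected, resolve=resolve_ipv4):
    """Return True when the resolver answers exactly the expected addresses."""
    return set(resolve(name)) == set(expected)


# Usage after a change (placeholder values):
# has_propagated("www.example.com", ["192.0.2.10"])
```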

2) Flush some cache

If you notice that the entry does not update everywhere, you might want to force a flush on some major resolvers. To do this, use these websites :

Check + flush opendns

https://cachecheck.opendns.com/

Flush Google cache

https://developers.google.com/speed/public-dns/cache

3) Troubleshoot and understand all the layers of the DNS request until your server

https://dnsviz.net/

http://dns.squish.net/

There it is, I hope it will help you with your future changes !

Monday, June 22, 2020

Fun games to learn code

Hello,

I wanted to share some good websites to learn to code in fun ways.

http://www.pythonchallenge.com/

Try to solve good challenges using python

https://regexcrossword.com/

Review everything you know about regex with this fun special crossword. Accessible at first, it also has user-made puzzles that are really challenging. If you need some info, have a look at https://www.rexegg.com/regex-quickstart.html

https://projecteuler.net/

A series of challenges that you can solve in any programming language you wish.

Have fun,