Learning and Testing Ansible Playbooks with Virtual Images

One of the common problems I see with network automation in general is that no one wants to learn and test their automation software on production network gear.  Lots of people have labs where they can play with network gear, but that obviously is not an option for everyone.  Even used equipment can cost a thousand dollars or more, and in my case my spouse was not super excited about our power bill rising by $100 a month while I ran some Cisco switches in my office.

I am not saying that owning networking hardware is a bad idea; it’s just not exactly scalable, and it doesn’t work for everyone’s situation.  Imagine the network engineer living in a super small apartment in New York City.  While studying for my CCIE I was lucky enough to have access to a lab at work that had all the equipment I needed to play with, so I didn’t have to deal with the heat, noise and power bills.  I was a Cisco employee at the time, so I also had access to virtual Cisco images.

Luckily networking vendors (including Cisco) are finally realizing that these virtual images are super important for education and control plane testing.  While we can’t replace line-rate hardware with virtual images, we can test configurations, bring up routing protocols, and test connectivity.  Of course, a virtual environment is also perfect for testing Ansible Networking Playbooks.  The worst thing about automation is that I can automate a mistake across hundreds of nodes at the same time.  While I was working at Cumulus Networks we would replicate entire customer environments with virtual images and supply them to customers as Vagrantfiles.  This gave customers a virtual playground to test and play with their automation scripts.  I think this strategy of virtual topologies should be how all network operators test their network automation strategy.  Let me elaborate on some networking vendors:

Arista

Arista has a common NOS (network operating system) across their hardware called EOS (Extensible Operating System).  If you create a free account on their website, they will allow you to download a virtual EOS (vEOS) for free.  Comment below if you have problems; I found this very easy.  I am pretty sure the only limitation with this free VM is the number of ports (I think by default it is limited to 4).  What I am not sure about is whether you can pay for more ports.  For the testing I was doing it worked great on my laptop.
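Once a vEOS VM is booted and reachable over SSH, a quick read-only playbook is a nice first test.  Here is a minimal sketch, assuming Ansible 2.5 or later (for network_cli), an inventory host I am calling veos1 (a made-up name), and credentials supplied through your inventory:

```yaml
---
# Sketch only: "veos1" is a hypothetical inventory host for a vEOS VM.
# ansible_user / ansible_ssh_pass are assumed to be set in inventory or a vault.
- name: Sanity-check a vEOS lab node
  hosts: veos1
  gather_facts: false
  connection: network_cli
  vars:
    ansible_network_os: eos
  tasks:
    - name: Collect version and interface status (read-only commands)
      eos_command:
        commands:
          - show version
          - show interfaces status
      register: result

    - name: Print the raw output, one list of lines per command
      debug:
        var: result.stdout_lines
```

Because both commands are show commands, running this against the VM changes nothing, which makes it a safe way to confirm the connection settings before trying anything that writes config.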

Cisco

Cisco Systems has three main platforms that I use: Cisco IOS, Cisco NX-OS and Cisco IOS-XR.  All three of them have virtual images, but they require entitlement on your account.  From what I have seen, if you own the physical gear you automatically get access to the virtual image.  I would be curious about other people’s experiences here.  Another option is using Cisco VIRL.

I wrote a Knowledge Base article for Red Hat Ansible Engine: https://access.redhat.com/articles/3199502 . This is a super simple guide on just getting Cisco NX-OSv up and running on your laptop (in my case a MacBook Pro).

Cumulus Networks

For Cumulus Linux they have a free version called Cumulus VX (for virtual experience).  You have to register, but you can download it here: https://cumulusnetworks.com/products/cumulus-vx/ Unlike Arista, there is no port limitation, so you can add as many ports as you want (depending on the underlying platform, e.g. VirtualBox vs KVM).

Another cool tool that Cumulus Networks provides for free is called topology converter.  This Python script creates a virtual topology (using Vagrant) from a network map written in DOT notation.  This allows users of Cumulus Linux (or really any Linux operating system) to build complex topologies.  While I was working at Cumulus Networks I could run well over a hundred Cumulus VX instances on a single server.  I highly encourage you to play with this tool.

Juniper

Juniper Networks has a few different virtual images floating around, including vSRX and vQFX.  My Juniper account already has entitlement to the virtual images through my employer, but they have published a Vagrant image that is not behind a login wall or paywall here: https://github.com/Juniper/vqfx10k-vagrant

VyOS

VyOS is an open source fork of the Vyatta routing software.  While VyOS might be the one networking platform on this list you have never heard of, many people use VyOS in production as a vRouter.  Their use case is often peering to a service provider where they already have limited bandwidth out of the data center, so not having 100Gbps line rate is not a problem.  Having Vagrant images and the ability to run virtual images in KVM or VirtualBox is really nice for testing out BGP configurations, prefix lists, and more.  Check out their website: https://vyos.io/

While layer 2 configuration is very different from other networking vendors, the OSPF and BGP configurations will be very similar to what you see on Cisco IOS and Cumulus, so VyOS could also be used to learn, train and pass networking certifications on those layer 3 technologies.
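To give a feel for what practicing on a VyOS VM looks like, here is a rough sketch that pushes a small prefix list with the vyos_config module.  The host name vyos1 and the prefix list name LAB-IN are made up for illustration, and the play assumes Ansible 2.5 or later with credentials coming from inventory:

```yaml
---
# Sketch only: "vyos1" and "LAB-IN" are hypothetical names.
- name: Practice prefix lists on a VyOS lab router
  hosts: vyos1
  gather_facts: false
  connection: network_cli
  vars:
    ansible_network_os: vyos
  tasks:
    - name: Define a prefix list that permits 10.0.0.0/8
      vyos_config:
        lines:
          - set policy prefix-list LAB-IN rule 10 action permit
          - set policy prefix-list LAB-IN rule 10 prefix 10.0.0.0/8
        save: true   # also write the change to the startup config
```

Running the play a second time should report no change, which is a handy way to check that the lines match what VyOS actually stores.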

Summary

While I am sure many other networking platforms are out there (e.g. F5 Networks) these are some of the ones I play with the most.  I am super excited about all the virtual networks people are creating, because it means that network operators can test network changes on a virtual topology versus messing up their production network.  I am sure we will see people implement really interesting CI/CD pipelines in the future, where they can automate changes into their virtual development environment before touching any production equipment.

Infoblox Integration in Ansible 2.5

The Ansible 2.5 open source project release includes the following Infoblox Network Identity Operating System (NIOS) enablement:

  • Five modules
  • A lookup plugin (for querying Infoblox NIOS objects)
  • A dynamic inventory script

For network professionals, this means that existing networking Ansible Playbooks can utilize existing Infoblox infrastructure for IP Address Management (IPAM), using Infoblox for tracking inventory and more. For more information on Infoblox terminology, documentation and examples, refer to the Infoblox website.
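As a rough sketch of how one of the modules and the lookup plugin fit together, the play below creates a host record and then queries network objects.  The grid address, record name, IP address and the vault_nios_password variable are all placeholders I made up; swap in your own values:

```yaml
---
# Sketch only: host names, addresses and vault_nios_password are placeholders.
- name: Infoblox NIOS sketch
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    nios_provider:
      host: nios.example.com
      username: admin
      password: "{{ vault_nios_password }}"
  tasks:
    - name: Ensure a host record exists for a switch
      nios_host_record:
        name: leaf01.example.com
        ipv4addrs:
          - ipv4addr: 192.0.2.10
        state: present
        provider: "{{ nios_provider }}"

    - name: Query network objects with the nios lookup plugin
      set_fact:
        nios_networks: "{{ lookup('nios', 'network', provider=nios_provider) }}"
```

Both the module and the lookup talk to the NIOS WAPI from the control node, which is why the play targets localhost rather than a network device.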

See the rest of the blog post over on Ansible.com:
https://www.ansible.com/blog/infoblox-integration-in-ansible-2.5

Networking Features in Ansible 2.5

The upcoming Ansible 2.5 open source project release has some really exciting improvements, and the following blog highlights just a few of the notable additions. In typical Ansible fashion, development of networking enhancements is done in the open with the help of the community. You can follow along by watching the networking GitHub project board, as well as the roadmap for Ansible 2.5 via the networking wiki page.

A few highlighted features include:

New Connection Types: network_cli and NETCONF

Ansible Fact Improvements

Improved Logging

Continued Enablement for Declarative Intent

Persistent SSH Connection Improvements

Additional Platforms and Modules

See the rest of the blog post over on Ansible.com: https://www.ansible.com/blog/coming-soon-networking-features-in-ansible-2.5
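To make the first item in the list above a bit more concrete, here is a minimal sketch of the inventory variables that switch a group of Cisco IOS devices over to network_cli in Ansible 2.5.  The file name and the vault_ios_password variable are assumptions for illustration:

```yaml
---
# group_vars/ios.yml  (hypothetical file name)
# With these set, playbooks no longer need "connection: local" plus a provider.
ansible_connection: network_cli
ansible_network_os: ios
ansible_user: admin
ansible_ssh_pass: "{{ vault_ios_password }}"   # placeholder vault variable
# Enter privileged (enable) mode when a task needs it
ansible_become: true
ansible_become_method: enable
```

With the connection details living in inventory, the tasks themselves only carry the module arguments that describe the change.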

Ansible Provider for networking modules

While I know the provider argument will eventually be deprecated for Ansible networking modules in favor of the new connection: network_cli, I often find myself troubleshooting playbooks running on pre-2.5 versions of Ansible. The error message you get can be super frustrating:

fatal: [eos]: FAILED! => {"changed": false, "failed": true, "msg": "unable to open shell. Please see: https://docs.ansible.com/ansible/network_debug_troubleshooting.html#unable-to-open-shell"} 

But rest assured, the problem is probably easy to fix!

If you run a networking module in verbose mode with -vvvv, you can see all the provider parameters being used.  This shows you the knobs you can adjust on the provider:

"provider": {
    "auth_pass": null,
    "authorize": true,
    "host": null,
    "password": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER",
    "port": null,
    "ssh_keyfile": null,
    "timeout": null,
    "transport": "cli",
    "use_ssl": null,
    "username": "admin",
    "validate_certs": null
}

Here are the two most common problems I see:

  1. The provider is not even set. Try setting the provider at the task level, and use verbose mode to make sure the correct provider info is being passed to log in to your network switch.
  2. The task takes a long time to complete, or the networking platform is a switch stack that needs to propagate the change amongst all the switches. This is where the timeout parameter can be used.  Increase the timeout to make sure the task has time to complete.
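Both fixes can be sketched in one short play.  This assumes a pre-2.5 style setup (connection: local with a provider), an inventory group named eos, and a made-up vault_eos_password variable; the 60 second timeout is just an example value:

```yaml
---
# Sketch only: vault_eos_password and the timeout value are assumptions.
- name: Run a command with an explicit provider
  hosts: eos
  gather_facts: false
  connection: local
  vars:
    cli:
      host: "{{ inventory_hostname }}"
      username: admin
      password: "{{ vault_eos_password }}"
      authorize: true
      transport: cli
      timeout: 60   # raise this for slow tasks or switch stacks
  tasks:
    - name: Show version, passing the provider at the task level
      eos_command:
        commands:
          - show version
        provider: "{{ cli }}"
```

Run it with -vvvv and compare the provider block in the output against the dictionary above to confirm that the values you set are the ones actually being used.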

Good luck and happy automating!  Also, please go join my repo: https://github.com/network-automation/

Just email my team at ansible-network@redhat.com

Automating network troubleshooting with NetQ + Ansible

For this blog post I want to focus on automating network troubleshooting, the forgotten stepchild of network automation tasks. I think most automation tools focus on provisioning (or first time configuring) because so many network engineers are new to network automation in general. While I think that is great (and I want to encourage everyone to automate!) I think there is so much more potential for network automation. I am introducing Sean’s third category of automation use-cases — OPS!

See the rest of the blog post over on CumulusNetworks.com
https://cumulusnetworks.com/blog/network-troubleshooting-netq/

NetDevOps: What does it even mean?

As more and more network engineers dive into network automation, the word idempotence keeps coming up. What is it? Why is it important? Why should we care? Idempotence is often described as the ability to perform the same task repeatedly and produce the same result. I want to demonstrate a super simple example of what this means.
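As a rough illustration (not necessarily the example used in the full post), the two tasks below write the same line to a file.  The file paths are throwaway names I picked for the demo:

```yaml
---
# Sketch only: both file paths are throwaway demo names.
- name: Idempotence sketch
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    # Idempotent: lineinfile checks for the line first, so a second run
    # reports "ok" instead of "changed" and the file stays the same.
    - name: Ensure the line exists
      lineinfile:
        path: /tmp/idempotence-demo.conf
        line: "hostname leaf01"
        create: true

    # Not idempotent: every run appends another copy of the line and
    # always reports "changed".
    - name: Append blindly
      shell: echo "hostname leaf01" >> /tmp/not-idempotent-demo.conf
```

Run the play twice and compare the two files: the first still has one line, the second has two.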

See the rest of the blog post over on CumulusNetworks.com
https://cumulusnetworks.com/blog/netdevops-important-idempotence/

Backing up configs with the Ansible NCLU module

With the release of Ansible 2.3 the Cumulus Linux NCLU module is now part of Ansible core. This means when you `apt-get install ansible`, you get the NCLU module pre-installed! This blog post will focus on using the NCLU module to backup and restore configs on Cumulus Linux. To read more about the NCLU module from its creator, Barry Peddycord, click here.
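For a quick feel of what a backup can look like, here is a minimal sketch.  Note that it swaps in the generic command module to run the NCLU command net show configuration commands rather than using the nclu module itself, and it assumes an inventory group called cumulus and an existing backups/ directory on the control node:

```yaml
---
# Sketch only: the "cumulus" group and the backups/ directory are assumptions.
- name: Back up Cumulus Linux configuration
  hosts: cumulus
  gather_facts: false
  tasks:
    - name: Dump the running config as a list of NCLU commands
      command: net show configuration commands
      register: nclu_config
      changed_when: false   # read-only, never report a change

    - name: Save the output on the control node, one file per switch
      copy:
        content: "{{ nclu_config.stdout }}\n"
        dest: "backups/{{ inventory_hostname }}.nclu"
      delegate_to: localhost
```

Because the backup is just a list of net commands, it can be kept in version control and diffed between runs.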

See the rest of the blog post over on CumulusNetworks.com
https://cumulusnetworks.com/blog/configs-ansible-nclu-module/

5 host network configurations for MLAG

Host network configurations for MultiChassis Link Aggregation (MLAG, also referred to as dual-attach or ‘high availability’) can vary from host OS to host OS, even amongst Linux distributions. The most recommended and robust method is to use Link Aggregation Control Protocol (LACP), which is supported on most host operating systems natively. Host bonds or bonding refers to a variety of bonding methods, but for the purpose of this article it will refer to LACP bonds. The terms etherchannel, link aggregation group (LAG), NIC teaming, port-channel and bond can be used interchangeably to refer to LACP depending on the vendor’s nomenclature. For the sake of simplicity, we will just call it bonds or bonding. This post will take you through the steps for host network configurations for MLAG across five different operating systems.
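To show what an LACP bond looks like on one host OS, here is a minimal sketch using netplan on Ubuntu.  This is my own illustration and not necessarily one of the five configurations in the full post; the interface names eno1 and eno2 and the use of DHCP are assumptions:

```yaml
# /etc/netplan/01-bond.yaml  (hypothetical file name)
# Sketch only: interface names and addressing are assumptions.
network:
  version: 2
  ethernets:
    eno1:
      dhcp4: false
    eno2:
      dhcp4: false
  bonds:
    bond0:
      interfaces: [eno1, eno2]
      parameters:
        mode: 802.3ad             # LACP
        lacp-rate: fast
        mii-monitor-interval: 100
      dhcp4: true
```

Each member interface connects to a different switch in the MLAG pair, and the host sees the pair as a single LACP partner.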

See the rest of the blog post over on CumulusNetworks.com
https://cumulusnetworks.com/blog/5-host-network-configurations-mlag/

EVPN Ansible playbook for Cumulus Linux 3.2.1

I wrote a quick Ansible playbook for Cumulus EVPN. Cumulus EVPN is now GA (Generally Available), but the packages are still in the EA (early access) repo, so it can be confusing if you are not used to the Debian packaging system. The nice thing about this playbook is that it won’t try to upgrade or reboot unless you have the wrong version. Feel free to read the documentation on the Cumulus Networks website.

 

# dpkg -l is read-only, so never report this task as changed
- name: check current quagga version for EVPN
  command: "dpkg -l quagga"
  register: quaggaversion
  changed_when: false
  when: ansible_lsb.major_release == "3"

- name: debug quaggaversion
  debug:
    var: quaggaversion.stdout
  when: ansible_lsb.major_release == "3"

- name: uncomment early access repo
  lineinfile: >
    dest=/etc/apt/sources.list
    regexp="#deb     http://repo3.cumulusnetworks.com/repo CumulusLinux-3-early-access cumulus"
    line="deb     http://repo3.cumulusnetworks.com/repo CumulusLinux-3-early-access cumulus"
    state=present
  when: ansible_lsb.major_release == "3"


- name: uncomment early access repo sources
  lineinfile: >
    dest=/etc/apt/sources.list
    regexp="#deb-src http://repo3.cumulusnetworks.com/repo CumulusLinux-3-early-access cumulus"
    line="deb-src http://repo3.cumulusnetworks.com/repo CumulusLinux-3-early-access cumulus"
    state=present
  when: ansible_lsb.major_release == "3"

- name: install eau8 of quagga
  apt: name="cumulus-evpn" update_cache=yes
  when: 'ansible_lsb.major_release == "3" and "eau8" not in quaggaversion.stdout'

- name: upgrade switch (part of EVPN install instructions)
  shell: "apt-get upgrade -y --force-yes -o Dpkg::Options::='--force-confnew'"
  become: true
  become_method: sudo
  when: 'ansible_lsb.major_release == "3" and "eau8" not in quaggaversion.stdout'

- name: Reboot for apt-get upgrade
  shell: sleep 2 && shutdown -r now "Ansible updates triggered"
  async: 1
  poll: 0
  become: true
  ignore_errors: true
  when: 'ansible_lsb.major_release == "3" and "Cumulus" in ansible_lsb.id and "eau8" not in quaggaversion.stdout'

- name: Wait for everything to come back up
  local_action: wait_for port=22 host="{{ inventory_hostname }}" search_regex=OpenSSH delay=10
  when: 'ansible_lsb.major_release == "3" and "eau8" not in quaggaversion.stdout'
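The tasks above are a task list rather than a full play, so they need a small wrapper to run.  Here is a minimal sketch, assuming the tasks are saved in a file I am calling evpn.yml and that your switches are in an inventory group named cumulus (both names are made up):

```yaml
---
# Sketch only: "cumulus" and "evpn.yml" are hypothetical names.
- name: Install Cumulus EVPN on Cumulus Linux 3.x switches
  hosts: cumulus
  become: true
  gather_facts: true   # the tasks rely on the ansible_lsb facts
  tasks:
    # include_tasks needs Ansible 2.4 or later; older releases use "include"
    - include_tasks: evpn.yml
```

Fact gathering has to stay enabled here, since every task is guarded by a condition on ansible_lsb.major_release.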