Ansible Provider for networking modules

While I know the provider argument will eventually be deprecated for Ansible networking modules in favor of the new connection: network_cli, I often find myself troubleshooting playbooks that run on pre-2.5 versions of Ansible. The error message you get can be super frustrating:

fatal: [eos]: FAILED! => {"changed": false, "failed": true, "msg": "unable to open shell. Please see: https://docs.ansible.com/ansible/network_debug_troubleshooting.html#unable-to-open-shell"} 

But rest assured, the problem is probably easy to fix!

If you run a networking module in verbose mode with -vvvv you can see all the provider settings being used. This shows you the knobs you can adjust on the provider:


"provider": {
    "auth_pass": null,
    "authorize": true,
    "host": null,
    "password": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER",
    "port": null,
    "ssh_keyfile": null,
    "timeout": null,
    "transport": "cli",
    "use_ssl": null,
    "username": "admin",
    "validate_certs": null
}

Here are the two most common problems I see:

  1. The provider is not even set. Try setting the provider at the task level and use verbose mode to make sure the correct provider info is being passed to log in to your network switch.
  2. The task takes a long time to complete, or the networking platform is a switch stack that needs to propagate the change among all the switches. This is where the timeout setting can be used. Increase the timeout to make sure the task has time to complete.
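
Both fixes can live in the same place. Here is a minimal sketch, assuming an Arista EOS switch (like the [eos] host in the error above) and a pre-2.5 play that runs with connection: local. The variable name eos_password and the 60 second timeout are placeholders of mine, so adjust them for your environment:

```yaml
- name: run show version with an explicit task-level provider
  eos_command:
    commands:
      - show version
    provider:
      host: "{{ inventory_hostname }}"
      username: admin
      password: "{{ eos_password }}"  # hypothetical variable, e.g. stored in a vault
      authorize: true
      transport: cli
      timeout: 60  # seconds; example value, raise it for slow tasks or switch stacks
```

Run it with -vvvv again and the provider block in the output should show these values instead of null.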

Good luck and happy automating! Also, please come check out my repo at https://github.com/network-automation/

You can also email my team at ansible-network@redhat.com

Automating network troubleshooting with NetQ + Ansible

For this blog post I want to focus on automating network troubleshooting, the forgotten stepchild of network automation tasks. I think most automation tools focus on provisioning (or first-time configuration) because so many network engineers are new to network automation in general. While I think that is great (and I want to encourage everyone to automate!) I think there is so much more potential for network automation. I am introducing Sean’s third category of automation use cases: OPS!

See the rest of the blog post over on CumulusNetworks.com
https://cumulusnetworks.com/blog/network-troubleshooting-netq/

NetDevOps: What does it even mean?

As more and more network engineers dive into network automation, the word idempotence keeps coming up. What is it? Why is it important? Why should we care? Idempotence is often described as the ability to perform the same task repeatedly and produce the same result. I want to demonstrate a super simple example of what this means.
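
This is not the example from the linked post, just a minimal sketch of the idea using a single Ansible task. The file path and the NTP server address are made up for illustration:

```yaml
- name: make sure the NTP server line is present
  lineinfile:
    dest: /etc/ntp.conf              # hypothetical target file
    line: "server 192.0.2.1 iburst"  # hypothetical server (documentation address)
    state: present
```

The first run adds the line and reports changed. Every run after that finds the line already there and reports ok, so the end state is the same no matter how many times you run the task.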

See the rest of the blog post over on CumulusNetworks.com
https://cumulusnetworks.com/blog/netdevops-important-idempotence/

Backing up configs with the Ansible NCLU module

With the release of Ansible 2.3 the Cumulus Linux NCLU module is now part of Ansible core. This means when you `apt-get install ansible`, you get the NCLU module pre-installed! This blog post will focus on using the NCLU module to back up and restore configs on Cumulus Linux. To read more about the NCLU module from its creator, Barry Peddycord, click here.

See the rest of the blog post over on CumulusNetworks.com
https://cumulusnetworks.com/blog/configs-ansible-nclu-module/

5 host network configurations for MLAG

Host network configurations for MultiChassis Link Aggregation (MLAG, also referred to as dual-attach or ‘high availability’) can vary from host OS to host OS, even amongst Linux distributions. The most recommended and robust method is to use Link Aggregation Control Protocol (LACP), which is supported on most host operating systems natively. Host bonds or bonding refers to a variety of bonding methods, but for the purpose of this article it will refer to LACP bonds. The terms etherchannel, link aggregation group (LAG), NIC teaming, port-channel and bond can be used interchangeably to refer to LACP depending on the vendor’s nomenclature. For the sake of simplicity, we will just call it bonds or bonding. This post will take you through the steps for host network configurations for MLAG across five different operating systems.

See the rest of the blog post over on CumulusNetworks.com
https://cumulusnetworks.com/blog/5-host-network-configurations-mlag/

EVPN Ansible playbook for Cumulus Linux 3.2.1

I wrote a quick Ansible playbook for Cumulus EVPN. Cumulus EVPN is now GA (Generally Available) but the packages are still in the EA (early access) repo, so it can be confusing if you are not used to the Debian packaging system. The nice thing about this playbook is that it won’t try to upgrade or reboot unless you have the wrong version. Feel free to read the documentation on the Cumulus Networks website.

 

- name: check current quagga version for EVPN
  command: "dpkg -l quagga"
  register: quaggaversion
  when: ansible_lsb.major_release == "3"

- name: debug quaggaversion
  debug:
    var: quaggaversion.stdout
  when: ansible_lsb.major_release == "3"

- name: uncomment early access repo
  lineinfile: >
    dest=/etc/apt/sources.list
    regexp="#deb     http://repo3.cumulusnetworks.com/repo CumulusLinux-3-early-access cumulus"
    line="deb     http://repo3.cumulusnetworks.com/repo CumulusLinux-3-early-access cumulus"
    state=present
  when: ansible_lsb.major_release == "3"


- name: uncomment early access repo sources
  lineinfile: >
    dest=/etc/apt/sources.list
    regexp="#deb-src http://repo3.cumulusnetworks.com/repo CumulusLinux-3-early-access cumulus"
    line="deb-src http://repo3.cumulusnetworks.com/repo CumulusLinux-3-early-access cumulus"
    state=present
  when: ansible_lsb.major_release == "3"

- name: install eau8 of quagga
  apt: name="cumulus-evpn" update_cache=yes
  when: ansible_lsb.major_release == "3" and "eau8" not in quaggaversion.stdout

- name: upgrade switch (part of EVPN install instructions)
  shell: "apt-get upgrade -y --force-yes -o Dpkg::Options::='--force-confnew'"
  become: true
  become_method: sudo
  when: 'ansible_lsb.major_release == "3" and "eau8" not in quaggaversion.stdout'

- name: Reboot for apt-get upgrade
  shell: sleep 2 && shutdown -r now "Ansible updates triggered"
  async: 1
  poll: 0
  become: true
  ignore_errors: true
  when: 'ansible_lsb.major_release == "3" and "Cumulus" in ansible_lsb.id and "eau8" not in quaggaversion.stdout'

- name: Wait for everything to come back up
  local_action: wait_for port=22 host="{{ inventory_hostname }}" search_regex=OpenSSH delay=10
  when: 'ansible_lsb.major_release == "3" and "eau8" not in quaggaversion.stdout'

Fully Qualified Domain Name Capability for BGP

I really like Daniel Walton’s draft-walton-bgp-hostname-capability-00. Cumulus Networks implemented this as part of Quagga/FRR and it has become the default on Cumulus Linux 3.0 and later. BGP is one of those protocols that is really powerful, but in the past it has been a real pain in the ass to troubleshoot if you don’t know how things are cabled. Without a network map or diagram it can take a while to reverse engineer and troubleshoot a customer’s network.

This capability allows the configured hostname of a BGP neighbor to be shared via BGP itself (not LLDP or CDP), so the hostname shows up right in the output of the BGP commands. You no longer have to correlate a neighbor to an IP address, and an IP address to a physical switchport. What does that look like?

cumulus@leaf01:mgmt-vrf:~$ net show bgp sum

show bgp ipv4 unicast summary
=============================
BGP router identifier 10.0.0.11, local AS number 65011 vrf-id 0
BGP table version 107
RIB entries 25, using 3400 bytes of memory
Peers 2, using 42 KiB of memory
Peer groups 1, using 72 bytes of memory

Neighbor        V         AS MsgRcvd MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd
spine01(swp51)  4      65020  286176  286211        0    0    0 01w2d22h           11
spine02(swp52)  4      65020  224758  224766        0    0    0 4d18h12m           11

Wow… that is beautiful. Now I know my two uplinks (switchport 51 and switchport 52) are hooked up to spine01 and spine02. The output without this default (“no bgp default show-hostname”) looks like this:

cumulus@leaf01:mgmt-vrf:~$ net show bgp sum

show bgp ipv4 unicast summary
=============================
BGP router identifier 10.0.0.11, local AS number 65011 vrf-id 0
BGP table version 107
RIB entries 25, using 3400 bytes of memory
Peers 2, using 42 KiB of memory
Peer groups 1, using 72 bytes of memory

Neighbor        V         AS MsgRcvd MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd
swp51           4      65020  286341  286376        0    0    0 01w2d22h           11
swp52           4      65020  224923  224931        0    0    0 4d18h21m           11

I use this for every customer network I work on (which is easy now that it is the default). I hope more and more vendors implement this capability because it is very slick.

Let me know what you think in the comments. I will be talking about new features in BGP that we have been using for a while but that might not be that common outside of Cumulus Networks.

Turning on all Cumulus Linux interfaces

Cumulus Linux has a cool ‘feature’ it inherited from Linux. On most network switches a port is either Layer 2 or Layer 3. When a port is Layer 2 it has to be part of a VLAN. Linux does understand and work with VLANs, but it can also have a port that is running at Layer 2 without being part of a VLAN. We can configure a port under /etc/network/interfaces like this:

auto swp1
iface swp1

If we ifup this port it’s not part of a VLAN, and it’s not Layer 3.

Why would we want to do this? Honestly, the most common reason is that we can check physical connections using LLDP before the port is part of a broadcast domain (VLAN) that could cause loops or unexpected behavior.

What if I cabled a switch and don’t know which ports are connected? You could create a stanza for each swp like above, use something like Mako to template them, or do a simple bash loop on the command line like this:

cumulus@leaf01:~$ for swp in {1..54}; do sudo ip link set swp$swp up; done

Now all 54 ports are admin up at Layer 2 so we can check connections, but they are not routing or switching. Now you can use “net show int” or the Linux command “ip link show”:

cumulus@leaf01:~$ ip link show | grep LOWER_UP
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master mgmt state UP mode DEFAULT group default qlen 1000
11: swp9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9216 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
51: swp49: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9216 qdisc pfifo_fast master peerlink state UP mode DEFAULT group default qlen 1000
52: swp50: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9216 qdisc pfifo_fast master peerlink state UP mode DEFAULT group default qlen 1000
53: swp51: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9216 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
54: swp52: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9216 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000

You can even just use LLDP here:

cumulus@leaf01:~$ net show lldp

Local Port    Speed    Mode                 Remote Port    Remote  Host     Summary
------------  -------  -------------  ----  -------------  ---------------  --------------------------
eth0          1G       Mgmt           ====  swp23          oob-mgmt-switch  IP: 10.50.100.100/24(DHCP)
swp9          10G      NotConfigured  ====  swp51          leaf07
swp49         40G      NotConfigured  ====  swp49          exit02           
swp50         40G      NotConfigured  ====  swp50          exit02           
swp51         40G      NotConfigured  ====  swp30          spine01
swp52         40G      NotConfigured  ====  swp30          spine02

Yay, now we can go configure them since we know how they are cabled up. It’s great to be lazy 🙂

Update 3/30/2017

The famous Daniel Walton (yeah, the Daniel who has his name on this, this and my personal favorite, this) let me know that Cumulus Linux’s new NCLU (Network Command Line Utility) also has a method of doing this quickly. You can check out his tweet here. NCLU uses the net command and can handle a range of ports really quickly. One caveat vs. using ip link set: this creates persistent config (meaning the ports are actually configured under /etc/network/interfaces). If you don’t want the config to remain you can do a “net del” and then a “net commit”.

net add int swp1-54
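
Since the NCLU module ships with Ansible 2.3 and later, the same thing can be done from a playbook. This is a minimal sketch, assuming the play targets the Cumulus Linux switch itself and runs with enough privileges to use the net command. The task name is my own:

```yaml
- name: bring up swp1-54 with NCLU
  nclu:
    commands:
      - add int swp1-54
    commit: true  # commit the pending change so it takes effect and persists
```

The same caveat applies here: the ports end up configured under /etc/network/interfaces.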

Linux networking: It’s not just SDN

Oftentimes, Cumulus Linux gets confused for an SDN (software-defined networking) solution. In conversations with potential customers, I’ve noticed that some of them find it difficult to distinguish between SDN, open networking and Cumulus Linux. When I talk to network engineers, I start by clarifying the SDN buzzword head on. The term gets overused, and is often defined by other confusing acronyms or marketing jargon. To complicate things further, SDN is often thought of as equivalent to OpenFlow, which is flawed in my opinion.

See the rest of the blog post over on CumulusNetworks.com
https://cumulusnetworks.com/blog/linux-sdn-networking/