webinar: network automation [tips & tricks]

Network Automation: Tips & Tricks
A Black & White Presentation
David Barroso, Dinesh G Dutt
August 30, 2016

Upload: cumulus-networks

Post on 09-Jan-2017


cumulusnetworks.com 2

Agenda

• Why Automate?
• The Work Before Automation
• How Does Automating Routers Differ From Servers
• Tips & Tricks
• Summary

Audience Assumptions

• Interested in network automation, but stumped or daunted
• Small to mid-size enterprises
  - 1-32 racks
  - 10-800 servers
• Unfamiliar with programming
• Use Ansible, but the ideas apply to Puppet, Chef, Salt, etc.

• Network Systems Engineer at Fastly
• Previously:
  - Network Engineer at Spotify
  - Network Engineer at NTT
  - Network & Systems Engineer at Atlas IT
• Creator of:
  - N.A.P.A.L.M.
  - SDN Internet Router

Twitter | Linkedin | Github: @dbarrosop

WHY AUTOMATE?
Even when you are a small shop

#1 CONSISTENCY
It's not only about automation, it's about consistent configurations, workflows and change control

Consistent errors introduced by bugs are easier to identify and fix than random errors introduced by humans

#2 SCALABILITY
If it works for 1 device it works for N devices (-ish)

#3 FAST ITERATION
Small and incremental changes are easier to perform when you can focus on the changes and the outcome vs where to apply them and how

#4 FOR "FUN"
It's more interesting than provisioning a VLAN or an IP for the gazillionth time

Prelude to Automation

Reduce Clutter

[Image credits: https://www.flickr.com/photos/rubbermaid/, commons.wikimedia.org]

Patterns

• Exploit order and regularity of network
  - Same ports across all boxes connected to uplink ports
  - Same host connected to pair of leaves on same port on both leaves
• From this order and regularity emerge simple patterns
• Automate patterns

[Figure: Spine01, Spine02 and Spine03, each using ports 1, 2, 3]

Principles of Simplifying Configuration

• Cookie cutter configuration, a.k.a. substitutability
• As little node-specific variation as possible
  - Nothing more than a single IP address, node name, for example
• As little duplication of information as possible
  - Specifying IP addresses on interfaces AND in OSPF/BGP network statements
• As much configuration as necessary, not more

How Automating Switches/Routers Differs From Servers

• Scale
  - Interfaces
  - VLANs
• Multiple pieces of information have to be configured AND coordinated across devices:
  - IP addresses on interfaces (common subnet)
  - If BGP, ASN of self and peer

[Figure: leaves L1, L2 ... L16 connected to spines S1, S2, S3, S4; example link addresses 10.1.1.0/10.1.1.1 and 10.1.4.32/10.1.4.33]

How Automating Switches/Routers Differs From Servers

• Duplication of information
  - E.g.: IP address specified on interface, in network statements, in BGP neighbor statements
• Complex configuration
  - Multiple protocols

Tips & Tricks

#0: Simplicity vs Flexibility

    #!/usr/bin/env python3
    # Simplest version: everything is hard-coded.
    print("Hello Mr. Barroso")

    #!/usr/bin/env python3
    # Slightly more flexible: the name lives in a variable.
    name = "Mr. Barroso"
    print("Hello %s" % name)

    #!/usr/bin/env python3
    # Most flexible, and most complex: a class plus user input.
    class Person(object):
        '''class storing attributes of a person'''
        first_name = ''
        last_name = ''

        def __init__(self, name, surname):
            self.first_name = name
            self.last_name = surname

        def greet(self):
            print("Hello Mr. %s" % self.last_name)

    if __name__ == '__main__':
        name_input = input('Enter <first name> <last name>: ')
        print(name_input)
        prenom, nom = name_input.split()
        persona = Person(prenom, nom)
        persona.greet()

Real Life Example of a Customer

1. Push device-specific files (glorified file copy)

2. Look at patterns and create templates

3. Automate more of the tasks

4. Add Ansible roles, fully automated

[Figure: leaves L1, L2 ... L16 connected to spines S1, S2, S3, S4]

Start with Automating Simple Tasks

• Adding or removing users
• Adding additional interesting packages
  - bwm-ng or scamper for example
• Convert ad-hoc command into playbook

    cumulus@dinesh-ubuntu ~/w/a/playbook> ansible leaf-1 -s -m apt -a 'name=bwm-ng state=installed'

    - hosts: all
      tasks:
        - name: Install bwm-ng on all hosts
          apt: name='bwm-ng' state=installed update_cache=yes

#1: Pick Simple, Consistent Toolchain

Some Tools Fit Better With Some Languages Than Others

• Puppet & Chef have Ruby as base language
• Ansible users tend to use Python
• Mixing Python & Ruby tool chains requires multiple language skills, can be more maintenance
  - For example, Serverspec and other such validation tools will be natural for Puppet/Chef shops, but will require adding Ruby skills to Ansible shops
• Cumulus Linux is Linux, so any tool works out of the box, no assembly required

#2: Use Unnumbered Interfaces

Use of Unnumbered in DC

• Unnumbered interfaces are those without a global IP address of their own
• Interface IP addresses are never advertised inside the DC
  - Reduces IP address requirements
  - Reduces FIB & RIB sizes
  - Reduces attack vector
  - Automation simplification: single IP address to configure per node

#3: Use Interface Names Instead of IP Addresses

Why?

• Errors are easier to spot in names than in IP addresses
• Unchanged configuration on renumbering
• With unnumbered interfaces, interfaces have no IP addresses anyway

OSPF: Avoid “network” Statements, Use “ip ospf area” under “interface”

S1:

    interface swp1
     ip ospf area 0.0.0.0
    interface swp2
     ip ospf area 0.0.0.0
    …
    interface swp17
     ip ospf area 0.0.0.0
    !
    router ospf
     ospf router-id 10.0.0.17

S4:

    interface swp1
     ip ospf area 0.0.0.0
    interface swp2
     ip ospf area 0.0.0.0
    …
    interface swp17
     ip ospf area 0.0.0.0
    !
    router ospf
     ospf router-id 10.0.0.20

L1:

    interface swp1
     ip ospf area 0.0.0.0
    interface swp2
     ip ospf area 0.0.0.0
    …
    interface swp4
     ip ospf area 0.0.0.0
    !
    router ospf
     ospf router-id 10.0.0.1

L16:

    interface swp1
     ip ospf area 0.0.0.0
    interface swp2
     ip ospf area 0.0.0.0
    …
    interface swp4
     ip ospf area 0.0.0.0
    !
    router ospf
     ospf router-id 10.0.0.16

[Figure: two-tier topology with spines S1, S2, S3, S4 and leaves L1, L2 ... L16; example link addresses 10.1.1.0/10.1.1.1 and 10.1.4.32/10.1.4.33]

Traditional BGP Configuration

L1:

    router bgp 64501
     bgp log-neighbor-changes
     bgp router-id 10.0.0.1
     !
     neighbor 10.1.1.1 remote-as 65000
     neighbor 10.1.2.1 remote-as 65000
     neighbor 10.1.3.1 remote-as 65000
     neighbor 10.1.4.1 remote-as 65000

L2:

    router bgp 64502
     bgp log-neighbor-changes
     bgp router-id 10.0.0.2
     !
     neighbor 10.1.1.3 remote-as 65000
     neighbor 10.1.2.3 remote-as 65000
     neighbor 10.1.3.3 remote-as 65000
     neighbor 10.1.4.3 remote-as 65000

L16:

    router bgp 64516
     bgp log-neighbor-changes
     bgp router-id 10.0.0.16
     !
     neighbor 10.1.1.33 remote-as 65000
     neighbor 10.1.2.33 remote-as 65000
     neighbor 10.1.3.33 remote-as 65000
     neighbor 10.1.4.33 remote-as 65000

S1:

    router bgp 65000
     bgp log-neighbor-changes
     bgp router-id 10.0.0.17
     !
     neighbor 10.1.1.0 remote-as 64501
     neighbor 10.1.1.2 remote-as 64502
     …
     neighbor 10.1.1.32 remote-as 64516

S4:

    router bgp 65000
     bgp log-neighbor-changes
     bgp router-id 10.0.0.20
     !
     neighbor 10.1.4.0 remote-as 64501
     neighbor 10.1.4.2 remote-as 64502
     …
     neighbor 10.1.4.32 remote-as 64516

[Figure: two-tier topology with spines S1, S2, S3, S4 and leaves L1, L2 ... L16; example link addresses 10.1.1.0/10.1.1.1 and 10.1.4.32/10.1.4.33]

BGP Unnumbered Configuration

L1:

    router bgp 64501
     bgp log-neighbor-changes
     bgp router-id 10.0.0.1
     !
     neighbor swp1 remote-as external
     neighbor swp2 remote-as external
     neighbor swp3 remote-as external
     neighbor swp4 remote-as external

L2:

    router bgp 64502
     bgp log-neighbor-changes
     bgp router-id 10.0.0.2
     !
     neighbor swp1 remote-as external
     neighbor swp2 remote-as external
     neighbor swp3 remote-as external
     neighbor swp4 remote-as external

L16:

    router bgp 64516
     bgp log-neighbor-changes
     bgp router-id 10.0.0.16
     !
     neighbor swp1 remote-as external
     neighbor swp2 remote-as external
     neighbor swp3 remote-as external
     neighbor swp4 remote-as external

S1:

    router bgp 65000
     bgp log-neighbor-changes
     bgp router-id 10.0.0.17
     !
     neighbor swp1 remote-as external
     neighbor swp2 remote-as external
     …
     neighbor swp16 remote-as external

S4:

    router bgp 65000
     bgp log-neighbor-changes
     bgp router-id 10.0.0.20
     !
     neighbor swp1 remote-as external
     neighbor swp2 remote-as external
     …
     neighbor swp16 remote-as external

[Figure: two-tier topology with spines S1, S2, S3, S4 and leaves L1, L2 ... L16]
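With unnumbered BGP, the per-node stanzas differ only in ASN, router ID and port count, which is what makes them easy to generate. A minimal Python sketch of that idea follows. The function name `render_bgp_unnumbered` and its parameters are hypothetical, chosen for illustration; the output simply mirrors the layout of the stanzas on this slide.

```python
def render_bgp_unnumbered(asn, router_id, num_ports):
    """Return a BGP unnumbered stanza shaped like the ones on the slide."""
    lines = [
        f"router bgp {asn}",
        " bgp log-neighbor-changes",
        f" bgp router-id {router_id}",
        " !",
    ]
    # One neighbor statement per fabric port: swp1 .. swp<num_ports>.
    lines += [
        f" neighbor swp{port} remote-as external"
        for port in range(1, num_ports + 1)
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Leaf L1 from the slide: ASN 64501, router ID 10.0.0.1, four uplinks.
    print(render_bgp_unnumbered(64501, "10.0.0.1", 4))
```

Only three values vary per node, and none of them is a neighbor IP address or a remote ASN.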

#4: A Host by any Name…

Some Characteristics of a Hostname

• Pick a base hostname that reflects the key role of the device:
  - leaf, tor, spine, etc.
• Assign a unique number to device instance to construct unique hostname
  - leaf-1 (for leaf in rack-1), spine-1, etc.
• Add prefixes to make it globally unique:
  - dc-ny-tor-1, dc-sf-tor-1, dc-sf-tor-2, etc.

Generate Unique ID from Hostname

An example to stimulate thinking, but this is not unlike how some customers have deployed it

    - hosts: all
      any_errors_fatal: true
      vars_files:
        - properties.yml
      tasks:
        - name: Get my node ID
          set_fact:
            my_node_id: "{{ inventory_hostname.split('-')[1] }}"
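The same idea can be tried outside Ansible. The sketch below is an illustration, not the playbook's exact logic: the hypothetical helper `split_hostname` takes the number from the last dash-separated field, so it also copes with prefixed names such as dc-sf-tor-2. The playbook's `split('-')[1]` gives the same result for two-part names like leaf-1.

```python
def split_hostname(hostname):
    """Split 'leaf-1' into ('leaf', 1); the ID is the last '-' field."""
    role, sep, number = hostname.rpartition("-")
    if not sep or not number.isdecimal():
        raise ValueError(f"no numeric node ID in hostname: {hostname!r}")
    return role, int(number)


if __name__ == "__main__":
    for name in ("leaf-1", "spine-2", "dc-sf-tor-2"):
        print(name, "->", split_hostname(name))
```

Failing loudly on a hostname without a numeric suffix is deliberate: every later derivation depends on this ID.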

Use Hostname to Derive Loopback IP address

• Poor man’s IPAM
• Uses Ansible’s ipsubnet Jinja2 filter
• Spine addresses are assigned from one end and the leaf addresses from the other end
  - leaf-1 gets ipsubnet(32, 1), leaf-2 gets ipsubnet(32, 2) etc.
  - spine-1 => ipsubnet(32, -2), spine-2 => ipsubnet(32, -3) etc.

    - hosts: all
      any_errors_fatal: true
      vars_files:
        - properties.yml
      tasks:
        - name: Get my node ID
          set_fact:
            my_node_id: "{{ inventory_hostname.split('-')[1] }}"

        - name: Get loopback IP for leaves
          set_fact:
            my_ip: "{{ lo_ip_subnet|ipsubnet(32, (my_node_id|int)) }}"
          when: "{{ 'leaf' in group_names }}"

        - name: Get loopback IP for spines
          set_fact:
            my_ip: "{{ lo_ip_subnet|ipsubnet(32, -(my_node_id|int)-1) }}"
          when: "{{ 'spine' in group_names }}"
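To see which addresses this scheme hands out, here is a small sketch using Python's standard `ipaddress` module rather than the Ansible filter. It assumes, as the slide describes, that positive indexes count /32s from the start of the subnet and negative indexes count back from the end. The function name `loopback_ip` is hypothetical; the default subnet is the `lo_ip_subnet` value from the properties file shown later in the deck.

```python
import ipaddress


def loopback_ip(role, node_id, lo_ip_subnet="10.254.0.0/24"):
    """Leaves take addresses from the start of the subnet, spines from the end."""
    net = ipaddress.ip_network(lo_ip_subnet)
    if role == "leaf":
        index = node_id            # leaf-1 -> index 1
    elif role == "spine":
        index = -node_id - 1       # spine-1 -> index -2
    else:
        raise ValueError(f"unknown role: {role!r}")
    # Indexing a network yields the address at that offset (negative = from end).
    return f"{net[index]}/32"


if __name__ == "__main__":
    for role, node_id in (("leaf", 1), ("leaf", 2), ("spine", 1), ("spine", 2)):
        print(f"{role}-{node_id}: {loopback_ip(role, node_id)}")
```

Because the two roles grow toward each other, the subnet has to be large enough that the ranges never meet.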

Use Hostname to Derive ASN

• Same style as IP, but for ASN
  - leaf-1 => base_asn + 1
  - leaf-2 => base_asn + 2
  - …

    - hosts: all
      any_errors_fatal: true
      vars_files:
        - properties.yml
      tasks:
        - name: Get my ASN for spines
          set_fact:
            my_asn: "{{ bgp_spine_asn }}"
          when: "{{ 'spine' in group_names }}"

        - name: Get my ASN for leaves
          set_fact:
            my_asn: "{{ (bgp_leaf_asn_base | int) + (my_node_id | int) }}"
          when: "{{ 'leaf' in group_names }}"
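A quick way to sanity-check the numbering is to compute it for a few nodes. In the sketch below the default values 65000 and 64500 are assumptions inferred from the BGP configurations earlier in the deck (spines in AS 65000, leaf L1 in AS 64501); the actual `bgp_spine_asn` and `bgp_leaf_asn_base` values are not shown on the slides. The collision check is an addition for illustration.

```python
def derive_asn(role, node_id, bgp_spine_asn=65000, bgp_leaf_asn_base=64500):
    """All spines share one ASN; each leaf gets base + node ID."""
    if role == "spine":
        return bgp_spine_asn
    if role == "leaf":
        asn = bgp_leaf_asn_base + node_id
        # Guard against a leaf accidentally landing on the spine ASN.
        if asn == bgp_spine_asn:
            raise ValueError(f"leaf-{node_id} collides with the spine ASN {asn}")
        return asn
    raise ValueError(f"unknown role: {role!r}")


if __name__ == "__main__":
    print(derive_asn("leaf", 1), derive_asn("leaf", 16), derive_asn("spine", 1))
```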

Use Hostname to Derive CLAG Configuration

    - name: Construct MLAG Local IP
      set_fact:
        my_clag_ip: |
          {% if (my_node_id|int % 2) == 1 %}
          169.254.1.1/30
          {% else %}
          169.254.1.2/30
          {% endif %}
      when: "{{ 'leaf' in group_names and dual_attach_hosts }}"

    - name: Construct MLAG Peer IP
      set_fact:
        my_clag_peer_ip: |
          {% if (my_node_id|int % 2) == 1 %}
          169.254.1.2
          {% else %}
          169.254.1.1
          {% endif %}
      when: "{{ 'leaf' in group_names and dual_attach_hosts }}"

    - name: Construct CLAG SysMAC
      set_fact:
        my_clag_sys_mac: |
          {% if (my_node_id|int % 2) == 1 %}
          {{ "%s%02d"|format(clag_base_sys_mac, (my_node_id|int)) }}
          {% else %}
          {{ "%s%02d"|format(clag_base_sys_mac, (my_node_id|int - 1)) }}
          {% endif %}
      when: "{{ 'leaf' in group_names and dual_attach_hosts }}"

    - name: Construct CLAG Priority
      set_fact:
        my_clag_prio: |
          {% if (my_node_id|int % 2) == 1 %}
          4096
          {% else %}
          8192
          {% endif %}
      when: "{{ 'leaf' in group_names and dual_attach_hosts }}"

Using modulo with my_node_id to derive unique information per pair
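The modulo trick is easier to follow when all four values are computed side by side. This sketch restates the tasks above in Python. The function name `clag_params` is hypothetical, and the default `clag_base_sys_mac` prefix is an assumption taken from the sysMac value in the clagctl output shown later in the deck.

```python
def clag_params(node_id, clag_base_sys_mac="44:38:39:ff:00:"):
    """Odd-numbered leaf is the first member of a pair, even-numbered the second."""
    odd = node_id % 2 == 1
    # Both members use the odd member's ID so they agree on the system MAC.
    pair_anchor = node_id if odd else node_id - 1
    return {
        "local_ip": "169.254.1.1/30" if odd else "169.254.1.2/30",
        "peer_ip": "169.254.1.2" if odd else "169.254.1.1",
        "sys_mac": "%s%02d" % (clag_base_sys_mac, pair_anchor),
        "priority": 4096 if odd else 8192,
    }


if __name__ == "__main__":
    for leaf in (1, 2, 3, 4):
        print(f"leaf-{leaf}:", clag_params(leaf))
```

leaf-1 and leaf-2 end up with the same system MAC and mirrored local and peer addresses, while leaf-3 and leaf-4 form the next pair.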

#5: Validate Your Input Before Applying

Use Ansible’s Validate Option Rigorously

    - hosts: all
      any_errors_fatal: true
      vars_files:
        - properties.yml
      tasks:
        - name: Add ISL interfaces to interfaces file for OSPF
          blockinfile:
            dest: /etc/network/interfaces
            marker: "#{mark} {{ item }} ANSIBLE MANAGED BLOCK"
            block: |
              auto {{ item }}
              iface {{ item }} inet static
                address {{ my_ip }}/32
            validate: ifup -s -a -i %s
          become: true
          with_items: "{{ ansible_interfaces }}"
          when: "{{ (protocol == 'ospf') and (('spine' in group_names and item|match(spine_to_leaf_ports)) or ('leaf' in group_names and item|match(leaf_to_spine_ports))) }}"
          tags:
            - ifconfig

Quagga Validation Needs a Little Love

    - name: Push out quagga config checker
      copy:
        dest: /tmp/chk-quagga.sh
        mode: 0700
        content: |
          #!/bin/sh

          sudo chmod 644 $1
          sudo vtysh -C -f $1
      become: yes

    - name: Add logging and base config
      blockinfile:
        dest: /etc/quagga/Quagga.conf
        create: yes
        owner: quagga
        marker: "!{mark} base config ANSIBLE MANAGED BLOCK"
        block: |
          !
          log file /var/log/quagga/quagga.log
          log timestamp precision 6
          !
        validate: sudo /tmp/chk-quagga.sh %s
      become: true
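The pattern behind validate is: write the candidate to a temporary file, run a checker against it, and replace the live file only if the checker succeeds. The following is a rough Python sketch of that pattern under stated assumptions, not Ansible's implementation. The function `write_validated` is hypothetical, and the checker is any command list in which "%s" stands for the candidate file path.

```python
import os
import subprocess
import sys
import tempfile


def write_validated(dest, content, validate_cmd):
    """Install content at dest only if validate_cmd accepts a candidate copy."""
    dest_dir = os.path.dirname(os.path.abspath(dest))
    fd, candidate = tempfile.mkstemp(dir=dest_dir)
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(content)
        # Substitute the candidate path wherever the command contains %s.
        cmd = [arg.replace("%s", candidate) for arg in validate_cmd]
        if subprocess.run(cmd).returncode != 0:
            return False           # live file is left untouched
        os.replace(candidate, dest)
        return True
    finally:
        # Clean up the candidate if it was rejected or something went wrong.
        if os.path.exists(candidate):
            os.remove(candidate)


if __name__ == "__main__":
    # Stand-in checker for the demo: a command that always exits 0.
    always_ok = [sys.executable, "-c", "import sys; sys.exit(0)", "%s"]
    target = os.path.join(tempfile.mkdtemp(), "demo.conf")
    print(write_validated(target, "log timestamp precision 6\n", always_ok))
```

The important property is that a rejected candidate never touches the running configuration file.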

#6: Automate Configuration Verification

Validate Configuration Using Appropriate Tool

• Most people validate configuration manually
• Puppet & Chef come with their own configuration validation tools
• With Ansible, use playbooks to validate configuration

Validate BGP Configuration

    - name: Get bgp summary
      command: vtysh -c 'sh ip bgp summary json'
      register: cmd_out
      become: true

    - name: Get the peer count
      set_fact:
        peer_count: "{{ ((cmd_out.stdout|from_json).totalPeers) }}"

    - name: Get the peer list
      set_fact:
        bgp_peers: "{{ (cmd_out.stdout|from_json).peers }}"

    - name: Validate peer count matches the expected number of leaves
      assert: { that: '(peer_count|int) == num_leaves' }
      when: "{{ 'spine' in group_names }}"

    - name: Validate peer count matches the expected number of spines
      assert: { that: '(peer_count|int) == num_spines' }
      when: "{{ 'leaf' in group_names }}"

    - name: Verify all BGP sessions are in established state
      assert: { that: 'bgp_peers[item]["state"] == "Established"' }
      with_items: "{{ bgp_peers }}"
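The same checks can be prototyped against captured output before they go into a playbook. In this sketch the sample JSON is hypothetical and contains only the two fields the playbook reads (`totalPeers` and each peer's `state`); real output carries more. The function name `check_bgp_summary` is also an assumption.

```python
import json

# Hypothetical, trimmed sample: only the fields the playbook above uses.
SAMPLE_SUMMARY = """
{
  "totalPeers": 2,
  "peers": {
    "swp1": {"state": "Established"},
    "swp2": {"state": "Idle"}
  }
}
"""


def check_bgp_summary(summary_json, expected_peers):
    """Return a list of problems; an empty list means the checks passed."""
    data = json.loads(summary_json)
    problems = []
    if int(data["totalPeers"]) != expected_peers:
        problems.append(
            f"expected {expected_peers} peers, found {data['totalPeers']}"
        )
    for name, peer in data["peers"].items():
        if peer["state"] != "Established":
            problems.append(f"{name} is {peer['state']}")
    return problems


if __name__ == "__main__":
    print(check_bgp_summary(SAMPLE_SUMMARY, expected_peers=4))
```

Collecting every problem, rather than stopping at the first, gives one report per device.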

Validate CLAG Configuration

    ---
    - hosts: 'leaf*'
      vars_files:
        - properties.yml
      gather_facts: false
      tasks:
        - name: Get clagctl output
          command: clagctl -j
          register: cmd_out

        - name: Get the status
          set_fact:
            clag_status: "{{ (cmd_out.stdout|from_json).status }}"

        - name: Get the Individual Bond status
          set_fact:
            clag_ifs: "{{ (cmd_out.stdout|from_json).clagIntfs }}"

        - name: Verify CLAG Peer is up and alive
          assert: { that: 'clag_status["peerAlive"] == true' }

        - name: Verify all bonds are dual attached
          assert: { that: 'clag_ifs[item]["status"] == "dual"' }
          with_items: "{{ clag_ifs }}"

    vagrant@leaf-1:~$ clagctl -j
    {
        "clagIntfs": {
            "bond-swp5": {
                "clagId": 5,
                "status": "single"
            },
            "bond-swp6": {
                "clagId": 6,
                "status": "single"
            }
        },
        "status": {
            "backupActive": false,
            "backupIp": "",
            "ourId": "08:00:27:7f:06:83",
            "ourPriority": 4096,
            "ourRole": "primary",
            "peerAlive": true,
            "peerId": "08:00:27:70:33:61",
            "peerIf": "peer-link.4094",
            "peerIp": "169.254.1.2",
            "peerPriority": 8192,
            "peerRole": "secondary",
            "sysMac": "44:38:39:ff:00:01"
        }
    }
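Running the two assertions against the clagctl output on this slide shows what a failure looks like: the peer is alive, but both bonds report "single". The sketch below embeds a trimmed copy of that output, keeping only the fields the playbook reads. The function name `check_clag` is hypothetical.

```python
import json

# Trimmed copy of the clagctl -j output from the slide.
CLAGCTL_OUTPUT = """
{
  "clagIntfs": {
    "bond-swp5": {"clagId": 5, "status": "single"},
    "bond-swp6": {"clagId": 6, "status": "single"}
  },
  "status": {"peerAlive": true, "ourRole": "primary", "peerRole": "secondary"}
}
"""


def check_clag(output):
    """Return a list of problems; an empty list means the checks passed."""
    data = json.loads(output)
    problems = []
    if data["status"]["peerAlive"] is not True:
        problems.append("CLAG peer is not alive")
    for name, intf in data["clagIntfs"].items():
        if intf["status"] != "dual":
            problems.append(f"{name} is {intf['status']}, expected dual")
    return problems


if __name__ == "__main__":
    for problem in check_clag(CLAGCTL_OUTPUT):
        print(problem)
```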

#7: Stage & Test Your Changes in Virtual Land First

Vagrant & VX

• Vagrant is a simple, effective tool to build networks, complete with switches and hosts, in virtual land, using VMs
  - Comes with support for multiple hosts, and some switches
  - Runs on OS X and Linux
• VX is Cumulus Linux as a VM
• Use Vagrant + VX + host boxes to spin up your entire network (on a powerful server) or a section of it (on your laptop) and test changes there before pushing them out

Commit/Rollback in the Age of Automation

• Master state is in the playbooks (or recipes), not in the device-specific configurations themselves
• Use source control (git is easy to get) to manage playbook versions
• Ansible’s validate ensures commands don’t fail due to syntactic errors
• Verifying configuration ensures final state is as desired
• Testing changes in virtual land ensures you don’t hose the box

Get Git

#8: Separate Data From Code

Store Fabric-wide Data, Abstract Details

    # Keep num_spines and leaf_to_spine_ports consistent
    # For example, if num_spines was 3, leaf_to_spine_ports would
    # be "swp[1-3]".
    num_spines: 4
    num_leaves: 9
    hosts_per_leaf: 1
    # Keep num_leaves and spine_to_leaf_ports consistent
    # For example, with 4 leaves, spine_to_leaf_ports could be
    # "swp[1-4]". The first server port is after the last ISL port
    # For dual-attach ports, the number of leaves has to be even
    spine_to_leaf_ports: "swp[1-9]"
    leaf_to_spine_ports: "swp[1-4]"
    leaf_to_server_ports: "swp[5-6]"
    ################### Routing Protocol Used ###########################
    protocol: 'bgp'
    ################### SVI and Host IP CONFIGURATION ###################
    lo_ip_subnet: '10.254.0.0/24'
    ################### Dual-Attach Server CONFIGURATION ################
    # Update the clag-peer ports based on how many server ports are there
    dual_attach_hosts: true
    clag_peer_ports: "swp[7-8]"
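The comments in the properties file ask a human to keep several values in step; a small check can enforce that instead. The sketch below is an illustration with hypothetical helper names (`port_count`, `check_properties`), and it assumes every port range has the simple `swp[first-last]` form used on the slide. Run against the slide's own values, it flags the odd leaf count combined with dual_attach_hosts.

```python
import re


def port_count(port_range):
    """Count the ports in a range such as 'swp[1-9]'."""
    match = re.fullmatch(r"swp\[(\d+)-(\d+)\]", port_range)
    if match is None:
        raise ValueError(f"unsupported port range: {port_range!r}")
    first, last = int(match.group(1)), int(match.group(2))
    return last - first + 1


def check_properties(props):
    """Return a list of inconsistencies between counts and port ranges."""
    problems = []
    if port_count(props["leaf_to_spine_ports"]) != props["num_spines"]:
        problems.append("leaf_to_spine_ports does not match num_spines")
    if port_count(props["spine_to_leaf_ports"]) != props["num_leaves"]:
        problems.append("spine_to_leaf_ports does not match num_leaves")
    if props["dual_attach_hosts"] and props["num_leaves"] % 2 != 0:
        problems.append(
            f"dual_attach_hosts is set but num_leaves ({props['num_leaves']}) is odd"
        )
    return problems


# Values copied from the properties file on the slide.
SLIDE_PROPERTIES = {
    "num_spines": 4,
    "num_leaves": 9,
    "spine_to_leaf_ports": "swp[1-9]",
    "leaf_to_spine_ports": "swp[1-4]",
    "dual_attach_hosts": True,
}

if __name__ == "__main__":
    print(check_properties(SLIDE_PROPERTIES))
```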

#9 OWN YOUR CONFIGURATION
Let your configuration management system be the source of truth, don't let your devices dictate your fate

❌ Traditional NOS that requires you to tell it how to get there

    ---
    # Add these VLANs
    vlans_i_want:
      - id: 200
        name: prod
      - id: 300
        name: pre
      - id: 400
        name: dev

    # Remove these VLANs
    vlans_i_dont_want:
      - id: 23
        name: asd

    # What happens with unknown VLANs?

✅ Modern OS that is able to understand what you want and how to get there by itself

    ---
    # Only VLANs specified here will be allowed
    vlans:
      - id: 200
        name: prod
      - id: 300
        name: pre
      - id: 400
        name: dev
    # Unknown VLANs will be removed
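What the declarative model does on your behalf is a diff between the desired list and whatever is on the device. A minimal sketch of that computation follows. The function `plan_vlan_changes` and the "current" device state are hypothetical; the desired list is the one from the slide.

```python
def plan_vlan_changes(desired, current):
    """Work out what to add, remove and rename to reach the desired VLANs."""
    want = {vlan["id"]: vlan["name"] for vlan in desired}
    have = {vlan["id"]: vlan["name"] for vlan in current}
    return {
        "add": sorted(set(want) - set(have)),
        # Anything not declared is removed, including VLANs nobody listed.
        "remove": sorted(set(have) - set(want)),
        "rename": sorted(
            vid for vid in set(want) & set(have) if want[vid] != have[vid]
        ),
    }


if __name__ == "__main__":
    desired = [
        {"id": 200, "name": "prod"},
        {"id": 300, "name": "pre"},
        {"id": 400, "name": "dev"},
    ]
    # Hypothetical device state, for illustration only.
    current = [
        {"id": 23, "name": "asd"},
        {"id": 200, "name": "prod"},
        {"id": 300, "name": "staging"},
    ]
    print(plan_vlan_changes(desired, current))
```

With this approach the question "what happens with unknown VLANs?" has one answer: they show up in the remove list.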

#10 DO NOT REINVENT THE WHEEL
Re-use DevOps tools, knowledge, experience, best practices...

#11 USE INTERNAL RESOURCES
Chances are your organization has a DevOps team already. Make use of their knowledge and experience. Don't be a silo

#12 JOIN A COMMUNITY
Mailing lists, IRC and others are great sources of information. Ask and learn from others, share your own experience, etc.

#13 DON'T FEAR PROGRAMMING
Because you already mastered CLIs and network protocols that are more complex than most programming languages and techniques

And knowing some Python and how to interact with an API doesn't make you a programmer

#0 START SIMPLE
And keep iterating

Summary

• Exploit regularity to create patterns, automate patterns
• Simplify configuration using unnumbered interfaces, OSPF/BGP unnumbered
• Validate every step

Next Month’s Webinar

• Agile Network Deployment
• Guest Speaker: To be Announced
• When: September 29

CUMULUS, the Cumulus Logo, CUMULUS NETWORKS, and the Rocket Turtle Logo (the “Marks”) are trademarks and service marks of Cumulus Networks, Inc. in the U.S. and other countries. You are not permitted to use the Marks without the prior written consent of Cumulus Networks. The registered trademark Linux® is used pursuant to a sublicense from LMI, the exclusive licensee of Linus Torvalds, owner of the mark on a world-wide basis. All other marks are used under fair use or license from their respective owners.

Thank You!

Bringing the Linux Revolution to Networking
