
Post on 22-Jun-2020


Greenfielding Network and Systems Automation in a Large and Highly Dynamic Public Transit Network

Logan Best
DevOps Engineer
Transit Wireless

Share your automation story

1. How did you get started with Ansible?

2. How long have you been using it?

3. What's your favorite thing to do when you Ansible?

-vvv

Disclaimer:

This talk will be intentionally vague in some cases due to NDA and proprietary IP that I cannot divulge.

Any opinions expressed are my own and not my employer's.

Core Network

Cisco
● IOS
● IOS-XE
● IOS-XR
● NX-OS
● ASA

Extreme
● NX9500
● NX9600
● VX9000

Nokia
● ALu
● ALE

Westell
Mikrotik
Digi LTE

Debian
Ubuntu
CentOS

Proxmox
Oxidized
Zabbix
The list goes on…

What does this all come down to?

● We have a massive footprint of vendors, versions, and platforms to cover
● Almost 20,000 devices just in NYC
● network_cli just isn’t enough sometimes
● Yes, that means some things rely on telnet >.<
● LOADS of underlying groundwork required
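Part of that groundwork can be pictured as a per-platform lookup that decides how each device is reached, with a telnet fallback where no network_cli platform applies. The sketch below is illustrative only: the platform labels, the pick_transport helper, the "transport" key, and the choice of which platforms fall back to telnet are all assumptions, not details from the talk.

```python
# Hypothetical mapping from a platform label to Ansible connection settings.
# The ansible_network_os values are the usual names from the Cisco
# collections; which platforms need a telnet fallback is an assumption.
NETWORK_CLI_PLATFORMS = {
    "ios": "cisco.ios.ios",
    "ios-xe": "cisco.ios.ios",
    "ios-xr": "cisco.iosxr.iosxr",
    "nx-os": "cisco.nxos.nxos",
    "asa": "cisco.asa.asa",
}


def pick_transport(platform: str) -> dict:
    """Return connection variables for a device of the given platform."""
    key = platform.strip().lower()
    if key in NETWORK_CLI_PLATFORMS:
        return {
            "ansible_connection": "ansible.netcommon.network_cli",
            "ansible_network_os": NETWORK_CLI_PLATFORMS[key],
        }
    # No CLI plugin known for this platform: run tasks from the control
    # node and let a custom module or script open the telnet session.
    # The "transport" key is an invented marker, not an Ansible variable.
    return {"ansible_connection": "local", "transport": "telnet"}
```

Keeping the decision in one table means a new vendor or version is a one-line addition rather than a change scattered across playbooks.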

So how do you even begin?

● Talk to your peers about existing pain points
● Where’s the low-hanging fruit you can get easy wins with?
● How’s the existing infrastructure set up? What’s missing?

What are the current projects?

● Find out what your team or related teams are working on
● How can those tasks be automated?

What was missing?

● Source of Truth
● Secrets Management
● CMDB
● Central Authentication
● Self Service
● DEVELOPERS DEVELOPERS DEVELOPERS

Whew… So how do we even get started?

● Crawl, Walk, Run principle
● K.I.S.S.
● Have a BIG emphasis on team training and buy-in
● Network Audit
● Get corporate buy-in on conferences, trainings, and certifications
● Use the small initial wins as leverage

Crawl

● Utilize Network Audit to gather facts about the network
● Team Education
● Monitoring
● Automation used as needed, with validated and reviewed additive-only changes
● Start introducing input validation to reduce change risk
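As a rough illustration of input validation combined with the additive-only rule, the sketch below checks a request for a new VLAN before any playbook would run. The validate_vlan_request helper, its parameters, and the 32-character name limit are assumptions made for the example, not something the talk specifies.

```python
import ipaddress


def validate_vlan_request(vlan_id, name, subnet, existing_vlans):
    """Return a list of problems; an empty list means the change may proceed."""
    errors = []
    if not isinstance(vlan_id, int) or not 1 <= vlan_id <= 4094:
        errors.append("vlan_id must be an integer between 1 and 4094")
    elif vlan_id in existing_vlans:
        # Additive only: never touch a VLAN that already exists.
        errors.append(f"vlan {vlan_id} already exists; refusing to modify it")
    # Assumed naming rule: 1-32 characters of letters, digits, '-' or '_'.
    stripped = (name or "").replace("_", "").replace("-", "")
    if not name or len(name) > 32 or not stripped.isalnum():
        errors.append("name must be 1-32 letters, digits, '-' or '_'")
    try:
        # Strict parsing rejects addresses with host bits set, e.g. 10.0.0.1/24.
        ipaddress.ip_network(subnet)
    except ValueError:
        errors.append(f"subnet {subnet!r} is not a valid network")
    return errors
```

Returning every problem at once, rather than stopping at the first, lets a reviewer fix a request in a single pass.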

Walk

● Introduce Netbox as Source of Truth
● Build your Inventory strategy
● Set up DNS and LDAP/Radius AAA
● Start simple and small when making changes to the network
● Severely limit your initial footprint to reduce risk to prod
● LAB EVERYTHING!!!
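One possible shape for an inventory strategy is to derive Ansible groups from source-of-truth records, so that a playbook can target a platform or a site without hand-maintained host lists. The sketch below uses a hard-coded list as a stand-in for data exported from the source of truth; the device names, field names, and group naming scheme are invented for illustration. The output follows the JSON layout Ansible expects from an inventory script (groups with a "hosts" list, plus "_meta" for host variables).

```python
import json
from collections import defaultdict

# Stand-in for records exported from the source of truth; the field names
# and device names here are invented for illustration.
DEVICES = [
    {"name": "core-rtr-01", "platform": "ios-xr", "site": "hub-a", "ip": "192.0.2.1"},
    {"name": "stn-sw-01", "platform": "ios", "site": "station-1", "ip": "192.0.2.10"},
    {"name": "stn-sw-02", "platform": "ios", "site": "station-2", "ip": "192.0.2.11"},
]


def group_name(prefix, value):
    """Build a group name using underscores, which Ansible prefers over dashes."""
    return f"{prefix}_{value.replace('-', '_')}"


def build_inventory(devices):
    """Group devices by platform and site in Ansible's JSON inventory layout."""
    groups = defaultdict(lambda: {"hosts": []})
    hostvars = {}
    for dev in devices:
        groups[group_name("platform", dev["platform"])]["hosts"].append(dev["name"])
        groups[group_name("site", dev["site"])]["hosts"].append(dev["name"])
        hostvars[dev["name"]] = {"ansible_host": dev["ip"]}
    inventory = dict(groups)
    inventory["_meta"] = {"hostvars": hostvars}
    return inventory


if __name__ == "__main__":
    print(json.dumps(build_inventory(DEVICES), indent=2))
```

Grouping by both platform and site makes it easy to keep the initial footprint small, for example by limiting a run to a single site group.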

Run

● Netbox implementation complete
● Monitoring adds new automation and device-specific metrics
● Implement rollback, integrated with Oxidized to back up on each run and restore if needed
● Automated ZTP with Ansible instead of console provisioning
● Introduce Jira and proper change/project management culture
● Auto-documenting Jira issues with Ansible!
● Getting closer to no manual changes as playbooks evolve and become more robust
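The rollback idea (back up on each run, restore if needed) can be sketched as a small wrapper around a change. This is a generic pattern under stated assumptions, not the speaker's implementation: run_with_rollback and its four callables are hypothetical, and the backup step here is a plain function call rather than anything specific to Oxidized.

```python
def run_with_rollback(fetch_config, apply_change, verify, restore):
    """Back up, apply, verify, and restore the backup if anything fails.

    All four arguments are caller-supplied callables (hypothetical here):
    fetch_config() returns the running config, apply_change() pushes the
    change, verify() returns True when the device is healthy, and
    restore(backup) pushes a saved config back.
    """
    backup = fetch_config()
    try:
        apply_change()
        if verify():
            return "applied"
    except Exception:
        # Fall through to the restore below on any failure while applying.
        pass
    restore(backup)
    return "rolled_back"
```

A usage example with an in-memory stand-in for a device:

```python
state = {"config": "v1"}


def set_config(value):
    state["config"] = value


outcome = run_with_rollback(
    fetch_config=lambda: state["config"],
    apply_change=lambda: set_config("v2"),
    verify=lambda: False,  # simulate a failed post-change check
    restore=set_config,
)
# outcome is "rolled_back" and state["config"] is back to "v1"
```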

How are we doing all of this?

● Python
● Ansible
● AWX
● Netbox
● Stackstorm
● Zabbix
● Jira
● Slack
● Viewflow.io

So where’s the “highly dynamic” part?

Wifi onboard the trains!

A C C A

Some train operators don’t keep cars together

How can we keep our sanity?

● Rigorous testing
● Get so good at it that you can write a whitepaper on it
● Innovate using existing protocols
● Have a backup strategy for when it all fails to provision

In the end...

● Don’t be afraid to start slow
● Don’t be afraid to start small
● Have a well-thought-out vision
● Advocate for education for yourself and your peers
● You will eventually break something.
