Open source tools for optimizing your peering infrastructure @ DE-CIX TechMeeting 2018-06-06 by Daniel Czerwonk


Page 1: Open source tools for optimizing your peering …...Software / Network Engineer at Mauve Mailorder Software • Head of Network Freifunk Essen e.V. • AS44821 (Mauve), AS206356 (Freifunk

Open source tools for optimizing

your peering infrastructure

@ DE-CIX TechMeeting 2018-06-06

by Daniel Czerwonk

Page 2

Who is this guy? About me…

• Software / Network Engineer at Mauve Mailorder Software

• Head of Network Freifunk Essen e.V.

• AS44821 (Mauve), AS206356 (Freifunk Essen e.V.), AS202739 (routing-rocks)

• birdwatcher and bio-routing contributor

• Twitter: @dan_nrw

• GitHub: https://github.com/czerwonk

• LinkedIn: https://www.linkedin.com/in/czerwonk/

Page 3

Our journey starts late 2016

A new networking setup is about to be built

Page 4

But before that:

Let’s talk about monitoring…

Page 5

Why is monitoring important for me?

• Very small operations team

• Freifunk Essen should demand even less operational effort

• Identify trends/anomalies early

• Capacity planning (beware of retention)

• Source for alerting

• Starting point for traffic engineering, etc.

• Basis for a post-mortem (in case of an outage)

• Dashboard to give a quick overview when needed

Page 6

So, let’s build a monitoring system…

Page 7

What I wanted…

• Prometheus to collect metrics

• Grafana to visualize metrics

• Alertmanager with Pushover integration for alerting

• Everything managed with Ansible

Page 8

What did I want to scrape?

• Bird routing daemon

• JunOS running on a few EX series switches

• Host metrics from bare-metal software router machines (statistics, resources)

• External network latencies (RIPE ATLAS, etc.)

Page 9

What I found…

Page 10

In 2016…

Metric             Solution                                   Problem

bird               -                                          no exporter available

JunOS              snmp_exporter                              complex configuration, bad performance

Host metrics       node_exporter                              -

Network latencies  blackbox_exporter with external probe VMs  bad coverage, only one request per scrape

Page 11

node_exporter

• Official Prometheus project

• Runs on Linux hosts (e.g. routers)

• Network interface metrics

• Resource consumption: CPU load, RAM usage, disk space

• Interrupts / context switches

• License: Apache 2.0

• Source: https://github.com/prometheus/node_exporter

Page 12

At least we got the host metrics covered.

And the rest?

I had to solve that…

Page 13

So I started to write some exporters…

Page 14

Which programming language? I chose golang:

• Performance is a key feature

• Need for concurrent processing

• Single binary / no dependencies

• Easy installation via go get …

• Existing client API for Prometheus

• I love writing code in golang in my spare time
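
The concurrency argument can be illustrated with a small sketch. It is not taken from any of the exporters; the names collectAll, result and fakeCollect are hypothetical, and the collect function stands in for whatever a real exporter would query (a socket, SSH, an API). Each target is handled in its own goroutine, so one slow device does not delay the others.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
	"time"
)

// result holds the outcome of collecting from one target.
type result struct {
	target string
	value  float64
	err    error
}

// collectAll runs collect for every target in its own goroutine and
// returns the results sorted by target name.
func collectAll(targets []string, collect func(string) (float64, error)) []result {
	results := make([]result, len(targets))
	var wg sync.WaitGroup
	for i, t := range targets {
		wg.Add(1)
		// Each goroutine writes only to its own slice index, so no lock is needed.
		go func(i int, t string) {
			defer wg.Done()
			v, err := collect(t)
			results[i] = result{target: t, value: v, err: err}
		}(i, t)
	}
	wg.Wait()
	sort.Slice(results, func(a, b int) bool { return results[a].target < results[b].target })
	return results
}

func main() {
	// fakeCollect is a stand-in for a real device query (hypothetical).
	fakeCollect := func(target string) (float64, error) {
		time.Sleep(10 * time.Millisecond)
		return float64(len(target)), nil
	}
	for _, r := range collectAll([]string{"router2", "router1"}, fakeCollect) {
		fmt.Printf("%s %v\n", r.target, r.value)
	}
}
```

With this shape the total collection time is roughly that of the slowest target rather than the sum over all targets.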

Page 15

Milestones to an exporter suite (timeline, 2016 to 2018)

• bird_exporter: Bird 1.x

• atlas_exporter: RIPE ATLAS

• RIPE LABS article

• Support for bird 2.x

• junos_exporter: Juniper JunOS using SNMP

• Replaced SNMP by SSH

• ping_exporter: ICMP probing

• mikrotik-exporter: RouterOS

Page 16

bird_exporter

• Started late 2016

• Communicates with bird via socket

• Bird 1.x and 2.x supported

• Protocols: BGP, OSPFv2, OSPFv3, Kernel, Static, Device, Direct

• License: MIT

• Source: https://github.com/czerwonk/bird_exporter
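
As a rough illustration of what happens after the socket has been read, the sketch below turns a protocol table into up/down values in the Prometheus text format. It is not the actual bird_exporter parser. It assumes a simplified column layout of "name proto table state since info" without the reply-code prefixes that the control socket adds, and the `name` label as well as the functions parseProtocols and render are assumptions made for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// protocol is one row of the protocol table.
type protocol struct {
	Name  string
	Proto string
	Up    bool
}

// parseProtocols reads text in the assumed column layout
// "name proto table state since info" and returns one entry per row.
func parseProtocols(out string) []protocol {
	var protos []protocol
	for _, line := range strings.Split(out, "\n") {
		f := strings.Fields(line)
		// Skip blank lines, short lines and the header row.
		if len(f) < 4 || f[0] == "name" {
			continue
		}
		protos = append(protos, protocol{Name: f[0], Proto: f[1], Up: f[3] == "up"})
	}
	return protos
}

// render writes the parsed rows in Prometheus text exposition format.
func render(protos []protocol) string {
	var b strings.Builder
	for _, p := range protos {
		up := 0
		if p.Up {
			up = 1
		}
		fmt.Fprintf(&b, "bird_protocol_up{name=%q,proto=%q} %d\n", p.Name, p.Proto, up)
	}
	return b.String()
}

func main() {
	// Made-up sample data for illustration only.
	sample := `name     proto    table    state  since       info
bgp1     BGP      master   up     2018-05-01  Established
bgp2     BGP      master   start  2018-05-02  Active
ospf1    OSPF     master   up     2018-05-01  Running`
	fmt.Print(render(parseProtocols(sample)))
}
```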

Page 17

bird_exporter

bird_protocol_prefix_import_count{proto=~"BGP|OSPFv3",ip_version="6"}

count(bird_protocol_up{proto="BGP"} == 1)

Page 18

bird_exporter - Features

• BGP session state metrics

• BGP message counts (received, sent, withdrawn, etc.)

• Prefix counts for all supported protocols (imported, exported, filtered, etc.)

• OSPFv2/OSPFv3 neighbour counts

• Protocol uptime

Page 19

ping_exporter

• Started early 2018

• Replacement for RRD-based smokeping

• For ICMP, also a replacement for blackbox_exporter, which lacks loss detection

• Based on go-ping by Digineo: https://github.com/digineo/go-ping

• License: MIT

• Source: https://github.com/czerwonk/ping_exporter

Page 20

ping_exporter

ping_rtt_mean_ms{ip_version="6"}

ping_loss_percent{ip_version="4"}

Page 21

ping_exporter - Features

• Sends and aggregates multiple ICMP ECHO requests

• Roundtrip metrics (current, best, worst)

• Simple way to detect loss

• Supports multiple targets

• DNS refresh ensures the correct IP is measured when DNS changes

• Only ICMP support at the moment

• Warning: ICMP is not user traffic, so keep that in mind when interpreting these metrics
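
The aggregation idea can be sketched as follows. This is an illustration only, not the ping_exporter or go-ping implementation: the type pingStats and the function aggregate are hypothetical, round-trip times are given in milliseconds, and a negative value is used here to mark a request that received no reply.

```go
package main

import "fmt"

// pingStats summarises one round of echo requests to a single target.
type pingStats struct {
	Sent     int
	Received int
	BestMs   float64
	WorstMs  float64
	MeanMs   float64
}

// lossPercent returns the share of requests without a reply, in percent.
func (s pingStats) lossPercent() float64 {
	if s.Sent == 0 {
		return 0
	}
	return float64(s.Sent-s.Received) / float64(s.Sent) * 100
}

// aggregate expects one entry per request sent; a negative entry
// marks a request that timed out.
func aggregate(rttsMs []float64) pingStats {
	s := pingStats{Sent: len(rttsMs)}
	sum := 0.0
	for _, rtt := range rttsMs {
		if rtt < 0 {
			continue // lost request: counts as sent but not received
		}
		if s.Received == 0 || rtt < s.BestMs {
			s.BestMs = rtt
		}
		if rtt > s.WorstMs {
			s.WorstMs = rtt
		}
		sum += rtt
		s.Received++
	}
	if s.Received > 0 {
		s.MeanMs = sum / float64(s.Received)
	}
	return s
}

func main() {
	s := aggregate([]float64{10, -1, 30, 20})
	fmt.Printf("best=%.1fms worst=%.1fms mean=%.1fms loss=%.0f%%\n",
		s.BestMs, s.WorstMs, s.MeanMs, s.lossPercent())
}
```

Because several requests are aggregated per target, a single missing reply shows up as partial loss instead of a failed probe.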

Page 22

atlas_exporter

• Started early 2017

• Builds metrics by requesting measurement results from RIPE ATLAS

• Useful to get an outside view from various other networks

• License: LGPL3 (since the binding used is under this license)

• Source: https://github.com/czerwonk/atlas_exporter

• More info: https://labs.ripe.net/Members/daniel_czerwonk/using-ripe-atlas-measurement-results-in-prometheus-with-atlas_exporter

Page 23

atlas_exporter

avg(atlas_ping_avg_latency{ip_version="4"}) by (asn)

avg(atlas_traceroute_hops{ip_version="4"}) by (asn)

Page 24

atlas_exporter - Features

• Ping (success, min/max/avg latency, dups, size)

• Traceroute (success, hop count, rtt)

• NTP (delay, derivation, ntp version)

• DNS (success, rtt)

• HTTP (return code, rtt, http version, header size, body size)

• SSL Certificates (alert, rtt)

Page 25

junos_exporter

• Started late 2017

• snmp_exporter did not perform as required

• First implementation using a simple set of SNMP OIDs

• Early 2018: reimplementation using SSH and XML RPC representation

• Alternative to Juniper's OpenNTI, since telemetry is only supported on newer versions of JunOS and hardware

• License: MIT

• Source: https://github.com/czerwonk/junos_exporter
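
To give an idea of the XML-based approach, the sketch below decodes a simplified interface reply with Go's encoding/xml. It does not reproduce the junos_exporter code: the SSH transport is left out, the sample document is a reduced and assumed version of a reply (real output carries namespaces and many more elements), and the types rpcReply and physicalInterface as well as parseInterfaces are hypothetical names.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// rpcReply models only the part of the reply used in this sketch.
type rpcReply struct {
	Interfaces []physicalInterface `xml:"interface-information>physical-interface"`
}

// physicalInterface holds the byte counters of one interface.
type physicalInterface struct {
	Name        string `xml:"name"`
	InputBytes  uint64 `xml:"traffic-statistics>input-bytes"`
	OutputBytes uint64 `xml:"traffic-statistics>output-bytes"`
}

// parseInterfaces decodes the XML text returned by the device.
func parseInterfaces(data []byte) ([]physicalInterface, error) {
	var reply rpcReply
	if err := xml.Unmarshal(data, &reply); err != nil {
		return nil, err
	}
	return reply.Interfaces, nil
}

func main() {
	// Assumed, simplified sample; not captured from a real device.
	sample := []byte(`<rpc-reply>
  <interface-information>
    <physical-interface>
      <name>ge-0/0/0</name>
      <traffic-statistics>
        <input-bytes>1000</input-bytes>
        <output-bytes>2500</output-bytes>
      </traffic-statistics>
    </physical-interface>
  </interface-information>
</rpc-reply>`)

	ifaces, err := parseInterfaces(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, i := range ifaces {
		fmt.Printf("%s in=%d out=%d\n", i.Name, i.InputBytes, i.OutputBytes)
	}
}
```

Mapping the reply onto typed structs keeps the per-metric code short, which is one plausible reason structured XML output is more convenient to work with than a flat list of SNMP OIDs.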

Page 26

junos_exporter - Features

• Interfaces (bytes transmitted/received, errors, drops)

• Routes (per table, by protocol)

• Alarms (count)

• BGP (message count, prefix counts per peer, session state)

• OSPFv2, OSPFv3 (number of neighbours)

• Interface diagnostics (optical signals)

• ISIS (number of adjacencies, total number of routers)

• Environment (temperatures)

• Routing engine statistics

Page 27

mikrotik-exporter

• Contribution to an existing project

• Had only interface and resource metrics at that point

• Added several other features

• License: BSD3

• Source: https://github.com/nshttpd/mikrotik-exporter

Page 28

mikrotik-exporter - Features

• Interface metrics (RX bytes, TX bytes, drops, errors, etc.)

• BGP session states

• BGP message counts (updates, withdraws)

• DHCP leases

• DHCPv6 bindings

• Optical diagnostics

• IPv4/IPv6 pool counts

• System resources (memory, CPU load, etc.)

• Prefix counts per protocol (in RIB)

Page 29

Dashboard examples

How to combine several exporters?

Page 30

Mauve Network Overview

Page 31

Mauve Routing

Page 32

Alerting

When and how?

Page 33

How to alert?

What the SRE book has taught us:

https://landing.google.com/sre/book/chapters/monitoring-distributed-systems.html

Page 34

How to alert? A few examples…

Port saturation:

Upstream session down:

Page 35

Thank you for your attention.

Special thanks to everyone who contributed to my projects!