Giovanni Cesaroni, GARR EUMEDCONNECT2 Training – Rome, 22-25 June 2009 GINS The GARR Network Monitoring System

Post on 15-Jan-2016


Page 1

GINS
The GARR Network Monitoring System

Giovanni Cesaroni, GARR
EUMEDCONNECT2 Training – Rome, 22-25 June 2009

Page 2

Agenda

PART 1: GINS description
• NOC Tools Motivation
• Required Functionality
• Monitoring Environment
• Statistics Examples
• Visualization
• Reports
• Slicing
• Traffic Flows Analysis
• Work in progress

PART 2: Let’s code the Network Monitoring!
• SNMP in action
• BGP, OSPF, MPLS, IPv6

PART 3: RRD World
• RRD in action
• How to avoid losing data

Page 3

GARR Network

• 43 PoPs (University and Research Centre)
• Peering: 76 Gbps
  • 52.5 Gbps vs GEANT2
    • 10G + 2.5G IP Access
    • 3*10GE E2E links
    • 9*1GE E2E links
  • 3x2.5 Gbps IP Transit
    • 2 Milan + 1 Rome
  • 7x1 Gbps + 10 Gbps National Peering
• Backbone Capacity: ~110 Gbps
• 7 TLC Operators
  • Telecom Italia
  • Infracom (ex Autostrade TLC)
  • Fastweb
  • Interoute (ex Eurostrada)
  • WIND
  • BT-Italia (ex Albacom)
  • COLT-Telecom
• 3 International IP Carriers
  • Global Crossing
  • Telia
  • Level3
• Access Capacity: ~60 Gbps
  • Starting from 2M up to 10G
  • N. Access Links: 500
  • N. Backbone Links: 62
• E2E Capacity: ~40 Gbps
  • from 1G up to 10G

Page 4

GOALS

• Provide the NOC, Operations and Planning staff with all the tools needed to do their work as well as possible
• Monitor users site connectivity
• Check the status of the services at each level of the network
  • service oriented approach (not metric oriented)
• Integrate monitoring services
• Automate tools configuration
• Give easy access to the information
• Automatic generation of fault and performance reports

The goal is not to manage the control plane, but to have full control of the network

Page 5

GINS Architecture

[Diagram labels: GARR Network; GARR NOC; GINS Monitoring Tools; GINS Visualization Tools; Measurements Storage (MySQL & RRD); GARR-DB: Network Database (Network Structure, MySQL); Consistency Tools Robots]

Page 6

GARR-DB: the Information System

[Diagram labels: administrative and technical information!!!; physical objects; physical circuit; segments; Aggregate; User Site; GARR Backbone; GARR Domain; Logical “circuit” (IP link, MPLS LSP, lambda service, etc.); eq]

Page 7

SW tools used by GINS

• Scheduler: Cron
• Data acquisition (from the Network): MRTG, SNMP polls, ping
• Data management: AWK, Bash, PHP, RRDtool
• Data storage: MySQL, File, RRD
• ~5500 RRD files
• Data visualization: PHP, HTML, Javascript, Ajax, SVG
• Reports: PHP, Jpgraph, HTMLDOC

Page 8

NOC in action

[Diagram labels: End Site; GARR Backbone; GARR NOC; APM; TLC NOC; Alarms; Trouble Ticket]

Page 9

GINS at a glance

Main functionalities
• Network monitoring
• Statistics acquisition
• Trouble Ticket System
• Fault and Performance Reports

Monitoring Services: Lambda, SDH/SONET, MPLS, IPv4, IPv6, OSPF, BGP, E2E, Multicast Beacons, Equipment

Statistics Services: IPv4, IPv6, Multicast traffic; Physical interface errors; Routers CPU; Premium IP; SDH/SONET errors; Backbone weathermap; Uncompressed Statistics

Page 10

Monitoring services

• GINS detects/defines the status of different services, on the basis of the information gathered through the network. Monitoring is supported on the following service classes:
  • IPv4 and IPv6 [service status, input errors and output drops on physical interfaces]
    • end-user site
    • backbone interface
  • IP Multicast Beacons [service status]
  • Routing protocols:
    • OSPF [link costs]
    • BGP [peering status, adv/rec routes]
  • SDH/Sonet [SDH/Sonet errors]
    • router interface on leased lines
  • Lambda [service status, optical equipment port status]
  • MPLS [MPLS LSP status]
  • E2E [E2E service status]
    • defined as the stitching of multiple intra-domain and inter-domain links

Page 11

Statistics services

• GINS stores performance measurements data and provides:
  • Traffic Statistics
    • IPv4 and IPv6, Multicast for end user sites and backbone
    • Aggregate
    • Peering
    • Premium IP
    • Uncompressed Statistics
  • Sonet/SDH errors on leased lines
  • Router CPU load and temperature

Page 12

Other services

• GINS includes a Trouble Ticket System which is highly customized for the GARR operations procedures. In particular, it manages user service, leased line and PoP tickets.
• Fault and performance reports:
  • User monthly and yearly reports (HTML and PDF)
  • User fault report and circuit availability
  • Uncompressed traffic statistics (IP BW usage, 95th percentile, etc.)
  • Carrier fault report and circuit availability (HTML and PDF)
• Monitored physical devices:
  • Juniper: J6350, M7i, M10, M20, M320
  • Cisco: 12xxx, 17xx, 18xx, 2xxx, 3750, 72xx, 75xx
  • ADVA FSP3000
  • Metrobility R4000, R5000

Page 13

Monitoring

• Who is the target user of monitoring UIs?

The NOC & the Operation Staff, private access

Page 14

• Control Panel and IP Monitoring
• BGP Alarms & Monitoring
• E2E Monitoring, Lambda & MPLS
• Other Services

Page 15

Monitor Control Panel

Page 16

NOC Interface (1/2): links status

[Screenshot labels: End Site Info; Telnet; Traffic in/out; Trouble ticket; Last action]

Page 17

NOC Interface (2/2): other services and quick ticket management

Page 18

End Site Info

Traffic

Trouble Tickets

Interface Errors

Page 19

Physical Interface Input Errors and Output Drops

2Mbps

The link is going to be upgraded to a Gbps link in the next few days!

Page 20

E2E Monitoring

[Screenshot labels: Aggregate status of the “domain link”; Status of the “domain segment”; Status of the Interdomain Link]

Page 21

E2E Stitching Monitoring

[Screenshot labels: IP; MPLS LSP; Lambda; 10GE]

Page 22

GINS vs GN2 E2E CU

[Diagram labels: GINS; E2Emon; E2Emon XML schema; data aggregation; GARR archive; GARR NOC; GN2 E2E CU; GN2: JRA4, Switch & DFN]

Page 23

MPLS Monitoring

MUPBED: one e2e connection

GINS MPLS Service
SNMP Polls

Information on:
1- LSP1
2- L2 connection

[Diagram labels: TO; MI1; MI2; GN2IT; GN2; DFN; GARR; FF; TLAB; TSystem; LSP1; LSP2; LSP3]

Page 24

MPLS Monitoring: MUPBED case

[Screenshot labels: LSP Status; E2E L2 inter-domain status]

Page 25

BGP monitoring

• Peer status & prefixes information
• Alarms

Page 26

SONET Alarms (RFC 2558)

Page 27

Statistics

• Common statistics sets, different types of representation
• Online Network Status
• Other Services

Page 28

Long Term Analysis

[Screenshot labels: Traffic, Input errors & output drops; CPU load & temperature; Router aggregate traffic & peaks]

Page 29

Example of temperature statistics

In such cases I’d like to be alerted by email, SMS, phone and voice!!!

Page 30

The backbone weathermap

Page 31

[Weathermap detail labels: Traffic load (615M); OSPF cost (25); Router CPU temperature (20); Ticket info]

Page 32

Ticket info

Traffic load

Page 33

Weathermap
How it works

[Diagram labels: Network; Network Database; Measurements Storage; Generate; SVG image; Convert; PNG image; Merge; HTML dynamic map]

Page 34

Fault & Performance Reports

• Who is the target user for network reports?
• What kind of reports are provided?

1- Network users, end sites
• fault and availability reports of the services
• historical traffic data

Page 35

Fault & Performance Reports: UI

[Screenshot labels: GARR User; monthly report; 95th percentile; Uncompressed statistics]

Page 36

User monthly and yearly PDF Reports

• ~1,000 report pages per month
• ~50MB disk space per month

[Report sections shown: Introduction; Faults and availability; Monthly and yearly traffic statistics]

Page 37

Uncompressed Traffic Statistics, monthly view

95th percentile

5 minutes
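The 95th percentile over 5-minute samples can be sketched as follows. This is a minimal illustration using the nearest-rank method, which is an assumption: the slides do not say which variant GINS applies. The function name `p95` is hypothetical; it reads one sample per line on standard input.

```shell
#!/bin/sh
# p95: print the 95th percentile (nearest-rank) of the numbers read on stdin.
# Hypothetical helper for illustration, not part of GINS.
p95() {
    sort -n | awk '
        { v[NR] = $1 }
        END {
            if (NR == 0) exit 1
            # ceil(0.95 * NR), computed with integer arithmetic
            rank = int((NR * 95 + 99) / 100)
            print v[rank]
        }'
}
```

For a 30-day month of 5-minute samples (8640 values), this discards the 432 highest samples and reports the next one.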

Page 38

Uncompressed Traffic Statistics, yearly view

Monthly values

Page 39

2005!!

Historical data

Page 40

Fault & Performance Reports

• Who is the target user for network reports?
• What kind of reports are provided?

1- Network users, end sites
• fault and availability reports of the services
• historical traffic data

2- Network planning staff
• to extrapolate the traffic trends for future network planning

Page 41

GARR Traffic Trends

[Chart: traffic in Gbps (y-axis 0 to 35), January 2005 to May 2009. Series: Ave In, Ave Out, Max In, Max Out, 95th In, 95th Out, Vol In (Pbps), Vol Out (Pbps). Annotated values: 3.84 Gbps and 30.67 Gbps]

Page 42

Traffic Evolution

[Chart: traffic evolution, 2001 to 2009. Annotated growth rates: GLOBAL INTERNET r ~ 1.4/y; NATIONAL INTERNET r ~ 1.6/y; RESEARCH TRAFFIC r ~ 2.0/y; E2E]

Page 43

Latency Measurements

http://oss.oetiker.ch/smokeping/

By Tobias Oetiker

Page 44

Latency Measurements

[Diagram labels: Server; Fping probe; End Site]

• Round Trip Time fluctuations
• Packet Loss percentage

Page 45

Slices

GARR-DB: Network Database

Description of the infrastructure
• Temporary infrastructures
• Network Labs
• Temporary research projects
• Infrastructures requiring monitoring only

Homer’s dream is just:
• Dedicated monitoring systems (users or projects)

Page 46

Slices: Dedicated monitoring systems

User requirements:
• Quick and easy setup
• Traffic statistics
• Weathermaps
• Alarms

Administrator requirements:
• Easy to manage
• Replicable

Page 47

Slices

• Slice link, description and status
• Access policy
• Slice status (on, off)
• Status of MRTG CFG generation (red if disabled)
• Cronjob status (red if disabled)
• MRTG log status
• URL

Page 48

Slices

Page 49

Traffic Flows Analysis

Suite Nfsen/Nfdump by Peter Haag

Based on NetFlow protocol

Page 50

Traffic Flows Analysis, architecture overview

[Diagram labels: Network; NetFlow, data export, sampling; Nfcapd; Raw data; Nfdump; Nfsen; RRDs; Nfdump (CLI); www; User]

Daily numbers:
• ~2000 flows/s export
• sampling 1:1000
• ~40MB-1.6GB per router (raw data)

Page 51

Traffic Flows Analysis, example

Analysis of 2 subnets traffic on one interface

Servers vs DHCP

Page 52

MRTG vs NetFlow

GINS (SNMP)

Nfsen (NetFlow)

Page 53

Do I trust sampling?

From router counters (GINS by MRTG):

From flows (NetFlow):
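One rough way to answer the question numerically is to scale the sampled flow volume by the sampling rate and compare it with the volume from the router counters. The sketch below assumes that the flow byte counts have not already been scaled by the collector; the function name `sampling_error` and its arguments are hypothetical.

```shell
#!/bin/sh
# sampling_error: compare a byte volume taken from interface counters with the
# volume estimated from sampled flows. Hypothetical helper for illustration.
#   $1 = bytes from the interface counters over the interval
#   $2 = bytes summed over the sampled flows in the same interval
#   $3 = sampling rate N (1 packet out of N), e.g. 1000
# Prints: <estimated bytes> <relative difference in percent>
sampling_error() {
    awk -v c="$1" -v s="$2" -v n="$3" 'BEGIN {
        est = s * n
        printf "%.0f %.1f\n", est, (est - c) * 100 / c
    }'
}
```

With a sampling of 1:1000, `sampling_error 1000000000 990000 1000` estimates 990000000 bytes, 1.0% below the counter value.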

Page 54

Traffic Flows Analysis with ASTracker

How to get information on the traffic exchanged between ASes?

Example of an IP commodity peering

ASTracker Nfsen plugin by Nino Ciurleo @ GARR

Page 55

Traffic Flows Analysis with ASTracker: Microsoft black hole

Microsoft AS8075 announced by GEANT: output traffic on GEANT, input traffic lost

Page 56

Traffic Flows Analysis with ASTracker: other examples

Facebook:

From the Microsoft Web Site: “As part of Microsoft's routine, monthly security update cycle, we released 10 new security updates on June 9, 2009”.

Page 57

Work in progress

• Tools that are currently being integrated:
  • Reports on Traffic Flows Analysis
  • Equipment SNMP Traps
• Future plans:
  • Packaging: module packaging for distribution
  • Optical Network Monitoring
  • GINSv2

[Cartoon labels: GINS; “Tell me guy!”; Optical Network; “#@%$!”; Operations Support System]

Page 58

Part 2

LET’S CODE THE NETWORK MONITORING!

Page 59

SNMP, RFC

• n. 1441
  • Introduction to version 2 of the Internet-standard Network Management Framework
• n. 2578
  • Structure of Management Information for version 2 of the Simple Network Management Protocol (SNMPv2)
• n. 1213 (updated by 2011, 2012, 2013)
  • Management Information Base for Network Management of TCP/IP-based internets: MIB-II

Page 60

SNMP

• 2 different approaches:
  • SNMP POLL: you ask for something and the equipment sends a response
  • SNMP TRAP: the equipment notifies you of an event

Page 61

USING SNMP, POLL

snmpget -v2c -c <community> <router> <Object Identifier OID>
snmpwalk -v2c -c <community> <router> <part of an OID>

Poll response: <OID> = <data type>: <value>

Basic examples:

snmpget -v2c -c <community> <router> IP-MIB::ipAdEntIfIndex.194.116.96.25
IP-MIB::ipAdEntIfIndex.194.116.96.25 = INTEGER: 82

snmpget -v2c -c <community> <router> IF-MIB::ifName.82
IF-MIB::ifName.82 = STRING: ge-1/2/0.4

snmpget -v2c -c <community> <router> IF-MIB::ifHighSpeed.82
IF-MIB::ifHighSpeed.82 = Gauge32: 1000

snmpget -v2c -c <community> <router> IF-MIB::ifHCInOctets.82
IF-MIB::ifHCInOctets.82 = Counter64: 262925908632166
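An octet counter such as ifHCInOctets only becomes a traffic figure once two polls are compared. The sketch below extracts the value from a poll response and turns two samples into an average bit rate. The function names are hypothetical, and counter wrap or reset between the two polls is not handled.

```shell
#!/bin/sh
# snmp_value: strip "<OID> = <data type>: " from a poll response line on stdin.
snmp_value() {
    sed 's/^[^=]*= [^:]*: //'
}

# rate_bps: average bit rate between two octet-counter samples.
#   $1 = older counter value
#   $2 = newer counter value
#   $3 = seconds elapsed between the two polls
rate_bps() {
    echo $(( ($2 - $1) * 8 / $3 ))
}
```

For example, if the counter above reads 262925908632166 and, 300 seconds later, 262925946132166 (a made-up second sample), `rate_bps 262925908632166 262925946132166 300` gives 1000000 bit/s.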

Page 62

SNMP in action: BGP Monitoring

Status of the Peer BGP: 1.3.6.1.2.1.15.3.1.2 (RFC 1269)

snmpwalk -v2c -c <community> <router> 1.3.6.1.2.1.15.3.1.2 |
awk -F 'SNMPv2-SMI::mib-2.15.3.1.2.' '{print $2}' |
awk -F ' = INTEGER: ' '{
    if($2=="1"){status=sprintf("Idle");};
    if($2=="2"){status=sprintf("Connect");};
    if($2=="3"){status=sprintf("Active");};
    if($2=="4"){status=sprintf("Opensent");};
    if($2=="5"){status=sprintf("Openconfirm");};
    if($2=="6"){status=sprintf("Established");};
    print $1,status;}'

Returns a list of: <IP address of the Peer> <Status of the Peer>

Page 63

SNMP in action: BGP Monitoring

AS of the Peer BGP: 1.3.6.1.2.1.15.3.1.9 (RFC 1269)

snmpwalk -v2c -c <community> <router> 1.3.6.1.2.1.15.3.1.9 |
awk -F 'SNMPv2-SMI::mib-2.15.3.1.9.' '{print $2}' |
awk -F ' = INTEGER: ' '{print $1,$2;}'

Returns a list of: <IP address of the Peer> <AS of the Peer>

Page 64

SNMP in action: BGP Monitoring

A crude but simple BGP Monitor

Content of /<some path>/BGPmon.sh:

#!/bin/bash
# Print one line for every BGP peer that is not in the Established state (6)
snmpwalk -v2c -c <community> <router> 1.3.6.1.2.1.15.3.1.2 |
awk -F 'SNMPv2-SMI::mib-2.15.3.1.2.' '{print $2}' |
awk -F ' = INTEGER: ' '{ if ($2 != "6") print "The Peer has a problem:", $1 }'

In the crontab:

MAILTO="<email address>"
0-55/5 * * * * /<path>/BGPmon.sh

Why crude?
0- If a peering goes down for 24 hours, I get 288 emails. Please change the email address!!!
1- A better way of coding is to use the libraries of a higher-level language (PHP, Perl, Java, etc.), which let you manage errors, performance and historical data.

Why simple? Just a lovely command line.
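One possible way to avoid the flood of repeated emails is to remember the status seen on the previous run and to report only the changes. The sketch below is not the GINS implementation: the function name `bgp_report_changes` and the state file are hypothetical, and the input is assumed to be the "<IP address of the Peer> <Status of the Peer>" list produced by the status pipeline shown earlier.

```shell
#!/bin/sh
# bgp_report_changes: read "<peer IP> <status>" lines on stdin and print only
# the peers whose status differs from the previous run. Hypothetical helper.
#   $1 = state file holding the list seen on the previous run
bgp_report_changes() {
    state_file=$1
    current=$(cat)
    [ -f "$state_file" ] || : > "$state_file"
    printf '%s\n' "$current" | awk -v prev="$state_file" '
        BEGIN {
            # Load the previous status of every peer
            while ((getline line < prev) > 0) {
                split(line, f, " ")
                old[f[1]] = f[2]
            }
        }
        NF == 2 {
            before = ($1 in old) ? old[$1] : "unseen"
            # A peer seen for the first time is reported only if it is not Established
            if ($2 != before && !(before == "unseen" && $2 == "Established"))
                printf "%s: %s -> %s\n", $1, before, $2
        }'
    # Remember the current list for the next run
    printf '%s\n' "$current" > "$state_file"
}
```

Run from cron, this prints one line when a peering leaves the Established state and one when it recovers, so a 24-hour outage produces two messages instead of 288.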

Page 65

SNMP in action: BGP Monitoring

Monitoring BGP Prefixes: no more standard MIBs available

CISCO-BGP4-MIB
  Accepted prefixes from Peer:   1.3.6.1.4.1.9.9.187.1.2.4.1.1.<IP>.1.1 (.1.1 = IPv4 Unicast)
  Advertised prefixes to Peer:   1.3.6.1.4.1.9.9.187.1.2.4.1.6.<IP>.1.1

BGP4-V2-MIB-JUNIPER
  Received prefixes from Peer:   1.3.6.1.4.1.2636.5.1.1.2.6.2.1.7.<Peer Index>.1.1
  Advertised prefixes to Peer:   1.3.6.1.4.1.2636.5.1.1.2.6.2.1.10.<Peer Index>.1.1
  Accepted prefixes from Peer:   1.3.6.1.4.1.2636.5.1.1.2.6.2.1.8.<Peer Index>.1.1
  Peer Index from:               1.3.6.1.4.1.2636.5.1.1.2.1.1.1.14

Page 66

SNMP in action: OSPF Monitoring

OSPF cost of a link: 1.3.6.1.2.1.14.8.1.4.<IP Address>.0.0 (RFC 1850)

snmpwalk -v2c -c <community> <router> 1.3.6.1.2.1.14.8.1.4 |
grep '.0.0 =' |
awk -F '.0.0 = INTEGER: ' '{print $1,$2}' |
awk -F 'SNMPv2-SMI::mib-2.14.8.1.4.' '{print $2}'

Returns a list of: <IP address> <OSPF cost>

Page 67

SNMP in action: MPLS LSP Monitoring

On Juniper Routers:

To get information about an LSP, we have to know the index identifying the LSP (<LSP index>).

Example: BO1-MI1-VPN: .66.79.49.45.77.73.49.45.86.80.78.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0

snmpget -v2c -c <community> <router> <mplsLspState>.<LSP index>

1 = unknown
2 = up
3 = down

Some information:

mplsLspName              1.3.6.1.4.1.2636.3.2.3.1.1
mplsLspState             1.3.6.1.4.1.2636.3.2.3.1.2
mplsLspOctets            1.3.6.1.4.1.2636.3.2.3.1.3
mplsLspPackets           1.3.6.1.4.1.2636.3.2.3.1.4
mplsLspAge               1.3.6.1.4.1.2636.3.2.3.1.5
mplsLspTimeUp            1.3.6.1.4.1.2636.3.2.3.1.6
mplsLspPrimaryTimeUp     1.3.6.1.4.1.2636.3.2.3.1.7
mplsLspTransitions       1.3.6.1.4.1.2636.3.2.3.1.8
mplsLspLastTransition    1.3.6.1.4.1.2636.3.2.3.1.9
mplsLspPathChanges       1.3.6.1.4.1.2636.3.2.3.1.10
mplsLspLastPathChange    1.3.6.1.4.1.2636.3.2.3.1.11
mplsLspConfiguredPaths   1.3.6.1.4.1.2636.3.2.3.1.12
mplsLspStandbyPaths      1.3.6.1.4.1.2636.3.2.3.1.13
mplsLspOperationalPaths  1.3.6.1.4.1.2636.3.2.3.1.14
mplsLspFrom              1.3.6.1.4.1.2636.3.2.3.1.15
mplsLspTo                1.3.6.1.4.1.2636.3.2.3.1.16
mplsPathName             1.3.6.1.4.1.2636.3.2.3.1.17
mplsPathType             1.3.6.1.4.1.2636.3.2.3.1.18
mplsPathExplicitRoute    1.3.6.1.4.1.2636.3.2.3.1.19
mplsPathRecordRoute      1.3.6.1.4.1.2636.3.2.3.1.20
mplsPathBandwidth        1.3.6.1.4.1.2636.3.2.3.1.21
mplsPathCOS              1.3.6.1.4.1.2636.3.2.3.1.22
mplsPathInclude          1.3.6.1.4.1.2636.3.2.3.1.23
mplsPathExclude          1.3.6.1.4.1.2636.3.2.3.1.24
mplsPathSetupPriority    1.3.6.1.4.1.2636.3.2.3.1.25
mplsPathHoldPriority     1.3.6.1.4.1.2636.3.2.3.1.26
mplsPathProperties       1.3.6.1.4.1.2636.3.2.3.1.27

Page 68

SNMP in action: MPLS LSP Monitoring

How to build the LSP index:

B O 1 - M I .....
CHAR to DEC translation
.66.79.49.45.77.73.49.45.86.......

Build the monster using a translator (or use an ASCII table on Wikipedia):

<?php
// Translate an LSP name into its SNMP index: one decimal ASCII code per
// character, padded with ".0" up to 32 elements.
$name = $argv[1];
$oid = name2oid($name);
print $name.": ".$oid."\n";

function name2oid($string) {
    $oid = '';
    $len = strlen($string);
    // CHAR to DEC translation of every character of the name
    for ($i = 0; $i < $len; $i++) {
        $oid .= ".".str_pad(ord($string[$i]), 2, "0", STR_PAD_LEFT);
    }
    // Fill the remaining positions with zeros
    $npoints = 32 - $len;
    for ($i = 0; $i < $npoints; $i++) {
        $oid .= ".0";
    }
    return $oid;
}
?>

$ php name2oid.php BO1-MI1-VPN
BO1-MI1-VPN: .66.79.49.45.77.73.49.45.86.80.78.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0

Page 69

SNMP in action: IPv6 traffic

There are no IPv6 MIBs available to measure the IPv6 traffic on the interfaces of Cisco and Juniper routers.

A solution for Juniper routers is to use the firewall, defining a counter for the IPv6 traffic.

1- firewall configuration

> show configuration firewall
family inet6 {
    filter ipv6-traffic {
        interface-specific;
        term count {
            then {
                count ipv6-traffic;
                accept;
            }
        }
    }
}

2- interface configuration

> show configuration interfaces ge-0/2/4.0 family inet6
filter {
    input ipv6-traffic;
    output ipv6-traffic;
}

3- result

> show firewall | grep ipv6
Filter: ipv6-traffic-ge-0/2/4.0-i
ipv6-traffic-ge-0/2/4.0-i    253874255    2929972
Filter: ipv6-traffic-ge-0/2/4.0-o
ipv6-traffic-ge-0/2/4.0-o    278249000    3005956

And now it is time to understand how the counter OID is built.

Page 70

SNMP in action: IPv6 traffic

From JUNIPER-FIREWALL-MIB:
jnxFWCounterDisplayFilterName: 1.3.6.1.4.1.2636.3.5.2.1.1
jnxFWCounterDisplayName: 1.3.6.1.4.1.2636.3.5.2.1.7

And what we need to measure:
jnxFWCounterByteCount: 1.3.6.1.4.1.2636.3.5.2.1.5

How to build the index of the counter? After some long reverse engineering...

1.3.6.1.4.1.2636.3.5.2.1.5
+ <length of the filter_name>
+ <CHAR to DEC translation of the filter_name>
+ <length of the counter_name>
+ <CHAR to DEC translation of the counter_name>
+ .2

In this case the filter_name and the counter_name are the same (ipv6-traffic-ge-0/2/4.0-i).

Page 71

SNMP in action: IPv6 traffic

Example for the counter on the ae1.0 interface:

ipv6-traffic-ae1.0-i: .105.112.118.54.45.116.114.97.102.102.105.99.45.97.101.49.46.48.45.105

1.3.6.1.4.1.2636.3.5.2.1.5
+ .20
+ .105.112.118.54.45.116.114.97.102.102.105.99.45.97.101.49.46.48.45.105
+ .20
+ .105.112.118.54.45.116.114.97.102.102.105.99.45.97.101.49.46.48.45.105
+ .2

Finally, you can get the counter value via SNMP, or you can use the OID in an MRTG configuration file.
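The construction of the counter OID can be automated with a few lines of shell. This is a sketch of the recipe given in the slides (length, then CHAR to DEC translation, for both names, then .2); the function names `str2oid` and `fw_counter_oid` are hypothetical.

```shell
#!/bin/sh
# str2oid: CHAR to DEC translation of a string, e.g. "BO1" -> ".66.79.49"
str2oid() {
    s=$1
    out=""
    while [ -n "$s" ]; do
        rest=${s#?}                        # everything after the first character
        c=${s%"$rest"}                     # the first character
        out="$out.$(printf '%d' "'$c")"    # decimal code of that character
        s=$rest
    done
    printf '%s' "$out"
}

# fw_counter_oid: OID of jnxFWCounterByteCount for a filter/counter pair
#   $1 = filter name, $2 = counter name
fw_counter_oid() {
    filter=$1
    counter=$2
    printf '1.3.6.1.4.1.2636.3.5.2.1.5.%d%s.%d%s.2\n' \
        "${#filter}" "$(str2oid "$filter")" \
        "${#counter}" "$(str2oid "$counter")"
}
```

Calling `fw_counter_oid ipv6-traffic-ae1.0-i ipv6-traffic-ae1.0-i` prints the concatenated OID of the ae1.0 example, ready to be passed to snmpget.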

Page 72

RRD World

How to store data in an efficient and systematic manner:

Where to find all the information:
http://oss.oetiker.ch/rrdtool/

thanks to Tobias Oetiker


RRD World

[Diagram: the network is polled via SNMP by MRTG, CACTI, or a handmade or other poller; the pollers reach the storage through the RRD API.]


RRD World

The RRD file: a possible and typical temporal structure.

Each Round Robin Archive (RRA) holds 600 values:

600 values, average on 5 minutes: covers 50 hours
600 values, average on 30 minutes: covers 12.5 days
600 values, average on 2 hours: covers 50 days
600 values, average on 1 day: covers 600 days

The same four RRAs (600 values each) exist for both consolidation functions, AVERAGE and MAX.
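The time covered by each archive is simply the number of rows multiplied by the resolution of one row. A minimal sketch of that arithmetic follows, assuming a 5-minute polling step; the name rra_span and the list RRAS are illustrative only.

```python
from datetime import timedelta

# Assumed polling step: one primary data point every 5 minutes.
STEP = timedelta(minutes=5)

# (steps consolidated into one row, number of rows) for the four archives.
RRAS = [(1, 600), (6, 600), (24, 600), (288, 600)]


def rra_span(step, pdp_per_row, rows):
    """Return (resolution of one row, total time covered) for one RRA."""
    resolution = step * pdp_per_row
    return resolution, resolution * rows


if __name__ == "__main__":
    for pdp_per_row, rows in RRAS:
        resolution, covered = rra_span(STEP, pdp_per_row, rows)
        print(f"{rows} rows at {resolution} per row cover {covered}")
```

Anything older than the span of the last archive is overwritten, which is why the following slides look at ways to keep more history.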


RRD World

RRD in action:

[Diagram: a new value arriving at the four 600-value RRAs (average on 5 minutes, 30 minutes, 2 hours, 1 day).]


RRD World: how to avoid losing data, method 1

First thing to do: change the size of the yearly RRA, for example to 10 years.

[Diagram: the RRAs averaging on 5 minutes, 30 minutes and 2 hours keep 600 values each; the RRA averaging on 1 day grows from 600 to 3650 values.]


RRD World: how to avoid losing data, method 1

RRD API: info, create, update, fetch, tune, graph, dump, restore, etc.

rrdtool info <file.rrd>

rra[0].cf = "AVERAGE"
rra[0].rows = 600
rra[0].pdp_per_row = 1
rra[1].cf = "AVERAGE"
rra[1].rows = 600
rra[1].pdp_per_row = 6
rra[2].cf = "AVERAGE"
rra[2].rows = 600
rra[2].pdp_per_row = 24
rra[3].cf = "AVERAGE"
rra[3].rows = 600
rra[3].pdp_per_row = 288

With a 5-minute step, pdp_per_row values of 1, 6, 24 and 288 correspond to 5 m, 30 m, 2 h and 1 d.

rrdtool resize <file.rrd> 3 GROW 3050

Here 3 is the RRA number and 3050 is the number of rows to add (600 + 3050 = 3650). rrdtool writes the resized copy to resize.rrd in the current directory and leaves the original file unchanged.

After the resize:

rra[0].cf = "AVERAGE"
rra[0].rows = 600
rra[0].pdp_per_row = 1
rra[1].cf = "AVERAGE"
rra[1].rows = 600
rra[1].pdp_per_row = 6
rra[2].cf = "AVERAGE"
rra[2].rows = 600
rra[2].pdp_per_row = 24
rra[3].cf = "AVERAGE"
rra[3].rows = 3650
rra[3].pdp_per_row = 288

Result: a 10-year RRD.
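The GROW argument is the number of rows to add, not the final size, so it has to be computed from the current row count. The sketch below reads the rra[N].rows lines of an rrdtool info dump and derives that number; the helper names parse_rras and rows_to_grow are hypothetical, and SAMPLE_INFO is a shortened copy of the output shown on the slide.

```python
import re

# Shortened copy of the `rrdtool info` lines shown on the slide.
SAMPLE_INFO = """\
rra[2].cf = "AVERAGE"
rra[2].rows = 600
rra[2].pdp_per_row = 24
rra[3].cf = "AVERAGE"
rra[3].rows = 600
rra[3].pdp_per_row = 288
"""

RRA_LINE = re.compile(r'rra\[(\d+)\]\.(\w+)\s*=\s*(.+)')


def parse_rras(info_text):
    """Collect the rra[N].key = value lines into {N: {key: value}}."""
    rras = {}
    for line in info_text.splitlines():
        match = RRA_LINE.match(line.strip())
        if match:
            index, key, value = match.groups()
            rras.setdefault(int(index), {})[key] = value.strip('"')
    return rras


def rows_to_grow(rras, rra_index, target_rows):
    """Rows to pass to GROW so that the RRA ends up with target_rows."""
    current = int(rras[rra_index]["rows"])
    return max(0, target_rows - current)


if __name__ == "__main__":
    grow = rows_to_grow(parse_rras(SAMPLE_INFO), 3, 3650)
    print(f"rrdtool resize <file.rrd> 3 GROW {grow}")
```

Computing the difference from the live file avoids growing an archive twice if the script is run again after a resize.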


RRD World: how to avoid losing data, method 2

Building RRD without compression:

[Diagram: the source RRD with four 600-value RRAs (average on 5 minutes, 30 minutes, 2 hours, 1 day). A script run every hour copies the latest 12 five-minute averages into a yearly RRD without compression: a single RRA with 105408 values, covering 366 days.]


RRD World: how to avoid losing data, method 2

Building RRD without compression: how to do it

1- Uncompressed RRD creation (once per year):

rrdtool create <destination.rrd> \
    --start <some year ago> --step 300 \
    DS:in:GAUGE:600:U:U DS:out:GAUGE:600:U:U \
    RRA:LAST:0.5:1:105408

2- Data extraction and insertion (once per hour):

rrdtool fetch <source.rrd> --end now-600s --start now-4200s AVERAGE |
    awk -F ' ' 'BEGIN {x=0;}{x++; if (x>2){ print $1 $2":"$3 } }' |
    xargs rrdtool update <destination.rrd>
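The awk step above turns each line of rrdtool fetch output ("timestamp: in out") into a "timestamp:in:out" argument for rrdtool update. The same conversion can be sketched in Python for pollers that are not shell scripts. The function name fetch_to_update_args is hypothetical and SAMPLE_FETCH is made-up data in the layout that rrdtool fetch prints. Unlike the awk version, which passes values through unchanged, this sketch also maps nan to U, rrdtool's marker for an unknown value.

```python
# Made-up sample in the layout printed by `rrdtool fetch`:
# a header with the DS names, a blank line, then one line per timestamp.
SAMPLE_FETCH = """\
                         in                 out

1245672000: 1.2345000000e+03 5.6780000000e+03
1245672300: nan nan
"""


def fetch_to_update_args(fetch_output):
    """Convert `rrdtool fetch` text into 'timestamp:in:out' arguments."""
    args = []
    for line in fetch_output.splitlines():
        timestamp, sep, values = line.partition(":")
        timestamp = timestamp.strip()
        if not sep or not timestamp.isdigit():
            continue  # skip the DS-name header and blank lines
        fields = ["U" if v.lower().lstrip("+-") == "nan" else v
                  for v in values.split()]
        args.append(":".join([timestamp] + fields))
    return args


if __name__ == "__main__":
    for arg in fetch_to_update_args(SAMPLE_FETCH):
        print(arg)
```

The returned list can be appended to an rrdtool update <destination.rrd> command line, which is what xargs does in the shell pipeline.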


Reference

• URL: www.gins.garr.it

• Email: [email protected]

[email protected]