Creating the global research village: The DANTE NOC Network Monitoring System. Xavier Martins-Rivas, DANTE. TNC 2010, Vilnius, 2nd June 2010


Page 1

Creating the global research village

The DANTE NOC Network Monitoring System

Xavier Martins-Rivas, DANTE

TNC 2010, Vilnius, 2nd June 2010

Page 2

The DANTE NOC Network Monitoring System

The brief for the DANTE Network Operations Centre Network Monitoring System

The requirement: Network monitoring tools to support the Network Operations Centre business model

The solution: An integrated monitoring system correlating and interpreting alarms from multi-layer, multi-vendor network systems

Page 3

The Requirement

Tasked with developing a suite of monitoring tools:

• Tailored to the Network Operations requirement for the DANTE NOC
• Create a central dashboard of network problems accessible by multiple teams, to help organise workflow
• Correlate multiple alarms to provide more straightforward problem descriptions
• Provide network information as the basis for the creation of further tools and reports

Page 4

Tailored Monitoring Tools at the Centre of the NOC

• Tier 1: Basic skill, 24x7
• Tier 2: Higher skill, 5am–9pm, Mon–Fri
• Tier 3: Specialists, 9am–5pm, Mon–Fri

Page 5

Enabling Multi-skilled Operations

[Diagram labels: IP, Optics, IP + Optics]

IP and Optical Support in the same NOC team, which requires integrated monitoring solutions.

Page 6

The Solution

[Diagram labels: SNMP, Trap Handler, Correlation, Database]

• A trap handling mechanism
• A database
• A correlation engine
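To make the three components concrete, the following is a minimal sketch of how a trap handler, a database, and a correlation step might fit together. It is not the DANTE implementation: the `Trap` fields, the `alarms` table layout, and the function names are all hypothetical, an in-memory SQLite database stands in for the real store, and the "correlation" shown is only a per-device count.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Trap:
    """A simplified SNMP trap; all field names are hypothetical."""
    timestamp: float  # seconds since epoch
    source: str       # device that sent the trap
    layer: str        # e.g. "optical" or "ip"
    message: str


def open_store() -> sqlite3.Connection:
    """Create an in-memory alarm table (a stand-in for the real database)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE alarms (ts REAL, source TEXT, layer TEXT, message TEXT)"
    )
    return db


def handle_trap(db: sqlite3.Connection, trap: Trap) -> None:
    """Trap handler: persist one incoming trap as an alarm row."""
    db.execute(
        "INSERT INTO alarms VALUES (?, ?, ?, ?)",
        (trap.timestamp, trap.source, trap.layer, trap.message),
    )
    db.commit()


def alarms_by_source(db: sqlite3.Connection) -> dict:
    """Toy correlation step: count stored alarms per source device."""
    rows = db.execute(
        "SELECT source, COUNT(*) FROM alarms GROUP BY source ORDER BY source"
    )
    return dict(rows.fetchall())
```

In this sketch the database sits between the handler and the correlation step, so the correlation logic can be changed or rerun without touching trap reception.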

Page 7

An Example of the Correlation Requirement

Fibre Cut
– Regular non-preventable problem
– Priority to respond quickly
– Important to simplify alarms generated at multiple layers

The Challenge:
– Multiple vendors / hardware
– Different layers see the problem differently
– A lot of alarms are generated, confusing operators

Page 8

Correlation

Responding to the challenge
– Logical analysis by product specialists within DANTE
– ‘Trigger Points’ identified
– Related/unrelated alarms separable using logic and ‘time buckets’
– Correlation output translated to coded logic
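The ‘time bucket’ idea can be illustrated with a short sketch. The slides do not describe the actual logic, so everything here is an assumption: alarms are `(timestamp, location, layer)` tuples, the bucket width is 30 seconds, the trigger point is any optical-layer alarm, and the location names are invented for the example.

```python
from collections import defaultdict

BUCKET_SECONDS = 30  # hypothetical bucket width


def bucket_alarms(alarms, bucket_seconds=BUCKET_SECONDS):
    """Group (timestamp, location, layer) alarms by (location, time bucket)."""
    buckets = defaultdict(list)
    for ts, location, layer in alarms:
        key = (location, int(ts // bucket_seconds))
        buckets[key].append((ts, layer))
    return dict(buckets)


def find_trigger_buckets(buckets, trigger_layer="optical"):
    """Return keys of buckets holding a trigger alarm plus another layer's alarm."""
    hits = []
    for key, items in sorted(buckets.items()):
        layers = {layer for _, layer in items}
        if trigger_layer in layers and len(layers) > 1:
            hits.append(key)
    return hits


alarms = [
    (100.0, "LON-PAR", "optical"),  # assumed trigger point
    (101.0, "LON-PAR", "ip"),       # same place, same bucket: related
    (500.0, "LON-PAR", "ip"),       # same place, much later: unrelated
    (102.0, "FRA-GEN", "ip"),       # same time, different place: unrelated
]
related = find_trigger_buckets(bucket_alarms(alarms))
```

Fixed-width buckets are the simplest choice but split alarms that straddle a bucket boundary; a sliding window anchored on the trigger alarm would avoid that at the cost of slightly more logic.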

Page 9

Correlation

End product: “Possible Fibre Cut” alarm
– One alarm with clear time and location
– Containing detail of underlying alarms, and related higher layer alarms
– Definitive call-out trigger for out of hours response
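As an illustration of what such a single alarm might carry, the sketch below collapses a set of already-correlated alarms into one record. The class and field names are hypothetical, and treating the optical layer as the "underlying" layer is an assumption made for the example rather than something stated in the slides.

```python
from dataclasses import dataclass, field


@dataclass
class RawAlarm:
    timestamp: float
    location: str
    layer: str
    message: str


@dataclass
class CorrelatedAlarm:
    title: str
    timestamp: float  # time of the earliest contributing alarm
    location: str
    underlying: list = field(default_factory=list)
    related_higher_layer: list = field(default_factory=list)


def summarise(alarms, base_layer="optical"):
    """Collapse alarms (assumed already correlated, same location) into one."""
    if not alarms:
        raise ValueError("need at least one alarm")
    first = min(alarms, key=lambda a: a.timestamp)
    return CorrelatedAlarm(
        title="Possible Fibre Cut",
        timestamp=first.timestamp,
        location=first.location,
        underlying=[a for a in alarms if a.layer == base_layer],
        related_higher_layer=[a for a in alarms if a.layer != base_layer],
    )
```

Keeping the contributing alarms attached to the summary lets an operator act on one line while still being able to drill down into the detail.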

Page 10

The GUI

‘The Dashboard’

Needed to be an interactive database usable by all three tiers of support

Containing all required information for response and resolution

Configurable to a basic level by operators

Page 11

The Dashboard

Alarms
– Severity
– Acknowledgment
– Associated Ticket
– Description
– Occurrences

Blacklist and Filter
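The fields listed above suggest a simple row model for the dashboard. The sketch below is one possible shape, not the Dashboard's actual schema: the severity names and their ordering, the substring-based blacklist, and the function name `visible_alarms` are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical severity scale, lowest to highest.
SEVERITY_ORDER = {"minor": 0, "major": 1, "critical": 2}


@dataclass
class DashboardAlarm:
    description: str
    severity: str
    acknowledged: bool = False
    ticket: Optional[str] = None  # associated ticket reference, if any
    occurrences: int = 1


def visible_alarms(alarms, blacklist=(), min_severity="minor"):
    """Apply a severity filter, then hide alarms matching any blacklist term."""
    threshold = SEVERITY_ORDER[min_severity]
    return [
        a for a in alarms
        if SEVERITY_ORDER[a.severity] >= threshold
        and not any(term in a.description for term in blacklist)
    ]
```

For example, `visible_alarms(rows, blacklist=("lab-",), min_severity="major")` would hide anything whose description mentions a device with a `lab-` prefix and anything below major severity.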

Page 12

Dashboard – Alarm detail

Page 13

The Dashboard - Blacklist

Page 14

Integrating existing tools

• DANTE Operations Circuit Database
• Asset Database
• Configuration items
• Contact information

Page 15

Performance and Reporting

Backbone view

POP view

Page 16

Page 17

Evaluation of the Process

A lot of work was invested in delivering this solution

Initially:
– 1 developer coding for 6 weeks
– The time of many experts within the company (Engineering and Planning, Systems, NOC, Operations)

Now:
– 0.5 FTE of a developer; on-going support

Page 18

Evaluation of the Process

... A unique result delivering great results for the network

– A tool that fits our need
– With a very high level of flexibility and integration
– A natural way to control the work flow

An important basis for the development of further tools to add to the network management portfolio

Page 19

Conclusion

The Dashboard is available for download:

http://downloads.geant.net