The Network Operation Centre of a RREN:
Anella Científica
Maria Isabel Gandía Carriedo
Communications Area, Systems & Networks
Department, CESCA
TF-NOC Preparation Meeting
NORDUnet A/S, Kastrup, 3/5/2010
Agenda
• About CESCA and Anella Científica
• Anella Científica/CESCA NOC:
  – Communication with the users
  – How we manage the network
  – How we manage dedicated circuits
• Tools
  – Communications database
  – Ad-hoc scripts
  – Cacti & its plugins
  – perfSONAR
  – SMARTxAC
  – NAM
  – Other tools
• Conclusions
About CESCA and Anella Científica
• Public consortium
• Created in 1991
• Formed by:
  – Generalitat de Catalunya
  – Talència
  – 9 Catalan universities
  – Consejo Superior de Investigaciones Científicas
• CATNIX created in 1999
• Anella Científica created in 1993
About Anella Científica
Anella Científica is the research and education network in Catalonia
Managed by CESCA
Connected to RedIRIS
More than 80 points of access at research-related institutions
Anella Científica: Evolution
[Chart: number of points of access per year, 1993–2010, by capacity class (≤ 10 Mbps, 10–90 Mbps, 100–990 Mbps, ≥ 1,000 Mbps), with the aggregated capacity in Mbps]
[Chart: traffic (TB) per year, 2002–2010]
[Table fragment legible in the transcript (year: value): 2006: 2,920.75; 2007: 4,665.43; 2008: 7,646.55; 2009: 6,712.35; 2010: 2,591.91]
Anella Científica: Architecture
• Some local dark fibre links
• L2 Gigabit Ethernet network
• Flexible and easily scalable
• Different points of access & connections:
  – Ethernet: 10, 34, 100, 1,000 and 10,000 Mb/s
  – ADSL, SHDSL
• Core is a full mesh, with redundancy in the links between nodes
• Access is a “ring”: dual homing
• Redundancy of the provider network and the WDM network
• Customizable CIR + EIR
• QoS capabilities at the L2 network
…but the model will probably change
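As a rough illustration of the "CIR + EIR" bullet, the sketch below assumes the usual meaning of the terms: CIR is the committed (guaranteed) information rate and EIR is the excess rate carried on a best-effort basis on top of it. The function name and the example values are hypothetical, and real equipment meters traffic with token buckets rather than by comparing average rates.

```python
def classify_rate(rate_mbps: float, cir_mbps: float, eir_mbps: float) -> str:
    """Classify an average rate against a CIR + EIR profile.

    Simplified model (assumption): traffic up to CIR is committed,
    traffic between CIR and CIR + EIR is excess (best effort),
    and anything above CIR + EIR is over the contracted limit.
    """
    if min(rate_mbps, cir_mbps, eir_mbps) < 0:
        raise ValueError("rates must be non-negative")
    if rate_mbps <= cir_mbps:
        return "committed"
    if rate_mbps <= cir_mbps + eir_mbps:
        return "excess"
    return "over-limit"


if __name__ == "__main__":
    # Hypothetical profile: CIR 100 Mb/s, EIR 900 Mb/s.
    for rate in (50, 400, 1200):
        print(rate, classify_rate(rate, cir_mbps=100, eir_mbps=900))
```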
Anella Científica: projects
• PIC participates in LHC (10 Gbps)
• i2CAT participates in FEDERICA, Phosphorus, HDVIPER (10 Gbps)
• UPC-CCABA participates in EuQoS, MUPBED,… (1 Gbps)
• CESCA, i2CAT & UPC participate in PASITO (10 Gbps)
• BSC participates in RES (1 Gbps)
• Liceu transmits the course Opera Oberta
Anella Científica: L3
CESCA, as the manager of the Regional Research and Education Network (RREN) in Catalonia and as a Local Internet Registry (LIR), has:
• Addresses for the connected institutions:
  – IPv4: 84.88.0.0/15
  – IPv6: 2001:40B0::/32
• An Autonomous System (AS):
  – AS13041
• CESCA controls all the L3, some L2 and some L1, so our monitoring is mostly L3-based.
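To give a feel for the size of the address blocks listed for CESCA, here is a minimal sketch using Python's standard `ipaddress` module. The prefixes and the AS number are taken from the slide; the helper name `in_cesca_space` and the sample addresses are only illustrative.

```python
import ipaddress

# Prefixes and AS number as stated on the slide.
IPV4_BLOCK = ipaddress.ip_network("84.88.0.0/15")
IPV6_BLOCK = ipaddress.ip_network("2001:40B0::/32")
ASN = 13041


def in_cesca_space(address: str) -> bool:
    """Return True if the address falls inside either block."""
    ip = ipaddress.ip_address(address)
    block = IPV4_BLOCK if ip.version == 4 else IPV6_BLOCK
    return ip in block


if __name__ == "__main__":
    print(IPV4_BLOCK.num_addresses)        # 131072 addresses in a /15
    print(IPV4_BLOCK[0], IPV4_BLOCK[-1])   # 84.88.0.0 84.89.255.255
    print(in_cesca_space("84.89.10.1"))    # True
    print(in_cesca_space("84.90.0.1"))     # False
```

The same membership test is the basic idea behind a source-address (anti-spoofing) check: traffic from a customer should carry source addresses inside the prefixes assigned to it.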
Anella Científica: topology
Categories of institutions:
A
1. Public and private non-profit universities
2. Official bodies of research
3. Other non-profit research centres
4. Hospital research centres
B
1. Official bodies of R+D management
2. Relevant digital contents institutions
3. R+D+i participants
4. Special interest for R+D institutions
C
1. Science and technological parks
2. Other hospital units
[Topology diagram; labels: C. Nord, Telvent, Operator, Internet]
Anella Científica: circuits
• Permanent circuits & services:
  – Each point of access has one circuit to each core node for redundancy (using L3 routing)
  – An institution can have more than one VLAN with other points of access that usually belong to the same institution (internal traffic)
  – An institution can have a dedicated virtual router, managed by CESCA, to aggregate some connections
[Diagram labels: C. Nord, Telvent, Operator; A, B, C]
Anella Científica: points of access
[Diagram: backbone nodes (core) and access nodes (access ring); links labelled 10~70 km and 10~40 km]
The NOC: Communications Area
• Some numbers:
  – 85 points of access
  – 2 core nodes
  – 76 institutions connected to Anella Científica
  – 22 entities connected to CATNIX
  – 4 network engineers & 1 student
  – 20 engineers for the weekend monitoring
• Help from the Operations & Security Area for cabling, installations, etc.
• We have a technical and an administrative contact for each institution; they channel all the requests (IP address assignments, routing, dedicated circuits, incidents), but we can talk with the relevant users beforehand to learn their needs.
• Some technical contacts have a meeting once a year (CTAC).
• We organize a Meeting/Workshop (TAC) once a year to present new institutions and projects (for instance, this year, Cloud Computing)
Communication institutions --> CESCA
• Addresses (RT):
  – Routing
  – Network incidents
  – Address requests
  – Reverse DNS
  – Services (Multicast, ftp-mirror,…)
  – Eduroam
  – Security incidents
• Telephone
Communication CESCA --> institutions
• Distribution lists:
  – Members of the Commission
  – Technical representatives
  – Other technical staff
  – Generic addresses
• RT queues
• Telephone & e-mail
• TAC
• Aula (New Technologies and Seminars)
If there is an incident...
• During our working hours (9.00-18.00 Mo-Th, 9.00-14.30 Fr, 8.00-15.00 Jul/Aug)
  – They call us
  – They send a message to [email protected]
  – We try to be very proactive
• Outside our working hours, there is a 24x7 reactive service for the institutions, provided by an external company.
• The external company can check the state of our routers and switches and, if the problem is external, they can call our provider.
• Second-level support from our technicians during the weekend.
How we manage the network
• Inventory of circuits using “our” Communications database
• Ad-hoc scripts and alarms
• Statistics via SNMP with Cacti
• UPC-CCABA has developed a passive monitoring system using real-time analysis: SMARTxAC
• Our NOC is subscribed to the DANTE E2ECU (End-to-End Coordination Unit) mailing list for dedicated circuits
• perfSONAR node through RedIRIS for LHC
• NAM
• Other tools
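As background for the "statistics via SNMP" item: interface traffic is usually derived from two successive readings of an octet counter (for example `ifInOctets`, 32 bits, or `ifHCInOctets`, 64 bits, in the standard IF-MIB). The sketch below shows that arithmetic, including counter wrap-around. It is only an illustration of the principle, not Cacti's or RRDtool's actual code, and the function name and sample values are made up.

```python
def octets_to_bps(prev: int, curr: int, interval_s: float,
                  counter_bits: int = 32) -> float:
    """Average bit rate between two readings of an SNMP octet counter.

    The modulo handles a single wrap of the counter between polls;
    more than one wrap per interval cannot be detected this way,
    which is why 64-bit counters are preferred on fast links.
    """
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    modulus = 1 << counter_bits
    delta_octets = (curr - prev) % modulus
    return delta_octets * 8 / interval_s


if __name__ == "__main__":
    # Two hypothetical readings taken 300 seconds apart.
    print(octets_to_bps(0, 37_500_000, 300))        # 1000000.0 bit/s
    # The 32-bit counter wrapped between the two polls.
    print(octets_to_bps(4_294_967_000, 704, 8))     # 1000.0 bit/s
```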
How we manage dedicated circuits
• Special circuits & services:
  – If the circuit is between two institutions connected to Anella Científica, we ask both whether they want the connection. We have a special VLAN range for these connections.
  – If the circuit is external, RedIRIS uses a form that the institutions fill in and send to RedIRIS and CESCA, indicating the name of the project, description, responsible entity, kind of connection, etc.
  – For modifications, institutions can ask us directly and we contact RedIRIS
  – RedIRIS and CESCA have agreed on two VLAN ranges for special projects, one range for each type of encapsulation
  – We use Request Tracker (RT) to handle all the requests, arrange a VLAN number, etc.
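The idea of reserving one VLAN range per encapsulation type and handing out the next free number can be sketched as follows. The range boundaries and the names `encapsulation_a` / `encapsulation_b` are purely hypothetical placeholders; the slides do not give the real ranges or name the encapsulation types.

```python
from typing import Dict, Set

# Hypothetical ranges, one per encapsulation type (not the real values).
VLAN_RANGES: Dict[str, range] = {
    "encapsulation_a": range(3000, 3100),
    "encapsulation_b": range(3100, 3200),
}


def next_free_vlan(encapsulation: str, in_use: Set[int]) -> int:
    """Return the lowest VLAN ID in the agreed range that is not in use."""
    for vlan in VLAN_RANGES[encapsulation]:
        if vlan not in in_use:
            return vlan
    raise LookupError(f"no free VLAN left for {encapsulation}")


if __name__ == "__main__":
    used = {3000, 3001, 3100}
    print(next_free_vlan("encapsulation_a", used))
    print(next_free_vlan("encapsulation_b", used))
```

Keeping the ranges in one place means both parties can check a proposed VLAN number against the agreed range before the circuit is configured.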
For our users:
• Listen to their needs first
• For each new connection, there are some stress tests before going to a production environment
• They can choose static routing or dynamic routing (BGP)
• We ping their interface from the other end of the /30 and from our monitoring machine
• We apply anti-spoofing filters… Some insist on using the infrastructure address for VPNs
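A /30 point-to-point link has exactly two usable addresses, so knowing one end is enough to find the address to ping at the other end. A minimal sketch with the standard `ipaddress` module follows; the function name is illustrative and the addresses come from the 192.0.2.0/24 documentation range, not from the real infrastructure.

```python
import ipaddress


def p2p_peer(interface: str) -> ipaddress.IPv4Address:
    """Given one end of a /30 link (e.g. '192.0.2.1/30'), return the other end."""
    iface = ipaddress.ip_interface(interface)
    if iface.network.prefixlen != 30:
        raise ValueError("expected a /30 point-to-point link")
    hosts = list(iface.network.hosts())  # the two usable addresses
    if iface.ip not in hosts:
        raise ValueError("address is the network or broadcast address")
    return hosts[1] if iface.ip == hosts[0] else hosts[0]


if __name__ == "__main__":
    print(p2p_peer("192.0.2.1/30"))  # 192.0.2.2
    print(p2p_peer("192.0.2.6/30"))  # 192.0.2.5
```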
“Our” Communications database
• We store all the information about our institutions:
  – Points of access
  – Addresses
  – E-mail addresses and telephone numbers of the technical and executive contacts
  – Assigned IP addresses
  – Infrastructure addresses (point to point)
  – Equipment
  – Bandwidth
  – Technology
  – Comments, special cases for the 24x7 service
• It makes our life easier, as we have many “special” cases:
  – More than one point of access per institution
  – More than one institution per point of access
  – Different circuits intra and inter-institutions
  – …
“Our” Communications database
• All the information about an institution/circuit/person is linked
• Every time we need to contact an institution, we find the related information here
• It’s not accessible from external networks
• It’s programmed by our engineers
• It also stores information about the Neutral Internet Exchange, CATNIX
• Pros
  – All the information is together
  – We don’t have to maintain separate files for the assignment of VLANs, IP addresses, etc.
  – Easy creation of new instances
  – When there is a change in the technical/administrative contacts, it’s changed “almost” automatically
• Cons
  – Each change requires programming
  – Sometimes the initial programmer is not the same person who makes the changes
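The "special cases" mentioned for the database (several points of access per institution, several institutions per point of access) amount to a many-to-many relation. The sketch below shows one possible way to model that and to look up whom to contact for a given point of access. The schema, table names and sample rows are entirely hypothetical and say nothing about how the real CESCA database is built; SQLite is used only to keep the example self-contained.

```python
import sqlite3

# Hypothetical schema: institutions and points of access are linked
# through a join table, and contacts hang off the institution.
SCHEMA = """
CREATE TABLE institution (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE access_point (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE institution_access_point (
    institution_id INTEGER NOT NULL REFERENCES institution(id),
    access_point_id INTEGER NOT NULL REFERENCES access_point(id),
    PRIMARY KEY (institution_id, access_point_id)
);
CREATE TABLE contact (
    id INTEGER PRIMARY KEY,
    institution_id INTEGER NOT NULL REFERENCES institution(id),
    role TEXT NOT NULL CHECK (role IN ('technical', 'administrative')),
    email TEXT NOT NULL
);
"""


def demo_db() -> sqlite3.Connection:
    """Build an in-memory database with made-up sample rows."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.executemany("INSERT INTO institution VALUES (?, ?)",
                   [(1, "University A"), (2, "Research Centre B")])
    db.executemany("INSERT INTO access_point VALUES (?, ?)",
                   [(1, "pop-1"), (2, "pop-2")])
    db.executemany("INSERT INTO institution_access_point VALUES (?, ?)",
                   [(1, 1), (1, 2), (2, 1)])
    db.executemany(
        "INSERT INTO contact (institution_id, role, email) VALUES (?, ?, ?)",
        [(1, "technical", "tech@univ-a.example"),
         (1, "administrative", "admin@univ-a.example"),
         (2, "technical", "tech@centre-b.example")])
    return db


def contacts_for_access_point(db: sqlite3.Connection, access_point: str):
    """Return (institution, role, email) rows for one point of access."""
    rows = db.execute(
        """
        SELECT i.name, c.role, c.email
        FROM access_point ap
        JOIN institution_access_point iap ON iap.access_point_id = ap.id
        JOIN institution i ON i.id = iap.institution_id
        JOIN contact c ON c.institution_id = i.id
        WHERE ap.name = ?
        ORDER BY i.name, c.role
        """,
        (access_point,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    for row in contacts_for_access_point(demo_db(), "pop-1"):
        print(row)
```

Because the contacts are stored once per institution, changing a contact updates every point of access and circuit that refers to it, which is the "almost automatic" behaviour listed among the pros.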
“Our” ad-hoc scripts
• They send e-mails and messages to our mobile phone when a connection fails.
• The institution and the problem are in the subject
• It’s the best way to be “proactive”
• Pros
  – They are extremely useful to detect problems quickly and to learn about them during the weekend
  – Easy to program (shell)
• Cons
  – We need to remember to add the institutions each time there is a new connection (separate maintenance)
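The real scripts are written in shell and are not shown in the slides. Purely as an illustration of the pattern described (probe each connection, and put the institution and the problem in the subject of the alert), here is a small Python sketch. The subject format, the addresses, the institution names and the `ping` flags (those of Linux iputils) are all assumptions.

```python
import subprocess
from email.message import EmailMessage
from typing import Callable, Dict, List


def ping_ok(host: str) -> bool:
    """Probe a host with one ICMP echo (flags as in Linux iputils ping)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def build_alert(institution: str, problem: str,
                sender: str, recipient: str) -> EmailMessage:
    """Build an alert whose subject names the institution and the problem."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"[NOC] {institution}: {problem}"
    msg.set_content(f"Automatic check: {problem} for {institution}.")
    return msg


def check_links(links: Dict[str, str],
                probe: Callable[[str], bool] = ping_ok,
                sender: str = "monitor@noc.example",
                recipient: str = "oncall@noc.example") -> List[EmailMessage]:
    """Probe every institution's address and return one alert per failure."""
    alerts = []
    for institution, host in links.items():
        if not probe(host):
            alerts.append(
                build_alert(institution, "connection down", sender, recipient))
    return alerts


if __name__ == "__main__":
    # Hypothetical inventory; in practice each alert would be handed to
    # smtplib.SMTP(...).send_message(msg) or to an SMS gateway.
    inventory = {"Institution A": "192.0.2.1", "Institution B": "192.0.2.5"}
    for alert in check_links(inventory):
        print(alert["Subject"])
```

The inventory here is a hard-coded dictionary, which reproduces the drawback listed above: every new connection has to be added by hand unless the list is generated from the database.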
Cacti
• RRDtool front-end: a high-performance tool that stores and plots data series.
• It’s used to monitor:
  – CPU, temperature and memory of the routers
  – Anella Científica: points of access
  – Voice calls
  – Remote and direct access services
  – CATNIX (Internet Exchange)
  – Warnings
  – Automatic monthly statistics
  – BGP prefixes
  – Ping
  – Power consumption
  – RedIRIS & Orange Business Services graphs integrated
[Diagram label: SCP]
Cacti: one for users, one for management
[Diagram: a PRIVATE Cacti and a PUBLIC Cacti; labels: Contact information, SCP]
Plugins
Thold
• Useful to detect:
  – Down links
  – Congestion
  – High temperature
  – High CPU
  – Excess of BGP prefixes…
Reportit
• Useful to generate monthly reports
Superlinks
• It allows us to add new tabs
• Useful to integrate RedIRIS graphs in the same environment
Boost
• It stores the visited graphs in the cache for 5 minutes
• It doesn’t generate all the graphs
Link2BDCOPS
• It adds an icon next to each graph; clicking it shows the details of the technical and administrative contacts
• Programmed by our engineers
• Linked to our database
• Only the internal Cacti has access to it
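The kind of threshold alerting described for the plugins (down links, congestion, high temperature, high CPU, excess of BGP prefixes) boils down to comparing each polled value with a limit. The following sketch illustrates that logic in plain Python. It is not the code or configuration of any Cacti plugin, and the metric names and limits are invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class Threshold:
    metric: str   # key expected in a polled sample
    limit: float  # alert when the value is strictly above this limit
    label: str    # text shown to the operator


# Hypothetical limits; real values depend on the equipment and the link.
THRESHOLDS = (
    Threshold("links_down", 0, "Down link"),
    Threshold("link_utilisation_percent", 90, "Congestion"),
    Threshold("temperature_celsius", 60, "High temperature"),
    Threshold("cpu_percent", 80, "High CPU"),
    Threshold("bgp_prefixes", 1000, "Excess of BGP prefixes"),
)


def breached(sample: Dict[str, float],
             thresholds: Sequence[Threshold] = THRESHOLDS) -> List[str]:
    """Return the labels of all thresholds exceeded by one polled sample."""
    return [t.label for t in thresholds
            if t.metric in sample and sample[t.metric] > t.limit]


if __name__ == "__main__":
    sample = {"cpu_percent": 95, "temperature_celsius": 40, "links_down": 1}
    print(breached(sample))
```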
Cacti
• Pros
  – It’s very useful to detect problems
  – It’s very useful to “see” the network while it’s working
  – It makes the 24x7 service easier
  – It simplifies the generation of monthly reports
  – Graph templates are useful
• Cons
  – Groups of users are hard to manage
  – The creation of graph templates requires time and dedication
  – The user interface works better when you don’t have a large amount of data.
perfSONAR
• We’re beginning to use it.
• Initially installed for the LHC project
• Uses the installable DVD version from RedIRIS
• Coordinated through RedIRIS
• Other tools, like NDT, are also installed so that our users can measure the network
  – Our NOC is subscribed to the DANTE E2ECU (End-to-End Coordination Unit) mailing list
perfSONAR
• Pros
  – Good for inter-domain monitoring of L2 circuits (LHC)
  – Very powerful if all the tools are used
• Cons
  – Installing it wasn’t easy at all…
SMARTxAC
• Traffic Monitoring System for Anella Científica (Sistema de Monitorització de Tràfic per l’Anella Científica).
• It’s a passive monitoring and analysis system, tailor-made for Anella Científica by the Advanced Broadband Communications Service of the Technical University of Catalonia (UPC-CCABA).
• Usable for other high-speed networks.
• Since 2003, SMARTxAC has been used for continuously monitoring Anella Científica.
• Passive splitters and cards for every external link.
SMARTxAC: Topology and splitters
[Diagram; labels: Campus Nord, Telvent, Special projects, Operator, Local connections, Splitters; Catalyst 6500 Level 2/3; Juniper M320 Level 3 (RedIRIS); Nortel Level 2 (RedIRIS); Catalyst 6500 Level 2; Catalyst 6500 Level 3; Capture servers (Endace cards), analysis and monitoring]
SMARTxAC
• Pros
  – It captures ALL the headers through the regular traffic links
  – Very useful to detect problems that happened hours ago
  – Traffic is classified
  – It can detect different types of application
• Cons
  – The 10 Gbps cards are very expensive
  – New interfaces require more programming and more cards
The NAM, Network Analysis Module
• It’s a module of the Catalyst 6500
• Similar to a SPAN port + a server with Ethereal/Wireshark
• It allows us to capture all the traffic in a given period
• The results help us to find the origin of attacks or security problems, black holes, etc.
• 2 simultaneous captures
Source: http://www.cisco.com
The NAM, Network Analysis Module
• Pros
  – Very easy to use (web-based interface)
  – Real-time analysis of what’s happening on the network
  – The capture can be saved in “ethereal format”
  – It can monitor physical and logical interfaces, like VLANs
  – It monitors ALL the traffic
  – Filters can be applied before the capture
• Cons
  – It’s a proprietary solution
  – It can only monitor interfaces of 1 Gbps or less
  – It’s used once a problem has started
Other tools
• MGEN to send large amounts of traffic on the links and check whether we can fill them with UDP traffic
• Direct access to some tools that our providers give us:
  – HP OpenView
  – Cacti statistics
  – Management of VLANs
• Iperf
• Netmate
• Pathrate
• Nagios
• Zabbix
• MTR
The most common incidents & requests
• Incidents:
  – Power cuts at the institution
  – Radio links & ADSL
  – Last-mile fibre cuts
  – Crazy firewalls…
  – DoS attacks
• Other requests
  – Multicast tests
  – New circuits
  – Routing
  – DNS
  – Redundancy
Conclusions
• Our RREN has to face the problems of small entities, big universities and research centres, and very important projects with dedicated lambdas that traverse several domains
• RT for incidents
• At least a database for data
• At least a monitoring tool
• At least an analysis tool
• New models with dark fibre require new management models for the NOC
No single tool
Thanks for your attention!
Questions? Suggestions?