KDDI - OpenStack Summit 2016 / Red Hat NFV Mini Summit
TRANSCRIPT
KDDI Research Inc. Proprietary and Confidential
Trouble prediction and detection based on the Distributed Monitoring & Analytics framework
Yuki Kasuya <yu-kasuya@kddi-research.jp>, KDDI Research
Agenda
1. Motivation / Problem
2. Solution / Architecture
3. Use case
4. Conclusion
Current operation style
Remote Operation Center
Development Division
Data Center
24 hours / 365 days: HW replacement
Daytime: software bug
• Require reliability
• Many operators needed
• High cost operation
  - change to low cost
Driver: Changing operation style
Reactive Operation
Periodic Operation
Proactive Operation
Cost Reduction
Agility
Proactive
Reactive
24/7 maintenance
9am - 5pm maintenance
Automation
Prevent faults before they occur
What is the key point?
• Auto-healing process
  1. Fault detection by the monitoring system
  2. Recovery plan by the OSS/Orchestrator
  3. Auto healing by the Orchestrator
For fast recovery, real-time fault detection / prediction is the key point.
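The three steps of the auto-healing process can be sketched as a small pipeline. This is only an illustration: the `Fault` type, the heartbeat threshold, the action names, and the node name `compute-01` are all hypothetical and do not come from the talk.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Fault:
    node: str  # hypothetical node identifier
    kind: str  # hypothetical fault class, e.g. "vm_unresponsive"


def detect_fault(node: str, metrics: Dict[str, float]) -> Optional[Fault]:
    """Step 1: the monitoring system flags a fault (toy threshold rule)."""
    if metrics.get("heartbeat_age_s", 0.0) > 3.0:
        return Fault(node=node, kind="vm_unresponsive")
    return None


def plan_recovery(fault: Fault) -> List[str]:
    """Step 2: the OSS/Orchestrator maps the fault to a recovery plan."""
    plans = {
        "vm_unresponsive": ["fence_vm", "restart_vm", "verify_service"],
    }
    # Unknown fault kinds fall back to a human operator.
    return plans.get(fault.kind, ["escalate_to_operator"])


def auto_heal(fault: Fault, plan: List[str],
              execute: Callable[[str, str], None]) -> None:
    """Step 3: the Orchestrator executes the plan's actions in order."""
    for action in plan:
        execute(fault.node, action)


# Record the actions instead of touching real infrastructure.
executed: List[str] = []


def record(node: str, action: str) -> None:
    executed.append(f"{node}:{action}")


fault = detect_fault("compute-01", {"heartbeat_age_s": 5.2})
if fault is not None:
    auto_heal(fault, plan_recovery(fault), record)
```

The time to recovery is bounded below by how quickly step 1 fires, which is why the detection latency matters most.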
Problems for proactive operation
[Diagram: centralized monitoring. Pollers on each NFVI node (VM, VNF) send verbose data, with delay, to a central collector, evaluator, notifier, and DB.]
• Centralized monitoring architecture
  - Difficult to achieve real-time (fine-grained) monitoring
  - High load from collecting a lot of data
• In general, delay in collecting data affects several areas. The time has come to consider enhancing the architecture.
Distributed Monitoring and Analytics (DMA)
• Distribute each function into the computing nodes
  - The monitoring process is completed within each computing node
  - Real-time (fine-grained) monitoring
  - Scales with the number of computing nodes
[Diagram: each NFVI node (VM, VNF) runs its own poller, collector, evaluator, analyzer, and DB; only analytics results (concise data) are sent on to the central collector, evaluator, notifier, and DB.]
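A minimal sketch of the per-node idea: the verbose samples stay on the computing node, and only a concise result leaves it. The function names mirror the diagram labels, but the record fields, the threshold, and the sample values are assumptions made for illustration.

```python
from statistics import mean
from typing import Dict, List


def collect(raw_samples: List[float]) -> List[float]:
    """Collector: gather verbose, fine-grained samples on the node itself."""
    return list(raw_samples)


def evaluate(samples: List[float], threshold: float) -> Dict[str, object]:
    """Evaluator/analyzer: reduce the samples to a concise result locally."""
    peak = max(samples)
    return {"mean": mean(samples), "max": peak, "alarm": peak > threshold}


def notify(node: str, result: Dict[str, object]) -> Dict[str, object]:
    """Only this small record is sent to the central side."""
    return {"node": node, **result}


# 1000 CPU-utilization samples (percent) containing one short spike.
samples = [20.0] * 1000
samples[500] = 100.0

message = notify("compute-01", evaluate(collect(samples), threshold=90.0))
```

In this toy case 1000 samples are reduced to one four-field record, and the spike is still reported because the evaluation ran on the full-resolution data.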
Architecture detail
[Diagram: DMA components on an NFVI node]
Poller/Notification: libvirt API, SNMP Get, SNMP Trap, OpenStack API, ...
Collector, Database, Meter Translator
Evaluator
Analytics Engine: Fault Detection (Prediction), Statistical Analysis, Alarm Correlation
Transmitter: Fault Alarm, Statistical Data
Guest: Perf. data (CPU/Memory...), (Alarm) Fault data
Host: Perf. data (CPU/Memory...), (Alarm) Fault data
Data flows: Perf. Data, Fault Data, ...
Target area
Centralized / Distributed
Fault: ✔ / ✔ (predict, silent)
Account: ✔
Performance: ✔ (Micro burst)
• Centralized
  - cloud platform
• Distributed
  - NFV
  - carrier grade network
Use Case 1: Prediction using machine learning
T. Niwa et al., “Universal Fault Detection for NFV using SOM-based Clustering,” APNOMS 2015.
Distributed machine learning
Demo @MWC2016
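For readers unfamiliar with self-organizing maps (SOM), the following is a generic, minimal sketch of SOM-based anomaly scoring. It is not the method of the cited paper: the 1-D map, the two features (imagined here as normalized CPU and memory utilization), the training schedule, and the 0.3 threshold are all assumptions chosen for illustration.

```python
import math
import random
from typing import List, Tuple

Vector = Tuple[float, float]


def distance(a: Vector, b: Vector) -> float:
    return math.hypot(a[0] - b[0], a[1] - b[1])


def train_som(data: List[Vector], units: int = 5, epochs: int = 20,
              seed: int = 0) -> List[Vector]:
    """Train a 1-D SOM whose weights start as copies of training samples."""
    rng = random.Random(seed)
    weights = [rng.choice(data) for _ in range(units)]
    for epoch in range(epochs):
        # Learning rate and neighbourhood radius shrink over time.
        frac = 1.0 - epoch / epochs
        lr = 0.5 * frac
        sigma = max(units / 2.0 * frac, 0.5)
        for x in rng.sample(data, len(data)):
            # Best-matching unit (BMU): the unit closest to the sample.
            bmu = min(range(units), key=lambda i: distance(weights[i], x))
            for i in range(units):
                # Units near the BMU on the map move more than distant ones.
                h = math.exp(-((i - bmu) ** 2) / (2.0 * sigma ** 2))
                w = weights[i]
                weights[i] = (w[0] + lr * h * (x[0] - w[0]),
                              w[1] + lr * h * (x[1] - w[1]))
    return weights


def quantization_error(weights: List[Vector], x: Vector) -> float:
    """Distance from x to its BMU; a large value means 'unfamiliar'."""
    return min(distance(w, x) for w in weights)


# Train on synthetic "normal" behaviour only.
rng = random.Random(1)
normal = [(rng.uniform(0.1, 0.3), rng.uniform(0.2, 0.4)) for _ in range(200)]
som = train_som(normal)

THRESHOLD = 0.3  # hypothetical anomaly threshold
normal_error = quantization_error(som, (0.2, 0.3))
anomaly_error = quantization_error(som, (0.9, 0.9))
is_anomaly = anomaly_error > THRESHOLD
```

Because the map is trained only on normal samples, a point far from everything seen in training has no nearby unit and receives a large quantization error.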
Use Case 2: Detect micro burst traffic
Finer performance data processing
M. Miyazawa et al., “In-network real-time performance monitoring with distributed event processing,” NOMS 2014.
• Centralized approach cannot detect the bursts.
• DMA can achieve x1000 finer monitoring.
[Figure: bandwidth utilization (%) versus time (sec), comparing the DMA approach with the centralized approach; annotated values: 70% and 18Mbps.]
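A short worked example of why coarse-grained averaging hides micro bursts. The link capacity, the sampling intervals (1 ms versus 1 s, chosen to match a x1000 difference in granularity), the burst shape, and the alarm threshold are assumptions for illustration, not measurements from the talk.

```python
from typing import List

LINK_CAPACITY_MBPS = 1000.0  # assumed 1 Gbps link
THRESHOLD_PCT = 70.0         # assumed alarm threshold


def utilization_pct(rate_mbps: float) -> float:
    return 100.0 * rate_mbps / LINK_CAPACITY_MBPS


def window_averages(samples: List[float], window: int) -> List[float]:
    """Average consecutive samples in fixed-size windows."""
    return [sum(samples[i:i + window]) / window
            for i in range(0, len(samples), window)]


# One second of traffic sampled every 1 ms:
# a 100 Mbps baseline with a single 10 ms burst at 900 Mbps.
per_ms = [100.0] * 1000
for t in range(400, 410):
    per_ms[t] = 900.0

# Fine-grained view: one utilization value per millisecond.
fine = [utilization_pct(r) for r in per_ms]
# Coarse view: one value per second, i.e. 1000 times coarser.
coarse = [utilization_pct(r) for r in window_averages(per_ms, 1000)]

fine_detects = any(u > THRESHOLD_PCT for u in fine)
coarse_detects = any(u > THRESHOLD_PCT for u in coarse)
```

The burst reaches 90% utilization at 1 ms granularity, but the 1 s average is only (990 x 100 + 10 x 900) / 1000 = 108 Mbps, or 10.8%, so the coarse view never crosses the threshold.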
Conclusion
• Change to proactive / low-cost operation
• Fault detection / prediction is the key point
• Distributed Monitoring and Analytics is suitable for proactive operation
Join us
• Start to discuss how to integrate DMA into OpenStack