Gerald Kunzmann, DOCOMO; Carlos Goncalves, NEC; Ryota Mibu, NEC
1
OPNFV Summit 2015
Doctor: Failure Detection and Notification for NFV
Gerald Kunzmann, DOCOMO
Carlos Goncalves, NEC
Ryota Mibu, NEC
2
Doctor Overview
• Goal
– Build fault management and maintenance framework
• Approach
– Identify requirement
– Gap Analysis
– Implementation work in Upstream (OpenStack)
– Integration and testing
• Status
– Initial Requirement study, architecture design, Gap analysis: Done
– Collaborative Development: On-going (3 merged Blueprints in OpenStack Liberty)
– Standardization Sync: On-going (by NFV member efforts, joint meeting)
3
Key Requirements as VIM
Immediate Notification
Consistent Resource State Awareness
Extensible Monitoring
Fault Correlation
4
Doctor Demo Overview
• Architecture: Application Manager, Virtualized Infrastructure Manager (VIM) = OpenStack, Virtualized Infrastructure (Virtual Compute, Virtual Storage, Virtual Network, Virtualization Layer, Hardware Resources)
• Service: Video Player, Streaming Server (ACT), Streaming Server (SBY)
• Failure: Host-A Down, VM-1 Down
• Detection without Doctor (few minutes)
• Detection with Doctor (1 second)
• Reaction: Switch Act-Sby, Quick Recovery
5
Fault Management Sequence
• Components: Monitor (several), Inspector, Controller (several), Notifier, Manager, Application, Virtualized Infrastructure (Resource Pool)
• Data: Alarm Conf., Resource Map, Failure Policy
• Sequence
– 0. Set Alarm
– 1. Raw Failure
– 2. Find Affected
– 3. Update State
– 4. Notify all
– 5. Notify Error
– 6-. Action
• Demo components: Log Monitor, State Reflector, Nova, Ceilometer, App Manager + Viewer, Streaming Server
• OpenStack release: Liberty
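The numbered sequence on this slide (0. Set Alarm through 6-. Action) can be walked through in a few lines of Python. This is only an illustrative sketch, not the demo code: the class and method names, the choice to keep the resource map inside the inspector, and the host and VM names are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Notifier:
    """Hypothetical notifier: holds the alarm configuration set by managers."""
    alarms: dict = field(default_factory=dict)  # vm name -> list of callbacks

    def set_alarm(self, vm, callback):
        # Step 0: a manager registers interest in one virtual resource.
        self.alarms.setdefault(vm, []).append(callback)

    def on_event(self, event):
        # Step 5: forward only error events to managers that set an alarm.
        if event["state"] == "error":
            for callback in self.alarms.get(event["vm"], []):
                callback(event)


@dataclass
class Controller:
    """Hypothetical controller: owns the state of virtual resources."""
    notifier: Notifier
    vm_state: dict = field(default_factory=dict)

    def update_state(self, vm, state):
        # Step 3: record the new state, then step 4: notify all state changes.
        self.vm_state[vm] = state
        self.notifier.on_event({"vm": vm, "state": state})


@dataclass
class Inspector:
    """Hypothetical inspector: maps a raw failure to affected resources."""
    resource_map: dict  # host name -> list of VM names (assumed layout)
    controller: Controller

    def on_raw_failure(self, host):
        # Step 1 arrives from a monitor; step 2: find the affected VMs.
        for vm in self.resource_map.get(host, []):
            self.controller.update_state(vm, "error")


def run_demo():
    actions = []  # step 6: actions taken by the manager
    notifier = Notifier()
    controller = Controller(notifier=notifier)
    inspector = Inspector(
        resource_map={"host-a": ["vm-1"], "host-b": ["vm-2"]},
        controller=controller,
    )
    notifier.set_alarm(
        "vm-1", lambda event: actions.append(("switch_to_standby", event["vm"]))
    )
    inspector.on_raw_failure("host-a")
    return controller.vm_state, actions
```

Calling `run_demo()` marks only the VM on the failed host as being in error and triggers the one registered action; the VM on the healthy host is untouched.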
6
Service Healing Process
• Host A: VM0, vNIC, vSwitch, NIC
• Host B: VM1, vNIC, vSwitch, NIC
• Other elements: Switch, Video Player, VM9, Streaming Server (two instances), App Manager
• Flows: Data Flow (Before), Data Flow (After), Control, Alarm Notification
7
Doctor Demo Screen
App Manager Service Control
App Manager Event/Action Log
VM Egress Stats (Zabbix)
VM List (Horizon)
Demo Operation Console
Video Player (with Doctor)
Video Player (without Doctor)
8
Doctor Demo
9
Doctor Blueprints in OpenStack Liberty Cycle
Project | Blueprint | Spec Drafter | Developer | Status
Ceilometer | Event Alarm Evaluator ✓ | Ryota Mibu (NEC) | Ryota Mibu (NEC) | Completed (Liberty)
Nova | New nova API call to mark nova-compute down ✓ | Tomi Juvonen (Nokia) | Roman Dobosz (Intel) | Completed (Liberty)
Nova | Support forcing service down ✓ | Tomi Juvonen (Nokia) | Carlos Goncalves (NEC) | Completed (Liberty)
Nova | Get valid server state | Tomi Juvonen (Nokia) | | Spec approved (Mitaka)
Nova | Add notification for service status change | Balazs Gibizer (Ericsson) | Balazs Gibizer (Ericsson) | Waiting for spec approval (Mitaka)
✓ Using in This Demo
10
Doctor BP Detail: Nova – Mark Nova-Compute Down
• Components: Host / Machine, Hypervisor, VM, nova compute, vSwitch, BMC, nova api, nova conductor, nova scheduler, nova DB, queue, External Monitoring Service, Monitoring Client
• EXISTING (periodic update): service state
• NEW API to update nova-compute service state: Force-down API
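A monitoring client could call the force-down API roughly as sketched below. The sketch assumes the Liberty-era compute API, where `PUT /os-services/force-down` is available from microversion 2.11. The endpoint URL, token and host name are placeholders, and the function only builds the request without sending it.

```python
import json
import urllib.request


def build_force_down_request(compute_endpoint, token, host, forced_down=True):
    """Build (but do not send) a request marking nova-compute on `host` as down."""
    body = json.dumps({
        "host": host,
        "binary": "nova-compute",
        "forced_down": forced_down,
    }).encode("utf-8")
    return urllib.request.Request(
        url=compute_endpoint.rstrip("/") + "/os-services/force-down",
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "X-Auth-Token": token,  # placeholder; obtain a real token from Keystone
            "X-OpenStack-Nova-API-Version": "2.11",  # microversion that added forced_down
        },
    )


# Placeholder endpoint and host name, for illustration only.
request = build_force_down_request(
    "http://controller:8774/v2.1", "TOKEN", "host-a"
)
```

Passing `forced_down=False` builds the request that clears the flag once the host has been repaired.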
11
Doctor BP Detail: Ceilometer - Event Alarm
• Sources: Cinder, Neutron, Nova
• Data: notification, event, sample, stats
• EXISTING (polling-based)
• NEW Shortcut (notification-based): Notification-driven alarm evaluator
• Consumers: Manager, Audit Service
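The difference between the two paths is that a notification-driven evaluator checks each incoming event against the alarm rules immediately, instead of waiting for the next polling interval. The sketch below illustrates that idea only and is not Ceilometer code. The alarm layout follows the event alarm format (`type: event` with an `event_rule`) as far as it is known here, the flat `traits` dictionary is a simplification, and the action URL is a placeholder.

```python
def make_event_alarm(name, event_type, query, alarm_action):
    """Return an event alarm definition (field names assumed from the event alarm format)."""
    return {
        "name": name,
        "type": "event",
        "event_rule": {"event_type": event_type, "query": query},
        "alarm_actions": [alarm_action],
    }


def event_matches(alarm, event):
    """Check one event against one alarm; only the 'eq' operator is handled here."""
    rule = alarm["event_rule"]
    if rule["event_type"] != event["event_type"]:
        return False
    traits = event.get("traits", {})
    for condition in rule["query"]:
        key = condition["field"].split(".", 1)[-1]  # "traits.state" -> "state"
        if condition.get("op", "eq") != "eq":
            return False
        if traits.get(key) != condition["value"]:
            return False
    return True


def evaluate(alarms, event):
    """Return the actions to fire for a single incoming event."""
    return [
        action
        for alarm in alarms
        if event_matches(alarm, event)
        for action in alarm["alarm_actions"]
    ]


alarm = make_event_alarm(
    name="vm-1-error",
    event_type="compute.instance.update",
    query=[
        {"field": "traits.instance_id", "op": "eq", "value": "vm-1"},
        {"field": "traits.state", "op": "eq", "value": "error"},
    ],
    alarm_action="http://app-manager.example/failure",  # placeholder URL
)
```

An event for `vm-1` whose state trait is `error` returns the alarm action at once; any other event returns an empty list.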
12
Who made this demo?
• Upstream OSS Community & Developer
– OpenStack Contributors including Doctor Developers
• OPNFV Doctor Team
– Doctor contributors who worked on requirement study, gap analysis and implementation design
• Doctor PoC Demo Team
– NTT DOCOMO
– NEC: Toshiaki Takahashi, Takahiro Suzuki, Ryuji Ishikawa, ...
13
Visit DOCOMO Booth, PoC Demo Zone