xiangping wu - huawei

www.huawei.com For CQR 2010 @ Vancouver 2010-6-8 Wu Xiangping & Pant Himanshu FMEA of Multi Layer Cooperation in All IP Network

Page 1: Xiangping Wu - Huawei

www.huawei.com

For CQR 2010 @ Vancouver, 2010-6-8

Wu Xiangping & Pant Himanshu

FMEA of Multi Layer Cooperation

in All IP Network

Page 2: Xiangping Wu - Huawei

Content

• Background & Problems

• FMEA for Multi Layer Cooperation Analysis

• Summary

Page 3: Xiangping Wu - Huawei

HUAWEI TECHNOLOGIES CO., LTD. Page 3

Background

• Growth in All-IP networks demands improvement in operations, administration and maintenance (OAM).

• Improved OAM has to speed up failure identification, fault diagnosis and service restoration, as real-time services are sensitive to network performance.

• A major challenge in improving OAM in All-IP networks is the lack of cooperation between the different layers of the network.

Page 4: Xiangping Wu - Huawei

The Challenges

• Fault Management

▪ Each layer of the All-IP network has its own fault detection and recovery mechanisms, but there is little cooperation among them, which often causes reliability problems.

▪ Too many alarms often leave network operators unsure of the exact cause of a failure, resulting in time-consuming troubleshooting.

• Network KPIs vs. Service QoE

▪ What are the important KPIs for the different network layers, and how do they affect each other?

▪ What is the relationship between network KPIs and Service QoE?

Page 5: Xiangping Wu - Huawei

An Example of Problem with Multi-Layer Cooperation

• In the event of a failure, e.g., a fiber cut, the restoration time in the optical network may exceed 50 ms due to the processing latency in the POS interface. This leads to reconvergence of the IP bearer layer, which in turn results in service impairment.

[Figure: optical path between sites A and B. Labels: OTU, OLP, POS, Metro WDM, Backbone WDM, Customer side interfacing.]

Mismatched timer settings in different layers can cause reliability problems.
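The timer mismatch can be illustrated with a minimal sketch. Only the 50 ms figure comes from the slide; the class, the field names, the other values and the idea of a single IP-layer hold-down timer are assumptions made for illustration, not a description of any particular equipment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerTimers:
    """Hypothetical timer budget for one protected link (all values in ms)."""
    optical_restoration_ms: float  # worst-case optical protection switching time
    pos_processing_ms: float       # extra latency added at the POS interface
    ip_hold_down_ms: float         # how long the IP layer waits before reacting


def ip_reconverges(timers: LayerTimers) -> bool:
    """Return True if the IP layer reacts before the optical layer has restored."""
    total_outage_ms = timers.optical_restoration_ms + timers.pos_processing_ms
    return total_outage_ms > timers.ip_hold_down_ms


# Illustrative values only: 50 ms is the restoration bound quoted on the slide.
matched = LayerTimers(optical_restoration_ms=40, pos_processing_ms=5, ip_hold_down_ms=50)
mismatched = LayerTimers(optical_restoration_ms=40, pos_processing_ms=20, ip_hold_down_ms=50)

print(ip_reconverges(matched))     # False
print(ip_reconverges(mismatched))  # True
```

In the second case the optical layer does recover, but the IP bearer layer has already started to reconverge, which is the situation the slide warns about.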

Page 6: Xiangping Wu - Huawei

KPI System Layered Architecture based on QoE

[Figure: layered KPI architecture. Levels: End-user Perceive Level, Service Application Level, Network Performance Level. Viewpoints: End-user Level, Operator Level, Vendor Level. Metrics: Service KQIs, Service KPIs, Net. KPIs, Carrier KPIs, Radio KPIs, Transmission KPIs, CS KPIs. Network elements: Node B, RNC, MSC/MGW.]

Which metrics should be measured for each of the network layers (radio, core, IP bearer and optical)?

Page 7: Xiangping Wu - Huawei

Content

• Background & Problems

• FMEA for Multi Layer Cooperation Analysis

• Summary

Page 8: Xiangping Wu - Huawei

FMEA Is a Good Way to Do the Analysis

• What will be the impact on the different network layers and on the end-users’ services?

• What kind of protection mechanisms in different network layers will be triggered to deal with the fault?

• What are the trigger conditions for these protection mechanisms?

• What kind of alarms will be triggered for different layers?

• How do these alarms traverse through the network layers?

• How should the alarms be correlated to restore the service with minimum impact on end-customers?
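One possible way to approach the correlation question is sketched below. It assumes a simple rule that the slides do not state: alarms raised within a short time window belong to the same fault, and the alarm from the lowest layer is treated as the root alarm. The layer keys, the `Alarm` class, the `correlate` function and the 10-second window are all hypothetical.

```python
from dataclasses import dataclass

# Layer order from lowest to highest, following the three layers named in the slides.
LAYER_ORDER = {"optical": 0, "ip_bearer": 1, "softswitch": 2}


@dataclass(frozen=True)
class Alarm:
    layer: str
    name: str
    time_s: float  # seconds since an arbitrary reference point


def correlate(alarms, window_s=10.0):
    """Group alarms raised within window_s of a group's first alarm.

    Returns a list of (root_alarm, group) pairs, where the root alarm is the
    lowest-layer alarm in the group (earliest first on ties). The window
    length is an assumed value, not one taken from the slides.
    """
    groups = []
    for alarm in sorted(alarms, key=lambda a: a.time_s):
        if groups and alarm.time_s - groups[-1][0].time_s <= window_s:
            groups[-1].append(alarm)
        else:
            groups.append([alarm])
    return [
        (min(group, key=lambda a: (LAYER_ORDER[a.layer], a.time_s)), group)
        for group in groups
    ]


alarms = [
    Alarm("softswitch", "SCTP Link Congestion", 6.2),
    Alarm("optical", "LOS", 0.0),
    Alarm("ip_bearer", "Link Down", 0.1),
]
root, group = correlate(alarms)[0]
print(root.name, len(group))  # LOS 3
```

A real correlation engine would also need topology information to decide which alarms share a path; the time window alone is only a rough stand-in for that.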

Page 9: Xiangping Wu - Huawei

FMEA: From Product to Network level

[Figure: failure modes and effects analysis (FMEA) for products (Detect Fail Mode; Severity, Occurrence Probability, Detection; Prioritize Risk; Action/Follow-up) carried over to the network level, where the columns are Fault/Event, Severity, Protection, Detection, Priority and Action/Follow-Up.]
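In product FMEA, the "Prioritize Risk" step is commonly done with a risk priority number (RPN), the product of the severity, occurrence and detection scores. The slide does not say which scoring scheme the authors use, so the sketch below assumes the common 1 to 10 scales; the failure modes and scores are made up for illustration.

```python
def risk_priority_number(severity: int, occurrence: int, detection: int) -> int:
    """Classic FMEA risk priority number: severity x occurrence x detection."""
    for score in (severity, occurrence, detection):
        if not 1 <= score <= 10:
            raise ValueError("scores are expected on a 1-10 scale")
    return severity * occurrence * detection


# Hypothetical failure modes with (severity, occurrence, detection) scores.
failure_modes = {
    "fan failure": (4, 6, 2),
    "power supply short": (9, 2, 5),
}

ranked = sorted(
    failure_modes,
    key=lambda mode: risk_priority_number(*failure_modes[mode]),
    reverse=True,
)
print(ranked)  # ['power supply short', 'fan failure']
```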

Page 10: Xiangping Wu - Huawei

A Practical Network for the Network FMEA

[Figure: example network drawn in three layers: Softswitch Layer, IP Bearer Layer, Optical Layer.]

Page 11: Xiangping Wu - Huawei

Network FMEA Process

[Figure: network FMEA process chart. Columns: Fault/Event, Optical Transport Net., IP Bearer Net., Softswitch Net.; each network column is divided into Trigger Condition, Alarm and Protection. An arrow across the chart reads "Increasing Time & Increasing Service Impact (# of customers affected)".]

Fault/Event: Interruption; Degradation

Optical Transport Net.: OTU receives no light; shut off laser; insert PRBS or all 1s downward; trigger optical layer protection; timing marks <50 ms, NA, >100 ms

IP Bearer Net.: LinkDown; trigger route reconvergence; packet loss in IP layer

Softswitch Net.: interruption beyond 6 s in single plane; interruption beyond 7.5 s in dual planes; packet loss beyond threshold in dual planes; SCTP link congestion; SCTP link switchover; device failure, service interruption

Alarms and error indications: LOS, LOF, AIS; B1/B2/B3 bit error; B1/B2/B3 BER reaches 10^-6
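The escalation along the chart's time axis can be sketched as a function of outage duration. The 100 ms, 6 s and 7.5 s values are read from the chart; how each threshold maps to a stage, and the function and label names, are assumptions. The chart also marks "<50 ms" for the optical layer, which the sketch treats simply as the optical layer acting first.

```python
def stages_reached(outage_s, planes_down=1):
    """List the stages an interruption of outage_s seconds would reach.

    planes_down selects the softswitch trigger condition: 6 s for a single
    plane, 7.5 s for dual planes (values taken from the chart; the mapping
    to stage labels is an assumption).
    """
    if planes_down not in (1, 2):
        raise ValueError("planes_down must be 1 or 2")
    softswitch_limit_s = 6.0 if planes_down == 1 else 7.5
    stages = [
        (0.0, "optical: protection triggered"),
        (0.100, "ip_bearer: link down, route reconvergence"),
        (softswitch_limit_s, "softswitch: interruption trigger condition met"),
    ]
    return [label for limit_s, label in stages if outage_s > limit_s]


for outage in (0.03, 0.5, 6.5):
    print(outage, stages_reached(outage))
```

Each additional stage means a higher layer has reacted, which matches the chart's arrow of increasing time and increasing service impact.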

Page 12: Xiangping Wu - Huawei

Example: Fiber Cut

Fault/Event: Fiber cut

Cause: Accidental, ploughing or digging by utility staff

Severity: Loss of connectivity. Service down-time exceeds 2 seconds.

Protection:

▪ Optical layer: line protection or wavelength protection

▪ IP layer: reroute and re-convergence only if the protection and recovery time on the optical layer exceeds 50 ms

▪ Softswitch layer: SCTP link switchover

Detection (Alarms):

Optical Layer Alarms: LoS, LoF, & AIS

IP Layer Alarms:

▪ Link Down

▪ After > 100 ms delay on optical layer

Softswitch Layer Alarms:

▪ SCTP Link Congestion

▪ After > 6 s delay on single layer

▪ SCTP Link switchover

Priority: High. Impacts a large number of end-customers.

Action/Follow-Up:

▪ Fix immediate fiber cut

▪ Prevent future fiber cuts by burying the cable deeper in the ground and by improving liaison with utilities to prevent accidents
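The fiber cut analysis can be captured as a structured record so that entries for different faults are comparable. The sketch below is one possible representation; the class, field names and layer keys are hypothetical, and the values are copied from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class NetworkFmeaEntry:
    """One row of a network-level FMEA (field names are illustrative)."""
    fault_event: str
    cause: str
    severity: str
    protection: dict  # layer -> protection mechanism
    alarms: dict      # layer -> list of alarm names
    priority: str
    follow_up: list = field(default_factory=list)


fiber_cut = NetworkFmeaEntry(
    fault_event="Fiber cut",
    cause="Accidental, ploughing or digging by utility staff",
    severity="Loss of connectivity; service down-time exceeds 2 seconds",
    protection={
        "optical": "line protection or wavelength protection",
        "ip": "reroute and re-convergence if optical recovery exceeds 50 ms",
        "softswitch": "SCTP link switchover",
    },
    alarms={
        "optical": ["LoS", "LoF", "AIS"],
        "ip": ["Link Down"],
        "softswitch": ["SCTP Link Congestion", "SCTP Link switchover"],
    },
    priority="High",
    follow_up=[
        "Fix immediate fiber cut",
        "Bury the cable deeper in the ground",
        "Improve liaison with utilities to prevent accidents",
    ],
)


def layers_without_alarms(entry):
    """Layers that have a protection mechanism but no alarm recorded."""
    return sorted(set(entry.protection) - set(entry.alarms))


print(layers_without_alarms(fiber_cut))  # []
```

A check like `layers_without_alarms` is one simple use of such records: it flags layers where protection would act without any alarm being defined for the operator.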

Page 13: Xiangping Wu - Huawei

Content

• Background & Problems

• FMEA for Multi Layer Cooperation Analysis

• Summary

Page 14: Xiangping Wu - Huawei

Summary

• Cooperation among network layers is important for fault management and QoE assessment of All-IP networks.

• FMEA can be a useful tool for the cooperation analysis, especially for improving fault management.

• When a fault or event occurs, the operators can clearly determine:

▪ What protection and restoration actions have been taken in each of the network layers to deal with this fault or event;

▪ What alarms will be caused by this fault or event, and which one is the root alarm, so that the operators know which actions to take for troubleshooting.

• The analysis results can also be used for network design and planning.

Page 15: Xiangping Wu - Huawei

Thank you

www.huawei.com