5-ptn trouble shooting
-
PTN Troubleshooting
Address: No.5, Dongxin Road, Guandong Science Park, Wuhan/ postal code: 430073
E-mail: [email protected]
www.fiberhome.com.cn
-
Agenda
1. OAM and Its Application
2. Typical Troubleshooting
3. Instrument Detection
4. PTN Engineering Cases
Page2
-
OAM
OAM (Operation, Administration and Maintenance) refers to the production, organization and management activities that ensure the normal, safe and effective operation of networks and services.
According to the actual needs of network operators, OAM is usually divided into 3 types:
Operation
Operation mainly covers analyzing the daily network state, monitoring alarms and controlling performance.
Administration
Administration covers analyzing, forecasting, planning and configuring the daily network and its services.
Maintenance
Maintenance covers daily activities such as testing and trouble management of the network and its services.
Page3
-
OAM Classification
According to functions, OAM can be classified into:
Trouble management: trouble detection, trouble classification, trouble location and trouble notification, etc.
Performance management: performance monitoring, performance analysis, and performance management and control, etc.
Protection and recovery: protection system and recovery system, etc.
Page4
-
OAM Terms
Management Entity (ME)
An entity that requires management; it represents the relationship between two MEPs. In T-MPLS, the basic ME is a T-MPLS path. MEs may be nested, but overlapping between two or more MEs is not allowed.
ME Group (MEG)
A group of MEs that meet the following conditions:
Belong to the same management domain;
Belong to the same MEG layer;
Belong to the same point-to-point or point-to-multipoint T-MPLS connection.
For a point-to-point T-MPLS connection, a MEG contains one ME. For a point-to-multipoint connection with N (N>1) end points, a MEG contains N MEs.
Page5
-
OAM Terms
MEG End Point (MEP)
It marks the beginning and end of a MEG and can generate and terminate OAM packets.
MEG Intermediate Point (MIP)
A MIP cannot generate OAM packets, but it can take specific actions on some OAM packets, and it forwards passing T-MPLS frames transparently. MEPs and MIPs are designated by the management plane or the control plane.
MEG Level (MEL)
When multiple MEGs are nested, the MEL distinguishes the OAM packets of the different MEGs. OAM packets in tunnels are processed by increasing the MEL at the source and decreasing it at the destination.
Page6
-
OAM of PTN in Different Network Areas
[Figure: client equipment at both ends connects to the PTN network through UNI access links. Inside the PTN network, the LSP layer runs hop by hop, LSPs (tunnels) span the network, and the PW (pseudo-wire) / LSP (tunnel) carries the client service end to end.]
Page7
-
OAM of PTN on Different Network Layers
The T-MPLS channel layer carries PW information and represents a VC channel. The T-MPLS path layer represents a T-MPLS tunnel, i.e. an LSP. The T-MPLS section layer is the section sub-layer of T-MPLS and is responsible for detection at the link layer. In T-MPLS OAM, these correspond to OAM detection on three layers: TMC, TMP and TMS, respectively.
Page8
-
OAM Frame Structure of PTN
Frame layout (each row is 4 octets wide; bits are numbered 1 to 8 within each octet):
Octets 1-4: Label (13) | MEL | S | TTL
Octets 5-8: Function Type | Res | Version | Flags | TLV Offset
Following octets: OAM PDU payload area
Last octet: End TLV
OAM information is carried in dedicated OAM frames.
OAM frame: composed of an OAM PDU and the outer forwarding label stack entries.
As with other data packets, the contents of the forwarding label stack entries ensure that OAM frames are forwarded correctly along the path.
Each MEP or MIP identifies and processes only the OAM packets of its own layer.
Page9
-
OAM Frame Format of PTN
Label (14): 20-bit label with a value of 14, identifying the OAM label;
MEL: 3-bit MEL, in the range 0-7;
S: 1-bit S with a value of 1, indicating the bottom of the label stack;
TTL: 8-bit TTL with a value of 1, or the hop count + 1 from the MEP to the designated MIP.
The 5th octet is the 8-bit OAM function type, indicating the function type of the OAM PDU.
In addition, some OAM PDUs must designate a target MEP or MIP, i.e. a MEP or MIP ID. Depending on the function type, the ID takes one of the following three formats:
48-bit MAC address;
13-bit MEG ID and 13-bit MEP/MIP ID;
128-bit IPv6 address.
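The field widths listed above (20 + 3 + 1 + 8 bits) fill exactly one 4-octet label stack entry. The following is a minimal Python sketch of packing and unpacking such an entry. It assumes the fields are laid out most-significant-bit first in the order Label, MEL, S, TTL, as in an ordinary MPLS label stack entry; the function names are hypothetical and are not part of any equipment software.

```python
import struct


def build_oam_label_entry(label: int, mel: int, s: int, ttl: int) -> bytes:
    """Pack Label (20 bits), MEL (3 bits), S (1 bit) and TTL (8 bits) into 4 octets.

    Assumption: network byte order, fields placed MSB first in the order given.
    """
    if not (0 <= label < 2**20 and 0 <= mel <= 7 and s in (0, 1) and 0 <= ttl <= 255):
        raise ValueError("field value out of range")
    word = (label << 12) | (mel << 9) | (s << 8) | ttl
    return struct.pack("!I", word)


def parse_oam_label_entry(entry: bytes) -> dict:
    """Split a 4-octet label stack entry back into its four fields."""
    if len(entry) != 4:
        raise ValueError("a label stack entry is exactly 4 octets")
    (word,) = struct.unpack("!I", entry)
    return {
        "label": word >> 12,        # upper 20 bits
        "mel": (word >> 9) & 0x7,   # next 3 bits, range 0-7
        "s": (word >> 8) & 0x1,     # 1 means bottom of the label stack
        "ttl": word & 0xFF,         # lowest 8 bits
    }
```

For example, an entry with label 14, MEL 7, S 1 and TTL 1 packs to the octets 00 00 EF 01, and parsing those octets returns the same four values.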
Page10
-
OAM Mechanism in T-MPLS Network
OAM functions in T-MPLS network can be classified into alarm OAM function, performance OAM function and other OAM functions.
OAM technology: trouble management
Function: trouble detection, trouble verification, trouble location and trouble announcement.
Purpose: cooperate with the network management system to improve the reliability and usability of the network.
Main tools and methods: continuity check (CC), alarm indication signal (AIS), remote defect indication (RDI), link tracking (LT), loopback detection (LB), lock (LCK), test (TST), client signal fail (CSF).
OAM technology: performance management
Function: performance monitoring, performance analysis and performance management control; start the trouble management system when performance falls.
Purpose: maintain the quality of service and the operating efficiency of the network.
Main tools and methods: frame loss measurement, frame delay measurement, delay jitter measurement.
Page11
-
Introduction to OAM Type - CC
CC - Continuity and Connectivity Check
It can be used for trouble management, performance monitoring and protection switching, and for detecting signal continuity between any pair of MEPs in a MEG.
CC information is carried in CV frames, whose main parameters include:
MEG ID
The MEP's own ID
IDs of all target MEPs
Transmission period: 3.3ms/10ms/100ms/1s
If no CV frame is received within 3.5 times the transmission period, a LOC (loss of continuity) alarm is raised.
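The LOC rule above is simple arithmetic on the configured period. Here is a minimal Python sketch, assuming times are tracked in milliseconds; the function names are hypothetical and only illustrate the 3.5-period rule.

```python
# Transmission periods listed on the slide, in milliseconds.
CV_PERIODS_MS = (3.3, 10.0, 100.0, 1000.0)


def loc_timeout_ms(period_ms: float) -> float:
    """LOC is declared after 3.5 transmission periods without a CV frame."""
    return 3.5 * period_ms


def loc_raised(last_cv_rx_ms: float, now_ms: float, period_ms: float) -> bool:
    """Return True if the silence since the last CV frame exceeds the LOC timeout."""
    return (now_ms - last_cv_rx_ms) > loc_timeout_ms(period_ms)
```

With a 10 ms period the timeout is 35 ms, so a gap of 30 ms raises no alarm while a gap of 40 ms does.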
Page12
-
Introduction to OAM Types - AIS, LB and LCK
AIS: Alarm Indication Signal
When troubles are detected on the server layer, the client layer is informed.
FDI frames are used to carry the indication.
Transmission period: 1s
If no further AIS message is received within 3.5 times the transmission period, the AIS alarm is cleared.
LB: Loopback
A MEP originates the loopback request packet; the point that executes the loopback can be a MEP or a MIP.
It is used for verifying the connectivity between MEPs or between a MEP and a MIP.
It is also used for bidirectional on-line or off-line diagnosis between MEPs, such as testing bandwidth throughput and bit error rate.
Frames used: LBM and LBR.
LCK: Lock
It is used for informing the MEP at the opposite end that the MEP at the local end has interrupted the normal service as required by management.
The MEP at the opposite end can thus judge whether a service interruption is planned or caused by troubles.
Page13
-
Introduction to OAM-TST, CSF, LM and DM
TST: Test
A test request signal sent by one MEP to another MEP.
Used for unidirectional on-line or off-line diagnostic tests.
Unlike LB, which is bidirectional, TST is unidirectional.
CSF: Client Signal Fail
Notifies the far end that the client signal has failed at the ingress of the local end.
LM: Frame Loss Measurement
It is used for measuring the unidirectional or bidirectional frame loss rate from one MEP to another MEP.
CV frames are used for the measurement. SD: performance degradation, with an accuracy of one thousandth at most.
DM: Packet Delay and Packet Delay Variation Measurement
It is used for measuring the packet transfer delay and delay variation from one MEP to another MEP.
Unidirectional: the clocks at the transmitting and receiving ends must be synchronized. The source end transmits a DM frame, and the unidirectional delay is calculated when the destination end receives it. 1DM frames are used for the measurement.
Bidirectional: the source end transmits a DM request frame (DMM), and the destination end loops back a DM response frame (DMR) on receiving it. When the source end receives the DMR, the bidirectional delay is calculated.
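The LM and DM measurements described above reduce to a few subtractions. The Python sketch below illustrates them under stated assumptions: the counter and timestamp arguments are hypothetical inputs, and subtracting the responder's processing time in the bidirectional case is an optional refinement that the slide does not mention.

```python
def frame_loss_ratio(tx_frames: int, rx_frames: int) -> float:
    """Fraction of frames sent by one MEP that did not arrive at the other MEP."""
    if tx_frames <= 0:
        raise ValueError("tx_frames must be positive")
    return (tx_frames - rx_frames) / tx_frames


def one_way_delay(tx_time: float, rx_time: float) -> float:
    """Unidirectional delay (1DM); only meaningful if both clocks are synchronized."""
    return rx_time - tx_time


def two_way_delay(dmm_tx_time: float, dmr_rx_time: float,
                  responder_processing: float = 0.0) -> float:
    """Bidirectional delay (DMM/DMR), measured entirely on the source clock.

    responder_processing is an assumed optional correction for the time the
    destination spends between receiving the DMM and sending the DMR.
    """
    return (dmr_rx_time - dmm_tx_time) - responder_processing
```

Losing 1 frame out of 1000 gives a ratio of 0.001, which is the one-thousandth granularity mentioned for SD. The bidirectional measurement needs no clock synchronization because both timestamps are taken at the source end.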
Page14
-
Introduction to OAM-APS, MCC, SCC and SSM
APS: Automatic Protection Switching
It is defined by G.8131/G.8132 and used for transmitting APS information.
MCC: Management Communication Channel
It is used for transmitting management channel information at the TMS layer.
SCC: Signaling Communication Channel
It is used by a MEP for transmitting control plane information to its counterpart MEP.
SSM: Synchronization Status Message
It is defined by G.8261 and used for transmitting SSM frames.
Page15
-
OAM Operation Corresponding to MEP and MIP
[Figure: customer equipment at both ends connected through PTN equipment, with the configuration items of MEPs and MIPs]
MEP Configuration
MEG ID
MEG Layer
Local MEP ID
MEP ID of opposite end
Sending cycle of CC message
CC active
MIP Configuration
MEG ID
MEG Layer
Local MIP ID
Page16
-
Positions of Corresponding Layers in PTN Equipment
Page17
-
OAM Alarm
MMG: a CV frame with a mismatching MEG ID is received.
UNM: the MEG ID matches, but the source MEP ID of the received CV frame does not match the locally expected value.
UNP: the MEG ID and MEP ID match the expected values, but the period of the received CV frames does not match the local transmission period.
LOC: no correct CV frame is received within 3.5 consecutive transmission periods.
RDI: used by a MEP to inform its counterpart MEP that it has detected a defect; it is only used for bidirectional T-MPLS connections. The indication is initiated by the defective MEP and sent to its counterpart MEP periodically until the defect is cleared.
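The MMG, UNM and UNP alarms above form an ordered series of checks on each received CV frame. The following Python sketch shows that order; the class and field names are hypothetical stand-ins for the MEP configuration items (MEG ID, opposite-end MEP ID, CC sending cycle), not an actual equipment data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MepConfig:
    meg_id: str          # locally configured MEG ID
    peer_mep_id: int     # expected MEP ID of the opposite end
    cc_period_ms: float  # local CC sending cycle


@dataclass(frozen=True)
class CvFrame:
    meg_id: str
    source_mep_id: int
    period_ms: float


def classify_cv_frame(cfg: MepConfig, frame: CvFrame) -> Optional[str]:
    """Return the alarm a received CV frame would trigger, or None if it is correct."""
    if frame.meg_id != cfg.meg_id:
        return "MMG"  # mismatching MEG ID
    if frame.source_mep_id != cfg.peer_mep_id:
        return "UNM"  # MEG ID matches, unexpected source MEP ID
    if frame.period_ms != cfg.cc_period_ms:
        return "UNP"  # IDs match, unexpected period
    return None
```

Only frames for which this function returns None count as correct CV frames for the LOC timer.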
Page18
-
Summary
OAM on a PTN network path can be divided into three OAM layers: TMS, TMP and TMC, where:
OAM at the TMS layer is mainly used for wrapping protection and MCC communication;
OAM at the TMP layer is mainly used for LSP 1:1/1+1 protection;
OAM at the TMC layer is closest to the service layer, so OAM frames at the TMC layer can be used for simulation tests of service continuity.
Page19
-
Agenda
1. OAM and Its Application
2. Typical Troubleshooting
3. Instrument Detection
4. PTN Engineering Cases
Page20
-
Common Troubles of PTN
Trouble I: Unavailable service
Processing steps:
1. Because service packets are first encapsulated into a PW when they are transmitted over the PTN network, the continuity of the line can be judged from the continuity of the PW layer;
2. If the channel on the PW layer is not continuous, use the performance method to determine on which line-side section the service interruption occurs;
3. If the channel on the PW layer is continuous, the trouble lies at the source or destination site where the service is added or dropped. In this case, focus on checking the alarms, status, performance and configuration of the cards related to the source and destination.
Page21
-
Trouble Example
Trouble description:
An FE service has been provisioned from port FE1 of card ESJ1 in slot 1F at Office 1 to port FE3 at Office 9, and the service is reported to be unavailable.
Page22
-
Trouble Example
According to the first processing step, detect the continuity of the PW layer first.
Detection means: CV frames in the PW configuration.
CV frames on the PW layer are disabled by default, so they must be enabled at both the source end and the destination end in the PW configuration.
Location of the PW configuration in the card configuration:
1. 660 low-level services are configured on XCU;
2. 660 high-level services are configured on the corresponding UNI card;
3. All 640 services are configured on XCU;
4. All 620 services are configured on PSSB.
Page23
-
Trouble Example
For the service in this example, CV frames must be enabled in the PW configuration of both ESJ1 at Office 1 and the 620 equipment. The corresponding configurations are as follows:
Page24
-
Trouble Example
At this point, one of the following 2 conditions may appear:
1. No TMC_LOC alarm is present on the service cards of the source and destination sites;
2. TMC_LOC alarms are present on the service cards of both the source and destination sites, or on the service card of one of them.
Description:
Condition 1 indicates that the line channel is continuous, so the trouble lies at the source or destination site where the service is added or dropped. In this case, focus on checking the card configuration mapping; if the mapping is correct, use the performance method to locate the trouble point.
Condition 2 indicates that the line channel is not continuous; the trouble may lie at an intermediate straight-through site, or at the source or destination site where the service is added or dropped.
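The two conditions above amount to a small decision rule on the TMC_LOC alarm state of the two end sites. A minimal Python sketch follows; the function name and the returned hint strings are hypothetical and only restate the slide's guidance.

```python
def classify_tmc_loc(source_has_loc: bool, dest_has_loc: bool) -> tuple:
    """Map the TMC_LOC alarm state at source and destination to condition 1 or 2."""
    if not source_has_loc and not dest_has_loc:
        # Condition 1: line channel is continuous.
        return (1, "check card configuration mapping at source and destination")
    # Condition 2: alarm on either or both sites, line channel is not continuous.
    return (2, "check straight-through sites and both end sites with the performance method")
```

An alarm on only one of the two sites is still condition 2, because the rule treats any TMC_LOC as evidence of a discontinuous channel.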
Page25
-
Condition 1
1. First check whether there are abnormal alarms on the service card. Alarms about configuration mismatch and card faults must be handled. These two kinds of alarms are caused by wrong configurations, so mainly inspect whether any of the labels, PW labels, VLAN IDs, simulation serial numbers, LSP-IDs and PW-IDs in use are duplicated.
2. If no obvious alarm for wrong configuration is found, inspect the card configuration mapping, mainly including the interface mode, correlation configuration, VPWS configuration, PW configuration and TUNNEL table configuration.
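The duplicate check in step 1 can be done mechanically once the identifiers in use have been exported. The Python sketch below assumes the configuration is available as a dictionary of lists; the key names and values are hypothetical examples, not an NMS export format.

```python
from collections import Counter


def find_duplicates(values) -> list:
    """Return the values that occur more than once, in sorted order."""
    counts = Counter(values)
    return sorted(value for value, n in counts.items() if n > 1)


def check_card_identifiers(config: dict) -> dict:
    """Report, per identifier type, which values are used more than once."""
    report = {}
    for name, values in config.items():
        duplicates = find_duplicates(values)
        if duplicates:
            report[name] = duplicates
    return report
```

For example, if the PW labels in use are 400, 401 and 400, the report flags PW label 400, while identifier types without repeats do not appear in the report.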
Page26
-
Condition 1
Mapping rules are:
Operating mode of interfaces (determines whether the interfaces can operate normally)
Interface correlation mode (determines whether services can be accepted at the interfaces)
Flow classification value in the VPWS configuration (determines whether the service can enter the PW according to its VLAN value or simulation serial number)
NNI in the VPWS configuration (the PW-ID assigned to the service is found at the NNI interface)
PW configuration block (the ingress/egress PW labels and the TUNNEL/LSP index are obtained by looking up the entry for the PW-ID)
TUNNEL table (the corresponding LABEL value is obtained)
Reference data configuration in XCU
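The mapping rules above describe a chain of lookups: flow classification value to PW-ID, PW-ID to PW labels and tunnel index, tunnel index to label. The Python sketch below walks that chain and reports the stage at which it breaks. The table layouts, key names and values are assumptions made for illustration only.

```python
def resolve_service(vlan_id: int, vpws: dict, pw_blocks: dict, tunnels: dict) -> dict:
    """Follow VLAN -> PW-ID -> PW block -> TUNNEL table and return the labels found."""
    if vlan_id not in vpws:
        raise LookupError(f"VPWS: no flow classification entry for VLAN {vlan_id}")
    pw_id = vpws[vlan_id]
    if pw_id not in pw_blocks:
        raise LookupError(f"PW configuration: no block for PW-ID {pw_id}")
    pw = pw_blocks[pw_id]
    tunnel_index = pw["tunnel_index"]
    if tunnel_index not in tunnels:
        raise LookupError(f"TUNNEL table: no entry for index {tunnel_index}")
    return {
        "pw_id": pw_id,
        "ingress_pw_label": pw["ingress_label"],
        "egress_pw_label": pw["egress_label"],
        "tunnel_label": tunnels[tunnel_index],
    }
```

A missing entry at any stage points to the configuration block that should be inspected first.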
Page27
-
Condition 2
If the TMC_LOC alarm still exists after CV frames at the TMC layer have been enabled on the equipment at both the source and destination ends, the channel is not continuous. Troubleshoot site by site starting from the source site, according to the following steps (mainly using the performance method):
1. Check whether packets are received at the UNI interface, and whether any of them are lost, damaged, errored or fragmented.
2. If packets are received at the corresponding UNI interface, check whether the corresponding TMC carries the corresponding packets.
3. Check whether the TMP corresponding to the TMC carries the corresponding packets.
4. Check whether the corresponding TMP at the straight-through site has the corresponding packets to forward.
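The steps above follow the packet counters along the path and stop at the first point where traffic disappears. A minimal Python sketch of that walk is shown here; the checkpoint names and counter values are hypothetical and would in practice be read from the performance views of each card.

```python
from typing import Optional


def first_silent_point(counters: list) -> Optional[str]:
    """Return the first checkpoint, in path order from the source, with zero packets.

    counters is an ordered list of (checkpoint_name, packet_count) pairs.
    Returns None if every checkpoint has seen packets.
    """
    for name, count in counters:
        if count == 0:
            return name
    return None
```

If the UNI interface and the TMC show packets but the TMP does not, the walk returns the TMP checkpoint, which is where the inspection should concentrate.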
Page28
-
Condition 2: 620-source Site
Normally, if interface LAN3 has received packets, its corresponding TMC should also have packets. If the corresponding TMC has no packets, the service has not been added successfully.
As shown in the figure above, TMC1 corresponds to the service with a label value of 400.
The current performance in the figure above indicates that the service has been added successfully.
Page29
-
Condition 2: 640- straight-through Site
The total packets received by TMP1 are equal to those received by S15.1.
The total packets sent out by TMP1 are equal to those sent out by S15.2.
As shown in the figure above, the hexadecimal label value of TMP1 is 0x190, which corresponds to decimal 400.
The performance in the figure above indicates that the service with a label value of 400 passes straight through from 14.1 to 15.1.
Page30
-
Condition 2: 660-Straight-through Site
The service through Office 2 is straight-through. The cards it passes are XSJ2 and ESJ1, and their performance is as follows:
As shown in the figure, the TMP1 channel of XSJ2 shows a total number of packets received, and TMP1 of ESJ1 shows a total number of packets sent out. If the label values of TMP1 on the two cards are the same, the service passes straight through from XSJ2 to ESJ1.
Page31
-
Agenda
1. OAM and Its Application
2. Typical Troubleshooting
3. Instrument Detection
4. PTN Engineering Cases
Page32
-
Topology Description
As shown in the figure below, Offices 1, 2 and 3 use 660 equipment, Offices 4, 5 and 6 use 640 equipment, and Offices 7, 8 and 9 use 620 equipment. A 100M service is provisioned from Office 1 to Office 8.
Page33
-
Topology Description
Use a SmartBits tester to transmit 6000 unicast packets.
Page34
-
660-ESJ1 Card (Source Site)
The performance on the 100M card of the 660 (source site) is as follows:
Therefore, the service flow on the ESJ1 card is:
Page35
-
660-XCUJ1 Card (Source Site)
The performance on the XCUJ1 card is as follows:
The service flow on the XCUJ1 card is as shown in the figure on the right:
Page36
-
660-XSJ2 Card (Source Site)
The performance on the XSJ2 card is as follows:
The service flow on the XSJ2 card is:
Page37
-
660 (Straight-through Site)
The service through Office 2 is straight-through. The cards it passes are XSJ2, XCUJ1 and GSJ2, and their performance is as follows:
Page38
-
660 (Straight-through Site)
Therefore, the service flow in Office 2 is as follows:
Page39
-
640 (Straight-through Site)
The service on 640 in Office 4 is straight-through and its
performance is as follows:
Page40
-
640 (Straight-through Site)
The service flow in Office 4 is as follows:
Page41
-
620 (Destination Site)
The 620 in Office 8 is the site where the service is dropped, and its performance is as follows:
The service flow is:
Page42
-
Agenda
1. OAM and Its Application
2. Typical Troubleshooting
3. Instrument Detection
4. PTN Engineering Cases
Page43
-
Case 1 Connection between S1J1 of Equipment 660 and 0155-8 of Equipment ASON - 1
[Phenomenon description]
As shown in the figure, the 2G service of a project is carried over a PTN network and an MSTP network, in which card S1J1 of PTN equipment 660 is connected to card 0155-8 of the ASON equipment. Card S1J1 on the PTN equipment reports an LP-RDI alarm, and the BSC side reports that the service channel is unavailable.
Page44
-
Case 1 Connection between S1J1 of Equipment 660 and 0155-8 of Equipment ASON - 2
[Processing procedure]
When an equipment loopback is set at the line interface of card S1J1 on the connected PTN equipment, the LP-RDI alarm on card S1J1 disappears and the service from the base station to the PTN equipment is normal, so an equipment fault on the PTN side is ruled out.
When an equipment loopback is set on card 0155-8 of the connected MSTP equipment, the LP-RDI alarm on the card likewise disappears and the service channel at the BSC side is normal, so an equipment fault on the MSTP side is ruled out.
The time slots connecting the PTN equipment and the ASON equipment are checked and found to be correct, so a configuration problem is ruled out.
When a line loopback is set on card S1J1 of the connected PTN equipment, the BSC side reports that the service channel is unavailable.
When a hard loop is made on site at the PTN equipment, the LP-RDI alarm on card S1J1 disappears and the service from the base station to the PTN equipment is normal; when a hard loop is made on card 0155-8 of the MSTP equipment, the BSC side reports that the service channel is unavailable. The problem is therefore located at the optical port of 0155-8.
After the optical module at the corresponding optical port of 0155-8 is replaced, the BSC side still reports that the service channel is unavailable. After the service is re-provisioned on a different optical port, the whole service returns to normal.
Page45
-
Case 1 Connection between S1J1 of Equipment 660 and 0155-8 of Equipment ASON - 3
[Problem analysis]
1. The LP-RDI alarm is an optical card alarm, so once the service channel becomes unavailable, the LP-RDI alarm appears on card S1J1 of the PTN equipment, but no LP-RDI alarm appears on the optical port card 0155-8.
2. When software equipment loopbacks are set on the cards of the PTN and MSTP equipment, hardware problems of the optical port cannot be located; only hard loops can locate the problem to the optical port and its module.
[Supplementary information]
1. In one project, interface 0155-8 of 780B equipment was connected to an RNC interface on many occasions. Each time, the optical card reported a remote bit error alarm and a high bit error ratio alarm also appeared on the RNC. After the optical port of card 0155-8 was replaced, the alarms disappeared.
Page46
-
Case 2 Eliminating Unnecessary VLAN Configuration Items in PTN Subnet Cross-Connect - 1
[Phenomenon description]
An FE card is inserted in slot 12. First, 4 services are downloaded from port line1 of slot 12 at Office 1 to port line1 at Office 4, which creates 4 VLAN configuration items on the corresponding 100M cards. If the services with VIDs 3, 4 and 5 are then deleted, the VLAN configuration items of these 3 services are not removed and remain in the VLAN configuration of the card, which causes unnecessary trouble in later inspections. The VLAN configuration items of nonexistent services should therefore be deleted.
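The leftover items described above are simply the VLAN IDs present in the card configuration that no longer belong to any active service. A minimal Python sketch of that comparison follows. The function name is hypothetical, and because the slide does not state the VID of the fourth service, the value 2 used in the example is an assumption.

```python
def stale_vlan_items(card_vlan_items: set, active_service_vids: set) -> list:
    """Return VLAN IDs configured on the card that no active service uses."""
    return sorted(card_vlan_items - active_service_vids)
```

With card items {2, 3, 4, 5} and only the service with VID 2 still active, the function returns [3, 4, 5], which are the items to delete manually.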
Page47
-
Case 2 Eliminating Unnecessary VLAN Configuration Items in PTN Subnet Cross-Connect - 2
[Processing procedure]
1. Open the subnet cross-connect interface and right-click the corresponding NE.
Page48
-
Case 2 Eliminating Unnecessary VLAN Configuration Items in PTN Subnet Cross-Connect - 3
2. Choose VLAN configuration. The VLAN-ID values of the corresponding slots are listed here; select the VLAN-ID values that are no longer needed and delete them.
Page49
-
Case 2 Eliminating Unnecessary VLAN Configuration Items in PTN Subnet Cross-Connect - 4
3. Perform the same operations on the VLAN configurations of the related cards of the NEs at the opposite end, then close the dialog boxes and re-download the configurations. The unused configurations are thus eliminated.
[Trouble detection and analysis]
When cross-connect data is deleted in the NMS, the NMS keeps the related VLAN configuration and considers the VLAN to be still in use, so it does not delete it. It must therefore be deleted manually. The VLAN information of the related cards can be created, modified and deleted through the VLAN management of the NEs.
Page50
-
Case 3 Repetition of VLAN Configuration Leading to Interruption of Other 100M Data Transmission Services - 1
[Figure: topology with equipment 660-1 and 660-2]
[Phenomenon description]
After a new subnet cross-connect configuration was added and downloaded, users reported that another 100M service between Site 1 and Site 2 was interrupted. This service runs from interface line4 in slot 17 of Site 2 to interface line12 in slot 1B of Site 1.
Page51
-
Case 3 Repetition of VLAN Configuration Leading to Interruption of Other 100M Data Transmission Services - 2
[Processing procedure]
1. Check the alarms on the corresponding cards; no abnormal alarms are found.
2. Check the subnet cross-connect configuration; the service is in transmitting mode, so the cross-connect configuration is correct.
3. Check the card configuration; the operating mode of the ports is correct, the ports are enabled, the PVID configuration is normal and VMAN of the LAN interface is enabled.
4. Check the VLAN settings; LAN port 12 is found to correspond to 2 VLAN values in the "VLAN configuration" of the 100M card.
5. Delete the redundant VLAN values in the subnet cross-connect VLAN settings and in the VLAN configuration of the card, then download and save the cross-connect configuration. The service recovers.
[Trouble analysis]
1. The cause is as follows: when cross-connect data is deleted in the subnet cross-connect, the subnet database keeps the related VLAN configuration. The VLANs of the deleted services are not deleted in time, so when a new subnet cross-connect is added and the configuration is downloaded and saved, the VLANs kept in the subnet cross-connect overwrite the card configuration, which makes services unavailable.
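The symptom found in step 4, one LAN port mapped to more than one VLAN value, can be searched for across a whole card. The Python sketch below assumes the VLAN configuration is available as (port, VLAN ID) pairs; the function name and the VLAN values in the example are hypothetical.

```python
from collections import defaultdict


def ports_with_multiple_vlans(entries) -> dict:
    """Group (port, vlan_id) pairs by port and keep ports that map to several VLANs."""
    by_port = defaultdict(set)
    for port, vlan_id in entries:
        by_port[port].add(vlan_id)
    return {port: sorted(vids) for port, vids in by_port.items() if len(vids) > 1}
```

For entries (12, 10), (12, 20) and (4, 30), the result is {12: [10, 20]}, flagging LAN port 12 as the one to clean up.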
Page52
-
Thank you!