transcript
All Rights Reserved © 2007, Alcatel-Lucent
1678 MCC · Operation and Maintenance
Christian Wittig US/S
Functional Overview, Redundancy mechanism and Hints
New HW LAX + 1GBE HW + 1GBE Commissioning + 1GBE Commands on
SC
HO-Matrix and Synchronization issue
Documentation in intranet
CT IP Tunnel / EP Dialog
FLC Replacement
The 1678 MCC is divided into the following Subsystems (Functional
Blocks):
Input/Output (I/O) Subsystem
Interfaces: HK (Housekeeping), RA (Remote Alarm), RL (Rack Lamp
functionality)
Data Communication Channel (DCC) Subsystem
Interfaces to the external environment are:
Synchronization interfaces from external sources to the
Synchronization subsystem and vice versa
Craft Terminal Interface
Data Communication Channel (DCC) to/from other SDH Network
elements
Standard SDH interfaces are used as link from/to 1678 MCC Main
Shelf to/from
OED 1670 SM / 1662 SMC / LO Extension shelf:
Clock derivation/distribution is done using the data links
Product Overview · Function and Features
All Rights Reserved © Alcatel-Lucent 2007
1678 MCC Operation and Maintenance
3.7 Main Shelf Architecture
The Control and SW Subsystem is responsible for supervising and
commanding the entire 1678 MCC.
It is divided into two levels:
First Level Control to provide all control functions on system
level, e.g. control interface to OS
Second Level Control to perform all functions on shelf level, e.g.
control of boards
ISSB (Intra Shelf Serial Bus) Backpanel LAN:
It enables the communication between the different processor
modules of the control system:
DCR: Data Communication Router
EM: Embedded System Module
3.10 Matrix Redundancy
To ensure transmission, even in case of board failure of the matrix
boards, the HO and LO matrix boards are 1+1 EPS protected in the
1678 MCC main shelf.
In the normal situation one matrix is in active state, the other in
hot standby state.
Signal copies are delivered to both matrix copies A and B of the HO
matrix.
If the signals have to be switched on high order level, each HO
matrix board transmits one copy back to the I/O board.
From each HO matrix copy the signals, which have to be switched on
low order level, are transmitted to both copies of the LO
matrix.
After low order switching, the signals are transmitted back to both
copies of the HO matrix.
Each I/O board receives redundant signals from each matrix copy and
selects one of them.
In case of signal failure there is a hitless switch to the signal
transmitted by the other matrix copy.
This matrix copy becomes the active one and requests all I/O boards
to switch to it.
If some I/O boards still choose the signal from the previous matrix
copy, because of the signal quality, both matrix copies stay in the
active state.
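The selection behaviour described above can be illustrated with a small model. This is a minimal sketch, not the 1678 MCC implementation: the function name `matrix_states` and the string labels are assumptions made for illustration. It derives the state of each matrix copy from the copy that each I/O board currently selects.

```python
from typing import Dict, List


def matrix_states(selections: List[str]) -> Dict[str, str]:
    """Derive the state of matrix copies A and B.

    `selections` holds, per I/O board, the matrix copy ("A" or "B")
    whose signal that board currently selects (illustrative model).
    """
    if not selections:
        raise ValueError("at least one I/O board selection is required")
    if any(choice not in ("A", "B") for choice in selections):
        raise ValueError("each selection must be 'A' or 'B'")

    # A copy counts as active as long as at least one I/O board uses it.
    return {
        copy: "active" if copy in selections else "hot standby"
        for copy in ("A", "B")
    }


if __name__ == "__main__":
    print(matrix_states(["A", "A", "A"]))  # normal situation
    print(matrix_states(["B", "B", "A"]))  # one board still selects copy A
```

In this model a mixed selection leaves both copies active, which corresponds to the case where some I/O boards keep the signal of the previous copy because of its quality.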
VSR Link boards
VC-12 / VC-3 level
Assembler: Function to perform the high order path termination and
adaptation
The LO matrix is realized as a square matrix on the LAX40 / LAX20
matrix boards.
The capacity of the LO matrix board is 40 / 20 Gbit/s, that
corresponds to 256 / 128 STM-1 equivalents.
If the LO matrix is used, up to 256 STM-1 equivalents from each of
the remaining 14 I/O slots can still be connected to the HO
matrix
Only 1 pair can be equipped: It is 1+1 EPS protected
Switching entities are: LO VC-3 and VC-12.
Several Types of connections can be established:
unidirectional point-to-point (protected or unprotected)
bi-directional point-to-point (protected or unprotected)
unidirectional point-to-multipoint
From receive direction (RX) the signals are delivered to the HO
matrix and then transmitted to the LO matrix.
The structured HO signals (VC-4) are connected to an assembler to
perform the high order path termination and adaptation
function.
After switching the low order signals, they are connected to the
assembler for adaptation and termination function to create a HO
VC-4 signal
The HO VC-4 signals are then transmitted back to the HO
matrix.
In combination with the LO matrix, the OED 1662 SMC can be
used.
It is connected to the main shelf using STM-16 I/O
interfaces.
The internal connection inside the OED 1662 SMC is fixed and cannot
be changed by the operator.
Since the connection between OEDs and main shelf is MSP 1+1
protected, the I/O capacity of the main shelf is reduced
accordingly.
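The LO matrix capacity figures can be cross-checked against the STM-1 line rate of 155.52 Mbit/s (the standard SDH rate; it is not stated on the slides). A short sketch with an illustrative function name:

```python
STM1_MBIT_S = 155.52  # STM-1 line rate in Mbit/s (standard SDH value)


def stm1_equivalents_to_gbit(count: int) -> float:
    """Convert a number of STM-1 equivalents into Gbit/s."""
    return count * STM1_MBIT_S / 1000.0


if __name__ == "__main__":
    for board, count in (("LAX40", 256), ("LAX20", 128)):
        rate = stm1_equivalents_to_gbit(count)
        print(f"{board}: {count} STM-1 equivalents = {rate:.2f} Gbit/s")
```

Both results lie just below the nominal 40 and 20 Gbit/s quoted for the boards.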
DCR
EM
DCN
OS
IP Address Rules
The following addresses are in a common IP subnet (at least
/28)
emServ
emCong
dcrServ
dcrCong
activeDCR (always configured on the active FLC)
localCT (reserved address for installation purposes)
dcnGateway (if DCN connection is via LAN)
The address DCRHostId/32 is needed if IP-over-DCC and/or OSPF is
used
Used as local address of all unnumbered IP-over-DCC links
Used as OSPF router Id
The address GmreNode/32 is needed if GMRE is used
Used for GMRE neighbor-to-neighbor communication
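The "/28 at least" rule follows from the number of addresses listed: seven addresses do not fit into a /29, which offers only six usable hosts. The sketch below checks this with Python's `ipaddress` module. The subnet and all addresses are hypothetical values from the documentation range 192.0.2.0/24, and the helper names are illustrative.

```python
import ipaddress
from typing import Dict, List

# Hypothetical example plan (documentation address range, not a real NE).
SUBNET = "192.0.2.16/28"
ADDRESSES: Dict[str, str] = {
    "emServ": "192.0.2.17",
    "emCong": "192.0.2.18",
    "dcrServ": "192.0.2.19",
    "dcrCong": "192.0.2.20",
    "activeDCR": "192.0.2.21",
    "localCT": "192.0.2.22",
    "dcnGateway": "192.0.2.30",
}


def usable_hosts(subnet: str) -> int:
    """Host addresses in the subnet, excluding network and broadcast."""
    return ipaddress.ip_network(subnet).num_addresses - 2


def outside_subnet(subnet: str, addresses: Dict[str, str]) -> List[str]:
    """Names of all addresses that are not part of the given subnet."""
    network = ipaddress.ip_network(subnet)
    return [
        name
        for name, address in addresses.items()
        if ipaddress.ip_address(address) not in network
    ]


if __name__ == "__main__":
    print("usable hosts in /28:", usable_hosts(SUBNET))
    print("outside subnet:", outside_subnet(SUBNET, ADDRESSES))
```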
NE Management
IP mandatory for
Equipment Provisioning (TCP based application)
Time Synchronization (NTP)
Behavior on External LAN
Internal and external traffic separated via VLAN setup on on-board
LAN SWs
Default: no VLAN tagging on external LAN
No internal IP addresses on external LAN
No dedicated Router necessary per NE
Rapid Spanning Tree Protocol (RSTP) running on on-board LAN
SWs
For DCN redundancy reasons, external LAN ports to be interconnected
via (RSTP capable) LAN-Switching equipment
Bootp used during initial installation
Management may be via
FLC is installed with SW R4.2 and correct IP addresses
Platform Release (CORE-01J)
Application Release SKY42-05K
Stored in Backup
Basic Framework
hoSlca_1_dcrs|100.1.10.1|100.1.203.4|00:00:00:00:00:0A|MX640GEA
hoSlca_1_dcrc|100.1.10.20|100.1.203.5|00:00:00:00:00:0A|MX640GEA
hoSlcb_1_dcrs|100.1.11.1|100.1.203.4|00:00:00:00:00:0B|MX640GEB
hoSlcb_1_dcrc|100.1.11.20|100.1.203.5|00:00:00:00:00:0B|MX640GEB
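The four entries above are pipe-separated records. The slides do not name the fields, so the field names in the following sketch (`name`, `address`, `second_address`, `mac`, `board`) are assumptions chosen from the apparent content. It shows one way to read such lines for comparison between both FLCs.

```python
from typing import List, NamedTuple

SAMPLE = """\
hoSlca_1_dcrs|100.1.10.1|100.1.203.4|00:00:00:00:00:0A|MX640GEA
hoSlca_1_dcrc|100.1.10.20|100.1.203.5|00:00:00:00:00:0A|MX640GEA
hoSlcb_1_dcrs|100.1.11.1|100.1.203.4|00:00:00:00:00:0B|MX640GEB
hoSlcb_1_dcrc|100.1.11.20|100.1.203.5|00:00:00:00:00:0B|MX640GEB
"""


class ScEntry(NamedTuple):
    # Field names are assumptions; the source does not label the columns.
    name: str
    address: str
    second_address: str
    mac: str
    board: str


def parse_entries(text: str) -> List[ScEntry]:
    """Split each non-empty line at '|' into exactly five fields."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        parts = line.split("|")
        if len(parts) != 5:
            raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
        entries.append(ScEntry(*parts))
    return entries


if __name__ == "__main__":
    for entry in parse_entries(SAMPLE):
        print(entry.name, entry.address, entry.board)
```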
Execute to compare both FLC at once
appl@emServ(appl)$ start
< <equip:EMType>S</equip:EMType>
appl@emCongA3(appl)$ PersSync
NVRAM access successfully ended!
CS parameters
Note: Avoid changing FLC IP address after first installation
1.bin
CS Parameters
bash
tools
<theNetworkPartIP type="string">10.227.225.96</theNetworkPartIP>
<theNetworkPartNumBits type="string">28</theNetworkPartNumBits>
<theDefaultGateway type="string">10.227.225.97</theDefaultGateway>
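A quick consistency check of these three CS parameters is to confirm that the default gateway lies inside the configured network. The sketch below parses the fragment with Python's standard library. The wrapping element `csParameters` is a hypothetical root added only so that the fragment parses; the real file layout is not shown on the slides.

```python
import ipaddress
import xml.etree.ElementTree as ET

# 'csParameters' is a hypothetical wrapper element for this illustration.
FRAGMENT = """
<csParameters>
  <theNetworkPartIP type="string">10.227.225.96</theNetworkPartIP>
  <theNetworkPartNumBits type="string">28</theNetworkPartNumBits>
  <theDefaultGateway type="string">10.227.225.97</theDefaultGateway>
</csParameters>
"""


def read_cs_network(xml_text: str):
    """Return (network, gateway, gateway_in_network) from the fragment."""
    root = ET.fromstring(xml_text.strip())
    network_ip = root.findtext("theNetworkPartIP")
    prefix_bits = root.findtext("theNetworkPartNumBits")
    gateway = ipaddress.ip_address(root.findtext("theDefaultGateway"))
    # strict=True raises ValueError if host bits are set in the network IP.
    network = ipaddress.ip_network(f"{network_ip}/{prefix_bits}", strict=True)
    return network, gateway, gateway in network


if __name__ == "__main__":
    network, gateway, ok = read_cs_network(FRAGMENT)
    print(f"network {network}, gateway {gateway}, gateway inside: {ok}")
```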
In case of EP not available: It is possible to change CS
parameters
Set CS Parameters
CS parameters
SW Installation
What is the meaning of files PrelimCSSRepository.xml and
Installconfig.xml ?
CSServer
CSServer
Gigabit Ethernet
Ethernet/GFP-F Mapping
Core header
Payload header
Gigabit Ethernet
Commissioning of 1GBE
Follow the commissioning documentation for the related release on the
NPI homepage or the customer documentation homepage
Setup VCG using ISA port configuration menu
Create unidirectional CrossConnection Loop from/to VC4vTTP of the
1GBE port on test
Connect Ethernet Testset and check that L2 traffic is
transported
Test optical output power of GBE port and check that range is
correct according to Commissioning protocol
to Setup Ethernet port
Boot up HO-SC
telnet 100.1.10.1
appl@mgb1em3(LogFiles)$ telnet 100.1.10.1
Application software for SC1678mccX (SDH) R. 03.01.59
Based on generic SW R. 02.01.13
SC Protocol version: 11.6.1.0.0.0
SMX Protocol Version: 10.2.0.0.0.0
The image is located on DCR Board on each FLC:
/dcrroot/tftpboot/NE_1678MCC/SC_SCM/ Image78MCC
Date: Oct 11 2005
Pigato EPLD version not available on PQ2 SCM board V1.14
Micro Revision Number : 81
Ram Size Configured : 472Mbytes
Application start address : 0x100000
Application end address : 0x1800000
LIB: ldr_lib_ec - REL: 'RWL-PQ2-EMC''V8.1.0P110'P9 (Oct 11 2005)
LIB: llc2_lib - REL: 'CS-PQ1-GEN-LLC2''V8.1.0'P29 (Mar 30 2006)
LIB: sec_lib - REL: 'KS-PQ2-SMC''V8.1.0P110'P18 (Jul 11 2006)
LIB: ks_lib - REL: 'KS-PQ2-GMC''V8.1.0P110'P19 (Sep 8 2006)
LIB: libbsp - REL: 'KS-PQ2-GMC''V8.1.0P110'P19 (Sep 8 2006)
decValue in ppb (parts per billion) -> / 1000 -> in ppm
Alarm: SSF (Server Signal Failure) on port, SF (Signal Failure) on
Timing Source (Sync View) FO (Frequency Offset)
Reason: Timing source not working or hardware fault (faulty quartz);
double-check with the bsw command
Action: repair/remove Timing Source or replace MX board
Check Synchronization on HO-MX
Difference between Systemclock (T0) of active Matrix (used)
<-> Output Frequency (T0) of passive Matrix (not used)
decValue * 0.0226 -> in ppm
Alarm: SSM (Synchronization Source Mismatch) if Difference
decValue > 16 (0.3616 ppm)
Reason: Hardware (different from the problems above)
Action: It cannot be determined which MX board causes the problem
(sender or receiver problem). The alarm is always reported by the
passive matrix, but the problem may be caused by the other one.
Replace the passive matrix first. If this is not successful,
replace the active matrix too (procedure required!).
Note: A newly plugged-in board reports this alarm until the CRUs are
synchronized. This can take up to 45 min (normally 10-15 min).
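The conversions used on these synchronization slides (ppb to ppm, counter value times 0.0226 to ppm, SSM threshold of 16 counts) can be written down as a small sketch. The function and constant names are illustrative and not part of any 1678 MCC tool.

```python
PPM_PER_COUNT = 0.0226      # scaling factor given on the slide
SSM_THRESHOLD_COUNTS = 16   # alarm if the difference exceeds this value


def ppb_to_ppm(ppb: float) -> float:
    """Convert parts per billion into parts per million."""
    return ppb / 1000.0


def counts_to_ppm(dec_value: int) -> float:
    """Convert a decimal counter difference into ppm."""
    return dec_value * PPM_PER_COUNT


def ssm_alarm(dec_value: int) -> bool:
    """True if the difference is above the SSM threshold."""
    return dec_value > SSM_THRESHOLD_COUNTS


if __name__ == "__main__":
    for value in (10, 16, 17):
        print(value, f"{counts_to_ppm(value):.4f} ppm", ssm_alarm(value))
```

The threshold of 16 counts corresponds to 16 * 0.0226 = 0.3616 ppm, matching the value on the slide.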
Difference between Systemclock (T0) <-> heated 10 MHz
quartz
hexValue -> decValue * 0.0226 -> in ppm
0x7ff-hexValue -> decValue * 0.0226 -> in ppm
Alarm: FO (Frequency Offset) (raised if > 10.0 ppm, cleared <
9.2 ppm)
Reason: Hardware (faulty quartz)
Action: replace MX board
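The two conversion formulas and the raise/clear thresholds of the FO alarm can be sketched as follows. The slide does not say when the direct formula or the `0x7ff - hexValue` formula applies, so both are given as separate helpers. Comparing the magnitude of the offset is an assumption, and all names are illustrative.

```python
from typing import Iterable, List

PPM_PER_COUNT = 0.0226
FO_RAISE_PPM = 10.0   # alarm raised above this offset
FO_CLEAR_PPM = 9.2    # alarm cleared below this offset


def direct_ppm(hex_value: str) -> float:
    """hexValue -> decValue * 0.0226 -> ppm."""
    return int(hex_value, 16) * PPM_PER_COUNT


def complement_ppm(hex_value: str) -> float:
    """0x7ff - hexValue -> decValue * 0.0226 -> ppm."""
    return (0x7FF - int(hex_value, 16)) * PPM_PER_COUNT


def fo_alarm_next(active: bool, offset_ppm: float) -> bool:
    """Next alarm state with hysteresis between 9.2 and 10.0 ppm."""
    magnitude = abs(offset_ppm)  # assumption: the sign does not matter
    if not active and magnitude > FO_RAISE_PPM:
        return True
    if active and magnitude < FO_CLEAR_PPM:
        return False
    return active


def fo_alarm_history(offsets_ppm: Iterable[float]) -> List[bool]:
    """Alarm state after each measurement, starting without alarm."""
    states, active = [], False
    for offset in offsets_ppm:
        active = fo_alarm_next(active, offset)
        states.append(active)
    return states


if __name__ == "__main__":
    print(fo_alarm_history([9.5, 10.1, 9.5, 9.1]))
```

Between 9.2 and 10.0 ppm the alarm keeps its previous state, which avoids toggling when the offset hovers near the limit.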
Configurable on ports 1…8
42 x DCC-M + 16 x DCC-R per system
DCC Application
2 x DCC-M per STM-N ring (east – west)
21 STM-N rings per 1678
IF/IB Signalling for GMPLS
Management over DCC
Management of NE2 is possible using NE1 as gateway NE for IP and
OSI
In EP dialog Gateway field is empty for NE2
Example:
Menu Comm.Routing -> Interface Configuration -> OSPF
OSPF is needed for automatic IP routing through DCC
Enable OSPF for LAPD port and Local Ethernet
OSPF set on Local Eth.
OSPF set on LAPD
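Overview statistics
Columns: Nr | Category | May | June | July | August | Alarms | Solution
Categories: Exchange of MX640; Question/Info
Alarms: LossOfPointer/SSF/RMAF/RMBF
Solution: Solved by SW Reset of DCR, LAX or Process Restart
Cell values as extracted (row/column assignment not recoverable): 1, 3, 2, 1, 1, 1, 4, 2, 1, 2, 3, 7
AUG Total 24 / PSC 5
May 2007 / TOTAL 25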
N
200707001898
1678MCC
Alarms test on interconnection was not possible because alarm
reporting was delayed
Level 2
DEUTSCHE TELEKOM AG
N
200707004299
1678MCC
After we install new card in slot 14 we got URU alarm on all ports
on the card in slot 13.
1-1661212
Level 2
N
Unknown
1-XKF1I
N
200707011730
1678MCC
Problem Info - Ho-Trails 6041, 6042, 6044 and 6078 report Transport
Failure, Portalarms are : F016.X3/ModVc4#87, 88 , 91 and 107 :
Server Signal failure. No traffic impacted
Level 2
Level 2
N
Unknown
1-1ZT41M
N
200707012240
1678MCC
Spare Part Req: 3AL81429AAAD01 AR Creation -- Alcatel - (SPARE)
Request ::: BT Reference: TXF03644166
1-1666811
Level 2
Level 2
N
Unknown
1-1ZT41M
N
200707015855
1678MCC
Problem Info - B005.X3 shows LAN failure on all matrix card and on
FLC cards.
Level 2
Level 2
N
Unknown
1-HKA0W
N
200707022328
1678MCC
We have a timing problem on this crossconnect N002.X3 Please
investigate.
Level 2
DEUTSCHE TELEKOM AG
N
200707023000
1678MCC
The 1678MCC with the ID 0038-10_05_L13_012 reports Redundant Matrix B
failure from slot 11 and Redundant Matrix A failure from slot 10
(both MX640). The Internal Link Monitor shows a permanent
communication problem with slot 9 (P16S16).
Level 2
- Trouble shooting transmission issues
- Sync Problems solved without HW Change
- hanging processes
- wrong configuration
PSC link
IDEA Database
1) Wrong implementation of Workaround in SW3.2P3
Caused by a short occurrence of MS-SSF (MS-AIS). When the workaround
interferes, the STM1 sends LOF (alarm list: RDI on MSTTP)
Action: Remove/restore Board
RST B04
The board will lose its traffic at board level for 30 s
Fix: 3.2 P5 / R4.2
Transmission problems and misbehaviour
Mismatch of the db-Info of both SCs. It will be found by HealthCheck.
The consequence is that a HiOrder-Matrix-Switch leads to a traffic
interruption of 20 s on all boards.
Action:
EPS switch HiOrder-Matrix (leads to a 20 s interruption on all
boards)
Reset passive SC
Degraded Signal on MSTTP,
OR
OR
3) Dafodil-ASIC hanging pointer processor
Check registers on active HO-SC
bsw ispbRD 9 2 140
bsw ispbRD 10 2 140
bsw ispbRD 9 3 140
bsw ispbRD 10 3 140
bsw ispbRD 9 5 283
bsw ispbRD 9 6 283
bsw ispbRD 9 7 283
bsw ispbRD 9 8 283
bsw ispbRD 10 5 283
bsw ispbRD 10 6 283
bsw ispbRD 10 7 283
bsw ispbRD 10 8 283
Result: 0xff ff ff fe (wrong)
0xff ff ff ff (correct)
Action:
Fix: 3.2 P4
If the Overhead links are not correct, the HiOrder-Matrix-Switch will
lead to a power reset on several or all IO-Boards and to a traffic
interruption on these boards
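The twelve register reads and the expected result can be collected in a small helper, for example to paste the commands into a telnet session and to check the copied results afterwards. This is a sketch only: the meaning of the three `bsw ispbRD` arguments is not explained on the slide, so the variable names `first` and `second` are neutral placeholders, and both function names are illustrative.

```python
from typing import List

EXPECTED = 0xFFFFFFFF  # "0xff ff ff ff" is the correct result


def overhead_link_commands() -> List[str]:
    """Rebuild the command list from the slide in the same order."""
    commands = []
    for second in (2, 3):
        for first in (9, 10):
            commands.append(f"bsw ispbRD {first} {second} 140")
    for first in (9, 10):
        for second in (5, 6, 7, 8):
            commands.append(f"bsw ispbRD {first} {second} 283")
    return commands


def register_ok(result: str) -> bool:
    """Compare a result such as '0xff ff ff fe' with the expected value."""
    return int(result.replace(" ", ""), 16) == EXPECTED


if __name__ == "__main__":
    for command in overhead_link_commands():
        print(command)
    print(register_ok("0xff ff ff ff"), register_ok("0xff ff ff fe"))
```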
All SC/LAX Interfaces pingable from both EM and DCR ?
ping 100.1.10.1 (HO-MX left)
ping 100.1.10.20 (HO-MX left)
ping 100.1.11.1 (HO-MX right )
ping 100.1.11.20 (HO-MX right)
ping 100.1.18.1 (left LAX)
ping 100.1.18.20 (left LAX)
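The ping checks can be run in one go with a short script. This is a sketch under assumptions: it presumes a Unix-style `ping` that accepts `-c 1`, which may differ on the FLC, and the function names are illustrative. The probe function is passed in as a parameter so that the logic can be tried without network access.

```python
import subprocess
from typing import Callable, Dict, List

# Addresses and labels as listed on the slide.
SC_LAX_INTERFACES: Dict[str, str] = {
    "100.1.10.1": "HO-MX left",
    "100.1.10.20": "HO-MX left",
    "100.1.11.1": "HO-MX right",
    "100.1.11.20": "HO-MX right",
    "100.1.18.1": "left LAX",
    "100.1.18.20": "left LAX",
}


def ping_once(address: str) -> bool:
    """Send one echo request; assumes a ping that supports '-c 1'."""
    result = subprocess.run(
        ["ping", "-c", "1", address],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def unreachable(probe: Callable[[str], bool] = ping_once) -> List[str]:
    """Return all listed addresses for which the probe fails."""
    return [address for address in SC_LAX_INTERFACES if not probe(address)]


if __name__ == "__main__":
    for address in unreachable():
        print(f"not pingable: {address} ({SC_LAX_INTERFACES[address]})")
```

The check has to be run from both the EM and the DCR, as the slide asks.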
tools
Alarm ICP not active for process ICHdlr ?
log
tail -f scSwitch.log
24.09.2007 08:11:18 : HoMx (1, 4, 10, 0, 0) passive: HoMx (1, 4, 11, 0, 0) Alarm LSSC/CSF = 1
24.09.2007 08:13:51 : HoMx (1, 4, 10, 0, 0) set EquState: e_EquState_Downloading=>e_EquState_Active
24.09.2007 08:13:51 : HoMx (1, 4, 10, 0, 0) passive: HoMx (1, 4, 11, 0, 0) Alarm LSSC/CSF = 0
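To see how long an alarm was active, the `Alarm ... = 1` and `Alarm ... = 0` entries of scSwitch.log can be extracted with a regular expression. The sketch below works on the three sample entries shown above. The pattern is derived from these lines only and may not cover other entry types, and the function name is illustrative.

```python
import re
from datetime import datetime
from typing import List, Tuple

SAMPLE_LOG = """\
24.09.2007 08:11:18 : HoMx (1, 4, 10, 0, 0) passive: HoMx (1, 4, 11, 0, 0) Alarm LSSC/CSF = 1
24.09.2007 08:13:51 : HoMx (1, 4, 10, 0, 0) set EquState: e_EquState_Downloading=>e_EquState_Active
24.09.2007 08:13:51 : HoMx (1, 4, 10, 0, 0) passive: HoMx (1, 4, 11, 0, 0) Alarm LSSC/CSF = 0
"""

# Timestamp, any text, then "Alarm <name> = <0|1>" at the end of the line.
ALARM_LINE = re.compile(
    r"^(?P<stamp>\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2}) : "
    r".* Alarm (?P<alarm>\S+) = (?P<state>[01])$"
)


def alarm_events(log_text: str) -> List[Tuple[datetime, str, bool]]:
    """Return (timestamp, alarm name, raised) for every alarm entry."""
    events = []
    for line in log_text.splitlines():
        match = ALARM_LINE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group("stamp"), "%d.%m.%Y %H:%M:%S")
            events.append((stamp, match.group("alarm"), match.group("state") == "1"))
    return events


if __name__ == "__main__":
    raised, cleared = alarm_events(SAMPLE_LOG)
    print(raised[1], "active for", cleared[0] - raised[0])
```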
4) Each LAX is loaded correctly ?
telnet 100.1.18.1
ICP/CSF on LAX, HO-MX troubleshooting
5) All SCs are declared in the CS configuration on both FLCs
start
view /etc/hosts
Craft Terminal
Craft Terminal
-> F-Interface has to be connected to Active FLC
-> Using the Craft Terminal as PCCT via LAN is independent of the
state of the FLC
Craft Terminal
Start CT
Appl.
-> Do not forget to declare the PCCT User in EP Dialog !
PCCT1 NSAP 540072872203010010088010
-> Enter IP of active EM
-> Enter SW Release
Equipment Provisioning without USM
It is possible to open Equipment Provisioning on the Service Laptop
without Equipment View
Delete Persistency
start
FLC Replacement
1) Install the new FLC with the same SW, IP and network parameters
as on the active FLC
2) Start the application on the newly installed FLC
3) Check the processes on this FLC. The FLC should become passive
[ Open Equipment Provisioning, CS parameters and click “OK” ]
After supervision comes back, the reinitialization of the CS Server
has written all necessary persistency data onto the passive FLC.
Verify this with the PersSync command and check the CS persistency
on the passive FLC:
> start
Request SC EPS status
Check the EPS state of the SCs. If the state is “SF” (failure),
releasing the EPS state is not possible and the SCs remain marked
with an exclamation mark on the GUI.
telnet <IP active SC>
0 . 1 . 2 .
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
In case of EPS possible
Pg.B10-NO_REQUEST
Show status of boards
Sometimes, e.g. during ISU, it is necessary to display the board
status. While an FPGA download is being performed, the status “fpga
dwnld” is displayed.
me 0 2
Slot number (0... (descrNum.) | S1..26 | A0..A3 B0..B21 | Ex-y | -1
all | Quit):-1
Slot Card Number Status
Passive LAX card
Active LAX card
Check CRU Status and Clock source
The CRU mode must be “SC_SYNC_LOCKED” before an SLC switchover and
before testing. This means that a clock source is selected. In the
example the CRU mode is not OK; probably no clock source is
connected or configured.
me
4
Useful SC commands
Check SC traces
The SC traces are useful for finding out whether the SC behaves
suspiciously, or whether a board reset or an SLC switchover was
performed without an obvious reason. The latest traces are recorded.
me
1
15 [SC traces]
SC AS - 8 ../src/as_rav_blk.c:3204: REPEAT #2 failed to read RAVEL
SW version on board with BA 0x2100 (at address 0x2100)
SC AS - 8 ../src/as_rav_blk.c:3204: REPEAT #3 failed to read RAVEL
SW version on board with BA 0x2100 (at address 0x2100)
Useful SC commands
Reset remote SC
If the partner SLC is unavailable and has to be reset, use the
following command. This can be done when, e.g., the active SLC is
no longer pingable (CSF alarm). Then perform a cold start.
me
1
Check ispbs error counters
The ispbs is a bus for communication between the SCs and the
intelligent boards in the shelf.
If the error counters are greater than 000FF, the bus is disturbed.
This can be caused by a faulty board or a faulty Busterm board.
Telnet+> ispbs
-> The error counters can be reset using the command “ispbs ?”
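The 000FF limit is a plain hexadecimal comparison. A minimal sketch follows; the output format of the `ispbs` command is not shown on the slide, so the counter names used as dictionary keys are hypothetical and the values are assumed to be copied by hand as hex strings.

```python
from typing import Dict, List

ISPBS_ERROR_LIMIT = 0x000FF  # counters above this value indicate a disturbed bus


def disturbed_counters(counters: Dict[str, str]) -> List[str]:
    """Return the names of all counters whose hex value exceeds the limit."""
    return [
        name
        for name, value in counters.items()
        if int(value, 16) > ISPBS_ERROR_LIMIT
    ]


if __name__ == "__main__":
    # Hypothetical readings, keyed by made-up counter names.
    readings = {"counter_a": "00012", "counter_b": "000FF", "counter_c": "00100"}
    print(disturbed_counters(readings))
```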
1. Copy script files to both FLCs
maint@emServ(maint)$ tar -xvf cb.tar
mkdir /packages/checkedBackup
crontab -e
# run checkedPersBackup tool every Mon and Thu at 00:35 UTC
35 0 * * 1,4 cd ~; /usr/bin/perl ./checkedPersBackup.pl >> ~/checkPersistency.log 2>&1
:wq
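The schedule fields `35 0 * * 1,4` can be checked against concrete dates before the crontab entry is saved. The following sketch is a deliberately reduced matcher that only understands `*` and comma-separated numbers, which is all this entry uses. It is not a full cron implementation, and the function names are illustrative.

```python
from datetime import datetime


def field_matches(field: str, value: int) -> bool:
    """Match one cron field; supports only '*' and comma lists."""
    if field == "*":
        return True
    return value in {int(part) for part in field.split(",")}


def cron_matches(schedule: str, when: datetime) -> bool:
    """True if 'when' matches the five-field schedule."""
    minute, hour, day, month, weekday = schedule.split()
    cron_weekday = when.isoweekday() % 7  # cron counts 0 = Sunday ... 6 = Saturday
    return (
        field_matches(minute, when.minute)
        and field_matches(hour, when.hour)
        and field_matches(day, when.day)
        and field_matches(month, when.month)
        and field_matches(weekday, cron_weekday)
    )


SCHEDULE = "35 0 * * 1,4"

if __name__ == "__main__":
    for day in range(24, 31):  # 24.09.2007 was a Monday
        when = datetime(2007, 9, day, 0, 35)
        print(when.strftime("%a %d.%m.%Y"), cron_matches(SCHEDULE, when))
```

Note that cron evaluates the schedule in the system time zone of the FLC, so the "UTC" in the comment only holds if the FLC clock runs on UTC.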