  • Part II – 8: Failure management

    Machine Learning Methods for Communication Networks and Systems – 051911

    Francesco Musumeci
    Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB)
    Politecnico di Milano, Milano, Italy

  • F. Musumeci: ML Methods for Communication Nets & Systems, Part II – 8: Failure management

  • Two main failure types in optical networks

    • Hard-failures
      o Sudden events, e.g., fiber cuts, power outages
      o Unpredictable; require «protection» (reactive procedures)

    • Soft-failures
      o Gradual transmission degradation due to equipment malfunctioning, filter shrinking/misalignment…
      o Trigger early network reconfiguration (proactive procedures)

  • Handling soft-failures

    1. Early detection (When?)
       o «Predict» that the BER will go above a threshold
       o Allows early/quick activation of proactive procedures

    2. Identification (Which element?)
       o e.g., filter misalignment or amplifier malfunctioning
       o Reduced Time To Repair (TTR)

    3. Localization of soft-failures (Where?)
       o e.g., which node/link along the path?

    4. Magnitude estimation (How much?)
       o Triggers the proper reaction (e.g., device restart/reconfiguration, lightpath re-routing, in-field repair…)

  • Soft-failure early detection

    • How can we predict soft-failures?

    Continuously monitor the Bit Error Rate (BER) at the receiver until some “anomalies” are detected.

    Early detection helps prevent service disruption (e.g., through proactive network reconfiguration).

    [Figure: BER vs. time at the receiver (RX), with the “intolerable BER” threshold marked and the detection, failure, and reconfiguration instants.]
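
    As a rough illustration of what “predicting” a threshold crossing can mean, the sketch below fits a straight line to log10(BER) over recent samples and extrapolates it to a threshold. This is an assumption made for illustration only: the slides do not specify this method, and the studies cited later rely on ML classifiers over BER-window features. The function name `time_to_threshold`, the sample values, and the 1e-4 threshold are hypothetical.

```python
import math
from typing import List, Optional


def time_to_threshold(times_s: List[float], ber: List[float],
                      ber_max: float) -> Optional[float]:
    """Seconds after the last sample until the fitted log10(BER) trend hits ber_max.

    Returns 0.0 if the fitted trend has already reached the threshold, and
    None if the trend is flat or improving. All BER values must be > 0.
    """
    y = [math.log10(b) for b in ber]
    n = len(times_s)
    t_mean = sum(times_s) / n
    y_mean = sum(y) / n
    # Ordinary least-squares fit of y = intercept + slope * t.
    sxx = sum((t - t_mean) ** 2 for t in times_s)
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in zip(times_s, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    if slope <= 0:
        return None  # BER is not degrading over this window
    t_cross = (math.log10(ber_max) - intercept) / slope
    return max(0.0, t_cross - times_s[-1])


# Hypothetical trace: BER grows by a factor of 10**0.1 per second, sampled every 2 s.
times = [0.0, 2.0, 4.0, 6.0, 8.0]
trace = [10 ** (-6 + 0.1 * t) for t in times]
remaining = time_to_threshold(times, trace, ber_max=1e-4)
print(f"Estimated time left before the threshold: {remaining:.1f} s")
```

    Working in log10(BER) keeps the fit from being dominated by the largest samples; a real BER trace is noisy, so a linear trend is only a first approximation.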

  • Soft-failure cause identification

    • How can we identify the cause of the failure?
      – Failures can be caused by different sources
        o Filter shrinking/misalignment
        o Excessive attenuation (e.g., due to amplifier malfunctioning)
        o Laser/photodetector malfunctioning
        o …

    Different sources of failure can be distinguished via the different effects they have on BER variation (i.e., via different BER “features”).

    [Figure: two BER-vs-time traces at the receiver (RX), each with the “intolerable BER” threshold marked.]

  • Soft-failure localization

    • How can we identify the location of the failure?
      – A single failure may affect multiple lightpaths
      – Leverage information on the failure cause on each lightpath, in combination with routing information
      – No need for monitoring in the entire network (monitors can be deployed only at the receivers)
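
    The idea of combining per-lightpath observations with routing information can be sketched as a set computation. This is a simplified illustration under stated assumptions, not the method of the cited papers: it assumes a single failure located on a link, uses only a degraded/healthy flag per lightpath instead of the identified failure cause, and assumes every link crossed by a healthy lightpath works properly. The names `candidate_links`, `routes`, and the topology are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Link = Tuple[str, str]


def candidate_links(routes: Dict[str, List[Link]],
                    degraded: Set[str]) -> Set[Link]:
    """Links crossed by every degraded lightpath and by no healthy one."""
    if not degraded:
        return set()
    # Under the single-failure assumption, the faulty link lies on
    # every lightpath that shows BER degradation at its receiver.
    common = set.intersection(*(set(routes[lp]) for lp in degraded))
    # Links that also carry a healthy lightpath are assumed to be fine.
    healthy_links = {
        link
        for lp, links in routes.items()
        if lp not in degraded
        for link in links
    }
    return common - healthy_links


# Hypothetical routing table: lightpath name -> list of traversed links.
routes = {
    "lp1": [("A", "B"), ("B", "C")],
    "lp2": [("D", "B"), ("B", "C")],
    "lp3": [("A", "B"), ("B", "E")],
}
suspects = candidate_links(routes, degraded={"lp1", "lp2"})
print(suspects)
```

    Only receiver-side observations are needed as input, which matches the point that monitors can be deployed only at the receivers.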

  • Soft-failure magnitude estimation

    • What is the failure magnitude (i.e., severity)?
      – Different failure magnitudes can affect the network differently
      – According to the severity, different actions can be triggered to solve the failure
        o device restart/reconfiguration
        o lightpath re-routing
        o in-field repair…

    [Figure: BER vs. time at the receiver (RX), annotated with the actions “Replace the device”, “Reset the device”, and “Reconfigure the device”.]

  • Failure management: Sources 1-2

    1. F. Musumeci et al., “A Tutorial on Machine Learning for Failure Management in Optical Networks,” Journal of Lightwave Technology, vol. 37, no. 16, Aug. 2019

    2. S. Shahkarami et al., “Machine-Learning-Based Soft-Failure Detection and Identification in Optical Networks,” in OFC Conference 2018, paper M3A.5

    • Paper(s) objective: failure detection, cause identification, and magnitude estimation in optical transmission systems
      – Input
        o monitored BER
      – Output
        o failure detection, cause identification, and magnitude estimation
      – ML algorithms
        o ANN (artificial neural network)
        o SVM (support vector machine)
        o RF (random forest)

  • Our study: Optical Network Failure Management (ONFM)

    F. Musumeci et al., “A Tutorial on Machine Learning for Failure Management in Optical Networks,” Journal of Lightwave Technology, vol. 37, no. 16, Aug. 2019

  • Our study: window analysis

    • BER window: two main optimization parameters
      – Window duration, W
      – BER sampling period, T_BER

    • Training of the ML algorithms is done for different combinations of these two parameters

    Features extracted:
    - BER statistics:
      - mean
      - min/max
      - standard deviation
    - Window spectral components after FFT
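
    A minimal sketch of this feature extraction, using only the Python standard library, could look as follows. Several choices here are assumptions not stated in the slides: windows are taken as consecutive and non-overlapping, the population standard deviation is used, the spectral features are the magnitudes of the first few non-DC DFT bins, and a naive DFT stands in for an FFT routine (both yield the same values). The names `split_windows`, `window_features`, and `n_spectral` are hypothetical.

```python
import cmath
import statistics
from typing import List


def split_windows(trace: List[float], window_s: float,
                  t_ber_s: float) -> List[List[float]]:
    """Cut a BER trace sampled every t_ber_s seconds into windows of window_s seconds."""
    samples = int(window_s // t_ber_s)  # BER samples per window
    if samples < 1:
        raise ValueError("window shorter than one sampling period")
    return [trace[i:i + samples]
            for i in range(0, len(trace) - samples + 1, samples)]


def window_features(ber: List[float], n_spectral: int = 2) -> List[float]:
    """Return [mean, min, max, std, |X_1|, ..., |X_n_spectral|] for one window."""
    n = len(ber)
    stats = [statistics.fmean(ber), min(ber), max(ber), statistics.pstdev(ber)]
    # Naive O(n^2) DFT; bin 0 is skipped because it equals n * mean.
    spectrum = [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(ber)))
        for k in range(1, n_spectral + 1)
    ]
    return stats + spectrum


# Hypothetical trace alternating between two BER values, T_BER = 2 s, W = 8 s.
trace = [1e-5, 2e-5] * 4
windows = split_windows(trace, window_s=8, t_ber_s=2)
features = [window_features(w) for w in windows]
print(len(windows), "windows;", len(features[0]), "features each")
```

    Each feature vector would then be one input row for the ML algorithm, and repeating the extraction for several (W, T_BER) pairs gives the different training configurations mentioned above.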

  • Our study: failure detection

    [Figure; recoverable label: “2. Failure Identification”]

  • Our study: failure identification

    [Figure; recoverable labels: “1. Failure Detection”, “3a. Failure Magnitude Estimation (Atten.)”, “3b. Failure Magnitude Estimation (Filtering)”]

  • Our study: failure magnitude estimation

    [Figure; recoverable label: “2. Failure Identification”]

  • Testbed setup (1)

    • Testbed for real BER traces
      – Ericsson 80 km transmission system
        o 24 hours of BER monitoring
        o 2-second sampling interval
      – PM-QPSK modulation @ 100 Gb/s
      – 2 Erbium-Doped Fiber Amplifiers (EDFAs) followed by Variable Optical Attenuators (VOAs, not shown)
      – A Bandwidth-Variable Wavelength Selective Switch (BV-WSS) is used to emulate 2 types of BER degradation:
        o Filter misalignment (Filtering)
        o Additional attenuation in an intermediate span, due to EDFA gain reduction (Attenuation)
      – Different failure magnitudes:
        o Filtering: 50 to 26 GHz in steps of 2 GHz
        o Attenuation: 0 to 10 dB of additional attenuation in steps of 1 dB

    F. Musumeci et al., “A Tutorial on Machine Learning for Failure Management in Optical Networks,” Journal of Lightwave Technology, vol. 37, no. 16, Aug. 2019
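
    The testbed figures above translate into a few concrete counts, worked out in the short sketch below. It assumes that both endpoints of each magnitude range are included and that the 24 hours are sampled without gaps; how the monitoring time was split among failure settings is not stated in the slides. Variable names are hypothetical.

```python
# Number of BER samples in 24 hours at one sample every 2 seconds.
monitoring_s = 24 * 3600
t_ber_s = 2
n_samples = monitoring_s // t_ber_s

# Filtering magnitudes: filter bandwidth from 50 GHz down to 26 GHz, 2 GHz steps.
filter_bw_ghz = list(range(50, 25, -2))

# Attenuation magnitudes: 0 to 10 dB of additional attenuation, 1 dB steps.
atten_db = list(range(0, 11))

print(n_samples, "samples")
print(len(filter_bw_ghz), "filtering levels:", filter_bw_ghz)
print(len(atten_db), "attenuation levels:", atten_db)
```

    Under these assumptions the trace holds 43,200 samples, with 13 filtering levels and 11 attenuation levels available as magnitude targets.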

  • Results

    Take-away 1: Accuracy always increases with window duration

    Take-away 2: Detection (finding anomalies) is accurate even with short time windows

    Take-away 3: Complex tasks (e.g., failure-cause identification) require more BER information (longer windows) to reach sufficient accuracy

    F. Musumeci et al., “A Tutorial on Machine Learning for Failure Management in Optical Networks,” Journal of Lightwave Technology, vol. 37, no. 16, Aug. 2019

  • Testbed setup (2)

    • Testbed for real BER traces
      – Ericsson 380 km transmission system
        o 24 hours of BER monitoring
        o 3-second sampling interval
      – PM-QPSK modulation @ 100 Gb/s
      – 6 Erbium-Doped Fiber Amplifiers (EDFAs) followed by Variable Optical Attenuators (VOAs)
      – A Bandwidth-Variable Wavelength Selective Switch (BV-WSS) is used to emulate 2 types of BER degradation:
        o Filter misalignment
        o Additional attenuation in an intermediate span (e.g., due to EDFA gain reduction)

    [Figure: testbed layout between TX and RX, with two BV-WSSs (1 and 2), six EDFAs (E1 to E6), and five fiber spans of 60, 80, 80, 80, and 80 km.]

    S. Shahkarami et al., “Machine-Learning-Based Soft-Failure Detection and Identification in Optical Networks,” in OFC Conference 2018, paper M3A.5

  • Numerical results: Detection accuracy vs. window features

    • Binary SVM

    Take-away 1: Performance is higher with a short sampling period, so fast monitoring equipment is required

    Take-away 2: As the sampling period increases, longer “windows” are needed for high accuracy

    S. Shahkarami et al., “Machine-Learning-Based Soft-Failure Detection and Identification in Optical Networks,” in OFC Conference 2018, paper M3A.5

  • Numerical results: Identification accuracy vs. window features

    • Neural network

    Take-away 3: Failure-cause identification needs a much shorter sampling period than failure detection

    S. Shahkarami et al., “Machine-Learning-Based Soft-Failure Detection and Identification in Optical Networks,” in OFC Conference 2018, paper M3A.5
