UNIVERSITY OF CALIFORNIA
Santa Barbara
Hybrid Adaptation Layers for High Performance Optical Packet Switched IP
Networks
A Dissertation submitted in partial satisfaction of the
requirements for the degree Doctor of Philosophy
in Electrical and Computer Engineering
by
Suresh Rangarajan
Committee in charge:
Professor Daniel J. Blumenthal, Committee Chair
Professor John E. Bowers
Professor Kevin Almeroth
Professor Upamanyu Madhow
March 2006
UMI Number: 3206436

Copyright 2006 by Rangarajan, Suresh
All rights reserved.

UMI Microform 3206436
Copyright 2006 by ProQuest Information and Learning Company.
All rights reserved. This microform edition is protected against unauthorized copying under Title 17, United States Code.

ProQuest Information and Learning Company
300 North Zeeb Road
P.O. Box 1346
Ann Arbor, MI 48106-1346
The dissertation of Suresh Rangarajan is approved.
_________________________________________
Professor Upamanyu Madhow
_________________________________________
Professor Kevin Almeroth
_________________________________________
Professor John E. Bowers
_________________________________________
Professor Daniel J. Blumenthal, Committee Chair
March, 2006
Hybrid Adaptation Layers for High Performance Optical Packet Switched IP
Networks
Copyright © March, 2006
by
Suresh Rangarajan
ACKNOWLEDGEMENTS
This thesis would not have been possible without the guidance, help and
encouragement of many people.
My advisor, Prof. Dan Blumenthal, has provided me with the freedom to research by
granting me access to the Optical Communication and Photonic Networks lab. I
would also like to thank him for sharing his vision of optical networking with me and
for allowing me to be a part of what I see as some of the most interesting engineering
research in this field.
I would like to thank my defense committee members both for taking the time to read
through drafts of my thesis and for providing me with useful recommendations: Prof.
Kevin Almeroth for going through my work from a networking viewpoint, Prof.
Madhow for his insightful comments on the high-level perspective of my research, and
Prof. Bowers for suggesting historical research in similar areas and for reviewing my
work from an optical communications viewpoint.
I would like to thank funding organizations Cisco and DARPA-Dimi for financing the
research presented in this thesis. Without their continued support for this work, this
thesis would have lost a significant part of its focus.
I would especially like to thank members of Cisco - Paul Donner, Russell Gyurek and
Farzam Toudeh-Fallah - for their invaluable suggestions during most of this
research.
I would like to thank my colleagues in the Optical Communication and Photonic
Networks lab and in Prof. Bowers' group for their collaboration, encouragement and advice.
I would especially like to thank Henrik Poulsen and Lavanya Rau for guiding me
through this research and for making it a pleasure to work in the lab. Roopesh Doshi,
Ramesh Rajaduray, Vikrant Lal, Marcelo Davanco, Emily Burmeister, Joe Summers,
David Wolfson, Zhaoyang Hu and others have all made my experience in this group
unforgettable both academically and socially.
I would like to thank my friends Jorn Helms, Mojtaba Ghodsi, Giulia Casavecchia,
Frank Alvarado, Graham Rhodes, Mahesh Khale and many more who have made
living in the beautiful city of Santa Barbara memorable.
Finally, I would like to thank my mom Vijayalakshmy Rangarajan, my dad
Rangarajan Gopala, my sister Priya Rangarajan and my brother-in-law Narayanan
Seshadri for their constant support, their excellent and timely advice, and for helping
me in every way to complete my thesis.
VITA OF SURESH RANGARAJAN March 2006
EDUCATION
Bachelor of Engineering in Electrical and Communication Engineering from the
University of Madras, Chennai, India in April, 1999
Master of Science in Electrical and Computer Engineering from the University of
California, Santa Barbara in June 2001
PUBLICATIONS
[1] Scalable All-Optical Compression/Decompression of Variable Length (40 to
1500 byte) Packets, Suresh Rangarajan, Henrik N. Poulsen and Daniel J.
Blumenthal, Optical Fiber Conference OFC'2006 (accepted for presentation and
publication)
[2] All-Optical Packet Compression of Variable Length Packets from 40 to 1500
bytes using a Gated Fiber Loop, Suresh Rangarajan, Henrik N. Poulsen, Daniel J.
Blumenthal, IEEE Photonics Technology Letters, Jan. 2006.
[3] Burst Mode 10Gbps Optical Header Recovery and Lookup Processing for
Asynchronous Variable-Length 40Gbps Optical Packet Switching, Henrik N.
Poulsen, David Wolfson, Suresh Rangarajan, Daniel J. Blumenthal, Optical Fiber
Conference OFC'2006 (accepted for presentation and publication)
[4] Scalable All-Optical Packet Decompression of Variable Length Packets from
40 to 1500 bytes using a Gated Fiber Loop, Suresh Rangarajan, Henrik N. Poulsen,
Daniel J. Blumenthal, IEEE Photonics Technology Letters (to be submitted)
[5] Synchronizing Optical Data and Electrical Control Planes in Asynchronous
Optical Packet Switches, David Wolfson, Henrik N. Poulsen, Suresh Rangarajan,
Zhaoyang Hu, Daniel J. Blumenthal (Univ. of California at Santa Barbara, USA),
Garry Epps, David Civello (Cisco Systems, USA), Optical Fiber Conference
OFC'2006 (accepted for presentation and publication)
[6] End-to-End Layer-3 (IP) Packet Throughput and Latency Performance
Measurements in an All-Optical Label Switched Network with Dynamic
Forwarding, Suresh Rangarajan, Henrik N. Poulsen (Univ. of California, Santa
Barbara, USA), Paul G. Donner, Russell Gyurek (Cisco Systems Inc., USA), Vikrant
Lal, Milan L. Masanovic, Daniel J. Blumenthal (Univ. of California, Santa Barbara,
USA), OWC5, Optical Fiber Conference OFC'2005
[7] Performance of a Label Erase and Wavelength Switching Sub-System for
Layer-3 All-Optical Label Switching Using a Two Stage InP Wavelength
Converter, Henrik N. Poulsen, Suresh Rangarajan, Milan L. Masanovic, Vikrant Lal,
Daniel J. Blumenthal; Univ. of California at Santa Barbara, USA, OTuC2, Optical
Fiber Conference OFC'2005
[8] All-optical contention resolution with wavelength conversion for
asynchronous variable-length 40 Gb/s optical packets, Rangarajan, S.; Zhaoyang
Hu; Rau, L.; Blumenthal, D.J.; Photonics Technology Letters, IEEE,
Volume 16, Issue 2, Feb. 2004, pp. 689-691
[9] Analog performance of an ultrafast sampled-time all-optical fiber XPM
wavelength converter, Rau, L.; Doshi, R.; Rangarajan, S.; Yijen Chiu; Blumenthal,
D.J.; Bowers, J.E.; Photonics Technology Letters, IEEE, Volume 15, Issue 4,
April 2003, pp. 560-562
[10] An all-optical bufferless multiwavelength sorter for 40 Gb/s asynchronous
variable-length optical packets, Hu, Z.; Rangarajan, S.; Rau, L.; Blumenthal, D.;
Optical Fiber Communications Conference, OFC'2003, 23-28 March 2003,
pp. 131-133, vol. 1
[11] Optical signal processing for optical packet switching networks, Blumenthal,
D.J.; Bowers, J.E.; Rau, L.; Hsu-Feng Chou; Rangarajan, S.; Wei Wang; Poulsen,
H.N.; Communications Magazine, IEEE, Volume 41, Issue 2, Feb. 2003, pp. S23-S29
[12] Low power penalty 80 to 10 Gbit/s OTDM demultiplexer using
standing-wave enhanced electroabsorption modulator with reduced driving
voltage, Hsu-Feng Chou; Yi-Jen Chiu; Rau, L.; Wei Wang; Rangarajan, S.; Bowers,
J.E.; Blumenthal, D.J.; Electronics Letters, Volume 39, Issue 1, 9 Jan. 2003,
pp. 94-95
[13] Two-hop all-optical label swapping with variable length 80 Gb/s packets
and 10 Gb/s labels using nonlinear fiber wavelength converters,
unicast/multicast output and a single EAM for 80- to 10 Gb/s packet
demultiplexing, Rau, L.; Rangarajan, S.; Blumenthal, D.J.; Chou, H.-F.; Chiu, Y.-J.;
Bowers, J.E.; Optical Fiber Communication Conference, OFC'2002, 17-22 March
2002, pp. FD2-1 - FD2-3
[14] All-optical add-drop of an OTDM channel using an ultra-fast fiber based
wavelength converter, Rau, L.; Rangarajan, S.; Wei Wang; Blumenthal, D.J.;
Optical Fiber Communication Conference, OFC'2002, 17-22 March 2002,
pp. 259-261
[15] Optical packet switching and associated optical signal processing,
Blumenthal, D.J.; Bowers, J.E.; Chiu, Y.-J.; Chou, H.-F.; Olsson, B.-E.; Rangarajan,
S.; Rau, L.; Wang, W.; 2002 IEEE/LEOS Summer Topicals, 15-17 July 2002,
pp. TuG2-17 - TuG2-18
[16] Design of Impedance matching Microstrip line for Opto-microwave
applications, S. Rangarajan, A. Selvarajan et al., National Communications
ABSTRACT
Hybrid Adaptation Layers for High Performance Optical Packet Switched IP Networks
by
Suresh Rangarajan
With the increasing demand for network bandwidth, today's electronic routers
struggle to keep up and are fast approaching their physical limitations. All-optical
packet switching offers an alternative approach to forwarding traffic: the payload is
kept in the optical domain throughout the network, while forwarding is done based
solely on the optical header. Advantages of such an approach include scalable
increases in payload bit rate and packet forwarding rate, along with reductions in
power consumption and physical size.
Current research in optics has yielded several integrated and non-integrated
technologies leading toward all-optical packet forwarding. Research on an
adaptation mechanism that achieves complete payload and frame decoupling,
thereby enabling direct IP-over-optics packet forwarding, has however been limited.
Additionally, core packet forwarding involves statistical multiplexing of packet
traffic. Contention is a major issue during statistical multiplexing and has
conventionally been resolved using Random Access Memory (RAM) buffers in electronics.
In the absence of optical RAM, a new technique for handling contention that uses
fixed-length delay lines as buffers is required. This thesis focuses on research into a
new optical framing mechanism that enables all-optical statistical demultiplexing of
IP over optics at the core, and on optical processing mechanisms such as packet
compression and decompression that enable all-optical packet forwarding by statistical
multiplexing at the network core. Research on efficient adaptation of slow-speed IP
traffic at the edge, to enable and improve forwarding at the high-speed optical core,
is also presented in this work.
Table of Contents:
Table of Contents:....................................................................................................... xii
Chapter 1....................................................................................................................... 1
Introduction................................................................................................................... 1
1.1 Motivation and Problem definition..................................................................... 1
1.2 Thesis Outline ..................................................................................................... 2
References................................................................................................................. 5
Chapter 2....................................................................................................................... 6
Optical Data Networks for Packet Traffic .................................................................... 6
2.1 Optical data networks – current state of art ........................................................ 8
2.1.1 Optical Ethernet networks............................................................................ 8
2.1.2 Synchronous optical networks – IP over SONET...................................... 12
2.2 OPS networks of today ..................................................................................... 14
2.2.1 KEOPS network......................................................................................... 15
2.2.2 All Optical Label Swapping network ........................................................ 17
2.3 Challenges in building optical packet switched networks ................................ 19
2.3.1 Framing requirements ................................................................................ 19
2.3.2 Physical and data link layer issues............................................................. 21
2.4 OPS Performance Measurement Metrics.......................................................... 22
2.4.1 Physical Layer (Layer 1) Performance Measurements.............................. 22
2.4.2 Packet level (Layer 2/3) Measurements..................................................... 24
2.5 Chapter summary .............................................................................................. 25
References............................................................................................................... 26
Chapter 3..................................................................................................................... 29
Edge Node Traffic Adaptation.................................................................................... 29
3.1 Architectural Considerations for an All-Optical Label Swapped Packet
Switched network.................................................................................................... 30
3.1.1 Traffic characteristics at the Network Edge............................................... 31
3.1.2 Electronics in optical networks.................................................................. 33
3.2 Optical Framing Considerations ....................................................................... 34
3.2.1 Forwarding requirements ........................................................................... 35
3.2.2 Idlers .......................................................................................................... 36
3.2.3 Edge and Core node Clock-Data Recovery (CDR) ................................... 37
3.2.4 Framing structure ....................................................................................... 39
3.3 Edge Node Adaptation for OPS Traffic............................................................ 42
3.3.1 Ingress Node Description........................................................................... 43
3.3.2 Egress Node Description............................................................................ 45
3.4 Edge Node Implementation .............................................................................. 46
3.4.1 PoS Framer/ Deframer ............................................................................... 47
3.4.2 Electronic Control design .......................................................................... 48
3.4.3 Tunable Laser board and control ............................................................... 51
3.5 Edge Adaptation testing and performance........................................................ 53
3.5.1 Impact of optical framing on utilization .................................................... 54
3.5.2 Measured throughput performance ............................................................ 55
3.5.3 Measured latency performance .................................................................. 56
3.6 Chapter Summary ............................................................................................. 57
References............................................................................................................... 59
Chapter 4..................................................................................................................... 60
All-Optical Label Swapping Network Performance................................................... 60
4.1 All-Optical forwarding using Wavelength Conversion .................................... 61
4.1.1 Statistical Demultiplexing.......................................................................... 63
4.1.2 Core node issues in optics and electronics................................................. 64
4.2 Optical Core Node ............................................................................................ 65
4.2.1 Two-stage wavelength conversion principle ............................................. 68
4.2.2 First stage conversion and label erase........................................................ 69
4.2.3 Second stage conversion and packet steering ............................................ 72
4.2.4 Framing and idlers revisited....................................................................... 73
4.3 Experimental Performance Evaluation ............................................................. 74
4.3.1 Measurement test system and setup........................................................... 77
4.3.2 Measured throughput performance ............................................................ 79
4.3.3 Measured latency performance .................................................................. 80
4.4 Chapter Summary ............................................................................................. 81
References............................................................................................................... 82
Chapter 5..................................................................................................................... 83
Core Traffic Adaptation for Statistical Multiplexing.................................................. 83
5.1 Core node with contention resolution............................................................... 84
5.1.1 Statistical Multiplexing and contention ..................................................... 84
5.1.2 Contention handling schemes .................................................................... 85
5.2 Optical Buffering techniques ............................................................................ 86
5.2.1 Feed-forward buffering schemes ............................................................... 86
5.2.2 Feedback buffering .................................................................................... 87
5.3 Compression/Decompression for contention resolution................................... 87
5.3.1 Network-wide deployment......................................................................... 88
5.3.2 Intra-node deployment ............................................................................... 90
5.3.3 Compression/Decompression process ....................................................... 91
5.4 Chapter Summary ............................................................................................. 93
References............................................................................................................... 95
Chapter 6..................................................................................................................... 96
Optical Packet Compression ....................................................................................... 96
6.1 Compression – State of the art today ................................................................ 97
6.1.1 Feed forward delay line approach.............................................................. 98
6.1.2 Spectral Slicing approach ........................................................................ 100
6.1.3 Loop based approach ............................................................................... 101
6.2 Network deployment of packet compression.................................................. 102
6.2.1 Intra-node compression/decompression................................................... 103
6.2.2 Packet compression classification............................................................ 106
6.2.3 Requirement of a practical compressor.................................................... 107
6.2.4 Approaches to compression implementation ........................................... 109
6.3 Fold-in packet compression ............................................................................ 111
6.3.1 Linear time delay compression ................................................................ 112
6.3.2 Loop based compression.......................................................................... 113
6.4 Packet adaptation using fold-in compression ................................................. 115
6.4.1 Loop characterization............................................................................... 116
6.4.2 2.5Gbps to 10Gbps Packet compression.................................................. 117
6.4.3 Variable compression control .................................................................. 121
6.4.4 Bandwidth efficient multiple compressor approach ................................ 124
6.5 Chapter summary ............................................................................................ 126
References............................................................................................................. 129
Chapter 7................................................................................................................... 131
Optical Packet Decompression ................................................................................. 131
7.1 Decompression – State of the art .................................................................... 132
7.1.1 Feed-forward Delay line approach........................................................... 133
7.1.2 Spectral slicing......................................................................................... 135
7.1.3 Loop based approach ............................................................................... 136
7.2 Fundamentals, design and implementations of packet decompression .......... 137
7.2.1 Decompression - principles of operation ................................................. 137
7.2.2 Requirements on a packet decompressor................................................. 139
7.3 Fold-out packet decompression ...................................................................... 140
7.3.1 Linear time delay approach...................................................................... 141
7.3.2 Loop based approach ............................................................................... 142
7.3.3 Loop characterization............................................................................... 144
7.3.4 10Gbps to 2.5Gbps packet decompression .............................................. 146
7.3.5 Variable packet decompression ............................................................... 149
7.4 Chapter summary ............................................................................................ 151
References............................................................................................................. 152
Chapter 8................................................................................................................... 153
End-to-End Core Traffic Adaptation ........................................................................ 153
8.1 End-to-end core traffic adaptation .................................................................. 153
8.2 Issues in end-to-end compression-decompression experimentation............... 154
8.3 10Gbps core adaptation of 2.5Gbps traffic ..................................................... 158
8.4 Variable length packet adaptation................................................................... 162
8.5 Chapter summary ............................................................................................ 164
Chapter 9................................................................................................................... 165
Conclusion ................................................................................................................ 165
9.1 Thesis conclusions .......................................................................................... 165
9.2 Future work..................................................................................................... 169
List of Figures:
Figure 2. 1: Layered 10GigE model. PHY - Physical layer device; PMD - Physical
Medium Dependent....................................................................................................... 9
Figure 2. 2: General Ethernet Framing Format. CRC - Cyclic Redundancy Check;
MAC - Media Access Control .................................................................................... 10
Figure 2. 3: Basic SONET Frame. SPE - Synchronous Payload Envelope................ 13
Figure 2. 4: HDLC-framed PPP-encapsulated IP packet............................................ 13
Figure 2. 5: PoS OSI Stack. IP - Internet Protocol; PPP- Point to Point Protocol;
HDLC - High-level Data Link Control; SONET - Synchronous Optical NETwork.. 14
Figure 2. 6: KEOPS Packet Format ............................................................................ 15
Figure 2. 7: KEOPS Node structure. DMUX - Demultiplexer; MUX - Multiplexer . 16
Figure 2. 8: AOLS Packet Frame. (a) Serial label, (b) Optical SCM label ................ 17
Figure 2. 9: AOLS Network architecture.................................................................... 18
Figure 2. 10: IP Packet Structure. IHL - Internet Header Length; TOS - Type Of
Service; TTL - Time To Live...................................................................................... 20
Figure 2. 11: Data Entity Granularity ......................................................................... 21
Figure 3. 1: End-to-End network functional architecture. .......................................... 31
Figure 3. 2: a) Peer-to-Peer vs. (b) Overlay optical network approach ...................... 32
Figure 3. 3: Need for idlers - transmission distortions in the absence of dc balance.. 37
Figure 3. 4: Packet mode clock recovery.................................................................... 38
Figure 3. 5: Optical Framing Structure. SW - Synch. Word; S/EOOP –Start/End Of
Optical Packet; OL - Optical Label; S/EOIP- Start/End Of IP Packet ...................... 40
Figure 3. 6: Example 32 bit Optical Label or header. O_ID - Optical Identifier; TTL -
Time To Live .............................................................................................................. 41
Figure 3. 7: Experimental scope traces of (a) Optically framed packets (b) Packet
framing (refer Figure 3.6) ........................................................................................... 42
Figure 3. 8: Edge node adaptation layering ................................................................ 43
Figure 3. 9: Ingress node high level schematic........................................................... 44
Figure 3. 10: Egress node functionality. CDR- Clock Data Recovery ....................... 45
Figure 3. 11: Vitesse POS OC-48 Framer/Deframer functionality............................. 47
Figure 3. 12: Ingress FPGA process functional diagram............................................ 49
Figure 3. 13: Egress FPGA process functionality....................................................... 50
Figure 3. 14: Egress FPGA - Phase locker operating principle .................................. 51
Figure 3. 15: Fast tunable laser functionality. DAC – Digital to Analog Converter; V/I
– Voltage to Current converter ................................................................................... 52
Figure 3. 16: Fast tunable laser spectrum at 1549.32nm and switching scope trace .. 52
Figure 3. 17: Test path and traffic flow setup............................................................. 53
Figure 3. 18: Edge adaptation theoretical throughput calculation .............................. 54
Figure 3. 19: Edge adaptation throughput measurement and theoretical comparison 55
Figure 3. 20: Edge adaptation latency measurement .................................................. 57
Figure 4. 1: 4 x 4 AWGR wavelength mapping - passive, non-blocking, static optical
switch; λji – input port i, wavelength j ................................................................. 62
Figure 4. 2: Statistical demultiplexing (a) 1 input (b) 4 inputs - forwarding without
contention resolution................................................................................................... 63
Figure 4. 3: Optical Core Node - Principle of operation............................................. 65
Figure 4. 4: Core FPGA Process - 4 control signals. OL – Optical Label; S/EOOP –
Start/End Of Optical Packet; DMUX – Demultiplexer. ............................................. 67
Figure 4. 5: 2-Stage wavelength conversion principle. BPF - Band Pass Filter; SOA -
Semiconductor Optical Amplifier; MZI - Mach Zehnder Interferometer .................. 69
Figure 4. 6: SOA based 1st stage Wavelength Conversion and frame erase.............. 69
Figure 4. 7: Op Amp based Fast switching current driver circuitry for SOA............. 70
Figure 4. 8: Output power of the SOA as a function of the input voltage applied ..... 71
Figure 4. 9: Optical Signal to Noise Ratio (OSNR) performance of first stage
wavelength converter system...................................................................................... 72
Figure 4. 10: Mach Zehnder Interferometer based 2nd stage Wavelength Conversion.
FBG – Fiber Bragg Grating ........................................................................................ 73
Figure 4. 11: External Idler Insertion. AWGR - Arrayed WaveGuide Router ........... 74
Figure 4. 12: Core node experimental setup ............................................................... 75
Figure 4. 13: 1st stage Wavelength Converter Scope traces (a) CW + packet (b)
Frame erased wavelength converted payload ............................................................ 76
Figure 4. 14: Scope traces after 2nd stage Wavelength Conversion (a) Before rewrite
(b) After frame rewrite................................................................................................ 77
Figure 4. 15: Traffic flow and Layer-3 measurement setup ....................................... 78
Figure 4. 16: End-to-End IP throughput measurement and theoretical comparison
(refer Figure 4.15)....................................................................................................... 79
Figure 4. 17: End-to-End latency vs. packet size (refer Figure 4.15)......................... 80
Figure 5. 1: Contention resolution during statistical MUX. CR- Contention Resolution
..................................................................................................................................... 84
Figure 5. 2: Feed-forward delay structure................................................................... 86
Figure 5. 3: Feedback delay buffering ........................................................................ 87
Figure 5. 4: Compression/Decompression deployed at the network edges ................ 89
Figure 5. 5: % available link utilization Network scale ComDecom vs. packet size in
bytes ............................................................................................................................ 90
Figure 5. 6: Intra-node deployment of Compression/decompression at the core ....... 91
Figure 5. 7: Compression Decompression process detail ........................................... 92
Figure 5. 8: RZ pulse quality requirements for compression...................................... 93
Figure 6. 1: (a) Structure of a feed forward delay compressor, (b) Working Principle
..................................................................................................................................... 99
Figure 6. 2: Spectral slicing approach - Principle of operation ................................ 100
Figure 6. 3: Loop based compression - Principle of operation................................. 101
Figure 6. 4: Intra node deployment of compression and decompression ................. 103
Figure 6. 5: Edge ingress node functionality– electronic adaptation........................ 104
Figure 6. 6: Edge egress functionality – electronic adaptation................................. 104
Figure 6. 7: Intra-node deployment of Compression/decompression at the core ..... 105
Figure 6. 8: Fixed compression ratio compression – padding to achieve fixed
temporal size ............................................................................................................. 107
Figure 6. 9: Variable and discretely variable compression ratio compression ......... 107
Figure 6. 10: Generic structure of compressor/decompressor .................................. 108
Figure 6. 11: Bit- serial compression........................................................................ 110
Figure 6. 12: Compression Principle - packet fold-in approach ............................... 111
Figure 6. 13: Linear time delay implementation of fold-in compression ................. 112
Figure 6. 14: Loop based compression principle ...................................................... 114
Figure 6. 15: Loop based implementation of fold-in compression (colors indicate
virtual bins not wavelengths) .................................................................................... 115
Figure 6. 16: Average Power increase at the input of the loop SOA........................ 116
Figure 6. 17: Measured SOA power penalty, OSNR Vs input power ...................... 117
Figure 6. 18: 2.5Gbps to 10Gbps compression - Experimental setup ...................... 118
Figure 6. 19: (a) Input 1500 byte packet and (b) packet stream quality ................. 119
Figure 6. 20: (a) Compressed 1500 byte packet and (b) compressed stream quality
................................................................................................................................... 120
Figure 6. 21: Bit error rate measurements on 1500 byte compressed packets (refer
Figure 6. 18).............................................................................................................. 121
Figure 6. 22: Principle of variable compression ratio control .................................. 122
Figure 6. 23: 1500, 1024 byte input packets (a1,b1) and compressed output packets
(a2,b2) ....................................................................................................................... 122
Figure 6. 24: 560, 40 byte input packets (c1,d1) and compressed output packets
(c2,d2) ....................................................................................................................... 124
Figure 6. 25: Bandwidth efficient compression setup. ............................................. 126
Figure 7. 1: Feed forward delay decompression- (a) structure (b) operation ........... 133
Figure 7. 2: Decompression using spectral slicing ................................................... 135
Figure 7. 3: Loop based decompression structure and operation.............................. 136
Figure 7. 4 : Decompression principle - increase in utilization ................................ 138
Figure 7. 5: Decompression - Time out issue ........................................................... 139
Figure 7. 6:Linear time delay decompression technique .......................................... 142
Figure 7. 7:Loop based decompression principle .................................................... 143
Figure 7. 8: Loop based fold-out decompression...................................................... 144
Figure 7. 9: Decompression SOA power penalty, OSNR, output power Vs input
power......................................................................................................................... 145
Figure 7. 10: Loop based decompression - experimental setup................................ 146
Figure 7. 11: 10Gbps to 2.5 Gbps decompression of 1500 byte packets - (a1,b1) input
compressed packets and bit quality, (a2,b2) decompressed packets and bit quality
(2μsec (left) and 100psec (right) time scale) ............................................................ 147
Figure 7. 12: Bit error rate measurement on decompressed 1500 byte packets (refer
Figure 7. 10).............................................................................................................. 149
Figure 7. 13: Compressed packet input(a1, b1, c1, d1) and decompressed packet
output (a2, b2, c2, d2) for 1500, 1024, 560 and 40 byte packets ( 500nsec time scale)
................................................................................................................................... 150
Figure 8. 1: Loop based fold-in compressor output imperfection ............................ 155
Figure 8. 2: Loop based fold-out decompressor output imperfection....................... 156
Figure 8. 3: (De)encoding to enable compression/decompression ........................... 156
Figure 8. 4: Compression and decompression of encoded packets to prevent payload
bit corruption............................................................................................................. 157
Figure 8. 5: End -to-end compression/decompression experimental setup .............. 159
Figure 8. 6: 1500 bytes (a1, a2) Input packet, bit quality at 2.5Gbps (b1, b2)
compressed packet, bit quality at 10Gbps................................................................ 160
Figure 8. 7: 1500 byte packets (a1, a2) compressed packet and bit quality at 10Gbps,
(b1, b2) decompressed packet and bit quality at 2.5Gbps ........................................ 161
Figure 8. 8: End-to-End bit error rate measurement at 2.5Gbps............................... 162
Figure 8. 9: Input, compressed and decompressed packets for (a) 1500 byte, (b) 1024
byte, (c) 560 byte, (d) 40 byte packets...................................................................... 163
Chapter 1
Introduction
1.1 Motivation and Problem definition
Mankind’s passion for interaction and the exchange of information has driven
communication forward, from a handful of primal sounds to packets of light carrying
the message across the globe. Today, emails and web pages containing millions of
bits of information are routinely created and read every day, to the extent that they
have become more of a necessity than a luxury. Internet penetration has
grown from 0.4% in December 1995 to 15.7% of the world population in December
2005 [1]. Internet growth is reported to be about 100% every 12-18 months [2] and
newer avenues of internet usage such as video broadcast over the internet and IP
telephony are growing in popularity. With the ever increasing demand for network
bandwidth and the higher degree of service sophistication required, such as Quality Of
Service (QOS), the hardware supporting the Internet, namely electronic routers, is
close to reaching its physical limits [3].
Optical packet switching provides a scalable alternative to forwarding traffic in the
electronic domain. Advantages of switching in optics include lower power
consumption, a smaller physical footprint, and the potential to scale both in bit rate and
packet forwarding rate [4]. Several key technologies in processing data in optics have
been researched over the past few years and the maximum bit rate handled in optics
has steadily increased. However, research in bridging the gap between increasing IP
traffic in today’s internet and enabling optical technologies has been limited. Another
critical issue in forwarding packets in optics is contention resolution. While the IP
traffic characteristics dictate variable packet sizes arriving randomly, core optical
switches with limited buffering capabilities are better suited to handling data in fixed
time slots. The purpose of the research done in this thesis is to investigate an
appropriate opto-electronic adaptation mechanism that enables both transport and
forwarding of raw IP packets, taking into account their current traffic characteristics
and using state of the art all-optical technologies.
1.2 Thesis Outline
In chapter 2, we examine current optical transport protocols and some of the
significant state of the art research in optical packet switching with the aim of
identifying known issues in building an All-Optical Packet Switched network. We
also review the performance metrics, used in subsequent chapters, for both physical
layer testing of optical sub-systems and network layer testing of end-to-end IP packet
forwarding.
Edge node packet adaptation requirements are investigated and an optical framing
mechanism that takes all known issues into account and satisfies these adaptation
requirements is proposed in Chapter 3. Ingress and egress functionality are studied
and individual sub-systems that are required for implementation of adaptation from IP
traffic to and from optical packet traffic are described. Theoretical and experimental
results on end-to-end packet performance through the edge adaptation are presented
and the impact of optical framing is investigated.
Chapter 4 deals with the basic principle behind forwarding traffic at the optical core
node. Statistical demultiplexing using 2-stage wavelength conversion as an all-optical
packet steering mechanism is studied. The electronic and optical technologies
required in enabling all-optical packet forwarding are presented along with the
identification of bottlenecks in packet forwarding at the core. An experimental
demonstration of all-optical packet forwarding made possible due to the edge node
adaptation is shown and the impact of core node packet forwarding on layer-3 end-to-
end network performance is analyzed.
In chapter 5, we look at the requirements of the optical core node to perform
statistical multiplexing of packet traffic. Current solutions to packet contentions are
examined and the issues with optical buffering technology are identified.
Compression and decompression, as an optical adaptation mechanism to adapt
variable sized payloads to the fixed time slots desired in optical buffering, is proposed and
deployment strategies for such adaptation within the network are compared.
Current state of the art in optical packet compression is reviewed in Chapter 6 and the
problems in implementing various schemes to compress IP packets are discussed.
Requirements for a practical compressor are summarized based on knowledge of
input and the desired output traffic characteristics. We then propose a novel approach
to packet compression that is capable of handling large packet byte sizes and
investigate two techniques of implementing this approach. The sub-systems that go
into the practical realization of one proposed approach, namely loop-based
compression, are examined, and we present an experimental demonstration of packet
compression with variable compression ratios to achieve variable-length to fixed-size
packet adaptation. Finally,
the performance of the compression scheme is evaluated using physical layer metrics.
Our research on optical packet decompression is presented in Chapter 7 along with
prior state of the art in this area. We put forward an approach to decompression that
complements the compression scheme presented in chapter 6 and the feasibility of
such an approach is experimentally demonstrated. Bit error rate measurements are
presented to evaluate the performance of such a decompression scheme. We look at
the issues in implementing back-to-back compression and decompression and
propose modifications to edge node adaptation to solve the identified issues. Finally,
an experimental demonstration of end-to-end packet compression and decompression
for variable packet sizes as an adaptation to a fixed time slot is presented and its
performance is evaluated.
Chapter 8 presents the thesis conclusions and possible avenues for future work in this
area.
References
[1] Online : http://www.internetworldstats.com/emarketing.htm.
[2] Online: http://www.dtc.umn.edu/~odlyzko/doc/itcom.internet.growth.pdf.
[3] J. Gripp, M. Duelk, J. E. Simsarian, A. Bhardwaj, P. Bernasconi, O. Laznicka,
and M. Zirngibl, "Optical switch fabrics for ultra-high-capacity IP routers,"
Journal of Lightwave Technology, vol. 21, pp. 2839-2850, 2003.
[4] D. J. Blumenthal, "Optical packet switching," pp. 910-912, vol. 2, 2004.
Chapter 2
Optical Data Networks for Packet
Traffic
Today’s communication networks need to meet several critical requirements such as
providing scalable network bandwidth, a fine granularity of control, Quality Of
Service (QOS) and backward compatibility with a wide range of existing network
protocols and framing formats. Fiber optic communication has been used for years as
a transport medium in data networks for rapid transfer of large amounts of traffic
between nodes. However, at each processing node, data traffic gets converted to
electronic form before examination and reconverted to optical form after processing
and before exiting the node. Moore’s law predicts the growth of future electronic
circuitry and states that the number of transistors on an Integrated Circuit (IC)
doubles every 18-24 months. However, since the birth of the Internet, data traffic
growth has far exceeded the growth in electronic processing capability that Moore's
law predicts. Consequently, it can be deduced that in view of the
explosive nature of data traffic, the direct processing of traffic using electronics
would require parallel electronic processors to meet future forwarding requirements.
However, such a scaling mechanism is cumbersome and non-optimal. Minimizing the
need for O-E-O conversion within the network is one approach to alleviate the burden
on the electronic domain. The aim of using Optical Packet Switching (OPS) is to
separate forwarding decisions that may be made at packet rates by existing
electronics, from the ultra-fast data rates of the packet payloads supported by fiber
optics, while still maintaining a fine granularity of control. Significant research is
being done targeting minimization of power and space footprints of optical
components by integration of multiple functionalities into a single chip [1, 2].
However, several issues need to be solved before deploying such chips to improve the
network performance. The purpose of the research presented in this thesis is
threefold:
• To explore the adaptation requirements on traffic within an OPS network
• To minimize bottlenecks on handling packet traffic due to practical
limitations of optical forwarding and the electronic control plane and
• To propose, demonstrate and evaluate optical adaptation techniques with the
intent of solving buffering issues that can be implemented using today’s
optical technologies.
In this chapter, we briefly examine today’s optical networks and extract requirements
that would need to be met when building an All-Optical Packet Switched (AOPS)
network capable of forwarding IP traffic.
2.1 Optical data networks – current state of art
Optical data networks in place today are primarily used as a transport mechanism
geared towards providing an efficient communication medium while a higher layer
(IP or ATM) implemented in electronics performs switching of individual routable
units of data. Such optical networks are designed to rely heavily upon the electronic
domain for all forwarding operations, consequently requiring as many as six O-E-O
conversions at each route processing node [3]. In order to grow successfully, data
networks must satisfy very different requirements and cater to end-user IP networks
and Corporate Storage Area Networks (SAN) [4]. In this section we examine two of
the most popular optical data networks currently in place namely Optical Ethernet
(Gigabit Ethernet or Gig-E and 10 Gigabit Ethernet or 10GigE) networks and the
Synchronous Optical NETwork (SONET) networks, to understand the reasons behind
their widespread deployment and identify the constraints and drawbacks of today’s
optical networks.
2.1.1 Optical Ethernet networks
While copper based Ethernet traditionally dominated Local Area Networks (LAN)
with 95% of IP traffic originating and terminating in Ethernet [5], optical Gig-E
networks, which consist of 1000Mbps and 10Gbps links, are designed to be deployed
as backbones in Metro Area Networks (MAN) and more recently Wide Area
Networks (WAN) [6]. Ethernet was originally developed by Robert Metcalfe and
David Boggs at the Xerox Corporation in 1973 to work at 2.94Mbps. All Ethernet
networks follow the Carrier Sense Multiple Access with Collision Detect
(CSMA/CD) protocol [IEEE 802.3], originally intended for half duplex
operation, where multiple stations shared a single communication medium. The key
to success in Ethernet is its simplicity and the high degree of structured compatibility
offered over various physical and data link implementations. While Gigabit Ethernet
supports both copper and optical transmission, 10GigE addresses many concerns arising
during high speed transport of packet traffic. We therefore review the 10GigE standard
in particular, as defined in [IEEE 802.3ae]. The most important difference between
10GigE and previous Ethernet flavors is the elimination of half duplex operation, and
in essence of CSMA/CD itself, in favor of full duplex only, as half duplex operation
becomes cumbersome at high speeds. Figure 2. 1
shows the general layered model of a 10GigE network.
[Figure: the OSI stack (Application through Physical) mapped onto the 10GigE
sublayers - full duplex Logical Link Control (LLC) and Media Access Control (MAC),
Reconciliation, XGMII (and XGAUI), LAN/WAN/WWDM PHY variants, and PMDs]
Figure 2. 1: Layered 10GigE model. PHY - Physical layer device; PMD - Physical Medium Dependent
As can be seen, the strongly structured Ethernet model has a Media Independent
Interface (MII) along with physical layer definitions specific to a variety of
implementations, which is an important reason behind the widespread deployment of
Ethernet. Key issues defined in Ethernet standards that are relevant to any optical
adaptation are encoding, physical layer clock/data recovery mechanisms, framing
specifications and reconciliation between transmission rates and electrical processor
rates. Data from higher layers is passed on to a Logical Link Control (LLC) and
Media Access Control (MAC), which are responsible for framing the data and control
handshaking for successful transfer of frames. 10GigE specifies a 10 Gigabit Media
Independent Interface (XGMII) and reconciliation layer, which provide a simple,
inexpensive, and easy to implement interface between the MAC and the lower
Physical Layer Device (PHY) layers. In essence, one function of these layers is to
handle proper serialization and deserialization of data between the physical layer
serial 10Gbps and MAC layer parallel 155 Mbps data rates. 10GigE defines seven
unique physical layer interfaces, 6 serial and 1 Wide Wavelength Division
Multiplexed (WWDM) mainly to cater to a wide range of cost, range (for LANs and
WANs) and compatibility with other technologies such as SONET. Due to its low
latency and high speed, 10GigE has become promising for Storage Area Networks
(SANs), and several companies such as Cisco, Extreme Networks and Foundry
Networks offer 10GigE interfaces today. The basic framing format used in all
Ethernet networks is the same and is shown in Figure 2. 2.
[Figure: MAC header - Preamble (8 bytes), Destination Address (6 bytes), Source
Address (6 bytes), Type (2 bytes) - followed by the data field (46 - 1500 bytes) and
the MAC trailer CRC (4 bytes)]
Figure 2. 2: General Ethernet Framing Format. CRC - Cyclic Redundancy Check; MAC - Media Access Control
Data is handled only in whole octets, and frames are separated by a minimum
interframe gap of 12 bytes for synchronization at various layers. Small packets are
padded up to the minimum frame size of 64 bytes, and the maximum supported frame
size is 1518 bytes. Each Ethernet frame consists of a start byte, 7 preamble bytes,
source and destination addresses of 6 bytes each, a 2 byte type field, a data field of
46 - 1500 bytes and a trailing Cyclic Redundancy Check (CRC) of 4 bytes. In total,
10GigE has a 26 byte overhead per packet and a 12 byte minimum interpacket gap
requirement, or a 38 byte aggregate overhead per packet, amounting to 97.5 %
utilization for 1500 byte packets. 10GigE
uses continuous signaling between the transmitter and the receiver so that clock may
be recovered from the data channel. Clock recovery is therefore not performed on a
frame-by-frame basis as was the case in early Ethernet implementations. A 64B/66B
encoding scheme is used by 10GigE, replacing the less efficient 8B/10B scheme in
earlier Ethernet implementations to maintain DC balance, increase data bit transitions
for clock recovery and increase the possibility of error detection and correction.
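The overhead arithmetic above can be checked with a short sketch (the function name is ours; the byte counts are the ones quoted in the text):

```python
def tengige_utilization(payload_bytes: int) -> float:
    """Fraction of link capacity carrying payload for one 10GigE frame:
    26 bytes of framing overhead plus a 12-byte minimum interframe gap."""
    FRAMING_OVERHEAD = 26   # preamble 8 + addresses 12 + type 2 + CRC 4
    INTERFRAME_GAP = 12
    return payload_bytes / (payload_bytes + FRAMING_OVERHEAD + INTERFRAME_GAP)

print(f"{tengige_utilization(1500):.1%}")  # 97.5% for 1500-byte packets
print(f"{tengige_utilization(46):.1%}")    # far lower for minimum-size frames
```

The same formula shows utilization dropping to roughly 55% for minimum-size frames, which is why per-packet overhead matters for small-packet traffic.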
10GigE therefore offers a very standardized framing structure that may be
used with a wide variety of physical interfaces with the possibility of tapping into
cheap and efficient bandwidth. However, any forwarding decision can be made only
after the entire packet has been examined, which is wasteful in terms of electronic
processing requirements.
2.1.2 Synchronous optical networks – IP over SONET
Another widely deployed standard for optical communication in today’s
communication networks is Synchronous Optical NETwork (SONET). It is an
interface standard for connecting fiber optic transmission systems and fits in the
physical layer of the Open System Interconnect (OSI) model. SONET was originally
proposed by Bellcore in the mid 80s and is currently well defined by ANSI [7].
The standard was originally created to run at OC-1 (51.84Mbps) with the goal of
unifying international digital systems and as a way to multiplex digital channels to
data rates of OC-192 (9.95Gbps) and beyond. While SONET is strictly a physical
layer standard, it lends itself to transporting IP directly as Packet over SONET
(PoS).
The basic SONET frame, shown in Figure 2. 3, consists of 810 bytes of information
that may be viewed as 90 bytes wide and 9 rows high [8]. The user data or
Synchronous Payload Envelope (SPE) occupies 87 columns while 3 columns are
reserved for section overheads. Each SONET frame is transmitted at precise 125usec
intervals regardless of the data rate. Multiplexing of several SONET streams is done
on a byte-by-byte basis to achieve higher data rates of OC-3, OC-12, OC-48 etc.
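The byte-by-byte multiplexing can be illustrated with a toy sketch (a simplification that ignores SONET overhead bytes and pointer processing):

```python
def byte_interleave(tributaries):
    """Multiplex equal-length byte streams one byte at a time, the way
    SONET combines lower-rate streams (e.g. OC-1s) into a higher rate."""
    assert len({len(t) for t in tributaries}) == 1, "equal-length streams only"
    out = bytearray()
    for column in zip(*tributaries):   # take one byte from each stream in turn
        out.extend(column)
    return bytes(out)

# Three toy tributaries multiplexed into one stream at three times the rate:
print(byte_interleave([b"AAA", b"BBB", b"CCC"]))  # b'ABCABCABC'
```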
[Figure: a 9-row by 90-column frame transmitted every 125 usec; the first columns
carry the section and line (transport) overhead, the remainder the SPE with its path
overhead]
Figure 2. 3: Basic SONET Frame. SPE - Synchronous Payload Envelope
All SONET nodes are synchronous or plesiochronous to reduce or eliminate the
need for bit stuffing to compensate for clock variations in streams from different
transmitters. When transmitting IP packets over SONET, each IP datagram is first
encapsulated in the Point-to-Point Protocol (PPP) [RFC 1661] and then framed
using High-level Data Link Control (HDLC) [RFC 1662] before being scrambled
and fit inside the SPE. Figure 2. 4 shows the general format of a HDLC-framed PPP-
encapsulated IP datagram.
[Figure: Flag (1 byte) | Address (1 byte) | Protocol (2 bytes) | Data (usually
46 - 1500 bytes) | FCS (4 bytes) | Flag (1 byte)]
Figure 2. 4: HDLC-framed PPP-encapsulated IP packet
The PPP encapsulation provides multiprotocol encapsulation, error control and link
initialization control while the HDLC framing is mainly used to provide delineation
within the SONET frames. SONET uses 1 + x^43 scrambling to ensure sufficient bit
transitions and for security. The OSI protocol stack for PoS is shown in Figure 2. 5.
PoS uses a minimum of 9 bytes to encapsulate each IP packet and 36 bytes for every
810 bytes as SONET overhead. Framing multiple IP packets into synchronous
containers reduces the total framing overhead and the effective utilization for 1500
byte packets may be as high as 99.5%.
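The 1 + x^43 scrambler mentioned above is self-synchronous: each transmitted bit is the data bit XORed with the bit transmitted 43 bit-times earlier. A minimal sketch with its matching descrambler, assuming an all-zero initial register (real hardware is bit-serial and the register seed may differ):

```python
def scramble(bits, order=43):
    """Self-synchronous 1 + x^43 scrambler: each output bit is the input
    bit XORed with the output bit sent `order` bit-times earlier."""
    shift_reg = [0] * order          # assumed all-zero initial state
    out = []
    for b in bits:
        s = b ^ shift_reg[-1]
        out.append(s)
        shift_reg = [s] + shift_reg[:-1]
    return out

def descramble(bits, order=43):
    """Inverse: XOR each received bit with the received bit `order`
    bit-times earlier; the receiver self-synchronizes after 43 bits."""
    shift_reg = [0] * order
    out = []
    for b in bits:
        out.append(b ^ shift_reg[-1])
        shift_reg = [b] + shift_reg[:-1]
    return out

payload = [1, 0, 1, 1, 0, 0, 1, 0] * 16
assert descramble(scramble(payload)) == payload  # round trip is lossless
```

Because descrambling needs only the received bit stream itself, no scrambler-state alignment between transmitter and receiver is required.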
[Figure: IP at the network layer, over PPP and HDLC at the data link layer, over
SONET at the physical layer]
Figure 2. 5: PoS OSI Stack. IP - Internet Protocol; PPP- Point to Point Protocol; HDLC - High-level Data Link Control; SONET - Synchronous Optical NETwork
While PoS is an efficient way of transferring IP packets, it does not allow for
individual manipulation of packets during forwarding. All framing must be removed
before IP packets can undergo statistical multiplexing/demultiplexing and the
forwarding electronics must therefore examine the entire data stream to do so.
2.2 OPS networks of today
While static provisioning of fiber bandwidth through Wavelength Division
Multiplexing (WDM) has been widely used commercially, OPS
affords the possibility of more efficiently utilizing available bandwidth by
provisioning it dynamically at a much finer granularity. Optical technologies capable
of supporting OPS have been demonstrated and several network approaches such as
KEOPS, AOLS, WASPNet and OPSNet have been proposed. In this section we will
briefly look at two of the significant proposals for OPS and define key issues that
need to be addressed to enable packet forwarding in the optical domain.
2.2.1 KEOPS network
Work on the European ACTS approach to OPS has been done under the KEys to
Optical Packet Switching (KEOPS) project [9], with contributions from companies
such as Alcatel and France Telecom as well as universities such as TUD and
ETH Zurich. The focus here was to analyze and address some basic issues before
optical bit rate transparent packet switching can be performed. The approach proposes
to use fixed duration payloads that are 1.35usecs long with both forwarding
information and the payload being transmitted serially on the same wavelength.
Figure 2. 6 shows the general packet structure used while Figure 2. 7 shows the basic
node structure used in the project. An important issue with this approach is that the
payload time must be adjusted closely relative to the header using synchronizers.
Incoming packets undergo O-E conversion for header recovery and synchronization
[Figure: a 1.646 usec packet comprising guard times, a synchronization field, the
header, and a 1.35 usec payload]
Figure 2. 6: KEOPS Packet Format
before entering the switch fabric. Within the switch, packets undergo header erasure,
demultiplexing, wavelength conversion, multiplexing and finally header rewriting
before exiting. Mach Zehnder Interferometric Wavelength Converters (MZ – IWC)
are used to tune packets to appropriate wavelengths. This approach uses input
buffering [10] in the switch to avoid collisions.
[Figure: an electrical control plane (header recovery and synch control, switch
control, header update) above an optical plane in which packets pass from fiber in
through a synchronizer, demultiplexer, switch fabric with signal regeneration and
header update, and multiplexer to fiber out]
Figure 2. 7: KEOPS Node structure. DMUX - Demultiplexer; MUX - Multiplexer
However, a feed-forward approach to buffering is adopted, which does not provide
good support for differentiated services. While the project addresses some key
concerns of OPS and demonstrates essential enabling optical components, actual end-
to-end demonstration of packet forwarding is not undertaken. The use of a fixed
payload size requires aggregation of smaller IP packet sizes at the edge for efficient
bandwidth utilization. This can have undesirable effects on the overall latency and
places further processing requirements on the edge electronics. Packets longer than
the fixed payload size would require segmentation before being forwarded through
the optical network. Segment misordering and latency increase due to additional
processing steps are some of the drawbacks in doing so. Moreover, while the packet
header may be used for identification of the optical packet itself at core nodes [11],
the need for payload encapsulation is not addressed.
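The bandwidth cost of mapping variable IP packet sizes into fixed-duration payloads, as discussed above, can be estimated with a short sketch; the 576-byte slot below is purely illustrative and not a KEOPS parameter:

```python
import math

def fixed_slot_cost(packet_bytes: int, slot_bytes: int):
    """Number of fixed-size slots a packet occupies after segmentation,
    and the padding bytes wasted in the last slot."""
    slots = math.ceil(packet_bytes / slot_bytes)
    padding = slots * slot_bytes - packet_bytes
    return slots, padding

print(fixed_slot_cost(1500, 576))  # (3, 228): three slots, 228 padding bytes
print(fixed_slot_cost(44, 576))    # (1, 532): a small packet wastes most of a slot
```

The second case shows why small packets must be aggregated at the edge for efficient use of a fixed payload size, at the cost of added latency and processing.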
2.2.2 All Optical Label Swapping network
Another approach to OPS is the All-Optical Label Swapping (AOLS) approach
developed at the University of California, Santa Barbara (UCSB) [12]. Work done in this thesis adopts
the AOLS architecture and investigates the adaptation requirements in implementing
this architecture. The approach considers both serial and Sub-Carrier Multiplexing
(SCM) approaches for labeling of optical packets, as shown in Figure 2. 8, introduces
specific technologies required to implement the architecture and finally demonstrates
high speed label swapping using fiber based wavelength converters (FWC).
[Figure: (a) serial label - an optical label (e.g. 2.5Gbps) and guard band preceding
the encapsulated IP header and variable sized IP payload (e.g. 2.5 - 40Gbps);
(b) SCM label - the optical label carried on a subcarrier over the encapsulated IP
header and payload]
Figure 2. 8: AOLS Packet Frame. (a) Serial label, (b) Optical SCM label
Figure 2. 9 shows the general AOLS architecture, where IP packets are labeled at the
ingress nodes, forwarded all optically at the core node and finally stripped back to
their IP framing before exiting the network at the egress nodes. Fiber Loop Mirrors
(FLM) used for SCM header recovery, fast switching tunable laser sources, SOA
based MZ-IWCs and high speed FWCs have been demonstrated. While the general
framework for OPS has been researched, specific adaptation and actual demonstration
of end-to-end forwarding of IP packets remains to be done.
[Figure: an IP packet from the source node enters an ingress edge router with an
AOLS interface, traverses AOLS core routers/subnets as an optically labeled packet
at λi and λj, and exits through an egress edge router with an AOLS interface to the
destination node]
Figure 2. 9: AOLS Network architecture
Research on electronic adaptation, to ensure proper transport and forwarding of
individual packets, and on optical adaptation, to improve optical forwarding
performance by addressing issues such as buffering for contention resolution, would
help in the complete end-to-end realization of the AOLS architecture.
2.3 Challenges in building optical packet switched networks
As described in the previous section, significant research in OPS has been done over
the past few years including but not limited to research in architectural designs, sub-
systems development and component integration [1, 2]. Consequently, OPS has
reached a state of maturity where end-to-end network design, demonstration and
performance assessment for packet forwarding may be attempted. However,
careful design of both optical and electronic adaptation would be required, taking
into account all the physical layer limitations and forwarding plane requirements. In
this section, we look at the impact of the different optical and electronic layer
constraints on optical packet framing before putting down a set of guidelines for
design of an adaptation mechanism.
2.3.1 Framing requirements
Before proceeding to define the framing requirements, we examine the nature of the
data to be forwarded through the all-optical network. While most or all of the
adaptation proposed and demonstrated in this thesis may be modified to fit different
packet framing standards external to the all-optical network, we concentrate on
forwarding Internet Protocol (IP) packets as they make up most of today's
Internet traffic. Figure 2. 10 shows the general structure of an IP packet. The IP
header consists of multiple fields of which the Type of Service (TOS), total length,
Time To Live (TTL) and source and destination addresses are most relevant for
forwarding in the optical domain. While IPv4 supports packets of up to 65,535 bytes,
Internet statistics [13] show that the majority of Internet traffic has an IP
packet size variation from 44 to 1500 bytes. In this thesis, incoming packets are
assumed to be of variable byte size from 44 to 1500 bytes. Forwarding of packets is
done primarily using the IP destination address, which is 32 bits long. This
information must be passed on to the optical header for proper forwarding and routing
of optical packets.
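As a concrete illustration, the forwarding-relevant fields named above sit at fixed byte offsets in the IPv4 header. The following Python sketch (illustrative only, not part of the edge node implementation described in this thesis) shows their extraction:

```python
import struct

def parse_ipv4_header(pkt: bytes) -> dict:
    """Extract the IPv4 fields most relevant for optical forwarding:
    TOS, total length, TTL and the 32-bit source/destination addresses."""
    ver_ihl, tos, total_len = struct.unpack("!BBH", pkt[:4])
    ttl, proto = struct.unpack("!BB", pkt[8:10])
    src, dst = struct.unpack("!4s4s", pkt[12:20])
    return {
        "version": ver_ihl >> 4,
        "tos": tos,                            # basis for an optical QOS field
        "total_length": total_len,             # typically 44-1500 bytes
        "ttl": ttl,
        "src": ".".join(str(b) for b in src),
        "dst": ".".join(str(b) for b in dst),  # 32-bit forwarding key
    }
```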
[Figure: IPv4 header layout - Version, IHL, TOS, Total Length; Identification, Flags, Fragment Offset; TTL, Protocol, Header Checksum; Source Address; Destination Address; Options and Padding; variable-length data.]
Figure 2. 10: IP Packet Structure. IHL - Internet Header Length; TOS - Type Of Service; TTL - Time To Live
QOS within the optical network may be set by including information from the IP TOS
field. IP packets do not undergo any scrambling and any optical framing mechanism
implemented must, if required, have its own payload scrambling for the purpose of
both security and dc balance. It is also preferable that each IP packet be transferred as
a single self-contained routable unit without fragmentation. In the case where framing
requires fragmentation of the packets, the framing mechanism must ensure that IP
packets exiting the optical network would be whole.
2.3.2 Physical and data link layer issues
When designing an adaptation for OPS, it is important to take the physical rise/fall
time characteristics of both the optics and the control electronics into account. While
technology to handle payload bit rates of up to 160Gbps has been demonstrated at UCSB [14],
electronics required to process framing information and generate control signals
presents a bottleneck in packet forwarding. It is therefore ideal to limit complex bit
level manipulations of the optical packet payload to the edge node electronics while
core node electronics deal only with packet level manipulations. Stream level
manipulation (signal amplification, regeneration, static or MEMS based quasi-static
WDM routing) may be limited to the core transmission optics as shown in Figure 2.
11. Nodes within the optical network would need to communicate for control plane
updates and for data link layer management. The optical adaptation design should
provide a mechanism for such communication between nodes.
Edge nodes – bit level manipulation; core nodes – packet level manipulation; core transmission – stream level manipulation.
Figure 2. 11: Data Entity Granularity
2.4 OPS Performance Measurement Metrics
Another challenge of building an OPS network is the proper evaluation of its
performance. While the performance of optical links is currently gauged using
physical layer metrics to ensure sufficient link quality, an OPS network design must
be evaluated against both physical and network layer metrics.
Performance tests on OPS thus far have been limited to extensive physical layer
measurements and some basic link layer packet loss predictions derived from those
measurements. However, the impact of various electronic and optical factors on
the performance of an OPS network cannot be sufficiently captured by these alone.
For example, a physical layer connection with an excellent bit error rate performance
does not necessarily translate to a low packet loss rate. We present the main
measurement metrics that may be used to assess and improve the network
performance.
2.4.1 Physical Layer (Layer 1) Performance Measurements
Several physical layer performance measurements have been used in this thesis to
assess and predict the performance of the optical subsystems, such as bit error rate,
extinction ratio and Optical Signal to Noise Ratio. Additional measurements such as
bit jitter may be applicable based on the nature of the application.
2.4.1.1 Bit Error Rate:
This measurement is defined as the probability of incorrect identification of a bit by
the decision circuit of the receiver [15]. It is given in terms of a ratio and may be
calculated as the number of erroneous bits received divided by the total number of
bits transmitted, received, or processed over some stipulated period of time. The BER
is usually expressed as a coefficient and a power of 10; for example, 2.5 erroneous
bits out of 100,000 bits transmitted would be 2.5 out of 10^5, or 2.5 × 10^-5. For a BER
measurement to be accurate, a sufficient number of errors must be detected that
statistically validates the measurement.
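The definition above reduces to a simple ratio, and the statistical-validity caveat can be made concrete with a common rule of thumb. A minimal sketch (the 100-error target is an assumption, not a figure from this thesis):

```python
def bit_error_rate(errors: int, bits: int) -> float:
    """BER: erroneous bits divided by total bits observed."""
    return errors / bits

def bits_needed(target_ber: float, min_errors: int = 100) -> float:
    """Rough observation length so that about min_errors errors are
    expected at the target BER (rule-of-thumb assumption)."""
    return min_errors / target_ber
```

For example, 25 errors in 10^6 bits gives a BER of 2.5 × 10^-5, and validating a 10^-9 link to about 100 expected errors requires on the order of 10^11 observed bits.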
2.4.1.2 Extinction Ratio
Extinction ratio is defined as the ratio of two optical power levels of the intensity
modulated representation of a digital signal generated by an optical source, e.g., a
laser diode, where P1 is the optical power level generated when the light source is
"on," and P0 is the power level generated when the light source is "off":

Extinction Ratio = P1 / P0

It is most commonly expressed in decibels (dB) on the logarithmic scale.
2.4.1.3 Optical Signal to Noise Ratio
The Optical Signal to Noise Ratio (OSNR) is the ratio of the optical power in the
signal to the optical noise power within the signal bandwidth (usually defined by
the filter used) accompanying the signal. It is usually given in dB:

OSNR = PSignal / PNoise
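Both ratios are conventionally quoted in decibels; a one-line helper makes the conversion explicit (a generic sketch, not code from this thesis):

```python
import math

def ratio_db(p_num: float, p_den: float) -> float:
    """Power ratio in decibels: 10*log10(Pnum/Pden). Used as
    Extinction Ratio = ratio_db(P1, P0) and OSNR = ratio_db(Psignal, Pnoise)."""
    return 10 * math.log10(p_num / p_den)
```

For instance, a signal carrying 100 times the power of the in-band noise corresponds to an OSNR of 20 dB.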
2.4.2 Packet level (Layer 2/3) Measurements
While physical layer measurements can be a good tool to predict the approximate
packet performance, layer 2/3 measurements are required to fully study and
understand the impact of packet conditions on the physical layer. Packet
measurements are layer 2 measurements when made with just frame identification,
and layer 3 when a more detailed check on both proper forwarding and complete
packet integrity is done before a decision on good or bad packet count is reached. In
addition to the metrics defined below, packet loss and packet jitter may also be
critical based on the type of application.
2.4.2.1 Packet Throughput
Packet throughput may be defined as the maximum packet rate at which none of the
offered packets are dropped by the device. In an OPS network, measurement of end-
to-end throughput is helpful in evaluating the limits of packet forwarding and is a
direct readout of the impact of changes in both network and physical layer
parameters.
2.4.2.2 Packet Latency
Latency may be defined as the time interval starting when the end of the first bit of
the input packet reaches the input port and ending when the start of the first bit of the
output packet is seen on the output port of the node under test. It may be used to
assess the amount of delay a packet suffers when being forwarded through a network.
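These two metrics can be operationalized as below. The zero-loss throughput search follows the spirit of RFC 2544-style benchmarking; `loss_at` stands in for an actual packet-loss measurement and is an assumption of this sketch (a real test would typically binary-search the rate):

```python
def packet_latency(t_in: float, t_out: float) -> float:
    """Latency per the definition above: t_in marks the end of the
    first input bit at the input port, t_out the start of the first
    output bit at the output port."""
    return t_out - t_in

def zero_loss_throughput(offered_rates, loss_at):
    """Highest offered packet rate at which no packets are dropped.
    loss_at(rate) must return the measured packet loss at that rate."""
    best = 0.0
    for rate in sorted(offered_rates):
        if loss_at(rate) == 0:
            best = rate
    return best
```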
2.5 Chapter summary
Packet transport protocols used commercially in today’s optical communication
networks and research approaches to date on Optical Packet Switching (OPS) were
examined in this chapter. While several key enabling optical technologies and
integration are being researched, various framing challenges need to be overcome
before the practical realization of OPS. Different framing formats have been proposed
and demonstrated thus far. However, in order to achieve complete decoupling of
optical payload and header information and to be able to perform any optical
adaptation that may be required at the core, a framing mechanism that supports
variable length payloads is needed. In the next chapter, we propose and evaluate the
performance of a new adaptation mechanism designed based both on lessons learnt
from previous work and additional optical and electronic constraints that may affect
the performance of all-optical packet forwarding.
References
[1] V. Lal, M. Masanovic, D. Wolfson, G. Fish, C. Coldren, and D. J.
Blumenthal, "Monolithic widely tunable optical packet forwarding chip in InP
for all-optical label switching with 40 Gbps payloads and 10 Gbps labels,"
presented at Optical Communication, 2005. ECOC 2005. 31st European
Conference on, vol. 6, pp. 25-26, 2005.
[2] M. L. Masanovic, V. Lal, J. A. Summers, J. S. Barton, E. J. Skogen, L. G.
Rau, L. A. Coldren, and D. J. Blumenthal, "Widely tunable monolithically
integrated all-optical wavelength converters in InP," Lightwave Technology,
Journal of, vol. 23, pp. 1350-1362, 2005.
[3] D. J. Blumenthal, "Optical packet switching," presented at Lasers and Electro-
Optics Society, 2004. LEOS 2004. The 17th Annual Meeting of the IEEE, vol. 2,
pp. 910-912, 2004.
[4] C. DeCusatis, "Storage area network applications," presented at Optical Fiber
Communication Conference and Exhibit, 2002. OFC 2002, pp. 443-444, 2002.
[5] M. C. Nuss, "Optical Ethernet in the Metro," presented at Lasers and Electro-
Optics Society, 2001. LEOS 2001. The 14th Annual Meeting of the IEEE, vol. 1,
p. 289, 2001.
[6] J. Hurwitz and W. C. Feng, "End-to-end performance of 10-gigabit Ethernet
on commodity systems," Micro, IEEE, vol. 24, pp. 10-22, 2004.
[7] ANSI T1.105-1991 and ANSI T1.105-1988, Digital hierarchy - optical
interface rates and format specifications.
[8] A. S. Tanenbaum, Computer networks, 3rd ed. Upper Saddle River, N.J.:
Prentice Hall PTR, 1996.
[9] C. Guillemot, M. Renaud, P. Gambini, C. Janz, I. Andonovic, R. Bauknecht,
B. Bostica, M. Burzio, F. Callegati, M. Casoni, D. Chiaroni, F. Clerot, S. L.
Danielsen, F. Dorgeuille, A. Dupas, A. Franzen, P. B. Hansen, D. K. Hunter,
A. Kloch, R. Krahenbuhl, B. Lavigne, A. Le Corre, C. Raffaelli, M. Schilling,
J. C. Simon, and L. Zucchelli, "Transparent optical packet switching: the
European ACTS KEOPS project approach," Lightwave Technology, Journal
of, vol. 16, pp. 2117-2134, 1998.
[10] A. Pattavina, "Architectures and performance of optical packet switching
nodes for IP networks," Lightwave Technology, Journal of, vol. 23, pp. 1023-
1032, 2005.
[11] C. Guillemot, A. Lecorre, J. Kervarec, M. Henry, J. C. Simon, A. Luron, C.
Vuchener, P. Lamouler, and P. Gravey, "Optical packet switch demonstrator
assessment: packet delineation and fast wavelength routing," presented at
Integrated Optics and Optical Fibre Communications, 11th International
Conference on, and 23rd European Conference on Optical Communications
(Conf. Publ. No. 448), vol. 3, pp. 343-346, 1997.
[12] D. J. Blumenthal, B. E. Olsson, G. Rossi, T. E. Dimmick, L. Rau, M.
Masanovic, O. Lavrova, R. Doshi, O. Jerphagnon, J. E. Bowers, V. Kaman, L.
A. Coldren, and J. Barton, "All-optical label swapping networks and
technologies," Lightwave Technology, Journal of, vol. 18, pp. 2058-2075,
2000.
[13] K. Thompson, G. J. Miller, and R. Wilder, "Wide-area Internet traffic patterns
and characteristics," Network, IEEE, vol. 11, pp. 10-23, 1997.
[14] W. Wei, L. G. Rau, and D. J. Blumenthal, "160 Gb/s variable length packet/10
Gb/s-label all-optical label switching with wavelength conversion and
unicast/multicast operation," Lightwave Technology, Journal of, vol. 23, pp.
211-218, 2005.
[15] G. P. Agrawal, Fiber-optic Communication Systems, 2nd ed.
Chapter 3
Edge Node Traffic Adaptation
All Optical Packet Switched (AOPS) networks present an approach to forwarding
traffic where the forwarded optical payload does not undergo Optical-Electronic-
Optical (O-E-O) conversions throughout the network. However, to exploit the
advantages of such a network, a suitable adaptation that offers sufficient forwarding
control of the optical payloads using electronic control signals is required. In the
previous chapter we examined basic issues in adapting traffic such that all-optical
packet forwarding may take place at the core. Several factors contribute to the end-to-
end efficiency of an OPS network. In this chapter, we analyze the traffic
characteristics along with electronic and optical performance limitations, propose a
framing format for packet forwarding within the optical network and study the
various subsystems that go into building edge nodes to implement packet adaptation.
Finally, we evaluate the performance of edge adaptation implemented based on layer-
3 performance metrics and propose ways of increasing overall forwarding efficiency.
3.1 Architectural Considerations for an All-Optical Label
Swapped Packet Switched network
In implementing an OPS network, a successful packet adaptation would need to take
several influencing elements into account. At the network edge, packets transition
between the electronic and optical planes and the bulk of packet processing takes
place in the electronic domain. Consequently, bit level processing of the packet stream
is possible at the base rate. In order to exploit the high bandwidth capabilities offered by
optics, it is desirable that optical payloads are transmitted at higher bit rates (10Gbps
or higher) while the framing that requires electronic processing at the optical core be
transmitted at a lower data rate (e.g. 2.5Gbps). Figure 3. 1 shows the optical and
electronic functions at different points of an optical network along with issues and
bottlenecks limiting the forwarding process. At the network edge, forwarding is
limited mainly by electronic bottlenecks such as IP packet processing and bit level
frame processing at edge data rates. Furthermore, it is critical that any framing
mechanism used ensures error-free bit level recovery of forwarded payloads. At the
core however, network performance limitations include optical rise/fall time
considerations of forwarding elements, the electronic header recovery and
uncertainties in control signals. Complete control over serialization and
deserialization processes is required to reduce control signal timing uncertainties to
within one bit period at the serialized data rate. However, commercially available
electronics do not satisfy this new requirement, as they are traditionally used on
continuous data streams.
[Figure: ingress optics perform fast wavelength tuning, with electronics for framing and serialization; core optics perform fast wavelength tuning and packet wavelength conversion, with electronics for header recovery and forwarding control signal generation; egress optics perform optical receive, with electronics for deframing and deserialization. Issues and bottlenecks include parallel processing of serial data, optical rise/fall times, electronic header CDR, DC balance, per-packet payload CDR and physical layer variations.]
Figure 3. 1: End-to-End network functional architecture.
3.1.1 Traffic characteristics at the Network Edge
Packet traffic entering and leaving an OPS network is framed in existing data
link/physical layer protocols such as Ethernet or Packet over SONET (PoS).
Connections between the optical network edges may be viewed as simple IP-only
connections, with control of packet forwarding within the optical network confined to
the optical control plane, as in the case of a peer-to-peer network. Alternatively, it may
be dictated by the routing mechanism external to the optical network, as would be the
case of an overlay model (Figure 3. 2).
[Figure: (a) in the peer-to-peer model the optical control and forwarding plane interworks directly with the external network control plane; (b) in the overlay model forwarding through the optical network is dictated by the external network control plane.]
Figure 3. 2: (a) Peer-to-Peer vs. (b) Overlay optical network approach
In order for the optical network to be protocol transparent at the network edge, i.e. to
allow IP packet traffic to enter/leave the OPS network over various lower layer
protocols, any protocol connection into or out of the OPS network must be terminated
at the edge.
packets transported over PoS interfaces as the connection to the outside network. A
suitable framing/deframing card at the ingress/egress of the OPS network would
allow for interfacing to any other external protocol standard. IP Packets are assumed
to be of variable byte size from 40 to 1500 bytes and fragmentation of packets is not
considered due to the additional complexity and problems it introduces.
3.1.2 Electronics in optical networks
While traffic within an OPS network predominantly remains in the optical domain,
control of all traffic is performed using electronics. At the network edges, processing
and framing of the entire packet is done in parallel words of 32/64 bits or higher.
However, unlike most of today's transport standards, where processing is performed
on the entire stream of packets, each transmitted packet may follow a different
forwarding path through the network. Consequently, bit level processing on parallel
word traffic would have to be performed considering each packet individually as
opposed to on the entire data stream for traffic on existing standards. Furthermore,
aggregation of packets may be required to scale from lower edge rates for better
utilization of high-speed optical links. During detection at the network egress, per
packet variation of the optimum bit threshold level and the appropriate choice of
sampling instant is critical to maintain error free operation when receiving packets
from different sources. Limiting amplifiers may be used to provide suitable
interfacing between detected analog signals and the digital sampling circuits. At the
network core, a weak temporal relationship between the payload and framing exists.
Electronic rise/fall times of control signals add up to rise/fall times of optical
components while timing uncertainty caused due to serialization/deserialization
originates primarily from core node electronics. These factors must be taken into
account while designing the optical framing mechanism. Control electronics also
convert digital control bits into analog signals for control of optics and include Digital-
to-Analog Converters (DAC), limiting amplifiers, modulator drivers and current
drivers. After header removal, control signals are generated at packet rate. Varying
interpacket gaps result in varying dc content in these control signals and pose a
problem in electronics as most commercial high speed electronics are ac coupled. A
compromise between rise/fall times and ac coupling effects may be required for
proper operation. Advantages in packet level processing of forwarding information at
the optical core nodes may be counterbalanced by an inefficient framing mechanism.
Guard bands and synchronization times that allow for processing time in the optical
domain decide the efficiency of the packet framing. However, at the cost of lower
link bandwidth utilization, decoupling of high bit rate payloads from the base rate
framing using expensive guard bands can result in significantly greater scalable
packet forwarding rates.
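The utilization cost of guard bands and base-rate framing can be estimated with a simple timing model. In the sketch below, the 40-byte overhead matches the framing of Section 3.2.4, while the 10 Gbps payload rate, 2.5 Gbps framing rate and 100 ns guard bands are illustrative assumptions:

```python
def link_efficiency(payload_bytes: int,
                    payload_rate_bps: float = 10e9,   # assumed payload rate
                    frame_rate_bps: float = 2.5e9,    # assumed framing rate
                    overhead_bytes: int = 40,         # 10-word framing overhead
                    guard_band_s: float = 100e-9,     # assumed guard band length
                    n_guard_bands: int = 2) -> float:
    """Fraction of on-the-wire time carrying payload bits for one packet."""
    t_payload = payload_bytes * 8 / payload_rate_bps
    t_framing = overhead_bytes * 8 / frame_rate_bps
    t_total = t_payload + t_framing + n_guard_bands * guard_band_s
    return t_payload / t_total
```

Under these assumptions a 1500-byte payload uses the link at roughly 79% efficiency while a 44-byte payload drops below 10%, which is precisely the scalability-versus-utilization compromise described above.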
3.2 Optical Framing Considerations
In this section we design and implement a framing structure based on various
requirements by both the optical core and edge electronic processing stages to ensure
a) proper all-optical payload forwarding and b) successful recovery of all the payload
bits. The general packet structure consists of a framing mechanism containing header
and packet delineation words encapsulating a decoupled payload that may be at
higher bit rate. Decoupling ensures payload transparency in that payloads may be
processed all-optically at the core after header and optical framing information have
been stripped and based only on optical header information. The framing mechanism
must support continuously variable payload sizes from 40 to 1500 bytes. After
framing, measurements within the optical network would be confined to the physical
layer whereas layer-3 measurements may be performed end-to-end only.
3.2.1 Forwarding requirements
Edge forwarding may be based on the raw IP header while at the core examination of
the optical payload is not possible. Forwarding information must therefore be
included in the optical header and may be extracted from the IP destination address,
the Type of Service (TOS) and the Time To Live (TTL) fields. Although the IP
destination address is 32 bits long, it is desirable that the size of the Optical Header
(OH) be kept small for fast next-hop and label lookup at the core. While this limits the
effective maximum size of the optical network, label reusability in label swapping
technologies helps in extending the network size further. Quality of service and
differentiated services may be implemented by including a QOS field in the optical
header. While this information may be based on the TOS field of IP packets, tighter
network control is possible as these bits are set by the optical ingress nodes and not
by the end user as is the case in IP. This could affect bandwidth provisioning and core
node decisions such as access to buffering and contention resolution elements. The
TTL field in an optical packet may be either transported end-to-end and decremented
at the optical egress point, or may be decremented at each node within the optical
network and used to ensure that packets do not occupy network bandwidth infinitely
due to virtual forwarding loops. When the TTL field falls to zero and the packet does
not reach a network egress, packet payload may either be re-examined and a new
optical header assigned, or the packet may be simply dropped.
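The per-hop variant of the TTL handling just described can be stated compactly; re-labeling instead of dropping is the alternative branch (an illustrative sketch, not the thesis implementation):

```python
def ttl_step(ttl: int, at_egress: bool) -> tuple[int, str]:
    """Decrement the optical TTL at a node; a packet whose TTL expires
    away from an egress is dropped (or could be re-examined and
    assigned a new optical header)."""
    ttl -= 1
    if ttl <= 0 and not at_egress:
        return ttl, "drop"
    return ttl, "forward"
```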
3.2.2 Idlers
In OPS, each optical packet may be viewed as an individual routable unit and packets
in each stream may be statistically multiplexed and demultiplexed at each node within
the optical network. Optical receivers both at the core and egress nodes however,
require that the average power level on incoming link be kept constant for proper
operation. DC balance of the optical packet becomes critical for the following
reasons:
1. Equalization of optical power through the entire packet. Lack of power
equalization would lead to uneven gain at EDFA (amplifiers).
2. Assurance of a bit transition required for clock recovery every specified number
of bits. For any clock recovery system to work, a guaranteed bit transition is
required over a certain number of bits, to charge the system. DC balance can
provide this guarantee.
Idlers or inter-packet fills may be added between packets in a stream to maintain
constant average power and provide adequate dc balance between packets. However,
a proper encoding scheme is required to provide DC balance within the optical packet
itself, more specifically within the optical payload. Standard encoding formats such
as 64B/66B or Manchester encoding can be used to provide the required DC balance.
Idlers may also be used for link level communication between nodes. Characters used
as idlers must be uniquely identifiable and must contain sufficient bit transitions to
maintain dc balance. At the optical ingress, idlers may be generated locally and
inserted after payload processing and header insertion as the entire packet stream is
transported together. At the network core however, statistical multiplexing and
demultiplexing of packets changes the nature of packet streams and external idlers
may be required to fill newly generated gaps within the packet stream. In the absence
of proper dc balance in the core, optical components such as limiting amplified
photoreceivers and amplifiers experience transient behavior as shown in Figure 3. 3
that may corrupt the packet stream and result in permanent loss of information.
[Figure: an incoming packet stream with uneven gaps causes photodetector/TLA slow start and amplifier gain transients, distorting packets.]
Figure 3. 3: Need for idlers - transmission distortions in the absence of dc balance
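Of the encodings mentioned above, Manchester coding gives the strongest guarantee: every data bit maps to a balanced symbol pair, so the line carries exactly 50% ones and transitions at least every two symbol periods. A minimal sketch, using the 1→10, 0→01 convention (the thesis payload itself uses a simplified 32B/34B code instead):

```python
def manchester_encode(data: bytes) -> list[int]:
    """Map each data bit to a DC-balanced symbol pair: 1 -> 10, 0 -> 01."""
    out = []
    for byte in data:
        for i in range(7, -1, -1):                    # MSB first
            out += [1, 0] if (byte >> i) & 1 else [0, 1]
    return out
```

The price is a doubled line rate for the same data rate, which is why low-overhead block codes such as 64B/66B are favored at high bit rates.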
3.2.3 Edge and Core node Clock-Data Recovery (CDR)
After photodetection, successful recovery of bits within the optical packets depends
on clock and data recovery. After forwarding, packets arriving at the next hop may be
from different node transmitters and consequently at slightly different phase and
frequencies. Moreover, since optical framing is erased and rewritten at each hop,
there is no correlation between phase and frequency of frame and payload clocks.
Clock recovery must therefore be performed on the fly and with extremely short
capture times. An OPS framing mechanism must therefore allow for packet level
clock synchronization times and must ensure adequate bit transitions within both the
payload and the framing to facilitate clock tone isolation from the data stream. At the
core, clock synchronization achieved at the start of the optical packet is lost during
the payload time and therefore must be locked on again before the end of optical
packet is detected. While all-optical clock recovery approaches have been proposed
[1], a filter based approach is used here since, unlike Phase Locked
Loops (PLLs) and flip-flop based bang-bang detectors (with capture times in the µs
range), it can capture clock information within tens of bits. Figure 3. 4 shows the
general structure of a filter based clock recovery approach.
Photodetector → amplifier → frequency doubler → narrowband filter → limiting amplifier
Figure 3. 4: Packet mode clock recovery
In the case of Non-Return-to-Zero (NRZ) data transmission as is usually the case with
lower bit rates used for optical framing, a frequency doubler is used to retrieve the
appropriate clock tone, whereas this may be excluded when CDR is performed on
Return-to-Zero (RZ) traffic. After initial photodetection and amplification of the
incoming data signal, a doubler is used to recover the clock tone at the data repetition
rate. A narrowband filter is then used to reject all frequencies but the clock tone. The
Q value of the filter decides the capture time of the clock recovery circuit: the lower
the Q, the faster the capture of clock information. However, a tradeoff in Q
value selection is required, as a faster capture time also means a smaller hold time
after data bit transitions are removed. The recovered clock signal suffers from
amplitude variations due to the varying clock content within the data bits and
therefore necessitates the use of a limiting amplifier with a wide input amplitude
dynamic range to guarantee a stable clock. At the core, any optical adaptation to
different bit rates would require additional payload clock recovery. However, this
may be limited to a sub harmonic clock recovery by use of appropriate synch words
and encoding schemes.
3.2.4 Framing structure
The AOLS adaptation layer must satisfy the following main criteria:
1. Complete optical payload and frame decoupling must be achieved. Packet
level processing of optical payload based only on the optical packet header
must be possible and proper delineation of both packet and payload is
required.
2. As discussed in the previous section, on-the-fly clock recovery must be
possible based on synchronization words in both the header and the payload.
Adequate frame and payload coding is required to ensure sufficient bit
transitions.
3. All forwarding information for optical packets must be self contained within
the header and must be accessible at lower bit rates for electronic processing at
the optical core nodes. All forwarding decisions made at the core are based
completely on this information without requiring electronic examination of
the optical payload.
4. Interpacket fills must be in place at all times to ensure adequate dc balance.
Idlers used for this purpose must contain sufficient bit transitions to maintain a
constant average power.
5. Any further optical adaptation requiring transfer of information on the payload
must be contained within the optical header and passed along with the payload
even after label swapping.
Based on these requirements we propose an AOLS packet framing mechanism as
shown in Figure 3. 5.
Idlers | SW | SOOP | OL | GB | SW | SOIP | Optical Payload | EOIP | GB | SW | EOOP | Idlers
(optical packet framing enclosing a variable sized decoupled payload container)
Figure 3. 5: Optical Framing Structure. SW - Synch. Word; S/EOOP - Start/End Of Optical Packet; OL - Optical Label; S/EOIP - Start/End Of IP Packet
A synchronization word (SW) at frame data rate is used at the start of each optical
packet for clock recovery followed by a unique Start Of Optical Packet (SOOP)
identifier word. An End Of Optical Packet (EOOP) identifier demarcates the end of
each optical packet and is preceded by another frame rate synchronization word. All
forwarding information required within the optical network is contained in the
Optical Label (OL). Guard bands are used to temporally decouple the optical payload
from the framing and the size of the guard bands is determined by several factors
including the largest rise/fall time for optical processing elements within the core and
any accumulative effects of timing uncertainties. Each packet payload container starts
with a payload rate synchronization word and a Start Of IP Packet (SOIP) identifier
word and may be of variable length. An End Of IP Packet (EOIP) word denotes the
end of the optical payload and no synchronization word is required preceding this as
clock information is assumed to be maintained throughout the payload. Each optical
payload undergoes a simplified 32B/34B encoding scheme to ensure that unique
framing identifiers are not present within the payload. Each word consists of 32 bits
or 4 bytes to conform to the electronic processing bus width and each optical packet
has a 10-word (40-byte) overhead. Figure 3. 6 shows the simple structure of the 32
bit optical header which consists of 12 reserve bits that may be used to carry
additional information for any optical adaptation, 9 bits of an Optical identifier
(O_ID) used in making forwarding decisions at the core, 8 bits for TTL and 3 bits for
priority to implement quality of service.
Optical Label – 32 bits: O_ID (9 bits) | TTL (8 bits) | Priority (3 bits) | Reserve bits (12 bits)
Figure 3. 6: Example 32 bit Optical Label or header. O_ID - Optical Identifier; TTL - Time To Live
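These fields pack exactly into one 32-bit word. The widths below are those of Figure 3. 6; the MSB-to-LSB ordering (O_ID, TTL, priority, reserve) is an assumption for illustration, since the text does not fix the bit positions:

```python
def pack_label(o_id: int, ttl: int, priority: int, reserved: int = 0) -> int:
    """Assumed layout, MSB to LSB: O_ID(9) | TTL(8) | priority(3) | reserve(12)."""
    assert 0 <= o_id < 512 and 0 <= ttl < 256 and 0 <= priority < 8
    return (o_id << 23) | (ttl << 15) | (priority << 12) | (reserved & 0xFFF)

def unpack_label(label: int):
    """Recover (o_id, ttl, priority, reserved) from a 32-bit label."""
    return ((label >> 23) & 0x1FF, (label >> 15) & 0xFF,
            (label >> 12) & 0x7, label & 0xFFF)
```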
While 9 O_ID bits limit the number of unique ID combinations to 512, the label
reuse inherent in the label swapping technique can be used to increase the
effective maximum optical network size significantly. Figure 3. 7 shows scope
traces of the
framed optical packets transmitted from the optical ingress.
Figure 3. 7: Experimental scope traces of (a) Optically framed packets (b) Packet framing (refer Figure 3. 6)
3.3 Edge Node Adaptation for OPS Traffic
While prior state-of-the-art framing mechanisms deal with fixed-size packets
[2], the AOLS adaptation considers the forwarding requirements of variable-size
packets. Edge
node adaptation acts as the interface between electronic routing (with optical
transport such as 10GigE and PoS) and all-optical packet forwarding. This section
deals with the structure of edge nodes to successfully implement the adaptation
mechanism proposed in the previous section. Figure 3. 8 shows the adaptation layer
processing steps at the optical network edges. Raw IP packets are extracted from
external transport protocol framing and undergo encoding and delimiting to form the
optical payload. Forwarding information extracted from the IP header is then used to
generate an optical header and framing before the optical packet is placed in the
stream. Locally generated idlers are finally inserted into the data stream to ensure
uniform average power before transmission into the optical network at the appropriate
wavelength. After all-optical forwarding through the optical network, payload
extraction from the optical data stream is performed before payload decoding
and IP packet extraction takes place.

Figure 3. 8: Edge node adaptation layering

The IP header may be updated
with information such as TTL from the optical packet header before the IP packets
are framed in external transport protocol framing and exit the optical network. It is
important to note that while data link and physical layer link signaling are terminated
at the optical network edge, layer-3 or IP continuity is maintained through the AOLS
network and is required for end-to-end performance measurements.
3.3.1 Ingress Node Description
The ingress node functionality implements the following set of steps to generate an
optical packet stream.
1. Deframe incoming data from external protocols and present raw IP packets
ready for packet adaptation.
2. Process IP packets and generate framing by encapsulating each packet into
an optical packet. This stage also performs label lookup based on the IP
header, and payload encoding.
3. Use a fast tunable laser with a digital control input to transmit each
optical packet at the appropriate wavelength, and an optical modulator to
generate an optical data stream from the framed electrical packet stream.
Wavelength selection is required to perform space switching by use of an
Arrayed WaveGuide Router (AWGR).
Figure 3. 9: Ingress node high level schematic
Figure 3. 9 shows the functionality of the ingress node. Incoming traffic is received as
a PoS OC-48 (2.488Gbps) stream and is decapsulated in the POS deframer which
outputs raw IP packets at its SONET PHY interface. A General Purpose Protocol
Processor (GPPPi) performs individual IP packet identification lookup and framing
along with locally generated idler insertion. A fast tunable laser board controlled by
the GPPPi is used to generate a CW carrier at the wavelength selected on a per-packet basis
onto which optical packets are modulated using an electro-optic modulator. The
ingress node may be adapted to receive IP packets in any external transport protocol
by simply exchanging the POS deframer board with a suitable decapsulation board.
3.3.2 Egress Node Description
Egress node functionality is implemented in the following steps to extract IP packets
and forward them out of the optical network.
1. Amplified photodetection followed by clock and data recovery for payload
(and framing).
2. Payload identification and IP packet extraction.
3. PoS Framing before IP packets exit the optical network.
After successful forwarding through the optical network, packets arrive as an optical
data stream into the egress node (Figure 3. 10).
Figure 3. 10: Egress node functionality. CDR- Clock Data Recovery
While optical framing is generated by a single upstream node for all optical
packets, optical payloads may have undergone completely different processing
stages and may originate from different ingress nodes. Consequently, payloads
may be of varying
amplitudes and signal qualities. A limiting amplified photodetector is used at the
egress to eliminate amplitude variations between packet payloads and present a
constant electrical stream for Clock Data Recovery (CDR). At the egress,
examination of the optical framing may or may not be required, which could call
for separate CDRs for frame and payload data extraction. For all
demonstrations and performance measurements, the clock for data recovery is
fed from the transmitting ingress node to the core and egress nodes, although
a filter based clock recovery technique is proposed in this research. At the
GPPPe, IP packet extraction is
performed after suitable payload identification based on optical frame demarcation
words and decoding of the payload container. Extracted IP packets are finally
encapsulated into OC-48 PoS traffic using a PoS framer board before exiting the
optical network.
3.4 Edge Node Implementation
Edge node implementation is carried out using three main boards: the POS deframer,
the GPPP and the digitally controllable fast tunable laser board. Electronic to Optical
(E/O) and Optical to Electronic (O/E) functions are performed by a Lithium
Niobate (LiNbO3) modulator and a limiting amplified photodetector, both with a
10GHz bandwidth. Payload decoupling is achieved by implementing the optical
packet structure discussed in section 3.2.4 and can support forwarding of high bit rate
payloads with lower bit rate optical framing. However, in this research, emphasis is
placed on examination of implementation bottlenecks and performance of AOLS in
forwarding IP traffic. Payload and framing data rates are therefore chosen to be the
same at 2.5Gbps although core processing examines only optical framing information
for forwarding decisions.
3.4.1 PoS Framer/ Deframer
A Vitesse POS (Packet Over SONET) physical layer framer/deframer board operating
at 2.488Gbps was used to perform SONET and PPP/HDLC framing/deframing of the
IP packets traveling in and out of the AOLS network. The board consists of the POS
line optical interface, a VSC8140 clock-data recovery unit and a VSC 9112 POS
framer/deframer chip as shown in Figure 3. 11.
Figure 3. 11: Vitesse POS OC-48 Framer/Deframer functionality
Traffic entering the AOLS network is detected by the onboard photoreceiver and
clock data recovery is performed by the VSC8140 unit using a locally generated
stratum-3 77.78MHz clock from a crystal oscillator. Digital data bits then arrive at the
line side POS interface and are examined by the 9112 deframing section. SONET
overhead information is extracted from the POS frames and the PPP/HDLC framed
IP packet stream is forwarded to the next section. Individual IP packets are then
extracted from the PPP/HDLC stream at the packet demapping section. The extracted
IP packets are queued in their order of arrival at the line side interface buffer, ready
for encapsulation into the optical packet format by the ingress General Purpose
Protocol Processor (GPPP). The IP packet line side interface may be run by a
77.78MHz clock from the GPPP independent from the POS interface side thereby
providing separation of clock domains between the synchronous SONET world and
the AOLS optical domain. The IP line side buffers are capable of satisfying any
buffering requirements arising due to slight clock frequency misalignments. The
framer/deframer card may be appropriately configured for the type of POS traffic
received/ transmitted via a computer interface. While the size of the current board
used is about 16”x12”, the board may be reduced to less than a fourth of this size for
commercial use.
3.4.2 Electronic Control design
General Purpose Protocol Processors (GPPP) are used to adapt raw IP packets to and
from the optical framing mechanism. IP packets from the POS framer/deframer board
are transferred to and from the GPPP via a 32 bit wide parallel bus clocked at
77.78MHz. Figure 3. 12 shows the ingress FPGA process functionality where
incoming IP packets are transported along with control signals. IP header processing
and lookup is performed on a per packet basis to determine the appropriate optical
header and wavelength select. Optical payloads then undergo encoding to eliminate
the accidental presence of a control word such as SOOP, EOOP and are finally
encapsulated into an optical frame before idler insertion into the stream.
Figure 3. 12: Ingress FPGA process functional diagram
The GPPPi generates a 2.5Gbps electrical optical packet stream and a set of
parallel 'wavelength select' bits to control the fast switching tunable laser
board. While the encoding rests on the simple principle of embedding 0's
within each 32 bit payload word, the implementation is complex for the
following reasons:
1. 34 bits are generated for every 32 bits input therefore requiring a degree of
flow control to fit encoded payload bits into a 32 bit wide bus.
2. Bit level operations on a 32 bit word based stream are necessary. Rotation of
the entire packet payload stream on a word by word basis is used to emulate
bit stream traffic and achieve encoding.
3. Since encoding is done only on packet payloads, it must start afresh for
each incoming optical payload. Precise control signaling is therefore
necessary, as opposed to encoding a continuous bit stream as is the case
with existing protocols.
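The flow-control problem in point 1 can be sketched as follows. The `encode34` function here is a stand-in placeholder (the real 32B/34B codewords are not reproduced); the point is only the re-packing of 34-bit outputs onto a 32 bit wide bus:

```python
# Illustrative sketch, not the exact 32B/34B code used in the hardware:
# each 32-bit input word yields a 34-bit encoded word, so encoded bits must
# be accumulated and re-emitted as 32-bit bus words.

def encode34(word32: int) -> int:
    """Placeholder 34-bit codeword: prepend a fixed '01' pair."""
    return (0b01 << 32) | word32

def repack(words32):
    """Encode each 32-bit word to 34 bits and re-emit 32-bit bus words."""
    acc, nbits, out = 0, 0, []
    for w in words32:
        acc = (acc << 34) | encode34(w)   # append 34 encoded bits
        nbits += 34
        while nbits >= 32:                # drain full 32-bit bus words
            nbits -= 32
            out.append((acc >> nbits) & 0xFFFFFFFF)
    # leftover bits must wait for the next input word (the flow control)
    return out, acc & ((1 << nbits) - 1), nbits
```

Note how every input word leaves two residual bits behind, so the output word boundary drifts relative to the input, which is exactly why per-packet control signaling is needed.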
Egress node electronic control performs the reverse process of identification of
optical payload, decoding and extraction of raw IP packets. Decoupling of payload
containers from the optical packet framing results in payloads arriving at a random bit
position relative to the 32 bit processor boundary. Figure 3. 13 shows the egress
FPGA process functionality which includes a phase locker to align payload containers
to the 32 bit processing word boundary, a decoder, and an IP packet extraction stage to
assemble the entire IP packet before transferring it to the POS framer.
Figure 3. 13: Egress FPGA process functionality
The phase locker compares sequential incoming 32 bit words against the SOIP
and EOIP words to identify which of 31 possible bit phases the incoming
payload occupies, as shown in Figure 3. 14. An aligner stage uses the phase
information to rotate the incoming payload to the 32 bit word boundary for
further processing. The decoder performs the reverse process of the encoder
and yields 32 valid payload bits for
every 34 bits received. IP packets must therefore be stored in a FIFO until the entire
packet has been decoded before they are ready for transfer into the POS framer.
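The phase-locker principle can be modeled in a few lines: pair two consecutive 32 bit words into a 64 bit window and test each bit offset for the SOIP pattern. The SOIP value below is hypothetical; the actual delimiter codeword is not reproduced here:

```python
# Minimal model of the egress phase locker: slide a 32-bit window across a
# 64-bit pairing of consecutive words and test each bit phase for SOIP.

SOIP = 0xF628C44D  # hypothetical 32-bit SOIP delimiter word

def find_phase(word_hi: int, word_lo: int, pattern: int = SOIP):
    """Return the bit phase (0 = word aligned, 1..31 = misaligned) of
    the pattern within the 64-bit window, or None if absent."""
    window = (word_hi << 32) | word_lo
    for phase in range(32):
        if (window >> (32 - phase)) & 0xFFFFFFFF == pattern:
            return phase
    return None
```

An aligner would then left-rotate the subsequent stream by the returned phase so that payload words land on the 32 bit processing boundary.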
Figure 3. 14: Egress FPGA - Phase locker operating principle
3.4.3 Tunable Laser board and control
A fast tunable laser board was specially designed to control a pigtailed 3-section
Bookham Distributed Feedback (DFB) laser to tune between selected wavelengths
within a few nanoseconds. Whereas commercial laser boards tune in the
microsecond range, tunability within a few nanoseconds is important to forward
packets rapidly and reduce guard bands between packets. Figure 3. 15 shows the
functional schematic of the fast tunable laser board, which is digitally
controlled using a parallel bus clocked from the upstream control device. An
FPGA is used to convert the digital wavelength
select bits into appropriate DAC values that set the current for the gain, phase and
rear sections. Fast settling DACs from Analog Devices (AD) were used along with an
op-amp based current driver circuitry to tune to the selected current values.
Figure 3. 15: Fast tunable laser functionality. DAC – Digital to Analog Converter; V/I – Voltage to Current converter
Figure 3. 16 shows a sample wavelength spectrum output of the tuned DFB laser
and a sample switching curve. A 0.6nm filter was used to measure the switching
time, which was ~6ns between the channels at 1548.51nm and 1549.32nm.

Figure 3. 16: Fast tunable laser spectrum at 1549.32nm and switching scope trace
Switching time of the DFB laser is currently limited by the settling time of the DACs
and may be brought down to a few hundred picoseconds in future.
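The laser-board FPGA lookup described above can be sketched as a calibration table mapping a wavelength-select code to 12 bit DAC codes for the three laser sections. All current values and the full-scale assumption here are invented placeholders, not measured calibration data:

```python
# Sketch of the wavelength-select lookup: each code indexes a calibration
# entry of (gain, phase, rear) section currents, quantized to 12-bit DACs.
# The table values and full-scale current are hypothetical.

FULL_SCALE_MA = 100.0                 # assumed DAC full-scale drive current
CAL_TABLE_MA = {                      # hypothetical calibration, in mA
    0: (62.0, 8.5, 21.0),             # e.g. channel at 1548.51 nm
    1: (60.5, 11.0, 27.5),            # e.g. channel at 1549.32 nm
}

def dac_codes(wavelength_select: int):
    """Convert a wavelength-select code to three 12-bit DAC codes."""
    currents = CAL_TABLE_MA[wavelength_select]
    return tuple(round(i / FULL_SCALE_MA * (2**12 - 1)) for i in currents)
```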
3.5 Edge Adaptation testing and performance
We used Layer-3 performance metrics to evaluate the performance of the back-to-
back edge node adaptation. Testing was done using a Spirent (Smartbits) SMB6000
chassis with a POS OC-48 tester interface and controlled using SmartWindows
application. While all data may be generated and received by the Layer-3
tester, a Cisco GSR 12000 electronic router was used in the receive path (a)
to test the performance limits of electrical routers as compared with the
optical adaptation implemented, and (b) for incremental testing, since
electronic routers act as aggregation routers for traffic arriving at multiple
egress nodes after optical packet forwarding at the core. The signal path used
in adaptation testing is shown in Figure 3. 17.
Figure 3. 17: Test path and traffic flow setup
No significant throughput or latency changes were observed when electrical and
optical back-to-back measurements were performed indicating that basic optical
transmission had no impact on the measurements. Packet sizes were varied from 40 to
1500 bytes and packet destination IP addresses were configured to match an entry in
the ingress header lookup table.
3.5.1 Impact of optical framing on utilization
A theoretical study of the impact on utilization due to adaptation overheads with IP
payloads varying from 40 to 1500 bytes is shown in Figure 3. 18. It can be seen that
for small packets (40bytes) a 50% penalty is incurred due to the 40 byte overhead
while for large packets (1500 bytes), adaptation penalty is below 5%.
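The utilization figures above follow directly from the packet structure, as the following sketch shows (the 40 byte overhead and 34/32 expansion are taken from the text; the formula itself is a straightforward accounting, not the exact curve-generation code used for Figure 3. 18):

```python
# Theoretical utilization of the adaptation: a fixed 40-byte optical framing
# overhead per packet, optionally combined with the 34/32 payload expansion
# of the encoding scheme.

OVERHEAD_BYTES = 40  # 10 words x 4 bytes of optical framing

def utilization(ip_bytes: int, encoded: bool = True) -> float:
    """Fraction of line capacity carrying IP bytes, for one packet size."""
    payload = ip_bytes * 34 / 32 if encoded else ip_bytes
    return ip_bytes / (payload + OVERHEAD_BYTES)
```

For 40-byte packets the framing alone halves utilization (40/80), while at 1500 bytes the framing penalty falls below 5%; the encoding adds an average penalty of roughly 4% on top.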
Figure 3. 18: Edge adaptation theoretical throughput calculation - theoretical throughput ingress to egress (%) vs. packet size (bytes); curves: OptFraming, Encoded, EncodedPOS
An additional average penalty of ~4% is incurred by the encoding scheme used;
it is nearly offset when PoS traffic overheads are taken into account, yielding
a maximum utilization of ~95%. This is lower than the 97.5% or 99.5%
utilizations seen in
10GigE or PoS but may be viewed as the price paid for finer granularity of
forwarding control and lower stress on the electronic forwarding plane.
3.5.2 Measured throughput performance
Packet throughput through the edge adaptation was measured for
varying IP packet sizes and compared with theoretical throughput calculations
discussed in section 3.2.4. Throughput was defined as the maximum utilization
at which packet loss does not exceed 0.1% of transmitted packets. Figure 3. 19 shows the
throughput measurements done along with theoretical throughput calculations.
Figure 3. 19: Edge adaptation throughput measurement and theoretical comparison - throughput (%) vs. packet size (bytes); curves: Ingress-Egress, Theoretical, Ingress-Egress-Electronic Router, Electronic Router
To ensure sufficient statistics, 5 million packets were transmitted for each run before
packet loss evaluation for a given utilization was performed. A binary
incremental testing procedure was used to identify the throughput utilization
for each packet size. It is important to note that before any layer-3
measurements were performed, physical layer testing confirmed a low Bit Error
Rate (BER) both through the electronic router alone and through the edge
adaptation with the electronic router, so that possible throughput bottlenecks
could be compared and identified. For packet sizes
greater than 128 bytes, no significant difference was observed between theoretical
and measured throughput whereas a ~20% difference was observed for smaller packet
sizes. This reduction in maximum utilization was traced to the electronic router
forwarding process and not due to the adaptation. Throughput performance through
adaptation was therefore verified to follow the theoretical predictions closely.
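The binary incremental procedure amounts to a bisection over offered load. The sketch below captures the idea; `measure_loss` stands in for a 5-million-packet tester run and is a placeholder, not the SmartWindows control interface:

```python
# Sketch of the binary incremental testing procedure: bisect on offered
# utilization for the highest load at which packet loss stays within the
# 0.1% threshold. measure_loss(u) is a placeholder for a tester run.

LOSS_LIMIT = 0.001   # 0.1% of transmitted packets

def find_throughput(measure_loss, lo=0.0, hi=100.0, steps=20):
    """Return the maximum utilization (%) with loss <= LOSS_LIMIT."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if measure_loss(mid) <= LOSS_LIMIT:
            lo = mid          # run passed: push the load higher
        else:
            hi = mid          # run failed: back off
    return lo
```

Twenty bisection steps over a 0-100% range resolve the throughput to better than 0.001%, far finer than the measurement repeatability.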
3.5.3 Measured latency performance
Latency measurements were performed for different packet sizes through the
adaptation, through the electronic router and through the entire traffic path to identify
contributions to the end-to-end latency as shown in Figure 3. 20. It can be seen that
electronic router contributes a significant portion of the measured latency. A 20%
utilization was used to ensure no packet loss during measurement. Latency increase
with packet size is observed to be fairly linear due to the absence of any
complicated lookup mechanism at the ingress, and a maximum latency of ~41usec
is observed
through the entire traffic path.
Figure 3. 20: Edge adaptation latency measurement - latency (usec) vs. packet size (bytes); curves: Electronic Router, Ingress-Egress, Ingress-Egress-Electronic Router
3.6 Chapter Summary
In this chapter, framing considerations were examined and a packet framing structure
that accounts for various signal bandwidths and processing time limitations was
proposed and implemented. An optical edge node adaptation capable of supporting bit
rate transparent all-optical payload forwarding based only on framing information
was studied and demonstrated. Implementation sub-systems necessary to build AOLS
Ingress and Egress nodes were examined. The need for payload encoding, per packet
clock recovery, a decoupled payload framing, a self contained optical packet header
and idlers to maintain constant average power were identified and incorporated in the
proposed framing mechanism. We finally examined the end-to-end performance of
the implemented adaptation using layer-3 test metrics such as throughput and
latency, and found a close match with theoretical predictions for almost all
packet
sizes. For small packets, throughput performance dropped to ~40% while for larger IP
packets, throughput was measured to be ~95%, comparable to 10GigE and PoS
measurements. No significant latency penalty was observed for all packet sizes and
increase in latency with packet size was noted to be linear as expected due to the
simple nature of the header lookup. Electronic processing at line rate was
demonstrated to perform IP header look up, payload encoding, and idler insertion. We
present the first comprehensive implementation of OPS adaptation for variable length
IP packets to date and introduce layer-3 measurement metrics to evaluate the network
performance of such an adaptation mechanism.
References
[1] B. Sartorius, "All-optical clock recovery for 3R optical regeneration,"
presented at the Optical Fiber Communication Conference and Exhibit (OFC
2001), pp. MG7-1-MG7-3, vol. 1, 2001.
[2] P. Gambini, M. Renaud, C. Guillemot, F. Callegati, I. Andonovic, B. Bostica,
D. Chiaroni, G. Corazza, S. L. Danielsen, P. Gravey, P. B. Hansen, M. Henry,
C. Janz, A. Kloch, R. Krahenbuhl, C. Raffaelli, M. Schilling, A. Talneau, and
L. Zucchelli, "Transparent optical packet switching: network architecture and
demonstrators in the KEOPS project," IEEE Journal on Selected Areas in
Communications, vol. 16, pp. 1245-1259, 1998.
Chapter 4
All-Optical Label Swapping
Network Performance
A major performance benchmark in forwarding technologies is the number of packets
forwarded through a switch fabric at any given interval of time. All-optical
forwarding promises the advantage of increasing the packets forwarded per second by
relieving the electronic domain of burdensome processing and by reducing
forwarding times using high bit rate optical payloads. Moreover, the packet level
granularity in packet switching offers the possibility of a high degree of
resource sharing and, consequently, better performance under the bursty
Internet traffic seen today.
Electronic framing and packet adaptation necessary for enabling OPS was
investigated in the previous chapter. Forwarding variable sized IP packets using
existing optical technologies can now be investigated using traffic from and to the
optical edge nodes in place. In this chapter, we take a look at enabling optical
technologies such as wavelength conversion and the associated control electronics
necessary to implement asynchronous packet forwarding at the optical core. While
physical layer measurements may be made to characterize the various subsystems
within the optical core, layer-3 metrics are critical in evaluating the true performance
of both the optical forwarding mechanism within the core and the effectiveness of the
adaptation layer implemented at the network edge. Finally, issues that limit the
network performance are identified and suggestions to improve overall network
throughput are considered.
4.1 All-Optical forwarding using Wavelength Conversion
The switch fabric is the central point of any router and sets a cap on the best
performance achievable. In today’s electronic routers, switching may either be
performed by transporting the entire packet between input and output queues
(implemented using RAM buffers) (e.g. Cisco GSR) or by simply switching tags
containing the address of the location of packets (Juniper M40). In both cases
however, entire packets must be moved between buffers to perform forwarding, an
operation that consumes considerable energy. All-Optical packet forwarding presents
the unique concept of forwarding payloads without expending much energy to store,
examine or switch between input and output ports by decoupling forwarding
information from the entire payload. One way of implementing this is by space
switching optical payloads based on their wavelengths. An Arrayed WaveGuide
Router (AWGR) is a passive optical component that achieves space switching by
exploiting the additional wavelength dimension in optics, as shown in Figure 4. 1.
An AWGR maps input port, wavelength pairs to specific output port, wavelength
pairs in a cyclic fashion and may be viewed as a non-reconfigurable, non-blocking
switch fabric. While dynamic switching of optical packets may not be possible using
an AWGR alone, a fast tunable wavelength converter at the input ports of the
AWGR can be used to implement dynamic space switching.

Figure 4. 1: 4 x 4 AWGR wavelength mapping - passive, non-blocking, static optical switch; λji – input port i, wavelength j

Packets entering an input port
may be steered to any physical output port by simply wavelength converting them to
the appropriate wavelength before propagation through the AWGR. Packets can be
converted to outgoing wavelengths at the output ports prior to exiting the node. Fast
tunable wavelength converters (TWCs) in combination with AWGRs may therefore be
viewed as the optical equivalent of an electronic switch fabric. However, unlike
electronic switch fabrics implemented using RAM buffers, efficient packet queuing is
not possible in the optical switch fabric due to the absence of optical random access
buffers. Fixed fiber delay lines may, however, be used as buffering technologies along
with the optical switch fabric to reduce packet collisions during switching and
improve performance.
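The TWC-plus-AWGR switch fabric can be captured in two lines. A common model of the cyclic mapping is used here (output = input + wavelength, modulo port count); the exact cyclic order of a particular device may differ:

```python
# Cyclic AWGR port/wavelength mapping (cf. Figure 4.1), ports and
# wavelengths numbered from 0. A common AWGR model is assumed; the exact
# cycling direction can vary between devices.

def awgr_output(in_port: int, wavelength: int, n: int = 4) -> int:
    """Output port reached by a given (input port, wavelength) pair."""
    return (in_port + wavelength) % n

def twc_wavelength(in_port: int, out_port: int, n: int = 4) -> int:
    """Wavelength the tunable converter must select for this route."""
    return (out_port - in_port) % n
```

Because the map is a bijection for each input port, the TWC can always find exactly one wavelength that steers a packet to the desired output, which is what makes the passive AWGR usable as a non-blocking fabric.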
4.1.1 Statistical Demultiplexing
IP packets arriving at the network ingress of an AOLS network undergo edge node
adaptation and are encapsulated in an optical framing mechanism before being
injected into the optical network. At each node within the optical network, packets in
the stream are individually forwarded to output ports based solely upon forwarding
information contained in the optical packet header. When two or more packets
arriving at the core node switch request to be forwarded to the same output port that
cannot be shared, packet contention occurs.
Figure 4. 2: Statistical demultiplexing (a) 1 input (b) 4 inputs - forwarding without contention resolution
We consider the simplest case of a single input packet stream at the core that
is distributed to multiple output ports based on packet forwarding
requirements. A core node with no contention resolution mechanism may then be
viewed as independent input packet streams being statistically demultiplexed
to multiple output ports, as in Figure 4. 2.
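A toy slot-based model illustrates what contention costs when no resolution mechanism exists; all parameters (port counts, load, slot count) are invented for illustration and do not model the experimental traffic:

```python
# Toy model of statistical demultiplexing without contention resolution:
# in each slot, inputs that request the same output collide and all but
# one packet is dropped.

import random

def contention_rate(n_inputs=4, n_outputs=4, load=0.5, slots=10_000, seed=1):
    """Fraction of offered packets lost to output-port contention."""
    rng = random.Random(seed)
    sent = dropped = 0
    for _ in range(slots):
        requests = [rng.randrange(n_outputs)
                    for _ in range(n_inputs) if rng.random() < load]
        sent += len(requests)
        for out in set(requests):
            dropped += requests.count(out) - 1   # one winner per output
    return dropped / sent if sent else 0.0
```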
4.1.2 Core node issues in optics and electronics
Forwarding rate at the optical core is influenced by several electronic and optical
factors. While a TWC and AWGR provide a switch fabric for the optical core node,
switching time of the wavelength converter laser is critical in determining the
minimum guard band required between incoming packets and consequently the
forwarding rate. Rise/fall times and the granularity of timing control of the electronic
control signals to the switching section of the TWC also contribute to the minimum
guard band requirement. Control signals used within the core transition at the
envelope rate instead of the packet bit rate. Varying stream utilization at the core
node input would therefore directly translate to changes in the frequency content of
the electrical control signals. The slow (packet rate) signal transition rates coupled
with the fast rise/fall time requirements (high frequency content) present a new
challenge in designing control electronics for OPS. AC coupling effects in
commercially available electronic amplifiers and drivers may therefore severely limit
the forwarding performance and the ability of the core node to handle varying input
traffic conditions. As with the edge, maintenance of a constant average power both
before and after forwarding presents an additional challenge at the core.
4.2 Optical Core Node
The AOLS core node presents a label swapped OPS architecture wherein, in addition to
optical payloads being forwarded all optically based on label information, optical
packet labels are erased and rewritten at each hop. The general principle of operation
of the optical core node is shown in Figure 4. 3.
Figure 4. 3: Optical Core Node - Principle of operation
The incoming packet stream is tapped and photodetected for frame recovery. Framing
information is then deserialized and processed electronically to determine packet
position within the stream and the forwarding information from the optical header. In
the optical domain, all framing is erased from the packet stream and individual
payloads are wavelength converted to the selected wavelengths before being space
switched using an AWGR. At the output, framing is rewritten on the statistically
demultiplexed packet streams along with a new optical label used at the next hop.
Framed statistically demultiplexed packets exit the core node at the desired physical
output ports. 2.5Gbps limiting amplified photodetectors from Infineon were used to
detect packet framing and feed a Vitesse VSC 8132 deserializer chip before
processing at the core node FPGA. The core electronic controller generates the
following control signals:
• Core Optical Framing Erase signal – Used at the first stage wavelength
conversion and blanking system to optically erase all optical framing from the
incoming optical packet stream.
• λcore Select and Control signal – Controls a fast tunable laser control board
that tunes the second stage wavelength conversion system to the selected
wavelength on a per-packet basis.
• Optical Frame Rewrite signal – Appropriately timed and used to rewrite new
Optical Label and Framing information on the outgoing optical payload
stream exiting the second stage wavelength conversion system.
• Locally generated Idler Control signal – Controls the generation of locally generated idlers that are used to fill voids in the optical packet data stream at the outputs of the AWGR.
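As a rough illustration of how these four signals relate to the recovered packet delimiters, the sketch below derives per-packet control windows from Start/End Of Optical Packet times. The names, guard value, and timing model are assumptions for illustration, not the dissertation's FPGA logic.

```python
# Hypothetical sketch (names, guard value, and timing model are assumptions,
# not the dissertation's FPGA logic): deriving the four per-packet control
# windows from recovered Start/End Of Optical Packet (SOOP/EOOP) times,
# expressed in processing-clock cycles.

def core_control_signals(soop, eoop, guard=2):
    """Return (start, end) windows for each control signal of one packet."""
    return {
        # Erase framing: enable the first-stage SOA only during the payload
        "frame_erase":   (soop, eoop),
        # Tune the second-stage converter before the payload arrives
        "lambda_select": (soop - guard, eoop + guard),
        # Rewrite new label and framing around the converted payload
        "frame_rewrite": (soop - guard, eoop + guard),
        # Fill the gap after this packet with a locally generated idler
        # (open-ended until the next packet's SOOP)
        "idler_enable":  (eoop + guard, None),
    }

sig = core_control_signals(soop=100, eoop=260)
print(sig["frame_erase"], sig["lambda_select"])
```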
Figure 4. 4 shows the functional representation of the FPGA process steps. As with the egress node processing, a phase locker stage is used to align the frame start word to the process word boundary. Recovered framing delimiters are used to determine
the timing information for frame erase and TWC wavelength conversion control
signals generated by the core electronics. The recovered optical header is used to
decide the exit port of each packet by generating an appropriate wavelength select
signal. A new label is also generated for the next hop, based only on the recovered
optical header.
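The phase-locker idea — finding the bit offset of the frame start word within the deserialized word stream and barrel-shifting subsequent words onto the word boundary — can be sketched as follows. The 32-bit width matches the deserializer described above, but the frame start word value and function names are hypothetical.

```python
# Illustrative sketch of the phase-locker idea (the frame start word value
# and function names are assumptions): the deserializer emits 32-bit words
# with an arbitrary bit offset, and a barrel shift realigns the stream so
# the frame start word lands on a word boundary.

FRAME_START = 0xF6F6F6F6  # hypothetical 32-bit frame start word

def find_offset(w0, w1):
    """Return the bit offset (0-31) at which FRAME_START begins within the
    64-bit window formed by two consecutive 32-bit words, or None."""
    window = (w0 << 32) | w1
    for off in range(32):
        if (window >> (32 - off)) & 0xFFFFFFFF == FRAME_START:
            return off
    return None

def realign(words, off):
    """Barrel-shift a stream of 32-bit words left by `off` bits."""
    out = []
    for w0, w1 in zip(words, words[1:]):
        out.append((((w0 << 32) | w1) >> (32 - off)) & 0xFFFFFFFF)
    return out
```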
[Figure: the 32-bit raw signal from the DMUX and a 77 MHz clock (1) feed a phase locker; the phase-locked 32-bit stream enters the core FPGA process stage, whose inputs carry idlers, EOOP/SOOP delimiters, the OL, and the optical payload (not examined in the core); the FPGA outputs the blanking signal (1), the λcore control (8), the 32-bit rewrite signal (newly generated idlers, EOOP/SOOP, new core OL, and FFFFFFFF for optical payload pass-through), and the control for locally generated idlers (2), plus control (7).]
Figure 4. 4: Core FPGA Process - 4 control signals. OL – Optical Label; S/EOOP – Start/End Of Optical Packet; DMUX – Demultiplexer.
4.2.1 Two-stage wavelength conversion principle
Fast tunable wavelength conversion may be performed in a single stage within the
core node. However, two-stage wavelength conversion offers the following distinct
advantages over single stage conversion:
• A two stage approach can utilize a robust first stage that is polarization
insensitive and is relatively insensitive to the average power level of the
incoming packets. Payloads entering the optical core node may originate from
different ingress nodes and therefore may possess different optical
characteristics.
• Signal regeneration is possible as the second stage of the wavelength
converter may be optimized for best performance for signals from the first
stage and zero power penalty through the wavelength converter may be
achieved [1].
• Cascading two stages makes any-to-any wavelength conversion including
same-to-same wavelength conversion possible.
We use a Semiconductor Optical Amplifier (SOA) based first stage and a Mach
Zehnder Interferometer (MZI) based wavelength converter as the second stage. The
first stage WC uses the Cross Gain Modulation (XGM) process and is therefore fairly
insensitive to input wavelength. While the first stage statically converts incoming
packets from any wavelength to an internal wavelength, the second stage uses Cross
Phase Modulation (XPM) to achieve dynamic wavelength conversion from the
internal wavelength to a selected output wavelength on a per packet basis as shown in
Figure 4. 5.
[Figure: SOA-WC1 converts λinput to λinternal, a BPF at λinternal follows, and MZI-WC2 converts λinternal to λoutput.]
Figure 4. 5: 2-Stage wavelength conversion principle. BPF - Band Pass Filter; SOA - Semiconductor Optical Amplifier; MZI - Mach Zehnder Interferometer
4.2.2 First stage conversion and label erase
Another advantage of using an XGM based SOA WC as the first stage is the possibility of performing label erasure in addition to the 1st stage WC. Figure 4. 6 shows the operation of the SOA based first stage WC.
[Figure: a packet at λinput and CW at λinternal enter SOA-WC1; a BPF at λinternal passes only the frame-erased payload at λinternal.]
Figure 4. 6: SOA based 1st stage Wavelength Conversion and frame erase
Incoming packets enter the SOA along with the internal CW light and impose a
modulation on the CW through the gain medium. A band pass filter at the internal wavelength may be used to reject the incoming packet and pass only the wavelength converted payload at the internal wavelength. Label erasure may be performed by
turning on the SOA gain during the packet payload and turning it off during packet
framing. The XGM process is an inverting process and therefore requires that the
second stage be operated in the inverting regime to obtain the same payload polarity
at the node output. A fast switching current controller board is required to
enable/disable optical frame erase at the first stage wavelength conversion system.
The general schematic of the current controller board is shown in Figure 4. 7.
[Figure: op-amp current driver schematic — input vin through a resistor network (R1–R6) around two op amps driving the SOA as the load.]
Figure 4. 7: Op Amp based Fast switching current driver circuitry for SOA.
This board must be capable of handling all bandwidth utilizations and packet lengths and therefore needs to be dc coupled at both the input and output while still supplying sufficient constant current to switch the SOA. AD8009 high speed (1 GHz) current feedback op amps were used to supply switching current to the SOA stage. The
circuit was built to ensure a constant current supply through the SOA as its resistance changes during the turn-on cycle. A Kamelian SOA, whose gain varies with input power, was used for the first stage wavelength conversion and frame erase. The SOA, along with the fast switching current controller board, was tested for its response to the applied input electrical voltage as shown in Figure 4. 8.
Figure 4. 8: Output power of the SOA as a function of the input voltage applied
The SOA was also tested for sufficient Optical Signal to Noise Ratio (OSNR)
performance to provide the optimum operating point for the first stage as shown in
Figure 4. 9.
Figure 4. 9: Optical Signal to Noise Ratio (OSNR) performance of first stage wavelength converter system
4.2.3 Second stage conversion and packet steering
An integrated SOA based MZI WC along with a fast tunable SGDBR laser is used as
the second stage for WC as described in [2]. The SGDBR laser is tuned to the
selected wavelength by the core node electronics through a fast switching board similar to that used at the network ingress. Switching times for various input and output wavelength combinations were measured and found to be under 6 ns for all combinations. Packets entering the 2nd stage at the internal wavelength are
converted to the output wavelength selected on a per packet basis. A Fiber Bragg
Grating (FBG) based notch filter and circulator are used to reject the internal
wavelength and transmit packet payloads only at the output wavelength as shown in
Figure 4. 10.
[Figure: the payload at λinternal enters MZI-WC2 and exits as a packet at λoutput; a circulator with an FBG at λinternal drops the internal wavelength and passes the output wavelength through.]
Figure 4. 10: Mach Zehnder Interferometer based 2nd stage Wavelength Conversion. FBG – Fiber Bragg Grating
The 2nd stage WC is operated in the inverting mode to preserve the polarity of the
payload data bits and also serves to present a blank CW signal for label rewrite due to
the nature of the inverting operation.
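A toy numerical illustration (not a physical model) of why the two inverting stages restore the original payload polarity:

```python
# Toy illustration (not a physical model): XGM in the first stage imprints
# the inverted data on the internal wavelength, and operating the MZI
# second stage in its inverting regime inverts it back, restoring the
# original payload polarity at the node output.

def xgm_invert(bits):
    """First stage: cross-gain modulation inverts the data."""
    return [1 - b for b in bits]

def mzi_invert(bits):
    """Second stage operated in the inverting regime."""
    return [1 - b for b in bits]

payload = [1, 0, 1, 1, 0, 0, 1]
after_wc1 = xgm_invert(payload)      # inverted, at the internal wavelength
after_wc2 = mzi_invert(after_wc1)    # polarity restored at the output wavelength
assert after_wc2 == payload
```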
4.2.4 Framing and idlers revisited
After payloads undergo two-stage WC, new framing information is rewritten onto the
packet stream before space switching is performed. A rewrite signal controlled by the
core node electronics is modulated onto the converted packet stream using an electro-
optic modulator (LiNbO3 modulator). It is critical to note that before rewrite is
completed, the optical payload stream does not present a constant average optical
power and therefore would result in transient distortions if amplifiers (such as
EDFAs) were used. Specially designed modulator drivers with low ac cutoff
frequency were used at the rewrite stage to ensure proper operation even without
adequate dc balance of the rewrite signal (constant high to pass the optical payloads).
Although locally generated idlers are used at the rewrite stage to fill interpacket gaps,
space switching or statistical DMUX of the packet stream results in additional void
creation at the output. External idlers controlled by the core node electronics are used
after space switching to ensure constant average output power at both the output ports
as shown in Figure 4. 11. An AWGR was used to perform the actual space switching in complement to the fast tunable MZI WC.
[Figure: packets at λ1 and λ2 with internal idlers enter the AWGR; at outputs 1 and 2, the voids left by statistical demultiplexing are filled by external idlers.]
Figure 4. 11: External Idler Insertion. AWGR - Arrayed WaveGuide Router
4.3 Experimental Performance Evaluation
In order to ensure proper operation of the core node optics, several physical layer
tests were performed on optical components. A schematic of the complete
implementation of the core node is shown in Figure 4. 12. The incoming packet
stream is kept in the optical forwarding plane and is passively delayed to match the
processing time of the core controller and subsequently fed to a two-stage wavelength
converter. The first WC stage removes the optical framing and the second stage
performs wavelength switching.

[Figure: data enters the core at λin = 1548.94 nm and is tapped to a receiver feeding the core controller (clock and data), which generates the OL/frame erase, λcore output control, new OL/frame rewrite, and idler control signals; the optical path is delayed, then passes the 1st stage WC (SOA with CW at λinternal = 1551.01 nm), the 2nd stage WC (AOWC), an EOM for the new OL/frame rewrite, an EDFA and an FBG, and is space switched by the AWGR to Ch1 (λ1core = 1555.86 nm) and Ch2 (λ2core = 1560.71 nm), where idler word generators fill voids on each channel.]
Figure 4. 12: Core node experimental setup

An electro-optic modulator at the output of the two-stage wavelength converter imprints new framing onto CW light surrounding the
converted optical packet payload, which is amplified and forwarded to separate space
ports using an Arrayed Wave Guide Router (AWGR). Optical packet headers and
packet frame delimiters are identified in the electronic core control plane by detecting
and processing the incoming optical packet stream. The control signals for framing
removal, wavelength switching, framing rewrite, and idler insertion are all derived from the core FPGA controller board, which identifies individual optical packets and extracts the forwarding information that drives the all-optical processing plane.
Scope traces of the packet processing within the core node are given in Figure 4. 13
(a) where incoming packets are detected and internal CW is aligned with the packet
payload at the input of the 1st stage WC.
[Scope traces: payload container bounded by SOOP/EOOP delimiters with idlers on either side.]
Figure 4. 13: 1st stage Wavelength Converter Scope traces (a) CW + packet (b) Frame erased wavelength converted payload
Figure 4. 13 (b) shows the wavelength converted payloads at the internal wavelength where framing has been erased, while Figure 4. 14 (a) shows the payload after 2nd stage conversion. The CW-control based label erasure results in an extremely good label extinction ratio, as can be seen in the scope trace. This ensures that the previous label does not corrupt the newly rewritten label and cause errored headers. Figure 4. 14 (b) shows the optical packet after successful rewrite.
[Scope trace: IP packet bounded by SOIP/EOIP delimiters.]
Figure 4. 14: Scope traces after 2nd stage Wavelength Conversion (a) Before rewrite (b) After frame rewrite
4.3.1 Measurement test system and setup
The test configuration for layer-3 performance measurements over the entire optical
network implementation is shown in Figure 4. 15. Two separate IP streams with
different IP destination addresses were generated at the POS transmitter and adapted
at the ingress before being injected into the core. At the core, packet forwarding is
done based solely on the optical label generated at the ingress based on the IP
destination address of each packet.
[Figure: measurement setup — Smartbits SONET tester (Tx/Rx), a high speed electronic router, POS encapsulation/decapsulation blocks, the ingress optical router, the core all-optical router, and egress optical routers #1 and #2, interconnected by OC-48 links; two IP flows traverse the network, with path segments labeled (1) through (6).]
Figure 4. 15: Traffic flow and Layer-3 measurement setup
4.3.2 Measured throughput performance
Figure 4. 16 shows the measured layer-3 packet throughput variation versus packet
size. The POS throughput of the electronic router (ER), a GSR 12000, was measured
through path 1-5-6 as shown in Figure 4. 15.
[Plot: Throughput (%) vs. packet size (bytes), 0–1600; curves: ER 1-5-6, Theoretical, IE B2B 1-2-4-6, CDR 1-2-3-4-5-6, CDNR 1-2-3-4-5-6.]
Figure 4. 16: End-to-End IP throughput measurement and theoretical comparison (refer Figure 4.15)
A theoretical computation of the achievable throughput (as a % of available POS payload bandwidth) accounts for the optical packet header and framing overheads. The Ingress-Egress back-to-back (IE B2B) throughput measurement shows that the edge node adaptation closely matches the theoretical throughput for different packet sizes.
Throughput for core dynamic forwarding with and without rewrite (CDR/CDNR) shows that no significant throughput penalty is incurred due to either wavelength switching and conversion or label swapping. The 4% penalty measured without label swapping is attributed to imperfect optimization of the two-stage wavelength converter for sufficient extinction ratio at the output.
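The theoretical curve in Figure 4. 16 has the simple form payload / (payload + overhead); a sketch, assuming a placeholder per-packet overhead value (the actual header and framing byte counts follow from the edge adaptation format):

```python
# Sketch of the theoretical throughput computation. OVERHEAD_BYTES is an
# assumed placeholder; the dissertation derives the actual figure from the
# optical header and framing format of the edge adaptation.

OVERHEAD_BYTES = 24  # assumed total per-packet framing + optical header

def throughput_pct(packet_size, overhead=OVERHEAD_BYTES):
    """Percentage of POS payload bandwidth carried as IP data."""
    return 100.0 * packet_size / (packet_size + overhead)

# Small packets pay proportionally more overhead, matching the shape of
# the measured curve.
for size in (40, 560, 1500):
    print(size, round(throughput_pct(size), 1))
```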
4.3.3 Measured latency performance
Latency measurements were performed through the core to investigate the excess
time delay penalty due to core node switching as shown in Figure 4. 17.
[Plot: Latency (µs) vs. packet size (bytes); curves: IE-ER 1-2-4-5-6, CDNR 1-2-3-4-5-6, CDR 1-2-3-4-5-6, IE B2B 1-2-4-6, ER 1-5-6.]
Figure 4. 17: End-to-End latency vs. packet size (refer Figure 4.15)
Variation in packet latency with packet size is linear as expected due to the nature of
the simple lookup at the core node. The bulk of the latency seen in the system results
from the ER and the ingress-egress adaptation. A comparison between the latency measured for the fully switching network system and the Ingress-Egress plus Electronic Router (IE-ER) path shows that only a minimal latency increase (at most 0.79 µs) is introduced by the core node. This indicates that optical packets may
undergo forwarding at multiple core nodes without significantly impacting the end-to-
end latency through the optical network.
4.4 Chapter Summary
All-optical packet forwarding based on optical header information was investigated in this chapter. The basic sub-systems that go into building a core node were examined, and their limitations and influence on network performance were measured experimentally and compared to theoretical limits. We experimentally demonstrate a complete dynamic layer-3 (IP) forwarding AOLS network and present the first true end-to-end layer-3 performance measurements. The agreement between the measured and theoretically predicted throughput is excellent, showing a penalty only for small packet sizes, which is ascribed to the electronic processing in the SONET layer outside the AOLS network. The added latency is constant due to the fixed optical path through the core node and was measured to be 0.79 µs, which is negligible compared to electronic switching and adaptation. The throughput penalty may be further reduced by tightening control signaling through the core and improving optical rise/fall response times. We successfully demonstrate an edge node adaptation that facilitates all-optical forwarding at the core nodes.
References
[1] R. Doshi, M. L. Masanovic, and D. J. Blumenthal, "Demonstration of regenerative any-λin to any-λout wavelength conversion using a 2-stage all-optical wavelength converter consisting of a XGM SOA-WC and InP monolithically-integrated widely-tunable MZI SOA-WC," presented at the 16th Annual Meeting of the IEEE Lasers and Electro-Optics Society (LEOS 2003), vol. 2, pp. 477-478, 2003.
[2] V. Lal, M. Masanovic, D. Wolfson, G. Fish, C. Coldren, and D. J. Blumenthal, "Monolithic widely tunable optical packet forwarding chip in InP for all-optical label switching with 40 Gbps payloads and 10 Gbps labels," presented at the 31st European Conference on Optical Communication (ECOC 2005), vol. 6, pp. 25-26, 2005.
Chapter 5
Core Traffic Adaptation for
Statistical Multiplexing
In the previous two chapters, we investigated the edge node adaptation that would be necessary to support all-optical packet forwarding in the network core. A new framing mechanism was devised, and statistical demultiplexing at the network core leveraging this framing mechanism was proposed and successfully demonstrated.
In order to replace today’s electronic routers however, an efficient multiplexing
scheme is necessary at the high bit and packet rates that optics is capable of
supporting. Statistical multiplexing or aggregation of packets within the network core
relies on the fact that not all inputs require the use of the output resource at all times.
Buffering aids in statistical multiplexing by absorbing transients in shared capacity
demands. The absence of optical Random Access Memories (ORAM) severely limits
the achievable forwarding throughput and offsets many of the advantages of OPS. In this chapter, we investigate the primary issue with implementing statistical multiplexing, namely contention handling, and the techniques that improve the performance of a core optical switch. Optical adaptation may be used to assist in
implementing contention handling and we look at strategies to deploy such adaptation
within the optical network.
5.1 Core node with contention resolution
The core forwarding model proposed and implemented thus far looks at packet
forwarding in the optical domain from the perspective of the optical packet stream
itself. It does not take into account the possibility of a second data stream requiring
shared access to the same output ports at the same time. When such multiple data
streams look to share the same set of output ports over time, a technique to multiplex
packets from the different inputs by handling contention becomes necessary as shown
in Figure 5. 1.
[Figure: multiple inputs, each with a contention resolution (CR) block, feeding multiple shared outputs.]
Figure 5. 1: Contention resolution during statistical MUX. CR- Contention Resolution
5.1.1 Statistical Multiplexing and contention
Statistical multiplexing is the aggregation of multiple input channels based on the
probability of need to access a shared output channel and can result in an efficient use
of a shared resource. Multiplexing packets from multiple inputs to the same output is
possible as long as the aggregate capacity demand placed by all the inputs does not
exceed the capacity of the shared output port for an extended period of time.
Contention occurs when two or more input ports simultaneously request the use of the
same output port that cannot be shared. At low input utilizations, contention rarely
occurs, and packet loss rates are low and usually below tolerance limits. However, as the aggregate input utilization, normalized to the output capacity, tends towards 1, packet loss rates increase and contention resolution schemes become necessary.
Successful resolution requires 1) prior knowledge of contention with time to react and
resolve it, 2) an intelligence capable of making decisions on how to resolve
contention and 3) a resolution mechanism to implement contention resolution. Arrival
information through detection of packets by the core electronics may be used both to
detect contention and to decide on the path to resolving contention. In [1] we
demonstrate an all-optical contention resolution scheme with a first come first serve
decision rule for variable length 40 Gbps payloads.
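The rising need for contention resolution as utilization approaches 1 can be illustrated with a minimal slotted simulation (all parameters assumed): N inputs each offer a packet per slot with probability p to one shared, unbuffered output.

```python
# Minimal slotted simulation (all parameters are assumptions, not the
# dissertation's traffic model): N inputs each fire a fixed-size packet
# per slot with probability p toward one shared output; without buffering,
# each slot can serve only one packet, so extra arrivals are dropped.

import random

def loss_rate(n_inputs, p, slots=100_000, seed=1):
    random.seed(seed)
    arrived = dropped = 0
    for _ in range(slots):
        k = sum(random.random() < p for _ in range(n_inputs))
        arrived += k
        if k > 1:            # contention: only one packet forwarded
            dropped += k - 1
    return dropped / arrived if arrived else 0.0

# Loss grows sharply as normalized utilization (n * p) approaches 1.
for p in (0.05, 0.2, 0.45):
    print(p, round(loss_rate(2, p), 3))
```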
5.1.2 Contention handling schemes
Traditionally, contention may be resolved in either the time domain by use of a buffer
to delay the transmission of the contending packet or in the space domain by steering
the contending packet to a different output port. While optics suffers from the
drawback of the lack of availability of ORAMs, access to an additional dimension
namely the wavelength can be exploited for contention resolution. Fast tunable
wavelength conversion may be used to provide access to a shared set of wavelengths
within a delay loop to provide suitable resolution at packet rates.
5.2 Optical Buffering techniques
Optical buffering provides a traditional approach to contention resolution for building highly efficient optical switches. However, while several alternative buffering techniques such as slow light in semiconductor nanostructures [2] and arrayed-rod photonic crystals [3] are being explored, fiber-based delay structures are currently the only available technology for implementing optical buffering. Such delay lines offer a fixed and finite amount of delay and may be used either in a single stage or in multiple stages. Delay-based buffers may also be classified as feedback or feed-forward buffers based on the manner in which they are connected [4].
5.2.1 Feed-forward buffering schemes
Figure 5. 2 shows a typical structure of a feed-forward fiber delay line based buffer.
Fiber delay lines of logarithmically increasing delay are usually used together, and the total delay is selected by either traversing or bypassing each delay element in the series.
[Figure: packet in → cascade of 2x2 switches with fiber delay lines → delayed packet out.]
Figure 5. 2: Feed-forward delay structure
In feed-forward delay structures, access to the shared output is guaranteed to the packet once a delay is set for it. Consequently, a packet priority mechanism is difficult to implement. Nonetheless, several feed-forward architectures have been proposed, including the Cascaded Optical Delay-lines (COD), Switched Fiber Delay-lines (SDL), and the Logarithmic Delay-Line Switch.
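Delay selection in such a logarithmic structure reduces to reading the binary representation of the required delay: stage i, with a delay of 2^i slots, is traversed when bit i of the delay is set. A small sketch (the function name and slot granularity are assumptions):

```python
# Sketch of delay selection in a feed-forward buffer with power-of-two
# fiber delay lines: the binary representation of the required delay
# decides, per 2x2 switch stage, whether the packet traverses that
# stage's delay line or bypasses it. Names are illustrative.

def stage_settings(delay_slots, n_stages):
    """Return per-stage traverse/bypass decisions (stage i delays 2**i slots)."""
    if delay_slots >= 2 ** n_stages:
        raise ValueError("delay exceeds buffer depth")
    return [(delay_slots >> i) & 1 == 1 for i in range(n_stages)]

# A delay of 5 slots with 3 stages (1, 2, 4 slots): traverse stages 0 and 2.
print(stage_settings(5, 3))   # [True, False, True]
```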
5.2.2 Feedback buffering
Feedback buffering uses a fiber delay loop between one input–output pair of the
optical switch as shown in Figure 5. 3. Contending packets may be stored in the fiber
delay loop for a fixed period of time before contending for access to the desired
output port at a later point in time.
[Figure: switch element with fiber delay lines looped from one output back to one input; packets in → delayed packet out.]
Figure 5. 3: Feedback delay buffering
Feedback buffering, however, requires that the feedback loop be long enough to store the largest packet completely in order to avoid data corruption. By using fixed time slots for packets, feedback buffering can efficiently resolve contention.
5.3 Compression/Decompression for contention resolution
Compression takes advantage of the transparency of optics and may be used to speed up packet payloads so that they occupy a smaller time slot. The time slot selected must be the size of the largest variable sized packet at the highest compressed bit rate. Compressing all packets at the highest bit rate requires padding with non-zero data to ensure that the entire time slot is filled for dc balance.
This however can be cumbersome and may require tedious synchronization between
the compressed packet and the padding. By using variable compression ratios to
compress packets of different sizes, each packet may be adjusted to occupy the entire
fixed time slot and therefore eliminate the need for padding. An optical adaptation
mechanism that would provide such compression/decompression between variable
byte sized packets and a fixed time slot in addition to providing bit rate speedup
would therefore be useful in simplifying implementation of contention resolution
schemes.
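The variable-ratio idea can be sketched numerically (the base rate, compressed rate, and slot definition below are assumed values, not the dissertation's parameters): each packet's ratio is whatever makes its duration equal the fixed slot.

```python
# Sketch (all rates assumed) of the variable-compression-ratio idea: each
# packet is compressed by exactly the ratio that makes it fill the fixed
# time slot, eliminating the need for padding.

BASE_RATE = 2.5e9           # assumed base bit rate, bits/s
SLOT = 1500 * 8 / 40e9      # assumed slot: largest packet at 40 Gbps

def compression_ratio(packet_bytes, slot=SLOT, base_rate=BASE_RATE):
    """Ratio by which the packet's bit rate must be multiplied so that its
    duration exactly equals the fixed slot."""
    base_duration = packet_bytes * 8 / base_rate
    return base_duration / slot

for size in (40, 560, 1500):
    print(size, round(compression_ratio(size), 2))
```

Ratios below 1 for very small packets indicate they would be stretched rather than compressed to fill the slot.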
5.3.1 Network-wide deployment
When compression/decompression is deployed at the network edges, incoming traffic is aggregated into fixed-size, higher bit rate compressed packets before being
injected into the network. At the egress, optical payloads are decompressed back to
the base rate of the original payload. Packets in the network are therefore of fixed
temporal duration throughout the network as shown in Figure 5. 4.
[Figure: ingress nodes compress (COM) incoming 1x traffic and egress nodes decompress (DCOM); outside the network, packets have a fixed base rate and variable temporal sizes; inside, they have variable compressed rates (2x, 4x, 8x, 16x) and fixed temporal sizes.]
Figure 5. 4: Compression/Decompression deployed at the network edges
However, the link bandwidth utilization for packets smaller than the largest packet goes down, as smaller packets are padded to occupy the time slot of the largest packet at the compressed rate, as shown in Figure 5. 5. For an IMIX traffic distribution of 1500, 560 and 40 byte packets in a 7:4:1 ratio, the link utilization is calculated to be only 26.5% when using a discretely variable compression ratio scheme. Network-wide deployment of compression/decompression has
the advantage of reducing the need for compression/decompression at every core
node. It is feasible when a large number of core nodes are present and the link
bandwidth has been over-provisioned throughout the network. Compression ratio and
original packet length information must however be embedded within the optical
header in order to successfully decompress at the egress nodes.
[Plot: available link utilization (%) vs. packet size, 0–1500 bytes.]
Figure 5. 5: Available link utilization (%) for network-scale compression/decompression vs. packet size (bytes)
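One possible accounting of the padding loss under a discrete ratio set is sketched below. The slot definition and ratio-selection rule here are assumptions, so the resulting figure need not match the 26.5% quoted above; the point is only that padding smaller packets to a fixed slot wastes capacity.

```python
# Illustrative calculation (the ratio set, slot definition, and selection
# rule are assumptions): how padding to a fixed slot reduces utilization
# for an IMIX mix under discretely variable compression ratios.

RATIOS = (1, 2, 4, 8, 16)          # assumed available compression ratios
SLOT_BYTES_AT_16X = 1500           # slot holds the largest packet at 16x

def slot_occupancy(packet_bytes):
    """Fraction of the fixed slot filled by the packet after choosing the
    smallest discrete ratio that still fits it in the slot."""
    slot_time = SLOT_BYTES_AT_16X / 16     # in base-rate byte-times
    for r in RATIOS:
        if packet_bytes / r <= slot_time:
            return (packet_bytes / r) / slot_time
    return 1.0

imix = {1500: 7, 560: 4, 40: 1}
util = sum(slot_occupancy(s) * n for s, n in imix.items()) / sum(imix.values())
print(round(100 * util, 1))
```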
5.3.2 Intra-node deployment
Alternatively, compression/decompression may be applied at the input and output of every core node as shown in Figure 5. 6. This deployment scheme provides the core
node switching benefits of compressing packets to fixed temporal sizes and higher bit
rates while maintaining 100% link utilization. This approach also has the advantage
that high speed packets do not have to be transported on network links. Variable
length packets entering each core node are compressed to occupy a fixed time slot
large enough to hold the largest length packet at the highest compressed bit rate
before entering the switching element. Once a packet has been successfully switched,
it is decompressed back to base rate before reframing and exiting the core node. Since
a single controller can be used to control the compression, switching, and decompression of each packet, knowledge of the packet resides within the controller for as long as the packet remains in the core node. Control signals may therefore be synchronized for each packet within the node.
[Figure: core node with packet compression at the inputs and packet decompression at the outputs; between them sit synchronization and traffic shaping, the switch element with packet contention resolution (feedback and feed-forward buffering), and output traffic merge; packets are fixed base rate and variable size outside the node, and variable compressed rate and fixed size inside.]
Figure 5. 6: Intra-node deployment of Compression/decompression at the core
Deployment of core-centric compression/decompression places stricter requirements on the compression/decompression technique. These increased constraints arise because packets, when decompressed to the base rate, occupy a larger temporal size than when compressed. Compressed packets are therefore required to have adequate spacing between them before entering the decompressor to avoid overlap at the base rate (a condition called time-out that is discussed in more detail in Chapter
7). The core control plane may buffer the compressed packets after switching to
ensure that they are spaced out in time to avoid the packet overlap issue at the
decompressor.
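The required spacing can be estimated directly from the two durations: a packet of the same bit count lasts longer at the base rate than at the compressed rate, and the difference is the minimum gap. A sketch with assumed rates:

```python
# Small sketch (both rates are assumed values) of the "time-out" spacing
# constraint: a packet decompressed back to the base rate occupies more
# time than its compressed form, so consecutive compressed packets need a
# minimum gap before the decompressor.

BASE_RATE = 2.5e9    # assumed base bit rate, bits/s
COMP_RATE = 40e9     # assumed compressed bit rate, bits/s

def min_gap(packet_bytes):
    """Extra idle time (s) needed after a compressed packet so that its
    decompressed version does not overlap the next packet."""
    bits = packet_bytes * 8
    return bits / BASE_RATE - bits / COMP_RATE

print(f"{min_gap(1500) * 1e6:.2f} us")
```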
5.3.3 Compression/Decompression process
Compression and decompression involve a temporal rearrangement of bits within the packet and a translation in bit rate to or from a higher bit rate. In both processes, the required control signals must have sufficient rise/fall times and granularity of control to prevent corruption of packet bits. In addition to this requirement,
compression and decompression need suitable modifications to the bits before and
after the processes to facilitate the temporal rearrangement. Figure 5. 7 shows the
steps required before packet compression and decompression may be performed.
[Figure: clock recovery (CR) drives NRZ-to-RZ conversion and compression; compressed RZ packets traverse the optical network or core switching element, then decompression and RZ-to-NRZ conversion restore outgoing low bit rate NRZ packets.]
Figure 5. 7: Compression Decompression process detail
Incoming packets at lower bit rates usually use the Non-Return-to-Zero (NRZ) bit
representation format. Compression of packets requires a reduction in the effective bit
period in representing packet information. Conversion of packets from NRZ to
Return-to-Zero (RZ) bit format is one way of achieving this and may be done before
the compression subsystem. Recovery of bit boundary information is essential for performing NRZ-to-RZ conversion, and a suitably time-aligned signal from a packet clock recovery system may be used. One approach is to use an electro-optic gating
modulator such as an EAM with a suitable transfer function to carve RZ pulses. After
decompression, it is desirable to represent bits within the packet back in NRZ format
and several techniques have been previously demonstrated to perform this
conversion. The quality of the RZ pulses entering the compressor is critical in
determining the quality of the compressed packets. The pulse width must be less than
one bit slot at the compressed data rate while the extinction ratio (Figure 5. 8) must be
sufficiently high (typically >25dB) for successful compression.
Figure 5.8: RZ pulse quality requirements for compression: small pulse width and high extinction ratio.
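These pulse requirements are easy to sanity-check numerically. A minimal sketch follows; the 10 Gb/s base rate and 1:4 compression ratio are illustrative assumptions, and `rz_pulse_ok` is a hypothetical helper name, not part of the demonstration setup:

```python
def rz_pulse_ok(pulse_width_ps, extinction_db, base_rate_gbps, compression_ratio,
                min_extinction_db=25.0):
    """Check an RZ pulse against the compression requirements: the pulse
    must fit within one bit slot at the *compressed* rate, and its
    extinction ratio must be sufficiently high (typically >25 dB)."""
    compressed_rate_gbps = base_rate_gbps * compression_ratio
    compressed_slot_ps = 1000.0 / compressed_rate_gbps  # ps per compressed bit
    return pulse_width_ps < compressed_slot_ps and extinction_db > min_extinction_db

# 10 Gb/s base rate compressed 1:4 -> 40 Gb/s, i.e. a 25 ps compressed bit slot:
print(rz_pulse_ok(pulse_width_ps=8.0, extinction_db=30.0,
                  base_rate_gbps=10.0, compression_ratio=4))
```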
5.4 Chapter Summary
Contention resolution is necessary at the network core to perform low loss statistical
multiplexing of optical packets. We propose exploiting the bit-rate transparency of all-optical forwarding schemes to adapt variable-size packets into fixed time slots, making it easier to forward packets with limited contention resolution resources by trading off bandwidth.
requires slot synchronization for efficient forwarding and schemes have been
proposed to implement such synchronization. Two network deployment strategies were proposed for using compression and decompression to implement optical adaptation, namely intra-node deployment and network-wide deployment. Intra-node deployment is chosen as it decouples link utilization from adaptation to aid efficient core-node forwarding. Moreover, this scheme relies
on a centralized control plane for compression and decompression making it easier to
recover the original payload from high bit rate compressed packets.
References
[1] S. Rangarajan, H. Zhaoyang, L. Rau, and D. J. Blumenthal, "All-optical contention resolution with wavelength conversion for asynchronous variable-length 40 Gb/s optical packets," IEEE Photonics Technology Letters, vol. 16, pp. 689-691, 2004.
[2] C. J. Chang-Hasnain, P.-C. Ku, J. Kim, and S.-L. Chuang, "Variable optical buffer using slow light in semiconductor nanostructures," Proceedings of the IEEE, vol. 91, pp. 1884-1897, 2003.
[3] M. Tokushima, "Experimental demonstration of waveguides in arrayed-rod photonic crystals for integrated optical buffers," presented at the Optical Fiber Communication Conference (OFC/NFOEC) Technical Digest, vol. 3, 2005.
[4] D. K. Hunter, M. C. Chia, and I. Andonovic, "Buffering in optical packet switches," Journal of Lightwave Technology, vol. 16, pp. 2081-2094, 1998.
Chapter 6
Optical Packet Compression
Statistical multiplexing of optical packets, along with the statistical demultiplexing shown in Chapter 4, facilitates packet forwarding at the core from multiple inputs to
multiple outputs. Contention resolution is required to efficiently perform statistical
multiplexing. One way to implement contention resolution is by the use of fixed line
delays as buffering elements. However, due to the inherent nature of today’s delay
line buffers, handling variable sized packets requires tradeoffs in utilization and
efficiency. Optical compression is one way of adapting packets to suit fixed line delay
buffers and implement contention resolution. Compression temporally squeezes each
packet thereby reducing the service time required to forward the packet, be it access
to output ports or storage in optical delay line buffers. Unlike conventional schemes
that slice the optical spectrum using wavelength division multiplexing to tap the large
available optical bandwidth, packet compression aims to use separate high bandwidth
wavelengths to provide ultra-high speed channels. Previous techniques proposed to
implement packet compression have not been scalable in packet size handling
capability. In Chapter 5, we proposed a network architecture suitable for deploying
packet compression/decompression to improve forwarding performance in a multiple
input/ output node. In order to enable statistical multiplexing, requirements on a
packet compressor are identified and a suitable technique is proposed and
demonstrated. The performance of the compression technique is then studied, both in terms of its versatility and of the bit-level packet quality after compression.
6.1 Compression – State of the art today
To date, several techniques have been proposed and demonstrated to implement
packet compression. This section details the state of the art in compression technology and the challenges and drawbacks in the practical deployment of such techniques. Each
technique detailed in this section requires that the compression process maintains the
bit order within the packet after compression thereby requiring distinct manipulation
of every bit in the packet at the compressed rate. A second requirement of all proposed techniques is the availability of several compression control signals with rise/fall times and control granularity at the compressed bit rate. Such a requirement negates the advantage of all-optical packet forwarding, in which the flow of high-speed traffic in the optical domain is controlled by low-data-rate electronics.
Compression schemes thus far require several block elements to successfully process
packets and the complexity of such compressors scales with the maximum packet
length. With the growth in average packet length in the internet [1], it is desirable to
have a compression technique whose complexity has a low dependence on the
maximum input packet length. The compression ratio achievable in current schemes is fixed for all packet lengths, and permanent setup changes are usually required to achieve different compression ratios. Furthermore, none of the current techniques for
compression have studied or proposed a way of handling variable length input
packets. Given the packet size distribution found in today’s internet [1], the ability of
a compressor to handle a wide range of input packet lengths becomes essential.
6.1.1 Feed forward delay line approach
One proposed approach to packet compression uses parallel delay line structures. In
this approach, splitting the incoming signal creates multiple copies of the packet.
Each copy of the packet is then appropriately shifted by a distinct delay amount to
align a specific bit of the incoming packet to its corresponding compressed bit slot.
The delayed bit sequences are multiplexed back in time using a combiner and a high
speed gate is used to select the appropriate compressed bits. Active gain sections may be used between stages to compensate for the 3 dB loss of each power split. Gating may be done either at the end of each section or at the output of the compressor.
The general structure of such a compressor is detailed in Figure 6. 1 (a). Figure 6. 1
(b) shows the principle of operation of the feed-forward delay compression approach.
Successful isolation of the compressed bits relies on the quality of the control signals.
Each compression element requires a unique control signal with compressed-rate precision and control granularity for error-free compression of the entire packet. The
number of cascaded compression elements required increases logarithmically with the
input packet length and is given by M = log2(N) where N is the number of bits in the
input packet. Moreover, each of the M delay elements must have an accuracy of a
fraction of the compressed bit slot.
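The stage count and per-stage delays above can be sketched numerically. In the fragment below, `feed_forward_stages` and `stage_delays` are hypothetical helper names, and the bit periods in the usage example are illustrative assumptions:

```python
import math

def feed_forward_stages(n_bits):
    """Number of cascaded delay/combine stages for a feed-forward
    compressor handling an n_bits packet: M = log2(N), rounded up."""
    return math.ceil(math.log2(n_bits))

def stage_delays(base_bit_period_ps, compressed_bit_period_ps, n_stages):
    """Per-stage differential delays: stage k shifts its copy by 2**k
    times the difference (T - t) between the base and compressed bit
    periods, so the delays double from stage to stage."""
    d = base_bit_period_ps - compressed_bit_period_ps
    return [d * 2**k for k in range(n_stages)]

# A 512-bit packet needs 9 stages, matching the example quoted from [2];
# 400 ps base slots compressed to 25 ps slots give delays 375, 750, 1500 ps...
print(feed_forward_stages(512))
print(stage_delays(400.0, 25.0, 3))
```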
Figure 6.1: (a) Structure of a feed-forward delay compressor: the incoming optical packet passes through cascaded coupler or switch elements with delays of (T−t), 2(T−t), and so on, and the combined output is gated to yield the compressed packet. (b) Working principle: delayed copies of the packet bits (1 through 8) are combined and gated to select the compressed bits.
In [2], for example, this requirement for a 9-stage compressor with a maximum packet size of 512 bits would demand a cleaving/splicing accuracy of ±200 µm (1 ps) on multiple delay lines of up to 50 m, i.e., a 0.0004% relative accuracy. Controlling several delay lines to such precision is a design challenge both in implementation and in stability over environmental changes such as temperature. Such a scheme would also have a maximum packet size of less than (K−1), where K is the compression ratio, due to the inherent nature of the compression. A similar approach demonstrated in [3] uses
multiple Optical Delay Line Lattices (ODLLs) to increase the maximum packet size
that can be handled. However, this scheme also suffers from increased complexity,
instability from environmental changes and stringent requirements imposed on the
compression control signals. The compression ratio achievable using this method is
fixed based on the delay values selected. Variable compression ratio implementation
would therefore require permanent setup changes in the scheme.
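The delay-line tolerance quoted from [2] can be checked with back-of-the-envelope arithmetic. The sketch below assumes a group index of roughly 1.47 for standard single-mode fiber; the helper names are hypothetical:

```python
# Back-of-the-envelope check of the fiber-delay tolerance quoted for [2].
C = 2.998e8          # speed of light in vacuum, m/s
N_GROUP = 1.47       # assumed group index of standard single-mode fiber

def fiber_length_per_ps():
    """Fiber length (in meters) corresponding to 1 ps of group delay."""
    return (C / N_GROUP) * 1e-12

def relative_accuracy(tolerance_m, line_length_m):
    """Length tolerance expressed as a fraction of the delay-line length."""
    return tolerance_m / line_length_m

print(fiber_length_per_ps() * 1e6)            # ~204 um of fiber per picosecond
print(relative_accuracy(200e-6, 50.0) * 100)  # ~0.0004 % over a 50 m line
```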
6.1.2 Spectral Slicing approach
A second approach [4] to compression uses the delay and combine technique as well
but utilizes the wavelength dimension to implement it. Incoming packets are copied
to multiple wavelengths using super continuum generation followed by spectral
slicing. Each wavelength copy is then separated, delayed appropriately and
multiplexed back together to perform timing alignment necessary for compression
using Arrayed Wave Guide Router (AWGR) and delay lines. The compressed packet
is returned to a single wavelength by spectral slicing of the generated super
continuum. Figure 6. 2 shows the operation of this approach.
Figure 6.2: Spectral slicing approach, principle of operation: the incoming packet (bits 1 through 4 on λin) is copied onto λ1–λ4, each wavelength copy is delayed to align its bit into the compressed time slot, and the slices are recombined as the compressed packet on λout.
Decompression may be performed using the same setup in the reverse direction and
using a narrow gating window with base rate repetition rate. While the method
reduces power losses due to splitting and multiplexing by using an AWGR and the
wavelength domain, the complexity of the system increases rapidly with the
maximum packet size. The number of wavelengths required in this approach equals
the number of bits in the packet. A second method, proposed in [5], also employs the wavelength domain, using linearly frequency-chirped pulses to obtain time-to-wavelength conversion. However, 1.15 km of fiber was required to compress a packet of only 4 bits. Compression/decompression of large packets is therefore impractical using either technique. Compression control signals must have compressed-rate accuracy and control granularity, and implementing a variable compression ratio would require a complete change of the delays and wavelengths used, i.e., permanent hardware changes.
6.1.3 Loop based approach
The oldest approach presented thus far [6] uses recirculating loops to align and store
compressed packet bits until the entire packet has been compressed. Bits are switched
into a recirculating loop whose loop length is adjusted to be one compressed bit
shorter than the base rate bit slot. Once a packet has been compressed, it is gated out
of the loop. Figure 6. 3 shows the principle of operation of the recirculating loop
based approach.
Figure 6.3: Loop-based compression, principle of operation: packet bits (1 through 8) are switched into a recirculating loop, accumulate a one-compressed-bit offset per circulation, and are gated out once the whole packet has been compressed.
The maximum number of bits is limited to (K−1), where K is the compression factor. The maximum packet size is also limited by ASE buildup in the loop, as the number of recirculations required equals the number of bits in the packet. Reference [7] experimentally demonstrates the recirculating compression scheme and analyzes the
OSNR performance with increasing packet size. OSNR degradation limits the packet
size to below 100 bits, a size that is impractical for most network applications. The
number of compression elements required however does not increase rapidly with
increasing input packet size. This scheme may therefore be suitable for packet
compression if modified appropriately. However, in order to get a clean compressed
packet, a compression control signal with compressed rate precision and granularity is
required.
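The two limits above, a maximum of (K−1) bits and one ASE-accumulating loop circulation per bit, can be captured in a small helper (a sketch; `loop_limits` is a hypothetical name):

```python
def loop_limits(compression_factor, n_bits):
    """Basic limits of the recirculating-loop compressor of [6]/[7]:
    the packet may hold at most (K - 1) bits, and compressing an n-bit
    packet costs one loop circulation per bit, with ASE accumulating
    on every circulation."""
    max_bits = compression_factor - 1
    recirculations = n_bits
    return max_bits, recirculations

# A 100-bit packet with a 1:128 compression factor fits within the
# (K - 1) limit but still needs 100 ASE-accumulating circulations.
print(loop_limits(128, 100))
```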
The compression mechanism chosen must be able to handle packet sizes of at least 1500 bytes (12000 bits). To date, proposed and/or demonstrated schemes have been able to handle a maximum of 100 bits, with no practical possibility of scaling to
larger packet sizes. Any compression scheme proposed must also facilitate a viable
decompression mechanism without requiring compressed rate electronics to identify
and control compressed packets.
6.2 Network deployment of packet compression
Adaptation of packets for efficient packet forwarding at a multiple input/output core
node was analyzed in Chapter 5. Packet compression may be performed in several
ways based on the nature of the compressed packet at the compressor output. A study
of the deployment of packet compression/decompression is necessary to identify a
specific compression scheme and requirements placed on it. Successful end-to-end
forwarding of packets depends on the decompression scheme’s ability to identify
compressed rate packets and decompress each packet to its original temporal size
before it reaches the edge adaptation and exits the optical network. The primary requirement of any compression scheme is therefore to output compressed packets that not only increase the core forwarding efficiency but are also decompressable without bit errors for any input packet length.
6.2.1 Intra-node compression/decompression
Optical compression and decompression may be used to provide packet adaptation at
the core. However, payload adaptation may still be necessary at the network edge to
perform error-free compression and decompression at the core. Figure 6.4 shows the adaptation required for efficient forwarding at the core using compression/decompression.

Figure 6.4: Intra-node deployment of compression and decompression. Base-rate packets are framed and adapted at the ingress, compressed and decompressed around each core switching stage of the all-optical network, and deframed at the egress.

At the core, however, optical compression increases the payload bit rate before the packet exits the core. While compression/decompression are intended to
increase core forwarding efficiency and enable the use of short fixed size buffers,
they are not necessarily bandwidth efficient. Intra-node compression/decompression
deployment separates the high throughput fiber links between nodes from the
switching process at the core. This allows for maintaining high link utilization
regardless of the compression scheme that is tailored to enable and maximize
forwarding at the NxN core.
Figure 6.5: Edge ingress node functionality, electronic adaptation: IP header extraction and optical header lookup, optical packet framing, payload encoding for core adaptation, and optical modulation on the selected wavelength λselect.

Figure 6.6: Edge egress node functionality, electronic adaptation: photodetection and CDR, payload decoding for core adaptation, optical payload deframing, and IP header extraction to recover the IP packets.
Figure 6. 5 and Figure 6. 6 show the electronic adaptation necessary on the packet
payload at the ingress and egress for successful packet adaptation at the core. At the
ingress, incoming IP packets are examined to lookup an optical header with
appropriate forwarding information. Each IP packet is individually framed into an optical packet with the appropriate payload, header, and packet framing. The payload then undergoes the electronic processing required to enable an error-free compression/decompression process at the core. The optical packets are then modulated on a selected wavelength and exit the ingress node. At the egress, the reverse process takes place. For successful core node packet adaptation, the payload bit sequence must not be altered after it has been processed for compression/decompression. Payload encoding is therefore the last step at the ingress before packet transmission, and payload decoding is the first step at the egress after packet detection.
An electronic control plane at the core is used to identify the time of arrival and payload length of each incoming optical packet. Control signals generated by this electronic circuitry drive the compression of payloads before packet forwarding.
Figure 6.7: Intra-node deployment of compression/decompression at the core. Fixed base-rate, variable-sized packets are synchronized and traffic-shaped, compressed into variable compressed-rate, fixed-sized packets, forwarded through the switch element with packet contention resolution (feedback and feed-forward buffering), merged at the output, and decompressed back into fixed base-rate, variable-sized packets.
After successful forwarding, packet traffic is merged and decompressed to base rate
before transmission out of the node as shown in Figure 6. 7. Adaptation of variable
length packets entering the core node into fixed sized payloads enables the efficient
use of short fixed length delay lines to buffer packets contending for an output port.
6.2.2 Packet compression classification
To realize buffering using current state-of-the-art technology, adaptation of variable-length packets to fixed time slots may be performed before statistical multiplexing. Variable-length packets may be either fragmented or compressed into fixed-sized packets. Fragmentation before forwarding requires fragment IDs to keep track of packet fragments, and independent forwarding of fragments can lead to reordering and a higher probability of error due to header corruption. Packet compression is another approach to perform this adaptation.
Compression and decompression of packets to adapt variable length packets into
fixed sized cells may be achieved in one of several approaches. The compression ratio
may be either fixed for all input packet lengths, or varied discretely or continuously
on a per packet basis. Fixed compression ratio results in the same variance of
compressed packet sizes as that of the input packet sizes as shown in Figure 6. 8.
Packets may be padded to the maximum packet size after compression to reach a
fixed size. A compression scheme implementing a constant compression ratio is difficult to achieve for both fixed and variable input packet lengths, as it requires bit-level manipulation throughout the packet.
Figure 6.8: Fixed-compression-ratio compression: variable-length packets at the base rate are all compressed by the same factor (4x), yielding variable-length packets at the compressed rate that must be padded to reach a fixed temporal size.
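The padding cost of a fixed compression ratio can be quantified with simple arithmetic, using the 40-byte and 1500-byte packet bounds from the text and an assumed 4x ratio (`fixed_ratio_output` is a hypothetical helper name):

```python
def fixed_ratio_output(packet_bits, ratio, max_packet_bits):
    """Fixed-ratio compression: every packet shrinks by the same factor
    and is then padded up to the compressed size of the largest packet,
    so that all outputs occupy the same time slot. Returns the compressed
    length and the padding required, both in compressed bit periods."""
    compressed = packet_bits / ratio
    slot = max_packet_bits / ratio
    return compressed, slot - compressed

# A 40-byte (320-bit) minimum packet in a slot sized for 1500-byte
# (12000-bit) packets: 80 useful bits against 2920 bits of padding.
print(fixed_ratio_output(320, 4, 12000))
```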
In order for packets of all lengths to occupy the same compressed time slot without
padding, continuously varying the compression ratio based on the input packet length
is required. However, such fine control over the compressor is hard to implement and
therefore a discretely variable compression ratio may be used, as shown in Figure 6.9.

Figure 6.9: Variable and discretely variable compression-ratio compression: variable-length packets at the base rate are compressed by a per-packet ratio (e.g., 2x, 1x, 4x) into fixed-temporal-size packets at the compressed rate.
To achieve variable-length to fixed-size packet adaptation using a compression scheme with a discretely variable compression ratio, a small amount of padding is required. This limited control over the compression ratio allows for higher input utilization, as such a compressor adapts its processing to each input packet length, thereby achieving improved performance.
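Per-packet selection of a discrete compression ratio might be sketched as follows; the slot size of 3000 base-rate bit periods and the ratio set {1, 2, 4} are illustrative assumptions, and `pick_discrete_ratio` is a hypothetical name:

```python
def pick_discrete_ratio(packet_bits, slot_bits_at_base_rate, ratios=(1, 2, 4)):
    """Pick the smallest discrete compression ratio that fits the packet
    into the fixed output time slot (expressed in base-rate bit periods),
    and report the residual padding. Choosing the smallest fitting ratio
    minimizes padding for that packet."""
    for k in sorted(ratios):
        if packet_bits / k <= slot_bits_at_base_rate:
            return k, slot_bits_at_base_rate - packet_bits / k
    raise ValueError("packet too long for the largest available ratio")

# With an illustrative 3000-bit slot: short packets pass through
# uncompressed, while maximum-size packets take 4x compression.
print(pick_discrete_ratio(320, 3000))
print(pick_discrete_ratio(12000, 3000))
```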
6.2.3 Requirement of a practical compressor
A compressor/decompressor used to adapt variable-length IP packets to and from fixed-sized cells requires compression ratio control, a phase-aligned clock, and gating control signals for the compression elements, as shown in Figure 6.10.
Figure 6.10: Generic structure of a compressor/decompressor. Inputs: data at the base/compressed rate, a frequency- and phase-aligned base-rate clock for pulse carving and compression, a compression-ratio control, and gate control signals for the (de)compression elements. Output: the (de)compressed packet stream at the compressed/base bit rate.
Based on the network architecture proposed and studied, several requirements must
be satisfied by such a compressor/decompressor in order to successfully perform
adaptation of variable sized packets to fixed time slots.
• The scheme must support a maximum packet length of 1500 bytes (12000 bits) and packet lengths varying from 40 bytes (320 bits) to 1500 bytes.
• Discretely variable compression ratio changing on a per packet basis must be
achievable without having to implement permanent hardware changes.
• Control signals and clocks necessary for the compressor must be few and
generated asynchronously based solely on the packet timing information. The
number of signals required must not scale directly with the maximum input
packet length.
• Any control signals required for compression must be of base rate precision
and granularity in order to be viable for generation by the core node
electronics.
• Compression must take place in real time in order to support the maximum
line rate at the input.
• Any compression scheme selected must be scalable both in terms of
maximum input packet length handling capability and highest compressed bit
rate achievable for a given base rate.
• The proposed compressor technology must be robust against environmental
stress such as temperature fluctuations.
6.2.4 Approaches to compression implementation
Compression of a packet involves the temporal rearrangement of the packet bits to
reduce the footprint of the packet in time. However, such rearrangement is not
required to preserve the bit order of the packet, so long as the original packet information can be reconstructed either by simple processing of the compressed packet or because the decompression process restores the original bit order. Such a relaxation of the bit-order constraint drastically reduces the processing required to achieve compression, as individual bit-level shifting is no longer necessary. In the particular case of an all-optical label-switched network,
the optical payload that undergoes compression and decompression is not examined
until the packet reaches the egress and is ready to exit the network. Compression
schemes proposed/demonstrated thus far strive to maintain the bit order after
compression. In such a bit serial approach to compression, each bit must individually
be shifted in time by a distinct delay period to the corresponding bit slot at the
compressed bit rate as shown in Figure 6. 11. Such manipulation therefore requires
compressed bit rate precision control for every bit in the packet and subsequently
does not scale well with input packet length.
Figure 6.11: Bit-serial compression: each bit of the packet (1 through 8) is individually shifted in time to its compressed bit slot, and decompression restores the original bit spacing.
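The per-bit shifts that make bit-serial compression unscalable follow directly from the bit periods. A sketch with illustrative numbers (`bit_serial_delays` is a hypothetical helper name):

```python
def bit_serial_delays(n_bits, base_period_ps, compressed_period_ps):
    """Bit-serial compression: bit i (0-indexed) must be shifted by
    i * (T - t) so that it lands in compressed bit slot i, which is why
    the number of distinct delays grows with the packet length."""
    d = base_period_ps - compressed_period_ps
    return [i * d for i in range(n_bits)]

# 8-bit packet, 400 ps base slots compressed to 25 ps slots (illustrative):
# the last bit must be shifted by 7 * 375 ps with compressed-rate precision.
print(bit_serial_delays(8, 400.0, 25.0))
```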
A second approach to compression is shown in Figure 6. 12 where each packet may
be viewed as a number of contiguous fixed sized virtual bins. While the order of the
bits after compression does not follow that of the base rate packet, the number of
control signals required to perform compression/decompression does not scale
directly with the number of bits in the packet. Such an approach can handle both large
and small packets as discretely variable compression ratio is achieved by controlling
the number of bins included in the compression process. Each bin is buffered for a
multiple of a fixed delay to form the compressed packet. The compression process
takes place as the incoming packet is loaded into the compressor and the compressed
packet becomes available while the end of the packet is being loaded. The scheme
therefore imposes little or no latency on the packet. At decompression, the compressed
packet is sampled to demultiplex bits from each bin to reconstruct the base rate packet
with the correct order of bits.
Figure 6.12: Compression principle, packet fold-in approach: a 40–1500-byte packet is divided into virtual bins, each bin is delayed by a multiple of (one bin size minus one compressed bit slot), and the interleaved bins form a compressed packet of one bin size plus n bits.
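The fold-in mapping can be modeled as a plain bit-interleave over virtual bins. The sketch below uses integer labels in place of optical bits and assumes the packet divides evenly into bins; the function names are hypothetical:

```python
def fold_in_compress(packet_bits, n_bins):
    """Fold-in compression: split the packet into n_bins equal virtual
    bins and bit-interleave them. Bit order is NOT preserved, but the
    mapping is fixed and invertible at the decompressor."""
    m, r = divmod(len(packet_bits), n_bins)
    if r:
        raise ValueError("pad the packet so the bins divide it evenly")
    bins = [packet_bits[b * m:(b + 1) * m] for b in range(n_bins)]
    # Compressed slot i*n_bins + b holds bit i of bin b (round-robin).
    return [bins[b][i] for i in range(m) for b in range(n_bins)]

def fold_in_decompress(compressed_bits, n_bins):
    """Invert the interleave: demultiplex every n_bins-th bit back into
    its virtual bin, then concatenate the bins in their original order."""
    out = []
    for b in range(n_bins):
        out.extend(compressed_bits[b::n_bins])
    return out

packet = list(range(1, 13))                  # 12 'bits' labelled 1..12
c = fold_in_compress(packet, 4)
print(c)                                     # interleaved, not in bit order
print(fold_in_decompress(c, 4) == packet)    # True: the mapping is invertible
```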
Decompression is also performed while the packet is loaded and therefore imposes
little or no additional latency. The highest compressed bit rate is determined only by
the NRZ to RZ conversion process and the pulse width of the bits in the packet. The
control signal precision has minimal impact on the maximum compressed bit rate.
The fold-in compression/decompression approach is therefore scalable both in bit rate
and in the maximum input packet size and promises to satisfy the relaxed control
signal requirements of a practical packet compressor/decompressor.
6.3 Fold-in packet compression
Two approaches to implementing the fold-in compression scheme are proposed and studied, namely linear time-delay compression and loop-based compression.
The loop based approach is selected for its scalability, ease of implementation and
compactness.
6.3.1 Linear time delay compression
Figure 6.13: Linear time-delay implementation of fold-in compression: the optical packet payload on λingress is wavelength converted (SOA-WC) section by section onto λ1–λ4 (supercontinuum or CW sources), converted NRZ-to-RZ, routed through an AWGR with envelope and compression/fold-in delays, recombined through 4-by-1 couplers, and converted by a fiber wavelength converter into a compressed packet on λ0.
Figure 6. 13 shows the principle of operation of the linear time delay approach.
Different parts of the incoming packet are wavelength converted to separate wavelengths using Semiconductor Optical Amplifier (SOA) based Cross-Gain Modulation (XGM) wavelength conversion. The packet then undergoes NRZ-to-RZ
conversion to reduce the pulse width of the bits. An Arrayed Wave Guide Router
(AWGR) is used to spatially separate sections of the packet and each section is
appropriately delayed before being multiplexed into a single output. A high speed
fiber wavelength converter is used to convert the compressed packet into a single
outgoing wavelength. The control signals required to turn the separate CWs on and off are of base-rate precision. While the compressed packets have the desired sharp edges, this approach requires a multi-wavelength source and a high-speed wavelength converter. Polarization dependence of the high-speed wavelength converter (WC) can also degrade the compressed packet bit quality. The number of picosecond-accuracy delays required for compression scales directly with the input packet size and the required compression ratio. The linear approach is therefore not very scalable in terms of compressed bit rate and input packet size.
6.3.2 Loop based compression
A second approach proposed uses a fiber loop to provide the delay for each virtual
bin of the incoming packet. Figure 6. 14 shows the principle of operation of the loop
based implementation. Virtual bins of the incoming packet are stored in an
accumulative buffer and bit interleaved with adjacent virtual bins. The bit-interleaved
bin is then stored in the accumulative buffer to be time delayed appropriately and
ready for bit interleaving with the next incoming virtual bin. The compressor gating
outputs only the compressed packet when all incoming virtual bins have been
interleaved. The simple structure of the compressor requires only two simple control
signals, one to control and flush the accumulative buffer and another to control the
compressor gating after each packet has been compressed.
Figure 6.14: Loop-based compression principle: uncompressed packets arrive as base-rate virtual bins, a bit-level bin interleaver and an accumulative buffer fold successive bins together, and the compressor gating releases the compressed packets at the compressed rate.
Compressed-rate precision and granularity are not required of the control signals. Figure 6.15 shows an example of packet compression using the loop-based implementation. The loop length is chosen to be one compressed bit shorter than the compressed packet time slot. After undergoing NRZ-to-RZ conversion, the packet enters the loop; each virtual bin of the incoming packet is distinguished by a separate dotted line in the figure. Each section of the packet is multiplexed into a fixed time slot with each loop trip. At the end of the fourth loop trip, all sections of the incoming packet have been successfully multiplexed into a fixed time slot and are gated so that only the compressed packet is output. While all bits of the packet are compressed without corruption, the loop-based approach, unlike the linear time-delay approach, produces imperfect packet edges containing stray bits, limited by the rise/fall times of the compression gating signal. Additional encoding is therefore necessary at the edge to prevent stray bits from corrupting valid packet bits during decompression. Compression of each packet may take place in real time, as read and write operations on the accumulative loop buffer may be performed simultaneously. It is therefore possible to achieve packet compression with high utilization at the compressor input.
Figure 6.15: Loop-based implementation of fold-in compression, showing the uncompressed packet input, the loop output after trips 2 through 4, and the SOA-gated compressed packet output (colors indicate virtual bins, not wavelengths).
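The loop-length arithmetic behind this example can be sketched numerically. The numbers below are illustrative assumptions (a 16-bit packet split into four 4-bit virtual bins, 2.5 Gb/s base rate compressed 1:4 to 10 Gb/s), not the demonstrated setup:

```python
# Timing sketch of the loop-based fold-in compressor.
BASE_BIT_PS = 400.0                    # bit period at an assumed 2.5 Gb/s
COMP_BIT_PS = 100.0                    # bit period at 10 Gb/s (1:4 ratio)
BIN_BITS = 4
BIN_SLOT_PS = BIN_BITS * BASE_BIT_PS   # one virtual-bin time slot (1600 ps)

# The loop is one compressed bit slot shorter than the bin time slot,
# so each circulation advances a stored bin by exactly one compressed
# bit slot relative to the bin currently entering the loop.
LOOP_DELAY_PS = BIN_SLOT_PS - COMP_BIT_PS

def lead_after_trips(trips):
    """How far a stored bin leads the freshly arriving bin after the
    given number of loop circulations: one compressed slot per trip."""
    return trips * (BIN_SLOT_PS - LOOP_DELAY_PS)

print(LOOP_DELAY_PS)          # 1500.0 ps loop delay
print(lead_after_trips(3))    # 300.0 ps, i.e. three compressed bit slots
```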
6.4 Packet adaptation using fold-in compression
Due to the inherent advantages of fewer compression elements required, simpler
structure for scaling maximum input packet size and ease of construction, the loop
based approach is selected to demonstrate packet adaptation using the fold-in
compression approach. However, before setting up the compressor, characterization
of the SOA used to control and flush the compression buffer loop is necessary.
6.4.1 Loop characterization
The gated loop is accumulative by nature of the compression scheme and therefore
results in an increasing average power at the loop SOA’s input, as shown in
Figure 6.16. The SOA must therefore be characterized to perform uniformly for a range of
input powers. For a maximum compression ratio of 1:4, the average input power for
the loop SOA varies over 6dB. The mark ratio of the input stream to the loop SOA
changes along with the average power. The SOA must therefore be characterized over
a sufficiently large pseudo random bit sequence.
Figure 6.16: Average power increase at the input of the loop SOA (Pav to 4Pav over a compression cycle)
The loop SOA used in this demonstration was characterized using a 10Gbps bit error
rate system with a PRBS 2^31-1 sequence over a wide range of average input power.
Figure 6.17 shows the power penalty incurred through the SOA for varying input
power. At lower optical powers, the power penalty increases due to the low input
OSNR, while at higher input optical powers, pattern dependence of the SOA due to
saturation effects increases the incurred power penalty. An SOA with a high
saturation power is therefore desirable. A <0.5dB penalty is seen over a ~17dB
input power range from -24dBm to -7dBm, which is sufficient to achieve a maximum
compression ratio of 1:4 with minimal degradation in SOA response over the
entire input packet. An average input power of ~-17dBm was chosen to operate
the compression loop. Due to the accumulative nature of the compression loop, a
higher compression ratio requires a wider input power range where the loop SOA
operates with minimal degradation. Compression from 2.5Gbps to 40 Gbps would
require a 12dB input power range for successful compression.
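The required SOA operating range follows directly from the compression ratio: the loop accumulates one virtual bin per trip, so the average input power swings by a factor of K. A minimal sketch of this arithmetic (the function name is ours, not part of the experimental setup):

```python
import math

def soa_input_power_range_db(compression_ratio: int) -> float:
    """Dynamic range (in dB) of the average power at the loop SOA input.

    The loop accumulates one virtual bin per trip, so the average power
    grows from Pav to K * Pav over a compression cycle, a swing of
    10 * log10(K) dB.
    """
    return 10.0 * math.log10(compression_ratio)

# 1:4 compression (2.5Gbps to 10Gbps): ~6dB swing
print(round(soa_input_power_range_db(4), 1))   # -> 6.0
# 1:16 compression (2.5Gbps to 40Gbps): ~12dB swing
print(round(soa_input_power_range_db(16), 1))  # -> 12.0
```

These values match the 6dB and 12dB input power ranges quoted above.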
Figure 6.17: Measured SOA power penalty and OSNR vs. input power (operating region indicated)
6.4.2 2.5Gbps to 10Gbps Packet compression
The experimental setup to demonstrate 2.5Gbps to 10Gbps compression is shown in
Figure 6.18. Compressed packet sizes fit a single sized bin of 1.2usec, the time
period of a 1500 byte packet at the compressed bit rate. The principle of operation of
the compression loop is as follows. 2.5Gbps packets to be compressed were generated
Figure 6.18: 2.5Gbps to 10Gbps compression - experimental setup
to be 1500 bytes (~4.8 usecs) in length with a 25 usec inter packet gap. The input
average power was set to an optimal level based on the characterization of the loop
SOA before entering the loop. A gain controlled SOA compensates for losses in the
loop and is used as a gating element to flush the compression loop when each packet
has been compressed. The SOA gating was adjusted to turn on the SOA only when
packets exist in the loop, to eliminate noise buildup and amplifier transient
response due to the bursty nature of the packet traffic. A 1.2nm filter centered at 1556.6nm was
used to reject ASE noise in the loop. A bulk fiber delay of ~210m and a picosecond
precision delay line were used to adjust the total loop length to be 1.2usec which is
selected as the bin size for this experiment. A variable attenuator was used to obtain
precise balance between gain and losses in the loop; this is critical to maintaining
constant bit heights between multiplexed bins. The compression loop serves as a
buffer to hold the packet while compression takes place as well as to time multiplex
the virtual bins of the packet as they enter the loop. A second gated SOA at the output
of the compression loop samples the compressed packet when all input virtual bins
have entered the loop and compression is complete. Picosecond control signals are
not required since time multiplexing is achieved by accurately controlling the length
of the compression loop. The gated SOAs used in this experiment have rise and fall
times of ~2 nsecs. Compressed packets generated in this experiment therefore have
2nsecs of stray bits at the packet edges.
Scope traces of the input packet and the compressed output packet are shown in
Figure 6.19(a) and (b). The input packet size was measured to be 4.8usecs long and
the bit period to be ~400psec with a pulse width of ~17psec.
Figure 6.19: (a) Input 1500 byte packet and (b) packet stream quality
Figure 6.20: (a) Compressed 1500 byte packet and (b) compressed stream quality
Figure 6.20(a) shows the 1500 byte packets after compression. The packet size was
measured to be ~1.2usec as expected from a 1:4 compression of the packet. The
compressed bit quality seen in Figure 6.20(b) shows slight height variations in the
bits attributed to the loop SOA’s pattern dependence and a slight droop seen in the
loop control signal. Compressed bits are spaced 100psecs apart corresponding to a
compressed rate of 10Gbps.
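The measured durations agree with the simple time scaling of a 12000 bit packet between the two rates; a quick check (the helper is illustrative, not part of the setup):

```python
def packet_duration_us(size_bytes: int, rate_gbps: float) -> float:
    """Temporal length of a packet at a given bit rate, in microseconds."""
    return size_bytes * 8 / (rate_gbps * 1000.0)

base = packet_duration_us(1500, 2.5)         # 12000 bits at the base rate
compressed = packet_duration_us(1500, 10.0)  # same bits at the compressed rate
print(round(base, 2), round(compressed, 2))  # -> 4.8 1.2
```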
Bit error rate measurements were made on the compressed 1500 byte packets at
10Gbps. The input packet stream bits were chosen such that after proper compression,
they would yield a 12000 bit PRBS 2^7-1 sequence. Error free operation was seen
for packets providing bit level verification of successful compression of the entire
packet. Bit error rate curves shown in Figure 6.21 show the quality of the
compressed packets.
Figure 6.21: Bit error rate measurements on 1500 byte compressed packets (refer to Figure 6.18)
A ~2.2dB penalty is seen through the compressor and is attributed to the height
variations of the compressed bits. This penalty can be reduced by choosing an
SOA with higher saturation power and by improving the control signal quality.
6.4.3 Variable compression control
A second experiment was performed to test the compressor’s capability to handle
variable length input packets. Compression ratio may be varied by controlling the
number of loop trips made by the incoming packet, as explained in Figure 6.22.
Figure 6.22: Principle of variable compression ratio control (compression loop control shown for packet sizes < 375, < 750, < 1125 and < 1500 bytes of a 1500 byte, i.e. 12000 bit, incoming packet at base rate)
By appropriate choice of the loop-gating signal, the compression ratio may be varied
to be 1:1, 1:2, 1:3 or 1:4. The compression scheme was tested for variable length to
fixed time slot adaptation of packets using variable compression ratio over an input
packet size range of 40 to 1500 bytes.
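The per-packet ratio selection amounts to choosing the fewest loop trips whose compressed output still fits the 1.2usec bin; the 375/750/1125/1500 byte thresholds fall out of the arithmetic. A minimal sketch (helper names are ours, and packet sizes are assumed to be at most 1500 bytes):

```python
BIN_US = 1.2       # fixed compressed time slot
BASE_GBPS = 2.5    # uncompressed line rate
BIN_BITS = 3000    # base rate bits that fit one slot: 1.2 us * 2.5 Gbps

def compression_ratio(size_bytes: int) -> int:
    """Fewest loop trips (1:1 .. 1:4) so the packet fits the fixed bin."""
    bits = size_bytes * 8
    return -(-bits // BIN_BITS)  # ceiling division

def padding_us(size_bytes: int) -> float:
    """Idle time appended so the compressed packet fills the 1.2usec slot."""
    k = compression_ratio(size_bytes)
    compressed_us = size_bytes * 8 / (BASE_GBPS * 1000.0) / k
    return BIN_US - compressed_us

for size in (40, 560, 1024, 1500):
    print(size, compression_ratio(size))  # 1, 2, 3 and 4 loop trips
```

For 40 byte packets the function reports substantial padding, consistent with the observation above that small packets must be padded to occupy the full slot.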
Figure 6.23: 1500 and 1024 byte input packets (a1, b1) and compressed output packets (a2, b2) (1μsec time scale)
Figure 6.23 and Figure 6.24 show the input and compressed packet traces for 1500,
1024, 560 and 40 byte packets. It can be seen that while the input packet size varies
from 4.8usecs (1500 bytes) to 128nsecs (40 bytes), a fixed output size of 1.2usec is
obtained after compression by varying the compression ratio from 1:4 to 1:1. For 40
byte packets, padding is required for the compressed packet to occupy a 1.2usec time
slot.
Figure 6.24: 560 and 40 byte input packets (c1, d1) and compressed output packets (c2, d2) (1μsec (top) and 50nsec (bottom) time scales)
Bit error rate measurements were not performed on smaller packets due to the minimum
synchronization time requirement of the bit error rate tester. Packet bit quality was
however verified visually to be the same or better than the 1500 byte compressed
packets.
6.4.4 Bandwidth efficient multiple compressor approach
The loop based compression approach has been demonstrated for 2.5Gbps to 10Gbps
compression of 1500 byte packets. Discretely variable compression ratio control on a
per packet basis required for variable length to fixed sized packet adaptation is
possible to achieve and has been demonstrated for individual packet sizes. However,
as studied in chapter 5, a more efficient bandwidth use at the core may be obtained by
compressing packets of different sizes to the maximum compression ratio. This
results in a discrete set of packet sizes after compression. The trimodal nature of the
distribution of packet sizes in today’s IP networks may be used in selecting
appropriate bin sizes to achieve maximum compression for most packets. When a
single delay line buffer is used, smaller compressed packets may be padded to the
maximum packet size at the compressor output. However, in the case of buffers with
multiple delay line lengths, each compressed packet size may be buffered using the
delay line of optimal length, thereby increasing the bandwidth efficiency. A two stage
wavelength converter architecture used in the core node shown in chapter 4 is
preferred for packet steering in core forwarding due to its inherent
robustness and flexibility. Such an approach may also be useful in selecting the
output compressed packet size in the compressor. The polarization insensitive SOA
based XGM wavelength conversion may be used to select the appropriate internal
wavelength for transmission of the base rate packet. Bit carving is used to perform
NRZ to RZ conversion necessary before the packet can be compressed. A single loop-
based compressor may be used to implement compression to multiple compressed
packet sizes, based on input packet length, as shown in Figure 6.25.
Figure 6.25: Bandwidth efficient compression setup.
Bins undergo a different delay based on the transmitted wavelength of the incoming
packet. Once the packet has been compressed, it may then be steered by the second
stage wavelength converter performing the actual forwarding. However, prior
knowledge of packet size is required at the control plane for the appropriate choice of
internal wavelength resulting in the optimal compressed packet size at the output.
This information may be transmitted in the optical header to ensure proper internal
wavelength selection before the packet arrives at the compressor input.
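A sketch of the control-plane mapping from packet length to internal wavelength and bin size follows; the wavelength labels and the trimodal bin set (40, 576 and 1500 byte packets, all at 1:4 compression to 10Gbps) are illustrative assumptions, not measured values from the setup:

```python
COMPRESSED_GBPS = 10.0

# Hypothetical internal wavelengths, one per compressed bin size, chosen
# around the trimodal IP packet size distribution (40, 576, 1500 bytes).
BINS = [
    ("lambda_1", 0.032),   # 40 byte packets compressed 1:4, in usec
    ("lambda_2", 0.4608),  # 576 byte packets compressed 1:4
    ("lambda_3", 1.2),     # 1500 byte packets compressed 1:4
]

def select_wavelength(size_bytes: int):
    """Smallest bin (and its steering wavelength) holding the compressed packet."""
    compressed_us = size_bytes * 8 / (COMPRESSED_GBPS * 1000.0)
    for wavelength, bin_us in BINS:
        if compressed_us <= bin_us + 1e-12:
            return wavelength, bin_us
    raise ValueError("packet exceeds the largest bin")

print(select_wavelength(40)[0], select_wavelength(1500)[0])  # -> lambda_1 lambda_3
```

Embedding the packet length in the optical header, as described above, lets this lookup complete before the packet reaches the compressor input.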
6.5 Chapter summary
The state of the art in compression technology to date is presented in this chapter.
Deployment of compression in an optical network is considered and the requirements
for practical implementation of compression/decompression are studied. Prior
compression technologies based on the bit serial approach do not satisfy these
requirements. A new approach to compression, namely fold-in compression is
therefore proposed.
Two schemes are presented and compared to implement the fold-in compression
approach. While the linear time delay approach offers clean packet edges for the
compressed packets, it does not scale well with either the maximum achievable
compression ratio or the maximum input packet length. The gated loop compression
scheme is shown to be a simple and robust design. It scales well with maximum bit
rate, highest compression ratio achievable and maximum input packet length.
Moreover, discretely variable compression ratio control required for variable length
to fixed sized packet adaptation is shown to be feasible in this scheme.
Compression of 1500 byte packets from 2.5Gbps to 10Gbps is shown with only a
~2.2dB penalty. Error free operation through the entire packet using the gated loop
approach was verified. Compression of variable length input packets to a fixed time
slot is shown by varying the compression ratio of the loop. A modification to the loop
based compression scheme is proposed to accommodate multiple compressed packet
sizes to increase the bandwidth efficiency of the forwarding scheme.
A compression technology capable of handling a maximum input packet length of
12000 bits (1500 bytes) and the widest dynamic range of input packet lengths from
320 bits (40 bytes) to 12000 bits (1500 bytes) shown to date is proposed and
demonstrated. It meets or exceeds all requirements for viable deployment of
compression as an adaptation mechanism at the core.
References
[1] K. Thompson, G. J. Miller, and R. Wilder, "Wide-area Internet traffic patterns
and characteristics," IEEE Network, vol. 11, pp. 10-23, 1997.
[2] P. Toliver, D. Kung-Li, I. Glesk, and P. R. Prucnal, "Simultaneous optical
compression and decompression of 100-Gb/s OTDM packets using a single
bidirectional optical delay line lattice," IEEE Photonics Technology Letters,
vol. 11, pp. 1183-1185, 1999.
[3] S. Aleksic, V. Krajinovic, and K. Bengi, "A novel scalable optical packet
compression/decompression scheme," presented at the 27th European Conference on
Optical Communication (ECOC '01), pp. 478-479 vol. 3, 2001.
[4] H. Sotobayashi, K. Kitayama, and W. Chujo, "40 Gbit/s photonic packet
compression and decompression by supercontinuum generation," Electronics
Letters, vol. 37, pp. 110-111, 2001.
[5] P. J. Almeida, P. Petropoulos, B. C. Thomsen, M. Ibsen, and D. J. Richardson,
"All-optical packet compression based on time-to-wavelength conversion,"
IEEE Photonics Technology Letters, vol. 16, pp. 1688-1690, 2004.
[6] A. S. Acampora and S. I. A. Shah, "A packet compression/decompression
approach for very high speed optical networks," presented at the SBT/IEEE
International Telecommunications Symposium (ITS '90), pp. 38-48, 1990.
[7] H. Toda, F. Nakada, M. Suzuki, and A. Hasegawa, "An optical packet
compressor based on a fiber delay loop," IEEE Photonics Technology Letters,
vol. 12, pp. 708-710, 2000.
Chapter 7
Optical Packet Decompression
Compression of low speed packets into ultra-fast optical traffic is one approach to
tapping into the large bandwidth offered by optical networks. Such adaptation may
also be used to shape traffic characteristics such as packet size distribution to meet
today’s optical packet forwarding requirements. However, after forwarding, it is
impractical, and sometimes infeasible, to use high-speed electronics to process
compressed packets directly before they exit the optical network. Decompression is
the process of re-expanding high speed packets to their original temporal sizes and to
base data rates that edge node electronics are capable of handling. In the previous
chapter, we proposed and demonstrated a compression scheme that is capable of
adapting variable length packets to fit into a fixed packet duration to better suit them
for optical buffer technologies available today. Such a compression scheme becomes
attractive only when a decompression mechanism that is scalable in packet size
handling capability is available after packet forwarding. This enables edge electronics
to process and transport packets error-free out of the optical network at base rate. In
this chapter, we identify the requirements of a packet decompression scheme for
successful deployment in an optical network and propose a technique capable of
operating in complement to the compression scheme discussed in chapter 6.
7.1 Decompression – State of the art
Access to the ultra-fast optical domain may be obtained by adapting low speed traffic
using packet compression, and several approaches promising such adaptation have
been proposed and demonstrated. However, in order to successfully deploy
optical compression techniques and for low speed traffic (155Mbps to 2.5Gbps) to
benefit from the advantages of high-speed optics (10Gbps to 160Gbps and beyond), a
decompression mechanism is necessary to scale packets back to the bit rates handled
by electronics. While traffic entering the compressor is at the base rate, thereby
allowing for bit level manipulations with relative ease, packets into the decompressor
are at the compressed rate. At these ultra-fast data rates, complicated bit level
processing is impractical, therefore setting a requirement for a higher level of
sophistication during decompression. Consequently, while several approaches have
been proposed for compression of packets, investigation into packet decompression
has been rather limited. As with the compressor, with the increase in the average
packet size and the packet size distribution of today’s internet [1], any optical
decompression scheme would have to be capable of handling different packet byte
sizes. In this section, we present the main approaches to decompression thus far and
investigate the limitations of prior methods in the context of a deployable scheme in
today’s optical packet network.
7.1.1 Feed-forward Delay line approach
In the delay line approach, high-speed compressed packets are injected into several
cascaded stages of delay lines to make multiple copies shifted in time. Appropriate
decompressed bits are then selected by gating at the compressed clock rate to yield
the output decompressed packet. Figure 7.1(a) shows a typical setup for
decompression using this approach. References [2] and [3] use this approach for packet
decompression and study the performance of such a decompression scheme. While
the implementation of the delay lines and the output gating element may differ, the
principle of operation behind both these schemes is the same, as given in Figure
7.1(b). Each delay line lattice element generates two copies of the input signal whose
Figure 7.1: Feed forward delay decompression - (a) structure, (b) operation
spacing doubles with each successive stage, taking values (T-t), 2(T-t), ...,
2^(M-1)(T-t), where T is the base rate bit spacing and t is the compressed rate
bit spacing. The maximum number of bits in each packet is determined by the
number of cascaded delay line stages in the decompressor and is given by
N = 2^M, where M is the number of stages.
In the case of a simple decompression scheme, however, this maximum number of
bits is limited to (K-1), where K is the compression ratio. For packet sizes
greater than (K-1), use of parallel delay line structures would be necessary. However,
in this case the input compressed packet would have to be broken down to smaller
cells each with (K-1) bits, a process that would be extremely challenging at
compressed bit rates. Moreover, the number of delay lines cascaded would be limited
by the losses due to splitting and ASE noise accumulation if periodic or distributed
amplification was used to compensate for such losses. Though the implementation of
such a decompression scheme may be relatively simple for small packet sizes, this
approach does not scale well with packet size. Change in the compression ratio would
require a considerable modification to the setup and would therefore not be possible
on a per packet basis.
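The timing underlying the feed-forward structure can be checked with a short sketch: bit i of the compressed packet sits at i*t, and routing it through the copy delayed by i*(T-t) places it at i*T, on the base rate grid (variable names and the two-stage example are ours):

```python
# Base and compressed bit periods for 1:4 decompression (10Gbps -> 2.5Gbps)
T, t, M = 400e-12, 100e-12, 2   # M cascaded stages yield 2**M shifted copies

def copy_delays(T, t, M):
    """All subset sums of the stage delays (T - t) * 2**k, k = 0..M-1."""
    delays = [0.0]
    for k in range(M):
        delays += [d + (T - t) * 2**k for d in delays]
    return sorted(delays)

delays = copy_delays(T, t, M)        # j * (T - t) for j = 0 .. 2**M - 1
for i in range(2**M):
    arrival = i * t + delays[i]      # bit i taken from copy i
    assert abs(arrival - i * T) < 1e-15  # lands on the base rate bit slot
print([round(d / (T - t)) for d in delays])  # -> [0, 1, 2, 3]
```

The assertion only verifies the alignment identity i*t + i*(T-t) = i*T; it does not model the collision and loss mechanisms that impose the (K-1) bit limit discussed above.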
A variation of the delay line approach is presented in [4] where linear split and delay
of the input signal is used instead of the exponential approach. While integrated
electrical signal generator and optical gating elements are used, this approach also
does not scale well with packet size due to the linear nature of the decompression
scheme.
7.1.2 Spectral slicing
In this approach, the wavelength domain is used to create multiple time-delayed
copies of the compressed packets before time-gating the appropriate decompressed
bits as in [5]. Generation of multiple copies is achieved by creating a Supercontinuum
(SC). The SC is then spectrum sliced using an Arrayed Waveguide Grating Router (AWGR)
and time delayed such that each decompressed bit is aligned to the corresponding
base rate bit slot for every copy.
Figure 7.2: Decompression using spectral slicing
The output is then time gated using a saturable absorber and decompressed bits are
converted to the same wavelength using SC. The working principle for this approach
is shown in Figure 7. 2. The number of wavelengths used for decompression scales
linearly with the maximum number of bits in the packet and therefore severely limits
the packet size. Moreover, variable compression ratio would again require significant
setup changes and cannot be performed on a per packet basis.
7.1.3 Loop based approach
In this approach, a fiber loop is used to create multiple copies of the incoming
compressed packets as proposed in [6]. An electro-optic “AND” gate serves as the
gating mechanism to select the decompressed bit from each copy. Figure 7.3 shows
the principle of operation of the loop based approach. The maximum packet size is
once again limited to (K-1) bits. ASE buildup in the loop can further limit the
decompressable packet size. In order to decompress larger packet sizes, a parallel
loop structure may be used. However, in this case, the complexity of the required
decompression control signals increases rapidly with packet size.
Figure 7.3: Loop based decompression structure and operation
A decompression mechanism chosen for deployment in an IP environment
would be required to handle packet sizes of up to 1500 bytes (12000 bits) and support
a wide dynamic range of packet byte sizes. Approaches proposed to date do not
handle more than 32 bits with little scope for scalability. The need for
minimization/elimination of complex ultra-fast control signals for decompression
suggests that we exploit the complementary features of compression/decompression
and select a compression scheme that simplifies decompression.
7.2 Fundamentals, design and implementations of packet
decompression
The choice of the approach to decompression of packets is primarily dictated by the
compression scheme. Depending on the nature of compression used, decompression
may be classified as decompression for fixed compression ratio, variable compression
ratio and discretely variable compression ratio packets. In chapter 6, we proposed and
demonstrated a discretely variable compression scheme to perform adaptation of
packets to higher speeds while compressing them to occupy the same fixed time slot
regardless of the input byte size. A complementing decompression scheme would
therefore handle packets occupying a fixed time slot and re-expand them to their base
rate packet size. Knowledge of the compression ratio used during compression is
therefore necessary for error-free decompression. In the intra-node deployment of
compression/decompression, such information is contained in
a single synchronized control plane that is aware of the location of the packet
throughout the core switching process. However, in the case of end-to-end
deployment, compression ratio information must be embedded in the packet header
for recovery before the packet arrives at the decompressor.
7.2.1 Decompression - principles of operation
Decompressed packets must contain all the bits of the original packet devoid of
corruption and in the correct bit order. Unlike compression, decompression involves
higher output utilization as compared to the input traffic due to the very nature of the
process, as shown in Figure 7.4.
Figure 7.4: Decompression principle - increase in utilization
In a packet environment, this sets a limit on the minimum interpacket spacing at the
input of the decompressor. The maximum input utilization for error free operation
in the case of fixed compression ratio input packet traffic is therefore given by
U_I,MAX = U_O / K, where U_O is the output utilization and K is the compression ratio.
In addition to this theoretical limitation, processing time for decompression may add
to the minimum required interpacket spacing. Packets entering the decompressor
prior to the minimum interpacket duration cannot be processed and a ‘timeout’
phenomenon occurs, as in Figure 7.5. Timeout can therefore severely limit the packet
handling capacity of the decompressor. When the input utilization temporarily
exceeds U_I,MAX, buffering of incoming compressed packets may be used to resolve
timeout. However, when input utilization steadily exceeds U_I,MAX, a permanent
solution, such as the use of parallel pipelining with multiple decompressors
feeding multiple base rate output ports, would have to be considered to prevent
packet loss during decompression.
Figure 7.5: Decompression - timeout issue
For the discretely variable compression ratio implementation, interpacket spacing
requirements are determined on a per packet basis and given by
T_SPACING = (K-1) * T_PACKET,
where T_PACKET is the temporal size of the compressed packet. In the intra-node
deployment scheme, the control plane may use knowledge of the timeout requirement
of the decompressor to shape the traffic entering the decompressor after switching to
minimize packet loss.
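Both constraints above reduce to simple arithmetic on the compression ratio; a minimal sketch (function names are ours):

```python
def max_input_utilization(u_out: float, k: int) -> float:
    """U_I,MAX = U_O / K: input utilization ceiling for loss-free decompression."""
    return u_out / k

def min_interpacket_spacing_us(t_packet_us: float, k: int) -> float:
    """T_SPACING = (K - 1) * T_PACKET for a packet compressed K:1."""
    return (k - 1) * t_packet_us

# Even a fully utilized base rate output limits 1:4 compressed input to 25%
print(max_input_utilization(1.0, 4))                 # -> 0.25
# A 1.2usec compressed packet at 1:4 needs 3.6usec of trailing idle time
print(round(min_interpacket_spacing_us(1.2, 4), 1))  # -> 3.6
```

A traffic shaper in the control plane could apply these two functions per packet to decide whether a compressed packet can be forwarded to the decompressor or must be buffered.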
7.2.2 Requirements on a packet decompressor
A decompression scheme capable of handling large packets and a wide dynamic
range of packet sizes should fulfill the following requirements:
• Decompression should require no complicated control signals at the
compressed bit rate. High speed control signaling would be limited to basic
sinusoidal waveforms or a combination of multiple sinusoids and preferably
driven by low voltage/current swing signals.
• Decompression processing time must be limited to reduce timeout overhead.
An ideal decompression scheme would process packets in real time thereby
supporting maximum line rate at the input.
• Scalability of the chosen scheme in terms of input compressed bit rate,
maximum packet size handling capability and input packet byte size dynamic
range is essential.
• The decompression scheme should not depend on elaborate packet and bit
level synchronization as this would negate the advantages of
compression/decompression.
• The number of control signals required must not scale directly with maximum
packet size or maximum compressed bit rate.
• Decompression of fixed cell size input packets with variable compression
ratios on a per packet basis would be required for error-free performance.
• Any approach to decompression must be robust to environmental effects for
successful deployment in an optical packet network.
7.3 Fold-out packet decompression
As with compression, implementing decompression may be approached using two
distinct techniques, the bit serial re-expanding approach and fold-out decompression.
Bit level manipulations that are required in the bit serial approach are practically
impossible to implement for larger packet byte sizes at high compressed bit rates due
to the lack of electronic control with the degree of precision and granularity required.
The fold-out technique requires few control signals and no complicated high speed bit
level manipulations and is therefore chosen for demonstration and study here.
Two separate implementations of fold-out packet decompression are proposed and
compared, namely the linear time delay approach and the loop based approach.
7.3.1 Linear time delay approach
In this approach, the wavelength domain is used to ease the requirement of numerous
electrical decompression control signals. A high speed Wavelength Converter (WC)
aids in separating virtual bins in the compressed packet. Modulating multiple
wavelength CW signals using a single compressed rate electrical clock with a packet
envelope signal generates an optical decompression signal. An AWGR and delay line
combination is then used to create the composite decompression gating signal at the
input of the WC. Compressed packets entering the decompressor are demultiplexed
and wavelength converted into multiple wavelengths, as shown in Figure 7.6. Spatial
separation of the demultiplexed bins takes advantage of the wavelength domain and is
done using an AWGR. Each bin is then appropriately delayed before being combined
into one bit stream. A wavelength insensitive RZ to NRZ converter is used to bring
down the decompressed packet bits to base rate pulse widths and rise/fall times before
being wavelength converted onto a single wavelength using an SOA based WC.
Decompressed packets are now ready to exit the node/network.
Figure 7.6: Linear time delay decompression technique
The polarization dependence of the fiber WC imposes strict polarization control
requirements at the input of the decompressor to avoid penalties due to height
variation in the bits of different virtual bins within the decompressed packet.
Furthermore, while the technique enjoys the advantages of fold-out decompression,
the complexity of this implementation increases with number of bits in the
decompressed packet and the compression ratio, therefore making this approach less
scalable.
7.3.2 Loop based approach
We propose a second approach to implement fold-out decompression, the loop
based approach, the working principle of which is detailed in Figure 7.7. A
compressed packet entering the decompressor is stored in a packet buffer. A bit
shifted copy of the stored packet is output multiple times until the entire packet has
been decompressed. A decompressor gating is used to select the appropriate bits for
the virtual bins of the base rate packet to reconstruct the uncompressed packet at the
output. The requirement of a complex compressed bit rate control signal is eliminated
by bit shifting copies of the compressed packet and therefore requiring only a simple
repetitive gating signal at compressed rate and base repetition rate. Control of the
compressed packet buffer is achieved using base rate electrical control signals while
decompressor gating is driven using a periodic compressed rate electrical signal.
Figure 7.7: Loop based decompression principle
Figure 7. 8 shows an example of packet decompression using the loop based
implementation to decompress a 1:4 compressed packet. A bit shifted copy is output
by the loop after each loop trip. At the decompressor gating unit, the bits
corresponding to each virtual bin of the decompressed packets align themselves to the
gating signal to result in decompression of the corresponding base rate bits of the bin.
After gating, the complete decompressed packet with the bits in the correct order is
recovered with little or no latency. By proper arrangement of the components in the
decompression loop, an output utilization close to 100% can be achieved for 1500
byte packet decompression. An RZ to NRZ converter may finally be used to widen
the decompressed packet bits to base rate widths before exiting the node/network.
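The bit reordering performed by fold-in interleaving and loop based fold-out decompression can be sketched as a toy model in Python. This is an illustrative abstraction only: the bin-interleaved bit ordering and the cyclic shift are assumptions standing in for the optical delays, and the periodic gate simply selects one bit per base rate frame from each shifted copy.

```python
def fold_in_compress(bits, ratio):
    # Split the base rate packet into `ratio` virtual bins and interleave
    # them bit by bit into the compressed time slot (toy model).
    bin_len = len(bits) // ratio
    bins = [bits[i * bin_len:(i + 1) * bin_len] for i in range(ratio)]
    return [bins[k % ratio][k // ratio] for k in range(len(bits))]

def fold_out_decompress(compressed, ratio):
    # Loop based fold-out: on trip s the loop emits a copy advanced by s
    # compressed bits; a periodic gate at the compressed rate then passes
    # one bit per base rate period, recovering virtual bin s.
    bin_len = len(compressed) // ratio
    out = []
    for s in range(ratio):
        copy = compressed[s:] + compressed[:s]      # bit shifted packet copy
        out.extend(copy[m * ratio] for m in range(bin_len))
    return out

packet = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0]
assert fold_out_decompress(fold_in_compress(packet, 4), 4) == packet
```

Note that the gate in this model is the same simple comb on every loop trip, mirroring the point above that only a repetitive gating signal at the compressed rate and base repetition rate is needed.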
Figure 7.8: Loop based fold-out decompression
Although the linear time delay approach affords the possibility of incorporating
wavelength conversion within the decompressor, the loop based approach was chosen
for experimental demonstration due to its scalability, ease of implementation and
compactness.
7.3.3 Loop characterization
Unlike the compression buffer loop, the decompression loop serves only to hold the
compressed packet as it is decompressed. The accumulative effect observed in the
compression loop in chapter 6 is therefore not seen here.
Figure 7.9: Decompression SOA power penalty, OSNR and output power vs. input power
To select the operating point of the SOA, a power penalty measurement was performed while varying the average input power to the SOA, as shown in Figure 7.9. Low distortion operation of the SOA (~0.5dB peak-to-peak) is observed for average input powers from -24dBm to -7dBm, a 17dB range. An operating point of ~-18dBm is chosen for the decompression loop to minimize accumulating penalties within the loop, and the average input power of packet traffic into the decompressor is adjusted accordingly.
7.3.4 10Gbps to 2.5Gbps packet decompression
Figure 7.10: Loop based decompression experimental setup
Decompression of 1500 byte packets from 10Gbps to 2.5Gbps was experimentally demonstrated using the setup shown in Figure 7.10. Compressed 1500 byte packets at 1556.35nm were generated using a 10Gbps pattern generator and entered the decompression buffer loop. An interpacket gap of 5usec was used between compressed packets to ensure that they do not overlap after decompression. An SOA was used to compensate for loop losses and was gated to eliminate transient gain behavior due to the bursty traffic. An isolator and a 1.2nm band pass filter centered at 1556.35nm were used to suppress counter-propagating signals and noise accumulation within the loop. A bulk fiber delay of ~210m and a picosecond-resolution variable delay were used to achieve the buffer time.
buffer time. The loop delay was adjusted to be one compressed bit shorter than the
fixed time slot of the compressed packets in order to provide the appropriate bit shift
for the packet copies at the end of each loop trip. A Lithium Niobate modulator was
used to gate bits of each bin at the output of the decompressor to reconstruct the
original decompressed packet. Polarization sensitivity of the gating stage modulator
resulted in height variations between bits of separate bins.
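As a back-of-envelope check on these numbers (the fiber group index is an assumption here, roughly 1.468 for standard single mode fiber at these wavelengths):

```python
SLOT_S   = 1.2e-6          # fixed compressed packet time slot
COMP_BPS = 10e9            # compressed bit rate
C        = 2.998e8         # speed of light in vacuum, m/s
N_GROUP  = 1.468           # assumed fiber group index

bit_period = 1 / COMP_BPS              # 100 psec compressed bit period
loop_delay = SLOT_S - bit_period       # one compressed bit short of the slot

# Total loop delay expressed as an equivalent length of fiber
fiber_equiv_m = loop_delay * C / N_GROUP
print(f"loop delay: {loop_delay * 1e6:.4f} usec")   # 1.1999 usec
print(f"fiber equivalent: {fiber_equiv_m:.0f} m")   # ~245 m
```

The ~210m of bulk fiber supplies most of this delay; the remainder comes from the loop components and the picosecond variable delay used for fine trimming.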
Figure 7.11: 10Gbps to 2.5Gbps decompression of 1500 byte packets: (a1, b1) input compressed packets and bit quality; (a2, b2) decompressed packets and bit quality (2μsec (left) and 100psec (right) time scale)
In the first experiment, 1500 byte packets at 10Gbps were input to the decompressor.
Figure 7.11 (a1, b1) shows the compressed packets occupying ~1.2usec and spaced ~5usec apart, and the bit quality of the input packet with bits spaced 100psec apart. Figure 7.11 (a2, b2) shows the decompressed output packets occupying ~4.8usec with a ~0.2usec interpacket gap. The decompressed bits can be seen to be of good quality
and spaced ~400psecs apart. Slight height variations seen in the decompressed
packets were due to the polarization sensitivity of the gating modulator and can be
reduced by using a polarization insensitive gating element. Bit error rate
measurements were made on the decompressed packets to verify the bit sequence
obtained and to assess the bit quality after decompression. The bit sequence in the
compressed 10Gbps 1500 byte packets was chosen to be such that after proper
decompression, a 12000 bit repeating PRBS 2^7-1 sequence would be formed that can
then be verified using a gated bit error rate detector.
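For reference, the PRBS 2^7-1 pattern used for this check can be generated with a 7-bit linear feedback shift register implementing x^7 + x^6 + 1. This sketch is illustrative of the test pattern, not of the pattern generator hardware:

```python
def prbs7(n_bits, seed=0x7F):
    # Fibonacci LFSR for PRBS 2^7-1: feedback taps at stages 7 and 6
    # (polynomial x^7 + x^6 + 1); any nonzero 7-bit seed yields the
    # maximal-length sequence with period 127.
    state, out = seed & 0x7F, []
    for _ in range(n_bits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        out.append(new_bit)
    return out

seq = prbs7(254)
assert seq[:127] == seq[127:]      # period is 127 bits
assert sum(seq[:127]) == 64        # a maximal-length sequence has 2^6 ones
```

Filling the 12000 bit payload with this repeating sequence allows a gated bit error rate detector to verify both the bit order and the bit quality after decompression.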
Figure 7. 12 shows the bit error rate curves for the decompressed 1500 byte packets at
2.5Gbps. A receiver sensitivity measurement was performed for continuous 10Gbps
and 2.5Gbps traffic along with packet traffic at 2.5Gbps. A PRBS 2^31-1 sequence
used for sensitivity measurements shows no variation in receiver performance when
data rate was varied or when continuous or packet traffic was received. A penalty of
~2dB is seen after decompression of packets and can be attributed to the height
variations in bits of different virtual bins. It is important to note that after
decompression, a 96% output utilization is obtained for the 1500 byte packet stream,
indicating that the decompression scheme adds little or no processing overhead. On-the-fly decompression of packets is therefore verified.
Figure 7.12: Bit error rate measurement on decompressed 1500 byte packets (refer to Figure 7.10)
7.3.5 Variable packet decompression
A second experiment to test the performance of the loop based decompressor was performed over the IP packet size range of 40 to 1500 bytes. Compressed input packets of
different sizes occupying a fixed time slot were input to the decompressor. The
compression ratio of the decompressor was varied by controlling the decompression
buffer signal. The decompressed packets (a2, b2, c2, d2) along with the compressed
inputs (a1, b1, c1, d1) for different packet lengths are shown in Figure 7. 13.
Figure 7.13: Compressed packet input (a1, b1, c1, d1) and decompressed packet output (a2, b2, c2, d2) for 1500, 1024, 560 and 40 byte packets (500nsec time scale)
Interpacket spacing at the input is chosen to ensure that there is no packet overlap
after decompression. Decompression of 1500, 1024, 560 and 40 byte packets from a
fixed compressed packet size was successfully demonstrated using variable
decompression ratios of 4:1, 3:1, 2:1 and 1:1 respectively. Close to a 100%
decompressed link utilization was observed for most packet sizes. For packet sizes
occupying less than a bin, the utilization is limited only by the padding used at
compression. Bit quality for various packet sizes after decompression was visually
verified to be as good as or better than that of the decompressed 1500 byte packet.
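The choice of decompression ratio follows directly from the fixed 1.2µsec container: at the 2.5Gbps base rate a virtual bin holds 3000 bits (375 bytes). A small illustrative calculation, with the gap and padding values taken from the measurements above:

```python
import math

SLOT_S   = 1.2e-6                     # fixed compressed packet time slot
BASE_BPS = 2.5e9                      # base (uncompressed) bit rate

BIN_BITS = round(SLOT_S * BASE_BPS)   # base rate bits per virtual bin: 3000

def ratio_for(packet_bytes):
    # Number of virtual bins the packet spans = required (de)compression ratio
    return max(1, math.ceil(packet_bytes * 8 / BIN_BITS))

for size in (1500, 1024, 560, 40):
    print(f"{size:>4} bytes -> {ratio_for(size)}:1")   # 4:1, 3:1, 2:1, 1:1

# Decompressed link utilization for back-to-back 1500 byte packets:
# 4.8 usec of payload followed by a ~0.2 usec interpacket gap
utilization = 4.8 / (4.8 + 0.2)
print(f"utilization: {utilization:.0%}")               # 96%
```

The same arithmetic reproduces the measured variable decompression ratios of 4:1, 3:1, 2:1 and 1:1 for the 1500, 1024, 560 and 40 byte packets.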
7.4 Chapter summary
In this chapter, current approaches to packet decompression were examined and the
study shows that a decompression mechanism that is scalable in bit rate, maximum
packet size and range of packet size would require a novel approach with relaxed
requirements on the control signaling. We proposed and demonstrated a viable
scheme satisfying all the requirements for successful deployment of such a
decompressor in an IP based optical packet network. A performance study of the
decompression technique shows that packets up to 1500 bytes and a wide dynamic
range of packet sizes from 40 to 1500 bytes may be handled with a low power penalty of ~2.0dB. On-the-fly processing of packets is critical to offset utilization bottlenecks at the decompression point, and we show that the loop based fold-out decompression technique can operate at an almost 100% output utilization.
Chapter 8
End-to-End Core Traffic
Adaptation
Compression and decompression of packets to achieve translation between bit rates
and as an adaptation mechanism from variable length to fixed time slots was
demonstrated in the previous chapters. However, in both cases, ideal traffic was
assumed at the input. A more complete understanding of the end-to-end performance
can be obtained by interfacing the output of the implemented compression stage to
the input of the decompression stage. Additional requirements to enable compression
and decompression may then be identified. In this chapter, we analyze and propose
additional adaptation requirements on packets for error free end-to-end compression
and decompression of variable sized packets. Finally, the performance of the end-to-end operation of compression and decompression is assessed both for fixed and
variable compression ratio implementations.
8.1 End-to-end core traffic adaptation
We have proposed and experimentally verified compression and decompression of
packets individually. However, in order to evaluate any issues in deploying this
technique to adapt core traffic, a complete end-to-end test of packets through the
compressor followed by the decompressor was performed. Packet distortion due to
pattern dependence during compression can be reduced by increasing the SOA
saturation power. The loop SOAs in the compression and decompression loops were
therefore swapped in order to optimize the end to end performance by taking
advantage of the higher saturation power of the decompression loop SOA. Timing
alignment of the two stages can also be improved by observing the output
decompressed eye diagram. In the individual compression and decompression experiments, ideal packets generated from a transmitter were used in both cases. It was therefore not possible to analyze the behavior of either stage with an imperfect input packet. Issues relating to the end-to-end deployment of compression and decompression were therefore analyzed first.
8.2 Issues in end-to-end compression-decompression
experimentation
Signals entering the compressor were gated both by the loop SOA and the output
gating SOA. Any imperfections to the input packet, outside of the compression
window (packet duration) would therefore be suppressed. However, the base rate
rise/fall times of the control signals coupled with the control granularity achievable
result in an imperfect compressed packet at the output as detailed in Figure 8. 1. The
loop output present at all times during compression is eliminated by the gating SOA
at the output.
Figure 8.1: Loop based fold-in compressor output imperfection
When an ideal gating signal with rise/fall times and control granularity smaller than
the compressed bit rate is applied, an ideal gating of only the compressed packet is
possible. A typical gating signal is limited by the control plane’s granularity and
rise/fall times along with the SOA’s gating drive circuitry. When such a signal is
applied to the output gate, residual outputs are passed through at the compressed
packet edges. When an ideal compressed packet with sharp edges enters the
decompressor, decompression without any bit corruption becomes possible. However,
when a compressed packet with residual outputs at the packet edges enters the
decompressor, bit corruption is possible at the virtual bin interfaces as shown in
Figure 8. 2.
Figure 8.2: Loop based fold-out decompressor output imperfection
While higher precision of compression gating control reduces the number of bits
corrupted, it cannot eliminate corruption. To prevent loss of valid data, the optical
payload must be encoded at the network edge.
Figure 8.3: (De)encoding to enable compression/decompression
The necessary encoding involves insertion of zero bits at the virtual bin interfaces as
shown in Figure 8. 3. Once compressed, the compression encoding provides a
sufficient guard band for the compression gating rise/fall times thereby eliminating
residual bits. Decompression without data corruption is therefore possible.
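A minimal sketch of this compression encoding follows, with hypothetical parameters: the guard length (here 30 zero bits per 3000 bit bin, a 1% overhead) stands in for whatever margin the gating rise/fall times and control granularity actually require, and guards are placed only at the trailing edge of each bin for simplicity.

```python
def encode_for_compression(payload_bits, bin_bits=3000, guard_bits=30):
    """Insert zero guard bits at each virtual bin interface (illustrative).

    Each bin carries (bin_bits - guard_bits) payload bits followed by
    guard_bits zeros, so that gating edge residue falls on dummy zeros
    rather than on payload."""
    data_per_bin = bin_bits - guard_bits
    encoded = []
    for start in range(0, len(payload_bits), data_per_bin):
        chunk = payload_bits[start:start + data_per_bin]
        encoded.extend(chunk)
        encoded.extend([0] * (bin_bits - len(chunk)))  # guard + final padding
    return encoded

def decode_after_decompression(encoded_bits, payload_len,
                               bin_bits=3000, guard_bits=30):
    # Strip the guard zeros at the egress edge node, recovering the payload
    data_per_bin = bin_bits - guard_bits
    payload = []
    for start in range(0, len(encoded_bits), bin_bits):
        payload.extend(encoded_bits[start:start + data_per_bin])
    return payload[:payload_len]
```

A coarser control precision forces a longer guard and hence a larger overhead; the under ~2% figure quoted in the text corresponds to an optimized control precision.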
Figure 8.4: Compression and decompression of encoded packets to prevent payload bit corruption
Encoded packets may then be compressed and decompressed without corruption of payload bits, as in Figure 8.4, at the cost of additional packet overhead. By increasing the compression control precision, the core node encoding overhead may be brought down. Decoding of the optical payload is necessary as it exits the network at the edge. With system optimization, the payload encoding overhead for core adaptation can be reduced to under ~2%.
A polarization insensitive Electro Absorption Modulator (EAM) was used in place of
the Lithium Niobate modulator to perform decompression gating to minimize height
variation in bits of the decompressed packets.
8.3 10Gbps core adaptation of 2.5Gbps traffic
The experimental setup used is shown in Figure 8. 5. The output of a fiber ring laser
with 13psec pulse width and at 1542.3nm was modulated by 2.5Gbps NRZ optical
packet data. The compression buffer loop length is selected to be one bin size
(~1.2µsec) offset by one compressed bit period to achieve proper virtual bin
interleaving. A Semiconductor Optical Amplifier (SOA) used to compensate for loop
losses, is gated to eliminate transient gain behavior due to the bursty traffic. The SOA
also serves to flush the compression loop when each packet has been compressed.
Fiber delays in the loop are chosen for the desired buffer time and a variable
attenuator is used to adjust the loop gain. A 1.2nm filter centered at 1542.3nm was
used to reject ASE noise in the loop. Interleaving of different sections (virtual bins) of
the incoming packet takes place at the output of the compression loop. A second
gated SOA is used to select the compressed packet with all bins completely
interleaved. The output of the compressor was then fed to a packet decompressor. The
decompressor buffer loop is used to make copies of the compressed packet that
occupy contiguous virtual bins. An EAM is used to gate the appropriate set of bits at each bin. However, the copy of the compressed packet occupying each bin must be shifted in time by one compressed bit from the previous copy to align the correct set of base
rate bits in each virtual bin to the EAM gate.
Figure 8.5: End-to-end compression/decompression experimental setup
This is achieved by adjusting the buffer loop size to be one bin size offset by one compressed bit period. Figure 8.6 (a1, a2) shows incoming 1500 byte packets at 2.5Gbps occupying 4.8µsec with bits spaced 400psec apart, which are compressed to 10Gbps with 100psec bit spacing, occupying a fixed time slot of ~1.2µsec as shown in Figure 8.6 (b1, b2). Figure 8.7 (b1, b2) shows the decompressed packet occupying 4.8µsec with a bit spacing of 400psec.
Figure 8.6: 1500 byte packets: (a1, a2) input packet and bit quality at 2.5Gbps; (b1, b2) compressed packet and bit quality at 10Gbps
Both the compression and decompression schemes are now scalable to higher compressed bit rates and compression ratios due to their simple bin alignment nature and their high tolerance to slow edges and coarse control signal granularity.
Figure 8.7: 1500 byte packets: (a1, a2) compressed packet and bit quality at 10Gbps; (b1, b2) decompressed packet and bit quality at 2.5Gbps
To measure end-to-end compressor/decompressor BER performance, a 1500 byte packet stream containing a PRBS 2^7-1 payload was used. BER measurements were made on packets constructed from a PRBS 2^7-1 sequence so that proper sequence detection could serve as an integrity check. 1500 byte packets were used due to the minimum synchronization time requirement of the BERT receiver.
The received decompressed packets showed error free performance when sent to a
gated bit error receiver, confirming the integrity of the received packet. A receiver
sensitivity measurement was made on 2.5Gbps packets with a PRBS 2^7-1 sequence
generated directly from the transmitter and was found to be ~-18.6dBm. A ~1.65dB
power penalty is seen in the BER measurements shown in Figure 8. 8 from the test on
packets undergoing both compression and decompression. SOA pattern dependence and bin amplitude variations were identified as the causes of this penalty.
Figure 8.8: End-to-end bit error rate measurement at 2.5Gbps
8.4 Variable length packet adaptation
Variable compression ratios (1:1, 1:2, 1:3 and 1:4) necessary to implement
compression and decompression of variable byte size packets to and from a fixed
compressed packet time slot may be achieved by controlling the length of the ring
gating signals of both the compressor and decompressor. End-to-end performance
was tested for variable input packet sizes from 40 to 1500 bytes.
Figure 8.9: Input, compressed and decompressed packets for (a) 1500 byte, (b) 1024 byte, (c) 560 byte and (d) 40 byte packets
Figure 8. 9 shows the input, compressed and decompressed packets for (a) 1500 byte,
(b) 1024 byte, (c) 560 byte and (d) 40 byte packets. Compressed packets occupy a
1.2usec time slot regardless of input packet size. It can be seen that the decompressed packets are the same size as the original base rate packets, as expected. 40 byte
packets are padded to occupy the 1.2usec time slot.
8.5 Chapter summary
End-to-end deployment of the compression and decompression techniques is studied, and we identify an adaptation requirement to prevent data corruption due to the use of control signals with slow rise/fall edges. We demonstrate that compression and
subsequent decompression of packets from 2.5Gbps to 10Gbps and back to 2.5Gbps
may be performed for the entire IP packet byte size range commonly found in today’s
traffic with a low end to end penalty of ~1.65dB. Both techniques are scalable to
higher bit rates and packet sizes while still being controlled by base rate electrical
signals. Adaptation to and from fixed cell sizes suitable for optical buffering
technologies for a wide range of packet byte sizes has been shown to be possible with
a low penalty and with little or no processing time overhead.
Chapter 9
Conclusion
9.1 Thesis conclusions
With the increasing capacity demands placed on today’s internet routers, more
functionalities are being moved from the electronic to the optical domain. While
optical integration of these forwarding functionalities has been a promising area of
research, the need for research in adaptation to enable forwarding of IP packets using
optical technologies is evident from the fact that no optical forwarding of IP traffic
has been demonstrated to date. Consequently, a majority of the research presented in
this thesis has been devoted to investigating the issues involving such adaptation that
is critical to the practical realization of end-to-end IP forwarding through the use of
all-optical technologies.
Several state of the art approaches to optical transport and forwarding were evaluated
from the perspective of identifying the requirements in building a mechanism that
would support IP traffic while packet switching is performed at the core. When
building such an optical network, we realize that performance may no longer be
evaluated only on the basis of physical layer metrics such as Optical Signal to Noise
Ratio and Bit Error Rates, but will have to include network level testing such as
packet throughput and latency measurements. In order to perform forwarding at the
optical core, a framing mechanism must achieve complete decoupling of the optical
payload from the header and framing information. The proposed framing structure started from requirements identified in previous work and has evolved through several stages of experimental work with IP over optics. It
takes into account various signal processing uncertainties and bandwidth
considerations and was designed with payload bit rate transparency in mind. Ingress
and egress nodes require several optical and electronic sub-systems to implement the
proposed framing mechanism, and we studied how these sub-systems are influenced by the physical signaling characteristics of the packet stream. The need for idlers is shown to
be extremely critical both at the core and at the network edges for proper functioning
of both optics and electronics alike. Successful implementation of the edge nodes
(ingress and egress) was shown and the back to back demonstration of edge node
adaptation was presented. Throughput measurements on traffic through the
implemented edge adaptation show that the experimental results closely match
theoretical predictions for all packet sizes. A minimum throughput of ~40% and a
maximum throughput of ~95% are observed demonstrating the feasibility of such
adaptation without major penalties in packet forwarding rates. We also observe that
latency added at the edge is significant but varies in a fairly linear fashion with
increasing packet size. We note that this changes when a more complex lookup
mechanism is used. We identify switching times and gain control times as major influencing factors in the optical domain in determining the packet overhead and
consequently the forwarding throughput of the core. Electronic bottlenecks include control signal rise/fall times and control granularities arising from temporal uncertainties, and we observe that guard band sizes are affected significantly by the lack of appropriate electronic technologies. From research in optical packet forwarding
implementation, we see the need for specialized electronic modulator and current
drivers that can support varying signal dc content even when idlers are used to
stabilize packet streams. This may be possible by either dc-coupling high speed
electronics or reducing the low frequency cut-offs of the existing ac coupled
circuitry. Throughput measurements on IP streams show that minimal penalty is imposed by the forwarding process in the core, and the end-to-end performance matches
theoretical predictions very well. Latency measurements were performed through the
core forwarding node and compared with the back-to-back edge node latency
measurements and we observe that only a 0.79usec additional latency is introduced at
the core as compared to the 10 to 40usec latency through the edge. It can therefore be
deduced that the major contribution to end-to-end IP latency arises from the edge and
cascading several core nodes would result in minimal impact.
While an edge adaptation and a core node forwarding mechanism that supports
variable sized IP packets was demonstrated, we observe the need for fixed time slot
payloads in order to efficiently buffer packets at the core to resolve contention.
Compression and decompression was proposed as a means to speed up and adapt
variable byte sized packets into fixed time slots, and an intra-node deployment strategy is suggested. We propose a new approach to packet compression, namely fold-in compression, that is capable of handling IP packet sizes of up to 1500 bytes and
more. Compression of variable sized packets from 40 to 1500 bytes from 2.5Gbps to
10Gbps is shown using the loop based implementation of the fold-in approach and we
observe a 2.2dB power penalty due to compression. This is the largest packet size and
the widest range of input packet lengths compressed to date. We also propose and
demonstrate a decompression scheme that complements the fold-in compression
approach, capable of decompressing 40 to 1500 byte packets from a fixed time slot at
10Gbps to their original packet sizes at a base rate of 2.5Gbps. A ~2dB power penalty
is observed through the decompressor for 1500 byte packets. Finally, we identify the
issues associated with implementing end-to-end packet compression and
decompression and propose a modification to the edge node framing to support
compression/decompression at the core. We experimentally demonstrate the
performance of back-to-back compression and decompression for 40 to 1500 byte
packets and reduce the overall power penalty to ~1.65dB for 1500 byte packets by the
use of a polarization independent switching stage at the decompressor output.
We have shown the feasibility of using compression and decompression both to speed up the packet payload, thereby reducing its temporal footprint, and to optically adapt the variable byte sized packets seen in today's IP networks (40 to 1500 bytes) into a fixed time slot. The proposed compression and decompression schemes are scalable both in the maximum input packet byte size and in the highest achievable compressed bit rate, and can therefore be used for scaling payload bit rates to 40Gbps or higher and may be modified to support bigger packet sizes such as 9000 byte jumbo packets.
9.2 Future work
Several approaches to increasing the overall optical packet throughput may be taken, such as increasing the interface bit rates to 10Gbps. While the payload and framing data rates
implemented in the research presented in this thesis were the same, the edge
adaptation mechanism may be tested for its performance when frame and payload
data rates are different (e.g. 10Gbps header and 40Gbps payload). Timing uncertainty
due to electrical serialization and deserialization processes can be reduced
significantly by working with electronic chip vendors to meet the additional needs to
gain tighter control over the output signal timing. It could be interesting to investigate
the possibility of reducing overhead due to optical rise/fall times and the subsequent
impact on throughput performance.
While compression and decompression has been proposed and demonstrated to
10Gbps, theoretical and experimental performance analysis of scaling the payload
speed up to 40Gbps and the compression ratio to 1:16 or higher could prove
extremely useful. Moreover, NRZ-to-RZ conversion and RZ-to-NRZ
conversion required to enable compression and decompression was not demonstrated.
The scheme as proposed would be more suited for edge node deployment but can be
modified suitably using clock recovery for core node deployment. NRZ-to-RZ
conversion may be done electro-optically as suggested in chapter 5 but an all-optical
approach would make core node deployment of compression and decompression
simpler and could be investigated. A demonstration of a multiple-input multiple-
output core node would take OPS technology closer to practical realization. While
optical adaptation has been demonstrated, implementing buffering with recirculating fiber delay loops on adapted packet traffic, and comparing its packet throughput performance against packet forwarding with no contention resolution mechanism, would be interesting.
170