Pageflex Server [document: D-Aalto-F9A5DC1B 00001]

Date post: 03-Feb-2022
Inter-Domain Incentives and Internet Architecture

Jarno Rajahalme

Department of Computer Science and Engineering

DOCTORAL DISSERTATIONS

Preface

It seems that architecture emerges only after the fact, when building

has already proliferated. Hence it is natural that my interest in network

architecture manifested itself only after a career as a software engineer and

research team leader building networking systems. It is also undeniable

that the economic challenges of our times have affected my interest towards

consideration of the evolution of network systems from the viewpoint of

the network owners. In retrospect, it seems only to be expected that the

incentives of network investors should play a role in the development of

the network architecture.

Looking back on my professional journey, I am deeply indebted to the

many interesting persons who have guided me and enabled my studies,

both before my working career and after my graduation from Helsinki

University of Technology in 1995. Apologies to anyone whom I fail to

mention below.

First, I must extend my humblest gratitude to the researchers on whose

shoulders I have mainly built my studies. David Clark’s writings on

Internet architecture have stayed with me a long time. The inter-domain

aspect of my work is built on earlier work by Professor Lixin Gao. Much of

the modern networking research on which my work builds has been led

by Professor Scott Shenker.

I wish to thank Professor Heikki Saikkonen, who kindled my interest in

distributed systems during my Master’s studies. Heikki, you still have not

returned my lecture notes I lent to you after your course.

I must also thank my fellows Tomi Ollila, Pekka Pessi, and Markus

Peuhkuri, with whom I engaged in a networking software project, during

which we delved deep into the BSD TCP/IP networking stack and started

a small software company.

Then I must thank Professor Heikki Hämmäinen, who hired me for an

internship in his research team at the Nokia Research Center some twenty

years ago. Thanks to Jaakko Teinilä, my first roommate at NRC. Thanks

to Asko Komsi who led the ROME project on which I started and about

which I wrote my Master’s Thesis. Soon after my graduation Heikki also

invited me to serve as Nokia’s representative in the TINA Consortium, Red

Bank, New Jersey. Thanks to Frank Steegmans, my officemate at Bellcore,

and to the other TINA colleagues, with whom I started my literary career

as a published network scientist.

Thanks to Raj Bansal, my superior at NRC Boston. Raj became my

role model of a team leader who fully trusted his team and never lapsed

into micromanaging, and who gave credit where credit was due. I am

sure I have fallen short of this ideal. My thanks go also to Rajeev Koodli,

Rayadurgam Ravikanth, and Senthil Sengodan with whom I had the

pleasure to collaborate while at NRC Boston.

I thank Reijo Juvonen and Professor Antti Ylä-Jääski, who invited me to

lead Hannu Flinck’s research team at the Communication Systems Laboratory

in NRC Helsinki when Hannu left for California. Thank you Hannu

for assembling such a fine team! Thank you also for your collaboration ever

since you came back from Mountain View; it has been quite a ride. Thanks

also to all my team members, many of whom I have had the privilege

to work with ever since: Dan Forsberg, Mikael Latvala, Jaakko Lipasti,

Janne Mäntylä, Heikki Ollikainen, Harri Paloheimo, Petteri Pöyhönen,

Siiri Räihä, Ove Strandberg, Haitao Tang, Dirk Trossen, Lucia Tudose,

Janne Tuononen, Petri Velin, Preetida Vinayakray-Jani, and Heikki Waris.

I wish the best that life has to offer to all of you.

I wish to extend thanks to my colleagues in the EU research projects,

namely Ambient Networks and PSIRP, and especially to my coauthors on

the resulting publications, some of which are included in this dissertation.

Thank you Bengt Ahlgren, Jari Arkko, Lars Eggert, Pekka Nikander, Börje

Ohlman, Janne Riihijärvi, Mikko Särelä, Sasu Tarkoma, and Kari Visala.

I thank Professor Antti Ylä-Jääski for accepting my application for doctoral

studies and Aalto University for offering interesting courses on which

I could refresh my skills in networking research. Thanks to Pekka Nikander’s

Future Internetworking seminar I plunged into much of the background

necessary for this work.

It is my pleasure to thank Professors Esa Saarinen and Raimo P. Hämäläinen

for their intensive seminar on creative problem solving. You

really pushed us to dive deeper into some difficult texts. This seminar also

resulted in my first publication outside of networking.¹ As a result of this

experience I do recommend anyone who still has a chance to participate in

Esa’s annual spring “lecture” series on philosophy and systems thinking.

I greatly enjoyed discussions with my colleague Kalevi Kilkki, whom I

had the pleasure to meet at NRC Boston for the first time. Kalevi, you have

been an example of a writer’s work ethic to me. Thank you also for reading

the draft of this manuscript before pre-examination.

Thanks to TEKES for the funding through the ICT SHOK Future Internet

program, and to Professor Martti Mäntylä and Professor Scott Shenker

for offering me the opportunity to visit the International Computer Science

Institute in Berkeley. This visit would not have been possible without generous

support from my employer, Nokia Siemens Networks. Thanks go to

my superiors Jari Lehmusvuori, Kari Aaltonen, and Lauri Oksanen, who

also approved the time off for my doctoral studies. The year in Berkeley

would have amounted to nothing without fruitful collaboration with other

researchers – thank you Ali Ghodsi, Teemu Koponen, and Pasi Sarolahti.

I am grateful for the way Professor Shenker included me in his research

team and for the guidance he provided on my research. Scott, your professional,

candid, no-nonsense attitude made a truly lasting impression on

me.

I am grateful to Professor Antti Ylä-Jääski for the researcher’s chamber

at Aalto University for the last months of last year. I do not believe it

would have been possible to write this dissertation in my regular work

environment.

I thank Professor Antti Ylä-Jääski for leading the dissertation process,

Professor Sasu Tarkoma for guidance in thesis writing, and Edward Bonney

for checking my English. I thank Docent Mika Ylianttila and Associate

Professor Peter Sjödin for serving as pre-examiners for this work.

I am deeply honored by Professor Lixin Gao’s acceptance to serve as my

opponent. I hope you find my defense a worthwhile end to your stay in

Helsinki during the week of SIGCOMM 2012.

Most of all, I express my greatest gratitude to my parents Vesa and Lea,

who provided me with the freedom to follow my interests, to my dear wife

Maarit, who has put up with me during all these years, and to our lovely

daughters Kaisa, Iida, and Oona, who had to endure moving house maybe

too often. Women of my life, you are my inspiration ;-)

Espoo, July 26, 2012,

Jarno Rajahalme

¹ David Bohm’s “Thought as a System” and Systems Intelligence. In Systems intelligence: A new lens on human engagement and action, eds. Raimo P. Hämäläinen and Esa Saarinen: pp. 29–38. Espoo: Helsinki University of Technology, Systems Analysis Laboratory.

Contents

Preface

Contents

List of Publications

Author’s Contribution

List of Abbreviations

1. Introduction

1.1 Research Setting

1.2 Contributions

1.3 Structure of the Thesis

2. Local Incentives and Inter-Domain Structure

2.1 Inter-Domain Structure

2.2 Valley-Free Inter-Domain Paths

2.3 Policy-Compliant Inter-Domain Paths

3. Internet Architecture

3.1 Architectural Challenges

3.2 The Role of Abstractions

3.3 Network Architecture Deployment

3.4 Network Overlays

4. Inter-Domain Routing

4.1 Border Gateway Protocol (BGP)

4.2 Inter-Domain Multicast

4.3 Clean Slate Inter-Domain Routing

4.4 Content-Oriented Network Architectures

4.5 Naming in Content-Oriented Architectures

5. Network Modeling with Inter-Domain Incentives

5.1 Domain-Level Network Model

5.2 Traffic Model

5.3 Routing Model

5.3.1 Basic Valley-Free Path Computation

5.3.2 Faster Valley-Free Path Computation

5.3.3 Optimized Shortest Valley-Free Path Computation

5.3.4 Policy-Compliant Path Computation

6. Inter-Domain Incentives and Internet Architecture

6.1 Evolvable Inter-Domain Structure

6.2 Incentive-Compatible Caching and Peering in Content-Oriented Networks

6.3 Incentive-Informed Inter-Domain Multicast

6.4 Name-Based Inter-Domain Routing

6.5 Naming in Content-Oriented Architectures

6.6 Research Summary

7. Conclusion

Bibliography

Index

Publications

List of Publications

This thesis consists of an overview and of the following publications which

are referred to in the text by their Roman numerals.

I Bengt Ahlgren, Jari Arkko, Lars Eggert, Jarno Rajahalme. A Node

Identity Internetworking Architecture. In INFOCOM 2006. 25th IEEE

International Conference on Computer Communications Workshops, Global

Internet Symposium, Barcelona, Spain, April 2006 [7].

II Jarno Rajahalme, Mikko Särelä, Pekka Nikander, Sasu Tarkoma. Incentive-Compatible

Caching and Peering in Data-Oriented Networks. In

Proceedings of the 2008 ACM CoNEXT Conference, ReArch’08 workshop,

Madrid, Spain, December 2008 [169].

III Jarno Rajahalme. Incentive-Informed Inter-Domain Multicast. In

INFOCOM 2010, 29th IEEE Conference on Computer Communications

Workshops, Global Internet Symposium, San Diego, CA, USA, March

2010 [167].

IV Jarno Rajahalme, Mikko Särelä, Kari Visala, Janne Riihijärvi. On

name-based inter-domain routing. Computer Networks: The International

Journal of Computer and Telecommunications Networking, Vol 55,

Issue 4, pp. 975-986, ISSN 1389-1286, March 2011 [170].

V Ali Ghodsi, Teemu Koponen, Jarno Rajahalme, Pasi Sarolahti, Scott

Shenker. Naming in Content-Oriented Architectures. In Proceedings

of the ACM SIGCOMM workshop on Information-Centric Networking,

ICN’11, Toronto, ON, Canada, August 2011 [101].

Author’s Contribution

Publication I: “A Node Identity Internetworking Architecture”

The architecture presented in this paper was a team effort in the internetworking

work package of Ambient Networks, an EU Framework Program 6 integrated

project [1]. Thus, the authors of the paper are listed in

alphabetical order. The author of this dissertation contributed especially to

the core networking aspects and to the model for communication between

multiple cores.

Publication II: “Incentive-Compatible Caching and Peering in Data-Oriented Networks”

The author of this dissertation proposed to apply inter-domain incentives,

inferred from the available Internet domain-level topology data, to the

content-oriented networking paradigm. The paper was designed and written

with the coauthors in the PSIRP project [2].

Publication III: “Incentive-Informed Inter-Domain Multicast”

This work was done while on a research visit to the International Computer

Science Institute (ICSI), Berkeley, CA, and benefited from guidance by Prof.

Scott Shenker.

Publication IV: “On name-based inter-domain routing”

The author of this dissertation first invented the network design presented

in the paper, then collaborated with coauthors on devising the experimental

design to evaluate the architecture. Initial paper submission preparation

was very much a team effort, but in later phases the author of this dissertation

reimplemented the model, generated new experimental results,

and refined most of the text for final submission, adding also new

material.

Publication V: “Naming in Content-Oriented Architectures”

The author of this dissertation developed the Denial-of-Service argumentation

presented in this paper, via a presentation given to the Berkeley

networking group before the writing process started. The authors are listed in

alphabetical order.

List of Abbreviations

AIP Accountable IP

API Application Programming Interface

ARPANET Advanced Research Projects Agency Network

AS Autonomous System

ASN Autonomous System Number

BGP Border Gateway Protocol

CCN Content-Centric Networking

CDN Content Delivery Network

DFZ Default-Free Zone

DHCP Dynamic Host Configuration Protocol

DHT Distributed Hash Table

DNS Domain Name System

DONA Data-Oriented Network Architecture

DoS Denial of Service

ECMP Equal Cost Multi-Path

FIB Forwarding Information Base

FQDN Fully Qualified Domain Name

HTTP HyperText Transfer Protocol

ICMP Internet Control Message Protocol

ICSI International Computer Science Institute

IETF Internet Engineering Task Force

ILNP Identifier-Locator Network Protocol

IP Internet Protocol

IRTF Internet Research Task Force

ISP Internet Service Provider

IXP Internet eXchange Point

LISP Location/Identity Separation Protocol

NAT Network Address Translation

NID Node Identity

NSFNET National Science Foundation Network program

POP (Internet) Point of Presence

PSIRP Publish–Subscribe Internet Routing Paradigm

QoS Quality of Service

RIB Routing Information Base

RRG Routing Research Group

TCP Transmission Control Protocol

URL Uniform Resource Locator

VRR Virtual Ring Routing

1. Introduction

“What is essential here is the presence of the spirit of dialogue, which is in

short, the ability to hold many points of view in suspension, along with a primary

interest in the creation of common meaning.”

(David Bohm and David Peat. Science, Order and Creativity [32, p. 247])

Network architecture has traditionally been considered an exercise in distributed

system design – an exercise in optimal distribution of functionality

in a network topology. Apart from the requirement to meet the functional

design criteria, the aspects of feasibility and especially scalability have

been the main objectives in networking designs. Experience has shown,

however, that designs meeting all these objectives have still faced significant,

and often insurmountable, deployment challenges (see, e.g.,

[121, 72, 202]).

In this research, we look at the deployment problem from the viewpoint

of the network owners, in the spirit of David Clark: Why should I invest,

so that someone else might benefit?² Since the Internet consists of interconnected,

but independently owned and operated autonomous systems,

it is evident that common agreement by the network owners is needed

for any sweeping updates affecting all the network domains. There are

two principal options to overcome this hurdle: Firstly, make sure that

the changes are obviously beneficial to all network domains, or, secondly,

re-engineer the design so that only the parties benefiting from the changes

need to participate.

² Originally presented in relation to the lack of QoS deployment in the Internet.

In this dissertation, we first lay out some of the major challenges facing

the Internet, emphasizing the voluntary inter-domain structure of

the network. We present networking solutions designed to tackle these

problems, using a method for network architecture evaluation based on

the analysis of inter-domain traffic incentives. Two examples of applying

this method to network architecture design are presented (Publications III

and IV). Publication II looks expressly at the economic implications of one

possible new internetworking architecture (content-oriented networking),

while Publications I and V focus on the foundational aspects underlying

such architectures: the inter-domain routing structure and cryptographically

derived naming, enabling content-based security without relying on

specific network topologies.

1.1 Research Setting

The research presented in this dissertation originated in collaborative

research projects set to advance the art of networking architecture research

[1, 2, 133]. In the following, we briefly summarize the origins, methodology,

and the main research questions of the included publications.

The Dynamic Internetworking Architecture team of the Ambient Networks

project [1, 33] was tasked to investigate the current state of internetworking

research (see Chapters 3 and 4) and design an evolutionary

network architecture to overcome the challenges posed by, e.g., exhausting

[115] and overlapping address spaces [189], the associated loss of transparency

[40, 55], and increasing multihoming and mobility among both

hosts and networks [187]. The resulting architecture is presented in Publication I,

and forms the basis for our understanding that the hardest

problems in internetworking relate to both the enormous scale of the global

network, and the locally determined traffic policies of the participating

network domains.

The PSIRP project [2] took a unique approach to the internetworking

problem by applying the publish/subscribe pattern [80] to the network at

all layers (e.g., link, network, transport, application). In Publication II

we assess the generic content caching incentives for domains in different

positions within the Internet topology, and the corresponding inter-domain

policy impacts of the proposed pub/sub networking paradigm by presenting

networking scenarios where such new policies would be beneficial to the

participants in the inter-domain networking exchange.³

³ Our findings were later validated by others; see, e.g., [69].

The pub/sub networking model critically depends on a scalable name-based

inter-domain routing solution for it to be globally applicable. We took

on this challenge in the PSIRP project, synthesizing a new architecture for

inter-domain rendezvous. The challenges posed by the new architecture

deployment were soon realized to have a deep impact on the architecture –

here we took a conscious departure from the clean slate research model, on

which the project as a whole was based. Having satisfied our deployability

objectives, we developed a network model to validate the performance of

the design. Again, aiming to stay true to our understanding of architecture

deployment, we built our network model based on the supremacy of the

domain-specific packet-level incentives [96, 98, 195]. This work is reported

in Publication IV.

Applying similar domain-level modeling, we also validated our understanding

that disregard of inter-domain traffic incentives might

be one of the primary reasons for low, or non-existent, deployment of

inter-domain multicast service [65, 72, 11]. To this end, we simulated an

alternative protocol design that gives network operators a choice on each

offered multicast stream, allowing them to offer service only when it is

locally beneficial. The results, compared against the performance of the

traditional IP multicast [63], are reported in Publication III.
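
As a rough illustration of the per-stream choice described above, the following Python sketch shows how a domain might decide whether to replicate an offered multicast stream. The decision rule, the `Upstream` categories, and the function name `offers_multicast` are illustrative assumptions; they are not the protocol or the policy evaluated in Publication III.

```python
from enum import Enum


class Upstream(Enum):
    """Hypothetical business relationship of the link the stream arrives on."""
    PROVIDER = "provider"   # the domain pays for traffic on this link
    PEER = "peer"           # assumed settlement-free exchange
    CUSTOMER = "customer"   # the neighbor pays the domain


def offers_multicast(upstream: Upstream, downstream_branches: int) -> bool:
    """Assumed local rule: replicate only when doing so saves paid traffic.

    With fewer than two downstream branches there is nothing to
    de-duplicate. Otherwise replication replaces several incoming unicast
    copies with one, which lowers the domain's bill only when the incoming
    link is one it pays for.
    """
    if downstream_branches < 2:
        return False
    return upstream is Upstream.PROVIDER


# Example: a stream arriving from a provider with three downstream branches.
decision = offers_multicast(Upstream.PROVIDER, 3)
```

A real policy would weigh prices and traffic volumes on both the incoming and outgoing links; the single-condition rule above only shows where a locally determined choice enters the design.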

Finally, while visiting the International Computer Science Institute, and

the UC Berkeley networking group, we developed new argumentation for

use of self-certified names [103, 144, 203] in content-oriented networking

[132]. This work resulted in Publication V, which was selected as the best

paper in the workshop.⁴ The significance of this work may be easier to

see, when it is realized that the security we have in the current Internet

partially derives from the tie-in of the addressing model and the network

topology [75, 12, 107, 158]. When this relationship is lost, as is the case

in the content-oriented networking model, the explicit security provided

by the protocols and user-level trust [53] practices become even more

important than they are today.
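
To make the idea of a self-certified name concrete, here is a minimal Python sketch in which a name embeds a hash of the publisher's public key, so that anyone can check a claimed key against the name without consulting the network topology. The `digest:label` layout, the use of SHA-256, and the function names are assumptions made for illustration. Verifying a signature over the content itself needs a public-key library and is left out.

```python
import hashlib


def make_name(publisher_key: bytes, label: str) -> str:
    """Build a name of the assumed form '<hash of public key>:<label>'."""
    digest = hashlib.sha256(publisher_key).hexdigest()
    return f"{digest}:{label}"


def key_matches_name(name: str, claimed_key: bytes) -> bool:
    """Check that the claimed public key is the one the name commits to."""
    digest, _, _label = name.partition(":")
    return hashlib.sha256(claimed_key).hexdigest() == digest


# Example with placeholder key bytes (not a real key).
name = make_name(b"example-public-key", "report.pdf")
ok = key_matches_name(name, b"example-public-key")
forged = key_matches_name(name, b"some-other-key")
```

The binding between name and key is checked locally by the receiver, which is why it keeps working when content is served from an arbitrary cache.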

⁴ The best paper status was shared between two submissions, the other being [161].

1.2 Contributions

The main contributions of this thesis are:

• Proposing the use of cryptographically identified network nodes as the

principal routable entities in an internetworking architecture consisting

of independent locator domains (Publication I).

• Establishing the existence of new inter-domain traffic exchange

policies (Publication II) in content-oriented network designs.

• Demonstrating the impact of deployment considerations and inter-domain

incentives on new architecture design (Publication IV) and redesign

of an existing service (Publication III). Understanding the different roles

of autonomous systems (as enterprises and network service providers),

and distributing the network functionality accordingly leads to rejection

of universal overlay models and the adoption of a heterogeneous network

design (Publication IV).

• Developing a comprehensive domain-level model for evaluating new

internetworking architectures, taking into account the traffic-level incentives

of the autonomous systems comprising the Internet (used in

Publications III and IV).

• Providing new argumentation for the system-wide benefits of self-certified naming

in content-oriented architectures. Separation of the concerns of the

end-user (e.g., establishment of trust) and the network (e.g., availability)

enables the user-level trust mechanisms (e.g., web-of-trust [18]) to

evolve independently of the content-oriented network design deployed

within the network. Self-certified names are relied on in the designs

presented in Publications I and IV, while the argumentation is presented

in Publication V.
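
The domain-level model mentioned in the list above rests on business relationships between autonomous systems. One well-known consequence, which the thesis treats under valley-free paths, is that a path climbs customer-to-provider links, crosses at most one peering link, and then descends provider-to-customer links. The Python sketch below checks that property; the link labels `c2p`, `p2p`, and `p2c` and the function name are hypothetical, and the thesis's own path computation algorithms are not reproduced here.

```python
def is_valley_free(links):
    """Return True if a sequence of link types forms a valley-free path.

    links: iterable of 'c2p' (customer to provider), 'p2p' (peer to peer),
    or 'p2c' (provider to customer), in path order.
    """
    phase = 0  # 0 = still climbing, 1 = crossed the peer link, 2 = descending
    for link in links:
        if link == "c2p":
            if phase != 0:
                return False  # climbing again after a peer or downhill link
        elif link == "p2p":
            if phase != 0:
                return False  # at most one peer link, and only at the top
            phase = 1
        elif link == "p2c":
            phase = 2
        else:
            raise ValueError(f"unknown link type: {link!r}")
    return True


# Example: up one level, across a peering link, down one level.
example = is_valley_free(["c2p", "p2p", "p2c"])
```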

The research reported in this dissertation has been published in five original

publications that are referred to as Publications I, II, III, IV, and V.

The specific contributions of the original publications are as follows:

Publication I This paper proposes the use of locator domains and cryptographically

identified nodes as the basis of the inter-domain networking

architecture. This allows unique identification of the networking

nodes regardless of their current point of network attachment. This

aspect is similar to what is attainable with Mobile IP [162], but without

linking the node identity to a network address at all. Since the

inter-domain routing is also performed on cryptographically generated,

topology-independent identifiers, it is possible to use different

internetworking technologies in each domain. This aspect is also

the cornerstone of a contemporary proposal for making the Internet

architecture evolvable [133, 100].

The proposed routing solution recognizes the existence of a network

core, formed by the biggest Internet service providers. Edge networks

form topologies (such as trees) that attach to some of the core network

domains. The assumption of the more or less stable core avoids the

need for a globally synchronized routing protocol as node reachability

in the edge topologies can be resolved independently from all the

other edge regions [210]. Within the core, routing is based on an

overlay formed by the node identity routers connected to the core. To

send packets to any destination node, source nodes need to specify

through which node identity router the destination is to be reached.

Sources get this information from a name resolution system to which

all nodes register when they attach to the network.

Publication II In this paper we establish the existence of new inter-

domain traffic policies, when a content-oriented networking approach

is adopted. In content-oriented design, the network interactions

involve publications or files. Network users express interest toward

specific files, which the network then tries to deliver. Each file can be

delivered from any node where the file may reside. We find that if the

file, originated via paid transit links, can be found in a cache in a

peering domain, it might be beneficial to serve the file from that cache,

even though such transit traffic would otherwise be unavailable via a

peering link (adhering to the prevalent peering policies in use today).

Publication III This paper investigates an incentive-informed IP mul-

ticast design, where domains refrain from providing services when

that would be against their locally determined, economically driven

traffic policies. We find that depending on the group size and the

peering policies, our alternative design could provide up to 95% of

the inter-domain redundancy elimination benefits available with the

optimal IP multicast model. Moreover, we also find that if Tier-1

network providers offered multicast as a service, the overall efficiency

could be even better than with the optimal IP multicast model.

The profitability of such a service depends on the nature of the peering

relationships between these domains. Consequently, our evaluation

is done on three different peering policy assumptions, and also on an

alternative network topology, on which a denser peering pattern is

assumed. We find that the results vary significantly depending on

the peering policy assumption, while denser peering has little effect.

Publication IV This journal paper presents a name-based inter-domain

routing architecture developed in the PSIRP project [2]. The design is

built on the realization of the different roles of the Internet domains

(service providers vs. enterprises), and adopts a hybrid design where

edge topologies form independent rendezvous networks, which are

then interconnected using a hierarchical overlay design [95].

Rendezvous networks use internal content-oriented routing, like

in DONA [132], making all data internally reachable within the

rendezvous network. The virtual hierarchical overlay allows for

different deployment strategies in the different parts of the network

and does not critically depend on any specific networks. In addition to

the assumed globally shared interconnection overlay, the rendezvous

networks are free to participate in other interconnection overlays.

The global scalability challenge is tackled with a two-tier naming model:

All data is referenced with a pair of identifiers, the first naming

the scope that the data is associated with, the second naming the

data itself. Requests can then be forwarded using the deepest match

lookup (see Publication V), where the data identifier is matched first,

and only if no match is found, is the scope identifier matched next.

Since all scope identifiers are present in the interconnection overlay,

the system will find a path if one exists.

The design is thoroughly modeled in an inferred Internet topology.

As results we have reported inter-domain path stretch and additional

latency figures for the initial rendezvous messaging. The underlying

routing model is policy-compliant in that it emulates the typical

policies configured for inter-domain routing.

Publication V In this paper we present new argumentation for the use of

self-certified naming in content-oriented network architectures. Self-

certification allows for a clean separation between the end-user-level

trust establishment, and the network’s need to make sure that only

the rightful owners of the names get to assert status and availability

changes on content thus named.

We also generalize our earlier two-level naming structure as explicit

aggregation. Instead of building hierarchy into a name, aggregation

can be expressed explicitly by prepending the content name with

one or more names of aggregates. Routing is then performed using

deepest match lookup instead of the traditional longest-prefix match;

at each hop, the content name is checked first, followed by matching

for each aggregate as needed, progressing from the most specific to

the most inclusive identifier, effectively finding the most specific next

hop available for the content thus referenced.
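
The deepest match lookup described above can be illustrated with a minimal Python sketch. The forwarding table, the identifier strings, and the function name deepest_match are hypothetical and chosen only for illustration. The sketch also assumes that the aggregate closest to the content name is the most specific one; the actual data structures are defined in Publication V.

```python
from typing import Optional, Sequence


def deepest_match(table: dict[str, str],
                  aggregates: Sequence[str],
                  content: str) -> Optional[str]:
    """Return the next hop of the most specific identifier present in `table`.

    `aggregates` lists the aggregate names in the order they are prepended
    to the content name, assumed here to run from the most inclusive to the
    most specific.
    """
    # The content name is checked first, then each aggregate in turn,
    # progressing from the most specific to the most inclusive.
    for identifier in [content, *reversed(aggregates)]:
        if identifier in table:
            return table[identifier]
    return None  # nothing matched at this hop


# Hypothetical forwarding table of a single node (identifier -> next hop).
table = {"scope-A": "hop-1", "scope-A.sub": "hop-2", "data-42": "hop-3"}

print(deepest_match(table, ["scope-A", "scope-A.sub"], "data-42"))  # hop-3
print(deepest_match(table, ["scope-A", "scope-A.sub"], "data-7"))   # hop-2
print(deepest_match(table, ["scope-A"], "data-7"))                  # hop-1
```

Unlike longest-prefix match, no structure inside the names is consulted: each identifier is looked up as an opaque key.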

1.3 Structure of the Thesis

The rest of this dissertation is structured as follows. Chapter 2 introduces

the role of local incentives in the inter-domain structure of the Internet.

Chapter 3 summarizes the major challenges facing the Internet architec-

ture, with emphasis on deployment issues. Chapter 4 provides the neces-

sary technical background on inter-domain routing and content-oriented

networking. Chapter 5 introduces the inter-domain incentive-based model-

ing approach we have used in the research included in this dissertation.

Chapter 6 briefly summarizes the key content of the original publications,

including the use of the incentive-based model in practical networking

problems. Chapter 7 gives the conclusion of the dissertation. Five original

publications which describe the research presented in this dissertation

are included at the end. We have included these papers in the numbered

references [7, 169, 167, 170, 101], but refer to them as Publications I–V,

respectively, in this dissertation.

2. Local Incentives and Inter-Domain Structure

“The art of economics consists in looking not merely at the immediate but at

the longer effects of any act or policy; it consists in tracing the consequences of

that policy not merely for one group but for all groups.”

(Henry Hazlitt. Economics In One Lesson [110, p. 5])

Not since January 1st, 1983, when ARPANET switched to the TCP/IP protocol

stack [138], has the Internet been under one centralized management

[34]. Instead, the Internet is formed by interconnected, separately owned

and managed networks, welded together by an internetworking protocol

layer. In the context of Internet routing protocols, the networks forming

the Internet are called autonomous systems (ASes). In this dissertation,

the unqualified term domain⁵ refers to the same. Hence, the mechanisms,

or protocols, that allow such domains to interconnect and interoperate can

be termed inter-domain protocols, of which the current inter-domain

routing protocol, BGPv4 [175], is the prime example. Within the routing

protocols, an autonomous system is referred to by its officially assigned AS

number (ASN).⁶

The TCP/IP protocols were not originally designed for commercial service,

but rather for the needs of connecting to computing resources between

research organizations [60]. The Internet started to transition to commer-

cial service when the National Science Foundation networking program

(NSFNET) encouraged the academic institutions it served to seek non-

academic customers on the regional level. NSFNET prohibited, however,

any commercial use of its federally funded backbone, thus stimulating

the emergence of the first commercial long-haul interconnection networks

[138]. Over the span of a decade, the Internet made the transition to operate

on competing commercial backbone networks. This process culminated

in 1995, when the NSFNET backbone was defunded and academic

institutions, too, were interconnected via commercial networks.

⁵ Not to be confused with domain names such as example.org, which may or may not correspond to an autonomous system.

⁶ See Table 2.1 on page 12 for some prominent examples.

Figure 2.1. Interconnected domains. Bold lines represent inter-domain links between domain border routers.

The commercial and voluntary structure of the Internet stresses the

importance of operational incentives, the motivations each independent

domain has for network interconnection [67, 96, 126]. The most basic

motivation comes from customers paying for the network connectivity they

want. Such paid connectivity to third party networks is called transit

service. More subtle incentives based on mutual benefit drive bilateral

interconnection (peering), both between backbone networks, and between

any other networks. While the backbone networks have to peer in order

to provide the universal transit service, the typical incentive for peering

among other networks comes from the desire to limit their transit service

charges, which are typically volume-based [71, 82].

In a sense, the inter-domain structure of the Internet arising from the

interconnection incentives – networks paying service provider networks

for transit service, the provider networks peering to be able to provide

the universal transit service, and the mutual peering on all levels to limit

the transit costs – is more fundamental than the specifics of the Internet

architecture the network has emerged from. We will take a deeper look at

this structure next.

Figure 2.2. Inter-domain relationships and Tiers 1-3. [Legend: domains; sibling, peer, and provider-customer links; Tier-1, Tier-2, and Tier-3 domains.]

2.1 Inter-Domain Structure

The Internet topology divides into two kinds of well-defined regions (Fig-

ure 2.1). Each administrative domain (AS) governs its own network using

suitable intra-domain routing protocols and management systems. In contrast

to this relative freedom within the domains, inter-domain routes are

exchanged with the Border Gateway Protocol (BGP [175], see Section 4.1).

BGP defines the mechanisms by which inter-domain routes are built and

exchanged between domains exchanging IP traffic. These mechanisms are

used by the domains to implement their routing policies [51], which derive

from their economic incentives [47, 96, 193, 126, 88, 82].

Hence, the contractual relationships between domains, reflected in their

inter-domain routing policies, define the global Internet topology, not the

underlying physical connectivity [38, 96, 195]. For example, two domains

that are present in the same Internet exchange point (IXP) [153] need not

exchange traffic with each other [97].

The two main types of inter-domain relationships (Figure 2.2) in the

Internet are the transit and peering relationships [113, 153], while a third

type, the sibling relationship, is used between the multiple domains under

the same administration [96]. We omit subtler relationships hidden in the

domain-level view, such as partial transit and backup [97], for the clarity

of exposition.

Table 2.1. Tier-1 Domains. ASes that are believed to not pay other networks for peering or transit [5].

Name                                 AS Number
AT&T                                 7018
CenturyLink                          209, 3561
Deutsche Telekom AG                  3320
Level 3 Communications               3356, 3549
NTT Communications                   2914
Sprint                               1239
Tata Communications                  6453
Telecom Italia Sparkle               6762
TeliaSonera International Carrier    1299
Verizon Business                     701

A transit relationship is a customer-provider relationship, where the

provider agrees to announce the customer’s network, identified with an IP

prefix, to the rest of the network and thus makes the customer’s network

reachable. Also, the customer either gets routes to all Internet destinations,

or uses the provider as a default route, and thus can access the rest of the

Internet. The customer pays the provider for these services (reachability

and access) [82].

In a peering relationship the ASes agree to exchange traffic only between

themselves and their customers by advertising the corresponding routes to

each other. The typical incentives for peering are, for large ASes, reciprocal

reachability and robustness benefits, and, for smaller ASes, savings on

the upstream transit costs [47]. This is also why big content providers,

such as YouTube (AS36561), actively seek peering relationships with other

networks [57]. Peering domains do not provide transit to each other. While

peering is typically settlement-free, paid peering is also possible.

This structure leads to a tiered Internet model (Figure 2.2), where the

ASes at the topmost tier (the Tier-1) form, in practice, an oligopoly that

does not buy transit from anyone else; all are peering with each other.

Figure 2.3. The growth of the number of advertised ASes in the Internet. Data from [117]. [Axes: year, 1996 to 2012; number of advertised ASes, 0 to 40000.]

Of course, this is what makes the whole Internet connected.⁷ The current

Tier-1 domains of the Internet are listed in Table 2.1. While Tier-1 net-

works typically cover large geographical areas, national or regional ISPs

often have a denser regional network footprint, and thus have a natural

role in offering regional connectivity services and reselling universal tran-

sit service towards those parts of the Internet they have no peering or

customer relationships with. These regional transit ISPs are called Tier-2

networks. Finally, Tier-3 domains consist of smaller ISPs reselling transit

service to customers not running BGP and of the network service users

running BGP. The role of BGP is significant in this context as networks

not running BGP are not visible in the AS structure as separate entities.

The vast majority, about 90%, of the ASes are in the Tier-3 class [116].⁸

The observed inter-domain structure and relationships [96, 193, 39] can

be used to estimate the role of each AS. Examining the historical data on

the AS structure, Dhamdhere and Dovrolis [68] have found that the edge

of the Internet, consisting of mainly Tier-3 enterprise customer networks,

is growing at a significant and stable pace (Figure 2.3). Furthermore, the

core of the Internet, consisting of transit and content service providers,

represents roughly 10% of all the autonomous systems. This ratio has

remained roughly the same in the recent past as well [116].

⁷ We are ignoring any possible local Tier-1s not purchasing transit, but not peering with all other Tier-1 networks either.

⁸ This can be easily observed from the CAIDA AS relationships datasets [39].

Figure 2.4. Example valley-free path. Solid arrows represent customer (tail) – provider (head) relationships, solid lines peering relationships, and two-headed arrows represent sibling relationships (mutual transit) between autonomous domains (circles). [Legend: uphill path, downhill path, up/downhill transition, source, destination.]

Since traffic policies are based on bilateral contracts, the inter-domain

topology is effectively different for each pair of source and destination

domains [88, 193]. While this complicates the overall inter-domain topology

picture, it also narrows down the choices for inter-domain paths between

any source–destination pair, and in some sense makes the inter-domain

routing space more tractable than without any such limitation [210].

2.2 Valley-Free Inter-Domain Paths

In [96], Gao derives an empirical valley-free inter-domain path model

stemming from the bilateral relationships between ISPs, as deducible from

the BGP routing data. In this model, all Internet paths can be divided into

three segments (Figure 2.4), out of which any segment may be omitted for

a given path:

Uphill path is the sequence of domains from the source domain up to the

first provider-to-customer or peer-to-peer link. All the links in this

segment are either customer-to-provider, or sibling-to-sibling links.

A peer-to-peer link in the middle. This may be, e.g., a link between

two Tier-1 providers, or any peering link between ISPs.

Downhill path is the sequence of domains from the first provider-to-

customer link to the destination domain. All the links in this segment

are either provider-to-customer, or sibling-to-sibling links.
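
The three-segment structure above can be checked mechanically. The following Python sketch is a minimal illustration, assuming a path is given as a list of link types in traversal order; the labels "c2p", "p2c", "p2p", and "s2s" and the function name is_valley_free are hypothetical and are not taken from [96].

```python
def is_valley_free(links: list[str]) -> bool:
    """Check a path given as link types in traversal order.

    "c2p" = customer-to-provider, "p2c" = provider-to-customer,
    "p2p" = peer-to-peer, "s2s" = sibling-to-sibling.
    """
    phase = "uphill"
    for link in links:
        if link == "s2s":
            continue  # sibling links may appear in the uphill or downhill segment
        if link == "c2p":
            if phase != "uphill":
                return False  # climbing again after descending: a valley
        elif link == "p2p":
            if phase != "uphill":
                return False  # at most one peering link, and only at the top
            phase = "downhill"
        elif link == "p2c":
            phase = "downhill"
        else:
            raise ValueError(f"unknown link type: {link}")
    return True


print(is_valley_free(["c2p", "c2p", "p2p", "p2c"]))  # True
print(is_valley_free(["p2c", "c2p"]))                # False
```

Any of the three segments may be empty, so the empty path and a pure downhill path are both accepted.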

The valley-free property follows from the ISP incentives reflected in

their inter-domain routing policies. When all inter-domain paths are

exported to customers and siblings, and only paths to (transitive) customers

and siblings are exported to peers and providers [97], the resulting inter-

domain paths are valley-free.
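
The export rule above can be sketched as a single predicate. This is a simplified illustration, not a BGP implementation: the neighbor classes are hypothetical string labels, and every route learned from a sibling is assumed to lead to a (transitive) customer or sibling, which a real configuration would have to verify.

```python
def should_export(learned_from: str, export_to: str) -> bool:
    """Decide whether a route is announced to a neighbor.

    `learned_from` is "self" (own prefix), "customer", "sibling", "peer",
    or "provider"; `export_to` is "customer", "sibling", "peer", or "provider".
    """
    if export_to in ("customer", "sibling"):
        return True  # all paths are exported to customers and siblings
    # Peers and providers only receive paths to own, customer, and
    # sibling destinations (simplifying assumption, see the text above).
    return learned_from in ("self", "customer", "sibling")


print(should_export("provider", "customer"))  # True
print(should_export("provider", "peer"))      # False
print(should_export("customer", "provider"))  # True
```

Because provider and peer routes are never passed on to other providers or peers, no domain ends up carrying traffic between two neighbors that do not pay it.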

Figure 2.5 depicts the simplest possible configuration of using pathlets

[104] for valley-free local transit routing. In this configuration, each do-

main hosts two virtual nodes, which in practice are distributed to all the

edge routers of the domain. The black nodes build the uphill graph (up-

graph) [210], while the white nodes build the downhill graph (downgraph).

The arrows connecting the virtual nodes are pathlets. Note that the transi-

tion from upgraph to downgraph can only take place within a domain, or

between domains using a peering link. Each domain has d + s + 1 pathlets,

where d is the number of neighboring domains (domain’s degree), and s is

the number of sibling domains among the neighboring domains.
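
As a worked example of the count d + s + 1, the following Python sketch evaluates it for a hypothetical domain; the neighbor names and relationship labels are invented for illustration. One possible reading of the formula, which is an assumption and not stated in the text, is one pathlet towards each neighbor, one additional pathlet per sibling because a sibling link can be used both uphill and downhill, and one pathlet for the upgraph-to-downgraph transition inside the domain.

```python
def pathlet_count(neighbors: dict[str, str]) -> int:
    """Number of pathlets of a domain, following the d + s + 1 formula.

    `neighbors` maps a neighbor name to its relationship:
    "provider", "customer", "peer", or "sibling".
    """
    d = len(neighbors)  # degree of the domain
    s = sum(1 for rel in neighbors.values() if rel == "sibling")
    return d + s + 1


# Hypothetical domain with four neighbors, one of which is a sibling.
neighbors = {"AS-A": "provider", "AS-B": "peer",
             "AS-C": "customer", "AS-D": "sibling"}
print(pathlet_count(neighbors))  # 4 + 1 + 1 = 6
```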

By definition, a local transit policy [104] allows formation of valley-

free paths regardless of the packet’s destination. This disregards the

economically motivated, destination-based path selection preferences of

the domains on the path. Since transit provision is a commodity service,

disregarding the destination preference is arguably economically infeasible.

As an example, consider the case where a pathlet route used paid transit

resources when an alternate path via a peer or a customer would have

been available. Next we will consider these constraints for path selection.

2.3 Policy-Compliant Inter-Domain Paths

Since practically all valid inter-domain paths are valley-free [96], path

selection can be safely restricted to valley-free paths. This fact alone

limits the overall “path space” that needs to be explored to a fraction of all

possible inter-domain paths.

Figure 2.5. Valley-free local-transit pathlets in an AS topology. [Legend: uphill, downhill, and up/downhill transition pathlets; uphill and downhill virtual nodes; domains; sibling, peer, and provider links.]

The converse is not generally true, however: not all valley-free paths are

valid (i.e., policy-compliant) inter-domain paths. Policy-compliant inter-domain

path selection can be performed by selecting a path according to

the locally beneficial path selection incentives as listed below in order of

decreasing preference:

1. A customer path (for additional revenue),

2. a path to a sibling’s customer (revenue for the sibling),

3. a direct peering path (revenue neutral),

4. a path via a sibling’s peering link (cheaper than transit),

5. a direct provider path, and finally, if nothing else is available,

6. a path via a sibling’s provider link.

In each case, the use of sibling paths should be minimized. Since we do not

assume knowledge of service pricing on each individual inter-domain relationship,

the shortest overall path may be preferred within each preference

category.
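
The preference order above can be expressed as a sort key. The following Python sketch is a minimal illustration under stated assumptions: the category labels, the Path fields, and the function select_path are hypothetical, and the relative order of the two tie-breakers (fewer sibling links before shorter overall length) is our choice, since the text does not fix it.

```python
from dataclasses import dataclass

# Preference categories in order of decreasing preference (lower is better).
PREFERENCE = {
    "customer": 1,
    "sibling_customer": 2,
    "peer": 3,
    "sibling_peer": 4,
    "provider": 5,
    "sibling_provider": 6,
}


@dataclass(frozen=True)
class Path:
    category: str      # one of the PREFERENCE keys
    sibling_hops: int  # number of sibling links on the path
    as_hops: int       # overall AS path length


def select_path(candidates: list[Path]) -> Path:
    """Pick the locally most beneficial path among the candidates."""
    return min(candidates,
               key=lambda p: (PREFERENCE[p.category], p.sibling_hops, p.as_hops))


candidates = [Path("provider", 0, 2), Path("peer", 0, 5), Path("peer", 0, 3)]
print(select_path(candidates))
# Path(category='peer', sibling_hops=0, as_hops=3)
```

Note that the two-hop provider path loses to a longer peering path, which is exactly the source of the path stretch discussed in the text.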

Note that any routing policy derived from the preferences above is com-

patible with the Gao/Rexford conditions for safe BGP configuration [97],

but is more specific than necessary for safety alone.

Since the underlying economic incentives are very similar for all ISPs,

practically all Internet paths observe a structure following from pref-

erences like those presented above. Gao [98] finds that the maximum

policy-compliant AS path length observable in this structure is about 13

AS hops, while the maximum shortest AS path length⁹ would be around

6. The reason for this increased stretch is that there is no incentive for

the possible “detours” [195] on the shorter paths to provide the transit

service between the end-points. The resulting inter-domain paths are, on

average, about 25% longer than the shortest possible valley-free paths

[98, 195, 182].¹⁰

⁹ Maximal shortest path between any two ASes in the global AS graph ignoring the inter-domain policies.

¹⁰ Note that we do not consider the hypothetical shortest paths in the non-annotated AS graph as the basis for reference.

3. Internet Architecture

“We are searching for some kind of harmony between two intangibles: a form

which we have not yet designed and a context which we cannot properly describe.”

(Christopher Alexander. Notes on the synthesis of form [10, p. 26])

Internet architecture consists, according to David Clark, of the “features,

capabilities, and services that should be globally defined in common across

the network, and that are generally useful across multiple technologies, or in

support of multiple applications” [52]. Given the change in the operational

context, from government funded backbone connecting research networks,

to today’s commercial Internet connecting the whole world, it is no wonder

that many aspects of the original architectural assumptions [34] of the

Internet architecture have come under significant stress [55, 106]. In this

chapter we briefly introduce the aspects of this architectural tension most

relevant for our focus on the inter-domain structure of the Internet. For a

wider perspective, see [213, 106, 31, 55].

The architectural design principles of the Internet have only been pos-

tulated after the fact [50]. The “End-to-end arguments in system design”

by Saltzer, Reed, and Clark [181] have been highly influential in shaping

the design of Internet protocols during the last decades [34]. Given the

reverence the end-to-end arguments have received as guidelines for In-

ternet design, it is a pity that these arguments were formulated before

the shift towards commercial service started. It has taken the research

community more than a decade to catch up with the reality, and even today

some studies [134] choose to ignore the implications [130].

The end-to-end argument states that when certain functionality can

only be correctly and completely implemented in the communication end-

points, the implementation of the same functionality in the communication

network is both redundant and undesirable. The example given by Saltzer

et al. is that of a careful file transfer, where it is the application in the

receiving endpoint that is in the position to make sure that the file has been

completely and correctly transferred, and where error checking within

the network is both redundant and potentially detrimental to end-to-end

performance. It is noteworthy that the argumentation is presented

in terms of endpoints hosting applications and a data communication

network as a whole. Thus, it is no wonder that the end-to-end argument

alone has been found lacking as a guideline [56, 53, 31, 55, 149], when the

more intricate structure of the data communications network has asserted

itself.
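
The careful file transfer example can be made concrete with a short Python sketch. It is only an illustration of the end-to-end check: the use of SHA-256 and the function names send and receive are our assumptions and do not come from [181].

```python
import hashlib


def send(data: bytes) -> tuple[bytes, str]:
    """Sending endpoint: ship the payload together with its SHA-256 digest."""
    return data, hashlib.sha256(data).hexdigest()


def receive(data: bytes, digest: str) -> bool:
    """Receiving endpoint: accept the file only if the digest matches."""
    return hashlib.sha256(data).hexdigest() == digest


payload, digest = send(b"careful file transfer")
print(receive(payload, digest))               # True: file arrived intact
print(receive(payload[:-1] + b"X", digest))   # False: corrupted on the way
```

Whatever checks the network performs hop by hop, only this final comparison at the receiving application establishes that the whole file arrived correctly.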

3.1 Architectural Challenges

We described the inter-domain structure of the Internet in Section 2.1

above, but there are also other significant aspects to the “data communica-

tion network” that have proven problematic to the Internet architecture as

originally conceived. The transparent end-to-end addressing model of the

Internet has been broken ever since the first consumer gateway products

became prevalent. Network address translation (NAT) [189] is built into

every residential networking device today, and increasingly also into the

machinery of the ISP itself [125].

A network address translation device swaps the user’s private IP addresses

[20] and the TCP or UDP port numbers for a public IP address and

non-overlapping port numbers. The reverse mapping is done on packets

received from the network, allowing the end host to operate under the

illusion of its IP address being publicly reachable. NAT emerged first as

a solution to a tussle [56] between the user and the ISP, where the ISPs

wanted to charge more for serving more than one computer, while the users

wanted to pay for their Internet connection once and share the same connec-

tion between multiple connected devices in their homes. Later, when the

IPv4 address allocations became harder to get, both enterprises and ISPs

started to deploy NATs in their networks to be able to serve ever larger

numbers of customers with a limited number of IPv4 addresses. Home

networks and enterprises deploy NATs also for the benefit of deliberately

making their internal networks invisible to the public Internet.
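
The address and port swapping described above can be sketched as a small translation table. The following Python class is a deliberately simplified illustration: the class and method names are hypothetical, the addresses are documentation and private-range examples, and mapping timeouts, port exhaustion, and the distinction between TCP and UDP are ignored.

```python
import itertools
from typing import Optional


class Nat:
    """Toy network address and port translator with one public address."""

    def __init__(self, public_ip: str, first_port: int = 40000) -> None:
        self.public_ip = public_ip
        self._ports = itertools.count(first_port)  # non-overlapping public ports
        self._out: dict[tuple[str, int], int] = {}  # private endpoint -> public port
        self._in: dict[int, tuple[str, int]] = {}   # public port -> private endpoint

    def outbound(self, src_ip: str, src_port: int) -> tuple[str, int]:
        """Rewrite the source of a packet leaving the private network."""
        key = (src_ip, src_port)
        if key not in self._out:
            port = next(self._ports)
            self._out[key] = port
            self._in[port] = key
        return self.public_ip, self._out[key]

    def inbound(self, dst_port: int) -> Optional[tuple[str, int]]:
        """Reverse mapping for a packet from the network; None means drop."""
        return self._in.get(dst_port)


nat = Nat("203.0.113.5")
print(nat.outbound("192.168.1.10", 5000))  # ('203.0.113.5', 40000)
print(nat.outbound("192.168.1.11", 5000))  # ('203.0.113.5', 40001)
print(nat.inbound(40000))                  # ('192.168.1.10', 5000)
print(nat.inbound(50000))                  # None
```

The last call shows why hosts behind the device are not publicly reachable: without an earlier outbound packet there is no mapping to reverse.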

Given the proliferation of network address translation, the assumptions

that the communicating endpoints are publicly reachable and that the

data communication network transfers the packets to the other endpoint

unchanged are no longer valid. This loss of end-to-end transparency is no

problem for traditional client/server applications, when the server has a

public IP address, but has made the introduction of new transport protocols

harder, and has forced peer-to-peer applications to deal with the complex

issue of NAT traversal [106].

The official solution to the IPv4 address space depletion problem [115] is

IPv6 [64], the main feature of which is a larger address space [111].¹¹ It is

evident, however, that NAT is needed to aid in IPv6 transition [125], and it

is unlikely that we can ever return to the model of one transparent global

address space – mostly because there are valid reasons for some hosts not

to be publicly reachable.

Apart from network address translators, networks often deploy firewalls

and other types of middleboxes [41], often co-located in the same device.

Firewalls are commonly used to limit the connectivity between private

and public networks. As there is no reason to assume that the need for

access control on internal networks will go away, firewalls, or middleboxes

in general, need to be considered part of the architecture in the future

[204].

The loss of end-to-end transparency in the Internet architecture described

above has been a driving factor behind many recent network architecture

proposals. This suggests a need to abstract away [54] from the view of one

shared global address space and to embrace multiple addressing realms

[93], or independent routing domains, as a fundamental aspect in the

Internet architecture [58, 204]. Such a change would make the now hidden

connectivity control more explicit and possibly give the users better control

of their visibility in the Internet.

Trust, or the lack thereof, is at the center of many of the architectural

tensions shaping the Internet today [56, 53]. In retrospect, the originally

implicit assumption of mutual trust in the operation of network protocols

has proven to have been among the hardest to fix. BGP itself has proven

prone to malicious attacks [152], which has led to proposals for more

accountability and availability in inter-domain routing [12, 207]. Denial-of-

service attacks – such as TCP SYN flooding [139] and routing loop attacks

[151], among others – have resulted in calls to introduce network-based

defenses [120, 160], fundamentally altered Internet addressing [107] and

service models [24, 6], protocols based on capabilities [211], and asking the

legitimate applications to speak up [205] to be heard amidst the attackers.

¹¹ The addition of the flow label field deserves a mention as well [168].

The Internet is prone not only to malicious attacks against its infrastructure

[152], but also to a more benign tragedy of the commons [142]. ISPs

issue new routing prefixes to the global default-free zone (DFZ), the central

part of the network, where all reachable prefixes must be individually

advertised in routing updates. This results in significant costs, carried by

all ISPs collectively, while the originating ISP realizes only a tiny fraction

of the total cost. This behavior is driven by the customers’ (enterprises,

institutions, public entities) need to secure their network availability via

multi-homing and to protect against prefix hijack attacks [152], and has

resulted in ever increasing global routing tables [147, 146].

While many believe that the projected growth in the global routing tables

is well within reach of the increasing technological capacity [116], the

perceived problem of looming global routing table growth was specifically

addressed in the IRTF Routing Research Group (RRG). Although RRG

failed to reach a consensus on a single best approach, summaries of the

proposals with critiques, focusing mainly on deployment incentives, are

presented in [140]. In the same document the RRG chairs recommend

follow-on work in the areas of evolutionary route aggregation [131] and

Identifier-Locator Network Protocol (ILNP) [19]. Route aggregation refers

to efficient construction of a forwarding information base (FIB), used for

packet forwarding, given a routing information base (RIB) produced by

the routing protocols. Route aggregation can be started on a single router

level, and further enhanced within a domain using virtual aggregation

[25], where new, more general, prefixes are introduced in order to get rid of

a larger number of covered prefixes. ILNP is an evolutionary enhancement

to the IP addressing model, separating the address (locator) and identifier

roles embedded in the IP addresses today. Separating the locator and

identifier roles in the IP architecture removes the “commons problem”

referred to above as it places the burden of support of, e.g., multi-homing on

the domains themselves. This is accomplished by removing the associated

extra state from the shared routing system, and placing it in the DNS

servers, for which the domain is itself solely responsible.
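
Route aggregation on the level of a single router, as mentioned above, can be illustrated with Python's standard ipaddress module. The sketch below is not the virtual aggregation scheme of [25]; it only merges prefixes that share a next hop. It assumes an IPv4-only RIB whose prefixes do not overlap, since otherwise dropping a covered prefix could change longest-prefix-match results. The prefixes and next-hop names are hypothetical.

```python
import ipaddress
from collections import defaultdict


def aggregate(rib: dict[str, str]) -> dict[str, str]:
    """Build a smaller FIB from a RIB that maps prefix -> next hop.

    Assumes IPv4 prefixes that do not overlap each other.
    """
    by_hop: dict[str, list] = defaultdict(list)
    for prefix, hop in rib.items():
        by_hop[hop].append(ipaddress.ip_network(prefix))

    fib: dict[str, str] = {}
    for hop, nets in by_hop.items():
        # collapse_addresses merges adjacent prefixes into shorter ones.
        for net in ipaddress.collapse_addresses(nets):
            fib[str(net)] = hop
    return fib


rib = {
    "10.0.0.0/24": "A",
    "10.0.1.0/24": "A",
    "10.0.2.0/24": "B",
    "10.0.3.0/24": "A",
}
print(aggregate(rib))
# {'10.0.0.0/23': 'A', '10.0.3.0/24': 'A', '10.0.2.0/24': 'B'}
```

Four RIB entries shrink to three FIB entries here; the third prefix towards A cannot be merged because the block between them is routed to B.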

3.2 The Role of Abstractions

While the Internet architecture has been a wild success by any measure

we can think of, allowing for radical innovation both below and above the

IP layer [62], it still suffers from architectural rigidness [13, 133, 100].

While this rigidness fosters interoperability, maximizing the commonality

of function on the networking layer,12 it has resulted in very slow or

nonexistent deployment of innovation on the networking layer itself [13].

The rigidness in the Internet architecture is attributable in part to

entanglement of IP addressing in both the network application interface

(e.g., the socket API), and the inter-domain routing protocols (BGP), forcing

lockstep updates on all three [133]. This is evident from the decade-long

effort to deploy IPv6 [202], which requires updates both in applications

to handle the longer addresses, and in routing protocols, in addition to

all the protocols within the networking stack itself (e.g., TCP, ICMPv6).

What makes a dire situation even worse is the almost certain lack of

coordination between the thousands of domains required to be able to

agree on any changes on the networking layer. It seems only a disaster in

the making – like running out of network address space [115] – can force

an industry-wide response to upgrade the networks.

Why, then, are IP addresses visible at the application programming

interface and the inter-domain routing protocol? In a word, due to ill-

defined networking abstractions [133]. In the Internet architecture, name

resolution belongs to the application layer as the names are not part of the

networking application interface. Name-based networking APIs [133, 200]

have been proposed only lately as the problems relating to IPv6 deployment

[202], mobility-induced IP address volatility [162], and newer networking

models [124] have increased awareness of the problem.

Also, the inter-domain routing protocol (BGP) [175] has been built using

the intra-domain network architecture (TCP/IP), which seemed only nat-

ural at first. Later design proposals, such as the pathlet routing design

[104], have shown that inter-domain routing protocols can be designed

to be independent of the network architecture, especially the addressing

model, of the domains being interconnected. The pathlet design requires

an independent namespace for naming the domains, for the purpose of

which cryptographically generated identifiers [12] provide a solution that

by the same token solves many of the vulnerabilities currently present in

BGP [152].

The AIP model [12] calls for domains to be named independently of their

internal networking architecture. Each host is assigned a globally unique

identifier that alone cannot be used for global routing purposes. To refer

12 This maximization has come through gradual loss of functionality (e.g., source routing) present in the original Internet architecture.

to a host in another domain, the domain identifier needs to be explicitly

given, together with the identifier of the destination host. This is similar to

the current practice of having the network identifier (prefix) and interface

identifier (following the prefix) parts in an IP address, only that the entities

identified are the domain13 and the host, instead of a subnetwork prefix and

a single network interface, and that the host identifier is globally unique,

rather than only locally significant. With this addressing structure, inter-

domain routing can take place on the level of domain identifiers, rather

than individual IP prefixes, easing the pressure on the routing tables [147].
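
The two-part addressing structure described above can be sketched in a few lines of Python. The `Address` type, the table layout, and all identifiers below are hypothetical and serve only to show that the inter-domain table is keyed by domain identifier, while host identifiers matter only inside the destination domain; this is not the AIP wire format.

```python
from typing import NamedTuple


class Address(NamedTuple):
    domain: str  # domain identifier, used for inter-domain routing
    host: str    # globally unique host identifier, not routable on its own


def next_hop(local_domain, inter_domain_table, intra_domain_table, dst):
    """Pick the next hop for `dst` as seen from a router in `local_domain`.

    Packets for other domains are forwarded on the domain identifier alone;
    the host identifier is consulted only inside the destination domain.
    """
    if dst.domain == local_domain:
        return intra_domain_table[dst.host]
    return inter_domain_table[dst.domain]


# Hypothetical tables of a router in domain "AD1".
inter_domain = {"AD2": "border-router-1", "AD3": "border-router-2"}
intra_domain = {"host-a": "port-1", "host-b": "port-2"}

remote_hop = next_hop("AD1", inter_domain, intra_domain, Address("AD3", "host-z"))
local_hop = next_hop("AD1", inter_domain, intra_domain, Address("AD1", "host-b"))
```

Note that the inter-domain table needs one entry per domain, regardless of how many hosts or subnetworks each domain contains.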

By defining proper abstractions (e.g., named communication entities,

domains, and paths), it is possible to break the entanglement currently

present in the Internet architecture, and enable innovation also at the

networking layer. These abstractions cannot, however, be introduced to the

current architecture in an evolutionary fashion, and thus face the same

deployment obstacle as all other proposals. Finding consensus on both the

basic abstractions and on the minimal framework needed to support them

[133] could help set the goal for a far more lasting change when a global

network upgrade is needed again.

3.3 Network Architecture Deployment

As established above, network architecture deployment has turned out

to be an insurmountable problem. There are many kinds of barriers to

deployment, some technical, some related to operational practices.14 Inter-

domain incentives and their effects on technology design and adoption,

however, are typically not considered, or are abstracted away (as in [127]).

Against this backdrop, “Towards an Evolvable Internet Architecture”

by Ratnasamy et al. [174] presents the case for network architecture

adoption squarely from the viewpoint of ISP participation, and places its

hope for practical adoption on traffic-level incentives. According to [174],

practical network architecture deployment needs to be based on these four

assumptions:

“Assume partial ISP deployment.” There is no expectation that all

ISPs deploy the new architecture at the same time, i.e., there is

13 Here the domain may refer to an entity smaller than an AS, but larger than a single Ethernet segment, for example.

14 See [72, 84, 121, 202] for examples in the areas of IP multicast, inter-domain routing, QoS, and IPv6, respectively.

no “flag day”. Maybe there is only one ISP that starts the deployment,

probably using only a fraction of its networking resources.

“Assume partial ISP participation.” An ISP that does not deploy the

new architecture may also refrain from configuring their existing

systems (routing, name resolution) so as to help their users reach the

new architecture service offered by someone else.

“Assume existing market structure and contractual agreements.”

No requirement for new architecture customers to sign up with new

providers. Assume sticky business structures as customer–provider

relationships cannot be assumed to change overnight.

“Assume revenue flow.” Assumption of revenue flow following the in-

creasing use of the new architecture, also for in-transit ISPs, via

increased network traffic.

Based on the above assumptions, it follows that new architectures start

as overlays, implemented only on a willing fraction of the global network,

using the existing architecture as transport to connect the parts of the new

deployment together. Next we will turn to overlays.

3.4 Network Overlays

Network overlays have proven a valuable tool for both network architec-

ture and service deployment [77, 114]. An overlay builds a new network

structure using the network resources of an existing architecture (the

underlay). From this viewpoint, both the Domain Name System [148] and

HTTP [92] can be considered as overlays over IP, for example [163].

Overlays, as well as all other network structures, are defined in terms of

nodes with specific forms of addresses, or more generally, identifiers, and

which are interconnected to form a network structure with (overlay) links.

Overlay designs differ mainly in their choice of an identifier space, and

in their related routing mechanism. Practical overlays provide expected

logarithmic routing performance (O(log N)).

Many overlay designs are generalizations of specific data structures

to a distributed setting. For example, SkipNet [109] is a generalization

of the skip list structure, while distributed hash tables (DHTs), such

as Chord [191], Kademlia [143], Pastry [179], and Tapestry [215], are

network generalizations of hash tables. DHTs use hash values [74] as

Figure 3.1. Example Chord ring showing exponentially distancing overlay links for two nodes.

their identifier namespace and define the routing structures that yield

an expected logarithmic number of overlay hops to be traversed in order

to reach any given node in the overlay. For example, in Chord the nodes

are organized into a ring (Figure 3.1), where each node maintains links to

other nodes at exponentially increasing identifier distances. This enables

efficient routing to any specific part of the identifier namespace while

maintaining a small number of links at each node. While conceptually

simple, DHT protocols can be rather complex, for they need to deal with

network failures and frequent joins and leaves (churn) of the overlay nodes.
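
The ring structure and the exponentially spaced links can be made concrete with a small Python sketch. It is a simplified, static model under stated assumptions: a 6-bit identifier space, a fixed node set, and a routing function that uses a global view of the ring to find the node responsible for a key. A real Chord node has only local state and must also handle churn, which is omitted here. All names are illustrative.

```python
from bisect import bisect_left

M = 6              # assumed identifier length in bits
SPACE = 1 << M     # size of the identifier space (64)


def successor(nodes, ident):
    """First node at or after `ident` on the ring; `nodes` must be sorted."""
    i = bisect_left(nodes, ident % SPACE)
    return nodes[i % len(nodes)]


def dist(a, b):
    """Clockwise distance from identifier a to identifier b."""
    return (b - a) % SPACE


def fingers(nodes, n):
    """Overlay links of node n: successors of n + 2^i for i = 0..M-1."""
    return [successor(nodes, n + (1 << i)) for i in range(M)]


def route(nodes, start, key):
    """Greedy routing: always take the farthest link that does not overshoot."""
    owner = successor(nodes, key)
    path = [start]
    node = start
    while node != owner:
        candidates = [
            f for f in fingers(nodes, node)
            if 0 < dist(node, f) <= dist(node, owner)
        ]
        node = max(candidates, key=lambda f: dist(path[-1], f))
        path.append(node)
    return path


ring = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
lookup_path = route(ring, 8, 54)
```

With this node set, a lookup for key 54 starting at node 8 visits nodes 42, 51, and 56: each hop at least halves the remaining identifier distance, which is where the logarithmic hop count comes from.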

DHT structures have been applied to many different services. For exam-

ple, OpenDHT [177] has been made available to provide key/value service

to all applications that may need such a service. The Domain Name System

could be implemented using a DHT [157]; some such designs would as a

result provide much better average lookup latency and faster response to

updates [171, 172].

The basic DHT structure described above is one flat globally shared struc-

ture. As the node identifiers are essentially random numbers, the node

Figure 3.2. Chord rings A and B combined into a Canon hierarchy, with original and added overlay links (heavier lines) for one node from both A and B shown. Note the difference between the Canon (c) and Chord (d) rings of the same nodes. Panels: (a) Chord ring A; (b) Chord ring B; (c) combined Canon ring; (d) Chord (Figure 3.1).

relations are also random, meaning that neighboring nodes in the iden-

tifier namespace can be anywhere in the underlying network topology.15

This negates the ISPs’ carefully crafted routing policies in the underlying

network, increasing the costs and degrading the service [209]. Due to the

flat structure, some remote nodes leaving the DHT at the same time might

leave overlay nodes topologically close to each other disconnected.

Hierarchical DHTs have been proposed to provide robustness against

highly dynamic nodes [16] and to address the routing locality problem

stated above [17, 95]. Canon [95, 94] builds an explicit hierarchical struc-

ture by successively joining DHT structures together, starting from flat

DHTs, and progressing to arbitrary levels of hierarchy, and finally includ-

15 To achieve better performance, DHT protocols in practice favor choosing links to nodes that are also nearby in the sense of routing latency [191].

ing all nodes on the top level of the hierarchy. The join operation consists

of each node adding new overlay links to such nodes in the other over-

lays that are closer in the identifier space than its current successor and

that fulfill the logarithmic overlay link selection criteria. For example, in

Figure 3.2(c), nodes whose overlay links are shown in Figures 3.2(a) and

3.2(b), both need to add one link as the combined structure has a new node

between them and their prior successor node.
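
The join rule described above can be sketched as follows. This is one reading of the rule, under stated assumptions: a 6-bit identifier space, Chord-style links at distances of at least 2^i, and static sorted node lists. The function names and the example rings are hypothetical and do not reproduce the rings of Figure 3.2.

```python
from bisect import bisect_left

M = 6              # assumed identifier length in bits
SPACE = 1 << M


def successor(nodes, ident):
    """First node at or after `ident` on the ring formed by `nodes`."""
    ring = sorted(nodes)
    i = bisect_left(ring, ident % SPACE)
    return ring[i % len(ring)]


def dist(a, b):
    """Clockwise distance from identifier a to identifier b."""
    return (b - a) % SPACE


def canon_new_links(node, own_ring, other_ring):
    """Links that `node` adds when its ring is merged with `other_ring`.

    A node in the other ring is linked only if (1) it would be chosen by the
    logarithmic link rule in the merged ring and (2) it is closer than the
    node's current successor in its own ring.
    """
    own_succ = successor(own_ring, node + 1)
    limit = dist(node, own_succ) or SPACE  # single-node ring: no limit
    merged = set(own_ring) | set(other_ring)
    links = set()
    for i in range(M):
        candidate = successor(merged, node + (1 << i))
        if candidate in other_ring and 0 < dist(node, candidate) < limit:
            links.add(candidate)
    return sorted(links)


ring_a = [0, 20, 40]
ring_b = [10, 30, 50]
links_of_0 = canon_new_links(0, ring_a, ring_b)    # node 0 belongs to ring A
links_of_10 = canon_new_links(10, ring_b, ring_a)  # node 10 belongs to ring B
```

In this example each node gains exactly one link, to the node of the other ring that now sits between it and its previous successor; all links within the original rings stay as they were.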

An overlay formed by joining successive levels of hierarchy in the above

fashion provides a locality guarantee: overlay routes remain in the smallest

sub-hierarchy containing the source and the destination nodes. Intuitively,

this follows from the joining process: the links added at a higher level are

followed only when the destination is known not to be in the current

ring. Other benefits of the Canon structure include: (a) adaptation to

the underlying physical network, when the physical network structure

is reflected in the Canon hierarchy, (b) effective caching and bandwidth

utilization as an exit from each sub-hierarchy towards a given destination

always happens through the same node, (c) fault isolation and security, due

to the locality guarantee, and (d) hierarchical access control and storage,

for the same reason as caching above.

While routing between any two nodes is limited to their lowest common

level of hierarchy, the routing may still take place via any nodes within that

hierarchy. This includes also nodes in different sub-hierarchies than where

the source and the destination node are located. Also, the hierarchy is

virtual in the sense that all nodes reside at the bottom of the hierarchy, and

the hierarchies are just different, increasingly more inclusive collections of

those nodes.

There are two major problems with overlays as vehicles for network

architecture deployment. Firstly, as such, they do not provide for inter-

working between different networking architectures, but rather build new

isolated networks [100]. If, instead, inter-domain routing were to evolve to

the bare minimum needed for forwarding packets towards independently

identified domains and hosts, it would be possible for different domain

architectures to exchange packets using the shared but domain-technology

independent inter-domain mechanism. Secondly, hierarchical mapping to

the inter-domain structure is not straightforward, as the Internet does

not conform to a strict hierarchy, but has potential for multi-homing and

seemingly arbitrary peering interconnection.

4. Inter-Domain Routing

“Many system administrators refer to networking – and the Border Gateway

Protocol (BGP) in particular – as black magic or voodoo, a domain not to be

trifled with and one best left to the highly-specialized network shaman. Perhaps,

but not every village can afford a network shaman; often, it’s up to the local

system administrator to perform all the miracles.”

(Jaffrey Papen. Demystifying BGP [156])

Given the inter-domain structure (Section 2.1), we now turn to the problem

of the dynamic interconnection of this global structure, i.e., inter-domain

routing. The purpose of inter-domain routing mechanisms is to allow

for policy-compliant next hop selection all the way from traffic sources to

destinations. We start with a brief introduction to the current inter-domain

routing protocol (BGP) and issues in inter-domain multicast. After that

we summarize proposals for new inter-domain architectures, focusing on

content-oriented networking architectures. We finish this section with an

overview of naming in content-oriented network architectures.

4.1 Border Gateway Protocol (BGP)

In this section we briefly summarize the mechanisms by which admin-

istrative domains implement their routing policies today.16 The current

inter-domain routing protocol, Border Gateway Protocol Version 4 (BGP

[175]), arose with the transition from the one shared backbone (NSFNET)

to the multiple, commercially operated backbone networks, as the more

complex structure required a more sophisticated protocol [22].

BGP originated as a simple path-vector protocol, but mechanisms were

later added to allow each AS to implement their locally defined routing

16 For an in-depth introduction, see [22, 38].

Figure 4.1. BGP sessions in a partial topology. Exterior BGP (eBGP) is operated between BGP routers of neighboring domains, while internally the routes are distributed with interior BGP (iBGP). One domain uses a route reflector to cut down the number of iBGP sessions a full mesh would require. Legend: inter-domain link, intra-domain link, iBGP session, eBGP session.

policy, and to keep the policy details to themselves [38]. This evolution-

ary process has produced a protocol fairly simple in its basic protocol

operation, but rather complex in the interactions of its numerous mecha-

nisms. To complicate matters even more, in practical configurations BGP

must interface with the interior (intra-domain) routing protocols (route

redistribution), sometimes causing unforeseen feature interactions [136].

As the name indicates, BGP is principally operated at the domain bound-

aries (Figure 4.1), interfacing with the neighboring domains with which

there exist traffic exchange contracts. Each BGP router maintains BGP

sessions with specific neighboring BGP routers. Using the session, the

routers exchange route updates, which can be either route announcements

or withdrawals. Out of multiple announcements to the same destination

network, identified with an IP prefix, each domain chooses the least expen-

sive next hop, based on its internal cost criterion (route selection). Finally,

subsets of these selected best routes are further advertised to each of the

other neighboring domains, again based on the locally defined routing

policy (route filtering). For example, routes received via a domain’s transit

providers are further announced only to the domain’s transit customers,

and not to peers or other transit providers. The reason for this is that

each route announcement is an implicit promise to route traffic to the

announced destinations, and the domain does not get paid for such service

from its peers or transit providers.
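
The filtering rule above, together with the conventional rule that customer routes are announced to everyone, can be written as a small predicate. This is a minimal sketch of the common export convention, not a BGP implementation; the relationship labels and the function name are assumptions made for illustration, and sibling relationships are left out.

```python
def should_export(learned_from, advertise_to):
    """Decide whether a route is announced to a neighbor.

    Both arguments name the business relationship of the neighbor as seen
    from the local domain: "customer", "peer", or "provider".
    """
    if learned_from == "customer":
        # Customers pay for transit, so their routes go to all neighbors.
        return True
    # Routes from peers and providers are announced only to paying customers.
    return advertise_to == "customer"


# Full export matrix for the three relationship types.
relationships = ("customer", "peer", "provider")
export_matrix = {
    (src, dst): should_export(src, dst)
    for src in relationships
    for dst in relationships
}
```

Only five of the nine combinations allow export, and in each of them a customer is on one side of the exchange, which reflects the observation that a domain announces a route only when it is paid for carrying the resulting traffic.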

The route selection and filtering operations are implemented using a

set of attributes associated with each announced path. The receiving

BGP router attaches a local preference attribute based on who sent the

announcement, effectively recording the nature of the business relationship

with the neighboring domain in question (i.e., transit provider or customer,

or a peering or a sibling domain). This local preference attribute then takes

the highest precedence in the route selection process. For multiple paths of

equal cost, the selection is decided by attributes such as AS path, recording

the AS-level path to the destination, and multiple-exit discriminator, used

by the announcing domain to express its preference in the presence of

multiple links between the two domains.
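
A simplified version of this selection order is sketched below. It is not the full BGP decision process: the local preference values are arbitrary illustrative numbers, several tie-breaking steps are omitted, and the multiple-exit discriminator is compared across all candidates, whereas routers normally compare it only between routes received from the same neighboring domain. The `Route` type and its fields are assumptions made for the example.

```python
from dataclasses import dataclass

# Illustrative local preference values: customer routes are preferred over
# peer routes, which are preferred over provider routes.
LOCAL_PREF = {"customer": 300, "peer": 200, "provider": 100}


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple      # AS-level path to the destination
    relationship: str   # relationship with the announcing neighbor
    med: int = 0        # multiple-exit discriminator (lower is preferred)


def best_route(routes):
    """Highest local preference, then shortest AS path, then lowest MED."""
    return min(
        routes,
        key=lambda r: (-LOCAL_PREF[r.relationship], len(r.as_path), r.med),
    )


candidates = [
    Route("203.0.113.0/24", (64501,), "provider"),
    Route("203.0.113.0/24", (64502, 64503, 64504), "customer"),
    Route("203.0.113.0/24", (64505, 64506), "peer"),
]
chosen = best_route(candidates)
```

The customer route wins here even though its AS path is the longest, which illustrates that the business relationship dominates path length in the selection.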

The destination of a route is specified as an IP prefix. An IP prefix is

the part of the IP address that is common for all hosts in any given IP

network. To allow for further aggregation, networks sharing the same

inter-domain route – like networks served by the same ISP – are typically

assigned similar prefixes so that they can all be served with a single

route announcement. The needs for multi-homing and traffic engineering

[165, 86] have, however, resulted in deaggregation in inter-domain routing,

causing the BGP update processing load to grow faster than memory load

[118]. These factors have given rise to suspicion about BGP’s ability to

scale as the Internet evolves [147]. This has been a motivator for much

of the recent research in Internet routing architecture [140]. The main

problem seems to be that of the implied upgrade cycles for IP routers as the

routing table growth is still within the limits of the growing technological

capacity, and the update load growth seems to have flattened [116].

4.2 Inter-Domain Multicast

So far we have assumed all communication on the Internet to take place be-

tween a source–destination pair. It can be argued that today’s commercial

inter-domain structure has emerged around this unicast network service

model, where each packet has a sender and a receiver, both financially

responsible for their share of the end-to-end connectivity, as detailed above

Figure 4.2. Inter-domain unicast vs. multicast from a source S to multiple destinations D1–D5. Panels: (a) Unicast; (b) IP Multicast. Each arrow represents inter-domain traffic, with multiple parallel arrows representing redundant transmissions. “#” indicates multicast routing information management and traffic forking within a domain. (Figures 1(b) and 1(c) of Publication III).

(Chapter 2).

In contrast, the essence of the IP multicast model is that it allows sources

to send each packet only once, while the network delivers the same packet

to multiple destinations [65, 61, 29] (Figure 4.2). In this section, we

introduce aspects of the multicast service model relevant for our focus on

inter-domain incentives, and their role in Internet architecture. For more

general background, see [11, 183].

The basic motivation for multicast comes from the opportunity to elimi-

nate redundant transmission of duplicated packets over the network. This

redundancy elimination can be considered to save costs if the network up-

grades can be delayed, or to enable better service as the network becomes

less loaded. The benefit is most obvious for the traffic sources, as they can

serve large groups of destinations with the load of one.
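
The size of this saving can be estimated with a short calculation. The sketch below counts inter-domain transmissions per packet for unicast and for multicast over a delivery tree; the topology is a hypothetical example and is not meant to reproduce Figure 4.2.

```python
def links_to(parent, node):
    """Inter-domain links on the tree path from the source down to `node`."""
    links = []
    while node in parent:
        links.append((parent[node], node))
        node = parent[node]
    return links


def transmissions(parent, destinations):
    """Return (unicast, multicast) link transmissions for one packet.

    Unicast sends a separate copy along every path, so shared links are
    counted once per destination. Multicast sends one copy per tree link.
    """
    unicast = sum(len(links_to(parent, d)) for d in destinations)
    multicast = len({l for d in destinations for l in links_to(parent, d)})
    return unicast, multicast


# Hypothetical delivery tree: child domain -> upstream domain.
tree = {"D1": "S", "D2": "S", "D3": "D2", "D4": "D2", "D5": "D4"}
unicast_count, multicast_count = transmissions(tree, ["D1", "D2", "D3", "D4", "D5"])
```

In this example unicast needs nine inter-domain transmissions per packet and multicast five. The link leaving the source towards D2 carries four copies under unicast but only one under multicast.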

The IP multicast service model was originally defined “without a com-

mercial service explicitly in mind” [72]. In the original, mutually beneficial

setting of a single network, or even a single backbone, it is easy to see

the benefits of the multicast service [63]. However, with the shift to the

commercially operated distributed structure, the costs and benefits of the

multicast service may be realized by different entities. To manage this

asymmetry the current set of multicast protocols requires specific routing

policy support for each multicast group [28, 91, 145], so that only those

groups that are backed by some compensation mechanisms (which are

external to the protocols) can be used in inter-domain scenarios. Appar-

ently, this per-group policy support requirement sets a significant barrier

to inter-domain multicast use in practice.

Given the now widespread use of multicast services in intra-domain

settings [129], it is becoming clear that the corresponding lack of inter-

domain multicast deployment [72] must be in part due to the existing

compensation solution proposals [87, 49] falling short. These proposed

cost sharing models assume the destinations to be incentivized to pay for

multicast service, which may be questionable given their ability to obtain

comparable application services over the unicast network service they are

already paying for.

To bypass the multicast deployment impasse [72, 13], unicast-based

multicast services have been proposed [192]. Unicast tunneling is already

a part of the existing protocol suite, either in communicating with multicast

rendezvous points [91, 78, 90], or as a transitional tool [77]. Using unicast,

multicast is transformed from a specially addressed network service to a

higher layer overlay. The main benefit of doing this is the independence

of underlying network routing policy support. The more fundamental

question of who has the incentives to provide the multicast service remains,

and sometimes such incentives can indeed be found on the application

layer [27, 214].

One remarkable – often implicit – assumption in the current literature

is that of policy compliance of multicast path finding algorithms [48, 173].

This is an important aspect we will review in our contribution (Publication

III).

4.3 Clean Slate Inter-Domain Routing

Realizing the rigidity of the IP networking layer, the research community

has in recent years turned towards clean slate designs [123, 89], in

which the practical concerns for migration from the current, deployed

network, towards an improved one are left for a later investigation. With

this relaxation of constraints, it is possible to identify favorable designs

solving many of the problems of the current network. However, it is not

clear if the designs thus found are ever realizable, given today’s network

to begin with.

In this section we focus only on a subset of the proposed clean slate

designs, given our interest in the inter-domain structure of the network.

For a more comprehensive survey, please see [159].

The problems with the IP addressing system have been a fertile source

of inspiration for new Internet architecture designs. The loss of addressing

transparency due to both firewalls [41] and introduction of IPv6 [64] has

prompted abstraction away from the rigid IPv4 addressing model [54, 210,

133]. The rise of overlapping address spaces, due to network address

translation [189], has resulted in calls for generic address translation

between addressing realms [58, 33]. Our contribution in Publication I

provides one possible solution in this part of the design space.

Overlay-inspired designs have been explored as a potential solution to

ridding the Internet of an addressing structure altogether. ROFL [37]

exhibits a virtual ring routing [36] design for intra-domain routing, and

spans the Internet’s AS structure with the Canon hierarchical DHT design

[95]. The main challenge here is scalability: Given the current concerns

with the scalability of hierarchical IP address based routing, there is little

hope in building a global routing system where addresses are flat, i.e., a

system where address structure provides no support for aggregating the

routing information base.

Compact routing schemes [134] explore path stretch bounds for routing

schemes with sublinear (or compact) local memory requirements on each

node of the network. However, as we have seen above, path preference is

a fundamental aspect in ISP economy: In addition to carrying only paid

traffic, the competitive ISP economy also requires a strict customer path

preference, given the volume-based charging models in use today. It is

exactly this preference that renders BGP policies incompressible, even if

an unbounded stretch were allowed [176]. In this light, proposals to

achieve scalable routing on flat names using the compact routing theory

(e.g., [185]) seem futile.

If destination address-based hop-by-hop routing cannot be scaled in the

inter-domain policy environment without a shared, hierarchical addressing

model, then what can be done? One approach is to make the scaling prob-

lem smaller, so that theoretically unscalable solutions are good enough in

practice. In the inter-domain setting this could be implemented by routing

on the domain identifiers, instead of subnet prefixes. Each administrative

domain structures its internal network in terms of link-layer networks

(e.g., Ethernet networks), each of which is given its own subnet prefix, part

of the overall IP address space. Typically these can be aggregated under

a shared routing entry, but practices such as multi-homing and explicit

deaggregation [147] have led to the situation where there are an order of

magnitude more routing prefixes than administrative domains. This trend

could be reversed, were inter-domain routing done on domain identifiers.

When the destination domain is reached, intra-domain routing would take

the packet to the destination using, e.g., an IP address, or an endpoint

identifier [12]. This would also help scale inter-domain routing by hiding

unnecessary updates across domain boundaries [194].

The domain-based addressing model does not, however, help with multi-

homing – when the destination needs to be reachable via multiple admin-

istrative domains. In fact, domain-based addressing seems to be roughly

equivalent to the traditional provider-based addressing model, where all

subnets are allocated from the address space of the ISP, and therefore

global routing tables are perfectly aggregated.

Another way of tackling the inter-domain routing challenge is to separate

routing from packet forwarding by the means of source routing [79, 15, 105].

A source route consists of a collection of network identifiers that need to be

traversed in order to deliver the packet to its intended destination(s). The

significant conceptual difference is that the inter-domain network service

model is changed, from that of packet delivery to an arbitrary address, to

that of first obtaining a path (or multiple paths) to the destination, and

then sending packets to that destination using the path(s) given. This

additional level of indirection would allow for seamless introduction of new

ways of addressing destinations [133, 104], and mapping given destinations

to AS-level paths, making the underlying inter-domain packet forwarding

more scalable than today, also in the presence of site multi-homing.

Even if route computation were separated from the forwarding function-

ality [85], routing policy enforcement still needs to be taken care of in the

forwarding elements. This policy enforcement problem can be solved either

via explicit tokens, or capabilities [166], or by making sure that all pos-

sible path combinations are policy-compliant. This latter approach [104]

requires that arbitrary network elements cannot be stitched together as a

source route, but that the end-to-end path elements (or pathlets) be identi-

fied explicitly in such a way that only policy-compliant path compositions

are possible.
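
The idea of composing a source route only from explicitly announced path elements can be sketched as a search over pathlets. The representation below, with named pathlets as (start, end) pairs and a breadth-first search, is an assumption made for illustration and is much simpler than the design in [104].

```python
from collections import deque


def compose(pathlets, src, dst):
    """Find a sequence of pathlet names leading from `src` to `dst`.

    `pathlets` maps a pathlet name to a (start, end) pair. Only announced
    pathlets can be chained, so any route found is policy-compliant by
    construction. Returns None when no composition exists.
    """
    queue = deque([(src, [])])
    seen = {src}
    while queue:
        node, route = queue.popleft()
        if node == dst:
            return route
        for name, (start, end) in pathlets.items():
            if start == node and end not in seen:
                seen.add(end)
                queue.append((end, route + [name]))
    return None


# Hypothetical pathlets announced by domains A, B, and C.
announced = {
    "p1": ("A", "B"),
    "p2": ("B", "C"),
    "p3": ("C", "D"),
}
forward_route = compose(announced, "A", "D")
reverse_route = compose(announced, "D", "A")
```

No route exists in the reverse direction, because no domain has announced a pathlet that allows it; the sender cannot stitch one together from arbitrary network elements.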

LISP [83] addresses the inter-domain routing scalability challenge by

overlaying a new edge network address space on top of the existing IPv4

structure. LISP preserves the existing network service model and thus

requires mapping from the destination addresses to the corresponding

tunnel endpoint addresses in the core network. Summaries for LISP and

many other related locator/identifier solution proposals can be found in

the report issued by the IRTF routing research group [140].
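
The mapping and tunneling step can be illustrated with a minimal sketch. The mapping table, the addresses, and the dictionary-based packet format are hypothetical; the sketch shows only the general map-and-encapsulate idea and does not follow the LISP message formats or its mapping system.

```python
import ipaddress


def lookup_tunnel_endpoint(mapping, dst):
    """Longest-prefix match of an edge address against the mapping table."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, endpoint in mapping.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, endpoint)
    return None if best is None else best[1]


def encapsulate(mapping, packet):
    """Wrap an edge packet in an outer header addressed to the tunnel endpoint."""
    endpoint = lookup_tunnel_endpoint(mapping, packet["dst"])
    if endpoint is None:
        return None  # no mapping known for this destination
    return {"outer_dst": endpoint, "inner": packet}


# Hypothetical mapping from edge prefixes to core tunnel endpoint addresses.
edge_to_core = {
    "198.51.100.0/24": "192.0.2.1",
    "198.51.100.128/25": "192.0.2.2",
}
tunneled = encapsulate(edge_to_core, {"dst": "198.51.100.200", "payload": b"data"})
```

Only the tunnel endpoint addresses need to be routable in the core, so the edge prefixes, including de-aggregated ones, stay out of the core routing tables.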

4.4 Content-Oriented Network Architectures

Given the increased use of the Internet for content access [122], many have

asked if access to named content should be a basic network-level service

[199, 132, 124]. The main benefit of content-oriented networking derives

from coordinated content caching. The widespread adoption of content de-

livery networks (e.g., Akamai [154]) has already proven the role of content

caching in access networks [67]. The next step could be internetworking

between the previously isolated CDN systems, as evidenced by the recently

established CDN Interconnection IETF Working Group [4]. However, as

noted in [99], past results on co-operative caching efficacy are troubling, to say

the least.

Content-based networking was pioneered by Antonio Carzaniga with

networking designs, where forwarding is based on the content, or content

metadata, instead of a notion of a destination address [42, 43, 44]. More

recent designs opt for a simpler notion of forwarding on a data name, finding

the nearest copy of the requested piece of content [135, 105, 132, 124]. In

the following we briefly summarize some of these designs, with a focus on

their inter-domain characteristics.

In a sense, all current content-oriented designs are networking gen-

eralizations of the publish/subscribe model [80, 206]. The model offers

applications primitives to publish content to the network, and to subscribe

to any published content. Content is identified with a content name, which

is used in finding the closest replica of the requested content. This model

works well in local settings, but the challenge in an inter-domain setting is

to make it scalable to the processing speeds and available memory on

networking hardware [161], lest the network remain underutilized [99].

We should distinguish between two types of content-oriented designs.

First, name-routing designs route content requests solely based on the

requested content name, finally delivering the request to a node in the

network that can satisfy the request [124, 132]. Alternatively, the end

system may be required to resolve the content name to a network location

(an address), using some name resolution infrastructure, and then, as

a separate step, send the content request to the addressed node in the

network [203].

OceanStore [135] presents an architecture for global, location indepen-

dent storage of versioned files. OceanStore clients connect to “data pools”,

which manage all the data using highly interconnected networks. OceanStore uses self-certified file names [144, 178], which are essentially pseudo-random, binary strings. Human readable names are resolved via directories, some of which are chosen by users to serve as their (trust) roots. The

OceanStore design employs two separate strategies for routing based on the file name: first, a probabilistic local algorithm using attenuated Bloom filters [30], and second, a global distributed hash table (DHT) structure with probabilistic locality guarantees. While this design can reduce traffic-level policy violations to a degree, the probabilistic nature of the employed algorithms means that it cannot eliminate policy violations completely.

TRIAD [105] and DONA [132] are examples of inter-domain networking

designs where a name-based routing phase precedes the payload communi-

cation phase. TRIAD uses a BGP-derived routing design, but on the level

of servers identified with DNS names (“BGP with names”).17 DONA, how-

ever, uses self-certified identifiers instead of human readable names and

makes the registration and find messages explicitly follow the underlying

provider and peering hierarchy.

Both TRIAD and DONA assume all ISPs to be naturally willing to peer

on the name level, if they are already peering on the underlying packet

forwarding level. Due to this arguably unrealistic assumption their request

messages are always forwarded on optimal policy-compliant paths. This

has a downside, however, as the amount of name-level forwarding infor-

mation needed to achieve this leads to significant scalability challenges.

TRIAD limits the problem by restricting the managed namespace to service

names, while DONA argues that large data centers can handle the load.

Due to the allowed wildcard principal practice, the DONA design burdens the

Tier-1 domains with memory, processing and communication overheads

scaling linearly with the number of individually registered publications.

Assuming that incumbent Tier-1 transit providers try to keep as much of

the transit traffic to themselves as possible, and will therefore not be among

the early adopters of any networking architecture with the potential for

better utilization of either peering links or local storage, the global DONA

registration state management burden shifts to a large number of smaller

Tier-2 administrative domains.

CCN [124] adopts the content-oriented directed diffusion paradigm [119]

as the basic network service and defines a combined content networking

and transport layer framework. CCN naming is based on hierarchical

textual names, essentially reusing the current URL namespace. The

17 TRIAD also suggests the use of domain-level source routes, see [15].

CCN inter-domain routing model relies on DNS for resolving destination

server addresses when a direct CCN relationship does not exist between

adjacent administrative domains. CCN packet forwarding is based on the

data names, and according to current understanding, cannot scale to the

demands of the Internet core [161, 99].

4.5 Naming in Content-Oriented Architectures

Inter-domain routing in the Internet operates in terms of topologically

aggregated address prefixes. Thus, to reach a named resource, the name

is first mapped to an address, which is then mapped to a route by the

network [180]. That is, an address tells where (in a topology) the named

resource is to be found. A route consists of data that tells how to get there.

Routers perform route lookup using the given address as the search key,

obtaining the locally significant part of the route (i.e., the outgoing port

number) as a result.

Content-oriented designs route on names instead, short-circuiting the

process described above by omitting the topological address from the

name–address–route chain of bindings. As many names typically bind

to the same address, name-routing designs risk a significant increase in

the number of search keys for route lookup, i.e., larger forwarding tables. By design, content names have no bearing on the network topology, making it hard to aggregate the routing table.

These factors present a scalability challenge, especially for inter-domain

routing between transit service providers, where default routes cannot be

used.

Content names can, however, have structure that makes routing some-

what easier. In many cases natural hierarchies or containment relations

exist between pieces of information. This can be used in the network archi-

tecture design, trading some flexibility for more efficient implementation.

For example, a video file may consist of individually named chunks, all of

which could be assumed to be found using the same route (if the chunks

cannot be found from a cache). Also, photographs are often organized in

collections, or albums. Again, the whole collection could be constrained to

a shared route.
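
The aggregation idea above can be illustrated with a longest-prefix lookup over hierarchical names. This is only a minimal sketch: the slash-separated name format, the table layout, and the function name `longest_prefix_route` are assumptions made for illustration and are not taken from any of the designs discussed.

```python
def longest_prefix_route(table, name):
    """Return the next hop registered for the longest prefix of `name`.

    `table` maps name prefixes such as "/videos/movie" to next hops
    (a hypothetical layout). Returns None when no prefix matches.
    """
    parts = name.strip("/").split("/")
    # Try the full name first, then progressively shorter prefixes.
    for end in range(len(parts), 0, -1):
        prefix = "/" + "/".join(parts[:end])
        if prefix in table:
            return table[prefix]
    return None


routes = {"/videos": "port1", "/videos/movie": "port2"}
# All chunks of one video share the single "/videos/movie" entry.
chunk_hop = longest_prefix_route(routes, "/videos/movie/chunk7")
```

With such a table, the number of entries grows with the number of collections rather than with the number of individually named chunks.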

Security also has an important role in content-oriented designs. Having

lost the relation to the network topology, securing the use of content names

becomes even more important than securing IP addressing in the Internet

today. Recall that the main primitive in the content-oriented service model,

from the user’s viewpoint, is the request for a named object. If anyone in

the global network could claim to have the content you asked for, you would

be bound to face fraudulent content. Principally, for dynamic content, there

are two kinds of solutions to this problem, presented below. However, if

the content is static, it can be directly named with the hash of itself. This

way content verification can be done with a simple hash computation [74].
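
As a minimal sketch of naming static content by its own hash, the following assumes SHA-256 as the hash function and a hexadecimal digest as the name. Both choices, and the function names, are assumptions for illustration; the text does not prescribe a particular hash or encoding.

```python
import hashlib


def content_name(data: bytes) -> str:
    """Derive a name for static content from its SHA-256 digest (assumed hash)."""
    return hashlib.sha256(data).hexdigest()


def verify(name: str, data: bytes) -> bool:
    """Check that the received data is what the requested name refers to."""
    return content_name(data) == name


requested = content_name(b"static object")
genuine = verify(requested, b"static object")
forged = verify(requested, b"something else")
```

Any node on the path can repeat the check, since it needs nothing beyond the name and the data.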

First, the content can be delivered with a certificate from a trusted third

party (a certification authority) [76], claiming the content to be produced by

the rightful owner of the content name you included in your request. This

solution retains total freedom in choosing, e.g., human readable names (as

in CCN [124]), but requires the content publisher, user, and the content-

oriented network to have a shared notion of trust [186]. While this solution

is familiar from the Web [70], it should be noted that the Web also relies on

the (mostly) trusted domain name system [148, 14] and a packet routing

framework bound to the network topology.18

Second, the required security mechanics could be bound to the name

itself [103, 144, 203, 59]. Content names are cryptographically derived

from the publisher’s public key (e.g., with a one-way hash function). The

content is then delivered with a signature and the publisher’s public key.

The signature can then be verified at any point in the network, ensuring

that the signature was produced with the corresponding private key, held

by the publisher of the content (the principal [178]). This also verifies

that the content was correctly named, and ensures that the content you

have received is indeed what you asked for. No third party certification

authorities are needed for the verification, and hence this solution is

referred to as self-certified naming, after the notion of self-certified public

keys [103]. For the user, the problem of knowing what content names to

ask for remains. However, this is not a harder problem than, e.g., obtaining

a URL for a specific piece of content on the Web today.

Some systems combine both human readable strings and a self-certified

part in a name [144]. This can be useful in allowing the human readable

part to form a hierarchy that can then be used for route aggregation

[124]. If the human readable part is actually visible to the user, it may

still present a risk by misleading the user to trust the content based on

some fraudulent mnemonic connotation induced by the name. Also, use of

18 Return routability checks are used in IP mobility protocols to bootstrap some notion of trust from the routing system [66].

mnemonic strings in content names gives rise to contention and speculation

in the namespace that could be avoided with semantic-free names [203].

Cryptographically generated names are naturally semantic-free.

Finally, it should be evident that human-friendly naming systems are

required on top of network-visible content names [26, 23, 8, 21], and that

bindings provided must also be secured. However, the content-oriented

network need not be concerned with this, even if it were used to implement

such human-level naming systems.

5. Network Modeling with Inter-Domain Incentives

“Most people are what philosophers call “naïve realists”: they believe what they

see is, that some things are just plain True – and that they know what they are.

Instead, we stress that human perception and knowledge are limited, that we

operate from the basis of mental models, that we can never place our mental

models on a solid foundation of Truth because a model is a simplification, an

abstraction, a selection, because our models are inevitably incomplete, incorrect –

wrong.” (John D. Sterman. All models are wrong [190])

Given the inter-domain traffic incentives (Chapter 2) and the inter-domain

routing structure (Chapter 4), we now ask how we might use this insight

to validate inter-domain architectures. In the following we describe how

to use inferred domain-level topologies, such as ones from CAIDA [39], for

Internet architecture modeling. POP-level models would be more accurate,

but they are both harder to generate and validate and so far have limited

global coverage [212, 184, 188]. Router-level models would provide the

highest level of accuracy, but are both computationally expensive and practically impossible to generate on a global scale [9, 141]. Since inter-domain incentives are captured already in the annotated AS-level topology

maps, the increased accuracy of the POP- and router-level models is not

necessary for incentive-based analysis.

All models are geared towards yielding results on specific metrics. The

simplest metric for an inter-domain architecture is the inter-domain stretch,

the factor by which domain-level paths inflate, compared to baseline paths.

Increased stretch is often a tradeoff for scalability [134], or as in the case of

BGP, for local policy freedom [176, 195, 98], and provides a view on protocol

efficiency and possible other performance implications. As a baseline for

comparison, we should select paths that BGP would choose when reason-

ably configured. Other possible metrics include end-to-end latency and

load distribution. The latency metric is important if user-level performance is

of interest [35], but requires the network model to be augmented with link

latencies and POP-level detail to be reasonably accurate. Load distribution

requires intra-domain modeling of network nodes, so that load on them

can be estimated.
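
The stretch metric described above reduces to a ratio of hop counts. The sketch below assumes that both the evaluated path and the BGP-like baseline path are given as domain-level hop counts; the function names are hypothetical.

```python
def stretch(path_hops, baseline_hops):
    """Inter-domain stretch: evaluated path length over baseline path length."""
    if baseline_hops <= 0:
        raise ValueError("baseline path must have at least one hop")
    return path_hops / baseline_hops


def mean_stretch(pairs):
    """Average stretch over (path_hops, baseline_hops) pairs."""
    values = [stretch(p, b) for p, b in pairs]
    return sum(values) / len(values)


example = mean_stretch([(4, 4), (6, 4)])
```

A value of 1 means the architecture under study uses paths as short as the baseline; larger values quantify the inflation.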

In the following, we first describe our network model (Section 5.1), on

which the packet-level functionality of new architectures can be experimented with. To generate traffic sources and destinations we use a traffic model

(Section 5.2). Then, we trace policy-compliant network paths taken by the

packets using our incentive-derived routing model (Section 5.3). Finally, we

analyze the recorded path data for our selected metrics (e.g., Publications

III and IV).

5.1 Domain-Level Network Model

We use the CAIDA AS relationship datasets [39] as the network topol-

ogy. CAIDA datasets list the inferred AS relationship (i.e., the customer-

provider, peering, or sibling relationships, see Figure 2.2) for all known AS

interconnections. The relationships are inferred based on available BGP

routing data from Route Views [3] route monitors.

The transit topology is known to be highly accurate as it can be reliably

inferred from the available BGP routing data. The datasets include some

apparent transit loops, however, probably due to the abstraction from the

IP prefix level to the AS level. The AS-level model loses the fidelity of, e.g.,

partial transit and backup relationships [97].

Peering relationships are known only for those domains that are at and

above the domains with Route Views route monitors [3]. This is due to

peering paths being only advertised to customers. Thus, to see all the

peering links, route monitors should be placed in practically all domains.

To assess the effect of the up to 90% of peering links reportedly missing

from the CAIDA datasets [155], we form alternate inter-domain topologies

by augmenting the CAIDA topology with 900% additional peering links

according to the following simple rules:

1. All peering relationships at and above the domains with Route Views

route monitors [3] are known, so none are added to these domains.

2. Singly-homed stub domains are not expected to peer. We assume that

domains with an interest in peering would first find interest in multi-

homing.

3. No peering with customers, or customer’s customers transitively, since

such an arrangement would reduce revenue. Conversely, no peering with

(transitive) providers, even if it would be nice to get free connectivity.

Observing these rules, we pick several different augmented datasets,

each with peering domains selected with our traffic model (Section 5.2),

so that the domains most likely to originate traffic are also the ones that

are more likely to peer than others. As the peering relations are the most

dynamic part of the Internet topology, the augmented datasets also provide

a view on the effect of Internet topology evolution. Again, this process could

be made more realistic with a POP-level connectivity model, as domains

already present at the same exchange points would be more likely to peer

than ISPs on different continents, for example.
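
The three augmentation rules can be read as an eligibility test for a candidate peering link. The following is a sketch under stated assumptions: the topology is given as hypothetical `providers` and `customers` dictionaries mapping each AS to a set of ASes, and `fully_known` is the set of domains at and above the route monitors. How candidate pairs are weighted by the traffic model is not shown.

```python
def upgraph(asn, providers):
    """All transitive providers of `asn`."""
    seen, frontier = set(), {asn}
    while frontier:
        frontier = {p for n in frontier for p in providers.get(n, ())} - seen
        seen |= frontier
    return seen


def may_peer(a, b, providers, customers, fully_known):
    """Return True if a new peering link between a and b passes rules 1-3."""
    # Rule 1: peering of domains at and above the monitors is already known.
    if a == b or a in fully_known or b in fully_known:
        return False
    # Rule 2: singly-homed stub domains are not expected to peer.
    for n in (a, b):
        if not customers.get(n) and len(providers.get(n, ())) < 2:
            return False
    # Rule 3: no peering with transitive customers or transitive providers.
    return b not in upgraph(a, providers) and a not in upgraph(b, providers)


providers = {"A": {"T1", "T2"}, "B": {"T1", "T2"}, "C": {"A"}, "D": {"A", "B"}}
customers = {"T1": {"A", "B"}, "T2": {"A", "B"}, "A": {"C", "D"}, "B": {"D"}}
known = {"T1", "T2"}
```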

5.2 Traffic Model

Network traffic between hosts, networks, and ASes is highly non-uniform

[81]. This means that we need a model for generating pairs of traffic

sources and destinations that yields a realistic distribution of traffic. To

this end, we use the inter-AS traffic demand model proposed by Chang

et al. [46]. The model is based on the general gravity model [164] and is tuned with experimental characterization and validation of a three-dimensional business rationale for each autonomous system, consisting

of business access (Uba), web hosting (Uweb), and residential access (Ura)

dimensions. The business access dimension is derived from the given

inter-domain transit hierarchy [46, Table 5], while the web and residential

access dimensions are derived from the business access dimension as

random variates, observing the measured rank correlations and power-law

distributions, as shown in Listing 1.
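
The correlation step used in Listing 1 can be sketched in Python as follows: Kendall rank correlations are converted to product moment correlations with sin(π/2 · τ), and the resulting matrix is factored so that PMCM = L · Lᵀ. The Cholesky routine below is a plain textbook version written out to keep the sketch free of external libraries; it is an illustration, not the code used for the dissertation.

```python
import math


def kendall_to_pearson(tau):
    """Convert a Kendall rank correlation to a product moment correlation."""
    return math.sin(math.pi / 2 * tau)


def cholesky(m):
    """Lower-triangular L with m = L * L^T, for a positive definite matrix m."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(m[i][i] - s)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low


# Rank correlations from Listing 1: RA vs BA, WEB vs BA, WEB vs RA.
ra_ba, web_ba, web_ra = (kendall_to_pearson(t) for t in (0.2068, 0.1856, 0.1540))
pmcm = [[1.0, ra_ba, web_ba], [ra_ba, 1.0, web_ra], [web_ba, web_ra, 1.0]]
low = cholesky(pmcm)
```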

The above business rationale values are relative to the corresponding

traffic volumes. In our modeling we assume them to be relative to the num-

ber of average sized packets exchanged. Domains higher on the residential

access dimension are more likely to be traffic destinations, while domains

with higher web hosting values are more likely to be traffic sources. To

account for peer-to-peer traffic, the residential access dimension is included

Listing 1. Octave/Matlab code for AS residential access and web hosting generation from the business access dimension.

% ASes indexed continuously starting from 1
function [AS_Ura, AS_Uweb] = generateASDimensions(AS_Uba)

  % Pairwise Kendall's τ rank correlations [46, Table 8]
  Uweb_vs_Ura = 0.1540;
  Uweb_vs_Uba = 0.1856;
  Ura_vs_Uba = 0.2068;

  % Product Moment Correlation Matrix (x:BA, y:RA, z:WEB)
  PMCM = [1.0, sin((pi/2)*Ura_vs_Uba), sin((pi/2)*Uweb_vs_Uba);
          sin((pi/2)*Ura_vs_Uba), 1.0, sin((pi/2)*Uweb_vs_Ura);
          sin((pi/2)*Uweb_vs_Uba), sin((pi/2)*Uweb_vs_Ura), 1.0];
  % Compute lower-triangular matrix L, such that PMCM = L * L'
  L = chol(PMCM)';

  % Generate ranks in decreasing order for Uba,
  % randomizing equal AS_Uba values
  N = numel(AS_Uba);
  [ubaSorted, ubaIndex] = sort(AS_Uba+(AS_Uba .* ...
                               rand(1,N)/(1000)), "descend");
  Rba(ubaIndex) = [1:N]; % BA Rank for each AS

  % Generate properly correlated normalized ranks
  % (0, 1], 1 being the LOWEST rank.
  U = L * [Rba/N; rand(1, N); rand(1, N)];

  % Sort the normalized ranks for RA and WEB (see footnote 19)
  [uraRanks, uraIndex] = sort(U(2,:));
  Rra(uraIndex) = [1:N]; % RA Rank for each AS

  [uwebRanks, uwebIndex] = sort(U(3,:));
  Rweb(uwebIndex) = [1:N]; % WEB Rank for each AS

  % Zipf parameters [46, pp.142-143]
  Uweb_EXP = -1.0; % APNIC (-1.1), ARIN/RIPE (-0.9)
  Ura_EXP = -0.9;

  AS_Ura = (Rra.^Ura_EXP) / (1.0^Ura_EXP); % Normalized
  AS_Uweb = (Rweb.^Uweb_EXP) / (1.0^Uweb_EXP);
end

Listing 2. Octave/Matlab code for computing CDFs from given distributions.

function [U_CDF, U_CDF_AS_Index] = calcCDF(U1, U2, alpha)
  % Optional arguments: U2 defaults to [], alpha defaults to 1
  if nargin < 2, U2 = []; end
  if nargin < 3, alpha = 1; end
  % Sort to rank order
  if numel(U2) == numel(U1)
    [U_CDF, U_CDF_AS_Index] = sort(U1 + alpha * U2, "descend");
  else
    [U_CDF, U_CDF_AS_Index] = sort(U1, "descend");
  end
  % Cumulative sum for the CDF
  U_CDF = cumsum(U_CDF);
end

Listing 3. Octave/Matlab code for drawing a vector of random ASes according to the given CDF.

function ases = pickRandomASes(N, U_CDF, U_CDF_AS_Index)
  % Scale from 0 to max of the CDF
  % Each rank gets their share of the CDF range
  r = rand(1, N) * U_CDF(end);

  % Sort for determining the matching order with the CDF
  % 1st rank gets values (0,U_CDF(1)],
  % 2nd (U_CDF(1),U_CDF(2)], etc.
  [rv,ri] = sort(r);

  % Vector of invalid AS indices to start with
  ases = zeros(1,N);
  nt = 1; n = 0; found = [];

  while (nt <= N)
    % Find the next match in the CDF
    n = find(rv(nt) <= U_CDF, 1);

    % Indices for non-assigned random values up to U_CDF(n)
    found = find(rv(nt:end) <= U_CDF(n));

    % Set the target AS for the matched entries
    % ASes kept in random order
    ases(ri(nt+found-1)) = U_CDF_AS_Index(n);
    nt = nt + numel(found);
  end
end

in source determination with a weight factor. Consequently, we compute

cumulative distribution functions as shown in Listing 2. Finally, using the

CDFs, we draw random sources and destinations as shown in Listing 3.
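
For readers more familiar with Python, the CDF construction and the random draw can be sketched as below. This mirrors the idea of Listings 2 and 3 but is not a translation of them: it draws one AS at a time with a binary search instead of sorting the random values, and the function names are hypothetical.

```python
import bisect
import itertools
import random


def build_cdf(weights):
    """Sort AS indices by descending weight and accumulate the weights."""
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    cdf = list(itertools.accumulate(weights[i] for i in order))
    return cdf, order


def pick_random_ases(n, cdf, order, rng=random):
    """Draw n AS indices with probability proportional to their weight."""
    picks = []
    for _ in range(n):
        r = rng.random() * cdf[-1]
        # First CDF entry that is >= r identifies the matching rank.
        picks.append(order[bisect.bisect_left(cdf, r)])
    return picks


cdf, order = build_cdf([1.0, 3.0, 0.0])
sources = pick_random_ases(1000, cdf, order, random.Random(1))
```

For traffic sources, the weights would be the web hosting values plus the weighted residential access values, as described above; for destinations, the residential access values alone.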

For information object popularity, when needed, we assume Zipfian dis-

tributions, following the established practice in the field [45, 102, 128, 208].

Due to the power-law distribution, even modest caching is helpful in catch-

ing the demand for the most popular objects. However, due to the assumed

heavy tail of the distribution, there will always be a significant proportion

of traffic escaping the caches, no matter the size.
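
The effect of the heavy tail can be made concrete with a small calculation. The sketch assumes an idealized cache that always holds the most popular objects and a Zipf exponent of 1.0; both are simplifying assumptions for illustration.

```python
def zipf_hit_ratio(num_objects, cache_size, exponent=1.0):
    """Share of requests served by a cache holding the top-ranked objects."""
    weights = [rank ** -exponent for rank in range(1, num_objects + 1)]
    return sum(weights[:cache_size]) / sum(weights)


small_cache = zipf_hit_ratio(1000, 10)
```

With 1000 objects, caching only the ten most popular ones already serves roughly 39% of the requests under these assumptions, while the remaining requests still reach the network.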

5.3 Routing Model

Given the relationship-augmented domain-level network topology (Sec-

tion 5.1), and traffic sources and destinations (Section 5.2), we need a

routing model to obtain the baseline for stretch computation. Also, when

using overlays (see Section 3.4), routing for the individual overlay links is

supposed to be using the underlying IP routing system (see Section 4.1).

Running BGP in every domain would be computationally

expensive, and would require modeling of IP prefixes and subnets. So, we

ask how we can estimate the domain-level path taken by BGP by using

the relationship-augmented dataset only.

BGP path computation is rendered nonlinear by the locally defined rout-

ing policy in general, and by the primacy of the local preference attribute

in particular. Consequently, it follows that the typical BGP policy leads

to noncompressible routing tables [176] and to difficulties in route redis-

tribution [137]. So it would appear that BGP modeling would be hard.

However, we find that the globally adhered to incentive structure, and the

predictability of the resulting BGP policies make them easy to model. In

the following we explore a set of algorithms for path computation.

5.3.1 Basic Valley-Free Path Computation

Recall that practically all end-to-end paths are valley-free (Section 2.2).

Valley-free paths consist of an uphill segment, an optional peering link, and a downhill segment. The possible uphill paths for any given

AS depend only on its local transit connectivity and that of its transitive

19 [46] has an error here, stating that the ranks should be assigned in decreasing order of the normalized ranks, but that would reverse the rank order. In fact, ranks are in decreasing order of the corresponding dimension value, and vice versa.

Algorithm 1: Find the shortest valley-free path length.
Data: Relationship-annotated AS graph G, source S_AS, destination D_AS
Result: The length of the shortest valley-free path from S_AS to D_AS
begin
    Up ← S_AS    // Current level of upgraph nodes
    Dn ← ∅       // Current level of downgraph nodes
    V ← ∅        // All visited nodes
    hops ← 0
    while Dn ≠ ∅ or Up ≠ ∅ do
        if D_AS ∈ Dn or D_AS ∈ Up then
            return hops    // Path found
        if Dn ≠ ∅ then
            V ← V ∪ Dn
            Dn ← G_customers(Dn) ∪ G_siblings(Dn)
        if Up ≠ ∅ then
            V ← V ∪ Up
            Dn ← Dn ∪ G_customers(Up) ∪ G_peers(Up)
            Up ← G_providers(Up) ∪ G_siblings(Up)
        Up ← Up − V    // No loops
        Dn ← Dn − V    // No loops
        hops ← hops + 1
    return ∞    // No path found

providers, and can as a whole be considered an upgraph [210, 97], a small

subset of the overall AS dataset. The same applies for the possible downhill

paths. Moreover, by ignoring asymmetric routing (which is hidden in the

AS-level relationships datasets), it follows that the possible downhill paths

are the reverse of the possible uphill paths.

Our first path computation algorithm (Algorithm 1) operates on the

whole network graph, exploring all possible valley-free paths starting from

the source AS, and finds a shortest valley-free path to the destination.

Since Algorithm 1 processes large sets of equidistant ASes in breadth-

first fashion at each iteration, it would seem that, given the average AS

path length of about 4-5 hops, it should be pretty fast. But even if the

destination is usually relatively close by, so are the rest of the ASes. In the

worst case, when the destination is not reachable at all, Algorithm 1 will

Procedure GetDownGraph(G, D_AS)
Data: Relationship-annotated AS graph G, destination D_AS
Result: Dn, the set of ASes in the downgraph of D_AS
begin
    Dn ← ∅
    N ← D_AS    // New nodes
    while N ≠ ∅ do
        Dn ← Dn ∪ N
        N ← G_providers(N) ∪ G_siblings(N)
        N ← N − Dn
    return Dn

visit all ASes reachable from the source. Actual path computation involves

tracing the path back to the source after the match is found. This is simply

done with backpointers (see Figure 5.1), but we omit the details here for

the sake of clarity.
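
A direct Python rendering of Algorithm 1 is sketched below. The graph representation is an assumption: `g` is a dictionary with the keys "providers", "customers", "peers" and "siblings", each mapping an AS to the set of its neighbors of that kind. Like Algorithm 1, the sketch returns only the hop count and omits the backpointers.

```python
import math


def shortest_valley_free_hops(g, src, dst):
    """Breadth-first search over valley-free paths, following Algorithm 1."""
    def nbrs(kind, nodes):
        return {m for n in nodes for m in g[kind].get(n, ())}

    up, dn, visited, hops = {src}, set(), set(), 0
    while up or dn:
        if dst in up or dst in dn:
            return hops  # path found
        new_dn = set()
        if dn:
            visited |= dn
            new_dn = nbrs("customers", dn) | nbrs("siblings", dn)
        if up:
            visited |= up
            new_dn |= nbrs("customers", up) | nbrs("peers", up)
            up = nbrs("providers", up) | nbrs("siblings", up)
        up -= visited      # no loops
        dn = new_dn - visited
        hops += 1
    return math.inf        # no path found


# Hypothetical topology: T is the provider of A and B, which peer with each
# other; C is a customer of A; D is a customer of both B and X.
g = {
    "providers": {"C": {"A"}, "D": {"B", "X"}, "A": {"T"}, "B": {"T"}},
    "customers": {"A": {"C"}, "B": {"D"}, "X": {"D"}, "T": {"A", "B"}},
    "peers": {"A": {"B"}, "B": {"A"}},
    "siblings": {},
}
```

In this example C reaches D in three hops over the peering link, while X is unreachable from C because the only physical route would descend to D and then climb again, which is a valley.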

5.3.2 Faster Valley-Free Path Computation

It should be clear from both Algorithm 1 and Figure 2.4 (page 14), that

unless the destination happens to be in the upgraph of the source, the

AS-level path involves crossing over from the upgraph of the source to the

downgraph of the destination. Hence, it is possible to compute the possible

valley-free paths by finding the intersection of the two graphs [196, 210].

Since the crossing over can take place over a peering link, we include the

peering domains in the upgraph, but not in the downgraph.

All the unnecessary downgraph branches visited by Algorithm 1 can be

eliminated if we start by computing the downgraph of the destination

and compare all new upgraph nodes (including peers) against the whole

downgraph. In other words, we restrict the path computation to the subset

of the whole topology formed by the union of the source’s upgraph and

destination’s downgraph. Only the upgraph will be traversed, if no path

exists, compared to the whole network in Algorithm 1.20 Also, since the

downgraph does not depend on the source, it can be cached for later use.

Algorithm 2 reflects the above, and results in faster path computation.

It uses the GetDownGraph function (page 48), which assumes the provider

and customer relations to be reciprocal, and the sibling relationships to

20 This model expands to multiple sources and destinations as well.

Algorithm 2: Find the shortest upgraph path length for a valley-free path.
Data: Relationship-annotated AS graph G, source S_AS, destination D_AS
Result: The shortest upgraph length of a valley-free path from S_AS to D_AS
begin
    Up ← S_AS                       // Current level of upgraph nodes
    Dn ← GetDownGraph(G, D_AS)      // Set of downgraph nodes
    V ← ∅                           // Visited nodes
    hops ← 0
    while Up ≠ ∅ do
        if Up ∩ Dn ≠ ∅ then
            return hops             // Intersection found
        if G_peers(Up) ∩ Dn ≠ ∅ then
            return hops + 1         // Via peer
        V ← V ∪ Up
        Up ← G_providers(Up) ∪ G_siblings(Up)
        Up ← Up − V                 // No loops
        hops ← hops + 1
    return ∞    // No path found

be symmetric. Note that Algorithm 2 only returns the shortest upgraph

length; complementing it with the downgraph length and shortest overall

path length is left as an exercise for the reader.
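
Algorithm 2 and the GetDownGraph procedure can be sketched in Python as follows. The dictionaries `providers`, `siblings` and `peers`, each mapping an AS to a set of ASes, are an assumed representation. As in Algorithm 2, only the upgraph length is returned.

```python
import math


def uphill(nodes, providers, siblings):
    """Providers and siblings of all the given nodes."""
    return {m for n in nodes
            for m in providers.get(n, set()) | siblings.get(n, set())}


def get_down_graph(providers, siblings, dst):
    """Set of ASes from which dst can be reached by going downhill only."""
    down, new = set(), {dst}
    while new:
        down |= new
        new = uphill(new, providers, siblings) - down
    return down


def shortest_upgraph_hops(providers, siblings, peers, src, dst):
    """Upgraph length of a shortest valley-free path, following Algorithm 2."""
    down = get_down_graph(providers, siblings, dst)
    up, visited, hops = {src}, set(), 0
    while up:
        if up & down:
            return hops            # intersection found
        if any(peers.get(n, set()) & down for n in up):
            return hops + 1        # crossing over via a peering link
        visited |= up
        up = uphill(up, providers, siblings) - visited
        hops += 1
    return math.inf                # no path found


providers = {"C": {"A"}, "D": {"B"}, "A": {"T"}, "B": {"T"}}
peers = {"A": {"B"}, "B": {"A"}}
```

Since `get_down_graph` depends only on the destination, its result could be cached across sources, as noted above.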

5.3.3 Optimized Shortest Valley-Free Path Computation

Both uphill and downhill graphs can be flattened to retain the shortest

paths only, organized as lists of ASes at a given hop distance from the

source and destination, respectively. Now, we are optimists and organize

the search space so that the shortest possible valley-free path is found first.

This is accomplished by:

1. Search in order of increasing total path length hops, starting from 0.

2. Each path length hops limits which distances from the source and destination can be considered: i_s + i_d = hops.

3. Search for all possible matches at the total distance hops under consideration: i_s ∈ [0, hops], i_d = hops − i_s.

4. The search can be terminated as soon as the first match is found, as no other path can be shorter.

5. If a path via a peer is found, it is returned if no shorter non-peering path can be found.

Algorithm 3: Find the shortest valley-free path length.
Data: Relationship-annotated AS graph G, source S_AS, destination D_AS
Result: The length of a shortest valley-free domain-level path from S_AS to D_AS
begin
    Up ← GetDownGraph(G, S_AS)    // Upgraph
    Dn ← GetDownGraph(G, D_AS)    // Downgraph
    foundpeer ← false
    for hops ← 0 to |Up| + |Dn| do
        for i_s ← 0 to min(hops, |Up|) do
            i_d ← hops − i_s
            if i_d ≤ |Dn| then
                for n_s ∈ Up(i_s) do
                    for n_d ∈ Dn(i_d) do
                        if n_s = n_d then
                            return hops
                        if n_d ∈ G_peers(n_s) then
                            foundpeer ← true
        if foundpeer then
            return hops + 1
    return ∞    // No path found

Note that in Algorithm 3, reflecting the above, GetDownGraph is now used

to compute both upgraphs and downgraphs, but it now also needs to return

an array of AS lists, one list for each level of distance from the given AS,

the AS itself being at zero distance. Both upgraphs and downgraphs are

independent of each other, so they can be cached for later use.
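
The level-organized search of Algorithm 3 can be sketched in Python as below. The function `levels` plays the role of the extended GetDownGraph that returns one set of ASes per hop distance; the dictionary-based graph representation and all names are assumptions for illustration.

```python
import math


def levels(providers, siblings, start):
    """Sets of ASes per uphill hop distance from `start` (level 0 is start)."""
    result, seen, new = [], set(), {start}
    while new:
        result.append(new)
        seen |= new
        new = {m for n in new
               for m in providers.get(n, set()) | siblings.get(n, set())} - seen
    return result


def shortest_valley_free_hops(providers, siblings, peers, src, dst):
    """Shortest valley-free path length, searching in increasing total length."""
    up = levels(providers, siblings, src)
    dn = levels(providers, siblings, dst)
    for hops in range(len(up) + len(dn) - 1):
        found_peer = False
        for i_s in range(min(hops, len(up) - 1) + 1):
            i_d = hops - i_s
            if i_d >= len(dn):
                continue
            if up[i_s] & dn[i_d]:
                return hops          # shared AS: no shorter path can exist
            if any(peers.get(n, set()) & dn[i_d] for n in up[i_s]):
                found_peer = True    # peering link adds one hop
        if found_peer:
            return hops + 1
    return math.inf                  # no path found


providers = {"C": {"A"}, "D": {"B"}, "A": {"T"}, "B": {"T"}}
peers = {"A": {"B"}, "B": {"A"}}
```

Both level lists depend on a single AS only, so they could be computed once per AS and reused for every source and destination pair.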

None of our algorithms so far have included consideration of the typical

BGP path selection policies. We address them next.

5.3.4 Policy-Compliant Path Computation

Having found the possible valley-free paths, it now remains to choose the

one a reasonable BGP policy would choose. This can be done by observing

the incentive-derived inter-domain routing policies (see Section 2.3). Recall

that each AS only considers its own cost, and does not really care about the

costs of other ASes. A practical consequence of this is that the chosen path

in many cases is not the shortest one. Hence, we cannot implement the

search in a simple breadth-first fashion as above, but we need to update

the local view at each AS as new, more preferable paths are found. As part

of this process the discovered path length may either increase or decrease.

To support a local view for each AS in the path, we assign a local cost

to each inter-domain link, and compare those costs on the links through

which the destination can be reached in order to choose the path the AS

“advertises” towards the source. The decision can be made only after all

the potentially more preferable paths have been explored. When that is

the case, the upstream nodes are “informed” of the decision. The preceding

domains only see the (initially infinite) AS hop count as the local cost

metric is not transferred over inter-domain links. We make an exception

on the last rule for sibling links, see below.
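
The local decision at a single AS can be sketched as a comparison of (local cost, hop count) pairs. The numeric link costs below are hypothetical placeholders chosen only to give an ordering; the actual preferences are those of Section 2.3. The function names are likewise assumptions.

```python
import math

# Hypothetical local link costs (lower is preferred); not taken from the source.
LINK_COST = {"sibling": 0, "customer": 1, "peer": 2, "provider": 3}


def offer(link_kind, child_cost, child_hops):
    """Route as seen by the parent AS over a link of the given kind."""
    cost = LINK_COST[link_kind]
    if link_kind == "sibling":
        cost += child_cost   # costs accumulate over sibling links only
    return cost, child_hops + 1


def choose(offers):
    """Pick the lowest local cost; break ties on the shorter hop count."""
    return min(offers, default=(math.inf, math.inf))


# A 5-hop customer route wins over a 2-hop provider route.
best = choose([offer("provider", 1, 1), offer("customer", 1, 4)])
```

The example shows why the chosen path is often not the shortest one: the local cost is compared first, and the hop count only breaks ties.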

In Algorithm 4 we implement path computation adhering to the path

selection preferences stated in Section 2.3. To avoid unnecessary work,

we first compute the downgraph (not shown), which allows finding the

Figure 5.1. Data structure for policy-compliant upgraph search. Source AS is the first node on the list. Current index points to the AS being considered. Uphill neighbors to be explored are added after the End index. Backpointers (shown above) point to the previous AS in the upgraph. Data fields such as local cost and hop count are not shown.

Algorithm 4: Find the shortest policy-compliant path.
Data: AS graph G, customer links C, sibling links S, peering links P, link costs LC, source S_AS, destination D_AS
Result: The shortest policy-compliant path from S_AS to D_AS
begin
    Up ← UpGNode(S_AS)              // See Figure 5.1
    Dn ← GetDownGraph(G, D_AS)      // Downhill nodes
    while node ← Up.GetNext() do
        if node ∈ Dn then           // Direct or sibling's customer
            node.next ← Dn(node).next
            node.hops ← Dn(node).hops
            node.cost ← Dn(node).cost
        // Matched node and peers not explored further
        else if (node.parent, node) ∉ P then
            for ∀next : (node, next) ∈ G − C do    // Uphill neighbors
                if node.cost > LC(node, next) and Up.NotLoop(next) then
                    node.childcount ← node.childcount + 1
                    Up.Insert(next)    // Initializes member variables
        // Update costs and hop counts if no pending children
        while node.childcount = 0 and parent ← node.parent do
            parent.childcount ← parent.childcount − 1
            if node.hops ≠ ∞ then   // Have path to advertise?
                hops ← node.hops + 1
                cost ← LC(parent, node)
                if (parent, node) ∈ S then
                    cost ← cost + node.cost
                if parent.cost > cost or (parent.cost = cost and parent.hops > hops) then
                    parent.next ← node
                    parent.hops ← hops    // May increase
                    parent.cost ← cost
            node ← parent
    return Up.Path(Dn)

preferred customer path, and then use a specific upgraph data structure

allowing for breadth-first search and updating local cost and hop count

Table 5.1. Methods used in Algorithm 4.

Method Description

UpGNode Initialize an upgraph search list (Figure 5.1).

GetDownGraph Return a downgraph with a like structure.

GetNext Advance Current index, return the current node.

NotLoop Check that the given AS would not form a loop.

Insert Insert a new AS at the end of the upgraph. .parent

will be initialized to the current node, .cost and

.hops to ∞, .next to ∅, and .childcount to 0.

Path Return the preferred path by following the next

pointers from source to the destination.

metrics as the search progresses (Figure 5.1). Uphill Nodes with a mini-

mum cost higher than the currently known cost will not be explored. When

the cost at each child node is known, the cost and hops count at the parent

nodes may be updated. We accumulate costs over sibling links to be able

to make a difference between a path through the sibling’s provider and

peer links.21 The search terminates when no node may lead to lower cost

at source. If there is no path at all, both cost and hop count at source will

remain at infinity.
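The list-based search structure of Figure 5.1 and the methods of Table 5.1 can be illustrated with a minimal Python sketch. The class and attribute names below are hypothetical, and two details are assumptions rather than something the text specifies: the source node starts with infinite cost and hop count, and the loop check follows backpointers from the current node.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass(eq=False)
class UpNode:
    """One AS entry on the upgraph search list (field names follow Table 5.1)."""
    asn: str
    parent: Optional["UpNode"] = None  # backpointer to the previous AS
    cost: float = math.inf             # local cost, unknown until a path is found
    hops: float = math.inf             # hop count, unknown until a path is found
    next: Optional["UpNode"] = None    # preferred next hop towards the destination
    childcount: int = 0                # uphill neighbors still pending


class UpGraph:
    """Append-only list with a Current index; the End index is the list tail."""

    def __init__(self, source_as: str) -> None:
        self.nodes: List[UpNode] = [UpNode(source_as)]
        self.current = -1  # nothing has been considered yet

    def get_next(self) -> Optional[UpNode]:
        """Advance the Current index and return the node it points to."""
        if self.current + 1 >= len(self.nodes):
            return None
        self.current += 1
        return self.nodes[self.current]

    def not_loop(self, asn: str) -> bool:
        """True if asn is not already on the path from the source to Current.

        Assumes get_next() has been called at least once.
        """
        node: Optional[UpNode] = self.nodes[self.current]
        while node is not None:
            if node.asn == asn:
                return False
            node = node.parent
        return True

    def insert(self, asn: str) -> UpNode:
        """Append a new AS whose parent is the current node."""
        node = UpNode(asn, parent=self.nodes[self.current])
        self.nodes.append(node)
        return node
```

Because new entries are only ever appended, walking the list with the Current index visits nodes in breadth-first order, which is the property the search relies on.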

Algorithm 4 can be further optimized like the other algorithms above. For example, the neighboring nodes could be processed in the order of the local cost metric, so that pruning out the more expensive neighbors would be more likely. Additionally, peer and sibling links could be processed in depth-first fashion to avoid exploring provider paths when peer or sibling peer paths exist. Figure 5.2 shows a modified data structure to support optimized path exploration. Corresponding changes to Algorithm 4 are left as an exercise for the reader. As a further enhancement, Algorithm 4 could be modified to support Equal Cost Multi-Path (ECMP) [112] by keeping track of multiple equally preferable next hops at each AS. Finally, this model can be expanded to also cover intra-domain links by treating them like sibling links, possibly with a lower local cost. In that case, each node in the structure would represent a router instead of an AS as a whole.

²¹ Paths to sibling’s customers are present in the downgraph, and are therefore already preferred.


[Figure 5.2 (diagram not reproduced): a list of upgraph nodes in the order Src, S1, S2, SS1-1, SS1-2, P1, SP1, P2, PP1-2, PP1-2, with markers for the Current, Insert and End indices.]

Figure 5.2. Data structure for optimized policy-compliant upgraph search. Peer and sibling ASes are added after the Insert index, while provider ASes are added after the End index. The list is fixed up to the Current index. Note how the sibling SP1 has been added between P1 and P2, while the providers of P1 appear at the end.


6. Inter-Domain Incentives and Internet Architecture

“I wrote a great deal during the next ten years, but very little of any importance;

there are not more than four or five papers which I can still remember with some

satisfaction.”

(Godfrey Harold Hardy. A mathematician’s apology [108, p. 147])

Having covered the necessary background, we now present summaries

of the five papers included in this dissertation, showing the principles of

inter-domain structure and incentives in action. We present the papers in

chronological order, spanning five years of development of these themes.

We start with a node identity internetworking architecture (Section 6.1),

where we propose a technical solution to the problem of dynamic evolvabil-

ity of the inter-domain structure through the means of locator domains and

cryptographic node identifier-based routing. Then we show how the alterna-

tive networking service model of content-oriented networking might open up

possibilities for new kinds of inter-domain traffic policies, based on the typ-

ical economic incentives from which the policies derive (Section 6.2). Next

we examine how the consideration of inter-domain incentives shapes inter-

domain multicast protocol design (Section 6.3). After that we present a

name-based routing design informed by the consideration for inter-domain

evolvability and protocol deployment incentives, while keeping performance

and scalability in check (Section 6.4). Last, but not least, we return to

the theme of cryptographic identifiers in network design, assumed already

in the first paper presented here, but on which little has been published

before (Section 6.5).


6.1 Evolvable Inter-Domain Structure

Publication I presents the node identity internetworking architecture, addressing many of the shortcomings of the Internet of today. As an architectural proposal, this work builds heavily on components proposed by others, but composes them into a novel constellation. The contribution of this paper lies in making some of the ideas more concrete, discussing the underlying assumptions and tradeoffs, and combining the ideas into a consistent whole.

This work starts from the assumption of the existence of independent locator domains, brought into being via the use of private address spaces and middleboxes, such as network address translators and firewalls (see Section 3.1). Nodes within one locator domain can freely communicate, using the internal addressing and routing mechanisms of the domain. The global, publicly addressable IPv4 network can be seen as one locator domain, while privately addressed or firewall-protected networks can be seen as separate locator domains. The definition also allows for the introduction of new networking technologies, forming their own locator domains (e.g., IPv6). That is, an interconnection between two locator domains may represent a technology boundary as well as a policy boundary.

We present the architecture with a simplified example (Figure 6.1). One of the shown locator domains (LD1) is a core IPv4 domain, while the two communicating nodes attach to the edge domains LD2 and LD3, respectively.²² The edge domains connect to the core via node identity (NID) routers NR2 and NR3. NID routers map communication across the border by translating between different locator spaces and network technologies. NID routers also function as communication gateways for the local nodes towards the rest of the network.

Each node (including routers) has a cryptographically generated node

identifier, the public key of a public/private key pair (see Section 4.5),

similar to a host identity in HIP [150]. Nodes retain their identifiers

regardless of their attachment to the network – NID is not an address, as

it does not indicate where in the network the node can be found.

Nodes join domains by technology-specific network attachment mechanisms, such as stateless address autoconfiguration [197]. Nodes register their node identifier, assigned local address, and fully-qualified domain name (FQDN), both with dynamic DNS [201] and with the local NID routers, using, e.g., the dynamic host configuration protocol (DHCP) [73]. Node registrations are forwarded by the NID routers up the edge topology towards the core. Thus, within a particular edge region, nodes can be reached via the connected NID routers using the node identifiers only.

²² More generally, the edges may form topologies of their own, similar to the notion of rendezvous networks in Publication IV.

Figure 6.1. Example topology showing three locator domains interconnected with node identity routers (NR) and two end nodes A and B (Figure 1 of Publication I).

Nodes start communication by resolving the name of the correspondent node to the node identifier and network address if the destination is in the same locator domain. However, if the destination is in a separate locator domain, the identifier of a globally reachable NID router to which it has been registered is also returned with the name resolution response. Packet forwarding towards the destination happens in deepest match fashion (see Section 6.5): the destination NID is matched first; if there is no match, the destination NR is matched; and if that also fails, a default route towards a core is taken. In the core the destination NR is resolved using a distributed hash table, after which the packet is sent to the destination edge. Finally, within the destination locator domain the destination node is reached using the local networking technology. In addition to this brief example, the design covers use of multiple cores and addresses the impacts of both host and network mobility.
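The deepest match lookup order described above (destination NID, then destination NR, then a default route) can be sketched as follows. This is only an illustration: the function name, the dictionary-based tables, and the string identifiers are hypothetical and are not taken from Publication I.

```python
from typing import Dict, Optional


def select_next_hop(
    dest_nid: str,
    dest_nr: Optional[str],
    nid_routes: Dict[str, str],
    nr_routes: Dict[str, str],
    default_route: str,
) -> str:
    """Pick a next hop in deepest match order.

    nid_routes maps node identifiers registered in this edge region to a
    next hop; nr_routes does the same for known NID routers. Both tables
    are hypothetical stand-ins for the router's forwarding state.
    """
    # 1. Most specific: the destination node itself is known here.
    if dest_nid in nid_routes:
        return nid_routes[dest_nid]
    # 2. Otherwise try the NID router the destination is registered with.
    if dest_nr is not None and dest_nr in nr_routes:
        return nr_routes[dest_nr]
    # 3. Otherwise fall back to the default route towards a core.
    return default_route
```

For example, a router that knows node "nid-A" locally and NID router "NR3" remotely would send a packet for an unknown node registered at "NR3" towards "NR3", and a packet with no matching entry at all towards the core.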

The key design elements of the architecture described above include technologically independent locator domains; reliance on cryptographic self-managed identifiers; routing based on locators within the locator domains, and on node identifiers or default routes between domains; router referrals and deepest match forwarding to avoid single administration for the whole network; and end-to-end security based on the cryptographic node identifiers.

While already five years old at the time of writing this dissertation,

this design is still fresh, as the major design elements listed above are

still being considered for newer inter-domain networking designs: The

separation of edge network identifiers from core locators is embraced in

LISP [83]; cryptographically generated addresses are used in AIP [12]

and DONA [132] and in content-oriented designs more generally [59]. In

addition, AIP advocates identifying domains with similar identifiers to

get rid of many of the security problems with BGP [152]. Contemporary

proposals for making the Internet more evolvable also vouch for domains

being interconnected with a new, technology-independent inter-domain

layer [133, 100]. Many of these elements are also reused in the other

designs included in this dissertation, summarized below.

6.2 Incentive-Compatible Caching and Peering in Content-Oriented Networks

In Publication II we apply the inter-domain incentives introduced in Sec-

tion 2 to data or content-oriented networking (see Section 4.4). Recall

that inter-domain routing policies are driven by each operator’s economic

incentives. In fact, it seems that the inter-domain structure itself is forged

according to these policies, rather than the physical layer network connec-

tivity. Hence, all new networking architectures are also bound to face a

similar policy structure. When new kinds of traffic patterns, however, are

introduced, possibly changing the incentives, new kinds of traffic policies

might become reasonable.

One motivational factor for content-oriented networking is the expected

more efficient and timely use of network resources. This is possible due to

the routing state being created as the data requests are processed in the

network, which enables sharing the communication and storage resources

between multiple recipients. In the case of synchronous transmission,

this kind of state and resource sharing essentially results in a multicast

service (see Section 4.2). Caching, in turn, enables asynchronous sharing.

In general, this kind of sharing has a similar incentive structure to peering for packet forwarding: savings on transit costs and reduced latency.

[Figure 6.2 (diagrams not reproduced): two panels showing domains A to J, the “Data” source, UserB and UserC, with a legend for route to data, data delivery path, peering link and transit link. Panel (a): Policy-constrained content-oriented networking. Panel (b): Content-oriented peering, which additionally shows a cache.]

Figure 6.2. Content-oriented peering (Figures 1 and 2 of Publication II).

The possibility for transparent, architecturally integrated caching is a central piece in the content-oriented designs. However, so far a crucial policy question remains unanswered: When, exactly, should a content-oriented domain cache which piece of data? For example, in Figure 6.2(a), without caching or relevant data-specific forwarding information in either of the domains D or G, the data is sent twice from the server hosting the data. To understand why this is the case, we need to examine the incentives for caching.

The next question is cache placement: Who has an incentive to cache content? It seems clear that Tier-1 domains, in their transit role, would have no incentive to cache content if the content served from their caches were traffic taken away from their customer links, thereby reducing their transit revenue. This generalizes to all uphill domains:²³ data served from caches will, in general, be traffic taken away from their customer links, thereby reducing revenue. There is no point serving data on your customer’s behalf, unless the customer pays for the caching service. This is why there is no caching in either of the domains D or G in Figure 6.2(a). However, on the downhill paths all domains seem to have an incentive to cache the data, as data served from caches is traffic taken away from their (internal or external) transit links. The customer network domains, being at the bottom of the downhill paths, have the most obvious incentive for caching, as they can also directly benefit from the reduced latency.

²³ See Section 2.2 for an introduction to the valley-free model, including the uphill and downhill segments.

The discussion so far points to an interesting question: are there cases in which the valley-free model could be applied to content-oriented networking in a relaxed form, so that some of the shorter inter-domain paths could be utilized? Are there possibilities for content-oriented peering? Consider the scenario in Figure 6.2(b). Domain C is caching the data UserC has requested, so that other users in domain C interested in the same data might get it faster. Now it becomes possible for domain C to further advertise the same data over its peering links. The rationale for doing this may be that the settlement-free peering link (C-B) may be underutilized, and that providing the data over the peering link does not cause any new external transit costs for domain C. Compared to the situation in Figure 6.2(a), the domain-level path from D to B becomes shorter, the end-to-end latency is likely smaller, and domain B saves the cost of transit through E. In effect, domain C is in a transit role through its peering and transit relationships with domains B and G, respectively. In the existing packet-forwarding-level inter-domain policy structure this makes no sense, as a third non-peering segment (G-C), for which there is no compensation, would form between the uphill (D-G) and downhill (C-B) segments (see Section 2.2).
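To make the packet-forwarding baseline concrete, the following minimal sketch checks the commonly used form of the valley-free rule: zero or more customer-to-provider (uphill) links, then at most one peering link, then zero or more provider-to-customer (downhill) links. The link labels and the function name are hypothetical, and sibling links are ignored for brevity.

```python
from typing import Sequence

UP, PEER, DOWN = "up", "peer", "down"


def is_valley_free(links: Sequence[str]) -> bool:
    """Return True if the link sequence is uphill*, at most one peer, downhill*."""
    stage = 0  # 0: still uphill, 1: peering link taken, 2: downhill
    for link in links:
        if link == UP:
            if stage != 0:
                return False  # climbing again after a peer or downhill link
        elif link == PEER:
            if stage != 0:
                return False  # second peering link, or peering after downhill
            stage = 1
        elif link == DOWN:
            stage = 2
        else:
            raise ValueError(f"unknown link type: {link!r}")
    return True
```

Under the assumption that G is the transit provider of C, the data path D-G, G-C, C-B discussed above corresponds to `[UP, DOWN, PEER]`, which this check rejects. That is the sense in which the content-oriented peering scenario requires a relaxed form of the model.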

In the current BGP4 Internet, each AS invites traffic with BGP route announcements over its incoming inter-domain links, but each AS is in direct control of only the traffic it sends out. Consequently, a peering link is considered imbalanced if one peer is sending significantly more traffic than the other, the limit reported in the literature being 4:1 [153] or 2:1 [82]. In the content-oriented model these roles are reversed: the sender of the data first advertises data availability, and the receiver initiates the data transfer explicitly, and thus can be considered the beneficiary of the ensuing traffic instead of the sender. This role reversal should be considered in forming the accounting models for peering balance for content-oriented traffic. This would enable avoiding settlement fees by advertising and providing more data over imbalanced peering links. While this further increases the imbalance on the packet-forwarding level, the overall benefits between the peers would be balanced.
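As a small worked example of the packet-level balance test mentioned above, the sketch below compares the traffic volumes of the two directions of a peering link against a ratio limit such as 4:1 or 2:1. The function names and the traffic figures are hypothetical, and the sketch covers only the conventional packet-forwarding view, not a content-oriented accounting model.

```python
import math


def traffic_ratio(sent_a: float, sent_b: float) -> float:
    """Ratio of the busier direction to the quieter one (always >= 1)."""
    high, low = max(sent_a, sent_b), min(sent_a, sent_b)
    if low == 0:
        # No traffic either way counts as balanced; one-way traffic does not.
        return math.inf if high > 0 else 1.0
    return high / low


def is_balanced(sent_a: float, sent_b: float, limit: float) -> bool:
    """True if the link stays within the given imbalance limit (e.g. 4.0 for 4:1)."""
    return traffic_ratio(sent_a, sent_b) <= limit
```

With, say, 300 units sent in one direction and 100 in the other, the ratio is 3:1, so the link would pass a 4:1 limit but fail a 2:1 limit.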

As communication and storage sharing enables higher utilization of the

existing peering relationships, some of the weight of the overall traffic

moves down from the Tier-1s to the lower tiers. This forms a challenge

for content-oriented designs. Why should Tier-1 domains participate in

content-oriented networking in the first place if they are faced with less

growth, or even a small decline in their transit traffic, and therefore less

income? What we need is a content-oriented architecture that is not

critically dependent on the Tier-1 networks for other than their present

packet-forwarding transit service.

The higher-level service model of the content-oriented architectures

makes the caching and peering policy a central tool for inter-domain traffic

engineering and management. However, these policies are nontrivial. The

different roles the domain may have in the valley-free model determine

the effect of caching on its revenue. The overall picture also contains the

various internal and external costs, such as those related to storage and

bandwidth. Also, the looser coupling between the packet forwarding and

content-oriented routing functions allows better control over peering traffic

than what is currently possible with BGP. Some of the policy considera-

tions introduced in this paper have later been rediscovered in the context

of recent efforts in named data networking [69].

6.3 Incentive-Informed Inter-Domain Multicast

In Publication III we present a multicast design built on the primacy of

the inter-domain traffic incentives (Section 2).

Optimal IP multicast paths are typically composed of individual policy-

compliant unicast paths, eliminating redundant transmissions on each link

(see Figure 4.2) [48, 87]. Recalling that the typical unicast traffic policies

are derived from the economic incentives of the participating domains,

it is perhaps surprising to see that the straightforward combination of

such unicast paths into multicast delivery trees can be incentive incom-

patible, when only unicast-based revenue is considered. For example, in

Figure 6.3(a), for each packet sent by the source S, the next hop provider

sends one packet to another customer, and two to its own providers. As-

suming typical volume-based charging, the provider is typically better off

not providing the multicast service, as in Figure 6.3(b).

[Figure 6.3 (diagrams not reproduced): two panels showing source S and destinations D1 to D5. Panel (a): Optimal IP Multicast. Panel (b): Incentive-informed Inter-domain Multicast.]

Figure 6.3. Incentive-informed inter-domain multicast example from source S to destinations D1 to D5. Each arrow represents inter-domain traffic; “#” indicates multicast routing information management and traffic forking within a domain. (Figure 1 of Publication III)

Examining the incentives of all the parties along potential multicast delivery paths, we find that assuming only unicast-derived revenue, all uphill (see Section 2.2) transit providers are better off not providing the multicast service, as their customer revenue is maximized by the multiple unicast transmissions from the source. This is contrary to the downhill providers, who, when considered separately, all benefit from reducing redundancy on their paid transit links. However, since their cost savings represent lost revenue from their providers’ viewpoint, the only chance for multicast deployment comes through domains unilaterally offering multicast service in cases where they can realize transit cost savings or more customer revenue, or both. Destination networks with multiple receivers have a natural incentive for reducing incoming transit traffic. However, assuming that the end users are each already paying for unicast traffic, they have little to gain from multicast as such, but there are no obvious disincentives either. The increasing trend of service-specific code running in destination hosts (e.g., client-side scripting) makes the destinations share some of the incentives of the application service providers. This makes it possible for the destination nodes to request multicast service without requiring the explicit consent of, or causing any harm to, the respective end users.

Following these findings we derive a generic unicast-friendly inter-domain multicast policy: a default set of multicast routing rules that could be safely turned on by all network service providers, and thus make multicast available for all network applications. Due to the possible negative incentives and additional overhead caused by multicast, the policy needs to include an opt-out mechanism. This consideration calls for unicast forwarding of multicast traffic across domain boundaries. Since clear multicast incentives exist only at the sources and on the destination side of the inter-domain paths, multicast operation must be requested by each destination sending a multicast reception request towards the source encapsulated within a normal unicast packet. We define the policy by specifying domain-level processing rules (see Publication III), and follow with a policy evaluation using our inter-domain network model (Section 5). Figure 6.4(a) shows the modeling results for inter-domain redundancy elimination, comparing different options for multicast peering policy. Up to 95% of the redundancy elimination benefit of optimal IP multicast is attainable, varying by the destination group size.

[Figure 6.4 (plots not reproduced): two panels plotting inter-domain redundancy elimination [%] against group size [# destinations] on a logarithmic axis from 10^0 to 10^4 (N = 2000, CI = 95%). Panel (a), efficacy with different multicast peering policies: series “w/ peers”, “w/o T1 peers” and “w/o peers”, each with a corresponding “1st hop” series. Panel (b), Tier-1 detour multicast efficacy: series “Tier-1 Detour Multicast”, “Tier-1 Detour Multicast (ALT)”, “1st hop” and “1st hop (ALT)”.]

Figure 6.4. The degree of redundancy elimination with (a) incentive-informed inter-domain multicast, and with (b) Tier-1 detour multicast. 0% and 100% levels correspond to unicast and IP multicast, respectively. “1st hop” refers to the first inter-domain hop from the source for the preceding policy case. (Figures 2(a) and 3(a) of Publication III).
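One way to read the percentage scale, where 0% corresponds to plain unicast and 100% to optimal IP multicast, is as a linear interpolation between the inter-domain transmission counts of those two reference cases. The sketch below assumes that reading; the exact metric definition is given in Publication III, and the function name and sample numbers here are hypothetical.

```python
def redundancy_elimination_pct(
    unicast_tx: float, ip_multicast_tx: float, policy_tx: float
) -> float:
    """Share of the unicast-to-IP-multicast saving achieved by a policy.

    Each argument is a count of inter-domain transmissions needed to reach
    the same destination group: with unicast only, with optimal IP
    multicast, and with the policy under evaluation.
    """
    saved_by_ip_multicast = unicast_tx - ip_multicast_tx
    if saved_by_ip_multicast <= 0:
        raise ValueError("IP multicast must need fewer transmissions than unicast")
    return 100.0 * (unicast_tx - policy_tx) / saved_by_ip_multicast
```

For instance, if unicast needs 100 inter-domain transmissions, optimal IP multicast needs 20, and a policy needs 24, the policy reaches 95% of the attainable saving. A policy needing fewer transmissions than IP multicast would score above 100% on this scale.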

Overall, it seems that, as a bloc, transit providers would be better off if there were no multicast at all, as making senders use unicast maximizes the overall network traffic and revenue, based on the existing contracts. Luckily, Internet transit provision is a highly competitive undertaking, making it difficult for any provider to force others to use its services. The steady increase of bilateral peering provides an illuminating example of the dynamics in Internet transit provision. Transit providers face decreasing revenue, but due to competition between transit providers, they can do little about it.

This being the case, it is possible that some transit providers might be able to draw more customer traffic to their networks by offering multicast as a service. This effectively eliminates the corresponding use of peering links below the multicast service provider domains in the transit hierarchy. Facing increasing levels of transit traffic, the only sensible move the downhill-side providers can make is to respond to the multicast reception requests they receive from their customers, as described above. The efficacy of this case is shown in Figure 6.4(b).

Mirroring our earlier findings for inter-domain caching incentives in Publication II, we note that the multicast incentives are unbalanced along the end-to-end paths. As the costs and benefits of optimal multicast operation generally fall on different economic entities, most end-to-end paths will have domains with no incentives for multicast provision, when only revenue derived from traffic volume is assumed. In our evaluation we find that most of the optimal IP multicast redundancy elimination benefits are attainable even when operation is based on local incentives only. Furthermore, we find that if such a multicast model is adopted, an opportunity to draw some of the lost traffic back to the top tier transit networks opens up. Our evaluation shows that such a Tier-1 detour multicast, combined with voluntary downhill redundancy elimination according to our default policy, has even higher network-wide efficiency than the optimal IP multicast model. Because a small number of top tier transit providers are able to reach most of the Internet destinations, the uphill redundancy is also very limited, and the approach also scales well for very large group sizes.

6.4 Name-Based Inter-Domain Routing

In Publication IV we define rendezvous architecture as an inter-domain

networking system that routes communication requests to available repli-

cas of named objects or object collections (see Sections 4.4 and 4.5). The

rendezvous service model consists of two phases of operation: Firstly, ob-

jects are registered in any of the rendezvous node(s) the object owner has a

relationship with. Secondly, object users ask their local rendezvous nodes

to route communication requests to a rendezvous node responsible for the

object named in the request. The task of the rendezvous architecture is to

bridge the gap between the respective rendezvous nodes.

In practice, network service providers find incentives for new deployments at different times. This calls for design for deployment, mandating at least partial overlay solutions, as support by all domains cannot be assumed, as well as for a design where end-to-end traffic is not handed to arbitrary domains in the Internet. While it is common for an end-to-end forwarding path to pass through many competing transit networks, the end customers of the network service providers expect their traffic not to pass through arbitrary other enterprise networks.

Table 6.1. Objectives for our inter-domain rendezvous architecture.

#  Objective
1  Use of a flat, self-certifying namespace.
2  Enterprise domains only as endpoints of any communication path.
3  Rendezvous topologies are formed by willing participants only.
4  Communication locality is preserved whenever possible.
5  Rarely referred reachability information is not distributed globally.

Addressing the security, policy compliance, deployment, and scalability concerns, we focus on the objectives listed in Table 6.1. We achieve these objectives by architecting a solution composed of known component technologies as follows:

Namespace. Rendezvous namespace is based on flat self-certifying cryp-

tographic labels without any predefined application layer semantics. We

expect these labels to be used to name object collections that share common

distribution and access semantics and call these collections scopes. This

level of indirection takes care of a part of the global scalability challenge

imposed by the flat namespace. The rendezvous requests typically include

additional naming information identifying individual objects, but these

are processed by the rendezvous nodes and caches responsible for those

objects and need not concern the inter-domain name routing system (see

Sections 4.5 and 6.5).
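A flat self-certifying label is commonly derived by hashing a public key, so that anyone holding the key can check that it matches the label without consulting a naming authority. The sketch below illustrates that idea only: the choice of SHA-256, the hexadecimal encoding, and the function names are assumptions, not the scheme defined in Publication IV.

```python
import hashlib


def scope_label(public_key: bytes) -> str:
    """Derive a flat label from the scope owner's public key (assumed SHA-256)."""
    return hashlib.sha256(public_key).hexdigest()


def label_matches_key(label: str, public_key: bytes) -> bool:
    """Check that a presented public key is the one the label was derived from."""
    return label == scope_label(public_key)
```

A complete check would also verify a signature made with the corresponding private key, which is omitted here. Note that the label carries no hierarchy or application-level meaning, which is why the inter-domain routing system can treat it as an opaque identifier.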

Rendezvous Networks. Rendezvous nodes in enterprise networks adver-

tise availability of scopes in their domains by sending the related reachabil-

ity information to their rendezvous peers and rendezvous service providers.

These service providers further propagate this reachability information

to their rendezvous peers and providers. The top tier rendezvous service

providers in this local hierarchy retain all such registration state adver-

tised by their peers and (transitive) customers. This process makes the

registered object collections reachable in the set of domains thus defined.

We call this kind of domain set a rendezvous network (see the shaded areas in Figure 6.5).²⁴

²⁴ Compare to edge topologies in Publication I.

Figure 6.5. Rendezvous network interconnection. Solid lines represent the packet-forwarding-level inter-domain structure. Dashed lines represent the interconnection overlay. Rendezvous nodes are located in rendezvous networks (the shaded areas); individual rendezvous nodes are not shown (Figure 1 of Publication IV).

Interconnection overlay. The rendezvous service providers serving as

roots of their respective rendezvous networks interconnect using an inter-

connection overlay (the dashed lines in Figure 6.5). The overlay is used to

store (scope identifier, pointer) tuples, pointing to the rendezvous nodes

maintaining the reachability state for the named object collections. Imme-

diately upon encountering such a pointer in the overlay, the communication

request is forwarded to the responsible rendezvous node using the forward-

ing path stored with the pointer. The destination rendezvous node can be

hosted by any network reachable either on the underlying packet network

or via the interconnection overlay.

From the many available overlay designs, we have chosen Canonical

Chord [95] (see Section 3.4). However, we apply the overlay strictly between

willing rendezvous service providers, thus maintaining objectives 2 and

3 on the overlay level. The stated locality preservation objective (#4) is

maintained by storing pointers to responsible rendezvous nodes on each

Canon hierarchy level, starting from the rendezvous network hosting the

responsible rendezvous node.

On the top level of the interconnection overlay we use identifier-prefix-based latency optimization [95], where each node maintains links to all other nodes in its own prefix group, and at least one link to any node in each of the other prefix groups. In practice, this is an additional global overlay layer on top of the locality-preserving hierarchical overlays. However, the combined structure allows for significant reduction in overlay link maintenance overhead as each overlay node needs top-level links only for prefixes within its own portion of the identifier space in the Canon hierarchy. Thus, the more nodes there are in the local hierarchy, the fewer global-level links are needed at each node, and vice versa.²⁵

[Figure 6.6 (plot not reproduced): CDF (%) from 0 to 100 against inter-domain path stretch from 0 to 12, with series “Rendezvous Networks”, “Rendezvous Networks with Canon(3)”, “Alternate peering topology”, “Global Prefix-root only”, “Canon(3)” and “Chord”.]

Figure 6.6. Domain-level stretch CDF without Cache (Figure 3(b) of Publication IV).

Again, we use the modeling tools described in Section 5 to evaluate our ar-

chitecture on the inter-domain topology of the Internet. Figure 6.6 reports

stretch, the ratio of the domain-level rendezvous path to the comparable

shortest policy-compliant path. The small number of requests with stretch

below one are due to overlay nodes offering a shortcut via links otherwise

not available for policy-compliant end-to-end paths. The reported stretch

only applies to the initial rendezvous exchange as the payload communi-

cation takes place over optimal policy-compliant paths. We see that our

objective of locality preservation has led to a slightly increased stretch

compared to the case where Canon hierarchies are not used. In retrospect,

this is expected, since each level of hierarchy may add an additional

overlay link to go through (see Figure 3.2).
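
To make the stretch metric concrete, the following sketch computes stretch as the ratio of the domain-level rendezvous path length to the shortest policy-compliant path length, and then reads off one point of the empirical CDF. The function names and the sample hop counts are hypothetical and purely illustrative; they are not measurements from Publication IV.

```python
def stretch(rendezvous_hops: int, shortest_policy_hops: int) -> float:
    """Ratio of the rendezvous path length to the shortest policy-compliant path length."""
    if shortest_policy_hops <= 0:
        raise ValueError("shortest policy-compliant path must have at least one hop")
    return rendezvous_hops / shortest_policy_hops


def cdf_percent(values, threshold: float) -> float:
    """Percentage of samples with a value at or below `threshold`."""
    values = list(values)
    return 100.0 * sum(v <= threshold for v in values) / len(values)


# Illustrative (rendezvous hops, shortest policy-compliant hops) pairs.
samples = [(3, 4), (4, 4), (6, 3), (10, 2)]
stretches = [stretch(r, s) for r, s in samples]
```

In this toy data the first pair has stretch below one, which corresponds to the case where an overlay node offers a shortcut that is not available to policy-compliant end-to-end paths.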

In the case of locating named objects in global inter-domain networks, it

seems that balancing between the extremes of full state flooding and uni-

versal overlay designs enables addressing some of the incentive concerns

[Footnote 25: To our knowledge, Publication IV was the first to report this result.]

of the different stakeholders in the inter-domain network. The presented

design divides the network into rendezvous networks and structured over-

lays interconnecting such networks, and provides better performance than

the universal overlay option and better scalability than the global state

flooding designs.

Deployment of any new architecture takes place one step at a time, each

step taken by individual stakeholders acting on their own incentives. It

is possible, and even likely, that this process never achieves full deploy-

ment over the existing networks. Accepting this, we recognize the need for

solutions for interconnecting the deployed parts. Furthermore, while inter-

connection overlays are needed for deployment purposes, the participation

in such overlays may need to be limited to network service providers.

6.5 Naming in Content-Oriented Architectures

In Publication V we present new argumentation for the use of self-certified

naming (Section 4.5) in content-oriented networking architectures (Sec-

tion 4.4). Self-certified names have been criticized on the basis of both

scalability and security; for the lack of inherent hierarchy and the need

for external real-world identity–name binding, respectively. In that paper

we show that, in fact, self-certifying names have better security properties,

and they support more flexible aggregation than that of a strict hierarchy

alone.

Recall that the content-oriented service model calls for delivery of re-

quested data from wherever it might be most readily available (Section

4.4). This makes it necessary to secure the delivered data, regardless of

the transmission source. Furthermore, the content network needs to verify

that the content advertised is correctly named, lest everyone be able to

speak for CNN.com, for example. Making the user check the data after

completed delivery is not an option as the network might never be able to

deliver the authentic content to the user.

Human readable names require external mechanisms for name owner-

ship verification. This entangles the trust mechanisms of the network with

those of the users, making them hard to evolve. Self-certified names, on the

other hand, provide a clean separation of concerns between the users and

the network; users are responsible for knowing what to ask for (using their

own trust mechanisms), and the network is only responsible for providing

the (truthfully) named content.

Using self-certified names, network elements can use the included public

key and the signature to verify that (a) the content is correctly named by

deriving the name from the public key, and (b) the content is provided

by the principal holding the corresponding private key by checking the

signature, using the public key provided. Thus, no external mechanisms

or global trust is needed for the verifications necessary for preventing

the kinds of Denial of Service attacks otherwise made possible by the

content-oriented networking paradigm.
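
The two checks described above can be sketched with only a hash function. The example below is an assumption-laden illustration and not the scheme of Publication V: it derives the name as the SHA-256 digest of the public key, and it uses a Lamport one-time signature as a stand-in signature scheme because it can be written with the Python standard library alone. A Lamport key pair must sign only one message, so a deployed design would use a conventional signature scheme. All function names are hypothetical.

```python
import hashlib
import secrets


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def keygen():
    """Lamport one-time key pair: 256 pairs of secrets and their hashes."""
    sk = [[secrets.token_bytes(32), secrets.token_bytes(32)] for _ in range(256)]
    pk = [[H(a), H(b)] for a, b in sk]
    return sk, pk


def _bits(msg: bytes):
    """The 256 bits of SHA-256(msg), most significant bit first."""
    digest = H(msg)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(sk, msg: bytes):
    """Reveal one secret per digest bit (one-time use only)."""
    return [sk[i][b] for i, b in enumerate(_bits(msg))]


def verify_sig(pk, msg: bytes, sig) -> bool:
    return len(sig) == 256 and all(
        H(s) == pk[i][b] for i, (s, b) in enumerate(zip(sig, _bits(msg)))
    )


def name_from_key(pk) -> str:
    """Assumed naming rule: the name is the hash of the encoded public key."""
    return H(b"".join(h for pair in pk for h in pair)).hex()


def network_accepts(name: str, pk, content: bytes, sig) -> bool:
    # (a) the name must be derivable from the supplied public key
    # (b) the signature must verify under that same public key
    return name_from_key(pk) == name and verify_sig(pk, content, sig)


sk, pk = keygen()
name = name_from_key(pk)
sig = sign(sk, b"headline")
```

A network element running `network_accepts` needs nothing beyond the name, the key, the content, and the signature, which is the point of the argument: no external mechanism or global trust anchor is consulted.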

We further show how the aggregation invariant (defined in the paper)

can be used to derive explicit aggregation. Briefly, explicit aggregation is

formed by placing globally unique names in a sequence, and providing the

aggregation semantics where following the routes towards the first element

will lead to a location where the second element is known, and so on for

the rest of the elements. Forwarding is done on the basis of deepest match:

At each hop, the last aggregation element (the content itself) is matched

first. If no next hop is found, then the second-to-last element is matched

against, and so on towards the first element. This process guarantees that

the most specific route is always used, as with longest-prefix match in

IP today. The significant difference is that since each element in an explicit

aggregation is globally unique, there is no need to prefix them with all the

preceding elements of the name during the matching process. This allows

the same content to be placed in multiple different aggregates without

changing the content name.
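
Deepest-match forwarding over an explicit aggregation can be sketched in a few lines. The forwarding table layout (a mapping from globally unique element names to next hops), the element names, and the interface labels are assumptions made for illustration; Publication V defines the actual semantics.

```python
from typing import Mapping, Optional, Sequence


def next_hop(aggregation: Sequence[str], fib: Mapping[str, str]) -> Optional[str]:
    """Match the last element (the content itself) first, then work towards the first."""
    for element in reversed(aggregation):
        hop = fib.get(element)
        if hop is not None:
            return hop  # most specific known element wins
    return None  # no element of the aggregation is known at this hop


fib = {"A": "if0", "B": "if1", "X": "if3"}

# Content C placed in two different aggregates without renaming C.
via_ab = next_hop(["A", "B", "C"], fib)
via_x = next_hop(["X", "C"], fib)
```

Because each element is looked up on its own, without being prefixed by the preceding elements, the same content name `C` works unchanged in both aggregates.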

6.6 Research Summary

In the research included in this dissertation we have examined the inter-

domain structure of the Internet from multiple vantage points. We started

by looking at the technical problems relating to the architectural chal-

lenges facing the Internet, and moved to the problem of new architecture

deployment. The deployment challenges led us to consider the driving

forces behind inter-domain traffic policies – the incentives by which

network operators run their networks. The Internet was created by, and

its networks are still operated by, passionate professionals. Passion is,

however, insufficient for sustaining the exponential growth of the Internet we

have witnessed during the past two decades. In the end, network oper-

ation is a business, where individual actions take place in a regulatory

framework, but ultimately for profit.

The example of our incentive-informed multicast design clearly shows

how the consideration for inter-domain incentives could – and should –

frame protocol design in the inter-domain ecosystem. Incentives involve

multiple phases of protocol use, starting with deployment incentives, moti-

vated by the overall outlook of the new protocol on network profitability.

Second, there are operative incentives, guiding the use of the protocol in

individual network traffic circumstances. Both of these aspects are equally

important, and only after these comes the concern for scalability. It could

be argued, given the example of BGP, that in the end, scalability also boils

down to an economic concern that becomes relevant only if the operation

could be profitable to begin with.

Furthermore, we have tackled the beneficial use of self-certified naming

in Internet architecture. Naming in general is a foundational aspect of

any network architecture; many problems we have with the Internet today

stem from the addressing model formed in an environment of mutual trust

– trust that does not exist anymore. We have shown that self-certified

naming can solve some of the additional security challenges introduced

by new networking paradigms, such as content-oriented networking. In

the end, this ties back to inter-domain networking as it is the nature

of the global networking community that due to differing political views

and outlooks on life in general, we are bound to have different opinions

of what is legal, appropriate, and moral behavior on the Internet. This

exemplifies the need to separate the concerns of trust between the diverse

user communities, and the mechanics of the inter-domain network we rely

on.

7. Conclusion

“I know that most men – not only those considered clever, but even those who are

very clever and capable of understanding most difficult scientific, mathematical,

or philosophic problems – can very seldom discern even the simplest and most

obvious truth if it be such as to oblige them to admit the falsity of conclusions

they have formed, perhaps with much difficulty – conclusions of which they are

proud, which they have taught to others, and on which they have built their

lives.” (Leo Tolstoy. What is art? [198, p. 131])

In this dissertation we have looked at how incentives have shaped the

inter-domain structure of the Internet, and how they continue to do so in Internet

architecture moving forward. The Internet is arguably the largest man-

made structure in existence. Hence, it is understandable that scalability

has been widely considered the deciding factor in protocol design aiming

at Internet-wide deployment. Internet evolution shows, however, that the

freedom of individual economic entities to function in their own interest,

exemplified by the primacy of the local preference attribute in BGP, is the

leading principle, or value, embedded in the fabric of the Internet.

Make no mistake, scalability is necessary as the Internet continues to

grow, but alone it is not a sufficient metric for determining the fate of new

protocols. Even if it is at times helpful to relax some of the real-world

constraints in order to find something new and fresh, we need to return to

reality to find the possible paths towards the grand vistas we have seen. In

this dissertation we have done both: We have glimpsed the possibilities for

new kinds of inter-domain interconnection, but we have also viewed them

in the light of our limited understanding of the inter-domain ecosystem.

So, as we tackle the challenges that enormous scale has brought us, it is

safe to assume that the fundamental inter-domain structure will change

only ever so slowly, reflecting the pace at which the network owners find

mutually beneficial and cost-effective means to expand their capacity to

serve the seemingly ever-increasing needs of their customers. Sometimes

these opportunities arise as the ways in which the network is used evolve.

So far it seems that content access is the primary use, and it might be

tempting to optimize the network toward this current use. It is the gener-

ality of the best-effort packet service, however, that has brought us this

far, and great caution should be exercised in changing this fundamental

network service model.

Bibliography

[1] Ambient Networks Phase 2. http://cordis.europa.eu/fetch?CALLER=PROJ_IST&ACTION=D&RCN=80709, Jan. 2006.

[2] Publish-Subscribe Internet Routing Paradigm (PSIRP). http://www.psirp.org, Jan. 2008.

[3] Route views peers. http://www.routeviews.org/peers/, Jul. 2009.

[4] Content Delivery Networks Interconnection (cdni). Working Group Charter, IETF, July 2011. http://datatracker.ietf.org/wg/cdni/charter/.

[5] Tier 1 network. http://en.wikipedia.org/wiki/Tier_1_network, November 2011.

[6] D. Adkins, K. Lakshminarayanan, A. Perrig, and I. Stoica. Towards a More Functional and Secure Network Infrastructure. Technical report, UC Berkeley, 2003.

[7] B. Ahlgren, J. Arkko, L. Eggert, and J. Rajahalme. A Node Identity Internetworking Architecture. In INFOCOM 2006. 25th IEEE International Conference on Computer Communications. Proceedings, Global Internet Symposium, Barcelona, Spain, April 2006. IEEE.

[8] B. Ahlgren, L. Eggert, B. Ohlman, J. Rajahalme, and A. Schieder. Names, Addresses and Identities in Ambient Networks. In Proceedings of the 1st ACM workshop on Dynamic Interconnection of Networks, DIN’05, pages 33–37. ACM, September 2005.

[9] D. Alderson, L. Li, W. Willinger, and J. C. Doyle. Understanding Internet Topology: Principles, Models, and Validation. IEEE/ACM Transactions on Networking (TON), 13(6):1205–1218, December 2005.

[10] C. Alexander. Notes on the synthesis of form. Harvard University Press, 1964.

[11] K. C. Almeroth. The Evolution of Multicast: From the MBone to Inter-Domain Multicast to Internet2 Deployment. IEEE Network, 14(1):10–20, Jan. 2000.

[12] D. Andersen, H. Balakrishnan, N. Feamster, T. Koponen, D. Moon, and S. Shenker. Accountable Internet Protocol (AIP). In ACM SIGCOMM’08. Proceedings, Aug. 2008.

[13] T. Anderson, L. Peterson, S. Shenker, and J. Turner. Overcoming the Internet Impasse Through Virtualization. Computer, 38(4):34–41, April 2005. IEEE Computer Society.

[14] R. Arends, R. Austein, M. Larson, D. Massey, and S. Rose. DNS Security Introduction and Requirements. RFC 4033, IETF, Mar. 2005.

[15] K. Argyraki and D. R. Cheriton. Loose source routing as a mechanism for traffic policies. In ACM SIGCOMM FDNA’04. Proceedings, pages 57–64, 2004.

[16] M. Artigas, P. López, and A. Skarmeta. A Comparative Study of Hierarchical DHT Systems. In LCN’07. Proceedings, pages 325–333. IEEE Computer Society, 2007.

[17] M. S. Artigas, P. G. Lopez, J. P. Ahullo, and A. F. G. Skarmeta. Cyclone: A Novel Design Schema for Hierarchical DHTs. In Proceedings of the Fifth IEEE International Conference on Peer-to-Peer Computing, P2P’05, pages 49–56. IEEE Computer Society, September 2005.

[18] D. Artz and Y. Gil. A Survey of Trust in Computer Science and the Semantic Web. Web Semantics: Science, Services and Agents on the World Wide Web, 5(2):58–71, June 2007.

[19] R. Atkinson, S. Bhatti, and S. Hailes. Evolving the Internet Architecture Through Naming. Selected Areas in Communications, IEEE Journal on, 28(8):1319–1325, October 2010.

[20] M. Azinger and L. Vegoda. Issues Associated with Designating Additional Private IPv4 Address Space. RFC 6319 (Informational), IETF, July 2011.

[21] A. Bakker, E. Amade, G. Ballintijn, I. Kuz, P. Verkaik, I. Van der Wijk, M. van Steen, and A. Tanenbaum. The globe distribution network. In Proc. 2000 USENIX Annual Conf. (FREENIX Track), pages 141–152, 2000.

[22] H. Balakrishnan. Wide-Area Internet Routing. http://web.mit.edu/6.033/2009/wwwdocs/assignments/InterdomainRouting.pdf, January 2009.

[23] H. Balakrishnan, K. Lakshminarayanan, S. Ratnasamy, S. Shenker, I. Stoica, and M. Walfish. A Layered Naming Architecture for the Internet. In Proceedings of the 2004 conference on Applications, technologies, architectures, and protocols for computer communications, SIGCOMM’04, pages 343–352. ACM, 2004.

[24] H. Ballani, Y. Chawathe, S. Ratnasamy, T. Roscoe, and S. Shenker. Off by Default! HOTNETS-IV, pages 1–6, Maryland, USA, November 2005. ACM.

[25] H. Ballani, P. Francis, T. Cao, and J. Wang. Making Routers Last Longer with ViAggre. In Proceedings of the 6th USENIX symposium on Networked systems design and implementation, NSDI’09, pages 453–466, Boston, Massachusetts, 2009. USENIX Association.

[26] G. Ballintijn, M. Van Steen, and A. Tanenbaum. Scalable Human-Friendly Resource Names. IEEE Internet Computing, pages 20–27, 2001.

[27] S. Banerjee, B. Bhattacharjee, and C. Kommareddy. Scalable Application Layer Multicast. In Proceedings of the 2002 conference on Applications, technologies, architectures, and protocols for computer communications, SIGCOMM’02, pages 205–217, Pittsburgh, Pennsylvania, USA, August 2002. ACM.

[28] T. Bates, Y. Rekhter, R. Chandra, and D. Katz. Multiprotocol Extensions for BGP-4. RFC 2858, IETF, Jun. 2000.

[29] S. Bhattacharyya (editor). An overview of source-specific multicast (SSM). RFC 3569, IETF, July 2003.

[30] B. H. Bloom. Space/Time Trade-offs in Hash Coding with Allowable Errors. Communications of the ACM, 13(7):422–426, Jul. 1970.

[31] M. S. Blumenthal and D. D. Clark. Rethinking the Design of the Internet: The End-to-End Arguments vs. the Brave New World. ACM Transactions on Internet Technology, 1(1):70–109, August 2001.

[32] D. Bohm and F. D. Peat. Science, Order and Creativity. Routledge, 2nd edition, 2000.

[33] C. Botham, A. Burness, P. Eardley, A. Eriksson, M. Jimenez-Abelleira, L. Loyola, and J. Rajahalme. Inter-Network Routing in Ambient Networks. In Mobile and Wireless Communications Summit, 2007. 16th IST, pages 1–5, July 2007.

[34] R. Braden. Architectural Principles of the Internet. http://www.isi.edu/newarch/DOCUMENTS/rtb.IPAM.mar02.pdf, March 2002.

[35] J. Brutlag, H. Hutchinson, and M. Stone. User Preference and Search Engine Latency. Technical report, Google, Inc, 2007.

[36] M. Caesar, M. Castro, E. B. Nightingale, G. O’Shea, and A. Rowstron. Virtual ring routing: network routing inspired by DHTs. In ACM SIGCOMM’06. Proceedings, pages 351–362, 2006.

[37] M. Caesar, T. Condie, J. Kannan, K. Lakshminarayanan, I. Stoica, and S. Shenker. ROFL: Routing on Flat Labels. In SIGCOMM’06. Proceedings. ACM, September 2006.

[38] M. Caesar and J. Rexford. BGP routing policies in ISP networks. IEEE Network, 19(6):5–11, Nov.-Dec. 2005.

[39] The CAIDA AS Relationships Dataset, August 10th, 2009. http://www.caida.org/data/active/as-relationships/.

[40] B. Carpenter. Internet Transparency. RFC 2775 (Informational), IETF, Feb. 2000.

[41] B. Carpenter and S. Brim. Middleboxes: Taxonomy and Issues. RFC 3234 (Informational), IETF, Feb. 2002.

[42] A. Carzaniga, D. S. Rosenblum, and A. L. Wolf. Content-Based Addressing and Routing: A General Model and its Application. Tech. Report CU-CS-902-00, CU, Jan. 2000.

[43] A. Carzaniga and A. L. Wolf. Content-Based Networking: A New Communication Infrastructure. Lecture Notes in Computer Science, pages 59–68, 2002.

[44] A. Carzaniga and A. L. Wolf. Forwarding in a Content-Based Network. In SIGCOMM’03. Proceedings, pages 163–174. ACM, Aug. 2003.

[45] M. Cha, H. Kwak, P. Rodriguez, Y. Ahn, and S. Moon. I Tube, You Tube, Everybody Tubes: Analyzing the World’s Largest User Generated Content Video System. In ACM SIGCOMM IMC’07. Proceedings, pages 1–14, 2007.

[46] H. Chang, S. Jamin, Z. M. Mao, and W. Willinger. An Empirical Approach to Modeling Inter-AS Traffic Matrices. In ACM SIGCOMM IMC’05. Proceedings, pages 139–152, 2005.

[47] H. Chang, S. Jamin, and W. Willinger. To Peer or Not to Peer: Modeling the Evolution of the Internet’s AS-Level Topology. In IEEE INFOCOM’06. Proceedings, April 2006.

[48] C. Chow. On Multicast Path Finding Algorithms. In INFOCOM’91. Proceedings, pages 1274–1283. IEEE, 1991.

[49] J. C.-I. Chuang and M. A. Sirbu. Pricing Multicast Communications: A Cost Based Approach. In Proceedings of INET, 1998.

[50] D. D. Clark. The design philosophy of the DARPA internet protocols. ACM SIGCOMM CCR, 18(4):106–114, 1988.

[51] D. D. Clark. Policy routing in Internet protocols. RFC 1102, IETF, May 1989.

[52] D. D. Clark. What is “Architecture”? http://find.isi.edu/presentation_files/Dave_Clark-What_is_architecture_4.pdf, November 2005.

[53] D. D. Clark and M. S. Blumenthal. The end-to-end argument and application design: the role of trust. In The 35th Research Conference on Communication, Information and Internet Policy. Proceedings, TPRC, Arlington, Virginia, USA, Sep. 2007.

[54] D. D. Clark, R. Braden, A. Falk, and V. Pingali. FARA: Reorganizing the Addressing Architecture. In SIGCOMM’03. Proceedings. ACM, August 2003.

[55] D. D. Clark, K. Sollins, J. Wroclawski, and T. Faber. Addressing Reality: An Architectural Response to Real-World Demands on the Evolving Internet. In Proceedings of the ACM SIGCOMM workshop on Future directions in network architecture, FDNA’03, pages 247–257, Karlsruhe, Germany, August 2003. ACM.

[56] D. D. Clark, J. Wroclawski, K. Sollins, and R. Braden. Tussle in Cyberspace: Defining Tomorrow’s Internet. Networking, IEEE/ACM Transactions on, 13(3):462–475, June 2005.

[57] C. Corbett. Peering of Video. Presentation at NANOG 37, June 2006.

[58] J. Crowcroft, S. Hand, R. Mortier, T. Roscoe, and A. Warfield. Plutarch: An argument for network pluralism. In Proceedings of the ACM SIGCOMM workshop on Future Directions in Network Architecture, FDNA’03, pages 258–266. ACM, August 2003.

[59] C. Dannewitz, J. Golic, B. Ohlman, and B. Ahlgren. Secure naming for a network of information. In INFOCOM IEEE Conference on Computer Communications Workshops, Global Internet Symposium, pages 1–6, March 2010.

[60] J. Day. Patterns in Network Architecture: A Return to Fundamentals. Prentice-Hall, 2008.

[61] S. Deering. Host extensions for IP multicasting. RFC 1112, IETF, Aug. 1989.

[62] S. Deering. Watching the Waist of the Protocol Hourglass. In TERENA Networking Conference, Invited Talk, Antalya, Turkey, May 2001. http://tnc2001.terena.org/proceedings/SlidesDeering.ppt.

[63] S. Deering, D. Estrin, D. Farinacci, V. Jacobson, C. Liu, and L. Wei. An Architecture for Wide-Area Multicast Routing. In SIGCOMM’94. Proceedings, page 135. ACM, 1994.

[64] S. Deering and R. Hinden. Internet Protocol, Version 6 (IPv6) Specification. RFC 2460 (Draft Standard), IETF, Dec. 1998. Updated by RFCs 5095, 5722, 5871.

[65] S. E. Deering and D. R. Cheriton. Multicast routing in datagram internetworks and extended LANs. ACM Trans. Comput. Syst., 8(2):85–110, 1990.

[66] V. Devarapalli and F. Dupont. Mobile IPv6 Operation with IKEv2 and the Revised IPsec Architecture. RFC 4877 (Proposed Standard), IETF, Apr. 2007.

[67] A. Dhamdhere and C. Dovrolis. Can ISPs be Profitable Without Violating “Network Neutrality”? In Proceedings of the 3rd international workshop on Economics of networked systems, NetEcon’08, pages 13–18, Seattle, WA, USA, 2008. ACM.

[68] A. Dhamdhere and C. Dovrolis. Ten years in the evolution of the Internet ecosystem. In SIGCOMM IMC’08. Proceedings, pages 183–196. ACM, 2008.

[69] S. DiBenedetto, C. Papadopoulos, and D. Massey. Routing Policies in Named Data Networking. In Proceedings of the ACM SIGCOMM workshop on Information-centric networking, ICN ’11, pages 38–43, Toronto, Ontario, Canada, 2011. ACM.

[70] T. Dierks and E. Rescorla. The Transport Layer Security (TLS) Protocol Version 1.2. RFC 5246, IETF, Aug. 2008.

[71] X. Dimitropoulos, P. Hurley, A. Kind, and M. Stoecklin. On the 95-Percentile Billing Method. In S. Moon, R. Teixeira, and S. Uhlig, editors, Passive and Active Network Measurement, volume 5448 of Lecture Notes in Computer Science, pages 207–216. Springer Berlin / Heidelberg, 2009.

[72] C. Diot, B. Levine, B. Lyles, H. Kassem, and D. Balensiefen. Deployment issues for the IP multicast service and architecture. Network, IEEE, 14(1):78–88, Jan/Feb 2000.

[73] R. Droms. Dynamic Host Configuration Protocol. RFC 2131 (Draft Standard), IETF, Mar. 1997. Updated by RFCs 3396, 4361, 5494.

[74] D. Eastlake 3rd and T. Hansen. US Secure Hash Algorithms (SHA and SHA-based HMAC and HKDF). RFC 6234 (Informational), IETF, May 2011.

[75] T. Ehrenkranz and J. Li. On the State of IP Spoofing Defense. ACM Transactions on Internet Technology, 9(2):6:1–6:29, May 2009.

[76] C. Ellison, B. Frantz, B. Lampson, R. Rivest, B. Thomas, and T. Ylönen. SPKI Certificate Theory. RFC 2693, IETF, Sep. 1999.

[77] H. Eriksson. MBONE: The Multicast Backbone. Commun. ACM, 37(8):54–60, 1994.

[78] D. Estrin, M. Handley, A. Helmy, P. Huang, and D. Thaler. A Dynamic Bootstrap Mechanism for Rendezvous-based Multicast Routing. IEEE INFOCOM’99. Proceedings, 3:1090–1098 vol.3, 21-25 Mar 1999.

[79] D. Estrin, T. Li, Y. Rekhter, K. Varadhan, and D. Zappala. Source Demand Routing: Packet Format and Forwarding Specification (Version 1). RFC 1940 (Informational), IETF, May 1996.

[80] P. Eugster, P. Felber, R. Guerraoui, and A. Kermarrec. The Many Faces of Publish/Subscribe. ACM Computing Surveys, 35(2):114–131, 2003.

[81] W. Fang and L. Peterson. Inter-AS Traffic Patterns and Their Implications. In Global Telecommunications Conference, volume 3 of GLOBECOM’99, pages 1869–1868, Dec. 1999.

[82] P. Faratin, D. D. Clark, P. Gilmore, S. Bauer, A. Berger, and W. Lehr. Complexity of Internet Interconnections: Technology, Incentives and Implications for Policy. In TPRC. Proceedings, 2007.

[83] D. Farinacci, V. Fuller, D. Meyer, and D. Lewis. Locator/ID Separation Protocol (LISP). Internet-Draft draft-ietf-lisp-15 (work in progress), IETF, July 2011.

[84] N. Feamster, H. Balakrishnan, and J. Rexford. Some Foundational Problems in Interdomain Routing. In Proceedings of ACM HotNets-III, 2004.

[85] N. Feamster, H. Balakrishnan, J. Rexford, A. Shaikh, and J. v. d. Merwe. The case for separating routing from routers. In ACM SIGCOMM FDNA’04. Proceedings, pages 5–12, 2004.

[86] N. Feamster, J. Borkenhagen, and J. Rexford. Guidelines for interdomain traffic engineering. SIGCOMM Computer Communications Review, 33(5):19–30, October 2003.

[87] J. Feigenbaum, C. Papadimitriou, and S. Shenker. Sharing the Cost of Multicast Transmissions. Journal of Computer and System Sciences, 63(1):21–41, 2001.

[88] J. Feigenbaum, V. Ramachandran, and M. Schapira. Incentive-Compatible Interdomain Routing. Tech. Report 1342, Yale Univ., May 2006.

[89] A. Feldmann. Internet Clean-Slate Design: What and Why? ACM SIGCOMM CCR, 37(3):59–64, 2007.

[90] B. Fenner, M. Handley, H. Holbrook, and I. Kouvelas. Protocol Independent Multicast – Sparse Mode (PIM-SM): Protocol Specification (Revised). RFC 4601, IETF, Aug. 2006.

[91] B. Fenner and D. Meyer. Multicast Source Discovery Protocol (MSDP). RFC 3618, IETF, Oct. 2003.

[92] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, and T. Berners-Lee. Hypertext Transfer Protocol – HTTP/1.1. RFC 2616, IETF, June 1999.

[93] P. Francis and R. Gummadi. IPNL: A NAT-Extended Internet Architecture. In Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications, volume 31 of SIGCOMM’01, pages 69–80, San Diego, CA, USA, August 2001. ACM.

[94] P. Ganesan, K. Gummadi, and H. Garcia-Molina. Canon in G Major: Designing DHTs with Hierarchical Structure. Tech. Rep. 74, Stanford University, 2003.

[95] P. Ganesan, K. Gummadi, and H. Garcia-Molina. Canon in G Major: Designing DHTs with Hierarchical Structure. IEEE Distributed Computing Systems (ICDCS’04). Proceedings, pages 263–272, 2004.

[96] L. Gao. On inferring autonomous system relationships in the Internet. Networking, IEEE/ACM Transactions on, 9(6):733–745, Dec 2001.

[97] L. Gao and J. Rexford. Stable Internet Routing Without Global Coordination. IEEE/ACM Trans. Netw., 9:681–692, December 2001.

[98] L. Gao and F. Wang. The extent of AS path inflation by routing policies. IEEE GLOBECOM’02. Proceedings, Nov. 2002.

[99] A. Ghodsi, T. Koponen, B. Raghavan, S. Shenker, A. Singla, and J. Wilcox. Information-Centric Networking: Seeing the Forest for the Trees. In Proceedings of the Tenth ACM SIGCOMM Workshop on Hot Topics in Networks, Hotnets’11, pages 1–6, Cambridge, MA, November 2011. ACM.

[100] A. Ghodsi, T. Koponen, B. Raghavan, S. Shenker, A. Singla, and J. Wilcox. Intelligent Design Enables Architectural Evolution. In Proceedings of the Tenth ACM SIGCOMM Workshop on Hot Topics in Networks, Hotnets’11, pages 1–6, Cambridge, MA, November 2011. ACM.

[101] A. Ghodsi, T. Koponen, J. Rajahalme, P. Sarolahti, and S. Shenker. Naming in Content-Oriented Architectures. In Proceedings of the ACM SIGCOMM workshop on Information-centric networking, ICN ’11, pages 1–6, Toronto, Ontario, Canada, August 2011. ACM.

[102] P. Gill, M. Arlitt, Z. Li, and A. Mahanti. YouTube Traffic Characterization: A View From the Edge. In ACM SIGCOMM IMC’07. Proceedings, pages 15–28, 2007.

[103] M. Girault. Self-certified public keys. In D. Davies, editor, Advances in Cryptology – EUROCRYPT’91, volume 547 of Lecture Notes in Computer Science, pages 490–497. Springer Berlin / Heidelberg, 1991.

[104] P. B. Godfrey, I. Ganichev, S. Shenker, and I. Stoica. Pathlet Routing. In SIGCOMM’09. Proceedings, pages 111–122. ACM, Aug. 2009.

[105] M. Gritter and D. R. Cheriton. An Architecture for Content Routing Support in the Internet. In USENIX USITS’01. Proceedings, 2001.

[106] M. Handley. Why the Internet only just works. BT Technology Journal, 24(3):119–129, Jul. 2006.

[107] M. Handley and A. Greenhalgh. Steps Towards a DoS-resistant Internet Architecture. In ACM SIGCOMM FDNA’04. Proceedings, 2004.

[108] G. H. Hardy. A mathematician’s apology. Cambridge University Press, 1940.

[109] N. J. A. Harvey, M. B. Jones, S. Saroiu, M. Theimer, and A. Wolman. SkipNet: A Scalable Overlay Network with Practical Locality Properties. In Proceedings of the 4th conference on USENIX Symposium on Internet Technologies and Systems, volume 4 of USENIX USITS, pages 1–14, Seattle, WA, USA, March 2003.

[110] H. Hazlitt. Economics In One Lesson. Harper & Brothers, 1946.

[111] R. Hinden and S. Deering. IP Version 6 Addressing Architecture. RFC 4291, IETF, Feb. 2006.

[112] C. Hopps. Analysis of an Equal-Cost Multi-Path Algorithm. RFC 2992, IETF, Nov. 2000.

[113] G. Huston. Interconnection, Peering, and Settlements. In Proc. Internet Global Summit (INET), Jun. 1999.

[114] G. Huston. Architectural Commentary on Site Multi-homing using a Level 3 Shim. Internet-draft, expired, IETF, July 2005.

[115] G. Huston. Predicting the End of the World. The ISP Column, Internet Society, http://wattle.apnic.net/papers/isoc/2009-05/ipv4model.pdf, May 2009.

[116] G. Huston. The Flat World of BGP. Presentation at RIPE 63, October 2011. http://ripe63.ripe.net/presentations/61-2011-10-31-bgp2011.pdf.

[117] G. Huston. Timeseries of the number of advertized IPv4 ASes in BGP. http://bgp.potaroo.net/as2.0/bgp-as-count.txt, Oct. 21st 2011.

[118] G. Huston and G. Armitage. Projecting future IPv4 router requirements from trends in dynamic BGP behaviour. In ATNAC. Proceedings, Dec. 2006.

[119] C. Intanagonwiwat, R. Govindan, D. Estrin, J. Heidemann, and F. Silva. Directed Diffusion for Wireless Sensor Networking. Networking, IEEE/ACM Transactions on, 11(1):2–16, Feb. 2003.

[120] J. Ioannidis and S. M. Bellovin. Implementing Pushback: Router-Based Defense Against DDoS Attacks. In Network and Distributed System Security Symposium, Proceedings of. The Internet Society, February 2002.

[121] P. Jacob and B. Davie. Technical challenges in the delivery of interprovider QoS. Communications Magazine, IEEE, 43(6):112–118, June 2005.

[122] V. Jacobson. A New Way to look at Networking. Talk at Google, 2006.

[123] V. Jacobson. If a Clean Slate is the solution what was the problem? Stanford‘Clean Slate’ Seminar, Feb. 2006.

[124] V. Jacobson, D. Smetters, J. D. Thornton, M. Plass, N. Briggs, and R. L.Braynard. Networking Named Content. In ACM CoNEXT’09. Proceedings,Dec. 2009.

[125] S. Jiang, D. Guo, and B. Carpenter. An Incremental Carrier-Grade NAT(CGN) for IPv6 Transition. RFC 6264 (Informational), IETF, June 2011.

[126] R. Johari and J. Tsitsiklis. Routing and Peering in a Competitive Internet.Decision and Control, 2004. CDC. 43rd IEEE Conference on, 2:1556–1561Vol.2, Dec. 2004.

[127] D. A. Joseph, N. Shetty, J. Chuang, and I. Stoica. Modeling the Adoptionof new Network Architectures. In Proceedings of the 2007 ACM CoNEXTconference, CoNEXT’07, pages 5:1–5:12, New York, NY, USA, December2007. ACM.

[128] J. Jung, E. Sit, H. Balakrishnan, and R. Morris. DNS Performance and theEffectiveness of Caching. IEEE/ACM Transactions on Networking (TON),10(5):589–603, 2002.

[129] E. Karpilovsky, L. Breslau, A. Gerber, and S. Sen. Multicast Redux: AFirst Look at Enterprise Multicast Traffic. In Proceedings of the 1st ACMworkshop on Research on enterprise networking, WREN’09, pages 55–64,Barcelona, Spain, August 2009. ACM.

[130] D. Katabi. Public Review of “On Compact Routing for the Internet”. http://ccr.sigcomm.org/online/?q=node/227, June 2007.

[131] V. Khare, D. Jen, X. Zhao, Y. Liu, D. Massey, L. Wang, B. Zhang, and L. Zhang. Evolution Towards Global Routing Scalability. Selected Areas in Communications, IEEE Journal on, 28(8):1363–1375, October 2010.

[132] T. Koponen, M. Chawla, B.-G. Chun, A. Ermolinskiy, K. H. Kim, S. Shenker, and I. Stoica. A Data-Oriented (and Beyond) Network Architecture. In SIGCOMM’07. Proceedings, pages 181–192. ACM, 2007.

[133] T. Koponen, S. Shenker, H. Balakrishnan, N. Feamster, I. Ganichev, A. Ghodsi, P. B. Godfrey, N. McKeown, G. Parulkar, B. Raghavan, J. Rexford, S. Arianfar, and D. Kuptsov. Architecting for Innovation. ACM SIGCOMM Computer Communications Review, 41(3):24–36, July 2011.

[134] D. V. Krioukov, K. Claffy, K. Fall, and A. Brady. On Compact Routing for the Internet. ACM SIGCOMM CCR, 37(3):43–52, July 2007.

[135] J. Kubiatowicz, D. Bindel, Y. Chen, S. Czerwinski, P. Eaton, D. Geels, R. Gummadi, S. Rhea, H. Weatherspoon, C. Wells, and B. Zhao. OceanStore: An Architecture for Global-Scale Persistent Storage. In ACM ASPLOS-IX. Proceedings, volume 28, pages 190–201, Dec. 2000.

[136] F. Le, G. G. Xie, D. Pei, J. Wang, and H. Zhang. Shedding Light on the Glue Logic of the Internet Routing Architecture. In Proceedings of the ACM SIGCOMM 2008 conference on Data communication, SIGCOMM’08, pages 39–50, Seattle, WA, USA, 2008. ACM.


[137] F. Le, G. G. Xie, and H. Zhang. Theory and new primitives for safely connecting routing protocol instances. In Proceedings of the ACM SIGCOMM 2010 conference on SIGCOMM, SIGCOMM’10, pages 219–230, New Delhi, India, 2010. ACM.

[138] B. Leiner, V. Cerf, D. D. Clark, R. Kahn, L. Kleinrock, D. Lynch, J. Postel, L. Roberts, and S. Wolff. A brief history of the Internet. ACM SIGCOMM Computer Communication Review, 39(5):22–31, 2009.

[139] J. Lemon. Resisting SYN flood DoS attacks with a SYN cache. In Proceedings of the BSD Conference, BSDC’02, pages 89–97, Berkeley, CA, USA, Feb. 2002. USENIX Association.

[140] T. Li. Recommendation for a Routing Architecture. RFC 6115 (Informational), IRTF, Feb. 2011.

[141] M. Liljenstam, J. Liu, and D. M. Nicol. Development of an Internet Backbone Topology for Large-scale Network Simulations. In Proceedings of the 35th Winter Simulation Conference, WSC’03, pages 694–702, New Orleans, Louisiana, Dec. 2003.

[142] A. Lutu. Game Theory Application to Interdomain Routing. Master’s thesis, University Carlos III of Madrid, Department of Telematics Engineering, September 2010.

[143] P. Maymounkov and D. Mazieres. Kademlia: A peer-to-peer information system based on the XOR metric, 2002.

[144] D. Mazières, M. Kaminsky, M. F. Kaashoek, and E. Witchel. Separating key management from file system security. ACM SIGOPS Oper. Syst. Rev., 33(5):124–139, 1999.

[145] M. McBride, J. Meylor, and D. Meyer. Multicast Source Discovery Protocol (MSDP) Deployment Scenarios. RFC/BCP 4611/121, IETF, Aug. 2006.

[146] X. Meng, Z. Xu, B. Zhang, G. Huston, S. Lu, and L. Zhang. IPv4 Address Allocation and the BGP Routing Table Evolution. SIGCOMM Computer Communications Review, 35(1):71–80, January 2005.

[147] D. Meyer, L. Zhang, and K. Fall. Report from the IAB Workshop on Routing and Addressing. RFC 4984, IETF, Sep. 2007.

[148] P. Mockapetris and K. Dunlap. Development of the Domain Name System. ACM SIGCOMM’88. Proceedings, pages 123–133, Aug. 1988.

[149] T. Moors. A critical review of “End-to-end arguments in system design”. In IEEE International Conference on Communications, volume 2, pages 1214–1219. IEEE, 2002.

[150] R. Moskowitz and P. Nikander. Host Identity Protocol (HIP) Architecture. RFC 4423 (Informational), IETF, May 2006.

[151] G. Nakibly and F. Templin. Routing Loop Attack Using IPv6 Automatic Tunnels: Problem Statement and Proposed Mitigations. RFC 6324 (Informational), IETF, Aug. 2011.

[152] O. Nordström and C. Dovrolis. Beware of BGP attacks. ACM SIGCOMM CCR, 34(2):1–8, 2004.


[153] W. B. Norton. Internet Service Providers and Peering. Draft 2.8, http://www.nanog.org/papers/isp.peering.doc, August 2002.

[154] E. Nygren, R. K. Sitaraman, and J. Sun. The Akamai Network: A Platform for High-Performance Internet Applications. SIGOPS Oper. Syst. Rev., 44:2–19, August 2010.

[155] R. V. Oliveira, D. Pei, W. Willinger, B. Zhang, and L. Zhang. In Search of the Elusive Ground Truth: The Internet’s AS-level Connectivity Structure. SIGMETRICS Perf. Eval. Rev., 36(1):217–228, 2008.

[156] J. Papen. Demystifying BGP. Linux Magazine, May 2003. http://www.linux-mag.com/id/1353/.

[157] V. Pappas, D. Massey, A. Terzis, and L. Zhang. A Comparative Study of the DNS Design with DHT-Based Alternatives. In INFOCOM’06. Proceedings. IEEE, 2006.

[158] K. Park and H. Lee. On the Effectiveness of Route-Based Packet Filtering for Distributed DoS Attack Prevention in Power-Law Internets. In Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications, volume 31 of SIGCOMM’01, pages 15–26, San Diego, CA, USA, Aug. 2001. ACM.

[159] S. Paul, J. Pan, and R. Jain. Architectures for the future networks and the next generation internet: A Survey. Computer Communications, 34:2–42, January 2011.

[160] T. Peng, C. Leckie, and K. Ramamohanarao. Survey of Network-Based Defense Mechanisms Countering the DoS and DDoS Problems. ACM Computing Surveys, 39(1):1–42, April 2007.

[161] D. Perino and M. Varvello. A Reality Check for Content Centric Networking. In Proceedings of the ACM SIGCOMM workshop on Information-centric networking, ICN’11, pages 44–49, Toronto, Ontario, Canada, Aug. 2011. ACM.

[162] C. Perkins. Mobile IP. Communications Magazine, IEEE, 40(5):66–82, May 2002.

[163] L. Popa, A. Ghodsi, and I. Stoica. HTTP as the Narrow Waist of the Future Internet. In Proceedings of the Ninth ACM SIGCOMM Workshop on Hot Topics in Networks, Hotnets’10, pages 6:1–6:6, Monterey, California, Oct. 2010. ACM.

[164] P. Pöyhönen. A Tentative Model for the Volume of Trade Between Countries. Weltwirtschaftliches Archiv, 90:93–100, 1963.

[165] B. Quoitin, C. Pelsser, L. Swinnen, O. Bonaventure, and S. Uhlig. Inter-domain traffic engineering with BGP. Communications Magazine, IEEE, 41(5):122–128, May 2003.

[166] B. Raghavan, P. Verkaik, and A. C. Snoeren. Secure and Policy-Compliant Source Routing. IEEE/ACM Transactions on Networking (TON), 17(3):764–777, June 2009.


[167] J. Rajahalme. Incentive-Informed Inter-Domain Multicast. In INFOCOM IEEE Conference on Computer Communications Workshops, Global Internet Symposium, San Diego, CA, USA, March 2010.

[168] J. Rajahalme, A. Conta, B. Carpenter, and S. Deering. IPv6 Flow Label Specification. RFC 3697 (Proposed Standard), IETF, Mar. 2004.

[169] J. Rajahalme, M. Särelä, P. Nikander, and S. Tarkoma. Incentive-Compatible Caching and Peering in Data-Oriented Networks. In Proceedings of the 2008 ACM CoNEXT Conference, ReArch’08 workshop, Madrid, Spain, August 2008. ACM.

[170] J. Rajahalme, M. Särelä, K. Visala, and J. Riihijärvi. On name-based inter-domain routing. Computer Networks, Special Issue on Architectures and Protocols for the Future Internet, 55(4):975–986, March 2011. Elsevier.

[171] V. Ramasubramanian and E. G. Sirer. The Design and Implementation of a Next Generation Name Service for the Internet. In SIGCOMM’04. Proceedings, volume 34, pages 331–342. ACM, Oct. 2004.

[172] V. S. Ramasubramanian. Cost-Aware Resource Management for Decentralized Internet Services. PhD thesis, Cornell University, Jan. 2007.

[173] S. Ratnasamy, A. Ermolinskiy, and S. Shenker. Revisiting IP Multicast. In Proceedings of the 2006 conference on Applications, technologies, architectures, and protocols for computer communications, SIGCOMM’06, pages 15–26, Pisa, Italy, 2006. ACM.

[174] S. Ratnasamy, S. Shenker, and S. McCanne. Towards an Evolvable Internet Architecture. In ACM SIGCOMM’05. Proceedings, pages 313–324, 2005.

[175] Y. Rekhter and T. Li. A Border Gateway Protocol (BGP-4). RFC 1771, IETF, Mar. 1995.

[176] G. Rétvári, A. Gulyás, Z. Heszberger, M. Csernai, and J. J. Bíró. Compact Policy Routing. In Proceedings of the 30th annual ACM SIGACT-SIGOPS symposium on Principles of distributed computing, PODC’11, pages 149–158, San Jose, California, USA, 2011. ACM.

[177] S. Rhea, B. Godfrey, B. Karp, J. Kubiatowicz, S. Ratnasamy, S. Shenker, I. Stoica, and H. Yu. OpenDHT: a public DHT service and its uses. In SIGCOMM’05. Proceedings, pages 73–84. ACM, 2005.

[178] R. Rivest and B. Lampson. SDSI – A Simple Distributed Security Infrastructure. Technical report, MIT, 1996.

[179] A. Rowstron and P. Druschel. Pastry: Scalable, Decentralized Object Location and Routing for Large-Scale Peer-to-Peer Systems. In Proceedings of Middleware 2001, pages 329–350. Springer, Nov. 2001.

[180] J. Saltzer. On the Naming and Binding of Network Destinations. RFC 1498 (Informational), IETF, Aug. 1993.

[181] J. Saltzer, D. Reed, and D. D. Clark. End-to-end arguments in system design. Computer Systems, ACM Transactions on, 2(4):277–288, Nov. 1984.


[182] S. Savage, T. Anderson, A. Aggarwal, D. Becker, N. Cardwell, A. Collins, E. Hoffman, J. Snell, A. Vahdat, G. Voelker, and J. Zahorjan. Detour: Informed Internet Routing and Transport. Micro, IEEE, 19(1):50–59, Jan/Feb 1999.

[183] P. Savola. Overview of the Internet Multicast Routing Architecture. RFC 5110, IETF, Jan. 2008.

[184] Y. Shavitt and N. Zilberman. A Structural Approach for PoP Geo-Location. In INFOCOM IEEE Conference on Computer Communications Workshops, NetSciCom’10, pages 1–6, Mar. 2010.

[185] A. Singla, P. B. Godfrey, K. Fall, G. Iannaccone, and S. Ratnasamy. Scalable Routing on Flat Names. In Proceedings of the 6th International COnference, Co-NEXT ’10, pages 20:1–20:12. ACM, 2010.

[186] D. Smetters and V. Jacobson. Securing Network Content. Tech. report, Palo Alto Research Center, Oct. 2009.

[187] B. M. Sousa, K. Pentikousis, and M. Curado. Multihoming management for future networks. Mobile Networks and Applications, 16(4):505–517, August 2011.

[188] N. Spring, R. Mahajan, D. Wetherall, and T. Anderson. Measuring ISP Topologies with Rocketfuel. Networking, IEEE/ACM Transactions on, 12(1):2–16, February 2004.

[189] P. Srisuresh and M. Holdrege. IP Network Address Translator (NAT) Terminology and Considerations. RFC 2663 (Informational), IETF, Aug. 1999.

[190] J. D. Sterman. All models are wrong: reflections on becoming a systems scientist. System Dynamics Review, 18(4):501–531, 2002.

[191] I. Stoica, R. Morris, D. Liben-Nowell, D. R. Karger, M. F. Kaashoek, F. Dabek, and H. Balakrishnan. Chord: a scalable peer-to-peer lookup protocol for internet applications. IEEE/ACM Trans. Netw., 11(1):17–32, 2003.

[192] I. Stoica, T. Ng, and H. Zhang. REUNITE: A Recursive Unicast Approach to Multicast. In IEEE INFOCOM, volume 3, pages 1644–1653, 2000.

[193] L. Subramanian, S. Agarwal, J. Rexford, and R. Katz. Characterizing the Internet Hierarchy from Multiple Vantage Points. In INFOCOM 2002. Twenty-First Annual Joint Conference of the IEEE Computer and Communications Societies. Proceedings. IEEE, volume 2, pages 618–627, November 2002.

[194] L. Subramanian, M. Caesar, C. T. Ee, M. Handley, M. Mao, S. Shenker, and I. Stoica. HLP: A Next Generation Inter-domain Routing Protocol. In SIGCOMM’05. Proceedings, Aug. 2005.

[195] H. Tangmunarunkit, R. Govindan, S. Shenker, and D. Estrin. The Impact of Routing Policy on Internet Paths. IEEE INFOCOM’01. Proceedings, 2:736–742 vol.2, 2001.


[196] S. Tarkoma and M. Antikainen. Canopy: Publish/Subscribe with Upgraph Combination. In INFOCOM IEEE Conference on Computer Communications Workshops, Global Internet Symposium, pages 1–6, San Diego, CA, USA, March 2010.

[197] S. Thomson, T. Narten, and T. Jinmei. IPv6 Stateless Address Autoconfiguration. RFC 4862 (Draft Standard), IETF, Sept. 2007.

[198] L. N. Tolstoy. What is art? Translated by A. Maude. Thomas Y. Crowell & Co., New York, 1899.

[199] D. Trossen, M. Särelä, and K. Sollins. Arguments for an Information-Centric Internetworking Architecture. ACM SIGCOMM Computer Communication Review, 40(2):27–33, April 2010.

[200] J. Ubillos, M. Xu, Z. Ming, and C. Vogt. Name-Based Sockets Architecture. Internet-Draft draft-ubillos-name-based-sockets-03 (work in progress), IETF, September 2010.

[201] P. Vixie, S. Thomson, Y. Rekhter, and J. Bound. Dynamic Updates in the Domain Name System (DNS UPDATE). RFC 2136, IETF, Apr. 1997.

[202] D. Waddington and F. Chang. Realizing the transition to IPv6. Communications Magazine, IEEE, 40(6):138–147, Jun 2002.

[203] M. Walfish, H. Balakrishnan, and S. Shenker. Untangling the Web from DNS. In USENIX NSDI’04. Proceedings, 2004.

[204] M. Walfish, J. Stribling, M. Krohn, H. Balakrishnan, R. Morris, and S. Shenker. Middleboxes no longer considered harmful. In Proceedings of the 6th conference on Symposium on Operating Systems Design & Implementation - Volume 6, pages 215–229, Berkeley, CA, USA, 2004. USENIX Association.

[205] M. Walfish, M. Vutukuru, H. Balakrishnan, D. Karger, and S. Shenker. DDoS Defense by Offense. ACM Transactions on Computer Systems, 28(1):3:1–3:54, March 2010.

[206] Y. Wang, L. Qiu, D. Achlioptas, G. Das, P. Larson, and H. Wang. Subscription Partitioning and Routing in Content-based Publish/Subscribe Systems. In DISC’02. Proceedings, 2002.

[207] D. Wendlandt, I. Avramopoulos, D. Andersen, and J. Rexford. Don’t Secure Routing Protocol, Secure Data Delivery. In HotNets-V. Proceedings, pages 7–12, 2006.

[208] W. Willinger, D. Alderson, J. Doyle, and L. Li. More Normal than Normal: Scaling Distributions and Complex Systems. In Proceedings of the 2004 Winter Simulation Conference, pages 130–141, Washington, D.C., Dec. 2004.

[209] H. Xie, Y. R. Yang, A. Krishnamurthy, Y. Liu, and A. Silberschatz. P4P: Provider Portal for Applications. In ACM SIGCOMM’08. Proceedings, pages 351–362, August 2008.

[210] X. Yang, D. D. Clark, and A. W. Berger. NIRA: A New Inter-Domain Routing Architecture. Networking, IEEE/ACM Transactions on, 15(4):775–788, Aug. 2007.


[211] X. Yang, D. Wetherall, and T. Anderson. TVA: A DoS-Limiting Network Architecture. IEEE/ACM Transactions on Networking (TON), 16(6):1267–1280, December 2008.

[212] K. Yoshida, Y. Kikuchi, M. Yamamoto, Y. Fujii, K. Nagami, I. Nakagawa, and H. Esaki. Inferring pop-level isp topology through end-to-end delay measurement. In Proceedings of the 10th International Conference on Passive and Active Network Measurement, PAM’09, pages 35–44. Springer-Verlag, 2009.

[213] T. Zahariadis, D. Papadimitriou, H. Tschofenig, S. Haller, P. Daras, G. Stamoulis, and M. Hauswirth. Towards a Future Internet Architecture. In The Future Internet, volume 6656 of Lecture Notes in Computer Science, pages 7–18. Springer Berlin, Heidelberg, 2011.

[214] R. Zhang and Y. C. Hu. Borg: A Hybrid Protocol for Scalable Application-Level Multicast in Peer-to-Peer Networks. In NOSSDAV’03. Proceedings, pages 172–179. ACM, Jun. 2003.

[215] B. Y. Zhao, J. D. Kubiatowicz, and A. D. Joseph. Tapestry: An Infrastructure for Fault-tolerant Wide-area Location and Routing. Technical report, UC Berkeley, Apr. 2001.


Index

Abstractions, network, 22

Autonomous system, 9

    Dataset, 42

BGP, 29

    Commons, 21

    Local preference, 31

    Route selection, 31

Clean slate, 33

Compact routing, 34

Deepest match, 57, 69

Default-free zone, DFZ, 21

Denial of Service, 21

DHT, 25

    Hierarchical, 27

Distributed hash tables, 25

Domain, 9

DoS, 21

Downgraph, 15

Downhill, 15, 46, 59, 61

End-to-end argument, the, 19

End-to-end transparency, loss of, 20

Explicit aggregation, 69

Incentive, 10

Incentives, path selection, 16

Inter-domain, 11

Intra-domain, 11

Local transit, 15

Locator domain, 56

Middlebox, 21

Multicast, 31

NAT, 20

Network address translation, 20

Node identifier, 56

Overlay, 25

Path inflation, 17

Pathlet, 15

Pathlet routing, 23

Peering, 10, 12, 42

    Links missing, 42

Policy, routing, 11, 15

Publish/Subscribe, 36

Rendezvous Network, 65

Routing policy, 11

Scope, 65

Self-certified naming, 39

Sibling, 11

Source routing, 35


Stretch, 41

Tier-1 domain, 12

Transit, 10, 12, 42

Trust, 21

Upgraph, 15, 46, 48

Uphill, 15, 46, 59, 61

Valley-free path, 46

    Downhill, 15, 46, 59, 61

    Model, 14, 46

    Uphill, 15, 46, 59, 61
