NANOG17 – Distributed Caching in Large IP Environments
Adrian Chadd & Andrew Khoo - 10th October 1999
Distributing Caching on Large IP Networks
Adrian Chadd & Andrew Khoo
Versatel Telecom – Amsterdam NL
Why Cache?
• Average client bandwidth growth is exceeding backbone capacity growth
• The internet is the ‘unknown’ factor – congestion problems outside your control
Caching Problems Today
• Bad server implementations of HTTP/1.1
• Bad implementation of dynamic content timeouts
• Some websites use client IP as part of authentication
Inter-cache communication
ICP
• Very basic method of inter-cache communication
• Only HIT/MISS information is propagated between caches
• Connectionless (UDP) based
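To make the "only HIT/MISS" point concrete, here is a minimal sketch of an ICP version 2 query datagram, following the header layout in RFC 2186. The helper names `build_icp_query` and `parse_icp_opcode` are ours for illustration (they come from neither the talk nor Squid), and the code only builds and parses bytes; it does not contact a real cache.

```python
import socket
import struct

# Opcodes and version from RFC 2186 (ICP version 2).
ICP_OP_QUERY = 1
ICP_OP_HIT = 2
ICP_OP_MISS = 3
ICP_VERSION = 2

# 20-byte header: opcode, version, message length, request number,
# options, option data, sender host address.
ICP_HEADER = struct.Struct("!BBHIII4s")


def build_icp_query(url, request_number, requester_ip="0.0.0.0"):
    """Build an ICP_OP_QUERY datagram for url (illustrative helper)."""
    # Query payload: 4-byte requester address, then the NUL-terminated URL.
    payload = socket.inet_aton(requester_ip) + url.encode("ascii") + b"\x00"
    length = ICP_HEADER.size + len(payload)
    header = ICP_HEADER.pack(ICP_OP_QUERY, ICP_VERSION, length,
                             request_number, 0, 0, b"\x00" * 4)
    return header + payload


def parse_icp_opcode(datagram):
    """Return (opcode, request_number) from an ICP datagram."""
    opcode, _version, _length, reqnum, _opts, _optdata, _sender = \
        ICP_HEADER.unpack_from(datagram)
    return opcode, reqnum
```

A peer's answer is matched to the query by the echoed request number, and the opcode (for example `ICP_OP_HIT` or `ICP_OP_MISS`) is essentially all the information it carries. Sending the datagram with a UDP socket is left out on purpose.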
Cache Digests
• Utilizes a ‘bloom filter’ to determine caches with higher probabilities of having given objects
• Periodic exchanges of digests between caches
• Not Exact! Miss = 0% chance of having object, Hit = >95% chance of having object
• Work ‘in progress’
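The "Miss = 0%, Hit = probably" behaviour follows directly from how a Bloom filter works. The sketch below is a toy filter in Python, not Squid's actual digest format: the class name `BloomDigest`, the bit-array size and the way bit positions are derived from one MD5 hash are all assumptions chosen for illustration.

```python
import hashlib


class BloomDigest:
    """Toy Bloom filter standing in for a cache digest (not Squid's format)."""

    def __init__(self, size_bits=8192, num_hashes=4):
        if not 1 <= num_hashes <= 4:
            # One 16-byte MD5 digest yields at most four 4-byte chunks.
            raise ValueError("num_hashes must be between 1 and 4")
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions from one MD5 digest of the key.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        for i in range(self.num_hashes):
            chunk = digest[4 * i:4 * i + 4]
            yield int.from_bytes(chunk, "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "probably present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))
```

A cache would add a key for each stored object and ship `bits` to its peers periodically. A peer that sees `might_contain(...)` return False can skip that cache with certainty; a True result can still be a false positive, and the digest goes stale between exchanges.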
Caching a large network – a case study
Cache Objectives
• Primary goal: Improve client response times
• Secondary goal: Reduce bandwidth usage
Backbone Europe
[Figure: European backbone map. Labels that survive extraction: Amsterdam, LHR, BRU, CDG, FRA; AMSIX, LINX, SFINX, PARIX, DCIX, B-NIX; level3, L3, VT, M2; link labels 100, 2x100, 155, 155+155, 155Mbps, 622, T3-ATM.]
Backbone US
[Figure: US backbone map. Labels that survive extraction: AMS, NYC, ORD, DC, PAIX, MAEWest, MAEEast, AADS, CIX; "Backup transit to 'Internet'"; link labels 155Mbps, 100Mbps, 45Mbps, 100T, FDDI.]
Cache Backbone
[Figure: cache sites in New York, Amsterdam, London, Brussels, Paris and Frankfurt, grouped as United States and Europe; link labels OC3/STM-1, 2 x OC3/STM-1, OC12/STM-4.]
Each POP has a single cache cluster
Backbone POP Design
[Figure: POP layout with Routing Path #1 and Routing Path #2. Devices: Cisco GSR12008 (core router), Juniper M40 (core router), Cisco 7206vxr (aggregators #1 and #2), Cisco 2948 and Cisco Cat5505 (100T cust attach); link labels 155Mbps POS, 100baseT.]
NL Core Network Topology
[Figure: Netherlands core network. Labels that survive extraction: sites NIKHEF, SARA, VERSATOWER, MATRIX-2; routers M40, GSR, 7206vxr, 7204, 7202; "4-way 155Mbps router mesh"; external links AMS-IX #1, AMS-IX #2, 100bT to Carrier1, 155Mbps to Svianed #1, 155Mbps to Svianed #2, 155Mbps to NYC; platforms Free Internet/ISP Platform, Dial Access Platform, Business Internet (via ATM), Data Operations; link labels 45Mbps, 155Mbps, 100bT, GigabitEthernet.]
Single Cache Cluster
[Figure: cluster diagram showing Clients, Frontend 1, Frontend 2, Frontend 3, Backend 1, Backend 2, Backend 3 and the Internet.]
Single Cache Cluster (2)
• Frontends handle client requests
• Backends handle server requests
• Frontends ICP each other
• Each backend exports a cache digest to each frontend
• Backends do NOT communicate
Hierarchy construction
Problems with conventional hierarchies
• Conventional hierarchies use static or domain based parent/peer selection
• Network/server failures can affect client response times
• Limited network topology intelligence through a ‘whois’ interface
How to ‘route’ cache peer selection
• Using transparent redirection at each border, let your network route requests for you
• Network failures are handled correctly
• Method is performance-friendly, but not optimally bandwidth-friendly
Cache pre-population
• Cache pre-population WORKS!
• Parse ‘friendly’ provider logs from timezones ahead of your own
• Walk ‘popular’ websites before sunrise
• Parse your own logs and issue If-Modified-Since (IMS) requests
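As a rough illustration of the log-driven approach, the sketch below picks the most requested URLs out of access-log lines and builds (but does not send) conditional GETs for them. It assumes Squid's native access.log layout, with the method in the sixth field and the URL in the seventh; the function names `popular_urls` and `ims_request` are hypothetical and not part of any tool mentioned in the talk.

```python
from collections import Counter
from email.utils import formatdate
from urllib.request import Request


def popular_urls(log_lines, top_n=100):
    """Return the top_n most requested GET URLs from access-log lines."""
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        # Assumed layout: time elapsed client code/status bytes method URL ...
        if len(fields) >= 7 and fields[5] == "GET":
            counts[fields[6]] += 1
    return [url for url, _ in counts.most_common(top_n)]


def ims_request(url, last_fetched_epoch):
    """Build (but do not send) a conditional GET for url."""
    # HTTP dates are expressed in GMT, e.g. "Thu, 01 Jan 1970 00:00:00 GMT".
    stamp = formatdate(last_fetched_epoch, usegmt=True)
    return Request(url, headers={"If-Modified-Since": stamp})
```

Fetching these requests through the local cache ahead of the morning peak would revalidate or refresh the objects before clients ask for them. Rate limiting and the choice of `top_n` are left to the operator.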
Melting squid caches
Melting squid caches – filesystem performance
• Directory seek overhead
• No concept of object relationships / locality, defeating ‘read-ahead’
• Current implementation uses threads to fake async file access
• Disks are still linear :-)
Melting squid caches – IFS
• IFS – “inode filesystem”
• Exporting flat inode namespace to squid
• Not optimal by far, but relaxes disk seek and memory usage
• Integration into squid is under way
Melting squid caches – poor network performance
• Using larger buffer sizes on outgoing requests
• Reduce the TIME_WAIT period to cycle through sockets quicker
• QoS on your internal network
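The buffer-size bullet can be sketched at the socket level. The example below requests larger send and receive buffers on a TCP socket before it connects; the 256 KB figure and the helper name `make_outgoing_socket` are assumptions for illustration, not values from the talk. TIME_WAIT tuning is an operating-system setting and is not shown.

```python
import socket


def make_outgoing_socket(send_buf=256 * 1024, recv_buf=256 * 1024):
    """Create a TCP socket with larger buffers requested before connect()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The kernel may clamp or round these values, so treat them as requests.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, send_buf)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, recv_buf)
    return sock
```

Reading the values back with `getsockopt` shows what the kernel actually granted, which can differ from what was requested depending on system limits.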
Melting squid caches – ICP
• Dropped ICP packets result in noticeable client delays
• ICP not suitable for busy WAN links
• Cache digests are a ‘solution in progress’
• Cache the ICP replies
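The slide does not say how ICP replies were cached. One plausible reading is a short-lived table of recent per-peer answers, so that a repeated request need not wait on another UDP round trip. The sketch below shows that idea in Python; the class name `IcpReplyCache` and the 30-second lifetime are assumptions.

```python
import time


class IcpReplyCache:
    """Remember recent per-peer HIT/MISS answers for a short time."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # (peer, url) -> (is_hit, expiry time)

    def store(self, peer, url, is_hit):
        self._entries[(peer, url)] = (is_hit, self.clock() + self.ttl)

    def lookup(self, peer, url):
        """Return True/False for a fresh cached answer, or None if unknown."""
        entry = self._entries.get((peer, url))
        if entry is None:
            return None
        is_hit, expires = entry
        if self.clock() >= expires:
            # Stale answer: forget it so the caller sends a new ICP query.
            del self._entries[(peer, url)]
            return None
        return is_hit
```

The trade-off is staleness: a cached HIT can point at an object the peer has since evicted, so the lifetime should stay short relative to the peer's object turnover.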
Melting squid caches – other problems
• Proprietary protocols (e.g. WebDAV)
• Some browsers (MSIE) have a habit of issuing non-cache-friendly requests
• Issues with transparent redirection – e.g. Path MTU Discovery, IP-based authentication
Caching – the next level
• True object-based caching, instead of HTTP/FTP only
• True clustering support, providing a single logical cache to the network
• Smart ‘predictive’ caches adapting to client usage patterns
• Video/Audio stream caching
Questions and Answers
The End