Multicache-Based Content Management
for Web Caching
Kai Cheng and Yahiko Kambayashi
Graduate School of Informatics, Kyoto University
Kyoto JAPAN
WISE'2000 (C)[email protected] 2
Outline of the Presentation
• Introduction
  – Why Content Management
  – Contributions of Our Work
• Multicache-Based Content Management
• Content Management Scheme for LRU-SP
• Experimental Evaluation
• Concluding Remarks
1.1. Why Content Management
[Figure: User, Network, and Servers, with flows labeled ①, ②, ③, ④]
Maximize Hit Rates (r = ② / ①) (or Weighted HR)
Can the Web Do Without Caching?
• Bandwidth Scarcity = Weakest Part
  – Unrealistic to Update All Resources
• "Hot-Spot" Servers
  – Unpredictable Server Overload
• Inherent Latency = Distance / Light Speed
  – Even With Sufficient Bandwidth and Server Capacity
  – Transoceanic Data Transfer: 200 ms to 300 ms
Caching Is Necessary To Adaptively Reduce Remote Data Requests
1.2. Why Content Management
Traditional Caching | Web Caching         | Implications
Process-Oriented    | Human-User-Oriented | User Preferences
System-Level        | Application-Level   | Semantic Information
Data-Block-Based    | Document-Based      | Varying Sizes, Types
Memory-Based        | Disk-Based          | Persistent Storage, Large Size
Replacement policies based on empirical formulas have difficulty dealing with these!
Deploying Content Management
• To Support
  – Larger Cache Space
  – Sophisticated Control Logic
• To Support
  – Sophisticated Replacement Policies With
    • User-Oriented Performance Metrics
    • Document Treated as Semantic Unit
1.3. Contributions of This Work
• A Multicache Architecture for Implementing Sophisticated Content Management, Including a New Cache Definition
• A Study of Content Management for LRU-SP
• Simulations to Compare LRU-SP Against Others
Previous Work
• Classifications in Approximate Implementations of Complicated Caching Schemes
  – LRV, LNC-W3-U, etc.
• Segmentation in Traditional Caching As Tradeoffs Between Performance and Complexity
  – Segmented FIFO, FBR, 2Q, etc.
• Disadvantages
  – Both Are Built-in, Ad Hoc Implementations Rather Than an Independent Mechanism
  – Cannot Support Sophisticated Category- or Semantic-Based Classification
Managing LFU Contents in Multiple Priority Queues
[Figure: three priority queues for reference counts 1, 2, and >2, each kept in First In First Out order, with "Hit" and "Outs" labels along the References axis. Cached objects with their reference counts: A(10) B(8) C(6) D(3) E(2) F(2) F(1) G(1) H(1)]
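The idea of keeping LFU contents in a few FIFO queues can be sketched roughly as follows. This is a minimal sketch under assumptions the slide does not state: a hit promotes an object to the next queue, the victim is the oldest object of the lowest non-empty queue, and capacity is counted in objects. The class name `QueueLFU` and its methods are hypothetical.

```python
from collections import OrderedDict


class QueueLFU:
    """Approximate LFU with FIFO queues for reference counts 1, 2, and >2 (illustrative)."""

    def __init__(self, capacity, levels=3):
        self.capacity = capacity                                # assumed: number of objects
        self.queues = [OrderedDict() for _ in range(levels)]    # queues[0] holds count 1
        self.level = {}                                         # key -> queue index

    def access(self, key):
        """Reference `key`; return True on a hit, False on a miss."""
        hit = key in self.level
        if hit:
            lvl = self.level[key]
            count = self.queues[lvl].pop(key) + 1
            lvl = min(lvl + 1, len(self.queues) - 1)   # assumed: promote on every hit
        else:
            if len(self.level) >= self.capacity:
                self._evict()
            count, lvl = 1, 0
        self.queues[lvl][key] = count                  # appended at the tail (newest)
        self.level[key] = lvl
        return hit

    def _evict(self):
        for queue in self.queues:                      # lowest-count queue first
            if queue:
                key, _ = queue.popitem(last=False)     # head of the queue = oldest entry
                del self.level[key]
                return key
        return None
```

No sorting by reference count is needed: each hit or eviction touches only queue heads and tails.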
Cache Components
• Space
  – Limited Storage Space
• Contents
  – Objects Selected for Caching
• Policies
  – Replacement Policies
• Constraints
  – Special Conditions
[Figure: a cache made up of Space, Contents, Policies, and Constraints]
Constraints for Cache
• Admission Constraints
  – Define Conditions for Objects Eligible for Caching
    e.g. (size < 2MB) && !(Source = local)
• Freshness Constraints
  – Define Conditions for Objects Fresh Enough for Re-Use
    e.g. (Type = news) && (Last-Modified < 1 week)
• Miscellaneous Constraints
    e.g. (Time = end-of-day), (Total-Size < 95% * Cache-Size)
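The example constraints above could be written as plain predicates. This is a minimal sketch: the `WebObject` fields and the function names are hypothetical, 1 MB is taken as 1024 * 1024 bytes, and "Last-Modified < 1 week" is read as "modified less than one week ago". The end-of-day condition is left out because the slide does not say what it triggers.

```python
from dataclasses import dataclass

MB = 1024 * 1024               # assumed meaning of "MB"
WEEK_SECONDS = 7 * 24 * 3600


@dataclass
class WebObject:               # hypothetical record for a cached document
    url: str
    size: int                  # bytes
    source: str                # "local" or "remote"
    doc_type: str              # e.g. "news"
    age_seconds: float         # time elapsed since Last-Modified


def admissible(obj):
    """Admission: (size < 2MB) && !(Source = local)."""
    return obj.size < 2 * MB and obj.source != "local"


def fresh_news(obj):
    """Freshness: (Type = news) && (Last-Modified < 1 week)."""
    return obj.doc_type == "news" and obj.age_seconds < WEEK_SECONDS


def within_space_budget(total_size, cache_size):
    """Miscellaneous: Total-Size < 95% * Cache-Size."""
    return total_size < 0.95 * cache_size
```

Keeping each constraint as a separate predicate lets the router combine them per request without touching the replacement policy.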
Multicache Architecture
[Figure: Web Cache With Multiple Subcaches. A Central Router handles Request/Response traffic between the Client and the Web Servers, and is connected to the In-Cache index, the Constraints, the Cache Knowledge Base (CKB), the Subcaches, and the Judge.]
Components of the Architecture
• Central Router
  – Controls and Mediates the Cache
• Cache Knowledge Base (CKB)
  – A Set of Rules To Allocate Objects
    R1. Allocate(X, 1) :- url(X, U), match(U, *.jp), content(X, baseball)
• Subcaches
  – Caches for Keeping Objects With Special Properties
• Cache Judge
  – Makes Final Decisions From a Set of Eviction Candidates
The Procedural Description
The Central Router services each request. Suppose the current request is for document p:
1. Locate p using the In-cache Index
2. If p is not in cache, download p;
   i. Validate Constraints; if false, loop;
   ii. Fire rules in CKB, let subcache ID = K;
   iii. While there is not enough space in subcache K for p
      – Subcache K selects an eviction candidate;
      – If space is shared, the other subcaches do the same;
      – Judge assesses the eviction candidates;
      – Purge the victim;
   iv. Cache p in subcache K
3. If p is in a subcache, do i) - iv) to re-cache p.
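The steps above can be sketched in a few dozen lines. This is a minimal sketch, not the authors' implementation: the class `Multicache`, the `fetch` callback (assumed to return a document's size in bytes), and the `key`/`size`/`nref` fields are all hypothetical. It further assumes that every subcache is an LRU list, that all subcaches share one space budget, and that "loop" in step i means the document is simply not cached.

```python
from collections import OrderedDict


class Multicache:
    """Toy central router; every name here is illustrative, not taken from the slides."""

    def __init__(self, capacity, allocate, judge, admit=lambda doc: True):
        self.capacity = capacity   # bytes shared by all subcaches (space sharing assumed)
        self.used = 0
        self.allocate = allocate   # stands in for the CKB rules: doc -> subcache ID
        self.judge = judge         # picks one victim from the eviction candidates
        self.admit = admit         # stands in for the constraints
        self.subcaches = {}        # subcache ID -> OrderedDict(key -> doc), LRU first
        self.index = {}            # in-cache index: key -> subcache ID

    def request(self, key, fetch):
        """Serve one request for `key`; return True on a hit, False on a miss."""
        hit = key in self.index                       # step 1: look up the index
        if hit:                                       # step 3: take p out, then re-cache
            doc = self._remove(key)
            doc["nref"] += 1
        else:                                         # step 2: download p
            doc = {"key": key, "size": fetch(key), "nref": 1}
        self._cache(doc)
        return hit

    def _cache(self, doc):
        # i. validate constraints; an object larger than the whole cache is also skipped
        if not self.admit(doc) or doc["size"] > self.capacity:
            return
        k = self.allocate(doc)                        # ii. fire the rules
        while self.used + doc["size"] > self.capacity:    # iii. make room
            # each non-empty subcache proposes its least recently used object
            candidates = [next(iter(sub.values()))
                          for sub in self.subcaches.values() if sub]
            self._remove(self.judge(candidates)["key"])   # judge decides, victim is purged
        self.subcaches.setdefault(k, OrderedDict())[doc["key"]] = doc   # iv. cache p
        self.index[doc["key"]] = k
        self.used += doc["size"]

    def _remove(self, key):
        doc = self.subcaches[self.index.pop(key)].pop(key)
        self.used -= doc["size"]
        return doc
```

Because the allocation rule and the judge are passed in as functions, the same router can host different policies by swapping those two arguments.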
Content Management for LRU-SP
• LRU (Least Recently Used)
  – Primarily Designed for Equal-Sized Objects, and Only Recency of Reference Is Used
• Extended LRUs
  – Size-Adjusted LRU (SzLRU)
  – Segmented LRU (SgLRU)
• LRU-SP (Size-Adjusted and Popularity-Aware LRU)
  – Makes SzLRU Aware of Popularity Degree
Probability of Re-Reference As a Function of Current Reference Times
[Figure: probability (0 to 0.9) plotted against the current number of references (1 to 15); chart labels: "Next Reference", "Next K References After First"]
Cost-to-Size Ratio Model
• An Object A In Cache Saves Cost nref * (1/atime)
– nref is the frequency of reference
– atime is the time since last access, (1/atime) is the dynamic frequency of A
• When Put In Cache, It Takes Up Space size
  – Cost-to-size ratio = nref / (size * atime)
• The Object With Least Ratio Is Least Beneficial One
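A small worked example of the ratio, using made-up objects: the names A, B, C and all numbers below are illustrative only, with size in bytes and atime in seconds as assumed units.

```python
def cost_to_size_ratio(nref, size, atime):
    """nref / (size * atime), as defined on the slide; atime must be positive."""
    return nref / (size * atime)


# Hypothetical cached objects (values invented for illustration).
objects = {
    "A": dict(nref=10, size=2000, atime=5.0),    # popular, recently used
    "B": dict(nref=2, size=500, atime=50.0),     # small but stale
    "C": dict(nref=1, size=8000, atime=20.0),    # large and referenced once
}

ratios = {name: cost_to_size_ratio(**o) for name, o in objects.items()}
least_beneficial = min(ratios, key=ratios.get)   # the object with the least ratio
```

Here C has the smallest ratio (1 / 160000), so it is the least beneficial object to keep, while A (10 / 10000) is the most beneficial.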
Content Management of LRU-SP
• CKB Rule:
  – Allocate(X, log(size/nref)) :- Size(X, size), Freq(X, nref)
• Subcaches
  – Least Recently Used (LRU)
• Judge
  – Find the One With Largest (size*atime)/nref
  – The Larger, Older, and Colder an Object Is, the Faster It Will Be Purged
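The CKB rule and the judge for LRU-SP can be sketched as two small functions. Several details are assumptions, since the slide gives only log(size/nref): the logarithm is taken base 2 and rounded down to obtain an integer subcache ID, atime is computed as `now - last_access`, and the dictionary fields are hypothetical.

```python
import math


def allocate(size, nref):
    """Subcache ID from the rule Allocate(X, log(size/nref)).

    Base 2 and the floor are assumptions made here to get an integer ID.
    """
    return math.floor(math.log2(size / nref))


def judge(candidates, now):
    """Return the candidate with the largest (size * atime) / nref."""
    return max(
        candidates,
        key=lambda d: d["size"] * (now - d["last_access"]) / d["nref"],
    )


# One LRU candidate per subcache (values invented for illustration).
candidates = [
    {"key": "x", "size": 1024, "last_access": 90, "nref": 1},   # score 10240
    {"key": "y", "size": 4096, "last_access": 99, "nref": 4},   # score 1024
    {"key": "z", "size": 512, "last_access": 40, "nref": 2},    # score 15360
]
victim = judge(candidates, now=100)
```

Under these assumptions the judge only compares one candidate per subcache, so the work per eviction depends on the number of subcaches rather than on the number of cached objects.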
Predicted Results
• A higher Hit Rate can be expected for LRU-SP, because it uses three indicators of document popularity.
• However, higher Hit Rates usually come at the cost of lower Byte Hit Rates, because smaller documents contribute fewer bytes to the hit data.
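The trade-off between the two metrics can be seen on a tiny made-up trace. The function below uses the usual definitions (hits over requests, and bytes served from cache over bytes requested); the traces and sizes are invented for illustration.

```python
def hit_rates(trace):
    """Return (hit rate, byte hit rate) for a list of (was_hit, size_in_bytes) pairs."""
    hits = sum(1 for was_hit, _ in trace if was_hit)
    hit_bytes = sum(size for was_hit, size in trace if was_hit)
    total_bytes = sum(size for _, size in trace)
    return hits / len(trace), hit_bytes / total_bytes


SMALL, LARGE = 1_000, 1_000_000   # illustrative document sizes in bytes

# A cache that keeps the three small documents and misses the large one...
favor_small = [(True, SMALL), (True, SMALL), (True, SMALL), (False, LARGE)]
# ...versus one that keeps the large document and misses the small ones.
favor_large = [(False, SMALL), (False, SMALL), (False, SMALL), (True, LARGE)]

hr_small, bhr_small = hit_rates(favor_small)
hr_large, bhr_large = hit_rates(favor_large)
```

Favoring small documents gives a hit rate of 0.75 but a byte hit rate below 1%, while favoring the large document gives a hit rate of 0.25 and a byte hit rate above 99%.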
Experiment Results
[Figures: two plots comparing LRU-SP, SzLRU, SgLRU, and LRV over x-axis values 0.15, 0.3, 0.5, 0.8, 1.5, 2, 3, 4, 5, 6, 7, 8; the y-axes run from 0 to 0.25 and from 0 to 0.3]
Explanations
• LRU-SP obtained a much higher Hit Rate than SzLRU, SgLRU, or LRV.
• LRU-SP also obtained a higher Byte Hit Rate, when cache space exceeds 3% of total required space.
• LRU-SP only incurs O(1) time complexity in content management.
• LRU-SP is a significantly improved algorithm.
Concluding Remarks
• Multicache-Based Architecture Has Proved Ideal for Realizing a Good Balance Between High Performance and Low Overhead
• It Is Capable of Incorporating Semantic Information as Well as User Preference In Caching
• It Can Work With Data Management Systems to Support Web Information Integration