Exploiting Gray-Box Knowledge of Buffer Cache Management
Nathan C. Burnett, John Bent,
Andrea C. Arpaci-Dusseau,
Remzi H. Arpaci-Dusseau
University of Wisconsin - Madison
Department of Computer Sciences
2
Caching
• Buffer cache impacts I/O performance
– Cache hits much faster than disk reads
Buffer Cache
Data Blocks
OS
Without Cache Knowledge: 2 disk reads
With Cache Knowledge: 1 disk read
3
Knowledge is Power
• Applications can use knowledge of cache state to improve overall performance
– Web Server
– Database Management Systems
• Often no interface for finding cache state
– Abstractions hide information
4
Workload + Policy → Contents
• Cache contents determined by:
– Workload
– Replacement policy
• Algorithmic Mirroring
– Observe workload
– Simulate cache using policy knowledge
– Infer cache contents from simulation model
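The mirroring idea can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the kernel policy is plain LRU and that the application sees every read. The class name `LRUMirror` and its methods are hypothetical.

```python
from collections import OrderedDict


class LRUMirror:
    """User-level model of the buffer cache, ASSUMING a plain LRU policy."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()  # least recently used block first

    def observe_read(self, block):
        """Record one observed read; return True if the model predicts a hit."""
        hit = block in self.blocks
        if hit:
            self.blocks.move_to_end(block)       # now the most recent block
        else:
            if len(self.blocks) >= self.capacity:
                self.blocks.popitem(last=False)  # evict the LRU block
            self.blocks[block] = True
        return hit

    def probably_cached(self, block):
        """Infer cache contents from the simulation model."""
        return block in self.blocks


mirror = LRUMirror(capacity_blocks=3)
for block in ["a", "b", "c", "a", "d"]:
    mirror.observe_read(block)
predicted = sorted(mirror.blocks)  # blocks the model believes are resident
```

Here the model predicts that "b" was evicted when "d" arrived, because "a" had been touched again in between.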
5
Gaining Knowledge
• Application knows workload
– Assume application dominates cache
• Cache policy is usually hidden
– Documentation can be old, vague or incorrect
– Source code may not be available
• How can we discover cache policy?
6
Policy Discovery
• Fingerprinting: automatic discovery of algorithms or policies (e.g. replacement policy, scheduling algorithm)
• Dust - Fingerprints buffer cache policies
– Correctly identifies many different policies
– Requires no kernel modification
– Portable across platforms
7
This Talk
• Dust
– Detecting initial access order (e.g. FIFO)
– Detecting recency of access (e.g. LRU)
– Detecting frequency of access (e.g. LFU)
– Distinguishing clock from other policies
• Fingerprints of Real Systems
– NetBSD 1.5, Linux 2.2.19, Linux 2.4.14
• Exploiting Gray-Box Knowledge
– Cache-Aware Web Server
• Conclusions & Future Work
8
Dust
• Fingerprints the buffer cache
– Determines cache size
– Determines cache policy
– Determines cache history usage
• Manipulate cache in controlled way
– open/read/seek/close
9
Replacement Policies
• Cache policies often use
– access order
– recency
– frequency
• Need access pattern to identify attributes
• Explore in simulation
– Well controlled environment
– Variety of policies
– Known implementations
10
Dust
I. Move cache to known state
a. Sets initial access order
b. Sets access recency
c. Sets frequency
II. Cause part of test data to be evicted
III. Sample data to determine cache state
• Read a block and time it
Repeat for confidence
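Step III, "read a block and time it", might look like the sketch below. It uses only open/seek/read/close, as the deck describes. The function names and the 1 ms threshold are illustrative assumptions; a real threshold would have to be calibrated per machine.

```python
import os
import time


def time_block_read(path, offset, size=4096):
    """Return the seconds taken to read `size` bytes at `offset` of `path`."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        start = time.perf_counter()
        os.read(fd, size)
        return time.perf_counter() - start
    finally:
        os.close(fd)


def classify(elapsed, threshold=1e-3):
    """Label one sample; the 1 ms default is an illustrative guess."""
    return "in cache" if elapsed < threshold else "on disk"
```

Note that the probe itself reads the block and so brings it into the cache, which is one reason to sample sparingly and repeat the whole experiment rather than re-read the same block.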
11
Setting Initial Access Order
Test Region Eviction Region
for (i = 0; i < test_region_size / read_size; i++) {
    read(read_size);
}
12
FIFO Priority
FIFO gives latter part of file priority
Newer Pages
Older Pages
13
Detecting FIFO
• FIFO evicts the first half of test region
Out of Cache
In Cache
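The FIFO fingerprint can be reproduced in simulation. This sketch assumes the test region exactly fills the cache and that the eviction scan reads half a cache of new blocks; both sizes are illustrative choices, and `fifo_fingerprint` is a hypothetical name.

```python
from collections import deque


def fifo_fingerprint(test_blocks, evict_blocks):
    """Return, for each test-region block, whether it survives eviction."""
    capacity = test_blocks          # ASSUMPTION: test region fills the cache
    queue = deque()                 # oldest block at the left
    resident = set()

    def access(block):
        if block in resident:
            return                  # FIFO ignores hits
        if len(queue) >= capacity:
            resident.discard(queue.popleft())
        queue.append(block)
        resident.add(block)

    for block in range(test_blocks):                             # sequential scan
        access(block)
    for block in range(test_blocks, test_blocks + evict_blocks):  # eviction scan
        access(block)
    return [block in resident for block in range(test_blocks)]


fingerprint = fifo_fingerprint(test_blocks=8, evict_blocks=4)
```

With these sizes the first four blocks are out of the cache and the last four remain, matching the "first half evicted" picture.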
14
Setting Recency
Left Pointer Right Pointer
Test Region Eviction Region
do_sequential_scan();
left = 0;
right = test_region_size / 2;
/* one block from each half per pass */
for (i = 0; i < test_region_size / (2 * read_size); i++) {
    seek(left);
    read(read_size);
    seek(right);
    read(read_size);
    right += read_size;
    left += read_size;
}
15
LRU Priority
LRU gives priority to 2nd and 4th quarters of test region
16
Detecting LRU
• LRU evicts 1st and 3rd quarters of test region
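The recency pattern and its LRU fingerprint can be checked with a small simulation. As before, this assumes the test region exactly fills the cache and that half a cache of new data is read to force eviction; `lru_fingerprint` is a hypothetical name.

```python
from collections import OrderedDict


def lru_fingerprint(test_blocks, evict_blocks):
    """Return, for each test-region block, whether it survives eviction."""
    cache = OrderedDict()           # least recently used block first

    def access(block):
        if block in cache:
            cache.move_to_end(block)
        else:
            if len(cache) >= test_blocks:   # ASSUMPTION: cache == test region
                cache.popitem(last=False)
            cache[block] = True

    for block in range(test_blocks):        # initial sequential scan
        access(block)
    left, right = 0, test_blocks // 2
    for _ in range(test_blocks // 2):       # alternate left and right pointers
        access(left)
        access(right)
        left += 1
        right += 1
    for block in range(test_blocks, test_blocks + evict_blocks):
        access(block)
    return [block in cache for block in range(test_blocks)]


fingerprint = lru_fingerprint(test_blocks=8, evict_blocks=4)
```

The blocks touched earliest by each pointer are the least recent, so the first and third quarters are evicted while the second and fourth stay resident.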
17
Setting Frequency
Access counts per stripe: 2 3 4 5 6 6 5 4 3 2 (test region), 7 (eviction region)
Left Pointer Right Pointer
Test Region Eviction Region
do_sequential_scan();
left = 0;
right = test_region_size / 2;
left_count = 1;
right_count = 5;
for (i = 0; i < test_region_size / (2 * read_size); i++) {
    for (j = 0; j < left_count; j++) {
        seek(left);
        read(read_size);
    }
    for (j = 0; j < right_count; j++) {
        seek(right);
        read(read_size);
    }
    right += read_size;
    left += read_size;
    /* counts peak at the center of the test region */
    left_count++;
    right_count--;
}
18
LFU Priority
LFU gives priority to center of test region
19
Detecting LFU
• LFU evicts outermost stripes
• Two stripes partially evicted
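A simulated LFU cache shows the same shape. This sketch makes several assumptions that are not stated in the deck: one block per stripe, a cache the size of the test region, in-cache counts only, ties broken by lowest block number, and each eviction-region block read 7 times so that it is never the next victim. The name `lfu_fingerprint` is hypothetical.

```python
def lfu_fingerprint(stripes=10, evict_blocks=4, evict_reads=7):
    """Return, for each test-region stripe, whether it survives eviction."""
    counts = {}                     # resident block -> access count

    def access(block):
        if block not in counts:
            if len(counts) >= stripes:      # ASSUMPTION: cache == test region
                victim = min(counts, key=lambda b: (counts[b], b))
                del counts[victim]
            counts[block] = 0
        counts[block] += 1

    for block in range(stripes):            # initial sequential scan
        access(block)
    half = stripes // 2
    for i in range(half):
        for _ in range(i + 1):              # left count grows: 1, 2, 3, ...
            access(i)
        for _ in range(half - i):           # right count shrinks: 5, 4, 3, ...
            access(half + i)
    for block in range(stripes, stripes + evict_blocks):
        for _ in range(evict_reads):        # keep new blocks "hot"
            access(block)
    return [block in counts for block in range(stripes)]


fingerprint = lfu_fingerprint()
```

The least frequently read stripes sit at the two ends of the test region, so those are the ones that disappear.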
20
The Clock Algorithm
• Used in place of LRU
• Ref. bit set on reference
• Ref. bit cleared as hand passes
• Hand replaces a page with a ref. bit that’s already clear
• On eviction, hand searches for a clear ref. bit
Page Frame
Reference bit
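A minimal sketch of the Clock algorithm as described above. One detail is an assumption: a newly loaded page gets its reference bit set, since loading counts as a reference. Real kernels vary on this point. The class name `Clock` is illustrative.

```python
class Clock:
    """Simple Clock (second-chance) replacement over a fixed set of frames."""

    def __init__(self, frames):
        self.pages = [None] * frames
        self.ref = [False] * frames
        self.hand = 0

    def access(self, page):
        """Reference `page`; return True on a hit, False on a miss."""
        if page in self.pages:
            self.ref[self.pages.index(page)] = True   # set bit on reference
            return True
        # Miss: the hand clears set bits until it finds a clear one.
        while self.ref[self.hand]:
            self.ref[self.hand] = False
            self.hand = (self.hand + 1) % len(self.pages)
        self.pages[self.hand] = page
        self.ref[self.hand] = True    # ASSUMPTION: loading sets the bit
        self.hand = (self.hand + 1) % len(self.pages)
        return False


clock = Clock(frames=3)
for page in ["a", "b", "c", "d", "b", "e"]:
    clock.access(page)
final_pages = list(clock.pages)
```

In this trace "b" is referenced just before "e" arrives, so the hand skips it and replaces "c" instead.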
21
Detecting Clock Replacement
• Two pieces of initial state
– Hand Position
– Reference Bits
• Hand position is irrelevant – circular queue
• Dust must control for reference bits
– Reference bits affect order of replacement
22
Detecting Clock Replacement
• Uniform reference bits
• Random reference bits
23
Clock - Reference Bits Matter
• Two fingerprints for Clock
• Ability to produce both will imply Clock
• Need a way to selectively set reference bits
• Dust manipulates reference bits
– To set bits, reference page
– To clear all bits, cause hand to sweep
• Details in paper
24
Dust Summary
• Determines cache size (needed to control eviction)
• Differentiates policies based on
– access order
– recency
– frequency
• Identifies many common policies
– FIFO, LRU, LFU, Clock, Segmented FIFO, Random
• Identifies history-based policies
– LRU-2, 2-Queue
25
This Talk
• Dust
– Detecting initial access order (e.g. FIFO)
– Detecting recency of access (e.g. LRU)
– Detecting frequency of access (e.g. LFU)
– Distinguishing clock from other policies
• Fingerprints of Real Systems
– NetBSD 1.5, Linux 2.2.19, Linux 2.4.14
• Exploiting Gray-Box Knowledge
– Cache-Aware Web Server
• Conclusions & Future Work
26
Fingerprinting Real Systems
• Issues:
– Data is noisy
– Policies usually more complex
– Buffer Cache/VM Integration
• Cache size might be changing
• Platform:
– Dual 550 MHz P-III Xeon, 1GB RAM, Ultra2 SCSI 10000RPM Disks
27
NetBSD 1.5
• Increased variance due to storage hierarchy
FIFO
LRU
LFU
28
NetBSD 1.5
• Four distinct regions of eviction/retention
FIFO
LRU
LFU
29
NetBSD 1.5
• Trying to clear reference bits makes no difference
• Conclusion: LRU
FIFO
LRU
LFU
30
Linux 2.2.19
• Very noisy but looks like LRU
• Conclusion: LRU or Clock
FIFO
LRU
LFU
31
Linux 2.2.19
• Clearing reference bits changes fingerprint
• Conclusion: Clock
FIFO
LRU
LFU
32
Linux 2.4.14
• Low recency areas are evicted
• Low frequency areas also evicted
• Conclusion: LRU with page aging
FIFO
LRU
LFU
33
This Talk
• Dust
– Detecting initial access order (e.g. FIFO)
– Detecting recency of access (e.g. LRU)
– Detecting frequency of access (e.g. LFU)
– Distinguishing clock from other policies
• Fingerprints of Real Systems
– NetBSD 1.5, Linux 2.2.19, Linux 2.4.14
• Exploiting Gray-Box Knowledge
– Cache-Aware Web Server
• Conclusions & Future Work
34
Algorithmic Mirroring
• Model Cache Contents
– Observe inputs to cache (reads)
– Use knowledge of cache policy to simulate cache
• Use model to make application-level decisions
35
NeST
• NeST - Network Storage Technology
• Software based storage appliance
• Supports HTTP, NFS, FTP, GridFTP, Chirp
• Allows configurable number of requests to be serviced concurrently
• Scheduling Policy: FIFO
36
Cache-Aware NeST
• Takes policy & size discovered by Dust
• Maintains algorithmic mirror of buffer cache
– Updates mirror on each request
– No double buffering
– May not be a perfect mirror
• Scheduling Policy: In-Cache-First
– Reduce latency by approximating SJF
– Improve throughput by reducing disk reads
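In-Cache-First can be sketched as a stable reordering of the pending queue. This is an illustration only, not NeST code: the function name, the file names, and the representation of the mirror as a set of file names are all assumptions.

```python
def in_cache_first(pending, predicted_cached):
    """Serve requests predicted to hit the cache before the others.

    sorted() is stable, so arrival order is preserved within each group.
    """
    return sorted(pending, key=lambda request: request not in predicted_cached)


# Hypothetical request queue and mirror contents.
queue = ["a.html", "big.iso", "b.html", "c.html"]
mirror_says_cached = {"b.html", "c.html"}
order = in_cache_first(queue, mirror_says_cached)
```

Requests that hit the cache finish quickly, so serving them first approximates shortest-job-first. If the mirror is wrong about a file, the only cost is a request served in a less favorable order.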
37
Performance
• Improvement in response time
• Robust to inaccuracies in cache estimate
144 clients randomly requesting 200 files of 1MB each
Server: P-III Xeon, 128MB
Clients: 4 X P-III Xeon, 1GB
Gigabit Ethernet
Linux 2.2.19
38
Summary
• Fingerprinting
– Discovers OS algorithms and policies
• Dust
– Portable, user-level cache policy fingerprinting
– Identifies FIFO, LRU, LFU, Clock, Random, 2Q, LRU-2
– Fingerprinted Linux 2.2 & 2.4, Solaris 2.7, NetBSD 1.5 & HP-UX 11.20
• Algorithmic Mirroring
– Keep track of kernel state in user-space
– Use this information to improve performance
• Cache-Aware NeST
– Uses mirroring to improve HTTP performance
39
Future Work
• On-line, adaptive detection of cache policy
• Policy manipulation
• Make other applications cache aware
– Databases
– File servers (ftp, NFS, etc.)
• Fingerprint other OS components
– CPU scheduler
– filesystem layout
40
Questions??
• Gray-Box Systems
– http://www.cs.wisc.edu/graybox/
• Wisconsin Network Disks
– http://www.cs.wisc.edu/wind/
• NeST
– http://www.cs.wisc.edu/condor/nest/
41
Solaris 2.7
FIFO
LRU
LFU
42
HP-UX 11.20 (IPF)
• Low recency areas are evicted
• Low frequency areas also evicted
• Conclusion: LRU with page aging
FIFO
LRU
LFU
43
Related Work
• Gray-Box (Arpaci-Dusseau)
– Cache content detector
• Connection Scheduling (Crovella et al.)
• TBIT (Padhye & Floyd)
44
Clock - Uniform Reference Bits
• After initial scan, cache state does not change
• First half of test region is evicted
Buffer Cache before test scan
File
Buffer Cache after test scan, before eviction scan
45
Clock - Random Reference Bits
• Initial Sequential Scan
• Test scan does not change cache state
Buffer Cache before test scan
File
Buffer Cache after test scan, before eviction scan
46
Manipulating Reference Bits
• Setting bits is easy
• Clear bits by causing hand to do a circuit
Buffer Cache after touching all resident data
Buffer Cache after an additional small read
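The two manipulations can be traced on a toy Clock model. This is a sketch under assumptions, not Dust itself: four page frames, arbitrary starting bits, and a newly loaded page getting its bit set. The function `clock_access` and the page number 99 are hypothetical.

```python
def clock_access(pages, ref, hand, page):
    """One access under a simple Clock model; returns the new hand position."""
    if page in pages:
        ref[pages.index(page)] = True       # setting a bit: just reference it
        return hand
    while ref[hand]:                        # hand clears bits as it passes
        ref[hand] = False
        hand = (hand + 1) % len(pages)
    pages[hand] = page
    ref[hand] = True                        # ASSUMPTION: loading sets the bit
    return (hand + 1) % len(pages)


pages = [0, 1, 2, 3]
ref = [True, False, True, False]            # arbitrary starting bits
hand = 0

for page in list(pages):                    # touch all resident data
    hand = clock_access(pages, ref, hand, page)
bits_after_touch = list(ref)

hand = clock_access(pages, ref, hand, 99)   # one additional small read
bits_after_read = list(ref)
```

After touching everything, every bit is set, so the single miss forces the hand around the whole circuit. It clears every bit on the way and replaces the page where it started.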