
DEFER Cache – an Implementation

Sudhindra Rao and Shalaka Prabhu
Thesis Defense – Master of Science
Department of ECECS

OSCAR Lab

DEFER Cache 2

DEFER Cache

Overview

Related Work

DEFER Cache Architecture

Implementation – Motivation and Challenges

Results and Analysis

Conclusion

Future Work

Overview

Accessing remote memory is faster than accessing local disks; hence co-operative caching
Current schemes provide fast reads but slow writes
Goal: combine replication with logging for fast, reliable write-back to improve write performance

DCD and RAPID already do something similar

New architecture: Distributed, Efficient and Reliable (DEFER) Co-operative Cache

Duplication & logging
Co-operative caching for both read and write requests
Vast performance gains (up to 11.5x speedup) due to write-back [9]

Co-operative Caching

High-performance, scalable LANs vs. the slow speed of file-server disks
Pooled client memory exceeds server RAM: a file server with 1 GB of RAM vs. 64 clients with 64 MB of RAM each = 4 GB
A cost-effective solution: use remote memory for caching – 6-12 ms to access 8 KB of data from disk vs. 1.05 ms from a remote client
Highly scalable, but all related work focuses on read performance

Other Related Work

N-Chance Forwarding
Forwards singlets to a remote host on a capacity miss
Re-circulates a singlet N times, then writes it to the server disk
Uses a write-through cache
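The N-chance policy above can be sketched as a small simulation; this is an illustrative model (the class, its names, and the eviction flow are assumptions, not the thesis code):

```python
import random

class NChanceClient:
    """Toy model of N-chance forwarding: on a capacity miss, a singlet
    (the last cached copy of a block) is forwarded to a random peer with
    one fewer chance left; after N forwards it is written to the server disk."""

    def __init__(self, name, capacity, peers, server_disk, n=2):
        self.name = name
        self.capacity = capacity
        self.peers = peers              # other NChanceClient instances
        self.server_disk = server_disk  # set of block ids persisted on disk
        self.n = n
        self.cache = {}                 # block id -> remaining recirculation count

    def evict_one(self):
        block, count = next(iter(self.cache.items()))  # pretend-LRU victim
        del self.cache[block]
        cached_elsewhere = any(block in p.cache for p in self.peers)
        if not cached_elsewhere and count > 0:
            # Singlet with chances left: forward to a random peer.
            random.choice(self.peers).insert(block, count - 1)
        elif not cached_elsewhere:
            self.server_disk.add(block)  # out of chances: write to server disk

    def insert(self, block, count=None):
        if len(self.cache) >= self.capacity:
            self.evict_one()
        self.cache[block] = self.n if count is None else count
```

A duplicated block is simply dropped on eviction; only singlets consume network hops, which is what keeps the policy cheap.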

Co-operative Caching using hints
Global Memory System
Remote Memory Servers
Log-Structured Storage Systems: LFS, Disk Caching Disk (DCD), RAPID
NVRAM – not cost-effective with current technology

What's DEFER? Improving the write performance of DCD using distributed systems

Log-based write mechanism

DCD [7] and RAPID [8] implement log-based writes
Logging improves the performance of small writes
The log partition provides reliability and data availability

[Figure: DCD-like structure of DEFER – memory holds the local cache, remote cache and segment buffer; the local disk holds the log partition and the data partition]

Logging Algorithm – Writing a Segment

[Figure: RAM cache, segment buffer, mapping table and cache disk before the segment write – the segment buffer collects dirty blocks while the mapping table tracks their log-segment locations]

Write 128 KB of LRU data to a cache-disk segment, in one large write
Pick LRU data to capture temporal locality, improve reliability and reduce disk traffic
Most data will be overwritten repeatedly

[Figure: after the segment write completes – the mapping table points at the new cache-disk segment and the segment buffer is freed]
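The segment-write step can be sketched as a small simulation; the 128 KB segment size comes from the slide, while the class and its fields are assumed for illustration:

```python
SEGMENT_SIZE = 128 * 1024   # one large 128 KB log write, as on the slide
BLOCK_SIZE = 8 * 1024

class SegmentBuffer:
    """Collects dirty LRU blocks in RAM and flushes them to a cache-disk
    log segment in one large sequential write."""

    def __init__(self, cache_disk, mapping_table):
        self.blocks = {}                    # block id -> data
        self.cache_disk = cache_disk        # list of flushed segments
        self.mapping_table = mapping_table  # block id -> (segment no, offset)

    def add(self, block_id, data):
        self.blocks[block_id] = data        # overwrites coalesce in memory
        if len(self.blocks) * BLOCK_SIZE >= SEGMENT_SIZE:
            self.flush()

    def flush(self):
        segment_no = len(self.cache_disk)
        self.cache_disk.append(dict(self.blocks))   # one big segment write
        for offset, block_id in enumerate(self.blocks):
            self.mapping_table[block_id] = (segment_no, offset)
        self.blocks.clear()                 # buffer is free again
```

Because repeatedly overwritten blocks coalesce in the buffer, each flush is one sequential write instead of many small random ones.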


Garbage Collection

Data is written into the cache-disk continuously
The cache-disk will eventually fill with log writes
Most data in the cache-disk is "garbage" caused by overwriting
The garbage must be cleaned to free log-disk space

[Figure: client log disk before and after garbage collection – blocks made stale by overwrites are reclaimed, leaving only live blocks]
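A minimal sketch of that cleaning pass, assuming (as an illustration, not the thesis code) that a mapping table records which log location holds the live copy of each block:

```python
def garbage_collect(log_disk, mapping_table):
    """Compact a log disk in place: keep only entries the mapping table
    still points at; everything else is garbage left by overwrites.

    log_disk: list of (block_id, location) entries in write order
    mapping_table: block_id -> location of the live (latest) copy
    """
    live = [(blk, loc) for blk, loc in log_disk
            if mapping_table.get(blk) == loc]
    freed = len(log_disk) - len(live)
    log_disk[:] = live          # compacted log
    return freed                # number of reclaimed entries
```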


DEFER Cache Architecture

Typical distributed system (client-server)
Applications run on workstations (clients) and access files from the server
Local disks on clients are used only for booting, swapping and logging
Local RAM is divided into an I/O cache and a segment buffer
The local disk has a corresponding log partition

DEFER Cache Algorithms

DEFER is DCD distributed over the network – the best of co-operative caching and logging

Reads are handled exactly as in N-chance Forwarding
Writes are immediately duplicated, then logged after a pre-determined time interval M
Dirty singlets are forwarded as in N-chance
Three logging strategies are used:

Server Logging, Client Logging, Peer Logging

Server Logging

[Figure: server logging – 1. Client 1 sends a lock ownership request; 2. the server invalidates the other clients; 3. lock ownership is granted; 4. the client sends a copy of the block to the server cache; the server updates the Server Table, frees the lock, and logs the segment buffer to its log disk]

The client copies the block to the server cache on a write
The Server Table maintains consistency
The server invalidates clients and logs the contents of the segment buffer
Logging increases the load on the server

Client Logging

Advantage: server load is decreased
Disadvantage: availability of the block is affected

[Figure: client logging – 1. Client 1 sends a lock ownership request; 2. the server invalidates the other clients; 3. lock ownership is granted; 4. the client copies the data to the server cache; 5. after M seconds the client logs its segment buffer to its local log disk; 6. when logging completes, the server removes the dirty blocks sent by Client 1 from the server cache, updates the Server Table, and frees the lock]

Peer Logging

Each workstation is assigned a peer – the peer performs logging
Advantage: reduces server load without compromising availability

[Figure: peer logging – 1. Client 1 sends a lock ownership request; 2. the server updates the Server Table and invalidates the other clients; 3. lock ownership is granted; 4. the client sends a copy of the block to its peer and the Server Table is updated; 5. after M seconds the peer logs its segment buffer to its log disk. A peer-mapping table pairs each workstation with its logging peer]
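A peer assignment like the one in the mapping table can be sketched as a simple ring pairing; the slide does not give the pairing rule, so the rotation used here is an assumption for illustration:

```python
def assign_peers(workstations):
    """Pair each workstation with a logging peer by rotating the list:
    every host logs to its successor, and the last wraps to the first."""
    n = len(workstations)
    if n < 2:
        raise ValueError("peer logging needs at least two workstations")
    return {workstations[i]: workstations[(i + 1) % n] for i in range(n)}
```

Any permutation without fixed points would do; the important property is that every host has exactly one peer holding its log.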


Reliability

Every M seconds, blocks that were modified within the last M to 2M seconds are logged
Thus, for M = 15, data modified within the last 30 seconds is guaranteed to be written to disk
Most UNIX systems use a delayed write-back of 30 seconds
M can be reduced to increase the frequency of logging without introducing high overhead
With DEFER, blocks are both logged and duplicated
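The M-second rule above can be sketched as a periodic pass over per-block modification timestamps; the data structures are assumed for illustration:

```python
def blocks_to_log(mtimes, now, m=15):
    """Return block ids whose last modification falls in the (now-2M, now-M]
    window. Logging exactly this window every M seconds guarantees no dirty
    block is ever older than 2M seconds (30 s for M = 15); newer blocks wait
    for the next pass, older ones were logged on a previous pass."""
    return [blk for blk, t in mtimes.items()
            if now - 2 * m < t <= now - m]
```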


Crash Recovery

With peer logging, the recovery algorithm works on the on-log-disk version of the data
The in-memory copy and the on-log-disk copy live on different hosts
Find the blocks that were updated by the crashed client, and its peer assignment
The server initiates recovery of those blocks from the peer
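Combining the server's directory table with the peer mapping, the recovery flow might look like this sketch (all names and signatures are illustrative assumptions, not the thesis code):

```python
def recover_client(crashed, server_table, peer_map, fetch_from_log):
    """Rebuild a crashed client's dirty blocks from its peer's log disk.

    server_table: block id -> last writer (host name)
    peer_map: host -> its logging peer
    fetch_from_log: callable(peer, block id) -> block data from the peer's log
    """
    peer = peer_map[crashed]
    dirty = [blk for blk, writer in server_table.items() if writer == crashed]
    return {blk: fetch_from_log(peer, blk) for blk in dirty}
```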


Simulation Results

Simulation [9] using disksim – synthetic and real-world traces

Real Workloads

Snake – Peer Logging 4.9x; Server and Client Logging 4.4x
Cello – Peer Logging 8.5x; Server and Client Logging 7.3x

DEFER Cache Implementation


DEFER Cache Architecture

[Figure: DEFER Cache architecture – each Defer_client holds a local cache and a remote cache in memory; writes go to the remote cache, blocks are sent to and received from the peer/server, and logging fills Segment 1/Segment 2 of the segment buffer; the Defer_server drains its ServerQueue through the same segment-buffer path]

DEFER Cache design

Follow these design principles:
Use only commodity hardware that is available in typical systems
Avoid storage-media dependencies, such as requiring only SCSI or only IDE disks
Keep the data structures and mechanisms simple
Support reliable persistence semantics
Separate mechanisms and policies

Implementation

Implementation on Linux – open source
Client logging implemented as a device driver or a library – no change to the system kernel or application code
Linux device drivers create a custom block device attached to the network device – system-call overloading is provided via a loadable kernel module
The network device uses Reliable UDP to ensure fast and reliable data transfer
A library is also provided for testing and for non-Linux systems – it provides the same system-call overloading
An alternative approach using NVRAM is under test

DEFER as a module

[Figure: DEFER as a module – init_module() calls register_capability() and cleanup_module() calls unregister_capability(); the Defer_module implements read, write, open, close, the M-sec algorithm, garbage collection, and calls into nbd, while the kernel proper supplies functions such as printk, add_to_request_queue, ioctl, generic_make_request, send/recv, and ll_rw_blk; data flows through the network device]

Data Management

Plugs into the OS as a custom block device that contains memory and a disk
The disk is managed independently of the OS
The request_queue is intercepted and redirected to the Defer module
Read/write are overridden with Defer read/write
Interfaces with the network device to transfer data
Interfaces with the kernel by registering special capabilities – logging, de-staging, garbage collection, and data recovery on crash

DEFER Cache - Implementation

Simulation results show an 11.5x speedup
DEFER Cache was implemented in a real system to support the simulation results
A multi-hierarchy cache structure can be implemented at the application level, the file-system level, as a layered device driver, or at the controller level

A kernel device driver was selected because it achieves both efficiency and flexibility.

Implementation Design

The implementation is derived from the DCD implementation
DEFER Cache can be considered a DCD over a distributed system
The implementation design consists of three modules:

Data management – implements the caching activities on the local machine
Network interface – implements the network transfer of blocks to/from the server and clients
Coordinating daemons – coordinate the activities of the two modules above

Data Management

A custom block device driver is developed and plugged into the kernel during execution

The driver is modified according to the DEFER Cache design

The request function of the device driver is modified

RAM read/write is replaced by the DEFER Cache read/write call

Network Interface

Implemented as a network block device (NBD) driver
NBD simulates a block device on the local client but connects to a remote machine that actually hosts the data
A local-disk representation of a remote client
Can be mounted and accessed as a normal block device
All read/write requests are transferred over the network to the remote machine
Consists of three parts:

NBD client, NBD driver, NBD server
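The split above can be illustrated with a toy in-process model of the driver and server; real NBD speaks a socket protocol between kernel and user space, so treat this purely as a conceptual sketch with assumed names:

```python
class NBDServer:
    """User-space process that actually hosts the data (here: a bytearray)."""
    def __init__(self, size):
        self.storage = bytearray(size)

    def handle(self, op, offset, arg):
        if op == "read":
            return bytes(self.storage[offset:offset + arg])   # arg = length
        self.storage[offset:offset + len(arg)] = arg          # arg = payload

class NBDDriver:
    """Stand-in for the kernel-side block device: accepts block requests
    and forwards them to the server, the way transmit() would over a socket."""
    def __init__(self, server):
        self.server = server

    def read(self, offset, length):
        return self.server.handle("read", offset, length)

    def write(self, offset, data):
        self.server.handle("write", offset, data)
```

Mounting the device then just means pointing the driver at a server; every block request crosses the driver/server boundary.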


NBD – Design

[Figure: NBD design – the user-space NBD client issues init_module() and ioctl() calls; the kernel-space NBD driver registers itself with register_blkdev() and blk_init_queue(), services the default queue through its request() function, and transmit() forwards requests to the user-space NBD server]


Linux Device Driver Issues

Linux device drivers for the Data Management and Network Interface modules were implemented successfully
They could not be thoroughly tested and validated
Kernel-level drivers pose the following problems:
Clustering of I/O requests by the kernel
Kernel memory corruption
Synchronization problems
No specific debugging tool

User-mode Implementation

The implementation of DEFER Cache was switched to user mode

Advantages:
High flexibility – all data can be manipulated by the user according to requirements
Easier to design and debug
A good design can improve performance

Disadvantage: response time is slower – worse if data is swapped

User-Mode Design

Simulates the drivers in user mode

All data structures used by the device drivers are duplicated in user space

Uses a raw disk

32 MB of buffer space is allocated for DEFER Cache in RAM

This emulates the I/O buffer cache

DEFER Server - Implementation

Governs the entire cluster of workstations
Maintains its own I/O cache and a directory table
The server directory table maintains consistency in the system
A server-client handshake is performed on every write update
The server directory table entry reflects the last writer
Used for garbage collection and data recovery
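The directory table's last-writer bookkeeping can be sketched as follows (a toy illustration; the class and field names are assumptions):

```python
class ServerDirectoryTable:
    """Tracks, per block, which client last wrote it; consulted during the
    per-write handshake, garbage collection, and crash recovery."""
    def __init__(self):
        self.last_writer = {}   # block id -> client name

    def handshake_write(self, block, client):
        """Record the new writer; return the previous writer, whose
        logged copy of this block is now garbage."""
        previous = self.last_writer.get(block)
        self.last_writer[block] = client
        return previous

    def blocks_of(self, client):
        """Blocks to recover if this client crashes."""
        return [b for b, w in self.last_writer.items() if w == client]
```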


Initial Testing

Basic idea: accessing remote data is faster than accessing data on the local disk
Is the LAN faster than disk access?
Since UDP is used as the network protocol, the UDP transfer delay was measured
Underlying network: 100 Mbps Ethernet
Measured with a UDP monitor program

UDP monitor - Results

[Figure: effect of varying the response message size on response time]

Benchmark Program

An in-house benchmark program was developed
It generates requests using a history table, producing temporal and spatial locality
It runs on each workstation
The following parameters can be modified at run time:

Working set size, client cache size, server cache size, block size, correlation factor (c)
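A request generator driven by a history table and a correlation factor c might be sketched like this; the slides do not give the exact mechanism, so the locality model below is an assumption for illustration:

```python
import random

def make_requests(n, working_set, history_len=8, c=1.0, seed=0):
    """Generate n block requests: with probability c, re-issue a block
    from the recent history table (temporal locality); otherwise pick a
    fresh random block from the working set."""
    rng = random.Random(seed)
    history, out = [], []
    for _ in range(n):
        if history and rng.random() < c:
            block = rng.choice(history[-history_len:])  # correlated request
        else:
            block = rng.randrange(working_set)          # uncorrelated request
        history.append(block)
        out.append(block)
    return out
```

With c close to 1 the stream revisits a handful of blocks (cache-friendly); with c near 0 it sprays the whole working set.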


Results (working set size)

[Figure: result of varying file size (2–64 MB) on bandwidth (KB/sec), c=1 – baseline system vs. DEFER Cache]

Results (small writes)

[Figure: effect of small writes – bandwidth (KB/sec) vs. file size (4–64 KB), baseline system vs. DEFER Cache]

Results (Response time for small writes)

[Figure: result of varying file size (4–64 KB) on response time (ms), c=1 – baseline system vs. DEFER Cache]

Results (Response time for sharing data)

[Figure: result of varying file size (4–32 KB) on response time (ms), c=0.75 – baseline system vs. DEFER system]

Results (Varying client cache size)

[Figure: result of varying client cache size (16–64 MB) on bandwidth (KB/sec) – baseline vs. DEFER Cache]

Results (varying server cache size)

[Figure: result of varying server cache size (128–512 MB) on bandwidth (KB/sec) – baseline system vs. DEFER Cache]

Results (latency)

[Figure: latency (microseconds) comparison of DEFER Cache and the baseline system, file size 2–16 MB]

Results (Delay measurements)

[Figure: delay (microseconds) comparison of DEFER Cache and the baseline system, file size 2–16 MB]

Results (Execution time)

[Figure: execution time (secs) for DEFER Cache and the baseline system, working set 4–64 MB]

Conclusions

Improves write performance for co-operative caching
Reduces the small-write penalty
Ensures reliability and data availability
Improves overall file-system performance

Future Work

Improve the user-level implementation: extend kernel-level functionality to user level by intercepting system-level calls and modifying them to implement the DEFER read/write calls
Kernel-level implementation: successfully implement DEFER Cache at the kernel level and plug it into the kernel

Thank you

