Page 1: Network Memory Servers: An idea whose time has come

MSN 2004

Network Memory Servers: An idea whose time has come

Glenford Mapp

David Silcott

Dhawal Thakker

Page 2: Network Memory Servers: An idea whose time has come

Motivation

• Networks are now much faster than disks

• Should be quicker to get data from the memory of another computer than from the local disk

• Not a new idea - so what’s different?

Page 3: Network Memory Servers: An idea whose time has come

What’s different?

• Networks are faster and cheaper
– Gigabit NICs are £35.00
– We could also see 10G NICs in the near future

• Memory is also cheaper
– 1GB = £100.00
– Likely to remain stable

• Availability of good “free” OSes
– Linux and FreeBSD

Page 4: Network Memory Servers: An idea whose time has come

Our approach is also different

• Previous approaches
– Dominated by the Distributed Shared Memory (DSM) crowd (Apollo System)
– DSM never became mainstream
  • Lots of fundamental changes to the OS platform required
  • Exotic hardware (e.g. Scalable Coherent Interface, or SCI)

• Network Memory became a casualty of this failure

Page 5: Network Memory Servers: An idea whose time has come

Previous Approach cont’d

• Remote paging was also one of the key areas (SAMSON project, NYU)

• Idle machines approach
– Use the memory of other machines in the network when no one is logged on, but get off when the person returns
– Very complex
  • How do you give guarantees to everyone?

Page 6: Network Memory Servers: An idea whose time has come

Our Approach

• Applied Engineering Approach
– What are the real numbers in this area?

• Use the power of the Network
– Use standard networking approach
– No DSM, no virtual memory plug-ins

• Client-Server approach
– Dedicated servers with loads of memory

Page 7: Network Memory Servers: An idea whose time has come

Design of the Network Memory Server (NMS)

• NMS has an independent interface
– Can interface with any OS
  • Not like Network Block Device (NBD) in Linux

• NMS is stateless
– Does not keep track of previous interactions

• Actions of the NMS are regarded as atomic
– Either complete success or total failure

Page 8: Network Memory Servers: An idea whose time has come

Design of NMS cont’d

• NMS deals with blocks of data
– Has no idea how the blocks are being used
  • Not like NFS

• Each block is uniquely identified by a block_id allocated by the NMS

• Each client is uniquely identified by a client_id

Page 9: Network Memory Servers: An idea whose time has come

Block_ids

• 64-bit entities
– 32-bit minor index
– 16-bit major index
– 16-bit security tag

• generated when the blocks are created

• checked before any read/write operation on a block
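The block_id layout above can be sketched as simple bit packing. Only the three field widths come from the slides; the field order (major index in the top bits, then minor index, then security tag) and the function names are assumptions made for illustration.

```python
# Hypothetical layout: [16-bit major | 32-bit minor | 16-bit security tag].
# Only the field widths come from the slides; the ordering is assumed.
MINOR_BITS, MAJOR_BITS, TAG_BITS = 32, 16, 16


def pack_block_id(major: int, minor: int, tag: int) -> int:
    """Combine the three fields into one 64-bit integer."""
    if not 0 <= major < (1 << MAJOR_BITS):
        raise ValueError("major index out of range")
    if not 0 <= minor < (1 << MINOR_BITS):
        raise ValueError("minor index out of range")
    if not 0 <= tag < (1 << TAG_BITS):
        raise ValueError("security tag out of range")
    return (major << (MINOR_BITS + TAG_BITS)) | (minor << TAG_BITS) | tag


def unpack_block_id(block_id: int) -> tuple[int, int, int]:
    """Split a 64-bit block_id back into (major, minor, tag)."""
    tag = block_id & ((1 << TAG_BITS) - 1)
    minor = (block_id >> TAG_BITS) & ((1 << MINOR_BITS) - 1)
    major = (block_id >> (MINOR_BITS + TAG_BITS)) & ((1 << MAJOR_BITS) - 1)
    return major, minor, tag
```

A server following this sketch could compare the unpacked tag with the tag it stored at creation time before allowing a read or write.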

Page 10: Network Memory Servers: An idea whose time has come

NMS calls

• GetblockMemory(client_id, size, nblocks, options)
– Creates a number of blocks of a certain size with consecutive block_ids
  • Returns the starting block_id
  • options - backup

• Release(client_id, block_id, nblocks)
– Releases a number of consecutive block_ids

Page 11: Network Memory Servers: An idea whose time has come

NMS calls cont’d

• WriteBlockMemory(client_id, block_id, offset, length, *buf)
– Writes data in buffer to a block on the server

• ReadBlockMemory(client_id, block_id, offset, length, *buf)
– Reads data from a block on the server into a buffer
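The allocate, release, write and read calls can be illustrated with a toy in-memory server. This is a sketch under stated assumptions, not the authors' kernel module: the class name `ToyNMS`, the plain integer block_ids, the return conventions, and the omission of the `options` argument are all hypothetical. Each call validates everything before changing anything, which is one way to get the "complete success or total failure" behaviour the slides describe.

```python
class ToyNMS:
    """In-memory stand-in for the server; not the authors' kernel module."""

    def __init__(self):
        self.blocks = {}   # block_id -> (owner client_id, bytearray)
        self.next_id = 1   # plain counter; the real block_id has three fields

    def get_block_memory(self, client_id, size, nblocks):
        """Create nblocks blocks of `size` bytes; return the first block_id."""
        start = self.next_id
        for block_id in range(start, start + nblocks):
            self.blocks[block_id] = (client_id, bytearray(size))
        self.next_id += nblocks
        return start

    def release(self, client_id, block_id, nblocks):
        """Release nblocks consecutive ids, or nothing at all (atomic)."""
        ids = range(block_id, block_id + nblocks)
        if not all(self._owned(client_id, b) for b in ids):
            return False
        for b in ids:
            del self.blocks[b]
        return True

    def write_block_memory(self, client_id, block_id, offset, data):
        """Copy data into a block; validate first so failure changes nothing."""
        if not self._owned(client_id, block_id):
            return False
        buf = self.blocks[block_id][1]
        if offset < 0 or offset + len(data) > len(buf):
            return False
        buf[offset:offset + len(data)] = data
        return True

    def read_block_memory(self, client_id, block_id, offset, length):
        """Return `length` bytes from a block, or None on any failure."""
        if not self._owned(client_id, block_id):
            return None
        buf = self.blocks[block_id][1]
        if offset < 0 or length < 0 or offset + length > len(buf):
            return None
        return bytes(buf[offset:offset + length])

    def _owned(self, client_id, block_id):
        entry = self.blocks.get(block_id)
        return entry is not None and entry[0] == client_id
```

Because every request carries the client_id, block_id, offset and length it needs, the server does not have to remember earlier requests, which matches the stateless design.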

Page 12: Network Memory Servers: An idea whose time has come

NMS calls cont’d

• GetClientid(password)
– Creates a new client

• GetMasterBlock(password, client_id)
– Returns a number of blocks of sector/block_id mappings

• StoreMasterBlock(block_id, client_id, password, nblocks)
– Stores a number of sector/block_id mappings

Page 13: Network Memory Servers: An idea whose time has come

NMS Client

• How does a client use the NMS?
– What interface is presented to the OS?

• The interface is the one used to support hard disks. In Linux, we use the block device interface

• So the OS thinks of the NMS service as a fast hard disk

Page 14: Network Memory Servers: An idea whose time has come

NMS Client cont’d

• So the OS tells the NMS client to read and write sectors.

• NMS client will take sectors and map them onto blocks which it gets from the NMS

• When block device is unmounted, we must store the sector/block_id mappings on the NMS

Page 15: Network Memory Servers: An idea whose time has come

NMS Client cont’d

• The StoreMasterBlock call stores these mappings on the NMS

• When the device is remounted, it must first get the sector/block_id mappings from the NMS and rebuild the sector table.

• The GetMasterBlock call retrieves the mappings from the NMS
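Saving and rebuilding the sector table could look like the sketch below. The record layout (a 64-bit sector number followed by a 64-bit block_id, big-endian) and the function names are assumptions; the slides do not give the stored format. The bytes produced by `encode_mappings` would be what the client writes into the blocks it passes to StoreMasterBlock.

```python
import struct

# Assumed record layout: 64-bit sector number followed by 64-bit block_id,
# both big-endian. The slides do not specify the stored format.
RECORD = struct.Struct(">QQ")


def encode_mappings(table: dict[int, int]) -> bytes:
    """Flatten a sector -> block_id table into bytes before unmounting."""
    return b"".join(RECORD.pack(sector, block_id)
                    for sector, block_id in sorted(table.items()))


def decode_mappings(blob: bytes) -> dict[int, int]:
    """Rebuild the sector table from the stored bytes on remount."""
    return {sector: block_id for sector, block_id in RECORD.iter_unpack(blob)}
```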

Page 16: Network Memory Servers: An idea whose time has come

NMS Client Cache

• Client also has a cache of blocks that are used to store recently used sectors
– This is a secondary cache, as the main caching is really done by the Unix Buffer Cache

• Design decision to keep our cache as a simple round-robin cache
– Replace the next item pointed to in the cache
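A round-robin replacement policy of this kind can be sketched in a few lines. The class and method names are hypothetical; the only behaviour taken from the slides is that the victim is always the next slot the pointer reaches, regardless of how recently it was used.

```python
class RoundRobinCache:
    """Fixed-size cache that evicts whichever slot the pointer reaches next."""

    def __init__(self, nslots: int):
        self.slots = [None] * nslots   # each slot: (sector, data) or None
        self.index = {}                # sector -> slot number
        self.next_slot = 0             # the round-robin pointer

    def get(self, sector):
        slot = self.index.get(sector)
        return None if slot is None else self.slots[slot][1]

    def put(self, sector, data):
        """Store data; return the evicted (sector, data) pair, if any."""
        if sector in self.index:       # update in place, no eviction
            self.slots[self.index[sector]] = (sector, data)
            return None
        evicted = self.slots[self.next_slot]
        if evicted is not None:
            del self.index[evicted[0]]
        self.slots[self.next_slot] = (sector, data)
        self.index[sector] = self.next_slot
        self.next_slot = (self.next_slot + 1) % len(self.slots)
        return evicted
```

Unlike LRU, a lookup does not protect an entry from eviction, so the policy needs no bookkeeping on the read path. That fits a secondary cache sitting below the Unix Buffer Cache.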

Page 17: Network Memory Servers: An idea whose time has come

NMS Client Operations

• Since we are not a normal disk, we do not need to rearrange read and write operations

• So we attempt to read and write blocks as the requests come in.

• Also developed a write-out thread: a special thread, called the Write-out thread, writes modified blocks to the NMS
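The write-out thread idea can be sketched in user space with a queue and a worker thread. This is only an illustration: the original is a Linux kernel module, and the function name, the `write_block(block_id, data)` callback and the `None` shutdown sentinel are assumptions.

```python
import queue
import threading


def start_write_out_thread(write_block):
    """Start a worker that drains modified blocks to the server.

    `write_block(block_id, data)` is a caller-supplied function standing in
    for a WriteBlockMemory request; its shape is an assumption.
    """
    pending = queue.Queue()

    def worker():
        while True:
            item = pending.get()
            if item is None:          # sentinel: stop the thread
                pending.task_done()
                return
            block_id, data = item
            write_block(block_id, data)
            pending.task_done()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return pending, thread
```

The request path only has to enqueue a modified block and can return at once; `pending.join()` blocks until everything queued so far has been written.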

Page 18: Network Memory Servers: An idea whose time has come

NMS Client Implementation

[Diagram. Labels: Operating System; Block Device Interface; Sector / Block_id Hash Table; Cache; Programs; Unix Buffer Cache; Write-Out Queue; (Two levels); NMS Block Device]

Page 19: Network Memory Servers: An idea whose time has come

Getting a sector

[Flowchart. Decisions: Is sector in hash table? Is it in the cache? Is it a read? Is the cache full? Has replaced entry been modified? Actions: Return rubbish; Get block_id from NMS and put entry in hash table; Get data from NMS server and put in cache entry; Replace entry; Put it on Write-Out Queue; Get new cache entry; Read from / write to cache entry; Write data to cache entry; OK. Edges are labelled Yes/No.]
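The sector-lookup flow can be sketched as follows. This is one plausible reading of the flowchart, not a confirmed reconstruction: the branch order, the class `ToyClient`, the dict standing in for the server, the whole-sector writes, and the synchronous `flush` (where the slides use a separate write-out thread) are all assumptions.

```python
from collections import deque


class ToyClient:
    """One plausible reading of the flowchart; branch order is an assumption."""

    def __init__(self, server, cache_slots):
        self.server = server            # dict block_id -> bytes, stands in for the NMS
        self.sector_table = {}          # sector -> block_id (the hash table)
        self.cache = {}                 # sector -> [data, modified flag]
        self.order = deque()            # sectors in replacement order
        self.cache_slots = cache_slots
        self.write_out = deque()        # (block_id, data) awaiting write-out
        self.next_block_id = 1

    def flush(self):
        """Toy stand-in for the write-out thread: push queued blocks out."""
        while self.write_out:
            block_id, data = self.write_out.popleft()
            self.server[block_id] = data

    def _new_entry(self, sector, data):
        if len(self.cache) >= self.cache_slots:        # cache full: replace
            victim = self.order.popleft()
            old_data, modified = self.cache.pop(victim)
            if modified:                                # keep the changes
                self.write_out.append((self.sector_table[victim], old_data))
        self.cache[sector] = [data, False]
        self.order.append(sector)

    def access(self, sector, write_data=None):
        is_read = write_data is None
        if sector not in self.sector_table:
            if is_read:
                return b""                              # never written: "rubbish"
            self.sector_table[sector] = self.next_block_id  # stands in for the NMS
            self.next_block_id += 1
            self._new_entry(sector, b"")
        elif sector not in self.cache:
            data = b""
            if is_read:
                self.flush()                            # toy: drain pending writes first
                data = self.server.get(self.sector_table[sector], b"")
            self._new_entry(sector, data)
        entry = self.cache[sector]
        if is_read:
            return entry[0]
        entry[0], entry[1] = write_data, True
        return None
```

Reading a sector that was never written returns arbitrary contents without contacting the server, which is safe because a real disk makes no promise about unwritten sectors either.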

Page 20: Network Memory Servers: An idea whose time has come

Structures on NMS Server

[Diagram. Labels: Client_id Hash Table; Block_id Hash Table (Two-level); Allocated Memory; Memory for Clients; Memory for Internal Use by the NMS]

Page 21: Network Memory Servers: An idea whose time has come

Testing and Evaluation

• What do we really want to know

• What does it take to operate faster than a hard disk?
– Can you use standard hardware (Middlesex)
– Do you need special hardware (Cambridge)
  • Level 5 Networks

• What are the key parameters in this space

Page 22: Network Memory Servers: An idea whose time has come

What do you measure

• What happens if we change the block size of the data transfer

• What happens if we change the number of units transferred in one transfer
– Added multi-write operation

• Is local caching any good

• What is the network traffic like

Page 23: Network Memory Servers: An idea whose time has come

Using Iozone

• Iozone is quite popular
– Measures the memory hierarchy

• Disk particulars
– 60 GB, 2MB buffer, 7200 RPM, Seek Time 9.0 ms, Average latency 4.16ms

• Network
– Using Intel E1000 NICs and Netgear Gigabit Switch (GS 104); using UDP port 6111

• NMS client and server implemented as Linux kernel modules
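A quick back-of-the-envelope check relates the disk figures above to the network. Half a revolution at 7200 RPM is about 4.17 ms, which agrees with the quoted average latency of 4.16 ms. The wire time below uses the nominal 1 Gb/s line rate and ignores protocol headers, switch latency and software overhead, so it is a lower bound rather than a measurement; the 4 kB size is one of the message sizes used in the tests.

```python
RPM = 7200
SEEK_MS = 9.0                      # quoted seek time
BLOCK_BYTES = 4 * 1024             # one 4 kB message, as in the test runs
LINK_BITS_PER_SEC = 1_000_000_000  # nominal Gigabit Ethernet rate

# On average the platter must turn half a revolution to reach the sector.
rotational_latency_ms = 0.5 * 60_000 / RPM

# Time to push one block onto the wire, ignoring headers and software overhead.
wire_time_ms = BLOCK_BYTES * 8 * 1000 / LINK_BITS_PER_SEC

disk_access_ms = SEEK_MS + rotational_latency_ms
print(f"rotational latency: {rotational_latency_ms:.2f} ms")
print(f"disk access (seek + rotation): {disk_access_ms:.2f} ms")
print(f"4 kB on the wire at 1 Gb/s: {wire_time_ms:.4f} ms")
```

On these numbers a random disk access costs roughly 13 ms before any data moves, while serialising 4 kB onto the link takes about 0.03 ms. That gap is the budget the network path and the server software have to stay within.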

Page 24: Network Memory Servers: An idea whose time has come

Read Performance

[Chart: throughput (kB/sec, 0 to 1,400,000) against file size (kB, 0 to 350,000). Series: mw4, 2MB_cache, 1kB_msg; disk; sw, 2MB_cache, 4kB_msg]

Page 25: Network Memory Servers: An idea whose time has come

Record Rewrite Performance

[Chart: throughput (kB/sec, 0 to 1,200,000) against file size (kB, 0 to 300,000). Series: MW4 2mb cache, 1k; disk system]

Page 26: Network Memory Servers: An idea whose time has come

Write Performance for Different Transfer sizes

[Chart: throughput (kB/sec, 0 to 300,000) against file size (kB, 0 to 350,000). Series: sw, 2MB_cache, 4kB_msg; disk; sw, 2MB_cache, 1kB_msg; sw, 2MB_cache, 2kB_msg]

Page 27: Network Memory Servers: An idea whose time has come

Write Performance for Multiples of 1K blocks

[Chart: throughput (kB/sec, 0 to 300,000) against file size (kB, 0 to 300,000). Series: mw4, 2MB_cache, 1kB_msg; disk; mw12, 2MB_cache, 1kB_msg; mw8, 2MB_cache, 1kB_msg; mw16, 2MB_cache, 1kB_msg]

Page 28: Network Memory Servers: An idea whose time has come

Write Performance for extreme configurations

[Chart: throughput (kB/sec, 0 to 300,000) against file size (kB, 0 to 350,000). Series: disk; mw17k, 2MB_cache, 4kB_msg; mw32k, 8MB_cache, 4kB_msg; sw, 2MB_cache, 1kB_msg]

Page 29: Network Memory Servers: An idea whose time has come

Maximum data transfer rate

[Chart: Rate (Mb/sec, 82 to 88) against Filesize (MB, 50 to 250). Series: Received; Sent]

Page 30: Network Memory Servers: An idea whose time has come

Buffer cache Hits

[Chart: % block cache hits (0 to 120) against Filesize (MB, 100 to 250). Series: BCH MAX; BCH MIN]

Page 31: Network Memory Servers: An idea whose time has come

Conclusions and Future

• We can beat the disk

• Will compare these results with those using Level 5 hardware (Rip Sohan, LCE)

• Open source release planned

• Developing a Network Storage Server

• Building prototypes – running Linux and Windows using NMS

