
Liquid: A Scalable Deduplication File System for Virtual Machine Images

Date post: 16-Feb-2017
Upload: anamika-vinod
Liquid- A Scalable Deduplication File System For Virtual Machine Images Anamika G V(12143630) S7 CSA College of Engineering, Cherthala August 8, 2016 Guided By Josna Jose Assistant Professor Computer Science & Engineering
Transcript
Page 1: Liquid: A Scalable Deduplication File System for Virtual Machine Images

Liquid: A Scalable Deduplication File System for Virtual Machine Images

Anamika G V (12143630), S7 CSA

College of Engineering, Cherthala

August 8, 2016

Guided by Josna Jose

Assistant Professor, Computer Science & Engineering

Page 2: Liquid: A Scalable Deduplication File System for Virtual Machine Images

CONTENTS

1 INTRODUCTION
2 VIRTUAL MACHINE
3 DEDUPLICATION
4 EXISTING SYSTEM
5 ISSUES IN VM STORAGE
6 LIQUID SYSTEM ARCHITECTURE
7 DEDUPLICATION IN LIQUID
8 OPTIMIZATIONS ON FINGERPRINT CALCULATION
9 FILE SYSTEM LAYOUT
10 COMMUNICATION AMONG COMPONENTS: HEARTBEAT PROTOCOL
11 FAST CLONING FOR VM IMAGES
12 FAULT TOLERANCE
13 GARBAGE COLLECTION
14 ADVANTAGES OF LIQUID
15 CONCLUSION
16 REFERENCES

Page 3: Liquid: A Scalable Deduplication File System for Virtual Machine Images

INTRODUCTION

Cloud computing is the practice of using remote servers hosted on the internet to store, manage, and process data, rather than a local server or a personal computer.

Figure 1: A sample cloud computing network

Page 4: Liquid: A Scalable Deduplication File System for Virtual Machine Images

VIRTUALIZATION AND VIRTUAL MACHINE

1 Virtualization deals with extending or replacing an existing interface so as to mimic the behavior of another system.

2 A virtual machine is software that creates a virtualized environment between the computer platform and the end user, in which the end user can operate software.

3 Crucial component in cloud computing.

4 Virtual machine: a hypothetical computer.

5 Executes programs like a physical machine.

6 The initial state of a virtual machine is stored in a file called a virtual machine image.

Page 5: Liquid: A Scalable Deduplication File System for Virtual Machine Images

VIRTUAL MACHINE

Figure 2: Virtual Machine representation

Page 6: Liquid: A Scalable Deduplication File System for Virtual Machine Images

DEDUPLICATION

1 Data deduplication is a data compression technique.

2 Eliminates duplicate copies of repeating data.

3 Redundant data blocks are replaced with references to a single stored copy, which avoids the storage consumption problems caused by a large number of VM images.

4 Improves storage utilization.

Figure 3: Deduplicated file system

Page 7: Liquid: A Scalable Deduplication File System for Virtual Machine Images

EXISTING SYSTEM

1 Hypervisors such as Xen and KVM.

2 Network Attached Storage (NAS)

3 Storage Area Network (SAN)

4 Direct Attached Storage (DAS)

Page 8: Liquid: A Scalable Deduplication File System for Virtual Machine Images

ISSUES IN VM STORAGE

1 High demand on VM storage remains a challenging problem.

2 Existing systems have made efforts to reduce storage consumption.

3 They use SAN clusters.

4 SAN clusters cannot satisfy increasing demand due to cost limitations.

5 Hence we propose LIQUID.

Page 9: Liquid: A Scalable Deduplication File System for Virtual Machine Images

LIQUID SYSTEM ARCHITECTURE

1 Three components: a single meta server with a hot backup, multiple data servers, and multiple clients.

2 Each component runs as a user-level service process.

3 VM images are split into fixed-size data blocks.

4 The meta server maintains the namespace, fingerprints, and reference counts.

5 The meta server is mirrored to a hot backup shadow meta server.

6 Data servers are in charge of managing the data blocks in VM images.

7 Data servers are organized in a distributed hash table.

8 A Liquid client provides a POSIX-compatible file system.

9 The client is a critical component (it provides deduplication).

10 Fault tolerance: mirroring the meta server.

11 Replicas of data blocks are stored.
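
A minimal sketch of how fingerprint-keyed placement with replicas could look. The slides only say that data servers are organized in a distributed hash table and that replicas are stored; the modulo-based placement, the replica count, and the names `fingerprint`, `pick_data_servers`, and `ds0`..`ds3` are all assumptions made for illustration, not Liquid's actual scheme.

```python
import hashlib

def fingerprint(block: bytes) -> str:
    """Hex SHA-1 digest of a block's contents."""
    return hashlib.sha1(block).hexdigest()

def pick_data_servers(fp: str, servers: list[str], replicas: int = 2) -> list[str]:
    """Choose `replicas` distinct servers for a fingerprint (hypothetical policy).

    The fingerprint selects a starting server; further replicas go to the
    following servers in list order.
    """
    start = int(fp, 16) % len(servers)
    count = min(replicas, len(servers))
    return [servers[(start + i) % len(servers)] for i in range(count)]

servers = ["ds0", "ds1", "ds2", "ds3"]   # hypothetical data server names
fp = fingerprint(b"example block")
placement = pick_data_servers(fp, servers)
```

Because placement depends only on the fingerprint, any client that computes the same fingerprint locates the same servers without asking a central directory. A production DHT would also need to handle servers joining and leaving, which this sketch does not.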

Page 10: Liquid: A Scalable Deduplication File System for Virtual Machine Images

LIQUID SYSTEM ARCHITECTURE (CONT.)

Figure 4: Liquid Architecture

Page 11: Liquid: A Scalable Deduplication File System for Virtual Machine Images

DEDUPLICATION IN LIQUID

1 Liquid chooses fixed-size chunking instead of variable-size chunking.

2 Fixed-size chunking works well since all files stored in VM images are aligned on disk block boundaries.

3 Advantage: simplicity.

4 Block size choice.

5 Block size is a balancing factor that is hard to choose. It has a great impact on both deduplication ratio and IO performance.

Page 12: Liquid: A Scalable Deduplication File System for Virtual Machine Images

DEDUPLICATION IN LIQUID (CONT.)

1 A smaller block size causes more random seeks when accessing a VM image.

2 This is not tolerable.

3 A large block size is also not preferable, since it reduces the deduplication ratio.

4 Liquid chooses different block sizes in different situations.

5 It is advised to use a multiple of 4 KB between 256 KB and 1 MB to achieve a good balance between IO performance and deduplication ratio.
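
The block-size advice and fixed-size chunking can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, SHA-1 is picked as the fingerprint, and the deduplication ratio is taken here to mean total blocks divided by distinct blocks.

```python
import hashlib

KB = 1024

def is_advised_block_size(size: int) -> bool:
    """True if size is a multiple of 4 KB between 256 KB and 1 MB."""
    return size % (4 * KB) == 0 and 256 * KB <= size <= 1024 * KB

def split_fixed(data: bytes, block_size: int) -> list[bytes]:
    """Cut data into consecutive blocks of block_size bytes (last may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def dedup_ratio(data: bytes, block_size: int) -> float:
    """Total block count divided by the number of distinct fingerprints."""
    blocks = split_fixed(data, block_size)
    if not blocks:
        return 1.0
    unique = {hashlib.sha1(b).digest() for b in blocks}
    return len(blocks) / len(unique)

# Toy "image": 8 blocks, of which only 2 are distinct.
block_size = 256 * KB
image = (b"A" * block_size + b"B" * block_size) * 4
ratio = dedup_ratio(image, block_size)   # 8 blocks / 2 distinct = 4.0
```

Running `dedup_ratio` over the same image with several candidate block sizes is one simple way to observe the trade-off described above.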

Page 13: Liquid: A Scalable Deduplication File System for Virtual Machine Images

DEDUPLICATION IN LIQUID (CONT.)

Figure 5: Pictorial representation of Deduplication

Page 14: Liquid: A Scalable Deduplication File System for Virtual Machine Images

OPTIMIZATIONS ON FINGERPRINT CALCULATION

∗ Deduplication relies on comparison of data block fingerprints to detect redundancy.
∗ A fingerprint is a collision-resistant hash value calculated from data block contents.
∗ MD5[26] and SHA-1[12] are frequently used for this purpose.
∗ The probability of a fingerprint collision is very small, orders of magnitude smaller than hardware error rates.
∗ So we can safely assume that two data blocks with the same fingerprint are identical.
∗ Fingerprint calculation is expensive.
∗ Liquid delays fingerprint calculation for recently modified data blocks.
∗ It runs deduplication lazily, only when necessary.
∗ The client side maintains a shared cache which contains recently accessed data blocks.
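
The claim that collisions are negligible can be made concrete with the standard birthday bound. The symbols n (number of distinct blocks) and b (fingerprint length in bits) are introduced here for illustration, and the bound assumes fingerprints behave like uniformly random b-bit values, which is an idealization that covers accidental collisions only:

```latex
\[
  P_{\text{collision}} \;\le\; \binom{n}{2}\, 2^{-b}
  \;=\; \frac{n(n-1)}{2^{\,b+1}}
  \;\approx\; \frac{n^{2}}{2^{\,b+1}}
\]
% Example: SHA-1 (b = 160) and n = 2^{40} distinct blocks
\[
  P_{\text{collision}} \;\lesssim\; \frac{2^{80}}{2^{161}} \;=\; 2^{-81}
\]
```

With 256 KB blocks, 2^40 distinct blocks correspond to 256 PiB of unique data, so under this idealization the bound stays tiny even at very large scale.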

Page 15: Liquid: A Scalable Deduplication File System for Virtual Machine Images

OPTIMIZATIONS ON FINGERPRINT CALCULATION (CONT.)

∗ A portion of memory is used by the client side of Liquid as a private cache.
∗ The private cache holds modified data blocks and delays fingerprint calculation on them.
∗ When a data block is modified, it is ejected from the shared cache and added to the private cache.
∗ Modified data blocks are ejected from the private cache when it becomes full.
∗ Blocks are ejected based on an LRU policy.
∗ Only then is the modified data block's fingerprint calculated.
∗ Liquid uses multiple threads for fingerprint calculation.
∗ Multiple threads process different data blocks concurrently.
∗ This provides good IO performance.
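
The lazy fingerprinting idea can be sketched with a small LRU-ordered private cache. This is a simplified illustration, not Liquid's code: the class `PrivateCache`, its fields, and the use of SHA-1 are assumptions, and the shared cache and the multi-threaded fingerprinting are not modeled.

```python
import hashlib
from collections import OrderedDict

class PrivateCache:
    """Holds modified blocks and fingerprints them only when they are ejected."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.blocks = OrderedDict()   # block number -> latest data, in LRU order
        self.flushed = {}             # block number -> fingerprint at ejection

    def write(self, block_no: int, data: bytes) -> None:
        self.blocks[block_no] = data
        self.blocks.move_to_end(block_no)   # mark as most recently used
        while len(self.blocks) > self.capacity:
            # Eject the least recently used block; only now is it fingerprinted.
            old_no, old_data = self.blocks.popitem(last=False)
            self.flushed[old_no] = hashlib.sha1(old_data).hexdigest()

cache = PrivateCache(capacity=2)
cache.write(0, b"v1")
cache.write(0, b"v2")   # overwrite in place: b"v1" is never fingerprinted
cache.write(1, b"x")
cache.write(2, b"y")    # cache is full, so block 0 is ejected and fingerprinted
```

A block that is rewritten many times while it stays in the private cache is hashed once, at ejection, rather than once per write, which is where the saving comes from.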

Page 16: Liquid: A Scalable Deduplication File System for Virtual Machine Images

FILE SYSTEM LAYOUT

1 All file system metadata is stored on the meta server.

2 Metadata is organized in a file system tree.

3 The client side can cache portions of file system metadata for fast access.

4 When a VM is stopped, modified metadata and data blocks are pushed back to the meta server and data servers.

5 This ensures that modifications to a VM image are visible to other client nodes.

Page 17: Liquid: A Scalable Deduplication File System for Virtual Machine Images

FILE SYSTEM LAYOUT (CONT.)

Figure 6: File System Structure

Page 18: Liquid: A Scalable Deduplication File System for Virtual Machine Images

COMMUNICATION AMONG COMPONENTS: HEARTBEAT PROTOCOL

1 The meta server manages all data servers.

2 It exchanges regular heartbeat messages with each data server in a round-robin fashion.

3 Round-robin polling is slow to detect failed data servers when there are many data servers.

4 To speed up failure detection, data servers send an error signal to the meta server.
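
The heartbeat scheme above can be sketched as a small in-memory model. Everything here is hypothetical: the class `MetaServer`, the explicit `now` timestamps, the timeout value, and the assumption that an error signal names the suspected data server are illustrative choices, and no real networking is involved.

```python
import itertools

class MetaServer:
    def __init__(self, data_servers, timeout):
        self.timeout = timeout
        self.last_seen = {name: 0.0 for name in data_servers}
        self.failed = set()
        self._ring = itertools.cycle(sorted(data_servers))

    def next_to_probe(self):
        """Round-robin choice of the next data server to exchange a heartbeat with."""
        return next(self._ring)

    def on_heartbeat(self, name, now):
        """Record a successful heartbeat exchange."""
        self.last_seen[name] = now
        self.failed.discard(name)

    def on_error_signal(self, suspect):
        """Fast path: an error signal marks the suspect failed without waiting."""
        self.failed.add(suspect)

    def check_timeouts(self, now):
        """Slow path: mark servers whose last heartbeat is older than the timeout."""
        for name, seen in self.last_seen.items():
            if now - seen > self.timeout:
                self.failed.add(name)
        return self.failed

meta = MetaServer(["ds0", "ds1", "ds2"], timeout=5.0)
order = [meta.next_to_probe() for _ in range(4)]   # wraps around after ds2
meta.on_heartbeat("ds0", now=10.0)
meta.on_heartbeat("ds1", now=10.0)
meta.on_error_signal("ds1")               # reported failed despite a recent heartbeat
failed = meta.check_timeouts(now=12.0)    # ds2 has not been heard from since 0.0
```

The two paths complement each other: the timeout catches silent failures eventually, while the error signal shortens detection when a failure is observed directly.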

Page 19: Liquid: A Scalable Deduplication File System for Virtual Machine Images

FAST CLONING FOR VM IMAGES

1 Copying large images may be time consuming.

2 Liquid provides an efficient solution by means of fast cloning.

3 VM images are represented by metadata files holding references to data blocks.

4 A VM image is cloned by copying its metadata file and updating the reference counts.

5 Modifications to cloned images do not affect the original image.
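
Fast cloning via metadata copy and reference counting can be sketched like this. The class `MetaStore`, its methods, and the fingerprint strings such as `fp_a` are hypothetical names for illustration; the real metadata file format is not described in the slides.

```python
from collections import Counter

class MetaStore:
    def __init__(self):
        self.images = {}            # image name -> list of block fingerprints
        self.refcount = Counter()   # fingerprint -> number of references

    def add_image(self, name, fingerprints):
        self.images[name] = list(fingerprints)
        self.refcount.update(self.images[name])

    def clone(self, src, dst):
        """Clone by copying only the metadata; no data block is copied."""
        self.images[dst] = list(self.images[src])
        self.refcount.update(self.images[dst])   # one more reference per block

    def write_block(self, name, index, new_fp):
        """Point one block of one image at new content; other images are untouched."""
        old_fp = self.images[name][index]
        self.refcount[old_fp] -= 1
        self.images[name][index] = new_fp
        self.refcount[new_fp] += 1

store = MetaStore()
store.add_image("base", ["fp_a", "fp_b"])
store.clone("base", "vm1")
store.write_block("vm1", 1, "fp_c")   # the clone diverges; "base" keeps fp_b
```

The cost of `clone` depends on the size of the metadata rather than the size of the image, which is why cloning is fast even for large images.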

Page 20: Liquid: A Scalable Deduplication File System for Virtual Machine Images

FAULT TOLERANCE

1 Data replication

2 Data migration

3 Hot backup of Meta server

Page 21: Liquid: A Scalable Deduplication File System for Virtual Machine Images

GARBAGE COLLECTION

1 Removes unused (garbage) data blocks when running out of space.

2 Reference counts of all data blocks are maintained by the meta server.

3 A garbage collection request is issued periodically to the data servers.

4 Garbage collection is executed based on data block membership in the Bloom filter.
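
One way the Bloom filter step could work is sketched below. It assumes the filter is built from the fingerprints that still have a positive reference count and that a data server deletes every stored block whose fingerprint is not in the filter; that direction, the filter parameters, and the names `BloomFilter` and `collect_garbage` are assumptions for illustration.

```python
import hashlib

class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)   # one byte per bit, for clarity

    def _positions(self, item):
        # Derive num_hashes positions by salting SHA-1 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha1(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))

def collect_garbage(stored, refcount):
    """Delete stored blocks whose fingerprint is not in the live-block filter."""
    live = BloomFilter()
    for fp, count in refcount.items():
        if count > 0:
            live.add(fp)
    garbage = [fp for fp in stored if fp not in live]
    for fp in garbage:
        del stored[fp]
    return garbage

stored = {"fp_a": b"...", "fp_b": b"...", "fp_c": b"..."}   # blocks on a data server
refcount = {"fp_a": 2, "fp_b": 0, "fp_c": 1}                # kept by the meta server
removed = collect_garbage(stored, refcount)
```

A Bloom filter never reports a false negative, so a block that is still referenced is never deleted. A false positive only means that some garbage survives until a later collection, which is a safe failure mode.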

Page 22: Liquid: A Scalable Deduplication File System for Virtual Machine Images

ADVANTAGES OF LIQUID

1 Fast virtual machine deployment with peer-to-peer data transfer.

2 Low storage consumption by means of deduplication.

3 Instant cloning for virtual machine images.

4 On-demand fetching through network caching with local disks.

5 LIQUID files have no specific limit.

Page 23: Liquid: A Scalable Deduplication File System for Virtual Machine Images

CONCLUSION

1 Presented LIQUID, a deduplication file system with good IO performance.

2 This is achieved by caching frequently accessed data blocks in a memory cache.

3 Avoids additional disk operations.

4 Deduplication of VM images proved to be effective.

Page 24: Liquid: A Scalable Deduplication File System for Virtual Machine Images

REFERENCES

[1] Xun Zhao, Yang Zhang, Yongwei Wu, Kang Chen, Jinlei Jiang, and Keqin Li, Senior Member, IEEE, "Liquid: A Scalable Deduplication File System for Virtual Machine Images," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 5, May 2014.
[2] Amazon Machine Image, Sept. 2001. [Online]. Available: http://en.wikipedia.org/wiki/Amazon/Machine/Image
[3] Bittorrent Protocol, Sept. 2011. [Online]. Available: http://en.wikipedia.org/wiki/BitTorrent/protocol
[4] Bloom Filter, Sept. 2011. [Online]. Available: http://en.wikipedia.org/wiki/Bloom/filter
[5] Xfs: A High Performance Journaling Filesystem, Sept. 2011. [Online]. Available: http://oss.sgi.com/projects/xfs

Page 25: Liquid: A Scalable Deduplication File System for Virtual Machine Images

[6] Rabin Fingerprint, Sept. 2011. [Online]. Available: http://en.wikipedia.org/wiki/Rabin/fingerprint
[7] Data Deduplication, Sept. 2013. [Online]. Available: http://en.wikipedia.org/wiki/Data/deduplication
[8] A.T. Clements, I. Ahmad, M. Vilayannur, and J. Li, "Decentralized Deduplication in San Cluster File Systems," in Proc. Conf. USENIX Annu. Techn. Conf., 2009, p. 8, USENIX Association.
[9] K. Jin and E.L. Miller, "The Effectiveness of Deduplication on Virtual Machine Disk Images," in Proc. SYSTOR, Israeli Exp. Syst. Conf., New York, NY, USA, 2009, pp. 1-12.
[10] A. Liguori and E. Hensbergen, "Experiences with Content Addressable Storage and Virtual Disks," in Proc. WIOV08, San Diego, CA, USA, 2008, p. 5.

Page 26: Liquid: A Scalable Deduplication File System for Virtual Machine Images

[11] M. McLoughlin, "The qcow2 Image Format," Sept. 2011. [Online]. Available: http://people.gnome.org/markmc/qcow-image-format.html
[12] C. Tang, "Fvd: A High-Performance Virtual Machine Image Format for Cloud," in Proc. USENIX Conf. USENIX Annu. Tech. Conf., 2011, p. 18.
[13] B. Zhu, K. Li, and H. Patterson, "Avoiding the Disk Bottleneck in the Data Domain Deduplication File System," in Proc. 6th USENIX Conf. FAST, Berkeley, CA, USA, 2008, pp. 269-282.
