Cloudifying Source Code Repositories: How much does it cost?

Post on 23-Feb-2016



Hadi Salimi
Distributed Systems Laboratory, School of Computer Engineering,
Iran University of Science and Technology
hsalimi@iust.ac.ir

Fall 2010

What Is Cloud Computing?
• Large scale
• Application-specific architectures
• Developed for in-house use
• Available for general usage
• Inexpensive, even for small or medium scale deployments

What is Revision Control?
• Repository for data (source code)
  – All changes are tracked by date and author
  – Branching and merging
• Why move it to the cloud?
  – Resilient storage
  – No physical server to administer
  – Scale to larger communities (SourceForge)

Available Tools
• Subversion, revision control system
  – Free, open-source
  – Very popular
  – Rigid consistency model

Available Tools (Cont’d)
• Amazon S3, cloud storage service
  – Eventual consistency
• Yahoo ZooKeeper, coordination service
  – Free, open-source

Alternative solutions

Cloud Computing
• Subversion etc.
• Repository stored persistently in the cloud
• One true, consistent repository exists

P2P
• Git etc.
• Repository stored at every client
• Many repository copies, converging eventually

Outline
• Costs of using cloud storage for revision control
• Architecture of a simple solution
• Performance evaluation

How to Measure Costs
• Each revision stored as two files on disk
  – Revision data
  – Revision properties
• Calculate bandwidth, per-transaction, and storage costs of pushing each revision into S3 over time
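The cost calculation described above can be sketched roughly as follows. This is only an illustration: the slides do not give the prices or revision sizes that were used, so the `Prices` fields, the function name `revision_costs`, and all numbers in the usage lines are hypothetical. The sketch also assumes, for simplicity, that every revision is stored for the whole period.

```python
from dataclasses import dataclass

GIB = 1024 ** 3  # bytes per GiB


@dataclass(frozen=True)
class Prices:
    """Hypothetical unit prices (not taken from the slides or any price list)."""
    storage_per_gib_month: float
    transfer_in_per_gib: float
    per_put_request: float


def revision_costs(revision_sizes, prices, months_stored):
    """Estimate the cost of pushing revisions into cloud storage.

    revision_sizes: list of (data_bytes, props_bytes), one tuple per revision.
    Simplification: all revisions are billed for months_stored months.
    """
    total_bytes = sum(data + props for data, props in revision_sizes)
    puts = 2 * len(revision_sizes)  # two files (data + properties) per revision
    bandwidth = total_bytes / GIB * prices.transfer_in_per_gib
    requests = puts * prices.per_put_request
    storage = total_bytes / GIB * prices.storage_per_gib_month * months_stored
    return {
        "bandwidth": bandwidth,
        "requests": requests,
        "storage": storage,
        "total": bandwidth + requests + storage,
    }


# Usage with made-up numbers: two revisions of 1 GiB each, kept for 3 months.
example_prices = Prices(storage_per_gib_month=0.5,
                        transfer_in_per_gib=0.25,
                        per_put_request=0.125)
example = revision_costs([(GIB // 2, GIB // 2)] * 2, example_prices, 3)
```

Splitting the result into three components mirrors the three cost categories named on the slide, so each can be examined separately as the repository grows.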

Storage Costs


Storage Trends


Outline
• Costs of using cloud storage for revision control
• Architecture of a simple solution
• Performance evaluation

Today’s architecture for source code revision control...
[Diagram labels: Clients; Primary; Backup; Asynchronous Replication]

A cloud-based architecture...
[Diagram labels: EC2, EC2; S3, S3, S3]

Two simultaneous commits…
[Diagram labels: EC2, EC2; S3, S3, S3; Rev. 31337 (three times)]
Followed by an update… Leads to data loss!
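The data-loss scenario can be illustrated with a small simulation. This is a sketch under stated assumptions, not the system from the slides: a plain Python dict stands in for S3, and `naive_commit` is a hypothetical helper representing a front-end server that picks the next revision number from its own view of the repository head.

```python
def naive_commit(store, head_seen, payload):
    """Write the next revision based on a possibly stale view of the head.

    Nothing prevents two callers from choosing the same revision number,
    so the later write silently replaces the earlier one.
    """
    new_rev = head_seen + 1
    store[new_rev] = payload
    return new_rev


# A dict stands in for the shared storage; revision 31336 is the current head.
store = {31336: "base"}

# Both front-ends read the head before either of them writes.
head_a = max(store)
head_b = max(store)

rev_a = naive_commit(store, head_a, "change from client A")
rev_b = naive_commit(store, head_b, "change from client B")

# Both commits were assigned the same number, and only one payload survives.
lost = "change from client A" not in store.values()
```

Both clients are told that their commit succeeded as the same revision, yet a later update can return only one of the two changes.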

Coordination
[Diagram labels: EC2, EC2; S3, S3, S3]

Commit Process
[Diagram label: ZooKeeper]
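The transcript does not preserve the individual steps of the commit process, so the following is only one plausible reading: a coordination service hands out revision numbers one at a time, so that no two commits can claim the same number. The `Coordinator` class is a hypothetical in-process stand-in built on `threading.Lock`; it does not use the ZooKeeper API, and a dict again stands in for S3.

```python
import threading


class Coordinator:
    """In-process stand-in for a coordination service (assumption, not ZooKeeper)."""

    def __init__(self, head):
        self._lock = threading.Lock()
        self._head = head

    def next_revision(self):
        # The lock makes "read head, increment, return" a single atomic step.
        with self._lock:
            self._head += 1
            return self._head


def coordinated_commit(store, coordinator, payload):
    """Reserve a unique revision number first, then write the payload under it."""
    rev = coordinator.next_revision()
    store[rev] = payload
    return rev


def run_demo(n_clients=8):
    """Run n_clients concurrent commits and return the resulting store."""
    store = {31336: "base"}
    coordinator = Coordinator(head=31336)
    threads = [
        threading.Thread(target=coordinated_commit,
                         args=(store, coordinator, f"change {i}"))
        for i in range(n_clients)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store
```

Because every commit receives a distinct revision number before it writes, concurrent commits end up under different keys and none of them overwrites another.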

Outline
• Costs of using cloud storage for revision control
• Architecture of a simple solution
• Performance evaluation

Usage Observations
• Apache Foundation
  – 1 repository, 74 projects
  – Average 1.10 commits per minute
  – Maximum 7 commits per minute
• Debian community
  – 506 repositories
  – Average 1.12 commits per minute
  – Maximum 6 commits per minute
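To relate these commit rates to storage requests, the short calculation below converts commits per minute into write requests per day. It assumes each revision is written as two files (revision data and revision properties), as described earlier in the deck, and that the average rate holds around the clock. The function name `puts_per_day` is illustrative.

```python
MINUTES_PER_DAY = 60 * 24


def puts_per_day(commits_per_minute, files_per_revision=2):
    """Approximate storage write requests per day at a steady commit rate."""
    return commits_per_minute * MINUTES_PER_DAY * files_per_revision


apache_avg = puts_per_day(1.10)  # Apache Foundation average rate
debian_avg = puts_per_day(1.12)  # Debian community average rate
apache_peak_per_minute = 7 * 2   # writes in a minute at the observed maximum
```

At these rates the average load is on the order of a few thousand write requests per day.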

Results
[Charts: Checkouts (Reads); Commits (Writes)]
• Adding servers improves the user experience

Conclusion
• Storing source code repositories in the cloud is feasible…
• …and very inexpensive
• Only minor changes to existing revision control systems are necessary to robustly take advantage of cloud storage

Questions or Comments?
