An Introduction to Cloud Computing
Robert Grossman
December 8, 2009
Part 1
Introduction
What is a Cloud?
Clouds provide elastic, on-demand resources or services over a network, often the Internet, with the scale and reliability of a data center.
The NIST definition has become standard.
Cloud architectures are not new. What is new:
– Scale
– Ease of use
– Pricing model
Scale is new.
Elastic, Usage Based Pricing Is New
1 computer in a rack for 120 hours costs the same as 120 computers in three racks for 1 hour.
Elastic, usage-based pricing turns capex into opex.
Clouds can manage surges in computing needs.
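The 1-computer versus 120-computer equivalence is plain arithmetic under per-machine-hour billing. The following is a minimal sketch, assuming a flat hourly rate; the function name and the rate value are made up for illustration and do not come from the talk.

```python
def usage_cost(machines: int, hours: float, rate_per_machine_hour: float) -> float:
    """Cost under flat, usage-based pricing: pay only for machine-hours used."""
    return machines * hours * rate_per_machine_hour


# Hypothetical rate, chosen only for illustration.
RATE = 0.10

one_machine_long = usage_cost(machines=1, hours=120, rate_per_machine_hour=RATE)
many_machines_short = usage_cost(machines=120, hours=1, rate_per_machine_hour=RATE)

# Same machine-hours, same bill; only the wall-clock time differs.
print(one_machine_long, many_machines_short)
```

Under this assumption the bill depends only on total machine-hours, which is why a surge can be absorbed by renting many machines briefly.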
Simplicity Offered By the Cloud is New
+ .. and you have a computer ready to work.
A new programmer can develop a program to process a container full of data with less than a day of training using MapReduce.
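To give a feel for why the MapReduce model is quick to learn, here is a minimal single-process sketch of the map, shuffle and reduce steps applied to word counting. It is an illustration only: the names `map_reduce`, `word_mapper` and `count_reducer` are hypothetical, and a real framework such as Hadoop would distribute the same steps across many nodes.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run mapper over records, group values by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # the "shuffle" step: group by key
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    """Emit (word, 1) for every word in a line of text."""
    for word in line.lower().split():
        yield word, 1


def count_reducer(word, counts):
    """Sum the partial counts collected for one word."""
    return sum(counts)


lines = ["the cloud is elastic", "the cloud is on demand"]
word_counts = map_reduce(lines, word_mapper, count_reducer)
print(word_counts)
```

The programmer writes only the mapper and the reducer; grouping, distribution and fault handling are left to the framework.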
Two Types of Clouds
On-demand resources & services over a network at the scale of a data center
On-demand, elastic computing instances (IaaS)
– IaaS: Amazon EC2, S3, etc.; Eucalyptus
– Supports many Web 2.0 applications/users
Large data clouds (Large Data PaaS)
– GFS/MapReduce/Bigtable, Hadoop, Sector, …
– Manage and compute with large data (say 100+ TB)
Ease of use: With Google’s GFS & MapReduce, it is simple to compute with 10 terabytes of data over 100 nodes.
With Amazon’s AMIs, it is simple to respond to a surge by launching 100 additional web servers.
Cloud Architectures – How Do You Fill a Data Center?
Cloud Storage Services
Cloud Compute Services (MapReduce & Generalizations)
Cloud Data Services (BigTable, etc.)
Quasi-relational Data Services
[Figure: many apps running on on-demand computing capacity; many apps running on on-demand computing instances]
Varieties of Clouds
Architectural Model
– Computing instances vs computing capacity
Economic Model
– Elastic, usage-based pricing, lease/own, …
Management Model
– Private vs public; single vs multiple tenant; …
Programming Model
– Queue service, MPI, MapReduce, distributed UDF
Computing instances vs computing capacity
Private internal vs public external
Elastic, usage-based pricing or not
All combinations occur.
Payment Models
Buying racks, containers and data centers
Leasing racks, containers and data centers
Utility-based computing (pay as you go)
– Moves capex to opex
– Handles surge requirements (use 1000 servers for 1 hour vs 1 server for 1000 hours)
Management Models
Public, private and hybrid models
Single tenant vs multiple tenant (shared vs non-shared hardware)
Owned vs leased
Manage yourself vs outsource management
All combinations are possible
Programming Model
Amazon’s Simple Queue Service
MPI, sockets, FIFO
MapReduce, Distributed UDF
on-demand computing instances
on-demand computing capacity
DryadLINQ Azure services
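As an illustration of the queue-based programming model, the sketch below has independent workers pull tasks from a shared queue. It uses Python's standard `queue` and `threading` modules as a local stand-in; it is not the Amazon SQS API, and `run_workers` is a hypothetical helper name.

```python
import queue
import threading


def run_workers(tasks, handler, num_workers=4):
    """Process tasks pulled from a shared queue by independent workers."""
    work = queue.Queue()
    results = queue.Queue()
    for task in tasks:
        work.put(task)  # all tasks are enqueued before any worker starts

    def worker():
        while True:
            try:
                task = work.get_nowait()
            except queue.Empty:
                return  # no work left, so this worker exits
            results.put((task, handler(task)))

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # All workers have finished, so draining the results queue is safe.
    out = {}
    while not results.empty():
        task, value = results.get()
        out[task] = value
    return out


squares = run_workers(range(10), lambda n: n * n)
print(squares)
```

Because workers share nothing except the queue, adding capacity during a surge means starting more workers, which is the property that makes this model a natural fit for on-demand instances.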
[Figure: layered cloud stack labeled IaaS, PaaS and Apps, with components: Applications, Storage Services, Compute Services, Data Services, Metadata Services, Identity Manager, Virtual Machine Manager, Virtual Network Manager, Network Transport]
Instances, Services & Frameworks
[Figure: examples arranged by category (instance (IaaS), service, framework (PaaS), operating system) and by scale (single instance, many instances). Examples shown: Amazon’s EC2, S3, Amazon’s SQS, Azure Services, Hadoop DFS & MapReduce, Google AppEngine, Microsoft Azure, Force.com, VMware VMotion, …]
Part 2. Cloud Computing Industry
“Cloud computing has become the center of investment and innovation.”
Nicholas Carr, 2009 IDC Directions
Cloud computing is approaching the top of the Gartner hype cycle.
Cloud Computing Eco-System
No agreed-upon terminology
Vendors supporting data centers
Vendors providing cloud apps & services to end users
Vendors supporting the industry, i.e. those developing cloud applications and services for themselves or to sell to end users
Communities developing software, standards, benchmarks, etc.
Cloud Computing Ecosystem
Providers of Cloud Services
Consumers of Cloud Services
Providers of Software as a Service
Consumers of Software as a Service
The Berkeley RAD Report on cloud computing divides the industry into these layers.
Data Centers
Transition Taking Place
A handful of players are building multiple data centers a year and improving with each one.
This includes Google, Microsoft, Yahoo, …
A data center today costs $200 M – $400+ M
The Berkeley RAD Report points out an analogy with the semiconductor industry: companies stopped building their own fabs and started leasing fabs from others as the cost of a fab approached $1B.
Data Center Operating Systems
Data center services include: VM management services, business continuity services, security services, power management services, etc.
[Figure: a workstation hosts VM 1 … VM 5; a Data Center Operating System manages VM 1 … VM 50,000]
Building Data Centers
Sun’s Modular Data Center (MD)
Formerly Project Blackbox
Containers used by Google, Microsoft & others
Data center consists of 10-60+ containers.
Mindmeister Map of Cloud Computing
Dupont’s Mindmeister Map divides the industry:
– IaaS, PaaS, Management, Community
http://www.mindmeister.com/maps/show_public/15936058
Part 3
Virtualization
Virtualization
Virtualization separates logical infrastructure from the underlying physical resources to decrease the time to make changes, improve flexibility, improve utilization and reduce costs.
Example: server virtualization. Use one physical server to support multiple logical virtual machines (VMs), which are sometimes called logical partitions (LPARs).
Technology pioneered by IBM in the 1960s to better utilize mainframes.
Idea Dates Back to the 1960s
[Figure: an IBM mainframe running IBM VM/370, which hosts guest operating systems (CMS, MVS, CMS), each running an app. Caption: Native (Full) Virtualization. Examples: VMware ESX]
Two Types of Virtualization
Using the hypervisor, each guest OS sees its own independent copy of the CPU, memory, IO, etc.
Native (Full) Virtualization. Examples: VMware ESX
[Figure, top to bottom: Apps; Unmodified Guest OS 1 and Unmodified Guest OS 2; Hypervisor; Physical Hardware]
Para Virtualization. Examples: Xen
[Figure, top to bottom: Apps; Modified Guest OS 1 and Modified Guest OS 2; Hypervisor; Physical Hardware]
Four Key Properties
1. Partitioning: run multiple VMs on one physical server; one VM doesn’t know about the others
2. Isolation: security isolation is at the hardware level.
3. Encapsulation: entire state of the machine can be copied to files and moved around
4. Hardware abstraction: provision and migrate VM to another server
Managing Virtual Machines
Provision VM
Schedule VM
Monitor VM
Self-service portal for VM
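As a rough illustration of scheduling and monitoring VMs, the sketch below places VMs on hosts with a first-fit rule and reports CPU utilization. First-fit is an assumption chosen for brevity, not a policy named in the talk, and `Host`, `schedule_vm` and `monitor` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    cpu_total: int
    vms: list = field(default_factory=list)  # (vm_name, cpus) pairs

    @property
    def cpu_free(self) -> int:
        return self.cpu_total - sum(cpus for _, cpus in self.vms)


def schedule_vm(hosts, vm_name, cpus):
    """First-fit: place the VM on the first host with enough free CPUs."""
    for host in hosts:
        if host.cpu_free >= cpus:
            host.vms.append((vm_name, cpus))
            return host.name
    return None  # no capacity; a real manager might queue or reject the request


def monitor(hosts):
    """Report CPU utilization per host as a fraction between 0 and 1."""
    return {h.name: 1 - h.cpu_free / h.cpu_total for h in hosts}


hosts = [Host("host-a", 4), Host("host-b", 8)]
requests = [2, 4, 2, 8]  # CPUs requested by vm-0 .. vm-3
placements = [schedule_vm(hosts, f"vm-{i}", cpus) for i, cpus in enumerate(requests)]
utilization = monitor(hosts)
print(placements, utilization)
```

Here the last request cannot be placed, which is the kind of event a monitoring or self-service layer would surface to the user.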
Part 4
Technical differences between clouds for data intensive computing, databases and supercomputers
Supercomputer Center Model
or
Data Center Model
What Resource is Managed?
Supercomputer Center Model: scarce processors wait for data
– Manage cycles
– Wait for an opening in the queue
– Scatter the data to the processors
– Gather the results
Data Center Model: persistent data waits for queries
– Manage data
– Persistent data waits for queries
– Computation done locally
– Results returned
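The difference between the two models can be sketched by counting what crosses the network. The toy cost model below (one unit per record or per result moved) is an assumption made for illustration, and the function names are hypothetical.

```python
def scatter_gather(partitions, func):
    """Supercomputer-style: ship every record to the processors, then gather results."""
    staged = [list(p) for p in partitions]          # copies sent to the compute nodes
    partials = [func(chunk) for chunk in staged]
    moved = sum(len(p) for p in partitions) + len(partials)  # records out, results back
    return sum(partials), moved


def compute_in_place(partitions, func):
    """Data-center-style: run func where each partition lives; ship only results."""
    partials = [func(p) for p in partitions]
    moved = len(partials)                           # one small result per partition
    return sum(partials), moved


partitions = [list(range(0, 1000)), list(range(1000, 2000)), list(range(2000, 3000))]
total_a, moved_a = scatter_gather(partitions, sum)
total_b, moved_b = compute_in_place(partitions, sum)
print(total_a, moved_a, total_b, moved_b)
```

Both approaches compute the same answer; the second moves far fewer items because the computation travels to the data rather than the other way around.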
Supercomputer Center Model (local)
HPC Grid (distributed)
Data Center 2.0 Model
Distributed 2.0 Data Centers
Trading functionality for scalability.
Databases vs Data Clouds
Trading Functionality for Scalability
| | Databases | Large Data Clouds |
| Scalability | 100’s TB | 100’s PB |
| Functionality | Full SQL-based queries, including joins | Optimized access to sorted tables (tables with single keys) |
| Optimized for | Safe writes | Efficient scans and reads |
| Consistency model | ACID (Atomicity, Consistency, Isolation & Durability): database is always consistent | Eventual consistency: updates eventually propagate through the system |
| Parallelism | Difficult because of ACID model; shared nothing is possible (Graywolf) | Basic design incorporates parallelism over commodity components |
| Scale | Racks | Data center |
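Eventual consistency can be illustrated with a toy set of replicas that exchange updates pairwise. The sketch assumes last-write-wins conflict resolution with integer version numbers, which is one common choice rather than anything stated in the talk; `Replica` and `sync` are hypothetical names.

```python
class Replica:
    """One copy of the data; stores key -> (version, value)."""

    def __init__(self, name):
        self.name = name
        self.store = {}

    def write(self, key, value, version):
        # Last-write-wins: accept only strictly newer versions.
        current = self.store.get(key)
        if current is None or version > current[0]:
            self.store[key] = (version, value)

    def read(self, key):
        entry = self.store.get(key)
        return None if entry is None else entry[1]


def sync(a, b):
    """Anti-entropy step: exchange entries so both replicas keep the newest version."""
    for src, dst in ((a, b), (b, a)):
        for key, (version, value) in list(src.store.items()):
            dst.write(key, value, version)


r1, r2, r3 = Replica("r1"), Replica("r2"), Replica("r3")
r1.write("x", "new", version=1)   # the update lands on one replica only
stale_read = r3.read("x")         # other replicas have not seen it yet
sync(r1, r2)
sync(r2, r3)                      # the update propagates replica by replica
fresh_read = r3.read("x")
print(stale_read, fresh_read)
```

A reader may briefly see stale data, but once the sync steps have run every replica returns the same value. An ACID database would not allow the stale read in the first place, at a cost in scalability.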
Not Everyone Agrees
David J. DeWitt and Michael Stonebraker, MapReduce: A Major Step Backwards, Database Column, January 17, 2008
Part 5. Standards Efforts
Change of gauge at Ussuriisk (near Vladivostok) at the Chinese–Russian border
Train gauge in China is 1435 mm
Train gauge in Russia is 1520 mm
How can a cloud application move from one cloud storage service to another?
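One way to picture the portability question is a shared interface that different storage services implement, so that an application can move its data using only that interface. The sketch below is hypothetical throughout: `BlobStore`, the two in-memory stores and `migrate` are illustrative names and do not correspond to any real standard or provider API.

```python
from abc import ABC, abstractmethod


class BlobStore(ABC):
    """Hypothetical common interface that an interoperability standard might define."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...

    @abstractmethod
    def keys(self) -> list: ...


class InMemoryStoreA(BlobStore):
    """Stand-in for one provider; keeps blobs in a dict."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, data):
        self._blobs[key] = data

    def get(self, key):
        return self._blobs[key]

    def keys(self):
        return sorted(self._blobs)


class InMemoryStoreB(BlobStore):
    """Stand-in for a second provider with a different internal layout."""

    def __init__(self):
        self._rows = []  # list of (key, data) pairs

    def put(self, key, data):
        self._rows = [(k, d) for k, d in self._rows if k != key] + [(key, data)]

    def get(self, key):
        for k, d in self._rows:
            if k == key:
                return d
        raise KeyError(key)

    def keys(self):
        return sorted(k for k, _ in self._rows)


def migrate(source: BlobStore, target: BlobStore) -> int:
    """Copy every blob using only the shared interface; returns the number copied."""
    copied = 0
    for key in source.keys():
        target.put(key, source.get(key))
        copied += 1
    return copied


old, new = InMemoryStoreA(), InMemoryStoreB()
old.put("report.csv", b"a,b\n1,2\n")
old.put("logo.png", b"\x89PNG")
copied = migrate(old, new)
print(copied, new.keys())
```

Like a common track gauge, an agreed interface lets the application cross from one service to another without changing its own code.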
Standards Efforts for Clouds
Distributed Management Task Force (DMTF)
Storage Networking Industry Association (SNIA)
Cloud Computing Interoperability Forum (CCIF)
Open Cloud Consortium (OCC)
Open Grid Forum (OGF)
Plus several others…