Including EMC Proven™ Professional Certification
Leveraging Cloud Computing For Optimized Storage Management
EMC Proven Professional Knowledge Sharing 2009
Mohammed Hashim
Escalations & Training Manager
Global Technical Support
PSE Lab, Bangalore
Wipro Technologies
[email protected]

Rejaneesh Sasidharan
Technical Lead-SME
Global Technical Support
PSE Lab, Bangalore
Wipro Technologies
[email protected]
Information Storage & Management (EMCPA)
2009 EMC Proven Professional Knowledge Sharing 1
Table of Contents
Leveraging Cloud Computing For Optimized Storage Management
Introduction
Cloud Computing and Data Storage
Industry Relevance and Article Overview
SOA
SaaS
Distributed System
Grid Computing
Applying Cloud Computing to Storage
Cloud Computing Status in the Global Market
Cloud in Action
Market Profile and Market Size of Cloud Computing
Various Models of Cloud Computing
Customer Adoption
Drivers for Adoption and Industrial Outlook
Major Cloud Service Providers
Prominent Players
Evolution of Cloud based Storage
Need for Cloud based Storage
  Comparison Chart of Major Cloud based Services
Implementing a Cloud Computing Solution
Optimizing a Cloud Storage Solution
Managing the Cloud Solution
Managing Enterprise 2.0 and SLAs
Enterprise SLAs and Cloud Computing
Security in the Cloud
Background Analysis
Securing the Cloud Solution
Future of the Cloud
Conclusion: Cloud Vision and Strategy
Appendix A: Technical References
  Bibliography
  Websites
Appendix B: Cloud Taxonomy
Cloud Technology Landscape
Appendix-C SaaS, Cloud and Web2.0
Biography
Disclaimer: The views, processes, or methodologies published in this compilation are those of the authors. They do not necessarily reflect EMC Corporation’s views, processes, or methodologies.
Introduction
Cloud Computing spreads IT computing resources across internet cloud boundaries, where they
are selectively accessed through service providers. Generally, users pay for computing capacity
on demand and need not concern themselves with the underlying technologies or the challenges
involved in achieving scalable and diverse storage, server, and other resource capacity and
extensibility.
Applications of the Cloud Computing model are expanding rapidly as connectivity costs fall
and computing hardware operates more efficiently at scale. The cloud’s services
have expanded beyond web applications to include data storage, raw computing, and access
to various specialized services. This is due both to the increase in governments’ economic
incentives for multiple users to share common resources, and to technological advancements
that have improved collective hardware and software performance, removing limits that had
earlier delayed distributed computing solutions. The cloud is becoming a popular solution to
the problem of horizontal scalability.
Cloud Computing and Data Storage
Cloud-based storage has evolved from continuing attempts to decouple storage from
applications so that each resource can be optimally scaled, utilized and managed. Cloud
storage is a model of networked data storage where data resides on multiple virtual servers,
generally hosted by third parties, rather than on dedicated servers. Hosting companies operate
large data centers; users who require data hosting buy or lease storage capacity. The data
center operators, in the background, virtualize the resources according to the customer’s
requirements and expose them as virtual servers that the customers can manage. Physically,
the resource may span multiple servers, data centers, or even continents.
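As a rough illustration of how one virtual resource can span several physical servers, the sketch below maps the chunks of a single virtual volume onto a pool of storage nodes. The node names, the chunk size, and the round-robin placement rule are assumptions made for this example only; actual providers use their own placement logic.

```python
from __future__ import annotations

from dataclasses import dataclass

CHUNK_SIZE_MB = 64  # assumed chunk size, chosen only for illustration


@dataclass(frozen=True)
class Placement:
    chunk_index: int
    node: str


def place_volume(volume_size_mb: int, nodes: list[str]) -> list[Placement]:
    """Spread the chunks of one virtual volume across physical nodes (round-robin)."""
    if volume_size_mb <= 0 or not nodes:
        raise ValueError("need a positive volume size and at least one node")
    chunk_count = -(-volume_size_mb // CHUNK_SIZE_MB)  # ceiling division
    return [Placement(i, nodes[i % len(nodes)]) for i in range(chunk_count)]


# Hypothetical node names spread over two data centers.
nodes = ["dc1-node-a", "dc1-node-b", "dc2-node-a"]
layout = place_volume(300, nodes)
for placement in layout:
    print(placement.chunk_index, placement.node)
```

The customer sees one 300 MB volume, while its chunks physically reside on every node in the pool.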
Industry Relevance and Article Overview
Cloud Optimized Storage is the new buzzword in storage networking; organizations worldwide
are rethinking how they manage their data storage. Gartner predicts that “By 2012, 80 percent
of Fortune 1000 companies will pay for some cloud computing service, and 30 percent of them
will pay for cloud computing infrastructure.” Merrill Lynch analysts predict that the cloud
market potential for business and productivity applications will be about $96Bn by 2012
(including SaaS at $30Bn).
A number of major industry players are waiting to adopt a Cloud Computing model to
maximize the use of their services and boost revenues. This article focuses on Cloud
Computing and Cloud based storage solutions and compares existing setups. It describes
features of storage optimization, leveraging the current IT infrastructure, and the advantages
and disadvantages of the model.
The discussion covers SOA, SaaS, Grid and Cloud Computing; cloud architecture and
applying Cloud Computing to storage; outlining a cloud storage solution with optimal
performance; managing the solution over existing storage infrastructure; the advantages and
risks of Cloud Computing; and a comparison of different cloud storage solutions. This article
provides insight on capabilities for service providers and data centers, and on the core
capabilities that end users should consider when evaluating cloud storage solutions. These
capabilities and benefits shed light on how cloud based storage would benefit each of them.
Engineers responsible for storage design and management will learn about the various
elements of a Cloud Computing business model. The ability to increase capacity or add
capability without investing in new infrastructure, training new personnel, or licensing new
software is just one of the many benefits. This model comprises any subscribed service
that extends existing IT capabilities in real time over the Internet cloud.
SOA
Service Oriented Architecture was an initiative leading to virtualization. In Service Orientation,
we virtualized resources; everything was differentiated as a service and billed accordingly. It is
a conceptual business architecture where business functionality or application logic is made
available to SOA users or consumers as shared, reusable services on an IT network.
An enterprise level reference SOA architecture establishes guidelines for defining the
architecture of any SOA based project.
1. SOA Roadmap: This defines the milestones on an SOA adoption journey. The
milestones are defined for maturing the SOA infrastructure and for rolling out business
applications using SOA principles. In this stage, an SOA practitioner defines major
activities and timelines.
2. SOA Infrastructure: After defining the roadmap, assess the customer’s IT infrastructure
for SOA readiness. This stage defines the target SOA infrastructure including
hardware, software tools, packages, and alliances. Making changes to the infrastructure
involves the following sub-stages: Plan, Define, Design, Construct, Test, and Deploy.
3. Services, Composite Applications: This stage of SOA adoption deals with design and
programming aspects of service, and process and application realization. Activities
include creating a project plan, developing a risk assessment and mitigation plan,
formulating QA strategy and guidelines, defining SOA design methodologies, creating
a test plan, and testing report generation. Sub-stages include Plan, Define, Design,
Construct, Test and Deploy.
4. Operations & Maintenance: This stage of the journey observes SOA in production and
measures its value. Metrics collected during this stage guide the next set of SOA
activities including infrastructure augmentation, service portfolio enhancements,
changes to SOA runtime governance etc.
5. Change Management: This stage lays down the formal process for making changes
to the SOA system. Change management defines how changes are brought in for
SOA Roadmap definition, infrastructure, and service portfolio. These are guided by
policies defined by SOA Runtime Governance.
6. SOA Runtime Governance: This defines the policies establishing behavioral rules and
guidelines. Policies are specific and cover business, organizational, compliance,
security, and technology facets of services operating within SOA.
SaaS
The “Software as a Service” delivery model is increasingly popular as there is no hardware or
software to manage and the service is delivered through a browser. The prime factors in this
model:
i. Pay per use
ii. Instant Scalability
iii. Security and Reliability
iv. APIs
Advantages include lower cost of ownership, reduced responsibility for infrastructure
management, bandwidth for unexpected resource loads, and faster application rollout. CRM,
Financial Planning, Human Resources and Word processing are the common implementation
areas. Risks include security, downtime, access, dependency, and interoperability.
Distributed System
A distributed system executes tasks (orders) via its components’ cooperation. In operation, it
has only limited knowledge of its components’ current status since:
- the time needed to determine the status of a component is longer than the duration of the
  status (very often a consequence of the spatial distribution of the system)
- status hiding
Components typically operate asynchronously and communicate via messages. Protocols are
the specifications for syntax, semantics, and dialog structure of the communication. The
Internet, the WWW, grids, and chips are common examples of distributed systems.
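The asynchronous, message-based cooperation described above can be sketched with two in-process queues standing in for a network. This is a minimal illustration under stated assumptions: the message tuple layout, the doubling "work", and the `None` shutdown sentinel are invented for the example and do not represent any particular protocol.

```python
import queue
import threading
import time


def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """A component that processes orders and reports back only via messages."""
    while True:
        order = inbox.get()
        if order is None:  # assumed sentinel: shut the component down
            outbox.put(("status", "stopped", time.monotonic()))
            return
        outbox.put(("result", order * 2, time.monotonic()))


inbox, outbox = queue.Queue(), queue.Queue()
component = threading.Thread(target=worker, args=(inbox, outbox))
component.start()

# The coordinator sends orders without waiting for each reply.
for order in (1, 2, 3):
    inbox.put(order)
inbox.put(None)
component.join()

messages = []
while not outbox.empty():
    messages.append(outbox.get())

results = [payload for kind, payload, _ in messages if kind == "result"]
print(results)
```

Each message carries the time at which it was sent. By the time the coordinator reads it, the sender's actual state may already have changed, which is the limited-knowledge property noted above.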
Grid Computing
This is a form of distributed computing whereby a "super and virtual computer" is composed of
a cluster of networked, loosely-coupled computers acting in concert to perform large tasks. It’s
a distributed system for a community of users that provides:
- Resource sharing (computers, storage, I/O equipment, network, other equipment such as
  scientific instruments, data, software)
- Support for user collaboration
- Support for virtual organizations
Resources, users and virtual organizations are dynamic, i.e. during operation they may
emerge, change, or vanish. Mostly, grids:
- Use heterogeneous resources
- Are geographically distributed
- In science, combine resources and users from different management domains and
  institutions, whereas in business they are restricted to a single institution/company
  (for reasons of security and liability of service)
- Approach SOA
Applying Cloud Computing to Storage
Cloud Computing is a model in which data and applications are hosted and managed remotely
and provided as a service over the internet cloud. It provisions IT capabilities on-demand
versus the traditional procure and provision model. Service utilization is calculated based on a
consumption model where consumers pay only for the operational units consumed.
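As a minimal sketch of the consumption model, the snippet below totals a monthly charge from metered usage. The unit names and prices are hypothetical placeholders, not any provider's actual rate card.

```python
from __future__ import annotations

# Hypothetical unit prices in USD; real providers publish their own rates.
RATES = {
    "storage_gb_month": 0.15,
    "transfer_gb": 0.10,
    "requests_1k": 0.01,
}


def monthly_charge(usage: dict[str, float]) -> float:
    """Charge only for the operational units actually consumed."""
    unknown = set(usage) - set(RATES)
    if unknown:
        raise ValueError(f"no rate defined for: {sorted(unknown)}")
    return round(sum(RATES[unit] * amount for unit, amount in usage.items()), 2)


bill = monthly_charge(
    {"storage_gb_month": 200, "transfer_gb": 50, "requests_1k": 120}
)
print(bill)
```

A consumer with no usage in a month pays nothing, which is the contrast with the traditional procure-and-provision model.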
Cloud Computing segments the IT Services market as Infrastructure, Platform and Application
services – Infrastructure as a service (IaaS), Platform as a service (PaaS) and Software as a
service (SaaS). Although there is a general industry consensus towards SaaS as a subset of
the Cloud Computing paradigm, they are two entities functioning in composition. In SaaS, we
deal with complete business applications made available for consumption as service whereas
IT capabilities (including application platform and services) made available for consumption as
service will be a part of Cloud Computing. In that sense, cloud services could be used to build
SaaS applications if they were offered for consumption in a service model.
Cloud Computing Status in the Global Market
Cloud Computing symbolizes a trend toward a commoditization and utility mindset in the
industry and society as a whole. It may be signaling the emergence of a paradigm of IT
capabilities as utility services provided and consumed on a need rather than an ownership
basis. The following trends and events may be key contributors.
- New players like Amazon and Google are causing significant disruption by using
  commodity hardware to provide Cloud services.
- Infrastructure virtualization and management tools are quickly maturing as viable and
  reliable cloud platforms. The availability of technology and business environmental
  factors is prompting enterprises to consider optimizing their computing resources.
- Some enterprises are unable to expand existing data centers due to space and energy
  constraints and government regulation. Expansion is not possible even if there is
  money and willingness.
- Virtualization and other dynamic schemes are reducing hardware sales, pushing
  hardware vendors to pursue service models. Sun, HP, and IBM started pay-per-use
  hosted models as an alternative revenue channel as they failed to achieve expected
  growth from traditional channels. The model aligns with how small and medium companies
  want to procure resources. Cloud Computing provides them with a model of
  incremental growth with an inherent elasticity for shrinkage. It can also meet enterprise
  demand for transient compute capacity/scaling requirements.
Cloud in Action
The New York Times used 100 Amazon EC2 instances and a Hadoop application to process
4TB of raw image TIFF data (stored in S3) into 1.1 million finished PDFs in the space of 24
hours at a computation cost of just $240 [10], an illustration of economy of scale. (Hadoop is
the Apache implementation of Google technology for large-scale data processing.)
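The figures quoted for this job can be turned into unit costs with a few lines of arithmetic. The sketch below uses only the numbers given above and assumes, for simplicity, that all 100 instances ran for the full 24 hours; the derived rates are therefore indicative rather than exact.

```python
# Figures quoted in the text.
instances = 100
hours = 24
total_cost_usd = 240.0
pdfs_produced = 1_100_000
raw_data_tb = 4

# Assumption: every instance was billed for the whole 24-hour window.
instance_hours = instances * hours
rate_per_instance_hour = total_cost_usd / instance_hours
cost_per_thousand_pdfs = total_cost_usd / (pdfs_produced / 1000)
cost_per_tb = total_cost_usd / raw_data_tb

print(f"instance-hours used:      {instance_hours}")
print(f"implied USD per inst-hr:  {rate_per_instance_hour:.2f}")
print(f"USD per 1,000 PDFs:       {cost_per_thousand_pdfs:.3f}")
print(f"USD per TB of raw input:  {cost_per_tb:.2f}")
```

Under that assumption the job consumed 2,400 instance-hours, so the implied price is about ten cents per instance-hour and roughly 22 cents per thousand PDFs.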
In many circles, Sawzall is considered the key building block for much of Google’s data
analysis. http://labs.google.com/papers/sawzall.html
… Sawzall has become one of the most widely used programming languages at Google.
… [O]n one dedicated Workqueue cluster with 1500 Xeon CPUs, there were 32,580
Sawzall jobs launched, using an average of 220 machines each. While running those jobs,
18,636 failures occurred (application failure, network outage, system crash, etc.) that
triggered rerunning some portion of the job. The jobs read a total of 3.2 x 10^15 bytes of data
(2.8PB) and wrote 9.9 x 10^12 bytes (9.3TB).
Other similar languages: Yahoo’s Pig Latin and Pig; Microsoft’s Dryad
Cloned in open source: Hadoop, http://hadoop.apache.org/core/
Market Profile and Market Size of Cloud Computing
From the demand perspective, the market needs various models palatable to buyers based on
their appetite for cost, control and risk. The spectrum ranges from models with full control in
terms of hardware choice and root access, to complete black box dynamism. The age-old
managed hosting model is at the lowest band, repackaged with a higher degree of automation
and optimization from virtualization and standardization in various layers of the IT stack.
Potential cloud contenders come from various sections of the industry (e.g., telecom). The
vendor market can be characterized by the following classes, in the sense that the offerings
address requirements in different layers of the IT stack. It appears likely that a
future cloud application will be built with services from more than one vendor. In our opinion,
this is the ideal end goal for Cloud Computing.
1. Application Platform: This is classified under cloud, as opposed to SaaS, because it is
closer to infrastructure than to business software. Application platform providers
offer software development platforms (Platform as a Service) for application
development. The resulting application could be either a fully consumable business
application (in which case it could be called SaaS) or just a cloud service. Usually
the application is also hosted and run by the same provider. This may not be the case
going forward due to increasing standardization of virtualization and application
definition (viz. XAML). Providers include Force.com, NetSuite, Bungee Labs, Coghead
(developed applications run on Amazon) etc.
2. Software Service: These services are standalone business/software services for
consumption by cloud applications/services inside/outside the provider. Unlike SaaS,
these are not fully developed applications with enough value on their own; they must
be combined with other services. Examples include Google Maps, Yahoo Pipes, PayPal,
Amazon Book Web Services, StrikeIron (data as a service) etc. This is a generic
bucket covering all possible software services required to make applications/services
operational (integration, security, IM etc.). Note: At times, these are also classified as
SaaS, but the majority considers them SaaS only when the applications are
consumable for business functions rather than as components of a function.
3. Storage Service: This covers storage services on the internet. These evolved from
the first generation of storage services offered on the internet during the 1990s.
4. Computing & Storage Infrastructure: This covers compute infrastructure
services on the internet. It usually includes other services such as networking and
management services to make the infrastructure useful. Examples include Amazon
WS, Enomaly, CohesiveFT, GoGrid, Joyent etc.
An estimate of the market size:
- A Merrill Lynch study projects Cloud Computing business potential at 95Bn (ad revenue
  65Bn); this includes SaaS applications. As per the McKinsey projection, SaaS
  is around 30Bn. That leaves the pure cloud market at 65Bn by 2012.
- Heuristics based: As a rough estimate, the installed base of commodity servers in
  the world is around 30Mn; assume half (15Mn) of the servers migrate to the Cloud
  (the other half being split between SaaS, sunset/rationalization and in-house data
  centers). With utilization at 80%, the cloud will need about 4Mn servers (16Mn when
  normalized for a typical multi-core/CPU enterprise server vs. a leaner cloud server).
  Amazon makes about 1K per server per year, bringing the figure to 4Bn. With CPU
  normalization, it can go up to 16Bn (typical two-CPU quad-core enterprise servers).
  The other cost components are storage and bandwidth charges totaling about 30-35Bn
  (storage is much more expensive). The remaining 30-35Bn could come from professional
  services and cloud software services like Gmail, Maps, PayPal, StrikeIron etc.
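The compute portion of the heuristic estimate can be reproduced step by step. The sketch below follows the figures in the text; the 20% starting utilization is our assumption (taken from the upper end of the 10-20% data center utilization range cited in this article) to show one way of arriving at roughly 4Mn servers, and the factor of four for CPU normalization is inferred from the 4Bn to 16Bn step.

```python
installed_servers = 30_000_000          # rough installed base, from the text
migrating = installed_servers // 2      # half move to the cloud: 15Mn

# Assumption: servers move from about 20% utilization to 80% in the cloud.
in_house_utilization = 0.20
cloud_utilization = 0.80
cloud_servers = migrating * in_house_utilization / cloud_utilization

# The text rounds this to about 4Mn servers.
cloud_servers_rounded = 4_000_000
revenue_per_server = 1_000              # "about 1K per server per year"
compute_revenue_bn = cloud_servers_rounded * revenue_per_server / 1e9

# Inferred normalization factor for two-CPU quad-core enterprise servers.
normalization_factor = 4
normalized_compute_bn = compute_revenue_bn * normalization_factor

print(f"servers migrating:        {migrating:,}")
print(f"cloud servers needed:     {cloud_servers:,.0f}")
print(f"compute revenue (Bn):     {compute_revenue_bn:.0f}")
print(f"normalized revenue (Bn):  {normalized_compute_bn:.0f}")
```

Storage, bandwidth, and services revenue are left out of the sketch because the text gives them only as ranges.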
Various Models of Cloud Computing
Dominant models of Cloud Computing from the providers’ classification perspective include:
1. IaaS (Infrastructure as a Service) - This category of providers offers IT infrastructure
capabilities such as servers, storage, networking, and other application infrastructure
like queues and databases. The primary differentiator is that these providers do not provide
platforms for building applications natively.
2. PaaS (Platform as a Service) - This category of providers offers the next level of
platforms above IaaS, enabling consumers to build cloud ready applications using
(mostly) proprietary API frameworks. These frameworks are designed and optimized
for multi-tenancy (cost) and scalability (autonomy) on demand.
3. SaaS (Software as a Service) - This category of providers offers complete business
applications or functions on a pay-per-user model. The rental model could be based on
'concurrent user count', 'per named user', 'per transaction' etc.
Note: Please remember that these providers are going to evolve over time and the various
dominant models are likely to fuse and give rise to a universal, inter-operable cloud.
| Pattern | Description | Example Services |
|---|---|---|
| Embedded Service [Access API] | The core software is owned and run by the enterprise with hooks to use services from the internet. The core software could also run on the cloud, while ownership still is with the enterprise | Google API, Flickr, eBay, Strike Iron |
| Embedding Service [Plug-in API] | The service is owned and run by the enterprise, including on the cloud. The services are then embedded into Cloud applications and platforms on the internet. It uses “Access API” to access the core of the platform. | |
| Runtime Environment | In this mode, the service code is developed using the platform and runs inside the platform. | Force.com, Ning, Bungee |
| Agnostic Hosting (Infrastructure Cloud) | This category of platforms are agnostic of the application details like API, language etc. | Amazon EC2 |
| Integrated Hosting (Platform Cloud) | This category is more aware of the computing framework used for application development, but they are still domain agnostic (from business model perspective). | Joyent, Gigaspaces |
| Verticalized Cloud (Application Cloud) | These vendors focus on a particular service domain, acting more like a domain eco-system (Verticalization of cloud). | Force.com for CRM, Webex for collaboration, RightNow for Customer Care |
| Private Cloud (different from cloud inside the enterprise) | This is an emerging trend where customers pay a premium for extra high quality, reliable service hosted on the cloud. | |
| Tools and Technologies | This is about ISV supplying tools and technologies for the public cloud and also integrating the enterprise cloud to a public cloud. These may work across all clouds or a selected partner set. | VMWare vCloud, 3Tera |
1. VMWare is betting big on cloud with its vCloud and VDC-OS (Virtual Data Center OS).
2. 3Tera has alliances with Citrix and with large customers like BT.
Customer Adoption
Global CxOs are pressured to reduce non-differentiating IT footprint with innovative solutions
because:
i. Datacenter hardware utilization is very low (10-20%). Organizational policies and
politics are barriers to platform rationalization.
ii. Applications are commonly provisioned for peak capacity.
iii. Less than 40% of the features available in package applications are put to use in
a typical enterprise.
iv. Maintenance and data center operations costs are higher than software costs.
To summarize, organizations are paying for over-provisioned and under-utilized IT
infrastructure and software that is non-differentiating and unresponsive to business change.
This is the general pattern across the industry and we presume that the Cloud will have
appeal once the industry overcomes FUD (fear, uncertainty and doubt). Enterprises
(especially large enterprises) will take advantage of virtualization and other Cloud related
technologies to boost utilization and manage their in-house infrastructure more efficiently. This
is sometimes referred to as a Private Cloud (the term also seems to be used to define part of
the public cloud isolated and dedicated exclusively for a customer). In the near term,
enterprises will focus on making their in-house infrastructure more efficient and start
leveraging cloud services to move some applications off premise.
Gartner estimates that 3% of custom and packaged IT applications are currently off premise.
By 2013, this percentage will rise to 20%. Most off premise applications will run on some sort
of cloud infrastructure. Enterprises will move simple and less-demanding applications to Cloud
Computing in the short term. In the longer term, some of the larger and mission-critical
transactions will be entrusted to cloud providers.
Email is one of the first applications to migrate to Cloud as it has a strong business case
compared to the outsourced or in house options. Organizations like Sanmina-SCI, Avago,
salesforce.com etc are already using Google enterprise email services. According to Gartner,
by 2012, 20% of enterprise e-mail seats will use a SaaS or cloud model for e-mail services.
Drivers for Adoption and Industrial Outlook
The following table illustrates both positive and negative forces at play in terms of Cloud
adoption in the industry.
Reference Study http://innovation.wipro.com/CTO/wiki
A few more points follow with respect to use cases from an infrastructure perspective.
- The SMB market is adopting cloud infrastructure with enthusiasm. For these businesses, the
  proposition of having access to incremental infrastructure on a rental basis comes close
  to utopia.
- Large enterprises are using cloud infrastructure for burst computing purposes
  (businesses with seasonality). Burst computing is a scenario where an enterprise
  needs enormous computing and storage power for a brief span of time. Large scale
  application performance testing is a typical case. A variation of this model is a proposal
  for enterprise IT to use the cloud to process spill-over requirements in any traditional
  application setup: capacity is provisioned for the average load, and
  anything above the predetermined threshold is forked to the cloud.
- Large enterprises will also look at the underlying technologies to create in-house
  clouds to optimize their IT infrastructure. Some enterprises will also be open to the
  idea of moving their applications to a private cloud managed by an outsourcer.
- Where typical internal sourcing models are too sluggish or cost-inefficient, enterprises
  are pushed toward the use of a cloud infrastructure.
- Cloud services can potentially be embedded inside enterprise applications to enrich
  them with the power of the internet; the most popular include Google Maps, StrikeIron,
  Amazon Book WS, eBay WS, and the Google Data API.
- ISVs will use many Cloud services to offer their software in a SaaS model. Gartner has
  proposed a planning assumption that through 2013, more than 70% of platform as a
  service based business applications will be developed by ISVs (not enterprise IT
  departments).
- From a cloud service perspective, we are all familiar with mashups or pure services on
  the internet (Gmail, Hotmail, PayPal, Google Maps, iGoogle etc.)
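The spill-over variation of burst computing mentioned in the list above can be sketched as a simple split of each period's demand at a fixed in-house threshold. The load figures, the threshold, and the unit of "capacity" are illustrative assumptions, not measurements.

```python
from __future__ import annotations


def split_load(demand: float, threshold: float) -> tuple[float, float]:
    """Return (in_house, cloud) shares of the demand for one period."""
    if demand < 0 or threshold < 0:
        raise ValueError("demand and threshold must be non-negative")
    in_house = min(demand, threshold)
    return in_house, demand - in_house


# Hypothetical hourly demand in abstract capacity units.
hourly_demand = [40, 55, 120, 300, 90]
threshold = 100  # in-house capacity provisioned for the average load

plan = [split_load(demand, threshold) for demand in hourly_demand]
cloud_units = sum(cloud for _, cloud in plan)

for demand, (in_house, cloud) in zip(hourly_demand, plan):
    print(f"demand={demand:>3}  in-house={in_house:>3}  cloud={cloud:>3}")
print("total units forked to the cloud:", cloud_units)
```

Only the two peak hours send work to the cloud, so the enterprise pays for external capacity only while demand exceeds what it has provisioned.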
Numerous firms have developed complex applications that demonstrate the potential power of
the cloud. Cloud Computing and storage have become ideal platforms to develop
sophisticated, economical and flexible services. Cloud-based technology is here to stay, will
rapidly become pervasive, and will change the way we do business. Cloud-based storage has
evolved from continuing attempts to de-couple storage from applications so that each
resource can be optimally scaled and managed.
Major Cloud Service Providers
The value proposition of Cloud Computing could be offered by many existing players by
extending their skills in areas related to Cloud Computing.
1. Major telecom vendors are adapting their business models to provide infrastructure
services to enterprises, leveraging their skill in handling massive amounts of
infrastructure equipment to offset the loss of revenue due to the emergence of VoIP.
2. ISPs, hosting companies, and large scale Web platform players like Amazon, Google
and Yahoo are leveraging their skills in handling large scale data centers.
3. Hardware vendors are looking to leverage the cloud paradigm to host their hardware in
a rental model, as with Network.com from Sun and EC2 from Amazon.
4. Traditional consumer/enterprise service providers (viz. DHL, ADP) can participate in
the cloud model by also hosting the IT applications and processes.
5. Traditional software giants like Microsoft, SAP, and Oracle are taking preemptive
measures to retain their customer bases through cloud (SaaS) counterparts of their
traditional software offerings. Also, traditional IT services providers can participate as
aggregators in the cloud.
Services opportunities in Cloud Computing are in distinct areas:
1. Developing software tools and utilities to build, package, deploy, and manage
applications in the cloud
2. Consulting and implementation services around migration to public and private cloud
Now, our discussion moves to understanding the existing and evolving market offerings in
Cloud Computing. Web players are leading in cloud services and cloud-based applications.
Amazon – The company has moved to monetize its global scale eCommerce platform to offer
computing resources on demand (EC2, S3) along with other cloud based web services.
Amazon was one of the first vendors to bring broader Cloud services to market.
Google – Google offers its search, advertising, email and office suite as cloud based
applications. Google’s cloud offering leverages its massive data center investments. The
company also has a platform as a service offering called Google App Engine.
Salesforce.com – Salesforce.com has moved to monetize the infrastructure it built for its SaaS
CRM application. Their PaaS offering, force.com, allows developers to build multi-tenant SaaS
applications. The company has successfully built an ecosystem of third party applications on
its platform (700 applications from 350 vendors).
Prominent Players
The prominent players include service providers like Amazon, AppNexus, eBay, Google,
GoGrid, Salesforce and Yahoo; as well as traditional vendors including IBM, Intel, Microsoft
and Nirvanix. Adopters range from individual users to large enterprises including General
Electric, L'Oréal, Procter & Gamble and Valeo. Vendors such as Caringo, EMC, Ibrix, Xiotech,
and others are racing to provide a storage services layer or APIs that will underpin the next
generation of cloud-based storage. These vendors are developing cloud-based storage
infrastructures that will go well beyond the limitations of first generation products that tried to
provide basic FTP and WebDAV-type access.
Next-generation solutions will provide services on top of true Web services architectures.
Moreover, these next-generation solutions will provide secure partitioning, data organization,
and advanced user management services. Major IT vendors like Microsoft, IBM and HP are
bringing their cloud products and services to market. Oracle is also expected to bring an
application platform related offering based on BEA products to market.
Evolution of Cloud based Storage
Figure 1 (Reference Taneja Group Research)
We must analyze the evolution of Cloud based storage from the perspective of constantly
changing user demands with regard to the scalability and performance of data storage and
applications. In a Web-centric world, large service providers host storage and
computing, and customers buy them on a pay-per-use basis, which makes the IT infrastructure
elastic and cost-optimized.
According to Wikipedia, the Cloud is a metaphor for the Internet as commonly depicted in
network diagrams as a cloud outline. The underlying concept dates back to 1960 when John
McCarthy opined that "computation may someday be organized as a public utility" and the
term cloud was already in commercial use in the early 1990s to refer to large ATM networks.
By the turn of the 21st century, Cloud Computing solutions had started to appear on the
market primarily focused on Software as a service (SaaS). Amazon.com played a key role in
the development of Cloud Computing by modernizing their data centers after the dot-com
bubble and providing access to their systems by way of Amazon Web Services in 2002 on a
utility computing basis.
In 2007, Google, IBM, Salesforce, Yahoo and a number of universities began large scale
Cloud Computing research projects and application development. By mid 2008, Cloud
Computing had gained publicity thanks to the press and global technical conferences.
In August 2008, Gartner observed that "organizations are switching from company-owned
hardware and software assets to per-use service-based models" and that the "projected shift
to Cloud Computing will result in dramatic growth in IT products in some areas and in
significant reductions in other areas."
Need for Cloud based Storage
One of the major business needs for cloud based storage is the need for a service-based
online economy where resources and services are transparently provisioned and managed
in real time.
Other reasons include:
dramatic growth in interconnected devices
increase in real-time data streams
increased industrial adoption of service oriented architectures
Web 2.0 Applications & SaaS/PaaS/IaaS
changes in multi-vendor collaboration across geographies
globalization
massive social networking and mobile commerce
inexpensive and more efficient means of connecting to Internet and its immense user
penetration
improvements in virtualization
Grid Technologies
change in rationale of demand/usage based cost effective utilization of resources
tremendous increase in the scale of IT environments
Comparison Chart of Major Cloud based Services
Reference Hysea Cloud Computing Workshop- “Migration to Cloud Computing: An Academic
Perspective.”
Illustration on following page.
Implementing a Cloud Computing Solution
Cloud Computing is often confused with:
grid computing (a form of distributed computing whereby a "super and virtual
computer" is composed of a cluster of networked, loosely-coupled computers, acting
in concert to perform very large tasks)
utility computing (the packaging of computing resources, such as computation and
storage, as a metered service similar to a traditional public utility such as electricity)
autonomic computing (computer systems capable of self-management)
Today, many Cloud Computing deployments are powered by grids, have autonomic
characteristics and are billed like utilities, but Cloud Computing can be seen as a natural next
step from the grid-utility model.
Some successful cloud architectures have little or no centralized infrastructure or billing
systems including peer to peer networks like BitTorrent and Skype, and volunteer computing
like Wikipedia. Today, developers can create a cloud application on a number of cloud
platform technologies. To understand cloud platforms, let’s start by looking at cloud services in
general. As Figure 2 shows, cloud services can be grouped into three broad categories:
Figure 2 (Reference Taneja Group Research)
1. Software as a service (SaaS): An SaaS application runs entirely in the cloud (that is, on
servers at an Internet-accessible service provider). The on-premises client is typically a
browser or some other simple client. Salesforce.com is the most well-known example of a
SaaS application but many others are also available.
2. Attached services: Every on-premises application provides useful functions on its own. An
application can sometimes enhance these by accessing application-specific services provided
in the cloud. Because these services are usable only by this particular application, they can be
thought of as attached. Apple’s iTunes is one popular consumer example. The desktop
application plays music and more, while an attached service allows buying new audio and
video content. Microsoft’s Exchange Hosted Services provides an enterprise example, adding
cloud-based spam filtering, archiving, and other services to an on-premises Exchange server.
3. Cloud platforms: A cloud platform provides cloud-based services to create applications.
Rather than building their own custom foundation, for example, the creators of a new SaaS
application could build on a cloud platform.
Whether it’s on-premises or in the cloud, an application platform has three parts:
1. A foundation: Nearly every application uses some platform software on the machine it runs
on. This typically includes various support functions, such as standard libraries and storage,
and a base operating system.
2. A group of infrastructure services: In a modern distributed environment, applications
frequently use basic services provided on other computers. It’s common to provide remote
storage, for example, integration services, an identity service, and more.
3. A set of application services: As more and more applications become service-oriented,
the functions they offer become accessible to new applications. Even though these
applications exist primarily to provide services to end users, they are also part of the
application platform. (It might seem odd to think of other applications as part of the platform,
but in a service-oriented world, they certainly are.)
Developers build cloud applications using the three parts of an application platform. A
framework used for classifying and characterizing the various cloud technology vendors in the
market follows. As per our definition, Cloud Computing offers developing and/or
deploying/managing IT services on the public internet platform. It may be unnecessary to develop
cloud software on the cloud. For example, you can use a local development platform to
develop but a public cloud to deploy and manage. It is imperative that we understand the
three pillars involved in software development:
1. Development Environment and API: We need a development environment that
understands the underlying language API to develop software.
2. Runtime Platform: Next, we need a platform to execute the software developed.
3. Management: Once deployed, the software needs to be monitored and administered
from time to time.
Optimizing a Cloud Storage Solution
The prominent components of optimizing Cloud Solutions and their utilization analysis over
existing IT are as follows:
Node Priority:
Issue: Some nodes are more performance critical than others
Solution: Boost spending on critical nodes (e.g. master funding boost)
Workflow Priority:
Issue: Some workflows are more performance critical than others (although they look
the same to the system)
Solution: Declare relative priority of workflows and split budget accordingly
Job Priority:
Issue: Some stages of a workflow are more I/O intensive, others more CPU intensive
Solution: Boost resource spending during resource-intense stages of workflow
Bottleneck Mitigation:
Issue: Some nodes may be bottlenecks during map/reduce synch
Solution: Redistribute funds to active bottlenecks
Best Response:
Issue: Optimal configuration/allocation might change when other users place
competing bids
Solution: Find game theoretical best response bids continuously to maximize utility
Risk:
Issue: Some users are more risk averse than others (can tolerate fewer fluctuations)
Solution: Bid on nodes based on predicted guarantee to deliver a QoS level
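The budget-splitting and bidding ideas above can be illustrated with a small sketch. It assumes a proportional-share market, in which each bidder receives a fraction of a node equal to its bid divided by the sum of all bids; the text does not name the allocation mechanism, so this rule, the function names, and all figures are hypothetical.

```python
def split_budget(total_budget, priorities):
    """Split a budget across workflows in proportion to declared priority."""
    total_priority = sum(priorities.values())
    if total_priority <= 0:
        raise ValueError("priorities must sum to a positive value")
    return {
        name: total_budget * weight / total_priority
        for name, weight in priorities.items()
    }


def proportional_shares(bids):
    """Fraction of one node each bidder receives under proportional sharing
    (an assumed allocation rule, not one stated in the text)."""
    total_bid = sum(bids.values())
    if total_bid <= 0:
        raise ValueError("bids must sum to a positive value")
    return {bidder: bid / total_bid for bidder, bid in bids.items()}


# Hypothetical workflows: "billing" is declared three times as important.
budgets = split_budget(100.0, {"billing": 3, "reporting": 1})

# The billing workflow bids its whole budget on a node on which a
# competing user has bid 25.0.
shares = proportional_shares({"billing": budgets["billing"], "competitor": 25.0})
print(budgets)
print(shares)
```

Under this rule, a competing bid lowers the share obtained for the same spend, which is why the Best Response item calls for bids to be recomputed continuously.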
Managing the Cloud Solution
Workflow management matters because many of the benefits of Cloud Computing come from
the speed and ease with which IT resources can be created and put into production.
Illustration follows on the next page.
Includes clear policies on
i. who to admit
ii. how to arbitrate among competing requests
iii. what resource capacity may be requested over what time frames
Isolated Data centre: Reset, reboot, power up, power down, get status
Bias towards large and short experiments
Site coordination required, e.g. accounting
According to the Credit Suisse analysis “Managing the Impact of SOA and Enterprise 2.0 on Financial
Services IT”, broad adoption of shared services is clearly a good thing, so what’s holding it
back? Line-of-Business groups will no longer own the whole front-to-back process, so we
need to solve several difficult problems to ensure they can still deliver to their customers:
Technical issues: Security, Architectural Governance, Development Lifecycle, etc.
Non-technical issues: Culture, incentives for cross-silo cooperation, risk management
Managing Enterprise 2.0 and SLAs
Service Level Agreements: SLAs are essential to managing business-critical functions
– Without SLAs, how can you know if the SOA will meet business needs?
SLAs aren’t mentioned in most SOA discussions
Existing systems tend to have implicit SLAs between components
– Easier to manage the end-to-end SLA when you stay inside the silo
Problems arise when refactoring systems as services
– Implicit SLAs in existing applications need to be identified and made explicit
Tight coupling gives way to loose coupling between client, workflow,
and services
Mashups and workflow magnify the problem
– Can no longer make assumptions about how a service is consumed, or by
whom
In effect, each end user desktop can drive a unique application
Enterprise SLAs and Cloud Computing
Use SLAs to model the behaviour of each level in the SOA
At each boundary in the SOA, we should have an SLA
– Top-level SLA will be expressed in business terms
transaction throughput, availability, number of concurrent users, etc.
– Underlying services will have SLAs stated in more technical terms
message throughput / latency, number of concurrent connections, etc.
– Model client behaviour too
One size does not fit all
– Different users will require different SLAs (at different price points!)
A lot of effort going into building clouds of utility computing power
– On demand computing, next-generation service fabrics, etc.
SLAs and Policies become ever-more important the closer you get to utility computing
– basis for pricing models; chargeback
– SLAs give confidence that cloud model is manageable
– SLAs can be mapped onto infrastructure and support tiers
e.g., automatically deploy services onto appropriate h/w based on SLA
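The last point, mapping an SLA onto infrastructure tiers, can be sketched as follows. The tier names, availability figures, and latency figures are hypothetical and chosen only for illustration; a real provider would publish its own tiers and guarantees.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SLA:
    availability_pct: float  # required availability, e.g. 99.9
    max_latency_ms: float    # required worst-case latency


# Hypothetical infrastructure tiers, ordered cheapest first:
# (name, guaranteed availability in percent, guaranteed latency in ms)
TIERS = [
    ("bronze", 99.0, 500.0),
    ("silver", 99.9, 100.0),
    ("gold", 99.99, 20.0),
]


def choose_tier(sla):
    """Return the cheapest tier whose guarantees satisfy the SLA."""
    for name, availability, latency in TIERS:
        if availability >= sla.availability_pct and latency <= sla.max_latency_ms:
            return name
    raise ValueError("no tier satisfies this SLA")


print(choose_tier(SLA(availability_pct=99.9, max_latency_ms=200.0)))
```

Because the tiers are scanned from cheapest to most expensive, different users with different SLAs land on different price points, as the list above suggests.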
The bottom line is that senior management won’t outsource to the cloud unless they are sure
to achieve a return on their investment.
Security in the Cloud
Security is a vital concern when designing and realizing Cloud Computing as an abstraction for a
complex on-demand scalable computation grid that is accessible to users through web-enabled
devices. Customer data and programs reside on provider premises, and security is
always a major concern in Open System Architectures. An optimal security application for
Cloud Computing systems should be compact and platform independent, and should focus
on protecting users’ sensitive and private data.
The utilization of Cloud Computing systems is steadily rising as is the need for a practical
security application. Cloud Computing typically stores a client’s data in a location accessible
from the Internet so it is no longer stored in the client’s personal computer, but in a data center
operated by the Cloud Computing provider. The data is more susceptible to attack since it is
not completely in the user’s control. Primarily, Cloud Security will have to minimize the risk by
guaranteeing that only authorized users have access to the data. It will also have to
implement security methods to protect users’ data regardless of the Cloud Computing host, its
platform, or its weaknesses.
Security is often an afterthought in computer applications; the same is true with Cloud
Computing. Nevertheless, its implementation is critical since so much sensitive and private
data is being stored on these systems. For example, users can now create, edit and print
Word documents and spreadsheets online. Users may maintain their personal contacts stored
on the Cloud, with associated telephone numbers, addresses, and email addresses. Users
may even retain the most sensitive data such as social security numbers or bank account and
credit card numbers on the Cloud’s storage mediums.
Data is no longer stored on the creator’s computer, but on the servers of the service that
provides the web application. Often, the security of these systems is questionable. Not only
that, the security scheme is up to the Cloud Computing provider. The Security Application
must protect data even though it is stored somewhere else.
Background Analysis
As previously mentioned, Cloud Computing is an abstraction for a complex on-demand
scalable computation grid that is accessible to users through web-enabled devices. Although
the specifics of this paradigm are still being defined and revised, Cloud Computing typically
consists of some basic components (e.g. CPUs, storage mediums, network interconnects,
etc.) upon which any number of applications can be deployed. A Cloud Computing platform
incorporates some or all of these components, and each component has its own security
concerns and issues. As this technology becomes more widespread and accessible, the need
for proper security becomes more evident. As the general public (i.e. those with less technical
expertise) shifts to Cloud Computing, security issues should be at the forefront of developers’
minds. Some aspects of Cloud Computing become more secure; others become less.
On one hand, security improves due to the centralization of data. Rather than having private
data spread over a number of systems (e.g. work computer, home computer, and mobile
device), data is stored on the Cloud and accessed with the device. Moreover, security
improves with the ability to increase focus of security resources. Rather than having to secure
the operating system and applications of many different computing devices, security can be
focused on a single data center.
While some features of Cloud Computing are more secure, others are more vulnerable to
exploitation and attack. These aspects can be categorized into two groups:
General weaknesses:
1. Loss of control of data
2. Security measures are in the hands of providers
3. Denial of service attacks can make all data unavailable
4. Large infrastructure offers many points of failure
Specific weaknesses:
1. Distributed Encryption/Decryption
2. Distributed Key Generation/Distribution
3. Security certification of distributed systems.
Distributed Encryption/Decryption and Key Generation/Distribution have been the topic of
scholarly articles and some companies are already providing solutions. Current systems lack
an easy to use, homogenous application solution that encompasses all of the aforementioned
security issues. A typical Cloud Computing platform has various layers and we wish to
address the issues in the platform and applications layers.
Securing the Cloud Solution
This would encapsulate many areas of computer science. Ideally, we will adjust and adapt
existing security applications to serve as a basis for our application. Generally, security
incorporates both hardware and software aspects of computer science, specifically
cryptography, network security, and software security. Furthermore, we will incorporate
fields that are related to Cloud Computing, including networking, operating systems, and
virtualization. As we develop, we will need to incorporate principles from these related fields.
Our challenge is to apply typical cryptographic schemes to a Cloud Computing environment.
Some of the specific weaknesses are concerned with cryptography. It is important to maintain
distinctive encryption and decryption keys since much of Cloud Computing is based on
replication. Recently, Amazon faced this issue on their Cloud systems. This problem has since
been resolved, but lack of foresight in cryptography can lead to disastrous results.
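The point about distinct keys under replication can be illustrated with a minimal sketch. It assumes that each replica is given a fresh random key the first time it starts, instead of inheriting a key baked into a shared machine image; the class name and replica identifiers are hypothetical, and a real deployment would keep the keys in a protected key-management service rather than in process memory.

```python
import secrets


class ReplicaKeyStore:
    """Hands out one independent random key per replica (illustrative only)."""

    def __init__(self):
        self._keys = {}

    def key_for(self, replica_id):
        # Generate the key on first use so that replicas cloned from the
        # same image never share key material.
        if replica_id not in self._keys:
            self._keys[replica_id] = secrets.token_bytes(32)  # 256-bit key
        return self._keys[replica_id]


store = ReplicaKeyStore()
key_a = store.key_for("replica-a")
key_b = store.key_for("replica-b")
print(len(key_a), key_a != key_b)
```

The design choice here is that compromising one replica's key reveals nothing about the keys held by the other replicas.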
Our application will address the challenges presented by distributed cryptography, and focus
on encryption and key distribution. Networking is the backbone of a Cloud Computing system.
Network security has been heavily researched, but our application will be concerned with
networking as it relates to Clouds. Networking on the Cloud is unique in that there are so
many points of entry. A typical data center may host only a handful of websites and therefore
have a small number of access points. A data center focused on Cloud Computing could host
hundreds of sites, plus private data, and other applications.
Users access data using a myriad of different protocols from different locations. This calls for
a complex set of network defenses such as firewalls, intrusion detection systems, and secure
channels. While overall network security of the entire Cloud Computing system is outside the
scope of our application, it will be important to understand its functionality and implementation.
Other related fields such as virtualization and operating systems present their own security
issues. We will identify where security holes exist and incorporate solutions into our design as
we develop our application. Many environments exist in which we can develop our application,
most notably Amazon’s Elastic Compute Cloud (EC2). This system provides the features of
any typical Cloud Computing System and implements an API for usage. Other companies
provide similar systems, but development will most likely occur on the EC2 system as it is the
most accessible and offers a feature-rich API. EC2 is the system on which our work thus
far has been developed. Security should be available at the following prime levels:
Server access security
Internet access security
Database access security
Data privacy security
Program access Security
Potential Advantages
If you extend the concept of virtualization from a single server to a complete grid, and make
access available over the Internet, the result can be summarized as Cloud Computing. If
virtualizing a single server can save 50 to 70% of resources, imagine how much you could
save if a complete data center acted as a single grid and was then virtualized.
Reducing capital and operating expenditures through infrastructure pooling and
improved utilization. Customer expense is minimized, which lowers barriers to
entry. Infrastructure for one-time or infrequent intensive computing tasks does not
need to be purchased, because it is owned by the provider. The result is lower
operating costs and minimized capital expenditure.
Separating infrastructure maintenance duties from application development
Separating application code from physical resources. i.e, device and location
independence enables users to access systems regardless of their location or what
device they are using (e.g. PC, mobile). Any-time, any-place, any-device access;
Location and Device independence
Centralizing operations with ability to use external assets to handle peak loads.
Increasing administrator efficiency and scaling quickly to meet user demands.
Sharing capability among a large pool of users, improving overall utilization. This is an
alternative if departmental or central IT is non-responsive.
Increasing flexibility to shape the software for improved operational efficiency, with
high computing power and dynamic load handling.
Enhancing scalability as it facilitates easier cross-institution collaboration
Offering pay as you go options and focusing on core business; pay only for what you
need is useful when service demands fluctuate
Minimizing down times with Fault Tolerance clouds built with the presumption of
untimely component failures
Improving alignment of IT resources with institutional priorities with caching service call
results, higher utilization, and improved efficiency
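The pay-as-you-go advantage for fluctuating demand can be made concrete with a short calculation. All prices and demand figures below are invented for illustration; the sketch assumes that owned infrastructure must be sized for peak demand, while rented capacity is billed only for the units actually used in each hour.

```python
def owned_cost(peak_units, cost_per_unit_hour, hours):
    """Cost of owning enough capacity to cover peak demand for every hour."""
    return peak_units * cost_per_unit_hour * hours


def pay_per_use_cost(hourly_demand, price_per_unit_hour):
    """Cost of renting exactly the capacity used in each hour."""
    return sum(hourly_demand) * price_per_unit_hour


# Hypothetical demand in capacity units per hour, with one spike.
demand = [2, 2, 2, 10, 2, 2]

# Assumed prices: owned capacity costs 1.0 per unit-hour, rented 2.0.
owned = owned_cost(max(demand), 1.0, len(demand))
rented = pay_per_use_cost(demand, 2.0)
print(owned, rented)
```

With these assumed numbers, renting is cheaper even at twice the unit price because capacity sits idle outside the spike. If demand were flat at 10 units every hour, the same prices would favor ownership.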
Some Limitations
Loss of control (mirrors traditional centralize/decentralize debate)
Integration with enterprise authentication, single-sign-on
Integration with key enterprise applications
Accessibility and User Interface limitations of web applications
Performance and availability concerns
Policy/compliance concerns
Breach forensics and mitigation
Need to monitor application availability, not just node or VM availability
Future of the Cloud
The Cloud is evolving as each day passes. A graphical analysis from Saugatuck
Technology (2008) follows.
The Enterprise evolution to ‘cloud sourcing’ as indicated in the study – “Innovation and Profit:
How On-Demand Computing Can Change Your Business” - from Dream Force, shown on the
following page, gives another executive insight.
Figure (Source: Saugatuck Technology): adoption over time in four waves. The vertical axis
shows adoption from low to high; the horizontal axis shows years. The stages are labeled
SaaS 1.0, SaaS 2.0 and Cloud Computing.
Wave I: 2001-2006, Cost-Effective Software Delivery (Early Adoption)
• Stand-alone Apps
• Multi-tenancy
• Limited Configurability
• [...] TCO / [...]
Wave II: 2005-2010, Integrated Business Solutions (Mainstream Adoption)
• Integrated w/ Business
• SaaS Integration Platforms
• Business Marketplaces and SaaS Ecosystems
• Customization Capability
Wave III: 2008-2013, Workflow-Enabled Business Transformation (Ubiquitous Adoption)
• Optimized Business Ecosystems
• IT-Targeted Ecosystems
• SaaS Development Platforms
• Inter-enterprise Collaboration
• IT Utility / SaaS Infrastructure
Wave IV: 2011-2016, Measured, Monitored, Managed Business Processes (Post-SaaS Adoption)
• End-to-End Business Processes
• Integrated w/ Services Anywhere
• Intelligent Hubs Linking Platforms
• Mobile Device- and Sensor-Controllable
• SLAs for Composite Service Offerings
Figure: Enterprise evolution to ‘cloud sourcing’ (axis labels: Today, 1-2, 3-5)
Conclusion: Cloud Vision and Strategy
The cloud paradigm is gaining mindshare in the market (including among CxOs), and this is likely
to continue and extend to complex enterprise IT as the paradigm matures. Though cloud is not yet a
recognized enterprise IT sourcing strategy, it is deemed a viable alternative for business units,
particularly those who are frustrated with sluggish IT departments. In this sense, cloud is
emerging as an outsourcing alternative (bypassing IT); we need a relationship with the
business units to capture these opportunities. Presently, the majority of current projects are
outsourced through the IT department. The IT procurement processes will soon include an on-
demand option in addition to traditional build or buy options.
Both SaaS and Cloud Computing paradigms are touted as disruptive outsourcing models and
their effect on IT staffing is expected to be significant. The future enterprise IT staff will likely
include CIOs, architects and process experts. Architecture and design will likely remain inside
the enterprise boundary, while the rest of the construction and operations work will be
outsourced through the two models (SaaS & Cloud). Though it is not clear how and to what
extent, there is a definite change on the horizon for enterprise IT processes with proportionate
changes percolating down to SI vendors.
Enterprises will begin to explore the cloud paradigm in-house and externally for smaller non-
mission critical and less demanding applications at first. Over time, larger and mission critical
systems will be entrusted to the cloud.
There are no clear or immediate answers on the transition to the external cloud, ease of
migration, Fixed Cost Advantage vs Variable Rental Cost etc. Nevertheless, the benefits of
cloud-based computing, including scalability and lower costs, are very real. Hence, for anyone
working in application development, whether for a software vendor or an end user, the cloud is
definitely going to play an increasing role in next-generation computing and storage systems.
Appendix A: Technical References
Bibliography
1. Computing in the Clouds by A. Weiss.
2. A Short Introduction to Cloud Platforms by David Chappell.
3. An Introduction to SaaS and Cloud Computing by Ross Cooney.
4. Virtualization, Cloud Computing & TeraGrid by Kate Keahey and Marlon Pierce.
5. Computer Lab to Go: A “Cloud” Computing Implementation by Murphy and
McClelland.
6. The Grid: Blueprint for a Future Computing Infrastructure by I. Foster and C.
Kesselman.
7. Market-Oriented Cloud Computing: Vision, Hype, and Reality for Delivering IT Services
as Computing Utilities by Rajkumar Buyya, Chee Shin Yeo and Srikumar Venugopal.
8. Taneja Group Research on Cloud Computing by Jeff Boles
Websites
9. http://www.mesh.com
10. http://www.network.com
11. http://www.wikipedia.org
12. http://www.salesforce.com
13. http://aws.amazon.com/ec2
14. http://www.searchstorage.com
15. http://code.google.com/appengine/
16. http://www.datacenterknowledge.com/
17. http://www.cloudcomputing.org.il/ccd/
18. http://www.ibm.com/developerworks/websphere/zones/hipods/
19. http://www.morganstanley.com/institutional/techresearch/pdfs/TechTrends062008.pdf
20. http://www-03.ibm.com/security/products/prod_dkms.shtml
21. http://cloudsecurity.org/2008/07/14/is-your-amazon-machine-image-vulnerable-to-sshspoofing-attacks/
22. http://etherealmind.com/2008/08/21/enterprise-cloud-computing-build-your-own-cisco/
23. http://paulstamatiou.com/2008/04/05/how-to-getting-started-with-amazon-ec2
24. http://paulstamatiou.com/2008/08/21/how-to-live-the-cloud-life
25. http://justinleider.com/2008/08/20/running-your-own-hardware-vs-ec2-and-rightscale/
Appendix B: Cloud Taxonomy
Cloud Technology Landscape
A view of the cloud technology landscape, used by a popular cloud blog site in the Cloud
Computing discussion group of Google, follows.
Amazon EC2
ServePath GoGrid
Rackspace Mosso Cloud
Joyent Accelerators
AppNexus
Flexiscale
Public Cloud
ElasticHosts
Eucalyptus
Cassatt Active Response Private Cloud
Enomaly Enomalism Platform
Heroku Open Cloud Platforms
Morph Labs
Salesforce.com force.com
Google App Engine
Bungee Labs Connect
Intuit Quickbase
LongJump
Custom Cloud Platforms
[One can not run generic applications
but ones developed using the native API
set]
Coghead
Cloud Platform Tools
Rightscale
Scalr
Elastra Cloud Server
3Tera AppLogic
Fabric Mgmt
Kaavo IMOD
Oracle Coherence
IBM eXtreme Scale
GigaSpaces Data Grid Data Grids
Gemstone Gemfire
rPath
CohesiveFT
Hyperic CloudStatus Virtual Appliances
Hadoop
Amazon S3
Amazon SimpleDB
Microsoft SSDS
Rackspace Mosso CloudFS
Storage
Google BigTable
Bungee Labs Connect
Boomi
MuleSource Mule OnDemand
Amazon SQS
Microsoft BizTalk Services
OpSource Connect
SnapLogic SaaS Solution Packs
gnip
CastIron
Appirio
Skemma
Integration
Appian Anywhere
Cloud Services
OpSource Billing
Aria
eVapt
Zuora
Billing
Vindicia
Ping Identity Security
OpenID/OAuth
Data as a Service Strikeiron
Appendix C: SaaS, Cloud and Web2.0
Below are collections of diagrams from the blog
http://markusklems.wordpress.com/cloud-classification/ capturing one of the plausible
interpretations of the relationship as they exist today.