Data & Analytics
Five Reasons Enterprise Adoption Of
Spark Is Unstoppable
Mike Gualtieri, Principal Analyst
February 17, 2016 New York
1. Customer experience is a top priority for enterprises.
2015 Forrester Research, Inc. Reproduction Prohibited 4
[Bar chart, horizontal axis 0% to 80%, one bar per priority listed below]
Better leverage big data and analytics in business decision-making
Create a comprehensive strategy for addressing digital technologies like mobile, social, and smart products
Create a comprehensive digital marketing strategy
Better comply with regulations and requirements
Improve differentiation in the market
Increase influence and brand reach in the market
Address rising customer expectations
Improve our ability to innovate
Improve our products/services
Improve the experience of our customers
A strong majority of business leaders prioritize improved customer experience and products.
Base: 3,005 global data and analytics decision-makers
Source: Global Business Technographics Data And Analytics Online Survey, 2015
[Timeline figure, 1800 to 2015: For you, For all, For segments, For you]
Customers want and increasingly expect
to be treated like celebrities.
Celebrity experiences must:
Learn individual customer
Detect customer needs and desires in real time
Adapt applications to serve an individual customer
Fortunately, every industry is graced with more data
Richer transactional data from portfolio of hundreds of
Usage and behavior data from web and mobile apps
IoT device sensor and event data
Social media data
Data economy firms buying and selling data
Using your best estimate, what is the size of
all data stored within your company?
Source: Forrester Research, September 2015
Base: 100 US Managers and above currently using Hadoop for processing and analyzing data.
Enterprises have plenty of data from both internal and
[Chart labels: 50-99 Terabytes (5%); Greater than 500]
External source data
What % of the data available is from internal business applications (ERP and business
applications) versus external sources (social, IoT)?
Learn Model Detect Adapt
Four kinds of analytics are necessary
Most firms invest here. They must invest here too.
Source: Forrester Research
That's why use of advanced analytics is surging
What is your firm's/business unit's current use of the following technologies?
Source: Forrester's Global Business Technographics Data And Analytics Survey, 2015 and 2014
Base: 1805 (2015), 1063 (2014)
Non-modeled data exploration and discovery
Metadata-generated analytics
Most of your
2. Hadoop and friends make analytics of all kinds cost-effective at scale.
100%: the number of enterprises that Forrester estimates will adopt Hadoop and friends!
Hadoop is designed for volume.
Spark is designed for speed.
Spark and Hadoop can coexist in the same cluster.
3. Perishable insights must be captured and used before they expire (or rot).
Perishable insights can have exponentially more
value than sleepy, after-the-fact traditional
All data is born fast!
But, analytics is usually done much later.
How can you prevent this dude from fleecing
you right now?
What offers should you make to your customer if
they are within proximity of your store right now?
A Resilient Distributed Dataset (RDD) is a generalized data structure that can cache data in memory and spill to disk if necessary.
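The cache-in-memory, fall-back-to-disk behavior described above can be illustrated with a toy sketch in plain Python. This is not Spark's implementation: `SpillingCache`, `max_in_memory`, and the partition layout are hypothetical names chosen for illustration. In PySpark itself, comparable behavior is requested with `rdd.persist(StorageLevel.MEMORY_AND_DISK)`.

```python
import os
import pickle
import tempfile


class SpillingCache:
    """Toy cache (hypothetical, not Spark code): keeps up to
    `max_in_memory` partitions in RAM and writes the rest to disk."""

    def __init__(self, max_in_memory):
        self.max_in_memory = max_in_memory
        self.memory = {}   # partition id -> data held in RAM
        self.on_disk = {}  # partition id -> path of the pickled file
        self.spill_dir = tempfile.mkdtemp(prefix="toy_spill_")

    def put(self, partition_id, data):
        # Keep the partition in RAM while there is room (or it is already there).
        if partition_id in self.memory or len(self.memory) < self.max_in_memory:
            self.memory[partition_id] = data
            return
        # Otherwise spill it to a file on disk.
        path = os.path.join(self.spill_dir, f"part-{partition_id}.pkl")
        with open(path, "wb") as f:
            pickle.dump(data, f)
        self.on_disk[partition_id] = path

    def get(self, partition_id):
        # Fast path: served from memory. Slow path: read back from disk.
        if partition_id in self.memory:
            return self.memory[partition_id]
        with open(self.on_disk[partition_id], "rb") as f:
            return pickle.load(f)


# Four small partitions, but room for only two in memory.
cache = SpillingCache(max_in_memory=2)
for pid in range(4):
    cache.put(pid, list(range(pid * 3, pid * 3 + 3)))

# A job that reads every partition still works; some reads just hit disk.
total = sum(sum(cache.get(pid)) for pid in range(4))
print(sorted(cache.memory), sorted(cache.on_disk), total)
```

Reads of spilled partitions pay for file I/O and deserialization, which is why a job tends to run faster when the whole data set fits in memory.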
Spark data processing jobs run exponentially faster when the data set fits in memory.
Why not just pop your data in-memory?
Planning, implementing, or expanding the use of an in-memory data platform.
Base: 1,805 global data and analytics decision-makers
Source: Forrester Global Business Technographics Data And Analytics Online Survey, 2015
4. Massive Machine Learning Automation (MMLA) is the future of data science.
Massive Machine Learning Automation (MMLA)
is the only competitive way forward.
Data scientists have slogged through the same iterative process for 20 years
Massive Machine Learning Automation: tools and technologies that automate, through configuration rather than coding, the process of data preparation, model building using statistical and machine learning algorithms, model evaluation, and model monitoring at scale.
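As a rough illustration of "configuration rather than coding", the sketch below drives data preparation, model building, and model evaluation from a single config dictionary. Everything here is an assumption made for the example: `CONFIG`, `MODEL_REGISTRY`, `run_pipeline`, and the two tiny models are hypothetical and do not correspond to any product. Model monitoring and scale are left out.

```python
import random
import statistics


def fit_mean(xs, ys):
    """Baseline model: always predict the mean of the training targets."""
    mean_y = statistics.fmean(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """Ordinary least squares for one feature: y = slope * x + intercept."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


# Hypothetical registry: config refers to models by name only.
MODEL_REGISTRY = {"mean": fit_mean, "linear": fit_linear}

# The analyst edits this configuration, not the pipeline code.
CONFIG = {
    "prepare": {"drop_missing": True},
    "candidates": ["mean", "linear"],
    "holdout_fraction": 0.25,
    "seed": 7,
}


def run_pipeline(rows, config):
    # Data preparation: optionally drop rows with missing values.
    if config["prepare"]["drop_missing"]:
        rows = [(x, y) for x, y in rows if x is not None and y is not None]
    # Shuffle reproducibly, then split into holdout and training sets.
    rows = list(rows)
    random.Random(config["seed"]).shuffle(rows)
    n_holdout = max(1, int(len(rows) * config["holdout_fraction"]))
    holdout, train = rows[:n_holdout], rows[n_holdout:]
    xs, ys = zip(*train)
    # Model building and evaluation: score every configured candidate
    # by mean squared error on the holdout set.
    scores = {}
    for name in config["candidates"]:
        model = MODEL_REGISTRY[name](xs, ys)
        scores[name] = statistics.fmean([(model(x) - y) ** 2 for x, y in holdout])
    best = min(scores, key=scores.get)
    return best, scores


# Synthetic data on the line y = 2x + 1, plus two rows with missing values.
rows = [(x, 2 * x + 1) for x in range(20)] + [(None, 3.0), (5, None)]
best, scores = run_pipeline(rows, CONFIG)
print(best, scores)
```

Adding a candidate model then means registering one function and adding its name to `CONFIG["candidates"]`; the pipeline code itself stays unchanged.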
The seven characteristics of massive machine learning automation.
5. The Spark community is diverse and innovating fast.
Learn Model Detect Adapt
Only the analytical enterprise can compete and win in the age of the customer
You shall have
none - until you
build a continuous
Generate industrial strength analytics with Spark and Hadoop