December 5, 2016
Dresner Advisory Services, LLC
2016 Edition
Big Data Analytics Market Study
Wisdom of Crowds®
Series
Licensed to Zoomdata
2016 Big Data Analytics Market Study
http://www.dresneradvisory.com Copyright 2016 – Dresner Advisory Services, LLC
Disclaimer:
This report should be used for informational purposes only. Vendor and product selections should be made based on
multiple information sources, face-to-face meetings, customer reference checking, product demonstrations and
proof-of-concept applications.
The information contained in all Wisdom of Crowds® Market Study Reports reflects the opinions expressed in the
online responses of individuals who chose to respond to our online questionnaire and does not represent a scientific
sampling of any kind. Dresner Advisory Services, LLC shall not be liable for the content of reports, study results, or for
any damages incurred or alleged to be incurred by any of the companies included in the reports as a result of its
content.
Reproduction and distribution of this publication in any form without prior written permission is forbidden.
Definition
Big Data Analytics Defined
We define big data analytics as systems that enable end-user access to and analysis of data
contained and managed within the Hadoop ecosystem.
Introduction
This year we celebrate the ninth anniversary of Dresner Advisory Services! We offer our
thanks to all of you for your continued support and ongoing encouragement.
Since our founding in 2007, we have worked hard to set the “bar” high—challenging
ourselves to innovate and lead the market—offering ever greater value with each
successive year.
Our first market report in 2010 set the stage for where we are today. Since that time, we
have expanded our agenda and added new research topics every year. For
2016, we are on track to release 15 major reports, including our recent flagship BI
report, now in its seventh year of publication!
In addition to our ongoing coverage of key topics such as embedded BI, big data
analytics and advanced and predictive analytics, we have added new topics including
Collective Insights™ (blending collaboration and governance) and systems integrators.
For this, our second Big Data Analytics Market Study, we continue to focus upon the
combination of analytical solutions within the Hadoop ecosystem, adding some new
criteria and exploring changing market dynamics and user perceptions and plans.
We hope you enjoy this report!
Best,
Howard Dresner
Chief Research Officer
Dresner Advisory Services
Contents
Definition (p. 3)
  Big Data Analytics Defined (p. 3)
Introduction (p. 4)
Benefits of the Study (p. 7)
  A Consumer Guide (p. 7)
  A Supplier Tool (p. 7)
About Howard Dresner and Dresner Advisory Services (p. 8)
About Jim Ericson (p. 9)
Survey Method and Data Collection (p. 10)
  Data Quality (p. 10)
Executive Summary (p. 12)
Study Demographics (p. 13)
  Geography (p. 13)
  Functions (p. 14)
  Vertical Industries (p. 15)
  Organization Size (p. 16)
Analysis and Trends: Big Data Analytics (p. 18)
  Importance of Big Data (p. 18)
  Big Data Adoption (p. 19)
  Future Adoption of Big Data (p. 25)
  Big Data Use Cases (p. 31)
  Big Data Infrastructure (p. 37)
  Big Data – Data Access (p. 43)
  Big Data Search (p. 49)
  Big Data Analytics / Machine-Learning Technologies (p. 55)
  Big Data Distributions (p. 61)
Industry and Vendor Analysis (p. 68)
  Big Data Analytics Vendor Ratings (p. 79)
Glossary (p. 80)
Other Dresner Advisory Services Research Reports (p. 84)
Appendix: Big Data Analytics Study Survey Instrument (p. 85)
Benefits of the Study
The DAS Big Data Analytics Market Study provides a wealth of information and
analysis, offering value to both consumers and producers of related technology and
services.
A Consumer Guide
As an objective source of industry research, consumers use the DAS Big Data Analytics
Market Study to understand how their peers are leveraging and investing in big data
analytics and related technologies.
Using our unique vendor performance measurement system, users glean key insights
into software supplier performance, enabling:
- Comparisons of current vendor performance to industry norms
- Identification and selection of new vendors
A Supplier Tool
Vendor licensees use the DAS Big Data Analytics Market Study in several important
ways:
External Awareness
- Build awareness for the big data analytics market and supplier brand, citing
  DAS Big Data Analytics Market Study trends and vendor performance
- Create lead and demand generation for supplier offerings through association
  with DAS Big Data Analytics Market Study brand, findings, webinars, etc.
Internal Planning
- Refine internal product plans and align with market priorities and realities as
  identified in DAS Big Data Analytics Market Study
- Better understand customer priorities, concerns, and issues
- Identify competitive pressures and opportunities
About Howard Dresner and Dresner Advisory Services
The DAS Big Data Analytics Market Study was conceived, designed, and executed by
Dresner Advisory Services, LLC, an independent advisory firm, and Howard Dresner, its
president, founder and chief research officer.
Howard Dresner is one of the foremost thought leaders in business intelligence and
performance management, having coined the term “Business Intelligence” in 1989. He
has published two books on the subject, The Performance
Management Revolution – Business Results through Insight
and Action (John Wiley & Sons, Nov. 2007) and Profiles in
Performance – Business Intelligence Journeys and the
Roadmap for Change (John Wiley & Sons, Nov. 2009). He
lectures at forums around the world and is often cited by the
business and trade press.
Prior to Dresner Advisory Services, Howard served as chief
strategy officer at Hyperion Solutions and was a research fellow at Gartner, where he
led its business intelligence research practice for 13 years.
Howard has conducted and directed numerous in-depth primary research studies over
the past two decades and is an expert in analyzing these markets.
Through the Wisdom of Crowds® Business Intelligence market research reports, we
engage with a global community to redefine how research is created and shared. Other
research reports include:
- Wisdom of Crowds “Flagship” Business Intelligence Market study
- Advanced and Predictive Analytics
- Collective Insights™
- Internet of Things and Business Intelligence
- Small and Mid-Sized Enterprise Business Intelligence
- Systems Integrators
Howard conducts a weekly Twitter “tweetchat” on Fridays at 1:00 p.m. ET. During these
live events the #BIWisdom “tribe” discusses a wide range of business intelligence
topics.
You can find more information about Dresner Advisory Services at
www.dresneradvisory.com.
About Jim Ericson
Jim Ericson is a research director with Dresner Advisory Services.
Jim has served as a consultant and journalist who studies end-user management
practices and industry trending in the data and information management fields.
From 2004 to 2013 he was the editorial director at Information Management magazine
(formerly DM Review), where he created architectures for user and
industry coverage for hundreds of contributors across the breadth of
the data and information management industry.
As lead writer, he interviewed and profiled more than 100 CIOs,
CTOs, and program directors in a 2010-2012 program called “25
Top Information Managers.” His related feature articles earned
ASBPE national bronze and multiple Mid-Atlantic region gold and
silver awards for Technical Article and for Case History feature
writing.
A panelist, interviewer, blogger, community liaison, conference co-chair, and speaker in
the data-management community, he also sponsored and co-hosted a weekly podcast
in continuous production for more than five years.
Jim’s earlier background as senior morning news producer at NBC/Mutual Radio
Networks and as managing editor of MSNBC’s first Washington, D.C. online news
bureau cemented his understanding of fact-finding, topical reporting, and serving broad
audiences.
Survey Method and Data Collection
As with all of our Wisdom of Crowds® Business Intelligence Market Studies, we
constructed a survey instrument to collect data and used social media and
crowdsourcing techniques to recruit participants.
We include our own research community of nearly 4,000 organizations as well as
crowdsourcing and vendors’ customer communities.
Data Quality
We carefully scrutinized and verified all respondent entries to ensure that only qualified
participants are included in the study.
Executive Summary
- Over two years of big data analytics study, we see a significant increase in
  uptake and a large drop in holdouts with no big data plans. High tech and
  telecom are industry leaders (p. 20-24).
- Current adoption and future plans for the use of big data analytics have reached
  a level of significance we did not see last year. Forty-one percent of
  organizations are already using Hadoop-related big data. Even more say they
  may use big data in the future (p. 19).
- Among organizations that have not yet adopted big data, 14 percent will adopt in
  the current calendar year, a horizon that grows to 47 percent in 2017.
  BICC respondents are likely future adopters (p. 25-30).
- Among technologies and initiatives considered strategic to business intelligence,
  big data analytics is ranked 20th out of 30 topical areas under study, still well
  behind core BI practices (p. 18). Overall, vendors are still highly positive on big
  data though sentiment is leveling off (p. 68).
- The top big data use case in 2016 is data warehouse optimization, followed by
  customer/social analysis (p. 31-36).
- The top big data infrastructure choice among users is Spark, followed by
  Map/Reduce, YARN, Oozie, Tez, Mesos, and Atlas. Over time, Spark is gaining
  status as a category leader (p. 40-42). Industry support is strongest for
  Map/Reduce, but Spark is closing in quickly (p. 69-70).
- Spark SQL is the most-cited big data access structure, followed closely by Hive
  and HDFS (p. 43-48). Industry support is strongest for Hive and HDFS; Spark
  support remains lower than user expectations (p. 71-72).
- Amid lukewarm interest toward big data search technologies, Elasticsearch
  resonated most strongly, followed by Apache Solr and Cloudera Search (p. 49-54).
  Industry support is strongest for Apache Solr, and support for Cloudera fell
  noticeably (p. 73-74).
- Spark MLlib is the most-preferred big data machine learning technology,
  “important” to more than 60 percent of respondents. All machine learning
  technologies gather interest but are still at the fringe (p. 55-60). Industry support
  for big data analytics / machine learning is strongest for Spark MLlib, followed by
  Mahout (p. 75-76).
- Cloudera is the most popular big data distribution among users, followed by
  Hortonworks, Amazon, and MapR (p. 61-66). We see significant existing
  industry support and future plans for big data (Hadoop) distributions (p. 77-78).
Study Demographics
Our 2016 Big Data Analytics Market Study is based on a cross-section of data that
spans geographies, functions, organization size, and vertical industries. We believe
that, unlike other industry research, this supports a more representative sample and a
better indicator of true market dynamics. We constructed cross-tab analyses using
these demographics to identify and illustrate important industry trends.
Geography
North America, which includes the U.S., Canada, and Puerto Rico, represents 57
percent of respondents (fig. 1). EMEA accounts for the next largest group (32 percent),
followed by Asia Pacific and Latin America.
Figure 1 – Geographies represented
North America: 57%
Europe, Middle East and Africa: 32%
Asia Pacific: 8%
Latin America: 3%
Functions
IT (28 percent) and the business intelligence competency center (21 percent) are the
two largest groups represented in our big data analytics sample (fig. 2).
Examining trends and behavior by function helps us compare and contrast plans and
priorities in different areas of organizations.
Figure 2 - Functions represented
Information Technology (IT): 28%
Business intelligence competency center: 21%
Executive management: 12%
Research and development (R&D): 11%
Sales and Marketing: 10%
Finance: 8%
Other: 12%
Vertical Industries
Technology (14 percent), financial services (10 percent), and consulting (9 percent) are
the most represented industries in our study, followed by healthcare, education, and
telecommunications (fig. 3). We include responses from consultants—who often have
greater interaction with initiatives and deeper industry knowledge than many customer
counterparts. This also yields insight into the partner ecosystem for BI vendors.
Figure 3 – Vertical industries represented
(Bar chart of respondent share by vertical industry, on a 0% to 20% axis.)
Organization Size
Respondents to our big data analytics study reflect a mix of organizational sizes and
structures (fig. 4). Small organizations of 1-100 employees represent 26 percent of the
sample. Mid-sized organizations (101-1,000 employees) account for 27 percent, and the
remaining 47 percent are large organizations with more than 1,000 employees.
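As a quick arithmetic check of the size bands described above, the following minimal Python sketch sums the shares reported in fig. 4. The variable names and the dictionary layout are illustrative assumptions and are not part of the study.

```python
# Shares of respondents by employee count, as reported in fig. 4 (percent).
size_shares = {
    "1 - 100": 26,
    "101 - 1000": 27,
    "1001 - 5000": 20,
    "More than 5000": 27,
}

# "Large" here means more than 1,000 employees, i.e. the two upper bands.
large_share = size_shares["1001 - 5000"] + size_shares["More than 5000"]
total = sum(size_shares.values())

print(f"Large organizations: {large_share} percent")
print(f"All bands combined: {total} percent")
```

The two upper bands together give the 47 percent quoted for large organizations, and the four bands sum to 100 percent.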
Figure 4 – Organization sizes represented
1 - 100 employees: 26%
101 - 1000 employees: 27%
1001 - 5000 employees: 20%
More than 5000 employees: 27%
Analysis and Trends: Big Data Analytics
Importance of Big Data
Among technologies and initiatives considered strategic to business intelligence, big
data analytics is ranked 20th out of 30 topical areas we currently study (fig. 5). This
finding reflects interest similar to last year's inaugural Big Data Analytics Market Study
(in which big data ranked 18th of 25 topics under study at the time). We understand that
big data interest can and does vary widely from organization to organization and will be
critical to some and irrelevant to others. While we see increasing momentum, big data
analytics still distantly trails the status and penetration of mainstream business
intelligence practices such as reporting, dashboards, and end-user self-service.
Figure 5 - Technologies and initiatives strategic to business intelligence
(Stacked bar chart, 0% to 100%. Response scale: Critical, Very important, Important,
Somewhat important, Not important. Topics in chart order: Reporting; Dashboards;
End-user "self-service"; Advanced visualization; Data discovery; Data warehousing;
Data mining, advanced algorithms, predictive; Integration with operational processes;
Data storytelling; Enterprise planning/budgeting; Mobile device support;
Embedded BI (contained within an application,…; Governance;
Collaborative support for group-based analysis; End-user data preparation and blending;
Search-based interface; Software-as-a-Service and cloud computing; In-memory analysis;
Ability to write to transactional applications; Location intelligence/analytics;
Big data (e.g., Hadoop); Pre-packaged vertical/functional analytical…; Text analytics;
Streaming data analysis; Open source software; Social media analysis (Social BI);
Cognitive BI (e.g., Artificial Intelligence-based BI); Complex event processing (CEP);
Internet of Things (IoT); Edge computing.)
Big Data Adoption
Current adoption and future plans for the use of big data analytics have reached a level
of significance we did not see last year. Forty-one percent of organizations say they are
already using big data analytics (fig. 6), which we define as “systems that enable
end-user access to and analysis of data contained and managed within the Hadoop
ecosystem.” Even more respondents (46 percent) say they may use big data in the
future. Just 14 percent have no plans for future use of big data analytics.
Figure 6 – Adoption of big data
Yes. We use big data today: 41%
We may use big data in the future: 46%
No. We have no plans to use big data at all: 14%
Over the two years of our comprehensive big data analytics study, we see a significant
increase in uptake and a large drop in holdouts with no plans (fig. 7). Forty-one percent
of respondents report current big data use, a greater than two-fold increase over 2015.
At the same time, the share of respondents with no plans fell by a factor of more
than two, from 36 percent to 14 percent. The percentage of ambivalent users was
consistent year over year at 45 percent or a bit more. Anecdotally, we attribute these
findings to an emerging mix of practical/achievable projects, service enablement, and
greater understanding of big data uses.
Figure 7 - Adoption of big data 2015 to 2016
(Bar chart comparing 2015 and 2016 responses, on a 0% to 50% axis, for: "Yes. We use big
data today"; "We may use big data in the future"; "No. We have no plans to use big data
at all.")
In our 2016 sample, EMEA leads slightly in current adoption (43 percent) compared to
North America (40 percent) and is well ahead of Asia Pacific (33 percent) (fig. 8). Asia
Pacific also reports the most organizations with "no plans to use big data at all" (27
percent). Both EMEA and North America report 46 percent undecided ("we may use big
data...") respondents.
Figure 8 – Adoption of big data by geography
(Stacked bar chart, 0% to 100%, for North America; Europe, Middle East and Africa; and
Asia Pacific. Response categories: "Yes. We use big data today"; "We may use big data in
the future"; "No. We have no plans to use big data at all.")
Perennial first-mover high-tech organizations lead 2016 big data adoption with 59
percent reporting current use (fig. 9). Telecommunications, with possibly the greatest
data transaction volume issues of any industry, is the next most likely industry to
currently use big data analytics (50 percent). Financial services, another high data
transaction industry, reports 45 percent current use. Less likely to be current users,
consulting industry respondents are nonetheless prepared to embrace big data as
needed.
Figure 9 – Adoption of big data by vertical industry
(Stacked bar chart, 0% to 100%, by vertical industry. Response categories: "Yes. We use
big data today"; "We may use big data in the future"; "No. We have no plans to use big
data at all.")
In 2016, the BICC supplanted R&D as the most likely current departmental user of big
data (fig. 10). This finding supports the notion that big data is moving from an
experimental to a practical pursuit in organizations. As is often the case, executive
management is a likely-to-sure proponent of evolutionary technologies such as big data.
We are uncertain as to why finance is also a strong player in big data unless interest
there is directed organizationally at cost savings. IT predictably lags in current adoption
and is most likely to have a vested interest in supporting legacy and traditional technology
investments.
Figure 10 – Adoption of big data by function
(Stacked bar chart, 0% to 100%, for Information Technology (IT); Business intelligence
competency center; Executive management; Research and development (R&D); Sales &
Marketing; and Finance. Response categories: "Yes. We use big data today"; "We may use
big data in the future"; "No. We have no plans to use big data at all.")
Current adoption of big data is strongest (61 percent) within very large businesses and
institutions that have more than 5,000 employees (fig. 11). Small organizations with one
to 100 employees have the lowest rate of current adoption (29 percent). After very large
organizations, however, small and mid-size (101-1,000 employees) organizations are most
open to possible future use. We would expect that small organizations are most likely cloud
users of big data services while large organizations will likely deploy onsite.
Figure 11 – Adoption of big data by organization size
(Stacked bar chart, 0% to 100%, for organizations of 1 - 100, 101 - 1000, 1001 - 5000, and
more than 5000 employees. Response categories: "Yes. We use big data today"; "We may use
big data in the future"; "No. We have no plans to use big data at all.")
Future Adoption of Big Data
Among organizations that have not yet adopted big data but have future plans, 14
percent say they will adopt in the current calendar year (fig. 12). This horizon grows
rapidly in 2017 when 47 percent plan to adopt. Unlike 2015 (see following fig. 13), only
a minority of non-users are postponing plans beyond 2017. Though
we often find big data plans compartmentalized to projects or departments, future
adoption will also hinge on current investment budgets for more "conventional"
technologies.
Figure 12 – Future adoption of big data
Will adopt in 2016: 14%
Will adopt in 2017: 47%
Will adopt beyond 2017: 40%
Compared to our inaugural 2015 study, year-over-year future adoption plans for big
data represent a sea change of respondent behavior (fig. 13). Current year adoption
plans are more than three times greater in 2016 (14 percent) compared to last year (4
percent). Next-year adoption in our current study (47 percent) shows remarkable growth
from 2015's 27 percent plans. Significantly fewer respondents are delaying plans
beyond next year, plainly indicating they are allocating money, resources, and time to
big data solutions and their use.
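The year-over-year comparison above reduces to simple ratios. The following minimal Python sketch recomputes them from the percentages quoted in the text; the dictionary layout and names are illustrative assumptions and are not part of the study.

```python
# Adoption-plan shares quoted in the text (percent of non-adopters with plans).
# Each entry maps a planning horizon to (2015 study, 2016 study).
plans = {
    "this year": (4, 14),
    "next year": (27, 47),
}

# Relative growth: how many times larger the 2016 share is than the 2015 share.
ratios = {h: s2016 / s2015 for h, (s2015, s2016) in plans.items()}

# Absolute change, expressed in percentage points.
point_changes = {h: s2016 - s2015 for h, (s2015, s2016) in plans.items()}

for horizon in plans:
    print(f"{horizon}: {ratios[horizon]:.2f}x, {point_changes[horizon]:+d} points")
```

The ratio for current-year plans works out to 3.5, consistent with the "more than three times" description, while next-year plans grow by 20 percentage points.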
Figure 13 - Future adoption of big data 2015 to 2016
(Bar chart comparing 2015 and 2016 responses, on a 0% to 80% axis, for: Will adopt this
year; Will adopt next year; Will adopt beyond next year.)
Regionally, among those who have not already adopted big data, North American and
Asia-Pacific respondents are more motivated to increase use compared to those in
EMEA (fig. 14). Asia Pacific has the greatest number of both 2016 (17 percent) and
2017 (50 percent) adopters; EMEA has the most respondents (48 percent) with plans
deferred beyond 2017.
Figure 14 - Future adoption of big data by geography
(Stacked bar chart, 0% to 100%, for North America; Europe, Middle East and Africa; and
Asia Pacific. Response categories: Will adopt in 2016; Will adopt in 2017; Will adopt
beyond 2017.)
Among organizations not yet using big data, vertical adoption in 2016 is highest (about
20 percent) in education, technology, and telecommunications (fig. 15). Plans for 2017
adoption are by far highest in financial services (75 percent), followed by consulting and
healthcare. (While future plans for telecommunications and technology appear relatively
low, recall that these sectors are also the greatest current users of big data technologies
(fig. 9, p. 22)).
Figure 15 – Future adoption of big data by vertical industry
(Stacked bar chart, 0% to 100%, by vertical industry. Response categories: Will adopt in
2016; Will adopt in 2017; Will adopt beyond 2017.)
Among non-users of big data, the BICC has by far the highest (30 percent) current-year
adoption plans (fig. 16). Accelerating BICC use is generally a reflection of delivery as
well as incipient demand for business technologies, another indication that big data
analytics is "crossing the chasm" of use cases and enterprise adoption. Sales and
marketing and IT (both low in current usage, fig. 10, p. 23) are the next most likely to be
current-year adopters of big data analytics, perhaps by fiat of executive management,
whose next-year interest is correspondingly highest.
Figure 16 – Future adoption of big data by function
(Stacked bar chart, 0% to 100%, for Business intelligence competency center; Executive
management; Information Technology (IT); Finance; Sales & Marketing; and Research and
development (R&D). Response categories: Will adopt in 2016; Will adopt in 2017; Will
adopt beyond 2017.)
As with current users of big data analytics (fig. 11, p. 24), 2016 first-adoption plans are
highest at very large organizations with more than 5,000 employees (fig. 17). More than
60 percent of very large organizations will take up the use of big data in 2016, more
than twice the rate at small organizations (29 percent). That said, we continue to believe
cloud-based offerings will be a strong driver of big data going forward for organizations
of any size. Possibly in that vein, 2017 adoption plans are highest at small organizations
(58 percent), followed by mid-sized organizations (50 percent).
Figure 17 – Future adoption of big data by organization size
(Stacked bar chart, 0% to 100%, for organizations of 1 - 100, 101 - 1000, 1001 - 5000, and
more than 5000 employees. Response categories: Will adopt in 2016; Will adopt in 2017;
Will adopt beyond 2017.)
Big Data Use Cases
The top big data use case in 2016 is data warehouse optimization, which is considered
critical or very important to 65 percent of respondents (fig. 18). As data warehouse
deployments are mostly confined to large institutions, this reinforces our view that big
data is predominantly a large-organization pursuit meant to lower cost and complexity.
That said, customer / social analysis is the next most likely use case and is, at
minimum, "very important" to a majority of respondents.
Figure 18 – Big data use cases
(Stacked bar chart, 0% to 100%, for Data warehouse optimization; Customer/social
analysis; Clickstream analytics; Fraud detection; and Internet of Things. Response scale:
Critical, Very important, Important, Somewhat important, Not important.)
Year over year, the top big data use cases, data warehouse optimization and customer /
social analysis, retain (and extend) their top rankings (fig. 19). The Internet of Things,
the third-most popular use case in 2015, lost momentum in 2016, possibly due to
settling hype and uneven prospects for average organizations. Clickstream analytics
and fraud detection gained the most influence year over year.
Figure 19 – Big data use cases 2015 to 2016
[Chart: scores on a 0 to 4 axis for data warehouse optimization, customer/social analysis, clickstream analytics, fraud detection, and Internet of Things. Series: 2015, 2016.]
By region, Asia Pacific and North America are the most likely to prioritize data
warehouse optimization (fig. 20). (All use cases, particularly fraud detection and
clickstream analytics, are, in fact, more highly prioritized in Asia Pacific than in other
regions.) Compared to North America, EMEA nonetheless has more interest in
customer / social analysis and the Internet of Things.
Figure 20 – Big data use cases by geography
[Chart: scores on a 1 to 5 axis by region (North America; Europe, Middle East and Africa; Asia Pacific). Series: data warehouse optimization, customer/social analysis, clickstream analytics, fraud detection, Internet of Things.]
When parsed by vertical industry, all industries rank data warehousing as a top or
second priority. Our 2016 sample shows somewhat surprising standout interest in data
warehouse optimization among healthcare respondents (fig. 21). Elsewhere, financial
services predictably reports the highest interest in fraud detection (and clickstream
analysis). Consulting leads technology in interest in customer / social analysis. Interest in
the Internet of Things is highest in education.
Figure 21 – Big data use cases by vertical industry
[Chart: scores on a 1 to 5 axis by industry (technology, financial services, consulting, healthcare, education). Series: data warehouse optimization, customer/social analysis, clickstream analytics, fraud detection, Internet of Things.]
All functions in our 2016 sample rank data warehouse optimization as their highest big
data use case priority (fig. 22). IT has the most standout interest in data warehouse
optimization, which is not surprising given traditional ownership boundaries. BICC and
executive management report the highest interest in customer / social analysis, perhaps
with an opportunistic viewpoint. BICC and sales/marketing are most interested in
clickstream analytics. Finance respondents show below-average interest in all big data
use cases.
Figure 22 – Big data use cases by function
[Chart: scores on a 1 to 5 axis by function (information technology, business intelligence competency center, executive management, research and development, sales and marketing, finance). Series: data warehouse optimization, customer/social analysis, clickstream analytics, fraud detection, Internet of Things.]
As expected, very large organizations (>5,000) have the greatest proportional interest in
data warehouse optimization (fig. 23). Generally, we would expect large organizations
to be more conventional in their approach to big data use cases with an eye toward cost
efficiency, while smaller peers are more balanced across opportunities. It is interesting,
however, that IoT has not caught fire in organizations of any size and that very large
organizations are the least attuned to customer / social analysis.
Figure 23 – Big data use cases by organization size
[Chart: scores on a 1 to 5 axis by organization size (1-100, 101-1,000, 1,001-5,000, more than 5,000). Series: data warehouse optimization, customer/social analysis, clickstream analytics, fraud detection, Internet of Things.]
Big Data Infrastructure
To gather baseline data on big data infrastructure awareness/adoption, we assembled a
list of relevant frameworks, databases, and other technologies in the Hadoop / open
source orbits of interest. In our 2016 sample, Spark is the preferred mechanism (fig. 24),
followed by Map/Reduce, Yarn, Oozie, Tez, Mesos, and Atlas. Spark and Map/Reduce
notably stand out across multiple grades of importance. All but the top three choices
(Spark, Map/Reduce, Yarn) are "not important" or only "somewhat important" to the
majority of respondents.
Figure 24 – Big data infrastructure
[Chart: share of respondents, 0% to 100%, for Spark, Map/Reduce, Yarn, Oozie, Tez, Mesos, Atlas, Knox Gateway, and Alluxio (formerly Tachyon). Series: critical, very important, important, somewhat important, not important.]
Across two years of study, Spark has surpassed Map/Reduce as the preferred big data
infrastructure (fig. 25). Preferences for Spark and associated applications/frameworks
extend across all measures in this report even though Map/Reduce is well penetrated in
early-stage use. All infrastructure choices gained favor in 2016 over 2015; the biggest
gainer besides Spark and Map/Reduce was Yarn. (2016 is the first year we polled
respondents on interest in Atlas and Knox Gateway.)
Figure 25 – Big data infrastructure 2015 to 2016
[Chart: scores on a 0 to 4 axis for each infrastructure option. Series: 2015, 2016.]
By region, Asia-Pacific respondents indicated the highest interest in all big data
infrastructures polled in 2016 and prioritize Yarn over Map/Reduce (fig. 26), perhaps
indicating late-arriving interest and newer editions of Hadoop. Among regional
preferences, EMEA had the second-highest interest in Spark and Map/Reduce, ahead
of North America. Interest in Yarn is equal in North America and EMEA. EMEA has
slightly higher interest in Oozie and somewhat less interest in Tez and Mesos compared
to North America.
Figure 26 – Big data infrastructure by geography
[Chart: scores on a 1 to 5 axis by region (North America; Europe, Middle East and Africa; Asia Pacific). Series: Spark, Map/Reduce, Yarn, Oozie, Tez, Mesos.]
Big data infrastructure preferences vary by vertical industry (fig. 27). While technology
industry respondents are the most singularly interested in Spark, other verticals share a
similar affinity for Map/Reduce, and consulting actually grades Map/Reduce higher
than Spark. This latter finding may reflect consulting firms serving existing demand for, and
investments in, Map/Reduce. Technology, healthcare, and consulting have the most
interest in Yarn; healthcare and consulting are also the most likely to engage with
Oozie.
Figure 27 – Big data infrastructure by vertical industry
[Chart: scores on a 1 to 5 axis by industry (technology, financial services, consulting, healthcare, education). Series: Spark, Map/Reduce, Yarn, Oozie, Tez, Mesos.]
Big data infrastructure preferences vary in interesting ways by function (fig. 28). The BICC
(often contained within IT) is the strongest proponent of Spark especially, followed by
Map/Reduce. As we have seen elsewhere, executive interest often follows (or leads)
BICC activity. By comparison, R&D interest is weak and falls sharply after
Spark and Map/Reduce. Central IT is predictably a laggard in embracing big data
compared to other roles but shows some preference for the various options. Perhaps
most interesting is sales and marketing, where Oozie and Tez claim the highest marks
of any department.
Figure 28 – Big data infrastructure by function
[Chart: scores on a 1 to 5 axis by function (information technology, business intelligence competency center, executive management, research and development, sales and marketing, finance). Series: Spark, Map/Reduce, Yarn, Oozie, Tez, Mesos.]
We see differences in big data infrastructure preferences across organizations of
different size, but none that are striking (fig. 29). Spark and Map/Reduce are easily the
preferred choices in organizations large or small, though Spark appears to have the most
influence in very large organizations. Likewise, Yarn is consistently the third most highly
cited infrastructure choice of all organizations.
Figure 29 – Big data infrastructure by organization size
[Chart: scores on a 1 to 5 axis by organization size (1-100, 101-1,000, 1,001-5,000, more than 5,000). Series: Spark, Map/Reduce, Yarn, Oozie, Tez, Mesos.]
Big Data – Data Access
We asked organizations which means of big data access they prefer and which are
most important to them. This includes indirect access to Hadoop and other related
engines. In our 2016 study, Spark SQL is the most cited and considered, at minimum,
“important” to close to 80 percent of the sample (fig. 30). Hive and HDFS, perhaps more
familiar to the conventional data warehousing audience, follow closely and elicited even
more "critical" responses than Spark.
Figure 30 – Big data – data access
[Chart: share of respondents, 0% to 100%, for Spark SQL, Hive/HiveQL, HDFS, HBase, Google BigQuery, Redshift, MongoDB, Impala, and Pivotal HAWQ. Series: critical, very important, important, somewhat important, not important.]
Among big data access technologies studied both last year and this year, all gained
positive sentiment year over year, especially Spark SQL, Hive/HiveQL, and Impala (fig.
31). Trailing technologies, with the exception of Pivotal HAWQ, all reached positive
sentiment of 2.7 to 2.9, in the range of "important."
Figure 31 – Big data – data access 2015 to 2016
[Chart: scores on a 0 to 4 axis for each data access technology. Series: 2015, 2016.]
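The year-over-year charts in this report summarize each technology as a single score, which the text then reads against the importance labels (for example, 2.7 to 2.9 as in the range of "important"). A minimal Python sketch of how such a score could be derived from response counts follows. The 1-to-5 coding of the labels, the function names, and the counts are all assumptions made for illustration; the report does not state its exact scoring method, and the counts shown are not survey data.

```python
# Assumed 1-5 coding of the survey's importance labels (not stated in the report).
SCALE = {
    "Not important": 1,
    "Somewhat important": 2,
    "Important": 3,
    "Very important": 4,
    "Critical": 5,
}


def mean_score(counts):
    """Weighted mean importance from a {label: respondent count} mapping."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no responses")
    return sum(SCALE[label] * n for label, n in counts.items()) / total


def nearest_label(score):
    """Return the label whose assumed code is closest to the mean score."""
    return min(SCALE, key=lambda label: abs(SCALE[label] - score))


# Hypothetical counts for illustration only; these are not survey data.
example = {
    "Critical": 10,
    "Very important": 20,
    "Important": 30,
    "Somewhat important": 25,
    "Not important": 15,
}

score = mean_score(example)
print(round(score, 2), nearest_label(score))
```

With these hypothetical counts the mean works out to 2.85, which sits closest to "Important" under the assumed coding.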
Big data access preferences vary by region (fig. 32). Asia Pacific had the strongest
response to several technologies, specifically HBase, Hive, HDFS, and Spark.
HBase was less appealing in regions other than Asia Pacific. Cloud-based solutions
(Redshift, Google BigQuery) fared worse but were slightly more appealing in North
America than other regions.
Figure 32 – Big data – data access by geography
[Chart: scores on a 1 to 5 axis by region (North America; Europe, Middle East and Africa; Asia Pacific). Series: Spark SQL, Hive/HiveQL, HDFS, HBase, Google BigQuery, Redshift.]
By vertical industry, financial services, technology, and consulting are the most aligned
around Spark SQL for data access (fig. 33). HiveQL resonated most strongly in
healthcare, followed by consulting and technology. Healthcare was also the strongest
proponent of HDFS, followed by financial services and technology. Consulting
respondents report an outsized interest in Redshift. Google BigQuery fared best in
consulting and financial services.
Figure 33 – Big data – data access by vertical industry
[Chart: scores on a 1 to 5 axis by industry (technology, financial services, consulting, healthcare, education). Series: Spark SQL, Hive/HiveQL, HDFS, HBase, Google BigQuery, Redshift.]
Departmental interest in data access varies by function (fig. 34). The BICC and
executive management are the strongest proponents of Spark SQL. More traditional
HBase, HDFS, and Hive are the most favored in sales and marketing, while the BICC is
most focused on HDFS and Hive along with Spark. Cloud-based offerings (Redshift,
Google BigQuery) are initially most interesting to executive management.
Figure 34 – Big data – data access by function
[Chart: scores on a 1 to 5 axis for Spark SQL, Hive/HiveQL, HDFS, HBase, Google BigQuery, and Redshift. Series: information technology, business intelligence competency center, executive management, research and development, sales and marketing.]
Small organizations (most likely to be early adopters) are proportionately most drawn to
Spark as a newer opportunity for big data access (fig. 35). Redshift (and Google
BigQuery in mid-sized organizations) are also popular as an easy and inexpensive entry
point to big data access for smaller organizations. Very large organizations are more
likely invested in big data access via HDFS and Hive followed by Spark SQL.
Figure 35 – Big data – data access by organization size
[Chart: scores on a 1 to 5 axis by organization size (1-100, 101-1,000, 1,001-5,000, more than 5,000). Series: Spark SQL, Hive/HiveQL, HDFS, HBase, Google BigQuery, Redshift.]
Big Data Search
We asked respondents to rank interest in big data search facilities, which in Hadoop
include indexing and natural language textual search (fig. 36). In our 2016 sample,
Elasticsearch resonated most strongly followed by Apache Solr and Cloudera Search.
Despite shifting over time (which we will expand on in the following figure) there is no
clear first choice in big data search; all three technologies are, at minimum, "important"
to 65 percent to 74 percent of respondents.
Figure 36 – Big data search
[Chart: share of respondents, 0% to 100%, for Elasticsearch, Apache Solr, and Cloudera Search. Series: critical, very important, important, somewhat important, not important.]
Across two years of study data, we saw a small reversal of fortunes among big data
search options (fig. 37). While Elasticsearch moved past early open source provider
Apache Solr into first place, Cloudera fell slightly from the top choice to third. While we
consider rising year-over-year sentiment a positive development, we reiterate that there
is currently no clear first choice emerging in big data search.
Figure 37 – Big data search 2015 to 2016
[Chart: scores on a 0 to 3.5 axis for Elasticsearch, Apache Solr, and Cloudera Search. Series: 2015, 2016.]
As in other measures, we found sentiment toward big data search options strongest
"across the board" in Asia Pacific (fig. 38). Also as mentioned, year-over-year sentiment
toward big data search increased across all regions, though levels of interest remain
middling and unremarkable.
Figure 38 – Big data search by geography
[Chart: scores on a 1 to 5 axis by region (North America; Europe, Middle East and Africa; Asia Pacific). Series: Elasticsearch, Apache Solr, Cloudera Search.]
We saw some divergence from overall results in big data search preference by industry
(fig. 39). Due to sector-size bias, we found respondents in three verticals (financial
services, healthcare, and consulting) preferred Cloudera Search to both top choice
Elasticsearch and Apache Solr. In contrast, technology, with a larger pool of
respondents, preferred Elasticsearch. In all instances, Apache Solr was the second
choice and was most preferred in healthcare and financial services.
Figure 39 – Big data search by vertical industry
[Chart: scores on a 1 to 5 axis by industry (technology, financial services, consulting, healthcare, education). Series: Elasticsearch, Apache Solr, Cloudera Search.]
As a relatively new technology hyped as "innovative," it is not entirely surprising to
find big data search advocacy strongest in executive management (fig. 40). Overall
functional preference was in favor of Elasticsearch (to a striking degree in IT), with the
exception of research and development, which preferred the earlier test bed of Apache
Solr. Overall sentiment ranged at or below a level of 3.0, indicating that big data search
is at best "important" or less and not critical to most audiences.
Figure 40 – Big data search by function
[Chart: scores on a 1 to 5 axis by function (information technology, business intelligence competency center, executive management, research and development, sales and marketing, finance). Series: Elasticsearch, Apache Solr, Cloudera Search.]
Big data search preferences vary somewhat but not dramatically in organizations of
different size (fig. 41). The largest departure in our 2016 sample is in mid-sized firms of
101 to 1,000 employees, where interest declines noticeably from Elasticsearch to other
options.
Figure 41 – Big data search capabilities by organization size
[Chart: scores on a 1 to 5 axis by organization size (1-100, 101-1,000, 1,001-5,000, more than 5,000). Series: Elasticsearch, Apache Solr, Cloudera Search.]
Big Data Analytics / Machine-Learning Technologies
We asked respondents to rank their interest in a variety of big data analytics and
machine-learning technologies (fig. 42). The leader, Spark MLlib (here and throughout
this category), is considered, at minimum, “important” by more than 60 percent of
respondents and ranks well ahead of all competitors. As we will see in the following
figure, this is a stark improvement over the previous year. Still, Spark MLlib is
considered "critical" to just 15 percent of respondents, reflecting an early-stage market
response to machine learning.
Figure 42 – Big data analytics / machine learning
[Chart: share of respondents, 0% to 100%, for Spark MLlib, Rhipe (R), Mahout, Oryx, and Myrrix. Series: critical, very important, important, somewhat important, not important.]
Year-over-year interest in big data analytics and machine learning increased across the
board, though it still remains confined to levels of 2.0 or "somewhat important" (fig. 43).
The most popular choice, Spark MLlib, also grew the most from 2015 to 2016. The next
greatest momentum levels were in Rhipe and Mahout.
Figure 43 – Big data analytics / machine learning 2015 to 2016
[Chart: scores on a 0 to 3.5 axis for Spark MLlib, Rhipe (R), Mahout, Oryx, and Myrrix. Series: 2015, 2016.]
Asia-Pacific respondents have a stronger response to different machine-learning
capabilities compared to other geographies (fig. 44). EMEA is next most engaged with
machine learning, ahead of levels in North America. Spark MLlib is again the top choice
across all regions. Mean levels of interest are again mostly in the “somewhat important”
to "important" range.
Figure 44 – Big data analytics / machine learning by geography
[Chart: scores on a 1 to 5 axis by region (North America; Europe, Middle East and Africa; Asia Pacific). Series: Spark MLlib, Rhipe (R), Mahout, Oryx, Myrrix.]
In our 2016 sample, interest in big data machine learning varied by vertical industry but
overall was led by preference for Spark MLlib (fig. 45). Healthcare and technology
showed the greatest interest in MLlib. Healthcare and consulting were most interested in
Rhipe.
Figure 45 – Big data analytics / machine learning by vertical industry
[Chart: scores on a 1 to 5 axis by industry (technology, financial services, consulting, healthcare, education). Series: Spark MLlib, Rhipe (R), Mahout, Oryx, Myrrix.]
By function, Spark MLlib is again the standout category leader across organizational
roles. BICC and executive management again mirror each other with the highest levels of interest,
followed by R&D and sales and marketing (fig. 46). IT is mostly unengaged with big
data analytics and machine learning, even more so than sales and marketing or finance.
Figure 46 – Big data analytics / machine learning by function
[Chart: scores on a 1 to 5 axis by function (information technology, business intelligence competency center, executive management, research and development, sales and marketing, finance). Series: Spark MLlib, Rhipe (R), Mahout, Oryx, Myrrix.]
Organizations of all sizes prefer Spark MLlib over all other big data analytics / machine-
learning options (fig. 47). This effect is not strictly correlated with size: in our 2016 sample,
sentiment for MLlib is strongest in organizations with 1,001 to 5,000 employees. Still, we see
that preference for Spark MLlib is generally higher at large organizations, while small peers have a
proportionately greater interest in R-based Rhipe.
Figure 47 – Big data analytics / machine learning by organization size
[Chart: scores on a 1 to 5 axis by organization size (1-100, 101-1,000, 1,001-5,000, more than 5,000). Series: Spark MLlib, Rhipe (R), Mahout, Oryx, Myrrix.]
Big Data Distributions
We asked respondents to rank big data distributions in order of
importance (fig. 48). In 2016, Cloudera led in measures of "critical" and was the
strongest overall performer, followed by Hortonworks, Amazon, and MapR. Cloudera,
Hortonworks, and MapR were all seen as, at minimum, "important" to 63 percent to 68
percent of respondents.
Figure 48 – Big data distributions
[Chart: share of respondents, 0% to 100%, for Cloudera, Hortonworks, Amazon, and MapR. Series: critical, very important, important, somewhat important, not important.]
Interest in all big data distributions increased year over year in 2016 (fig. 49). Amazon
fell slightly from a tie for top place to third, behind Cloudera and Hortonworks. Interest
levels for the top three choices were at or near 3.0, indicating average responses near
"important" to respondents.
Figure 49 – Big data distributions 2015 to 2016
[Chart: scores on a 0 to 3.5 axis for Cloudera, Hortonworks, Amazon, and MapR. Series: 2015, 2016.]
In 2016, there were differences of interest by geography in the four big data distributions
we sampled (fig. 50). Asia Pacific again leads across the board in interest in all
distributions. Perhaps most noticeably, EMEA reported the greatest standout interest in
Cloudera compared to other distributions.
Figure 50 – Big data distributions by geography
[Chart: scores on a 1 to 5 axis by region (North America; Europe, Middle East and Africa; Asia Pacific). Series: Cloudera, Hortonworks, Amazon, MapR.]
By vertical industry, healthcare, consulting, and financial services expressed the
greatest interest in Cloudera (fig. 51). Technology respondents (more heavily weighted
in our study) preferred Amazon. MapR performed strongest in consulting, healthcare,
and education. Hortonworks performed best in consulting, healthcare, and financial
services.
Figure 51 – Big data distributions by vertical industry
[Chart: scores on a 1 to 5 axis by industry (technology, financial services, consulting, healthcare, education). Series: Cloudera, Hortonworks, Amazon, MapR.]
Unlike other measures, Cloudera is not a big data distribution category leader by
function due to sample weighting (fig. 52). In our 2016 sample, Hortonworks was a
standout leader among sales and marketing respondents. Amazon performed strongest
among distributions for executive management respondents. BICC respondents
preferred Hortonworks by a lesser margin, and IT interest was led by Cloudera.
Figure 52 – Big data distributions by function
[Chart: scores on a 1 to 5 axis by function (information technology, business intelligence competency center, executive management, research and development, sales and marketing, finance). Series: Cloudera, Hortonworks, Amazon, MapR.]
Small to very large organizations have varying preferences in big data distributions,
though not to an extreme extent (fig. 53). As we might expect, cloud-based Amazon and
AWS distributions appeal most strongly to small organizations for simple and
inexpensive startup projects that have also demonstrated abilities to scale. Mid-sized
(101-1,000) organizations also most prefer Amazon, though we do see a trend among
larger organizations to bring big data distribution management in house. Cloudera and
Hortonworks are the top picks among large and very large organizations.
Figure 53 – Big data distributions by organization size
[Chart: scores on a 1 to 5 axis by organization size (1-100, 101-1,000, 1,001-5,000, more than 5,000). Series: Cloudera, Hortonworks, Amazon, MapR.]
Industry and Vendor Analysis
In 2016 as in 2015, we reached out to the vendor community with questions about their
capabilities and plans for technologies in big data analytics, including its perceived
importance to their strategies. Compared to 2015, industry sentiment appears to be
leveling off (fig. 54). Overall, vendors are still highly positive on big data but are trading
over-the-top enthusiasm for something less than a complete revolution in data
management. We view it as a positive that the proclaimed criticality of a still-emergent
set of technologies has been replaced by an optimistic upside of one that is "very
important" at the same time that user adoption (or awareness of same) has grown notably
year over year (fig. 7, p. 20).
Figure 54 – Industry importance of big data 2015 to 2016
[Chart: share of vendors, 0% to 70%, rating big data critically important, very important, somewhat important, or not important. Series: 2015, 2016.]
Among big data infrastructure options in the Hadoop ecosystem, Map/Reduce still has
the highest level of vendor support, which is not surprising given its longevity and
relative maturity (fig. 55). Support for Spark is closing in quickly with the highest
predicted industry support plans for the next 12 months, after which Spark support will
be ubiquitous. After Spark, industry support drops quickly below 50 percent. Future "no
plans" for support range from 30 percent to more than 60 percent.
Figure 55 – Industry support for big data infrastructure
[Chart: share of vendors, 0% to 100%, for each infrastructure option. Series: today, 12 months, 18 months, 24 months, no plans.]
Year over year, industry plans for supporting Map/Reduce, Spark, Yarn, and Tez have
all gathered momentum (fig. 56). Despite some growth in user sentiment (fig. 25, p. 38),
industry support for Oozie declined. We continue to expect that proprietary vendor
support of open source big data projects will be opportunistic and customer driven.
Figure 56 – Industry support for big data infrastructure 2015 to 2016
[Chart: share of vendors, 0% to 90%, for each infrastructure option. Series: 2015, 2016.]
Existing industry support for access to big data sources is greatest for Hive/HiveQL (87
percent), followed by HDFS (85 percent) (fig. 57). These top choices are in line with top
user preferences for data access, but Spark support is a good bit lower than user
expectations (fig. 30, p. 43). Industry support for Redshift is next highest, somewhat
ahead of user priorities. Google BigQuery currently has much lower industry support but
is the third most cited choice of users.
Figure 57 – Industry support for access to big data sources
[Chart: share of vendors, 0% to 100%, for each big data source. Series: today, 12 months, 18 months, 24 months, no plans.]
Year-over-year industry support for data access increased for all big data sources
polled with the exception of Redshift (fig. 58). Though Redshift was a lower priority
among users than industry vendors, it gained additional user interest in 2016 (fig. 31, p.
44). The biggest gainer of industry support in 2016, Redshift, gained even more interest
among user respondents year over year.
Figure 58 – Industry support for access to big data sources 2015 to 2016
[Chart: share of vendors, 0% to 100%, for each big data source. Series: 2015, 2016.]
Industry support for big data search did gain momentum in 2016, though support
remains distinctly lukewarm (fig. 59). While 30 percent of vendors indicate support for
Apache Solr, under 20 percent currently support Elasticsearch or Cloudera Search and
more than 40 percent have no plans for future support. The tepid investment in big data
search is in line with current user sentiments, which show little urgency for search (fig.
36, p. 49).
Figure 59 – Industry support for big data search
[Chart: share of vendors, 0% to 100%, for Apache Solr, Elasticsearch, and Cloudera Search. Series: today, 12 months, 18 months, 24 months, no plans.]
Year-over-year industry support for big data search varied noticeably by product (fig.
60). While support for category leader Apache Solr grew from 25 percent to 31 percent,
Cloudera Search support fell from 26 percent to 14 percent. Support for Elasticsearch
was flat year over year. We cannot be certain whether swings in industry support are
related to existing penetration or other market factors. We saw user interest in all three
big data search products grow year over year, but not with urgency (fig. 37, p.
50).
Figure 60 - Industry support for big data search 2015 to 2016
[Chart: Industry Support for Big Data Search 2015 to 2016; categories: Cloudera Search, Apache Solr, Elasticsearch; series: 2015, 2016; axis: 0% to 35%]
Industry support for big data analytics / machine learning is strongest for Spark MLlib
followed by Mahout, though we concede these investments are not urgent and reflect
the esoteric uses of machine learning in the current market (fig. 61). While industry
support for MLlib is expected to reach a total of 56 percent in the next 12 months, future
support for all other machine learning methods is tepid and may never reach 50
percent. (Spark MLlib, Rhipe, and Mahout were top user machine-learning choices but
also showed low levels of enthusiasm (fig. 42, p. 55).)
Figure 61 - Industry support for big data analytics / machine learning
[Chart: Industry Support for Big Data Analytics / Machine Learning; categories: Spark MLlib, Mahout, Rhipe (R), Oryx, Myrrix; series: Today, 12 months, 18 months, 24 months, No plans; axis: 0% to 100%]
Year-over-year industry support for big data analytics / machine learning was higher for
Spark MLlib, slightly lower for Mahout, and significantly lower for other products,
particularly Rhipe (fig. 62). Again, support investments remain low and, as with current
vendor support shown in fig. 61 above, where investment or interest is developing, it
tends to go to Spark MLlib.
Figure 62 - Industry support for big data analytics / machine learning 2015 to 2016
[Chart: Industry Support for Big Data Analytics / Machine Learning 2015 to 2016; categories: Spark MLlib, Mahout, Rhipe (R), Oryx, Myrrix; series: 2015, 2016; axis: 0% to 35%]
Compared to support for big data search, we see significant existing industry support
and future plans for big data (Hadoop) distributions (fig. 63). Current support is
strongest for Hortonworks, followed by Cloudera and MapR. Current support for
Amazon is under 60 percent, but industry respondents expect to see about 90 percent
support for all products within 24 months. These investments support stronger user
sentiments for big data distributions (fig. 48, p. 61) than for search or machine learning.
Figure 63 - Industry support for big data (Hadoop) distributions
[Chart: Industry Support for Big Data (Hadoop) Distributions; categories: Hortonworks, Cloudera, MapR, Amazon; series: Today, 12 months, 18 months, 24 months, No plans; axis: 0% to 100%]
Industry support/investments in Hortonworks and MapR big data distributions grew
year over year in 2016, while support for Cloudera and Amazon declined slightly (fig.
49, p. 62). Industry support for Hortonworks is currently greater than 80 percent; in
contrast, Amazon support is below 60 percent.
Figure 64 - Industry support for big data (Hadoop) distributions 2015 to 2016
[Chart: Industry Support for Big Data (Hadoop) Distributions 2015 to 2016; categories: Hortonworks, Cloudera, MapR, Amazon; series: 2015, 2016; axis: 0% to 90%]
Big Data Analytics Vendor Ratings
In rating vendors for big data analytics, we examined levels of functionality in five
categories: infrastructure, data access, search, machine learning, and supported
distributions (fig. 65). Criteria were weighted based on user responses/priorities. Top-
rated vendors include Zoomdata (1st), RapidMiner (2nd), Pentaho (3rd), Datameer (4th),
Domo (4th) and Information Builders (5th).
Figure 65 – Big data analytics vendor ratings
[Chart: Big Data Analytics Vendor Ratings; vendors as listed: Zoomdata, RapidMiner, Pentaho, Datameer, Domo, Information Builders, SAP, Tableau, Oracle, TIBCO, Jinfonet, Microsoft, Birst, Logi Analytics, Looker, MicroStrategy; series: Infrastructure, Data Access, Search, Distributions, Machine Learning, Total Score; logarithmic axis: 0.25 to 32]
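The study states only that rating criteria were weighted based on user responses/priorities; it does not publish the weights or the scoring formula. As a minimal sketch of how such a weighted rating could be computed, assuming a simple weight-normalized sum, the Python below combines per-category scores into one total. The function name `weighted_total` and every number in it are hypothetical and are not taken from the study.

```python
def weighted_total(scores, weights):
    """Return the weight-normalized sum of per-category scores.

    Assumption: a plain weighted average. The study does not disclose
    its actual formula, so this is an illustration only.
    """
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"no score for categories: {sorted(missing)}")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[c] * weights[c] for c in weights) / total_weight


# Hypothetical weights (user priorities) and scores for one vendor.
# These values are invented for illustration; they are not study data.
example_weights = {
    "infrastructure": 2.0,
    "data_access": 3.0,
    "search": 1.0,
    "machine_learning": 2.0,
    "distributions": 2.0,
}
example_scores = {
    "infrastructure": 4.0,
    "data_access": 5.0,
    "search": 1.0,
    "machine_learning": 3.0,
    "distributions": 4.0,
}
example_total = weighted_total(example_scores, example_weights)
```

Under this assumed scheme, a category that users rank as a higher priority moves the total more than a low-priority category such as search.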
Glossary
Alluxio (formerly Tachyon) is a memory-centric distributed storage system enabling reliable
data sharing at memory-speed across cluster frameworks.
Source: alluxio.org
Atlas is designed to exchange metadata with other tools and processes within and outside of the
Hadoop stack, thereby enabling platform-agnostic governance controls that effectively address
compliance requirements.
Source: Apache Software Foundation
BigQuery is a RESTful web service that enables interactive analysis of massively large datasets
working in conjunction with Google Storage. It is an Infrastructure as a Service (IaaS) service
that may be used complementarily with MapReduce.
Elasticsearch is a search server based on Lucene. It provides a distributed, multitenant-capable
full-text search engine with an HTTP web interface and schema-free JSON documents.
Elasticsearch is developed in Java and is released as open source under the terms of the Apache
License. Elasticsearch is the second most popular enterprise search engine after Apache Solr.*
HAWQ is a parallel SQL query engine that combines the key technological advantages of the
industry-leading Pivotal Analytic Database with the scalability and convenience of Hadoop.
HAWQ reads data from and writes data to HDFS natively. HAWQ delivers industry-leading
performance and linear scalability. It provides users the tools to confidently and successfully
interact with petabyte range data sets. HAWQ provides users with a complete, standards-
compliant SQL interface.
Source: Pivotal
HBase is an open source, non-relational, distributed database modeled after Google's BigTable
and is written in Java. It is developed as part of Apache Software Foundation's Apache Hadoop
project and runs on top of HDFS (Hadoop Distributed File System), providing BigTable-like
capabilities for Hadoop.
The Hadoop distributed file system (HDFS) is a distributed, scalable, and portable file system
written in Java for the Hadoop framework.
The Apache Hive™ data warehouse software facilitates querying and managing large datasets
residing in distributed storage. Hive provides a mechanism to project structure onto this data and
query the data using a SQL-like language called HiveQL. At the same time this language also
allows traditional map/reduce programmers to plug in their custom mappers and reducers when it
is inconvenient or inefficient to express this logic in HiveQL.
Source: Apache Software Foundation
The Apache Knox Gateway is a REST API Gateway for interacting with Apache Hadoop
clusters. The Knox Gateway provides a single access point for all REST interactions with
Apache Hadoop clusters.
Source: Apache Software Foundation
Impala is an open source, native analytic database for Apache Hadoop. Impala is shipped by
Cloudera, MapR, Oracle, and Amazon.
Source: Cloudera
Mahout is a project of the Apache Software Foundation to produce free implementations of
distributed or otherwise scalable machine learning algorithms focused primarily in the areas of
collaborative filtering, clustering and classification. Many of the implementations use the
Apache Hadoop platform. Mahout also provides Java libraries for common math operations
(focused on linear algebra and statistics) and primitive Java collections.
Source: Apache Software Foundation
MapReduce is a programming model and an associated implementation for processing and
generating large data sets with a parallel, distributed algorithm on a cluster. Conceptually similar
approaches have been very well known since 1995 with the Message Passing Interface standard
having reduce and scatter operations.
Apache Mesos is an open source cluster manager that was developed at the University of
California, Berkeley. It "provides efficient resource isolation and sharing across distributed
applications, or frameworks". The software enables resource sharing in a fine-grained manner,
improving cluster utilization.
MLlib is Spark’s scalable machine-learning library consisting of common learning algorithms
and utilities, including classification, regression, clustering, collaborative filtering,
dimensionality reduction, as well as underlying optimization primitives.
Source: Apache Software Foundation
MongoDB is a cross-platform document-oriented database. Classified as a NoSQL database,
MongoDB eschews the traditional table-based relational database structure in favor of JSON-like
documents with dynamic schemas (MongoDB calls the format BSON), making the integration of
data in certain types of applications easier and faster. Released under a combination of the GNU
Affero General Public License and the Apache License, MongoDB is free and open source
software.
Myrrix offers a “complete, real-time, scalable clustering and recommender system.” The
solution is built on top of the Apache Mahout machine-learning project.
Source: Cloudera
Oozie is a workflow scheduler system to manage Hadoop jobs. It is a server-based Workflow
Engine specialized in running workflow jobs with actions that run Hadoop MapReduce and Pig
jobs. Oozie is implemented as a Java Web application that runs in a Java servlet container.
Oryx is built on Apache Spark and Apache Kafka, with specialization for real-time large scale
machine learning. It is a framework for building applications but also includes packaged, end-to-
end applications for collaborative filtering, classification, regression, and clustering.
Source: Cloudera
RHIPE integrates the R statistical environment with the Hadoop framework. RHIPE allows R
users to compute on terabyte-sized data sets on a cluster using the MapReduce framework, thus
offering the best of both worlds to users seeking to leverage the strength of R and Hadoop.
People with very large data sets stored in the Hadoop Distributed File System can now easily
process the data on hundreds or even thousands of nodes in parallel, using only the R language.
Source: Revolution Analytics
Cloudera Search is one of Cloudera's near-real-time access products. Cloudera Search enables
non-technical users to search and explore data stored in or ingested into Hadoop and HBase.
Users do not need SQL or programming skills to use Cloudera Search because it provides a
simple, full-text interface for searching.
Source: Cloudera
Solr is an open source enterprise search platform, written in Java, from the Apache Lucene
project. Its major features include full-text search, hit highlighting, faceted search, real-time
indexing, dynamic clustering, database integration, NoSQL features and rich document (e.g.,
Word, PDF) handling. Providing distributed search and index replication, Solr is designed for
scalability and fault tolerance. Solr is the most popular enterprise search engine.
Apache Spark is an open source cluster computing framework originally developed in the
AMPLab at University of California, Berkeley but was later donated to the Apache Software
Foundation where it remains today. In contrast to Hadoop's two-stage disk-based MapReduce
paradigm, Spark's multi-stage in-memory primitives provide performance up to 100 times faster
for certain applications. By allowing user programs to load data into a cluster's memory and
query it repeatedly, Spark is well suited to machine-learning algorithms.
Spark SQL is a component on top of Spark Core that introduces a new data abstraction called
DataFrames, which provides support for structured and semi-structured data. Spark SQL
provides a domain-specific language to manipulate DataFrames in Scala, Java, or Python. It also
provides SQL language support with command-line interfaces and ODBC/JDBC server.
Apache™ Tez is an extensible framework for building high-performance batch and interactive
data-processing applications, coordinated by YARN in Apache Hadoop. Tez improves the
MapReduce paradigm by dramatically improving its speed while maintaining MapReduce’s
ability to scale to petabytes of data. Important Hadoop ecosystem projects like Apache Hive and
Apache Pig use Apache Tez, as do a growing number of third-party data-access applications
developed for the broader Hadoop ecosystem.
Source: Apache Software Foundation
YARN is one of the key features in the second-generation Hadoop 2 version of the Apache
Software Foundation's open source distributed processing framework. Originally described by
Apache as a redesigned resource manager, YARN is now characterized as a large-scale,
distributed operating system for big data applications.
* All sources Wikipedia unless otherwise noted
Other Dresner Advisory Services Research Reports
- Wisdom of Crowds “Flagship” Business Intelligence Market study
- Advanced and Predictive Analytics
- Business Intelligence Competency Center
- Cloud Computing and Business Intelligence
- Collective Insights™
- End User Data Preparation
- Enterprise Planning
- Internet of Things and Business Intelligence
- Location Intelligence
- Small and Mid-Sized Enterprise Business Intelligence
- Systems Integrators
Appendix: Big Data Analytics Study Survey Instrument
Please provide your contact information below:
Name*: _________________________________________________
Company Name: _________________________________________________
Address 1: _________________________________________________
Address 2: _________________________________________________
City: _________________________________________________
State: _________________________________________________
Zip: _________________________________________________
Country: _________________________________________________
Email Address*: _________________________________________________
Phone Number: _________________________________________________
Major Geography
( ) Asia/Pacific
( ) Europe, Middle East and Africa
( ) Latin America
( ) North America
What is your current title?
_________________________________________________
What function are you a part of?
( ) Business intelligence competency center
( ) Executive management
( ) Finance
( ) Information Technology (IT)
( ) Manufacturing
( ) Marketing
( ) Project/program management office
( ) Sales
( ) Research and development (R&D)
( ) Other - Write In: _________________________________________________
Please select an industry
( ) Advertising
( ) Aerospace
( ) Agriculture
( ) Apparel and accessories
( ) Automotive
( ) Aviation
( ) Biotechnology
( ) Broadcasting
( ) Business services
( ) Chemical
( ) Construction
( ) Consulting
( ) Consumer products
( ) Defense
( ) Distribution & logistics
( ) Education
( ) Energy
( ) Entertainment and leisure
( ) Executive search
( ) Federal government
( ) Financial services
( ) Food, beverage and tobacco
( ) Healthcare
( ) Hospitality
( ) Gaming
( ) Insurance
( ) Legal
( ) Manufacturing
( ) Mining
( ) Motion picture and video
( ) Not for profit
( ) Pharmaceuticals
( ) Publishing
( ) Real estate
( ) Retail and wholesale
( ) Sports
( ) State and local government
( ) Technology
( ) Telecommunications
( ) Transportation
( ) Utilities
( ) Other - Write In: _________________________________________________
How many employees does your company employ worldwide?
( ) 1 - 100
( ) 101 - 1000
( ) 1001 - 5000
( ) More than 5000
Do you use or intend to use big data technology/architecture within your organization?*
( ) Yes. We use big data today
( ) No. We have no plans to use big data at all
( ) We may use big data in the future
What product(s) does your organization use with big data for BI/analytics?
____________________________________________
How satisfied are you with your vendor and product for big data analytics?
( ) Extremely satisfied
( ) Somewhat satisfied
( ) Somewhat unsatisfied
( ) Unsatisfied
What are your plans for Big Data (Hadoop) Analytics in the Future?
( ) Will adopt in 2016
( ) Will adopt in 2017
( ) Will adopt beyond 2017
What use cases are most important for Big Data (Hadoop) in your organization?
Critical   Very important   Important   Somewhat important   Not important
Data warehouse optimization ( ) ( ) ( ) ( ) ( )
Customer/social analysis ( ) ( ) ( ) ( ) ( )
Internet of things ( ) ( ) ( ) ( ) ( )
Fraud detection ( ) ( ) ( ) ( ) ( )
Clickstream analytics ( ) ( ) ( ) ( ) ( )
Please indicate the importance of the following Big Data infrastructure components
Critical   Very important   Important   Somewhat important   Not important
Alluxio (formerly Tachyon) ( ) ( ) ( ) ( ) ( )
Mesos ( ) ( ) ( ) ( ) ( )
Spark ( ) ( ) ( ) ( ) ( )
Map/Reduce ( ) ( ) ( ) ( ) ( )
Oozie ( ) ( ) ( ) ( ) ( )
Yarn ( ) ( ) ( ) ( ) ( )
Tez ( ) ( ) ( ) ( ) ( )
Atlas ( ) ( ) ( ) ( ) ( )
Knox Gateway ( ) ( ) ( ) ( ) ( )
Please indicate the importance of the following Big Data - data access capabilities
Critical   Very important   Important   Somewhat important   Not important
BigQuery ( ) ( ) ( ) ( ) ( )
HBase ( ) ( ) ( ) ( ) ( )
HDFS ( ) ( ) ( ) ( ) ( )
Hive/HiveQL ( ) ( ) ( ) ( ) ( )
Impala ( ) ( ) ( ) ( ) ( )
MongoDB ( ) ( ) ( ) ( ) ( )
Pivotal HAWQ ( ) ( ) ( ) ( ) ( )
Redshift ( ) ( ) ( ) ( ) ( )
Spark SQL ( ) ( ) ( ) ( ) ( )
Please indicate the importance of the following Big Data search capabilities
Critical   Very important   Important   Somewhat important   Not important
Cloudera Search ( ) ( ) ( ) ( ) ( )
Apache Solr ( ) ( ) ( ) ( ) ( )
Elasticsearch ( ) ( ) ( ) ( ) ( )
Please indicate the importance of the following Big Data analytical/machine learning components
Critical   Very important   Important   Somewhat important   Not important
Mahout ( ) ( ) ( ) ( ) ( )
Rhipe (R) ( ) ( ) ( ) ( ) ( )
Oryx ( ) ( ) ( ) ( ) ( )
Myrrix ( ) ( ) ( ) ( ) ( )
Spark MLib ( ) ( ) ( ) ( ) ( )
Please indicate the importance of the following Big Data (Hadoop) distributions
Critical   Very important   Important   Somewhat important   Not important
Cloudera ( ) ( ) ( ) ( ) ( )
Hortonworks ( ) ( ) ( ) ( ) ( )
MAP/R ( ) ( ) ( ) ( ) ( )
Amazon ( ) ( ) ( ) ( ) ( )