
December 5, 2016

Dresner Advisory Services, LLC

2016 Edition

Big Data Analytics Market Study

Wisdom of Crowds®

Series

Licensed to Zoomdata

2016 Big Data Analytics Market Study

http://www.dresneradvisory.com Copyright 2016 – Dresner Advisory Services, LLC


Disclaimer:

This report should be used for informational purposes only. Vendor and product selections should be made based on

multiple information sources, face-to-face meetings, customer reference checking, product demonstrations and

proof-of-concept applications.

The information contained in all Wisdom of Crowds® Market Study Reports reflects the opinions expressed in the

online responses of individuals who chose to respond to our online questionnaire and does not represent a scientific

sampling of any kind. Dresner Advisory Services, LLC shall not be liable for the content of reports, study results, or for

any damages incurred or alleged to be incurred by any of the companies included in the reports as a result of its

content.

Reproduction and distribution of this publication in any form without prior written permission is forbidden.


Definition

Big Data Analytics Defined

We define big data analytics as systems that enable end-user access to and analysis of data contained and managed within the Hadoop ecosystem.


Introduction

This year we celebrate the ninth anniversary of Dresner Advisory Services! We offer our thanks to all of you for your continued support and ongoing encouragement.

Since our founding in 2007, we have worked hard to set the “bar” high—challenging

ourselves to innovate and lead the market—offering ever greater value with each

successive year.

Our first market report in 2010 set the stage for where we are today. Since that time, we have expanded our agenda and added new research topics every year. For 2016, we are on track to release 15 major reports, including our recent flagship BI report, now in its seventh year of publication!

In addition to our ongoing coverage of key topics such as embedded BI, big data analytics and advanced and predictive analytics, we have added new topics including Collective Insights™ (blending collaboration and governance) and systems integrators.

For this, our second Big Data Analytics Market Study, we continue to focus upon the

combination of analytical solutions within the Hadoop ecosystem, adding some new

criteria and exploring changing market dynamics and user perceptions and plans.

We hope you enjoy this report!

Best,

Howard Dresner
Chief Research Officer
Dresner Advisory Services


Contents

Definition ..... 3
  Big Data Analytics Defined ..... 3
Introduction ..... 4
Benefits of the Study ..... 7
  A Consumer Guide ..... 7
  A Supplier Tool ..... 7
About Howard Dresner and Dresner Advisory Services ..... 8
About Jim Ericson ..... 9
Survey Method and Data Collection ..... 10
  Data Quality ..... 10
Executive Summary ..... 12
Study Demographics ..... 13
  Geography ..... 13
  Functions ..... 14
  Vertical Industries ..... 15
  Organization Size ..... 16
Analysis and Trends: Big Data Analytics ..... 18
  Importance of Big Data ..... 18
  Big Data Adoption ..... 19
  Future Adoption of Big Data ..... 25
  Big Data Use Cases ..... 31
  Big Data Infrastructure ..... 37
  Big Data – Data Access ..... 43
  Big Data Search ..... 49
  Big Data Analytics / Machine-Learning Technologies ..... 55
  Big Data Distributions ..... 61
Industry and Vendor Analysis ..... 68
  Big Data Analytics Vendor Ratings ..... 79
Glossary ..... 80
Other Dresner Advisory Services Research Reports ..... 84
Appendix: Big Data Analytics Study Survey Instrument ..... 85


Benefits of the Study

The DAS Big Data Analytics Market Study provides a wealth of information and

analysis, offering value to both consumers and producers of related technology and

services.

A Consumer Guide

As an objective source of industry research, consumers use the DAS Big Data Analytics

Market Study to understand how their peers are leveraging and investing in big data

analytics and related technologies.

Using our unique vendor performance measurement system, users glean key insights into software supplier performance, enabling:

- Comparisons of current vendor performance to industry norms

- Identification and selection of new vendors

A Supplier Tool

Vendor licensees use the DAS Big Data Analytics Market Study in several important ways:

External Awareness

- Build awareness for the big data analytics market and supplier brand, citing DAS Big Data Analytics Market Study trends and vendor performance

- Create lead and demand generation for supplier offerings through association with DAS Big Data Analytics Market Study brand, findings, webinars, etc.

Internal Planning

- Refine internal product plans and align with market priorities and realities as identified in DAS Big Data Analytics Market Study

- Better understand customer priorities, concerns, and issues

- Identify competitive pressures and opportunities


About Howard Dresner and Dresner Advisory Services

The DAS Big Data Analytics Market Study was conceived, designed, and executed by Dresner Advisory Services, LLC, an independent advisory firm, and Howard Dresner, its president, founder and chief research officer.

Howard Dresner is one of the foremost thought leaders in business intelligence and

performance management, having coined the term “Business Intelligence” in 1989. He

has published two books on the subject, The Performance

Management Revolution – Business Results through Insight

and Action (John Wiley & Sons, Nov. 2007) and Profiles in

Performance – Business Intelligence Journeys and the

Roadmap for Change (John Wiley & Sons, Nov. 2009). He

lectures at forums around the world and is often cited by the

business and trade press.

Prior to Dresner Advisory Services, Howard served as chief

strategy officer at Hyperion Solutions and was a research fellow at Gartner, where he

led its business intelligence research practice for 13 years.

Howard has conducted and directed numerous in-depth primary research studies over

the past two decades and is an expert in analyzing these markets.

Through the Wisdom of Crowds® Business Intelligence market research reports, we

engage with a global community to redefine how research is created and shared. Other

research reports include:

- Wisdom of Crowds “Flagship” Business Intelligence Market Study

- Advanced and Predictive Analytics

- Collective Insights™

- Internet of Things and Business Intelligence

- Small and Mid-Sized Enterprise Business Intelligence

- Systems Integrators

Howard conducts a weekly Twitter “tweetchat” on Fridays at 1:00 p.m. ET. During these

live events the #BIWisdom “tribe” discusses a wide range of business intelligence

topics.

You can find more information about Dresner Advisory Services at

www.dresneradvisory.com.


About Jim Ericson

Jim Ericson is a research director with Dresner Advisory Services.

Jim has served as a consultant and journalist who studies end-user management practices and industry trends in the data and information management fields.

From 2004 to 2013 he was the editorial director at Information Management magazine

(formerly DM Review), where he created architectures for user and

industry coverage for hundreds of contributors across the breadth of

the data and information management industry.

As lead writer, he interviewed and profiled more than 100 CIOs,

CTOs, and program directors in a 2010-2012 program called “25

Top Information Managers.” His related feature articles earned

ASBPE national bronze and multiple Mid-Atlantic region gold and

silver awards for Technical Article and for Case History feature

writing.

A panelist, interviewer, blogger, community liaison, conference co-chair, and speaker in

the data-management community, he also sponsored and co-hosted a weekly podcast

in continuous production for more than five years.

Jim’s earlier background as senior morning news producer at NBC/Mutual Radio

Networks and as managing editor of MSNBC’s first Washington, D.C. online news

bureau cemented his understanding of fact-finding, topical reporting, and serving broad

audiences.


Survey Method and Data Collection

As with all of our Wisdom of Crowds® Business Intelligence Market Studies, we constructed a survey instrument to collect data and used social media and crowdsourcing techniques to recruit participants.

We include our own research community of nearly 4,000 organizations as well as

crowdsourcing and vendors’ customer communities.

Data Quality

We carefully scrutinized and verified all respondent entries to ensure that only qualified participants were included in the study.


Executive Summary

- Over two years of big data analytics study, we see a significant increase in uptake and a large drop in holdouts with no big data plans. High tech and telecom are industry leaders (p. 20-24).

- Current adoption and future plans for the use of big data analytics have reached a level of significance we did not see last year. Forty-one percent of organizations are already using Hadoop-related big data. Even more say they may use big data in the future (p. 19).

- Among organizations that have not yet adopted big data, 14 percent will adopt in the current calendar year, a horizon that grows to 47 percent in 2017. BICC respondents are likely future adopters (p. 25-30).

- Among technologies and initiatives considered strategic to business intelligence, big data analytics is ranked 20th out of 30 topical areas under study, still well behind core BI practices (p. 18). Overall, vendors are still highly positive on big data though sentiment is leveling off (p. 68).

- The top big data use cases in 2016 are data warehouse optimization, followed by customer/social analysis (p. 31-36).

- The top big data infrastructure choice among users is Spark, followed by Map/Reduce, Yarn, Oozie, Tez, Mesos, and Atlas. Over time, Spark is gaining status as a category leader (p. 40-42). Industry support is strongest for Map/Reduce, but Spark is closing in quickly (p. 69-70).

- Spark SQL is the most-cited big data access structure, followed closely by Hive and HDFS (p. 43-48). Industry support is strongest for Hive and HDFS; Spark support remains lower than user expectations (p. 71-72).

- Amid lukewarm interest toward big data search technologies, Elasticsearch resonated most strongly, followed by Apache Solr and Cloudera Search (p. 49-54). Industry support is strongest for Apache Solr, and support for Cloudera fell noticeably (p. 73-74).

- Spark MLlib is the most-preferred big data machine learning technology, “important” to more than 60 percent of respondents. All machine learning technologies gather interest but are still at the fringe (p. 55-60). Industry support for big data analytics / machine learning is strongest for Spark MLlib, followed by Mahout (p. 75-76).

- Cloudera is the most popular big data distribution among users, followed by Hortonworks, Amazon, and MapR (p. 61-66). We see significant existing industry support and future plans for big data (Hadoop) distributions (p. 77-78).


Study Demographics

Our 2016 Big Data Analytics Market Study is based on a cross-section of data that spans geographies, functions, organization size, and vertical industries. We believe that, unlike other industry research, this supports a more representative sample and a better indicator of true market dynamics. We constructed cross-tab analyses using these demographics to identify and illustrate important industry trends.

Geography

North America, which includes the U.S., Canada, and Puerto Rico, represents 57

percent of respondents (fig. 1). EMEA accounts for the next largest group (32 percent),

followed by Asia Pacific and Latin America.

Figure 1 – Geographies represented (bar chart): North America 57%; Europe, Middle East and Africa 32%; Asia Pacific 8%; Latin America 3%

Functions

IT (28 percent) and the business intelligence competency center (21 percent) are the

two largest groups represented in our big data analytics sample (fig. 2).

Examining trends and behavior by function helps us compare and contrast plans and

priorities in different areas of organizations.

Figure 2 - Functions represented (bar chart): Information Technology (IT) 28%; Business intelligence competency center 21%; Executive management 12%; Research and development (R&D) 11%; Sales and Marketing 10%; Finance 8%; Other 12%

Vertical Industries

Technology (14 percent), financial services (10 percent), and consulting (9 percent) are

the most represented industries in our study, followed by healthcare, education, and

telecommunications (fig. 3). We include responses from consultants—who often have

greater interaction with initiatives and deeper industry knowledge than many customer

counterparts. This also yields insight into the partner ecosystem for BI vendors.

Figure 3 – Vertical industries represented (bar chart; bar values as extracted, without industry labels: 14%, 10%, 9%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 2%, 2%, 2%, 2%, 18%)

Organization Size

Respondents to our big data analytics study reflect a mix of organizational sizes and structures (fig. 4). Small organizations of 1-100 employees represent 26 percent of the sample. Mid-sized organizations of 101-1,000 employees account for 27 percent, and the remaining 47 percent are large organizations with more than 1,000 employees.

Figure 4 – Organization sizes represented (bar chart, by number of employees): 1 - 100, 26%; 101 - 1000, 27%; 1001 - 5000, 20%; More than 5000, 27%

Analysis and Trends: Big Data Analytics

Importance of Big Data

Among technologies and initiatives considered strategic to business intelligence, big

data analytics is ranked 20th out of 30 topical areas we currently study (fig. 5). This

finding reflects interest similar to last year's inaugural Big Data Analytics Market Study

(in which big data ranked 18th of 25 topics under study at the time). We understand that

big data interest can and does vary widely from organization to organization and will be

critical to some and irrelevant to others. While we see increasing momentum, big data

analytics still distantly trails the status and penetration of mainstream business

intelligence practices such as reporting, dashboards, and end-user self-service.

Figure 5 - Technologies and initiatives strategic to business intelligence (stacked bar chart; each topic is rated Critical, Very important, Important, Somewhat important, or Not important). Topics in chart order:

- Reporting
- Dashboards
- End-user "self-service"
- Advanced visualization
- Data discovery
- Data warehousing
- Data mining, advanced algorithms, predictive
- Integration with operational processes
- Data storytelling
- Enterprise planning/budgeting
- Mobile device support
- Embedded BI (contained within an application,…
- Governance
- Collaborative support for group-based analysis
- End-user data preparation and blending
- Search-based interface
- Software-as-a-Service and cloud computing
- In-memory analysis
- Ability to write to transactional applications
- Location intelligence/analytics
- Big data (e.g., Hadoop)
- Pre-packaged vertical/functional analytical…
- Text analytics
- Streaming data analysis
- Open source software
- Social media analysis (Social BI)
- Cognitive BI (e.g., Artificial Intelligence-based BI)
- Complex event processing (CEP)
- Internet of Things (IoT)
- Edge computing

Big Data Adoption

Current adoption and future plans for the use of big data analytics have reached a level of significance we did not see last year. Forty-one percent of organizations say they are already using big data analytics (fig. 6), which we define as "systems that enable end-user access to and analysis of data contained and managed within the Hadoop ecosystem." Even more respondents (46 percent) say they may use big data in the future. Just 14 percent have no plans for future use of big data analytics.

Figure 6 – Adoption of big data (pie chart): Yes. We use big data today, 41%; We may use big data in the future, 46%; No. We have no plans to use big data at all, 14%

Over the two years of our comprehensive big data analytics study, we see a significant increase in uptake and a large drop in holdouts with no plans (fig. 7). Forty-one percent of respondents report current big data use, a greater than two-fold increase over 2015. At the same time, the number of respondents with no plans fell by a factor of greater than two, from 36 percent to 14 percent. The percentage of ambivalent users was consistent year over year at 45 percent or a bit more. We can anecdotally chalk these findings up to an emerging mix of practical/achievable projects, service enablement, and greater understanding of big data uses.

Figure 7 - Adoption of big data 2015 to 2016 (bar chart comparing 2015 and 2016 for: Yes. We use big data today; We may use big data in the future; No. We have no plans to use big data at all)

In our 2016 sample, EMEA leads slightly in current adoption (43 percent) compared to

North America (40 percent) and is well ahead of Asia Pacific (33 percent) (fig. 8). Asia

Pacific also reports the most organizations with "no plans to use big data at all" (27

percent). Both EMEA and North America report 46 percent undecided ("we may use big

data...") respondents.

Figure 8 – Adoption of big data by geography (stacked bar chart for North America; Europe, Middle East and Africa; and Asia Pacific. Series: Yes. We use big data today; We may use big data in the future; No. We have no plans to use big data at all)

Perennial first-mover high-tech organizations lead 2016 big data adoption with 59

percent reporting current use (fig. 9). Telecommunications, with possibly the greatest

data transaction volume issues of any industry, is the next most likely industry to

currently use big data analytics (50 percent). Financial services, another high data

transaction industry, reports 45 percent current use. Less likely to be current users,

consulting industry respondents are nonetheless prepared to embrace big data as

needed.

Figure 9 – Adoption of big data by vertical industry (stacked bar chart. Series: Yes. We use big data today; We may use big data in the future; No. We have no plans to use big data at all)

In 2016, the BICC supplanted R&D as the most likely current departmental user of big data (fig. 10). This finding supports the notion that big data is moving from an experimental to a practical pursuit in organizations. As is often the case, executive management is a likely-to-sure proponent of evolutionary technologies such as big data. We are uncertain as to why finance is also a strong player in big data, unless interest there is directed organizationally at cost savings. IT predictably lags in current adoption and is most likely to have a vested interest in supporting legacy and traditional technology investments.

Figure 10 – Adoption of big data by function (stacked bar chart for Information Technology (IT); Business intelligence competency center; Executive management; Research and development (R&D); Sales & Marketing; and Finance. Series: Yes. We use big data today; We may use big data in the future; No. We have no plans to use big data at all)

Current adoption of big data is strongest (61 percent) within very large businesses and

institutions that have more than 5,000 employees (fig. 11). Small organizations with one

to 100 employees have the lowest rate of current adoption (29 percent). After very large

organizations, however, small and mid-size (101-1,000 employees) are most open to

possible future use. We would expect that small organizations are most likely cloud

users of big data services while large organizations will likely deploy onsite.

Figure 11 – Adoption of big data by organization size (stacked bar chart for 1 - 100; 101 - 1000; 1001 - 5000; and More than 5000 employees. Series: Yes. We use big data today; We may use big data in the future; No. We have no plans to use big data at all)

Future Adoption of Big Data

Among organizations that have not yet adopted big data but have future plans, 14 percent say they will adopt in the current calendar year (fig. 12). This horizon grows rapidly in 2017, when 47 percent plan to adopt. Unlike 2015 (see following fig. 13), only a minority of non-users are postponing adoption plans beyond 2017. Though we often find big data plans compartmentalized to projects or departments, future adoption will also hinge on current investment budgets for more "conventional" technologies.

Figure 12 – Future adoption of big data (pie chart): Will adopt in 2016, 14%; Will adopt in 2017, 47%; Will adopt beyond 2017, 40%

Compared to our inaugural 2015 study, year-over-year future adoption plans for big

data represent a sea change of respondent behavior (fig. 13). Current year adoption

plans are more than three times greater in 2016 (14 percent) compared to last year (4

percent). Next-year adoption in our current study (47 percent) shows remarkable growth

from 2015's 27 percent plans. Significantly fewer respondents are delaying plans

beyond next year, plainly indicating they are allocating money, resources, and time to

big data solutions and their use.
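The year-over-year comparisons above reduce to simple ratios. As a minimal sketch, the following Python snippet recomputes them from the percentages quoted in the text. The function and variable names are illustrative assumptions and are not part of the study's methodology.

```python
def growth_factor(previous_pct: float, current_pct: float) -> float:
    """Return how many times larger the current share is than the previous one."""
    if previous_pct <= 0:
        raise ValueError("previous_pct must be positive")
    return current_pct / previous_pct


# Percentages quoted in the text (2015 study vs. 2016 study).
current_year_plans = growth_factor(4, 14)   # "will adopt this year"
next_year_plans = growth_factor(27, 47)     # "will adopt next year"

print(f"Current-year adoption plans: {current_year_plans:.1f}x the 2015 share")
print(f"Next-year adoption plans: {next_year_plans:.2f}x the 2015 share")
```

On these figures, the current-year share is 3.5 times the 2015 share, in line with the "more than three times" statement, while next-year plans grew by roughly three quarters.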

Figure 13 - Future adoption of big data 2015 to 2016 (bar chart comparing 2015 and 2016 for: Will adopt this year; Will adopt next year; Will adopt beyond next year)

Regionally, among those who have not already adopted big data, North American and

Asia-Pacific respondents are more motivated to increase use compared to those in

EMEA (fig. 14). Asia Pacific has the greatest number of both 2016 (17 percent) and

2017 (50 percent) adopters; EMEA has the most respondents (48 percent) with plans

deferred beyond 2017.

Figure 14 - Future adoption of big data by geography (stacked bar chart for North America; Europe, Middle East and Africa; and Asia Pacific. Series: Will adopt in 2016; Will adopt in 2017; Will adopt beyond 2017)

Among organizations not yet using big data, vertical adoption in 2016 is highest (about

20 percent) in education, technology, and telecommunications (fig. 15). Plans for 2017

adoption are by far highest in financial services (75 percent), followed by consulting and

healthcare. (While future plans for telecommunications and technology appear relatively

low, recall that these sectors are also the greatest current users of big data technologies

(fig. 9, p. 22)).

Figure 15 – Future adoption of big data by vertical industry (stacked bar chart. Series: Will adopt in 2016; Will adopt in 2017; Will adopt beyond 2017)

Among non-users of big data, the BICC has by far the highest (30 percent) current-year adoption plans (fig. 16). Accelerating BICC use is generally a reflection of delivery as well as incipient demand for business technologies, another indication that big data analytics is "crossing the chasm" of use cases and enterprise adoption. Sales and marketing and IT (low in current usage, fig. 10, p. 23) are the next most likely to be current-year adopters of big data analytics, perhaps by fiat of executive management (whose next-year interest is correspondingly highest).

Figure 16 – Future adoption of big data by function (stacked bar chart for Business intelligence competency center; Executive management; Information Technology (IT); Finance; Sales & Marketing; and Research and development (R&D). Series: Will adopt in 2016; Will adopt in 2017; Will adopt beyond 2017)

As with current users of big data analytics (fig. 11, p. 24), 2016 first-adoption plans are

highest at very large organizations with more than 5,000 employees (fig. 17). More than

60 percent of very large organizations will take up the use of big data in 2016, more

than twice the rate at small organizations (29 percent). That said, we continue to believe

cloud-based offerings will be a strong driver of big data going forward for organizations

of any size. Possibly in that vein, 2017 adoption plans are highest at small organizations

(58 percent), followed by mid-sized organizations (50 percent).

Figure 17 – Future adoption of big data by organization size (stacked bar chart for 1 - 100; 101 - 1000; 1001 - 5000; and More than 5000 employees. Series: Will adopt in 2016; Will adopt in 2017; Will adopt beyond 2017)

Big Data Use Cases

The top big data use case in 2016 is data warehouse optimization, which is considered

critical or very important to 65 percent of respondents (fig. 18). As data warehouse

deployments are mostly confined to large institutions, this reinforces our view that big

data is predominantly a large-organization pursuit meant to lower cost and complexity.

That said, customer / social analysis is the next most likely use case and is, at

minimum, "very important" to a majority of respondents.

Figure 18 – Big data use cases

[Chart: Big Data Use Cases. Axis: 0% to 100%. Use cases: data warehouse optimization, customer / social analysis, clickstream analytics, fraud detection, Internet of Things. Series: critical, very important, important, somewhat important, not important.]

Year over year, the top big data use cases, data warehouse optimization and customer /

social analysis, retain (and extend) their top rankings (fig. 19). The Internet of Things,

the third-most popular use case in 2015, lost momentum in 2016, possibly due to

settling hype and uneven prospects for average organizations. Clickstream analytics

and fraud detection gained the most influence year over year.

Figure 19 - Big data use cases 2015 to 2016

[Chart: Big Data Use Cases 2015 to 2016. Axis: 0 to 4. Use cases: data warehouse optimization, customer / social analysis, clickstream analytics, fraud detection, Internet of Things. Series: 2015, 2016.]

By region, Asia Pacific and North America are the most likely to prioritize data

warehouse optimization (fig. 20). (All use cases, particularly fraud detection and

clickstream analytics, are, in fact, more highly prioritized in Asia Pacific than in other

regions.) Compared to North America, EMEA nonetheless has more interest in

customer / social analysis and the Internet of Things.

Figure 20 - Big data use cases by geography

[Chart: Big Data Use Cases by Geography. Axis: 1 to 5. Regions: North America; Europe, Middle East and Africa; Asia Pacific. Series: data warehouse optimization, customer / social analysis, clickstream analytics, fraud detection, Internet of Things.]

When parsed by vertical industry, all industries rank data warehouse optimization as a top or

second priority. Our 2016 sample shows somewhat surprising standout interest in data

warehouse optimization among healthcare respondents (fig. 21). Elsewhere, financial

services predictably reports the highest interest in fraud detection (and clickstream

analysis). Consulting leads technology in interest in customer / social analysis. Interest in

the Internet of Things is highest in education.

Figure 21 – Big data use cases by vertical industry

[Chart: Big Data Use Cases by Vertical Industry. Axis: 1 to 5. Industries: technology, financial services, consulting, healthcare, education. Series: data warehouse optimization, customer / social analysis, clickstream analytics, fraud detection, Internet of Things.]

All functions in our 2016 sample rank data warehouse optimization as their highest big

data use case priority (fig. 22). IT has the most standout interest in data warehouse

optimization, which is not surprising given traditional ownership boundaries. BICC and

executive management report the highest interest in customer / social analysis, perhaps

with an opportunistic viewpoint. BICC and sales/marketing are most interested in

clickstream analytics. Finance respondents show below-average interest in all big data

use cases.

Figure 22 – Big data use cases by function

[Chart: Big Data Use Cases by Function. Axis: 1 to 5. Functions: information technology (IT), business intelligence competency center, executive management, research and development (R&D), sales and marketing, finance. Series: data warehouse optimization, customer / social analysis, clickstream analytics, fraud detection, Internet of Things.]

Very large organizations (more than 5,000 employees), as expected, have the greatest proportional interest in

data warehouse optimization (fig. 23). Generally, we would expect large organizations

to be more conventional in their approach to big data use cases with an eye toward cost

efficiency, while smaller peers are more balanced across opportunities. It is interesting,

however, that IoT has not caught fire in organizations of any size and that very large

organizations are the least attuned to customer / social analysis.

Figure 23 – Big data use cases by organization size

[Chart: Big Data Use Cases by Organization Size. Axis: 1 to 5. Organization sizes (employees): 1 - 100, 101 - 1000, 1001 - 5000, more than 5000. Series: data warehouse optimization, customer / social analysis, clickstream analytics, fraud detection, Internet of Things.]

Big Data Infrastructure

To gather baseline data on big data infrastructure awareness/adoption, we assembled a

list of relevant frameworks, databases, and other technologies in the Hadoop / open

source orbits of interest. In our 2016 sample, Spark is the preferred mechanism,

followed by Map/Reduce, Yarn, Oozie, Tez, Mesos, and Atlas. Spark and Map/Reduce

notably stand out across multiple grades of importance. All but the top three choices

(Spark, Map/Reduce, Yarn) are "not important" or only "somewhat important" to the

majority of respondents.

Figure 24 – Big data infrastructure

[Chart: Big Data Infrastructure. Axis: 0% to 100%. Technologies: Spark, Map/Reduce, Yarn, Oozie, Tez, Mesos, Atlas, Knox Gateway, Alluxio (formerly Tachyon). Series: critical, very important, important, somewhat important, not important.]

Across two years of study, Spark has surpassed Map/Reduce as the preferred big data

infrastructure (fig. 25). Preferences for Spark and associated applications/frameworks

extend across all measures in this report even though Map/Reduce is well penetrated in

early-stage use. All infrastructure choices gained favor in 2016 over 2015; the biggest

gainer besides Spark and Map/Reduce was Yarn. (2016 is the first year we polled

respondents on interest in Atlas and Knox Gateway.)

Figure 25 - Big data infrastructure 2015 to 2016

[Chart: Big Data Infrastructure 2015 to 2016. Axis: 0.0 to 4.0. Series: 2015, 2016.]

By region, Asia-Pacific respondents indicated the highest interest in all big data

infrastructures polled in 2016 and prioritize Yarn over Map/Reduce (fig. 26), perhaps

indicating late-arriving interest and newer editions of Hadoop. Among regional

preferences, EMEA had the second-highest interest in Spark and Map/Reduce, ahead

of North America. Interest in Yarn is equal in North America and EMEA. EMEA has

slightly higher interest in Oozie and somewhat less interest in Tez and Mesos compared

to North America.

Figure 26 - Big data infrastructure by geography

[Chart: Big Data Infrastructure by Geography. Axis: 1 to 5. Regions: North America; Europe, Middle East and Africa; Asia Pacific. Series: Spark, Map/Reduce, Yarn, Oozie, Tez, Mesos.]

Big data infrastructure preferences vary by vertical industry (fig. 27). While technology

industry respondents are most singularly interested in Spark, other verticals share

similar affinity for Map/Reduce, and consulting actually grades Map/Reduce higher

than Spark. This latter finding may reflect consulting firms serving existing demand for and

investments in Map/Reduce. Technology, healthcare, and consulting have the most

interest in Yarn; healthcare and consulting are also the most likely to engage with

Oozie.

Figure 27 – Big data infrastructure by vertical industry

[Chart: Big Data Infrastructure by Vertical Industry. Axis: 1 to 5. Industries: technology, financial services, consulting, healthcare, education. Series: Spark, Map/Reduce, Yarn, Oozie, Tez, Mesos.]

Big data infrastructure preferences vary interestingly by function (fig. 28). The BICC

(often contained within IT) is the strongest proponent of Spark especially, followed by

Map/Reduce. As we have seen elsewhere, executive interest often follows (or leads) in

the lines of BICC activity. By comparison, R&D interest is weak and falls sharply after

Spark and Map/Reduce. Central IT is predictably a laggard in embracing big data

compared to other roles but shows some preference for the various options. Perhaps

most interesting is sales and marketing, where Oozie and Tez claim the highest marks

of any department.

Figure 28 – Big data infrastructure by function

[Chart: Big Data Infrastructure by Function. Axis: 1 to 5. Functions: information technology (IT), business intelligence competency center, executive management, research and development (R&D), sales and marketing, finance. Series: Spark, Map/Reduce, Yarn, Oozie, Tez, Mesos.]

We see differences in big data infrastructure preferences across organizations of

different sizes, but none that are striking (fig. 29). Spark and Map/Reduce are easily the

preferred choices in organizations large or small, though Spark appears to have the most

influence in very large organizations. Likewise, Yarn is consistently the third most highly

cited infrastructure choice of all organizations.

Figure 29 – Big data infrastructure by organization size

[Chart: Big Data Infrastructure by Organization Size. Axis: 1 to 5. Organization sizes (employees): 1 - 100, 101 - 1000, 1001 - 5000, more than 5000. Series: Spark, Map/Reduce, Yarn, Oozie, Tez, Mesos.]

Big Data – Data Access

We asked organizations which means of big data access they prefer and how important each is

to them. This includes indirect access to Hadoop and other related

engines. In our 2016 study, Spark SQL is the most cited and considered, at minimum,

"important" to close to 80 percent of the sample (fig. 30). Hive and HDFS, perhaps more

familiar to the conventional data warehousing audience, follow closely and elicited even

more "critical" responses than Spark SQL.

Figure 30 – Big data – data access

[Chart: Big Data - Data Access. Axis: 0% to 100%. Technologies: Spark SQL, Hive/HiveQL, HDFS, HBase, Google BigQuery, Redshift, MongoDB, Impala, Pivotal HAWQ. Series: critical, very important, important, somewhat important, not important.]

Among big data access technologies studied both last year and this year, all gained

positive sentiment year over year, especially Spark SQL, Hive/HiveQL, and Impala (fig.

31). Trailing technologies, with the exception of Pivotal HAWQ, all reached positive

sentiment of 2.7 to 2.9, in the range of "important."
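The year-over-year figures in this report chart a single sentiment score per technology, and the text reads scores near 2.0 as "somewhat important" and near 3.0 as "important." As a rough illustration only, the Python sketch below shows how such a score could be derived as a weighted mean of a five-point importance distribution. The 1-to-5 mapping, the function names, and the sample percentages are assumptions made for this example; they are not taken from the survey data or from the study's stated method.

```python
# Assumed mapping of importance grades to a 1-5 scale (not stated in the study).
SCALE = {
    "Not important": 1,
    "Somewhat important": 2,
    "Important": 3,
    "Very important": 4,
    "Critical": 5,
}


def mean_score(shares):
    """Return the weighted mean score for a distribution of responses.

    `shares` maps each importance grade to its share of respondents
    (percentages or raw counts both work, since the result is normalized).
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("shares must sum to a positive number")
    return sum(SCALE[label] * share for label, share in shares.items()) / total


def nearest_label(score):
    """Return the grade whose scale value is closest to `score`.

    On an exact tie (for example 2.5) the lower grade is returned,
    because SCALE is ordered from lowest to highest.
    """
    return min(SCALE, key=lambda label: abs(SCALE[label] - score))


# Hypothetical distribution, invented purely for illustration.
hypothetical_shares = {
    "Critical": 10,
    "Very important": 20,
    "Important": 30,
    "Somewhat important": 25,
    "Not important": 15,
}

score = mean_score(hypothetical_shares)
print(f"{score:.2f} -> {nearest_label(score)}")
```

With the hypothetical shares above, the weighted mean works out to 2.85, which falls inside the 2.7 to 2.9 band described for the trailing data access technologies and sits closest to "important."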

Figure 31 - Big data - data access 2015 to 2016

[Chart: Big Data - Data Access 2015 to 2016. Axis: 0 to 4. Series: 2015, 2016.]

Big data access preferences vary by region (fig. 32). Asia Pacific had the strongest

response to several technologies, specifically HBase, Hive, HDFS, and Spark. Globally,

HBase was less appealing in regions other than Asia Pacific. Cloud-based solutions

(Redshift, Google BigQuery) fared worse but were slightly more appealing in North

America than other regions.

Figure 32 – Big data – data access by geography

[Chart: Big Data - Data Access by Geography. Axis: 1 to 5. Regions: North America; Europe, Middle East and Africa; Asia Pacific. Series: Spark SQL, Hive/HiveQL, HDFS, HBase, Google BigQuery, Redshift.]

By vertical industry, financial services, technology, and consulting are the most aligned

around Spark SQL for data access (fig. 33). HiveQL resonated most strongly in

healthcare, followed by consulting and technology. Healthcare was also the strongest

proponent of HDFS, followed by financial services and technology. Consulting

respondents report an outsized interest in Redshift. Google BigQuery fared best in

consulting and financial services.

Figure 33 – Big data – data access by vertical industry

[Chart: Big Data - Data Access by Vertical Industry. Axis: 1 to 5. Industries: technology, financial services, consulting, healthcare, education. Series: Spark SQL, Hive/HiveQL, HDFS, HBase, Google BigQuery, Redshift.]

Departmental interest in data access varies by function (fig. 34). The BICC and

executive management are the strongest proponents of Spark SQL. More traditional

HBase, HDFS, and Hive are the most favored in sales and marketing, while the BICC is

most focused on HDFS and Hive along with Spark. Cloud-based offerings (Redshift,

Google BigQuery) are initially most interesting to executive management.

Figure 34 - Big data - data access by function

[Chart: Big Data - Data Access by Function. Axis: 1 to 5. Technologies: Spark SQL, Hive/HiveQL, HDFS, HBase, Google BigQuery, Redshift. Series: information technology (IT), business intelligence competency center, executive management, research and development (R&D), sales and marketing.]

Small organizations (most likely to be early adopters) are proportionately most drawn to

Spark as a newer opportunity for big data access (fig. 35). Redshift (and Google

BigQuery in mid-sized organizations) is also popular as an easy and inexpensive entry

point to big data access for smaller organizations. Very large organizations are more

likely to be invested in big data access via HDFS and Hive, followed by Spark SQL.

Figure 35 - Big data - data access by organization size

[Chart: Big Data - Data Access by Organization Size. Axis: 1 to 5. Organization sizes (employees): 1 - 100, 101 - 1000, 1001 - 5000, more than 5000. Series: Spark SQL, Hive/HiveQL, HDFS, HBase, Google BigQuery, Redshift.]

Big Data Search

We asked respondents to rank interest in big data search facilities, which in Hadoop

include indexing and natural language textual search (fig. 36). In our 2016 sample,

Elasticsearch resonated most strongly followed by Apache Solr and Cloudera Search.

Despite shifting over time (which we will expand on in the following figure) there is no

clear first choice in big data search; all three technologies are, at minimum, "important"

to 65 percent to 74 percent of respondents.

Figure 36 - Big data search

[Chart: Big Data Search. Axis: 0% to 100%. Technologies: Elasticsearch, Apache Solr, Cloudera Search. Series: critical, very important, important, somewhat important, not important.]

Across two years of study data, we saw a small reversal of fortunes among big data

search options (fig. 37). While Elasticsearch moved past early open source provider

Apache Solr into first place, Cloudera Search fell slightly from the top choice to third. While we

consider rising year-over-year sentiment a positive development, we reiterate that there

is currently no clear first choice emerging in big data search.

Figure 37 - Big data search 2015 to 2016

[Chart: Big Data Search 2015 to 2016. Axis: 0.0 to 3.5. Technologies: Elasticsearch, Apache Solr, Cloudera Search. Series: 2015, 2016.]

As in other measures, we found sentiment toward big data search options strongest

"across the board" in Asia Pacific (fig. 38). Also as mentioned, year-over-year sentiment

toward big data search increased across all regions, though with middling and not

remarkable levels of interest.

Figure 38 - Big data search by geography

[Chart: Big Data Search by Geography. Axis: 1 to 5. Regions: North America; Europe, Middle East and Africa; Asia Pacific. Series: Elasticsearch, Apache Solr, Cloudera Search.]

We saw some divergence from overall results in big data search preference by industry

(fig. 39). Due to sector-size bias, we found respondents in three verticals (financial

services, healthcare, and consulting) preferred Cloudera Search to both top choice

Elasticsearch and Apache Solr. In contrast, technology, with a larger pool of

respondents, preferred Elasticsearch. In all instances, Apache Solr was the second

choice and was most preferred in healthcare and financial services.

Figure 39 - Big data search by vertical industry

[Chart: Big Data Search by Vertical Industry. Axis: 1 to 5. Industries: technology, financial services, consulting, healthcare, education. Series: Elasticsearch, Apache Solr, Cloudera Search.]

As a relatively new technology hyped as "innovative," it is not entirely surprising to

find big data search advocacy strongest in executive management (fig. 40). Overall

functional preference was in favor of Elasticsearch (to a striking degree in IT), with the

exception of research and development, which preferred the earlier test bed of Apache

Solr. Overall sentiment ranged at or below a level of 3.0, indicating that big data search

is at best "important" or less and not critical to most audiences.

Figure 40 - Big data search by function

[Chart: Big Data Search by Function. Axis: 1 to 5. Functions: information technology (IT), business intelligence competency center, executive management, research and development (R&D), sales and marketing, finance. Series: Elasticsearch, Apache Solr, Cloudera Search.]

Big data search preferences vary somewhat but not dramatically in organizations of

different sizes (fig. 41). The largest departure in our 2016 sample is in mid-sized firms of

101 to 1,000 employees, where interest declines noticeably from Elasticsearch to other

options.

Figure 41 - Big data search capabilities by organization size

[Chart: Big Data Search by Organization Size. Axis: 1 to 5. Organization sizes (employees): 1 - 100, 101 - 1000, 1001 - 5000, more than 5000. Series: Elasticsearch, Apache Solr, Cloudera Search.]

Big Data Analytics / Machine-Learning Technologies

We asked respondents to rank their interest in a variety of big data analytics and

machine-learning technologies (fig. 42). The leader, Spark MLlib (here and throughout

this category), is considered, at minimum, "important" by more than 60 percent of

respondents and ranks well ahead of all competitors. As we will see in the following

figure, this is a stark improvement over the previous year. Still, Spark MLlib is

considered "critical" to just 15 percent of respondents, reflecting an early-stage market

response to machine learning.

Figure 42 - Big data analytics / machine learning

[Chart: Big Data Analytics / Machine Learning. Axis: 0% to 100%. Technologies: Spark MLlib, Rhipe (R), Mahout, Oryx, Myrrix. Series: critical, very important, important, somewhat important, not important.]

Year-over-year interest in big data analytics and machine learning increased across the

board, though it still remains confined to levels of 2.0 or "somewhat important" (fig. 43).

The most popular choice, Spark MLlib, also grew the most from 2015 to 2016. The next

greatest momentum levels were in Rhipe and Mahout.

Figure 43 - Big data analytics / machine learning 2015 to 2016

[Chart: Big Data Analytics / Machine Learning 2015 to 2016. Axis: 0.00 to 3.50. Technologies: Spark MLlib, Rhipe (R), Mahout, Oryx, Myrrix. Series: 2015, 2016.]

Asia-Pacific respondents have a stronger response to different machine-learning

capabilities compared to other geographies (fig. 44). EMEA is next most engaged with

machine learning, ahead of levels in North America. Spark MLlib is again the top choice

across all regions. Mean levels of interest are again mostly in the "somewhat important"

to "important" range.

Figure 44 - Big data analytics / machine learning by geography

[Chart: Big Data Analytics / Machine Learning by Geography. Axis: 1 to 5. Regions: North America; Europe, Middle East and Africa; Asia Pacific. Series: Spark MLlib, Rhipe (R), Mahout, Oryx, Myrrix.]

In our 2016 sample, interest in big data machine learning varied by vertical industry but

overall was led by preference for Spark MLlib (fig. 45). Healthcare and technology

showed the greatest interest in MLlib. Healthcare and consulting were most interested in

Rhipe.

Figure 45 - Big data analytics / machine learning by vertical industry

[Chart: Big Data Analytics / Machine Learning by Vertical Industry. Axis: 1 to 5. Industries: technology, financial services, consulting, healthcare, education. Series: Spark MLlib, Rhipe (R), Mahout, Oryx, Myrrix.]

By function, Spark MLlib is again the standout category leader across organizational

roles. BICC and executive management are again mirrors of the top areas of interest,

followed by R&D and sales and marketing (fig. 46). IT is mostly unengaged with big

data analytics and machine learning, even more so than sales and marketing or finance.

Figure 46 - Big data analytics / machine learning by function

[Chart: Big Data Analytics / Machine Learning by Function. Axis: 1 to 5. Functions: information technology (IT), business intelligence competency center, executive management, research and development (R&D), sales and marketing, finance. Series: Spark MLlib, Rhipe (R), Mahout, Oryx, Myrrix.]

Organizations of all sizes prefer Spark MLlib over all other big data analytics /

machine-learning options (fig. 47). This effect is not correlated to size. In our 2016 sample,

sentiment for MLlib is strongest in organizations with 1,001 to 5,000 employees. We see

that preference for Spark MLlib is higher at large organizations, while small peers have a

proportionately greater interest in R-based Rhipe.

Figure 47 - Big data analytics / machine learning by organization size

[Chart: Big Data Analytics / Machine Learning by Organization Size. Axis: 1 to 5. Organization sizes (employees): 1 - 100, 101 - 1000, 1001 - 5000, more than 5000. Series: Spark MLlib, Rhipe (R), Mahout, Oryx, Myrrix.]

Big Data Distributions

We asked respondents to rank big data distributions in order of

importance (fig. 48). In 2016, Cloudera led in measures of "critical" and was the

strongest overall performer, followed by Hortonworks, Amazon, and MapR. Cloudera,

Hortonworks, and MapR were all seen as, at minimum, "important" to 63 percent to 68

percent of respondents.

Figure 48 - Big data distributions

[Chart: Big Data Distributions. Axis: 0% to 100%. Distributions: Cloudera, Hortonworks, Amazon, MapR. Series: critical, very important, important, somewhat important, not important.]

Interest in all big data distributions increased year over year in 2016 (fig. 49). Amazon

fell slightly from a tie for top place to third, behind Cloudera and Hortonworks. Interest

levels for the top three choices were at or near 3.0, indicating average responses near

"important" to respondents.

Figure 49 - Big data distributions 2015 to 2016

[Chart: Big Data Distributions 2015 to 2016. Axis: 0.00 to 3.50. Distributions: Cloudera, Hortonworks, Amazon, MapR. Series: 2015, 2016.]

In 2016, there were differences of interest by geography in the four big data distributions

we sampled (fig. 50). Asia Pacific is again the leader across the board on all distribution

interest. Perhaps most noticeably, EMEA reported the greatest standout interest in

Cloudera compared to other distributions.

Figure 50 - Big data distributions by geography

[Chart: Big Data Distributions by Geography. Axis: 1 to 5. Regions: North America; Europe, Middle East and Africa; Asia Pacific. Series: Cloudera, Hortonworks, Amazon, MapR.]

By vertical industry, healthcare, consulting, and financial services expressed the

greatest interest in Cloudera (fig. 51). Technology respondents (more heavily weighted

in our study) preferred Amazon. MapR performed strongest in consulting, healthcare,

and education. Hortonworks performed best in consulting, healthcare, and financial

services.

Figure 51 - Big data distributions by vertical industry

[Chart: Big Data Distributions by Vertical Industry. Axis: 1 to 5. Industries: technology, financial services, consulting, healthcare, education. Series: Cloudera, Hortonworks, Amazon, MapR.]

Unlike other measures, Cloudera is not a big data distribution category leader by

function due to sample weighting (fig. 52). In our 2016 sample, Hortonworks was a

standout leader among sales and marketing respondents. Amazon performed strongest

among distributions for executive management respondents. BICC respondents

preferred Hortonworks by a lesser margin, and IT interest was led by Cloudera.

Figure 52 - Big data distributions by function

[Chart: Big Data Distributions by Function. Axis: 1 to 5. Functions: information technology (IT), business intelligence competency center, executive management, research and development (R&D), sales and marketing, finance. Series: Cloudera, Hortonworks, Amazon, MapR.]

Small to very large organizations have varying preferences in big data distributions,

though not to an extreme extent (fig. 53). As we might expect, cloud-based Amazon and

AWS distributions appeal most strongly to small organizations for simple and

inexpensive startup projects that have also demonstrated abilities to scale. Mid-sized

(101-1,000) organizations also most prefer Amazon, though we do see a trend among

larger organizations to bring big data distribution management in house. Cloudera and

Hortonworks are the top picks among large and very large organizations.

Figure 53 - Big data distributions by organization size

[Chart: Big Data Distributions by Organization Size. Axis: 1 to 5. Organization sizes (employees): 1 - 100, 101 - 1000, 1001 - 5000, more than 5000. Series: Cloudera, Hortonworks, Amazon, MapR.]

Industry and Vendor Analysis

In 2016 as in 2015, we reached out to the vendor community with questions about their

capabilities and plans for technologies in big data analytics, including its perceived

importance to their strategies. Compared to 2015, industry sentiment appears to be

leveling off (fig. 54). Overall, vendors are still highly positive on big data but are trading

over-the-top enthusiasm for something less than a complete revolution in data

management. We view it as a positive that the proclaimed criticality of a still-emergent

set of technologies has been replaced by an optimistic upside of one that is "very

important" at the same time that user adoption (or awareness of same) has grown notably

year over year (fig. 7, p. 20).

Figure 54 – Industry importance of big data 2015 to 2016

[Chart: Industry Importance of Big Data 2015 to 2016. Axis: 0% to 70%. Categories: critically important, very important, somewhat important, not important. Series: 2015, 2016.]

Among big data infrastructure options in the Hadoop ecosystem, Map/Reduce still has

the highest level of vendor support, which is not surprising given its longevity and

relative maturity (fig. 55). Support for Spark is closing in quickly with the highest

predicted industry support plans for the next 12 months, after which Spark support will

be ubiquitous. After Spark, industry support drops quickly below 50 percent. Future "no

plans" for support range from 30 percent to more than 60 percent.

Figure 55 - Industry support for big data infrastructure

[Chart: Industry Support for Big Data Infrastructure. Axis: 0% to 100%. Series: today, 12 months, 18 months, 24 months, no plans.]

Year over year, industry plans for supporting Map/Reduce, Spark, Yarn, and Tez have

all gathered momentum (fig. 56). Despite some growth in user sentiment (fig. 25, p. 38),

industry support for Oozie declined. We continue to expect that proprietary vendor

support of open source big data projects will be opportunistic and customer driven.

Figure 56 - Industry support for big data infrastructure 2015 to 2016

[Chart: Industry Support for Big Data Infrastructure 2015 to 2016. Axis: 0% to 90%. Series: 2015, 2016.]

Existing industry support for access to big data sources is greatest for Hive/HiveQL (87

percent), followed by HDFS (85 percent) (fig. 57). These top choices are in line with top

user preferences for data access, but Spark support is considerably lower than user

expectations (fig. 30, p. 43). Industry support for Redshift is next highest, somewhat

ahead of user priorities. Google BigQuery currently has much lower industry support but

is the third most cited choice of users.

Figure 57 – Industry support for access to big data sources

[Chart: Industry Support for Access to Big Data Sources. Axis: 0% to 100%. Series: today, 12 months, 18 months, 24 months, no plans.]

Year-over-year industry support for data access increased for all big data sources polled with the exception of Redshift (fig. 58). Though Redshift was a lower priority among users than industry vendors, it gained additional user interest in 2016 (fig. 31, p. 44). The biggest gainer of industry support in 2016, Redshift, gained even more interest among user respondents year over year.

Figure 58 - Industry support for access to big data sources 2015 to 2016

[Chart: Industry Support for Access to Big Data Sources 2015 to 2016. Axis: 0% to 100%. Series: 2015, 2016.]

Industry support for big data search gained momentum in 2016, though it remains distinctly lukewarm (fig. 59). While 30 percent of vendors indicate support for Apache Solr, under 20 percent currently support Elasticsearch or Cloudera Search, and more than 40 percent have no plans for future support. The tepid investment in big data search is in line with current user sentiments, which show little urgency for search (fig. 36, p. 49).

Figure 59 - Industry support for big data search

[Chart: Industry Support for Big Data Search. Axis: 0% to 100%. Categories: Apache Solr, Elasticsearch, Cloudera Search. Series: Today, 12 months, 18 months, 24 months, No plans.]

Year-over-year industry support for big data search varied noticeably by product (fig. 60). While support for category leader Apache Solr grew from 25 percent to 31 percent, Cloudera Search support fell from 26 percent to 14 percent. Support for Elasticsearch was flat year over year. We cannot be certain whether swings in industry support are related to existing penetration or other market factors. User interest in all three big data search products grew year over year, but not with urgency (fig. 37, p. 50).
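The swings quoted above can be read either as percentage-point changes or as relative changes, and the two give different impressions. The Python sketch below uses only the Apache Solr and Cloudera Search figures stated in the text. The function name `yoy_change` is illustrative and is not part of the study's methodology.

```python
def yoy_change(prev_pct, curr_pct):
    """Return (percentage-point change, relative change in percent)."""
    point_change = curr_pct - prev_pct
    relative_change = point_change / prev_pct * 100
    return point_change, relative_change


# Support levels (2015, 2016) as quoted in the paragraph above.
support = {
    "Apache Solr": (25, 31),
    "Cloudera Search": (26, 14),
}

for product, (y2015, y2016) in support.items():
    points, relative = yoy_change(y2015, y2016)
    print(f"{product}: {points:+d} points ({relative:+.1f}% relative)")
```

Read this way, Solr's six-point gain is a relative increase of roughly a quarter, while Cloudera Search's twelve-point drop is a relative decline of nearly half.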

Figure 60 - Industry support for big data search 2015 to 2016

[Chart: Industry Support for Big Data Search 2015 to 2016. Axis: 0% to 35%. Categories: Cloudera Search, Apache Solr, Elasticsearch. Series: 2015, 2016.]

Industry support for big data analytics / machine learning is strongest for Spark MLlib followed by Mahout, though we concede these investments are not urgent and reflect the esoteric uses of machine learning in the current market (fig. 61). While industry support for MLlib is expected to reach a total of 56 percent in the next 12 months, future support for all other machine learning methods is tepid and may never reach 50 percent. (Spark MLlib, RHIPE, and Mahout were top user machine-learning choices but also showed low levels of enthusiasm; see fig. 42, p. 55.)

Figure 61 – Industry support for big data analytics / machine learning

[Chart: Industry Support for Big Data Analytics / Machine Learning. Axis: 0% to 100%. Categories: Spark MLlib, Mahout, RHIPE (R), Oryx, Myrrix. Series: Today, 12 months, 18 months, 24 months, No plans.]

Year-over-year industry support for big data analytics / machine learning was higher for Spark MLlib, slightly lower for Mahout, and significantly lower for other products, particularly RHIPE (fig. 62). Again, support investments remain low and, as with current vendor support shown in fig. 61 above, where investment or interest is developing, it tends to go to Spark MLlib.

Figure 62 - Industry support for big data analytics / machine learning 2015 to 2016

[Chart: Industry Support for Big Data Analytics / Machine Learning 2015 to 2016. Axis: 0% to 35%. Categories: Spark MLlib, Mahout, RHIPE (R), Oryx, Myrrix. Series: 2015, 2016.]

Compared to support for big data search, we see significant existing industry support and future plans for big data (Hadoop) distributions (fig. 63). Current support is strongest for Hortonworks, followed by Cloudera and MapR. Current support for Amazon is under 60 percent, but industry respondents expect to see about 90 percent support for all products within 24 months. These investments are consistent with user sentiment, which is stronger for big data distributions (fig. 48, p. 61) than for search or machine learning.

Figure 63 - Industry support for big data (Hadoop) distributions

[Chart: Industry Support for Big Data (Hadoop) Distributions. Axis: 0% to 100%. Categories: Hortonworks, Cloudera, MapR, Amazon. Series: Today, 12 months, 18 months, 24 months, No plans.]

Industry support for and investment in the Hortonworks and MapR big data distributions grew year over year in 2016, while support for Cloudera and Amazon declined slightly (fig. 64; see also fig. 49, p. 62). Industry support for Hortonworks is currently greater than 80 percent; in contrast, Amazon support is below 60 percent.

Figure 64 - Industry support for big data (Hadoop) distributions 2015 to 2016

[Chart: Industry Support for Big Data (Hadoop) Distributions 2015 to 2016. Axis: 0% to 90%. Categories: Hortonworks, Cloudera, MapR, Amazon. Series: 2015, 2016.]

Big Data Analytics Vendor Ratings

In rating vendors for big data analytics, we examined levels of functionality in five categories: infrastructure, data access, search, machine learning, and supported distributions (fig. 65). Criteria were weighted based on user responses/priorities. Top-rated vendors include Zoomdata (1st), RapidMiner (2nd), Pentaho (3rd), Datameer and Domo (tied for 4th), and Information Builders (5th).
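The study does not publish its weights or scoring formula. As a rough illustration of how category scores might be combined under user-derived weights, the following Python sketch computes a weighted average. Every weight, score, and vendor name in it is a made-up placeholder, and the actual ratings may be produced by a different method.

```python
# Hypothetical category weights (higher = more important to users).
WEIGHTS = {
    "infrastructure": 3,
    "data_access": 4,
    "search": 1,
    "machine_learning": 1,
    "distributions": 3,
}

# Hypothetical per-category functionality scores on a 1-5 scale.
VENDORS = {
    "Vendor A": {"infrastructure": 4, "data_access": 5, "search": 2,
                 "machine_learning": 1, "distributions": 4},
    "Vendor B": {"infrastructure": 5, "data_access": 3, "search": 4,
                 "machine_learning": 4, "distributions": 3},
}


def weighted_score(scores, weights):
    """Weighted average of category scores."""
    total_weight = sum(weights.values())
    return sum(scores[cat] * w for cat, w in weights.items()) / total_weight


def rank_vendors(vendors, weights):
    """Vendor names ordered from highest to lowest weighted score."""
    return sorted(vendors,
                  key=lambda name: weighted_score(vendors[name], weights),
                  reverse=True)


for name in rank_vendors(VENDORS, WEIGHTS):
    print(f"{name}: {weighted_score(VENDORS[name], WEIGHTS):.2f}")
```

With these placeholder numbers, Vendor A ranks first under the weights (3.92 versus 3.67) even though Vendor B has the higher unweighted average (3.8 versus 3.2), which is the kind of reordering that priority weighting is meant to produce.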

Figure 65 – Big data analytics vendor ratings

[Chart: Big Data Analytics Vendor Ratings. Scale: 0.25 to 32, doubling at each tick. Vendors shown: Zoomdata, RapidMiner, Pentaho, Datameer, Domo, Information Builders, SAP, Tableau, Oracle, TIBCO, Jinfonet, Microsoft, Birst, Logi Analytics, Looker, MicroStrategy. Series: Infrastructure, Data Access, Search, Distributions, Machine Learning, Total Score.]

Glossary

Alluxio (formerly Tachyon) is a memory-centric distributed storage system enabling reliable data sharing at memory speed across cluster frameworks.

Source: alluxio.org

Atlas is designed to exchange metadata with other tools and processes within and outside of the Hadoop stack, thereby enabling platform-agnostic governance controls that effectively address compliance requirements.

Source: Apache Software Foundation

BigQuery is a RESTful web service that enables interactive analysis of massively large datasets working in conjunction with Google Storage. It is an Infrastructure as a Service (IaaS) offering that may be used complementarily with MapReduce.

Elasticsearch is a search server based on Lucene. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. Elasticsearch is developed in Java and is released as open source under the terms of the Apache License. Elasticsearch is the second most popular enterprise search engine after Apache Solr.*

HAWQ is a parallel SQL query engine that combines the key technological advantages of the industry-leading Pivotal Analytic Database with the scalability and convenience of Hadoop. HAWQ reads data from and writes data to HDFS natively. HAWQ delivers industry-leading performance and linear scalability. It provides users the tools to confidently and successfully interact with petabyte-range data sets. HAWQ provides users with a complete, standards-compliant SQL interface.

Source: Pivotal

HBase is an open source, non-relational, distributed database modeled after Google's BigTable and is written in Java. It is developed as part of Apache Software Foundation's Apache Hadoop project and runs on top of HDFS (Hadoop Distributed File System), providing BigTable-like capabilities for Hadoop.

The Hadoop distributed file system (HDFS) is a distributed, scalable, and portable file system written in Java for the Hadoop framework.

The Apache Hive™ data warehouse software facilitates querying and managing large datasets residing in distributed storage. Hive provides a mechanism to project structure onto this data and query the data using a SQL-like language called HiveQL. At the same time this language also allows traditional map/reduce programmers to plug in their custom mappers and reducers when it is inconvenient or inefficient to express this logic in HiveQL.

Source: Apache Software Foundation

The Apache Knox Gateway is a REST API Gateway for interacting with Apache Hadoop clusters. The Knox Gateway provides a single access point for all REST interactions with Apache Hadoop clusters.

Source: Apache Software Foundation

Impala is an open source, native analytic database for Apache Hadoop. Impala is shipped by Cloudera, MapR, Oracle, and Amazon.

Source: Cloudera

Mahout is a project of the Apache Software Foundation to produce free implementations of distributed or otherwise scalable machine learning algorithms focused primarily in the areas of collaborative filtering, clustering and classification. Many of the implementations use the Apache Hadoop platform. Mahout also provides Java libraries for common math operations (focused on linear algebra and statistics) and primitive Java collections.

Source: Apache Software Foundation

MapReduce is a programming model and an associated implementation for processing and generating large data sets with a parallel, distributed algorithm on a cluster. Conceptually similar approaches have been very well known since 1995 with the Message Passing Interface standard having reduce and scatter operations.
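To make the programming model concrete, here is a minimal single-process sketch of the map, shuffle, and reduce steps, written in plain Python as a word count. It is not the Hadoop API and does nothing in parallel. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are illustrative assumptions chosen for this example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over the input records."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


print(word_count(["big data analytics", "big data search"]))
# {'big': 2, 'data': 2, 'analytics': 1, 'search': 1}
```

In a real cluster the map and reduce calls run on many nodes and the shuffle moves data across the network; the sketch only shows how the three steps fit together.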

Apache Mesos is an open source cluster manager that was developed at the University of California, Berkeley. It "provides efficient resource isolation and sharing across distributed applications, or frameworks". The software enables resource sharing in a fine-grained manner, improving cluster utilization.

MLlib is Spark’s scalable machine-learning library consisting of common learning algorithms and utilities, including classification, regression, clustering, collaborative filtering, and dimensionality reduction, as well as underlying optimization primitives.

Source: Apache Software Foundation

MongoDB is a cross-platform document-oriented database. Classified as a NoSQL database, MongoDB eschews the traditional table-based relational database structure in favor of JSON-like documents with dynamic schemas (MongoDB calls the format BSON), making the integration of data in certain types of applications easier and faster. Released under a combination of the GNU Affero General Public License and the Apache License, MongoDB is free and open source software.

Myrrix offers a “complete, real-time, scalable clustering and recommender system.” The solution is built on top of the Apache Mahout machine-learning project.

Source: Cloudera

Oozie is a workflow scheduler system to manage Hadoop jobs. It is a server-based Workflow Engine specialized in running workflow jobs with actions that run Hadoop MapReduce and Pig jobs. Oozie is implemented as a Java Web application that runs in a Java servlet container.

Oryx is built on Apache Spark and Apache Kafka, with specialization for real-time large-scale machine learning. It is a framework for building applications but also includes packaged, end-to-end applications for collaborative filtering, classification, regression, and clustering.

Source: Cloudera

RHIPE integrates the R statistical environment with the Hadoop framework. RHIPE allows R users to compute on terabyte-sized data sets across a cluster using the MapReduce framework, thus offering the best of both worlds to users seeking to leverage the strength of R and Hadoop. People with very large data sets stored in the Hadoop Distributed File System can now easily process the data on hundreds or even thousands of nodes in parallel, using only the R language.

Source: Revolution Analytics

Cloudera Search is one of Cloudera's near-real-time access products. Cloudera Search enables non-technical users to search and explore data stored in or ingested into Hadoop and HBase. Users do not need SQL or programming skills to use Cloudera Search because it provides a simple, full-text interface for searching.

Source: Cloudera

Solr is an open source enterprise search platform, written in Java, from the Apache Lucene project. Its major features include full-text search, hit highlighting, faceted search, real-time indexing, dynamic clustering, database integration, NoSQL features and rich document (e.g., Word, PDF) handling. Providing distributed search and index replication, Solr is designed for scalability and fault tolerance. Solr is the most popular enterprise search engine.

Apache Spark is an open source cluster computing framework originally developed in the AMPLab at the University of California, Berkeley, and later donated to the Apache Software Foundation, where it remains today. In contrast to Hadoop's two-stage disk-based MapReduce paradigm, Spark's multi-stage in-memory primitives provide performance up to 100 times faster for certain applications. By allowing user programs to load data into a cluster's memory and query it repeatedly, Spark is well suited to machine-learning algorithms.

Spark SQL is a component on top of Spark Core that introduces a new data abstraction called DataFrames, which provides support for structured and semi-structured data. Spark SQL provides a domain-specific language to manipulate DataFrames in Scala, Java, or Python. It also provides SQL language support with command-line interfaces and an ODBC/JDBC server.

Apache™ Tez is an extensible framework for building high-performance batch and interactive data-processing applications, coordinated by YARN in Apache Hadoop. Tez improves the MapReduce paradigm by dramatically improving its speed while maintaining MapReduce’s ability to scale to petabytes of data. Important Hadoop ecosystem projects like Apache Hive and Apache Pig use Apache Tez, as do a growing number of third-party data-access applications developed for the broader Hadoop ecosystem.

Source: Apache Software Foundation

YARN is one of the key features in the second-generation Hadoop 2 version of the Apache Software Foundation's open source distributed processing framework. Originally described by Apache as a redesigned resource manager, YARN is now characterized as a large-scale, distributed operating system for big data applications.

* All sources Wikipedia unless otherwise noted

Other Dresner Advisory Services Research Reports

- Wisdom of Crowds “Flagship” Business Intelligence Market Study

- Advanced and Predictive Analytics

- Business Intelligence Competency Center

- Cloud Computing and Business Intelligence

- Collective Insights™

- End User Data Preparation

- Enterprise Planning

- Internet of Things and Business Intelligence

- Location Intelligence

- Small and Mid-Sized Enterprise Business Intelligence

- Systems Integrators

Appendix: Big Data Analytics Study Survey Instrument

Please provide your contact information below:

Name*: _________________________________________________

Company Name: _________________________________________________

Address 1: _________________________________________________

Address 2: _________________________________________________

City: _________________________________________________

State: _________________________________________________

Zip: _________________________________________________

Country: _________________________________________________

Email Address*: _________________________________________________

Phone Number: _________________________________________________

Major Geography

( ) Asia/Pacific

( ) Europe, Middle East and Africa

( ) Latin America

( ) North America

What is your current title?

_________________________________________________

What function are you a part of?

( ) Business intelligence competency center

( ) Executive management

( ) Finance

( ) Information Technology (IT)

( ) Manufacturing

( ) Marketing

( ) Project/program management office

( ) Sales

( ) Research and development (R&D)

( ) Other - Write In: _________________________________________________

Please select an industry

( ) Advertising

( ) Aerospace

( ) Agriculture

( ) Apparel and accessories

( ) Automotive

( ) Aviation

( ) Biotechnology

( ) Broadcasting

( ) Business services

( ) Chemical

( ) Construction

( ) Consulting

( ) Consumer products

( ) Defense

( ) Distribution & logistics

( ) Education

( ) Energy

( ) Entertainment and leisure

( ) Executive search

( ) Federal government

( ) Financial services

( ) Food, beverage and tobacco

( ) Healthcare

( ) Hospitality

( ) Gaming

( ) Insurance

( ) Legal

( ) Manufacturing

( ) Mining

( ) Motion picture and video

( ) Not for profit

( ) Pharmaceuticals

( ) Publishing

( ) Real estate

( ) Retail and wholesale

( ) Sports

( ) State and local government

( ) Technology

( ) Telecommunications

( ) Transportation

( ) Utilities

( ) Other - Write In: _________________________________________________

How many employees does your company employ worldwide?

( ) 1 - 100

( ) 101 - 1000

( ) 1001 - 5000

( ) More than 5000

Do you use or intend to use big data technology/architecture within your organization?*

( ) Yes. We use big data today

( ) No. We have no plans to use big data at all

( ) We may use big data in the future

What product(s) does your organization use with big data for BI/analytics?

____________________________________________

How satisfied are you with your vendor and product for big data analytics?

( ) Extremely satisfied

( ) Somewhat satisfied

( ) Somewhat unsatisfied

( ) Unsatisfied

What are your plans for Big Data (Hadoop) Analytics in the Future?

( ) Will adopt in 2016

( ) Will adopt in 2017

( ) Will adopt beyond 2017

What use cases are most important for Big Data (Hadoop) in your organization?

Critical / Very important / Important / Somewhat important / Not important

Data warehouse optimization ( ) ( ) ( ) ( ) ( )

Customer/social analysis ( ) ( ) ( ) ( ) ( )

Internet of things ( ) ( ) ( ) ( ) ( )

Fraud detection ( ) ( ) ( ) ( ) ( )

Clickstream analytics ( ) ( ) ( ) ( ) ( )

Please indicate the importance of the following Big Data infrastructure components

Critical / Very important / Important / Somewhat important / Not important

Alluxio (formerly Tachyon) ( ) ( ) ( ) ( ) ( )

Mesos ( ) ( ) ( ) ( ) ( )

Spark ( ) ( ) ( ) ( ) ( )

Map/Reduce ( ) ( ) ( ) ( ) ( )

Oozie ( ) ( ) ( ) ( ) ( )

Yarn ( ) ( ) ( ) ( ) ( )

Tez ( ) ( ) ( ) ( ) ( )

Atlas ( ) ( ) ( ) ( ) ( )

Knox Gateway ( ) ( ) ( ) ( ) ( )

Please indicate the importance of the following Big Data - data access capabilities

Critical / Very important / Important / Somewhat important / Not important

Google BigQuery ( ) ( ) ( ) ( ) ( )

HBase ( ) ( ) ( ) ( ) ( )

HDFS ( ) ( ) ( ) ( ) ( )

Hive/HiveQL ( ) ( ) ( ) ( ) ( )

Impala ( ) ( ) ( ) ( ) ( )

MongoDB ( ) ( ) ( ) ( ) ( )

Pivotal HAWQ ( ) ( ) ( ) ( ) ( )

Redshift ( ) ( ) ( ) ( ) ( )

Spark SQL ( ) ( ) ( ) ( ) ( )

Please indicate the importance of the following Big Data search capabilities

Critical / Very important / Important / Somewhat important / Not important

Cloudera Search ( ) ( ) ( ) ( ) ( )

Apache Solr ( ) ( ) ( ) ( ) ( )

Elasticsearch ( ) ( ) ( ) ( ) ( )

Please indicate the importance of the following Big Data analytical/machine learning components

Critical / Very important / Important / Somewhat important / Not important

Mahout ( ) ( ) ( ) ( ) ( )

Rhipe (R) ( ) ( ) ( ) ( ) ( )

Oryx ( ) ( ) ( ) ( ) ( )

Myrrix ( ) ( ) ( ) ( ) ( )

Spark MLib ( ) ( ) ( ) ( ) ( )

Please indicate the importance of the following Big Data (Hadoop) distributions

Critical / Very important / Important / Somewhat important / Not important

Cloudera ( ) ( ) ( ) ( ) ( )

Hortonworks ( ) ( ) ( ) ( ) ( )

MAP/R ( ) ( ) ( ) ( ) ( )

Amazon ( ) ( ) ( ) ( ) ( )

