    Big Data Comes of Age

    By Dr. Barry Devlin, Shawn Rogers and John Myers

    An ENTERPRISE MANAGEMENT ASSOCIATES (EMA) and 9sight Consulting Research Report

    November 2012

    IT & DATA MANAGEMENT RESEARCH,

    INDUSTRY ANALYSIS & CONSULTING

    This research has been sponsored by:


    Table of Contents

    1 Executive Summary
    1.1 Key Findings
    2 Big Data Comes of Age
    2.1 Big Data: The Technological Evolution
    2.2 Big Data: The Emergence of Systemic Business Value
    2.3 Big Data: The Holistic View
    2.4 Big Data: Where Next?
    3 Hybrid Data Ecosystem
    3.1 Nodes within the Hybrid Data Ecosystem
    3.2 Shift from a Single Platform to an Ecosystem
    4 Big Data Adoption
    4.1 Overall Implementation
    4.2 Ongoing Programs vs One Time Projects
    4.3 Industry Breakdown
    4.3.1 Big Data Implementation Status by Industry
    4.4 Adoption Curve
    4.5 Use Cases
    4.5.1 First Steps by Industry
    4.6 Implementation Sponsors
    4.6.1 Bumps in the Road
    4.7 Implementation User Base
    4.8 Implementation User Base by Industry
    5 Big Data Requirements: Beyond Buzzwords
    5.1 The Speed of Business
    5.1.1 Taking off the Training Wheels
    5.1.2 Use Cases by Industry
    5.1.3 Implementation Strategy by Industry
    5.2 Inexpensive is not Free
    5.2.1 Overall Annual Information Technology Budget
    5.2.2 Comparison of Enterprise and Mid-size Budgets
    5.3 Refining Data into Information
    5.3.1 Workload Technical Drivers
    5.3.2 Technical Drivers: A Deeper Dive
    5.3.3 Business Drivers by Industry Grouping
    5.3.4 Business Challenge by Industry
    5.4 How Big is Big?
    5.4.1 Overall Environment Sizing
    5.4.2 2012 Big Data Environment Sizing
    5.4.3 2013 Big Data Environment Sizing
    5.5 Another Man's Treasure
    5.5.1 What is Old is New
    5.5.2 What Does Big Data Actually Look Like?
    5.5.3 Target Structure by Industry
    5.5.4 Data Domains: More Than Simply Structure
    6 Methodology and Demographics
    6.1 Research Methodology
    6.2 Authors
    6.2.1 About Enterprise Management Associates
    6.2.2 About 9sight


    Copyright 2012, EMA Inc. and 9sight Consulting. All Rights Reserved.

    1 Executive Summary

    In the Information Technology (IT) industry, 2012 has been the year of Big Data. From a standing start toward the end of the last decade, Big Data has become one of the most talked about topics. There is hardly a vendor who does not have a solution or, at least, a go-to-market strategy. Beyond IT, even the financial and popular press discusses its merits and debates its drawbacks. And yet, the niggling question of exactly how to define Big Data remains. Respondents to this EMA/9sight survey have clearly indicated that their Big Data solutions range far beyond social media and machine-generated data to include a wide variety of traditional structured and transactional business data.

    Still, although the question of definition of "big" may continue to niggle, the answer is becoming increasingly irrelevant. The concept of Big Data has evolved in two key directions. First was the growing understanding that while size is important, the technology implications of data structure and processing speed are at least as important. Second, what really matters for Big Data is what systemic business cases it supports and what real analytic and operational value can be extracted from it.

    Big Data has driven change in our traditional data management strategies and has found a home in an expanding information ecosystem that many companies struggle to manage today. This landscape was once dominated by the Enterprise Data Warehouse (EDW) on the informational side and an array of largely monolithic transaction processing systems on the operational side. This has now given way to an array of data management platforms, including NoSQL platforms like Hadoop. EDWs will continue to play a critical role in this environment, but in support of historical and cross-functional consistency to a more sophisticated data management strategy rather than as a central clearing house for all informational needs. This new data management strategy leverages an array of platforms for the highest performance possible and brings together human-sourced information, process-mediated data and machine-generated data as a complete, comprehensive business information resource. At the core of this change is a movement to align data with operational and analytic workloads, each on the best possible platform. This shift in strategy is driven by four significant changes in the data management landscape:

    Maturing user community

    New technology

    Economic value

    Valuable data

    This new home for Big Data is a multiple node ecosystem of data management platforms. In this ecosystem, each node or platform has an equally critical role in supporting the sophisticated workloads that today's Big Data requirements demand.

    In many cases, it is the combination of the multiple platforms that enables success in addressing the following requirements:

    Response

    Economics

    Workload

    Load

    Structure


    Each of the nodes involved in this environment delivers a specialized value proposition by addressing the drivers mentioned above and applying appropriate feature sets to meet Big Data requirements.

    1.1 Key Findings

    The EMA/9sight Big Data online survey was comprised of 255 Business Intelligence (BI) and data management professionals who qualified to participate in this research. The survey instrument was designed to identify the key trends surrounding the adoption, expectations and challenges connected to Big Data.

    The research identifies trends surrounding Big Data technology, its use, adoption and how it impacts analytics. Below are highlights and key findings from the research:

    Big Data strategies are on the move: Respondents who are already working on Big Data projects are doing so at ambitious rates. Over 36% of our respondents are "In Operation" with a Big Data oriented project. 35% are following close behind in "Serious Planning" mode.

    Enterprise is leading the way: Enterprise-sized companies are the early adopters of Big Data driven analytics. Nearly 40% of Enterprise-sized organizations in the EMA/9sight survey have indicated that they have implemented Big Data solutions on some scale, as either a production environment or a pilot system.

    It is not easy: Most major industries are embracing Big Data technology at some level. These projects are driven or sponsored by an array of stakeholders within the organization. Big Data sponsors vary by industry. The Finance department is the biggest proponent of Big Data projects in Healthcare, representing over 16% of the responses, while Big Data in Leisure industries is primarily driven by CEO- and Executive-level management (21%).

    Different industries, different approaches: By industry there are significant differences in how Big Data is used to drive value. The Public Service industry leads all others with 31% of respondents identifying online archiving as their primary use case. 31% of Media & PR respondents are primarily focused on staging structured data. 22% of financial services respondents are investing in combining data by structure.

    It is not the size of the data: Big Data isn't as big as the market buzz indicates. Less than 10% of our respondents are managing 750 terabytes or more within their overall system. The most common enterprise size data environments are 50-100TB. Of that data, most companies have 10-30TB in their Big Data environments, indicating that Big Data analytics can be served on a variety of platforms, not just Hadoop.

    Data diversity: The data that feeds Big Data systems comes from a diverse set of sources. Our respondents identified structured operational data, human-generated documents and deep operational transaction data as the three most popular for Big Data projects.


    2 Big Data Comes of Age

    In the IT industry, 2012 has been the year of Big Data. From a standing start toward the end of the last decade, Big Data has become one of the most talked about topics in IT. Every analyst has a position. There is hardly a vendor who doesn't have a solution or, at least, a go-to-market strategy. Beyond IT, even the financial and popular press discusses its merits and debates its drawbacks. And yet, the niggling question of exactly how to define Big Data remains. By late 2012, it is clear that Big Data is rapidly becoming all the digital information that is and has ever been collected, generated and processed. Respondents to the EMA/9sight survey have clearly indicated that their Big Data solutions range far beyond social media and machine-generated data to include all types of traditional transactional business data.

    Perhaps this is too simplistic. And yet, irrespective of its starting point, every discussion of Big Data resolutely includes every byte of digital information passing through the world's networks or stored in Public Clouds, enterprise disk farms or even smartphones in teenagers' hip pockets. In April 2012, Information Management reported¹:

    "We create 2.5 quintillion [10^18] bytes of data every day, with 90% of the data in the world created in the last two years alone... Every hour, Wal-Mart handles 1 million transactions, feeding a database of 2.5 petabytes [10^15 bytes], which is almost 170 times the data in the Library of Congress. The entire collection of the junk delivered by the U.S. Postal Service in one year is equal to 5 petabytes, while Google processes that amount of data in just one hour. The total amount of information in existence is estimated at a little over a zettabyte [10^21 bytes]."

    Still, although the question of definition of "big" may continue to niggle, the answer is becoming increasingly irrelevant. The concept of Big Data has evolved in two key directions. First was the growing understanding that while size is important, the technology implications of data structure and processing speed are at least as important. Second, what really matters for Big Data is what systemic business cases it supports and what real analytic and operational value can be extracted from it.

    2.1 Big Data: The Technological Evolution

    The phrase "Big Data" emerged first in the late 1990s among scientists who couldn't afford to store or analyze the huge and mounting quantities of data produced by the increasingly sophisticated digital technology then emerging and used from particle physics, through genomics and climatology, all the way to astrophysics. This growth trend continues today.

    By the early to mid-2000s, Big Data had become an open playing field for researchers at companies like Google, Yahoo!, Amazon and Netflix using the growing volumes of Web-sourced data they held. Not only was the quantity enormous, but the data arrived so fast that the speed of capturing and processing it was a major technical challenge. In addition, the data arrived in a multiplicity of structures and, perhaps more importantly, with unanticipated and changing processing needs far eclipsing the abilities of traditional data management solutions. In parallel, the growth of RFID devices and readers, as well as the introduction of the first smartphones, drove even stronger requirements to process the incoming information at ever increasing speed. These trends led to the development by Google of the MapReduce framework in 2004.

    1 Bettino, Larry A., "Transforming Big Data Challenges Into Opportunities," Information Management, April 18, 2012, http://bit.ly/PWKQtq

    In 2008, Hadoop (a system for parallel processing of large files in batch using the MapReduce framework and a file system to act as a data store) was designated a top-level, Apache open source project. It became almost instantly synonymous with Big Data, although the evolution of Big Data since then, from both business and technological viewpoints, demonstrates clearly that the scope and importance of Big Data is far broader. A plethora of related open source projects have sprung up around Hadoop to provide everything from systems management to query function, with creative names such as Hive, Pig Latin, Sqoop, Zookeeper and many more.
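
    As a rough illustration of the MapReduce style of batch processing described above, the following minimal sketch counts words across a set of text records inside a single process. It is not Hadoop code and uses no Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample records are assumptions made purely for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(record: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, the step that sits between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Collapse all values collected for one key into a single total."""
    return key, sum(values)


def word_count(records: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence over the input records."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, standing in for lines of a large file.
    sample = ["big data comes of age", "big data is all data"]
    print(word_count(sample))
```

    In a real cluster the map and reduce calls would run in parallel on many machines, with the grouping step moving data between them; here everything runs sequentially so that the division of work is easy to follow.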

    Despite the popularity of the file-based Hadoop approach, it was also clear that database function, albeit different in some respects from relational, is required to manage certain types of Big Data, especially those where variety of structure and variability of processing were important. With the rapid growth of social networks, such as LinkedIn, Facebook and Twitter, as well as the big Net denizens like Google and Amazon in the late 2000s, non-relational database and processing approaches, often labeled NoSQL, came to the fore. Google's BigTable in 2006 and Amazon's development of Dynamo in 2007 led the way. Open source products include Amazon SimpleDB, Cassandra, MongoDB and Terrastore, to name but a few.

    By 2010, the popular press was on fire. Even The Economist had a special report² on Big Data in February of that year. Marketers at hardware and software vendors began relabeling every product and solution as Big Data, including relational and other traditional processing approaches in the mix. The approach might be considered disingenuous, but in reality, it simply serves to emphasize the point made earlier: that Big Data is all the digital information that is and has ever been collected and generated, including the traditional transactional, master and informational data IT have collected, generated and managed since time immemorial.

    It has been said for some years that this traditional data accounts for less than 10% of the digital information managed by business. An in-depth analysis of IDC's annual Expanding Digital Universe study³ suggests that the percentage may even be as little as 1% overall, although the relative proportions likely vary widely across bricks-and-mortar and Web-centric companies. The results of this survey strongly contradict these percentages. The role of traditional relational technology is diminishing in terms of projected IT budget spend according to the respondents, but it still falls in the 15-25% range and exceeds spending on NoSQL software by quite some margin in most cases. Furthermore, relational technology has also been undergoing significant evolution to handle larger volumes of data and higher processing speeds. Massively Parallel Processing (MPP), columnar and in-memory databases have enabled relational technology to support ever larger loads and higher speeds.

    This brief history of how technology platforms address Big Data's broad characteristics, often expressed in the use of "v-words" such as volume, velocity, etc., shows a clear trend. The focus is moving from large quantities of data in specific contexts to an all-encompassing view of a universal digital information environment capable of capturing and recording every aspect of physical reality and every event that occurs within it. And if Big Data is actually all data, then it clearly must span the full spectrum of all possible structural, processing, governance and usage characteristics. The growing number of v-words used broadly by analysts and vendors around Big Data (volume, variety, velocity, veracity and validity, to mention but a few) are an attempt to describe this broad spectrum. Their drawback is that they are qualitative in nature and limited only by the imagination of their promoters. This document describes a simpler and more holistic model of Big Data in the section "Big Data: The Holistic View," below.

    2 "Data, data everywhere: A special report on managing information," The Economist, February 2010

    3 "Expanding Digital Universe," International Data Corporation (IDC), 2007-2011, http://bit.ly/IDC_Digital_Universe

    2.2 Big Data: The Emergence of Systemic Business Value

    Non-traditional Big Data, often called "multi-structured," such as loosely structured social media data and content, as well as the deluge of machine-generated data like geolocation⁴ and data usage information coming from smartphones and networked devices (from packaged goods with embedded sensors to automobiles and domestic appliances), offers game-changing opportunities in operational process optimization and reinvention, as well as in business analytics and intelligence. These opportunities can be summarized across five broad types of business applications that are directly enabled by Big Data:

    1. Revenue generation and business model development, particularly in retail and consumer packaged goods, where there is direct or indirect interaction with large consumer markets, moves to a new level. Marketing uses social media information, both content and relationship, to move from sampling to full dataset analysis, from demographic segments to markets-of-one, and from longer-term trending of historical data to near real-time reaction to emerging events. Prediction of customer behaviors and outcomes of proposed actions allow new business models to be created and tested, ultimately driving increased revenue.

    2. Cost containment in real-time becomes viable as electronic event monitoring from automobiles to smartphones, fraud detection in financial transaction data and more expands to include larger volumes of often smaller size or value messages on ever-shorter timescales. Big Data analysis techniques on streaming data, before or without storing it on disk, have become the norm, enabling faster reaction to specific problems before they escalate into major situations.

    3. Real-time forecasting becomes possible as utilities, such as water and electricity supply and telecommunications, move from measuring consumption on a macro- to a micro-scale using pervasive sensor technology and Big Data processes to handle it. Value arises as consumption peaks and troughs can be predicted and, in some cases, smoothed by influencing consumer behavior.

    4. Tracking of physical items by manufacturers, producers and distributors (everything from food items to household appliances and from parcel post to container shipping) through distribution, use and even disposal drives deep optimization of operational processes and enables improved customer experiences. People, as physical entities, are also subject to tracking for business reasons or for surveillance.

    5. Reinventing business processes through innovative use of sensor-generated data offers the possibility of reconstructing entire industries. Automobile insurance, for example, can set premiums based on actual behavior rather than statistically averaged risk. The availability of individual genomic data and electronic medical records presents the medical and health insurance industries with significant opportunities, not to mention ethical dilemmas.
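
    The streaming analysis mentioned in item 2 above, reacting to events as they arrive rather than after storing them, can be sketched roughly as follows. This is a minimal, single-process illustration and not a description of any product covered by the survey; the window size, the threshold factor and the names flag_outliers and amounts are all assumptions.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple


def flag_outliers(
    amounts: Iterable[float], window: int = 5, factor: float = 3.0
) -> Iterator[Tuple[int, float]]:
    """Yield (index, amount) for values far above the recent rolling mean.

    Only the last `window` values are held in memory; the stream itself
    is never stored.
    """
    recent: Deque[float] = deque(maxlen=window)
    for index, amount in enumerate(amounts):
        # Start comparing only once a full window of history is available.
        if len(recent) == window:
            mean = sum(recent) / window
            if amount > factor * mean:
                yield index, amount
        # Appending to a full deque silently drops the oldest value.
        recent.append(amount)
```

    Because the function is a generator over any iterable, it can be fed from a live source one event at a time, and each suspicious value is reported as soon as it is seen.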

    The survey responses to the question of business challenge addressed by Big Data implementations weigh heavily in favor of Operational Analytics, which describes much of the type of processing required to drive these goals at ever-increasing levels of timeliness.

    4 "Geolocation is the identification of the real-world geographic location of an object, such as mobile phone or an Internet-connected computer terminal. Geolocation may refer to the practice of assessing the location, or to the actual assessed location." Wikipedia.com, http://en.wikipedia.org/wiki/Geolocation

    While the impact of Big Data on IT is indisputable, traditional operational (or Online Transaction Processing, OLTP) and informational (or Business Intelligence, BI) data is not going away. Such traditional data remains at the heart of running and managing business on a day-to-day basis. It is also central to meaningful, contextually-relevant use of non-traditional data. The value of social media data, for example, is substantially higher when it can be connected to the master data and transactions of real, identifiable customers. Business analytics performed in a new Big Data environment with tens of thousands of commodity servers is worth performing only if it is made directly and clearly actionable in the daily operations of the business.

    In this context, the organizational impact of Big Data cannot be overlooked. A new role of data scientist has been heavily promoted, combining business and IT skills in analytic, statistical, visualization and data manipulation from gathering and cleansing to mining and even coding. Some describe it as a "data analyst on steroids." The survey results suggest that this role is still emerging in most industries, trailing behind pure business and IT roles among users of Big Data. More importantly, the organization as a whole must be motivated and aligned to a process that turns the insights gleaned from Big Data into new behaviors and activities integrated with existing processes or creating new processes with real business effect. As seen in the survey responses, stakeholder and strategy issues are seen as by far the most important hurdles to successful implementation.

    Until this year, most analysts and vendors have been focused on the novelty of Big Data: the different types of data involved, the new tools and technologies required to manage the loads, speeds and flexibility needed and the skills required to build and use such systems. In 2012, as the tools and technologies have moved closer to the mainstream, the focus has shifted toward a more integrated view of all the digital information in the environment and the need for a more inclusive approach to support business-driven implementation of Big Data projects.

    2.3 Big Data: The Holistic View

    To achieve the holistic view of Big Data, one must step back from technology issues and see how all the information and processes used by a business interrelate. This leads to a new vision of the information landscape, with three distinct, deeply interrelated domains:

    1. Human-sourced information⁵: All information ultimately originates from people. This information is the highly subjective record of human experiences, previously recorded in books and works of art, and later in photographs, audio and video. Human-sourced information is now almost entirely digitized and electronically stored everywhere from tweets to movies. Loosely structured and often ungoverned, this information may not reliably represent for business what has happened in the real world. Structuring and standardization (for example, modeling) defines a common version of the truth that allows the business to convert human-sourced information to more reliable process-mediated data. This starts with data entry and validation in operational systems and continues with the cleansing and reconciliation processes as data moves to BI.

    2. Process-mediated data: Business processes are at the heart of running every business and organization. Whether formally defined and managed or not, these processes record and monitor business events of interest, such as registering a customer, manufacturing a product, taking an order, etc. The process-mediated data thus collected is highly structured and includes transactions, reference tables and relationships, as well as the metadata that sets its context. Process-mediated data has long been the vast majority of what IT managed and processed, in both operational and BI systems. Its highly structured and regulated form is well suited to promoting information management and data quality, and for storage and manipulation in relational database systems.

    3. Machine-generated data: Over the past decade, there has been phenomenal growth in the number of sensors and machines employed to measure and record the events and situations in the physical world. The output of these sensors and machines is machine-generated data, and from simple sensor records to complex computer logs, it is well structured and considered to be highly reliable. As sensors proliferate and data volumes grow, it is becoming an increasingly important component of the information stored and processed by many businesses. Its well-structured nature is amenable to computer processing, but its size and speed is often beyond traditional approaches, such as the enterprise data warehouse, for handling process-mediated data; standalone high-performance relational and NoSQL databases are regularly used.

    5 In the context of these three domains, data is used to signify well structured and/or modeled and information as more loosely structured and human-centric.

    Figure 1

    The above figure shows the relationship between these three domains and the functionality that surrounds and transforms them. Human-sourced information and machine-generated data are the ultimate sources of the process-mediated data, which has long been the focus of all IT effort. These sources are more flexible and timelier than traditional process-mediated data. In most cases, only a small, well-defined subset moves through the traditional business process layer that creates process-mediated data. The goal is to ensure the quality and consistency of the resulting data, but the side effect is a reduction in flexibility and timeliness.


    The relative sizes and perceived importance of these three domains has shifted over the past decade and is likely to shift further in the coming one. Process-mediated data was by far the dominant, and almost exclusive, domain since the introduction of business computing in the 1960s. Digitized human-sourced information and machine-generated data was relatively small in volume and considered unimportant in comparison to the well-managed data in operational and informational systems. There has been an explosion of both human-sourced information and machine-generated data in the last decade. The former, as social media data, has captured the most attention for now. In the coming years, the rapid growth of the Internet of Things will likely promote machine-generated data to the highest levels of volume and importance. As of 2012, responses to the EMA/9sight survey show that human-sourced information accounts for nearly half of the sources of Big Data, with process-mediated data still outstripping the machine-generated variety by a reasonably small margin.

    In these circumstances, copying and transforming human-sourced information and machine-generated data to the process-mediated domain in the traditional manner is increasingly impractical. Therefore, advanced technology (extensions to existing techniques in many cases, often labeled business analytics) is required to process and explore both human-sourced information and machine-generated data as close to their sources and as quickly as possible. Of equal importance, process-mediated data and associated metadata must be copied into the business analytics environment to create meaning, context and coherence in the analytics process. Big Data and business analytics thus complete a long-missing, closed-loop information process.

    2.4 Big Data: Where Next?

    In the past year, marketing activity around Big Data, in the context of the earlier definitions of the concept, has reached fever pitch. However, taking the more holistic view above, it can be seen that the various technology components (storage, processing and analytic) as well as business drivers are simply undergoing a natural, albeit rapid and concerted, evolutionary step to a new level of integration and value delivery.

    Across a wide range of industries, business value is emerging as Big Data, in the all-inclusive sense, enables earlier, closer-to-source analysis of emerging trends, better prediction of future events, and faster reaction to new opportunities and immediate threats. Business processes can be joined across the old divide between decision-making and action-taking. And new business processes are being developed based on Big Data sources that were previously impractical and, in many cases, inconceivable.

    The technical implications of these developments are significant and wide-ranging:

    Big Data processing must be fully inclusive of traditional, process-mediated data and metadata for the context and consistency needed for extensive, meaningful use

    Feeding the results of Big Data processing back into traditional business processes will enable and drive change and evolution of the business

    A fully coherent environment (including an integrated, distributed platform based on diverse technologies and enterprise-scale organization) is needed for successful Big Data implementation

    The challenge for business and IT is to move from process-mediated data as the sole source of business information and embrace the speed and variety of knowledge that the new human-sourced and machine-generated domains offer about the real world.

    As the survey results disclose, these moves are already underway.

    3 Hybrid Data Ecosystem

    In line with the transformations discussed above, Big Data has found a home in an expanding information ecosystem that many companies struggle to manage today. This landscape was once dominated by the enterprise data warehouse on the informational side and an array of largely monolithic transaction processing systems on the operational side. This has now given way to an array of data management platforms, including NoSQL platforms like Hadoop. EDWs will continue to play a critical role in this environment, but in a supporting role, providing historical and cross-functional consistency to a more sophisticated data management strategy, rather than as a central clearing house for all informational needs. This new data management strategy leverages an array of platforms for the highest performance possible and brings together human-sourced information, process-mediated data and machine-generated data as a complete, comprehensive business information resource. At the core of this change is a movement to align data with operational and analytic workloads, each on the best possible platform. This shift in strategy is driven by four significant changes in the data management landscape:

    1. Maturing user community: Reporting and business intelligence have served users well over the past decade, but as with any community, the consumers are evolving and applying additional stress to traditional data-centric systems, causing innovative companies to search out better-performing platforms. This shift towards the best possible platform, coupled with the widespread growth in adoption of business intelligence and analytics, has opened the door for change. As end users have become more comfortable with these tools, a movement towards Self-Serve analytics has emerged, empowering users to be less reliant on IT and more sophisticated in their approach to analytics. Big Data has arrived at an optimal time to meet some of these new demands.

    2. New technology: Moore's Law is alive and well in the world of Analytics and Big Data. Computing power, memory, commodity hardware and new technologies such as Hadoop and other NoSQL solutions are all enabling execution of analytics once thought to be impossible or too costly. This tidal wave of technical innovation has presented new opportunities for analytic professionals, creating a space for Big Data to thrive.

    3. Economic value: This driver of innovation and Big Data adoption is a hurdle that had to be removed before companies could embrace Big Data. In the past, many companies had attempted projects that resemble today's Big Data workloads, but they proved too time consuming and/or extremely expensive. Most Very Large Database (VLDB) projects were previously done outside of normal business channels, and delivered by educational or research-driven entities that benefited from government grants. Today's less expensive technologies open the door for a democratization of innovation, allowing most companies to leverage Big Data and its opportunities at reasonably affordable rates.

    4. Valuable data: Technology and economic factors have previously hindered leveraging opportunities like Big Data. Now the once-ignored critical data can be included, along with multi-structured enterprise information, in appropriate business insights. Big Data solutions allow the utilization of traditional enterprise data (email, call center data, voice transcripts, etc.) along with human-sourced and machine-generated data that once was difficult to process on traditional platforms.

    Big Data has found a home in the Hybrid Data Ecosystem alongside other platforms that play an equally critical role in delivering the sophisticated analytics that today's users demand. In many cases, it is a combination of the multiple platforms that enables success.

    Figure 2

    Each of the platforms or nodes depicted around the edge of the above figure delivers a specialized value proposition to the enterprise by addressing the drivers mentioned above and applying appropriate feature sets to meet the following requirements:

    Response: New technology platforms such as Big Data tools and frameworks are at the core of this evolution, powering new solutions and improved speed of query results.

    Economics: Big Data platforms leverage commodity hardware, and the software is often free, substantially reducing the financial barriers to adoption.

    Workload: Big Data platforms play a role within the ecosystem to execute extremely complex analytic workloads, and innovative companies are willing to invest early in these solutions to gain competitive advantage.

    Load: Data loads are growing more complex and the sources are more diverse. Big Data adoption is driven by this greater complexity and demand, and by the need to provide flexibility.

    Structure: Data schema flexibility is key to the foundation of Big Data utilization and adoption. Big Data frameworks provide a level of flexibility not present in many traditional data platforms.

    All five of these requirements play key roles in the adoption of new platforms. Big Data adoption, specifically, has been driven forward along these themes and is now experiencing adoption at the early majority level. As more and more companies introduce this technology into their data management ecosystems, they will be faced with opportunities to innovate as well as challenges to success.

    3.1 Nodes within the Hybrid Data Ecosystem

    The requirements of the Hybrid Data Ecosystem are served by the data management nodes/platforms represented by the following categories:

    Operational systems: Business support systems such as website order entry applications, Point Of Sale (POS), Customer Relationship Management (CRM) or Supply Chain Management (SCM) applications. These platforms contain increasingly fine-grained information on transactions and demographics and firmographics (see footnote 6).

    Enterprise data warehouse: Centralized analytical environments where corporate-level, reconciled and historical information of an organization is stored. These platforms have structured data organizations (schemas) based on time rather than present information.

    Data mart: Often distributed analytical environments where a particular subject area or department-level data set is stored for historical or other analysis. These platforms often have similar data organization to the enterprise data warehouse, but serve smaller user groups.

    Analytical platform: Specifically architected and configured environments for providing rapid response times for analytical queries. These platforms are generally developed to support high-end analysis via tuned data structures like columnar data storage or indexing.

    Discovery platform: Data discovery platforms support both standard SQL and programmatic API interfaces for iterative and exploratory analytics.

    NoSQL: NoSQL data stores use non-traditional organizational structures such as key-value, wide-column, graph or document storage structures. These data stores support programming APIs and limited SQL variants for data access.

    Hadoop: A specific variant of the NoSQL platform based on the Apache Hadoop Open Source project and its associated sub-projects. These platforms are based on the Hadoop Distributed File System (HDFS) storage and MapReduce processing framework.

    Cloud: Cloud data sources make information available via standardized interfaces (APIs) and bulk data transfers. Examples are Dun & Bradstreet D&B360, NOAA National Weather Service (NWS) API and social data aggregators.

    6 "Firmographics are the characteristics of an organization especially when used to segment markets in market research. What demographics are to people, firmographics are to organizations." Wikipedia.com, http://en.wikipedia.org/wiki/Firmographics

    3.2 Shift from a Single Platform to an Ecosystem

    When asked how many nodes were part of their Big Data initiatives, the EMA/9sight survey respondents indicated that a wide range of Hybrid Data Ecosystem nodes were part of their plans.

    Hybrid Data Ecosystem Nodes in Use

    One Node: 27%
    Two Nodes: 26%
    Three Nodes: 27%
    Four Nodes: 8%
    Five Nodes: 6%
    Six Nodes: 2%
    Seven Nodes: 2%
    Eight Nodes: 1%

    Figure 3

    The most common answer among the 255 respondents was a total of Three Hybrid Data Ecosystem nodes as part of the respondents' Big Data Initiatives, showing that Big Data strategies are not limited to a single platform or solution. When the Two to Five Hybrid Data Ecosystem nodes indications are aggregated, over two thirds of respondents are included in this segment. This shows Big Data Initiatives are focused on more than just a single platform (e.g. Hadoop) augmentation of the core of operational platforms or the enterprise data warehouse. Rather, Big Data requirements are solved by a range of platforms including analytical databases, discovery platforms and NoSQL solutions beyond Hadoop.
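
    The aggregation described above can be checked with a few lines of Python. This is a minimal sketch, assuming the eight data labels in Figure 3 map in order to One through Eight Nodes; the names nodes_in_use and share_between are illustrative and not part of the survey.

```python
# Share of respondents (rounded percent) by number of Hybrid Data Ecosystem
# nodes in use. ASSUMPTION: the Figure 3 data labels map in order to
# One Node .. Eight Nodes.
nodes_in_use = {1: 27, 2: 26, 3: 27, 4: 8, 5: 6, 6: 2, 7: 2, 8: 1}


def share_between(shares, low, high):
    """Sum the percentage shares for node counts in the range [low, high]."""
    return sum(pct for count, pct in shares.items() if low <= count <= high)


two_to_five = share_between(nodes_in_use, 2, 5)  # the "Two to Five" segment
multi_node = share_between(nodes_in_use, 2, 8)   # any multi-node initiative

print(f"Two to Five nodes: {two_to_five}%")
print(f"More than one node: {multi_node}%")
```

    Because the published percentages are rounded, the eight shares need not sum to exactly 100, so the result is best read as an approximation of the "over two thirds" statement rather than an exact figure.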

    4 Big Data Adoption

    4.1 Overall Implementation

    Due to the hype surrounding Big Data, it seemed critical to investigate the adoption of the technology as early as possible in this report. Having a clear understanding of where the overall implementation processes are would shed further light on how and why companies are moving toward Big Data to solve their analytic challenges.

    The following groupings are used to define an organization's status in relation to its Big Data initiatives:

    In Operation: This represents actual implementations of Big Data projects, including "Already having a project in production" and "Currently working to implement a pilot project". These respondents have hands-on experience with both Big Data business requirements and technologies that solve those requirements.

    Serious Planning: This represents near-term to immediate Big Data projects. These include survey respondents who indicated planning for implementation within one to six months. This group represents organizations that are close to or on the verge of signing contracts for hardware and software licenses associated with their Big Data implementation.

    Investigating: This grouping represents those organizations still looking at Big Data requirements and Big Data technologies. These respondents are seven or more months out from implementing a Big Data solution.

    At what stage of implementation are your company's/department's Big Data project(s)?

    In Operation: 36%
    Serious Planning: 35%
    Investigating: 28%

    Figure 4

    The EMA/9sight Big Data survey determined that the physical implementation of projects was more widespread among organizations considering Big Data implementations than originally thought. Some 36% of the respondents already have one or more projects In Operation. Some previous studies on this topic have indicated that Big Data projects are still in an earlier stage. EMA/9sight suspects this is a reflection of these surveys being too Hadoop-centric in their focus. With over a third of projects In Operation and another third in Serious Planning, Big Data is clearly moving faster than many believed.

    4.2 Ongoing Programs vs One Time Projects

    When EMA/9sight looks at the distribution of Mid-sized organizations (less than 500 employees), Large (500-5000 employees) and Enterprise-sized (over 5000 employees) organizations, EMA/9sight sees concentrations of hands-on project or implementation experience with Big Data technologies and requirements.

    "At what stage of implementation are your company's/department's Big Data project(s)?" by "How many employees are in your company worldwide?"

    Company size: In Operation / Serious Planning / Investigating
    Mid-sized (less than 500): 36% / 34% / 31%
    Large (500-5000): 34% / 44% / 22%
    Enterprise (5000+): 39% / 27% / 34%

    Figure 5

    Nearly 40% of Enterprise-sized organizations in the EMA/9sight survey have indicated that they have implemented Big Data solutions on some scale, as either a production environment or a pilot project (In Operation). Both levels of implementation provide hands-on experience with Big Data technologies and implementation methodologies. Over 35% of Mid-sized organizations have indicated similar implementation experience. Close behind, Large companies have indicated 34% implementation experience of some sort.

    After physical implementation of Big Data solutions, the next set of implementations appears to be dominated by Large organizations. Nearly 44% of Large organizations have indicated they plan to be implementing a Big Data solution in the next six months (Serious Planning). This shows that, despite their initial lag behind Enterprise-sized organizations for implementation, Large companies represent the next wave of serious Big Data implementations between now and early 2013.

    The two leaders in the In Operation phase, Mid-sized organizations and Enterprises, appear to be planning their next waves of projects with significant strategies and research in the second half of 2013 (Investigating). These two company size segments, after their lead In Operation, have indicated almost 50% greater plans, as compared to their Large company counterparts, for their next phase of Big Data initiatives. These long-range plans indicate that Big Data initiatives for Mid-sized organizations and Enterprises are on a track for multi-year Big Data programs as opposed to point-in-time Big Data projects.
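
    The "almost 50% greater" comparison can be reproduced as a relative difference against the Large segment. The following is a rough sketch, assuming the Investigating shares read from the Figure 5 data labels (31%, 22% and 34%); the helper relative_excess is a hypothetical name introduced only for illustration.

```python
# "Investigating" shares (rounded percent) by company size.
# ASSUMPTION: values are read from the Figure 5 data labels.
investigating = {"Mid-sized": 31, "Large": 22, "Enterprise": 34}


def relative_excess(value, baseline):
    """Return how much larger value is than baseline, in percent."""
    return (value - baseline) / baseline * 100


baseline = investigating["Large"]
mid_excess = relative_excess(investigating["Mid-sized"], baseline)
ent_excess = relative_excess(investigating["Enterprise"], baseline)

# Average of the two leading segments compared with the Large segment.
leaders_avg = (investigating["Mid-sized"] + investigating["Enterprise"]) / 2
avg_excess = relative_excess(leaders_avg, baseline)

print(f"Mid-sized vs Large:  {mid_excess:.0f}% greater")
print(f"Enterprise vs Large: {ent_excess:.0f}% greater")
print(f"Leaders' average vs Large: {avg_excess:.0f}% greater")
```

    Under these assumptions the two segments sit on either side of the 50% mark, and their average lands just below it, which is consistent with the wording in the text.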

    4.3 Industry Breakdown

    After looking at Big Data initiative status by company size, EMA/9sight examined the breakdown by industry. Industry groupings include the following individual industry designations:

    Finance: Finance, Banking, and Insurance

    Public Services: Government, Education, Non-Profit/Not for Profit, and Legal

    Manufacturing: All non-Computer or Networking related Manufacturing industries

    Industrial: Aerospace and Defense manufacturing, Oil and Gas production and refining, Chemical manufacturing, and Transportation and Logistics organizations like Airlines, Trucking and Rail

    Leisure: Hospitality, Gaming and Entertainment, as well as Recreation and Travel

    Media & PR: Marketing, Advertising, Public Relations and Market Research, and Publishing and Broadcasting

    Utilities Infrastructure: Telecommunications Service Providers, Application, Internet and Managed-Network Service Providers, and Energy production and distribution Utilities

    Retail: End Consumer Retail and Wholesale and Distribution

    Healthcare: Medical device and supply and Pharmaceutical production

    It should be noted that the Leisure industry segment has a relatively low number among the survey respondents. However, the Leisure segment is included in our evaluations as the industry is often considered to be one of the leading influencers and innovators related to data management and analytical practices. The Leisure industry segment includes elements of the overall Gaming industry. Examples of these companies, not necessarily included in the EMA/9sight survey results, are Caesars Entertainment Corporation, MGM Resorts International and Boyd Gaming Corporation (see footnote 7).

    7 Gaming Companies Key Players: http://finance.yahoo.com/q/co?s=CZR+Competitors

    4.3.1 Big Data Implementation Status by Industry

    When the EMA/9sight Big Data survey respondents are viewed by industry along with their Big Data Implementation Stage, the following breakdown is observed:

    "At what stage of implementation are your company's/department's Big Data project(s)?" by "Which of the following best describes your company's primary industry segment?"

    Industry: In Operation / Serious Planning / Investigating
    Public Services: 24% / 42% / 34%
    Leisure: 27% / 55% / 18%
    Utilities Infrastructure: 32% / 43% / 25%
    Healthcare: 33% / 39% / 27%
    Manufacturing: 34% / 41% / 24%
    Finance: 38% / 27% / 35%
    Industrial: 38% / 38% / 23%
    Retail: 50% / 32% / 18%
    Media & PR: 50% / 44% / 6%

    Figure 6

    Retail and Media & PR are the most advanced, with 50% of implementations In Operation. All industries, with the exception of Retail and Finance, are at 38% or higher for the Serious Planning phase. This indicates that the strong promotion of Big Data in the marketplace in 2012 has driven substantial uptake across industries. The high percentage of responses for Finance in the Investigating phase prompted deeper exploration. The survey source data showed that one third of the respondents from the Finance industry segment that have completed the first implementation are already planning for a second wave of implementations. Nevertheless, most of the mentions in the Investigating phase are from respondents who have not previously implemented a Big Data project.

    It is not surprising to see Public Services indicating the lowest rate of projects In Operation. Together with Utilities Infrastructure and Healthcare, these industry segments are often constrained in both technical and financial resources, in particular, those technical and financial resources needed for projects like Big Data implementations. Despite this, they appear to represent the next set of industries to implement Big Data between now and early 2013 with their Serious Planning indications.

    4.4 Adoption Curve

    Most of the hype about Big Data relates to the Innovators in the Big Data adoption curve. Big Data implementers like Facebook, Yahoo, and Google for Media & PR, and Wal-Mart for Retail have been in the news for many years now and have pushed and continue to push the envelope of what is possible. However, who are the ensuing groups of implementers?

    Using the information above and a standard adoption curve, you can begin to see the organizations that represent Early Adopters, and the Early and Late Majority implementers.

    Big Data Adoption Curve

    Innovators: Media & PR, Retail, Leisure
    Early Adopters: Finance, Industrial
    Early Majority: Utilities Infrastructure, Public Services
    Late Majority: Manufacturing, Healthcare

    Figure 7

    4.5 Use Cases

    When asked in what way the EMA/9sight Big Data respondents were using or planning on using their Big Data implementations, the following use case options were offered. It should be noted that respondents were given the opportunity to select multiple options as they might be engaged in multiple projects as part of their Big Data program.

    Which of the following use cases applies to your current, or planned, Big Data project?
    % Valid Cases (Mentions / Valid Cases)

    Online Archiving: 51%
    Combining data by structure: 45%
    Staging structured data: 45%
    Combining Data by Speed: 44%
    Pre-Processing: 37%

    Figure 8

    The top answer for organizations was Online Archiving. This supports the concept that simply storing the data appears to be the initial step in either operational or analytical Big Data implementations. Without this initial step and the experience associated with building the environment and learning the pitfalls of Big Data solutions, organizations may find themselves getting out of their depth. This shows a stepwise approach to Big Data solutions adoption. As companies adjust to the associated learning curve of Big Data platforms and integrated environments, this appears to be a prudent approach.

    4.5.1 First Steps by Industry

    While Online Archiving is the overall first preference among aggregated responses, there is quite a variation by industry segment. Online Archiving is the top use by far for Public Services, and the highest preference for Retail, Leisure, Manufacturing and Utilities Infrastructure.

    "Which of the following use cases applies to your current, or planned, Big Data project?" by "Which of the following best describes your company's primary industry?"

    Industry: Online Archiving / Staging structured data / Combining data by structure / Combining Data by Speed / Pre-Processing
    Public Services: 31% / 18% / 17% / 19% / 15%
    Retail: 28% / 13% / 20% / 23% / 18%
    Leisure: 27% / 20% / 13% / 20% / 20%
    Manufacturing: 24% / 18% / 24% / 18% / 14%
    Media & PR: 24% / 31% / 14% / 21% / 10%
    Utilities Infrastructure: 23% / 21% / 20% / 16% / 20%
    Healthcare: 22% / 24% / 14% / 22% / 19%
    Industrial: 21% / 27% / 15% / 17% / 21%
    Finance: 15% / 20% / 27% / 22% / 17%

    Figure 9

    As a "Phase Zero", or learning stage, of Big Data implementations, Online Archiving represents the on-the-job training for organizations implementing Big Data initiatives. This approach can be used to overcome the hurdle of not having the proper implementation strategy or technology skills within an organization. This approach makes sense for Public Services, Utilities Infrastructure and Manufacturing, who have perhaps come more recently to the topic of Big Data approaches and challenges. Retail and Leisure, however, are among early implementers of Big Data, as described in Section 4.4. These industry implementation leaders show a more balanced spectrum of drivers.

    4.6 Implementation Sponsors

    As EMA/9sight looked at implementations across industry, it was important to identify the sponsors or decision makers associated with Big Data implementation and planning.

    "Who are the sponsors of your organization's Big Data initiative(s)?" by "Which of the following best describes your company's primary industry?"

    Values by industry, in the order: Finance / Healthcare / Media & PR / Industrial / Public Services / Manufacturing / Retail / Utilities Infrastructure / Leisure
    Finance Function: 14% / 17% / 15% / 11% / 9% / 9% / 9% / 7% / 11%
    IT / Data Center: 16% / 13% / 9% / 10% / 19% / 9% / 15% / 16% / 11%
    Corporate Executive (CEO, CIO): 11% / 12% / 9% / 15% / 8% / 10% / 11% / 12% / 21%
    Sales: 7% / 13% / 15% / 11% / 5% / 9% / 6% / 8% / 0%
    Marketing: 7% / 3% / 9% / 3% / 7% / 14% / 4% / 4% / 11%
    Customer Service / Care: 7% / 5% / 6% / 7% / 9% / 8% / 9% / 11% / 16%
    Human Resources: 4% / 7% / 3% / 8% / 7% / 1% / 6% / 3% / 0%
    Supply Chain: 6% / 5% / 12% / 6% / 3% / 9% / 9% / 9% / 0%
    Manufacturing: 3% / 8% / 6% / 7% / 4% / 15% / 6% / 3% / 5%
    Regulatory and Compliance: 6% / 3% / 0% / 6% / 9% / 4% / 4% / 4% / 5%
    R&D: 18% / 13% / 15% / 15% / 20% / 14% / 19% / 23% / 21%

    Figure 10

    Big Data solutions often involve a wide variety of systems; IT / Data Center sponsorship is thus a significant factor in almost every industry. In particular, IT / Data Center sponsorship is important in Public Services, Finance and Utilities Infrastructure. Big Data solutions also require significant investment and the technical support of IT / Data Center sponsorship. The Finance Function has sponsorship roles across the board with particular focus on Healthcare, Finance and Media & PR. With Big Data solutions at the heart of many new aspects of business models, Research and Development organizations are also involved heavily in the sponsorship of Big Data strategies according to the EMA/9sight Big Data survey. Given the long struggle in data warehousing and BI to ensure appropriate executive-level support, it is heartening to see that Corporate Executives are already involved in many instances in Big Data projects.

    It should be noted that survey responses of "Other" were omitted from the graphic. Some industry segments do not reflect a response total of 100% because of this omission.

    4.6.1 Bumps in the Road

    For any planned initiative, the best case is often described to gain stakeholder buy-in and for planning purposes. However, just as any battle plan goes out the window at the start of a conflict, organizations must plan and prepare for the hurdles that will stand in the way of the best-case scenario for implementing Big Data initiatives. The EMA/9sight Big Data Survey thus asked about the implementation hurdles facing organizations.

    "Which of the following obstacles will impact your organization's ability to implement Big Data project?" by "Which of the following best describes your company's primary industry?"

    Industry: Stakeholder issues / Strategy issues / Lack of skills to manage NoSQL / Hadoop / Lack of appl. management in Big Data solutions
    Media & PR: 25% / 36% / 29% / 11%
    Leisure: 25% / 31% / 25% / 19%
    Utilities Infrastructure: 25% / 32% / 22% / 21%
    Public Services: 33% / 40% / 12% / 16%
    Finance: 34% / 30% / 22% / 12%
    Manufacturing: 34% / 36% / 23% / 6%
    Industrial: 35% / 35% / 18% / 12%
    Retail: 40% / 30% / 15% / 13%
    Healthcare: 45% / 25% / 10% / 18%

    Figure 11

    Across all industries, the most significant hurdles to implementation are associated with the Stakeholder Issues of buy-in and strategy. Stakeholder Issues are the top concern for the Retail and Healthcare industries. Public Services prioritize Strategy Issues, specifically the lack of an implementation strategy. This harks back to the widespread experience with BI: technology is seldom a showstopper, but organizational issues often are.

    The oft-quoted problems of sourcing NoSQL skills, such as Hadoop MapReduce, or the lack of Application Management Function in the Big Data environment are rather far down the list of potential issues for our survey respondents.

    It should be noted that survey responses of "Other" were omitted from the graphic. Some industry segments do not reflect a response total of 100% because of this omission.

    4.7 Implementation User Base

    Almost as important as what is being performed with Big Data environments is the question of who is accessing these environments. Information locked in a data store or unavailable to the business user community rarely drives competitive advantage, especially on the revenue side.

    In the EMA/9sight Big Data survey, respondents were asked about the members of their organization who had direct access to the Big Data project environments.

    Which of the following user groups have, or planned to have, direct access to your Big Data project?
    % Valid Cases (Mentions / Valid Cases)

    IT Analysts (database administrators, data analysts): 55%
    Business Analysts (marketing analyst, finance analysts): 54%
    Line of Business Executives: 50%
    Application developers: 42%
    Data Scientists (statistical analysts, data mining specialist, predictive modelers): 39%
    Other (Please specify): 1%

    Figure 12

    Among survey respondents, the second most prominent response was that ordinary Business Analysts were those with direct access to the Big Data environments. Surprisingly, the third most frequent indication was that Line of Business Executives had access to these environments. Between these two groups, it hardly looks like Big Data solutions are only the domain of specially trained data scientists. In fact, in this analysis data scientists were the fifth of the named choices. However, the overall numbers are impressive considering that the profession or role of the Data Scientist is a relatively recent addition to the modern business lexicon.

    Direct access to Big Data environments by Application Developers strengthens the conclusion that Big Data is being used for more than analytical purposes and is demonstrating significant operational use cases as well. This conclusion comes from the belief that Application Developers would be accessing the Big Data environments directly to support their operational platforms.

    4.8 Implementation User Base by Industry

    To drill deeper, the EMA/9sight survey answers were distributed by industry as follows:

    Figure 13: "Which of the following user groups have, or planned to have, direct access to your Big Data project?" by "Which of the following best describes your company's primary industry?"

    Industries: Manufacturing, Industrial, Healthcare, Utilities Infrastructure, Retail, Public Services, Finance, Media & PR, Leisure.

    User groups: Line of Business Executives, Business Analysts, Data Scientists, Application Developers, IT Analysts.

    Of note is the fact that the industries seen as innovators in Big Data, Retail and Media & PR, show almost the largest percentage of IT usage (IT Analysts and Application Developers together), suggesting that IT involvement remains a vital component of successful deployment. Manufacturing, which tops the list for business users, also has the highest response rate for Analytics usage in Section 5.3.3 Business Drivers by Industry Grouping.

    It should be noted that survey responses of Other were omitted from the graphic.

    Some industry segments do not reflect a response total of 100% because of this omission.

    5 Big Data Requirements: Beyond Buzzwords

    5.1 The Speed of Business

    The need for Big Data platforms to provide new speeds and scale of Response has opened the door for new ways to leverage data and provide insights to end users. This is especially true in the area of Big Data analytics, where the ability to react in near real time is a key component of the value these platforms can deliver. Sub-second data delivery is not necessary for all applications and data-driven scenarios, but it is clear that real-time use cases are growing in importance and becoming more critical to many companies. New Big Data technologies are at the core of this evolution, powering new solutions and improved time to action. Innovators such as Yahoo and Google helped to pioneer this area and created technology foundations to meet the growing needs of Response within organizations. Cassandra and Hadoop, among other tools, are being adopted into traditional data management ecosystems to address these new demands. These solutions are highly technical and not generally designed for the lay person, but are creating new opportunities within the enterprise to leverage data at new speeds and scale that were once thought to be prohibitive from both a technology and economic viewpoint.

    5.1.1 Taking off the Training Wheels

    When looking beyond the stage of Online Archiving, more mature Big Data use cases begin to take on the importance of speed of Response when considered with respect to implementation stage.

    Consistent with the information in Section 4.5 Use Cases, Online Archiving is the most implemented Big Data use case, with almost 40% of EMA/9sight survey respondents indicating that they are using that use case in either a production or pilot setting (the In Operation status).

    Figure 14: "At what stage of implementation are your company's/department's Big Data project(s)?" by "Which of the following use cases applies to your current, or planned, Big Data project?"

    Use cases: Online Archiving, Pre-Processing, Combining Data by Speed, Combining data by structure, Staging structured data.

    Implementation stages: In Operation, Serious Planning, Investigating.

    The next two use cases are Combining Data by Speed and Pre-Processing. The focus on speed in these two use cases shows how organizations are examining their current solutions and finding them in need of performance improvement.

    This can be seen in the requirement to meet end-user response Service Level Agreements (SLAs) for websites. Slow websites lose in the battle for customer adoption. Many companies are utilizing Big Data solutions to respond at high speed for either operational offer response or customer experience related analysis.

    5.1.2 Use Cases by Industry

    Looking at the Big Data implementation use cases, aside from Online Archiving mentioned in Section , the requirements for speed of Response become important in several key industries.

    Retail and Leisure have a combined 40% of their answers allocated to the Combining Data by Speed and Pre-Processing use cases. This shows the leadership of these industries in Big Data implementations and the importance of linking operational and analytical Response to their overall business cases. For Retail, this is represented by the need for timely cross-sell/up-sell offers and the increasingly competitive market in Retail industry segment components like End Consumer Retail. For Leisure, there are similar requirements for speed of Response of next best offers. However, there is also the exposure associated with fraud in the Leisure industry segment component of Gaming.

    Figure 15: "Which of the following use cases applies to your current, or planned, Big Data project?" by "Which of the following best describes your company's primary industry?"

    Industries: Public Services, Retail, Leisure, Manufacturing, Media & PR, Utilities Infrastructure, Healthcare, Industrial, Finance.

    Use cases: Online Archiving, Staging structured data, Combining data by structure, Combining Data by Speed, Pre-Processing.

    The Healthcare industry segment also has a significant (over 40%) combined answer for the Combining Data by Speed and Pre-Processing use cases. This may be surprising when compared to implementation leaders like Retail and Leisure. However, when you look at the components of the industry segment and the importance of speed of Response in the Healthcare industry for patient care and quality control on device and pharmaceutical product, this emphasis on speed of Response use cases makes more sense.

    Conversely, when analyzing Combining by Speed and Pre-Processing, the bottom two industries for speed of Response use cases are Manufacturing and Public Services. Again, this makes sense based on the nature of their need. Manufacturing has a larger focus on combining their data by Structure, which indicates that they may be focused on the integration of visual or sensor information for quality control. The Public Services industry segment is still in the early stages of Big Data Adoption, as described in Section , and may still look to incorporate speed of Response use cases as they develop their strategies.

    What is interesting is the emphasis that Utilities Infrastructure has on speed of Response use cases. The Utilities Infrastructure industry segment includes telecommunications service providers and energy production and distribution organizations. Telecommunications providers have a similar competitive environment to Retail organizations and fraud exposure to the Leisure industry segment. Energy production and distribution are some of the leading organizations advocating the smart grid of near real-time utilities optimization.

    5.1.3 Implementation Strategy by Industry

    When the Big Data use cases are put in the context of implementation strategy, some industries continue the trend of focusing on the speed of Response. Again, Retail and Leisure focus strongly on the speed of Response, just as they did with their use cases. Leisure (33%) is the clear leader focusing on the use of Big Data strategies to solve Primary Operational platform strategies. Retail is second (29%) in this use of Big Data strategies for operational purposes. Utilities Infrastructure is also a leader (nearly 28%). This placement makes more sense when the real-time nature of telecommunications and managed network connectivity is considered.

    Figure 16: "Which implementation strategy(s) are you using, or plan to use, with your Big Data project?" by "Which of the following best describes your company's primary industry?"

    Industries: Retail, Utilities Infrastructure, Industrial, Leisure, Healthcare, Public Services, Media & PR, Finance, Manufacturing.

    Implementation strategies: Primary Operational Platform, Comp. Operational Platform, Exploratory Environment, Comp. Analytic Platform, Primary Analytic Platform.

    There are significant variations in implementation strategy across industries, with Finance standing out as adopting the opposite priority on analytic platforms, both primary and complementary. This may reflect the complex analytical models required in Finance to drive automated trading and risk assessment, with the current focus on analysis and driving call center use rather than full, real-time implementation In Operation.

    5.2 Inexpensive is not Free

    The Economics of technology is the great equalizer and often can contribute to an early majority adoption of a particular innovation. This has been especially true with Big Data. Many companies have identified needs to address Response and Workload, but the return on investment has slowed adoption. Big Data platforms can leverage commodity hardware and often the software is open source, lowering the economic barrier to adoption. However, although the barrier to entry is significantly reduced, this seldom equates to being free. Special skill set requirements and lack of mainstream management tools create hidden costs that must be taken into account before adopting this type of technology.

    5.2.1 Overall Annual Information Technology Budget

    For organizations considering or implementing Big Data solutions, there is a fairly normal distribution, with the median selected band, $10 million to less than $25 million USD per year, representing the companies most likely to adopt.

    What is your organization's annual IT budget?

    Less than $1 million: 11%

    $1 million to less than $5 million: 16%

    $5 million to less than $10 million: 17%

    $10 million to less than $25 million: 19%

    $25 million to less than $50 million: 12%

    $50 million to less than $100 million: 11%

    $100 million or more: 9%

    Figure 17
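The median band quoted for Figure 17 can be checked with a cumulative sum over the banded shares. The sketch below is illustrative only: it assumes the seven percentages in the figure map, in order, to the seven budget bands, and because those shares total 95% rather than 100%, it takes the median over the reported shares only. The function name is hypothetical.

```python
# Budget bands and shares as listed for Figure 17.
# Assumption: the seven percentages map, in order, to the seven bands.
bands = [
    ("Less than $1 million", 11),
    ("$1 million to less than $5 million", 16),
    ("$5 million to less than $10 million", 17),
    ("$10 million to less than $25 million", 19),
    ("$25 million to less than $50 million", 12),
    ("$50 million to less than $100 million", 11),
    ("$100 million or more", 9),
]


def median_band(banded_shares):
    """Return the first band at which the cumulative share reaches half the total."""
    total = sum(share for _, share in banded_shares)
    running = 0
    for label, share in banded_shares:
        running += share
        if running >= total / 2:
            return label
    raise ValueError("no bands supplied")


if __name__ == "__main__":
    print(median_band(bands))
```

Under these assumptions the cumulative share first passes the halfway point inside the $10 million to less than $25 million band, in line with the median band named in the text.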

    5.2.2 Comparison of Enterprise and Mid-size Budgets

    As discussed above in Section , Enterprise businesses and Mid-sized organizations lead in the In Operation stage of Big Data implementations. The following charts examine the distribution of budgeting dollars for both operational and informational/analytical budgets of these respective organizations. These numbers indicate the average percentages of current 2012 annual budget and planned budget for 2013 allocated to different hardware and software types.

    5.2.2.1 Enterprise Analytical Big Data Budgets

    Enterprise analytical budgets show a change from this year to next in the traditional SQL-based database management category from approximately 25% of budget to just less than 19% in 2013. It appears that budget is moving largely to data storage expenditures, indicating an upswing in data Loads in 2013 from this year.
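A change in budget share like this one can be read two ways: as a percentage-point change or as a relative change. The minimal sketch below shows both, using 25 and 19 as stand-ins for the text's "approximately 25%" and "just less than 19%"; those inputs are approximations and the function name is illustrative.

```python
def budget_shift(share_before, share_after):
    """Return (percentage-point change, relative change in percent)."""
    point_change = share_after - share_before
    relative_change = 100 * point_change / share_before
    return point_change, relative_change


if __name__ == "__main__":
    # Approximate shares taken from the paragraph above (assumed values).
    points, relative = budget_shift(25, 19)
    print(f"{points:+d} percentage points, {relative:+.0f}% relative change")
```

On these rounded inputs, a drop of about six percentage points corresponds to roughly a quarter of the category's 2012 share.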

    Figure 18: Analytical Big Data Budget 2012/2013 Comparison for Enterprises

    Budget categories: Traditional Data Storage (SAN, NAS, etc) hardware; NoSQL (Hadoop HDFS) data storage; Traditional Database Management System (SAP Sybase, Teradata) software; NoSQL Database Management (MongoDB, Hadoop Cassandra); Data Visualization (dashboard, reporting) software; Advanced Analytics (predictive, classification, etc).

    Series: 2012 Budget Enterprise, 2013 Budget Enterprise.

    5.2.2.2 Mid-sized organizations Analytical Big Data Budgets

    For Mid-sized organization Analytics budgets, the EMA/9sight Big Data survey found that a significant amount of planned budget is moving from traditional database management system spending to two areas. The first is an increase in NoSQL (including Hadoop) technologies and data storage, reflecting increased data Load. The second is an increase in spending on analytic and visualization tools (also reflected in the Enterprise Analytic category above), consistent with a move in focus from gathering data to using it for business value.

    Figure 19: Analytical Big Data Budget 2012/2013 Comparison for Mid-size Organizations

    Budget categories: Traditional Data Storage (SAN, NAS, etc) hardware; NoSQL (Hadoop HDFS) data storage; Traditional Database Management System (SAP Sybase, Teradata) software; NoSQL Database Management (MongoDB, Hadoop Cassandra); Data Visualization (dashboard, reporting) software; Advanced Analytics (predictive, classification, etc).

    Series: 2012 Budget Mid-size, 2013 Budget Mid-size.

    5.2.2.3 Enterprise Operational Big Data Budgets

    Again, Enterprise budgets are moving away from traditional storage and database management investments. This shift is moving toward Hadoop-based HDFS data storage and operational applications.

    Figure 20: Operational Big Data Budget 2012/2013 Comparison for Enterprise

    Budget categories: Traditional Data Storage (SAN, NAS, etc) hardware; NoSQL (Hadoop HDFS) data storage; Traditional Database Management System (IBM DB2, Oracle) software; NoSQL Database Management (MongoDB, Hadoop Cassandra); Operational Application (ERP, CRM, etc).

    Series: 2012 Budget Enterprise, 2013 Budget Enterprise.

    5.2.2.4 Mid-sized Organization Operational Big Data Budgets

    Mid-sized organizations' Operational Big Data budgets also show movement away from additional investment in traditional data storage and database management toward additional spending on NoSQL technologies. Also, there is a shift from data store infrastructure to additional spending on operational platforms for 2013.

    These numbers in both Enterprise and Mid-sized organizations confirm the responses on Implementation Strategy (Section ) as Big Data is being integrated into upgrades of the operational environment.

    Figure 21: Operational Big Data Budget 2012/2013 Comparison for Mid-size Organizations

    Budget categories: Traditional Data Storage (SAN, NAS, etc) hardware; NoSQL (Hadoop HDFS) data storage; Traditional Database Management System (IBM DB2, Oracle) software; NoSQL Database Management (MongoDB, Hadoop Cassandra); Operational Application (ERP, CRM, etc).

    Series: 2012 Budget Mid-size, 2013 Budget Mid-size.

    5.3 Refining Data into Information

    Early adopter companies realized that Big Data platforms could play a role within their ecosystem to execute extremely complex analytic workloads and were willing to invest early in these solutions to gain competitive advantage. Coupling the Workload and Response, these new platforms created a powerful tool of differentiation. The ability to introduce new data types such as social information or machine data could be leveraged to add even greater levels of insight and value. Today, running highly complex analytic models over massive data stores is becoming commonplace across all industries, and the ability to refine meaningful information from raw data is a key differentiator.

    5.3.1 Workload Technical Drivers

    Looking at the technical drivers for organizations, limitations in current platforms top the list by far. Excluding this obvious driver, access to and processing of internal and external multi-structured data are the top two answers from the survey respondents.

    What are the three (3) primary technical drivers behind your organization's need for a Big Data strategy(s)?

    % Total Mentions

    Require access to internal and external multi-structured data sets: 17%

    Require faster processing of structured or multi-structured data sets: 17%

    Requirement to react faster to complex event processing (CEP) platforms: 15%

    Current platform scaling limits: 39%

    Need access to deep transaction data from point of sale (POS) and website clickstream platforms: 12%

    Figure 22

    5.3.2 Technical Drivers: A Deeper Dive

    In order to dive deeper, the EMA/9sight survey responses are broken down by industry.

    Figure 23: "What are the three (3) primary technical drivers behind your organization's need for a Big Data strategy(s)?" by "Which of the following best describes your company's primary industry?"

    Industries: Industrial, Utilities Infrastructure, Public Services, Manufacturing, Media & PR, Finance, Retail, Healthcare, Leisure.

    Technical drivers: Require access to int / ext multi-structured data; Need access to deep txn data from POS and clickstream; Require faster processing of struct. / multi-struct. data; Requirement to react faster to CEP platforms; Current platform scaling limits.

    While Platform Scaling Limitations are cited in 40-45% of answers across all industries, Healthcare stands out with less than 25%, as well as being the industry with the highest focus on speed of processing and reaction. This reflects the possibility that Healthcare and, to a lesser extent, Utilities Infrastructure are looking at new data sources rather than existing systems and are driven by users' Response requirements. Industrial, on the other hand, appears to be still focused on the more fundamental needs of simply allowing users to access different Structures of data.

    It should be noted that survey responses of Other were omitted from the graphic.

    Some industry segments do not reflect a response total of 100% because of this omission.

    5.3.3 Business Drivers by Industry Grouping

    Analyzing business drivers by industry, grouping a number of responses relating to the TCO of data management, provides the following view:

    Figure 24: "What are the three (3) primary business drivers behind your organization's need for a Big Data strategy?" by "Which of the following best describes your company's primary industry?"

    Industries: Manufacturing, Retail, Healthcare, Industrial, Media & PR, Finance, Utilities Infrastructure, Public Services, Leisure.

    Business drivers: Improved analytics; Business requires faster response to operational or analytical queries; Regulatory / policy need to store large datasets online; Improved Data Management TCO / Comp. Advantage.

    Across all industries, approximately half the responses link Business Advantages such as Cost Reduction and Competitive Advantage to improved data management. Considering that only 20% of our respondents were from the IT function, this suggests that business people in the Big Data area are highly aware of the value of data management.

    Excluding the Data Management answers, Improved Analytics is by far the top response across all industries, with Manufacturing leading the way.

    Query Response Time is an important driver for Manufacturing, Industrial, Utilities Infrastructure and Finance, illustrating the importance of finding the right types of processing workloads for these industries. For Manufacturing and Industrial, this focus on response time may show the importance of optimization for their organizations.

    Not surprisingly, Regulatory Requirements for storing data and documenting processes are a top priority for regulated industries like Pub

