Big Data

Date post: 17-Dec-2014
Upload: eufris
Description:
Big Data presentation, 09.02.2012
Transcript
Page 1: Big Data

Big Data

Eufris 2012

Page 2: Big Data

Why should I care?

McKinsey:
• $250 billion in annual savings in the EU alone by enhancing the public sector
• $600 billion in annual consumer surplus from using personal location data globally

• Annual growth of data is remarkable
• Data is the most valuable thing most companies have
• Data is massively underutilized

Page 3: Big Data

Forecast

There will be a shortage of talent necessary for organizations to take advantage of big data. By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions.

Page 4: Big Data

What is Big Data?

"Big data technologies describe a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data, by enabling high-velocity capture, discovery, and/or analysis"

IDC

"Big Data is a technology that helps extract value from the digital universe."

IDC

"Techniques and technologies that make handling data at extreme scale economical."

Forrester

Page 5: Big Data

ABC of Big Data

Analytics
• making sense of your data, in real time, in an easy way

Bandwidth
• ingesting, processing and delivering large amounts of data

Content
• storing, managing and retaining large amounts of data

www.netapp.com

Page 6: Big Data

3 V’s of Big Data

Variety
• Big Data extends beyond structured data, including unstructured data of all varieties: text, audio, video, click streams, log files and more

Velocity
• Often time-sensitive, Big Data must be used as it is streaming in to the enterprise in order to maximize its value to the business

Volume
• Big Data comes in one size: large. Enterprises are awash with data, easily amassing terabytes and even petabytes of information

Page 7: Big Data

A few core concepts

Page 8: Big Data

Hadoop

• The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using a simple programming model.

• Three subprojects
  • Hadoop Common
  • Hadoop Distributed Filesystem (HDFS)
  • Hadoop MapReduce

Page 9: Big Data

MapReduce

• Introduced by Google in 2004

[Figure: input values passing through a Map stage and then a Reduce stage]
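
As an illustration of the Map and Reduce stages named on this slide, here is a minimal single-process sketch in Python. It is not Hadoop or Google code: the word-count task and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are assumptions chosen for the example, and a real framework would run the same steps in parallel across a cluster.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as a framework would do."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values collected for one key into one result."""
    return key, sum(values)


def word_count(documents):
    # Run map over every document, group by word, then reduce each group.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data is big", "data is valuable"]))
```

Because each call to `map_phase` looks at one document only, and each call to `reduce_phase` looks at one key only, both stages can be spread over many machines without coordination; only the shuffle step moves data between them.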

Page 10: Big Data

MapReduce on App Engine

• MapReduce is an experimental, innovative, and rapidly changing new feature for App Engine

Page 11: Big Data

NoSQL

• Definition 1

“Next Generation Databases mostly addressing some of the points: being non-relational, distributed, open-source and horizontally scalable. The original intention has been modern web-scale databases. The movement began early 2009 and is growing rapidly. Often more characteristics apply as: schema-free, easy replication support, simple API, eventually consistent, a huge data amount, and more.”

nosql-database.org

Page 12: Big Data

NoSQL

• Definition 2

“In computing, NoSQL (sometimes expanded to "not only SQL") is a broad class of database management systems that differ from the classic model of the relational database management system (RDBMS) in some significant ways. These data stores may not require fixed table schemas, usually avoid join operations, and typically scale horizontally.”

Wikipedia
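
Two of the properties named in these definitions, schema-free records and horizontal scaling, can be sketched in a few lines of Python. This is a toy model, not the API of any real NoSQL product: the `ShardedStore` class, its methods, and the sample records are all hypothetical, and each in-memory dict merely stands in for a separate server.

```python
import hashlib


class ShardedStore:
    """Toy key-value store spreading schema-free records over several shards."""

    def __init__(self, shard_count):
        # Each dict stands in for one independent server (an assumption).
        self.shards = [{} for _ in range(shard_count)]

    def _shard_for(self, key):
        # A stable hash of the key decides which shard owns the record.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.shards[int(digest, 16) % len(self.shards)]

    def put(self, key, record):
        self._shard_for(key)[key] = record

    def get(self, key):
        return self._shard_for(key).get(key)


store = ShardedStore(shard_count=3)
# The two records have different fields; no fixed table schema is required.
store.put("user:1", {"name": "Ada", "tags": ["admin"]})
store.put("user:2", {"name": "Linus", "email": "linus@example.org"})
print(store.get("user:1"))
```

Every lookup touches exactly one shard, which is why such stores tend to avoid joins: a join would need data from several shards at once. Note that simple modulo placement like this would move most keys whenever the shard count changes; production systems typically use more elaborate partitioning.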

Page 13: Big Data

From ACID to BASE

ACID: Atomicity, Consistency, Isolation, Durability

BASE: Basically available, Soft state, Eventually consistent
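
To make "eventually consistent" concrete, the following Python sketch models two replicas that accept writes independently and converge only after a synchronization pass. It is a simplified illustration under stated assumptions: the `Replica` class, the `synchronize` function, and the last-write-wins rule based on caller-supplied timestamps are choices made for this example, not the behavior of any particular database.

```python
class Replica:
    """One copy of the data; stores a (timestamp, value) pair per key."""

    def __init__(self):
        self.data = {}

    def write(self, key, value, timestamp):
        # Keep the write only if it is newer than what this replica holds.
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def synchronize(a, b):
    """Exchange entries in both directions so the replicas converge."""
    for key, (timestamp, value) in list(a.data.items()):
        b.write(key, value, timestamp)
    for key, (timestamp, value) in list(b.data.items()):
        a.write(key, value, timestamp)


r1, r2 = Replica(), Replica()
r1.write("cart", ["book"], timestamp=1)         # reaches replica 1 only
r2.write("cart", ["book", "pen"], timestamp=2)  # later write, replica 2 only
stale = r1.read("cart")                         # replica 1 still answers, with old state
synchronize(r1, r2)
print(stale, r1.read("cart"), r2.read("cart"))
```

Both replicas stay available for reads and writes throughout (basically available), a reader may briefly see an outdated value (soft state), and the copies agree once synchronization has run (eventually consistent). An ACID system would instead refuse or delay the read until all copies agreed.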

Page 14: Big Data

Big Data and cloud

Page 15: Big Data

Big Data on AWS

Page 16: Big Data

MapReduce on AWS

• Not yet Hadoop 1.0.0

Page 17: Big Data

MapReduce on AWS

S3

EC2

+ DynamoDB

Page 18: Big Data

Google BigQuery

Features
• Speed - Analyze billions of rows(!) in seconds
• Scale - Terabytes of data, trillions of records
• Simplicity - SQL-like query language, hosted on Google infrastructure
• Sharing - Powerful group- and user-based permissions using Google accounts
• Security - Secure SSL access
• Multiple access methods - Can be used via a REST API, a command-line tool, a browser-based graphical interface, and Google Apps Script

Page 19: Big Data

BigQuery example

Page 20: Big Data

Big Data outside of cloud

Page 21: Big Data

Oracle Big Data Appliance

18 Oracle Sun Servers
• 864 GB main memory
• 216 CPU cores
• 648 TB of raw disk storage
• 40 Gb/s InfiniBand connectivity between nodes and engineered systems
• 10 Gb/s Ethernet connectivity

About $500,000
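
The totals on this slide can be broken down per server with simple arithmetic, sketched here in Python. The figures are taken from the slide; the even split across the 18 servers and the use of the approximate price as an exact number are assumptions made for the calculation.

```python
# Totals as stated on the slide.
SERVERS = 18
TOTAL_MEMORY_GB = 864
TOTAL_CPU_CORES = 216
TOTAL_RAW_DISK_TB = 648
APPROX_PRICE_USD = 500_000  # the slide says "about", so treat results as rough

# Per-server figures, assuming resources are spread evenly over all servers.
memory_per_server_gb = TOTAL_MEMORY_GB / SERVERS
cores_per_server = TOTAL_CPU_CORES / SERVERS
disk_per_server_tb = TOTAL_RAW_DISK_TB / SERVERS

# Rough hardware price per terabyte of raw (not usable) disk capacity.
price_per_raw_tb_usd = APPROX_PRICE_USD / TOTAL_RAW_DISK_TB

print(f"{memory_per_server_gb:.0f} GB RAM, {cores_per_server:.0f} cores, "
      f"{disk_per_server_tb:.0f} TB raw disk per server")
print(f"about ${price_per_raw_tb_usd:.0f} per raw TB")
```

Under these assumptions each server carries 48 GB of memory, 12 cores, and 36 TB of raw disk, and the appliance costs roughly $770 per raw terabyte. Usable capacity after replication would be lower, so the effective price per stored terabyte would be higher.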

Page 22: Big Data

Autonomy IDOL 10

"For far too long, organizations have confined structured data to relational databases and unstructured data to simplistic keyword matching technologies..."

“IDOL 10 brings these worlds together, allowing organizations to automatically process, understand, and act on 100 percent of their data, in real-time. The results will be dramatic, as businesses can develop entirely new applications that explore the richness and color of Human Information that live in unstructured, semi-structured, and structured forms.”

Price?

Page 23: Big Data

Thank you!
