Big data hadoop-no sql and graph db-final

Date post: 27-Jan-2015
Upload: ramazan-firin
This document is intended for only AVEA İletişim Hizmetleri A.Ş. ("AVEA"), its dealers, employees and/or others specifically authorised. The contents of this document are confidential and any disclosure, copying, distribution and/or taking any action in reliance with the content of this document is prohibited. AVEA is not liable for the transmission of this document in any manner to any third parties that are not authorised to receive.

Big Data – Hadoop - NoSQL and Graph Database

Ramazan FIRIN

20.11.2012

Page 2: Big data hadoop-no sql and graph db-final

AGENDA

• Big Data

• Hadoop

• NoSQL

• Graph DB and Neo4j

• Possible Usage in Telco

• Demo

Page 3: Big data hadoop-no sql and graph db-final

Executive Summary

R&D / MW Development, AVEA

• Big Data is a new IT trend

• Hadoop and NoSQL can be used to process Big Data

• Possible usage areas in Telco:

– Prevent churn

– Offer customer-specific campaigns

– Acquire more customers

Page 4: Big data hadoop-no sql and graph db-final

What is Big Data?

Datasets that are too awkward to work with using traditional, hands-on database management tools.

Page 5: Big data hadoop-no sql and graph db-final

Big Data - 3V Concept (Volume, Velocity, Variety)

Page 6: Big data hadoop-no sql and graph db-final

Big Data Sources

1. Social network profiles - Facebook, LinkedIn, Yahoo, Google

2. Social influencers - blog comments, user forums, review sites

3. Activity-generated data - application logs, sensor data

4. Public - Wikipedia, IMDb, etc.

5. Data warehouse appliances - transactional data

6. Network and in-stream monitoring

7. Legacy documents

Page 7: Big data hadoop-no sql and graph db-final

Big Data To Smart Data

Cover of The Economist

Page 8: Big data hadoop-no sql and graph db-final

Volume

Page 9: Big data hadoop-no sql and graph db-final

New Data Sources - Internet

• 2 billion internet users by 2011

• Twitter processes 7 terabytes of data every day

• Facebook processes 10 terabytes of data every day

• 4.6 billion mobile phones

• Google processes 24 petabytes of data every day

Page 10: Big data hadoop-no sql and graph db-final

Big Data Approach

Page 11: Big data hadoop-no sql and graph db-final

Big Data Design

Page 12: Big data hadoop-no sql and graph db-final

Big Data Usage Sector

Page 13: Big data hadoop-no sql and graph db-final

Sample Usage - 360° View of the Customer

Page 14: Big data hadoop-no sql and graph db-final

Sample Usage – Customer Sentiment

Page 15: Big data hadoop-no sql and graph db-final

Sample Usage – Detect Churn Pattern

Page 16: Big data hadoop-no sql and graph db-final

Sample Usage - Health

Page 17: Big data hadoop-no sql and graph db-final

Big Data Market

Page 18: Big data hadoop-no sql and graph db-final

Big Data Solutions – Oracle Big Data Appliance

Page 19: Big data hadoop-no sql and graph db-final

Big Data Solutions – IBM Pure Data

Page 20: Big data hadoop-no sql and graph db-final

Top 10 Technology Trends 2012 from CSC

Page 21: Big data hadoop-no sql and graph db-final

Gartner: Top 10 IT Trends for 2013

Page 22: Big data hadoop-no sql and graph db-final

Gartner: 10 Critical IT Trends For The Next Five Years

• The third trend is bigger data and storage:

• By 2015, big data demand will generate 1 million jobs in the Global 1000, but only one-third of those jobs will be filled due to a shortage of talent.

• Analytics and pattern recognition are key.

• New specialized ARM-based servers are appearing for specialty analytics.

Page 23: Big data hadoop-no sql and graph db-final

HADOOP

Page 24: Big data hadoop-no sql and graph db-final

What is HADOOP?

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.
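
The best-known of these "simple programming models" is MapReduce. The sketch below is not Hadoop code: it imitates the map, shuffle and reduce steps in plain Python on one machine, with a word count as the example job. The function names and the sample input are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input.
    print(word_count(["big data hadoop", "hadoop nosql", "big graph"]))
```

On a real cluster the map and reduce functions would run in parallel on many machines, and the framework would handle the shuffle and any failed tasks.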

Page 25: Big data hadoop-no sql and graph db-final

History

Page 26: Big data hadoop-no sql and graph db-final

Hadoop Components

Page 27: Big data hadoop-no sql and graph db-final

Page 28: Big data hadoop-no sql and graph db-final

Hadoop Ecosystem

Pig - a data processing language that simplifies Hadoop programming

Hive - SQL-like queries

HBase - random read/write access to tables with billions of rows and millions of columns (NoSQL)

Page 29: Big data hadoop-no sql and graph db-final

Other Google Research

Page 30: Big data hadoop-no sql and graph db-final

NoSQL

Page 31: Big data hadoop-no sql and graph db-final

RDBMS PERFORMANCE

Page 32: Big data hadoop-no sql and graph db-final

Joins are the killer...

Page 33: Big data hadoop-no sql and graph db-final

What is NoSQL?

• Stands for Not Only SQL

• Non-relational

• Cheap, easy to implement

• Scalability

– Vertically - add more resources (CPU, RAM, disk) to a single machine

– Horizontally - add more machines to the cluster

• No pre-defined schema

• No join operations

• Usually not fully ACID; the trade-offs are framed by the CAP theorem
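
Horizontal scaling usually means spreading keys over several machines. A minimal sketch of one common way to do this, hash partitioning, follows. The node names and keys are hypothetical, and the simple modulo scheme is an assumption chosen for brevity rather than the method of any particular product.

```python
import hashlib


def node_for_key(key, nodes):
    """Pick the node that stores `key` by hashing the key (stable across runs)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


def distribute(keys, nodes):
    """Return a mapping node -> list of keys placed on that node."""
    placement = {node: [] for node in nodes}
    for key in keys:
        placement[node_for_key(key, nodes)].append(key)
    return placement


if __name__ == "__main__":
    # Hypothetical cluster and subscriber keys.
    nodes = ["node-1", "node-2", "node-3"]
    keys = [f"subscriber:{i}" for i in range(12)]
    for node, stored in distribute(keys, nodes).items():
        print(node, stored)
```

With plain modulo placement, adding a node moves most keys to a different machine. Production stores often use consistent hashing to limit that movement.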

Page 34: Big data hadoop-no sql and graph db-final

NoSQL DB Types

1. Key-Value Stores

2. Document Databases

3. Column Family Stores

4. Graph Databases
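
To make the four types concrete, the sketch below shows how one hypothetical subscriber record might be shaped in each model, using only plain Python structures. The field names and values are invented for illustration, and real products differ in their details.

```python
import json

# Key-value store: an opaque value looked up by a single key.
key_value = {"subscriber:1001": json.dumps({"name": "Ayse", "plan": "prepaid"})}

# Document database: one nested, self-describing document per entity.
document = {
    "_id": 1001,
    "name": "Ayse",
    "plan": "prepaid",
    "devices": [{"type": "phone", "os": "Android"}],
}

# Column family store: row key -> column family -> column -> value.
column_family = {
    "1001": {
        "profile": {"name": "Ayse", "plan": "prepaid"},
        "usage": {"2012-11": "320 min"},
    }
}

# Graph database: nodes plus typed relationships between them.
graph = {
    "nodes": {1001: {"name": "Ayse"}, 1002: {"name": "Mehmet"}},
    "relationships": [(1001, "KNOWS", 1002)],
}


def plan_from_key_value(store, key):
    """The store returns the raw value; the application has to decode it."""
    return json.loads(store[key])["plan"]


def friends_of(graph_data, node_id):
    """Follow outgoing KNOWS relationships from one node."""
    return [end for start, rel, end in graph_data["relationships"]
            if start == node_id and rel == "KNOWS"]


if __name__ == "__main__":
    print(plan_from_key_value(key_value, "subscriber:1001"))
    print(document["devices"][0]["os"])
    print(column_family["1001"]["usage"]["2012-11"])
    print(friends_of(graph, 1001))
```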

Page 35: Big data hadoop-no sql and graph db-final

Key-Value Stores

- Redis, Voldemort

Page 36: Big data hadoop-no sql and graph db-final

Document Database

- CouchDB, MongoDB

Page 37: Big data hadoop-no sql and graph db-final

Column Family Stores

- Cassandra, HBase

Page 38: Big data hadoop-no sql and graph db-final

Graph Database

- Neo4J, InfoGrid, Infinite Graph

Page 39: Big data hadoop-no sql and graph db-final

RDBMS Supports ACID

• Atomicity - a transaction is all or nothing

• Consistency - only valid data is written to the database

• Isolation - concurrent transactions behave as if they were executed serially, so each one sees correct data

• Durability - once a transaction is committed, its changes survive failures
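
Atomicity and consistency can be seen in a few lines with SQLite from the Python standard library. This is a minimal sketch: the `account` table, the helper names and the balances are assumptions made for the example. The credit is deliberately applied before the debit so that a failed debit has to undo the earlier credit.

```python
import sqlite3


def open_bank():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        "id TEXT PRIMARY KEY, "
        "balance INTEGER CHECK (balance >= 0))"  # consistency rule
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 50)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically: both updates or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit; the credit was rolled back.
        return False


def balances(conn):
    """Return {account id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


if __name__ == "__main__":
    bank = open_bank()
    print(transfer(bank, "a", "b", 30), balances(bank))
    print(transfer(bank, "a", "b", 500), balances(bank))
```

The second transfer would overdraw account "a", so the database rejects it and the balances stay as they were after the first transfer.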

Page 40: Big data hadoop-no sql and graph db-final

NoSQL Supports CAP Theorem

Page 41: Big data hadoop-no sql and graph db-final

NoSQL Supports CAP Theorem

• Consistency - each client always has the same view of the data.

• Availability - all clients can always read and write.

• Partition tolerance - the system still works if one or more nodes fail or cannot reach each other over the network

You can pick only two...
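
The "pick only two" trade-off can be illustrated with a toy model of two replicas. This is a deliberately simplified sketch, not how any real database is built: the class names and the "CP" / "AP" switch are assumptions. During a partition, the CP variant refuses writes and stays consistent, while the AP variant accepts writes and lets the replicas diverge.

```python
class Replica:
    """One copy of a single stored value."""

    def __init__(self):
        self.value = None


class TwoNodeStore:
    """Two replicas that normally copy every write to each other."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = [Replica(), Replica()]
        self.partitioned = False  # True while the replicas cannot communicate

    def write(self, node, value):
        if self.partitioned:
            if self.mode == "CP":
                # Refuse the write: consistency is kept, availability is lost.
                raise RuntimeError("unavailable during partition")
            # AP: accept the write locally; the replicas may now disagree.
            self.replicas[node].value = value
            return
        for replica in self.replicas:
            replica.value = value

    def read(self, node):
        return self.replicas[node].value


if __name__ == "__main__":
    ap = TwoNodeStore("AP")
    ap.write(0, "v1")
    ap.partitioned = True
    ap.write(0, "v2")
    print(ap.read(0), ap.read(1))  # the two replicas now hold different values
```

Real systems are more nuanced. For example, many consistent stores keep accepting writes on the side of the partition that holds a majority of nodes.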

Page 42: Big data hadoop-no sql and graph db-final

Visual Guide to NoSQL Systems

Page 43: Big data hadoop-no sql and graph db-final

NoSQL Complexity

Page 44: Big data hadoop-no sql and graph db-final

NoSQL Performance

Page 45: Big data hadoop-no sql and graph db-final

Job Trends

Page 46: Big data hadoop-no sql and graph db-final

Graph DB and Neo4j

Page 47: Big data hadoop-no sql and graph db-final

Graph DB

A graph database uses graph structures with nodes, edges, and properties to represent and store data.

Page 48: Big data hadoop-no sql and graph db-final

Graph DB Usage Area

• Recommendations

• Business Intelligence

• Social networking

• MDM

• System Management

• Time Series data

• Product Catalogue

• Web Analytics

• Scientific Computing

• Indexing your slow RDBMS

Page 49: Big data hadoop-no sql and graph db-final

Relational Databases are Graphs!

Page 50: Big data hadoop-no sql and graph db-final

Neo4j

• Leading Graph Database

• Transaction support (ACID)

• Indexing

• Querying

• REST support

• Disk Based

• Open source

• Traversal framework

• High performance (traverses 1,000,000+ relationships per second)

• Robust (in 24/7 operation since 2003)

• Massive scalability

Page 51: Big data hadoop-no sql and graph db-final

Neo4j Data Model

Neo4j has nodes and relationships.

Nodes and relationships have properties.

Node1 - properties: name, surname

Node2 - properties: name, surname

Relationship between Node1 and Node2 - type: knows; property: date of meeting
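
The data model on this slide can be sketched in plain Python. This is an in-memory illustration, not the Neo4j API: the class and method names are assumptions, the person names and the meeting date are placeholders, and the direction of the "knows" relationship (Node1 to Node2) is assumed.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node
    end: Node
    rel_type: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Nodes and typed relationships, both carrying key/value properties."""

    def __init__(self):
        self.nodes = {}
        self.relationships = []

    def add_node(self, node_id, **properties):
        node = Node(node_id, properties)
        self.nodes[node_id] = node
        return node

    def relate(self, start, end, rel_type, **properties):
        rel = Relationship(start, end, rel_type, properties)
        self.relationships.append(rel)
        return rel

    def neighbours(self, node, rel_type):
        """Follow outgoing relationships of one type from `node`."""
        return [r.end for r in self.relationships
                if r.start is node and r.rel_type == rel_type]


def build_example():
    """Rebuild the slide: two nodes joined by one 'knows' relationship."""
    graph = PropertyGraph()
    node1 = graph.add_node(1, name="Ali", surname="Kaya")      # placeholder values
    node2 = graph.add_node(2, name="Zeynep", surname="Demir")  # placeholder values
    graph.relate(node1, node2, "knows", date_of_meeting="2012-11-20")
    return graph, node1, node2


if __name__ == "__main__":
    graph, node1, node2 = build_example()
    print([n.properties["name"] for n in graph.neighbours(node1, "knows")])
```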

Page 53: Big data hadoop-no sql and graph db-final

Who Uses Neo4j?

• Cisco: Master Data Management

• Telenor Group: Customer organization structure (203 million subscribers)

• Deutsche Telekom: Social football site (150 million subscribers)

Page 54: Big data hadoop-no sql and graph db-final

Cypher For Query

Page 55: Big data hadoop-no sql and graph db-final

Sample Code

Page 56: Big data hadoop-no sql and graph db-final

Spring Data Neo4j

Page 57: Big data hadoop-no sql and graph db-final

Neoclipse

Page 58: Big data hadoop-no sql and graph db-final

Product Catalog

Page 59: Big data hadoop-no sql and graph db-final

Sample OM Data Model

Page 60: Big data hadoop-no sql and graph db-final

Hardware Calculating Tool

Page 61: Big data hadoop-no sql and graph db-final

Hardware Calculating Tool Result

Calculation Result for the Production Environment

• 4 physical machines

• 3 nodes on every machine

• 1024 MHz CPU

• 65536 MB RAM

Page 62: Big data hadoop-no sql and graph db-final

OrientDB

• The Document-Graph database

• ACID support

• SQL and native queries

• Schema-less, schema-full and schema-mixed modes

• Roles + Security

• Functions

• HTTP / RESTful / JSON / binary protocol support

• Hooks

• Fetch plans

• Inheritance

• 200,000 inserts per second (6 M node traversals with cache)

Page 63: Big data hadoop-no sql and graph db-final

FluxGraph

• Temporal Graph Database

• Has checkpoint

• Compatible with Neo4j

Page 64: Big data hadoop-no sql and graph db-final

Examples for Telcos

• CDR

• Routing

• Social graphs

• Master Data Management

• Spatial and LBS

• Network topology analysis

• Neo4j and Android
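
As one illustration of the CDR and social graph items above, call detail records can be folded into a weighted call graph. This is a minimal sketch: the `(caller, callee, seconds)` record layout, the function names and the sample records are assumptions, not an actual operator format.

```python
from collections import defaultdict


def build_call_graph(cdrs):
    """Turn (caller, callee, seconds) records into weighted directed edges."""
    edges = defaultdict(lambda: {"calls": 0, "seconds": 0})
    for caller, callee, seconds in cdrs:
        edge = edges[(caller, callee)]
        edge["calls"] += 1
        edge["seconds"] += seconds
    return dict(edges)


def contacts(graph, subscriber):
    """Distinct parties that `subscriber` called or was called by."""
    found = set()
    for caller, callee in graph:
        if caller == subscriber:
            found.add(callee)
        elif callee == subscriber:
            found.add(caller)
    return found


if __name__ == "__main__":
    # Hypothetical records: caller, callee, call duration in seconds.
    sample = [("A", "B", 60), ("A", "B", 120), ("B", "C", 30), ("C", "A", 15)]
    graph = build_call_graph(sample)
    print(graph[("A", "B")])
    print(sorted(contacts(graph, "A")))
```

Storing the edges in a graph database instead of a dictionary would let the same structure be queried by traversal, for example to find the contacts of a subscriber's contacts.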

Page 65: Big data hadoop-no sql and graph db-final

CDR Analysis

Page 66: Big data hadoop-no sql and graph db-final

Master Data Management

Page 67: Big data hadoop-no sql and graph db-final

Network Management

Page 68: Big data hadoop-no sql and graph db-final

Cell Network Analysis

Page 69: Big data hadoop-no sql and graph db-final

Sample Scenarios

• Customer-Specific Campaign

• Prevent Churn

• Get More Customers

• Special offer for campaigns

Page 70: Big data hadoop-no sql and graph db-final

Thanks

