Post on 27-Jun-2020
transcript
Age of Big data
Presented by:
Mohammad Iqbal BCM -2014
Agenda
What is Big Data?
Big Data Attributes
Big Data Sources
Getting Value from Big Data
New Tools for Big Data
Hadoop's Architecture
Hadoop Evolution from Google
The future is here!
What is Big Data?
Name       Symbol  Value (bytes)
Kilobyte   KB      10^3
Megabyte   MB      10^6
Gigabyte   GB      10^9
Terabyte   TB      10^12
Petabyte   PB      10^15
Exabyte    EB      10^18
Zettabyte  ZB      10^21
Yottabyte  YB      10^24
BIG DATA
Data so large that it becomes difficult to process using traditional systems
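The unit table above can be sketched in a few lines of Python. This is only an illustration, assuming the decimal (powers of 1000) units listed on the slide; the helper names `unit_size` and `humanize` are made up for this example.

```python
# Decimal byte units from the table: each step up is a factor of 1000.
UNITS = ["KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def unit_size(symbol: str) -> int:
    """Return the number of bytes in one unit, e.g. 'GB' -> 10**9."""
    return 10 ** (3 * (UNITS.index(symbol) + 1))


def humanize(num_bytes: int) -> str:
    """Express a byte count in the largest unit that keeps the value >= 1."""
    for symbol in reversed(UNITS):
        size = unit_size(symbol)
        if num_bytes >= size:
            return f"{num_bytes / size:g} {symbol}"
    return f"{num_bytes} B"


if __name__ == "__main__":
    for symbol in UNITS:
        print(symbol, unit_size(symbol))
    print(humanize(5 * 10**14))
```

Python integers have arbitrary precision, so even a yottabyte (10^24 bytes) is represented exactly.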
Difficult to process by a traditional system
100 MB document
100 TB document
100 GB document
Unable to send
Unable to edit
Unable to view
Depends on the capability of the system
Organization/Context Specific
500 TB of text, audio, and video data per day
Company A
Company B
Big data
Not big data
Depends on the capabilities of the organization
Areas of Challenges
Capture
Curation
Storage
Analysis
Visualization
Transfer
Sharing
Search
Big Data Attributes
Big data: the three Vs (V^3)
• Volume: data results in large and growing files
• Velocity: data arrives at high speed
• Variety: the files come in various formats
Structured / Unstructured
Unstructured data: 90% (mostly wasted)
Structured data: 10% (used in decision making)
Challenge / Opportunity: to analyze and extract meaningful information
Big data Sources
Users
Applications
Systems
Sensors
Large & growing files
(Big data files)
Data Generation point Examples
Mobile devices
Microphones
Readers/Scanners
Software/programs
Social media
Cameras
Machine Sensors
Science facilities
Sample Events generating Data
• Every day, we create 2.5 exabytes of data (2.5 billion GB); so much that 90% of the data in the world today was created in the last few years alone.
• The CERN atomic facility generates 40 TB of data per second.
• Twitter generates 12 TB of data every day.
• An Airbus A380 generates 10 TB for every 30 minutes of flight; about 650 TB is generated in one flight.
• In 2009 the total data in the world was estimated at 1 ZB. By 2020 it is estimated to reach 35 ZB.
(Source: IBM.com)
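The arithmetic behind two of these figures can be checked with a short Python sketch. It assumes the decimal units from the earlier table and takes the slide's numbers at face value; the function name `implied_annual_growth` is introduced here only for illustration.

```python
# Decimal units, in bytes.
GB = 10**9
EB = 10**18
ZB = 10**21

# 2.5 exabytes per day, expressed in gigabytes.
daily_bytes = 2.5 * EB
daily_gb = daily_bytes / GB  # 2.5e9, i.e. 2.5 billion GB


def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Compound annual growth rate that takes `start` to `end` in `years`."""
    return (end / start) ** (1 / years) - 1


# Growth rate implied by going from 1 ZB (2009) to 35 ZB (2020).
growth = implied_annual_growth(1 * ZB, 35 * ZB, 2020 - 2009)

if __name__ == "__main__":
    print(f"Daily volume: {daily_gb:,.0f} GB")
    print(f"Implied annual growth 2009-2020: {growth:.1%}")
```

Under these assumptions the 2009 and 2020 estimates imply that the world's data grows by a little under 40% per year.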
Getting Value from Big data
Collect Understand Analyze
Big data Applications
• Companies gain an edge by collecting, analyzing, and understanding information.
• Governments forecast events and take proactive action.
New Tools for Big Data
Over time:
Traditional systems (e.g. RDBMS, SQL): not able to handle big data
Big data tools (e.g. Hadoop, NoSQL): created to handle big data
Traditional Enterprise Approach
Big data → one powerful computer → processing limit
Only so much data can be processed
Modern Hadoop's approach
Big data → split across many computations running side by side → combined result
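The split, compute, and combine idea on this slide can be sketched in plain Python. This is a toy illustration, not Hadoop's API: threads on one machine stand in for the machines of a cluster, a word count stands in for the computation, and the function names (`split`, `compute`, `combine`, `run`) are invented for this example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split(records, parts):
    """Divide the records into `parts` chunks, one per worker."""
    return [records[i::parts] for i in range(parts)]


def compute(chunk):
    """Per-chunk computation: count the words in this chunk only."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def combine(partials):
    """Merge the partial results into one combined result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def run(records, workers=4):
    """Split the data, run one computation per chunk, then combine."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(compute, split(records, workers)))
    return combine(partials)


if __name__ == "__main__":
    lines = ["big data", "Big tools", "data data"]
    print(run(lines).most_common())
```

Because each chunk is counted independently, adding more workers does not change the combined result; it only changes how the work is divided.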
Hadoop's Architecture
Source: hortonworks/hadoop/hdfs/.com/
MapReduce
File System (HDFS)
Projects
HBase
Mahout
Pig
Oozie
Flume
Sqoop
Hive
[Diagram: Hadoop master/slave cluster]
Application
MASTER: Job Tracker, Name Node
Slaves: five nodes, each with a Task Tracker and a Data Node
DATA
The master knows where the data resides
Data can be taken directly
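The division of labour in the diagram can be illustrated with a toy in-memory model. This is a sketch under simplifying assumptions, not the HDFS API: the classes `NameNode` and `DataNode` and the helpers `write_file` and `read_file` are hypothetical, blocks are placed round-robin, and replication is left out entirely.

```python
class DataNode:
    """Stores the actual blocks of data."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def store(self, block_id, data):
        self.blocks[block_id] = data

    def read(self, block_id):
        return self.blocks[block_id]


class NameNode:
    """Holds metadata only: which data node stores which block of a file."""

    def __init__(self):
        self.block_map = {}  # filename -> list of (block_id, DataNode)

    def register(self, filename, block_id, node):
        self.block_map.setdefault(filename, []).append((block_id, node))

    def locate(self, filename):
        return list(self.block_map[filename])


def write_file(name_node, data_nodes, filename, text, block_size=8):
    """Cut the text into blocks and spread them round-robin over the nodes."""
    for start in range(0, len(text), block_size):
        index = start // block_size
        node = data_nodes[index % len(data_nodes)]
        block_id = f"{filename}#{index}"
        node.store(block_id, text[start:start + block_size])
        name_node.register(filename, block_id, node)


def read_file(name_node, filename):
    """Ask the name node where the blocks are, then read them directly."""
    return "".join(node.read(block_id)
                   for block_id, node in name_node.locate(filename))


if __name__ == "__main__":
    nodes = [DataNode(f"dn{i}") for i in range(3)]
    master = NameNode()
    write_file(master, nodes, "log.txt", "hello big data world")
    print([(bid, n.name) for bid, n in master.locate("log.txt")])
    print(read_file(master, "log.txt"))
```

The name node in this sketch never holds file contents; it only answers the question of where each block lives, and the reader then fetches the blocks from the data nodes themselves.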
HDFS vs GFS
• Similar to the Google File System (GFS) and MapReduce
• Back in the 1990s, search engines included:
Excite
AltaVista
Lycos
Infoseek
Google Victory
1995
2000
Excite
AltaVista
Lycos
Hadoop evolution from Google
2003: GFS paper released by Google
2004: Google released its paper on MapReduce
2005: Hadoop created by Doug Cutting & Mike Cafarella at Yahoo! (Nutch search engine)
2006: Yahoo donated the project to Apache
Source: Google & Nutch white papers
The future is here !!
• Big data scientists with just two years' experience can earn between $200,000 and $300,000 a year (Wall Street Journal).
• Anyone with "data science" in his or her job title on a LinkedIn page is going to get "100 recruiter emails a day" (Wall Street Journal).
• Hadoop is a super hot up-and-coming "big data" technology (businessinsider.com).
• Many other data scientists, especially at data-driven companies such as Google, Amazon, Microsoft, Walmart, eBay, LinkedIn, and Twitter, have added to and are looking to develop the Hadoop tool kit (Harvard Business Review).
• "People are slapping buzzwords such as 'Hadoop' on résumés and looking to get 50 or 100 percent more, and they're getting it," said Scott Gnau, president of Teradata Data Lab.
References
• Dean, J. & Ghemawat, S. (2004): MapReduce: Simplified Data Processing on Large Clusters, google.com
• Cutting, D. (2005): Nutch: A Flexible and Scalable Open-Source Web Search Engine, yahoo.com
• Ghemawat, S., Gobioff, H. & Leung, S.-T. (2003): The Google File System, google.com
• https://www.ibm.com/developerworks/vn/library/contest/dw-freebooks/Tim_Hieu_Big_Data/Understanding_BigData.PDF [Accessed 27 Nov 2014]
• http://www.businessinsider.com/10-tech-skills-that-will-instantly-net-you-100000-salary-2012-8?op=1 [Accessed 27 Nov 2014]
• Big Data's High-Priests of Algorithms, http://online.wsj.com/articles/academic-researchers-find-lucrative-work-as-big-data-scientists-1407543088 [Accessed 27 Nov 2014]
Thank you for your attention
Q/A