BIG Data & Hadoop Applications in Social Media

Date post: 07-Aug-2015
Upload: skillspeed
Page 1: BIG Data & Hadoop Applications in Social Media

© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com

Applications of BIG Data in Social Media Networks

Introduction to Big Data

Big data is the term for a collection of data sets so large and complex that they become difficult to process using on-hand database management tools or traditional data processing applications.

Systems and enterprises generate huge amounts of data, ranging from terabytes to petabytes.

Analyzing Big Data can reveal innovative solutions to a wide range of business questions.

Hadoop in Social Media Networks

Facebook, Twitter, LinkedIn, Pinterest, Instagram, StumbleUpon

Facebook

Facebook is one of Hadoop and Big Data's biggest champions and is rumored to operate the single largest HDFS cluster in the world, comprising 100 petabytes of disk space. The site stores more than 250 billion photos, with 350 million new ones uploaded every day.

Hive, the data warehousing infrastructure Facebook helped develop, is central to meeting the company's reporting needs. Facebook uses Hive for faster querying on graph tools.

Hadoop is used in every Facebook product and in a variety of ways. User actions such as a "like" or a status update are stored in a highly distributed, customized MySQL database, but applications such as Facebook Messaging run on top of HBase, Hadoop's NoSQL database framework.

All messages sent on desktop or mobile are stored in HBase. Additionally, the company uses Hadoop and Hive to generate reports for third-party developers and advertisers who need to track the success of their applications or campaigns.

Get Started with BIG Data & Hadoop

Applications of Hadoop in Social Media Networks

Twitter

Twitter uses Hadoop, especially Pig and HBase, for data analysis. Twitter has large data storage and processing requirements, so it uses Hadoop to optimize data storage and workflows. It runs Hadoop and Pig on top of LZO-compressed files, which allows quick analysis of the data. As a result, it has almost completely eliminated disruption from infrastructure failures.

LinkedIn

LinkedIn crunches more than 120 billion relationships per day and uses Hadoop to blend large-scale data computation with high-volume, low-latency site serving.

The ‘People You May Know’ feature is the result of scanning billions of user triggers and activities through a pipeline of 82 MapReduce jobs, each processing 16 TB of data. The pipeline uses a statistical model to predict the probability that two people know each other, and it uses Bloom filters to speed up large joins, yielding a 10x performance improvement.
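
The Bloom-filter join mentioned above can be illustrated with a minimal Python sketch. This is not LinkedIn's implementation: the `BloomFilter` class, the `bloom_join` helper, the filter sizes and the sample data are all hypothetical, and a real pipeline would run the filtering step inside MapReduce tasks. The idea shown is only that a compact filter built from the smaller data set lets most non-matching rows of the larger data set be dropped before the exact join.

```python
import hashlib


class BloomFilter:
    """A small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        # Derive several bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def bloom_join(small_side, large_side):
    """Join two lists of (key, value) pairs, pre-filtering the large side."""
    bloom = BloomFilter()
    lookup = {}
    for key, value in small_side:
        bloom.add(key)
        lookup.setdefault(key, []).append(value)

    joined = []
    for key, value in large_side:
        # Cheap membership test: most non-matching rows are dropped here,
        # before the exact lookup that completes the join.
        if not bloom.might_contain(key):
            continue
        for other in lookup.get(key, []):
            joined.append((key, other, value))
    return joined


# Hypothetical data: member -> employer, and member -> activity events.
employers = [("alice", "Acme"), ("bob", "Initech")]
views = [("alice", "viewed carol"), ("dave", "viewed erin"), ("bob", "viewed alice")]
result = bloom_join(employers, views)
```

Because the exact lookup still runs after the filter, a false positive costs only a wasted lookup and never produces a wrong join result.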

LinkedIn also builds an index structure in its Hadoop pipeline. This produces a multi-TB lookup structure that uses perfect hashing (requiring only 2.5 bits per key) and results in faster responses from the serving cluster. It takes LinkedIn about 90 minutes to build a 900 GB data store on a 45-node development cluster.
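
To put these figures in perspective, here is a short back-of-the-envelope calculation in Python. The key count of one billion is a hypothetical value chosen for illustration, and the helper names are made up for this sketch; only the 2.5 bits per key, 900 GB, 90 minutes and 45 nodes come from the text. It assumes 1 GB = 1000 MB and an even spread of work across nodes.

```python
def index_size_bytes(num_keys, bits_per_key=2.5):
    """Size of the perfect-hash metadata only, excluding the stored values."""
    return num_keys * bits_per_key / 8


def build_throughput(store_gb, minutes, nodes):
    """Return (cluster MB/s, per-node MB/s), assuming 1 GB = 1000 MB."""
    cluster_mb_per_s = store_gb * 1000 / (minutes * 60)
    return cluster_mb_per_s, cluster_mb_per_s / nodes


# One billion keys is a hypothetical figure, not taken from the slides.
size_mb = index_size_bytes(1_000_000_000) / 1e6

# 900 GB built in 90 minutes on 45 nodes, as stated in the text.
cluster, per_node = build_throughput(900, 90, 45)

print(f"hash metadata for 1e9 keys: {size_mb} MB")
print(f"build rate: {cluster:.1f} MB/s overall, {per_node:.2f} MB/s per node")
```

Under these assumptions the hash metadata for a billion keys is a few hundred megabytes, which is small enough to keep in memory on a serving node.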

Pinterest

Pinterest’s engineering team has created a self-serve platform, composed of homegrown, open-source and commercial tools, that hooks into Hadoop to orchestrate and process all of the company’s data, which covers 30 billion pins.

The team relies heavily on MapReduce to process this data, all of which is currently stored in the AWS cloud.

Instagram

Instagram operates one of the world’s largest Hadoop clusters and very likely the world’s largest MySQL implementation. It has developed everything from a PHP-optimized platform to a NoSQL database to a tool for auto-provisioning and configuring the tens of thousands of servers where all the images are stored and processed.

StumbleUpon

StumbleUpon uses Hadoop to make the best recommendations. Every thumb-up, stumble and share (among other feedback from users) is stored so that the service can make better decisions about which pages to show you next.

StumbleUpon uses HBase extensively because of its cost-effectiveness, fast data retrieval and dependability.

How Are Insights Derived from BIG Data?

Say Hello to Hadoop Development :)

Introduction to Hadoop

Apache Hadoop is a framework that allows the distributed processing of large data sets across clusters of commodity computers using a simple programming model.

It is an open-source data management technology that combines large-scale storage with distributed processing.
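
The "simple programming model" referred to above is MapReduce. The following is a minimal single-process sketch in Python of its three stages (map, shuffle, reduce) applied to a word count. It does not use Hadoop itself: the function names and the sample input are hypothetical, and in a real cluster the framework distributes the map and reduce calls across machines and performs the shuffle over the network.

```python
from collections import defaultdict


def map_phase(record):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: combine all values seen for one key."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence on an in-memory input."""
    mapped = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Hypothetical input lines standing in for files stored in HDFS.
lines = ["like share like", "share comment"]
counts = run_job(lines)
print(counts)
```

The developer writes only the mapper and the reducer. Because each call depends on nothing but its own input, the framework is free to run many of them in parallel on commodity machines.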

Hadoop Characteristics

• Flexible

• Reliable

• Economical

• Scalable

Why Should You Learn Hadoop?

• Hadoop is a modern Big Data processing framework that runs on distributed computing clusters.

• It can store, manage and process unstructured data at scale.

• Because of its scalability and effectiveness, companies are adopting Hadoop widely.

• It has become the standard for storing, processing and analyzing big data.

• Big Data and Hadoop professionals are in extremely high demand.

• Learning Hadoop is a first step toward becoming a data scientist or data architect.

Reports – 2015

Here are some other reports that convey the same message:

• “It is expected that Hadoop and Big Data Analytics will grow to around $13.9 billion from 2012 to 2017.” (Markets and Markets Research Report)

• “The Data market currently with the fastest growth are Hadoop and NoSQL software and services.” (Technology Research Organization, Wikibon)

• “The Big Data market will likely hit $23.8 billion with a 31.7% rise per year.” (IDC Report)

• “Almost 90% of organizations have embarked on Hadoop-related projects, and thus Hadoop skills are in huge demand.” (Big Data Executive Survey 2013)

• “Companies hiring Hadoop professionals Analytics professionals obtain a 250 percent hike in their salaries.” (According to Analytics Industry Report)

Gartner had predicted, “Hadoop will be in most advanced analytics products by 2015.” By and large, this prediction has turned out to be true. Hadoop is “The Technology” to harness Big Data.

Predictions for 2015

According to the Forrester Report:

• A new data economy, Hadooponomics, will arise. Because of Hadoop's data-handling strengths, enterprise adoption will become compulsory: companies that were unsure about adopting Hadoop through 2014 are bound to adopt it in 2015.

• The surge in demand for Hadoop professionals will be met by Java professionals and other software engineers who upgrade their skills.

• The creation of SQL-on-Hadoop options will further enhance usability and adoption.

• With the introduction of YARN (Hadoop 2.0), the usage of Hadoop will go beyond analytics. It will become an application platform, giving birth to new frameworks in the Hadoop ecosystem.

• New Hadoop sources will emerge from organizations like Oracle, HP, Tibco and SAP. Vendors such as Red Hat, VMware and Microsoft will include Hadoop in their operating systems.

Why SkillSpeed?

• Course curriculum from industry experts

• Instructor-led live virtual sessions

• Lifetime access to course content via LMS

• 100% placement assistance

• 24x7 support

Course Topics

Module 1: Introduction to Big Data and Hadoop

Module 2: HDFS Internals, Hadoop Configurations and Data Loading

Module 3: Introduction to MapReduce

Module 4: Advanced MapReduce Concepts

Module 5: Introduction to Pig

Module 6: Advanced Pig and Introduction to Hive

Module 7: Advanced Hive Concepts

Module 8: Extending Hive and HBase Introduction

Module 9: Advanced HBase and Oozie Introduction

Module 10: Project Set-up Discussion

Contact Us

Lines open 24/7

To know more about the course, please contact:

IND: +91-90660-20904

USA: 1866-607-6547 (Toll Free)

Or reach us at [email protected]
