Facebook[The Nuts and Bolts Technology]

Post on 27-Jan-2017


DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING
MADANAPALLE INSTITUTE OF TECHNOLOGY AND SCIENCE

(UGC-AUTONOMOUS)

A Seminar Presentation On

FACEBOOK[The Nuts and Bolts – Technology]

By M. Koushik Reddy (12691A0546)

Under the guidance of N. Sudhakar Yadav, M.Tech, Asst. Professor

Contents
• Introduction
• Languages
• Databases
• Software and technology

So what's all the hype? What exactly is Facebook®?

• Facebook® is a “social networking website”

• Facebook® is a free service that allows you to create an online page to connect with friends and family, or to make new friends with anyone, anywhere.

• On your Facebook® page you can share pictures, personal information, messages, and videos, join groups, and add applications.

Introduction

• Here are a few factoids to give you an idea of the scaling challenge that Facebook has to deal with:
• Facebook serves 570 billion page views per month (according to Google Ad Planner).
• There are more photos on Facebook than on all other photo sites combined (including sites like Flickr).
• More than 3 billion photos are uploaded every month.
• Facebook’s systems serve 1.2 million photos per second.
• More than 25 billion pieces of content (status updates, comments, etc.) are shared every month.
• Facebook has more than 30,000 servers (and this number is from last year!).
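To put the monthly figures above on a per-second scale, here is a rough back-of-the-envelope sketch. It assumes a 30-day month (an assumption for illustration, since the slide does not say how a month was measured), and the names used are hypothetical.

```python
# Back-of-the-envelope conversion of the monthly figures quoted above
# into average per-second rates. Assumption: a 30-day month.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 2,592,000 seconds

# Monthly totals as quoted on the slide.
monthly_figures = {
    "page views": 570e9,
    "photo uploads": 3e9,
    "shared content items": 25e9,
}


def per_second(monthly_total: float) -> float:
    """Average rate per second for a monthly total."""
    return monthly_total / SECONDS_PER_MONTH


if __name__ == "__main__":
    for name, total in monthly_figures.items():
        print(f"{name}: about {per_second(total):,.0f} per second")
```

These are averages only; real traffic has peaks well above the mean, which is part of the scaling challenge.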

Languages:
Front End (client side):
- JavaScript
Back End (server side):
- Hack, PHP (HHVM)
- C++, Java
- Python, Erlang
- D, XHP
- Haskell

• JavaScript: (Front End)

- It is a high-level, dynamic, untyped, and interpreted programming language.

- It is supported by all modern web browsers without plug-ins.

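Sample code:
    // Check whether the visitor is already logged in through Facebook.
    FB.getLoginStatus(function(response) {
      if (response.status === 'connected') {
        console.log('Logged in.');
      } else {
        // Not connected yet, so open the login dialog.
        FB.login();
      }
    });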

Back End: (Server side)

Hack: Hack is a programming language for the HipHop Virtual Machine (HHVM), created by Facebook as a dialect of PHP.
• It is open source, licensed under the BSD License.
• Hack allows programmers to use both dynamic typing and static typing.
• Introduced on March 20, 2014.

Sample code:
    <?hh
    echo 'Hello World';

• An important point: unlike PHP, Hack and HTML code do not mix. Normally you can mix PHP and HTML code together in the same file.

PHP vs. Hack

• Hack is a dialect of PHP, and both can be served behind Apache.
• Hack adds functionality and features to PHP and helps clean up some of its inconsistencies.

Erlang: It is a general-purpose, concurrent, garbage-collected programming language and runtime system.
• It was originally designed by Ericsson.
• It supports hot swapping, so code can be changed without stopping a system.
• It provides language-level features for creating and managing processes, with the aim of simplifying concurrent programming.
• All concurrency is explicit in Erlang; processes communicate using message passing instead of shared variables, which removes the need for explicit locks.
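The message-passing model described above can be imitated in Python, purely as an analogy. In this sketch each "process" is a thread that owns its own state and receives messages through a queue (its mailbox) instead of sharing variables. This is not how Erlang itself is implemented, and the names `counter_process`, `mailbox`, and `run_demo` are made up for illustration.

```python
import queue
import threading


def counter_process(mailbox: queue.Queue, replies: queue.Queue) -> None:
    """Owns its own counter; the only way to affect it is to send a message."""
    total = 0
    while True:
        message = mailbox.get()
        if message == "stop":
            replies.put(total)  # report the final state and exit
            return
        # No explicit lock in our code: only this thread ever touches total.
        total += message


def run_demo() -> int:
    """Send the numbers 1..10 to the counter and return its final total."""
    mailbox: queue.Queue = queue.Queue()
    replies: queue.Queue = queue.Queue()
    worker = threading.Thread(target=counter_process, args=(mailbox, replies))
    worker.start()

    for value in range(1, 11):
        mailbox.put(value)
    mailbox.put("stop")

    worker.join()
    return replies.get()
```

The sender never reads or writes `total` directly, which is the property the slide attributes to Erlang processes.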

Sample code: an Erlang function that uses recursion to count to ten

    -module(count_to_ten).
    -export([count_to_ten/0]).

    count_to_ten() -> do_count(0).

    do_count(10) -> 10;
    do_count(Value) -> do_count(Value + 1).

Erlang at Facebook: it is used mainly in Facebook Chat.

System overview: User Interface (chat in the browser)
• Channel (Erlang): message queuing and delivery
    o Queues messages in each user’s “channel”
    o Delivers messages as responses to long-polling HTTP requests
• Presence (C++): aggregates online info in memory (pull-based presence)
• Chatlogger (C++): stores conversations between page loads
• Web tier (PHP): serves vanilla web requests
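The Channel component above queues messages per user and hands them out as responses to long-polling requests. A minimal in-process sketch of that idea follows. The class `ChannelServer` and its methods are hypothetical, and a blocking `queue.Queue.get` with a timeout stands in for an HTTP request that the server holds open; the real system is written in Erlang and is not shown in the slides.

```python
import queue
from collections import defaultdict
from typing import Dict, List


class ChannelServer:
    """Hypothetical in-memory stand-in for the per-user channel queues."""

    def __init__(self) -> None:
        # One queue ("channel") per user ID, created on first use.
        self._channels: Dict[str, queue.Queue] = defaultdict(queue.Queue)

    def send(self, to_user: str, text: str) -> None:
        """Queue a message in the recipient's channel."""
        self._channels[to_user].put(text)

    def long_poll(self, user: str, timeout: float = 30.0) -> List[str]:
        """Block until a message arrives or the timeout expires.

        Returns every queued message, or an empty list on timeout, after
        which a real client would immediately issue the next request.
        """
        channel = self._channels[user]
        try:
            messages = [channel.get(timeout=timeout)]
        except queue.Empty:
            return []
        # Drain anything else that is already waiting.
        while True:
            try:
                messages.append(channel.get_nowait())
            except queue.Empty:
                return messages
```

Because the request waits on the server side, a message can be delivered as soon as it is queued, without the browser polling in a tight loop.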

Haskell: Haskell is a standardized, general-purpose, purely functional programming language with non-strict semantics and strong static typing.

Sample code: "Hello, World!" program

    module Main where

    main :: IO ()
    main = putStrLn "Hello, World!"

Haskell at Facebook: fighting spam with Sigma

One of the weapons in the fight against spam, malware, and other abuse on Facebook is a system called Sigma.
• Its job is to proactively identify malicious actions on Facebook, such as spam, phishing attacks, and posting links to malware.
• Bad content detected by Sigma is removed automatically so that it doesn't show up in your News Feed.
• Sigma is a rule engine, which means it runs a set of rules, called policies.
• These policies make it possible to identify and block malicious interactions before they affect people on Facebook.
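As a toy illustration of what "a rule engine that runs a set of policies" means, the sketch below checks an action against a list of policy functions. Everything in it is invented for the example: the `Action` fields, the two policies, the blocklist, and the threshold. Sigma's actual policies are not shown in the slides.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """A hypothetical user action to be checked before it is published."""
    user_id: int
    text: str
    links_posted_last_hour: int


# A policy inspects an action and returns True when it looks malicious.
Policy = Callable[[Action], bool]

BLOCKED_DOMAINS = {"malware.example", "phish.example"}  # invented blocklist


def links_to_blocked_domain(action: Action) -> bool:
    return any(domain in action.text for domain in BLOCKED_DOMAINS)


def posts_links_too_fast(action: Action) -> bool:
    return action.links_posted_last_hour > 50  # invented threshold


POLICIES: List[Policy] = [links_to_blocked_domain, posts_links_too_fast]


def violated_policies(action: Action) -> List[str]:
    """Run every policy and return the names of those that fired."""
    return [policy.__name__ for policy in POLICIES if policy(action)]


def should_block(action: Action) -> bool:
    """Block the action if at least one policy fired."""
    return bool(violated_policies(action))
```

Keeping policies as small independent functions means a new rule can be added to the list without touching the engine that runs them.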

Why Haskell in Sigma?
• Haskell replaced FXL (Feature eXtraction Language), the language previously used to write Sigma's policies.

Reasons for the replacement:
1. Purely functional and strongly typed.
2. Push code changes to production in minutes.
3. Performance.
4. Support for interactive development.

Database

Which databases does Facebook actually use?
• About a billion people use Facebook, which stores every transaction for 800 million users and handles more than 60 million queries per second.
• Users interact with their peers and friends through wall posts, photo uploads, and sharing details about events and other meaningful information.
• Facebook uses several database technologies.

Databases used at Facebook:
• MySQL
• HBase
• Cassandra

MySQL: Facebook primarily uses MySQL for structured data storage such as wall posts, user information, and the timeline.
• This data is replicated between its various data centers.

Facebook Database Design:

HBase: an open-source, non-relational, distributed database modeled after Google's Bigtable and written in Java.
• It is developed as part of the Apache Software Foundation's Apache Hadoop project.
• It runs on top of HDFS (Hadoop Distributed File System), providing Bigtable-like capabilities for Hadoop.
• HBase now serves several data-driven websites, including Facebook's Messaging Platform.

HBase Architecture:
• In HBase, tables are split into regions, which are served by the region servers.
• Regions are vertically divided by column families into “Stores”.
• Stores are saved as files in HDFS.
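To make "tables are split into regions" concrete, here is a small sketch of routing a row key to the region that holds it. It assumes each region covers a contiguous, sorted range of row keys identified by its start key; the slide does not spell this out, and the keys, server names, and function name are hypothetical.

```python
import bisect
from typing import List, Tuple

# Each region is (start_key, region_server); "" marks the start of the
# first region. Keys and server names are hypothetical.
REGIONS: List[Tuple[str, str]] = [
    ("", "regionserver-1"),
    ("g", "regionserver-2"),
    ("n", "regionserver-3"),
    ("t", "regionserver-1"),
]
START_KEYS = [start for start, _ in REGIONS]


def locate_region(row_key: str) -> Tuple[str, str]:
    """Return the (start_key, server) of the region containing row_key."""
    # bisect_right finds the first start key greater than row_key;
    # the region just before that position contains the key.
    index = bisect.bisect_right(START_KEYS, row_key) - 1
    return REGIONS[index]
```

For example, `locate_region("mallory")` falls in the region that starts at `"g"`, because `"g" <= "mallory" < "n"`.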

The Master Server:
• Assigns regions to the region servers, with the help of Apache ZooKeeper.
• Handles load balancing of the regions across region servers: it unloads busy servers and shifts regions to less occupied servers.
• Is responsible for operations such as the creation of tables and column families.

Regions:
• Regions are tables that are split up and spread across the region servers.

ZooKeeper:
• ZooKeeper is an open-source project that provides services such as maintaining configuration information, naming, and distributed synchronization.
• Clients use ZooKeeper to locate the region servers they need to communicate with.
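The master's load-balancing role described above can be sketched as repeatedly moving one region from the busiest server to the least busy one. This is a deliberate simplification that only counts regions per server; it is not HBase's balancer, and the function and server names are hypothetical.

```python
from typing import Dict, List


def rebalance(assignments: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Move regions from the busiest server to the least busy one until
    the region counts differ by at most one. Returns a new mapping and
    leaves the input untouched."""
    balanced = {server: list(regions) for server, regions in assignments.items()}
    if not balanced:
        return balanced
    while True:
        busiest = max(balanced, key=lambda s: len(balanced[s]))
        idlest = min(balanced, key=lambda s: len(balanced[s]))
        if len(balanced[busiest]) - len(balanced[idlest]) <= 1:
            return balanced
        # Shift one region from the busy server to the less occupied one.
        balanced[idlest].append(balanced[busiest].pop())
```

A real balancer would weigh more than region counts, for instance request load or data locality, but the move-from-busy-to-idle loop is the core idea on the slide.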

HBase in Facebook Messaging

Messaging data:
• Small/medium-sized data: HBase
    o Search index
    o Small message bodies
• Attachments and large messages: Haystack
    o Used for the existing photo/video store

Write path in HBase:
• In HBase, messages are stored in files (HFiles); they are appended directly to HDFS.

Read path:
• Similarly, messages can be read directly from the HFiles.

Software And Techniques

The Front End:
• Linux & Apache
• Memcache
• Haystack
• BigPipe

The Back End:
• Thrift (protocol)
• Scribe (log server)
• HipHop for PHP

Linux & Apache: Linux is a Unix-like computer operating system kernel.
• It’s open source, very customizable, and good for security.
• Facebook runs Apache HTTP Server on the Linux operating system.
• Apache is also free and is the most popular open source web server in use.

Memcache:
• Facebook makes heavy use of Memcached, a memory caching system that speeds up dynamic, database-driven websites by caching data and objects in RAM to reduce read time.
• Having a caching system allows Facebook to be as fast as it is at recalling your data.
• When the data is cached, Facebook doesn’t have to go to the database; it fetches your data from the cache based on your user ID.
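The lookup pattern described above (check the cache first, fall back to the database on a miss) can be sketched as follows. Plain dictionaries stand in for both Memcached and the database, and all names, keys, and records are hypothetical; a real deployment would talk to a Memcached server through a client library.

```python
from typing import Dict

# Stand-ins for the real systems: one dict plays the role of the database
# and another plays the role of Memcached. All names are hypothetical.
DATABASE: Dict[int, dict] = {42: {"name": "Alice"}, 43: {"name": "Bob"}}
cache: Dict[str, dict] = {}
stats = {"database_reads": 0}


def load_from_database(user_id: int) -> dict:
    """Simulate the slow path and count how often it is taken."""
    stats["database_reads"] += 1
    return DATABASE[user_id]


def get_user(user_id: int) -> dict:
    """Return the user record, consulting the cache first."""
    key = f"user:{user_id}"
    if key in cache:
        return cache[key]  # cache hit: no database access
    record = load_from_database(user_id)  # cache miss: read the database
    cache[key] = record  # keep it in RAM for the next request
    return record
```

Calling `get_user(42)` twice reads the database only once; the second call is served from memory. This sketch leaves out expiry and invalidation, which a real cache needs when the underlying data changes.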

Facebook Photos and Haystack:
• The Photos application is one of Facebook’s most popular features.
• Users have uploaded over 15 billion photos, which makes Facebook the biggest photo-sharing website.
• For each uploaded photo, Facebook generates and stores four images of different sizes, which translates to a total of 60 billion images and 1.5 PB of storage.
• The current growth rate is 220 million new photos per week, which translates to 25 TB of additional storage consumed weekly.
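The storage figures above can be cross-checked with a little arithmetic. The sketch below derives the image count and the implied average size per stored image. It assumes decimal units (1 PB = 10^15 bytes, 1 TB = 10^12 bytes), which the slide does not specify, and the variable names are made up.

```python
# Figures quoted on the slide.
photos_uploaded = 15e9
sizes_per_photo = 4
total_storage_bytes = 1.5e15        # 1.5 PB, assuming decimal units
new_photos_per_week = 220e6
new_storage_per_week_bytes = 25e12  # 25 TB, assuming decimal units

# Four stored images per uploaded photo.
total_images = photos_uploaded * sizes_per_photo
average_image_bytes = total_storage_bytes / total_images

# The same calculation for the weekly growth figures.
new_images_per_week = new_photos_per_week * sizes_per_photo
average_new_image_bytes = new_storage_per_week_bytes / new_images_per_week

if __name__ == "__main__":
    print(f"stored images: {total_images:,.0f}")
    print(f"average stored image: {average_image_bytes / 1e3:.1f} KB")
    print(f"new images per week: {new_images_per_week:,.0f}")
    print(f"average new image: {average_new_image_bytes / 1e3:.1f} KB")
```

Both calculations land in the same range of a few tens of kilobytes per image, so the total and the weekly growth figures are consistent with each other.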

Haystack at Facebook
• Haystack is Facebook’s high-performance photo storage/retrieval system.
• It is a highly scalable object store used to serve Facebook’s immense number of photos.
• It implements an HTTP-based photo server that stores photos in a generic object store called Haystack.

BigPipe: a dynamic web page serving system developed by Facebook.
• BigPipe is a fundamental redesign of the dynamic web page serving system.
• BigPipe breaks the page generation process into several stages.
• The first three stages are executed by the web server, and the last four stages are executed by the browser.
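The idea of splitting page generation into stages can be sketched with a generator that sends the page in chunks, so the browser can start working before the server has finished. This is only an illustration under assumptions: it models independent page sections (called pagelets in Facebook's description of BigPipe, a term the slide does not use), it does not reproduce the seven stages, and `render_page` and the client-side `fill` function are hypothetical names.

```python
from typing import Callable, Dict, Iterator


def render_page(pagelets: Dict[str, Callable[[], str]]) -> Iterator[str]:
    """Yield the page in chunks so the browser can start work early."""
    # 1. Send the page skeleton with an empty placeholder per pagelet.
    placeholders = "".join(f'<div id="{name}"></div>' for name in pagelets)
    yield f"<html><body>{placeholders}"
    # 2. Generate each pagelet and flush it as soon as it is ready.
    #    fill() is a hypothetical browser-side function that would insert
    #    the content into the matching placeholder.
    for name, generate in pagelets.items():
        yield f'<script>fill("{name}", "{generate()}")</script>'
    # 3. Close the document once every pagelet has been sent.
    yield "</body></html>"
```

A web framework that supports streaming responses could send each yielded chunk to the browser immediately, letting the server build the next section while the browser renders the previous one.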

Questions?
Hope you enjoyed this presentation…