Date post: | 14-Jul-2016 |
Upload: | koushik-muthakana |
FACEBOOK [The Nuts and Bolts – Technology]
TABLE OF CONTENTS
Certificate
Abstract
CHAPTER 1: INTRODUCTION
1.01. Social Network
1.02. History
1.03. FACEBOOK
CHAPTER 2: PROGRAMMING LANGUAGES
2.01. Programming Languages
2.02. Hack (SERVER SIDE)
2.03. Erlang (SERVER SIDE)
2.04. Haskell (SERVER SIDE)
CHAPTER 3: DATABASE
3.01. DATABASE
3.02. MySQL
3.03. HBASE
3.04. HBase in Facebook Messaging
CHAPTER 4: CONCLUSION
REFERENCES
CHAPTER 1
INTRODUCTION
1.01. Social Network:
The social network is a theoretical construct useful in the social sciences to study relationships
between individuals, groups, organizations, or even entire societies. The term is used to describe
a social structure determined by such interactions.
Social networking is the practice of expanding the number of one's business and/or social
contacts by making connections through individuals. Depending on the social media platform,
members may be able to contact any other member. In other cases, members can contact anyone
they have a connection to, and subsequently anyone that contact has a connection to, and so on.
Some services require members to have a preexisting connection to contact other members.
Social media sites include Facebook, Twitter, LinkedIn, and Google+.
Social Networking:
Today, social networking is an essential part of life for people from around the world. Social
networking is a form of social media, used for interactive, educational, informational, or
entertaining purposes. Social media comes in many forms, but all of them are related: blogs,
forums, podcasts, photo sharing, social bookmarking, widgets, and video, just to name a few.
Today, social networking websites allow users to make profiles, upload photos and videos, and
interact with friends and family. Social networking is a tool to join groups, learn about the latest
news and events, play games, chat and share music and video. The top social networking sites
of today are: MySpace, Facebook and Twitter.
Communication has been instrumental to the growth of social networking. With the advent of the
Internet and the cell phone, a lot of social interaction is captured through email and
instant messaging.
1.02. History:
Social networking was born one day in 1971, when the first email was sent. The two computers
were sitting right next to each other. The message said “qwertyuiop”.
In 1978, the BBS, or Bulletin Board System, was created. A BBS was hosted on a personal
computer; users dialed in through the host computer's modem and exchanged data with other
users over phone lines. The BBS was one of the first systems that let users interact with one
another online. It was slow, and only one user could log in at a time, but it was a good start.
Shortly afterwards came Usenet, an early online bulletin board created by Jim Ellis and Tom
Truscott, where users posted news, articles and funny posts. Unlike the BBS and forums, Usenet
did not have a ‘central server’. This concept later inspired the ‘Groups’ features we know today,
such as Yahoo! Groups, Google Groups and Facebook Groups. The first version of instant
messaging appeared around 1988 and was called IRC, or Internet Relay Chat. IRC was Unix-based
then, and thus limited to a few people. IRC was used for communication, as well as link and file
sharing. Later, the earliest copies of web browsers were distributed via Usenet.
The First Social Networking Site
In 1994, the first social networking site, GeoCities, was created. GeoCities allowed users to
create and customize their own web sites, grouping them into different ‘cities’ based on the site’s
content.
1.03. FACEBOOK:
Facebook is a corporation and online social networking service headquartered in Menlo Park,
California, in the United States. Its website was launched on February 4, 2004, by Mark
Zuckerberg with his Harvard College roommates and fellow students Eduardo Saverin, Andrew
McCollum, Dustin Moskovitz and Chris Hughes.
Zuckerberg wrote the software for the Facemash website when he was in his second year of
college. The website was set up as a type of “hot or not” game for Harvard students. It
allowed visitors to compare two student pictures side by side and choose who was “hot”
and who was “not”.
Facebook is a free service that allows you to create an online page to connect with friends
and family, or to make new friends with anyone anywhere.
On your Facebook page you can share pictures, personal information, messages and videos, join
groups and add applications.
Here are a few factoids to give you an idea of the scaling challenge that Facebook has to
deal with:
• Facebook serves 570 billion page views per month (according to Google Ad Planner).
• There are more photos on Facebook than on all other photo sites combined (including sites
like Flickr).
• More than 3 billion photos are uploaded every month.
• Facebook’s systems serve 1.2 million photos per second.
• More than 25 billion pieces of content (status updates, comments, etc.) are shared every month.
• Facebook has more than 30,000 servers (and this number is from last year!).
Facebook Company Overview:
Facebook, Inc., incorporated on July 29, 2004, is a social networking company. The Company
builds engaging products that enable people to connect and share through mobile devices and
personal computers. The Company offers various services focused on people, marketers, and
developers. It offers an online and mobile-based platform for people to share their opinions, ideas,
photos and videos, and to engage in other activities. The Company’s products include Facebook,
Instagram, Messenger, and WhatsApp. As of December 31, 2014, the Company had 890 million
daily active users (DAUs). The Company’s subsidiaries include Andale, Inc., Facebook
Operations, LLC, Oculus VR, LLC and Parse, LLC, in Delaware; Edge Network Services
Limited, Facebook Ireland Holdings Limited and Facebook Ireland Limited, in Ireland; and
Pinnacle Sweden AB, in Sweden.
The Company’s Facebook mobile app and website enable people to connect, share, discover,
and communicate with each other on mobile devices and personal computers. Facebook is
available across the world. Its Facebook product had over 890 million daily active users (DAUs)
in December 2014. Over 745 million DAUs accessed Facebook from a mobile device
in December 2014.
CHAPTER 2
PROGRAMMING LANGUAGES
2.01. Programming Languages:
The most popular (i.e., the most visited) websites have in common that they are dynamic websites.
Their development typically involves server-side coding, client-side coding and database
technology.
The programming languages used to deliver such dynamic web content, however, vary vastly
between sites.
Facebook also uses many programming languages. On the front end (client side) it uses JavaScript,
and on the back end (server side) it uses Hack, PHP (HHVM), C++, Java, Python, Erlang, D, XHP
and Haskell.
2.1.1. JavaScript (Front End):
JavaScript is a high-level, dynamic, untyped and interpreted programming language. It has been
standardized in the ECMAScript language specification. Alongside HTML and CSS, it is one of
the three essential technologies of World Wide Web content production; the majority of websites
employ it and it is supported by all modern Web browsers without plug-ins.
JavaScript is prototype-based with first-class functions, making it a multi-paradigm language
that supports object-oriented, imperative and functional programming styles. It has an API for
working with text, arrays, dates and regular expressions, but does not include any I/O, such as
networking, storage, or graphics facilities, relying for these upon the host environment in which
it is embedded.
Despite the similar names, JavaScript and Java are unrelated and have very different semantics.
The syntax of JavaScript is actually derived from C, while the semantics and design are
influenced by the Self and Scheme programming languages. JavaScript is also used in
environments that are not Web-based, such as PDF documents, site-specific browsers, and
desktop widgets.
On the client side, JavaScript has been traditionally implemented as an interpreted language, but
more recent browsers perform just-in-time compilation. It is also used in game development, the
creation of desktop and mobile applications, and server-side network programming with runtime
environments such as Node.js.
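JavaScript example:
/* This script lets people enter by using a form that asks for a UserID and a
   password. The hard-coded values are for illustration only; a real site
   must check credentials on the server. */
function pasuser(form)
{
    if (form.id.value == "JavaScript")
    {
        if (form.pass.value == "Kit")
        {
            // Both values match: go to the next page.
            window.location.href = "page2.html";
        }
        else
        {
            alert("Invalid Password");
        }
    }
    else
    {
        alert("Invalid UserID");
    }
}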
2.02. Hack: (SERVER SIDE)
Hack is a programming language for the HipHop Virtual Machine (HHVM), created by Facebook
as a dialect of PHP. It is open source, licensed under the BSD License. Hack was introduced on
March 20, 2014. It allows programmers to use both dynamic typing and static typing, and it
interoperates seamlessly with existing PHP code running on HHVM. Hack adds functionality and
features to PHP and helps to clean up some of its inconsistencies.
Unlike PHP, Hack and HTML code do not mix. In PHP you can normally mix PHP and HTML
code together in the same file; in a Hack file you cannot.
Hack reconciles the fast development cycle of a dynamically typed language such as PHP with the
discipline provided by static typing, while adding many features commonly found in other
modern programming languages.
Hack provides instantaneous type checking by incrementally checking your files as you edit
them. It typically runs in less than 200 milliseconds, making it easy to integrate into your
development workflow without introducing a noticeable delay. One important language feature
is type annotations, which allow code to be explicitly typed on parameters, class member
variables and return values.
EXAMPLE (PHP mixed with HTML, which a Hack file does not allow):
<html>
<head>
<title>PHP Test</title>
</head>
<body>
<!-- hh and html do not mix -->
<?php echo '<p>Hello World</p>'; ?>
</body>
</html>
PHP VS HACK:
HACK IN FACEBOOK:
Hack can be thought of as a new version of PHP. It too runs on the HipHop Virtual Machine, but
it lets coders use both dynamic typing and static typing. This is what’s called gradual typing, and
until now, it has mostly been an academic exercise. Facebook, says Facebook engineer Bryan
O’Sullivan, is the first to bring gradual typing to a “real, industrial strength” language.
What this means is that Facebook was able to gradually replace its existing PHP code with Hack,
moving from the old dynamically typed system to a statically typed arrangement. “It allows you
to slide the dial yourself on the continuum between dynamic types and statics, so you can start
out with dynamically typed code and then gradually add more statically typed code, benefiting
from each little bit of work you do as you go along.”
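The idea of gradual typing can be illustrated with a minimal sketch. It uses Python's optional type hints rather than Hack, purely as an analogy, and the function names and sample data are hypothetical. The first function is fully dynamic; the second is the same code after annotations have been added, which is the kind of incremental step described above.

```python
from typing import Dict, List

# Step 1: dynamically typed code. There are no annotations, so a static
# type checker has nothing to verify here.
def total_likes_dynamic(posts):
    return sum(post["likes"] for post in posts)

# Step 2: the same logic with type annotations added. A static checker
# can now verify callers of this function, while the dynamic version
# above keeps working unchanged alongside it.
def total_likes_typed(posts: List[Dict[str, int]]) -> int:
    return sum(post["likes"] for post in posts)

# Hypothetical sample data used only for this illustration.
sample_posts = [{"likes": 3}, {"likes": 4}]
print(total_likes_dynamic(sample_posts), total_likes_typed(sample_posts))
```

In Python the annotations are checked by an external tool rather than at run time, so this sketch shows only how typed and untyped code can coexist; it does not reproduce how the Hack type checker works.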
2.03. Erlang (SERVER SIDE):
Erlang is a general-purpose, concurrent, garbage-collected programming language and runtime
system. It is particularly strong for building concurrent programs: it provides language-level
features for creating and managing processes, with the aim of simplifying concurrent
programming. It is widely used in well-known chat services such as WhatsApp and the initial
version of Facebook chat.
The initial version of Facebook chat was developed by Adam D'Angelo, an early Facebook
employee.
While Adam was in college he made a prototype of a chat website with Erlang because, according
to him, it seemed like the ideal language for the task and it was also a chance to have fun learning
a new language. Later Rebekah Cox, Ari Steinberg and Adam worked on the initial Facebook chat
prototype as part of a hackathon project in early 2007. Some of the Erlang code was pulled from
his personal college project. When they decided to make it an official project and productionize
the server code, the team decided to stick with Erlang and took over the code from there.
Example:
An Erlang function that uses recursion to count to ten
-module(count_to_ten).
-export([count_to_ten/0]).
count_to_ten() -> do_count(0).
do_count(10) -> 10;
do_count(Value) -> do_count(Value + 1).
Erlang in Facebook Chat:
Erlang was designed by Ericsson to handle massive loads under the very demanding conditions
of highly critical telecommunication network operations. As a result, the language is designed
from the bottom up to have no shared state and no locks. There are additional features such as
the ability to live-patch running code and to build "self-healing" networks.
WhatsApp, Facebook chat and the popular RabbitMQ message broker have all used Erlang, and
all except Facebook continue to use it. Facebook's reason for leaving Erlang is given as
"stability" issues, but considering that entire telecommunication networks are written in it, that
sounds very unlikely.
System overview:
• User interface: chat in the browser.
• Channel (Erlang): message queuing and delivery. Queues messages in each user’s “channel” and
delivers messages as responses to long-polling HTTP requests.
• Presence (C++): aggregates online info in memory (pull-based presence).
• Chat logger (C++): stores conversations between page loads.
• Web tier (PHP): serves our vanilla web requests.
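The Channel component's behavior (one message queue per user, drained by long-polling requests) can be sketched as follows. This is an illustrative Python model under stated assumptions, not Facebook's Erlang implementation; the class name `ChannelServer`, its methods and the timeout value are all hypothetical.

```python
import queue
from collections import defaultdict

class ChannelServer:
    """Hypothetical model: one in-memory message queue ("channel") per user."""

    def __init__(self):
        # A channel is created lazily the first time a user is mentioned.
        self._channels = defaultdict(queue.Queue)

    def send(self, to_user, message):
        """Queue a message in the recipient's channel."""
        self._channels[to_user].put(message)

    def long_poll(self, user, timeout=0.05):
        """Wait up to `timeout` seconds for a message, then return
        everything queued so far (an empty list if nothing arrived)."""
        channel = self._channels[user]
        try:
            first = channel.get(timeout=timeout)
        except queue.Empty:
            return []
        messages = [first]
        # Drain any further messages that are already waiting.
        while True:
            try:
                messages.append(channel.get_nowait())
            except queue.Empty:
                return messages

server = ChannelServer()
server.send("alice", "hi")
server.send("alice", "are you there?")
print(server.long_poll("alice"))
```

In a real deployment the browser would reissue the HTTP request as soon as each response arrives, so a user always has one pending request waiting on the channel.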
2.04. Haskell (SERVER SIDE):
Haskell is a general-purpose, purely functional programming language with non-strict semantics
and strong static typing. The latest standard of Haskell is Haskell 2010; as of February 2016, a
group was working on the next version.
Haskell features a type system with type inference and lazy evaluation. Type classes first
appeared in the Haskell programming language. Its main implementation is the Glasgow Haskell
Compiler. Haskell is based on the semantics, but not the syntax, of the Miranda programming
language, which served to focus the efforts of the initial Haskell working group. It is widely used
in academia and industry.
The Factorial Function in Haskell:
-- The definitions below are alternatives; keep only one of them in a module.
-- Type annotation (optional)
factorial :: (Integral a) => a -> a
-- Using recursion, with one guarded clause and one fallback clause
factorial n | n < 2 = 1
factorial n = n * factorial (n - 1)
-- Using recursion, with guards
factorial n
  | n < 2     = 1
  | otherwise = n * factorial (n - 1)
-- Using recursion but written without pattern matching
factorial n = if n > 0 then n * factorial (n-1) else 1
-- Using a list
factorial n = product [1..n]
-- Using fold (implements product)
factorial n = foldl (*) 1 [1..n]
-- Point-free style
factorial = foldr (*) 1 . enumFromTo 1
Haskell in Facebook:
Fighting spam with Haskell:
Sigma: One of the weapons in the fight against spam, malware, and other abuse on
Facebook is a system called Sigma. Its job is to proactively identify malicious actions on
Facebook, such as spam, phishing attacks, posting links to malware, etc. Bad content detected by
Sigma is removed automatically so that it doesn't show up in your News Feed. Sigma is a rule
engine, which means it runs a set of rules, called policies. These policies make it possible for us
to identify and block malicious interactions before they affect people on Facebook.
Why Haskell in Sigma?
Haskell replaced FXL (Feature eXtraction Language), the in-house language previously used to
write Sigma policies.
Reasons for the replacement:
1. Purely functional and strongly typed.
2. Push code changes to production in minutes.
3. Performance.
4. Support for interactive development.
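The notion of a rule engine that runs a set of policies can be made concrete with a minimal sketch. It is written in Python rather than Haskell, and the policy names, the shape of an action (a dictionary with a `text` field) and the domain `malware.example` are hypothetical illustrations, not real Sigma rules.

```python
from typing import Callable, Dict, List, NamedTuple

class Policy(NamedTuple):
    """A named rule: `matches` returns True when an action looks malicious."""
    name: str
    matches: Callable[[Dict[str, str]], bool]

# Hypothetical policies, for illustration only.
POLICIES = [
    Policy("known_bad_link", lambda a: "malware.example" in a.get("text", "")),
    Policy("too_many_links", lambda a: a.get("text", "").count("http") > 3),
]

def evaluate(action: Dict[str, str], policies: List[Policy] = POLICIES) -> List[str]:
    """Run every policy against one action and return the names that fired."""
    return [p.name for p in policies if p.matches(action)]

def should_block(action: Dict[str, str]) -> bool:
    """Block the action if at least one policy fired."""
    return bool(evaluate(action))

print(evaluate({"text": "check http://malware.example/free"}))
```

Because each policy here is a pure function of the action, policies can be added, removed or tested independently of one another, which hints at why a purely functional language is a natural fit for this kind of system.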
CHAPTER 3
DATABASE
3.01. DATABASE:
A database is an organized collection of data. It is the collection of schemas, tables, queries,
reports, views and other objects. The data are typically organized to model aspects of reality in a
way that supports processes requiring information, such as modeling the availability of rooms in
hotels in a way that supports finding a hotel with vacancies.
A database management system (DBMS) is a computer software application that interacts with
the user, other applications, and the database itself to capture and analyze data. A general-
purpose DBMS is designed to allow the definition, creation, querying, update, and administration
of databases. Well-known DBMSs include MySQL, PostgreSQL, Microsoft SQL
Server, Oracle, Sybase and IBM DB2. A database is not generally portable across different
DBMSs, but different DBMSs can interoperate by using standards such as SQL and ODBC or
JDBC to allow a single application to work with more than one DBMS. Database management
systems are often classified according to the database model that they support.
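The hotel example mentioned above can be sketched with a small relational schema. This is a minimal illustration using Python's built-in `sqlite3` module; the table names, column names and sample rows are assumptions made up for the example.

```python
import sqlite3

# An in-memory database is enough for a demonstration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hotel (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE room (
        id       INTEGER PRIMARY KEY,
        hotel_id INTEGER NOT NULL REFERENCES hotel(id),
        occupied INTEGER NOT NULL  -- 0 = vacant, 1 = occupied
    );
    INSERT INTO hotel VALUES (1, 'Seaside'), (2, 'Central');
    INSERT INTO room  VALUES (1, 1, 1), (2, 1, 0), (3, 2, 1);
""")

def hotels_with_vacancies(connection):
    """Return the names of hotels that have at least one vacant room."""
    rows = connection.execute("""
        SELECT DISTINCT hotel.name
        FROM hotel
        JOIN room ON room.hotel_id = hotel.id
        WHERE room.occupied = 0
        ORDER BY hotel.name
    """)
    return [name for (name,) in rows]

print(hotels_with_vacancies(conn))
```

The schema models the relevant aspect of reality (which rooms are occupied), and the query supports the process that needs the information (finding a hotel with vacancies).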
Which database does Facebook actually use?
Around a billion people use Facebook, interacting with their peers and friends through wall
posts, uploading photos, and passing on information about events and other meaningful
information. Storing every transaction for 800 million users and handling more than 60 million
queries per second requires several database technologies.
Databases used in Facebook:
• MySQL
• HBase
• Cassandra
3.02. MySQL:
MySQL is an open-source relational database management system (RDBMS). In July 2013, it
was the world's second most widely used RDBMS, and the most widely used open-source
client-server RDBMS. It is named after co-founder Michael Widenius's daughter, My. The SQL
acronym stands for Structured Query Language.
The MySQL development project has made its source code available under the terms of the
GNU General Public License, as well as under a variety of proprietary agreements. MySQL was
owned and sponsored by a single for-profit firm, the Swedish company MySQL AB, now owned
by Oracle Corporation. For proprietary use, several paid editions are available, and offer
additional functionality.
MySQL is written in C and C++. Its SQL parser is written in yacc, but it uses a home-brewed
lexical analyzer. MySQL works on many system platforms, including AIX, BSDi, FreeBSD,
HP-UX, eComStation, i5/OS, IRIX, Linux, OS X, Microsoft Windows, NetBSD, Novell NetWare,
OpenBSD, OpenSolaris, OS/2 Warp, QNX, Oracle Solaris, Symbian, SunOS, SCO OpenServer,
SCO UnixWare, Sanos and Tru64. A port of MySQL to OpenVMS also exists.
MYSQL IN FACEBOOK:
Facebook uses MySQL as its database because of its speed and reliability. MySQL is used
primarily as a key-value store, as data is randomly distributed amongst a large set of logical
instances. These logical instances are spread out across physical nodes, and load balancing is
done at the physical node level.
As far as customizations are concerned, Facebook has developed a custom partitioning scheme in
which a global ID is assigned to all data. They also have a custom archiving scheme that is based
on how frequently and how recently data is accessed on a per-user basis. Most data is distributed
randomly.
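The two-level layout described above (a global ID mapped to a logical instance, and logical instances mapped to physical nodes) can be sketched as follows. This is an assumption-laden illustration: the shard count, the node names and the use of a simple modulo instead of random placement are all hypothetical choices, not Facebook's actual scheme.

```python
NUM_LOGICAL_SHARDS = 8                     # hypothetical number of logical instances
PHYSICAL_NODES = ["db-a", "db-b", "db-c"]  # hypothetical physical node names

def logical_shard(global_id: int) -> int:
    """Map a global ID to a logical instance. A modulo is used here only
    to keep the sketch deterministic."""
    return global_id % NUM_LOGICAL_SHARDS

# Assignment of logical instances to physical nodes. Load balancing happens
# at this level: a shard is moved by changing one entry in this table.
SHARD_TO_NODE = {
    shard: PHYSICAL_NODES[shard % len(PHYSICAL_NODES)]
    for shard in range(NUM_LOGICAL_SHARDS)
}

def node_for(global_id: int) -> str:
    """Return the physical node that currently holds the given ID."""
    return SHARD_TO_NODE[logical_shard(global_id)]

def move_shard(shard: int, new_node: str) -> None:
    """Rebalance by moving a whole logical instance to another node."""
    SHARD_TO_NODE[shard] = new_node

print(node_for(10))
```

The point of the indirection is that an ID's logical instance never changes, so rebalancing only needs to update the shard-to-node table and never has to rewrite the IDs themselves.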
Facebook uses MySQL in Timeline. When you are storing every transaction for 800 million users
and handling more than 60 million queries per second, your database environment had better be
something special. Many readers might see these numbers and think NoSQL, but Facebook held
a Tech Talk explaining how it built a MySQL environment capable of handling everything the
company needs in terms of scale, performance, and availability.
The importance of its MySQL user database:
• MySQL handles pretty much every user interaction: likes, shares, status updates, alerts,
requests, etc.
• Facebook has 800 million users; 500 million of them visit the site daily.
• 350 million mobile users are constantly pushing and pulling status updates.
• 7 million applications and web sites are integrated into the Facebook platform.
• User data sets are made even larger by taking into account both scope and time.
Timeline is more concerned with organizing data neatly than with shooting out updates in real
time, so MySQL is well suited for the app. Although the data is aggregated in the same location
as it is kept (i.e. not over a network connection), that data is managed by MySQL, and
not by an alternative like NoSQL or Hadoop HBase.
Database structure in facebook:
3.03. HBASE:
HBase is an open-source, non-relational, distributed database modeled after Google's
BigTable and written in Java. It is developed as part of the Apache Software Foundation's
Apache Hadoop project and runs on top of HDFS (Hadoop Distributed File System), providing
BigTable-like capabilities for Hadoop. HBase now serves several data-driven websites,
including Facebook's Messaging Platform.
HBase features compression, in-memory operation, and Bloom filters on a per-column
basis as outlined in the original BigTable paper. Tables in HBase can serve as the input and
output for MapReduce jobs run in Hadoop, and may be accessed through the Java API but also
through REST, Avro or Thrift gateway APIs. HBase is a column-oriented key-value data store
and has become widely popular because of its lineage with Hadoop and HDFS. It is well suited
for fast read and write operations on large datasets with high throughput and low input/output
latency.
Hbase Architecture:
In HBase, tables are split into regions, which are served by the region servers. Regions are vertically divided by column families into “Stores”. Stores are saved as files in HDFS.
The Master Server:
• Assigns regions to the region servers, with the help of Apache ZooKeeper.
• Handles load balancing of the regions across region servers: it unloads busy servers and shifts regions to less occupied servers.
• Is responsible for operations such as the creation of tables and column families.
Regions: Regions are tables that are split up and spread across the region servers.
ZooKeeper: ZooKeeper is an open-source project that provides services such as maintaining configuration information, naming, and providing distributed synchronization. Clients use ZooKeeper to locate the region servers they need to talk to.
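How a row key is routed to a region, and from there to a region server, can be illustrated with a short sketch. In HBase each region covers a contiguous range of row keys; the start keys, server names and helper functions below are hypothetical, and the lookup is a simplified Python model rather than the HBase client API.

```python
import bisect

# Hypothetical table split into four regions. Region i covers the row keys
# from REGION_START_KEYS[i] (inclusive) up to the next start key (exclusive);
# the last region is open-ended.
REGION_START_KEYS = ["", "g", "n", "t"]

# Hypothetical assignment of regions to region servers, as the Master Server
# might have made it. One server can serve several regions.
REGION_SERVERS = ["rs1", "rs2", "rs1", "rs3"]

def region_index(row_key: str) -> int:
    """Find the region whose key range contains row_key (binary search)."""
    return bisect.bisect_right(REGION_START_KEYS, row_key) - 1

def server_for(row_key: str) -> str:
    """Return the region server currently responsible for row_key."""
    return REGION_SERVERS[region_index(row_key)]

for key in ["alice", "mike", "zoe"]:
    print(key, "-> region", region_index(key), "on", server_for(key))
```

Load balancing in this model amounts to editing `REGION_SERVERS`: moving a region to a less occupied server changes where requests go without changing which keys the region holds.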
3.04. HBase in Facebook Messaging:
Messaging data:
• Small/medium-sized data: HBase (search index, small message bodies).
• Attachments and large messages: Haystack (already used for Facebook's existing photo/video store).
CHAPTER 4
CONCLUSION
Facebook is a social networking website: a free service that allows you to create
an online page to connect with friends and family, or to make new friends with anyone anywhere.
On your Facebook page you can share pictures, personal information, messages and videos, join
groups and add applications. It uses many programming languages, databases and software
techniques.
It serves many people's requests very quickly and efficiently. The Facebook service focuses on
building online communities of people who share interests and activities, or who are interested in
exploring the interests and activities of others.
The popularity of Facebook has increased drastically, and it is the most popular social networking
service of all time.
References:
1. http://www.makeuseof.com/tag/facebook-work-nuts-bolts-technology-explained/
2. https://gigaom.com/2011/12/06/facebook-shares-some-secrets-on-making-mysql-scale/
3. https://en.wikipedia.org/wiki/Programming_languages_used_in_most_popular_websites
4. https://www.facebook.com/CCodingTips/posts/441715235849774
5. https://www.facebook.com/notes/facebook-engineering/mysql-and-database-engineering-mark-callaghan/10150599729938920/
6. http://www.theregister.co.uk/2013/06/27/facebook_tao/