
Netlog: What we learned about scalability & high availability

Date post: 08-Sep-2014
Upload: folke-lemaitre
Description:
Talk I did @ http://www.kingsofcode.nl about the things we learned over the last year about making http://www.netlog.com scalable and delivering high performance to our users...
Page 1: Netlog: What we learned about scalability & high availability

What we learned about scalability & high availability

Folke Lemaitre, Director of Development

http://nl.netlog.com/folke

27 May 2008

Page 2: Netlog: What we learned about scalability & high availability

Overview

‣ What is Netlog?
‣ Translations
‣ Network topology
‣ Scaling Databases
‣ Caching
‣ Search
‣ Q&A

Page 3: Netlog: What we learned about scalability & high availability

What is Netlog?

Page 4: Netlog: What we learned about scalability & high availability

Social Network

‣ Create your own profile

‣ Discover your friendsʼ activity

‣ Communicate

‣ Explore new content

‣ Applications

Page 5: Netlog: What we learned about scalability & high availability

Your Profile

Page 6: Netlog: What we learned about scalability & high availability

What: itʼs personal

‣ You rule: itʼs yours

[Diagram: YOU at the centre, surrounded by Videos, Blogs, Relations, Photos, People, Games and Music, with links to other users]

Page 7: Netlog: What we learned about scalability & high availability

Friend Activity

‣ Share & discover friendsʼ activity

Toon Coppens uploads a new photo

Jaak Noukens and Jo are now friends

Mari . comments on her photo

Pinguke V changes her profile photo

Stijn Symons uploads a new photo

Jan Maarten Willems signs the guestbook of nico b

Kenny Gryp signs the guestbook of Lorenz Bogaert

Page 8: Netlog: What we learned about scalability & high availability

Communication: Shouts

Page 9: Netlog: What we learned about scalability & high availability

Communication: Ratings & Comments

Page 10: Netlog: What we learned about scalability & high availability

Communication: Private messaging

Page 11: Netlog: What we learned about scalability & high availability

Communication: Chat

Page 12: Netlog: What we learned about scalability & high availability

Communication: Clans

Page 13: Netlog: What we learned about scalability & high availability

Explore

Profiles

Videos

Photos

Clans

Events

Blogs

Pages

Music

Applications

Page 14: Netlog: What we learned about scalability & high availability

Applications

‣ OpenSocial
  • sandbox: http://nl.netlog.com/go/developer/opensocial/sandbox=1

‣ Officially announced tomorrow @ Google I/O
  • Stay tuned!

‣ Public launch planned for June

Page 15: Netlog: What we learned about scalability & high availability

Developer Pages

http://nl.netlog.com/go/developer

Page 16: Netlog: What we learned about scalability & high availability

Itʼs going pretty good

‣ More than 35,000,000 unique members
‣ More than 4,000,000,000 pageviews/month
‣ 19 languages and more coming up
‣ More than 20 countries
‣ Current Alexa Top-100 ranking (most visited web sites in the world)
‣ Current ComScore Europe Top-10 ranking

Page 17: Netlog: What we learned about scalability & high availability

Itʼs going pretty good

[Chart: Monthly Unique Visitors, January 2007 to April 2008, axis from 0 to 40,000,000]

[Pie chart by region: Western Europe 46%, Southern Europe 22%, Western Asia 16%, Americas 10%, Northern Europe 3%, Eastern Europe 3%]

[Chart: Monthly Visits, January 2007 to April 2008, axis from 0 to 200,000,000]

[Chart: Monthly Page Requests, January 2007 to April 2008, axis from 0 to 5,000,000,000]

Page 18: Netlog: What we learned about scalability & high availability

Itʼs going pretty good

Page 19: Netlog: What we learned about scalability & high availability

Translations

Page 20: Netlog: What we learned about scalability & high availability

19 languages and a lot more coming!

Català

中文

česky

Dansk

Nederlands

English

Eesti

suomi

français

Deutsch

Italiano

Lietuvių kalba

Norsk (bokmål)

Polski

Português

Română

Русский

slovenščina

Español

Svenska

Türkçe

Afrikaans

български

Hrvatski

Magyar

Latviešu valoda

Slovenčina

Page 21: Netlog: What we learned about scalability & high availability

Translate Tool

Page 22: Netlog: What we learned about scalability & high availability

Template

Page 23: Netlog: What we learned about scalability & high availability

Parsed Template

Page 24: Netlog: What we learned about scalability & high availability

Translated Template

Page 25: Netlog: What we learned about scalability & high availability

Generated PHP code

Page 26: Netlog: What we learned about scalability & high availability

Template Code

Page 27: Netlog: What we learned about scalability & high availability

Template Output

Page 28: Netlog: What we learned about scalability & high availability

Network Topology

Page 29: Netlog: What we learned about scalability & high availability

Overview

[Diagram: network topology]

‣ Internet
‣ CDN
‣ Netlog Datacenters
  • Firewall
  • Web Load Balancer
  • Web Cluster
  • Memcache Pools: Session Cache, General Cache, Html Cache
  • Database Pools (each with one Master and two Slaves): Primary Pool, User Pool, Activity Pool, Friendships Pool, ...
  • Static Load Balancer
  • Storage Servers

Page 30: Netlog: What we learned about scalability & high availability

Web Servers

‣ Software
  • Apache 2
  • PHP 5.2.6
  • eAccelerator 0.9.5.2 for bytecode caching
  • Keepalived for high availability

‣ 200 servers
‣ 450 000 requests per second

Page 31: Netlog: What we learned about scalability & high availability

Database Servers

‣ MySQL Enterprise 4.1.22

‣ 200 database servers
‣ 40 thousand tables
‣ 70 billion records

‣ 60 thousand queries per second

Page 32: Netlog: What we learned about scalability & high availability

Memcache Servers

‣ Memcached 1.2.4

‣ 60 servers

‣ 250 thousand requests/second

‣ 450 GB of memory

Page 33: Netlog: What we learned about scalability & high availability

Static servers

‣ Software:
  • Lighttpd
  • Nginx

‣ Used for:
  • static files: css/javascript/images/...
  • user content: photos, videos

‣ Content Delivery Network: Akamai & Panther

Page 34: Netlog: What we learned about scalability & high availability

Other servers

‣ OpenSocial:
  • Shindig
  • Tomcat

‣ Search:
  • Sphinx

Page 35: Netlog: What we learned about scalability & high availability

Scaling Databases

Page 36: Netlog: What we learned about scalability & high availability

Database & Scalability

‣ Database pools

‣ Replication

‣ Partitioning

Page 37: Netlog: What we learned about scalability & high availability

Database Pools

‣ Different data on different database pools:
  • messaging
  • friendships
  • blogs
  • music
  • videos
  • ...

Page 38: Netlog: What we learned about scalability & high availability

Replication

‣ write to one master
‣ read from multiple slaves (and master)

‣ pros
  • easy to implement
  • read intensive applications scale very well

‣ cons
  • write intensive applications donʼt scale
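
The read/write split can be sketched in a few lines. This is only an illustration in Python, not Netlog's code: the class name `ReplicatedPool` and the host names are hypothetical, and replication lag between master and slaves is ignored.

```python
import random


class ReplicatedPool:
    """Routes writes to the master and reads to a slave or the master."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)

    def for_write(self):
        # Every write goes to the single master.
        return self.master

    def for_read(self, rng=random):
        # Reads can be served by any slave, or by the master itself.
        return rng.choice(self.slaves + [self.master])


# Hypothetical host names, for illustration only.
pool = ReplicatedPool("db-master", ["db-slave-1", "db-slave-2"])
```

Adding slaves spreads the reads, but every write still lands on the one master, which is why write intensive workloads do not scale this way.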

Page 39: Netlog: What we learned about scalability & high availability

Partitioning (sharding)

‣ Divide data on primary key:
  • all user data for users with id 1 - 10 in database1
  • all user data for users with id 11 - 20 in database2
  • ...

‣ Best scaling possible

‣ How?
  • managed in code
  • MySQL partitioning (available from version 5.1)
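
A minimal sketch of the "managed in code" option, using the id ranges from the slide. The function name `shard_for_user` and the constant `USERS_PER_SHARD` are assumptions for illustration; a real setup would map ranges to servers through a lookup table so shards can be moved.

```python
USERS_PER_SHARD = 10  # matches the example ranges above (1-10, 11-20, ...)


def shard_for_user(user_id: int) -> str:
    """Return the name of the database that holds all data for this user id."""
    if user_id < 1:
        raise ValueError("user ids start at 1")
    shard_number = (user_id - 1) // USERS_PER_SHARD + 1
    return f"database{shard_number}"
```

Because everything for one user lives on one shard, a profile page only needs to talk to a single database.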

Page 40: Netlog: What we learned about scalability & high availability

Analyse, analyse, analyse!

‣ Tag your queries
  • SELECT * FROM USER WHERE userid = 123 /* User::getUser():11 */

‣ Analyse mysql slow logs
‣ Analyse process lists
‣ Analyse based on tags
  • 1023 User::getUser():230
  • 512 User::isOnline():124
  • 10 Activities::getActivity():320

‣ cron job that runs every minute and checks for “too many connections”
  • if “too many connections”, log process list
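
Tagging and counting can be sketched as follows. This is a Python illustration with hypothetical helper names (`tag_query`, `count_tags`); the input list stands in for queries taken from a slow log or a logged process list.

```python
import re
from collections import Counter

# Matches the trailing /* ... */ comment that carries the tag.
TAG_PATTERN = re.compile(r"/\*\s*(.*?)\s*\*/")


def tag_query(sql: str, origin: str) -> str:
    """Append a comment naming the code location that issued the query."""
    return f"{sql} /* {origin} */"


def count_tags(queries):
    """Count how often each tag appears in a list of logged queries."""
    counts = Counter()
    for query in queries:
        match = TAG_PATTERN.search(query)
        counts[match.group(1) if match else "(untagged)"] += 1
    return counts


tagged = tag_query("SELECT * FROM USER WHERE userid = 123", "User::getUser():11")
counts = count_tags([tagged, tagged, "SELECT 1"])
```

Sorting the counts in descending order gives a list like the one on the slide, pointing straight at the code that generates the most load.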

Page 41: Netlog: What we learned about scalability & high availability

Caching

Page 42: Netlog: What we learned about scalability & high availability

Introduction to memcached

‣ Developed by Danga Interactive:
  • http://www.danga.com/

‣ Initially developed for LiveJournal:
  • http://www.livejournal.com/

‣ Open source

Page 43: Netlog: What we learned about scalability & high availability

Introduction to memcached

‣ Least Recently Used

‣ Fast!

‣ Distributed

‣ Automatic failover

‣ Big Hash table: set/add/get/delete

Page 44: Netlog: What we learned about scalability & high availability

What to cache?

‣ sessions

‣ query caching

‣ processed data

‣ generated html

Page 45: Netlog: What we learned about scalability & high availability

Session Cache

‣ 99% hit ratio

‣ Time to live is 20 minutes

‣ Faster than session database

Page 46: Netlog: What we learned about scalability & high availability

Query Cache

‣ Why memcache and not MySQL query cache?
  • MySQL invalidates cached queries on a table on every update
  • different query cache for different replicated databases

‣ Add to generic database classes
  • Cache key is query
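
A rough sketch of a query cache in a generic database class, assuming a plain dict as a stand-in for memcached. The class name `QueryCache` is hypothetical. The query text is hashed because memcached keys are limited to 250 bytes and may not contain whitespace.

```python
import hashlib


class QueryCache:
    """Caches query results under a key derived from the query text."""

    def __init__(self, run_query, cache=None):
        self.run_query = run_query  # callable that really hits the database
        self.cache = {} if cache is None else cache  # dict stands in for memcached

    def key_for(self, sql: str) -> str:
        # Hash the query so the key is short and contains no spaces.
        return "query:" + hashlib.md5(sql.encode("utf-8")).hexdigest()

    def fetch(self, sql: str):
        key = self.key_for(sql)
        if key in self.cache:
            return self.cache[key]
        result = self.run_query(sql)
        self.cache[key] = result
        return result
```

Since all web servers share the same memcache pool, a result cached by one server is a hit for all the others, whichever slave the query originally ran on.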

Page 47: Netlog: What we learned about scalability & high availability

Processed data

‣ Better to cache processed data than query results

Page 48: Netlog: What we learned about scalability & high availability

HTML Caching

Page 49: Netlog: What we learned about scalability & high availability

HTML Caching

‣ Profile blocks are fully cached

‣ Data needed to generate html is also cached

‣ When data changes, the html is invalidated and the cached data is updated

‣ High cache hit rate on profile pages

Page 50: Netlog: What we learned about scalability & high availability

3 ways of caching

‣ Cache with TTL

‣ Cache forever with invalidate

‣ Cache forever with update

Page 51: Netlog: What we learned about scalability & high availability

Cache with TTL

‣ The good:
  • Quickly achieve better performance on existing code

‣ The bad:
  • Users see outdated information
  • TTL cannot be high
  • Caching efficiency is minimal

Page 52: Netlog: What we learned about scalability & high availability

Cache with TTL

‣ Cache friends for 5 minutes
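
A minimal Python sketch of caching a friend list for 5 minutes. `TTLCache` is a tiny in-memory stand-in for memcached, and `load_friends` is a hypothetical database lookup passed in by the caller.

```python
import time


class TTLCache:
    """Tiny in-memory stand-in for memcached that honours a time to live."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.items = {}  # key -> (value, expiry time)

    def set(self, key, value, ttl):
        self.items[key] = (value, self.clock() + ttl)

    def get(self, key):
        entry = self.items.get(key)
        if entry is None or self.clock() >= entry[1]:
            self.items.pop(key, None)  # drop expired entries
            return None
        return entry[0]


FRIENDS_TTL = 5 * 60  # seconds, as on the slide


def get_friends(cache, user_id, load_friends):
    """Return the cached friend list, reloading it once the TTL has expired."""
    key = f"friends:{user_id}"
    friends = cache.get(key)
    if friends is None:
        friends = load_friends(user_id)  # hypothetical database lookup
        cache.set(key, friends, FRIENDS_TTL)
    return friends
```

A new friendship can stay invisible for up to 5 minutes, which is the "outdated information" drawback of this approach.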

Page 53: Netlog: What we learned about scalability & high availability

Cache forever with invalidate

‣ The Good:
  • fairly easy to implement
  • user never sees outdated data

Page 54: Netlog: What we learned about scalability & high availability

Cache friends forever

‣ For memcached this means ttl=0

Page 55: Netlog: What we learned about scalability & high availability

Invalidate Cache
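
A sketch of "cache forever with invalidate", assuming a plain dict as a stand-in for memcached with ttl=0 and a hypothetical `friends_table` dict in place of the friendships database.

```python
cache = {}  # dict standing in for memcached with ttl=0 (no expiry)
friends_table = {1: ["toon", "jo"]}  # hypothetical stand-in for the database


def get_friends(user_id):
    key = f"friends:{user_id}"
    if key not in cache:
        # Cache miss: read from the database and cache without expiry.
        cache[key] = list(friends_table.get(user_id, []))
    return cache[key]


def add_friend(user_id, friend):
    friends_table.setdefault(user_id, []).append(friend)
    # Invalidate: drop the cached copy so the next read rebuilds it.
    cache.pop(f"friends:{user_id}", None)
```

Every write path that touches the data has to delete the key, otherwise the stale copy would live forever.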

Page 56: Netlog: What we learned about scalability & high availability

Cache forever with update

‣ The Good:
  • Best caching possible
  • Can reduce your select queries to the minimum

Page 57: Netlog: What we learned about scalability & high availability

Update Cache (array)

‣ Only update cache when no db queries needed
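
Updating a cached array might look like the sketch below. Again a dict stands in for memcached and `friends_table` is a hypothetical database stand-in. The cached list is only touched when it is already there, so the update itself never costs a select query.

```python
cache = {}  # dict standing in for memcached
friends_table = {1: ["toon", "jo"]}  # hypothetical stand-in for the database


def add_friend(user_id, friend):
    friends_table.setdefault(user_id, []).append(friend)
    key = f"friends:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        # The list is already cached: extend it, no select query needed.
        cache[key] = cached + [friend]
    # If nothing is cached, leave it alone. Rebuilding here would cost a
    # query, so the next read fills the cache instead.
```

Note that this read-modify-write is not atomic when several web servers update the same key at once.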

Page 58: Netlog: What we learned about scalability & high availability

Update Cache (simple value)

‣ No need to check cache
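
For a simple value the new cache content does not depend on the old one, so it can be overwritten blindly. A short sketch, assuming a hypothetical "shout" text per user and dicts standing in for memcached and the database:

```python
cache = {}         # dict standing in for memcached
shouts_table = {}  # hypothetical stand-in for the database table


def set_shout(user_id, text):
    shouts_table[user_id] = text      # write to the database first
    cache[f"shout:{user_id}"] = text  # then overwrite the cached value
```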

Page 59: Netlog: What we learned about scalability & high availability

Global Locking

‣ Use memcache as locking mechanism

Page 60: Netlog: What we learned about scalability & high availability

Global Locking: Chat Example

‣ Example: add new message to cached shared chat thread
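
A sketch of the locking idea in Python. memcached's add only stores a key when it does not exist yet, which makes it usable as a lock. `FakeMemcache` below is an in-memory stand-in that mimics only that behaviour, and the function names are hypothetical. A real lock key should also get an expiry time so a crashed request cannot hold the lock forever; the stand-in leaves that out.

```python
import time


class FakeMemcache:
    """In-memory stand-in: add() only succeeds when the key is absent."""

    def __init__(self):
        self.items = {}

    def add(self, key, value):
        if key in self.items:
            return False
        self.items[key] = value
        return True

    def get(self, key):
        return self.items.get(key)

    def set(self, key, value):
        self.items[key] = value

    def delete(self, key):
        self.items.pop(key, None)


def acquire_lock(mc, name, retries=50, wait=0.01):
    """Try to take the lock, retrying a limited number of times."""
    for _ in range(retries):
        if mc.add(f"lock:{name}", 1):
            return True
        time.sleep(wait)
    return False


def release_lock(mc, name):
    mc.delete(f"lock:{name}")


def append_chat_message(mc, thread_id, message):
    """Add a message to the cached chat thread while holding its lock."""
    lock_name = f"chat:{thread_id}"
    if not acquire_lock(mc, lock_name):
        raise TimeoutError("could not lock chat thread")
    try:
        key = f"chat:{thread_id}"
        messages = mc.get(key) or []
        mc.set(key, messages + [message])
    finally:
        release_lock(mc, lock_name)
```

The lock makes the read-modify-write on the shared thread safe when two users post at the same moment from different web servers.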

Page 61: Netlog: What we learned about scalability & high availability

Flooding detection

‣ User can only redo action A after a timeout
  • a guestbook message can only be posted once every 2 minutes

‣ User can not do action A more than X times in T minutes
  • only 12 failed login attempts per hour are allowed

Page 62: Netlog: What we learned about scalability & high availability

Flooding detection
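
Both flooding rules can be sketched on top of add and incr with an expiry. `ExpiringStore` is an in-memory stand-in for memcached, not a real client, and the function and key names are hypothetical. The second rule uses a fixed window that starts at the first action.

```python
import time


class ExpiringStore:
    """In-memory stand-in for memcached with add/incr and expiry."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.items = {}  # key -> [value, expiry time]

    def _live(self, key):
        entry = self.items.get(key)
        if entry is not None and self.clock() >= entry[1]:
            del self.items[key]
            entry = None
        return entry

    def add(self, key, value, ttl):
        # Succeeds only if the key is absent or expired.
        if self._live(key) is not None:
            return False
        self.items[key] = [value, self.clock() + ttl]
        return True

    def incr(self, key):
        # Increments an existing counter; returns None if the key is missing.
        entry = self._live(key)
        if entry is None:
            return None
        entry[0] += 1
        return entry[0]


def allow_after_timeout(store, user_id, action, timeout):
    """Rule 1: the action is allowed once per `timeout` seconds."""
    return store.add(f"flood:{action}:{user_id}", 1, timeout)


def allow_x_times(store, user_id, action, limit, window):
    """Rule 2: at most `limit` actions per `window` seconds."""
    key = f"count:{action}:{user_id}"
    if store.add(key, 1, window):
        return True  # first action in this window
    count = store.incr(key)
    if count is None:  # key expired between add and incr: start a new window
        store.add(key, 1, window)
        return True
    return count <= limit
```

For the slide's examples, the guestbook rule would be `allow_after_timeout(store, user_id, "guestbook", 120)` and the login rule `allow_x_times(store, user_id, "login_fail", 12, 3600)`.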

Page 63: Netlog: What we learned about scalability & high availability

Flooding detection


Page 64: Netlog: What we learned about scalability & high availability

Search

Page 65: Netlog: What we learned about scalability & high availability

MySQL full-text search

‣ Initially used for our search
  • can be very slow
  • extra load on most of our databases, since most content is searchable

‣ Better search engine needed
  • Sphinx!
  • Open source search engine developed by Andrew Aksyonoff (http://sphinxsearch.com/)

Page 66: Netlog: What we learned about scalability & high availability

Sphinx Features

‣ very fast indexing
‣ very fast searching
  • 0.04 seconds average
  • 5 million searches / day
  • 60 searches / second
‣ distributed
‣ document fields
‣ stopwords
‣ api available in many languages
  • PHP, Java, Python, Ruby, Perl, C++, ...

Page 67: Netlog: What we learned about scalability & high availability

Sphinx Indexer

‣ Index is read-only (except for attributes)

‣ Build new index while searching old one

‣ How we index:
  • rebuild full index from data once in a while (daily, weekly)
  • generate delta indexes often (every minute, 5 minutes)
    • contains changes for search index since last full index merge
  • full index merge of previous index and delta (every hour)

Page 68: Netlog: What we learned about scalability & high availability

Sphinx Search

‣ Search query returns list of ids

‣ For every result page shown, we fetch data associated with ids
  • data is cached with memcache for every id
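
Turning a page of ids into records could look like this sketch. A dict stands in for memcached; `fetch_results`, the `profile:` key prefix and the `load_from_db` callable are assumptions for illustration. Only ids missing from the cache reach the database.

```python
cache = {}  # dict standing in for memcached


def fetch_results(ids, load_from_db):
    """Return records for the ids in order, querying only for cache misses."""
    missing = [i for i in ids if f"profile:{i}" not in cache]
    if missing:
        # load_from_db is a hypothetical callable returning {id: record}.
        for record_id, record in load_from_db(missing).items():
            cache[f"profile:{record_id}"] = record
    # Ids that no longer exist in the database are silently skipped.
    return [cache[f"profile:{i}"] for i in ids if f"profile:{i}" in cache]
```

Keeping the order of the id list matters, because that is the ranking the search engine returned.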

Page 69: Netlog: What we learned about scalability & high availability
Page 70: Netlog: What we learned about scalability & high availability

Thank you!

Questions?

