LinkedIn Segmentation & Targeting Platform: A Big Data Application

Date post: 07-May-2015
Upload: amy-w-tang
Description:
This talk was given by Hien Luu (Senior Software Engineer at LinkedIn) and Siddharth Anand (Senior Staff Software Engineer at LinkedIn) at the Hadoop Summit (June 2013).
Transcript
Page 1: LinkedIn Segmentation & Targeting Platform: A Big Data Application

©2013 LinkedIn Corporation. All Rights Reserved.

LinkedIn Segmentation & Targeting Platform: A Big Data Application

Hadoop Summit, June 2013

Hien Luu, Sid Anand

Page 2: LinkedIn Segmentation & Targeting Platform: A Big Data Application

About Us

Hien Luu Sid Anand

Page 3: LinkedIn Segmentation & Targeting Platform: A Big Data Application

Our mission

Connect the world’s professionals to make them more productive and successful

Page 4: LinkedIn Segmentation & Targeting Platform: A Big Data Application

Over 200M members and counting

LinkedIn Members (Millions), by year

2004: 2
2005: 4
2006: 8
2007: 17
2008: 32
2009: 55
2010: 90
2011: 145
2012: 200+

The world’s largest professional network

Growing at more than 2 members/sec

Source: http://press.linkedin.com/about

Page 5: LinkedIn Segmentation & Targeting Platform: A Big Data Application

The world’s largest professional network

• >88% of Fortune 100 companies use LinkedIn Talent Solutions to hire
• >2.9M Company Pages
• >5.7B professional searches in 2012
• 19 languages
• >30M students and NCGs, the fastest growing demographic
• Over 64% of members are now international

Source: http://press.linkedin.com/about

Page 6: LinkedIn Segmentation & Targeting Platform: A Big Data Application

Other Company Facts

• Headquartered in Mountain View, Calif., with offices around the world!

• As of June 1, 2013, LinkedIn has ~3,700 full-time employees located around the world

Source: http://press.linkedin.com/about

Page 7: LinkedIn Segmentation & Targeting Platform: A Big Data Application

Agenda

Company Overview

Big Data @ LinkedIn

The Segmentation & Targeting Problem

Solution : LinkedIn Segmentation & Targeting Platform

Q & A

Page 8: LinkedIn Segmentation & Targeting Platform: A Big Data Application

Big Data @ LinkedIn

Page 9: LinkedIn Segmentation & Targeting Platform: A Big Data Application

LinkedIn : Big Data Story

Our Big Data Story depends on Infrastructure!

• On-line Data Infrastructure

• Near-line Data Infrastructure

• Offline Data Infrastructure

Oracle or Espresso

Updates

Web Serving

Teradata

Data Streams

On-line, Near-line, Off-line

Page 10: LinkedIn Segmentation & Targeting Platform: A Big Data Application

Big Data Story : On-line Data

On-line Data Infrastructure

• Supports typical OLTP requirements
  • Highly concurrent R/W access
  • Transactional guarantees
  • Back-up & Recovery

• Supports a central LinkedIn Data Principle!
  • “All data everywhere”

• All OLTP databases need to provide a time-line consistent change stream

• For this, we developed and open-sourced Databus!

Oracle or Espresso

Updates

Web Serving

On-line

Page 11: LinkedIn Segmentation & Targeting Platform: A Big Data Application

Big Data Story : On-line Data

Oracle or Espresso Data Change Events

Search Index

Graph Index

Read Replicas

Updates

Standardization

A user updates the company, title, & school on his profile. He also accepts a connection.

The write is made to an Oracle or Espresso master, and Databus replicates it:

• The profile change is applied to the Standardization service
  • E.g. the many forms of IBM are canonicalized for search-friendliness
• ... and to the Search Index
  • Recruiters can find you immediately by new keywords
• The connection change is applied to the Graph Index service
  • The user can now start receiving feed updates from his new connections
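
The replication flow on this slide can be sketched as a small publish/subscribe loop. This is a toy in-memory stand-in and not the Databus API: `ChangeEvent`, `ChangeStream`, the `sequence` field, and the table names are all hypothetical names chosen for illustration. It only shows the idea of delivering change events to several downstream consumers in commit order.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ChangeEvent:
    sequence: int   # hypothetical commit sequence number; defines the time-line order
    table: str      # hypothetical source table name
    member_id: int
    payload: dict


class ChangeStream:
    """Toy in-memory change stream (an assumption, not the Databus API)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[ChangeEvent], None]]] = defaultdict(list)

    def subscribe(self, table: str, handler: Callable[[ChangeEvent], None]) -> None:
        self._subscribers[table].append(handler)

    def publish(self, events: List[ChangeEvent]) -> None:
        # Deliver in commit order so every consumer sees the same time-line.
        for event in sorted(events, key=lambda e: e.sequence):
            for handler in self._subscribers[event.table]:
                handler(event)


search_index: Dict[int, dict] = {}
graph_index: Dict[int, set] = defaultdict(set)


def update_search_index(event: ChangeEvent) -> None:
    # Profile changes become searchable fields for the member.
    search_index.setdefault(event.member_id, {}).update(event.payload)


def update_graph_index(event: ChangeEvent) -> None:
    # Connection changes add an edge for the member.
    graph_index[event.member_id].add(event.payload["connection_id"])


stream = ChangeStream()
stream.subscribe("profile", update_search_index)
stream.subscribe("connection", update_graph_index)

# Events arrive out of order here on purpose; publish() re-orders them.
stream.publish([
    ChangeEvent(sequence=2, table="connection", member_id=1, payload={"connection_id": 7}),
    ChangeEvent(sequence=1, table="profile", member_id=1,
                payload={"company": "IBM", "title": "Engineer"}),
])
```

Each consumer only sees the tables it subscribed to, which mirrors how the slide routes profile changes and connection changes to different services.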

Page 12: LinkedIn Segmentation & Targeting Platform: A Big Data Application

Big Data Story : On-line Data

Databus streams also update Hadoop!

Oracle or Espresso

Search Index

Graph Index

Read Replica

Updates

Standardization

Data Change Events

Page 13: LinkedIn Segmentation & Targeting Platform: A Big Data Application

Big Data Story : Near-line & Off-line Data

2 Main Sources of Data @ LinkedIn

• User-provided data
  • e.g. Member Profile data (employment, education history, endorsements)

• Tracking data via web site instrumentation
  • e.g. pages viewed, email opened/sent, social gestures: posts/likes/shares

Oracle or Espresso

Updates

Databus

Web Servers Kafka

Teradata

Page 14: LinkedIn Segmentation & Targeting Platform: A Big Data Application

The Segmentation & Targeting Problem

Page 15: LinkedIn Segmentation & Targeting Platform: A Big Data Application

Segmentation & Targeting

Page 16: LinkedIn Segmentation & Targeting Platform: A Big Data Application

Segmentation & Targeting Attribute types

Bhaskar Ghosh

Page 17: LinkedIn Segmentation & Targeting Platform: A Big Data Application

Segmentation & Targeting

Step 1 : Take some information about users

Member ID   Join Date    Country   Responded to Promotion X1
1           01/01/2013   FR        F
2           01/02/2013   BE        F
3           01/03/2013   FR        F
4           02/01/2013   FR        T

Step 2 : Provide some targeting criteria for a new promotion

Pick members where
• Join Date between("01/01/2013", "01/31/2013") and
• Country="FR" and
• Responded to Promotion X1="F"

Result: Members 1 & 3

Step 3 : Target them for a different email campaign (promotion_X2)
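
The three steps above can be sketched in a few lines of Python. This is a minimal illustration and not the platform's implementation: the field names (`member_id`, `join_date`, `country`, `responded_x1`) are made up for the example, and the dates are assumed to be in MM/DD/YYYY format, which the upper bound 01/31/2013 implies.

```python
from datetime import date

# Step 1: some information (attributes) about users.
members = [
    {"member_id": 1, "join_date": date(2013, 1, 1), "country": "FR", "responded_x1": False},
    {"member_id": 2, "join_date": date(2013, 1, 2), "country": "BE", "responded_x1": False},
    {"member_id": 3, "join_date": date(2013, 1, 3), "country": "FR", "responded_x1": False},
    {"member_id": 4, "join_date": date(2013, 2, 1), "country": "FR", "responded_x1": True},
]


def in_segment(member: dict) -> bool:
    """Step 2: the targeting criteria (segment definition) for the new promotion."""
    return (
        date(2013, 1, 1) <= member["join_date"] <= date(2013, 1, 31)
        and member["country"] == "FR"
        and not member["responded_x1"]
    )


# Step 3: the resulting segment is the list to target with promotion_X2.
segment = [m["member_id"] for m in members if in_segment(m)]
print(segment)  # [1, 3]
```

Member 2 is excluded by country, and member 4 is excluded both by join date and by having responded to Promotion X1.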

Page 18: LinkedIn Segmentation & Targeting Platform: A Big Data Application

Segmentation & Targeting

The same example, labeled with the platform’s terms:

• Attributes: the per-member columns from Step 1 (Join Date, Country, Responded to Promotion X1)
• Segment Definition: the targeting criteria from Step 2
• Segment: the members that match (Members 1 & 3)

Page 19: LinkedIn Segmentation & Targeting Platform: A Big Data Application

Segmentation & Targeting

Problem Definition

• The business wants to launch new campaigns often

• The business wants to specify targeting criteria (segment definitions) using an arbitrary set of attributes

• The attributes often need to be computed to fulfill the targeting criteria

• This data resides on Hadoop or Teradata (TD)

• The business is most comfortable with SQL-like languages

Page 20: LinkedIn Segmentation & Targeting Platform: A Big Data Application

Segmentation & Targeting Solution

Page 21: LinkedIn Segmentation & Targeting Platform: A Big Data Application

Segmentation & Targeting

Attribute Computation Engine

Attribute Serving Engine

Page 22: LinkedIn Segmentation & Targeting Platform: A Big Data Application

Segmentation & Targeting

Attribute Computation Engine

• Self-service
• Support various data sources
• Attribute consolidation
• Attribute availability

Page 23: LinkedIn Segmentation & Targeting Platform: A Big Data Application

Segmentation & Targeting

Attribute computation

[Figure labels: ~225M, ~240; data sources at PB, TB, and TB scale]

Page 24: LinkedIn Segmentation & Targeting Platform: A Big Data Application

LinkedIn Segmentation & Targeting Platform

Attribute Portal Web Application

Attribute & Definition Metadata

Page 25: LinkedIn Segmentation & Targeting Platform: A Big Data Application

LinkedIn Segmentation & Targeting Platform

Attribute & Definition Metadata

TD Executor

Hive Executor

Pig Executor

REST

REST

REST
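
One way to read this diagram is that each attribute definition is routed to the executor that matches its data source. The sketch below illustrates that routing under stated assumptions: `AttributeDefinition`, the source identifiers `"td"`, `"hive"`, and `"pig"`, and the stub executor functions are all hypothetical, and the stubs return a string where a real system would make a REST call.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class AttributeDefinition:
    name: str
    source: str   # hypothetical identifiers: "td", "hive", or "pig"
    script: str   # the query or script that computes the attribute


# Stub executors: stand-ins for the REST endpoints shown on the slide.
def run_on_teradata(definition: AttributeDefinition) -> str:
    return f"TD job submitted for {definition.name}"


def run_on_hive(definition: AttributeDefinition) -> str:
    return f"Hive job submitted for {definition.name}"


def run_on_pig(definition: AttributeDefinition) -> str:
    return f"Pig job submitted for {definition.name}"


EXECUTORS: Dict[str, Callable[[AttributeDefinition], str]] = {
    "td": run_on_teradata,
    "hive": run_on_hive,
    "pig": run_on_pig,
}


def submit(definition: AttributeDefinition) -> str:
    """Route a definition to the executor registered for its data source."""
    try:
        executor = EXECUTORS[definition.source]
    except KeyError:
        raise ValueError(f"no executor registered for source {definition.source!r}") from None
    return executor(definition)


result = submit(AttributeDefinition(
    name="country",
    source="hive",
    script="SELECT member_id, country FROM profiles",  # hypothetical table
))
```

Keeping the executors behind one lookup table means a new data source only needs a new entry, which fits the self-service goal stated earlier in the deck.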

Page 26: LinkedIn Segmentation & Targeting Platform: A Big Data Application

LinkedIn Segmentation & Targeting Platform

M/R Stitcher

/path/dataset1

/path/dataset2

/path/dataset3

/path/dataset4

/path/lnkd_big_table

DataLoader

Attribute consolidation & availability
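
The stitching step can be illustrated as a join of several per-attribute datasets into one wide row per member. The sketch below runs in memory and is not the M/R job from the talk: the `stitch` function, the `member_id` key, and the sample datasets are assumptions. Grouping rows by key and merging them is the same idea a reduce-side join applies at scale.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

Row = Dict[str, object]


def stitch(datasets: Iterable[List[Row]], key: str = "member_id") -> List[Row]:
    """Merge rows that share a key into one wide row per key (a full outer join)."""
    merged: Dict[object, Row] = defaultdict(dict)
    for rows in datasets:
        for row in rows:
            # Rows for the same member contribute their columns to one record.
            merged[row[key]].update(row)
    return [merged[k] for k in sorted(merged)]


# Hypothetical stand-ins for /path/dataset1 and /path/dataset2.
dataset1 = [
    {"member_id": 1, "country": "FR"},
    {"member_id": 2, "country": "BE"},
]
dataset2 = [
    {"member_id": 1, "responded_x1": False},
    {"member_id": 3, "responded_x1": True},
]

big_table = stitch([dataset1, dataset2])
```

Members that appear in only one dataset still get a row, with the missing attributes simply absent, so downstream consumers must tolerate sparse rows.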

Page 27: LinkedIn Segmentation & Targeting Platform: A Big Data Application

LinkedIn Segmentation & Targeting Platform

LinkedIn big table, the most sought after data

Segmentation

Propensity Model

Ad hoc analysis

LinkedIn big table

Page 28: LinkedIn Segmentation & Targeting Platform: A Big Data Application

Segmentation & Targeting

Attribute Serving Engine

Self-service

Attribute predicate expression

Build segments

Build lists

Page 29: LinkedIn Segmentation & Targeting Platform: A Big Data Application

Segmentation & Targeting

Serving Engine

Operations: count, filter, sum, complex expressions

LinkedIn big table (~225M, ~240)

Page 30: LinkedIn Segmentation & Targeting Platform: A Big Data Application

LinkedIn Segmentation & Targeting Platform

Inverted Index

Inverted Index

Inverted Index

M/R Indexer

LinkedIn big table

Attribute & Definition Metadata
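
An inverted index over the big table maps each (attribute, value) pair to the set of members that have it, so a filter becomes a lookup and not a scan. The following sketch builds such an index in memory. It is an illustration only: the talk's indexer is an M/R job, and `build_inverted_index`, the `member_id` key, and the sample rows are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Row = Dict[str, object]
Index = Dict[Tuple[str, object], Set[int]]


def build_inverted_index(big_table: List[Row], key: str = "member_id") -> Index:
    """Map every (attribute, value) pair to the set of member IDs that carry it."""
    index: Dict[Tuple[str, object], Set[int]] = defaultdict(set)
    for row in big_table:
        for attribute, value in row.items():
            if attribute != key:
                index[(attribute, value)].add(row[key])
    return dict(index)


# Hypothetical rows of the big table.
big_table = [
    {"member_id": 1, "country": "FR", "responded_x1": "F"},
    {"member_id": 2, "country": "BE", "responded_x1": "F"},
    {"member_id": 3, "country": "FR", "responded_x1": "F"},
    {"member_id": 4, "country": "FR", "responded_x1": "T"},
]

index = build_inverted_index(big_table)
french_members = index[("country", "FR")]
```

Counting a segment is then the size of a posting set, and combining criteria reduces to set intersections and unions.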

Page 31: LinkedIn Segmentation & Targeting Platform: A Big Data Application

LinkedIn Segmentation & Targeting Platform

Who are the North American recruiters that don’t work for a competitor?

Who are the LinkedIn Talent Solution prospects in Europe?

Who are the job seekers?

Page 32: LinkedIn Segmentation & Targeting Platform: A Big Data Application

LinkedIn Segmentation & Targeting Platform

JSON Predicate Expression

JSON Lucene Query Parser

Inverted Index

Inverted Index

Inverted Index

Segment & List
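
The slides do not show the JSON format, so the sketch below assumes a hypothetical one: leaves are `{"field": ..., "value": ...}` and inner nodes are `{"op": "and" | "or", "args": [...]}` or `{"op": "not", "arg": ...}`. It translates such an expression into a string in Lucene's classic query syntax (`field:"value"`, `AND`, `OR`, `NOT`, parentheses). The platform's own parser may well build query objects directly; this only illustrates the mapping.

```python
import json


def quote(value: object) -> str:
    """Quote a value as a Lucene phrase, escaping backslashes and double quotes."""
    text = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{text}"'


def to_lucene(node: dict) -> str:
    """Translate a (hypothetical) JSON predicate tree into classic Lucene query syntax."""
    if "field" in node:
        return f'{node["field"]}:{quote(node["value"])}'
    op = node["op"].lower()
    if op == "not":
        return f'NOT {to_lucene(node["arg"])}'
    if op in ("and", "or"):
        joined = f" {op.upper()} ".join(to_lucene(arg) for arg in node["args"])
        return f"({joined})"
    raise ValueError(f"unsupported operator: {node['op']!r}")


# Hypothetical encoding of "North American recruiters that don't work for a competitor".
expression = json.loads("""
{
  "op": "and",
  "args": [
    {"field": "region", "value": "North America"},
    {"field": "occupation", "value": "recruiter"},
    {"op": "not", "arg": {"field": "employer_is_competitor", "value": "true"}}
  ]
}
""")

query = to_lucene(expression)
print(query)
# (region:"North America" AND occupation:"recruiter" AND NOT employer_is_competitor:"true")
```

Note that Lucene's classic parser returns no hits for a query made only of `NOT` clauses, so a negation should sit inside an `AND` group as it does here.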

Page 33: LinkedIn Segmentation & Targeting Platform: A Big Data Application

LinkedIn Segmentation & Targeting Platform

Complex tree-like attribute predicate expressions
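
A tree-like predicate expression can be evaluated bottom-up against an inverted index using set operations: leaves look up a posting set, `and` intersects, `or` unions, and `not` subtracts from the set of all members. The sketch below shows that recursion. The node format, the `evaluate` function, and the sample index are assumptions for illustration, not the platform's actual representation.

```python
from typing import Dict, Set, Tuple

Index = Dict[Tuple[str, object], Set[int]]


def evaluate(node: dict, index: Index, universe: Set[int]) -> Set[int]:
    """Recursively evaluate a predicate tree into the set of matching member IDs."""
    if "field" in node:
        # Leaf: look up the posting set for this (attribute, value) pair.
        return set(index.get((node["field"], node["value"]), set()))
    op = node["op"]
    if op == "not":
        return universe - evaluate(node["arg"], index, universe)
    results = [evaluate(child, index, universe) for child in node["args"]]
    if op == "and":
        return set.intersection(*results) if results else set(universe)
    if op == "or":
        return set.union(*results) if results else set()
    raise ValueError(f"unsupported operator: {op!r}")


# Hypothetical inverted index over four members.
index: Index = {
    ("country", "FR"): {1, 3, 4},
    ("country", "BE"): {2},
    ("responded_x1", "T"): {4},
    ("responded_x1", "F"): {1, 2, 3},
}
universe = {1, 2, 3, 4}

# French members who did not respond to Promotion X1.
expression = {
    "op": "and",
    "args": [
        {"field": "country", "value": "FR"},
        {"op": "not", "arg": {"field": "responded_x1", "value": "T"}},
    ],
}

segment = sorted(evaluate(expression, index, universe))
print(segment)  # [1, 3]
```

Because every node returns a plain set, expressions can nest to any depth without special cases.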

Page 34: LinkedIn Segmentation & Targeting Platform: A Big Data Application

LinkedIn Segmentation & Targeting Platform

A marketing campaign is represented by a list

Page 35: LinkedIn Segmentation & Targeting Platform: A Big Data Application

Conclusion

Move at business speed and scale at LinkedIn scale

Segmentation & Targeting Platform
– Self-service
– Multiple data sources & massive data volume
– Support complex expression evaluation in seconds
– Attribute availability at business speed

Page 36: LinkedIn Segmentation & Targeting Platform: A Big Data Application

Engineering Team

Jessica Ho Swetha Karthik Raj Rangaswamy Tony Tong Ajinkya Harkare Hien Luu Sid Anand

Page 37: LinkedIn Segmentation & Targeting Platform: A Big Data Application

Questions?

More info: data.linkedin.com

