
Data Science behind Display Ads in Digital Marketing

Kushal Wadhwani, Senior Data Scientist

We Help Marketers Increase Digital Share of Business

$30M FUNDING

Singapore, South East Asia

Bangalore, India

Dubai, UAE
Dallas, USA

CERTIFICATIONS

FOCUS

Clients

INDIA & UAE

Use Case: Bring back a prospective user

1) User visits the HDFC website and browses for a personal loan
2) Drops off without submitting a lead
3) Visits our publisher network
4) Vizury shows an ad with personalized banners and quotes
5) User clicks the banner
6) Reaches back to the HDFC website

Some of the Channels Powered by Vizury

Programmatic

Mobile Push
Browser Push
Facebook / Instagram

Programmatic flow

Optimization problem behind Programmatic

The client pays for each impression shown on publishers' ad slots; the objective is to maximize the clicks obtained for that spend.

Parameters to Optimize

1. What to bid (see the bidding sketch after this list)
• Depends upon the probability of a click by that user
• Depends upon the probability of a click in that ad slot

bidValue ∝ P(click | ad slot, user)
CTR (click-through rate) = 100 × P(click | ad slot, user)

2. What to show
• Products visited by the user
• Products and messaging suggested by the client
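A minimal sketch of this bidding rule, assuming a trained classifier that exposes scikit-learn's predict_proba interface; the scaling constant k and the feature vector are illustrative assumptions, not Vizury's production logic:

```python
# Hedged sketch: `model`, `features`, and `k` are hypothetical; any trained
# binary classifier with a predict_proba method (e.g. scikit-learn) fits here.
def bid_value(model, features, k=1000.0):
    """Return a bid proportional to P(click | ad slot, user).

    k converts the click probability into a bid in the client's currency;
    in practice it would encode the value of a click to the client.
    """
    p_click = model.predict_proba([features])[0, 1]  # P(click | ad slot, user)
    return k * p_click                               # bidValue ∝ P(click | ...)
```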

Data : Collection and processing

Data Collection
• Bids DB
• Impressions DB
• Clicks DB
• User activity DB
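A sketch of how these four logs might be joined into labelled training data; the storage format and the key columns (bid_id, user_id) are assumptions for illustration, not the deck's actual schema:

```python
import pandas as pd

# Hypothetical log extracts; file names and join keys are assumptions.
bids = pd.read_parquet("bids.parquet")                 # every bid response sent
impressions = pd.read_parquet("impressions.parquet")   # bids that won and were shown
clicks = pd.read_parquet("clicks.parquet")             # impressions that got clicked
activity = pd.read_parquet("user_activity.parquet")    # per-user on-site behaviour

# Each shown impression becomes one training row; the label is whether a
# click record exists for that impression.
data = impressions.merge(bids, on="bid_id", how="left")
data = data.merge(activity, on="user_id", how="left")
data["click_flag"] = data["bid_id"].isin(clicks["bid_id"]).astype(int)
```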

User variables and Ad slot variables

User variables

1) Time spent on the website
2) Products visited
3) Number of impressions shown
4) Number of clicks

Ad slot variables
1) Size of the banner
2) URL of the ad slot

Problem formulation

• Classification problem
• 50–100 variables
• Both numerical and categorical variables
• Massive amount of data to train on

Historical data (ad slot variables and user level variables):

Id   Categorical variable 1   Categorical variable 2   Numerical variable 1   Numerical variable 2   ...   Click flag
1    xyz                      abc                      1                      0                      ...   0
2    -                        -                        -                      -                      ...   1
3    -                        -                        -                      -                      ...   0

New bid request (click flag unknown, to be predicted by the model):

     xyz                      abc                      ?                      ?                      ...   ?
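A small sketch of this formulation as a pandas DataFrame. Only the first historical row comes from the table above; the values in rows 2 and 3 are made up for illustration, since the deck elides them:

```python
import pandas as pd

# Row 1 mirrors the table above; rows 2-3 use invented values (the deck
# shows dashes). Column names are illustrative.
historical = pd.DataFrame({
    "cat_var_1":  ["xyz", "pqr", "pqr"],
    "cat_var_2":  ["abc", "def", "abc"],
    "num_var_1":  [1, 3, 0],
    "num_var_2":  [0, 2, 5],
    "click_flag": [0, 1, 0],
})

# The new bid request has feature values but an unknown click flag ("?");
# a classifier trained on `historical` predicts P(click | ad slot, user).
new_bid = pd.DataFrame({
    "cat_var_1": ["xyz"], "cat_var_2": ["abc"],
    "num_var_1": [2], "num_var_2": [1],
})
```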

ML Algorithms for classification

Logistic Regression

Pros:
• Handles all linear interactions between variables
• Established, scalable algorithms exist for training
• Handles high cardinality categorical variables

Cons:
• Assumes that variables are linearly related to the log odds ratio
• Does not handle non-linear interactions well

ln[p/(1-p)] = α + WᵀX

• p is the probability that the event Y occurs, p = P(Y = 1)
• p/(1-p) is the "odds ratio"
• ln[p/(1-p)] is the log odds ratio, or "logit"
• Inverting gives p = 1/[1 + exp(-α - WᵀX)]
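A minimal training sketch with scikit-learn, reusing the hypothetical `historical` and `new_bid` frames from the formulation sketch; one-hot encoding of the categorical columns is an assumption (the deck discusses encodings later):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import make_pipeline

categorical = ["cat_var_1", "cat_var_2"]   # illustrative column names
numerical = ["num_var_1", "num_var_2"]

model = make_pipeline(
    ColumnTransformer([
        ("onehot", OneHotEncoder(handle_unknown="ignore"), categorical),
    ], remainder="passthrough"),           # numerical columns pass through
    LogisticRegression(max_iter=1000),
)
model.fit(historical[categorical + numerical], historical["click_flag"])
p_click = model.predict_proba(new_bid)[0, 1]   # p = 1/[1 + exp(-α - WᵀX)]
```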

Decision tree based Models

Pros:
• Handles non-linear correlation of input variables with the output variable
• Handles non-linear interactions
• Models are intuitive, easy to understand and explain

Cons:
• Challenges in handling high cardinality categorical variables

Random Forest

XGBoost
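A sketch of a gradient-boosted tree CTR model with the xgboost library; the hyperparameters and the X_train/y_train matrices are illustrative placeholders, not tuned production values:

```python
import xgboost as xgb

model = xgb.XGBClassifier(
    n_estimators=200,              # number of boosted trees
    max_depth=6,                   # depth bounds the interactions captured
    learning_rate=0.1,
    objective="binary:logistic",   # outputs P(click)
)
# Trees split on raw values, so skew and non-linearity need no special
# preprocessing; categorical variables still need encoding first.
model.fit(X_train, y_train)        # X_train/y_train: hypothetical placeholders
p_click = model.predict_proba(X_test)[:, 1]
```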

Neural Networks

Pros:
• Handles non-linear correlation of input variables with the output variable
• Handles non-linear interactions of variables
• Handles high cardinality categorical variables
• Works well for large data sets

Cons:
• Models are not readable (hard to interpret or explain)
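For completeness, a sketch of a small feed-forward network using scikit-learn's MLPClassifier; a production CTR system would more likely use a dedicated deep-learning framework, and X_train here is an already-numeric placeholder:

```python
from sklearn.neural_network import MLPClassifier

model = MLPClassifier(hidden_layer_sizes=(64, 32),  # two small hidden layers
                      max_iter=200)
model.fit(X_train, y_train)        # X_train must already be fully numeric
p_click = model.predict_proba(X_test)[:, 1]
```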

Variable Insights and triage

1. Visualize variables (see the plotting sketch after this list)
• Plot distributions
• Plot variable vs. CTR to visually assess the nature of the correlation
• Check the cardinality of categorical variables

2. Decide how to preprocess each variable

3. Evaluate each variable against the ML techniques
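A sketch of the triage plots (distribution, and CTR vs. variable) using matplotlib; `data` and the column names follow the earlier hypothetical schema:

```python
import matplotlib.pyplot as plt

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# 1) Distribution of the variable
data["num_var_1"].hist(bins=50, ax=ax1)
ax1.set(title="Distribution", xlabel="num_var_1")

# 2) CTR vs. the variable, to eyeball the nature of the correlation
ctr = data.groupby("num_var_1")["click_flag"].mean() * 100
ax2.plot(ctr.index, ctr.values, marker="o")
ax2.set(title="CTR vs num_var_1", xlabel="num_var_1", ylabel="CTR (%)")
plt.show()
```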

Variable Insights : Numerical variables

[Figure: var1 shows a skewed distribution; var2 shows a non-linear correlation with CTR.]

Handling Skew and Non-linearity

                             Skewed distribution   Non-linear correlation
Logistic regression          N                     N
Decision tree based models   Y                     Y
Neural networks              Y                     Y

• In general it is better to preprocess variables with skew
• Log transformation: newvalue = log(oldvalue)
• Bucketization (a preprocessing sketch for both follows the figures below)

Handling Skew and Non-linearity : Log Transformation

[Figure: distribution and correlation of the variable, before vs. after the log transformation.]

Handling Skew and Non-linearity : Bucketization

[Figure: distribution of bucketized var1, and the distribution within buckets.]
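A sketch of both preprocessing options on the hypothetical `data` frame; using log1p instead of a plain log (so zero values stay defined) is an implementation choice, not from the deck:

```python
import numpy as np
import pandas as pd

# Log transformation: newvalue = log(oldvalue); log1p handles zeros.
data["num_var_1_log"] = np.log1p(data["num_var_1"])

# Bucketization: replace the raw value with its quantile bucket, so each
# bucket holds roughly the same number of rows despite the skew.
data["num_var_1_bucket"] = pd.qcut(data["num_var_1"], q=10,
                                   labels=False, duplicates="drop")
```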

Variable Insights : Interaction of variables

                             Non-linear interaction
Logistic regression          N
Decision tree based models   Y
Neural networks              Y

[Figure: var1 vs. var2, with the size of each circle representing CTR.]

Variable Insights : Categorical variables

[Figure: two categorical variables, one with cardinality ~10⁴ and one with cardinality 10.]

Categorical variables

Neural networks and logistic regression do not handle categorical variables out of the box; the variables have to be converted into numerical variables.

1. One hot encoding – creates one new 0/1 variable for each categorical value

2. Replace each categorical value with its class weight, in our case the CTR observed for that value. A drawback is that interactions with other variables cannot be captured. (See the encoding sketch after this list.)
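A sketch of both encodings on the hypothetical `data` frame; replacing a value with its observed CTR is a form of target encoding:

```python
import pandas as pd

# 1) One-hot encoding: one new 0/1 column per categorical value.
one_hot = pd.get_dummies(data["cat_var_1"], prefix="cat_var_1")

# 2) Replace each categorical value with its class weight, here the CTR
#    observed for that value in the training data. Interactions with
#    other variables are lost.
ctr_by_value = data.groupby("cat_var_1")["click_flag"].mean() * 100
data["cat_var_1_ctr"] = data["cat_var_1"].map(ctr_by_value)
```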

                             High cardinality          Interaction between
                             categorical variables     categorical variables
Logistic regression          Y                         N
Decision tree based models   N                         Y
Neural networks              Y                         Y

Evaluation Metrics

AUC (Area Under the Curve): the area under the ROC curve, a 2-D plot of false positive rate vs. true positive rate obtained by varying the classification threshold.

• A random classifier gives an AUC of 0.5

• The higher the AUC, the better the classification

• Quantifies how well the model has ranked the test data, but does not consider the magnitude of the predicted probabilities
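A sketch of computing AUC with scikit-learn, reusing the hypothetical y_test and p_click placeholders from the model sketches:

```python
from sklearn.metrics import roc_auc_score

auc = roc_auc_score(y_test, p_click)   # 0.5 = random ranking, 1.0 = perfect
print(f"AUC: {auc:.3f}")
```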

Log Loss
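The deck lists log loss as the second metric; unlike AUC, it does penalize miscalibrated probability magnitudes:

logloss = -(1/N) Σᵢ [ yᵢ ln(pᵢ) + (1 - yᵢ) ln(1 - pᵢ) ]

A minimal sketch with scikit-learn, again using the hypothetical y_test and p_click placeholders:

```python
from sklearn.metrics import log_loss

# Lower is better; confidently wrong predictions are punished heavily.
print(f"Log loss: {log_loss(y_test, p_click):.4f}")
```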

Q & A

My Coordinates

LinkedIn : https://www.linkedin.com/in/kushal-wadhwani-02109a1a/
Email : [email protected]

To know more about Vizury visit : https://www.vizury.com/

