Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Date post: 16-Mar-2018
Upload: prasad-chalasani
Estimating Causal Effect of Ads in a Real-Time-Bidding Platform Prasad Chalasani (SVP Data Science, MediaMath) Sep 24, 2016
Transcript
Page 1: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Estimating Causal Effect of Ads in a Real-Time-Bidding Platform

Prasad Chalasani (SVP Data Science, MediaMath)

Sep 24, 2016

Page 2: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Project Placebo

Or, How to Measure Causal Effect of Ads in an RTB Platform

Placebo Team (alphabetical):

- Ari Buchalter (President, Technology; co-founder)
- Prasad Chalasani
- Himanish Kushary
- Jason Lei
- Jonathan Marshall
- Michael Neiss
- Tristan Piron
- Sara Skrmetti
- Jawad Stouli
- Jaynth Thiagarajan
- Ezra Winston

Page 3: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

- listen to ~100 billion ad opportunities daily
- respond with optimal bids within milliseconds
- petabytes of data (ad impressions, visits, clicks, conversions)

Page 7: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Key Conceptual Take-aways

- Definition of causal effect
- Context: relationship to Machine Learning
- Causal effect in a Real-Time Bidding Platform
  - Simplest approach is wasteful
  - Less wasteful approach: bias (non-compliance)
  - MediaMath’s solution
- Bayesian Methods for Ad Lift Confidence Bounds
  - Gibbs Sampling (MCMC: Markov Chain Monte Carlo)
- Complications unique to our setting:
  - Long-running experiments
  - Multiple cookies per user

Page 18: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Ad impact measurement

- Advertisers want to know the impact of showing ads to people.

Page 19: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Measuring Ad Impact: Two Approaches

- Observational studies:
  - Compare people who happen to be exposed vs. not exposed
  - Bias is a big issue
- Randomized tests:
  - Randomly assign people to test (exposed) or control (un-exposed)

Page 24: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Causal Effect: the questions to ask

When a set of people $U$ is exposed to ads,

- What is the average response rate $R_1$ of the people in $U$?
- What would the response rate $R_0$ of $U$ have been if they had not seen the ad?
- Relative causal effect, or causal lift: $R_1 / R_0 - 1$

Page 27: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Causal Effect: Notation

- “Units” $i = 1, 2, \ldots, n$ (“users”, “user_context”, ...)
- $Y_i = 1$ if unit $i$ responds (buys, subscribes, ...), else 0.
- Each unit $i$ has 2 potential responses:
  - $Y_i(0)$ = response when not exposed to an ad
  - $Y_i(1)$ = response when exposed to an ad
- $W_i = 1$ if unit $i$ is exposed to an ad, else 0.
- Observed response: $Y_i^{\mathrm{obs}} = Y_i(W_i)$
  - if $W_i = 1$, only $Y_i(1)$ is observed; $Y_i(0)$ is a counterfactual
  - if $W_i = 0$, only $Y_i(0)$ is observed; $Y_i(1)$ is a counterfactual
- $X_i$ = $k$-dimensional vector of features
  - e.g. (dayOfWeek, age, location, web-domain)
- Unit-level causal effect is impossible to measure:

$$\tau_i = Y_i(1) - Y_i(0)$$
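The notation can be made concrete with a toy table. The sketch below is illustrative only: the `Unit` class and the four example units are hypothetical and do not come from the talk. It shows that each unit reveals only one of its two potential responses, so the unit-level effect is computable here only because a simulation knows both.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """One hypothetical unit with both potential responses known (simulation only)."""
    y0: int  # Y_i(0): response if not exposed to the ad
    y1: int  # Y_i(1): response if exposed to the ad
    w: int   # W_i: 1 if the unit was exposed, else 0

    def observed(self) -> int:
        # Y_i^obs = Y_i(W_i): the only response that real data would contain
        return self.y1 if self.w == 1 else self.y0

    def counterfactual(self) -> int:
        # The potential response that was NOT realized; never seen in real data
        return self.y0 if self.w == 1 else self.y1

    def unit_effect(self) -> int:
        # tau_i = Y_i(1) - Y_i(0); needs both potential responses
        return self.y1 - self.y0


# Made-up example units: (y0, y1, w)
units = [Unit(0, 1, 1), Unit(0, 0, 1), Unit(1, 1, 0), Unit(0, 1, 0)]

observed = [u.observed() for u in units]
unit_effects = [u.unit_effect() for u in units]
```

With real data only `observed` and `w` would be available, which is why the later slides move from unit-level effects to averages over groups.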

Page 39: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Average Causal/Treatment Effects

Average Treatment Effect (ATE)

$$\mathrm{ATE} = E[Y_i(1) - Y_i(0)]$$

Average Treatment Effect on the Treated (ATET)

$$\mathrm{ATET} = E[Y_i(1) - Y_i(0) \mid W_i = 1]$$

Causal Lift ($L$) (this talk)

$$L = \frac{E[Y_i(1) \mid W_i = 1]}{E[Y_i(0) \mid W_i = 1]} - 1$$

Conditional Average Treatment Effect (Athey/Imbens et al.)

$$\tau(x) = E[Y_i(1) - Y_i(0) \mid X_i = x]$$

Conditional Response Rate (usual Machine Learning problem)

$$R(x) = E[Y_i(1) \mid X_i = x]$$
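To see how these estimands differ, the following sketch evaluates ATE, ATET, and causal lift on a small hypothetical population in which both potential responses are known. The population rows and the function names are assumptions made for illustration; in practice only one response per unit is observed, so these quantities have to be estimated.

```python
from typing import List, Tuple

# Each row is (y0, y1, w): both potential responses plus the exposure flag.
# The values are made up for illustration.
Population = List[Tuple[int, int, int]]

population: Population = [
    (0, 1, 1), (0, 0, 1), (1, 1, 1), (0, 1, 1),  # exposed units (w = 1)
    (0, 0, 0), (0, 1, 0), (1, 1, 0), (0, 0, 0),  # un-exposed units (w = 0)
]


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def ate(pop: Population) -> float:
    # E[Y(1) - Y(0)] over all units
    return mean([y1 - y0 for y0, y1, _ in pop])


def atet(pop: Population) -> float:
    # E[Y(1) - Y(0) | W = 1], i.e. restricted to exposed units
    return mean([y1 - y0 for y0, y1, w in pop if w == 1])


def causal_lift(pop: Population) -> float:
    # E[Y(1) | W = 1] / E[Y(0) | W = 1] - 1
    exposed_rate = mean([y1 for _, y1, w in pop if w == 1])
    counterfactual_rate = mean([y0 for y0, _, w in pop if w == 1])
    return exposed_rate / counterfactual_rate - 1


print(ate(population), atet(population), causal_lift(population))
```

ATE and ATET are absolute differences, while lift is a ratio relative to the counterfactual response rate of the exposed users, so it is undefined when that rate is zero.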

Page 45: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Causal Effect Illustration

Page 50: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Causal Effect Illustration: Counterfactuals

Page 53: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Causal Effect with Counterfactuals

Counterfactuals are unobservable!

Instead of comparing:

- Response rate of exposed users $U$ vs.
- Counterfactual un-exposed response rate of the same users $U$,

We compare:

- Response rate of exposed users $U$ vs.
- Response rate of un-exposed users statistically equivalent to $U$.

⇒ using randomization
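A small simulation can illustrate why randomization makes the un-exposed group a stand-in for the counterfactual. This is a sketch under stated assumptions: the baseline response probabilities, the 50/50 split, and the function name `simulate_randomization` are all hypothetical. Because the simulation knows every user's un-exposed response, it can compare the exposed group's (normally unobservable) counterfactual rate with the control group's observed rate.

```python
import random
from typing import Tuple


def simulate_randomization(n_users: int = 200_000, seed: int = 0) -> Tuple[float, float]:
    """Return (counterfactual un-exposed rate of the exposed group,
    observed rate of the randomly chosen un-exposed group)."""
    rng = random.Random(seed)
    exposed_n = exposed_y0 = 0
    control_n = control_y0 = 0
    for _ in range(n_users):
        # Hypothetical heterogeneous users: baseline response probability varies.
        baseline = rng.choice([0.01, 0.05, 0.10])
        y0 = 1 if rng.random() < baseline else 0  # Y_i(0)
        # The coin flip is independent of anything about the user.
        if rng.random() < 0.5:
            exposed_n += 1
            exposed_y0 += y0  # counterfactual: unobservable outside a simulation
        else:
            control_n += 1
            control_y0 += y0  # observed: these users really were un-exposed
    return exposed_y0 / exposed_n, control_y0 / control_n


counterfactual_rate, control_rate = simulate_randomization()
print(f"counterfactual (exposed group): {counterfactual_rate:.4f}")
print(f"observed (control group):       {control_rate:.4f}")
```

The two rates agree up to sampling noise. If assignment instead depended on the user (for example, exposing mostly high-baseline users), the control rate would no longer match the counterfactual, which is the bias concern raised for observational studies.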

Page 57: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Ideal Randomized Test:

Randomize after winning bid

Page 58: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Ideal Randomized Test

Page 61: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Ideal Randomized Test: Ad lift

Page 63: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

But is this practical?

Page 64: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Ideal Randomized Test

Page 67: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Ideal Randomized Test: Wasted spend

Page 69: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

MediaMath’s approach:

Randomize before bidding

Page 70: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

A Less Wasteful Randomized Test

Page 73: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Compare $R_C$ vs $R_T$?

Page 76: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Compare $R_C$ vs $R_{TW}$?

Page 77: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Compare $R_C$ vs $R_{TW}$? Win-bias

Page 78: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Ad Lift: Proper Definition

Page 82: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Estimating the Counterfactual $R_{CW}$

Page 89: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Ad Lift Estimation

Main steps:

- Observe response rates $R_C$ (control), $R_{TW}$ (test, won bids), $R_{TL}$ (test, lost bids)
- Observe test win-rate $w$
- Estimate the control counterfactual winner response rate:

$$R_{CW} = \frac{R_C - (1 - w)\, R_{TL}}{w}$$

- Compute lift $L = R_{TW} / R_{CW} - 1$
- Similar to Treatment Effect Under Non-compliance in clinical trials.
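The steps above can be sketched as a small function. The formula for the counterfactual winner rate is consistent with reading the control response rate as a mixture, R_C = w·R_CW + (1 − w)·R_CL, and assuming that randomization lets the observed test-loser rate R_TL stand in for the control-loser rate R_CL. That reading is an assumption here, since the supporting slides are figures. The function name, argument names, and example numbers are hypothetical.

```python
def estimate_lift(r_c: float, r_tw: float, r_tl: float, w: float) -> float:
    """Point estimate of ad lift L = R_TW / R_CW - 1.

    r_c  : response rate of the whole control group
    r_tw : response rate of test users whose bids were won (exposed)
    r_tl : response rate of test users whose bids were lost (un-exposed)
    w    : win rate observed in the test group
    """
    if not 0.0 < w <= 1.0:
        raise ValueError("win rate w must be in (0, 1]")
    # Back out the response rate of control users who would have won:
    # R_CW = (R_C - (1 - w) * R_TL) / w
    r_cw = (r_c - (1.0 - w) * r_tl) / w
    if r_cw <= 0.0:
        raise ValueError("estimated R_CW is not positive; lift is undefined")
    return r_tw / r_cw - 1.0


# Made-up rates for illustration only.
corrected = estimate_lift(r_c=0.0075, r_tw=0.02, r_tl=0.005, w=0.5)
naive = 0.02 / 0.0075 - 1.0  # compares test winners with the entire control group
print(f"corrected lift: {corrected:.3f}, naive lift: {naive:.3f}")
```

In this made-up example the corrected lift is about 1.0 (100%), while the naive comparison of test winners against the whole control group gives about 1.67, because the control group mixes would-be winners with lower-responding would-be losers.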

Page 94: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Ad Lift Estimation

How to compute the 90% confidence interval for L?

Page 96: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Ad Lift: Confidence Intervals with Gibbs sampler

Bayesian approach

- Assume a random parameter vector $\theta$ consisting of:
  - $(R_{TW}, R_L, R_{CW}, w, \ldots)$
- Set up prior distribution $\theta \sim p(\theta)$
- Sample $M$ values of unknown $\theta$ from the posterior using a Gibbs sampler:

$$P(\theta \mid \mathrm{Data}) \propto P(\mathrm{Data} \mid \theta) \cdot p(\theta)$$

- For each sampled $\theta$, compute lift $L = R_{TW} / R_{CW} - 1$
- Compute the (0.05, 0.95) quantiles of the sampled $L$ values

Page 103: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Ad Lift: Gibbs Sampling

Page 112: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Ad Lift Gibbs Sampling: Random variables

Probabilities: $w, R_{TW}, R_{CW}, R_L$

Counts: $C_{W0}, C_{W1}, C_{L0}, C_{L1}$

Beta(1, 1) priors on probabilities, e.g.:

$w \sim \mathrm{Beta}(1, 1) = \mathrm{Uniform}(0, 1), \ldots$

Page 114: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

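Ad Lift Gibbs Sampling: Posterior Probabilities

Likelihood of observed

- $k = C_{L1} + T_{L1}$ conversions out of
- $n = C_{L1} + T_{L1} + C_{L0} + T_{L0}$ trials,
- given loser response rate $R_L$:

$$\mathrm{Binom}(k, n; R_L) \propto R_L^{k} (1 - R_L)^{n-k},$$

so the posterior of $R_L$ is (the Beta(1, 1) prior density is constant on (0, 1))

$$
\begin{aligned}
P(R_L \mid k, n) &\propto P(k, n \mid R_L) \cdot p(R_L) \\
&\propto R_L^{k} (1 - R_L)^{n-k} \cdot \mathrm{Beta}(R_L; 1, 1) \\
&\propto R_L^{(k+1)-1} (1 - R_L)^{(n-k+1)-1} \\
&\propto \mathrm{Beta}(R_L; k + 1, n - k + 1)
\end{aligned}
$$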
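As a quick check of this conjugate update, the draw of $R_L$ can be sketched with NumPy. The helper name `sample_loser_rate` and the counts used below are hypothetical; only the Beta(k + 1, n − k + 1) form comes from the slide.

```python
import numpy as np

def sample_loser_rate(k, n, size, rng):
    """Draw R_L from Beta(k + 1, n - k + 1), the posterior under a Beta(1, 1) prior."""
    return rng.beta(k + 1, n - k + 1, size=size)

rng = np.random.default_rng(1)
k, n = 120, 10_000                  # hypothetical: 120 loser conversions in 10,000 trials
draws = sample_loser_rate(k, n, 50_000, rng)
posterior_mean = (k + 1) / (n + 2)  # analytic mean of Beta(k + 1, n - k + 1)
```

The sample mean of `draws` should sit very close to `posterior_mean`, which is a cheap sanity check before wiring the draw into a full sampler.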

Page 120: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Ad Lift Gibbs Sampling: Posterior Counts

We observe $C_1 = C_{L1} + C_{W1}$ (total control conversions).

Need to sample $C_{L1}, C_{W1}$

$C_{W1}$ is a Binomial draw from $n = C_1$, with probability:

$$P(\text{ctl winner} \mid \text{ctl conversion}) = \frac{w \cdot R_{CW}}{w \cdot R_{CW} + (1 - w) \cdot R_L}$$

$C_{L1} = C_1 - C_{W1}$
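This count-sampling step might look like the following inside one Gibbs iteration. It is a minimal sketch: the function name `split_control_conversions` and the parameter values are illustrative assumptions, and the current values of $w$, $R_{CW}$ and $R_L$ are taken as given.

```python
import numpy as np

def split_control_conversions(c1, w, r_cw, r_l, rng):
    """Split the observed control conversions C1 into winner (C_W1) and loser (C_L1) counts."""
    # Probability that a control conversion came from a would-be winner.
    p_winner = w * r_cw / (w * r_cw + (1.0 - w) * r_l)
    c_w1 = int(rng.binomial(c1, p_winner))
    c_l1 = c1 - c_w1
    return c_w1, c_l1

rng = np.random.default_rng(2)
# Hypothetical current state of the sampler.
c_w1, c_l1 = split_control_conversions(c1=500, w=0.3, r_cw=0.02, r_l=0.01, rng=rng)
```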

Page 123: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Complication 1: We only observe cookies, not users;

A user’s cookies may be in both test and control (Contamination)

Page 125: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Control Contamination due to Multiple Cookies

Page 130: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Cookie-Contamination Questions

- How does cookie contamination affect measured lift?
- Does the cookie distribution matter?
  - everyone has k cookies vs. an average of k cookies
- What is the influence of the control percentage?
- Simulations are the best way to understand this
- Monte Carlo simulations using Spark

Page 136: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Simulations for cookie-contamination

- A scenario is a combination of parameters:
  - M = # trials for this scenario, usually 10K-1M
  - n = # users, typically 10K-10M
  - p = control percentage (usually 10-50%)
  - k = cookie distribution, expressed as 1 : 100, or 1 : 70, 3 : 30
  - r = (uncontaminated) control user response rate
  - a = true lift, i.e. exposed user response rate = r ∗ (1 + a).
- A scenario file specifies a scenario in each row.
  - could be thousands of scenarios
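A single trial of one scenario can be sketched as below, in plain NumPy rather than Spark. The model here is an assumption made for illustration and is not taken from the talk: each cookie is assigned to control independently with probability p, a user counts as exposed if any of their cookies is in test, and a conversion is credited to one of the user's cookies chosen at random. The function name and the dictionary encoding of the cookie distribution (number of cookies mapped to fraction of users) are likewise hypothetical.

```python
import numpy as np

def simulate_measured_lift(n_users, control_frac, cookie_dist, r, a, rng):
    """Run one Monte Carlo trial and return the lift measured at the cookie level."""
    sizes = np.array(list(cookie_dist.keys()))
    probs = np.array(list(cookie_dist.values()), dtype=float)
    # Number of cookies per user, drawn from the cookie distribution.
    cookies = rng.choice(sizes, size=n_users, p=probs)
    # Each cookie lands in control independently with probability control_frac.
    ctl_cookies = rng.binomial(cookies, control_frac)
    test_cookies = cookies - ctl_cookies
    # Assumption: a user sees ads if at least one of their cookies is in test.
    exposed = test_cookies > 0
    rate = r * (1.0 + a * exposed)
    converted = rng.random(n_users) < rate
    # Assumption: a conversion is credited to one of the user's cookies at random.
    credited_to_ctl = rng.random(n_users) < ctl_cookies / cookies
    ctl_rate = np.sum(converted & credited_to_ctl) / ctl_cookies.sum()
    test_rate = np.sum(converted & ~credited_to_ctl) / test_cookies.sum()
    return test_rate / ctl_rate - 1.0

rng = np.random.default_rng(42)
# Scenario "1 : 100" (every user has one cookie) vs. every user having three cookies.
clean = simulate_measured_lift(1_000_000, 0.2, {1: 1.0}, r=0.05, a=0.5, rng=rng)
contaminated = simulate_measured_lift(1_000_000, 0.2, {3: 1.0}, r=0.05, a=0.5, rng=rng)
```

Under these assumptions the single-cookie scenario recovers roughly the true lift a, while the three-cookie scenario measures a much smaller lift because most control cookies belong to exposed users. Repeating the trial M times per scenario row gives the distribution of measured lift for that scenario.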

Page 138: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Complication 2:

Long-running experiments

Page 139: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

Long-Running Experiments

Ideal randomized test is instantaneous.

When a test is run for weeks/months,

- A test user may sometimes be a winner, sometimes a loser.
- How to define who is a “winner” and “loser”?
- Crucial because lift $L = R_{TW}/R_{CW} - 1$.

Our approach (details omitted):

- Ad influence period is limited
- “refresh” a user after a suitable time period elapses.
- Count “user time-spans” rather than “users”
- Identify “experiments” within a user’s timeline

Page 142: Estimating Causal Effect of Ads in a Real-Time Bidding Platform

MediaMath’s Placebo App

- Currently in production for ∼ 10 advertisers
- Advertisers can specify which campaigns to measure
- Lift estimation, Gibbs Sampling runs on AWS using Spark
- Multiple runs of Gibbs Sampler in parallel (with different priors)

