A Search-based Method for Forecasting Ad Impression in Contextual Advertising
Defense
Overview
Background: Web and contextual advertising
Motivation: importance of volume forecasting in contextual advertising
Methodology: forecasting volume as an inverse of the ad retrieval
Experiments
Web Advertising
Huge impact on the Web and beyond
$21 billion industry
Main textual advertising channels:
Search advertising
Contextual advertising
Contextual Advertising (CA)
CA Basics
Supports a wide variety of sites across the web ecosystem
Selects ads based on the “context”:
Web page where the ads are placed
Users who are viewing this page
Interplay of three participants:
Publisher
Advertiser
Ad network
Advertiser’s goal is to obtain web traffic
Importance of Impression Volume
Critical in planning and budgeting advertising campaigns
Common questions for advertisers and intermediaries:
Bid value
Impact of ad variations
Timing of the campaign
A Challenging Problem of Impression Forecasting
CA platforms are complex systems
Have hundreds of contributing features
A moving target, dynamic
Publisher’s content and traffic vary over time
Large-scale computation: billions of page views, hundreds of millions of distinct pages, and hundreds of millions of ads
Dynamic bid landscape
Competitors and what they are willing to pay
Current Practice
Run a test ad in real traffic for a few days
Simultaneously with the baseline
Compare with the baseline
Obvious drawbacks:
Uses the ad serving infrastructure
Expensive
Inefficient
Very long turnaround time
Forecasting as Inverse of Ad Retrieval
Ad retrieval: given a page and a set of ads, find the best ads
Forecasting: given an ad and a set of past impressions, find where the ad would have been shown if it were in the system
This work: assumes ads are selected based on similarity of features:
Uses the WAND (Broder et al., CIKM 2003) DAAT algorithm for page selection
Similarity of ad and context feature vectors: requires a monotonic scoring function; this work uses dot product
Features can be based on either user or page context.
Conceptual Work Flow
Keep all the data used in ad retrieval for a given period
For an unseen/incoming ad:
Examine each impression
Score the ad using the ad retrieval algorithm
Compare the ad score with the score of the lowest ranking ad shown in the page view
Count the impressions where the ad would have been shown
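The workflow above can be sketched as a brute-force scan over logged impressions. This is a minimal illustration, not the system described in the talk: the names `dot`, `count_impressions`, and the toy log are hypothetical, the score is taken as similarity times bid (as on the "Indexing Unique Pages" slide), and the strict comparison against the lowest shown ad is an assumption.

```python
from typing import Dict, List, Tuple

FeatureVector = Dict[str, float]  # sparse vector: feature name -> weight


def dot(a: FeatureVector, b: FeatureVector) -> float:
    """Dot product of two sparse feature vectors."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    return sum(w * b.get(f, 0.0) for f, w in a.items())


def count_impressions(ad: FeatureVector, bid: float,
                      impressions: List[Tuple[FeatureVector, float]]) -> int:
    """Count logged impressions in which the ad would have been shown.

    Each impression is (context_features, score_of_lowest_ranking_ad_shown).
    Assumption: the ad is shown only if its score strictly exceeds that score.
    """
    shown = 0
    for context, lowest_shown_score in impressions:
        score = dot(ad, context) * bid
        if score > lowest_shown_score:
            shown += 1
    return shown


# Toy impression log (made-up values, for illustration only).
log = [
    ({"camera": 1.0, "lens": 0.5}, 0.8),
    ({"camera": 0.2, "travel": 1.0}, 0.5),
    ({"recipe": 1.0}, 0.1),
]
new_ad = {"camera": 1.0, "lens": 1.0}
forecast = count_impressions(new_ad, bid=1.0, impressions=log)
```

Scanning every impression this way costs one score computation per page view, which is what makes the approach impractical at the scale of billions of page views.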
Main challenge: scale
To overcome the scalability problem:
Index only unique pages
Adaptation of the WAND algorithm for count aggregation needed in forecasting
A Two-level Process
Use a posting list order to allow early termination
Indexing Unique Pages
The revenue estimate of an ad-page pair: score(p, a) = similarity(p, a) * bid
Revenue estimate for the lowest ranking ad: minScore_p
For repeating pages the similarity is constant
However, ads and bids vary:
Could change the lowest ranking ad of a unique page
Only one index entry per unique page: what revenue to store for the lowest ranking ad?
Save a distribution of estimates {rev_1 … rev_n}
Assign the median to minScore_p
minScore_p is recomputed based on the current ad supply
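One way to picture the per-page statistics is the sketch below, which collapses repeated page views into a single entry per unique page and stores the median of the observed lowest-ad revenue estimates. The function name `build_page_stats` and the record layout are hypothetical, and keeping a page-view count next to the median is an assumption made here so that impressions can later be summed per unique page. Recomputing the threshold against the current ad supply is not shown.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, List, Tuple


def build_page_stats(
    impressions: Iterable[Tuple[str, float]]
) -> Dict[str, Tuple[float, int]]:
    """Collapse impressions into one entry per unique page.

    impressions: (page_id, revenue_estimate_of_lowest_ranking_ad) pairs.
    Returns page_id -> (median_min_score, page_view_count).
    """
    per_page: Dict[str, List[float]] = defaultdict(list)
    for page_id, revenue in impressions:
        per_page[page_id].append(revenue)
    return {p: (median(revs), len(revs)) for p, revs in per_page.items()}


# Toy log: page "p1" viewed three times, "p2" twice (made-up values).
stats = build_page_stats([
    ("p1", 0.4), ("p1", 0.9), ("p1", 0.5),
    ("p2", 1.0), ("p2", 2.0),
])
```

The median is less sensitive than the mean to a single unusually high or low bid, which may be why the slide chooses it.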
Two-level process (Impression forecasting)
First phase (approximate) evaluation: maxWeight_f = max{w_{f,p} : for all pages p}
Full evaluation:
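The two phases can be illustrated with a simplified sketch. It is not the WAND document-at-a-time pointer traversal: it only shows the idea of filtering pages with a cheap upper bound built from per-feature maximum weights, then scoring the survivors exactly. All names (`build_index`, `forecast`, the toy data) are hypothetical, feature weights are assumed non-negative so that the bound is valid, and the strict comparison is an assumption.

```python
from typing import Dict, Tuple

Vector = Dict[str, float]  # sparse vector: feature name -> weight


def build_index(pages: Dict[str, Vector]):
    """Build postings[f][page_id] = weight of f on the page, plus the max weight per feature."""
    postings: Dict[str, Dict[str, float]] = {}
    for page_id, vec in pages.items():
        for f, w in vec.items():
            postings.setdefault(f, {})[page_id] = w
    max_weight = {f: max(plist.values()) for f, plist in postings.items()}
    return postings, max_weight


def forecast(ad: Vector, bid: float, postings, max_weight,
             stats: Dict[str, Tuple[float, int]]) -> Tuple[int, int]:
    """Return (forecast_impressions, pages_fully_evaluated).

    stats[page_id] = (min_score, page_views) for each unique page.
    """
    # Phase 1: approximate upper bound on similarity for every page that
    # shares at least one feature with the ad.
    upper: Dict[str, float] = {}
    for f, w_ad in ad.items():
        for page_id in postings.get(f, {}):
            upper[page_id] = upper.get(page_id, 0.0) + w_ad * max_weight[f]

    impressions = 0
    evaluated = 0
    for page_id, bound in upper.items():
        min_score, views = stats[page_id]
        if bound * bid <= min_score:
            continue  # even the best case cannot beat the lowest shown ad
        # Phase 2: full evaluation with the exact dot product.
        evaluated += 1
        exact = sum(w_ad * postings[f].get(page_id, 0.0)
                    for f, w_ad in ad.items() if f in postings)
        if exact * bid > min_score:
            impressions += views
    return impressions, evaluated


# Toy data (made-up values).
pages = {
    "p1": {"camera": 1.0, "lens": 0.5},
    "p2": {"camera": 0.25, "travel": 1.0},
    "p3": {"recipe": 1.0},
}
stats = {"p1": (0.8, 100), "p2": (1.2, 40), "p3": (0.1, 10)}
postings, max_weight = build_index(pages)
ad = {"camera": 1.0, "lens": 1.0}
result = forecast(ad, 1.0, postings, max_weight, stats)
```

In this toy run, page "p2" is discarded in the first phase at bid 1.0 because its upper bound cannot exceed its threshold, so only one page reaches full evaluation.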
Framework
Offline processing
Analyzing the pages
Building a page inverted index
Creating a page statistics file
Online processing
We use the inverted page index and page statistics to forecast the number of impressions of a given ad.
Output
Given an ad and a bid, output the number of impressions
Given an ad, output the curve describing the relation between bid and number of impressions
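The bid curve can be derived once per ad if the score is similarity times bid: a page with similarity s and threshold m starts showing the ad as soon as the bid exceeds m / s. The sketch below builds the curve from that observation. The names `bid_curve` and `impressions_at` and the input rows are hypothetical, and the strict inequality is an assumption.

```python
from bisect import bisect_left
from itertools import accumulate
from typing import Iterable, List, Tuple


def bid_curve(
    page_rows: Iterable[Tuple[float, float, int]]
) -> Tuple[List[float], List[int]]:
    """Build a step curve of impressions versus bid for one ad.

    page_rows: (similarity, min_score, page_views) per unique page.
    Returns sorted bid thresholds and the cumulative page views at each one.
    Pages with zero similarity can never show the ad and are skipped.
    """
    points = sorted((m / s, v) for s, m, v in page_rows if s > 0)
    thresholds = [t for t, _ in points]
    cumulative = list(accumulate(v for _, v in points))
    return thresholds, cumulative


def impressions_at(bid: float, thresholds: List[float],
                   cumulative: List[int]) -> int:
    """Forecast impressions at a bid: page views whose threshold is below the bid."""
    k = bisect_left(thresholds, bid)  # count of thresholds strictly below bid
    return cumulative[k - 1] if k else 0


# Toy rows (made-up values): thresholds are 0.75/1.5 = 0.5 and 1.0/0.25 = 4.0.
rows = [(1.5, 0.75, 100), (0.25, 1.0, 40), (0.0, 0.1, 10)]
thresholds, cumulative = bid_curve(rows)
```

With the curve precomputed, answering "how many impressions at bid b" is a binary search rather than a new pass over the index.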
Experiment Results
Day-to-day forecast
Week-to-week forecast
Observations:
Similar results between day-to-day and week-to-week forecasting.
The errors seem big; however, this is due to traffic fluctuation.
Even with a large margin of error, our result is still significant (it’s the best of its kind, and it’s still acceptable for campaign budgeting and advertising strategy)
Top row has a good prediction.
Bottom row does not match well due to traffic fluctuation, but matches the trend and shape very well.
Tradeoff between efficiency and accuracy
Changing the value of minScore_p affects the output of the first level
Ad Variation Example
Subtle differences could lead to dramatic performance changes
Conclusion
Ad retrieval algorithm is the determining factor in CA impression volume forecasting
Introduced search-based forecasting as the inverse of ad retrieval
Promising experimental results
Further work: combine search with learning approaches to further improve forecasting.