
Cpg iitm mar_29_2012_final

Date post: 22-Apr-2015
Upload: discoversudhir
Cloud Platform Group (CPG) Presentation at IIT Chennai March 29, 2012
Transcript
Page 1: Cpg iitm mar_29_2012_final

Cloud Platform Group (CPG)

Presentation at IIT Chennai

March 29, 2012

Page 2: Cpg iitm mar_29_2012_final

Agenda

Yahoo! Presentation, Confidential 2

CPG Mission and Value Proposition

Fit within the Yahoo Stack

Drill-down: User Generated Content (UGC)

Drill-down: User Location

Drill-down: Web Extractions

Drill-down: Trending

Q & A

Page 3: Cpg iitm mar_29_2012_final

Cloud Platform Group Mission

Create a global, scalable platform built on science that enables rapid innovation and delivery of personalized, monetizable experiences across devices.

Page 4: Cpg iitm mar_29_2012_final

CPG Value Proposition

1. Agility with Stability

LEGO powered by Content Agility

Page 5: Cpg iitm mar_29_2012_final

CPG Value Proposition

2. Science at Scale

Page 6: Cpg iitm mar_29_2012_final

CPG powers all of Yahoo! today

• DISPLAY ADS, powered by Hadoop: 3x improvement in accuracy of ad placements and our ability to forecast supply over legacy systems
• MAIL, powered by Edge, Storage, Ranking, & Hadoop: 40% faster download time, 300K+ spam mails blocked/sec
• LEGO (YPP), powered by Content Agility: Reduce time to launch new sites from quarters to weeks
• LIVESTAND, powered by Mobile & Cocktails Presentation Services: Seamlessly distribute content across devices in an experience that is elegant and personalized
• SOCIAL CHROME, powered by Social Platform: Over 22M net cumulative installs since launch; integrated into News, Games, Movies, OMG, TV
• FRONT PAGE, powered by CORE: Increased CTR by +263% for Today Module by serving the right content to the right user (over pre-CORE)

ILLUSTRATIVE SAMPLE

Page 7: Cpg iitm mar_29_2012_final

User Generated Content

Unified, scalable platform that enables self-expression and gets users to connect over content

USE CASE

Increase content stickiness and user retention; drive repeat usage across the Yahoo! network

SOLUTION

UGC Cloud is a scalable, real-time platform that lets users express themselves, resulting in increased user engagement and a vibrant Yahoo! community

RESULTS

UGC platforms are used by over 200 Yahoo! properties with over 650M UGC actions per year

• Comments: 6M comments per month
• Polls: 1.2M poll votes per month
• Message Boards: 1/3 of US Finance traffic from MB
• Ratings & Reviews: 40M user ratings per month

Page 8: Cpg iitm mar_29_2012_final

User Generated Content – Applications

Improving Comment Quality

• 3-pronged approach: Machine, Human and Community Moderation
• 300M analyzed, 70M blocked with machine moderation
• Reactive Volume (cost of reacting to abuse) avoided

Sentiment Slider

http://news.yahoo.com/open-business-free-agency-set-begin-211828913--spt.html

Page 9: Cpg iitm mar_29_2012_final

User Generated Content – Social Poll

Page 10: Cpg iitm mar_29_2012_final

User Generated Content – In the Works

• Topical Organization of Comments
• Social Conversations

Page 11: Cpg iitm mar_29_2012_final

User Location

Store, manage & share user locations and locations of interest to create deeply personal digital experiences

USE CASE

User location information was siloed, inconsistent, and not shareable across properties and users

SOLUTION

Create a single data store of user locations, shareable across Yahoo! properties and advertising systems

RESULTS

Properties can launch location-aware services with faster time to market on a single platform

237M users with 550M locations

LOCDROP

• Management, Authorization, and Control
• Normalized, Geo-Aware User Locations: Centralized, Consistent, and Contextual
• Accurate, Relevant, Valuable Experiences: Increase Content, Targeting and Revenues

Page 12: Cpg iitm mar_29_2012_final

Read locations to drive local news, events and deals

Page 13: Cpg iitm mar_29_2012_final


Contextual Locations for Yahoo News

Page 14: Cpg iitm mar_29_2012_final

User Generated Places: Enable users to submit (and curate) a location if one does not exist

Android Messenger Use Case

• User cannot find a place and decides to create a new location to check in
• User is asked for permission to detect the current location from the device
• The user's location is pinpointed on a map. This will be used to get the lat/long of the created place
• User enters a location, “Russian Tea Room”
• A new location is stored in the UGP platform and the user is checked in to this location
• User has an option to curate the locations created by other users
• UGP platform enables algorithmic curation
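The check-in flow on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Place`, `PlaceStore`, and every method name below are hypothetical and do not come from the UGP platform, the store is a plain in-memory dictionary, and "curation" is reduced to a net vote count from users other than the creator.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Place:
    place_id: int
    name: str
    lat: float
    lon: float
    created_by: str
    votes: int = 0  # net curation votes cast by other users


@dataclass
class PlaceStore:
    """Hypothetical in-memory stand-in for a user-generated places store."""
    places: Dict[int, Place] = field(default_factory=dict)
    checkins: List[Tuple[str, int]] = field(default_factory=list)

    def find(self, name: str) -> Optional[Place]:
        # Case-insensitive exact match; a real system would also use geo proximity.
        for place in self.places.values():
            if place.name.lower() == name.lower():
                return place
        return None

    def create_and_check_in(self, user: str, name: str, lat: float, lon: float) -> Place:
        # Create the place only if it does not exist yet, then check the user in.
        place = self.find(name)
        if place is None:
            place = Place(len(self.places) + 1, name, lat, lon, user)
            self.places[place.place_id] = place
        self.checkins.append((user, place.place_id))
        return place

    def curate(self, user: str, place_id: int, approve: bool) -> int:
        # Assumption: only users other than the creator may curate a place.
        place = self.places[place_id]
        if user == place.created_by:
            raise ValueError("creators cannot curate their own places")
        place.votes += 1 if approve else -1
        return place.votes


# Usage with illustrative coordinates (not taken from the slides).
store = PlaceStore()
tea_room = store.create_and_check_in("alice", "Russian Tea Room", 40.765, -73.979)
store.curate("bob", tea_room.place_id, approve=True)
```

The vote counter is the simplest possible curation signal; algorithmic curation as mentioned on the slide would presumably combine such signals with other evidence, which this sketch does not attempt.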

Page 15: Cpg iitm mar_29_2012_final

KAFE: Technologies*

Axes: Editorial Effort, Precision

• Manual SDE Rules: Large Aggregator Websites (e.g. amazon)
• Dapper: Small Websites (e.g. community sites), Behind the Form sites (Deep Web)
• PSOX (Y! Labs): Unsupervised extractions from large number of websites (Goldrush, Dish-a-wish, Restaurant Photos)

[Architecture diagram. Labels: Web Content; Bing WCC YST HVC; KAFE; S.D.E, Dapper, PSOX; W.O.O, Properties, Legacy Backend; Live Pages (LLFS)]

* Supports Multiple Sources of Data and Multiple Technologies

Page 16: Cpg iitm mar_29_2012_final

Answers Not Links: Dappfactory

Dappfactory used by DD Builder to create over 3,000 DD experiences!

Page 17: Cpg iitm mar_29_2012_final

Answers Not Links: Dappfactory

Dappfactory used by DD Builder to create over 3,000 DD experiences!

Page 18: Cpg iitm mar_29_2012_final

Answers Not Links: S-DE, KAFE XSL Rules

Creating Vertical Search Experiences for Recipes

Page 19: Cpg iitm mar_29_2012_final

Answers Not Links: PSOX (Unsupervised Extractions)

Y! Dish-a-Wish: Craving hummus in Sunnyvale?

Y! Goldrush: Looking for where to buy Amana dishwashers?

Page 20: Cpg iitm mar_29_2012_final

Enhanced Listings: Dappfactory

Before:

After:

• Taken from Roadmap deck for Y! Local by Erin Johns
• Data being provided to Y! Local; front-end revamp on Local Roadmap

Page 21: Cpg iitm mar_29_2012_final

Local Events for N.I.L.E: Dappfactory

As of Feb ‘12, over 22,000 events for 250 US cities have been extracted using Dappfactory

Page 22: Cpg iitm mar_29_2012_final

Data Extraction – Challenges

• Technology whitespace
  – Head: Fully manual scales fine. Gives high precision.
  – Torso: Mostly use human-assisted learning. Drop in recall and precision, but acceptable for production use.
  – Tail content: Only option is ML/no-human-in-loop models. Recall and precision need a lot of improvement.
• Semantic Web initiatives – Web of Objects
  – Linked Open Data Format (RDFa, OWL, SPARQL)
  – LOD Cloud: a few thousand data sets, 10s of billions of interlinked facts
  – Confhopper: sample/demo application
• Unstructured Corpus – NLP Extraction
  – Systems/Engineering Challenges: low-latency processing, tokenization/parsing, Intl support
  – Sciences Challenges: polysemy, synonymy, aboutness/concepts, sentiment analysis
  – CAP: Contextual analysis platform
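The head/torso/tail trade-off on this slide is stated in terms of precision and recall. As a reminder of how the two metrics are computed for an extraction run, here is a minimal sketch; the function name and the sample records are hypothetical, and the "gold" set stands for a hand-labeled reference, which the slides do not describe.

```python
from typing import Hashable, Set, Tuple


def precision_recall(extracted: Set[Hashable], gold: Set[Hashable]) -> Tuple[float, float]:
    """Precision = correct / extracted; recall = correct / gold."""
    correct = len(extracted & gold)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical (name, city) records, for illustration only.
gold = {("Cafe A", "Sunnyvale"), ("Cafe B", "Sunnyvale"), ("Cafe C", "San Jose")}
extracted = {
    ("Cafe A", "Sunnyvale"),
    ("Cafe B", "Sunnyvale"),
    ("Cafe X", "Sunnyvale"),  # spurious extraction
    ("Cafe Y", "San Jose"),   # spurious extraction
}
precision, recall = precision_recall(extracted, gold)
```

In this toy run, two of four extracted records are correct and two of three gold records are found, so precision is 0.5 and recall is 2/3. A fully manual rule set would typically push precision up at the cost of coverage, while an unsupervised extractor tends toward the opposite profile.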

Page 23: Cpg iitm mar_29_2012_final

TimeSense – Use Cases/Business Value Proposition

Plumbing, Monetization, & Games

• US FP Trending Now local pool for a given DMA, powered by TimeSense: 6% CTR lift attributed to local terms
• Search Suggestions in SD box: TimeSense-powered suggestions triggered for 6% of all gossip requests
• Trending searches in Left Rail on Yahoo US SRP: triggered for ~6% of all user queries
• TW FP Trending Now automated by TimeSense API

Page 24: Cpg iitm mar_29_2012_final

TimeSense

Plumbing, Monetization, & Games

In Bucket

• AUTOMATED trending module on shopping.yahoo.com: first module with no editorial intervention, vertically categorized trends, fast refresh and rotating terms

Soon to Launch

• HK, TW and KR automated trends modules on FP, Mail, OMG, News, etc.

Editorial Power Users of TimeSense

• Search Forecasting Editorial Team – updates sent twice a day to 500+ subscribers
• FP Trending Now team
• Regional content programming, search editorial and SEO teams: US, UK, HK, TW, IN [Q1 launch – all INTLs]

Upcoming

• Trending Now Syndication for Yahoo Hosted Search partners – via BOSS
• Trending Image experience
• Trending Now 2.0 automation expansion

Page 25: Cpg iitm mar_29_2012_final

Trending topic detection – Challenges

Systems Challenges

• Low latency requirement
• GBs of data analyzed from multiple data sources every 5 minutes
• Scalability – different verticals, segmented models
• High availability requirement

Sciences Challenges

• Algorithmic improvements for near-real-time detection without precision loss
• Short phrase categorization
• Deduping/Clustering – intent detection
• Segmentation/Smoothing – Age/gender/Behavioral Tracking Categories/Geography – signal sparsity with fine-grained segmentation
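To make the smoothing and signal-sparsity point concrete, here is a minimal sketch of one common way to score trending terms: compare each term's count in the current window with its expected count from history, using additive smoothing so that rare terms in a sparse segment do not produce huge ratios. This is not TimeSense's algorithm, which the slides do not describe; the function name, the parameters, and the thresholds are all assumptions chosen for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def trending_terms(
    current_window: Iterable[str],
    baseline_rate: Dict[str, float],
    smoothing: float = 5.0,
    min_ratio: float = 3.0,
) -> List[Tuple[str, float]]:
    """Return (term, score) pairs whose smoothed lift meets min_ratio, best first.

    current_window: query terms seen in the latest window (e.g. 5 minutes).
    baseline_rate:  expected count per window for each term, from history.
    smoothing:      pseudo-count added to both sides to dampen sparse signals.
    """
    counts = Counter(current_window)
    scored = []
    for term, count in counts.items():
        expected = baseline_rate.get(term, 0.0)
        ratio = (count + smoothing) / (expected + smoothing)
        if ratio >= min_ratio:
            scored.append((term, ratio))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical window: one term spikes, one is steady, one is rare noise.
window = ["new_phone"] * 40 + ["weather"] * 50 + ["rare_term"] * 2
baseline = {"new_phone": 2.0, "weather": 48.0}
result = trending_terms(window, baseline)
```

With these inputs only `new_phone` passes the threshold: its smoothed ratio is 45/7, while the steady term scores about 1.04 and the rare term only 1.4. Segmenting by age, gender, or geography shrinks the counts per segment, which is why the pseudo-count matters more as segmentation gets finer.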

