Future of Search | Yury Lifshits, Yahoo! Research

Post on 08-May-2015

Yury Lifshits
Yahoo! Research
http://yury.name

Future of Search

St. Petersburg | Helsinki

December 2008

Outline

• Structured Search
• Yahoo! Work in Search
  • SearchMonkey
  • BOSS
• Research Agenda

Structured Search: work in progress

Structured Search = bring structured data to search users

M.K. Bergman. The Deep Web: Surfacing Hidden Value. 2001.

Value Proposition

• Coverage
• Real-time data
• Semi-private data

• Structured queries
• Ordering and filtering results
• Straight-to-answers

User Interface: Query

• Search assist: Yahoo!
• Selector: LinkedIn, VKontakte.ru
• Multiple search buttons: Gmail
• Search tabs: Yahoo / Google

User Interface: Results

• Federated page
• Facets
• Search transfer / search form

K.P. Yee, K. Swearingen, K. Li, M. Hearst. Faceted metadata for image search and browsing. CHI 2003.

Fernando Diaz. Aggregation of News Content Into Web Results. WSDM 2009.

http://glue.yahoo.com http://au.alpha.yahoo.com
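The "Facets" item above can be illustrated with a minimal sketch. It assumes structured results arrive as plain dictionaries; the record fields (`category`, `city`) and the function names are hypothetical and only serve the illustration.

```python
from collections import Counter

# Hypothetical structured results; the field names are assumptions.
RESULTS = [
    {"title": "HEL to LED, morning", "category": "flight", "city": "Helsinki"},
    {"title": "HEL to LED, evening", "category": "flight", "city": "Helsinki"},
    {"title": "Search meetup", "category": "event", "city": "St. Petersburg"},
]


def facet_counts(results, field):
    """Count how many results carry each value of the given field."""
    return Counter(r[field] for r in results if field in r)


def apply_facet(results, field, value):
    """Keep only the results whose field equals the selected facet value."""
    return [r for r in results if r.get(field) == value]


if __name__ == "__main__":
    print(facet_counts(RESULTS, "category"))
    print(apply_facet(RESULTS, "category", "event"))
```

A results page could show the counts next to each facet value and call `apply_facet` when the user clicks one.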

Data Supply Chain

Atomic fact: flight, event, patent

Data aggregator: US Patents, Amadeus/Sabre flights, Upcoming.com

Domain search: Expedia, Spock

General purpose search: Yahoo!, Google, Yandex, Baidu

Getting structured data

• Entity extraction
• Markup
• Feeds
• Search API (OpenSearch)

OR

Do a search transfer

Give Us Your Data For …

Traffic via search transfer: Firefox search box

Better presentation in search: SearchMonkey

Hosted search: BOSS Custom

Showing your ads: Yahoo Local + AT&T

Yahoo! Work in Search

Slides by: Paul Tarjan, Chief Technical Monkey

(ptarjan@yahoo-inc.com)

Full version: http://www.slideshare.net/ptarjan/searchmonkey-presentation

an open platform for using structured data to build more useful and relevant search results

Before After

What is SearchMonkey?

Enhanced Result: Zagat

Key/Value Pairs or Abstract

Links

Image

Infobar: Wikipedia Preview

Summary Blob

Creating an Infobar

Infobar advantages
• Annotate someone else’s site
• Use links and images from other domains
  • Mash up info from multiple sites
  • Affiliate / coupon links? Hmmm…
• Can act on *, all websites
  • But these apps can be annoying if poorly designed

Key design principles
• Put something useful in the summary
• Be creative with the HTML

How to get data to SearchMonkey?

Humans see:
• name
• picture of a person
• current job
• industry, …

Computers see: an undifferentiated blob of HTML

Can we make computers smarter?
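One way to picture the gap between what humans and computers see is a small extractor for microformat-style markup. The sketch below assumes a page marks up a person with hCard class names (`fn`, `title`, `org`); the sample HTML and the `HCardExtractor` class are hypothetical, and this is not how SearchMonkey itself is implemented.

```python
from html.parser import HTMLParser

# hCard property names looked for in class attributes.
HCARD_FIELDS = {"fn", "title", "org"}

# Hypothetical sample markup; it avoids void tags such as <img>,
# which this simple open-element stack does not handle.
SAMPLE_HTML = (
    '<div class="vcard"><span class="fn">Jane Doe</span>, '
    '<span class="title">Engineer</span> at '
    '<span class="org">Acme</span></div>'
)


class HCardExtractor(HTMLParser):
    """Collect the text inside elements tagged with hCard class names."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._stack = []  # one entry per open element: field name or None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        field = next((c for c in classes if c in HCARD_FIELDS), None)
        self._stack.append(field)

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Attribute the text to the innermost enclosing hCard field, if any.
        field = next((f for f in reversed(self._stack) if f), None)
        text = data.strip()
        if field and text:
            self.fields[field] = (self.fields.get(field, "") + " " + text).strip()


def extract_hcard(html):
    parser = HCardExtractor()
    parser.feed(html)
    return parser.fields


if __name__ == "__main__":
    print(extract_hcard(SAMPLE_HTML))
```

With the markup in place, the "blob of HTML" turns into named fields that a search result can display.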

How does it work?

1. Site owners/publishers share structured data with Yahoo!.
2. Site owners & third-party developers build SearchMonkey apps.
3. Consumers customize their search experience with Enhanced Results or Infobars.

Diagram labels: Acme.com’s database, Acme.com’s Web Pages, RDF/Microformat Markup, DataRSS feed, Web Services, Page Extraction, Index

SearchMonkey Resources

Main: http://developer.yahoo.com/searchmonkey

Lists and forums: searchmonkey-developers@yahoogroups.com, http://suggestions.yahoo.com/searchmonkey

Vik Singh (Architect)
Graham Mudd (Senior PMM)

BOSS = Build your Own Search Service

Open Yahoo’s core search features via web services to let 3rd parties revolutionize Search

Unrestricted

What

Unrestricted:

• Unlimited queries
• Blend, re-order, discard
• Full presentation control
• Non-search apps OK

Monetization: Free or CPM or Ads

What

Barriers to entry are massive
• $300M, top talent, a prayer to get to basic parity

No monopoly over great ideas

Search anywhere
• Improve Vertical Quality w/ Web comprehensiveness
• Fragment the market, foster more players, choice, competition

Yahoo extends advertising reach, 3rd parties revenue share

Why

Why

Traditional Search Distribution

+ BOSS Distribution

Tracks

API

A self-service, web services model for developers and start-ups to quickly build and deploy new search experiences.

CUSTOM

Working with 3rd parties to build a more relevant, brand/site specific web search experience.

This option is jointly built by Yahoo! and select partners.

Interested in Custom? Email us: bosscustom@yahoo-inc.com

ACADEMIC

Working with the following universities to allow for wide-scale research in the search field:

• UIUC
• CMU
• Stanford
• Purdue
• IIT Bombay
• MIT
• UMass

http://boss.yahooapis.com/ysearch/{vert}/v1/{q}

{vert} := {web, news, images, spelling}

@ required: appid

@ optional (Y!OS compliant): start, count, lang, region, format, callback, sites

BOSS API v1
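The URL template above can be filled in with a few lines of Python. This is a minimal sketch: the template, the verticals, and the parameter names are taken from the slide, while the function name `build_boss_url` and the placeholder app id are hypothetical. It only builds the request URL and makes no network call, since the v1 endpoint may no longer be reachable.

```python
from urllib.parse import quote, urlencode

# Template, verticals, and parameter names as listed on the slide.
BOSS_TEMPLATE = "http://boss.yahooapis.com/ysearch/{vert}/v1/{q}"
VERTICALS = {"web", "news", "images", "spelling"}
OPTIONAL_PARAMS = {"start", "count", "lang", "region", "format", "callback", "sites"}


def build_boss_url(vert, query, appid, **optional):
    """Build a BOSS v1 request URL (hypothetical helper, no request is sent)."""
    if vert not in VERTICALS:
        raise ValueError("unknown vertical: %r" % vert)
    unknown = set(optional) - OPTIONAL_PARAMS
    if unknown:
        raise ValueError("unknown parameters: %s" % sorted(unknown))
    params = {"appid": appid}              # the only required parameter
    params.update(sorted(optional.items()))  # optional ones in a stable order
    base = BOSS_TEMPLATE.format(vert=vert, q=quote(query, safe=""))
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    # "APPID" is a placeholder, not a real application id.
    print(build_boss_url("web", "future of search", "APPID", count=10, start=0))
```

Percent-encoding the query keeps multi-word queries inside the `{q}` path segment.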

Python (v2.5+) library

BOSS Search SDK plus …

SQL for remixing arbitrary XML/JSON sources

Loosely Functional programming paradigm

BOSS Mashup Framework
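To give a feel for "SQL for remixing arbitrary XML/JSON sources", here is a rough sketch that joins two JSON sources with SQL. It is not the BOSS Mashup Framework API: it uses the standard-library `sqlite3` module instead, and the sample data, table names, and the `remix` function are made up for illustration.

```python
import json
import sqlite3

# Hypothetical JSON sources: web results and per-URL vote counts.
WEB_JSON = (
    '[{"url": "http://a.example/", "title": "A"},'
    ' {"url": "http://b.example/", "title": "B"}]'
)
VOTES_JSON = (
    '[{"url": "http://b.example/", "votes": 12},'
    ' {"url": "http://a.example/", "votes": 3}]'
)


def remix(web_json, votes_json):
    """Join two JSON sources on url and re-order results by votes."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE web (url TEXT, title TEXT)")
    conn.execute("CREATE TABLE votes (url TEXT, votes INTEGER)")
    conn.executemany("INSERT INTO web VALUES (:url, :title)", json.loads(web_json))
    conn.executemany("INSERT INTO votes VALUES (:url, :votes)", json.loads(votes_json))
    rows = conn.execute(
        "SELECT web.title, votes.votes FROM web "
        "JOIN votes ON web.url = votes.url "
        "ORDER BY votes.votes DESC"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(remix(WEB_JSON, VOTES_JSON))
```

The join plays the role of "blend", and the `ORDER BY` clause the role of "re-order"; a `WHERE` clause would cover "discard".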

Ported enhanced version of BMF to GAE platform

http://zooie.wordpress.com/2008/08/04/yahoo-boss-google-app-engine-integrated/

Easiest way to deploy a BOSS application online

BMF + Google App Engine

http://www.4hoursearch.com

http://123people.com

Mashable! Contest for BOSS search engines: http://mashable.com/boss/

Examples

BOSS Custom for TechCrunch

TechCrunch Network Search

• CrunchBase + Posts + Web
• Sort by time / relevance
• Enhanced results
• Domain-specific facets
• Yahoo! sponsored search
• Real-time indexing
• Special results

Research Agenda

Structured Search

• Analysis of search demand
  • Intent classification
  • General search vs. vertical
• Incentives in data supply
• Push & real-time indexing
• Search user interface
  • One box vs. multi-box
  • General vs. vertical
• Deciding search transfer
  • When? To whom?

Key Scientific Challenges
Draft: http://research.yahoo.com/ksc

1. Search intent
2. Quality metrics
3. Web mining
4. Multilingual IR
5. Nextgen search
   • Synthesized result pages
6. World knowledge

A.Z. Broder. Taxonomy of web search. SIGIR 2002.
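As a toy illustration of the "search intent" challenge, the sketch below sorts queries into the three classes of Broder's taxonomy (navigational, informational, transactional). The keyword list and the rules are assumptions made up for this example, not a method from the talk; a real classifier would need query logs and learned models.

```python
# Hypothetical cue words for transactional queries.
TRANSACTIONAL_WORDS = {"buy", "download", "book", "order", "price"}
# Hypothetical cues that a token names a site.
SITE_SUFFIXES = (".com", ".org", ".net")


def classify_intent(query):
    """Rule-based guess at query intent; a deliberately naive sketch."""
    tokens = query.lower().split()
    if any(t.startswith("www.") or t.endswith(SITE_SUFFIXES) for t in tokens):
        return "navigational"
    if any(t in TRANSACTIONAL_WORDS for t in tokens):
        return "transactional"
    return "informational"


if __name__ == "__main__":
    for q in ["yahoo.com", "buy flight tickets", "history of search engines"]:
        print(q, "->", classify_intent(q))
```

Even this crude split hints at how intent could drive the choice between general results and a transfer to a vertical search.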

More Problems

Discovery search

Web search vs. asking people

Event search

Thanks for your attention!

Yury Lifshits http://yury.name yury@yury.name