Yahoo7 Semantic Web Presentation

Post on 28-Nov-2014

Yahoo7 Open Night Semantic Web presentation by Paul White

Semantic Web

Paul White

Yahoo!7 Proprietary and Confidential

The Problems

● Yahoo!7 is known for having great breadth of content, but not depth

The Problems

● Content is siloed across different sites and knowledge domains

● Users get stuck in these property silos

The Problems

● Site structure doesn’t allow for natural human browsing patterns

● Visitors browse by content, not type

The Tech

● So, what do we use to solve these problems? It’s all about relationships!

● Resource Description Framework (RDF)

○ Basic building block of the Semantic Web

○ Consistent format for representing data

○ Triples: Subject -> Predicate -> Object

○ Example predicates: Portrays, Character In

● Web Ontology Language (OWL)

○ Ontologies - everything else!

○ What is a TV Show?

○ What is an Actor?

○ What is a Sporting Event?

○ What are their relationships?
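As a rough illustration of the triple model above, the sketch below stores subject, predicate, object statements as plain Python tuples. The entity names are borrowed from examples elsewhere in this talk; the `Triple` type and the `objects_of` helper are hypothetical, and a real system would identify things by URI and use an RDF library rather than bare strings.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One statement: subject -> predicate -> object."""
    subject: str
    predicate: str
    obj: str


# Illustrative data only; the strings stand in for real URIs.
triples = [
    Triple("Melanie Vallejo", "portrays", "Sophie Wong"),
    Triple("Sophie Wong", "character in", "Winners and Losers"),
    Triple("Winners and Losers", "is a", "TV Show"),
]


def objects_of(subject: str, predicate: str) -> List[str]:
    """Return every object linked to `subject` by `predicate`."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


# Follow two relationships in a row: actor -> character -> show.
character = objects_of("Melanie Vallejo", "portrays")[0]
show = objects_of(character, "character in")[0]
```

Because every fact has the same three-part shape, following a chain of relationships is just repeated lookup, which is what lets content from different silos be joined.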

The Tech

● RDF Schema

○ Basic data types

○ Classes/Subclasses

○ Properties (Predicates)
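To make the role of classes and properties more concrete, here is a minimal sketch of how a schema could attach a domain and a range to each predicate and use them to work out what kind of thing each resource is. The `PROPERTY_SCHEMA` table and `infer_types` function are assumptions made for illustration; they mimic the idea behind RDF Schema's domain and range declarations without using any RDF library.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical schema: each predicate declares the class of its
# subject (domain) and the class of its object (range).
PROPERTY_SCHEMA = {
    "portrays": {"domain": "Actor", "range": "Character"},
    "character in": {"domain": "Character", "range": "TV Show"},
}


def infer_types(
    triples: Iterable[Tuple[str, str, str]]
) -> Dict[str, Set[str]]:
    """Infer a class for each resource from the predicates it appears with."""
    types: Dict[str, Set[str]] = {}
    for subject, predicate, obj in triples:
        schema = PROPERTY_SCHEMA.get(predicate)
        if schema is None:
            continue  # predicate not described by the schema
        types.setdefault(subject, set()).add(schema["domain"])
        types.setdefault(obj, set()).add(schema["range"])
    return types


data = [
    ("Melanie Vallejo", "portrays", "Sophie Wong"),
    ("Sophie Wong", "character in", "Winners and Losers"),
]
types = infer_types(data)
```

Note that the schema is used here to infer types rather than to reject data, so a resource can pick up its class purely from how it is used.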

The Tech

● Graphs

[Figure: example graph. Nodes (subjects / objects): Yahoo!7 TV, Yahoo!7 Sport, Yahoo!7 Lifestyle, Winners and Losers, Melbourne Cup Carnival, Derby Day, Crown Oaks, TV Show, Sporting Event, Social Event, Adelaide, Matt Kingston, DBPedia. Edge labels (predicates): Portrays, Character In, born in, is a, spouse, Attended, an event in.]

The Tech

● Storage

● Built specifically for storing and reading graph data

● Stores everything - data, metadata, schemas, ontologies

● Allows for inferencing - e.g.

○ Sporting Event subclassOf Event

○ Melbourne Cup Festival subclassOf Sporting Event

therefore

○ Melbourne Cup Festival subclassOf Event
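The inference step above can be sketched as a transitive closure over subclassOf statements. This is a toy version under stated assumptions: `infer_subclasses` is a hypothetical helper, and a real store would apply a full rule set rather than this single rule.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]  # (subclass, superclass)


def infer_subclasses(direct: Set[Pair]) -> Set[Pair]:
    """Apply 'A subclassOf B and B subclassOf C => A subclassOf C'
    repeatedly until no new statements appear."""
    inferred = set(direct)
    changed = True
    while changed:
        changed = False
        for a, b in list(inferred):
            for c, d in list(inferred):
                if b == c and (a, d) not in inferred:
                    inferred.add((a, d))
                    changed = True
    return inferred


direct = {
    ("Sporting Event", "Event"),
    ("Melbourne Cup Festival", "Sporting Event"),
}
closure = infer_subclasses(direct)
new_facts = closure - direct  # statements nobody entered by hand
```

Only the two direct statements are stored; the third is derived, so content tagged with the narrower class can surface wherever the broader class is requested.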

● Query Language

● Designed for querying triples

CONSTRUCT { ?Actor portrays ?Character }

WHERE { ?Actor portrays ?Character }

Results in

Melanie Vallejo portrays Sophie Wong ;

Ray Meagher portrays Alf Stewart ;

Hayden Christensen portrays Anakin Skywalker ;

Hayden Christensen portrays Darth Vader .

Legend: predicate, ?Variable
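To show roughly what the query is doing, the sketch below matches a triple pattern containing `?variables` against a small in-memory list and then fills a template with the bindings it finds. It is plain Python, not a real query engine; `match` and `construct` are hypothetical helpers, and the extra "character in" triple is added only to show that non-matching statements are skipped.

```python
from typing import Dict, List, Tuple

TripleT = Tuple[str, str, str]

triples: List[TripleT] = [
    ("Melanie Vallejo", "portrays", "Sophie Wong"),
    ("Ray Meagher", "portrays", "Alf Stewart"),
    ("Hayden Christensen", "portrays", "Anakin Skywalker"),
    ("Hayden Christensen", "portrays", "Darth Vader"),
    ("Sophie Wong", "character in", "Winners and Losers"),
]


def match(pattern: TripleT, data: List[TripleT]) -> List[Dict[str, str]]:
    """Return one dict of variable bindings per triple matching the pattern."""
    results = []
    for triple in data:
        bindings: Dict[str, str] = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable binds to anything, but must bind consistently.
                if bindings.setdefault(term, value) != value:
                    break
            elif term != value:
                break  # a fixed term must match exactly
        else:
            results.append(bindings)
    return results


def construct(template: TripleT, pattern: TripleT,
              data: List[TripleT]) -> List[TripleT]:
    """Build new triples by substituting each set of bindings into the template."""
    return [tuple(b.get(term, term) for term in template)
            for b in match(pattern, data)]


pattern = ("?Actor", "portrays", "?Character")
result = construct(pattern, pattern, triples)
```

With the same pattern used as both template and condition, `result` holds the four "portrays" triples listed in the results above and leaves out the "character in" one.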

The Tech

● Named Entity Recognition

● Specialised branch of Natural Language Processing

● Generate annotations that link from this thing to that thing

● Two steps

○ Name Recognition

○ Entity Classification
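The two steps above can be sketched with a simple lookup table (a gazetteer). This is an assumption-heavy toy: production systems typically rely on statistical or machine-learned models, and the `GAZETTEER` entries, class labels, and function names here are purely illustrative.

```python
import re
from typing import List, Tuple

# Hypothetical gazetteer mapping known names to an entity class.
GAZETTEER = {
    "Luke Jacobz": "Actor",
    "Home and Away": "TV Show",
    "Melbourne Cup Carnival": "Sporting Event",
}


def recognise_names(text: str) -> List[Tuple[int, str]]:
    """Step 1, name recognition: find where known names occur in the text."""
    # Try longer names first so a long name wins over a shorter overlap.
    names = sorted(GAZETTEER, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in names))
    return [(m.start(), m.group()) for m in pattern.finditer(text)]


def classify(name: str) -> str:
    """Step 2, entity classification: decide what kind of thing the name is."""
    return GAZETTEER[name]


def annotate(text: str) -> List[Tuple[str, str, int]]:
    """Produce (name, class, offset) annotations that could become hyperlinks."""
    return [(name, classify(name), start)
            for start, name in recognise_names(text)]


text = "Luke Jacobz starred in Home and Away."
annotations = annotate(text)
```

Each annotation carries the character offset of the match, which is what a publishing system would need in order to wrap the name in a link to the related content.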

The Results

● Show silos broken down

X-Factor Content Stream - Luke Jacobz

Home and Away Content Stream - Luke Jacobz

The Results

● Sites are now structured to surface related content

The Results

● Automatic discovery and hyperlinking of named entities

Questions?