Date post: | 30-Oct-2014 |
Category: |
Technology |
Upload: | david-peterson |
THE MASHED UP PLAYLIST part II
David Peterson @davidseth #w3c http://www.flickr.com/photos/soyignatius/
Challenge
Create a snapshot of an artist
Problem
<xml><track>
<title>Purple Rain</title><artistName>Prince</artistName>
</track></xml>
Into
It’s all about storytelling
Shared Understanding
• Can’t tell a story if the other person doesn’t get what we mean
• Or even speak the same language
• Imagine
  – explain what a kiwi was
  – or what a sheep was
• The story matters
• ... but ...
• You never really have all the information you need, whether big or small
You Just don’t Always Know
• Someone else knows more than you
• How to find it?
One Exception
Semantic Web
• Core idea – you never really know the entire picture
• This is a good thing
• Freedom
Closed World
Open World
http://www.flickr.com/photos/almasryalyoum_e/
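The difference between the two assumptions can be sketched in a few lines. This is only an illustration: the fact store, the `closed_world` and `open_world` helpers, and the "Unknown B-side" title are hypothetical names made up for the example.

```python
from typing import Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical fact store holding the one fact from the track example.
KNOWN: Set[Triple] = {
    ("Prince", "recorded", "Purple Rain"),
}


def closed_world(fact: Triple) -> bool:
    """Closed world: anything that is not recorded is treated as false."""
    return fact in KNOWN


def open_world(fact: Triple) -> Optional[bool]:
    """Open world: a missing fact is unknown (None), not false."""
    return True if fact in KNOWN else None
```

Under the closed-world reading a missing track simply does not exist; under the open-world reading someone else may still know about it.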
Finding a Solution
• Which APIs to use
• Which APIs can we use
• How can we combine data from multiple sources
• How can we automate it
The Curse of too much
• There are over 50 APIs listed on programmableweb.com
• Too many to look into
• Each has its own API methods and return data formats
  – JSON, XML, RSS, RDF !!!
Take your Pick
• APIs everywhere
  – BBC Music
  – Discogs
  – Last.fm
  – MusicBrainz
  – Yahoo Music
  – Flickr
  – Youtube
  – The Hype Machine
Finding the key
• One common feature was the usage of a MusicBrainz ID
  – Last.fm
  – Discogs
  – Freebase
  – Wikipedia/DBpedia
  – BBC
Eureka!
• Great, now all I had to do was use the MusicBrainz API to look up the ID and I was done. Easy...
• :(
• The search API sucked. It returned too many fuzzy results
• crap
Back to the future
• This is where the Semantic Web enters the picture
  – All that stuff about storytelling
  – Shared understanding
  – URIs (web links)
SPARQL
Think of it as Google with a WHERE clause
SELECT ?artist WHERE {
  ?artist foaf:name "Prince"@en .
  ?artist a <http://dbpedia.org/ontology/MusicalArtist> .
}

SELECT ?artist ?bio ?url ?album WHERE {
  ?artist foaf:name "Prince"@en .
  ?artist a <http://dbpedia.org/ontology/MusicalArtist> .
  ?artist dbpedia2:abstract ?bio .
  ?artist foaf:page ?url .
  OPTIONAL {
    ?album <http://dbpedia.org/ontology/artist> ?artist .
    ?album rdfs:label "Purple Rain"@en .
  }
}
LIMIT 1
Pinpoint results
• This returns ONE result
• “exactly” what we are looking for (or nothing!)
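To show how the "one result or nothing" behaviour might be handled in code, here is a minimal Python sketch using only the standard library. It rebuilds the first artist query with `LIMIT 1` added, and reads the answer from the standard SPARQL JSON results layout. The endpoint URL and the helper names (`build_artist_query`, `run_query`, `first_artist_uri`) are assumptions for illustration, not part of the talk.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

# Assumption: the public DBpedia SPARQL endpoint; swap in your own.
ENDPOINT = "https://dbpedia.org/sparql"


def build_artist_query(name: str, lang: str = "en") -> str:
    """Build the artist lookup query for a given name, limited to one row."""
    # Escape backslashes and double quotes so the name stays one literal.
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
        "SELECT ?artist WHERE {\n"
        f'  ?artist foaf:name "{escaped}"@{lang} .\n'
        "  ?artist a <http://dbpedia.org/ontology/MusicalArtist> .\n"
        "} LIMIT 1"
    )


def run_query(query: str, endpoint: str = ENDPOINT) -> dict:
    """Send the query over HTTP GET and decode the JSON reply (needs network)."""
    params = urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        f"{endpoint}?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def first_artist_uri(results: dict) -> Optional[str]:
    """Return the ?artist URI from SPARQL JSON results, or None if empty."""
    bindings = results.get("results", {}).get("bindings", [])
    if not bindings:
        return None
    return bindings[0]["artist"]["value"]
```

A call such as `first_artist_uri(run_query(build_artist_query("Prince")))` would then yield either a single URI or `None`, depending on what the endpoint holds at the time.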
{170d193a-845c-479f-980e-bef15710653e}
http://www.flickr.com/photos/riseofphoenix/
{070d193a-845c-479f-980e-bef15710653e}
http://www.flickr.com/photos/angeldew/
Raw Data
• Not too pretty to look at
• But computers LOVE this stuff
So, what do we get
• Disambiguation
• MusicBrainz ID
• Discography
• Related Artists
• Official homepage
• Bio
• Credit card details (in Semantic Web 2.0)
The Rosetta Stone
• MusicBrainz ID is our key to the wild web of APIs
• Wikipedia URL is the key to the Semantic Web
• One happy family
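Using the MusicBrainz ID as a join key could look roughly like the sketch below. It assumes each API's response has already been reduced to a list of dicts carrying an `mbid` field. The `merge_by_mbid` helper, the field names, and the placeholder values are all hypothetical; real responses will be shaped differently.

```python
from typing import Dict, List

# An ID taken from the slides, used here purely as a sample key.
EXAMPLE_MBID = "070d193a-845c-479f-980e-bef15710653e"


def merge_by_mbid(sources: Dict[str, List[dict]]) -> Dict[str, dict]:
    """Group records from several sources under their shared MusicBrainz ID.

    Each record must carry an "mbid" key. The remaining fields are stored
    under the source's name, so fields from different APIs never collide.
    """
    merged: Dict[str, dict] = {}
    for source_name, records in sources.items():
        for record in records:
            mbid = record["mbid"].lower()  # normalise case before joining
            fields = {k: v for k, v in record.items() if k != "mbid"}
            merged.setdefault(mbid, {})[source_name] = fields
    return merged


# Placeholder records; field names and values are illustrative only.
sources = {
    "lastfm": [{"mbid": EXAMPLE_MBID, "listeners": "<from API>"}],
    "discogs": [{"mbid": EXAMPLE_MBID.upper(), "releases": "<from API>"}],
}
snapshot = merge_by_mbid(sources)
```

Keeping each source's fields under its own name avoids having to decide up front which API wins when two of them disagree.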
http://www.flickr.com/photos/vportals/
• [insert LOD graph]
Take a look
[browser]
Hindsight is 20/20
... or lessons learned
Drupal Sucks
• Drupal performance, what performance?
• Out of the box it’s been beaten with an ugly stick
Don’t use Drupal
• To get the best performance out of Drupal, don’t use Drupal
Pressflow
• Key patches and enhancements
• Releases mirror official Drupal releases
• Big players are using it
  – Drupal.org
  – ABC
  – Music labels
  – Newspapers
Start your Engines
• MySQL base install is ... lacking
• MyISAM == slow
• Use Percona XtraDB
• ... or ... InnoDB
Reduce your footprint
• APC
  – PHP app is compiled & cached in memory
Search
• Drupal’s built-in search can be a dawg
• Solr
  – Much faster search
  – Offers faceting
  – Can become a platform in its own right
A Fresh Coat of Paint
• Varnish
  – Last but certainly not least
  – Up to 10 million hits per hour
What’s Next?
• Project Mercury
• Drupal 7
  – RDFa
  – Views 3
  – FOAF+SSL
• open social networking
• everything under your control