Drupal case study: ABC Dig Music

Upload: david-peterson

Post on 02-Nov-2014

DESCRIPTION

A Drupal case study on developing the Australian Broadcasting Corporation's Dig Music website. I gave this talk at Drupal Downunder #ddu2011 in Brisbane, Australia (Jan 23, 2011). I discuss how the Semantic Web was used to create a real-time snapshot of a musical artist that is pulled live from the digital radio broadcast. I also talk about performance issues we encountered and ways that they were overcome.

TRANSCRIPT

Page 1: Drupal case study: ABC Dig Music

Case Study – ABC Dig Music

David Peterson @davidseth #ddu2011 http://www.flickr.com/photos/soyignatius/

Page 2: Drupal case study: ABC Dig Music

David Peterson @davidseth

Page 3: Drupal case study: ABC Dig Music

Challenge

Create a snapshot of an artist

Page 4: Drupal case study: ABC Dig Music

Combining

• Known Data

• Data in the Wild

Page 5: Drupal case study: ABC Dig Music

Problem

<xml>
  <track>
    <title>Purple Rain</title>
    <artistName>Prince</artistName>
  </track>
</xml>
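A minimal sketch of the first step: reading the track title and artist name out of the broadcast feed. It assumes the feed looks exactly like the sample on this slide. The function name `parse_track` is hypothetical, and Python is used only for illustration (the site itself is Drupal/PHP).

```python
import xml.etree.ElementTree as ET

# Sample payload copied from the slide; the real feed may carry more fields.
SAMPLE = """<xml>
  <track>
    <title>Purple Rain</title>
    <artistName>Prince</artistName>
  </track>
</xml>"""


def parse_track(payload: str) -> dict:
    """Return the title and artist name of the first <track> element."""
    root = ET.fromstring(payload)
    track = root.find("track")
    if track is None:
        raise ValueError("no <track> element in payload")
    return {
        "title": (track.findtext("title") or "").strip(),
        "artist": (track.findtext("artistName") or "").strip(),
    }
```

All the feed gives us is two strings, which is why the rest of the talk is about finding everything else.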

Page 6: Drupal case study: ABC Dig Music

Into

Page 7: Drupal case study: ABC Dig Music

It’s all about Storytelling…

Page 8: Drupal case study: ABC Dig Music

Shared Understanding

• Can’t tell a story if the other person doesn’t get what we mean

• Or even speak the same language

Page 9: Drupal case study: ABC Dig Music

• The story matters

• ... but ...

• You never really have all the information you need, whether big or small

Page 10: Drupal case study: ABC Dig Music

You Just don’t Always Know

• Someone else knows more than you

• How to find it?

Page 11: Drupal case study: ABC Dig Music

One Exception

Page 12: Drupal case study: ABC Dig Music

Semantic Web

• Core idea

– you never really know the entire picture

• This is a “good thing”

• Freedom

Page 13: Drupal case study: ABC Dig Music

Closed World

Open World

http://www.flickr.com/photos/almasryalyoum_e/

Page 18: Drupal case study: ABC Dig Music

“If the graph of people is cool, imagine a graph of everything” - Dries Buytaert

Page 19: Drupal case study: ABC Dig Music

Open Data

Page 20: Drupal case study: ABC Dig Music

Facebook?

• A little late to the party ;)

Page 21: Drupal case study: ABC Dig Music

Finding a Solution

• Which APIs to use

• Which APIs can we use

• How can we combine data from multiple sources

• How can we automate it

Page 22: Drupal case study: ABC Dig Music

The Curse of too Much

• There are over 50 music APIs listed on programmableweb.com

• Too many to look into

• Each has its own API methods and return data formats

– JSON, XML, RSS, RDF !!!
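To make the format problem concrete, here is a small sketch that maps two differently shaped responses onto one common record. Both payload shapes and both function names are hypothetical (they are not the real response formats of any API named in this talk); the identifier is the one shown on a later slide.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical payloads: same artist, two different wire formats.
JSON_SAMPLE = '{"artist": {"name": "Prince", "mbid": "070d193a-845c-479f-980e-bef15710653e"}}'
XML_SAMPLE = '<artist id="070d193a-845c-479f-980e-bef15710653e"><name>Prince</name></artist>'


def from_json(payload: str) -> dict:
    """Normalise the hypothetical JSON shape to {"name", "mbid"}."""
    artist = json.loads(payload)["artist"]
    return {"name": artist["name"], "mbid": artist.get("mbid")}


def from_xml(payload: str) -> dict:
    """Normalise the hypothetical XML shape to {"name", "mbid"}."""
    artist = ET.fromstring(payload)
    return {"name": artist.findtext("name"), "mbid": artist.get("id")}
```

Every extra source means one more adapter like these, which is the "curse" in practice.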

Page 23: Drupal case study: ABC Dig Music

Take your Pick

• APIs everywhere

– BBC Music

– Discogs

– Last.fm

– MusicBrainz

– Yahoo Music

– Flickr

– YouTube

– The Hype Machine

Page 24: Drupal case study: ABC Dig Music

Finding the Key

• One common feature was the usage of a MusicBrainz ID

– Last.fm

– Discogs

– Freebase

– Wikipedia/DBpedia

– BBC
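A rough sketch of why a shared ID matters: once every source tags its records with the same MusicBrainz ID, combining them is a plain merge on that key. The function name `merge_by_mbid` and the record fields are hypothetical, and the values are placeholders rather than real API output.

```python
from collections import defaultdict


def merge_by_mbid(*sources):
    """Merge lists of records into one dict per MusicBrainz ID.

    Records without an "mbid" are skipped. When two sources supply the
    same field, the value from the earlier source is kept.
    """
    merged = defaultdict(dict)
    for records in sources:
        for record in records:
            mbid = record.get("mbid")
            if not mbid:
                continue  # nothing to join on
            for key, value in record.items():
                if key != "mbid":
                    merged[mbid].setdefault(key, value)
    return dict(merged)


# Placeholder data standing in for two different services.
MBID = "070d193a-845c-479f-980e-bef15710653e"
source_a = [{"mbid": MBID, "bio": "placeholder bio"}]
source_b = [{"mbid": MBID, "homepage": "https://example.org/artist"},
            {"name": "record with no ID"}]
snapshot = merge_by_mbid(source_a, source_b)
```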

Page 25: Drupal case study: ABC Dig Music

Eureka!

• Great, now all I had to do was use the MusicBrainz API to look up the ID and I was done. Easy...

• :(

• The search API sucked. It returned too many fuzzy results

• crap

Page 26: Drupal case study: ABC Dig Music

Back to the Future

• This is where the Semantic Web enters the picture

– All that stuff about storytelling

– Shared understanding

– URIs (web links)

Page 27: Drupal case study: ABC Dig Music

SPARQL

Think of it as Google with a WHERE clause

Page 28: Drupal case study: ABC Dig Music

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?artist WHERE {
  ?artist foaf:name "Prince"@en .
  ?artist a <http://dbpedia.org/ontology/MusicalArtist> .
}

Page 29: Drupal case study: ABC Dig Music

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbpedia2: <http://dbpedia.org/property/>

SELECT ?artist ?bio ?url ?album WHERE {
  ?artist foaf:name "Prince"@en .
  ?artist a <http://dbpedia.org/ontology/MusicalArtist> .
  ?artist dbpedia2:abstract ?bio .
  ?artist foaf:page ?url .
  OPTIONAL {
    ?album <http://dbpedia.org/ontology/artist> ?artist .
    ?album rdfs:label "Purple Rain"@en .
  }
}
LIMIT 1

Page 30: Drupal case study: ABC Dig Music

Pinpoint Results

• This returns ONE result

• “exactly” what we are looking for (or nothing!)
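A hedged sketch of the "one result or nothing" idea from the calling side. It builds a request URL for DBpedia's public SPARQL endpoint and reads the first binding from a response in the standard SPARQL JSON results format. The function names are hypothetical, the `format` parameter is an assumption about how the endpoint selects JSON output, and no network call is made here.

```python
import json
from typing import Optional
from urllib.parse import urlencode

ENDPOINT = "https://dbpedia.org/sparql"  # DBpedia's public SPARQL endpoint

QUERY_TEMPLATE = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?artist WHERE {
  ?artist foaf:name "%s"@en .
  ?artist a <http://dbpedia.org/ontology/MusicalArtist> .
}
LIMIT 1
"""


def build_url(name: str) -> str:
    """Return a GET URL that runs the artist lookup for `name`."""
    # Escape backslashes and double quotes so the name stays inside the literal.
    safe = name.replace("\\", "\\\\").replace('"', '\\"')
    params = {
        "query": QUERY_TEMPLATE % safe,
        # Assumption: the endpoint accepts this parameter to request JSON.
        "format": "application/sparql-results+json",
    }
    return ENDPOINT + "?" + urlencode(params)


def first_artist(results_json: str) -> Optional[str]:
    """Return the ?artist URI of the first binding, or None if there is none."""
    bindings = json.loads(results_json)["results"]["bindings"]
    return bindings[0]["artist"]["value"] if bindings else None
```

Returning `None` for an empty result keeps "nothing" distinct from a wrong guess, so the caller can simply skip artists that cannot be pinned down.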

Page 31: Drupal case study: ABC Dig Music

{170d193a-845c-479f-980e-bef15710653e}

http://www.flickr.com/photos/riseofphoenix/

Page 32: Drupal case study: ABC Dig Music

{070d193a-845c-479f-980e-bef15710653e}

http://www.flickr.com/photos/angeldew/

Page 33: Drupal case study: ABC Dig Music

Raw Data

• Not too pretty to look at

• But computers LOVE this stuff

Page 34: Drupal case study: ABC Dig Music

So, what do we get

• Disambiguation

• MusicBrainz ID

• Discography

• Related Artists

• Official homepage

• Bio

• Credit card details (sometime in 2012)

Page 35: Drupal case study: ABC Dig Music

The Rosetta Stone

• MusicBrainz ID is our key to the wild web of APIs

• Wikipedia URL is the key to Semantic Web

• One happy family :)

http://www.flickr.com/photos/vportals/
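A small sketch of the Wikipedia side of the Rosetta Stone. DBpedia names its resources after English Wikipedia article titles, so a Wikipedia URL can be turned into a DBpedia URI by swapping the prefix. The function name `wikipedia_to_dbpedia` is hypothetical and the sketch only handles plain English Wikipedia article URLs.

```python
from urllib.parse import urlparse

WIKI_PREFIX = "/wiki/"
DBPEDIA_PREFIX = "http://dbpedia.org/resource/"


def wikipedia_to_dbpedia(url: str) -> str:
    """Map an English Wikipedia article URL to its DBpedia resource URI."""
    parsed = urlparse(url)
    if parsed.netloc != "en.wikipedia.org" or not parsed.path.startswith(WIKI_PREFIX):
        raise ValueError("not an English Wikipedia article URL: %r" % url)
    title = parsed.path[len(WIKI_PREFIX):]
    if not title:
        raise ValueError("URL has no article title: %r" % url)
    return DBPEDIA_PREFIX + title
```

For example, `wikipedia_to_dbpedia("http://en.wikipedia.org/wiki/Prince_(musician)")` gives `http://dbpedia.org/resource/Prince_(musician)`.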

Page 37: Drupal case study: ABC Dig Music

Take a look

[browser]

Page 38: Drupal case study: ABC Dig Music

Hindsight is 20/20

... or lessons learned

Page 39: Drupal case study: ABC Dig Music

Drupal Sucks

• Drupal performance, what performance?

Page 40: Drupal case study: ABC Dig Music

Don’t use Drupal

• To get the best performance out of Drupal 6, don’t use Drupal 6!

Page 41: Drupal case study: ABC Dig Music

Pressflow

• Key patches and enhancements

• Releases mirror official Drupal releases

• Big players are using it

– Drupal.org

– ABC

– Music labels

– Newspapers

Page 42: Drupal case study: ABC Dig Music

Start your Engines

MySQL base install is ... lacking

• MyISAM == slow

• Use Percona XtraDB

• ... or ... InnoDB
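Switching engines comes down to one `ALTER TABLE ... ENGINE=InnoDB` statement per table. The sketch below only generates those statements for review; it does not connect to a database. The function name is hypothetical and the table names in the usage line are just examples.

```python
import re

# Conservative whitelist so a table name can never break out of the backticks.
_TABLE_NAME = re.compile(r"^[A-Za-z0-9_]+$")


def innodb_statements(tables):
    """Return one ALTER TABLE statement per table name."""
    statements = []
    for table in tables:
        if not _TABLE_NAME.match(table):
            raise ValueError("unexpected table name: %r" % table)
        statements.append("ALTER TABLE `%s` ENGINE=InnoDB;" % table)
    return statements


statements = innodb_statements(["node", "users"])
```

Converting a large table rebuilds it, so this is best run in a maintenance window.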

Page 43: Drupal case study: ABC Dig Music

Reduce your footprint

• APC

– PHP app is compiled & cached in memory

• Memcached
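The idea behind putting Memcached in front of slow work is the cache-aside pattern: look in the cache first, compute only on a miss, store the result with an expiry. The sketch below uses an in-process dict as a stand-in for Memcached, so it shows the pattern rather than a real client; the class and method names are hypothetical.

```python
import time


class TTLCache:
    """In-memory stand-in for a cache server, with per-entry expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store = {}     # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        """Return the cached value for key, calling compute() only on a miss."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh hit
        value = compute()    # miss or expired: do the slow work once
        self._store[key] = (now + self._ttl, value)
        return value
```

An artist snapshot built from several remote APIs is a natural candidate: `cache.get_or_compute(mbid, lambda: build_snapshot(mbid))`, where `build_snapshot` is whatever slow lookup you already have.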

Page 44: Drupal case study: ABC Dig Music

Search

• Drupal’s built-in search can be a dawg

• Solr

– Much faster search

– Offers faceting

– Can become a platform in its own right

Page 45: Drupal case study: ABC Dig Music

A Fresh Coat of Paint

• Varnish

– Last but certainly not least

– Up to millions of hits per hour

Page 46: Drupal case study: ABC Dig Music

Performance Optimisations

• Switch host to Linode

• Two-server architecture - db server and app server

• Master-slave replication for MySQL

• Migrated Drupal to Pressflow

• Changed tables to InnoDB

• Varnish for serving pages

• Memcached for caching

• Set up Munin to monitor servers

Page 47: Drupal case study: ABC Dig Music

An Alternate Future

RDFa, Views, Entities, Fields, Media, Stream Wrappers, MongoDB

Page 48: Drupal case study: ABC Dig Music

An Alternate Future

• Drupal 7

– RDFa

– Views 3

– Entities

– Fields

– Media Module

– Stream Wrappers

– MongoDB