Archives, algorithms and people
DESCRIPTION
How we put the BBC World Service radio archive online using machines and crowdsourcing. A talk given to the UK Museums on the Web conference, November 2013.

One of the major challenges of a big digitisation project is that you can simply swap out an under-used physical archive for its digital equivalent. Without easy ways to navigate the data, there is no way for your users to get to the bits they want. We recently worked with the BBC World Service to generate metadata for their radio archive: 50,000 programmes from over 45 years. We first used algorithms to generate "good enough" topics so we could put the archive online, and then used crowdsourcing to improve the data.

Throughout 2013 we have been running this experiment to crowdsource improvements to the metadata we automatically created. At http://worldservice.prototyping.bbc.co.uk people can search and browse for programmes, listen to them, and correct and add topics. This talk describes how we went about it and what we have learnt with this massive online multimedia archive: about understanding audio, automatically generating topics and crowdsourcing improvements to the data.

TRANSCRIPT
Tristan Ferne / @tristanf
Executive Producer
BBC Research & Development
Archives, algorithms and people
or
How we put the BBC World Service radio archive online using machines and
crowdsourcing
The BBC World Service archive
1947-2012
Spelling mistake
Missing data
Sometimes incorrect data
No semantic data
The missing metadata
How it works
Listening machines
Noisy transcripts
Algorithms
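The steps on these slides — listening machines producing noisy speech-to-text transcripts, then algorithms turning those transcripts into topics — can be illustrated with a minimal sketch. This is not the actual BBC R&D pipeline (which linked terms to semantic concepts); it assumes a plain-text transcript and uses simple term frequency against a stopword list to surface candidate topics.

```python
from collections import Counter
import re

# A tiny illustrative stopword list; a real system would use a much larger one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that",
             "this", "on", "for", "with", "as", "was", "were", "at", "by"}

def candidate_topics(transcript, top_n=5):
    """Return the most frequent non-stopword terms in a noisy transcript.

    A stand-in for real topic extraction: frequency alone is crude, but it
    shows the basic idea of going from raw transcript text to topic tags.
    """
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [term for term, _ in counts.most_common(top_n)]

# Hypothetical transcript fragment, not a real World Service programme.
transcript = ("the election in ghana was held on tuesday and the "
              "election results in ghana were disputed by the opposition")
print(candidate_topics(transcript, top_n=3))
```

Because speech-to-text output is noisy, a real system would also need to tolerate misrecognised words, which is one reason "good enough" rather than perfect topics was the initial goal.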
Algorithms and people
The prototype
worldservice.prototyping.bbc.co.uk
Show Synopsis editing version
worldservice.prototyping.bbc.co.uk
Machine learning
Results
70,000 tag edits
How much data?
1,000 synopsis edits
71,000 edits
36,000 listenable programmes
1m machine tags
70,000 programmes
3,000 users
36% of programmes listened to
21% of programmes tagged
And four lost programmes
Tags are a large and sparse space
When is a tag correct?
When is a programme tagged completely?
How do you measure crowd-sourced data?
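One simple answer to "when is a tag correct?" is majority voting: only accept a crowd-sourced tag once enough users have weighed in and most of them agree. This sketch is illustrative, not a description of how the prototype actually scored edits; the function name and thresholds are assumptions.

```python
def accepted_tags(votes, min_votes=3, threshold=0.5):
    """Accept a tag when enough users voted and a majority agreed.

    `votes` maps tag -> (upvotes, downvotes). The thresholds here are
    arbitrary; sensible values depend on how many users see each programme.
    """
    accepted = []
    for tag, (up, down) in votes.items():
        total = up + down
        if total >= min_votes and up / total > threshold:
            accepted.append(tag)
    return accepted

# Hypothetical vote counts for one programme.
votes = {"Ghana": (4, 1), "Cricket": (1, 3), "Music": (1, 0)}
print(accepted_tags(votes))
```

Note that "Music" above is undecided rather than wrong: with a sparse tag space and few listeners per programme, many tags never accumulate enough votes, which is exactly why measuring completeness is hard.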
How good is the data?
Who does the work?
1 person = 30% of edits
10 people = 70% of edits
10% of people = 98% of edits
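This heavily skewed participation is typical of crowdsourcing projects. The shares above can be computed from per-user edit counts as the cumulative fraction contributed by the most active users; the counts below are invented to echo the reported 30%/70% figures, not the real project data.

```python
def top_share(edit_counts, top_n):
    """Fraction of all edits made by the `top_n` most active users."""
    counts = sorted(edit_counts, reverse=True)
    return sum(counts[:top_n]) / sum(counts)

# Hypothetical heavy-tailed edit counts: one very active user, a small
# core of regulars, and a long tail of users who made a few edits each.
edits = [1500, 500, 400, 300, 250, 200, 150, 100, 60, 40] + [15] * 100
print(f"top 1 user: {top_share(edits, 1):.0%}")
print(f"top 10 users: {top_share(edits, 10):.0%}")
```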
The shape of the archive
Places mentioned
Linking from the News
The Last Danish Christmas Broadcast
“Entirely in Danish”
We can significantly improve the data
It’s cost-effective with re-usable technology
A crowdsourcing approach
What we’ve learnt
How good are the machine tags?
How much crowdsourcing do you need?
When is your data good enough?
Open questions
worldservice.prototyping.bbc.co.uk
www.bbc.co.uk/rd
github.com/bbcrd
[email protected]
@tristanf