
Digitised Content in an API World

Alastair Dunning, JISC
a.dunning AT jisc.ac.uk

Resource Discovery Taskforce Meeting
London, 20th April 2011

Acronym Count: 8

types of content I’m talking about: digitised text, manuscripts, images, film footage, audio archives, newspapers, documents, music (and their metadata), i.e. the type of stuff at http://www.jisc-content.ac.uk/

such content tends to be locked down behind interfaces. usage is tied to technical infrastructure and interface

the trouble with current resources is that they demand certain ways of analysing and representing the resource – and they constitute the creators’ way of seeing the world, not the users’

SEARCH then LIST

but what would such content look like in an RDTF world, where data and service are separated? What do APIs allow to happen?

an API driven world would allow much greater flexibility over analysing a digitised dataset, i.e. different intellectual questions to be asked

and also different ways of visualising that digital content

Thanks David McCandless! http://www.informationisbeautiful.net/visualizations/


and also different ways of tailoring content for different audiences – different interfaces for schools, undergrads and researchers – all over the same content
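As a rough illustration of "different interfaces over the same content", the sketch below keeps one data-access function and puts two presentation functions on top of it. Everything in it is an assumption made for illustration: the records are placeholders, and `fetch_records`, `schools_view` and `researcher_view` are hypothetical names, not part of any real service.

```python
from typing import Dict, List

# Placeholder records, standing in for what a content API might return.
# The field names and values are invented for this sketch.
RECORDS: List[Dict[str, object]] = [
    {"title": "Placeholder item one", "year": 1805, "type": "map",
     "rights": "placeholder rights note"},
    {"title": "Placeholder item two", "year": 1720, "type": "newspaper",
     "rights": "placeholder rights note"},
]


def fetch_records() -> List[Dict[str, object]]:
    """Hypothetical stand-in for an API call: returns data only, no presentation."""
    return [dict(r) for r in RECORDS]


def schools_view(records: List[Dict[str, object]]) -> List[str]:
    """Simple interface: titles only, oldest first."""
    return [str(r["title"]) for r in sorted(records, key=lambda r: r["year"])]


def researcher_view(records: List[Dict[str, object]]) -> List[str]:
    """Fuller interface: every metadata field, in alphabetical field order."""
    return ["; ".join(f"{k}={r[k]}" for k in sorted(r)) for r in records]


if __name__ == "__main__":
    data = fetch_records()
    print(schools_view(data))
    print(researcher_view(data))
```

Both views call the same `fetch_records`, so adding a third audience means writing another view function and leaves the data untouched.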

more importantly, it can help break down the notion of a collection, and the related silos

http://www.connectedhistories.org is a great example. on the surface, it appears like any other resource


but it is based on an API architecture which allows for aggregation and cross-search of 11 enriched metadata sets for the resources listed below (note: the aggregators, not the resource providers, created the enriched metadata and the APIs, and are testing the business model behind this)

• British History Online
• British Museum Images
• Burney Newspaper Collection, 1600-1800
• Charles Booth Archive
• Clergy of the Church of England Database, 1540-1835
• House of Commons Parliamentary Papers
• John Johnson Collection of Printed Ephemera
• John Strype’s Survey of London
• London Lives, 1690-1800: Crime, Poverty, and Social Policy in the Metropolis
• Origins.net
• Proceedings of the Old Bailey Online, 1674-1913

so not only has an API architecture created a new resource for early modern British history, aggregating many disparate datasets

but others can come along and create their own interfaces – including or excluding elements of 11 resources (and adding others) as required
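The aggregation idea above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not how Connected Histories is actually built: each source's API is modelled as a plain Python function over in-memory placeholder rows, and `make_source`, `cross_search` and the source names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Record:
    source: str  # which dataset the record came from
    title: str
    year: int


# Assumption: every source exposes the same shape of search call.
SearchFn = Callable[[str], List[Record]]


def make_source(name: str, rows: List[Tuple[str, int]]) -> SearchFn:
    """Build a stand-in search API over in-memory (title, year) rows."""
    records = [Record(name, title, year) for title, year in rows]

    def search(query: str) -> List[Record]:
        q = query.lower()
        return [r for r in records if q in r.title.lower()]

    return search


def cross_search(apis: Dict[str, SearchFn], query: str,
                 include: Optional[Iterable[str]] = None) -> List[Record]:
    """Query the chosen sources and merge the hits, ordered by year then source."""
    chosen = list(apis) if include is None else [n for n in include if n in apis]
    hits: List[Record] = []
    for name in chosen:
        hits.extend(apis[name](query))
    return sorted(hits, key=lambda r: (r.year, r.source))


# Placeholder sources and rows, invented for this sketch.
apis: Dict[str, SearchFn] = {
    "source_a": make_source("source_a", [("placeholder trial record", 1750),
                                         ("placeholder parish note", 1700)]),
    "source_b": make_source("source_b", [("placeholder trial report", 1720)]),
}

if __name__ == "__main__":
    print(cross_search(apis, "trial"))                        # all sources
    print(cross_search(apis, "trial", include=["source_a"]))  # a tailored subset
```

The `include` argument is the point of the sketch: a third party building its own interface can pick some sources, leave others out, or register a new one in `apis`, without the source holders changing anything.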

within these sources, there is rich metadata about places, areas, streets, names, crimes, genders, ages, occupations – these can be exploited in myriad ways

indeed, the team will be incorporating map data and archaeological data from BL and Museum of London to allow for spatial visualisation via geographical data (maps in this case) and mashing of historical data (largely about events and people) with archaeological data (largely about objects)
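One simple way to picture this kind of mash-up is to link the two kinds of data through a shared place. The sketch below assumes both datasets already use identical place labels, which real data rarely does (matching would more likely go through coordinates or a gazetteer); the records and the `mash_by_place` helper are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Placeholder (place, description) rows, invented for this sketch.
historical: List[Tuple[str, str]] = [
    ("Place A", "placeholder event 1"),
    ("Place B", "placeholder event 2"),
    ("Place A", "placeholder event 3"),
]
archaeological: List[Tuple[str, str]] = [
    ("Place A", "placeholder object 1"),
    ("Place C", "placeholder object 2"),
]


def group_by_place(rows: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Collect descriptions under their place label, keeping input order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for place, description in rows:
        grouped[place].append(description)
    return grouped


def mash_by_place(events: List[Tuple[str, str]],
                  objects: List[Tuple[str, str]]
                  ) -> Dict[str, Tuple[List[str], List[str]]]:
    """Pair events and objects for every place that appears in both datasets."""
    ev = group_by_place(events)
    ob = group_by_place(objects)
    shared = sorted(set(ev) & set(ob))
    return {place: (ev[place], ob[place]) for place in shared}


if __name__ == "__main__":
    for place, (evs, objs) in mash_by_place(historical, archaeological).items():
        print(place, evs, objs)
```

Each entry in the result could then be plotted as one point on a map, with the events and objects shown together.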

Map - First Series Ordnance Survey, c.1805 from British Library via http://visionofbritain.org.uk/maps

http://www.shef.ac.uk/hri/locatinglondon.html




and think how this could work when you start bringing datasets and content from different subject areas – economics, anthropology, fashion

on a practical note: don’t forget sustainability – the pressure of sustaining datasets and digitised content is relaxed for the collection holder; looking after the interface becomes less important

short-term wins

• content and enthusiasm is out there, although disparate – see The New History Lab article
• visualisation can produce eye-catching success
• short bursts of funding can make things happen
• scholarly labs at KCL, UCL, Sheffield and elsewhere (BL, BUFVC)
• enthusiasm of GLAM sector (good work at V+A and Sci Museum)
• opportunities for enriching metadata via crowdsourcing

http://sas-space.sas.ac.uk/2854/

long-term challenges

• getting people to build and document and sustain APIs; explaining to collection curators how and why to do it
• (some) publishers suspicious
• getting people to build interfaces on top of APIs; technical knowledge required to do so
• quality of metadata; who owns enriched metadata?
• business models unclear; related licensing
• interoperability between APIs?
• citation
• academic scepticism + misunderstanding

{the web changes}
