Big UK Domain Data for the Arts and Humanities
Jane Winters (Professor of Digital History, Institute of Historical Research)
IIPC General Assembly Open Conference, 27 April 2015
AHRC Big Data Call
• Part of the Digital Transformations theme
• Aim to address the challenges of working with big data
• A total of 21 projects funded
• A number of large-scale collaborative projects, of which BUDDAH is one
Project partners
• Institute of Historical Research – Jonathan Blaney and Jane Winters
• British Library – Helen Hockx-Yu, Andy Jackson and Peter Webster
• Oxford Internet Institute – Eric Meyer, Ralph Schroeder and Josh Cowls
• Aarhus University – Niels Brügger
Aims
• To highlight the value of web archives for research
• To develop a theoretical and methodological framework for the analysis of web archives
• To explore the ethical implications of big data research
• To inform collections development and access arrangements at the British Library
• To train researchers in the use of big data
Project outputs
• An enhanced interface providing access to the 1996-2013 data at the British Library
• An open-access monograph: The Web as History: Using Web Archives to Understand the Past and the Present (UCL Press)
• An online training module
• Two short animations explaining web archives to the general public
• A series of case studies presenting research using the archive
Case studies I
• 10 bursaries of £2,000 each
• Range of arts and humanities disciplines
• Open to researchers in universities, libraries, archives and museums, as well as independent researchers
• Work with developers at the British Library to co-create the tools and interface
• Produce a case study of at least 2,000 words
Case studies II
• The UK Web Archive and Beat literature
• Online reactions to institutional crises: BBC Online
• A history of UK companies on the web
• Digital barriers and the accessible web
• Searching for home in the historic web – the blogs of the London-French
Case studies III
• Revealing British Euro-scepticism in the UK Web Archive
• Looking for public archaeology in the Web Archive
• Do online networks exist for the poetry community?
• Capture, commemoration and the citizen historian
• The online development of the Ministry of Defence
What do researchers want?
• An interface which supports sophisticated query building
• The ability to create and manipulate corpora derived from the larger dataset
• Tools to support annotation and curation of data
• Tools for basic data analysis, both across the whole dataset and within smaller corpora
• Guidance on the ethical implications of their research
• And above all, they want to know what is going on behind the scenes
Common problems
• Messiness, even ‘unknowability’, of the data
• Inability to distinguish between content types within a web page
• Discomfort with the idea of ‘good enough’ for purpose
• Over-reliance on approaches conditioned by algorithmically-ranked search
• Lack of training in quantitative methods
Acknowledgements
• Project team – Jonathan Blaney, Niels Brügger, Josh Cowls, Helen Hockx-Yu, Andrew Jackson, Eric Meyer, Ralph Schroeder, Jason Webber, Peter Webster
• Bursary holders – Rowan Aust, Rona Cran, Richard Deswarte, Saskia Huc-Hepher, Alison Kay, Gareth Millward, Marta Musso, Harry Raffal, Lorna Richardson, Helen Taylor