Melissa Terras' Report on the #UKMHLiveLab

Report on The #UKMHLlivelab, 26th Oct 2016, Wellcome Library, London. Hosted by: Professor Melissa Terras, Professor of Digital Humanities, UCL Dept of Information Studies; Director, UCL Centre for Digital Humanities. [email protected], @melissaterras. With Owen Stephens (@ostephens) and Peter Findlay (@PFindlay_500).

Upload: melissaterras

Post on 08-Apr-2017


TRANSCRIPT

Page 1: Melissa Terras' Report on the #UKMHLiveLab

Report on The #UKMHLlivelab
26th Oct 2016, Wellcome Library, London

Hosted by: Professor Melissa Terras
Professor of Digital Humanities, UCL Dept of Information Studies
Director, UCL Centre for Digital Humanities
[email protected], @melissaterras
With Owen Stephens (@ostephens) and Peter Findlay (@PFindlay_500)

A report after the event, hurriedly written up early on 27th October by M Terras!

Page 2: Melissa Terras' Report on the #UKMHLiveLab

Aims

• Jisc, the Wellcome Library, and nine UK universities and professional societies have been working on a three-year large-scale digitisation project of more than 15 million pages of 19th century published works, resulting in the UK Medical Heritage Library, a valuable resource for the exploration of medical humanities.

• How can we best serve the research community with this material?

• Bring together researchers with developers to explore the resource, which launched officially on 27th October 2016

• Understand user needs, and improve functionality

• Help user community

• https://ukmhl.historicaltexts.jisc.ac.uk/

Page 3: Melissa Terras' Report on the #UKMHLiveLab

Attendees

• Hosted by Melissa Terras, Peter Findlay (Jisc) and Owen Stephens

• 6 Jisc Historical Texts programmers and service managers

• 20 interested researchers
  – From MA level students
  – To Professors
  – And Librarians

Page 4: Melissa Terras' Report on the #UKMHLiveLab

Hard thinking. And biscuits.

Page 5: Melissa Terras' Report on the #UKMHLiveLab

Owen (right) explains the metadata, James explains his research question

Page 6: Melissa Terras' Report on the #UKMHLiveLab
Page 7: Melissa Terras' Report on the #UKMHLiveLab

Hard work in the basement of the Wellcome Library, discussion in between

Page 8: Melissa Terras' Report on the #UKMHLiveLab

Hard thinking with the developers on the interface. Plus sugar for sustenance.

Page 9: Melissa Terras' Report on the #UKMHLiveLab

A Dictionary of Psychological Medicine, Tuke, 1892

Some looked at individual items. This image is from a dictionary entry about reflexes, but the text on the page is for the next entry, regicide, showing how hard it is to do Machine Learning on images using the text that surrounds them.

Page 10: Melissa Terras' Report on the #UKMHLiveLab

The texts are not just about medicine. Lots for the food historian in there too! + others

Page 11: Melissa Terras' Report on the #UKMHLiveLab

Research Topics

• Smells, fumes, air, ventilation
• Food History
• Semantic Text Analysis
• Identifying and using Tables of Data in digitised content
  – Particularly related to the census
• Diseases
• Alcohol
• Identification of different genres of text
  – Public facing versus medical, grey literature
  – Messages in research vs promotional vs lay texts
• Tracing different Editions of text over time
  – Identifying reuse of illustrations in different texts
  – Using metadata to trace different editions
  – Using image matching/processing to identify image reuse

Page 12: Melissa Terras' Report on the #UKMHLiveLab

What People Want

• A subset of data
  – Improved filters, including upload of terms/thesaurus to generate smaller subset
• A way to search the collection and generate a ringfenced sub-selection to do further analysis on
  – Locking parameters to only search within subset
  – For example, topics in certain Boroughs of London
  – Perform term analysis within subset
• To take away and do deep/close reading/research on
• Download their subsets to use with other tools
  – CSV (not necessarily through the API)
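The workflow requested above (upload a list of terms, ringfence a subset, run term analysis within it, then download as CSV) could be sketched roughly as follows. This is only an illustration, not the UKMHL service or its API: the `records` list, its field names, and the helper functions are all made up for the example.

```python
import csv
import io
import re
from collections import Counter

# Made-up sample records standing in for search results (not real UKMHL data).
records = [
    {"id": "a1", "title": "On Ventilation", "text": "Foul air and fumes in the wards of Lambeth."},
    {"id": "a2", "title": "Household Cookery", "text": "Add a glass of wine to the soup."},
    {"id": "a3", "title": "Census Tables", "text": "Population returns by borough."},
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_subset(records, terms):
    """Keep only records whose text contains at least one uploaded term."""
    wanted = {t.lower() for t in terms}
    return [r for r in records if wanted & set(tokenize(r["text"]))]


def term_counts(subset, terms):
    """Count occurrences of each uploaded term, within the subset only."""
    wanted = {t.lower() for t in terms}
    counts = Counter()
    for r in subset:
        counts.update(tok for tok in tokenize(r["text"]) if tok in wanted)
    return counts


def subset_to_csv(subset):
    """Serialise the subset as CSV text for use with other tools."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["id", "title", "text"])
    writer.writeheader()
    writer.writerows(subset)
    return buf.getvalue()


terms = ["air", "fumes", "wine"]          # the "uploaded" term list
subset = build_subset(records, terms)     # ringfenced sub-selection
counts = term_counts(subset, terms)       # term analysis within the subset
csv_text = subset_to_csv(subset)          # downloadable CSV
```

Keeping the subset as a plain list of records means the same selection can feed both the term analysis and the export, which is the "locking parameters" idea in miniature.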

Page 13: Melissa Terras' Report on the #UKMHLiveLab

Suggestions

• MA students felt a little overwhelmed with content
  – Where to start?
  – Need guidance, approach, start to think about particular topics and what they could use it for
  – Saw possibilities and opportunities
• If used for school groups, intros to selections
  – As in “Teaching History with 100 Objects” approach
  – Pointers to interesting topics and collections needed
• Tabular data
  – Investigate how to improve quality of OCR data
  – Reusable data
  – Balance between identifying which tables are useful and how difficult it is to identify text within tables

Page 14: Melissa Terras' Report on the #UKMHLiveLab

Suggestions (2)

• Crowdsourcing?
  – Where would it be appropriate?
  – Tagging? Annotation? OCR Correction? Where to employ it?
  – Individual user data used in machine learning to generate finding aids for all?
• Image Processing
  – Matching of images
  – Image Wall is a good tool, can it be expanded to provide useful image analysis?

Page 15: Melissa Terras' Report on the #UKMHLiveLab

Suggestions (3)

• If UKMHL is open, then BL Book Data should also be open?
  – Discussion about licensing, financial model
• Request for items to be traceable to a shelf mark, and the ability to search for that
  – Would allow variants of a book to be identified
  – Build links between them
• Improvement of KWIC results to facilitate “traditional” research methods
  – Concise list view would speed up the process of reading content hugely
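For readers unfamiliar with KWIC (Key Word in Context), a concise list view of the kind requested might look something like the sketch below. It is a minimal illustration only: the `kwic` function, its `width` parameter, and the sample sentence are invented here and do not reflect how the Historical Texts interface produces its results.

```python
import re


def kwic(text, keyword, width=3):
    """Return (left context, hit, right context) for each keyword match.

    Matching is case-insensitive on whole word tokens; `width` is the
    number of words of context kept on each side.
    """
    tokens = re.findall(r"\w+", text)
    rows = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, tok, right))
    return rows


# Invented sample text, loosely echoing the soup and wine finding from the day.
sample = "Soup may be taken in place of wine. Wine is added to the soup before serving."

# One line per hit, with the keyword aligned in a single column.
for left, hit, right in kwic(sample, "wine"):
    print(f"{left:>25} | {hit} | {right}")
```

Aligning the keyword in one column is what makes a concordance quick to scan by eye, which is the point of the request.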

Page 16: Melissa Terras' Report on the #UKMHLiveLab

Suggestions (4)

• “Premium Version Without the Ads”

Discussion about what the different grey buttons were and why they take up screen real estate

Page 17: Melissa Terras' Report on the #UKMHLiveLab

Suggestions (5)

• Filter by language of text
  – Return only French, only German, only English, etc.
• Access to ALTO XML, to reuse/analyse data themselves
  – People have to already know about the process to even know ALTO exists
• Access to raw image data
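For anyone who has not met ALTO before: it is an XML format for OCR output in which each recognised word is a `String` element, with the word itself in a `CONTENT` attribute, grouped into `TextLine` elements. The sketch below shows one way a researcher might pull plain text lines out of such a file. The sample document is hand-written for illustration and is not taken from the UKMHL; the helper names are assumptions, and element names are matched by local name because the namespace differs between ALTO versions.

```python
import xml.etree.ElementTree as ET

# Hand-written ALTO fragment for illustration only (not real UKMHL data).
ALTO_SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine>
      <String CONTENT="Reflex" WC="0.91"/><SP/><String CONTENT="action" WC="0.88"/>
    </TextLine>
    <TextLine>
      <String CONTENT="Regicide" WC="0.52"/>
    </TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def alto_lines(xml_text):
    """Return the OCR text of each TextLine as a space-joined string."""
    root = ET.fromstring(xml_text)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) == "TextLine":
            words = [
                child.get("CONTENT", "")
                for child in elem
                if local_name(child.tag) == "String"
            ]
            lines.append(" ".join(words))
    return lines


for line in alto_lines(ALTO_SAMPLE):
    print(line)
```

The `WC` attribute in the sample is ALTO's per-word confidence value, which could in principle help with the OCR quality questions raised earlier.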

Page 18: Melissa Terras' Report on the #UKMHLiveLab

Surprises on the Day

• People’s questions changed
  – Alcohol in a particular type of official literature
  – Revealed alcohol in lots of recipes and cookbooks
    • Soup (in texts) offered as a replacement for wine/beer as a drink
    • Changes scope of how we think about Victorian use of alcohol
• Research questions change by using these tools and resources
  – Scope
  – Focus
  – Serendipity

Page 19: Melissa Terras' Report on the #UKMHLiveLab

Hacked the interface to show how these services could be offered

Page 20: Melissa Terras' Report on the #UKMHLiveLab

Could potentially offer a way to upload groups of terms to search on, and to run term analysis (this isn’t live; it’s a mock-up of a possible service)

Page 21: Melissa Terras' Report on the #UKMHLiveLab

An easier, quicker way to go through Key Word in Context results (mock-up)

Page 22: Melissa Terras' Report on the #UKMHLiveLab

Overall Impressions/ Comments

• Useful
• Enjoyment
• “Open”
• “Old Text, New Knowledge”
• Identification of gaps in design, and how to scale up searches, methods, and questions
• API well documented
• From Developers
  – Useful to understand research questions
• Rich conversations, interaction, possibilities

Page 23: Melissa Terras' Report on the #UKMHLiveLab

Materials

• https://github.com/jisc-content/ukmhl-lab-data/wiki

• https://ukmhl.historicaltexts.jisc.ac.uk/

• #UKMHLlivelab on Twitter

Page 24: Melissa Terras' Report on the #UKMHLiveLab

With thanks to

• Owen Stephens (@ostephens)
  – http://www.ostephens.com/

• Peter Findlay (@PFindlay_500)
  – Jisc digital portfolio manager
  – https://www.jisc.ac.uk/staff/peter-findlay

• Jisc organising folks! And the Wellcome Library for hosting. There were really good biscuits.

Page 25: Melissa Terras' Report on the #UKMHLiveLab

A researcher commented that this level of discussion with developers was “Paradise”.

First mention of paradise in https://data.ukmhl.historicaltexts.jisc.ac.uk/view?pubId=ukmhl-b28074117&pageId=ukmhl-b28074117-104&terms=paradise

Apposite bridge building!