Melissa Terras' report on the #UKMHLlivelab
Report on The #UKMHLlivelab
26th Oct 2016, Wellcome Library, London
Hosted by: Professor Melissa Terras
Professor of Digital Humanities, UCL Dept of Information Studies
Director, UCL Centre for Digital Humanities
[email protected], @melissaterras
With Owen Stephens (@ostephens) and Peter Findlay (@PFindlay_500)
A report after the event, hurriedly written up early on 27th October by M Terras!
Aims
• Jisc, the Wellcome Library, and nine UK universities and professional societies have been working on a three-year, large-scale digitisation project of more than 15 million pages of 19th century published works, resulting in the UK Medical Heritage Library, a valuable resource for the exploration of medical humanities.
• How can we best serve the research community with this material?
• Bring together researchers with developers to explore the resource, which launched officially on 27th October 2016
• Understand user needs, and improve functionality
• Help user community
• https://ukmhl.historicaltexts.jisc.ac.uk/
Attendees
• Hosted by Melissa Terras, Peter Findlay (Jisc) and Owen Stephens
• 6 Jisc Historical Text programmers and service managers
• 20 interested researchers
– From MA level students
– To Professors
– And Librarians
Hard thinking. And biscuits.
Owen (right) explains the metadata, James explains his research question
Hard work in the basement of the Wellcome Library, discussion in between
Hard thinking with the developers on the interface. Plus sugar for sustenance.
A Dictionary of Psychological Medicine, Tuke, 1892
Some looked at individual items. This image is from a dictionary entry about reflexes, but the text on the page is for the next entry (regicide), showing how hard it is to do Machine Learning on images using the text that surrounds them.
The texts are not just about medicine. Lots for the food historian in there too! + others
Research Topics
• Smells, fumes, air, ventilation
• Food History
• Semantic Text Analysis
• Identifying and using Tables of Data in digitised content
– Particularly related to the census
• Diseases
• Alcohol
• Identification of different genres of text
– Public facing versus medical, grey literature
– Messages in research vs promotional vs lay texts
• Tracing different Editions of text over time
– Identifying reuse of illustrations in different texts
– Using metadata to trace different editions
– Using image matching/processing to identify image reuse
What People Want
• A subset of data
– Improved filters, including upload of terms/thesaurus to generate smaller subset
• A way to search the collection and generate a ringfenced sub-selection to do further analysis on
– Locking parameters to only search within subset
– For example, topics in certain Boroughs of London
– Perform term analysis within subset
• To take away and do deep/close reading/research on
• Download their subsets to use with other tools.
– CSV (not necessarily through the API).
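A minimal sketch of the "ringfenced subset, then CSV" request above. This is not the Historical Texts service or its API: the records, the field names (`id`, `title`, `year`) and both function names are hypothetical, invented purely to illustrate filtering by an uploaded term list and exporting the result.

```python
import csv
import io

# Hypothetical records; field names are assumptions, not the UKMHL schema.
RECORDS = [
    {"id": "a1", "title": "Report on the ventilation of Lambeth dwellings", "year": 1868},
    {"id": "a2", "title": "A treatise on soups and broths", "year": 1851},
    {"id": "a3", "title": "Sanitary condition of Hackney", "year": 1872},
]


def select_subset(records, terms):
    """Keep records whose title contains any uploaded term (case-insensitive)."""
    lowered = [t.lower() for t in terms]
    return [r for r in records if any(t in r["title"].lower() for t in lowered)]


def to_csv(records, fieldnames=("id", "title", "year")):
    """Serialise the subset as CSV text that other tools can open."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(fieldnames), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


# An "uploaded" term list, here two London boroughs.
subset = select_subset(RECORDS, ["Lambeth", "Hackney"])
csv_text = to_csv(subset)
```

Any further analysis would then run on `subset` only, which is the "locking parameters" idea in its simplest form.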
Suggestions
• MA students felt a little overwhelmed with content
– Where to start?
– Need guidance, approach, start to think about particular topics and what they could use it for.
– Saw possibilities and opportunities
• If used for school groups, intros to selections
– As in “Teaching History with 100 Objects” approach
– Pointers to interesting topics and collections needed
• Tabular data
– Investigate how to improve quality of OCR data
– Reusable data
– Balance between identifying which tables are useful and how difficult it is to identify text within tables
Suggestions (2)
• Crowdsourcing?
– Where would it be appropriate?
– Tagging? Annotation? OCR Correction? Where to employ it?
– Individual user data used in machine learning to generate finding aids for all?
• Image Processing
– Matching of images
– Image Wall is a good tool, can it be expanded to provide useful image analysis?
Suggestions (3)
• If UKMHL is open, then BL Book Data should also be open?
– Discussion about licensing, financial model
• Request for items to be traceable to a shelf mark, and availability to search for that.
– Would allow variants of a book to be identified
– Build links between them
• Improvement of KWIC results to facilitate “traditional” research methods
– Concise list view would speed up the process of reading content hugely
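For readers unfamiliar with KWIC (Key Word in Context), the concise list view requested above might look something like the sketch below. The function names and the sample sentence are invented for illustration; this is not how the Historical Texts interface produces its results.

```python
import re


def kwic(text, keyword, width=30):
    """Return (left, match, right) tuples for each whole-word hit of keyword."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    rows = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        rows.append((left, m.group(0), right))
    return rows


def format_rows(rows, width=30):
    """One line per hit, keyword aligned in a column for fast scanning."""
    return [f"{left:>{width}} [{match}] {right}" for left, match, right in rows]


# Invented sample text, not a quotation from the collection.
SAMPLE = "Soup may be taken in place of wine. Wine is not advised for the young."
hits = kwic(SAMPLE, "wine", width=20)
listing = format_rows(hits, width=20)
```

Aligning the keyword in one column is what lets a reader skim many hits quickly, which is the point of the "concise list view" suggestion.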
Suggestions (4)
• “Premium Version Without the Ads”
Discussion about what the different grey buttons were and why they take up screen real estate
Suggestions (5)
• Filter by language of text
– Return only French, only German, only English, etc.
• Access to ALTO XML, to reuse/analyse data themselves
– People have to already know about the process to even know ALTO exists
• Access to raw image data
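Since ALTO was new to many in the room: it is an XML format for OCR output in which each recognised word is a `String` element, with its text in a `CONTENT` attribute, grouped into `TextLine` elements. The sketch below reads the lines out of a tiny hand-written fragment. The fragment is simplified and invented (real ALTO files carry more required attributes and vary by schema version), and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, invented ALTO-like fragment for illustration only.
ALTO_SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine>
      <String CONTENT="REFLEX" HPOS="10" VPOS="10"/>
      <SP/>
      <String CONTENT="ACTION" HPOS="60" VPOS="10"/>
    </TextLine>
    <TextLine>
      <String CONTENT="REGICIDE" HPOS="10" VPOS="30"/>
    </TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""


def local_name(tag):
    """Strip any '{namespace}' prefix so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def alto_lines(xml_text):
    """Return the OCR text of each TextLine as a plain string."""
    root = ET.fromstring(xml_text)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) == "TextLine":
            words = [child.get("CONTENT", "") for child in elem
                     if local_name(child.tag) == "String"]
            lines.append(" ".join(words))
    return lines


page_text = alto_lines(ALTO_SAMPLE)
```

The `HPOS`/`VPOS` coordinates are ignored here; they are what would let a researcher relate words to positions on the page image.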
Surprises on the Day
• People’s questions changed
– Alcohol in particular type of official literature
– Revealed Alcohol in lots of recipes and cookbooks
• Soup (in texts) offered as a replacement for wine/beer as a drink
• Changes scope of how we think about Victorian use of alcohol
• Research questions change by using these tools and resources
– Scope
– Focus
– Serendipity
Hacked the interface to show how these services could be offered
Could potentially offer a way to upload groups of terms to search on, and to run term analysis (this isn’t live; it’s a mockup of a possible service)
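As a rough idea of what "term analysis" over an uploaded group of terms could mean, the sketch below simply counts how often each term occurs across a set of page texts. It is an assumption about the kind of analysis intended, not a description of the mockup: the function name and sample pages are invented, and it handles single-word terms only.

```python
import re
from collections import Counter


def term_counts(pages, terms):
    """Count occurrences of each uploaded single-word term across the pages."""
    wanted = {t.lower() for t in terms}
    counts = Counter()
    for text in pages:
        for token in re.findall(r"[a-z]+", text.lower()):
            if token in wanted:
                counts[token] += 1
    return counts


# Invented page texts standing in for a ringfenced subset.
PAGES = [
    "Beer and wine were served.",
    "Add a glass of wine to the soup.",
    "Ventilation of the ward.",
]
counts = term_counts(PAGES, ["wine", "beer", "gin"])
```

Terms that never occur (here "gin") simply report a count of zero, which is itself useful when testing a thesaurus against a subset.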
An easier, quicker way to go through Key Word in Context results (mockup)
Overall Impressions/ Comments
• Useful
• Enjoyment
• “Open”
• “Old Text, New Knowledge”
• Identification of gaps in design, and how to scale up searches, methods, and questions
• API well documented
• From Developers
– Useful to understand research questions
• Rich conversations, interaction, possibilities
Materials
• https://github.com/jisc-content/ukmhl-lab-data/wiki
• https://ukmhl.historicaltexts.jisc.ac.uk/
• #UKMHLlivelab on twitter
With thanks to
• Owen Stephens (@ostephens)
– http://www.ostephens.com/
• Peter Findlay (@PFindlay_500)
– Jisc digital portfolio manager
– https://www.jisc.ac.uk/staff/peter-findlay
• Jisc organising folks! And the Wellcome Library for hosting. There were really good biscuits.
A researcher commented that this level of discussion with developers was “Paradise”.
First mention of paradise in https://data.ukmhl.historicaltexts.jisc.ac.uk/view?pubId=ukmhl-b28074117&pageId=ukmhl-b28074117-104&terms=paradise
Apposite bridge building!