Utilising Webometric Data from Online Digitised Newspaper Collections
Presentation by Paul Gooding, UCL Centre for Digital Humanities, given at the Europeana Newspapers Information Day, held at the British Library on 9 June 2014.
![Page 1: Utilising Webometric Data from Online Digitised Newspaper Collections](https://reader034.vdocument.in/reader034/viewer/2022052618/554fc56bb4c9050e7d8b4ff1/html5/thumbnails/1.jpg)
Utilising Webometric Data from Online Digitised Newspaper Collections
Paul Gooding
UCL Centre for Digital Humanities
![Page 2: Utilising Webometric Data from Online Digitised Newspaper Collections](https://reader034.vdocument.in/reader034/viewer/2022052618/554fc56bb4c9050e7d8b4ff1/html5/thumbnails/2.jpg)
The Context for Large-Scale Digitisation
![Page 3: Utilising Webometric Data from Online Digitised Newspaper Collections](https://reader034.vdocument.in/reader034/viewer/2022052618/554fc56bb4c9050e7d8b4ff1/html5/thumbnails/3.jpg)
Digitised Newspaper Collections: Primary Source and Research topic…
[Chart: Citations of British Library Nineteenth Century Newspapers (launch to 2012). Series: BNCN used as research tool; BNCN as a collection.]
From Gooding, P. (2014) “Search All About it”: A Mixed Methods Case Study into the Impact of Large-Scale Newspaper Digitisation.
(Thesis, not yet published)
![Page 4: Utilising Webometric Data from Online Digitised Newspaper Collections](https://reader034.vdocument.in/reader034/viewer/2022052618/554fc56bb4c9050e7d8b4ff1/html5/thumbnails/4.jpg)
Web Analytics: Google Analytics
• Web Analytics = “The measurement, collection, analysis and reporting of web data for purposes of understanding and optimizing web usage.” (http://www.digitalanalyticsassociation.org/Files/PDF_standards/WebAnalyticsDefinitions.pdf)
• Google Analytics is the leading analytics platform, and it’s great!
• Unobtrusive;
• Easy to implement;
• Rich data source.
• But it does pose a couple of problems…
![Page 5: Utilising Webometric Data from Online Digitised Newspaper Collections](https://reader034.vdocument.in/reader034/viewer/2022052618/554fc56bb4c9050e7d8b4ff1/html5/thumbnails/5.jpg)
Google Analytics: A Couple of Flaws…
• (http://dilbert.com/fast/2008-05-08/)
![Page 6: Utilising Webometric Data from Online Digitised Newspaper Collections](https://reader034.vdocument.in/reader034/viewer/2022052618/554fc56bb4c9050e7d8b4ff1/html5/thumbnails/6.jpg)
Web Log Analysis for Welsh Newspapers Online
• 3 types of server queries (in this case):
• “Search queries” – users undertake search on the collection;
• “Browser queries” – users use browse or filter functions;
• “Content queries” – users view digitised newspaper content.
• Results cover period from 12th March 2013 to 30th June 2013.
• Investigating a longer period would increase the significance of the results…
![Page 7: Utilising Webometric Data from Online Digitised Newspaper Collections](https://reader034.vdocument.in/reader034/viewer/2022052618/554fc56bb4c9050e7d8b4ff1/html5/thumbnails/7.jpg)
Content Log Analysis: Welsh Newspapers Online
• Server logs look like this (except for the colours…):
• 2013-06-02T12:26:50+01:00 51a5c97c3c8d3 llgc-id:3036868 llgc-id:3039814 llgc-id:3037695 Aberystwyth Observer 21 September 1872 [2] ART40
• And they tell us the following information:
• Time and date of interaction;
• Unique user ID;
• Server identification;
• Newspaper title;
• Edition date;
• [Page number];
• Article number.
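As a rough illustration of how a content log line like the sample above could be split into those fields, here is a minimal Python sketch. The regular expression is inferred from that single sample line only, so real logs may need a looser pattern. The function name `parse_content_log` and the dictionary keys are hypothetical, and treating the three `llgc-id:` tokens as one list is an assumption.

```python
import re
from datetime import datetime

# Pattern inferred from the one sample line on the slide (an assumption).
LOG_RE = re.compile(
    r"^(?P<timestamp>\S+)\s+"                      # ISO 8601 time and date
    r"(?P<user_id>\S+)\s+"                         # unique user ID
    r"(?P<ids>(?:llgc-id:\d+\s+)+)"                # one or more llgc-id tokens
    r"(?P<title>.+?)\s+"                           # newspaper title (free text)
    r"(?P<edition>\d{1,2} [A-Z][a-z]+ \d{4})\s+"   # edition date, e.g. 21 September 1872
    r"\[(?P<page>\d+)\]\s+"                        # [page number]
    r"(?P<article>ART\d+)$"                        # article number
)

def parse_content_log(line):
    """Return a dict of fields, or None if the line does not match."""
    m = LOG_RE.match(line.strip())
    if m is None:
        return None
    return {
        "timestamp": datetime.fromisoformat(m["timestamp"]),
        "user_id": m["user_id"],
        "ids": m["ids"].split(),
        "title": m["title"],
        # %B expects English month names under the default locale.
        "edition_date": datetime.strptime(m["edition"], "%d %B %Y").date(),
        "page": int(m["page"]),
        "article": m["article"],
    }

sample = (
    "2013-06-02T12:26:50+01:00 51a5c97c3c8d3 llgc-id:3036868 "
    "llgc-id:3039814 llgc-id:3037695 Aberystwyth Observer "
    "21 September 1872 [2] ART40"
)
record = parse_content_log(sample)
```

Returning `None` for non-matching lines lets a caller count and inspect rejected lines rather than silently dropping them.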
![Page 8: Utilising Webometric Data from Online Digitised Newspaper Collections](https://reader034.vdocument.in/reader034/viewer/2022052618/554fc56bb4c9050e7d8b4ff1/html5/thumbnails/8.jpg)
Users viewed content from the 1840s more than any other decade
[Chart: Most Viewed Decades in WNO, compared to total pages per decade. Y-axis: Popularity (0.00% to 9.00%). X-axis: decades from 1804-1809 to 1910-1919.]
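One way a per-decade comparison of this kind could be computed is sketched below in Python. The slide does not say how popularity was normalised, so reporting each decade's share of views alongside its share of pages is an assumption, not the author's method. The function names and the input values are hypothetical. The labels use full calendar decades, so the first bin would read 1800-1809 rather than the 1804-1809 shown on the chart.

```python
from collections import Counter

def decade_label(year):
    """Map a year to a label such as '1840-1849'."""
    start = year - year % 10
    return f"{start}-{start + 9}"

def decade_shares(view_years, pages_per_decade):
    """Compare each decade's share of views with its share of pages.

    view_years: edition years of viewed articles (one entry per view).
    pages_per_decade: dict of decade label -> pages available.
    Assumes at least one view and at least one page.
    """
    views = Counter(decade_label(y) for y in view_years)
    total_views = sum(views.values())
    total_pages = sum(pages_per_decade.values())
    return {
        decade: {
            "view_share": views[decade] / total_views,
            "page_share": n_pages / total_pages,
        }
        for decade, n_pages in pages_per_decade.items()
    }

# Hypothetical inputs, for illustration only (not figures from the study).
example = decade_shares(
    view_years=[1845, 1848, 1872, 1845],
    pages_per_decade={"1840-1849": 100, "1870-1879": 300},
)
```

A decade whose view share exceeds its page share is being viewed more than its size in the collection alone would suggest.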
![Page 9: Utilising Webometric Data from Online Digitised Newspaper Collections](https://reader034.vdocument.in/reader034/viewer/2022052618/554fc56bb4c9050e7d8b4ff1/html5/thumbnails/9.jpg)
They searched for personal names, place names and topics relevant to Wales
![Page 10: Utilising Webometric Data from Online Digitised Newspaper Collections](https://reader034.vdocument.in/reader034/viewer/2022052618/554fc56bb4c9050e7d8b4ff1/html5/thumbnails/10.jpg)
And they engaged heavily with newspaper content
[Chart: Percentage of Queries by Type. Y-axis: Percentage of Users (0% to 70%). X-axis: Pageview number (0 to 160). Series: Search %, Browser %, Content %.]
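A breakdown of query types by pageview number could be produced along the following lines. This Python sketch assumes the chart shows, for each position in a user's sequence of queries, the proportion of queries at that position that were search, browse or content queries. That reading of the chart, the function name `type_share_by_position`, and the session data are all assumptions made for illustration. Positions are counted from 1 here, whereas the chart's axis starts at 0.

```python
from collections import Counter, defaultdict

QUERY_TYPES = ("search", "browse", "content")

def type_share_by_position(sessions):
    """sessions: dict of user ID -> ordered list of query types.

    Returns a dict of position -> {query type: share of queries at that position}.
    """
    counts = defaultdict(Counter)
    for queries in sessions.values():
        for position, qtype in enumerate(queries, start=1):
            counts[position][qtype] += 1
    shares = {}
    for position, counter in sorted(counts.items()):
        total = sum(counter.values())
        shares[position] = {q: counter[q] / total for q in QUERY_TYPES}
    return shares

# Hypothetical sessions, for illustration only (not data from the study).
shares = type_share_by_position({
    "u1": ["search", "content", "content"],
    "u2": ["browse", "content"],
})
```

Later positions are based on fewer users, since only longer sessions reach them, so shares at high pageview numbers are noisier.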
![Page 11: Utilising Webometric Data from Online Digitised Newspaper Collections](https://reader034.vdocument.in/reader034/viewer/2022052618/554fc56bb4c9050e7d8b4ff1/html5/thumbnails/11.jpg)
“But when people are past a certain age, you sort of stop asking them why they do things. It feels dangerous. What if you say So, Mr Penumbra, why do you want to know about Mr Tyndall's coat buttons? And he pauses, and scratches his chin, and there's an uncomfortable silence-- and we both realize he can't remember?”
Robin Sloan, Mr. Penumbra’s 24-Hour Bookstore.
![Page 12: Utilising Webometric Data from Online Digitised Newspaper Collections](https://reader034.vdocument.in/reader034/viewer/2022052618/554fc56bb4c9050e7d8b4ff1/html5/thumbnails/12.jpg)
The Qualitative Context
![Page 13: Utilising Webometric Data from Online Digitised Newspaper Collections](https://reader034.vdocument.in/reader034/viewer/2022052618/554fc56bb4c9050e7d8b4ff1/html5/thumbnails/13.jpg)
Thanks for listening!
Any Questions?