![Page 1: Mining and mapping places with multiple names](https://reader034.vdocument.in/reader034/viewer/2022051521/58ed165c1a28ab40498b4593/html5/thumbnails/1.jpg)
Mining and mapping places with multiple names
James Butler & Christopher Donaldson
Lancaster University
![Page 2: Mining and mapping places with multiple names](https://reader034.vdocument.in/reader034/viewer/2022051521/58ed165c1a28ab40498b4593/html5/thumbnails/2.jpg)
Corpus of Lake District Literature
Timeline: 1688, 1789, 1837, 1901
• 80 texts, comprising more than 1,500,000 words
• Mixture of canonical and non-canonical literature about the Lake District, mainly from c18 and c19 (78 out of 80 works)
• Mixture of genres, including guidebooks, travelogues, novels, poems, journals, and private letters
34 texts, 650K words
22 texts, 250K words
22 texts, 613K words
![Page 3: Mining and mapping places with multiple names](https://reader034.vdocument.in/reader034/viewer/2022051521/58ed165c1a28ab40498b4593/html5/thumbnails/3.jpg)
Sample sentence collocation: beautiful
‘Again entering the boat, we passed up the channel between Lord’s Island and the shore, from whence beautiful prospects are obtained of the majestic form of Skiddaw, with the woods of Castlehead and Cockshot Park in the foreground.’ (Edward Baines, A Companion to the Lakes [1829] 121.)
±5 tokens: No place-names identified
±10 tokens: 2 place-names identified – Lord’s Island & Skiddaw
Within sentence: 4 place-names identified – Lord’s Island, Skiddaw, Castlehead & Cockshot Park.
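The effect of window size can be sketched in a few lines of Python. This is only an illustration, not the project's actual pipeline: the four-name list, the `tokenize` and `places_near` helpers, and the choice to count only word tokens (ignoring punctuation) are all assumptions made for this example.

```python
import re

# Hypothetical mini-gazetteer covering only the sample sentence.
PLACE_NAMES = ["Lord's Island", "Skiddaw", "Castlehead", "Cockshot Park"]

SENTENCE = (
    "Again entering the boat, we passed up the channel between Lord's Island "
    "and the shore, from whence beautiful prospects are obtained of the "
    "majestic form of Skiddaw, with the woods of Castlehead and Cockshot Park "
    "in the foreground."
)


def tokenize(text):
    """Split text into word tokens, keeping internal apostrophes (assumption)."""
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)


def places_near(text, node, window=None):
    """Return place-names lying wholly within +/- `window` tokens of `node`.

    With window=None the whole sentence is searched.
    """
    tokens = [t.lower() for t in tokenize(text)]
    idx = tokens.index(node.lower())
    if window is None:
        lo, hi = 0, len(tokens)
    else:
        lo, hi = max(0, idx - window), min(len(tokens), idx + window + 1)

    found = []
    for name in PLACE_NAMES:
        name_tokens = [t.lower() for t in tokenize(name)]
        n = len(name_tokens)
        # A multi-word name counts only if every one of its tokens is in range.
        for start in range(lo, hi - n + 1):
            if tokens[start:start + n] == name_tokens:
                found.append(name)
                break
    return found


for w in (5, 10, None):
    print(w, places_near(SENTENCE, "beautiful", w))
```

Under these assumptions the narrow window misses every name, because 'Island' falls inside it but 'Lord's' does not, which is why a sentence-level span suits prose with long sentences.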
Average sentence length
Lake District corpus = 29.8 words
British National Corpus (BNC) = 16 words
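A figure of this kind could be computed roughly as follows. This is a minimal sketch: the `mean_sentence_length` helper is hypothetical, and its naive split on terminal punctuation is an assumption that will miscount abbreviations such as 'Mr.'; the slides do not say how the reported averages were derived.

```python
import re


def mean_sentence_length(text):
    """Mean number of whitespace-separated words per sentence (naive split)."""
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if not sentences:
        return 0.0
    word_counts = [len(s.split()) for s in sentences]
    return sum(word_counts) / len(word_counts)


print(mean_sentence_length("We rowed out. The lake was calm and very still."))
```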
![Page 4: Mining and mapping places with multiple names](https://reader034.vdocument.in/reader034/viewer/2022051521/58ed165c1a28ab40498b4593/html5/thumbnails/4.jpg)
from C. Grover, et al., ‘Use of the Edinburgh Geoparser for Georeferencing Digitized Historical Collections’, Phil. Trans. R. Soc. A 368 (2010) 3875–89.
Diagram of the Edinburgh Geoparser System
![Page 5: Mining and mapping places with multiple names](https://reader034.vdocument.in/reader034/viewer/2022051521/58ed165c1a28ab40498b4593/html5/thumbnails/5.jpg)
Example of input/output from the Edinburgh Geoparser System
![Page 6: Mining and mapping places with multiple names](https://reader034.vdocument.in/reader034/viewer/2022051521/58ed165c1a28ab40498b4593/html5/thumbnails/6.jpg)
Geo-referenced Data from the Edinburgh Geoparser
![Page 7: Mining and mapping places with multiple names](https://reader034.vdocument.in/reader034/viewer/2022051521/58ed165c1a28ab40498b4593/html5/thumbnails/7.jpg)
Geo-referenced Data, Corrected
![Page 8: Mining and mapping places with multiple names](https://reader034.vdocument.in/reader034/viewer/2022051521/58ed165c1a28ab40498b4593/html5/thumbnails/8.jpg)
Bowness: ‘the curved headland’, from ON bogi/OE boga ‘bow’ and ON nes/OE naess ‘headland’
*Variant Historical Spellings: Bownus, Bawnas, Bonas, Bonus, Boulness
cf. D. Whaley, A Dictionary of Lake District Place Names (Nottingham: English Place-Name Society, 2006), 42.
![Page 9: Mining and mapping places with multiple names](https://reader034.vdocument.in/reader034/viewer/2022051521/58ed165c1a28ab40498b4593/html5/thumbnails/9.jpg)
Some common geo-referencing issues with generic gazetteers…
Spatial misattribution
Onomastic misassumption
Incorrect weighting
And these issues apply only to the place-names that are found!
![Page 10: Mining and mapping places with multiple names](https://reader034.vdocument.in/reader034/viewer/2022051521/58ed165c1a28ab40498b4593/html5/thumbnails/10.jpg)
An extract of our custom manually-collected gazetteer for the corpus
Columns: Unique ID, Topog. Cat., Primary Name, Secondary Names, Regional Placement
CONISTON (lake):
Thurstan, Coniston Lake, Coniston Water, Thurston, Conistone, Conistone Lake, Cunnistone Lake, Thurston Lake, Coniston Mere, Lake of Coniston, Conis- ton, Conyngs Tun, Conyngeston, Thorstane's watter, Turstinus.
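One way such an entry might be used is as a lookup from any secondary name to its primary name. The sketch below is an assumption about usage, not the project's implementation: the dictionary layout and the `build_index` and `normalise` helpers are hypothetical, only a subset of the variants is included, and the Unique ID, category, and regional placement fields are omitted.

```python
# Hypothetical in-memory form of one gazetteer entry (subset of variants).
GAZETTEER = {
    "CONISTON (lake)": [
        "Thurstan", "Coniston Lake", "Coniston Water", "Thurston",
        "Conistone Lake", "Cunnistone Lake", "Thurston Lake",
        "Coniston Mere", "Lake of Coniston",
    ],
}


def build_index(gazetteer):
    """Map each case-folded secondary name to its primary name."""
    index = {}
    for primary, variants in gazetteer.items():
        for variant in variants:
            index[variant.casefold()] = primary
    return index


def normalise(name, index):
    """Return the primary name for a variant, or None if it is not listed."""
    # Collapse runs of whitespace so line-wrapped names still match.
    key = " ".join(name.split()).casefold()
    return index.get(key)


INDEX = build_index(GAZETTEER)
print(normalise("Thurston  Lake", INDEX))
print(normalise("Skiddaw", INDEX))
```

Only the secondary names are indexed here, so a bare 'Coniston', which could equally refer to the village, is deliberately left unresolved.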
![Page 11: Mining and mapping places with multiple names](https://reader034.vdocument.in/reader034/viewer/2022051521/58ed165c1a28ab40498b4593/html5/thumbnails/11.jpg)
Geospatial categories chosen for flexibility and degree of universal referential specificity
![Page 12: Mining and mapping places with multiple names](https://reader034.vdocument.in/reader034/viewer/2022051521/58ed165c1a28ab40498b4593/html5/thumbnails/12.jpg)
An extract from the latest iteration of the corpus, which allows referential relationships to be analysed in far greater detail.
Lake, Vale, Specific - Farm, Waterfall