Knowledge Representation in Digital Humanities
Antonio Jiménez Mavillard
Department of Modern Languages and Literatures
Western University
Lecture 1
Knowledge Representation in Digital HumanitiesAntonio Jiménez Mavillard
* Contents: 1. Why this lecture? 2. Justification and goals of the course 3. Chapter 1 4. Overview of the course 5. Assignment 6. Bibliography
Why this lecture?
* This lecture... · presents this course, its justification, contents and goals · introduces the concepts of Digital Humanities (DH) and Knowledge Representation (KR), and justifies why the latter is so important for the former
Justification and goals of the course
* Why DH? · It is an emergent field · Its rich heritage from Humanities · The wide range of problems it addresses
Justification and goals of the course
* Skills in DH: · Modelling · Knowledge Representation · Programming · Natural Language Processing · History
Justification and goals of the course
* Why KR? · KR is becoming a key dimension of DH · KR is the first and essential step for further computer processing · KR has the potential to change the way humanities scholarship is done
Justification and goals of the course
* Why programming? · It improves modelling and KR skills · It makes it possible to create new solutions to old problems by providing a more versatile way to manipulate data
Justification and goals of the course
* Why NLP? · Most humanities disciplines rely on texts · NLP provides methods and tools for the digitization and processing of texts
Justification and goals of the course
* Why Historical Texts? · Answer research questions from the humanities
Justification and goals of the course
* Why Historical Texts? · Similar to current media: + SMS + forum posts + chats + social networks · Sentiment Analysis is applicable
Justification and goals of the course
* Why Sentiment Analysis? · market trends · recommendation systems · targeted advertising
Justification and goals of the course
* Aim of the course: · Abstract relevant aspects of a problem · Model those relevant aspects into a formal representation · Solve formalized problems by means of programming
References
“Digital Humanities Forum 2011.” N. p., n.d. Web. 16 June 2013.
Piotrowski, Michael. “Chapter 2: NLP and Digital Humanities.” Natural Language Processing for Historical Texts. [San
Rafael, Calif.]: Morgan & Claypool, 2012. Open WorldCat. Print.
Bird, Steven, Ewan Klein, and Edward Loper. “Preface.” Natural Language Processing with Python. O’Reilly Media, 2009.
Print.
Svensson, Patrik. “Humanities Computing as Digital Humanities.” Digital Humanities Quarterly 3.3 (2009): n. pag. Print.
Svensson, Patrik. “The Landscape of Digital Humanities.” Digital Humanities Quarterly 4.1 (2010): n. pag. Web. 30 May
2013.
The Alliance of Digital Humanities Organizations et al. Digital Humanities 2012 - Conference Abstracts. Proceedings
(Komplette Ausgabe einer Konferenz etc.). 2008. Print.
Unsworth, John. “Knowledge Representation in Humanities Computing.” Inaugural E-humanities Lecture at the National
Endowment for the Humanities (2001): n. pag.
Chapter 1
Digital Humanities and Knowledge Representation
1. The landscape of Digital Humanities
2. Modelling and Knowledge Representation
Chapter 1
1 The landscape of Digital Humanities 1.1 Definition of DH 1.2 DH projects 1.3 Skills in DH 1.4 The DH community
Chapter 1
2 Modelling and Knowledge Representation 2.1 Definition of KR 2.2 KR in DH 2.3 Representing data and procedures
The landscape of Digital Humanities
Definition of Digital Humanities
* What is DH? · There is no agreement on what DH is; the debate remains open within the discipline · It descends from Humanities Computing: formal representations of the human record
Definition of Digital Humanities
* DH is a wide field that involves: · data mining and visualization · modelling and conceptualization · formalization and representation · programming · conservation · linguistics, history, literature...
Definition of Digital Humanities
* In the specific area of text processing: · digitization of documents + conservation · annotation with metadata + modelling + representation
Definition of Digital Humanities
* In the specific area of text processing: · public access to collections + visualization · text mining + data mining · natural language processing + programming
Definition of Digital Humanities
* Some approaches to the concept: · Susan Schreibman, Ray Siemens and John Unsworth (2004): + preservation of physical artifacts + knowledge representation
Definition of Digital Humanities
* Some approaches to the concept: · Andrew Prescott (2012): + conservation of culture through transformation of original objects ==> modelling and representation + interdisciplinary collaboration ==> “universal science”
Definition of Digital Humanities
* Some approaches to the concept: · Matthew G. Kirschenbaum (2010): + networking and collaboration · Michael Piotrowski (2012): + traditional humanities & new computer-based methods and tools
Definition of Digital Humanities
* Some approaches to the concept: · Patrik Svensson (2010): + humanities and information technology + humanities 2.0 + networking + decentralization of knowledge + interdisciplinarity and collaboration
Definition of Digital Humanities
* Some approaches to the concept: · Todd Presner (2009): + print is no longer “the” medium + digital tools, techniques, new media + new production and dissemination of knowledge
Definition of Digital Humanities
* Some approaches to the concept: · Wikipedia: + intersection of computing and humanities + digitization, curation, data mining... + linguistics, history, literature... · Digital Humanities Quarterly journal: + still emerging field
Definition of Digital Humanities
* Some common ideas: · intersection of computing and humanities · conservation · knowledge representation · interdisciplinary collaboration
References
“Digital Humanities.” Wikipedia, the free encyclopedia 9 Aug. 2013. Wikipedia. Web. 12 Aug. 2013.
“DHQ: Digital Humanities Quarterly.” N. p., n.d. Web. 20 Dec. 2013.
Kirschenbaum, Matthew G. “What Is Digital Humanities and What’s It Doing in English Departments?” ADE Bulletin 150
(2010): n. pag.
Piotrowski, Michael. “Chapter 2: NLP and Digital Humanities.” Natural Language Processing for Historical Texts. [San
Rafael, Calif.]: Morgan & Claypool, 2012. Open WorldCat. Print.
Prescott, Andrew. “An Electric Current of the Imagination: What the Digital Humanities Are and What They Might
Become.” Journal of Digital Humanities (2012): n. pag. Web. 29 May 2013.
Schreibman, Susan, Ray Siemens, and John Unsworth, eds. “The Digital Humanities and Humanities Computing:
Introduction.” A Companion to Digital Humanities. Hardcover. Oxford: Blackwell Publishing Professional, 2004. Wiley
Online Library. Web. 3 June 2013. Blackwell Companions to Literature and Culture.
Svensson, Patrik. “The Landscape of Digital Humanities.” Digital Humanities Quarterly 4.1 (2010): n. pag. Web. 30 May
2013.
The Alliance of Digital Humanities Organizations et al. Digital Humanities 2012 - Conference Abstracts. Proceedings
(Komplette Ausgabe einer Konferenz etc.). 2008. Print.
“The Digital Humanities Manifesto 2.0.” 2009.
Unsworth, John. “Knowledge Representation in Humanities Computing.” Inaugural E-humanities Lecture at the National
Endowment for the Humanities (2001): n. pag.
Digital Humanities projects
* The Sylva Project (http://sylvadb.com/) The CulturePlex Lab, Western University · Modelling and conceptualization · KR: Graph databases · Data visualization · Data mining · Collaboration
Digital Humanities projects
* The Printer's Devil Project (http://ett.arts.uwo.ca/printersdevil/) The Research Group for Electronic Textuality and Theory, Western University · Online communities · Digitization · Public access to collections
Digital Humanities projects
* Deception Detection (http://publish.uwo.ca/~vrubin/lab/deceptdetect.html) Language and Information Technology Research Lab, Western University · NLP · Machine Learning
Digital Humanities projects
* The Proceedings of the Old Bailey, 1674-1913 (http://www.oldbaileyonline.org/) Universities of Hertfordshire and Sheffield and the Open University · Digitization · Public access to collections · NLP
Digital Humanities projects
* Alfred Escher (http://alfred-escher.ch/en/) Alfred Escher Foundation · Digitization · Public access to collections
Digital Humanities projects
* Extraction of structured knowledge from ancient sources (http://www.eaqua.net/en/) Institute for Computer Science, University Leipzig · Text mining · NLP
Skills in Digital Humanities
* From the analysis of the DH landscape, the most important skills in DH are: · Modelling · Knowledge Representation · Programming · Natural Language Processing · History
The Digital Humanities community
* Communities of Digital Humanists (“DHers”) · ADHO (http://adho.org/) + ALLC + ACH + CSDH/SCHN + centerNet + aaDH + JADH
The Digital Humanities community
* Communities of Digital Humanists (“DHers”) · Others + HASTAC (http://www.hastac.org/) + CHAIN (http://www.arts-humanities.net/chain) + DARIAH (http://www.dariah.eu/) + Twitter (https://twitter.com/)
Modelling and Knowledge Representation
Definition of Knowledge Representation
* What is KR? · KR is a sub-discipline in the field of artificial intelligence, but also an interdisciplinary methodology that combines logic and ontology to produce models of human understanding that are tractable to computation (Unsworth, 2001).
Definition of Knowledge Representation
* What is KR? · In other words, it is the representation of knowledge by means of a formal language that enables automated processing.
Definition of Knowledge Representation
* KR entails: 1. abstraction and modelling 2. a formal language
Definition of Knowledge Representation
* Abstraction and modelling · substitution for the thing itself · determine important aspects to represent · ignore irrelevant details · reasoning about the world rather than taking action in it
Definition of Knowledge Representation
* A formal language · a language to say things about the world · a medium of human expression to represent the model · formalized morphology, syntax and semantics · computationally processable
References
Davis, Randall, Howard Shrobe, and Peter Szolovits. “What Is a Knowledge Representation?” AI Magazine 14.1 (1993): 17.
www.aaai.org. Web. 13 Aug. 2013.
Unsworth, John. “Knowledge Representation in Humanities Computing.” Inaugural E-humanities Lecture at the National
Endowment for the Humanities (2001): n. pag.
Knowledge Representation in Digital Humanities
* Representation of: · cultural objects · archival materials + print-based (e.g. manuscripts) + visual-based (e.g. paintings) + audio-based (e.g. sound films)
Knowledge Representation in Digital Humanities
* KR is a critical and self-conscious activity
* KR requires humanists · to make explicit what they know about the object · to understand the relationship between the object and its representation
Knowledge Representation in Digital Humanities
* Examples: · electronic edition of a text · model of an artwork
Knowledge Representation in Digital Humanities
* Electronic edition of a text
<text title="Romeo and Juliet" author="William Shakespeare">
  <act index="1">
    <scene index="1" title="A public place">
      <actor>Sampson</actor>: Gregory, o' my word, we'll not <note meaning="take insults">carry coals</note>.
      <actor>Gregory</actor>: No, for then we should be <note meaning="coal miners">colliers</note>.
      <actor>Sampson</actor>: I mean, if we be <note meaning="angered">in choler</note>, we'll <note meaning="draw our weapons">draw</note>.
      ...
    </scene>
  </act>
</text>
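As a rough illustration of why a formal representation matters, the sketch below reads a shortened version of this edition with Python's standard xml.etree.ElementTree module and collects the editorial notes. It assumes the markup is written as well-formed XML (attribute values as name="value"); the EDITION string and the glossary function are names invented for this example, not part of the lecture.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed version of the edition (an assumption for this sketch).
EDITION = """
<text title="Romeo and Juliet" author="William Shakespeare">
  <act index="1">
    <scene index="1" title="A public place">
      <actor>Sampson</actor>: Gregory, o' my word, we'll not <note meaning="take insults">carry coals</note>.
      <actor>Gregory</actor>: No, for then we should be <note meaning="coal miners">colliers</note>.
    </scene>
  </act>
</text>
"""


def glossary(xml_source):
    """Map each annotated phrase to the meaning recorded in its <note> element."""
    root = ET.fromstring(xml_source.strip())
    return {note.text: note.get("meaning") for note in root.iter("note")}


root = ET.fromstring(EDITION.strip())
print(root.get("title"), "by", root.get("author"))
for phrase, meaning in glossary(EDITION).items():
    print(phrase, "=", meaning)
```

Because the metadata and the notes are explicit in the markup, the program can retrieve them without any understanding of the play itself.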
Knowledge Representation in Digital Humanities
* Model of an artwork
References
Davis, Randall, Howard Shrobe, and Peter Szolovits. “What Is a Knowledge Representation?” AI Magazine 14.1 (1993): 17.
www.aaai.org. Web. 13 Aug. 2013.
Schreibman, Susan, Ray Siemens, and John Unsworth, eds. “The Digital Humanities and Humanities Computing:
Introduction.” A Companion to Digital Humanities. Hardcover. Oxford: Blackwell Publishing Professional, 2004. Wiley
Online Library. Web. 3 June 2013. Blackwell Companions to Literature and Culture.
Unsworth, John. “Knowledge Representation in Humanities Computing.” Inaugural E-humanities Lecture at the National
Endowment for the Humanities (2001): n. pag.
Representing data and procedures
* Data: · Facts · Items · Objects
* Procedures: · Recipes · Methods · Algorithms
Representing data and procedures
* Data representation: · Bit · Number · String · Abstract data type
Representing data and procedures
* Data representation: · Database · Conceptual map · Markup language · Other formats: CSV, RGB...
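A few of these representations can be tried directly in Python. The sketch below is only illustrative: the sample word, the one-row catalogue and the variable names are assumptions made up for this example.

```python
import csv
import io

# String -> numbers -> bits: each character has a numeric code,
# and each code can be written as a sequence of bits.
word = "text"
code_points = [ord(letter) for letter in word]
bits = [format(code, "08b") for code in code_points]

# CSV: one record per line, fields separated by commas
# (a hypothetical one-row catalogue).
catalogue = "title,author\nRomeo and Juliet,William Shakespeare\n"
records = list(csv.DictReader(io.StringIO(catalogue)))

# RGB: a colour as three numbers between 0 and 255,
# here also rendered in the hexadecimal notation used on the web.
red = (255, 0, 0)
red_hex = "#{:02x}{:02x}{:02x}".format(*red)

print(code_points, bits)
print(records[0]["title"], "/", records[0]["author"])
print(red_hex)
```

The same piece of information can be held in several of these forms; which one to choose depends on what the procedure that uses it needs to do.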
Representing data and procedures
* Procedure representation: · Flow diagram · Pseudocode · Programming language implementation
Representing data and procedures
* Data + Procedure = Problem solution
* Example: Count how many words in a text are written in the plural (here, words that end with s).
Representing data and procedures
* Data: · text (list of words) · word (string of letters) · letter (single character) · counter (number)
Representing data and procedures
* Procedure:
1. Start the counter at 0
2. Split the text into a list of words
3. For each word, get its last letter
4. If the last letter is the letter s, increment the counter by 1
5. Repeat the process for the next word
6. The result is the value of the counter
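The procedure above can be sketched in Python as follows. This is one possible implementation, not necessarily the program shown in the lecture; the function name count_plurals and the sample sentence are assumptions made for this example.

```python
def count_plurals(text):
    """Count the words of `text` whose last letter is 's'."""
    counter = 0                  # step 1: the counter starts at 0
    words = text.split()         # step 2: split the text into a list of words
    for word in words:           # step 5: repeat for every word
        last_letter = word[-1]   # step 3: get the last letter
        if last_letter == "s":   # step 4: increment the counter by 1
            counter += 1
    return counter               # step 6: the result is the counter


sample = "The cats chase two dogs across the gardens"
print(count_plurals(sample))  # prints 4
```

The rule "ends with s" is only an approximation of the plural: here it also counts "across", and it would miss a plural followed by punctuation such as "dogs." as well as irregular plurals such as "children".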
Representing data and procedures
* Result of the program
Representing data and procedures
References
Miller, Bradley N, and David L Ranum. “Chapter 1: Introduction.” Problem Solving with Algorithms and Data Structures
Using Python. 2nd edition. Sherwood, Or.: Franklin, Beedle & Associates, 2011. Print.
Unsworth, John. “Knowledge Representation in Humanities Computing.” Inaugural E-humanities Lecture at the National
Endowment for the Humanities (2001): n. pag.
Sperberg-McQueen, C. M., and David Dubin. “Data Representation.” DH Curation Guide (2012): n. pag. Print.
Overview of the course
* NLP with Python · A natural language... + is used for communication among humans (e.g. English, Spanish...) + is hard to pin down with explicit rules
Overview of the course
* NLP with Python · NLP covers any kind of computer manipulation of natural language, from: + counting word frequencies to + “understanding” complete human utterances
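At the simple end of that range, counting word frequencies takes only a few lines of Python. The sketch below uses collections.Counter from the standard library; the function name word_frequencies and the sample sentence are assumptions for illustration, and real texts would also need punctuation handling.

```python
from collections import Counter


def word_frequencies(text):
    """Return a Counter mapping each lower-cased word to its number of occurrences."""
    words = text.lower().split()
    return Counter(words)


sample = "To be or not to be"
freqs = word_frequencies(sample)
print(freqs.most_common(2))
```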
Overview of the course
* NLP with Python · Students will learn: + how to program + how to analyze/manipulate language + how data structures and algorithms are used in NLP + how data is stored in standard formats
Overview of the course
* NLP with Python · Why Python? + simple yet powerful programming language + excellent functionality for NLP + highly readable for humans
Overview of the course
* Contents · Chapter 1. DH and KR · Chapter 2. Principles of Computing · Chapter 3. Fundamentals of Programming · Chapter 4. Python Programming Language · Chapter 5. Text Representation · Chapter 6. Domain Modelling and Complex Object Representation
Overview of the course
* Contents · Chapter 7. Raw Text Processing · Chapter 8. Accessing Text Corpora and Lexical Resources · Chapter 9. NLP · Chapter 10. Historical Texts · Chapter 11. Sentiment Analysis
Overview of the course
* Contents · Chapter 12. Final Project Development · Chapter 13. Final Project Presentations
Assignment
* Assignment 1: DH & KR · Readings + What Is Digital Humanities and What's it Doing in English Departments? + DH2012 – Conference Abstracts (one project)
Assignment
* Assignment 1: DH & KR · Project + Using what you have learnt in this lecture and the first reading, give your own definition of DH + Pick a project from the second reading and explain what its knowledge is about and how it is modelled
References
Kirschenbaum, Matthew G. “What Is Digital Humanities and What’s It Doing in English Departments?” ADE Bulletin 150
(2010): n. pag.
The Alliance of Digital Humanities Organizations et al. Digital Humanities 2012 - Conference Abstracts. Proceedings
(Komplette Ausgabe einer Konferenz etc.). 2008. Print.
Bibliography
Davis, Randall, Howard Shrobe, and Peter Szolovits. “What Is a Knowledge Representation?” AI Magazine 14.1 (1993): 17.
www.aaai.org. Web. 13 Aug. 2013.
“Digital Humanities.” Wikipedia, the free encyclopedia 9 Aug. 2013. Wikipedia. Web. 12 Aug. 2013.
“Digital Humanities Forum 2011.” N. p., n.d. Web. 16 June 2013.
“DHQ: Digital Humanities Quarterly.” N. p., n.d. Web. 20 Dec. 2013.
Kirschenbaum, Matthew G. “What Is Digital Humanities and What’s It Doing in English Departments?” ADE Bulletin 150
(2010): n. pag.
Miller, Bradley N, and David L Ranum. Problem Solving with Algorithms and Data Structures Using Python. 2nd edition.
Sherwood, Or.: Franklin, Beedle & Associates, 2011. Print.
Piotrowski, Michael. Natural Language Processing for Historical Texts. [San Rafael, Calif.]: Morgan & Claypool, 2012. Open
WorldCat. Print.
Prescott, Andrew. “An Electric Current of the Imagination: What the Digital Humanities Are and What They Might
Become.” Journal of Digital Humanities (2012): n. pag. Web. 29 May 2013.
Schreibman, Susan, Ray Siemens, and John Unsworth, eds. A Companion to Digital Humanities. Hardcover. Oxford: Blackwell
Publishing Professional, 2004. Wiley Online Library. Web. 3 June 2013. Blackwell Companions to Literature and
Culture.
Sperberg-McQueen, C. M., and David Dubin. “Data Representation.” DH Curation Guide (2012): n. pag. Print.
Bird, Steven, Ewan Klein, and Edward Loper. Natural Language Processing with Python. O’Reilly Media, 2009. Print.
Svensson, Patrik. “Humanities Computing as Digital Humanities.” Digital Humanities Quarterly 3.3 (2009): n. pag. Print.
Svensson, Patrik. “The Landscape of Digital Humanities.” Digital Humanities Quarterly 4.1 (2010): n. pag. Web. 30 May
2013.
The Alliance of Digital Humanities Organizations et al. Digital Humanities 2012 - Conference Abstracts. Proceedings
(Komplette Ausgabe einer Konferenz etc.). 2008. Print.
“The Digital Humanities Manifesto 2.0.” 2009.
Unsworth, John. “Knowledge Representation in Humanities Computing.” Inaugural E-humanities Lecture at the National
Endowment for the Humanities (2001): n. pag.