Post on 02-Jan-2016

Mining Dutch History: researching public debate in the nineteenth century

Dr José de Kruif

Researcher

Research Institute for History and Culture

Utrecht University

Newspaper (1840)

Pamphlet production

Chart: Pamphlet production, Low Countries, 1800-1853. Horizontal axis: years (1800 to 1853). Vertical axis: number of pamphlets (0 to 500).

Chart series: Foreign affairs; Politics; Public finance/taxes; Law; Education; Public works/floods; Poor relief; Defense; Varia; Colonies; Personalia; Local affairs; Church; Independence of Belgium; Gunpowder ship; King; Medicine; Agriculture and trade; Bonaparte; Bishop controversy; Total.

Events marked on the chart: explosion of ship loaded with gunpowder; Jubilee Reformation; Waterloo; 1813 end of rule Napoleon; Bishop controversy 1853; February flood 1825; Cholera; Independence of Belgium; Schism protestant church; Second schism protestant church; New Constitution; death of King William II, coronation William III.

Pamphlet, April 1853

Text fragments considered typical

“We gaan naar den grond met die verdraagzaamheid, en verliezen onze eigene vrijheid terwijl wij zoo dolzinnig ijveren voor die van anderen. We zullen er de vruchten van plukken, als de inquisitie regt spreekt op onzen vrijen grond en de schavotten staan opgerigt voor ons en onze kinderen.”

“Tolerance will be our Waterloo. We will lose our freedom whilst devoting ourselves to the freedom of others. We will only recognize the fruits of our ignorance when the inquisition judges on our free soil and the scaffolds will be the fate of ourselves and our children.”

“Bij gevolg kan elk middel, hoe snood, hoe onredelijk, hoe goddeloos ook, aangewend worden: staatkundige verdeeldheid revolutie, burgertwist, inquisitie, brandstapels, vergif, zedeloosheid, koningsmoord,... Ziedaar wapenen in handen der Jezuïten!”

“Every means, however nasty, malicious or blasphemous, can be used: inciting civil war, revolution, inquisition, burning at the stake, poison, murdering the king… are all weapons in the hands of the Jesuits.”

Digitizing, database

Diagram labels: Scan; OCR; Text; Database; Metadata; Text mining; Results; Documents
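As a rough illustration of the step from OCR text and metadata to a queryable database, the sketch below stores documents and extraction results in two tables. It uses Python's built-in sqlite3 module as a stand-in for the Access database named on the next slide; the table names, column names, and the sample row are all hypothetical and are not taken from the presentation.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions.
SCHEMA = """
CREATE TABLE documents (
    doc_id   INTEGER PRIMARY KEY,
    title    TEXT NOT NULL,      -- metadata
    year     INTEGER,            -- metadata
    ocr_text TEXT NOT NULL       -- text produced by the OCR step
);
CREATE TABLE results (
    doc_id  INTEGER NOT NULL REFERENCES documents(doc_id),
    concept TEXT NOT NULL,       -- concept found by text mining
    hits    INTEGER NOT NULL     -- number of occurrences in the document
);
"""


def build_database(path=":memory:"):
    """Create an empty database with the two tables above."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_document(conn, title, year, ocr_text):
    """Store one scanned document and return its generated id."""
    cur = conn.execute(
        "INSERT INTO documents (title, year, ocr_text) VALUES (?, ?, ?)",
        (title, year, ocr_text),
    )
    return cur.lastrowid


def add_result(conn, doc_id, concept, hits):
    """Store one text mining result for a document."""
    conn.execute(
        "INSERT INTO results (doc_id, concept, hits) VALUES (?, ?, ?)",
        (doc_id, concept, hits),
    )


def hits_per_year(conn, concept):
    """Join results back to the metadata: total hits for a concept per year."""
    rows = conn.execute(
        "SELECT d.year, SUM(r.hits) FROM results r "
        "JOIN documents d ON d.doc_id = r.doc_id "
        "WHERE r.concept = ? GROUP BY d.year ORDER BY d.year",
        (concept,),
    )
    return rows.fetchall()


# Placeholder data, invented for the example.
conn = build_database()
doc = add_document(conn, "Placeholder pamphlet", 1853, "placeholder OCR text")
add_result(conn, doc, "jesuits", 3)
print(hits_per_year(conn, "jesuits"))
```

Keeping results in a separate table linked by document id means extraction can be rerun or refined without touching the scanned text or its metadata.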

Access Database

Extracted results

Synonyms Jesuits
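One plausible way to handle synonyms is to map every spelling variant to a single concept before counting. The sketch below is a minimal illustration in Python, not the text mining software used in the presentation; the variant list is a hypothetical example, since the transcript does not reproduce the synonyms shown on the slide.

```python
import re
import unicodedata
from collections import Counter

# Hypothetical variant list: these spellings are assumptions for illustration.
SYNONYMS = {
    "jesuits": ["jezuïten", "jezuïet", "jezuiten", "jezuiet"],
}


def normalize(text):
    """Lower-case and use composed Unicode so that 'ï' always compares equal."""
    return unicodedata.normalize("NFC", text).lower()


def build_lookup(synonyms):
    """Invert the dictionary: each variant points to its concept."""
    return {
        normalize(variant): concept
        for concept, variants in synonyms.items()
        for variant in variants
    }


def count_concepts(text, synonyms):
    """Count how often each concept occurs in the text, under any variant."""
    lookup = build_lookup(synonyms)
    tokens = re.findall(r"\w+", normalize(text))
    return Counter(lookup[token] for token in tokens if token in lookup)


fragment = "Ziedaar wapenen in handen der Jezuïten !"
print(count_concepts(fragment, SYNONYMS))
```

Adding a newly discovered variant only requires extending the list for its concept, which is the kind of dictionary enrichment with domain knowledge mentioned under the advantages.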

Refining extraction results

Actors 1853

Text Link analysis definitions

Opinions on the pope

The liberal government could count on criticism as well…

Categories arguments

Text mining node and anomaly

Peer groups & outliers

Group 1: History & civil disorder

Group 2: History & new constitution

Group 3: No history. Civil disorder

Group 4: Very moderate & 3 outliers
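To make the idea of peer groups and outliers concrete, the sketch below flags documents that lie unusually far from the centre of their own group. It is a simple z-score rule on Euclidean distance, swapped in for illustration; it is not the anomaly detection procedure used in the presentation. The group name, document ids, and feature vectors are invented, and the threshold of 1.5 is an arbitrary assumption.

```python
import math
from statistics import mean, pstdev


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [mean(column) for column in zip(*vectors)]


def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_outliers(groups, threshold=1.5):
    """Return (group, doc_id) pairs whose distance to the group centroid
    is more than `threshold` standard deviations above the group mean.

    groups: {group_name: {doc_id: feature_vector}}
    """
    outliers = []
    for name, docs in groups.items():
        center = centroid(list(docs.values()))
        dists = {doc_id: distance(vec, center) for doc_id, vec in docs.items()}
        mu = mean(list(dists.values()))
        sigma = pstdev(list(dists.values()))
        if sigma == 0:
            continue  # all documents equally far from the centre: no outliers
        for doc_id, dist in dists.items():
            if (dist - mu) / sigma > threshold:
                outliers.append((name, doc_id))
    return outliers


# Invented data: each vector could hold counts of two argument categories.
example = {
    "peer_group_a": {
        "doc1": [2, 0], "doc2": [2, 0], "doc3": [2, 0],
        "doc4": [2, 0], "doc5": [2, 0], "doc6": [2, 12],
    }
}
print(find_outliers(example))
```

With only a handful of documents per group the attainable z-score is bounded, so the threshold has to be chosen with group size in mind.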

C & R Tree: served as a source or not?

Advantages

- Gives insight into a large number of documents, so there is no need to rely on just a few and risk an unrepresentative sample.

- Combines the advantages of text analysis with statistical techniques.

- The dictionary of the software can be enriched with specific domain knowledge.

- New approaches become possible.

Drawbacks

- The researcher needs some knowledge of the documents and their subject to be able to interpret the results.

- The approach is especially suited to broad research on large quantities of text. The more one zooms in, the less relevant the cluster results become.

- Supplementing the lexical universe of the software with specific domain knowledge can be time-consuming.

- The researcher has to be familiar, or become familiar, with a number of statistical techniques (e.g. cluster analysis).
