De conferentie 2012 - CLARIN
![Page 1: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/1.jpg)
CLARIN-NL: Reaching out to the users
Arjan van Hessen
Language Resources and Technology Infrastructure for the Humanities and the Social Sciences in the Netherlands
![Page 2: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/2.jpg)
State of the Technology
Language and Speech Technology is (nearly) mature
Many applications are available
Most of it is usable (although not perfect), but…
![Page 3: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/3.jpg)
Unused Technology & Resources
Many scholars are not aware of the available HLT and resources
A priori technical knowledge is still necessary
Use depends too much on “friends” in the field
Lack of standardization is killing
The technology is used less than expected
![Page 4: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/4.jpg)
Research Life cycle
Cultural Heritage Institution(s)
New Idea
Research
Building
Tuning
Publications
?
![Page 5: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/5.jpg)
Unused Technology & Resources
CAR
![Page 6: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/6.jpg)
HLT & CHI paths
Language processing
Machine learning
Humanities
CATCH
Cultural Heritage Institutions
![Page 7: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/7.jpg)
After the project
Lack of standardization
Bad interfaces
![Page 8: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/8.jpg)
CLARIN-EU (2007-2012)
CLARIN-NL (2009-2015)
CLARIN-ERIC (2012-xxxx)
CLARIAH (2015-…)
Infrastructure program for the Humanities
![Page 9: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/9.jpg)
Issues to address
1. Finding the users
2. Identification of their needs/problems
3. Do our solutions correspond to their problems?
4. Usability of tools: can they use them?
5. Visualisation
6. Tutorials and web material (movies, courses)
7. Sustainability of tools and resources
![Page 10: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/10.jpg)
1. FINDING THE USERS
How to identify and convince potential users
![Page 11: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/11.jpg)
Humanities enter a New Era
Huge amounts of digital data are becoming available
The traditional approach, Spitzweg’s “lonely scholar”, no longer suffices
Big data, supported by automated methods
Hardware allows this, and many tools are available or under development
![Page 12: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/12.jpg)
User Surveys
Go out and ask potential users
User survey in the Netherlands (2010)
![Page 13: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/13.jpg)
2. IDENTIFICATION OF THEIR NEEDS/PROBLEMS
What do they need?
![Page 14: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/14.jpg)
User attraction cycle
Finding new users
Convincing these users to participate
Training these users in the use of all those wonderful tools
Supporting the users
Listening to the users
![Page 15: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/15.jpg)
3. DO OUR SOLUTIONS CORRESPOND TO THEIR PROBLEMS?
What to avoid in order NOT to scare off (potential) users
![Page 16: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/16.jpg)
The CLARIN dream
Give me digital copies of all contemporary documents in European archives that discuss the Great Plague of England (1348-1350)
Give me all negative articles about Catholics in the Fryske Courant (1868-1924)
Find European TV news interviews that involve discussions about Geert Wilders
![Page 17: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/17.jpg)
The CLARIN nightmare in 6 sleepless nights – night 1
Give me digital copies of all contemporary documents in European archives that discuss the Great Plague of England (1348-1350)
“All” means from all countries and all archives, not just some archives in some (9) countries that happen to be in CLARIN
If contemporary docs exist in digital form at all, they are probably pictures – how do we get access to the content?
Can we rely on standardized metadata to find them?
Many of the docs may be in Latin – can we handle that, and what about the other languages?
How would a scholar know how to formulate this query?
How to present results?
![Page 18: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/18.jpg)
4. USABILITY OF TOOLS
The gearbox syndrome
![Page 19: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/19.jpg)
The gearbox syndrome explained
Humanities scholar with a problem, waiting for a solution
First HLT researcher offering help
![Page 20: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/20.jpg)
The gearbox syndrome explained
Humanities scholar with a problem, waiting for a solution
First generation named entity recognizer (rule based)
![Page 21: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/21.jpg)
The gearbox syndrome explained
Humanities scholar with a problem, waiting for a solution
Second HLT researcher offering help
![Page 22: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/22.jpg)
The gearbox syndrome explained
Humanities scholar with a problem, waiting for a solution
Second generation named entity recognizer (statistics based)
![Page 23: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/23.jpg)
The gearbox syndrome explained
Humanities scholar with a problem, waiting for a solution
Third HLT researcher offering help
![Page 24: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/24.jpg)
The gearbox syndrome explained
Humanities scholar with a problem, waiting for a solution
LREC 2012 paper about next generation named entity recognizer
![Page 25: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/25.jpg)
The gearbox syndrome explained
![Page 26: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/26.jpg)
Making understandable interfaces
![Page 27: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/27.jpg)
5. VISUALIZATION
A picture says more than 1000 words
Easy visualization fosters data analysis
Nice visualization eases use of analysis tools
Nice-to-look-at tools help to reach out to the community
![Page 28: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/28.jpg)
Who answered which words: visualizing word frequency information in letters
C. Culy. 2012. "Some challenges of language and linguistic data for information visualization." Invited keynote presentation at Advanced Visual Methods for Linguistics, University of York, September 7, 2012.
![Page 29: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/29.jpg)
![Page 30: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/30.jpg)
![Page 31: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/31.jpg)
Parliamentary Debate
Which party interrupted which other party, and how often?
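The question on this slide comes down to counting ordered pairs of parties. Below is a minimal sketch of that tabulation, assuming the debate transcript has already been reduced to (interrupter, interrupted) pairs. The party labels and the data are hypothetical placeholders, not taken from the presentation or from any real debate.

```python
from collections import Counter

# Hypothetical data: (interrupting party, interrupted party) pairs.
# The labels "A", "B", "C" are placeholders, not real debate data.
interruptions = [
    ("A", "B"),
    ("A", "B"),
    ("B", "A"),
    ("C", "A"),
    ("A", "C"),
]


def count_interruptions(pairs):
    """Count how often each party interrupted each other party."""
    return Counter(pairs)


counts = count_interruptions(interruptions)

# One line per (interrupter, interrupted) pair, most frequent first.
for (interrupter, interrupted), n in counts.most_common():
    print(f"{interrupter} interrupted {interrupted}: {n} time(s)")
```

A table of counts like this is the kind of input a matrix or chord-style visualization could be drawn from.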
![Page 32: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/32.jpg)
6. TUTORIALS AND WEB MATERIAL
Create and publish web tutorials
Publish recorded lectures about CLARIN-specific topics
Make and publish showcases
![Page 35: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/35.jpg)
7. SUSTAINABILITY OF TOOLS AND RESOURCES
Resources and tools must remain accessible after a project finishes
Data and tools must use internationally accepted standards
Easy access via federated login
![Page 36: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/36.jpg)
CLARIN Centres
![Page 37: De conferentie 2012 - CLARIN](https://reader033.vdocument.in/reader033/viewer/2022052820/548684c1b4af9fdc3d8b496d/html5/thumbnails/37.jpg)
Conclusion
CLARIN offers a good and sustainable infrastructure for long-term use of both Resources and Tools
Participating in CLARIN gives you access to enclosure tools, standardized metadata, tools for metadata, and the CLARIN community
Give other groups/institutions access to your data… if you want