1
Data Mining for Enlightenment
Bettina Berendt
www.cs.kuleuven.be/~berendt
2
Basics
Data Mining (DM) – used in the sense of Knowledge Discovery:
“the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data”
(Fayyad et al., 1996)
Enlightenment:
“man's emergence from his self-imposed immaturity”
(Kant, 1784)
3
Data mining for ...
4
Putting it together: (One) first try
What makes people happy?
Classification learning
5
Results: Corpus-derived happiness factors
yay 86.67
shopping 79.56
awesome 79.71
birthday 78.37
lovely 77.39
concert 74.85
cool 73.72
cute 73.20
lunch 73.02
books 73.02
goodbye 18.81
hurt 17.39
tears 14.35
cried 11.39
upset 11.12
sad 11.11
cry 10.56
died 10.07
lonely 9.50
crying 5.50
[Mihalcea & Liu, Proc. CAAW 2006]
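The slide does not say how these scores are computed. A minimal sketch, assuming a word's happiness factor is the percentage of its occurrences that fall in posts labelled happy; the function name `happiness_factors` and the toy posts are made up for illustration and are not the corpus behind the figures above.

```python
from collections import Counter

def happiness_factors(happy_posts, sad_posts, min_count=1):
    """Score each word by the percentage of its occurrences found in happy posts."""
    happy = Counter(w for post in happy_posts for w in post.lower().split())
    sad = Counter(w for post in sad_posts for w in post.lower().split())
    scores = {}
    for word in happy.keys() | sad.keys():
        total = happy[word] + sad[word]
        if total >= min_count:  # optionally ignore very rare words
            scores[word] = 100.0 * happy[word] / total
    return scores

# Toy, invented posts (assumption: whitespace tokenisation is good enough here).
happy_posts = ["yay shopping today", "lovely lunch yay"]
sad_posts = ["cried today", "so sad and lonely today"]
scores = happiness_factors(happy_posts, sad_posts)
```

Words that occur only in happy posts score 100, words that occur only in sad posts score 0, and shared words such as "today" land in between.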
6
The approach: DM for human learning
5 wrong (but popular) metaphors about the Internet
Successful learning is / has ...
situated and authentic; multiple contexts
active / constructive
social; multiple perspectives
articulation and reflection
Refutation and DM tool support
7
Interdisciplinary challenges
Engineering challenges
[Reputation challenges]
Interdisciplinary / application question challenges
Computational / DM methods challenges
8
Metaphor 1
9
The Internet is a textbook
10
Multi-purpose tools (with DM) for situated and authentic Internet use
Text and link analysis
11
Metaphor 2
12
The Internet is television
13
DM for active/constructive information use: Can you organize these results some more?
14
DM for active/constructive information use (1): Intelligent bibliography creation
Organisation of the literature / bibliography construction
[Berendt, Dingel, & Hanser, Proc. ECDL 2006; Berendt & Krause, submitted; Berendt & Kolbe, in prep.]
Citation-based clustering, text analysis (TF.IDF, ...) for semi-automatic ontology learning; embedded in authoring tool
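The TF.IDF weighting named above can be sketched in a few lines. This is an illustrative version only, assuming the common variant tf × log(N / df) with whitespace tokenisation; the function name `tf_idf` and the sample texts are hypothetical and not taken from the cited tool.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append(
            {t: (c / len(tokens)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return weights

# Invented mini-corpus standing in for paper titles or abstracts.
abstracts = [
    "web usage mining",
    "web content mining",
    "ontology learning",
]
weights = tf_idf(abstracts)
```

Terms that appear in fewer documents ("usage") receive a higher weight than terms shared across documents ("web"), which is what makes the weights usable as features for clustering.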
15
Metaphor 3
16
The Internet is a pile of rubbish (biased / extremist / subjective)
17
DM for analyzing multiple perspectives:
[Fortuna, Galleguillos, & Cristianini, in press]
What characterizes different news sources?
Nearest neighbour / best reciprocal hit for document matching; Kernel Canonical Correlation Analysis and vector operations for finding topics and characteristic keywords
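To illustrate the document-matching step only (not the kernel CCA part), here is a rough sketch of best-reciprocal-hit matching: two articles are paired when each is the other's nearest neighbour. It assumes bag-of-words vectors and cosine similarity, which the slide does not specify; all names and sample headlines are invented, and both sources are assumed to be non-empty.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def best_reciprocal_hits(source_a, source_b):
    """Return index pairs (i, j) where a[i] and b[j] are each other's nearest neighbour."""
    vecs_a = [Counter(d.lower().split()) for d in source_a]
    vecs_b = [Counter(d.lower().split()) for d in source_b]
    best_for_a = [
        max(range(len(vecs_b)), key=lambda j: cosine(va, vecs_b[j])) for va in vecs_a
    ]
    best_for_b = [
        max(range(len(vecs_a)), key=lambda i: cosine(vecs_a[i], vb)) for vb in vecs_b
    ]
    return [(i, j) for i, j in enumerate(best_for_a) if best_for_b[j] == i]

# Invented headlines from two hypothetical news sources.
source_a = ["election results announced in capital", "storm hits coastal towns"]
source_b = [
    "coastal towns hit by storm",
    "capital awaits election results",
    "football cup final tonight",
]
pairs = best_reciprocal_hits(source_a, source_b)
```

Requiring the match to hold in both directions leaves unrelated articles (the football headline here) unpaired instead of forcing them onto their least-bad neighbour.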
18
DM for exploring multiple perspectives
Hyperlinks from blogs to mainstream news media: Germany, USA
[Berendt, Schlegel, & Koch, in Kommunikation, Partizipation und Wirkungen im Social Web, in press]
How do different news media source / refer to one another?
HTML wrapping and link analysis; (not shown: Named Entity Recognition for retrieving textual links)
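A minimal sketch of the wrapping-plus-link-analysis idea, assuming it boils down to pulling `<a href>` targets out of blog pages and counting which hosts they point to. It uses only the Python standard library; the class and function names and the example.com pages are hypothetical, not the wrapper used in the cited study.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse

class LinkCollector(HTMLParser):
    """Collect the host names of all absolute links in one HTML page."""

    def __init__(self):
        super().__init__()
        self.hosts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                host = urlparse(href).netloc.lower()
                if host:  # skip relative (site-internal) links
                    self.hosts.append(host)

def count_link_targets(pages):
    """Count how often each external host is linked across a list of HTML pages."""
    counts = Counter()
    for html in pages:
        parser = LinkCollector()
        parser.feed(html)
        counts.update(parser.hosts)
    return counts

# Invented blog snippets.
blog_pages = [
    '<p>See <a href="http://news.example.com/story1">this report</a>.</p>',
    '<p><a href="http://news.example.com/story2">A</a> and '
    '<a href="http://paper.example.org/x">B</a>, <a href="/about">about</a></p>',
]
link_counts = count_link_targets(blog_pages)
```

The resulting counts per host are the raw material for comparing which news media a set of blogs refers to most often.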
19
DM for making people active explorers of multiple perspectives and multiple contexts
[Berendt & Trümper, PASCAL Symposium, 2008]
Clustering for semi-automatic ontology learning; Named Entity Recognition; multi-dimensional similarity construct and filtering for nearest-neighbour search
20
Metaphor 4
21
The Internet is a dark cave
22
Social tagging for making people see, explore and generate multiple perspectives
See also [Vuorikari, Ochoa, & Duval, submitted]
23
Social browsing with the semantic pointer for making people see & explore multiple perspectives
[Ferlež, PASCAL Symposium, 2008; www.jureferlez.name/2007/07/text-mining-for-semantically-enabled.html]
Inter-page text-block similarity analysis; client-side usage tracking and real-time matching
24
DM for exploring how multiple perspectives evolve
[Griffith, 2007; http://wikiscanner.virgil.gr/]
Why is Scientology an uncontroversial organisation?
Usage tracking, feature construction by table lookup
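"Feature construction by table lookup" can be pictured as adding an organisation attribute to each logged edit by looking up the editor's IP address in a table of network owners. The sketch below assumes that reading; the lookup table, the organisation names, and the edit records are all hypothetical, and the addresses come from ranges reserved for documentation.

```python
import ipaddress

# Hypothetical lookup table: network range -> owning organisation.
ORG_NETWORKS = [
    (ipaddress.ip_network("192.0.2.0/24"), "Example Org A"),
    (ipaddress.ip_network("198.51.100.0/24"), "Example Org B"),
]

def organisation_for(ip_string):
    """Return the organisation whose network contains the address, or None."""
    ip = ipaddress.ip_address(ip_string)
    for network, name in ORG_NETWORKS:
        if ip in network:
            return name
    return None

def add_organisation_feature(edits):
    """Return copies of the edit records with an added 'organisation' field."""
    return [dict(edit, organisation=organisation_for(edit["ip"])) for edit in edits]

# Invented log of anonymous edits.
edits = [
    {"page": "Some article", "ip": "192.0.2.17"},
    {"page": "Some article", "ip": "203.0.113.5"},
]
enriched = add_organisation_feature(edits)
```

Once each edit carries an organisation label, edits can be grouped by page and organisation to see who changed what.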
25
DM for exploring how multiple perspectives evolve
26
Metaphor 5
27
The Internet is a library
28
... but you’re a document too!
[Owad, 2006; www.applefritter.com/bannedbooks]
Where do people live who will buy the Qur’an soon?
29
DM for demonstrating the Internet’s inference capabilities (how to create that book map)
Attribute matching in different schemas, view construction
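As a rough illustration of why this kind of inference is easy, the sketch below joins two record sets with different schemas once their attributes have been matched. The attribute correspondences are given by hand here (real attribute matching would have to discover them), and every name, table, and record is invented.

```python
# Hypothetical attribute correspondences: left-schema name -> right-schema name.
ATTRIBUTE_MAP = {"name": "full_name", "city": "town"}

def build_view(left_rows, right_rows, attribute_map):
    """Join rows whose matched attributes all agree, merging their fields."""
    view = []
    for left in left_rows:
        for right in right_rows:
            if all(left[a] == right[b] for a, b in attribute_map.items()):
                view.append({**right, **left})
    return view

# Invented records in two different schemas.
wishlists = [{"name": "A. Reader", "city": "Springfield", "item": "Some Book"}]
directory = [
    {"full_name": "A. Reader", "town": "Springfield", "postcode": "00000"},
    {"full_name": "B. Other", "town": "Springfield", "postcode": "11111"},
]
view = build_view(wishlists, directory, ATTRIBUTE_MAP)
```

The combined view contains information (item plus postcode) that neither source exposes on its own, which is the point the slide makes about inference.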
30
DM for articulation and reflection
Repetition Organisation Elaboration
[Berendt, in Neues Handbuch Hochschullehre, 2006]
Proxy server
Logfile
ASP
Usage tracking, semantic graph coarsening
31
A conclusion ... and a vision
32
A conclusion ... and a vision
New happiness factors:
yay 86.67
shopping 79.56
awesome 79.71
learning 86.67
understanding 79.56
democracy 79.71
…
33
Caveat 1: Data preparation
One approach:
Tools for active (interactive) wrapper learning
34
Caveat 2: “Digging and surfing”
Reductive understanding is not always adequate and/or desired
Person
Context
Task
...
One approach: Treat it as a competency
35
Caveat 3: Cultural/economic bias
Land area, Population, Internet users
[www.worldmapper.org]
36
… Questions? Comments? Other?
Thank you …