Diploma thesis presentation (Diplomvortrag), Marc Ehrig, FZI, 22.01.2002: Ontology-Focused Crawling of Documents and Relational Metadata


Page 1: Diploma thesis presentation (Diplomvortrag), Marc Ehrig, FZI, 22.01.2002: Ontology-Focused Crawling of Documents and Relational Metadata

Page 2:

System architecture

[Figure: system architecture diagram. Only the labels survive; their arrangement in the diagram is not recoverable.]

Area labels: User Interaction; Ontology and Metadata Management; Preprocessing; Crawling; Computation

Component labels: Preprocessor and Separator; Free-Text Lookup; Anchortext Lookup; RDF metadata validation and extraction; Relevancy Measures; Web Crawling Process; Instantiated Ontology & Metadata Structure; Result Presentation and Ontology Evolvement; Maintenance

Data and arrow labels: user; managing ontology and metadata structures; define start URLs; select & parametrize; inspect; URL list; Document list; Retrieved Web Documents; links, text, meta-data; RDF-metadata; Document relevance; Link relevance
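The architecture slide names a Web Crawling Process that is fed by a URL list and steered by Relevancy Measures (document relevance and link relevance). The following is a minimal sketch of that kind of best-first crawl loop, not the system shown in the talk. Everything in it is an assumption made for illustration: the in-memory `TOY_WEB`, the `ONTOLOGY_TERMS` set, the keyword-overlap score in `document_relevance`, and the rule that a link inherits the relevance of the page it was found on.

```python
import heapq

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a": ("boeing 747 commercial airplane with jet engine", ["b", "c"]),
    "b": ("hotel booking and beach holidays", ["e"]),
    "c": ("airbus a380 airplane rival company", ["d"]),
    "d": ("airplane engine maintenance", []),
    "e": ("cheap hotel deals", []),
}

# Assumed ontology labels; the real system uses a full ontology.
ONTOLOGY_TERMS = {"airplane", "engine", "company", "rival"}


def document_relevance(text, terms):
    """Fraction of ontology terms that occur in the text (keyword stand-in)."""
    words = set(text.lower().split())
    return len(words & terms) / len(terms)


def focused_crawl(start_urls, web, terms, max_pages=10):
    """Visit pages best-first; a link inherits the score of its source page."""
    # heapq is a min-heap, so scores are negated to pop the best link first.
    frontier = [(-1.0, url) for url in start_urls]
    heapq.heapify(frontier)
    seen = set(start_urls)
    results = []
    while frontier and len(results) < max_pages:
        _, url = heapq.heappop(frontier)
        text, links = web[url]  # stands in for fetching and preprocessing
        score = document_relevance(text, terms)
        results.append((url, score))
        for link in links:
            if link not in seen and link in web:
                seen.add(link)
                heapq.heappush(frontier, (-score, link))
    return results


if __name__ == "__main__":
    for url, score in focused_crawl(["a"], TOY_WEB, ONTOLOGY_TERMS, max_pages=4):
        print(url, score)
```

With a budget of four pages the sketch fetches a, b, c and d. Page e is only reachable through the irrelevant page b, so its link stays at the bottom of the frontier and is never fetched.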

Page 3: (empty slide)

Page 4:

Demo: SOEP

Page 5:

Demo: Options

Page 6:

Demo: Entities

Page 7:

Demo: Run

Page 8:

Demo: Documents

Page 9:

Demo: URLs

Page 10:

Demo: Words

Page 11:

Demo: Metadata

Page 12:

Demo: File with RDF statements

<rdf:RDF xmlns:b="http://kaon.semanticweb.org/2001/11/kaon-lexical#"
         xmlns:d="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:h="file:Z:/Programmierung/Local/New/AdditionalFiles/airplane.kaon#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">

  <rdf:Description rdf:about="urn:rdf:6113af897d1e6457528cf08de108541d-A380"
                   h:engine_type="jet">
    <rdf:type rdf:resource="file:Z:/Programmierung/Local/New/AdditionalFiles/airplane.kaon#commercial airplane"/>
  </rdf:Description>

  <rdf:Property rdf:about="file:Z:/Programmierung/Local/New/AdditionalFiles/airplane.kaon#rival">
    <d:range rdf:resource="file:Z:/Programmierung/Local/New/AdditionalFiles/airplane.kaon#company"/>
    <d:domain rdf:resource="file:Z:/Programmierung/Local/New/AdditionalFiles/airplane.kaon#company"/>
  </rdf:Property>

  <b:Label rdf:about="urn:rdf:e57f37acd768087d1278de3cbb44f669-label_rival" b:value="rival">
    <b:references rdf:resource="file:Z:/Programmierung/Local/New/AdditionalFiles/airplane.kaon#rival"/>
    <b:inLanguage rdf:resource="http://kaon.semanticweb.org/2001/11/kaon-lexical#EN"/>
  </b:Label>

</rdf:RDF>
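To make the statements in this file easier to read, the sketch below extracts subject, predicate and object triples from a shortened copy of it using only the Python standard library. It is an illustration, not the extraction component of the presented system: the helper names `to_uri` and `extract_triples` are hypothetical, and the code handles only the flat RDF/XML shape used on the slide (top-level nodes with `rdf:about`, property attributes, and child elements carrying `rdf:resource`), not RDF/XML in general.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ONTO = "file:Z:/Programmierung/Local/New/AdditionalFiles/airplane.kaon#"

# Shortened copy of the slide's file: the instance and the property only.
RDF_XML = f"""<rdf:RDF
    xmlns:d="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:h="{ONTO}"
    xmlns:rdf="{RDF_NS}">
  <rdf:Description rdf:about="urn:rdf:6113af897d1e6457528cf08de108541d-A380"
                   h:engine_type="jet">
    <rdf:type rdf:resource="{ONTO}commercial airplane"/>
  </rdf:Description>
  <rdf:Property rdf:about="{ONTO}rival">
    <d:range rdf:resource="{ONTO}company"/>
    <d:domain rdf:resource="{ONTO}company"/>
  </rdf:Property>
</rdf:RDF>"""

ABOUT = f"{{{RDF_NS}}}about"
RESOURCE = f"{{{RDF_NS}}}resource"
DESCRIPTION = f"{{{RDF_NS}}}Description"


def to_uri(name):
    """Turn ElementTree's '{namespace}local' form into a plain URI."""
    if name.startswith("{"):
        namespace, local = name[1:].split("}", 1)
        return namespace + local
    return name


def extract_triples(xml_text):
    """Return (subject, predicate, object) tuples from flat RDF/XML."""
    triples = []
    for node in ET.fromstring(xml_text):
        subject = node.get(ABOUT)
        # A typed node such as <rdf:Property> implies an rdf:type statement.
        if node.tag != DESCRIPTION:
            triples.append((subject, RDF_NS + "type", to_uri(node.tag)))
        # Property attributes carry literal values, e.g. h:engine_type="jet".
        for attr, value in node.attrib.items():
            if attr != ABOUT:
                triples.append((subject, to_uri(attr), value))
        # Child elements point at other resources via rdf:resource.
        for child in node:
            triples.append((subject, to_uri(child.tag), child.get(RESOURCE)))
    return triples


if __name__ == "__main__":
    for triple in extract_triples(RDF_XML):
        print(triple)
```

Read this way, the file states that the instance A380 has engine type "jet" and is a commercial airplane, and that `rival` is a property whose domain and range are both `company`.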

Page 13: (empty slide)

Page 14:

Example: CIIR, general

[Chart: axis ticks from 0 to 1 in steps of 0.2; plotted data not recoverable]

Page 15:

Example: Prof. Deshmukh

[Chart: axis ticks from 0 to 1 in steps of 0.2; plotted data not recoverable]

Page 16:

Example: CIIR

[Chart: one axis from 0 to 0.6 in steps of 0.1, the other from 0 to 1400 in steps of 200; series: keyword, taxonomic, relational, total; plotted data not recoverable]

Page 17:

Example: CIIR, relevant pages

• http://ciir.cs.umass.edu/index.html
• http://ciir.cs.umass.edu
• http://ciir.cs.umass.edu/publications/
• http://www-ciir.cs.umass.edu/~allan/
• http://ciir.cs.umass.edu/personnel/croft.html
• http://www.cs.umass.edu/csinfo/techrep.html
• http://www-aml.cs.umass.edu/criccs/level2-4.html
• http://www-nlp.cs.umass.edu/ciir-pubs/tepubs.html
• http://www.cs.umass.edu/csinfo/groups.html
• http://www.umass.edu/research/center.html
• http://www.cs.umass.edu/autogen/faculty.html
• http://www.umass.edu/pride/

Page 18:

Example: Boeing 747

[Chart: axis ticks from 0 to 1 in steps of 0.2; plotted data not recoverable]

Page 19:

Example: Boeing 747

[Chart: axis ticks from 0 to 0.05 in steps of 0.005; series: keyword, taxonomic, relational, total; plotted data not recoverable]

Page 20:

Example: Breakwater Hotel

[Chart: axis ticks from 0 to 1 in steps of 0.2; plotted data not recoverable]

Page 21: (empty slide)