OBIE – Ontology-based Information Extraction
An Approach to Extract and Deal with
Imprecise Temporal Data and Spelling Errors
PhD Proposal
HEGLER TISSOT
Advisor: Marcos Didonet Del Fabro
Universidade Federal do Paraná
Curitiba – Brazil
Feb / 2014
• Introduction
– Context
– Motivating example
– Problem
– Objectives
• State of the art
– Information Extraction
– Ontologies
– OBIE Systems
– Temporal Information
• Proposed work
– Spelling errors
– Temporal Information
Outline
• Information Management
– Large volumes of data are available
• 80% = text (on the Internet or within companies) (Aranha, 2007)
– Unstructured data formats
• New system modeling and building techniques
– What is the challenge?
• ???
Context
• Information Management
– Text Mining
• From “Data Mining”
• A technology whose purpose is to extract non-trivial and interesting knowledge from large collections of unstructured documents
• Classification / Clustering
• Indexing for Search
• Information Extraction
Context
• Text Mining – Classification / clustering
• Machine learning algorithms
Based on the textual content of medical records, how can a possible disease group be identified?
The most discriminant words do not necessarily represent the most suitable concepts.
(“not” >> E10 x E11)
Context
• Text Mining – Information Extraction (IE)
• In IE, relevant information from natural language (NL) texts is identified, collected and normalized.
– NLP
» Natural Language Processing
» Exhaustive deep NL analysis of all aspects of a text
– OBIE (Ontology-based Information Extraction)
Context
Context
Unstructured Textual Content
Medical Record Sample: Blood pressure is lower. No vision complaints. Sub optimal sugar, control with retinopathy and neuropathy, high glucometer readings. Will work harder on diet. Will increase insulin by 2 units.
[Figure: Information Extraction (IE) and Ontology-based Information Extraction (OBIE) applied to the sample. Labels shown: vision OK, high, lower, blood pressure, glucometer, sugar, retinopathy, neuropathy]
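As a minimal sketch of the idea on this slide, the snippet below pulls a few concepts out of the sample record by matching a small hand-written lexicon. The lexicon, the concept names and the `extract` function are assumptions made for illustration; they are not part of the proposed system, where the vocabulary would come from an ontology.

```python
import re

# Hypothetical lexicon: surface term -> concept name (illustration only).
LEXICON = {
    "blood pressure": "BloodPressure",
    "vision": "Vision",
    "sugar": "BloodSugar",
    "glucometer": "GlucometerReading",
    "retinopathy": "Retinopathy",
    "neuropathy": "Neuropathy",
}

def extract(text):
    """Return (concept, matched term, sentence) triples found in the text."""
    results = []
    # Naive sentence split on periods; enough for this short sample.
    for sentence in (s.strip() for s in text.split(".") if s.strip()):
        lowered = sentence.lower()
        for term, concept in LEXICON.items():
            if re.search(r"\b" + re.escape(term) + r"\b", lowered):
                results.append((concept, term, sentence))
    return results

RECORD = ("Blood pressure is lower. No vision complaints. "
          "Sub optimal sugar, control with retinopathy and neuropathy, "
          "high glucometer readings. Will work harder on diet. "
          "Will increase insulin by 2 units.")

for concept, term, sentence in extract(RECORD):
    print(f"{concept:18} <- {sentence!r}")
```

Keeping the sentence next to each concept leaves room for a later step to attach values such as "lower" or "high" to the right term.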
Where to apply?
1. Information Extraction
• Medical records
– Statistical view from unstructured data
• Internet data/news/...
– Knowing more about competitors
• Social Networks
– How to sort and identify specific profiles?
» (e.g. drug dealers)
• Documents
– How many (Word, PDF, ...) documents do you have on your computer/server? What do they say?
Context
2. Answering natural language queries
Context
Which patients presented symptoms X, Y or Z in cases of diseases A, B or C in the last two years?
Result: Patient 1, Patient 2, Patient 3, ...
“match”
3. Semantic + Analytical Data
Context
Who were the best customers in the last semester?
Medical Record Example (in Portuguese)
Motivating Example
Temporal Information (Precise + Imprecise)
Motivating Example
Spelling Errors
Motivating Example
– Extract temporal information from text
– Organize events in a timeline
– Uncertainty → imprecise temporal data
• “a few weeks ago”
• “the coming months”
• “around 10:00 am”
• “in the beginning of next month”
– Spelling errors
• “in the last tree days”
Problem
OBIE approach to extract and deal with:
– Uncertain Temporal Information
– Spelling Errors
Objective
(Bird et al., 2009)
Information Extraction
(Nedellec and Nazarenko, 2006)
Ontology-based Information Extraction (OBIE)
Ontologies
– Formal specification of concepts
– Knowledge Domain
Classes + Instances + Properties + Relations + Axioms = Formal Conceptualization
(Gruber, 1993)
Web Ontology Language (OWL)
Ontology
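The five ingredients listed on this slide can be pictured as a small data structure. The sketch below uses plain Python dataclasses rather than OWL tooling; the `Ontology` class and every name in the example (`Disease`, `Diabetes`, `hasSymptom`, `diagnosis_1`) are invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Ontology:
    classes: set = field(default_factory=set)
    subclass_of: dict = field(default_factory=dict)  # class -> parent class
    instances: dict = field(default_factory=dict)    # instance -> class
    properties: dict = field(default_factory=dict)   # (instance, property) -> value
    relations: set = field(default_factory=set)      # (subject, relation, object)
    axioms: list = field(default_factory=list)       # callables: Ontology -> bool

    def is_a(self, instance, cls):
        """True if the instance belongs to cls directly or through subclassing."""
        current = self.instances.get(instance)
        while current is not None:
            if current == cls:
                return True
            current = self.subclass_of.get(current)
        return False

    def consistent(self):
        """True if every axiom holds for the current content."""
        return all(axiom(self) for axiom in self.axioms)

onto = Ontology()
onto.classes |= {"Disease", "Diabetes", "Symptom"}
onto.subclass_of["Diabetes"] = "Disease"
onto.instances["diagnosis_1"] = "Diabetes"
onto.instances["retinopathy"] = "Symptom"
onto.properties[("diagnosis_1", "onset_year")] = 2012
onto.relations.add(("diagnosis_1", "hasSymptom", "retinopathy"))
# Axiom: the object of every hasSymptom relation must be a Symptom.
onto.axioms.append(
    lambda o: all(o.is_a(obj, "Symptom")
                  for _, rel, obj in o.relations if rel == "hasSymptom")
)
print(onto.is_a("diagnosis_1", "Disease"), onto.consistent())
```

The axiom is what separates this from a plain data model: adding a relation that violates it makes `consistent()` return False.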
– Process unstructured text (natural language)
– Guided by ontologies
– Present output using ontologies
– Specific knowledge domain extraction
(Wimalasuriya and Dou, 2010)
– Desired features
• String similarity
• Inexact matching
• Large repositories
• Multiple ontologies
• Temporal information
OBIE Systems
OBIE General Framework
2. State of the Art
– Organize events in a timeline
– Establish chronological order
– Answer temporal questions (Wong et al., 2005)
– “Which were the most prescribed drugs in the last weeks?”
– “Who used aspirin before having <symptom>?”
– “When did <event-description> happen?”
– Challenges (Temporal Information Extraction)
• Linguistics: different expressions
• Reference resolution: “tomorrow”
• Negation: “not before”, “it didn't happen last year”
Temporal Information Extraction
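The reference resolution challenge can be made concrete with a short sketch: "tomorrow" only becomes a date once the date of the text is known. The offset table and the `resolve` function below are assumptions for illustration and cover just three expressions.

```python
from datetime import date, timedelta

# Hypothetical offsets, in days, for a few relative expressions.
RELATIVE_OFFSETS = {"yesterday": -1, "today": 0, "tomorrow": 1}

def resolve(expression, reference_date):
    """Resolve a relative expression against the date the text was written."""
    offset = RELATIVE_OFFSETS.get(expression.lower())
    if offset is None:
        return None  # not a relative expression this sketch knows about
    return reference_date + timedelta(days=offset)

written_on = date(2014, 2, 10)  # assumed document date
print(resolve("tomorrow", written_on))
```

The same word therefore maps to different dates in different records, which is why the document date has to travel with the extracted expression.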
Tokens that represent a temporal entity (point in time, duration, frequency) (Sanampudi and Kumari, 2010)
• Explicit
– January 2013
• Implicit
– Christmas 2012
• Relative (indexed)
– Yesterday, next month, three days ago (Alonso et al., 2007)
• Vague
– Several weeks, in the next days (Schilder and Habel, 2003)
Temporal Expressions
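The four categories above can be told apart, at least for the slide's own examples, with a handful of regular expressions. This is a rough sketch: the patterns, their order and the `classify` function are assumptions, and real text would need far richer rules.

```python
import re

MONTHS = ("january|february|march|april|may|june|july|august|"
          "september|october|november|december")

# Checked in order; the first match wins. All patterns are illustrative.
PATTERNS = [
    ("vague", re.compile(
        r"\b(several|a few|some)\b|\bin the (next|coming|last) (days|weeks|months)\b")),
    ("relative", re.compile(
        r"\b(yesterday|today|tomorrow)\b|\b(next|last) (day|week|month|year)\b|\bago\b")),
    ("implicit", re.compile(r"\b(christmas|easter|new year)\b")),
    ("explicit", re.compile(rf"\b({MONTHS})\s+\d{{4}}\b|\b\d{{1,2}}/\d{{1,2}}/\d{{4}}\b")),
]

def classify(expression):
    """Return the first category whose pattern matches, or 'unknown'."""
    text = expression.lower()
    for label, pattern in PATTERNS:
        if pattern.search(text):
            return label
    return "unknown"

for example in ("January 2013", "Christmas 2012", "three days ago", "Several weeks"):
    print(example, "->", classify(example))
```

Vague patterns are tested first so that an expression such as "a few weeks ago" is treated as vague rather than relative; that ordering is a design choice of this sketch.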
– Formal representation of temporal concepts
• OWL-Time (Hobbs and Pan, 2004)
• TL-OWL (Kim et al., 2008)
• Other temporal approaches and OWL extensions
– Challenges (Imprecise Temporal Information)
• Extraction
• Representation
• Logics and algebra
Temporal Ontologies
Dealing with Imprecise Temporal Information
Logics and algebra
[Figure: two intervals, A and B, each labeled with a begin and an end]
before( A.begin , B.begin )? → probably true/false
before( B.begin , A.end )? → probably true/false
before( A.end , B.end )? → probably true/false
before( A.begin , B.end )? → TRUE
Dealing with Imprecise Temporal Information
Logics and algebra
[Figure: three intervals, A, B and C, each labeled with a begin and an end]
before( A.begin , B.begin )? → probably true/false
<statement>: after( B.begin , C.end )
before( A.begin , B.begin )? → TRUE
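One way to reproduce the answers on these slides is to model each imprecise time point as a range of possible positions and to compare ranges instead of single values. The sketch below does this with a three-valued `before` and a narrowing step for an added statement. The `Point` class, both functions and all numeric positions are assumptions chosen to match the slide; they are not the algebra the proposal intends to define.

```python
from dataclasses import dataclass

@dataclass
class Point:
    """An imprecise time point known only to lie within [earliest, latest]."""
    earliest: float
    latest: float

def before(p, q):
    """Three-valued test for 'p happens before q'."""
    if p.latest < q.earliest:
        return "TRUE"                 # every possible p precedes every possible q
    if p.earliest >= q.latest:
        return "FALSE"                # no possible p precedes any possible q
    return "probably true/false"      # the ranges overlap

def assert_after(p, q):
    """Record the statement after(p, q) by narrowing both ranges in place."""
    p.earliest = max(p.earliest, q.earliest)
    q.latest = min(q.latest, p.latest)

# Arbitrary positions on a timeline (assumed values).
a_begin, a_end = Point(1, 4), Point(6, 9)
b_begin, b_end = Point(3, 7), Point(8, 12)
c_end = Point(5, 6)

print(before(a_begin, b_begin))   # ranges overlap
print(before(a_begin, b_end))     # ranges are disjoint
assert_after(b_begin, c_end)      # <statement>: after( B.begin , C.end )
print(before(a_begin, b_begin))   # B.begin can no longer be earlier than 5
```

The extra statement does not add a new date; it only removes part of the range of B.begin, which is enough to turn an uncertain answer into a certain one.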
– Inaccurate temporal expression (+)
– Define temporal concepts (+)
– Perform arithmetic or logic operations (-)
Temporal Information Approaches
– Spelling Errors
WordNet Extensions to deal with phonetic similarity
– Imprecise Temporal Information Extraction
Proposed Work
* ED (Levenshtein, 1966), TS (Oliver, 1993), HD (Hamming, 1950), LCS (Allison and Dix, 1986), SWD (Smith and Waterman, 1981), MED (Monge and Elkan, 1996), JWD (Winkler and Thibaudeau, 1991), Soundex (Knuth, 1968), FastSS (Bocek et al., 2007)
Spelling Errors x Similarity
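To show how one of the listed measures relates to spelling errors, here is a plain implementation of Levenshtein edit distance (ED) applied to the earlier example "tree" versus "three". The `similarity` normalization is an assumption added for illustration and is not the Stringsim function named later in the slides.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest single-character insertions,
    deletions and substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]

def similarity(a, b):
    """Distance scaled to [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - edit_distance(a, b) / longest

print(edit_distance("tree", "three"), similarity("tree", "three"))
```

A single inserted letter separates the two words, so a similarity threshold can recover "three" even though "tree" is itself a valid word.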
– Multi language / Multi dictionary
– Derivative Words
• Verb conjugation in Portuguese
– 13 tenses; 67 variations
» Unlike English (7 variations for ‘to be’): {am, is, are, was, were, being, been}
– Fast Phonetic Similarity Search
WordNet Extensions
Stringsim function
Fast Phonetic Similarity Search
PhoneticMapPT
Fast Phonetic Similarity Search
PhoneticMapSimPT function
Fast Phonetic Similarity Search
– Similarity Search Methods
• Full
• Fast
– PhoneticSearchPT function (fast search method)
Fast Phonetic Similarity Search
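The slides name the two search methods without spelling them out, so the following is only one plausible reading: a full search compares the query with every dictionary word, while a fast search looks the query up in an index built on a phonetic key. The key rules, the function names and the word list are all assumptions; they are not PhoneticMapPT or PhoneticSearchPT, and the full search uses `difflib.SequenceMatcher` from the standard library as a stand-in similarity measure.

```python
from collections import defaultdict
from difflib import SequenceMatcher

def phonetic_key(word):
    """Toy phonetic key merging a few spellings that sound alike (illustrative rules)."""
    key = word.lower()
    for source, target in (("ss", "s"), ("ç", "s"), ("z", "s"), ("ch", "x"), ("h", "")):
        key = key.replace(source, target)
    return key

def full_search(query, words, threshold=0.75):
    """Compare the query with every word: simple, but linear in dictionary size."""
    return [w for w in words
            if SequenceMatcher(None, query, w).ratio() >= threshold]

def build_index(words):
    """Group dictionary words by phonetic key, once, ahead of any query."""
    index = defaultdict(list)
    for w in words:
        index[phonetic_key(w)].append(w)
    return index

def fast_search(query, index):
    """One dictionary lookup on the query's phonetic key."""
    return index.get(phonetic_key(query), [])

WORDS = ["cabeça", "pressão", "exame", "chá"]
INDEX = build_index(WORDS)
print(full_search("cabessa", WORDS))
print(fast_search("cabessa", INDEX))
```

The trade-off in this sketch is recall against speed: the indexed lookup only finds words whose key is identical to the query's key, while the full scan tolerates any difference under the threshold.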
– Precise x Imprecise Temporal Information
• “08:15 am” x “earlier in the morning”
» Which one happened before?
– Experiment
• 4,748 medical records (MR)
• 3,583 imprecise expressions (in 2,018 MR – 42.5%)
Uncertain Temporal Information Extraction
• Temporal Information Mapping
• Extracting Process
• Answering User Queries
• Case Study
Proposed Activities
Proposed Activities
A.1. Temporal Ontology
A.2. Temporal Expressions
A.3. Numeral Ontology
Proposed Activities
B.1. Temporal Representation
B.2. Annotation Schemes
B.3. Phonetic Similarity
B.4. OWL Extension
B.5. Extraction Rules
Proposed Activities
C.1. Temporal Algebra
C.2. Analytical Queries
Proposed Activities
D.1. Ontology Generator
D.2. Domain Ontology
D.3. Information Extraction
D.4. Accuracy Evaluation
Proposed Activities
A.1 Temporal Ontology: Create an ontology to define imprecise temporal concepts
A.2 Temporal Expressions: List possible temporal expressions in Portuguese and English
A.3 Numeral Ontology: Search for a numeral ontology that maps numeric values in the form of words
B.1 Temporal Representation: Define a representation for temporal expressions
B.2 Annotation Schemes: Review and adapt temporal annotation schemes to support uncertain temporal data
B.3 Phonetic Similarity: Apply the Phonetic Fast Search method in the annotation and extraction processes
B.4 Temporal OWL Extension: Propose an extension to the OWL metamodel to support temporal-dependent elements
B.5 Extraction Rules: Define a set of extraction rules needed to handle uncertain temporal data
C.1 Temporal Algebra: Review the literature concerning Temporal Algebra
C.2 Analytical Queries: Convert natural language queries to analytical queries
D.1 Ontology Generator: Review the literature to describe methods to convert data models to ontologies
D.2 Domain Ontology: Create an ontology to handle medical domain knowledge available in InfoSaude
D.3 Information Extraction: Design and develop part of the proposed framework (case study: medical records)
D.4 Accuracy Evaluation: Search for a benchmark to measure accuracy of proposed work
P.x Articles
T Thesis
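Activities A.3 and B.3 meet in expressions such as "in the last tree days", where a number word has to be recognized despite a spelling error. The sketch below stands in for a numeral ontology with a ten-entry dictionary and uses `difflib.get_close_matches` from the standard library for approximate matching instead of the phonetic method. The dictionary, the cutoff and both function names are assumptions.

```python
import re
from difflib import get_close_matches

# Tiny stand-in for a numeral ontology (assumption: English, 1 to 10 only).
NUMERALS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
            "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10}

def numeral_value(token, cutoff=0.8):
    """Map a possibly misspelled number word to its value, or None."""
    match = get_close_matches(token.lower(), list(NUMERALS), n=1, cutoff=cutoff)
    return NUMERALS[match[0]] if match else None

def last_n_days(expression):
    """Read 'in the last <numeral> days' and return the number of days, or None."""
    found = re.search(r"in the last (\w+) days", expression.lower())
    return numeral_value(found.group(1)) if found else None

print(last_n_days("in the last tree days"))
```

Because "tree" is a correctly spelled word, a dictionary check alone would not flag it; it is the slot after "in the last" that tells the extractor to expect a numeral.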
• Schedule
Uncertain Temporal Information Extraction
– Uncertain x Imprecise x Inaccurate
• How can the differences between these word senses help organize temporal expressions into different groups?
– Fuzzy Logic
• Imprecise time ≡ fuzzy time? How to apply fuzzy logic to inaccurate temporal data? Are there alternatives?
– OBIE Accuracy
• How to evaluate IE accuracy?
Pending questions...
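To make the fuzzy logic question concrete, the sketch below treats "around 10:00 am" as a fuzzy set over minutes of the day, with a membership degree that falls off linearly on both sides. The triangular shape and the 30-minute spread are arbitrary assumptions; the slides leave open whether this is the right model at all.

```python
def around(center_minutes, spread_minutes):
    """Triangular membership function for 'around <time>' (shape is an assumption)."""
    def membership(t):
        distance = abs(t - center_minutes)
        return max(0.0, 1.0 - distance / spread_minutes)
    return membership

# "around 10:00 am", with an assumed spread of 30 minutes on each side.
around_ten = around(10 * 60, 30)

for label, minutes in (("09:45", 585), ("10:00", 600), ("10:40", 640)):
    print(label, round(around_ten(minutes), 2))
```

Under this reading 09:45 belongs to "around 10:00 am" to degree 0.5 and 10:40 not at all, which is one way to compare an imprecise expression with a precise timestamp.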