23 January 2003 APAN-Fukuoka

Language and Tools for Lexical Resource Management

Asanee Kawtrakul (1), Aree Thunkijjanukij (2), Preeda Lertpongwipusana (1), Poonna Yospanya (1)

(1) Department of Computer Engineering, Faculty of Engineering
(2) Thai National AGRIS Center

Kasetsart University

Acknowledgement

• JIRCAS: Japan International Research Center for Agricultural Sciences

• Organizing committee

• Kasetsart University

Outline

• Background & Motivation

• Problems in Lexical Resource Preparation

• Requirements for Lexical Resource Management

• Proposed Language and tools

• Conclusion and Next steps

Background and Motivation

• Thailand is an agriculture-based country
  – It has rich knowledge and data in the agricultural field
• A great quantity of agricultural information is scattered across unstructured and unrelated text
  – Skimming/digesting and integrating become essential
• Knowledge is spread around the world
  – Knowledge discovery without language barriers is also needed

The Basic Idea behind..

[Figure: system overview. Components: Internet, Gathering Module, Agricultural Document Collection, Indexing and Clustering Module, Summarization Module, Translation Module, Data Cube, Graphical User Interface]

Textual Data as Input

Let us focus on Canada’s agricultural products. In 1998, there were 1,216 registered commercial egg producers in Canada. Ontario produced 39.8% of all eggs in Canada; Quebec was second with 16.6%. The western provinces have a combined egg production of 35.6% and the eastern provinces have a combined production of 8.0%.

Courtesy of Agriculture and Agri-Food Canada, http://www.agr.ca/cb

Summarization and Translation as a Result

Category    Exporter    Year    Month       Price    Unit
Paddy       Thailand    2002    January     300      Dollars/Ton
Paddy       Thailand    2002    February    285      Dollars/Ton

Thai translation (year 2545 B.E. = 2002; unit is baht per kwian):

ประเภท       ผู้ส่งออก      ปี      เดือน        ราคา      หน่วย
ข้าวเปลือก    ประเทศไทย    2545    มกราคม      14,340    บาทต่อเกวียน
ข้าวเปลือก    ประเทศไทย    2545    กุมภาพันธ์    13,625    บาทต่อเกวียน

The Development of Agricultural System for Knowledge Acquisition and Dissemination

• 5-year project (2001-2005)
• A collaboration between:
  – Thai National AGRIS Center
    • Providing the bilingual thesaurus (AGROVOC)
  – Department of Computer Engineering
    • Developing NLP techniques for searching, summarizing and translation, including tools for lexical resource management
• Funded by the Kasetsart University Research and Development Institute

[Figure: system architecture. Components: Acquisition System; Rules, Thesaurus and Lexicon (Linguistic Knowledge Base); Linguist/Domain Expert; Very Large Corpus; Document Indexing & Clustering; Intelligent Search Engine (with Translation, with Summarization); Document Warehouse; Gathering Module; Internet/Intranet]

Thai Agricultural Thesaurus

• The English vocabulary totals 27,531 terms
• Only 10,280 terms have been translated into Thai (scientific names excluded)
• Scientific names are not translated
  – e.g. Oryza (genus) sativa (species) for rice, or family names

Problems in a Hand-coded Thesaurus

• Scalability

• Reliability and Coherence

• Rigidity

• Cost

[Figure: hand-coded thesaurus tree. Node labels: Foods; Bakery Product; Dietetic Foods; Frozen Foods; Fermented Foods (shown twice); Processed Products; Canned Products; Dried Products; Frozen Products; Fermented Products; Alcoholic Beverage; milk; Fermented Fish (shown three times)]

[Figure: hand-coded thesaurus tree, continued. Node labels: Foods; Fermented Foods; Processed Products; Local Product; Products; Fermented Fish]

Commercial Vegetables: The September index, at 107, was up 1.9 percent from last month but 3.6 percent below September 1998. Price increases for lettuce, tomatoes, broccoli, and celery more than offset price decreases for onions, carrots, and cucumbers.

[Figure: the category "Commercial Vegetable" linked to Tomatoes, Broccoli, Carrots and Cucumbers]

[Figure: three views connected through the keyword "tomatoes"]

Expert Domain (thesaurus network, links labelled BT, NT and RT):
• VEGETABLES
• BROCCOLI: type=leaf vegetable, color=green
• SWEET PEPPER: type=fruit vegetable, color=red, green, yellow
• TOMATOES: type=fruit vegetable, color=red, yellow
• CHERRY TOMATOES: type=fruit vegetable
• LYCOPERSICON ESCULENTUM: type=taxonomic
• SOLANACEAE, CAPSICUM, NICOTIANA
• The label color=red appears twice in the network

Keyword Assigned: tomato, tomatoes

User Category: Commercial Vegetable (broccoli, carrot, tomato)

Other Major Problems (1)

• Access to textual information
  – Language variation:
    • There are many ways to express the same idea
      Ex: "thinning flower" is expressed as "deblossoming"
          "thinning branch" is expressed as "pruning"
  – How can the computer know that the words a person uses are related to the words found in stored text?
      Ex: user: thinning branch
          computer: pruning
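The gap between the user's wording and the stored wording can be bridged by mapping both to a shared concept. The following is a minimal Python sketch, not the presented system: the table `CONCEPT_TERMS` and the functions `expand_query` and `search` are hypothetical names, and the table holds only the two pairs given on the slide.

```python
# Hypothetical concept table: each concept groups expressions that mean the same thing.
# The two entries come from the slide; the concept keys are illustrative.
CONCEPT_TERMS = {
    "deblossoming": {"thinning flower", "deblossoming"},
    "pruning": {"thinning branch", "pruning"},
}


def expand_query(query: str) -> set:
    """Return every expression that shares a concept with the query."""
    normalized = query.strip().lower()
    for terms in CONCEPT_TERMS.values():
        if normalized in terms:
            return set(terms)
    # Unknown wording: fall back to the query itself.
    return {normalized}


def search(query: str, documents: list) -> list:
    """Return the documents that contain any expression related to the query."""
    expressions = expand_query(query)
    return [doc for doc in documents if any(e in doc.lower() for e in expressions)]


docs = ["Pruning is done after harvest.", "Deblossoming improves fruit size."]
hits = search("thinning branch", docs)
```

Here the query "thinning branch" retrieves the document that only mentions "pruning", because both expressions belong to the same concept.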

Requirement (1)

• Access to textual information
  – Need intelligent browsing from related concept to related concept, rather than by occurrence of stemmed character strings

Other Major Problems (2)

• Transforming from unstructured to structured information

Requirement (2)

• Need an application-based frame about product price
  – Knowledge representation in table form
  – Consisting of attributes and their values

Attributes    Values
Category      Paddy
Exporter      Thailand
Price         300
Unit          Dollars/Ton
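A frame of this kind can be sketched as a small record type. The Python below is only an illustration under assumptions: the class name `PriceFrame` and the helper `to_table` are hypothetical, and the field names simply mirror the attributes on the slide.

```python
from dataclasses import dataclass, asdict


@dataclass
class PriceFrame:
    """Application-based frame for a product price (attributes taken from the slide)."""
    category: str
    exporter: str
    price: float
    unit: str


def to_table(frame: PriceFrame) -> list:
    """Flatten a frame into (attribute, value) rows, i.e. the table form."""
    return [(name.capitalize(), str(value)) for name, value in asdict(frame).items()]


frame = PriceFrame(category="Paddy", exporter="Thailand", price=300, unit="Dollars/Ton")
rows = to_table(frame)
```

Keeping the attributes fixed and typed is what lets an extraction step detect a missing value, for instance a price mentioned without its unit.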

Problems in Translation: Pragmatic and Semantic

• The September All Farm Products Index was 97 percent of its 1990-92 base, down 1.0 percent from the August index and 2.0 percent below the September 1998 Index

Using Ontology:
  – 0.97 * average price of years 1990-1992
  – September of year ??
  – August, year 1997
  – Down 0.02 * price(September 1998)
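To interpret the sentence, each relative expression has to be resolved to a number. The sketch below works the arithmetic in Python under one stated assumption: "down p percent from X" is read as index = X * (1 - p/100), which matches the slide's "Down 0.02*price(September 1998)". The function name `reference_value` is hypothetical.

```python
def reference_value(current: float, percent_below: float) -> float:
    """Recover X, given that current = X * (1 - percent_below / 100)."""
    return current / (1 - percent_below / 100)


# 97 percent of the 1990-92 base, with the base taken as 100.
september_index = 97.0
# "down 1.0 percent from the August index"
august_index = reference_value(september_index, 1.0)
# "2.0 percent below the September 1998 Index"
september_1998_index = reference_value(september_index, 2.0)
```

Under this reading the August index is about 97.98 and the September 1998 index about 98.98, both relative to the same 1990-92 base.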

“Year 1990-1992” meaning

Product Year

A 1990 1991 1992

B - - -

C - - -

D - - -

Requirement (3)

• The lexicon should have semantic constraints between lexical entities and restrictions on usage categories

Summary of Problems Related to the Lexicon

• In terms of coverage
  – Extensional coverage, i.e., the number of entries
  – Intensional coverage, i.e., the number of information fields
• In terms of the semantic domain covered by the application
  – Meaning interpretation with respect to objects, subject matter, topics of discourse, and pragmatic interpretation
• The user category with reference to the intended system users
  – Commercial products vs. plant products vs. family products

One Solution

• Encode world knowledge in the structures attached to each lexical item; this needs both a language and tools

The Design of Lexicon: Requirement Specification

• Macrostructure: lexicon structure in terms of relations between lexical entries
  – i.e. hierarchical taxonomies, which are characteristic of thesauri of semantically related word families
• Microstructure: types of information for each entry
  – Pronunciation or phonemic transcription
  – Syntactic properties
  – Meaning
  – Pragmatics of their use in real context and language

Microstructure (cont’d)

• A lexical entity could contain slots/scripts for each specific domain; this needs an intelligent analyzer and language understanding
  – Supports information extraction
  – Supplies missing values

Lexical Resource Management Language

• A language that is able to:
  – Handle heterogeneity of linguistic knowledge structures.
  – Handle exceptions and inconsistencies of natural languages.
  – Provide an intuitive means to store and manipulate both linguistic and world knowledge.

Language Features

• The language is designed to provide:
  – Support for heterogeneous structures.
  – Sufficient provisions to handle exceptions and inconsistencies of natural languages (achieved through the +/- operators).
  – Deduction of knowledge from rules.
  – Detection and prevention of potential integrity violations.

Language and Tools: Specification Requirements

• Flexibility – almost any structure can be defined in this model.
• Extensibility – extending a structure is simple.
• Maturability – structure reformation and deformation are supported.
• Integrity – meta-relations help prevent malformed or semantically ill-formed data entries.
• Dealing with inconsistencies is feasible.

Some Syntactic Elements

• Knowledge manipulations are achieved through these primitives:
  – def is used to define structures that do not already exist.
  – redef changes aspects of existing structures.
  – undef removes specified structures from the knowledge base.
  – ret is used to retrieve structures from the knowledge base.

Examples

• Hierarchies: tree structures representing generalization semantics, or classes, of atoms.

thing
  animate
    human
    animal
  inanimate

A semantic tree represented by a hierarchy structure

Usage Examples

• Defining a hierarchy
  – def thing(animate(human+animal)+inanimate).
• Adding the ‘plant’ and ‘vehicle’ concepts
  – def animate(plant+vehicle).
• Reparenting the ‘vehicle’ concept
  – redef animate(vehicle) inanimate(vehicle).
• Removing the ‘human’ concept
  – undef human. (provided that there is only a single instance of ‘human’)
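The effect of these four statements can be mimicked with a toy tree in Python. This is a sketch under assumptions, not the language's implementation: the class `Hierarchy` and its method names are hypothetical, and restricting removal to leaf nodes is a simplification chosen here because the slides do not say what happens to the children of a removed node.

```python
from typing import Optional


class Hierarchy:
    """Toy concept tree that stores each node's parent (None for the root)."""

    def __init__(self, root: str) -> None:
        self.parent = {root: None}

    def define(self, parent: str, *children: str) -> None:
        """Like `def parent(a+b)`: add new children under an existing node."""
        if parent not in self.parent:
            raise KeyError(f"unknown parent: {parent}")
        for child in children:
            if child in self.parent:
                raise ValueError(f"already defined: {child}")
            self.parent[child] = parent

    def redefine(self, node: str, new_parent: str) -> None:
        """Like `redef old(node) new(node)`: move an existing node."""
        if node not in self.parent or new_parent not in self.parent:
            raise KeyError("both the node and the new parent must exist")
        if node == new_parent or node in self.ancestors(new_parent):
            raise ValueError("a node cannot be moved under itself")
        self.parent[node] = new_parent

    def undefine(self, node: str) -> None:
        """Like `undef node` (assumption: only leaf nodes may be removed)."""
        if any(p == node for p in self.parent.values()):
            raise ValueError(f"{node} still has children")
        del self.parent[node]

    def ancestors(self, node: str) -> list:
        """Return the ancestors from the node's parent up to the root."""
        chain = []
        current: Optional[str] = self.parent[node]
        while current is not None:
            chain.append(current)
            current = self.parent[current]
        return chain


h = Hierarchy("thing")
h.define("thing", "animate", "inanimate")   # def thing(animate(...)+inanimate).
h.define("animate", "human", "animal")
h.define("animate", "plant", "vehicle")     # def animate(plant+vehicle).
h.redefine("vehicle", "inanimate")          # redef animate(vehicle) inanimate(vehicle).
h.undefine("human")                         # undef human.
```

After these calls, `h.ancestors("vehicle")` walks through 'inanimate' to 'thing', reflecting the reparenting step.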

Usage Examples (2)

• Defining case frames for verbs
  – First, we need to define meta-relations for words belonging to the sub-hierarchy ‘verb’.
    def meta case(verb, sub:thing).
    def meta case(verb, sub:thing, obj:thing).
  – Then, we define case frames for several verbs.
    def case(eat, sub:human+animal, obj:food).
    def case(fly, sub:bird-penguin). (here, we emphasize the use of +/- operators)
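One way to read the +/- operators is as inclusion and exclusion of sub-hierarchies: `bird-penguin` admits anything under 'bird' except what lies under 'penguin'. The Python sketch below illustrates that reading only. The `PARENT` links are hypothetical ('sparrow' and the placement of 'bird' under 'animal' are not on the slides), and `is_a` and `satisfies` are illustrative names.

```python
# Hypothetical child -> parent links; only 'bird', 'penguin', 'human' and
# 'animal' are named on the slides, and their arrangement here is assumed.
PARENT = {
    "animal": "thing",
    "human": "thing",
    "bird": "animal",
    "sparrow": "bird",
    "penguin": "bird",
}


def is_a(concept, ancestor):
    """True if `concept` equals `ancestor` or lies below it in the hierarchy."""
    current = concept
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT.get(current)
    return False


def satisfies(concept, include, exclude):
    """Check a slot constraint of the form include1+include2-exclude1."""
    allowed = any(is_a(concept, c) for c in include)
    barred = any(is_a(concept, c) for c in exclude)
    return allowed and not barred


# def case(fly, sub:bird-penguin).
FLY_SUBJECT = {"include": ["bird"], "exclude": ["penguin"]}
# def case(eat, sub:human+animal, obj:food).  (subject slot only)
EAT_SUBJECT = {"include": ["human", "animal"], "exclude": []}

sparrow_can_fly = satisfies("sparrow", **FLY_SUBJECT)
penguin_can_fly = satisfies("penguin", **FLY_SUBJECT)
```

The exception is stated once on the verb's case frame rather than by restructuring the hierarchy, which is how the language handles irregularities of this kind.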

Hierarchy & Set

[Figure: a hierarchy containing the nodes c1, c2, w1 to w7 and p1, shown next to a set c3 containing the members f1, f2, f3 and f4]

Defining a Hierarchy

def c1(“w1”(“w3”)+c2(“w4”)+“w2”).

def “w5”+“w6” under “w4”.

def “p1”(“w7”) under “w2”.

Resulting hierarchy:

c1
  w1
    w3
  c2
    w4
      w5
      w6
  w2
    p1
      w7

Manipulating the Hierarchy

[Figure: the hierarchy with nodes c1, c2, w1 to w7 and p1, before and after the changes]

redef “w4” under “w2”.

undef “w1”.

Defining a Set

[Figure: the set c3 containing the members f1, f2, f3 and f4]

def c3{[f1]+[f2]+[f3]}.

def [f4] in c3.

Defining a Relation

[Figure: the sub-hierarchy c2 (w4, w5, w6) and the set c3 (f1 to f4). The template r1’ links c2 to c3, the relation r1 links w4 to f1, and one link is labelled "inherited". w1 is shown for the rejected definition.]

def meta r1(c2, c3).     Template defined.

def r1(“w4”, [f1]).      Relation defined.

def r1(“w1”, [f3]).      Constraint violated. Definition not allowed.
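The three outcomes above can be reproduced with a small validity check. The Python sketch below is an illustration under assumptions: `RelationStore`, `is_under` and the returned messages are hypothetical stand-ins, and the structures are copied from the hierarchy and set defined on the earlier slides (w4 lies under c2, while w1 sits directly under c1).

```python
# Child -> parent links from the earlier "def c1(...)" statements.
PARENT = {
    "w1": "c1", "c2": "c1", "w2": "c1",
    "w3": "w1", "w4": "c2", "w5": "w4", "w6": "w4",
}
# Set members from "def c3{[f1]+[f2]+[f3]}." and "def [f4] in c3."
SETS = {"c3": {"f1", "f2", "f3", "f4"}}


def is_under(node, ancestor):
    """True if `ancestor` appears on the path from `node` up to the root."""
    current = PARENT.get(node)
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT.get(current)
    return False


class RelationStore:
    def __init__(self):
        self.templates = {}   # relation name -> (hierarchy class, set name)
        self.relations = []   # accepted (name, word, feature) triples

    def define_meta(self, name, hierarchy_class, set_name):
        """Register the template that later definitions must respect."""
        self.templates[name] = (hierarchy_class, set_name)
        return "Template defined."

    def define(self, name, word, feature):
        """Accept the relation only if both arguments fit the template."""
        hierarchy_class, set_name = self.templates[name]
        if not is_under(word, hierarchy_class) or feature not in SETS[set_name]:
            return "Constraint violated. Definition not allowed."
        self.relations.append((name, word, feature))
        return "Relation defined."


store = RelationStore()
r_meta = store.define_meta("r1", "c2", "c3")
r_ok = store.define("r1", "w4", "f1")
r_bad = store.define("r1", "w1", "f3")
```

The second definition is rejected because w1 is not below c2, so it does not match the first argument of the template.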

Synset & Surrogates

• A synset is an unnamed set identified by its unique ID.
• Members of a synset are considered synonymous, with different degrees of synonymity.
• A distance graph is automatically constructed within a synset, with surrogates being representatives of synset members.
• Entities with identical features are attached to the same surrogate.

Synset & Surrogates

[Figure: synset#1 with the surrogates s1 to s5, the attached entities w1, w2, w3, w4, w6, p1, p2 and p3, and feature labels f1 to f4 on the nodes. Caption: surrogate network internally constructed]

Synset & Multilingual Lexicon

• Synset members are not confined to a single language; that is, entities from different languages may belong to the same synset.
• The distance matrix is computed from the number of differing features over each pair of surrogates.
• Traversing from a word to its nearest words is handled by the system. This lets us determine the words with potentially the closest semantics.
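The surrogate and distance ideas can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the feature assignments in `ENTITY_FEATURES` are invented, the function names are hypothetical, and "number of differing features" is read as the size of the symmetric difference between two feature sets.

```python
# Hypothetical feature assignments for a few synset members.
ENTITY_FEATURES = {
    "w1": {"f1", "f2"},
    "w2": {"f1", "f2"},   # same features as w1, so it shares w1's surrogate
    "w3": {"f1", "f3"},
    "p1": {"f3", "f4"},
}


def build_surrogates(entity_features):
    """Attach entities with identical feature sets to the same surrogate."""
    by_features = {}
    for entity, features in entity_features.items():
        by_features.setdefault(frozenset(features), []).append(entity)
    # Name the surrogates s1, s2, ... in first-seen order.
    return {
        f"s{i}": (features, members)
        for i, (features, members) in enumerate(by_features.items(), start=1)
    }


def distance(features_a, features_b):
    """Number of features present in one set but not in the other."""
    return len(features_a ^ features_b)


def nearest(surrogates, name):
    """Return the other surrogates, ordered from closest to farthest."""
    origin = surrogates[name][0]
    scored = [
        (distance(origin, features), other)
        for other, (features, _) in surrogates.items()
        if other != name
    ]
    return [other for _, other in sorted(scored)]


surrogates = build_surrogates(ENTITY_FEATURES)
closest_to_s1 = nearest(surrogates, "s1")
```

Because members may come from different languages, walking this ordering from a Thai word's surrogate could reach an English word with the closest feature set, which is the multilingual use the slide describes.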

Expected Result

[Figure sequence: an animation over the Expert Domain thesaurus network for the keyword "tomatoes" (BT link to VEGETABLES; entries BROCCOLI, SWEET PEPPER, TOMATOES, CHERRY TOMATOES, LYCOPERSICON ESCULENTUM, SOLANACEAE, CAPSICUM and NICOTIANA, with their type and color attributes), building up the generated keywords one step at a time]

Keyword Generated

• “Fruit vegetable”, red: Sweet pepper, Tomatoes, Cherry Tomatoes
• “Plant in same family”: Capsicum, Nicotiana

[Figure: the complete picture for the keyword "tomatoes", using the same Expert Domain thesaurus network as shown earlier]

Keyword Assigned: tomato, tomatoes

User Category: Commercial Vegetable (broccoli, carrot, tomato)

Keyword Generated: tomato, Tomato, Tomatoes, Cherry Tomatoes
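The two generated keyword lists from the expected-result slides can be reproduced by filtering thesaurus entries on their attributes. The Python sketch below is an illustration only: the dictionary layout and function names are hypothetical, attaching color=red to CHERRY TOMATOES and listing CAPSICUM and NICOTIANA as narrower terms of SOLANACEAE are readings of the diagram rather than statements from the slides.

```python
# Entries and attributes as read from the Expert Domain diagram.
THESAURUS = {
    "BROCCOLI": {"type": "leaf vegetable", "color": {"green"}},
    "SWEET PEPPER": {"type": "fruit vegetable", "color": {"red", "green", "yellow"}},
    "TOMATOES": {"type": "fruit vegetable", "color": {"red", "yellow"}},
    "CHERRY TOMATOES": {"type": "fruit vegetable", "color": {"red"}},
}
# Assumed narrower terms (NT) of the family node.
FAMILY_MEMBERS = {"SOLANACEAE": ["CAPSICUM", "NICOTIANA"]}


def keywords_by_attributes(entry_type, color):
    """Return the entries that match both the type and the color."""
    return [
        term
        for term, attrs in THESAURUS.items()
        if attrs["type"] == entry_type and color in attrs["color"]
    ]


def keywords_in_family(family):
    """Return the plants recorded under the given family."""
    return list(FAMILY_MEMBERS.get(family, []))


red_fruit_vegetables = keywords_by_attributes("fruit vegetable", "red")
same_family = keywords_in_family("SOLANACEAE")
```

The first query yields the sweet pepper, tomato and cherry tomato entries, and the second yields Capsicum and Nicotiana, matching the lists on the slides.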

Conclusion and Next steps

• This is a preliminary introduction to the language, showing a few of its many possibilities.
• Structures not described in detail here have not yet been firmly specified. These structures are rules, maps, and contexts, which are incorporated to extend the language's potential for handling deduction, multilingual operations, domain-dependent retrieval, etc.

Next Steps

• Revise the idea
• Continue the implementation
  – Aligner tool
  – GUI tools for thesaurus maintenance
• Short-term: solutions to language variability problems by exploiting available knowledge sources with available techniques
• Long-range: the approach needs high-quality language understanding, i.e., automatic thesaurus construction
  – System of Agricultural Information Summarization and Translation

Thank you