A I M S AGROVOC GACS Working Group First meeting April 7-9, 2014

Upload: aims-agricultural-information-management-standards

Post on 12-Aug-2015


A I M S

AGROVOC

GACS Working Group First meeting

April 7-9, 2014

Links

• Agrovoc online: www.fao.org/agrovoc
• Download:
  – https://aims-fao.atlassian.net/wiki/display/AGV/Releases
• VB 2.0 Sandbox: http://202.73.13.50:55481/vocbench/ (only Agrovoc)
• JIRA: https://aims-fao.atlassian.net/
• SPARQL endpoint:
  – http://202.45.139.84:10035/catalogs/fao/repositories/agrovoc
• Agrontology: http://aims.fao.org/aos/agrontology
• VOID file for Agrovoc LOD:
  – http://aims.fao.org/aos/agrovoc/void.ttl

04/15/23 2

People

• Caterina Caracciolo (coordination, data)
• Sarah Dister (communication, editors)
• Lavanya Kiran (content analysis)
• Sachit Rajbhandari (VB, data)
• Armando Stellato (VB, data)
• Andrea Turbati (VB, data)
• Fabrizio Celli (Agris)
• Valeria Pesce (AgriDrupal, AgriVivo)

Goal of this presentation

To tell you as much as possible about Agrovoc


Outline

• A bit of history of Agrovoc
• Agrovoc now
• Access
• Maintenance
• Users
• Data publication workflow
• Editorial workflow
• Copyright, license
• Agrovoc structure
• Walkaround in Agrovoc following VB
• Agrovoc by domain: scientific taxonomies, names, geographical entities
• LOD
• Mapping

AGROVOC from the beginning

• 1980s: AGROVOC on paper, only 3-4 language versions
• 1990s: AGROVOC moves to a database; more language versions are added
  – Central database, DB dump sent to partners for translation -> limitations in maintenance and sharing

2000s

• Move to semantic technologies
• Need to streamline maintenance, and improve sharing over the web
• Adopted format is OWL
• Development of Workbench

Late 2000s

• Clear need for distinction between conceptual level and terminological level: notion of a concept scheme

• Collaboration between FAO and ICRISAT (KSI, Knowledge Sharing and Innovation Group)
  – Top concepts reduced from 918 to 25
  – Around 85,000 term relations revised
  – Non-hierarchical relationships refined by semantic relations
  – Ca. 4,000 non-preferred terms changed to preferred terms

2010 until now

• RDF is a widely accepted standard
• SKOS is the RDF vocabulary for thesauri
• OWL shows limitations with thesauri structure and multilinguality
• Workbench becomes VocBench
• VocBench adopted by Biotechnology Glossary, used for bibliographic data

AGROVOC now


AGROVOC 2014

• AGROVOC RDF/SKOS (SKOS-XL)
  – For download
  – “Live” through SPARQL endpoint and web services
• LOD: linked to 13 vocabularies
• VocBench v2.1 out in a few days

Some figures

• Total number of concepts: ~32,000
• 20 languages published
  – 4 under development
• 25 top concepts
• Maximum depth of hierarchy: 14

How to look at AGROVOC

• A terminological resource
• A domain model
  – Provides a view on how concepts are related to one another, e.g. agriculture is an “activity”, not, say, a “science”, a “technique”, or a “religion”…
• An RDF/SKOS resource and a linked data set for use in web-based applications

Strengths of AGROVOC

• Multilinguality
• Number of (institutional) users
• Experience and work done towards use in open data environment

Access


Agrovoc Online

• Browse/search
• New tool under development: a SKOS explorer, Drupal module
  – http://dev-skos-explorer.gotpantheon.com/skos_explorer

For download

• RDF-SKOS
  – Agrovoc only, aka Agrovoc Core
  – Agrovoc LOD
• Other formats are no longer updated
  – Relational DB, XML, ...
• SPARQL endpoint
• Webservices

“live” access

• SPARQL endpoint
  – More used than we thought, as per Survey
• Webservices
  – In fact less used than we thought…
  – Certainly used by big institutions/libraries, but each of them counts as a single user for us…

AGROVOC maintenance


In the past

• Institutions translating AGROVOC would get a dump of the DB, then:
• Translations
  – Workflow decided internally by the institution (who translates, who revises, ...) and entirely implicit
• Data sent back to FAO for inclusion in master copy

Now

• Data is managed within VocBench
  – Web based
  – Implements a formalized editorial workflow
  – Editors may get rights on languages
• Agrontology
  – URI: http://aims.fao.org/aos/agrontology
• Imported vocabularies are maintained elsewhere…

Manual - automatic

• Contribution to Agrovoc (new concepts, translations) is largely manual
  – Around 2012, a couple of languages were first drafted by an automatic translation tool (provided by a company), then revised manually
• Some tests were done on automatic extraction of relations from text; we may continue on that in the future
• Mappings are extracted automatically, then validated

VB 2.0

• Re-engineered RDF backend, based on the RDF management platform Semantic Turkey
• Support for different triple stores
• Extension mechanism based on OSGi
• Multi-scheme management: several skos:ConceptSchemes can be developed for the same dataset, providing different views on the data
• Statistics module: provides summary information about the loaded data
• Export module: for exporting all or part of the content of a project according to several existing RDF serialization standards
• Load data module: for loading bulk data serialized in some RDF serialization standard
• Ontology Import Management (Administration-->Ontologies): to owl:import ontologies to be used as property vocabularies for the modeled thesauri
• New tabs under the concept view for covering the SKOS-XL standard extensively (note, notations)

VB 2.1 – out in a few days

• A completely rebuilt installation mechanism, headache-free
• Self-installing DB, with auto-updating scripts
• Wizard-driven system configuration, with import/export of configuration profiles
• SPARQL module: query/update content directly through the SPARQL query language for RDF; syntax completion and highlighting
• Multi-scheme management: concepts can now be shared among different schemes
• RSS feeds for all editing actions

AGROVOC users


Editors

• So far, one focal point per language
  – See Agrovoc web site
• Now, we are moving to editing responsibility per language, per domain

Editing rights in VocBench

• Now editors get rights by language
• Future: maybe also by domain (or similar notion?)

Users of Agrovoc

• Libraries, information management, …
• From survey, also software developers, translators, managers, researchers

Communication with users

• Through website (form)
• Direct email [email protected]
• AIMS bulletin
• Agrovoc Google Group
  – Just started, not yet active

Support for users

• For editors:
  – VB support (user manual, video tutorials)
  – Syntax-oriented guidelines for editors
  – Would like to have more domain-oriented guidelines, also explaining the use of Agrontology

Data publication infrastructure


Tools and hosting

• Mostly at Mimos Berhad (Malaysia)
  – Editing: VocBench
  – SPARQL endpoint: AllegroGraph
  – LOD content negotiation, Pubby for producing HTML

Data publication workflow


“master” data -> download

1. Extraction of data from VocBench
2. Data preparation for publication
   1. Agrovoc Core
   2. Agrovoc LOD
      1. Add reference to VOID file, per concept
      2. Add data of Agrovoc version
      3. For LOD, add triples
3. Load files on download site
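The step "add reference to VOID file, per concept" could look roughly like the sketch below. It assumes the reference is expressed with void:inDataset and points at the VOID file URL given under Links; the choice of property and the helper name void_reference are assumptions for illustration, not taken from the actual publication scripts.

```python
# Sketch only: build one N-Triples statement per concept that links the
# concept to the VOID description of the dataset.
VOID_IN_DATASET = "http://rdfs.org/ns/void#inDataset"   # assumed property
VOID_FILE = "http://aims.fao.org/aos/agrovoc/void.ttl"  # URL from the Links slide


def void_reference(concept_uri, dataset_uri=VOID_FILE):
    """Return the statement <concept> void:inDataset <dataset> in N-Triples syntax."""
    return f"<{concept_uri}> <{VOID_IN_DATASET}> <{dataset_uri}> ."


concepts = ["http://aims.fao.org/aos/agrovoc/c_12332"]
lines = [void_reference(uri) for uri in concepts]
```

The resulting lines would simply be appended to the exported file before it is loaded on the download site.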

“master” data -> dynamic access

• Agrovoc LOD into triple store for external access


Editorial workflow


Basics

• Formally defined: all editorial activities happen inside VB (as opposed to via email, phone, ...)
• Roles of users
• Status of elements

Role of users

• Term editor
• Ontology editor
• Validator
• Administrator
• Publisher

Status of elements

• Draft
• Revised
• Validated
• Published
• Proposed deprecated
• Deprecated

Validation phase

• Dedicated module: Validation
• See also module: Recent Changes, RSS
• Also, History of concepts, terms

Data publication


Copyright

• FAO languages: copyright stays with FAO
  – English, French, Spanish, Arabic, Russian, Chinese
• Other languages: copyright stays with the institution that authored the version
• Not yet defined how this will work with distributed authorship, e.g. by domain
  – Provenance?

License

• Agrovoc may be used for free by anyone
• Would like to have a CC 3.0 license
  – Although this is not official FAO policy

A concept scheme


Concepts and terms

• Distinction between conceptual level and terminological level

• Concepts are represented by terms


Example from AGROVOC Thesaurus


Terms

• Thesaurus
  – maize UF corn (maize)
  – corn (maize) USE maize
• Concept scheme
  – maize: preferred term / preferred label
  – corn (maize): non-preferred term / alternative label

SKOS

• Terms are turned into “labels”…
  – skos:prefLabel “maize”@en
  – skos:altLabel “corn (maize)”@en
  – skos:prefLabel “Mais”@it
  – skos:altLabel “granoturco”@it
  – …
• … of the same concept = skos:Concept
• A concept is identified and represented by a URI: http://aims.fao.org/aos/agrovoc/c_12332
• … and located, as a URL…
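To make the label model concrete, here is a small sketch that holds the labels of one concept in plain Python data instead of RDF triples. The dictionary layout and the helper name labels are assumptions made for illustration; the label values are the ones shown on the slide.

```python
# Labels of concept c_12332, keyed by language tag.
AGROVOC = "http://aims.fao.org/aos/agrovoc/"

concept = {
    "uri": AGROVOC + "c_12332",
    "prefLabel": {"en": "maize", "it": "Mais"},                      # one per language
    "altLabel": {"en": ["corn (maize)"], "it": ["granoturco"]},      # any number per language
}


def labels(concept, lang):
    """Return (preferred label, list of alternative labels) for one language."""
    return concept["prefLabel"].get(lang), concept["altLabel"].get(lang, [])
```

For example, labels(concept, "en") gives the English descriptor together with its non-descriptors, while a language without labels yields None and an empty list.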

Example from VocBench


http://aims.fao.org/aos/agrovoc/c_12332

Language coverage


Number of terms per language


Number of concepts with no terms per language

AGROVOC structure


Hierarchy


http://aims.fao.org/aos/agrovoc/c_12332

Remarks

• Same hierarchy for all languages
  – Historical reasons: English was the first language, the others were added as translations
• Multiple parents allowed
  – Ca. 1200 concepts with more than one parent
• Max. depth of hierarchy = 14

Example from AGROVOC Thesaurus


[Figure: concepts linked by skos:broader and skos:related. Concepts shown: c_6211 Products @en, c_8171 Plant products @en, c_1474 Cereals @en, c_12332 Maize @en, c_7552 Sweet corn @en, c_14385 Soft corn @en, c_15500 corn starch @en]

BT/NT in Agrovoc

• In tree-like format:
  – [Products]
    • [Plant products]
      – [Cereals*]
        » [Rice*, Paddy]
• In standard thesaurus-like format:
  – [Rice*] BT [Cereals*]
• In SKOS (simplified):
  – http://aims.fao.org/aos/agrovoc/c_12332 skos:broader http://aims.fao.org/aos/agrovoc/c_6599
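The BT/NT tree above can be walked programmatically. The sketch below stores skos:broader links in a dictionary keyed by English preferred label (for readability, instead of URIs) and lists every path from a concept up to its top concept. The function names and the convention that a top concept has depth 1 are assumptions for illustration; the code also assumes the hierarchy has no cycles.

```python
# skos:broader links taken from the tree [Products] > [Plant products] > [Cereals] > [Rice].
BROADER = {
    "Rice": ["Cereals"],
    "Cereals": ["Plant products"],
    "Plant products": ["Products"],
    "Products": [],  # top concept: no broader concept
}


def paths_to_top(concept, broader=BROADER):
    """Return every path from a concept up to a top concept (multiple parents allowed)."""
    parents = broader.get(concept, [])
    if not parents:
        return [[concept]]
    return [[concept] + path
            for parent in parents
            for path in paths_to_top(parent, broader)]


def depth(concept, broader=BROADER):
    """Length of the longest path to a top concept; a top concept has depth 1."""
    return max(len(path) for path in paths_to_top(concept, broader))
```

Because each concept maps to a list of parents, the same functions also cover concepts with more than one parent: they simply return more than one path.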

For sake of readability...

• I am using the English preferred label to talk about a concept
  – As in: [Rice*]
  – Instead of using its URI
• Preferred and alternative labels would be written as: [Rice*, Paddy]

Thesaurus hierarchies

• Sometimes close to is-a relations:
  – BT Plant products
    • Maize
      – NT Dent maize
      – NT Sweet corn
• Sometimes close to containment or part-of:
  – BT Europe
  – BT Southern Europe
    • Italy
      – NT Abruzzi
      – NT Sicily

URI Generation


Alphanumeric URIs

http://aims.fao.org/aos/agrovoc/c_12332

• To be language independent
• How to choose one language over the other?
• Label in that language may not be available

URIs of concepts…

… existing before the conversion to SKOS
  – URI is formed by appending:
    • Namespace +
    • c_ +
    • the term code of the Agrovoc term

From term code to URIs


About Agrovoc term codes

• In the DB, terms were given a double key: a code for the term, plus a code for the language. So all preferred terms of a concept had the same term code, each with a different language code

• Codes had no fixed length


URIs of concepts…

… created after the conversion to SKOS
  – URI is formed by appending:
    • Namespace +
    • c_ +
    • 13-digit automatically generated code
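Both URI patterns can be sketched in a few lines. The slides only say that newer concepts get a "13-digit automatically generated code"; treating that code as a millisecond timestamp is an assumption made here for illustration, and the function names are hypothetical.

```python
import time

NAMESPACE = "http://aims.fao.org/aos/agrovoc/"


def uri_from_term_code(term_code):
    """URI of a concept that existed before the SKOS conversion: namespace + c_ + term code."""
    return f"{NAMESPACE}c_{term_code}"


def new_concept_uri(now_ms=None):
    """URI of a concept created after the conversion: namespace + c_ + 13-digit code.

    Assumption: the code is the current time in milliseconds since the epoch.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return f"{NAMESPACE}c_{now_ms:013d}"
```

With the old term code 12332, uri_from_term_code returns the maize URI used throughout this presentation.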

Concepts…


Top concepts and thematic organization

25 top concepts

# of concepts under each Top


Remarks

• Great difference in development of hierarchy under each top
• Not necessarily a problem…
  – No “thematic roots”
    • Plants and animals under organisms
    • Agriculture is under activities (economic activity)
    • Forestry under subject
    • Food under product

Data and vocabulary


AGROVOC data

• All terminological and domain information
  – Concepts
  – Terms
  – Relationships between concepts, or between terms
• URIs within namespace: http://aims.fao.org/aos/agrovoc/
  – E.g. URI of concept [maize]: http://aims.fao.org/aos/agrovoc/c_12332

Vocabularies used in Agrovoc - 1

• Vocabulary here means the collection of properties that are used to describe concepts, terms and relations

• SKOS: to express concepts, BT/NT, RT, and matches with other vocabularies

• SKOS-XL: to express labels (to be able to make statements about labels)


Vocabularies used in Agrovoc - 2

• VOID: to describe the dataset
• dcterms: for date of creation and modification
• FOAF: for images

Vocabularies used in Agrovoc - 3

• Agrontology: for AGROVOC-specific properties and relations
  – http://aims.fao.org/aos/agrontology

Information on concepts


A look at Agrovoc through VocBench 2.0

• Note that some things present in the interface of VocBench are not really used in Agrovoc...

• In the following, we follow the tabs used to show information about concepts


Following VocBench interface


Terms – Tab terms


Terms

• All terms available for that concept
• Terms are clickable; clicking shows more information about the term
  – The possibility of making statements about terms is provided by SKOS-XL (more on this later)

Definitions of concepts – Tab Definitions

Definitions

• VB allows for more than one definition
  – Can specify source, URL
  – Expressed with skos:definition
• Each definition may have more than one translation
• Agrovoc only has single definitions
  – The value is a URI, like agrovoc:c_def_1328252885416
• Mostly added after conversion to SKOS, ~1200

Notes – Tab Notes


Notes – general

• This tab collects two types of notes that Agrovoc inherited from its “thesaurus-time”
• Scope notes
  – To define the scope of applicability of concepts
• Editorial notes
  – To keep track of some editorial information

Scope note

• Rendered by skos:scopeNote
  – Its value is a string (just text)
  – May be given in various languages
  – Used to define the scope of applicability of a concept; in old Agrovoc it was not possible to give definitions, so scope notes were often used to provide them

Editorial note

• Rendered as skos:editorialNote
  – In the old Agrovoc thesaurus there was no way to keep the author/year of a scientific name, so editors often used editorial notes
  – E.g. <http://aims.fao.org/aos/agrovoc/c_39617>
    skos:prefLabel “Aulopus filamentosus”@en
    skos:editorialNote “Author: (Bloch 1792)”@en
  – Plan is to have proper encoding of author/year of scientific names
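Moving author and year out of editorial notes would first require extracting them. The sketch below parses notes shaped like the example above, “Author: (Bloch 1792)”. The regular expression and the function name parse_author_note are assumptions; real notes may follow other patterns that this sketch does not cover.

```python
import re

# Matches "Author: (Bloch 1792)", with the brackets and a comma before the year optional.
AUTHOR_NOTE = re.compile(
    r"Author:\s*\(?\s*(?P<author>[^()\d]+?)\s*,?\s*(?P<year>\d{4})\s*\)?"
)


def parse_author_note(note):
    """Return (author, year) extracted from an editorial note, or None if it does not match."""
    match = AUTHOR_NOTE.search(note)
    if match is None:
        return None
    return match.group("author"), int(match.group("year"))
```

Notes that do not match return None, so they can be set aside for manual review rather than silently dropped.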

Attributes of concepts – Tab Attributes


Attributes

• Datatype properties of concepts
  – = the value of the property is a word, or rather, a string
• Currently, Agrontology includes:
  – isSpatiallyIncludedInState (not actually used)
  – isSpatiallyIncludedInCity
  – isHoldBy (<sic>, not actually used)
  – isPartOfSubvocabulary

Remarks

• Some are currently under examination as part of the geographical domain (ref. Otakar Čerba)
  – isSpatiallyIncludedInState (no occurrences)
  – isSpatiallyIncludedInCity
• Notion of list of concepts
  – agrontology:isPartOfSubvocabulary
  – Ideally we would like to use something more standard
• isHoldBy is not used in Agrovoc

Concept codes – Tab Notation


Notation

• Meant to keep codes of concepts
  – skos:notation
• Not used in Agrovoc (codes were given to terms)

Concept relationships – Tab Relationship


Concepts relationships

• Non-hierarchical relationships of a given concept

• In thesauri, only RT, between terms
• In concept schemes, same notion with skos:related
  – Same vagueness as RT, but between concepts
• Also other, more specific “related” relations are possible

Example of relationships – concept [Oryza*]

Recap

• In thesaurus, only RT relation
• At some point, the RTs were “refined”
  – ~160 (including inverse)
• Now:
  – Number of relations has been reduced; further reduction is under evaluation
  – Vocabulary Agrontology collects Agrovoc relations
    • Defined as an extension of skos:related

Example of Agrovoc relationships

[Oryza sativa*] agrontology:produces [Rice*, Paddy]

From which it follows:
[Rice*, Paddy] agrontology:isProducedBy [Oryza sativa*]
[Rice*, Paddy] skos:related [Oryza sativa*]
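The inference in this example can be sketched as a simple rule over triples: each specific relation implies its inverse and, because Agrontology relations are defined as an extension of skos:related, a skos:related link in both directions (skos:related is symmetric). The inverse table, the property name agrontology:isProducedBy and the function name infer are assumptions for illustration; a real triple store would derive this from the ontology itself.

```python
# Assumed inverse pairs; only the one needed for the example is listed.
INVERSE = {"agrontology:produces": "agrontology:isProducedBy"}


def infer(triples):
    """Return the triples implied by the asserted ones (inverse and skos:related)."""
    asserted = set(triples)
    inferred = set()
    for subject, predicate, obj in asserted:
        if predicate in INVERSE:
            inferred.add((obj, INVERSE[predicate], subject))
            inferred.add((subject, "skos:related", obj))
            inferred.add((obj, "skos:related", subject))
    return inferred - asserted


asserted = {("Oryza sativa", "agrontology:produces", "Rice")}
inferred = infer(asserted)
```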

Same example, seen in VocBench


Agrontology

• Visualized in Module Relationship (called Properties in VB 2.1)

• Also available as HTML from:
  – http://aims.fao.org/aos/agrontology

History of changes of concept – Tab History


History

• The list of actions performed on the concept
• Some data is kept in Agrovoc, i.e. date of creation, last update
• The changes performed in between creation and last update are stored in VB only

Example


Images – Tab Image


Image

• A pointer to an image available online; one may give the name, the source and the URL of the source
• http://xmlns.com/foaf/0.1/depiction

Hierarchy


Summary of the hierarchy – tab Hierarchy


Hierarchy

• Is meant to give a quick grasp of the position of the selected concept in the hierarchy
  – Only parents from current concept to its Top

Expanding on Agrovoc concept attributes - “Subvocabularies” of concepts

The idea behind

• A way to make lists of concepts
• Visualized in tab Attribute (of concepts)
• Expressed by predicate agrontology:isPartOfSubvocabulary
  – Value is a string (the name of the list)
  – You may think of them as a flag assigned to a concept
• Subvocabularies may be defined by administrators

Current subvocabularies of concepts


Subvocabularies currently defined

• Chemicals (644 concepts)
• Geographical country level (247)
• Geographical above country level (246)
• Geographical below country level (522)
• Fishery related terms (259)
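Since a subvocabulary is just a named flag on a concept, listing its members amounts to a filter. In the sketch below the subvocabulary names come from the list above, while the concept-to-flag assignments and the function name members are illustrative assumptions, not data extracted from Agrovoc.

```python
# Values of agrontology:isPartOfSubvocabulary per concept (illustrative assignments).
SUBVOCABULARY = {
    "Italy": {"Geographical country level"},
    "Europe": {"Geographical above country level"},
    "Sicily": {"Geographical below country level"},
    "Maize": set(),  # a concept without any flag
}


def members(name, assignments=SUBVOCABULARY):
    """Return the concepts flagged with the given subvocabulary name, sorted by label."""
    return sorted(concept for concept, flags in assignments.items() if name in flags)
```

Storing the flags as a set per concept leaves room for a concept that belongs to more than one list.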

Remarks

• Geographical vocabularies are currently under examination:
  – Country level
  – Above country level
  – Below country level

Information about terms


Pubby – HTML presentation of RDF

Terms in SKOS

• In RDF/SKOS view, terms are labels of concepts (strings)

• Labels are strings


[Figure: in plain SKOS, agrovoc:c_12332 skos:prefLabel “maize”@en. In SKOS-XL, agrovoc:c_12332 skosxl:prefLabel agrovoc:xl_en_1299486843709, which in turn has skosxl:literalForm “maize”@en, dcterms:created 1981-01-26 and agrontology:hasStatus Published]

Terms in SKOS-XL

• SKOS-XL is an extension of SKOS
• Labels are really objects, about which one can make statements

Agrovoc is in SKOS-XL

• Because we want to make statements about terms…
  – E.g. creation date, status (drafted, ..., published)
• skosxl:prefLabel for descriptors
• skosxl:altLabel for non-descriptors
• But also skos:prefLabel and skos:altLabel are generated for various purposes
  – E.g. Pubby
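The difference between the two views can be sketched as follows: a SKOS-XL label is an object with its own identifier and metadata, from which the plain skos:prefLabel string can be derived. The class name XLabel, the field names and the helper plain_pref_labels are assumptions for illustration; the values are those of the maize example in this presentation.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class XLabel:
    """A SKOS-XL label: a resource that statements can be made about."""
    uri: str
    literal_form: str   # skosxl:literalForm
    lang: str
    created: date       # dcterms:created
    status: str         # agrontology:hasStatus


maize_en = XLabel("agrovoc:xl_en_1299486843709", "maize", "en", date(1981, 1, 26), "Published")

# skosxl:prefLabel links: concept -> label objects
xl_pref_labels = {"agrovoc:c_12332": [maize_en]}


def plain_pref_labels(xl_pref_labels):
    """Derive plain skos:prefLabel strings, per concept and language, from the label objects."""
    return {
        concept: {label.lang: label.literal_form for label in labels}
        for concept, labels in xl_pref_labels.items()
    }
```

Deriving the plain labels from the SKOS-XL objects, rather than maintaining both by hand, keeps the two views consistent for tools such as Pubby.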

History


History information for terms


Relationships


Relationships between terms


Visualize relations between terms in VB


Attributes


Attributes of terms


Linguistic information as attribute of terms

• Singular, plural forms
  – agrontology:hasSingular, agrontology:hasPlural
• Spelling variants
  – agrontology:hasSpellingVariant
  – E.g. hemophilia, haemophilia: different labels?
• Membership to a subvocabulary of terms
  – agrontology:hasTermType

Notation


Term codes -> notations


Expanding on Agrovoc term attributes – “Subvocabularies” of terms

The idea behind

• A way to make lists of terms, just like the subvocabularies of concepts


Subvocabularies of terms


Subvocabularies of terms

• Expressed as an attribute of a term, i.e. by the predicate agrontology:hasTermType
• In VocBench: tab Attribute of a term
• Subvocabularies may be defined by administrators

# terms per subvocabularies (en)

• Taxonomic terms for plants: 4297
• Taxonomic terms for animals: 14809
• Taxonomic terms for bacteria: 7133
• Common name for viruses: 50
• Taxonomic terms for viruses: 5136
• Common name for plants: 8000
• Common name for bacteria: 29
• Acronym: 867
• Common name for animals: 4613
• Taxonomic terms for fungi: 17830
• Common name for fungi: 144

Membership to a vocabulary of terms seen in VB


Remark

• More than one membership is possible
• But now mostly individual membership
  – In the old MySQL maintenance tool it was not possible to assign two attributes

Provenance information


Information in Agrovoc

• For concepts:
  – Date of creation
    • dct:created
  – Date of last update
    • dct:modified
• For terms:
  – Date of creation and last update
    • As above

Information in VocBench

• Extra information for management purposes
  – User, action, change
  – See VB modules Recent Changes and Validation

Multilinguality in Agrovoc


Multilinguality

• Labels are marked with ISO 639-1 two-letter language codes
  – Languages as recognized by linguists
  – en for English, es for Spanish in general, independently of where the language is spoken

Multilinguality

ISO 2 language codes

Language variations by countries

• English in UK, USA, Australia, ...
• Spanish in Spain, Argentina, Venezuela…
• Portuguese in Portugal, Brazil
• This fact may be expressed by using ISO two-letter codes for languages + ISO codes for countries
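Combining a language code with a country code gives tags such as pt-BR. The sketch below builds such tags and looks up a label with a fallback from the country-specific tag to the bare language. The function names are assumptions, and the pineapple labels are an illustrative example of a Portugal/Brazil difference, not values taken from Agrovoc.

```python
def language_tag(language, country=None):
    """Build a tag like 'pt' or 'pt-BR' from a language code and an optional country code."""
    language = language.lower()
    return language if country is None else f"{language}-{country.upper()}"


def pick_label(labels, tag):
    """Prefer the label for the full tag, then fall back to the bare language."""
    if tag in labels:
        return labels[tag]
    return labels.get(tag.split("-")[0])


# Illustrative labels (not from Agrovoc): the word for pineapple differs by country.
pineapple = {"pt": "ananás", "pt-BR": "abacaxi"}
```

With this fallback, a Brazilian user would see “abacaxi”, while any other Portuguese tag falls back to the generic pt label.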

Example: some terminological differences in PT (EMBRAPA)


Terminological variations by regions

• Names for concrete objects, or objects in widespread use (food, plants, animals, …) tend to vary considerably “within a language”, e.g. by country/state/region/…
  – They also tend to persist over time… e.g., indigenous words used after the indigenous language is no longer spoken…
• Reflection on limitations of code-oriented approach

Multilinguality

• More flexible way to support multilinguality?
  – Accommodate the “area of use” of a given name
    • Where area may be a country, a political or geographical region, or an aggregation of these
    • E.g. palta is the Quechua name for avocado, used in Argentina, Bolivia, Chile, Peru, Uruguay…
• Connection between linguistic and geographical information
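One possible shape for the “area of use” idea is to attach a set of countries to each name instead of a single language code. This is only a sketch under stated assumptions: the class Name, its fields and the function names_used_in are hypothetical, and the country list for palta is the (non-exhaustive) one given above, written as ISO 3166-1 alpha-2 codes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    """A name for a concept together with the area where it is used."""
    text: str
    concept: str
    area_of_use: frozenset  # ISO 3166-1 alpha-2 country codes


# Countries as listed on the slide; the list there is open-ended.
palta = Name("palta", "avocado", frozenset({"AR", "BO", "CL", "PE", "UY"}))


def names_used_in(names, concept, country):
    """Return the names of a concept that are in use in the given country."""
    return [name.text for name in names
            if name.concept == concept and country in name.area_of_use]
```

An area given as a region or an aggregation of countries could be expanded into such a set by following geographical relations between concepts.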

AGROVOC by domains: living organisms

Scientific taxonomies


Where in the hierarchy

• Most concepts for animals and plants are under top concept organisms*
  – ~2/3 of total concepts in AGROVOC

In Agrovoc thesaurus

• Only BT/NT hierarchies of taxa
  – E.g. [Oryza*] BT [Poaceae]
• No specification of ranks
  – E.g. Oryza is a genus
• No formal way to specify other “attributes” of a scientific name
  – See the use of Editorial notes

Elements in a scientific taxonomy and their names

• In case more than one name is available for a concept, the binominal name is the preferred label

Taxonomies in Agrovoc now

• Mostly, skos:broader
  – Oryza skos:broader Poaceae
• Also, some pairs of taxa are connected by a number of relations in Agrontology
  – E.g. agrontology:includesSubGroup
  – Note that they are subproperties of skos:related…
  – Plan is to remove them

Ranks in Agrovoc now

• Rendered by agrontology:hasTaxonomicLevel
  – Oryza glaberrima* hasTaxonomicLevel species (taxa)*
• Values for hasTaxonomicLevel are concepts under:
  – Groups*
    » Taxa*
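Putting hierarchy and ranks together, a lineage with ranks can be read off by walking skos:broader and looking up agrontology:hasTaxonomicLevel at each step. In this sketch the rank of Oryza glaberrima and the link Oryza to Poaceae come from the slides; the link from Oryza glaberrima to Oryza, the plain-string rank values and the function name lineage are assumptions for illustration.

```python
# skos:broader between taxa (single parent assumed for simplicity).
BROADER = {"Oryza glaberrima": "Oryza", "Oryza": "Poaceae"}

# agrontology:hasTaxonomicLevel values, shown as plain strings.
TAXONOMIC_LEVEL = {"Oryza glaberrima": "species", "Oryza": "genus"}


def lineage(taxon):
    """Walk skos:broader upward, pairing each taxon with its rank (None if not recorded)."""
    chain = []
    while taxon is not None:
        chain.append((taxon, TAXONOMIC_LEVEL.get(taxon)))
        taxon = BROADER.get(taxon)
    return chain
```

Taxa without a recorded rank show up with None, which makes gaps in the rank information easy to spot.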

Scientific names


Scientific names

• Binominal nomenclature

Hibiscus rosa-sinensis, Linnaeus 1753
Common names: Hibiscus, Chinese hibiscus, …

What language tag for scientific names?

• Some people say binominal names are Latin…
  – No! Rather, latinate…
• In Agrovoc, they are repeated in all languages, with the corresponding language tag
• This is for historical reasons: it was the only way to have scientific names in each language version

Scientific names in AGROVOC

• Are all labels of skos:Concepts
• May be preferred or alternative labels
  – General guideline: the scientific name is preferred under a scientific taxonomy
  – In case more than one scientific name is available for the same entity, the non-preferred one is a synonym of the other

How to mark scientific names - 1

• Using a subvocabulary of terms
• Predicate agrontology:hasTermType
  – Current possible values: Taxonomic name for [animals | Bacteria | Fungi | Plants | viruses], and also Common names for …
• Does not require that common names are available

How to “mark” scientific names - 2

• agrontology:hasScientificName
• Applies to terms
• Links together the scientific and the common name of the same concept, in the same language

Remark 1

• It is a binary notion
  – Requires a common name for that concept!

Remarks

• Membership to a subvocabulary is a unary notion

• Our goal: unary property to mark that a term is a scientific name, together with author and year


Scientific taxonomies and common names


Background

• At the time of the revision of Agrovoc there was the idea to have separate hierarchies for scientific and common sense taxonomies
  – Connected by skos:related
  – Under a taxonomic hierarchy, the preferred term is always a scientific name

Separate hierarchies for scientific and common sense taxonomies

* figure taken from earlier revision work on AGROVOC

Example


skos:related

Remark

• Sometimes no clear separation between scientific taxonomies and “common sense” taxonomies
  – Example: Felidae (= biological family of cats, next slide)
    • Appears within a scientific taxonomy
    • But its subconcepts are not scientific

Example: Felidae


Remark 1

• It is not always possible to have clear 1-1 correspondences between scientific and common sense taxonomies
  – Common names are not always available
    • One common name may be used for very many…
    • They often are just the same as scientific names, or similar…

Remark 2

• Different classifications are proposed (-> scientific taxonomies)
  – Taxonomies also change over time
• In AGROVOC, attempt to use a concept-to-concept (C-C) relation agrontology:formerlyIncludedIn
  – Agavaceae* agrontology:formerlyIncludedIn Liliaceae
  – … satisfactory? …

Looking at AGROVOC by domains: geographical entities

How they are expressed

• agrontology:hasSubvocabulary
• Two vocabularies:
  – Geographical country level = 247 concepts
  – Geographical above country level

Geopolitical information

• Agrovoc has no notion of time:
  – Czechoslovak Socialist Republic
  – German Democratic Republic

Physical geography

• Geographical above country level = 246
• A variety of notions…
  – Americas
  – Baltic States
  – Islamic countries
  – Lake Kivu
  – Atlantic Ocean
  – Yellow Sea

Geographical relations

• spatiallyIncludes
  – Guyana – Cuyuni River
• A variety of notions…
  – Americas
  – Baltic States
  – Islamic countries
  – Lake Kivu
  – Atlantic Ocean
  – Yellow Sea

Agrovoc on geographical entities

• Currently under revision by Otakar Čerba (visiting scientist @FAO)

Agrovoc LOD


Mapped vocabularies 1


Mapped vocabularies 2


Mapping process


Mapping activities @Agrovoc

• Intense activity around 2011-2012
• Process managed internally
  – Candidate mappings automatically generated (using both publicly available implementations and algorithms implemented in-house)
  – Manual evaluation
    • Done by one colleague, Gudrun
  – Publication
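The slides do not describe the matching algorithms that were used, so the sketch below only illustrates the general shape of the process: candidates are generated automatically, here by a deliberately simple exact match on normalized labels, and then filtered by a manual decision. The second vocabulary, its identifiers and all function names are hypothetical.

```python
def normalize(label):
    """Lowercase a label and collapse whitespace so that trivial variants compare equal."""
    return " ".join(label.casefold().split())


def candidate_mappings(source, target):
    """Pair concepts from two vocabularies that share at least one normalized label."""
    index = {}
    for uri, labels in target.items():
        for label in labels:
            index.setdefault(normalize(label), set()).add(uri)
    candidates = set()
    for uri, labels in source.items():
        for label in labels:
            for match in index.get(normalize(label), ()):
                candidates.add((uri, match))
    return sorted(candidates)


agrovoc = {"agrovoc:c_12332": ["maize", "corn (maize)"]}
other = {"other:001": ["Maize"], "other:002": ["Wheat"]}  # hypothetical vocabulary

candidates = candidate_mappings(agrovoc, other)

# Manual evaluation: a person accepts or rejects each candidate before publication.
approved_by_evaluator = {("agrovoc:c_12332", "other:001")}
validated = [pair for pair in candidates if pair in approved_by_evaluator]
```

Only the validated pairs would then be published, for example as SKOS mapping statements.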

VB and mapping

• Development happens within SemaGrow
  – Project deadline 2015
  – But because of an internal deadline, a first version at least might be ready in Nov 2014
• What’s needed… ongoing…
  – Improved multi-project management capabilities (in Semantic Turkey), to access data from a different project, mediated by a dedicated Access Control policy
  – Mapping engine

More info

SemaGrow project www.semagrow.eu

Deliverable D3.2.1- Techniques for Ontology Alignment

http://www.semagrow.eu/sites/default/files/D3.2.1-Techniques%20for%20Ontology%20Alignment.pdf

Mmmh… anything else?
