The GACS Project, by Caterina Caracciolo

Post on 28-May-2015

DESCRIPTION

Presentation delivered at the Agricultural Data Interoperability Interest Group -- Research Data Alliance (RDA) 4th Plenary Meeting -- Amsterdam, September 2014

TRANSCRIPT

The GACS Project

Caterina Caracciolo, FAO

Amsterdam, 22 Sept. 2014

What is GACS?

• A collaboration between
– FAO (AGROVOC)
– NAL of USA (NAL Thesaurus)
– CABI (CAB Thesaurus)

• To make a common repository of terminological and conceptual information in agriculture

Why?

• To coordinate efforts in the same area
• And profit from differences of the three thesauri

Who?

• FAO – AGROVOC thesaurus
– 32,000 concepts (RDF/SKOS native), 20 languages
– Covering agriculture, fisheries, forestry, environment, food, ...
• CABI – CAB Thesaurus
– 245,000 terms, 11 languages (4 currently updated)
– Majority, scientific names
• NAL – NAL Thesaurus
– ~100,000 terms. English, Spanish. Most chemicals

AGROVOC (website to change soon)

NAL Thesaurus

CAB Thesaurus

How?

Phase 1: Feasibility study (concluded)
Phase 2: Creation of a GACS core (now - early 2015)
– A core of ~10,000 concepts aligned in the three thesauri
– Separate URIs
Phase 3: The “real” GACS
– Expansion of the core
– Expansion of the partnership

Some questions/issues

• Is there an overlap between the three thesauri?
• What is the potential for alignment?
• How to select the “core”?
• Will provenance information be kept?
• What to do with different hierarchies?
• Infrastructure?

Is there an overlap between the three?

An estimate of potential for alignment
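A first estimate of overlap can be sketched by comparing normalized labels across the thesauri. The snippet below is only an illustration: the labels are invented, the short names `CABT` and `NALT` are shorthand used here, and the slides do not say how the project's own estimate was computed.

```python
from itertools import combinations


def normalize(label: str) -> str:
    """Lowercase and collapse whitespace so trivial spelling variants compare equal."""
    return " ".join(label.lower().split())


def overlap_report(thesauri):
    """Count normalized labels shared by each pair of thesauri and by all of them."""
    normalized = {
        name: {normalize(label) for label in labels}
        for name, labels in thesauri.items()
    }
    report = {}
    for a, b in combinations(sorted(normalized), 2):
        report[(a, b)] = len(normalized[a] & normalized[b])
    # Labels present in every thesaurus at once.
    report[tuple(sorted(normalized))] = len(set.intersection(*normalized.values()))
    return report


# Toy labels, invented for illustration only.
sample = {
    "AGROVOC": {"Maize", "Rice", "Soil erosion"},
    "CABT": {"maize", "Zea mays", "rice"},
    "NALT": {"Rice", "glucose", "soil  erosion"},
}
report = overlap_report(sample)
```

Exact label matching undercounts real overlap (synonyms, spelling variants, other languages), so a figure like this is a lower bound at best.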

How to select the core?

1. Selection of “seeds” of 10,000 concepts based on use in app of choice
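Selecting seeds by use can be sketched as a frequency count over indexed records. The record format and concept identifiers below are hypothetical; the slides name neither the application nor the exact counting rule.

```python
from collections import Counter


def select_seeds(indexed_records, size):
    """Return the `size` concept IDs used in the largest number of records."""
    usage = Counter()
    for concepts in indexed_records:
        usage.update(set(concepts))  # count each concept once per record
    return [concept for concept, _ in usage.most_common(size)]


# Hypothetical records: each is the list of concept IDs assigned to one document.
records = [
    ["c_maize", "c_soil"],
    ["c_maize", "c_rice"],
    ["c_maize", "c_rice", "c_rice"],
    ["c_wheat"],
]
seeds = select_seeds(records, size=2)
```

For the core described here, `size` would be 10,000 per thesaurus.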

How to select the core?

2. Run mapping algorithms and get a “single” core, then to be manually assessed
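A minimal sketch of such a mapping step, assuming plain exact matching on normalized labels (the slides do not say which algorithms were actually run). Each concept keeps its own URI, and every proposed pair is flagged for manual assessment. The `example.org` URIs and the `Candidate` type are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    source_uri: str
    target_uri: str
    label: str
    status: str = "to-review"  # every candidate awaits manual assessment


def normalize(label):
    """Lowercase and collapse whitespace."""
    return " ".join(label.lower().split())


def propose_mappings(source, target):
    """Pair concepts whose normalized labels match exactly.

    `source` and `target` map a concept URI to its list of labels.
    """
    index = {}
    for uri, labels in target.items():
        for label in labels:
            index.setdefault(normalize(label), set()).add(uri)

    candidates, seen = [], set()
    for uri, labels in source.items():
        for label in labels:
            key = normalize(label)
            for target_uri in sorted(index.get(key, ())):
                if (uri, target_uri) not in seen:  # one candidate per URI pair
                    seen.add((uri, target_uri))
                    candidates.append(Candidate(uri, target_uri, key))
    return candidates


# Hypothetical URIs and labels.
source = {
    "http://example.org/a/1": ["Maize", "Corn"],
    "http://example.org/a/2": ["Rice"],
}
target = {
    "http://example.org/b/9": ["maize", "corn"],
    "http://example.org/b/7": ["Wheat"],
}
candidates = propose_mappings(source, target)
```

Keeping a `status` field on each pair reflects the two-stage process: automatic proposal first, human review afterwards.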

What is the frequency of concepts used for a corpus?

– A sample from AGRIS

Will provenance information be kept?

• Yes, it is fundamental for all
• We agreed on a set of metadata to keep at the level of concept and terms
– Creator, Date of creation, Date of last update, ...
– Agreed format is SKOS-XL, to be able to make statements on
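SKOS-XL treats labels as resources in their own right, so metadata can be attached to a term as well as to its concept. The sketch below mirrors that idea with plain Python data classes. The class and field names are illustrative, not the agreed GACS metadata schema, and all values are invented.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Provenance:
    creator: str
    created: date
    modified: date


@dataclass
class Term:
    literal: str
    language: str
    provenance: Provenance  # metadata kept at the level of the term


@dataclass
class Concept:
    uri: str
    provenance: Provenance  # metadata kept at the level of the concept
    terms: list = field(default_factory=list)

    def last_touched(self):
        """Most recent update across the concept and all of its terms."""
        dates = [self.provenance.modified]
        dates += [term.provenance.modified for term in self.terms]
        return max(dates)


# Invented example values.
concept = Concept(
    uri="http://example.org/concept/1",
    provenance=Provenance("editor-a", date(2014, 1, 10), date(2014, 3, 1)),
    terms=[
        Term("maize", "en", Provenance("editor-a", date(2014, 1, 10), date(2014, 1, 10))),
        Term("maïs", "fr", Provenance("editor-b", date(2014, 6, 5), date(2014, 9, 22))),
    ],
)
latest = concept.last_touched()
```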

Hierarchies may be different, although similar...

What about the GACS infrastructure?

• Suite of tools:
– VocBench for editing (FAO, U of Tor Vergata)
– Skosmos (Finnish National Library)

• Exposure and publication to be arranged, supported by FAO

Pointers

• On aims.fao.org we regularly publish updates – register to bulletins

• GACS reports published so far:
– http://aims.fao.org/community/agrovoc/blogs/phase-one-gacs-approved-read-reports

• A website will follow
