A Toolkit for Reconciling Multiple Taxonomic Perspectives: Euler/X and the Perelleschus Use Case

Page 1: A Toolkit for Reconciling Multiple Taxonomic Perspectives: Euler/X and the  Perelleschus  Use Case

A Toolkit for Reconciling

Multiple Taxonomic Perspectives:

Euler/X and the Perelleschus Use Case

Nico Franz¹, Mingmin Chen², Shizhuo Yu², Shawn Bowers³ & Bertram Ludäscher²

¹ School of Life Sciences, Arizona State University
² Department of Computer Science, UC Davis
³ Department of Computer Science, Gonzaga University

TDWG 2013 Annual Conference, Florence, Italy

Semantics for Biodiversity – Formal Models and Ontologies

November 01, 2013

Slides @ http://taxonbytes.org/tdwg-2013-a-toolkit-for-reconciling-multiple-taxonomic-perspectives

Page 3: A Toolkit for Reconciling Multiple Taxonomic Perspectives: Euler/X and the  Perelleschus  Use Case

Introduction – the Euler project & Euler/X toolkit

• The project builds on a roughly 25-year history of using taxonomic concepts in the TDWG community, primarily in Australia, Germany, the United Kingdom, and Japan.

• Prior extensive uses of concept articulations include Koperski et al. (2000), and the concatenation of articulations by Berendsohn, Geoffroy & Güntsch (2003).

• David Thau's (2006-2010) work on CleanTax prototyped the use of RCC-5 relations in combination with First-Order Logic reasoning over taxonomies.

• The Euler project (2011-) succeeds CleanTax, with performance optimizations, many added functions, and an increasing focus on Answer Set Programming.

Homepage: https://sites.google.com/site/eulerdi/home
Open source: https://bitbucket.org/eulerx/euler-project
Overview paper: http://taxonbytes.org/pdf/ChenEtAl2013-EulerToolkit.pdf

Page 4: A Toolkit for Reconciling Multiple Taxonomic Perspectives: Euler/X and the  Perelleschus  Use Case

Review: RCC-5 articulations between two concepts C1, C2

[Figure: the five RCC-5 relations: congruence, proper inclusion, inverse proper inclusion, overlap, and exclusion.]

Use of "OR" to express uncertainty. Example: C1 == OR > C2

Source: Franz & Peet. 2009. Towards a language for mapping relationships among taxonomic concepts. Systematics and Biodiversity 7: 5–20.
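
The five relations and the "OR" disjunction can be sketched with plain sets. This is a minimal illustration, not Euler/X code: concepts are modeled as non-empty sets of hypothetical member labels, the symbols `==`, `>` and `><` come from the slides, and the symbols `<` and `|` for the remaining two relations are assumptions.

```python
from enum import Enum


class RCC5(Enum):
    CONGRUENT = "=="     # same members
    INCLUDES = ">"       # proper inclusion: C1 properly contains C2
    INCLUDED_IN = "<"    # inverse proper inclusion (symbol assumed)
    OVERLAPS = "><"      # some shared members, some unshared on each side
    EXCLUDES = "|"       # no shared members (symbol assumed)


def rcc5(c1, c2):
    """Return the single RCC-5 relation that holds between two non-empty sets."""
    c1, c2 = frozenset(c1), frozenset(c2)
    if not c1 or not c2:
        raise ValueError("concepts must be non-empty")
    if c1 == c2:
        return RCC5.CONGRUENT
    if c1 > c2:
        return RCC5.INCLUDES
    if c1 < c2:
        return RCC5.INCLUDED_IN
    return RCC5.OVERLAPS if c1 & c2 else RCC5.EXCLUDES


def satisfies(c1, c2, allowed):
    """An uncertain articulation holds if the actual relation is one of its disjuncts."""
    return rcc5(c1, c2) in allowed


# The slide's example "C1 == OR > C2" as a set of permitted relations.
EQUAL_OR_INCLUDES = {RCC5.CONGRUENT, RCC5.INCLUDES}

print(rcc5({"a", "b", "c"}, {"a", "b"}).value)                   # >
print(satisfies({"a", "b", "c"}, {"a", "b"}, EQUAL_OR_INCLUDES)) # True
print(satisfies({"a", "b"}, {"b", "c"}, EQUAL_OR_INCLUDES))      # False
```

Exactly one of the five relations holds for any pair of non-empty sets, which is why an uncertain articulation can be represented as a set of candidate relations.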

Page 11: A Toolkit for Reconciling Multiple Taxonomic Perspectives: Euler/X and the  Perelleschus  Use Case

Interactive taxonomy alignment: Euler/X toolkit workflow

• Challenge: asserting articulations across 2 taxonomies may lead to ambiguities, inconsistencies, and omissions, resulting in an imperfect alignment.

• Solution: Euler/X reads in 2 concept taxonomies (TCs + T1 + T2) plus a set of initial, expert-made articulations (A). The toolkit then allows for:
  • Checking for, and identification of, alignment inconsistencies.
  • Interactive inconsistency repair.
  • Generation of the set of mir – maximally informative relations (necessary and sufficient to yield a complete alignment).
  • Interactive uncertainty reduction.
  • Visualization of one or more "Possible World" merge taxonomies.
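
The workflow above can be illustrated on a toy input. The sketch below is a brute-force stand-in for the logic reasoners that Euler/X relies on, not the toolkit's implementation: it assigns every concept a non-empty subset of a tiny universe of atoms, keeps the assignments that respect sibling disjointness, coverage, and the input articulations, and reads off the possible worlds and the relations seen for each cross-taxonomy pair. All names (`possible_worlds`, `mir`, the concept labels, the `n_atoms` bound) are hypothetical, and the small universe is only adequate for examples of this size.

```python
from itertools import combinations, product


def relation(a, b):
    """RCC-5 relation between two non-empty sets, as a symbol."""
    if a == b:
        return "=="
    if a > b:
        return ">"
    if a < b:
        return "<"
    return "><" if a & b else "|"


def concepts_of(taxonomy):
    """All concept labels in a {parent: [children]} taxonomy, sorted."""
    return sorted({c for parent, kids in taxonomy.items() for c in (parent, *kids)})


def taxonomy_ok(taxonomy, ext):
    """Check sibling disjointness and coverage for every parent."""
    for parent, kids in taxonomy.items():
        if any(ext[a] & ext[b] for a, b in combinations(kids, 2)):
            return False  # siblings must not share members
        if frozenset().union(*(ext[k] for k in kids)) != ext[parent]:
            return False  # coverage: children jointly exhaust the parent
    return True


def possible_worlds(t1, t2, articulations, n_atoms=3):
    """Enumerate distinct relation vectors consistent with all constraints."""
    c1, c2 = concepts_of(t1), concepts_of(t2)
    concepts = c1 + c2
    pairs = [(a, b) for a in c1 for b in c2]
    subsets = [frozenset(s) for r in range(1, n_atoms + 1)
               for s in combinations(range(n_atoms), r)]
    worlds = set()
    for assignment in product(subsets, repeat=len(concepts)):
        ext = dict(zip(concepts, assignment))
        if not (taxonomy_ok(t1, ext) and taxonomy_ok(t2, ext)):
            continue
        if all(relation(ext[a], ext[b]) in allowed for a, allowed, b in articulations):
            worlds.add(tuple(relation(ext[a], ext[b]) for a, b in pairs))
    return pairs, worlds


def mir(pairs, worlds):
    """For each cross-taxonomy pair, the relations observed across all worlds."""
    return {pair: {w[i] for w in worlds} for i, pair in enumerate(pairs)}


T1 = {"1.A": ["1.B", "1.C"]}
T2 = {"2.D": ["2.E", "2.F"]}
ARTICULATIONS = [("1.B", {"=="}, "2.E"), ("1.C", {"=="}, "2.F")]

pairs, worlds = possible_worlds(T1, T2, ARTICULATIONS)
print(len(worlds))                          # 1
print(mir(pairs, worlds)[("1.A", "2.D")])   # {'=='}
```

In this toy input, adding the articulation `("1.A", {"<"}, "2.D")` leaves no possible world, which corresponds to an inconsistent alignment. Replacing the second articulation with `("1.A", {"==", ">"}, "2.D")` leaves two possible worlds, which is the kind of residual uncertainty the interactive steps aim to reduce.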

Page 12: A Toolkit for Reconciling Multiple Taxonomic Perspectives: Euler/X and the  Perelleschus  Use Case

Euler/X is ready¹ for real-life use cases – Perelleschus

¹ After many iterations of testing/optimization with abstract cases, PW visualizations, and reasoner benchmarking.

Page 13: A Toolkit for Reconciling Multiple Taxonomic Perspectives: Euler/X and the  Perelleschus  Use Case

Perelleschus use case – overview of 6 classifications/phylogenies

[Figure: the six classifications/phylogenies, dated 1936, 1954, 1986, 2001, 2006, and 2013. Highlighting marks "carludovicae" (name) and its cumulative history.]

Page 15: A Toolkit for Reconciling Multiple Taxonomic Perspectives: Euler/X and the  Perelleschus  Use Case

Key properties of the Perelleschus concept history use case

• 6 classifications (3 taxonomic, 3 phylogenetic), 54 concepts, from 1936 to 2013

• Complete concept history from 1st concept E. carludovicae sec. Günther (1936) to current phylogenetic arrangement (2013) with 10 species-level concepts.

• All instances of taxonomic incongruence occur above the species level.

• Franz & Cardona-D. (2013) provide 54 concepts + Trees 1-6 + 76 articulations.

• Only 5 of 54 higher-level concept articulations are unambiguously congruent.

• Articulations take into account membership & diagnostic features.

DOI: 10.1080/14772000.2013.806371

Page 17: A Toolkit for Reconciling Multiple Taxonomic Perspectives: Euler/X and the  Perelleschus  Use Case

Concept evolution – Günther (1936) to Voss (1954)

Reconciliation appears easy enough; except E. carludovicae sec. Günther (1936; [2]) – a Costa Rican taxon/concept – was placed in Elleschus sec. Günther (1936; [1]) – a European taxon/concept with several other children which the author omitted in his 1936 treatment (issue: incomplete listing of children).

Thus "overlap" (><) is an intuitive articulation between [1] and [3]; however, Euler/X would not infer this unless we either:

1. Relax the "coverage assumption" for [1] (coverage means that a parent's extension is fully defined by its children); or

2. Add a child "1 Imp" (implied) to obtain the proper mir and merge.
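
The effect of the coverage assumption can be sketched with sets. This is an illustration under stated assumptions, not output from Euler/X: the member labels are hypothetical, and [3] is assumed to be a 1954 concept that holds the members of [2] plus at least one further member, which the slide does not spell out.

```python
def rcc5(a, b):
    """Return the RCC-5 symbol relating two non-empty member sets."""
    if a == b:
        return "=="
    if a > b:
        return ">"
    if a < b:
        return "<"
    return "><" if a & b else "|"


def parent_extension(*children):
    """Under coverage, a parent is exactly the union of its listed children."""
    return frozenset().union(*children)


# Hypothetical member labels standing in for the real circumscriptions.
c2 = frozenset({"costa_rican_sp"})           # [2] E. carludovicae sec. Günther (1936)
c3 = c2 | frozenset({"other_1954_member"})   # [3] assumed: members of [2] plus more
imp = frozenset({"european_spp"})            # "1 Imp": children omitted in 1936

c1_listed_only = parent_extension(c2)        # coverage with the incomplete listing
c1_with_imp = parent_extension(c2, imp)      # coverage after adding "1 Imp"

print(rcc5(c1_listed_only, c3))  # <
print(rcc5(c1_with_imp, c3))     # ><
```

With only [2] listed, coverage collapses [1] onto its single child, so overlap with [3] cannot be derived. The implied child restores the part of [1] that lies outside [3]. Relaxing coverage for [1] would have the same effect by allowing [1] to extend beyond the union of its listed children.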

Page 18: A Toolkit for Reconciling Multiple Taxonomic Perspectives: Euler/X and the  Perelleschus  Use Case

Concept evolution – Günther (1936) to Voss (1954)

[Figure: Euler/X mir and Euler/X merge, with the implied child labeled "1.1 Imp". Color legend: 1954 concepts; 1936 concepts; congruent species concepts '36/'54; overlap (><).]

Once "1 Imp" is added, Euler/X yields a consistent merge that is intuitive at all levels.

Page 19: A Toolkit for Reconciling Multiple Taxonomic Perspectives: Euler/X and the  Perelleschus  Use Case

Concept evolution – Wibmer & O'Brien (1986) to Franz & O'Brien (2001)

Euler/X infers a consistent and plausible merge of the 1986 three-species taxonomy and the eight-species 2001 phylogeny.

[Figure: Euler/X merge. Color legend: 2001 concepts; 1986 concepts; congruent '86/'01 concepts; overlap (><).]

Page 20: A Toolkit for Reconciling Multiple Taxonomic Perspectives: Euler/X and the  Perelleschus  Use Case

Concept evolution – Wibmer & O'Brien (1986) to Franz & O'Brien (2001)

The overlap (><) articulations among 2001 higher-level concepts [14,16,20,…] and Perelleschus sec. W. & O. 1986 [7] are rooted in the inclusion/exclusion of "subcinctus" [10/13] in "Perelleschus" [7/14].

[Figure: Euler/X merge. Color legend: 2001 concepts; 1986 concepts; congruent '86/'01 concepts; overlap (><).]

Page 21: A Toolkit for Reconciling Multiple Taxonomic Perspectives: Euler/X and the  Perelleschus  Use Case

Concept evolution – Wibmer & O'Brien (1986) to Franz & O'Brien (2001)

[Figure: Euler/X merge. Color legend: 2001 concepts; 1986 concepts; congruent '86/'01 concepts; overlap (><).]

The 2001 authors transferred "subcinctus" into Phyllotrox [12].

Page 22: A Toolkit for Reconciling Multiple Taxonomic Perspectives: Euler/X and the  Perelleschus  Use Case

Concept evolution – Franz & O'Brien (2001) to Franz & Cardona-D. (2013)

At the surface and beyond, the two phylogenies share many congruent terminals and seemingly also higher-level entities.

However, the 2013 treatment includes two new species/concepts [53,54] and one new clade [52] nested well within the genus-level topology.

Page 23: A Toolkit for Reconciling Multiple Taxonomic Perspectives: Euler/X and the  Perelleschus  Use Case

Concept evolution – Franz & O'Brien (2001) to Franz & Cardona-D. (2013)

Initial merge results: "noisy", due in part to divergent outgroup assumptions.

[Figure: initial Euler/X merge. Annotations: main 2013 higher-level trunk; main 2001 higher-level trunk; 2001: Derelomini out of position; 14 = 2001: Perelleschus; 38 = 2013: Perelleschus; 2013: Phyllotrogina; outgroups: too much "noise"; unwanted overlap?]

Once the outgroups were "stipulated" as congruent and "sealed off" (through application of coverage) from the ingroups, the merge became more solid and simpler.

Page 24: A Toolkit for Reconciling Multiple Taxonomic Perspectives: Euler/X and the  Perelleschus  Use Case

Concept evolution – Franz & O'Brien (2001) to Franz & Cardona-D. (2013)

"Clean" merge with overlapping, parallel 2001/2013 mid-level trunks that reflect the addition of a new, nested 2013 clade.

[Figure: Euler/X merge. Legend: 2013 higher-level concepts; 2001 higher-level concepts; 2013/2001 congruence. Annotations: new 2013 clade; zoom in on overlap.]

Page 25: A Toolkit for Reconciling Multiple Taxonomic Perspectives: Euler/X and the  Perelleschus  Use Case

In progress – zooming in on overlap, "combined concept" resolution

1. Merge view – overlap: A20 >< B47

2. Zoom view – 2 levels (Level 1, Level 2), with 3 new labels

[Figure: node labels shown in the zoom view: A20', B47', "AB2047", B52, A23, A21, B45, A22, B46.]
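
One way to read the three new labels is as the three regions of two overlapping concepts: the part unique to A20, the shared part, and the part unique to B47. The sketch below follows that reading. It is an interpretation of the slide rather than the toolkit's method, and the function name, the member labels, and the rule that builds the combined label are all hypothetical.

```python
def split_overlap(name_a, members_a, name_b, members_b):
    """Split two properly overlapping concepts into three disjoint regions."""
    a, b = frozenset(members_a), frozenset(members_b)
    shared = a & b
    if not shared or a <= b or b <= a:
        raise ValueError("concepts do not properly overlap (><)")
    # Assumed labeling rule: "A20" and "B47" combine to "AB2047".
    combined = name_a[0] + name_b[0] + name_a[1:] + name_b[1:]
    return {
        name_a + "'": a - b,   # members of A only
        combined: shared,      # members of both concepts
        name_b + "'": b - a,   # members of B only
    }


# Hypothetical members; the real concepts contain species-level concepts.
regions = split_overlap("A20", {"s1", "s2", "s3"}, "B47", {"s2", "s3", "s4"})
for label, members in regions.items():
    print(label, sorted(members))
# A20' ['s1']
# AB2047 ['s2', 's3']
# B47' ['s4']
```

The three regions are pairwise disjoint and together cover both input concepts, so each can be placed in the merge without any remaining overlap.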

"AB2047"

Page 26: A Toolkit for Reconciling Multiple Taxonomic Perspectives: Euler/X and the  Perelleschus  Use Case

Conclusions & outlook

1. The Euler/X toolkit is moving towards logically sound, interactive, scalable, and visually effective solutions to the challenge of reasoning over concept and classification / phylogeny provenance in real-life use cases.

2. Many agencies and projects aim towards integration of taxonomic names and concepts, including the Global Names Architecture initiative.

3. The Euler concept approach represents a robust and powerful way to achieve this through interactive, semi-automated reasoning and visualization of merge taxonomies.

Page 27: A Toolkit for Reconciling Multiple Taxonomic Perspectives: Euler/X and the  Perelleschus  Use Case

Acknowledgments

• TDWG 2013 Symposium organizers – John Deck, Mark Schildhauer, Ramona Walls

• Juliana Cardona-Duque – Universidad de Antioquia, Medellín, Colombia

• NSF Award IIS-1118088. "III: Small: A Logic-Based, Provenance-Aware System for Merging Scientific Data under Context and Classification Constraints."

http://taxonbytes.org
https://sols.asu.edu
https://sites.google.com/site/eulerdi/home