Franz et al. Using ASP to Simulate the Interplay of Taxonomic and Nomenclatural Change

Post on 07-Jul-2015

description

Answer Set Programming (ASP) is a declarative, stable model approach to logic programming with an under-realized potential for representing and reasoning over biological information. ASP is particularly suited to address reasoning challenges with complex starting conditions and rule sets. One such challenge is the interplay of taxonomic and nomenclatural change in biological taxonomy that often results when a taxonomy is revised based on a previously published perspective. Depending on the nature of the taxonomic changes to be undertaken, one or more Code-mandated principles will apply to regulate specific and concomitant name changes. In the case of the International Code of Zoological Nomenclature, two principles of significance are the Principles of Priority and Typification. Although the relationship between the number of taxonomic and nomenclatural adjustments under a given transition scenario is not linear, the application of the name-changing rules is usually unambiguous and therefore amenable to logic representation. Here we explore the modeling of the taxonomy/nomenclature interplay in ASP with a simple, abstract nine-taxon use case that contains four terminal species of which two are type-bearers for their respective genera. Four distinct one-taxon transfer scenarios are simulated through a transition system approach, requiring 1-7 concomitant nomenclatural changes depending on (1) the priority relationships among the terminal taxa being repositioned and (2) the type-bearing name dependencies of their higher-level parents.
ASP can simulate these rules faithfully and thus reason over situations that range from a one-to-one match of taxonomic and nomenclatural changes to situations where the two kinds of change become increasingly disconnected (e.g., transfer of non-type genera among tribes without name change, or "transfer" [in reverse direction] of a single priority-carrying name/taxon into a larger yet junior entity with numerous required name changes). Our results, though very preliminary, illustrate how an ASP logic approach may be utilized to perform optimizations at the taxonomy/nomenclature intersection, and generally represent a novel step towards translating Code-mandated naming rules into logic, with potential benefits for virtual taxonomic domains.

transcript

Using Answer Set Programming to Simulate the Interplay of Taxonomic and Nomenclatural Change

Nico Franz¹, Joohyung Lee² & Chao Zhang²

¹ School of Life Sciences, Arizona State University
² CIDSE Automated Reasoning Group, ASU

TDWG 2013 Annual Conference, Florence, Italy

Semantics for Biodiversity – Formal Models and Ontologies

November 01, 2013

Slides @ http://taxonbytes.org/tdwg-2013-using-asp-to-simulate-the-interplay-of-taxonomic-and-nomenclatural-change

Question – are the rules of nomenclature logically tractable?

Core principles embodied in the Code of Zoological Nomenclature

1. Binominal Nomenclature

• The scientific name of a species, and not of a taxon at any other rank, is a combination of two names.

2. Priority

• The valid name of a taxon is the oldest available name applied to it.

3. Coordination

• Within the [family, genus, species] group, a name established for a taxon at any rank is simultaneously established with the same author/date for taxa with the same name-bearing type at other ranks in the group.

4. First Reviser

• The relative precedence of two or more names or nomenclatural acts published on the same date, or of different original spellings of the same name, is determined by the First Reviser.

5. Homonymy

• The name of each taxon must be unique.

6. Typification

• Each nominal taxon in the family group, genus group or species group has a name-bearing type fixed to provide the objective standard of reference by which the application of the name is determined.

7. [Gender Agreement]

• Agreement in grammatical gender between a generic name and Latin or latinized adjectival or participial species-group names combined with it originally or subsequently.

Source: Code On-Line: http://www.nhm.ac.uk/hosted-sites/iczn/code/index.jsp
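Two of the principles listed above, Priority and Homonymy, reduce to very small computations once names and publication years are available as data. The following is a minimal sketch in plain Python (not ASP), under stated assumptions: the names and years are hypothetical placeholders, and ties in publication date (the First Reviser case) are deliberately left out.

```python
from collections import Counter

# Hypothetical available names applied to a single taxon, with publication years.
# Neither the names nor the years come from the presentation.
AVAILABLE_NAMES = [("Xus albus", 1850), ("Xus niger", 1820), ("Yus niger", 1875)]


def valid_name(available):
    """Principle of Priority: the oldest available name is the valid name.

    Assumes all years differ; same-date names would need a First Reviser.
    """
    return min(available, key=lambda pair: pair[1])[0]


def homonyms(names_in_use):
    """Principle of Homonymy: list every name that is used for more than one taxon."""
    counts = Counter(names_in_use)
    return sorted(name for name, n in counts.items() if n > 1)


print(valid_name(AVAILABLE_NAMES))
print(homonyms(["Agenus", "Ogenus", "Agenus"]))
```

In an ASP encoding the same two checks would be stated as rules and constraints rather than functions; the sketch only shows that each principle has an unambiguous, computable reading.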

Working hypothesis:

All (6 + 1) Principles are representable in Stable Model Semantics and computable with ASP programs & solvers.

Answer Set Programming reviewed in 10 bullet points

• Relatively new programming paradigm, not widely used until late 1990s

• A form of declarative programming based on Stable Model Semantics

• Combines expressive representation language with efficient solving tools

• Instead of proving truth/falsity, identifies solutions that satisfy conditions

• Closed World Assumption – what is not known is false (unlike OWL-DL)

• Can compute non-monotonic reasoning

• Has the property of elaboration tolerance

• Excels at modeling complex rules

• Capable of default reasoning ("by default, X is true"), transition systems

• Translatable (in part) into First-Order Logic (FOL), Description Logic (DL)

• More information in the reference list appended to this presentation
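To make the bullets on Stable Model Semantics, the Closed World Assumption and non-monotonic default reasoning more concrete, here is a minimal sketch in plain Python that enumerates the stable models of a tiny ground program by brute force, using the Gelfond-Lifschitz reduct. The toy program ("a bird flies unless it is known to be abnormal") is a hypothetical textbook-style example, not part of the presentation, and real ASP solvers work very differently from this exhaustive search.

```python
from itertools import chain, combinations

# A ground rule is (head, positive_body, negative_body).
# Hypothetical toy program:
#   bird.
#   flies :- bird, not abnormal.
RULES = [
    ("bird", frozenset(), frozenset()),
    ("flies", frozenset({"bird"}), frozenset({"abnormal"})),
]


def least_model(positive_rules):
    """Smallest set of atoms closed under rules that have no negation."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in positive_rules:
            if pos <= model and head not in model:
                model.add(head)
                changed = True
    return model


def is_stable(candidate, rules):
    """Check the candidate against the least model of its reduct."""
    # Reduct: drop rules blocked by the candidate, then drop the negative bodies.
    reduct = [(head, pos) for head, pos, neg in rules if not (neg & candidate)]
    return least_model(reduct) == candidate


def stable_models(rules):
    """Enumerate all subsets of atoms and keep the stable ones."""
    atoms = sorted({h for h, _, _ in rules} | {a for _, p, n in rules for a in p | n})
    subsets = chain.from_iterable(combinations(atoms, k) for k in range(len(atoms) + 1))
    return [set(s) for s in subsets if is_stable(set(s), rules)]


print(stable_models(RULES))
print(stable_models(RULES + [("abnormal", frozenset(), frozenset())]))
```

With only the two rules, "abnormal" is not derivable and is therefore treated as false, so "flies" holds by default. Adding the single fact "abnormal" withdraws that conclusion, which is the non-monotonic behavior the bullets refer to.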

ASP paradigm – set conditions, constraints, ground, identify Stable Models (SMs)

Source: Eiter, T. 2008. http://gradlog.informatik.uni-freiburg.de/gradlog/slides_ak/eiter_asp.pdf

ASP paradigm – apply to taxonomy/nomenclature change scenario

Source: Eiter, T. 2008. http://gradlog.informatik.uni-freiburg.de/gradlog/slides_ak/eiter_asp.pdf

Fully specified input taxonomy (t = 0); incl.: ranked names, priority/type relationships

At t = 1 (revision), effect a taxonomic change where 1 species is moved into another genus

Represent: input tree, names, years, ranks…

Encode: Principles of Nomenclature

Choice: Select a taxonomic change scenario

Grounding of all domains, variables and conditions at t = 0 (original) vs. t = 1 (revision)

Inference of Stable Models (taxonomies) and all concomitant nomenclatural emendations

9-taxon use case – transition model

Input (original) taxonomy at t = 0 ["9-name/taxon use case"]

• All type bearing and non-type bearing epithets have different publication years

[Figure: input taxonomy at t = 0; * = type-bearing name]

Transition: exactly 1 species will move to the other genus at t = 1.

Since there are 4 species, this yields 4 Stable Models.
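The count of four follows directly from the transition constraint: each of the four species has exactly one "other" genus to move to. A minimal sketch in plain Python (standing in for the ASP choice over scenarios) enumerates them. The species and genus names are those used on the model slides; the dictionary layout and function name are assumptions made for illustration, and the higher-level taxa of the nine-taxon use case are omitted.

```python
# Genus placement of the four species at t = 0, using the names from the slides.
PLACEMENT_T0 = {
    "primus": "Agenus",
    "tertius": "Agenus",
    "secundus": "Ogenus",
    "quartus": "Ogenus",
}


def one_species_moves(placement):
    """Return every revision in which exactly one species changes genus."""
    genera = sorted(set(placement.values()))
    scenarios = []
    for species, genus in sorted(placement.items()):
        for target in genera:
            if target != genus:
                revised = dict(placement)  # copy, so t = 0 stays untouched
                revised[species] = target
                scenarios.append((species, target, revised))
    return scenarios


for species, target, revised in one_species_moves(PLACEMENT_T0):
    print(f"{species} -> {target}: {revised}")
```

Each tuple corresponds to one candidate taxonomy at t = 1; in the ASP setting each would appear as a separate Stable Model once the nomenclatural rules have been applied.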

Model 1: O. secundus moves into Agenus

• Requires new higher-level synonymies, "cascading", new names, new types

[Figure: taxonomy at t = 0 and at t = 1]

Required nomenclatural changes; O. secundus is a type bearer.

Model 2: A. tertius moves into Ogenus

• Non-type bearer – 1 taxonomic change ↔ 1 new combination

[Figure: taxonomy at t = 0 and at t = 1]

Model 3: O. quartus moves into Agenus

• Non-type bearer – 1 taxonomic change ↔ 1 new combination

[Figure: taxonomy at t = 0 and at t = 1]

Model 4: A. primus "moves" [Ogenus spp. ingress into Agenus]

• Most dramatic nomenclatural adjustments – A. primus is globally oldest type

[Figure: taxonomy at t = 0 and at t = 1]

Two species (names) – secundus & quartus – move into Agenus.
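The contrast between the four models can be approximated at the genus level with a short plain-Python sketch (not the authors' ASP program). It rests on loudly stated assumptions: a genus takes the oldest genus name whose type species it contains; a genus left without any type species needs a new name; the publication years are hypothetical and only encode that Agenus has priority over Ogenus, as Model 4 implies. Tribe and family level changes, which account for further changes in the full use case, are not modeled.

```python
# Genus placement at t = 0, using the names shown on the slides.
PLACEMENT_T0 = {
    "primus": "Agenus",
    "tertius": "Agenus",
    "secundus": "Ogenus",
    "quartus": "Ogenus",
}

# Type species are taken from the slides; the years are HYPOTHETICAL and only
# express the assumption that Agenus is the older genus name.
GENUS_NAMES = {
    "Agenus": {"type": "primus", "year": 1800},
    "Ogenus": {"type": "secundus", "year": 1810},
}

NEW_NAME_NEEDED = "<new genus name needed>"


def valid_genus_name(members):
    """Oldest genus name whose type species is among `members` (Priority + Typification)."""
    candidates = [
        (info["year"], name)
        for name, info in GENUS_NAMES.items()
        if info["type"] in members
    ]
    return min(candidates)[1] if candidates else NEW_NAME_NEEDED


def names_after_move(species, target):
    """Move one species at t = 1 and return the genus name each species ends up with."""
    revised = dict(PLACEMENT_T0)
    revised[species] = target
    groups = {}
    for sp, genus in revised.items():
        groups.setdefault(genus, set()).add(sp)
    named = {}
    for members in groups.values():
        genus_name = valid_genus_name(members)
        for sp in members:
            named[sp] = genus_name
    return named


def changed_species(species, target):
    """Species whose genus name at t = 1 differs from their genus name at t = 0."""
    named = names_after_move(species, target)
    return sorted(sp for sp in named if named[sp] != PLACEMENT_T0[sp])


for species, target in [("secundus", "Agenus"), ("tertius", "Ogenus"),
                        ("quartus", "Agenus"), ("primus", "Ogenus")]:
    print(species, "->", target, ":", changed_species(species, target))
```

Under these assumptions the two non-type bearers (tertius, quartus) each produce a single changed combination, while moving a type bearer affects other species as well. For primus, the sketch reports secundus and quartus taking the name Agenus, in line with Model 4, and flags tertius as needing a new genus name; that last point is an output of this simplified model rather than something stated on the slides.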

Modeling in ASP

Does it work? It does.

Current ASP program properly resolves all 4 models*

* Output optics notwithstanding; actual tree visualization in progress.

Conclusion – ASP can logically represent key rules of nomenclature

1. Binominal Nomenclature

• The scientific name of a species, and not…

2. Priority

• The valid name of a taxon is the oldest….

3. Coordination

• Within the [family, genus, species] group, a name established for a taxon at any rank is simultaneously established with the same author/date for taxa with the same name-bearing type at other ranks in the group.

4. First Reviser

• The relative precedence of two or more names or nomenclatural acts published on the same date, or of different original spellings of the same name, is determined by the First Reviser.

5. Homonymy

• The name of each taxon must be unique.

6. Typification

• Each nominal taxon in the family group, genus group or species group has a name-bearing type fixed to provide the objective standard of reference by which the application of the name is determined.

7. [Gender Agreement]

• Agreement in grammatical gender between a generic name and Latin or latinized adjectival or participial species-group names combined with it originally or subsequently.

= Principles currently modeled.

Likely feasible.

Likely feasible.

Extension of Priority.

ASP code sample – modeling priority, new combination, synonymy

Next up – improved output visualization, more complex cases

• "20-name/taxon use case" can include 36 *one-species-moves* permutations

• Compute, tabulate, visualize complete set of nomenclatural changes for each

• At the genus level, moving entire non-type genera requires no name change

Conclusions & outlook

1. This work is a novel representation of the Principles of Nomenclature in a formal logic system with default conditions and transitional properties.

2. The model can be elaborated to include an increasingly wide range of taxonomic / nomenclatural change scenarios, and specific rule exceptions.

3. ASP could be utilized to validate proposed nomenclatural emendations or infer additional required changes, and implemented in a nomenclatural repository such as ZooBank.

4. In complex change scenarios, ASP could be used to perform optimizations and minimize nomenclatural instability given the need to move one or more taxa.

Acknowledgments

• TDWG 2013 Symposium organizers – John Deck, Mark Schildhauer, Ramona Walls

• Stanley Blum, David Patterson, Richard Pyle – nomenclatural use case input

• Euler team, UC Davis – Bertram Ludäscher, Mingmin Chen – ASP support

http://taxonbytes.org
https://sols.asu.edu

What is ASP? – introductory reading list & links

Brewka, G., T. Eiter & M. Truszczyński. 2011. Answer set programming at a glance. Communications of the ACM 54: 92-103. Available at http://people.scs.carleton.ca/~bertossi/KR11/material/communications201112ASP.pdf

Eiter, T. 2008. Answer Set Programming in a nutshell. Available at http://gradlog.informatik.uni-freiburg.de/gradlog/slides_ak/eiter_asp.pdf

Gelfond, M. 2008. Answer sets; pp. 285-316. In: van Harmelen, F., V. Lifschitz & B. Porter. Handbook of Knowledge Representation. Elsevier. Available at http://www.depts.ttu.edu/cs/research/krlab/pdfs/papers/gel07b.pdf

Gebser, M., B. Kaufmann, R. Kaminski, M. Ostrowski, T. Schaub & M. Schneider. 2011. Potassco: the Potsdam Answer Set Solving Collection. Available at http://www.cs.uni-potsdam.de/wv/pdfformat/gekakaosscsc11a.pdf

Lifschitz, V. 2008. What is Answer Set Programming? Available at http://www.cs.utexas.edu/~ai-lab/pubs/wiasp.pdf

Potassco Group website: http://potassco.sourceforge.net/ (programs, tutorials)