
Automated Reasoning for Classifying Finite Algebras

Simon Colton

Computational Bioinformatics Laboratory

Imperial College, London

Joint work with

• Roy McCasland (Edinburgh)

– Mathematical insights

• Andreas Meier (Saarbrucken)

– Theorem proving expertise

• Volker Sorge (Birmingham)

– ATP and CAS expertise

• Truly collaborative

– i.e., I may not be able to answer some questions

Classification of Finite Algebras

• Major driving force in mathematics

– E.g., Kronecker’s 1870 classification of Abelian groups

– Also, 1980 classification of finite simple groups

• For loops and quasigroups, etc.

– Large numbers of isomorphism/isotopy classes

– E.g., 109 loops of size 6, 1411 quasigroups of size 5

• Computational approaches have been used

– In a quantitative, rather than a qualitative way

– E.g., existence of QGX quasigroups of certain sizes

The Task We Set Ourselves

• Write a system which can…

• Be given only a particular size and an algebraic specification (in terms of a set of axioms)

• And produce a fully verified classification theorem

– Which can be used to classify algebras of that size

• Up to isomorphism

• As a simple example

– Given the axioms of group theory and the size 6

– Our system proves that groups of size six are either Abelian or non-Abelian up to isomorphism

The Tools We Used

• Automated Reasoning:

– Spass theorem prover

– MACE-4 model generator

– Omega proof planning system

• Machine Learning:

– HR automated theory formation system

– C4.5 decision tree learner

• Computer Algebra

– Gap system

Why Machine Learning?

• Why are these two algebras non-isomorphic?

• Did you use deduction (only) to show this?

• My problem with the term “automated reasoning”

• Doesn’t include inductive reasoning

First algebra:

* | a b c d
a | a b c d
b | b a c d
c | c b a d
d | d b c a

Second algebra:

* | a b c d
a | a b c d
b | b d c a
c | c b a d
d | a b c d
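A minimal sketch of the two ways to answer the question above, assuming the tables are read row by column with a, b, c, d encoded as 0, 1, 2, 3. The function names `find_isomorphism` and `diagonal_profile` are illustrative and not part of any tool in the talk. The first function is the purely deductive route (try every bijection); the second computes a simple invariant of the kind an inductive system might propose: how often each value occurs on the diagonal.

```python
from collections import Counter
from itertools import permutations

# The two tables from the slide, with a, b, c, d encoded as 0, 1, 2, 3.
T1 = [[0, 1, 2, 3],
      [1, 0, 2, 3],
      [2, 1, 0, 3],
      [3, 1, 2, 0]]
T2 = [[0, 1, 2, 3],
      [1, 3, 2, 0],
      [2, 1, 0, 3],
      [0, 1, 2, 3]]

def find_isomorphism(t1, t2):
    """Brute force: return a bijection p with p(x*y) = p(x)*p(y), or None."""
    n = len(t1)
    for p in permutations(range(n)):
        if all(p[t1[x][y]] == t2[p[x]][p[y]]
               for x in range(n) for y in range(n)):
            return p
    return None

def diagonal_profile(table):
    """Sorted multiplicities of the values x*x; unchanged by any isomorphism."""
    counts = Counter(table[x][x] for x in range(len(table)))
    return sorted(counts.values())

print(find_isomorphism(T1, T2))                    # exhaustive check
print(diagonal_profile(T1), diagonal_profile(T2))  # one-line explanation
```

The exhaustive search only reports that no mapping works. The invariant gives a reason: every element of the first algebra squares to the same value, while the squares of the second take two different values.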

The HR System

• Starts with minimal information

– E.g., dividing two numbers, ring theory axioms

• Produces a rich theory containing:

– Examples, concepts, conjectures, proofs

• 15 Generic production rules form concepts

• 20+ Measures of interestingness

– Drive a best-first search

• Conjecture making performed empirically

• Theorem proving/disproving by third party software

– Usually Otter and MACE

Approach One

• Use MACE (+isofilter) to produce:

– A single example of each isomorphism class

• Use HR to form a theory:

– With a concept describing each class uniquely

• Use Spass to:

– Verify MACE’s results

• That each example satisfies axioms

• That every algebra is isomorphic to one of the classes

– Verify HR’s results

• That each example has the concept’s property

– Prove that each concept is a classifier

• Discriminant and isomorphism-class theorems are true

Approaches Two and Three

• Same as approach 1

• But HR allowed to stop before it has found a classifying concept for each class

– In many cases, this is necessary

• Approach 2: use Prolog to combine concepts

• Approach 3: use C4.5 to learn a decision tree

– Problem: sometimes sub-optimal trees produced

Example Discriminating Concept

• First one:

– Idempotent element appearing twice on the diagonal
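As a hedged illustration, the concept can be written as a predicate on a Cayley table. Two assumptions are made here: "twice" is read as "exactly twice", and the function name `idempotent_twice_on_diagonal` is made up for this sketch. The slide does not say which algebras the concept was found for, so the two 4-element tables from the "Why Machine Learning?" slide are reused purely as test data (a, b, c, d encoded as 0, 1, 2, 3).

```python
from collections import Counter

def idempotent_twice_on_diagonal(table):
    """True if some x with x*x = x occurs exactly twice on the diagonal.

    Assumption: 'twice' on the slide means 'exactly twice'.
    """
    n = len(table)
    counts = Counter(table[x][x] for x in range(n))
    return any(table[x][x] == x and counts[x] == 2 for x in range(n))

# Test data only: the two tables shown earlier in the talk.
T1 = [[0, 1, 2, 3], [1, 0, 2, 3], [2, 1, 0, 3], [3, 1, 2, 0]]
T2 = [[0, 1, 2, 3], [1, 3, 2, 0], [2, 1, 0, 3], [0, 1, 2, 3]]

print(idempotent_twice_on_diagonal(T1))
print(idempotent_twice_on_diagonal(T2))
```

Under this reading the property holds for the second table (a is idempotent and is the square of both a and c) but not for the first, where a is the square of all four elements.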

Difficulties and Lessons Learned

• Difficulty 1:

– MACE intermediate files > 4GB

– Solution: don’t require generation of all isomorphism classes

• Difficulty 2:

– HR has trouble with more than 6 or 7 examples

– Solution: only use HR to discriminate a few examples (pairs)

• Difficulty 3:

– Spass has trouble with sizes greater than 6 or 7

– (Partial) solution: use CAS to describe problem in terms of generators and relations (decrease potential mappings)

Approach Four (Bootstrapping)

• Want fully automated decision tree process

– See IJCAR’04 paper for full algorithm description

• Step 1: MACE produces a non-isomorphic pair

• Step 2: HR discriminates the pair

• Step 3: Spass proves that some discriminants are actually classifiers

• Step 4: For non-classifiers, use MACE to produce a non-iso pair which both have the property

– If successful, go to step 2

– If not, use Spass to prove it’s a dead-end
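The loop in steps 1 to 4 can be mimicked on a toy scale. The sketch below is not the IJCAR'04 algorithm. It substitutes exhaustive enumeration of size-3 quasigroups for MACE, a fixed hand-written pool of four invariants (`PROPERTIES`) for HR, and brute-force isomorphism checks for Spass. All function names are assumptions made for illustration.

```python
from itertools import permutations, product

def latin_squares(n):
    """All quasigroup tables of size n (rows and columns are permutations)."""
    rows = list(permutations(range(n)))
    for table in product(rows, repeat=n):
        if all(len({table[r][c] for r in range(n)}) == n for c in range(n)):
            yield table

def isomorphic(t1, t2):
    n = len(t1)
    return any(all(p[t1[x][y]] == t2[p[x]][p[y]]
                   for x in range(n) for y in range(n))
               for p in permutations(range(n)))

def find_non_iso_pair(models):
    """Stand-in for the MACE step: two non-isomorphic models, or None."""
    for i, a in enumerate(models):
        for b in models[i + 1:]:
            if not isomorphic(a, b):
                return a, b
    return None

# Stand-in for HR's theory: a small fixed pool of isomorphism invariants.
PROPERTIES = {
    "commutative": lambda t: all(t[x][y] == t[y][x]
                                 for x in range(len(t)) for y in range(len(t))),
    "has_left_identity": lambda t: any(all(t[e][x] == x for x in range(len(t)))
                                       for e in range(len(t))),
    "has_idempotent": lambda t: any(t[x][x] == x for x in range(len(t))),
    "all_idempotent": lambda t: all(t[x][x] == x for x in range(len(t))),
}

def discriminate(a, b):
    """Stand-in for the HR step: a property true of exactly one of a, b."""
    for name, prop in PROPERTIES.items():
        if prop(a) != prop(b):
            return name
    return None

def build_tree(models):
    pair = find_non_iso_pair(models)            # steps 1 and 4
    if pair is None:                            # branch cannot be split further
        return {"leaf": models}
    name = discriminate(*pair)                  # step 2
    if name is None:
        raise ValueError("property pool too small to split this pair")
    prop = PROPERTIES[name]
    return {"test": name,
            "yes": build_tree([m for m in models if prop(m)]),
            "no": build_tree([m for m in models if not prop(m)])}

def leaves(tree):
    if "leaf" in tree:
        return [tree["leaf"]]
    return leaves(tree["yes"]) + leaves(tree["no"])

tree = build_tree(list(latin_squares(3)))
print(len(leaves(tree)))  # number of isomorphism classes found
```

Because every property in the pool is invariant under isomorphism, each leaf ends up holding exactly one isomorphism class. For size 3 this gives five leaves, matching the count of size-3 quasigroups reported in the results.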

Example Decision Tree

Nice Result in Group Theory (Produced by Approach 1)

Class 1:

-(exists b (-(inv(b)=b)))

Class 2:

exists b c (-(inv(b)=b) & c*c=b)

Class 3:

-(exists b (inv(b)=b & -(exists c d (commutator(d,c)=b))))

Class 4:

exists b c d (b*c=d & -(c*b=d) & inv(d)=d)

Class 5:

none of the above

In English…

Groups of order 8 can be classified according to the self-inverse (inv(x)=x) elements they contain. They will have either:

(i) only self-inverse elements

(ii) an element which squares to give a non-self-inverse element

(iii) no self-inverse elements which aren't also commutators

(iv) a self-inverse element which can be expressed as the product of two non-commuting elements

(v) none of these properties
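The five cases can be checked by hand-building the five groups of order 8 and evaluating the four formulas in order. This is a sketch under stated assumptions: the encodings of D4 and Q8 as pairs (r, s), the helper names `make_group`, `abelian` and `classify`, and the convention commutator(d,c) = inv(d)*inv(c)*d*c are choices made here, not taken from the talk.

```python
from itertools import product

def make_group(elems, mul):
    """Return (Cayley table, inverse list) over indices 0..n-1."""
    elems = list(elems)
    index = {e: i for i, e in enumerate(elems)}
    n = len(elems)
    table = [[index[mul(a, b)] for b in elems] for a in elems]
    e = next(i for i in range(n) if all(table[i][x] == x for x in range(n)))
    inv = [next(y for y in range(n) if table[x][y] == e) for x in range(n)]
    return table, inv

def abelian(*moduli):
    """Direct product of cyclic groups, e.g. abelian(4, 2) is Z4 x Z2."""
    elems = product(*(range(m) for m in moduli))
    return make_group(
        elems, lambda a, b: tuple((x + y) % m for x, y, m in zip(a, b, moduli)))

def sign(bit):
    return -1 if bit else 1

PAIRS = list(product(range(4), range(2)))
# Dihedral group: (r, s) stands for rotation^r * reflection^s.
D4 = make_group(PAIRS, lambda x, y: ((x[0] + sign(x[1]) * y[0]) % 4,
                                     (x[1] + y[1]) % 2))
# Quaternion group: (r, s) stands for i^r * j^s, with j*j = i*i.
Q8 = make_group(PAIRS, lambda x, y: ((x[0] + sign(x[1]) * y[0]
                                      + 2 * x[1] * y[1]) % 4,
                                     (x[1] + y[1]) % 2))

def classify(group):
    """Evaluate the Class 1-4 formulas in order; fall through to Class 5."""
    t, inv = group
    G = range(len(t))
    # Assumed convention: commutator(d, c) = inv(d) * inv(c) * d * c.
    commutators = {t[t[inv[d]][inv[c]]][t[d][c]] for d in G for c in G}
    if all(inv[b] == b for b in G):
        return 1
    if any(inv[t[c][c]] != t[c][c] for c in G):
        return 2
    if all(b in commutators for b in G if inv[b] == b):
        return 3
    if any(t[b][c] != t[c][b] and inv[t[b][c]] == t[b][c]
           for b in G for c in G):
        return 4
    return 5

for name, g in [("Z2^3", abelian(2, 2, 2)), ("Z8", abelian(8)), ("Q8", Q8),
                ("D4", D4), ("Z4xZ2", abelian(4, 2))]:
    print(name, classify(g))
```

Each of the five groups lands in a different class, which is what it means for the four properties to act as a classifier for this size. The sketch only checks the five known representatives. Proving that every group of order 8 falls into one of these classes is the part the talk delegates to the theorem prover.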

Classification Theorems Produced Using Approach 4

• Generated classifying theorems for

– Groups of size 4 (#2), 6 (#2), 8 (#5)

– Loops of size 4 (#2), 5 (#6), 6 (#109)

– Quasigroups of size 3 (#5), 4 (#35), 5 (#1411)

– Monoids of size 3 (#7)

– QG4-quasigroups of size 5 (#4)

– QG5-quasigroups of size 7 (#3)

Conclusions

• Computers can help in classification tasks

– In a qualitative, as well as quantitative way

– Can produce fully verified classification theorems

• Cannot be achieved by deduction alone

– Our approach requires deduction (ATP), induction (ML), and symbolic manipulation (CAS)

– Long live the Calculemus project!!

• Application to model generation (please ask)

– Results are not conclusive yet…

Future Work #1

• Improve the current system

– By trying out different tools/methods

• SEM, FINDER for model generation

• SAT solvers for the ATP tasks

• Progol (ILP) for machine learning tasks

– First test: 68% success (HR was 96%)

• Look at different domains

– Possibly domains associated with Zariski spaces

• Also look at isotopy as well as isomorphism

Future Work #2

• Produce general classification theorems

• Analysis of trees produced so far

– Important concepts, etc.

• Generalise results over sizes

– One possibility:

• Use smaller size decision trees as seeds for the larger trees

• Determine families and parameterisations of the family members

– Use the counting abilities of HR

• May be difficult for first order provers

Future Work #3

• Look at sub-algebra structures/mappings

• E.g., centre of a group forms a subgroup

– Look for more specific results than this

• Look for algebras embedded within others

– HR has abilities to do this

– May be a tough problem for theorem proving

• Build up an “Atlas” for loops & quasigroups

• Start building more constructive classification results

– E.g., using cross products, etc.

Future Work #4

• Find mathematical applications of this

• Any help…?
