Automated Reasoning for Classifying Finite Algebras
Simon Colton
Computational Bioinformatics Laboratory
Imperial College, London
Joint work with
• Roy McCasland (Edinburgh)
– Mathematical insights
• Andreas Meier (Saarbrücken)
– Theorem proving expertise
• Volker Sorge (Birmingham)
– ATP and CAS expertise
• Truly collaborative
– i.e., I may not be able to answer some questions
Classification of Finite Algebras
• Major driving force in mathematics
– E.g., Kronecker’s 1870 classification of Abelian groups
– Also, 1980 classification of finite simple groups
• For loops and quasigroups, etc.
– Large numbers of isomorphism/isotopy classes
– E.g., 109 loops of size 6, 1411 quasigroups of size 5
• Computational approaches have been used
– In a quantitative, rather than a qualitative way
– E.g., existence of QGX quasigroups of certain sizes
The Task We Set Ourselves
• Write a system which can…
• Be given only a particular size and an algebraic specification (in terms of a set of axioms)
• And produce a fully verified classification theorem
– Which can be used to classify algebras of that size
• Up to isomorphism
• As a simple example
– Given the axioms of group theory and the size 6
– Our system proves that groups of size 6 fall into exactly two isomorphism classes: the Abelian ones and the non-Abelian ones
The Tools We Used
• Automated Reasoning:
– Spass theorem prover
– MACE-4 model generator
– Omega proof planning system
• Machine Learning:
– HR automated theory formation system
– C4.5 decision tree learner
• Computer Algebra
– Gap system
Why Machine Learning?
• Why are these two algebras non-isomorphic?
• Did you use deduction (only) to show this?
• My problem with the term “automated reasoning”
• Doesn’t include inductive reasoning
* | a b c d
--+--------
a | a b c d
b | b a c d
c | c b a d
d | d b c a

* | a b c d
--+--------
a | a b c d
b | b d c a
c | c b a d
d | a b c d
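The question on this slide can be made concrete with a small sketch. It encodes the two tables (a, b, c, d as 0, 1, 2, 3; this encoding and all function names are illustrative, not part of the original system), then contrasts a purely exhaustive check over all 24 relabellings with a single spotted invariant, the number of idempotent elements.

```python
from itertools import permutations

# The two tables from the slide, with a, b, c, d encoded as 0, 1, 2, 3.
T1 = [[0, 1, 2, 3], [1, 0, 2, 3], [2, 1, 0, 3], [3, 1, 2, 0]]
T2 = [[0, 1, 2, 3], [1, 3, 2, 0], [2, 1, 0, 3], [0, 1, 2, 3]]

def is_isomorphism(p, s, t):
    """True if the relabelling p maps table s onto table t."""
    n = len(s)
    return all(p[s[x][y]] == t[p[x]][p[y]] for x in range(n) for y in range(n))

def isomorphic(s, t):
    """Exhaustive check: try every bijection between the element sets."""
    return any(is_isomorphism(p, s, t) for p in permutations(range(len(s))))

def idempotents(t):
    """Elements x with x*x = x; their number is preserved by isomorphism."""
    return [x for x in range(len(t)) if t[x][x] == x]

print(isomorphic(T1, T2))
print(len(idempotents(T1)), len(idempotents(T2)))
```

The exhaustive search only says that no mapping works. The idempotent count gives a reason: the first table has one idempotent element and the second has two, which is the kind of property an inductive step has to find before deduction can verify it.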
The HR System
• Starts with minimal information
– E.g., dividing two numbers, ring theory axioms
• Produces a rich theory containing:
– Examples, concepts, conjectures, proofs
• 15 generic production rules form concepts
• 20+ measures of interestingness
– Drive a best-first search
• Conjecture making performed empirically
• Theorem proving/disproving by third party software
– Usually Otter and MACE
Approach One
• Use MACE (+isofilter) to produce:
– A single example of each isomorphism class
• Use HR to form a theory:
– With a concept describing each class uniquely
• Use Spass to:
– Verify MACE’s results
• That each example satisfies axioms
• Every algebra is isomorphic to one of the classes
– Verify HR’s results
• That each example has the concept’s property
– Prove that each concept is a classifier
• Discriminant and isomorphism-class theorems are true
Approaches Two and Three
• Same as approach 1
• But HR allowed to stop before it has found a classifying concept for each class
– In many cases, this is necessary
• Approach 2: use Prolog to combine concepts
• Approach 3: use C4.5 to learn a decision tree
– Problem: sometimes sub-optimal trees produced
Example Discriminating Concept
• First one:
– Idempotent element appearing twice on the diagonal
Difficulties and Lessons Learned
• Difficulty 1:
– MACE intermediate files > 4GB
– Solution: don’t require generation of all isomorphism classes
• Difficulty 2:
– HR has trouble with more than 6 or 7 examples
– Solution: only use HR to discriminate a few examples (pairs)
• Difficulty 3:
– Spass has trouble with sizes greater than 6 or 7
– (Partial) solution: use CAS to describe problem in terms of generators and relations (decrease potential mappings)
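One plausible reading of "decrease potential mappings" is the following arithmetic; this interpretation is an assumption and is not spelled out on the slide. Without further structure, an isomorphism between two algebras of size n is one of n! bijections. If an algebra is generated by k elements, a structure-preserving map is fixed by where those k generators go, so there are at most n(n-1)...(n-k+1) candidates. The function names below are illustrative.

```python
from math import factorial, perm

def bijection_count(n):
    """Number of bijections between two n-element algebras."""
    return factorial(n)

def generator_image_count(n, k):
    """Number of ways to send k distinct generators to distinct elements."""
    return perm(n, k)

# Compare the two search spaces for a few sizes and generator counts.
for n, k in [(6, 2), (8, 2), (8, 3)]:
    print(n, k, bijection_count(n), generator_image_count(n, k))
```

For size 8 with two generators this is 56 candidate assignments instead of 40320 bijections, which suggests why a generators-and-relations description might ease the prover's task.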
Approach Four (Bootstrapping)
• Want fully automated decision tree process
– See IJCAR’04 paper for full algorithm description
• Step 1: MACE produces a non-isomorphic pair
• Step 2: HR discriminates the pair
• Step 3: Spass proves that some discriminants are actually classifiers
• Step 4: For non-classifiers, use MACE to produce a non-iso pair which both have the property
– If successful, go to step 2
– If not, use Spass to prove it’s a dead-end
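The four steps above can be sketched in miniature for quasigroups of size 3. This is a toy under stated assumptions, not the algorithm from the IJCAR'04 paper: brute-force enumeration stands in for MACE, a small hand-written pool of concepts (`CONCEPTS`, hypothetical) stands in for HR, and exhaustive checking stands in for Spass. Unlike the real process, it enumerates every isomorphism class up front, which is only feasible for tiny sizes.

```python
from itertools import permutations, product

def quasigroups(n):
    """Every Latin square of order n (brute-force stand-in for a model generator)."""
    rows = list(permutations(range(n)))            # each row is a permutation
    for t in product(rows, repeat=n):
        if all(len({row[j] for row in t}) == n for j in range(n)):  # columns too
            yield t

def canonical(t):
    """Least relabelling of t; tables are isomorphic iff these forms are equal."""
    n = len(t)
    forms = []
    for p in permutations(range(n)):
        s = [[None] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                s[p[x]][p[y]] = p[t[x][y]]
        forms.append(tuple(map(tuple, s)))
    return min(forms)

def elems(t):
    return range(len(t))

# Hypothetical concept pool; every entry is invariant under isomorphism.
CONCEPTS = {
    "commutative": lambda t: all(t[x][y] == t[y][x] for x in elems(t) for y in elems(t)),
    "has an idempotent": lambda t: any(t[x][x] == x for x in elems(t)),
    "all idempotent": lambda t: all(t[x][x] == x for x in elems(t)),
    "has a left identity": lambda t: any(all(t[e][x] == x for x in elems(t)) for e in elems(t)),
    "has a right identity": lambda t: any(all(t[x][e] == x for x in elems(t)) for e in elems(t)),
}

def build_tree(reps):
    """reps: pairwise non-isomorphic tables that satisfy the path so far."""
    if len(reps) <= 1:                       # Step 3 stand-in: the path is a classifier
        return ("class", reps)
    a, b = reps[0], reps[1]                  # Steps 1 and 4: a non-isomorphic pair
    for name, holds in CONCEPTS.items():     # Step 2: discriminate the pair
        if holds(a) != holds(b):
            yes = [r for r in reps if holds(r)]
            no = [r for r in reps if not holds(r)]
            return ("split", name, build_tree(yes), build_tree(no))
    return ("unresolved", reps)              # concept pool too weak for this pair

def leaves(tree):
    return [tree] if tree[0] != "split" else leaves(tree[2]) + leaves(tree[3])

REPS = sorted({canonical(t) for t in quasigroups(3)})
TREE = build_tree(REPS)
print(len(REPS), [kind for kind, _ in leaves(TREE)])
```

Each split puts the chosen pair on opposite branches, so both branches are strictly smaller and the recursion terminates. A leaf marked "unresolved" would mean the concept pool needs extending, which is the role HR plays in the real system.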
Example Decision Tree
Nice Result in Group Theory (Produced by Approach 1)
Class 1:
-(exists b (-(inv(b)=b)))
Class 2:
exists b c (-(inv(b)=b) & c*c=b)
Class 3:
-(exists b (inv(b)=b & -(exists c d (commutator(d,c)=b))))
Class 4:
exists b c d (b*c=d & -(c*b=d) & inv(d)=d)
Class 5:
none of the above
In English…
Groups of order 8 can be classified according to the self-inverse elements (inv(x)=x) they contain. Such a group will have either:
(i) only self-inverse elements
(ii) an element which squares to give a non-self-inverse element
(iii) no self-inverse elements which aren't also commutators
(iv) a self-inverse element which can be expressed as the product of two non-commuting elements
(v) none of these properties
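The five properties can be checked directly against the five groups of order 8. The sketch below builds those groups with standard textbook constructions (tuples under modular addition, a semidirect-product rule for the dihedral group, Hamilton's product for the quaternion group); these constructions, the group labels, and the choice of d·c·inv(d)·inv(c) as the commutator are assumptions, not taken from the slides. Properties are tested in order, so class 5 means that none of the first four hold.

```python
from itertools import product

def make_group(elements, mul):
    """Bundle a finite group as (elements, operation, inverse map)."""
    elements = list(elements)
    e = next(x for x in elements if all(mul(x, y) == y for y in elements))
    inv = {x: next(y for y in elements if mul(x, y) == e) for x in elements}
    return elements, mul, inv

def add_mod(mods):
    return lambda a, b: tuple((x + y) % m for x, y, m in zip(a, b, mods))

def dihedral_mul(a, b):
    """(r1, s1) * (r2, s2) in the dihedral group with rotations mod 4."""
    (r1, s1), (r2, s2) = a, b
    return ((r1 + (-r2 if s1 else r2)) % 4, s1 ^ s2)

def quat_mul(a, b):
    """Hamilton product on (w, x, y, z)."""
    return (a[0]*b[0] - a[1]*b[1] - a[2]*b[2] - a[3]*b[3],
            a[0]*b[1] + a[1]*b[0] + a[2]*b[3] - a[3]*b[2],
            a[0]*b[2] - a[1]*b[3] + a[2]*b[0] + a[3]*b[1],
            a[0]*b[3] + a[1]*b[2] - a[2]*b[1] + a[3]*b[0])

UNITS = [tuple(s if k == j else 0 for k in range(4)) for j in range(4) for s in (1, -1)]

GROUPS = {
    "Z2xZ2xZ2": make_group(product(range(2), repeat=3), add_mod((2, 2, 2))),
    "Z8": make_group([(i,) for i in range(8)], add_mod((8,))),
    "Q8": make_group(UNITS, quat_mul),
    "D4": make_group(product(range(4), range(2)), dihedral_mul),
    "Z4xZ2": make_group(product(range(4), range(2)), add_mod((4, 2))),
}

def classify(group):
    """Return the first of classes 1-4 whose property holds, else 5."""
    els, mul, inv = group
    self_inv = [b for b in els if inv[b] == b]
    commutators = {mul(mul(d, c), mul(inv[d], inv[c])) for d in els for c in els}
    if len(self_inv) == len(els):                                # class 1
        return 1
    if any(inv[mul(c, c)] != mul(c, c) for c in els):            # class 2
        return 2
    if all(b in commutators for b in self_inv):                  # class 3
        return 3
    if any(mul(b, c) != mul(c, b) and inv[mul(b, c)] == mul(b, c)
           for b in els for c in els):                           # class 4
        return 4
    return 5

print({name: classify(g) for name, g in GROUPS.items()})
```

Each group lands in a different class, which is what the classification theorem claims for these five representatives; the theorem itself additionally needs the proof that every group of order 8 is isomorphic to one of them.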
Classification Theorems Produced Using Approach 4
• Generated classifying theorems for
– Groups of size 4 (#2), 6 (#2), 8 (#5)
– Loops of size 4 (#2), 5 (#6), 6 (#109)
– Quasigroups
• Of size 3 (#5), 4 (#35), 5 (#1411)
– Monoids of size 3 (#7)
– QG4-quasigroups of size 5 (#4)
– QG5-quasigroups of size 7 (#3)
Conclusions
• Computers can help in classification tasks
– In a qualitative, as well as quantitative way
– Can produce fully verified classification theorems
• Cannot be achieved by deduction alone
– Our approach requires deduction (ATP), induction (ML), and symbolic manipulation (CAS)
– Long live the Calculemus project!!
• Application to model generation (please ask)
– Results are not conclusive yet…
Future Work #1
• Improve the current system
– By trying out different tools/methods
• SEM, FINDER for model generation
• SAT solvers for the ATP tasks
• Progol (ILP) for machine learning tasks
– First test: 68% success (HR was 96%)
• Look at different domains
– Possibly domains associated with Zariski spaces
• Also look at isotopy as well as isomorphism
Future Work #2
• Produce general classification theorems
• Analysis of trees produced so far
– Important concepts, etc.
• Generalise results over sizes
– One possibility:
• Use smaller size decision trees as seeds for the larger trees
• Determine families and parameterisations of the family members
– Use the counting abilities of HR
• May be difficult for first order provers
Future Work #3
• Look at sub-algebra structures/mappings
• E.g., centre of a group forms a subgroup
– Look for more specific results than this
• Look for algebras embedded within others
– HR has abilities to do this
– May be a tough problem for theorem proving
• Build up an “Atlas” for loops & quasigroups
• Start building more constructive classification results
– E.g., using cross products, etc.
Future Work #4
• Find mathematical applications of this
• Any help…?