
ILP for Mathematical Discovery

Simon Colton & Stephen Muggleton

Computational Bioinformatics Laboratory

Imperial College

The Automation of Reasoning

Aims for the talk
– Discuss a new ILP algorithm (ATF)
• and its implementation in the HR system
– Promote maths as a domain for ILP research

[Diagram labels: Automated Theorem Proving, Machine Learning, Maths, Bioinformatics, Automated Reasoning]

From Prediction to Description

[Diagram labels: Predictive tasks, Descriptive tasks; Supervised learning, Unsupervised learning; "Know what you’re looking for", "Don’t know what you’re looking for", "Don’t know you’re even looking"]

A Partial Characterisation of Learning Tasks

Concept learning
Outlier/anomaly detection
Clustering
Concept formation
Conjecture making
Theory formation

The HR Program in Overview

Embodies a novel ILP algorithm
– We call this “Automated Theory Formation” (ATF)
– Designed for descriptive tasks (in maths)
• But has had applications to concept learning tasks
– Incrementally builds a theory
• Containing association and classification rules

HR has numerous tools for the user
– To extract information from the theory generated
• Which is relevant to the task at hand

ATF Overview

Invent new concepts
Derive classification rule from concept
Induce hypotheses relating the concepts
Prove/disprove the relationships
– Deductively
• Using state-of-the-art ATP/model generators
Extract association rules
– From the hypotheses

Input to HR

Five inputs to HR
– Objects of interest (graphs, groups, etc.)
– A labelling of the objects
• If the task at hand is predictive…
– Background predicates (Prolog style)
– Axioms relating predicates (ATP style)
– Termination conditions
• HR works as an anytime algorithm

User can supply
– Numerous different combinations of these

See paper

Representation of Theory Contents

Three types of frames
– All have a clausal definition slot

Example frame

Concept frame
– Slot 1: range-restricted program clause
– Slot 2: success set
– Slot 3: classification rule afforded by definition
– Other slots: measures of value

Hypothesis frame
– Slot 1: clauses (association rules)
– Slot 2: proof/counterexample
– Other slots: details of the concepts related
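The concept and hypothesis frames above could be sketched as plain record types. This is a minimal illustration only: the class and field names are hypothetical, chosen to mirror the slot descriptions on the slide, and are not taken from the HR implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional, Tuple

@dataclass
class ConceptFrame:
    """Hypothetical stand-in for a concept frame; slot names are assumptions."""
    definition: str                      # slot 1: range-restricted program clause
    success_set: FrozenSet[Tuple]        # slot 2: tuples satisfying the definition
    classification: List[List]           # slot 3: classes afforded by the definition
    measures: Dict[str, float] = field(default_factory=dict)  # other slots

@dataclass
class HypothesisFrame:
    """Hypothetical stand-in for a hypothesis frame; slot names are assumptions."""
    clauses: List[str]                              # slot 1: association rules
    proof_or_counterexample: Optional[str] = None   # slot 2: filled once settled
    related_concepts: List[str] = field(default_factory=list)  # other slots

# Toy contents over the integers 1..6 (illustrative data, not from the talk).
evens = ConceptFrame(
    definition="even(X) :- integer(X), divisor(X,2).",
    success_set=frozenset({(2,), (4,), (6,)}),
    classification=[[1, 3, 5], [2, 4, 6]],
)
hypothesis = HypothesisFrame(
    clauses=["divisor(X,2) :- integer(X), divisor(X,4)."],
    related_concepts=["even", "multiple_of_4"],
)
print(evens.definition, hypothesis.proof_or_counterexample)
```

Leaving `proof_or_counterexample` empty until a prover or model generator reports back matches the slide's description of hypotheses that are first induced and only later proved or disproved.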

Cut Down Algorithm Description

Build new concept definition from old
• Using one of 12 generic production rules [PR] (see paper)

Find the success set, S, of new concept

If S is empty, derive non-existence hypothesis, H
• Extract association rules from H, try to prove/disprove

If S is a repeat, derive equivalence hypothesis, H
• Extract association rules from H, try to prove/disprove

If S is new
– Add new concept to theory
– Derive classification rule
– Derive implication & near-equivalence hypotheses
• Extract association rules, try to prove/disprove

Measure concepts in theory
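The three-way branch on the success set S can be sketched as follows. This is a simplified illustration under stated assumptions: the theory is a hypothetical dictionary from concept names to success sets, `atf_step` is an invented name, implication hypotheses are read off as strict subset relations between success sets, and production rules, near-equivalences, proving and measuring are all left out.

```python
def atf_step(theory, name, success_set):
    """One simplified theory-formation step (hypothetical helper).

    theory maps concept names to frozenset success sets.
    Returns (kind, name, related), where kind names the branch taken.
    """
    s = frozenset(success_set)

    # Empty success set: conjecture that no object satisfies the definition.
    if not s:
        return ("non-existence", name, None)

    # Repeated success set: conjecture equivalence with the existing concept.
    for old_name, old_s in theory.items():
        if old_s == s:
            return ("equivalence", name, old_name)

    # New success set: collect implication conjectures, then store the concept.
    implications = []
    for old_name, old_s in theory.items():
        if s < old_s:
            implications.append((name, old_name))   # new implies old
        elif old_s < s:
            implications.append((old_name, name))   # old implies new
    theory[name] = s
    return ("new-concept", name, implications)

# Toy run over the integers 1..10 (illustrative data, not from the talk).
theory = {}
print(atf_step(theory, "even", {2, 4, 6, 8, 10}))
print(atf_step(theory, "multiple_of_4", {4, 8}))
print(atf_step(theory, "odd_and_even", set()))
print(atf_step(theory, "twice_an_integer", {2, 4, 6, 8, 10}))
```

Every conjecture produced this way is only empirically true over the supplied objects, which is why the algorithm on the slide passes the extracted association rules to a prover or model generator.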

Concept Space Searched

Space determined by PRs, not language bias

Clausal definitions are:
– range-restricted, fully typed program clauses

Definition: n-connectedness
– Every variable appears in a body predicate with head variable n, or with an n-connected variable

Example:
– c(X,Y) :- p(X), q(Y), p(Z), r(X,Y), s(Y,Z) is 1-connected
– c(X,Y) :- r(Y), s(X,Z) is not 1-connected

HR’s definitions are all 1-connected
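The n-connectedness definition can be checked mechanically. The sketch below is one reading of the definition, not HR's code: it treats head variable n as the seed, repeatedly marks as connected any variable sharing a body literal with an already connected one, and treats every argument as a variable (constants are not handled). The clause encoding and the name `is_n_connected` are assumptions made for illustration.

```python
def is_n_connected(head_vars, body, n=1):
    """Check n-connectedness of a clause (hypothetical helper).

    head_vars: tuple of head variable names, e.g. ("X", "Y")
    body: list of (predicate, args) pairs, e.g. [("r", ("X", "Y"))]
    """
    connected = {head_vars[n - 1]}      # head variable n seeds the search
    changed = True
    while changed:
        changed = False
        for _, args in body:
            arg_set = set(args)
            # A literal touching a connected variable connects all its variables.
            if connected & arg_set and not arg_set <= connected:
                connected |= arg_set
                changed = True
    all_vars = set(head_vars) | {v for _, args in body for v in args}
    return all_vars <= connected

# The two clauses from the slide.
clause_a = [("p", ("X",)), ("q", ("Y",)), ("p", ("Z",)),
            ("r", ("X", "Y")), ("s", ("Y", "Z"))]
clause_b = [("r", ("Y",)), ("s", ("X", "Z"))]

print(is_n_connected(("X", "Y"), clause_a))   # X links to Y via r, Y to Z via s
print(is_n_connected(("X", "Y"), clause_b))   # Y never shares a literal with X
```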

Deriving Classification Rules

Given definition D
– Arity = n, head predicate = p, success set = S
– Classifying function over constants, o, is: f(o) = {(t1, ..., tn-1) : p(o, t1, ..., tn-1) ∈ S}

Classification, C, afforded by D:
– Put two objects in the same class if f(o1) = f(o2)

Theorem:
– If a definition D is not 1-connected, then a literal can be removed without changing the classification afforded by D
– So, HR’s search space is non-redundant with respect to C

Illustrative Example

concept17(X,Y) :- integer(X), integer(Y), divisor(X,Y), ¬ divisor(Y,2).

S17 = {(1,1), (2,1), (3,1), (3,3), (4,1), (5,1), (5,5), (6,1), (6,3)}

Classifying function:
– f17(1)={(1)} f17(2)={(1)} f17(3)={(1),(3)} f17(4)={(1)}
– f17(5)={(1),(5)} f17(6)={(1),(3)}

Classification afforded by concept 17:
– [ [1,2,4] [3,6] [5] ]
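The example can be reproduced in a few lines. This sketch assumes the objects of interest are the integers 1 to 6 and reads divisor(X,Y) as "Y divides X", which is the reading that matches the listed success set; the function names are illustrative and not part of HR.

```python
from collections import defaultdict

def divisor(x, y):
    """divisor(X,Y) in the slide's argument order: Y divides X."""
    return x % y == 0

objects = range(1, 7)   # assumed objects of interest: the integers 1..6

# concept17(X,Y): Y is a divisor of X, and 2 is not a divisor of Y.
success_set = {(x, y) for x in objects for y in objects
               if divisor(x, y) and not divisor(y, 2)}

def classifying_function(o, s):
    """Tails of the tuples in s whose first entry is the object o."""
    return frozenset(t[1:] for t in s if t[0] == o)

def classification(objs, s):
    """Group objects whose classifying-function values coincide."""
    classes = defaultdict(list)
    for o in objs:
        classes[classifying_function(o, s)].append(o)
    return sorted(classes.values())

print(sorted(success_set))
print(classification(objects, success_set))   # [[1, 2, 4], [3, 6], [5]]
```

In words, concept 17 pairs each integer with its odd divisors, so two integers land in the same class exactly when they have the same set of odd divisors.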

Mathematics Applications

Two applications given here
– Both from external research groups
– Data sets available online

See paper for details
– Of two more applications

Finding Discriminants

Finding discriminants of residue classes

Work with Sorge and Meier

Overall goal: classify algebraic structures
– Bottleneck: showing non-isomorphism

Learning task:
– Given two multiplication tables
– Find a property true of only one
• Which doesn’t refer to individual elements

Data set: 817 pairs of tables (size 5, 6, 10)

Results

HR was given 500 steps per task (~22 secs)
– Worked with four production rules

Found discriminants
– For 791 out of 817 pairs (~97%)
– Average of 20 discriminants per pair
– 517 distinct discriminants in total

Example above:
– Idempotent element (a*a=a)
• Appearing once on diagonal
– Only one of two discriminants found for pair
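A discriminant of the kind mentioned, "exactly one idempotent element", can be tested directly on a pair of tables. The sketch below uses two toy tables over {0,1,2,3} (addition and multiplication modulo 4); these are stand-ins chosen for illustration and are not taken from the 817-pair data set, and the function names are hypothetical.

```python
def idempotent_count(table):
    """Number of elements a with a*a = a, i.e. fixed points on the diagonal."""
    return sum(1 for a in range(len(table)) if table[a][a] == a)

def has_one_idempotent(table):
    """Candidate discriminant: exactly one idempotent element."""
    return idempotent_count(table) == 1

def is_discriminant(prop, table1, table2):
    """A property discriminates a pair if it holds for exactly one table."""
    return prop(table1) != prop(table2)

n = 4
add_mod4 = [[(a + b) % n for b in range(n)] for a in range(n)]
mul_mod4 = [[(a * b) % n for b in range(n)] for a in range(n)]

print(idempotent_count(add_mod4), idempotent_count(mul_mod4))
print(is_discriminant(has_one_idempotent, add_mod4, mul_mod4))
```

Because the property counts elements rather than naming them, it is preserved under relabelling, so a pair of tables that disagree on it cannot be isomorphic.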

Reformulation of CSPs

Work with Miguel and Walsh

Constraint satisfaction solving
– Very powerful general purpose technique
– Specifying a problem is still highly skilled

Learning task:
– Given solutions to small problems
• Find concepts to specialise the problem specification
• Find implied constraints to increase efficiency

Data set: QG-quasigroups (5 types)
– Multiplication tables up to size 6

Results

HR ran for an hour for each problem class
– Produced on average 150 association rules
– And 10 specialisation concepts

In each case, a better reformulation was derived (with human interpretation)
– Up to 10 times speed up in some cases

Nice example: QG3: (a*b)*(b*a)=a
– These are anti-abelian, i.e., a*b=b*a → a=b
– Symmetry relation: a*b=b → b*a=a
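The two implied constraints for QG3 can be checked empirically on small solutions, in the spirit of the learning task. The sketch below enumerates all quasigroup tables (Latin squares) of size 4 by brute force, keeps those satisfying (a*b)*(b*a)=a, and tests both implications on each. Size 4 is chosen here only to keep the search fast, and the function names are illustrative; this is not how HR or the constraint solver works.

```python
from itertools import permutations

def latin_squares(n):
    """Yield every n x n Latin square over 0..n-1 as a tuple of rows."""
    rows = list(permutations(range(n)))

    def extend(square):
        if len(square) == n:
            yield tuple(square)
            return
        for row in rows:
            # A new row may not repeat a symbol already used in its column.
            if all(row[c] != prev[c] for prev in square for c in range(n)):
                yield from extend(square + [row])

    yield from extend([])

def is_qg3(t):
    """QG3 axiom: (a*b)*(b*a) = a for all a, b."""
    n = len(t)
    return all(t[t[a][b]][t[b][a]] == a for a in range(n) for b in range(n))

def anti_abelian(t):
    """a*b = b*a implies a = b."""
    n = len(t)
    return all(a == b or t[a][b] != t[b][a] for a in range(n) for b in range(n))

def symmetry_relation(t):
    """a*b = b implies b*a = a."""
    n = len(t)
    return all(t[b][a] == a for a in range(n) for b in range(n) if t[a][b] == b)

qg3_tables = [t for t in latin_squares(4) if is_qg3(t)]
print(len(qg3_tables) > 0)
print(all(anti_abelian(t) and symmetry_relation(t) for t in qg3_tables))
```

An empirical check like this only suggests the constraints; they become safe to add to a problem specification once proved from the axioms, which is the role of the theorem prover in the workflow described above.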

Some Other Applications

Concept learning tasks:
– Extrapolation of integer sequences: ICML’00
– Mutagenesis regression unfriendly

Anomaly detection task:
– Analysis of Bach Chorale melodies (current MSc.)

Conjecture making tasks:
– Generating TPTP library theorems: CADE’02 (& paper)
– Finding links in the Gene Ontology (current MSc.)
– Making Graffiti-style conjectures (current MSc.)

Theory formation task:
– Invention of integer sequences: AAAI’00, JIS’01 (& paper)

Conclusions

Presented the ATF algorithm
– Involves induction & deduction
– Presented for first time in ILP terminology
– Characterised the concept search space

Presented two learning tasks
– In mathematics
– More in paper (and in previous work)

Shown that HR can make discoveries

Future Work

Apply HR to bioinformatics
– Needs more efficient implementation

Look into the conglomeration of
– Creative reasoning techniques

Relate HR to other descriptive programs
– CLAUDIEN and WARMR

Can these programs
– Do better than HR in maths applications?

A Drosophila for Descriptive Induction?

“Something from (nearly) nothing”

Can give HR only 1 concept
– Multiplication in number theory

Invents the concept of refactorable numbers
– Number of divisors is itself a divisor
– 1, 2, 8, 9, 12, …

A nice hypothesis it produces is:
– Odd refactorable numbers are square
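Both the sequence and the hypothesis are easy to check by direct computation. The sketch below tests the integers up to 1000, a bound chosen arbitrarily for illustration; it is an empirical check, not a proof, and the function names are not from HR.

```python
from math import isqrt

def num_divisors(n):
    """Count the positive divisors of n by trial division."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)

def is_refactorable(n):
    """Refactorable: the number of divisors of n is itself a divisor of n."""
    return n % num_divisors(n) == 0

def is_square(n):
    return isqrt(n) ** 2 == n

refactorable = [n for n in range(1, 1001) if is_refactorable(n)]
odd_refactorable = [n for n in refactorable if n % 2 == 1]

print(refactorable[:5])                              # [1, 2, 8, 9, 12]
print(all(is_square(n) for n in odd_refactorable))   # True
```

The hypothesis also has a short proof: a divisor of an odd number is odd, so an odd refactorable number has an odd number of divisors, and only squares have an odd number of divisors.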