Performance of OSHL on Problems Requiring Definition Expansion
Swaha Miller
David A. Plaisted
UNC Chapel Hill
How do humans prove theorems?
Semantics
Case analysis
Sequential search through space of possible structures
Focus on the theorem
“Systematic methods can now routinely solve verification problems with thousands or tens of thousands of variables, while local search methods can solve hard random 3SAT problems with millions of variables.” (from a conference announcement)
DPLL Example
{p, r}, {¬p, q, r}, {p, ¬r}
  p = T: {T, r}, {¬T, q, r}, {T, ¬r}   SIMPLIFY: {q, r}
  p = F: {F, r}, {¬F, q, r}, {F, ¬r}   SIMPLIFY: {r}, {¬r}   SIMPLIFY: {}
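The split-and-simplify steps of the DPLL example can be sketched in a few lines of Python. This is a minimal illustration, not any prover's implementation. It assumes the clause set is {p, r}, {¬p, q, r}, {p, ¬r} (the literal signs are an assumption, since the slide text does not preserve negation signs), and it encodes p, q, r as the integers 1, 2, 3, with a negative integer for a negated literal. The names `simplify` and `dpll` are illustrative.

```python
from typing import Dict, FrozenSet, List, Optional

Clause = FrozenSet[int]  # a literal is a nonzero int; -n means "not n"


def simplify(clauses: List[Clause], lit: int) -> List[Clause]:
    """Make `lit` true: drop satisfied clauses, delete the falsified literal."""
    return [c - {-lit} for c in clauses if lit not in c]


def dpll(clauses: List[Clause],
         assignment: Optional[Dict[int, bool]] = None) -> Optional[Dict[int, bool]]:
    """Return a (partial) satisfying assignment, or None if unsatisfiable."""
    assignment = dict(assignment or {})
    if not clauses:
        return assignment                      # every clause satisfied
    if any(len(c) == 0 for c in clauses):
        return None                            # empty clause: this branch fails
    for c in clauses:                          # unit propagation
        if len(c) == 1:
            (lit,) = c
            assignment[abs(lit)] = lit > 0
            return dpll(simplify(clauses, lit), assignment)
    var = min(abs(l) for c in clauses for l in c)
    for lit in (var, -var):                    # case split: var = T, then var = F
        result = dpll(simplify(clauses, lit), {**assignment, var: lit > 0})
        if result is not None:
            return result
    return None


# p = 1, q = 2, r = 3 (signs assumed as described above)
example = [frozenset({1, 3}), frozenset({-1, 2, 3}), frozenset({1, -3})]
branch_p_true = simplify(example, 1)    # the single clause {q, r}
branch_p_false = simplify(example, -1)  # the clauses {r} and {not r}
model = dpll(example)
```

Under these assumptions the p = F branch simplifies to {r}, {¬r} and then to the empty clause, while the p = T branch leaves only {q, r}, which is satisfiable.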
Hyper Linking
Problem   Input Clauses   OTTER (sec)   Hyper Linking
Ph5       45              38606.76      1.8
Ph9       297             >24 hrs       2266.6
Latinsq   16              >24 hrs       56.4
Salt      44              1523.82       28.0
Zebra     128             >24 hrs       866.2
Eliminating Duplication with the Hyper-Linking Strategy, Shie-Jue Lee and David A. Plaisted, Journal of Automated Reasoning 9 (1992) 25-42.
Later propositional strategies
Billon’s disconnection calculus, derived from hyper-linking
Disconnection calculus theorem prover (DCTP), derived from Billon’s work
FDPLL (Model Evolution Calculus)
Performance of DCTP on TPTP, 2003
DCTP 1.3 first in EPS and EPR (largely propositional)
DCTP 10.2p third in FNE (first-order, no equality) solving same number as best provers
DCTP 10.2p fourth in FOF and FEQ (all first-order formulae, and formulae with equality)
DCTP 1.3 is a single strategy prover.
Strategy Selection in E
Strategy Selection
Schulz, Stephan, E - A Brainiac Theorem Prover, Journal of AI Communications 15(2/3):111-126, 2002.
Strategy Selection
The Vampire kernel provides a fairly large number of features for strategy selection. The most important ones are:
Choice of the main saturation procedure: (i) OTTER loop, with or without the Limited Resource Strategy, (ii) DISCOUNT loop.
A variety of optional simplifications.
Parameterised reduction orderings.
A number of built-in literal selection functions and different modes of comparing literals.
Age-weight ratio that specifies how strongly lighter clauses are preferred for inference selection.
Set-of-support strategy.
Strategy Selection
The automatic mode of Vampire 7.0 is derived from extensive experimental data obtained on problems from TPTP v2.6.0. Input problems are classified taking into account simple syntactic properties, such as being Horn or non-Horn, presence of equality, etc. Additionally, we take into account the presence of some important kinds of axioms, such as set theory axioms, associativity and commutativity. Every class of problems is assigned a fixed schedule consisting of a number of kernel strategies called one by one with different time limits.
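The classify-then-schedule idea described here can be illustrated with a small Python sketch. It is not Vampire's code or interface: the literal encoding, the two syntactic features checked, and the names `classify`, `run_schedule`, `always_fails` and `toy_prover` are all hypothetical, and the "strategies" are stand-in functions rather than real provers.

```python
from typing import Callable, Dict, List, Optional, Tuple

Literal = Tuple[bool, str]            # (is_positive, atom text); hypothetical encoding
Clause = List[Literal]
Strategy = Callable[[List[Clause], float], Optional[str]]  # returns a proof or None
ProblemClass = Tuple[str, str]


def classify(clauses: List[Clause]) -> ProblemClass:
    """Classify by two simple syntactic properties: Horn-ness and equality."""
    horn = all(sum(1 for positive, _ in c if positive) <= 1 for c in clauses)
    has_eq = any("=" in atom for c in clauses for _, atom in c)
    return ("horn" if horn else "non-horn",
            "equality" if has_eq else "no-equality")


def run_schedule(clauses: List[Clause],
                 schedules: Dict[ProblemClass, List[Tuple[Strategy, float]]]
                 ) -> Optional[str]:
    """Call the strategies fixed for this problem class one by one."""
    for strategy, time_limit in schedules[classify(clauses)]:
        result = strategy(clauses, time_limit)
        if result is not None:
            return result
    return None


# Stand-in strategies, for illustration only.
def always_fails(clauses: List[Clause], time_limit: float) -> Optional[str]:
    return None


def toy_prover(clauses: List[Clause], time_limit: float) -> Optional[str]:
    return f"proof found within {time_limit}s"


schedules = {("horn", "no-equality"): [(always_fails, 1.0), (toy_prover, 5.0)]}
problem = [[(True, "p(a)")], [(False, "p(a)"), (True, "q(a)")], [(False, "q(a)")]]
result = run_schedule(problem, schedules)
```

A single-strategy prover corresponds to a schedule with exactly one entry per class.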
DCTP Strategy Selection
DCTP 1.31 has been implemented as a monolithic system in the Bigloo dialect of the Scheme language.
DCTP 1.31 is a single strategy prover. Individual strategies are started by DCTP 10.21p using the schedule based resource allocation scheme known from the E-SETHEO system. Of course, different schedules have been precomputed for the syntactic problem classes. The problem classes are more or less identical with the sub-classes of the competition organisers.
In CASC-J2 DCTP 10.21p performed substantially better.
Goal of OSHL
First-order logic
Clause form
Propositional efficiency
Semantics
  Requires ground decidability
Structure of OSHL
Goal sensitivity if semantics chosen properly
  Choose initial semantics to satisfy axioms
Use of natural semantics
  For group theory problems, can specify a group
Sequential search through possible interpretations
  Thus similar to Davis and Putnam’s method
Propositional Efficiency
  Constructs a semantic tree
Ordered Semantic Hyperlinking (OSHL)
Reduce first-order logic problem to propositional problem
Imports propositional efficiency into first-order logic
The algorithm
  Imposes an ordering on clauses
  Progresses by generating ground instances Di of input clauses and refining interpretations
unsatisfiable
I0 I1 I2 I3 …
D0 D1 D2 T
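The step "generate a ground instance Di that contradicts the current interpretation" can be sketched as follows. This is only an illustration under stated assumptions, not the OSHL implementation: literals are encoded as (is_positive, predicate, args), an argument starting with an uppercase letter is treated as a variable, the ground terms come from a small finite list, and plain enumeration order stands in for OSHL's clause ordering. All function names are hypothetical.

```python
from itertools import product


def variables_of(clause):
    """Variables are the arguments whose name starts with an uppercase letter."""
    return sorted({a for _, _, args in clause for a in args if a[0].isupper()})


def ground_instances(clause, terms):
    """Enumerate the ground instances of `clause` over the given ground terms."""
    vs = variables_of(clause)
    for values in product(terms, repeat=len(vs)):
        sub = dict(zip(vs, values))
        yield [(pos, pred, tuple(sub.get(a, a) for a in args))
               for pos, pred, args in clause]


def contradicts(interp, ground_clause):
    """A ground clause contradicts interp when every one of its literals is false."""
    return all(interp((pred, args)) != pos for pos, pred, args in ground_clause)


def next_instance(clauses, terms, interp):
    """First ground instance (in enumeration order) that contradicts interp."""
    for clause in clauses:
        for inst in ground_instances(clause, terms):
            if contradicts(interp, inst):
                return inst
    return None


clauses = [
    [(False, "p", ("X",)), (True, "q", ("X",))],   # not p(X) or q(X)
    [(True, "p", ("a",))],                          # p(a)
    [(False, "q", ("a",))],                         # not q(a)
]
terms = ["a", "b"]

i0 = lambda atom: False                  # interpretation falsifying every atom
d0 = next_instance(clauses, terms, i0)   # an all-positive instance: p(a)

i1 = lambda atom: atom == ("p", ("a",))  # i0 refined so that p(a) is true
d1 = next_instance(clauses, terms, i1)   # not p(a) or q(a)
```

Each refinement of the interpretation makes a previously contradicting instance true, which forces the search on to a new instance.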
Semantics
Trivial semantics:
  Positive: Choose I0 to falsify all atoms, first D is all positive. Forward chaining.
  Negative: Choose I0 to satisfy all atoms, first D is all negative. Backward chaining.
Natural semantics: I0 chosen by user
Semantics Ordering
<t a well founded ordering on atoms, extended to literals
Extend <t to interpretations as follows:
  I and J agree on L if they interpret L the same
  Suppose I0 is given
  I <t J if I and J are not identical, A is the minimal atom on which they disagree, and I agrees with I0 on A
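The ordering on interpretations can be written out directly for a finite set of atoms. The sketch below assumes interpretations are Python dicts from atom names to truth values and that the atoms are supplied as a list already sorted by the atom ordering; the function name `less_than` is illustrative.

```python
def less_than(i, j, i0, atoms_in_order):
    """I <t J: at the minimal atom where I and J disagree, I agrees with I0."""
    for atom in atoms_in_order:          # smallest atom first
        if i[atom] != j[atom]:
            return i[atom] == i0[atom]
    return False                         # identical interpretations are not ordered


atoms = ["p", "q"]                       # p is smaller than q
i0 = {"p": False, "q": False}
i = {"p": False, "q": True}
j = {"p": True, "q": False}

i_before_j = less_than(i, j, i0, atoms)  # they first disagree on p; i agrees with i0
j_before_i = less_than(j, i, i0, atoms)
```

With this definition I0 itself is the smallest interpretation, since it agrees with I0 on every atom.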
Rules of OSHL
Start with empty sequence
(C1, C2, …, Cn), D minimal ground instance of an input clause that contradicts I, I minimal model of sequence
  → (C1, C2, …, Cn, D)
(C1, C2, …, Cn, D), Cn “out of order”
  → (C1, C2, …, Cn-1, D)
(C1, C2, …, Cn, D), max resolution possible
  → (C1, C2, …, Cn-1, res(Cn, D, L))
Proof if empty clause derived
Propositional Example (∀p: I0 ⊨ p)
()
({-p1, -p2, -p3}) I0[-p3]
({-p1, -p2, -p3}, {-p4, -p5, -p6}) I0 [-p3,-p6]
({…}, {…}, {-p7}) I0 [-p3,-p6,-p7]
({…}, {…}, {-p7}, {p3, p7})
({…}, {-p4, -p5, -p6}, {p3})
({-p1, -p2, -p3},{p3})
({-p1, -p2 }) I0 [-p2]
╨
U Rules
Choose clause instances to match existing literals. Look for a contradiction.
Basic clauses and U clauses
  Basic clauses are used in three rules given
  Sequence can also have U clauses on the end
  U clauses have a selected literal
  In basic clauses the max. lit. is selected
  In U clauses other literals can be selected.
Significant performance enhancement.
UR Resolution Example
Given the sequence ({s(a), p(b)}, {t(a), q(b)})
and the clause {p(X), q(X), r(X)}
create the sequence ({s(a), p(b)}, {t(a), q(b)}, {p(b), q(b), r(b)})
X → b
Filtering Example
Given the sequence ({s(a), p(b)}, {t(a), q(b)})
and the clause {p(X), q(X)}
create the sequence ({s(a), p(b)}, {t(a), q(b)}, {p(b), q(b)})
X → b
Case Analysis Example
Given the sequence ({s(a), p(b)}, {t(a), q(b)})
and the clause {q(X), r(X), s(X)}
create the sequence ({s(a), p(b)}, {t(a), q(b)}, {q(b), r(b), s(b)})
X → b
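The three examples share one mechanism: clause literals are matched against literals already selected in the sequence, which fixes the substitution (X bound to b). A rough Python sketch follows. Several things in it are assumptions rather than statements from the slides: the slide text does not preserve literal signs, so the sketch assumes p(b) and q(b) are the selected literals and that matching clause literals are complementary to them; it classifies an instance purely by the number of unmatched literals (0 filtering, 1 UR resolution, more than 1 case analysis), which fits the three examples but is not given as a definition; and it matches greedily without backtracking. All names are hypothetical.

```python
def match(pattern_args, ground_args, sub):
    """One-way matching of argument tuples; returns an extended substitution or None."""
    sub = dict(sub)
    for p, g in zip(pattern_args, ground_args):
        if p[0].isupper():                   # uppercase initial: a variable
            if sub.setdefault(p, g) != g:
                return None                  # variable already bound to another term
        elif p != g:
            return None                      # constants must be identical
    return sub


def u_instance(clause, selected):
    """Instantiate `clause` by matching its literals against complementary
    selected literals; returns (ground instance, rule name) or None."""
    sub, matched = {}, 0
    for pos, pred, args in clause:
        for spos, spred, sargs in selected:
            if spos != pos and spred == pred and len(sargs) == len(args):
                new_sub = match(args, sargs, sub)
                if new_sub is not None:
                    sub, matched = new_sub, matched + 1
                    break
    if matched == 0:
        return None
    instance = [(pos, pred, tuple(sub.get(a, a) for a in args))
                for pos, pred, args in clause]
    unmatched = len(clause) - matched
    kind = {0: "filtering", 1: "UR resolution"}.get(unmatched, "case analysis")
    return instance, kind


# Selected literals of the sequence (signs assumed, see above).
selected = [(True, "p", ("b",)), (True, "q", ("b",))]

ur = u_instance([(False, "p", ("X",)), (False, "q", ("X",)), (True, "r", ("X",))], selected)
filt = u_instance([(False, "p", ("X",)), (False, "q", ("X",))], selected)
case = u_instance([(False, "q", ("X",)), (True, "r", ("X",)), (True, "s", ("X",))], selected)
```

In each of the three calls the matching binds X to b, as in the examples.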
Example Proof Using U Rules
All positive semantics
Clauses:
A1. ¬(X ⊆ Y), ¬(Y ⊆ X), X = Y
A2. ¬(Z ∈ X), ¬(X ⊆ Y), Z ∈ Y
A3. g(X,Y) ∈ X, X ⊆ Y
A4. ¬(g(X,Y) ∈ Y), X ⊆ Y
A5. ¬(Z ∈ X), Z ∈ X ∪ Y
A6. ¬(Z ∈ Y), Z ∈ X ∪ Y
A7. ¬(Z ∈ X ∪ Y), Z ∈ X, Z ∈ Y
T. ¬(A ∪ B = B ∪ A)
Example Proof Using U Rules
1. {¬(A ∪ B = B ∪ A)} (T)
2. {¬(A ∪ B ⊆ B ∪ A), ¬(B ∪ A ⊆ A ∪ B), A ∪ B = B ∪ A} (Case Analysis, A1)
3. {¬(g(A ∪ B, B ∪ A) ∈ B ∪ A), A ∪ B ⊆ B ∪ A} (UR resolution, A4)
4. {g(A ∪ B, B ∪ A) ∈ B ∪ A, ¬(g(…) ∈ B)} (UR resolution, A5)
5. {g(A ∪ B, B ∪ A) ∈ B ∪ A, ¬(g(…) ∈ A)} (UR resolution, A6)
6. {g(…) ∈ B, g(…) ∈ A, ¬(g(…) ∈ A ∪ B)} (UR resolution, A7)
7. {A ∪ B ⊆ B ∪ A, g(…) ∈ A ∪ B} (Filtering, A3)
Example Proof Using U Rules
1. {¬(A ∪ B = B ∪ A)}
2. {¬(A ∪ B ⊆ B ∪ A), ¬(B ∪ A ⊆ A ∪ B), A ∪ B = B ∪ A} (Case Analysis)
3. {¬(g(A ∪ B, B ∪ A) ∈ B ∪ A), A ∪ B ⊆ B ∪ A} (UR resolution)
4. {g(A ∪ B, B ∪ A) ∈ B ∪ A, ¬(g(…) ∈ B)} (UR resolution)
5. {g(A ∪ B, B ∪ A) ∈ B ∪ A, ¬(g(…) ∈ A)} (UR resolution)
8. {g(…) ∈ B, g(…) ∈ A, A ∪ B ⊆ B ∪ A} (Resolution of 6. and 7.)
Example Proof Using U Rules
1. {¬(A ∪ B = B ∪ A)}
2. {¬(A ∪ B ⊆ B ∪ A), ¬(B ∪ A ⊆ A ∪ B), A ∪ B = B ∪ A} (Case Analysis)
3. {¬(g(A ∪ B, B ∪ A) ∈ B ∪ A), A ∪ B ⊆ B ∪ A} (UR resolution)
4. {g(A ∪ B, B ∪ A) ∈ B ∪ A, ¬(g(…) ∈ B)} (UR resolution)
9. {g(A ∪ B, B ∪ A) ∈ B ∪ A, g(…) ∈ B, A ∪ B ⊆ B ∪ A} (Resolution of 8. and 5.)
Example Proof Using U Rules
1. {¬(A ∪ B = B ∪ A)}
2. {¬(A ∪ B ⊆ B ∪ A), ¬(B ∪ A ⊆ A ∪ B), A ∪ B = B ∪ A} (Case Analysis)
3. {¬(g(A ∪ B, B ∪ A) ∈ B ∪ A), A ∪ B ⊆ B ∪ A} (UR resolution)
10. {g(A ∪ B, B ∪ A) ∈ B ∪ A, A ∪ B ⊆ B ∪ A} (Resolution of 9. and 4.)
Example Proof Using U Rules
1. {¬(A ∪ B = B ∪ A)}
2. {¬(A ∪ B ⊆ B ∪ A), ¬(B ∪ A ⊆ A ∪ B), A ∪ B = B ∪ A} (Case Analysis)
11. {A ∪ B ⊆ B ∪ A} (Resolution of 10. and 3.)
Example Proof Using U Rules
1. {¬(A ∪ B = B ∪ A)}
12. {¬(B ∪ A ⊆ A ∪ B), A ∪ B = B ∪ A} (Resolution of 11. and 2.)
Now the other half of the proof will be done. Note that there is only one ascending sequence of clauses constructed by OSHL and we are only indicating part of it.
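The resolution steps 8 to 12 of this half of the proof can be replayed mechanically as ground (propositional) resolution. The sketch below rests on assumptions: the clause contents are a reading of the proof in which the lost symbols are taken to be ∈, ⊆, ∪ and ¬ with the standard set-theory axioms, each ground atom is abbreviated by an ASCII name, and g stands for g(A ∪ B, B ∪ A). The helper `resolve` and all constant names are illustrative.

```python
def resolve(c1, c2, atom):
    """Binary resolution of two ground clauses on `atom`.
    A clause is a frozenset of (atom, is_positive) pairs."""
    pos, neg = (atom, True), (atom, False)
    if pos in c1 and neg in c2:
        return (c1 - {pos}) | (c2 - {neg})
    if neg in c1 and pos in c2:
        return (c1 - {neg}) | (c2 - {pos})
    raise ValueError(f"clauses do not clash on {atom}")


# Ground atoms (ASCII abbreviations; g is g(AuB, BuA)).
EQ = "AuB = BuA"
SUB_AB = "AuB subset BuA"
SUB_BA = "BuA subset AuB"
G_IN_BA = "g in BuA"
G_IN_AB = "g in AuB"
G_IN_A = "g in A"
G_IN_B = "g in B"

c2 = frozenset({(SUB_AB, False), (SUB_BA, False), (EQ, True)})   # case analysis, A1
c3 = frozenset({(G_IN_BA, False), (SUB_AB, True)})               # UR resolution, A4
c4 = frozenset({(G_IN_BA, True), (G_IN_B, False)})               # UR resolution, A5
c5 = frozenset({(G_IN_BA, True), (G_IN_A, False)})               # UR resolution, A6
c6 = frozenset({(G_IN_B, True), (G_IN_A, True), (G_IN_AB, False)})  # UR resolution, A7
c7 = frozenset({(SUB_AB, True), (G_IN_AB, True)})                # filtering, A3

c8 = resolve(c6, c7, G_IN_AB)
c9 = resolve(c8, c5, G_IN_A)
c10 = resolve(c9, c4, G_IN_B)
c11 = resolve(c10, c3, G_IN_BA)
c12 = resolve(c11, c2, SUB_AB)
```

Under this reading, clause 11 is the unit clause A ∪ B ⊆ B ∪ A and clause 12 leaves only the other inclusion to be established, which is the remaining half of the proof.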
Implementation Results
Slower implementation speed of OSHL
Uniform strategy versus strategy selection
The choice of Otter
Influence of U rules on an earlier version:
  None: 233 proofs in 30 seconds on TPTP problems
  Using them: 900 proofs in 30 seconds
All results for trivial semantics
Heuristics
Delta size of propositional instances
Relevance distance
Favor propositional terms in the theorem
Problems Involving Definitions
Consider the problem
S1 ∪ S2 ∪ … ∪ Sn = Sn ∪ Sn-1 ∪ … ∪ S1
where ∪ is associated to the left
For increasing values of n, gives us a set of progressively harder problems
Requires expanding definitions of set union, set equality, and subset relations
Tested on OSHL-U, Otter and 3 other leading provers
  Vampire
  E-Setheo
  DCTP
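The family of problems for increasing n is easy to generate. The sketch below builds the conjecture as a string with the union associated to the left on both sides. The term syntax `union(x,y)`, the set names s1 … sn, and the function names are assumptions for illustration; this is not the input format of any of the provers tested.

```python
def left_assoc_union(names):
    """Fold a list of set names into a left-associated union term."""
    term = names[0]
    for name in names[1:]:
        term = f"union({term},{name})"
    return term


def problem(n):
    """S1 u S2 u ... u Sn = Sn u Sn-1 u ... u S1, both sides left associated."""
    sets = [f"s{i}" for i in range(1, n + 1)]
    return f"{left_assoc_union(sets)} = {left_assoc_union(sets[::-1])}"


conjectures = [problem(n) for n in range(2, 9)]   # n = 2 .. 8
```

For n = 3 this yields `union(union(s1,s2),s3) = union(union(s3,s2),s1)`. The axioms defining union, equality and subset would be supplied separately.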
Problems Involving Definitions
n   OSHL-U            Otter               Vampire             E-Setheo   DCTP
    time (s)   Gen    time (s)   Gen      time (s)   Gen      time (s)   time (s)
2   0.175      41     600+       100 K+   0.00       103      0.0        0.01
3   0.678      85     600+       66 K+    70.1       3 M+     0.3        300+
4   2.107      141    600+       47 K+    300+       25 M+    0.3        300+
5   5.317      207    600+       46 K+    300+       25 M+    2.6        300+
6   12.02      283    600+       60 K+    300+       25 M+    300+       300+
7   38.97      525    600+       56 K+    300+       25 M+    300+       300+
8   77.94      663    600+       56 K+    300+       25 M+    300+       300+
Problems Involving Definitions
9 other sets of problems tested
S1 ∪ S2 ∪ … ∪ Sn = Sn ∪ Sn-1 ∪ … ∪ S1, left side left associated, right side right associated
S1 ∪ S2 ∪ … ∪ Sn = S1 ∪ S2 ∪ … ∪ Sn ∪ S1 ∪ S2 ∪ … ∪ Sn, both sides associated to the left
S1 ∪ S2 ∪ … ∪ Sn = S1 ∪ S2 ∪ … ∪ Sn ∪ S1 ∪ S2 ∪ … ∪ Sn, left side left associated, right side right associated
S1 ∪ S2 ∪ … ∪ Sn = S1 ∪ S2 ∪ … ∪ Sn, left side left associated, right side right associated
5 similar sets of problems involving ∩
Results were similar for all 10 sets of problems tested
Implementation Results
OSHL has no special data structures.
Implemented in OCaml
No special equality methods
Semantics was implemented but frequently only trivial semantics was used.
Thus significant performance improvements are possible.