ANSWER SET PROGRAMMING WITH CLAUSE LEARNING

DISSERTATION

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By
Jeffrey Alan Ward, B.A., B.S., M.S., M.S.

* * * * *

The Ohio State University
2004

Dissertation Committee:
Timothy J. Long, Co-Adviser
John S. Schlipf, Co-Adviser
Eric Fosler-Lussier
Neelam Soundarajan

Approved by
Co-Adviser
Co-Adviser
Department of Computer and Information Science
ABSTRACT

Answer set programming (ASP) is a knowledge representation paradigm related to the areas of logic programming and nonmonotonic reasoning. Many of the applications for ASP are from the areas of artificial intelligence-related diagnosis and planning. The problem of finding an answer set for a normal or extended logic program is NP-hard. Current complete answer set solvers are patterned after the Davis-Putnam-Loveland-Logemann (DPLL) algorithm for solving Boolean satisfiability (SAT) problems, but are adapted to the nonmonotonic semantics of answer set programming.

Recent SAT solvers include improvements to the DPLL algorithm. Of special note in this regard is the incorporation of conflict clauses. A conflict clause represents a backtracking solver's analysis of why a conflict occurred. This analysis can be used to further prune the search space and to direct the search heuristic. The use of such clauses has improved significantly the efficiency of satisfiability solvers over the past few years, especially on structured problems arising from applications. In this dissertation we describe how we have adapted conflict clause techniques for use in the answer set solver Smodels. We experimentally compare the performance of the resulting program, Smodels_cc, to that of the original Smodels program. We also compare the performance of Smodels_cc with that of two other recent answer set solvers, ASSAT and Cmodels-2, which call satisfiability solvers directly, and which take a different approach to adding clauses to constrain an answer set search.
VITA

February 11, 1961: Born - Ft. Thomas, Kentucky
1984: B.A. Philosophy, B.S. Mathematics, Northern Kentucky University
1987: M.S. Mathematics, The Ohio State University
1988: M.S. Computer and Information Science, The Ohio State University
1985-1993: Graduate Teaching Associate, Mathematics and CIS Departments, The Ohio State University
1994-1996: Assistant Professor, Department of Mathematics and Computer Science, College of Mount Saint Joseph
1996-1999: Computer Game Programmer, PyroTechnix, Inc., Cincinnati, Ohio
1999-2002: Graduate Research Assistant, Department of Electrical and Computer Engineering and Computer Science, University of Cincinnati
PUBLICATIONS

Research Publications

Wolfgang W. Kuechlin and Jeffrey A. Ward. "Experiments with Virtual C-Threads". Proceedings of the Fourth IEEE Symposium on Parallel and Distributed Processing, Ft. Worth, Texas, December 1991.

John Franco, Michal Kouril, John Schlipf, Jeffrey Ward, Sean Weaver, Michael Dransfield, and Mark Vanfleet. "SBSAT: A State-based, BDD-based Satisfiability Algorithm". Theory and Applications of Satisfiability Testing: 6th International Conference (SAT 2003), Santa Margherita Ligure, Italy, May 2003.

Jeffrey Ward and John S. Schlipf. "Answer Set Programming with Clause Learning". Proceedings of Logic Programming and Nonmonotonic Reasoning 7 (LPNMR-7), Ft. Lauderdale, Florida, January 2004.

FIELDS OF STUDY

Major Field: Computer Science

Studies in:
Logic and Logic Programming: Prof. John S. Schlipf, Prof. Wolfgang W. Kuechlin, Prof. Timothy J. Carlson
Theory of Computation: Prof. Timothy J. Long
Computer Algebra: Prof. George C. Collins
TABLE OF CONTENTS

Abstract
Vita
List of Tables
List of Figures

Chapters:

1. Introduction
   1.1 Answer Set Programming
   1.2 Clause Learning
   1.3 Contributions
2. Background: Answer Set Programming
   2.1 Declarative Logic Programming
       2.1.1 Ground Instantiations
   2.2 Answer Set Semantics for Normal Logic Programs
   2.3 Relationship to Classical Boolean Satisfiability (SAT)
       2.3.1 Reducing of SAT to ASP
       2.3.2 Completion Semantics
       2.3.3 Expressing the Hamiltonian Cycle Problem
       2.3.4 Unfounded Sets
       2.3.5 Reducing ASP to SAT
   2.4 Extended Rules
   2.5 Disjunctive Logic Programming
3. Background: Basic Search Algorithms for SAT and ASP
   3.1 DPLL SAT algorithm
   3.2 Smodels
       3.2.1 Smodels' Inference Rules for Normal Programs
       3.2.2 Smodels' Inference Rules for Extended Programs
       3.2.3 Lookahead-Based Heuristic
       3.2.4 Computing Multiple Answer Sets
       3.2.5 Implementing Minimize Statements
4. Conflict Clauses for SAT and ASP
   4.1 Overview of Conflict Clauses
   4.2 Relationship to Previous Work
   4.3 Generating the Implication Graph
       4.3.1 Generating the Implication Graph in a SAT Solver
       4.3.2 Generating the Implication Graph in an ASP Solver
       4.3.3 An ASP Implication Graph Example
       4.3.4 Incorporating Extended Rules
   4.4 Unique Implication Points
       4.4.1 UIP-based Clause Generation Strategies
       4.4.2 Finding UIPs
   4.5 Generating the Conflict Clause
   4.6 Using Conflict Clauses in the Search
       4.6.1 Backjumping
       4.6.2 Serving as Additional Constraints
       4.6.3 Search Heuristics
       4.6.4 Restarts
       4.6.5 Clause Deletion
   4.7 Computing Multiple Answer Sets
       4.7.1 Minimize statements
5. Answer Set Solvers Which Call SAT Solvers Directly
   5.1 Cmodels-1
   5.2 ASSAT
   5.3 Cmodels-2
   5.4 Comparison with Smodels_cc
6. Experimental Results
   6.1 Boolean Satisfiability
   6.2 Graph Coloring
   6.3 Hamiltonian Cycles
   6.4 Bounded Model Checking
7. Conclusions and Future Work
   7.1 Conclusions
   7.2 Future work
       7.2.1 Disjunctive Logic Programming
       7.2.2 Incorporating SAT Solvers

Appendices:

A. Reductions of ASP to SAT
   A.1 Reductions That Increase the Number of Atoms
   A.2 An Exponential Space Reduction That Does Not Increase the Number of Atoms
B. Weight Constraint Rules: Semantics and Translations
   B.1 Formal Semantics
   B.2 Translation to Basic Rules
C. Smodels_cc Proof of Correctness
D. Smodels_cc Proof of Completeness
E. Comparison of Hamiltonian Cycle Reductions
F. Experimental Results with Lookahead Options in Smodels and Smodels_cc

Bibliography
LIST OF TABLES

2.1 Answer set for Hamiltonian cycle problem instance
2.2 Completion semantics model for Hamiltonian cycle problem instance, with differences from Table 2.1 highlighted
3.1 Davis-Putnam-Loveland-Logemann algorithm for SAT
3.2 Smodels algorithm
3.3 Smodels' lookahead-based heuristic
4.1 Computing skip-to-depths
4.2 Skip-to-depths of nodes from example implication graph
4.3 Finding the closest UIP
4.4 Computing a conflict clause
5.1 ASSAT algorithm
6.1 Boolean satisfiability runtimes
6.2 Coloring runtimes
6.3 Runtimes on uniform Hamiltonian cycle problems
6.4 Runtimes on clumpy Hamiltonian cycle problems
6.5 Runtimes on clumpy Hamiltonian cycle problems, Extended reduction
6.6 Bounded model checking runtimes - step semantics
6.7 Bounded model checking runtimes - interleaving semantics
E.1 Runtime comparison of different Hamiltonian Cycle reductions
E.2 Grounding comparison for different HC reductions
E.3 Grounding comparison for a larger HC problem
F.1 Lookahead results on satisfiability problems
F.2 Lookahead results on coloring problems
F.3 Lookahead results on uniform Hamiltonian cycle problems
F.4 Lookahead results on clumpy Hamiltonian cycle problems
F.5 Lookahead results on Bounded Model Checking - step semantics
F.6 Lookahead results on Bounded Model Checking - interleaving semantics
LIST OF FIGURES

2.1 A Hamiltonian cycle problem instance
2.2 The solution to the HC problem instance
2.3 A "solution" under the completion semantics
4.1 Reaching the first conflict in a DPLL-based search
4.2 Backtracking from the first conflict
4.3 Situation after backjumping based on $\neg a_2 \vee \neg a_3 \vee \neg a_{100}$
4.4 Implication graph for a single inference
4.5 A SAT-based implication graph with conflict
4.6 Adding a conflict edge
4.7 Implication graph involving unfounded set inferences
4.8 Implication graph after choosing a = true
4.9 Implication graph after inferring $b$ by Modus Ponens
4.10 Implication graph after inferring $\neg c$ from All Rules Cancelled
4.11 Implication graph after inferring $\neg d$ from Backchain True
4.12 Implication graph after inferring $e$ from Backchain False
4.13 Implication graph after unfounded set detection
4.14 SAT implication graph example (Figure 4.6) revisited
4.15 A path from the choice node to a conflict node
4.16 Adjusting the conflict node and corresponding path
4.17 Adding a conflict edge to the adjusted conflict node
CHAPTER 1

INTRODUCTION

1.1 Answer Set Programming

Answer set programming (ASP) is a knowledge representation paradigm related to the areas of logic programming and nonmonotonic reasoning. In this approach, a problem is represented as a logic program whose models under the answer set semantics constitute solutions (answer sets). The logic program representation is typically presented as input to an answer set search engine, which searches for one or more valid models. The answer set semantics was defined by Gelfond and Lifschitz [23] as a generalization of their definition in [22] of the stable model semantics. The answer set/stable model semantics is currently the leading declarative semantics for logic programs.

Many of the current practical applications of answer set programming are in the areas of artificial intelligence-related diagnosis and planning ([34], [21], [2], [51], [49], [11]). Due to the expressive power of logic programs under the answer set semantics, applications have also been considered in other areas such as graph algorithms [49] and bounded model checking [28].
Answer set programming may be compared to the widely used knowledge representation approaches that represent problems as classical, propositional logic formulas. The problem of finding a model for such a formula is essentially the well-known satisfiability testing problem (SAT), which involves reasoning in a monotonic logic. There are parallels between the way that reasoning is currently done in answer set programming and in SAT. Firstly, logic programs that are to be tested for answer sets are normally "grounded" by a preprocessing stage so that they, like SAT problems, are represented in a propositional form before being presented to a search engine. Secondly, a solution to an answer set program, like a solution to a SAT problem, consists of a model which sets some of the atomic propositions in the problem to true, and the remaining atomic propositions to false. Thirdly, complete answer set search engines, like complete SAT search engines, are generally based on a depth-first, backtracking search patterned after the Davis-Putnam-Loveland-Logemann (DPLL) algorithm [10], [9].

However, some contrasts can be made between answer set programming and SAT that motivate considering answer set programming as a useful alternative. Some problems seem to be representable asymptotically more concisely as logic programs under the answer set semantics than they can be represented as SAT formulas. One important advantage of this is greater convenience for the user in representing the problem. Previous work has also suggested that this can result in a faster search for solutions on some problems. (See, for instance, experiments by Simons [53] on solving Hamiltonian cycle problems with SAT and ASP solvers.) Although the SAT decision problem can easily be reduced in linear time to the problem of the existence of answer sets, there is no known way of reducing answer set programming problems to SAT without either (1) introducing a significant number of new atoms, which could substantially increase the size of the search space, or (2) creating a representation that, in the worst case, is exponentially large in terms of the size of the original answer set programming representation.

An important key to the ability of answer set languages to express certain problems succinctly is the principle of negation as failure, which is common to logic programming formalisms. This principle states that, if a proposition cannot be "proven" from the rules given in the program/database, then the proposition will be assumed to be false. The embodiment of this principle in the answer set semantics adds considerable expressive power and convenience to the paradigm, but also results in answer set solvers having a considerably more complex set of inference rules than the set of inference rules used in SAT solvers.

The problem of determining whether a normal logic program has an answer set is NP-complete [42]. Thus, it is not surprising that the task of computing answer sets is often computationally expensive in practice. In order to expand the usefulness of the answer set programming paradigm, it will be very useful to improve the efficiency of current answer set search algorithms. This is a topic of considerable current interest in the answer set programming community ([17], [40], [53], [24]) and is the subject of this dissertation.

1.2 Clause Learning

Perhaps the most important development over the past several years towards making SAT solvers more efficient in solving practical problems has been the use of conflict clauses [3], [43]. A conflict clause represents a backtracking solver's analysis of why a conflict occurred during the search. This analysis can be used to further prune the search space and to direct the search heuristic. Conflict clauses have been especially useful in improving the efficiency of SAT solvers on structured problems arising from applications. The great majority of current, well-known, complete SAT solvers, such as GRASP [43], rel_sat [3], SATO [62], Chaff [46], BerkMin [27], and SIMO [26], concentrate their optimizations and heuristics around efficiently and effectively generating and using conflict clauses.

1.3 Contributions

For this dissertation we incorporate conflict clause learning into one of the most widely used answer set solvers, Smodels [50]. The resulting program, Smodels_cc [61], is able to use conflict clauses to prune the search space and to direct the search heuristic. The fundamental new problem which we had to address in order to accomplish this concerned finding an algorithm to diagnose the causes of conflicts in Smodels' search. The main challenge in this regard involved analyzing conflicts which resulted from negative inferences derived from Smodels' detection of unfounded sets. (The notion of an unfounded set was defined by van Gelder, Ross, and Schlipf in [20]. We discuss it in Section 2.3.4.) The detection of unfounded sets is central to Smodels' enforcement of the answer set semantics' version of negation as failure. Negative inferences derived from unfounded sets result in a number of complications when performing a conflict diagnosis. We outline these complications in Section 4.3.2. However, our experimental results show that frequently testing for unfounded sets is critical to obtaining good performance on certain important problems.
We conduct empirical tests comparing the performance of Smodels_cc to that of the original Smodels program. We also compare the performance of Smodels_cc to that of two other recent high-performance answer set solvers, ASSAT and Cmodels-2. These two solvers instead take the approach of directly calling SAT solvers to perform the majority of the work involved in an answer set search.

The results of these tests show that conflict clause learning can greatly improve the runtime performance of an answer set solver such as Smodels. This was evident on problems arising from applications (hardware verification and bounded model checking), and on randomly generated problems which had a non-uniform data distribution (e.g., graph coloring where the distribution of edges was "clumpy").

The most popular performance benchmark for answer set systems over the past several years has been solving uniform, random Hamiltonian cycle problems. In this domain, Smodels_cc had orders of magnitude better performance than any of the other solvers which we tested.

On so-called "tight" problems, ASSAT and Cmodels-2 provided somewhat better performance than Smodels_cc. (See Definition 25 in Section 2.3.5 for the distinction between tight and non-tight problems.) This was expected since they call recent, highly optimized SAT solvers to perform the search, and tight problems are easily reduced to SAT.

However, in our tests involving "non-tight" problems, such as the Hamiltonian cycle problems mentioned above, Smodels_cc performed much better than ASSAT and Cmodels-2. On these problems, Smodels_cc is able to provide tighter pruning of the search space than ASSAT and Cmodels-2 because of the test for unfounded sets which Smodels_cc inherits from the original Smodels solver.
CHAPTER 2

BACKGROUND: ANSWER SET PROGRAMMING

This chapter provides background information on the area of Answer Set Programming. Nothing in this chapter is new to this dissertation. Citations are provided for major definitions and results.

2.1 Declarative Logic Programming

A logic program is a set of rules of the form

    $\varphi \leftarrow \psi_1, \ldots, \psi_n$

where $\varphi, \psi_1, \ldots, \psi_n$ are formulas in some logic. In the above general form, $\varphi$ is referred to as the head of the rule and $\psi_1, \ldots, \psi_n$ is the rule's body. Intuitively, such a rule says that if all of the formulas in the body of the rule are true, then the formula at the head of the rule must also be true. If $n = 0$ then we normally omit writing the $\leftarrow$ symbol. In such a case the rule simply states unconditionally that $\varphi$ is true.

The basic task of a declarative logic programming system is to answer questions about which facts are implied by the logic program at hand, or to find a set of facts which would constitute a model for the logic program. There are different ways of interpreting these questions formally, which gives rise to different logic programming semantics.
2.1.1 Ground Instantiations

The formulas which appear in a logic program may include predicates which take one or more arguments, and some of these arguments may be variables. The various declarative logic programming semantics generally treat a rule with variables as representing the set of ground instantiations of that rule. Thus the meaning of a logic program with variables is reduced to the meaning of a logic program without variables.

Example 1 Assume that a, b, and c are treated as constants and that X and Y are treated as variables. (It is, in fact, a common convention in logic programming that arguments which begin with lowercase letters are constants, and arguments which begin with uppercase letters are variables. We will adopt this convention throughout this dissertation.) Consider the logic program P:

    Arc(a,b)
    Arc(b,c)
    Nonterminal(X) ← Arc(X,Y)

The ground instantiation of this program would be:

    Arc(a,b)
    Arc(b,c)
    Nonterminal(a) ← Arc(a,a)
    Nonterminal(a) ← Arc(a,b)
    Nonterminal(a) ← Arc(a,c)
    Nonterminal(b) ← Arc(b,a)
    Nonterminal(b) ← Arc(b,b)
    Nonterminal(b) ← Arc(b,c)
    Nonterminal(c) ← Arc(c,a)
    Nonterminal(c) ← Arc(c,b)
    Nonterminal(c) ← Arc(c,c)

A practical problem with creating a ground instantiation is that its size may be exponentially large in terms of the size of the original logic program. Most intelligent grounding programs try to produce relatively small ground instantiations, while still preserving the meaning of the original program. For instance, under the various semantics which we will consider in this dissertation, it would be permissible to restrict the ground instantiation above to only the rules:

    Arc(a,b)
    Arc(b,c)
    Nonterminal(a) ← Arc(a,b)
    Nonterminal(b) ← Arc(b,c)

This is because it is clear that the only instantiations of the Arc predicate which can be established based on the given program are Arc(a,b) and Arc(b,c).

The methods used by the various grounding programs to cut down the size of the ground instantiation may vary. However, the grounding process is not a central concern of this dissertation. For our experiments in Chapter 6 we used the Lparse program [57] to create ground instantiations of our logic programs.

The upshot is that, in this dissertation, we will restrict our attention to finite, propositional logic programs. The kinds of decision problems which one would usually ask about a logic program, such as whether it has a model or whether it implies a particular statement, are generally decidable when the program is finite and propositional. (In order to ensure that the ground instantiation is finite, answer set grounders make some restrictions on how function symbols and variable symbols may be used in a program. These restrictions may vary from system to system, but are not regarded as part of the core definition of the answer set semantics. The interested reader may find details on Lparse's restrictions in [57] and [58]. In non-answer set logic programming languages with less tight syntactic restrictions, such as Prolog, ground instantiations may be infinite, and typical decision problems, e.g., the Halting problem, are often undecidable.)

Since the arguments to the predicates mentioned in a ground instantiated program will be constants, we can view each atomic proposition in the program as essentially a 0-ary predicate. Thus, the logic program above, after grounding, might be expressed to a solver as follows:

    p
    q
    r ← p
    s ← q
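The grounding step of Example 1 can be illustrated with a minimal Python sketch. It is not how Lparse works; the representation of atoms as `(predicate, arguments)` tuples and the names `ground` and `prune` are assumptions made only for this illustration. `ground` produces the naive ground instantiation, and `prune` keeps only the rules whose bodies can actually be established, which is safe here because the program contains no negation.

```python
from itertools import product

def is_variable(term):
    # Convention from the text: uppercase-initial arguments are variables.
    return term[:1].isupper()

def substitute(atom, binding):
    predicate, args = atom
    return (predicate, tuple(binding.get(t, t) for t in args))

def ground(rules, constants):
    """Naive ground instantiation: try every constant for every variable."""
    grounded = []
    for head, body in rules:
        variables = sorted({t for _, args in [head, *body]
                            for t in args if is_variable(t)})
        for values in product(constants, repeat=len(variables)):
            binding = dict(zip(variables, values))
            grounded.append((substitute(head, binding),
                             [substitute(b, binding) for b in body]))
    return grounded

def prune(ground_rules):
    """Keep only rules whose body atoms are derivable (negation-free programs only)."""
    derivable, changed = set(), True
    while changed:
        changed = False
        for head, body in ground_rules:
            if head not in derivable and all(b in derivable for b in body):
                derivable.add(head)
                changed = True
    return [(h, b) for h, b in ground_rules if all(x in derivable for x in b)]

# The program P of Example 1.
program = [
    (("Arc", ("a", "b")), []),
    (("Arc", ("b", "c")), []),
    (("Nonterminal", ("X",)), [("Arc", ("X", "Y"))]),
]
instances = ground(program, ["a", "b", "c"])
small = prune(instances)
print(len(instances), len(small))
```

Under these assumptions the naive instantiation has eleven rules and the pruned one has four, matching the two listings in Example 1.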
2.2 Answer Set Semantics for Normal Logic Programs

In the following discussion, we will assume that $\mathit{Atoms}$ is some fixed set of atomic propositions (0-ary predicate symbols).

Definition 1 (Horn program) A Horn rule is a logic programming rule of the form

    $h \leftarrow a_1, \ldots, a_n$

where $h, a_1, \ldots, a_n \in \mathit{Atoms}$. A Horn program is a collection of Horn rules.

We will denote the set of atoms mentioned in $P$ by $\mathit{Atoms}(P)$. Also, we will assume that the logic programs considered in this dissertation have a finite number of rules.

Definition 2 (Model of a Horn program) A model of a Horn program $P$ is a subset $M \subseteq \mathit{Atoms}$ such that for every rule $h \leftarrow a_1, \ldots, a_n \in P$, $a_1, \ldots, a_n \in M$ implies $h \in M$.

Definition 3 (Minimum model) A minimum model of a program $P$ is a model of $P$ which is a subset of all other models of $P$.
If a program has a minimum model, then it has only one minimum model. An important fact is that every Horn program does have a minimum model. Furthermore, the minimum model of a Horn program can be computed by the following procedure:

    MinimumModel(HornProgram P)
    {
        M ← ∅
        while ∃ rule R = h ← a1, ..., an ∈ P
              such that a1, ..., an ∈ M
              and h ∉ M
            do M ← M ∪ {h}
        return M
    }

Note that the minimum model of a Horn program $P$ is therefore the deductive closure of $P$, if we view the rules of $P$ as inference rules. The minimum model is generally taken to be the canonical model of a Horn program. Using suitable data structures, the runtime of the above procedure is linear in the size of $P$. (This was shown by Dowling and Gallier in [12], and the resulting version of the procedure is known as the Dowling-Gallier algorithm.) Thus, reasoning with Horn programs is easy from a computational complexity viewpoint.

An important concern in logic programming research is how to deal with the presence of logical negation operators in programs. The preeminent negation operator in logic programming is the not operator, also known as the negation as failure operator.

Definition 4 (Normal program) A normal rule is a rule of the form:

    $h \leftarrow a_1, \ldots, a_n, \mathit{not}\ b_1, \ldots, \mathit{not}\ b_m$

where $h, a_1, \ldots, a_n, b_1, \ldots, b_m \in \mathit{Atoms}$. A normal logic program is a collection of normal rules.
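The MinimumModel procedure for Horn programs given earlier in this section can be sketched in Python as follows. This is a counter-based variant in the spirit of the linear-time approach attributed to Dowling and Gallier, not a transcription of their algorithm; the `(head, body)` rule representation and the names `minimum_model`, `remaining`, and `watchers` are assumptions made for this illustration.

```python
from collections import defaultdict, deque

def minimum_model(rules):
    """Minimum model of a Horn program given as (head, body) pairs.

    Each rule keeps a count of body atoms not yet known to be true; a rule
    fires when its count reaches zero, so every body occurrence is visited
    at most once.
    """
    remaining = [len(set(body)) for _, body in rules]
    watchers = defaultdict(list)          # atom -> indices of rules using it
    for index, (_, body) in enumerate(rules):
        for atom in set(body):
            watchers[atom].append(index)

    model = set()
    queue = deque(head for (head, _), count in zip(rules, remaining) if count == 0)
    while queue:
        atom = queue.popleft()
        if atom in model:
            continue
        model.add(atom)
        for index in watchers[atom]:
            remaining[index] -= 1
            if remaining[index] == 0:
                queue.append(rules[index][0])
    return model

# p.   q <- p.   r <- q, s.   t <- t.
horn_program = [("p", []), ("q", ["p"]), ("r", ["q", "s"]), ("t", ["t"])]
print(sorted(minimum_model(horn_program)))
```

In this small program only `p` and `q` are derivable: `r` needs `s`, which has no rule, and `t` depends only on itself.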
In a normal rule of the form given in Definition 4, atom $h$ is the head of the rule, while $a_1, \ldots, a_n, \mathit{not}\ b_1, \ldots, \mathit{not}\ b_m$ form the rule's body. We refer to $a_1, \ldots, a_n$ as the body's positive subgoals, and $\mathit{not}\ b_1, \ldots, \mathit{not}\ b_m$ as its negative subgoals. The logic programming semantics which we consider will view the body of a rule as a set of subgoals: the order in which the subgoals appear in the body does not matter.

The next two definitions give the answer set semantics for normal logic programs, and therefore formally define the meaning of the not operator in the answer set programming paradigm.

Definition 5 (Reduct) Given a normal logic program $P$ and a set $S \subseteq \mathit{Atoms}(P)$, the reduct of $P$ by $S$, denoted $P^S$, is the logic program obtained from $P$ by deleting

1. each rule containing a negative literal $\mathit{not}\ b_i$ in its body where $b_i \in S$, and
2. all negative subgoals from the bodies of the remaining rules.

Note that the reduct $P^S$ is a Horn program, and therefore has a (unique) minimum model. The mapping which takes $P$ and $S$ and produces $P^S$ is called the Gelfond-Lifschitz transform. This mapping is the key to the following definition.

Definition 6 (Answer set) (Gelfond and Lifschitz [22]) If $P$ is a normal logic program, then a set $S \subseteq \mathit{Atoms}(P)$ is an answer set for $P$ if the minimum model of $P^S$ is equal to $S$.

A normal logic program may have zero, one, or more than one answer sets. The problem of determining whether a grounded normal logic program has any answer sets is NP-complete [42].
Example 2 The logic program

    a ← b, not c
    b ← not c
    c ← not b

has two answer sets: $S_1 = \{a, b\}$ and $S_2 = \{c\}$.

Example 3 The logic program

    b ← not b

has no answer sets.

Note that if a normal logic program does not contain the not operator, then it is a Horn program and it has exactly one answer set, namely the minimum model.

As suggested earlier, the not operator in logic programs is used to express a form of negation based on the principle of negation as failure. In other words, an expression of the form $\mathit{not}\ a$ is satisfied if there is no evidence for the truth of $a$, i.e., it is not possible to prove $a$. An answer set $S$ for a normal logic program $P$ constitutes a two-valued logical interpretation of the atoms mentioned in $P$. Those atoms which are considered to be true are those which are elements of $S$. Atoms not in $S$ are considered to be false. The atoms considered false in an answer set $S$ are exactly those atoms for which there is no proof (specifically, no proof from $P^S$). The atoms which are considered true in $S$ are exactly those atoms which have proofs (from $P^S$). The fact that $P^S$ is a Horn program (does not mention not) means that the notion of whether an atom has a proof from $P^S$ is well-defined and straightforward.
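The reduct and answer set definitions can be checked mechanically on Examples 2 and 3. The following Python sketch is a brute-force illustration only: it tests every subset of the atoms, so it is exponential and has nothing to do with how Smodels searches. Rules are written as `(head, positive_subgoals, negative_subgoals)` triples, and the function names are assumptions made for this example. Example 2 is read here as a ← b, not c; b ← not c; c ← not b, and Example 3 as b ← not b.

```python
from itertools import combinations

def minimum_model(horn_rules):
    """Deductive closure of a Horn program given as (head, body) pairs."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, body in horn_rules:
            if head not in model and all(a in model for a in body):
                model.add(head)
                changed = True
    return model

def reduct(program, s):
    """Gelfond-Lifschitz transform: drop rules blocked by s, then drop negative subgoals."""
    return [(head, pos) for head, pos, neg in program
            if not any(b in s for b in neg)]

def is_answer_set(program, s):
    return minimum_model(reduct(program, s)) == set(s)

def answer_sets(program):
    """Enumerate all answer sets by testing every subset of the atoms."""
    atoms = sorted({a for h, pos, neg in program for a in [h, *pos, *neg]})
    return [set(subset)
            for size in range(len(atoms) + 1)
            for subset in combinations(atoms, size)
            if is_answer_set(program, set(subset))]

example_2 = [("a", ["b"], ["c"]), ("b", [], ["c"]), ("c", [], ["b"])]
example_3 = [("b", [], ["b"])]

print(answer_sets(example_2))
print(answer_sets(example_3))
```

For Example 2 the candidate {a, b} survives because its reduct is a ← b and b, whose minimum model is {a, b}; for Example 3 neither the empty set nor {b} equals the minimum model of its own reduct.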
Classical negation

Answer sets for normal logic programs are also referred to as stable models, which was the term used in the paper where these notions were first defined ([22]). The term "answer set" was used in [23], where the semantics was extended to programs incorporating the $\neg$ operator.

The $\neg$ operator is often referred to as the classical negation operator, whereas the not operator is referred to as the negation as failure operator. Informally, $\neg a$ means that $a$ is false, whereas $\mathit{not}\ a$ means that $a$ does not have a proof. Syntactically, the main difference is that the $\neg$ operator is allowed in the head of a rule, whereas the not operator is not allowed in the head. Thus, a rule such as

    $\neg a \leftarrow b, c$

which says that "if $b$ and $c$ are true, then $a$ is false" is allowed in an extended logic program [23]. However, a rule such as

    $\mathit{not}\ a \leftarrow b, c$

which says that "if $b$ and $c$ are true, then $a$ does not have a proof" is not allowed.

The interested reader may refer to [23] for the semantics and uses of the $\neg$ operator in answer set programming. Typically, one would include classical negation if one wanted to work with three- or four-valued logical models rather than two-valued ("true/false") models. For instance, if $\neg$ is used, then $a$ and $\neg a$ are both treated as literals which may belong to an answer set $M$. If $a \in M$, then $a$ is regarded as true in the model. If $\neg a \in M$, then $a$ is regarded as false. If neither $a$ nor $\neg a$ belong to $M$, then the truth value of $a$ is considered unknown. If both $a$ and $\neg a$ belong to $M$, then $a$ represents a contradiction.
However, in [23], it was shown how programs involving both the not and $\neg$ operators could easily be reduced to normal programs (i.e., to programs which use not as the only negation operator). In fact, the Lparse program automatically performs this reduction on all programs involving $\neg$. Hence, we will restrict our consideration of negation operators to the not operator only.

Integrity constraints

Sometimes we would like to express in our logic program that a set of conditions is impossible. We can do this by writing an integrity constraint, which is simply a rule with an empty head.

Example 4 The rule

$R = {} \leftarrow a, \text{not } b$

states that no answer set of the program can have both a = true and b = false. Formally, the rule R above is dealt with by treating it as shorthand for the rule

$R' = f \leftarrow a, \text{not } b, \text{not } f$

where f is a new atom introduced into the program specifically for use, as above, in integrity constraints. The reader can check that the resulting program, which includes $R'$, cannot have an answer set with both a = true and b = false.

2.3 Relationship to Classical Boolean Satisfiability (SAT)

One of the useful features (alluded to in Section 2.1.1) of answer set systems is that they allow the user to write predicates which can take variables as arguments.
As we will see, this greatly helps to make it convenient to express problems in an answer set language. Answer set systems also generally provide a richer syntax than merely normal logic programs. For example, the Lparse and Smodels systems provide the "extended rules" which we will consider in Section 2.4. These are some of the strengths of answer set programming over classical Boolean logic as a knowledge representation paradigm.

However, for the remainder of this section we will restrict our attention to a comparison of grounded normal logic programs interpreted under the answer set semantics versus classical Boolean logic expressions (written, say, in conjunctive normal form) interpreted under the usual semantics in that domain. It turns out that even with these restrictions, the answer set programming paradigm has a significant advantage over classical Boolean logic when it comes to expressing problems concisely and conveniently. This advantage comes from the way in which the answer set semantics embodies the principle of negation as failure. In practice, one of the payoffs of negation as failure in the answer set semantics is that it allows us to write inductive definitions of predicates. We will see an instance of this in Section 2.3.3, where we express the Hamiltonian cycle problem as an answer set problem. Expressing inductively defined predicates tends to be much more difficult in classical Boolean logic.

Before we look at that example, we will consider the problem of reducing classical Boolean logic problems to answer set programming problems (Section 2.3.1). We label the respective decision problems as SAT and ASP. We recall that both SAT and ASP are NP-complete. Later, in Section 2.3.5, we will consider the problem of reducing ASP to SAT. We will see that it seems to be much easier to devise a concise, practical reduction of SAT to ASP than to do the reverse. We will see how this relates to the
negation as failure principle. This motivates considering the answer set paradigm as an alternative to classical Boolean logic on certain problems. It is also relevant to understanding the differences between our approach with Smodels$_{cc}$ (discussed in Chapter 4), and the approach of answer set solvers which call SAT solvers directly (Chapter 5).

2.3.1 Reducing SAT to ASP

Reducing SAT to ASP is rather simple, as we will see below.

Definition 7 (Literal) A literal is an expression of the form $a$ or $\neg a$, where a is an atom (proposition letter).

In the above definition, $a$ is a positive literal, and $\neg a$ is a negative literal.

Definition 8 (Clause) A clause is a disjunction of literals.

Definition 9 (Conjunctive Normal Form) A Boolean statement is in conjunctive normal form (CNF) if it is expressed as the conjunction of a set of clauses.

Example 5 The Boolean expression

$(a \vee b) \wedge (a \vee \neg b) \wedge (\neg a \vee \neg b \vee c)$

is in conjunctive normal form.

Suppose S is a formula written in conjunctive normal form. We construct a corresponding normal logic program P as follows:

1. For each atom a appearing in S, include atoms $a$ and $a'$ in P, along with the rules
$a \leftarrow \text{not } a'$
$a' \leftarrow \text{not } a$

2. For each clause $C = a_1 \vee \dots \vee a_m \vee \neg b_1 \vee \dots \vee \neg b_n$ in S, include in P the integrity constraint

$\leftarrow \text{not } a_1, \dots, \text{not } a_m, b_1, \dots, b_n$

Then S has a satisfying assignment if and only if P has an answer set.

Two favorable properties are evident in the above reduction. First, the size of the resulting logic program is linear in the size of the original CNF formula, with a rather small expansion factor. Secondly, although the reduction doubles the number of atoms which were in the original problem, the new atoms do not actually add to the size of the search space which needs to be considered. That is, for any of the commonly used answer set solvers, if a truth value is assigned to atom $a$ at any time in the search, the solver can immediately infer the opposite truth value for $a'$ from the rules given above. Likewise, if the solver assigns a truth value for $a'$, it can immediately infer the opposite truth value for $a$. As a result, the new atoms which are introduced by this reduction do not increase the number of choices which would need to be made in the search for a solution to a problem instance.

2.3.2 Completion Semantics

Before considering reductions of ASP to SAT, we will look at a semantics for normal logic programs which is defined in terms of classical Boolean logic. Understanding the relationship of the answer set semantics to the completion semantics
is helpful in understanding the reductions of ASP to SAT, and is particularly relevant to understanding the approach of the solvers discussed in Chapter 5.

Definition 10 (Program Completion) Let P be a normal logic program. For atom $h \in Atoms(P)$ let $\varphi_h$ be the propositional formula

$h \leftrightarrow \bigvee \{ a_1 \wedge \dots \wedge a_n \wedge \neg b_1 \wedge \dots \wedge \neg b_m \mid h \leftarrow a_1, \dots, a_n, \text{not } b_1, \dots, \text{not } b_m \in P \}$

Then the completion of P, denoted $Comp(P)$, is the formula $\bigwedge \{ \varphi_h \mid h \in Atoms(P) \}$.

An interpretation M is said to be a model of P under the completion semantics if it satisfies $Comp(P)$. This semantics is due to Clark [7]. It is easy to check that every model of P under the answer set semantics is a model of P under the completion semantics. However, the converse is false:

Example 6

$a \leftarrow b$
$b \leftarrow a$

The completion of the preceding program is the Boolean logic statement

$(a \leftrightarrow b) \wedge (b \leftrightarrow a).$

Thus the program has two models under the completion semantics: $M_1 = \{\}$ and $M_2 = \{a, b\}$. However, $M_1$ is the only answer set of P.

Both the completion semantics and the answer set semantics embody the principle of negation as failure, which states that an atom will be considered false if and only if it cannot be proven true. However, the answer set semantics interprets this principle
more strictly. That is, the answer set semantics places a tighter restriction on what kind of proof is permitted to justify the assertion of an atomic proposition.

To see this, suppose that we have a normal logic program P and we wish to check whether a particular interpretation I is a model of P under the answer set and completion semantics, respectively. Let $I^+ = \{a \in Atoms(P) : a \text{ is true in } I\}$. Let $I^- = \{\neg a : a \in Atoms(P),\ a \text{ is false in } I\}$. Then from Definition 6 we can see that I is an answer set of P if and only if $I^+$ is exactly the deductive closure of $I^- \cup P$. On the other hand, I is a model of the completion of P if and only if $I^+$ is exactly the deductive closure of $I^+ \cup I^- \cup P$. Thus, under the completion semantics, the elements of $I^+$ can be used as assumptions to prove the elements of $I^+$. This means that atoms which are asserted true in completion semantics models may have circular justifications.

Such circular proofs are not allowed as justifications under the answer set semantics. Example 6 gave an instance of this distinction. We give another illustrating example in Section 2.3.3.

2.3.3 Expressing the Hamiltonian Cycle Problem

Suppose that we wish to express in a normal logic program the problem of finding a Hamiltonian cycle in a directed graph. As is often the case in logic programming, we will express the problem in two parts. The first part of our program, called the intensional database (IDB), consists of a set of rules (typically involving variables) which express the logic of the problem. The second part of the program, the extensional database (EDB), will consist of the set of facts (expressed as rules with empty bodies) which determine a particular instance of the problem. In the case of the
Hamiltonian cycle problem, the IDB will consist of a set of rules which specify that a Hamiltonian cycle is a set of edges which form a path which starts from an initial node, visits every node exactly once, and returns to the initial node.

The following is an IDB similar to one given by Niemelä [48] for the Hamiltonian cycle problem:

% Select edges for the cycle
$hc(X, Y) \leftarrow \text{not } hc'(X, Y), edge(X, Y)$
$hc'(X, Y) \leftarrow \text{not } hc(X, Y), edge(X, Y)$
% Each vertex has at most one incoming edge in a cycle
$\leftarrow hc(X_1, Y), hc(X_2, Y), edge(X_1, Y), edge(X_2, Y), vertex(Y), X_1 \neq X_2$
% Each vertex has at most one outgoing edge in a cycle
$\leftarrow hc(X, Y_1), hc(X, Y_2), edge(X, Y_1), edge(X, Y_2), vertex(X), Y_1 \neq Y_2$
% Every vertex must be reachable from the initial vertex
% through the chosen hc edges.
$\leftarrow vertex(X), \text{not } r(X)$
$r(Y) \leftarrow hc(X, Y), edge(X, Y), initialvertex(X)$
$r(Y) \leftarrow hc(X, Y), edge(X, Y), r(X)$

Of special significance in this reduction is the reachability predicate, r. r(Y) means that Y is reachable from the initial vertex. Note that r is defined inductively in this program. The rules state how to prove that the r predicate is true. It is not necessary to state when the r predicate is false: by the principle of negation as failure, it is assumed that if r(Y) cannot be proven true based on the above rules, then r(Y) is false. The problem of expressing the reachability relation is what seems to make the Hamiltonian cycle problem difficult to express concisely in classical Boolean logic. (See Simons [53] for some experiments contrasting solving the Hamiltonian cycle problem with an answer set solver versus solving it with SAT solvers. He was able to obtain much smaller reductions, and much better runtimes with the answer set approach.)
Now, suppose that the particular instance at hand is the directed graph shown in Figure 2.1. Then a corresponding EDB would be:

initialvertex(1)
vertex(1)
vertex(2)
vertex(3)
vertex(4)
vertex(5)
vertex(6)
vertex(7)
vertex(8)
edge(1, 2)
edge(2, 4)
edge(3, 1)
edge(4, 3)
edge(4, 6)
edge(5, 3)
edge(5, 6)
edge(6, 8)
edge(7, 5)
edge(8, 7)

Our program P will be the grounding of the union of the above IDB and EDB. From the perspective of feasibly solving HC problems, a very nice feature of the above reduction is that the size of the grounded program P will be linear in the size of the graph being represented.

There is only one answer set to the above program, and it is given in Table 2.1. This answer set corresponds to choosing the set of edges highlighted in Figure 2.2, which is the only set of edges from this graph that yields a Hamiltonian cycle.

Table 2.2 gives an interpretation which is a model of the program under the completion semantics. The corresponding set of edges is highlighted in Figure 2.3.

[Figure 2.1: A Hamiltonian cycle problem instance]

initialvertex(1)
vertex(1)    edge(1, 2)    hc(1, 2)     r(1)
vertex(2)    edge(2, 4)    hc(2, 4)     r(2)
vertex(3)    edge(3, 1)    hc(4, 6)     r(3)
vertex(4)    edge(4, 3)    hc(6, 8)     r(4)
vertex(5)    edge(4, 6)    hc(8, 7)     r(5)
vertex(6)    edge(5, 3)    hc(7, 5)     r(6)
vertex(7)    edge(5, 6)    hc(5, 3)     r(7)
vertex(8)    edge(6, 8)    hc(3, 1)     r(8)
             edge(7, 5)    hc'(4, 3)
             edge(8, 7)    hc'(5, 6)

Table 2.1: Answer set for Hamiltonian cycle problem instance

[Figure 2.2: The solution to the HC problem instance]

initialvertex(1)
vertex(1)    edge(1, 2)    hc(1, 2)     r(1)
vertex(2)    edge(2, 4)    hc(2, 4)     r(2)
vertex(3)    edge(3, 1)    hc'(4, 6) *  r(3)
vertex(4)    edge(4, 3)    hc(6, 8)     r(4)
vertex(5)    edge(4, 6)    hc(8, 7)     r(5)
vertex(6)    edge(5, 3)    hc(7, 5)     r(6)
vertex(7)    edge(5, 6)    hc'(5, 3) *  r(7)
vertex(8)    edge(6, 8)    hc(3, 1)     r(8)
             edge(7, 5)    hc(4, 3) *
             edge(8, 7)    hc(5, 6) *

Table 2.2: Completion semantics model for Hamiltonian cycle problem instance, with differences from Table 2.1 highlighted (marked with *).

[Figure 2.3: A "solution" under the completion semantics]

Label the answer set given in Table 2.1 M, and the completion semantics model given in Table 2.2 N. We will again take special note of the reachability predicate r. r(X) had to be true for every vertex X in each model, because of the integrity constraint $\leftarrow vertex(X), \text{not } r(X)$. It is not hard to check that each r(X) is in the deductive closure of $M^- \cup P$, for $X = 1, \dots, 8$.

However, as we turn our attention to N, we observe that $r(5), \dots, r(8)$ are not provable from $N^- \cup P$. The only way to justify $r(5), \dots, r(8)$ from N is by using elements of $N^+$ as assumptions. Specifically, $r(5), \dots, r(8)$ can be proven only by a circular sequence of deductions.

We will return to this example in the next section.

2.3.4 Unfounded Sets

Definition 11 (Partial Interpretation) Let P be a logic program. A (partial) interpretation (on P) is a set of literals mentioning only atoms from $Atoms(P)$.
A (partial) interpretation I on P is considered total if every element of $Atoms(P)$ occurs in some element of I.

For example, if $Atoms(P) = \{a, b, c\}$, then $I_1 = \{a, \neg c\}$ is a partial interpretation on P and $I_2 = \{a, b, \neg c\}$ is a total interpretation. Informally, we consider a partial interpretation to be a set of assertions about which atoms are true, and which atoms are false in a model of P. For instance, based on the above, we may write $I_1(a) = true$, $I_1(c) = false$, and $I_1(b) = unknown$.$^5$ Also, if, for some atom d, both $d$ and $\neg d$ are members of an interpretation I, then I is inconsistent and we may write both $I(d) = true$ and $I(d) = false$.$^6$ (Partial) interpretations are sometimes referred to as (partial) truth assignments.

We denote the positive (resp., negative) elements of I by $I^+$ (resp., $I^-$). In the example above, $I_2^+ = \{a, b\}$ and $I_2^- = \{\neg c\}$. We always have $I = I^+ \cup I^-$. And I is consistent if $Atoms(I^+) \cap Atoms(I^-) = \emptyset$.

We say that an interpretation J extends an interpretation I if $I \subseteq J$.

Suppose P is a logic program and $S \subseteq Atoms(P)$. Then the total interpretation on P corresponding to S is $M^+ \cup M^-$ where $M^+ = S$ and $M^- = \{\neg a : a \in Atoms(P) \setminus S\}$. Likewise, a total interpretation M corresponds to the set of atoms $M^+$. If we state that an interpretation M is an answer set of a normal program P then we mean that (1) M is a total interpretation on P, and (2) $M^+$ is an answer set of P under Definition 6.

The following definition is due to Van Gelder, Ross, and Schlipf [20]:

$^5$ In the literature, $I_1(b) = unknown$ is often written $I_1(b) = \bot$.
$^6$ This situation is sometimes expressed by the equation $I(d) = \top$.
Definition 12 (Unfounded Set) Let P be a normal logic program and I a partial interpretation on P. Then $H \subseteq Atoms(P)$ is said to be unfounded with respect to P and I if for every rule R of P with head $h \in H$ we have at least one of the following conditions:

1. there is a negative subgoal 'not d' in the body of R such that $I(d) = true$,
2. there is a positive subgoal 'c' in the body of R such that $I(c) = false$, or
3. there is a positive subgoal 'c' in the body of R such that $c \in H$.

Example 7 Let P be the logic program from Section 2.3.3 which expresses the Hamiltonian cycle problem instance given in Figure 2.1. Let $I = \{\neg hc(4, 6)\}$. Then $H = \{r(5), r(6), r(7), r(8)\}$ is unfounded with respect to P and I.

Observe that if H is unfounded with respect to P and I, and interpretation J extends I, then H is unfounded with respect to P and J.

The following proposition is an immediate corollary to Theorem 6.1 in [20]. However, we prove the result directly here.

Proposition 1 (Van Gelder, Ross, and Schlipf [20]) Let P be a normal logic program and I a partial interpretation of $Atoms(P)$. Let $H \subseteq Atoms(P)$ be unfounded with respect to P and I. Let M be a consistent, total interpretation of P which extends I, such that $M^+$ is an answer set of P. Then $M^+ \cap H = \emptyset$ (i.e., M interprets every element of H as false).

Proof: Since $M^+$ is an answer set of P, we have $M^+ = DeductiveClosure(P^{M^+})$. We will show that $H \cap DeductiveClosure(P^{M^+}) = \emptyset$ by inducting on the length of a
derivation $D = [b_1, \dots, b_n]$ from $P^{M^+}$. So we take as our induction hypothesis that $b_i \notin H$ for $1 \le i < n$. For the sake of showing a contradiction, suppose that $b_n \in H$. Let

$R^{M^+} = b_n \leftarrow c_1, \dots, c_m$

be the rule from $P^{M^+}$ used to derive $b_n$ in D. Then in P there is a corresponding rule

$R = b_n \leftarrow c_1, \dots, c_m, \text{not } d_1, \dots, \text{not } d_q$

where each $d_i \notin M^+$. Since H is unfounded with respect to P and I, we have three possible cases for R:

1. $d_i \in I \subseteq M$, for some $1 \le i \le q$. This is not possible since it contradicts our assumption about R.
2. $\neg c_i \in I \subseteq M$, for some $1 \le i \le m$. Since $R^{M^+}$ was used to deduce $b_n$ in D, $c_i = b_j$ for some $1 \le j < n$. Thus, $c_i$ is in the deductive closure of $P^{M^+}$, and is therefore in $M^+$. This contradicts M being consistent. Thus this case is impossible.
3. $c_i \in H$, for some $1 \le i \le m$. Again, since $R^{M^+}$ was used to deduce $b_n$ in D, $c_i = b_j$ for some $1 \le j < n$. But our induction hypothesis was that no such $b_j$ was an element of H. Hence, this case is also impossible.

We conclude that $b_n \notin H$, which completes the inductive proof. $\Box$

Example 8 Let P, I, and H be as in Example 7. Let N be the set of atoms given in Table 2.2, which constituted a model of P under the completion semantics. If we consider the total interpretation M on P such that $M^+ = N$, we see that M extends I, but $M^+ \cap H \neq \emptyset$. Hence, by Proposition 1, $N = M^+$ is not an answer set of P.
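The three conditions of Definition 12 translate directly into a membership test. The Python sketch below is a hedged illustration that re-checks Example 7. It assumes a simplified grounding of the reachability rules in which the EDB facts edge(X, Y) and initialvertex(1) have already been evaluated and dropped from rule bodies; the string atom names, the rule encoding, and the function name `is_unfounded` are all introduced here for illustration only.

```python
# Edges of the graph in Figure 2.1, as listed in the EDB.
EDGES = [(1, 2), (2, 4), (3, 1), (4, 3), (4, 6),
         (5, 3), (5, 6), (6, 8), (7, 5), (8, 7)]

# Ground rules (head, positive_body, negative_body) defining r.
# Assumption: EDB facts are already simplified away by the grounder.
RULES = []
for x, y in EDGES:
    if x == 1:  # r(Y) <- hc(X,Y), edge(X,Y), initialvertex(X)
        RULES.append((f"r({y})", [f"hc({x},{y})"], []))
    # r(Y) <- hc(X,Y), edge(X,Y), r(X)
    RULES.append((f"r({y})", [f"hc({x},{y})", f"r({x})"], []))

def is_unfounded(rules, true_atoms, false_atoms, h):
    """Check Definition 12: every rule with head in h must be blocked by
    (1) a negative subgoal whose atom is true, (2) a positive subgoal
    that is false, or (3) a positive subgoal that is itself in h."""
    for head, pos, neg in rules:
        if head not in h:
            continue
        blocked = (any(d in true_atoms for d in neg)
                   or any(c in false_atoms for c in pos)
                   or any(c in h for c in pos))
        if not blocked:
            return False
    return True

H = {"r(5)", "r(6)", "r(7)", "r(8)"}

# I = {not hc(4,6)}: the only edge leading into {5,6,7,8} from outside is ruled out.
print(is_unfounded(RULES, set(), {"hc(4,6)"}, H))
# With the empty interpretation, r(6) <- hc(4,6), r(4) is not blocked.
print(is_unfounded(RULES, set(), set(), H))
```

Only rules whose heads are r atoms are listed, since no other rule of the program has its head in H.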
In summary, if I is a partial interpretation of P, the set of atoms U is unfounded with respect to I, and we hope to extend I to an answer set of P, then we will need to ensure that every element of U is interpreted as false in the extension. Detecting unfounded sets early in the search will be one of the key issues as we consider different approaches to searching for answer sets.

2.3.5 Reducing ASP to SAT

We saw in Section 2.3.1 that it is straightforward to reduce the Boolean satisfiability problem to the problem of finding an answer set for a normal logic program. We cited some nice practical properties of the reduction given there. Specifically, the reduction did not increase the size of the representation much, and the new representation did not include any new atoms which would require extra choice points in a search for a model. The task of reducing answer set programming problems to SAT problems is less straightforward, and has been studied in various papers. We divide such reductions into two classes: (1) those reductions which operate in polynomial time, but which significantly increase the number of atoms in the problem representation, and (2) those which do not increase the number of atoms, but which exponentially increase the representation size. Because the known reductions fall into either of these two types, it presently does not seem to be feasible, in general, to solve an ASP problem by first reducing the problem to SAT and then calling a SAT solver. We present examples of ASP-to-SAT reductions in Appendix A.

2.4 Extended Rules

Certain ideas are difficult to express concisely in normal logic programs. An example would be a constraint which says that from a set of n atoms, $\{a_1, \dots, a_n\}$,
at least k atoms must be true. Or, suppose that the edges in a graph have various weights, and we wish to express that a Hamiltonian cycle which is chosen through the graph is to have at most a certain total weight.

In order to make expressing such constraints feasible, Simons, Niemelä, and Soininen [54] introduced weight constraint rules, which are implemented in Lparse and Smodels. We summarize their definitions below.

Definition 13 (Weight Constraint) A weight constraint is an expression of the form

$l \le \{a_1 = w_{a_1}, \dots, a_n = w_{a_n}, \text{not } b_1 = w_{b_1}, \dots, \text{not } b_m = w_{b_m}\} \le u$

where each $a_i$ and $b_j$ is an atom; and $l$, $u$, and each $w_x$ is a real number.$^7$

In the above definition, $l$ and $u$ are called the lower and upper bounds of the constraint, respectively. Each $w$ term in the above expression is a weight. Intuitively, the constraint says that W, the sum of the weights of the satisfied $a_i$ and $\text{not } b_j$ expressions, satisfies the inequality $l \le W \le u$. Either of the bounds may be omitted from the constraint. A missing lower bound in the weight constraint is taken to indicate a lower bound of $-\infty$. Similarly, the default upper bound is $+\infty$.

Definition 14 (Weight Constraint Rule) A weight constraint rule is an expression of the form

$C_0 \leftarrow C_1, \dots, C_n$

where each $C_i$ is a weight constraint.

A weight constraint rule program is a program consisting of weight constraint rules. Appendix B, Section B.1 gives the formal definition of an answer set for a weight constraint rule program.

$^7$ In Smodels, $l$, $u$, and the $w_x$ terms are all restricted to integers.
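The intuitive reading of Definition 13 can be made concrete with a small evaluator. The Python sketch below is only an illustration of when a single weight constraint is satisfied by a given set of true atoms; it does not implement the answer set semantics of weight constraint rules from Appendix B. The dictionary encoding and the names `satisfied_weight` and `holds` are assumptions made for this example.

```python
import math

def satisfied_weight(true_atoms, pos, neg):
    """Sum of the weights of the satisfied literals: atoms in `pos` that are
    true, plus atoms in `neg` (the 'not b' literals) that are not true."""
    total = sum(w for atom, w in pos.items() if atom in true_atoms)
    total += sum(w for atom, w in neg.items() if atom not in true_atoms)
    return total

def holds(true_atoms, pos, neg, lower=-math.inf, upper=math.inf):
    """True if lower <= W <= upper; omitted bounds default to -inf and +inf."""
    return lower <= satisfied_weight(true_atoms, pos, neg) <= upper

# Hypothetical constraint: 2 <= {a = 1, b = 2, not c = 1} <= 3
pos = {"a": 1, "b": 2}
neg = {"c": 1}

print(holds({"a"}, pos, neg, lower=2, upper=3))       # W = 1 + 1
print(holds({"a", "b"}, pos, neg, lower=2, upper=3))  # W = 1 + 2 + 1
print(holds({"b", "c"}, pos, neg, lower=2, upper=3))  # W = 2

# "At least 2 of {a, b, c} are true", written with all weights equal to one.
print(holds({"a", "c"}, {"a": 1, "b": 1, "c": 1}, {}, lower=2))
```

The last call shows how the "at least k of n atoms" constraint mentioned at the start of this section becomes a weight constraint with unit weights and no upper bound.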
Useful shorthand notation

The notation given below (all of which comes from [54]) provides some useful shorthand for some commonly utilized weight constraint rule constructs. Of particular relevance to the implementation of the Smodels program, the grounding program Lparse translates any weight constraint rule program into a program consisting only of the types of rules which we describe below in this subsection. Thus, handling these rule types is sufficient for Smodels to deal with arbitrary weight constraint rule programs.

A cardinality constraint is an expression of the form

$l\ \{a_1, \dots, a_n, \text{not } b_1, \dots, \text{not } b_m\}\ u$

which is shorthand for the following weight constraint with all weights equal to one:

$l \le \{a_1 = 1, \dots, a_n = 1, \text{not } b_1 = 1, \dots, \text{not } b_m = 1\} \le u.$

A weight rule is a rule of the form

$h \leftarrow l \le \{a_1 = w_{a_1}, \dots, a_n = w_{a_n}, \text{not } b_1 = w_{b_1}, \dots, \text{not } b_m = w_{b_m}\}$

where $l$ and all of the weights are non-negative. This is a shorthand for the rule

$1 \le \{h = 1\} \leftarrow l \le \{a_1 = w_{a_1}, \dots, a_n = w_{a_n}, \text{not } b_1 = w_{b_1}, \dots, \text{not } b_m = w_{b_m}\}.$

A choice rule is a rule of the form

$\{h_1, \dots, h_k\} \leftarrow a_1, \dots, a_n, \text{not } b_1, \dots, \text{not } b_m$

and is shorthand for

$0 \le \{h_1 = 1, \dots, h_k = 1\} \leftarrow n + m \le \{a_1 = 1, \dots, a_n = 1, \text{not } b_1 = 1, \dots, \text{not } b_m = 1\}.$
Informally, a rule of this form states that if $a_1, \dots, a_n$ are true and $b_1, \dots, b_m$ are false, then any of $h_1, \dots, h_k$ may be true.

A cardinality rule$^8$ is a rule of the form

$h \leftarrow k\ \{a_1, \dots, a_n, \text{not } b_1, \dots, \text{not } b_m\}$

and corresponds to

$h \leftarrow k \le \{a_1 = 1, \dots, a_n = 1, \text{not } b_1 = 1, \dots, \text{not } b_m = 1\}.$

Thus a cardinality rule is shorthand for a weight rule with weights all equal to one.

A normal rule is a rule of the form

$h \leftarrow a_1, \dots, a_n, \text{not } b_1, \dots, \text{not } b_m$

and corresponds to

$h \leftarrow n + m \le \{a_1 = 1, \dots, a_n = 1, \text{not } b_1 = 1, \dots, \text{not } b_m = 1\}.$

This formal definition of a normal rule in terms of weight rules results in the same semantics for normal rules as that given in Definition 6.

An integrity constraint is a rule of the form

$\leftarrow a_1, \dots, a_n, \text{not } b_1, \dots, \text{not } b_m$

and corresponds to

$f \leftarrow n + m + 1 \le \{a_1 = 1, \dots, a_n = 1, \text{not } b_1 = 1, \dots, \text{not } b_m = 1, \text{not } f = 1\}$

where f is a new atom used only in integrity constraints.

$^8$ In the Lparse user's manual, and in the Smodels source code, cardinality rules are referred to as "constraint rules".
In their paper, Simons, Niemelä, and Soininen show how to translate arbitrary weight constraint rule programs into programs using only weight rules and choice rules. (We outline their translation in Appendix B, Section B.2.) Lparse uses a similar approach whereby it translates arbitrary weight constraint rule programs into programs consisting only of weight rules, choice rules, cardinality rules, and normal rules. Separate data structures and classes are used by Smodels to deal with each of these four rule types.

Conditional literals

Conditional literals are relevant only to logic programs with variables. They provide a convenient way to express to the grounding program (such as Lparse) how the variables mentioned in a constraint may be grounded.

A conditional literal is an expression of the form $p : d$ where p is a predicate or a predicate preceded by the not operator. d is the conditional part of the literal, and also must be a predicate. The : operator in the literal may be read as "such that". The grounding program creates from the $p : d$ expression groundings of p such that d is true.

For example, recall the extensional database for the Hamiltonian cycle problem instance in Section 2.3.3:

initialvertex(1)
vertex(1)
vertex(2)
vertex(3)
vertex(4)
vertex(5)
vertex(6)
vertex(7)
vertex(8)
edge(1, 2)
edge(2, 4)
edge(3, 1)
edge(4, 3)
edge(4, 6)
edge(5, 3)
edge(5, 6)
edge(6, 8)
edge(7, 5)
edge(8, 7)

Given that the EDB uses the predicates "vertex" and "edge" as shown above, the statement

$1\ \{hc(X, Y) : edge(X, Y)\}\ 1 \leftarrow vertex(Y)$

expresses that every vertex Y must have exactly one incoming edge satisfying the "hc" predicate.

The conditional predicate d is used to determine which groundings of p are allowed. Therefore, for any ground instance $d'$ of d, it should be well-defined whether $d'$ is true or false. That is, it should not be the case that $d'$ is true in some models of the program and false in others. Also, it should be computationally easy to determine whether $d'$ is true or false, since this needs to be determined by the grounding program before the ground instantiation is given to the answer set solver.

In [54], Simons, Niemelä, and Soininen enforce this restriction by stating that d must be what they refer to as a domain predicate. Domain predicates are such that, if we restrict a logic program to only those rules which define domain predicates, then the resulting program will have a unique, easily computed answer set.$^9$

$^9$ The definition of what qualifies as a domain predicate for Lparse has changed over time. The interested reader may refer to the Lparse manual [57] and to a paper by Syrjänen [58]. In this dissertation, we will use as conditional predicates only predicates which are defined by the program's extensional database (EDB). Since all rules in the EDB have empty bodies, such predicates certainly qualify as domain predicates. Note also that the grounding phase is not the part of the solution finding process where our contribution lies.
Minimize statements

A minimize statement is a statement of the form

$minimize\ \{a_1 = w_{a_1}, \dots, a_n = w_{a_n}, \text{not } b_1 = w_{b_1}, \dots, \text{not } b_m = w_{b_m}\}.$

Such a statement asks the solver to find an answer set which minimizes the total weight of the satisfied subgoals listed inside the braces. If more than one minimize statement is given in a logic program, then the collection of minimize statements induces a lexicographic ordering on the answer sets of the program. Under this ordering, minimize statements which occur earlier in the program are taken to be more significant.

The maximize statement

$maximize\ \{a = w_a, \text{not } b = w_b\}$

is shorthand for $minimize\ \{a = -w_a, \text{not } b = -w_b\}$.

2.5 Disjunctive Logic Programming

A disjunctive rule differs from a normal rule in that it may contain a disjunction of atoms in its head. Thus a disjunctive rule is a rule of the form:

$h_1 \vee \dots \vee h_k \leftarrow a_1, \dots, a_n, \text{not } b_1, \dots, \text{not } b_m$

where $h_1, \dots, h_k, a_1, \dots, a_n, b_1, \dots, b_m \in Atoms$. A disjunctive logic program is a collection of disjunctive rules. The answer set semantics for disjunctive programs was defined by Gelfond and Lifschitz in [23]. The relevant definitions parallel the definitions by which the same authors defined the answer set semantics for normal programs. We provide their definitions below.
A disjunctive program is positive if none of the rules involve the not operator. A set of atoms M is a model of a positive rule

$h_1 \vee \dots \vee h_k \leftarrow a_1, \dots, a_n$

if M is a model of the classical logic formula

$(a_1 \wedge \dots \wedge a_n) \rightarrow (h_1 \vee \dots \vee h_k).$

If P is a disjunctive program which consists of positive rules only, then M is an answer set of P if M is a minimal model of P. "Minimal model" means that M is a model of every rule of P, and no strict subset of M is a model of every rule of P.

The (Gelfond-Lifschitz) reduct, $P^M$, of a disjunctive program is constructed from P and M just as it is in the case of a normal program:

1. Remove every rule which has a subgoal of the form not b where $b \in M$, and
2. Remove every negative subgoal from the resulting program.

Note that, for an arbitrary disjunctive program P, $P^M$ is positive. Hence, whether a set is an answer set of $P^M$ is defined above.

Definition 15 (Answer Set of a Disjunctive Program) Let P be an arbitrary disjunctive logic program. Then a set M is an answer set of P if it is an answer set of $P^M$.

The question of whether a disjunctive program has an answer set is $\Sigma^P_2$-complete [15]. Disjunctive logic programming systems generally compute answer sets through a two phase guess-and-check process which works as follows:
1. (Guess Phase) Search for a set of atoms M which is a model of its own reduct $P^M$.
2. (Check Phase) Determine whether there is a set $M' \subsetneq M$ such that $M'$ is also a model of $P^M$. M is an answer set of P iff no such $M'$ exists.

The leading solver for disjunctive logic programming for the past several years has been DLV [13]. A recent system, GnT ("Guess 'n' Test") [30], performs the guess and check phases above by creating normal program instances from the original disjunctive problem. GnT makes successive calls to the Smodels solver to solve these normal instances until an answer set has been determined for the original problem.
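The two phases above can be sketched with exhaustive enumeration. The Python fragment below is a minimal illustration of the guess-and-check idea on tiny programs; it is not how DLV or GnT are implemented, and the rule encoding (heads, positive body, negative body) together with every function name is an assumption introduced here.

```python
from itertools import combinations

def subsets(atoms, proper=False):
    """Yield all subsets of `atoms`; with proper=True, omit `atoms` itself."""
    atoms = sorted(atoms)
    top = len(atoms) if proper else len(atoms) + 1
    for size in range(top):
        for combo in combinations(atoms, size):
            yield set(combo)

def reduct(program, m):
    """Gelfond-Lifschitz reduct of a disjunctive program with respect to m."""
    return [(heads, pos) for heads, pos, neg in program if not set(neg) & m]

def is_model(positive_rules, m):
    """m satisfies each rule: body true in m implies some head atom in m."""
    return all((not set(pos) <= m) or bool(set(heads) & m)
               for heads, pos in positive_rules)

def answer_sets(program):
    atoms = {x for heads, pos, neg in program for x in (*heads, *pos, *neg)}
    found = []
    for m in subsets(atoms):
        pm = reduct(program, m)
        if not is_model(pm, m):            # guess phase
            continue
        if any(is_model(pm, smaller)       # check phase: look for a smaller model
               for smaller in subsets(m, proper=True)):
            continue
        found.append(m)
    return found

# Hypothetical program:  a v b <-
choice = [(("a", "b"), (), ())]
# Hypothetical program:  a v b <-    a <- b    b <- a
linked = [(("a", "b"), (), ()), (("a",), ("b",), ()), (("b",), ("a",), ())]

print(answer_sets(choice))
print(answer_sets(linked))
```

Both loops are exponential in the number of atoms, which is consistent with the check phase being a hard problem in its own right rather than a cheap post-processing step.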
CHAPTER 3

BACKGROUND: BASIC SEARCH ALGORITHMS FOR SAT AND ASP

This chapter provides background information on the Davis-Putnam-Loveland-Logemann (DPLL) algorithm, which searches for satisfying assignments to Boolean formulas. It also describes in some detail the answer set search program Smodels, whose high-level structure is based on DPLL. Here, again, nothing in this chapter is new to this dissertation. However, it sets the stage for Chapter 4, where we describe how others have used conflict clause learning to improve the efficiency of DPLL, and also where we describe how we have adapted conflict clause learning to Smodels to create the program Smodels$_{cc}$.

3.1 DPLL SAT algorithm

Current algorithms which search for satisfying assignments to classical Boolean logic formulas can be placed into either of two broad categories: complete methods or incomplete methods. Complete methods are those methods which, in principle, are guaranteed to return a satisfying assignment if the problem has one, or a message which states that the problem is unsatisfiable if no satisfying assignment exists. The incomplete methods search for a solution to the given problem instance but are not
guaranteed to return a result if the instance is unsatisfiable. So an incomplete method may in principle run forever if no satisfying assignment exists.

Incomplete methods for SAT usually employ some kind of local, hill-climbing search. GSAT [52] was a pioneering solver in this category.

In this dissertation we will only deal with complete methods for solving SAT and ASP problems. These complete methods are usually based on the Davis-Putnam-Loveland-Logemann algorithm [10], [9]. The algorithm assumes that the Boolean formula to be solved is expressed in conjunctive normal form, or CNF (Definition 9).

The DPLL algorithm may be described as a recursive procedure, as in Table 3.1. The procedure has three subroutines: BCP, Reduce, and Heuristic. The BCP routine performs Boolean Constraint Propagation, also known as Unit Propagation. It checks whether any of the clauses in the formula are unit clauses. A unit clause is one which consists of only a single literal. If such a clause exists, then the BCP routine adds the literal to the current partial assignment and simplifies the formula. This inference rule is known as the unit clause rule or the unit literal rule. BCP repeatedly applies the unit clause rule until either a conflict with the partial assignment is detected, or until no unit clauses remain in the formula.

The Reduce subroutine is used by both the DPLL procedure and the BCP procedure to simplify a CNF formula by a single new assignment.

It is assumed in the pseudocode of Table 3.1 that the negation of a negative literal is a positive literal. For instance, if $literal = \neg a$, then $\neg literal = a$.

The remaining subroutine used by DPLL is Heuristic. The call Heuristic($\varphi'$) selects an atom a appearing in $\varphi'$ and returns either the positive or the negative literal mentioning a. This corresponds to the algorithm making a guess as to whether
DPLL($\varphi$, $I$)
// $\varphi$ is a Boolean CNF formula; $I$ a partial interpretation.
// Precondition: $Atoms(\varphi) \cap Atoms(I) = \emptyset$.
// This routine searches for an assignment which extends $I$ and satisfies $\varphi$.
// If such an assignment is found, it outputs the assignment and returns true.
// Otherwise, it returns false.
{
    ($\varphi'$, $I'$, conflict) $\leftarrow$ BCP($\varphi$, $I$)
    if (conflict = true)
        return false
    if ($\varphi'$ contains no clauses)
        output $I'$    // $I'$ is a satisfying assignment
        return true
    literal $\leftarrow$ Heuristic($\varphi'$)
    if (DPLL(Reduce($\varphi'$, literal), $I' \cup \{literal\}$) = true)
        return true
    else
        return DPLL(Reduce($\varphi'$, $\neg$literal), $I' \cup \{\neg literal\}$)
}

BCP($\varphi$, $I$)
// Boolean Constraint Propagation
// Applies the unit clause inference rule to $\varphi$
// until a conflict is found or until no unit clauses remain.
// Returns the resulting $\varphi$ and $I$, as well as a flag
// indicating whether a conflict was found.
{
    while ($\exists$ a unit clause $C = \{literal\}$ in $\varphi$) do
        if $\neg literal \in I$
            return ($\varphi$, $I$, true)
        $I \leftarrow I \cup \{literal\}$
        $\varphi \leftarrow$ Reduce($\varphi$, literal)
    return ($\varphi$, $I$, false)
}

Reduce($\varphi$, literal)
// Returns $\varphi$ simplified by the assumption that literal = true.
{
    Remove from $\varphi$ all clauses containing literal.
    Remove $\neg$literal from any remaining clauses in which it occurs.
    Return the resulting $\varphi$.
}

Table 3.1: Davis-Putnam-Loveland-Logemann algorithm for SAT
a will be true or false in the satisfying assignment. We refer to a as a choice atom, and the assignment which is made to a as a choice assignment. DPLL then makes a recursive call to itself with a instantiated according to the choice assignment. If that recursive call fails, then DPLL backtracks and makes another recursive call, this time with the value of a reversed. The efficiency of the DPLL algorithm depends to a significant extent on the choices made by the heuristic routine, and considerable research has gone into developing effective heuristic strategies for DPLL.

The execution of the DPLL algorithm may be portrayed as a binary tree where each node in the tree represents a recursive call to DPLL, and each leaf node represents either a conflict or a satisfying assignment.

3.2 Smodels

The Smodels algorithm follows the general outline of the DPLL procedure. However, rather than representing its problem instance as a set of Boolean clauses, Smodels uses the representation which it receives from the grounding program Lparse. Smodels accepts a grounded^10, extended logic program. (We will assume for the moment that the program includes no minimize statements, but will address minimize statements in Section 3.2.5.) Recall that Lparse translates extended logic programs into a set of normal rules, choice rules, cardinality rules, and weight rules. The counterpart to the BCP procedure in the DPLL routine is the Expand routine in Smodels. Pseudocode for the Smodels algorithm is given in Table 3.2.

^10 i.e., propositional
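Since Smodels follows the general outline of DPLL, it may help to have the procedure of Table 3.1 in executable form before turning to Table 3.2. The following Python sketch is illustrative only. The integer encoding of literals, the function names, and the placeholder heuristic (first literal of the first clause) are assumptions of the sketch and not part of the dissertation. It also detects a conflict as an empty clause produced by Reduce, which is a common formulation but differs from the membership test written in Table 3.1.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

# Assumed encoding: literal k > 0 stands for atom k, literal -k for its negation.
Clause = FrozenSet[int]
Assignment = Dict[int, bool]


def reduce_formula(clauses: List[Clause], literal: int) -> List[Clause]:
    """Reduce: drop clauses containing `literal`, delete its negation elsewhere."""
    return [c - {-literal} for c in clauses if literal not in c]


def bcp(clauses: List[Clause],
        assignment: Assignment) -> Tuple[List[Clause], Assignment, bool]:
    """Apply the unit clause rule until a conflict or no unit clauses remain."""
    assignment = dict(assignment)
    while True:
        if any(len(c) == 0 for c in clauses):
            return clauses, assignment, True      # empty clause: conflict
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is None:
            return clauses, assignment, False
        (literal,) = unit
        assignment[abs(literal)] = literal > 0
        clauses = reduce_formula(clauses, literal)


def dpll(clauses: List[Clause],
         assignment: Optional[Assignment] = None) -> Optional[Assignment]:
    """Return an assignment satisfying every clause, or None if none exists."""
    clauses, assignment, conflict = bcp(clauses, assignment or {})
    if conflict:
        return None
    if not clauses:
        return assignment
    literal = next(iter(clauses[0]))              # placeholder heuristic
    for choice in (literal, -literal):            # choice assignment, then its reversal
        extended = {**assignment, abs(choice): choice > 0}
        result = dpll(reduce_formula(clauses, choice), extended)
        if result is not None:
            return result
    return None
```

Atoms that never needed a value are simply absent from the returned dictionary, which corresponds to a partial assignment that already satisfies every clause.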
Smodels(P, I)
// P is an extended logic program.
// I is a set of literals representing a partial interpretation.
// This routine searches for a total interpretation which extends I to an answer set for P.
// If such an interpretation is found, Smodels outputs it and returns true.
// Otherwise, it returns false.
{
    J ← Expand(P, I)
    if (Conflict(J))
        return false
    if (J covers Atoms(P))
        output J+    // J+ is an answer set
        return true
    literal, forced ← Heuristic(P, J)
    if (Smodels(P, J ∪ {literal}) = true)
        return true
    if (forced)
        return false
    return Smodels(P, J ∪ {¬literal})
}

Expand(P, I)
// P is an extended logic program.
// I is a set of literals representing a partial interpretation.
// This routine derives inferences which are true in
// any answer set of P extending I,
// and adds the inferred literals to I.
// Returns the resulting I.
{
    repeat
        I' ← I
        I ← AtLeast(P, I)
        I ← I ∪ {not x | x ∈ Unfounded(P, I)}
    until I = I' or Conflict(I)
    return I
}

Conflict(I)
{
    return I+ ∩ I− ≠ ∅
}

Table 3.2: Smodels algorithm
The two main subroutines of Expand are the AtLeast and Unfounded^11 routines, both of which are used to obtain inferences which will extend the current truth assignment. In Section 3.2.1 we describe how these functions operate on normal logic programs. Then, in Section 3.2.2 we look at how they apply to extended logic programs.

3.2.1 Smodels' Inference Rules for Normal Programs

Recall from Definition 4 that a normal rule is one of the form

    h ← a_1, ..., a_k, not b_1, ..., not b_m

and that a normal logic program is a collection of normal rules.

AtLeast(P, I)

AtLeast uses inference rules which are valid under both the completion semantics and the answer set semantics. We summarize the four inference rules used by AtLeast below:

Modus Ponens: If all of the subgoals in a rule

    a ← b_1, ..., b_k, not c_1, ..., not c_m

are true in the current truth assignment, infer a. For example, suppose that the program contains the rule w ← x, y, not z and that the current truth assignment includes x, y and ¬z. Then infer w.

^11 In the Smodels source code, and in papers such as [53] and [54], the routine which computes unfounded sets is called the AtMost procedure.
All Rules Cancelled: If every rule with head a has its body cancelled by the current truth assignment,^12 infer ¬a. For example, suppose that the only rules with w in their head are

    w ← x, not y
    w ← not x, not z

and that the current truth assignment contains ¬x and z. Then infer ¬w.

^12 A normal rule h ← a_1, ..., a_k, not b_1, ..., not b_m has its body cancelled by I iff ¬a_i ∈ I for some 1 ≤ i ≤ k or b_i ∈ I for some 1 ≤ i ≤ m.

Backchain True: If atom a is true in the current truth assignment, and if every rule with head a except one has at least one subgoal that is false in the current truth assignment, infer all the subgoals of that remaining rule to be true. For example, suppose the only rules with a in their head are

    a ← b, c, not d
    a ← e, f
    a ← not g, h

and that the current truth assignment contains a, d, ¬e. Then infer ¬g, h.

Backchain False: If an atom a is false in the current truth assignment and some rule

    a ← b_1, ..., b_k, not c_1, ..., not c_m

has every subgoal except one true in the current truth assignment, infer the remaining subgoal to be false. For example, suppose the rule is a ← b, c, not d and that ¬a, b, c are in the truth assignment. Then infer d.

The call AtLeast(P, I) takes the logic program P and current truth assignment I, and applies the above four inference rules until no further inferences can be made,
or until a conflict is reached^13, whichever comes first. The resulting truth assignment is returned as the result of the call.

Unfounded(P, I)

Given a logic program P and a partial truth assignment I, it is immediate from the definition of an unfounded set (Definition 12) that the union of a collection of unfounded sets is itself unfounded. Thus, the greatest unfounded set [20], GUS(P, I), with respect to P and I is the union of all of the sets which are unfounded with respect to P and I. The call Unfounded(P, I) returns the set GUS(P, I).

A basic subroutine used by the Unfounded procedure is the MinimumModel algorithm for computing the minimum model of a Horn program (Section 2.2). Recall that Dowling and Gallier presented a version of this algorithm which runs in linear time in the size of the Horn program.

One algorithm which could be used to compute GUS(P, I) is the following:

1. Let P' be the set of rules in P minus all rules which have their body cancelled by I.
2. Let P'' be the set of rules in P' but with all negative subgoals removed from all of the rule bodies. (Hence, P'' is a Horn program.)
3. Return GUS(P, I) = Atoms(P) \ MinimumModel(P'').

However, the preceding algorithm is not as efficient as one would like for use in Smodels: the routine recomputes MinimumModel(P'') from scratch every time that the Unfounded routine is called. The Unfounded routine can be called multiple times from Expand each time that Expand is called by the Smodels algorithm.

^13 i.e., for some atom x, both x and ¬x are in the truth assignment
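The simple three-step computation of GUS(P, I) described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the rule representation (a triple of head, positive subgoals, negated atoms), the split of I into sets of true and false atoms, and all function names are hypothetical, and the minimum model is computed by a naive fixpoint loop rather than the linear-time Dowling-Gallier algorithm.

```python
from typing import List, Set, Tuple

# Assumed representation: (head, positive subgoals, atoms under "not").
Rule = Tuple[str, Tuple[str, ...], Tuple[str, ...]]


def minimum_model(horn_rules: List[Tuple[str, Tuple[str, ...]]]) -> Set[str]:
    """Naive fixpoint for a Horn program (not the linear-time algorithm)."""
    model: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for head, body in horn_rules:
            if head not in model and all(b in model for b in body):
                model.add(head)
                changed = True
    return model


def greatest_unfounded_set(rules: List[Rule],
                           true_atoms: Set[str],
                           false_atoms: Set[str]) -> Set[str]:
    # Step 1: discard rules whose body is cancelled by I.
    live = [(h, pos, neg) for h, pos, neg in rules
            if not any(a in false_atoms for a in pos)
            and not any(b in true_atoms for b in neg)]
    # Step 2: delete negative subgoals, leaving a Horn program.
    horn = [(h, pos) for h, pos, _ in live]
    # Step 3: everything outside the minimum model is unfounded.
    atoms = {h for h, _, _ in rules} | {a for _, pos, neg in rules for a in pos + neg}
    return atoms - minimum_model(horn)
```

For the program a ← b, b ← a, c ← not a with an empty assignment, the atoms a and b support only each other, so the sketch reports {a, b} as unfounded while c is derivable.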
Note that the GUS operator is monotonic in the sense that, if I ⊆ I', then GUS(P, I) ⊆ GUS(P, I'). So if GUS(P, I) has already been computed, and the set I' \ I is not very large, one might hope to localize the computation of GUS(P, I') \ GUS(P, I) to a relatively small fraction of the rules in P. Smodels accomplishes this through an optimization which uses source pointers.

For each atom a ∈ Atoms(P), a source pointer a.source is maintained which points to the first rule which caused a to be excluded from GUS(P, I). During the execution of the AtLeast routine, if it is detected that a rule

    R = a ← Body

has its body cancelled by a new guessed or inferred assignment, and a.source = R, then a is entered into a set U.

Then the call Unfounded(P, I) works as follows:

1. Close set U under the following operator: If atom h ∉ U and the rule pointed to by h.source has a positive subgoal b ∈ U, then U ← U ∪ {h}.
2. Remove elements from U as follows: If a ∈ U and rule R = a ← Body is such that I does not cancel R's body and PosSubgoals(R) ∩ U = ∅, then U ← U \ {a}. (Also, set a.source = R.)
3. Return GUS(P, I) = U.

Step 2 of the above algorithm essentially closes the set U using the Dowling-Gallier algorithm on P I.
3.2.2 Smodels' Inference Rules for Extended Programs

As mentioned earlier, Lparse translates any extended logic program into a set of normal rules, choice rules, cardinality rules, and weight rules. For each of these rule types, Smodels implements the Modus Ponens, All Rules Cancelled, Backchain True, and Backchain False inference rules, as well as negation based on unfounded sets. In the preceding section, we explained how these inference rules are implemented for normal logic programming rules. In this section, we will explain how they are implemented for the choice, cardinality, and weight rules.

Choice Rules

Recall from Section 2.4 that a choice rule is a rule of the form

    {h_1, ..., h_k} ← a_1, ..., a_n, not b_1, ..., not b_m.

Such a rule is equivalent to having k rules of the form

    {h_i} ← a_1, ..., a_n, not b_1, ..., not b_m        (3.1)

where 1 ≤ i ≤ k. The Backchain True, All Rules Cancelled, and Unfounded Set inference rules treat a rule of the form 3.1 exactly as though it were a normal rule with h_i at the head. The Modus Ponens and Backchain False inference rules, however, do not apply to choice rules.

Cardinality Rules

A cardinality rule is one of the form

    h ← k {a_1, ..., a_n, not b_1, ..., not b_m}.        (3.2)
With respect to implementing the Smodels inference rules, the main differences between a cardinality rule and a normal rule are the conditions under which the rule fires (i.e., when Modus Ponens applies) and the conditions under which the rule is cancelled. A normal rule

    h ← a_1, ..., a_n, not b_1, ..., not b_m

fires when all n + m of its subgoals are satisfied, and is cancelled when any one of its subgoals is contradicted. A rule of form 3.2 fires when any k of its subgoals are satisfied, and is cancelled when any n + m − k + 1 of its subgoals are contradicted.

The threshold for firing affects when Modus Ponens and Backchain False are applied to the rule. For instance, suppose that h has been set to false and k − 1 of the subgoals in the rule body have been satisfied. Then the Backchain False inference rule is applied to ensure that the remaining subgoals of the rule are contradicted.

The threshold for rule cancellation affects the application of the All Rules Cancelled and Backchain True inference rules. For instance, if the head atom h from rule 3.2 has been set to true, all other rules with h in their head have been cancelled, and n + m − k of the subgoals of this rule have been contradicted, then the remaining k subgoals may be inferred true.

Both thresholds are involved in implementing the inference rule for unfounded sets. The reason for this is evident from the algorithm given for the Unfounded procedure outlined in Section 3.2.1. The first phase of the procedure adds elements to the set U when supporting rules are (in a sense) "cancelled". Then the second phase of the procedure removes elements from U when new supporting rules are found through a rule-firing process.
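The two thresholds of a cardinality rule can be made concrete with a small Python sketch. The class and method names below are hypothetical and chosen only for illustration. The sketch works from counts of satisfied and contradicted subgoals and does not model the rule body itself.

```python
from dataclasses import dataclass


@dataclass
class CardinalityRule:
    """Counts-only view of a rule h <- k {a_1, ..., a_n, not b_1, ..., not b_m}."""
    k: int      # firing threshold
    size: int   # n + m, the number of subgoals in the body

    def fires(self, satisfied: int) -> bool:
        # Modus Ponens applies once any k subgoals are satisfied.
        return satisfied >= self.k

    def cancelled(self, contradicted: int) -> bool:
        # With n + m - k + 1 subgoals contradicted, k can no longer be reached.
        return contradicted >= self.size - self.k + 1

    def backchain_false_applies(self, head_false: bool, satisfied: int) -> bool:
        # One more satisfied subgoal would fire the rule and derive a false head.
        return head_false and satisfied == self.k - 1

    def backchain_true_applies(self, head_true: bool, others_cancelled: bool,
                               contradicted: int) -> bool:
        # One more contradicted subgoal would cancel the last supporting rule.
        return head_true and others_cancelled and contradicted == self.size - self.k
```

A normal rule behaves like the special case k = n + m: it fires only when every subgoal is satisfied and is cancelled by a single contradicted subgoal.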
Weight Rules

Recall that a weight rule is a rule of the form

    h ← l {a_1 = w_{a_1}, ..., a_n = w_{a_n}, not b_1 = w_{b_1}, ..., not b_m = w_{b_m}}.        (3.3)

As with cardinality rules, a weight rule has two important thresholds: the firing threshold, l, and the cancelling threshold

    w = Σ_{i=1}^{n} w_{a_i} + Σ_{i=1}^{m} w_{b_i} − l.

The rule is cancelled only if the sum of the weights of the cancelled subgoals is greater than w.

The main difference in implementing weight rules, as opposed to cardinality rules, comes from the fact that the weights mentioned in the body of the rule may differ from each other. This presents no extra complication regarding the Modus Ponens, All Rules Cancelled, and Unfounded Set inference rules, but does somewhat complicate the implementation of the two backchaining inference rules.

First, we will discuss how Smodels' Backchain True inference works for weight rules. Suppose that rule R is of the form shown in 3.3. Suppose that d is the combined weight of the cancelled subgoals in the body of R. Suppose also that x is an uninstantiated subgoal of maximal weight from the body of R. Then the Backchain True inference rule can be applied to R to infer that x is true if and only if (1) the atom h is true, (2) all of the rules other than R which have h in their heads have been cancelled, (3) d ≤ w, and (4) d + w_x > w (where w is the cancelling threshold defined above).

Once subgoal x has been inferred true by backchaining, Smodels checks whether there are any more uninstantiated subgoals in the body of R. If so, then x is reset to refer to the uninstantiated subgoal of maximal weight, as before. If we again have d + w_x > w then the new x is also inferred to be true by the Backchain True
inference rule. The rule is repeatedly applied to R until there is no subgoal x such that d + w_x > w.

Next, we consider how Smodels applies the Backchain False inference rule to rule R. Suppose that s is the combined weight of the satisfied subgoals in the body of R. Suppose that x is an uninstantiated subgoal of maximal weight from the body of R. The Backchain False inference rule can be applied to R to infer that x is false if and only if (1) h is false, (2) s < l, and (3) s + w_x ≥ l.

As with the Backchain True inference rule, the Backchain False inference rule for weight rules may make several inferences by iterating through the uninstantiated subgoals in the body of the rule. The iteration goes through the subgoals in decreasing order of their weights and sets each subgoal to false until there is no uninstantiated subgoal x remaining in the rule such that s + w_x ≥ l.

3.2.3 Lookahead-Based Heuristic

Smodels' heuristic routine picks the choice atom a, which will be the next atom instantiated in the current truth assignment. It returns a if Smodels is to first instantiate a to true, and returns ¬a if the initial value of a is to be false.

Perhaps the biggest drawback of Smodels' heuristic routine is that it is often rather expensive to compute. A significant advantage of the routine is that it sometimes provides new forced inferences, rather than just guessed inferences.

The heuristic attempts to choose a in such a way that instantiating a will yield a large number of inferences. Suppose that I is the current truth assignment being applied to logic program P. Then, for each uninstantiated atom a ∈ Atoms(P) define

    PosScore(a) = |Expand(P, I ∪ {a})|
    NegScore(a) = |Expand(P, I ∪ {¬a})|.

Smodels' heuristic returns the atom a such that min(PosScore(a), NegScore(a)) is maximized. Ties are broken by choosing a such that max(PosScore(a), NegScore(a)) is maximized. Ties which still remain are then broken randomly. The process of calling the Expand routine on I ∪ {a} and I ∪ {¬a} is called a lookahead.

Calling the Expand routine on every uninstantiated atom at every choice point in the search may be very expensive. To partially alleviate this expense, Smodels does not perform a lookahead on an atom b if, during the execution of the same call to the heuristic, a lookahead was performed on some atom a such that b or ¬b was an element of Expand(P, I ∪ {a}) or Expand(P, I ∪ {¬a}). The reason for omitting the lookahead for b in this case is that if, for example, b ∈ Expand(P, I ∪ {a}), then it is guaranteed that

    Expand(P, I ∪ {b}) ⊆ Expand(P, I ∪ {a}).

Therefore, one may regard it as somewhat likely (though not certain) that a will score better than b on Smodels' heuristic.

Once the atom a with the best heuristic score has been selected, a is returned as the choice literal if PosScore(a) > NegScore(a). Otherwise, ¬a is returned.

Obtaining forced inferences from lookaheads

If, for some literal x, Expand(P, I ∪ {x}) is found to contain a conflict (i.e., both a and ¬a for some atom a), it is valid to infer that x is false in any answer set which extends I. In such a situation the heuristic routine can immediately return ¬x as a forced inference. Obtaining these forced inferences is potentially a very valuable benefit which is earned from the (potentially quite expensive) lookahead process.
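The selection rule described above (maximize the smaller score, break ties on the larger score, then break remaining ties randomly) can be sketched as follows. The sketch assumes that the lookahead scores have already been computed and are passed in as a dictionary. The function name and the return convention (atom plus a polarity flag) are hypothetical, and the skipping of lookaheads and the forced-inference shortcut are deliberately left out.

```python
import random
from typing import Dict, Tuple


def choose_literal(scores: Dict[str, Tuple[int, int]],
                   rng=random) -> Tuple[str, bool]:
    """Pick a choice literal from precomputed (PosScore, NegScore) pairs.

    Returns (atom, polarity); polarity True means the atom is tried as true first.
    """
    def key(pair: Tuple[int, int]) -> Tuple[int, int]:
        pos, neg = pair
        return (min(pos, neg), max(pos, neg))    # compare on min first, then max

    best = max(key(pair) for pair in scores.values())
    candidates = sorted(a for a, pair in scores.items() if key(pair) == best)
    atom = rng.choice(candidates)                # remaining ties broken randomly
    pos, neg = scores[atom]
    return atom, pos > neg                       # positive literal only if PosScore > NegScore
```

Comparing on the pair (min, max) implements both criteria at once, because Python orders tuples lexicographically.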
Pseudocode for Smodels' heuristic routine is given in Table 3.3.

3.2.4 Computing Multiple Answer Sets

The default behavior of Smodels is that it seeks to find and display a single answer set for the given logic program. However, a command-line option exists which allows the user to request that all answer sets for a given logic program be computed and displayed. The user can also request that Smodels compute and display up to n answer sets for the program, where n is a user-specified, positive integer.

Suppose the user has requested that Smodels compute multiple answer sets. Then, whenever Smodels computes a new answer set, it displays the answer set and checks whether the desired number of answer sets has been met. If not, then it backtracks from the most recent assignment, and continues its search.

3.2.5 Implementing Minimize Statements

In the case where the program P contains minimize statements, Smodels must, at least implicitly, iterate through all of the answer sets of P and compare their scores relative to those statements. After all answer sets have been enumerated, the one with the best score is returned as the solution.

An optimization to this scheme which Smodels uses is to modify the Conflict routine so that it compares the score of the current partial truth assignment to the score of the best answer set found so far. (The score of a partial truth assignment is the total weight of the subgoals satisfied by the assignment. Thus it provides a lower bound on the score which can be achieved by any assignment which extends the current assignment.) If the score of the current partial assignment is already worse
Heuristic(P, I)
// Returns the next literal which is to be set to true
// by the search routine.
// Also, returns a Boolean value which indicates whether the literal
// represents a forced inference.
{
    Avail ← Atoms(P) \ Atoms(I)
    bestMin ← bestMax ← 0
    for a ∈ Avail
        I' ← Expand(P, I ∪ {a})
        if Conflict(I')
            return ¬a, true
        Avail ← Avail \ Atoms(I')
        posScore ← |I'|
        I' ← Expand(P, I ∪ {¬a})
        if Conflict(I')
            return a, true
        Avail ← Avail \ Atoms(I')
        negScore ← |I'|
        min ← min(posScore, negScore)
        max ← max(posScore, negScore)
        if (min > bestMin) or (min = bestMin and max > bestMax)
            bestMin ← min
            bestMax ← max
            bestAtom ← a
            bestPolarity ← (posScore > negScore)
    if (bestPolarity)
        return bestAtom, false
    else
        return ¬bestAtom, false
}

Table 3.3: Smodels' lookahead-based heuristic
(i.e., higher) than the score of the best known answer set, then Conflict returns true, indicating that there is no purpose in extending this assignment.
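The bound check that this optimization adds to the Conflict routine can be illustrated with a short Python sketch. The data layout is an assumption made for the example: the minimize statement is a dictionary from (atom, polarity) pairs to weights, the partial assignment maps atoms to truth values, and weights are taken to be nonnegative so that the partial score really is a lower bound. The function names are hypothetical.

```python
from typing import Dict, Optional, Tuple

Literal = Tuple[str, bool]   # (atom, polarity); ("b", False) stands for "not b"


def partial_score(weights: Dict[Literal, int], assignment: Dict[str, bool]) -> int:
    """Total weight of the minimize-statement literals already satisfied."""
    return sum(w for (atom, polarity), w in weights.items()
               if assignment.get(atom) == polarity)


def bound_conflict(weights: Dict[Literal, int],
                   assignment: Dict[str, bool],
                   best_score: Optional[int]) -> bool:
    """True when the partial assignment is already worse than the best answer set."""
    if best_score is None:          # no answer set found yet, nothing to compare against
        return False
    return partial_score(weights, assignment) > best_score
```

Because unassigned atoms contribute nothing to the sum, extending the assignment can only raise the score, which is what justifies pruning as soon as the bound is exceeded.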
CHAPTER 4

CONFLICT CLAUSES FOR SAT AND ASP

In this chapter we will discuss how SAT solvers generate conflict clauses and how they use these clauses to improve search efficiency. In parallel, we will discuss how we have adapted these techniques into our Answer Set Programming solver, Smodels_cc ("Smodels with conflict clauses").

The algorithmic techniques which are original to this dissertation are discussed in Sections 4.3.2-4.3.4 and 4.4.2, where we explain how Smodels_cc diagnoses the cause of a conflict in an answer set search. (See also Appendices C and D, where we prove that the resulting Smodels_cc search algorithm is correct and complete.)

4.1 Overview of Conflict Clauses

A conflict clause is a backtracking solver's diagnosis of why a conflict occurred. The solver can use this diagnosis to prune the search space and to direct the search heuristic.^14 Any solver which is based upon the Davis-Putnam-Loveland-Logemann algorithm can in principle utilize conflict clauses. This observation, combined with the success of conflict clause learning in improving the efficiency of DPLL-based SAT

^14 Conflict clauses are sometimes referred to as "lemmas" because they are intermediate results which may be reused by the solver as it develops a solution to the original problem.
solvers on industrial applications, was our motivation for adapting conflict clauses to ASP search.

In this section, we will motivate the general idea of conflict clause learning by an abstract example of DPLL-based search. This example will be equally applicable to SAT search or to ASP search.

Let us suppose that we have a DPLL-based solver which is searching for the solution to a problem. Suppose further that, before reaching the first conflict in its search, the solver has selected atoms a_1, a_2, ..., a_100, in this order, as its choice atoms, and that it has initially instantiated each of these atoms to the value true. Assume also that immediately after instantiating a_100 to true, it reaches a conflict in the search (i.e., these 100 assumptions have caused the search engine to deduce that some other atom x is both true and false). This situation is portrayed in Figure 4.1.

After reaching this conflict, the basic DPLL algorithm would backtrack the assignment of a_100 = true, set a_100 = false, and continue the search. (See Figure 4.2.) However, if our algorithm incorporates conflict clauses, it would pause immediately after the conflict is detected in Figure 4.1 in order to analyze the "cause" of the conflict. In solvers using the same conflict clause strategy as Chaff and Smodels_cc, the diagnosis returned will identify a subset of the assignments made that would, in the context of the problem at hand, have been sufficient to result in the conflict that is being analyzed.

For instance, suppose in our example that the conflict analysis routine returns a diagnosis which states that the assignments a_2 = true, a_3 = true, and a_100 = true, taken together, were sufficient to cause the conflict in Figure 4.1. (In Sections 4.3 through 4.5, we will explain how this diagnosis is obtained.) From this diagnosis,
[Figure 4.1: Reaching the first conflict in a DPLL-based search. The diagram shows the choice atoms a_1, a_2, a_3, a_4, ..., a_99, a_100, each assigned true, followed by a conflict.]
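[Figure 4.2: Backtracking from the first conflict. The diagram shows the choice atoms a_1, a_2, a_3, a_4, ..., a_99 assigned true; a_100 has a true branch and a false branch, and the search continues along the false branch.]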
we may infer that no solution of the problem at hand can set a_2, a_3, and a_100 simultaneously to true. This results in our solver inferring the classical logic clause: ¬a_2 ∨ ¬a_3 ∨ ¬a_100.

Once this conflict clause has been computed, it may be used by the solver in three ways to improve the search efficiency: (i) to direct backjumping, (ii) to serve as an added constraint, and (iii) to guide the search heuristic.

Backjumping, also known in the SAT literature as "nonchronological backtracking" [43], potentially allows the solver to remove several levels of assumptions from the search when backtracking from a single conflict. As we saw in Figure 4.2, the basic DPLL algorithm (which uses "chronological backtracking") removes only the assumption a_100 = true from its stack, and sets a_100 = false at the 99-th level of the search. (We define the search level of an inference to be the number of choice assignments which were in effect when the inference was made.) In contrast, a solver utilizing the conflict clause/diagnosis ¬a_2 ∨ ¬a_3 ∨ ¬a_100 can backjump past the assumptions of a_99 = true, ..., a_4 = true since none of these assumptions were required in order to produce the conflict. The solver then assigns a_100 = false at level 3 of the search, as shown in Figure 4.3. (In the figure, the assignment a_100 = false does not appear in a bubble because it is a forced assignment in this circumstance.)

The second use of a conflict clause is to serve as an added ("learned") constraint. In our example, immediately after backjumping, the solver would store the clause ¬a_2 ∨ ¬a_3 ∨ ¬a_100 for later reference. Suppose, then, that later in the search the current partial assignment includes a_2 = true and a_100 = true. Then, based on the stored conflict clause, the solver could immediately infer that a_3 = false, thereby further pruning the search space. Moskewicz et al. [46] provide the following intuition
[Figure 4.3: Situation after backjumping based on ¬a_2 ∨ ¬a_3 ∨ ¬a_100. The diagram shows the choice atoms a_1, a_2, a_3 assigned true, followed by the forced assignment a_100 = false, after which the search continues.]

for using conflict clauses in this way: If we interpret reaching a conflict in the search as the solver making a "mistake", then a conflict clause can be seen as the solver's attempt to diagnose the cause of the mistake, and to learn from it so that future instances of the same mistake are not repeated.

The third use of conflict clauses is to guide the search heuristic. Part of the role of the search heuristic in the DPLL algorithm i