  • Semantic Correctness of Transactions and Workflows

    A dissertation presented by Shiyong Lu to The Graduate School in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science, State University of New York at Stony Brook, April 2002

  • Abstract of the Dissertation

    Semantic Correctness of Transactions and Workflows, by Shiyong Lu. Doctor of Philosophy in Computer Science, State University of New York at Stony Brook, 2002. Advisor: Arthur J. Bernstein.

    Serializability is the correctness criterion generally used in the literature to determine a schedule's correctness. Such a criterion is clearly inappropriate, however, in determining the correctness of schedules that are produced when an application is run at an isolation level lower than SERIALIZABLE, since such schedules might no longer be serializable. In this dissertation, using a semantic-correctness criterion, we prove a condition for each isolation level under which transactions that execute at that level will be semantically correct.

    We also apply the semantic-correctness theory to automatic workflow verification and generation problems. In particular, we propose a new workflow model that makes it possible to: (1) automatically check if the desired outcome of a workflow can be produced by its actual implementation, and (2) automatically synthesize a workflow implementation from the workflow specification and a given task library. Finally, we present some preliminary theoretical results for the completeness of a task library and the realizability of a workflow postcondition.

  • To my wife Shuyun and our daughter Emily with love

  • Contents

    List of Figures
    Acknowledgements
    1 Introduction
      1.1 Transactions and workflows
      1.2 Statement of the problems
      1.3 Organization of this dissertation
    2 Related work
      2.1 Non-serializability correctness criterion
      2.2 Related workflow research
        2.2.1 Automatic workflow generation
        2.2.2 Workflow correctness
        2.2.3 Exception and failure handling
    3 Semantically correct transactions
      3.1 Semantic correctness
      3.2 Semantic conditions for conventional databases
        3.2.1 Model
        3.2.2 Some lemmas
        3.2.3 Semantic condition for READ UNCOMMITTED
        3.2.4 Semantic condition for READ COMMITTED
        3.2.5 Semantic condition for READ COMMITTED with first-committer-wins
        3.2.6 Semantic condition for REPEATABLE READ
        3.2.7 Semantic condition for SNAPSHOT isolation
      3.3 Semantic conditions for relational databases
      3.4 Choosing an isolation level
      3.5 An example
      3.6 Summary
    4 Semantically correct workflows
      4.1 Introduction
      4.2 Workflow model features
      4.3 Preliminary definitions
      4.4 The task model
      4.5 The workflow model
        4.5.1 Frame rule I
        4.5.2 Frame rule II
        4.5.3 Consequence rule
      4.6 Workflow constructs
        4.6.1 Composition
        4.6.2 Implication
        4.6.3 Universal
        4.6.4 Conjunction
        4.6.5 Conditional
        4.6.6 Disjunction
        4.6.7 Putting it together
      4.7 Automatic workflow verification
      4.8 Workflow verification example
      4.9 Workflow generation
        4.9.1 Function genWF
        4.9.2 Function genQ
        4.9.3 Function fix
      4.10 Summary and future work
    5 The implementation
    6 Completeness and realizability
      6.1 Definitions
      6.2 Dependency graph
      6.3 Completeness condition
      6.4 Realizability condition
      6.5 Summary and future work
    7 Conclusion and future work

  • List of Figures

    1 Withdraw from savings account
    2 Prints out a mailing list
    3 Processes a new order
    4 Delivers an order
    5 Produces accounting information
    6 A trade negotiation workflow example
    7 The task model
    8 An abstraction of a workflow
    9 Annotations of a workflow
    10 Composition construct
    11 The implication construct
    12 The universal construct
    13 The conjunction construct
    14 The conditional construct
    15 The disjunction construct
    16 Function Annotate(Wpre, W) pseudocode
    17 An example of annotated workflow
    18 Search space graph
    19 Cyclic search space graph
    20 Function genWF(WPre, WPost, G, W) pseudocode
    21 Function genQ(G)
    22 Function fix(WPre, WPost, W)
    23 Completing dangling edge e of workflow W
    24 Complete a workflow using function fix
    25 W3: an incomplete workflow generated
    26 Final workflow generated
    27 A cyclic positive dependency graph
    28 A cyclic negative dependency graph
    29 A cyclic negative dependency graph
    30 A positive dependency graph
    31 A negative dependency graph
    32 A cyclic dependency graph
    33 Realize1(seq+)
    34 A positive dependency graph
    35 A negative dependency graph
    36 A cyclic dependency graph
    37 Realize2(seq+; seq−; wp)
    38 Realize3(seq+; Q)

  • Acknowledgements

    Not enough thanks can be given to my advisor, Art (Prof. Arthur Bernstein). Over the years, I have learned a lot from the way you think, the way you speak and the way you write as a professional researcher. The day that we did "pencil and paper" on the table in your backyard is one of the happiest days that I have ever had. I thank you for all of that!

    Millions of thanks to Phil (Prof. Philip Lewis), my co-advisor. Together with Art, you have given me great advice and encouragement on my research, and your good intuition has become my compass in the forest of research!

    Thank you, Radu (Prof. Radu Grosu); your suggestion led us to the development of a logical formalism for the workflow model and forced me to become a "logician" (I am still learning though).

    I would also like to thank Michael (Prof. Michael Kifer), who encouraged our work on workflows and pointed us to the related AI planning literature.

    I am extremely grateful to our departmental chairman, Prof. Arie Kaufman, who provided me the opportunity to teach an undergraduate database course for two semesters, and to the following people who wrote wonderful recommendation letters for me so I was able to "sell" myself to my first job: Prof. Arthur Bernstein, Prof. Philip Lewis, Prof. Steve Skiena, Prof. Scott Smolka, and Prof. David Warren.

    I enjoyed your course project, Steve and Scott, and David, thank you for your advice on teaching!

    I am very thankful to the following faculty members from whom I have taken courses: Hussein Badr, Tzi-cker Chiueh, Rance Cleaveland, Michael Kifer, Ker-i Ko, Steve Skiena, David Smith, and Scott Smolka.

    It has been my great pleasure to work with the following fellow students as my teaching assistants: Diptikalyan Saha, Wenxin Song, Bin Tang, Lujin Wang, Zhenghong Yang. I appreciate your great job!

    I would also like to thank the following staff members for their great service and support in these past few years: Dolores Brush, Kathy Germana, Betty Knittweis, Edwina Osmanski, Patrick Tonra, and Brian Tria.

    My thanks also go to my fellow students (I only wish I had the complete list) for your encouragement and friendship: Baoquan Chen, Yan Chen, Baoqiu Cui, Hasan Davulcu, Vi (Xiaoqun Du), Ye Duan, Ziyang Duan, David Gerstl, Ajay Gupta, Jingqian Hu, Lan Huang, Ivy (Ying Huang), Xiaoling Li, Sergio Silva, Li Tan, Bin Tang, Guizhen Yang, Xianfeng Yuan, and the list continues in my memory.

    Special thanks to my wife Shuyun Xu. Without your love, support, encouragement, understanding and patience, I would not have been able to complete this l-o-n-g program. Thank you for always being there for me, Shuyun!

  • Chapter 1
    Introduction

    In this chapter, first, the notions of transactions and workflows are introduced; then a brief overview of the evolution from the transaction model to the workflow model is presented; next, a list of transaction and workflow problems, for which we report our research results in this dissertation, is stated; and finally, the organization of the remaining chapters of the dissertation is outlined.

    1.1 Transactions and workflows

    A transaction is an execution of a program with the following ACID (Atomicity, Consistency, Isolation, and Durability) properties [32]:

    • Atomicity. Either all actions of a transaction are carried out or none of them are. This is also called the "all-or-nothing" property.
    • Consistency. The execution of a single transaction preserves the consistency of the database. It is the user's responsibility to ensure a transaction's consistency.
    • Isolation. The execution of a transaction is isolated, or protected, from the effects of other concurrently executing transactions.
    • Durability. The result of a completed transaction must be persistent even if the system crashes.

    Since each transaction is consistent, we can run transactions one after another and the result of the execution will be correct. However, for performance reasons, a DBMS (Database Management System) has to interleave the actions of several transactions, and it is the responsibility of the concurrency control, an essential component of a DBMS, to ensure the above ACID properties when transactions are executed concurrently.

    The above classical transaction model has been widely successful in traditional database applications. However, it is not appropriate for advanced database applications in which transactions must be combined and coordinated into potentially long-running activities to achieve a larger goal. These applications include CAD, office automation, collaborative work, manufacturing control, and workflow management. To address these issues, several extended transaction models [25, 53] have been proposed in which atomicity and isolation are relaxed.

    The workflow model generalizes these extended transaction models and provides broader functionality [2]. A workflow is generally used to model a complex business process [37]. It can be viewed as a directed graph, in which the nodes represent tasks and the arcs describe their execution ordering. A Workflow Management System (WFMS) completely defines, manages, and executes a workflow using the workflow logic to control the initiation of tasks. A major difference between WFMSs and transaction models is that WFMSs support collaboration and cooperation between different tasks but leave the burden of correctness and reliability to the users, while transaction models support recovery and failure handling to ensure correctness and provide only limited support for collaboration and cooperation between tasks [2].
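The atomicity property described above can be illustrated with a minimal sketch. It is not taken from the dissertation: the `account` table, the `transfer` function, and the non-negative balance rule are hypothetical, and Python's standard `sqlite3` module stands in for a DBMS. The point shown is only the "all-or-nothing" behaviour: when the second step of the transfer cannot proceed, the first step is rolled back as well.

```python
import sqlite3

def make_bank():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, bal INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 0)])
    conn.commit()
    return conn

def balances(conn):
    """Return the current balances as a dict, e.g. {'a': 100, 'b': 0}."""
    return dict(conn.execute("SELECT id, bal FROM account ORDER BY id"))

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; either both updates happen or neither."""
    try:
        # The connection context manager commits on normal exit and
        # rolls back if an exception escapes the block.
        with conn:
            conn.execute("UPDATE account SET bal = bal - ? WHERE id = ?", (amount, src))
            (bal,) = conn.execute("SELECT bal FROM account WHERE id = ?", (src,)).fetchone()
            if bal < 0:
                # Assumed consistency constraint: balances stay non-negative.
                raise ValueError("insufficient funds")
            conn.execute("UPDATE account SET bal = bal + ? WHERE id = ?", (amount, dst))
        return True
    except ValueError:
        return False
```

A transfer that would overdraw the source account leaves both balances untouched, even though the debit statement had already been executed inside the transaction.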

  • 1.2 Statement of the problems

    Recently, based on Hoare logic [35], a semantic-correctness theory has been developed for transaction processing systems [12, 9, 10], in which the semantics of a transaction is described by annotating each of its statements with a precondition and a postcondition. This theory has been demonstrated to be useful for improving the performance of a transaction processing system [11]. In this dissertation, based on this semantic-correctness theory, we investigated the following problems:

    • Semantic conditions for correctness at different isolation levels. Serializability is the correctness criterion generally used in the literature to determine a schedule's correctness. Such a criterion is clearly inappropriate, however, in determining the correctness of schedules that are produced when an application is run at an isolation level lower than SERIALIZABLE, since such schedules might no longer be serializable. In this research, using a semantic-correctness criterion, we proved a condition for each isolation level under which transactions that execute at that level will be semantically correct. These theorems provide a theoretical basis for choosing isolation levels for the transactions of an application in practice, and shed light on the interactions between transactions at different isolation levels.
    • Automatic workflow verification and generation. Correctness is an important aspect of workflow management systems. To address the correctness question properly we propose a new workflow model, based on the semantic-correctness theory, that makes it possible to: (1) automatically check if the desired outcome of a workflow can be produced by its actual implementation, and (2) automatically synthesize a workflow implementation from the workflow specification and a given task library.
    • Completeness and realizability. In general, given a task library, not all possible workflow postconditions are realizable. We identified the conditions for a task library L to be complete in the sense that any reasonable workflow postcondition is realizable from L. We also identified the conditions for a workflow postcondition to be realizable when the task library is incomplete. This theoretical result applies only to a simplified workflow model.

    1.3 Organization of this dissertation

    The remaining chapters of the dissertation are organized as follows: Chapter 2 reviews the research on transactions and workflows that is most closely related to our work; Chapter 3 presents semantic conditions for ensuring the correct execution of transactions at different isolation levels based on the semantic-correctness theory; Chapter 4 describes our workflow model based on the notion of semantic correctness, and presents the algorithms for automatic verification and generation of workflows; Chapter 5 briefly describes a preliminary implementation of the workflow generation algorithm in the framework of AutoFlow and its test on some simple examples; Chapter 6 identifies a set of conditions for the completeness of a task library and the realizability of a workflow postcondition from a task library; finally, Chapter 7 concludes the dissertation and lists some of the remaining interesting research problems.

  • Chapter 2
    Related work

    Considerable research has been done in the area of transactions and workflows. In this chapter, we confine ourselves to reviewing the research that is most closely related to the work that we have done here: Section 2.1 reviews research in transaction processing in which some non-serializability correctness criterion is used; and Section 2.2 summarizes related work in workflows with a focus on correctness issues.

    2.1 Non-serializability correctness criterion

    Serializability has been widely accepted as the correctness criterion for transaction processing systems. Although the semantics of abstract operations can be used to narrow the notion of a conflict [71], [34], [6], [72], serializability is still the goal of these approaches.

    To enhance performance and to accommodate some advanced applications in which long-running activities might be present, non-serializability correctness criteria are proposed in [28], [49], [27], and [46]. In these models, transactions are decomposed into steps and the application designers specify the correct step interleavings using available tools of the models. In the Saga model [29], [19], a transaction is a long-running activity that is decomposed into steps. ConTracts [68], [60] generalizes the Saga model, and the designer specifies a partial order for the steps of a transaction using a control flow language. Semantics is specified using so-called entry and exit assertions. The entry assertion specifies the initial state required to execute a step, and the exit assertion specifies what is expected to be true of the result state after the execution of that step. If an entry assertion of a step is not true when the step is to be initiated (as a result of concurrent execution), special action must be taken.

    Recently, a semantic-correctness theory has been developed for transaction processing systems [12, 9, 10]. This theory has been demonstrated to be useful for improving the performance of a transaction processing system [11]. In this dissertation, based on this semantic-correctness theory, we proved a condition for each isolation level under which transactions that execute at that level will be semantically correct. These theorems provide a theoretical basis for choosing isolation levels for the transactions of an application in practice, and shed light on the interactions between transactions at different isolation levels.

    2.2 Related workflow research

    Workflow technologies originated from the work on business reengineering and office automation in the 1970s. Since that time a substantial effort has been devoted to this area as corporations automate their business processes. Currently, there are hundreds of commercial systems on the market. In [61, 54], some of the most typical products are reviewed.

    Workflows evolved from transactions as the limitations of atomicity and isolation became evident in distributed and heterogeneous systems. Serializability had long been recognized as a performance bottleneck, and database systems all provide less restrictive notions of isolation. On the research side, the nested transaction model [55] was introduced to generalize the concept of atomicity and the multilevel model [72] was introduced to enhance performance by taking advantage of the semantics of applications. These models, however, preserved the basic concepts of atomicity and isolation.

    A more radical departure from the basic transaction model came with extended transactions: for example the ACTA model [20], Flex [58], and ConTract [59]. Klein [44] has proposed a rule-driven mechanism for imposing a temporal ordering on tasks. Transactions are composed of individual subtransactions and their execution order is controlled separately using either a graphical representation of execution precedence or a control flow language. Atomicity and isolation are relaxed. Workflows generalize this model. Extended transaction models are summarized in [25, 53], and their application in workflow systems can be found in [38, 39].

    The extended transaction model focuses on data-centric applications, whereas the workflow model provides broader functionality [2]. According to the workflow reference model [37], a workflow describes a business process. A workflow management system completely defines, manages, and executes a workflow using the workflow logic to control the initiation of tasks. The logic can be specified as temporal constraints among the task events. Local constraints define control dependencies and data dependencies between tasks. They can be specified easily using arcs between tasks annotated with transition conditions. Global constraints specify temporal dependencies among arbitrary tasks that are difficult to specify in a workflow graph (e.g., if a happens before b then c must not happen).

    Several formal methods have been proposed for specifying and modeling workflows. These include event algebra [64], state charts [56], Petri nets [67, 1], temporal logic [70], and concurrent transaction logic [22]. State charts and Petri nets have formal semantics and an intuitive graphical representation. They model control flow and data dependencies. A scheduler can be derived from the specification based on Event-Condition-Transition rules. However, it is difficult to specify global constraints in these models. Also, the specification implies a centrally controlled execution model, whereas in practice distributed control may be required [74, 73].

    The event algebra [64, 66] can be used to model control flows with global constraints. However, its inability to model transition conditions that test values in the workflow database (in which the result of the execution of previous tasks is stored) is a limitation. Workflows are described as a set of temporal constraints that specify the order in which tasks are to be initiated. The constraints can be converted into a distributed scheduler. However, since event algebra is not part of a larger reasoning framework, it is not clear how it can be used to model features such as sub-workflows, failure, and compensation, or to reason about temporal properties [22].

    Concurrent Transaction Logic (CTR) [15] provides a logic-based framework for workflow specification, verification and execution. Based on this framework, the Apply compiler technique in [22] allows one to incorporate temporal constraints into an existing workflow, and the result is another workflow represented in CTR. Usually, a unique-event property (which requires that each event occur at most once) is assumed for workflow graphs, and this limits the ability to handle loops.

    In the following, we focus on reviewing related research on: (1) automatic workflow generation; (2) workflow correctness; and (3) workflow exception and failure handling.

  • 2.2.1 Automatic workflow generation

    Most workflow research focuses on the modeling aspects of workflows, i.e., the specification of how the execution of tasks in a workflow is to be ordered. Correctness of a workflow is not defined in terms of the outcome of the workflow, but in terms of the enforcement of the data and control dependencies that are specified at design time. The semantics of each task is not modeled, so the specification of these dependencies is based on the user's informal intuition and understanding of a particular workflow application. This motivates the problem of deriving these dependencies automatically based on the semantics of each task and the desired outcome of a workflow, and thus generating a correct workflow automatically.

    Automatic workflow generation is related to the planning problem in AI research, which is to determine a sequence of actions to achieve a goal [51]. In its full generality, planning allows a potentially unbounded number of initial states and there are few restrictions on the actions (e.g., a robot can pick up and put down an object an arbitrary number of times), so the search space is large. In the workflow problem described here, the initial state is described by a workflow precondition, which is a first-order logic formula. Although the domain of a variable can be large, it is divided into a relatively small set of ranges by the workflow specification and assertions in the task library. Moreover, each task generally achieves a significant, durable result (e.g., updating a bank account). If it runs successfully, it generally needs to be run only once (or a bounded number of times) in a workflow execution [22, 66]. As a result, the search for a solution is greatly simplified.

    Some related work has been done in the workflow field. Director [62] developed a methodology to automatically generate workflows for CAD applications. However, this approach does not generalize to other workflow applications since in this model a predicate (which is called a data entity there) cannot be made false once it becomes true during execution. The process handbook project [50, 8] provides an organized library of business processes. It is claimed to be a good knowledge base for automatic workflow generation, though the work did not explore this.

    2.2.2 Workflow correctness

    Correctness is an important aspect of workflow management systems. Although the workflow community seems to be primarily focused on modeling aspects of workflows, some researchers have investigated techniques for supporting correctness properties in workflow systems [5, 16, 47, 63]. An excellent overview of correctness issues in workflow management was given in [41] and recently, a formalization of workflows to address correctness issues was given in [4] based on set and graph theory.

    Some researchers focused on the correctness aspects needed to ensure data consistency when concurrency and failures are present. These techniques emerge from the areas of extended transaction models [2, 31, 25, 30, 69], multidatabases [52, 17] and transactional workflows [63]. However, the constraints that the data and control flow should satisfy were not discussed in a formal way. Others focused on the data and control flow requirements. These techniques include control flow graphs, triggers (i.e., event-condition-action rules) [23], temporal constraints [22, 65, 66] and net-based approaches [1]. However, most of these approaches, although formal, assume that the workflow will be correct if those constraints on data and control flow are satisfied during execution, while the final state of the whole workflow is neither specified nor proved. Thus, to the best of our knowledge, a formal framework for reasoning about the correctness of a workflow in terms of its desired outcome is still missing. Furthermore, the semantics of each task is not accounted for.

  • The recent work presented in [4] seems to be the most closely related to our approach; there, a formalization of workflows to address correctness was based on set and graph theory. A complete execution history is correct if the input condition of every activity involved in the history is correct when the activity starts and if the basic constraints that hold when the history starts also hold at the end of the history. A Constraint Based Concurrency Control (CBCC) mechanism controls activity interleaving in such a way that inter-activity constraints are preserved and accesses to a workflow environment on which the basic constraints do not hold are prevented. However, the final state of the whole workflow is still neither specified nor proved.

    Recently, a semantic-correctness theory has been developed [12, 9, 10] for transaction processing systems. It has been proven useful for significantly improving the performance of a concurrency control [11], and for reasoning about execution correctness at different isolation levels [13, 14].

    Here we extend the semantic-correctness theory to model and reason about workflows. Specifically, we have developed a formal model in which one can specify the initial state of a workflow in terms of a workflow precondition, and the desired outcome in terms of a workflow postcondition. A workflow is correct if, whenever it starts in a state satisfying the workflow precondition, the final state when the workflow completes satisfies the workflow postcondition. The semantics of each task is also specified in terms of its precondition and postcondition. Based on this model, we have developed an algorithm that, given a workflow specification and a task library, will produce a correct workflow using those tasks if such a workflow exists.

    2.2.3 Exception and failure handling

    Exception and failure handling is another important issue in workflow management systems. Unexpected events may happen during workflow execution. Failure is one of the most common, but other examples abound (e.g., the mortgage applicant supplied a forged document in his application). Considerable work has been devoted to this issue [18, 45, 43, 21, 24, 48, 42, 58]. There are two common approaches. The first is to define exceptions explicitly in the workflow specification. In this approach, the workflow designer has to foresee all the possible failure situations and specify a reaction to each. The problems are that the number of workflow paths may explode, the specification is hard to change, and the workflow is hard to understand and maintain. The second approach is to handle exceptions in an ad hoc fashion when they happen. This approach lacks a formal structure, complicates execution control, and does not guarantee correctness. Most current solutions are a compromise between the two.

    When an exception occurs, the automatic workflow generation algorithm can be used to generate another workflow to continue from the point of the exception to the original workflow postcondition, or to another, less desirable workflow postcondition if no path can be found to the original workflow postcondition. The advantage of this is that not all exception handlers need to be specified or generated in advance; the workflow that is used to recover from an exception is generated on the fly based on the state of the workflow when the exception occurs.
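The idea of regenerating a workflow from the state reached at an exception can be sketched as follows. This is only an illustration under strong simplifying assumptions, not the generation algorithm developed in Chapter 4: states are sets of true propositions rather than first-order formulas, each task is reduced to a precondition plus the propositions it makes true or false, and the search is a plain breadth-first search. The `Task` type, the `generate` function, and the order-processing task library are all hypothetical.

```python
from collections import deque
from typing import FrozenSet, Iterable, List, NamedTuple, Optional

class Task(NamedTuple):
    name: str
    pre: FrozenSet[str]   # propositions that must hold before the task runs
    adds: FrozenSet[str]  # propositions the task makes true
    dels: FrozenSet[str]  # propositions the task makes false

def generate(state: Iterable[str], goal: Iterable[str],
             library: List[Task]) -> Optional[List[str]]:
    """Breadth-first search for a shortest task sequence leading from
    `state` to a state in which every proposition of `goal` holds."""
    start, target = frozenset(state), frozenset(goal)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        current, plan = queue.popleft()
        if target <= current:
            return plan
        for task in library:
            if task.pre <= current:
                nxt = (current - task.dels) | task.adds
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, plan + [task.name]))
    return None  # the goal is not reachable with this library

f = frozenset
LIBRARY = [
    Task("bill", f({"ordered"}), f({"billed"}), f()),
    Task("ship", f({"billed", "in_stock"}), f({"shipped"}), f()),
    Task("restock", f({"ordered"}), f({"in_stock"}), f()),
    Task("refund", f({"billed"}), f({"refunded"}), f({"billed"})),
]
```

With this library, `generate({"ordered", "in_stock"}, {"shipped"}, LIBRARY)` yields the sequence bill, ship. If an exception leaves the workflow in the state `{"ordered", "billed"}` (the item turned out not to be in stock), calling `generate` again from that state yields restock, ship; if the library has no `restock` task the original goal is unreachable and the caller can fall back to the less desirable goal `{"refunded"}`.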

  • Chapter 3
    Semantically correct transactions

    Serializability is the correctness criterion generally used in the literature to determine a schedule's correctness. Such a criterion is clearly inappropriate, however, in determining the correctness of schedules that are produced when an application is run at an isolation level lower than SERIALIZABLE, since such schedules might no longer be serializable. Hence, in this chapter, we use a correctness criterion proposed in [12] and [9] (and further developed in [10]), called semantic correctness, which requires that the interleaved execution of a set of transactions have the same semantic effect as a serial schedule of the same transactions. The semantic correctness condition for a transaction schedule is based on the conditions developed in [57] for the correct execution of an arbitrary concurrent program.

    The ANSI/ISO standard [3] defines three isolation levels lower than SERIALIZABLE: READ UNCOMMITTED, READ COMMITTED, and REPEATABLE READ. Database systems frequently use a locking protocol to implement these levels. In addition, at least one major database vendor uses SNAPSHOT isolation implemented through a combination of locking and multiversion techniques. An excellent analysis of these levels and a proposed locking implementation for each is given in [7].

    In this chapter, we assume the locking implementation described in [7] is used and analyze the way transactions can be interleaved at each isolation level. The semantic correctness of a transaction schedule depends on the pattern of interleaving. By taking this pattern into account it is possible to greatly reduce the semantic analysis called for in [57] that is required to demonstrate the correctness of general concurrent programs. We present conditions for semantic correctness for each isolation level based on the allowable interleaving patterns at that level. These conditions provide the formal basis that underlies the informal reasoning that justifies the use of non-serializable isolation levels. They allow the application designer to choose the lowest isolation level for each type of transaction of an application in order to achieve high-performance transaction processing.

    The rest of the chapter is organized as follows: Section 3.1 presents an overview of the semantic-correctness theory; Section 3.2 identifies the semantic conditions for each isolation level for conventional databases and Section 3.3 identifies their counterparts for relational databases; based on these conditions, Section 3.4 presents an algorithm to determine the lowest isolation level for each transaction of a particular application; Section 3.5 describes a customer order example to illustrate the analysis using these conditions; and finally, Section 3.6 summarizes the chapter.
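For orientation, the lock-based reading of the four ANSI levels can be written down as a small table. This is a sketch of the commonly cited lock-duration characterization of these levels; that it coincides in every detail with the implementation of [7] assumed in this chapter is itself an assumption, and the names `Lock`, `Policy`, and the helper functions are hypothetical. SNAPSHOT isolation is omitted because it is not defined by lock durations alone.

```python
from enum import Enum
from typing import Dict, NamedTuple

class Lock(Enum):
    NONE = 0   # no lock is acquired
    SHORT = 1  # released as soon as the operation completes
    LONG = 2   # held until the transaction commits or aborts

class Policy(NamedTuple):
    item_read: Lock       # read locks on individual data items
    predicate_read: Lock  # read locks on predicates (sets of rows)
    write: Lock           # write locks

# Assumed lock durations per isolation level (see the hedge above).
LOCKING: Dict[str, Policy] = {
    "READ UNCOMMITTED": Policy(Lock.NONE, Lock.NONE, Lock.LONG),
    "READ COMMITTED": Policy(Lock.SHORT, Lock.SHORT, Lock.LONG),
    "REPEATABLE READ": Policy(Lock.LONG, Lock.SHORT, Lock.LONG),
    "SERIALIZABLE": Policy(Lock.LONG, Lock.LONG, Lock.LONG),
}

def can_read_dirty(level: str) -> bool:
    """Without read locks, a read may see another transaction's uncommitted write."""
    return LOCKING[level].item_read is Lock.NONE

def item_reads_repeatable(level: str) -> bool:
    """Long item read locks keep a value stable between two reads of the same item."""
    return LOCKING[level].item_read is Lock.LONG

def phantoms_possible(level: str) -> bool:
    """Only long predicate read locks prevent rows from entering or leaving a predicate."""
    return LOCKING[level].predicate_read is not Lock.LONG
```

Read this way, each step up the table closes off one interleaving pattern, which is why the allowable patterns, and hence the semantic conditions, differ from level to level.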

3.1 Semantic correctness

When transactions are executed at isolation levels lower than SERIALIZABLE, the interleaving might no longer be serializable and correctness depends on what kind of interleaving can occur and what the transactions are doing. Hence we need a way of describing what each transaction does: its semantics. The semantics of a transaction, $T_i$, can be formally characterized by the triple

$\{I_i \wedge B_i \wedge (x_i = X_i)\}\ T_i\ \{I_i \wedge Q_i\}$    (1)

$I$ is the consistency constraint of the database, and $I_i$ represents the conjunction of those conjuncts of $I$ required for the correct execution of $T_i$. For example, the consistency constraint of a banking database might assert that all accounts have non-negative balances. However, the correct execution of a transaction that accesses a particular account requires only that the balance of that account be non-negative. $I_i$ is also a postcondition of $T_i$ since we require that any conjunct of $I$ that is made false during the execution of $T_i$ is returned to the true state when $T_i$ terminates. It follows that $I \equiv \bigwedge_{i=1}^{n} I_i$, where $n$ is the number of transaction types in the system. $B_i$ describes all conditions that $T_i$ assumes to be true of the arguments passed to it. For example, if $T_i$ is a deposit transaction and $dep$ is the parameter representing the money to be deposited, then $B_i$ might assert $dep \geq 0$.

$Q_i$ is called the result and asserts that $T_i$ has performed its intended function. Continuing the above example, if $T_i$ deposits into an account whose balance is $bal$, we need to assert as a postcondition of $T_i$ that the final balance is $dep$ more than the initial balance. In order to refer to the initial balance in the postcondition, we introduce a logical variable, $X_i$, whose sole purpose is to record the initial value of a database variable, $x_i$, whose value is changed by $T_i$. In the example, we characterize $T_i$ with the following triple

$\{bal \geq 0 \wedge dep \geq 0 \wedge bal = BAL\}\ T_i\ \{bal \geq 0 \wedge bal = BAL + dep\}$

(1) goes beyond the consistency requirement placed on a transaction by asserting that not only must $T_i$ move the database from one consistent state to another, but that only a subset of the consistent states are acceptable when the transaction terminates. (1) can be regarded as a formal restatement of the specification of $T_i$. We can demonstrate that $T_i$ is correct by proving that (1) is a theorem using a formal system such as that of [36], although this is not generally done.

To overcome the limit of serializability and to increase performance, in [9] we propose a new correctness criterion, called semantic correctness. A schedule, $Sch$, of transactions is semantically correct if

$\{I\}\ Sch\ \{I \wedge Q_{Sch}\}$    (2)

is true. First, a semantically correct schedule must maintain the consistency of the database, as indicated by the fact that $I$ is a pre- and postcondition of $Sch$. A semantically correct schedule must also transform the database to a state that reflects the cumulative results of all the transactions in $Sch$ in some order. We denote the assertion that describes that set of states by $Q_{Sch}$, the cumulative result. The relationship between $Q_{Sch}$ and the results, $Q_i$, of the individual transactions is described in [10]. In essence, $Sch$ is semantically correct if its postcondition is the same as the postcondition of a serial schedule of the same set of transactions, where the serial order is the order of transaction completion in $Sch$. For example, if $Sch$ consists of several deposit transactions on some bank account, $Q_{Sch}$ might assert that the final balance is greater than the initial balance by an amount equal to the sum of the deposits.

As illustrated in [9], semantic correctness is weaker than serializability, and it allows schedules that result in states that could not have been reached in any serial schedule. A semantically correct schedule can perform significantly better than any equivalent serial schedule [11].
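As an illustrative aside, the deposit triple above can be sanity-checked by brute force over a small range of values. The sketch below is not part of the dissertation: the helper names (`holds_triple`, `deposit`, `pre`, `post`) and the dictionary representation of a state are assumptions made for illustration, and exhaustive testing over a finite range is only a plausibility check, not a proof in a formal system such as that of [36].

```python
from itertools import product

def deposit(state):
    """Hypothetical body of the deposit transaction: bal := bal + dep."""
    new_state = dict(state)
    new_state["bal"] = state["bal"] + state["dep"]
    return new_state

def pre(state):
    # Precondition {bal >= 0 and dep >= 0 and bal = BAL}
    return state["bal"] >= 0 and state["dep"] >= 0 and state["bal"] == state["BAL"]

def post(state):
    # Postcondition {bal >= 0 and bal = BAL + dep}
    return state["bal"] >= 0 and state["bal"] == state["BAL"] + state["dep"]

def holds_triple(pre, stmt, post, states):
    """True if stmt maps every listed state satisfying pre to a state satisfying post."""
    return all(post(stmt(s)) for s in states if pre(s))

# Small finite state space, including states that violate the precondition.
states = [
    {"bal": b, "dep": d, "BAL": B}
    for b, d, B in product(range(-2, 5), repeat=3)
]
triple_ok = holds_triple(pre, deposit, post, states)
```

The logical variable BAL is modeled as an ordinary dictionary entry that the statement never modifies, which mirrors its role of recording the initial balance.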

A proof of (1) can be abbreviated by an annotated program in which each (atomic) statement of $T_i$, $S_{i,j}$, is preceded by an assertion, $P_{i,j}$, its precondition, describing the state of the system at the time $S_{i,j}$ is eligible for execution. Hence, assertions are associated with control points and we say that an assertion and its corresponding control point are active if the following statement is eligible for execution. Each assertion states some condition on the values of items in $T_i$'s workspace and in the database. If $S_{i,j}$ is executed starting in a state where $P_{i,j}$ is true, the next assertion, $P_{i,j+1}$, will be true of the state when $S_{i,j}$ terminates. Hence, if when each statement is executed its precondition is true, the postcondition of the transaction will be true when the transaction terminates.

The major issue with concurrency is invalidation: if the execution of the statements of $T_i$ and $T_k$ ($i \neq k$) are interleaved, $P_{i,j}$ might not be true of the database state when $S_{i,j}$ is initiated. Thus, if $S_{k,l}$ is executed when $P_{i,j}$ is active and true, it might transform the state to one in which $P_{i,j}$ is false. If this occurs, we say that $S_{k,l}$ has invalidated $P_{i,j}$. For example, the execution of the statement $x := x + 1$ will invalidate the assertion $x = y$, but not the assertion $x > y$. If, during execution of $T_i$, an active assertion is invalidated, its subsequent behavior will be unpredictable and semantic correctness is not guaranteed. A sufficient condition to ensure that no invalidation occurs is that, for all preconditions $P_{i,j}$ and all statements $S_{k,l}$, the triple [57]:

$\{P_{i,j} \wedge P_{k,l}\}\ S_{k,l}\ \{P_{i,j}\}$    (3)

is a theorem for all $S_{i,j}$ and $S_{k,l}$, and

$\{Q_i \wedge P_{k,l}\}\ S_{k,l}\ \{Q_i\}$    (4)

are theorems for all $i$ and $S_{k,l}$.

If (3) cannot be proven, we say that the proofs of $T_i$ and $T_k$ interfere with one another and in particular, $S_{k,l}$ interferes with $P_{i,j}$. Hence there is a possibility of invalidation at run time if the interleaving actually occurs. Interference is static. It is a property of the proofs of the transactions. Invalidation is dynamic. It is a result of interference if the interfering operation is executed when the interfered with assertion is active. Hence, interference does not necessarily lead to invalidation. For example, in [10] each transaction is decomposed into atomic isolated steps. A concurrency control is used to prevent those step interleavings that lead to invalidation from happening, thus ensuring that all schedules are semantically correct.

In [10] we apply the results of [57] to a schedule, $Sch$, of transactions and show that $Sch$ is semantically correct if for all transactions, $T_i$, in $Sch$

• $P_{i,j}$ is true when $S_{i,j}$ is executed and
• $Q_i$ is not invalidated by any step of a transaction interleaved in $Sch$ with $T_i$.

By extension, we say that a particular transaction, $T_i$, is semantically correct in $Sch$ if these conditions apply to $P_{i,j}$ and $Q_i$.

As indicated by (3), checking for non-interference involves examining each statement and each assertion of all transactions. This checking requires a significant amount of work. For example, in a system of $K$ transaction types, each containing $N$ operations, $(KN)^2$ possible triples must be checked. When the isolation levels of transactions are taken into account, however, the number of triples that must be checked is greatly reduced since the locking discipline that implements the levels prevents certain interleavings from occurring. Hence, although a particular triple (3) may not be a theorem, it cannot result in invalidation if the locking discipline prevents $S_{k,l}$ from being executed when $P_{i,j}$ is active. A major goal of this research is to determine, for each isolation level, which triples must be checked. This dramatically reduces the amount of analysis. For example, for SNAPSHOT isolation only $K^2$ triples must be checked, regardless of the number of operations per transaction.

One additional consideration must be dealt with: the invalidation of a transaction's result after it terminates. A result cannot be invalidated if it does not reference database variables. For example, the result of the buy transaction of the stock trading application, "when each share was purchased, no cheaper unbought share of the stock existed in the database," refers to a snapshot of the database and hence cannot be invalidated by subsequent changes to the database. Similarly, the postcondition of the deposit transaction will assert that the value of $bal$ at the time the transaction committed was $dep$ more than the value initially read by the transaction. Results are generally of this form: they describe transformations rather than making assertions about final values.

3.2 Semantic conditions for conventional databases

In this section we present conditions which, for each isolation level, enumerate the non-interference theorems that must be demonstrated in order to ensure semantic correctness. We first consider conventional databases and then relational databases. In conventional databases, no database items are deleted or inserted, and each item is referred to by name in read or write statements. In relational databases, predicates are used in SQL statements to specify the database items they access.

3.2.1 Model

A transaction program accesses local variables (in its workspace) and database variables using the following constructs:

• Assignment statement. There are three kinds of assignment statements: a read statement, which atomically assigns the value of a database item to a local variable, a write statement, which atomically assigns the value of a local variable to a database item, and a local assignment statement, which does not involve any database items.
• Conditional statement. We assume that the condition is constructed from local variables.
• Loop statement. We assume that the condition is constructed from local variables.

Local variables will be denoted $X, Y$ and database variables will be denoted $x, y$. We use the notation $sp(P, S_{i,k})$ to denote the strongest postcondition of $S_{i,k}$ that can be asserted when it is executed starting in a state that satisfies $P$. We assume the protocols given in [7] are used to implement all isolation levels.

3.2.2 Some lemmas

In this subsection, we prove some lemmas which are the basis for proving theorems in the rest of the chapter.

Definition 3.2.1 (Write block) A write block of a transaction $T_j$ is a piece of code of $T_j$ which contains at least one write statement. □

Thus, any single write statement of $T_j$ is a write block of $T_j$, and $T_j$ is a write block of $T_j$ if it contains at least one write statement.

Lemma 3.2.2 Let $S_{i,k}: X := e$ be a local assignment statement of transaction $T_i$, in which $X$ is a local variable and $e$ only involves local variables, and let $S_{j,h}$ be a write block of another transaction $T_j$. Suppose $S_{i,k}$ is characterized by the triple $\{P\}\ S_{i,k}\ \{Q\}$, where $Q \equiv sp(P, S_{i,k})$. If $S_{j,h}$ does not interfere with $P$, then $S_{j,h}$ does not interfere with $Q$.

Proof: The strongest postcondition, $Q$, of $S_{i,k}$ with precondition $P$ is given by [33]:

$Q \equiv \exists v\,(P^{X}_{v} \wedge X = e^{X}_{v})$

Since $S_{j,h}$ does not interfere with $P$, the triple

$\{P \wedge P'\}\ S_{j,h}\ \{P\}$    (5)

is a theorem, where $P'$ is the precondition of $S_{j,h}$. Since $X$ is a local variable of $T_i$, transaction $T_j$ cannot change its value, and it follows from (5) that

$\{P^{X}_{v} \wedge P'\}\ S_{j,h}\ \{P^{X}_{v}\}$    (6)

where $v$ is an arbitrary value of $X$. Our goal is to show that

$\{Q \wedge P'\}\ S_{j,h}\ \{Q\}$    (7)

Suppose $q$ is an arbitrary state satisfying $Q \wedge P'$, and $q'$ is the state that results after $S_{j,h}$ is executed starting in $q$. We would like to show that $q'$ satisfies $Q$. Let $v'$ be a value of $X$ that makes $Q$ true in $q$. Then $P^{X}_{v'}$ is true in $q$ and from (6) $P^{X}_{v'}$ is true in $q'$ as well. Furthermore, since $X$ is a local variable and $e^{X}_{v'}$ only involves local variables, $X = e^{X}_{v'}$ is still true of $q'$. Hence, $Q$ is true of $q'$. Since $q$ is an arbitrary state satisfying $Q \wedge P'$, it follows that (7) is a theorem. □

Lemma 3.2.3 Let $S_{i,k}: x := X$ be a write statement of transaction $T_i$, and let $S_{j,h}$ be a write block of another transaction $T_j$. Suppose $S_{j,h}$ does not write $x$ and $S_{i,k}$ is characterized by the triple $\{P\}\ S_{i,k}\ \{Q\}$ where $Q \equiv sp(P, S_{i,k})$. If $S_{j,h}$ does not interfere with $P$, then $S_{j,h}$ does not interfere with $Q$.

Proof: The strongest postcondition, $Q$, of $S_{i,k}$ with precondition $P$ is given by [33]:

$Q \equiv \exists v\,(P^{x}_{v} \wedge x = X)$

Since $S_{j,h}$ does not interfere with $P$, the triple

$\{P \wedge P'\}\ S_{j,h}\ \{P\}$    (8)

is a theorem, where $P'$ is the precondition of $S_{j,h}$. Note that since $S_{j,h}$ does not change $x$, it follows from (8) that

$\{P^{x}_{x'} \wedge P'\}\ S_{j,h}\ \{P^{x}_{x'}\}$    (9)

where $x'$ is an arbitrary value of $x$. Our goal is to show

$\{Q \wedge P'\}\ S_{j,h}\ \{Q\}$    (10)

Suppose $q$ is an arbitrary state satisfying $Q \wedge P'$, and $q'$ is the state that results after $S_{j,h}$ is executed starting in $q$. We would like to show that $q'$ satisfies $Q$. Let $v'$ be a value that makes $Q$ true in $q$. Then $P^{x}_{v'}$ is true in $q$ and from (9) $P^{x}_{v'}$ is true in $q'$ as well. Furthermore, since $S_{j,h}$ does not change $x$, if $x = X$ is true of $q$ it is true of $q'$ as well. Hence $Q$ is true of $q'$. Since $q$ is an arbitrary state satisfying $Q \wedge P'$, it follows that (10) is a theorem. □

3.2.3 Semantic condition for READ UNCOMMITTED

The locking implementation for READ UNCOMMITTED [7] requires that transactions obtain long-term write locks on items that they write (see footnote 1), but no read locks are acquired on items that they read. Long-term locks are held until the transaction completes. The following theorem states a condition under which a transaction will execute correctly at READ UNCOMMITTED.

Footnote 1: Database systems often prohibit transactions at this level from updating the database. We ignore this restriction here.

Theorem 3.2.4 The execution of a transaction $T_i$ at READ UNCOMMITTED will be semantically correct if each write statement (including those that roll back a transaction) in every transaction does not interfere with $I_i$, the postcondition of every read statement in $T_i$, and $Q_i$.

Proof: Consider any transaction $T_j$ in the system and an arbitrary execution path of transaction $T_i$, and label the control points along this path $\sigma_1, \sigma_2, \ldots, \sigma_n$. Let $P_{\sigma_k}$ be the assertion associated with $\sigma_k$ in the proof (1) of $T_i$. In the following, we show that, when each control point, $\sigma_k$, of the path is active, the state of the system, denoted by $state(\sigma_k)$, satisfies an assertion $P'_{\sigma_k}$ such that:

1. $P'_{\sigma_k} \Rightarrow P_{\sigma_k}$, and
2. for each write statement, $S_{j,h}$, of transaction $T_j$, either $S_{j,h}$ does not interfere with $P'_{\sigma_k}$, or if it does, it will not invalidate $P'_{\sigma_k}$.

The proof is by induction on $k$.

1. Induction basis $k = 1$: Let $P'_{\sigma_1} \equiv P_{\sigma_1}$. Since $\sigma_1$ is the first control point, $P_{\sigma_1}$ is the precondition of $T_i$ and $P_{\sigma_1} \equiv P_{i,1}$. In Section 3.1, we characterize the precondition of $T_i$ as $I_i \wedge B_i \wedge (x_i = X_i)$. We assume $B_i$ only involves local variables of $T_i$, thus it cannot be interfered with by $S_{j,h}$; $(x_i = X_i)$ states that at the time $T_i$ was started the variables in $x_i$ were equal to some values, hence it cannot be interfered with by $S_{j,h}$ either; by the conditions of the theorem, $I_i$ is not interfered with by $S_{j,h}$. Thus, $P_{i,1}$ is not interfered with by $S_{j,h}$.

2. Induction hypothesis: For all control points $\sigma_i$ in the execution path $\sigma_1 \cdots \sigma_m$, $state(\sigma_i)$ satisfies an assertion $P'_{\sigma_i}$ that satisfies conditions 1 and 2 above.

3. Induction step: We need to exhibit an assertion, $P'_{\sigma_{m+1}}$, satisfying conditions 1 and 2 above. Consider all possible control point transitions from $\sigma_m$ to $\sigma_{m+1}$:

(a) $T_i$ executes a read statement. Let $P'_{\sigma_{m+1}} \equiv P_{\sigma_{m+1}}$. By the conditions of the theorem, $P_{\sigma_{m+1}}$, and thus $P'_{\sigma_{m+1}}$, is not interfered with by $S_{j,h}$.

(b) $T_i$ executes a write statement, $stmt$, that writes the same database item as $S_{j,h}$. Let $P'_{\sigma_{m+1}} \equiv sp(P'_{\sigma_m}, stmt)$. Since $P'_{\sigma_m} \Rightarrow P_{\sigma_m}$, it follows that $P'_{\sigma_{m+1}} \Rightarrow P_{\sigma_{m+1}}$. Furthermore, $P'_{\sigma_{m+1}}$ cannot be invalidated by $S_{j,h}$ because if $stmt$ has executed, $T_i$ has acquired a write lock on the data item written by $stmt$. Thus, $S_{j,h}$ cannot be executed until $T_i$ terminates. (Note that in this case it is possible that $S_{j,h}$ interferes with $P'_{\sigma_{m+1}}$.)

(c) $T_i$ executes a write statement, $stmt$, that writes a database item that is distinct from the item written by $S_{j,h}$. Let $P'_{\sigma_{m+1}} \equiv sp(P'_{\sigma_m}, stmt)$. Since $P'_{\sigma_m} \Rightarrow P_{\sigma_m}$, it follows that $P'_{\sigma_{m+1}} \Rightarrow P_{\sigma_{m+1}}$. Furthermore, since $P'_{\sigma_m}$ is not interfered with by $S_{j,h}$, it follows from Lemma 3.2.3 that $P'_{\sigma_{m+1}}$ is not interfered with by $S_{j,h}$.

(d) $T_i$ executes a local assignment, $stmt$. Let $P'_{\sigma_{m+1}} \equiv sp(P'_{\sigma_m}, stmt)$. Since $P'_{\sigma_m} \Rightarrow P_{\sigma_m}$, it follows that $P'_{\sigma_{m+1}} \Rightarrow P_{\sigma_{m+1}}$. Since $P'_{\sigma_m}$ is not interfered with by $S_{j,h}$, it follows from Lemma 3.2.2 that $P'_{\sigma_{m+1}}$ is not interfered with by $S_{j,h}$.

(e) $T_i$ enters the THEN body of a conditional statement with guard $G$. Let $P'_{\sigma_{m+1}} \equiv P'_{\sigma_m} \wedge G$. Since $P'_{\sigma_m} \Rightarrow P_{\sigma_m}$, it follows that $(P'_{\sigma_m} \wedge G) \Rightarrow (P_{\sigma_m} \wedge G)$. Since $P'_{\sigma_m}$ is not interfered with by $S_{j,h}$ and $G$ only involves local variables, $(P'_{\sigma_m} \wedge G)$ is not interfered with by $S_{j,h}$.

(f) $T_i$ enters the ELSE body of a conditional statement with guard $G$. Let $P'_{\sigma_{m+1}} \equiv P'_{\sigma_m} \wedge \neg G$. The argument is the same as the previous case.

(g) $T_i$ enters (or re-enters) the body of a while loop with guard $G$. Let $P'_{\sigma_{m+1}} \equiv P'_{\sigma_m} \wedge G$. Since $P'_{\sigma_m} \Rightarrow P_{\sigma_m}$, it follows that $(P'_{\sigma_m} \wedge G) \Rightarrow (P_{\sigma_m} \wedge G)$, i.e., $P'_{\sigma_{m+1}} \Rightarrow P_{\sigma_{m+1}}$. Since $P'_{\sigma_m}$ is not interfered with by $S_{j,h}$ and $G$ only involves local variables, $P'_{\sigma_{m+1}}$ is not interfered with by $S_{j,h}$.

(h) $T_i$ exits from a while loop with guard $G$. Let $P'_{\sigma_{m+1}} \equiv P'_{\sigma_m} \wedge \neg G$. The argument is the same as the previous case.

Thus when $T_i$ commits, $Q'_i$ will be true of the final state, where $Q'_i \Rightarrow (I_i \wedge Q_i)$. As one of the conditions of the theorem, $I_i \wedge Q_i$ is not interfered with by $S_{j,h}$. Hence none of the assertions of $T_i$ will be invalidated. Since the proof is done for an arbitrary execution of $T_i$ and an arbitrary write statement of $T_j$, the semantic correctness of $T_i$ is guaranteed. □

Since a transaction executing at READ UNCOMMITTED can read uncommitted data, it is necessary to consider the interference caused by write statements that roll back any transaction.

Suppose a transaction executing at READ UNCOMMITTED executes a statement, $s$, that updates data item $x$. The fact that the transaction will then hold a long-term write lock on $x$ implies that no write statement, $s'$, accessing $x$ in a concurrent transaction can be interleaved after $s$, and hence $s'$ cannot possibly invalidate $post(s)$ (or any subsequent assertions).

Example 3.2.5 The elements of array cust are records describing a merchant's customers, and an integrity constraint of the database, $I$, asserts this fact. Two transaction types access the array. The Mailing_List transaction type scans the array and prints a mailing label. The specification of the transaction requires only that each printed label contains a valid name and address. The New_Order transaction type enters a new record into the array if the customer placing the new order is not described by an existing record.

Each time a label is printed by a Mailing_List transaction we would like to assert "The printed label contains a valid name and address". Neither $I$ nor this assertion is interfered with by either the insertion of a new record by an instance of New_Order or the deletion of a record if that instance is rolled back. Hence Mailing_List transactions can run at READ UNCOMMITTED.

The database might include a second array in which each record describes an order that has been placed. A transaction that analyzes last year's orders can run at READ UNCOMMITTED since only this year's orders are subject to modification. □

3.2.4 Semantic condition for READ COMMITTED

The locking implementation of READ COMMITTED [7] requires that transactions obtain long-term write locks on items that they write and short-term read locks on items that they read. A short-term lock is released when the operation completes. The following theorem states a condition under which a transaction will execute correctly at READ COMMITTED.

Theorem 3.2.6 A transaction, $T_i$, executed at READ COMMITTED will execute semantically correctly if each transaction does not interfere with the postcondition of every read statement in $T_i$, and with $Q_i$.

Proof: Let $T_j$ be an arbitrary transaction in the system. Since the isolation level of $T_j$ is at least READ UNCOMMITTED, $T_j$ will hold a long-term write lock on any item it writes. (We do not consider the case in which the mixed system includes SNAPSHOT isolation, since we assume that SNAPSHOT isolation does not use a locking scheme.) Since at the READ COMMITTED level $T_i$ uses short-term read locks, $T_i$ cannot read any item written by $T_j$ until $T_j$ terminates. Thus, $T_i$ either sees the whole result of $T_j$ or it does not see any result of $T_j$, and when we reason about the semantic correctness of a schedule that includes $T_i$, as far as $T_i$ is concerned, $T_j$ can be considered as a single isolated unit. Using the same reasoning as employed in the proof of Theorem 3.2.4, we can prove that no assertion of $T_i$ can be invalidated by $T_j$. Note that since $T_i$ preserves $I$, the only conjunct of the postcondition of $T_i$ that can be interfered with by $T_j$ is $Q_i$, whose non-interference is stated explicitly as one of the conditions of the theorem. Also note that we consider those write statements that roll back $T_j$ as part of $T_j$. □

Example 3.2.7 If the specification of the Mailing_List transaction is strengthened to require that all labels refer to customers, Mailing_List transactions must be run at least at the READ COMMITTED level since the new postcondition of Mailing_List, "The printed label refers to a customer", is interfered with by the update statement that deletes an entry in cust if New_Order is rolled back.

In addition to cust, the database might contain an array emp, with one record for each employee. emp[i].rate is the ith employee's hourly rate, emp[i].num_hrs is the number of hours that employee has worked so far this week, and emp[i].sal is that employee's accumulated salary for the week. A conjunct of the integrity constraint, $I_{sal}$, asserts that for all records in emp, "emp[i].rate × emp[i].num_hrs = emp[i].sal".

The granularity of locking is at the level of records. An instance of transaction type Hours is executed at the end of each workday to record the number of hours worked by an employee that day. It executes one write statement to increment emp[i].num_hrs and another to update the accumulated salary. Hence, although the two write statements together preserve $I_{sal}$, individually they do not. A second transaction type, Print_Records, causes the records to be printed. Its specification requires that each printed record is a consistent snapshot of that employee's information at the time the record is printed. This specification makes it necessary that Print_Records be run at a level no lower than READ COMMITTED for two reasons:

• Only committed information can be printed.
• It follows from Theorem 3.2.4 that, in order to ensure that the snapshot of emp[i] is consistent, the write statements of Hours must be seen as an atomic unit by Print_Records.

The specification does not require that all printed records come from the same committed snapshot of emp. Hence, it is not necessary to prohibit instances of Hours from updating records that have been printed while Print_Records is printing other records. As a result, the long-term read locks that would be acquired on each record printed if Print_Records were run at REPEATABLE READ are not required. □

3.2.5 Semantic condition for READ COMMITTED with first-committer-wins

The READ COMMITTED with first-committer-wins isolation level is an extension of READ COMMITTED with one feature from the SNAPSHOT isolation level. Transactions obtain long-term write locks on items that they write and short-term read locks on items that they read. In addition, if $T_1$ writes a data item and commits between the times that $T_2$ has read and attempts to write the item, $T_2$ will be aborted (first-committer-wins). READ COMMITTED with first-committer-wins is easily and often implemented in relational databases by running an application at the READ COMMITTED level and encoding (perhaps using sequence numbers), in the UPDATE statements of the application, checks to determine whether the data item to be updated has changed since it was read. The isolation level is also supported by a number of vendors. Some vendors call this level READ COMMITTED with optimistic reads.
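The sequence-number encoding mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the table `item`, its `version` column, and the helper `update_if_unchanged` are hypothetical names, SQLite is used only because it ships with Python, and a single in-memory connection stands in for two application transactions that both read the item before either writes it.

```python
import sqlite3

def update_if_unchanged(conn, item_id, new_value, version_read):
    """Write new_value only if the row still carries the version that was read."""
    cur = conn.execute(
        "UPDATE item SET value = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_value, item_id, version_read),
    )
    # Zero affected rows means another writer changed the item after it was read.
    return cur.rowcount == 1

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item (id INTEGER PRIMARY KEY, value INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO item VALUES (1, 100, 0)")
conn.commit()

# Both simulated transactions read the item while it is at the same version.
value, version = conn.execute(
    "SELECT value, version FROM item WHERE id = 1"
).fetchone()

t1_ok = update_if_unchanged(conn, 1, value + 10, version)  # first committer
conn.commit()
t2_ok = update_if_unchanged(conn, 1, value - 30, version)  # holds a stale version
conn.commit()

final_value = conn.execute("SELECT value FROM item WHERE id = 1").fetchone()[0]
```

In this sketch the second update matches no row, so the application would treat the second transaction as aborted and retry it from its read.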

Theorem 3.2.8 A transaction, $T_i$, executed at READ COMMITTED with first-committer-wins will execute semantically correctly if each transaction does not interfere with the postconditions of those read statements in $T_i$ that are not followed by a write statement on the same item, and with $Q_i$.

Proof: Let $T_j$ be an arbitrary transaction in the system. Since the isolation level of $T_j$ is at least READ UNCOMMITTED, $T_j$ will hold a long-term write lock on any item it writes. (We do not consider the case in which the mixed system includes SNAPSHOT isolation, since we assume that SNAPSHOT isolation does not use a locking scheme.) Since at the READ COMMITTED level $T_i$ uses short-term read locks, $T_i$ cannot read any item written by $T_j$ until $T_j$ terminates. Thus, $T_i$ either sees the whole result of $T_j$ or it does not see any result of $T_j$, and when we reason about the semantic correctness of a schedule that includes $T_i$, as far as $T_i$ is concerned, $T_j$ can be considered as a single isolated unit.

If $T_j$ does not interfere with the postcondition of any read statement in $T_i$, then, since $T_j$ does not interfere with $Q_i$ (one of the conditions of the theorem), the semantic conditions for READ COMMITTED hold. Hence $T_i$ will execute semantically correctly at READ COMMITTED, and thus at READ COMMITTED with first-committer-wins, since the second level is higher than the first.

If $T_j$ does interfere with the postcondition of a read statement of $T_i$, say $\{P\}\ X := x\ \{Q\}$, then according to the conditions of the theorem, this read statement must be followed by a write statement on $x$. In this case, if $T_j$ is interleaved between the read statement and the corresponding write statement, $T_i$ will be aborted. Otherwise, the postcondition of the read statement will not be invalidated.

Note that READ COMMITTED with first-committer-wins effectively holds a read lock on the data item during the period between the read statement and the corresponding write statement that follows it, and a long-term write lock thereafter. □

Note that if transaction $T_i$ writes all the data items it reads, then when $T_i$ commits, $T_i$ has effectively held long-term read locks on the data items that it read, and hence in this case READ COMMITTED with first-committer-wins is equivalent to REPEATABLE READ. We give an example of correct execution at the READ COMMITTED with first-committer-wins level in Section 3.5.

3.2.6 Semantic condition for REPEATABLE READ

The locking implementation of the REPEATABLE READ isolation level [7] requires that a transaction acquire long-term read and write locks on the data items that it accesses. The only problem at the REPEATABLE READ level is the possibility of phantoms [26]. Since phantoms do not occur in conventional (non-relational) databases, REPEATABLE READ ensures serializability and hence semantic correctness. Thus we have the theorem:

Theorem 3.2.9 Under the conventional database model, a transaction executed at REPEATABLE READ executes semantically correctly.

Proof: For the conventional database model, REPEATABLE READ ensures that only serializable schedules can be produced. Any serializable schedule can be transformed into an equivalent serial schedule, in which no assertion of an arbitrary transaction $T_i$ will be invalidated by any step of a transaction interleaved with $T_i$ (since no other transaction is interleaved with $T_i$ in a serial schedule). Thus the semantic correctness of $T_i$ is guaranteed. □

3.2.7 Semantic condition for SNAPSHOT isolation

SNAPSHOT isolation is not one of the ANSI/ISO standard isolation levels but is implemented in at least one commercial DBMS. The implementation of SNAPSHOT isolation given in [7] does not use locks. Instead, it uses a multiversion concurrency control that satisfies each read request made by transaction $T_i$ with values from the version of the database, called a snapshot, that reflects the effect of all committed transactions at the time $T_i$ was started. Hence read requests never wait. Writing is deferred until the transaction commits. $T_i$ can be committed as long as no other transaction that committed after $T_i$'s first read has updated a data item that $T_i$ has also updated. This mechanism is referred to as first committer wins, because the first transaction that has updated a particular data item and requests to commit is allowed to do so, while concurrent transactions that have updated that item are ultimately aborted. Thus first committer wins has at least the effect of long-term write locks on the items written.

We model a transaction $T_i$ at the SNAPSHOT isolation level as two isolated atomic steps: a read step followed by a write step. The read step reads a snapshot of the database that reflects the effect of all committed transactions at the time $T_i$ was started. The write step is the remainder of the transaction. The step boundary reflects the fact that other transactions can commit while $T_i$ is active, creating new versions of the database that might invalidate assertions that $T_i$ has made about the database based on its snapshot. If $T_i$ commits, its write step must commute with the write steps of these other transactions because they must have written to disjoint data items. Note that the postcondition of the snapshot does not necessarily state that the value of a data item in a snapshot is equal to the most recent committed value of that data item. It only needs to be strong enough to support the proof of the transaction [12].
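The snapshot read and first-committer-wins commit rule described above can be illustrated with a toy multiversion store. This is a sketch under simplifying assumptions, not the implementation of [7]: the class `SnapshotStore` and its methods are hypothetical, a logical counter serves as the clock, and a transaction's start time is used as a stand-in for the time of its first read.

```python
class SnapshotStore:
    """Toy multiversion store; each key maps to a list of (commit_ts, value)."""

    def __init__(self, initial):
        self.clock = 0
        self.versions = {key: [(0, value)] for key, value in initial.items()}

    def begin(self):
        # A transaction remembers its start time and buffers its writes.
        return {"start": self.clock, "writes": {}}

    def read(self, txn, key):
        # Reads never wait: return the transaction's own deferred write, or
        # the latest version committed at or before the transaction started.
        if key in txn["writes"]:
            return txn["writes"][key]
        visible = [(ts, v) for ts, v in self.versions[key] if ts <= txn["start"]]
        return max(visible, key=lambda pair: pair[0])[1]

    def write(self, txn, key, value):
        txn["writes"][key] = value  # deferred until commit

    def commit(self, txn):
        # First committer wins: abort if any item in the write set was
        # committed by another transaction after this one started.
        for key in txn["writes"]:
            last_commit_ts = self.versions[key][-1][0]
            if last_commit_ts > txn["start"]:
                return False
        self.clock += 1
        for key, value in txn["writes"].items():
            self.versions[key].append((self.clock, value))
        return True

store = SnapshotStore({"x": 10, "y": 20})
t1, t2, reader = store.begin(), store.begin(), store.begin()
store.write(t1, "x", store.read(t1, "x") + 1)
store.write(t2, "x", store.read(t2, "x") + 5)
t1_committed = store.commit(t1)   # first committer on x
t2_committed = store.commit(t2)   # concurrent writer of x
old_view = store.read(reader, "x")          # snapshot taken before t1 committed
new_view = store.read(store.begin(), "x")   # snapshot taken after t1 committed
```

Two transactions whose write sets are disjoint would both pass the commit check in this sketch, which is the situation the non-interference condition of the following theorem is meant to cover.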

Theorem 3.2.10 A schedule produced under SNAPSHOT isolation is semantically correct if, given any two transactions $T_i$ and $T_j$ from the schedule, either:

1. $T_i$'s write set intersects $T_j$'s write set, or
2. $T_j$ does not interfere with the postcondition of the read step of $T_i$, and with $Q_i$.

Proof: The read step of $T_i$ reads the snapshot of the database that reflects the effect of all transactions committed at the time $T_i$ was started. This snapshot either reflects the whole result of $T_j$ or it does not reflect any result of $T_j$. Thus, when we reason about the semantic correctness of $T_i$ in a schedule that includes $T_i$ and $T_j$, $T_j$ can be considered as a single isolated unit. If (1) applies, then either $T_i$ or $T_j$ will be aborted and has no effect. If not, then using Lemma 1, Lemma 2 and condition 2 of the theorem, it follows that no assertion in $T_i$ will be invalidated by $T_j$. Note that since $T_j$ preserves $I$, the precondition of $T_i$ is not interfered with by $T_j$. □

Example 3.2.11 Suppose we have two types of withdraw transactions, Withdraw_sav(i, w) and Withdraw_ch(i, w), which withdraw w from the i-th depositor's savings and checking accounts, respectively. Savings and checking account information is held in arrays acct_sav and acct_ch respectively, and a conjunct of the integrity constraint, $I_{bal}$, requires that acct_sav[i].bal + acct_ch[i].bal ≥ 0. An annotated version of the Withdraw_sav program is given in Figure 1. The annotation for Withdraw_ch is similar.

Withdraw_sav(i, w)
BEGIN TRANSACTION
    {acct_sav[i].bal + acct_ch[i].bal ≥ 0 ∧ Sav = Sav_0}
    Sav := acct_sav[i].bal; Ch := acct_ch[i].bal;
    {acct_sav[i].bal + acct_ch[i].bal ≥ 0
     ∧ acct_sav[i].bal + acct_ch[i].bal ≥ Sav + Ch ∧ Sav = Sav_0}
    if (Sav + Ch ≥ w) then
        {acct_sav[i].bal + acct_ch[i].bal ≥ 0
         ∧ acct_sav[i].bal + acct_ch[i].bal ≥ Sav + Ch
         ∧ Sav + Ch ≥ w ∧ Sav = Sav_0}
        acct_sav[i].bal := Sav - w;
    fi
    {acct_sav[i].bal + acct_ch[i].bal ≥ 0 ∧ (acct_sav[i].bal = Sav_0 - w)}
END TRANSACTION
Figure 1: Withdraw from savings account

Sav and Ch are local variables. The postcondition of the read step of Withdraw_sav is interfered with by the write step of Withdraw_ch. Hence the conditions of the theorem are not satisfied, and a concurrent schedule of the two transactions might not be semantically correct. A schedule in which the write step of one transaction is interleaved between the read and write steps of the other exhibits write skew [7]. Note that although this same precondition is also interfered with by another instance of Withdraw_sav, a concurrent schedule in which two instances of Withdraw_sav are interleaved is semantically correct, because the write sets intersect and the first-committer-wins rule implies that one of them will be aborted (this is reflected in the first condition of the theorem). Finally, a Deposit_sav (Deposit_ch) transaction, which adds money to acct_sav (acct_ch), does not interfere with the postcondition of the read step of Withdraw_sav. In this case, their concurrent execution is semantically correct (this is reflected in the second condition of the theorem). □
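The two cases of Example 3.2.11 can be replayed in a few lines of code. The following is only a minimal sketch under stated assumptions: `SnapshotDB`, `Conflict`, `withdraw` and `run_concurrently` are hypothetical names that do not appear in the dissertation, a transaction is reduced to one read step and one write step, and first-committer-wins is approximated with a per-item commit counter.

```python
class Conflict(Exception):
    """Raised when first-committer-wins aborts a transaction (illustrative)."""


class SnapshotDB:
    """Toy snapshot-isolation store; not a model of any real DBMS."""

    def __init__(self, data):
        self.data = dict(data)
        self.version = {key: 0 for key in data}  # commits seen per item

    def begin(self):
        # A transaction sees the committed state as of its start.
        return {"snapshot": dict(self.data),
                "seen": dict(self.version),
                "writes": {}}

    def commit(self, txn):
        # First-committer-wins: abort if any written item was committed
        # by another transaction after this one took its snapshot.
        for key in txn["writes"]:
            if self.version[key] != txn["seen"][key]:
                raise Conflict(key)
        for key, value in txn["writes"].items():
            self.data[key] = value
            self.version[key] += 1


def withdraw(txn, target, w):
    """Read step, then a write step on `target` if funds look sufficient."""
    sav, ch = txn["snapshot"]["sav"], txn["snapshot"]["ch"]
    if sav + ch >= w:
        txn["writes"][target] = txn["snapshot"][target] - w


def run_concurrently(db, first_target, second_target, w):
    """Both transactions read before either commits."""
    t1, t2 = db.begin(), db.begin()
    withdraw(t1, first_target, w)
    withdraw(t2, second_target, w)
    db.commit(t1)
    try:
        db.commit(t2)
        return "both committed"
    except Conflict:
        return "second aborted"


# Write skew: disjoint write sets, so both commit and I_bal is violated.
skew_db = SnapshotDB({"sav": 50, "ch": 50})
skew_outcome = run_concurrently(skew_db, "sav", "ch", 80)

# Two Withdraw_sav instances: intersecting write sets, one is aborted.
same_db = SnapshotDB({"sav": 50, "ch": 50})
same_outcome = run_concurrently(same_db, "sav", "sav", 80)
```

In the first run the write sets are disjoint, so neither condition of the theorem protects the schedule; in the second run the intersecting write sets trigger the abort that keeps the balance constraint intact.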

3.3 Semantic conditions for relational databases

In adapting the conditions for semantic correctness to relational databases, we must deal with database operations that involve predicates. The read statement is now the SELECT and its postcondition might assert that it read all the tuples that satisfy a certain predicate. Similarly, the write statements are UPDATE, INSERT, and DELETE and their postconditions might assert that they wrote, inserted, or deleted all the tuples that satisfy a certain predicate.

Interference now takes new forms. For example, the postcondition of a SELECT statement might assert that the statement read all the tuples that satisfy a predicate, $P$. That assertion can be interfered with by another transaction that inserts a phantom tuple that also satisfies $P$.

Phantoms can occur in connection with write statements as well as in connection with the SELECT. Thus, the postcondition of an UPDATE that asserts that the values of all tuples satisfying $P$ have been updated can be interfered with by an INSERT that inserts a phantom tuple that satisfies $P$. That interference might not cause invalidation of the predicate, however, if the locking policy prevented the INSERT from executing after the UPDATE had taken place.

The locking policy for implementing the ANSI isolation levels discussed in [7] states that all "write locks on data items and predicates (are) long duration". Thus when an UPDATE, INSERT, or DELETE statement refers to a predicate, that predicate is write-locked for the duration of the transaction, and phantoms cannot be inserted into that predicate. Most DBMSs do not implement predicate locking, but instead use a locking protocol (perhaps consisting of some combination of table locks and index locks) that is equivalent to, or stronger than, predicate locking. We assume in what follows that the DBMS uses such a locking protocol. Then Theorems 3.2.4 and 3.2.6 remain valid for relational databases, although the proofs are different since we need to consider phantoms.

In the following, we model SELECT, INSERT, DELETE, and UPDATE as assignment statements and axiomatize them using the strongest postcondition. This axiomatization enables us to apply the reasoning technique employed for conventional databases to relational databases.

SELECT statement: Assume a SELECT statement reads all the tuples in a table $T$ that satisfy predicate. We introduce the set variable

$$r = \{\, t \mid predicate(t) \wedge (t \in T) \,\}$$

to denote the set of tuples returned by the SELECT statement. We model the SELECT statement as an assignment statement that assigns all the tuples in $r$ to a local set variable $R$. Then we can use the following triple to specify the semantics of a SELECT statement:

$$\{P\}\; R := r \;\{Q\}$$

If $Q$ is the strongest postcondition of the SELECT statement with precondition $P$, by [33], we can specify $Q$ in terms of $P$:

$$Q : \exists v\,(P^{R}_{v} \wedge R = r)$$

INSERT statement: Assume an INSERT statement inserts all the tuples that satisfy predicate into the table $T$. Using the same set variable $r$ defined for the SELECT statement, we model the INSERT statement as an assignment statement that assigns to $T$ the value $T \cup r$. Thus we can use the following triple to specify the semantics of an INSERT statement:

$$\{P\}\; T := T \cup r \;\{Q\}$$

If $Q$ is the strongest postcondition of the INSERT statement with precondition $P$, by [33], we can specify $Q$ in terms of $P$:

$$Q : \exists v\,(P^{T}_{v} \wedge T = v \cup r)$$

DELETE statement: Assume a DELETE statement deletes all the tuples in $T$ that satisfy predicate. Using the same set variable $r$ as above, we model the DELETE statement as an assignment statement that assigns to $T$ the value $T - r$. Thus we can use the following triple to specify the semantics of a DELETE statement:

$$\{P\}\; T := T - r \;\{Q\}$$

If $Q$ is the strongest postcondition of the DELETE statement with precondition $P$, by [33], we can specify $Q$ in terms of $P$:

$$Q : \exists v\,(P^{T}_{v} \wedge T = v - r)$$

UPDATE statement: Assume an UPDATE statement updates all the tuples in $T$ that satisfy predicate. Using the same set variable $r$ as above, we model the UPDATE statement as an assignment statement that is equivalent to a DELETE statement of the original values in $r$ followed by an INSERT statement of the new values, $r'$. Thus we can use the following triple to specify the semantics of an UPDATE (predicate) statement:

$$\{P\}\; T := (T - r) \cup r' \;\{Q\}$$

If $Q$ is the strongest postcondition of the UPDATE statement with precondition $P$, by [33], we can specify $Q$ in terms of $P$:

$$Q : \exists v\,(P^{T}_{v} \wedge T = (v - r) \cup r')$$

The following lemma is the basis for proving the theorems for the relational database.
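Before turning to the lemma, the four assignment models above can be tried out on small sets. This is a hedged sketch rather than the dissertation's formalism: the function names, the `(name, balance)` tuple layout, and the choice to pass the inserted tuples explicitly are all assumptions made for illustration.

```python
def select(table, pred):
    """R := r, where r = {t | pred(t) and t in T}."""
    return {t for t in table if pred(t)}


def insert(table, new_tuples):
    """T := T ∪ r, with r passed in as the set of tuples to add."""
    return table | set(new_tuples)


def delete(table, pred):
    """T := T − r."""
    return table - select(table, pred)


def update(table, pred, change):
    """T := (T − r) ∪ r', where r' holds the new values of the tuples in r."""
    r = select(table, pred)
    r_new = {change(t) for t in r}
    return (table - r) | r_new


# Tuples are (name, balance) pairs; the data is made up.
T = {("ann", 10), ("bob", 0), ("cy", 25)}


def positive(t):
    return t[1] > 0


R = select(T, positive)                # result a SELECT returned
T_after = insert(T, {("dee", 5)})      # another transaction adds a phantom
phantom_invalidates = select(T_after, positive) != R
```

The last line shows the phantom effect in this model: the assertion "R holds all tuples satisfying the predicate" was true when the SELECT ran and is false after the INSERT.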

Lemma 3.3.1 Let $S_{i,k}$ be an INSERT, DELETE or UPDATE statement on table $T$ with predicate $pd_i$ of transaction $T_i$, and $S_{j,h}$ be an INSERT, DELETE or UPDATE statement on table $T$ with predicate $pd_j$ of another transaction $T_j$, and $pd_i \cap pd_j = \emptyset$. Let $S_{i,k}$ be characterized by the triple $\{P\}\; S_{i,k} \;\{Q\}$ where $Q \equiv sp(P, S_{i,k})$. If the tuples that $S_{i,k}$ writes do not satisfy $pd_j$ and vice versa, then if $S_{j,h}$ does not interfere with $P$, it does not interfere with $Q$ either.

Proof: Suppose $S_{i,k}$ and $S_{j,h}$ are UPDATE statements. We characterize $S_{i,k}$ as

$$\{P\}\; T := (T - r) \cup r' \;\{Q\}$$

where $Q$ is the strongest postcondition of $S_{i,k}$ with precondition $P$, given by [33]:

$$Q : \exists v\,(P^{T}_{v} \wedge T = (v - r) \cup r') \tag{11}$$

Since $S_{j,h}$ does not interfere with $P$, the triple

$$\{P \wedge P'\}\; S_{j,h} \;\{P\} \tag{12}$$

where $P'$ is the precondition of $S_{j,h}$, is a theorem. Our goal is to show that

$$\{Q \wedge P'\}\; S_{j,h} \;\{Q\} \tag{13}$$

is also a theorem. Suppose $q$ is an arbitrary state satisfying $Q \wedge P'$, and $q'$ is the state that results after $S_{j,h}$ is executed starting in $q$. We would like to show that $q'$ satisfies $Q$. Let $v_0$ be a value such that the assertion

$$P^{T}_{v_0} \wedge T = (v_0 - r) \cup r'$$

is true in $q$. The tuples in $T$ in state $q$ can be divided into two disjoint subtables, $st_1$ and $st_2$, where $st_2 = r'$ and $st_1 \cap r = \emptyset$. Since no tuple in $r'$ satisfies $pd_j$, $S_{j,h}$ can only change the value of $st_1$. Suppose that in the transition from $q$ to $q'$, $st_1$ is changed to $st_1'$. Then in state $q'$, $T = st_1' \cup r'$. Since $S_{j,h}$ does not produce any tuple that satisfies $pd_i$, $st_1' \cap r = \emptyset$. Let $v_1 = st_1' \cup r$. We have $T = (v_1 - r) \cup r'$ in state $q'$. Thus

$$\{P^{T}_{v_0} \wedge (T = (v_0 - r) \cup r') \wedge P'\}\; S_{j,h} \;\{T = (v_1 - r) \cup r'\} \tag{14}$$

is a theorem. Since $pd_j \cap r' = \emptyset$, the following triple must also be a theorem:

$$\{P^{T}_{v_0} \wedge (T = v_0 - r) \wedge P'\}\; S_{j,h} \;\{T = v_1 - r\}$$

According to the conditions of the lemma, $pd_i \cap pd_j = \emptyset$, thus we get:

$$\{P^{T}_{v_0} \wedge T = v_0 \wedge P'\}\; S_{j,h} \;\{T = v_1\} \tag{15}$$

From (12) and (15), we get

$$\{P^{T}_{v_0} \wedge T = v_0 \wedge P'\}\; S_{j,h} \;\{P^{T}_{v_1} \wedge T = v_1\} \tag{16}$$

Since $pd_i \cap pd_j = \emptyset$ and $pd_j \cap r' = \emptyset$, from (16) we have

$$\{P^{T}_{v_0} \wedge T = (v_0 - r) \cup r' \wedge P'\}\; S_{j,h} \;\{P^{T}_{v_1} \wedge T = (v_1 - r) \cup r'\} \tag{17}$$

From (17) we get that (11) is true in state $q'$. Since $q$ is an arbitrary state satisfying $Q \wedge P'$, (13) is a theorem. For other write statements, similar proofs can be carried out.

The semantic condition for READ UNCOMMITTED of the relational database is the same as the one for the conventional database, although the proof is different since we need to consider phantoms.

Theorem 3.3.2 A transaction, $T_i$, that executes at READ UNCOMMITTED will execute semantically correctly if each write statement (including those that roll back a transaction) in every transaction does not interfere with $I_i$, the postcondition of every read statement in $T_i$, and $Q_i$.

Proof: Let $S_{i,k}$ be an arbitrary write statement (INSERT, DELETE, UPDATE) of $T_i$, characterized by the triple

$$\{P\}\; S_{i,k} \;\{Q\}$$

where $Q$ is the strongest postcondition of $S_{i,k}$ with precondition $P$. Suppose $S_{j,h}$ is an arbitrary write statement (INSERT, DELETE, UPDATE) of $T_j$ and it does not interfere with $P$. In the following, we prove that $S_{j,h}$ will not interfere with $Q$ either, or if it does, it will not invalidate $Q$ during execution.

Consider the case in which $S_{j,h}$ is interleaved after $S_{i,k}$. If $S_{i,k}$ is the last statement of $T_i$, then $Q \equiv I_i \wedge Q_i$ and, according to the conditions of the theorem, $Q$ is not interfered with by $S_{j,h}$. If not, since we assume that the predicate locks associated with $S_{i,k}$ and $S_{j,h}$, on predicates denoted by $pd_i$ and $pd_j$ respectively, are long-term, we have $pd_i \cap pd_j = \emptyset$; otherwise $T_j$ will wait until $T_i$ terminates. Furthermore, $S_{i,k}$ does not produce any tuples that satisfy $pd_j$: if it did, it would hold long-term write locks on the tuples that it produces, and $T_j$ would be prevented from being interleaved right after it and would have to wait until $T_i$ terminates. $S_{j,h}$ cannot produce any tuples that satisfy $pd_i$ either, since otherwise the long-term lock on predicate $pd_i$ will prevent $S_{j,h}$ from being interleaved right after $S_{i,k}$.

In any case, either $S_{j,h}$ is delayed until $T_i$ terminates, in which case $Q$ is not invalidated, or the conditions of Lemma 3.3.1 are satisfied, and thus $Q$ will not be interfered with. The rest of the proof is similar to the one for Theorem 3.2.4. □

The semantic condition for READ COMMITTED of the relational database is the same as the one for the conventional database, although the proof is different since we need to consider phantoms.

Theorem 3.3.3 A transaction, $T_i$, executed at READ COMMITTED will execute semantically correctly if each transaction does not interfere with the postcondition of every read statement in $T_i$, and with $Q_i$.

Proof: Let $T_j$ be an arbitrary transaction in the system. Since the isolation level of $T_j$ is at least READ UNCOMMITTED (we do not consider the case in which the mixed system includes SNAPSHOT isolation, since we assume that SNAPSHOT isolation does not use a locking scheme), $T_j$ will hold a long-term write lock on any item it writes. Since at the READ COMMITTED level $T_i$ uses short-term read locks, $T_i$ cannot read any item written by $T_j$ until $T_j$ terminates. Thus $T_i$ either sees the whole result of $T_j$ or it does not see any result of $T_j$, and when we reason about the semantic correctness of a schedule that includes $T_i$, as far as $T_i$ is concerned, $T_j$ can be considered as a single isolated unit.

Let $S_{i,k}$ be an arbitrary write statement (INSERT, DELETE, UPDATE) of $T_i$, characterized by the triple

$$\{P\}\; S_{i,k} \;\{Q\}$$

where $Q$ is the strongest postcondition of $S_{i,k}$ with precondition $P$. Suppose $T_j$ does not interfere with $P$. In the following, we prove that $T_j$ will not interfere with $Q$ either, or if it does, it will not invalidate $Q$ during execution.

Consider the case in which $T_j$ is interleaved after $S_{i,k}$. If $S_{i,k}$ is the last statement of $T_i$, then $Q \equiv I_i \wedge Q_i$ and, according to the conditions of the theorem, $Q$ is not interfered with by $T_j$. If not, since we assume that the predicate locks associated with $S_{i,k}$ and $T_j$ are long-term, the predicates must not intersect; otherwise $T_j$ will wait until $T_i$ terminates. Furthermore, $S_{i,k}$ does not produce any tuples that satisfy the predicate of any write statement in $T_j$; otherwise it will hold long-term write locks on the tuples that it produces, and $T_j$ will be prevented from being interleaved right after it and has to wait until $T_i$ terminates. $T_j$ cannot produce any tuples that satisfy $pd_i$ either; otherwise the long-term lock on predicate $pd_i$ will prevent $T_j$ from being interleaved right after $S_{i,k}$.

In any case, either $T_j$ is delayed until $T_i$ terminates, in which case $Q$ is not invalidated, or the conditions of Lemma 3.3.1 are satisfied, and thus $Q$ will not be interfered with. The rest of the proof is similar to the one for Theorem 2. □

For READ COMMITTED with first-committer-wins, taking into account the effect of phantoms, the conditions have to be strengthened to be the ones for READ COMMITTED. We state the theorem as Theorem 3.3.4.

Theorem 3.3.4 A transaction, $T_i$, executed at READ COMMITTED with first-committer-wins will execute semantically correctly if each transaction does not interfere with the postcondition of every read statement in $T_i$, and with $Q_i$.

Proof: The same as for Theorem 3.3.3. □

Theorem 3.2.9 can be restated for relational databases by considering the possibility of phantoms. At REPEATABLE READ, the long-term read locks obtained on tuples read by a SELECT statement block the execution of a statement in a concurrent transaction that attempts to delete or update such tuples. Hence, the postcondition of the SELECT statement cannot be invalidated by a transaction that attempts to delete or update such a tuple. As a result we get the following theorem.

Theorem 3.3.5 For a transaction, $T_i$, executed at REPEATABLE READ, let $S_{i,j}$ be an arbitrary SELECT in $T_i$. $T_i$ will execute semantically correctly if each transaction does not interfere with $Q_i$ and either

1. does not interfere with the postcondition of $S_{i,j}$, or
2. includes DELETE or UPDATE statements whose predicates intersect the predicate of $S_{i,j}$.

Proof: If (1) applies, the proof is similar to the one for Theorem 3.3.3. Otherwise (2) applies, and $T_j$ will be delayed until $T_i$ terminates, at which point $Q_i$ is not interfered with. □

For SNAPSHOT isolation, we model a transaction in the same way as we did for conventional databases. Theorem 3.2.10 remains valid for relational databases. For completeness, we restate it as Theorem 3.3.6 in the following.

Theorem 3.3.6 A schedule produced under SNAPSHOT isolation is semantically correct if, given any two transactions $T_i$ and $T_j$ from the schedule, either:

1. $T_i$'s write set intersects $T_j$'s write set, or
2. $T_j$ does not interfere with the postcondition of the read step of $T_i$, and with $Q_i$.

Proof: For SNAPSHOT isolation, we model a transaction in the same way as we did for conventional databases. The proof is similar to the one for Theorem 3.2.10. □

3.4 Choosing an isolation level

Given the set of transaction types of an application, the problem faced by the application designer is to determine, for each type, the lowest isolation level at which a transaction of that type can execute correctly. Since SNAPSHOT isolation is not generally offered in the context of the other isolation levels, we do not consider it in what follows.

Using the previous results, it follows that while we determine the isolation level at which to execute a transaction, $T_1$, we do not have to be concerned about the level of other transactions. Specifically, if we are performing an interference analysis to determine the correctness of executing $T_1$ at READ UNCOMMITTED, we must consider the interference effects of each write of another transaction, $T_2$, individually, regardless of the level of $T_2$. Similarly, if we are considering executing $T_1$ at any higher level, we consider the interference effects of the whole transaction $T_2$ as an atomic isolated unit, regardless of the level of $T_2$. Then a procedure for determining the lowest isolation level at which each transaction can execute semantically correctly is: for each transaction, $T_i$, in the set, consider the isolation levels READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, and SERIALIZABLE in sequence and return the first at which execution is semantically correct.

3.5 An example

To motivate the conditions for semantic correctness in a relational setting, consider a business application that accesses a schema with the following three tables (primary keys are underlined):

• orders(order_info, cust_name, deliv_date, done)
• cust(cust_name, address, #orders)
• maxdate(maximum_date)

A conjunct of $I$, $I_o$, asserts that each row of orders describes an order and done is true if that order has been delivered. maxdate is a table containing a single row that satisfies a second conjunct, $I_{max}$, that asserts that maximum_date is the maximum delivery date for any order in orders.

Mailing_List()
BEGIN TRANSACTION
    {true}
    SELECT cust_name, address FROM cust
    {Returned data contains names and addresses}
    Print labels using returned names and addresses
    {Labels have been printed}
END TRANSACTION
Figure 2: Prints out a mailing list
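To make the schema and the Mailing_List query concrete, the sketch below builds the three tables in an in-memory SQLite database and runs the SELECT of Figure 2. Several details are assumptions introduced only for illustration: the column types, the choice of `order_info` and `cust_name` as primary keys (the underlining did not survive in this text), the column name `num_orders` standing in for #orders, and the sample rows. The sketch only runs the query; it does not model the isolation levels discussed in this chapter.

```python
import sqlite3

# Hypothetical rendering of the example schema; types and keys are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        order_info TEXT PRIMARY KEY,
        cust_name  TEXT,
        deliv_date INTEGER,
        done       INTEGER
    );
    CREATE TABLE cust (
        cust_name  TEXT PRIMARY KEY,
        address    TEXT,
        num_orders INTEGER      -- stands in for #orders
    );
    CREATE TABLE maxdate (
        maximum_date INTEGER
    );
""")

# Made-up sample customers.
conn.executemany(
    "INSERT INTO cust VALUES (?, ?, ?)",
    [("Ann", "1 Main St", 1), ("Bob", "2 Oak Ave", 2)],
)


def mailing_list(connection):
    """Return (name, address) pairs, as the SELECT in Figure 2 does."""
    cursor = connection.execute(
        "SELECT cust_name, address FROM cust ORDER BY cust_name"
    )
    return cursor.fetchall()


labels = mailing_list(conn)
```

The `ORDER BY` clause is not part of Figure 2; it is added here only so that the result has a predictable order.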

This application has four transaction types, shown in Figures 2 through 5. Each figure shows an annotation of a transaction program indicating the pre- and postconditions of the transaction and the postcondition of each SELECT statement. These are the critical assertions. The purpose of the figures is to display the critical assertions; the code is just sketched.

Mailing_List (Figure 2): This transaction scans cust and prints a label using cust_name and address. The specification of the transaction places no condition on the labels printed. Since no critical assertion is interfered with by any single write statement in any of the transaction types, this transaction will execute correctly at READ UNCOMMITTED.

New_Order (Figure 3): This transaction inserts a new order into orders and, if this is the first order for the customer, inserts a new tuple into cust. In order to keep the delivery truck busy, a business rule asserts that there can be no gaps in the sequence of delivery dates: there must be at least one order to be delivered on each date up to some maximum date, which is the delivery date of the last outstanding order. A conjunct of the integrity constraint of the database, which we call "no gaps", asserts this fact. However, there can be more than one order for any particular delivery date. Furthermore, the number of orders for a particular customer in orders must be equal to the value of the #orders field of that customer's tuple in cust. We refer to this integrity constraint as "order consistency". The intermediate assertion $I'_{max}$ in Figure 3 asserts that maximum_date is one greater than the latest delivery date in orders. Thus New_Order reads the value of maximum_date in maxdate into the workspace variable maxdate, and increments maximum_date in maxdate by 1. If the customer is new it inserts the tuple (customer, address, 1) into cust; otherwise it increments #orders in the customer's tuple in cust (footnote 2). Finally, it inserts (order_info, customer, maxdate + 1, false) into orders (footnote 3).

Footnote 2: The postcondition of New_Order in Figure 3 indicates that the inserted tuple has a particular value in the #orders field. Since the value will change as the customer adds new orders, in order to avoid interference the postcondition should actually be weakened to assert that at commit time this tuple was an element of cust.

Footnote 3: Since the value of done will subsequently change, the comments in the previous footnote apply.

New_Order(customer, address, order_info)
BEGIN TRANSACTION
    {no_gap ∧ order_consistency ∧ I_max}
    SELECT maximum_date FROM maxdate INTO :maxdate
    {no_gap ∧ order_consistency ∧ I_max ∧ (maxdate ≤ maximum_date)}
    UPDATE maxdate SET maximum_date = :maxdate + 1
    SELECT COUNT(*) INTO :custcount FROM orders
        WHERE cust_name = :customer
    {no_gap ∧ order_consistency ∧ I'_max ∧ (maxdate ≤ maximum_date)
     ∧ ((custcount = 0) ⇒ (customer is new))}
    if (custcount == 0) then
        INSERT INTO cust VALUES (:customer, :address, 1)
    else
        UPDATE cust SET #orders = :custcount + 1
            WHERE cust_name = :customer
    fi
    INSERT INTO orders VALUES (:order_info, :customer, :maxdate + 1, false)
    {no_gap ∧ order_consistency ∧ I_max
     ∧ (customer, address, custcount + 1) ∈ cust
     ∧ (order_info, customer, maxdate + 1, false) ∈ orders}
END TRANSACTION
Figure 3: Processes a new order

Since no critical assertion is interfered with by any transaction type, this transaction can run at READ COMMITTED. The transaction cannot run at READ UNCOMMITTED because, for example, the no_gap assertion that is a conjunct of assertions in a New_Order transaction, $T_1$, is interfered with by the rollback statement of another New_Order transaction, $T_2$, that deletes a tuple from orders (it might leave a gap in delivery dates below the delivery date selected by $T_1$).

Suppose an additional business rule is imposed: there must be exactly one order with a particular delivery date for each date up to some maximum. The "no gap" conjunct of the integrity constraint is replaced by the conjunct "one order per day", which asserts the new requirement. The same New_Order transaction can be used to enforce this rule if it is run at READ COMMITTED with first-committer-wins. At READ COMMITTED the INSERT into orders in the New_Order transaction interferes with the conjunct one_order_per_day in the postcondition of the SELECT. However, since New_Order updates maxdate after reading it, one_order_per_day cannot be invalidated at the READ COMMITTED with first-committer-wins isolation level. Also note that the postcondition of the whole transaction is not interfered with by any transaction type, and thus this transaction can run at READ COMMITTED with first-committer-wins.

Delivery (Figure 4): This transaction delivers an order. Thus Delivery scans orders to find all the orders that are due today and updates the done attributes in the orders to be delivered to true.

The postcondition of the SELECT statement of a Delivery transaction is interfered with by another Delivery transaction. Thus this transaction type cannot execute at READ COMMITTED. However, if the transaction is executed at REPEATABLE READ, the selected tuples are read-locked after the SELECT statement is executed. Hence another Delivery transaction would not be allowed to update these tuples and the assertion could not be invalidated.

Thus this transaction meets the condition for correct execution at the REPEATABLE READ isolation level.

Delivery(today)
BEGIN TRANSACTION
    {I_o}
    SELECT order_info INTO :buff FROM orders
        WHERE deliv_date = :today AND done = FALSE
    {I_o ∧ returned values are undelivered orders to be delivered today}
    while (ord_inf := next in buff)
        UPDATE orders SET done = TRUE
            WHERE order_info = :ord_inf
    {I_o ∧ (tuples in orders describing orders due today have done = TRUE)}
END TRANSACTION
Figure 4: Delivers an order

Audit (Figure 5): This transaction checks that the integrity constraint order_consistency is true. Thus Audit scans orders and counts the number of orders registered for a particular customer, reads the tuple for that customer in cust, and compares #orders with the count.

This transaction must run at the SERIALIZABLE level because the postconditions of both SELECT statements might be interfered with by a New_Order transaction that inserts a (phantom) new order. Note that this transaction does not satisfy the second half of the condition for correct execution at REPEATABLE READ because tuple locks do not prevent the insertion of a phantom new order.
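The selection procedure of Section 3.4, applied to the four transaction types of this example, can be sketched as a simple search over the ordered levels. The interference analysis itself is not automated here: the `ANALYSIS` table below is a hand-entered summary of the conclusions stated in the text, with the added assumption that a transaction correct at one level is also correct at every higher level, and the names `LEVELS`, `ANALYSIS` and `lowest_isolation_level` are hypothetical.

```python
# Levels in increasing order of strength, as listed in Section 3.4.
LEVELS = [
    "READ UNCOMMITTED",
    "READ COMMITTED",
    "REPEATABLE READ",
    "SERIALIZABLE",
]


def lowest_isolation_level(correct_at):
    """Return the first level for which correct_at(level) holds."""
    for level in LEVELS:
        if correct_at(level):
            return level
    raise ValueError("analysis reported no level as correct")


# Hand-entered outcome of the interference analysis for each type:
# the set of levels at which execution was found semantically correct.
ANALYSIS = {
    "Mailing_List": set(LEVELS),
    "New_Order": set(LEVELS[1:]),
    "Delivery": set(LEVELS[2:]),
    "Audit": set(LEVELS[3:]),
}

chosen = {
    name: lowest_isolation_level(lambda level, ok=ok: level in ok)
    for name, ok in ANALYSIS.items()
}
```

In a real application the predicate passed to `lowest_isolation_level` would come from the designer's interference analysis of the critical assertions, which is the part of the procedure that requires the theorems of this chapter.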

Audit(customer)
BEGIN TRANSACTION
    {I_o}
    SELECT COUNT(*) INTO :count1 FROM orders
        WHERE cust_name = :customer
    {I_o ∧ (count1 = the number of tuples in orders for customer)}
    SELECT #orders INTO :count2 FROM cust
        WHERE cust_name = :customer
    {I_o ∧ (count1 = the number of tuples in orders for customer)
     ∧ (count2 = the value of #orders in cust for customer)}
    retv := (count1 == count2);
    {I_o ∧ (retv = order_consistency)}
END TRANSACTION
Figure 5: Produces accounting information

3.6 Summary

We have used semantic correctness as the criterion to investigate the correctness of schedules at different isolation levels. Specifically, for each isolation level, we prove a condition under which transactions that execute at that level will be semantically correct. This technique also clarifies the relationship between interference and invalidation. Interference does not necessarily lead to invalidation because the underlying locking scheme might prevent the offending interleavings from happening. Furthermore, an assertion that is interfered with can often be replaced by a stronger assertion that is not interfered with. In that case, the weaker assertion is not invalidated.

Chapter 4

Semantically correct workflows

4.1 Introduction

Correctness is an important aspect of workflow management systems. Although most of the workflow community is interested in workflow modeling aspects, a few researchers have been investigating techniques for supporting workflow correctness [5, 16, 47, 63]. An excellent overview of correctness issues in workflow management is given in [41]. More recently, a formalization of workflows based on set