understanding the requirements of stms

Click here to load reader

Upload: efia

Post on 25-Feb-2016

48 views

Category:

Documents


3 download

DESCRIPTION

Understanding the Requirements of STMs. Sathya Peri IIT Patna [email protected]. Outline. Motivation for Software Transactional Memory Drawbacks of locks Program Support for STMs Intuitive Requirements of STMs Formal Definition of a Correctness Criterion – Simpler Case - PowerPoint PPT Presentation

TRANSCRIPT

Introduction to STMs

Sathya PeriIIT [email protected] the Requirements of STMsOutlineMotivation for Software Transactional MemoryDrawbacks of locksProgram Support for STMsIntuitive Requirements of STMsFormal Definition of a Correctness Criterion Simpler CaseFormal Definition General CaseConclusion222Performance limits of single processorsSoftware systems performance increase in last few decadesIncrease in CPU clock frequenciesLimits in the increase of clock frequencies of CPUs in the last few yearsObtain higher performance: Packing more processors into a single chipThey are called multi-core CPUs333Parallel Processing ApplicationsTo achieve better performance with multi-core CPUsApplications programmed to take advantage of the underlying parallelismCommonly achieved by employing multi-threadingMultiple threads of control execute concurrentlyMulti-threading can efficiently utilize the multi-core CPUs

444Bank Transfer Example: Difficulty with parallel programming T1TimeT2

1Read B1 2B1 := B1 1000 3Write B1sum := 0 4Read B1 5Read B2 6sum := sum + B1 7sum := sum + B2 8 9Read B2 10B2 := B2 + 1000 11Write B2sees wrong sum555Issues with Parallel Programming6Normally threads have to collaborateInvolves sharing of data in memory or on secondary storageWrite access to shared data cannot happen in an uncontrolled fashionOtherwise allows programs to see inconsistent data valuesProcessors can not modify independent memory locations atomicallyHence, must be synchronized

Synchronization: Locks7Traditional solution to data synchronization contention: Employing software locksAll read and write access to shared data using locksNever see inconsistent stateCan simulate atomic update of two different variablesMultiple Shared Data Objects8Multiple data objects shared by programsUse a single program-wide lockHurts the system performanceMultiple locks are requiredA lock for each shared locationFine grain locking

Multiple Shared Data Objects contd9Threads accesses multiple shared objectsEach thread needing access to a shared objectobtains the corresponding lock;accesses the shared object andreleases the lockCan still result in incorrect stateBank Transfer Example: Incorrect Locking T1TimeT2 1Lock B1 2Read B1 3B1 := B1 1000 4Write B1sum := 0 5Release B1Lock B1, B2 6Read B1 7Read B2 8Release B1, B2 9sum := sum + B1 10sum := sum + B2 11Lock B2 12Read B2 13B2 := B2 + 1000 14Write B2 15Release B2Again sees wrong sum101010Correct Locking: Two Phase Locking11Each thread accessing shared data objectLocks all the shared object required firstAccesses the shared objectsThen releases all the locksCommonly referred to as two phase locking

Difficulties with Two Phase locking12Two phase locking involves lock and waitCan potentially lead to deadlocksWrong lock orderBank transfer exampleT1.lock(b1) T2.lock(b2) Threads are deadlockedThus, obtaining locks as & when reqd will not workCan still result in deadlocksAddressing deadlocks13Many solutions for deadlocks have been proposed in the literatureDeadlock prevention, deadlock avoidance, requesting resources in a non-circular manner etc.Normally provided by Operating SystemsProgrammers perspectiveTo incorporate one of the above solutions in her programsIncreases the complexity of programmingSoftware Composition14Lock based software components are difficult to composeComposition: build larger software systems using simpler software componentsBasis of modular programmingPerhaps the most fundamental objection [...] is that lock-based programs do not compose: correct fragments may fail when combined. Tim Harris et al., "Composable Memory Transactions", Section 2: Background, pg.2Software Composition: Difficulties with locks15Consider a hash table with thread-safe insert and delete operations implemented using locksImplementation of insert and delete are hiddenUser wishes to transfer an item A from hash table 1 to hash table 2Transfer implementation: How to make the implementation ofTransfer atomic ? Programmers dilemma16The programmer is caught between two problems:Increasing the part of the program that can be executed in parallelIncreasing the complexity of the program code and therefore the potential for problems

Alternative to locks: Software Transactional Memory17A promising alternative which has garnered lot of interestBoth in Industry and AcademiaSoftware transactions: units of execution in memory which enable concurrent threads to execute seamlesslyHide the difficulties of programming with locksParadigm originates from transactions in databases

1717Software Transactions18A transaction is a unit of code in execution in memoryA software transactional memory system (STM) ensures that a transaction either executes atomically even in presence of other concurrent transactions ornever have executed at allOn successful completion, a transaction commits. Otherwise it aborts

1818Programming Support: Bank Transfer Example19t1() {initialization();atomic {Read B1B1 := B1 1000Write B1

Read B2B2 := B2 + 1000Write B2}}

Programming Support Contd20t2() {initialization();atomic{Read B1;Read B2;

sum := sum + B1;sum := sum + B2;}}STMs Implementation21The operations of a transaction are divided into three phasesPhase 1: Local read/write phaseExecute transaction with writes into private bufferPhase 2: Validation phaseTests whether the reads and writes form a consistent view of the memoryTests for conflictsPhase 3: Commit or Abort phaseDepending on validation phase, the transaction is aborted or committedIf committed, the local writes are performed onto the memoryOptimistic Synchronisation 22MemoryT1Local Log CleanT2ValidateValidateLocal Log Written22Requirements of Validation Phase23Ensures that a transaction commits only if it does not violates consistency requirementsIn the bank transfer example considering threads as transactionsA STM system that commits both the transactions is not semantically correctHence, precise correctness criteria for STMs must be identifiedIntuitive Requirements of STM Systems24Observed by Guerraoui & Kapalka [PPoPP, 2008]R1: A STM system must preserve real-time orderTi commits before Tj starts effects of Ti must be seen before Tj violating real-time order may be misleading for programmers who are used to critical sections that enforces this order naturallyR2: Live transactions should not affect any other transactionA simple solution: Transactions execute normally. If a transaction accesses inconsistent state, abort itSeems to work well in DatabasesIntuitive Requirements Contd25R2: Live transactions should not affect any other transactionApparently, the databases execute in controlled environmentTransactions accessing inconsistent states can cause problems; but they are aborted later; hence no affectConsider, the following example: Initially, x=4, y=6T1: z = 1/(y-x); commit; r1(x) r1(y) c1T2: x=2; y=4; commit; w2(x) w2(y) c2;Consider an interleaving r1(x, 4) w2(x,2) w2(y, 4) c2 r1(y, 4)z = 1/(4 4) divide-by-zero error. Can cause program to crash

Intuitive Requirements Contd26Reading inconsistent values can cause many kinds of errorsDivide-by-zero, infinite-loops, even I/O possiblyThus, before the system can abort a transaction that read inconsistent valuecan cause the system to crash or get into an infinite loopThus, even before the inconsistent read could be detected, the damage could have already been doneMay not be acceptable for Memory TransactionsTranslation into Definition Simple Case27What do these intuitive requirements imply?Consider the simple case Execution consists only of read, write and tryC operations All these operations are atomicEvery transaction reads correct value transactions read only from committed valuesAn example history: H1: r1(x, 0) w2(y, 5) w2(x, 10) r1(y, 0) c1 c2

Some Notations28Every history has an initial committed transaction T0 that initializes all the valuesA history is said to be sequential (or serial) if all the transactions in it ordered by real-timelastWrite of a read operation r(x, v) in a history H, is defined as the previous closest commit operationthat writes to xFor instance, consider the earlier history H1: r1(x, 0) w2(y, 5) w2(x, 10) r1(y, 0) c1 c2lastWrite of r1(x, 0): C0; lastWrite of r1(y, 0): C0

Notations continued29Legality: a history S is said to be legalif every read operation r(x,v) reads the value written by previous closest committed transaction that writes to xFor instance:H2: w1(x, 5) c1 w2(x, 10) c2 r3(x, 5) is not legalH3: w1(x, 5) c1 w2(x, 10) c2 r3(x, 10) is legalDefinition of Correctness Criterion- Opacity30The correctness criterion is called OpacityConsider a history H (with the restrictions as mentioned earlier)H is opaque if there exists a sequential (or serial) history S such that :The operations of H & S are the same (H & S are said to be equivalent)S respects the real-time order of HS is legalThis definition ignores all the write steps of aborted transactions in H

Comparison of Opacity with Serializability31Consider some examplesH4: r1(x, 0) w1(y, 5) w2(x, 10) c1 r2(y, 5) a2Serializable: T2But NOT OpaqueH5: r1(x, 0) w1(y, 5) w2(x, 10) c1 r2(y, A)Opaque: T1 T2Serializability ignores aborted transactions. Opacity considers aborted transactions as wellThus, it can be seen that opacity is costlier to implementOpacity is analogous to Strict-MVSR32Variants of SerializabilityView Serializability (VSR): maintains only a single-versionMultiversion View Serializability (MVSR): maintains multiple versionsH6: r1(x, 0) w1(y, 5) w2(x, 10) c1 r2(y, 0) c1Is in MVSR but not in VSR: T1 T2MVSR allows more concurrency by storing multiple versionsStrict MVSR: MVSR + real-time orderOpacity: Strict MVSR + correctness of aborted transactionsDeferred write vs Direct Write Semantics33Direct Write semantics: Transactions can read uncommitted valuesDeferred Write semantics: Transactions only read from committed valuesH7: w1(x, 5) w2(y, 10) r1(x, 5) r2(y, 10) c2 c1Serializable, not Opaque: T1 T2

Verifying membership of Opacity34Verifying membership of VSR is NP-CompletePapadamitriou [JACM, 1979], Vidyasankar [AINF, 1987] showed thisVerifying membership of Opacity? Not clear if it is NP-Complete, but commonly believed to beDifference is due to direct & deferred write semanticsMembership verification is importantCan be used in developing efficient implementationsAlso used in verifying correctness of implementations

Conflict Serializability (CSR)35Conflict Order: in a history, order can be defined based on w-w, w-r, r-w operations on the same data-itemUsing this order, CSR can be definedA history H is in CSR if there exists a sequential history S with the same set of conflictsInterestingly, membership of CSR can be verified in polynomial time using Conflict Graph (V, E)V: a vertex for each transactionE: based on conflict orderH is in CSR iff the conflict graph is acyclicSimilarly, Conflict Opacity can be defined which can be verified efficientlyOpacity Definition Some relaxations36Read and Write operations are no longer atomicThese operations can overlapOperations: invocation followed by responser(x) ret(5)/ ret(A)w(x, 5) . ret(ok)tryc() . ret(ok)/ ret(A)HistoriesH8: r1(x) r2(x) ret(r1(x), 0) ret(r2(x), 0) w3(y, 5) ret(w3(y, 5), ok) tryc3() r1(y) r2(y) ret(r1(y), 5) ret(r2(y), 0) ret3(ok)Opacity Generalization37H is opaque if there exists a sequential (or serial) history S such that :The operations of H & S are the same (H & S are said to be equivalent)S respects the real-time order of HS is legalH8: r1(x) r2(x) ret(r1(x), 0) ret(r2(x), 0) w3(y, 5) ret(w3(y, 5), ok) tryc3() r1(y) r2(y) ret(r1(y), 5) ret(r2(y), 0) ret(tryc3, ok)Opaque: T2 T3 T1

Opacity Generalization38H is opaque if there exists a sequential (or serial) history S such that :The operations of H & S are the same (H & S are said to be equivalent)S respects the real-time order of HS is legalH8: r1(x) r2(x) ret(r1(x), 0) ret(r2(x), 0) w3(y, 5) ret(w3(y, 5), ok) tryc3() r1(y) r2(y) ret(r1(y), 5) ret(r2(y), 0) ret(tryc3, ok)Opaque: T2 T3 T1

Efficient Sub-Class of Opacity?39With the generalization, it is more commonly believed that verifying opacity is NP-CompleteCan a sub-class be defined whose membership can be efficiently verified?The earlier described notion of conflicts is no longer validOperations no longer atomicCan new conflict notion be defined that gives efficient sub-classOther Considerations40Opacity is prefix-closed, whereas VSR/MVSR is not!Presence of a final reading transactionDrawback of Opacity: interferenceOther correctness criteria: Virtual Worlds Consistency, DU-OpacityNesting of Transactions: Precise correctness guaranteesObject based STMs: Lower level conflicts do not matter. Only upper level conflicts matter. How is nesting different from object based STMs?Conclusion41Software Transactional Memory is a very promising research alternative for programming multi-core CPUsDoes not suffer the disadvantages of programming with locksSaw the motivationUnderstood the intuitive requirements of STM systems

4141Questions?42

42