Page 1: CLR Reliability under Memory Exhaustion

07/09/04 Windows Reliability Team 1

CLR Reliability under Memory Exhaustion

Solomon Boulos

Page 2: CLR Reliability under Memory Exhaustion

Temporary Memory Exhaustion causes failures

• Out of Memory (OOM) is temporary
• Shouldn’t cause failure
  – Just wait for memory to become available
  – System can take action to free up memory
• All managed code depends on the CLR
• Testing is difficult
  – Exceptions are objects
  – Boxing (converting a value type to object)
  – JIT compilation

Page 3: CLR Reliability under Memory Exhaustion

Overview

• Previous Work
  – Reliability Working Group
  – Improvements for Whidbey

• OOM behavior
  – Everett (CLR v1.1)
  – Whidbey (CLR v2.0)
  – WinFX

• Solutions
  – Transactions
  – Recovery

Page 4: CLR Reliability under Memory Exhaustion

Reliability Working Group

• Discussion of CLR reliability issues

• Interaction with Yukon and Avalon teams

• FailFast Behavior

• Controversial Decisions

• Fault Injection

Page 5: CLR Reliability under Memory Exhaustion

Improvements for Whidbey

• CLR hardened to Out of Memory (OOM)

• Constrained Execution Regions (CERs)
  – Eagerly Prepared (No JIT Compiling)
  – Blocks ThreadAbort

• Reliability Contracts
  – Describes reliability attributes of code
  – Allows for function calls within CER

• Unhandled Exception Policy
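A minimal sketch of how a CER and a reliability contract fit together, using the .NET Framework 2.0 APIs RuntimeHelpers.PrepareConstrainedRegions and ReliabilityContractAttribute. The class CerSketch and its members are hypothetical. In this pattern the finally block is the constrained region: it is prepared before the try block runs, and thread aborts are delayed while it executes. On current .NET these APIs are marked obsolete and CERs are not supported, so the sketch only has its intended effect on .NET Framework.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.ConstrainedExecution;

// Hypothetical example: restore state in a CER if an update fails.
static class CerSketch
{
    public static int Balance;

    // The contract is a promise by the author, not something the CLR verifies.
    [ReliabilityContract(Consistency.WillNotCorruptState, Cer.Success)]
    static void Restore(int saved)
    {
        Balance = saved;
    }

    public static int TryUpdate(int delta, bool fail)
    {
        int saved = Balance;
        bool committed = false;

        // Must immediately precede the try block; prepares the finally
        // block (and Restore, via its contract) ahead of time.
        RuntimeHelpers.PrepareConstrainedRegions();
        try
        {
            Balance += delta;
            if (fail) throw new InvalidOperationException("simulated failure");
            committed = true;
        }
        finally
        {
            // Constrained region: keep it short and allocation-free.
            if (!committed) Restore(saved);
        }
        return Balance;
    }
}
```

The exception still propagates to the caller; the CER only ensures that the cleanup code is ready to run when it is needed.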

Page 6: CLR Reliability under Memory Exhaustion

My Approach

• Exhaust Memory (Not fault injection)

• Find failure points

• Consistently reproduce results

• Examine underlying causes

• Develop solutions
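The "exhaust memory" step can be sketched as a loop that holds on to allocations until a target amount of memory remains. This is a guess at the shape of such a harness, not the author's tool: MemoryPressure, ReachMemory and the availableBytes probe are hypothetical, and the probe is passed in because measuring available memory is platform specific.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical harness: consume memory until only a target amount is left.
static class MemoryPressure
{
    // availableBytes is a caller-supplied probe (an assumption of this sketch).
    // The returned list keeps the blocks alive; dropping it releases the pressure.
    public static List<byte[]> ReachMemory(long targetAvailableBytes,
                                           Func<long> availableBytes,
                                           int chunkBytes)
    {
        var held = new List<byte[]>();
        while (availableBytes() > targetAvailableBytes)
        {
            var block = new byte[chunkBytes];
            // Touch one byte per 4 KB so the pages are actually committed.
            for (int i = 0; i < block.Length; i += 4096) block[i] = 1;
            held.Add(block);
        }
        return held;
    }
}
```

Because the same target can be reached on every run, a failure seen at a given level of available memory can be reproduced, which random fault injection does not guarantee.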

Page 7: CLR Reliability under Memory Exhaustion

Everett OOM Behavior

• Different classes of failures
  – Catchable Out of Memory (OOM) Exception
  – Type Initialization Exception
  – Invalid Program exception from JIT compiler
  – Fatal OOM Error
  – Fatal Execution Engine error

Page 8: CLR Reliability under Memory Exhaustion

Supporting Data

void ManagedFunction()
{
    Regex* myReg = new Regex("*");
}

Available Memory    Observed Behavior
0-5860K             Fatal Error
5892-5912K          InvalidProgram
5924-5960K          TypeInit
5890-Above          Success

Page 9: CLR Reliability under Memory Exhaustion

Fault Injection Example

static void Main(string[] args)
{
    try
    {
        // operations in here
    }
    catch (OutOfMemoryException)
    {
        Console.WriteLine("Nothing should get past me.");
    }
}

Page 10: CLR Reliability under Memory Exhaustion

Whidbey OOM Behavior

• See OOM Exception instead of
  – TypeInit
  – InvalidProgram

• Exception to Native host is COMPlusException
  – Not very helpful

• Fatal OOM only during initialization
  – Initialization can be large though (e.g. 10MB)

• CERs provide defense, but dangerous
  – CER { for (;;) } cannot be stopped

• Reliability Contracts = Honor System

Page 11: CLR Reliability under Memory Exhaustion

WinFX Case Studies

• Swallows exceptions

• Shell
  – Crashes and restarts

• WinFS
  – Silent Process Failure

• Indigo
  – False Completion

[Diagram: Base OS, Whidbey, WinFX layers]

Page 12: CLR Reliability under Memory Exhaustion

Shell Failure

• Exhaust System Memory

• CLR throws OOM Exception

• Shell doesn’t catch

• Escalates to unhandled Win32 exception

• Shell crashes and restarts
  – Major disruption to user

Page 13: CLR Reliability under Memory Exhaustion

WinFS Test

• Simple Contact Store Functions
  – AddContact
  – RenameContact
  – RemoveContact
  – ListContacts
  – ReachMemory

Page 14: CLR Reliability under Memory Exhaustion

WinFS Test Normal Execution

• ListContacts(): “No Contacts Found”
• AddContact(“Shane”): Shane is added
• ListContacts(): “Shane”
• RenameContact(“Shane”, “Bob”): Shane is now Bob
• ListContacts(): “Bob”
• RemoveContact(“Bob”): Bob is now deleted
• ListContacts(): “No Contacts Found”

Page 15: CLR Reliability under Memory Exhaustion

WinFS Test Stressed Execution

• ListContacts() : “No Contacts Found”

• ReachMemory(8MB): 8MB Available

• AddContact(“Shane”) : Shane should be added

• ListContacts(): “No Contacts Found”

• Process Exits

Page 16: CLR Reliability under Memory Exhaustion

Indigo Test Specifications

• Client::SendMessage():
  – Sends message to server and prints confirmation of sending.

• Client::ReceiveMessage():
  – Prints received message.

• Server::SendMessage():
  – Sends message to client and prints confirmation of sending.

• Server::ReceiveMessage():
  – Prints message and responds with SendMessage()

Page 17: CLR Reliability under Memory Exhaustion

Indigo Test Behavior

• Normal Execution
  – Client::SendMessage()
  – Server::ReceiveMessage()
  – Server::SendMessage()
  – Client::ReceiveMessage()

• Execution with Memory Pressure
  – Client::SendMessage()
  – Server::ReceiveMessage()
  – Server::ExhaustMemory()
  – Server::SendMessage()
  – Client never receives message

Page 18: CLR Reliability under Memory Exhaustion

Solutions

• Transactions
  – In Memory
  – Durable (backed by disk)

• Recovery
  – Creates Recovery Log
  – Allows state restore

Page 19: CLR Reliability under Memory Exhaustion

Transaction Participant

public TransactionParticipant(String _originalValue)
{
    originalValue = _originalValue;
    result = originalValue;
}

public void Prepare(IPreparingEnlistment pe)
{
    // do work for transaction
    result = "New Value";
    // all is well, vote prepared
    pe.Prepared();
}

Page 20: CLR Reliability under Memory Exhaustion

Transaction Participant Continued

public void Commit(IEnlistment e)
{
    // no work to do, vote done
    e.EnlistmentDone();
}

public void Rollback(IEnlistment e)
{
    // restore originalValue
    result = originalValue;
    if ( null != e ) e.EnlistmentDone();
}

Page 21: CLR Reliability under Memory Exhaustion

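Simple Transaction Example

TransactionParticipant tp = new TransactionParticipant(txtInput.Text);
try
{
    using (TransactionScope s = new TransactionScope())
    {
        // enlist the participant in the ambient transaction
        Transaction.Current.VolatileEnlist(tp, false);
        // mark the scope as consistent so it commits on exit
        s.Consistent = true;
    }
}
catch (TransactionAbortedException)
{
    // nothing to do: Rollback has already restored the original value
}
txtInput.Text = tp.Result;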
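The slides appear to use pre-release System.Transactions names (IPreparingEnlistment, VolatileEnlist, Consistent). For readers trying this today, the following is a sketch of the same participant against the released API, where the corresponding members are IEnlistmentNotification, EnlistVolatile, Enlistment.Done and TransactionScope.Complete. StringParticipant and TransactionSketch are hypothetical names, and the mapping from the slide's names is an assumption.

```csharp
using System;
using System.Transactions;

// Hypothetical participant mirroring the slides, on the released API.
sealed class StringParticipant : IEnlistmentNotification
{
    private readonly string originalValue;
    public string Result { get; private set; }

    public StringParticipant(string originalValue)
    {
        this.originalValue = originalValue;
        Result = originalValue;
    }

    public void Prepare(PreparingEnlistment pe)
    {
        Result = "New Value";   // do the work, then vote prepared
        pe.Prepared();
    }

    public void Commit(Enlistment e) { e.Done(); }

    public void Rollback(Enlistment e)
    {
        Result = originalValue; // undo the work
        e.Done();
    }

    public void InDoubt(Enlistment e) { e.Done(); }
}

static class TransactionSketch
{
    // Runs the participant in a scope; 'complete' decides commit vs. rollback.
    public static string Run(string input, bool complete)
    {
        var tp = new StringParticipant(input);
        try
        {
            using (var scope = new TransactionScope())
            {
                Transaction.Current.EnlistVolatile(tp, EnlistmentOptions.None);
                if (complete) scope.Complete();
            }
        }
        catch (TransactionAbortedException)
        {
            // Rollback has already restored the original value.
        }
        return tp.Result;
    }
}
```

Leaving the scope without calling Complete rolls the transaction back, which is the path an out-of-memory failure inside the using block would take.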

Page 22: CLR Reliability under Memory Exhaustion

rNotepad Techniques

• Log user work
  – KeyPressed Records
  – Resize Records

• Write work to log file every second

• Write checkpoint every 30 seconds

• Upon startup, recover
  – Checkpoint speeds up recovery
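The log, checkpoint and recover cycle above can be sketched as follows. This is not rNotepad's code: RecoveryLog, the "K"/"C" record format and the file layout are assumptions made for illustration, resize records are omitted, and the one-second and 30-second timers are left to the caller.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical recovery log: key records plus periodic full-text checkpoints.
sealed class RecoveryLog
{
    private readonly string path;

    public RecoveryLog(string path) { this.path = path; }

    // "K <code>" records one typed character.
    public void LogKey(char c)
    {
        File.AppendAllText(path, "K " + (int)c + Environment.NewLine);
    }

    // "C <base64>" records the whole document, so replay can start here.
    public void Checkpoint(string text)
    {
        string encoded = Convert.ToBase64String(Encoding.UTF8.GetBytes(text));
        File.AppendAllText(path, "C " + encoded + Environment.NewLine);
    }

    // Rebuilds the text from the last checkpoint plus the keys logged after it.
    public string Recover()
    {
        if (!File.Exists(path)) return "";
        string[] lines = File.ReadAllLines(path);
        int start = Array.FindLastIndex(
            lines, l => l.StartsWith("C ", StringComparison.Ordinal));

        var text = new StringBuilder();
        if (start >= 0)
        {
            byte[] bytes = Convert.FromBase64String(lines[start].Substring(2));
            text.Append(Encoding.UTF8.GetString(bytes));
        }
        for (int i = start + 1; i < lines.Length; i++)
        {
            if (lines[i].StartsWith("K ", StringComparison.Ordinal))
                text.Append((char)int.Parse(lines[i].Substring(2)));
        }
        return text.ToString();
    }
}
```

Only records after the last checkpoint are replayed, which is why a checkpoint shortens recovery; a production log would also truncate old records and batch its writes.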

Page 23: CLR Reliability under Memory Exhaustion

Conclusion

• Testing is difficult but possible

• Temporary memory pressure shouldn’t cause failures

• Transactions and Recovery can provide resilient and recoverable solutions

Page 24: CLR Reliability under Memory Exhaustion

Questions?

• More info at http://windows/sites/reliavuls/CLR/default.aspx