Orca: A Language for Parallel Programming of Distributed Systems
Orca
• Parallel language designed at VU
• Design and first implementation ('88–'92): Bal, Kaashoek, Tanenbaum
• Portable Orca system ('93–'97): Bal, Bhoedjang, Langendoen, Rühl, Jacobs, Hofman
• Used by ~30 M.Sc. students
Overview
• Distributed shared memory
• Orca
  – Shared data-object model
  – Processes
  – Condition synchronization
  – Language aspects
• Examples: TSP and ASP
• Implementation
Orca’s Programming Model
• Explicit parallelism (processes)
• Communication model:
  – Shared memory: hard to build
  – Distributed memory: hard to program
• Idea: shared memory programming model on distributed memory machine
• Distributed shared memory (DSM)
Distributed Shared Memory (1)
• Hardware (CC-NUMA):
  – Cache-Coherent Non-Uniform Memory Access
  – Processor can copy a remote cache line
  – Hardware keeps the caches coherent
  – Examples: DASH, Alewife, SGI Origin

[Figure: CPUs with cache/memory; a local read is served from the local cache/memory, a remote read fetches from another CPU's memory]
Distributed Shared Memory (2)
• Operating system:
  – Shared virtual memory
  – Processor can fetch remote pages
  – OS keeps the copies of pages coherent
• User-level system:
  – TreadMarks
  – Uses OS-like techniques
  – Implemented with mmap and signals
Distributed Shared Memory (3)
• Languages and libraries:
  – Do not provide a flat address space
  – Examples:
    • Linda: tuple spaces
    • CRL: shared regions
    • Orca: shared data-objects
Shared Data-object Model
• Shared data encapsulated in objects
• Object = variable of abstract data type
• Shared data accessed by user-defined, high-level operations
[Figure: a shared object encapsulating local data, accessed only through its operations Enqueue() and Dequeue()]
Semantics
• Each operation is executed atomically
  – As if operations were executed one at a time
  – Mutual-exclusion synchronization
  – Similar to monitors
• Each operation applies to a single object
  – Allows an efficient implementation
  – Atomic operations on multiple objects are seldom needed and hard to implement
Implementation
• System determines object distribution
• It may replicate objects (transparently)
[Figure: left, a single-copy object stored on CPU 1 and accessed by CPU 2 over the network; right, a replicated object with a copy on both CPU 1 and CPU 2]
Object Types
• Abstract data type
• Two parts:
  1. Specification part
     • ADT operations
  2. Implementation part
     • Local data
     • Code for the operations
     • Optional initialization code
Example: IntObject
• Specification part
object specification IntObject;
    operation Value(): integer;
    operation Assign(Val: integer);
    operation Min(Val: integer);
end;
IntObject Implementation Part

object implementation IntObject;
    X: integer;  # internal data of the object

    operation Value(): integer;
    begin
        return X;
    end;

    operation Assign(Val: integer);
    begin
        X := Val;
    end;

    operation Min(Val: integer);
    begin
        if Val < X then
            X := Val;
        fi;
    end;
end;
Usage of Objects

# declare (create) object
MyInt: IntObject;

# apply operations to the object
MyInt$Assign(5);
tmp := MyInt$Value();

# atomic operation
MyInt$Min(4);

# multiple operations (not atomic)
if MyInt$Value() > 4 then
    MyInt$Assign(4);
fi;
Parallelism
• Expressed through processes
  – Process declaration: defines the behavior
  – Fork statement: creates a new process
• Object made accessible by passing it as shared parameter (call-by-reference)
• Any other data structure can be passed by value (copied)
Example (Processes)
# declare a process type
process worker(n: integer; x: shared IntObject);
begin
    # do work ...
    x$Assign(result);
end;

# declare an object
min: IntObject;

# create a process on CPU 2
fork worker(100, min) on (2);
Structure of Orca Programs
• Initially there is one process (OrcaMain)
• A process can create child processes and share objects with them
• Hierarchy of processes communicating through objects
• No lightweight threads
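As a sketch, the hierarchy might be set up like this, following the fork example above; the names NCPUS, worker, and the initial bound are assumptions, not part of the slides:

```
# A minimal sketch (assumed names: NCPUS, worker):
# OrcaMain creates a shared object and forks one worker per CPU.
process OrcaMain();
    min: IntObject;               # shared with all child processes
begin
    min$Assign(1000000);          # hypothetical initial value
    for i in 1 .. NCPUS do        # NCPUS: assumed constant
        fork worker(100, min) on (i);
    od;
end;
```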
Condition Synchronization
• An operation is allowed to block initially
  – Using one or more guarded statements
• Semantics:
  – Block until one or more guards are true
  – Select a true guard and execute its statements

operation name(parameters);
    guard expr-1 do
        statements-1;
    od;
    ...
    guard expr-N do
        statements-N;
    od;
end;
Example: Job Queue

object implementation JobQueue;
    Q: "queue of jobs";

    operation addjob(j: job);
    begin
        enqueue(Q, j);
    end;

    operation getjob(): job;
    begin
        guard NotEmpty(Q) do
            return dequeue(Q);
        od;
    end;
end;
Traveling Salesman Problem
• Structure of the Orca TSP program
• JobQueue and Minimum are objects
[Figure: a Master process and several Slave processes communicating through the shared JobQueue and Minimum objects]
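A hedged sketch of a slave process for this structure, combining the JobQueue object from the previous slide with an IntObject holding the global minimum; the tour search itself, and the names job and result, are placeholders:

```
process slave(q: shared JobQueue; minimum: shared IntObject);
    j: job;
begin
    j := q$getjob();      # blocks in the guard until a job is available
    # ... search the tours of job j, pruning with minimum$Value() ...
    minimum$Min(result);  # atomically lower the global bound
end;
```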
Language Aspects (1)
• Syntax somewhat similar to Modula-2
• Standard types:
  – Scalars (integer, real)
  – Dynamic arrays
  – Records, unions, sets, bags
  – Graphs
• Generic types (as in Ada)
• User-defined abstract data types
Language Aspects (2)
• No global variables
• No pointers
• Type-secure
  – Every violation of the language rules is detected by the compiler or the runtime system
• Not object-oriented
  – No inheritance, dynamic binding, or polymorphism
Example Graph Type: Binary Tree

type node = nodename of BinTree;
type BinTree = graph
    root: node;                  # global field
nodes                            # fields of each node
    data: integer;
    LeftSon, RightSon: node;
end;

t: BinTree;                  # create tree
n: node;                     # nodename variable
n := addnode(t);             # add a node to t
t.root := n;                 # access global field
t[n].data := 12;             # fill in data of node n
t[n].LeftSon := addnode(t);  # create left son
deletenode(t, n);            # delete node n
t[n].data := 11;             # runtime error
Performance Issues
• Orca provides a high level of abstraction
  + easy to program
  – hard to understand the performance behavior
• Example: X$foo() can result in:
  – a function call (if X is not shared)
  – a monitor call (if X is shared and stored locally)
  – a remote procedure call (if X is stored remotely)
  – a broadcast (if X is replicated)
Performance model
• The Orca system will:
  – replicate objects with a high read/write ratio
  – store a nonreplicated object on the "best" location
• Communication is generated for:
  – writing a replicated object (broadcast)
  – accessing a remote nonreplicated object (RPC)
• Programmer must think about locality
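Applied to the TSP objects above, this model predicts costs roughly as follows; the placement stated in the comments is one plausible outcome chosen by the system, not something the programmer controls directly:

```
# Assume the system has replicated Minimum (read-mostly) and has
# stored the nonreplicated JobQueue on CPU 1.
bound := Minimum$Value();   # local read on every CPU: no communication
Minimum$Min(len);           # write to a replicated object: broadcast
j := JobQueue$getjob();     # called from CPU 2: RPC to CPU 1
```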
Summary of Orca
• Object-based distributed shared memory
  – Hides the underlying network from the user
  – Applications can use shared data
• The language is designed specifically for distributed systems
• User-defined, high-level operations