Transcript
Page 1: Internet-Based TSP Computation with Javelin++ Michael Neary & Peter Cappello Computer Science, UCSB

Internet-Based TSP Computation with Javelin++

Michael Neary & Peter Cappello, Computer Science, UCSB

Page 2:

Introduction: Goals

• Service parallel applications that are:
  – Large: too big for a cluster
  – Coarse-grain: to hide communication latency

• Simplicity of use
  – Design focus: decomposition [composition] of computation.

• Scalable high performance
  – despite large communication latency

• Fault-tolerance
  – 1000s of hosts, each dynamically [dis]associates.

Page 3:

Introduction: Some Related Work

Page 4:

Introduction: Some Applications

• Search for extra-terrestrial life

• Computer-generated animation

• Computer modeling of drugs for:
  – Influenza
  – Cancer
  – Reducing chemotherapy’s side-effects

• Financial modeling

• Storing nuclear waste

Page 5:

Outline

• Architecture

• Model of Computation

• API

• Scalable Computation

• Experimental Results

• Conclusions & Future Work

Page 6:

Architecture: Basic Components

Brokers

Clients

Hosts

Page 7:

Architecture: Broker Discovery

[Figure: a network of brokers (B), the Broker Naming System, and a host (H).]

Page 10:

Architecture: Broker Discovery

[Figure: a network of brokers (B), the Broker Naming System, and a host (H); message shown: PING(BID?).]

Page 12:

Architecture: Network of Broker-Managed Host Trees

• Each broker manages a tree of hosts

Page 15:

Architecture: Network of Broker-Managed Host Trees

• Brokers form a network

• Client contacts broker

• Client gets host trees

Page 16:

Scalable Computation: Deterministic Work-Stealing Scheduler

[Figure: each HOST holds a task container with the operations addTask( task ), getTask( ), and stealTask( ).]
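
The three operations named on this slide can be sketched as a small deque wrapper. This is only an illustration: the class name TaskContainer, the size() helper, and the choice of which end each operation uses (owner works at the tail, thieves take from the head) are assumptions, not details given in the slides.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical task container: the owning host works at the tail,
// other hosts steal from the head. The end used by each operation is
// an assumption (a common work-stealing convention), not from the slides.
final class TaskContainer<T> {
    private final Deque<T> deque = new ArrayDeque<>();

    // The owner adds newly created work at the tail.
    synchronized void addTask(T task) {
        deque.addLast(task);
    }

    // The owner takes the most recently added task; null if empty.
    synchronized T getTask() {
        return deque.pollLast();
    }

    // Another host steals the oldest task; null if empty.
    synchronized T stealTask() {
        return deque.pollFirst();
    }

    synchronized int size() {
        return deque.size();
    }
}

class TaskContainerDemo {
    public static void main(String[] args) {
        TaskContainer<String> container = new TaskContainer<>();
        container.addTask("a");
        container.addTask("b");
        container.addTask("c");
        System.out.println(container.stealTask()); // oldest task: a
        System.out.println(container.getTask());   // newest task: c
    }
}
```

Stealing from the opposite end to the owner tends to hand a thief older work, which in a branch-and-bound tree is usually a larger subproblem.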

Page 17:

Scalable Computation: Deterministic Work-Stealing Scheduler

Task getWork( )
{
    if ( my deque has a task )
        return task;
    else if ( any child has a task )
        return child’s task;
    else
        return parent.getWork( );
}

[Figure labels: CLIENT, HOSTS]
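
A single-threaded sketch of the getWork( ) pseudocode above, under stated assumptions: the class TreeHost, its fields, and the use of String as the task type are hypothetical, and "deterministic" is read here as "children are checked in a fixed order". The root of the tree stands in for the client and returns null when no work is left.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical node of a host tree; the root stands in for the client.
final class TreeHost {
    final String name;
    final TreeHost parent;                       // null at the root
    final List<TreeHost> children = new ArrayList<>();
    final Deque<String> deque = new ArrayDeque<>();

    TreeHost(String name, TreeHost parent) {
        this.name = name;
        this.parent = parent;
        if (parent != null) {
            parent.children.add(this);
        }
    }

    // Local work first, then the children in fixed order, then the parent.
    String getWork() {
        return getWork(null);
    }

    private String getWork(TreeHost requester) {
        String task = deque.pollLast();          // my deque has a task
        if (task != null) {
            return task;
        }
        for (TreeHost child : children) {        // any child has a task
            if (child == requester) {
                continue;                        // the requester is already empty
            }
            String stolen = child.deque.pollFirst();
            if (stolen != null) {
                return stolen;
            }
        }
        // Otherwise ask the parent; at the root there is nothing left.
        return parent == null ? null : parent.getWork(this);
    }
}
```

A real deployment would replace the direct field access with remote calls and would need synchronization; both are omitted to keep the control flow visible.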

Page 18:

Models of Computation

• Master-slave

– AFAIK all proposed commercial applications

• Branch-&-bound optimization

– A generalization of master-slave.

Page 19:

Models of Computation: Branch & Bound

[Figure: search tree with node values by level (root to leaves): 0; 2 7; 3 6 10 8; 3 4 8 7 12 10 9 10. Explored so far: 0.]

UPPER =
LOWER = 0

Page 20:

Models of Computation: Branch & Bound

[Figure: search tree with node values by level (root to leaves): 0; 2 7; 3 6 10 8; 3 4 8 7 12 10 9 10. Explored so far, by level: 0; 2.]

UPPER =
LOWER = 2

Page 21:

Models of Computation: Branch & Bound

[Figure: search tree with node values by level (root to leaves): 0; 2 7; 3 6 10 8; 3 4 8 7 12 10 9 10. Explored so far, by level: 0; 2; 3.]

UPPER =
LOWER = 3

Page 22:

Models of Computation: Branch & Bound

[Figure: search tree with node values by level (root to leaves): 0; 2 7; 3 6 10 8; 3 4 8 7 12 10 9 10. Explored so far, by level: 0; 2; 3; 4.]

UPPER = 4
LOWER = 4

Page 23:

Models of Computation: Branch & Bound

[Figure: search tree with node values by level (root to leaves): 0; 2 7; 3 6 10 8; 3 4 8 7 12 10 9 10. Explored so far, by level: 0; 2; 3; 3 4.]

UPPER = 3
LOWER = 3

Page 24:

Models of Computation: Branch & Bound

[Figure: search tree with node values by level (root to leaves): 0; 2 7; 3 6 10 8; 3 4 8 7 12 10 9 10. Explored so far, by level: 0; 2; 3 6; 3 4.]

UPPER = 3
LOWER = 6

Page 25:

Models of Computation: Branch & Bound

[Figure: search tree with node values by level (root to leaves): 0; 2 7; 3 6 10 8; 3 4 8 7 12 10 9 10. Explored so far, by level: 0; 2 7; 3 6; 3 4.]

UPPER = 3
LOWER = 7

Page 26:

Models of Computation: Branch & Bound

• Tasks created dynamically

• Upper bound is shared

• To detect termination, the scheduler detects tasks that have been:
  – Completed
  – Killed (“bounded”)

[Figure: explored portion of the search tree, by level: 0; 2 7; 3 6; 3 4.]
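
A sequential sketch of the bookkeeping described above, run on the example tree from the preceding slides. The tree shape is an assumption read off the figure (a binary tree whose internal values are lower bounds and whose leaf values are costs), the names BnbNode and BnbSearch are hypothetical, and the left-first exploration order may differ from the order shown in the slide animation.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical search-tree node: a lower bound, or the cost for a leaf.
final class BnbNode {
    final int bound;
    final BnbNode[] children; // empty for a leaf (a complete solution)

    BnbNode(int bound, BnbNode... children) {
        this.bound = bound;
        this.children = children;
    }

    boolean isComplete() {
        return children.length == 0;
    }
}

final class BnbSearch {
    int upperBound = Integer.MAX_VALUE; // best cost found so far
    int completed = 0;                  // tasks that were processed
    int killed = 0;                     // tasks that were bounded

    // Assumed layout of the example tree, levels 0; 2 7; 3 6 10 8; leaves.
    static BnbNode exampleTree() {
        BnbNode n3 = new BnbNode(3, new BnbNode(3), new BnbNode(4));
        BnbNode n6 = new BnbNode(6, new BnbNode(8), new BnbNode(7));
        BnbNode n10 = new BnbNode(10, new BnbNode(12), new BnbNode(10));
        BnbNode n8 = new BnbNode(8, new BnbNode(9), new BnbNode(10));
        return new BnbNode(0, new BnbNode(2, n3, n6), new BnbNode(7, n10, n8));
    }

    int solve(BnbNode root) {
        Deque<BnbNode> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            BnbNode node = stack.pop();
            if (node.bound >= upperBound) {
                killed++;               // bound improved after this task was created
                continue;
            }
            completed++;
            if (node.isComplete()) {
                upperBound = node.bound; // new best solution
                continue;
            }
            // Push right-to-left so the left child is explored first.
            for (int i = node.children.length - 1; i >= 0; i--) {
                if (node.children[i].bound < upperBound) {
                    stack.push(node.children[i]);
                } else {
                    killed++;           // child is never created as a task
                }
            }
        }
        return upperBound;
    }
}
```

Every task that is created ends up counted as either completed or killed, which is the property a scheduler can use to decide that the computation has terminated.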

Page 27:

API

public class Host implements Runnable
{
    . . .
    public void run()
    {
        while ( (node = jDM.getWork()) != null )
        {
            if ( isAtomic() )
                compute(); // search space; return result
            else
            {
                child = node.branch(); // put children in child array
                for (int i = 0; i < node.numChildren; i++)
                    if ( child[i].setLowerBound() < UpperBound )
                        jDM.addWork( child[i] ); // else child is killed implicitly
            }
        }
    }

Page 28:

API

    private void compute()
    {
        . . .
        boolean newBest = false;

        while ( (node = stack.pop()) != null )
        {
            if ( node.isComplete() )
            {
                if ( node.getCost() < UpperBound )
                {
                    newBest = true;
                    UpperBound = node.getCost();
                    jDM.propagateValue( UpperBound );
                    best = new Node( node ); // copy the new best solution
                }
            }
            else
            {
                child = node.branch();
                for (int i = 0; i < node.numChildren; i++)
                    if ( child[i].setLowerBound() < UpperBound )
                        stack.push( child[i] ); // else child is killed implicitly
            }
        }
        if ( newBest )
            jDM.returnResult( best );
    }
}

Page 29:

Scalable Computation: Weak Shared Memory Model

• Slow propagation of bound affects performance, not correctness.

[Figure: Propagate bound]
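
One way to picture why a late bound is harmless: each host keeps its own copy of the upper bound and only ever lowers it. A stale (higher) copy makes the host prune less, so it does extra work but never discards a node that could still be optimal. The class SharedBound and its method names below are hypothetical, not part of the Javelin++ API.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical per-host copy of the shared upper bound.
final class SharedBound {
    private final AtomicInteger best = new AtomicInteger(Integer.MAX_VALUE);

    // Accept a propagated value only if it lowers the local copy.
    // Returns true when the local copy changed, meaning the value
    // is worth forwarding to neighbouring hosts.
    boolean offer(int candidate) {
        int previous = best.getAndAccumulate(candidate, Math::min);
        return candidate < previous;
    }

    // Value currently used for pruning on this host.
    int get() {
        return best.get();
    }
}

class SharedBoundDemo {
    public static void main(String[] args) {
        SharedBound bound = new SharedBound();
        System.out.println(bound.offer(10)); // true: first bound seen
        System.out.println(bound.offer(12)); // false: stale value is ignored
        System.out.println(bound.get());     // 10
    }
}
```

Because updates are monotone, values may arrive late, out of order, or more than once without changing the final result.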

Page 34:

Scalable Computation: Fault Tolerance via Eager Scheduling

When:

• All tasks have been assigned

• Some results have not been reported

• A host wants a new task

Re-assign a task!

• Eager scheduling tolerates faults & balances the load.
  – Computation completes if at least 1 host communicates with client.
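
The re-assignment rule above can be sketched as follows. This is a minimal illustration with integer task ids; the class EagerScheduler, its method names, and the policy of re-assigning the longest-waiting task are assumptions rather than details from the slides.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical eager scheduler over integer task ids.
final class EagerScheduler {
    private final Deque<Integer> unassigned = new ArrayDeque<>();
    // Assigned tasks whose result has not been reported yet.
    private final Set<Integer> pending = new LinkedHashSet<>();

    EagerScheduler(int taskCount) {
        for (int id = 0; id < taskCount; id++) {
            unassigned.add(id);
        }
    }

    // Called when a host wants a new task; null means all results are in.
    synchronized Integer nextTask() {
        Integer id = unassigned.poll();
        if (id != null) {
            pending.add(id);
            return id;
        }
        if (pending.isEmpty()) {
            return null;
        }
        // All tasks are assigned but some results are missing:
        // re-assign the one that has waited longest, then move it to
        // the back so that later requests pick a different task.
        Integer again = pending.iterator().next();
        pending.remove(again);
        pending.add(again);
        return again;
    }

    // Returns true for the first result of a task; duplicates are ignored.
    synchronized boolean reportResult(int id) {
        return pending.remove(Integer.valueOf(id));
    }

    synchronized boolean isDone() {
        return unassigned.isEmpty() && pending.isEmpty();
    }
}
```

A slow or failed host never blocks progress here: its task is simply handed to whichever host asks next, so the computation finishes as long as one host keeps reporting results.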

Page 35:

Scalable Computation: Fault Tolerance via Eager Scheduling

• Scheduler must know which:
  – Tasks have completed
  – Nodes have been killed

• Performance balance
  – Centralized schedule info
  – Decentralized computation

[Figure: explored portion of the search tree, by level: 0; 2 7; 3 6; 3 4.]

Page 36:

Experimental Results

[Figure: speedup versus number of processors, both axes from 0 to 100; curves: graph22, graph24, and ideal.]

Page 37:

Experimental Results

[Figure: search tree with node values by level (root to leaves): 0; 2 7; 3 6 10 8; 3 4 8 7 12 10 9 10.]

Example of a “bad” graph

Page 38:

Conclusions

• Javelin 2 relieves the designer/programmer of managing a set of [Inter-]networked processors that is:
  – Dynamic
  – Faulty

• A wide set of applications is covered by:
  – Master-slave model
  – Branch & bound model

• Weak shared memory performs well.

• Use multicast (?) for:
  – Code distribution
  – Propagating values

Page 39:

Future Work

• Improve support for long-lived computation:
  – Do not require that the client run continuously.

• A dag model of computation
  – with limited weak shared memory.

Page 40:

Future Work: Jini/JavaSpaces Technology

[Figure: a TaskManager (aka Broker) with hosts (H).]

“Continuously” disperse Tasks among brokers via a physics model

Page 41:

Future Work: Jini/JavaSpaces Technology

• TaskManager uses persistent JavaSpace
  – Host management: trivial
  – Eager scheduling: simple

• No single point of failure
  – Fat tree topology

Page 42:

Future Work: Advanced Issues

• Privacy of data & algorithm

• Algorithms
  – New computational complexity model: “Minimize” communication between machines
  – N-body problem, …

• Accounting: Associate specific work with specific host
  – Correctness
  – Compensation (how to quantify?)

• Create international open source organization
  – System infrastructure
  – Application codes

Page 44:

Models of Computation: Branch & Bound

[Figure: search tree with node values by level (root to leaves): 0; 2 7; 3 6 10 8; 3 4 8 7 12 10 9 10.]

UPPER = 3
LOWER = 0

Page 48:

Architecture: Broker Name Service (BNS)

[Figure labels: BROKER, HOST, BNS]

1. Register with BNS

2. Get broker list

3. Ping brokers on list

4. Connect to selected broker
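
The four steps above can be sketched as follows. Several things here are assumptions rather than facts from the slides: that it is the broker that registers in step 1, that the host selects the broker with the lowest ping time, and all class and method names (BrokerNameService, HostDiscovery, selectBroker). The ping is passed in as a function so the sketch stays independent of any network code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.ToLongFunction;

// Hypothetical in-memory stand-in for the Broker Name Service.
final class BrokerNameService {
    private final List<String> brokers = new ArrayList<>();

    // Step 1: a broker registers with the BNS (assumed to be the broker's job).
    synchronized void register(String brokerId) {
        if (!brokers.contains(brokerId)) {
            brokers.add(brokerId);
        }
    }

    // Step 2: a host gets the broker list (a copy, so callers cannot mutate it).
    synchronized List<String> brokerList() {
        return new ArrayList<>(brokers);
    }
}

final class HostDiscovery {
    // Steps 3 and 4: ping every broker on the list and pick one.
    // 'ping' returns a round-trip time in milliseconds, or a negative
    // value when the broker does not answer. Choosing the fastest
    // responder is an assumed policy.
    static Optional<String> selectBroker(BrokerNameService bns,
                                         ToLongFunction<String> ping) {
        String best = null;
        long bestRtt = Long.MAX_VALUE;
        for (String broker : bns.brokerList()) {
            long rtt = ping.applyAsLong(broker);
            if (rtt >= 0 && rtt < bestRtt) {
                best = broker;
                bestRtt = rtt;
            }
        }
        return Optional.ofNullable(best); // empty if no broker answered
    }
}
```

The host would then open its connection to the returned broker; an empty result means no registered broker answered and the host should retry later.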

