X10: IBM’s bid into parallel languages

University of Massachusetts, Amherst • Department of Computer Science
X10: IBM’s bid into parallel languages
Paul B Kohler, Kevin S Grimaldi
University of Massachusetts Amherst

Post on 13-Jan-2016

DESCRIPTION

X10: IBM’s bid into parallel languages. Paul B Kohler, Kevin S Grimaldi, University of Massachusetts Amherst. Introduction: a new language based on Java; IBM’s entry in DARPA’s PERCS project (Productive Easy-to-use Reliable Computer Systems).

TRANSCRIPT

Page 1: X10: IBM’s bid into parallel languages

X10: IBM’s bid into parallel languages

Paul B Kohler, Kevin S Grimaldi

University of Massachusetts Amherst

Page 2: X10: IBM’s bid into parallel languages

Introduction
A new language based on Java.
IBM’s entry in DARPA’s PERCS project (Productive Easy-to-use Reliable Computer Systems).
Built for NUCC (Non-Uniform Cluster Computing) systems, where different memory locations incur different access costs.

Page 3: X10: IBM’s bid into parallel languages

Intro continued
Will eventually be combined with new tools for Eclipse.
Goals:
  Safe
  Analyzable
  Scalable
  Flexible

Page 4: X10: IBM’s bid into parallel languages

PGAS
Past attempts at parallel languages have used the illusion of a single shared memory.
This does not represent the situation in NUCC.
Problems occur when we try to divide memory among processors.
X10 uses PGAS to reveal the non-uniformity and make the language scalable.

Page 5: X10: IBM’s bid into parallel languages

PGAS (cont.)
PGAS = Partitioned Global Address Space
Memory is partitioned into places.
Data is associated with a place and can only be read/changed locally.
Provided in X10 through the abstractions of places and activities.

Page 6: X10: IBM’s bid into parallel languages

Places
Contain a collection of resident mutable data objects and associated activities.
Places represent locality boundaries.
  Very efficient access to resident data.
Set of places remains fixed at runtime.
Places are virtual.
  Mapped to physical processors by the runtime.
  Runtime may transparently migrate places.

Page 7: X10: IBM’s bid into parallel languages

Using Places
Accessible via place.places
First activity runs at place.FIRST_PLACE
Iterate over places with next() and prev()
here represents the current place
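
The place operations listed above can be pictured with a small model. This is a minimal sketch in Python, used only as an executable stand-in and not X10 code: the `Place` class, `NUM_PLACES`, and `walk_ring` are hypothetical names, and the wrap-around behaviour of `next()` and `prev()` is an assumption about how the ring of places is ordered.

```python
from dataclasses import dataclass

NUM_PLACES = 4  # hypothetical; the set of places is fixed for the whole run


@dataclass(frozen=True)
class Place:
    """A toy stand-in for an X10 place, identified by its index."""
    id: int

    def next(self) -> "Place":
        # Successor in the ring of places (assumed to wrap around).
        return Place((self.id + 1) % NUM_PLACES)

    def prev(self) -> "Place":
        # Predecessor in the ring of places (assumed to wrap around).
        return Place((self.id - 1) % NUM_PLACES)


PLACES = tuple(Place(i) for i in range(NUM_PLACES))  # analogue of place.places
FIRST_PLACE = PLACES[0]                              # analogue of place.FIRST_PLACE


def walk_ring(start: Place) -> list:
    """Visit every place once by repeatedly calling next()."""
    visited, current = [], start
    for _ in range(NUM_PLACES):
        visited.append(current.id)
        current = current.next()
    return visited
```

Calling `walk_ring(FIRST_PLACE)` visits each place exactly once, which is the pattern a program would use to start one activity per place.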

Page 8: X10: IBM’s bid into parallel languages

Activities
Similar to Java threads.
Activities are associated with a place.
Activities never migrate between places.
An activity may only read/modify mutable data that is local to its place.
However, immutable data (i.e. final or value) may be accessed by any activity.

Page 9: X10: IBM’s bid into parallel languages

Activities (cont.)
Activities are GALS (Globally Asynchronous, Locally Synchronous).
Local data accesses are synchronized.
Global data accesses are not synchronized by default.
Synchronization can be explicitly forced.

Page 10: X10: IBM’s bid into parallel languages

Activities: Syntax
It is very simple to spawn new activities:
  async (place) statement
This runs the specified statement at the specified place.
Example:
  final int result;
  async (here.next()) { result = a + b; }
This would add two numbers at the adjacent place and store the result (since result is final, it can be accessed by other places).
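
To make the idea of running a statement at another place concrete, here is a rough simulation in Python (an executable stand-in, not X10). Everything in it is an assumption made for illustration: each place is modelled as a single-worker thread pool with its own dictionary of mutable data, and `async_at` and `add_at_neighbour` are hypothetical helpers. Real places can run many activities at once; one worker per place is a simplification.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_PLACES = 2  # hypothetical number of places

# One worker per place (a simplification) and one private store per place.
executors = [ThreadPoolExecutor(max_workers=1) for _ in range(NUM_PLACES)]
local_store = [dict() for _ in range(NUM_PLACES)]


def async_at(place, body):
    """Spawn `body` as an activity at `place`; it only sees that place's store."""
    return executors[place].submit(body, local_store[place])


def add_at_neighbour(here, a, b):
    """Add a and b at the next place and keep the sum in that place's store."""
    target = (here + 1) % NUM_PLACES      # analogue of here.next()

    def body(store):
        store["result"] = a + b           # a and b are only read, never written

    handle = async_at(target, body)
    handle.result()                       # wait for the activity to terminate
    return local_store[target]["result"]
```

The activity mutates only data that lives at its own place, and the values it reads from the spawning scope are never reassigned, which mirrors the restriction described on the Activities slides.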

Page 11: X10: IBM’s bid into parallel languages

Type System
X10 is strongly typed.
Unified type system
  Everything is an object; no primitive types.
  Library supplies boolean, byte, short, char, int, long, float, double, complex, String classes.
Borrows Java’s single inheritance combined with interfaces.

Page 12: X10: IBM’s bid into parallel languages

Reference vs Value Types
Two types of objects:
  Value types are immutable and can be freely copied.
  Reference types can contain mutable fields but cannot be migrated.
Value classes are declared with the value keyword instead of class.
Value classes can still contain fields that are of reference types.
  Allows them to refer to mutable data.
  Copying ‘bottoms out’ on reference fields.

Page 13: X10: IBM’s bid into parallel languages

Type System (cont.)
Objects are either scalar or aggregate.
Each of value and reference types can be either scalar or aggregate.
Types consist of two parts:
  Data type: the set of values it can take
  Place type: the place at which it resides
No generics (yet).

Page 14: X10: IBM’s bid into parallel languages

Variables
Variables must be initialized (can never be observed without a value).
final variables cannot be changed after initialization.
  Declared by using the final keyword and/or using a variable name that starts with a capital letter.

Page 15: X10: IBM’s bid into parallel languages

Nullable Types
Designers view the ability to hold a null value as orthogonal to value vs reference type.
Either reference or value types can be preceded by nullable.
  Adds a null value to the type.
  Multiple nullables are collapsed (i.e. nullable nullable T = nullable T).
Can cast between T and nullable T.
  (nullable T) v always succeeds.
  (T) null throws an exception if T is not nullable.
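
The collapsing and casting rules for nullable have a close analogue in Python's `typing.Optional`, which is used here purely as an illustration and is not X10. The helper names `to_nullable` and `to_non_null`, and the choice of `TypeError` as the exception, are assumptions made for this sketch.

```python
from typing import Optional, TypeVar

T = TypeVar("T")

# Nested Optional collapses, much like `nullable nullable T = nullable T`.
NESTED_COLLAPSES = Optional[Optional[int]] == Optional[int]


def to_nullable(value: T) -> Optional[T]:
    """Widening from T to nullable T: always succeeds."""
    return value


def to_non_null(value: Optional[T]) -> T:
    """Narrowing from nullable T to T: fails when the value is null."""
    if value is None:
        raise TypeError("cannot cast null to a non-nullable type")
    return value
```

The asymmetry is the point: widening never needs a check, while narrowing needs a runtime test because the null value has no counterpart in the non-nullable type.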

Page 16: X10: IBM’s bid into parallel languages

Rooted exceptions
What should happen when a thread/activity terminates abnormally?
In Java it’s unclear, since the spawning thread may have already terminated.
X10 uses a rooted exception model: all uncaught exceptions get passed to the calling activity.
A new blocking command, finish s, is introduced. This command waits for all activities in s to terminate before proceeding.

Page 17: X10: IBM’s bid into parallel languages

Exceptions (cont.)
Finish allows exceptions to travel back towards the root activity and possibly be caught and handled along the way.
Example:
  try {
      finish async (here.next()) {
          throw new Exception();
      }
  } catch (Exception e) {}
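
The behaviour of finish can be approximated in Python (an executable stand-in, not X10) with a context manager that joins every spawned thread and then re-raises a recorded failure in the parent. The `Finish` class and its `spawn` method are hypothetical; re-raising only the first recorded exception is a simplification chosen for brevity, not a claim about how X10 reports multiple failures.

```python
import threading


class Finish:
    """Toy `finish`: waits for spawned activities and re-raises their errors."""

    def __init__(self):
        self._threads = []
        self._errors = []
        self._lock = threading.Lock()

    def spawn(self, body):
        def run():
            try:
                body()
            except Exception as exc:       # record the failure instead of losing it
                with self._lock:
                    self._errors.append(exc)

        thread = threading.Thread(target=run)
        self._threads.append(thread)
        thread.start()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        for thread in self._threads:
            thread.join()                  # block until every activity has ended
        if exc_type is None and self._errors:
            raise self._errors[0]          # surface the failure in the parent
        return False


def failing_activity():
    raise ValueError("boom")


def run_example():
    try:
        with Finish() as scope:
            scope.spawn(failing_activity)
    except ValueError as err:
        return "caught: " + str(err)
    return "no exception"
```

Because the parent is blocked until its children terminate, there is always a live activity that can receive the exception, which is the problem the rooted model addresses.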

Page 18: X10: IBM’s bid into parallel languages

Arrays
X10 features an array sub-language similar to ZPL.
Arrays have:
  Regions
  Distributions
Arrays are operated on by:
  for
  foreach
  ateach
  And more!

Page 19: X10: IBM’s bid into parallel languages

Even more arrays
Arrays may be value (immutable) or reference (mutable).
The keyword unsafe allows arrays that will play nicely with Java code.
Arrays can run code as an initialization step.

Page 20: X10: IBM’s bid into parallel languages

Arrays: Regions
As in ZPL, a region is a set of indexed data points.
Regions and distributions are first-class constructs.
Regions can be specified like this:
  [0:127,0:255] creates a 128x256 region (bounds are inclusive)

Page 21: X10: IBM’s bid into parallel languages

Regions (cont.)
Regions can be modified by operations such as union (||), intersection (&&) and set difference (-).
Predefined region types can be constructed using factories.
  region R2 = region.factory.upperTriangular(25);
In the future users may be able to define their own regions.
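
Since a region is a set of index points, the region operators behave like ordinary set operations. The following Python sketch (an executable stand-in, not X10) models regions as frozensets of tuples. The helpers `rect` and `upper_triangular` are hypothetical, and the exact shape chosen for the triangular region (points with j >= i) is an assumption.

```python
from itertools import product


def rect(*bounds):
    """Rectangular region from inclusive (low, high) pairs, one per dimension."""
    return frozenset(product(*(range(lo, hi + 1) for lo, hi in bounds)))


def upper_triangular(n):
    """Points (i, j) of an n x n grid with j >= i (assumed definition)."""
    return frozenset((i, j) for i in range(n) for j in range(i, n))


square = rect((0, 3), (0, 3))        # 4 x 4 = 16 points
upper = upper_triangular(4)          # 4 + 3 + 2 + 1 = 10 points

union = square | upper               # corresponds to R1 || R2
intersection = square & upper        # corresponds to R1 && R2
difference = square - upper          # corresponds to R1 - R2
```

Here the difference is the strictly lower triangle, so it holds the 6 points that are in the square but not in the upper triangle.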

Page 22: X10: IBM’s bid into parallel languages

Arrays: Distributions
Every array has a distribution.
A distribution is a mapping of array elements to places.
Distributions are over a particular region.
Arrays are typed by their distribution.

Page 23: X10: IBM’s bid into parallel languages

Distributions (cont.)
Currently must use pre-defined distributions (unique, block, cyclic, etc.).
Have set operations like regions.
Can be used as functions, so for a point p and distribution d: d[p] = the place which point p maps to (i.e. where the p’th element “lives”).
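
Viewing a distribution as a function from points to places, the block and cyclic cases can be sketched in a few lines of Python (an executable stand-in, not X10). The functions `block` and `cyclic` below are hypothetical one-dimensional models; in particular, the use of ceiling division to size the blocks is an assumption about how uneven sizes would be handled.

```python
def block(n_points, n_places):
    """Block distribution: contiguous chunks of indices go to each place."""
    chunk = -(-n_points // n_places)      # ceiling division
    return lambda p: p // chunk


def cyclic(n_places):
    """Cyclic distribution: indices are dealt out round-robin."""
    return lambda p: p % n_places


d_block = block(8, 4)
d_cyclic = cyclic(4)

# Evaluating the distribution at a point plays the role of d[p].
block_owners = [d_block(p) for p in range(8)]
cyclic_owners = [d_cyclic(p) for p in range(8)]
```

With 8 points over 4 places, the block mapping groups neighbours together while the cyclic mapping spreads neighbours across all places, which is the trade-off between locality and load balance.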

Page 24: X10: IBM’s bid into parallel languages

Subarrays
Use various boolean operations on distributions to create subdistributions.
To get the portion of a block distribution that is located here:
  block([1:100]) && [1:100]->here
a | D1 is the portion of array a corresponding to the subdistribution D1.
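
The restriction of a distribution to the current place, and of an array to a subdistribution, can be modelled with dictionaries. This is a sketch in Python (an executable stand-in, not X10) under stated assumptions: `block_dist`, `restrict_to_place`, and `restrict_array` are hypothetical helpers, four places are assumed, and the current place is arbitrarily taken to be place 1.

```python
def block_dist(region, n_places):
    """Map each index of `region` (a sorted list) to a place, block-wise."""
    chunk = -(-len(region) // n_places)   # ceiling division
    return {p: i // chunk for i, p in enumerate(region)}


def restrict_to_place(dist, place):
    """Subdistribution holding only the points that live at `place`."""
    return {p: q for p, q in dist.items() if q == place}


def restrict_array(array, sub_dist):
    """Analogue of a | D1: keep the elements whose index is in D1."""
    return {p: array[p] for p in sub_dist}


region = list(range(1, 101))              # the region [1:100], inclusive
dist = block_dist(region, 4)              # hypothetical: four places
here = 1                                  # hypothetical current place
local = restrict_to_place(dist, here)

a = {p: p * p for p in region}            # an array over the same region
a_local = restrict_array(a, local)
```

With these assumptions, place 1 owns indices 26 through 50, so `a_local` holds exactly those 25 elements.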

Page 28: X10: IBM’s bid into parallel languages

Array construction
Here is an example of array initialization:
  float [.] data = new[factory.cyclic([0:199,50:249])]
      (point [i, j]) { return i + j; };
[0:199,50:249] specifies a 200x200 region (bounds are inclusive).
factory.cyclic(...) specifies a cyclic distribution over the region.
The code block initializes each element to the sum of its i, j coordinates.
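
The three ingredients of the construction (a region, a distribution over it, and a per-point initializer) can be mimicked in Python as an executable stand-in, not X10. The helper `make_array` is hypothetical, the region is shrunk to 3x3 to keep the example small, and dealing points to places in row-major order is an assumption about what "cyclic" means for a two-dimensional region.

```python
def make_array(region, n_places, init):
    """Build {point: value} plus a cyclic {point: place} ownership map."""
    points = sorted(region)
    # Assumed: points are dealt to places round-robin in row-major order.
    owner = {pt: k % n_places for k, pt in enumerate(points)}
    # Each element is initialized from its own coordinates.
    values = {pt: float(init(*pt)) for pt in points}
    return values, owner


# A small stand-in region: i in 0..2, j in 50..52 (inclusive bounds).
region = [(i, j) for i in range(0, 3) for j in range(50, 53)]
data, owner = make_array(region, 2, lambda i, j: i + j)
```

The initializer runs once per point, so `data[(2, 52)]` is `54.0`, and `owner` records which place would hold each element.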

Page 29: X10: IBM’s bid into parallel languages

Array iteration
Once you have an array, what can you do with it?
Array iterators: for, foreach, ateach
for: Sequentially iterates over a supplied region. At each point it binds the point to a variable and executes the accompanying statement.
foreach: As with for, but operations are done in parallel. That is, it spawns a new activity for each point.
ateach: Takes a distribution instead of a region. Performs operations in parallel at the place specified by the distribution.

Page 30: X10: IBM’s bid into parallel languages

Iteration example
Example:
  for (point p : A) {
      A[p] = A[p] * A[p];
  }
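
The difference between the three iterators is where and how the loop body runs, not what it computes. The sketch below uses Python as an executable stand-in, not X10, and squares every element in each style. The function names are hypothetical, arrays are modelled as dictionaries, and each place is modelled as a single-worker thread pool, which is a simplification.

```python
from concurrent.futures import ThreadPoolExecutor


def for_each_point_sequential(array):
    """Like `for`: visit the points one at a time."""
    for p in list(array):
        array[p] = array[p] * array[p]


def for_each_point_parallel(array):
    """Like `foreach`: one task per point, then wait for all of them."""
    def square(p):
        array[p] = array[p] * array[p]

    with ThreadPoolExecutor() as pool:
        list(pool.map(square, list(array)))


def at_each_point(array, owner, n_places):
    """Like `ateach`: each point's task runs on its owning place's worker."""
    def square(p):
        array[p] = array[p] * array[p]

    pools = [ThreadPoolExecutor(max_workers=1) for _ in range(n_places)]
    futures = [pools[owner[p]].submit(square, p) for p in list(array)]
    for future in futures:
        future.result()                   # wait for every activity
    for pool in pools:
        pool.shutdown()
```

Each task touches a different key, so the parallel versions need no locking here; a body that combined several elements would need the synchronization constructs described later.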

Page 31: X10: IBM’s bid into parallel languages

More array ops
lift: Takes a binary function and two arrays of the same distribution. Produces a new array formed by a pointwise application of the function to the two arrays.
reduce: As in MPI, combines all elements with a binary function to produce a single value.
scan: Creates a new array where the i’th element is the result of the reduction on the first i elements.
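
These three operations correspond to familiar functional building blocks. As a sequential illustration in Python (an executable stand-in, not X10, and ignoring distribution entirely), they can be written with the standard library; the wrapper names `lift`, `reduce_array`, and `scan` are chosen here for the sketch.

```python
from functools import reduce
from itertools import accumulate
from operator import add, mul


def lift(fn, xs, ys):
    """Pointwise application of a binary function to two equal-length arrays."""
    if len(xs) != len(ys):
        raise ValueError("arrays must have the same shape")
    return [fn(x, y) for x, y in zip(xs, ys)]


def reduce_array(fn, xs):
    """Combine all elements with a binary function into a single value."""
    return reduce(fn, xs)


def scan(fn, xs):
    """Element i of the output is the reduction of the first i elements."""
    return list(accumulate(xs, fn))


sums = lift(add, [1, 2, 3], [10, 20, 30])     # [11, 22, 33]
product_all = reduce_array(mul, [1, 2, 3, 4]) # 24
prefix_sums = scan(add, [1, 2, 3, 4])         # [1, 3, 6, 10]
```

The last element of a scan always equals the full reduction, which is a handy sanity check when implementing either one in parallel.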

Page 32: X10: IBM’s bid into parallel languages

Atomic Blocks
X10 allows you to define atomic blocks.
The contents of a block are guaranteed to execute as a single atomic event. This is only with regard to other activities in the same place.
While this is guaranteed to be atomic, the details are implementation specific.
Syntax: atomic S
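
One possible way to realize an atomic block is a single lock per place, which is what this Python sketch does (an executable stand-in, not X10). This is only one implementation strategy chosen for illustration, since the slides leave the mechanism implementation specific; `PlaceState` and `run_counters` are hypothetical names.

```python
import threading


class PlaceState:
    """Mutable data owned by one place, guarded by a per-place lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.counter = 0

    def atomic_increment(self):
        # Everything inside the `with` runs as one indivisible step
        # with respect to other activities using the same lock.
        with self._lock:
            self.counter += 1


def run_counters(n_threads, per_thread):
    state = PlaceState()

    def worker():
        for _ in range(per_thread):
            state.atomic_increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return state.counter
```

With the lock in place no increment is lost, so `run_counters(4, 1000)` returns 4000.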

Page 33: X10: IBM’s bid into parallel languages

Conditional Atomic Blocks
Also provides: when (Cond) S
This blocks until Cond is true and then executes S atomically.
This allows the creation of a number of synchronization mechanisms.
Dangerous! If Cond is never true, or if there is a cycle, deadlock occurs.
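
A one-slot buffer is a typical synchronization mechanism that can be built from conditional atomic blocks. The sketch below uses Python as an executable stand-in, not X10: `threading.Condition.wait_for` plays the role of waiting for the condition, and the code after it plays the role of the atomic statement. `OneSlotBuffer` and `transfer` are hypothetical names.

```python
import threading


class OneSlotBuffer:
    """A buffer holding at most one item, guarded by a condition variable."""

    def __init__(self):
        self._cond = threading.Condition()
        self._item = None
        self._full = False

    def put(self, item):
        with self._cond:
            self._cond.wait_for(lambda: not self._full)   # like when (!full)
            self._item, self._full = item, True
            self._cond.notify_all()

    def take(self):
        with self._cond:
            self._cond.wait_for(lambda: self._full)       # like when (full)
            item, self._item, self._full = self._item, None, False
            self._cond.notify_all()
            return item


def transfer(items):
    """Send a list of items through the buffer to a consumer thread."""
    buf = OneSlotBuffer()
    received = []
    consumer = threading.Thread(
        target=lambda: received.extend(buf.take() for _ in items))
    consumer.start()
    for item in items:
        buf.put(item)
    consumer.join()
    return received
```

If the consumer were never started, the second `put` would wait forever, which is the kind of deadlock the slide warns about.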

Page 34: X10: IBM’s bid into parallel languages

Future and Force
As discussed before, futures allow the asynchronous computation of a value that may be used in the future.
Futures return an object of type Future<T>.
Force is a blocking call that waits for a particular future to be finished.

Page 35: X10: IBM’s bid into parallel languages

Futures (cont.)
Can only access final variables. This prevents side effects.
Syntax: future (p) e
Example:
  Future<float> blah = future (here.next()) { Math.sqrt(a*a + b*b) };
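
Python's `concurrent.futures` offers a close analogue, shown here as an executable stand-in and not X10: `submit` returns a future immediately and `result()` blocks like force. The single-worker pool standing in for the remote place and the function names are assumptions of this sketch.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def hypotenuse_future(remote_place, a, b):
    """Start computing sqrt(a*a + b*b) elsewhere; returns at once with a future."""
    # a and b are only read inside the task, so it has no side effects.
    return remote_place.submit(lambda: math.sqrt(a * a + b * b))


def compute(a, b):
    with ThreadPoolExecutor(max_workers=1) as remote_place:
        pending = hypotenuse_future(remote_place, a, b)
        # ... the caller could do unrelated work here ...
        return pending.result()           # analogue of force(): block for the value
```

For example, `compute(3.0, 4.0)` returns `5.0` once the remote computation has finished.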

Page 36: X10: IBM’s bid into parallel languages

Clocks
Act as barriers
  Much more flexible
  Guarantee no deadlock
Dynamically associated with different sets of activities

Page 37: X10: IBM’s bid into parallel languages

Clock Semantics
Activities register with zero or more clocks.
  Can register/unregister at any time.
Clocks are always in some phase.
  Do not advance until all currently registered activities quiesce.
Activities quiesce with the next operation.
  Indicates they are ready for all their clocks to advance.
  Suspends until all clocks have advanced.
This makes deadlock impossible.
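
The phase-by-phase behaviour of a clock can be approximated with a reusable barrier. The sketch below uses Python as an executable stand-in, not X10, and covers only the simplest case: `threading.Barrier` has a fixed number of parties, so the dynamic registration and multiple clocks described above are not modelled. `run_phases` is a hypothetical helper.

```python
import threading


def run_phases(n_activities, n_phases):
    """Run activities in lock-step phases and return the order of their work."""
    clock = threading.Barrier(n_activities)   # fixed membership (simplification)
    log = []
    log_lock = threading.Lock()

    def activity(ident):
        for phase in range(n_phases):
            with log_lock:
                log.append((phase, ident))    # this phase's work
            clock.wait()                      # analogue of `next`: quiesce and wait

    threads = [threading.Thread(target=activity, args=(i,))
               for i in range(n_activities)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return log
```

No activity starts phase k+1 until every activity has finished phase k, so the phase numbers in the returned log never decrease.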

Page 38: X10: IBM’s bid into parallel languages

Status
IBM has supposedly built a single-VM reference implementation.
Language still under heavy revision.
GPL’ed X10-XTC compiler available.
  Doesn’t conform to current language spec.
  Uses what will possibly be version 0.5.
  Speculatively contains support for operator overloading and generics.
  Currently very poor performance.

Page 43: X10: IBM’s bid into parallel languages

Conclusion
So is X10 the answer to all our parallel programming woes?
In my opinion, probably not.
Parallelism is still very explicit. There are still opportunities for deadlock, race conditions, etc.
Takes a “…and the kitchen sink” approach, which makes learning the syntax a chore.
It’s not FORTRAN. Will people bother to use it?