push podc09
© 2009 IBM Corporation
PUSH: a DISC Shell
Eric Van Hensbergen & Noah Evans - IBM Research
11 August 2009
Monday, August 10, 2009
Noah Paul Evans
IBM Research Austin Intern -> Bell Labs Antwerp RSM
Wisdom
“This is the Unix philosophy. Write programs that do one thing and do it well. Write programs to work together.” - Doug McIlroy
UNIX Pipelines
cat file | sort -n -r | uniq | more
PUSH Concept
ls |< cat | sort -n -r | uniq >| sort -n -r | more
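One way to read the pipeline above is sketched below in Python. This is only an illustration under stated assumptions, not the PUSH implementation: records are dealt round-robin to parallel copies of the middle stage (the slides do not say which distribution strategy is used), `sort_n_r` and `uniq` are simplified stand-ins for the Unix tools, and every function name here is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby


def fan_out(records, degree):
    """Deal records round-robin into `degree` shares (an assumed strategy)."""
    shares = [[] for _ in range(degree)]
    for i, rec in enumerate(records):
        shares[i % degree].append(rec)
    return shares


def sort_n_r(lines):
    """Rough stand-in for `sort -n -r`: numeric sort, descending."""
    return sorted(lines, key=float, reverse=True)


def uniq(lines):
    """Stand-in for `uniq`: collapse adjacent duplicate lines."""
    return [key for key, _ in groupby(lines)]


def middle_stage(share):
    """The work done between `|<` and `>|`, applied to one share."""
    return uniq(sort_n_r(share))


def fan_in(outputs):
    """Merge the outputs of all parallel copies into one stream."""
    merged = []
    for out in outputs:
        merged.extend(out)
    return merged


def run(records, degree=3):
    shares = fan_out(records, degree)
    with ThreadPoolExecutor(max_workers=degree) as pool:
        outputs = list(pool.map(middle_stage, shares))
    # Final stage after `>|`: one more numeric sort over the merged stream.
    return sort_n_r(fan_in(outputs))
```

Note that in this sketch `uniq` only removes duplicates within a share, so equal values that landed in different shares still appear more than once after the fan-in.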
PUSH Structure
[Figure: PUSH structure diagram; the labels did not survive text extraction]
Composable
stage1 |< stage2 |< stage3 >| stage4 >| stage5
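The nesting in the line above can be sketched in Python as follows. The sketch assumes that each `>|` closes the most recent open `|<`, like matching parentheses; the slides do not state this pairing rule. The shares are processed sequentially to keep the structure visible, and all names are hypothetical.

```python
def fan_out(records, degree):
    """Deal records round-robin into `degree` shares (an assumed strategy)."""
    shares = [[] for _ in range(degree)]
    for i, rec in enumerate(records):
        shares[i % degree].append(rec)
    return shares


def nested(records, stage1, stage2, stage3, stage4, stage5, outer=2, inner=2):
    """Model of: stage1 |< stage2 |< stage3 >| stage4 >| stage5."""
    outer_results = []
    for share in fan_out(stage1(records), outer):      # first  |<
        inner_results = []
        for sub in fan_out(stage2(share), inner):      # second |<
            inner_results.extend(stage3(sub))          # first  >| merges here
        outer_results.extend(stage4(inner_results))    # second >| merges here
    return stage5(outer_results)
```

Each stage is an ordinary function from a list of records to a list of records, so any stage can be swapped out without touching the others.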
Operators
Fan Out ( |<[n] )
– [n] specifies the maximum degree of fan-out
– by default, each record is fanned out to a new core (up to the maximum number of cores)
– the parsing and distribution strategy is determined by a module specified via the environment variable OFS
– the default module splits records on newlines
Fan In ( >| )
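The operator rules above can be sketched in Python as follows. Only the OFS environment variable, the newline default, and the [n] limit come from the slide. The module names `newline` and `paragraph`, the function names, and the treatment of [n] as independent of the core count are assumptions made for illustration.

```python
import os


def split_newline(data):
    """Default module: one record per line."""
    return data.splitlines()


def split_paragraph(data):
    """Hypothetical alternative module: one record per blank-line-separated block."""
    return [block.strip("\n") for block in data.split("\n\n") if block.strip()]


# Hypothetical registry; the real PUSH module names are not given in the slides.
SPLITTERS = {"newline": split_newline, "paragraph": split_paragraph}


def pick_splitter(environ=None):
    """Select the record-parsing module named by OFS, defaulting to newline."""
    environ = os.environ if environ is None else environ
    return SPLITTERS[environ.get("OFS", "newline")]


def fan_out_degree(num_records, n=None, cores=None):
    """Degree of fan-out: at most [n] if given, otherwise at most one per core."""
    cores = cores or os.cpu_count() or 1
    limit = cores if n is None else n
    return max(1, min(num_records, limit))
```

For example, with ten records on an eight-core machine, `fan_out_degree(10, n=4, cores=8)` gives 4 and `fan_out_degree(10, cores=8)` gives 8.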
Status
Prototype built using Inferno and the MASH shell
Deployed to a local Linux cluster, Amazon EC2, and BlueGene via Kittyhawk (which runs a cloud on BlueGene hardware)
Currently building out the underlying execution model to support a wide range of cluster environments and provide better distribution and control
Future Work - Alternate Distribution Models
– Separate Distribution Model from Record Parsing Module
– Broadcast and other MPI-style Collective Operations (?)
– Adapt to changes in underlying resources and/or failure
– Apply to heterogeneous systems (Cell, GPUs, multi-ISA)
Thanks
http://code.google.com/p/push
http://www.research.ibm.com/hare
This work has been supported by the Department of Energy Office of Science Operating and Runtime Systems for
Extreme Scale Scientific Computation project under contract #DE-FG02-08ER25851.