Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You? Jay Boisseau, Texas Advanced Computing Center SURA Grid Application Planning & Implementation Workshop December 6-8, 2005


Page 1: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

Jay Boisseau, Texas Advanced Computing Center

SURA Grid Application Planning & Implementation Workshop

December 6-8, 2005

Page 2: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

Outline

• Welcome!
• Overview of TACC (with Grid Computing Context)
• Some Perspectives on Grid Computing
• Closing Thoughts
• More

Page 3: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

Overview of TACC (with Grid Computing Context)

Page 4: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

TACC Mission

To enhance knowledge discovery & education and to

improve society through the application of advanced

computing technologies.

Page 5: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

To accomplish this mission, TACC:
– Evaluates, acquires & operates advanced computing systems and software
– Provides documentation, consulting, and training to users of advanced computing resources
– Conducts R&D to produce new computational technologies & techniques that enhance advanced computing systems
– Collaborates with users to apply advanced computing techniques in their research, development, occupations, etc.
– Educates the community to broaden and deepen the pipeline of talented persons choosing careers in advanced computing
– Informs society about the value of advanced computing technologies in improving knowledge and quality of life

TACC Strategic Approach

Resources & Services

PR & EOT

Research & Development

Page 7: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

TACC Advanced Computing Technology Areas

• High Performance Computing (HPC)

• Visualization & Data Analysis (VDA)

• Data & Information Systems (DIS)

• Distributed & Grid Computing (DGC)
– newest area of R&D, resources, services at TACC
– “tying it all together”

Page 9: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

TACC Advanced Computing Applications Focus Areas

• Computational Geosciences
– World-class expertise, programs at UT Austin
– Strategic to state of Texas

• Computational Life Sciences
– Broad & deep expertise in Texas higher ed institutions
– Important to society

• Emergency Situation Assessment & Response
– Crucial to life, property
– Leverages TACC expertise, resources, and applications

• Each has need for resource sharing & coordination, workflow, data/instrument integration: grid computing

Page 10: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

TACC HPC & Storage Systems

• STK PowderHorns (2): 2.8 PB max capacity, managed by Cray DMF
• IBM Power4 System: 224 CPUs (1.16 Tflops), ½ TB memory, 7.1 TB disk
• Dell Xeon EM64T Linux Cluster: 656 CPUs (4.2 Tflops), 1.3 TB memory, ~4 TB disk
• Cray-Dell Xeon Linux Cluster: 1028 CPUs (6.3 Tflops), 1+ TB memory, 40+ TB disk
• Sun SANs and Data Direct Disk: > 50 TB
• Mac Xserve G5 Cluster: 46 CPUs (368 Gflops), 52 GB memory, 3.7 TB disk

[Diagram labels: LONGHORN, WRANGLER, ARCHIVE, LONESTAR, GLOBAL DISK, STAMPEDE]

Page 11: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

ACES VisLab

• Front and Rear Projection Systems
– 3x1 semi-cylinder immersive environment, 24’ diameter
– 5x2 large-screen, 16:9 panel tiled display
– Matrix switch between systems, projectors, rooms

• Full immersive capabilities with head/motion tracking

Page 12: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

TACC Advanced Visualization Systems

• Sun Terascale Visualization System
– 128 UltraSparc 4 cores, ½ TB memory
– 16 commodity graphics cards, > 3 Gpoly/sec
– Remote to VisLab; very remote to TeraGrid!

• SGI Onyx2
– 24 CPUs, 6 Infinite Reality 2 Graphics Pipes
– 25 GB Memory, 356 GB Disk

Page 13: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

TACC Network Connectivity

• Intercampus bandwidth
– Force10 switch/routers with 1.2 Tbps backplane in TACC machine room and ACES building
– 10 Gbps between TACC machine room and ACES provided by Nortel DWDM (waiting for 10GigE cards)

• WAN network upgrades:
– UT Internet2 at OC-12
– TeraGrid connection at 10 Gbps
– New Lonestar Education And Research Network (LEARN) being built for Texas universities
– Texas joining National Lambda Rail (10 Gbps waves)

• High bandwidth networks (local and national) to facilitate resource sharing, coordination, data flow…
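
The link speeds above can be turned into rough transfer times for terascale data. The following is a minimal sketch, not something from the talk: it assumes nominal line rates (OC-12 taken as 622.08 Mbps), a decimal terabyte, and a hypothetical helper named `transfer_hours`. Real throughput would be lower because of protocol overhead, disk speed, and congestion, which the `efficiency` parameter stands in for.

```python
# Rough transfer-time estimate for a terascale data set over the WAN links
# named on this slide. All rates are nominal line rates (an assumption).

LINK_RATES_BPS = {
    "UT Internet2 at OC-12": 622.08e6,   # OC-12 nominal line rate, bits/s
    "TeraGrid at 10 Gbps": 10e9,
}

ONE_TB = 1e12  # assumption: decimal terabyte, in bytes


def transfer_hours(size_bytes: float, rate_bps: float, efficiency: float = 1.0) -> float:
    """Hours needed to move size_bytes over a link of rate_bps.

    efficiency is the fraction of the line rate actually achieved, in (0, 1].
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    seconds = size_bytes * 8 / (rate_bps * efficiency)
    return seconds / 3600


for name, rate in LINK_RATES_BPS.items():
    print(f"{name}: {transfer_hours(ONE_TB, rate):.2f} hours per TB at line rate")
```

At line rate this works out to roughly 3.6 hours per terabyte on OC-12 versus about 13 minutes at 10 Gbps, which is the gap the WAN upgrades are meant to close.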

Page 14: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

TACC R&D – Distributed & Grid Computing

• Web-based grid portals
– GridPort, TeraGrid User Portal, SURA portal, TIGRE portal

• Grid resource data collection & information services
– GPIR

• Overall grid deployment and integration
– UT Grid, TeraGrid, TIGRE, OSG, SURA

• Grid scheduling and workflow tools
– GridShell, MyCluster, Metascheduling Prediction Services

• Remote and collaborative grid-enabled visualization
– For TeraGrid, UT Grid

• Network performance for moving terascale data

Page 17: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

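TACC Activities & Scope

[Diagram: Research, Development, Services, Resources, and EOT activities across HPC, Vis, Data, and Grid, with regions labeled “Since 1986” and “Since 2001.” Grid projects listed: UT Grid, TIGRE, TeraGrid, OSG, SURAgrid, GridPort, GridShell, etc.]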

Page 18: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

TACC Today

Page 19: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

TACC Tomorrow

Page 21: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

Summary

• TACC has grown into a leading center since June 01
– 4x staff, 6x external funding
– 100x compute power
– New R&D in HPC, Vis, Data, and especially Grid Computing
– New EOT, international, industrial partners programs

• Grid computing projects have played a major role in TACC’s growth and success so far
– Leadership in software including GridPort, GridShell, MyCluster, Metascheduling Prediction Services
– Partnership in grids at campus, state, regional, national, and international scales

Page 22: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

Some Perspectives on Grid Computing

Page 23: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

Researchers Already Use Distributed Computing: Case Is Already Made!

• Researchers already use distributed systems:
– Local workstations for some development, small simulations
– HPC at big centers
– Visualization back in their lab or in a Vislab
– Archival storage to SANs, NASes, tape silos, etc.

• Researchers already collaborate with peers at other institutions
– science is collaborative!

• Grids should enable resource sharing, collaboration, etc. with
– Greater ease
– More flexibility
– More capability

Page 24: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

Or in English…

“There are talented people everywhere in the world focused on solving the most challenging problems, and there are companies everywhere determined to provide the best products as efficiently as possible… people WILL collaborate and learn to share resources, as well as ideas and data, in order to ‘be first’ … people have been using distributed resources for decades, and this is only increasing… Grid computing to me is the subset of distributed computing that makes it easier… So, ‘Grid computing’ is here today and will remain important, by whatever name you want to call it.” -- me in GRIDtoday 12/05/05

Page 25: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

Grid Computing: My View

• Grid computing is a standard, ‘complete’ set of distributed computing software capabilities

• Grid computing must provide some basic functions
– resource discovery and information collection & publishing
– data management on and between resources
– process management on and between resources
– common security mechanism underlying the above

• No grid computing package provides everything

• Example: ‘Open Grid Services Architecture’ (OGSA) (e.g., as implemented in Globus v4) makes it possible to build the components and make them work together
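
The four basic functions listed above can be illustrated with a toy, in-memory model. This is only a sketch under stated assumptions: `ToyGrid`, `Resource`, and every method name here are hypothetical and are not part of OGSA, Globus, or any TACC software. The point is simply that information, data, and process operations all pass through one shared security check.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """A hypothetical grid resource: a name, a CPU count, and some files."""
    name: str
    cpus: int
    files: dict = field(default_factory=dict)  # filename -> contents


class ToyGrid:
    """Hypothetical in-memory stand-in for the four basic grid functions."""

    def __init__(self, authorized_users):
        self._users = set(authorized_users)
        self._resources = {}

    # Common security mechanism underlying everything else.
    def _check(self, user):
        if user not in self._users:
            raise PermissionError(f"{user} is not authorized on this grid")

    # Information publishing and resource discovery.
    def publish(self, user, resource):
        self._check(user)
        self._resources[resource.name] = resource

    def discover(self, user, min_cpus=1):
        self._check(user)
        return sorted(r.name for r in self._resources.values() if r.cpus >= min_cpus)

    # Data management on and between resources.
    def copy(self, user, filename, src, dst):
        self._check(user)
        self._resources[dst].files[filename] = self._resources[src].files[filename]

    # Process management: run a function against a file held on a resource.
    def run(self, user, resource_name, func, filename):
        self._check(user)
        return func(self._resources[resource_name].files[filename])


# Usage: stage a file from a workstation to the largest resource and run there.
grid = ToyGrid(authorized_users={"alice"})
grid.publish("alice", Resource("workstation", cpus=2, files={"input.txt": "a b c"}))
grid.publish("alice", Resource("cluster", cpus=656))
target = grid.discover("alice", min_cpus=100)[0]
grid.copy("alice", "input.txt", src="workstation", dst=target)
word_count = grid.run("alice", target, lambda text: len(text.split()), "input.txt")
print(target, word_count)
```

A real grid stack splits these functions across separate services and packages, which is why no single package provides everything.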

Page 26: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

Grid Computing: My View

• TACC focuses on Grid computing to
– enhance our HPC, SciVis, and massive data storage
– integrate researchers’ local computing systems with ours
– eventually, integrate research instruments for research that also requires HPC, SciVis, massive data storage

Page 27: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

So TACC Drank The Grid Kool-Aid

• What grids are we participating in?
– UT Grid: campus-scale
– TIGRE: state
– SURA Grid: regional
– TeraGrid: national
– Open Science Grid: international
– And we’re building grid tools to provide capabilities for/in these grids

• Why are we participating in these grids? Some examples will answer that question….

Page 28: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

UT Grid: Enable Campus-Wide Terascale Distributed Computing

• Why Build It? To move from ‘island’ of high-end resources to ‘hub’ of campus computing continuum
– provide models for local resources (clusters, vislabs, etc.), training, and documentation
– develop procedures for integrating local systems to UT Grid
    • single sign-on, data space, compute space
    • leverage every PC, cluster, NAS, etc. on campus!
– integrate digital assets into UT Grid
– integrate UT instruments & sensors into UT Grid
– provide user portals and login nodes to access and use all campus resources!

Page 29: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

UT Grid: Resources Distributed Across Two Campuses

[Diagram: UT Grid resources on the research campus and the main campus, connected through switches, NOCs, GAATN, and external networks. Resources shown: TACC Vis, ACES, ICES clusters and data, PGE clusters and data, CMS, TACC PWR4, TACC Storage, TACC Cluster.]

Page 30: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

UT Grid Status

• First 20 Months:
– Deployed production United Devices ‘grid’ (Roundup)
– Deployed production Condor pool, integrated with other pools (Rodeo)
– Developed GridPort v4, GridShell v1
– Building user portal, downloadable client software stack
– More to come… (see tomorrow’s talk)

Page 31: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

TIGRE: Texas Internet Grid for Research & Education

• Why Build It?: Help Texas universities & medical centers work together to share resources and advance Texas research, education, economy

• 2 year project, $2.5M
– But took 2+ years to get funding!

• 5 funded participants
– Rice University
– Texas Tech University
– Texas A&M
– University of Houston
– University of Texas

Page 32: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

TIGRE: Texas Internet Grid for Research & Education

• Develop, document, and deploy a grid across the 5 participants
– Supporting driving applications

• Enable other LEARN members to join TIGRE
– Package grid software so that others can easily install it
– Provide good documentation
– Ensure that it’s easy, lightweight
– Make it modular: enable institutions to provide just what they can offer

• NOTE: Companion project (LEARN) will provide a high-bandwidth network for use by TIGRE and other Texas institutions

Page 33: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

TIGRE Deliverables: Quick Build!

YEAR 1
• Q1
– Project plan
– Web site
– Certificate Authority
– Minimum testbed requirements
– Select 3 driving applications
• Q2
– Alpha portal
• Q3
– Define software stack
– Distribution Mechanism
– Simple demo of 1 TIGRE app
• Q4
– Alpha client software package distributed

YEAR 2
• Q1
– Alpha customer management services system deployed & demonstrated
• Q2
– Global grid scheduler deployed
• Q3
– Stable software available (only bug fixes after this)
– Services required to be part of TIGRE specified
• Q4
– Complete hardening of software
– Complete documentation
– Finalized procedures and policies to join TIGRE & document
– Demonstrate TIGRE at SC

Page 34: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

NSF TeraGrid: National Cyberinfrastructure for Computational Science

• Why Build It? Provide terascale computational capabilities that go beyond just HPC to facilitate 21st century research!

• Includes NCSA, SDSC, PSC, Indiana, Purdue, Argonne, and Oak Ridge

• Anointed as NSF production cyberinfrastructure for 5 years

- TACC is providing terascale computing, storage, and visualization resources
- UT is providing terascale geosciences data sets

Page 35: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

Closing Thoughts

Page 36: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

So Should You Or Shouldn’t You?

• Grid computing is here to stay, by one name or another…
– The possibilities are too great
– The needs are too great

• But it’s not always needed
– Simple solutions, powerful tools, sharp minds get answers
– Can maximize collaboration, but can also inhibit people from working on the real problem

• Get user requirements and THINK!
– What is needed?
– What is overkill?
– Use mature technologies unless doing grid R&D
– Use the minimum subset to meet requirements, build on successes incrementally

Page 37: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

To Build Useful Grids, Software Must Be:

• Easier
– No more difficult than CLIs for ‘power users’
– No more difficult than the Web/PC apps for the other 99% of (potential) users (portals, desktop apps, etc.)
– No more difficult than configuring office network for admins

• Smarter
– Smart scheduling, data transfers, workflow
– Built-in help/advice, like PC apps and portals

Page 38: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

To Build Useful Grids, Software Must Be:

• More robust
– Must not break more than the individual resources
– Opportunity is to break less than any individual resource (but only partially successful so far)

• And standards-based & interoperable
– Web services, etc.

• So lots of opportunities for us geeks!
– But let’s not lose sight of the forest for the trees!
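
The “More robust” point above can be made concrete with a little availability arithmetic. This is a minimal sketch under assumptions that are not from the talk: resources fail independently, and the 95% availability figure is purely illustrative. A grid that needs every resource up at once breaks more often than any one of them; a grid that can fall back to whichever resource is up breaks less often.

```python
from math import prod


def all_required(availabilities):
    """Availability when a job needs every resource up at the same time."""
    return prod(availabilities)


def any_sufficient(availabilities):
    """Availability when a job can run on whichever resource is up."""
    return 1 - prod(1 - a for a in availabilities)


# Illustrative values only (not measured): three resources, each up 95% of the time.
sites = [0.95, 0.95, 0.95]
print(f"single resource: {sites[0]:.4f}")
print(f"all required:    {all_required(sites):.4f}")
print(f"any sufficient:  {any_sufficient(sites):.6f}")
```

With these numbers, requiring all three resources drops availability to about 0.857, while being able to use any one of them raises it to about 0.9999. Grid software lands somewhere between the two depending on how much failover it actually provides.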

Page 39: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

Finally, Enjoy Your Time Here While You Learn

• Austin is Fun, Cool, Weird, & Wonderful
– Mix of hippies, slackers, academics, geeks, politicos, musicians, filmmakers, artists, and even a few cowboys
– “Keep Austin Weird” is the official slogan
– Live Music Capital of the World (seriously)

• Also great restaurants, cafes, clubs, bars, theaters, galleries, museums, etc.
– http://www.austinchronicle.com/
– http://www.austin360.com/xl/content/xl/index.html
– http://www.research.ibm.com/arl/austin/index.html (!)

Page 40: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

Your Austin To-Do List

• Eat barbecue at Rudy’s, Stubb’s, Iron Works, Green Mesquite, etc.
• Eat Tex-Mex at Chuy’s, Trudy’s, Maudie’s, etc.
• Have a cold Shiner Bock (but not Lone Star)
• Visit 6th Street and Warehouse District at night
• Go to at least one live music show
• Learn to two-step at The Broken Spoke
• Visit the Texas State History Museum
• Walk/jog/bike around Town Lake
• Visit the UT main campus and the ACES VisLab
• See a movie at Alamo Drafthouse Cinema (arrive early, order beer & food)
• Eat Amy’s Ice Cream
• Listen to and buy local music at Waterloo Records
• Buy a bottle each of Rudy’s Barbecue ‘Sause’ and Tito’s Vodka
• Drive into the Hill Country, visit small towns and wineries
• See sketch comedy at Esther’s Follies
• See a million bats emerge from Congress Ave. bridge at sunset

Page 41: Building Grids: If Everybody Else Is Doing It, Why Shouldn’t You?

Welcome to TACC and Austin, Y’all!