Grid Computing: A Practical Guide ... 17.2.1 Bioinformatics and High-Performance Computing 343 17.2.2 ...

GRID COMPUTING: A PRACTICAL GUIDE TO TECHNOLOGY AND APPLICATIONS
AHMAR ABBAS
CHARLES RIVER MEDIA, INC.
Hingham, Massachusetts

Post on 05-Jul-2020



Contents

Preface
Acknowledgments

1 IT Infrastructure Evolution
1.1 Introduction
1.2 Microprocessor Technology
1.3 Optical Networking Technology
1.4 Storage Technology
1.5 Wireless Technology
1.6 Sensor Technology
1.6.1 Fiber Optic Sensors
1.6.2 Wireless Sensors
1.7 Global Internet Infrastructure
1.8 World Wide Web and Web Services
1.9 Open-Source Movement
1.10 Conclusion

2 Productivity Paradox and Information Technology
2.1 Introduction
2.2 Productivity Paradox
2.3 Return on Technology Investment
2.4 Multi-Story Bureaucracy
2.5 Information Technology Straightjacket
2.6 Consolidation
2.6.1 Server Consolidation
2.6.2 Consolidating Applications
2.6.3 Consolidating Storage


2.7 Outsourcing
2.8 Toward a Real-Time Enterprise: Operational Excellence
2.9 Conclusion

3 Business Value of Grid Computing
3.1 Introduction
3.2 Grid Computing Business Value Analysis
3.2.1 Grid Computing Value Element #1: Leveraging Existing Hardware Investments and Resources
3.2.2 Grid Computing Value Element #2: Reducing Operational Expenses
3.2.3 Grid Computing Value Element #3: Creating a Scalable and Flexible Enterprise IT Infrastructure
3.2.4 Grid Computing Value Element #4: Accelerating Product Development, Improving Time to Market, and Raising Customer Satisfaction
3.2.5 Grid Computing Value Element #5: Increasing Productivity
3.3 Risk Analysis
3.3.1 Lock-in
3.3.2 Switching Costs
3.3.3 Project Implementation Failure
3.4 Grid Marketplace
3.4.1 Grid Taxonomy
3.4.2 Fabric
3.4.3 Middleware
3.4.4 Serviceware
3.4.5 Applications
3.4.6 Grid Service Providers
3.4.7 Grid Applications Service Providers
3.4.8 Grid Consultants
3.5 Conclusion

4 Grid Computing Technology: An Overview
4.1 Introduction
4.2 History


4.3 High-Performance Computing
4.4 Cluster Computing
4.5 Peer-to-Peer Computing
4.6 Internet Computing
4.7 Grid Computing
4.7.1 Peer-to-Peer Networks and Grid Computing
4.7.2 Cluster Computing and Grid Computing
4.7.3 Internet Computing and Grid Computing
4.8 Grid Computing Model
4.9 Grid Protocols
4.9.1 Security: Grid Security Infrastructure
4.9.2 Resource Management: Grid Resource Allocation Management Protocol
4.9.3 Data Transfer: Grid File Transfer Protocol
4.9.4 Information Services: Grid Information Services
4.10 Globus Toolkit
4.11 Open Grid Services Architecture
4.12 Global Grid Forum
4.13 Types of Grids
4.13.1 Departmental Grids
4.13.2 Enterprise Grids
4.13.3 Extraprise Grids
4.13.4 Global Grids
4.13.5 Compute Grids
4.13.6 Data Grids
4.13.7 Utility Grids
4.14 Grid Networks: Will There Be Such a Thing as "The Gridnet"?
4.14.1 Grid Network Peering Points
4.15 Grid Applications Characteristics
4.16 Application Integration
4.17 Grid Computing and Public Policy
4.17.1 Sleeper Programs
4.17.2 National Security


4.17.3 Philanthropic Computing
4.18 Conclusion

5 Desktop Grids
5.1 Introduction
5.2 Background
5.2.1 Cause Computing and the Internet
5.2.2 Distributed Computing in the Enterprise
5.3 Desktop Grids Defined
5.4 The Desktop Grid Value Proposition
5.5 Desktop Grid Challenges
5.6 Desktop Grid Technology: Key Elements to Evaluate
5.6.1 Security
5.6.2 Unobtrusiveness
5.6.3 Openness/Ease of Application Integration
5.6.4 Robustness
5.6.5 Scalability
5.6.6 Central Manageability
5.6.7 Key Technology Elements: Checklists
5.6.8 Key Technology Elements: Summary
5.7 Desktop Grid Suitability: Key Areas for Exploration
5.7.1 Applications
5.7.2 Computing Environment
5.7.3 Culture
5.8 The Grid Server: Additional Functionality to Consider
5.9 Role of Desktop Grids in an Enterprise Computing Infrastructure
5.9.1 Departmental Grids
5.9.2 Campus Grids
5.9.3 Web Services and Beyond
5.10 Practical Uses of Desktop Grids: Real-World Examples
5.10.1 Example: Risk Management for Financial Derivatives
5.10.2 Example: Molecular Docking for Drug Discovery
5.10.3 Example: Architectural Rendering
5.11 Conclusion


6 Cluster Grids
6.1 Introduction
6.2 Clusters
6.2.1 Single System Image
6.2.2 Single System Environment
6.3 Industry Examples
6.3.1 Electronic Design Automation (EDA)
6.3.2 Bioinformatics
6.3.3 Industrial Manufacturing
6.4 Cluster Grids
6.5 Conclusion

7 HPC Grids
7.1 Introduction
7.2 Five Steps to Scientific Insight
7.3 Applications and Architectures
7.4 HPC Application Development Environment
7.5 Production HPC Reinvented
7.6 HPC Grids
7.7 Conclusion
Acknowledgements

8 Data Grids
8.1 Introduction
8.2 Data Grids
8.3 Alternatives to Data Grids
8.3.1 Network File System (NFS)
8.3.2 File Transfer Protocol (FTP)
8.3.3 NFS over IPSec
8.3.4 Secure Copy: scp/sftp
8.3.5 De-Militarized Zone (DMZ)
8.3.6 GridFTP
8.3.7 Andrew File System (AFS)


8.4 Avaki Data Grid
8.4.1 Accessing the Data Grid
8.4.2 Managing the Data Grid
8.5 Data Grid Architecture
8.5.1 Grid Servers
8.5.2 Share Servers
8.5.3 Data Grid Access Servers (DGAS)
8.5.4 Proxy Servers
8.5.5 Failover Servers (Secondary Grid Domain Controllers)
8.6 Conclusion
Acknowledgements

9 The Open Grid Services Architecture
9.1 Introduction
9.2 An Analogy for OGSA
9.3 The Evolution to OGSA
9.3.1 Grid Computing
9.3.2 Web Services
9.3.3 Convergence
9.4 OGSA Overview
9.4.1 The OGSA Platform
9.4.2 OGSI
9.4.3 OGSA Platform Interfaces
9.4.4 OGSA Platform Models
9.5 Building on the OGSA Platform
9.5.1 WS-Agreement
9.5.2 Data Access and Integration Services (DAIS)
9.6 Implementing OGSA-Based Grids
9.6.1 The Globus Toolkit 3
9.6.2 GCSF
9.7 Conclusion


10 Creating and Managing Grid Services
10.1 Introduction
10.2 Services and the Grid
10.3 Converting Existing Software
10.4 Service Discovery
10.5 Operational Requirements
10.6 Tools and Toolkits
10.6.1 Globus Toolkit Grid Information Service
10.6.2 Accessing Grid Information
10.6.3 Performance Issues with MDS
10.6.4 Other Information Services and Providers
10.6.5 Future
10.7 Support in UDDI
10.8 UDDI and OGSA
10.9 UDDIe: UDDI Extensions and Implementation
10.10 Uses
10.11 Quality of Service Management
10.12 Conclusion
Download
Acknowledgements

11 Desktop Supercomputing: Native Programming for Grids
11.1 Introduction
11.2 Historical Background: Parallel Computing
11.2.1 MIMD Computers
11.3 Parallel Programming Paradigms
11.4 Problems of Current Parallel Programming Paradigms
11.5 Desktop Supercomputing: Solving the Parallel Programming Problem
11.6 Desktop Supercomputing Programming Paradigm
11.7 Parallel Programming in CxC
11.8 Parallelizing Existing Applications
11.9 Conclusion


12 Grid-Enabling Software Applications
12.1 Introduction
12.2 Grid Computing: Discontinuous Innovation or Massive Yawn?
12.3 The Needs of Grid Users
12.4 Grid Deployment Criteria
12.5 Methods of Grid Deployment
12.6 When to Grid-Enable Software
12.7 Requirements for Grid-Enabling Software
12.8 Grid Programming Tools and Expertise
12.9 The Process of Grid-Enabling Software Applications
12.9.1 Analysis
12.9.2 Application Modifications
12.10 Grid-Enabling a Mainstream Software Application: An Example
12.10.1 Video Encoding
12.10.2 The Need for Speed
12.10.3 Current Solutions
12.10.4 Grid Deployment of Video Encoding
12.10.5 Requirements for Broad Marketplace Adoption
12.10.6 Overview of MPEG 4 Encoder
12.10.7 Overview of GridIron XLR8
12.10.8 Distributed Computing Strategy
12.10.9 Implementation
12.10.10 Results
12.10.11 Next Steps
12.10.12 Grid-Enabling Video Encoding Summary
12.11 Conclusion

13 Application Integration
13.1 Introduction
13.2 Application Classification
13.2.1 Parallelism
13.2.2 Communications


13.2.3 Granularity
13.2.4 Dependency
13.3 Grid Requirements
13.3.1 Interfaces
13.3.2 Job Scheduling
13.3.3 Data Management
13.3.4 Remote Execution Environment
13.3.5 Security
13.3.6 Gang Scheduling
13.3.7 Checkpointing and Job Migration
13.3.8 Management
13.4 Integrating Applications with Middleware Platforms
13.4.1 Application Preparation Example
13.4.2 Issues in Application Integration
13.5 Conclusion

14 Grid-Enabling Network Services
14.1 Introduction
14.2 On Demand Optical Connection Services
14.3 Creating Grid-Enabled Network Services
14.4 Montague River Grid
14.5 Montague River Domain
14.6 Sample API
14.7 Deployment Example: End-to-End LightPath Management
14.8 Conclusion

15 Managing Grid Environments
15.1 Introduction
15.2 Managing Grids
15.2.1 Trust
15.2.2 Identity
15.2.3 Privacy
15.2.4 Authorization


15.3 Management Reporting
15.3.1 Users
15.3.2 Resources
15.3.3 Jobs
15.3.4 Audit Support
15.4 Monitoring
15.4.1 Types of Events
15.4.2 Notification Modes
15.5 Service Level Management
15.6 Data Catalogs and Replica Management
15.6.1 Data Catalog
15.6.2 Replication
15.7 Portals
15.8 Conclusion

16 Grid Computing Adoption in Research and Industry
16.1 Introduction
16.2 A Global Grid Architecture
16.3 Core Components for Building a Grid
16.3.1 Distributed Resource Managers
16.3.2 Portal Software and Authentication
16.3.3 The Globus Toolkit 2.0
16.4 Examples of Research and Industry Grid Implementations
16.4.1 GlobeXplorer: From Departmental to Global Grid Computing
16.4.2 Houston University Campus Grid for Environmental Modeling and Seismic Imaging
16.4.3 Ontario HPC Virtual Laboratory
16.4.4 Canada NRC-CBR BioGrid
16.4.5 White Rose Grid
16.4.6 Progress: Polish Research on Grid Environment on Sun Servers
16.5 Conclusion
Acknowledgements


17 Grids in Life Sciences
17.1 Introduction
17.2 Bioinformatics
17.2.1 Bioinformatics and High-Performance Computing
17.2.2 Grid Computing and Bioinformatics
Bioinformatics: An Excellent Grid Application
17.2.3 Example Grid Computing: Smallpox Research
17.3 Computational Chemistry and Biochemistry
17.4 Protein Modeling
17.5 Ab Initio Molecular Modeling
17.5.1 Example Ab Initio Modeling with Parallel Computing
17.6 Grid Computing in Life Sciences
17.7 Artificial Intelligence and Life Sciences
17.8 Conclusion

18 Grids in the Telecommunications Sector
18.1 Introduction
18.2 Telcos as Users
18.2.1 CPU Intensive Application: Network Planning and Management
18.2.2 Data Intensive Applications: EDR Analysis and Data Warehouse
18.2.3 A Corporate Grid Platform
18.3 Telcos as Providers
18.3.1 Grid Services: A Business in the Future Network
18.4 Conclusion

19 Grids in Other Industries
19.1 Introduction
19.2 Grids in Financial Services
19.3 Geo Sciences
19.4 Manufacturing
19.5 Electronic Design Automation
19.6 Entertainment and Media


19.7 Chemical and Material Sciences
19.8 Gaming
19.9 Conclusion

20 Hive Computing for Transaction Processing Grids
20.1 Introduction
20.2 Hive Computing
20.2.1 Assumptions
20.2.2 Overview
20.2.3 Capabilities
20.2.4 Programming Model
20.2.5 Benefits
20.4 Conclusion

About the CD-ROM