Post on 21-Dec-2015

The Microsoft Perspective On Where High Performance Computing Is Heading

Kyril Faenov
Director of HPC
Windows Server Division
Microsoft Corporation

Talk Outline

Market/technology trends

Personal supercomputing

Grid computing

Leveraging IT industry investments

Decoupling domain science from Computer Science

Top 500 Supercomputer Trends

Industry usage rising

Clusters over 50%

x86 is winning

GigE is gaining

HPC Market Trends

Segment                   System price   2004 Systems   2004-9 CAGR
Capability, Enterprise    $1M+           1,167          4.2%
Divisional                $250K-$1M      3,915          5.7%
Departmental              $50-250K       22,712         7.7%
Workgroup                 <$50K          127,802        13.4%

Source: IDC, 2005

Systems under $250K: 97% of systems, 52% of revenue

In 2004 clusters grew 96% to 37% by revenue

Average cluster size 10-16 nodes

Top Challenges to Implementing Clusters (IDC 2004, N=229)

System management capability 18%

Apps availability 17%

Parallel algorithm complexity 14%

Space, power, cooling 11%

Interconnect BW/latency 10%

I/O performance 9%

Interconnect complexity 9%

Other 12%

Report of the High-End Computing Revitalization Task Force, 2004 (Office of Science and Technology Policy, Executive Office of the President)

“Make high-end computing easier and more productive to use. Emphasis should be placed on time to solution, the major metric of value to high-end computing users… A common software environment for scientific computation encompassing desktop to high-end systems will enhance productivity gains by promoting ease of use and manageability of systems.”

Major Implications

Market pressures demand accelerated innovation cycle, overall cost reduction, and thorough outcome modeling

Leverage volume markets of industry standard hardware and software

Rapid procurement, installation and integration of systems

Workstation-Cluster integrated applications accelerating market growth

Engineering, Bioinformatics, Oil and Gas, Finance, Entertainment, Government/Research

The convergence of affordable high performance hardware and commercial apps is making supercomputing personal

Supercomputing Goes Personal

Year | 1991 | 1998 | 2005
System | Cray Y-MP C916 | Sun HPC10000 | Shuttle @ NewEgg.com
Architecture | 16 x Vector, 4GB, Bus | 24 x 333MHz UltraSPARC II, 24GB, SBus | 4 x 2.2GHz x64, 4GB, GigE
OS | UNICOS | Solaris 2.5.1 | Windows Server 2003 SP1
GFlops | ~10 | ~10 | ~10
Top500 # | 1 | 500 | N/A
Price | $40,000,000 | $1,000,000 (40x drop) | < $4,000 (250x drop)
Customers | Government Labs | Large Enterprises | Every Engineer & Scientist
Applications | Classified, Climate, Physics Research | Manufacturing, Energy, Finance, Telecom | Bioinformatics, Materials Sciences, Digital Media
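As a rough check on the price figures in the comparison above, the price drops and the cost per GFlops can be computed directly. This is only a sketch: it assumes about 10 GFlops for every system and takes $4,000 (the slide's upper bound) as the 2005 price. The function names are illustrative.

```python
# Approximate figures from the slide: (year, system, price in USD, GFlops).
systems = [
    (1991, "Cray Y-MP C916", 40_000_000, 10),
    (1998, "Sun HPC10000", 1_000_000, 10),
    (2005, "Shuttle @ NewEgg.com", 4_000, 10),  # slide says "< $4,000"
]

def price_per_gflops(price_usd, gflops):
    """Cost in USD of one GFlops of capability."""
    return price_usd / gflops

def price_drops(rows):
    """Ratio of each system's price to the price of the next system."""
    return [rows[i][2] / rows[i + 1][2] for i in range(len(rows) - 1)]

for year, name, price, gflops in systems:
    print(f"{year} {name}: ${price_per_gflops(price, gflops):,.0f} per GFlops")
print("Price drops:", price_drops(systems))
```

Under these assumptions the cost per GFlops falls from $4,000,000 to $100,000 to at most $400.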

The Future: Supercomputing on a Chip

IBM Cell processor
256 Gflops today
4 node personal cluster => 1 Tflops
32 node personal cluster => Top100

Microsoft Xbox
3 custom PowerPCs + ATI graphics processor
1 Tflops today
$300
8 node personal cluster => “Top100” for $2500 (ignoring all that you don’t get for $300)

Intel many-core chips
“100’s of cores on a chip in 2015” (Justin Rattner, Intel)
“4 cores”/Tflop => 25 Tflops/chip
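The cluster figures on this slide are simple multiplications of the per-node numbers. The sketch below reproduces them, assuming perfect scaling across nodes (an idealization, since it ignores interconnect and memory limits). The function name is illustrative.

```python
# Figures quoted on the slide; all are rough, peak numbers.
GFLOPS_PER_TFLOPS = 1000

def cluster_tflops(nodes, gflops_per_node):
    """Peak Tflops of a cluster, assuming perfect scaling across nodes."""
    return nodes * gflops_per_node / GFLOPS_PER_TFLOPS

cell_4_node = cluster_tflops(4, 256)    # IBM Cell at 256 Gflops per node
cell_32_node = cluster_tflops(32, 256)
xbox_8_node = cluster_tflops(8, 1000)   # Xbox at 1 Tflops per node
xbox_8_node_cost = 8 * 300              # USD, consoles only

# Intel projection: 100 cores per chip at 4 cores per Tflop.
many_core_tflops_per_chip = 100 / 4

print(cell_4_node, cell_32_node, xbox_8_node, xbox_8_node_cost,
      many_core_tflops_per_chip)
```

The 8-console total of $2,400 is close to the slide's rounded $2500.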

Key To Evolution: Tackling System Complexity

Departmental Cluster (conventional scenario)
Scenario:
IT owns large clusters due to cost and complexity and allocates resources on per job basis
Users submit batch jobs via scripts
In-house and ISV apps, many based on MPI
Focus:
Scheduling multiple users’ applications onto scarce compute cycles
Cluster systems administration

Personal/Workgroup Cluster (emerging scenario)
Scenario:
Clusters are pre-packaged OEM appliances, purchased and managed by end-users
Desktop HPC applications transparently and interactively make use of cluster resources
Desktop development tools integration
Focus:
Interactive applications
Workstation clusters, accelerator appliances
Distributed, policy-based management and security

HPC Application Integration (future scenario)
Scenario:
Multiple simulations and data sources integrated into a seamless application workflow
Network topology and latency awareness for optimal distribution of computation
Structured data storage with rich meta-data
Applications and data potentially span organizational boundaries
Focus:
Data-centric, “whole-system” workflows
Rapid prototyping of HPC applications
Grids: Distributed application, systems, and data management
Interoperability

[Diagram labels: IT Mgr; Manual, batch execution; Interactive Computation and Visualization; SQL]

“Grid Computing”: A catch-all marketing term

“Grid” Computing means many different things to many different people/companies

Desktop cycle-stealing

Managed HPC clusters

Internet access to giant, distributed repositories

Virtualization of data center IT resources

Out-sourcing to “utility data centers”

Originally this was all called “Distributed Systems”

HPC Grids And Web Services

HPC Grid ~ Compute Grid + Data Grid

Compute grid
Forest of clusters and workstations within an organization
Coordinated scheduling of resources

Data grid
Distributed storage facilities within an organization
Coordinated management of data

Web Services
The means to achieve interoperable Internet-scale computing, including federation of organizations
Loosely-coupled, service-oriented architecture

Computational Grid Economics*

What $1 will buy you (roughly):
Computers cost $1000 (roughly)
1 cpu day (~ 10 Tera-ops) == $1 (roughly, assuming 3 yr use cycle)
10 TB of cluster network transfer == $1 (roughly, assuming 1 Gbps interconnect)
Internet bandwidth costs roughly $100/Mbps/month (not including routers and management)
1 GB of Internet (WAN) transfer == $1 (roughly)

Some observations:
HPC cluster communication is 10,000x cheaper than WAN communication
Break-even point for instructions computed per byte transferred:
Cluster: O(1) instrs/byte => many parallel applications are economical to run on a cluster or across a GigE LAN
WAN: O(10,000) instrs/byte => few parallel applications are economical to run across the Internet

*Computational grid economics material courtesy of Jim Gray
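The break-even figures follow from dividing what $1 buys in compute by what $1 buys in data transfer. The sketch below works through that arithmetic using the slide's rough numbers. The constant and function names are illustrative, and a 30-day month is assumed for the bandwidth conversion.

```python
# Rough figures from the slide: what $1 buys.
OPS_PER_DOLLAR = 10e12            # ~10 Tera-ops (one CPU day)
CLUSTER_BYTES_PER_DOLLAR = 10e12  # ~10 TB over a 1 Gbps cluster interconnect
WAN_BYTES_PER_DOLLAR = 1e9        # ~1 GB over the Internet

def cpu_day_cost(computer_price=1000, years=3):
    """Cost of one CPU day if a computer is amortized over its use cycle."""
    return computer_price / (years * 365)

def break_even_instrs_per_byte(ops_per_dollar, bytes_per_dollar):
    """Instructions that must be executed per byte moved before moving
    the data costs no more than computing on it."""
    return ops_per_dollar / bytes_per_dollar

def wan_dollars_per_gb(dollars_per_mbps_month=100, days=30):
    """Raw bandwidth cost per GB at a fully used 1 Mbps link (assumed 30-day month)."""
    gb_per_month = 1e6 / 8 * days * 24 * 3600 / 1e9
    return dollars_per_mbps_month / gb_per_month

print(cpu_day_cost())                                                        # about $0.91
print(break_even_instrs_per_byte(OPS_PER_DOLLAR, CLUSTER_BYTES_PER_DOLLAR))  # cluster
print(break_even_instrs_per_byte(OPS_PER_DOLLAR, WAN_BYTES_PER_DOLLAR))      # WAN
print(CLUSTER_BYTES_PER_DOLLAR / WAN_BYTES_PER_DOLLAR)                       # cost ratio
print(wan_dollars_per_gb())                                                  # about $0.31
```

The raw bandwidth figure of about $0.31 per GB is within the same order of magnitude as the slide's rough $1 per GB.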

Exploding Data Sizes

Experimental data: TBs → PBs

Modeling data

Today
10’s to 100’s of GB per simulation is the common case
Applications mostly run in isolation

Tomorrow
10’s to 100’s of TBs, all of it to be archived
Whole-system modeling and multi-application workflows

How Do You Move A Terabyte?*

Context                 Speed (Mbps)   Rent ($/month)   $/Mbps   $/TB sent   Time/TB
Home phone              0.04           40               1,000    3,086       6 years
Home DSL                0.6            70               117      360         5 months
T1                      1.5            1,200            800      2,469       2 months
T3                      43             28,000           651      2,010       2 days
OC3                     155            49,000           316      976         14 hours
OC 192                  9600           1,920,000        200      617         14 minutes
100 Mbps                100            -                -        -           1 day
Gbps                    1000           -                -        -           2.2 hours
10 Gbps (LAN setting)   10000          -                -        -           13 minutes
FedEx                   100            -                -        50          24 hours

*Material courtesy of Jim Gray
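The Time/TB and $/TB figures can be reproduced from the link speed and monthly rent alone. The sketch below assumes 1 TB = 10^12 bytes, a 30-day month, and a link that is fully used for the whole transfer. These assumptions are not stated on the slide, but they match its numbers. The function names are illustrative.

```python
TB_BYTES = 1e12                      # assumption: 1 TB = 10^12 bytes
SECONDS_PER_MONTH = 30 * 24 * 3600   # assumption: 30-day month

def seconds_per_tb(speed_mbps):
    """Time to move one terabyte over a fully utilized link."""
    return TB_BYTES * 8 / (speed_mbps * 1e6)

def dollars_per_tb(speed_mbps, rent_per_month):
    """Share of the monthly rent consumed while sending one terabyte."""
    return rent_per_month * seconds_per_tb(speed_mbps) / SECONDS_PER_MONTH

# (speed in Mbps, rent in $/month) for a few rows of the table.
links = {
    "Home phone": (0.04, 40),
    "T1": (1.5, 1_200),
    "OC 192": (9600, 1_920_000),
}

for name, (mbps, rent) in links.items():
    hours = seconds_per_tb(mbps) / 3600
    print(f"{name}: {hours:,.1f} hours/TB, ${dollars_per_tb(mbps, rent):,.0f}/TB")
```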

Anticipated HPC Grid Topology

Islands of high connectivity

Simulations done on personal and workgroup clusters

Data stored in data warehouses

Data analysis best done inside the data warehouse

Wide-area data sharing/replication via FedEx?

[Diagram: data warehouse, workgroup cluster, personal cluster]

Data Analysis And Mining

Traditional approach
Keep data in flat files
Write C or Perl programs to compute specific analysis queries

Problems with this approach
Imposes significant development times
Scientists must reinvent DB indexing and query technologies
Have to copy the data from the file system to the compute cluster for every query

Results from the astronomy community
Relational databases can yield speed-ups of one to two orders of magnitude
SQL + application/domain-specific stored procedures greatly simplify creation of analysis queries
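To illustrate the contrast between the two approaches, the sketch below answers the same query twice: once by scanning records in application code, and once by asking an indexed relational table. It uses Python's built-in sqlite3 module purely as a stand-in for a relational database. The catalog schema, column names, and data are hypothetical and do not come from the talk.

```python
import sqlite3

# Hypothetical catalog rows: (object_id, right_ascension, magnitude).
rows = [(i, float((i * 37) % 360), float(10 + i % 15)) for i in range(10_000)]

def flat_file_query(records, max_magnitude):
    """Traditional approach: hand-written scan over every record."""
    return sorted(r[0] for r in records if r[2] <= max_magnitude)

def sql_query(conn, max_magnitude):
    """Relational approach: declare the query, let the engine use its index."""
    cur = conn.execute(
        "SELECT object_id FROM objects WHERE magnitude <= ? ORDER BY object_id",
        (max_magnitude,),
    )
    return [row[0] for row in cur]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE objects (object_id INTEGER PRIMARY KEY, ra REAL, magnitude REAL)"
)
conn.executemany("INSERT INTO objects VALUES (?, ?, ?)", rows)
conn.execute("CREATE INDEX idx_magnitude ON objects (magnitude)")

bright_scan = flat_file_query(rows, 11.0)
bright_sql = sql_query(conn, 11.0)
print(len(bright_scan), bright_scan == bright_sql)
```

Both paths return the same objects. The difference is that the database supplies indexing and query planning, which the flat-file approach would have to reimplement by hand.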

Is That The End Of The Story?

[Diagram: relational data warehouse, workgroup cluster, personal cluster]

Too Much Complexity

[Diagram: relational data warehouse, workgroup cluster, personal cluster]

Distributed systems issues:
Security
System management
Directory services
Storage management

Digital experimentation:
Experiment management
Provenance (data & workflows)
Version management (data & workflows)

Parallel application development
Chip-level, node-level, cluster-level, LAN grid-level, WAN grid-level parallelism
OpenMP, MPI, HPF, Global Arrays, …
Component architectures
Performance configuration & tuning
Debugging/profiling/tracing/analysis

Domain science

2004 NAS supercomputing report: O(35) new computational scientists graduated per year

(Partial) Solution: Leverage IT Industry’s Existing R&D

Parallel applications development

High-productivity IDEs
Integrated debugging/profiling/tracing/analysis
Code designer wizards
Concurrent programming frameworks

Platform optimizations
Dynamic, profile-guided optimization
New programming abstractions

Distributed systems issues

Web Services & HPC grids
Security
Interoperability
Scalability

Dynamic Systems Management
Self (re)configuration & tuning
Reliability & availability

RDBMS + data mining
Ease-of-use
Advanced indexing & query processing
Advanced data mining algorithms

Digital experimentation

Collaboration-enhanced Office productivity tools
Structure experiment data and derived results in a manner appropriate for human reading/reasoning (as opposed to optimizing for query processing and/or storage efficiency)
Enable collaboration among colleagues

(Scientific) workflow environments
Automated orchestration
Visual scripting
Provenance

Separating The Domain Scientist From The Computer Scientist

Roles:
Computer scientist
Computational scientist
Domain scientist

Layers:
Parallel domain application development
Parallel/distributed file systems, relational data warehouses, dynamic systems management, Web Services & HPC grids
(Interactive) scientific workflow, integrated with collaboration-enhanced office automation tools

Interfaces between layers:
Concrete concurrency
Abstract concurrency
Concrete workflow
Abstract workflow

Example:
Write scientific paper (Word)
Record experiment data (Excel)
Individual experiment run (Workflow orchestrator)
Analyze data (SQL-Server)
Share paper with co-authors (Sharepoint)
Collaborate with co-authors (NetMeeting)

Scientific Information Worker: Past and Future

Past
Buy lab equipment
Keep lab notebook
Run experiments by hand
Assemble & analyze data (using stat pkg)
Collaborate by phone/email; write up results with LaTeX

Metaphor
Physical experimentation
“Do it yourself”
Lots of disparate systems/pieces

Future
Buy hardware and software
Automatic provenance
Workflow with 3rd party domain packages
Excel and Access/SQL Server
Office tool suite with collaboration support

Metaphor
Digital experimentation
Turn-key desktop supercomputer
Single integrated system

Microsoft Strategy

Reducing barriers to adoption for HPC clusters

Easy to develop
Familiar Windows dev environment + key HPC extensions (MPI, OpenMP, Parallel Debugger)
Best of breed Fortran, numerical libraries, performance analysis tools through partners
Long-term, strategic investments in developer productivity

Easy to use
Familiarity/intuitiveness of Windows
Cluster computing integrated into the workstation applications, user workflow

Easy to manage and own
Integration with AD and the rest of IT infrastructure
Lower TCO through integrated turnkey clusters
Price/performance advantage of industry standard hardware components

Application support in three key HPC verticals
Engagement with the top HPC ISVs
Enabling Open Source applications via University relationships

Leveraging a breadth of standard knowledge-management tools
Web Services, SQL, Sharepoint, Infopath, Excel

Focused Approach to Market
Enabling broad HPC adoption and making HPC into a high volume market

© 2005 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.