Applied Research and Technology Shared Services Group The Changing Business Case for Supercomputing: An Industrial Perspective Dr. Kenneth W. Neves Senior Technical Fellow Manager, Computer Science Seattle, WA

Post on 19-Dec-2015


Applied Research and Technology Shared Services Group

The Changing Business Case for Supercomputing: An Industrial Perspective

Dr. Kenneth W. Neves Senior Technical Fellow

Manager, Computer Science

Seattle, WA

Topics

Indicators of market health and viability of supercomputing

– 1970

– late 1980s, early 1990s

– Today

Boeing high performance computing challenges

– Production computing

– Research computing

– Enterprise-wide computing

– Product visualization

Conclusions: Common research issues

– technical

– system

Key Factors to Monitor

Market for high performance computers

Applications - “the need”

Computer Power

Computer Architecture

Now: Facts of Life

Today, supercomputer (SC) companies have all but died or been absorbed into a more commodity market

Micros dominate

Cutting-edge computational research MUST resort to highly parallel machines (separates the men from the boys)

The “cost” of novel architectures both in hardware and software has thinned the market

Many supercomputer users of “old” are workstation users today

[Figure: chart comparing the Vector and Parallel eras. Labels: Big, Pacing, 100X, Very Novel (shown twice). Annotation: Like 1970.]

Boeing Applications

CAD/CAM (billion dollar investment)

Product Data Management and Manufacturing Resource Control (multi-billion dollar investment)

Scientific Computing (important, but multi-million dollar investment) that tends to be cyclic

Super Computing Problems, e.g.,

– CFD: highly separated flows

– multi-disciplinary optimization

– constrained design

– electromagnetics

High-end Computing Activity

Production computing

Scientific research computing

Enterprise-wide computing

Product visualization

Production Computing

Requires repeatable, controllable process

Can be big problems (CFD for cruise wing design, structural analysis)

Done on more “ordinary” architectures (Cray T90)

Migration from “central computing”

– as workstation and server capability improved, many of the central users migrated to more “affordable” environments

– department level supercomputers

– application dedicated platforms (can be novel architectures, but not shared with many users)

– secret computing

Scientific Research Computing

Grand challenge problems are often multidisciplinary, can involve optimization

Often offer opportunity for macro-level parallelism

[Figure: Airfoil Constrained Optimization. Two panels, Pressure (Cp) and Geometry (y), each comparing Target, Original, and Design curves.]
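The macro-level parallelism mentioned on this slide can be illustrated with a minimal sketch: an optimizer that must score several independent design candidates can hand each one to a separate worker. Everything here is an assumption made for illustration: `analyze` is a toy stand-in for an expensive analysis run (not real aerodynamics), the two-parameter designs are hypothetical, and a thread pool is used only to keep the example short; a CPU-bound analysis would more likely run in separate processes or on separate machines.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze(design):
    """Hypothetical stand-in for one expensive, independent analysis run."""
    thickness, camber = design
    # Toy cost: squared distance from an arbitrary target design.
    return (thickness - 0.12) ** 2 + (camber - 0.02) ** 2


def evaluate_in_parallel(designs, workers=4):
    """Score all candidates concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(analyze, designs))


def best_design(designs, workers=4):
    """Return the lowest-cost candidate together with its cost."""
    costs = evaluate_in_parallel(designs, workers)
    index = min(range(len(designs)), key=costs.__getitem__)
    return designs[index], costs[index]


candidates = [(0.10, 0.00), (0.12, 0.02), (0.15, 0.04)]
winner, winner_cost = best_design(candidates)
```

Because the candidate evaluations do not depend on one another, the parallelism sits above the individual solver rather than inside its loops.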

Unconstrained

Significant improvement in cruise performance, but not manufacturable

With Manufacturing Constraints

Significant improvement in cruise performance and manufacturable
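The contrast between the unconstrained and constrained results can be sketched with a toy search. The objective `drag`, the limit in `manufacturable`, and the candidate grid are all hypothetical placeholders, not the wing model from the talk; the point is only that adding a feasibility test moves the optimum to the best design that can actually be built.

```python
def drag(x):
    """Toy objective (hypothetical): lower is better, minimum at x = 3.0."""
    return (x - 3.0) ** 2


def manufacturable(x, max_value=2.0):
    """Toy manufacturing constraint (hypothetical): x may not exceed max_value."""
    return x <= max_value


def optimize(candidates, feasible=lambda x: True):
    """Return the feasible candidate with the lowest objective value."""
    allowed = [x for x in candidates if feasible(x)]
    if not allowed:
        raise ValueError("no feasible design")
    return min(allowed, key=drag)


candidates = [i * 0.5 for i in range(11)]  # 0.0, 0.5, ..., 5.0
unconstrained = optimize(candidates)
constrained = optimize(candidates, manufacturable)
```

In this sketch the constrained answer gives up some performance relative to the unconstrained one, which is the trade a manufacturable design accepts.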

Factory Modeling

Models physics of metal cutting

[Figure panels: Temperature, Strain]

Enterprise-wide Computing

Distributed data

– 700 terabytes

– 20 business units

– secure, reliable, coherent

Parallel SMP servers

– Oracle as middleware for 4 major applications

– Re-engineering of 315 legacy applications

– 50,000 users worldwide (not including subcontractors)

Enterprise System Complexity

[Figure: network diagram. Labeled components: NT Resource Server (S3, Print); Printers & Workstations; Switches; STAC Servers; Application Servers (BaaN, Cimlinc, ShopView, Capp, Linkage, Web); DNS/NFS Cluster (ServiceGuard); Routers; Campus Server Room FDDI Ring; Router; Sequent Clusters; UFS File Server *; Vital Production Systems; DCE Security Server *; Scheduling Server BNN; Token Ring; Utility/Method Servers Clusters (ServiceGuard); NFS Cluster * (ServiceGuard); Routers; Data Center FDDI Ring; NT WINS and MAD; NIS; DHCP Server; Master NIS.]

Product Visualization

ALSO

Machining from CAD

Generative Design

Neural Network design retrieval

System complexity rivals enterprise-wide computing

Research Issues

Goal: The network is the computer

“Power Grid” (NASA term)

– computing resources are managed like a power system

– data movement is minimized, access time is minimized

– fail safe

– networking queuing, agent assisted

Threads maintained

Synchronization of process managed by middleware rather than individuals

Data authentication and time stamping for coherency

Parallel data based performance (unsolved problem)

Scientific computing approach, but applied to new application areas of the enterprise
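One way to picture "data authentication and time stamping for coherency" is a replica that accepts an update only if its authentication tag verifies and its stamp is newer than what the replica already holds. This is a minimal sketch under stated assumptions, not the middleware the talk has in mind: the shared key `SECRET`, the `Replica` class, and the integer stamps are all hypothetical, and only the standard `hmac` and `hashlib` modules are used.

```python
import hashlib
import hmac

SECRET = b"demo-key"  # hypothetical shared key, for illustration only


def sign(payload: bytes, stamp: int) -> str:
    """Authentication tag covering both the data and its time stamp."""
    message = stamp.to_bytes(8, "big") + payload
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


class Replica:
    """Holds one data item and rejects forged or stale updates."""

    def __init__(self):
        self.stamp = -1
        self.payload = b""

    def apply(self, payload: bytes, stamp: int, tag: str) -> bool:
        if not hmac.compare_digest(tag, sign(payload, stamp)):
            return False  # authentication failed
        if stamp <= self.stamp:
            return False  # older than the copy already held
        self.stamp, self.payload = stamp, payload
        return True


replica = Replica()
accepted_new = replica.apply(b"v1", 1, sign(b"v1", 1))
accepted_stale = replica.apply(b"v0", 0, sign(b"v0", 0))
accepted_forged = replica.apply(b"v2", 2, "bad-tag")
```

The check order matters in this sketch: a forged update is rejected before its stamp is even considered, so it cannot influence which copy is treated as current.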

Old Style Performance Enhancement

[Figure: CPU time of an analysis application, divided into Setup, I/O, and Algorithm segments. Annotation: Vectorized, parallelized, etc.]
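The limit of speeding up only the algorithm segment can be shown with a short calculation in the spirit of Amdahl's law, which the slide does not name; the connection is an editorial illustration. The function name and the timings below are hypothetical.

```python
def overall_speedup(setup, io, algorithm, algorithm_speedup):
    """Whole-run speedup when only the algorithm segment gets faster."""
    before = setup + io + algorithm
    after = setup + io + algorithm / algorithm_speedup
    return before / after


# Hypothetical timings: 1 unit setup, 1 unit I/O, 8 units algorithm.
speedup_8x = overall_speedup(1.0, 1.0, 8.0, 8.0)
speedup_1000x = overall_speedup(1.0, 1.0, 8.0, 1000.0)
```

With these numbers an 8x faster algorithm yields roughly a 3.3x faster run, and even a 1000x faster algorithm stays below 5x, because setup and I/O are untouched.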

New Style Performance Enhancement

[Figure: CPU time for a multi-step process. Stack & Batch Approach stages: Input & Setup, CAD to finite element gridder, Optimizer (executive), App 1, App 2, Visualization. A second layout shows Input & Setup, Grid gen., Optimizer, App 1, App 2, and Visualization connected through middleware.]

Goal: minimize response time and/or maximize throughput
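The response-time goal can be made concrete with a small calculation: if stages run strictly one after another, response time is the sum of all stage times, whereas if independent stages may overlap, it is the length of the longest dependency chain. The stage names loosely follow the figure labels, but the durations and the dependency graph are hypothetical assumptions for illustration.

```python
def response_time(durations, depends_on):
    """Finish time of the last stage when independent stages may overlap."""
    finish = {}

    def finish_time(stage):
        if stage not in finish:
            prerequisites = depends_on.get(stage, ())
            start = max((finish_time(p) for p in prerequisites), default=0.0)
            finish[stage] = start + durations[stage]
        return finish[stage]

    return max(finish_time(stage) for stage in durations)


# Hypothetical stage times, in arbitrary units.
durations = {"setup": 1.0, "grid": 2.0, "app1": 4.0, "app2": 3.0, "viz": 1.0}
# Hypothetical dependencies: App 1 and App 2 both need only the grid.
depends_on = {
    "grid": ["setup"],
    "app1": ["grid"],
    "app2": ["grid"],
    "viz": ["app1", "app2"],
}

stacked = sum(durations.values())
overlapped = response_time(durations, depends_on)
```

Here the stacked approach takes 11 units while the overlapped schedule takes 8, without making any individual application faster.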

What Questions to Ask

What is the process? Design a subsonic wing

How will the problem be defined? Geometry from CAD

What are performance constraints? Minimize fuel burn

What are the manufacturing constraints? Less than 60 ft, 2000 lbs, no curvature > xxx

What answers are needed? Multi-angle of attack visualized flows?


PRIORITY OF PERFORMANCE

– parallelize the process

– maximize throughput against number of users

– exploit parallelism in optimization, algorithm, loop

NASA Power Grid Concept

?

System Performance Pyramid

Enterprise Computing

– Hardware & Network: Storage Systems, SMP Clusters

– Middleware: Data base, objectspace, security

– Applications: Manufacturing Resource Planning

– Process: Build a Plane

Scientific Computing

– Hardware & Network: SMP Clusters, MPPs

– Middleware: Geometry (CAD), algorithms, libraries, message system

– Applications: Multidisciplinary optimization, e.g. design for manufacturability

– Process: Design a Plane

Conclusions

Older performance improvement techniques are fundamental and necessary, but not sufficient

New system level “attack” on performance and scalability is needed

– need to address response time

– system throughput (of the entire process)

Looking at performance for (system level) analysis is similar to enterprise-wide computing

Scientists, hardware vendors from the SC community, computer scientists, and enterprise-wide system developers need to collaborate

The traditional supercomputing community needs to diversify its interests!