
Page 1: Power Efficiency and the Top500

John Shalf, Shoaib Kamil, Erich Strohmaier, David Bailey

Lawrence Berkeley National Laboratory (LBNL) / National Energy Research Scientific Computing Center (NERSC)

Top500 Birds of a Feather, SC2007, Reno, Nevada

November 14, 2007


Page 2: New Design Constraint: POWER

• Transistors are still getting smaller
– Moore’s Law is alive and well

• But Dennard scaling is dead!

– No power efficiency improvements with smaller transistors

– No clock frequency scaling with smaller transistors

– All “magical improvement of silicon goodness” has ended

• Traditional methods for extracting more performance are well-mined
– Cannot expect exotic architectures to save us from the “power wall”
– Even the resources of DARPA can only accelerate existing research prototypes (not “magic” new technology)!
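To make the Dennard-scaling claim above concrete, here is the standard CMOS dynamic-power arithmetic behind it (a textbook sketch, not from the original slides):

    % Dynamic switching power of CMOS logic:
    \[ P = C V^{2} f \]
    % Classical Dennard scaling shrinks linear dimensions by 1/\kappa:
    \[ C \to \frac{C}{\kappa}, \qquad V \to \frac{V}{\kappa}, \qquad f \to \kappa f
       \quad\Longrightarrow\quad
       P' = \frac{C}{\kappa}\left(\frac{V}{\kappa}\right)^{2} \kappa f = \frac{P}{\kappa^{2}} \]
    % Transistor area also shrinks by 1/\kappa^{2}, so power density is constant.
    % Once leakage and threshold limits stop V from scaling, per-transistor
    % power no longer falls: P' = (C/\kappa) V^{2} (\kappa f) = P, while area
    % still shrinks by 1/\kappa^{2}. Power density would then grow as \kappa^{2}
    % unless f stops scaling: hence flat clock frequencies.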

Page 3: Estimated Exascale Power Requirements

• Recent Baltimore Sun article on the NSA system in Maryland
– Consuming 75 MW and growing by up to 15 MW/year
– Not enough power left for the city of Baltimore!

• LBNL IJHPCA Study for ~1/5 Exaflop for Climate Science in 2008

– Extrapolation of Blue Gene and AMD design trends

– Estimate: 20 MW for BG and 179 MW for AMD

• DOE E3 Report

– Extrapolation of existing design trends to exascale in 2016

– Estimate: 130 MW

• DARPA Study

– More detailed assessment of component technologies

– Estimate: 20 MW just for memory alone; 60 MW aggregate extrapolated from current design trends

The current approach is not sustainable!

Page 4: Top500 Estimated Power Requirements

[Chart: Growth in Power Consumption (Top50); x-axis: Jun-00 through Jun-06; y-axis: system power (kW), 0–4000; series: max power and avg power]

[Chart: Growth in Power Consumption (Top50), Excluding Cooling; x-axis: Jun-00 through Jun-06; y-axis: system power (kW), 0–800; series: avg. power of Top50]

Page 5: Power is an Industry-Wide Problem

“Hiding in Plain Sight, Google Seeks More Power”, by John Markoff, June 14, 2006

New Google plant in The Dalles, Oregon, from NYT, June 14, 2006

Page 6: Broader Objective

• Role of Top500
– Collected history of the largest HEC investments in the world
– Archive of system metrics plays an important role in analyzing industry trends (architecture, interconnects, processors, OS, vendors)
– Can play an important role in collecting the data necessary to understand power efficiency trends
– Can feed data to studies involving benchmarks other than LINPACK as well (you can do your own rankings!)

• Objectives
– Use the Top500 list to track power efficiency trends
– Raise community awareness of HPC system power efficiency
– Push vendors toward more power-efficient solutions by providing a venue to compare their power consumption

Page 7: Recap (from SC06 and ISC07)

• Last year we presented the following:
– Introduced the looming power crisis in HPC
– Described Top500’s interest in collecting power efficiency data
– Estimated Top500 power trends from vendor specs

• Questions about measurement
– How far are manufacturers’ specs from actual power?
– If I have to measure, how do I do it?
– Does power consumed by HPL reflect power consumed by a regular workload? (is this data relevant?)
– How do I collect the data without taking down my system for an entire day? (making it non-invasive)
– What are some real-world experiences with collecting this data?
– What metric should be used for ranking systems?

Page 8: Anatomy of a “Value” Metric

A “value” metric is a ratio of what you want to what it costs:

Good Stuff / Bad Stuff

Page 9: Anatomy of a “Value” Metric

FLOP/s (Bogus!!!) / Watts (Potentially Bogus!!)

Page 10: Anatomy of a “Value” Metric

Performance / Measured Watt

This is what Top500 is interested in collecting (but is it applicable to other workloads?)

Choose your own metric for performance! (doesn’t need to be HPL, or FLOP/s)
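As a concrete illustration of the performance-per-measured-watt metric, a minimal sketch (Python; the benchmark totals and power trace are made-up numbers for the example, not measurements from the talk):

    # Performance per measured watt: sustained rate divided by average
    # measured power over the run. All numeric inputs are illustrative.

    def mflops_per_watt(total_flops: float, runtime_s: float,
                        watt_samples: list[float]) -> float:
        """Sustained MFLOP/s per average measured watt."""
        sustained_mflops = total_flops / runtime_s / 1e6
        avg_watts = sum(watt_samples) / len(watt_samples)
        return sustained_mflops / avg_watts

    # Hypothetical run: 1.2e13 FLOPs in 600 s, power sampled once per second
    trace = [950.0, 1010.0, 990.0, 1005.0]  # truncated sample trace (watts)
    print(f"{mflops_per_watt(1.2e13, 600.0, trace):.1f} MFLOP/s per watt")

The same arithmetic works for any numerator: swap sustained FLOP/s for application throughput (e.g., simulated years per day) to rank systems by a workload you care about.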

Page 11: Power Efficiency running fvCAM

Benchmark results from Michael Wehner, Art Mirin, Patrick Worley, Leonid Oliker

[Chart: System Power Efficiency for fvCAM (fvCAM performance / system power); systems: Power3, Power5, BG/L, Itanium2, X1E, Earth Simulator; y-axis: sustained MegaFLOP/s per watt on fvCAM, 0–2.5]

Power efficiency for real applications is less differentiated.

[Chart: Peak Power Efficiency (peak FLOP/s / system power); systems: Power3, Power5, BG/L, Itanium2, X1E, Earth Simulator; y-axis: peak MegaFLOP/s per watt, 0–160]

Performance/measured_watt is much more useful than FLOPs/peak_watt.

Page 12: Questions

• How do I measure power?
• Does power consumed when running LINPACK reflect power consumption when running a normal system load (or other benchmarks)?
• Is there a sane way to estimate power consumption under LINPACK/HPL without taking down my system to perform the measurement?
• What should be included in/excluded from the measurement?

Page 13: Compared Methods to Measure Power

• Clamp meters
– +: easy to use, no need to disconnect test systems, wide variety of voltages
– -: very inaccurate for more than one wire
• Inline meters
– +: accurate, easy to use, can output over serial (see the logging sketch after this list)
– -: must disconnect test system, limited voltage, limited current
• Power panels / PDU panels
– Unknown accuracy, must stand and read, usually coarse-grained (unable to differentiate power loads)
– Sometimes the best or only option: can get numbers for an entire HPC system
• Integrated monitoring in system power supplies (Cray XT)
– +: accurate, easy to use
– -: only measures a single cabinet; must know power supply conversion efficiency to project nominal power use at the wall socket
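As an illustration of the inline-meter approach, a minimal logging sketch (Python with pyserial; the device path, baud rate, and one-ASCII-reading-per-line protocol are assumptions for the example, not any particular meter’s actual interface):

    import statistics
    import time

    import serial  # pyserial: pip install pyserial

    # Assumed: an inline meter on /dev/ttyUSB0 that prints one ASCII watts
    # value per line, roughly once per second (hypothetical protocol).
    PORT, BAUD = "/dev/ttyUSB0", 9600

    def log_watts(duration_s: float = 300.0) -> tuple[float, float]:
        """Collect watt samples for duration_s seconds; return (avg, max)."""
        samples = []
        with serial.Serial(PORT, BAUD, timeout=2) as meter:
            deadline = time.time() + duration_s
            while time.time() < deadline:
                line = meter.readline().decode("ascii", errors="ignore").strip()
                try:
                    samples.append(float(line))
                except ValueError:
                    continue  # skip partial or garbled readings
        return statistics.mean(samples), max(samples)

    if __name__ == "__main__":
        avg_w, max_w = log_watts()
        print(f"avg: {avg_w:.1f} W, max: {max_w:.1f} W")

Run the logger for the duration of the benchmark, then pair the average watts with measured performance to form the performance-per-watt ratio above.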

Page 14: Single Node Tests: AMD Opteron

• Highest power usage is 2x NAS FT and LU

Page 15: Similar Results when Testing Other CPU Architectures

• Power consumption is far less than the manufacturer’s estimated “nameplate power”
• Idle power is much lower than active power
• Power consumption when running LINPACK is very close to power consumed when running other compute-intensive applications

[Charts: measured power for Intel Core Duo, AMD Opteron, and IBM PPC970/G5 test systems]

Page 16: Entire System Power Usage

Full System Test on Cray XT4 (franklin.nersc.gov)

[Chart: full-system power in kilowatts (0–1400) over time, running STREAM, HPL, and throughput workloads under Catamount and CNL, with and without the idle() loop]

• Tests run across all 19,353 compute cores
• Throughput: NERSC “realistic” workload composed of full applications
• idle() loop allows powersave on unused processors (generally more efficient)

Page 17: Single Rack Tests (extrapolating power consumption)

• Administrative utility gives rack DC amps & voltage
• HPL & Paratec show the highest power usage

[Chart: Single Cabinet Power Usage over time; y-axis: amps @ 52 VDC, 0–300; workloads: STREAM, GTC, FT.D, LU.D, CG.D, paratec, pmemd, HPL]

Page 18: Modeling the Entire System

• The model error is about 0.05 (5%) if we assume 90% power supply efficiency

[Chart: Full System Power Usage, model using actual measurements vs. single rack × number of racks; workloads: idle, STREAM, HPL; y-axis: watts, 0–1.4 MW; series: actual watts AC, model at 90% efficiency, model at 80% efficiency, model with FS (est. 50 MW)]
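A minimal sketch of the extrapolation arithmetic behind this model (Python; the sample values are illustrative placeholders, not Franklin’s measured numbers): rack DC power is amps × volts, scaled by the number of racks, then divided by an assumed AC-to-DC conversion efficiency to estimate wall-socket AC power.

    # Extrapolate full-system AC power from a single-rack DC measurement.
    # All numeric inputs below are illustrative, not measured values.

    def system_ac_watts(rack_amps: float, rack_volts: float,
                        num_racks: int, psu_efficiency: float) -> float:
        """Estimate whole-system AC draw from one rack's DC reading."""
        rack_dc_watts = rack_amps * rack_volts       # P = I * V for one rack
        system_dc_watts = rack_dc_watts * num_racks  # scale to all racks
        return system_dc_watts / psu_efficiency      # account for AC->DC loss

    # Example: hypothetical 200 A @ 52 VDC rack, 100 racks, 90%-efficient PSUs
    est = system_ac_watts(rack_amps=200.0, rack_volts=52.0,
                          num_racks=100, psu_efficiency=0.90)
    print(f"estimated system power: {est / 1e6:.2f} MW AC")  # ~1.16 MW

Varying psu_efficiency between 0.80 and 0.90 produces the kind of band the model comparison above shows.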

Page 19: Power Measurement (what to include/exclude)

• Facility Cooling: NO!

• Compute Nodes: Yes!
– Collect power consumption when running HPL
– Can accurately extrapolate from HPL running on any subset of the system

• Switches (InfiniBand or Ethernet): Yes!
– Are not generally stressed by HPL
– Appear to consume nearly the same power under any load (may change in the future with IEEE stds.)

• I/O: NO!
– Probably should exclude if possible
– Ambiguous contribution because of increasing use of facility-wide filesystems
– Not stressed by HPL

Page 20: Conclusions

• Power utilization under an HPL/LINPACK load is a good estimator of power usage under mixed workloads, for single nodes, cabinets/clusters, and large-scale systems
– Idle power is not
– Nameplate and CPU power are not

• LINPACK running on one node or rack consumes approximately the same power as the node would consume if it were part of a full-system parallel LINPACK job

• We can estimate overall power usage by measuring a subset of the entire HPC system and extrapolating to the total number of nodes, using a variety of power measurement techniques
– And the estimates mostly agree with one another!

• Need to come up with a better NUMERATOR for our “power efficiency” metric
– Top500 will be able to collect the data
– But ranking by HPL_FLOPs/measured_watts is probably not the right message to the HPC industry!