liquid coolingcommissioning lessonslearned@lrz · pdf filevictor apostolescu created date:...
TRANSCRIPT
Leibniz Supercomputing Centre
Liquid Cooling CommissioningLessons Learned @LRZ
Detlef Labrenz ([email protected])EE HPC WORKING GROUP, SC13, Denver, Nov. 17-22, 2013
Leibniz Supercomputing Centre
-2-
Munich Bavaria Germany & Europe
• We provide generic IT services to all Munich universities• We provide special IT services to all universities in Bavaria
• Network, High Performance and Grid Computing• Backup and Archive Services• IT Management
• We provide supercomputing resources to scientists in Europe• Member of the German Gauss Supercomputing Centre• Third party of the European HPC Infrastructure PRACE• PRACE Tier-0 Supercomputing Center (SuperMUC system)• Investigations on Future HPC Systems:
• Hardware Architectures• Programming Models & System Software• Zero Emission Data Center Infrastructures• Re-Use of Waste Heat
Data Center Infrastucture
-3-
Layout of Cooling Infrastructure
-4-DH
Cold Water -1
WW 30/36°C
CW 14/20°C
VCW 4/10°C
W/Gly
Htg/CCA
CRAH
DAR
KLT19RCT
RCT
KLT11RCT
KLT12RCT
KLT13RCT
KLT14RCT
KLT15Ch+RCT
KLT16Ch+RCT
KLT17CT
KLT18Fountain
KLT04Ch+RCT
KLT05Ch+RCT
KLT02Ch+RCT
HE
9
HRRDLC RHEx
NSRDLC RHEx
DARRHEx
EG
UG
CRAH
CRAH
CRAH
CRAH
CRAC
CRAC
CRAC
CAVE Office 2CRAH/C
Htg CCCA
AC SAN
LECTURESOffice 1
RLT
AC SAN
Htg CCA
KLT03Ch+RCT
HE
11
KLT01Ch+RCT
HE
1H
E10
2
HE20
HE ?
HE6
HE4
HE1
HE8
HE4HE20 HE8
HE11HE12
Warm Water Cold Water -2 Very Cold Water
CRAC
CCA
SuperMUC: IBM System x iDataPlexWith Direct Water Cooling
-5-Torsten Bloth, IBM Lab Services - © IBM Corporation
iDataplex DWC Rack w/ water cooled nodes
(rear view of water manifolds)
Heat Profile of SuperMUC (2/2013)
-6-
CASE #1: Cooling Towers
-7-
Malfunction of Water Level Sensors
• Issue: • Water demand cooling loop increased
• Filling volume: 7 m³• Typical feed: 1+ m³/h• Observed losses: 1 – 2 m³ in hours or days
• Investigations:• Load tests of the cooling towers• Inspection of the control system• Examination of the measuring devices
• Reason: • Wrong dimensions of the filling level sensor
installed
• Root cause: • Main contractor did not use the sensor
recommended by the manufacturer of the cooling tower
-8-
CASE #2: Operating ControlWarm Water Cooling Infrastructure
-9-
SuperMUC
M
Cluster
M
M
M
∞ KLT 12 ∞M
T T
MM
∞ KLT 11 ∞M
M
∞ KLT 14 ∞M
M
∞ KLT 13 ∞
N NNN
SS S
S
T T
M
T T
M
T T
M
HRR NSR
M
M M
Operating Control: Test of DTinlet(NSR) = -20 KResponse of Warm Water Cooling Infrastructure
-10-
Operating Control: Test of DTinlet(NSR) = -20 KResponse of Warm Water Cooling Infrastructure
-11-
COP = Cooling Capacity/ Power Input
COP
WWW.SIMOPEK.DE
-12-
Thank You!
Zero Emission Supercomputing Centre
Energy Efficient HPC: The Four Pillar Model
Data Center (Goal: Reduce Total Cost of Operation)
Uti
lity
Pro
vid
ers
Ne
igh
bo
rin
g B
uil
din
gs
Building Infrastructures
HPC System Hardware
HPC SystemSoftware
HPC Applications
Goal:Improve PUE (Power Usage Effectiveness)
Goal:Reduce Hardware Power Consumption
Goal:Optimize Resource Usage, Tune System
Goal:Optimize Application Performance
-14-