Thermal Aware Data Management in Cloud based Data Centers
Ling Liu, College of Computing
Georgia Institute of Technology
NSF SEEDM workshop, May 2-3, 2011
Thermal aware Computing Era
• Power density increases
– Circuit density increases by a factor of 3 every 2 years
– Energy efficiency increases by a factor of 2 every 2 years
– Effective power density increases by a factor of 1.5 every 2 years
[Kenneth Brill: The Invisible Crisis in the Data Center]
• Maintenance/TCO rising
– Data center TCO doubles every three years
– Three-year cost of electricity exceeds the purchase cost of the server
– Virtualization/consolidation is a one-time, short-term solution
[Uptime Institute]
• Thermal management accounts for an increasing portion of expenses
– Thermal-aware computing and management solutions are becoming prominent
– Increasing need for thermal awareness
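The three growth factors in the power density bullet appear to be related by simple division: circuit density grows faster than energy efficiency, and the ratio is the effective power density growth. A short worked form, assuming the factors compound per two-year period (the six-year figure is an extrapolation, not a number from the slide):

```latex
\[
\frac{\text{circuit density growth}}{\text{energy efficiency growth}}
  = \frac{3}{2} = 1.5 \quad \text{per 2 years},
\qquad
1.5^{3} \approx 3.4 \quad \text{over 6 years}.
\]
```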
Thermal aware Task Scheduling in Data Centers
• Given a total task C, how should it be divided among N server nodes so that the computation finishes with minimal cooling energy cost?
• Self-interference and cross-interference raise the inlet air temperature and should be minimized
• Environment interference (room temperature) is not critical
• Task scheduling in the spatial domain
[Varsamopoulos, Gupta 2008]
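To make the interference idea concrete, here is a minimal sketch that compares two ways of dividing the same total load. It assumes a linear model in which each node's inlet temperature is the supply temperature plus a weighted sum of all node heat loads. The matrix `D`, the value of `T_SUPPLY`, and the two candidate splits are hypothetical illustration values, not data from the talk.

```python
# Hypothetical interference coefficients (K per kW): D[i][j] is the inlet
# temperature rise at node i caused by 1 kW dissipated at node j.
# Diagonal entries model self-interference, off-diagonal ones cross-interference.
D = [
    [0.10, 0.30, 0.05],
    [0.05, 0.10, 0.30],
    [0.02, 0.05, 0.10],
]
T_SUPPLY = 288.0  # assumed cooling supply air temperature in K


def inlet_temperatures(loads_kw):
    """Inlet temperature of each node for a given per-node heat load."""
    n = len(loads_kw)
    return [
        T_SUPPLY + sum(D[i][j] * loads_kw[j] for j in range(n))
        for i in range(n)
    ]


def peak_inlet(loads_kw):
    """Hottest inlet temperature across all nodes."""
    return max(inlet_temperatures(loads_kw))


uniform = [10.0, 10.0, 10.0]  # 30 kW split evenly
skewed = [16.0, 8.0, 6.0]     # same 30 kW, shifted toward node 0

print(peak_inlet(uniform), peak_inlet(skewed))
```

With these made-up coefficients the skewed split gives a lower peak inlet temperature than the uniform one, because node 0's heat disturbs its neighbours the least. A lower peak generally leaves more headroom for the cooling system, which is the sense in which the division of C affects cooling cost.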
Energy Saving by Dynamic Load Distribution
Increasing the range of changes in the rack heat load
• In the case study, a heat load distribution of [30 kW, 5 kW, 5 kW, 20 kW] needs only 1.7 m/s (9,726 CFM) of cooling air flow
• This is 19% less than the uniform distribution of [15 kW, 15 kW, 15 kW, 15 kW] needs (2.1 m/s)
• This could save ~$189,000 annually in typical real-world data centers
[Figure: temperature contours around racks, comparing [15, 15, 15, 15] kW with 2.1 m/s against [30, 5, 5, 20] kW with 1.7 m/s]
[Yogendra Joshi, Georgia Tech/CERCS]
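The 19% figure can be checked directly from the two air velocities quoted on the slide. The short calculation below also confirms that both distributions carry the same total heat load; the variable names are chosen here for illustration.

```python
uniform_loads_kw = [15, 15, 15, 15]
skewed_loads_kw = [30, 5, 5, 20]

uniform_velocity = 2.1  # m/s, required for the uniform distribution
skewed_velocity = 1.7   # m/s, required for the non-uniform distribution

# Relative reduction in required cooling air velocity.
reduction = (uniform_velocity - skewed_velocity) / uniform_velocity

print(sum(uniform_loads_kw), sum(skewed_loads_kw))  # 60 60
print(f"{reduction:.0%}")                           # 19%
```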
Think Globally, Act Locally
Numerically
Run simulations for a range of velocities
Make a server heat load-inlet T variation matrix
[Matrix over servers S1, S2, ..., Sn: change in max. inlet T of servers per unit change in server loads]
Experimentally
Vary the heat loads sequentially at servers for a chosen unit cell and monitor the max. server inlet T
Advantage:
The experimental approach does not require simulation runs at different velocities.
Modifications:
Blocks of servers can be identified that have the same effect, or no effect, on the inlet T.
• This gives insight into the sparsity of this matrix.
• This reduces the computational work.
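The sequential procedure described above can be sketched as a finite-difference loop: raise one server's load by a unit, record how every server's max. inlet T changes, and store that as one column of the matrix. The function `fake_measurement` and the matrix `TRUE_A` are hypothetical stand-ins for a real measurement or simulation run; only the perturbation loop reflects the method on the slide.

```python
def build_sensitivity_matrix(measure_max_inlet, base_loads, delta=1.0):
    """A[i][j] = change in max. inlet T of server i per unit load change at server j."""
    n = len(base_loads)
    baseline = measure_max_inlet(base_loads)
    A = [[0.0] * n for _ in range(n)]
    for j in range(n):                 # vary the heat loads one server at a time
        perturbed = list(base_loads)
        perturbed[j] += delta
        temps = measure_max_inlet(perturbed)
        for i in range(n):
            A[i][j] = (temps[i] - baseline[i]) / delta
    return A


# Hypothetical stand-in for an experiment or CFD run (linear response assumed).
TRUE_A = [
    [0.5, 0.2, 0.0],
    [0.1, 0.5, 0.2],
    [0.0, 0.1, 0.5],
]


def fake_measurement(loads):
    n = len(loads)
    return [290.0 + sum(TRUE_A[i][j] * loads[j] for j in range(n)) for i in range(n)]


A = build_sensitivity_matrix(fake_measurement, [1.0, 1.0, 1.0])

# Entries that are (numerically) zero mark server pairs with no mutual effect.
zero_entries = sum(1 for row in A for a in row if abs(a) < 1e-6)
print(zero_entries)
```

Counting the near-zero entries is one simple way to see the sparsity mentioned in the modifications: servers with no effect on each other need no further perturbation runs.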
A Matrix
maximize $\sum_{i=1}^{n} l_i$
subject to $A\,l + T \le T_{cr}$ and $l_{min} \le l_i \le l_{max}$
Where,
$l_i$: server i load
$l_{min}$: minimum load (startup)
$l_{max}$: max. load (full utilization)
$T_{cr}$: max. inlet T allowed by ASHRAE
[Yogendra Joshi, Georgia Tech/CERCS]
68% increase in allowed heat dissipation (for the same CRAC velocity)
37.5% decrease in facilities energy consumption (for the same heat dissipation)
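The load optimization on this slide reads as: maximize the total server load while the predicted inlet temperatures, of the form A·l + T, stay at or below the allowed limit and each load stays between its minimum and maximum. The sketch below is not the authors' solver. It swaps in a simple greedy heuristic in plain Python rather than an exact linear-programming solve, and the matrix, base temperatures, and temperature limit are hypothetical values. Only the 0.8 to 7.5 kW load range is taken from the example on the next slide.

```python
def inlet_temps(A, T_base, loads):
    """Predicted inlet temperatures under the assumed linear model T_base + A*loads."""
    n = len(loads)
    return [T_base[i] + sum(A[i][j] * loads[j] for j in range(n)) for i in range(n)]


def greedy_allocation(A, T_base, T_cr, l_min, l_max, step=0.1):
    """Greedy heuristic (not an exact LP solve) for maximizing total load."""
    n = len(T_base)
    max_steps = int(round((l_max - l_min) / step))
    counts = [0] * n  # integer step counts avoid floating-point drift

    def loads_for(c):
        return [l_min + k * step for k in c]

    if max(inlet_temps(A, T_base, loads_for(counts))) > T_cr:
        raise ValueError("even the minimum loads exceed the temperature limit")

    while True:
        best_j, best_peak = None, None
        for j in range(n):
            if counts[j] >= max_steps:
                continue  # server j is already at full utilization
            trial = list(counts)
            trial[j] += 1
            peak = max(inlet_temps(A, T_base, loads_for(trial)))
            # Keep the feasible increment that leaves the coolest peak inlet T.
            if peak <= T_cr and (best_peak is None or peak < best_peak):
                best_j, best_peak = j, peak
        if best_j is None:
            return loads_for(counts)  # no single increment is feasible any more
        counts[best_j] += 1


# Hypothetical inputs for illustration only.
A_example = [
    [0.5, 0.2, 0.0],
    [0.1, 0.5, 0.2],
    [0.0, 0.1, 0.5],
]
T_base_example = [290.0, 290.0, 290.0]  # assumed inlet T with no server load, K
T_cr_example = 294.0                    # assumed allowed max. inlet T, K

loads = greedy_allocation(A_example, T_base_example, T_cr_example, 0.8, 7.5)
print(loads, sum(loads))
```

The result is feasible by construction but only locally maximal. For a real data center an LP solver would be the natural choice, since both the objective and the constraints are linear.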
An Example
[Figure: max. inlet T at servers (K), axis from 288 K to 328 K, plotted per server at V_CRAC = 5 m/s. Series: AILM, 0.8-7.5 kW server range (A rack and B rack); Uniform, 5 kW server load (A rack and B rack); safe temperature limit. Total data center load dissipation: 298 kW, 297 kW.]
[Yogendra Joshi, Georgia Tech/CERCS]
Pertinence of Thermal Maps in Data Center Management
• Given an equipment utilization layout, find the temperature around the room
• Create a collection of thermal maps or a function to “predict” thermal behavior of a task assignment
• Use collection to decide on job placement (temporally and spatially)
[Varsamopoulos, Gupta 2008]
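One way to picture the use of a thermal-map collection for job placement is a lookup table keyed by utilization layout. For each rack that could take the new job, look up the thermal map of the resulting layout and pick the rack whose map has the lowest peak temperature. The layouts, temperatures, and function names below are hypothetical, and the sketch covers only the spatial decision, not the temporal one.

```python
# Hypothetical collection of thermal maps: utilization layout (jobs per rack)
# -> grid of predicted room temperatures in K.
THERMAL_MAPS = {
    (2, 0, 0): [[301.0, 296.0, 292.0]],
    (1, 1, 0): [[297.0, 298.5, 293.0]],
    (1, 0, 1): [[296.5, 294.0, 296.0]],
}


def peak_temperature(thermal_map):
    """Hottest point in a thermal map."""
    return max(max(row) for row in thermal_map)


def place_job(layout):
    """Return (rack, predicted peak T) for the coolest known placement, or None."""
    best = None
    for rack in range(len(layout)):
        candidate = tuple(u + (1 if k == rack else 0) for k, u in enumerate(layout))
        thermal_map = THERMAL_MAPS.get(candidate)
        if thermal_map is None:
            continue  # no map recorded for this layout
        peak = peak_temperature(thermal_map)
        if best is None or peak < best[1]:
            best = (rack, peak)
    return best


print(place_job((1, 0, 0)))  # (2, 296.5)
```

A fitted prediction function could replace the dictionary lookup when the collection does not contain the layout in question.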