Carnegie Mellon School of Computer Science
Forecasting with Cyber-physical Interactions in Data Centers
PDL Seminar, 9/28/2011
(c) Lei Li 2012
Outline
• Overview of time series mining
  – Time series examples
  – What problems do we solve
• Motivation
• Experimental setup
• ThermoCast: the forecasting model
• Results
• Other time series models and algorithms
What are co-evolving time series?
Correlated multidimensional time sequences with joint temporal dynamics
Motion Capture
• Goal: generate natural human motion
  – Games ($57B)
  – Movie industry
• Challenges:
  – Missing values
  – “Naturalness”
Figure: walking motion, right hand and left hand [Li et al 2008a]
Environmental Monitoring
• Problem: early detection of leakage & pollution
• Challenge: noise & large data
Figure: chlorine level in drinking water systems [Li et al 2009]
Network Security
• Challenge: anomaly detection in computer networks & online activity
Figures: number of BGP updates on a backbone (from http://datapository.net/); web clicks for news (from NTT); web clicks for TV
Time Series Mining Problems
• Forecasting
• Imputation (missing values)
• Compression
• Segmentation, change/anomaly detection
• Clustering
• Similarity queries
• Scalable/parallel/distributed algorithms
See my thesis for algorithms covering these problems
Datacenter Monitoring & Management
• Goal: save energy in data centers
  – US alone: $7.4B power consumption (2011)
• Challenges:
  – Huge data (1TB per day)
  – Complex cyber-physical systems
Figure: temperature in a datacenter
Typical Data Center Energy Consumption
• LBL data center [LBNL/PUB-945]
• Google data center [Barroso 09]
Chart labels: DC equipment 4%; server 46%; CRAC 25%; cooling tower 4%; air movement 8%; electric room 4%; UPS losses 8%; lighting 4%
Towards Thermal Aware DC Management
• Data centers are often over-provisioned, with ≈40% of energy spent on cooling (total = $7.4B)
• How can we improve energy efficiency in modern multi-megawatt data centers?
Figure: JHU data center with Genomote
Air cycle in DC
Possible Ways for Saving Cooling and Computing Cost
• Challenges:
  – Airflow interaction, spatial placement, SLA, …
• Possible direction:
  – Shut down unused machines according to workload
Figure: example MSN workload
Towards Data-Driven AC Control and Server Management
• Reactive energy saving:
  – Slow down cooling fan in CRAC
  – Raise AC temperature set points
• Proactive data center management:
  – Predicting temperature distribution and thermal-aware placement of workload
Constraints:
  supply air temperature < threshold
  max(active inlet air temperature) < threshold
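The two threshold constraints can be read as a simple safety check on a proposed set point or workload placement. The sketch below is only an illustration: the function name, argument names, and threshold values are hypothetical and do not come from the talk.

```python
from typing import Sequence


def thermal_constraints_ok(
    supply_air_temp_c: float,
    inlet_temps_c: Sequence[float],
    active: Sequence[bool],
    supply_threshold_c: float,
    inlet_threshold_c: float,
) -> bool:
    """Return True when both slide constraints hold.

    Constraint 1: supply air temperature < threshold.
    Constraint 2: max inlet temperature over ACTIVE servers < threshold.
    All names and thresholds here are illustrative assumptions.
    """
    # Only servers that are switched on count towards the inlet constraint.
    active_inlets = [t for t, on in zip(inlet_temps_c, active) if on]
    supply_ok = supply_air_temp_c < supply_threshold_c
    # With no active server the inlet constraint is trivially satisfied.
    inlet_ok = max(active_inlets) < inlet_threshold_c if active_inlets else True
    return supply_ok and inlet_ok


# Hypothetical readings: the hottest inlet belongs to a server that is off.
ok = thermal_constraints_ok(
    supply_air_temp_c=15.0,
    inlet_temps_c=[20.0, 27.0, 35.0],
    active=[True, True, False],
    supply_threshold_c=18.0,
    inlet_threshold_c=30.0,
)
print(ok)
```

Restricting the maximum to active servers is what makes shutting down machines in hot spots useful: an inactive server's inlet no longer limits the cooling set point.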
Big Picture: Predictive AC Control and Server Management
Diagram components: sensor measuring; temperature prediction; server/workload management; cooling energy model; computing energy model; CRAC control
Experimental setup
• Tested in the JHU data center with 171 1U servers, instrumented with a network of 80 sensors
Sample measurements
Observations
• Temperature difference cycle (max/min temp. on the same rack) is in anti-phase with the air velocity cycle.
• Middle and bottom sections are coldest; top is hottest.
• Shutting down under-utilized servers could reduce energy consumption.
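One simple way to test an anti-phase relationship like the first observation is the Pearson correlation between the two signals: two cycles with the same period in anti-phase give a value close to -1. The sketch below uses synthetic sinusoids, not the JHU measurements, and the function and variable names are hypothetical.

```python
import numpy as np


def phase_correlation(a, b) -> float:
    """Pearson correlation of two equally sampled signals."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.corrcoef(a, b)[0, 1])


# Synthetic stand-ins: air velocity rises exactly when the rack's
# max/min temperature difference falls (same period, opposite sign).
t = np.linspace(0.0, 4.0 * np.pi, 400)
air_velocity = 1.0 + 0.3 * np.sin(t)
temp_difference = 5.0 - 2.0 * np.sin(t)

r = phase_correlation(temp_difference, air_velocity)
print(f"correlation = {r:.3f}")
```

On real sensor data the value would be less extreme, and a lagged cross-correlation would be needed if the two cycles were shifted by something other than half a period.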
What happens when shutting down servers?
Figure annotation: shut down
ThermoCast [Li et al, KDD 2011]
• Given: intake temperatures, outtake temperatures, workload for each server, and floor air speed
• Goal: forecast the temperature distribution and enable thermal-aware placement of workload
• Approach: a zonal forecasting model
  – Divide the machine room into zones, and each rack into sections
Assumptions
• A0: incompressible air
• A1: environmental temperature is constant
• A2: supply air temperature is constant within a period
• A3: constant server fan speed
• A4: vertical air flow at the outtake is negligible
• A5: vertical air flow at the intake is linear in height
Sensor measurements & Air interactions
ThermoCast
ThermoCast Model
Diagram labels: floor air speed; inlet temp; outlet temp
Derived from fluid dynamics and thermodynamics together with the assumptions [Li et al, KDD 2011]
Parameter Learning
[Equation: objective, s.t. constraints]
ThermoCast Results
• Q1: How accurately can a server learn its local thermal dynamics for prediction? 2x better
  – Using 90 minutes as training, predicting 5 minutes away
Figure: AR vs. ThermoCast (labels: 75%, 100%, shutdown)
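For readers unfamiliar with the AR baseline, the sketch below fits a plain autoregressive model by least squares and rolls it forward. It is not the ThermoCast model, and several things are assumptions made only for illustration: one sample per minute (so 90 training samples and a 5-step horizon), an AR order of 2, a synthetic inlet-temperature signal, and the function names.

```python
import numpy as np


def fit_ar(series, order: int) -> np.ndarray:
    """Least-squares fit of y[t] = c + a1*y[t-1] + ... + ap*y[t-p].

    Returns the coefficient vector [c, a1, ..., ap].
    """
    y = np.asarray(series, dtype=float)
    n = len(y) - order
    # Column k holds the lag-k values aligned with the targets y[order:].
    lag_columns = [y[order - k: len(y) - k] for k in range(1, order + 1)]
    design = np.column_stack([np.ones(n)] + lag_columns)
    coef, *_ = np.linalg.lstsq(design, y[order:], rcond=None)
    return coef


def forecast_ar(series, coef: np.ndarray, steps: int) -> list:
    """Iterated multi-step forecast: each prediction feeds the next one."""
    order = len(coef) - 1
    history = list(np.asarray(series, dtype=float)[-order:])
    predictions = []
    for _ in range(steps):
        lags = history[::-1][:order]  # most recent value first
        nxt = float(coef[0] + np.dot(coef[1:], lags))
        predictions.append(nxt)
        history.append(nxt)
    return predictions


# Synthetic inlet temperature, one sample per minute (assumed rate).
minutes = np.arange(95)
inlet = 22.0 + 2.0 * np.sin(0.2 * minutes)

coef = fit_ar(inlet[:90], order=2)           # train on 90 minutes
pred = forecast_ar(inlet[:90], coef, steps=5)  # predict 5 minutes away
print(pred[-1], inlet[94])
```

A noise-free sinusoid is exactly representable by an AR(2) recurrence, so the forecast here is near-perfect; real temperature traces with workload changes and shutdowns are where a purely data-driven baseline like this loses accuracy.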
ThermoCast Results
• Q2: How long ahead can ThermoCast forecast thermal alarms? 2x faster

           Baseline   ThermoCast
Recall     62.8%      71.4%
FAR        45%        43.1%
MAT        2.3 min    4.2 min

FAR = false alarm rate; MAT = mean look-ahead time
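The sketch below shows one plausible way to compute the three metrics from a log of raised alarms. The exact definitions used in the talk are not given on the slide, so the following are assumptions: recall is the fraction of actual alarm events that were predicted in advance, FAR is the fraction of raised alarms that match no actual event, and MAT averages the lead time of the earliest correct warning per event. The data layout and names are hypothetical.

```python
def alarm_metrics(actual: dict, raised: list) -> tuple:
    """Compute (recall, false alarm rate, mean look-ahead time).

    actual: event id -> time (minutes) at which the thermal alarm occurred.
    raised: list of (event id or None, time the forecast alarm was raised);
            None marks a raised alarm that matches no actual event.
    """
    first_raise = {}
    false_alarms = 0
    for event_id, t in raised:
        # Unmatched or late warnings count as false alarms (assumed rule).
        if event_id is None or event_id not in actual or t > actual[event_id]:
            false_alarms += 1
            continue
        first_raise[event_id] = min(t, first_raise.get(event_id, t))

    recall = len(first_raise) / len(actual) if actual else 0.0
    far = false_alarms / len(raised) if raised else 0.0
    lead_times = [actual[e] - t for e, t in first_raise.items()]
    mat = sum(lead_times) / len(lead_times) if lead_times else 0.0
    return recall, far, mat


# Hypothetical log: three real alarms, two predicted ahead, one false alarm.
actual_events = {"a": 10.0, "b": 20.0, "c": 30.0}
raised_alarms = [("a", 6.0), ("b", 17.0), (None, 12.0)]
recall, far, mat = alarm_metrics(actual_events, raised_alarms)
print(recall, far, mat)
```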
Implication on Capacity Gain
• Preliminary results comparing workload placement strategies:
  – 5-minute forecast length
  – With the same cooling:
    • Inlet temp with ThermoCast: 13.75 °C
    • Inlet temp with static profiling: 16.5 °C
• Assuming the servers consume 200 W on average (Dell PowerEdge 1950), we gain an extra 26% computing power with the same cooling
Contributions and Impact
• Predictability: a hybrid approach that integrates thermodynamics and sensor data
• Scalable learning/training thanks to the zonal thermal model
• Real data and instrumentation in a data center with a practical workload
• Projected impact: can handle an extra 26% workload (e.g. PUE 1.5 → PUE 1.4)
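The PUE example is consistent with a simple reading, sketched below: PUE is total facility power divided by IT power, and the non-IT overhead (cooling and the rest) is assumed to stay fixed while the IT load grows by 26%. That fixed-overhead assumption and the function name are mine, not stated on the slide.

```python
def pue_after_gain(pue_before: float, workload_gain: float) -> float:
    """PUE after IT load grows by `workload_gain` with overhead unchanged.

    Normalizes the original IT power to 1, so overhead = pue_before - 1.
    """
    overhead = pue_before - 1.0
    it_power = 1.0 + workload_gain
    return (it_power + overhead) / it_power


new_pue = pue_after_gain(1.5, 0.26)
print(round(new_pue, 2))
```

With these inputs the ratio is 1.76 / 1.26, which rounds to the 1.4 quoted above.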
DynaMMo: imputation/forecasting
Figure: sensor 1, sensor 2, …, sensor m over time, with a blackout interval
Goal: recover the missing values
Details in [Li et al, KDD 2009]
DynaMMo result
Figure: reconstruction error vs. average missing length (annotations: ideal, better, harder)
Methods compared: our DynaMMo, MSVD [Srebro’03], linear interpolation, spline
Dataset: CMU Mocap #16 (mocap.cs.cmu.edu)
More results in [Li et al, KDD 2009]
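To make the comparison concrete, here is a minimal sketch of the linear interpolation baseline, which fills a blackout in a single sensor stream from its nearest observed neighbours. It is not DynaMMo, which models the sensors jointly; the function name and the use of NaN to mark missing samples are assumptions for illustration.

```python
import numpy as np


def interpolate_blackout(values) -> np.ndarray:
    """Fill NaN gaps in one sequence by linear interpolation.

    Assumes at least one observed (non-NaN) sample; gaps at either end
    are filled with the nearest observed value, as np.interp does.
    """
    x = np.asarray(values, dtype=float)
    idx = np.arange(len(x))
    known = ~np.isnan(x)
    filled = x.copy()
    filled[~known] = np.interp(idx[~known], idx[known], x[known])
    return filled


# Hypothetical sensor trace with a two-sample blackout.
trace = [1.0, float("nan"), float("nan"), 4.0]
print(interpolate_blackout(trace))
```

Because each sensor is filled independently, this baseline cannot use correlated sensors that kept reporting during the blackout, which is the gap that joint models of co-evolving sequences aim to close.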
PLiF and CLDS for clustering
Figure: BGP data, hierarchical clustering + PLiF features
Details in [Li et al, VLDB 2010] and [Li & Prakash, ICML 2011]
CLDS Clustering Mocap Data
Accuracy = 93.9%; Accuracy = 51.0%
Panels: PCA top 2 components; CLDS two features
Classes: walking motion, running motion
WindMine
• Goal: find patterns and anomalies from user-click streams
Discoveries by WindMine
Figure panels: job website, weather, kids, health
Conclusion
• Time series mining has many applications
• Data centers consume a great deal of energy, and cooling accounts for a large share of the cost
• Sensor networks find use in data center monitoring
• ThermoCast: the forecasting model
• Other time series models and algorithms
  – DynaMMo for imputation
  – PLiF & CLDS for clustering
  – WindMine for web clicks
References
• Lei Li, et al. ThermoCast: A Cyber-Physical Forecasting Model for Data Centers. KDD 2011.
• Lei Li, et al. Time Series Clustering: Complex is Simpler. ICML 2011.
• Yasushi Sakurai, Lei Li, et al. WindMine: Fast and Effective Mining of Web-click Sequences. SDM 2011.
• Lei Li, et al. Parsimonious Linear Fingerprinting for Time Series. VLDB 2010.
• Lei Li, et al. DynaMMo: Mining and Summarization of Coevolving Sequences with Missing Values. KDD 2009.
Thanks!
Contact: Lei Li ([email protected])
Papers, software, datasets on http://www.cs.cmu.edu/~leili