Feedback Control Real-Time Scheduling
C. Lu, J.A. Stankovic, G. Tao, and S.H. Son, Design and Evaluation of a Feedback Control EDF Scheduling Algorithm, IEEE Real-Time Systems Symposium (RTSS'99), December 1999.
Motivation for Feedback Control Scheduling
Open-loop scheduling paradigms perform poorly in unpredictable dynamic systems where the workload cannot be accurately modeled
Many complex applications, e.g., robotics and agile manufacturing, are dynamic and operate in a non-deterministic environment where the precise workload is not known
It is challenging to build real-time systems that provide predictable performance in a highly uncertain environment
Feedback control can support the target performance even when the workload varies dynamically, via graceful QoS degradation in a closed loop
Motivation
Apply control theoretic approaches to real-time performance management
Feedback control is well known for its robustness in the presence of disturbances, e.g., cruise control or chemical reactor control
Doesn’t need a precise system model
If the precise system model is known, feedback control is not necessary
Dynamically adapt the system behavior to achieve the target performance (also called the set point) in the feedback loop
Feedback Control Concepts
Set-point: Target performance to achieve, e.g., 1% deadline miss ratio
Measured perf: Actual perf, e.g., actual (deadline) miss ratio, measured at the current sampling period
Error = set-point – measured perf = target miss ratio – current miss ratio
[Figure: block diagram with blocks "Controller" and "Controlled RT System"; signals: Setpoint (+), Measured Perf. (-), Error, Control Signal]
Feedback Control Loop
Periodically measure and compare the perf to the set point to determine the error
Controller computes the control signal based on the error and controlled system model
Actuator, e.g., admission controller or QoS manager, changes the value of the manipulated variable to control the system
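The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the proportional controller, the gain value, and the `toy_plant` function (miss ratio modeled as (u - 1)/u under overload) are hypothetical and are not taken from the paper.

```python
def toy_plant(utilization):
    """Hypothetical controlled system: miss ratio as a function of requested utilization."""
    if utilization <= 1.0:
        return 0.0
    return (utilization - 1.0) / utilization


def run_feedback_loop(set_point, gain, utilization, periods, plant=toy_plant):
    """Repeat: measure, compare with the set point, compute the control signal, actuate."""
    history = []
    for _ in range(periods):
        measured = plant(utilization)      # sensor: measured perf in this sampling period
        error = set_point - measured       # Error = set-point - measured perf
        control_signal = gain * error      # controller (proportional, an assumption)
        utilization += control_signal      # actuator changes the manipulated variable
        history.append((measured, error))
    return utilization, history


# Target a 1% miss ratio, starting from a heavily overloaded system.
final_u, history = run_feedback_loop(set_point=0.01, gain=0.5, utilization=1.5, periods=50)
last_measured = history[-1][0]
```

With these toy numbers the measured miss ratio settles near the 1% set point; a different plant or gain may converge more slowly or oscillate.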
FC-EDF Architecture
Miss Ratio Control Model
At the kth sampling instant, the miss ratio is: $m(k) = m(k-1) + g(k)\,\Delta u(k-1)$, where
$m(k-1)$: miss ratio at the (k-1)th sampling period
$g(k)$: miss ratio gain
$\Delta u(k-1)$: utilization adjustment by admission control and QoS adaptation at the (k-1)th sampling period
Miss Ratio Control Model
Instead of considering the time-varying miss ratio gain g(k), the authors use a constant G = the maximum miss ratio increase per unit load increase
[Figure: miss ratio vs. load; load axis ticks 0.9, 1, 1.1, 1.2, 1.3, ...]
Miss ratio control is very challenging due to the nonlinear nature of the MR increase!
Miss Ratio Control Model
Replace g(k) with G: $m(k) = m(k-1) + g(k)\,\Delta u(k-1)$ → $m(k) = m(k-1) + G\,\Delta u(k-1)$
Take the z-transform to convert from the time domain to the frequency domain
You can then do algebraic manipulation rather than solving difference (or differential) equations
Apply the z-transform to $m(k) = m(k-1) + G\,\Delta u(k-1)$:
$M(z) = z^{-1} M(z) + G z^{-1} \Delta U(z)$
$M(z) = \frac{G}{z-1}\,\Delta U(z)$
Transfer function $T(z)$ = output/input = $M(z)/\Delta U(z) = \frac{G}{z-1}$
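The transfer function G/(z-1) describes an accumulator (a discrete integrator): every utilization adjustment is scaled by G and added to the previous miss ratio. A minimal sketch that iterates the difference equation directly is shown here; the values of `G`, `m0`, and the input sequence are made up for illustration, and the model ignores the fact that a real miss ratio saturates at 0 and 1.

```python
def simulate_miss_ratio(m0, G, delta_u):
    """Iterate m(k) = m(k-1) + G * delta_u(k-1); returns [m(0), m(1), ...]."""
    m = [m0]
    for du in delta_u:
        m.append(m[-1] + G * du)
    return m


# A constant utilization increase produces a ramp in the miss ratio.
inputs = [0.1] * 4
trace = simulate_miss_ratio(m0=0.0, G=0.5, delta_u=inputs)
closed_form = 0.0 + 0.5 * sum(inputs)   # m(k) = m(0) + G * sum of all adjustments
```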
Utilization Control Model
The miss ratio controller by itself is not stable
The MR controller is saturated when utilization is less than 1 if EDF is used, since the miss ratio stays at 0 there and carries no feedback
In their later work, the authors added a utilization controller
The utilization controller works when U ≤ 1; the miss ratio controller works when U > 1
When U ≤ 1, turn the utilization controller on and the MR controller off
When U > 1, turn the MR controller on and the utilization controller off
Good idea?
Controller Tuning
Given the control model shown in the previous slides, apply the Root Locus method to graphically tune the controller in Matlab so that it provides stability and the specified transient performance, such as overshoot and settling time
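For the first-order model m(k) = m(k-1) + GΔu(k-1), the effect of tuning can be worked out by hand. Assuming a purely proportional controller Δu(k) = Kp·(set point - m(k)), which the slides do not specify, the closed loop has a single pole at 1 - Kp·G, and the error shrinks by that factor every sampling period. The sketch below sweeps the hypothetical gain `kp` instead of drawing a root locus; `G = 2.0` and the 2% tolerance are illustrative choices.

```python
def closed_loop_pole(kp, G):
    """Pole of m(k) = (1 - kp*G) m(k-1) + kp*G * set_point."""
    return 1.0 - kp * G


def is_stable(kp, G):
    """A discrete-time loop is stable when the pole lies inside the unit circle."""
    return abs(closed_loop_pole(kp, G)) < 1.0


def settling_periods(kp, G, tol=0.02):
    """Sampling periods until the normalized error stays within tol, or None if unstable."""
    pole = closed_loop_pole(kp, G)
    if abs(pole) >= 1.0:
        return None
    error, k = 1.0, 0
    while abs(error) > tol:
        error *= pole          # e(k) = pole * e(k-1)
        k += 1
    return k


G = 2.0
sweep = {kp: (closed_loop_pole(kp, G), settling_periods(kp, G))
         for kp in (0.25, 0.5, 0.75, 1.0)}
```

A pole between 0 and 1 gives a monotone response, a pole between -1 and 0 oscillates around the set point (overshoot), and a pole at or beyond -1 does not settle.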
Feedback performance control in software services
T.F. Abdelzaher, J.A. Stankovic, C. Lu, R. Zhang, and Y. Lu, Feedback Performance Control in Software Services, IEEE Control Systems, 23(3): 74-90, June 2003.
Overview
SW systems are becoming larger and more complex
Performance guarantees are required, e.g., in web-based e-commerce
Control theory
Promising theoretical foundation for perf control in complex SW applications, e.g., real-time scheduling, web servers, multimedia control, storage managers, power management, routing in computer networks, …
Overview
Software performance assurance problems
Feedback control problems, focused on web server performance guarantee problems
Data centers
SW performance control
Less rigorous guarantees on perf and quality
Most SW eng. research deals with the development of functionally correct SW
Functional correctness is not enough!
Timeliness in embedded systems: a correct but delayed action can be disastrous
Non-functional QoS attributes, e.g., timeliness, security, availability, …
Traditional approaches for perf guarantees
Worst-case estimates of load & resource availability
Recall EDF, RM, DM, Priority Ceiling Protocol, …
New demand for performance assurance
QoS guarantees are required in a broader scope of applications that run in open, unpredictable environments
Global communication networks enabling online banking, trading, distance learning, …
Points of massive aggregation suffering unpredictable loads, potential bottlenecks, DoS attacks, …
-> Precise workload/system model unknown a priori
Failure to meet QoS requirements -> loss of customers or financial damages
Worst-case analysis/overdesign could be overly pessimistic or wasteful
A solid analytic framework for cost-effective perf assurance is required
Challenges
How to model SW architecture?
How to map a specific QoS problem into a feedback control system?
How to choose proper SW sensors and actuators to monitor and adjust perf and workloads/resource allocation?
How to design controllers for servers?
-> This paper focuses on web servers
QoS metrics
Delay metrics
Proportional to time: queuing delays, execution latencies, service response time
Rate metrics
Inversely proportional to time: connection bandwidth, throughput, packet rate
Time-related perf attributes
Can be controlled by adjusting resource allocation
Queuing theory can predict perf given a particular resource allocation or vice versa
Queuing theory largely relies on Poisson arrival patterns
Queuing theory can only predict average perf even if this assumption holds
Arrival patterns in web applications follow a heavy-tailed distribution -> bursty arrival patterns
Service architecture
Fig. 1 Server architecture: (a) computing model (b) control-oriented representation
Liquid task model
Liquid task model
Ci << Di
Takes Ci units of time to serve request i
Di is the max tolerable response time
Tolerable response time is finite
Service times are infinitesimal
Progress of requests through the server queues ≈ fluid flow
Service rate at stage k = dNk(t)/dt, where Nk is #requests processed by stage k
Liquid task model
Volume at time T ≈ #requests queued at stage k = $\int_{0}^{T} (F_{in} - F_k)\,dt$
Fk: service rate at stage k
Fin: request arrival rate to this stage
Valves: points of control, i.e., manipulated variables such as the queue length
Liquid model does not describe how individual requests are prioritized
Control theory can be combined with queuing theory or real-time scheduling
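The volume integral of the liquid model can be approximated numerically. The sketch below is an assumption-laden illustration: the rate functions, the step size `dt`, and the clamp at zero (a queue cannot hold a negative number of requests) are added here and are not part of the slides.

```python
def queue_volume(arrival_rate, service_rate, horizon, dt=0.01):
    """Approximate the integral of (Fin - Fk) over [0, horizon] with a fixed time step."""
    volume = 0.0
    steps = int(round(horizon / dt))
    for i in range(steps):
        t = i * dt
        volume += (arrival_rate(t) - service_rate(t)) * dt
        volume = max(volume, 0.0)   # assumption: the queue cannot go below empty
    return volume


def bursty_arrivals(t):
    """Hypothetical load: 120 requests/s for the first 5 s, then 80 requests/s."""
    return 120.0 if t < 5.0 else 80.0


def constant_service(t):
    """Hypothetical stage that serves 100 requests/s."""
    return 100.0


backlog_at_5s = queue_volume(bursty_arrivals, constant_service, horizon=5.0)
backlog_at_10s = queue_volume(bursty_arrivals, constant_service, horizon=10.0)
```

Under these made-up rates the backlog builds up to about 100 requests during the burst and drains again once arrivals drop below the service rate.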
Server modeling
Difference equation to model web servers
y(k): perf, e.g., delay or throughput, measured at the kth sampling period
u(k): control input at the kth sampling period
ARMA (Auto Regressive Moving Average) model
$y(k) = a_1 y(k-1) + a_2 y(k-2) + \dots + a_n y(k-n) + b_1 u(k-1) + b_2 u(k-2) + \dots + b_n u(k-n)$
n: system order; a higher-order model is usually (not always!) more accurate but more complex
Transfer function can be derived
Web proxy cache model [4]
TCP dynamics [5]
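To make the difference equation concrete, the following sketch evaluates the ARMA recursion for given coefficient lists. The coefficients and the unit-step input are illustrative values only, and samples before k = 0 are assumed to be zero.

```python
def arma_output(a, b, u):
    """Compute y(k) = sum_i a[i-1]*y(k-i) + sum_i b[i-1]*u(k-i) for k = 0 .. len(u)-1."""
    n = len(a)
    y = []
    for k in range(len(u)):
        yk = 0.0
        for i in range(1, n + 1):
            if k - i >= 0:                      # history before k = 0 is taken as zero
                yk += a[i - 1] * y[k - i] + b[i - 1] * u[k - i]
        y.append(yk)
    return y


# First-order example (n = 1): y(k) = 0.5*y(k-1) + 1.0*u(k-1), driven by a unit step.
response = arma_output(a=[0.5], b=[1.0], u=[1.0] * 6)
steady_state = 1.0 / (1.0 - 0.5)   # b1 / (1 - a1) for a stable first-order model
```

The response climbs 0, 1, 1.5, 1.75, ... toward the steady-state value of 2.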
Transfer function
Shows the relation between input and output
Apply z-transform to y(k) in the previous slide
Open loop transfer function vs. closed-loop transfer function
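As a worked step, applying the z-transform to the ARMA difference equation gives the open-loop transfer function directly. The symbols G(z) for the server, C(z) for the controller, and R(z) for the reference (set point) are names assumed here, and the closed-loop form assumes unity negative feedback.

```latex
\begin{aligned}
Y(z) &= \left(a_1 z^{-1} + \dots + a_n z^{-n}\right) Y(z)
      + \left(b_1 z^{-1} + \dots + b_n z^{-n}\right) U(z) \\
G(z) &= \frac{Y(z)}{U(z)}
      = \frac{b_1 z^{-1} + \dots + b_n z^{-n}}{1 - a_1 z^{-1} - \dots - a_n z^{-n}}
      && \text{(open loop)} \\
T(z) &= \frac{Y(z)}{R(z)}
      = \frac{C(z)\,G(z)}{1 + C(z)\,G(z)}
      && \text{(closed loop)}
\end{aligned}
```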
Resource allocation for QoS guarantees
Allocate more/less resource = open/close a valve
Need actuators to control resource allocation or QoS provided by the system
SW system actuators
Input flow actuators
Admission control
Control queue length, server utilization, …
Reject some requests under overload
SW system actuators
Quality adaptation actuators
Change processing requirements to increase server rate under overload
E.g., return an abbreviated web page under overload
Tradeoff between delay & quality
Service level m in a range [0, M] where 0 is rejection
Resource reallocation actuator
Alter the amount of allocated resources
Usually applicable to multiple classes of clients, e.g., dynamically reallocate disk space for differentiated web caching to support the service delay ratio 1:2 between two service classes [4,7]
QoS Mapping
Convert common resource management & SW perf assurance problems to FC problems
Absolute convergence guarantee
Relative guarantee
Resource reservation guarantee
Prioritization guarantee
Statistical multiplexing guarantee
Utility optimization guarantee
Absolute convergence guarantee
Convergence to the specified performance
Overshoot: maximum deviation
Settling time: time taken to recover the desired perf
Absolute convergence guarantee
Rate & queue length control
Result in linear FC
(Flow) rate can be directly controlled by actuators
Queue length can be linearly controlled by controlling the flow
E.g., server utilization control loop
Absolute convergence guarantee
Delay control
More difficult
Delay is inversely proportional to flow
Queuing delay d = Q/r, where Q is queue length & r is service rate
Nonlinear
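A quick numeric check shows why d = Q/r is awkward for a linear controller: equal increases in the service rate buy smaller and smaller delay reductions. The queue length and rates below are arbitrary illustrative numbers.

```python
def queuing_delay(queue_length, service_rate):
    """Queuing delay d = Q / r."""
    return queue_length / service_rate


Q = 100                       # queued requests (illustrative)
rates = (10, 20, 30, 40)      # service rates in requests/s, equally spaced
delays = [queuing_delay(Q, r) for r in rates]

# Delay reduction obtained from each additional 10 requests/s of service rate.
reductions = [delays[i] - delays[i + 1] for i in range(len(delays) - 1)]
```

The first step from 10 to 20 requests/s halves the delay, while the step from 30 to 40 requests/s removes less than a second, so the gain seen by the controller depends on the operating point.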
Relative guarantee
For example, fix the delays of two traffic classes at a ratio of 1:3
Hi: measured perf of class i
Ci: weight of class i
Relative guarantee specifies H1:H2 = 1:3
Set point = 1/3
Error e = 1/3 - H1/H2
Controlled variable: relative delay ratio
Manipulated variable: #allocated processes per class to control connection delay
HTTP protocol summary
A client, e.g., a web browser, establishes a TCP connection with a server process
The client submits an HTTP request to the server over the TCP connection
The server sends the response back to the client
The TCP connection is kept open for the Keep Alive interval, e.g., 15s
-> Claim: connection delay dominates service response time
-> Scheduling can also significantly affect the relative delay ratio, but it is not considered
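The relative guarantee loop (set point 1/3, error e = 1/3 - H1/H2, manipulated variable = processes per class) can be sketched as follows. Everything beyond those three items is an assumption made for illustration: the toy plant in which a class's delay equals its load divided by its allocated processes, the gain of 36, the pool of 128 processes, and the use of fractional allocations instead of whole processes.

```python
def delay_ratio(n1, total, load1=1.0, load2=1.0):
    """Toy plant (assumption): delay of a class = its load / its allocated processes."""
    h1 = load1 / n1
    h2 = load2 / (total - n1)
    return h1 / h2


def relative_delay_control(total, n1, gain, set_point=1.0 / 3.0, periods=40):
    """Shift processes between the two classes until H1/H2 reaches the set point."""
    for _ in range(periods):
        error = set_point - delay_ratio(n1, total)    # e = 1/3 - H1/H2
        n1 -= gain * error                            # e < 0: class 1 is too slow, give it more
        n1 = min(max(n1, 1.0), total - 1.0)           # keep at least one process per class
    return n1


n1_final = relative_delay_control(total=128, n1=64.0, gain=36.0)
final_ratio = delay_ratio(n1_final, 128)
```

With equal loads, the toy plant reaches H1:H2 = 1:3 when class 1 holds three quarters of the processes (96 of 128).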
Relative guarantee in Apache web server
Relative guarantee in Apache web server
System identification based on the ARMA model (least squares method)
Also called System Identification (SYSID) in control theory
Randomly change per-class process allocations
Measure response time
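A least squares fit of this kind can be sketched for a first-order model y(k) = a·y(k-1) + b·u(k-1). The example is synthetic: the "true" coefficients 0.6 and 0.3, the random input range, and the noise-free data are assumptions chosen so the fit can be checked, not measurements from the Apache experiments.

```python
import random


def simulate_first_order(a, b, u):
    """Generate y(k) = a*y(k-1) + b*u(k-1) with y(0) = 0."""
    y = [0.0]
    for k in range(1, len(u)):
        y.append(a * y[k - 1] + b * u[k - 1])
    return y


def least_squares_first_order(u, y):
    """Fit a, b in y(k) = a*y(k-1) + b*u(k-1) by solving the 2x2 normal equations."""
    s_yy = s_yu = s_uu = r_y = r_u = 0.0
    for k in range(1, len(y)):
        s_yy += y[k - 1] * y[k - 1]
        s_yu += y[k - 1] * u[k - 1]
        s_uu += u[k - 1] * u[k - 1]
        r_y += y[k] * y[k - 1]
        r_u += y[k] * u[k - 1]
    det = s_yy * s_uu - s_yu * s_yu
    a_hat = (r_y * s_uu - r_u * s_yu) / det
    b_hat = (r_u * s_yy - r_y * s_yu) / det
    return a_hat, b_hat


rng = random.Random(0)
u = [rng.uniform(1.0, 10.0) for _ in range(200)]   # randomly varied input, as in SYSID
y = simulate_first_order(0.6, 0.3, u)              # stand-in for measured response times
a_hat, b_hat = least_squares_first_order(u, y)
```

Because the synthetic data contain no noise, the fit recovers the generating coefficients; with real measurements the estimates would only approximate the system.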
Relative guarantee in Apache web server
Perf settings
4 Linux machines run the Surge web workload generator
1 Linux machine runs the Apache web server
Suddenly increase #premium clients by 100 at time 870s
Relative guarantee in Apache web server
Perf results
[Figure: measured results, panels "Open Loop" and "Closed Loop"]
Stable?
Related work
ControlWare
CPU scheduling
Storage management
Network routers
Power/heat management
RTDB
Conclusions
Feedback control is applicable to managing performance in SW systems
Future work
Adaptive/robust control
Predictive control
Apply to other computational systems such as embedded systems
Adaptive Control: Self-Tuning Regulator
Dynamically estimate a model of the system via the Recursive Least Squares method
The controller will accordingly set the actuators to support the desired perf.
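A minimal self-tuning regulator can be sketched as a recursive least squares (RLS) estimator feeding a one-step-ahead controller. This is an illustration under assumptions, not the design used in the cited work: the first-order plant y(k) = a·y(k-1) + b·u(k-1), its coefficients, the initial guess, the covariance `p0`, and the control law u = (set point - â·y)/b̂ are all chosen here for the example.

```python
class RecursiveLeastSquares:
    """RLS estimator for theta = [a, b] in y(k) = a*y(k-1) + b*u(k-1)."""

    def __init__(self, theta, p0=1000.0, forgetting=1.0):
        self.theta = list(theta)
        self.P = [[p0, 0.0], [0.0, p0]]      # large covariance: low trust in the initial guess
        self.lam = forgetting

    def update(self, phi, y):
        P = self.P
        p_phi = [P[0][0] * phi[0] + P[0][1] * phi[1],
                 P[1][0] * phi[0] + P[1][1] * phi[1]]
        denom = self.lam + phi[0] * p_phi[0] + phi[1] * p_phi[1]
        gain = [p_phi[0] / denom, p_phi[1] / denom]
        error = y - (self.theta[0] * phi[0] + self.theta[1] * phi[1])
        self.theta = [self.theta[0] + gain[0] * error,
                      self.theta[1] + gain[1] * error]
        # P <- (P - gain * phi^T P) / lambda; phi^T P equals p_phi because P is symmetric
        self.P = [[(P[i][j] - gain[i] * p_phi[j]) / self.lam for j in range(2)]
                  for i in range(2)]


def self_tuning_run(a_true, b_true, set_point, periods):
    """Estimate the plant online and pick u so the predicted next output hits the set point."""
    estimator = RecursiveLeastSquares(theta=[0.0, 1.0])
    y, u = 0.0, 0.0
    for _ in range(periods):
        y_next = a_true * y + b_true * u          # plant, unknown to the controller
        estimator.update([y, u], y_next)          # refine the model from the new sample
        y = y_next
        a_hat, b_hat = estimator.theta
        if abs(b_hat) < 1e-6:                     # guard against dividing by ~0
            b_hat = 1e-6
        u = (set_point - a_hat * y) / b_hat       # actuator setting from the current model
    return y, estimator.theta


y_final, (a_est, b_est) = self_tuning_run(a_true=0.8, b_true=0.5, set_point=1.0, periods=30)
```

In this noise-free toy setting the output settles at the set point within a few sampling periods while the estimates approach the plant coefficients; real systems would need noise handling and safeguards on the control signal.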
References (HP Storage Systems Lab)
Designing controllable computer systems, Christos Karamanolis, Magnus Karlsson and Xiaoyun Zhu. USENIX Workshop on Hot Topics in Operating Systems (HotOS), June 2005, pp. 49-54, Santa Fe, NM.
Dynamic black-box performance model estimation for self-tuning regulators, Magnus Karlsson and Michele Covell. International Conference on Autonomic Computing (ICAC), pp. 172-182, June 2005, Seattle, WA.
Autonomic Computing
General, broader research issues regarding self-tuning, self-managing, self-* systems
Autonomic computing web site: http://autonomiccomputing.org/
IBM: http://www.research.ibm.com/autonomic/index.html
Adaptive Systems Department
Some University Labs
Tarek Abdelzaher: http://www.cs.uiuc.edu/homes/zaher/
Chenyang Lu: http://www.cse.wustl.edu/~lu/
Next class
We will discuss papers from our RTES Lab on feedback control of software systems
K. D. Kang, J. Oh, Y. Zhou, "Backlog Estimation and Management for Real-Time Data Services", 20th Euromicro Conference on Real-Time Systems (ECRTS '08), July 2-4, Prague, Czech Republic.
C. Basaran, K. D. Kang, M. H. Suzer, K. S. Chung, H. R. Lee, K. R. Park, "Bandwidth Consumption Control and Service Differentiation for Video Streaming," 17th International Conference on Computer Communications and Networks (ICCCN '08), August 3 - 7, 2008, St. Thomas U.S. Virgin Islands.
Questions?