-
Curbing Aggregate Member Flow Burstiness to Bound End-to-End Delay in Networks of
TDMA Crossbar Real-Time Switches
Qixin Wang*, Yufei Wang*, Rong Zheng†,*, Xue Liu‡
* Dept. of Computing, the Hong Kong Polytechnic Univ.
† Dept. of Computer Science, Univ. of Houston.
‡ School of CS, McGill Univ.
December 5, 2012
-
Content
Demand
Background
Problem: Per-Aggregate Queueing
Solution: Real-Time Aggregate
Evaluation
Related Work
-
Real-Time Networks (RTN)
Specialized networks used in industrial, mining, medical, vehicular, avionic environments
Typically serves continuous sensing/actuating feedback control traffic: (Period, Workload, Deadline)
RTN is fundamentally different from the Internet, hence requires different solutions.
RTN: (Hard) Real-Time; Periodic Stable Traffic; Bounded w/ Global Info
The Internet: Best Effort; Random Bursty Traffic; Unbounded w/o Global Info
-
RTN is evolving from shared medium to multi-hop switched due to scalability needs.
Concorde, 1976
A380, 2007: several hundred computing nodes
Robotic Manufacturing
Tele-robotic underground mining saves lives: 2000ft (609.6m); > 5000 annual death toll
Telepresence
Software Defined Networks (SDN)
-
A widely adopted RTN switch architecture: TDMA Crossbar RT (TCRT) switches
[Figure: crossbar switch with Input Ports I1, I2, I3 and Output Ports O1, O2, O3]
Per-Flow-Queueing: input ports hold cells in per-flow queues.
Synchronous periodic cell forwarding; the period is one Cell-Time.
Why Matching? An input/output can only send/receive one cell per cell-time.
Internal Matching: if an input has multiple per-flow queues for the same output, only one is picked every cell-time.
-
A widely adopted RTN switch architecture: TDMA Crossbar RT (TCRT) switches
TDMA scheduling frame of M cell-time, e.g., M = 5
Fit all real-time flows’ periods into frame, e.g., (11, 3) → (5, 2), i.e., (10, 4)
[Figure: Demand matrix. Rows: input ports I1 to I4; columns: cell time 1 to 5; each colored entry is a cell to send to O1, O2, O3, or O4.]
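The frame-fitting example can be checked with a few lines of Python. This is a minimal sketch that assumes a ceiling rule: reserve enough cells per frame that the reserved rate is at least the demanded rate. The function name `cells_per_frame` and the rule itself are illustrative assumptions, not taken from the slides.

```python
def cells_per_frame(period, workload, frame_len):
    """Cells to reserve in every frame of `frame_len` cell-times so that a flow
    needing `workload` cells every `period` cell-times is never under-served.

    Assumption (not from the slides): plain ceiling of workload * frame_len / period.
    """
    # Integer ceiling division avoids floating-point rounding.
    return -((-workload * frame_len) // period)


# The slide's example: (period, workload) = (11, 3) with M = 5.
reserved = cells_per_frame(period=11, workload=3, frame_len=5)
print(reserved)                # cells per 5-cell-time frame
print((2 * 5, 2 * reserved))   # the same reservation seen over two frames
```

Under this rule the flow (11, 3) receives 2 cells per frame of 5 cell-times, which is 4 cells per 10 cell-times.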
-
A widely adopted RTN switch architecture: TDMA Crossbar RT (TCRT) switches
Demand → Scheduling Algorithm → Schedule
[Figure: Demand and Schedule matrices. Rows: input ports I1 to I4; columns: cell time 1 to 5; each colored entry is a cell to send to O1, O2, O3, or O4.]
Theorem 1: If every color in the demand matrix has ≤ M cells, then there is a configuration-time scheduler with $O(N^4)$ time cost [wang10].
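To make the demand-to-schedule step concrete, here is a hedged sketch that turns a demand matrix into a conflict-free frame using textbook bipartite edge coloring with alternating-path swaps. It is not claimed to be the scheduler of [wang10]; the function name `schedule_frame` and the sample demand matrix are made up for illustration.

```python
def schedule_frame(demand, frame_len):
    """Return sched[i][t]: the output that input i sends to in cell-time t (or None).

    demand[i][j] is the number of cells per frame from input i to output j.
    """
    n_in, n_out = len(demand), len(demand[0])
    # Precondition: no input row and no output column (color) exceeds the frame.
    if any(sum(row) > frame_len for row in demand):
        raise ValueError("an input port demands more than frame_len cells")
    if any(sum(row[j] for row in demand) > frame_len for j in range(n_out)):
        raise ValueError("an output port demands more than frame_len cells")

    in_at = [[None] * frame_len for _ in range(n_in)]    # in_at[i][t] -> output
    out_at = [[None] * frame_len for _ in range(n_out)]  # out_at[j][t] -> input

    for i in range(n_in):
        for j in range(n_out):
            for _ in range(demand[i][j]):
                a = in_at[i].index(None)    # a cell-time still free at input i
                b = out_at[j].index(None)   # a cell-time still free at output j
                if out_at[j][a] is not None:
                    # Output j is busy in slot a: collect the alternating a/b
                    # path that starts at output j, then swap a and b along it.
                    path, node, at_output, slot = [], j, True, a
                    while True:
                        peer = out_at[node][slot] if at_output else in_at[node][slot]
                        if peer is None:
                            break
                        edge = (peer, node) if at_output else (node, peer)
                        path.append((edge[0], edge[1], slot))  # (input, output, slot)
                        node, at_output = peer, not at_output
                        slot = b if slot == a else a
                    for pi, pj, s in path:      # erase the old slots first
                        in_at[pi][s] = None
                        out_at[pj][s] = None
                    for pi, pj, s in path:      # then write the swapped slots
                        s2 = b if s == a else a
                        in_at[pi][s2] = pj
                        out_at[pj][s2] = pi
                # Slot a is now free at both input i and output j.
                in_at[i][a] = j
                out_at[j][a] = i
    return in_at


# Hypothetical 4x4 demand with M = 5: every row and column sums to 5.
demand = [[2, 1, 1, 1], [1, 2, 1, 1], [1, 1, 2, 1], [1, 1, 1, 2]]
sched = schedule_frame(demand, 5)
for i, row in enumerate(sched):
    print(f"I{i + 1}:", [f"O{j + 1}" if j is not None else "-" for j in row])
```

Each column of the printed schedule is one cell-time, and within a column no output appears twice, so every cell-time is a valid crossbar matching.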
-
A widely adopted RTN switch architecture: TDMA Crossbar RT (TCRT) switches
Support for Multicast
[Figure: crossbar with I1 to I3 and O1 to O3; a cell c is replicated to multiple output ports.]
-
TCRT switches have to reconcile the dilemma between flow aggregation (scalability) and flow isolation
Per-Flow Queueing: good isolation; poor scalability.
[Figure: input port I0 with per-flow queues f00 to f03, f10 to f11, f20 to f22, f30 to f33; the schedule of O2 serves f20, f21, f22, ...]
Single Aggregate Queueing: good scalability; poor isolation, so a tight RT E2E delay bound is an open problem.
Per-Aggregate Queueing: good scalability; poor isolation, so a tight RT E2E delay bound is an open problem.
[Figure: input port I0 with per-aggregate queues Q000, Q001, Q010, Q011, Q020, Q021, Q030, Q031; the schedule of O2 serves Q020, Q021, ...]
-
[Figure: multi-hop network of Switch 1 to Switch 6; labeled clock domains: Switch 1 Clk Domain to Switch 5 Clk Domain.]
Missed match due to Clk Skew
Slot idle due to bursty flow
The infected flow becomes bursty, and may further infect other flows.
A tight bound is unknown
-
Heuristics: per-aggregate queueing provides spatial isolation; how about adding some temporal isolation?
Per-Aggregate Queueing: sentences without punctuation.
Real-Time Aggregate Queueing: dummy cells are added as sentence punctuation, delimiting V-Frames.
Dummy Cell (Sync Cell): blocks the output port’s grant until the next scheduling frame starts.
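The effect of a dummy cell on an output port can be illustrated with a toy simulation. This is a sketch under stated assumptions: the names `SYNC` and `forward` are hypothetical, one aggregate queue is modeled in isolation, and a dummy cell is assumed to occupy one granted slot. The slides do not specify these details.

```python
from collections import deque

SYNC = "SYNC"  # hypothetical marker for a dummy (sync) cell


def forward(queue, slots_per_frame, num_frames):
    """Return, per scheduling frame, the data cells the output port grants.

    Assumption: reaching a dummy cell ends the grants for this aggregate
    until the next scheduling frame starts.
    """
    pending = deque(queue)
    frames = []
    for _ in range(num_frames):
        sent = []
        for _ in range(slots_per_frame):
            if not pending:
                break
            cell = pending.popleft()
            if cell == SYNC:
                break  # grant is blocked until the next frame
            sent.append(cell)
        frames.append(sent)
    return frames


with_sync = ["a1", "a2", SYNC, "b1", SYNC, "c1", "c2", "c3", SYNC]
without_sync = [c for c in with_sync if c != SYNC]

print(forward(with_sync, slots_per_frame=4, num_frames=3))
print(forward(without_sync, slots_per_frame=4, num_frames=3))
```

With dummy cells, each V-Frame leaves in its own scheduling frame; without them, the first frame drains cells from three V-Frames at once, which is the burst that real-time aggregate queueing is meant to curb.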
-
Holistic view of a real-time aggregate
[Figure: the aggregate's path through the 0th, 1st, 2nd, ..., (k-2)th, (k-1)th, kth, and (k+1)th switches.]
-
Dummy cells are injected (using multicast) at aggregator. O(1) runtime complexity.
0th Switch, Output Port O0: inserts dummy cells into the aggregates it creates.
-
Each subsequent output port, when forwarding (i.e. granting) aggregate’s cells, respects dummy cells.
1st Switch, 2nd Switch, ..., (k-2)th Switch, (k-1)th Switch
-
Dummy cells are deleted at segregator. O(1) runtime complexity.
At the segregator, dummy cells are respected but not forwarded, data cells go to their corresponding next aggregator.
kth Switch
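The two ends of an aggregate can be sketched as follows. This is an illustrative model only: cells are `(flow_id, payload)` tuples, `SYNC` stands for a dummy cell, and the names `aggregate`, `segregate`, and `next_aggregator` are assumptions. The multicast mechanism the slides mention for injection is not modeled.

```python
from collections import defaultdict

SYNC = "SYNC"  # hypothetical marker for a dummy (sync) cell


def aggregate(frames):
    """Aggregator: append one dummy cell after each frame's worth of data cells.

    `frames` is a list of lists; each inner list holds the (flow_id, payload)
    cells admitted in one scheduling frame. One O(1) append per frame.
    """
    stream = []
    for cells in frames:
        stream.extend(cells)
        stream.append(SYNC)
    return stream


def segregate(stream, next_aggregator):
    """Segregator: drop dummy cells, hand each data cell to its next aggregate."""
    queues = defaultdict(list)
    for cell in stream:
        if cell == SYNC:
            continue  # dummy cells are not forwarded past the segregator
        flow_id, _payload = cell
        queues[next_aggregator[flow_id]].append(cell)
    return dict(queues)


stream = aggregate([[("f1", 0), ("f2", 0)], [("f1", 1)]])
print(stream)
print(segregate(stream, {"f1": "A", "f2": "B"}))
```

In this model a dummy cell lives only between the aggregator that created it and the segregator that removes it.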
-
Another view: life cycle of dummy cells.
The set of aggregates that O creates.
{ F | f joins F after leaving X }
-
We realized the aforementioned ideas by revising TCRT switch output/input port algorithms.
Input Port Alg.
Output Port Alg.
-
We proposed a resource allocation scheme that can provide E2E delay bound.
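Theorem 2: $L_f(k-1) - L_f(0) \le k P_{\max}$
Theorem 3: $L_f(k) - L_f(0) \le (k+3) P_{\max} + 2\tau_{\max}$
[Figure: timeline marking $L_f(0)$, $L_f(k-1)$, and $L_f(k)$.]
$P_{\max}$: max scheduling frame period; M cell-time per frame. $\tau_{\max}$: max cell-time; $\tau_{\min}$: min cell-time; $\tau_{\max} - \tau_{\min} < \tau_{\min}/M$.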
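As a worked instance of Theorem 3, the bound can be evaluated numerically. The sketch below plugs in the values from the evaluation settings slide (Pmax = 0.1 ms, cell-times of 50 ns and 49.98 ns, M = 2000); the function names and the choice k = 4 are illustrative assumptions.

```python
def e2e_delay_bound(k, p_max, tau_max):
    """Right-hand side of Theorem 3: (k + 3) * Pmax + 2 * tau_max.

    All times must share one unit (nanoseconds below).
    """
    return (k + 3) * p_max + 2 * tau_max


def clocks_compatible(tau_max, tau_min, m):
    """The clock condition tau_max - tau_min < tau_min / M stated with the theorems."""
    return tau_max - tau_min < tau_min / m


P_MAX_NS = 100_000   # 0.1 ms
TAU_MAX_NS = 50
TAU_MIN_NS = 49.98
M = 2000

print(clocks_compatible(TAU_MAX_NS, TAU_MIN_NS, M))
print(e2e_delay_bound(k=4, p_max=P_MAX_NS, tau_max=TAU_MAX_NS), "ns")
```

For k = 4 the bound is 7 frame periods plus two cell-times, that is 700,100 ns, or about 0.7 ms. The frame period dominates; the cell-time term is negligible.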
-
We compare two aggregate schemes: Per-Aggregate Queueing and RT-Aggregate
-
In an $E \times E$ grid network of TCRT switches, we overlay $1 + \log_2 E$ layers of aggregates.
-
Evaluation Settings: 500 bit/cell; M = 2000 (cell-time/frame); $\tau_{\max}$ = 50 ns, $P_{\max}$ = 0.1 ms; $\tau_{\min}$ = 49.98 ns, $P_{\min}$ = 0.09996 ms
-
100 trials for each aggregate scheme; each trial randomly adds RT flows until no more is schedulable.
90% sensing/actuating flows: 5Mbps; 10% video flows: 80Mbps.
Uniform distribution of source and destination ends.
Routed via aggregate layout per Dijkstra.
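The evaluation settings imply a few numbers that are easy to verify. The arithmetic below is a sketch: the per-frame reservation rule (ceiling of rate times Pmax over cell size) is an assumption for illustration, not the resource allocation scheme from the slides.

```python
CELL_BITS = 500
TAU_MAX_NS = 50
M = 2000
P_MAX_NS = M * TAU_MAX_NS          # frame period implied by M cell-times


def link_rate_gbps():
    """One 500-bit cell per 50 ns cell-time; bits per ns equals Gbps."""
    return CELL_BITS / TAU_MAX_NS


def cells_needed_per_frame(rate_bps):
    """Assumed rule: enough cells per frame to carry rate_bps over one Pmax."""
    bits_times_1e9 = rate_bps * P_MAX_NS            # bits per frame, scaled by 1e9
    return -((-bits_times_1e9) // (CELL_BITS * 10**9))  # integer ceiling


print(P_MAX_NS, "ns per frame")
print(link_rate_gbps(), "Gbps per link")
print(cells_needed_per_frame(5_000_000), "cell(s) per frame for a 5 Mbps flow")
print(cells_needed_per_frame(80_000_000), "cells per frame for an 80 Mbps flow")
```

Under this rule a 5 Mbps sensing/actuating flow needs 1 of the 2000 cells in a frame and an 80 Mbps video flow needs 16.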
-
Worst case E2E delay bound statistics (dot: mean of the corresponding 100 trials; error bars: 95% confidence range)
-
Average Physical Link Utilization Statistics (dot: mean of the corresponding 100 trials; error bars: 95% confidence range)
-
Related Work
Pinwheel [holte89]: also TDMA scheduling; focus on CPU or independent multi-CPU; focus on optimizing TDMA period.
Hierarchical Scheduling [deng97]: focus on CPU scheduling; recent works on output queueing switches [santos11], not input queueing TCRT switches.
Non-Work-Conserving Switch Scheduling (e.g. Stop-and-Go) [golestani90]: focus on output queueing, not TCRT switch architecture. Few works on aggregate, burstiness growth/infection control.
Drastically different from Internet (see demand analysis).
Rate based network QoS (e.g. WFQ) [parekh93]: assume output queueing, not customized for TCRT; need time-stamping and packet sorting.
ATM TDMA switches [rao12][chen11][wang10][dopatka07][chang99][leung97]: these are the ancestors that eventually evolve to TCRT. We are exactly talking about how to support hard real-time aggregate in such switches.
-
Related Work
Real-time LANs [can93][fischmeister09]: most focus on shared medium single hop;
TTEthernet [ttethernet08] is open to switch architectures, assuming underlying multi-hop switched network already guarantees E2E delay bound, hence our real-time aggregate enabled TCRT switches can serve TTEthernet;
IEC 61784 [iec61784-2_10] defines a set of communication profiles on flow aggregate, but is also open to specific implementations;
IEEE 802.1 AVB Task Group IEEE 802.1Qav Specifications [802.1qav09] proposes a flow aggregate for multi-hop switched networks. Designed for output queueing work-conserving switch architecture with prioritized scheduling. Ours is designed for input queueing non-work-conserving TDMA crossbar switch architecture.
Scharbarg et al. [scharbarg09] gives a probabilistic E2E delay bound for AFDX [afdx05] aggregates. Ours focuses on deterministic E2E delay bound instead.
-
Related Work
On how to realize and analyze real-time flow aggregate.
MPLS [mpls12] is a flow labeling mechanism for aggregate based routing. Focuses on Layer 2.5 (above Layer 2, Data Link Layer); while ours is a Layer 2 design, in other words, our real-time aggregates can serve MPLS.
Guaranteed Rate (GR) server [sun05]: focus on output queueing, time-stamping and packet sorting based switches.
FIFO based aggregates (e.g. DiffServ) [peterson07]: sufficient schedulability test methods exist [boudec01], but utilization is low and delay bounds are large [boudec01][wang04]. As Wang et al. [wang04] point out, schedulability is very susceptible to rogue bursty traffic, mainly due to lack of isolation in FIFO. A generic schedulability test and tight E2E delay bound are still open problems.
-
Conclusion
1. Proposed “real-time aggregate” for TCRT switches:
1.1. exploits TCRT switch’s features;
1.2. deploys spatial-temporal isolations;
1.3. provides E2E delay bound.
2. Proposed the corresponding resource allocation/admission-control scheme.
3. Derived the closed form E2E delay bound.
4. Achieves much higher utilization and shorter E2E delay bound than per-aggregate queueing.
-
Thank You!
-
References
[802.1qav09] IEEE Standard 802.1Qav, 2009.
[afdx05] Aircraft Data Network Part 7: Avionics Full Duplex Switched Ethernet (AFDX) Network. ARINC 664, Jun. 2005.
[boudec01] J.-Y. L. Boudec and P. Thiran, Network Calculus: A Theory of Deterministic Queuing Systems for the Internet. Springer, 2001.
[can93] Road Vehicles - Exchange of Digital Information - Controller Area Network (CAN) for High-Speed. ISO Std. 11898, 1993.
[chang99] C.-S. Chang, W.-J. Chen, and H.-Y. Huang, “On service guarantees for input buffered crossbar switches: a capacity decomposition approach by Birkhoff and von Neumann,” IEEE IWQoS’99, pp. 79–86, 1999.
[charny00] A. Charny et al., “Delay bounds in a network with aggregate scheduling,” in Intnl’ Workshop on Quality of Future Internet Services, 2000.
[chen11] L. Chen et al., “A real-time multicast routing scheme for multi-hop switched fieldbuses,” Proc. of INFOCOM’11, pp. 3209–3217, 2011.
[deng97] Z. Deng and J. W.-S. Liu, “Scheduling real-time applications in an open environment,” Proc. of IEEE RTSS’97, 1997.
[dopatka07] F. Dopatka et al., “Design of a realtime industrial ethernet network including hot-pluggable asynchronous devices,” Proc. of IEEE ISIE’07, 2007.
[fischmeister09] S. Fischmeister et al., “Hardware acceleration for conditional state-based communication scheduling on real-time ethernet,” IEEE Trans. on Industrial Informatics, vol. 5, no. 3, 2009.
[golestani90] S. Golestani, “A stop-and-go queueing framework for congestion management,” ACM SIGCOMM, pp. 8–18, Sep. 1990.
[holte89] R. Holte et al., “The pinwheel: A real-time scheduling problem,” Proc. of Annu. Hawaii Intl’ Conf. on Sys. Sci., vol. 2, 1989.
[iec61784-2_10] Industrial communication networks - Profiles - Part 2: Additional fieldbus profiles for real-time networks based on ISO/IEC 8802-3. IEC 61784-2 Ed. 2.0 b:2010, 2010.
[leung97] Y.-W. Leung and T.-S. Yum, “A TDM-based multibus packet switch,” IEEE Trans. on Communications, vol. 45, no. 7, pp. 859–866, Jul. 1997.
[mpls12] MPLS Working Group, IETF. http://www.ietf.org/html.charters/mpls-charter.html.
[parekh93] A. K. Parekh and R. G. Gallager, “A generalized processor sharing approach to flow control in integrated services network: the single node case,” IEEE/ACM TON, vol. 1, pp. 344–357, Jun. 1993.
[peterson07] L. L. Peterson and B. S. Davie, Computer Networks: A System Approach, 4th ed. Morgan Kaufmann, 2007.
[rao12] L. Rao et al., “Analysis of TDMA Crossbar Real-Time Switch Design for AFDX Networks,” Proc. of INFOCOM’12, Mar. 2012.
[santos11] R. Santos et al., “Multi-level hierarchical scheduling in ethernet switches,” Proc. of EMSOFT’11, pp. 185–194, 2011.
[scharbarg09] J.-L. Scharbarg et al., “A probabilistic analysis of end-to-end delays on an AFDX avionic network,” IEEE TII, vol. 5, no. 1, 2009.
[sun05] W. Sun and K. G. Shin, “End-to-end delay bounds for traffic aggregates under guaranteed-rate scheduling algorithms,” IEEE/ACM TON, vol. 13, no. 5, pp. 1188–1201, Oct. 2005.
[ttethernet08] TTEthernet Specification. TTTech Computertechnik AG, 2008.
[wang04] S. Wang et al., “Providing absolute differentiated services for real-time applications in static-priority scheduling networks,” IEEE/ACM TON, vol. 12, no. 2, 2004.
[wang10] Q. Wang et al., “Adapting a main-stream internet switch architecture for multi-hop real-time industrial networks,” IEEE TII, vol. 6, no. 3, 2010.
-
[Figure: multi-hop network of Switch 1 to Switch 6]
TCP uses packet/cell drop and feedback based congestion control.
* NOT hard real-time!
* No guarantee of zero cell loss.
* OK for Internet and telephone networks.
* Unacceptable for avionics and other life-critical CPS that need certification.
-
[Figure: multi-hop network of Switch 1 to Switch 6]
Conventional fair queueing QoS schemes (WFQ etc.) do not assume input queueing switch (iSLIP) architecture, and are too complex.