Improved Queue-Size Scaling for Input-Queued Switches via Graph Factorization

Jiaming Xu

The Fuqua School of Business, Duke University

Joint work with Yuan Zhong (Chicago Booth)

Mostly OM Workshop, June 2, 2019

Data Center Switches

Jiaming Xu (Duke) Queue-Size Scaling for Input-Queued Switches 2

[Figure: an HP data center switch, with the switch, its inputs, and its outputs labeled.]

Input-Queued Switch

[Figure: a 2 × 2 switch with Input 1, Input 2, Output 1, and Output 2.]

• $n \times n$ input-queued switch: $n$ inputs and $n$ outputs

• unit-sized packets

• $n^2$ queues: (input, output) ↔ queue

Matching constraints ($2n$ resource constraints):

• each input can connect to at most one output

• each output can connect to at most one input

Not allowed (the second column sums to 2, so one port would need two simultaneous connections):

$$\begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}$$

Allowed:

$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$


Allowed:

$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$

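The matching constraints can be checked mechanically. The sketch below is illustrative only: the function names `is_valid_schedule` and `full_matchings` are hypothetical, and it assumes a schedule is given as a 0/1 matrix whose entry (i, j) is 1 when input i is connected to output j.

```python
from itertools import permutations


def is_valid_schedule(schedule):
    """Return True if a 0/1 matrix satisfies the 2n matching constraints.

    Assumption: schedule[i][j] == 1 means input i is connected to output j.
    """
    n = len(schedule)
    # Each input (row) connects to at most one output.
    rows_ok = all(sum(row) <= 1 for row in schedule)
    # Each output (column) connects to at most one input.
    cols_ok = all(sum(schedule[i][j] for i in range(n)) <= 1 for j in range(n))
    return rows_ok and cols_ok


def full_matchings(n):
    """Yield every n x n permutation matrix, i.e. every schedule using all ports."""
    for perm in permutations(range(n)):
        yield [[1 if perm[i] == j else 0 for j in range(n)] for i in range(n)]
```

For a 2 × 2 switch, `full_matchings(2)` yields exactly the two allowed matrices from the slides, while a matrix with a column sum of 2 is rejected by `is_valid_schedule`.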

Queueing Dynamics

[Figure: the 2 × 2 switch with queue $Q_{12}$ highlighted.]

$$Q = \begin{pmatrix} 2 & 3 \\ 4 & 2 \end{pmatrix}$$

• Independent Bernoulli arrivals with rate $\lambda_{ij}$

• $\Lambda = [\lambda_{ij}]$ is admissible if $\sum_i \lambda_{ij} < 1$ and $\sum_j \lambda_{ij} < 1$

• Focus on uniform arrival rates: $\lambda_{ij} = \rho/n$ and $\rho = \sum_i \lambda_{ij} = \sum_j \lambda_{ij}$
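As a small illustration of the admissibility condition and of one time slot of the dynamics, here is a minimal sketch. The helper names (`uniform_rates`, `is_admissible`, `step`) are hypothetical, and the update order inside a slot (serve first, then add arrivals) is an assumption; the slides do not fix it.

```python
def uniform_rates(n, rho):
    """Uniform arrival-rate matrix with every entry equal to rho / n."""
    return [[rho / n] * n for _ in range(n)]


def is_admissible(rates):
    """Check that every row sum and every column sum is strictly below 1."""
    n = len(rates)
    row_loads = [sum(row) for row in rates]
    col_loads = [sum(rates[i][j] for i in range(n)) for j in range(n)]
    return max(row_loads + col_loads) < 1


def step(queue, schedule, arrivals):
    """One slot: serve one packet on each scheduled nonempty queue, then add arrivals.

    Assumption: service happens before arrivals within a slot.
    """
    n = len(queue)
    return [
        [max(queue[i][j] - schedule[i][j], 0) + arrivals[i][j] for j in range(n)]
        for i in range(n)
    ]
```

Starting from the queue matrix on the slide, serving the identity matching and receiving one arrival at queue (1, 2) moves `[[2, 3], [4, 2]]` to `[[1, 4], [4, 1]]`.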

• Input-queued switches extensively studied

• Throughput and stability well understood

• Refined metrics (moments of queue sizes/delay) less understood, but becoming increasingly important in the big data era

Focus of this talk:

How does $\sum_{ij} \mathbb{E}[Q_{ij}]$ scale with $n$ (large system) and $1-\rho$ (heavy traffic)?

Outline of the remainder

1 A universal lower bound

2 Previously best-known and our improved upper bounds

3 Our policy via batching + graph factorization

4 Summary and concluding remarks



A Universal Lower Bound

[Figure: 2 × 2 switch with Input 1, Input 2, Output 1, Output 2; labels $\rho$, $\rho$ and 1, 1.]

• Decouples into $n$ independent components

• Expected total queue size scales as $\frac{n}{1-\rho}$

• A universal lower bound for any policy
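One way to get a feel for the $\frac{1}{1-\rho}$ growth of a single component is to simulate a single-server discrete-time queue. The sketch below is an interpretation, not something spelled out on the slide: it assumes that the packets destined to one output arrive as Binom(n, ρ/n) per slot and that at most one of them departs per slot. The function name `mean_output_queue` and the serve-then-arrive order are assumptions.

```python
import random


def mean_output_queue(n, rho, slots, seed=0):
    """Time-averaged backlog of one output port over `slots` time slots.

    Assumptions: each of the n inputs sends a packet to this output with
    probability rho / n per slot, and the output serves at most one packet
    per slot (service first, then arrivals).
    """
    rng = random.Random(seed)
    queue = 0
    total = 0
    for _ in range(slots):
        arrivals = sum(1 for _ in range(n) if rng.random() < rho / n)
        queue = max(queue - 1, 0) + arrivals
        total += queue
    return total / slots
```

Running it for a fixed `n` with loads approaching 1 (for example `rho = 0.5` versus `rho = 0.9`) shows the average backlog rising sharply as `1 - rho` shrinks; multiplying by the `n` outputs gives the scale of the lower bound.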


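Previously Best-known Upper Bounds

[Figure: previously best-known upper bounds on the expected total queue size, drawn along the $\frac{1}{1-\rho}$ axis with ticks at $n$ and $n^2$: $\frac{n \log n}{(1-\rho)^2}$ [NMC’07], $\frac{n}{1-\rho}$ [SWZ’11], and $\frac{n^{1.5} \log n}{1-\rho}$ [STZ’16].]

Universal lower bound: $\frac{n}{1-\rho}$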


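Our Improved Upper Bound

[Figure: upper bounds drawn along the $\frac{1}{1-\rho}$ axis with ticks at $n$, $n^{1.5}$, and $n^2$: $\frac{n \log n}{(1-\rho)^{4/3}}$ (new), $\frac{n^{1.5} \log n}{1-\rho}$, and $\frac{n}{1-\rho}$.]

Improvements:

• $\frac{1}{1-\rho} < n$: $\frac{n \log n}{(1-\rho)^2} \longrightarrow \frac{n \log n}{(1-\rho)^{4/3}}$

• $\frac{1}{1-\rho} = n$: $n^{2.5} \log n \longrightarrow n^{7/3} \log n$

• $n < \frac{1}{1-\rho} \le n^{1.5}$: $\frac{n^{1.5} \log n}{1-\rho} \longrightarrow \frac{n \log n}{(1-\rho)^{4/3}}$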
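The exponents in the middle bullet, and the endpoint $n^{1.5}$ in the last one, follow from direct substitution. The short calculation below only rearranges the bounds stated on the slide; it is a worked check, not an additional result.

```latex
% At 1/(1-\rho) = n:
\frac{n^{1.5}\log n}{1-\rho} = n^{1.5}\cdot n\cdot\log n = n^{2.5}\log n,
\qquad
\frac{n\log n}{(1-\rho)^{4/3}} = n\cdot n^{4/3}\cdot\log n = n^{7/3}\log n .

% The new bound beats n^{1.5}\log n/(1-\rho) exactly when
\frac{n\log n}{(1-\rho)^{4/3}} \le \frac{n^{1.5}\log n}{1-\rho}
\iff (1-\rho)^{-1/3} \le n^{1/2}
\iff \frac{1}{1-\rho} \le n^{1.5} .
```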

Main Theorem

Theorem (X. and Zhong ’19)

Consider an $n \times n$ input-queued switch for which the $n^2$ arrival streams form independent Bernoulli processes with a common arrival rate $\rho/n$, where $\rho \in (0, 1)$. There exists a scheduling policy under which

$$\mathbb{E}\left[\sum_{i,j=1}^{n} Q_{ij}(\tau)\right] \le \frac{c\, n}{(1-\rho)^{4/3}} \log \frac{n}{1-\rho}, \quad \forall \tau \in \mathbb{N}$$

Remarks

• A multiplicative factor $\frac{1}{(1-\rho)^{1/3}} \log \frac{n}{1-\rho}$ away from the lower bound

• Computational complexity per slot is at most polynomial in $n$


Key innovation: efficient scheduling via graph factorization

Question

Given a queue matrix $q = (q_{ij})_{i,j=1}^{n}$, how to deplete the packets as much as possible without wasting service opportunities?

$$k^* = \max_{k,\, g} \; k \quad \text{s.t.} \quad \sum_i g_{ij} = \sum_j g_{ij} = k, \qquad g_{ij} \le q_{ij} \ \text{(no service waste)}, \qquad g_{ij} \in \mathbb{N}$$

• $g$ can be viewed as a $k$-factor (spanning $k$-regular graph) of a bipartite multigraph $q$

• A simple upper bound: $k^* \le \min\left\{ \min_i \sum_j q_{ij},\ \min_j \sum_i q_{ij} \right\}$

• Is the upper bound tight?

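For small instances, $k^*$ can be computed directly, which makes it easy to see when the simple upper bound is or is not tight. The sketch below is not the authors' algorithm: it checks feasibility of a given $k$ with a textbook max-flow computation (Edmonds-Karp) and scans $k$ downward from the simple upper bound. All function names are hypothetical.

```python
from collections import deque


def max_flow(cap, s, t):
    """Edmonds-Karp on a dense capacity matrix (used in place as the residual graph)."""
    size = len(cap)
    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path from s to t.
        parent = [-1] * size
        parent[s] = s
        frontier = deque([s])
        while frontier and parent[t] == -1:
            u = frontier.popleft()
            for v in range(size):
                if parent[v] == -1 and cap[u][v] > 0:
                    parent[v] = u
                    frontier.append(v)
        if parent[t] == -1:
            return total
        # Find the bottleneck capacity along the path, then push that much flow.
        bottleneck = float("inf")
        v = t
        while v != s:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        total += bottleneck


def has_k_factor(q, k):
    """True if some integer g with 0 <= g <= q has all row and column sums equal to k."""
    n = len(q)
    s, t = 2 * n, 2 * n + 1
    cap = [[0] * (2 * n + 2) for _ in range(2 * n + 2)]
    for i in range(n):
        cap[s][i] = k          # source -> input i
        cap[n + i][t] = k      # output i -> sink
        for j in range(n):
            cap[i][n + j] = q[i][j]  # input i -> output j, limited by the queue
    return max_flow(cap, s, t) == n * k


def largest_k_factor(q):
    """Scan k downward from the simple upper bound until a k-factor exists."""
    n = len(q)
    k = min(min(sum(row) for row in q),
            min(sum(q[i][j] for i in range(n)) for j in range(n)))
    while k > 0 and not has_k_factor(q, k):
        k -= 1
    return k
```

On the queue matrix `[[2, 3], [4, 2]]` the upper bound 5 is attained. On `[[0, 0, 2], [0, 0, 2], [2, 2, 0]]` every row and column sum is at least 2, yet no 1-factor exists because the first two inputs compete for the same output, so the bound is not tight in general.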

Largest k-factor of random queue matrices (multigraphs)

Theorem (X. and Zhong ’19)

Let $q = (q_{ij})_{i,j=1}^{n}$ be an $n \times n$ queue matrix with $q_{ij} \overset{\text{i.i.d.}}{\sim} \mathrm{Binom}(m, p)$. With probability $1 - n^{-16}$, $q$ has a $k$-factor with

$$k \ge pmn - \sqrt{304\, pmn \log n}.$$

• Matches the upper bound up to a constant factor:

$$k^* \le \min\left\{ \min_i \sum_j q_{ij},\ \min_j \sum_i q_{ij} \right\} \le pmn - \sqrt{pmn \log n}$$

• Proof based on Gale-Ryser Theorem (extension of max-flow min-cut) + large deviation analysis


A Standard Batching Policy [NMC’07]

[Figure: timeline with marks 0, T, 2T, 3T, showing Batch 1, Batch 2, and "Serve batch 1".]

• No. of arrivals to any input/output port in time $T$: $\sim \mathrm{Binom}(nT, \rho/n)$

• Max no. of arrivals to any input/output port in time $T$: $\approx \rho T + \sqrt{T \log n}$

• Finishing serving a batch in time $T$ needs

$$T \ge \rho T + \sqrt{T \log n} \iff T \ge \frac{\log n}{(1-\rho)^2}$$

• Expected total queue size $\approx nT \ge \frac{n \log n}{(1-\rho)^2}$

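The batch-length condition is easy to check numerically. The sketch below simply evaluates the two sides of the inequality from the slide, with constants taken at face value; the function names are hypothetical and the natural logarithm is an assumption.

```python
import math


def batch_is_clearable(T, n, rho):
    """Check the slide's condition T >= rho*T + sqrt(T log n)."""
    return T >= rho * T + math.sqrt(T * math.log(n))


def min_batch_length(n, rho):
    """Smallest integer T with T >= log n / (1 - rho)^2."""
    return math.ceil(math.log(n) / (1.0 - rho) ** 2)
```

For example, with `n = 64` and `rho = 0.9` the threshold is `log(64) / 0.01`, roughly 415.9, so a batch of 416 slots satisfies the condition while 415 does not.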

An Impatient Batching Policy [STZ’16]

[Figure: timeline with marks 0, S, T, T+S, 2T. Arrival batch 1 is followed by arrival batch 2; serving batch 1 consists of a round-robin phase followed by clearing all packets that arrived during [0, T].]

• Start serving before the arrival of the entire batch

  1 Wait for $S$ time slots

  2 Simple round-robin for $T - S$ time slots

• Need to ensure no waste of service opportunities during round-robin

$$\frac{T - S}{n} \le \frac{\rho T}{n} - \sqrt{\frac{T \log n}{n}} \iff S \ge (1-\rho)T + \sqrt{nT \log n}$$

• $T \asymp \frac{\log n}{(1-\rho)^2}$ ⇒ Expected total queue size $\approx nS \ge \frac{n^{1.5} \log n}{1-\rho}$

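The final scaling follows by plugging the batch length into the lower bound on $S$. The calculation below is a worked check of that step, taking $T = \frac{\log n}{(1-\rho)^2}$ exactly and ignoring constants.

```latex
% With T = \log n / (1-\rho)^2:
(1-\rho)\,T = \frac{\log n}{1-\rho},
\qquad
\sqrt{n\,T \log n} = \sqrt{\frac{n \log^2 n}{(1-\rho)^2}} = \frac{\sqrt{n}\,\log n}{1-\rho}.

% Hence
S \;\ge\; \frac{(1+\sqrt{n})\,\log n}{1-\rho},
\qquad
n S \;\ge\; \frac{(n + n^{1.5})\,\log n}{1-\rho} \;\approx\; \frac{n^{1.5}\log n}{1-\rho}.
```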

Our Improved Batching Policy

[Figure: an arrival period of length $T$ split into intervals $I_0, I_1, I_2, \dots, I_{\ell-1}, I_\ell$, with a service period consisting of a factorization phase followed by a clearing phase.]

• Start serving even earlier

  1 Wait for $I_0$ time slots

  2 Serve packets via factorization for $T - I_0$ time slots: $I_u$ serves arrivals in $I_{u-1}$ for $1 \le u \le \ell$

• To ensure no waste of service opportunities during factorization

$$\left.\begin{aligned} I_u &\le \rho I_{u-1} - \sqrt{I_u \log n} \\ I_0 &\ge I_1 \ge \cdots \ge I_\ell \asymp I_0 \\ I_0 &+ I_1 + \cdots + I_\ell = T \end{aligned}\right\} \iff I_0 \asymp T^{2/3} \log^{1/3} n$$

• $T \asymp \frac{\log n}{(1-\rho)^2}$ ⇒ Expected total queue size $\approx n I_0 \asymp \frac{n \log n}{(1-\rho)^{4/3}}$

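A rough way to see where the exponent $2/3$ comes from is sketched below. This is a heuristic reading of the constraints on the slide, not the proof from the paper: constants are ignored and all intervals are treated as comparable in length to $I_0$.

```latex
% Heuristic only: constants ignored, all intervals comparable, I_u \asymp I_0.
% Rearranging I_u \le \rho I_{u-1} - \sqrt{I_u \log n}, each step shrinks the interval:
I_{u-1} - I_u \;\ge\; (1-\rho)\, I_{u-1} + \sqrt{I_u \log n} \;\gtrsim\; \sqrt{I_0 \log n}.

% There are about \ell \approx T / I_0 steps, and the total shrinkage cannot exceed I_0:
\frac{T}{I_0}\,\sqrt{I_0 \log n} \;\lesssim\; I_0
\iff I_0^{3/2} \;\gtrsim\; T \sqrt{\log n}
\iff I_0 \;\gtrsim\; T^{2/3} \log^{1/3} n.

% Substituting T \asymp \log n / (1-\rho)^2:
I_0 \;\asymp\; \left(\frac{\log n}{(1-\rho)^2}\right)^{2/3} \log^{1/3} n = \frac{\log n}{(1-\rho)^{4/3}},
\qquad
n I_0 \;\asymp\; \frac{n \log n}{(1-\rho)^{4/3}}.
```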

Conclusion and remarks

[Figure: upper bounds drawn along the $\frac{1}{1-\rho}$ axis with ticks at $n$, $n^{1.5}$, and $n^2$: $\frac{n \log n}{(1-\rho)^{4/3}}$ [X.-Zhong ’19], $\frac{n^{1.5} \log n}{1-\rho}$, and $\frac{n}{1-\rho}$.]

1 Improved queue-size scalings via graph factorization

2 A tight characterization of the largest k-factor in random bipartite multigraphs

Open problem

• Achieving the universal lower bound $\frac{n}{1-\rho}$?

References

• X. & Yuan Zhong (2019). Improved Queue-Size Scaling for Input-Queued Switches via Graph Factorization, ACM SIGMETRICS 2019, arXiv:1903.00398.
