queueing analysis for multi-core performance improvement: two case studies deng, j.d. and purvis,...


Queueing analysis for multi-core performance improvement: Two case studies
Deng, J.D. and Purvis, M.K.
Dept. of Information Science, Univ. of Otago, Dunedin

Australasian Telecommunication Networks and Applications Conference (ATNAC), 2007


Outline
Introduction
Evaluation model
◦Tandem queueing model for two case studies
Two case studies
◦Snort
◦POISE
Conclusion


Introduction
Analysis of multi-core performance
◦Tandem system model for applications
◦Queueing analysis
◦Problem: given a tandem queueing model, find the optimal number of cores so that the total service time is minimal
Case studies
◦Snort and POISE
◦Evaluation results are consistent with the queueing analysis


Evaluation model
Tandem queueing model
◦Pipeline
◦Applications
  Can be parallelized into independent procedures
  Each procedure can be served by one or more cores


Evaluation model
Terms definition
◦λ : arrival/departure rate
◦μi : service time
◦ci : number of cores
◦n : total number of procedures
Burke’s Theorem
◦When the tandem system is in a steady state, arrival rate = departure rate for each procedure


Evaluation model
Problem definition
◦Given the arrival rate (λ), the processing times μi, and the total number of cores available, find the optimal choice of ci so that the total time in system is minimal.


Evaluation model
To solve the problem
◦Use a D/D/c model for each procedure
◦Notation: arrival process / service process / number of servers
◦D stands for deterministic (D = λ)


Evaluation model
D/D/c model
◦No queueing delay
◦Consider only processing overhead
◦Total processing time T
◦Total number of cores, with C the maximum number of cores
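The formulas for T and for the core constraint did not survive in these slides. As a rough sketch, assume that a procedure with processing time μi spread over ci cores takes μi/ci, so that T is the sum of μi/ci and the ci add up to at most C. This form is an assumption, chosen because it agrees with the square-root lemma stated later in the deck. The function names and the stage times below are hypothetical.

```python
from itertools import product

def total_time(mu, cores):
    """Assumed D/D/c cost: each stage's time mu[i] is divided by its core count."""
    return sum(m / c for m, c in zip(mu, cores))

def best_allocation(mu, total_cores):
    """Search all integer allocations that use exactly total_cores cores,
    with at least one core per stage, and return the cheapest one."""
    n = len(mu)
    candidates = (
        cores
        for cores in product(range(1, total_cores + 1), repeat=n)
        if sum(cores) == total_cores
    )
    return min(candidates, key=lambda cores: total_time(mu, cores))

# Hypothetical stage times (arbitrary units), not taken from the case studies.
mu = [9.0, 1.0]
print(best_allocation(mu, 4))  # (3, 1)
```

Exhaustive search is only practical for a handful of stages and cores, but it gives a reference answer against which a closed-form rule can be checked.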


Evaluation model
To find minimum T
◦Lagrange multiplier
  By setting the partial derivatives of the Lagrangian to zero, the optimal ci are obtained


Evaluation model
Lagrange multiplier
◦In mathematical optimization, the method of Lagrange multipliers (named after Joseph Louis Lagrange) provides a strategy for finding the maxima and minima of a function subject to constraints
  Maximize f(x, y) subject to g(x, y) = c
  Λ(x, y, λ) = f(x, y) + λ (g(x, y) - c), where λ here is the multiplier, not the arrival rate
  Maximum: partial derivatives of Λ are zero
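Applied to the core-allocation problem, the method can be worked through as follows. This is a reconstruction, not the slide's own derivation: it assumes that T is the sum of μi/ci and that all C cores are used. The multiplier is written ν to avoid a clash with the arrival rate λ.

```latex
\begin{align*}
\Lambda(c_1,\dots,c_n,\nu)
  &= \sum_{i=1}^{n}\frac{\mu_i}{c_i} + \nu\Bigl(\sum_{i=1}^{n} c_i - C\Bigr) \\
\frac{\partial \Lambda}{\partial c_i}
  &= -\frac{\mu_i}{c_i^{2}} + \nu = 0
  \;\Longrightarrow\; c_i = \sqrt{\mu_i/\nu} \\
\sum_{i=1}^{n} c_i = C
  &\;\Longrightarrow\; c_i = C\,\frac{\sqrt{\mu_i}}{\sum_{j=1}^{n}\sqrt{\mu_j}}
\end{align*}
```

Under these assumptions each ci is proportional to the square root of μi, which is the allocation rule the lemma states.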


Evaluation model
Lemma
◦Assign the numbers of servers to the subsystems in proportion to the square roots of their respective processing times
◦The lemma also works well in more general systems with M/D/c subsystems
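The lemma translates directly into a few lines of code. The sketch below returns real-valued shares; `sqrt_shares` and the processing times are hypothetical names and values used only for illustration.

```python
import math

def sqrt_shares(mu, total_cores):
    """Real-valued core shares proportional to the square root of each processing time."""
    roots = [math.sqrt(m) for m in mu]
    scale = total_cores / sum(roots)
    return [r * scale for r in roots]

# Hypothetical processing times (arbitrary units).
shares = sqrt_shares([16.0, 4.0, 1.0], 14)
print(shares)  # [8.0, 4.0, 2.0]
```

In general the shares are fractional, so a rounding step is needed before cores can be assigned, and the rounded result may be worth comparing against its integer neighbours.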


Two case studies - Snort
Snort
◦A free and open source Network Intrusion Prevention System (NIPS) and Network Intrusion Detection System (NIDS)


Two case studies - Snort
Snort flow


Two case studies - Snort
Measurement
◦Packet injection: 100,000 to 1 million packets
◦Queueing discipline: FIFO
◦Three types of traffic
  Attack free, light attacks, heavy attacks


Two case studies - Snort
Scenario 1
◦Without pipelining
◦Packet distribution: round-robin
◦Packet rate
  Light: 0.1 packets/μs
  Medium: 0.2 packets/μs
  Heavy: 0.4 packets/μs


Two case studies - Snort
Evaluation of scenario 1
◦Performance curve


Two case studies - Snort
Scenario 2
◦With pipelining
◦Queueing model
  M/D/c for core group 1
  M/D/1 for core group 2
◦2 to 8 cores
◦Packet rate
  Light: 0.1 packets/μs
  Medium: 0.2 packets/μs
  Heavy: 0.4 packets/μs

Processing times labelled in the figure: 2.31 μs, 0.12 μs, 0.16 μs
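For the single-core group, the mean queueing delay of an M/D/1 queue has a well-known closed form (the Pollaczek-Khinchine result for deterministic service): with service time s and utilisation ρ = λs, the mean wait before service is ρs / (2(1 - ρ)). The sketch below evaluates it at the three packet rates. The 0.16 μs service time is used purely for illustration, since the slides do not say which of the labelled times belongs to core group 2, and `md1_wait` is a hypothetical name.

```python
def md1_wait(arrival_rate, service_time):
    """Mean queueing delay (excluding service) of an M/D/1 queue."""
    rho = arrival_rate * service_time  # utilisation
    if rho >= 1.0:
        raise ValueError("unstable queue: utilisation must be below 1")
    return rho * service_time / (2.0 * (1.0 - rho))

service_time = 0.16  # microseconds; assumed, see the note above
for name, rate in [("light", 0.1), ("medium", 0.2), ("heavy", 0.4)]:
    wait = md1_wait(rate, service_time)
    print(f"{name}: mean wait {wait:.4f} us, mean time in system {wait + service_time:.4f} us")
```

The delay grows without bound as the utilisation approaches 1, which is why the function refuses rates at or above 1/s.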


Two case studies - Snort
Evaluation of scenario 2
◦Performance curve


Two case studies - Snort
Conclusions
◦Scenario 2 (with pipelining) copes much better with heavy packet traffic
◦The relevant queueing delay is reduced to its minimum with 3-4 cores
◦The 4-core results shown in Fig. 6 are consistent with Lemma 1
  3 cores for group 1 and 1 core for group 2
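The 3:1 split can be checked against the square-root rule. The scenario 2 slide shows three processing times (2.31 μs, 0.12 μs, 0.16 μs), but the extracted text does not say how they map onto the two core groups, so the sketch below tries both plausible groupings. That mapping, the round-to-nearest step and the name `group1_cores` are all assumptions.

```python
import math

def group1_cores(t_group1, t_group2, total_cores):
    """Cores for group 1 under the square-root rule, rounded to the nearest integer."""
    r1, r2 = math.sqrt(t_group1), math.sqrt(t_group2)
    return round(total_cores * r1 / (r1 + r2))

# Assumed groupings of the three times (microseconds) shown on the slide.
groupings = {
    "2.31 | 0.12 + 0.16": (2.31, 0.12 + 0.16),
    "2.31 + 0.12 | 0.16": (2.31 + 0.12, 0.16),
}
for label, (t1, t2) in groupings.items():
    c1 = group1_cores(t1, t2, 4)
    print(f"{label}: group 1 = {c1} cores, group 2 = {4 - c1} cores")
```

Both groupings give 3 cores for group 1 and 1 core for group 2, so the check does not depend on which mapping is the right one.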


Two case studies - POISE
POISE
◦An image retrieval and organization application


Two case studies - POISE
Measurement
◦200 images
◦3.2 GHz Pentium 4 single-core with 1 GB RAM


Two case studies - POISE
Scenario
Assignment of number of cores
◦4-core as an example
◦Round to 3
◦Group 1 : group 2 = 3:1
Processing times labelled in the figure: 0.097 s, 0.007 s, 0.036 s


Two case studies - POISE
Evaluation in 8-core
◦Markovian image arrivals at a rate of 20 images per second
◦The 5+3 assignment has the minimal total processing time
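Both POISE assignments can be reproduced with a short calculation, under assumptions the slides do not confirm: that group 1 covers the 0.097 s and 0.007 s stages (0.104 s in total), that group 2 is the 0.036 s stage, and that the total time is t1/c1 + t2/c2 with no queueing delay. The function names are hypothetical.

```python
import math

def sqrt_share(t_group1, t_group2, total_cores):
    """Real-valued number of cores for group 1 under the square-root rule."""
    r1, r2 = math.sqrt(t_group1), math.sqrt(t_group2)
    return total_cores * r1 / (r1 + r2)

def best_split(t_group1, t_group2, total_cores):
    """Enumerate two-group splits and return the (c1, c2) pair with the
    smallest assumed total time t1/c1 + t2/c2."""
    splits = [(c1, total_cores - c1) for c1 in range(1, total_cores)]
    return min(splits, key=lambda s: t_group1 / s[0] + t_group2 / s[1])

t1 = 0.097 + 0.007  # seconds; assumed composition of group 1
t2 = 0.036          # seconds; assumed group 2

print(round(sqrt_share(t1, t2, 4)))  # 4-core case: cores for group 1
print(best_split(t1, t2, 8))         # 8-core case: (group 1, group 2)
```

With these inputs the 4-core share for group 1 is about 2.5 and rounds to 3, and the 8-core enumeration selects 5+3, matching the assignments reported on the slides. The enumeration ignores the Markovian arrivals, so it is a simplification of the evaluated system.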


Conclusions
A simplified tandem queueing model is analyzed for two case studies
Queueing analysis is used to gain a quantitative assessment
The ideal distribution of cores across procedures is worked out
