
Post on 06-Aug-2020


Resource Monitoring for cgroup Hierarchies

Kanaka Juvva, Intel Corporation

Agenda

• Motivation

• Multi-core Cache Hierarchies

• Application Domains

• Intel Xeon’s Shared Resource Monitoring Technology

• Implementation

• Use Cases

• cgroups

• Hierarchical Monitoring

Motivation (Why Monitoring ?)

Moore’s Law Drives Exponential growth in Compute and Data

Big Data Proliferation, Dynamic Data Centers, Cloud, HPC

Multi-Core is the Horse Power behind the Information Highway

Linear scalability for CPU-bound workloads

Contention for shared resources

• Impacts Scalability

Multi-core cache hierarchies

Cache Hierarchies help performance and mitigate contention

Data orchestration also improves application scalability

[Diagram: multiple groups of core(s) sharing a cache hierarchy backed by MEMORY]

Motivation (Contd.)

Intel® Xeon® processor Shared Resource Monitoring

LLC Occupancy

Memory Bandwidth

Monitoring at multiple levels

o Thread

o Process

o APP/Workload/cgroup/Container

o VMM

o System

• Quantify Performance

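[Diagram: cores C1 to C4, each with private L1/L2 caches; the shared L3 is monitored for LLC occupancy and memory is monitored for memory bandwidth; monitored entities shown: APP, VM, CONTAINER]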

Intel Xeon’s Shared Resource Monitoring

Incarnation of Intel Xeon CPUs

Platform/Shared Resource Monitoring Broadwell-EP and Broadwell-EX

Grantley and Brickland Codenames

Broadwell – EP Microarchitecture Features

o Cache Monitoring Technology

o Memory Bandwidth Monitoring

o QuickPath Interconnect: 2 QPI 1.1 channels

o Cores Per Socket: Up to 22

o Threads Per Socket: Up to 44

o Sockets: 2

o Last Level Cache (L3): ~55MB

[Diagram: BROADWELL-EP block diagram showing cores around a shared cache, 2 QPI 1.1 links, 4 channels of DDR4, 40 lanes of PCIe* 3.0, and DMI2]

Intel Xeon’s Shared Resource Monitoring Technology (Contd..)

Resource monitoring capability is provided in each logical processor (hardware thread).

Two monitoring technologies are supported:

• Cache Monitoring Technology (CMT): allows an OS, hypervisor or VMM to monitor the usage of cache by applications

• Memory Bandwidth Monitoring (MBM): allows an OS, hypervisor or VMM to monitor bandwidth from one level of the cache hierarchy to the next; for the L3 cache, the next level is system memory

Either or both features can be present in a given CPU model. They share some common sub-features:

• Linux (OS) can check the presence of the above features via a CPUID feature bit

• Details of each sub-feature are enumerated via CPUID leaves and sub-leaves

• A mechanism for the Linux/OS or VMM to set a software-defined ID, known as the Resource Monitoring ID (RMID), for each software thread

• A mechanism to monitor cache and bandwidth stats on a per-RMID basis

• Linux/OS can read back the collected stats, such as L3 occupancy and memory BW, for an RMID

• Different event types distinguish the resource metric, e.g. LLC_OCCUPANCY, TOTAL_BW, LOCAL_BW

• Software configures the event selection MSR (IA32_QM_EVTSEL) to specify the event or metric to be reported

• RDMSR and WRMSR instructions are used to read and write the monitoring MSRs
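The event-selection and read-back steps can be sketched in a few lines of Python. This is only an illustration of the bit layout, not the kernel driver's code: the MSR addresses, event IDs and field positions below follow my reading of the Intel SDM, the function names are hypothetical, and the upscaling factor (which real software obtains from CPUID leaf 0FH) is passed in as a plain parameter. Nothing here touches hardware.

```python
# Sketch of the IA32_QM_EVTSEL / IA32_QM_CTR bit layout (per the Intel SDM,
# as an assumption of this example). Pure arithmetic; no MSR access is done.

IA32_QM_EVTSEL = 0xC8D  # written with WRMSR to select (RMID, event)
IA32_QM_CTR = 0xC8E     # read with RDMSR to fetch the selected counter

EVENT_IDS = {
    "LLC_OCCUPANCY": 0x1,
    "TOTAL_BW": 0x2,
    "LOCAL_BW": 0x3,
}

RMID_BITS = 10           # RMID field is bits 41:32 of IA32_QM_EVTSEL
ERROR_BIT = 1 << 63      # set when the RMID/event combination is invalid
UNAVAILABLE_BIT = 1 << 62  # set when no data is available yet
DATA_MASK = (1 << 62) - 1  # counter value lives in bits 61:0


def encode_evtsel(rmid: int, event: str) -> int:
    """Build the value software would write to IA32_QM_EVTSEL."""
    if not 0 <= rmid < (1 << RMID_BITS):
        raise ValueError(f"RMID {rmid} does not fit in {RMID_BITS} bits")
    return (rmid << 32) | EVENT_IDS[event]


def decode_qm_ctr(raw: int, upscale_bytes: int):
    """Convert a raw IA32_QM_CTR value to bytes, or None if unavailable."""
    if raw & ERROR_BIT:
        raise RuntimeError("hardware reported an invalid RMID or event")
    if raw & UNAVAILABLE_BIT:
        return None
    return (raw & DATA_MASK) * upscale_bytes
```

For example, `encode_evtsel(5, "LOCAL_BW")` yields the value that selects local bandwidth for RMID 5, and `decode_qm_ctr(raw, upscale_bytes)` scales the value read back into bytes.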

Implementation

Adoption into Linux Kernel

https://lkml.org/lkml/2016/3/21/176

https://github.com/kanakajuvva

Linux User Space APIs

• perf_event API

o perf_event_open(), struct perf_event

o Performance Monitoring Unit: fill and register struct pmu, with callback functions to start, read and stop stats

• Perf events are exported to userland

o LOCAL_BW: perf stat -e intel_cqm/local_bw/ -p <pid of my_application>

o TOTAL_BW: perf stat -e intel_cqm/total_bw/ -p <pid of my_application>

o LLC_OCCUPANCY: perf stat -e intel_cqm/llc_occupancy/ -p <pid of my_application>
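The three perf invocations listed here can be combined into one command. The helper below is a hypothetical convenience wrapper, not part of perf: it only builds the argument list, using the intel_cqm event names from the slide. It assumes a kernel that still exposes the intel_cqm PMU; to my knowledge, newer kernels expose the same hardware through the resctrl filesystem instead.

```python
import shlex

# Event names as they appear on the slide, under the intel_cqm PMU.
CQM_EVENTS = ("llc_occupancy", "total_bw", "local_bw")


def perf_stat_argv(pid, events=CQM_EVENTS, pmu="intel_cqm"):
    """Return the argv for `perf stat` monitoring `pid` with the given events.

    Hypothetical helper: it validates event names against the three listed
    on the slide and joins them into one comma-separated -e argument.
    """
    for event in events:
        if event not in CQM_EVENTS:
            raise ValueError(f"unknown event: {event}")
    spec = ",".join(f"{pmu}/{event}/" for event in events)
    return ["perf", "stat", "-e", spec, "-p", str(pid)]


if __name__ == "__main__":
    # Print the command instead of running it; 1234 is a placeholder PID.
    print(shlex.join(perf_stat_argv(1234)))
```

Passing the returned list to `subprocess.run` would launch perf; printing it first makes it easy to check the event names against `perf list` on the target machine.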

Use Cases

Monitoring an Application

cpumem benchmark

o Consumes constant memory BW of ~13.7 GB/sec

[Charts: Total BW (MB/sec) and Local BW (MB/sec) over time for the cpumem benchmark]
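The MB/sec figures for this benchmark are rates, while the MBM hardware counters are cumulative and can wrap. A minimal sketch of the conversion, under stated assumptions: the function name is hypothetical, the counter width defaults to 24 bits (the width I understand early MBM implementations to use), the upscaling factor is supplied by the caller, and 1 MB is taken as 10^6 bytes.

```python
def bandwidth_mb_per_sec(prev_raw, curr_raw, upscale_bytes,
                         interval_sec, counter_bits=24):
    """Estimate bandwidth from two raw MBM counter samples.

    prev_raw / curr_raw: raw counter values read `interval_sec` apart.
    upscale_bytes: bytes per counter unit (an input here; real software
                   obtains it from CPUID).
    counter_bits: assumed hardware counter width, used to undo one wrap.
    """
    if interval_sec <= 0:
        raise ValueError("interval must be positive")
    modulus = 1 << counter_bits
    # Modular subtraction handles a single wraparound between samples.
    delta_units = (curr_raw - prev_raw) % modulus
    return delta_units * upscale_bytes / interval_sec / 1e6
```

Sampling more often than the counter can wrap is the caller's responsibility; two or more wraps between samples cannot be detected from the raw values alone.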

Use Cases (Contd.)

Monitoring a Multi-threaded Application

cqm benchmark

o 8 to 48 threads

[Charts: "CQM Benchmark - 8, 12, 24 and 48 Threads": Local BW (MB/sec) and Total BW (MB/sec) vs. Time (secs), one series each for 8, 12, 24 and 48 threads]

Use Cases (Contd.)

Monitoring a VM

Light VM running Linux, 3 scenarios:

o VM Boot

o VM Idle

o An APP running in VM

[Charts: Local BW (MB/sec) vs. Time (secs) for "VM Boot" and "VM Idle"]

Monitoring VM (Contd.)

[Chart: "FWTS APP": Local BW (MB/sec) vs. Time (secs)]

cgroups

• Linux mechanism(s) to manage and monitor resources for processes

• cgroups blend very well with containers

• MBM and CMT counters are read per RMID

• cgroup perf_event subsystem is used by the Linux perf tool for reading stats

Thread level stats are maintained in driver

Obtain cgroup membership for each thread

cgroup -> RMID(s) mapping

• LLC occupancy / memory BW consumed by a cgroup = ∑ resource consumed for each RMID in the cgroup
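The per-cgroup sum can be illustrated with a small Python sketch. The data structures are assumptions made for the example: a dictionary mapping each cgroup to its RMIDs and a dictionary of per-RMID readings, both with made-up names and values. The real driver keeps this state in the kernel.

```python
def cgroup_usage(cgroup_rmids, rmid_usage, cgroup):
    """Sum a metric (LLC occupancy or memory BW) over a cgroup's RMIDs.

    cgroup_rmids: hypothetical mapping of cgroup name -> list of RMIDs.
    rmid_usage:   hypothetical mapping of RMID -> reading for one metric.
    RMIDs with no reading yet are counted as zero.
    """
    return sum(rmid_usage.get(rmid, 0) for rmid in cgroup_rmids[cgroup])


if __name__ == "__main__":
    # Illustrative values only, not measurements.
    cgroup_rmids = {"web": [1, 2], "db": [3]}
    local_bw_mb = {1: 120, 2: 80, 3: 400}
    print(cgroup_usage(cgroup_rmids, local_bw_mb, "web"))
```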

Hierarchical Monitoring

• cgroups organize processes hierarchically

o Form a tree structure

o cgroup v1 and cgroup v2

• A process belongs to at least one cgroup

• All threads in a process belong to the same cgroup in v2

• Traverse from parent to child to list the threads

• Resource consumed by parent cgroup = ∑ resource consumed for each cgroup = ∑ ∑ resource consumed for each RMID in cgroup

Current MBM/CMT design is transparent to the version of the cgroup hierarchy (v1 or v2)
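The double sum over a cgroup subtree can be sketched as a recursive traversal from parent to child. The `CgroupNode` type and the sample readings are hypothetical, introduced only for this example, and the sketch assumes the parent's own RMIDs are included in its total along with those of every descendant.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CgroupNode:
    """Hypothetical in-memory stand-in for one cgroup in the hierarchy."""
    name: str
    rmids: List[int] = field(default_factory=list)
    children: List["CgroupNode"] = field(default_factory=list)


def hierarchical_usage(node: CgroupNode, rmid_usage: Dict[int, int]) -> int:
    """Return the metric total for `node` and all cgroups below it.

    Inner sum: readings of the RMIDs attached to this cgroup.
    Outer sum: the same total computed recursively for each child.
    """
    own = sum(rmid_usage.get(rmid, 0) for rmid in node.rmids)
    return own + sum(hierarchical_usage(child, rmid_usage)
                     for child in node.children)
```

Because the traversal only follows parent-to-child links, the same function applies whether the tree was built from a v1 or a v2 hierarchy, which matches the slide's point about version transparency.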

Conclusions

• Intel Xeon’s Platform Resource Monitoring

• Linux Implementation

• Use Cases and Example Scenarios

• Monitoring for Linux cgroups

Q & A

Thank you
