linuxcon europe 2013 - efficios · 2016. 2. 26. · lttng 2.3 dominus vobiscum (september 2013)...

Cloud Monitoring and Distribution Bug Reporting with Live Streaming

and Snapshots

LinuxCon Europe 2013

[email protected]

Presenter

Mathieu Desnoyers

• http://www.efficios.com

Author/Maintainer of

• Userspace RCU,

• LTTng kernel and user-space tracers,

• Babeltrace.

Content

New and upcoming features since Tracing Summit 2012

● LTTng
● Babeltrace

Cloud monitoring

Enhanced bug reports

LTTng project roadmap

LTTng Reminder

● LTTng 2.x does NOT require any kernel modification.

LTTng Features Since Tracing Summit 2012

● LTTng 2.1 Basse Messe (December 2012)
  ● Network streaming (TCP)
  ● Session daemon health check
  ● Event field filtering (LTTng-UST)
  ● ARM, MIPS system call tracing (LTTng modules)

LTTng Features Since Tracing Summit 2012

● LTTng 2.2 Cuda (June 2013)
  ● Per user ID buffers (LTTng-UST)
  ● On disk file rotation (maximum stream file size)

LTTng Features Since Tracing Summit 2012

● LTTng 2.3 Dominus Vobiscum (September 2013)
  ● Flight recorder tracing
    ● Stop and non-stop snapshots
    ● Core dump handler integration
      ● LTTng Tools extras/

Flight recorder session + snapshot

$ lttng create --snapshot

$ lttng enable-event -k sched_switch

$ lttng enable-event -k --syscall -a

$ lttng start

$ ...

$ lttng snapshot record

Snapshot recorded successfully for session auto-20131019-113803

$ babeltrace /home/julien/lttng-traces/auto-20131019-113803/snapshot-1-20131019-113813-0/kernel/

Snapshot

At any point in time, a snapshot can be taken of the current trace buffers.

Overwrite mode means flight recorder tracing.

[Diagram: trace data is written to a ring buffer; "$ lttng snapshot record" or lttng_snapshot_record(..) extracts the snapshot trace data.]

LTTng Features Since Tracing Summit 2012

● LTTng 2.4 Époque Opaque (upcoming)
  ● Java Util Logging (JUL) tracing
  ● Live streaming
    ● Analysis of live traces
  ● Consumer and relay daemon health check
  ● Packet index generated by consumer daemon
    ● Faster load of large traces in viewers afterward

Live Network Streaming Deployment

[Diagram: Servers A, B and C, each running lttng-sessiond, stream over TCP to lttng-relayd; the viewer connects to lttng-relayd over TCP.]

Live Network Streaming Session

On the server to trace:

$ lttng create --live 2000000 -U net://10.0.0.1

$ lttng enable-event -k sched_switch

$ lttng enable-event -k --syscall -a

$ lttng start

On the receiving server (10.0.0.1):

$ lttng-relayd -d

On the viewer machine:

$ lttngtop -r 10.0.0.1

Live Trace Streaming Usage

As the trace is being created, you can extract and analyze the data.

Continuous analysis
● Extract data with live streaming for analysis on another machine

Cluster-level analysis
● Gather traces from multiple machines
● Load balancing analysis
● Latency detection

System administration
● Get data from a faulty machine "on demand"

Performance Results

Babeltrace Features Since Tracing Summit 2012

● Babeltrace 1.0 (initial release, October 2012)
● Babeltrace 1.1 (API namespacing fix, March 2013)
● Babeltrace 1.2 (upcoming)
  ● Common Trace Format (CTF) Writer API
  ● Python bindings
  ● Nexus to CTF converter
  ● Live trace stream read support
    ● Connect to LTTng relay daemon
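
Once live read support is available, attaching to the relay daemon from the command line might look like the sketch below. It assumes the lttng-live input format and net:// URL layout of Babeltrace 1.2 and later, which were still upcoming at the time of this talk. The live_view helper and the BABELTRACE variable are hypothetical names introduced only for illustration.

```shell
# Assumption: Babeltrace 1.2+ with the lttng-live input format.
# BABELTRACE can be overridden for a dry run on a machine without Babeltrace.
BABELTRACE="${BABELTRACE:-babeltrace}"

# live_view RELAYD_HOST [TRACED_HOSTNAME SESSION]   (hypothetical helper)
# With only the relay host, asks the relay daemon which live sessions it serves.
# With all three arguments, attaches to one session and prints its events.
live_view() {
    relayd=$1
    if [ "$#" -ge 3 ]; then
        "$BABELTRACE" -i lttng-live "net://$relayd/host/$2/$3"
    else
        "$BABELTRACE" -i lttng-live "net://$relayd"
    fi
}
```

For example, live_view 10.0.0.1 would query the relay daemon started in the live streaming session above.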

Cloud Monitoring

● Live network streaming
● Flight recorder tracing and snapshots
● Bytecode interpreter
  ● On traced target or separate dedicated machine
  ● Triggers:
    ● Start tracing
    ● Stop tracing
    ● Gather snapshot
  ● Aggregation
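
The triggers above are planned for the bytecode interpreter. A rough stand-in, written as a plain shell function rather than bytecode, is sketched below. It is an assumption, not an LTTng feature: snapshot_if_over, its arguments and the LTTNG variable are hypothetical, and only "lttng snapshot record" comes from the slides. A flight recorder session is assumed to be already created and started.

```shell
# Hypothetical "gather snapshot" trigger, approximated outside the tracer.
# LTTNG can be overridden (for example LTTNG=echo) for a dry run.
LTTNG="${LTTNG:-lttng}"

# snapshot_if_over VALUE THRESHOLD
# Both arguments are integers. When VALUE exceeds THRESHOLD, a snapshot of the
# current session is requested and the status of that command is returned.
# Otherwise nothing is recorded and the function returns 1.
snapshot_if_over() {
    value=$1
    threshold=$2
    if [ "$value" -gt "$threshold" ]; then
        "$LTTNG" snapshot record
    else
        return 1
    fi
}

# Possible use on Linux, polling the integer part of the 1-minute load average:
#   snapshot_if_over "$(cut -d. -f1 /proc/loadavg)" 8
```

Running the check on the traced target keeps the setup simple; running it on a separate machine would additionally need a way to reach the remote session, which this sketch does not cover.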

Enhanced Bug Reports

● Flight recorder tracing
  ● In production
  ● Extremely low overhead
● When error is encountered
  ● Gather snapshot
  ● "Do you want to send a detailed bug report?"
  ● Very detailed trace of the events leading to the problem, sent along with the bug report
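
The error path above can be sketched as a small wrapper that runs a command and records a snapshot only when it fails. This is a minimal sketch under stated assumptions: run_with_snapshot and the LTTNG variable are hypothetical names, a session created with "lttng create --snapshot" is assumed to be running already, and asking the user whether to send the report is left out.

```shell
# Hypothetical wrapper around a command running under a flight recorder session.
# LTTNG can be overridden (for example LTTNG=echo) for a dry run.
LTTNG="${LTTNG:-lttng}"

# run_with_snapshot COMMAND [ARGS...]
# Runs COMMAND. If it exits with a non-zero status, records a snapshot of the
# current session so the trace can be attached to a bug report.
# Always returns the exit status of COMMAND itself.
run_with_snapshot() {
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "command failed with status $status, recording snapshot" >&2
        "$LTTNG" snapshot record
    fi
    return "$status"
}
```

Because the buffers are only written to disk on failure, the successful path costs no more than flight recorder tracing itself.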

LTTng Project Roadmap

● Save/restore trace session configuration to/from files
● Support Perf PMU counters from LTTng-UST
● Dynamic instrumentation of user-space (dyninst)
● Listing libraries shared objects (LD_PRELOAD)
● Hardware tracing (ARM, Freescale, Intel, ...)
● Triggers and aggregation in LTTng-UST bytecode interpreter
● LTTng modules bytecode interpreter
● Android port for LTTng modules and UST

Questions?

lttng.org

[email protected]

@lttng_project

www.efficios.com