Visualizing Real-Time Errors and Performance Anomalies

2015 Embedded Systems Conference – Silicon Valley
© 2015 – Dave Stewart

Dave Stewart, PhD
Sr. Principal Software Architect – Physio-Control, Inc.
dave.stewart@physio-control.com
http://davestewart.info

Visualizing Real-Time Errors and Performance Anomalies

• Troubleshooting Tough Bugs using a Logic Analyzer
  • Print Debug Macros
  • Logic Analyzer Debug Macros
• Tracing Code using Logic Analyzer debug macros
  • Displaying Variable Data on the Logic Analyzer
• Visualizing Real-Time Execution
  • Focus on Anomalies
  • Performance Issues
  • Clock or Synchronization Errors
  • Troubleshooting Rare Glitches
• Setup Extras
  • Analog Signals coupled with Digital Debug
  • Debug Clock Bit
  • Serial Protocols

Troubleshooting Tough Bugs using a Logic Analyzer

What are The Tough Bugs to Debug in Real-Time Systems?

• Glitches
• Timing and Synchronization Problems
• Driver Errors
• Misbehaving Interrupts
• Memory Corruption
• Priority Inversion
• Performance Issues
• Hardware Errors

Limitations of Traditional Debugging

• Print Statements
  • There is no console, so print statements don’t work
  • Print statements are too slow, so they provide insufficient information
  • The print function significantly affects real-time performance
  • Writing debug output to a serial port changes the timing too much
  • Adding print statements changes program behavior
  • Can’t measure performance at a fine granularity
  • Max 50 to 100 print statements per second
  • The code crashes, but there is insufficient feedback as to where
  • Can’t see the “integrated picture” at a glance

Limitations of Traditional Debugging

• Symbolic Debuggers (e.g. IDEs, via JTAG, ActiveSync, or other comm link)
  • A symbolic debugger or emulator is not available
  • Stepping through the code makes the program behave differently
  • Breakpoints will “break” real-time performance
  • There is real I/O, so the debugger doesn’t work
  • Debugger doesn’t deal with interrupts properly
  • There might be a race condition or other synchronization problem
  • Can’t see the “integrated picture” at a glance

Solution: Use a Logic Analyzer for Real-Time Issues

• Both broad and detailed view of your code
  • Go from the “big picture” to a microsecond view of the software in seconds
  • Easily 50,000 debug data points per second
  • Info is time-stamped for timing assessment
• Real-Time
  • Can use it for interrupts and I/O drivers
  • Impact on real-time execution is negligible
  • Identify temporal relationships among tasks
  • Monitor interrupts and how they may affect execution
• Visualization of Performance and Anomalies
  • Quickly spot anomalies and different patterns of execution
  • Obtain sufficient proof that a problem is hardware, not software
  • Fine-grain timing measurements to identify performance culprits

Logic Analyzer for Debugging - Should I use it all the time?

• Using a Logic Analyzer is NOT easy
  • Use Print Statements and Symbolic Debuggers to solve easy and non-real-time problems first
  • Add to your repertoire of available tools to solve hard problems
• Solve functional problems using print statements
  • If necessary, run the functions on the desktop, and debug them there
  • Only move to the embedded environment when it is working well
• Use Symbolic Debuggers mainly for
  • Tracing through code that fails in a consistent manner
  • Post-mortem debugging of crashes, to view all variables

Logic Analyzer Features

• Low-End Logic Analyzer features are sufficient
  • Only need about 16 channels for the techniques shown in this presentation
  • Anything faster than 20 MHz is likely fine. Possibly need up to 100 MHz if debugging SPI
• Except! Large memory depth
  • In Mega-Samples, not Kilo-Samples. This is usually found in high-end analyzers.
• Multiple Views
  • Timing Diagrams and State Listings (standard on pretty much any logic analyzer)
  • Decoding of Serial Protocols (not standard)
  • Useful features: Search, Filtering, and Triggering (capabilities vary tremendously)
• High-Speed Interface to a PC
  • Built-in PC, USB Memory Stick, Ethernet all OK

Logic Analyzer Features

• Low cost USB Logic Analyzer Pods are OK
• A $25,000 logic analyzer is nice to have and will have many extras that could be useful
  • Examples in this presentation use the Tektronix TLA700
• What is discussed in this talk can be accomplished with a $400 USB pod
  • Examples in this presentation use LogicPort or DigiView

Sample Logic Analyzer Output - Timing Diagram

• This output format will be used for examples in this presentation.
• More details on interpreting this diagram follow.

Sample Logic Analyzer Output - State Listing

• Primarily useful for automating analysis. Not covered in this presentation, thus we won’t use this view in examples.

Target Setup: Interconnections

[Diagram: a Logic Analyzer wired to the Real-Time System Under Test. Labeled connections: DEBUG_D0..D7, DEBUG_CLK, Ch.1, Ch.2, RX/TX, INTR, SPI, Keys. Black = required, blue = optional.]

An Aside - Print Statement Debugging

• Forms the basis for logic analyzer methods
  • “Printing” to a logic analyzer uses the same debug concept as print statements
  • Abstract the logic analyzer prints using the same method as debug prints
• First, an example of using macros for Print Statement Debugging

Print Statement Debugging 101

    myfunc() {
        /* code here */
        printf("I got here\n");
        /* more code here */
        printf("Going to call yourfunc()\n");
        result = yourfunc();
        printf("My result is %d\n", result);
        /* etc */
    }

Output:

    I got here
    Going to call yourfunc()
    My result is 384

Print Statement 101 Debugging: Problems

• A lot of typing
• Minimal information per statement
• Hard to separate debug print statements from normal print statements
• Prone to errors
• Cannot easily disable them
• Ultimately very inefficient

Debug Macros instead of Print Statements - DEBUG_WHERE()

    #define DEBUG_WHERE() \
        fprintf(stderr, "[%s:%u-%s]\n", \
                __FILE__, __LINE__, __FUNCTION__)

Print Statement Debugging 101

    myfunc() {
        /* code here */
        DEBUG_WHERE();
        /* more code here */
        DEBUG_WHERE();
        result = yourfunc();
        DEBUG_INT(result);
        /* etc */
    }

Output:

    [myfile.c:3-myfunc]
    [myfile.c:5-myfunc]
    [myfile.c:7-myfunc] result=384

Debug Macros instead of Print Statements - DEBUG_INT()

    #define DEBUG_INT(_var) \
        fprintf(stderr, "[%s:%u-%s] %s=%d\n", \
                __FILE__, __LINE__, __FUNCTION__, \
                #_var, _var)

    result = 384;
    DEBUG_INT(result);

Output:

    [myfile.c:7-myfunc] result=384

Debug Macros instead of Print Statements - DEBUG_HEX8()

• Printing in hex instead of decimal is the basis for what a logic analyzer can show.

    #define DEBUG_HEX8(_var) \
        fprintf(stderr, "[%s:%u-%s] %s=0x%02X\n", \
                __FILE__, __LINE__, __FUNCTION__, \
                #_var, _var)

    result = 0xA4;
    DEBUG_HEX8(result);

Output:

    [myfile.c:7-myfunc] result=0xA4

Debug Macros instead of Print Statements - DEBUG_HEX16()

• Printing larger size variables can also be helpful.

    #define DEBUG_HEX16(_var) \
        fprintf(stderr, "[%s:%u-%s] %s=0x%04X\n", \
                __FILE__, __LINE__, __FUNCTION__, \
                #_var, _var)

    result = 0x02A4;
    DEBUG_HEX16(result);

Output:

    [myfile.c:7-myfunc] result=0x02A4

Debug Macros instead of Print Statements - DEBUG_HEX32()

• Printing larger size variables can also be helpful.

    #define DEBUG_HEX32(_var) \
        fprintf(stderr, "[%s:%u-%s] %s=0x%08X\n", \
                __FILE__, __LINE__, __FUNCTION__, \
                #_var, _var)

    result = 0x000352A4;
    DEBUG_HEX32(result);

Output:

    [myfile.c:7-myfunc] result=0x000352A4
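One problem listed earlier for plain print statements is that they cannot easily be disabled. The following is a minimal sketch of one way a debug macro could be switched off at compile time. The `DEBUG_PRINT_ENABLED` switch and the `debug_format_hex8()` helper are assumptions made for illustration and do not appear in the slides; the helper exists only so the output format can be checked on a desktop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical compile-time switch (not from the slides): build with
 * -DDEBUG_PRINT_ENABLED=0 to compile every debug print out of the program. */
#ifndef DEBUG_PRINT_ENABLED
#define DEBUG_PRINT_ENABLED 1
#endif

/* Hypothetical helper: formats one debug line (without the newline) into buf.
 * Only the low 8 bits of value are shown. Returns what snprintf() returns. */
static int debug_format_hex8(char *buf, size_t len, const char *file,
                             unsigned line, const char *func,
                             const char *name, unsigned value)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "[%s:%u-%s] %s=0x%02X",
                    file, line, func, name, value & 0xFFu);
}

#if DEBUG_PRINT_ENABLED
#define DEBUG_HEX8(_var)                                            \
    do {                                                            \
        char dbg_line_[128];                                        \
        debug_format_hex8(dbg_line_, sizeof dbg_line_, __FILE__,    \
                          (unsigned)__LINE__, __func__, #_var,      \
                          (unsigned)(_var));                        \
        fprintf(stderr, "%s\n", dbg_line_);                         \
    } while (0)
#else
#define DEBUG_HEX8(_var) ((void)0)   /* expands to nothing when disabled */
#endif
```

Wrapping the body in `do { ... } while (0)` lets the macro be used as a single statement, for example inside an `if` without braces.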

Debug Macros for Logic Analyzers

• Similar to debugging with Print Statements, except:
• (disadvantage) More difficult to use
  • Learning curve could be days, not minutes (but it’s worth it for the “hard” bugs that could otherwise take weeks to debug)
• (advantage) 1000 times more information
  • Print Statement: 50 lines/data points per second
  • Logic Analyzer: 50,000 data points per second is easy
• (advantage) Visualize execution to spot trouble points in seconds
  • Don’t need to page through 1000 pages of debug output
• (advantage) Precise real-time view of the system
  • Data points are time-stamped with microsecond or better resolution
• (advantage) Zoom in/out (like Google Maps!)
  • Takes just a moment to switch from looking at a minute of execution to 10 usec

Basics of Logic Analyzer Debugging

• Print “hex” codes to a logic analyzer. Demonstrated here by example:
• Using a print statement:

    int GetVel(int Pos) {
        int Vel = f(Pos);
        printf("Pos=%04X Vel=%04X\n", Pos, Vel);
        return (Vel);
    }

• Using logic analyzer macros:

    int GetVel(int Pos) {
        int Vel = f(Pos);
        LADEBUG_HEX16(0x40, Pos);
        LADEBUG_HEX16(0x41, Vel);
        return (Vel);
    }

Basics of Logic Analyzer Debugging

    Pos = 0x0063;
    LADEBUG_HEX16(0x40, Pos);

Timing diagram of each bit. We’ll see the value of these later.

Basics of Logic Analyzer Debugging

Spot the code pattern. Each code represents a different variable. Correlate with source code. Variable data is displayed in sequence.

    LADEBUG_HEX16(0x40, Pos);
    LADEBUG_HEX16(0x41, Vel);
    LADEBUG_HEX16(0x42, Acc);
    LADEBUG_HEX16(0x43, Time);

[Timing diagram labels: Pos, Vel, Acc, Time]

Basics of Logic Analyzer Debugging

Timescale is microseconds. Easily add debug statements with minimal intrusion on real-time code.

In contrast, each print statement takes 1+ milliseconds.

[Timing diagram marker: 1 usec]

Defining Logic Analyzer Macros

• LADEBUG_HEX8 is a hardware-independent abstraction
• It will generally be necessary to create two platform-dependent macros:
  • LADEBUG_INIT() to initialize the hardware
  • LADEBUG_HEX8() to send an 8-bit code
• Other macros can generally be built upon those in a hardware-independent manner
  • E.g. LADEBUG_HEX16()
  • Examples on following slides

LADEBUG Macro Definition – Example MSP430

• Assume we want to use Port 4 of an MSP430

    // MSP430 is a simple memory-mapped port
    // Using P4.0 thru P4.7 for our 8 bits
    // volatile keeps the compiler from optimizing away the port writes
    static volatile uint8_t *datreg = (volatile uint8_t *) 0x001D;
    static volatile uint8_t *dirreg = (volatile uint8_t *) 0x001E;

    #define LADEBUG_INIT() { \
        *dirreg = 0xFF; }               // Initialize as output

    #define LADEBUG_HEX8(_val) { \
        *datreg = (uint8_t)(_val); }    // output code to P4

LADEBUG_HEX8() Macro with Set/Clear Registers

    // Example, ARM, use bits 3 thru 10 of a GPIO bank
    static volatile uint32_t *setreg = (volatile uint32_t *) 0x40E00020;
    static volatile uint32_t *clrreg = (volatile uint32_t *) 0x40E0002C;
    static volatile uint32_t *dirreg = (volatile uint32_t *) 0x40E00014;

    #define LADEBUG_INIT() { \
        *dirreg |= 0x000007F8; /* set bits for output */ }

    #define LADEBUG_HEX8(_hex8) { \
        uint32_t set32, clr32, h32 = (uint32_t)(_hex8); \
        set32 = ( (h32 << 3)   & 0x000007F8); /* mark bits to set   */ \
        clr32 = (((~h32) << 3) & 0x000007F8); /* mark bits to clear */ \
        *setreg = set32; /* write 1 to the set bits register */ \
        *clrreg = clr32; /* write 1 to the clr bits register */ }
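The mask arithmetic in the set/clear macro can be checked on a desktop before it touches hardware. The sketch below is an illustration only: the `la_masks_t` type and the `la_masks_for()` helper are hypothetical names, and the function performs no register access. It computes the two values that the macro would write for bits 3 through 10.

```c
#include <assert.h>
#include <stdint.h>

#define LADEBUG_SHIFT 3u            /* debug bits start at GPIO bit 3     */
#define LADEBUG_MASK  0x000007F8u   /* eight consecutive bits, 3 thru 10  */

/* Hypothetical result type: the two register values for one debug code. */
typedef struct {
    uint32_t set;   /* value for the "set bits" register   */
    uint32_t clr;   /* value for the "clear bits" register */
} la_masks_t;

/* Hypothetical helper (not from the slides): same arithmetic as the macro,
 * without any hardware access, so it can be tested on the host. */
static la_masks_t la_masks_for(uint8_t hex8)
{
    la_masks_t m;
    uint32_t h32 = hex8;

    m.set = (h32 << LADEBUG_SHIFT) & LADEBUG_MASK;
    m.clr = ((~h32) << LADEBUG_SHIFT) & LADEBUG_MASK;

    /* Every debug bit must be either set or cleared, never both. */
    assert((m.set | m.clr) == LADEBUG_MASK);
    assert((m.set & m.clr) == 0u);
    return m;
}
```

Because the set and clear values go out as two separate register writes, the pins briefly hold an intermediate value between the writes; a fast logic analyzer may show that transient as a very short extra code.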

LADEBUG_HEX8() when Scrounging Bits

• On the PXA270, we had the following bits available, by leveraging the Camera Interface bits:
  • GPIO Port2: 84, 85, 93, 94
  • GPIO Port3: 116, 106, 107, 108
• PXA270 has separate set/clear registers.
• Code became too complex for a macro, hence:
  • #define LADEBUG_HEX8(val) ladebugHex8(val)
• Code takes longer to execute
  • 5 usec instead of 0.5 usec
  • Still very usable with minimal intrusion on system

LADEBUG_HEX8() when Scrounging Bits

    // setreg2/clrreg2 and setreg3/clrreg3 point at the Port2 and Port3
    // set/clear registers (declarations not shown on the slide)
    void ladebugHex8(uint8_t hex8) {
        uint32_t h32 = (uint32_t) hex8;
        uint32_t set2, set3, clr2, clr3;

        // First determine what to write to each register
        set2 = ((h32 & 0x0C) << (29-2)) | ((h32 & 0x03) << (20-0)); // D3.D2 | D1.D0
        set3 = ((h32 & 0xE0) << (10-5)) | ((h32 & 0x10) << (20-4)); // D7.D6.D5 | D4
        h32 = ~h32;
        clr2 = ((h32 & 0x0C) << (29-2)) | ((h32 & 0x03) << (20-0)); // D3.D2 | D1.D0
        clr3 = ((h32 & 0xE0) << (10-5)) | ((h32 & 0x10) << (20-4)); // D7.D6.D5 | D4

        // Then write out to the GPIOs
        *clrreg2 = clr2;
        *setreg2 = set2;
        *clrreg3 = clr3;
        *setreg3 = set3;
    }
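Bit-scattering code like this is easy to get wrong by one shift. A possible host-side check, sketched below, scatters every 8-bit code with the same shifts as the slide and gathers it back. The function names are hypothetical, and `la_gather()` is an inverse written for this illustration, not something from the talk.

```c
#include <assert.h>
#include <stdint.h>

/* Same shifts as the slide: within Port2, bits 20..21 carry D0..D1 and
 * bits 29..30 carry D2..D3. */
static uint32_t la_port2_bits(uint8_t hex8)
{
    uint32_t h32 = hex8;
    return ((h32 & 0x0Cu) << (29 - 2)) | ((h32 & 0x03u) << (20 - 0));
}

/* Within Port3, bits 10..12 carry D5..D7 and bit 20 carries D4. */
static uint32_t la_port3_bits(uint8_t hex8)
{
    uint32_t h32 = hex8;
    return ((h32 & 0xE0u) << (10 - 5)) | ((h32 & 0x10u) << (20 - 4));
}

/* Hypothetical inverse: rebuild the 8-bit code from the two port values. */
static uint8_t la_gather(uint32_t p2, uint32_t p3)
{
    uint32_t v = ((p2 >> 27) & 0x0Cu) | ((p2 >> 20) & 0x03u)
               | ((p3 >> 5)  & 0xE0u) | ((p3 >> 16) & 0x10u);
    return (uint8_t)v;
}

/* Aborts (via assert) on the first code that does not survive a round trip. */
static void la_scatter_selftest(void)
{
    for (unsigned v = 0; v < 256u; v++) {
        uint8_t code = (uint8_t)v;
        assert(la_gather(la_port2_bits(code), la_port3_bits(code)) == code);
    }
}
```

A round trip only proves that the shifts are mutually consistent; whether the chosen bit positions match the wiring still has to be confirmed on the logic analyzer.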

Always Test Macros Before Using Them

    LADEBUG_HEX8(0xFF);
    LADEBUG_HEX8(0x01);
    LADEBUG_HEX8(0x02);
    LADEBUG_HEX8(0x04);
    LADEBUG_HEX8(0x08);
    LADEBUG_HEX8(0x10);
    LADEBUG_HEX8(0x20);
    LADEBUG_HEX8(0x40);
    LADEBUG_HEX8(0x80);
    LADEBUG_HEX8(0x55);
    LADEBUG_HEX8(0xAA);
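The same test pattern can be produced with a loop. In this sketch `LADEBUG_HEX8()` is replaced by a stand-in that records codes in memory so the example runs on a desktop; the `la_capture` buffer and the `ladebug_test_pattern()` function are assumptions for illustration. On a target, the platform macro would drive the port instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the real port write (assumption): records each code. */
static uint8_t la_capture[16];
static size_t  la_capture_len;

#define LADEBUG_HEX8(_val)                                  \
    do {                                                    \
        assert(la_capture_len < sizeof la_capture);         \
        la_capture[la_capture_len++] = (uint8_t)(_val);     \
    } while (0)

/* Hypothetical helper: emits the slide's pattern of all ones, a walking
 * one across the eight data lines, then the two alternating patterns. */
static void ladebug_test_pattern(void)
{
    LADEBUG_HEX8(0xFF);                  /* all bits high                */
    for (unsigned bit = 0; bit < 8u; bit++)
        LADEBUG_HEX8(1u << bit);         /* walking one: 0x01 .. 0x80    */
    LADEBUG_HEX8(0x55);                  /* alternating bits             */
    LADEBUG_HEX8(0xAA);                  /* inverse alternating bits     */
}
```

On the analyzer, the walking-one portion should show each data line pulsing once, in order from D0 to D7, which makes swapped or stuck lines easy to spot.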

LADEBUG_HEX16() Macro

• Other macros can be built off the HEX8 base macro

    // do { } while (0) keeps the three writes together as one statement
    #define LADEBUG_HEX16(code, _val) \
        do { \
            LADEBUG_HEX8(code); \
            LADEBUG_HEX8((_val) >> 8); \
            LADEBUG_HEX8((_val)); \
        } while (0)

    Pos = 0x0063;
    LADEBUG_HEX16(0x40, Pos);

LADEBUG_HEX32() Macro

    // do { } while (0) keeps the five writes together as one statement
    #define LADEBUG_HEX32(code, _val) \
        do { \
            LADEBUG_HEX8(code); \
            LADEBUG_HEX8((_val) >> 24); \
            LADEBUG_HEX8((_val) >> 16); \
            LADEBUG_HEX8((_val) >> 8); \
            LADEBUG_HEX8((_val)); \
        } while (0)

    Sum = 0x0009B900;
    Avg = 0x0000CFB8;
    LADEBUG_HEX32(0x44, Sum);
    LADEBUG_HEX32(0x45, Avg);
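Reading a 32-bit value back from a capture is the reverse of the macro: find the code byte, then combine the four bytes that follow, most significant first. The sketch below assumes the captured codes have been exported into a byte array, and it assumes the capture keeps repeated bytes (such as two consecutive 0x00 values) as separate entries. `la_decode_hex32()` is a hypothetical helper, not part of the talk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder: looks for the first occurrence of `code` that is
 * followed by four data bytes, and rebuilds the value MSB first.
 * Returns 1 and stores the value on success, 0 if no such code is found. */
static int la_decode_hex32(const uint8_t *bytes, size_t len,
                           uint8_t code, uint32_t *value)
{
    assert(bytes != NULL && value != NULL);
    for (size_t i = 0; i + 4 < len; i++) {
        if (bytes[i] == code) {
            *value = ((uint32_t)bytes[i + 1] << 24)
                   | ((uint32_t)bytes[i + 2] << 16)
                   | ((uint32_t)bytes[i + 3] << 8)
                   |  (uint32_t)bytes[i + 4];
            return 1;
        }
    }
    return 0;
}
```

This simple scan can be fooled when a data byte happens to equal the code being searched for, so a match is best confirmed against the surrounding codes in the trace.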

More Logic Analyzer Macros are Possible

• There are many possibilities for defining macros
  • Use macros to enable a common interface across platforms
  • For each new platform, define the macros in a hardware-dependent manner, and place in a .h file
  • Simply include a different .h file for each platform
• Additional macro examples
  • Depending on what you have to troubleshoot, define the macros to meet your specific needs
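One possible layout for the per-platform header idea is sketched below. The switch names `LADEBUG_DISABLED` and `LADEBUG_PLATFORM_HEADER`, and the host fallback that logs codes to memory, are assumptions for illustration. The hardware-independent `LADEBUG_HEX16()` is then built once on top of whichever `LADEBUG_HEX8()` was selected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(LADEBUG_DISABLED)
/* Release builds: every debug macro compiles to nothing. */
#define LADEBUG_INIT()      ((void)0)
#define LADEBUG_HEX8(_val)  ((void)0)

#elif defined(LADEBUG_PLATFORM_HEADER)
/* Target builds: the build system names the platform header, for example
 * -DLADEBUG_PLATFORM_HEADER='"ladebug_msp430.h"' (hypothetical file name). */
#include LADEBUG_PLATFORM_HEADER

#else
/* Host fallback (assumption): log codes to memory for desktop testing. */
static uint8_t ladebug_log[256];
static size_t  ladebug_log_len;

#define LADEBUG_INIT()  do { ladebug_log_len = 0; } while (0)
#define LADEBUG_HEX8(_val)                                   \
    do {                                                     \
        assert(ladebug_log_len < sizeof ladebug_log);        \
        ladebug_log[ladebug_log_len++] = (uint8_t)(_val);    \
    } while (0)
#endif

/* Hardware-independent: defined once, works with any LADEBUG_HEX8(). */
#define LADEBUG_HEX16(code, _val)       \
    do {                                \
        LADEBUG_HEX8(code);             \
        LADEBUG_HEX8((_val) >> 8);      \
        LADEBUG_HEX8(_val);             \
    } while (0)
```

With the host fallback, instrumented functions can be exercised on the desktop and the logged code sequence compared against what is expected on the analyzer.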

Tracing Code using Logic Analyzer Macros

Tracing Driver, Low-Level, and Real-Time Code

• Tracing is typically the starting point for troubleshooting hard bugs
  • Need to understand the program flow, including paths taken within conditionals and loops in the lowest-level code
  • Especially if you can’t use printf() or serial I/O because they are too slow
  • Or can’t use breakpoints because they break functionality
• The code to troubleshoot was written by someone else
  • The flow of code might be difficult to follow, thus trying a straight code review can be confusing. Instead, get facts on program flow.

Where do we start?

• Consider Model of Typical Real-Time Code:
  • Periodic threads wait for Timer Events
  • Aperiodic threads may wait for some kind of message or signal to arrive
  • Interrupt handlers wait for hardware interrupts to trigger
  • In each case, processing is usually the same, per this basic model

[Diagram: Thread loop: Read Inputs/Events → Do Processing → Write Outputs → Wait for Event or Timer → repeat]

Start by Tracing Who Gets Called When

Mark beginning and end of each key thread or interrupt

[Diagram: Thread loop: Read Inputs/Events → Do Processing → Write Outputs → Wait for Event or Timer, with markers labeled “Beginning of Execution” and “End of Execution”]

Codes to Logic Analyzer

• To trace, send codes to the logic analyzer that show when each thread (or function or interrupt handler) starts and finishes
  • No specific rules on what codes to use
  • Using codes that establish patterns makes it faster to understand
• My personal convention
  • Two-digit hex numbers show up on the logic analyzer
  • Use first digit to indicate which thread
  • Use second digit to indicate where we are within a thread
    • Use “1” as second digit for start
    • Use “F” as second digit for finish
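The two-digit convention can be captured in a few small helpers so that codes are built and interpreted consistently. This is a sketch only; the function names are hypothetical and not part of the talk.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for the convention: the high hex digit identifies
 * the thread, the low hex digit identifies the position within it. */
static uint8_t la_code(unsigned thread, unsigned step)
{
    assert(thread <= 0xFu && step <= 0xFu);
    return (uint8_t)((thread << 4) | step);
}

static uint8_t la_start(unsigned thread)  { return la_code(thread, 0x1u); }
static uint8_t la_finish(unsigned thread) { return la_code(thread, 0xFu); }

/* Which thread does a captured code belong to? */
static unsigned la_thread_of(uint8_t code) { return (unsigned)(code >> 4); }

/* Is the captured code a finish ("xF") code? */
static int la_is_finish(uint8_t code) { return (code & 0x0Fu) == 0x0Fu; }
```

For example, thread 2 would send `la_start(2)` (0x21) after it wakes up and `la_finish(2)` (0x2F) just before it waits again.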

Start by Tracing Who Gets Called When

We can monitor up to 16 threads/functions/interrupts at a time using this method.

[Diagram: Thread loop: Read Inputs/Events → Do Processing → Write Outputs → Wait for Event or Timer, annotated with LA_DEBUG(0x21) and LA_DEBUG(0x2F)]

Start by Tracing Who Gets Called When

Add additional debug macros within the code to monitor branches or function calls

[Diagram: Thread loop: Read Inputs/Events → If (condition) { Do_A() } else { Do_B() } → Write Outputs → Wait for Event or Timer, annotated with LA_DEBUG(0x22), LA_DEBUG(0x23), and LA_DEBUG(0x24)]

Sample Instrumentation of Code

    while (1) {
        LADEBUG_HEX8(0x2F);
        wait_for_something();
        LADEBUG_HEX8(0x21);
        read_stuff();
        LADEBUG_HEX8(0x22);
        if (condition) {
            LADEBUG_HEX8(0x26);
            Do_A();
        } else {
            LADEBUG_HEX8(0x2A);
            Do_B();
        }
        LADEBUG_HEX8(0x2C);
        write_stuff();
    }

Intentionally skip a few numbers so that if we want to add more codes later, perhaps inside the function Do_A(), we can keep them in sequence. It is not essential to keep them in sequence, just helpful to keep things less confusing on the logic analyzer output.

We can put up to 16 codes per thread (second digit 0..F).

Visualizing Real-Time Execution

Now What?

• Once code is instrumented, executed, and captured on the logic analyzer, most real-time code will exhibit repeated patterns
  • Patterns may not be precise, but still recognizable.
  • “Issues” (errors, glitches, etc.) typically break the pattern. E.g.
    • Very periodic pattern is broken
    • A non-“xF” code, indicating something not ending in a timely manner
    • Long gap between two logic analyzer codes
    • Short period of more activity than usual
    • Specific codes showing up too often
    • Specific codes not showing up often enough, or ever
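One of these pattern breaks, a non-“xF” code that stays on the port for a long time, lends itself to an automated scan of exported capture data. The sketch below assumes the capture has been exported as a list of timestamped codes; the `la_sample_t` record layout and the `la_find_long_running()` function are hypothetical, since every analyzer has its own export format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical export record: one entry per code change on the debug port. */
typedef struct {
    uint64_t t_us;   /* time the code appeared, in microseconds */
    uint8_t  code;   /* 8-bit debug code                        */
} la_sample_t;

/* Returns the index of the first sample whose code is not a finish ("xF")
 * code and which stays on the port longer than limit_us, or -1 if none. */
static ptrdiff_t la_find_long_running(const la_sample_t *s, size_t n,
                                      uint64_t limit_us)
{
    assert(s != NULL || n == 0);
    for (size_t i = 0; i + 1 < n; i++) {
        uint64_t held_us = s[i + 1].t_us - s[i].t_us;
        if ((s[i].code & 0x0Fu) != 0x0Fu && held_us > limit_us)
            return (ptrdiff_t)i;
    }
    return -1;
}
```

A hit is only a candidate anomaly: the flagged code shows where to zoom in on the timing diagram, not what is wrong.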

Run Code, Capture on Logic Analyzer

• Following Slides Show Many Examples
• Zoom In or Out and Scroll as Needed
  • Zoom out to see the big picture; zoom in to see more detail
  • Scroll left/right to see earlier or later portions of execution
• Look for Anomalies – These are POTENTIAL issues
  • An anomaly is not necessarily an error
  • It represents an area that needs additional focus to see if that’s expected or not
• Search for Specific Codes
  • Use the logic analyzer search function to find a specific thread or function

Code Trace Example – High Level View

• At the high level, can’t see any of the codes, but we could see patterns

Code Trace Example – Medium Level View

• Zoom in a bit, and we can start seeing codes
• When code is efficient and the CPU is not overloaded, most codes end in “F”, which indicates the end of a function or thread

Code Trace Example – Anomaly seen in Medium Level View

• An Anomaly – something running for an extended period of time!
• Code is 0xBA, which is not the end of a function or thread

Measuring Execution Time

• Use the logic analyzer to measure execution time
  • Measurements are accurate to sub-microsecond
  • Put an LADEBUG_HEX8() before and after any code that you want to measure
• Use it for fine-grain measurements. E.g. how long is a 32-bit division?

    LADEBUG_HEX8(0x24);
    y = x / a;   // measuring one line of code
    LADEBUG_HEX8(0x25);

• Use it for coarse-grain measurements. E.g. start and stop of thread.
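When many executions of the same measured region are captured, the durations can also be summarized from exported data instead of being read one at a time with cursors. The sketch below assumes a hypothetical export of timestamped codes in nanoseconds; `la_sample_t` and `la_measure()` are illustrative names. It only counts pairs where the end code directly follows the start code, so an execution interrupted by other instrumented code is skipped rather than mis-measured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical export record: one entry per code change on the debug port. */
typedef struct {
    uint64_t t_ns;   /* time the code appeared, in nanoseconds */
    uint8_t  code;   /* 8-bit debug code                       */
} la_sample_t;

/* Finds every start_code that is immediately followed by end_code and
 * reports the shortest and longest time between the two.
 * Returns the number of pairs found; min/max are only valid if it is > 0. */
static size_t la_measure(const la_sample_t *s, size_t n,
                         uint8_t start_code, uint8_t end_code,
                         uint64_t *min_ns, uint64_t *max_ns)
{
    size_t count = 0;

    assert(min_ns != NULL && max_ns != NULL);
    *min_ns = UINT64_MAX;
    *max_ns = 0;
    for (size_t i = 0; i + 1 < n; i++) {
        if (s[i].code == start_code && s[i + 1].code == end_code) {
            uint64_t d = s[i + 1].t_ns - s[i].t_ns;
            if (d < *min_ns) *min_ns = d;
            if (d > *max_ns) *max_ns = d;
            count++;
        }
    }
    return count;
}
```

A large spread between the minimum and maximum is itself worth a closer look on the timing diagram.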

Measuring Execution Time

Use the logic analyzer to mark the start and end of the code. Then read the exact measurement.

Code Trace Example – Low Level View

• When we zoom in more, we can see the interim trace, not just the ends of functions

Code Trace Example – Low Level View

• Good news. This can be fun. Troubleshooting can be a game.
• Let’s play Find the Anomaly

Code Trace Example – Low Level View

• Same code repeated multiple times, indicative of a loop.
• But not the same number of repetitions each time?

0x13-0x15 occur 6 times, then 8 times. Why?

Code Trace Example – Searching for Specific Code

• Suppose we want to find the execution of one of the instrumented functions, and it’s instrumented as 0x31-0x3F. Find 0x31 first.

Code Trace Example – Searching for Specific Code

Search for 0x31 – The search found it here.

Note: The search function is dependent on the logic analyzer software. It works differently on every analyzer.

Code Trace Example – Searching for Specific Code

Zoom in – See the 0x31.

See the path within the code that was taken: the branch that prints 0x36 was executed.

Visual Debugging: More Examples of Anomalies

• Let’s look at a few more examples of identifying anomalies
  • The more experience you get, the easier it will be to spot the anomaly
  • As you research each anomaly, you’ll get a better understanding of how the code works
  • Once an anomaly becomes understood, and considered “normal”, it is usually easy to ignore it
  • For anomalies that indicate real issues, add more LADEBUG_HEX8() codes if necessary, to gain a better understanding of that code

Find the Anomaly

Anomaly: Inconsistent Clock

• Very repetitive until we reach here.

Anomaly: Inconsistent Clock

• Repeatedly zoom in more to see what is happening

Anomaly: Inconsistent Clock

• Example of printing data within the stream

    msgID = wait_for_something();
    LADEBUG_HEX16(0x31, msgID);
    LADEBUG_HEX8(0x31);

The pattern I use to intermix data is to surround the data with the “trace” code.

Knowing which message came in was the key to understanding why the timing was different. We got an extra unexpected message.

Anomaly: Another Inconsistent Clock

• Half a period? Why?

Anomaly: Another Inconsistent Clock

• Extended Period? Why?

Find the Anomaly

Anomaly: Extended Execution Time

• Code after 0x43 takes much longer one time as compared to other times
• Could be an alternate thread of execution
  • Zoom in to see if others are 0x43 as well.
• Could be a result of preemption.
  • But by what?
  • If preempted, by something NOT instrumented.
  • Possible issue if concerned about race conditions.

Find the Anomaly

Anomaly: Burst Pattern

• Thread instrumented with 0x4? executes 4 times in a row at fixed rate, then long delay, then repeats. Why?

Anomaly: Burst Pattern

• 4th pulse is regularly longer than the other 3. Why?

Find the Anomaly

Find the Anomaly

• Don’t see any issues, looks pretty regular
• Zoom in on a segment

Find the Anomaly

• See anything at this zoom level?

Find the Anomaly

• Maybe if we zoom in more?
• Now we’re seeing differences, but still not sure what the pattern is.
• The 0x3F/0x4F codes mark the start of the repeated pattern

Find the Anomaly

• Now down to 500 us/Div. See something now?

No Anomaly! This is a GOOD Thing!

• Not an anomaly, but rather quite the opposite
• VERY regular at this zoom level – every thread taking 500 usec
• Task 2 runs consistently every msec

No Anomaly! This is a GOOD Thing!

• This is what we strive for!
• Very regular at every zoom level
• Consistent patterns will make it easy to spot any disturbances
• The most reliable real-time systems are repeatable like this during the steady state!
• This is IDEAL. Few systems will look like this, but if yours does, then great!

Find the Anomaly

Anomaly: Clock Drift

• This looks like clock drift

Anomaly: Clock Drift

• This is what it looks like zoomed in

Clock Drift

• Clock drift is a fact of life
• It will happen whenever threads run at almost the same rate, but not quite
• E.g. a sensor has a built-in timer, generating data at 100 Hz
    • The sensor has its own internal oscillator
• The processor reads this with a thread, also running at 100 Hz
    • The thread’s timer is based on the processor crystal
• Unless both timers are sourced from the same crystal, there will be clock drift
    • 100 vs 101 Hz would result in one “skipped” or “extra” cycle per second
    • 100.0 vs 100.1 Hz would be one skipped or extra cycle every 10 seconds
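
The skip arithmetic above can be written down as a small helper. This is only a sketch: the function name `slip_interval_s` is made up for illustration, and it assumes both rates are constant, which real oscillators are not.

```c
/* Seconds between "skipped" or "extra" cycles when two timers run at
 * slightly different rates. One full cycle of difference accumulates
 * every 1 / |rate_a - rate_b| seconds.
 * Returns -1.0 when the rates are identical (no slip ever occurs).
 * Hypothetical helper, not part of the original slides. */
static double slip_interval_s(double rate_a_hz, double rate_b_hz)
{
    double diff = rate_a_hz - rate_b_hz;
    if (diff < 0.0) {
        diff = -diff;
    }
    if (diff == 0.0) {
        return -1.0;
    }
    return 1.0 / diff;
}
```

For example, `slip_interval_s(100.0, 101.0)` is about one second and `slip_interval_s(100.0, 100.1)` is about ten seconds, matching the two cases listed above. Because each clock only holds its rate within a tolerance, the result is an estimate of how often a slip happens, not a prediction of when.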


Clock Drift: One Kind of Glitch

• Since each clock operates within a tolerance, it is not possible to plan the exact time for the skip
• Real-Time software needs to take this into account for every data value, otherwise problems such as empty or full queues can happen
• For example, suppose software ignores this.
    • Assume system typically runs about 8 hours between power cycles
    • Sensor running at 100.01 Hz, thread at 100.00 Hz
    • Data buffer is 1024 sensor values
    • Every 100 seconds, sensor adds one more value to buffer than gets read
    • In 102400 seconds (about 1.2 days), the buffer fills up
    • One day, someone runs the system for more than a day. Voilà, glitch!
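
The buffer example can be explored with a small host-side simulation. This is a sketch under stated assumptions, not code from the talk: the sensor period of 9999 usec (roughly 100.01 Hz) and the thread period of 10000 usec are chosen to mimic the numbers above, and `max_backlog` is a hypothetical name. It compares a reader that takes exactly one value per cycle with a reader that takes every value that has arrived.

```c
#include <stdint.h>

#define SENSOR_PERIOD_US 9999u   /* assumed: sensor slightly fast, ~100.01 Hz */
#define THREAD_PERIOD_US 10000u  /* assumed: reader thread at exactly 100 Hz */

/* Simulate `cycles` reader cycles and return the deepest backlog seen.
 * drain_all == 0: reader takes at most one sample per cycle (ignores drift).
 * drain_all != 0: reader takes every sample that has arrived so far. */
static uint32_t max_backlog(uint32_t cycles, int drain_all)
{
    uint64_t produced = 0;
    uint64_t consumed = 0;
    uint32_t deepest = 0;

    for (uint32_t k = 1; k <= cycles; k++) {
        uint64_t now_us = (uint64_t)k * THREAD_PERIOD_US;
        produced = now_us / SENSOR_PERIOD_US;   /* samples generated so far */

        uint32_t backlog = (uint32_t)(produced - consumed);
        if (backlog > deepest) {
            deepest = backlog;
        }

        if (drain_all) {
            consumed = produced;                /* read everything waiting */
        } else if (backlog > 0) {
            consumed++;                         /* read one, leave the rest */
        }
    }
    return deepest;
}
```

With the one-per-cycle reader the backlog grows by one value roughly every 100 seconds of simulated time, so a 1024-entry buffer eventually fills, as described above. With the drain-everything reader the backlog stays at one or two values no matter how long the simulation runs.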


Find the Anomaly


Find the Anomaly

• Swapping back and forth way too fast.


Find the Anomaly

• No RTOS will thread switch that quickly.
• Threads were running in parallel on two cores.


Consider Problem Scenario

• Customer reported that their system failed
    • They reported “garbage on screens. Some obviously wrong numbers”.
    • They said it’s happened twice over past month
    • Engineers in failure analysis lab unable to replicate it

• What do you do?


Consider Problem Scenario: Rare Glitch

• What do you do?
    • Start by doing what we did above, and just trace the code
    • Look for the baseline pattern.
    • Identify variants, i.e. code that runs occasionally, but not at the same time every cycle relative to the baseline pattern
    • For each variant, review any shared resources for possible conflict
    • If the variant uses significant execution time at high priority, measure how long it is running, and what happens if any other thread is delayed by that long


Troubleshooting Rare Glitches

• Why does this work to find a rare glitch?
    • Many rare glitches actually happen all the time, but the “collision” that makes them observable is much rarer.
• Consider root causes of rare glitches:
    • Timing or synchronization error (including issues with locks and mutexes)
    • Memory corruption
    • Hardware error
• Almost any other type of error will be reproducible
• If troubleshooting a rare glitch, look for timing or synchronization issues!


Troubleshooting Rare Glitches: Example

• The following slides provide an example
• Assume we have been unable to reproduce the issue
    • But someone demands we fix it!
• We’ll use the LADEBUG output to zero in on the thread(s) that have the highest probability of being the source of a rare glitch.


Troubleshooting Rare Glitches that Can’t be Replicated: Look for the “baseline” pattern

• This code is not fully deterministic, but still some patterns observed


Troubleshooting Rare Glitches that Can’t be Replicated : Verify Pattern at Different Levels of Zoom

• One observation when zooming in, Thread 3 regularly follows thread 8


Troubleshooting Rare Glitches that Can’t be Replicated : Look at the Anomalies

• In one spot, there is more activity than usual, with Thread 14 (E) after it.


Troubleshooting Rare Glitches that Can’t be Replicated : Identify a Potential Variant within the Anomaly

• Zoom in on the additional activity. It’s Thread 2; it preempted thread 3!


Troubleshooting Rare Glitches that Can’t be Replicated: Check how variant potentially affects other threads

• Search for other instances of thread 2 – E.g. here it preempts thread 8.


Troubleshooting Rare Glitches that Can’t be Replicated: Give additional scrutiny to the most problematic threads

• Thread 2 is an example of a potentially troublesome variant.
    • It inconsistently preempts others, sometimes in the middle of execution
• This thread should get additional scrutiny
    • Review all potential shared resources.
    • Find all instances of this thread, understand worst-case execution time, potential priority inversion, and effect on other threads whenever this thread runs.
    • If it uses locks or mutexes, try to force the glitch by increasing execution time or forcing a context switch while inside a critical section (Example Follows)
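
Finding every instance of a thread and its longest run can be automated once the logic analyzer capture is exported. The sketch below is only illustrative: the `la_event_t` record, the function `worst_case_ns`, and the code layout (high nibble is the thread id, low nibble 0x0 marks the start of a thread cycle and 0xF marks the end) are all assumptions, and the real export format and code assignments will differ.

```c
#include <stddef.h>
#include <stdint.h>

/* One exported logic analyzer event (hypothetical export format). */
typedef struct {
    uint64_t time_ns;  /* capture timestamp */
    uint8_t  code;     /* 8-bit LADEBUG code */
} la_event_t;

/* Longest start-to-end interval observed for one thread.
 * Assumed code layout: high nibble = thread id,
 * low nibble 0x0 = thread cycle starts, 0xF = thread cycle ends.
 * The interval is wall-clock time, so it includes any time the
 * thread spent preempted by other threads. */
static uint64_t worst_case_ns(const la_event_t *ev, size_t n, uint8_t thread)
{
    uint64_t worst = 0;
    uint64_t start = 0;
    int running = 0;

    for (size_t i = 0; i < n; i++) {
        if ((ev[i].code >> 4) != thread) {
            continue;                       /* event from another thread */
        }
        uint8_t what = ev[i].code & 0x0F;
        if (what == 0x0) {
            start = ev[i].time_ns;
            running = 1;
        } else if (what == 0xF && running) {
            uint64_t elapsed = ev[i].time_ns - start;
            if (elapsed > worst) {
                worst = elapsed;
            }
            running = 0;
        }
    }
    return worst;
}
```

Running this once per thread over a long capture gives a quick ranking of which threads occasionally take much longer than their baseline, which is where a preempting variant such as Thread 2 tends to show up.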


Forcing Rare Glitch to Show Itself
Hypothesis: Mutex is working fine, not cause of glitch. Prove it!

• How do you know that a mutex is working properly?
• What if two threads appear to be using the same mutex, but the mutex was initialized in a different context? There is no obvious way to know that there is a problem.
• More often than not, mutexes are assumed to work and never get tested explicitly
• Months of testing might never catch it. This could become an escaped defect.


Forcing Rare Glitch to Show Itself
Hypothesis: Mutex is working fine, not cause of glitch. Prove it!

• Example of why a mutex error or race condition shows up as a rare glitch:
    • Mutex is held by periodic thread A, about 5 usec every 100 msec, or 0.005% of the time.
    • Aperiodic thread B runs about once every ten or so seconds, and also holds mutex for about 5 usec, or 0.00005% of the time.
    • If rates are not harmonic, the odds of the two colliding on any given run are tiny, on the order of 1-in-10-million.
    • If rates are harmonic, collision is either never or always!
    • With thread A running 10 times per second, it runs 10 million times in about 12 days.


Consider Problem Scenario: Rare Glitch

• Can we test the mutex?
    • Yes … but few people know how.
• If there’s concern that maybe the mutex is not working properly, put a “usleep(1000)” in both thread A and thread B when they hold the lock.
    • This might mess up the real-time performance, but data integrity should be maintained
    • We’ve just changed the 1-in-10-Million odds to about 1-in-10.
    • If there’s a problem with the mutex, it might now show up within a few seconds.
• Put a LADEBUG_HEX8() code at start and end of critical section in each thread.
    • If mutex is a problem, you’ll also see on the logic analyzer where one thread enters the critical section, even though the other one should have it locked.


Troubleshooting Mutexes

• To test a mutex, insert usleep() while holding the lock:

    mutex_Lock();
    {
        LADEBUG_HEX8(((threadId) << 4) | 0x6);  /* entering critical section */
        usleep(1000);                           /* hold the lock longer to force a collision */
        /* do critical section stuff */
        LADEBUG_HEX8(((threadId) << 4) | 0x7);  /* leaving critical section */
    }
    mutex_Unlock();

• For thread 3, LADEBUG code 0x36 means start of critical section, 0x37 end
• Once one thread starts, no other thread should be able to enter
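
The same idea can be tried on a desktop before touching the target. The following is a host-side sketch using POSIX threads, not the author's code: `run_mutex_test`, `worker`, and the two mutex objects are hypothetical names, and an atomic counter stands in for the LADEBUG_HEX8() codes. Passing two different mutex objects mimics the case where the threads only appear to share a lock.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <unistd.h>

static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER; /* a second, unrelated mutex */

static atomic_int inside;      /* threads currently inside the critical section */
static atomic_int violations;  /* times a thread entered while another was inside */

typedef struct {
    pthread_mutex_t *lock;     /* mutex this worker believes protects the section */
    int iterations;
} worker_arg_t;

static void *worker(void *p)
{
    worker_arg_t *arg = p;
    for (int i = 0; i < arg->iterations; i++) {
        pthread_mutex_lock(arg->lock);
        if (atomic_fetch_add(&inside, 1) != 0) {
            atomic_fetch_add(&violations, 1);   /* someone else is already inside */
        }
        usleep(1000);                           /* widen the collision window */
        atomic_fetch_sub(&inside, 1);
        pthread_mutex_unlock(arg->lock);
    }
    return NULL;
}

/* Run two workers and return how many overlapping entries were seen. */
static int run_mutex_test(pthread_mutex_t *first, pthread_mutex_t *second, int iterations)
{
    worker_arg_t a = { first, iterations };
    worker_arg_t b = { second, iterations };
    pthread_t ta, tb;

    atomic_store(&inside, 0);
    atomic_store(&violations, 0);
    pthread_create(&ta, NULL, worker, &a);
    pthread_create(&tb, NULL, worker, &b);
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return atomic_load(&violations);
}
```

Calling `run_mutex_test(&lock_a, &lock_a, 20)` should always return zero, because both workers really share one mutex. Calling `run_mutex_test(&lock_a, &lock_b, 20)` will very likely return a nonzero count within a few milliseconds, although the exact number depends on scheduling. On the target, the counter would be replaced by the LADEBUG codes so the overlap is visible on the logic analyzer.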


Mutex not working

• Sequence 0x36-0x46-0x37 is observed: thread 4 entered the critical section (0x46) before thread 3 left it (0x37)


Mutex Working

• usleep() gives plenty of time for thread 4 to interrupt thread 3 while it held mutex
• Thread 4 (0x41) interrupts thread 3, but thread 3 holds lock (0x37)
• 0x37 indicates thread 3 swapped back in and released lock.
• Since threads 3 and 4 had same priority in this example, thread 3 continued.
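
When a capture is too long to inspect by eye, the working and not-working sequences can be told apart mechanically. This sketch assumes the code scheme from the mutex test (high nibble is the thread id, low nibble 0x6 is entry to the critical section, 0x7 is exit) and that the capture has been exported as a plain array of codes. The function name `find_mutex_violation` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Scan a sequence of LADEBUG codes for a second thread entering the
 * critical section while another thread is still inside it.
 * Assumes low nibble 0x6 = enter, 0x7 = exit, high nibble = thread id.
 * Returns the index of the offending entry code, or -1 if none is found. */
static long find_mutex_violation(const uint8_t *codes, size_t n)
{
    int owner = -1;                       /* thread currently inside, -1 if none */

    for (size_t i = 0; i < n; i++) {
        int thread = codes[i] >> 4;
        int event = codes[i] & 0x0F;

        if (event == 0x6) {
            if (owner >= 0 && owner != thread) {
                return (long)i;           /* entered while someone else held it */
            }
            owner = thread;
        } else if (event == 0x7 && owner == thread) {
            owner = -1;                   /* owner released the lock */
        }
        /* any other code (e.g. 0x41) is unrelated to the mutex and ignored */
    }
    return -1;
}
```

For the failing sequence 0x36, 0x46, 0x37 the function reports index 1, the point where thread 4 entered. For a sequence such as 0x36, 0x41, 0x37, 0x46, 0x47 it reports no violation, since thread 4 only enters after thread 3 has released the lock.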


Avoiding Rare Glitches

• Rare glitches can be the most devastating of all issues
    • Usually the source of the most expensive failures
    • Catastrophic system failures; recalls; regulatory shutdowns.
• Reduce glitches by maximizing determinism
    • Use harmonic periodic threads when possible
    • Use sporadic servers if execution is aperiodic
    • Limit preemption by executing more threads at same priority in FIFO manner
• Use the logic analyzer to verify the determinism
    • Deterministic systems produce the most consistent patterns
    • Start by minimizing the variants when possible
    • Scrutinize every remaining variant
    • Actively test every synchronization mechanism used in the system


Setup Extras


Setup Extras

• Visualizing execution can be augmented by watching other signals at the same time.
    • Key presses and other triggers
    • Serial communication
    • Multiple processors, each with their own 8-bit logic analyzer instrumentation
    • Analog signals or power consumption via oscilloscope integration
    • Reset or other hardware signals
    • Mechanical devices like motors and relays that have significant timing delays


Example: Troubleshooting High Power Consumption

• Next few pages show a logic analyzer setup with:
    • LADEBUG_HEX8(): 8-bit debug port (yellow)
    • LADEBUG_MODE(): 3-bit mode (green) – indicates CPU speed
    • LADEBUG_TRIG(): 1-bit trigger (magenta) – indicates DVFS change
    • 2 keys on keypad (cyan)
    • serial TX/RX (grey)
    • instantaneous current consumption of CPU (green analog)
    • input voltage (red).
• Each page is a zoom of the prior page.
    • The zoom area is shown by the magenta rectangle.


Example: Troubleshooting High Power Consumption


Example: Troubleshooting High Power Consumption

Lower measurement is higher power usage
Mode=0 is CPU @ 13 MHz
Mode=5 is CPU @ 500 MHz
Expected power usage at 500 MHz
Anomaly: twice as much power as expected from CPU @ 500 MHz


Example: Troubleshooting High Power Consumption

Zoom in around the anomaly


Example: Troubleshooting High Power Consumption

DVFS triggered
Expected 100 usec settling time
Codes that may explain details of DVFS change


Example: Troubleshooting High Power Consumption

Zoom in to see codes


Example: Troubleshooting High Power Consumption

From the code, the decision is captured before 0xE4


Example: Troubleshooting High Power Consumption

Zoom in to see codes


Example: Troubleshooting High Power Consumption

Codes show the exact path through the DVFS state change code. The issue was found: the code changed frequency correctly, but set the voltage too high when going from 100 MHz → 500 MHz. The code was OK going from 13 MHz → 500 MHz.


Practical Issue: Race Condition with DEBUG_HEX8

• When writing multiple bits, there are race conditions due to propagation delay that a logic analyzer can capture

• Logic analyzer runs faster than the GPIO writes, thus may capture transitions as separate events

• Usually easy to ignore when visualizing, as these captures are only a few nanoseconds in length

• For automated analysis, however, it is a problem: the tool cannot tell precisely whether the logic analyzer captured a new value or a transition value


Race Condition with DEBUG_HEX8 – Example

• When writing multiple bits, there are race conditions due to propagation delays

• Logic analyzer runs faster than the GPIO writes, thus may capture transitions as separate events

Transition from 0x80 to 0x44


Adding DEBUG_CLK as a 9th GPIO bit

#define LADEBUG_HEX8(_val) { \
    *clkbit |= 0x01; \
    *datreg = (uint8_t)(_val); \
    *clkbit &= ~0x01; \
}

Clock bit eliminates ambiguity


Example of Needing DEBUG_CLK

• 0x46 is NOT actually there. It was the transition from 0x43 to 0x4E
• Tends to happen most if either debug GPIOs are on different registers, or set/clr registers are used.
    • 0b01000011 (0x43)
    • 0b01001110 (0x4E)

Bits shown in blue (bits 3, 2, and 0 here) are changing at different rates, and are thus subject to different delays when changing 0→1 or 1→0

No clock bit; transition is due to propagation delays
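
Without a clock bit, an analysis tool can at least flag values that are consistent with being a transition artifact. The check below is a sketch with a hypothetical name, `is_transition_artifact`. It relies on the fact that during a change only the bits that differ between the old and new value can be in flux, so a genuine intermediate capture must match the old value in every other bit.

```c
#include <stdint.h>

/* Returns 1 if `seen` could be a propagation-delay artifact captured while
 * the debug port changed from `from` to `to`, 0 otherwise.
 * An artifact differs from `from` only in bits that are changing, and is
 * neither the old nor the new value. */
static int is_transition_artifact(uint8_t from, uint8_t seen, uint8_t to)
{
    uint8_t changing = (uint8_t)(from ^ to);       /* bits that flip in this write */
    uint8_t stable = (uint8_t)~changing;           /* bits that must not move */

    if (seen == from || seen == to) {
        return 0;
    }
    return ((uint8_t)(seen ^ from) & stable) == 0;
}
```

For the capture above, `is_transition_artifact(0x43, 0x46, 0x4E)` is true: bits 0 and 2 had already changed while bit 3 had not. The check is only a heuristic, since a value that passes it could also be a real code that happened to be written in between, which is why the clock bit is the more reliable fix.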


DEBUG_CLK Bit

• Use DEBUG_CLK bit if you have one more available GPIO
• While not absolutely necessary when visualizing data, it can help
• If doing DEBUG_HEX32(), it can help distinguish between values with repeated bytes.
    • 0x00001234, 0x00121234, and 0x00123434 all look very similar on logic analyzer when not using DEBUG_CLK to distinguish start of each byte
• Use it especially when exporting data to an external analysis tool
    • External Analysis tool should walk through every event captured, and discard any that don’t have a DEBUG_CLK bit toggle
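
A minimal version of such a filter might look like the following. It is a sketch under assumptions that are not in the slides: each exported sample is a 16-bit word with the debug byte in bits 0 to 7 and DEBUG_CLK in bit 8, and the macro raises the clock before writing the data and lowers it afterwards, so the data is stable on the falling edge. The names `filter_clocked` and `CLK_MASK` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define CLK_MASK 0x100u   /* assumed: DEBUG_CLK captured as bit 8 of each sample */

/* Walk every captured sample and keep only the data byte present at each
 * falling edge of DEBUG_CLK. Samples without a clock toggle (including
 * propagation-delay transitions) are discarded.
 * Returns the number of bytes written to `out` (at most `out_cap`). */
static size_t filter_clocked(const uint16_t *samples, size_t n,
                             uint8_t *out, size_t out_cap)
{
    size_t count = 0;

    for (size_t i = 1; i < n; i++) {
        int was_high = (samples[i - 1] & CLK_MASK) != 0;
        int is_high = (samples[i] & CLK_MASK) != 0;

        if (was_high && !is_high && count < out_cap) {
            out[count++] = (uint8_t)(samples[i] & 0xFFu);
        }
    }
    return count;
}
```

Given a capture of 0x043, 0x143, 0x146, 0x14E, 0x04E (clock rises, data passes through the spurious 0x46, settles at 0x4E, clock falls), the filter emits the single value 0x4E. The same edge detection also separates repeated bytes in a DEBUG_HEX32() output, because each byte produces its own clock pulse even when the data lines do not change.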


Troubleshooting Serial Links

• Green shows I2C.
• Obvious anomaly: I2C stops for an extended period of time


Troubleshooting Serial Links

• Zoom in multiple times to see more


Troubleshooting Serial Links

• I2C signals are more obvious, but we can’t tell what the I2C data is.


Serial Interpreters

• Many Logic Analyzers have Serial Interpreters for popular protocols
    • RS232/UART, SPI, I2C, SDHC
• They convert the serial data into codes, similar to LADEBUG codes
    • E.g. Following is an I2C breakout, including address/start/stop bits


Serial Interpreters

• Couple serial lines with LADEBUG codes to see relationship
• Types of issues that can be observed
    • Code will “write” register, but serial transmission is usually NOT done when code continues, as operations continue in parallel.
    • Observe pre-loading the NEXT byte, while previous byte is being transmitted
    • Determine when a status bit gets set relative to the serial transmission
    • Byte ordering on serial line
    • Whether data on serial line matches data that was sent or received


Summary

• Troubleshooting Tough Bugs using a Logic Analyzer
    • Print Debug Macros
    • Logic Analyzer Debug Macros
    • Tracing Code using Logic Analyzer debug macros
    • Displaying Variable Data on the Logic Analyzer
• Visualizing Real-Time Execution
    • Focus on Anomalies
    • Performance Issues
    • Clock or Synchronization Errors
    • Troubleshooting Rare Glitches
• Setup Extras
    • Analog Signals coupled with Digital Debug
    • Debug Clock Bit
    • Serial Protocols


Visualizing Real-Time Errors and Performance Anomalies

Dave Stewart, PhD
Sr. Principal Software Architect – Physio-Control, Inc.

dave.stewart@physio-control.com ¤ http://davestewart.info
