Triaging Bugs with Dynamic Dataflow Analysis
Julio Auto [julio {funny a} julioauto com]
Agenda
The Problem
The Solution
Demo
Solution Details
What’s Next?
Greetings & References
Preface
We will be talking about analyzing closed-source software here
  Absolutely no debugging information needed
However...
  Depending on the complexity of the bug, even people with the source might opt for this analysis too
    E.g. Vendors receiving crash reports
The Problem
Sometimes people just have to analyze bugs in closed-source software
These bugs may come from:
  A fuzzing session
  Contributor-sent Proof-of-Concept code
  In-the-wild exploit code
  Etc...
The reasons for analyzing them vary as much as their sources, but that is irrelevant here. The fact is...
The Problem (2)
ANALYZING BUGS CAN BE HARD!
A seasoned reverse engineer may take weeks to get somewhere
  If the target software is too big
  If the data consumed is in a very complex and/or undisclosed format
  If bugs in this target are so rare that your reversing team has no previous experience with it
But which bugs do we care about most?
The Problem (3)
“Analyzing bugs” is very broad
  No ./write-me-a-very-detailed-advisory
We will concentrate on answering one question: what exact part of my data made the program crash?
Understanding that, and how such data is transformed, is essential
The Solution
Dynamic Dataflow Analysis
  Watching data and its ramifications as the doomed program executes
What we really do is Taint Analysis
  We start with a subset of the program’s data: the attacker’s input (assume it’s evil)
  Its ‘ramifications’ are tainted memory and tainted registers
... but we do it backwards.
The Solution (2)
[Figure: two diagrams.
TAINT ANALYSIS: starts from “This is the Evil Input” and asks “Is any of these of interest?”
BACKWARDS TAINT ANALYSIS: starts from “This is of interest” and asks “Is any of these from the Evil Input?”]
The Solution (3)
So we really don’t care about every tainted piece of data in the process space
  Most of it is legitimate, anyway
Thus, we avoid the explosion of watched data
Plus we can do stuff like:
  Bug: mov eax, [esi] (where esi = DEADBEEFh)
  Analysis runs...
  ... and reports: esi = user[4] + var_unk * 8
The Solution (4)
This is all done in two steps: tracing and analysis
First we trace the program from a “good” point until it crashes
  The trace is incrementally dumped to a file
  Not just the disassembly, but also some extra info
    E.g.: In the previous slide’s example, effective address ([esi]) == DEADBEEFh
Then the trace file is analyzed
The Solution (5)
[Figure: execution timeline. Target starts; Evil Input enters (and we start tracing), which is the “good” starting point; Target crashes!]
The Solution (6)
So we feed the trace file to the analyzer and tell it:
  “Address ranges from ABCDh to ACCDh and from DCBAh to DCCAh held Evil Input”
  “I wanna know if ‘esi’ was tainted by Evil Input”
And magic happens!
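The first statement handed to the analyzer amounts to a membership test over address ranges. A minimal sketch, assuming the ranges are inclusive and reusing the slide's placeholder addresses; `EVIL_RANGES` and `held_evil_input` are hypothetical names, not VDT's interface:

```python
# Hypothetical sketch: these names are illustrative, not part of VDT.
# Inclusive (assumed) address ranges that held the Evil Input, from the slide.
EVIL_RANGES = [(0xABCD, 0xACCD), (0xDCBA, 0xDCCA)]


def held_evil_input(address, ranges=EVIL_RANGES):
    """Return True if `address` falls inside any range that held Evil Input."""
    return any(low <= address <= high for low, high in ranges)
```

A memory source operand found during the analysis would be checked against these ranges to decide whether the search has reached attacker-controlled data.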
The Solution (7)
Considerations
  Tracing is very time-consuming
    For the bug I’ll analyze as an example, it takes about 2 hours to dump the 650,000+ instructions it executes
  The analysis... not so much
    1 to 2 minutes
  May sound like much, but how long would it take to do it manually?
    Plus, you can always use this time to do something else while the computer is working for you
Demo
Introducing... Visual Data Tracer!
Solution Details
The VDT Tracer is implemented as a WinDbg extension
  Because WinDbg is free and it’s a great debugger
The VDT Analyzer is a stand-alone C++ app
The tracer needs to understand some simple instruction “semantics”
  E.g.: The source and destination operands
  Currently only the basic x86 subset is implemented (no x87, MMX, etc)
Solution Details (2)
The semantic rules are simplified to avoid dumping useless info to the trace file
  E.g.: a ‘push’ does not meaningfully change ‘esp’ (same for ‘inc’, ‘dec’, and their destination ops)
They are also written to fit the very simplistic format of the trace file entries
All of this makes the analysis easier, thus faster, yet still useful
Solution Details (3)
Trace file entry:
  Mnemonic
  Destination operand
  Source operand
  Up to three source operand “dependences”
Dependences are, for example, the elements of an indirectly addressed memory operand
This effectively exposes the dataflow relations as a Tree (rooted at the crash instruction)
  Performing the backwards taint analysis then becomes a matter of searching the tree, which VDT does with a BFS algorithm
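To make “dependences” concrete, here is a minimal sketch that pulls the registers out of an indirectly addressed memory operand written as text. The helper name, the register list and the textual operand format are assumptions for illustration; the real tracer works inside WinDbg and may obtain this information quite differently.

```python
import re

# Hypothetical helper, not VDT code. Only the 32-bit general-purpose
# registers are listed, in the spirit of the "basic x86 subset".
REGISTERS = ("eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp")
_REG_PATTERN = re.compile(r"\b(" + "|".join(REGISTERS) + r")\b")


def memory_operand_dependences(operand):
    """Return the registers used to form the address of a memory operand.

    Operands without square brackets (registers, immediates) yield [].
    """
    match = re.search(r"\[([^\]]*)\]", operand)
    if match is None:
        return []
    return _REG_PATTERN.findall(match.group(1).lower())
```

For instance, `[eax+ecx*8]` yields `eax` and `ecx`, while an absolute address such as `[0xABCD]` has no register dependences.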
Solution Details (4)
Putting it together so far
mov edi, 0x1234 ; dst=edi, src=0x1234
mov eax, [0xABCD] ; dst=eax, src=ptr 0xABCD ; Note 0xABCD is evil addr
lea ebx, [eax+ecx*8] ; dst=ebx, src=eax, srcdep1=ecx
mov [edi], ebx ; dst=ptr 0x1234, src=ebx
mov esi, [edi] ; dst=esi, src=ptr 0x1234, srcdep1=edi
mov edx, [esi] ; Crash!!!
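The trace above can be walked backwards from the crash with a breadth-first search. The following is a sketch under stated assumptions, not VDT's C++ analyzer: operands are plain strings, `TraceEntry`, `last_definition` and `backward_taint` are hypothetical names, and the source operand and its dependences are treated as alternative ways of influencing the destination.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEntry:
    mnemonic: str
    dst: str
    src: str
    deps: tuple = ()  # up to three source operand dependences


# The slide's example, up to (not including) the crashing instruction.
TRACE = [
    TraceEntry("mov", "edi", "0x1234"),
    TraceEntry("mov", "eax", "ptr 0xABCD"),
    TraceEntry("lea", "ebx", "eax", ("ecx",)),
    TraceEntry("mov", "ptr 0x1234", "ebx"),
    TraceEntry("mov", "esi", "ptr 0x1234", ("edi",)),
]
EVIL = {"ptr 0xABCD"}  # locations that held Evil Input


def last_definition(trace, location, before):
    """Index of the latest entry before `before` that writes `location`."""
    for index in range(before - 1, -1, -1):
        if trace[index].dst == location:
            return index
    return None


def backward_taint(trace, location, evil):
    """BFS from `location` at the crash back towards Evil Input.

    Returns the chain of locations from `location` to the evil source,
    or None if no such chain exists in the trace.
    """
    start = (location, len(trace))
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        loc, before = node
        index = last_definition(trace, loc, before)
        if index is None:
            # Never written inside the trace: either evil or unknown.
            if loc in evil:
                chain = []
                while node is not None:
                    chain.append(node[0])
                    node = parents[node]
                return chain[::-1]
            continue
        entry = trace[index]
        for source in (entry.src, *entry.deps):
            child = (source, index)
            if child not in parents:
                parents[child] = node
                queue.append(child)
    return None
```

Calling `backward_taint(TRACE, "esi", EVIL)` follows esi back through `ptr 0x1234`, `ebx` and `eax` to `ptr 0xABCD`; `ecx` is never written in the trace, which corresponds to the unknown term in a report such as `user[4] + var_unk * 8`.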
Solution Details (5)
Simplifying semantic rules to fit that format is not always easy
  CMPXCHG r/m32, r32
    “Compare EAX with r/m32. If equal, ZF is set and r32 is loaded into r/m32. Else, clear ZF and load r/m32 into EAX.”
  The aftermath: the need for “conditional taints”
    r/m32 ← r32 if EAX == r/m32
    i.e. One of the possibilities of controlling ‘r/m32’ is controlling ‘r32’ AND ‘eax’
  Note that “alternative taints” also exist, implemented in the form of srcdep{1,2,3}
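The difference between the two kinds of taint can be read as AND versus OR over the sources. A minimal sketch, assuming a set of locations already known to be attacker-controlled; both function names are hypothetical and say nothing about how VDT encodes this:

```python
# Hypothetical sketch of the two taint flavours; not VDT's representation.

def controls_alternative(controlled, sources):
    """Alternative taint: controlling any one source is enough (OR)."""
    return any(source in controlled for source in sources)


def controls_conditional(controlled, sources):
    """Conditional taint: every listed source must be controlled (AND)."""
    return all(source in controlled for source in sources)
```

For `cmpxchg [edi], ecx`, the destination would be reached through a conditional taint over `("ecx", "eax")`, whereas the srcdep fields of an ordinary entry behave like an alternative taint.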
Solution Details (6)
Other subtleties to watch for
  AH defines EAX
  EAX defines AL
  AL does not define AH
Similar problem for 1-byte and 2-byte memory accesses
[Figure: layout of EAX (32 bits). The low half is AX (16 bits), split into AH (8 bits) and AL (8 bits); the high half is unnamed (16 bits).]
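One way to capture these rules is to treat each register name as a byte range inside its full register and test for overlap. This is a sketch under that assumption, limited to the EAX family; `SUBREGISTERS` and `defines` are hypothetical names.

```python
# Hypothetical sketch: (full register, byte offset, byte width).
# Offsets count from the least significant byte.
SUBREGISTERS = {
    "eax": ("eax", 0, 4),
    "ax": ("eax", 0, 2),
    "al": ("eax", 0, 1),
    "ah": ("eax", 1, 1),
}


def defines(writer, reader):
    """True if writing `writer` changes at least one byte of `reader`."""
    w_base, w_off, w_len = SUBREGISTERS[writer]
    r_base, r_off, r_len = SUBREGISTERS[reader]
    same_register = w_base == r_base
    ranges_overlap = w_off < r_off + r_len and r_off < w_off + w_len
    return same_register and ranges_overlap
```

The same interval test would apply to 1-byte and 2-byte memory accesses, with the address playing the role of the byte offset.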
What’s Next?
Extending the coverage of x86
Enhancing speed
  God knows how...
Heuristically detecting user input
  e.g. By making the tracer understand CreateFile()
Automatic exploit generation
What else?
  Any ideas, let me know...
References
SpiderPig Project - http://piotrbania.com/all/spiderpig/
  Very similar ideas, different approach
!exploitable - http://www.codeplex.com/msecdbg
  A more superficial (but much faster) tool for bug triaging
  If you have many bugs to triage, you can first run !exploitable on them and then use VDT on those that seem really interesting
Greetings
Julien Vanegue
  For all the lecturing, motivating and supporting
Piotr Bania
  For discussing DDF analysis and much more
People from PSV (http://www.unprotectedhex.com/psv)
  For letting me idle on IRC, leeching their knowledge
Everyone else who talks to me about security and similarly cool stuff