validation with code introspection of a virtual platform ...€¦ · validation with code...
Post on 01-Oct-2020
7 Views
Preview:
TRANSCRIPT
C&ESAR 2019
Validation with Code Introspection of a Virtual Platform for Sandboxing and
Security Analysis
Yves Lhuillier, Gilles Mouchard, Franck Vedrine
CEA LIST, Software Reliability and Security Laboratory
Tuesday, November 19
Rennes Convention Center - France
C&ESAR 2019
Introduction
Electronic Systems Virtualization Hardware design exploration Test bench virtualization Interoperability Resource optimization Instrumentation Sandboxing (cybersecurity)
Reusing common tools for V&V, safety and cybersecurity Decrease development costs Increase confidence
Growing need for high accuracy/fidelity simulators
Ta
rget S
W
Mo
de
l
CPU CPU
BUS
MEM IO
Target CPU
Simulator
Targ
et H
W
■ Program outputs
■ Non-functional stats
C&ESAR 2019
Outline
Context
Virtual platforms with UNISIM-VP framework
Cybersecurity use-cases with BINSEC
Self-Verifying simulator
Conclusion
C&ESAR 2019
Context : Virtualization and Cybersecurity
Complexification of software stacks How to address this complexity ?
System-wide approaches Rights management
Confinement, firewalls
Traffic inspection
ML detection
Code-centric approaches Quality code engineering
Rich and automatic code analysis
Threads
Applications
Services
Hyperv
isor
S
uperv
isor
V
M
Drivers
Lib
rarie
s P
rogra
m
Pro
gra
m
Application
C&ESAR 2019
Models and simulators are critical tools for designing and testing and validating electronic and cyber-physical systems.
Requirements Specification
HW/SW interface
HW prototyping SW development
VIRTUAL
PLATFORM
Refinem
ent
Assem
ent
Developing, using
and shipping virtual
platforms for more
than 20 years
Our virtual platforms are
primarily designed to
help CPS software
developers
Context : UNISIM-VP framework
C&ESAR 2019
UNISIM-VP
Services
Hardware
Components
Models
Third Party
Tools
Interfaces
■ Test
■ V&V
■ Debugger(s)
■ Code profiling
■ Monitoring
■ Trace analysis
■ Code Sanitizing
GNU GDB
DWARF
ML507 Board S12XDP512 Board
■ System on Chip/Boards
■ Processors
■ Interconnects,
memories, peripherals
http://unisim-vp.org
C&ESAR 2019
Extending UNISIM-VP for generic code analysis
Provides a uniform way to deploy a wide range of code analyzers Numerical stability (floating point instrumentation)
Code sanitizing (e.g. computation on uninitialized values)
Taint analysis
How ? Re-write all hardware simulators to use flexible data type.
Curiously enough, this is often not that hard…
R1 R2 RD
10 5 0
10 5 15
R1 R2 RD
X Y Z
X Y X+Y
UNISIM-VP Instruction Set DescriptionEncodings – Disassembly – Behavior
add.execute( cpu )
{
U32 operand1 = cpu.GetRegister( r1 );
U32 operand2 = cpu.GetRegister( r2 );
U32 result = operand1 + operand2;
cpu.SetRegister( result );
}
Ex:
pro
ce
sso
r sim
ula
tor Concrete
Executing instruction
on concrete values
Symbolic
Executing instruction
on symbolic values
C&ESAR 2019
Security use-case: the BINSEC tool
https://binsec.github.io
Analyzers
(Back-ends)Decoders
(Front-end) DBAIR
BINSEC
Intermediate representation
Machine
Code
IA32,
RISC-V
simplifyers
BINSEC is a binary code analyzer for security
leveraging formal methods & automatic code verification
C&ESAR 2019
Using UNISIM-VP as front-end to BINSEC
Fast & Successful addition of a full ARMv7 front-end to BINSEC
(Re)demonstration with existing code analyzers / back-ends “Cyber Grand Challenge” demos
Machine Code to C source decompilationF. RECOULES et al. “Get rid of inline assembly through verification-oriented lifting”
Application to vulnerability detection in an Avionics Real-Time OS Multiple non-interference proofs (CODEX back-end)
One critical bug identification
C&ESAR 2019
Instruction Set Simulator Verification
Why verify instruction set simulator ? Some critical systems are tested and validated on simulators Formal methods based on semantic accuracy
Why validate using single instruction tests (unit tests) ? Simulator errors are hard to spot on full application runs Only way to verify every machine instruction
Why a self-verifying simulator ? Writing unit tests is tedious Running unit tests is tedious
C&ESAR 2019
Encodings scan
Decode Simulate
Analyze
Te
sts
se
lectio
n
bytes
Instructions
Execute
Compare
Tests generation
Behaviors
functions
JIT
outputs outputs
results
Algorithm
Data
Architecture
Specific
Architecture
Independent
Principles of the self-verifying simulator
C&ESAR 2019
Instruction discovery
2 usual techniques: Random generation of instruction bytes
Derive random encodings of documented instructions
Unfortunately infeasible in x86-64
Up to 15 bytes long instructions
(much) more than one way to encode an instruction
Forced to use a little of x86-64 knowledge
1
10
100
1000
10000
0E+0 1E+5 2E+5 3E+5 4E+5 5E+5 6E+5 7E+5 8E+5 9E+5
0
5000
10000
15000
20000
25000
30000
35000
40000
45000
50000
0E+0 1E+5 2E+5 3E+5 4E+5 5E+5 6E+5 7E+5 8E+5 9E+5
Se
lecte
d (
gro
wth
)S
ele
cte
d (
cu
mu
lative
)
OpCode ModR/M SIB Displacement ImmediatePrefixes
0-4 1-3 ?1 ?1 ?1,2,4 ?1,2,4
Rate of instruction discovery
C&ESAR 2019
Instruction comparison
Monitoring instruction side effect is complex Looking at everything that may have changed is hard
Looking only at things that changed in simulation is biased
Some side effects may be hard to compare
Once again a little x86-64 knowledge is required Lookup simulation side-effects
Lookup common hardware feature: flag values
Transfer comparison values to memory
Some side effects may (still) be hard to compare
Simulate Execute
Compare
Tests generation
functions
JIT
outputs outputs
results
C&ESAR 2019
Characterizing native instruction execution
Native execution of instruction tests raise issues Side effect may be unsafe for execution
Random inputs may generate errors
Side effect difficulty Resolution
Memory access May access memory
randomly
a) Fix target address using input value control
b) Run in an protected environment
Branch May transfer control to
any memory location
Step in debugger
Interrupts and Traps Leave the program
scope
Run in an hypervisor
Hypervisor ops Hard to confine Hardware debug ?
Access HW counters Environment
dependent values
No satisfying solution to our knowledge
C&ESAR 2019
A sandbox detection bug in QEMU
For other simulators: Self-testing should work elsewhere Generated tests can also be reused
Among our own bugs We picked the trickiest Checked other simulators…
Incorrect simulation in QEMU Incorrect decoding of instruction prefixes Incorrect address computation Allows simple sandbox detection May confuse malware analyzers
qemu-x86_64 segment prefixes error Bug #1847467
$ ./sample
I'm not in QEMU
$ qemu-x86_64 ./sample
I'm in QEMU
C&ESAR 2019
Conclusion
Simulator extension for advanced Sandboxing analysis Factorizing developments
Few code modifications, little overhead
Automatic simulator verification/validation Generic methodology
Successful implementation
Crucial follow-ups System instruction characterization
Architecture independent instruction scanning
C&ESAR 2019
Thank you !
top related