
C&ESAR 2019

Validation with Code Introspection of a Virtual Platform for Sandboxing and Security Analysis

Yves Lhuillier, Gilles Mouchard, Franck Vedrine

CEA LIST, Software Reliability and Security Laboratory

Tuesday, November 19

Rennes Convention Center - France


Introduction

Electronic Systems Virtualization
■ Hardware design exploration
■ Test bench virtualization
■ Interoperability
■ Resource optimization
■ Instrumentation
■ Sandboxing (cybersecurity)

Reusing common tools for V&V, safety and cybersecurity
■ Decrease development costs
■ Increase confidence

Growing need for high accuracy/fidelity simulators

[Figure: a target system (CPUs, memory, I/O) and a simulator of the target CPU, which produces program outputs and non-functional stats]


Outline

Context

Virtual platforms with UNISIM-VP framework

Cybersecurity use-cases with BINSEC

Self-Verifying simulator

Conclusion


Context: Virtualization and Cybersecurity

Software stacks keep growing more complex. How to address this complexity?

System-wide approaches
■ Rights management
■ Confinement, firewalls
■ Traffic inspection
■ ML detection

Code-centric approaches
■ Quality code engineering
■ Rich and automatic code analysis

[Figure: software stack layers: threads, applications, services, hypervisor, drivers]


Context: UNISIM-VP framework

Models and simulators are critical tools for designing, testing, and validating electronic and cyber-physical systems.

[Figure: design flow with the virtual platform at its center; labels: requirements, specification, HW/SW interface, HW prototyping, SW development, refinement]

Developing, using, and shipping virtual platforms for more than 20 years

Our virtual platforms are primarily designed to help CPS software developers


UNISIM-VP: services, hardware component models, third-party interfaces

■ Test
■ V&V
■ Debugger(s)
■ Code profiling
■ Monitoring
■ Trace analysis
■ Code sanitizing

GNU GDB

ML507 Board, S12XDP512 Board

■ System on Chip/Boards
■ Processors
■ Interconnects, memories, peripherals

http://unisim-vp.org


Extending UNISIM-VP for generic code analysis

Provides a uniform way to deploy a wide range of code analyzers
■ Numerical stability (floating point instrumentation)
■ Code sanitizing (e.g. computation on uninitialized values)
■ Taint analysis

How? Rewrite all hardware simulators to use a flexible data type.

Curiously enough, this is often not that hard…
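As a minimal sketch of what such a flexible data type could look like for code sanitizing, the C++ fragment below pairs each 32-bit value with an "initialized" flag and propagates that flag through addition. The names CheckedU32, Sanitizer and DemoReports are hypothetical illustrations and are not taken from UNISIM-VP.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical value type: a 32-bit word plus a flag telling whether it
// was ever initialized. Default-constructed values count as uninitialized.
struct CheckedU32 {
    std::uint32_t value = 0;
    bool defined = false;

    CheckedU32() = default;
    CheckedU32(std::uint32_t v) : value(v), defined(true) {}
};

// The sum is defined only if both operands are defined.
inline CheckedU32 operator+(const CheckedU32& a, const CheckedU32& b) {
    CheckedU32 sum(a.value + b.value);
    sum.defined = a.defined && b.defined;
    return sum;
}

// Hypothetical analyzer hook: counts uses of values that depend on
// uninitialized data.
struct Sanitizer {
    unsigned reports = 0;

    std::uint32_t Use(const CheckedU32& v) {
        if (!v.defined) ++reports;
        return v.value;
    }
};

// Small demonstration: r2 is never written, so only the second use is reported.
inline unsigned DemoReports() {
    CheckedU32 r1(10), r2;
    assert((r1 + r1).defined);
    Sanitizer sanitizer;
    sanitizer.Use(r1 + r1);  // both operands initialized: no report
    sanitizer.Use(r1 + r2);  // depends on r2: one report
    return sanitizer.reports;
}
```

The instruction semantics only see a type that supports `+`; the analysis lives entirely in the value type, which is the property the slide relies on.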

Concrete values, before and after the add:

R1   R2   RD
10   5    0
10   5    15

Symbolic values, after the add:

R1   R2   RD
X    Y    X+Y

UNISIM-VP Instruction Set Description: Encodings – Disassembly – Behavior

add.execute( cpu )
{
  U32 operand1 = cpu.GetRegister( r1 );
  U32 operand2 = cpu.GetRegister( r2 );
  U32 result = operand1 + operand2;
  cpu.SetRegister( rd, result );  // write the sum to the destination register RD
}

Concrete: executing the instruction on concrete values

Symbolic: executing the instruction on symbolic values
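The idea of one behavior description serving both concrete and symbolic execution can be sketched in C++ with a template parameter standing for U32. This is an illustrative sketch only: Sym, Cpu, AddExecute, RunConcrete and RunSymbolic are hypothetical names, and the register-index arguments are an assumption about how operands are passed.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical symbolic value: records the expression instead of computing it.
struct Sym {
    std::string expr;
};

inline Sym operator+(const Sym& a, const Sym& b) {
    return Sym{"(" + a.expr + " + " + b.expr + ")"};
}

// Hypothetical register file, parameterized by the value type.
template <typename U32>
struct Cpu {
    U32 regs[16];

    U32 GetRegister(unsigned idx) const { assert(idx < 16); return regs[idx]; }
    void SetRegister(unsigned idx, const U32& v) { assert(idx < 16); regs[idx] = v; }
};

// Behavior of "add rd, r1, r2", written once for any value type with operator+.
template <typename U32>
void AddExecute(Cpu<U32>& cpu, unsigned rd, unsigned r1, unsigned r2) {
    U32 operand1 = cpu.GetRegister(r1);
    U32 operand2 = cpu.GetRegister(r2);
    U32 result = operand1 + operand2;
    cpu.SetRegister(rd, result);
}

// Concrete run: registers hold machine integers.
inline std::uint32_t RunConcrete(std::uint32_t x, std::uint32_t y) {
    Cpu<std::uint32_t> cpu{};
    cpu.SetRegister(1, x);
    cpu.SetRegister(2, y);
    AddExecute(cpu, 0, 1, 2);
    return cpu.GetRegister(0);
}

// Symbolic run: registers hold expressions over named inputs.
inline std::string RunSymbolic(const std::string& x, const std::string& y) {
    Cpu<Sym> cpu{};
    cpu.SetRegister(1, Sym{x});
    cpu.SetRegister(2, Sym{y});
    AddExecute(cpu, 0, 1, 2);
    return cpu.GetRegister(0).expr;
}
```

With inputs 10 and 5, RunConcrete yields 15; with inputs "X" and "Y", RunSymbolic yields the expression "(X + Y)", mirroring the two register tables of the slide.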


Security use-case: the BINSEC tool

https://binsec.github.io

[Figure: BINSEC architecture; labels: machine, RISC-V, decoders (front-end), DBA IR (intermediate representation), simplifiers, analyzers (back-ends)]

BINSEC is a binary code analyzer for security, leveraging formal methods and automatic code verification.


Using UNISIM-VP as front-end to BINSEC

Fast and successful addition of a full ARMv7 front-end to BINSEC

(Re)demonstration with existing code analyzers / back-ends
■ “Cyber Grand Challenge” demos
■ Machine code to C source decompilation: F. Recoules et al., “Get rid of inline assembly through verification-oriented lifting”

Application to vulnerability detection in an avionics real-time OS
■ Multiple non-interference proofs (CODEX back-end)
■ One critical bug identified


Instruction Set Simulator Verification

Why verify an instruction set simulator?
■ Some critical systems are tested and validated on simulators
■ Formal methods depend on semantic accuracy

Why validate using single-instruction tests (unit tests)?
■ Simulator errors are hard to spot on full application runs
■ Only way to verify every machine instruction

Why a self-verifying simulator?
■ Writing unit tests is tedious
■ Running unit tests is tedious


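Principles of the self-verifying simulator

[Figure: self-verification flow; stage labels: encodings scan, decode, analyze, selection, tests generation, simulate, execute, compare; data labels: instructions, behaviors, functions, outputs (simulated and executed), results; parts of the algorithm are marked architecture specific or architecture independent]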
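The flow named on this slide (scan encodings, keep those the simulator decodes, run each test both in simulation and natively, compare outputs) can be sketched as a small C++ driver. This is a sketch under assumptions, not the authors' implementation: Harness, Verify and the toy one-byte instruction set are hypothetical, and the "native" side is a reference function standing in for real hardware execution.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

using Encoding = std::vector<std::uint8_t>;
using Regs = std::vector<std::uint64_t>;  // inputs or outputs of one test

// Hypothetical hooks: decoder check, simulated run, native run.
struct Harness {
    std::function<bool(const Encoding&)> decodes;
    std::function<Regs(const Encoding&, const Regs&)> simulate;
    std::function<Regs(const Encoding&, const Regs&)> execute;
};

struct Mismatch {
    Encoding encoding;
    Regs inputs;
};

// For every decodable encoding and every input vector, compare both outputs.
inline std::vector<Mismatch> Verify(const Harness& h,
                                    const std::vector<Encoding>& scan,
                                    const std::vector<Regs>& inputs) {
    assert(h.decodes && h.simulate && h.execute);
    std::vector<Mismatch> mismatches;
    for (const Encoding& e : scan) {
        if (!h.decodes(e)) continue;  // not an instruction for this simulator
        for (const Regs& in : inputs) {
            if (h.simulate(e, in) != h.execute(e, in)) mismatches.push_back({e, in});
        }
    }
    return mismatches;
}

// Toy instruction set: 0x01 is add, 0x02 is sub.
inline Regs ToyReference(const Encoding& e, const Regs& in) {
    return Regs{e[0] == 0x01 ? in[0] + in[1] : in[0] - in[1]};
}

// Deliberately buggy simulator: sub has its operands swapped.
inline Regs ToyBuggySimulator(const Encoding& e, const Regs& in) {
    return Regs{e[0] == 0x01 ? in[0] + in[1] : in[1] - in[0]};
}

inline std::size_t ToyMismatchCount() {
    Harness h{
        [](const Encoding& e) { return e.size() == 1 && (e[0] == 0x01 || e[0] == 0x02); },
        ToyBuggySimulator,
        ToyReference};
    const std::vector<Encoding> scan = {Encoding{0x00}, Encoding{0x01}, Encoding{0x02}};
    const std::vector<Regs> inputs = {Regs{10, 5}, Regs{7, 7}};
    return Verify(h, scan, inputs).size();
}
```

In the toy run, the swapped-operand bug is caught only for inputs (10, 5); inputs (7, 7) hide it because both orders give 0, which illustrates why the choice of test inputs matters.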


Instruction discovery

Two usual techniques:
■ Random generation of instruction bytes
■ Derive random encodings of documented instructions

Unfortunately infeasible on x86-64:
■ Instructions are up to 15 bytes long
■ (Much) more than one way to encode an instruction

Forced to use a little x86-64 knowledge

x86-64 instruction layout (field sizes in bytes):

Prefixes   OpCode   ModR/M   SIB      Displacement    Immediate
0-4        1-3      0 or 1   0 or 1   0, 1, 2 or 4    0, 1, 2 or 4

[Figure: Rate of instruction discovery (cumulative count, axis ticks from 0E+0 to 9E+5)]
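To illustrate how the field layout above narrows the search compared with random bytes, the following C++ sketch enumerates every combination of field sizes from the slide and keeps those that fit the 15-byte limit. It is only a sketch: Layout and EnumerateLayouts are hypothetical names, and further constraints between fields in the real encoding rules are deliberately ignored.

```cpp
#include <cassert>
#include <vector>

// Sizes, in bytes, of each field of one candidate instruction layout.
struct Layout {
    unsigned prefixes, opcode, modrm, sib, displacement, immediate;

    unsigned Total() const {
        return prefixes + opcode + modrm + sib + displacement + immediate;
    }
};

constexpr unsigned kMaxInstructionBytes = 15;  // limit quoted on the slide

// Enumerate all field-size combinations allowed by the slide's layout table.
inline std::vector<Layout> EnumerateLayouts() {
    const unsigned optional_sizes[] = {0, 1, 2, 4};  // displacement and immediate
    std::vector<Layout> layouts;
    for (unsigned p = 0; p <= 4; ++p)
        for (unsigned o = 1; o <= 3; ++o)
            for (unsigned m = 0; m <= 1; ++m)
                for (unsigned s = 0; s <= 1; ++s)
                    for (unsigned d : optional_sizes)
                        for (unsigned i : optional_sizes) {
                            const Layout layout{p, o, m, s, d, i};
                            if (layout.Total() <= kMaxInstructionBytes) {
                                layouts.push_back(layout);
                            }
                        }
    return layouts;
}
```

A scanner could then iterate over these layouts and fill each field with candidate bytes, instead of drawing up to 15 unconstrained bytes at random.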


Instruction comparison

Monitoring instruction side effects is complex
■ Looking at everything that may have changed is hard
■ Looking only at things that changed in simulation is biased
■ Some side effects may be hard to compare

Once again, a little x86-64 knowledge is required
■ Look up simulation side effects
■ Look up a common hardware feature: flag values
■ Transfer comparison values to memory
■ Some side effects may (still) be hard to compare

[Figure: tests generation produces functions that are both simulated and executed; the two sets of outputs are compared to produce results]
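One possible reading of this comparison strategy is sketched below in C++: compare the registers that the simulation reports as written, and always compare the x86 status flags. Snapshot and SameOutputs are hypothetical names, and restricting the flag comparison to the six status flags (CF, PF, AF, ZF, SF, OF) is an assumption of this sketch, not something stated on the slide.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// RFLAGS bits 0, 2, 4, 6, 7 and 11: CF, PF, AF, ZF, SF, OF.
constexpr std::uint64_t kStatusFlags = 0x8D5;

// Values transferred to memory after one test run.
struct Snapshot {
    std::vector<std::uint64_t> regs;
    std::uint64_t rflags;
};

// Compare the registers listed in `written` (the side effects reported by
// the simulation) plus the status flags, which are always checked.
inline bool SameOutputs(const Snapshot& simulated, const Snapshot& native,
                        const std::set<unsigned>& written) {
    assert(simulated.regs.size() == native.regs.size());
    for (unsigned r : written) {
        assert(r < simulated.regs.size());
        if (simulated.regs[r] != native.regs[r]) return false;
    }
    return ((simulated.rflags ^ native.rflags) & kStatusFlags) == 0;
}
```

The sketch also exhibits the bias mentioned above: a register that differs natively but is absent from `written` goes unnoticed, because only simulation-reported side effects are inspected.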


Characterizing native instruction execution

Native execution of instruction tests raises issues
■ Side effects may be unsafe for execution
■ Random inputs may generate errors

Side effect           | Difficulty                                  | Resolution
Memory access         | May access memory randomly                  | a) Fix target address using input value control; b) Run in a protected environment
Branch                | May transfer control to any memory location | Step in debugger
Interrupts and traps  | Leave the program                           | Run in a hypervisor
Hypervisor ops        | Hard to confine                             | Hardware debug?
Access HW counters    | Environment-dependent values                | No satisfying solution to our knowledge
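Resolution a) for memory accesses, fixing the target address through input value control, amounts to solving the addressing formula for one input register. The C++ sketch below assumes the usual x86 form base + index × scale + displacement and picks the base so that the access lands on a chosen safe address; MemOperand, SolveBase and EffectiveAddress are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Randomly generated parts of a memory operand; the base register value
// is the input we control.
struct MemOperand {
    std::uint64_t index;  // value of the index register
    unsigned scale;       // 1, 2, 4 or 8
    std::int32_t disp;    // sign-extended displacement
};

// Sign-extend the displacement to 64 bits for modular arithmetic.
inline std::uint64_t Extend(std::int32_t disp) {
    return static_cast<std::uint64_t>(static_cast<std::int64_t>(disp));
}

// Address the instruction would access for a given base value (mod 2^64).
inline std::uint64_t EffectiveAddress(std::uint64_t base, const MemOperand& op) {
    return base + op.index * op.scale + Extend(op.disp);
}

// Choose the base so that the effective address equals `target`,
// e.g. an address inside a scratch buffer reserved for the test.
inline std::uint64_t SolveBase(std::uint64_t target, const MemOperand& op) {
    assert(op.scale == 1 || op.scale == 2 || op.scale == 4 || op.scale == 8);
    return target - op.index * op.scale - Extend(op.disp);
}
```

Because unsigned arithmetic wraps modulo 2^64, the solved base is valid even when the random index or displacement would overflow, so the other inputs can stay fully random.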


A sandbox detection bug in QEMU

For other simulators:
■ Self-testing should work elsewhere
■ Generated tests can also be reused

Among our own bugs:
■ We picked the trickiest
■ Checked other simulators…

Incorrect simulation in QEMU:
■ Incorrect decoding of instruction prefixes
■ Incorrect address computation
■ Allows simple sandbox detection
■ May confuse malware analyzers

Bug #1847467: qemu-x86_64 segment prefixes error

$ ./sample
I'm not in QEMU
$ qemu-x86_64 ./sample
I'm in QEMU


Conclusion

Simulator extension for advanced sandboxing analysis
■ Factorizing developments
■ Few code modifications, little overhead

Automatic simulator verification/validation
■ Generic methodology
■ Successful implementation

Crucial follow-ups
■ System instruction characterization
■ Architecture-independent instruction scanning


Thank you!
