silhouette: efficient protected shadow stacks for embedded … · nnet_test-125k radix2-big-64k...

28
Silhouette: Efficient Protected Shadow Stacks for Embedded Systems Jie Zhou Yufei Du Zhuojia Shen University of Rochester University of Rochester University of Rochester Lele Ma John Criswell Robert J. Walls College of William & Mary University of Rochester Worcester Polytechnic Institute Presented at USENIX Security 2020

Upload: others

Post on 07-Feb-2021

1 views

Category:

Documents


0 download

TRANSCRIPT

  • Silhouette: Efficient Protected Shadow Stacks for Embedded Systems

    Jie Zhou Yufei Du Zhuojia ShenUniversity of Rochester University of Rochester University of Rochester

    Lele Ma John Criswell Robert J. WallsCollege of William & Mary University of Rochester Worcester Polytechnic Institute

    Presented at USENIX Security 2020

  • 2

    Microcontroller-based Systems are Almost Everywhere

    ECU

    Bluetooth module

    Wi-Fi module

  • Microcontroller-based Embedded Devices

    • Limited CPU speed

    • Limited memory

    • Real-time constraints

    • Frequent direct operations on hardware

    3

  • C is Not Memory Safe

    Control-flow Hijacking: corrupting control-data to divert

    control flow to attacker-selected destinations

    4

    void bar(…) { … … return; }

    void foo(…) { … bar(); … }

  • C is Not Memory Safe

    5

    void bar(…) { … … return; }

    void foo(…) { … bar(); … } attacker-selected destination

    Control-flow Hijacking: corrupting control-data to divert

    control flow to attacker-selected destinations

  • Control-Flow Integrity (CFI)

    6

    void bar(…) { … … return; }

    void foo(…) { … bar(); … }

    void bar(…) { … … return; }

    void bar(…) { … … return; }

    Common weakness of practical CFI*: allowing a return instruction to return back to multiple places

    void f2(…) { … call … … }

    void f3(…) { … call … … }

    allowed by practical CFI

    *Exploited by Out of Control @Oakland’14, ROP is Still Dangerous @USENIX Security’14, Control-flow Bending @USENIX Security’15, etc.

    void f1(…) { … add … sub … }

  • Silhouette

    Silhouette: a compiler-based defense that

    • guarantees the integrity of return addresses

    • coarse-grained forward-edge CFI

    • low performance overhead (1.3% and 3.4% overhead on two benchmark suites)

    • Developed for ARMv7-M due to its popularity • Also working on other ARM embedded processors

    7

  • Outline

    • Silhouette Design

    • Evaluation

    • Summary

    8

  • Outline

    • Silhouette Design

    • Evaluation

    • Summary

    9

  • Shadow Stack

    Protecting return addresses

    Shadow stack itself also needs protection!

    ret. addr. of fooret. addr. of bar

    shadow stack SSP

    0x40000000

    0x30000000

    ret. addr. of foo

    ret. addr. of barregular stack SP

    0x36000000

    10

  • From a Shadow Stack’s Point of View

    stores writing to the shadow stack all other stores

    Legal Illegal

    Can we make the shadow stack writable only by its legal stores?

    11

    All Store Instructions

  • Background on ARMv7-M• Execution Mode: Privileged and Unprivileged. (Embedded devices usually run everything in privileged mode.)

    • Memory access permissions are configurable.

    ret. addr. of fooret. addr. of bar

    shadow stackSSP

    0x40000000

    0x30000000

    ret. addr. of foo

    ret. addr. of barregular stack SP

    0x36000000

    can be configured to be writable only

    by privileged stores

    Can we make the shadow stack only writable by its legal stores?

    12

    Is it possible to make • only the shadow-stack-legal stores privileged • all other stores unprivileged?

    Yes

  • Unprivileged Store

    Act as if running in unprivileged mode when running in privileged mode.

    // running in privileged modestrt r1, [r0, #12] writable only by privileged stores

    This instruction would fail!

    13

  • Use Unprivileged Store to Protect Shadow Stack

    • Transform all stores to be unprivileged stores except

    • shadow-stack-legal stores

    • those that require to run as privileged such as some I/O-related operations.

    Effect: even if memory is corrupted and control flow is diverted, illegal store instructions

    do not have write access to corrupt the shadow stack.

    • Configure the memory region for shadow stack to be writable

    only by privileged stores.

    14

    Store Hardening

  • Store Instructions of ARMv7-M

    Addressing Mode Number of Types

    Normal Store Instructions

    source register, base register, offset register, immediate, left shift, write back, store multiple, floating-point stores over 40

    Unprivileged Store Instructions source register, base register, immediate 3

    Comparison of Normal and Unprivileged Store Instructions

    15

  • Store Hardening Examples

    // example 1

    str r0, [r1, #4]

    strt r0, [r1, #4]

    // example 2

    str r0, [sp, #-12]

    sub sp, #12 strt r0, [sp, #0] add sp, #12

    no performance overhead no code size overhead performance and code size overhead

    16

  • Forward-edge Control-flow Issues

    17

    • Transform all stores to be unprivileged stores except

    • shadow-stack-legal stores

    • those that require to run as privileged

    Forward-edge Control Flow How Silhouette Handles Them

    Indirect Function Calls Restricted by Label-based Forward-edge CFI

    Large switch Statements Compiled to Bounds-checked TBB or TBH instructions

    Computed goto Statements Transformed to switch statements

  • Silhouette Architecture

    Simplified Architecture of Silhouette

    LLVM IR

    Shadow Stack Transform N

    ative Cod

    e

    Generat

    ion

    LLVM Machine

    IR Generation

    18

    Store Hardening Label-based Forward-edge CFI

    Executable with Protected Shadow Stack

    Security guarantee:

    • Return instruction always returns to its legal destination

    • Forward-edge control flows are restricted to selected destinations

  • Outline

    • Silhouette Design

    • Evaluation

    • Summary

    19

  • Experiment Setup

    Development board: STM32F469 • Cortex-M4 processor, run at 180 MHz

    • 384 KB SRAM

    • 16 MB SDRAM

    • 2 MB Flash Memory

    Base compiler: Clang/LLVM 9.0

    Optimization level: -O3

    Benchmarks: all 9 programs in CoreMark-Pro 29 programs in BEEBS

    20

    Evaluated both performance and code size overhead

  • Performance on CoreMark-Pro Benchmarks

    21

    Shadow Stack Store Hardening CFI SilhouetteMin 0 0 -0.1% 0.1%Max 1.3% 4.9% 0.1% 4.9%

    Geo. Mean 0.2% 1% 0 1.3%

    Nor

    mal

    ized

    Perfo

    rman

    ce

    0.950

    0.975

    1.000

    1.025

    1.050

    cjpeg-rose7...

    core

    linear_alg-..

    loops-all-...

    nnet_test

    parser-125k

    radix2-big-64k

    sha-test

    zip-testShadow Stack Store Hardening CFI Silhouette

  • Code Size on CoreMark-Pro Benchmarks

    22

    Nor

    mal

    ized

    Cod

    e Si

    ze

    0.950

    1.000

    1.050

    1.100

    1.150

    1.200

    cjpeg-rose7...

    core

    linear_alg-..

    loops-all-...

    nnet_test

    parser-125k

    radix2-big-64k

    sha-test

    zip-testShadow Stack Store Hardening CFI Silhouette

    Shadow Stack Store Hardening CFI SilhouetteMin 0.5% 2.8% 0.2% 3.6%Max 1.7% 11.1% 9.4% 19.3%

    Geo. Mean 0.8% 6.8% 1.2% 8.9%

  • Performance Overhead on BEEBS Benchmarks

    23

    Shadow Stack Store Hardening CFI SilhouetteMin 0 -0.3% -0.3% -0.3%Max 9.2% 24.7% 2.2% 24.8%

    Geo. Mean 1.1% 1.8% 0.1% 3.4%

    Nor

    mal

    ized

    Perfo

    rman

    ce

    0.95

    1

    1.05

    1.1

    1.15

    1.2

    1.25

    bubblesort

    ctl-string

    levenshtein

    matmult-int

    ndes

    picojpeg

    qrduino

    sglib-queue

    sglib-rbtree

    slre

    stb_perlin

    trio-sscanf

    wikisort

    Shdaow Stack Store Hardening CFI Silhouette

  • Code Size on BEEBS Benchmarks

    24

    Shadow Stack Store Hardening CFI SilhouetteMin 0.3% 0.5% 0 0.9%Max 0.6% 6.1% 1.3% 6.8%

    Geo. Mean 0.4% 1.8% 0.1% 2.3%

    Nor

    mal

    ized

    Cod

    e Si

    ze

    0.98

    1

    1.02

    1.04

    1.06

    1.08

    bubblesort

    ctl-string

    levenshtein

    matmult-int

    ndes

    picojpeg

    qrduino

    sglib-queue

    sglib-rbtree

    slre

    stb_perlin

    trio-sscanf

    wikisort

    Shdaow Stack Store Hardening CFI Silhouette

  • Silhouette-Invert

    Not supported on ARMv7-M

    25

    Proposed two solutions with minor hardware changes.

    See the paper for details.

    ret. addr. of fooret. addr. of bar

    shadow stack SSP

    0x40000000

    0x30000000

    ret. addr. of foo

    ret. addr. of barregular stack SP

    0x36000000

    Writable only by unprivileged stores but not by privileged stores?

    • Configure shadow stack to be unprivileged-write-only

    • Transform shadow-stack-legal stores to be unprivileged

    • Leave all other stores unchanged

    Silhouette-Invert

  • Silhouette v.s. Silhouette-Invert on CoreMark-Pro

    26

    Nor

    mal

    ized

    Perfo

    rman

    ce

    0.950

    0.975

    1.000

    1.025

    1.050

    cjpeg-rose7...

    core

    linear_alg-..

    loops-all-...

    nnet_test

    parser-125k

    radix2-big-64k

    sha-test

    zip-testSilhouette Silhouette-Invert

    Silhouette Silhouette-InvertMin 0.1% 0Max 4.9% 1.5%

    Geo. Mean 1.3% 0.3%

  • Silhouette v.s. Silhouette-Invert on BEEBS

    27

    Nor

    mal

    ized

    Perfo

    rman

    ce

    0.95

    1

    1.05

    1.1

    1.15

    1.2

    1.25

    bubblesort

    ctl-string

    levenshteinmatmult-int

    ndes

    picojpeg

    qrduino

    sglib-queuesglib-rbtree

    slre

    stb_perlin

    trio-sscanf

    wikisort

    Silhouette Silhouette-Invert

    Silhouette Silhouette-InvertMin -0.3% 0Max 24.8% 18.6%

    Geo. Mean 3.4% 1.9%

  • Summary

    • Silhouette: an efficient defense to protect return addresses

    for ARM embedded systems

    • Low performance and code size overhead

    • Silhouette-Invert:

    • Further decreases performance and code size penalty

    •Minor hardware change

    • Open-Source:

    28

    Question?

    https://github.com/URSec/SilhouetteContacts: Jie Zhou ([email protected])

    Zhuojia Shen ([email protected])

    John Criswell ([email protected])

    https://github.com/URSec/Silhouettehttps://github.com/URSec/Silhouettemailto:[email protected]:[email protected]:[email protected]:[email protected]:[email protected]:[email protected]