cloud-native sandboxes for microservices: palo alto ... · cloud-native sandboxes for...

Cloud-Native Sandboxes for Microservices: Understanding New Threats and Attacks

Tongbo Luo (Chief AI Security Scientist, JD.com)

Zhaoyan Xu (Principal Security Researcher, Palo Alto Networks)

Tongbo Luo - Chief AI Security Scientist - JD.com

About Us

2014: Listed on Nasdaq (NASDAQ: JD)2017: Fortune 500 Company (Rank 261)2017: More than 1 Billion Active Customer2018: Google Invest $500M

17K+Merchant

16K+Employee

About Us

Zhaoyan Xu (Principal Security Researcher, Palo Alto Networks)

Agenda

● Introduction● New Challenges● System Design

○ Overview○ Core Design Issue

● Parallel Execution and Alignment Analysis● Case Study and Usage

○ Path Transversal○ RCE○ Authentication Bypass○ Sandbox Escape

Introduction

● Container = Namespace + cGroup● Cloud-native and container-based cluster● Orchestrators - Kubernetes● Container Security

○ Image Vulnerability○ Docker/K8s Vulnerability

● Defense○ Static Image Scanner○ Dynamic Runtime Prevention/Detection

Source: A Study of Security Vulnerabilities on Docker Hub

New Challenges

Problem-Side:

Missing a novel sandbox tool to discover container-based threats.

Opportunity-side:

Rethink the design of traditional sandbox.

New Features to Improve the Detection.

Parallel Execution

Context-Awareness

Container Sandbox Design - Core Question

Q1: How to make it convenience to use sandbox in container-based cloud?

Q2: How to efficiently retrieve and build the context for sandbox?

Q3: How to collect sample behaviors in sandbox?

Q4: How to analyze the sample behaviors beyond existing detection mechanism?

Q1 - Integrated into Orchestrator● Using CRD to extend the k8s APIs

○ Introduce our custom-defined-resource type “sandbox”○ No learning curve to manage container sandboxes.

● Example○ Using YAML file to create a sandbox○ With sample data (URL) and context info (image name)


Q1: How to make it convenience to use sandbox in container-based cloud?




Q2 - EnvBuilder

● Rebuild the client's context for environment-sensitive detection.

● Input = Sample + Context

○ Explicit Way ( User Upload DockerFile/Image Name)

○ Implicit Way ( Retrieve Context from Orchestrators)

Q2 - Hard Isolation: Kata or gVisor

Run Sandbox in the Kata (Hypervisor-based Container Runtime)

Sandbox1

Sandbox2

Collector


Q1: How to make it convenience to use in microservice cloud?




Q3 - Syscall Collection

● Sysdig

○ based-on tracepoint + vring buffer

● eBPF -- enhancements to BPF (Berkeley Packet Filter)

○ BPF Compiler Collection (bcc)

○ Linux 4.x series

● auditd

○ Linux Audit system

○ Integrated to kernel since v2.6.9


Q1: How to make it convenience to use in microservice cloud?




Parallel Execution and Alignment Analysis

Parallel Execution

1. Run Two containers created from the same image in parallel.

2. Feed two similar inputs to each container at the same time.

3. Collected behaviours.

4. Find the Differences between two behaviours.

Raw Data to Syscall Sequence

● Raw Data:

ruby3760A [27726, 21:43:14, >] [puma, 3818, 1, 14039, 18] [getsockopt, net] DataContainer

nameMetadata

[event_num, time, direction]Caller’s Info

[proc_name, pid, vpid, tid, vtid]Syscall Info

[call_type, call_catalog] buffer

1. Group syscalls based on container-name and pid2. Extract the syscall name3. Map syscall name to an unique char

[open read write read read close] → [1 a q a a 5]

Performance Issue

FILTER NORMALIZE Compress

● Syscall Sequence is too long

● 1000~5000 syscalls per second per process

● Reduced to 10% ~ 30%waitmutexmutexopenfstatlstatstatlstatstat

closemutex

waitopenfstatlstatstatlstatstat

close

waitopenstatstatstatstatstat

close

waitopen

statstatstat

close

Scoring Function

lstat*2 write*3 wait read*3 accept getsocknamefcntl setsockoptchmod vfork

accept getsocknamefcntl setsockopt

fchmodat lstat*3 write*2 wait*2 read*3

write*2Container1

Container2

MicroService Idle Stage

Feed Sample w/ Connection

Sample Execution Stage

● Quality of Scoring Function → Quality of Alignment Result

Scoring Function


lstat*3 write*2 wait*2 read*3

q

lstat*2 write*3 wait read*2


| | | |… e e a a a m i i i ... h … … 3 n f 4 …

... h … … 3 n f 4 … e e e a a m m i i i …

chmod

fchmodat

|

vfork

- - - - - - - -

- - - - - - - - -

lstat*3 write*2 wait*2 read*2

lstat*2 write*3 wait read*3

e e e a a m m i i i

e e a a a m i i i| | | | | | | |

accept getsocknamefcntl setsockoptchmod vfork


fchmodat

h

h 3 n f 4

3 n f 4 q

Ideal Result

...

...

Traditional Scoring Function Result

● Ideal Alignment Result: syscalls from the same stage are aligned.

Scoring Function

Our Customized Scoring Function for sandbox

Importance Closeness Sensitivity

frequency of syscall in sequence

functionality similarity

critical to perform exploit

accept > fstat (read, seek) >

(read, close)

chmod > wait

Depends on domain knowledge

Depends on the context

Depends onOS

Case Study and Practical Usage

Effectiveness:

Can we pinpoint exactly exploitation fraction from alignment result?

Can we generalize a pattern for each type of attack?

Case Study - Path Traversal

(CVE-2018-7490) @ uWSGI PHP plugin

curl http://localhost:8080/..%2f..%2f..%2f..%2f..%2fetc/passwd

open, fcntl, getdents, getdents, close, stat, stat, stat

open, fcntl, read, read, read, close, read, read

Additional repeated blocks of path traversal operations

attack

Attack(proc_c1)

Legitimate(proc_b1)

Case Study - Remote Command Execution (RCE)

Vulnerability CVE-2016-4977 @ Spring Security OAuth

pipe * 3, mmap, vfork

(write, wait, read) * 3, epoll_ctl, wait

stat, ioctl, closeaccept, fcntl, getsockname*2,

fcntl*2, setsockopt *2 java.lang.Runtime.exec(CMD) Attack(proc_c1)

Legitimate(proc_b1)

proc_c2 proc_c3 proc_c4


Vulnerability CVE-2017-5638 @ Jakarta plugin in Apache Struts

- Content-Type arbitrary command execution

Embed injected OGNL script in “Content-Type” field from HTTP header(#container=#context['com.opensymphony.xwork2.ActionContext.container']) (#cmds=(#iswin?{'cmd.exe','/c',#cmd}:{'/bin/bash','-c',#cmd}))

OGNL Runtime (new process)

setsockopt, fcntl, fcntl*2, read, stat*3, write

pipe*3, vfork, close

clone, clone, gettid,read, ioctl

stat*2, write, close

Attack(proc_c1)

Legitimate(proc_b1)

proc_c2


Vulnerability CVE-2017-11610 @ Supervisord

- client/server system to monitor/control processes on UNIX-like OSes.

container1: pids [ 12182 12120 6371 6372 6373]

container2: pids [ 12246 12306]

accept, fcntl, getpeername, wait4,

poll, recvfrom

sigaction, clone, wait4, sigaction, sendto, poll

sendto wait poll wait poll recvfrom wait recvfrom

EXEC

Attack(proc_c1)

Legitimate(proc_b1)

Case Study - Authentication Bypass

ssh_server_fork ssh_server_fork

ssh_server_fork ssh_server_fork ssh_server_fork

sh sh mesg

wget chmod

24700 26670

24580 26661 26664 26665 26666

26672 26673 26674Authenticating Authenticated

2667

0

2666

1

2666

4clone access getpid

open fcntl getrandom*3 recvfrom getrusage clock_gettime

getrusage clock_gettime … …

wait close*3 kill getpid

sendto getpid getrusage getpid …

recvfrom close

socket connect close access fstat mmap*2 close

2667026661

2667026664

2667026661

Case Study - JVM sandbox Escape

Vulnerability CVE-2015-1427 @ Elasticsearch - full-text search engine- Bypassing the Sandbox with Reflection:

sigaction, mmap, pipe*3, vfork, close, read, close*3

clone, clone, robust_list, gettid, sigprocmask*3, stat*3

accept, fcntl, getsockname*2, setsockopt*3, write

java.lang.Math.class.forName("java.lang.Runtime”).getRuntime().exec("id").getText()

Escape Sandbox Execute arbitrary shell

● More detail in the white paper○ Evaluation Result○ Future work

● Q & A

cloud-native sandboxes for microservices: palo alto ... · cloud-native sandboxes for...

Documents