cloud-native sandboxes for microservices: palo alto ... · cloud-native sandboxes for...
TRANSCRIPT
Cloud-Native Sandboxes for Microservices: Understanding New Threats and Attacks
Tongbo Luo (Chief AI Security Scientist, JD.com)
Zhaoyan Xu (Principal Security Researcher, Palo Alto Networks)
Tongbo Luo - Chief AI Security Scientist - JD.com
About Us
2014: Listed on Nasdaq (NASDAQ: JD)2017: Fortune 500 Company (Rank 261)2017: More than 1 Billion Active Customer2018: Google Invest $500M
17K+Merchant
16K+Employee
About Us
Zhaoyan Xu (Principal Security Researcher, Palo Alto Networks)
Agenda
● Introduction● New Challenges● System Design
○ Overview○ Core Design Issue
● Parallel Execution and Alignment Analysis● Case Study and Usage
○ Path Transversal○ RCE○ Authentication Bypass○ Sandbox Escape
Introduction
● Container = Namespace + cGroup● Cloud-native and container-based cluster● Orchestrators - Kubernetes● Container Security
○ Image Vulnerability○ Docker/K8s Vulnerability
● Defense○ Static Image Scanner○ Dynamic Runtime Prevention/Detection
Source: A Study of Security Vulnerabilities on Docker Hub
New Challenges
Problem-Side:
Missing a novel sandbox tool to discover container-based threats.
Opportunity-side:
Rethink the design of traditional sandbox.
New Features to Improve the Detection.
Parallel Execution
Context-Awareness
Container Sandbox Design - Core Question
Q1: How to make it convenience to use sandbox in container-based cloud?
Q2: How to efficiently retrieve and build the context for sandbox?
Q3: How to collect sample behaviors in sandbox?
Q4: How to analyze the sample behaviors beyond existing detection mechanism?
Q1 - Integrated into Orchestrator● Using CRD to extend the k8s APIs
○ Introduce our custom-defined-resource type “sandbox”○ No learning curve to manage container sandboxes.
● Example○ Using YAML file to create a sandbox○ With sample data (URL) and context info (image name)
Container Sandbox Design - Core Question
Q1: How to make it convenience to use sandbox in container-based cloud?
Q2: How to efficiently retrieve and build the context for sandbox?
Q3: How to collect sample behaviors in sandbox?
Q4: How to analyze the sample behaviors beyond existing detection mechanism?
Q2 - EnvBuilder
● Rebuild the client's context for environment-sensitive detection.
● Input = Sample + Context
○ Explicit Way ( User Upload DockerFile/Image Name)
○ Implicit Way ( Retrieve Context from Orchestrators)
Q2 - Hard Isolation: Kata or gVisor
Run Sandbox in the Kata (Hypervisor-based Container Runtime)
Sandbox1
Sandbox2
Collector
Container Sandbox Design - Core Question
Q1: How to make it convenience to use in microservice cloud?
Q2: How to efficiently retrieve and build the context for sandbox?
Q3: How to collect sample behaviors in sandbox?
Q4: How to analyze the sample behaviors beyond existing detection mechanism?
Q3 - Syscall Collection
● Sysdig
○ based-on tracepoint + vring buffer
● eBPF -- enhancements to BPF (Berkeley Packet Filter)
○ BPF Compiler Collection (bcc)
○ Linux 4.x series
● auditd
○ Linux Audit system
○ Integrated to kernel since v2.6.9
Container Sandbox Design - Core Question
Q1: How to make it convenience to use in microservice cloud?
Q2: How to efficiently retrieve and build the context for sandbox?
Q3: How to collect sample behaviors in sandbox?
Q4: How to analyze the sample behaviors beyond existing detection mechanism?
Parallel Execution and Alignment Analysis
Parallel Execution
1. Run Two containers created from the same image in parallel.
2. Feed two similar inputs to each container at the same time.
3. Collected behaviours.
4. Find the Differences between two behaviours.
Raw Data to Syscall Sequence
● Raw Data:
ruby3760A [27726, 21:43:14, >] [puma, 3818, 1, 14039, 18] [getsockopt, net] DataContainer
nameMetadata
[event_num, time, direction]Caller’s Info
[proc_name, pid, vpid, tid, vtid]Syscall Info
[call_type, call_catalog] buffer
1. Group syscalls based on container-name and pid2. Extract the syscall name3. Map syscall name to an unique char
[open read write read read close] → [1 a q a a 5]
Performance Issue
FILTER NORMALIZE Compress
● Syscall Sequence is too long
● 1000~5000 syscalls per second per process
● Reduced to 10% ~ 30%waitmutexmutexopenfstatlstatstatlstatstat
closemutex
waitopenfstatlstatstatlstatstat
close
waitopenstatstatstatstatstat
close
waitopen
statstatstat
close
Scoring Function
lstat*2 write*3 wait read*3 accept getsocknamefcntl setsockoptchmod vfork
accept getsocknamefcntl setsockopt
fchmodat lstat*3 write*2 wait*2 read*3
write*2Container1
Container2
MicroService Idle Stage
Feed Sample w/ Connection
Sample Execution Stage
● Quality of Scoring Function → Quality of Alignment Result
Scoring Function
accept getsocknamefcntl setsockopt
lstat*3 write*2 wait*2 read*3
q
lstat*2 write*3 wait read*2
accept getsocknamefcntl setsockopt
| | | |… e e a a a m i i i ... h … … 3 n f 4 …
... h … … 3 n f 4 … e e e a a m m i i i …
chmod
fchmodat
|
vfork
- - - - - - - -
- - - - - - - - -
lstat*3 write*2 wait*2 read*2
lstat*2 write*3 wait read*3
e e e a a m m i i i
e e a a a m i i i| | | | | | | |
accept getsocknamefcntl setsockoptchmod vfork
accept getsocknamefcntl setsockopt
fchmodat
h
h 3 n f 4
3 n f 4 q
Ideal Result
...
...
Traditional Scoring Function Result
● Ideal Alignment Result: syscalls from the same stage are aligned.
Scoring Function
Our Customized Scoring Function for sandbox
Importance Closeness Sensitivity
frequency of syscall in sequence
functionality similarity
critical to perform exploit
accept > fstat (read, seek) >
(read, close)
chmod > wait
Depends on domain knowledge
Depends on the context
Depends onOS
Case Study and Practical Usage
Effectiveness:
Can we pinpoint exactly exploitation fraction from alignment result?
Can we generalize a pattern for each type of attack?
Case Study - Path Traversal
(CVE-2018-7490) @ uWSGI PHP plugin
curl http://localhost:8080/..%2f..%2f..%2f..%2f..%2fetc/passwd
open, fcntl, getdents, getdents, close, stat, stat, stat
open, fcntl, read, read, read, close, read, read
Additional repeated blocks of path traversal operations
attack
Attack(proc_c1)
Legitimate(proc_b1)
Case Study - Remote Command Execution (RCE)
Vulnerability CVE-2016-4977 @ Spring Security OAuth
pipe * 3, mmap, vfork
(write, wait, read) * 3, epoll_ctl, wait
stat, ioctl, closeaccept, fcntl, getsockname*2,
fcntl*2, setsockopt *2 java.lang.Runtime.exec(CMD) Attack(proc_c1)
Legitimate(proc_b1)
proc_c2 proc_c3 proc_c4
Case Study - Remote Command Execution (RCE)
Vulnerability CVE-2017-5638 @ Jakarta plugin in Apache Struts
- Content-Type arbitrary command execution
Embed injected OGNL script in “Content-Type” field from HTTP header(#container=#context['com.opensymphony.xwork2.ActionContext.container']) (#cmds=(#iswin?{'cmd.exe','/c',#cmd}:{'/bin/bash','-c',#cmd}))
OGNL Runtime (new process)
setsockopt, fcntl, fcntl*2, read, stat*3, write
pipe*3, vfork, close
clone, clone, gettid,read, ioctl
stat*2, write, close
Attack(proc_c1)
Legitimate(proc_b1)
proc_c2
Case Study - Remote Command Execution (RCE)
Vulnerability CVE-2017-11610 @ Supervisord
- client/server system to monitor/control processes on UNIX-like OSes.
container1: pids [ 12182 12120 6371 6372 6373]
container2: pids [ 12246 12306]
accept, fcntl, getpeername, wait4,
poll, recvfrom
sigaction, clone, wait4, sigaction, sendto, poll
sendto wait poll wait poll recvfrom wait recvfrom
EXEC
Attack(proc_c1)
Legitimate(proc_b1)
Case Study - Authentication Bypass
ssh_server_fork ssh_server_fork
ssh_server_fork ssh_server_fork ssh_server_fork
sh sh mesg
wget chmod
24700 26670
24580 26661 26664 26665 26666
26672 26673 26674Authenticating Authenticated
2667
0
2666
1
2666
4clone access getpid
open fcntl getrandom*3 recvfrom getrusage clock_gettime
getrusage clock_gettime … …
wait close*3 kill getpid
sendto getpid getrusage getpid …
recvfrom close
socket connect close access fstat mmap*2 close
2667026661
2667026664
2667026661
Case Study - JVM sandbox Escape
Vulnerability CVE-2015-1427 @ Elasticsearch - full-text search engine- Bypassing the Sandbox with Reflection:
sigaction, mmap, pipe*3, vfork, close, read, close*3
clone, clone, robust_list, gettid, sigprocmask*3, stat*3
accept, fcntl, getsockname*2, setsockopt*3, write
java.lang.Math.class.forName("java.lang.Runtime”).getRuntime().exec("id").getText()
Escape Sandbox Execute arbitrary shell
● More detail in the white paper○ Evaluation Result○ Future work
● Q & A