1
Securing the Web with Decentralized Information Flow Control
Maxwell Krohn (MIT)
in cahoots with: Alex Yip, Micah Brodsky, Petros Efstathopoulos (UCLA), Steve VanDeBogart (UCLA), Frans Kaashoek, Eddie Kohler (UCLA), David Mazières (Stanford), Robert Morris, Mike Walfish, Natan Cliffer, Cliff Frey, David Ziegler
2
A Computing Shift
Classic PC
3
1. The “Classic” Attack
Bob’s Data
Alice’s Data
Bob
GET /xxXxxxXXxxX/Alice
Web App
Alice’s Data
Alice’s Data
Chuck’s Data
Doug’s Data
Ed’s Data
4
Vulnerabilities in Websites Exploits
– “USAJobs.gov hit by Monster.com attack, 146,000 people affected”
– “Payroll Site Closes on Security Worries”
– “Hacker Accesses Thousands of Personal Data Files at CSU Chico”
– “FTC Investigates PETCO.com Security Hole”
– “Major Breach of UCLA’s Computer Files”
– “UN Website is Defaced via SQL Injection”
– “Harvard Security Breach Exposes Sensitive Student Data”
– “Security Lapse Exposes Facebook Photos”
5
2. Server-Side Malware
Feature B Feature A
Alice’s data
Bob’s data
GET /FeatureA
GET /FeatureC
Alice
Bob
3rd Party
Feature C
8
From Bad To Worse!
1. The “Classic” Attack
2. Server-side Malware – NEW!!
3. Others not considered in this talk:
– XSS
– Phishing
9
Two Options
GREAT
OPPORTUNITY
FOR SYSTEMS
BUILDERS!
10
My Work in Web Security
• TheSpark.com, OkCupid.com
• New Web Server [USENIX ’04, USENIX ’07]
• Limitations of Unix [HotOS ’05]
• New OS, Attempt 1 [SOSP ’05*, TOCS ’07†]
• New OS, Attempt 2: “Flume” [SOSP ’07]
• Ideas for the Future Web [HotNets ’07]
First author is *Petros Efstathopoulos and †Steve VanDeBogart.
THIS TALK
11
Web Server
Web App
Service 1 (C++)
Service 2 (Python)
Service 3(Java)
Why Is Web Security Difficult?
Storage (DB or FS)
Bob’s Data
Alice’s Data
Service 4(???)
12
Web Server
Web App
Service 1 (C++)
Service 2 (Python)
Service 3(Java)
New Proposal: End-to-End Web Security [HotOS ’05]
Storage (DB or FS)
Bob’s Data
Alice’s Data
Service 4(???)
Gateway
Alice’s Data
Alice’s Data
13
Run-time or Compile-Time Tracking?
• Web sites favor run-time tracking:
– Use scripting languages (PHP, Python, Ruby, Perl, etc.)
– Mix-and-match different languages
– Use plug-ins and third-party software
Bob’s Data
Chuck’s Data
Doug’s Data
Ed’s Data
Alice’s Data
Alice’s Data
14
Decentralized Information Flow Control (DIFC) for the OS
(OS tracks data at run-time)
Gateway
• Inspired by PL-based DIFC [Myers ’97]
15
Contributions
• Idea: End-to-end Web security
• Realization: Build Web sites with DIFC
– Model for DIFC at the OS level
– API: How to build apps (for non-experts)
– Implementation on Linux, OpenBSD
– Case Study: MoinMoin Wiki
• Generalization: a secure, extensible Web platform
16
Outline
1. Operating System Support for DIFC
2. Security improvement in a real Web site
3. Generalization
18
DIFC By Example
PWeb App
Bob’s Data
Alice’s Data
{ Alice }
{ Bob }
DIFC KERNEL
{ Alice }
gateway
19
Defining DIFC for the OS [SOSP ’07]
PgatewayWeb App
Bob’s Data
Alice’s Data
{ Alice }
DIFC KERNEL
{ Alice }
1. How to label secret data?
2. How does the kernel track data?
3. How can the app legislate policy?
{ Bob }
20
1. Labeling Data
• Each process/file gets a secrecy label
– summarizes which categories of secret data a process is assumed to have seen.
– Examples:
• { “Alice’s Secrets” }
• { “Financial Secrets” }
• { “Alice’s Secrets” and “Financial Secrets” }
“tag”
“label”
21
2. Tracking Data
• For p to write to network, Sp = {}
• p can write to q iff: Sp ⊆ Sq
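The two rules on this slide can be modeled with plain set operations. The following is a minimal sketch, assuming labels are represented as Python frozensets of tag names; the function names `can_write` and `can_write_to_network` are hypothetical and are not part of Flume's API.

```python
# Illustrative model of the slide's rules; not Flume's actual interface.

def can_write(s_src: frozenset, s_dst: frozenset) -> bool:
    """p may write to q iff S_p is a subset of S_q."""
    return s_src <= s_dst

# Assumption: the network is modeled as a destination with an empty label,
# so only a process with S_p = {} can write to it.
NETWORK_LABEL = frozenset()

def can_write_to_network(s_proc: frozenset) -> bool:
    return can_write(s_proc, NETWORK_LABEL)

if __name__ == "__main__":
    web_app = frozenset({"a", "b"})
    alice_file = frozenset({"a"})
    print(can_write(alice_file, web_app))   # file -> web app (a read)
    print(can_write(web_app, alice_file))   # web app -> file (a write)
    print(can_write_to_network(web_app))
```

Reading a file is a flow from the file to the process, so the same subset check covers both directions of file I/O and IPC.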
22
Tracking Data: File I/O
Sw= { a, b }
P Web App
Sf = { a }
Alice’s Data
23
Tracking Data: IPC
Sw = { a }
Sp = { b }
Web App
Helper Process p
24
Defining DIFC
PgatewayWeb App
Bob’s Data
Alice’s Data
{ Alice }
{ Bob }
DIFC KERNEL
{ Alice }
1. How to label secret data?
2. How does the kernel track data?
3. How can the app legislate policy?
25
3. Legislating data policies
• Processes can:
– change labels by adding tags
– allocate tags
– change labels by subtracting tags
26
Sw = { b }
Sw = {}
Any Process Can Add Any Tag
change_label(S={b})
change_label(S={})
Web App
27
Processes Can Allocate Tags
Bob’s Data
gateway
Web App
S = { b }
Sg = {}
Sw = {}
DIFC KERNEL
= { a }
28
Processes Can Allocate Tags
Sg = {}, Dg = {}
a = create_tag()
Sg = {}, Dg = { a }
gateway
“Secrecy”
“Declassify”
29
Sg = {}, Dg = { a }
Sg = { a }, Dg = { a }
Some Processes Can Subtract Some Tags
change_label(S={})
change_label(S={a})
gateway
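The three label operations from the preceding slides (add a tag, allocate a tag, subtract a tag) can be sketched as a toy model. This is an illustration under stated assumptions, not Flume's implementation: the `Process` class, the `LabelError` exception, and the tag naming scheme are hypothetical; only the names `create_tag` and `change_label` and the roles of S (secrecy) and D (declassify) come from the slides.

```python
import itertools

class LabelError(Exception):
    """Raised when a process tries to drop a tag it cannot declassify."""

class Process:
    _tag_ids = itertools.count(1)  # hypothetical source of fresh tag names

    def __init__(self):
        self.S = frozenset()  # secrecy label
        self.D = frozenset()  # tags this process may subtract ("declassify")

    def create_tag(self) -> str:
        # Allocating a tag gives the creator the right to declassify it.
        tag = f"tag{next(Process._tag_ids)}"
        self.D = self.D | {tag}
        return tag

    def change_label(self, new_S) -> None:
        new_S = frozenset(new_S)
        dropped = self.S - new_S
        # Any process can add any tag; dropping requires the tag to be in D.
        if not dropped <= self.D:
            raise LabelError(f"cannot drop {sorted(dropped - self.D)}")
        self.S = new_S

if __name__ == "__main__":
    gateway, web_app = Process(), Process()
    a = gateway.create_tag()
    gateway.change_label({a})     # add: always allowed
    gateway.change_label(set())   # subtract: allowed, a is in Dg
    web_app.change_label({a})     # add: always allowed
    try:
        web_app.change_label(set())
    except LabelError as err:
        print("web app stays tainted:", err)
```

The asymmetry is the point of the design: raising secrecy is free, while lowering it is a privilege held only by the process that allocated the tag (or, in a fuller model, processes it delegates to).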
30
Dw= {}
S = { a }
Sw= {}Sw= { a }
Alice’s Data
Web App
helper
Sp= { a }
Putting the Pieces Together
Sg = {}, Dg = { a, b }
Sg = { a }, Dg = { a, b }
P
DIFC KERNEL
gateway
31
Advances Over Traditional IFC
• Previous systems enforced information flow control at the OS level [Bell-LaPadula, KeyKOS, Orange Book, IX, SELinux, TrustedBSD, …]
• Now, apps can “legislate” security policies:
– create_tag()
– change_label()
33
Outline
1. Operating System Support for DIFC
– High-Level Design and Model
– API: How to build apps (for non-experts)
– Implementation on Linux, OpenBSD
2. Security improvement in a real Web site
3. Generalization
34
How To Build Apps [SOSP ’07 ]
• Maintain existing API (Unix in our case)
– “open file abstraction”
– “reliable inter-process communication”
– “network sockets”
– “threads”
• Add DIFC labels / rules
• A road historically fraught with peril!
35
Gotcha 1: Different Labels
gatewayWeb App
PW DB
Network
S = { a }
S = { d }
S = {}S = { ? }
D = { a, d }
36
Gotcha 2: Buggy Apps
Web App
S = { a } S = { a }D = { a, d }
Top Secret File
S = { t }
gateway
S = { a, t }D = { a, d }
37
Solution: Endpoints
gatewaye1
Se1={a}
Web App
S = { a }
e3 Se3={d}
S = {}D = { a, d }
PW DB S = { d }
Network
S = {}
e2
Se2={}
f
Sf= {a}
38
Kernel Controls Flow Between Endpoints
e1
Se1={a}
Web App
S = { a } S = {}D = { a, d }
f
Sf = {a}
gateway
Sf ⊆ Se1
Se1 ⊆ Sf
39
Endpoints Declassify Data
Data enters gateway with secrecy { a }
But gateway keeps its label S = {}
Thus gateway needs a ∈ D
e1
Se1={a}
Web App
S = { a } S = {}D = { a, d }
f
Sf= {a}gateway
40
Restrictions on Endpoints
• For process p, endpoint e: (Sp – Se) ⊆ Dp
• (Note: “–” is set-wise XOR, i.e., the symmetric difference of the two labels)
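The endpoint restriction is a one-line set check. A minimal sketch, assuming labels are frozensets of tag names; `endpoint_allowed` is a hypothetical helper name, and the example values are taken from the gateway diagrams in the surrounding slides.

```python
def endpoint_allowed(s_proc: frozenset, s_endpoint: frozenset,
                     d_proc: frozenset) -> bool:
    """An endpoint label may differ from the process label only by tags
    the process is able to declassify: (Sp XOR Se) must be a subset of Dp."""
    return (s_proc ^ s_endpoint) <= d_proc

if __name__ == "__main__":
    # Gateway with S = {} and D = {a, d} opening an endpoint with Se = {a}.
    print(endpoint_allowed(frozenset(), frozenset({"a"}),
                           frozenset({"a", "d"})))
    # Gateway that has picked up tag t (S = {t}, D = {a}) with Se = {a}.
    print(endpoint_allowed(frozenset({"t"}), frozenset({"a"}),
                           frozenset({"a"})))
```

In the second case the symmetric difference is { a, t }, and t is not in D, so the endpoint would have to be refused.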
41
Endpoints Suppress Temptation
gatewayeSe = {a}
Web App
S = { a } S = {}D = { a }
Top Secret File
S = { t }
S = { t }, D = { a }
X: the check { t } – { a } ⊆ Dp fails, since { a, t } ⊈ { a }
f
Sf = {a}
42
Endpoints Provably Fit DIFC Model
• If kernel enforces endpoint restrictions
• and kernel enforces subset rule between endpoints
• then process-level subset rule is upheld
– (i.e., p can send to q iff Sp ⊆ Sq)
44
Outline
1. Operating System Support for DIFC
– High-Level Design and Model
– Key detail: how apps manage resources
– Implementation
2. Result: real Web security improvements
3. Generalization
45
Flume Kernel Module
glibc
Flume: System Call Delegation
Web App
Flume Libc
Linux Kernel
open(“/alice/data”, O_WRONLY);
Flume Reference Monitor
Alice’s Data
Works on Linux 2.6 and OpenBSD 3.9
46
Endpoints In Flume
• Endpoints for:
– File descriptors
– Signal receive / signal send
– Parent wait / child exit
– Network
– System V IPC
– …etc…
47
Flume’s Place in the Software Stack
Flume
Linux OpenBSD Windows HiStar Asbestos Symbian
Wiki
General Web Computing Platform
Online Banking
Mobile Apps Etc…
48
Outline
1. Operating System Support for DIFC
2. Security improvement in a real Web site
3. Generalization
49
Example App: MoinMoin Wiki
50
MoinMoinWiki
Example MoinMoin Use
FreeTShirts
Alice’s Data
LayoffPlans
51
Threat Model
Plug-in 1 Plug-in 2
MoinMoin Wiki
Apache Python libs
glibc
Linux Kernel FS
Compiler (gcc)
VULNERABLE
SECURED
Flume
52
Harden Biggest Pieces?
Linux KernelArea LOC
glibc
Apache
libs
FS
gcc
Python
MoinMoin Wiki
Plugins
Flume
53
Harden What’s Hard To Secure
Plug-in 1 Plug-in 2
LOCArea
# of Installs
Kernel
– “USAJobs.gov hit by Monster.com attack, 146,000 people affected”
– “Payroll Site Closes on Security Worries”
– “Hacker Accesses Thousands of Personal Data Files at CSU Chico”
– “FTC Investigates PETCO.com Security Hole”
– “Major Breach of UCLA’s Computer Files”
– “UN Website is Defaced via SQL Injection”
– “Harvard Security Breach Exposes Sensitive Student Data”
– “Security Lapse Exposes Facebook Photos”
54
DIFC KERNEL
Implementation Strategy
Web App
Bob’s Data
Alice’s Data
gateway
MoinMoin Wiki (90 kLOC Python)
Plugins
Web Server
FlumeWiki Gateway (1 kLOC)
55
Implementation
MoinMoin Wiki (90 kLOC Python)
Plugins
Apache Web Server
FlumeWiki Gateway (1 kLOC)
Flume Server
Alice’s Data
FLUME
Dg = { a, b }
UNTRUSTED
TRUSTED
56
Results
• Can Flume accommodate non-expert code?
– only 1,000 out of 90,000 LOC in MoinMoin changed
– Python interpreter, Apache, unchanged
• Does Flume perform reasonably?
– 43% slower in read throughput
– 34% slower in write throughput
– Overhead from system call interposition
– Scales up linearly with a cluster of Flume servers
57
Security Improvements
• Isolate security code:
– Declassifier is 1/90th the size of MoinMoin application
• FlumeWiki inherited 2 ACL bypass vulnerabilities from MoinMoin
– e.g., Bug in MonthCalendar() macro
58
Bug In MonthCalendar
HTTP ERROR 500
60
Security Improvements
• Exploits fail against FlumeWiki
• Related Bug-Stopping Techniques Won’t Work
– Taint Tracking, SFI, PittSFIeld, XFI, Program Shepherding, Model-Checking, Non-DIFC PL techniques [SPIN, Java, Singularity], etc.
61
Outline
1. Operating System Support for DIFC
2. Security improvement in a real Web site
3. Generalization: The W5 Platform
62
Recall: FlumeWiki
Alice’s Data
FlumeWiki Gateway (1 kLOC)
Flume Server
Apache Web Server
Bob’s Data
MoinMoin Wiki (90 kLOC Python)
Plugins
Security Guarantees
FLUME
63
FLUME
Proposal: W5 [HotNets ’07]
FlumeWiki Gateway (1 kLOC)
Flume Server
Apache Web Server
Security Guarantees
Alice’s Data
Bob’s Data
Unvetted / Suspect / Uploaded Code
64
FLUME
A New Architecture
Alice’s Data
W5 Gateway
Bob’s Data
BlogPhotoSharing
Flume Server
Messaging Matching
(Application writers)
65
“Declassifiers” Allow Data Sharing
W5 Server
Alice’s photo
W5 Gateway
Photo Sharing
Alice’s Declassifier
Chuck
S = { a } S = { a }S = { a }D = { a }S = { c }D = { a }
FLUME
Security in W5
Alice’s Declassifier
Blogger
Photoz
Filter
Wiki
Dating
Vampires
Videos
Calendar
Email
MsgBoard
Aquarium
Hot Or Not
Poke
Travel Map
Music
Quizzes
66
Flume Platform
W5 Gateway
Trusted: if buggy, Alice might lose data
Untrusted: cannot export/steal Alice’s data
68
“What Are You Doing Next?”
• Evangelize W5 (requires interview; in progress)
• Build it out (requires job)
– Spell out appropriate IFC policies
– SQL++ for mutually distrustful apps
– Tackle browser and JavaScript
– FS support for integrity / rollback
– Economic model
– Multiple providers
– Resource allocation
69
Conclusions
• Idea: End-to-end Web security
• Realization: DIFC for the OS
– Endpoints: Merge DIFC with standard API
– Implementation for standard OS abstractions
– MoinMoin Wiki secured
• New direction for Web platforms
• Download software: http://flume.csail.mit.edu
70
Privilege Separation: OKWS [USENIX ’04]
DB
OKWSdemux
search
Search DB PW DB
login inbox
mail DB
Python
search login
Apache
inbox
71
MoinMoin Insight: Integrity
Write /LayoffPlansMoinMoin
LayoffPlans
Policy before: MoinMoin is not to be trusted.
Policy now: Original MoinMoin is correct; plugins might be problematic
72
MoinMoin Insight: Integrity
Write /LayoffPlans
LayoffPlans
MoinMoin
73
End-To-End Integrity
Write /LayoffPlans
MoinMoin
Python modules: moin.py, os.py, re.py, sys.py, …, XYZ.py
Shared libraries: libc.so, libm.so, libintl.so, …, libXY.so
Configuration and data: resolv.conf, tzdata, …, XY.conf
2,000+ file opens
74
End-To-End Integrity
Write /LayoffPlans
MoinMoin
Python modules: moin.py, os.py, re.py, sys.py, …, XYZ.py
Shared libraries: libc.so, libm.so, libintl.so, …, libXY.so
Configuration and data: resolv.conf, tzdata, …, XY.conf
PYTHONPATH=???
75
FlumeWiki Can Certify Results
Write /LayoffPlans
LayoffPlans
MoinMoin
FlumeWiki Declassifier
sys.py
76
MoinMoin Growing Over Time
[Chart: lines of Python code in MoinMoin (y-axis, 0 to 100,000) over time (x-axis, Aug-00 to Sep-07)]
77
History of IFC
1968: Adept-50 [Weissman]
1973: Bell-LaPadula
1976: Lattice Model [Denning]
1985: Orange Book [DoD]
1992: IX [McIlroy + Reeds]
1997: JIF [Myers]
2001: SELinux [Loscocco + Smalley]
2005: Asbestos [SOSP]
2006: HiStar [Zeldovich]
2007: Flume [SOSP]
Decentralized IFC: relaxed, more practical
IFC
78
Why Asbestos Leaks Data
leaker
Alice’s Data
S = { a } S = { a }
b1
b2
b3
b4
S = {}
S = {}
S = {}
S = {}
1011
Leak file
S = {}
0000
S = {}
1011
S = { a }
79
leaker
Alice’s Data
S = { a } S = { a }
b1
b2
b3
b4
S = {}
S = {}
S = {}
S = {}
1011
Leak file
S = {}
0000
S = {}S = { a }
Why Flume Doesn’t Leak Data [HiStar]
leaker
S = { a }b2
S = {}S = { a }
Why Flume Doesn’t Leak Data [HiStar]
Leak file
S = {}
82
Classic Attack, Revisited
Feature B Feature A
Alice’s data
Bob’s data
Bob
3rd Party
Feature D
GET /XxxXxx/FeatureD/Alice
Alice’s Data
83
Web Server
Web App
Service 1 (C++)
Service 2 (Python)
Service 3(Java) ...
Why Is Web Security Difficult?
Storage (DB or FS)
Bob’s Data
Alice’s Data
Service 4(Ruby)
87
Why Endpoint Constraints Are Correct
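Message sent from p (via endpoint e) to q (via endpoint f):
Endpoint restrictions: (Sp – Se) ⊆ Dp and (Sf – Sq) ⊆ Dq
Kernel check between endpoints: Se ⊆ Sf
Sender p is equivalent to: change_label(Se ∩ Sp); <send message>; change_label(Sp);
Receiver q is equivalent to: change_label(Sf ∪ Sq); <receive message>; change_label(Sq);
Therefore: Se ∩ Sp ⊆ Se ⊆ Sf ⊆ Sf ∪ Sq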
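The argument on this slide can be checked by brute force over a small tag universe. The sketch below is only a sanity check of the reasoning, not a proof and not part of Flume; the helper names (`subsets`, `endpoint_ok`, `may_drop`, `check_all`) and the two-tag universe are assumptions made for illustration. For every combination of labels that satisfies both endpoint restrictions and the endpoint subset rule, it confirms that the temporary label changes are ones the processes could legally make themselves and that the message flows between process labels that obey the subset rule.

```python
from itertools import combinations, product

def subsets(universe):
    """All subsets of the universe, as frozensets."""
    items = list(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def endpoint_ok(s_proc, s_ep, d_proc):
    # Endpoint restriction: symmetric difference must be declassifiable.
    return (s_proc ^ s_ep) <= d_proc

def may_drop(s_old, s_new, d_proc):
    # Adding tags is always allowed; dropped tags must be in D.
    return (s_old - s_new) <= d_proc

def check_all(universe=("a", "b")):
    checked = 0
    for Sp, Dp, Se, Sf, Sq, Dq in product(subsets(universe), repeat=6):
        if not (endpoint_ok(Sp, Se, Dp) and endpoint_ok(Sq, Sf, Dq)
                and Se <= Sf):
            continue
        send_label = Se & Sp   # sender's temporary label
        recv_label = Sf | Sq   # receiver's temporary label
        assert may_drop(Sp, send_label, Dp)   # p may lower to Se ∩ Sp
        assert may_drop(recv_label, Sq, Dq)   # q may return to Sq
        assert send_label <= recv_label       # process-level subset rule
        checked += 1
    return checked

if __name__ == "__main__":
    print("label combinations checked:", check_all())
```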