Retrofitting Privacy into Operating Systems
A dissertation presented in partial fulfillment of the requirements for the degree of
Doctor of Philosophy
in the field of
Information Assurance
by
Kaan Onarlioglu
Committee Members
Engin Kirda, Northeastern University
William Robertson, Northeastern University
Christo Wilson, Northeastern University
Manuel Egele, Boston University
Northeastern University
College of Computer and Information Science
Boston, Massachusetts
December 2016
Abstract
Changing technology and adversarial models often render existing privacy defenses
obsolete, or lead to the emergence of new privacy-sensitive computer resources.
This requires the security community to develop novel defenses that address
contemporary privacy threats.
The operating system is a natural platform on which to build these novel privacy
defenses, as it allows enforcing correct, strong security properties on applications
and scales to the entire user space. However, developing and deploying a secure
operating system from scratch is impractical.
In this thesis, we instead argue that extending existing operating systems with
novel privacy defenses is the preferable approach to addressing emerging privacy
needs, while also sidestepping issues of cost and usability. We show that such an
approach is both feasible and effective.
To support this claim, we examine four distinct privacy scenarios: (1) keeping
privacy-sensitive data produced during short-lived program execution sessions con-
fidential, (2) securely deleting sensitive data persisted to modern storage media for
long-term use, (3) hiding the existence of encrypted sensitive data on disk, and finally,
(4) providing a user-driven access control model for non-traditional privacy-sensitive
resources on a computer. We discuss the contemporary privacy threats pertaining to
each scenario, and present four privacy-enhancing techniques to address these issues.
We demonstrate that our solutions can be retrofitted into existing operating systems
to provide general, application-transparent, and usable defenses.
Acknowledgments
Pursuing a PhD can be a scarring experience. I came through unscathed, and it is
largely thanks to Prof. Engin Kirda who mentored and championed me throughout
these long years. I am indebted to him for the many opportunities he has opened
up for me. Likewise, I owe a special debt of gratitude to Prof. William Robertson
for guiding me through the inception and formulation of my research. I have learned
a lot from Engin and Wil, and enjoyed countless scholarly moments together with
them∗. It was an honor to be part of the team.
I thank my committee members Prof. Christo Wilson and Prof. Manuel Egele for
patiently reading this thesis, and for their valuable feedback.
I am grateful to Prof. Ali Aydin Selcuk who set things in motion by converting
me to security research. I understand now that the alternative path would have led
to a world of pain and misery. It was a close call.
I thank all the SecLab members I had the pleasure of working and spending
time with: Ahmet, Ahmet2, Amin, Andrea, Beri, Can, Collin, Erik, Matt, Michael,
Patrick, Sajjad, and Sevtap. Also Amirali, even though I am still not certain why he
was hanging around with us. A special shout-out to Tobias for sticking with me since
the very beginning. It was a fun ride.
I thank my parents for their endless “love and support” – without them I never
could have afforded the rent in Boston. Lastly, I thank Defne for bearing with me
through some extremely difficult times and making life easier on all fronts.
∗Excluding that one time in Japan. That was not very scholarly.
Contents
List of Tables vii
List of Figures viii
1 Introduction 1
1.1 Motivation for Novel Defenses . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Constraints for Practical Defenses . . . . . . . . . . . . . . . . . . . . 3
1.3 Our Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.4 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.5 Research Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2 PrivExec: Private Execution as an Operating System Service 9
2.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Threat Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3 Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3.1 Security Properties . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3.2 File System . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3.3 Swap Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.3.4 Inter-Process Communication . . . . . . . . . . . . . . . . . . 17
2.3.5 Memory Isolation . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.3.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.4 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.4.1 Private Process Management . . . . . . . . . . . . . . . . . . . 19
2.4.2 Private Disk I/O . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.4.3 Private Swap Space . . . . . . . . . . . . . . . . . . . . . . . . 28
2.4.4 Private Inter-Process Communication . . . . . . . . . . . . . . 29
2.4.5 Launching Private Applications . . . . . . . . . . . . . . . . . 30
2.5 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.5.1 Running Popular Applications . . . . . . . . . . . . . . . . . . 31
2.5.2 Disk I/O and File System Benchmarks . . . . . . . . . . . . . 32
2.5.3 Real-World Application Performance . . . . . . . . . . . . . . 34
2.6 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.7 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3 Eraser: Secure Deletion on Blackbox Hardware 46
3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.2 Background & Related Work . . . . . . . . . . . . . . . . . . . . . . . 48
3.2.1 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.2.2 Flash Translation Layers . . . . . . . . . . . . . . . . . . . . . 52
3.2.3 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.3 Threat Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.4 Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.4.1 Naïve Approach . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.4.2 File Key Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.5 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
3.5.1 Alternative Solutions & Our Philosophy . . . . . . . . . . . . 60
3.5.2 Prototype Overview . . . . . . . . . . . . . . . . . . . . . . . 62
3.5.3 I/O Manipulation . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.5.4 Intercepting File Deletions . . . . . . . . . . . . . . . . . . . . 66
3.5.5 Key Storage & Management . . . . . . . . . . . . . . . . . . . 67
3.5.6 Master Key Vault . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.5.7 Encrypting Non-File System Blocks . . . . . . . . . . . . . . . 70
3.5.8 Managing Eraser Partitions . . . . . . . . . . . . . . . . . . 71
3.6 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
3.6.1 I/O Benchmarks . . . . . . . . . . . . . . . . . . . . . . . . . 72
3.6.2 Tests with Many Small Files . . . . . . . . . . . . . . . . . . . 75
3.6.3 Discussion of Results . . . . . . . . . . . . . . . . . . . . . . . 76
3.7 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
3.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4 HiVE: Hidden Volume Encryption 83
4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
4.2 Threat Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.3 Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.3.1 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
4.3.2 Generic Hidden Volume Encryption . . . . . . . . . . . . . . . 86
4.3.3 Write-Only ORAM Construction . . . . . . . . . . . . . . . . 88
4.3.4 Choosing the Parameter k . . . . . . . . . . . . . . . . . . . . 89
4.3.5 Write-Only ORAM Optimizations . . . . . . . . . . . . . . . . 90
4.3.6 Hidden Volume Encryption with HiVE . . . . . . . . . . . . . 91
4.4 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
4.5 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.6 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
5 Overhaul: Input-Driven Access Control on Traditional Operating Systems 101
5.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
5.2 Threat Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
5.3 Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5.3.1 Security Properties . . . . . . . . . . . . . . . . . . . . . . . . 105
5.3.2 Trusted Input & Output Paths . . . . . . . . . . . . . . . . . 106
5.3.3 Permission Adjustments . . . . . . . . . . . . . . . . . . . . . 107
5.3.4 Sensitive Resource Protection . . . . . . . . . . . . . . . . . . 108
5.3.5 Interaction Across Process Boundaries . . . . . . . . . . . . . 113
5.3.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
5.4 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
5.4.1 Enhancements to X Window System . . . . . . . . . . . . . . 117
5.4.2 Enhancements to the Linux Kernel . . . . . . . . . . . . . . . 124
5.5 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
5.5.1 Performance Measurements . . . . . . . . . . . . . . . . . . . 131
5.5.2 Usability Experiments . . . . . . . . . . . . . . . . . . . . . . 133
5.5.3 Applicability & False Positives Assessment . . . . . . . . . . . 134
5.5.4 Empirical Experiments . . . . . . . . . . . . . . . . . . . . . . 136
5.6 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
5.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
6 Conclusions 142
List of Tables
2.1 Disk I/O and file system performance of PrivExec. . . . . . . . . . 32
2.2 Runtime performance overhead of PrivExec for two popular web
browsers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.3 Runtime performance overhead of PrivExec for various desktop and
console applications. . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.1 Disk I/O and file system performance of Eraser compared to full disk
encryption with dm-crypt. . . . . . . . . . . . . . . . . . . . . . . . . 73
3.2 Timed experiments with the Linux kernel source code directory to
compare the small-file performance of Eraser to full disk encryption
with dm-crypt. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.1 Disk I/O and file system performance of Hive. . . . . . . . . . . . . 97
5.1 Performance overhead of Overhaul. . . . . . . . . . . . . . . . . . . 131
List of Figures
2.1 An overview of PrivExec’s design. . . . . . . . . . . . . . . . . . . . 13
2.2 An overview of the Linux block I/O layers. . . . . . . . . . . . . . . . 21
2.3 Setting up the secure storage container and overlaying it on the root
file system. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.1 Structure of an n-ary FKT. . . . . . . . . . . . . . . . . . . . . . . . 57
3.2 Secure deletion using an FKT. . . . . . . . . . . . . . . . . . . . . . . 58
3.3 An overview of Eraser’s design. . . . . . . . . . . . . . . . . . . . . 63
4.1 Hive stores volumes interleaved on disk. . . . . . . . . . . . . . . . . 93
5.1 Dynamic access control over privacy-sensitive hardware devices. . . . 110
5.2 Protecting copy & paste operations against clipboard sniffing. . . . . 112
5.3 A program launcher executing a screen capture program. . . . . . . . 113
5.4 A multi-process browser, components of which communicate via shared
memory IPC. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
5.5 Sample visual alerts shown by Overhaul. . . . . . . . . . . . . . . . 120
5.6 Protocol diagram for the X11 copy & paste operation. . . . . . . . . . 122
Chapter 1
Introduction
The ever-increasing significance of computers in our daily lives, whether to work, to
communicate, or to entertain, has resulted in the present situation where our comput-
ers process and store immense amounts of private information. For instance, personal
and sensitive information such as records of online communications or financial trans-
actions, bank account or credit card details, and user credentials such as passwords
to Internet-based services is often kept on the computer’s disk.
Various defenses are readily available on modern computer systems to alleviate
potential threats to user privacy. For example, a large pool of file and disk encryption
utilities make it possible for users to secure their confidential data against unautho-
rized access. Similarly, many modern operating systems have powerful access control
mechanisms in place to isolate multiple users sharing a computer from each other, or
to enforce security policies on applications that attempt to access privacy-sensitive
system resources.
On the one hand, such well-established defenses have proven to be effective at
mitigating many common attacks on user privacy in the past. On the other hand,
privacy requirements of modern-day computer users are still rapidly evolving as new
applications of privacy-sensitive information emerge, all the while attacks that
compromise privacy become more complex and lucrative. This situation requires
security professionals to revise, update, and adapt existing defenses, or to devise
novel solutions that adequately address contemporary threats to privacy.
1.1 Motivation for Novel Defenses
Two substantial factors that motivate the design of novel privacy defenses and un-
derline the shortcomings of existing ones are the changing technology and adversarial
models. Here, we take a closer look at these, and illustrate how they impact current
privacy solutions.
Rapid technological advancement is an important driver of the need for novel privacy
defenses, as it exposes new types of privacy-sensitive resources to attackers. One
prominent example is the proliferation of mobile devices such as smartphones and
tablets. These devices are typically equipped with physical sensors such as cameras
and microphones, which are often not sufficiently protected
by traditional access control mechanisms. As a result, the potential for their abuse
to spy on users has become a growing concern.
Another instance of a typical system resource that has previously received little at-
tention from a privacy standpoint is the clipboard. As the use of online commerce and
other Internet-based services has soared over the years, digital wallets and password
managers have become commonplace. These utilities typically use the clipboard to
automatically copy users’ secret credentials into Web forms. Thus, clipboard contents
have become attractive targets for attackers.
In a similar manner, other technologies render previously working defenses ob-
solete. For example, while securely erasing sensitive persistent data was previously
possible by overwriting the contents of the corresponding disk sectors, modern stor-
age technologies such as journaling file systems and solid state drives (SSDs) make it
difficult to prevent data remanence.
A second pertinent factor that calls for revisions to existing privacy defenses is
the changing roles and capabilities of adversaries. Contemporary attacks against full
disk encryption technologies clearly illustrate this problem. Disk encryption is a well-
established, technically sound solution to restrict access to stored private information
in the case of device theft or forensic inspection of the disk. Nevertheless, evolving
adversarial models have positioned government authorities, such as law enforcement
agencies, as powerful threats to user privacy [1]. The fact that such authorities can
force users to disclose their disk encryption keys and render any encryption ineffective
necessitates novel defenses that are resistant to coercion attacks.
Even with the use of techniques such as hidden encrypted disk partitions that
would allow a user to conceal the existence of encrypted data on disk, sufficiently
motivated adversaries can still exploit shortcomings of known defenses and gain ac-
cess to the sensitive data. For example, the hidden partition scheme provided by
TrueCrypt, a popular disk encryption tool, can be bypassed by inspecting multiple
snapshots taken from the same disk [2].
1.2 Constraints for Practical Defenses
Generalizing from the above examples, we observe that as the accumulation of private
information on computers increases and threats to privacy keep evolving, defenses
need to be revised to address changing security requirements. However, resolving
these emerging privacy issues in an effective way also requires solution designers to
adhere to various constraints.
First, with the scale of sensitive information being processed and stored on
computers today, a great many applications would benefit from the additional
privacy features offered by novel solutions. However, implementing and maintaining
application-specific privacy features would both incur a high software development
cost and be more likely to lead to security-crippling bugs. This issue is exemplified by a
prevalent application-specific defense: the private browsing modes offered by modern
web browsers which promise to erase the users’ tracks after each browsing session is
ended. Indeed, recent research has demonstrated that all major browsers have flawed
private browsing mode implementations that leave behind traces of the terminated
private session [3].
Consequently, we argue that designing more general solutions that target large
categories of applications without requiring modifications to every single application
instance would be a better approach to implementing correct privacy features at scale.
We point out that this requirement also implies that practical defenses would need to
be transparent to existing applications; in other words, they would need to work out
of the box, without disrupting the functionality of, or requiring drastic modifications
to common software running on the system.
Next, we note that a common pattern in security technologies is that low-complexity,
low-overhead solutions are easily adopted (e.g., ASLR [4], NOEXEC [5]). In con-
trast, solutions that require complex setup procedures or are resource-intensive are
not immediately or widely utilized, even if they are effective defenses (e.g., full-scale
Control Flow Integrity [6], CSP [7]). Therefore, defenses that are easy to deploy and
less resource-intensive are more likely to have a significant practical impact.
Finally, we stress that the human element is another important factor that deter-
mines the success of novel security features. Defenses that are not well-understood
by users or are unfamiliar to them are more likely to be misused or completely aban-
doned in practice. Therefore, designing high-usability privacy defenses also warrants
significant attention.
1.3 Our Approach
In light of the constraints and requirements discussed above, we argue that the op-
erating system is a natural and suitable place to implement novel privacy defenses
against emerging threats.
First and foremost, thanks to its supervisor role the operating system is able to
introspect on all user space, providing strong security guarantees to applications and
enforcing privacy policies. Furthermore, due to its position as a common platform
that all applications run on, the operating system is capable of offering general pri-
vacy services to programs in an application-agnostic manner. This also enables easy
verification of the implemented defenses to ensure correctness.
While it could be possible to build a secure operating system from the ground
up with this philosophy in mind, such an approach would have numerous negative
implications for the effectiveness and practicality of any proposed privacy features.
To begin with, a new operating system would likely require extensive changes to
the user space. In other words, maintaining compatibility with the large pool of
existing, well-established software systems would require modifying, or in the worst
case, rewriting each one of these programs. In addition, replacing the universally
deployed and popular operating systems such as Windows, Mac OS X, or Linux, and
familiarizing their users with a brand new platform would be costly. All in all, a new
secure operating system would be highly unlikely to gain widespread adoption.
Thesis statement
In this thesis, we argue that extending existing operating systems with novel de-
fenses is the preferable approach to addressing emerging privacy needs. We show
that retrofitting privacy-enhancing technologies into existing operating systems is
both feasible and effective.
1.4 Contributions
In this work, we examine the issue of user privacy from four perspectives, each repre-
senting a distinct threat scenario to sensitive information captured and processed by
or stored on a computer. We present four systems to retrofit novel privacy-enhancing
techniques into existing operating systems to address these problems. A summary of
the topics we cover is as follows.
• In Chapter 2, we look at the problem of keeping short-lived program execution
sessions private, or in other words, securely erasing all traces left behind by a
running application in persistent storage. We describe PrivExec, an operat-
ing system service that allows a private browsing mode-like private execution
platform for arbitrary applications.
• In Chapter 3, we shift our focus to the challenges of irreversibly deleting data
that is persisted to storage media for long-term use, after it is no longer needed.
We present Eraser, a technique that can guarantee secure deletion indepen-
dent of the characteristics of the underlying storage medium.
• In Chapter 4, we tackle the problem of hiding the existence of sensitive infor-
mation on a computer. We propose Hive, a hidden volume encryption scheme
that offers plausibly deniable disk encryption against strong adversaries with
multiple disk snapshot capabilities.
• In Chapter 5, we discuss access control for other types of privacy-sensitive sys-
tem resources, such as hardware sensors attached to a computer, and virtual
resources including the clipboard and screen contents. We present Overhaul,
a dynamic, input-driven access control architecture, where access to privacy-
sensitive resources is mediated based on the temporal proximity of user inter-
actions to access requests, and requests are communicated back to the user via
visual alerts.
In each of the following chapters, we first present operating system-independent
designs for the aforementioned systems, and describe how they solve the privacy
issues in our focus. Next, we describe concrete Linux implementations for all four
systems, and demonstrate that retrofitting privacy features into a standard, prevalent
operating system is both feasible and practical.
1.5 Research Goals
Before we venture into the discussion of these individual systems that we propose, we
lay out a set of criteria, or in other words, research goals, as a guideline to designing
effective privacy systems with high practical impact. These are presented below.
(G1) Compatibility. The privacy threats that we have described and that we are
trying to address in this research are affecting many users today. Therefore, the
solutions we propose must be retrofittable into existing operating systems. This
must be performed without drastic modifications to well-established operating
system design and implementation principles, but instead by reusing tried and
tested technologies already available.
(G2) Generality. Providing application-specific privacy features to target ever-
evolving threats and adversaries does not scale well with the plethora of ap-
plications available to users today. Such an approach would require high devel-
opment and maintenance effort, and be prone to developer errors. Instead, our
approach must provide security features as general operating system services
to all applications running on the system, making it possible for users to avail
themselves of the defenses as necessary regardless of the application they are
running.
(G3) Transparency. Directly following from the above, techniques that would re-
quire patching or rewriting individual applications to make them conform to the
restrictions of the introduced defense mechanisms are cumbersome to deploy,
and are especially impractical for use with legacy binary applications with-
out source code. Thus, the techniques we propose must be transparent to the
software running above the operating system level, and must not require any
modifications to existing applications. In the same way, the systems we propose
must not break, or interfere with, the normal functionality of applications.
(G4) Usability. Security solutions that are resource-intensive, difficult to set up
or use, or that drastically change the user experience of interacting with a
computer are less likely to enjoy widespread adoption. Therefore, the systems
we propose must provide low-overhead, unintrusive defenses while preserving
the user experience computer users are familiar with.
At the end of each of the following chapters, we will revisit these criteria to assess
and discuss the effectiveness of the proposed systems.
Chapter 2
PrivExec: Private Execution as an
Operating System Service
2.1 Overview
Approaches to preserving user privacy on the client side often involve preventing
sensitive data handled by applications from being exposed in the clear in persistent
storage. Web browsers serve as a canonical example of such an approach. As part of
their normal execution, browsers store a large amount of personal information that
could potentially be damaging were it to be disclosed, such as the browsing history,
bookmarks, cache, cookie store, or local storage contents. In recognition of the fact
that users might not want to leave traces of particularly sensitive browsing sessions,
browsers now typically offer a private browsing mode that attempts to prevent persis-
tent modifications to storage that could provide some indication of the user’s activities
during such a session. In this mode, sensitive user data that would normally be per-
sisted to disk is instead only stored temporarily, if at all, and when a private browsing
session ends, this data is discarded.
Private browsing mode has come to be a widely-used feature of major browsers.
However, its implementation as an application-specific feature has significant disad-
vantages that are important to recognize. First, implementing a privacy-preserving
execution mode is difficult to get right. For instance, prior work by Aggarwal et al. [3]
demonstrates that all of the major browsers leave traces of sensitive user data on disk
despite the use of private browsing mode. Second, if any sensitive data does reach
stable storage, it is difficult for user-level applications to guarantee that this data
would not be recoverable via forensic analysis. For example, modern journaling file
systems make disk wiping techniques unreliable, and applications must be careful to
prevent sensitive data from being swapped to disk through judicious use of system
calls such as mlock on Linux.
One way to avoid leaving traces of sensitive user data in persistent storage is to use
cryptographic techniques such as full disk encryption. Here, the idea is to ensure that
all application disk writes are encrypted prior to storage. Therefore, regardless of the
nature of the data that is saved to disk, users without knowledge of the corresponding
secret key will not be able to recover any information. While this is a powerful and
realizable technique, it nevertheless has the significant disadvantage that users can
be coerced, through legal or other means, into disclosing their keys, at which point
the encryption becomes useless.
These concerns suggest that private execution is a feature that is best provided
by the operating system, where strong privacy guarantees can be provided to any
application and analyzed for correctness. Standard cryptographic techniques such as
disk encryption do not satisfactorily solve the problem.
In this chapter, we present the design and implementation of PrivExec, a novel
operating system service for private execution. PrivExec provides strong, general
guarantees of private execution, allowing any application to execute in a mode where
storage writes, either to the file system or to swap, will not be recoverable by others
during or after execution. PrivExec achieves this by binding an ephemeral private
execution key to groups of processes that wish to execute privately. This key is
used to encrypt all data stored to file systems, as well as process memory pages
written to swap devices, and is never exposed outside of kernel memory or persisted
to storage. Once a private execution session has ended, the private execution key
is securely wiped from volatile memory. In addition, inter-process communication
(IPC) restrictions enforced by PrivExec prevent inadvertent leaks of sensitive data
to public processes that might circumvent the system’s private storage mechanisms.
PrivExec does not require application support; any unmodified, legacy binary
application can execute privately using our system. Due to the design of this ap-
proach, users cannot be coerced into disclosing information from a private execution.
We also demonstrate that our prototype implementation of PrivExec, which we
construct using existing, well-tested technologies as a foundation, incurs minimal
performance overhead. For many popular application scenarios, PrivExec has no
discernible negative impact on the user experience.
2.2 Threat Model
Our primary motivation for designing PrivExec is to prevent disclosure of sensitive
data produced in short-lived private execution sessions. The model for these private
execution sessions is similar to private browsing modes implemented in most modern
browsers, but generalized to any user-level application.
We divide the threat model we assume for this work into two scenarios, one for
the duration of a targeted private execution session, and another for after a session
has ended.
For the first scenario, we assume that an adversary can have remote access to the
target system as a normal user. Due to normal process-based isolation, the attacker
cannot inspect physical memory, kernel virtual memory, or process virtual memory
for processes labeled with a different user ID.
The threat model for the second scenario corresponds to a technically sophisticated
adversary with physical access to a target system after a private execution session
has ended. In this scenario, the adversary has complete access to the contents of any
local storage such as hard disks, as well as the system memory. It is assumed that
the adversary has access to sophisticated forensics tools that can retrieve insecurely
deleted data from a file system, or process memory pages from swap devices.
Common to both scenarios is the assumption of a “benign-but-buggy”, or perhaps
“benign-but-privacy-unaware”, application. In particular, our threat model does not
include applications that maliciously transmit private information to remote parties,
or users that do the same. As we describe in the next section, PrivExec aims to
avoid inadvertent disclosure of private information.
2.3 Design
In this section, we first outline the security guarantees that our system aims to provide,
and then elaborate on the privacy policies that a PrivExec-enabled system must
enforce for file system, swap space, IPC, and memory isolation.
2.3.1 Security Properties
PrivExec provides private execution as a generic operating system service by cre-
ating a logical distinction between public processes and private processes. While
public processes execute with the usual semantics regarding access to shared system
Figure 2.1: An overview of PrivExec's design. Public processes behave as normal
applications, with read-write access to public file systems and unrestricted IPC, in
that they can write to all other processes. Private processes, however, have read-only
access to public file systems. All private process writes are redirected to a dedicated
temporary secure storage container that persists only for the lifetime of the process
and is irrevocably discarded at process exit. Data stored in this container is encrypted
with a protected, process-specific private execution key (PEK) that is never revealed.
Private process swap is conceptually handled in a similar fashion. Finally, private
processes cannot write data to public processes or unrelated private processes via
IPC channels.
resources, private processes are subject to special restrictions to prevent disclosure
of sensitive data resulting from private execution. In the PrivExec model, private
processes may execute within the same logical privacy context, where resource access
restrictions between processes sharing a context are relaxed. We refer to private
processes related in this way as private process groups.
The concrete security properties that our system provides are the following:
(S1) Data explicitly written to storage must never be recoverable without knowledge
of a secret bound to an application for the duration of its private execution.
(S2) Application memory that is swapped to disk must never be recoverable without
knowledge of the application secret.
(S3) Data produced during a private execution must never be passed to processes
outside the private process group via IPC channels.
(S4) Application secrets must never be persisted, and never be exposed outside of
protected volatile memory.
(S5) Once a private execution has terminated, application secrets and data must be
securely discarded.
Together, (S1), (S2) and (S3) guarantee that data resulting from a private exe-
cution cannot be disclosed without access to the corresponding secret. (S4) ensures
that users cannot be coerced into divulging their personal information, as they do
not know the requisite secret, and hence, cannot provide it. (S5) implies that once
a private execution has ended, it is computationally infeasible to recover the data
produced during that execution. Figure 2.1 depicts an overview of the design of
PrivExec.
2.3.2 File System
Public processes have the expected read-write access to public file systems. Private
processes, on the other hand, are short-lived processes that have temporary secure
storage containers. This storage container is allocated only for the lifetime of a private
execution and is accessible only to the private process group it is associated with.
Each private process group is bound to a private execution key (PEK) that is the
basis for uniquely identifying a privacy context. This PEK is randomly generated
at private process creation, protected by the operating system, never stored in non-
volatile memory, and never disclosed to the user or any other process. The PEK
is used to encrypt all data produced during a private execution before it is written
to persistent storage within the secure container. In this way, PrivExec ensures
that sensitive data resulting from private process computation cannot be accessed
through the file system by any process that does not share the associated privacy
context. Furthermore, when a private execution terminates, PrivExec securely
wipes its PEK, and hence makes it computationally infeasible to recover the encrypted
contents of the associated storage container.
Although all new files created by a private process must be stored in its secure
container, applications often also need to access files that already exist in the normal
file system in order to function correctly. For instance, many applications load shared
libraries and read configuration files as part of their normal operation. The operating
system needs to ensure that such read requests are directed to the public file system.
An even more complicated situation arises when a private process attempts to modify
existing files. In that case, we need to create a separate private copy of the file in
the process’ secure container, and redirect all subsequent read and write requests for
that file to the new copy. PrivExec ensures that private processes can write only to
the secure storage container, while retaining a read-only view of the public file
systems, by enforcing the following copy-on-write policy:
For a write operation,
• if the destination file does not exist in the file system or in the secure container,
a new file is created in the container;
• if the file exists in the file system, but not in the container, a new copy of the
file is created in the container and the write is performed on this new copy;
• if the file exists in the container, the process directly modifies it regardless of
whether it exists in the file system.
For a read operation,
• if the file exists in the container, it is read from there regardless of whether it
also exists in the file system;
• if the file exists in the file system but not in the container, the file is read from
the file system;
• if the file exists neither in the file system nor in the container, the read operation
fails.
2.3.3 Swap Space
In addition to protecting data written to file systems by a private process, PrivExec
must also preserve the privacy of virtual memory pages swapped to disk. This is
different from existing approaches to swap encryption, which use a single key to
encrypt the entire swap device, and therefore fail to meet our security requirements
just as full disk encryption does. Since swap space is shared between
processes with different user principals, PrivExec encrypts each private process
memory page that is swapped to disk with the PEK of the corresponding process as
in the file system case, and thus imposes a per-application partitioning of the system
swap.
2.3.4 Inter-Process Communication
The private storage mechanisms described in the previous sections effectively prevent
sensitive data resulting from private computation from being persisted in the clear.
However, applications frequently make use of a number of IPC channels during their
normal operation. Without any restrictions in place, private processes might use
these channels to inadvertently leak sensitive data to a public process. If that public
process in turn persists that data, it would circumvent the protections PrivExec
attempts to enforce. Therefore, PrivExec must also enforce restrictions on IPC to
prevent such scenarios from occurring.
Specifically, PrivExec ensures that a private process can write data via IPC
only to the other members of its group that share the same privacy context. In other
words, a private process cannot write data to a public process or to an unrelated
private process.
As usual, public processes can freely exchange data with other public processes.
Note that public processes can also write data to private processes since data flow
from a public process to a private process does not violate the security properties of
PrivExec.
2.3.5 Memory Isolation
Enforcing strong memory isolation is essential to our private execution model, not
only for protecting the virtual address space of a private process, but also for pre-
venting the disclosure of PEKs. To this end, PrivExec takes measures to enforce
process and kernel isolation boundaries against unprivileged users for private pro-
cesses, in particular by disallowing standard exceptions to system isolation policies.
This includes disabling debugging facilities and blocking unprivileged
access to devices that expose kernel virtual memory or physical memory.
2.3.6 Discussion
The design we describe satisfies the goals we enumerate in Section 2.3.1. The PEK
serves as the application secret that ensures confidentiality of data produced during
private execution (S1), (S2). The PrivExec-enabled operating system protects the
confidentiality of the PEK, ensures that the user cannot be expected to know the
value of individual PEKs, and prevents private processes from inadvertently leaking
sensitive data via IPC channels to other processes (S3), (S4).
Destroying the PEK after a private execution has ended ensures that any data pro-
duced cannot feasibly be recovered by anyone, including the user (S5).
2.4 Implementation
In the following, we describe our prototype implementation of PrivExec as a set of
modifications to the Linux kernel and a user-level helper application. We center this
discussion around five main technical challenges: managing private processes, con-
structing secure storage containers, implementing private application swap, enforcing
restrictions on IPC channels, and running applications privately at the user level.
2.4.1 Private Process Management
The first requirement for implementing PrivExec is to enable the operating system
to support a private execution mode for processes. The operating system must be able
to launch an application as a private process upon request from the user, generate the
PEK, store it in an easily accessible context associated with that process, mark the
process and track it during its lifetime, and finally destroy the PEK when the private
process terminates. The operating system must also expose a simple interface for
user-level applications to request private execution without requiring modifications
to existing application code.
The Linux kernel represents every process on the system using a process descrip-
tor. The process descriptor contains all the information required to execute the
process, including state used for scheduling, virtual address space management,
and accounting. A new process, or child, is created by copying an existing process,
or parent, through the clone system call which allocates a new process descriptor
for the child, initializes it, and prepares it for scheduling. clone offers fine-grained
control over which system resources the parent and child share through a set of clone
flags passed as an argument. When a process is ready to terminate, the exit system
call deallocates resources associated with that process.
To implement our system, we first defined a new private execution flag that is
passed to clone to signal that a private process is to be created. We also defined
a similar flag that is set in the process descriptor to indicate that its corresponding
process is executing privately. We further extended the process descriptor to store
the PEK and a pre-allocated cryptographic transform structure that is used for swap
encryption.
To handle private process creation we modified clone to check for the presence of
our private execution flag. If present, we mark the newly cloned process descriptor
as private and generate a fresh PEK using a cryptographically-secure PRNG. As
previously discussed, the PEK is stored inside the process descriptor, resides in the
kernel virtual address space, and is never disclosed to the user. For private process
termination we adapted exit to check whether the terminating process is executing
privately, and if so, to deallocate the swap cryptographic transform and securely wipe
the PEK from memory. Since the Linux kernel handles processes and threads in the
same way, this approach also allows for creating and terminating private threads
without any additional implementation effort.
Note that applications might spawn additional children for creating subprocesses
or threads during their course of execution. This can lead to two critical issues with
multi-process and multi-threaded applications running under PrivExec. First, pub-
lic children of a private process could cause privacy leaks. Second, public children
cannot access the parent’s secure container, which could potentially break the appli-
cation. In order to prevent these problems, our notion of a private execution should
include the full set of application processes and threads, despite the fact that the
Linux kernel represents them with separate process descriptors. Therefore, we modi-
fied clone to ensure that all children of a private process inherit the parent’s private
status and privacy context, including the PEK and the secure storage container. Ref-
erence counting is used to ensure that resources are properly disposed of when the
entire private process group exits.
Also note that our implementation exposes PrivExec to user applications through
a new flag that is passed to clone. As a result, when the private execution flag is
Figure 2.2: An overview of the Linux block I/O layers.
not passed to the system call, the original semantics of the system call are preserved,
maintaining full compatibility with existing applications. Likewise, applications that
are not aware of the newly implemented PrivExec interface to clone can be made
private by simply wrapping their executables with a program that spawns them using
the private execution flag. We explain how existing applications run under PrivExec
without modifications in Section 2.4.5.
2.4.2 Private Disk I/O
PrivExec requires the operating system to provide every private application with
a dedicated secure storage container, to which all application data writes must be
directed. Upon launching a private application, the operating system must construct
this container, intercept and redirect I/O operations performed by the private appli-
cation, and encrypt writes and decrypt reads on the fly.
Although the Linux file I/O API consists of simple system calls such as read
and write, the corresponding kernel execution path crosses many different layers
and subsystems before the actual physical device is accessed. Block I/O requests
initiated by a system call first pass through the virtual file system (VFS), which
provides a unifying abstraction layer over different underlying file systems. After a
particular concrete file system processes the I/O request, the kernel caches it in the
page cache and eventually inserts the request into the target device driver’s request
queue. The driver periodically services queued requests by initiating asynchronous
I/O on the physical device, and then notifies the operating system when the operation
is complete. We refer the reader to Figure 2.2 for a graphical overview of these kernel
subsystems.
The choice of where to integrate PrivExec into the file I/O subsystems requires
careful consideration. In particular, in order to build a generic solution that is inde-
pendent of the underlying file system and physical device, we should avoid modifying
the individual file systems or the drivers for the physical storage devices. One option
is to intercept I/O requests between the page cache and the device’s request queue.
However, this results in sensitive data being stored as plaintext in the page cache, a
location that is accessible to the rest of the system. Thus, this is not an acceptable
solution. Likewise, encrypting the data as it enters the page cache is insufficient since
direct I/O operations that bypass the page cache would not be intercepted by our
system. A second major implementation question is how to handle the
redirection of I/O requests made by private processes per our copy-on-write policy.
In order to build a generic system that addresses all of the above challenges,
we leverage stackable file systems. A stackable file system resides between the VFS
and any underlying file system as a separate layer. It does not store data by itself,
but instead interposes on I/O requests, allowing for controlled modifications to these
requests before passing them to the file system it wraps. Since stackable file systems
usually do not need to know the workings of the underlying file system, they are
often used as a generic technique for introducing additional features to existing file
systems. PrivExec uses a combination of two stackable file systems to achieve its
goals: A version of eCryptfs [8] with our modifications to provide the secure storage
containers, and Overlayfs [9] to overlay these secure containers on top of the root file
system. In the following, we explain their use in PrivExec and our modifications
to eCryptfs in detail.
Secure Storage Containers
eCryptfs is a stackable cryptographic file system distributed with the Linux kernel,
and it provides the basis of PrivExec’s secure storage containers. eCryptfs provides
file system-level encryption, meaning that each file is encrypted separately and all
cryptographic metadata is stored inside the encrypted files. While this is likely to
be less efficient compared to block-level encryption (e.g., the approach taken by dm-
crypt [10]), eCryptfs does not require a full device or partition allocated for it, which
allows us to easily create any number of secure containers on the existing file systems
as demand necessitates.
Containers are structured as an upper directory and a lower directory. All I/O
operations are actually performed on the lower directory where files are stored in
encrypted form. The upper directory provides applications with a private view of the
plaintext contents.
eCryptfs encrypts the lower directory using AES-256, protecting both file
contents and directory entries. However, while its cryptographic capabilities are pow-
erful, eCryptfs has a number of shortcomings that make it unsuitable for use in
PrivExec on its own. First, once an encrypted directory is mounted and a de-
crypted view is made available at the upper directory, all users and applications with
sufficient permissions can access the decrypted content. Second, eCryptfs expects to
find the secret key in the Linux kernel keyring associated with the user before the file
system can be mounted. This makes it possible for other applications running under
the same user account to access the keyring, dump the key, and access data belonging
to another private application. Therefore, we modified eCryptfs in order to address
these issues and restrict access to private process data in line with our system design.
Our first set of modifications aims to uniquely associate mounted eCryptfs containers
with a single privacy context. In Linux, each file system allocates and initializes
a super block structure when it is mounted. We extended this structure used by
eCryptfs to include a private execution token (PET) that serves as a secret that iden-
tifies the privacy context associated with the mounted eCryptfs container. We then
modified the file system mount routine of eCryptfs to check whether the mount oper-
ation is requested by a private process. Since this function runs in the process context
inside the kernel, we can bind a container to a privacy context by simply checking for
the presence of the private execution flag we introduced in Section 2.4.1 inside the
process descriptor. If the flag is set, we populate the PET with a value derived from
the PEK. These extensions allow us to use the PET as a unique identifier in order
to determine whether a process performing eCryptfs operations is the owner of the
container. Of course, we securely wipe the PET from memory when the container is
unmounted.
To enforce access control on containers, we modified the cryptographic functions
of eCryptfs to check the identity of the requesting process using the PET. If the process
is not the owner of the container, the I/O request is blocked. Otherwise, if the
private process is the owner of the container, we fetch the PEK from the current
process descriptor and use it as the cryptographic key. This ensures that the PEK
never appears in the user’s kernel keyring, and is never exposed outside of the private
process group.
Although these extensions to eCryptfs address the root cause of the aforemen-
tioned privacy issues, one last problem remains: Once an encrypted file is accessed
by an authorized private process, eCryptfs caches the decrypted content and directly
serves subsequent I/O requests made by other processes from the cache, bypassing our
privacy measures. Therefore, we perform a final PET verification during file access
permission checks, and ensure that access to the eCryptfs upper directory is denied
to the rest of the system regardless of the directory’s UNIX permissions.
All in all, our modified eCryptfs layer provides a secure storage container that is
only accessible to a single private process group. Also note that all of the security
checks we inserted only trigger if eCryptfs is mounted by a private process in the
first place. This guarantees that normal applications can still use eCryptfs as before
without being restricted by our additional privacy requirements.
Overlaying Secure Storage Containers
Once a dedicated secure container has been constructed for a private process group,
we need to redirect I/O operations to that container. We achieve this through the
Figure 2.3: Setting up the secure storage container and overlaying it on the root
file system.
use of a stackable union file system. Union file systems are used to overlay several
different file system trees – sometimes referred to as branches – in a unified hierarchy,
merging their contents as if they were a single file system. Although every
implementation supports different unioning capabilities, in theory, a union file system
can be used to overlay any number of branches in a defined order, with specific read
and write policies for each branch.
Overlayfs is an implementation of this idea distributed with the Linux kernel, and
we leverage it as part of our prototype. We use Overlayfs to layer secure storage
containers on top of the root file system tree. The root file system is mounted as
a read-only lower branch, while the secure container is made the read-write upper
branch. In this way, through an Overlayfs mount point, a private process has a
complete view of the root file system, while all write operations are actually performed
on the secure container. Overlayfs also supports copy-on-write by default. In other
words, when an application attempts to write to a file in the lower read-only root file
system, Overlayfs first makes a copy of the file in the writable secure container and
performs the write on the copy. The files in an upper branch take precedence over
and shadow the same files in the lower branch, which also ensures that all subsequent
read and write operations are redirected to the new encrypted copies.
The entire process of setting up a secure container for a private process P and
overlaying it on the root file system is illustrated in Figure 2.3. Note that the given
path names are only examples; PrivExec uses random paths to support multiple
private execution sessions that run simultaneously. Before launching a private pro-
cess, in step one, PrivExec creates a secure container using our modified version
of eCryptfs and mounts it on ~/private. In step two, Overlayfs is used to overlay the
container on the root file system, and this new view is mounted on /tmp/fakeroot.
In the final step, the private process is launched in a chroot environment with its
root file system set to the Overlayfs mount point. In this way, the private process still has a
complete view of the original file system and full read-write access; however, all writes
are transparently redirected to the secure container. When the private process ter-
minates, PrivExec destroys the secure container and PEK, rendering the encrypted
data in ~/private irrecoverable.
2.4.3 Private Swap Space
Since the Linux kernel handles swap devices separately from file system I/O, PrivExec
must also interpose on these operations in order to preserve the privacy of virtual
memory pages swapped to disk. To this end, each page written to a swap device
must be encrypted with the PEK of the corresponding private process.
We implemented per-application swap encryption as a patch to the kernel’s swap
page-out routine. First, a check is performed to determine whether a page to be
written belongs to a private process. If so, the pre-allocated cipher transform in the
process descriptor is initialized with a page-specific IV, and the page is encrypted
with PEK prior to scheduling an asynchronous write operation.
For page-in, the situation is more complex. The kernel swap daemon (kswapd)
is responsible for scanning memory to perform page replacement, and operates in
a kernel thread context. Therefore, once a page has been selected for replacement,
process virtual memory structures must be traversed to locate a process descriptor
that owns the swap page. Once this has been done, however, the inverse of page-out
can be performed. Specifically, once the asynchronous read of the page from the swap
device has completed, a check is performed to determine whether the owning process
is in private execution mode. If so, the process cipher transform is initialized with
the page-specific IV, and the page is decrypted with the PEK prior to resumption of
the user process.
2.4.4 Private Inter-Process Communication
PrivExec also imposes restrictions on private process IPC to prevent data leaks
from a privacy context. In general, our approach with respect to private IPC is to
modify each IPC facility available to Linux applications as follows.
Similarly to secure storage containers, we embedded a PET in the kernel struc-
tures corresponding to IPC resources. We then modified the kernel IPC functions to
perform a check to compare the tokens of the endpoint processes at the time of chan-
nel establishment, or before read and write operations, augmenting the usual UNIX
permission checks as appropriate. The policy we implemented ensures that private
processes with the same token can freely exchange data, while private processes with
different tokens are prevented from communicating with a “permission denied” er-
ror. In addition, private processes are allowed to read from public processes, but
prevented from writing data to them. Of course, IPC semantics for communication
between public processes remains unchanged.
The specific Linux IPC facilities that we modified to conform to the policy de-
scribed above include UNIX SysV shared memory and message queues, POSIX shared
memory and message queues, FIFO queues, and UNIX domain sockets. We omit de-
tails of the specific changes as they are similar in nature to those described for the
case of secure storage containers.
2.4.5 Launching Private Applications
While PrivExec-aware applications can directly spawn private subprocesses or threads
as they require by passing the private execution flag to the clone system call, we
implemented a PrivExec wrapper as the primary method for running existing ap-
plications in private mode.
The PrivExec wrapper first creates a private copy of itself by invoking clone
with the private execution flag. Then, this private process creates an empty secure
storage container and mounts it in a user-specified location. Recall that, as explained
in Section 2.4.2, our modifications to eCryptfs ensure that only this specific private
process and its children can access the container from this point on. The wrapper
then creates the file system overlay, and finally loads the target application executable
in a chroot environment, changing the application’s root file system to our overlay. As
explained in Section 2.4.1, the application inherits the PEK of the wrapper, and starts
its private execution. When the application terminates, the PrivExec wrapper
cleans up the mounted overlay and exits.
Note that the final destruction of the container is simply for user convenience.
Even if the wrapper or the private application itself crashes or is killed, leaving the
container and the overlay mounted, the container is accessible only to the processes
that have the corresponding PEK (i.e., the private application that created it). Since
that application and its PEK are guaranteed to be destroyed by the kernel, the private
data remains inaccessible even if the container remains mounted.
2.5 Evaluation
The primary objective of our evaluation is to demonstrate that PrivExec is prac-
tical for real-world applications that often deal with sensitive information, without
detracting from the user experience. To this end, we first tested whether our system
works correctly, without breaking program functionality, by manually running pop-
ular applications with PrivExec. Next, we tested PrivExec’s performance using
standard disk I/O and file system benchmarks. Finally, we ran performance exper-
iments with well-known desktop and console applications that are representative of
the use cases PrivExec targets.
All tests were run on a standard desktop computer with an Intel i7-930 CPU, 9 GB
RAM, running Arch Linux x86-64 with kernel version 3.12.0-rc2. Disk benchmarks
were performed on a Samsung Spinpoint F3 HD502HJ mechanical hard disk.
2.5.1 Running Popular Applications
To demonstrate that our approach is applicable to and compatible with a wide va-
riety of software, we manually tested 50 popular applications with PrivExec. We
selected our test set from the top-rated applications list reported by the Ubuntu
Software Center. Specifically, we selected the top 50 applications, excluding all non-free or
Ubuntu-specific software. The tested applications include software in many differ-
ent categories such as developer tools (e.g., Eclipse, Emacs, Geany), graphics (e.g.,
Blender, Gimp, Inkscape), Internet (e.g., Chromium, FileZilla, Thunderbird), office
(e.g., LibreOffice), sound and video (e.g., Audacity, MPlayer), and games (e.g., Battle
for Wesnoth, Teeworlds). We launched each application with PrivExec, exercised
their core features, and checked whether they worked as intended.
This experiment revealed two important limitations of PrivExec regarding our
measures to block IPC channels. First, private X applications failed to start because
they could not communicate with the public X server through UNIX domain sockets.
This led us to modify our system to launch these applications in a new, private X
          Original            eCryptfs-only                  PrivExec
          Performance         Performance      Overhead      Performance      Overhead
Write     110694.60 KB/s      97536.83 KB/s    13.49 %       97979.47 KB/s    12.98 %
Read      111217.67 KB/s      107134.53 KB/s    3.81 %       106293.73 KB/s    4.63 %
Create    13906.73 files/s    8312.73 files/s  67.29 %       8181.10 files/s  69.99 %
Delete    42012.87 files/s    25232.67 files/s 66.50 %       23017.00 files/s 82.53 %

Table 2.1: Disk I/O and file system performance of PrivExec. eCryptfs-only
performance is also shown for comparison.
session, which resolved the issue. Alternatively, the IPC protection for stream type
UNIX domain sockets could be disabled as a trade-off in order to run private and
public applications in the same X session.
Second, a number of X applications that utilized the MIT Shared Memory Exten-
sion (MIT-SHM) to draw to the X display failed to render correctly since SysV shared
memory writes to the public X server were blocked. This issue was also resolved by
running a private X session, or simply by disabling the MIT-SHM extension in the X
server configuration file.
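For reference, disabling MIT-SHM in an X.Org configuration looks like the following fragment (file placement, e.g., under /etc/X11/xorg.conf.d/, varies by distribution):

```
Section "Extensions"
    Option "MIT-SHM" "Disable"
EndSection
```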
Once the above problems were dealt with, all 50 applications worked correctly
without exhibiting any unusual behavior or noticeable performance issues.
2.5.2 Disk I/O and File System Benchmarks
In order to evaluate the disk I/O and file system performance of PrivExec, we
used Bonnie++ [11], a well-known I/O and file system benchmark tool for UNIX-like
operating systems.
We first configured Bonnie++ to use 10 × 1 GB files to test the throughput of
block write and read operations. Next, we benchmarked file system operations by
configuring Bonnie++ to create and delete 102,400 files in a single directory, each
containing 512 bytes of data. We ran Bonnie++ as a normal process and then using
PrivExec for comparison, repeated all the experiments 10 times, and calculated the
average scores to get the final results. We present our findings in Table 2.1.
These results show that PrivExec performs reasonably well when doing regular
reads and writes, incurring an overhead of 12.98% and 4.63%, respectively. However,
private applications can experience slowdowns ranging from 70% to 85% when dealing
with large numbers of small files in a single directory. In fact, unoptimized file
system performance with large amounts of files is a known deficiency of eCryptfs,
which could provide an explanation for this performance hit.1 When we adjusted our
benchmarks to decrease the number of files used, or when we configured Bonnie++
to distribute the files evenly to a number of subdirectories, the performance gap
decreased drastically.
To see the impact of eCryptfs on PrivExec’s performance in general, we repeated
the measurements by running Bonnie++ on an eCryptfs-only partition. The results,
also shown in Table 2.1 for comparison, indicate that a significant part of PrivExec’s
disk I/O and file system overhead is introduced by the eCryptfs layer. This suggests
that a more optimized encrypting file system, or the use of block-level encryption
via dm-crypt (despite its various disadvantages such as the requirement to create
separate partitions of fixed size to be utilized by PrivExec) could greatly increase
PrivExec’s disk I/O and file system performance. We report the worst-case figures
in this section and leave the evaluation of these alternative techniques for future work.
While these results clearly indicate that PrivExec might not be suitable for
workloads involving many small files, such as running scientific computation appli-
cations or compiling large software projects, we must stress that such workloads do
not represent the use cases PrivExec is designed to target. In the next section
1 See an eCryptfs developer's response to a similar performance-related issue at http://superuser.com/questions/397252/ecryptfs-and-many-many-small-files-bad-performance, also linked from the official eCryptfs web page.
we demonstrate that these benchmark scores do not translate to decreased perfor-
mance when executing real-world applications with concrete privacy requirements
using PrivExec.
2.5.3 Real-World Application Performance
In a final set of experiments, we measured the overhead incurred by various common
desktop and console applications when running them with PrivExec. Specifically,
we identified 12 applications that are representative of the privacy-related scenarios
and concerns that PrivExec aims to address, and designed various automated tests
to stress those applications. We ran each application first as a normal process, then
with PrivExec, and compared the elapsed times under each configuration.
Note that designing custom test cases and benchmarks in this way requires careful
consideration of factors that might influence our runtime measurements. In particular,
a major challenge we faced was automating the testing of desktop applications with
graphical user interfaces. Although several GUI automation and testing frameworks
exist for Linux, most of them rely on recording and issuing X server events without
any understanding of the tested application’s state. As a result, the test developer
is often expected to insert fixed delays between each step of the test in order to give
the application enough time to respond to the issued events. For instance, consider a
test that involves opening a menu by clicking on it with the mouse, and then clicking
on a menu item. When performing this task automatically using a tool that issues
X events, the developer must insert a delay between the two automated click events.
After the first click on the menu, the second click must be delayed until the tested
application can open and display the menu on the screen. This technique works
well for simple automation tasks, but for runtime measurements, long delays can
             Firefox                                        Chromium
             Orig.        PrivExec                          Orig.        PrivExec
             Runtime (s)  Runtime (s)  Overhead             Runtime (s)  Runtime (s)  Overhead
Alexa        98.43        103.56       5.21 %               91.63        94.69        3.34 %
Wikipedia    37.80        39.96        5.71 %               39.25        40.12        2.22 %
CNN          66.61        69.15        3.81 %               49.21        50.83        3.29 %
Gmail        58.43        61.36        5.02 %               30.61        30.98        1.21 %

Table 2.2: Runtime performance overhead of PrivExec for two popular web browsers.
easily mask the incurred overhead and lead to inaccurate results. Taking this into
consideration, in our tests, we refrained from using any artificial delays, or employing
tools that operate in this way.
First, we tested PrivExec with two popular web browsers, Firefox and Chromium.
We designed four test cases that represent different browsing scenarios.
Alexa. In this test, we directed the browsers to visit the top 50 Alexa domains.
While some of these sites were relatively simple (e.g., www.google.com), others in-
cluded advertisement banners, embedded Flash, multimedia content, JavaScript, and
pop-ups (e.g., www.bbc.co.uk).
Wikipedia. In this test, we visited 50 Wikipedia articles. As is typical of
Wikipedia, these web pages mostly included text and images.
CNN. In this test, we navigated within the CNN web site by clicking on different
news categories and articles. We cycled 5 times through 10 CNN pages with many
embedded images, videos, and Flash content in order to exercise the browser’s cache.
Gmail. In this test, we navigated to and logged into Gmail, composed and sent
5 emails, and then logged out of the web site.
To execute these tests, we used Selenium WebDriver [12], a popular browser au-
tomation framework. Selenium commands browsers natively through browser-specific
drivers, and is able to detect when the page elements are fully loaded without requiring
the user to introduce fixed delays. We repeated each test 10 times, and calculated the
average runtime over all the runs. We present a summary of the results in Table 2.2.
Next, we tested 10 popular Linux applications, including media players, an email
client, an instant messenger, and an office suite. These applications and their corre-
sponding test cases are described below.
             Orig. Runtime (s)   PrivExec Runtime (s)   Overhead
Audacious    61.27               62.30                  1.68 %
Feh          51.86               52.52                  1.27 %
FFmpeg       105.47              111.31                 5.54 %
grep         245.37              253.82                 3.44 %
ImageMagick  96.16               101.41                 5.46 %
LibreOffice  99.64               100.62                 0.98 %
MPlayer      122.98              129.39                 5.21 %
Pidgin       116.49              117.87                 1.19 %
Thunderbird  75.45               78.78                  4.41 %
Wget         71.48               71.89                  0.57 %

Table 2.3: Runtime performance overhead of PrivExec for various desktop and console applications.
Audacious. We configured Audacious, a desktop audio player, to iterate through
a playlist of 2500 MP3 audio files totaling 15 GB, load each file, and immediately
skip to the next file without playing them.
Feh. Feh is a console-based image viewer. We configured Feh to load and cycle
through 1000 JPEG images totaling 1.5 GB.
FFmpeg. FFmpeg, a video and audio converter, was configured together with
libmp3lame to convert 25 AAC formatted audio files to the MP3 format.
grep. grep is the standard Linux command-line utility for searching files for
matching regular expressions. We used grep to search the entire root file system for
a fixed string, and dumped the matching lines into a text file. This process resulted in 16,186 matching lines and a 3 MB dump.
ImageMagick. ImageMagick is a software suite for creating, editing and viewing
various image formats. Using ImageMagick’s convert utility we converted 150 JPEG
images to PNG images.
LibreOffice. LibreOffice is a comprehensive office software suite. We used Libre-
Office to open 10 documents and print them to PostScript files.
MPlayer. We configured MPlayer, a console and desktop movie player, to iterate
through a playlist of 100 Matroska files totaling 30 GB containing videos in various
formats, load each file, and immediately skip to the next one without displaying the
content.
Pidgin. Pidgin is a multi-protocol instant-messaging client. Using Pidgin we sent
500 short text messages between two Gtalk accounts.
Thunderbird. Thunderbird is a desktop email client. We composed and sent 5
emails with 1 MB attachments in our test.
Wget. Wget is a console-based network downloader. We used Wget to download
10 small video clips from the Internet, each sized 10-25 MB.
To carry out these tests, we utilized the synchronous command line interfaces
provided by the applications themselves, and also used xdotool [13], an X automation
tool that can simulate mouse and keyboard events. We stress that we only used
xdotool for simple tasks such as bootstrapping some of the GUI applications for
testing, and never included any artificial delays. Similar to the previous experiments,
we repeated each test 10 times, and we present the average runtimes in Table 2.3. Note
that in the tests above, we had the option to supply inputs to the applications from
the secure storage containers or from the public file systems. For each application,
we tested both and have reported the worst case. Also note that PrivExec would
normally prevent us from writing to the secure container from outside the private
process. Therefore, we implemented a backdoor in PrivExec during the evaluation
phase in order to copy the test data to the secure container.
In our experiments, the overhead of private execution was under 6% in every
test case, and private applications took only 3.31% longer to complete their tasks on
average. These results suggest that PrivExec is efficient and that it does not detract
from the user experience when used with popular applications that deal with sensitive
data. Finally, these experiments support our claim in Section 2.5.2 that the Bonnie++
benchmark results do not necessarily indicate poor performance for common desktop
and console applications. On the contrary, PrivExec can demonstrably provide a
private execution environment for real applications without a significant performance
impact. Still, we must stress that if a user runs PrivExec with a primarily I/O
bound workload, lower performance should be expected as indicated by the Bonnie++
benchmarks.
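As a quick check, the 3.31% average follows directly from the per-test overheads reported in Tables 2.2 and 2.3:

```python
# Per-test overhead percentages from Table 2.2 (browsers) and Table 2.3 (apps).
browser_overheads = [5.21, 3.34, 5.71, 2.22, 3.81, 3.29, 5.02, 1.21]
app_overheads = [1.68, 1.27, 5.54, 3.44, 5.46, 0.98, 5.21, 1.19, 4.41, 0.57]
average = sum(browser_overheads + app_overheads) / 18
print(f"{average:.2f} %")  # 3.31 %
```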
2.6 Limitations
While our prototype aims to provide a complete implementation of private execution
for Linux, there are some important limitations to be aware of.
One limitation is that the current prototype does not attempt to address system
hibernation, which entails that the contents of physical memory are persisted to
disk. As a result, if a hibernation event occurs while private processes are executing,
sensitive information could be written to disk as plaintext, in violation of the system's
design goals. We note that this is not a fundamental limitation, as hibernation could
be handled in much the same manner as per-process encrypted swap. However, we
defer the implementation of private execution across hibernation events to a future
release.
By design, PrivExec relies upon memory isolation to protect both private process
memory as well as the corresponding PEK that resides in kernel memory. If malicious
code runs as a privileged user, such as root on UNIX-like systems, then that code
could potentially bypass PrivExec’s protection mechanisms. One example of this
would be for a malicious user to load a kernel module that directly reads out PEKs,
or simply introspects on a private process to access its memory directly. For this
reason, we explicitly consider privileged malicious users or code as outside the scope
of PrivExec’s threat model.
As previously discussed in Section 2.5, certain X clients do not interact well with
the current prototype implementation of stream-based UNIX domain socket and SysV
shared memory IPC privacy restrictions. In the former case, UNIX domain socket re-
strictions must be relaxed for X applications, while disabling the MIT-SHM extension
is sufficient to work around the second case. A related limitation is the possibility for
malicious code to extract sensitive data by capturing screenshots of private graphi-
cal elements through standard user interface facilities. However, we again note that
these are not fundamental limitations of the approach, and they can be addressed
with additional engineering effort.
2.7 Related Work
While other work has attempted to protect application privacy to varying degrees,
we believe that PrivExec strikes the right balance between security guarantees,
system integration effort, and performance with its operating system-level interface
for protecting generic binaries. In this section, we relate existing work to PrivExec,
including work on privacy attacks and defenses, file system and disk encryption, and
sensitive information leakage in various contexts.
Privacy as an Operating System Service
To the best of our knowledge, Ioannidis et al. [14] provide the first academic work
that proposes the idea of deploying privacy mechanisms in an application-independent
manner as an operating system service, but without providing a concrete system
design or implementation.
In a recent work, Lacuna [15] enables private execution for virtual machines,
which – like PrivExec’s private process groups – are used to confine the secrets of
sets of processes. By leveraging QEMU, modifications to the host operating system,
and hardware support, Lacuna not only ensures privacy for storage and swap space,
but also eliminates leaks into operating system drivers via its ephemeral channel
abstraction. PrivExec provides a subset of the security guarantees of Lacuna, but
at a lower engineering cost and with greater usability.
Similar to PrivExec, Kywe et al. [16] implement a general private execution
mode on Android by leveraging the platform’s existing sandboxing capabilities.
Djoko et al. [17] extend PrivExec to use a memory-backed secure storage con-
tainer, and report improved performance over our original implementation.
Privacy Leaks in Web Browsers
Privacy attacks and defenses have been studied extensively specifically in the context
of web browsers. For example, Felten and Schneider [18] introduce the first privacy
attacks exploiting DNS and browser cache timing. In other works, Clover et al. [19]
demonstrate a technique for stealing browsing history using CSS visited styles, and
Janc and Olejnik [20] show the real-world impact of this attack. On the defense side,
solutions have been proposed for preventing sniffing attacks and session tracking
(e.g., [21, 22, 23, 24]). These works are largely orthogonal to ours in that they target
information leaks on the web, while PrivExec addresses the problem of privacy leaks
in persistent storage.
Aggarwal et al. [3] and Said et al. [25] analyze the private browsing modes of
various browsers, and reveal weaknesses that would allow a local attacker to recover
sensitive data saved on the disk. The former study also shows that poorly designed
browser plug-ins and extensions could undermine well-intended privacy protection
measures. These studies underline the value of PrivExec as our approach aims to
mitigate the attacks described in these papers.
Xu et al. [26] present similar findings, and propose a universal private browsing
framework for web browsers, which utilizes a temporary sandbox file system to con-
tain, and later discard, data produced during a private browsing session. In contrast,
PrivExec is designed as a generic solution that is not only limited to protecting web
browsers. In other words, our approach can be used to run any arbitrary application
in private sessions, including browsers that already have private browsing modes and
that have been shown to be vulnerable.
Privacy Leaks in Volatile Memory
Studies have demonstrated that it is possible to recover sensitive data, such as disk
encryption keys, from volatile memory [27], and many others have proposed solutions
to address this problem. While PrivExec stores PEKs in memory, we are careful to
wipe them after the associated process has ended. Anti-cold boot measures can also
be deployed to complement PrivExec if so desired by users.
Secure hardware architectures such as XOM [28] and AEGIS [29] extensively
study memory encryption techniques to prevent information leakage, and support
tamper-resistant software and processing. Alternatively, Cryptkeeper [30] proposes
a software-encrypted virtual memory manager that works on commodity hardware
by partitioning the memory into a small plaintext working set and a large encrypted
area.
Likewise, secure deallocation [31] aims to reduce the lifetime of sensitive data in
the memory by zeroing memory promptly after deallocation. Provos [32] proposes
encrypting swapped-out memory pages in order to prevent data leaks from memory
to disk.
In contrast, PrivExec is designed as an operating system service that guarantees
storage writes to the file system or to swap cannot be recovered during or after a pri-
vate execution session. As such, encrypted memory is complementary to PrivExec’s
private processes. Furthermore, PrivExec works on commodity hardware and does
not necessitate architectural changes to existing systems.
Disk and File System Encryption
Many encrypted file systems (e.g., CFS [33], Cryptfs [34], eCryptfs [8], EncFS [35]),
and full disk encryption technologies (e.g., dm-crypt [10], BitLocker [36]) have been
proposed to protect the confidentiality of data stored on disk. In a recent study,
CleanOS [37] extends this idea to a new Android-based operating system that protects
the data on mobile devices against device loss or theft by encrypting local flash and
storing keys in the cloud. Borders et al. [38] propose a system that takes a system
checkpoint, stores confidential information in encrypted file containers called storage
capsules, and finally restores the previous state to discard all operations that the
sensitive data was exposed to.
Although many of these solutions provide confidentiality while the encrypted
drives or partitions are locked, once they are unlocked, sensitive data may become
exposed to privacy attacks. Moreover, encryption keys can be retrieved by exploiting
insecure key storage, or through malware infections. Approaches that may be resilient
to such attacks (e.g., storage capsules) remain open to key retrieval via coercion (e.g.,
through a subpoena issued by a court). In contrast, PrivExec destroys encryption
keys promptly after a process terminates, guaranteeing that recovery of sensitive data
on the disk is computationally infeasible. Furthermore, it can be applied selectively
to specific processes on demand, as opposed to encrypting an entire device or par-
tition. Finally, PrivExec is a flexible solution that can work with any file system
supported by the kernel.
Secure File Deletion
The idea of securely deleting files using ephemeral encryption keys was introduced by
Boneh and Lipton [39], and was later used in various other systems (e.g., [40, 41, 42]).
We borrow this idea and apply it to a new context.
Other more general secure wiping solutions, including user space tools such as
shred [43] and kernel approaches [44, 45] provide only on-demand secure removal of
files. In contrast, PrivExec provides operating system support for automatically
rendering all files created and modified by a private process irrecoverable, and does
not require users to manually identify files that contain sensitive data for deletion.
Ritzdorf et al. [46] describe a technique to automatically identify related content
upon file deletion. While this work does not consider secure deletion per se, in
principle, the proposed system can be combined with other secure deletion techniques
to automatically remove all traces of a private execution session.
We present an in-depth discussion of other secure deletion techniques in Chapter 3.
Application-Level Isolation
Various mechanisms have been proposed to sandbox applications and undo the effects
of their execution. For example, Alcatraz [47] and Solitude [48] provide secure exe-
cution environments that sandbox applications while allowing them to observe their
hosts using copy-on-write file systems. Li et al. [49] propose a two-way sandbox for
x86 native code that protects applications and the operating system from each other.
Other works utilize techniques such as system transactions, monitoring, and logging
to roll back the host to a previous state (e.g., [50, 51]). Unlike PrivExec, these
systems are primarily concerned with executing untrusted applications and recovery
after a compromise; they do not provide privacy guarantees.
2.8 Summary
Preventing sensitive data handled by applications from being exposed in persistent
storage is a common privacy goal. To achieve this, web browsers often support private
browsing modes that discard users’ traces after a browsing session ends. However,
as evidenced by the security flaws found in many popular browsers, implementing
private execution features in an application-specific manner is bug prone and can be
costly.
In this chapter, we presented PrivExec, an operating system service for private
execution of arbitrary applications. PrivExec leverages the short-lived nature of
the private execution model to associate protected, ephemeral private execution keys
with processes that can be securely wiped after use so that they cannot be recovered
by a user or adversary.
The proposed design and implementation satisfies all of the research goals we laid
out in Section 1.5. (G1) We provided an abstract design of PrivExec independent of
the underlying operating system, and demonstrated that it can be applied to Linux-
like systems with lightweight modifications to existing operating system structures,
and by reusing already deployed technologies such as eCryptfs and Overlayfs. (G2)
PrivExec works with any application, and provides strong, general guarantees of
private execution. (G3) It does not require explicit application support, recompila-
tion, or any other preconditions. (G4) Finally, our evaluation shows that PrivExec
is applicable to a wide variety of popular applications, and that it incurs a minimal
performance overhead in practice, when running real-world applications.
Chapter 3
Eraser: Secure Deletion on
Blackbox Hardware
3.1 Overview
Secure deletion of data from non-volatile storage is a well-recognized and heavily
studied problem. To date, researchers and developers have proposed a plethora of
techniques for securely erasing data from physical media, often employing methods
such as overwriting files containing sensitive data in-place, encrypting data with tem-
porary keys that are later discarded, or hardware features that scrub storage blocks.
Despite these extensive efforts, advances in storage technologies and character-
istics of modern hardware still pose significant difficulties to achieving irreversible
data deletion in prevailing computing environments. For instance, Solid State Drives
(SSDs) often utilize hardware controllers inaccessible to the outside world. These con-
trollers can redirect I/O operations performed on logical device blocks to arbitrary
memory cells in order to implement wear leveling and minimize the effects of write
amplification. Similarly, journaling file systems may keep traces of I/O operations
that include sensitive data in their logs. As a result, many secure deletion methods
that base their security on behavioral assumptions regarding older file systems or me-
chanical disk drives are rendered ineffective because tracking and removing sensitive
data in these settings is often infeasible, or sometimes impossible.
In the face of these emerging challenges, recent research has adapted secure dele-
tion technologies to new applications. For example, Reardon et al. [42] present an
encrypting file system that guarantees secure erasure on raw flash memory used in
smartphones. However, secure deletion remains a challenge on blackbox devices such
as the aforementioned SSDs, which only allow access to their storage through opaque
hardware controllers that translate I/O blocks in an unpredictable manner.
In this chapter, we present a technique that provides secure deletion guarantees at
file granularity, independent of the characteristics of the underlying storage medium.
Our approach is based on the general observations made in previous work that secure
deletion cannot be guaranteed on a blackbox storage medium with unknown behavior.
Therefore, we instead bootstrap secure deletion using a minimal master key vault
under the user’s control, such as a Trusted Platform Module chip or a smartcard.
Our approach is an evolution of the first cryptographic erasure technique proposed
by Boneh and Lipton [39]. At an abstract level, we encrypt every file on an insecure
medium with a unique key, which can later be discarded to cryptographically render
a file’s data irrecoverable. Note that while these keys would need to be persisted to
keep the files accessible in the future, they cannot be stored on the same medium
together with the files since that would in turn prevent us from securely deleting the
keys.
To address this problem, we compress the keys into a single master key that is
never persisted to insecure storage, but instead is evicted to the master key vault.
To this end, we utilize a key store organized as an n-ary tree (i.e., a tree where
each node has up to n children), where every node represents a unique encryption
key. We term this key store a file key tree (FKT). Keys corresponding to leaf nodes
each encrypt a single file stored on the blackbox medium, and in turn parent nodes
encrypt their children nodes. This tree hierarchy compresses the master secret to a
single encryption key, the root node, which is never persisted to the blackbox storage
but is instead easily evicted to the master key vault. In contrast, the rest of the tree
nodes (i.e., encrypted keys) are stored together with the files on the insecure device.
In this model, securely deleting a file from an FKT of capacity |F| involves decrypting n · log_n |F| nodes, regenerating log_n |F| keys, and re-encrypting the n · log_n |F|
nodes with the new keys. During this process, the master key is also securely wiped
from the vault and replaced with a fresh one. In this way, the previous path leading
to the deleted file will be rendered irrecoverable.
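The deletion cost described above can be sketched as follows. This is a toy calculation for illustration, not Eraser's implementation; the function name is ours:

```python
def fkt_delete_cost(n, capacity):
    """Work to securely delete one file from an n-ary file key tree (FKT)
    of capacity |F| = `capacity`: the log_n |F| keys on the leaf-to-root
    path are regenerated, and each affected node's n children are
    re-encrypted, touching n * log_n |F| nodes in total."""
    depth = 0
    leaves = 1
    while leaves < capacity:        # integer ceil(log_n(capacity))
        leaves *= n
        depth += 1
    return {"regenerated_keys": depth, "reencrypted_nodes": n * depth}

# A 64-ary tree holding 64^3 files re-keys only 3 keys and 192 nodes per delete.
print(fkt_delete_cost(64, 64 ** 3))
```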
We implemented this technique in an unconventional prototype, a file-aware stack-
able block device, which can be deployed as a stand-alone Linux kernel module that
does not require any modification to the operating system architecture. As the name
implies, our implementation exposes a virtual block device on top of an existing phys-
ical device installed on the computer. Users can format this drive with any file system
and interact with it as they would normally do with a physical disk. Our block level
implementation is able to capture higher-level file system information to identify file
blocks while providing I/O performance significantly better than a file system-level
solution.
3.2 Background & Related Work
Secure deletion of data from physical storage is a well-studied and complicated prob-
lem. Regardless, it remains unsolved in the general case. In the following, we briefly
outline related work on various forms of secure deletion, highlight their shortcomings,
and motivate our approach.
3.2.1 Related Work
Secure deletion approaches have been investigated at several different layers of ab-
straction and using a variety of techniques. We refer readers to a comprehensive
classification of prior approaches [52], while in the following we summarize relevant
related work.
Hardware Techniques
The lowest point at which secure deletion can be performed is at the physical layer.
In the most direct interpretation, secure deletion can be performed through physical
destruction of the storage medium. Scenarios where these methods apply are out of
scope for this paper.
Secure deletion can also be performed at the hardware controller. For magnetic
media, SCSI and ATA controllers provide a Secure Erase command that overwrites
every physical block. Some solid-state drives also provide such a command. However,
this is a coarse-grained approach to secure deletion that is difficult to improve upon
since, without knowledge of the file system, controllers cannot easily distinguish data
to be preserved from data to be deleted. Furthermore, prior work has shown that
hardware-level secure deletion is not always implemented correctly [53].
File System-based Solutions
The next layer of abstraction above the physical controller is at the file system.
Here, secure deletion approaches can take advantage of file system semantics, but are
potentially restricted by the device driver interface.
One class of techniques is aimed at devices for which the operating system can reli-
ably perform in-place updates (e.g., magnetic hard drives). Many specific techniques
have been proposed, including queuing freed blocks for explicit overwrite [44, 45, 54]
as well as intercepting unlink and truncation events for user space scrubbing [45].
Another class of techniques is intended for devices such as raw flash memory, where
there is asymmetry between the minimum sizes of read or write and erase operations.
One notable example is DNEFS [42], which modifies the file system to encrypt each
data block with a unique key and co-locates keys in a dedicated storage area. Secure
deletion is implemented by erasing the current key storage area and replacing it with
a new version. During this replacement, keys corresponding to deleted data are not
included in the new version.
However, a fundamental underlying assumption of these approaches, that the OS
has the ability to directly read or write physical blocks as in the case of magnetic hard
drives or raw flash memory, is not valid for modern storage devices such as SSDs as
we describe below.
User-level Tools
User space is the highest layer of abstraction from which secure deletion can be at-
tempted. These approaches are restricted to the file system API exposed by the op-
erating system to accomplish their task (e.g., the POSIX API for a POSIX-compliant
system). One example of such an approach is Secure Erase [55], an application that
simply invokes the Secure Erase command on a storage controller. However, as dis-
cussed above, this is not a reliable secure deletion mechanism.
User-level tools can also attempt to explicitly overwrite data to be securely deleted [56],
a popular approach first proposed by Gutmann [57]. However, these approaches as-
sume that overwriting a block using the interface provided by the operating system
guarantees that all copies of that data on the underlying physical medium are also overwritten.
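A minimal user-level overwrite pass can be sketched as follows (the function name is ours). Note that it relies on exactly the in-place-update assumption discussed above, which opaque SSD controllers violate:

```python
import os

def overwrite_and_unlink(path, passes=3):
    """Shred-style deletion: overwrite the file's logical blocks with random
    data before unlinking it. Only effective if the OS and device truly
    perform in-place updates on the physical medium."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            f.write(os.urandom(size))
            f.flush()
            os.fsync(f.fileno())    # force each pass down to the device
    os.unlink(path)
```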
A third user space secure deletion approach is to fill the free space of a file sys-
tem [58, 59]. The motivation for this approach is to proactively overwrite remnants
of potentially sensitive data on storage left in the free block pool. However, this
approach is also limited by whether the operating system actually provides the capability
to overwrite all free blocks on storage, and whether all physical blocks are exposed
to user space. We discuss below that this may not always be the case
with modern SSDs.
Cryptographic Erasure
Along a different axis than abstraction layer, there are also techniques that make use
of cryptographic erasure as a fundamental primitive. Put simply, these techniques
reduce secure deletion of data to secure deletion of a key encrypting that data. Under
computational hardness assumptions, encrypted data without the corresponding key
is infeasible for an attacker to recover. Prominent examples of this include Boneh’s se-
cure deletion approach for offline data such as tape archives [39], Lee’s secure deletion
approach for YAFFS [60], DNEFS [42], and TrueErase [61, 62]. While these works
present various secure deletion techniques for certain solid state storage types, they
are not compatible with flash translation layers implemented in opaque, hardware
controllers, excluding them from use on typical SSDs.
Another approach to cryptographic erasure is proposed by Tang et al. [37]. In
CleanOS, sensitive data on mobile devices is encrypted and the corresponding key is
evicted to the cloud. The fundamental assumption underlying this work is that the
cloud is more trustworthy than the user’s device, which is not always the case.
Yet another example of cryptographic erasure is proposed by Swanson et al. [63],
this time at the controller level. Here, a cryptographic key is used to encrypt all data
stored on the physical device, and this key is stored within a dedicated memory also
located on the device. Secure deletion is performed by replacing this key, resulting in
a coarse-grained secure deletion of all data on storage.
Reardon et al. [64] also present a graph theoretic approach to analyzing and prov-
ing the security of any tree-like approach to secure deletion involving encryption and
key wrapping. They provide an implementation of an instance of this class of ap-
proaches as a B-tree that can provide file-level deletion granularity, and exhibits the
potential for good performance when combined with a suitable caching policy. This
work is closely related to ours, and therefore, we defer a direct comparison between
them to the discussion in Section 3.7.
3.2.2 Flash Translation Layers
Raw flash memory is a common storage technology due to its low power consump-
tion, density, and efficient random-access characteristics. In a significant departure
from classical storage technologies such as magnetic hard disks, flash memory pos-
sesses an asymmetry between the sizes of read and write operations versus the size
of erasure operations. In particular, data is read and written at page granularity
(e.g., 4 KB blocks), but is erased at an erase block granularity (e.g., 256 KB chunks).
Furthermore, flash memory cannot be written to unless the page, and its enclosing
erase block, has first been erased. Since this operation incurs significant wear, wear
leveling is performed wherein erasure operations are evenly distributed across flash
erase blocks in order to maximize the device’s service lifetime. This leads to the
phenomenon of write amplification, where one logical I/O operation leads to multiple
physical I/O operations.
For raw flash devices intended to be directly exposed to an operating system,
wear leveling is expected to be performed by the device driver. However, devices such
as solid-state drives (SSDs) do not expose this low-level interface. Instead, a flash
translation layer (FTL) is interposed to provide a traditional sector-based interface to
the operating system much as a magnetic hard disk would provide. For an SSD, the
FTL is implemented within the hardware controller, and in such cases the operating
system does not have direct access to physical flash pages, erase blocks, or visibility
into the wear leveling process. In fact, in order to accommodate expected wear,
account for failed erase blocks, and improve performance, modern SSDs are typically
over-provisioned by 25%.
Since FTLs obscure physical flash erase blocks and wear leveling leads to write
amplification that results in significant amounts of duplicated data, existing secure
deletion techniques are incompatible with such devices.
3.2.3 Motivation
To summarize, while prior secure deletion approaches work under certain circum-
stances, they do not address common cases where the operating system cannot guar-
antee that physical blocks are not duplicated on storage, or that logical blocks map
directly to physical blocks, as in the case of FTL-based devices such as SSDs. Those
approaches that remain, such as whole-device secure erase commands or cryptographic
erasure [63], only operate at the coarsest granularity possible.
Our work aims to fill this important gap for arbitrary storage devices by satisfying
the following design goals:
• Secure deletion must not rely on the assumption that blocks are not duplicated
by the storage device without the operating system’s knowledge.
• Secure deletion must not rely on the assumption that logical block addresses
map one-to-one to physical block addresses.
• Secure deletion must operate at a useful level of granularity – in our case, at
the file level.
3.3 Threat Model
The threat model we consider in this work is essentially a notion of forensic security.
That is, while the system computes over sensitive data, an adversary is not present
on the system and cannot examine or tamper with this data. We assume that an
adversary can later gain a high level of access to the system, including physical access,
and attempt to forensically recover deleted files that previously contained privacy-
sensitive data. The secure deletion approach we describe in this chapter guarantees
that attackers cannot recover data that has been deleted during prior computation.
We assume a trusted computing base (TCB) composed of a subset of the system’s
software that includes the kernel and a small set of high-privilege user space utilities.
The TCB also includes a subset of the underlying firmware and hardware, in particular
a secure storage area, such as a Trusted Platform Module (TPM) chip or smartcard,
described later in this chapter. However, storage controllers are considered to be
untrusted, and no assumptions are made as to the kind of physical medium used in
the system (e.g., magnetic hard disk, SSD, tape, optical drive).
3.4 Design
Before describing Eraser, we first present a naïve approach to secure file deletion,
and discuss its drawbacks to motivate the actual design of Eraser. We then analyze
the theoretical storage and time bounds for Eraser.
3.4.1 Naïve Approach
A straightforward approach to secure file deletion using cryptographic erasure is to
simply generate a unique encryption key for each file. Any data written to storage
would be encrypted with its associated file key, and decrypted when read from storage.
Securely deleting a file is then reduced to securely deleting the corresponding file
key (i.e., under computational hardness assumptions, it should be infeasible for an
attacker to recover the file without the key).
This approach, however, has an important flaw: file keys must also be persisted
to storage across system reboots or failures, and as a result, there would be no way
to ensure that file keys themselves are securely deleted. To address this recursive
problem, we encrypt the file keys with a master key and rely upon a trusted element
to serve as secure storage for this master key. We term this master key secure storage
the master key vault, which must satisfy the following properties:
• The vault must be large enough to store a master key.
• The vault must allow the system to perform encryption and decryption opera-
tions using the stored master key.
• The vault must allow the system to update the stored master key with a new
key.
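The three properties above amount to a small interface. The following Python sketch models it purely for illustration: a plain attribute and a hash-based toy stream cipher stand in for the tamper-resistant hardware (e.g., TPM NVRAM or a smartcard) and the real cipher a deployed vault would use.

```python
import hashlib
import os

class MasterKeyVault:
    """Illustrative model of the master key vault. A real vault would be
    tamper-resistant hardware; an in-memory attribute and a hash-based
    toy stream cipher stand in for secure storage and AES here."""

    KEY_LEN = 32  # a 256-bit master key

    def __init__(self):
        self._master = os.urandom(self.KEY_LEN)  # property 1: holds one key

    def _keystream(self, nonce: bytes, length: int) -> bytes:
        out, counter = b"", 0
        while len(out) < length:
            out += hashlib.sha256(
                self._master + nonce + counter.to_bytes(8, "big")).digest()
            counter += 1
        return out[:length]

    def encrypt(self, nonce: bytes, data: bytes) -> bytes:
        # Property 2: the system can encrypt and decrypt with the stored
        # key without the key itself ever leaving the vault.
        ks = self._keystream(nonce, len(data))
        return bytes(a ^ b for a, b in zip(data, ks))

    decrypt = encrypt  # XOR stream cipher: the two operations coincide

    def rotate(self) -> None:
        # Property 3: overwrite the stored key with a fresh one; the old
        # key becomes unrecoverable, cryptographically erasing anything
        # that only it could decrypt.
        self._master = os.urandom(self.KEY_LEN)
```

After `rotate()`, ciphertexts produced under the old master key can no longer be decrypted through the vault, which is exactly the erasure primitive the design relies on.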
Unfortunately, this leads to a second problem: the simple two-level hierarchy
described above implies that deleting a single file requires re-encrypting all file keys.
To understand why this is the case, consider that on modern storage devices, block
data might be persisted to multiple physical locations due to phenomena such as
flash wear leveling, and that such processes are completely outside the control of an
operating system kernel. Therefore, in order to ensure that file data is irrecoverable,
the master key must itself be rotated, and the old key securely deleted from the vault
such that there is no computationally feasible way for an attacker to decrypt block
data recovered from physical storage. Since the master key must be rotated, all file
keys must be re-encrypted before being persisted to disk, leading to a phenomenon we
term encryption amplification. This is an expensive operation that should be avoided
for any practical system.
3.4.2 File Key Trees
To address the problems identified in the naïve approach above, Eraser’s design
incorporates two key elements: (i) a master key vault, and (ii) a file key tree (FKT).
The master key vault has the properties described above, which allows for master
keys to be rotated with secure deletion of the old key. The FKT, on the other hand,
avoids the problem of encryption amplification by bounding the number of keys that
must be re-encrypted each time the master key is rotated.
An FKT is an n-ary tree (i.e., a tree where each node has up to n children) of
height m. At the root is the master key, which is stored in the master key vault, is
never released from the system TCB, and is never persisted to other storage in any
Figure 3.1: Structure of an n-ary FKT with m = 2. The root node is represented by
a master key M stored in a secure master key vault. Each internal node contains a
key encrypted by the parent key. Leaf nodes correspond to file encryption keys.
form. Internal nodes of the tree correspond to randomly-generated encryption keys.
Each node key encrypts the keys of its children. Leaves of the tree correspond to file
encryption keys. An example of an n-ary FKT with m = 2 is shown in Figure 3.1.
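The construction can be sketched in Python as follows. Here `toy_wrap` is an illustrative XOR-based stand-in for real key wrapping (e.g., AES); it is used only to show the tree’s shape and the parent-encrypts-child relationship.

```python
import hashlib
import os

def toy_wrap(parent_key: bytes, child_key: bytes) -> bytes:
    # Illustrative XOR-based key wrapping: stands in for encrypting the
    # child key under the parent key with a real cipher such as AES.
    stream = hashlib.sha256(parent_key).digest()
    return bytes(a ^ b for a, b in zip(child_key, stream))

toy_unwrap = toy_wrap  # XOR wrapping is its own inverse

def build_fkt(n: int, m: int):
    """Build an n-ary FKT of height m. Returns the master key, the
    plaintext keys per level, and the wrapped (persisted) form in which
    each node's key is encrypted under its parent's key."""
    master = os.urandom(32)
    plain, wrapped = [], []
    parents = [master]
    for _ in range(m):
        level = [os.urandom(32) for _ in range(len(parents) * n)]
        enc = [toy_wrap(parents[i // n], key) for i, key in enumerate(level)]
        plain.append(level)
        wrapped.append(enc)
        parents = level
    return master, plain, wrapped
```

Only the `wrapped` levels would ever be persisted to disk; the plaintext keys exist transiently in memory, and the master key only inside the vault.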
FKT Space Complexity
To represent |F | files, an FKT with at least |F | leaves must be created. Therefore,
the size of an FKT is bounded by
O(n^⌊log_n |F|⌋ + |F|).
This is simply the number of internal nodes required to represent |F | leaves in an
n-ary tree plus the leaves themselves. In practice, the root key will be evicted to the
dedicated master key vault, while the remaining levels of the FKT will be persisted
to disk.
Figure 3.2: Secure deletion using an FKT with n = 2, |F| = 4. Step 1: The initial
state of the tree contains encryption keys for four files. Step 2: When the user decides
to delete file 3, a traversal of the FKT from the corresponding leaf node for file 3 to M
is performed. Starting below the master node, each node’s key is decrypted using the
parent’s key. Additionally, all other direct children of the current node are decrypted.
Decrypted nodes are shown here in bold. Step 3: Keys along the direct path from
file 3’s leaf node to the master key node are randomly regenerated. These nodes have
a dotted outline. The old master key M is securely deleted from the vault, and a new
master key M′ is stored. Step 4: Keys at direct children of nodes on the path from
Step 3 are re-encrypted to obtain the new FKT, which is persisted to disk. Nodes
from the pruned branch as it existed at Step 1 might remain on insecure storage, but
since M has been erased it is computationally infeasible for an attacker to decrypt
data along that path.
FKT Operations and Time Complexity
Accessing a file encrypted using an FKT involves collecting a chain of encryption keys
from the corresponding FKT leaf node to the master key and performing a series of
decryption operations to recover the file encryption key. Therefore, the number of
decryption operations to obtain access to a file is bounded by
O(⌈log_n |F|⌉).
Deleting a file, similarly to file access, first requires collecting a chain of encryption
keys from the corresponding FKT leaf node to the master key. However, the next step
of this process is to: (i) randomly generate new encryption keys for each node along
the path to, and including, the master key node; and, (ii) re-encrypt the existing keys
at direct children (i.e., non-recursively) for each node along the previously identified
path in the FKT. Therefore, this operation’s time complexity is bounded by
O(n⌈log_n |F|⌉).
This process is illustrated with a concrete example in Figure 3.2 for n = 2, |F| = 4.
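The re-keying walk on deletion can be sketched as follows, tracking only which keys are regenerated and how many re-encryptions occur; the cryptographic operations themselves are elided for illustration.

```python
import os

def delete_file(levels, n, leaf_index):
    """Illustrative FKT re-keying on file deletion. `levels[d]` holds
    the plaintext keys at depth d+1 of the tree; leaves (file keys) are
    levels[-1]. Returns the fresh master key and the number of child
    keys that must be re-encrypted under regenerated parents."""
    # Walk from the leaf up to the root, regenerating each key on the
    # path; the old keys (and the old master key in the vault) are
    # discarded, rendering the deleted file's ciphertext unrecoverable.
    idx = leaf_index
    for depth in reversed(range(len(levels))):
        levels[depth][idx] = os.urandom(32)
        idx //= n
    new_master = os.urandom(32)
    # Every direct child of a regenerated node (the master's n children
    # plus n children per internal path node) is re-wrapped, giving the
    # O(n * ceil(log_n |F|)) bound from the text.
    rewrapped = n * len(levels)
    return new_master, rewrapped
```

For the Figure 3.2 configuration (n = 2, |F| = 4), exactly four keys are re-encrypted per deletion, matching the stated bound.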
3.5 Implementation
We implemented the general secure deletion approach presented in the previous sec-
tion in a prototype tool called Eraser, which operates at the block I/O layer of
Linux but provides secure deletion guarantees at file granularity. Eraser does not
require modifications to the rest of the Linux architecture and can be deployed as a
stand-alone kernel module. Our prototype utilizes commodity TPM chips that are
present on many modern motherboards and can easily be extended from user space
to support other types of secure external storage (e.g., smartcards).
3.5.1 Alternative Solutions & Our Philosophy
As we discussed in Section 3.2, there is a myriad of tools and techniques that imple-
ment secure deletion capabilities at various layers of a computer system. To assess
the advantages and drawbacks of each of these options, we first discuss various imple-
mentation alternatives to realize our approach, and explain our decision to choose the
block I/O layer for our prototype. We refer readers to Section 2.4.2 and Figure 2.2
for an overview of the various Linux I/O subsystems.
Unsatisfactory Solutions
While it is relatively easy to develop user space solutions instead of trying to un-
derstand and modify operating system internals, the I/O-related system calls offer
minimal control over how data blocks are processed and stored at lower levels, lim-
iting the effectiveness of such solutions for our purposes. Likewise, in this work, we
refrain from directly modifying concrete file system implementations or specific de-
vice drivers. While implementing our approach at those layers is possible, choosing
any specific instance to adapt to our needs would limit the usefulness of our system.
Conversely, modifying and maintaining every single file system or driver available in
Linux would be a high-effort and bug-prone affair.
In spite of the issues mentioned above, the file system layer is still the most natural
place to enforce secure deletion of files. By definition, the file system is already aware
of all the data blocks corresponding to any given file and also has full control over
file metadata, all of which significantly eases development burden. One solution that
alleviates the issues tied to working with a specific file system, while also leveraging
the advantages of the file system layer, is utilizing a stackable file system. These
special file systems reside on top of another underlying file system and transparently
interpose on the passing I/O requests, presenting a viable option to implement secure
deletion. For instance, eCryptfs [8] is a stackable encrypting file system distributed
with Linux, and could easily be adapted to our approach.
Unfortunately, stackable file systems often come with a significant performance
overhead. In fact, during our evaluation of PrivExec in Section 2.5 we already wit-
nessed the performance issues with eCryptfs, and benchmarks also show that eCryptfs
performs considerably worse than block-layer encryption [65]. Since one of our im-
plementation goals is to build a performant system that could be used on everyday
computers, we chose to employ a different strategy for our prototype.
Another seemingly viable alternative is to implement our system as part of the
Linux page cache. However, examining the kernel internals reveals that some of
the critical page cache functions that manipulate file blocks (e.g., readpage and
writepage) are actually required to be provided by file systems. Furthermore, Linux
gives applications the ability to perform direct I/O operations that bypass the page
cache. As a result, we conclude that the page cache is not a suitable layer to implement
secure deletion.
Our Solution
In light of the above considerations, we decided to implement our approach at the
block device level in a stackable block device driver. Similar to how stackable file
systems operate, stackable drivers intercept block I/O requests before they reach the
underlying drivers and allow us to manipulate them as necessary. The main advantage
of a block-layer approach is its performance (e.g., compare dm-crypt’s performance
to eCryptfs [65]). However, at first glance, it is not clear how file-level information
could be gathered at the block layer or, in other words, how physical sectors on a
device could be matched to logical file blocks.
Our prototype Eraser closes this semantic gap between the file system and block
device layers by leveraging the Linux kernel’s property that, regardless of the file
system implementation, every file system object is represented by a common data
structure provided by the VFS: the inode. In this way, we can avail ourselves of the
performance benefits of operating on low-level device blocks, while still retaining a
high-level understanding of the file system. At the same time, Eraser works under
any Linux-native file system and is compatible with any physical block device.
3.5.2 Prototype Overview
We implemented Eraser using device-mapper [66] as a stackable block device driver,
also referred to as a device-mapper target in Linux parlance. Device-mapper is a
standard Linux kernel framework that allows users to create stackable drivers, and is
used in technologies such as dm-crypt, LVM, software RAID, and Docker. It maps
existing physical block devices onto new ones and exposes these virtual devices to
user space via new device nodes, often found under /dev/mapper/*. Users can then
interact with these device nodes in the usual way, formatting them with a file system
of their choice and storing data in them.
A high-level view of the system is illustrated in Figure 3.3. Eraser organizes file
encryption keys in an FKT as described and stores them in a reserved section of the
underlying storage device. The master key, however, is never persisted to this device.
Instead, it is confined to an external secure store. Specifically, in our implementation,
we store it in the NVRAM area of a TPM chip installed on the machine. Eraser
Figure 3.3: An overview of Eraser’s design. Our prototype implementation utilizes
a TPM chip as its external secure store to preserve the master encryption key.
then intercepts all block I/O operations in flight, identifies which files those I/O blocks
belong to, and retrieves the appropriate keys to encrypt or decrypt the file contents on
the fly. When a file is deleted, its associated key is discarded as described previously.
Finally, the newly generated keys are written to the key store and a fresh master key
is synced to the TPM chip, overwriting the obsolete key. We will now discuss these
components in more detail.
3.5.3 I/O Manipulation
The kernel represents and tracks in-flight block I/O operations with a data structure
called a bio. Through the device-mapper framework, each bio destined for an un-
derlying physical device is first handed to Eraser where we can freely manipulate
them before passing them on to the next device driver in the stack.
Identifying Files
The first task Eraser needs to be able to perform is detecting whether a bio corre-
sponds to a file system operation. Thanks to the way Linux handles pages of a file
during I/O and the VFS layer which necessitates that every file system object have
a corresponding inode object associated with it, this task is possible without explic-
itly modifying the upper kernel layers or attempting to propagate this information
downwards.
Since there is a one-to-one mapping between inodes and files, our implementation
uses inode numbers to uniquely identify files and find their corresponding encryption
keys. Whenever Eraser receives a bio, it first iterates over all of the memory pages
(i.e., data buffers in volatile memory) it points to. Linux provides another related
object per file, called an address space, that describes the mapping between physical
blocks on a disk, pages in memory, and the inode owning these. By walking through
this structure Eraser is able to match every page in a bio to a specific inode,
check whether the inode at hand corresponds to a file, and subsequently identify the
encryption keys to be used based on the inode number. Otherwise, if the pages are
found to have no corresponding inode or an inode that represents a file system object
other than a file, that bio is simply remapped and sent to the actual underlying
device without further processing.
Writing Files
Once a bio corresponding to a file write operation is identified, Eraser needs to
retrieve the appropriate key, encrypt the contents, and perform the write to the
underlying device. However, simply iterating over the memory pages pointed to by
a bio and encrypting them in-place is not the correct approach. This is because
the same memory pages representing the write buffers are often also present in the
page cache. Thus, directly encrypting them would result in ciphertext being served
to user space from the cache with future I/O requests, without our driver having an
opportunity to decrypt them. Even if we could attempt to intercept cache hits, it
would be sub-optimal to decrypt the same contents with every individual read.
To address this issue, Eraser makes a clone of the original bio and all of its
pages, and instead encrypts the copied pages. Next, the cloned bio is asynchronously
submitted to the underlying device, while the original I/O request is being stalled.
Once Eraser receives notification of a completed disk write through a callback, it
marks the original bio as completed as well, which automatically signals the upper
layers of a successful disk write. In this way, the cached data remains untouched and
subsequent file reads that result in cache hits do not require repeatedly decrypting
the same data buffers.
Reading Files
Handling of bio objects that represent file reads is similar, with a single difference.
When Eraser first intercepts the bio, the pages it points to are empty, ready to be
filled with data read from the physical disk. Therefore, Eraser first needs to initiate
the actual disk read, and decrypt the data only once the operation is complete.
This is achieved by, once again, cloning the original bio, and submitting the
clone for I/O to the underlying device while the original operation is being stalled.
However, this time, it is not necessary to allocate separate memory pages for the clone;
instead, the clone points to the original memory pages. Once we receive notification
of a completed read operation, Eraser retrieves the appropriate key, decrypts the
contents in place, and finally signals completion of the original bio.
Cryptographic Operations
Eraser uses AES-256 in CBC mode to encrypt file blocks. Every file is also given a
unique IV stored together with the keys in the FKT. Since encryption is performed
on a page-by-page basis and pages of a file could be read or written in any arbitrary
order, a page IV is derived from the file’s unique IV and the file offset of the processed
page. Finally, all random data used by Eraser to regenerate encryption keys and IVs
after a file deletion is generated using AES-256 in CTR mode. This random stream
is seeded by a key from the kernel’s cryptographically secure random byte pool and
the cipher is reseeded after every 1 MB of data output.
3.5.4 Intercepting File Deletions
When an Eraser virtual device is first initialized, its FKT is filled with randomly
generated keys and IVs for every file, and the system is ready for use. Consequently,
our system does not need to track file creation events. Instead, we only monitor file
deletions, discard the appropriate keys in the FKT as discussed in Section 3.4, and
immediately generate new keys for the freed inodes, to be used later by the next file
that is assigned the same inode number.
While this approach simplifies our implementation, intercepting file deletion events
from the block layer is still not a trivial task. In particular, because a file deletion
often only involves changes to file system indices and metadata, and no I/O operations
are performed on the actual file blocks, the block I/O layer remains oblivious to this
file system modification.
Eraser addresses this challenge with the help of another Linux kernel framework,
Kernel Probes (kprobes) [67]. Kprobes allow users to hook into code addresses inside
the kernel, and access or manipulate system state. We utilize this capability to trap
execution at the entry point of the vfs_unlink function, a choke point inside the
VFS for all deletion operations. Next, in our hook function, we access the original
function’s arguments from the CPU registers, retrieve a pointer to the deleted inode,
and check whether it represents a file object. Note that since vfs_unlink is called
from all file systems available on the machine, we also need to check here that the
inode actually resides on an Eraser partition and not some other device. Once it is
confirmed that a file on a relevant device is being deleted, we then trigger the secure
deletion process and generate fresh keys for the freed inode.
3.5.5 Key Storage & Management
While Eraser’s key organization is based on the high-level FKT design presented in
Section 3.4, we also employ a number of optimizations specific to our implementation.
Due to the potentially large size of an FKT, the majority of the keys are stored
on the disk at any given time and are only accessed when required for a file access
or secure deletion. Because of this, the parameter n of the FKT should be chosen
to optimize disk I/O performance. In our implementation, every node of the tree
contains a 256-bit encryption key and 128-bit IV, for a total size of 48 bytes. To
ensure that we perform disk I/O operations on block boundaries, we set n to the
maximum number of tree nodes that can fit into a single block (i.e., ⌊4096/48⌋ = 85 for a
system with 4 KB logical blocks). In this way, we can perform disk I/O on all children
of a node directly with a single block access. This also has the desirable side effect
of allowing us to perform cryptographic operations on the blocks with a single pass,
because page sizes are often equal to or multiples of block sizes. Of course, other
configurations are also possible as long as n is chosen so that nodes fall within block
boundaries.
Note that the structure of an FKT can be estimated fairly accurately at the time of
system initialization, and the tree structure will remain static throughout the system’s
life. In our approach, there is a leaf node corresponding to each file. The number of
files is limited by either the number of inodes a file system can support on a device of
given capacity or, in the case of file systems that allocate inode indices dynamically,
by the space reserved on the device for the key store. With this knowledge of how
many leaves are going to be available in the FKT at any given time, we can further
optimize the tree structure for space efficiency.
We do this by first calculating the minimum tree height required based on the
number of inodes we need to support, and then decreasing the fan-out of the root
node to a value smaller than n in order to cull unused, empty subtrees of the root.
For instance, an Ext4 file system created on a device with a 100 GB capacity would
default to allocating 6,553,600 inodes. To create a tree with 6,553,600 leaves working
back towards the root (with n = 64 to simplify calculations), we would need
6,553,600/64 = 102,400, then 102,400/64 = 1,600, and 1,600/64 = 25 nodes at the
successive levels. Consequently, a fan-out of 25
for the root would be sufficient for this configuration. The recurrence relationship for
calculating the total number of tree nodes required to support |F | files is given below,
excluding the root node stored externally. As a result, our implementation reserves
48R(|F |) bytes of storage space for a file system that can represent a maximum of
|F | files.
R(|F|) =
  |F| + R(⌈|F|/n⌉),  if |F| > n
  |F|,               if |F| ≤ n
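The recurrence and the worked example above can be checked numerically; `fkt_nodes` below follows R directly, and the loop reproduces the per-level node counts.

```python
def fkt_nodes(num_files: int, n: int) -> int:
    """Total number of FKT nodes, excluding the externally stored root,
    needed to support `num_files` leaves with fan-out n (recurrence R)."""
    if num_files <= n:
        return num_files
    return num_files + fkt_nodes(-(-num_files // n), n)  # ceiling division

# Worked example from the text: 6,553,600 inodes on Ext4, n = 64.
levels = []
count = 6553600
while count > 1:
    count = -(-count // 64)  # nodes needed at the next level up
    levels.append(count)
```

Running this reproduces the per-level counts 102,400, 1,600, and 25 from the text, and gives R(6,553,600) = 6,657,625 nodes, i.e., roughly 305 MB of reserved key storage at 48 bytes per node.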
Recall that our approach requires log_n |F| disk accesses (i.e., the height of the
tree) to retrieve or discard the required keys with each file access and deletion. Our
approach to mitigate the I/O overhead caused by this necessity is twofold. First,
we always keep the decrypted nodes in levels 1 and 2 in memory. For example,
following the previous example with 6,553,600 leaves and n = 64, this would require
less than 7 MB of memory. Modified nodes are written to disk periodically. Next, we
employ a caching strategy for the leaf nodes so that the keys for frequently accessed
files are available in memory for quick access. Similarly, a dedicated kernel thread
periodically synchronizes dirty cache entries to their disk blocks and evicts old cache
entries. We should point out that we experimented with various caching strategies and
data structures for searching the cache efficiently. Our performance measurements
show that having a cache, as opposed to always reading the keys from the disk, results
in a significant performance gain. However, fine-tuning the cache organization had
no discernible impact on performance. This indicates that Eraser’s performance is
primarily I/O bound as expected, and that cache searches are overshadowed by I/O
operations.
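One way to realize such a write-back cache for leaf keys is a simple LRU with dirty tracking. The sketch below is illustrative and not Eraser’s actual cache implementation; in the real system, eviction and synchronization run from a dedicated kernel thread rather than explicit method calls.

```python
from collections import OrderedDict

class LeafKeyCache:
    """Illustrative write-back LRU cache for FKT leaf nodes, keyed by
    inode number. `read_block`/`write_block` stand in for disk I/O."""

    def __init__(self, capacity, read_block, write_block):
        self.capacity = capacity
        self.read_block = read_block     # fetch a leaf key from disk
        self.write_block = write_block   # persist a leaf key to disk
        self.entries = OrderedDict()     # inode -> (key, dirty)

    def get(self, inode):
        if inode in self.entries:
            self.entries.move_to_end(inode)      # mark recently used
            return self.entries[inode][0]
        key = self.read_block(inode)             # cache miss: hit disk
        self.entries[inode] = (key, False)
        self._evict_if_full()
        return key

    def put(self, inode, key):
        self.entries[inode] = (key, True)        # dirty: needs sync
        self.entries.move_to_end(inode)
        self._evict_if_full()

    def sync(self):
        # Periodic write-back of dirty entries (kernel thread's job).
        for inode, (key, dirty) in self.entries.items():
            if dirty:
                self.write_block(inode, key)
                self.entries[inode] = (key, False)

    def _evict_if_full(self):
        while len(self.entries) > self.capacity:
            inode, (key, dirty) = self.entries.popitem(last=False)
            if dirty:
                self.write_block(inode, key)     # write back on evict
```

Clean entries are dropped silently on eviction, while dirty ones are written back first, mirroring the periodic synchronization described above.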
3.5.6 Master Key Vault
In our prototype implementation we store the master key inside the NVRAM area
of a TPM chip. This enables us to reliably discard (i.e., overwrite) an obsolete key
when the master key needs to be regenerated after a file deletion, and also provides
a strong defense against unauthorized retrieval of the master key.
While it would also be possible to interact with the TPM chip directly from
within the kernel, our implementation instead utilizes a user space helper application
to read from and write to the NVRAM. This is a conscious design choice to make it
possible to extend the system to support different secure storage modules in the future
without requiring modifications to the kernel core. Eraser coordinates with this
helper application using the Linux kernel’s netlink facility, a standard mechanism
for kernel-to-user space communication. Note that the master key is further protected
by an encryption key derived from a user password, configured when setting up a new
Eraser instance. Therefore, the master key is always encrypted when residing inside
the TPM chip, accessed by the user space helper, or in transit through the netlink
channel.
3.5.7 Encrypting Non-File System Blocks
Our discussion of cryptography in this chapter focused on the secure deletion of
files. The approach we present also provides security guarantees similar to ordinary
file encryption tools as a side benefit, provided that the external master key cannot
be read by others or is otherwise further protected with another secret such as a
password. However, the approach we described so far does not provide full disk
encryption capabilities, since non-file blocks on the disk (e.g., the file system’s internal
data or free blocks) are not encrypted. This would require a user desiring both full
disk encryption and secure deletion at the same time to run Eraser on top of yet
another disk encryption solution, such as dm-crypt. This redundancy could hurt I/O
performance.
To address this limitation, we extended Eraser to provide full disk encryption
for non-file blocks as well. In short, Eraser operates in file encryption mode, as
described in Section 3.5.3, if the I/O request is for a file block. In all other cases,
it performs regular disk sector encryption using a fixed key generated on system
initialization and protected with a user password. The IVs in this mode are also
derived from disk sector numbers using the “encrypted salt-sector initialization vector
(ESSIV)” method [68]. In this way, Eraser becomes a full replacement option for
other disk encryption solutions, offering secure deletion guarantees on top of the usual
confidentiality characteristics of disk encryption.
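The ESSIV idea can be sketched as follows. Note that real ESSIV encrypts the sector number with a block cipher keyed by a hash of the volume key; this dependency-free sketch of ours substitutes a keyed hash for that cipher, so it illustrates the derivation structure rather than the exact primitive:

```python
import hashlib

def essiv_like_iv(volume_key: bytes, sector: int) -> bytes:
    # ESSIV: IV = Enc_s(sector number), where s = hash(volume key).
    # A keyed hash stands in for the block cipher here, purely for
    # illustration; the IV is reproducible from the key but looks
    # unpredictable to anyone without it.
    s = hashlib.sha256(volume_key).digest()
    return hashlib.sha256(s + sector.to_bytes(8, "little")).digest()[:16]
```

Each sector thus receives a deterministic, key-dependent 16-byte IV, which is what allows random-access decryption without storing IVs on disk.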
3.5.8 Managing Eraser Partitions
Lastly, users interact with Eraser through a user space application that allows them
to format physical devices to create the required headers and internal metadata.
During this setup process, users are required to configure a password from which
encryption keys are derived for securing the master key while it is being transported
from the TPM chip to the kernel, and also to encrypt non-file system blocks.
Later, Eraser partitions can be activated with this tool to expose the securely-
deleting virtual device node by supplying the correct password. In the same way,
users can view active instances of Eraser, made available by the driver through a
/proc node, and deactivate them when no longer needed. Through this application
users can also configure Eraser to use any of the supported vault devices for master
key storage.
3.6 Evaluation
While the approach we presented in this chapter was primarily designed to provide
strong secure deletion guarantees, many of our implementation choices were also
geared toward achieving good I/O performance on commodity computer systems so
that Eraser could have a practical impact. In this section, we present two sets
of experiments to evaluate the performance overhead of Eraser and compare it to
ordinary full disk encryption.
All experiments described in this section were performed on a regular desktop
computer with an Intel i7-930 2.2GHz CPU, 9 GB of RAM, running Arch Linux x86-
64 with an unmodified 4.17.0 kernel. The storage device used was a Samsung 950
PRO solid state drive with 1TB capacity, formatted with Ext4 using the default file
system settings.
For all tests, the results presented were averaged over five runs. The maximum
relative standard deviation we observed was below 2% for the I/O benchmarks we
describe first, and below 5% for the real-life small file tests we discuss next.
3.6.1 I/O Benchmarks
To understand how Eraser impacts the I/O performance of the underlying storage
device, we first put our system under stress using the popular disk and file system
benchmarking tool Bonnie++. For file I/O tests, we configured Bonnie++ to write
and read 20×1 GB files. This size was chosen to be more than twice the system
RAM, following the benchmark tool's recommendation. Next, file creation and deletion
tests were performed with 512×1024 small files each containing 512 bytes of data,
distributed among 10 directories. These tests were also repeated on the same test
environment, but instead using dm-crypt, the standard Linux subsystem that provides
full disk encryption. While our discussion will primarily focus on comparing Eraser's
performance to dm-crypt, we also provide benchmark results obtained without running
either encryption tool as a baseline. The results are shown in Table 3.1.¹

           No Encryption      dm-crypt                           Eraser
Bonnie++                                       Overhead vs.                        Overhead vs.   Overhead vs.
Tests      Performance        Performance      No Enc.           Performance       No Enc.        dm-crypt
Write      255300.00 KB/s     254990.00 KB/s      0.12 %         253530.20 KB/s       0.70 %         0.58 %
Read       213778.00 KB/s     142174.20 KB/s     50.36 %         141747.60 KB/s      50.82 %         0.30 %
Create     37183.60 files/s   35850.80 files/s    3.72 %         34266.00 files/s     8.52 %         4.63 %
Delete     59418.80 files/s   59098.00 files/s    0.54 %         49230.80 files/s    20.69 %        20.04 %

Table 3.1: Disk I/O and file system performance of Eraser compared to full disk
encryption with dm-crypt. Benchmark results on an unencrypted device are also
presented as a baseline.
Bonnie++ benchmarks reveal that when performing read and write operations
on a small number of large files Eraser exhibits very similar performance to dm-
crypt, with the overhead staying below 1%. This is not surprising, because once
Eraser obtains the encryption key for the processed file with a negligible, one-time
performance hit, the remaining task of encrypting and decrypting the file blocks in-
flight is nearly identical to how dm-crypt performs disk block encryption. However,
in file creation tests, Eraser incurs a more noticeable performance impact. This is
likely due to the fact that Eraser now needs to perform a larger number of additional
I/O operations to repeatedly access its key store, and decrypt the corresponding FKT
nodes to obtain keys corresponding to each newly created file.
Finally, the most significant performance impact is observed during file deletions,
where Eraser falls behind dm-crypt by about 20%. Once again, this outcome is
in line with our expectations since a file deletion is the most expensive operation
Eraser performs: Eraser first intercepts the unlink system call, then performs
several accesses to the FKT, and finally replaces all involved keys with freshly
generated ones, also encrypting and writing them back to the key store if there is
cache contention. However, regardless of this drawback, the actual number of files
processed per second by Eraser remains considerably high. As a result, we next test
how Eraser performs with real-life tasks that heavily involve small file operations
and explore this behavior in more detail.

¹ We point out that in all tests performed with and without Eraser we measured higher write
speeds than read speeds. While this was unanticipated, unofficial Internet discussions indicate that
this is an issue observed with SSDs produced by this vendor, most likely due to a firmware quirk.
Notwithstanding the reasons, this issue does not have any bearing on our experimental results.

            No Encryption   dm-crypt                      Eraser
                                       Overhead vs.                  Overhead vs.   Overhead vs.
Tests       Time (s)        Time (s)   No Enc.           Time (s)    No Enc.        dm-crypt
Unpack      10.60           10.84         2.26 %         11.39          7.45 %         5.07 %
Copy        11.44           23.59       106.21 %         22.61         97.64 %        −4.15 %
Remove      3.26            4.17         27.91 %         5.04          54.60 %        20.86 %
Grep        11.11           25.18       126.64 %         24.12        117.10 %        −4.21 %
MD5 Hash    10.39           24.20       132.92 %         22.20        113.67 %        −8.27 %
Compile     1564.13         1564.15     < 0.01 %         1568.13        0.26 %         0.26 %

Table 3.2: Timed experiments with the Linux kernel source code directory to compare
the small-file performance of Eraser to full disk encryption with dm-crypt. Test
results on an unencrypted device are also presented as a baseline.
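To make the deletion path concrete, the following heavily simplified sketch of ours models a two-level key tree in which a master key wraps a node key, which in turn wraps per-file keys. The real FKT layout, its caching, and its ciphers differ; the "wrapping" here is an HMAC-derived XOR pad chosen purely for illustration:

```python
import hashlib
import hmac
import os

def wrap(parent: bytes, child: bytes) -> bytes:
    # Illustrative key wrapping: XOR with an HMAC-derived pad. A real
    # implementation would wrap keys with a proper cipher.
    pad = hmac.new(parent, b"wrap", hashlib.sha256).digest()
    return bytes(a ^ b for a, b in zip(child, pad))

master = os.urandom(32)                                   # lives in the vault
node = os.urandom(32)                                     # intermediate tree node
file_keys = {inode: os.urandom(32) for inode in (11, 12, 13)}

def delete_file(inode):
    """Securely 'delete' a file: drop its key and rotate every ancestor key."""
    global master, node
    file_keys.pop(inode)          # the deleted file's key is discarded
    node = os.urandom(32)         # re-key the branch that contained it...
    master = os.urandom(32)       # ...and rotate the master key in the vault
    # Surviving keys are re-wrapped under the fresh ancestors and written back.
    return {i: wrap(node, k) for i, k in file_keys.items()}, wrap(master, node)
```

After the rotation, the old wrapped copies on disk are undecryptable because every key on the path from the deleted file to the (externally stored) master key has been replaced.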
3.6.2 Tests with Many Small Files
Prompted by Eraser’s relatively high performance overhead observed when dealing
with large numbers of small files under benchmark conditions, we next investigated
how it would perform in more realistic scenarios. To this end, we chose six tasks
involving a large directory tree – namely, the Linux kernel source code – and measured
the time elapsed to complete each task. Once again, the tests were performed first
with Eraser and then dm-crypt. Measurements on a vanilla system with no disk
encryption are also provided as a baseline.
Our tests included the following tasks: (i) Unpacking the XZ-compressed source
code archive, (ii) making a copy of the directory tree, (iii) deleting the directory tree,
(iv) grepping the entire directory for a fixed string, (v) computing an MD5 hash over
all the files, and finally, (vi) compiling the kernel. All tasks were chosen to include
a large number of file operations, including reads, writes, deletions, and new file
creations. Furthermore, certain tasks such as kernel compilation and MD5 hashing
combined small file I/O operations with a CPU-bound component to cover different
scenarios. The results are presented in Table 3.2. Note that all operating system
caches were dropped between tests to ensure that measurements were not affected by
prior runs.
On the one hand, these results confirm our findings from the Bonnie++ bench-
marks that Eraser has a noticeable file deletion overhead, this time manifesting
itself at 21% during the directory removal task. On the other hand, in terms of the
time elapsed, the real-life impact of this performance loss is measured in a few sec-
onds. In all other tasks, Eraser performed comparably to dm-crypt, and surpassed
it in certain cases. However, this should not be taken to mean that Eraser is faster
than dm-crypt. Instead, we conclude that they perform similarly in real-life tasks.
The small differences in our measurements are likely due to natural variations in how
the underlying operating system and hardware perform.
3.6.3 Discussion of Results
In light of our evaluation, we confirm that the performance overhead of Eraser is
directly correlated with the number of files it handles at any given time. I/O per-
formed in big chunks and on a small number of files incurs no significant overhead.
In contrast, accessing a new file or deleting an existing one triggers additional I/O
operations to retrieve the corresponding keys from the FKT, or to rebuild branches
of the FKT with fresh keys. Therefore, repeatedly accessing large numbers of small
files results in a noticeable loss of throughput compared to ordinary full disk
encryption. In exchange, however, Eraser guarantees secure data deletion, and is
therefore valuable in scenarios where privacy guarantees are of utmost importance.
In addition, our tests also show that this reduction in throughput does not always
translate negatively to realistic workloads such as manipulating or working with very
large directory trees. In our tests, the performance loss is often measured in merely
seconds. In fact, in many workloads, including those with processor-heavy
components, Eraser matches dm-crypt in performance. We find these results very
encouraging, especially considering that dm-crypt is a standard, well-optimized sub-
system of the Linux kernel. We conclude that in most practical use cases Eraser
offers performance comparable to regular full disk encryption with the added benefit
of guaranteed secure deletion.
We should point out that we refrained from comparing our prototype to a system
running without any disk encryption. As shown in Tables 3.1 and 3.2, a vanilla
system offers significantly higher I/O performance than both Eraser and dm-crypt.
However, we believe that such a comparison between encrypted and unencrypted
storage is not very meaningful in this context. First, the observed performance loss is
a direct result of disk encryption, and thus, it is not directly related to secure deletion.
Moreover, we believe that this downside of full disk encryption is a well-understood
and accepted trade-off in the face of modern privacy threats.
3.7 Discussion
Prior Tree-based Secure Deletion Work
As mentioned in Section 3.2, Reardon et al. [64] implemented a B-tree-based approach
to secure file deletion that also made use of cryptographic erasure and key wrapping.
This work is highly related; however, a significant difference between their prototype
and Eraser lies in our focus on developing a high performance secure deletion tech-
nique, and subsequently, presenting a practical and usable system that can act as a
viable substitute for existing, well-established full disk encryption tools.
First, while Reardon’s B-tree prototype shows promising performance character-
istics when combined with a suitable caching policy, our evaluation of Eraser shows
that an FKT implementation can closely approach the performance characteristics
of a heavily used and optimized production-level full disk encryption implementation
(i.e., dm-crypt). We stress that we are not the first to propose tree-based crypto-
graphic erasure using key wrapping. However, we believe that FKTs and our pro-
totype implementation are the first to show that it can be performant for everyday
use.
Next, Reardon’s work leverages the Linux kernel’s network block device facil-
ity [69], which routes block I/O requests over a TCP connection, and is typically used
for accessing remote storage devices. The authors utilize this technique to present a
proof-of-concept implementation of their approach for their experiments. In contrast,
one of our primary goals when developing Eraser was to provide a robust, practical,
and usable system that could easily be adopted for everyday use, on a typical Linux
system. As a result, we were faced with a different set of design and implementation
challenges to fulfill our specific requirements.
Implementation Limitations
As we have shown, Eraser makes it possible to maintain a file-level secure deletion
granularity even when operating at the block device layer. However, this design
choice does pose a major difficulty to securely deleting file metadata, as matching file
system-specific metadata to inodes is a non-trivial (but not impossible) task. Our
current implementation does not perform secure deletion of metadata, and we leave
tackling this implementation challenge to future work.
Eraser uses inode numbers to uniquely match encryption keys to files. This
is an intuitive solution when dealing with Linux-native file systems, such as Ext4,
which internally represent files directly using inodes. However, it should be noted
that “foreign” file systems that are ported to work under Linux (e.g., FAT, ZFS)
do not necessarily have the concept of an inode. Instead, they construct inodes
in memory as files are accessed, and map their own internal representation of files
onto these in-memory structures as this is required to interface with the VFS. This
peculiar technical detail does not currently pose any difficulty to our implementation.
However, in theory, it could be possible to implement a file system that does not have
a fixed inode number-to-file mapping, but rather assigns arbitrary inode numbers
to files every time the file system is mounted. Our Eraser prototype would not
be compatible with such a file system, and addressing this limitation would require
Eraser to employ a different method to uniquely identify files on that file system.
Swap Space & Hibernation Considerations
The secure deletion guarantees provided by our approach require that file keys are
never written to physical storage without first being encrypted by a parent key. Like-
wise, the master key must never be persisted outside its designated secure vault.
These conditions could easily be satisfied by keeping the keys in volatile memory
protected by the kernel while in use. However, we stress that implementations should
also take the necessary precautions to prevent inadvertent leakage of keys in case the
system goes into hibernation, or when memory pages are swapped to non-volatile
storage. Specifically, sensitive memory areas containing key caches should be marked
non-swappable, and before entering hibernation, all key caches must be written back
to persistent storage and their corresponding memory regions sanitized.
Unavailability of Master Key Vault
As part of Eraser’s normal operation it would be necessary to frequently rotate the
master key stored in the external vault. However, should the vault become inacces-
sible for any reason (e.g., a removable storage device acting as the vault, such as a
smartcard, could be unplugged by the user), Eraser needs to take the appropriate
actions to prevent inadvertent loss of data on disk. One way to deal with such sit-
uations is to delay the master key rotation until the vault becomes available once
again.
Even if an Eraser partition needs to be taken offline under these conditions, the
direct children of an FKT could be encrypted with the old master key and persisted
to disk, which would temporarily forgo secure deletion. Later, when the vault next
becomes accessible, the master key would be rotated and all its direct children in the
FKT immediately re-encrypted to securely erase all previously deleted files. Note
that even in this scenario, an offline Eraser partition cannot be accessed again until
the vault becomes available, because the master key is required to unlock the FKT
on disk before the file system can be mounted.
Alternatively, in different threat environments that involve highly-sensitive files,
it could be preferable to rotate the master key as soon as files are deleted regardless of
the vault’s availability, and opt for having the file system become inaccessible should
the system be taken offline before the new key could be written to the vault. Such a
policy would instead sacrifice data integrity in favor of guaranteed secure deletion.
Users’ Perception of Secure Deletion
Finally, we should point out that Eraser is designed to securely delete files only when
a system call explicitly requests removal of the file inode in question. For instance, our
prototype implementation considers the unlink family of system calls as the trigger
for secure deletion. Of course, this could trivially be extended to cover other related
system calls, such as truncate, by intercepting their corresponding entry points as
well.
However, file system implementations may not always explicitly destroy inodes
even when, from a user’s perspective, it may appear that a file’s contents are being
deleted. For example, consider a scenario under Linux and Ext4 where a directory
contains two files X and Y. When a user executes the command “mv X Y” to overwrite
the latter file with the former, the file system does not actually unlink Y. Instead, its
inode is reused, and only the data blocks of Y are overwritten. In other words,
Eraser would not consider this a file deletion event and would not securely delete
the contents of Y until the user later executes another command such as “rm Y”,
at which point all current and old data pointed by that inode is securely deleted.
Therefore, users of Eraser should be aware of this semantic gap and limitation of
the system, and explicitly execute file deletion operations when secure deletion is
desired.
3.8 Summary
Even though the problem of irrevocably deleting data from non-volatile storage has
been explored by many researchers, flash-based storage media with opaque on-board
controllers, and journaling file systems with data replication features still make it a
challenging task to provide strong secure deletion guarantees on modern computers.
At the same time, previously practical secure deletion tools and techniques are rapidly
being rendered obsolete and ineffective.
In this chapter, we leveraged the well-known concept of cryptographic erasure to
design a novel, effective secure deletion technique called Eraser. Our work is distinct
from the large body of existing literature in this field in that Eraser can guarantee
secure deletion of files on storage media regardless of the underlying hardware’s char-
acteristics, treating storage devices as blackboxes. We achieve this by bootstrapping
cryptographic erasure with the help of an external, secure storage vault, which could
be implemented in practice using cheap, commodity hardware such as a TPM chip,
or a smartcard.
Eraser’s design and implementation fulfills all of our research goals laid out in
Section 1.5. (G1) We presented a practical implementation of Eraser, realized as a
stand-alone Linux block device driver that can be deployed and used on a commodity
computer with a TPM chip. (G2) Eraser partitions are exposed to user space
as virtual devices that behave identically to ordinary block storage media; they are
supported on any block-based hardware and can also be formatted with any file
system. (G3) Eraser requires explicit cooperation from neither applications nor
users; it performs secure deletion automatically whenever a file is removed. (G4)
Finally, our implementation exhibits similar performance characteristics to dm-crypt,
and thus offers users a viable alternative disk encryption solution with the added
benefit of secure file deletion.
Chapter 4
HiVE: Hidden Volume Encryption
4.1 Overview
Full disk encryption is a common security technology used for protecting sensitive
information saved in a computer’s persistent storage. Today, many major operating
systems offer basic disk encryption solutions out of the box, and there also exists a
large pool of free and commercial tools that provide disk encryption technologies to
suit different security and privacy needs.
While disk encryption is a well-studied technology that is known to provide strong
security when implemented correctly, it is nevertheless vulnerable in the face of chang-
ing adversarial models of the modern day. Specifically, against powerful adversaries,
such as government and law-enforcement agencies, which may have the authority to
force users into disclosing their keys, basic disk encryption techniques become inef-
fective regardless of how strong the underlying cryptographic algorithms are.
To address this problem, certain disk encryption tools provide advanced features
that offer plausible deniability to their users. For example, TrueCrypt, a popular disk
encryption tool, allows users to create a second hidden volume inside an ordinary
encrypted disk partition, using a separate key for encryption. In this scheme, data
blocks of the hidden volume are stored inside the seemingly-free blocks of the first
volume. Then, if the user is coerced into disclosing her encryption keys, she can reveal
only the key to the first partition and withhold the key to the hidden volume. Even
with full access to the primary volume, the adversary cannot tell whether a second
hidden volume exists, or more specifically, he cannot distinguish actual free blocks
from data blocks that are part of a potential hidden partition.
Unfortunately, as recognized by Czeskis et al. [2], this hidden volume scheme has
an important flaw. Namely, an adversary that has the ability to inspect multiple
snapshots of the disk at different times can guess with a high probability of success
whether a hidden volume exists. This is an important shortcoming since it is com-
mon for users to lose possession of their encrypted devices on multiple occasions, for
instance, while traveling (e.g., checking bags for multiple flights, border inspections
when entering and leaving a foreign country, leaving the device in a hotel room unat-
tended). The reason behind this vulnerability is the fact that TrueCrypt does not
make any attempts to hide disk access patterns. To explain intuitively, an adversary
can compare two disk snapshots, and attempt to determine whether an implausibly
large number of “free” disk blocks have been modified in between, which would give
away activity in a hidden volume.
In this chapter, we present a hidden volume encryption scheme that is secure
against adversaries with multiple-snapshot capabilities. We achieve this by using an
Oblivious RAM (ORAM) as a building block to hide disk access patterns, and then
refine our basic construction in several steps to present a final scheme we call Hive.
We demonstrate that our design can be implemented as a standard block device driver
on Linux, allowing users and applications to interact with Hive volumes in the exact
same manner as they would with ordinary disk partitions.
4.2 Threat Model
The primary threat we target with this work is a coercion attack, whereby an adver-
sary aims to defeat disk encryption and gain unauthorized access to privacy-sensitive
data by forcing the disk’s owner to willfully reveal her encryption keys. Typical exam-
ples of such adversaries include government and law-enforcement agencies that often
carry the authority to request users to reveal their secret keys, for instance, through
a court order.
Following the standard adversarial model for this attack, and as a precursor to
coercion, we assume that an attacker has the capability to inspect a disk of interest
in order to identify any encrypted data stored on it. However, we further assume that
the attackers may access and inspect the same disk more than once, and can compare
multiple disk snapshots taken at different times.
In the rest of this chapter, we refer to such scenarios as multiple-snapshot attacks.
As we describe in the next section, Hive aims to enable users to create hidden
encrypted partitions on their disk, and plausibly deny their existence in the face of
multiple-snapshot attackers.
4.3 Design
A key observation in our design of a hidden volume scheme resistant to multiple-
snapshot adversaries is that access patterns to encrypted volumes need to be hidden.
To this end, we first present a naïve, generic scheme using a standard ORAM. ORAM
is a block-based oblivious data structure; in other words, it does not reveal any
information about the sequence of read and write operations performed on its data
store. Thus, it is a natural fit for our purposes. ORAM specifics have been widely
studied by computer science researchers and we refer the readers to the large body of
previous work for more details (e.g., [70, 71, 72]).
In the following, we first present a basic, generic hidden volume scheme that
is resistant to multiple-snapshot attacks, but performs poorly from a performance
standpoint due to the inclusion of ORAMs as its building blocks. We then discuss
refinements and optimizations to this scheme in iterations, and finally present Hive,
a practical hidden volume encryption scheme.
4.3.1 Model
For the hidden volume encryption schemes presented in this chapter, we assume the
following model. The scheme gives a user access to max number of encrypted volumes
Vi, where the user can choose to set up and use any l, l < max number of volumes.
Each Vi is encrypted with a key Ki derived from a password Pi, and consists of ni
blocks of B bytes of data each. The total size of the disk is N . The hidden volume
scheme works in such a way that, given that an adversary has access to a number of
passwords smaller than the total number of volumes present, and can inspect multiple
snapshots taken from the disk, he will be uncertain about the real value of l.
4.3.2 Generic Hidden Volume Encryption
Our generic scheme uses max ORAMs as its storage units, each holding the data
for a corresponding encrypted volume Vi. The volume read and write operations are
performed as described in Algorithms 1 and 2. Intuitively, a write into Vi writes the
actual data into ORAMi, and then, for all the remaining ORAMs executes a dummy
write that does not change the data stored in the volume. Similarly, a read operation
for Vi reads the requested data from ORAMi, and then performs dummy writes to all
ORAMs.

Algorithm 1 Generic Hidden Volume Write
Input: volume v, block b, data d, keys <K_1, ..., K_max>
  for i ← 1 to max do
    if i = v then
      ORAM_i.write(b, d, K_i)
    else
      r ← random({1, ..., n_i})
      dummy ← ORAM_i.read(r, K_i)
      ORAM_i.write(r, dummy, K_i)
    end if
  end for

Algorithm 2 Generic Hidden Volume Read
Input: volume v, block b, keys <K_1, ..., K_max>
Output: data d
  d ← ORAM_v.read(b, K_v)
  for i ← 1 to max do
    r ← random({1, ..., n_i})
    dummy ← ORAM_i.read(r, K_i)
    ORAM_i.write(r, dummy, K_i)
  end for
  return d

Note that when not all max volumes are in use, and consequently, there is
no corresponding password P for those, the unused ORAMs would be replaced by a
simulator S which executes dummy operations that look identical to real operations
to an adversary.
This construction has two important properties:
• Write patterns to encrypted volumes are hidden. This is guaranteed by defini-
tion through the use of ORAMs to represent encrypted volumes.
• Writes to hidden volumes can be plausibly denied. This is possible because any
read operation results in dummy writes to all volumes, which covers for a write
into a hidden volume. In effect, with the described scheme, all disk operations
look the same to an adversary, regardless of which volume is being used and
what operation is being performed.
Algorithm 3 Write-Only ORAM Write
Input: block b, data d, key K
  S ← random_subset({1, ..., N}, k)
  β ← random(S), where β is a free block
  Disk.write(β, Enc_K(d))
  Map[b] ← β
  for all β′ in (S − β) do
    if β′ is free then
      Disk.write(β′, random_bytes(B))
    else                              ▷ β′ holds data
      d′ ← Dec_K(Disk.read(β′))
      Disk.write(β′, Enc_K(d′))
    end if
  end for

Algorithm 4 Write-Only ORAM Read
Input: block b, key K
Output: data d
  β ← Map[b]
  d ← Dec_K(Disk.read(β))
  return d
4.3.3 Write-Only ORAM Construction
Note that an adversary inspecting snapshots of a disk would not be able to observe
read operations, as reads do not leave any trace on the disk. Consequently, hiding
disk read patterns to an encrypted volume does not provide any additional security;
only hiding the write patterns would be sufficient. This means that the ORAMs we
use in the described generic hidden volume scheme are more powerful than required.
Therefore, in this section, we describe a more efficient write-only ORAM construction
to use with our hidden volume scheme.
This write-only ORAM construction uses a data structure Map to map a virtual
block b in the ORAM to a physical sector β on the disk. Map is kept in volatile
memory, and thus, is not visible to an adversary inspecting a disk snapshot. We also
assume that the disk has at least twice the amount of space allocated for the ORAM
(i.e., N ≥ 2n). Finally, we require that the cryptographic operations utilized realize
IND$-CPA encryption, meaning that the ciphertext produced is indistinguishable from
random strings [73].
The write-only ORAM write and read operations are defined in Algorithms 3 and
4. To perform a write, we first pick a set S of k random disk sectors (we discuss the
considerations for choosing a concrete value for k in the following sections). Then
we choose a random sector β from S, where β is free, meaning that it is not mapped
to any ORAM block b. The data is encrypted and written to β on disk, and Map
is updated to reflect this. The remaining k − 1 sectors in S are either overwritten
with randomized strings if they are free, or are re-encrypted and written back if they
contain data. Because the k sectors are chosen uniformly randomly and the ciphertext
on disk is indistinguishable from a random string, this construction does not reveal
any information about b or d to the adversary.
As this construction does not attempt to hide read patterns, the ORAM read
operation is trivial. It only involves resolving the requested logical ORAM block b
to the corresponding disk sector β through a Map lookup, and then performing the
read normally.
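A toy model of Algorithms 3 and 4 under these assumptions might look as follows. Encryption is omitted so the sector-selection logic stays visible, the ≈2^−k failure case is handled here by simply resampling (the actual construction instead absorbs it with the stash of Section 4.3.5), and the sizes are toy values of our choosing:

```python
import os
import random

B, N, n, k = 16, 64, 32, 8     # toy sizes; the scheme assumes N >= 2n

disk = [os.urandom(B) for _ in range(N)]   # free sectors hold random bytes
position_map = {}                           # Map: logical block -> sector

def oram_write(b, d):
    # Overwriting logical block b frees its previous sector, if any.
    occupied = set(position_map.values()) - {position_map.get(b)}
    while True:
        S = random.sample(range(N), k)      # k uniformly random sectors
        free = [s for s in S if s not in occupied]
        if free:                            # fails with probability ~2^-k;
            break                           # we resample for simplicity
    beta = free[0]
    disk[beta] = d                          # real scheme: Enc_K(d), IND$-CPA
    position_map[b] = beta
    for s in S:
        if s == beta:
            continue
        # Free sectors get fresh random bytes; occupied ones would be
        # re-encrypted under fresh randomness, which looks identical.
        if s not in occupied:
            disk[s] = os.urandom(B)
```

Reading is trivial, as the construction prescribes: `disk[position_map[b]]` (with a decryption in the real scheme), since read patterns need not be hidden.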
4.3.4 Choosing the Parameter k
In order to successfully execute a disk write operation, the choice of k must ensure
that there exists at least one free block in S where we can write the data into.
Recall our assumption that N ≥ 2n; that is, at least half of the ORAM’s underly-
ing storage disk is left empty. Then, the probability of a randomly chosen block from
the disk being empty is at least 1/2. Let X be a random variable that, when selecting
89
k blocks uniformly from N , describes the number of free blocks among those k. As
N is typically large compared to k, we approximate X with a binomial distribution.
Then, P[X ≥ 1] = 1 − P[X = 0] ≈ 1 − C(k, 0) · (n/N)^k = 1 − 2^{−k} for N = 2n. In other
words, the probability of not finding any free block is a negligibly small 2^{−k}, at the
cost of doubling the disk space requirement.
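The 2^{−k} failure bound can also be checked empirically. The sketch below (our own illustration, not part of Hive) samples k sectors from a half-full disk and estimates the probability that none of them is free:

```python
import random

def p_no_free(n, N, k, trials=200_000):
    """Estimate P[X = 0]: no free sector among k uniformly sampled ones."""
    used = set(range(n))           # sectors 0..n-1 hold data; the rest are free
    fails = sum(all(s in used for s in random.sample(range(N), k))
                for _ in range(trials))
    return fails / trials

est = p_no_free(n=512, N=1024, k=5)
print(est, 2 ** -5)                # empirical estimate vs. the 2^-k approximation
assert abs(est - 2 ** -5) < 0.01
```

The small discrepancy between the estimate and 2^{−k} reflects the binomial approximation of what is exactly a hypergeometric draw.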
4.3.5 Write-Only ORAM Optimizations
There are two important limitations of the write-only ORAM construction we pre-
sented above. First, we need to perform k disk operations for each ORAM write,
which would impact performance for large values of k. Second, storing Map in mem-
ory could be excessively expensive with large disks.
To address the first problem we introduce a stash optimization. Recall that the
probability of selecting a random free block on disk is at least 1/2. This means that,
for a single write operation, our ORAM scheme will pick k/2 free blocks in expec-
tation; however, we need only one to write our data block into. Our optimization
technique exploits this fact to allow for very small values of k (e.g., our Linux imple-
mentation uses k = 3). Specifically, we extend our construction with an in-memory
data block queue, or stash. During a write operation, if there is no free block among
the k selected, we instead temporarily store the data in the stash. Otherwise, if there
are multiple free blocks in our selection, we write the pending blocks from the stash
into the excess free blocks.
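A minimal sketch of the stash optimization follows; as before, encryption is elided and all names are illustrative. A write first enqueues its block in the stash, and every free sector among the k sampled is then used to flush pending stash entries, so no write ever fails outright:

```python
from collections import deque
import random

class StashORAM:
    """Write-only ORAM with an in-memory stash, allowing a small k."""

    def __init__(self, n, N, k=3):
        assert N >= 2 * n
        self.N, self.k = N, k
        self.disk = [None] * N             # None models a randomized free sector
        self.map = {}
        self.stash = deque()               # pending (block, data) writes

    def write(self, b, data):
        old = self.map.pop(b, None)
        if old is not None:
            self.disk[old] = None          # the stale on-disk copy becomes free
        self.stash = deque(e for e in self.stash if e[0] != b)
        self.stash.append((b, data))
        for s in random.sample(range(self.N), self.k):
            if self.disk[s] is None and self.stash:
                pb, pd = self.stash.popleft()   # flush a pending write
                self.disk[s] = pd
                self.map[pb] = s
            # used sectors are re-encrypted, free ones re-randomized (elided)

    def read(self, b):
        for pb, pd in self.stash:          # pending writes still live in the stash
            if pb == b:
                return pd
        return self.disk[self.map[b]]
```

With k = 3 the expected number of free sectors per write exceeds the arrival rate of one block per write, so the stash drains on average faster than it fills.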
To bound the size of the stash, and thus, memory use, we use a standard queuing
argument. We model our stash as a D/M/1 queue with a deterministic arrival rate
of γ = 1 and service times exponentially distributed with a rate parameter µ = k/2.
Then, as shown by [74], the steady state probability P of having i items in the stash
at any time is P = (1 − δ) · δ^i, where δ is the root of the equation δ = e^{−µγ(1−δ)} with
the smallest absolute value. If µ is larger than 1, then δ < 1, and the steady state
probability of having i blocks in the stash will be O(2^{−i}). As a result, we can set k
to a small constant, for example, k = 3, to find δ = 0.41718, and we can bound the
probability of overflowing the stash at 2^{−64} using a stash size of only 50 blocks.
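The quoted constants can be reproduced numerically. Assuming the D/M/1 fixed-point equation δ = e^{−µγ(1−δ)} with µ = k/2, a simple iteration (our own sketch) recovers δ for k = 3 and the resulting stash overflow bound:

```python
import math

def delta(k, gamma=1.0):
    """Fixed-point iteration for delta = exp(-mu * gamma * (1 - delta)), mu = k/2."""
    mu, d = k / 2.0, 0.0
    for _ in range(200):               # converges since mu * gamma > 1 here
        d = math.exp(-mu * gamma * (1.0 - d))
    return d

d = delta(k=3)
print(d)                               # approximately 0.4172, as quoted in the text
assert abs(d - 0.41718) < 1e-4
# The tail probability of i stashed blocks decays as delta^i; for i = 50,
# delta^50 is roughly 2^-63, i.e., the stated ~2^-64 overflow bound.
assert 50 * math.log2(d) < -60
```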
We address the second problem of large Map sizes by adopting a standard tech-
nique that involves storing the mapping recursively in smaller ORAMs [71, 75, 76]. If
our block size B is at least χ · logN for some constant χ > 2, then we are guaranteed
that another ORAM recursively holding our map will have its own map no greater
than half the size of the original. After O(log n) recursive ORAMs, we will have a
constant size map that can be stored in memory. Of course, this slightly increases the
communication complexity since we now have to access O(log n) recursive ORAMs
to map ORAM blocks to disk sectors.
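The geometric shrinkage of the recursive maps can be illustrated with a simple accounting sketch (our own; the per-entry size is an assumption for illustration):

```python
import math

def map_levels(n, B_bits):
    """Count recursive map ORAMs until the map is constant-size.

    Each map entry is ~log2(N) bits; if a block holds B_bits >= chi * log N
    bits for chi > 2, each level's map is at most half the previous one's size.
    """
    entry_bits = max(1, math.ceil(math.log2(2 * n)))   # assume N = 2n
    levels, blocks = 0, n
    while blocks > 1:
        blocks = math.ceil(blocks * entry_bits / B_bits)  # blocks for this map
        levels += 1
    return levels

# A 2^20-block ORAM with 4 KB (32768-bit) blocks needs only a couple of levels.
print(map_levels(n=2 ** 20, B_bits=4096 * 8))
```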
4.3.6 Hidden Volume Encryption with HiVE
While the combination of our write-only ORAM with the generic scheme we pre-
sented previously provides a secure hidden volume encryption technique, it has one
final practical limitation. Namely, our generic scheme uses max separate ORAMs to
support max volumes. In turn, each I/O on any given volume performs additional I/O
operations on all ORAMs, resulting in a complexity dependent on the value of max.
Our final refinement of the hidden volume encryption scheme, which we call Hive,
addresses this problem by storing all volumes interleaved on the disk inside a single
ORAM. Blocks of all volumes are mapped to random disk sectors, and mappings are
updated randomly using our ORAM mechanism every time a block is written to (see
Figure 4.1).
Algorithm 5 Hive Write
Input: volume v, block b, data d, keys <K_1, . . . , K_max>
  Stash_v.enqueue(b, d)
  S ← random_subset({1, . . . , N}, k)
  for all β in S do
    for i ← 1 to max do
      if β is block b in V_i then
        d ← Dec_{K_i}(Disk.read(β))
        Stash_i.enqueue(b, d)
        break
      end if
    end for
  end for
  i ← 1
  for all β in S do
    while i ≤ max and Stash_i = ∅ do
      i ← i + 1
    end while
    if i > max then
      break
    end if
    (b, d) ← Stash_i.dequeue()
    Disk.write(β, Enc_{K_i}(d))
    Map_i[b] ← β
    S ← S − {β}
  end for
  for all β in S do
    Disk.write(β, random_bytes(B))
    Map_i.dummy_write()
  end for
Algorithm 6 Hive Read
Input: volume v, block b, keys <K_1, . . . , K_max>
Output: data d
  if b in Stash_v then
    d ← Stash_v.dequeue()
    return d
  end if
  β ← Map_v[b]
  d ← Dec_{K_v}(Disk.read(β))
  Hive.dummy_write()
  return d
Figure 4.1: Hive stores volumes interleaved on disk.
Details of the Hive write and read operations are shown in Algorithms 5 and
6. Note that, in line with our recursive block translation map design, the Map
structures referred to in the algorithms are actually separate Hive instances as well.
Consequently, they must also be accessed with the presented Hive read and write
routines in a recursive manner. For brevity, however, we use the simple array-like
notation in the algorithm listings.
We point out that combining all encrypted volumes together in a single backing
ORAM has an important implication on security: writes to volumes may influence
each other. To illustrate, to write to a given volume, when we pick k random blocks to
form S as usual, this set can contain both free blocks, as well as blocks used by other
volumes. In this case, we cannot simply choose to use the free block as that would
create a write pattern that attempts to deliberately avoid certain used blocks, and
subsequently, would undermine the security of the scheme. To address this problem,
Hive utilizes separate stashes for each volume. When performing a write, the data
to be written, and then all used blocks among the randomly chosen k blocks are read
from disk, and enqueued to their corresponding stashes. As a result, k blocks are now
freed on the disk. Next, these k free spaces are filled with the pending blocks in the
stashes, with stashes for lower volumes taking priority. This ensures that writes to
higher volumes cannot influence writes to lower volumes, solving the aforementioned
problem of leaking write patterns. Finally, if all stashes are empty and any of the k
freed blocks remain unprocessed, they are filled with randomized blocks. Note that
when randomizing blocks in this way we still need to perform a dummy update on
the corresponding Map, because the maps are recursive instances of Hive as well.
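The per-volume stash discipline described above can be sketched as follows. This is a simplified single-level model of our own devising: encryption, randomized free sectors, and the dummy map updates are all elided, and the class and attribute names are illustrative.

```python
from collections import deque
import random

class Hive:
    """Sketch of the multi-volume write with per-volume stashes.

    All volumes share one sector space; stashes are drained lowest volume
    first, so writes to higher volumes cannot influence lower volumes.
    """
    def __init__(self, N, max_vols, k=3):
        self.N, self.k = N, k
        self.disk = [None] * N                  # None: randomized free sector
        self.maps = [dict() for _ in range(max_vols)]
        self.stashes = [deque() for _ in range(max_vols)]
        self.owner = {}                         # sector -> (volume, block)

    def write(self, v, b, data):
        self.stashes[v] = deque(e for e in self.stashes[v] if e[0] != b)
        old = self.maps[v].pop(b, None)
        if old is not None:                     # stale copy of b becomes free
            self.disk[old] = None
            del self.owner[old]
        self.stashes[v].append((b, data))
        S = random.sample(range(self.N), self.k)
        for s in S:                             # evict every used sampled sector
            if self.disk[s] is not None:
                ov, ob = self.owner.pop(s)
                self.maps[ov].pop(ob)
                self.stashes[ov].append((ob, self.disk[s]))
                self.disk[s] = None
        for s in S:                             # refill, lower volumes first
            vol = next((i for i, st in enumerate(self.stashes) if st), None)
            if vol is None:
                break                           # real scheme: randomize + dummy map update
            pb, pd = self.stashes[vol].popleft()
            self.disk[s] = pd
            self.maps[vol][pb] = s
            self.owner[s] = (vol, pb)

    def read(self, v, b):
        for pb, pd in self.stashes[v]:
            if pb == b:
                return pd
        return self.disk[self.maps[v][b]]       # plus a dummy write in the real scheme
```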
During the Hive read operation the requested block is read as usual. However,
as a final step, a dummy write should be performed according to Algorithm 5. Once
again, this ensures that reads and writes look identical to an adversary, and also
provides an opportunity for writing stashed blocks to disk.
4.4 Implementation
We implemented Hive for Linux as a combination of a kernel module, and a user space
helper application. The kernel module allows us to expose Hive as a virtual Linux
block device that behaves and can be used like any physical block device installed on
a machine. The helper application allows users to create and manage Hive instances
on any system partition.
HiVE Volumes
Similar to our Eraser prototype previously discussed in Chapter 3, Hive’s im-
plementation makes use of the device-mapper framework provided by the kernel.
Device-mapper allows us to map any part of a hardware block device onto virtual
block devices, which the users then interact with. As a result, Hive sits between the
Linux block I/O layer and the underlying hardware, intercepts the I/O requests in
flight, and modifies or redirects them to different disk sectors as necessary to imple-
ment the hidden volume scheme we described. Note that our implementation is not
tied to the low-level block device drivers, and works on any underlying block device,
including hard disks and USB sticks. Likewise, since device-mapper resides below
the Linux virtual file system (VFS), Hive volumes could be formatted with any file
system available to the user.
The cryptographic algorithms we use in our implementation are AES-256 in CBC
mode for volume encryption, AES-256 in CTR mode for efficiently generating ran-
domized blocks, and PBKDF2 for deriving the volume encryption keys from user
selected passwords.
Disk I/O Optimizations
Remember that all data blocks sent to a Hive volume are written to a uniformly
randomly chosen disk sector. This is an inherent characteristic of our write-only
ORAM construction that is required to satisfy the property that write patterns cannot
be observed in a disk snapshot. Unfortunately, this also means that all I/O operations
performed on Hive volumes result in random disk accesses, which can be significantly
slower than typical sequential access patterns.
We perform two optimizations to alleviate this problem. First, we set the block
size of Hive volumes to 4 KB (i.e., the maximum allowed by Linux on the x86
architecture) regardless of the underlying storage medium’s physical sector size (which
is usually 512 B). This forces the kernel to issue I/O requests in larger chunks, reduces
the overall number of block writes and random disk seeks necessary during operation,
and greatly reduces the I/O overhead.
Next, we disable all kernel I/O reordering and scheduling optimizations for Hive
volumes. Because Hive strictly performs random access to the disk, it cannot benefit
from these disk access pattern anticipation features. On the contrary, disabling them
improves performance by eliminating the overhead of these unnecessary optimization
routines.
Managing HiVE Partitions
The Hive user space tool is a management interface which allows users to create Hive
instances on top of their hardware devices. The tool automatically computes the size
requirements for each volume and the recursive map Hive instances, partitions the
            Write    Read     Create     Stat       Delete
            (MB/s)   (MB/s)   (files/s)  (files/s)  (files/s)
Raw disk    216.04   221.74   82290      201180     105100
Hive        0.97     0.99     1570       3230       1790

Table 4.1: Disk I/O and file system performance of Hive. System parameters chosen
as l = 2, k = 3.
device accordingly, sets up the required metadata such as encryption IVs and reverse
maps used for detecting whether a given disk sector is free, and allows users to set
their passwords.
Once a Hive instance is created, the management tool enables users to create
virtual block devices that represent Hive volumes by supplying their passwords,
and then format, mount, and interact with them as usual. Users can also view
active instances of Hive and their volumes; this information is made available to the
management tool by the Hive driver through a /proc node.
The management tool performs all of its functionality automatically by commu-
nicating with the device-mapper framework through the appropriate ioctl calls.
4.5 Evaluation
We tested our implementation on a standard desktop computer with an Intel i7-930
CPU, 9 GB RAM, running Arch Linux x86-64 with kernel version 3.13.6. As the
underlying block device, we used an off-the-shelf Samsung 840 EVO SSD. For the
evaluation, we used Bonnie++, a standard disk and file system benchmarking tool.
We first tested an Ext4 file system with 4 KB blocks on the raw disk to get a baseline.
We then created 2 hidden volumes on our disk and set l = 2 and k = 3. We repeated
the experiments by running Bonnie++ on an Ext4 file system created on top of the
Hive volume. Table 4.1 presents the results averaged over 5 runs with a relative
standard deviation of < 6%.
These results show that I/O operations (i.e., writes and reads) were slower by
a factor of ≈ 200, while file system operations (i.e., create, stat, and delete) were
slower by a factor of 50 to 60. Random seek performance was not measurable on the
raw SSD (i.e., Bonnie++ reported that the tests completed too quickly to measure
reliable timings), whereas Hive achieved 1200 seeks/s. The Hive-induced CPU
utilization remained below 1% during measurements, indicating that random access I/O
constitutes the main bottleneck.
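The slowdown factors cited here follow directly from the numbers in Table 4.1:

```python
# Overhead factors from Table 4.1 (raw disk vs. Hive, l = 2, k = 3)
raw  = {"write": 216.04, "read": 221.74, "create": 82290, "stat": 201180, "delete": 105100}
hive = {"write": 0.97,   "read": 0.99,   "create": 1570,  "stat": 3230,   "delete": 1790}
factors = {op: raw[op] / hive[op] for op in raw}
print({op: round(f) for op, f in factors.items()})
assert 200 < factors["write"] < 230 and 200 < factors["read"] < 230
assert all(50 <= factors[op] <= 65 for op in ("create", "stat", "delete"))
```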
We conclude that the measured slowdown is certainly significant, and that Hive is
not a suitable substitute for general-purpose full disk encryption solutions. However, a
throughput of 1 MB/s on an off-the-shelf disk would be acceptable in many high-risk
scenarios that involve highly-sensitive data, rendering Hive practical for the intended
real-world cases.
4.6 Related Work
The concept of deniable encryption was first explored in detail by Canetti
et al. [77]. There exists a large body of free and commercial disk encryption tools
that provide various plausible deniability solutions (e.g., [78, 79, 80, 81]). These tools
do not hide access patterns to the disk, and as a result, repeated writes to hidden
volumes can be detected by a multiple-snapshot adversary.
Anderson et al. [82] present StegFS, and describe two techniques to hide data on
a disk. The first one utilizes a set of randomized cover files that are later modified to
hide data in; the plaintext can then be retrieved as a linear combination of these cover files. This
technique does not offer deniability if an attacker has knowledge of some of the files
on the disk, and it is not secure against multiple-snapshot adversaries.
The second technique, and its extension [83], hides files among randomly-filled
disk blocks, where data blocks are stored in locations derived from the file name and
a password. To avoid possible collisions, [84] instead iterates over the disk blocks
following the initial one, and writes to the first free block found. This approach avoids
the problem of known files, but it is still vulnerable to multiple-snapshot attacks.
Other research focuses on providing plausibly deniable encryption specifically on
mobile devices [85, 86, 87, 88]. Once again, these techniques are not designed to be
resilient in the face of multiple-snapshot attacks.
Paterson and Strefler [89] describe a practical attack on an earlier implementation
of our work. Specifically, Hive’s previous use of the RC4 stream cipher to generate
random blocks was vulnerable, allowing an attacker to distinguish between random-
ized dummy blocks and actual encrypted data on disk, and consequently, to break
Hive’s security guarantees. The current Hive implementation instead uses AES-256
in CTR mode for generating randomized blocks, and is no longer vulnerable to this
attack.
4.7 Summary
Even though full disk encryption is recognized as an effective way to guarantee con-
fidentiality of sensitive information on a computer’s disk, this basic cryptographic
technique is bound to fail in the face of adversaries with the ability to coerce users
into revealing their secret keys. Advanced tools offer plausibly deniable disk encryp-
tion schemes to address this weakness; however, more powerful adversaries that can
inspect multiple snapshots of a disk taken at different times can still successfully
detect hidden encrypted partitions.
In this chapter, we presented Hive, a practical hidden volume encryption scheme
resistant against multiple-snapshot adversaries. Our performance evaluation indicates
that Hive may not be suitable for everyday use, or for replacing regular disk encryption
technologies. However, it provides strong privacy guarantees and reasonable
performance in scenarios involving highly-sensitive data, where a security-to-performance
trade-off may be acceptable.
The proposed design and implementation satisfies all of our research goals stated
in Section 1.5. (G1) Hive is realized on a standard Linux system using a well-known
kernel framework; the concept of virtual block devices is applicable to other major
operating systems. (G2) Hive volumes function exactly like ordinary block devices;
they are supported on any block-based storage hardware, and can be formatted with
any file system. (G3) Likewise, both applications and users access Hive volumes just
like any other disk partition available on a system; encryption and privacy features
are offered transparently. (G4) Finally, both the presented write-only ORAM con-
struction, and the final Hive scheme perform reasonably well for the intended use
cases, and are suitable for practical use.
Chapter 5
Overhaul:
Input-Driven Access Control on
Traditional Operating Systems
5.1 Overview
The prevailing security model for traditional operating systems focuses on protect-
ing users from each other. For instance, the UNIX access control model provides a
framework for isolating users from each other through a combination of user iden-
tifiers, group identifiers, and process-based protection domains. The fundamental
assumption underlying this approach to security is that the primary threat to user
data originates from other users of a shared computing system.
The traditional user-based security model makes sense in the context of timeshar-
ing systems, where many users share access to a common pool of computing resources.
However, the modern proliferation of inexpensive and powerful computing devices has
resulted in the common scenario where one user has sole access to a set of resources.
Unfortunately, there exists a significant impedance mismatch between user-based ac-
cess control and the primary security threat in the single-user scenario, where users
inadvertently execute malicious programs that operate with that user’s privilege and
have full access to all of the user’s sensitive computing resources. As such, user-based
access control is not well-suited to preventing attacks against user confidentiality. In
particular, malicious programs can access privacy-sensitive hardware devices such as
the microphone or camera, or access virtual resources such as the system clipboard
and display contents of other programs.
In response to the changing computing landscape, much effort has been invested
in extending the user-based access control model to enable dynamic, user-driven
security. For instance, modern operating systems for smartphone and tablet devices
have taken the opportunity provided by these new platforms to introduce permission
systems as an extension to the underlying UNIX security model that remains in use
on these systems. For example, iOS gives users the ability to approve or deny access
to sensitive resources at runtime via popup prompts. Research operating systems
have also proven a fertile milieu for experimenting with security models that address
the needs of modern computing systems. For instance, Roesner et al. [90] present
an extension to ServiceOS where gadgets are embedded into applications that allow
users to grant or deny access to sensitive resources.
In each of the preceding examples, determining legitimate user intent and translat-
ing that intent into appropriate security policies is a central feature of their respective
security models. For each system, security decisions as to whether to allow or deny
access to sensitive resources for individual programs are delegated to the user, and the
system is responsible for establishing trusted input and output paths to capture user
intent such that malicious programs cannot influence this process by either spoofing
or intercepting user inputs.
We fundamentally agree with this approach to securing modern computing de-
vices, since users are often solely capable of classifying program actions as privacy
violations or other inappropriate uses of their resources. However, one drawback of
these efforts is that applications and operating systems must be written with this
security model in mind. This requirement largely excludes traditional operating sys-
tems such as Windows, Linux, and OS X, which remain in wide use, from enjoying
the benefits of user-driven access control.
In this chapter, we show that providing a user-driven security model for protect-
ing privacy-sensitive computing resources can be realized on traditional operating
systems, as an extension to the traditional user-based security model. In particular,
our security model is based on the observation that a legitimate application usually
accesses privacy-sensitive devices immediately after the user interacts with that ap-
plication (e.g., by clicking on a button to turn on the camera, or pressing the key
combination for a copy & paste operation). We call this security model input-driven
access control, and demonstrate how it can be enforced by correlating user input
events with security-sensitive operations based on their temporal proximity, making
access control policy decisions automatically based on this information, and notify-
ing the user of resource accesses in an unintrusive manner. We achieve this by using
lightweight and generic techniques to augment the operating system and display man-
ager with trusted input and output paths, which we collectively call Overhaul, and
demonstrate our approach by implementing a prototype for Linux and X Window
System.
In contrast to prior work, we show that capturing user interaction as a basis for
security decisions involving sensitive resources can be performed in an application-
transparent manner, obviating the requirement that applications be rewritten to con-
form to special APIs or with a more refined security model in mind. Using our
approach, we demonstrate how dynamic access control can be transparently achieved
for common resources such as the microphone, camera, clipboard, and display con-
tents. Finally, we show that this can be achieved without a discernible performance
impact, and without utilizing intrusive prompts or other changes to the way users
interact with traditional operating systems.
5.2 Threat Model
Input-driven access control primarily addresses two privacy breach scenarios. The first
one covers programs that stealthily run in the background and access privacy-sensitive
resources without the user’s knowledge, behavior typical of malware [91, 92, 93, 94].
Overhaul ensures that such attempts are automatically blocked.
The second scenario involves “benign-but-buggy” or misbehaving applications
that access protected resources without the user’s knowledge. Due to the trade-offs
Overhaul makes in order to transparently retrofit dynamic access control into
existing systems, unlike previous work [90], it is not possible to match each input
event to a precise user intent. Therefore, in this scenario, Overhaul instead visu-
ally notifies the user to alert her of the undesired resource access.
For this work, we assume that the trusted computing base includes the display
manager, OS kernel, and underlying software and hardware stack. Therefore, we
assume that these components of the system are free of malicious code, and that
normal user-based access control prevents attackers from running malicious code with
super user privileges. On the other hand, we assume that the user can install and
execute programs from arbitrary untrusted sources, and therefore, that malicious
code can execute with the privileges of the user. We assume that complementary
preventive security mechanisms, such as ASLR and DEP, are in place to mitigate
privilege escalation attacks.
We note that all forms of user-driven security are fundamentally vulnerable to full
mimicry attacks. For instance, if a user could be tricked into knowingly installing,
executing, and granting privileges to a malicious application that imitates a well-
known legitimate application, user-driven security models would fail to provide any
protection. Hence, our threat model does not include this third scenario.
5.3 Design
The architecture of an Overhaul-enhanced system requires modifications to and
close interaction between several components of the operating system and display
manager. In this section, we describe the abstract design of Overhaul, independent
of the underlying operating system, and present the challenges involved in monitoring
and tracking user input across process boundaries.
Note that our work assumes a user space display manager (i.e., a design similar
to that of the X Window System), an approach employed by popular commodity
operating systems. Different OS designs can allow display managers integrated into
the kernel, which would alleviate the need for some of the components we describe
below, such as a separate trusted communication channel between the kernel and the
display manager. Our design can be applied to that case in a straightforward manner.
5.3.1 Security Properties
The primary security goals Overhaul aims to achieve through input-driven access
control are the following:
(S1) Overhaul must allow an application to access privacy-sensitive resources only
if the user has explicitly interacted with that application through physical, hard-
ware input devices, immediately before the access request. Resources include
hardware devices such as cameras, microphones, and other sensors, or virtual
resources such as the system clipboard and display contents of user programs.
(S2) Overhaul must prevent programs from forging input events or mimicking user
interaction to escalate their (or other applications’) privileges.
(S3) Overhaul must ensure that legitimate user interaction events cannot be hi-
jacked by malicious applications, such that users should not mistakenly grant
permissions to a malicious program that were intended for a legitimate program.
(S4) Overhaul must notify users of successful accesses to protected resources via
a trusted output path that cannot be obscured or interfered with by other
applications.
5.3.2 Trusted Input & Output Paths
In order to realize any of the aforementioned security goals, Overhaul must es-
tablish a trusted path for user input. By a trusted path, we refer to the property
that input events should be authenticated as legitimately issued by a real user with
a hardware input device, as opposed to synthetic input events that can be issued
programmatically. This capability serves as a generally useful primitive that could
be exposed to higher layers of the software stack. However, in this chapter we focus
on illustrating its use for transparently securing access to system-wide resources.
The display manager of the system is often responsible for receiving all low-level
input events, including mouse clicks and key presses, from device drivers and deliv-
ering them to their target application windows. Consequently, Overhaul utilizes a
display manager with an enhanced input dispatching mechanism that can detect and
filter out synthetically generated inputs to fulfill the trusted input path requirement.
Likewise, Overhaul is tasked with establishing a trusted output path to alert
users whenever a sensitive resource access request is granted. We achieve this through
visual notifications that appear on the screen. Since the display manager is in control
of the screen contents, Overhaul extends it with an overlay notification mecha-
nism that is always stacked on top of the screen contents, and cannot be obscured,
interrupted, or interfered with by other processes.
5.3.3 Permission Adjustments
The kernel is responsible for dynamically adjusting the privilege level of user pro-
grams in response to permission granting actions, or in other words, authentic user
input events. In order to accomplish this task, the kernel first needs to establish
a secure communication channel to the display manager. The display manager can
then use this channel to send the kernel interaction notifications each time the user
interacts with an application. Since the display manager is often a regular user space
process, the kernel is able to authenticate the communication endpoint and ignore
communication attempts by other processes in a straightforward manner.
The kernel keeps a history of these interaction notifications, which include the
identity of the application that received the interaction and a timestamp, inside a
permission monitor. Once this information is stored, the permission monitor can
respond to permission queries and adjustment requests, originating either from the
user space display manager through the already established secure communication
channel, or from within the kernel, any time a permission decision is to be made.
This decision process involves comparing a timestamp issued together with the query
with the stored interaction timestamp corresponding to the target application, and in
this way correlating privileged operations with input events based on their temporal
proximity.
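The correlation logic can be sketched as a small permission monitor. All names and the threshold value below are illustrative, not taken from the Overhaul implementation:

```python
class PermissionMonitor:
    """Sketch: grant a privileged operation only if the application received
    authentic user input within the last `delta` seconds."""

    def __init__(self, delta=1.0):
        self.delta = delta
        self.last_input = {}               # app id -> timestamp of last E_{A,t}

    def notify_interaction(self, app, t):  # N_{A,t} from the display manager
        self.last_input[app] = t

    def query(self, app, t_op):            # Q_{A,t+n} -> grant / deny
        t = self.last_input.get(app)
        return "grant" if t is not None and 0 <= t_op - t < self.delta else "deny"

pm = PermissionMonitor(delta=1.0)
pm.notify_interaction("camapp", t=100.0)
assert pm.query("camapp", t_op=100.3) == "grant"   # shortly after the click
assert pm.query("camapp", t_op=102.0) == "deny"    # too long after the input
assert pm.query("malware", t_op=100.3) == "deny"   # never interacted with
```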
Finally, the kernel also uses the secure communication channel to request from
the display manager that it display a visual alert when a resource access is granted.
5.3.4 Sensitive Resource Protection
An important class of system resources that Overhaul aims to protect is privacy-
sensitive hardware devices. These devices could include arbitrary sensors attached to
the system; typical examples on desktop operating systems include the camera and
microphone. In order to implement dynamic access control over hardware resources,
the kernel is responsible for mediating accesses to these privacy-sensitive hardware
devices.
However, note that the kernel does not interpose on all privacy-sensitive resources;
representative examples include the system clipboard and program display contents.
The operating system often has no immediate visibility into such resources. Instead,
these resources are controlled by the system’s display manager. Applying dynamic
access control over these resources requires the display manager to query the kernel
permission monitor, and grant or deny the action based on the response.
To illustrate the enhancements required to the kernel and the display manager,
and how sensitive resources are protected, we present two scenarios that build upon
the components described above.
For the following discussion, we let:

op_t be a privileged operation at time t,
     where op ∈ {copy, paste, scr, mic, cam},
E_{A,t} be an input event sent to application A at time t,
N_{A,t} be an interaction notification corresponding to E_{A,t},
Q_{A,t} be a permission query for application A at time t,
R_{A,t} be a response ∈ {grant, deny} for Q_{A,t},
V_{A,op} be a visual alert request, indicating A performs op.
Hardware Resources
Figure 5.1 presents an example interaction involving an application’s request to access
the system microphone. In an unmodified system, the request would succeed so long
as application A holds the permission to access the microphone device at t + n.
Overhaul introduces the following changes: First, the system ensures that for all
applications the permission to access the microphone is denied by default. (1) When
the user clicks on a button in application A to turn on the microphone at time t,
the display manager receives the input event E_{A,t} and verifies that it is generated by
a hardware input device through user interaction. (2) If E_{A,t} is authentic, then the
display manager first sends the kernel permission monitor an interaction notification
N_{A,t} through the secure communication channel. The permission monitor records
this notification, indicating that A received authentic user input at t. (3) The display
manager then forwards E_{A,t} to its destination A. (4) Upon receiving the event, A
Figure 5.1: Dynamic access control over privacy-sensitive hardware devices.
attempts to turn on the microphone. The permission monitor intercepts A’s request
mic_{t+n} to access the device. It compares A’s latest interaction time t with the device
access request time t + n to correlate the input event with the privileged operation,
based on a preconfigured threshold δ. (5) Access to the device is granted to A only if
the privileged operation could successfully be correlated with a preceding input event
(i.e., if (t + n) − t = n < δ holds). (6) Finally, the kernel sends V_{A,mic} to the display
manager to request that the user be alerted. This step is necessary because the display
manager may not have adequate information to identify the process that actually
accessed the resource (e.g., due to IPC mechanisms, as explained in Section 5.3.5).
The verification of user input authenticity provides the property that sensitive
device access operations can only be performed in response to legitimate user input.
Note that, in this scenario, no permission query from the display manager to the
permission monitor is necessary. Since the kernel has full mediation over hardware
resources, the permission monitor can implicitly adjust the permissions of A when
necessary. This entire process is transparent to the application.
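A minimal sketch of the correlation check in step (5), assuming POSIX time_t timestamps; DELTA_SECONDS reflects the 2-second threshold chosen in Section 5.4.2, and all names are illustrative rather than Overhaul's actual identifiers:

```c
#include <stdbool.h>
#include <time.h>

/* Correlation check from step (5): an access request issued at time
 * t + n is granted only if the requester received authentic user input
 * at time t with n < delta. Names are illustrative. */
#define DELTA_SECONDS 2

bool correlate(time_t last_input_t, time_t request_t)
{
    if (request_t < last_input_t)
        return false;                       /* no input precedes the request */
    return (request_t - last_input_t) < DELTA_SECONDS;
}
```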
Display Resources
Figure 5.2 shows an example interaction for a clipboard paste operation between the
display manager and an application A. The baseline protocol consists of A requesting
the clipboard contents from the display manager, and receiving back the copied data.
Overhaul revokes all clipboard access permissions by default, and modifies the
protocol in the following way:
(1) First the user inputs the keystrokes to paste some text, (2) the display manager
verifies that the input E_{A,t} is authentic and notifies the kernel permission monitor with
N_{A,t}, (3) and forwards the key event to A. (4) After receiving the command from the
user, A issues a clipboard paste request paste_{t+n} to the display manager. (5) Instead
of immediately serving the request, the display manager sends a permission query
Q_{A,t+n} to the kernel permission monitor through the secure communication channel.
(6) As before, the permission monitor compares the interaction time t in its records
for A with the privileged operation request time t + n issued together with the query.
If the correlation of the input event with the operation request is successful based
on the temporal proximity threshold δ (i.e., n < δ), the permission monitor replies
with a grant response R_{A,t+n}; otherwise R_{A,t+n} is a deny response. (7) If and only if
R_{A,t+n} is a permission grant does the display manager return the data to A; otherwise
A is blocked from accessing the clipboard. In this scenario an explicit visual alert
request from the kernel is not necessary, because the display manager can successfully
identify the requesting process without kernel assistance.
Figure 5.2: Protecting copy & paste operations against clipboard sniffing.
Here, the secure communication channel between the kernel and the display man-
ager is used both for sending interaction notifications to the permission monitor, and
for querying it whether to allow the privileged operation.
As before, the verification of user input authenticity provides the property that
copy & paste operations can only be performed in response to actual inputs. This
provides protection against malicious programs that attempt to capture sensitive data
from the system clipboard, such as passwords pasted from a password manager. We
note that because permission queries are implicitly generated along with the copy &
paste requests, this protection is transparent to the application.
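The grant/deny decision in steps (5)–(7) reduces to the following sketch; the granted argument is a hypothetical stand-in for the permission monitor's reply received over the secure channel:

```c
#include <stddef.h>

/* Sketch of steps (4)-(7): the display manager serves a paste request
 * only if the permission monitor's reply is a grant. */
const char *serve_paste(int granted, const char *clipboard)
{
    if (!granted)
        return NULL;      /* deny: the client is blocked from the clipboard */
    return clipboard;     /* grant: the copied data is returned to the client */
}
```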
Note that, in this scenario, first sending input notifications to the permission mon-
itor and later querying it for the same information could seem unnecessary. Instead,
one could store input notifications inside the display manager to avoid kernel commu-
nication. However, in the next section, we show that our design is necessary for the
Figure 5.3: A program launcher executing a screen capture program, illustrating the
need for interposing on process spawn mechanisms to propagate interaction information.
kernel to track interactions across process boundaries through process spawns and
inter-process communication (IPC) channels.
5.3.5 Interaction Across Process Boundaries
Real-life applications often consist of multiple processes or threads, and communicate
with each other using application-specific protocols via IPC facilities provided by the
OS. This significantly complicates the task of associating user input with privileged
operations requested by an application, because the process receiving the input event
could be different from the actual process that accesses a sensitive resource. We
illustrate this challenge that Overhaul needs to address with the examples below.
Figure 5.3 presents a scenario where an application Shot attempts to capture a
screen image. Since the screen content is also a resource controlled by the display
manager, this example is similar to the previous copy & paste example. However, here,
Figure 5.4: A multi-process browser, components of which communicate via shared
memory IPC. This example illustrates the need for interposing on IPC endpoints to
propagate interaction information.
the user first executes a program launcher Run, types in the name of the program Shot,
and the application launcher executes Shot on the user’s behalf. In other words, (1–3)
the user actually interacts with Run, which the kernel permission monitor records;
(4) but Run creates a new process Shot, (5) and the screen capture request scr_{t+n} is
made by this different process for which there exists no interaction record.
In another scenario, Figure 5.4 depicts how a multi-process Internet browser that
uses separate processes for each browser tab (i.e., similar to Chromium) would run
a web-based video conferencing application. (1–3) When the user commands the
browser to launch a video conference session, she actually interacts with the main
browser window Browser, and the permission monitor is notified of this. However,
Browser opens the web application in a separate process Tab and (4) commands it
to turn on the camera via shared memory IPC. As a result, (5) Tab requests cam_{t+n}
without a corresponding interaction record in the permission monitor.
The ubiquity of multi-process application architectures, applications that launch
third-party programs, and IPC use makes it necessary for Overhaul to correctly
handle cases similar to those exemplified above. Therefore, our design requires
Overhaul to interpose on all process and thread spawns, as well as the entire range
of IPC mechanisms provided by the OS (e.g., (4) in Figure 5.3 and Figure 5.4).
Specifically, Overhaul needs to propagate interaction notifications between
processes according to the following policy:
• Interaction notifications of a parent process must be propagated to a newly-
spawned child process. In other words, whenever a process X creates a new
process Y, all interaction notifications N_{X,t} recorded in the permission monitor
must be duplicated as N_{Y,t}.
• In an IPC channel established between two (or more) processes, interaction
notifications of a message sender process must be propagated to the receiver
process. That is, Overhaul must monitor all established IPC endpoints, and
whenever process X sends a message to process Y, interaction notifications N_{X,t}
recorded in the permission monitor must be duplicated as N_{Y,t}.
In this way, Overhaul can support process spawns and IPC chains of arbitrary
length and complexity, and remain transparent to the applications and oblivious to
the application-level communication protocols.
5.3.6 Discussion
Our approach fulfills the design goals enumerated in Section 5.3.1. Overhaul pro-
vides a trusted input path between the user and kernel, a display manager that
authenticates hardware-generated input events and interposes on display resources,
and a kernel permission monitor that mediates access to sensitive hardware (S1), (S2).
The display manager also enforces appropriate visibility requirements on application
windows to prevent hijacking of authentic user interaction (S3), and ensures that
resource accesses are communicated to the user via visual alerts (S4).
We point out that Overhaul inherently shares the limitations of other user-
driven security approaches. In particular, because the user’s perception of malice
and their interaction with applications are central to this security model, Overhaul
cannot provide protection against malware that tricks a user into voluntarily
installing and using it, for example, by mimicking the appearance and functionality
of well-known legitimate applications. Additionally, Overhaul does not support
running scheduled tasks, or persistent non-interactive programs that need access to
the protected sensitive devices (e.g., a cron job or daemon that periodically takes
screen captures). We stress that these issues are fundamental to any user-driven ac-
cess control model, and despite its limitations Overhaul provides important security
benefits complementing the standard access control models employed in commodity
operating systems, without any significant detriments to performance or user experi-
ence.
The trade-offs Overhaul makes between backwards compatibility with legacy
programs and defending against on-system malware result in a system that provides
strictly weaker security guarantees than prior work on user-driven access control [90],
where a stronger connection between user intent and program behavior can
quired for malware to interact with a user interface on the user’s behalf so long as
the hardware is considered to be free of embedded malicious functionality.
As a result, Overhaul focuses on distinguishing between hardware and software-
generated input events. We identified two facilities provided by X11 for generating
and injecting synthetic events to the event queue: the SendEvent [95] and
XTestFakeInput [96] requests. SendEvent is a core X11 protocol request that allows a client
to send events to other clients connected to an X server. In particular, this interface
could allow malware to inject keystrokes or mouse events on other windows. However,
events sent using this interface must have a flag set that indicates that the event is
synthetic. As such, filtering such input events within the X server is a matter of
checking for the presence of this flag.
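The flag check can be sketched as follows; the struct is a simplified stand-in for the Xlib XAnyEvent layout, not the X server's internal event representation:

```c
#include <stdbool.h>

/* Events injected via SendEvent carry a send_event flag, so authentic
 * hardware input is recognized by the flag's absence. */
struct x_event {
    int  type;        /* e.g., a key or button event code */
    bool send_event;  /* set iff the event came from a SendEvent request */
};

bool is_hardware_input(const struct x_event *ev)
{
    return !ev->send_event;
}
```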
The second request, XTestFakeInput, is part of the XTest extension, which is used
for providing a GUI testing framework. In this case, it is not possible to implement
a flag check since no indicator flag is used with XTest requests. Therefore, it was
necessary to modify the X server to tag events with the extension or driver that
generated the event. While this is more onerous than checking for the existence
of a flag, it is also a method for determining the provenance of input events that
generalizes to future modifications to the X Window System.
With the ability to distinguish hardware-generated input from synthetic input,
the X server was modified to connect to a secure communication channel upon ini-
tialization (as we will explain in Section 5.4.2), and send interaction notifications to
the kernel permission monitor every time the user interacts with an X client. These
notifications are labeled with the PID of the process that received the event and a
timestamp. The PID serves as an unforgeable binding between a window belonging
to a process and events, as the mapping between X client sockets and the PID is
retrieved from the kernel.
We note that the trusted input path described so far remains vulnerable to click-
jacking attacks [97]. For instance, a malicious X client may place transparent overlays
on the screen, or periodically display a previously invisible window over other appli-
cations in an attempt to trick users into clicking on them and stealing authentic input
events. To prevent this, Overhaul only generates interaction notifications if the X
client receiving the event has a valid mapped window that has stayed visible for
longer than a predefined time threshold.
Trusted Output
As described before, the trusted output path that Overhaul utilizes is a visual alert
shown on the screen whenever a sensitive resource is accessed. Since the X Window
System controls the entire display contents, Overhaul ensures that displayed alerts
are rendered on top of all other windows, and cannot be blocked, obscured, or manip-
ulated by other X clients. We designed the alert messages to be displayed for a few
seconds at the top of the screen, at a reasonably large size, to be easily noticeable.
Since resource accesses can only be granted immediately following user input, the user
is highly likely to be present and interacting with the computer, making it difficult
for her to miss an alert. In addition, the alerts make use of a visual shared secret set
by the user to prevent malicious applications from forging fake alerts. Two example
alerts are shown in Figure 5.5.
Note that, compared to popup prompts that require explicit policy decisions from
the user during runtime (e.g., Windows User Account Control or iOS permission di-
alogues), alerting the users with visual notifications inherently establishes a looser
association between user actions and the application behavior. Indeed, we imple-
mented and verified that Overhaul’s security primitives can be used to support
such a security model in a trivial manner, where the trusted output path would be
Figure 5.5: Sample visual alerts shown by Overhaul. The cat image is used as a
visual shared secret to indicate that the alert is authentic.
used for displaying an unforgeable prompt, and the trusted input path to verify user
interaction with it. However, it has been shown that popup prompts have severe
usability issues that conflict with their security properties, and that they are often
ignored by users, or disabled completely [98]. Therefore, we believe the non-intrusive,
transparent approach we took with Overhaul is a worthwhile trade-off between
security and usability, and would be a more effective security solution in a real-life
setting. We do not explore the popup prompt approach further in this chapter.
Display Contents
The X Window System allows any client program to access the contents of the root
window (i.e., the entire screen), or any specific window through the GetImage core
protocol request [95], or the XShmGetImage request provided by the MIT shared mem-
ory extension [99]. These interfaces can be used to retrieve the displayed contents for
any purpose such as taking screenshots or recording the desktop.
In order to mediate accesses to the display contents of X clients, our modified
X server intercepts these events, and queries the kernel permission monitor via the
secure communication channel with a message containing the PID of the requesting
process and a timestamp. Based on the response, access is either granted, or the
screen capture request is dropped. This way, Overhaul can enforce that display
contents can only be accessed in response to user input.
The X Window System also provides two additional core protocol requests, CopyArea
and CopyPlane, which are used for copying a representation of display contents be-
tween two buffer areas. These requests could be used as an alternative approach to
capture the screen contents, and therefore, Overhaul must also interpose on them.
However, unlike the previous GetImage, these requests are not specifically designed
for capturing display contents, and they are regularly used by X clients for various
other purposes. Therefore, in this case, Overhaul first needs to inspect the owners
of the source and destination buffers specified in the copy request. If the owners of
both buffers are identical, in other words, a client is copying a portion of its own win-
dow, the request is allowed to proceed. However, if a client is requesting the display
contents owned by a different client (or the root window), Overhaul applies its user
input-based access control as before, and allows or blocks the request accordingly.
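The triage described above can be sketched as follows; owner identifiers are illustrative:

```c
/* Triage for CopyArea/CopyPlane: a client copying within its own window
 * proceeds unchecked; a cross-client copy (including one sourcing the
 * root window) falls through to the user-input-based permission check
 * described in the text. */
enum copy_verdict { COPY_ALLOW, COPY_CHECK_INPUT };

enum copy_verdict triage_copy(int src_owner, int dst_owner)
{
    return (src_owner == dst_owner) ? COPY_ALLOW : COPY_CHECK_INPUT;
}
```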
Clipboard Contents
The X Window System does not provide a central clipboard space, but instead defines
the copy & paste operations as an inter-client communication protocol [100] outlined
in Figure 5.6. The steps to copy data from a source client to a target client are as
follows:
(1) A copy operation is initiated by user input received via an X input driver.
(2) The source client asserts ownership of a selection object by issuing to the X server
Figure 5.6: Protocol diagram for the X11 copy & paste operation. Modified steps are
highlighted in bold.
a SetSelection request. In (3) and (4) the source client confirms with the X server
that it has successfully acquired the selection. This concludes the copy operation;
note that no data has actually been copied at this stage.
(5) The paste event is initiated by user input. (6) The target client sends a
ConvertSelection request to the X server, (7) which, in turn, issues a Selection
Request to the selection owner (i.e., the source client) to notify it of the request for
the copied data. (8) The source client sends the data to the X server to be stored
as a property using a ChangeProperty request, (9) and then requests from the server
that the target client be sent a Selection Notify event, using a SendEvent request.
(10) The paste target is notified that the copied data is available. (11) The target
client responds with a GetProperty request, (12) retrieves the data, (13) and finally,
removes it from the server.
In Figure 5.6, the protocol steps that were modified in Overhaul are highlighted
in bold. In particular, steps (1) and (5) are events that are verified as authentic
user input from a hardware input device. The X server notifies the kernel permission
monitor of these events as previously described. In steps (2) and (6), before serving the
SetSelection or ConvertSelection requests received from the clients, the X server
first queries the kernel permission monitor via the secure communication channel to
confirm that the copy or paste request is preceded by corresponding user interaction.
The operation is allowed to proceed only if the permission monitor responds with a
permission grant message; otherwise, the client is sent back an error message.
Note that this copy & paste protocol is followed merely by convention, and the
given interaction sequence is not enforced by the X server. As a result, a malicious
X client may attempt to skip certain steps of the protocol to bypass Overhaul’s
checks. One possible attack vector is the SendEvent request which allows an X client
to command the X server to send an X11 event on behalf of the client. By exploiting
this mechanism, a malicious client can directly send SelectionRequest events to
other clients and receive the copied data from the selection owner. To prevent such
attacks, our implementation also interposes on the SendEvent requests, and blocks
the sending of events that can break the copy & paste protocol. Other examples
of possible attacks include subscribing to events generated by the X server when
properties are created and updated to retrieve the pasted data stored in them before
the actual paste target could remove it. Overhaul ensures that such events are only
delivered to the paste target while the clipboard data is in flight. We omit details of
these low-level implementation issues.
5.4.2 Enhancements to the Linux Kernel
As shown in Section 5.3, our implementation augments the Linux kernel with a per-
mission monitor that establishes a secure communication link to the X Window Sys-
tem, mediates sensitive hardware accesses, adjusts per-application privileges in re-
sponse to interaction notifications, and responds to permission queries from the X
server for access to display resources.
Secure Communication Channel
The first property that our kernel must support is establishing and authenticating the
communication channel to the X Server. In our prototype, we used the Linux netlink
facility to provide this channel [101]. Netlink was originally designed to exchange
networking information between the kernel and user space, but it serves as a robust
general communication channel across this boundary.
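A minimal userspace sketch of opening such a channel, as the modified X server might during initialization; NETLINK_ROUTE is used here only as a stand-in protocol number, since the protocol identifier Overhaul registers is an implementation detail not given in the text:

```c
#include <sys/socket.h>
#include <linux/netlink.h>

/* Open and bind a netlink socket; the kernel assigns the port id
 * (nl_pid left as zero). Returns the socket fd, or -1 on failure. */
int open_netlink(void)
{
    struct sockaddr_nl addr = { .nl_family = AF_NETLINK };
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (fd < 0)
        return -1;
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        return -1;
    return fd;
}
```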
Netlink, however, does not solve the authentication problem. That is, the kernel
and X server must ensure that no malicious program is interposing on the channel.
While using a standard mutual authentication protocol is possible, our prototype
instead relies on the fact that the kernel operates in supervisor mode and can intro-
spect on the user space X process. Once the kernel establishes the netlink channel
and receives a connection request from X during server initialization, it examines the
virtual memory maps to check whether the process it is communicating with is indeed
the X server. In particular, it checks whether the executable code mapped into the
process is loaded from the well-known, and super user-owned, file system path for the
X binaries. If so, it considers the remote party to be authenticated as the legitimate
X server and, due to the kernel’s supervisor privileges, the X server trusts that the
kernel will perform this procedure correctly.
Device Mediation
Overhaul must interpose on all accesses to sensitive hardware devices. To this
end, it suffices on Linux to monitor open system call invocations on device nodes
exposed in the file system. Therefore, our prototype implements an augmented open
system call that, in addition to normal UNIX access control checks, looks up the
interaction notification records received from the X server for the running process
to allow or deny access to the device accordingly. Note that it is usually considered
better practice to implement kernel-side security checks using the Linux Security
Modules (LSM) framework [102] instead of modifying system calls directly. However,
as of this writing, LSM does not officially support stacking multiple security modules.
Since Overhaul is not a replacement for other security modules, we implemented
our prototype in this way as a conscious design choice.
An important implementation detail of our prototype deals with accurately map-
ping sensitive devices to their file system paths. In particular, modern Linux dis-
tributions often make use of dynamic device name assignments at runtime using
frameworks such as udev. Therefore, our prototype relies on a trusted helper applica-
tion, owned by the super user and protected against unauthorized modification using
normal user-based access control, to manage this mapping. It is invoked in response
to changes in the device file system mounted by convention at /dev, and propagates
these changes to the kernel via an authenticated netlink channel.
Process Permission Management
The kernel permission monitor receives interaction notifications from the X server,
which includes a PID and a timestamp, and needs to record this information in an
easily accessible context associated with each process. Our prototype stores this infor-
mation inside the process descriptor, the data structure Linux uses to represent a
process. Every process descriptor is implicitly associated with a unique process; there-
fore, this procedure only requires us to locate the process descriptor corresponding to
the PID reported in the interaction notification, and save the interaction timestamp
inside this structure.
To perform a permission check, the permission monitor first receives the PID of the
process that requests access to the sensitive resource, either internally from the device
mediation layer, or from the X server via the netlink channel. Next, it retrieves the
correct process descriptor and compares the timestamp recorded there (i.e., the most
recent user interaction time) with the privileged operation's timestamp. If the two
timestamps fall within a pre-configured threshold of each other, permission is granted
(or a positive response is sent back to the X server). We empirically determined that
setting a threshold of less than 1 second can lead to falsely revoked permissions, but
2 seconds is sufficient to prevent incorrectly denying access to legitimate processes.
In our long-term experiments with this configuration, described in Section 5.5.4, we
did not encounter any broken functionality or unusual program behavior.
Process Creation & IPC
As previously explained, Overhaul must be able to track interaction information
across process boundaries for any meaningful real-life use. Recall from Section 2.4.1
that, in Linux, a new process (i.e., the child) is created by duplicating an existing
process (i.e., the parent), using the clone system call. This operation duplicates
the process descriptor of the parent to be used for the child process, which includes
the interaction timestamp stored in the same data structure. In other words, our
implementation ensures that the parent’s interaction information is passed down to
a newly-created child automatically, without additional modification to the kernel.
This property also extends to the threads of a process, because Linux does not have a
strict distinction between processes and threads and uses a separate process descriptor
for each.
In contrast, tracking interaction information across IPC channels requires further
modifications to the kernel for each IPC facility provided by the OS. Our
implementation supports POSIX shared memory and message queues, System V
shared memory and message queues, FIFOs, anonymous pipes, and UNIX domain sockets.
Higher-level IPC mechanisms that are built on these OS primitives (e.g., D-Bus) are
also automatically covered. These IPC mechanisms are modified in a similar man-
ner to propagate interaction information between the two endpoint processes, which
works as follows:
(1) When an IPC channel is first established, we embed an expired interaction
timestamp inside the kernel data structures that correspond to the IPC resource.
(2) When a process wants to send data through an IPC link, it first embeds its own
interaction timestamp inside the IPC resource, unless the structure already contains a
more recent timestamp. (3) When the receiving process reads data from the channel,
it compares its own interaction timestamp with the one embedded inside the IPC
resource. If the IPC channel has a more up-to-date timestamp, the process saves it
in its own process descriptor.
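Both the clone-time inheritance and the three-step IPC protocol above reduce to keeping the newest timestamp, as in this sketch; the struct names are illustrative stand-ins, not the kernel's actual data structures:

```c
#include <time.h>

/* Each process descriptor and each IPC resource carries the timestamp of
 * the most recent authentic user input associated with it. */
struct proc_desc { time_t last_input; };
struct ipc_res   { time_t last_input; };

/* clone: the child inherits the parent's interaction record. */
void on_spawn(const struct proc_desc *parent, struct proc_desc *child)
{
    child->last_input = parent->last_input;
}

/* IPC send: stamp the resource with the sender's record, keeping the
 * most recent timestamp seen so far. */
void on_send(const struct proc_desc *sender, struct ipc_res *res)
{
    if (sender->last_input > res->last_input)
        res->last_input = sender->last_input;
}

/* IPC receive: the receiver adopts the resource's record if it is newer
 * than its own. */
void on_recv(struct proc_desc *receiver, const struct ipc_res *res)
{
    if (res->last_input > receiver->last_input)
        receiver->last_input = res->last_input;
}
```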
Implementation of this protocol requires adding a timestamp field inside the IPC
data structures, and inserting checks inside the corresponding send and receive func-
tions for each IPC facility. However, a notable exception is POSIX and SysV shared
memory, which must be handled differently. Specifically, once the kernel allocates and
maps a shared memory region with the mmap system call, writes and reads to these
regions are regular memory operations that cannot be intercepted above the hardware
level. We overcome this obstacle by taking a different approach. We interpose on vir-
tual memory mapping operations inside the kernel, check whether the mapped area
is flagged as shared (indicated by a flag inside the corresponding vm_area_struct),
and if so, revoke read and write permissions for that memory area. This causes
subsequent accesses to that memory region to generate access violations, and allows
Overhaul to capture the IPC attempt inside the page fault handler. We then run
the interaction propagation protocol described above, and temporarily restore the
memory access permissions to their original values to allow the memory operation to
succeed on the next try. Clearly, repeating this process for every memory access could
lead to severe performance overhead; therefore, after every access violation, we put
the corresponding vm_area_struct on a wait list before its permissions are revoked
once again. This allows memory accesses that immediately follow the first page fault
to proceed uninterrupted. This wait duration must be sufficiently shorter than the 2
second interaction expiration time, since we would miss shared memory IPC attempts
and fail to propagate interaction timestamps during this period. We configured this
duration to 500 ms, which yielded a good performance-usability trade-off as shown
in Section 5.5.
Command Line Interface Interactions
A final implementation requirement arises from the fact that Linux systems often
make extensive use of the command line interface. On graphical desktops, this is
achieved by running a terminal emulator (e.g., xterm) that communicates with a
command line shell (e.g., bash) via a pair of pseudo terminal devices. If the user
was to type in the name of a command line application inside a terminal emulator
(as opposed to using a graphical application launcher), the terminal emulator would
receive the input events, and communicate the command to launch to the shell via the
pseudo terminal devices. Any subsequent device access requests would be made by a
program launched by the shell process, which has not received any direct interaction.
In fact, the shell usually is not even an X client and, thus, cannot receive X11 input
events.
To enable command line tools that access the protected sensitive devices to func-
tion correctly under Overhaul, we implemented an interaction timestamp propaga-
tion protocol analogous to the one described for IPC channels above. Here, the modifi-
cations are made inside the pseudo terminal device driver. Whenever a process writes
to a terminal endpoint, that process embeds its timestamp into the kernel data struc-
ture representing the pseudo terminal device. Subsequently, when another process
reads from the corresponding terminal endpoint, that process copies the embedded
timestamp to its process descriptor, unless it already has a more recent timestamp.
Process Isolation and Introspection
Overhaul does not require sandboxing of individual user applications, or any ad-
vanced process isolation mechanism beyond the kernel and process memory isolation
that commodity operating systems provide. In particular, all interaction notifica-
tions in our design are managed by the OS; they are never exposed to user space
applications. This prevents malicious applications from tampering with legitimate
interaction notifications to mount denial-of-service attacks, or hijacking interaction
notifications of other processes. Similarly, since each interaction notification is bound
to a specific process, malicious applications that run in the background and receive
no user interaction cannot hijack the permissions granted to another application.
However, process introspection and debugging facilities offered by operating sys-
tems need attention, because they might make it possible to inject malicious code
into legitimate applications that are expected to have access to sensitive resources.
In Linux, this threat is somewhat contained since the Linux debugging facilities, such
as ptrace and /proc/{PID}/mem (which also uses ptrace internally), do not allow
attaching to processes that are not direct descendants of the debugging process. In
other words, even if two unrelated processes run with identical (but non-super user)
credentials, they cannot manipulate each other’s state.
In our implementation, we provide even stricter security by temporarily disabling
all permissions for a debugged process with a trivial patch to the ptrace system
call. This also prevents parent processes from tracing their own children, which, in
turn, subverts attacks where a malicious program could launch another legitimate
executable, and then inject code into it. Overhaul enables this protection by de-
fault, but it can be toggled off by the super user through a proc file system node to
facilitate legitimate debugging tasks.
5.5 Evaluation
Our evaluation of Overhaul consists of measuring its performance impact on the
system, and testing the usability and security properties of the implementation.
Benchmarks         Baseline         Overhaul         Overhead

Device access      45.20 s          46.18 s          2.17 %
Clipboard          116.48 s         119.93 s         2.96 %
Screen capture     68.26 s          69.86 s          2.34 %
Shared memory      234.86 s         236.33 s         0.63 %
Bonnie++           47319 files/s    47265 files/s    0.11 %

Table 5.1: Performance overhead of Overhaul.
5.5.1 Performance Measurements
Since Overhaul is an input-driven system that only impacts the operations per-
formed on privacy-sensitive resources, we expect its performance overhead to be
overshadowed by human reaction times and I/O processing delays. Indeed, in our
experiments with the prototype implementation, we did not observe a discernible
performance drop compared to normal system operation. Consequently, in order to
obtain measurable performance indicators to characterize the overhead of Over-
haul, we created micro-benchmarks that exercise the critical performance paths of
our system. We also used a standard file system benchmarking utility to measure
the impact of our modified open system call on regular file system operations. We
explain each of these benchmarks in more detail below.
Device access. In this benchmark, we measured the time to open the file sys-
tem device node corresponding to the microphone installed on our testing system 10
million times.
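To illustrate, the essence of this benchmark can be sketched in a few lines. This is a simplified reconstruction rather than the code used in the evaluation; it substitutes os.devnull for the microphone's device node and a smaller iteration count so it runs quickly anywhere:

```python
import os
import time

def open_close_benchmark(path: str, iterations: int) -> float:
    """Repeatedly open and close a file system node; return total seconds."""
    start = time.perf_counter()
    for _ in range(iterations):
        fd = os.open(path, os.O_RDONLY)
        os.close(fd)
    return time.perf_counter() - start

# Stand-in for the microphone's device node; iteration count scaled down
# from the 10 million opens used in the evaluation.
n = 100_000
elapsed = open_close_benchmark(os.devnull, n)
print(f"{elapsed:.3f} s total, {elapsed / n * 1e9:.0f} ns per open/close pair")
```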
Clipboard operations. We designed this benchmark to measure the runtime for
performing 100,000 clipboard operations. Since in the X Window System a paste is
significantly more costly than a copy, we configured our benchmark to only perform
pastes for this test and report the worst-case results.
Screen capture. This benchmark takes 1,000 screen captures using the imlib2
library and measures the total runtime. The time to save the image files to disk is
not included.
Shared memory IPC. Although Overhaul interposes on every IPC mechanism, our preliminary measurements indicated that shared memory communication incurred the highest overhead, due to the need to intercept page faults, change virtual memory access permissions, and invalidate page tables. Consequently, to measure the worst-case performance impact, in this benchmark we measured the runtime for performing 10 billion write operations on a shared memory
area. We repeated this benchmark with different shared memory sizes (i.e., from 1 to 10,000 pages, with a page size of 4096 bytes), and experimented with sequential and
random write patterns. We found no correlation between these parameters and the
performance impact; the overhead was near-identical in all runs. Here, we present
the results for a shared memory size of 10,000 pages, and random writes.
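A scaled-down stand-in for this benchmark can be written with Python's mmap module in place of our kernel-level instrumentation; the function name, parameters, and reduced write counts below are illustrative:

```python
import mmap
import random
import time

PAGE_SIZE = 4096  # bytes, i.e., a standard 4 KB page

def shm_write_benchmark(num_pages: int, num_writes: int,
                        pattern: str = "random") -> float:
    """Time single-byte writes into an anonymous shared memory mapping."""
    size = num_pages * PAGE_SIZE
    buf = mmap.mmap(-1, size)  # anonymous mapping, shareable with children
    if pattern == "random":
        rng = random.Random(0)
        offsets = [rng.randrange(size) for _ in range(num_writes)]
    else:  # sequential: walk the buffer one page at a time, wrapping around
        offsets = [(i * PAGE_SIZE) % size for i in range(num_writes)]
    start = time.perf_counter()
    for off in offsets:
        buf[off] = 0xFF
    elapsed = time.perf_counter() - start
    buf.close()
    return elapsed

# Scaled down from the evaluation's 10,000 pages and 10 billion writes.
for pattern in ("sequential", "random"):
    print(f"{pattern}: {shm_write_benchmark(1000, 1_000_000, pattern):.3f} s")
```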
File system. To measure the performance impact of Overhaul on regular file
system operations, we ran Bonnie++ configured to create, stat and delete 102,400
empty files in a single directory. Since Overhaul does not interpose on stat or
unlink system calls, we were unable to reliably measure any overhead for file access
or deletion operations, as expected. Therefore, we only report the runtime overhead
for file creation.
For the purpose of this evaluation we temporarily modified Overhaul’s permis-
sion monitor to grant access to resources even when there is no user interaction,
in order to exercise the entire execution path of the benchmarked operation. We
repeated all tests on a Linux system with Overhaul, and on a system with an un-
modified kernel and X server, five times each, and compared the average results when
calculating the overhead.
Experiments were performed on a computer with an Intel i7-930 2.2 GHz CPU and 9 GB of RAM, running Arch Linux x86-64. We present the results of our experiments in
Table 5.1.
Our measurements show that Overhaul performs efficiently, with the highest
overhead observed being below 3%. Note that these experiments artificially stress each operation under unusual workloads; the overhead for a single operation is on the order of milliseconds in the worst case, ranging down to below a nanosecond. Hence, the overhead is often not noticeable by the user. Moreover, the Bonnie++ benchmark demonstrates that Overhaul does not significantly impact the
performance of regular file open operations.
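The overhead column in Table 5.1 is the relative slowdown of the Overhaul runs over the baseline; the reported figures can be reproduced as a quick sanity check:

```python
def overhead_pct(baseline: float, modified: float) -> float:
    """Relative slowdown of the modified system, in percent.

    Applies to runtime benchmarks, where lower is better."""
    return (modified - baseline) / baseline * 100

print(round(overhead_pct(45.20, 46.18), 2))     # device access: 2.17
print(round(overhead_pct(116.48, 119.93), 2))   # clipboard: 2.96
print(round(overhead_pct(68.26, 69.86), 2))     # screen capture: 2.34
print(round(overhead_pct(234.86, 236.33), 2))   # shared memory: 0.63
# Bonnie++ reports throughput (higher is better), so the overhead is the
# relative drop in files/s instead:
print(round((47319 - 47265) / 47319 * 100, 2))  # Bonnie++: 0.11
```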
5.5.2 Usability Experiments
We conducted a user study with 46 participants to test the usability of Overhaul.
The participants were computer science students at the author’s institution, recruited
by asking for volunteers to help test a “defensive security system”. In order to avoid
the effects of priming, participants were not informed about the specific functionality
of Overhaul. The only recruitment requirement was that the participants be famil-
iar with using Skype and web browsing so that they could perform the given tasks
correctly. No personal information was collected from the participants at any point.
The participants were asked to perform two tasks to test different aspects of our
system. The first task presented them with a Skype instance on our test machine
running Overhaul, logged into a test account. They were asked to perform a call to
a second test account, while Overhaul performed its security checks without their
knowledge. Once complete, an experimenter asked the participants to compare this
process with their previous experience of using Skype. Specifically, they were asked
to rate the difficulty involved in interacting with the test setup on a 5-point Likert
scale, where a score of 1 indicated that their experience was almost identical, and 5
indicated that the test setup posed significant difficulty.
In the next task, the participants were asked to perform a specific search on the
Internet on an Overhaul-enabled machine. While they were occupied with the task,
a hidden background process that attempted to access the camera was triggered at
a random time, and was subsequently blocked by Overhaul, causing a visual alert
to be displayed. Once the task was complete, the participants were asked to explain
whether they had noticed anything unusual while performing their tasks.
At the end of the first phase of the experiment, all 46 participants found the
experience to be identical to using Skype on an unmodified system. This empiri-
cally confirms that Overhaul is transparent to the users. In the second phase, 24
participants immediately interrupted the task when the Overhaul notification was
displayed, and alerted the experiment observer to the blocked camera access. Another 16 noticed the alert but continued the task, reporting the unexpected camera activity only after being prompted by the observer. Only 6 users reported not
having noticed anything unusual. These results confirm that Overhaul alerts are
able to draw most users’ attention while they are occupied with other tasks, and are
effective security notifications.
5.5.3 Applicability & False Positives Assessment
To understand whether Overhaul interferes with the normal functionality of ap-
plications, or produces false alerts due to incorrectly blocked legitimate programs,
we tested the system on common applications. To compile the application pool for
this task, we first manually inspected the descriptions of all Top Rated packages in
the Ubuntu Software Center, and identified those that access the resources Over-
haul is designed to protect. Next, we searched the official and community package
repositories of Arch Linux, our experiment environment, with relevant keywords (e.g.,
webcam, microphone, screenshot, capture, record), and added the hits to the pool.
After eliminating the packages that do not work (e.g., due to missing dependencies)
we ended up with 58 applications consisting of video conferencing tools (e.g., Skype,
Jitsi), audio/video editors (e.g., Audacity, Kwave), audio/video recorders (Cheese,
ZArt), screenshot utilities (Shutter, GNOME Screenshot), and screencasting tools
(e.g., Istanbul, recordMyDesktop). The pool also included the popular web browsers
Firefox and Chromium; in those cases we tested them with various web-based video
chat applications. Note that the application pool contained both GUI and console pro-
grams. We manually experimented with each application to verify that they work as
expected, observed whether Overhaul alerts were displayed correctly, and whether
there were false alarms.
In our experiments, we encountered a single application that produced what could
be considered a spurious alert. Specifically, we observed that Skype attempted to
access the camera as soon as the program was launched, before the user had logged into the
application. When Skype was configured to automatically start on boot, this situation
led to a camera access without user interaction, and consequently, Overhaul blocked
the access and produced an alert. This did not cause subsequent video calls to fail,
and we argue that blocking such unanticipated device accesses is the desired behavior
in order to achieve Overhaul’s security properties.
While we did not encounter any malfunctioning application, this experiment also
revealed a peculiar limitation of Overhaul. Specifically, some of the screenshot tools
we tested included an option to delay the shot by a user-specified time. By design,
Overhaul does not support this functionality, since the interaction notifications associated with the application expire before the screen can be captured.
To test Overhaul’s clipboard protection mechanism we used an additional set
of 50 applications including popular office programs, text and media editors, web
browsers, email clients, and terminal emulators. Since Overhaul does not display
alerts for clipboard accesses for usability reasons, we instead verified correct functionality by inspecting the logs produced by our system. In these tests we did not
encounter any false positives or incorrect program behavior.
We note that Overhaul does not support running scheduled tasks, or persistent
non-interactive programs that access the protected devices (e.g., a cron job that
periodically takes screenshots). While we did not encounter such applications in our
tests, this remains a fundamental limitation of our system.
5.5.4 Empirical Experiments
Due to ethical concerns, and the necessity of installing a custom kernel and malware-like applications on users’ machines, designing a large-scale user study to test the long-term security and usability properties of Overhaul is a difficult task. Therefore, we instead experimented with Overhaul on our personal home and work computers. Below, we present the anecdotal insights gained in the process.
For this experiment, we implemented a malware-like application that runs in the
background during the computer’s normal operation and spies on the user. In par-
ticular, it periodically retrieves clipboard contents, takes screenshots, and records
sound samples from the microphone. For privacy reasons, our sample did not record
camera images. Since the test was performed on actual, personal machines used on
a daily basis, we only stored the captured information on disk, while real malware
would exfiltrate it to a remote host. We stress that our spy application was created
to mimic the behavior of real information-stealing malware [91, 92, 93, 94], exploiting
the standard interfaces to the sensitive resources exposed by the operating system.
No functionality was artificially added or removed that would ease its detection. We
installed this spy application on two of our computers, and enabled Overhaul on
one of the machines, while the other was left running unmodified, without protection.
We left the malware running for 21 days. Both computers were actively used every day
for work and personal use.
At the end of the experiment we confirmed that the malware running on the Over-
haul-protected system could not collect any information, as expected. We checked
Overhaul’s logs and verified that attempts to access the protected resources were
detected and blocked. The malware on the vulnerable computer, on the other hand,
was able to successfully spy on the user. We manually investigated the collected data
and found sensitive information including screenshots of bank account information
displayed on an e-banking site and email exchanges. The data sampled from the clip-
board included passwords copied from the password manager, phone numbers, and
excerpts from emails. The malware was also able to collect voice recordings from the
headset microphone.
We also investigated Overhaul’s logs to see which applications were granted
access to the protected resources. The camera and microphone were used by two video
conferencing applications. The screen was captured by the system’s default screenshot
tool, and by a desktop recording application. Clipboard accesses were logged for
a large number of applications. During the testing period of 21 days, we did not
encounter any cases of legitimate applications being incorrectly blocked.
These observations show that spying malware can be severely damaging, and that
Overhaul is effective at improving user privacy in the face of attacks. Conducting a
similar long-term study at a larger scale, in a more scientific framework, is a difficult
yet promising future research direction.
5.6 Related Work
Previous work has studied capturing user intent to implement user-driven access con-
trol. Roesner et al. [90] present an approach in which permission granting is built
into user interactions with permission-granting GUI elements called access control
gadgets (ACG). The authors extend ServiceOS to provide this capability to applica-
tion developers, and require that applications be modified to use ACGs. This work
captures user intent at a fine granularity and provides stronger security guarantees
than Overhaul as each action is precisely mapped to a permission. However, our
goal is to propose an architecture that can be retrofitted into traditional OSes trans-
parently. In our work, we encountered a different set of challenges stemming from
the fact that we are dealing with traditional systems (i.e., Linux) that do not provide
the features that ServiceOS does.
Ringer et al. [103] take a different approach, and provide access control gadgets via
a secure user space library, combined with static and dynamic analyses. Then, they
port various Android applications to work with their system. For similar reasons as
above, this work provides stronger security guarantees than Overhaul. In contrast,
Overhaul treats user space applications in a blackbox manner, and does not require
modifications to them.
Gyrus [104] is a virtualization-based system that displays editable UI field entries
in text-based networked applications back to the user through a trusted output chan-
nel, and guarantees that this is the information sent over the network. BLADE [105]
infers the authenticity of browser-based file downloads based on user behavior. While these systems share similar goals with Overhaul, they address different security problems.
Systems that use timing information to capture user intent include BINDER [106]
and Not-a-Bot [107]. BINDER associates outbound network connections with input
events to build a host-based IDS. However, its design does not address the challenges
of IPC, making it unsuitable for use with certain applications that Overhaul targets.
Not-a-Bot uses TPM-backed attestations to tag user-generated network traffic on
the host, and a verifier on the server that checks them to implement DDoS, spam,
and clickjacking mitigation measures. These systems target network-based attacks,
whereas Overhaul aims to control access to privacy-sensitive devices.
Some systems that advocate user-authentic gestures for secure copy & paste be-
tween domains are the EROS Window System (EWS) [108], Qubes OS [109], and
Tahoma [110]. Similarly, in this chapter, we also address the problem of secure copy
& paste so that malicious applications cannot intercept these requests. There has also
been much work in the domain of trusted computing. For example, Terra [111], Over-
shadow [112], and vTPM [113] use virtual machine technology for enabling trusted
computing. In contrast to the above, Overhaul does not require use of virtualization
or explicit user cooperation.
Several operating systems and applications employ popup prompts to defer privacy
policy decisions to users [114, 115, 116, 117]. However, this approach to user-driven
access control has been shown to suffer from usability issues; for instance, Motiee et
al. [98] demonstrate that Windows users often find User Account Control prompts
distracting, dismiss them without due diligence, or disable them completely. Over-
haul sidesteps these concerns by taking a transparent, unintrusive approach. Flash
Player employs a mechanism that only allows clipboard operations initiated by user
input [118]. Overhaul generalizes this application-specific defense to the entire sys-
tem and other sensitive resources, and provides the additional security property that
user input cannot be generated synthetically.
Quire [119] is an extension to Android that enables applications to propagate
call chain context to downstream callees. Hence, applications can verify the sources
of user interactions, and make policy decisions accordingly. There has also been
much work that aims to enforce install time application permissions within Android
(e.g., Kirin [120], Saint [121], Apex [122]). These approaches enable the user to
define policies for protecting themselves against malicious applications. Overhaul
is orthogonal to the smartphone platform security work.
SELinux [123] enables MAC policies on Linux, and is mostly used to restrict
daemons such as database engines or web servers that have clearly defined data access
rights. While SELinux does not address the problem of dynamic access control,
Overhaul could in principle make use of SELinux for enforcing access control, as
an implementation choice.
5.7 Summary
Security models for traditional operating systems center on multiplexed computation
on timesharing systems, where multiple users share access to a single set of computing
resources. However, the shift towards dedicated devices with single users has resulted
in a fundamental impedance mismatch between the traditional model of users, groups,
and processes and the needs of modern systems. In particular, contemporary threats
often take the form of malicious programs that execute with the full privileges of
the user, rendering user-based security models largely ineffective. Mobile operating
systems such as iOS and Android, as well as research systems such as ServiceOS [90],
have promoted the concept of user-driven, dynamic access control to address the
shortcomings of traditional access control models. Here, permissions to access sensi-
tive resources are granted by users on-demand. However, operating systems for the
desktop and server have been largely neglected by these advances, since prior work
has required that applications be designed with dynamic access control in mind.
In this chapter we presented Overhaul, a general architecture for retrofitting a
dynamic, input-driven access control model into traditional operating systems in a
transparent manner. In our access control model, access to privacy-sensitive resources
is mediated based on the temporal proximity of user interactions to access requests.
We built upon this architecture to demonstrate how input-driven access control can be
implemented to protect privacy-sensitive resources such as the microphone, camera,
clipboard, and display contents.
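The temporal-proximity model summarized above can be illustrated with a toy sketch; the class, method names, and one-second window below are illustrative, not the actual Overhaul implementation:

```python
import time

class InputDrivenMonitor:
    """Toy sketch of input-driven access control: a request for a
    privacy-sensitive resource is granted only if the requesting
    process received user input shortly before the request."""

    def __init__(self, window_seconds: float = 1.0):
        self.window = window_seconds
        self.last_input = {}  # pid -> time of last user interaction

    def notify_interaction(self, pid: int) -> None:
        """Called by the trusted input path when a process receives input."""
        self.last_input[pid] = time.monotonic()

    def request_access(self, pid: int, resource: str) -> bool:
        """Mediate an access request based on temporal proximity to input."""
        t = self.last_input.get(pid)
        return t is not None and time.monotonic() - t < self.window

monitor = InputDrivenMonitor()
monitor.notify_interaction(pid=1234)           # user clicked in this app
print(monitor.request_access(1234, "camera"))  # True: recent interaction
print(monitor.request_access(5678, "camera"))  # False: background process
```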
The proposed design and implementation satisfies all of the research goals we laid
out in Section 1.5. (G1) We presented an abstract design of Overhaul indepen-
dent of the underlying operating system, and described a practical implementation
for Linux and X Window System. (G2) Overhaul is applicable to any software that
requires access to the privacy-sensitive resources covered by the architecture. (G3)
Overhaul necessitates no explicit effort to make applications conform to the pro-
posed input-driven access control model; existing applications can remain oblivious
to the presence of Overhaul and still benefit from Overhaul’s security properties.
(G4) Overhaul is demonstrably performant, usable, and applicable to most appli-
cations; in particular, it requires no changes to the traditional computing interface.
Chapter 6
Conclusions
Together, the changing technology and adversarial models gradually render existing
privacy defenses obsolete, or otherwise lead to the emergence of previously unexplored
privacy challenges. This evolving nature of privacy threats necessitates that the security community continuously innovate and develop novel defenses. A significant aspect
of these efforts is to ensure that new defenses can easily enter into practice and achieve
widespread adoption. To this end, factors such as correctness of the solutions, de-
ployment and maintenance costs, scalability, and usability are of utmost importance.
From a technical standpoint, the operating system is a natural and convenient
platform to develop novel defenses on. In particular, an operating system-based
defense would allow the enforcement of strong security properties at scale, on all user
space applications. However, despite these advantages, such an approach would suffer
on the cost and usability front. Rolling out a new operating system in lieu of the
already widely established, popular alternatives is unlikely to find support in practice.
In this thesis, in light of the above considerations, we argued that retrofitting
novel privacy solutions into existing operating systems is the preferable approach. In
this way, solution developers can leverage the technical advantages of working at the
operating system level, while also sidestepping many of the cost and usability related
concerns. We illustrated that our proposed approach is not only feasible, but also
effective, by discussing four contemporary privacy threats, and presenting solutions
to address them.
In Chapter 2, we looked at keeping privacy-sensitive data produced during short-
lived program execution sessions, and persisted to disk, confidential. We then pre-
sented PrivExec as a means of providing an application-agnostic private execution
service inside the operating system.
In Chapter 3, we explored the challenges of securely discarding long-term persis-
tent data from modern storage hardware, once it is no longer needed. To address this
issue, we presented Eraser as a technique that can perform secure file deletion on
any blackbox storage medium.
In Chapter 4, we examined existing plausibly deniable disk encryption techniques
that allow users to hide the existence of privacy-sensitive data on their disk, and
pointed out their vulnerability to multiple-snapshot attacks. Next, we presented
Hive, a hidden volume encryption scheme that remains secure even in the face of
such attacks.
In Chapter 5, we discussed the shortcomings of traditional access control mech-
anisms in securing emerging, unconventional types of privacy-sensitive system re-
sources. We presented a user-driven access control model and operating system
architecture, collectively called Overhaul, to address this problem on traditional
desktop operating systems.
As evidenced by the operating system-independent designs, concrete Linux imple-
mentations, and evaluation of each of these systems, retrofitting novel privacy defenses
into existing operating systems is indeed possible. Furthermore, all four techniques
satisfy our research goals of (G1) designing solutions compatible with prevalent op-
erating systems, (G2) offering general privacy guarantees to entire classes of appli-
cations, (G3) requiring no modifications to user space applications, and finally, (G4)
providing good performance, and familiar interfaces to users.
Future Research Directions
In this thesis, we primarily focused on extending operating systems with defenses
that provide strong confidentiality of privacy-sensitive data on a host computer. In
principle, the general philosophy we presented in this thesis could also be applied
to techniques that address various other types of privacy issues. For instance, an
operating system service could be designed to transparently provide any application
with end-to-end secure communication capabilities over insecure network connections.
In the following, however, we turn our attention to ideas related to
the specific privacy defenses we presented, and discuss their implications on possible
future research directions.
All four techniques we discussed in this thesis strictly conform to our overarch-
ing research goals of achieving compatibility, generality, transparency, and usability.
While, in an ideal world, all privacy defenses could greatly benefit from following
these principles, it is also important to recognize that relaxing some of these require-
ments can allow for a wider range of novel privacy techniques. One promising avenue
to investigate would be to relax the transparency requirement, and explore whether
cooperative applications (i.e., applications that are aware of the operating system’s
privacy services, and explicitly request or interact with them) could benefit from ad-
ditional, or stronger privacy guarantees. For example, PrivExec could be extended
with a fine-grained private execution API that would allow for more control over the
degree or types of privacy an application would like to provide to users.
Another promising research direction would be to explore ways to automatically
capture user intent, or in other words, link users’ intents to their actions when in-
teracting with a computer. One immediate application of such a capability would
be to address the limitations of Overhaul, and close the gap between white-box approaches [90], which require applications to be written with user-driven access control in mind, and the black-box approach adopted here. For instance, it could be possible to
leverage static and dynamic program analyses to more precisely link user intent, user
input, and device accesses, all without requiring modifications to existing programs.
Of course, such a capability would also serve as a strong general primitive that
could be applied to different contexts and threats, and assist the security community
in designing defenses that target human vulnerabilities, or otherwise significantly
improve the usability of security tools.
Acknowledgments
The research presented in Chapter 2 is based on the author’s previously published work:
Kaan Onarlioglu, Collin Mulliner, William Robertson, and Engin Kirda. PrivExec:
Private Execution as an Operating System Service. In IEEE Symposium on Security
and Privacy, 2013.
The research presented in Chapter 4 is based on the author’s previously published
work: Erik-Oliver Blass, Travis Mayberry, Guevara Noubir, and Kaan Onarlioglu.
Toward Robust Hidden Volumes using Write-Only Oblivious RAM. In ACM Confer-
ence on Computer and Communications Security, 2014.
The research presented in Chapter 5 is based on the author’s previously published
work: Kaan Onarlioglu, William Robertson, and Engin Kirda. Overhaul: Input-
Driven Access Control for Better Privacy on Traditional Operating Systems. In
IEEE/IFIP International Conference on Dependable Systems and Networks, 2016.
Bibliography
[1] The New York Times. Apple Fights Order to Unlock San BernardinoGunman’s iPhone. http://www.nytimes.com/2016/02/18/technology/
apple-timothy-cook-fbi-san-bernardino.html, 2016.
[2] A. Czeskis, D.J. St. Hilaire, K. Koscher, S.D. Gribble, T. Kohno, andB. Schneier. Defeating Encrypted and Deniable File Systems: TrueCrypt v5.1aand the Case of the Tattling OS and Applications. In USENIX Summit on HotTopics in Security, 2008.
[3] Gaurav Aggarwal, Elie Bursztein, Collin Jackson, and Dan Boneh. An Analysisof Private Browsing Modes in Modern Browsers. In USENIX Security Sympo-sium, 2010.
[4] The PaX Team. PaX Address Space Layout Randomization (ASLR). http:
//pax.grsecurity.net/docs/aslr.txt, 2003.
[5] The PaX Team. PaX Non-Executable Pages (NOEXEC). http://pax.
grsecurity.net/docs/noexec.txt, 2003.
[6] Martın Abadi, Mihai Budiu, Ulfar Erlingsson, and Jay Ligatti. Control-FlowIntegrity. In ACM Conference on Computer and Communications Security,2005.
[7] Michael Weissbacher, Tobias Lauinger, and William Robertson. Why is CSPFailing? Trends and Challenges in CSP Adoption. In International Symposiumon Research in Attacks, Intrusions and Defenses, 2014.
[8] eCryptfs. https://launchpad.net/ecryptfs.
[9] Overlayfs Filesystem. https://www.kernel.org/doc/Documentation/
filesystems/overlayfs.txt.
[10] dm-crypt. http://code.google.com/p/cryptsetup/wiki/DMCrypt.
[11] Bonnie++. http://www.coker.com.au/bonnie++/.
[12] Selenium – Web Browser Automation. http://seleniumhq.org/.
146
[13] xdotool. http://www.semicomplete.com/projects/xdotool/xdotool.
xhtml.
[14] Sotiris Ioannidis, Stelios Sidiroglou, and Angelos D. Keromytis. Privacy as anOperating System Service. In USENIX Summit on Hot Topics in Security,2006.
[15] Alan M. Dunn, Michael Z. Lee, Suman Jana, Sangman Kim, Mark Silberstein,Yuanzhong Xu, Vitaly Shmatikov, and Emmett Witchel. Eternal Sunshineof the Spotless Machine: Protecting Privacy with Ephemeral Channels. InUSENIX Conference on Operating Systems Design and Implementation, 2012.
[16] Su Mon Kywe, Christopher Landis, Yutong Pei, Justin Satterfield, Yuan Tian,and Patrick Tague. PrivateDroid: Private Browsing Mode for Android. InIEEE International Conference on Trust, Security and Privacy in Computingand Communications, 2014.
[17] Judicael Briand Djoko, Brandon Jennings, and Adam J. Lee. TPRIVEXEC:Private Execution in Virtual Memory. In ACM Conference on Data and Appli-cation Security and Privacy, 2016.
[18] Edward W. Felten and Michael A. Schneider. Timing Attacks on Web Privacy.In ACM Conference on Computer and Communications Security, 2000.
[19] Andrew Clover. CSS visited pages disclosure. http://seclists.org/bugtraq/2002/Feb/271, 2002.
[20] Artur Janc and Lukasz Olejnik. Web Browser History Detection as a Real-worldPrivacy Threat. In European Symposium on Research in Computer Security,2010.
[21] Adil Alsaid and David Martin. Detecting Web Bugs with Bugnosis: PrivacyAdvocacy through Education. In Privacy Enhancing Technologies, 2003.
[22] Collin Jackson, Andrew Bortz, Dan Boneh, and John C. Mitchell. ProtectingBrowser State from Web Privacy Attacks. In World Wide Web Conference,2006.
[23] Markus Jakobsson and Sid Stamm. Invasive Browser Sniffing and Countermea-sures. In World Wide Web Conference, 2006.
[24] Umesh Shankar and Chris Karlof. Doppelganger: Better Browser Privacy With-out the Bother. In ACM Conference on Computer and Communications Secu-rity, 2006.
147
[25] Huwida Said, Al Noora Mutawa, Al Awadhi Ibtesam, and Mario Guimaraes. Forensic Analysis of Private Browsing Artifacts. In IEEE Innovations in Information Technology, 2011.
[26] Meng Xu, Yeongjin Jang, Xinyu Xing, Taesoo Kim, and Wenke Lee. UCognito: Private Browsing Without Tears. In ACM Conference on Computer and Communications Security, 2015.
[27] J. Alex Halderman, Seth D. Schoen, Nadia Heninger, William Clarkson, William Paul, Joseph A. Calandrino, Ariel J. Feldman, Jacob Appelbaum, and Edward W. Felten. Lest We Remember: Cold Boot Attacks on Encryption Keys. In USENIX Security Symposium, 2008.
[28] David L. C. Thekkath, Mark Mitchell, Patrick Lincoln, Dan Boneh, John Mitchell, and Mark Horowitz. Architectural Support for Copy and Tamper Resistant Software. In ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2000.
[29] G. Edward Suh, Dwaine Clarke, Blaise Gassend, Marten van Dijk, and Srinivas Devadas. AEGIS: Architecture for Tamper-Evident and Tamper-Resistant Processing. In International Conference on Supercomputing, 2003.
[30] Peter A. H. Peterson. Cryptkeeper: Improving Security with Encrypted RAM. In IEEE International Conference on Technologies for Homeland Security, 2010.
[31] Jim Chow, Ben Pfaff, Tal Garfinkel, and Mendel Rosenblum. Shredding Your Garbage: Reducing Data Lifetime through Secure Deallocation. In USENIX Security Symposium, 2005.
[32] Niels Provos. Encrypting Virtual Memory. In USENIX Security Symposium, 2000.
[33] Matt Blaze. A Cryptographic File System for UNIX. In ACM Conference on Computer and Communications Security, 1993.
[34] Erez Zadok, Ion Badulescu, and Alex Shender. Cryptfs: A Stackable Vnode Level Encryption File System. Technical report, Computer Science Department, Columbia University, 1998.
[35] EncFS. www.arg0.net/encfs.
[36] BitLocker. http://windows.microsoft.com/en-US/windows7/products/features/bitlocker.
[37] Yang Tang, Phillip Ames, Sravan Bhamidipati, Ashish Bijlani, Roxana Geambasu, and Nikhil Sarda. CleanOS: Limiting Mobile Data Exposure with Idle Eviction. In USENIX Conference on Operating Systems Design and Implementation, 2012.
[38] Kevin Borders, Eric Vander Weele, Billy Lau, and Atul Prakash. Protecting Confidential Data on Personal Computers with Storage Capsules. In USENIX Security Symposium, 2009.
[39] Dan Boneh and Richard J. Lipton. A Revocable Backup System. In USENIX Security Symposium, 1996.
[40] Radia Perlman. The Ephemerizer: Making Data Disappear. Technical report, Sun Microsystems, Inc., 2005.
[41] Zachary N. J. Peterson, Randal Burns, Joe Herring, Adam Stubblefield, and Aviel D. Rubin. Secure Deletion for a Versioning File System. In USENIX Conference on File and Storage Technologies, 2005.
[42] Joel Reardon, Srdjan Capkun, and David Basin. Data Node Encrypted File System: Efficient Secure Deletion for Flash Memory. In USENIX Security Symposium, 2012.
[43] shred(1) – Linux man page. http://www.gnu.org/software/coreutils/.
[44] Steven Bauer and Nissanka B. Priyantha. Secure Data Deletion for Linux File Systems. In USENIX Security Symposium, 2001.
[45] Nikolai Joukov, Harry Papaxenopoulos, and Erez Zadok. Secure Deletion Myths, Issues, and Solutions. In ACM Workshop on Storage Security and Survivability, 2006.
[46] Hubert Ritzdorf, Nikolaos Karapanos, and Srdjan Capkun. Assisted Deletion of Related Content. In Annual Computer Security Applications Conference, 2014.
[47] Zhenkai Liang, Weiqing Sun, V. N. Venkatakrishnan, and R. Sekar. Alcatraz: An Isolated Environment for Experimenting with Untrusted Software. ACM Transactions on Information and System Security, 12(3):14:1–14:37, 2009.
[48] Shvetank Jain, Fareha Shafique, Vladan Djeric, and Ashvin Goel. Application-Level Isolation and Recovery with Solitude. In European Conference on Computer Systems, 2008.
[49] Yanlin Li, Jonathan McCune, James Newsome, Adrian Perrig, Brandon Baker, and Will Drewry. MiniBox: A Two-Way Sandbox for x86 Native Code. In USENIX Annual Technical Conference, 2014.
[50] Francis Hsu, Hao Chen, Thomas Ristenpart, Jason Li, and Zhendong Su. Back to the Future: A Framework for Automatic Malware Removal and System Repair. In Annual Computer Security Applications Conference, 2006.
[51] Suman Jana, Donald E. Porter, and Vitaly Shmatikov. TxBox: Building Secure, Efficient Sandboxes with System Transactions. In IEEE Symposium on Security and Privacy, 2011.
[52] Joel Reardon, David Basin, and Srdjan Capkun. SoK: Secure Data Deletion. In IEEE Symposium on Security and Privacy, 2013.
[53] Michael Wei, Laura M. Grupp, Frederick E. Spada, and Steven Swanson. Reliably Erasing Data from Flash-Based Solid State Drives. In USENIX Conference on File and Storage Technologies, 2011.
[54] Nikolai Joukov and Erez Zadok. Adding Secure Deletion to Your Favorite File System. In IEEE International Security in Storage Workshop, 2005.
[55] Gordon F. Hughes, Tom Coughlin, and Daniel M. Commins. Disposal of Disk and Tape Data by Secure Sanitization. In IEEE Symposium on Security and Privacy, 2009.
[56] Berke Durak. Wipe. https://github.com/berke/wipe, 2009.
[57] Peter Gutmann. Secure Deletion of Data from Magnetic and Solid-State Memory. In USENIX Security Symposium, 1996.
[58] Apple, Inc. Mac OS X: About Disk Utility’s erase free space feature. https://support.apple.com/kb/HT3680, 2016.
[59] Jim Garlick. diskscrub. https://code.google.com/archive/p/diskscrub/, 2008.
[60] Jaeheung Lee, Sangho Yi, Junyoung Heo, Hyungbae Park, Sung Y. Shin, and Yookun Cho. An Efficient Secure Deletion Scheme for Flash File Systems. Journal of Information Science and Engineering, 2010.
[61] Sarah Diesburg, Christopher Meyers, Mark Stanovich, Michael Mitchell, Justin Marshall, Julia Gould, An-I Andy Wang, and Geoff Kuenning. TrueErase: Per-file Secure Deletion for the Storage Data Path. In Annual Computer Security Applications Conference, 2012.
[62] Sarah Diesburg, Christopher Meyers, Mark Stanovich, An-I Andy Wang, and Geoff Kuenning. TrueErase: Leveraging an Auxiliary Data Path for Per-File Secure Deletion. ACM Transactions on Storage, 12(4):18:1–18:37, 2016.
[63] Steven Swanson and Michael Wei. SAFE: Fast, Verifiable Sanitization for SSDs. Technical report, University of California, San Diego, 2010.
[64] Joel Reardon, Hubert Ritzdorf, David Basin, and Srdjan Capkun. Secure Data Deletion from Persistent Media. In ACM Conference on Computer and Communications Security, 2013.
[65] Michael Larabel. Phoronix – The Performance Impact Of Linux Disk Encryption On Ubuntu 14.04 LTS. http://www.phoronix.com/scan.php?page=article&item=ubuntu_1404_encryption.
[66] Device-mapper – Linux Kernel Documentation. https://www.kernel.org/doc/Documentation/device-mapper/.
[67] Kernel Probes (Kprobes) – Linux Kernel Documentation. https://www.kernel.org/doc/Documentation/kprobes.txt.
[68] Clemens Fruhwirth. New Methods in Hard Disk Encryption. http://clemens.endorphin.org/cryptography, 2005.
[69] Network Block Device (TCP version) – Linux Kernel Documentation. https://www.kernel.org/doc/Documentation/blockdev/nbd.txt.
[70] Oded Goldreich and Rafail Ostrovsky. Software Protection and Simulation on Oblivious RAMs. Journal of the ACM, 43(3):431–473, 1996.
[71] Elaine Shi, T.-H. Hubert Chan, Emil Stefanov, and Mingfei Li. Oblivious RAM with O(log3(N)) Worst-Case Cost. In International Conference on the Theory and Applications of Cryptology and Information Security, 2011.
[72] Emil Stefanov, Marten van Dijk, Elaine Shi, Christopher Fletcher, Ling Ren, Xiangyao Yu, and Srinivas Devadas. Path ORAM: An Extremely Simple Oblivious RAM Protocol. In ACM Conference on Computer and Communications Security, 2013.
[73] Phillip Rogaway. Nonce-Based Symmetric Encryption. In Fast Software Encryption, 2004.
[74] Birger Jansson. Choosing a Good Appointment System – A Study of Queues of the Type (D, M, 1). Operations Research, 14(2):292–312, 1966.
[75] Lichun Li and Anwitaman Datta. Write-Only Oblivious RAM-based Privacy-Preserved Access of Outsourced Data. International Journal of Information Security, 2016.
[76] Travis Mayberry, Erik-Oliver Blass, and Agnes Hui Chan. Efficient Private File Retrieval by Combining ORAM and PIR. In Network and Distributed System Security Symposium, 2014.
[77] Ran Canetti, Cynthia Dwork, Moni Naor, and Rafail Ostrovsky. Deniable Encryption. In Advances in Cryptology, 1997.
[78] TrueCrypt. Free Open-Source On-the-Fly Encryption. http://www.truecrypt.org/.
[79] Adam Skillen and Mohammad Mannan. On Implementing Deniable Storage Encryption for Mobile Devices. In Network and Distributed System Security Symposium, 2013.
[80] Sarah Dean. FreeOTFE, 2010. Archive available at https://web.archive.org/web/20130531062457/http://freeotfe.org/.
[81] Julian Assange, Ralf Philipp Weinmann, and Suelette Dreyfus. Rubberhose File System, 2001. Archive available at http://web.archive.org/web/20120716034441/http://marutukku.org/.
[82] Ross Anderson, Roger Needham, and Adi Shamir. The Steganographic File System. In Information Hiding, 1998.
[83] Andrew D. McDonald and Markus G. Kuhn. StegFS: A Steganographic File System for Linux. In Information Hiding, pages 462–477, 1999.
[84] Hwee Hwa Pang, Kian-Lee Tan, and Xuan Zhou. StegFS: A Steganographic File System. In International Conference on Data Engineering, 2003.
[85] Adam Skillen and Mohammad Mannan. On Implementing Deniable Storage Encryption for Mobile Devices. In Network and Distributed System Security Symposium, 2013.
[86] Adam Skillen and Mohammad Mannan. Mobiflage: Deniable Storage Encryption for Mobile Devices. IEEE Transactions on Dependable and Secure Computing, 11(3):224–237, 2014.
[87] Xingjie Yu, Bo Chen, Zhan Wang, Bing Chang, Wen Tao Zhu, and Jiwu Jing. MobiHydra: Pragmatic and Multi-level Plausibly Deniable Encryption Storage for Mobile Devices. In Information Security, 2014.
[88] Bing Chang, Zhan Wang, Bo Chen, and Fengwei Zhang. MobiPluto: File System Friendly Deniable Storage for Mobile Devices. In Annual Computer Security Applications Conference, 2015.
[89] Kenneth G. Paterson and Mario Strefler. A Practical Attack Against the Use of RC4 in the HIVE Hidden Volume Encryption System. In ACM Symposium on Information, Computer and Communications Security, 2015.
[90] Franziska Roesner, Tadayoshi Kohno, Alexander Moshchuk, Bryan Parno, Helen J. Wang, and Crispin Cowan. User-Driven Access Control: Rethinking Permission Granting in Modern Operating Systems. In IEEE Symposium on Security and Privacy, 2012.
[91] CERT Polska – Slave, Banatrix and Ransomware. http://www.cert.pl/news/10358.
[92] Dell SonicWALL Security Center – Malware switches users Bank Account Number with that of the attacker. https://www.mysonicwall.com/sonicalert/searchresults.aspx?ev=article&id=614.
[93] Alexander Gostev. The Flame: Questions and Answers. http://securelist.com/blog/incidents/34344/the-flame-questions-and-answers-51/.
[94] Trojan-Spy:W32/Zbot. http://www.f-secure.com/v-descs/trojan-spy_w32_zbot.shtml.
[95] Robert W. Scheifler. X Window System Protocol. http://www.x.org/releases/X11R7.7/doc/xproto/x11protocol.html.
[96] Kieron Drake. XTEST Extension Protocol. http://www.x.org/releases/X11R7.7/doc/xextproto/xtest.html.
[97] Lin-Shung Huang, Alex Moshchuk, Helen J. Wang, Stuart Schechter, and Collin Jackson. Clickjacking: Attacks and Defenses. In USENIX Security Symposium, 2012.
[98] Sara Motiee, Kirstie Hawkey, and Konstantin Beznosov. Do Windows Users Follow the Principle of Least Privilege? Investigating User Account Control Practices. In Symposium on Usable Privacy and Security, 2010.
[99] Jonathan Corbet. MIT-SHM (The MIT Shared Memory Extension). http://www.x.org/releases/X11R7.7/doc/xextproto/shm.html.
[100] David Rosenthal. Inter-Client Communication Conventions Manual. http://www.x.org/releases/X11R7.7/doc/xorg-docs/icccm/icccm.html.
[101] J. Salim, H. Khosravi, A. Kleen, and A. Kuznetsov. Linux Netlink as an IP Services Protocol. http://www.ietf.org/rfc/rfc3549.txt, 2003.
[102] Chris Wright, Crispin Cowan, James Morris, Stephen Smalley, and Greg Kroah-Hartman. Linux Security Modules: General Security Support for the Linux Kernel. In USENIX Security Symposium, 2002.
[103] Talia Ringer, Dan Grossman, and Franziska Roesner. AUDACIOUS: User-Driven Access Control with Unmodified Operating Systems. In ACM Conference on Computer and Communications Security, 2016.
[104] Yeongjin Jang, Simon P. Chung, Bryan D. Payne, and Wenke Lee. Gyrus: A Framework for User-Intent Monitoring of Text-Based Networked Applications. In Network and Distributed System Security Symposium, 2014.
[105] Long Lu, Vinod Yegneswaran, Phillip Porras, and Wenke Lee. BLADE: An Attack-agnostic Approach for Preventing Drive-by Malware Infections. In ACM Conference on Computer and Communications Security, 2010.
[106] Weidong Cui, Randy H. Katz, and Wai-tian Tan. Design and Implementation of an Extrusion-based Break-In Detector for Personal Computers. In Annual Computer Security Applications Conference, 2005.
[107] Ramakrishna Gummadi, Hari Balakrishnan, Petros Maniatis, and Sylvia Ratnasamy. Not-a-Bot: Improving Service Availability in the Face of Botnet Attacks. In USENIX Symposium on Networked Systems Design and Implementation, 2009.
[108] Jonathan S. Shapiro, John Vanderburgh, Eric Northup, and David Chizmadia. Design of the EROS Trusted Window System. In USENIX Security Symposium, 2004.
[109] The Qubes OS Project. http://www.qubes-os.org/trac.
[110] Richard S. Cox, Steven D. Gribble, Henry M. Levy, and Jacob Gorm Hansen. A Safety-Oriented Platform for Web Applications. In IEEE Symposium on Security and Privacy, 2006.
[111] Tal Garfinkel, Ben Pfaff, Jim Chow, Mendel Rosenblum, and Dan Boneh. Terra: A Virtual Machine-based Platform for Trusted Computing. In ACM Symposium on Operating Systems Principles, 2003.
[112] Xiaoxin Chen, Tal Garfinkel, E. Christopher Lewis, Pratap Subrahmanyam, Carl A. Waldspurger, Dan Boneh, Jeffrey Dwoskin, and Dan R.K. Ports. Overshadow: A Virtualization-based Approach to Retrofitting Protection in Commodity Operating Systems. ACM SIGOPS Operating Systems Review, 42(2), 2008.
[113] Stefan Berger, Ramon Caceres, Kenneth A. Goldman, Ronald Perez, Reiner Sailer, and Leendert van Doorn. vTPM: Virtualizing the Trusted Platform Module. In USENIX Security Symposium, 2006.
[114] OS X Mountain Lion: Prompted for access to contacts when opening an application. http://support.apple.com/en-us/HT202531.
[115] Flash Player Help – Privacy settings. http://www.macromedia.com/support/documentation/en/flashplayer/help/help09.html.
[116] iOS Developer Library – Getting the User’s Location. https://developer.apple.com/library/ios/documentation/UserExperience/Conceptual/LocationAwarenessPG/CoreLocation/CoreLocation.html.
[117] Windows Help – What is User Account Control? http://windows.microsoft.com/en-us/windows/what-is-user-account-control.
[118] Ian Melven. User-initiated action requirements in Flash Player 10. http://www.adobe.com/devnet/flashplayer/articles/fplayer10_uia_requirements.html.
[119] Michael Dietz, Shashi Shekhar, Yuliy Pisetsky, Anhei Shu, and Dan S. Wallach. Quire: Lightweight Provenance for Smart Phone Operating Systems. In USENIX Security Symposium, 2011.
[120] William Enck, Machigar Ongtang, and Patrick McDaniel. On Lightweight Mobile Phone Application Certification. In ACM Conference on Computer and Communications Security, 2009.
[121] Machigar Ongtang, Stephen McLaughlin, William Enck, and Patrick McDaniel. Semantically Rich Application-centric Security in Android. In Annual Computer Security Applications Conference, 2009.
[122] Mohammad Nauman, Sohail Khan, and Xinwen Zhang. Apex: Extending Android Permission Model and Enforcement with User-defined Runtime Constraints. In ACM Symposium on Information, Computer and Communications Security, 2010.
[123] SELinux Project. http://selinuxproject.org/page/Main_Page.