

Submitted to arXiv

Triad-NVM: Persistent-Security for Integrity-Protected and Encrypted Non-Volatile Memories (NVMs)

Amro Awad
University of Central Florida
Orlando, FL - USA
[email protected]

Laurent Njilla
Air Force Research Laboratory
Rome, NY - USA
[email protected]

Mao Ye
University of Central Florida
Orlando, FL - USA
[email protected]

Abstract—Emerging Non-Volatile Memories (NVMs) are promising contenders for building future memory systems. On the other hand, unlike DRAM systems, NVMs can retain data even after power loss and thus enlarge the attack surface. While data encryption and integrity verification have been proposed earlier for DRAM systems, protecting and recovering secure memories becomes more challenging with persistent memory. Specifically, security metadata, e.g., encryption counters and Merkle Tree data, should be securely persisted and recovered across system reboots and during recovery from crashes. Failing to persist updates to security metadata can lead to data inconsistency, in addition to serious security vulnerabilities.

In this paper, we pioneer a new direction that explores the persistency of both the Merkle Tree and encryption counters to enable secure recovery of data-verifiable and encrypted memory systems. To this end, we coin a new concept that we call Persistent-Security. We discuss the requirements for such persistently secure systems, propose novel optimizations, and evaluate the impact of the proposed relaxation schemes and optimizations on performance, resilience and recovery time. To the best of our knowledge, our paper is the first to discuss the persistence of security metadata in integrity-protected NVM systems and provide corresponding optimizations. We define a set of relaxation schemes that bring trade-offs between performance and recovery time for large-capacity NVM systems. Our results show that our proposed design, Triad-NVM, can improve the throughput by an average of 2x (relative to strict persistence). Moreover, Triad-NVM maintains a recovery time of less than 4 seconds for an 8TB NVM system (30.6 seconds for 64TB), which is 3648x faster than a system without security metadata persistence.

I. INTRODUCTION

Securing Non-Volatile Memory (NVM) systems is a critical requirement for their deployment. Unlike DRAM, NVMs retain data after powering off the system, which necessitates encrypting them to avoid data remanence attacks. Moreover, NVMs enable persistent applications which can recover after a crash by frequently persisting their data on NVMs. However, while the two concepts, persistency and security, seem orthogonal, they become naturally intertwined with emerging NVMs; the system should be able to recover securely and guarantee its data integrity and confidentiality across crashes. Unfortunately, such synergy between persistency and security has received limited attention from the research community, which we believe is due to the following reasons.

DISTRIBUTION A. Approved for public release. Distribution unlimited. Case Number 88ABW-2018-4218 Dated 29 Aug 2018.

First, securing memory systems, e.g., through encryption and data integrity verification, was originally developed for DRAM systems to defeat cold-boot attacks [1], [2], [3]. In such systems, persistency and data recovery are not anticipated and thus all security metadata are expected to be reinitialized [4]. Second, the research on encrypting and securing NVMs has mostly focused on normal-operation performance without considering how to maintain consistency across crashes [5], [6], [7], [8]. Third, persisting data in emerging NVMs has many compelling models, notably JUSTDO logging [9] and epoch-based persistence [10], where the focus is on data persistency rather than its accompanying security metadata [11], [12], [13], [14].

While the Asynchronous DRAM Refresh (ADR) feature has been a requirement for many NVM form factors, e.g., NVDIMM [15], most processor vendors limit persistent domains within the processor chip to tens of entries in the write pending queue (WPQ) [16]. The main reason is the high cost of ensuring a long-lasting, sustainable power supply in case of power loss, in addition to the power-consuming nature of NVM writes. As security metadata can amount to hundreds of kilobytes or megabytes, system-level ADR support cannot guarantee the persistence of all their updated values within the processor chip. For many users, affording powerful backup batteries or deploying expensive power supplies is infeasible due to environmental, area or cost limitations, thus battery-free solutions are always in demand [17].

Recently, several works have explored persisting encryption counters on secure NVMs [18], [17], [19]. Unfortunately, none of the previous works investigates persisting integrity-protected systems and their accompanying metadata, but rather limits its scheme to encryption counters only. In fact, persisting integrity-verification metadata has much higher overhead. Moreover, most of the previous works do not clearly define which security metadata need to be persisted, nor the impact of relaxing the persistence of such metadata on security, recovery time, resilience, performance and write endurance. In summary, the current literature on persisting security metadata lacks the following:


(1) A solid definition of what data needs to be persisted in secure memory systems, especially for systems with integrity protection. (2) A complete understanding of the impact of relaxing security persistence, e.g., selective counter persistence [18], on security, and of the best practices that must accompany such relaxation schemes. (3) An understanding of the impact of different relaxation schemes on recovery time and the resilience of the system, i.e., the possibility of recovery failure. To address these shortcomings, we first discuss the problem of persisting security metadata in integrity-protected secure systems. Later, we define a new concept, Persistent Security, which captures the requirements for securely recovering integrity-protected and encrypted memory systems. Finally, we discuss several relaxation schemes and a comprehensive design that leverages them to optimize performance.

To reduce the overhead of persisting the Merkle Tree and encryption counters after each update, we propose Triad-NVM, a novel design that leverages four key insights: (1) Current systems and NVM support clearly define regions of the NVM device that can be used as persistent memory, e.g., initializing the Linux kernel with the parameter memmap=4G!12G dedicates the 4GB starting from address 12GB to be mounted/mapped as persistent memory. (2) Being able to spatially divide the address space into persistent and non-persistent regions allows us to partition the Merkle Tree vertically into persistent and non-persistent subtrees. While prior work [18] distinguishes between persistent and non-persistent data by modifying applications to explicitly hint the hardware and memory controller, we observe that such modifications might not be needed if persistent regions are well-defined and spatially distinguishable. (3) The Merkle Tree can be reconstructed after a crash by only ensuring the persistence of its low levels, however, at the cost of resilience (a single point of failure) and recovery time. (4) Building the Merkle Tree for non-persistent regions at recovery time can be very expensive, however, this can be mitigated by encoding special values at some levels of the Merkle Tree during recovery.

To evaluate our design, we use Gem5 [20], a full-system cycle-accurate simulator. We use Linux kernel version 4.14 along with a disk image based on Ubuntu 16.04. Additionally, we initialize the kernel to dedicate the last 4GB (out of 16GB) as a persistent region, which we later mount as a directly-accessible (DAX) ext4 filesystem. The persistent region can be used by any library that supports persistent memory; however, we opt for using Intel's PMDK library [21] (previously known as PMEM) to build 3 microbenchmarks, in addition to 4 DAX-based synthetic workloads, along with 12 benchmarks from the SPEC2006 suite [22]. We additionally evaluate 4 workloads that run combinations of persistent (PMDK and DAX) and non-persistent (SPEC) workloads. Our simulation results show that Triad-NVM can improve the throughput by an average of 2x (relative to strict persistence). Moreover, Triad-NVM maintains a recovery time of less than 4 seconds for an 8TB NVM system (30.6 seconds for 64TB), which is 3648x faster than a system without security metadata persistence.

The rest of the paper is organized as follows. First, we discuss the problem and background of secure NVM systems in Section II, where we also discuss the conventional way of persisting security metadata in secure NVMs. In Section III, we discuss the requirements of strictly-persistent secure memory systems, followed by different relaxation schemes and their impact on the security, performance and resiliency of NVM systems. In Section IV, we discuss our evaluation methodology. Our results and discussion of our evaluation are presented in Section V. Section VI discusses the work most relevant to our paper. Finally, we conclude our work in Section VII.

II. BACKGROUND AND MOTIVATION

We start this section with a background discussion, followed by a demonstration of the crash-consistency issue for security metadata. Later, we present motivational data on the impact of the problem.

A. Background

Emerging Non-Volatile Memories (NVMs), such as Intel's and Micron's 3D XPoint [23], are just around the corner and are expected to be on the market soon. Unlike DRAM, they feature high density, ultra-low idle power (no need to refresh cells), low latency and persistence of data [24], [25], [26]. Due to their ability to retain data across system reboots, they can also be used to store files and persistent data structures. Hence, NVMs can be used as a memory and a storage device at the same time. Specifically, NVM-based DIMMs can be used to hold files as well as regular memory pages, and can be accessed in a way similar to DRAM through load/store operations. To realize this, new Operating Systems (OSes) have started to support configuring the memory as both a filesystem and conventional memory. In particular, recent Linux kernels and Windows support direct-access (DAX) filesystems [27]. In a DAX-supported filesystem, e.g., ext4 with DAX, a file can be directly memory-mapped and accessed through typical load/store operations, without the need to copy its pages to the page cache as in conventional filesystems. For example, the NVM memory can be configured to have a part dedicated to holding a filesystem. In Linux systems, when initializing the kernel, the parameter memmap=4G!12G can be used to dedicate the 4GB starting from address 12GB to directly-accessible filesystems. Thus, the same memory chip will be accessed for both files and conventional memory pages.
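To make the DAX access path concrete, the following sketch memory-maps a file that lives on a DAX-mounted ext4 filesystem and writes to it with ordinary stores. The mount point /mnt/pmem and the file name are hypothetical, and the explicit cache-line flush stands in for what a persistence library such as PMDK would wrap in a helper; this is an illustrative sketch, not code from the paper.

/* Illustrative sketch: load/store access to a file on a DAX-mounted
 * ext4 filesystem (mount point and file name are hypothetical). */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    int fd = open("/mnt/pmem/data.bin", O_RDWR);     /* file on the DAX filesystem */
    if (fd < 0) { perror("open"); return 1; }

    size_t len = 1 << 20;                            /* map 1 MiB of the file */
    uint8_t *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    /* Stores go to the NVM-backed mapping directly; no page-cache copy is involved. */
    memcpy(p, "persistent update", 18);

    /* Flush the cache line and order the store so it reaches the persistence
     * domain (a library such as PMDK wraps this kind of sequence). */
    __builtin_ia32_clflush(p);
    __sync_synchronize();

    munmap(p, len);
    close(fd);
    return 0;
}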

1) NVM Security: Due to their non-volatility, emerging NVMs facilitate data remanence attacks; data remains there even after losing power. Accordingly, emerging NVMs are commonly coupled with memory encryption and data integrity verification. State-of-the-art secure NVM systems use counter-mode encryption [6], [7], [18]. Counter-mode encryption can protect against replay attacks, memory scanning and bus snooping attacks. Counter-mode memory encryption associates a counter with each 64B block in memory; this counter changes each time the corresponding block is written to memory, and the new counter value is used for the encryption. The most recent value of the counter must be tracked so it can later be used for decryption. In counter-mode encryption, repeating the same counter value can result in serious security vulnerabilities, e.g., known-plaintext attacks. To prevent counters from being replayed, they are typically protected against tampering through a Merkle Tree. State-of-the-art work, e.g., Bonsai Merkle Tree [4], applies the Merkle Tree over the encryption counters and protects data by computing a MAC value over the ciphertext and its encryption counter.


Fig. 1: Example Merkle Tree (64B counter blocks as leaves, 64B-to-8B hashes forming the intermediate nodes, and the Merkle Tree root kept within the secure processor boundary).

As shown in Figure 1, each encryption counter, which is associated with a data block, is used to calculate a hash value (MAC) that is combined with the hash values of other groups of counters to create the next-level hash value. Finally, the resulting MAC value, which is called the root, is kept in the processor. Each time a counter is brought in from the insecure area, e.g., the memory module, it is verified by calculating the upper hash values and checking whether the result matches the root kept in the processor. Furthermore, when the processor writes a memory block and updates the corresponding counter, the Merkle Tree intermediate nodes and the root should be updated to reflect the most recent change.
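As a sketch of this read-path verification, the routine below hashes a fetched counter block and walks up the tree until it either finds an ancestor that is already cached (and therefore already verified) or reaches the on-chip root. The node layout and the helpers hash8, mt_cache_lookup and fetch_node are illustrative placeholders, not the actual hardware interface described in the paper.

/* Illustrative sketch of read-path Merkle Tree verification. */
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ARITY 8

typedef struct { uint8_t bytes[64]; } node_t;   /* 64B node holding eight 8B hashes */

uint64_t hash8(const node_t *n);                                 /* 64B -> 8B MAC */
bool     mt_cache_lookup(unsigned level, uint64_t idx, node_t *out);
void     fetch_node(unsigned level, uint64_t idx, node_t *out);  /* read from NVM */
extern node_t mt_root;                                           /* kept on-chip  */

static uint64_t slot_hash(const node_t *n, unsigned slot) {
    uint64_t h;
    memcpy(&h, n->bytes + 8 * slot, sizeof h);    /* 8B hash of child `slot` */
    return h;
}

bool verify_counter_block(const node_t *ctr_blk, uint64_t leaf_idx, unsigned root_level) {
    uint64_t h = hash8(ctr_blk);                  /* hash of the fetched counter block */
    uint64_t idx = leaf_idx;

    for (unsigned lvl = 1; lvl < root_level; lvl++) {
        unsigned slot = (unsigned)(idx % ARITY);
        idx /= ARITY;

        node_t parent;
        bool cached = mt_cache_lookup(lvl, idx, &parent);
        if (!cached)
            fetch_node(lvl, idx, &parent);

        if (slot_hash(&parent, slot) != h)        /* child hash must match its slot */
            return false;
        if (cached)
            return true;                          /* ancestor already verified: stop */
        h = hash8(&parent);                       /* keep climbing toward the root */
    }
    /* Worst case: compare against the root held inside the processor. */
    return slot_hash(&mt_root, (unsigned)(idx % ARITY)) == h;
}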

2) Memory Counter-Mode Encryption: In counter-mode encryption, the encryption algorithm, e.g., AES, takes an initialization vector (IV) as its input to create a one-time pad (OTP), as shown in Figure 2. Later, when the block arrives at the processor chip, a low-cost bitwise XOR with the pad (the encrypted IV) is all that is needed to obtain the plaintext. By doing so, the decryption latency is hidden by the memory access latency. In our paper, we use the state-of-the-art split-counter scheme [6], [4] for organizing encryption counters, where each IV consists of a unique page ID (to distinguish between swap space and main memory space), a page offset (to guarantee that different blocks in a page get different IVs), a per-block minor counter (to make the same value encrypt differently when written again to the same address), and a per-page major counter (to guarantee uniqueness of the IV when minor counters overflow).
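A rough sketch of this path is shown below: the IV fields are packed into an AES block, encrypted under the memory key to form the one-time pad, and the pad is XORed with the 64B line both on write-back (encrypt) and on fetch (decrypt). The exact field widths and packing, and the aes_encrypt_block interface, are illustrative assumptions rather than the layout used in the paper.

/* Illustrative sketch of counter-mode pad generation and use. */
#include <stdint.h>

typedef struct {
    uint32_t page_id;      /* distinguishes main-memory vs. swap copies (width illustrative) */
    uint8_t  page_offset;  /* which 64B block inside the page */
    uint64_t major_ctr;    /* per-page major counter */
    uint8_t  minor_ctr;    /* 7-bit per-block minor counter */
} iv_t;

/* Assumed interface to the on-chip AES engine (one 128-bit block). */
void aes_encrypt_block(const uint8_t key[16], const uint8_t in[16], uint8_t out[16]);

static void make_pad(const uint8_t key[16], const iv_t *iv, uint8_t pad[64]) {
    for (int i = 0; i < 4; i++) {                 /* four 16B pads cover a 64B line */
        uint8_t blk[16] = {0};                    /* pack IV fields + chunk index + padding */
        blk[0] = (uint8_t)i;
        blk[1] = iv->page_offset;
        blk[2] = iv->minor_ctr;
        for (int b = 0; b < 4; b++)
            blk[3 + b] = (uint8_t)(iv->page_id >> (8 * b));
        for (int b = 0; b < 8; b++)
            blk[7 + b] = (uint8_t)(iv->major_ctr >> (8 * b));
        aes_encrypt_block(key, blk, pad + 16 * i);
    }
}

/* XORing with the pad encrypts on write-back and decrypts on fetch; the pad
 * can be computed while the data is still in flight from the NVM. */
static void xor_pad(uint8_t line[64], const uint8_t pad[64]) {
    for (int i = 0; i < 64; i++)
        line[i] ^= pad[i];
}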

Fig. 2: State-of-the-art counter-mode encryption as in [6] (the IV, formed from the page ID, page offset, major counter, minor counter and padding, is encrypted with the key into a pad that is XORed with the fetched block from NVMM or the block written back to NVMM; counters are served from the counter cache).

As in previous work [6], [18], [17], [28], [4], [7], we assume counter-mode memory encryption. In addition to its performance advantages, it also provides strong defenses against a wide range of attacks. Counter-mode encryption is secure against dictionary-based attacks, known-plaintext attacks, bus snooping and replay attacks. In the split-counter scheme, the encryption counters are organized as major counters (shared between the cache blocks of the same page) and minor counters that are specific to each cache block [4]. Such an organization allows packing the counters of 64 cache blocks into a 64B block: 7-bit minor counters and a 64-bit major counter. A major counter is incremented when one of its minor counters overflows, in which case all corresponding minor counters are reset and the whole page is re-encrypted using the new major counter [4]. When the major counter of a page overflows (a 64-bit counter), a new key is generated and the memory contents are re-encrypted using the new key. The split-counter scheme significantly reduces the memory re-encryption rate and minimizes the storage overhead of encryption counters compared to other schemes, e.g., a monolithic counter or an independent counter for each cache block. Moreover, the split-counter scheme exploits the spatial locality of encryption counters, and hence achieves a higher counter cache hit rate. Similar to state-of-the-art work [6], [29], [5], [4], we use the split-counter scheme for organizing the encryption counters.
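The 64B split-counter block described above can be pictured with the following layout sketch: one 64-bit major counter shared by a 4KB page plus sixty-four 7-bit minor counters packed into the remaining 56 bytes. The struct and accessor are illustrative; the actual bit packing in hardware may differ.

/* Illustrative layout of one 64B split-counter block (one per 4KB page). */
#include <stdint.h>

typedef struct {
    uint64_t major;        /*  8 bytes: per-page major counter            */
    uint8_t  minors[56];   /* 56 bytes: 64 minor counters of 7 bits each  */
} split_ctr_block_t;       /* total: 64 bytes                             */

/* Extract the 7-bit minor counter of cache block `blk` (0..63). */
static uint8_t get_minor(const split_ctr_block_t *c, unsigned blk) {
    unsigned bit  = blk * 7;
    unsigned byte = bit / 8, off = bit % 8;
    uint16_t two  = (uint16_t)c->minors[byte];
    if (byte + 1 < 56)
        two |= (uint16_t)c->minors[byte + 1] << 8;
    return (uint8_t)((two >> off) & 0x7F);
}

/* On a minor-counter overflow the major counter is incremented, all 64 minors
 * are reset, and the whole page is re-encrypted under the new major counter. */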

3) Impact of Persistency on Security: As persistent memory can be used to store checkpoints and recover from crashes, the accompanying security metadata should be persisted too. For instance, we mentioned earlier that the counter used for encryption is updated on each write of the associated memory block; however, such an update can occur in a volatile counter cache inside the processor chip and hence not be persisted to the NVM. Accordingly, the data block could be encrypted with the new counter value and written to NVM while the counter in memory remains stale. Clearly, if a crash occurs, a stale counter value will be used for decryption, which results in incorrect decryption. While recovering non-persistent data is not important, it is critical to avoid reusing their counters' previous values; repeating an old (not persisted) counter results in the reuse of an OTP that could have been observed earlier, which leads to the security vulnerabilities described below.

4) Attack on Reusing Counters for Non-Persistent Data: As described by Ye et al. [17], assume an adversarial application writes known plaintext to memory; if the memory location is non-persistent, the encrypted data is written to memory but the counter might not have been updated in memory yet. By observing the memory bus, the attacker can learn the encryption pad, i.e., the OTP, by XOR'ing the observed ciphertext, E_key(IV_new) ⊕ Plaintext, with the Plaintext. Even worse, it is also possible to predict the plaintext of initial accesses, for instance, zeroing on first access. At this point, the attacker knows the encryption pad, i.e., E_key(IV_new). Later, after recovering from a crash, the memory controller will fetch IV_old and increment it, which regenerates IV_new, and then use it to encrypt the new data written to that location, producing E_key(IV_new) ⊕ Plaintext2. As the attacker knows the encryption pad, revealing the value of Plaintext2 only requires XORing the ciphertext with the previously observed encryption pad, i.e., E_key(IV_new). Note that the stale counter could have been incremented multiple times before the crash, hence multiple writes of the new application can reuse counters with known encryption pads. Also note that such an attack only needs a malicious application to run (or merely predictable initial plaintext of an application) and a physical attacker.
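The algebra behind the attack can be restated compactly (notation as in the text; this is simply a re-derivation of the steps above):

\[
C_1 = E_{k}(IV_{new}) \oplus P_1 \;\Rightarrow\; pad = C_1 \oplus P_1 = E_{k}(IV_{new}),
\]
\[
C_2 = E_{k}(IV_{new}) \oplus P_2 \;\Rightarrow\; P_2 = C_2 \oplus pad,
\]

where $P_1$ is the known (or predictable) plaintext written before the crash, $C_1$ and $C_2$ are the ciphertexts observed on the memory bus, and the second write reuses $IV_{new}$ because the stale counter $IV_{old}$ is re-incremented after recovery.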

B. Security Metadata Crash Consistency

After a crash occurs, the system must be able to recover and restore its encrypted memory data along with its accompanying security metadata. Figure 3 shows the logical steps needed for crash consistency, as described by Ye et al. [17].

Fig. 3: Description of the write process that ensures crash consistency (a secure processor containing the LLC, counter cache, MT cache, MT root and the memory controller with its encryption engine, attached to NVM memory holding data, counters and the Merkle Tree; steps 1-3 are referenced in the text).

As depicted in Figure 3, the root of the Merkle Tree (on-chip) should be updated/persisted (as shown in step 1) along with the affected intermediate nodes inside the processor. However, only the root of the Merkle Tree needs to be kept in the secure region.

In step 2, the updated counter block is written back to memory as it gets updated in the counter cache. Counters are critical to keep and persist so that the security of the counter-mode encryption is not compromised. Note that even if the data is not expected to be recovered, the counters must be persisted to avoid repetition. Moreover, losing counter values can make it impossible to restore encrypted memory data. Liu et al. [18] observe that it is possible to persist only the counters of persistent data structures (or a subset of them) to enable consistent recovery.

Unfortunately, this can lead to serious security vulnerabilities as discussed earlier; reusing counter values even for non-persistent memory locations can compromise the security of the counter-mode encryption. Moreover, modifying applications to expose their persistent ranges to the memory controller is challenging, especially for legacy applications.

Finally, in step 3, the written data block is sent to the memory.

C. Motivation

Now that we understand the issue of guaranteeing security metadata crash consistency, let's discuss the impact of ensuring such consistency on the performance of Non-Volatile Memories (NVMs). As mentioned earlier, a strictly persistent secure system would persist any changes to the Merkle Tree and the corresponding encryption counter before each memory write operation. Figure 4 depicts the relative system throughput when using strict persistent security.

Fig. 4: Performance Overhead of Persisting Security Metadata (slowdown normalized to a baseline without security metadata persistence, across the persistent, mixed and SPEC2006 workloads).

As shown in Figure 4, most workloads suffer severe performance degradation due to the additional write operations needed to ensure persistence of security metadata. Notably, some workloads are slowed down by as much as 9.4x compared to a system that does not guarantee such persistence. The impact of persisting security metadata depends mainly on the memory behavior and write intensity of the running applications. Unfortunately, with such performance overheads, most users would avoid enabling persistent security, which would leave their systems insecure and potentially their data unrecoverable. To address this issue, in this paper we define the requirements of persistently-secure systems and study the impact of different schemes on the performance, recovery time and resilience of the system. To the best of our knowledge, our paper is the first to discuss the issue of persisting both encryption counters and the Merkle Tree in NVM-based systems.

III. DESIGN

In this section, we first define the requirements of persistently secure systems. Later, we discuss the different design options that can be adopted for implementing persistently secure systems. Finally, we discuss our proposed design, which combines several novel optimizations.

Note: More details about the workloads and methodology can be found in Section IV.


A. Persistently Secure Systems

To start our discussion of the requirements of persistently secure systems, let's first formally define such systems.

Definition 1: A Persistently Secure System is any secure system that is capable of recovering its security metadata, e.g., encryption counters and Merkle Tree, in case of power loss or system crash. Such a system should be able to verify, either at or after system recovery, that any (persistent) memory data being read after the crash reflects its most recent value. Specifically, if counter-mode encryption is used, such a system should guarantee that one-time pads are never repeated, regardless of the nature (persistent or non-persistent) of the corresponding data.

As mentioned earlier, state-of-the-art designs for secure NVMs deploy counter-mode encryption along with an optimized Merkle Tree that protects data and encryption counters against tampering. Such schemes have two key requirements: (1) never repeat the same counter without changing the key, and (2) ensure that the Merkle Tree root reflects the most recent state of the memory. The hierarchical implementation of the Merkle Tree is meant to reduce the verification overhead of read operations; if you read a counter whose parent node's most recent value is already cached in the processor chip, i.e., has already been verified, then there is no need to go further up the Merkle Tree. Similarly, if its grandparent node is present but its immediate parent is not, then only the immediate parent needs to be fetched from memory and verified against its hash value in the grandparent, after which the counter/leaf is verified against its hash value in the fetched parent.

Verifying a counter fetched from memory requires accessing all of its ancestors up to the first one that hits in the cache, which is the root in the worst case. However, given the spatial locality of most applications, they rarely need to access more than a few levels from memory. Unfortunately, the write operation is quite different; all upper levels must be updated, including the root of the Merkle Tree. As NVMs are expected to have huge capacities, e.g., tens of TBs, the height of the Merkle Tree is expected to be on the order of tens of levels, which requires updating tens of intermediate node blocks. Even worse, to achieve persistence of such security metadata, we need to persist each updated intermediate node block in the NVM. This brings us to the requirements for implementing persistently secure systems.

(1) Encryption Counters: While losing the most recent values of the encryption counters of non-persistent data structures or data would likely not affect the correctness of the system, it can compromise its security as discussed earlier. Accordingly, to prevent such a security vulnerability, encryption counters should either be strictly persisted or there must be a guarantee that one-time pads are never repeated.

(2) Merkle Tree: Persisting Merkle Tree updates is necessary to ensure the system's ability to verify encryption counters after recovery. However, we observe that not all parts of the Merkle Tree need to be persisted. Instead, persisting just the lowest levels of the Merkle Tree is sufficient to rebuild the whole Merkle Tree at recovery time, however, at the cost of creating a single point of failure and increasing recovery time, as we will discuss later.

1) Encryption Counter Persistence: Encryption counters of persistent data must be updated strictly before updating the data to ensure consistent recovery. Meanwhile, as also observed by Liu et al. [18], counters of non-persistent data do not need to be strictly persisted. However, while prior work [18] completely relaxes the persistence of such counters, we argue that we additionally need to ensure that such counters never repeat under the same encryption key. Later, we will discuss a scheme that relaxes the persistence of counters of non-persistent data while ensuring that they never repeat. Meanwhile, the encryption counters of persistent data are strictly persisted before their corresponding data is updated.

Observation 1: Encryption counters of persistent data must be persisted and up-to-date during normal operation and across system failures/crashes. In contrast, for non-persistent data, the encryption counters must be up-to-date during normal operation and guaranteed not to repeat old values after recovering from a crash.

Fig. 5: Encryption counters for data with different persistency requirements (counters covering the persistent, file-backed region of the NVM versus counters covering the non-persistent region).

2) Merkle Tree Persistence: While the Merkle Tree covers all encryption counters regardless of their persistence requirements, not all parts of the Merkle Tree need to be persisted. Specifically, given that current systems define the persistent regions of the NVM at boot time, there is a range of memory that is guaranteed not to be used for persistent data. For instance, if the Linux kernel defines the first 4GB of memory to be used as persistent memory, i.e., it can be mounted and used by PMEM and DAX applications, then the other parts of the memory do not ensure any persistence of data. Thus, persisting the updates to Merkle Tree intermediate nodes that only cover non-persistent regions can be relaxed as long as integrity is ensured during normal operation.

Observation 2: For non-persistent data, it is important to ensure their integrity during run-time; however, after a crash, we are no longer concerned about their values being lost or tampered with. In contrast, for persistent data, the integrity of the data and counters must be ensured during normal operation and across system failures/crashes.

Based on the above observation, the Merkle Tree can be subdivided into multiple subtrees, i.e., some parts only cover the first 4GB of the main memory as aforementioned, hence only that part of the tree needs to be persisted strictly. In other words, the Merkle Tree can be vertically divided into a persistent and a non-persistent tree. We also observe that persisting just the leaves of the Merkle Tree is sufficient to allow its reconstruction; however, this increases the recovery time and creates a single point of failure, which would require additional fault isolation mechanisms.

Fig. 6: Updating the Merkle Tree for counters corresponding to data with different persistency requirements (the figure distinguishes tree nodes that belong to persistent data regions from those that belong to non-persistent data regions; steps 1-4 are referenced in the text).

B. Relaxed Schemes

In this part of the paper, we will discuss several optimizations and their potential impacts.

1) Relaxed-but-Secure Persistence of Encryption Counters: To understand this scheme, let's first discuss how current systems adopt persistent memory. In current systems, at boot time, the kernel is initialized with a value that determines which range of the memory module is considered persistent. Such a region can be formatted and mounted as a filesystem, e.g., an ext4 filesystem. Additionally, such a persistent region can be used by persistency APIs, such as PMDK, which back the persistent data structures with files in the persistent region. Any file in the persistent region can be memory-mapped, i.e., mmap'ed, and accessed through typical load/store operations. Meanwhile, the non-persistent region of the NVM device is not guaranteed to be recoverable after a crash; an application will only be able to access data persisted to the persistent region after recovery. Figure 5 depicts a system with encryption counters of persistent and non-persistent regions in an NVM device. Note that such a distinction between the persistency of regions does not require modifying applications to explicitly define persistent data structures and expose them to hardware as in prior work [18].

Now that we understand how future NVM devices will be leveraged by the OS, let's discuss how encryption counters should be treated differently based on that. While the encryption counters of persistent data must be recovered correctly for both correctness and security reasons, those of non-persistent data can be relaxed from ensuring correctness after recovery. The only requirement on the encryption counters of non-persistent data is that they should not produce a previously used one-time pad, which we can ensure by using a different encryption key.

Based on our observation that non-persistent data can lose their most recent counter values after recovery but those counters should never be reused with the same key, we devise a design that uses two different keys: one key, which we refer to as the Persistent Key, and another, which we refer to as the Volatile Key. The persistent key is used for encrypting/decrypting persistent data regions, whereas the volatile key is used for non-persistent data. The volatile key is changed after each system reboot or recovery from a crash, which allows us to securely relax the persistence of the encryption counters of non-persistent data. Our design only requires modifying the memory controller to receive a command from the kernel that notifies the MC of the start and end addresses of the persistent region, and to use such information to decide which key to use. Moreover, the kernel needs to send the memory controller, through a memory-mapped register, a command to change the volatile key at recovery or reboot time.

Observation 3: In systems with persistent and non-persistent regions, if a counter-mode encryption scheme is employed, separate keys can be used for each region. As the encryption counters of the non-persistent region can repeat due to their optional persistence, the key used for that region must be changed after recovery or system reboot. By doing so, even if the same IV gets reused, the one-time pad will be different: E_key1(IV) ≠ E_key2(IV). Thus, updates to the encryption counters of non-persistent data can occur in the counter cache using a write-back scheme instead of write-through.
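A minimal sketch of this dual-key selection is shown below, assuming the kernel communicates the persistent range through memory-mapped registers and an on-chip random source is available for rotating the volatile key; register and helper names are hypothetical.

/* Illustrative sketch of per-region key selection in the memory controller. */
#include <stdbool.h>
#include <stdint.h>

static uint64_t pm_start, pm_end;    /* persistent region bounds, set via an MMIO command */
static uint8_t  persistent_key[16];  /* stable across reboots                              */
static uint8_t  volatile_key[16];    /* rotated on every reboot/recovery                   */

void hw_random_bytes(uint8_t *buf, int n);   /* assumed on-chip randomness source */

void on_boot_or_recovery(void) {
    /* Counters of non-persistent data may repeat, but under a fresh key the
     * resulting pads differ: E_key1(IV) != E_key2(IV). */
    hw_random_bytes(volatile_key, sizeof volatile_key);
}

static const uint8_t *select_key(uint64_t phys_addr) {
    bool persistent = (phys_addr >= pm_start) && (phys_addr < pm_end);
    return persistent ? persistent_key : volatile_key;
}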

2) Relaxed Persistence of Merkle Tree Subtrees: As discussed in the previous part, some parts of the memory are not expected to recover their data correctly after a crash. However, such regions require that their integrity be protected during normal run-time, but have no expectation of integrity protection after recovery from a crash, as the data will be discarded anyway. To better understand how an optimized scheme can relax the persistence of the non-persistent subtrees of the Merkle Tree, Figure 6 shows how the Merkle Tree can be updated based on the persistency requirements of the underlying data.

As depicted in Figure 6, when an update occurs for a persistent memory location, all the parts of the Merkle Tree corresponding to that location, at all levels, should be updated up to the root (inside the processor chip). Updates to the Merkle Tree for persistent data should update both the memory copy and the cache copy (where applicable) of (1) the root of the Merkle Tree, (2) the root's child node that owns the updated counter, and (3) all other intermediate nodes corresponding to the updated counter. Finally, as in step (4), the counter is updated in both the memory and the counter cache. Note that, for non-persistent data, updating the intermediate nodes and counters in the cache and lazily writing them back to memory is sufficient. Clearly, the subtrees that belong to persistent regions should be separable, i.e., some parts of the root belong only to the persistent region and others only to non-persistent regions. Having some parts belong to both is challenging and can lead to unverifiable parts after recovery; such a part would reflect the most recent values of both types of data, but we would be unable to reproduce the root due to the lost updates of the intermediate nodes and counters of non-persistent data. The only level that holds MAC values for both persistent and non-persistent data in the same block is the root; however, they are clearly separable. To avoid MAC values that cover both types of data, the memory space ratio between persistent and non-persistent data should be 0:8, 1:7, 2:6, 3:5, 4:4, 5:3, 6:2, 7:1 or 8:0. In other words, each MAC value in the root should cover either persistent or non-persistent data, but not both. Note that updates to the root do not need to be propagated to memory, as the root is guaranteed to be persistent in the processor, i.e., in an NVM register inside the processor.

Observation 4: The Merkle Tree can be logically divided into subtrees that belong either to persistent data or to non-persistent data. Due to the architectural layout of the tree, the ratio of persistent to non-persistent data is limited to specific values that guarantee each MAC value in the root belongs to either persistent or non-persistent data. To this end, the Merkle Tree subtrees that belong to non-persistent ranges can be updated in the MT cache and lazily updated in memory when written back.

3) Bottom-Up Merkle Tree Persistence: Having discussed in the previous part how to divide the Merkle Tree vertically into persistent and non-persistent subtrees, we now discuss which levels of the Merkle Tree need to be persisted and the impact of possible relaxation schemes.

Unlike encryption counters, Merkle Tree intermediate nodes can be rebuilt if lost; their main use is to speed up verification and updating of the Merkle Tree root. If the encryption counters are guaranteed to be strictly persistent and up-to-date, then after a crash it is possible to rebuild all levels of intermediate nodes all the way up to the root, which must match the root value in the processor. While this looks like a straightforward optimization, it can actually create a single point of failure; if any counter has been corrupted, e.g., due to a memory error, we can only tell that the Merkle Tree root does not match and thus none of the memory is integrity-verifiable. While the Merkle Tree root can hold 64B instead of only 8B values, an uncorrectable counter error would still result in 1/8th of the memory being lost/unverifiable. As emerging NVMs are expected to contain terabytes of data, the chance that an uncorrectable error of any counter leads to a large part of memory being unverifiable is unacceptable. Moreover, only guaranteeing the persistence of encryption counters would require a long recovery time due to the need to iterate over all encryption counters to rebuild all levels of the Merkle Tree.

To enable high-resolution identification of unverifiable locations while keeping the recovery time manageable, we propose persisting the first N levels of the Merkle Tree (from the bottom up) while optionally maintaining several NVM registers within the processor. The value of N depends on the acceptable performance/recovery-time trade-off. Persisting the low levels of the Merkle Tree helps isolate problems and identify which counters are corrupted/uncorrectable. Moreover, the more levels that are guaranteed to be persistent, the shorter the recovery time needed to reconstruct the Merkle Tree. Also note that since higher levels of the Merkle Tree have far fewer intermediate nodes, persisting them reduces the chance of being unable to construct the Merkle Tree due to uncorrectable errors.
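To get a feel for this trade-off, the back-of-envelope sketch below counts the nodes per level of an 8-ary tree built over split counters (64 counters per 64B block) for a given capacity; the level at which persistence stops determines how many nodes must be read (at roughly 100ns each, following the paper's assumption) to rebuild the levels above it at recovery. The code is illustrative and not part of the evaluated design.

/* Back-of-envelope node counts per Merkle Tree level for a given capacity. */
#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint64_t capacity  = 3ULL << 40;          /* e.g. 3TB of protected data     */
    uint64_t data_blks = capacity / 64;       /* 64B cache blocks               */
    uint64_t ctr_blks  = data_blks / 64;      /* split counters: 64 per block   */

    uint64_t nodes = ctr_blks;
    for (int lvl = 1; nodes > 1; lvl++) {
        nodes = (nodes + 7) / 8;              /* 8-ary tree */
        printf("level %d: %llu nodes\n", lvl, (unsigned long long)nodes);
    }
    /* Recovery only has to iterate over the highest persisted level and rebuild
     * the (much smaller) levels above it; multiplying that level's node count by
     * ~100ns per block gives a rough rebuild-time estimate. */
    return 0;
}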

Observation 5: Persisting parts of the Merkle Tree avoids single points of failure resulting from uncorrectable errors of encryption counters. Moreover, it reduces the Merkle Tree recovery time by reducing the number of levels that need to be rebuilt after recovery. Internal NVM registers inside the processor can also be used to hold the upper levels of the Merkle Tree, e.g., the root's immediate children and grandchildren (a total of 73 blocks); thus only a small percentage of memory is declared unverifiable in case uncorrectable errors corrupt encryption counters or the persisted parts of the Merkle Tree.

C. Triad-NVM

Our design that includes all of the proposed optimizations is referred to as Triad-NVM. Triad-NVM logically divides the Merkle Tree into persistent and non-persistent subtrees. Moreover, it strictly persists counters that belong to persistent regions. Finally, for persistent regions, the Merkle Tree must be reconstructed after a crash, thus it is necessary to ensure resiliency by persisting additional levels, as configured by the system owner or recommended by the system architects. For non-persistent regions, there is no need to ensure reconstruction of the corresponding subtree of the Merkle Tree, thus updates to such a subtree are not strictly persisted.

At recovery time, the Merkle Tree has to be reconstructed to be able to verify any tampering. During recovery, before admitting the current state of the system as verified, all persistent locations need to be verified by reconstructing the corresponding parts of the Merkle Tree and verifying that they generate the correct root. This ensures that the data has not been tampered with from (or before) crash time until recovery time. For non-persistent data locations, while we do not need to verify the integrity of the data at recovery time, we need to be able to verify it after recovery. Thus, the Merkle Tree subtrees that correspond to non-persistent regions should be constructed as well. Now the question is how we can do that efficiently.

Reconstructing the Subtree of a Non-Persistent Region: As mentioned earlier, in state-of-the-art schemes, e.g., Bonsai Merkle Tree [28], the MAC values stored with data are calculated over the data and its encryption counter, and hence all data and its accompanying MAC values need to be reinitialized based on the updated counter values after recovery. Note that since persisting updates to the Merkle Tree and counters of non-persistent data is relaxed, there is no guarantee that the recovered counter values match those used to calculate the MAC values accompanying their corresponding data. Accordingly, there will be an inconsistency that results in errors and mistaken flagging of tampering during normal operation after recovery.

Unfortunately, it is very time-consuming to iterate over all non-persistent data to reinitialize counters, data and MAC values. To better understand the overhead, let's assume a 6TB NVM, where 50% is used for non-persistent data, i.e., normal memory. At recovery time, if reading each data block and initializing it takes 100ns, then just iterating over the 3TB non-persistent region would take 5154 seconds (almost 85.9 minutes). However, for the persistent data (the other 3TB), since the low levels of the Merkle Tree along with the data and counters have been strictly persisted, only the upper levels of the Merkle Tree need to be reconstructed. For instance, if the persisted counters are used to rebuild the Merkle Tree starting from its first level (the parents of the counter blocks), and reading each block and calculating its MAC value takes 100ns, then that takes only 92 seconds (about 1.5 minutes). With more aggressive persistence of the Merkle Tree, such as up to level 2 (the grandparents of the counter blocks), the construction takes only 11.5 seconds.

Observation 6: Relaxing the persistence of the Merkle Tree parts and counters that correspond to non-persistent data can significantly increase recovery time. Given the significant cost of system downtime (hundreds of thousands of dollars per minute [30]), completely reconstructing the counters and Merkle Tree parts of non-persistent data at recovery time can lead to unanticipated system unavailability.

To avoid an unanticipated long recovery time for rebuilding the counters and Merkle Tree parts of non-persistent regions, we additionally propose Lazy Recovery of the Low Levels of the Non-Persistent Merkle Tree. As the counter and data blocks are the most numerous and consume most of the rebuilding time, we can lazily update them as follows. By initializing all intermediate nodes at level 1 (the parents of the counter blocks) with zeros, and constructing all upper levels sequentially, we can obtain an initial root value (or part of the root) for non-persistent data. Later, on any update of any counter value, if the parent intermediate node has a zero value, we do not flag an error or tampering, since the upper levels of the Merkle Tree (and the root) already reflect such a value; instead, we know that this is the first write to the counter block after recovery, and hence we zero out (or initialize) the counter value and update the parent node accordingly. Note that since each counter block has its MAC value stored in 8 bytes of the 64B parent node, only the corresponding 8B in the parent node indicate whether this is the first update after recovery or not. While the odds that a counter block's MAC would naturally be a 64-bit zero value are only 1/2^64, to avoid falsely assuming an initialized counter value, if a counter block naturally has a MAC value of 0, we re-encrypt one of its cachelines, increment its minor counter accordingly, calculate its new MAC value, and use it to update the parent block if the new MAC value is non-zero. By lazily updating non-persistent data and counter blocks, we ensure a recovery process that only needs to iterate over levels 1 or 2 instead of all corresponding data and counter blocks.
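A sketch of this zero-marker rule on the counter update path is shown below; mac8, reset_counter_block and bump_minor_and_reencrypt are illustrative placeholders for the memory-controller actions described above.

/* Illustrative sketch of the lazy-recovery check for non-persistent counters. */
#include <stdint.h>

uint64_t mac8(const void *blk64);                     /* 64B -> 8B MAC            */
void     reset_counter_block(void *ctr_blk);          /* zero/init the counters   */
void     bump_minor_and_reencrypt(void *ctr_blk);     /* re-encrypt one cacheline */

void on_counter_update(void *ctr_blk, uint64_t *parent_slot) {
    if (*parent_slot == 0) {
        /* An all-zero slot marks "untouched since recovery": this is the first
         * write to this counter block after the crash, not tampering. */
        reset_counter_block(ctr_blk);
    }
    uint64_t h = mac8(ctr_blk);
    while (h == 0) {
        /* Avoid a legitimate MAC colliding with the zero marker. */
        bump_minor_and_reencrypt(ctr_blk);
        h = mac8(ctr_blk);
    }
    *parent_slot = h;   /* propagate the update upward through the tree as usual */
}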

Fig. 7: High-Level Overview of the Write Operation in Triad-NVM (updates to the data, counter, MT nodes and MT root are first logged to persistent registers guarded by a READY_BIT, then selectively copied to the Write Pending Queue (WPQ) in the persistence domain depending on the relaxation scheme and address, after which the entry is retired; the WPQ is periodically flushed to the NVM).

Figure 7 depicts the Triad-NVM design. As mentioned earlier, the Write Pending Queue (WPQ) is considered part of the persistence domain (the power-fail protected domain) in modern processors [16]. Thus, anything that reaches it must be consistent, or there must be a way to ensure its consistency. To do so, each write operation, before being persisted (removed from the volatile buffer), logs all of its corresponding updates (counter, data, MT nodes and root) to persistent registers inside the processor and then sets a persistent bit (called the READY_BIT). If a crash occurs while the updates in the persistent registers are being copied to the WPQ, then when the system restarts, the memory controller will attempt to write the persistent registers to the NVM (or WPQ) again. At the time of copying the contents from the registers to the WPQ, Triad-NVM selectively chooses whether the counter or Merkle Tree nodes should be copied to the WPQ or whether it is enough to update them in the caches (as shown in steps 8 and 9). Note that once a dirty block gets evicted from these caches it will go to the WPQ as usual. Persistent registers can be implemented as fast NVM registers, or as volatile registers that are flushed to slower NVM registers once a crash occurs by leveraging ADR or residual power. The number of persistent registers depends on the Triad-NVM model, e.g., if TriadNVM-2 is used then we only need 5 registers, whereas the impractical strict persistence might need up to 15 registers. Note that persistent registers have also been assumed in state-of-the-art work [17].

If the updated encryption counter corresponds to a persistent memory region, it will be strictly updated (copied to the WPQ).


However, for the Merkle Tree, if the updated nodes belong to a persistent memory region and lie at or below the persist level, the updates are persisted to NVM (copied to the WPQ). Note that the persist level is the highest level of the Merkle Tree that is guaranteed to be strictly persisted, e.g., if that level is 2, then counters, their parents and their grandparents are guaranteed to be persisted after each update. Note that the Triad-NVM recovery process is as simple as iterating over the intermediate nodes at the persist level and constructing the upper levels for the persistent memory regions, while additionally initializing the intermediate nodes of that level to zeros for non-persistent memory regions before constructing their upper levels of the tree.
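The decision of which updates must reach the WPQ versus which may stay in the volatile caches can be sketched as follows; the structures, helper names and fixed-size node array are illustrative simplifications of the flow in Figure 7, with persist_level being the highest strictly persisted Merkle Tree level (e.g., 2 for TriadNVM-2).

/* Illustrative sketch of the Triad-NVM write-path persistence decision. */
#include <stdbool.h>
#include <stdint.h>

struct log_entry {
    uint64_t data_blk, ctr_blk;
    uint64_t mt_nodes[16];       /* updated intermediate nodes, bottom-up (sized generously) */
    int      mt_levels;
    uint64_t mt_root;
    bool     ready;              /* READY_BIT: entry is complete and replayable */
};

bool addr_is_persistent(uint64_t addr);     /* persistent-region check                 */
void copy_to_wpq(uint64_t what);            /* enters the persistence domain (WPQ)     */
void update_cache_only(uint64_t what);      /* lazy write-back path                    */

void triad_nvm_write(struct log_entry *e, uint64_t addr, int persist_level) {
    e->ready = true;                        /* entry fully logged in persistent registers */

    copy_to_wpq(e->data_blk);               /* the data block itself is always persisted  */

    if (addr_is_persistent(addr))
        copy_to_wpq(e->ctr_blk);            /* strict counter persistence                 */
    else
        update_cache_only(e->ctr_blk);      /* the volatile key covers possible reuse     */

    for (int lvl = 1; lvl <= e->mt_levels; lvl++) {
        if (addr_is_persistent(addr) && lvl <= persist_level)
            copy_to_wpq(e->mt_nodes[lvl - 1]);      /* levels up to the persist level     */
        else
            update_cache_only(e->mt_nodes[lvl - 1]);
    }

    /* The root lives in an on-chip NVM register, so no WPQ copy is needed. */
    e->ready = false;                       /* entry consumed; it can now be reused       */
}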

IV. METHODOLOGY

In this section, we describe our evaluation methodology. Since our work involves kernel modifications and both persistent and non-persistent applications, we use the Gem5 simulator [20] in its full-system mode. The kernel we simulate is based on Linux kernel 4.14. Moreover, the disk image we use is based on the Ubuntu 16.04 distribution. The kernel was initialized to configure the 4GB starting at 12GB as a persistent region, i.e., the kernel was booted with memmap=4G!12G. The persistent region was then formatted with a DAX-enabled ext4 filesystem and mounted to be used by persistent applications and libraries. Table I presents the architectural configuration of our simulated system.

TABLE I: Configuration of the simulated system.

Processor
  CPU: 8-core, 1GHz, out-of-order x86-64
  L1 Cache: private, 2 cycles, 32KB, 2-way, 64B block
  L2 Cache: private, 20 cycles, 512KB, 8-way, 64B block
  L3 Cache: shared, 32 cycles, 8MB, 64-way, 64B block

DDR-based PCM Main Memory
  Capacity: 16GB
  PCM Latencies: 60ns read, 150ns write [25]
  Organization: 2 ranks/channel, 8 banks/rank, 1KB row buffer, Open Adaptive page policy, RoRaBaChCo address mapping
  DDR Timing: tRCD 55ns, tXAW 50ns, tBURST 5ns, tWR 150ns, tRFC 5ns [25], [18]; tCL 12.5ns, 64-bit bus width, 1200 MHz clock

Encryption Parameters
  Counter Cache: 128KB, 8-way, 64B block
  Merkle-Tree Cache: 128KB, 8-way, 64B block
  Merkle-Tree: 9 levels, 8-ary, 64B blocks at each level

To evaluate the impact of our proposed optimizations, we run 3 sets of workloads: persistent applications, non-persistent applications, and a combination of both. The major applications are described below, followed by Table II, which shows the mixed workloads we use. For non-persistent workloads, we use representative benchmarks from SPEC2006 [22]. We also implement persistent Hashtable, ArraySwap and Queue benchmarks using Intel's PMDK library. Additionally, we implement a synthetic benchmark, DAXBENCH-S-RW, that leverages DAX to mmap a file directly in the persistent region and accesses it through memory load/store operations with stride S and read-to-write ratio RW. Finally, we use those benchmarks to create multi-programmed workloads to study the optimizations that relax the persistence of security metadata for non-persistent applications.

TABLE II: Mixed Benchmarks

Benchmark    Combination
DAXBENCH1    DAXBENCH-128-2
DAXBENCH2    DAXBENCH-1024-2
DAXBENCH3    DAXBENCH-256-2
DAXBENCH4    DAXBENCH-512-3
MIX1         Array-swap, Queue, Hashtable, DAXBENCH-64-2
MIX2         Mcf, Queue, Hashtable, DAXBENCH-64-2
MIX3         Mcf, LBM, Hashtable, DAXBENCH-512-2
MIX4         Array-swap, Hashtable, Hashtable, DAXBENCH-1024-2

All applications were fast-forwarded to representative regions, and each run simulated at least 200M instructions.

V. EVALUATION

In this section, we evaluate our proposed optimizations and study the impact of combining them in Triad-NVM. We first study the impact of each optimization on performance, and then analyze the trade-off between recovery time and performance overhead when relaxing the persistence of the Merkle Tree.

A. The Impact of Relaxation Schemes on Performance

Figure 8 shows the impact of Triad-NVM on performance. TriadNVM-N represents Triad-NVM with strict persistence of persistent regions up to the Nth level of the Merkle Tree. As expected, for the persistent region, TriadNVM-1 performs better than TriadNVM-2 and TriadNVM-3. The main reason is that fewer writes need to reach memory to ensure strict persistence of the Merkle Tree, i.e., in TriadNVM-1 only the parent of the updated counter block needs to be persisted along with the counter block, whereas in TriadNVM-2 both the parent and the grandparent of the updated counter block need to be persisted along with the counter block. Clearly, there is no difference between the TriadNVM persistence levels for non-persistent workloads, e.g., LBM and MCF. Moreover, in mixed workloads, only the writes to persistent regions make a difference between the TriadNVM models.

Fig. 8: The impact of the Merkle Tree persistence model on throughput (slowdown normalized to baseline, for Strict, TriadNVM-1, TriadNVM-2, and TriadNVM-3).


Our results show that the strict persistence scheme leads to an average slowdown of 2.21x, whereas TriadNVM-1, TriadNVM-2, and TriadNVM-3 incur only 4.9%, 10.1%, and 15.6% performance overhead, respectively. Moreover, we observe that write-intensive workloads that allocate only non-persistent regions, e.g., Libquantum, can achieve almost an order of magnitude speedup when using schemes that are aware of persistent regions and thus relax the requirements for non-persistent memory ranges.

As mentioned earlier, one major shortcoming of emerging NVMs is their slow writes and limited write endurance. Ensuring a higher level of persistence reduces recovery time and avoids a single point of failure, but it can excessively increase the number of writes. More relaxed schemes, e.g., TriadNVM-1, incur fewer extra writes at the cost of additional recovery time and a potential single point of failure.

Fig. 9: The impact of the Merkle Tree persistence model on the number of written bytes (No-Persist, Strict, TriadNVM-1, TriadNVM-2, and TriadNVM-3).

Figure 9 shows the total number of bytes written in our simulations for each scheme and workload. We observe that the number of writes of most applications grows with the persistence level; however, since a large percentage of writes results from natural cache write-backs and data persistence, only the writes caused by persisting security metadata increase linearly with the persistence level. For most workloads, the number of writes for TriadNVM is close to the original number of writes without persisting security metadata.

B. The Impact on Recovery Time

As mentioned earlier, strict persistence ensures near-zero recovery time, but at the cost of significant performance degradation during normal operation. Meanwhile, not persisting security metadata at all requires rebuilding the Merkle Tree and initializing MAC values for data blocks, which means iterating over all memory blocks. In contrast, Triad-NVM guarantees the persistence of the counters and the first level of the Merkle Tree. To estimate recovery time, we assume that reading a tree block and calculating its MAC value takes 100ns.

Figure 10 shows the impact of persisting security metadata on recovery time.

Fig. 10: The impact of the Merkle Tree persistence model on recovery time (in seconds) for memory sizes from 16GB to 64TB (No-Recovery, Strict, TriadNVM-1, TriadNVM-2, and TriadNVM-3).

As emerging NVMs are expected to reach terabytes, our goal is to estimate how long it takes to recover the whole Merkle Tree under the different persistency models. We expect TriadNVM-3 to have the shortest recovery time, as it requires the fewest computations: reconstruction starts from Level 3. Meanwhile, TriadNVM-2 trades better normal-operation performance for an almost 8x longer recovery time. In the case of strict persistence, recovery time is almost zero; all parts of the Merkle Tree are up to date. As shown in the figure, the problem becomes more pronounced with terabytes of memory: at 1TB, the recovery (construction) time for the no-persist scheme is 30 minutes, whereas TriadNVM-1 completes recovery in only 30.68 seconds. As the cost of system downtime can be in the hundreds of thousands of dollars per minute [30], some high-availability systems might trade performance for faster recovery. For instance, if TriadNVM-2 or TriadNVM-3 is used with 1TB of memory, the recovery time is 3.83 or 0.48 seconds, respectively. We also observe that recovery at 8TB is almost 8x slower than at 1TB, due to the linear increase in the number of nodes/blocks that must be iterated through to calculate MAC values. While TriadNVM can be configured to persist up to any specific level, we leave the choice of level to the system integrator or administrator, based on the acceptable performance/recovery-time trade-off. Such a decision can also be affected by the crash rate and likelihood of system failures, as well as how sensitive the data is and the tolerance for declaring it unverifiable at recovery time.
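As a sanity check on these estimates, the following back-of-the-envelope sketch (our approximation, not the simulation code) counts the blocks that must be read and MACed during reconstruction. It assumes each 64B counter block covers a 4KB page, an 8-ary tree over the counter blocks, and the 100ns per-block cost stated earlier; the mapping from TriadNVM mode to the level at which reconstruction starts is our reading of the text.

```c
#include <stdio.h>

/* Rough recovery-time estimate: count the tree blocks that must be read and
 * MACed to rebuild the Merkle Tree when reconstruction starts by reading the
 * blocks at `read_level` (level 0 = counter blocks). Assumes an 8-ary tree
 * over 64B counter blocks, each covering a 4KB page, at 100ns per block. */
static double recovery_seconds(double mem_bytes, int read_level)
{
    double blocks = mem_bytes / (64.0 * 64.0);   /* counter blocks = tree leaves */
    double total = 0.0;
    for (int level = 0; blocks >= 1.0; level++, blocks /= 8.0) {
        if (level >= read_level)
            total += blocks;                     /* blocks touched during rebuild */
    }
    return total * 100e-9;                       /* 100ns per block read + MAC */
}

int main(void)
{
    const double TB = 1024.0 * 1024.0 * 1024.0 * 1024.0;
    for (int lvl = 0; lvl <= 2; lvl++)
        printf("1TB, rebuild reading level %d: %.2f s\n",
               lvl, recovery_seconds(1 * TB, lvl));
    return 0;
}
```

Under these assumptions, a 1TB memory yields roughly 30.7, 3.8, and 0.48 seconds depending on how deep the reconstruction has to start, in line with the 1TB numbers discussed above.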

As the resilience of the system is also important, TriadNVM-2 and TriadNVM-3 provide a higher resolution for pinpointing errors. In TriadNVM-2, if calculating the MAC values starting from Level 2 does not eventually reproduce the root value stored in the processor, TriadNVM restarts from Level 1 to find out which Level-2 node does not match the value calculated from its children, and hence declares only the corresponding 32KB unverifiable. Note that the root must match the value generated from the Level-1 nodes before declaring that Level-2 node to be the source of the error. Similarly, if an uncorrectable error occurs in a Level-1 node while the Level-2 nodes still generate the correct root value, i.e., are verified, then we can use the Level-2 nodes to pinpoint which Level-1 node is corrupted. TriadNVM-3 follows the same logic but adds an additional level of isolation in case uncorrectable errors occur in both Level 1 and Level 2. It is also important to note that since Level 2 and Level 3 have 8x and 64x fewer nodes, respectively, than Level 1, the chance of uncorrectable errors in them is also smaller.
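A simplified sketch of this drill-down is shown below; the node layout and the helpers read_node(), hash_children(), and nodes_equal() are hypothetical stand-ins (the real checks happen in the memory controller), but the descent mirrors the verification order described above.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define ARITY 8
typedef struct { uint8_t mac[8]; } node_t;

/* Hypothetical stubs standing in for reading a persisted tree node and for
 * recomputing its MAC from its 8 children; here they return zeroed nodes so
 * the file compiles and runs as a standalone sketch. */
static node_t read_node(int level, uint64_t idx)     { (void)level; (void)idx; node_t n = {{0}}; return n; }
static node_t hash_children(int level, uint64_t idx) { (void)level; (void)idx; node_t n = {{0}}; return n; }
static bool   nodes_equal(node_t a, node_t b)        { return memcmp(a.mac, b.mac, sizeof a.mac) == 0; }

/* Given a level-L node whose stored value disagrees with the MAC recomputed
 * from its children, re-check each child the same way; the deepest
 * mismatching node bounds the region declared unverifiable (a Level-1 node
 * covers 8 counter blocks, i.e., 32KB of data, as in the text). */
static void pinpoint(int level, uint64_t idx)
{
    if (level == 1) {
        printf("declare the 32KB under level-1 node %llu unverifiable\n",
               (unsigned long long)idx);
        return;
    }
    for (uint64_t c = idx * ARITY; c < (idx + 1) * ARITY; c++)
        if (!nodes_equal(read_node(level - 1, c), hash_children(level - 1, c)))
            pinpoint(level - 1, c);   /* descend into the mismatching child */
}

int main(void)
{
    pinpoint(2, 0);   /* e.g., TriadNVM-2: a Level-2 node failed verification */
    return 0;
}
```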

VI. RELATED WORK

Persisting security metadata has rarely been studied until recently. Recent work [18] proposed selective persistence of encryption counters without any discussion of how Merkle Tree persistence can be handled. Moreover, as discussed earlier, selective counter persistence is vulnerable to known-plaintext attacks, as it allows the reuse of encryption counters for non-persistent data. In contrast, our work takes a more holistic approach, studying all relevant security metadata and the impact of their persistence on recovery time, resilience, and performance. Additionally, we propose solutions to mitigate such overheads and to protect against potential one-time-pad reuse.

Recently, Zuo et al. [19] proposed combining multiple updates of an encryption counter block into one write to memory, especially for large transactions. Unfortunately, such a solution works well only if the spatial locality of writes to memory can be predicted, which is typically hard since writes can result from, and interfere with, evictions and write-backs from the Last-Level Cache (LLC). Thus, such a solution is beneficial only when many writes to memory are spatially contiguous and occur close together in time. In contrast, our solutions are generic and do not expect any special behavior from workloads. Additionally, we investigate Merkle Tree persistence, which has not been discussed or explored in any of the previous work we are aware of. Nevertheless, SecPM can be orthogonally augmented with our proposed solutions. Another recent work repurposes Error-Correcting Codes (ECC) to additionally serve as a sanity check for the encryption counter [17]. By doing so, Ye et al. [17] propose a novel scheme that relaxes the persistence of encryption counters and restores them by trying candidate values until the ECC matches or indicates a natural number of errors. Our work is orthogonal to Osiris and can be integrated with it to reduce the number of writes needed to persist counters in persistent memory regions.

Persisting data in NVMs follows different models depending on the required software and hardware changes [31], [10], [32], [33]. Several APIs and libraries have been developed for persisting data in emerging NVMs using the OS support for PMEM, e.g., Intel's PMDK [21]. In this paper, our support focuses on persisting the security metadata that accompanies persistent data when it is written to NVM; we therefore require no changes at the application or library level, and Triad-NVM can be integrated with most persistency models.

Other works that target reducing NVM writes have not considered the problem of persisting security metadata [6], [7]. All of these solutions can employ TriadNVM to ensure the persistence and recoverability of security metadata.

VII. CONCLUSION

In this paper, we propose TriadNVM, a set of schemes that enable an efficient, resilient, and performance-friendly recovery mechanism for security metadata. TriadNVM ensures the persistence of the Merkle Tree and encryption metadata while incurring minimal overheads. Moreover, TriadNVM can be customized to different modes based on the desired recovery-time/performance trade-offs.

REFERENCES

[1] J. A. Halderman, S. D. Schoen, N. Heninger, W. Clarkson, W. Paul, J. A. Calandrino, A. J. Feldman, J. Appelbaum, and E. W. Felten, "Lest we remember: cold-boot attacks on encryption keys," Communications of the ACM, vol. 52, no. 5, pp. 91–98, 2009.

[2] R. Geambasu, T. Kohno, A. A. Levy, and H. M. Levy, "Vanish: Increasing data privacy with self-destructing data," in USENIX Security Symposium, vol. 316, 2009.

[3] T. Pettersson, "Cryptographic key recovery from linux memory dumps," 2007.

[4] C. Yan, D. Englender, M. Prvulovic, B. Rogers, and Y. Solihin, "Improving cost, performance, and security of memory encryption and authentication," in ACM SIGARCH Computer Architecture News, vol. 34, pp. 179–190, IEEE Computer Society, 2006.

[5] G. Saileshwar, P. J. Nair, P. Ramrakhyani, W. Elsasser, and M. K. Qureshi, "Synergy: Rethinking secure-memory design for error-correcting memories," in The 24th International Symposium on High-Performance Computer Architecture (HPCA), 2018.

[6] A. Awad, P. Manadhata, S. Haber, Y. Solihin, and W. Horne, "Silent shredder: Zero-cost shredding for secure non-volatile main memory controllers," in Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS '16, (New York, NY, USA), pp. 263–276, ACM, 2016.

[7] V. Young, P. J. Nair, and M. K. Qureshi, "Deuce: Write-efficient encryption for non-volatile memories," in Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS '15, (New York, NY, USA), pp. 33–44, ACM, 2015.

[8] J. Rakshit and K. Mohanram, "Assure: Authentication scheme for secure energy efficient non-volatile memories," in 2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC), pp. 1–6, June 2017.

[9] J. Izraelevitz, T. Kelly, and A. Kolli, "Failure-atomic persistent memory updates via justdo logging," ACM SIGARCH Computer Architecture News, vol. 44, no. 2, pp. 427–442, 2016.

[10] J. Ren, J. Zhao, S. Khan, J. Choi, Y. Wu, and O. Mutlu, "Thynvm: Enabling software-transparent crash consistency in persistent memory systems," in Proceedings of the 48th International Symposium on Microarchitecture, pp. 672–685, ACM, 2015.

[11] J. Arulraj and A. Pavlo, "How to build a non-volatile memory database management system," in Proceedings of the 2017 ACM International Conference on Management of Data, pp. 1753–1758, ACM, 2017.

[12] T. Wang and R. Johnson, "Scalable logging through emerging non-volatile memory," Proceedings of the VLDB Endowment, vol. 7, no. 10, pp. 865–876, 2014.

[13] J. Arulraj, A. Pavlo, and S. R. Dulloor, "Let's talk about storage & recovery methods for non-volatile memory database systems," in Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, pp. 707–722, ACM, 2015.

[14] W.-H. Kim, J. Kim, W. Baek, B. Nam, and Y. Won, "Nvwal: Exploiting nvram in write-ahead logging," ACM SIGOPS Operating Systems Review, vol. 50, no. 2, pp. 385–398, 2016.

[15] J. Chang and A. Sainio, "Nvdimm-n cookbook: A soup-to-nuts primer on using nvdimm-ns to improve your storage performance."

[16] A. M. Rudoff, "Deprecating the pcommit instruction," 2016.

[17] M. Ye, C. Hughes, and A. Awad, "Osiris: A Low-Overhead Mechanism for Restoration of Secure Non-Volatile Memories," in Microarchitecture (MICRO), 2018 51st Annual IEEE/ACM International Symposium on.


[18] S. Liu, A. Kolli, J. Ren, and S. Khan, "Crash consistency in encrypted non-volatile main memory systems," in 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA), pp. 310–323, Feb 2018.

[19] P. Zuo and Y. Hua, "Secpm: a secure and persistent memory system for non-volatile memory," in 10th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage 18), (Boston, MA), USENIX Association, 2018.

[20] N. Binkert, B. Beckmann, G. Black, S. K. Reinhardt, A. Saidi, A. Basu, J. Hestness, D. R. Hower, T. Krishna, S. Sardashti, R. Sen, K. Sewell, M. Shoaib, N. Vaish, M. D. Hill, and D. A. Wood, "The gem5 simulator," SIGARCH Comput. Archit. News, vol. 39, pp. 1–7, Aug. 2011.

[21] U. Usharani, "Introduction to Programming with Persistent Memory from Intel."

[22] J. L. Henning, "Spec cpu2006 benchmark descriptions," SIGARCH Comput. Archit. News, vol. 34, pp. 1–17, Sept. 2006.

[23] Micron, "Breakthrough nonvolatile memory technology."

[24] B. C. Lee, P. Zhou, J. Yang, Y. Zhang, B. Zhao, E. Ipek, O. Mutlu, and D. Burger, "Phase-change technology and the future of main memory," IEEE Micro, no. 1, pp. 143–143, 2010.

[25] B. Lee, E. Ipek, O. Mutlu, and D. Burger, "Architecting phase change memory as a scalable dram alternative," in International Symposium on Computer Architecture (ISCA), 2009.

[26] Z. Li, R. Zhou, and T. Li, "Exploring high-performance and energy proportional interface for phase change memory systems," IEEE 20th International Symposium on High Performance Computer Architecture (HPCA), pp. 210–221, 2013.

[27] "Linux Direct Access of Files (DAX)."

[28] B. Rogers, S. Chhabra, M. Prvulovic, and Y. Solihin, "Using address independent seed encryption and bonsai merkle trees to make secure processors os- and performance-friendly," in Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 183–196, IEEE Computer Society, 2007.

[29] A. Awad, Y. Wang, D. Shands, and Y. Solihin, "Obfusmem: A low-overhead access obfuscation for trusted memories," in Proceedings of the 44th Annual International Symposium on Computer Architecture, pp. 107–119, ACM, 2017.

[30] T. Holterbach, S. Vissicchio, A. Dainotti, and L. Vanbever, "Swift: Predictive fast reroute," in Proceedings of the Conference of the ACM Special Interest Group on Data Communication, pp. 460–473, ACM, 2017.

[31] J. Coburn, A. M. Caulfield, A. Akel, L. M. Grupp, R. K. Gupta, R. Jhala, and S. Swanson, "Nv-heaps: making persistent objects fast and safe with next-generation, non-volatile memories," ACM Sigplan Notices, vol. 46, no. 3, pp. 105–118, 2011.

[32] A. Kolli, V. Gogte, A. Saidi, S. Diestelhorst, P. M. Chen, S. Narayanasamy, and T. F. Wenisch, "Language-level persistency," in Proceedings of the 44th Annual International Symposium on Computer Architecture, pp. 481–493, ACM, 2017.

[33] J. Zhao, S. Li, D. H. Yoon, Y. Xie, and N. P. Jouppi, "Kiln: Closing the performance gap between systems with and without persistence support," in Microarchitecture (MICRO), 2013 46th Annual IEEE/ACM International Symposium on, pp. 421–432, IEEE, 2013.
