consistency in distributed systems - uzh · 2014-10-10 · consistency in distributed systems...

Consistency inDistributed Systems

Sebastian GolaszewskiBasel, Switzerland

Student ID: 09-911-983

Supervisor: Dr. Thomas Bocek, Andri LareidaDate of Submission: October 14, 2014

University of ZurichDepartment of Informatics (IFI)Binzmühlestrasse 14, CH-8050 Zürich, Switzerland ifi

MA

ST

ER

TH

ES

IS–

Com

mun

icat

ion

Sys

tem

sG

roup

,Pro

f.D

r.B

urkh

ard

Stil

ler

Master ThesisCommunication Systems Group (CSG)Department of Informatics (IFI)University of ZurichBinzmühlestrasse 14, CH-8050 Zürich, SwitzerlandURL: http://www.csg.uzh.ch/

Zusammenfassung

Verteilte Hash Tabellen (engl. Distributed Hash Tables, DHT) sind eine wichtige Da-tenstruktur fur das Errichten von strukturierten P2P Netzwerken. DHTs sind robust,skalierbar und waren bereits oft Bestandteil wissenschaftlicher Publikationen. Sie bietenin der Regel die zwei fundamentalen Operationen put(key, value) und value = get(key)an, um Daten im Netzwerk speichern und laden zu konnen. Allerdings zeigen traditionelleDHT Implementationen schwache Konsistenzeigenschaften fur sich verandernde Daten,wie dies der Fall bei Updates ist. Ein Update uberschreibt in der Regel eine altere Ver-sion. Durch das Erweitern der traditionellen DHT Operationen get und put, stellt dieseMasterarbeit ein versionisiertes DHT (vDHT) mit starkeren Konsistenzeigenschaften fursich verandernde Daten vor. Der Ansatz erlaubt den Entwurf zweier Strategien, die es er-moglichen, Updates in einem DHT abzuspeichern. Die Strategien wurden mit Hilfe einerEvaluation analysiert, die auf Simulationen einer dynamischen und verteilten Umgebungbasierte und die durch den vDHT Ansatz erreichte Konsistenz mit traditionellen DHTsverglich. Die Evaluation konnte aufzeigen, dass trotz eines sich sehr stark veranderndesNetzwerkes, eine der Strategie letztendliche (engl. eventual) und die andere strikte Kon-sistenz beibehielt.

i

Abstract

An important mechanism to build structured P2P systems are Distributed Hash Tables(DHT). DHTs are robust, scalable and have been researched for many years. DHTs usuallyoffer the two fundamental operations put(key, value) and value = get(key), for retrievingand storing data. However, traditional DHT implementations have weak consistencycapabilities for mutable data. An update usually overwrites old versions and leaving andjoining nodes may cause updates to disappear. By slightly extending the traditional DHToperations get and put, this master thesis proposes a versioned DHT (vDHT) havingstronger consistency properties for mutable data. Based on the proposed extensions andmodifications, two putting strategies are described in order to update data in a DHT.The proposed putting strategies are analyzed in an evaluation by multiple simulations ofa dynamical distributed environment, helping to compare the consistency of the vDHTapproach to traditional DHTs. The evaluation showed that under worst churn conditions,one of the two proposed putting strategy achieves eventual consistency, the other oneremains strictly consistent.

iii

Acknowledgments

I would like to thank several people for their support in the realization of this masterthesis. First of all, I express my deepest gratitude to my supervisor, Dr. Thomas Bocek,for his competent assistance, patience, enthusiasm and ever cooperative interaction. Ialso like to thank my assistant supervisor, Andri Lareida, for his feedback and advice.Furthermore, I thank Guilherme Sperb Machado for his help and assistance setting upthe testing environment cluster provided by the Communication Systems Group. Lastbut not least, I thank my significant other and family for their ongoing encouragementthroughout this work.

v

Contents

Zusammenfassung i

Abstract iii

Acknowledgments v

1 Introduction 1

1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

1.2 Description of Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

1.3 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

2 Related Work 5

3 Design 9

3.1 Extensions and Modifications . . . . . . . . . . . . . . . . . . . . . . . . . 9

3.2 The vDHT API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

3.2.1 Put Modified Data . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

3.2.2 Get Latest Version . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

3.2.3 Remove Version . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

3.3 Replication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

3.4 Put Strategies and Programming Patterns . . . . . . . . . . . . . . . . . . 13

3.4.1 Traditional Put Strategy . . . . . . . . . . . . . . . . . . . . . . . . 14

3.4.2 Simple Put Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . 15

3.4.3 Optimistic Put Strategy . . . . . . . . . . . . . . . . . . . . . . . . 17

3.4.4 Pessimistic Put Strategy . . . . . . . . . . . . . . . . . . . . . . . . 20

vii

viii CONTENTS

4 Evaluation 25

4.1 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

4.1.1 Replication Settings . . . . . . . . . . . . . . . . . . . . . . . . . . 26

4.1.2 Churn Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

4.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

4.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

5 Summary and Conclusions 37

6 Future Work 39

Bibliography 40

List of Figures 43

List of Tables 45

List of Listings 47

A Installation Guidelines 51

B Data Sheet of the Simulations 55

C Contents of the CD 59

Chapter 1

Introduction

P2P systems are still enjoying an increasing popularity and are responsible for a largeportion of Internet traffic [2, 3]. A significant percentage of this P2P traffic is relatedto file sharing systems, such as BitTorrent [1]. But also other and new types of P2Papplications, like VoIP or media streaming are emerging. This trend gets intensified bythe increasing number of devices connected to the Internet, especially mobile devices [18].

A well-researched and important mechanism to build structured P2P systems are dis-tributed hash tables (DHTs). DHTs scale with increasing number of nodes, while pro-viding redundant storage for key value pairs. DHTs are robust and can handle churn,i.e., nodes join or leave the network. DHTs usually offer the two fundamental operationsput(key, value) and value = get(key), allowing client nodes a strong abstraction to theunderlying mechanism in order to load from, respectively store into the network. Thesimplicity of these operations significantly contributed to the success of DHTs, allowingto build powerful distributed applications.

However, DHTs are not well suitable for mutating data. An update usually overwritesold versions. Since DHTs store data redundantly due to churn on a replica set, causingdifferent requested replica sets or subsets, not all replica nodes may see the update. Tradi-tional DHTs show weak consistency semantics when mutating data. Therefore, this thesispresents versioned DHT (vDHT), an extension for DHTs in order to achieve strongerconsistency requirements for mutating data. The approach extends the two fundamen-tal operations put(key, value) and get(key) by introducing a version key. Furthermoresome minor changes and extensions in the implementation of a DHT have been intro-duced. vDHT integrates versioning fully into the DHT for general purpose data, whilestill maintaining the simplicity of those two basic operations put and get.

The vDHT approach gets tested and analyzed in a distributed environment with differentchurn scenarios in mind. The evaluation compares consistency among updates achievedby traditional putting strategies with the vDHT approach. The updates are concurrentlyperformed by various amounts of putting client nodes. This work distinguishes betweenweak and strong consistency to analyze and describe the design and programming pat-terns for vDHT. While a system with weak consistency does not guarantee that updateswill be visible to subsequent requests, systems with strong consistency will return the

1

2 CHAPTER 1. INTRODUCTION

updated value for any subsequent requests. An overview of further consistency definitionsis provided in [28].

1.1 Motivation

High performance distributed systems recently are tending to eventual consistency (BASE),allowing distributed system to be temporarily in an inconsistent state, which may be re-solved eventually. On the other hand, centralized database systems, such as PostgreSQLor DB/2, are following the ACID principle, which have much stricter consistency guar-antees and are guaranteeing transactions from one consistent state to another. Thereforebringing distributed systems and stricter consistency guarantees closer together is a de-sirable requirement in order to allow applications to rely on stronger data consistency ina distributed environment.

The origins of this work are located from the emerging advanced requirements of theopen source project Hive2Hive [9]. Hive2Hive is a library, written in Java, for secure,distributed, P2P-based file synchronization and sharing. The underlying framework isTomP2P [8], a P2P-based high performance distributed hash table library. In order toprovide synchronisation among multiple clients of a user as well as sharing among severalusers, several objects such as meta data or profiles are regularly updated and stored intothe DHT. Those modifications may happen simultaneously, which may lead to conflictingscenarios. In order to avoid or resolve the conflicting states, a first versioning approach hasbeen developed and integrated into the project. However, the initial versioning mechanismoffers version recovery, but still doesn’t avoid or handle conflicting states in a sufficientmanner. Consulting other approaches and implementations, as described in the followingchapter, showed that there was no suitable concurrency mechanism for DHTs, which fulfilsthe requirements of the Hive2Hive project completely. It became clear that a combinationof existing approaches extended with own ideas would lead to a tailored and adequatesolution for versioning in DHTs.

1.2 Description of Work

The key idea of this work is to find mechanisms, rules and its dependencies for strongerconsistency semantics in DHTs, which are moving away from a system with BASE buttending to a system with ACID requirements. This included the design of a consistentputting approach in DHTs as well as a developer-friendly and easy to use API. Therefore abrief analysis of existing approaches has been required in order to be able to combine andextend elements from them, such as versioning, consensus protocols, lock-free mechanismsand so on, as well as the experiences from the H2H project. The design of a consistencyapproach went along with the finding of constraints and the defining of assumptionswhich is crucial especially for distributed environments. So far, only a first undocumentedprototypical implementation existed in the mentioned Hive2Hive open source project. Thenewly designed vDHT approach needs to be implemented and documented in order toexplain the underlying ideas for its mechanisms as well as the chosen assumptions.

Furthermore, the approach needs to be evaluated in a well-defined testing environment.Considering the challenges of a distributed environment and the taken assumptions, the

1.3. OUTLINE 3

design and implementation of a testing framework fulfilling the desired requirements isneeded, in order to be able to run simulations. Well defined test setting scenarios and theirresults had to be analyzed and concluded in order to underline respectively invalidate theassumed effectiveness of the designed approach.

1.3 Outline

This work is structured as follows. Chapter 2 contains related work, comparing vDHT toother similar approaches, highlighting and discussing their differences. Chapter 3 describesthe design of the proposed solution and presents the idea and mechanisms of the vDHTapproach. An evaluation of the presented approach is given in chapter 4, followed by adiscussion of the results. Chapter 5 draws the conclusion of this work. Finally, chapter 6enlists the future work.

4 CHAPTER 1. INTRODUCTION

Chapter 2

Related Work

Achieving consistency among data in a distributed environment was and will remain achallenge for computer scientists and developers. The corresponding research field is stillan active research area [15, 24]. In order to provide stronger consistency semantics, severalapproaches like the 2-phase commit protocol [12, 17], the Paxos family [16] and Raft [21]have been proposed and are very common solutions reflected by numerous amounts ofexisting implementations [6].

However, the main focus of this work relies on consistency in DHTs. In general, DHTsshow very poor consistency capabilities for mutable data. Traditional DHT implemen-tations often provide a first-come-first-served handling for updates, where older versionsare simply overwritten by the newer ones. Concurrent modification scenarios, caused bysimultaneous puts of different client nodes, can often result in conflicts and updates mayget lost. The following listing presents related work compared to each other and thevDHT approach according five criteria dimensions:

(1) Specific / Generic ContentSpecific, respectively structured content, such as text, allows the appliance of merge anddiff mechanisms. In contrast, generic content, like binary data, is not suited for suchapproaches. The vDHT approach doesn’t require structured data and is also suitablefor generic data. Saccol et al. [11] utilizes XML-based content. The authors presentan approach, which uses diffs to store and aggregate versions on super peers. Thesesuper peers are again managed by manager-peers. Thus, the peers are organized in anhierarchically shaped DHT. Other mechanisms, such as proposed by Oster et al. [22],show a distributed merge without vector clocks to ensure convergence.In contrast the vDHT approach does not use any hierarchical structures and does notdistinguish between different roles, so that all peers have the same behaviour. Further,vDHT does not rely on any diff mechanisms. Indeed merge mechanisms in vDHT can butdo not have to be applied. This will be discussed in the following sections.

(2) On-top-of DHT / IntegratedApproaches providing consistency mechanisms can be located on different abstraction lay-ers of a DHT implementation. Jiang et al. [13] presented a versioning mechanisms whichis built on top of a DHT, relying on the traditional put and get operations, which are the

5

6 CHAPTER 2. RELATED WORK

common interface of DHTs. The presented approach supports versioning for collaborationpurposes, which relies on an operation-based merge approach, where operations get storedwhat has been done on the data. Merges are resolved by the clients manually.The vDHT approach already considers consistency concerns within the DHT implementa-tion and provides an extended get and put interface for the clients, replacing the commonget and put API of a DHT.

(3) Optimistic / PessimisticIn order to handle inconsistent states, different handling respectively prevention strategiescan be applied. Optimistic approaches try to detect inconsistent states, temporarily al-lowing inconsistencies. The given circumstances in a distributed environment often allowthe appliance of approaches basing on detection mechanisms, which suffice for the mostinconsistency scenarios, due to their rare and occasional appearance. On the other handpessimistic approaches try to avoid it. A given distributed environment that keeps theapproach-specific assumptions should never provoke inconsistency scenarios. The Paxosfamily and its implementations go through the process of consensus decision before ac-cepting an update. PaxonDHT [27] is a Paxos-based middleware service using the PastryDHT. They guarantee a very low probability of becoming inconsistent among replicas.The node with the closest id to the id of the service serves as leader, where replicas areleaf nodes of the leader node. Other approaches, such as Yu et al. [29] and Mesaros etal.[19], use a 2-phase protocol for blocking writes.The vDHT approach allows both mechanisms. An optimistic approach is presented, al-lowing conflicts, which are detected and resolved, as well as a pessimistic approach tryingto prevent inconsistent states.

(4) Version History / LatestConsistency mechanisms, which are using or relying on a version history, are storingpointers, respectively references to its predecessor versions. Following these pointers theversion history can be accessed and used to restore older versions if needed. Howeversome mechanisms, such as presented by Knezevic et al. [14], store only the latest version.Such approaches do not keep a history and simply overwrite the old versions. A propertyof these mechanisms is, that overwrites never happen if versions are conflicting. Knezeviset al.’s approach describes beside overwriting older versions, a replication mechanismachieving high data availability and incremental version numbers for each version. Thework of Plavec et al. [25] proposes a similar mechanism.vDHT is keeping a version history, which is used, beside optional version restoring, forconflict resolution as described later. Each version has one or more references to itspredecessor respectively predecessors, allowing version forks and merges. Each versiongets assigned an incrementing number, which is a part of the unique id of each version.In vDHT a version is never overwritten respectively can not be lost by a concurrent putoperation.

(5) Timestamp-based / -freeMechanisms relying on timestamps need to ensure that all nodes are synchronized witheach other. If nodes have Internet connectivity, time synchronization is feasible by us-ing NTP (Network Time Protocol). E.g Googles globally distributed NewSQL databaseSpanner makes heavy use of hardware-assisted time synchronization using GPS clocks [10].However, for nodes with wrong timestamps, such mechanisms fail quickly or show un-

7

wanted behaviour. Akbarinia et al. [7] propose to use a combination of ids and timestampsin order to find the latest version.The vDHT forgoes timestamps and therefore does not rely on time synchronisation mech-anisms. As already suggested, each version gets assigned a unique id which consists outof an incrementing number and a hash value in order to find the correct version in theDHT.

Further work, such as Blobseer [20], focuses on storing large unstructured data objects,so called BLOBs, while also maintaining versions. They split the BLOB into fix-sizedpages that are scattered across the nodes to achieve versioning. In case of modification,current pages are not updated. Instead, newly generated pages are stored in the network.Metadata is organized in a segment-tree like structure which is also scattered across theDHT. Concurrent modifications are handled by a so-called version manager keeping trackof all snapshot versions. vDHT has no central instance in order to manage versioning andall versions are handled as whole objects.

Paganelli et al. [23] focus on discovery services for IoT. They use versions and vectorclocks for conflict resolution. This mechanism was implemented on top of a DHT anduses a Prefix Hash Tree (PHT) as an indexing mechanism. A detected version conflict oftwo conflicting versions is reconciled through a basic merge mechanism.

The closest work compared to vDHT is p2pstm [26], which is also a lock-free concurrencymechanism applied on a Pastry DHT. The approach prevents deadlocks by executing readsand writes with a defined partial order that is transparent to the developer. However,the main difference is that vDHT uses regular DHT routing and does not rely on HilbertSpace Filling Curves.

Table 2.1 shows the comparison of related work according the mentioned five dimensions.

Table 2.1: Related Work Comparision

vDHT

XML Ver-sions andReplicas[11]

DHTVersion/Col-labora-tion [13]

paxon-DHT[27]

HighlyAvailableDHTs [14]

DataCur-rency[7]

p2p-stm[26]

(1) Generic Content X × X X × X X(2) Integrated Approach X × × × × × X(3) Optimistic & Pessimistic X × × × × × ×(4) Version History X X × × × × X(5) Timestamp-free X × X X X × X

8 CHAPTER 2. RELATED WORK

Chapter 3

Design

In order to support mutable data with stronger consistency in DHTs, versioning in DHTsis introduced. The chapter is structured as follows. Section 3.1 presents and describesall extensions and modifications, which are required for the vDHT approach. Section 3.2introduces the vDHT API. Section 3.3 discusses the replication aspect, which is crucialfor data availability. Finally, 3.4 presents the two put strategies using the vDHT API andthe described extensions and modifications.

3.1 Extensions and Modifications

The following Listing briefly describes the small modifications and extensions, whichvDHT is requiring.

� Version Key Common DHT implementations use keys respectively locationKeys tomap values to distributed nodes. These location keys are used for routing in the keyspace to put and get data from the DHT. vDHT extends these keys by introducingversionKeys, which are distinguishing different versions of data object, stored underthe same location key. The version keys are not used for routing. A version keyconsists of an incrementing number as well as the hash of the corresponding dataobject. Having the version number encoded in the most significant bits makes iteasy to find the newest or oldest version. The hash makes the version key unique.Thus, in vDHT, data will never be overwritten with different values, because mu-tated data has always a new version key.

� Sorted Map A typical DHT implementation stores the key value pairs in an internalhash table. vDHT uses instead sorted maps, where each version can be sorted. Evenwhen two versions have the same version number, the hash value part of the versionkey clearly defines the order between these two versions. Versions with the sameversion number and the same hash value are equivalent. Due the incrementingversion number the last entry in the sorted map is also the latest version.

� Based On Key The values, which are stored in the DHT, are extended by areference to the version key of one or more predecessor versions. This basedOnKeys

9

10 CHAPTER 3. DESIGN

allow version forks and merges. Two different values with the same based on keyare forks, while values with more than one based-on key are merged.

� Prepare Flag Furthermore, a prepareFlag is added to those values. If this flag isset, a targeted replica node stores this value only temporarily and is not yet addedto the internal sorted map. A get request ignores all temporarily stored values.

� Time-to-live A time-to-live (ttl) value attached to the data object gives each ver-sion a lifespan. An expired version is automatically removed from the DHT.

3.2 The vDHT API

Listing 3.1 and 3.2 show the interface of traditional DHT implementations in order tostore and load data into respectively from the network. Furthermore Listing 3.3 showsthe interface used in traditional DHTs in order to remove data from the network.

Listing 3.1: DHT interface for putting data

put(locationKey, value)

Listing 3.2: DHT interface for getting data

value = get(locationKey)

Listing 3.3: DHT interface for removing data

remove(locationKey)

Versioning in DHTs requires an extension of those interfaces. The following subsectionsdescribe the interfaces, proposed by the vDHT approach, for putting, getting and removingversions.

3.2.1 Put Modified Data

Putting modified data into the vDHT requires a version key. The version key generationmay be handled internally. However, for better understanding the version key is part ofthe put interface, as showed in Listing 3.4.

Listing 3.4: vDHT interface for putting modified data

status = put(locationKey, versionKey, ←↩value[basedOnKey(s), prepareFlag, ttl])

status ∈ {ok, fork}

3.2. THE VDHT API 11

The value, which is stored, may have one or more references to its predecessors, given asbased-on keys. Furthermore, a prepareFlag and a ttl value, as described in Section 3.1, isalso a part of the data object. The method returns a status object, replying with ok, ifdata has been stored on all replica nodes, and with fork if at least one replica node hasdetected a version fork.In combination with an active prepareFlag, the method showed in 3.5 removes the pre-pareFlag and confirms the version.

Listing 3.5: vDHT interface for confirming prepared data

putConfirm(locationKey, versionKey, ttl)

In order to target the correct version, the corresponding location and version key has tobe given. A new ttl parameter refreshes the prior set ttl value.

3.2.2 Get Latest Version

vDHT supports two types of get calls. Listing 3.6 shows the get interface to load a specificversion, if the corresponding location and version key is given. This method can be usedfor version recovery.

Listing 3.6: vDHT interface for getting a specific version

value = get(locationKey, versionKey)

The second get method provided by vDHT, shown in Listing 3.7, requests all latest ver-sions for a given location key. Each requested replica node checks its internal sorted mapfor the latest version.

Listing 3.7: vDHT interface for getting latest version

SortedMap<value> = getLatest(locationKey)

In case a requested replica node has previously stored a version fork, the replica noderesponds with the forked versions. The getLatest method returns a sorted map, holding allreceived versions from the replica nodes. Listing 3.8 shows an algorithm, which recursivelyremoves all predecessors of each version using the based-on key. The algorithm finallyreturns all latest forks. Only versions without any successor(s) remain.

Further consistency considerations for get calls include the number of results. A DHTapplication developer can choose between returning data from all replicas or from onereplica node. If choosing only one, a replica node may not have the newest version,caused by network or replication delays. Choosing all results leads to more resourcesbeing used, but with higher chances of retrieving all latest versions.


Listing 3.8: Algorithm to find latest versions

List<value> = latestVersions(sortedMap) {result = new List()sortedMap = sortedMap.copy()while (!sortedMap.isEmpty()) {

latest = sortedMap.lastEntry()result.put(latest.value)deletePredecessors(sortedMap, latest.versionKey)

}return result

}

deletePredecessors(sortedMap, versionKey) {if (versionKey == null)

returnif (!sortedMap.contains(versionKey))

returnpredecessor = sortedMap.remove(versionKey)deletePredecessors(sortedMap, predecessor.basedOnKey)

}

3.2.3 Remove Version

Listing 3.9 shows the interface for removing a specific version.

Listing 3.9: vDHT interface for removing data

remove(locationKey, versionKey)

Given the corresponding location and version key the remove method deletes also uncon-firmed data. Thus, data objects set with a prepareFlag can be pulled back.

3.3 Replication

In order to increase the availability and scalability, replication is crucial. Data is storedamong a configurable amount of replica nodes. Thus, DHTs can handle churn and datacan not be lost as long as a subset or even a single replica node survives. Differentreplication strategies can be applied.

The so-called 0-root approach [25] stores data among the closest neighbours of a givenlocation key. The replica node closest to the location key takes responsibility for thereplication maintenance among the replica set. As soon as this node detects a leaveof a node belonging to the same replica set, data is replicated to its neighbours untilthe defined replication factor is reached. However, this approach suffers if the replicanode with replication responsibility leaves the network, especially when the leave was

3.4. PUT STRATEGIES AND PROGRAMMING PATTERNS 13

involuntary, caused e.g. by a crash. Nodes in a DHT occasionally perform some keep-alive respectively heartbeat messages, so that the leaving is detected and the next closestreplica node takes the replication responsibility.

The n-root approach [25] delegates the replication responsibility to all members of areplica set. Each replica node holds a copy and performs the replication maintenance,which includes regular checks if the desired amount of replica nodes is available. Then-root approach causes a larger message overhead, but guarantees higher availability andscalability.

3.4 Put Strategies and Programming Patterns

The modifications presented in Section 3.1 and 3.2 allow different put approaches tosupport versioning in DHT. Their mechanisms are explained in the following subsections.This thesis has been written under the following assumptions: all participating nodes ofthe DHT show a non byzantine behaviour. All used algorithms show the same resultsunder the same conditions and parameters. The following put approaches have beendesigned under the assumption of light churn, delimited from heavy churn, which isdefined as follows:

� Routing Having light churn and using the same location key, the routing of theDHT directs all requesting clients to at least one and the same replica node. Leavingor joining nodes may affect the replication responsibility of some replica nodes. E.g.a joining node, which is closer to the replication range of a given location key,may replace another replica node, which discards its replication responsibility. Theannouncement of the join may be delayed, such that a requesting client gets routedto a different node. Having heavy churn may route requesting clients to differentreplica sets, without a common replica node.

� Network Partitioning Network interrupts, e.g. caused by physical breaks, maysplit the DHT network in two or more separate networks. For better understandingpurposes light churn excludes and heavy churn includes the possibility of networkpartitioning.

� Replication In an environment with heavy churn the replication may be unableto complete to duplicate all data to the newly joined nodes. Assuming light churnconditions that the replication will always replicate the data to at least one replicanode, which is seen by all requesting clients.

The Figures 3.1a and 3.1b illustrates the routing to different replica subsets under lightand heavy churn conditions. These constraints are necessary for the design and analysis ofthe presented putting mechanisms because all participating nodes rely on the mechanismsexecuted on all replica and putting client nodes.


Figure 3.1: Replica subsets under light and heavy churn conditions

(a) Under light churn, client CA and CB arerouted to two replica subsets A and B, whichhave a common replica node.

(b) Under heavy churn, client CA and CB arerouted to two different replica subsets.

Listing 3.10: Programming pattern - traditional put strategy

getAndUpdate(locationKey, updateFunction) {versionN = get(locationKey)versionN+1 = updateFunction(versionN)return versionN+1

}

store(locationKey, updateFunction) {versionN+1 = getAndUpdate(locationKey, updateFunction)status = put(locationKey, versionN+1)

}

3.4.1 Traditional Put Strategy

Listing 3.10 presents a simple programming pattern using the API of traditional DHTs,in order to update data, which is stored under a given location key. A client gets thedata from the network. Usually the DHT routes to a subset of the replica nodes, whichrespond with their stored values. A quorum based evaluation chooses the value versionN , which is returned to the requester. If no quorum can be found the get request fails.The returned value is updated by a provided update function. Finally, the client puts themodified value with version number N + 1 into the DHT.

This approach works well if no simultaneous puts occur. On the other hand, severalputting clients may lead to overwrites, because a requesting client may not find a quorum.Due to the occurring overwrites, the traditional put strategy is not consistent in case ofhaving multiple updating clients.


Listing 3.11: Programming pattern - simple put strategy

getAndUpdate(locationKey, updateFunction) {versions = getLatest(locationKey)versionN = versions.lastEntry()versionN+1 = updateFunction(versionN)return versionN+1

}

store(locationKey, updateFunction) {versionN+1 = getAndUpdate(locationKey, updateFunction)put(locationKey, N+1, versionN+1[N, ttl, false])

}

Figure 3.2: Ideal putting scenarios

(a) Client CA repeatedly updates theversions on all replica nodes R1..5.

(b) Clients CA and CB alternately updatethe versions on all replica nodes R1..5.

3.4.2 Simple Put Strategy

A further straightforward putting strategy using version keys is presented in Listing 3.11.A client node calling store loads the latest version from the network. The getLatest callcontacts at least one replica node for its latest version. From the returned sorted setof the latest versions, the given updateFunction considers only the version N with thehighest version key. This one is updated by the updateFunction, generates a new versionkey N+1 and sets N as the based on key. The put method stores the updated version.The prepareFlag does not need to be set. Optionally a ttl value can be given, so that theDHT can remove outdated or unused respectively not refreshed data, allowing the DHTto reduce its workload.

This putting approach works well having one putting client only, as illustrated in Figure3.2a. A single client node CA gets the initial version 0 from the replica nodes R1, R2, R3,R4 and R5. The client node CA updates the fetched version to version 1a and stores iton all replica nodes. The procedure is repeated. The figure shows the versions as history


Figure 3.3: Version fork scenarios

(a) Client CA detects a version fork. (b) Clients CA and CB detect a versionfork.

from top to bottom on all five replica node, starting by the initial version 0 and endingat version 5a. The same procedure is applied for the scenario with two different puttingclient nodes CA and CB as long as they do not conflict with each other, as illustratedin Figure 3.2b. Both clients alternately get the latest version, update it and put theupdated version on all replica nodes. The unique version key of the versions guaranteesno overwrites.

The simple putting approach is limited in case of version forks, which are showed in theFigures 3.3. Two client nodes simultaneously get the version 1a, update it to version 2arespectively 2b and try to put their versions on all five replica nodes at the same time.Figure 3.3a shows a scenario where the client node CA stored its version 2a on all replicanode before the client CB could do so. As soon as the client CB puts its version 2b, thecontacted replica nodes detect a version fork. According the put method presented inListing 3.4, client CB gets notified with a fork status. Client CA does not know that aversion fork occurred.

Figure 3.3b shows a version fork scenario where a simultaneously executed update andput, basing on the same predecessor version, ends in notifying both participating clientnodes CA and CB with a fork status. Uploading data to replica nodes can never be kept insync in a distributed environment. E.g., the upload of replica nodes with more resourcesor higher bandwidth finish faster. The example of Figure 3.3b shows that client CA wasfaster putting its version 2a on the replica nodes R1 and R2 and client CB was fasterputting its version 2b on the replica nodes R3, R4 and R5. As soon as the uploads on theremaining replica nodes finish, both clients are informed about the fork state.

The simple put strategy ignores these version fork scenarios. A requested replica nodereturns all forked versions. The strategy considers only the version with the highest versionkey for its updates. The other forked version will be not considered and the update islost. Data updated with the simple put strategy cannot be consistent.

A further scenario are version delays, as illustrated in Figure 3.4. Due to replication ornetwork delays, the version 2a is not stored on all replica nodes. Simultaneously anotherclient is requesting the latest version from the replica nodes and may receive both versions1a and 2a. The simple put strategy considers always the latest received version, so thatversion delays will not bother data consistency, as long as the replicas get the latest versioneventually. But considering churn, it may lead to data inconsistency, when e.g, the subsetof replica nodes, which is holding the latest version, leaves the network. The other replica


Figure 3.4: Version delay scenario - The replicas R4 and R5 are too slow. Client CB

detects a version delay.

nodes are delayed and can’t response with the latest version. As long as the replica nodes,which went offline, do not rejoin the network, the updates of the latest version are lost.

3.4.3 Optimistic Put Strategy

The optimistic put strategy considers version forks and delays, while achieving strongerconsistency semantics. As already the name suggests, the put strategy optimistically putsupdated versions into the network and tries to handle conflicting states after the putting.The putting approach is presented in Listing 3.12. The loaded versions are checked forversion delays according their based-on keys. A version is delayed if another loaded versionis basing on it. For optimization purposes, the version keys and their corresponding based-on keys of all loaded and stored versions can be cached, in order to have more reliableversion delay detection. If a version delay has been detected, getLatest is repeated, untilthe latest versions have been replicated on all replica nodes. Having no version delays,the loaded versions are checked for version forks. A version fork occurs when the replicanodes responded with different latest versions. In case of a version fork, getting the latestversion is repeated until the replica nodes do not respond with version forks or a timeoutexpires. Having no version forks, the latest received version can be used for an update andis returned to put it into the DHT. In case of timeout expiration, the forked versions haveto be merged. The updated, respectively merged version is loaded into the DHT usingthe put interface presented in Listing 3.4: the response of the put is analyzed for versionforks. If at least one replica node locally detects and responds a version fork state, therecently put version has to be rejected. For this reason the merging fallback is required.A putting client who detected a version fork after putting its updated version, may nothave removed its conflicting version due to a crash or connection loss. In order to not losethe updates of the failed client the latest versions have to be merged. In case of a versionfork after a put getLatest is repeated, until no version delay, no version fork after gettingand no version fork after putting is detected.

The presented optimistic put strategy considers beside version delays several version forkscenarios as showed in the Figure 3.5a and 3.5b. Figure 3.5a presents a version forkscenario after a simultaneous put. The clients CA and CB put their updated versions2a and 2b, which both base on version 1a. Client CA was faster and put its version 2a


Listing 3.12: Programming pattern - optimistic put strategy

getAndUpdate(locationKey, updateFunction) {while(true) {

versions = getLatest(locationKey)if (versions.haveDelay) {

wait()continue

} else if (versions.haveFork && timeout) {versionN+1 = updateMerge(updateFunction, versions.forks)return versionN+1

} else if (versions.haveFork) {wait()continue

} else {versionN = versions.lastEntryversionN+1 = updateFunction(versionN)return versionN+1

}}

}

store(locationKey, updateFunction) {while(true) {

versionN+1 = getAndUpdate(locationKey, updateFunction)status = put(locationKey, N+1, versionN+1[N, ttl, false])if(status == fork) {

// reject versionremove(locationKey, N+1)wait()

} else {return

}}

}

first on all replica nodes R1−5. After putting its version 2b, client CB gets notified by allreplica nodes about a version fork state. Therefore, client CB rejects its version 2b, getsthe latest version 2a, updates it to version 3b and puts it into the network. Figure 3.5bshows a version fork scenario, which is first detected by client CA after getting the latestversions. In contrast to a detected version fork after a put, a version fork after a get leadsto waiting, until the other putting client resolves the conflicting state. In the example,client CA gets a version fork, because client CB is too slow to reject its conflicting version2b.


Figure 3.5: Version fork scenarios resolved with the optimistic put strategy

(a) Client CB detects a version fork afterputting its version 2b. Client CB rejects itsversion and loads the latest version again.

(b) Client CB detects a version fork but CB

has the version 2b not yet rejected. ClientCA detects a fork after a get request andwaits until CB rejects.

(c) Clients CA and CB simultaneously puttheir versions, detect a version fork and re-ject their versions.

(d) Client CB is too slow rejecting its ver-sion 2b. Meanwhile client CA puts a newversion 3a. Both clients get a version fork.

Figure 3.5c illustrates a situation where both putting clients detect a version fork responsestate after putting. E.g. client CA put its version 2a first on the replica nodes R1, R2

and R3 and client CB put its version 2b first on the replica nodes R4 and R5. Now, ifboth clients upload their versions on the remaining replica nodes, both detect the versionfork state. The clients reject their versions, reload the latest version and try to put theirupdated versions into the network. This may lead to race conditions, which are resolvedby an exponential back off waiting mechanism. Both clients wait a random amount oftime within a given time range in order to get the latest version, update and put it. Thetime range increases when a version fork occurs repeatedly. This leads to a shifted waiting,until a client puts its updated version without a version fork state.

However, Figure 3.5d shows a limitation of the optimistic put approach. Both clientsCA and CB simultaneously put their versions 2a and 2b and both detect the version forkstate. Client CA quickly rejects its version 2a. Client CB needs slightly longer, causede.g. by network delays or poor computing resources. Meanwhile client CA gets version


2b, which has not yet been removed by the slow client CB. The replica nodes replythe get request of client CA with the version 2b, which is at this time the latest on thereplica nodes. Therefore client CA sees the loaded version 2b as the latest one and usesit to update it to version 3a and put it into the network. If this put is still faster thanthe rejection of the version 2b, the replica nodes do not reply with a version fork state.Meanwhile client CB finally removed its version 2b. Now if a client requests for the latestversion, the replica nodes will respond with the versions 1a and 3a. This response is aversion fork for the requesting client, because the predecessor version 2b of version 3a ismissing respectively has been removed. The problem is no client can resolve this versionfork state by removing a forked version. If the timeout of a requesting client expires, theclient will merge these versions. This merge is actually unnecessary because version 3aalready contains version 1a. Caching of all fetched and stored version could detect thisspecial case, which would avoid the merging. But version 3a contains also the update ofversion 2b from the client CB and at this time the client CB is assuming that the updatesof version 2b are not part of the update history. However, the described scenario occursonly if no replica node received the rejection request, which is a very rare but neverthelesspossible case. If at least one replica node has removed the version 2b, requesting clientswill receive different versions from the replica set, which they will interpret as versiondelays. The delay gets resolved as soon as all version copies of 2b are removed from thereplica nodes. This specific version fork scenario limits the consistency semantic of theoptimistic put strategy. Merging is difficult to apply, especially for non generic contentlike binary files. Different merging strategies, such as considering only the latest, can beapplied.

The optimistic put approach achieves eventual consistency semantics under the assump-tion to have light churn and that merges do not lose any data [28]. On the other hand,the optimistic put approach is not consistent under heavy churn conditions, especially ifthe last replica node, which is holding the latest version, also leaves the network.

3.4.4 Pessimistic Put Strategy

In contrast to the optimistic put strategy, the pessimistic one does not allow the possibilityto receive a version fork after a get. The pessimistic putting approach is presented inListing 3.13.

Similar to the optimistic put strategy updated versions are put into the DHT withoutrequiring prior mechanisms like locking. However, as the name already suggests, theapproach is pessimistic and tries to avoid the need of resolving conflicting states. Theoptimistic put strategy tries to handle them. For this purposes, all newly updated versionshave set the prepareFlag. As described in Section 3.1, a prepared version is stored on thereplica nodes, but will not be used for get requests. In order to detect version forksthe replica nodes do not distinguish between prepared and confirmed versions. That’swhy a put with a prepared version may lead to a reported version fork state. In thiscase the putting client needs to reject its prepared and conflicting version. Getting thelatest version works similar as in the optimistic put strategy but does not need timeouts.


Listing 3.13: Programming pattern - pessimistic put strategy

getAndUpdate(locationKey, updateFunction) {while(true) {

versions = getLatest(locationKey)if (versions.haveDelay) {

wait()continue

} else if (versions.haveFork) {versionN+1 = updateMerge(updateFunction, versions.forks)return versionN+1

} else {versionN = versions.lastEntryversionN+1 = updateFunction(versionN)return versionN+1

}}

}

store(locationKey, updateFunction) {while(true) {

versionN+1 = getAndUpdate(locationKey, updateFunction)status = put(locationKey, N+1, versionN+1[N, ttl, true])if(status == fork) {

// reject versionremove(locationKey, N+1)wait()

} else {putConfirm(locationKey, N+1)return

}}

}

Receiving version delays also results in repetitive getting, until the delay has been resolvedby the replication of the DHT. If a client received different latest versions, the client hasto merge them without waiting for a resolution.

Figure 3.6 shows two clients CA and CB successively put an updated version accordingthe pessimistic put strategy. Client CA gets the latest version 0 from the replica nodesR1−5, updates it to version 1a and puts it with a prepareFlag on all replica nodes. Sinceno version fork states has been responded, client CA confirms its version 1a on all replicanodes. Client CB repeats the procedure with its own update 2b basing on version 1a.

The Figures in 3.7 show version fork scenarios handled by the pessimistic put strategy.Figure 3.7a shows two clients CA and CB putting their version 1a and 1b, which are basingon the same version 0. Client CA put its prepared version 1a on all replica nodes R1−5


Figure 3.6: Pessimistic Putting - The clients CA and CB successively put a preparedversion and confirm it.

Figure 3.7: Version fork scenarios resolved with the pessimistic put strategy

(a) The client CB rejects its prepared ver-sion 1b after causing a version fork. ClientCA’s rejection does not affect client CA’sconfirmation.

(b) The clients CA and CB simultaneously puttheir prepared versions, detect a version fork andreject their versions.

prior to client CB. Client CB also puts its prepared version 1b on the replica nodes anddetects a version fork, which leads CB to reject its prepared version. The example showsthat CB’s rejection of its conflicting version 1b has already been executed on the replicanodes R3, R4 and R5, but not yet on the replica nodes R1 and R2. Client CA did notreceive any version fork from any replica node, so that the client confirms its version 1aon all replica nodes. The example shows that for the client CA it does not matter whetherthe conflicting version 1b of client CB has been rejected at the time of confirmation ornot.

In contrast to the optimistic putting, the pessimistic put strategy handles subsequent getrequests from other clients better, although forked and non-rejected versions remain onthe replica nodes. For a requesting client it does not make a difference if a replica node hasa version fork, since the replica nodes respond only with confirmed versions. Also havingthe case that a client, which caused a version fork, failed respectively did not reject itsconflicting prepared version, a short time to live value attached to each prepared versionresolves the version fork as soon as it expires.

Figure 3.7b shows a version fork scenario where both putting clients simultaneously re-ceived a version fork state and rejected its prepared versions. This may lead to raceconditions which are again resolved by exponential backoff waiting. As soon as a client


does not receive a version fork state, the client can confirm its prepared version.

The pessimistic put approach merges versions only if no light churn conditions are given.Heavy churn or network partitioning may lead to different replica sets. Once these sets seeeach other, the different version histories have to be merged. Assuming that these replicasubsets see each other eventually and have a lossless merge function, the pessimistic putstrategy is eventual consistent. The pessimistic put approach is not consistent, if heavychurn is given, which sets all replica nodes offline, before the latest version has beenreplicated. If the distributed environment fulfils light churn conditions, as defined at thebeginning of this section, the pessimistic put strategy achieves strict consistency. Aftera client has confirmed its updated version, any subsequent access by other or the sameclient will return the updated version [28].

Chapter 4

Evaluation

In order to evaluate consistency semantics by applying the vDHT approach, several sim-ulations under different settings have been performed. Thereby, traditional and vDHTputting strategies are compared to each other.

The chapter is structured as follows. Section 4.1 describes the settings and methods,which have been used in order to evaluate the vDHT approach. Section 4.2 presents theachieved results, which are discussed in Section 4.3.

4.1 Method

The vDHT approach has been implemented and integrated into TomP2P [8], a highperformance DHT framework. Further, a testing framework for simulations has been im-plemented, which locally setups a network and simulates different churn settings. In orderto cause conflict scenarios, multiple independent putting clients try to perform updatesto the same location key. The simulations test different settings for 2 to 9 putting clients.Every simulation contains 1000 update writes distributed above the putting clients. Aputting client tries to perform an update every 0.6 to 1.2 seconds. In order to measureand compare consistency, each putting client appends the latest version with its id. Assoon as the putting client nodes performed the updates, the test framework loads thelatest version and counts the appended ids. A put strategy achieves strict consistency ifall performed updates are available in the latest version. All put and update strategiespresented in Section 3.4 are tested in order to compare them with each other.

In summary, the evaluation contains simulations considering three following factors, whichare mainly influencing the consistency above updates.

� Put Strategies The tested put strategies are traditional, simple, optimistic andpessimistic, as described in Section 3.4.

� Number of Putting Clients The more simultaneously putting clients the more itis probable to have version forks and delays. 2 to 9 simultaneously putting clientsare tested.

25

26 CHAPTER 4. EVALUATION

Figure 4.1: A possible churn curve

� Churn Settings Details are following in Section 4.1.2.

Each setting has been tested 25 times, in order to achieve high expressiveness of themeasured data. In total, this evaluation contains 1’600 simulation runs. The simulationshave been executed on five Dell PowerEdge R815 rack servers, each having 2x12 AMDOpteron 6180 2.5 GHz cores, 64 Gb RAM and 500Gb (7.2k rpm) + 146Gb (15k rpm)sized disks. The statistical analysis has been performed with R 3.1.1 [5] and RStudiov0.98 [4].

4.1.1 Replication Settings

The simulations implement the n-root replication approach [25]. The replication factor(the desired number of replica nodes for a corresponding location key) is 6, which isalso the default setting of TomP2P. The replica nodes are performing replication everysecond. Therefore the replication is faster than any churn event, as described in thefollowing Section.

4.1.2 Churn Settings

Simulating churn is a large research area and several solutions have been proposed. Nodesare leaving and joining the network. Nevertheless, churn has many parameters like when,how much, how long and which nodes are leaving and joining the network. Therefore,a more detailed definition is required. A simple churn behaviour is presented in Figure4.1. In a defined range a varying amount of randomly selected nodes is randomly andalternately leaving or joining the network. The churn event frequency is also varying in a

4.2. RESULTS 27

defined time range. Furthermore, given network size limits guarantee that not too manynodes leave or join the network. However this churn pattern does not necessarily reflectreal life circumstances. More ambitious churn patterns base on behaviour observations ofnetworks, such as BitTorrent [1], and map most accurately real life churn conditions.

However, churn affects consistency above updates only if the corresponding replica nodesleave the network. In this case, the DHT has to replicate the data to other replica nodes,until a defined amount of replica nodes is holding it. Furthermore, leaving and joiningnodes affect the routing, which is necessary to find replica nodes of given location keys.That’s why the most promising churn settings are close to heavy churn conditions, butstill are within the limits of light churn assumptions (as described in Section 3.4), in orderto show the accuracy of the presented vDHT approach.

The vDHT approach has been evaluated under two churn settings. As the baseline, thedifferent put strategies have been tested without churn, referred as no churn. No nodesare leaving or joining the network, so that constantly 100 nodes are online during thewhole simulation. Second, the testing framework simulated worst churn conditions,which affect the vDHT approach, but are within the described light churn assumptions.This includes the following settings. Leaving nodes which are not part of the replica setdo not affect the availability of data under the corresponding location key. The routingmay be affected, but DHTs are designed for changing routing paths. For this reason, thetesting framework always selects the nodes closest to the location key in order to removethem from the network. If all replica nodes leave the network at once, all versions wouldbe lost. Therefore, the worst churn removes N − 1 replica nodes at once, where N is thereplication factor. Accordingly, only N − 1 nodes join the network at once. The testingframework executes churn all 1.5 to 2 seconds. Leaving and joining churn events equallyoften alternate. The initial network size setup is also 100 nodes, with 5 nodes as the lowerand 200 nodes as the upper bound.

4.2 Results

The Figures 4.2 show the amount of present versions towards performed updates. Fig-ure 4.2a shows the counted present versions without churn and Figure 4.2b under worstchurn conditions. The graphics are using percentage in order to compare the amount ofperformed updates. E.g. in the simulation setting with 2 putting clients, each client per-formed 500 updates, which are in summary 1000 updates. On the other hand, 6 puttingclients each performed 166 updates, which are 996 updates. Each dot represents the meanvalue of 25 test runs. The vertical black lines show the standard deviation.

Without churn, the optimistic and pessimistic put strategies achieved 100% with 0%standard deviation above all number of putting clients. For the traditional and simpleputting strategies, only between 50% to 68% of the updates have been available in thelatest version. A light tendency is viewable, where more putting clients are losing moreupdates. The standard deviation is very high, due to a wide range of measured values.

Under worst churn, the pessimistic and optimistic put strategies have lost not a singleupdate for 2 to 5 putting clients. Occasional update losses for 6 to 9 putting clients are


Figure 4.2: Present version percentage of performed updates

(a) With no churn conditions

(b) With worst churn conditions

4.2. RESULTS 29

reflected in the very small deviations of 0.03% to 0.92%. With the simple put strategy,32% to 57% and with the traditional put strategy 1.1% to 3.9% of updates have beenavailable in the latest version. A light tendency is viewable again, having more updateloses with more putting clients. The deviation of the simple put strategy is also veryhigh, but decreases with an increasing number of putting clients. The deviation of thetraditional put strategy is relatively small and tends to zero with an increasing amountof putting clients.

The Figures 4.3 show the average time that it took the putting clients to perform allupdate writes. Figure 4.3a shows the elapsed time of the putting clients under no churnconditions. The traditional and simple putting strategies took about the same amountof time, with no more than 0.4 seconds difference. The optimistic putting needed 3 to 6seconds longer, followed by the pessimistic strategy, which needed 5 to 10 seconds longer.With an increasing amount of time, the mean values show an exponential decay. Theputting strategies have very small standard deviations, whereby standard deviations ofthe optimistic and pessimistic ones tend to be slightly higher than the deviations of thetraditional and simple putting strategies.

Figure 4.3b shows the needed amount of time of the putting clients under worst churnconditions. The simple and traditional putting strategies also show an exponential decaywith an increasing amount of putting clients as well as very small standard deviations.The optimistic and pessimistic putting strategies show a similar trend for 2 to 5 puttingclients. For 6 to 9 putting clients the mean values increase for both strategies, while thepessimistic has a stronger increasing tendency. The standard deviations are also increasingand show i.e. for 9 putting clients 739.8 and 1182.2 seconds for the optimistic respectivelypessimistic putting strategies.

For the optimistic and the pessimistic putting strategies, no merges occurred under nochurn conditions as well as under worst churn conditions and having 2 to 5 putting clients.Figure 4.4 shows the average amount of merges with worst churn conditions.

Figure 4.5 shows the average number of occurred version delays for optimistic and pes-simistic putting strategies having no churn. In average, a delay occurred no more thanonce in a simulation run, which is also reflected in the small standard deviations of thesettings.

In contrast, having worst churn conditions, as presented in figure 4.6, version delays havebeen counted much more frequently, whereby the occurrence increases with the amountof putting clients. The standard deviation is relatively high, especially for higher amountof putting clients.

Figure 4.7 shows the average number of version forks after a put under no churn conditions.With an increasing amount of putting clients the occurrence of version forks after a putincreases for both putting strategies.

The same tendency can be observed in figure 4.8, which is showing the average number ofversion forks after a put under worst churn conditions. However, certainly more versionforks after a put have been counted, than under no churn conditions.


Figure 4.3: Elapsed time of simulation runs

(a) No Churn

(b) Worst Churn

4.2. RESULTS 31

Figure 4.4: Number of merges having worst churn

Figure 4.5: Number of version delays having no churn


Figure 4.6: Number of version delays having worst churn

Figure 4.7: Number of version forks after a put having no churn

4.2. RESULTS 33

Figure 4.8: Number of version forks after a put having worst churn

Figure 4.9: Number of version forks after a get having no churn


Figure 4.10: Number of version forks after a get having worst churn

Figure 4.9 presents the average amount of forks after a get having no churn. For thepessimistic put strategy no version forks after a get occurred, whereby the optimistic putstrategy occasionally had to deal with some version forks after a get, especially for higheramounts of putting clients.

The same applies for version forks after get under worst churn conditions, where thepessimistic put strategy had no version forks after a get. However, as figure 4.10 shows,with an increasing amount of putting clients the optimistic put strategy had to handleversion forks after a get considerably more often.

More detailed tables containing the mean values and standard deviations of all simulationsettings can be found in the Appendix B.

4.3 Discussion

The simulations showed that the traditional and simple put strategies are not consistentwith multiple concurrently putting clients. The simple putting strategy fetches the latestversion updates and puts it on a replica subset. Meanwhile another putting client maydo the same and have a higher version key than the recently put version, such that aversion will not be considered anymore, which leads to version losses. The more puttingclient nodes, the more probable is an overwrite. Because a put targets only a subset ofreplica nodes, worst churn conditions lead to additional version losses. The same, butmore distinct, can be observed for the traditional putting strategy, where every updateis an overwrite. The traditional putting strategy also stores its updates on a subset ofreplica nodes. The replica nodes try to replicate their data to each other. Under worst

4.3. DISCUSSION 35

churn, the putting clients may target another subset of replica nodes for every update,which leads to a replication competition under the replica nodes and to additional updatelosses, as showed in the results.

Without churn conditions the simulations with the optimistic and pessimistic puttingstrategies did not lose any version updates and therefore remained strictly consistent.The same applies under worst churn conditions with 2 to 5 putting clients. In contrastto the traditional and simple putting strategies, no version update has been lost for thesesettings. But the measurements also showed, that for 6 to 9 putting clients, versionupdates occasionally has been lost. This setting group had also an occasional occurrenceof merges. For the optimistic strategy itself, version forks after a get have been measured.While version forks after a get, the pessimistic strategy directly leaded to merges, theoptimistic strategy waited, until another client resolved the version fork by rejecting itsversion, which explains also the increasing trend with an increasing amount of puttingclients. The successful rejection of conflicting versions is also the reason why the optimisticstrategy had few version forks after a get but no merges. The presence of merges and thefact that some version updates have been lost is an indication that the assumptions, asdescribed in Section 3.4, have been violated. Concretely, the chosen testing environmentleaded to delayed replication, especially for higher amounts of putting clients. The workload caused by the replication under worst churn conditions and the multiple concurrentlyputting clients leaded to longer replication durations. This often resulted in abortions,caused by internal timeout limits given by the underlying TomP2P DHT framework.These replication abortions changed the assumed light churn into heavy churn conditions,explaining the occasional version losses.

Without churn, the elapsed time decreases with an increasing amount of putting clients.Each simulation setting had to put the same amount of updates distributed to a varyingnumber of putting clients. Therefore more putting clients needed less time to put allupdates. The elapsed time of the simple and traditional putting slightly differs fromthe optimistic and pessimistic strategies, whereby the pessimistic one needed slightlylonger than the optimistic strategy. Additionally, these differences are increasing withmore putting clients. In contrast to the traditional and simple putting strategies, theoptimistic and pessimistic ones have waiting mechanisms, e.g. in case of a version delay.The measurements showed that the more putting clients, the more version delays andversion forks are occurring. This leads to longer waiting times for the optimistic andpessimistic putting strategies, which also can be observed in the measured running times.In order to achieve consistency, the running performance suffers under the synchronisationmechanisms, mainly caused by the waiting times. These exponential backoff waitings arealso the reason for increasing running times of the 6 to 9 putting clients and worst churnsetting group. Due to the exponential backoff waiting mechanisms the waiting timesquickly achieve very long durations, which are also explaining the very strong variationof the standard deviations.

Significantly more version delays and forks has been measured under worst churn con-ditions than without churn. The routing in a stable network (without churn) does notchange. In contrast, worst churn causes a new routing path for each put, get, confirm,remove and replication requests respectively tasks. Compared to no churn, requests andtasks arrive delayed the replica nodes, while having worst churn conditions. Furthermore,


under worst churn the client and the replica nodes itself have a different and inconsistentview of the network. Replica and client nodes may not yet received notifications aboutleaving or joining nodes. Under such circumstances, some nodes may accept and/or takereplication responsibility, even when closer replica nodes exist at this time. Such unjus-tified replica nodes do not receive always the latest version and are replicating its partlyoutdated versions, frequently causing version delays and forks, as seen in the measure-ments.

Chapter 5

Summary and Conclusions

This thesis presents ongoing work of versioning in DHTs, overcoming the poor consistencycapabilities of traditional DHTs for mutating data. The vDHT approach proposes a smallset of API extensions and modifications in traditional DHTs to add stronger consistencyproperties for mutable data in DHTs.

The proposed approach supports generic as well as structured data. Furthermore vDHTcan be integrated directly into a DHT implementation by making minor changes and ex-tensions, without applying a new layer on top of the DHT. The versioning allows keepinga version history, which can be used for restoring older versions. Additionally all updatesare retraceable. Using vDHT, each update results in a new version, whereby an uniqueidentifier respectively version key is attached. In contrast to traditional DHTs, an updatedoes not causes overwrites. The vDHT approach requires no timestamps, avoiding timesynchronization issues. Furthermore the approach forgoes any central instances and re-mains fully decentralized. Participating nodes do not have to distinguish among differentroles, such that all nodes have the same behaviour.

Due to the dynamic behaviour of distributed environments, the design of the vDHTapproach requires a definition of distinct assumptions. Distributed environments, havinglight churn conditions, guarantee no network partitioning as well as a replication, whichreplicates the latest version to at least one and the same replica node. Under light churn,this replica node is seen, respectively can be found trough routing, by all participatingnodes. Distributed environments having heavy churn conditions can not guarantee thisconditions.

Version delays and version forks are two aspects mainly affecting consistency above versionupdates. Two different programming patterns have been proposed, which are handlingthese version forks and version delays. The optimistic putting strategy optimistically putsupdated version into the network and handles possible conflicting scenarios afterwards.The optimistic putting strategy assumes that in the most updating cases no conflictsoccur. Received version delays are handled by exponential backoff waiting until the repli-cation completes. Self-caused version forks after a put are rejected by the putting clients.Received version forks after a get are resolved also by exponential waiting cycles. In case aconflicting version will not be removed, timeouts are serving as fallback and lead to merges.

37

38 CHAPTER 5. SUMMARY AND CONCLUSIONS

Under the assumption of a loss-free merging function and light churn conditions, the opti-mistic putting strategy achieves eventual consistency. The optimistic putting strategy isnot consistent in case of heavy churn conditions. The pessimistic putting strategy avoidsconflicting version scenarios by putting prepared updated versions, which will be ignoredin get requests, until the putting clients confirms its updated version. Therefore, replicanodes will never respond with conflicting versions. Similar to the optimistic strategy,putting clients causing a version fork reject their versions. Furthermore, version delaysare also handled by exponential backoff waiting. Assuming light churn conditions thepessimistic putting strategy is strict consistent and gets along without merges. The pes-simistic putting strategy is eventual consistent, if a loss-free merging function is given andreplica set splits, caused by heavy churn or network partitions, eventually see each other.

Both presented putting strategies have in common that putting updated versions doesnot need preceding checks or locks. In contrast to other consistency approaches in dis-tributed environments, the vDHT approach is lock-free and non-blocking, therefore avoidsstarvations and deadlocks.

A broadly based evaluation containing simulations with different settings completes thiswork. The consistency among version updates applying the optimistic and putting strate-gies has been compared to the traditional updating strategy as well as a straightforwardsimple putting strategy. Various number of concurrently putting clients have been con-sidered. Settings without churn served as the baseline for all discussed putting strategies.Simulations applying worst churn conditions, which still have fulfilled light churn con-ditions, had the goal to bring the optimistic and pessimistic putting strategies to theirstress limits. While the traditional and simple putting strategies lost a significant amountof their updates, the optimistic and pessimistic ones stayed consistent, beside some occa-sional version loses due to replication issues.

Furthermore, time performance aspects have been discussed. The optimistic and pes-simistic putting strategies required slightly more time than the simple and traditionalones, which have been mainly caused by the exponential backoff waiting mechanisms.This overhead has to be considered in order to achieve stronger consistency for updatesin DHTs by using the vDHT approach.

In summary, the measured results underline the stronger consistency semantics providedby the optimistic and pessimistic putting strategies.

Chapter 6

Future Work

So far, each update causes the putting of a new complete version containing all previousupdates. The used storage increases with each update. Version truncating respectivelystoring deltas may reduce the storage use.

The vDHT approach has been designed under consistency considerations as the mainaspect. Fairness among client and replica nodes received less attention. The vDHT puttingrespectively updating strategies are using exponential backoffs for retry and waiting cases.However, further analysis and fine-tuning is required in order to prevent livelocks andwaiting.

Besides putting strategies and varying number of putting clients, the evaluation has con-sidered two different churn settings. They are oriented according the limits of the definedassumptions, but do not necessarily reflect real life conditions. A further point is thatall participating nodes does not necessarily show the same churn behaviour. E.g. serversrarely go offline. On the other hand, laptops and mobile devices are online for only aspecific time of a day. Future work has to consider such different behaviour patterns,especially for replication concerns.

All simulations have been tested on virtual networks located on few but powerful machines.The transaction time between the virtual nodes depended mainly on the underlying hard-ware and not on the throughput or the bandwidth of the links between the nodes. Inorder to achieve more accurate results, future evaluations are required, which are locatedon physically distributed nodes. This would allow the investigation of further factors likevarying network delays or hardware resources, such as CPU time. Additionally, a real-life distributed environment would have replication delays, which have been caused bynetwork delays and do not have their origins in high workloads of the local machines, asdiscussed in the discussion Section 4.3 of the evaluation.

Each simulation had fixed settings and configurations. Different kind of data and usersshow different update behaviour. The selected settings could be dynamic for optimizationpurposes. Automated and dynamic adaptations, which are considering varying variablessuch as update frequency, version size or churn behaviour, could be integrated. E.g lessfrequently updated data does not have to be replicated as many times as frequentlychanging data.

39

40 CHAPTER 6. FUTURE WORK

This work has its roots in the Hive2Hive open source project [9], whereby several objectof its internal model are frequently updated and stored respectively published in theunderlying DHT framwork. The vDHT approach and its achieved results and findingswill be integrated in the project. Adapting vDHT into the H2H library will make it morereliable and stable due to better consistency semantics.

Bibliography

[1] BitTorrent. http://www.bittorrent.com/. last visited: Sept., 2014.

[2] Cisco Visual Networking Index: Forecast and Methodology, 2011-2016. http://www.cisco.com/c/en/us/solutions/collateral/service-provider/ip-ngn-ip-next-generation-network/white_paper_c11-481360.html. last visited: June, 2014.

[3] Miniwatts Marketing Group - Internet Growth Statistics. http://www.internetworldstats.com/emarketing.htm. last visited: Mai, 2014.

[4] RStudio. http://www.rstudio.com/. last vistied: Sept., 2014.

[5] The R Project for Statistical Computing. http://www.r-project.org/. lastvistied: Sept., 2014.

[6] The Raft Consensus Algorithm. http://raftconsensus.github.io/. lastvistied: Sept., 2014.

[7] R. Akbarinia, E. Pacitti, and P. Valduriez. Data Currency in Replicated DHTs. InACM SIGMOD International Conference on Management of Data, pages 211–222,Beijing, China, June 2007.

[8] T. Bocek. TomP2P, a P2P-based high performance key-value pair storage library.http://tomp2p.net/. last visited: Sept., 2014.

[9] T. Bocek, S. Golaszewski, C. Luethold, N. Rutishauser, and M. Weber. Hive2Hive,an open-source library for secure, distributed, P2P-based file synchronization andsharing. http://hive2hive.com/. last visited: Sept., 2014.

[10] J. C. Corbett, J. Dean, M. Epstein, A. Fikes, C. Frost, J. Furman, S. Ghemawat,A. Gubarev, C. Heiser, P. Hochschild, W. Hsieh, S. Kanthak, E. Kogan, H. Li,A. Lloyd, S. Melnik, D. Mwaura, D. Nagle, S. Quinlan, R. Rao, L. Rolig, Y. Saito,M. Szymaniak, C. Taylor, R. Wang, and D. Woodford. Spanner: Google’s Globally-Distributed Database. In Proceedings of OSDI 2012 (Google), Sept. 2012.

[11] D. de Brum Saccol, N. Edelweiss, R. de Matos Galante, and C. Zaniolo. ManagingXML Versions and Replicas in a P2P Context. In The Nineteenth InternationalConference on Software Engineering and Knowledge Engineering (SEKE), pages 680–686, Boston, USA, Sept. 2007.

41

http://www.bittorrent.com/

http://www.cisco.com/c/en/us/solutions/collateral/service-provider/ip-ngn-ip-next-generation-network/white_paper_c11-481360.html



http://www.internetworldstats.com/emarketing.htm

http://www.internetworldstats.com/emarketing.htm

http://www.rstudio.com/

http://www.r-project.org/

http://raftconsensus.github.io/

http://tomp2p.net/

http://hive2hive.com/

42 BIBLIOGRAPHY

[12] M. J. Flynn, J. Gray, A. K. Jones, K. Lagally, H. Opderbeck, G. J. Popek, B. Randell,J. H. Saltzer, and H.-R. Wiehle. Notes on Data Base Operating Systems. LectureNotes in Computer Science. Springer, London, UK, UK, 1978.

[13] Y. Jiang, G. Xue, and J. You. A Version-enabled Peer-to-peer Content DistributionSystem based on DHT. In 10th International Conference on Computer SupportedCooperative Work in Design (CSCWD), pages 1–6, Shanghai, China, May 2006.

[14] P. Knezevic, A. Wombacher, and T. Risse. Highly Available DHTs: Keeping DataConsistency After Updates. In 4th International Conference on Agents and Peer-to-Peer Computing, (AP2PC), pages 70–80, Utrecht, The Netherlands, July 2005.

[15] L. Lamport. Time, Clocks, and the Ordering of Events in a Distributed System.Communications of the ACM, pages 558–565, July 1978.

[16] L. Lamport. The Part-time Parliament. ACM Transactions on Computer Systems,pages 133–169, May 1998.

[17] B. W. Lampson. Atomic Transactions. Lecture Notes in Computer Science. Springer,London, UK, UK, 1981.

[18] M. Meeker and L. Wu. Internet trends. http://allthingsd.com/tag/mary-meeker/. last visited: Mai, 2014.

[19] V. Mesaros, R. Collet, K. Glynn, and P. V. Roy. A Transactional System for Struc-tured Overlay Networks. Universite catholique de Louvain, 2005.

[20] B. Nicolae, G. Antoniu, and L. Bouge. BlobSeer: How to Enable Efficient Version-ing for Large Object Storage Under Heavy Access Concurrency. In 12th Interna-tional Joint Conference on Extending Database Technology (EDBT) / Conference onDatabase Theory (ICDT), pages 18–25, St.-Petersburg, Russia, Mar. 2009. ACM.

[21] D. Ongaro and J. Ousterhout. In Search of an Understandable Consensus Algorithm.Technical report, Stanford University, CA, USA, 2013.

[22] G. Oster, P. Urso, P. Molli, and A. Imine. Data Consistency for P2P CollaborativeEditing. In 20th Anniversary Conference on Computer Supported Cooperative Work(CSCW), pages 259–268, Banff, Alberta, Canada, Nov. 2006.

[23] F. Paganelli and D. Parlanti. A DHT-Based Discovery Service for the Internet ofThings. Journal of Computer Networks and Communications, 2012, Oct. 2012.

[24] M. Pease, R. Shostak, and L. Lamport. Reaching Agreement in the Presence ofFaults. Journal of the ACM, pages 228–234, Apr. 1980.

[25] F. Plavec and T. Czajkowski. Distributed File Replication System based on FreeP-astry DHT. In International Conference on Knowledge Engineering, Principles andTechniques (KEPT2009), pages 1–10, Cluj-Napoca, Romania, July 2009.

[26] P. Pratt-Szeliga and J. Fawcett. p2pstm: A Peer-to-Peer Software TransactionalMemory. Technical report, Syracuse University, NY, USA, 2010.

http://allthingsd.com/tag/mary-meeker/

http://allthingsd.com/tag/mary-meeker/

BIBLIOGRAPHY 43

[27] B. Temkow, A.-M. Bosneag, X. Li, and M. Brockmeyer. PaxonDHT: AchievingConsensus in Distributed Hash Tables. In International Symposium on Applicationson Internet (SAINT), pages 236–244, Arizona, USA, Jan. 2006.

[28] W. Vogels. Eventually consistent. ACM Queue, 6(6):14–19, Oct. 2008.

[29] H. Yu and A. Vahdat. Consistent and Automatic Replica Regeneration. ACM Trans-actions on Storage, pages 3–37, Feb. 2005.

44 BIBLIOGRAPHY

List of Figures

3.1 Replica subsets under light and heavy churn conditions . . . . . . . . . . . 14

3.2 Ideal putting scenarios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

3.3 Version fork scenarios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

3.4 Version delay scenario - The replicas R4 and R5 are too slow. Client CB

detects a version delay. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

3.5 Version fork scenarios resolved with the optimistic put strategy . . . . . . . 19

3.6 Pessimistic Putting - The clients CA and CB successively put a preparedversion and confirm it. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

3.7 Version fork scenarios resolved with the pessimistic put strategy . . . . . . 22

4.1 A possible churn curve . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

4.2 Present version percentage of performed updates . . . . . . . . . . . . . . . 28

4.3 Elapsed time of simulation runs . . . . . . . . . . . . . . . . . . . . . . . . 30

4.4 Number of merges having worst churn . . . . . . . . . . . . . . . . . . . . 31

4.5 Number of version delays having no churn . . . . . . . . . . . . . . . . . . 31

4.6 Number of version delays having worst churn . . . . . . . . . . . . . . . . . 32

4.7 Number of version forks after a put having no churn . . . . . . . . . . . . . 32

4.8 Number of version forks after a put having worst churn . . . . . . . . . . . 33

4.9 Number of version forks after a get having no churn . . . . . . . . . . . . . 33

4.10 Number of version forks after a get having worst churn . . . . . . . . . . . 34

45

46 LIST OF FIGURES

List of Tables

2.1 Related Work Comparision . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

B.1 Present version percentage of update writes without churn . . . . . . . . . 55

B.2 Present version percentage of update writes under worst churn . . . . . . . 56

B.3 Elapsed time in seconds without churn . . . . . . . . . . . . . . . . . . . . 56

B.4 Elapsed time in seconds under worst churn . . . . . . . . . . . . . . . . . . 57

B.5 Measurements of optimistic put strategy without churn . . . . . . . . . . . 57

B.6 Measurements of optimistic put strategy under worst churn . . . . . . . . . 57

B.7 Measurements of pessimistic put strategy without churn . . . . . . . . . . 58

B.8 Measurements of pessimistic put strategy under worst churn . . . . . . . . 58

47

48 LIST OF TABLES

List of Listings

3.1 DHT interface for putting data . . . . . . . . . . . . . . . . . . . . . . . . 103.2 DHT interface for getting data . . . . . . . . . . . . . . . . . . . . . . . . . 103.3 DHT interface for removing data . . . . . . . . . . . . . . . . . . . . . . . 103.4 vDHT interface for putting modified data . . . . . . . . . . . . . . . . . . . 103.5 vDHT interface for confirming prepared data . . . . . . . . . . . . . . . . . 113.6 vDHT interface for getting a specific version . . . . . . . . . . . . . . . . . 113.7 vDHT interface for getting latest version . . . . . . . . . . . . . . . . . . . 113.8 Algorithm to find latest versions . . . . . . . . . . . . . . . . . . . . . . . . 123.9 vDHT interface for removing data . . . . . . . . . . . . . . . . . . . . . . . 123.10 Programming pattern - traditional put strategy . . . . . . . . . . . . . . . 143.11 Programming pattern - simple put strategy . . . . . . . . . . . . . . . . . . 153.12 Programming pattern - optimistic put strategy . . . . . . . . . . . . . . . . 183.13 Programming pattern - pessimistic put strategy . . . . . . . . . . . . . . . 21A.1 Example for starting the simulator with a config file argument . . . . . . . 51A.2 Example for starting the simulator with a config file and bootstrap arguments 51

49

50 LIST OF LISTINGS

Appendix A

Installation Guidelines

The implemented vDHT put simulator has been published on https://github.com/ippes/vDHTNetworkSimulator and requires a Java SE runtime environment 7 orhigher. Further, the simulator requires a configuration file argument, as illustrated inListing A.1.

Listing A.1: Example for starting the simulator with a config file argument

java -jar vDHTPutSimulator.jar config.config

The simulator accepts, besides a configuration file, also an IP address and a port forbootstrapping to another node, as illustrated in Listing A.2

Listing A.2: Example for starting the simulator with a config file and bootstrap arguments

java -jar vDHTPutSimulator.jar config.config 192.168.1.21 5002

The simulator sets up a virtual network according to the given configuration file. Then,it applies churn and starts putting. Churn is applied only to the local network and hasno influence to bootstrapped nodes.

The simulator is configured according the given *.config configuration file, which is con-taining the following parameters.

� port The port for the initial node. All other nodes locally bootstrap to this node.

� runtimeInMilliseconds The desired runtime of the simulation in milliseconds. Use’-1’ for an unlimited execution.

� numPuts The desired amount of puts per putting client. Use ’-1’ for an unlim-ited amount of puts. Note: You have to provide either runtimeInMilliseconds ornumPuts.

51

https://github.com/ippes/vDHTNetworkSimulator

https://github.com/ippes/vDHTNetworkSimulator

52 APPENDIX A. INSTALLATION GUIDELINES

� replicationStrategyName Choose ’0Root’ or ’nRoot’ to enable replication. ’off’ ap-plies no replication.

� replicationFactor The size of the replica node set.

� replicationIntervalInMilliseconds Replication frequency in milliseconds.

� numPeersMin The lower bound for churn.

� numPeersMax The upper bound for churn. Note: The initial network size is themedian between numPeersMin and numPeersMax.

� maxVersions The maximal amount of versions allowed to be stored on a singlereplica node. The latest versions is always preferred over older versions. Use ’-1’ tohave no version limit.

� churnRateJoin The maximal amount of nodes allowed to join the network at once.

� churnRateLeave The maximal amount of nodes allowed to leave the network at once.

� churnStrategyName Following options are available:

– ’off’ No churn gets enabled.

– ’stepwise’ Nodes are leaving or joining the network according churnRateJoinand churnRateLeave.

– ’stepwiseRandom’ Similar to ’stepwise’, but the amount of leaving or joiningnodes at once is randomly selected. churnRateJoin and churnRateLeave arethe upper bounds.

– ’wild’ Ignores churnRateJoin and churnRateLeave. A randomly selected amountjoins or leaves the network within the numPeersMin and numPeersMax bounds.

– ’specific’ Similar to stepwise, but leaving nodes are always closest to the usedlocation key for putting.

� churnJoinLeaveRate The ratio between join and leave churn events. Choose ’0.0’for having only join churn events or ’1.0’ for having only leave churn events.

� churnRateMinDelayInMilliseconds Minimal delay between two churn events in mil-liseconds.

� churnRateMaxDelayInMilliseconds Maximal delay between two churn events in mil-liseconds.

� ttlCheckIntervalInMilliseconds Frequency in milliseconds to check storage for expireddata.

� putTTLInSeconds Time to live in seconds for confirmed and stored data.

� putPrepareTTLinSeconds Time to live for prepared data.

� putStrategyName Following options are available:

53

– ’traditional’ Traditional put strategy according section 3.4.1.

– ’traditionalVersion’ Simple put strategy according section 3.4.2.

– ’optimistic’ Optimistic put strategy according section 3.4.3.

– ’pessimistic’ Pessimistic put strategy according section 3.4.4.

� putConcurrencyFactor Number of putting clients. All putting clients are using thesame location key.

� putDelayMaxInMilliseconds The maximal delay of a putting client updating a ver-sion in milliseconds.

� putDelayMinInMilliseconds The minimal delay of a putting client updating a versionin milliseconds. Note: A putting client executes randomly between putDelayMinIn-Milliseconds and putDelayMaxInMilliseconds.

At the end of each simulation the vDHT simulator appends its results in a comma-separted values file, called outcome.csv. Each simulation appends a new line. If no suchfile exists, a file will be created at the end of a simulation run. A result entry containsa date and timestamp, followed by all above mentioned configurations and the followingmeasurements:

� presentVersions At the end of each simulation run the latest version is loaded. Allvisible writes of the updating clients are counted and summarized.

� versionWrites The sum over all putting clients of all actual executed updates.

� merges The sum over all putting clients of all performed merges. Note: If numPutsis given, versionWrites plus merges should be equal.

� delays The summed count of detected version delays above all putting clients.

� forksAfterGet The summed count of detected version forks after a get over all puttingclients.

� forksAfterPut The summed count of detected version forks after a put over allputting clients.

� consistencyBreaks The summed count of detected consistency breaks a put over allputting clients. A consistency break occurs when loading of the latest version fails(received null or an empty value).

� elapsedTime The measured runtime of the simulation run. Corresponds runtimeIn-Milliseconds, if it has been not set to ’-1’.

54 APPENDIX A. INSTALLATION GUIDELINES

Appendix B

Data Sheet of the Simulations

Table B.1: Present version percentage of update writes without churn

nr. putting clients

2 3 4 5

put strategy mean sd mean sd mean sd mean sd

optimistic 1.0000 0.0000 1.0000 0.0000 1.0000 0.0000 1.0000 0.0000pessimistic 1.0000 0.0000 1.0000 0.0000 1.0000 0.0000 1.0000 0.0000traditional 0.5740 0.1564 0.5576 0.1153 0.6756 0.1086 0.6031 0.1954simple 0.6110 0.1482 0.6347 0.1377 0.6384 0.1122 0.6008 0.0697

nr. putting clients

6 7 8 9



55

56 APPENDIX B. DATA SHEET OF THE SIMULATIONS

Table B.2: Present version percentage of update writes under worst churn

nr. putting clients

2 3 4 5



nr. putting clients

6 7 8 9



Table B.3: Elapsed time in seconds without churn

nr. putting clients

2 3 4 5



nr. putting clients

6 7 8 9



57

Table B.4: Elapsed time in seconds under worst churn

nr. putting clients

2 3 4 5



nr. putting clients

6 7 8 9



Table B.5: Measurements of optimistic put strategy without churn

nr.puttingclients

merges delays forks after get forks after put

mean sd mean sd mean sd mean sd

2 0.00 0.00 0.00 0.00 0.00 0.00 2.04 0.543 0.00 0.00 0.20 0.50 0.00 0.00 4.24 1.424 0.00 0.00 0.20 0.41 0.04 0.20 5.12 0.735 0.00 0.00 0.40 0.58 0.00 0.00 6.48 1.736 0.00 0.00 0.40 0.58 0.04 0.20 7.80 2.087 0.00 0.00 0.60 1.00 0.44 0.58 10.00 2.368 0.00 0.00 0.48 0.77 0.20 0.50 11.48 3.069 0.00 0.00 0.52 0.77 0.08 0.28 13.96 3.45

Table B.6: Measurements of optimistic put strategy under worst churn

nr.puttingclients



2 0.00 0.00 22.56 12.26 0.00 0.00 13.64 3.883 0.00 0.00 14.84 11.66 0.32 0.48 19.68 5.494 0.00 0.00 18.40 8.58 1.08 0.95 33.12 8.765 0.00 0.00 19.16 14.36 2.68 3.39 45.44 15.506 0.12 0.33 26.48 18.41 13.80 18.12 74.04 34.967 0.64 1.66 35.28 28.02 33.92 46.28 111.64 48.748 0.24 0.44 34.44 27.70 43.60 39.67 137.68 46.649 1.48 3.18 66.36 58.84 80.68 91.03 190.72 82.70

58 APPENDIX B. DATA SHEET OF THE SIMULATIONS

Table B.7: Measurements of pessimistic put strategy without churn

nr.puttingclients


mean sd mean sd mean sd mean

2 0.00 0.00 0.04 0.20 0.00 0.00 2.32 1.143 0.00 0.00 0.16 0.37 0.00 0.00 4.48 1.504 0.00 0.00 0.36 0.64 0.00 0.00 6.16 2.255 0.00 0.00 0.24 0.52 0.00 0.00 6.52 1.536 0.00 0.00 0.24 0.52 0.00 0.00 8.60 2.577 0.00 0.00 0.52 0.87 0.00 0.00 12.12 3.218 0.00 0.00 0.52 0.71 0.00 0.00 13.08 3.309 0.00 0.00 0.68 0.90 0.00 0.00 16.12 4.31

Table B.8: Measurements of pessimistic put strategy under worst churn

nr.puttingclients



2 0.00 0.00 16.00 14.31 0.00 0.00 25.24 14.053 0.00 0.00 14.72 9.74 0.00 0.00 32.56 13.594 0.00 0.00 13.88 6.35 0.00 0.00 45.32 23.865 0.00 0.00 4.76 2.55 0.00 0.00 33.36 7.446 0.12 0.44 16.80 10.32 0.00 0.00 77.88 23.877 0.12 0.33 20.20 12.22 0.00 0.00 121.48 37.078 0.44 0.87 20.88 13.88 0.00 0.00 134.68 61.979 0.56 1.12 20.84 13.66 0.00 0.00 146.32 60.71

Appendix C

Contents of the CD

� Thesis - Folder containing the master thesis in PDF and PS format and source filesincluding all figures.

� Abstract.txt - Abstract in english as plain-text file.

� Zusfsg.txt - Abstract in german as plain-text file.

� Code - Folder containing the vDTH put simulator as runnable JAR file, source filesand dependencies.

� Related Work - Folder containing related work in PDF and PS format.

� Configurations - Folder containing config files used for simulations.

� Results - Folder containing raw simulation results in CSV format.

� Analysis.R - Script written in R for statistical analysis.

59

consistency in distributed systems - uzh · 2014-10-10 · consistency in distributed systems...

Documents