NATIONAL TAIWAN NORMAL UNIVERSITY
COMPUTER SCIENCE AND INFORMATION ENGINEERING
Enhancing Fairness of FIB Lookup in Named Data
Networking
Author: Jing-Yung FU
Supervisor: Dr. Ling-Jyh CHEN
Abstract
Named Data Networking (NDN) is a novel network architecture in which information is transferred by content name rather than by IP address: users retrieve content by directly describing the information they want. However, with the emergence of the Internet of Things (IoT), the amount of information flowing through the network will grow. Moreover, because names differ in both length and content, their lookup times in a router can vary widely, so users may wait very different amounts of time for their data; an "unfair" situation arises. To solve this fairness issue, we propose DePT (Dispersed eminent Patricia Trie), a method that combines two basic techniques. DePT gives every name approximately the same lookup time in the FIB; moreover, lookup efficiency improves and memory consumption drops, reducing the burden on the hardware.
Keywords: Named Data Networking, fairness, FIB
TABLE OF CONTENTS
1. Introduction
2. Background and Related Work
   2.1 NDN overview
   2.2 Forwarding process in NDN
   2.3 Name lookup in NDN
      2.3.1 Pending Interest Table lookup
      2.3.2 Forwarding Information Base lookup
3. Problem Definition
4. Our Approach
   4.1 DePT - Building DePT
      4.1.1 Building DePT - Filtering phase
      4.1.2 Building DePT - Incremental update phase
   4.2 DePT - FIB lookup
5. Evaluation
   5.1 Dataset
      5.1.1 Real-world dataset
      5.1.2 Synthetic dataset
   5.2 Measurement
      5.2.1 Data distribution
      5.2.2 FIB lookup time of real-world dataset
      5.2.3 FIB lookup time under different n
      5.2.4 FIB lookup time under different p
      5.2.5 Coefficient of Variation of FIB lookup time under different p
      5.2.6 Memory consumption of DePT under different p
      5.2.7 Incremental update of DePT under different k
6. Discussion
7. Conclusion & Future Work
8. References
LIST OF TABLES
1. Relationship between length (number of characters) and lookup time
LIST OF FIGURES
1. Internet and NDN hourglass architecture
2. Packets in NDN architecture
3. Entry in PIT and FIB
4. Interest/Data forwarding process in NDN router
5. Trie, Ternary Trie and Patricia Trie for names "by", "sea", "sells", "shells" and "shore"
6. Flow chart of building DePT
7. Building DePT - Filtering phase
8. Building DePT - Incremental update phase
9. Distribution of name length in real-world dataset
10. FIB lookup time for four methods compared under different n: (a) real-world dataset; (b) synthetic dataset
11. FIB lookup time for four methods compared among 0%, 30%, 60% and 90% common prefix: (a) 0%; (b) 30%; (c) 60%; (d) 90%
12. Average FIB lookup time for four methods compared under different p
13. Coefficient of Variation of FIB lookup time for four methods compared under different p
14. Memory consumption for four methods compared under different p
15. Incremental update under different k
1. INTRODUCTION
The increasing demand for highly scalable and efficient content distribution has motivated the development of future Internet architectures. Unlike the traditional IP network, these architectures focus on information objects such as videos, documents, and other pieces of information, rather than on the physical address or location of the desired data. This approach is commonly called information-centric networking (ICN) [1]. ICN favors the deployment of in-network caching and multicast mechanisms. Based on the ICN concept, a clean-slate network architecture, Named Data Networking (NDN) [2], has been proposed. It focuses on "what" the information (content) is rather than "where" the information is located. Transferring data between NDN nodes reduces the bandwidth cost for content providers, improves consumers' download speed, and even increases system stability.
NDN has already been applied in several domains, detailed as follows. Named Data Networking for the Internet of Things (IoT) [3] has attracted attention: NDN is recognized as a content-retrieval solution in both wired and wireless domains, because its innovative concepts, such as named content, name-based routing and in-network caching, particularly suit the requirements of the IoT. Direct management of security, naming and data aggregation is also beneficial for the IoT. Most significantly, the IP address assignment procedure is spared entirely: devices in the IoT communicate with each other through meaningful names, so application developers are free to design a namespace that fits the constraints of their environment. The vehicular ad-hoc network (VANET) [4] is another application of NDN. An increasing number of vehicles are connected to the Internet today, mainly via cellular networks only. A new design, V-NDN [5], has been proposed and demonstrated through real experimentation. In V-NDN, communication between cars is carried by content names and conveys abundant traffic information.
In the applications above, names in NDN play an important role. Every consumer receives its desired data through routers. Whether the data comes directly from the Content Store (CS) or via a matching name prefix in the Forwarding Information Base (FIB), lookup time is what every consumer cares about most. Many methods for name lookup have been proposed, mainly based on hash tables (HT), Bloom filters (BF) and tries. In the Pending Interest Table (PIT), exact string matching (ESM) of NDN names is most efficient on specialized hardware such as Ternary Content Addressable Memories (TCAM) [6], which minimizes the memory accesses required to locate an entry by comparing it against all memory words in one clock cycle. On the other hand, longest prefix matching (LPM) is used in the FIB, and the trie is its core data structure. Our preliminary approach therefore implemented two related tries, the Ternary Trie and the Patricia Trie. The reason we try alternative tries for name prefix search is that the lookup efficiency of a trie depends on the length of the name. Another purpose of our approach is to achieve fairness in searching as far as possible. Consequently, the remaining problem is how to serve every request in an NDN router fairly: because the length of a name and the number of routers it traverses are uncertain, the time to receive a Data packet may differ greatly between requests.
In this paper, we propose and implement our method, the Dispersed eminent Patricia Trie (DePT), a name prefix lookup architecture that obtains high speedup and achieves fairness in the NDN FIB. Name prefix lookup in the FIB differs from lookup in the PIT: a name of unknown length must be matched in the FIB. We develop DePT as a generic architecture to solve the fairness issue. DePT is based on multiple Patricia Tries so as to narrow the range of each name prefix lookup. Every Patricia Trie holds almost the same number of name prefixes, so every name is served within a fair search range. In addition, we make the following contributions.
1. We propose DePT, which performs accelerated NDN name lookup in the FIB. The core of DePT is a distribution method that classifies all name prefixes in the FIB into several groups according to a specific hash value. The search scope therefore becomes smaller and efficiency improves.
2. We use the Patricia Trie as the basis of DePT because it not only achieves a remarkable average speedup for name prefix lookup over the general trie but also yields fairer lookup times.
3. We include an incremental update in DePT to handle special cases. To prevent the situation in which many common prefixes appear in the FIB and degrade lookup efficiency and fairness, the incremental update is activated immediately when the threshold is reached.
4. We demonstrate better savings in memory consumption. The trie is usually viewed as consuming more memory than a hash table or Bloom filter; however, DePT uses the Patricia Trie to save memory in case the number of names grows greatly in the future.
The rest of this paper is organized as follows. Section 2 introduces NDN, including its architecture, packet forwarding, and the name prefix lookup methods that have been proposed. Section 3 defines the key problems in NDN cache components, and Section 4 describes the DePT approach in detail. Section 5 presents our experimental evaluation, and we conclude our research in Sections 6 and 7.
2. BACKGROUND & RELATED WORK
Before introducing our approach, we give a brief introduction to NDN, covering its internal architecture, the forwarding process, and the previously proposed name prefix lookup methods relevant to our research.
2.1 NDN overview
Named Data Networking (NDN) is a realization of the information-centric networking (ICN) concept. The most significant distinction from IP is that every piece of content in NDN is routed and forwarded by its assigned name instead of a fixed-length IP address. NDN names are application-dependent and opaque to the network.
NDN aims to remove the restriction that packets can only name communication endpoints. Consequently, the name in a packet can be anything: an endpoint, a chunk of a movie or book, a command to do something, and so on. As Figure 1 shows, NDN is an evolution of the IP architecture that generalizes the role of the thin waist, meaning packets can name objects other than communication endpoints.
Figure 1: Internet and NDN hourglass architecture [2]
Security is another noteworthy point. In NDN, security is built into the data itself rather than being a function of the channel. A signature is a security key bound to the data; it not only couples the data with its publisher's information but also enables determination of data provenance. Regarding network traffic, an NDN router can control the traffic load by controlling the number of Interests, achieving flow balance between Interest packets and Data packets.
Compared with the IP network, in-network storage is the core of NDN practice. Routers in IP cannot reuse data after forwarding it, but routers in NDN can cache data to satisfy future requests, since data is identified by name. This caching mechanism accelerates information retrieval and achieves nearly optimal data delivery. Further details about caching are given in the following section.
Unlike the traditional Internet Protocol (IP) network, NDN has many distinctive features. NDN is a consumer-driven and data-driven communication protocol. As shown in Figure 2, data consumers and data providers exchange information using two distinct packets: the Interest packet and the Data packet. Both carry a content name that uniquely identifies a piece of data, unlike an IP address, which is a meaningless serial number in the IP network.
Figure 2: Packets in NDN architecture
Using Interest and Data packets as the medium in the NDN infrastructure, a consumer puts the name of a desired piece of data into an Interest packet and sends it to the network. Routers in NDN use this name to forward the Interest packet toward the data producer. Once the Interest reaches a node that has the requested data, the node returns a Data packet containing both the name and the content, together with a signature by the producer's key. The most significant difference between NDN and the IP network is NDN's cache mechanism, which consists of the Content Store (CS), the Pending Interest Table (PIT) and the Forwarding Information Base (FIB).
1. The Content Store (CS) acts like a buffer memory in today's networks and is mainly used to store content that has been forwarded. By contrast, no information is left in an IP router after packet forwarding. In an NDN network, the CS keeps reusable data, such as popular news and information, through replacement algorithms as far as possible. When many users request the same content, this saves the bandwidth needed to download the data, and consumers can verify that the data has not been tampered with by checking the signed info and signature in the Data packet. The replacement algorithm to improve the hit ratio and the configured size of the CS are important research topics in NDN.
2. The Pending Interest Table (PIT) keeps track of the Interest packets sent upstream. In Figure 3, each PIT entry contains the name of the Interest and the set of interfaces from which Interests for that name have been received. When the corresponding Data packet arrives, the router forwards the data to all the interfaces listed in the PIT entry, then removes the entry and caches the data in the Content Store. Furthermore, to keep the PIT from filling with stale entries, the incoming interface is also removed from the PIT when its lifetime expires.
3. In Figure 3, the Forwarding Information Base (FIB) in NDN differs from the IP FIB in two ways. First, an entry in the IP FIB contains only a single best next hop, whereas an NDN FIB entry contains a list of multiple interfaces. Second, an IP FIB entry contains nothing but next-hop information, while an NDN FIB entry records both data-plane state and routing preferences to provide adaptive forwarding decisions for every name.
Figure 3: Entry in PIT and FIB
2.2 Forwarding process in NDN
In Figure 4, the forwarding process in an NDN node follows a specified rule. A consumer puts the name of the desired data into an Interest packet and sends it into the network, and routers use this name to forward the packet toward data producers.
When an Interest packet arrives, an NDN router first checks whether matching data exists in its CS. If the desired data exists, a Data packet containing the name, the content and the producer's key is transferred back to the requesting consumer. Otherwise, the router looks up the name in its PIT. If a matching entry exists in the PIT, the router only records the incoming interface of this Interest: when Interest packets with the same name arrive from multiple downstream nodes, only the first one is forwarded upstream toward the data producer. If there is no matching entry in the PIT, the name is added as a new PIT entry and the Interest proceeds to the FIB, which forwards it toward the data producer according to the router's adaptive forwarding strategy.
After a Data packet arrives, an NDN router forwards the data to all downstream nodes whose interfaces are listed in the corresponding PIT entry, so every Interest packet that arrived in the same time slot receives this Data packet. Simultaneously, the PIT entry is removed and the data is cached in the Content Store. If the Data packet arrives after the configured timeout, the waiting Interest entry in the PIT has already been removed and the Data packet is dropped to prevent network congestion.
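A minimal Python sketch of this Interest-processing flow, using plain dicts as stand-ins for the CS, PIT and FIB tables (the function and table shapes are illustrative, not the real NDN router API):

```python
# A minimal sketch of the Interest-processing flow described above.
# The dict-based CS/PIT/FIB tables and function names are illustrative
# stand-ins, not the real NDN router API.

def longest_prefix(name, fib):
    """Longest-prefix match over '/'-delimited name components."""
    components = name.strip('/').split('/')
    for i in range(len(components), 0, -1):
        prefix = '/' + '/'.join(components[:i])
        if prefix in fib:
            return prefix
    return None

def process_interest(name, in_face, cs, pit, fib):
    """Return ('data', content) on a cache hit, ('aggregated', None) when the
    Interest joins an existing PIT entry, or ('forward', faces) otherwise."""
    if name in cs:                       # 1. Content Store hit: answer directly
        return ('data', cs[name])
    if name in pit:                      # 2. Pending entry: record the face only;
        pit[name].add(in_face)           #    the first Interest was already forwarded
        return ('aggregated', None)
    pit[name] = {in_face}                # 3. New PIT entry, then consult the FIB
    match = longest_prefix(name, fib)
    return ('forward', fib.get(match, []))
```

Note how a second Interest for a pending name only adds its incoming face to the PIT entry instead of being forwarded again, mirroring the aggregation rule above.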
Figure 4: Interest/Data forwarding process in NDN router [2]
2.3 Name lookup in NDN
In this paper, our method and evaluation mainly target name lookup in the FIB. Apart from name lookup for data caching in the Content Store, we briefly mention some proposed methods for name lookup in the PIT and introduce the basis of our method for name prefix lookup in the FIB. Unlike the fixed-length IP address comparison performed in IP routers, there are two ways to look up names in an NDN router, as mentioned before: exact string matching, which the PIT uses to search whole names, and longest prefix matching, which is applied in the FIB. What they have in common is that a name waiting to be searched in the PIT or FIB exists for only a limited time and is deleted if nothing responds; the lookup method must therefore fit these properties. In the following subsections, we discuss some proposed methods for name lookup in the PIT and FIB.
2.3.1 Pending Interest Table (PIT) Lookup
Exact string matching in the PIT is similar to string matching in traditional data structures. The proposed methods are mainly based on DFAs, hash tables and Bloom filters. The basic and classical method is the DFA (deterministic finite automaton) [7]. The difficulty with DFAs is that each name prefix in NDN is an unbounded string, which requires a special encoding scheme.
Another method is the hash table (HT). HT-based lookup algorithms are efficient, but the choice of hash function significantly affects performance. As a result, many extensions or variations of the hash table have been proposed. In [8], various hash functions are compared in terms of efficiency and collision rate, verifying that different hash functions can differ greatly. [9] presents a multi-hash name lookup table whose main objective is a lower false-positive rate; in other words, the size of the hash table and the number of linked lists at each location become additional design parameters. [10] instead uses a hash table based on compact arrays rather than linked lists, and also sets a fixed number of components at which lookup starts, to prevent DoS attacks on the NDN router. Similarly, the default lookup starting point is a threshold that affects efficiency.
In [11], the Bloom filter (BF) is used for IP networks; it is based on multiple hash functions. Regarding Bloom filter usage in NDN, [6] presents a distributed Bloom filter that reduces the memory needed to implement the PIT, and a name lookup engine with an Adaptive Prefix Bloom Filter [12] has been proposed, in which each NDN name is split into a B-prefix followed by a T-suffix; the B-prefix is matched in a Bloom filter and the T-suffix in a trie. The lengths of these two segments depend on how popular they are, which requires additional statistics to adjust the boundary. NameFilter [13] is a two-stage Bloom filter for name lookup: the first stage determines the length of a name prefix, and the second stage looks the prefix up in a narrowed group of Bloom filters. The Mapping Bloom Filter [14], a modified Bloom filter data structure, has been proposed to minimize on-chip memory consumption by using SRAM, and it even decreases the false-positive rate. In summary, Bloom filters have two drawbacks: they require large memory bandwidth when operating multiple hash functions, and adequate hash functions must be chosen to lower the false-positive rate.
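As a minimal illustration of the multi-hash scheme these methods build on, here is a toy Bloom filter sketch; the bit-array size, hash count, and SHA-256-based hashing are illustrative choices, not taken from any of the cited designs:

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter illustrating the multiple-hash-function scheme.
    Parameters are illustrative, not tuned for NDN workloads."""

    def __init__(self, m_bits=1024, k_hashes=4):
        self.m, self.k = m_bits, k_hashes
        self.bits = 0                         # bit array packed into one int

    def _positions(self, item):
        # k bit positions derived from k salted SHA-256 digests
        for i in range(self.k):
            digest = hashlib.sha256(f'{i}:{item}'.encode()).digest()
            yield int.from_bytes(digest[:8], 'big') % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        # May answer True for items never added (a false positive),
        # but never answers False for an added item.
        return all(self.bits >> pos & 1 for pos in self._positions(item))
```

The two drawbacks above are visible here: every query touches k independent hash positions (memory bandwidth), and membership answers are only probabilistic (false positives).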
2.3.2 Forwarding Information Base (FIB) Lookup
The purpose of name prefix lookup in the FIB is to find the longest matching prefix of the name and the face toward which to forward the request for the desired data. Unlike exact string matching, longest prefix matching only needs to find a prefix of shorter or equal length. For longest prefix matching, the majority of methods are based on tries. [15, 16] are trie-based longest prefix matching algorithms for IP networks, which cannot satisfy the need to store millions of variable-length, unbounded names. Fu Li et al. [17] presented a framework for fast longest-prefix name lookup based on a name space reduction scheme; the method uses a fat tree and extensible hybrid data structures to accelerate the name lookup process. Yi Wang et al. [18] proposed a Name Components Encoding approach for longest prefix lookup in NDN, involving a code allocation mechanism and evolutionary state transition arrays, which however increases search complexity and reduces lookup efficiency. [19] also used tree-based structures built on hardware parallelism to achieve high lookup speed.
Reviewing the approaches above, we found that most of them target whole-name lookup and fit the properties of the PIT; fewer approaches aim at searching for the longest prefix in the FIB. Another point worth noting is that insertions and deletions of name prefixes occur frequently in an NDN router. We therefore base our approach on basic data structures for longest prefix matching: the Trie, the Ternary Trie and the Patricia Trie. Although the trie has relatively higher memory consumption, content management is more convenient than with a hash table or Bloom filter. Consequently, we implement and discuss the two trie extensions underlying our DePT, the Ternary Trie and the Patricia Trie; brief introductions to all three follow.
1. The Trie [20] is a data structure in which each path from the root to a leaf corresponds to one key in the represented set. Each node in the trie has an array of child pointers indexed by character, and a path in the trie corresponds to the characters of a key in the FIB. When an input request performs longest prefix matching in the trie, the value of the longest existing key is returned. Figure 5 (a) shows five strings stored in a trie, each with its corresponding value; consequently, a request for the name "season" obtains the returned value "7" after the lookup process.
2. In a Ternary Trie [21], each node has three children. The location of each node depends on the order of insertion because of its character comparisons. When an input request performs longest prefix matching in a Ternary Trie, each character of the name is compared not only against the characters of the stored string but also against other characters along the lookup path. In Figure 5 (b), a request for the name "season" meets two additional stored characters, "h" and "l", before matching the longest string "sea". Another disadvantage of the Ternary Trie is that alphabetical insertion order can make the nodes lean to one side, making the lookup procedure harder. The Ternary Trie therefore not only slows down name prefix searching but also increases memory consumption.
3. The Patricia Trie [22] is equivalent to a compressed trie: it is a simple variant of the trie in which any path whose interior vertices all have only one child is compressed into a single edge. The lookup path is the same as in the trie, but efficiency is better because there are fewer nodes. In Figure 5 (c), a request for the name "shellshock" looked up in the Patricia Trie needs to visit only three nodes rather than six nodes in the trie.
Figure 5 : Trie, Ternary Trie and Patricia Trie for names “by”, “sea”, “sells”,
“shells” and “shore”
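To make the lookup semantics concrete, here is a minimal character-level trie in the style of Figure 5 (a) with longest prefix matching; the stored strings mirror the figure's example, while the attached values are illustrative:

```python
# A character-level trie with longest prefix matching, in the style of
# Figure 5 (a). Stored strings mirror the figure; values are illustrative.

class TrieNode:
    def __init__(self):
        self.children = {}   # char -> TrieNode
        self.value = None    # non-None marks the end of a stored key

def insert(root, key, value):
    node = root
    for ch in key:
        node = node.children.setdefault(ch, TrieNode())
    node.value = value

def longest_prefix_match(root, query):
    """Walk the query, remembering the value of the deepest stored prefix."""
    node, best = root, None
    for ch in query:
        node = node.children.get(ch)
        if node is None:
            break
        if node.value is not None:
            best = node.value
    return best
```

Inserting "by", "sea", "sells", "shells" and "shore" and then querying "season" returns the value stored for "sea", the longest stored prefix; querying "shellshock" returns the value for "shells".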
In the NDN router of the current simulator ndnSIM [23], the trie is used for longest prefix matching. Compared with the common hash table method, it has lower computational complexity and is favorable for data insertion and deletion. From the introduction above, the lookup complexity of the Trie and the Patricia Trie is at most O(|s|), where s is the name being looked up. The Ternary Trie has an additional comparison step, so its complexity is O(|s| + log n), where n is the number of strings. The Patricia Trie saves a great deal of memory and greatly speeds up longest prefix matching, which is why we use it as the basis of our method DePT for storing name prefixes in the FIB.
3. PROBLEM DEFINITION
With respect to longest prefix matching of NDN names in the FIB, efficiency is the issue of greatest concern; the fairness issue, however, has not been addressed before. Considering throughput and overhead at the same time, our additional emphasis in name prefix lookup is achieving fairness as far as possible. In the following, we define three main metrics, searching delay, memory consumption and searching fairness, and our purpose is to optimize all three.
1) Searching delay in the FIB is the time an Interest packet waits for the longest prefix matching process. Each request name from an Interest packet has a searching delay d, and we define D as the total searching delay over our name dataset of n requests, i.e., the sum of the n individual delays.
2) Memory consumption in the FIB is the amount of memory the data structure uses to store name prefixes. In our method DePT, we need k sub data structures to store all name prefixes, and each subsection occupies m memory, so the total memory consumption M in the FIB is M = k * m.
3) Searching fairness of input requests in the FIB shows how fair the searching delays of the longest prefix matching process are. To measure fairness, we use the Coefficient of Variation (CV), the standard deviation of the searching delays divided by their mean, to estimate whether the searching delay is fair.
Given these three metrics, the ideal solution can be summarized as follows. A smaller searching delay means the lookup process in the FIB is more efficient; a data structure with lower memory consumption makes for a better methodology; and a smaller CV means the searching delays of the input requests are similar and concentrated, i.e., the situation is fair. The ideal state is therefore that all three metrics are small at once.
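The three metrics can be sketched in a few lines (hypothetical helper names; CV is computed here as the population standard deviation over the mean):

```python
import statistics

def total_delay(delays):
    """D: the total searching delay summed over all n request delays d_i."""
    return sum(delays)

def total_memory(k, m):
    """M = k * m: k sub data structures, each occupying m memory."""
    return k * m

def coefficient_of_variation(delays):
    """CV = (population) standard deviation / mean; smaller means fairer."""
    return statistics.pstdev(delays) / statistics.mean(delays)
```

For instance, four requests with identical delays give CV = 0, a perfectly fair outcome, while widely spread delays drive CV up.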
4. OUR APPROACH
We propose a method named DePT. Inspired by the alphabetical construction of the Ternary Trie, we introduce a novel classification idea for building a fast data structure, while avoiding the Ternary Trie's disadvantage that the insertion order of strings greatly affects lookup efficiency. With this better classification method, the design rationale behind DePT is to classify all name prefixes of the FIB evenly into many sub-tries, narrowing the lookup scope and enhancing lookup fairness. We also provide an incremental update mechanism to avoid a particular special situation, described below.
The word "Dispersed" in DePT is the core concept. We separate the entire structure into several subsections so as to narrow the searching range. Because the searching scope of every name request is then nearly the same, the lookup times are close to each other and searching fairness improves.
There is more than one way to disperse, and we require that the method have at least two basic properties: efficiency and diversity. In our method, we use a hash function to classify all name prefixes into groups; the hash function distributes the name prefixes evenly across the Patricia Tries. For longest prefix matching we use the Patricia Trie as our data structure because of its efficiency and fairness; its details were given in Section 2.
To summarize, DePT takes advantage of both a hash function and the Patricia Trie. Figure 6 shows the flow chart of building DePT, and we detail each phase of our approach in the following subsections.
Figure 6 : Flow Chart of building DePT
4.1 DePT – Building DePT
1) Building DePT – Filtering phase
In the first part of our approach, the goal is to narrow the scope of searching. In a general FIB, all name prefixes are put into a single data structure such as a trie. To enhance search speed, we instead use a filtering step to split the original structure into multiple subsections: every name prefix is classified into a restricted range determined by the hash table size. At the same time, to control the number of name prefixes in every bucket of the hash table, we allocate a counter per bucket; this guards against the special circumstance, mentioned later, that affects efficiency and fairness.
By default, we fix the number of input components: we take the first three components of a name prefix and feed them to our default hash function. We choose three as the preliminary number because the first three components almost always represent the domain name in a URL; besides, almost every name prefix has at least three components and shares a common prefix in current NDN naming practice [24].
As the hash function, we choose CityHash64, which gives every name prefix a 64-bit hash value. A narrowing step then shrinks the hash value to a specific range to ensure that every name prefix falls within the DePT table. Figure 7 shows the entire filtering procedure; our default DePT size is one thousand buckets, so every hash value is taken mod 1000 to obtain the corresponding sub-trie number.
Figure 7: Building DePT – Filtering phase
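The filtering step can be sketched as follows; since CityHash64 is not assumed available here, a SHA-256-derived 64-bit value stands in for it, and the bucket count follows the text's default of one thousand:

```python
import hashlib

NUM_BUCKETS = 1000   # default DePT table size from the text

def hash64(s):
    """64-bit hash value; a SHA-256-derived stand-in for CityHash64."""
    return int.from_bytes(hashlib.sha256(s.encode()).digest()[:8], 'big')

def subtrie_number(name_prefix, n=3):
    """Hash the first n '/'-delimited components, then mod into a bucket."""
    components = name_prefix.strip('/').split('/')
    return hash64('/'.join(components[:n])) % NUM_BUCKETS
```

Names sharing their first three components always land in the same sub-trie, which is exactly what makes the overload case handled by the incremental update possible.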
2) Building DePT – Incremental Update Phase
When using DePT, a worst case can occur: excessive concentration of name prefixes. For example, in a campus NDN router, the majority of names might share "ndn/ntnu/office" as their first three (n) components. In this scenario, one Patricia Trie in DePT becomes larger and deeper than the others, degrading lookup efficiency and fairness. To avoid this, we check the counter Ci kept in every bucket, which records how many name prefixes have been filed into sub-trie i. If the counter of some sub-trie s exceeds fifty percent of the total number of name prefixes in the FIB, the Incremental Update mechanism is activated: name prefixes in sub-trie s that have more than three components are put into an additional DePT by hashing their first four (n + 1) components. In this additional DePT, the name prefixes that came from sub-trie s are reassigned among the sub-tries of the new DePT, resolving the excessive accumulation in the first DePT.
Figure 8 (a), (b) shows the procedure of the incremental update. In Figure 8 (a), relatively many name prefixes accumulate in sub-trie number four. After the incremental update, the name prefixes are evenly distributed in the new DePT, as shown in Figure 8 (b).
(a) Original DePT (b) New DePT
Figure 8: Building DePT - Incremental update phase
Although the default n in every NDN router is three, n may be adjusted according to the circumstances in the router at the time. Consequently, the initial n may differ between routers, but the integrity of the names is never changed.
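A self-contained sketch of the counter check and re-hashing step; the 50% threshold and the n = 3 default follow the text, while the SHA-256-based hash is a stand-in for CityHash64:

```python
import hashlib

NUM_BUCKETS = 1000   # default DePT table size

def bucket_of(name_prefix, n):
    """Hash the first n components into one of NUM_BUCKETS sub-tries
    (SHA-256-derived stand-in for CityHash64)."""
    key = '/'.join(name_prefix.strip('/').split('/')[:n])
    h = int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], 'big')
    return h % NUM_BUCKETS

def needs_update(counter_s, total_prefixes):
    """True when sub-trie s holds more than half of all FIB prefixes."""
    return counter_s > 0.5 * total_prefixes

def incremental_update(overloaded_prefixes, n=3):
    """Redistribute prefixes with more than n components by hashing
    their first n+1 components into a new DePT's buckets."""
    new_dept = {}
    for p in overloaded_prefixes:
        if len(p.strip('/').split('/')) > n:
            new_dept.setdefault(bucket_of(p, n + 1), []).append(p)
    return new_dept
```

Prefixes with exactly n components cannot be re-hashed on n + 1 components and so stay in the original sub-trie, matching the text's restriction to prefixes with more than three components.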
4.2 DePT – FIB lookup
After the filtering phase, the name prefixes in the FIB have been classified into the sub-tries of DePT. If the number of sub-tries is k and there are N name prefixes in the FIB, the name prefix lookup range has been shortened to N/k. Following the pseudo-code shown below, each input request name obtains a hash value by hashing its first n components; the hash value is then reduced to the specific range given by the DePT default, and the NDN name performs longest prefix matching in the corresponding sub-trie. A sub-trie holds two kinds of content: names that share the same first n components, and names whose first n components happen to hash to the same value. If the sub-trie number the name obtains corresponds to the new DePT, the name first performs longest prefix matching in the additional DePT, using the new sub-trie number, to find the corresponding face; if no match is found there, it performs name prefix lookup in the original DePT using the original sub-trie number. The request name then obtains the face list indicating which router the request should go toward. Which face in the list the name is actually forwarded on depends on the forwarding strategy the NDN router uses, which is not discussed in this paper.
Algorithm: Name lookup in DePT
Input: request_name
Output: face_number
Procedure:
    (com1, com2, …, comk) ← Decompose(request_name)
    value ← CityHash64(com1 … comn)
    orig_subtrie ← value mod table_size
    if orig_subtrie = new_DePT then
        value ← CityHash64(com1 … comn+1)
        new_subtrie ← value mod table_size
        if LongestPrefixMatching(additional_DePT, new_subtrie) then
            return face_number
    if LongestPrefixMatching(original_DePT, orig_subtrie) then
        return face_number
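The lookup procedure above can be rendered as runnable Python (an illustrative sketch: MD5 stands in for CityHash64, and `lpm` is a hypothetical stand-in for longest prefix matching inside one sub-trie):

```python
import hashlib

def hash_prefix(components, n, table_size):
    """Hash the first n name components to a sub-trie number.
    MD5 stands in here for the CityHash64 used in the thesis."""
    prefix = "/".join(components[:n])
    digest = hashlib.md5(prefix.encode()).digest()
    return int.from_bytes(digest[:8], "big") % table_size

def lookup(request_name, n, table_size, new_dept_subtrie, lpm):
    """DePT FIB lookup. `lpm(dept, subtrie, components)` performs
    longest prefix matching inside one sub-trie of the given DePT
    and returns a face number, or None if nothing matches."""
    components = [c for c in request_name.split("/") if c]
    subtrie = hash_prefix(components, n, table_size)
    if subtrie == new_dept_subtrie:
        # This sub-trie was redistributed: search the additional DePT
        # first, keyed on the first n+1 components.
        face = lpm("additional",
                   hash_prefix(components, n + 1, table_size), components)
        if face is not None:
            return face
    # Fall back to the original DePT with the original sub-trie number.
    return lpm("original", subtrie, components)
```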
5. EVALUATION
In our evaluation, we designed our own datasets covering several cases and compared longest prefix search times using our approach on them. The experiments focus on comparing lookup efficiency, storage, and fairness. For lookup efficiency, we served each input request about 100 times and report the mean value to improve measurement accuracy.
5.1 Dataset
The NDN naming scheme has not been standardized; different applications may adopt different naming rules for different purposes. Nevertheless, some characteristics of NDN names are clear: names are hierarchically structured, and the design allows each application to choose the naming scheme that fits its needs. For example, names in V-NDN carry traffic, vehicular, and road information. Because the number of existing NDN names is insufficient, we use URLs (Uniform Resource Locators) as our real-world dataset, since their properties and structure are similar to NDN names. In a URL, '/' is a delimiter that separates the parts of the name, and the domain name usually carries meaning. In addition, we randomly created the same amount of synthetic data based on basic NDN rules. The details of these two types of datasets are described below.
1) Real-world data:
For the real-world dataset, URLBlacklist [24] provides a collection of URL domains and short URLs. We used this dataset after converting it to NDN names according to a specific rule. For example, the URL 1-domination.com/video-bdsm/massage-du-corps-puis-coups is converted into /ndn/com/1-domination/video-bdsm/massage-du-corps-puis-coups: a component "ndn" is prepended to mark the name as one used by an NDN router, and the first component (the domain name) is reversed at each '.' and split into components. These modifications increase the hierarchy of the names and their similarity to current NDN names. In order to compare the efficiency of name prefix matching, we randomly selected one hundred thousand URLs for our experiment.
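The conversion rule just described can be written compactly (an illustrative Python sketch of the two steps: prepending "ndn" and reversing the domain on '.'):

```python
def url_to_ndn_name(url):
    """Convert a URL into an NDN-style name: prepend an "ndn"
    component, reverse the domain name on '.' so the name becomes
    hierarchical, then append the remaining path components."""
    domain, _, path = url.partition("/")
    components = ["ndn"] + domain.split(".")[::-1]
    if path:
        components += path.split("/")
    return "/" + "/".join(components)

# The example from the text:
assert url_to_ndn_name(
    "1-domination.com/video-bdsm/massage-du-corps-puis-coups"
) == "/ndn/com/1-domination/video-bdsm/massage-du-corps-puis-coups"
```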
2) Synthetic data:
For the synthetic data, we imitated NDN naming characteristics to build our FIB dataset. By observing NDN status [25], we found that almost every name prefix in the FIB of an NDN router has at least three components. Every request name has at least 5 components, each containing 5-20 random characters. Unlike the real-world data, the character strings in the synthetic data are not meaningful; each is a completely random combination of letters and digits. The name prefixes we set in the FIB have 4 components.
In addition, we emulate situations in which p percent of the name prefixes in the FIB share a common prefix, with p ranging from 10 to 100. We designed these datasets because a local NDN router may hold mostly common prefixes. For example, an NDN router in a supermarket such as Costco would use a common prefix for items of the same kind of food, and this situation may influence our method DePT.
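The synthetic dataset described above can be generated along these lines (an illustrative Python sketch; the common-prefix string and the alphabet are arbitrary placeholders):

```python
import random
import string

def synthetic_fib(size, p, common_prefix="/ndn/aaa/bbb", n_components=4):
    """Generate a synthetic FIB of 4-component name prefixes in which
    p percent share a common 3-component prefix. Each component has
    5-20 random characters, as in the description above."""
    def component():
        alphabet = string.ascii_lowercase + string.digits
        return "".join(random.choices(alphabet, k=random.randint(5, 20)))

    n_common = size * p // 100
    names = []
    for i in range(size):
        if i < n_common:
            # Common prefix plus one random fourth component.
            names.append(common_prefix + "/" + component())
        else:
            names.append("/" + "/".join(component()
                                        for _ in range(n_components)))
    return names

fib = synthetic_fib(1000, 60)
assert sum(name.startswith("/ndn/aaa/bbb/") for name in fib) == 600
```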
5.2 Measurement
5.2.1. Data distribution
Figure 9 : Distribution of name length in real-world dataset
In Figure 9, we analyze the length of each NDN name in our real-world dataset by number of characters. We selected names covering both short and long lengths: the shortest name has 19 characters and 5 components, while the longest has 766 characters and 26 components. The coefficient of variation of name length is 0.33 (by characters) and 0.18 (by components), indicating that the disparity of name lengths is sufficiently large.
5.2.2. FIB lookup time of Real-world dataset
            Q1 (0~25%)   Q2 (25~50%)   Q3 (50~75%)   Q4 (75~100%)
Data_1          31           33            42             43
Data_2          41           41            40             62
Data_3          38           46            65             80
Table 1: Relationship between length (number of characters) and lookup time
In Table 1, we used three datasets to test how the Trie data structure affects lookup efficiency for NDN names of different lengths. Data_1, Data_2, and Data_3 are real-world datasets of 1,000, 10,000, and 100,000 names, respectively. For each dataset, we partition the measured lookup times into quartiles Q1-Q4, so Q1 contains the 25% shortest lookup times and Q4 the 25% longest. The numbers in the table report the average name length, in characters, within each quartile. We can see that longer names incur longer lookup times, while shorter names need less time. This proves that using a Trie as the FIB data structure influences lookup fairness.
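The quartile partitioning behind Table 1 can be expressed as follows (an illustrative Python sketch; it assumes one measured lookup time per name and a dataset size divisible by four):

```python
def avg_length_per_time_quartile(names, times):
    """Sort names by measured lookup time, split them into quartiles
    Q1-Q4, and return the average name length (in characters) within
    each quartile, mirroring the layout of Table 1."""
    ordered = [name for _, name in sorted(zip(times, names))]
    q = len(ordered) // 4
    return [sum(len(n) for n in ordered[i * q:(i + 1) * q]) / q
            for i in range(4)]

# The slowest-looked-up names turn out to be the longest ones.
assert avg_length_per_time_quartile(
    ["a", "bb", "ccc", "dddd"], [4, 3, 2, 1]) == [4.0, 3.0, 2.0, 1.0]
```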
5.2.3. FIB lookup time under different n
(a) Real-world dataset (b) Synthetic dataset
Figure 10: FIB lookup time for four methods comparison under different n
In Figure 10 (a), we evaluate the FIB lookup time of Trie, Ternary Trie, Patricia Trie, and DePT under different values of n, where each name puts its first n components into the filtering phase. In our FIB, name prefixes are set to have at least four components, with "ndn" as the first component; therefore, we compare n = 2, 3, and 4. Among Trie, Ternary Trie, and Patricia Trie, Trie has better lookup efficiency than Patricia Trie, and Ternary Trie is at a disadvantage in both lookup efficiency and lookup fairness. In particular, the curve of Patricia Trie is more stepped than those of Trie and Ternary Trie. These steps occur because the real-world dataset contains more than one group of common prefixes, and these common prefixes have different lengths; this situation is likely to occur in future NDN routers.
For DePT, the three curves show lookup times when n equals 2, 3, and 4. When n equals 3, the vertical extent of the curve is larger than the others, meaning that the fairness of lookup time is best; the curves also show that the lookup time itself is best. Therefore, DePT has the best lookup efficiency and fairness when n is initialized to 3.
In Figure 10 (b), the number of common prefixes is 50% of the total name prefixes (p = 50), and the common prefix has 3 components. That is, half of the name prefixes share the same first 3 components. After the incremental update, n in the additional DePT is set to 4.
5.2.4. FIB lookup time under different p
(a) 0% (b) 30%
(c) 60% (d) 90%
Figure 11: FIB lookup time for four methods comparison among 0%, 30%, 60% and
90% of common prefix
Figure 11 compares the FIB lookup time of the four methods (Trie, Ternary Trie, Patricia Trie, and DePT) under 0%, 30%, 60%, and 90% common prefix (p = 0, 30, 60, 90). The threshold k we set for incremental update is 50. In (a), all name prefixes differ in their first three components. Ternary Trie shows the worst lookup efficiency and fairness; by contrast, DePT is the most efficient and fair method. Since DePT is based on Patricia Trie, the main difference between these two methods and the others is that their curves fluctuate. The fluctuation exists because a matching step in Patricia Trie may compare not just one character but a long section of the name, so the curves are not as smooth as those of Trie and Ternary Trie. In addition, zero percent common prefix in the FIB is the best case for DePT, because excessive accumulation of name prefixes in one sub-trie hardly ever occurs.
In (b) and (c), the curves of Trie and Ternary Trie are more vertical than at zero percent. The reason is that as the number of common prefixes increases and the number of distinct prefixes decreases, name prefix lookup needs relatively less time. There is also an increasingly long smooth segment in the curves, indicating that a group of name prefixes has nearly the same lookup time. In (b), Trie is faster than Patricia Trie at the beginning of the curve, the fluctuation amplitude is much larger than at zero percent in (a), and the trend of the DePT curve is similar to that of the Patricia Trie curve. In (c), however, part of the Patricia Trie curve shows worse lookup times than Trie, the gap in the curve is larger and more obvious than at zero percent in (a), and the DePT curve no longer follows the Patricia Trie curve: the incremental update performed while building DePT when p exceeds 50 means DePT is not affected by the increasing number of common prefixes.
In (d), p equals 90, meaning almost all names in the FIB follow a common prefix, as might happen in a local NDN router. Trie and Ternary Trie both have a relatively vertical part in their curves, indicating that almost all name lookups gather in the same part of the trie. Similarly, the Patricia Trie curve has a larger gap than in (b) and (c); although its vertical extent is better than Trie's, most of its lookup times are clearly worse than Trie's. Best of all, DePT has the best lookup efficiency and fairness because the incremental update fires when the number of name prefixes in one sub-trie reaches the default threshold. After the incremental update, the number of name prefixes in every sub-trie of DePT stays nearly equal, yielding better lookup fairness.
Figure 12: Average FIB lookup time for four methods comparison in different p
Figure 12 shows the average FIB lookup time over datasets whose percentage of common prefix ranges from 0% to 100% (p = 0 ~ 100). Ternary Trie is clearly the worst of the four methods because of its redundant alphabetical-order comparisons. Unlike Patricia Trie and DePT, the average lookup times of Trie and Ternary Trie decrease as the ratio increases; with more common prefixes in the FIB, lookup efficiency in Patricia Trie instead becomes worse. The reason is that with many common prefixes, Patricia Trie must decompose strings more often than a plain Trie or Ternary Trie, so name lookup needs relatively more time. Notably, the DePT line shows about the same search time at almost every percentage point, because the incremental update mechanism is activated once the number of common prefixes in one sub-trie exceeds 50% (p = 50) of the total name prefixes in the FIB.
5.2.5. Coefficient of Variation of FIB lookup time under different p
Figure 13: Coefficient of Variation of FIB lookup time for four methods comparison in
different p
!30!
Figure 13 shows the coefficient of variation (CV) of name prefix lookup time, which we analyzed for every percentage of common prefix in the dataset. The smaller the CV, the fairer the lookup times. On average, DePT is better than the other three methods. The curves of Patricia Trie, Ternary Trie, and Trie are close to each other, with an overlapping point at 40%. At the 40% point, Patricia Trie and Ternary Trie both reach their highest CV values, with a large jump between 30% and 50%, while Trie reaches its highest CV at the 60% point. Overall, we conclude that with 40% to 60% common prefix in the dataset, lookup times are the least concentrated and the most unfair; this is why the default k for incremental update in DePT is 50. Indeed, on the DePT curve with k = 50, the 50% point has a relatively smaller coefficient of variation than the 40% point, showing that the fairness of name prefix lookup has been improved.
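The fairness metric of Figure 13 is simply the coefficient of variation, which can be computed as follows (a minimal Python sketch):

```python
import statistics

def coefficient_of_variation(times):
    """CV = standard deviation / mean of the lookup times. A smaller
    CV means lookup times are closer together, i.e. fairer."""
    return statistics.pstdev(times) / statistics.mean(times)

# Uniform lookup times give CV = 0 (perfectly fair); spread-out
# times give a positive CV.
assert coefficient_of_variation([5, 5, 5, 5]) == 0.0
assert coefficient_of_variation([2, 4, 6]) > 0
```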
5.2.6. Memory consumption of DePT under different p
Figure 14: Memory consumption for four methods comparison in different p
Figure 14 shows that Ternary Trie consumes the most memory. In a Ternary Trie, every node has a fixed set of three pointers to child nodes, used for character comparison. In contrast, the number of pointers per node in Trie and Patricia Trie depends on how many children the node actually has. Patricia Trie also has fewer nodes than Trie because each node stores a string rather than a single character, so Patricia Trie consumes less memory. DePT is composed of many Patricia Tries, so its memory consumption in the bar chart is similar to that of the basic Patricia Trie. As the number of common prefixes in the FIB increases, memory consumption decreases for all four methods, so the 100% bars show the lowest memory use.
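The memory argument above can be made concrete with minimal node layouts (an illustrative Python sketch; the field names are assumptions, not the thesis implementation):

```python
class TernaryNode:
    """Ternary Trie node: one character plus a fixed set of three
    child pointers (lo, eq, hi), allocated whether used or not."""
    __slots__ = ("char", "lo", "eq", "hi", "face")

    def __init__(self, char):
        self.char, self.face = char, None
        self.lo = self.eq = self.hi = None

class PatriciaNode:
    """Patricia Trie node: a whole string fragment, with only as
    many child pointers as the node actually needs."""
    __slots__ = ("fragment", "children", "face")

    def __init__(self, fragment):
        self.fragment, self.children, self.face = fragment, {}, None

# One Patricia node covers a fragment that would take one ternary
# node (three pointers each) per character.
assert len(PatriciaNode("office").fragment) == 6
```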
5.2.7. Incremental update of DePT under different k
Figure 15: Incremental update under different k
In Figure 15, we compare different values of k for incremental update on the synthetic dataset with 60% common prefix. The dataset contains 100,000 name requests, so more than 60,000 name prefixes are classified into a single sub-trie. Because excessive name prefixes accumulate in the same sub-trie of DePT, both lookup efficiency and fairness suffer; to keep the share of name prefixes in any one sub-trie from growing, the incremental update must be applied. When k is set to 30 or 60, the incremental update adds an additional DePT, increasing n to n+1 for filtering-phase insertion, whenever the number of name prefixes in some sub-trie exceeds 30% or 60% of the total name prefixes in the FIB. In this experiment, the default n is three, so n in the additional DePT becomes four after the incremental update, and the name prefixes are well distributed in DePT.
When k is 90 or ∞, the incremental update is never activated: 60% (p = 60) of the name prefixes fall into one sub-trie, which does not exceed the default k. According to the curves, the lookup efficiency and fairness are then both worse than the curves with incremental update.
6 DISCUSSION
Our method DePT focuses on enhancing the fairness of the search process. Because NDN names have unbounded length, a request name can carry a great deal of information. Unlike methods based on hash functions or Bloom filters, whose processing complexity depends on how long the name is (a name with more components needs more processing time), DePT provides better lookup fairness according to our evaluation results. We have also considered the special cases that may arise in the future: the incremental update mechanism in DePT not only enhances scalability but also makes lookup more efficient and fair.
However, DePT has not yet been implemented on a simulator or testbed under a real scenario. Real-time insertion and deletion of name prefixes in the FIB may therefore influence our lookup process. If frequent insertions and deletions repeatedly push the number of prefixes in some sub-trie past the threshold for activating the incremental update, new DePTs will be built frequently, which may affect name lookup in the FIB. In our experiments, we do not measure the time needed to build an additional DePT when the incremental update is activated. In addition, we handle requests sequentially rather than simultaneously; hardware issues such as concurrent access are another external factor.
As for the dataset, because a formal NDN naming rule has not been formulated, our real-world dataset was converted to NDN form following the public steps proposed in another paper. We cannot be sure that the names used in our evaluation will match actual NDN names in the future.
7 CONCLUSION & FUTURE WORK
We focused on the fairness of name lookup in the FIB in Named Data Networking and proposed a method that offers not only good lookup efficiency but also better lookup fairness, while reducing the memory requirement of the data structure used to store FIB name prefixes. Fairness of name prefix lookup is the issue we care about most: we want every name from an Interest packet to be served fairly, no matter how long it is or what content it carries. We implemented and evaluated our method DePT on both a real-world dataset and a synthetic dataset, and the experimental results show better fairness and lookup efficiency on both.
Compared with hash-table-based and other methods, DePT provides good fairness and does not need to care about the length of the name, so we need not worry about the abundant information names may contain in the future. As future work, we plan to collect real NDN names and implement DePT on the NDN testbed.
REFERENCES
[1] Bengt Ahlgren, Christian Dannewitz, Claudio Imbrenda, Dirk Kutscher and
Börje Ohlman, “A Survey of Information-Centric Networking,” IEEE
Communications Magazine, July 2012, vol. 50, no. 7, pp.26-36.
[2] Lixia Zhang, Deborah Estrin, and Jeffrey Burke, et al, “Named Data Networking
(NDN) Project,” PARC, Palo Alto, CA, Tech. Rep. NDN-0001, October 2010.
[3] M. Amadeo, C. Campolo, et al, “Named Data Networking for IOT: an
Architectural Perspective,” European Conference on Networks and Communications
(EuCNC), June 2014, pp.1-5.
[4] Ghassan Samara, Wafaa A.H. Al-Salihy, R. SuresS, “Security Analysis of
Vehicular Ad Hoc Networks (VANET),” Second International Conference on Network
Applications, Protocols and Services, NETAPPS 2010, pp.55-60.
[5] Giulio Grassi, Davide Pesavento, Giovanni Pau, Rama Vuyyuru, Ryuji
Wakikawa and Lixia Zhang, “VANET via Named Data Networking,” IEEE
Conference on Computer Communications Workshops (INFOCOM WKSHPS), April
2014, pp.410-415.
[6] Wei You, Bertrand Mathieu, Patrick Truong and Jean-François Peltier, “DiPIT:
a Distributed Bloom-Filter based PIT table for CCN Nodes,” 21st International
Conference on Computer Communications and Networks (ICCCN), 2012, pp.1-7.
[7] Yujian Fan, Hongli Zhang, Jiahui Liu and Dongliang Xu, “An Efficient Parallel
String Matching Algorithm Based on DFA,” Trustworthy Computing Services,
Communications in Computer and Information Science, 2013, vol. 320, pp.349-356.
[8] Won So, Ashok Narayanan, David Oran and Yaogong Wang, “Toward Fast
NDN Software Forwarding Lookup Engine based on Hash Tables,” ACM/IEEE
Symposium on Architectures for Networking and Communications Systems, October
2012, pp.85-86.
[9] D. Xu, and H. Zhang et al. “A Scalable Multi-Hash Name Lookup Method for
Named Data Networking.”
[10] Won So, Ashok Narayanan and David Oran, “Named Data Networking on a
Router: Fast and DoS-resistant Forwarding with Hash Tables,” ACM/IEEE Symposium
on Architectures for Networking and Communications Systems, October 2013,
pp.215-226
[11] Sarang Dharmapurikar, Praveen Krishnamurthy and David E. Taylor, “Longest
Prefix Matching using Bloom filters,” IEEE/ACM Transactions on Networking, April
2006, vol. 14, no.2, pp.397-409.
[12] Wei Quan, Changqiao Xu, Jianfeng Guan, Hongke Zhang and Luigi Alfredo
Grieco, “Scalable Name Lookup with Adaptive Prefix Bloom Filter for Named Data
Networking,” IEEE Communications Letters, January 2014, vol. 18, pp.102-105.
[13] Yi Wang, Tian Pan, Zhian Mi, Huichen Dai, Xiaoyu Guo, Ting Zhang, Bin Liu
and Qunfeng Dong, “NameFilter: Achieving fast name lookup with low memory
consumption via applying two-stage Bloom Filters,” IEEE INFOCOM, April 2013,
pp.95-99.
[14] Zhuo Li, Kaihua Liu, Yang Zhao and Yongtao Ma, “MaPIT: An Enhanced
Pending Interest Table for NDN with Mapping Bloom Filter,” IEEE Communications
Letters, November 2014, pp.1915-1918.
[15] Ioannis Sourdis, Georgios Stefanakis, Ruben de Smet, and Georgi N. Gaydadjiev,
“Range Tries for Scalable Address Lookup,” ACM/IEEE Symposium on Architectures
for Networking and Communications Systems, 2009, pp. 143-152.
[16] I. Sourdis, and S. H. Katamaneni, et al. “Longest Prefix Match and Updates in
Range Tries,” IEEE International Conference on Application-Specific Systems,
Architectures and Processors (ASAP), September 2001, pp.51-58.
[17] Fu Li, Fuyu Chen, Jianming Wu and Haiyong Xie , “Fast longest prefix name
lookup for content-centric network forwarding,” ACM/IEEE Symposium on
Architectures for Networking and Communications Systems, 2012, pp.73-74
[18] Yi Wang, Keqiang He, Huichen Dai, Wei Meng, Junchen Jiang, Bin Liu and
Yan Chen, “Scalable Name Lookup in NDN using Effective Name Component
Encoding,” IEEE 32nd International Conference on Distributed Computing Systems
(ICDCS), June 2012, pp.688-697.
[19] Yi Wang, Huichen Dai, Junchen Jiang, Keqiang He, Wei Meng and Bin Liu,
“Parallel Name Lookup for Named Data Networking,” IEEE Global
Telecommunications Conference (GLOBECOM), December 2011, pp.1-5.
[20] Jun-Ichi Aoe, Katsushi Morimoto and Takashi Sato, “An Efficient
Implementation of Trie Structures,” Software: Practice and Experience, September
1992, Vol. 22, Issue 9, pp. 695–721.
[21] Ghada Hany Badr and B. John Oommen, “Self-Adjusting of Ternary Search
Tries Using Conditional Rotations and Randomized Heuristics,” The Computer
Journal, 2005, vol. 48, pp.200-219.
[22] Sebastian Kniesburges and Christian Scheideler, “Hashed Patricia Trie: Effective
Longest Prefix Matching in Peer-to-Peer Systems,” 5th International Workshop
WALCOM: Algorithms and Computation, 2011, vol. 6552, pp.170-181.
[23] ndnsim. http://ndnsim.net/2.0/
[24] URLBlacklist. http://urlblacklist.com.
[25] NDNstatus. http://www.arl.wustl.edu/~jdd/ndnstatus/ndn_prefix/tbs_ndnx.html.