![Page 1: 1 Ph.D. Thesis Proposal Data Caching in Ad Hoc and Sensor Networks Bin Tang Computer Science Department Stony Brook University](https://reader035.vdocument.in/reader035/viewer/2022062516/56649d595503460f94a38f53/html5/thumbnails/1.jpg)
1
Ph.D. Thesis Proposal
Data Caching in Ad Hoc and Sensor Networks
Bin Tang
Computer Science Department
Stony Brook University
2
Summary of My Work
Data caching under three constraints:
Update cost constraint: optimal algorithm for trees; approximation algorithm for general graphs.
Memory constraint with multiple data items: approximation algorithm for general graphs.
Number constraint with read/write/storage cost: optimal algorithm for trees.
Localized distributed implementations. Comparison with existing work.
3
Motivation
Ad hoc and sensor networks are resource constrained: limited bandwidth, battery energy, and memory.
Caching can save access (communication) cost, and thus bandwidth and energy.
Caching is studied under update cost, memory, and number constraints.
4
Rooted in…
Facility location problem: set up facilities in a network to minimize the sum of the total access cost and the setup cost
K-median problem: set up k facilities to minimize total access cost
5
1. Cache Placement in Sensor Networks Under Update Cost Constraint
6
Problem Statement
Sensor network model:
A data item is stored at a server node and updated at a certain frequency.
Other nodes access the data item at a certain frequency.
Problem statement: select nodes to cache the data item.
Goal: minimize the “total access cost”.
Constraint: total update cost.
7
Why update cost constraint?
Nodes close to the server bear most of the update cost.
8
Problem Formulation
Given:
Network graph G(V,E).
A data item stored at a server node.
Update frequency.
Access frequency for each other node.
Update cost constraint Δ.
Goal:
Select cache nodes to minimize the “total access cost”.
Total update cost is less than Δ.
9
Total Access/Update Cost
Total access cost = ∑ i ∈ V (hop length between i and its nearest cache × access frequency of i)
Total update cost = cost of the optimal Steiner tree over the server and all caches
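The access-cost definition above can be illustrated with a minimal sketch. It assumes a connected, unweighted network given as an adjacency dictionary and treats the server as one of the caches. The names `hop_distances` and `total_access_cost` and the 5-node example are hypothetical. The Steiner-tree update cost is not computed here.

```python
from collections import deque

def hop_distances(adj, sources):
    """Multi-source BFS: hop count from every node to its nearest source."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def total_access_cost(adj, freq, caches):
    """Sum over nodes of (hops to nearest cache) x (access frequency)."""
    dist = hop_distances(adj, caches)
    return sum(dist[v] * freq.get(v, 0) for v in adj)

# Hypothetical path network 0-1-2-3-4 with the server at node 0.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
freq = {1: 1, 2: 1, 3: 1, 4: 1}
cost_server_only = total_access_cost(adj, freq, [0])   # 1 + 2 + 3 + 4
cost_with_cache = total_access_cost(adj, freq, [0, 3])  # 1 + 1 + 0 + 1
print(cost_server_only, cost_with_cache)
```

In this example, adding a single cache at node 3 lowers the access cost from 10 to 3.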
10
Algorithm Design Outline
Tree networks: optimal dynamic programming algorithm.
General networks:
Multiple-unicast update model: approximation algorithm.
Steiner-tree update model: heuristic and distributed algorithms.
11
Tree Networks
![Page 12: 1 Ph.D. Thesis Proposal Data Caching in Ad Hoc and Sensor Networks Bin Tang Computer Science Department Stony Brook University](https://reader035.vdocument.in/reader035/viewer/2022062516/56649d595503460f94a38f53/html5/thumbnails/12.jpg)
12
Subtree notation
Server: “r”
Consider a subtree Tv.
Let path (v,x) on its leftmost branch be all caches.
Let C_v be the optimal access cost in Tv using additional update cost δ
Next: Recursive equation for C_v
[Figure: tree Tr rooted at the server r, showing subtree Tv rooted at v and node x]
13
Dynamic Programming Algorithm for Tv under update cost constraint δ
Let u = leftmost deepest node in the optimal set of caches in Tv
Path(v,u) can be all caches (update cost doesn’t increase)
For a fixed u, C_v = constant + optimal access cost in Rv,u for constraint (δ – δ_u)
Here, δ_u is the cost to update u (using path(v,x)).
Tv = Lv,u + Tu + Rv,u
14
DP recursive equation for Tv
$$
C_v = \min_{u \in T_v} \Big( \text{access cost in } L_{v,u} \text{ using path}(v,x) \text{ or path}(v,u) \;+\; \text{access cost in } T_u \text{ using } u \;+\; \text{optimal cost in } R_{v,u} \text{ with constraint } \delta - \delta_u \Big)
$$
Here, δ_u is the cost of updating u (using path(v,x)).
Note that Rv,u has a path (v, parent(u)) of caches on its leftmost branch.
15
Time Complexity
Time complexity: $O(n^4 + n^3 \Delta)$
Analysis:
Precomputation takes $O(n^4)$:
Lv,u with cache path (v,x): $O(n^4)$, for all v, u, x
Tu: $O(n^2)$, for all u
Recursive equation takes $O(n^3 \Delta)$:
$n^2 \Delta$ entries: one for each pair (v,x) and all values of Δ
Each entry takes $O(n)$: n possible u
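The bound follows from multiplying the table size by the work per entry and adding the precomputation. The worked equation below only restates the counts given on this slide; the name $T(n,\Delta)$ for the running time is introduced here for illustration.

$$
T(n,\Delta) \;=\; \underbrace{O(n^4)}_{\text{precompute } L_{v,u}} \;+\; \underbrace{O(n^2)}_{\text{precompute } T_u} \;+\; \underbrace{n^2 \Delta}_{\text{table entries}} \cdot \underbrace{O(n)}_{\text{choices of } u} \;=\; O(n^4 + n^3 \Delta)
$$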
16
General Graph Network
Two update cost models:
Multiple-Unicast
Optimal Steiner Tree
17
Multiple-Unicast Update Model
Update cost: sum of the shortest path lengths from the server to each cache node.
Benefit of node A: decrease in total access cost due to the selection of A as a cache.
Benefit per unit update cost: the benefit of A divided by the update cost of A.
18
Greedy Algorithm
Iteratively: Select the node with the highest benefit per unit update cost, until the update cost is exhausted
Theorem: Greedy solution’s benefit is at least 63% of the optimal benefit.
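The greedy rule can be sketched as follows for the multiple-unicast model. This is an illustrative sketch under stated assumptions, not the author's implementation: the graph is connected and unweighted, the update frequency is taken as 1 so a cache's update cost is its hop distance from the server, and a cache is admitted while the spent cost stays at or below the budget. All function names are hypothetical, and no claim is made that this simplified loop carries the 63% guarantee of the theorem.

```python
from collections import deque

def bfs(adj, sources):
    """Hop distance from every node to its nearest source."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def access_cost(adj, freq, caches):
    dist = bfs(adj, list(caches))
    return sum(dist[v] * freq.get(v, 0) for v in adj)

def greedy_caches(adj, freq, server, budget):
    """Repeatedly add the node with the highest benefit per unit update cost."""
    unicast = bfs(adj, [server])   # assumed update cost of a cache: hops from server
    caches, spent = {server}, 0
    while True:
        current = access_cost(adj, freq, caches)
        best, best_ratio = None, 0.0
        for v in adj:
            if v in caches or spent + unicast[v] > budget:
                continue
            benefit = current - access_cost(adj, freq, caches | {v})
            ratio = benefit / unicast[v]
            if ratio > best_ratio:
                best, best_ratio = v, ratio
        if best is None:           # nothing affordable improves the cost
            return caches, spent
        caches.add(best)
        spent += unicast[best]

# Hypothetical path network 0-1-2-3-4, server at node 0, update budget 4.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
freq = {1: 1, 2: 1, 3: 1, 4: 1}
chosen, spent = greedy_caches(adj, freq, 0, 4)
print(sorted(chosen), spent)
```

On this example the loop first picks node 1 (ratio 4), then node 2 (ratio 1.5), and stops because no remaining node fits in the budget.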
19
Steiner-Tree Update Cost Model
Steiner-tree update cost: cost of the 2-approximation Steiner tree over the cache nodes.
Incremental Steiner update cost of node A: increase in the Steiner-tree update cost due to A becoming a cache.
Greedy-Steiner algorithm: iteratively select the node with the highest benefit per unit incremental Steiner update cost.
20
Distributed Greedy-Steiner Algorithm
Each non-cache node estimates its benefit per unit update cost
If the estimate is maximum among all its non-cache neighbors, then it decides to cache
Algorithm:
In each round, each node decides whether to cache based on the rule above.
The server gathers the new cache node information and computes the total update cost.
The remaining update cost is broadcast to the network, and a new round begins.
21
Performance Evaluation
Parameters: (i) network-related: number of nodes and transmission radius; (ii) application-related: number of clients.
Random networks of 2,000 to 5,000 nodes in a 30 x 30 region.
22
Compared Caching Schemes
Centralized Greedy
Centralized Greedy-Steiner
Distributed Greedy-Steiner
Dynamic Programming on Shortest Path Tree of Clients
Dynamic Programming on Steiner Tree over Clients and Server
![Page 23: 1 Ph.D. Thesis Proposal Data Caching in Ad Hoc and Sensor Networks Bin Tang Computer Science Department Stony Brook University](https://reader035.vdocument.in/reader035/viewer/2022062516/56649d595503460f94a38f53/html5/thumbnails/23.jpg)
23
Varying Network Size – Transmission radius =2, percentage of clients = 50%, update cost = 25% of the Steiner tree cost
![Page 24: 1 Ph.D. Thesis Proposal Data Caching in Ad Hoc and Sensor Networks Bin Tang Computer Science Department Stony Brook University](https://reader035.vdocument.in/reader035/viewer/2022062516/56649d595503460f94a38f53/html5/thumbnails/24.jpg)
24
Varying Transmission Radius - Network size = 4000, percentage of clients = 50%, update cost = 25% of the Steiner tree cost
![Page 25: 1 Ph.D. Thesis Proposal Data Caching in Ad Hoc and Sensor Networks Bin Tang Computer Science Department Stony Brook University](https://reader035.vdocument.in/reader035/viewer/2022062516/56649d595503460f94a38f53/html5/thumbnails/25.jpg)
25
Varying number of clients – Transmission radius = 2, update cost = 50% of the Steiner tree cost, network size = 3000
26
To Recap: Data caching problem under update cost
constraint.
Optimal algorithm for tree; an approximation algorithm for general graph.
Efficient distributed implementations.
More general cache placement problem: (a) under memory constraint; (b) multiple data items.
27
2. Data Caching under Memory Constraint
28
Problem Addressed
In a general ad hoc network with limited memory at each node, where to cache data items, such that the total access (communication) cost is minimized?
29
Problem Formulation
Given:
Network graph G(V,E)
Multiple data items
Access frequencies (for each node and data item)
Memory constraint at each node
Select data items to cache at each node under the memory constraint
Minimize total access cost = ∑ nodes ∑ data items [(distance from node to the nearest cache for that data item) × (access frequency)]
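The objective can also be written compactly. The symbols below are assumptions introduced for illustration and do not appear on the slide: $m$ data items $D_1,\dots,D_m$, $a_{ij}$ for the access frequency of node $i$ for $D_j$, $c_j(i)$ for the nearest node holding a copy of $D_j$ (assumed here to include its server), $d(\cdot,\cdot)$ for the distance, $s_j$ for the size of $D_j$, and $m_i$ for the memory of node $i$.

$$
\min \; \sum_{i \in V} \sum_{j=1}^{m} a_{ij} \, d\big(i, c_j(i)\big)
\qquad \text{subject to} \qquad
\sum_{j \,:\, D_j \text{ cached at } i} s_j \;\le\; m_i \quad \text{for every } i \in V
$$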
30
Related Work
Related to the facility-location and K-median problems, which have no memory constraint.
Baev and Rajaraman:
20.5-approximation algorithm for uniform-size data items.
For non-uniform sizes, no polynomial-time approximation unless P = NP.
We circumvent the intractability by approximating “benefit” instead of access cost.
31
Related Work - continued
Two major empirical works on distributed caching:
Hara [Infocom’99]
Yin and Cao [Infocom’04] (we compare our work with theirs)
Our work is the first to present a distributed caching scheme based on an approximation algorithm
32
Algorithms
Centralized Greedy Algorithm (CGA): delivers a solution whose “benefit” is at least 1/2 of the optimal benefit.
Distributed Greedy Algorithm (DGA): purely localized.
33
Centralized Greedy Algorithm (CGA)
Benefit of caching a data item at a node
= the reduction of total access cost
i.e., (total access cost before caching) – (total access cost after caching)
34
Centralized Greedy Algorithm (CGA)
CGA iteratively selects the most beneficial (data item, node to cache at) pair.
I.e., we pick (at each stage) the pair that has the maximum benefit.
Theorem: CGA is 1/2-approximate for uniform-size data items and 1/4-approximate for non-uniform-size data items.
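The selection loop of CGA can be sketched as follows for uniform-size data items. This is a hedged illustration, not the author's code: it assumes a connected, unweighted graph, a per-node capacity counted in items, and that a server's original copy does not consume cache memory. The names `cga`, `item_cost`, and the example data are hypothetical.

```python
from collections import deque

def bfs(adj, sources):
    """Hop distance from every node to its nearest source."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def item_cost(adj, freq_j, holders):
    """Total access cost of one data item given the nodes holding a copy."""
    dist = bfs(adj, list(holders))
    return sum(dist[v] * freq_j.get(v, 0) for v in adj)

def cga(adj, freq, servers, capacity):
    """Greedily pick the (item, node) pair with the largest benefit."""
    holders = {j: {s} for j, s in servers.items()}
    used = {v: 0 for v in adj}
    placement = []
    while True:
        best, best_benefit = None, 0
        for j in servers:
            base = item_cost(adj, freq[j], holders[j])
            for v in adj:
                if v in holders[j] or used[v] >= capacity:
                    continue
                benefit = base - item_cost(adj, freq[j], holders[j] | {v})
                if benefit > best_benefit:
                    best, best_benefit = (j, v), benefit
        if best is None:            # no remaining pair reduces the access cost
            return placement
        j, v = best
        holders[j].add(v)
        used[v] += 1
        placement.append(best)

# Hypothetical path network 0-1-2-3-4: item "a" served at node 0, "b" at node 4.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
servers = {"a": 0, "b": 4}
freq = {"a": {3: 1, 4: 3}, "b": {0: 3, 1: 2}}
placement = cga(adj, freq, servers, capacity=1)
print(placement)
```

Each round recomputes every benefit from scratch, which keeps the sketch short at the price of efficiency.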
35
CGA Approximation Proof Sketch
G’: a modified G in which each node has twice the memory it has in G and caches the data items selected by both CGA and the optimal solution
B(Optimal in G)
< B(Greedy + Optimal in G’)
= B(Greedy) + B(Optimal) w.r.t Greedy
< B(Greedy) + B(Greedy) [Due to greedy choice]
= 2 x B(Greedy)
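The same chain can be written symbolically. The notation is an assumption introduced here: $B(\cdot)$ is the benefit of a placement, $O$ the optimal placement in $G$, $A$ the placement chosen by CGA, and $B(O \mid A)$ the additional benefit of adding $O$'s selections on top of $A$ in $G'$. The inequalities are written in non-strict form.

$$
B(O) \;\le\; B(A \cup O) \;=\; B(A) + B(O \mid A) \;\le\; B(A) + B(A) \;=\; 2\,B(A)
$$

The first step holds because adding copies never increases any access cost; the second is the greedy-choice step named on the slide. Rearranging gives $B(A) \ge B(O)/2$.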
36
Distributed Greedy Algorithm (DGA)
Each node caches the most beneficial data items, where the benefit is based on “local traffic” only.
“Local traffic” includes:
Its own data requests
Data requests to its data items
Data requests it forwards to others
37
DGA: Nearest Cache Table
Why do we need it?
To forward requests to the nearest cache
For local benefit calculation
What is it?
Each node keeps the ID of the nearest cache for each data item
Entries of the form: (data item, the nearest cache)
The table sits on top of the routing table.
Maintenance – next slide
38
Maintenance of Nearest-cache Table
When node i caches data item Dj:
Broadcast (i, Dj) to neighbors
Notify the server, which keeps a list of caches
On receiving (i, Dj):
If i is nearer than the current nearest cache of Dj, update and forward
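The receive rule can be sketched as a small handler. This is a minimal illustration under assumptions: the table is a dictionary from data item to nearest cache, and hop distances to other nodes are read from a dictionary standing in for the routing table. The function name and the example values are hypothetical.

```python
def on_cache_announcement(nearest, hops, item, cache_node):
    """Process a received (cache_node, item) announcement at one node.

    nearest: dict mapping item -> ID of the nearest known cache (this node's table)
    hops:    dict mapping node ID -> hop distance from this node (assumed to
             come from the underlying routing table)
    Returns True when the table changed and the message should be forwarded.
    """
    current = nearest.get(item)
    if current is None or hops[cache_node] < hops[current]:
        nearest[item] = cache_node
        return True
    return False

# Hypothetical node whose nearest cache for "D1" is node 7, three hops away.
nearest = {"D1": 7}
hops = {7: 3, 4: 1, 9: 5}
forward_near = on_cache_announcement(nearest, hops, "D1", 4)  # closer: adopt and forward
forward_far = on_cache_announcement(nearest, hops, "D1", 9)   # farther: ignore
print(nearest, forward_near, forward_far)
```

Forwarding only on improvement keeps the broadcast from spreading past the nodes that actually benefit from the new cache.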
39
Maintenance of Nearest-cache Table -II
When node i deletes Dj:
Get the list of caches Cj from the server of Dj
Broadcast (i, Dj, Cj) to neighbors
On receiving (i, Dj, Cj):
If i is the current nearest cache for Dj, update using Cj and forward
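A matching sketch for the deletion notice is given below. It assumes that "update using Cj" means switching to the closest node in Cj, that Cj is non-empty and no longer contains the deleting node, and that hop distances come from a dictionary standing in for the routing table. These are interpretations made for illustration, and the names are hypothetical.

```python
def on_cache_deletion(nearest, hops, item, deleted_node, remaining):
    """Process a received (deleted_node, item, remaining) deletion notice.

    remaining: the cache list Cj obtained from the server of the item,
               assumed non-empty and without deleted_node
    Returns True when this node's entry changed and the notice should be forwarded.
    """
    if nearest.get(item) != deleted_node:
        return False                      # this node was not using the deleted cache
    nearest[item] = min(remaining, key=lambda c: hops[c])
    return True

# Hypothetical node currently pointing at cache 4 for "D1".
nearest = {"D1": 4}
hops = {4: 1, 7: 3, 9: 5}
changed = on_cache_deletion(nearest, hops, "D1", 4, [7, 9])     # switches to node 7
unaffected = on_cache_deletion(nearest, hops, "D1", 4, [7, 9])  # entry already moved on
print(nearest, changed, unaffected)
```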
40
Maintenance of Nearest-cache Table -III
More details pertaining to:
Mobility
Second-nearest cache entries (needed for benefit calculation for cache deletions)
Benefit thresholds
41
Performance Evaluation
CGA vs. DGA Comparison
DGA vs. HybridCache Comparison
42
CGA vs. DGA
Summary of simulation results: DGA performs quite close to CGA for a wide range of parameter values
![Page 43: 1 Ph.D. Thesis Proposal Data Caching in Ad Hoc and Sensor Networks Bin Tang Computer Science Department Stony Brook University](https://reader035.vdocument.in/reader035/viewer/2022062516/56649d595503460f94a38f53/html5/thumbnails/43.jpg)
43
Varying Number of Data Items and Memory Capacity – Transmission radius =5, number of nodes = 500
44
DGA vs. Yin and Cao’s work.
Yin and Cao [Infocom’04]:
CacheData: caches passing-by data items
CachePath: caches the path to the nearest cache
HybridCache: caches the data if its size is small enough, otherwise caches the path to the data
The only prior work on a purely distributed cache placement algorithm with a memory constraint
45
DGA vs. HybridCache
Simulation setup:
ns2; the routing protocol is DSDV
Random waypoint model: 100 nodes move at a speed within (0, 20 m/s) in a 2000 m x 500 m area
Tr = 250 m, bandwidth = 2 Mbps
Performance metrics:
Average query delay
Query success ratio
Total number of messages
Server model: 1000 data items, divided between two servers.
Data item size: [100, 1500] bytes
Data access models:
Random: each node accesses 200 data items chosen randomly from the 1000 data items
Spatial: (details skipped)
Naïve caching algorithm: caches any passing-by data, uses LRU for cache replacement
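The naïve baseline can be sketched as a byte-bounded LRU cache that admits every item passing through the node. This is an illustrative sketch, not the simulated implementation: the class name, method names, and the 3000-byte capacity are hypothetical, and items larger than the whole cache are simply skipped.

```python
from collections import OrderedDict

class NaiveLRUCache:
    """Caches any passing-by data item; evicts least recently used items."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.items = OrderedDict()   # item id -> size in bytes, oldest first

    def get(self, item_id):
        """Return True on a cache hit and mark the item as recently used."""
        if item_id not in self.items:
            return False
        self.items.move_to_end(item_id)
        return True

    def on_passing_data(self, item_id, size):
        """Admit an item seen in transit, evicting LRU entries to make room."""
        if size > self.capacity:
            return
        if item_id in self.items:
            self.items.move_to_end(item_id)
            return
        while self.used + size > self.capacity:
            _, old_size = self.items.popitem(last=False)
            self.used -= old_size
        self.items[item_id] = size
        self.used += size

cache = NaiveLRUCache(3000)
cache.on_passing_data("d1", 1500)
cache.on_passing_data("d2", 1000)
hit = cache.get("d1")               # d1 becomes the most recently used item
cache.on_passing_data("d3", 1000)   # needs room, so the LRU item d2 is evicted
print(list(cache.items), cache.used)
```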
![Page 47: 1 Ph.D. Thesis Proposal Data Caching in Ad Hoc and Sensor Networks Bin Tang Computer Science Department Stony Brook University](https://reader035.vdocument.in/reader035/viewer/2022062516/56649d595503460f94a38f53/html5/thumbnails/47.jpg)
Varying query generation time under the random access pattern
48
Summary of Simulation Results
Both HybridCache and DGA outperform the naïve approach
DGA outperforms HybridCache in all metrics, especially for frequent queries and small cache sizes
For high mobility, DGA has a slightly worse average delay but a much better query success ratio
49
To Recap:
Data caching problem for multiple items under memory constraint
Centralized approximation algorithm
Localized distributed implementation
No update or storage costs are considered (otherwise, there is no performance guarantee)
Can we consider and minimize the total cost of read/write/storage?
50
3. Data Caching Under Number Constraint
51
Problem Formulation
Given:
Network graph G(V,E)
A data item to be stored in the network
Access (read) frequency for each node
Write frequency for each node
Caching (storage) cost for each node
Number of allowable caching nodes: P
Goal:
Select cache nodes to minimize the “total cost”
Under the number constraint
52
Total Cost
= total read cost + total write cost + total storage cost
= ∑ i ∈ V (hop length between i and its nearest cache × access frequency of i)
+ ∑ i ∈ V (cost of the optimal Steiner tree over i and all caches × write frequency of i)
+ ∑ i ∈ cache nodes (storage cost at i)
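For a tree network the three terms can be evaluated directly, because the optimal Steiner tree over a set of nodes in a tree is just the minimal subtree spanning them. The sketch below assumes a tree with unit-cost edges given as an adjacency dictionary. All function names and the example values are hypothetical.

```python
from collections import deque

def bfs(adj, sources):
    """Hop distance from every node to its nearest source."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def steiner_edges_in_tree(adj, terminals):
    """Number of edges in the minimal subtree of a tree spanning `terminals`."""
    terminals = set(terminals)
    root = next(iter(adj))
    parent, order, stack = {root: None}, [], [root]
    while stack:                       # iterative DFS; parents precede children
        u = stack.pop()
        order.append(u)
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                stack.append(w)
    count = {v: int(v in terminals) for v in adj}
    for v in reversed(order):          # accumulate terminal counts bottom-up
        if parent[v] is not None:
            count[parent[v]] += count[v]
    total = len(terminals)
    # The edge above v is needed iff terminals lie on both sides of it.
    return sum(1 for v in adj if parent[v] is not None and 0 < count[v] < total)

def total_cost(adj, read, write, storage, caches):
    hops = bfs(adj, list(caches))
    read_cost = sum(hops[v] * read.get(v, 0) for v in adj)
    write_cost = sum(steiner_edges_in_tree(adj, set(caches) | {v}) * w
                     for v, w in write.items())
    storage_cost = sum(storage[c] for c in caches)
    return read_cost + write_cost + storage_cost

# Hypothetical path tree 0-1-2-3-4: every node reads once, node 0 writes once.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
read = {v: 1 for v in adj}
write = {0: 1}
storage = {v: 2 for v in adj}
cost_middle = total_cost(adj, read, write, storage, [2])   # 6 read + 2 write + 2 storage
cost_single_end = total_cost(adj, read, write, storage, [0])  # 10 read + 0 write + 2 storage
print(cost_middle, cost_single_end)
```

The example shows the trade-off the number-constrained problem balances: moving the single cache from node 0 to node 2 raises the write cost but lowers the read cost by more.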
53
Related Work
K-median problem (access and storage cost)
Tamir attains the best time complexity on trees
We generalize it with write cost on both trees ($O(n^2 P^3)$) and general graphs
Kalpakis et al. solve the same problem, with time complexity $O(n^6 P^3)$
54
Tree Topology
55
Tamir’s DP Algorithm on tree Tr
Transform arbitrary tree into full binary tree
Each non-leaf node v has two children: v1, v2
For each v in binary tree, compute and sort the distance from v to all nodes
“leaves to root” dynamic programming algorithm
56
Our DP Algorithm
Idea: For each node v in Tr:
the cost of sub-tree Tv =
access cost of nodes in Tv
+ storage cost of caching nodes in Tv
+ write cost of all the writer nodes in Tr due to edges in Tv
57
DP Algorithm - Definitions
G(v, q, r): optimal cost for subtree Tv with exactly q caches in Tv, where the cache closest to v is at most r hops away
F(v, q, r): optimal cost for Tv with exactly q caches in Tv and some cache nodes outside of Tv, where the outside cache closest to v is r hops away
F’(v, r): optimal cost for Tv with no cache in Tv and some cache nodes outside of Tv, where the outside cache closest to v is r hops away
58
Recursive DP Equations: p cache nodes allowed
1. G(v, q, 0): v is a cache node
= storage cost at v + the cost of Tv1 and Tv2 + the write cost on edges vv1 and vv2
2. G(v, q<p, r>0): there is some cache node outside of Tv
= min{ G(v, q, r-1) (there is a cache in Tv within r-1 hops of v), the cost when “the closest cache to v is r hops away” }
59
Recursive DP Equations - continued
3. G(v, q=P, r>0): no cache node outside of Tv
= min{ G(v, q, r-1), the cost when “the closest cache is r hops away” }
4. F(v, q, r): there is a cache node outside of Tv
= min{ G(v, q, r-1), the cost when “the closest cache to v is r hops away” }
60
Minimum total cost of the original tree Tr = min over 1≤p≤P of G(r, p, L), where L is the number of hops from r to the farthest node in Tr
Time complexity: $O(n^2 P^3)$
For each p, vary q from 1 to p
For each (v, q), vary the closest cache node to v (n possibilities) and split q between Tv1 and Tv2 (q such possibilities)
61
Conclusion
We design optimal, near-optimal, and heuristic algorithms for data caching under different constraints in ad hoc and sensor networks
We show that our algorithms can be implemented in a distributed way
62
Questions?