wireless sensor network protocols - welcome to networked
TRANSCRIPT
IV-1
Part IV: Sensor Network Protocols
Deborah Estrin
IV-2
Wireless Sensor Network Protocols
• Primary theme: building long-lived, massively-distributed, physically-coupled systems:
– Coordinating to minimize duty cycle and communication
• Adaptive MAC
• Adaptive Topology
• Routing
– In-network processing
• Data centric routing
• Programming models
In-network: Application processing, Aggregation, Query processing
Adaptive topology, Geo-Routing
MAC, Time, Location
Phy: comm, sensing, actuation, SP
User Queries, External Database
Data dissemination, storage, caching
IV-3
Medium Access Control in Sensor Nets
• Important attributes of MAC protocols
1. Collision avoidance2. Energy efficiency3. Scalability in node density4. Latency5. Fairness6. Throughput7. Bandwidth utilization
IV-4
• Major sources of energy waste• Idle listening when no sensing events, Collisions,
Control overhead, Overhearing
MAC Impact on Sensor Networks(Intanago et al, 2000)
0
0.02
0.04
0.06
0.08
0.1
0.12
0.14
0 50 100 150 200 250 300
Ave
rage
Dis
sipa
ted
Ene
rgy
(Jou
les/
Nod
e/R
ecei
ved
Eve
nt)
Network Size
DiffusionDiffusion
Omniscient MulticastOmniscient MulticastFloodingFlooding
00.0020.0040.0060.0080.01
0.0120.0140.0160.018
0 50 100 150 200 250 300
Ave
rage
Dis
sipa
ted
Ene
rgy
(Jou
les/
Nod
e/R
ecei
ved
Eve
nt)
Network Size
DiffusionDiffusion
Omniscient MulticastOmniscient Multicast
FloodingFlooding
Over 802.11-like MAC Over energy-aware MAC
IV-5
Identifying the Energy Consumers
•• Need to shutdown the radio
From Tsiatis et al. 2002
SENSORS
Power consumption of node subsystems
0
5
10
15
20P
ower
(m
W)
CPU TX RX IDLE SLEEP
RADIO
SLEEPIDLERXTX EEEE >>≈≈
IV-6
• Major sources of energy waste
– Idle listening• Long idle time when no sensing event happens• Collisions• Control overhead• Overhearing
• Try to reduce energy consumption from all above sources
• TDMA requires slot allocation and time synchronization• Combine benefits of TDMA + contention protocols
Energy Efficiency in MAC
Common to all wireless networks
IV-7
Sensor-MAC (S-MAC) Design(Wei et al. 2002)
• Tradeoffs
• Major components of S-MAC• Periodic listen and sleep• Collision avoidance• Overhearing avoidance• Message passing
LatencyFairness
Energy
IV-8
Periodic Listen and Sleep
• Problem: Idle listening consumes significant energy– Nodes do not sleep in IEEE 802.11 ad hoc mode
• Solution: Periodic listen and sleep– Turn off radio when sleeping– Reduce duty cycle to ~10% (200 ms on/2s off)– Increased latency for reduced energy
sleeplisten listen sleep
IV-9
Periodic Listen and Sleep
• Schedules can differ
• Preferable if neighboring nodes have same schedule
— easy broadcast & low control overhead
Border nodes:two schedulesbroadcast twice
Node 1
Node 2
sleeplisten listen sleep
sleeplisten listen sleep
Schedule 2
Schedule 1
IV-10
Periodic Listen and Sleep
• Schedule maintenance– Remember neighbors’ schedules
— to know when to send to them
– Each node broadcasts its schedule every few periods– Refresh on neighbor’s schedule when receiving an
update– Schedule packets also serve as beacons for new
nodes to join a neighborhood
IV-11
Collision Avoidance
• Problem: Multiple senders want to talk• Options: Contention vs. TDMA• Solution: Similar to IEEE 802.11 ad hoc mode (DCF)
– Physical and virtual carrier sense– Randomized backoff time– RTS/CTS for hidden terminal problem– RTS/CTS/DATA/ACK sequence
IV-12
Overhearing Avoidance
• Problem: Receive packets destined to others• Solution: Sleep when neighbors talk
– Basic idea from PAMAS (Singh 1998)– But we only use in-channel signaling
• Who should sleep?
• All immediate neighbors of sender and receiver• How long to sleep?
• The duration field in each packet informs other nodes the sleep interval
IV-13
Message Passing
• Problem: In-network processing requires entire message
• Solution: Don’t interleave different messages– Long message is fragmented & sent in burst– RTS/CTS reserve medium for entire message– Fragment-level error recovery
— extend Tx time and re-transmit immediately• Other nodes sleep for whole message time
Fairness EnergyMsg-level latency
IV-14
Msg Passing vs. 802.11 fragmentation
Time reservation by duration field
ü If ACK is not received, give up Tx — fairnessü No indication of entire time — other nodes keep listening
RTS 21 ......
Data 19
ACK 18CTS 20
Data 17
ACK 16
Data 1
ACK 0
• MP
RTS 3 ......
Data 3
ACK 2CTS 2
Data 3
ACK 2
Data 1
ACK 0
• 802.11
IV-15
Implementation on Testbed Nodes
• Platform• Motes (UC Berkeley) :
• 8-bit CPU at 4MHz,• 8KB flash, 512B RAM
• TinyOS: event-driven
• Also used as NIC for 32-bit embedded PCs
• Compared MAC modules• IEEE 802.11-like protocol• Message passing with overhearing
avoidance• S-MAC (2 + periodic listen/sleep)• URL: http://www.isi.edu/scadds/smac/
IV-16
S-MAC Experimental results(implemented on UCB Mote over RFM radio)
• Topology and measured energy consumption on source nodes
Source 1
Source 2
Sink 1
Sink 2
• Each source node sends 10 messages — Each message has 10
fragments x 40B• Measure total energy
— Data + control + idle0 2 4 6 8 10
200
400
600
800
1000
1200
1400
1600
1800Average energy consumption in the source nodes
Message inter-arrival period (second)
Ene
rgy
cons
umpt
ion
(mJ)
802.11-like protocol Overhearing avoidanceS-MAC
Message Inter-arrival period
Energy consumed
IV-17
Adaptive Topology
• Can we do more than shut down radio in between transmissions/receptions?
• Can we put nodes to sleep for longer periods of time?
• Goal:– Exploit high density (over) deployment to extend
system lifetime – Provide topology that adapts to the application needs– Self-configuring system that adapts to environment
without manual configuration
IV-18
Adaptive Topology: Problem Description
• Simple Formulation (Geometric Disk Covering)– Given a distribution of N nodes in a plane.– Place a minimum number of disks of radius r (centered on
the nodes) to cover them.– Disk represents the radio connectivity (simple circle model).
• The problem is NP-hard.
IV-19
Connectivity Measurements*
Packet reception over distance has a heavy tail. There is a non-zero probability of receiving packets at distances much greater than the average cell range
169 motes, 13x13 grid, 2 ft spacing, open area, RFM radio, simple CSMA
Can’t justdetermineconnectivityclusters thrugeographic coordinates…
For the same reason you can’t determine coordinates w/connectivity
*An Empirical Study of Epidemic Algorithms in Large Scale Multihop Wireless NetworksGanesan, Krishnamachari, Woo, Culler, Estrin and Wicker, UCLA/CSD-TR 02-0013.
IV-20
Tradeoff• How many nodes to activate?
– few active nodes:• distance between neighboring nodes high -> increase packet loss and
higher transmit power and reduced spatial reuse;• need to maintain sensing coverage (see earlier session on
coverage/exposure)
– too many active nodes:• at best, expending unnecessary energy;• at worst nodes may interfere with one another by congesting the
channel.
IV-21
Adaptive Topology Schemes
• Mechanisms being explored: – Empirical adaptation: Each node assesses its connectivity
and adapts participation in multi-hop topology based on the measured operating region, ASCENT (Cerpa et al. 2002)
– Cluster-based, load sharing within clusters, CEC (Xu et al. 2002)
– Routing/Geographic topology based, eliminate redundant links, SPAN (Chen et al. 2001), GAF (Xu et al. 2001)
– Data/traffic driven: Trigger nodes on demand using paging channel, STEM (Tsiatsis et al. 2002)
IV-22
One example algorithm: ASCENT
• The nodes can be in active or passive state.– Active nodes forward data packets (using routing
mechanism that runs over topology). – Passive nodes do not forward any packets but may sleep or
collect network measurements. • Each node joins network topology or sleeps according to
measured number of neighbors and packet loss, as measured locally.
(b) Self-configuration transition(a) Communication Hole (c) Final State
Help Messages
Data Message
SinkSource SinkSource
Neighbor AnnouncementsMessages
Data Message
SinkSource
Active NeighborPassive Neighbor
IV-23
State Transitions
Test
Passive Sleep
Activeafter Tt
after Tp
after Ts
neighbors < NTand• loss > LT• loss < LT & help
neighbors > NT (high ID for ties); orloss > loss T0
NT: neighbor thresholdLT: loss thresholdT?: state timer values (p: passive, s: sleep, t: test)
IV-24
Testbed Platforms
• Tiered architecture: heterogeneous set of platforms (no particular set of compromises with a unique platform).
• PC-104/iPAQ enable running tasks that are more CPU/memory intensive.
iPAQ UCB Mote
• Simulator:– Discrete event-driven simulator.
– Multiple stackable modules perform different functionality.
PC-104/RPC
IV-25
Energy Savings Ratio
Energy Savings Ratio (normalized to the Active case, all nodes turn on) as a function of density. ASCENT provides significant amount of energy savings, with a factor of 5 for high density scenarios.
IV-26
Event Delivery Ratio
Event Delivery Ratio as a function of density. ASCENT reduces collisions experienced by diffusion routing by limiting the maximum number of active nodes transmitting packets.
Open question: to what extent would SMAC or Geo-routing produce the same result?
IV-27
radio supports promiscuous mode
noneadapt topology based on application needs
ASCENT
2 radios/wake-up channelconnectivity conditions
remain constant in sleeping periods
needs routing info to direct the wake-up wave
tradeoff latency for energy savings
STEM
802.11 MAC with Power Savings mode
gets connectivity matrix and neighbors from routing
requires modifications in the routing lookup process
preserve capacity of the raw topology
SPAN
geographic information for grid placement
radio connectivity directly correlated with geography
nonepreserve routing fidelityGAF
AssumptionsRouting dependency
Goal(general: energy
savings)
IV-28
STEM: Data drive wakeup(Tsiatsis et. al 2002)
Sensor-triggered node wakeup
Wake up the nodes along the path
HOW ???
event
sensor network
user
Zzz
Path nodes need to be woken up
Zzz ZzzZzz
IV-29
• Need to separate Wakeup and Data Forwarding Planes
• Chosen two separate radios for the two planes
• Use separate radio for the paging channel to avoid interference with regular data forwarding
• Trades off energy savings for path setup latency
Wakeup plane: f1
Data plane: f2
STEM: Sparse Topology and Energy Management
IV-30
Duty Cycled Wakeup Radio
f1
f1
Target node
Initiator node
Train of beacon packetsTRx
B1 B2 1. beacon received
2. beacon acknowledge
T
IV-31
Performance Metrics
(sec)ST
(sec)T (sec)T
0EE
IV-32
Energy vs Latency
%50
%10
%1
%1.0
0EE
(sec)ST
IV-33
STEM Design Decisions
• One radio with one frequency band would cause interference between the wakeup and data planes
• One radio with two frequencies would have to change frequency to listen for setup requests
• One radio with time multiplexed wake-up and data planes would require time synchronization between the two nodes –time sync in ad-hoc networks is an open problem by itself
• The additional radio cost is about 15% of the total cost of the node (from private communication with Savvides)
Active mode
Polling mode
Sleep mode
Interference
A B
C
D
IV-34
Design Issue: Collision Resolution
more initiator nodes
• upon detection of collision, a node turns on its data radio• after T, the initiator node assumes the target node is up
and contacts it on the data plane• when an expected target node doesn’t receive data, it
times out and goes back to sleep
1 initiator node
• beacon received correctly• only intended receiver turns on the data
radio and sends a beacon acknowledge in the wakeup plane
IV-35
Exploiting Latency AND Density
• Existing approaches (ASCENT, GAF, SPAN, etc) exploit node density
• Combine STEM and one of these schemes (chosen GAF)
Zzz
Active nodes
Zzz
Zzz
Zzz
STEM
IV-36
Performance of STEM+GAF
GAF
STEM + GAF
STEMalone
avg # of neighbors
0EE
msTS 900=msTS 675=msTS 400=
IV-37
Conclusions
• STEM leverages data latency for energy savings
• Combining existing algorithms that leverage densityand STEM yields additional energy savings
• Provide with design time tools for extending the lifetime of the network
IV-38
Adaptive Topology: Future Work
• Load Balancing.• Larger scale experiments.• Interaction with adaptive MAC and geographic
routing• Application defined Adaptive Fidelity
– Expanding on STEM’s data driven characteristics to achieve more than on/off behavior
IV-39
Routing
• Given a topology, how to route data?– MANET: Reactive[DSR], proactive[AODV], TORA,
GPSR[KarpKung00]– Location-aided routing: Geocast[Navas97],Cartesian-LAR,
[KOVaidya98]
– Building on Geo Routing• GRAB (Lu et al 2002)• Routing on curve (Badri 2002)
IV-40
GRAB:Field Based Minimum Cost Forwarding
(Lu et al 2002)
• Each node broadcasts only once• Cost Function
– A measure of how expensive it is to get a message back to the sink.
– Could be base:• Energy needed in radio communication.• Hop count.• …
• Node Cost– Each node keeps a best estimate on its minimum cost.– Estimate updated upon receipt of every ADV message.– ADV message forwarding deferred for time proportional to
nodes cost estimate.
IV-41
ADV Dissemination Example
• Signal strength is used to measure cost.
• B sees strong signal and judges cost to be 1.
• C sees weak signal and judges cost to be 3.
IV-42
ADV Dissemination Example contd.
• Because B has a smaller cost, he defers for a shorter time then C.
• C updates his cost to 2 and restarts his deferral timer.
• Each node has optimal cost with minimum broadcast.
IV-43
Data Dissemination
• A node that decides it has interesting data. broadcasts two things (besides data)– Total budget to get back to sink.– Amount of budget used in initial broadcast.
• A node receiving a data message will only forward a data message if
Total Budget ≥ Budget Spent So Far + My Cost– If the inequality holds then Budget Spent So Far is updated.– Otherwise the message is dropped.
IV-44
Data Dissemination Example
• Assume hop count was used as a cost metric.
• Node A is the sink.• Node C is the source.
IV-45
Data Dissemination Example contd.
• Node C sends a data message which specifies– Total Budget = 2– Budget Spent = 1
• Node E drops message– TB < BS + E’s Cost
• Node B forwards message.
IV-46
Routing on a Curve(Nath et al 2002)
• Trajectories are a natural name space for embedded networks
• By definition, network structure mimics physical structure that is instrumented– Stress along a column– Flooding along a river– Pollution along a road
• Trajectories come from the application domain
IV-47
TBF (Trajectory based forwarding)
• Fundamental Idea– Route packets along a specified trajectory
• Generalization of Source Based Routing and Cartesian routing
• Trajectory specified in the packet
IV-48
Specifying trajectory
• Function • Equation • Parametric
– we use this choice
xybaxy sin, =+=
1,1 22 ++=++ yxcbyax trytrxbatytx cos,sin;, ==+==
IV-49
Features of TBF
• Basic Features– Decouples pathname from the actual path
• Source based Routing (LSR, DSR etc) mixes naming and route path
• Applications:– Route around obstacles/changes/failures
– Trajectory forwarding need not have a “destination”• Route along a line, pattern• Applications:
– Flooding, discovery, group communication (pollination)
IV-50
Routing on a curve
IV-51
Spoke flooding
IV-52
Flooding along spokes
0102030405060708090
100
4 9
Number of Spokes
CoveragePackets
IV-53
Related Work
• MANET– Reactive[DSR], proactive[AODV], TORA,
GPSR[KarpKung00]
• Location-aided routing– Geocast[Navas97],Cartesian-LAR, [KOVaidya98]
• Localization techniques– GPS, Cricket[Priyantha00],RADAR[Bahl00],– Ahlos[bulusu00,Savides01],TdoA technique, Visual cues
IV-54
Discussion
• Trajectory modifications• No forward progress
– Retry from source– Local reverse gear– Apply detour patches– Packets allowed to backtrack
• Specifying trajectories– Accurate trajectories– Use fourier components– Tradeoff between number of components and
accuracy
IV-55
In-Network ProcessingThe Key to Sensor Network scalability and Realization
• Gupta and Kumar pointed out fundamental limits of large scale wireless networks (per node throughput O(1/sqrtN)
• However, S. Servetto shows that result holds only for independent nodes (Mobicom 2002)– Densely deployed sensor network data will be
correlated and can be aggregated• Scalability and lifetime will depend on techniques
for in-network processing of data– Naming Data: Directed Diffusion– Data base perspectives:TAG, Sylph– Programming mechanisms: Sensorware, Mate
IV-56
Directed Diffusion: Data Centric Routing
• Basic idea– name data (not nodes) with externally relevant attributes
• Data type, time, location of node, SNR, etc– diffuse requests and responses across network using
application driven routing (e.g., geo sensitive or not)– optimize path with gradient-based feedback– support in-network aggregation and processing
• Data sources publish data, Data clients subscribe to data– However, all nodes may play both roles
• A node that aggregates/combines/processes incoming sensor node data becomes a source of new data
• A sensor node that only publishes when a combination of conditions arise, is a client for the triggering event data
– True peer to peer system
• Implementation defines namespace and simple matching ruleswith filters– Linux (32 bit proc) and TinyOS (8 bit proc) implementations
IV-57
Diffusion as a construct for in-network processing
• Nodes pull, push, and store named data (using tuple space) to create efficient processing points in the network– e.g. duplicate suppression, aggregation, correlation
• Nested queries reduce overhead relative to “edge processing”• Complex queries support
collaborative signal processing– propagate function
describing desired locations/nodes/data (e.g. ellipse for tracking)
– Interesting analogs to emergingpeer-to-peer architectures
• Build on a data-centric architecturefor queries and storage
IV-58
Nested Query Evaluation(example experiment w/sub-optimal hardware)
(Heidemann et al.)
• Nested queries greatly improve event delivery rate
• Specific results depend on experiment– placement– limited quality MAC
• General result: app-level info needed in sensor nets; diffusion is good platform ev
ents
suc
cess
fully
rece
ived
(%)
number of light sensors
flat
nested
60
1 2 3 4
80
40
20
IV-59
A more general look at A more general look at Data Centric vs. Address Centric Data Centric vs. Address Centric
approachapproach((KrishnamachariKrishnamachari et al.)et al.)
• Address Centric• Distinct paths from each source to sink.
• Data Centric• Support aggregation in the network where paths/trees overlap• Essential difference from traditional IP networking
• Building efficient trees for Data centric model• Aggregation tree: On a general graph if k nodes are sources and one is a sink, the aggregation tree that minimizes the number of transmissions is the minimum Steiner tree. NP-complete….Approximations:
• Center at Nearest Source (CNSDC): All sources send through source nearest to the sink.• Shortest Path Tree (SPTDC): Merge paths.• Greedy Incremental Tree (GITDC): Start with path from sink to nearest source. Successively add next nearest source to the existing tree.
IV-60
Source placement: eventSource placement: event--radius modelradius model
IV-61
Comparison of energy costsComparison of energy costs
Address CentricShortest path data centricGreedy tree data centricNearest source data centricLower Bound
Data centric has many fewer transmissions than does Address Centric; independent on the tree building algorithm.
Opportunism always pays;Greed pays only when things get very crowded
(Intanagowiwat et al. ns-2 more detailed simulations)
IV-63
Programming Paradigm
• Move beyond simple query with in-network aggregation model
• How do we task a 1000+ node dynamic sensor network to conduct complex, long-lived queries and tasks ??– What constructs does the query language need to support?
• What sorts of mechanisms need to be “running in the background” in order to make tasking efficient?– Small databases scattered throughout the network, actively
collecting data of nearby nodes, as well as accepting messages from further away nodes?
– Active messages traveling the network to both train the network and identify anomalous conditions?
IV-64
Examples• Map isotherms and other “contours”, gradients, regions
– Record images wherever acoustic signatures indicate significantly above-average species activity, and return with data on soil and air temperature and chemistry in vicinity of activity.
– Mobilize robotic sample collector to region where soil chemistry and air chemistry have followed a particular temporal pattern and where the region presents different data than neighboring regions.
• Raises requirements for some global context, e.g. “average” levels– Emerging role for distributed storage architecture
• Pattern identification: how much can and should we do in a distributed manner?– Similar to some vision/image analysis problems but distributed
noisy inputs
IV-65
Why databases?
• Sensor networks are capable of producing massive amounts of data – Sensor networks should be able to
• Accept queries for data• Respond with results
– Users will need • An abstraction that guarantees reliable results• Largely autonomous, long lived network
• Efficient organization of nodes and data will extend network lifetime– Database techniques already exist for efficient data storage
and access
IV-66
• Query processing– Storage structures:
• Less choice here … traditionally, data organized on disk• Here, data originates at the sensors’ location
– Time coherency across space– “Stream” processing … one pass over data
• E.g., try to estimate a histogram on the fly.
– Quality of service:• Latency vs. completeness• Actuators: e.g., camera pan and zoom
– Tradeoffs in communication vs. computation and storage: e.g., filter-refine: where to do what.
Database-centric Approach(Muntz et al.)
IV-67
Differences between databases and sensor networks
• Database– Static data– Centralized– Failure is not an option– Plentiful resources– Administrated
• Sensor Network– Streaming data– Large number of nodes– Multi-hop network– No global knowledge about
the network– Frequent node failure– Energy is the scarce– Resource, limited memory– Autonomous
IV-68
Bridging the Gap
• What is needed to be able to treat a sensor network like a database?– How should sensors be modeled?– How should queries be formulated?
IV-69
Traditional Approach: Warehousing
• Data is extracted from sensors and stored on a front-end server• Query processing takes place on the front-end.
Warehouse
Front-end
Sensor Nodes
IV-70
What We’d Like to Do:Sensor Database System
• Sensor Database System supports distributed query processing over a sensor network
SensorDB
SensorDB
SensorDB
SensorDB Sensor
DB
SensorDB
SensorDB
SensorDB
Front-end
Sensor Nodes
IV-71
Sensor Database System
• Characteristics of a Sensor Network:
– Streams of data– Uncertain data– Large number of nodes– Multi-hop network– No global knowledge
about the network– Node failure and
interference is common– Energy is the scarce
resource– Lmited memory– No administration, …
• Can existing database techniques be reused? What are the new problems and solutions?– Representing sensor data– Representing sensor
queries– Processing query fragments
on sensor nodes– Distributing query fragments– Adapting to changing
network conditions– Dealing with site and
communication failures– Deploying and Managing a
sensor database system
IV-72
Performance Metrics
• High accuracy– Distance between ideal answer and actual answer?– Ratio of sensors participating in answer?
• Low latency– Time between data is generated on sensors and answer is
returned• Limited resource usage
– Energy consumption
IV-73
Representing Sensor Data and Sensor Queries
• Sensor Data:– Output of signal processing functions
• Time Stamped values produced over a given duration
– Inherently distributed• Sensor Queries
– Conditions on time and space• Location dependent queries• Constraints on time stamps or aggregates over time windows
– Event notification
IV-74
Early Work in Sensor Databases
• “Towards Sensor Database System”“Querying the Physical World”– Phillipe Bonnet, Johannes Gehrke, Praveen Seshdri
IV-75
Fjording the Stream: An Architecture for Queries over Streaming Sensor Data
(Hellerstein et al.)
• How can existing database querying methods be applied to streaming data?
• How can we combine real-time sensor data with stored historical data?
• What architecture is appropriate for supporting simultaneous queries?
• How can we lower sensor power consumption, while still supporting a wide range of query types?
IV-76
Traditional Database Operators
• Are implemented using “pull” mechanisms.• Block on incoming data.• Most require all the data to be read first. (i.e. sort,
average…)• Optimized for classic IO.• Usually implemented as separate threads.
IV-77
Hardware Architecture
• Centralized data processing.
• Sensor proxies read and configure sensors.
• Query processor interacts with proxies to request and get sensor data.
• Sensor proxies support multiple simultaneous queries, multiplexing the data.
IV-78
Operators
• Implemented as state machines.• Support transition(state) method, which causes the operator to
optionally read from input queues, write to output queue, and change state.
• Multiple operators per thread, called by a scheduler. (Round robin in the experiments)
• Allows fine-grained tuning of processing time allocated to each operator.
IV-79
Sensor Sensitive Operators
• Certain operations are impossible to perform on continuous data streams. (sum, average, sort…)
• Can be performed on partial data windows.• Joins can be implemented by hashing tuples.• Can provide aggregation based on current data, with continuous
updates to parent operators.??
IV-80
Sensor Proxy
• Responsible for configuring sensor that belong to it, setting sensing frequency, aggregation policies, etc..
• To save power, each sensor only listens for commands from proxy during short intervals.
• Handles incoming data from sensors, and pushes it into appropriate queues.
• Stores a copy to disk for historical queries.• Performs data filtering, which it can sometimes offload to the
sensors.
IV-81
Building a Fjord
• For all sensor data sources, locate the proxy for the sensor, and install a query on it to deliver tuples at a certain rate to a push queue.
• For non-sensor data sources, set up a pull queue to scan for data.
• Pipe the data through the operators specified by the query.
IV-82
Query
• Find average car speeds during time window (w), for all segments the user is interested in (knownSegments)
• More complicated queries are possible, with joins of streaming sensor data and historical data stored in a normal database fashion.
IV-83
Dataflow for the Query
• Data is pushed from the sensors to the user, through the filter operator set up by the query.
• Multiple similar queries can be added to an existing fjord, instead of creating one per query.
IV-84
Conclusion
• Fjords allow sensors to be treated as database sources for querying, with little change in the overall architecture.
• Proxies can optimize energy consumption of individual sensors based on user queries, and multiplex data from sensors to multiple queries.
• Processing is centralized, but can sometimes be offloaded to the sensors, to lower energy consumed by radio transmissions.
IV-85
The Sylph Middleware
• Database mechanisms packaged as middleware services
• Modular, Layered Design– Sensor module– Service discovery module– Proxy core
• Query LanguageREAD temperature EVERY 30 seconds FOR 1 hour
• Sensor Stream Processing
IV-86
IV-87
Sensor Module
• Strong Abstraction Barrier• Sensor Description
– Derived from IEEE 1451 Transducer Electronic Data Sheet (TEDS)
– Metadata - name, manufacturer, etc.– Attributes - device status, sampling interval, etc.– Data – available data, return types
• Attribute-Value Interface– SET SamplingPeriod = 1 second– GET DeviceStatus
IV-88
Service Discovery Module
• Device Proxy• Common Service Discovery Mechanisms (e.g., Jini)• Soft-State Reliability
IV-89
Proxy Core
• Device Manager• Sensor Proxy
– Service discovery– Layered functionality (e.g., buffering)
• Query Processor– Query parsing– Directed graph of stream operators– Query optimization
• Query Distribution - Gateways
IV-90
Accommodating in-network Aggregation
• Fjords and Sylph did not explicitly address in-network aggregation/processing
• “Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks” (Madden et al. 2002)
• Explores aggregation techniques that are application independent
• Count• Min• Max• Sum• Average
IV-91
At A Glance …
• Trying to minimize the number of messages sent
• All aggregation is done by building a tree
IV-92
Tricks of the Trade …
• How do you ensure an aggregate is correct?– Compute it multiple times.
• How do you reduce the message overhead of redistributing queries?– Piggy back the query along with data messages.
• Is there anyway to further reduce the messaging overhead?– Child nodes only report their aggregates if they’ve changed.
• Nodes can take advantage of multiple parents for redundancy reasons.
IV-93
The Tiny Aggregation (TAG) Approach(S. Madden, UCB)
• Push declarative queries into network– Impose a hierarchical routing tree onto the network
• Divide time into epochs• Every epoch, sensors evaluate query over local
sensor data and data from children– Aggregate local and child data– Each node transmits just once per epoch– Pipelined approach increases throughput
• Depending on aggregate function, various optimizations can be applied
IV-94
SQL Primer(Madden)
• SQL is an established declarative language; not wedded to it– Some extensions clearly necessary, e.g. for sample rates
• We adopt a basic subset:
• ‘sensors’ relation (table) has– One column for each reading-type, or attribute– One row for each externalized value
• May represent an aggregation of several individual readings
SELECT {aggn(attrn), attrs}FROM sensorsWHERE {selPreds}GROUP BY {attrs}HAVING {havingPreds}EPOCH DURATION s
SELECT AVG(light) FROM sensors WHERE sound < 100GROUP BY roomNoHAVING AVG(light) < 50
IV-95
Aggregation Functions(Madden)
• Standard SQL supports “the basic 5”:– MIN, MAX, SUM, AVERAGE, and COUNT
• We support any function conforming to:
Aggn={fmerge, finit, fevaluate}
Fmerge{<a1>,<a2>} → <a12>
finit{a0} → <a0>
Fevaluate{<a1>} → aggregate value
(Merge associative, commutative!)
Example: Average
AVGmerge {<S1, C1>, <S2, C2>} → < S1 + S2 , C1 + C2>
AVGinit{v} → <v,1>
AVGevaluate{<S1, C1>} → S1/C1
Partial Aggregate
IV-96
Query Propagation(Madden)
• TAG propagation agnostic– Any algorithm that can:
• Deliver the query to all sensors• Provide all sensors with one or
more duplicate free routes to some root
• Paper describes simple flooding approach– Query introduced at a root;
rebroadcast by all sensors until it reaches leaves
– Sensors pick parent and level when they hear query
– Reselect parent after k silent epochs
Query
P:0, L:1
2
1
5
3
4
6
P:1, L:2 P:1, L:2
P:3, L:3
P:2, L:3
P:4, L:4
IV-97
TAG Discussion(Madden)
• Result is a stream of values– Ideal for monitoring scenarios
• One communication / node / epoch– Symmetric power consumption, even at root
• New value on every epoch– After d-1 epochs, complete aggregation
• Given a single loss, network will recover after at most d-1 epochs
• With time synchronization, nodes can sleep between epochs, except during small communication window
1
2 3
4
5
IV-98
Simulation Result(Madden, OSDI ‘02)
Total Bytes Xmitted vs. Aggregation Function
0
1000020000
3000040000
50000
6000070000
8000090000
100000
EXTERNAL MAX AVERAGE COUNT MEDIANAggregation Function
Tota
l Byt
es X
mitt
ed
Simulation Results
2500 Nodes
50x50 Grid
Depth = ~10
Neighbors = ~20 Some aggregates require dramatically more state!
IV-99
Taxonomy of Aggregates(Madden)
• TAG insight: classifying aggregates according to various functional properties– Yields a general set of optimizations that can
automatically be applied
Hypothesis Testing, SnoopingCOUNT : monotonicAVG : non-monotonic
Monotonic
Applicability of Sampling, Effect of Loss
MAX : exemplaryCOUNT: summary
Exemplary vs. Summary
Routing RedundancyMIN : dup. insensitive,AVG : dup. sensitive
Duplicate Sensitivity
Effectiveness of TAGMEDIAN : unbounded, MAX : 1 record
Partial State
AffectsExamplesProperty
IV-100
SensorWare: a Sensor Middleware for Distributed Applications
(Srivastava et al 2002)
• Current approaches: sensor nodes viewed as tiny database servers– a user query is answered by aggregating results– forms of queries:
• Do you currently see X?• Did you see X in the past timer T?• Inform me when you see X.• Tell me periodically what you see.
• Problem: remote user in the interactive loop– leads to higher latency for tracking– also, excessive traffic and therefore wasted energy
• Mobile scripts naturally result in an information-driven strategy
IV-101
SensorWare
• Mobile scripts (agents) injected as queries & tasks for application-defined processing at the nodes– replicate and migrate according to application logic and the state of
the physical world– executed in a sand-box at each node
• Suite of core services for sensing, processing, communication, and actuation available at each node
• Features– Scripts use application-specific dissemination strategies layered on
top of a thin and simple protocol stack• no one routing hard coded
– Autonomous coordination in the sensor network• Avoids the “user-in-the-loop” model inherent in the usual database
query-response model
– Task centric programming model• Avoids thinking in terms of individual nodes
IV-102
SensorWare Architecture
Middleware
OS
Hardware
HW abstractionlayer
Scripts Apps,Services
Scriptmigration
Node 2
Middleware
OS
Hardware
HW abstractionlayer
ScriptsApps,Services
Message exchanging
External user can inject agent script
Node 1
Middleware
APIsRuntime environment
Mobility NetworkingSensinginterpreterScript State
keeping
Energy-based Admission control & resource management
NetworkingSensor
AbstractionTimers &
CPU control
Resource handling tasks
IV-103
Implementation on iPaq-based Sensor Nodes & Sensoria WINS
Magnetic Sensor- uC Based with RS232- Range of +/- 2Gausus-Adjustable Sampling Rate- X, Y, Z output- Device ID Management
WaveLan Card-IEEE 802.11b Compliant- 11 Mbit/s Data Rate
Familiar v0.5- Linux Based OS for iPAQ H3600s- JFFS2, read/write iPAQ’s flush- Tcl ported
iPAQ 3670-Intel StrongARM- Power Management (normal, idle & sleep mode)- Programmable System Clock- IR, USB, Serial (RS232) Transmission
IV-104
System Operation Overview
User Node
Script Migration
Script Migration
Script Migration
Script Code injecting
Mobile Script
- Light weight
- Programmable
- Code Mobility
- Tcl front end
IV-105
Event-driven Model
• Agents express the event they are interested in• Sensor event
– event -data <name> -sample_period <value> {-buffer_size <value>} {-wakeup_period <value>} {-threshold <value>}
• Timer event– event -timer <name><value>
• Network event – event -net <name> <start> <reg exp>– agent can generate network event
• Timer and data event generated by the system• send <node.user.family.instance> <msg>
• wait {-net <events> ...} {-data <events> ...} {-timer <events> ...}
IV-106
Mobile Script Model
• spawn <node> [-s|<code>] {<bind> ...}
• Binding variables– Pass-by-name– Encourage agent reusability– Support agent modulation
• Embedded Agents– Agents that are being carried by other ones– Two methods
• Static– Prone to errors– Cannot reuse: Mission specific– High memory usage: Carrier has to stay alive!
• Dynamic– High modulation
IV-107
Tracking: Mobile Script Flooding
Sensor Node
User Node
Video Node
Monitoring Target
“Tracking Script Code” injecting
IV-108
Tracking: Event Sensing
Sensor Node
User Node
Video Node
Monitoring Target
Event Sensing
Event Sensing
Event Sensing
Event Sensing
Event Sensing
Event Sensing
Event Sensing
Event Sensing
IV-109
Tracking: Mobile Script Activation
Tracking Code Activated
Sensor Node
User Node
Video Node
Target
Tracking Code Activated
IV-110
Tracking: Position Notification and Code Migration
Position InformationTracking
Sensor Node
User Node
Video Node
Target
Position Information
TrackingScript Migration
Script Migration
Monitoring
IV-111
Tracking: Position Notification
Tracking
Sensor Node
User Node
Video Node
Target
Position Information
TrackingScript Migration
Script Migration
Monitoring
Tracking
IV-112
Making this work on much smaller nodes
• UCB Mate project (Levis et al 2002) – Stack-based bytecode interpreter– Self-forwarding code– Rapid reprogramming– Message receive and send
contexts– TinyOS component, 7286 bytes
code, 603 bytes RAM, three concurrent execution contexts, Code broken into 24 instruction capsules
Network Programming Rate
0%
20%
40%
60%
80%
100%
0 20 40 60 80 100 120 140 160 180 200 220 240
Time (seconds)
Per
cent
P
rog
ram
med
IV-113
Moving Forward: Storage, Representation, Distributed processing
• Spatio-temporal Queries– Index data for easy temporal and spatial
searching
• Interpretation of Spatially Distributed Data– Provide ability to interpret data at different spatial
and temporal extents. Per-node processing alone is not enough
• Pattern-Triggered Data Collection– Should enable multi-resolution data storage and
retrieval for data collection.• Data Centric Protocols and In network
Processing– Network does in-network processing of data
based on distribution of data– When data changes, portion of the network
updates understanding of the data– When user queries, it travels to node that
maintains relevant summarized data
K V
K V
K V
K V
K V
K V
K VK V
K V
K VK V
IV-114
How should nodes summarize data?(Bien et al)
• Histograms are a useful way to summarize data
• Queries for statistics
• Show degree of data variation in an area
• Queries for maps
• Geographically show data variation
• Provide approximation of map
Sensor Value
Num
ber
of N
odes
with
V
alue
YLo
catio
n
X Location
Sen
sor
Val
ue
IV-115
DIMENSIONSTM: A Multi-Resolution Storage Architectures for Sensor Networks (Ganesan et al)
increasingspatial
resolution
increasingtemporalresolutionTi
me
n Per-Node Temporal Data Processingn Progressive encoding : more lossy
storage of older datan Design Principle: System
guarantees lossless multi-resolution data collection within time T of event occurrence. Older data is stored ONLY for long-term query processing, which can tolerate greater loss
n Spatial Data Processingn Nodes in the network form a logical
spatial hiearchy.n At each step of the hierarchy,
further summarization of the data takes place.
IV-116
Spatial Data Processing
• For level=1 to N– Cluster-Heads (i-1) send data to Cluster-Head (i)– ClusterHead(i) aggregates data from 4 lower level clusterheads,
locally stores the deltas, and sends resulting coefficients to ClusterHead(i+1)
• To a first order approximation, size of coefficients transmitted at each level is constant.
LocalStorage
2D WaveletTransform on
Tile
HuffmanEncoding
HuffmanDecoding
Deltas
Send coefficients to level l+1 cluster-heads
Get coefficients fromlevel l-1 cluster-heads
IV-117
Content Addressable Storage…aka Data Centric Storage
(Ramiswamy et al)
• Queries for some things (like local maxima) are not necessarily spatially correlated.
• Sensor networks require a storage scheme that indexes on semantics as well as on the location of data.
IV-118
The Semantic Hierarchy(Greenstein et al)
8 < Temperature ≤ 8
7 < Temperature ≤ 8
1 < Temperature ≤ 8
4 < Temperature ≤ 8• The wider the
geographical extent a node indexes, the more data there is to index.
• Given that each node has limited storage, tighter semantic constraints must be used for nodes that index wider areas.
IV-119
Event
Index Construction
AttributeName = Temperature
Value = 8Min = 1Max = 8
Distribution = Uniform
AttributesName = EventLatitude
Value = 171
Hash Function
AttributesName = EventLongitude
Value = 120
(120, 171)(0111 1000, 1010 1011)
Level1 of 4
Coordinates
“Temperature 1 8”Hash String
(121, 169)(0111 1001, 1010 1001)
Coordinates
Indexes contain histograms summarizing events within their value and spatial ranges
8 < Temperature ≤ 8
7 < Temperature ≤ 8
1 < Temperature ≤ 8
4 < Temperature ≤ 8
IV-120
Search
8 < Temperature ≤ 8
7 < Temperature ≤ 8
1 < Temperature ≤ 8
4 < Temperature ≤ 8
etc.
• Search most constrained index nodes for region of interest
• Search travels from constrained semantics to constrained geography (location of an event)
• Distributions are generated as part of the index construction
Node Histogram
Region 008 0
Region 018 0
Region 108 2
Region 118 0
< 8 -- 752> 8 -- 0
Level 4
Node Histogram
Region 007 528 2
Region 017 128 0
Region 107 158 0
Region 117 08 0
< 7 -- 700> 8 -- 0
Level 3
Node Histogram
Region 005 -- 1006 -- 807 -- 268 -- 1
Region 105 -- 206 -- 207 -- 98 -- 0
Region 005 -- 306 -- 207 -- 138 -- 1
Region 105 --156 -- 157 -- 48 -- 1
< 5 -- 400> 8 -- 0
Level 2