IV-3

Medium Access Control in Sensor Nets

• Important attributes of MAC protocols

1. Collision avoidance
2. Energy efficiency
3. Scalability in node density
4. Latency
5. Fairness
6. Throughput
7. Bandwidth utilization

IV-4

• Major sources of energy waste
  – Idle listening when there are no sensing events
  – Collisions
  – Control overhead
  – Overhearing

MAC Impact on Sensor Networks (Intanagonwiwat et al., 2000)

[Figure: two plots of average dissipated energy (Joules/Node/Received Event) versus network size (0 to 300) for Diffusion, Omniscient Multicast, and Flooding. First panel: over an 802.11-like MAC (energy axis 0 to 0.14). Second panel: over an energy-aware MAC (energy axis 0 to 0.018).]

IV-5

Identifying the Energy Consumers

• Need to shut down the radio

[Figure: power consumption of node subsystems (mW, axis 0 to 20) for the sensors, the CPU, and the radio in TX, RX, IDLE, and SLEEP modes. From Tsiatsis et al. 2002.]

$E_{TX} \approx E_{RX} \approx E_{IDLE} \gg E_{SLEEP}$

IV-6

Energy Efficiency in MAC

• Major sources of energy waste
  – Idle listening
    • Long idle time when no sensing event happens
  – Collisions
  – Control overhead
  – Overhearing
  – (Collisions, control overhead, and overhearing are common to all wireless networks)
• Try to reduce energy consumption from all of the above sources
• TDMA requires slot allocation and time synchronization
• Combine the benefits of TDMA and contention protocols

IV-7

Sensor-MAC (S-MAC) Design (Ye et al. 2002)

• Tradeoffs: latency and fairness are traded for energy
• Major components of S-MAC
  – Periodic listen and sleep
  – Collision avoidance
  – Overhearing avoidance
  – Message passing

IV-8

Periodic Listen and Sleep

• Problem: Idle listening consumes significant energy
  – Nodes do not sleep in IEEE 802.11 ad hoc mode
• Solution: Periodic listen and sleep
  – Turn off the radio when sleeping
  – Reduce duty cycle to ~10% (200 ms on / 2 s off)
  – Increased latency in exchange for reduced energy

[Figure: timeline alternating listen and sleep periods.]
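The duty-cycle figure above can be checked with a few lines of arithmetic. This is a minimal sketch: the 200 ms / 2 s split comes from the slide, while the function names and the listen/sleep power values are illustrative assumptions, not measurements from the S-MAC work.

```python
def duty_cycle(on_s: float, off_s: float) -> float:
    """Fraction of each period during which the radio is listening."""
    return on_s / (on_s + off_s)


def average_power_mw(on_s: float, off_s: float,
                     listen_mw: float, sleep_mw: float) -> float:
    """Time-weighted average radio power over one listen/sleep period."""
    d = duty_cycle(on_s, off_s)
    return d * listen_mw + (1.0 - d) * sleep_mw


# 200 ms on / 2 s off, as on the slide.
d = duty_cycle(0.2, 2.0)
# Hypothetical power draw: 15 mW while listening, 0 mW while asleep.
p = average_power_mw(0.2, 2.0, listen_mw=15.0, sleep_mw=0.0)
print(f"duty cycle = {d:.1%}, average power = {p:.2f} mW")
```

With these numbers the radio is on for roughly 9% of the time, which is the "~10%" quoted on the slide.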

IV-9


• Schedules can differ

• Preferable if neighboring nodes have same schedule

— easy broadcast & low control overhead

• Border nodes follow two schedules and broadcast twice

[Figure: Node 1 and Node 2, with regions labeled Schedule 1 and Schedule 2.]

IV-10


• Schedule maintenance
  – Remember neighbors' schedules, to know when to send to them
  – Each node broadcasts its schedule every few periods
  – Refresh a neighbor's schedule when receiving an update
  – Schedule packets also serve as beacons for new nodes to join a neighborhood

IV-11

Collision Avoidance

• Problem: Multiple senders want to talk
• Options: Contention vs. TDMA
• Solution: Similar to IEEE 802.11 ad hoc mode (DCF)
  – Physical and virtual carrier sense
  – Randomized backoff time
  – RTS/CTS for the hidden terminal problem
  – RTS/CTS/DATA/ACK sequence

IV-12

Overhearing Avoidance

• Problem: Receiving packets destined for others
• Solution: Sleep when neighbors talk
  – Basic idea from PAMAS (Singh 1998)
  – But we only use in-channel signaling
• Who should sleep?
  – All immediate neighbors of the sender and receiver
• How long to sleep?
  – The duration field in each packet tells other nodes how long to sleep
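The overhearing-avoidance rule can be sketched as follows. This is an illustration under stated assumptions, not the S-MAC implementation: the `Packet` and `Node` classes, their field names, and the millisecond clock are all hypothetical. A node that overhears a packet addressed to someone else sleeps until the time announced in the duration field has passed.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    kind: str         # "RTS", "CTS", "DATA", or "ACK"
    src: int
    dst: int
    duration_ms: int  # remaining time the medium is reserved


class Node:
    def __init__(self, node_id: int):
        self.node_id = node_id
        self.sleep_until_ms = 0

    def on_overhear(self, pkt: Packet, now_ms: int) -> bool:
        """Return True if this node goes to sleep after hearing pkt."""
        if self.node_id in (pkt.src, pkt.dst):
            return False  # we are part of the exchange, so stay awake
        # Sleep until the announced reservation ends (keep the later deadline).
        self.sleep_until_ms = max(self.sleep_until_ms, now_ms + pkt.duration_ms)
        return True

    def is_asleep(self, now_ms: int) -> bool:
        return now_ms < self.sleep_until_ms


bystander = Node(3)
rts = Packet("RTS", src=1, dst=2, duration_ms=50)
bystander.on_overhear(rts, now_ms=100)
print(bystander.is_asleep(120), bystander.is_asleep(150))
```

Both the sender's and the receiver's neighbors hear at least one packet of the RTS/CTS/DATA/ACK sequence, so the same rule covers both groups.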

IV-13

Message Passing

• Problem: In-network processing requires the entire message
• Solution: Don't interleave different messages
  – A long message is fragmented and sent in a burst
  – RTS/CTS reserve the medium for the entire message
  – Fragment-level error recovery: extend Tx time and retransmit immediately
• Other nodes sleep for the whole message time
• Tradeoff: fairness is traded for energy and message-level latency

IV-14

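Msg Passing vs. 802.11 fragmentation

• Message passing (MP): time reservation by the duration field
  – The duration field counts down the time left for the whole message
  [Figure: RTS 21, CTS 20, Data 19, ACK 18, Data 17, ACK 16, ..., Data 1, ACK 0]
• 802.11 fragmentation
  – If an ACK is not received, the sender gives up the transmission (fairness)
  – No indication of the entire message time, so other nodes keep listening
  [Figure: RTS 3, CTS 2, Data 3, ACK 2, Data 3, ACK 2, ..., Data 1, ACK 0]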

IV-15

Implementation on Testbed Nodes

• Platform
  – Motes (UC Berkeley)
    • 8-bit CPU at 4 MHz
    • 8 KB flash, 512 B RAM
  – TinyOS: event-driven
  – Also used as NIC for 32-bit embedded PCs
• Compared MAC modules
  1. IEEE 802.11-like protocol
  2. Message passing with overhearing avoidance
  3. S-MAC (2 + periodic listen/sleep)
• URL: http://www.isi.edu/scadds/smac/

IV-16

S-MAC Experimental Results (implemented on UCB Mote over RFM radio)

• Topology and measured energy consumption on source nodes
  – Two sources (Source 1, Source 2) and two sinks (Sink 1, Sink 2)
• Each source node sends 10 messages
  – Each message has 10 fragments x 40 B
• Measure total energy
  – Data + control + idle

[Figure: average energy consumption in the source nodes (mJ, axis 200 to 1800) versus message inter-arrival period (0 to 10 seconds) for the 802.11-like protocol, overhearing avoidance, and S-MAC.]

IV-17

Adaptive Topology

• Can we do more than shut down radio in between transmissions/receptions?

• Can we put nodes to sleep for longer periods of time?

• Goal:
  – Exploit high-density (over-)deployment to extend system lifetime
  – Provide a topology that adapts to the application's needs
  – Self-configuring system that adapts to the environment without manual configuration

IV-18

Adaptive Topology: Problem Description

• Simple formulation (geometric disk covering)
  – Given a distribution of N nodes in a plane.
  – Place a minimum number of disks of radius r (centered on the nodes) to cover them.
  – A disk represents the radio connectivity (simple circle model).

• The problem is NP-hard.
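Because the exact problem is NP-hard, a heuristic is the practical route. The sketch below uses a generic greedy set-cover rule (repeatedly pick the node whose disk covers the most still-uncovered nodes). It is not taken from any of the schemes in these slides, and the function name and sample coordinates are assumptions for illustration.

```python
import math


def greedy_disk_cover(points, r):
    """Greedy heuristic: return indices of nodes chosen as disk centers.

    Every node ends up within distance r of some chosen center. The result
    is a valid cover but not necessarily a minimum one.
    """
    uncovered = set(range(len(points)))
    centers = []
    while uncovered:
        # Pick the node whose disk covers the most uncovered nodes.
        best = max(
            range(len(points)),
            key=lambda c: sum(
                1 for i in uncovered if math.dist(points[c], points[i]) <= r
            ),
        )
        centers.append(best)
        uncovered -= {
            i for i in uncovered if math.dist(points[best], points[i]) <= r
        }
    return centers


nodes = [(0, 0), (1, 0), (2, 0), (10, 0)]
print(greedy_disk_cover(nodes, r=1.0))
```

Each node covers itself, so the loop always makes progress and terminates.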

IV-19

Connectivity Measurements*

Packet reception over distance has a heavy tail. There is a non-zero probability of receiving packets at distances much greater than the average cell range

169 motes, 13x13 grid, 2 ft spacing, open area, RFM radio, simple CSMA

Can’t just determine connectivity clusters through geographic coordinates…

For the same reason you can’t determine coordinates w/connectivity

*An Empirical Study of Epidemic Algorithms in Large Scale Multihop Wireless Networks. Ganesan, Krishnamachari, Woo, Culler, Estrin and Wicker, UCLA/CSD-TR 02-0013.

IV-20

Tradeoff

• How many nodes to activate?
  – Too few active nodes:
    • the distance between neighboring nodes is high, which increases packet loss, requires higher transmit power, and reduces spatial reuse;
    • need to maintain sensing coverage (see earlier session on coverage/exposure)
  – Too many active nodes:
    • at best, expending unnecessary energy;
    • at worst, nodes may interfere with one another by congesting the channel.

IV-21

Adaptive Topology Schemes

• Mechanisms being explored:
  – Empirical adaptation: each node assesses its connectivity and adapts its participation in the multi-hop topology based on the measured operating region: ASCENT (Cerpa et al. 2002)
  – Cluster-based, with load sharing within clusters: CEC (Xu et al. 2002)
  – Routing/geographic topology based, eliminating redundant links: SPAN (Chen et al. 2001), GAF (Xu et al. 2001)
  – Data/traffic driven: trigger nodes on demand using a paging channel: STEM (Tsiatsis et al. 2002)

IV-22

One example algorithm: ASCENT

• Nodes can be in the active or passive state.
  – Active nodes forward data packets (using a routing mechanism that runs over the topology).
  – Passive nodes do not forward any packets but may sleep or collect network measurements.
• Each node joins the network topology or sleeps according to the number of neighbors and the packet loss, both measured locally.

[Figure: three panels showing a source and a sink: (a) communication hole, (b) self-configuration transition, (c) final state. Labels: help messages, neighbor announcement messages, data messages. Legend: active neighbor, passive neighbor.]

IV-23

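State Transitions

[Figure: state diagram with states Test, Active, Passive, and Sleep.]

• Test → Active: after Tt
• Passive → Sleep: after Tp
• Sleep → Passive: after Ts
• Passive → Test: neighbors < NT and either
  – loss > LT, or
  – loss < LT and a help message was received
• Test → Passive: neighbors > NT (high ID for ties), or loss > loss T0

NT: neighbor threshold
LT: loss threshold
T?: state timer values (p: passive, s: sleep, t: test)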
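The transitions can be sketched as a small state machine. This is one reading of the diagram, not the ASCENT code: the function name, the parameter names, and the way "loss T0" is passed in (`loss_before_test`, taken here to be the loss rate measured before entering the test state) are assumptions, and the "high ID for ties" rule is left out.

```python
from enum import Enum


class State(Enum):
    TEST = "test"
    ACTIVE = "active"
    PASSIVE = "passive"
    SLEEP = "sleep"


def next_state(state, timer_expired, neighbors, loss, help_received,
               nt, lt, loss_before_test=None):
    """Return the next state for one evaluation step."""
    if state is State.SLEEP:
        # Wake up into passive mode once Ts expires.
        return State.PASSIVE if timer_expired else State.SLEEP
    if state is State.PASSIVE:
        needs_help = loss > lt or (loss < lt and help_received)
        if neighbors < nt and needs_help:
            return State.TEST
        # Otherwise go back to sleep once Tp expires.
        return State.SLEEP if timer_expired else State.PASSIVE
    if state is State.TEST:
        worse = loss_before_test is not None and loss > loss_before_test
        if neighbors > nt or worse:
            return State.PASSIVE
        # Join the topology once Tt expires.
        return State.ACTIVE if timer_expired else State.TEST
    return State.ACTIVE  # no transition out of Active is listed on the slide


s = next_state(State.PASSIVE, timer_expired=False, neighbors=2, loss=0.4,
               help_received=False, nt=4, lt=0.2)
print(s)
```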

IV-24

Testbed Platforms

• Tiered architecture: heterogeneous set of platforms (no particular set of compromises with a unique platform).

• PC-104/iPAQ enable running tasks that are more CPU/memory intensive.

• Simulator:
  – Discrete event-driven simulator.
  – Multiple stackable modules perform different functionality.

[Figure: testbed platforms: PC-104/RPC, iPAQ, UCB Mote.]

IV-25

Energy Savings Ratio

Energy Savings Ratio (normalized to the Active case, in which all nodes are turned on) as a function of density. ASCENT provides significant energy savings, reaching a factor of 5 in high-density scenarios.

IV-26

Event Delivery Ratio

Event Delivery Ratio as a function of density. ASCENT reduces collisions experienced by diffusion routing by limiting the maximum number of active nodes transmitting packets.

Open question: to what extent would SMAC or Geo-routing produce the same result?

IV-27

Comparison of adaptive topology schemes (general goal: energy savings)

• GAF
  – Goal: preserve routing fidelity
  – Routing dependency: none
  – Assumptions: geographic information for grid placement; radio connectivity directly correlated with geography
• SPAN
  – Goal: preserve capacity of the raw topology
  – Routing dependency: gets connectivity matrix and neighbors from routing; requires modifications in the routing lookup process
  – Assumptions: 802.11 MAC with Power Savings mode
• STEM
  – Goal: trade off latency for energy savings
  – Routing dependency: needs routing info to direct the wake-up wave
  – Assumptions: 2 radios/wake-up channel; connectivity conditions remain constant in sleeping periods
• ASCENT
  – Goal: adapt topology based on application needs
  – Routing dependency: none
  – Assumptions: radio supports promiscuous mode

IV-28

STEM: Data-driven wakeup (Tsiatsis et al. 2002)

• Sensor-triggered node wakeup
• Wake up the nodes along the path. How?

[Figure: an event in the sensor network must be reported to the user; the sleeping nodes ("Zzz") along the path need to be woken up.]

IV-29

STEM: Sparse Topology and Energy Management

• Need to separate the wakeup and data forwarding planes
• Two separate radios were chosen for the two planes
• Use a separate radio for the paging channel to avoid interference with regular data forwarding
• Trades path setup latency for energy savings

[Figure: wakeup plane on frequency f1, data plane on frequency f2.]

IV-30

Duty Cycled Wakeup Radio

[Figure: on the wakeup frequency f1, the initiator node sends a train of beacon packets (B1, B2, ...) to the target node. Timing labels: TRx and T. Steps: 1. beacon received; 2. beacon acknowledge.]

IV-31

Performance Metrics

[Figure: performance plots. Axis labels recovered from the plots: $T_S$ (sec), $T$ (sec), and $E/E_0$.]

IV-32

Energy vs Latency

[Figure: $E/E_0$ versus $T_S$ (sec), with curves labeled 50%, 10%, 1%, and 0.1%.]

IV-33

STEM Design Decisions

• One radio with one frequency band would cause interference between the wakeup and data planes

• One radio with two frequencies would have to change frequency to listen for setup requests

• One radio with time-multiplexed wakeup and data planes would require time synchronization between the two nodes; time synchronization in ad hoc networks is an open problem in itself

• The additional radio cost is about 15% of the total cost of the node (from private communication with Savvides)

[Figure: nodes A, B, C, and D in active, polling, and sleep modes, illustrating interference.]

IV-34

Design Issue: Collision Resolution

• One initiator node
  – The beacon is received correctly
  – Only the intended receiver turns on its data radio and sends a beacon acknowledge in the wakeup plane
• More initiator nodes
  – Upon detecting a collision, a node turns on its data radio
  – After T, the initiator node assumes the target node is up and contacts it on the data plane
  – When an expected target node doesn’t receive data, it times out and goes back to sleep

IV-35

Exploiting Latency AND Density

• Existing approaches (ASCENT, GAF, SPAN, etc) exploit node density

• Combine STEM with one of these schemes (GAF was chosen)

[Figure: active nodes and sleeping nodes ("Zzz"), with STEM applied.]

IV-36

Performance of STEM+GAF

[Figure: $E/E_0$ versus average number of neighbors for GAF, STEM alone, and STEM + GAF, with $T_S$ = 400 ms, 675 ms, and 900 ms.]

IV-37

Conclusions

• STEM leverages data latency for energy savings

• Combining STEM with existing algorithms that leverage density yields additional energy savings

• Provides design-time tools for extending the lifetime of the network

IV-38

Adaptive Topology: Future Work

• Load balancing.
• Larger scale experiments.
• Interaction with adaptive MAC and geographic routing
• Application-defined adaptive fidelity
  – Expanding on STEM’s data-driven characteristics to achieve more than on/off behavior

IV-39

Routing

• Given a topology, how to route data?
  – MANET: reactive [DSR, AODV], proactive [e.g., DSDV], TORA, GPSR [KarpKung00]
  – Location-aided routing: Geocast [Navas97], Cartesian-LAR [KOVaidya98]
  – Building on geo routing
    • GRAB (Lu et al. 2002)
    • Routing on a curve (Nath et al. 2002)

IV-40

GRAB: Field-Based Minimum Cost Forwarding (Lu et al. 2002)

• Each node broadcasts only once
• Cost function
  – A measure of how expensive it is to get a message back to the sink.
  – Could be based on:
    • Energy needed in radio communication.
    • Hop count.
    • …
• Node cost
  – Each node keeps a best estimate of its minimum cost.
  – The estimate is updated upon receipt of every ADV message.
  – ADV message forwarding is deferred for a time proportional to the node's cost estimate.

IV-41

ADV Dissemination Example

• Signal strength is used to measure cost.

• B sees strong signal and judges cost to be 1.

• C sees weak signal and judges cost to be 3.

IV-42

ADV Dissemination Example contd.

• Because B has a smaller cost, it defers for a shorter time than C.

• C updates his cost to 2 and restarts his deferral timer.

• Each node has optimal cost with minimum broadcast.
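The ADV dissemination can be simulated with a timer queue. This is an idealized sketch, not the GRAB implementation: it assumes zero propagation delay, a deferral time equal to the node's current cost estimate, and a hypothetical three-node topology (sink A, nodes B and C) whose link costs are chosen to match the example, with the B to C link assumed to cost 1.

```python
import heapq


def build_cost_field(links, sink):
    """links: node -> {neighbor: link cost}. Returns (cost, broadcasts)."""
    cost = {sink: 0.0}
    broadcasts = {}
    timers = [(0.0, sink)]  # (time at which the deferred ADV fires, node)
    while timers:
        fire_time, node = heapq.heappop(timers)
        # Skip timers made stale by a later, lower cost estimate.
        if node in broadcasts or fire_time > cost[node]:
            continue
        broadcasts[node] = 1  # the node sends its ADV exactly once
        for nbr, link in links[node].items():
            new_cost = cost[node] + link
            if new_cost < cost.get(nbr, float("inf")):
                cost[nbr] = new_cost                     # update estimate
                heapq.heappush(timers, (new_cost, nbr))  # restart deferral
    return cost, broadcasts


links = {
    "A": {"B": 1, "C": 3},   # B hears A strongly, C hears A weakly
    "B": {"A": 1, "C": 1},
    "C": {"A": 3, "B": 1},
}
cost, broadcasts = build_cost_field(links, "A")
print(cost, broadcasts)
```

C first estimates its cost as 3, but B's timer fires earlier, so C lowers its estimate to 2 before it ever broadcasts. Under these assumptions the procedure behaves like Dijkstra's algorithm, which is why each node ends with its minimum cost after a single broadcast.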

IV-43

Data Dissemination

• A node that decides it has interesting data broadcasts two things (besides the data):
  – Total budget to get back to the sink.
  – Amount of budget used in the initial broadcast.
• A node receiving a data message will only forward it if

  Total Budget ≥ Budget Spent So Far + My Cost

  – If the inequality holds, then Budget Spent So Far is updated.
  – Otherwise the message is dropped.
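The forwarding test is a one-line comparison. In the sketch below, the function name and the `hop_cost` parameter are assumptions: the slides say that Budget Spent So Far is updated but not by how much, so the sketch simply adds the cost of the next broadcast.

```python
def forward_decision(total_budget, spent_so_far, my_cost, hop_cost):
    """Return the updated 'budget spent' if the message is forwarded, else None."""
    if total_budget >= spent_so_far + my_cost:
        # Assumption: the forwarder charges the cost of its own broadcast.
        return spent_so_far + hop_cost
    return None  # not enough budget left to reach the sink from here: drop


# Hop count as the cost metric: a node one hop from the sink forwards,
# a node two hops from the sink drops.
print(forward_decision(total_budget=2, spent_so_far=1, my_cost=1, hop_cost=1))
print(forward_decision(total_budget=2, spent_so_far=1, my_cost=2, hop_cost=1))
```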

IV-44

Data Dissemination Example

• Assume hop count was used as a cost metric.

• Node A is the sink.
• Node C is the source.

IV-45

Data Dissemination Example contd.

• Node C sends a data message which specifies
  – Total Budget = 2
  – Budget Spent = 1
• Node E drops the message
  – TB < BS + E’s cost

• Node B forwards message.

IV-46

Routing on a Curve (Nath et al. 2002)

• Trajectories are a natural name space for embedded networks

• By definition, network structure mimics the physical structure that is instrumented
  – Stress along a column
  – Flooding along a river
  – Pollution along a road

• Trajectories come from the application domain

IV-47

TBF (Trajectory based forwarding)

• Fundamental idea
  – Route packets along a specified trajectory

• Generalization of Source Based Routing and Cartesian routing

• Trajectory specified in the packet

IV-48

Specifying trajectory

• Function: $y = ax + b$, $y = \sin x$
• Equation: $ax + by + c = 1$, $x^2 + y^2 = 1$
• Parametric: $x = t,\ y = at + b$; $x = r \sin t,\ y = r \cos t$
  – we use this choice
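A parametric trajectory makes per-hop forwarding easy to express. The sketch below uses one simple, hypothetical policy (forward to the neighbor closest to the curve point slightly ahead of the current parameter value). The function names, the `step` parameter, and the sample coordinates are assumptions for illustration, and this is not necessarily a policy used in the TBF work.

```python
import math


def line_trajectory(t, a=0.5, b=0.0):
    """Parametric form x = t, y = a*t + b."""
    return (t, a * t + b)


def next_hop(neighbors, t_now, step=1.0, curve=line_trajectory):
    """Pick the neighbor nearest to the curve point at t_now + step."""
    target = curve(t_now + step)
    return min(neighbors, key=lambda p: math.dist(p, target))


neighbors = [(1.0, 2.0), (1.0, 0.4), (0.0, 0.0)]
print(next_hop(neighbors, t_now=0.0))
```

Only the curve parameters travel in the packet; each node evaluates the curve locally against its own neighbor positions.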

IV-49

Features of TBF

• Basic features
  – Decouples the path name from the actual path
    • Source-based routing (LSR, DSR, etc.) mixes naming and route path
    • Applications: route around obstacles/changes/failures
  – Trajectory forwarding need not have a “destination”
    • Route along a line or pattern
    • Applications: flooding, discovery, group communication (pollination)

IV-50

Routing on a curve

IV-51

Spoke flooding

IV-52

Flooding along spokes

[Figure: coverage and packets (axis 0 to 100) versus number of spokes.]

IV-53

Related Work

• MANET
  – Reactive [DSR, AODV], proactive [e.g., DSDV], TORA, GPSR [KarpKung00]
• Location-aided routing
  – Geocast [Navas97], Cartesian-LAR [KOVaidya98]
• Localization techniques
  – GPS, Cricket [Priyantha00], RADAR [Bahl00]
  – AHLoS [Bulusu00, Savvides01], TDoA technique, visual cues

IV-54

Discussion

• Trajectory modifications
• No forward progress
  – Retry from source
  – Local reverse gear
  – Apply detour patches
  – Packets allowed to backtrack
• Specifying trajectories
  – Accurate trajectories
  – Use Fourier components
  – Tradeoff between number of components and accuracy

IV-55

In-Network Processing: The Key to Sensor Network Scalability and Realization

• Gupta and Kumar pointed out fundamental limits of large-scale wireless networks (per-node throughput $O(1/\sqrt{N})$)
• However, S. Servetto shows that the result holds only for independent nodes (MobiCom 2002)
  – Densely deployed sensor network data will be correlated and can be aggregated
• Scalability and lifetime will depend on techniques for in-network processing of data
  – Naming data: Directed Diffusion
  – Database perspectives: TAG, Sylph
  – Programming mechanisms: SensorWare, Mate

IV-56

Directed Diffusion: Data Centric Routing

• Basic idea
  – Name data (not nodes) with externally relevant attributes
    • Data type, time, location of node, SNR, etc.
  – Diffuse requests and responses across the network using application-driven routing (e.g., geo-sensitive or not)
  – Optimize the path with gradient-based feedback
  – Support in-network aggregation and processing
• Data sources publish data, data clients subscribe to data
  – However, all nodes may play both roles
    • A node that aggregates/combines/processes incoming sensor node data becomes a source of new data
    • A sensor node that only publishes when a combination of conditions arises is a client for the triggering event data
  – True peer-to-peer system
• Implementation defines the namespace and simple matching rules with filters
  – Linux (32-bit processor) and TinyOS (8-bit processor) implementations

IV-57

Diffusion as a construct for in-network processing

• Nodes pull, push, and store named data (using a tuple space) to create efficient processing points in the network
  – e.g., duplicate suppression, aggregation, correlation
• Nested queries reduce overhead relative to “edge processing”
• Complex queries support collaborative signal processing
  – Propagate a function describing desired locations/nodes/data (e.g., an ellipse for tracking)
  – Interesting analogs to emerging peer-to-peer architectures
• Build on a data-centric architecture for queries and storage

IV-58

Nested Query Evaluation (example experiment with sub-optimal hardware)
(Heidemann et al.)

• Nested queries greatly improve event delivery rate
• Specific results depend on the experiment
  – placement
  – limited-quality MAC
• General result: app-level info is needed in sensor nets; diffusion is a good platform

[Figure: events successfully received (%) versus number of light sensors (1 to 4) for flat and nested queries.]

IV-59

A more general look at the Data Centric vs. Address Centric approach (Krishnamachari et al.)

• Address centric
  – Distinct paths from each source to the sink.
• Data centric
  – Support aggregation in the network where paths/trees overlap
  – Essential difference from traditional IP networking
• Building efficient trees for the data-centric model
  – Aggregation tree: on a general graph, if k nodes are sources and one is a sink, the aggregation tree that minimizes the number of transmissions is the minimum Steiner tree. This is NP-complete, so approximations are used:
    • Center at Nearest Source (CNSDC): all sources send through the source nearest to the sink.
    • Shortest Path Tree (SPTDC): merge paths.
    • Greedy Incremental Tree (GITDC): start with the path from the sink to the nearest source, then successively add the next nearest source to the existing tree.
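The greedy incremental tree can be sketched on an unweighted graph with breadth-first search. This is a minimal illustration under stated assumptions: hop count is the cost, aggregation is perfect (one transmission per tree edge), and the five-node topology, function names, and integer node labels are all hypothetical.

```python
from collections import deque


def bfs_path(adj, starts, goals):
    """Shortest hop path from any node in starts to any node in goals."""
    parent = {s: None for s in starts}
    queue = deque(starts)
    while queue:
        u = queue.popleft()
        if u in goals:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None


def greedy_incremental_tree(adj, sink, sources):
    """Grow a tree from the sink, attaching the nearest remaining source each time."""
    tree_nodes, tree_edges = {sink}, set()
    remaining = set(sources) - tree_nodes
    while remaining:
        path = bfs_path(adj, sorted(tree_nodes), remaining)
        if path is None:
            raise ValueError("a source is unreachable from the sink")
        tree_edges.update(frozenset(e) for e in zip(path, path[1:]))
        tree_nodes.update(path)
        remaining -= tree_nodes
    return tree_edges


def address_centric_transmissions(adj, sink, sources):
    """Each source uses its own shortest path; no aggregation."""
    return sum(len(bfs_path(adj, [sink], {s})) - 1 for s in sources)


# Sink 0, relay nodes 1 and 2, sources 3 and 4 hanging off node 2.
adj = {0: [1], 1: [0, 2], 2: [1, 3, 4], 3: [2], 4: [2]}
print(len(greedy_incremental_tree(adj, 0, [3, 4])),
      address_centric_transmissions(adj, 0, [3, 4]))
```

In this toy topology the two sources share the path through nodes 2 and 1, so the data-centric tree needs 4 transmissions per round against 6 for the address-centric case.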

IV-60

Source placement: event-radius model

IV-61

Comparison of energy costs

[Figure: energy costs of address centric, shortest path data centric, greedy tree data centric, nearest source data centric, and the lower bound.]

Data centric has many fewer transmissions than address centric, independent of the tree-building algorithm.

Opportunism always pays; greed pays only when things get very crowded.

(Intanagonwiwat et al.: more detailed ns-2 simulations)

IV-63

Programming Paradigm

• Move beyond simple query with in-network aggregation model

• How do we task a 1000+ node dynamic sensor network to conduct complex, long-lived queries and tasks?
  – What constructs does the query language need to support?
• What sorts of mechanisms need to be “running in the background” in order to make tasking efficient?
  – Small databases scattered throughout the network, actively collecting data from nearby nodes, as well as accepting messages from nodes further away?
  – Active messages traveling the network to both train the network and identify anomalous conditions?

IV-64

Examples

• Map isotherms and other “contours”, gradients, regions

– Record images wherever acoustic signatures indicate significantly above-average species activity, and return with data on soil and air temperature and chemistry in vicinity of activity.

– Mobilize robotic sample collector to region where soil chemistry and air chemistry have followed a particular temporal pattern and where the region presents different data than neighboring regions.

• Raises requirements for some global context, e.g. “average” levels
  – Emerging role for distributed storage architecture
• Pattern identification: how much can and should we do in a distributed manner?
  – Similar to some vision/image analysis problems, but with distributed, noisy inputs

IV-65

Why databases?

• Sensor networks are capable of producing massive amounts of data
  – Sensor networks should be able to
    • Accept queries for data
    • Respond with results
  – Users will need
    • An abstraction that guarantees reliable results
    • A largely autonomous, long-lived network
• Efficient organization of nodes and data will extend network lifetime
  – Database techniques already exist for efficient data storage and access

IV-66

Database-centric Approach (Muntz et al.)

• Query processing
  – Storage structures:
    • Less choice here … traditionally, data is organized on disk
    • Here, data originates at the sensors’ location
  – Time coherency across space
  – “Stream” processing … one pass over the data
    • E.g., try to estimate a histogram on the fly.
  – Quality of service:
    • Latency vs. completeness
    • Actuators: e.g., camera pan and zoom
  – Tradeoffs in communication vs. computation and storage: e.g., filter-refine: where to do what.

IV-67

Differences between databases and sensor networks

• Database
  – Static data
  – Centralized
  – Failure is not an option
  – Plentiful resources
  – Administrated
• Sensor network
  – Streaming data
  – Large number of nodes
  – Multi-hop network
  – No global knowledge about the network
  – Frequent node failure
  – Energy is the scarce resource, limited memory
  – Autonomous

IV-68

Bridging the Gap

• What is needed to be able to treat a sensor network like a database?
  – How should sensors be modeled?
  – How should queries be formulated?

IV-69

Traditional Approach: Warehousing

• Data is extracted from sensors and stored on a front-end server
• Query processing takes place on the front-end.

[Figure: sensor nodes feed a front-end server that holds the warehouse.]

IV-70

What We’d Like to Do: Sensor Database System

• A sensor database system supports distributed query processing over a sensor network

[Figure: sensor nodes each labeled SensorDB, connected to a front-end.]

IV-71

Sensor Database System

• Characteristics of a sensor network:
  – Streams of data
  – Uncertain data
  – Large number of nodes
  – Multi-hop network
  – No global knowledge about the network
  – Node failure and interference are common
  – Energy is the scarce resource
  – Limited memory
  – No administration, …
• Can existing database techniques be reused? What are the new problems and solutions?
  – Representing sensor data
  – Representing sensor queries
  – Processing query fragments on sensor nodes
  – Distributing query fragments
  – Adapting to changing network conditions
  – Dealing with site and communication failures
  – Deploying and managing a sensor database system

IV-72

Performance Metrics

• High accuracy
  – Distance between ideal answer and actual answer?
  – Ratio of sensors participating in the answer?
• Low latency
  – Time between when data is generated on sensors and when the answer is returned
• Limited resource usage
  – Energy consumption

IV-73

Representing Sensor Data and Sensor Queries

• Sensor data:
  – Output of signal processing functions
    • Time-stamped values produced over a given duration
  – Inherently distributed
• Sensor queries
  – Conditions on time and space
    • Location-dependent queries
    • Constraints on time stamps or aggregates over time windows
  – Event notification

IV-74

Early Work in Sensor Databases

• “Towards Sensor Database Systems” and “Querying the Physical World”
  – Philippe Bonnet, Johannes Gehrke, Praveen Seshadri

IV-75

Fjording the Stream: An Architecture for Queries over Streaming Sensor Data

(Hellerstein et al.)

• How can existing database querying methods be applied to streaming data?

• How can we combine real-time sensor data with stored historical data?

• What architecture is appropriate for supporting simultaneous queries?

• How can we lower sensor power consumption, while still supporting a wide range of query types?

IV-76

Traditional Database Operators

• Are implemented using “pull” mechanisms.
• Block on incoming data.
• Most require all the data to be read first (e.g., sort, average…).
• Optimized for classic I/O.
• Usually implemented as separate threads.

IV-77

Hardware Architecture

• Centralized data processing.

• Sensor proxies read and configure sensors.

• Query processor interacts with proxies to request and get sensor data.

• Sensor proxies support multiple simultaneous queries, multiplexing the data.

IV-78

Operators

• Implemented as state machines.
• Support a transition(state) method, which causes the operator to optionally read from input queues, write to the output queue, and change state.
• Multiple operators per thread, called by a scheduler (round robin in the experiments).

• Allows fine-grained tuning of processing time allocated to each operator.
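The non-blocking operator idea can be sketched as follows. This is an illustration only, not the Fjords code: the `FilterOperator` class, the `run_round_robin` scheduler, and the queue wiring are hypothetical, and the explicit state argument of transition(state) is omitted because a filter needs no state between calls.

```python
from collections import deque


class FilterOperator:
    """Non-blocking operator: each transition() handles at most one tuple."""

    def __init__(self, predicate, inbox, outbox):
        self.predicate = predicate
        self.inbox = inbox
        self.outbox = outbox

    def transition(self):
        if not self.inbox:
            return False  # nothing to read: return instead of blocking
        item = self.inbox.popleft()
        if self.predicate(item):
            self.outbox.append(item)
        return True


def run_round_robin(operators, max_rounds=1000):
    """Call every operator once per round until none makes progress."""
    for _ in range(max_rounds):
        progressed = [op.transition() for op in operators]
        if not any(progressed):
            break


readings, fast, result = deque([5, 60, 70, 10]), deque(), deque()
ops = [
    FilterOperator(lambda v: v > 50, readings, fast),
    FilterOperator(lambda v: v < 65, fast, result),
]
run_round_robin(ops)
print(list(result))
```

Because an operator with an empty input queue returns immediately, one thread can host many operators, and the scheduler decides how much processing time each one receives.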

IV-79

Sensor Sensitive Operators

• Certain operations are impossible to perform on continuous data streams (sum, average, sort…).
• They can be performed on partial data windows.
• Joins can be implemented by hashing tuples.
• Can provide aggregation based on current data, with continuous updates to parent operators.
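A windowed aggregate is the simplest example of running a blocking operator on a partial data window. The sketch below is illustrative: the class name and the fixed-size sliding window are assumptions, and each new reading yields an updated value that could be pushed to a parent operator.

```python
from collections import deque


class WindowedAverage:
    """Average over the most recent `window` readings of an unbounded stream."""

    def __init__(self, window: int):
        self.values = deque(maxlen=window)  # old readings fall out automatically

    def push(self, value: float) -> float:
        """Add a reading and return the current windowed average."""
        self.values.append(value)
        return sum(self.values) / len(self.values)


avg = WindowedAverage(window=3)
print([avg.push(v) for v in [10, 20, 30, 40]])
```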

IV-80

Sensor Proxy

• Responsible for configuring the sensors that belong to it: setting sensing frequency, aggregation policies, etc.

• To save power, each sensor only listens for commands from proxy during short intervals.

• Handles incoming data from sensors, and pushes it into appropriate queues.

• Stores a copy to disk for historical queries.
• Performs data filtering, which it can sometimes offload to the sensors.

IV-81

Building a Fjord

• For all sensor data sources, locate the proxy for the sensor, and install a query on it to deliver tuples at a certain rate to a push queue.

• For non-sensor data sources, set up a pull queue to scan for data.

• Pipe the data through the operators specified by the query.

IV-82

Query

• Find average car speeds during time window (w), for all segments the user is interested in (knownSegments)

• More complicated queries are possible, with joins of streaming sensor data and historical data stored in a normal database fashion.

IV-83

Dataflow for the Query

• Data is pushed from the sensors to the user, through the filter operator set up by the query.

• Multiple similar queries can be added to an existing fjord, instead of creating one per query.

IV-84

Conclusion

• Fjords allow sensors to be treated as database sources for querying, with little change in the overall architecture.

• Proxies can optimize energy consumption of individual sensors based on user queries, and multiplex data from sensors to multiple queries.

• Processing is centralized, but can sometimes be offloaded to the sensors, to lower energy consumed by radio transmissions.

IV-85

The Sylph Middleware

• Database mechanisms packaged as middleware services

• Modular, layered design
  – Sensor module
  – Service discovery module
  – Proxy core

• Query language
  – Example: READ temperature EVERY 30 seconds FOR 1 hour

• Sensor Stream Processing

IV-86

IV-87

Sensor Module

• Strong abstraction barrier
• Sensor description
  – Derived from the IEEE 1451 Transducer Electronic Data Sheet (TEDS)
  – Metadata: name, manufacturer, etc.
  – Attributes: device status, sampling interval, etc.
  – Data: available data, return types
• Attribute-value interface
  – SET SamplingPeriod = 1 second
  – GET DeviceStatus

IV-88

Service Discovery Module

• Device proxy
• Common service discovery mechanisms (e.g., Jini)
• Soft-state reliability

IV-89

Proxy Core

• Device manager
• Sensor proxy
  – Service discovery
  – Layered functionality (e.g., buffering)
• Query processor
  – Query parsing
  – Directed graph of stream operators
  – Query optimization
• Query distribution: gateways

IV-90

Accommodating in-network Aggregation

• Fjords and Sylph did not explicitly address in-network aggregation/processing

• “Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks” (Madden et al. 2002)

• Explores aggregation techniques that are application independent

  – Count
  – Min
  – Max
  – Sum
  – Average

IV-91

At A Glance …

• Trying to minimize the number of messages sent

• All aggregation is done by building a tree

IV-92

Tricks of the Trade …

• How do you ensure an aggregate is correct?
  – Compute it multiple times.
• How do you reduce the message overhead of redistributing queries?
  – Piggyback the query on data messages.
• Is there any way to further reduce the messaging overhead?
  – Child nodes only report their aggregates if they’ve changed.
• Nodes can take advantage of multiple parents for redundancy.

IV-93

The Tiny Aggregation (TAG) Approach (S. Madden, UCB)

• Push declarative queries into the network
  – Impose a hierarchical routing tree onto the network
• Divide time into epochs
• Every epoch, sensors evaluate the query over local sensor data and data from children
  – Aggregate local and child data
  – Each node transmits just once per epoch
  – Pipelined approach increases throughput
• Depending on the aggregate function, various optimizations can be applied

IV-94

SQL Primer (Madden)

• SQL is an established declarative language; not wedded to it
  – Some extensions clearly necessary, e.g. for sample rates
• We adopt a basic subset:

    SELECT {aggn(attrn), attrs}
    FROM sensors
    WHERE {selPreds}
    GROUP BY {attrs}
    HAVING {havingPreds}
    EPOCH DURATION s

• ‘sensors’ relation (table) has
  – One column for each reading type, or attribute
  – One row for each externalized value
    • May represent an aggregation of several individual readings
• Example:

    SELECT AVG(light)
    FROM sensors
    WHERE sound < 100
    GROUP BY roomNo
    HAVING AVG(light) < 50

IV-95

Aggregation Functions (Madden)

• Standard SQL supports “the basic 5”:
  – MIN, MAX, SUM, AVERAGE, and COUNT
• We support any function conforming to:

    Agg_n = {f_merge, f_init, f_evaluate}
    f_merge{<a1>, <a2>} → <a12>
    f_init{a0} → <a0>
    f_evaluate{<a1>} → aggregate value

  – <a1>, <a2>, and <a12> are partial aggregates
  – Merge must be associative and commutative!
• Example: Average

    AVG_merge{<S1, C1>, <S2, C2>} → <S1 + S2, C1 + C2>
    AVG_init{v} → <v, 1>
    AVG_evaluate{<S1, C1>} → S1/C1
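The three-function decomposition for AVERAGE translates directly into code. This sketch follows the definitions on the slide; the Python function names and the sample readings are illustrative assumptions.

```python
from functools import reduce


def avg_init(v):
    """Turn one reading into a partial aggregate <sum, count>."""
    return (v, 1)


def avg_merge(a, b):
    """Combine two partial aggregates; associative and commutative."""
    return (a[0] + b[0], a[1] + b[1])


def avg_evaluate(a):
    """Turn the final partial aggregate into the answer."""
    return a[0] / a[1]


readings = [10, 20, 60]
partial = reduce(avg_merge, map(avg_init, readings))
print(partial, avg_evaluate(partial))
```

Since merge is associative and commutative, a parent can fold in its children's partial aggregates in whatever order they arrive and still forward a single <sum, count> pair.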

IV-96

Query Propagation (Madden)

• TAG is propagation agnostic
  – Any algorithm that can:
    • Deliver the query to all sensors
    • Provide all sensors with one or more duplicate-free routes to some root
• The paper describes a simple flooding approach
  – Query introduced at a root; rebroadcast by all sensors until it reaches the leaves
  – Sensors pick a parent and level when they hear the query
  – Reselect parent after k silent epochs

[Figure: the query floods a six-node network (nodes 1 to 6). Nodes are labeled with parent P and level L: P:0, L:1; P:1, L:2; P:1, L:2; P:3, L:3; P:2, L:3; P:4, L:4.]
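The flooding step amounts to a breadth-first traversal in which each node records the first sender it hears as its parent. The sketch below is an idealized model (no message loss, every node hears its nearest neighbor first), and the six-node adjacency list is a hypothetical topology chosen to reproduce the parent/level labels in the figure.

```python
from collections import deque


def flood_query(adj, root):
    """Return node -> (parent, level); the root gets parent 0 and level 1."""
    info = {root: (0, 1)}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in info:  # first time v hears the query
                info[v] = (u, info[u][1] + 1)
                queue.append(v)  # v rebroadcasts in turn
    return info


adj = {1: [2, 3], 2: [1, 4], 3: [1, 5], 4: [2, 6], 5: [3], 6: [4]}
for node, (parent, level) in sorted(flood_query(adj, 1).items()):
    print(f"node {node}: P:{parent}, L:{level}")
```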

IV-97

TAG Discussion(Madden)

• Result is a stream of values
  – Ideal for monitoring scenarios
• One communication / node / epoch
  – Symmetric power consumption, even at root
• New value on every epoch
  – After d-1 epochs, complete aggregation
• Given a single loss, the network will recover after at most d-1 epochs
• With time synchronization, nodes can sleep between epochs, except during a small communication window

[Figure: example routing tree with nodes 1 to 5.]

IV-98

Simulation Result (Madden, OSDI ‘02)

[Figure: total bytes transmitted (axis 0 to 100,000) versus aggregation function (EXTERNAL, MAX, AVERAGE, COUNT, MEDIAN).]

Simulation setup: 2500 nodes, 50x50 grid, depth = ~10, neighbors = ~20

Some aggregates require dramatically more state!

IV-99

Taxonomy of Aggregates (Madden)

• TAG insight: classifying aggregates according to various functional properties
– Yields a general set of optimizations that can automatically be applied

Property | Examples | Affects
Partial State | MEDIAN: unbounded, MAX: 1 record | Effectiveness of TAG
Duplicate Sensitivity | MIN: dup. insensitive, AVG: dup. sensitive | Routing Redundancy
Exemplary vs. Summary | MAX: exemplary, COUNT: summary | Applicability of Sampling, Effect of Loss
Monotonic | COUNT: monotonic, AVG: non-monotonic | Hypothesis Testing, Snooping
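The duplicate-sensitivity row can be illustrated with a few lines of code. The sketch below (readings and helper names are made up) delivers one reading twice, as could happen when a node forwards its data along redundant routes: MIN is unaffected, while AVG shifts because the duplicate is counted again.

```python
def min_merge(a, b):
    return min(a, b)

def avg_merge(a, b):
    # Partial aggregates for AVG are (sum, count) pairs.
    return (a[0] + b[0], a[1] + b[1])

def aggregate(merge, partials):
    acc = partials[0]
    for p in partials[1:]:
        acc = merge(acc, p)
    return acc

readings = [2, 4, 6]            # hypothetical values
duplicated = readings + [6]     # the reading 6 arrives over two routes

min_once = aggregate(min_merge, readings)
min_dup = aggregate(min_merge, duplicated)

s, c = aggregate(avg_merge, [(v, 1) for v in readings])
avg_once = s / c
s, c = aggregate(avg_merge, [(v, 1) for v in duplicated])
avg_dup = s / c

print(min_once, min_dup)   # 2 2      (duplicate insensitive)
print(avg_once, avg_dup)   # 4.0 4.5  (duplicate sensitive)
```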

IV-100

SensorWare: a Sensor Middleware for Distributed Applications

(Srivastava et al 2002)

• Current approaches: sensor nodes viewed as tiny database servers
– a user query is answered by aggregating results
– forms of queries:
• Do you currently see X?
• Did you see X in the past time T?
• Inform me when you see X.
• Tell me periodically what you see.
• Problem: remote user in the interactive loop
– leads to higher latency for tracking
– also, excessive traffic and therefore wasted energy

• Mobile scripts naturally result in an information-driven strategy

IV-101

SensorWare

• Mobile scripts (agents) injected as queries & tasks for application-defined processing at the nodes
– replicate and migrate according to application logic and the state of the physical world
– executed in a sandbox at each node
• Suite of core services for sensing, processing, communication, and actuation available at each node
• Features
– Scripts use application-specific dissemination strategies layered on top of a thin and simple protocol stack
• no single routing scheme is hard-coded
– Autonomous coordination in the sensor network
• Avoids the “user-in-the-loop” model inherent in the usual database query-response model
– Task-centric programming model
• Avoids thinking in terms of individual nodes

IV-102

SensorWare Architecture

[Figure: two nodes (Node 1, Node 2), each with the stack: Hardware, HW abstraction layer, OS, Middleware, and Scripts / Apps, Services on top. The nodes are linked by script migration and message exchanging. An external user can inject an agent script.]

[Figure detail: Middleware components]
• APIs: Mobility, Networking, Sensing
• Runtime environment: Script interpreter, State keeping
• Energy-based admission control & resource management
• Resource handling tasks: Networking, Sensor abstraction, Timers & CPU control

IV-103

Implementation on iPaq-based Sensor Nodes & Sensoria WINS

Magnetic Sensor
- uC based with RS232
- Range of +/- 2 Gauss
- Adjustable sampling rate
- X, Y, Z output
- Device ID management

WaveLan Card
- IEEE 802.11b compliant
- 11 Mbit/s data rate

Familiar v0.5
- Linux-based OS for iPAQ H3600s
- JFFS2, read/write iPAQ’s flash
- Tcl ported

iPAQ 3670
- Intel StrongARM
- Power management (normal, idle & sleep mode)
- Programmable system clock
- IR, USB, serial (RS232) transmission

IV-104

System Operation Overview

[Figure: the user node injects script code into the network; the script then migrates from node to node (three "Script Migration" hops shown).]

Mobile Script
- Lightweight
- Programmable
- Code mobility
- Tcl front end

IV-105

Event-driven Model

• Agents express the event they are interested in
• Sensor event
– event -data <name> -sample_period <value> {-buffer_size <value>} {-wakeup_period <value>} {-threshold <value>}
• Timer event
– event -timer <name> <value>
• Network event
– event -net <name> <start> <reg exp>
– agent can generate network event
• Timer and data events are generated by the system
• send <node.user.family.instance> <msg>
• wait {-net <events> ...} {-data <events> ...} {-timer <events> ...}

IV-106

Mobile Script Model

• spawn <node> [-s|<code>] {<bind> ...}
• Binding variables
– Pass-by-name
– Encourage agent reusability
– Support agent modularity
• Embedded Agents
– Agents that are being carried by other ones
– Two methods
• Static
– Prone to errors
– Cannot reuse: mission specific
– High memory usage: carrier has to stay alive!
• Dynamic
– High modularity

IV-107

Tracking: Mobile Script Flooding

[Figure: “Tracking Script Code” being injected and flooded through the network. Legend: Sensor Node, User Node, Video Node, Monitoring Target.]

IV-108

Tracking: Event Sensing

[Figure: sensor nodes performing event sensing (the "Event Sensing" label appears at eight nodes). Legend: Sensor Node, User Node, Video Node, Monitoring Target.]

IV-109

Tracking: Mobile Script Activation

[Figure: tracking code activated at two sensor nodes. Legend: Sensor Node, User Node, Video Node, Target.]

IV-110

Tracking: Position Notification and Code Migration

[Figure: a tracking node sends position information while the tracking script migrates to other nodes (two "Script Migration" arrows); "Monitoring" label also shown. Legend: Sensor Node, User Node, Video Node, Target.]

IV-111

Tracking: Position Notification

[Figure: tracking nodes send position information as the tracking script continues to migrate (two "Script Migration" arrows); "Monitoring" label also shown. Legend: Sensor Node, User Node, Video Node, Target.]

IV-112

Making this work on much smaller nodes

• UCB Mate project (Levis et al 2002)
– Stack-based bytecode interpreter
– Self-forwarding code
– Rapid reprogramming
– Message receive and send contexts
– TinyOS component: 7286 bytes code, 603 bytes RAM, three concurrent execution contexts, code broken into 24-instruction capsules

[Figure: "Network Programming Rate". x-axis: time (seconds), 0 to 240; y-axis: percent programmed, 0% to 100%.]

IV-113

Moving Forward: Storage, Representation, Distributed processing

• Spatio-temporal Queries
– Index data for easy temporal and spatial searching
• Interpretation of Spatially Distributed Data
– Provide ability to interpret data at different spatial and temporal extents. Per-node processing alone is not enough
• Pattern-Triggered Data Collection
– Should enable multi-resolution data storage and retrieval for data collection.
• Data-Centric Protocols and In-Network Processing
– Network does in-network processing of data based on distribution of data
– When data changes, portion of the network updates understanding of the data
– When user queries, it travels to node that maintains relevant summarized data

[Figure: network diagram with "K V" labels repeated at the nodes.]

IV-114

How should nodes summarize data? (Bien et al)

• Histograms are a useful way to summarize data

• Queries for statistics

• Show degree of data variation in an area

• Queries for maps

• Geographically show data variation

• Provide approximation of map

[Figure: histogram. x-axis: Sensor Value; y-axis: Number of Nodes with Value.]

[Figure: map. Axes: X Location, Y Location, Sensor Value.]
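As a rough illustration of histogram summaries, the sketch below builds a fixed-width histogram of readings at each node and merges two of them by adding bin counts. The bin width, the sample values, and the function names are assumptions made for this example; the slide does not specify how bins are chosen.

```python
from collections import Counter

def local_histogram(values, bin_width):
    # Map each integer reading to the lower edge of its bin and count occurrences.
    return Counter((v // bin_width) * bin_width for v in values)

def merge_histograms(a, b):
    # Histograms over the same bins merge by adding counts per bin.
    return a + b

node_a = local_histogram([1, 2, 7], bin_width=5)   # hypothetical readings
node_b = local_histogram([3, 8, 9], bin_width=5)
merged = merge_histograms(node_a, node_b)

print(dict(merged))  # {0: 3, 5: 3}
```

Because merging is just per-bin addition, the summary stays the same size no matter how many nodes it covers, which is what makes it convenient to pass up a hierarchy.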

IV-115

DIMENSIONSTM: A Multi-Resolution Storage Architecture for Sensor Networks (Ganesan et al)

[Figure: hierarchy diagram with axes labeled "increasing spatial resolution", "increasing temporal resolution", and "Time".]

• Per-Node Temporal Data Processing
– Progressive encoding: more lossy storage of older data
– Design Principle: System guarantees lossless multi-resolution data collection within time T of event occurrence. Older data is stored ONLY for long-term query processing, which can tolerate greater loss
• Spatial Data Processing
– Nodes in the network form a logical spatial hierarchy.
– At each step of the hierarchy, further summarization of the data takes place.

IV-116

Spatial Data Processing

• For level i = 1 to N
– Cluster-Heads (i-1) send data to Cluster-Head (i)
– Cluster-Head (i) aggregates data from 4 lower-level cluster-heads, locally stores the deltas, and sends resulting coefficients to Cluster-Head (i+1)
• To a first-order approximation, the size of the coefficients transmitted at each level is constant.

[Figure: processing pipeline at a level-l cluster-head. Blocks: get coefficients from level l-1 cluster-heads; Huffman decoding; 2D wavelet transform on tile; deltas to local storage; Huffman encoding; send coefficients to level l+1 cluster-heads.]
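The per-level summarization can be sketched with a much simpler stand-in for the wavelet transform: each cluster-head replaces a 2x2 tile of lower-level values by their mean, keeps the residual deltas locally, and passes only the mean upward. This is an assumption-laden illustration (plain block averaging, no Huffman coding, made-up grid values, invented function name), not the DIMENSIONS algorithm itself.

```python
def summarize_level(grid):
    """grid: square list of lists with an even side length.
    Returns (coarse, deltas): one mean per 2x2 tile, plus that tile's residuals."""
    n = len(grid)
    coarse, deltas = [], []
    for r in range(0, n, 2):
        coarse_row, delta_row = [], []
        for c in range(0, n, 2):
            tile = [grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1]]
            mean = sum(tile) / 4
            coarse_row.append(mean)                    # sent to the next level
            delta_row.append([v - mean for v in tile]) # stored locally
        coarse.append(coarse_row)
        deltas.append(delta_row)
    return coarse, deltas

# Hypothetical 4x4 field of sensor readings.
field = [
    [1, 3, 5, 7],
    [1, 3, 5, 7],
    [2, 2, 6, 6],
    [4, 4, 8, 8],
]
level1, deltas1 = summarize_level(field)
level2, deltas2 = summarize_level(level1)
print(level1)  # [[2.0, 6.0], [3.0, 7.0]]
print(level2)  # [[4.5]]
```

In this sketch every cluster-head forwards a single value regardless of its level, which mirrors the observation that the amount transmitted per level is roughly constant.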

IV-117

Content Addressable Storage…aka Data Centric Storage

(Ramiswamy et al)

• Queries for some things (like local maxima) are not necessarily spatially correlated.

• Sensor networks require a storage scheme that indexes on semantics as well as on the location of data.

IV-118

The Semantic Hierarchy (Greenstein et al)

[Figure: index hierarchy with nodes labeled by value range, including "8 < Temperature ≤ 8" and "4 < Temperature ≤ 8".]

• The wider the geographical extent a node indexes, the more data there is to index.
• Given that each node has limited storage, tighter semantic constraints must be used for nodes that index wider areas.

IV-119

Index Construction

[Figure: an event is mapped to an index node by a hash function.]

Event attributes:
• Name = Temperature, Value = 8, Min = 1, Max = 8, Distribution = Uniform
• Name = EventLatitude, Value = 171
• Name = EventLongitude, Value = 120

Event coordinates: (120, 171) = (0111 1000, 1010 1011)
Hash function inputs: hash string “Temperature 1 8”, level 1 of 4
Resulting coordinates: (121, 169) = (0111 1001, 1010 1001)

Indexes contain histograms summarizing events within their value and spatial ranges
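The slide does not give the hash function, but in its example only the two lowest bits of each coordinate change between the event location and the index location at level 1 of 4. The sketch below is built on that observation and is an assumption throughout: it keeps the high bits of the event coordinates and replaces the lowest 2 x level bits with bits taken from a SHA-256 digest of the hash string. The function name, the choice of SHA-256, and the bits-per-level rule are hypothetical, and the output is not expected to match the slide's (121, 169).

```python
import hashlib

def index_node_coords(event_x, event_y, hash_string, level, bits_per_level=2):
    """Map an event location to an index-node location (illustrative only).
    Assumes 8-bit coordinates and that level k frees the lowest 2*k bits."""
    low_bits = bits_per_level * level
    mask = (1 << low_bits) - 1
    digest = hashlib.sha256(hash_string.encode("utf-8")).digest()
    hx = digest[0] & mask          # hashed low bits for x
    hy = digest[1] & mask          # hashed low bits for y
    return ((event_x & ~mask) | hx, (event_y & ~mask) | hy)

x, y = index_node_coords(120, 171, "Temperature 1 8", level=1)
print(x, y)                       # some point with x in 120..123, y in 168..171
print(format(x, "08b"), format(y, "08b"))
```

Under this assumption, higher levels free more low-order bits, so their index nodes may lie farther from the event and serve a wider area.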





IV-120

Search






• Search most constrained index nodes for region of interest

• Search travels from constrained semantics to constrained geography (location of an event)

• Distributions are generated as part of the index construction

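Node histograms (each entry is "temperature value -- event count"; region labels as shown on the slide)

Level 4 node histogram (< 8 -- 752, > 8 -- 0)
Region 00: 8 -- 0
Region 01: 8 -- 0
Region 10: 8 -- 2
Region 11: 8 -- 0

Level 3 node histogram (< 7 -- 700, > 8 -- 0)
Region 00: 7 -- 52, 8 -- 2
Region 01: 7 -- 12, 8 -- 0
Region 10: 7 -- 15, 8 -- 0
Region 11: 7 -- 0, 8 -- 0

Level 2 node histogram (< 5 -- 400, > 8 -- 0)
Region 00: 5 -- 100, 6 -- 80, 7 -- 26, 8 -- 1
Region 10: 5 -- 20, 6 -- 20, 7 -- 9, 8 -- 0
Region 00: 5 -- 30, 6 -- 20, 7 -- 13, 8 -- 1
Region 10: 5 -- 15, 6 -- 15, 7 -- 4, 8 -- 1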
