Provenance in Sensornet Republishing
Unkyu Park and John Heidemann
University of Southern California, Information Sciences Institute
June 18, 2008



Why Sensornet Provenance?

• Growing amount of sensornet data
  – In isolated sensornets?
  – Today, reuse of data and collaboration are rare

• Sharing is important
  – Use the Internet to share sensor data
  – Multiple steps, different users

• Provenance for sensornet
  – Support tracking data back to its source
  – Encourage sharing


Sensor-Internet

• Goals
  – Share and search across many independently running sensor networks
  – Allow users to process and share transformed data

[Figure: Sensor-Internet components connected through the Internet. sensors (mote sensornet, mobile phones, or personal computers): sense the environment; sensor store: repository for all data; republisher: transforms the existing data; sensor-search: indexes data and supports sensornet discovery; users.]

[S. Reddy, G. Chen, B. Fulkerson, S. J. Kim, U. Park, N. Yau, J. Cho, M. Hansen, and J. Heidemann. Sensor-Internet Share and Search: Enabling Collaboration of Citizen Scientists. In Data Sharing and Interoperability on the World-wide Sensor Web, IPSN 2007, April 2007]


How Can Sensornet Provenance Help?

Problem: A user detects an abnormality on the map

Q. What causes the problem?

1. Check the transformation
2. Check the input
3. Find an abnormal sensor reading
4. Check the transformation and input
5. Find that image recognition caused the problem (74.3 → 94.3)

[Figure: sensor data sharing. Components shown: Temperature Sensor, Image Recognition, Digit Repair (Fixing Digits; Raw: 87.1, ?7.1, 87.?; Corrected: 87.1 ±0.1, 87.1 ±0.1, 87.5 ±0.5), and TempMap (interpolate point data into a complete temperature map). Map values shown: 94.3, 77.1, 87.3, 78.8, 58.4.]


Building Sensornet Ecosystem

• Collaborative processing
  – Encourage users who use the same data to collaborate
  – Participatory sensing
• Search over the provenance
  – Exploit the provenance to identify high-quality sensor data

[Figure: Temperature Sensor (Raw: 87.1, ?7.1, 87.?), Digit Repair, TempMap]


Challenges in Sensornet Provenance

• Sensor data are distributed across many data providers
  – Need: distributed data management and authorization
    • Locate the distributed sensor data
    • Support distributed authorization in tracking provenance
• Each sensor data item is often small
  – Need: efficient provenance storage
    • Scale the provenance storage according to the sensor data size
• Sensor data keeps arriving
  – Need: stream-aware provenance
    • Record the temporal location of the stream


Sensor Provenance Goals and Contributions

• Goals
  – End users can follow data back to the original source
  – Observe each step of processing
• Contributions
  – Provenance via new linking scheme (distributed data management)
  – User-centric access control (distributed authorization)
  – Incremental compression (provenance storage)
  – Stream-aware provenance


Outline

• Motivation
• Sensornet Provenance
• Evaluation
  – Prototype deployment
  – Storage cost
  – Compression alternatives
  – Ease-of-use provenance


Design Choice of Sensornet Provenance

• Representation
  – Annotation vs. inversion
  – Content vs. link
• Granularity
  – Tuple-level (fine-grained) vs. table-level
• Consistency (stream-aware provenance)
  – Timestamping to handle sensor data that keeps arriving
• Authorization
  – The data generator controls data access
  – Pass a “letter of reference” to the owner
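
The timestamping choice can be illustrated with a minimal sketch. It assumes an in-memory, append-only stream and a hypothetical `query_as_of` helper (neither is part of the prototype); the point is only that a query pinned to a timestamp returns the same rows even after new readings arrive.

```python
from bisect import bisect_right


class SensorStream:
    """Hypothetical append-only stream of (timestamp, value) rows."""

    def __init__(self):
        self._times = []
        self._values = []

    def append(self, timestamp, value):
        # Assumes rows arrive in non-decreasing timestamp order.
        self._times.append(timestamp)
        self._values.append(value)

    def query_as_of(self, timestamp, limit):
        """Return the latest `limit` values recorded at or before `timestamp`."""
        end = bisect_right(self._times, timestamp)
        return self._values[max(0, end - limit):end]


stream = SensorStream()
stream.append("2008-02-24 11:00:00", 87.1)
stream.append("2008-02-24 12:00:00", 87.5)
first = stream.query_as_of("2008-02-24 12:00:00", limit=1)

# A new reading arrives after the provenance link was recorded.
stream.append("2008-02-24 13:00:00", 94.3)
replayed = stream.query_as_of("2008-02-24 12:00:00", limit=1)
latest = stream.query_as_of("2008-02-24 13:00:00", limit=1)
```

Timestamps in this fixed-width format compare correctly as strings, which keeps the sketch free of date parsing; a real store would use proper time types.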


Predecessor Links

• Purpose: locate sensor data across different administrations
• Fine-grained, annotation-based, timestamped links
  – Source location
    • Location of the source repository
    • Table at that repository
    • Search from the table
  – Timestamp
    • To replay a relative query and produce the same result
  – Transformation
    • A pointer to a general description, source code, or executable programs
• An example

sb://sensorbase.org/soap/sensorbase2.wsdl?s=getData&a1="datetime,temperature"&a2=p_97_temperature&a3='sensorid="sum-in"'&a4=0&a5=1
&t="2008-02-24 12:00:00"
&x="http://www.isi.edu/ilense/siss/tempread.html"
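
As a rough illustration of the three parts of a predecessor link, the sketch below splits the example link into source location, timestamp, and transformation. The `PredecessorLink` type and `parse_link` function are hypothetical names, and treating `t` as the timestamp and `x` as the transformation pointer is an assumption read off the example, not a documented format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredecessorLink:
    source: str          # repository endpoint plus the query that selects the data
    timestamp: str       # assumed to be the value of the `t` parameter
    transformation: str  # assumed to be the value of the `x` parameter


def parse_link(link: str) -> PredecessorLink:
    """Split a link into its parts (assumes no '&' inside parameter values)."""
    endpoint, _, query = link.partition("?")
    source_params = []
    timestamp = transformation = ""
    for param in query.split("&"):
        key, _, value = param.partition("=")
        if key == "t":
            timestamp = value.strip('"')
        elif key == "x":
            transformation = value.strip('"')
        else:
            source_params.append(param)
    return PredecessorLink(endpoint + "?" + "&".join(source_params),
                           timestamp, transformation)


example = (
    'sb://sensorbase.org/soap/sensorbase2.wsdl?s=getData'
    '&a1="datetime,temperature"&a2=p_97_temperature'
    '&a3=\'sensorid="sum-in"\'&a4=0&a5=1'
    '&t="2008-02-24 12:00:00"'
    '&x="http://www.isi.edu/ilense/siss/tempread.html"'
)
parsed = parse_link(example)
```

Keeping the source query separate from `t` and `x` mirrors the slide's split between where the data lives, when it was read, and how it was transformed.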


Letter of Reference

• Purpose: provide easy-to-use authorization
• Sensor-store security model
  – Public
  – Case-by-case basis
• Letter of reference
  – Contextual information about the data requestor
    • User’s activities: collaboration with others, data sharing activities
    • How the user encountered the provider’s data
• Authentication
  – Provide this context to inform the data owner
  – The owner makes a decision based on it


Outline

• Motivation
• Sensornet Provenance
• Evaluation
  – Prototype deployment
  – Storage cost
  – Compression alternatives
  – Ease-of-use provenance


Prototype Deployment

• Deployment
  – Provenance system
  – Sensors
  – Sensor-store
• Prototype republishers
  – Digit repair
  – Digit repair with image
  – TempMap

[Figure: prototype republishing chain. Components shown: West L.A. Temperature Publishing (Raw: 87.1, ?7.1, 87.?), Image Recognition, Digit Repair and Repair with Image (Fixing Digits; Corrected: 87.1 ±0.1, 87.1 ±0.1, 87.5 ±0.5), and TempMap (interpolate point data into a complete temperature map), connected by republishing steps.]


Storage Alternatives

• Alternatives
  – Copy source
  – Uncompressed links
  – Compressed links
• Small source and republished data
  – Copying source works well
  – Uncompressed link is verbose, larger than data
  – With compression, cost equals copying source

[Figure: storage cost for Digit Repair (small source and republished data)]


Benefits Depend on the Size of Source

• Copying source is expensive when source is large
• Compressed link works well in all three cases

[Figure: storage cost for Repair with Image (large source and small republished data) and TempMap (small source and large republished data)]


Link Compression

• We showed that link compression is important, so what are the compression alternatives?
• Compression alternatives
  – No compression
  – Per-link
  – Incremental
    • Exploit redundancy across predecessor links
• 83% storage saving compared to no compression
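
To make the three alternatives concrete, here is a minimal sketch that uses Python's `zlib` as a stand-in compressor; the slides do not say which algorithm the prototype uses, and the synthetic links below are invented for illustration. "Incremental" is modelled as one shared compression stream that is sync-flushed after every link, so later links can reuse the redundancy of earlier ones. The sizes it produces are not expected to match the 83% figure above.

```python
import zlib


def make_links(n):
    """Synthetic predecessor links that differ only in their timestamp."""
    base = ('sb://sensorbase.org/soap/sensorbase2.wsdl?s=getData'
            '&a1="datetime,temperature"&a2=p_97_temperature&a4=0&a5=1')
    return [f'{base}&t="2008-02-24 12:{i % 60:02d}:00"'.encode()
            for i in range(n)]


def size_uncompressed(links):
    return sum(len(link) for link in links)


def size_per_link(links):
    # Each link is compressed on its own, so shared prefixes are not exploited.
    return sum(len(zlib.compress(link)) for link in links)


def compress_incremental(links):
    # One compressor for all links; a sync flush emits each link's bytes
    # while keeping the history window for the links that follow.
    compressor = zlib.compressobj()
    return [compressor.compress(link) + compressor.flush(zlib.Z_SYNC_FLUSH)
            for link in links]


def decompress_incremental(chunks):
    decompressor = zlib.decompressobj()
    return [decompressor.decompress(chunk) for chunk in chunks]


links = make_links(50)
chunks = compress_incremental(links)
size_incremental = sum(len(chunk) for chunk in chunks)
print(size_uncompressed(links), size_per_link(links), size_incremental)
```

The trade-off in this sketch is that incremental chunks must be decoded in order, since each one depends on the history built by the chunks before it.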


Ease-of-use: Provenance

• Provenance extension
  – Sensorbase.org
  – Predecessor links
• Easy source tracking
  – A single click tracks the source data

[Figure labels: provenance; a list of predecessor links; source data; provenance of the source data]
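
Source tracking amounts to following predecessor links until data with no predecessors is reached. The sketch below shows that walk over an in-memory table; the dataset names and the `predecessors` mapping are hypothetical, and the links are assumed to form no cycles.

```python
def trace_to_sources(dataset, predecessors):
    """Return every path from `dataset` back to data that has no predecessors."""
    parents = predecessors.get(dataset, [])
    if not parents:
        return [[dataset]]
    paths = []
    for parent in parents:
        for path in trace_to_sources(parent, predecessors):
            paths.append([dataset] + path)
    return paths


# Hypothetical provenance table: republished data -> the data it was derived from.
predecessors = {
    "tempmap": ["digit-repair"],
    "digit-repair": ["raw-temperature"],
}
paths = trace_to_sources("tempmap", predecessors)
```

In the distributed setting described in the slides, each lookup would instead resolve a link at another provider's sensor store, possibly after authorization.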


Ease-of-use: Authorization

Easy, user-centric, distributed access control

[Figure: authorization flow. If accessing source data requires authentication: have an account? (Yes / No). A letter of reference is generated (predecessor link, user account, target, user’s activities).]
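
The four fields named for the letter of reference can be sketched as a small record that is serialized and sent to the data owner. The class name, field names, JSON encoding, and sample values are all assumptions made for illustration; the slides list only the kinds of information carried.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class LetterOfReference:
    predecessor_link: str   # how the requestor encountered the provider's data
    user_account: str       # who is asking
    target: str             # which source data is being requested
    activities: list = field(default_factory=list)  # requestor's sharing history

    def to_json(self):
        return json.dumps(asdict(self), sort_keys=True)


letter = LetterOfReference(
    # Abbreviated, hypothetical link; a real one would carry the full query.
    predecessor_link="sb://sensorbase.org/soap/sensorbase2.wsdl?s=getData&a2=p_97_temperature",
    user_account="alice@example.org",
    target="p_97_temperature",
    activities=["republished TempMap"],
)
decoded = json.loads(letter.to_json())
```

The owner would read this context and decide case by case whether to grant access, as described on the Letter of Reference slide.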


Conclusions

• Sensor republishing will become an important means to share sensor data
• New provenance for sensornet
  – Provenance via new linking scheme
  – Easy, user-centric, distributed access control
  – Compression makes the tuple-level provenance reasonable
• http://www.isi.edu/ilense/siss
