SenseMyCity: Crowdsourcing an Urban Sensor

João G. P. Rodrigues*, Student Member, IEEE, Ana Aguiar*, Member, IEEE, João Barros*, Senior Member, IEEE

* Instituto de Telecomunicações, Departamento de Engenharia Eletrotécnica e de Computadores, Faculdade de Engenharia da Universidade do Porto, Porto, Portugal
¹ http://cloud.futurecities.up.pt/sensemycity/

arXiv:1412.2070v1 [cs.CY] 5 Dec 2014

Abstract: People treat smartphones as a second skin, having them around nearly 24/7 and constantly interacting with them. Although smartphones are used mainly for personal communication, social networking and web browsing, they have many connectivity capabilities and are at the same time equipped with a wide range of embedded sensors. Additionally, Bluetooth connectivity can be leveraged to collect data from external sensors, greatly extending the sensing capabilities. However, massive data gathering using smartphones still poses many architectural challenges, such as limited battery and processing power, and possibly connectivity costs.

This article describes SenseMyCity¹ (SMC), an Internet of Things mobile urban sensor that is extensible and fully configurable. The platform consists of an app, a backoffice and a frontoffice. The SMC app can collect data from embedded sensors, like GPS, WiFi, accelerometer, magnetometer, etc., as well as from external Bluetooth sensors, ranging from On-Board Diagnostics devices gathering data from vehicles to wearable cardiac sensors. Adding support for new internal or external sensors is straightforward due to the modular architecture. Data transmission to our servers can occur either on demand or in real time, while keeping costs down by using only the configured type of Internet connectivity. We discuss our experience implementing the platform and using it to conduct longitudinal studies with many users. Further, we present results on bandwidth utilization and energy consumption for different sensors and sampling rates. Finally, we show two use cases: mapping fuel consumption and mapping user stress extracted from cardiac sensors.

Index Terms: Sensor System Integration, Crowd Sensing and Crowd Sourcing, Mobile and Ubiquitous Systems, Smart Cities

I. INTRODUCTION

Crowdsourcing, crowdsensing and participatory sensing have recently become feasible (and popular) ideas thanks to the rapidly growing number of affordable Internet-enabled sensing devices, namely smartphones. They are equipped with a wide range of embedded sensors, like GPS for location, magnetometer, accelerometer, gyroscope and, most recently, even barometer and temperature sensors, as well as connectivity options, like Bluetooth, WiFi, NFC and 3G. Hence, they may become personal sensors in self-quantification applications that range from health to mobility [1], but also sensors for groups of people [2] or for wide geographic areas [3], through aggregation of the values sensed by multiple devices. Obtaining large-scale dynamic datasets for a widespread geographic area, like collecting cartographic and environmental data (e.g. street steepness, pavement type, average traffic delay, noise levels and pollution) in urban areas, becomes a tractable problem if the sensing can be crowdsourced.

However, mobile phones have limited resources, and should nevertheless be available at all times. Users will only collaborate and provide data if the offered incentives or services outweigh at least the possible monetary costs. Data collection typically requires communication from the user's device to a centralized system, which can have inherent transmission costs when using a paid data connection. Furthermore, gathering data from a personal device such as a smartphone should take security and privacy issues into consideration. The users should be in control of when their devices are gathering possibly sensitive data, who can access that data, and for what reason.

In existing sensing platforms these aspects are typically tuned in a trade-off manner and hard-coded in the system according to its specific requirements, e.g. reducing battery consumption by sacrificing sampling rate, or transmitting only when connected to a free data connection and disallowing real-time data transmissions.

In this article, we extend our previous work [4], where we proposed an architecture for a data gathering system and a working prototype based on a netbook. We now describe the design of a framework for a crowdsourced urban sensor that has achieved a maturity level that enables its large-scale deployment among common citizens. The framework comprises a smartphone application that gathers data from many internal and external sensors and transmits it securely to a backend server. It is modular and fully configurable, able to meet requirements on resource consumption, data usage and sampling rates through simple configuration changes. This effectively allows the trade-offs to be made per usage scenario and even changed in real time according to the environment, e.g. decreasing sampling rates if the battery goes below a configured level. We use a two-tier server architecture to store, process and visualize the massive amounts of data.

The main contributions of this paper are: 1) the requirement analysis of a smartphone-based crowdsourced urban sensor (Section I-A); 2) the design of a sensing system that meets the requirements of both users and researchers (Section II); 3) a modular and configurable mobile application for Android OS that can be used in various sensing tasks (Section II-C); 4) a resource consumption study of the different sensors and sampling rates, in terms of energy and storage, providing an estimate of the cost-of-knowledge of such sensing tasks (Section III); 5) example projects that already use the system, and the data they gathered (Section IV).

A. User Requirements

A framework for a data gathering system has to satisfy two main stakeholders: the participants who have to use and handle the gathering device, and the researchers or operators who wish to gather the data. We designed our system with the interests of both in mind.

Intuitive, with little user interaction required: We aim for a massive data gathering system, so the gathering unit should require as little user interaction as possible.

The gathering and sensing devices should be non-intrusive in order to maximize user utilization and avoid biased information. If users constantly feel that their environment is being monitored, they might behave differently and avoid using the system.

Lightweight and stable: Especially if the device is also used for other tasks, the application should run seamlessly, without affecting normal use of the device.

Battery consumption is one of the most important factors. The application should provide information about the energy consumption of different configurations, and allow the user to select the desired sensors.

Our application should support any type of Internet connectivity. However, to avoid mobile communication costs, the user should be in control of when communications occur and which connection type can be used.

Transparency: The user interface should clearly state what the application is doing. No data should be gathered or transmitted without the user's knowledge.

Security and privacy should also be addressed throughout the application. The user shall retain ownership of the gathered data, with full access to his or her own data, including the ability to download or delete the full dataset. Furthermore, the required user identification mechanism should not be directly reversible, and should not be accessible to researchers or data analysis services, not even when processing clusters of data.

B. Researcher Goals

Ubiquitous Availability: The system should support widely used and economical platforms, avoiding the need for specific hardware or expensive devices.

Modular, flexible and extensible: Allow a wide variety of sensors and be easily extended to new sensors and gathering devices.

Reliable: Preserve as much accuracy as possible, from the sensors to the storage facility, in order not to hinder any current or future use the data may have.

Scalability: To increase availability even under high demand, the system should be scalable and compatible with load-balancing mechanisms.

Storage Utilization: The researchers should be provided with an estimate of the bandwidth and storage required for a given sensing task.

II. SENSEMYCITY CROWDSENSOR

To meet the above requirements, we divided our system design into four parts: the system architecture, including data structures; the communication protocol; a working implementation based on a mobile application; and the required back-end servers.

A. System Architecture

Our system is organized in a 3-tier architecture (Fig. 1), where each block has distinct requirements and functions.

Fig. 1. System Architecture (blocks shown: Gathering Units: smartphones, netbooks, car PCs; Sensors: GPS, WiFi, Acc, Gyro, Mag, ...; External: OBD, biosensors, ...; Main Servers; Secondary Servers; DB and Backup DB; functions: access control, store and replicate, data processing, data visualization)

The gathering units form the base block of the system, collecting the data acquired by the sensors. Although modular, these units are responsible for synchronizing the data from their various sources, ensuring the quality and reliability of the data, and encapsulating and transmitting it to the main servers using our protocol. The communication between these units and the main servers uses a machine-to-machine protocol based on JSON and standard encryption protocols, described in Section II-B. We have implemented data collection programs for netbooks and car PCs, and have also created an application for Android-based smartphones that is further discussed in Section II-C.

The main servers are the system core. Their job is to receive the data from the units, store it, and replicate it asynchronously to the secondary servers. They should be reliable, with very high uptime, and offer large storage space. To increase uptime, the database storage engine should only perform write queries and the simple indexed read queries that may be needed when creating or restoring a secondary server. The main servers can also be configured as a load-balanced or multi-master database if increased write demand makes it necessary. There are software solutions that provide a multi-master architecture for relational databases with primary-key collision avoidance, and our system should be compatible with any of them.

Secondary servers are responsible for providing any desired application-level service, such as data processing and visualization, and for enforcing user-defined privacy policies, anonymity and access control. These services usually require higher bandwidth and complex database read queries that can leave any database engine unresponsive for some time. The servers can maintain either a replica of the entire database or just a specialized subset of the data, selected according to the data requirements of a project or to the location of the data, such as a server dedicated to processing the data from a specific city.

Our system should be compatible with any existing multi-server, replication or load-balancing database software that is able to improve the uptime and scalability of the system. However, we would like to emphasize the importance of a multi-server architecture (Fig. 1) in sensing systems, because data gathering and storage, on the one hand, and data analysis, processing and visualization, on the other, have completely distinct requirements. Storing the data takes little computational power, but should be as reliable and quick as possible. Analyzing or visualizing the data can be very computationally expensive, and new experiments can block or destroy the whole database, but in most cases this work does not have tight deadlines.

This division into three blocks with two-tier servers also provides added security, since the only servers exposed to users or researchers are the secondary ones, which can be rolled back at any time without losing raw data.

1) Data Structure: The database structure is not fixed; it is updated when new sensors or requirements are added to the system. However, to increase the usability of the framework and the compatibility between different projects and implementations, we defined a set of possible data types or tables.

Session information, with data about the recording session such as start time, gathering unit identifiers, user and vehicle information, cryptographic keys, and application version. These tables should provide the identifiers used by the rest of the database, such as user_id, vehicle_id and session_id. User identification exists for authentication and ownership reasons, but it is implemented using a one-way function such as hashing, and these tables should not be accessible to any researcher or user.

Gathered data from sensors or other sources, such as questionnaires or device events, have distinct formats and can thus have their own data types and structure. However, they should follow some structural rules to facilitate relational data analysis. To this end, every gathered data item should be timestamped and identified by the current session_id. A common identifier makes session analysis and access control much easier to implement.

These tables get one new row of data per second for typical periodic sensors. Samples from sensors with higher frequencies, such as the accelerometer or gyroscope, are stored as an array. A milliseconds column should be used by asynchronous sensors, such as an On-Board Diagnostics (OBD) device, where the delay between readings is not constant, or by sensors with a higher time precision, such as the Global Positioning System (GPS). Questionnaire responses and device events, such as battery consumption status, can also be stored and sent to the server.

Auxiliary data are tables that contain non-personal data shared among many sessions or users. One example is a table relating MAC addresses to WiFi ESSIDs, which saves storage space and bandwidth by keeping the repetitive, static information about the networks found in a single place.
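The row layout described above can be illustrated with a minimal Python sketch. The class name SensorRow, its field names and the helper group_by_second are hypothetical and are not taken from the SenseMyCity schema; the sketch only shows the stated rules: every row carries a session identifier and a timestamp, high-rate samples are packed into one array per second, and a milliseconds field is available for asynchronous sensors.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SensorRow:
    """One row of a gathered-data table (hypothetical layout)."""
    session_id: int                 # common identifier shared by all tables
    timestamp: int                  # Unix time in whole seconds
    millis: Optional[int] = None    # sub-second part, for asynchronous sensors
    values: List[float] = field(default_factory=list)  # array for high-rate sensors


def group_by_second(session_id, samples):
    """Pack (time_ms, value) samples into one row per second."""
    rows = {}
    for time_ms, value in samples:
        second = time_ms // 1000
        row = rows.setdefault(second, SensorRow(session_id, second))
        row.values.append(value)
    return [rows[s] for s in sorted(rows)]


# Three accelerometer samples: two in the first second, one in the next.
accel = [(1417737600000, 0.1), (1417737600020, 0.2), (1417737601000, 0.3)]
rows = group_by_second(42, accel)

# An asynchronous reading (e.g. from OBD) keeps its millisecond offset instead.
obd_row = SensorRow(session_id=42, timestamp=1417737600, millis=137, values=[54.0])
```

Grouping by whole seconds keeps the one-row-per-second rule for periodic sensors while still preserving every high-frequency sample.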

B. Reliable and Secure Data Gathering

In this section we describe the communication protocol that handles the communication between gathering units and main servers. It was designed to be efficient but secure, to protect the possibly sensitive data gathered from a user's device. The protocol guarantees confidentiality through encryption, integrity and reliability through compression and feedback, and an access control mechanism is responsible for authentication.

Our protocol encrypts every transmission using standard mechanisms, more precisely 4096-bit RSA public-key cryptography to exchange per-session random 128-bit AES symmetric keys.

We implemented reliability through encrypted application-layer feedback, sent after the data has been successfully stored in the database. Also, to minimize resource utilization and the damage done by lost packets, our server implementation works in a stateless mode: it keeps no record of an internal communication state and treats every arriving packet the same way. These features allow a gathering unit to transmit data in real time or on demand at arbitrary times, using any kind of network connection. A unit can opt for network-friendly protocols (such as TCP) on a high-quality connection, while using more efficient or packet-oriented protocols (such as UDP) in other network environments, such as vehicular networks.

For authentication and access control, we avoid account creation and password handling by using OpenID, an open and secure authentication method. To access the data, a user has to log in to the front-end server via OpenID, with the same e-mail address configured and verified in the gathering unit. To this end, the only user identification automatically stored on our servers is a normalized and hashed version of the user's e-mail address, which is not directly identifiable. Furthermore, the table relating a user's hash to the internal user ID is protected, accessible only to the authentication mechanism and the database administrator.

The gathering unit is responsible for authenticating the user's OpenID when gathering or uploading data, but for efficiency we perform no user authentication on the server when storing the data. This may allow a malicious user to replicate the data-gathering protocol or to fake the OpenID validation. However, in this situation the attacker can only upload data under a fake user ID, and is not able to access any existing data from another user. Other data forgery attacks are also impossible to prevent, e.g. almost every smartphone can be configured to return fake data from the integrated sensors. A user can also alter gathered data just by changing the date and time of the mobile device's internal clock.

We decided not to use available cryptographic protocols such as TLS or DTLS, for simplicity. Even though those protocols are standard and have proven security, they offer many capabilities that are not required for applications like ours, such as certificate or cipher specification exchange. By designing a custom protocol based on the same security standards, we were able to integrate our authentication phase into the first transmitted packet, which is responsible for handshake and key exchange, reducing the number of overhead messages to zero (Fig. 2). For the same reasons, complexity and overhead, we also did not use any existing web-service standards or frameworks. They typically bring architectural and design constraints, such as required resource identification, fixed message formats, or a specific allowed communication protocol. A data gathering platform should be able to deal with massive amounts of messages and data, which makes it very important to reduce the required processing and bandwidth. On the other hand, it only needs to support session authentication and data upload.

Fig. 2. Protocol description, with session authentication and data transmission with feedback (exchanges shown between Unit and Server: Authentication, Data Upload)

The protocol is divided into two parts (Fig. 2): a 2-way handshake and a data transmission phase with feedback. The 2-way handshake phase is responsible for session authentication and symmetric key exchange. The gathering units should authenticate every sensing session by sending a packet with the following fields:

• seq: a sequence number defined by the gathering unit. It identifies the response packet, allowing the unit to send multiple authentication packets in parallel.

• hash: the MD5 hash of the user e-mail.

• time: the Unix timestamp of the start of the session. This timestamp, together with the user ID, forms the session's primary key.

• key: a per-session randomly generated AES-128 symmetric key that should be used for the remainder of the session.

• version: the application's version, to allow for future format upgrades with backwards compatibility.

• identifiers: reserved for optional extra session information, such as identifiers of the gathering unit, sensors or vehicles.

Both the key and the identifiers of a session can be changed by the gathering unit at any time, by re-sending an authentication packet with the same hash (user ID) and time. This means the gathering unit does not need to store the AES key of each session, and can update any identifier that was unavailable at the start.

On receiving this packet, the server should initialize the session in the database and return the time and the unique session identifier (session_id) now reserved for that session's data. Both fields are compressed and encrypted before transmission using the received symmetric key.
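As an illustration of the authentication packet fields listed above, the following Python sketch assembles such a packet as JSON. It is a sketch under stated assumptions, not the SenseMyCity wire format: the function names, the hex encoding of the key and the normalization rule (trim and lowercase) are hypothetical. The RSA encryption of the packet with the server's public key is deliberately omitted, because the Python standard library has no RSA implementation.

```python
import hashlib
import json
import os


def user_hash(email: str) -> str:
    """MD5 hash of the normalized e-mail.

    Assumption: normalization means trimming whitespace and lowercasing.
    """
    return hashlib.md5(email.strip().lower().encode("utf-8")).hexdigest()


def build_auth_packet(seq, email, session_start, version, identifiers=None):
    """Return (session_key, serialized_packet) for one sensing session."""
    key = os.urandom(16)  # random 128-bit per-session AES key
    packet = {
        "seq": seq,                        # lets the unit match the response
        "hash": user_hash(email),          # user identification
        "time": int(session_start),        # Unix time of the session start
        "key": key.hex(),                  # hex encoding is an assumption
        "version": version,
        "identifiers": identifiers or {},  # optional extra session information
    }
    # In the protocol described above, this payload would additionally be
    # encrypted with the server's RSA public key before transmission.
    return key, json.dumps(packet).encode("utf-8")


key, payload = build_auth_packet(
    seq=1,
    email=" Alice@Example.org ",
    session_start=1417737600,
    version="1.0",
    identifiers={"vehicle": "demo"},
)
```

Re-sending a packet built with the same e-mail and session start but a fresh key corresponds to the key update described in the text.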

In the data transmission phase, the gathering unit creates a packet whose first 4 bytes contain the received unique session ID, followed by the compressed and encrypted payload with the data to be transmitted. The ID and the encryption key must be the ones exchanged during the handshake, otherwise the server will discard the packet.

The sequence number is used to identify the feedback packets, allowing the unit to transmit the next data packets before receiving feedback for the previous one. This asynchronous transmission can substantially increase the throughput on high-latency connections. The gathering unit is responsible for sequence number generation, tracking, and any necessary retransmissions.

The feedback packet is only generated after the received data has been stored and flushed, and contains the number of rows received and successfully written to the DB. This serves as a guarantee of receipt and storage, making it safe for the gathering unit to delete the local data after receiving positive feedback.
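The unit-side bookkeeping implied by sequence numbers and feedback can be sketched as follows. The class UploadTracker and its methods are hypothetical names; the rule that a batch counts as confirmed only when the reported row count equals the number of rows sent is an assumption, since the text does not say how partial feedback is handled.

```python
class UploadTracker:
    """Tracks data packets that were sent but not yet confirmed."""

    def __init__(self):
        self._next_seq = 0
        self._pending = {}  # seq -> rows awaiting feedback

    def send(self, rows):
        """Register a batch as in flight and return its sequence number."""
        seq = self._next_seq
        self._next_seq += 1
        self._pending[seq] = list(rows)
        return seq

    def on_feedback(self, seq, rows_stored):
        """Return True if the local copy of the batch may now be deleted."""
        rows = self._pending.get(seq)
        if rows is None:
            return False          # unknown or duplicate feedback
        if rows_stored == len(rows):
            del self._pending[seq]
            return True           # all rows were written by the server
        return False              # keep the batch for retransmission

    def to_retransmit(self):
        """Batches still waiting for positive feedback."""
        return dict(self._pending)


tracker = UploadTracker()
first = tracker.send(["row-a", "row-b", "row-c"])
second = tracker.send(["row-d"])            # sent before feedback for `first`
confirmed = tracker.on_feedback(second, 1)  # feedback may arrive out of order
partial = tracker.on_feedback(first, 2)     # only 2 of 3 rows were stored
```

Because several batches can be pending at once, the unit never has to wait a full round trip before sending the next packet.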

In our implementation we used JSON and zlib to serialize and compress the data before transmission, since they fully serve our purpose while being simple and lightweight.
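The data packet framing described above (a 4-byte session ID followed by the compressed and encrypted payload) can be sketched with the Python standard library. Several details are assumptions, not taken from the paper: network byte order for the ID, carrying the sequence number inside the JSON body, and the function names. The AES step is represented by a pluggable callable that defaults to the identity function, so the sketch shows the framing and serialization only, not real encryption.

```python
import json
import struct
import zlib


def identity(data: bytes) -> bytes:
    """Placeholder for the AES step; performs no encryption."""
    return data


def build_data_packet(session_id, seq, rows, encrypt=identity):
    """Serialize with JSON, compress with zlib, then prepend the session ID."""
    body = json.dumps({"seq": seq, "rows": rows}).encode("utf-8")
    sealed = encrypt(zlib.compress(body))
    # Assumption: the 4-byte ID is an unsigned integer in network byte order.
    return struct.pack("!I", session_id) + sealed


def parse_data_packet(packet, decrypt=identity):
    """Server-side inverse of build_data_packet."""
    (session_id,) = struct.unpack("!I", packet[:4])
    body = zlib.decompress(decrypt(packet[4:]))
    return session_id, json.loads(body)


packet = build_data_packet(7, 3, [[1417737600, 41.15, -8.61]])
session_id, message = parse_data_packet(packet)
```

Keeping the session ID outside the encrypted part lets a stateless server look up the right session key before it attempts to decrypt anything.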

C. Mobile Data Gathering Application

The Android application, labeled SenseMyCity, was developed to become a testbed for the FutureCities Project at Instituto de Telecomunicações. The application is aimed at urban data gathering, taking advantage of the sensors embedded in Android smartphones.

In order to gain acceptance by a large audience in the general public, SenseMyCity was designed to run on a wide range of Android smartphones and to require almost no configuration or user interaction. The interface was designed to be intuitive and easy to use in any possible scenario. A significant amount of time was invested in enhancing the stability and efficiency of the application, to allow it to run properly across smartphones with different hardware resources.

1) Interface: The application is designed to have a simple and minimalistic interface, with just five big buttons: to start, pause or terminate the gathering session, to access the settings menu, and to synchronize the gathered data with the server. A widget is also available to quickly start, pause or stop a session without leaving the smartphone's home screen. A verbose interface is also available that shows the activated sensors and the most recently gathered data.

The settings menu is very complete, since almost every possible parameter is configured from this interface. However, the application was designed to be almost configuration free, with default settings that gather data from GPS, accelerometer and WiFi networks. Settings include, among others, activation and deactivation of individual sensors, their desired sensing frequency, the choice between real-time data transfers and user-activated synchronization, and the Internet connection type allowed for real-time data transmissions. Furthermore, a researcher can fix or limit the configuration, for example by specifying the active sensors and their sampling rates, prior to deployment.

When a sensing session is started, the application initializes the configured sensors and connects to the external devices. The arriving data is buffered, stored and transmitted as configured in the settings menu. The application keeps collecting data in the background, without an active interface, but always showing a notification with its status.
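The transmission settings described above can be read as a small decision rule, sketched here in Python. The names UploadSettings and may_upload_now and the connection labels are hypothetical, and the sketch assumes that the connection-type restriction applies only to real-time transfers, while a user-activated synchronization may use whatever connection is currently available.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UploadSettings:
    """Hypothetical subset of the transmission-related settings."""
    realtime: bool = False                               # real-time transfers enabled
    allowed_connections: frozenset = frozenset({"wifi"})  # allowed for real-time use


def may_upload_now(settings, connection, user_requested=False):
    """Decide whether data may be transmitted at this moment.

    connection is a label such as "wifi" or "3g", or None when offline.
    """
    if connection is None:
        return False
    if user_requested:
        # Assumption: the user explicitly asked to synchronize now.
        return True
    return settings.realtime and connection in settings.allowed_connections


realtime_wifi_only = UploadSettings(realtime=True)
on_wifi = may_upload_now(realtime_wifi_only, "wifi")
on_mobile = may_upload_now(realtime_wifi_only, "3g")
manual_sync = may_upload_now(UploadSettings(), "3g", user_requested=True)
```

Keeping the rule in one function makes it easy for a researcher-provided configuration to fix or restrict these values before deployment.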

2) Architecture (Fig. 3): The main service is responsible for controlling the whole application state and passing feedback to any active interface or notification. This service starts gathering sessions by initializing the configured sensor threads and the required managers. Every resource is executed in a separate thread, including the network connection, database access, each sensor, the memory buffer and the UI. The main service also ensures that every sensor thread is correctly terminated and that the data is flushed to local storage when the user terminates the session.

Fig. 3. Application architecture depicting the various processing modules and information flow between them (modules shown: Main Activity, Widget, Main Service, Sensor Services, Memory Manager, Buffer, Internet Manager, Database Manager)

Each sensor thread communicates with the corresponding sensor, and receives, timestamps and sometimes converts the gathered data before sending it to the Memory Manager. During tests, the main performance bottleneck of our mobile application proved to be the local SQLite storage, which can introduce a delay of a few seconds for a single database operation. The Memory Manager buffers the data, so that the application only requires one SQLite database transaction every few seconds, while performing optional real-time synchronizations at the same time.
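The buffering idea can be sketched in Python with the standard sqlite3 module. This is not the Android implementation: the class MemoryBuffer, the table layout and the five-second interval are hypothetical, and threading is left out. The sketch only shows how many sensor readings can be collected in memory and written in a single transaction.

```python
import sqlite3
import time


class MemoryBuffer:
    """Collects rows in memory and writes them in one transaction per interval."""

    def __init__(self, conn, flush_interval=5.0, clock=time.monotonic):
        self.conn = conn
        self.flush_interval = flush_interval  # seconds between transactions
        self.clock = clock                    # injectable for testing
        self.rows = []
        self.last_flush = clock()
        conn.execute(
            "CREATE TABLE IF NOT EXISTS samples "
            "(session_id INTEGER, ts_ms INTEGER, sensor TEXT, value REAL)"
        )

    def add(self, session_id, ts_ms, sensor, value):
        """Buffer one reading; flush if the interval has elapsed."""
        self.rows.append((session_id, ts_ms, sensor, value))
        if self.clock() - self.last_flush >= self.flush_interval:
            self.flush()

    def flush(self):
        """Write all buffered rows in a single SQLite transaction."""
        if self.rows:
            with self.conn:  # commits once for the whole batch
                self.conn.executemany(
                    "INSERT INTO samples VALUES (?, ?, ?, ?)", self.rows
                )
            self.rows = []
        self.last_flush = self.clock()


conn = sqlite3.connect(":memory:")
buf = MemoryBuffer(conn, flush_interval=5.0)
for i in range(3):
    buf.add(1, 1000 + i, "acc", 0.1 * i)
buf.flush()  # e.g. when the user terminates the session
count = conn.execute("SELECT COUNT(*) FROM samples").fetchone()[0]
```

Calling flush explicitly at the end mirrors the requirement that all data be written to local storage when a session is terminated.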

The Internet Manager implements the designed communi-cation protocol for authenticating and uploading the data toour servers described in Section II-B. All of the requiredcryptography and compression algorithms are available in theAndroid or java standard libraries. The server RSA key ishardcoded on the application and thus can be changed anytimevia an application update. The symmetric keys are randomlygenerated when a session is started.

The application requires at least one Google account to be configured on the Android phone, thereby leveraging the built-in Google account verifier as the authentication method for the user e-mail, as described in Section II-B. This allows a user to log in to our front-end using Google's OpenID authentication.

3) Data Synchronization: A common problem of data gathering systems is the synchronization and timing accuracy of the data. To minimize the loss of time information, reading the device timestamp in milliseconds is the first task performed whenever new sensor data arrives. This keeps the delay and jitter in the time information to a minimum.

We can further improve timing accuracy with offline processing, e.g., GPS data is stored with both the satellite time and the smartphone's internal clock time, and can thus be used to compare and correct the device's clock.
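One possible form of this offline correction is sketched below. It assumes each GPS fix is available as a pair of device and satellite timestamps in milliseconds, and it uses the median offset as a robust estimate; the median and the function names are assumptions made for illustration, not the method used by the authors.

```python
from statistics import median


def estimate_clock_offset_ms(gps_fixes):
    """Estimate the device clock offset from (device_millis, gps_millis) pairs."""
    offsets = [gps_ms - device_ms for device_ms, gps_ms in gps_fixes]
    if not offsets:
        raise ValueError("need at least one GPS fix")
    # The median ignores occasional outliers such as a stale fix.
    return median(offsets)


def correct_timestamps(device_millis, offset_ms):
    """Shift device timestamps of any sensor onto the GPS time base."""
    return [t + offset_ms for t in device_millis]
```

The same offset can then be applied to the timestamps of every other sensor recorded in the session.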

D. Back-office Implementation

Our back-office implementation runs on two servers, implementing the proposed tiered architecture with one main server and one secondary server.

Fig. 4. Back-end implementation. Components shown: Connection Workers (TCP/UDP), Authentication Workers, SQL Workers and Database Manager, exchanging Data and Feedback.

1) Main Server (Fig. 4): In our sample implementation we use a lightweight Java daemon to handle the requests and an SQL database for storage. The daemon is composed of five threads: two Connection Workers listening for incoming packets, an Authentication Worker handling handshakes and key management, an SQL Worker decrypting and decompressing received data packets, and a Database Manager performing any required database operation. We have developed the server to use both PostgreSQL and MySQL local databases, but it is easily configurable to work with any other storage engine.

The Java application listens on two distinct sockets, for authentication and data packets respectively, each in both UDP and TCP mode. As stated in Section II-B, this application is stateless in both the authentication and data transmission phases, analyzing every received packet the same way.

The worker that handles authentication packets is responsible for decrypting them using the server private key, decompressing the data, and sending the SQL commands to the database manager. This service discards and ignores any packet that cannot be decrypted using the server private key. Every database request related to authentication is handled by the same DB connection to prevent lockups and race conditions, running in an optimized worker thread. This worker inserts or updates the corresponding session in the database, and returns the corresponding unique session id back to the gathering unit. The authentication worker also caches recent pairs of session id and AES key to avoid unnecessary database reads when receiving data packets.

Arriving data packets are handled by another worker thread, which starts by searching for the corresponding AES key, and discards the packet if no key is found or if the decryption and decompression steps fail. To maximize performance, this worker parses and separates the received data into optimized structures before sending it to the database worker. This database worker is a different one from the authentication phase, since it operates over different tables. Again, feedback is only sent back to the unit when storage confirmation is received from the database.
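The key lookup described above can be sketched as a small least-recently-used cache with a database fallback. This is an illustration only: the class `SessionKeyCache`, the capacity, and the callback `load_key_from_db` are hypothetical names, and the eviction policy is an assumption since the text only says that recent pairs are cached.

```python
from collections import OrderedDict


class SessionKeyCache:
    """Keeps the most recently used session id -> AES key pairs in memory."""

    def __init__(self, capacity=1024):
        self.capacity = capacity  # assumed size, not from the paper
        self._keys = OrderedDict()

    def put(self, session_id, aes_key):
        self._keys[session_id] = aes_key
        self._keys.move_to_end(session_id)
        if len(self._keys) > self.capacity:
            self._keys.popitem(last=False)  # evict the least recently used pair

    def get(self, session_id):
        key = self._keys.get(session_id)
        if key is not None:
            self._keys.move_to_end(session_id)
        return key


def key_for_packet(session_id, cache, load_key_from_db):
    """Return the AES key for a data packet, or None if the packet must be discarded."""
    key = cache.get(session_id)
    if key is None:
        key = load_key_from_db(session_id)  # hypothetical database read
        if key is None:
            return None  # unknown session: the caller drops the packet
        cache.put(session_id, key)
    return key
```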

With this back-end architecture we ensure reliability, since all necessary data, from session information to the received sensor data, is stored in the database before the response is sent to the gathering unit.

Our server is designed to store the data in a local PostgreSQL database, which is configured to replicate the data to a secondary server. We considered using NoSQL since it has some advantages over relational databases on huge datasets, mainly with single-index write operations. However, the two have similar performance in multiple-index multiple-write workloads, and NoSQL stores even have lower performance on complex reads [5]. We opted for a relational database since our system performs mainly multiple-index bulk writes to the servers and can have a huge amount of multiple-index complex reads.

Fig. 5. Database structure. Tables and columns (tables are linked by 1-to-n relationships in the original figure):
User: user_id, name, email, age, ...
Vehicle: session_id, vin, model, fuel_type, displacement, mileage, weight
Session: session_id, user_id, vehicle_id, aes_key, start_time, version
GPS: session_id, seconds, millis, gpstime, sats_count, lat, lon, alt, track, speed, climb, accuracy, hdop
Accelerometer: session_id, seconds, x_axis, y_axis, z_axis
OBD: session_id, seconds, milliseconds, code, value
Temperature: session_id, seconds, milliseconds, data
Log: session_id, seconds, milliseconds, code, value
WifiMonitor: session_id, seconds, mac_addr, snr, noise
WifiName: mac_addr, lastseconds, essid, channel, authentication, mode
BluetoothMonitor: session_id, seconds, mac_addr, snr
BluetoothName: mac_addr, lastseconds, essid, class

Our data processing projects and front-end visualizations typically query the database using multiple indexes, such as by session id and timestamp when showing a session to its user, by GPS coordinates for spatial queries or clustering, and many others. These queries may also require correlating the data between many sensors, such as when mapping WiFi hotspots, as can be seen in our examples in Section IV, which makes a relational database a good fit. We opted for PostgreSQL by default since it offers a robust GIS extension, and a large part of our data analysis is based on geographical information. Furthermore, it allows an easy setup of replication or multi-master architectures, solving scalability problems that may occur.
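A query of this kind can be sketched as follows. The example uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL so that it runs anywhere; the table and column names follow Fig. 5, while the index names, the reduced column set and the sample rows are placeholders made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE gps (session_id INTEGER, seconds INTEGER, lat REAL, lon REAL);
    CREATE TABLE wifimonitor (session_id INTEGER, seconds INTEGER,
                              mac_addr TEXT, snr INTEGER);
    -- Composite indexes support lookups by session id and timestamp.
    CREATE INDEX gps_session_time ON gps (session_id, seconds);
    CREATE INDEX wifi_session_time ON wifimonitor (session_id, seconds);
    """
)
# Placeholder rows, not measured data.
conn.executemany(
    "INSERT INTO gps VALUES (?, ?, ?, ?)",
    [(1, 100, 41.15, -8.61), (1, 101, 41.16, -8.62)],
)
conn.executemany(
    "INSERT INTO wifimonitor VALUES (?, ?, ?, ?)",
    [(1, 100, "aa:bb", -60), (1, 101, "aa:bb", -70), (2, 100, "cc:dd", -50)],
)


def georeferenced_wifi(conn, session_id, t_start, t_end):
    """Correlate WiFi observations with the GPS fix taken in the same second."""
    return conn.execute(
        """
        SELECT w.mac_addr, w.snr, g.lat, g.lon
        FROM wifimonitor AS w
        JOIN gps AS g
          ON g.session_id = w.session_id AND g.seconds = w.seconds
        WHERE w.session_id = ? AND w.seconds BETWEEN ? AND ?
        ORDER BY w.seconds
        """,
        (session_id, t_start, t_end),
    ).fetchall()
```

In a PostgreSQL deployment the spatial part of such queries would normally be delegated to the GIS extension mentioned above rather than to plain latitude and longitude columns.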

The implemented database structure can be seen in Fig. 5, and follows the structure proposed in Section II-A1 with three distinct data types: session information, gathered data, and auxiliary data.

The first stores session information, including vehicle and user information. As previously stated, for privacy reasons, the users table is only accessible by the authentication mechanisms and by the database administrator.

The auxiliary data prevents the system from transmitting or storing redundant data, saving considerable bandwidth and storage space.

We try to have a very similar structure for every table holding gathered data, with similar primary keys, codifications and column names. For example, the tables storing accelerometer, gyroscope and magnetometer data have exactly the same structure.

In Section III-B we present the storage requirements for each sensor used in common sensing tasks.

TABLE I
AVERAGE BATTERY CONSUMPTION AND AUTONOMY AFTER 3 H SENSING PERIODS

                              Samsung I5500         Nexus 4 E960
Configuration                 3h Cons.   Auton.     3h Cons.   Auton.
Idle                          5.6%       53.6h      1.4%       214.3h
Acc, Gyro, Mag                16.7%      18.0h      4.3%       69.8h
GPS                           25.0%      12.0h      14.0%      21.4h
Wifi (WF) Scan                19.8%      15.2h      11.3%      26.5h
BlueTooth (BT) Scan           11.0%      27.3h      6.0%       50.0h
External via BT               11.0%      27.3h      9.7%       30.9h
Ambient pressure              -          -          5.0%       60.0h
Acc, Gyro, Mag, GPS           34.8%      8.6h       16.6%      18.1h
Acc, Gyro, Mag, GPS, WF, BT   49.8%      6.0h       24.3%      12.4h
All internal                  49.8%      6.0h       28.4%      10.6h

2) Front End: The gathered data can only be accessed and processed through the secondary servers. We implemented a simple visualization web page that handles authentication via OpenID with a Google account, and allows users to visualize all of their recorded sessions and associated data using a GIS service. Other running services and algorithms constantly provide new data to the secondary databases, such as calculating fuel consumption from OBD data, or mapping the WiFi signal strength of detected open networks. Some of the projects that provide this extra information are described in Section IV.

Besides the user visualization web page, we provide a web page that allows a user to access, download and delete all of their data.

III. MEETING THE USER INTERESTS

A. Saving the user’s battery

The reconfigurability of our system was leveraged to perform a battery consumption study with different classes of devices, sensor configurations and sampling rates. In this section we compare a small, low-cost 2010 smartphone (Samsung I5500 running Android 2.3.7, with a single-core 600 MHz CPU and a 1200 mAh battery) to a powerful 2012 model (LG Nexus 4 E960 running Android 4.2.2, with a quad-core 1.5 GHz CPU and a 2100 mAh battery). The study involved at least 5 runs of each configuration and smartphone model, with everything turned off except for our running application and the desired sensors, and analyzed the battery level (%) after a 3-hour data-gathering session. We consider these results, shown as percentage of battery consumed and estimated time to battery depletion, to be more informative to the users.

Table I presents some of these results, where we can see a clear improvement between phones in the energy consumption of the Accelerometer, Gyroscope and Magnetometer sensors, even after taking into consideration the differences in battery capacity (see estimated wattage from Fig. 7).
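The autonomy column of Table I is consistent with a linear extrapolation of the 3 h measurement. The sketch below assumes a constant drain over the whole discharge; the function names are illustrative. The second helper scales the nominal capacity in the same way, so its result is in mAh per hour of sensing; converting it to power would additionally require the battery voltage, which the text does not give.

```python
def autonomy_hours(consumed_pct, session_hours=3.0):
    """Estimated time to depletion, assuming the measured drain stays constant."""
    if consumed_pct <= 0:
        raise ValueError("consumed_pct must be positive")
    return session_hours * 100.0 / consumed_pct


def average_drain(capacity_mah, consumed_pct, session_hours=3.0):
    """Average charge drawn per hour of sensing, in mAh per hour."""
    return capacity_mah * consumed_pct / 100.0 / session_hours


# GPS row of Table I: 25.0% (Samsung I5500) and 14.0% (Nexus 4) after 3 h.
i5500_gps_autonomy = autonomy_hours(25.0)
nexus4_gps_autonomy = autonomy_hours(14.0)
```

For the GPS row this gives 12.0 h for the Samsung I5500 and about 21.4 h for the Nexus 4, in line with the reported autonomy values.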

Fig. 6 presents the most common sensing setups and their corresponding battery consumption as average power (Bat. Cons.) and estimated battery autonomy (Bat. Life). By performing this analysis we can give users and researchers a prediction of the battery requirements of a specific sensing scenario.

Fig. 6. Energy consumption for different sensor setups

Setup                        I5500 Bat. Cons.   E960 Bat. Cons.   I5500 Bat. Life   E960 Bat. Life
All Off                      22.4 mW            9.8 mW            53.6 h            214.3 h
Only GPS 1 Hz                100.0 mW           98.0 mW           12.0 h            21.4 h
Acc, Gyro, Mag, GPS          139.2 mW           116.2 mW          8.6 h             18.1 h
Acc, Gyr, Mag, GPS, WF, BT   199.2 mW           169.8 mW          6.0 h             12.4 h
All Sensors                  199.2 mW           198.8 mW          6.0 h             10.6 h

Fig. 7. Energy consumption for different Accelerometer sampling rates

Setup         I5500 Bat. Cons.   E960 Bat. Cons.   I5500 Bat. Life   E960 Bat. Life
Acc Normal    65.3 mW            25.7 mW           18.4 h            81.8 h
Acc Fast      64.0 mW            30.3 mW           18.8 h            69.2 h
Acc Faster    65.3 mW            42.0 mW           18.4 h            50.0 h
Acc Fastest   64.0 mW            49.0 mW           18.8 h            42.9 h

Fig. 7 compares the battery consumption of different accelerometer sampling rates. The older Samsung phone showed a constant consumption across all sampling rates, from 5 Hz to 50 Hz, while the newer Nexus 4 showed a more efficient sensor but with an increasing consumption for higher sampling rates, from 5 Hz to 200 Hz.

From Fig. 8, we can see that the GPS energy consumption does not scale linearly with the sampling rate, which is expected since a GPS fix takes at least a few seconds to obtain after waking up from standby. A 60 s GPS sampling interval results in a 35% energy reduction (-35 mWh) compared to 1 s sampling.

Fig. 8. Energy consumption for different GPS sampling rates

Setup         I5500 Bat. Cons.   E960 Bat. Cons.   I5500 Bat. Life   E960 Bat. Life
GPS 1 Hz      100.0 mW           98.0 mW           12.0 h            21.4 h
GPS 1/5 Hz    101.3 mW           86.8 mW           11.8 h            24.2 h
GPS 1/30 Hz   80.0 mW            77.0 mW           15.0 h            27.3 h
GPS 1/60 Hz   68.0 mW            66.5 mW           17.6 h            31.6 h

Fig. 9. Energy consumption for different WiFi sampling rates

Setup          I5500 Bat. Cons.   E960 Bat. Cons.   I5500 Bat. Life   E960 Bat. Life
WiFi 1 Hz      79.0 mW            79.3 mW           15.2 h            26.5 h
WiFi 1/5 Hz    46.7 mW            45.5 mW           25.7 h            46.2 h
WiFi 1/30 Hz   28.0 mW            31.5 mW           42.9 h            66.7 h
WiFi 1/60 Hz   25.0 mW            25.2 mW           48.0 h            83.3 h

TABLE II
STORAGE REQUIREMENTS FOR DIFFERENT SENSORS

Streams          Sample size   Sample rate    Approx. storage
GPS              47 B          1 Hz           50 B/s
Acc, Gyro, Mag   8+3x2 B       4 - 200 Hz     32 - 1200 B/s
Wifi Scanner     17 B          0 - 10 Hz      10 - 100 B/s
BT Scanner       16 B          0 - 20 /10 s   0 - 20 B/s
Pressure         12 B          1 /10 s        1 B/s
OBD              14 B          4 - 12 Hz      112 B/s

Typical storage            ≈ 150 B/s
Storage with all sensors   < 400 B/s
Maximum storage            ≈ 3800 B/s

In a similar fashion, we can see from Fig. 9 that, just like the GPS, the WiFi module energy consumption does not scale proportionally to the sampling rate. However, we can save more energy with this sensor than with the GPS, saving 69% (-54 mWh) by using a 60 s sampling interval instead of 1 s.

Overall, we can see that, in the tested models, the smartphone battery lasts for 6 h to 10 h when collecting data from every internal sensor. A user who travels for 1 h every day will end the day with 15% less battery as a result of sensing those trips.

B. Minimizing the user’s storage and communication costs

The size and amount of collected data can have a big impact on the cost-of-knowledge, depending on the communication and storage costs. To save storage space, our mobile application compresses the gathered data and stores higher-frequency sensors in arrays. In Table II we present the resulting storage requirement of individual sensors and typical configurations.

With default configurations, our application gathers data from the GPS at 1 Hz; from the WiFi scanner, sensing networks every 2 or 3 seconds; and from the Accelerometer, Gyroscope and Magnetometer at the default Android rate, typically between 4 and 15 Hz. This configuration results in around 150 bytes of data stored per second, or 540 KB per hour, already including identifiers and timestamps.

In a more sensing-intensive utilization, gathering data at the highest available sample rate, with many surrounding WiFi networks and with vehicular data from an external OBD device, the bandwidth requirement is around 4 KB of data per second, or 13 MB per hour. The accelerometer, gyroscope and magnetometer are responsible for more than 90% of the storage utilization.
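The hourly figures follow directly from the per-second rates in Table II. A short check, with an illustrative helper name:

```python
def bytes_per_hour(bytes_per_second):
    """Convert a sustained storage rate into the volume gathered in one hour."""
    return bytes_per_second * 3600


typical = bytes_per_hour(150)    # default configuration from Table II
maximum = bytes_per_hour(3800)   # maximum storage rate from Table II
```

The typical rate gives 540,000 bytes (540 KB) per hour, and the maximum rate gives 13,680,000 bytes, which the text rounds to 13 MB per hour.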

Our mobile application uses SQLite to store data before it is transmitted, which requires some storage overhead.


Tests showed a storage efficiency between 70% and 95%, depending on the session size and sensor configuration. Also, before transmission, our protocol serializes the data in JSON format, compresses it using zlib, encrypts it and transmits it to the server. Our tests indicate that a 1 h session occupying 700 KB in SQLite storage is encapsulated in a JSON string of around 1100 KB, but after our protocol's compression and encryption results in just 120 KB of transmitted data (an 83% reduction relative to the SQLite storage).
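The serialization and compression steps can be sketched with the Python standard library. This is a simplified illustration: the encryption step is omitted, the row layout is a placeholder, and the function names are hypothetical.

```python
import json
import zlib


def pack_session(rows):
    """Serialize rows to compact JSON and compress them with zlib."""
    raw = json.dumps(rows, separators=(",", ":")).encode("utf-8")
    return raw, zlib.compress(raw, 9)


def unpack_session(blob):
    """Reverse of pack_session, as the server would do after decryption."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


def size_reduction(original_size, final_size):
    """Fraction of the original size saved by the final representation."""
    return 1.0 - final_size / original_size


# Figures quoted in the text: 700 KB in SQLite, 120 KB transmitted.
reported_reduction = size_reduction(700, 120)
```

With the sizes quoted above, `reported_reduction` is about 0.83, matching the 83% stated in the text. The actual ratio achieved by zlib depends on how repetitive the session data is.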

C. Meeting the privacy expectations of the user

Studies indicate it is very hard to provide true anonymity and privacy when storing users' location history [6], but there are some ways to improve it.

To this end, our system implements the following techniques:
• Every communication is secured to prevent eavesdroppers from obtaining possibly sensitive information;
• We use third-party OpenID authentication, preventing us from handling and storing users' passwords;
• Only the user's hashed email address is stored in the DB for authentication purposes, but it is not accessible by any service or researcher;
• Users retain full ownership of gathered data, and our frontend allows a user to completely delete their sessions if desired.
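Storing only a hash of the email address, as listed above, can be sketched as follows. The choice of SHA-256 and the normalization step are assumptions made for illustration; the text does not state which hash function the platform uses.

```python
import hashlib


def email_digest(email):
    """Return a hex digest that can be stored instead of the address itself."""
    normalized = email.strip().lower()  # assumed normalization, not from the paper
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```

Because email addresses are guessable, a keyed hash (for example HMAC with a server-side secret) would be a reasonable variation where resistance to dictionary lookups matters.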

We have analyzed the European Union Data Protection Guidelines, and our proposed architecture conforms with the suggested privacy and anonymity features, and provides mechanisms that allow services and applications built upon it to do so as well.

IV. EXPERIMENTAL RESULTS

We have been running our system for a few months, and so far we have gathered around 150 million rows of data from 115 different volunteers, covering 3,500 trips and a total of more than 6,200 hours of sensing time. From those we have 8 million GPS points, 22 million seconds of accelerometer data at approximately 5 Hz, 3 million OBD vehicle data points, and 57 million WiFi signal strength points from 295,000 different access points, 69,900 of which are open without protection.

One of the projects leveraging our platform aimed at cardiac stress detection among public bus drivers [7]. The external bio-monitoring devices supported by our app can gather cardiac information, which was used to estimate stress and create the stress map shown in Fig. 10.

Our platform was also used to connect to external OBD sensors and gather vehicle information such as speed, fuel consumption and other driving metrics. The map in Fig. 11 was created by clustering fuel consumption data in the city of Porto. The data gathered from the external OBD sensors was used to improve fuel consumption estimation in smartphones without requiring any external sensor [8]. We used the synchronized data from the OBD and GPS to train a mathematical model that estimates the instantaneous fuel consumption based solely on GPS data: vehicle speed, acceleration, and road gradient.
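The three model inputs named above can be derived from consecutive GPS fixes. The sketch below covers only this feature extraction step and assumes each fix provides a timestamp, a speed and an altitude; the tuple layout and function name are hypothetical, and the estimation model itself is described in [8], not reproduced here.

```python
def gps_features(fixes):
    """Derive (speed, acceleration, gradient) from time-ordered GPS fixes.

    Each fix is a tuple (t_seconds, speed_m_s, altitude_m).
    """
    features = []
    for (t0, v0, h0), (t1, v1, h1) in zip(fixes, fixes[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicated or out-of-order fixes
        acceleration = (v1 - v0) / dt
        distance = 0.5 * (v0 + v1) * dt  # distance travelled between fixes
        gradient = (h1 - h0) / distance if distance > 0 else 0.0
        features.append((v1, acceleration, gradient))
    return features
```

Raw GPS altitude is noisy, so in practice the gradient would likely need smoothing or a digital elevation model before being used as a model input.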

Fig. 10. Cardiac stress levels in Porto’s BUS Drivers

Fig. 11. Fuel Consumption among Porto’s SenseMyCity Users

Another use case aims at improving users' or vehicles' WiFi connectivity in urban scenarios. Stable internet access can greatly improve a user's experience and allow for real-time synchronization of data. The project analyzes the georeferenced WiFi data collected to estimate the location of WiFi access points (AP). Furthermore, connectivity is optimized by predicting the optimum sequence of APs for a given travel path, one that minimizes the number of handshakes and maximizes availability and path coverage.

SenseMyCity is also being used in some universities as a data provider for machine learning courses and projects. Teachers and students leverage our system to gather a real-world data set tightly related to the user (student), and export their gathered data to a standard format.

V. RELATED WORK

Lane [9] presents a very thorough survey of existing mobile phone sensing applications, strategies and policies. Similarly, Chatzimilioudis [10] compares recent smartphone-based crowdsourcing applications and classifies them according to several aspects. The required user involvement is considered one of the most important aspects, which can be either participatory or opportunistic. According to this classification, participatory systems require a user to report or subscribe to sensing tasks, aiming to gather data from targeted events or places in time and space. On the other hand, opportunistic systems aim to gather a large amount of raw data with almost no user interaction. This model typically requires more offline data processing, to filter and infer information from the data.

Some systems have been developed that allow a quick integration of crowdsourcing technologies, such as the Ushahidi project [11], designed to easily deploy a crowdsourced mapping platform to collect reports from volunteers during natural disasters or other events. These and similar participatory platforms, requiring users to actively participate in the sensing activity, have been used by a multitude of projects [12] [13] [14]. Participatory platforms can be used to quickly detect and report on main events, but fail to detect secondary events or unnoticeable patterns in the environment that the users do not consider worth reporting. The projects Medusa [15], mCrowd [16] and the authors in [17] also present participatory crowdsensing platforms with user-in-the-loop designs. Besides just gathering user reports, these also support the collection of raw sensor data as specified in a task, which other users or researchers define as a set of sensing requirements. The system distributes these tasks to smartphone users who wish to participate, while handling optional monetary incentives.

Incentives are an important aspect of such participatory systems, as they require a high degree of user interaction. They can be given in the form of monetary compensation [15], by providing services [1], by entertaining the user [18], or even by motivating altruism [10].

In an opportunistic way, StreetSmart [19] uses crowdsensing with mobile phones to gather GPS, accelerometer and OBD data, and provides fuel consumption information to users. CarTel [20] has a similar approach, gathering information solely from the Global Positioning System (GPS) and On-Board Diagnostics (OBD). These projects implement sensing applications focused on solving problems for their specific scenarios, such as implementing delay-tolerant network communications, and gather data from only a small subset of sensors. Our platform was designed to be able to cope with a larger set of scenarios and requirements, by being easily configured and able to gather data from a multitude of sensors. Fuel consumption estimation requires only a small subset of the sensors and capabilities available in our platform. A similar calculation was in fact performed over our gathered data [8].

With a completely different goal, the CrowdSense@Place framework [21] uses opportunistically captured images and audio clips from smartphones to automatically identify and characterize places a user visits, without requiring any user interaction.

In this work we propose a more general and configurable opportunistic data gathering platform that is able to gather data from a large number of sensors available in smartphones without hindering their normal operation or requiring regular conscious interactions from the user. A lot of information can be retrieved from such a massive data gathering system, such as estimating fuel consumption [20] [8] or monitoring health-care systems [1] [7]. Moreover, by aggregating collected data it is possible to infer much more information and provide useful services, such as automatic car sharing suggestions from trace similarity as proposed in [22], predicting congestion and movement patterns [23] [24], estimating per-street average fuel consumption and cost [19] or predicting network connectivity from upcoming open WiFi access points (Section IV).

However, it is very hard to gather a large data set without compromising the privacy and anonymity of the users. Studies show that although people are generally very permissive regarding location privacy policies, this is largely dependent on how their location data is used and with whom it is shared [25] [26]. There have been many proposals to integrate privacy in participatory systems [6] [27] [2], mainly in systems offering services that allow a user to access and analyze data reported by others.

Crowdsensing platforms are vulnerable to misbehaving nodes and the existence of noisy or biased data, which can impact the quality of the information inferred from the gathered data [28]. In our platform, data reliability and misbehaving nodes should be addressed during the data processing phase [29].

VI. CONCLUSIONS

Instead of implementing a platform for a specific scenario with specific limiting requirements, we designed a more general and modular opportunistic system that can be used in distinct sensing projects and easily extended for new sensors. The application is fully configurable and able to gather data while running in the background, requiring minimal user interaction. The many configuration parameters can, among others, select which sensors are sampled and their sampling rates, and make the application transmit data in real time while restricting the allowed connection types (e.g. send only through WiFi). It then uses backend servers to store and process the massive amounts of data according to the goals of the researchers and projects.

The reconfigurable and ubiquitous nature of our mobile platform allows us to benchmark resource utilization in different smartphone models and configurations, providing us with an estimate of the battery consumption and storage requirements for a desired sensing task. This way, besides providing valuable data sets, our platform allows researchers to estimate and minimize the cost-of-knowledge for different projects.

ACKNOWLEDGMENTS

This work was financed by Instituto de Telecomunicações, by the European FP7 Project - Future Cities: FP7-REGPOT-2012-2013-1, by National Funds through the FCT (Fundação para a Ciência e a Tecnologia) within project PTDC/EEI-ELC/2760/2012 and grant SFRH/BD/62537/2009.

REFERENCES

[1] M. N. K. Boulos, S. Wheeler, C. Tavares, and R. Jones, “How smartphones are changing the face of mobile and participatory healthcare: an overview, with example from eCAALYX,” Biomed. Eng. Online, vol. 10, no. 1, p. 24, Jan. 2011.

[2] K. Shilton, J. A. Burke, D. Estrin, and M. Hansen, “Designing the Personal Data Stream: Enabling Participatory Privacy in Mobile Personal Sensing,” Work, no. September, pp. 25–27, 2009.


[3] R. N. Murty, G. Mainland, I. Rose, A. R. Chowdhury, A. Gosain, J. Bers, and M. Welsh, “CitySense: An Urban-Scale Wireless Sensor Network and Testbed,” in 2008 IEEE Conf. Technol. Homel. Secur. IEEE, May 2008, pp. 583–588.

[4] J. G. P. Rodrigues, A. Aguiar, F. Vieira, J. Barros, and J. P. S. Cunha, “A mobile sensing architecture for massive urban scanning,” in 2011 14th Int. IEEE Conf. Intell. Transp. Syst. IEEE, Oct. 2011, pp. 1132–1137.

[5] J. S. van der Veen, B. van der Waaij, and R. J. Meijer, “Sensor Data Storage Performance: SQL or NoSQL, Physical or Virtual,” in Cloud Comput. (CLOUD), 2012 IEEE 5th Int. Conf., 2012, pp. 431–438.

[6] E. D. Cristofaro and C. Soriente, “Participatory Privacy: Enabling Privacy in Participatory Sensing,” CoRR, vol. abs/1201.4, 2012.

[7] J. G. P. Rodrigues, M. Kaiseler, A. Aguiar, J. Barros, and J. P. S. Cunha, “A Mobile Sensing Approach to Stress Detection and Memory Activation for Public Bus Drivers,” Submitt. Publ., 2014.

[8] V. Ribeiro, J. Rodrigues, and A. Aguiar, “Mining geographic data for fuel consumption estimation,” in 16th Int. IEEE Conf. Intell. Transp. Syst. (ITSC 2013). IEEE, Oct. 2013, pp. 124–129.

[9] N. D. Lane, E. Miluzzo, H. Lu, D. Peebles, T. Choudhury, and A. T. Campbell, “Adhoc And Sensor Networks: A Survey of Mobile Phone Sensing,” IEEE Commun. Mag., no. September, pp. 140–150, 2010.

[10] G. Chatzimilioudis, A. Konstantinidis, C. Laoudias, and D. Zeinalipour-Yazti, “Crowdsourcing with Smartphones,” Internet Comput. IEEE, vol. 16, no. 5, pp. 36–44, 2012.

[11] O. Okolloh, “Ushahidi, or ‘testimony’: Web 2.0 tools for crowdsourcing crisis information,” Particip. Learn. Action, vol. 59, no. 1, pp. 65–70, 2009.

[12] J. Powell, G. Nash, and P. Bell, “GeoExposures: Documenting temporary geological exposures in Great Britain through a citizen-science web site,” Proc. Geol. Assoc., no. 0, pp. –, 2012.

[13] J. Dugdale, B. de Walle, and C. Koeppinghoff, “Social media and SMS in the Haiti earthquake,” in Proc. 21st Int. Conf. companion World Wide Web, ser. WWW ’12 Companion. New York, NY, USA: ACM, 2012, pp. 713–714.

[14] S. McClendon and A. C. Robinson, “Leveraging Geospatially-Oriented Social Media Communications in Disaster Response,” 2012.

[15] M.-R. Ra, B. Liu, T. F. La Porta, and R. Govindan, “Medusa: a programming framework for crowd-sensing applications,” in Proc. 10th Int. Conf. Mob. Syst. Appl. Serv., ser. MobiSys ’12. New York, NY, USA: ACM, 2012, pp. 337–350.

[16] T. Yan, M. Marzilli, R. Holmes, D. Ganesan, and M. Corner, “mCrowd: a platform for mobile crowdsourcing,” in Proc. 7th ACM Conf. Embed. Networked Sens. Syst., ser. SenSys ’09. New York, NY, USA: ACM, 2009, pp. 347–348.

[17] D. Lasnia, A. Broering, S. Jirka, and A. Remke, “Crowdsourcing Sensor Tasks to a Socio-Geographic Network,” in 13th Agil. Int. Conf. Geogr. Inf. Sci., 2010.

[18] I. Celino, D. Cerizza, S. Contessa, M. Corubolo, D. Dell’Aglio, E. D. Valle, and S. Fumeo, “Urbanopoly – A Social and Location-Based Game with a Purpose to Crowdsource Your Urban Data,” in Privacy, Secur. Risk Trust (PASSAT), 2012 Int. Conf. 2012 Int. Conference Soc. Comput., 2012, pp. 910–913.

[19] A. L. Oehlerking, “StreetSmart: modeling vehicle fuel consumption with mobile phone sensor data through a participatory sensing framework,” Master’s thesis, Massachusetts Institute of Technology, Dept. of Mechanical Engineering, 2011.

[20] B. Hull, V. Bychkovsky, Y. Zhang, K. Chen, M. Goraczko, A. Miu, E. Shih, H. Balakrishnan, and S. Madden, “CarTel: a distributed mobile sensor computing system,” Proc. 4th Int. Conf. Embed. networked Sens. Syst., pp. 125–138, Oct. 2006.

[21] Y. Chon, N. D. Lane, F. Li, H. Cha, and F. Zhao, “Automatically characterizing places with opportunistic crowdsensing using smartphones,” in Proc. 2012 ACM Conf. Ubiquitous Comput., ser. UbiComp ’12. New York, NY, USA: ACM, 2012, pp. 481–490.

[22] D. Zeinalipour-Yazti, C. Laoudias, C. Costa, M. Vlachos, M. Andreou, and D. Gunopulos, “Crowdsourced Trace Similarity with Smartphones,” p. 1.

[23] F. Calabrese, M. Colonna, P. Lovisolo, D. Parata, and C. Ratti, “Real-Time Urban Monitoring Using Cell Phones: A Case Study in Rome,” IEEE Trans. Intell. Transp. Syst., vol. 12, no. 1, pp. 141–151, 2011.

[24] J. Zimmerman, A. Tomasic, C. Garrod, D. Yoo, C. Hiruncharoenvate, R. Aziz, N. R. Thiruvengadam, Y. Huang, and A. Steinfeld, “Field trial of Tiramisu,” in Proc. 2011 Annu. Conf. Hum. factors Comput. Syst. - CHI ’11. New York, New York, USA: ACM Press, May 2011, p. 1677.

[25] J. Krumm, “A survey of computational location privacy,” Pers. Ubiquitous Comput., vol. 13, no. 6, pp. 391–399, 2009.

[26] J. Y. Tsai, P. Kelley, P. Drielsma, L. F. Cranor, J. Hong, and N. Sadeh, “Who’s viewed you?: The impact of feedback in a mobile location-sharing application,” Proc. 27th Int. Conf. Hum. factors Comput. Syst. - CHI ’09, p. 2003, 2009.

[27] B. H. B. Hoh, M. Gruteser, H. X. H. Xiong, and A. Alrabady, “Achieving Guaranteed Anonymity in GPS Traces via Uncertainty-Aware Path Cloaking,” IEEE Trans. Mob. Comput., vol. 9, no. 8, pp. 1089–1107, 2010.

[28] M. N. K. Boulos, B. Resch, D. N. Crowley, J. G. Breslin, G. Sohn, R. Burtner, W. A. Pike, E. Jezierski, and K.-Y. S. Chuang, “Crowdsourcing, citizen sensing and sensor web technologies for public and environmental health surveillance and crisis management: trends, OGC standards and application examples,” Int. J. Health Geogr., vol. 10, no. 1, p. 67, 2011.

[29] S. Tanachaiwiwat and A. Helmy, “Correlation analysis for alleviating effects of inserted data in wireless sensor networks,” in Second Annu. Int. Conf. Mob. Ubiquitous Syst. Netw. Serv. IEEE, 2005, pp. 97–108.

Joao Rodrigues (S’11) received his M.Sc. degree in Electrical and Computer Engineering from the University of Porto, Portugal, in 2009, and has since been pursuing the Ph.D. degree at the same institution. He develops his work at the Institute for Telecommunications (IT), and the main topics of his thesis are data gathering and mining in intelligent transportation systems. He was awarded a Doctoral Scholarship from the Portuguese Foundation for Science and Technology in 2009. His general interests are sensor networks and intelligent transportation systems.

Ana Aguiar (S’94-M’98-S’02-M’09) graduated in Electrical and Computer Engineering from the Faculty of Engineering of the University of Porto (FEUP), Portugal, in 1998, and received her PhD in Telecommunication Networks from the Technical University of Berlin, Germany, in 2008. She has been an assistant professor at FEUP since 2009, with research interests in wireless networking and mobile sensing systems, specifically vehicular networks, crowd sensing, and machine-to-machine communications. She also contributes to several inter-disciplinary projects in the fields of intelligent transportation systems and well-being (stress). She began her career as an RF engineer working for cellular operators, and she worked at Fraunhofer Portugal AICOS on service-oriented architectures and wireless technologies applied to ambient assisted living. She has published in and is a reviewer for several IEEE and ACM conferences and journals.

Joao Barros (S’98-M’04-SM’11) is an Associate Professor of Electrical and Computer Engineering at the University of Porto and Founding Director of the Institute for Telecommunications (IT) in Porto, Portugal. He also teaches at the Porto Business School and co-founded two recent startups, Streambolico and Veniam, commercializing wireless video and vehicular communication technologies, respectively. He received his undergraduate education in Electrical and Computer Engineering from the Universidade do Porto (UP), Portugal and Universitaet Karlsruhe, Germany, and the Ph.D. degree in Electrical Engineering and Information Technology from the Technische Universitaet Muenchen (TUM), Germany.