mobile data analytics

34
Mobile Data: Under the Hood Munich DataGeeks Meetup June 2014

Upload: joerg-blumtritt

Post on 27-Jan-2015

118 views

Category:

Technology


0 download

DESCRIPTION

A smartphone is a mighty array of sensors. How to access the data, and get meaningful information from the various readings, like geo-location, gyroscope, accelerometer, or even the magnetic flux. We also discuss the ehtical implication of mobile tracking: informational self-determination, "other-tracking" vs. self-tracking, and how to do spooky things with apparently innocent measurements.

TRANSCRIPT

Page 1: Mobile Data Analytics

Mobile Data: Under the Hood

Munich DataGeeks Meetup

June 2014

Page 2: Mobile Data Analytics

• Data collection has meandered from engineering to humanities and back:

• Its origins are in population statistics and taxation, beginning from the late mediaval time; the age of enlightenment brought scientific data to the top, with its peak in 19th century thermodynamics; the rise of quantitative social science in the mid 20th century brought humanities back, namely in the form of market research and media planning.

• With the Internet of Things, engineering as prime source of data seams logic, however: it is people whos behavior stands behind most machine generated data, too.

2 Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 3: Mobile Data Analytics

3 Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 4: Mobile Data Analytics

• Two scholars are particularily influencial in the field of wearable technology and mobile data generation.

• Most prominent figure might still be Steve Mann, whose cyborgist wirring has provoced hostile reactions.

• Mann is noteworthy also for his discussion on counter-surveillance, souvaillance, and other political implications of tracking technology.

• Application of smartphone technology, especially the smartphones sensors has been brought forward by Alex Pentland.

• Many papers by Pentland and his students have inspired our work, too.

4

Source: http://commons.wikimedia.org/w/index.php?title=User:Glogger&action=edit&redlink=1(CC BY-SA 3.0)

Source: Robert Scoble(CC)

Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 5: Mobile Data Analytics

All seeing research?

• Social studies bear an inherrent problem: either you could afford to long-term track or you could try to drill down in depth to some aspects. Both is hardly possible (too expensive and too intrusive for people's lives to carried out).

• Seamless tracking via devices we carry with us anyway, offers a solution to this dilemma.

5

1) Reality Mining [8], (2) Social evolution [11], (3) Friends and Family dataset, (4) Rich-data pioneers [15,16], (5) Sociometric Badge studies [14], (6) Midwest field station [17], (7) Framingham Heart Study [4], (8) Large call record datasets [5,1,6], (9) ‘‘Omniscient’’/all-seeing view.

Aharony et.al:, "Social fMRI, investigating and shaping social mechanism in the real world", Pervasive and Mobile Computing, Vol. 7, 2011, pp. 643-659.

Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 6: Mobile Data Analytics

Sociometric Solutions

• Sociometric Solutions is a team that uses mobile sensor technology, stripped bare of all "classic" phone functions.

• The participants of a survey would carry the badge and get tracked in multiple ways: sounds, movements, communications, and proximitiy with others are continuously monitored.

• Sociometric Solutions can e.g. derive network analytics from the data, showing how efficient people would interact in businesses. (Ben Waber, founder of the startup, presented the example below at Strata Conference 2012: Three branches of a bank. Branch 2 shows two distinct clusters - people hardly interacting in between; the reason: two departments sitting on seperate floors, which could be easily solved by mixing the teams, once the problem had been discovered).

6 Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 7: Mobile Data Analytics

getsaga.com

• Lifelogging as storytelling, supported bytechnology, is the appealing task of getsaga.

• Data here is not seen as "facts" (like in self-tracking gadgets that monitor e.g. your running).

• Data is like memories and embedded intonarrative context.

• The data collections is syndicated with various app-based services, people would use anyway: foursquare, fitbit, Jawbone, and even Withings.

• These data silos usually collect data just in their very limited context; with getsage all tracking data can be synoptically brought into "the story of your daily life".

7 Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 8: Mobile Data Analytics

"Social MRI"

• Like MRI makes you see your innards in detail without loosing the full picture, Nadav Aharony proposed using mobile data to get to "Social MRI" - studying behavior on the individual level but in social context.

• The startup Behavio, that was spun off a remarkable research project at MIT was acquired by Google. Never heard of them eversince again.

• (His readworthy paper was sourced on p 5)

8 Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 9: Mobile Data Analytics

Sad reality

• Quantitative social science is in dismay. What used to be "good enough", like pulling representative samples just by picking random phone numbers is helplessly exposed as a blunt tool. The construct of presuming, people would be similar enough in all of their characteristics, just because they would share some broad aspects of demographics, has probably never been true. With the rise of online tracking, click analytics etc., we have learned how misleading the assumption of representativeness can be.

• Postmodernist thinking, and especially Alain Badiou's "Being and Event" is helpful to deconstruct the fallacies of quant social research.

• With people switching from desktop computers to mobile, research companies try to adopt their crude online questionnairs to the new screen. The result is pathetic.

9 Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 10: Mobile Data Analytics

Technical Tracking

• Smartphones carry a phalanx of sensors and track all kind of environmental data.

• Our movements and immediate surroundings are monitored by gyroscope, accelerometer, luminosity sensor in the camera, microphone etc.

• The location is captured by satellite connection and mobile network.

• Proximity can be trackt via bluetooth or Wifi signal (which just becomes systematically useable with the iBeacon)

10

Pei et.al.: "Human Behavior Cognition Using Smartphone Sensors"Sensors 2013, 13, 1402-1424; doi:10.3390/s130201402

Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 11: Mobile Data Analytics

Context

• Technical data does not reveal the full story. Just because somebody went to some place does not tell what she did there, how she felt, if she succeded, and how this action related to the rest of her life.

• So it makes sense to ask people directly, to engange in personal interaction, if we want to study their lives.

11

Pei et.al.: "Human Behavior Cognition Using Smartphone Sensors"Sensors 2013, 13, 1402-1424; doi:10.3390/s130201402

Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 12: Mobile Data Analytics

Our App: explore

• We started our own app 'explore':

• explore tracks all kinds of sensor data on the smartphone. The data can be collected for analysis, and it can trigger interactions (like asking questions or offering suggestions).

• The open beta is available on Google Play Store; the iOS version should be ready by July 2014.

12 Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 13: Mobile Data Analytics

Usability

• To avoid the shortcommings of the other social research apps, we focused on the user interface, to make it as "mobile" as possible.

13 Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 14: Mobile Data Analytics

Interactions

• explore can ask questions or offer suggestions, triggered by data.

14 Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 15: Mobile Data Analytics

Quantified Self

• explore offers clear and simple analytics of the data collected. People can also get their raw-data for their own purposes.

• We want people to be aware what we (and other apps) do on the phone. So we do not only tell in advance, we also show what sensors are activated and give the opportunity to opt-out per sensor.

• Since we reflect the results of our tracking as well as questionnairs and interactions in form of diagrams and sumaries, we hope, people will realize what we are doing and can act self-determined.

• Of couse we respect take-down notices: if people ask for their data to be deleted, we follow their request (which btw is also required by German data protection laws); this is also a reason for us not to use common cloud storage and cloud computing platforms, since we would not have control over the back-up.

Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt15

Page 16: Mobile Data Analytics

Data

• Sensor data is generated mostly in forms of tables, locally stored as SQL databases for each app. We transfer the data to analyze it, e.g. visualize geo-location on a map.

16 Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 17: Mobile Data Analytics

Travel

• Studying the means of transportation, they paths, people choose for their communte or travel, is a streightforward application of our data.

• We work e.g. for airports to optimize the shops they would offer to passengers. Since many passengers come from other cultures, it is not an easy task for an airport (or in general for a shopping mall) to learn the preferences of potential clients - not consitent shopping data or market research is available.

• So, e.g. we incentivize passengers from China to let us accompany their stay in Euorpe with our app 'explore'. So we can understand, what they wanted to buy, if they succeded and if they would have missed anything, that an airport could have offered to them.

17 Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 18: Mobile Data Analytics

Events

• To see what happens, we have to process the data. How people move arround is visible through the gyroscope - you see the turns, changes in directions ect.

• With gyroscopic data in combination with acceleration and speed, also the means of transportation can be revealed: walking has a distinct signature, driving by car shows more changes in directions then sitting on a train, etc.

• However: the data is noisy; artefacts emerge from different brands of the sensors, of glitches in the operating systems, and also can be caused by environmental influences.

• Take e.g. the rhytmik spikes in the picture below: nobody would turn rhythmically and so fast.

• So we have to preprocess the data in the app, to really see, what happens.

18 Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 19: Mobile Data Analytics

What is behavior?

• The normalized gyroscopic data on the right shows the movements of a person going from her desk into the kitchen, fixing a pot of tea, leaving the kitchen and returning to her desk.

• Sampling rate was 10s, timeframe is 15min.

• We notice episodes of different behavior:

• turning sharply

• walking

• turning smoothly

• walking again

• entering the kitchen, preparing the pot

• waiting for the water to boil

• standing up, leaving the kitchen

• sitting down again

19

0,0

1,0

2,0

3,0

4,0

5,0

6,0

7,0

8,0

9,0

1 4 7 1013161922252831343740434649525558616467707376798285

Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 20: Mobile Data Analytics

Complex Event Processing

• Simple events, like changing direction, entering an area of specific geo-coordinates, or having moved for a specific time span can be combined to complex events.

• EPL (event processing language) offeres a way to listen to the data stream and detect the occurance of events.

• EPL looks like SQL, but instead of tables, the search goes into the data stream.

• For our app, we define events, boolean-combine these events in a GUI and parse the definition in the app via JSON doc.

• The event processing itself takes place in the app - no network connection is needed.

20

SELECT ID AS sensorIdFROM ExampleStreamRETAIN 60 SECONDSWHERE Observation= '' Outlet"

Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 21: Mobile Data Analytics

Battery

• Battery data is both interesting in itself, and also important to maintain the app usable.

• Battery consumptions is telling a lot about the environment of the phone: temperature, moisture, even air pressure can be derived using the change in charge.

21 Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 22: Mobile Data Analytics

Powersaving Strategies

• (Asynchronous data transfer)

• Data is not contiuously transfered from the phone to our backend. This is trivial of course.

• Lower sampling rate

• not good. You loose details that you might need.

• Controlled sampling rate

• much better: someone walking does not change location as quickly as someone on a train; hence location tracking every 2 minutes or so is sufficient. Of course you have to detect and predict the behavior of the user first ;)

• Piggybacking on other apps

• this would be straightforward. However, most operting systems keep apps siloed so data is collected for each app seperatly.

• Co-processor for the data

• Apple has introduced the M7 coprocessor recently. The M7 stores all sensor data seamlessly; battery drainage is hugely reduced.

22 Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 23: Mobile Data Analytics

Spooky Wifi

• You can't avoid tracking others involunarily, too. This is a problem. People might be aware, what they themselves are doing. But others might be tracked along without giving their consent.

• Wifi is a good example of "others-tracking": all wifi signals within reach are tracked by the phone. It tells a lot about other people; not only about the devices they use.

23 Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 24: Mobile Data Analytics

Identify whereabouts bylocation-specific magnetic field• Every place has a distinct

signature of the magnetic field (in strenght like shown on my own tracking data on the right as well as in bearing).

• So even if someone decided to not-track geolocation, we might still get sufficient information on their whereabouts via other measurements.

• That this is not hypothetical can be seen on the diagram: the field's signature of my home is different from other places, I stayed during that week.

24 Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 25: Mobile Data Analytics

Lifelogging vs. Others-Tracking

• Next page shows some lifelogging pictures I took with the Narrative lifelogging device.

• First (as Jillian York so eloquently twittered): there is many things you'd rather not like to see of other people's lives.

• Second: you can hardly avoid to log not only your own life but also get others on your track record as well.

• Google glass, provoking already some resistance, is however just a form-factor, a "metaphor": the interesting thing is the augmented ubiquity (Bruce Sterling's term), that comes with is.

25 Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 26: Mobile Data Analytics

Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 27: Mobile Data Analytics

Postprivacy, and communalization of private life• In Neal Stephenson's "Snow Crash", we read about the 'Central

Intelligence Corporation' - a commercialized version of today's NSA. Mobile health, computational social science, and mass measurement of environmental influences are obvious and benign applications of QS for the public good. With quantifying and making public, what European data protection law defines as "the most intimate personal data", however do we transform the current "knowledge-database" character of the Net along with its communication-networks to something new, something that might become similar to Stephenson's vision?

• Could this even lead to Teilhard's (resp. McLuhan's) angelization of humans, not only connected via social media but bodily knit into the data? Would we rather end up in a rally bucolic global village with moral control by the panoptic community (and an inherent abelism that comes with a village life)? In both aspects, representative aggregates like society as well as the concept of the individual might be rendered obsolete.

27 Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 28: Mobile Data Analytics

"Privacy"

28 Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 29: Mobile Data Analytics

"Data Protection"

29 Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 30: Mobile Data Analytics

30 Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 31: Mobile Data Analytics

Becoming cyborgs?

• The bodily extension into the data squere - this is what cyborgism is really about.

• People like Neal Harbison or Enno Park are pushing the discussion in that direction: How do we maintain posession of our bodies? What ethic framework has there to be set-up? How do we avoid technological extensions becoming "black boxes" that control us, rather than we do them?

• So it is worthwhile to follow the proceedings of the Cyborg e.V that Enno founded.

31 Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 32: Mobile Data Analytics

Cyborg Rights

32 Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 33: Mobile Data Analytics

Some links

• http://datarella.com/blog

• http://beautifuldata.com my blogs.

• http://twitter.com/jbenno/bigdata a data science related twitter list.

• http://quantifiedself.com

• http://cyborgs.cc/

33 Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt

Page 34: Mobile Data Analytics

Jörg Blumtritt

@jbenno

Datarella GmbH

Oskar-von-Miller-Ring 36

80333 München

089/44 23 69 99

[email protected]

Mobile Data: Unter the Hood. Datarella - Joerg Blumtritt34