Laboratory for Communications and Applications

Mini-Project Report in Security and Cooperation in Wireless Networks

Private Information Exposure in Online Social Networks with iOS, Android and Maemo Mobile Devices

Author:

Vasileios Agrafiotis
Communication Systems - Master 3

Professor:

Jean-Pierre Hubaux

Supervisors:

Igor Bilogrevic

Mathias Humbert

January 19, 2012

1 Introduction

Privacy is a well-established and integral part of the common culture of modern societies, especially in the western world. The notion of privacy consists of controlling when, where, how and by whom personal information is used. Personal information is information linked to an individual that can be used to uniquely identify him within any context. Such information can be either biological, such as the iris or the fingerprints, or administrative, such as the date of birth or an email address. Several technical means are employed to ensure that an individual can protect and preserve the privacy of his personal data. For instance, in communication networks, encryption is a powerful and widely used mechanism for establishing privacy. However, the human factor makes technology alone insufficient to address this critical issue. That is why several legal measures complement the technical mechanisms that aim to preserve privacy.

At this point, it might appear that preserving personal information implies that the individual does not share it. However, under specific circumstances individuals share some of this information with third-party service providers that they trust. Examples of such third parties are the government, a hospital or an insurance company. Lately, Online Social Networks (OSNs) have become deeply embedded in our social life and have become very important holders of their users' personal information.

Online social networks (OSNs), such as Facebook and Google Plus, allow millions of individuals to create online profiles and share personal information with friends and strangers. Their users are encouraged to interact with their friends by posting messages and media content on their online profiles. The nature of such platforms clearly threatens individuals' privacy: the core concept of all types of OSNs is that users offer a representation of themselves (a profile) to others through the website, with the intention of contacting or being contacted by others. This implies that users share with the OSN a significant amount of personal information, depending on the privacy awareness of the user. Contrary to what might be expected, recent studies on the patterns of personal information revelation within OSNs [5] demonstrate that the data held by the OSNs are very rich and accurate.

The booming phenomenon of OSNs around the world, combined with the potential threat to privacy associated with them, has inspired a series of studies within the academic community. Most of them [4], [7] focus on identifying whether there is any privacy leakage related to the personal information shared between users and the OSNs. In other words, they investigate whether and how third parties can gain access to this information. Other studies concentrate on evaluating the privacy policies published by the OSNs and the privacy control mechanisms adopted to protect users' personal data.

In this project, we try to address the privacy issues related to OSNs from a different point of view. We focus on the usage of OSNs through mobile devices running different operating systems and on the information exchanged between the user's mobile device (smartphone) and the OSN server. Users can access OSNs through the web browser installed on the mobile phone. In addition, several client applications for such OSNs exist on different mobile platforms (iOS, Android and Maemo), which offer a better user experience thanks to a specific software design. It sometimes happens that, on some mobile platforms, social networks request private information, such as location information, even when the actions requested by the user do not, at first glance, need this information to perform their advertised functionality. Such behavior appears to constitute a violation of users' privacy. A recent paper [6] performs a similar study for mobile applications not related to OSNs.

In this project, we investigate private information disclosure in OSNs, within the context described above, by accessing some popular OSNs from both their mobile client applications and the mobile web browser, over different mobile platforms. We try to simulate the usage of each targeted OSN by a typical user. We perform the different actions that are available to the user within each OSN, and we record the traffic exchanged between the smartphone and the OSN servers. Subsequently, we study the captured network traffic and try to determine what kind of information was exchanged. Finally, we characterize each action according to a privacy scale by comparing the information that we expected to find for the specific functionality with the information that we actually recorded. We restricted our study to three mobile platforms: Android, Symbian and iOS. The mobile devices that we used for these operating systems are, respectively, an HTC Desire, a Nokia N97 and an Apple iPhone 4. Furthermore, based on several studies [1], [2], [3], we selected six OSNs according to their popularity: Facebook, Twitter, Foursquare, Google Plus, YouTube and LinkedIn.

2 Set Up

The traffic exchanged between the smartphone and the online social network servers was captured under two different conditions.

In the first, we worked outside the EPFL campus using an Internet connection through a home local area network (LAN). This situation is depicted in Figure 1. We used two routers: a wired one connected to the Internet and playing the role of a gateway (router 1), and a wireless one (router 2) connected to the Internet through the first. The wireless network established by router 2 did not use any encryption and was open (no key was needed to associate with it). In order to avoid interference from neighboring wireless LANs, we configured router 2 to use channel 8 for its transmissions. In this configuration, the smartphone was connected to the wireless network of router 2, and through this network we accessed the different OSNs. At the same time, we used a laptop (running an Ubuntu distribution) with a wireless interface in order to sniff the network traffic and store it. The wireless interface was put into monitor mode on channel 8. With this configuration in place, we were able to record the network traffic using Wireshark, a free and open-source network packet analyzer. Our methodology for capturing, storing and evaluating this traffic is described in the next section of this report.

In the second configuration, shown in Figure 2, we worked at EPFL using a wireless Internet connection through the Public EPFL network, which is unencrypted. In this case, the intermediate router 1 from the previous configuration was not present, and the smartphone was directly associated with the closest access point. The configuration of the laptop in monitor mode was the same as described previously, and so was the methodology for capturing the traffic.

It is essential to point out that under both conditions we worked indoors and that all location detection mechanisms (GPS, Bluetooth and WLAN detection) were activated on the smartphones. Thus, in order to be able to use GPS, we tried to capture the traffic traces in the outer parts of the buildings.

3 Methodology

In this section, we give an overview of the methodology we used to capture and analyze the traffic between the smartphone and the OSN servers using the aforementioned configurations and Wireshark.

Figure 1: Configuration with home LAN.

Prior to sniffing the traffic from the targeted OSNs, we created a fake user account for each of them. As a next step, we defined a list of all possible actions that the user could request within each OSN, after inspecting its functionality and user interface. It is essential to clarify at this point that the set of possible actions differs for the same OSN across mobile platforms, and between the mobile application and the web-based version on the same mobile platform. There are two reasons for this: the mobile web version of an OSN provides less functionality than the corresponding mobile application, and the application for a less popular mobile platform, such as Symbian compared to Android and iOS, is either unavailable or less functional.

After having defined the list of all possible actions for each OSN, we performed each of them separately and used Wireshark to capture the packets originating from or destined to the smartphone during these actions. These packet traces were stored in two forms: as a Wireshark packet trace file and as a plaintext file.

Once these traffic traces were captured, we moved on to the next step, the traffic analysis. We inspected each trace file individually in two ways: first by parsing it with parser software, and then manually, one by one, in an ad hoc way. The parser was a piece of Java code that we developed during the project. It is very primitive: it parses the plaintext trace of each action and tries to detect some predefined keywords within the trace. Such keywords are email-like strings and coordinate-like decimal numbers. Each occurrence of these keywords is recorded in a log file. The manual inspection of the traces uses the parser results as a compass to determine the interesting trace files and analyze them further.
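The keyword detection described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the Java parser developed in the project; the regular expressions, the four-digit threshold for coordinate-like decimals, and the names `scan_trace` and `format_log` are all assumptions made for this illustration.

```python
import re

# Hypothetical patterns: the report only mentions "email-like strings" and
# "coordinate-like decimals", so the exact rules below are assumptions.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# A signed decimal with at least four fractional digits, e.g. 46.5191,
# that is not part of a longer dotted number such as a version string.
COORD_RE = re.compile(r"(?<![\d.])-?\d{1,3}\.\d{4,}(?![\d.])")


def scan_trace(lines):
    """Return (line_number, kind, match) tuples for every keyword hit."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for match in EMAIL_RE.findall(line):
            hits.append((number, "email", match))
        for match in COORD_RE.findall(line):
            # Keep only values that could be a latitude or a longitude.
            if -180.0 <= float(match) <= 180.0:
                hits.append((number, "coordinate", match))
    return hits


def format_log(trace_name, hits):
    """Render the hits as log lines, one per occurrence."""
    return [f"{trace_name}:{n}: {kind}: {text}" for n, kind, text in hits]
```

Such a scan only flags candidate lines in a plaintext trace; as in the report, each hit would still need manual inspection, since a decimal number or an address-like string is not necessarily private information.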

During our project we faced an important obstacle: some of the mobile web versions of the OSNs were only accessible over HTTPS, and as a result we were unable to analyze the content of the network traffic. In order to partially overcome this problem, we decided to capture the corresponding traffic traces from a PC instead of a smartphone, accessing the HTTP version of the OSN when one was available. By analyzing these HTTP desktop traces, we could hypothesize about the content of the corresponding HTTPS traffic.

Figure 2: Configuration with Public EPFL.

Finally, a last element that we took into consideration during this analysis phase (when available) was the flashing GPS indicator that appears on the smartphone screen when any application calls the OS module that accesses the GPS. This is only an indication and not a solid finding.

4 Privacy Risk Scale

In the next section of the report we present and comment on the results of our study. The core contribution of this project is to determine, for each action performed within each OSN, whether or not it raises any privacy issue. To achieve this, we associate with each action a level of privacy risk based on the scale that we define in Figure 3.

Using this Privacy Risk Scale, any reader can easily grasp the overall situation and immediately spot any privacy breaches. With each privacy risk level we have associated a color. We expected that the majority, if not all, of the traffic would be characterized as safe, and that there would be only a few cases of privacy disclosure.

Figure 3: Privacy Risk Scale
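As an illustration of how a risk level can be assigned by comparing expected and observed information, consider the following minimal Python sketch. The level names ("as expected", "unexpected information", "not observable") and the function `classify_action` are hypothetical and chosen only for this example; they are not the levels of the Privacy Risk Scale in Figure 3.

```python
def classify_action(expected, observed, encrypted=False):
    """Compare expected and observed information types for one action.

    The level names returned here are hypothetical placeholders; the
    report's own levels are those defined in Figure 3.
    """
    if encrypted:
        # The payload could not be read, so nothing can be concluded.
        return "not observable", set()
    unexpected = set(observed) - set(expected)
    if unexpected:
        # Something was sent that the action did not appear to need.
        return "unexpected information", unexpected
    return "as expected", set()
```

For example, an action that is only expected to send a user ID but is also observed sending location data would be reported as `("unexpected information", {"location"})`.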

5 Results

The results of our project are presented and discussed in this section. First, we present and briefly discuss the privacy overview per platform, highlighting the interesting cases or the overall situation. Then we provide the reader with six tables, one per studied OSN, summarizing the privacy risk level of each action across the different platforms.

As an overall observation, we did not discover any major privacy violation. At the same time, we faced an insurmountable obstacle: the majority of the traffic we captured and analyzed was encrypted. Thus, we were unable to determine the content of the traffic exchanged in these conditions.

Looking at our privacy risk tables, we can deduce that the privacy patterns depend on the OSN, and not on the operating system of the mobile device or the access mode (mobile application or mobile web version). For this reason we comment on the results for each OSN separately.

To begin with, in the case of LinkedIn (Figure 4), no encryption is used on any platform or in any access mode (mobile application or mobile web version). This allowed us to analyze the whole traffic and conclude that there is no suspicious traffic.

Concerning Twitter (Figure 5), on both the Android and iOS platforms the network traffic is entirely encrypted, and as a consequence we are not in a position to inspect it. On Symbian, we could only access Twitter through its mobile web version, since no mobile application is available. By accessing Twitter via Symbian we found that, in all generated actions, the user ID and the session cookie are contained in the exchanged packets. Furthermore, in some actions we noticed that the GPS was used although location information did not appear to be required to perform the action. This was the case when we posted a tweet or replied to a tweet.

Next, for Google Plus (Figure 6), we again captured only encrypted traffic. When accessing the mobile web version from any platform, we could only use the secure version of the website (HTTPS). The Google Plus mobile application was available only on Android and iOS. On both operating systems the traffic was encrypted; moreover, during the login process, extensive information about the smartphone was explicitly sent from the mobile device to Google servers. This information included mobile operator information, such as the Home System Identifier.

Figure 4: LinkedIn

Subsequently, YouTube (Figure 7) could be accessed from all platforms, through both the mobile web version and the mobile application, without the use of encryption. We did not notice any suspicious case, except in the mobile application for Android, where some pseudorandom strings were exchanged during some actions. For these actions we noticed that a TCP connection between the device and a Google Analytics server was also established. However, we did not have enough information to understand the service provided by this connection.

Foursquare (Figure 8) presents several interesting cases from a privacy point of view. On iOS and in the mobile web version on Android, all the traffic is encrypted. On the other hand, in the mobile application for Android and the mobile web version of Foursquare on Symbian (the corresponding mobile application is primitive and less functional), location information was sent during different actions. However, Foursquare is an OSN that encourages the sharing of location by default.

Finally, when we tried to access Facebook (Figure 9) through the mobile web version, regardless of the mobile platform, we captured only encrypted traffic, since only the HTTPS version of the site exists. When using the mobile application on any of the mobile platforms, pseudorandom strings were included in the network packets generated for almost all the actions.

Last, in Figures 10 and 11, we present the information of the previous six tables in a more aggregated way, summing it up and pointing out the most interesting cases.

Figure 5: Twitter

Figure 6: Google Plus

Figure 7: YouTube

Figure 8: Foursquare

Figure 9: Facebook

Figure 10: Big Picture per Operating System

Figure 11: Actions generating Interesting traffic

6 Conclusion and Improvements

It is essential to point out some drawbacks of the approach used in our study that can be improved in a future project. The collection of the traffic traces was a time-consuming procedure because it was conducted manually, action by action, for every OSN and every mobile platform. We were aware that some emulators exist within the scientific community, such as the one developed by the TEMA project [8] for the Android platform. These emulators could be used to automatically simulate the execution of a mobile application using a scripting language. However, their usage required modifying the core libraries of the smartphone's OS. We were reluctant to do so, since none of the smartphones we used was an experimental device; they were all borrowed from volunteers for this project.

A second point that should be taken into consideration for future improvements is that the network traffic was captured indoors, which implies that the GPS satellites were sometimes not within line of sight. However, we were not able to determine this reliably. In future work it would be useful to capture the network traces automatically and outdoors.

To conclude, in this project we tried to address a specific aspect of the privacy issues related to OSNs. Our results constitute an initial mapping of the situation and can be regarded as a first step towards further analysis in the same field.

References

[1] http://online-social-networking.com/most-popular-social-networking-sites-for-business.

[2] http://www.bizreport.com/2011/03/social-networks-most-popular-online-content-in-uk.html.

[3] http://www.socialsoftware.weblogsinc.com/.

[4] Balachander Krishnamurthy and Craig E. Wills, On the leakage of personally identifiable information via online social networks, SIGCOMM Comput. Commun. Rev. 40 (2010), 112–117.

[5] Ralph Gross and Alessandro Acquisti, Information revelation and privacy in online social networks, Proceedings of the 2005 ACM workshop on Privacy in the electronic society (New York, NY, USA), WPES ’05, ACM, 2005, pp. 71–80.

[6] Peter Hornyack, Seungyeop Han, Jaeyeon Jung, Stuart Schechter, and David Wetherall, These aren’t the droids you’re looking for: retrofitting android to protect data from imperious applications, Proceedings of the 18th ACM conference on Computer and communications security (New York, NY, USA), CCS ’11, ACM, 2011, pp. 639–652.

[7] Balachander Krishnamurthy and Craig E. Wills, Characterizing privacy in online social networks, Proceedings of the first workshop on Online social networks (New York, NY, USA), WOSN ’08, ACM, 2008, pp. 37–42.

[8] Tampere University of Technology, Introduction: Model-based testing and glossary, http://tema.cs.tut.fi/intro.html.
