PERCEPTIONS AND AWARENESS OF

DATA COLLECTION IN SOCIAL MEDIA

A study submitted in partial fulfilment

of the requirements for the degree of

MSc Information Systems

at

THE UNIVERSITY OF SHEFFIELD

by

SARA MICHELLE URREA AGUILERA

September 2016

ABSTRACT

BACKGROUND

The growth of social media and the rise of big data have raised legitimate concerns

over data collection and user privacy. As the majority of active users are aged 16 to

34, this demographic is the primary contributor of this unstructured big data and is

sought after by companies that wish to profit from the mined data. As

users are becoming more aware of data collection, and University students belong

to this age group, it is important to examine their perceptions, thoughts and

concerns.

AIMS

The aim of the dissertation was to analyse and investigate students’ perceptions of

their social media data being collected and used for marketing, surveillance and

other purposes with or without their awareness.

METHODS

The research was based on qualitative methods. Interviews followed a

semi-structured interview plan and were audio recorded. The data was then sorted into themes and

sub-themes based on thematic analysis.

RESULTS

Most of the interviewees were aware that their data is being collected through social

media channels. They understand that social media is a business that needs to

make a profit somehow, and consider some purposes positive. At the same time,

they are worried that their data may fall into the wrong hands and end up being

used for undesired purposes or to identify private individuals.

CONCLUSIONS

Overall, the students were aware of the data collection and privacy issues, and they

were just living with them. The interviewees did not like their private information being

used, but they accepted it as part of their daily lives: as long as it does not put

them in a dangerous situation, it will not stop them from carrying on their normal

online activities. A number of recommendations for research and practice

were also suggested.

ACKNOWLEDGEMENTS

I would like to thank my supervisor during this research project, Doctor Jo Bates, for

offering her guidance and advice during this time.

I would like to thank the participants that shared their time for this project.

I would like to thank my mother, Carmita, Huguito, my father and the rest of my

family that have supported me through this time.

I would like to thank my group of friends, Ecua-Mex, Oscar Winners, with a special

mention to Loli, Sole and Vil.

TABLE OF CONTENTS

1 INTRODUCTION

1.1 RESEARCH AIM

1.2 RESEARCH OBJECTIVES

1.3 RESEARCH QUESTIONS

2 LITERATURE REVIEW

2.1 INTRODUCTION

2.2 BIG DATA & DATA COLLECTION

2.3 LEGAL & ETHICAL CONSIDERATIONS

2.4 USER AWARENESS & PRIVACY CONCERNS

2.5 CONCLUSION

3 METHODOLOGY

3.1 RESEARCH STRUCTURE

3.2 RESPONSE RATE

3.3 DATA COLLECTION METHODS

3.3.1 METHODS

3.3.2 INTERVIEW

3.4 DATA ANALYSIS

3.4.1 INTRODUCTION

3.4.2 PROCESS DESCRIPTION

3.4.3 THEMES AND SUBTHEMES DESCRIPTION

3.5 ETHICAL ASPECTS

3.6 LIMITATIONS AND RISKS

4 RESULTS AND DISCUSSION

4.1 INTRODUCTION

4.2 OVERVIEW

4.3 AWARENESS OF DATA COLLECTION

4.3.1 GENERAL AWARENESS

4.3.2 AWARENESS OF THE PURPOSE OF DATA COLLECTION

4.4 PERCEPTIONS OF DATA COLLECTION

4.4.1 GENERAL PERCEPTIONS

4.4.2 FEELINGS

4.4.3 PERCEPTIONS OF A POSITIVE COLLECTION PURPOSE

4.4.4 PERCEPTIONS OF A NEGATIVE COLLECTION PURPOSE

4.5 CONCERNS OF DATA COLLECTION

4.5.1 GENERAL CONCERNS

4.5.2 RISKS

4.5.3 PRECAUTIONS

4.6 DISCUSSION

5 CONCLUSIONS

5.1 LIMITATIONS AND RECOMMENDATIONS

6 REFERENCES

7 APPENDICES

7.1 RESEARCH ETHICS APPLICATION

7.2 INVITATION EMAIL

7.3 CONSENT FORM

7.4 RESEARCH ETHICS APPROVAL LETTER

7.5 INTERVIEW QUESTIONS

7.6 ACCESS TO DISSERTATION

7.7 ADDRESS & FIRST EMPLOYMENT DESTINATION DETAILS

1 INTRODUCTION

“The interest in Big Data is growing exponentially” (Eynon, 2013)

The concept of “big data” has been an increasingly trending topic over the last few

years and is only expected to grow (Marr, 2016). People from different areas, such

as research, the marketing sector and the government, are more and more

interested in maximising the use of technology in order to analyse the “massive

amounts of data” in the most aggressive ways (Eynon, 2013). But what is this “big

data”? According to Soares (2012), big data is the processed information about the

“customer experiences, organisational processes, and emergent trends” that

originates while customers go about their normal lives. This unstructured big data can be

found everywhere and is considered too big to be processed by regular database

software. Big data is different from the Web, although the Internet helps to collect

and share this data. The whole idea is that with big data a better level of insight can

be achieved, which cannot be done using a small portion of information (Cukier &

Mayer-Schoenberger, 2013). The data that is mined through social media can reveal

a lot of information about the user, from their location to whom they are socially

interacting with or linked to, the level of influence the user has, and their activity

patterns, which could be used to build a profile of their likely preferences or

activities (Kennedy & Moss, 2015). The organisation of all this information translates

into a “source of business analysis” that may result in performance improvements

and lead to new opportunities (Soares, 2012), proving the importance of acquiring,

analysing and processing this data.

Taking into consideration the exponential growth of big data in the past few years

and that it can be found everywhere, it is easy to relate it to the

exponential growth of “social-networking data”. This major increase in social media

data is reaching a point where it may run out of control, if it has not done so already

(Scarfi, 2012). The situation is raising concerns with regard to user privacy. Personal

information related to location, social media interactions and internet usage is being

sold to the data broker industry. At the beginning, it was the USA government that

was interested in data mining for surveillance purposes; now it is business

corporations that lead this activity. Both are interested in profiting from

the data collection without the worry of a strict data protection law that may

benefit the user, who is the actual owner of the data (Peacock, 2014). With all this

big data being used for different purposes, Google was scrutinized by European

governments due to antitrust and privacy issues. Facebook may also turn into a

target because of the large amount of personal data it holds, reaching a point where

diplomats will have to decide whether they “treat information flows as similar to free

trade” (Cukier & Mayer-Schoenberger, 2013).

At the same time, users themselves have become increasingly concerned

about their data. Several surveys done in the past years showed that clients are

worried about the information the companies hold about them, how they got this

information, and what they are using it for (Phelps, Nowak & Ferrell, 2000). Another

report showed that most of the USA’s population, whether or not they used the

Internet, were concerned about their private information when they shopped online

(Malhotra, Kim & Agarwal, 2004). In an online user study, young people aged 13

to 25 mentioned that if they had the chance to choose, they would be

willing to accept data collection for marketing purposes if they were somehow

rewarded for the loss of their privacy (Graeff & Harmon, 2002).

This has become even more important with the massive growth of social networks

(Statista, 2016A). The most recent possible breach of privacy through social media

channels concerned Facebook’s plans to use WhatsApp data, including phone

numbers, and combine it with Facebook data in order to suggest new friends and to

properly tailor the advertisements for the users. The user reaction has been

negative, with some users feeling that WhatsApp is no longer a

trustworthy app because it does not protect user privacy anymore (Tynan, 2016).

Some customers have been concerned about the “lack of control” (Schechner &

Koh, 2016). The main problem has been not only the users’ backlash, but also

European and British privacy regulators investigating the companies’ privacy

practices in this new plan of sharing information between the two platforms.

According to Statista (2016B), in the third quarter of 2014, 54% of active

Facebook users worldwide were aged 16 to 34,

making this particular demographic the primary contributor of big data and the

primary target for companies seeking to exploit it. Since more than half of the

users are young, it is important to focus on their perceptions to inspect

the moral effects data collection has on people. Most university students are within

this age group. Accordingly, this research will focus on the awareness, perceptions and

concerns of students from the University of Sheffield with regard to their data being

collected.

1.1 RESEARCH AIM

The aim of the dissertation is to analyse and investigate students’ perceptions of

their social media data being used for marketing, surveillance and other purposes

with or without their awareness.

1.2 RESEARCH OBJECTIVES

The objectives of the dissertation are:

Examine the general trends of data collection in social media within

scholarly literature and popular media;

Investigate University students’ awareness of personal data collection in

social media and their personal feelings and perceptions with regard to it

through the use of a questionnaire;

Inspect the moral effect personal data collection in social media has on

people by analysing the data from the interviews in combination with the

literature.

1.3 RESEARCH QUESTIONS

Using the interview questionnaire, this study’s goal is to answer the following

questions:

1. Are the students aware of their personal data being collected?

2. What feelings, emotions and reactions does this situation produce in them?

3. What do they think about that situation: is it positive or negative?

4. What measures have they taken or plan to take with regard to that?

2 LITERATURE REVIEW

2.1 INTRODUCTION

The literature review provides an overview of literature related to the topic of data

collection awareness. The literature review is divided into three sections:

Big Data & Data Collection provides some critical reflections from scholars;

Legal & Ethical Considerations provides an overview of issues with

regulations and ethics of data collection;

User Awareness & Privacy Concerns expands on the privacy concerns and

the state of customer awareness of data collection.

2.2 BIG DATA & DATA COLLECTION

This part of the literature review attempts to group some critical reflections from

scholars with regard to big data and data collection. The introduction provided one

brief definition of big data and its main characteristics. However, according to

Kitchin & McArdle (2016), big data do not all share the same characteristics, and

there are multiple forms of big data. There have been many concepts of big data

that included seven characteristics to identify big data by: “exhaustivity”, “fine-

grained”, “relationality”, “extensionality”, “veracity”, “value” and “variability”. However,

according to the authors’ research, for data to be classified as big data it only has to

include a few characteristics, not all. According to them, velocity – data being

“created in real time” – and exhaustivity – “capturing” all the data and not specific

parts – are the two “most important” and decisive points that define the concept of

big data.

For Boyd & Crawford (2012), the term “Big Data” was used in the past as a way to

refer to really big data sets that required supercomputers to process. Nowadays the

analysis does not require big equipment; it can be done using desktop computers

with any standard software. It is no longer about the quantity of data, but about the

spread and the impact of the content: “Big Data is less about data that is big than it

is about a capacity to search, aggregate, and cross-reference large data sets”

(p.663). The critical point is how the data is being handled, now that it is

easier to collect and analyse on a large scale and for different purposes: marketers

see the data as a way to target advertising, insurance providers as a

way to optimize their offerings, bankers as a way to gain market insights, etc. Some

institutions used their clients’ data for internal studies, just as a way to analyse their

behaviour. However, that data from anonymous users was given away to another

company, and it turned out to be easy to identify the original user even when the data is

anonymous. Again, the user was affected with no chance of defending themselves, or

controlling which data they want to share and with whom.

According to Zwitter (2014), “there are three categories of big data stakeholders: big

data collectors, big data utilizers, and big data generators”. Users, the big data

generators, are the least aware of what their data is being used for. It is reaching a

point where the big data utilizers, using algorithms, are able to determine our

preferences in many areas: food, friendship, places, movies, etc. Like Boyd &

Crawford, Zwitter claims that “this information gathered from statistical data and

increasingly from big data can be used in a targeted way to get people to consume

or to behave in a certain way, e.g. through targeted marketing” (p.4). There is an

intent to manipulate: preferences taken from the data are used for

purposes that may or may not be transparent, like offering a particular product for sale

with a present in return, some small article that they know people would like.

A regular person, with no suspicion, would just believe what they see, without having

an idea of what is behind that proposal, leaving a big hole with regard to the ethics of

the process.

2.3 LEGAL & ETHICAL CONSIDERATIONS

The current regulations for data usage do not reserve full rights to the actual

owner, the one who originates and provides the data. From Kennedy &

Moss’s perspective (2015), the metadata mined through social media reveals highly

valuable details such as: “who is speaking and sharing, where they are located, to whom

they are linked, how influential and active they are, what their previous activity

patterns look like and what this suggests about their likely preferences and future

activities.” All the data that the user produces while interacting is being used by

companies in a way that is not so interactive for the user, because the public is not able

to participate in or modify this processing of their information. The authors therefore express

their concern about the need for regulations that protect the user from this

mining of their information, which may be used to “adapt news and other content based on the

knowledge they have about audiences”, basically offering to sell what they know the

user already likes.

Different information governance regulations have been established in Europe and

the U.S. and are enforced to different levels. In the UK, data governance has been

covered by the Data Protection Act 1998, which enforces strict “data protection

principles” (Gov.uk, 2015). According to the law, data collection, storage and usage

purposes have to be declared transparently. The data collectors are required by the

Act to provide access to the collected personal information, except when the Act

says otherwise. On the other hand, Peacock (2014) states that in the USA, people’s

information, particularly that related to their leisure activities, social media interactions and

internet usage, is freely bought and sold to the consumer data broker industry.

Ethical law that should protect the user is almost non-existent, which allows online

retailers to increase their profits by using web tracking and storing the user’s personal

data. Every bit of personal interaction data is being analysed using the most

modern methods; somehow these companies have managed to avoid any

regulation, and the customers have no actual data protection laws to defend them.

Under the user agreement, the client has no way to negotiate: the

user must accept or decline, and because most people just want to use the

social media tool, they accept without even knowing what they are accepting. As a

result, the data tracking business is currently expanding, and big data storage

capacities are growing and becoming cheaper.

Once, the state was the one interested in data extraction; it seems that now

corporations are the ones leading the business of data extraction. The government

somehow takes advantage of the fact that there is no data extraction regulation, and more than

once it has aligned with private industries to get any desired information. It is a

win-win situation for both, which may explain why the government has no interest in

creating and enforcing a strict law to protect users from data extraction. Lyon (2014)

reviewed the effects of Edward Snowden’s revelations on big data, explaining the

situation since the alliance between the National Security Agency (NSA) and the

Internet companies in order to collect data for surveillance. This surveillance process

takes the data from the internet provider and the cell phone provider, giving it every

possible chance of getting all the desired information. The data is filtered,

analysed and stored for whatever time they consider necessary, using algorithms to

define relations or any suspicious activities.

2.4 USER AWARENESS & PRIVACY CONCERNS

According to the literature, users are aware of privacy policies in relation to their

personal data being collected by data collectors, but the agreements are

complicated and they rely on the trust and good will of the company not to misuse

the data. In his study, Obar (2015) questioned what the behaviour of a digital citizen

should be like nowadays, mentioning how unreasonable it is to expect a regular person to

understand all the technical terms used in the contract agreement they have signed:

“Imagine having to understand, manage and control, not only the myriad data

stockpiles that exist, but also the routing data associated with every data

transmission” (p.12). A regular person or any other individual without technical

knowledge is not likely to realise how a third party may take advantage of or sell their

data. It is also assumed that a regular individual is not likely to keep informed about any

progress or modification related to any contract that they have previously signed. Per

Obar, users need a definite solution, a concrete law that could actually protect them,

citing Lippman (p.13):

“The public is interested in law, not in the laws; in the method of

law, not in the substance; in the sanctity of contract, not in a

particular contract, in understanding based on custom, not in this

custom or that.”

In contrast, Norberg, Horne & Horne (2007) state that personal privacy will keep

deteriorating if the general public does not realise that they need to start making an

effort to actually understand what they are granting permission for and to whom, every

time they share their personal data. Earlier research showed (cited in Norberg, Horne

& Horne, 2007, p.107) that users’ concerns about privacy are associated more with

risks of “potential negative outcomes” to themselves than with customers’ “trust” in

the company that is handling their data. At the same time, because of the way the

current market works with regard to people’s ignorance of their data being collected

illegally, the authors claim that users’ behaviour may not be impacted that much by

these negative perceptions. In theory, users recognise the risks of releasing private

information; in practice, however, people often voluntarily consent to giving away

their data on the basis of trust, especially now that people are increasing the

time they spend in “data rich transaction channels” or the world wide web, and are

ignoring or simply “ticking” to give their consent without reading the “privacy policy”

of the different online sites.

With the amount of big data available, modern data mining tools were proven to

facilitate monitoring and tracking consumer “purchase behaviour”, in order to gain

deeper targeted insight into consumers’ needs (Graeff & Harmon, 2002). However,

the simplicity of collecting this information also contributes to growing privacy

concerns over companies’ intentions to use the data for their internal marketing

purposes or to make a profit by selling it to third parties, with users losing control over where the

information ends up. At the same time, Malhotra, Kim & Agarwal (2004) claimed that

a user’s thoughts on a company’s intentions for their private data are subject to the

user’s personal beliefs. Sayre and Horne (cited in Norberg, Horne & Horne, 2007)

established that customers were willing to share their personal data with a company

if they got a reward in return. As a result, even when the customer is aware of the

importance of their data, even when the customer is concerned with their privacy,

they will likely end up knowingly giving it away.

The current belief that control over information is a key factor in measuring

consumers’ privacy concerns has led to the suggestion that sellers should

treat the exchange of private data between them and the users as an

implicit contract (cited in Phelps, Nowak, & Ferrell, 2000). Taking that into consideration, a

social contract would be held any time a customer gives away personal

information to a seller. This contract would be considered violated if the customers’

data is being collected, if the seller rents the customer’s information to a third party

without asking for the customer’s consent, or if the customer is not allowed to

remove their name from a marketing list or otherwise restrict

the propagation of their data. In this research, it is assumed that the key point in

relieving the users’ privacy concerns is that customers would like to “have

more control” over their data in general and more control over how this personal data

is used.

“Consumer privacy exists when people can limit their accessibility

and control the release of information about themselves, and

invasions of privacy occur when control is lost or unwillingly

reduced as a result of a marketing transaction.” (cited in Phelps,

Nowak, & Ferrell, 2000, p.29)

These privacy issues are particularly obvious among young people using social

networks and modern technology, such as smartphones. Per Pybus, Cote &

Blanke’s paper (2015), mobile applications (apps) are more vulnerable to data

leaking than platforms that run through a desktop browser. Application

configurations do not distinguish between first and third parties, that is,

between the app proprietor and the other companies that the proprietor sells the

data to. In effect, the third party is granted easy access to the cell phone data. By

default, Android and iPhone applications share the device’s SIM identifiers, the

user’s phone number, so the third party has data and the identification of the data

producer for a long period of time. Despite this situation, young people still try to find

a way to protect their privacy; however, they end up giving up, accepting the contract

agreement and surrendering to social media, which inevitably generates more data for

the third party. In their paper, the authors commented on an experiment they ran,

where a group of young people developed an app that tracked

data as regular apps do, but tracked their own data on their own

smartphones. This gave them the chance to work with their own mined data and do

whatever they wanted with it, showing a different scenario, in which the owner mines their

own data and decides what to do with it.

In current times, when most young adults have a Facebook account, it is

acknowledged that this app tracks people whenever they open their browsers. Even

if a person does not possess a Facebook account, or the user who actually has the

account has logged out or disabled the tracking option, the app is still able to track

any subject (Skeggs & Yuill, 2015). Part of the agreement signed with Facebook

when opening an account mentions that the user “should not create an account that

is not for your own personal use, you will not create more than one personal

account, you will keep your contact information accurate and up-to-date” (Facebook,

2015). Basically, Facebook is trying to guarantee that its users remain

authenticated, because that is where the value resides: “they extract property from

the person rather than attaching property to the person” (Skeggs & Yuill, 2015,

p.384). Facebook requires “singularity” from the user as a way to guarantee the

authenticity of the data; however, Facebook is not interested in the individual per

se. What it actually does is turn all those “individual” data “into multiple aggregate

representations to be monetized as targeted ad space”.

2.5 CONCLUSION

The literature review attempted to provide a critical overview of what big data is and

what the issues are with regard to big data collection. Legal and personal privacy

issues were discussed. It was identified that different regulations are used to

maintain the integrity of data collection and privacy in different countries, with the UK

being stricter than the USA. However, there are still no real boundaries to data

collection.

The literature highlights that people are generally aware of the risks and the fact that

their data is being collected and used. They do not want to hand it over, but at the

end of the day, under most circumstances, they will still submit their data voluntarily.

When giving consent to data collectors, users have no deep knowledge of what is

included in the agreements or of the final destination of their data. It is a mix of

wanting to use the online services and difficulty in understanding the terms and

conditions. Young people are more involved in the privacy issues because they

generate a lot of data, but from their concerns to their attitudes, they are also more

interested in protecting their data, which makes them a good target to investigate.

3 METHODOLOGY

3.1 RESEARCH STRUCTURE

This research was based on a qualitative approach, using a semi-structured

interview for collecting the data, and a thematic analysis methodology for studying

the data. According to Vaismoradi, Turunen & Bondas (2013), qualitative research

methodology is a group of “philosophical perspectives, assumptions, postulates, and

approaches” that an analyst uses to leave their research open to “analysis, critique,

replication, repetition, and/or adaptation and to choose research methods”. Also, this

methodology is not a single research method, but “different epistemological

perspectives” that have helped to create different approaches, such as: grounded

theory, phenomenology, ethnography, action research, narrative analysis and

discourse analysis (p.398). Per Dunn (1983), this type of methodology contributes

some guidelines to improve “and evaluate particular theories and models of

knowledge creations, diffusion and utilisation”.

With regard to interview methods, Qu & Dumay (2011) claimed that structured

interviews are more effective for studying facts, while unstructured interviews are

used for research related to focusing or meaning, and the semi-structured approach

is used for “social construction”, in a sort of overlap between the two other approaches.

This fits what this study is expected to achieve, making a semi-structured type of

interview the better choice.

Lythcott & Duschl (1990) emphasise that any research relies on the coherence of

choosing the correct analysis methodology. For this study, thematic analysis was

chosen as the proper approach. According to Braun & Clarke (2016), “thematic

analysis is a method for identifying, analysing and reporting patterns (themes) within

data”. It is a method that provides flexibility which the other methods cannot provide.

There are no strict rules to determine what the themes are; rather, it relies on

the “researcher judgement” in order to define those. It also does not depend on

“quantifiable measures”, but is more focused on whether “it captures something important

in relation to the research question”. In effect, the thematic analysis is particularly

useful for analysing qualitative data. The following list provides a step-by-step guide

on how to conduct a thematic analysis based on Braun and Clarke (2016):


1. Getting familiarised with the data by reading it and re-reading it several

times;

2. Generating initial codes, by trying to code or label interesting features within

the entire data set;

3. Searching for common themes, by checking the codes and trying to group

them into possible themes;

4. Reviewing themes, and rechecking if the themes actually accomplished what

was stipulated in levels 1 and 2;

5. Defining and naming themes - after rechecking, deciding which will be the

final codes, with a clear definition and name for each theme;

6. Producing a report by comparing the data analysed with the literature and

the research questions.

A pivotal step in thematic analysis is to “catalogue related patterns into sub-themes”

as per Aronson (1995). Themes are a type of pattern of living or behaviour that

could be identified in the data, which in isolation may not have a particular meaning,

but when put together, they provide a better overview of the general opinion.

Gathering sub-themes will give a better understanding of the “pattern emerging”.

Based on the qualitative methodology, the research was divided into two stages.

The first part used a semi-structured qualitative interview method to retrieve the

necessary data from the respondents, who were University students, recruited

through email. The participants were notified about their responses being collected,

anonymised and used for this research. The second stage involved analysing the

data from the interviews using the qualitative thematic analysis approach and

comparing the data results with the trends inferred in the literature review.

3.2 RESPONSE RATE

Part of the research process was to recruit participants who would provide their
opinions regarding data collection through social media channels. After receiving
the Ethical Approval, a general email was sent to all University members through the
University service, seeking students over 18 years old interested in participating in
the study. The response rate was positive, with 10 people replying to the email in a
short time and showing genuine interest in participating in the project.


Of the 10 email responses, after trying to set up meetings according to mutual
time availability, 7 proved feasible. The seven respondents came to the
meetings, which were held in the Diamond building group rooms from August 4th to
9th. Two other participants were brought by one of the interviewees who had been
recruited by email. Those participants came freely due to their common interest in
sharing their opinions regarding the research, making a total of 9 interviewees.

3.3 DATA COLLECTION METHODS

3.3.1 METHODS

For the data collection, a semi-structured interview was designed with assistance
from the dissertation supervisor. After obtaining the Ethical Approval, a general
email was sent through the University system, inviting any student over 18 years old
who wanted to help in the research; the invitation email can be found in Appendix
7.2. The consent form previously accepted as part of the Ethical Approval was
presented to the interviewees. The participants were given time to read it through
and were asked if they had any doubts. It was also explained to them verbally that the
interview was going to be audio recorded, transcribed and anonymised. Both
transcripts and audio files would be stored on the University secure server with
access limited to my supervisor and me, and the data would be deleted after the
dissertation was accepted. They were informed that they could stop the interview
at any time or decline to answer any particular question. They freely signed the form,
and a signed copy was given to them.

3.3.2 INTERVIEW

In order to prepare for the interviews, a first draft with the initial questions was used

as a test run with a colleague. This draft of the questions was sent to the supervisor for

review. After that, some changes in the questions were made in collaboration with

her. The final draft with some substantial changes was the one used for the

interviews, making it impractical to use the test run as part of the research data.

The interviews done with the nine participants were audio recorded with their

confirmed consent. Afterwards, the audio recordings were transcribed and

anonymised, and stored in the University’s secure drive as part of the ethical

procedures.


The interview was designed as semi-structured because this allows the
interviewer to follow some previously prepared questions while expanding on them
depending on how the interview develops, giving it the feel of a more relaxed
conversation.

The interview was divided into 9 questions. The questions’ subjects were related to:

- Discussing the most used social media platforms

- Assessing the awareness of the ways their data was collected on social

media, the awareness of how and for what purpose their data was used by

third parties.

- Analysing the feelings and emotions related to their data being collected, as

well as any positive and negative thoughts about the usage of their data.

- Addressing and pinpointing risks and concerns in regards to security, privacy

and integrity of data being collected, and actions and precautions taken in

regards to the latter.

The full set of interview questions can be found in Appendix 7.4.

3.4 DATA ANALYSIS

3.4.1 INTRODUCTION

This section describes the process in which the thematic analysis method

was used in order to study the data obtained during the interviews. The audio from

the interviews was transcribed and used to do the analysis. Main thoughts extracted

from the transcripts were coded into labels that resemble the main idea of those

phrases. Themes and subthemes were recognised. The most important codes were

grouped into the themes and subthemes accordingly. The information was analysed

per codes in each subtheme, and conclusions were reached after that.

3.4.2 PROCESS DESCRIPTION

During the data analysis, all the transcripts from the interviews were read several

times. The phrases that contained the clearest thoughts and feelings were

highlighted. Those thoughts were coded in an effort to label them into shorter and


concrete phrases, giving a total of 150 codes. The codes were to be grouped into

themes as part of the thematic analysis procedures. At first glance, three different

themes were identified: Social Media, Privacy, Data Collection. For the Social Media

theme, the subthemes identified were Platforms and Frequency. For Privacy:

Feelings, Perceptions, Precautions, Comments were the subthemes. Data

Collection contained the subthemes: Awareness, Comments, Concerns, Feelings,

Purposes. All the main phrases that were coded were distributed between those

subthemes.

However, after rechecking all the codes, it was identified that some subthemes were
shared between themes; for example, the Feelings and Comments subthemes
were common to both Privacy and Data Collection. It was also recognised that
most of the topics were related to data collection rather than privacy, so it was more
appropriate to remove the Privacy theme and focus on Data Collection. Another
issue was that the Social Media theme contained information about the participants’
platform preferences, which is closer to quantitative data that does not need to
be grouped.

After that analysis, it was decided that it was a better approach to redefine the

themes and subthemes in a different way. First of all, Social Media would no longer
be a theme, and the information handled in that question would be treated as
statistical data. Secondly, the Privacy theme would be removed, as the information

obtained was not from the privacy perspective, but from the data collection point of

view. The Perceptions theme came as a solution to handle all feelings and
emotions related to the fact that the interviewees’ data is being collected. The

Concerns theme was used to handle the data collection issues and risks that worry

the participants and the precautions taken by them.

With the new themes and subthemes defined, the codes were grouped accordingly.

However, to make the analysis more manageable given the large number of codes, it was
decided that there should be between 5 and 8 codes per subtheme, which
required selecting the most substantial codes. In the code selection
process, the codes that were repeated most frequently across the interviewees
were given higher precedence. After that, the codes that contained the most
interesting and relevant information were selected.
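
The frequency-based part of the selection rule described above can be sketched in Python. This is only a minimal illustration under stated assumptions, not the procedure actually carried out for the study: the function name `select_codes`, the code labels and the participant IDs are all hypothetical, and the second criterion (how interesting a code is) was a manual judgement that the sketch does not attempt to reproduce.

```python
from typing import Dict, List, Set


def select_codes(code_sources: Dict[str, Set[str]], max_codes: int = 8) -> List[str]:
    """Rank codes by how many distinct interviewees raised them.

    code_sources maps a code label to the set of interviewee IDs that
    mentioned it. Ties are broken alphabetically so the result is stable.
    """
    ranked = sorted(code_sources, key=lambda code: (-len(code_sources[code]), code))
    return ranked[:max_codes]


# Hypothetical codes for one subtheme (placeholder labels and IDs,
# not data from the interviews).
example = {
    "code_a": {"P1", "P2", "P3"},
    "code_b": {"P2", "P4"},
    "code_c": {"P1", "P5"},
    "code_d": {"P3"},
}

selected = select_codes(example, max_codes=3)
print(selected)
```

Breaking ties alphabetically is an arbitrary choice made here for reproducibility; in practice the researcher's judgement would decide between codes raised by the same number of interviewees.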


3.4.3 THEMES AND SUBTHEMES DESCRIPTION

From the 9 interviews that were analysed, 3 main themes were identified:

Awareness, Perceptions, Concerns. For each theme, between two and four
subthemes were identified.

Table 3.1 Themes and sub-themes

Themes                           Sub-Themes
Awareness of Data Collection     General Awareness
                                 Awareness of the Purpose of Data Collection
Perceptions of Data Collection   General Perceptions
                                 Feelings
                                 Perceptions of a Positive collection purpose
                                 Perceptions of a Negative collection purpose
Concerns of Data Collection      General Concerns
                                 Risks
                                 Precautions

According to the main subject areas listed in the data collection methods section, a

number of themes have been identified and the questions were grouped and

analysed accordingly. Three themes have been defined: Awareness of Data

Collection, Perceptions of Data Collection and Concerns of Data Collection. The

answers related to the questions about awareness of how their data was collected in

social media and by third parties were put into the theme Awareness. The answers

related to examples of their perception of how their data was collected in social

media and by third parties, feelings and emotions, positive thoughts, were grouped

in the theme Perceptions. The answers related to risks, concerns and precautions

taken about data collection were put into the theme Concerns. Also the data from

the first subject area detailing most used social media platforms was used to

quantify the preferences of the interviewees.

Within the Awareness of Data Collection theme, the subthemes General
Awareness and Awareness of the Purpose of Data Collection were chosen to group
the interviewees’ answers related to the second main subject. The General
Awareness subtheme was used to collect the opinions regarding the interviewees’


awareness of data collection: not particularly how the data is collected, but
whether or not they know that it is being collected. The second subtheme,
Awareness of the Purpose of Data Collection, grouped the opinions related to knowing
how the data is used or intended to be used.

For the Perceptions of Data Collection theme, four subthemes were established.

The General Perceptions subtheme was used to group the opinions the interviewees

have about data collection, how they think their data is being collected, any related

idea or thought about data collection itself. In the Feelings subtheme, the particular

sentiments or feelings that are brought about by knowing that their data has been

collected were grouped. After coding the interviewees’ awareness in regards to their

data being used for a particular purpose for the first theme, two further subthemes

were identified. Positive purposes included the opinions of the interviewees showing

acceptance towards how and where their data is being used for specific purposes,

while Negative purposes detailed the opposite of the previous subtheme.

For the third theme, the participants’ opinions were distributed between three

subthemes. General Concerns was used to classify the thoughts that worry the

interviewees the most about their data being collected. The Risks subtheme was

created to group the thoughts related to what they consider a possible harm of data

collection. The ideas related to current or future measures taken by the participants

in order to protect their data were placed in the subtheme Precautions.
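
The hierarchy in Table 3.1 can also be written down as a simple nested mapping, which makes it easy to check which theme a subtheme belongs to. The sketch below is illustrative only: the theme and subtheme names come from the table, while the variable `THEMES` and the helper `theme_of` are hypothetical names introduced here.

```python
from typing import Dict, List, Optional

# Themes and subthemes as listed in Table 3.1.
THEMES: Dict[str, List[str]] = {
    "Awareness of Data Collection": [
        "General Awareness",
        "Awareness of the Purpose of Data Collection",
    ],
    "Perceptions of Data Collection": [
        "General Perceptions",
        "Feelings",
        "Perceptions of a Positive collection purpose",
        "Perceptions of a Negative collection purpose",
    ],
    "Concerns of Data Collection": [
        "General Concerns",
        "Risks",
        "Precautions",
    ],
}


def theme_of(subtheme: str) -> Optional[str]:
    """Return the theme that contains the given subtheme, or None if unknown."""
    for theme, subthemes in THEMES.items():
        if subtheme in subthemes:
            return theme
    return None


print(theme_of("Risks"))
print({theme: len(subthemes) for theme, subthemes in THEMES.items()})
```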

3.5 ETHICAL ASPECTS

As per University procedures, an ethical application was submitted to the

Information School department for review, which classified this research as “low

risk”. The application form and the certificate of approval letter are included in
Appendices 7.1 and 7.3. The interviews included a human element; however, the
participants were not asked for any personal or sensitive information, and no names,
gender or age were recorded. The interviews were voice recorded. All data taken from
the audio records was anonymised, as the participants were assured at the
beginning of the interview. An ethics consent form was given to the participants,
containing details about the research project and the kind of data that was going to be
collected from them, and explaining that the data would be anonymised and
saved on a secure storage drive with limited access. The participants were asked
to read the form and sign it if they agreed to give consent to the interview. A copy of the consent

form was given to the participants. The audio recordings, the transcripts, and


scanned versions of the consent forms were uploaded to the University secure

research data file store.

3.6 LIMITATIONS AND RISKS

After being reviewed, this research was classified as low risk in the Ethical Approval
letter. The interviewees were recruited by email, using the University service, which
restricted the participants to University students over 18 years old and living
in the United Kingdom. Participation in the research posed no risks for the
interviewees. During the interviews no demographic data was documented: no
gender, age, background or nationality. After the interviews, the audio recordings
were transcribed and the participants’ names were replaced by the letter ‘I’ (for
‘interviewee’) followed by an ascending number representing the order in which the
interviewees participated in the project. This anonymised the data and protected the
interviewees’ identity, avoiding any possible risk and fulfilling the Ethical Approval
agreement. The data analysis is based on the participants’ opinions provided during
the interviews, which may not be entirely trustworthy; however, this is a minor risk
that is inherent in qualitative research.
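
The labelling scheme described in this section (the letter ‘I’ followed by a number reflecting the order of participation) is simple enough to sketch in Python. This is a minimal illustration rather than the tool used in the study: the function name `assign_labels` and the placeholder participant names are hypothetical.

```python
from typing import Dict, List


def assign_labels(participants_in_order: List[str]) -> Dict[str, str]:
    """Map each participant to 'I' plus their 1-based position in the interview order."""
    return {
        name: f"I{position}"
        for position, name in enumerate(participants_in_order, start=1)
    }


# Placeholder names only; no real participant details are used here.
labels = assign_labels(["participant_a", "participant_b", "participant_c"])
print(labels)
```

In a real workflow the mapping itself would need to be kept separate from the transcripts, or discarded, since it is the only link between a label and a person.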


4 RESULTS AND DISCUSSION

4.1 INTRODUCTION

This chapter discusses the results found in the qualitative analysis of the nine
semi-structured interviews. The following sections describe the findings obtained
using the thematic analysis method for each of the themes identified during the data
analysis.

The Awareness of Data Collection section illustrates the main opinions found in
the interviews regarding the participants’ knowledge of their data being collected,
showing that the majority of interviewees are aware of the subject.

The Perceptions of Data Collection section pinpoints the participants’ thoughts
about their data being collected, their feelings about it, and their opinions on
whether they approve or disapprove of the purpose for which their data is being
collected. It illustrates that most of them strike a balance between knowing some of
the uses that their data will have and being aware that they are not fully in control
of those uses.

The Concerns of Data Collection section discusses the subjects that worry the
interviewees: general concerns, what they consider risky about their data being
collected, and the precautions they take or are willing to take in order to protect
their information. Their most common resolution was to try to stay as private
as possible, keeping control of the data they share.


4.2 OVERVIEW

Figure 4.1 Number of interviewees per Platform.

The first question of the interview asked which social media platforms the
interviewees use on a regular basis. All interviewees answered that they have a
Facebook account that they use on a daily basis. The second most used platform is
Twitter, used by 78% of interviewees, and the third is Instagram, with 67%.
Snapchat, LinkedIn and WhatsApp were each mentioned by 22% of interviewees, while
the rest (Yik Yak, Telegram, Skype, Weibo, WeChat, Google+, Pinterest, Myspace,
Hi5 and Wayn) were each mentioned by only one interviewee (see Figure 4.1).
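
The percentages above follow directly from the number of interviewees per platform out of the 9 participants. The short Python sketch below reproduces the arithmetic; note that the counts for Twitter, Instagram, Snapchat, LinkedIn and WhatsApp are inferred here from the reported percentages (an assumption, since the text gives percentages rather than counts), and the helper name `share` is hypothetical.

```python
TOTAL_INTERVIEWEES = 9

# Counts inferred from the reported percentages (assumption); Facebook is
# stated to be used by all interviewees, Yik Yak by exactly one.
platform_counts = {
    "Facebook": 9,
    "Twitter": 7,
    "Instagram": 6,
    "Snapchat": 2,
    "LinkedIn": 2,
    "WhatsApp": 2,
    "Yik Yak": 1,
}


def share(count: int, total: int = TOTAL_INTERVIEWEES) -> int:
    """Percentage of interviewees, rounded to the nearest whole number."""
    return round(100 * count / total)


for platform, count in platform_counts.items():
    print(f"{platform}: {count}/{TOTAL_INTERVIEWEES} = {share(count)}%")
```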

4.3 AWARENESS OF DATA COLLECTION

4.3.1 GENERAL AWARENESS

The General Awareness subtheme showed that most of the interviewees are aware that
their data is being collected through social media channels, with 8 of the 9
interviewees acknowledging that their data is being collected. For instance,
interviewee I6 mentioned “I don’t know what data they collect from me, but
they must collect data, but I don’t know why or what for?”, showing a clear
understanding that the data is being collected even without knowing
for what specific purpose. There were common comments like “I know lots of data is
being collected” and “I am aware of these things they do, so I am not really negligent”,
from interviewees I1 and I2 respectively. Others showed particular knowledge and
conviction of how the data is collected; for example, interviewee I3 mentioned, “I am


quite aware of how they may use the data or sell”. Only one participant was not
aware of the data being collected, stating, “I don’t know that….my information
probably be taken away while I don’t know” (interviewee I4).

4.3.2 AWARENESS OF THE PURPOSE OF DATA COLLECTION

The Awareness of the Purpose of Data Collection subtheme was created to capture the
ways in which the participants expressed their knowledge that their data is being
used for a particular purpose. As shown in the subtheme above, most of the
interviewees were aware that their data is being collected. This subtheme shows
that most of them are also aware of the final use of that collected data. Except
for the interviewee who was not aware of data collection in the first subtheme,
all showed some signs of knowledge. For example, I7 mentioned: “I wouldn’t say I am
overly aware of particular ways, but I am aware that is possible, I don’t know the
how’s and exactly what”. Three of them (I2, I5, I6) made comments revealing their
understanding that their data will be used for marketing purposes. I5 stated “I know
it’s valuable to collect people’s information and to be able to pinpoint what it’s
necessary marketable to people”. Three others (I7, I8, I9) were able to be more
specific and pinpoint that the purpose was targeted marketing. For example, I8
claimed that “probably third party companies would use that to target market
products to you, to make profit”. Other purposes were also identified, such as
“surveillance” and “health care”. I5 expressed that data will be used: “Mostly in
marketing, sometimes in defense…I mean marketing, surveillance, health care…”.

4.4 PERCEPTIONS OF DATA COLLECTION

4.4.1 GENERAL PERCEPTIONS

This subtheme coded the most notable thoughts shared by the
interviewees regarding their data being collected. Some mixed reflections were
identified. Two participants agreed that third party companies
need to make money somehow; for example, I9 mentioned: “that is just the reality of
an online enterprise, they have to make money”. Two others (I5, I8) found it
unsettling that their activities were tracked. According to I5, it “is kind of creepy too,
you don’t want to be watched in your back for apps that are going to attack you”. On
the other hand, interviewee I3 found the end balance to be positive: “if we did the pros
vs cons list, there are probably more pros to having this personal information out


there than there are cons, in my opinion”. I9 was confident that users would be
informed of any use of private data: “I think a lot of the information, transactions that
I am pretty sure happen with my consent and my knowledge”. I1 shared a
preference for seeking social media channels that provide more privacy: “I am
moving to Snapchat, because I feel it has a higher level of privacy”. There is a sense
of understanding of the business, but at the same time some sense of worry that the
information is no longer under the owners’ control, as stated by I3: “anybody can
actually set up an application and start actually retrieving Twitter data, you got not
control of who actually got the data and who is using it”.

4.4.2 FEELINGS

The Feelings subtheme gathers all the different emotions the participants shared
regarding data collection. As in General Perceptions, there are mixed
emotions on this subject. They seem to depend on whether the participant
considers the final destination of their data to be a good one, and on not being in
control of deciding that final purpose. Two described it as scary but in different
contexts: for I1 it was related to not knowing the information’s final destination
(“It’s a bit scary, because we don’t know what they are going to do with the
information.”), while for I9 it was about data going to an unwanted destination
(“Some sort of huge intelligence network to try to then find out what every single
person thinks and knows, then yeah that can be a bit scary”). I9 is also one of the
two interviewees, together with I7, who agree that they do not mind their data being
used by third parties. To this end, I7 mentioned: “I am saying basically I don’t mind
about me”. However, I9 stated that it depends on the intentions behind the data use
and on the user’s awareness of it: “I think if my information it’s being used
maliciously, what I mean by that is that, if my information is being used by a
company that wanted to use me for its own profit without at least my understanding
it, I probably be a little bit annoyed”. Another interviewee, I6, expressed awareness
of data collection and worry about it: “I read a lot of articles about it and it
makes me really worried”, while I8 expressed being less worried: “I supposed,
I’m probably not as worried as I should be”.

4.4.3 PERCEPTIONS OF A POSITIVE COLLECTION PURPOSE

This subtheme contains the most common answers about what the interviewees considered
a good use of their personal information that has been collected, the following three


being the most frequently mentioned: targeted advertising, security and sociology.
Five out of 9 interviewees agreed that targeted advertising is a positive asset. I6
mentioned: “I guess if you thought of advertising, cause sometimes you can find good
stuffs that you did not realized you wanted”. I1 confirmed that this purpose improves
the service the user is receiving, “at the moment I am not bothered because is improving
my experience”, and I2 stated that it makes life more efficient: “all those things, they
make our day better”, “I am more productive with those apps… some are good, some are
not, gives you more information, sometimes they mess up”. Security was also
considered a positive purpose by 2 interviewees; for example, I5 said: “Positive
reasons, would be to help understand people first and second would be security”.
Sociology was another common purpose, also mentioned by 2 interviewees, as I7
suggested: “The idea that we have an idea of sociology from through digital methods
it’s quite good”. Analysis of trends was also considered positive by I6: “if you look
at it more broadly maybe if you analyse data you can find trends in data”.

4.4.4 PERCEPTIONS OF A NEGATIVE COLLECTION PURPOSE

This subtheme collected the negative opinions regarding a particular
purpose of the collected data. Most of the interviewees expressed dissatisfaction
when their data was used for a marketing purpose but targeted wrongly. Three
interviewees (I2, I5, I8) mentioned situations related to that, with two of them
mentioning particular examples that happened to them, such as receiving
advertisements for products that they were not interested in purchasing, or that were
not gender-appropriate, like female products being advertised to a male user,
which was the case mentioned by I5:

“I mean I am not a cross dresser, so if someone else looked at my system at that

moment and saw brassieres and panties I mean it see that I am trying to shop for

my girlfriend or something for me”.

Interviewee I2 described a similar situation: “if the coding makes an error, and because
they give information that is not pertain to you, and they keep bothering you on
something”. The interviewee also mentioned that excessive advertising can
negatively affect the efficiency of everyday routine, “anything that is fusing a lot of
information on me, that affects my own productivity and creativity”, turning the
marketing service into something irritating: “I cannot use the internet for free without
getting adverts, it’s annoying, that’s the only thing, but apart from that, I am alright”.


4.5 CONCERNS OF DATA COLLECTION

4.5.1 GENERAL CONCERNS

The General Concerns subtheme gathers all the thoughts that worry the participants
regarding their data being collected. Three subjects recurred, each raised by two of
the 9 interviewees:

- Fear of being hacked, as explained by I2: “I wonder if someone hacks on the
information, oh my God, like my bank accounts, or stuff like that”.

- The uncertainty of how and by whom the information will be handled, as
mentioned by I9: “There is always that ‘What if your information ends up in
the wrong hands?’ Sort of question”. I1 also mentioned that it can
turn into a scary situation: “It’s a bit scary, because we don’t know what they
are going to do with the information”, and I2 expressed it as a fear of the
unknown: “I have a feeling like behind the scenes they are doing other
things that you are not supposed to be doing, like making clones of
everybody, like they can do stuffs that we don’t know about”.

- The fact that there is no way to escape from Internet history, as
I5 said: “I mean the internet always has the way of never forgetting because
information that is collected is already stored somewhere”. I2, who shared
the same thought, suggested being careful about posting data, because “the
internet never forgets, if you don’t want it out there, don’t even share it, don’t
put it on social media”.

4.5.2 RISKS

The Risks subtheme describes the issues the interviewees considered a possible
exposure to risk due to data collection. Two of the interviewees considered
“infringement of privacy” a possible risk when asked about that; for example, I5
mentioned: “your privacy has been infringed, it’s inoperable now, I mean if you try
yourself to be a very private person, I think the moment you go in the internet, your
privacy is broken”. I3 considered that a major hazard for young people could be
sharing their location: “there are these applications where you can track
geo-surveillance on specific regions, children should be careful posting too much
personal information, location information, should be really careful, that’s the big
risk.” Another possible risk would be hackers seeking information, as
mentioned by I5: “password is really the way forward right now, when people can
actually hack a secure server and download millions of passwords just to get


information”. It was also taken into consideration that personal information, such as
the date of birth, could be used to get bank account information, as pinpointed by I6:
“The risk is that if I’ve got personal information out there that lead to identify me,
people can use it fraudulently against me”. The interviewee also mentioned the
common risk that there is no control over the data: “I don’t know who has that data, I
don’t know where it goes, that’s the risk to me.”

4.5.3 PRECAUTIONS

For the Precautions subtheme, the main actions taken by the interviewees in order to
protect their data were considered. Three participants mentioned trying to
be private as a measure to protect their information. For example, I2 stated: “I just
try to be private, apart, whatever I want to be out there I put it out there”. Two of
the three (I3, I6) also specifically mentioned that they are careful about what they
post on social media channels, as stated by I6: “I am quite careful what actually I put
on those platforms, like Facebook I barely post”. The same participant also mentioned
that deleting information may help: “just removing data, and I don’t
post any much information”. Another common preventive action, mentioned by two
interviewees, was to change the “account privacy settings”, as described by I3: “if the
user doesn’t want their personal or private information being posted on social
media, they should really make their account private”. I3 also highlighted the
importance of setting the right privacy settings: “you really need to be careful with
privacy settings, so as long as you’ve got quite high privacy settings that means that
no one can actually go through your page”. Configuring a Virtual Private Network
was also a precaution mentioned by 2 interviewees, for example I5: “I’ll probably get a
VPN to hide my IP address from being public”.

4.6 DISCUSSION

With regard to awareness, most of the interviewees were aware of their data

being collected even when they are not sure what for or how. The majority of the

interviewees were aware of the purposes for that collected data, such as targeted

marketing, surveillance, behaviour tracking and healthcare. From the interviewees,

mixed reflections were identified. The different perceptions confirmed Malhotra, Kim

& Agarwal’s (2004) claim about subjectivity of the company’s intentions to the user’s

beliefs. Some users acknowledged that third party companies need to make money,

showing a sense of understanding of the business side, but at the same time some


sense of worry that the information is no longer under the owners’ control.

Marketing, for example, was the dominant topic among interviewees, with mixed to

positive results. Interviewees identified targeted advertising, as well as security and

sociology, as positive purposes for their personal information use. However,

interviewees expressed frustration and worry with their activities being tracked. They

also expressed dissatisfaction when their data was used for a marketing purpose but

targeted wrongly or used excessively. The end balance was positive, with

interviewees identifying more positives than negatives.

Interviewees identified many concerns about data collection, many of them fears and

risks. Among the risks, they mentioned hacking, the identification of users from personal data

and privacy infringement. Users feared not knowing where the data will end up and

who might end up tracking them. However, overall, the interviewees were willing to

make concessions to their data use as long as it benefitted them, just like Sayre and

Horne claimed (cited in Norberg, Horne & Horne, 2007). Participants said they were

scared of surveillance; however, it also made them feel safe, so they would rather

have it, even though they did not completely like it. This meant that they agreed to

having their privacy invaded to protect their personal safety. None of the

interviewees mentioned concerns over lack of legal regulations. This confirms

Obar’s (2015) opinion that regular people do not bother with the legal considerations

of data collection, even though it is they who tick away their consent.

It was also confirmed that young people are thinking about preventative measures,

like Pybus, Cote & Blanke (2015) suggested. Users listed trying to be private as a

measure of data protection, deleting information, changing account privacy settings

and configuring a virtual private network. However, they also realised there is no

way to escape from Internet history and the easiest way is to keep more personal

data to themselves.

5 CONCLUSIONS

The overall aim of the dissertation was to analyse and investigate the students’

perceptions of their social media data being used for marketing, surveillance and

other purposes with or without their awareness. Numerous articles were read in

order to build the literature review to examine the general trends of data collection in

relation to big social media data, finding facts that were then contrasted with the

research findings. The literature review highlighted how big data

from social networks became a source of business for marketing and is also used for

surveillance purposes. On the legal side, it was noted that there is a lack of strong

laws that actually protect the users who produce this data, meaning they are no

longer in control of their information. The literature also indicated that users are aware

of privacy policies but do not bother to understand them, preferring to sign

in and trust that the company will not misuse their data. The literature review helped

build the questions for a semi-structured interview to investigate students’

awareness and perceptions. The responses were then analysed to answer the main

research questions.

RQ1: Are the students aware of their personal data being collected?

Most of the interviewees were aware that their data is being collected through social

media channels. They understood that data shared on

the internet is no longer under the control of its owner. The World Wide Web is hard to

manage, and they were aware that keeping as much information as possible to themselves is the

most practical way to protect it.

RQ2: What feelings, emotions, reactions, does this situation produce in them?

Data collection provoked different types of reactions in the participants. On the one hand,

they understood that social media is a business and needs to make a profit

somehow. On the other hand, they were worried that their data may fall into the wrong

hands and end up being used for undesired purposes or to identify

private individuals.

RQ3: What do they think about that situation, if it is positive or negative?

They considered some purposes positive, such as targeted advertising, which

provides them with first-hand information that can be useful when they

are actually looking for something in particular, but at the same time they said

that mistargeted advertising can become annoying. Security and

surveillance was another topic that drew mixed reactions. Participants felt

that knowing their location is being used for

surveillance gave them a sense of personal security, but at the same time

they did not want to be watched. It was also mentioned that young people are not

completely aware of the consequences of sharing their location, which may become a

problem because it is easy to track people down using geolocation apps.

RQ4: What measures have they taken or plan to take in regards to that?

Most of the interviewees agreed that they want to keep their privacy, and

some measures were taken to do so. Some of them changed their

account settings in order to make their profiles private; others decided to delete

personal information from their social media profiles that might be used to identify

them and then be used against them. Others tried to configure virtual private

networks so that their information is transferred securely.

Overall, the participants were aware of the situation: they knew their data is being

collected and they were just living with it. For better or for worse, they did not actually

feel affected. They did not like their private information being used, but they

accepted it as the way the social media world works. As long as it does

not put them in a dangerous situation, it will not stop them from their normal

activities online.

5.1 LIMITATIONS AND RECOMMENDATIONS

This study had a number of limitations. First of all, the study did not take into

consideration age, gender, nationality, cultural background or any demographic

data. Secondly, the research used a single

qualitative research approach. Subsequent research would benefit from

triangulation, combining interview data with quantitative data from an online

questionnaire, or from using a different qualitative approach, such as grounded theory, to

assess interview data without reviewing the literature first. Also, the results were

rather ambiguous, as expected from the literature. Due to these limitations,

it is recommended that further studies be carried out. A

follow-up study could gather responses from a wider geographical area

or a wider organisational background. A more comprehensive analysis, using a

combined triangulation method, could search for quantitative results in a wider

sample. After a questionnaire, interviews could be conducted with a wider sample in order

to get clearer overall insights into the general trends among the participants.

Because it was established that attitudes to privacy concerns are mixed, future

research could focus primarily on what concerns users have over their data. Given

the lack of regulations and users’ indifference, future research should try to

collect users’ opinions on whether and how data collection should be regulated. Another

specific study could focus on ways to combat privacy issues, in addition to the

options suggested by the current interviewees. Given recent events in social

media, such as the Facebook and WhatsApp scandal, it would be extremely valuable to

investigate further how users can effectively protect their social media data.

6 REFERENCES

Aronson, J. (1995). A Pragmatic View of Thematic Analysis. The Qualitative Report,

2(1), 1-3. Retrieved from http://nsuworks.nova.edu/tqr/vol2/iss1/3

Boyd, D., & Crawford, K. (2012). CRITICAL QUESTIONS FOR BIG DATA.

Information, Communication & Society, 15(5), 662–679. doi:

10.1080/1369118X.2012.678878

Braun, V., & Clarke, V. (2006). Using thematic analysis in psychology. Qualitative

Research in Psychology, 3(2), 77–101. doi: 10.1191/1478088706qp063oa

Cukier, K., & Mayer-Schoenberger, V. (2013). The Rise of big data. Foreign Affairs,

92(3), 27–40. Retrieved from

http://search.ebscohost.com/login.aspx?direct=true&db=buh&AN=87000329&site=ehost-live

Dunn, W. N. (1983). Qualitative Methodology. Science Communication, 4(4), 590–597.

doi: 10.1177/0164025983004004007

Eynon, R. (2013). The rise of Big Data: what does it mean for education, technology,

and media research? Learning, Media and Technology, 38(3), 237–240. doi:

10.1080/17439884.2013.771783.

Facebook. (2016). Statement of Rights and Responsibilities. Retrieved from

https://www.facebook.com/terms

Graeff, T. R., & Harmon, S. (2002). Collecting and using personal data: consumers’

awareness and concerns. Journal of Consumer Marketing, 19(4), 302–318.

doi: 10.1108/07363760210433627

Gov.uk (2015). Data protection. Retrieved from

https://www.gov.uk/data-protection/the-data-protection-act

Kennedy, H., & Moss, G. (2015). Known or knowing publics? Social media data

mining and the question of public agency. Big Data & Society, 2(2),

2053951715611145. doi: 10.1177/2053951715611145

Kitchin, R., & McArdle, G. (2016). What makes Big Data, Big Data? Exploring the

ontological characteristics of 26 datasets. Big Data & Society, 3(1), 1-10. doi:

10.1177/2053951716631130

Lyon, D. (2014). Surveillance, Snowden, and Big Data: Capacities, consequences,

critique. Big Data & Society, 1(2), 1-13. doi: 10.1177/2053951714541861

Lythcott, J., & Duschl, R. (1990). Qualitative research: From methods to

conclusions. Science Education, 74(4), 445–460. doi:

10.1002/sce.3730740405

Malhotra, N. K., Kim, S. S., & Agarwal, J. (2004). Internet Users’ Information Privacy

Concerns (IUIPC): The Construct, the Scale, and a Causal Model.

Information Systems Research, 15(4), 336–355. doi: 10.1287/isre.1040.0032

Marr, B. (2016, March 15). 17 Predictions About The Future Of Big Data Everyone

Should Read [Blog post]. Forbes. Retrieved from

http://www.forbes.com/sites/bernardmarr/2016/03/15/17-predictions-about-the-future-of-big-data-everyone-should-read/#4d361826157c

Norberg, P. A., Horne, D. R., & Horne, D. A. (2007). The Privacy Paradox: Personal

Information Disclosure Intentions versus Behaviors. Journal of Consumer

Affairs, 41(1), 100–126. doi: 10.1111/j.1745-6606.2006.00070.x

Obar, J. A. (2015). Big Data and The Phantom Public: Walter Lippmann and the

fallacy of data privacy self-management. Big Data & Society, 2(2), 1-16. doi:

10.1177/2053951715608876

Peacock, S. E. (2014). How web tracking changes user agency in the age of Big

Data: The used user. Big Data & Society, 1(2), 1-11.

http://doi.org/10.1177/2053951714564228

Phelps, J., Nowak, G., & Ferrell, E. (2000). Privacy Concerns and Consumer

Willingness to Provide Personal Information. Journal of Public Policy &

Marketing, 19(1), 27–41. doi: 10.1509/jppm.19.1.27.16941

Pybus, J., Cote, M., & Blanke, T. (2015). Hacking the social life of Big Data. Big

Data & Society, 2(2), 1-10. doi: 10.1177/2053951715616649

Qu, S. Q., & Dumay, J. (2011). The qualitative research interview. Qualitative

Research in Accounting & Management, 8(3), 238–264.

doi:10.1108/11766091111162070

Scarfi, M. (2012, June 28). Social media and the big data explosion. Forbes.

Retrieved from

http://www.forbes.com/sites/onmarketing/2012/06/28/social-media-and-the-big-data-explosion/#7b6d9a4f6aa7

Schechner, S., & Koh, Y. (2016, August 29). European Regulators Scrutinize

WhatsApp Data-Sharing Plan With Facebook. The Wall Street Journal.

Retrieved from

http://www.wsj.com/articles/european-regulators-scrutinize-whatsapp-data-sharing-plan-with-facebook-1472506175

Skeggs, B., & Yuill, S. (2015). Capital experimentation with person/a formation: how

Facebook’s monetization refigures the relationship between property,

personhood and protest. Information, Communication & Society, 19(3),

380–396. Retrieved from

http://www.tandfonline.com/doi/full/10.1080/1369118X.2015.1111403

Soares, L. (2012). The Rise of Big Data. EDUCAUSE Review, 47(3), 60-61.

Retrieved from

http://er.educause.edu/~/media/files/article-downloads/erm1237.pdf

Statista. (2016A). Number of social network users worldwide from 2010 to 2020 (in

billions). Retrieved from

http://www.statista.com/statistics/278414/number-of-worldwide-social-network-users/

Statista. (2016B). Age distribution of active social media users worldwide as of 3rd

quarter 2014, by platform. Retrieved from

http://www.statista.com/statistics/274829/age-distribution-of-active-social-media-users-worldwide-by-platform/

Tynan, D. (2016, August 25). WhatsApp privacy backlash: Facebook angers users

by harvesting their data. The Guardian. Retrieved from

https://www.theguardian.com/technology/2016/aug/25/whatsapp-backlash-facebook-data-privacy-users

Vaismoradi, M., Turunen, H., & Bondas, T. (2013). Content analysis and thematic

analysis: Implications for conducting a qualitative descriptive study. Nursing

& Health Sciences, 15(3), 398–405. doi: 10.1111/nhs.12048

Zwitter, A. (2014). Big Data ethics. Big Data & Society, 1(2), 1-6. doi:

10.1177/2053951714559253

7 APPENDICES

7.1 RESEARCH ETHICS APPLICATION

7.2 INVITATION EMAIL

7.3 CONSENT FORM

The University of Sheffield Information School

Perceptions and awareness of data collection in social media.

Researchers: Sara Michelle Urrea Aguilera ([email protected])

Purpose of the research: To analyse and investigate students’ perceptions of their social data being used for marketing, surveillance and other purposes with or without their awareness.

Who will be participating? We are inviting all higher education/university students over the age of 18.

What will you be asked to do? We will ask you a series of questions about your understanding of and views about personal data collection in social media. We would like you to provide as in-depth an answer as possible.

What are the potential risks of participating? The risks of participating are the same as those experienced in everyday life.

What data will we collect? We will collect the responses from a number of interviewees. All participants will be audio recorded during the interviews, and some anonymised notes will be taken by the interviewer.

What will we do with the data? A transcript will be created of each audio recording. The collected responses will be analysed and discussed in my dissertation for the master’s degree.

Will my participation be confidential? We will anonymize the responses of all interviewees. No identifying personal information will be used for the project research after the interviews. All data will be stored in a secure location on the Information School´s research data drive, which can be accessed only by me, my supervisor, and the School´s Examinations Officer and ICT staff operating the facility. I will also back up the data and store a password protected version on my laptop. All data will be deleted once the dissertation is accepted.

What will happen to the results of the research project? The results of this study will be included in my master’s dissertation which will be publicly available. The results may also be published e.g. as a scholarly journal article. Please contact the School in six months.

I confirm that I have read and understand the description of the research project, and that I have had an opportunity to ask questions about the project.

I understand that my participation is voluntary and that I am free to withdraw at any time without any negative consequences.

I understand that if I withdraw I can request for the data I have already provided to be deleted, however this might not be possible if the data has already been anonymised or findings published.

I understand that I may decline to answer any particular question or questions, or to do any of the activities.

I understand that my responses will be kept strictly confidential, that my name or identity will not be linked to any research materials, and that I will not be identified or identifiable in any report or reports that result from the research, unless I have agreed otherwise.

I give permission for all the research team members to have access to my responses.

I agree to take part in the research project as described above.

Participant Name (Please print) Participant Signature

Researcher Name (Please print) Researcher Signature

Date

Note: If you have any difficulties with, or wish to voice concern about, any aspect of your participation in this study, please contact Dr Jo Bates, Research Ethics Coordinator, Information School, The University of Sheffield ([email protected]), or the University Registrar and Secretary.

7.4 RESEARCH ETHICS APPROVAL LETTER

7.5 INTERVIEW QUESTIONS

QUESTIONS:

In this interview I will be asking you about your awareness and perceptions of

personal data collection in social media. As you may have noticed, we have reached

an era where we are voluntarily and involuntarily giving away our private personal

information to third party companies that profit of our data and allowing them to use

it for their own means.

1. Which social media platforms do you use on a regular basis?

2. How aware do you feel you are about the ways in which your personal data

is being collected in your social media channels, for example, Facebook,

Twitter, Weibo?

3. In which ways do you think your personal details are being collected by

social media platforms?

In case they say they don’t know or fail to elaborate:

a. By using personal data during registration

b. By uploading pictures

c. By conversing through messaging services

d. By allowing third party applications access your data, E.g. your GPS

location.

e. By tracking your activity on other websites

4. How aware do you feel you are about the ways in which your personal social

media data is being used by third parties other than the social media

companies that collect it?

5. Can you give some examples of how you think your social media data is

being used by third parties?

6. How does it make you feel that your social media data is being collected and

used for such purposes? / What emotions you have about sharing sensitive

personal data in this way? Why do you think you feel this way?

7. What specific risks and concerns do you think sharing personal data on

social media poses?

8. Can you think any positive reasons for personal data being collected? What

would they be?

9. Are you currently doing anything in regards to your personal data being

collected? Is there anything that you would like to do or plan to do in the

future but haven’t got around to yet?

7.6 ACCESS TO DISSERTATION

Access to Dissertation

A Dissertation submitted to the University may be held by the Department (or

School) within which the Dissertation was undertaken and made available for

borrowing or consultation in accordance with University Regulations.

Requests for the loan of dissertations may be received from libraries in the UK and

overseas. The Department may also receive requests from other organisations, as

well as individuals. The conservation of the original dissertation is better assured if

the Department and/or Library can fulfill such requests by sending a copy. The

Department may also make your dissertation available via its web pages.

In certain cases, where confidentiality of information is concerned, if either the

author or the supervisor so requests, the Department will withhold the dissertation

from loan or consultation for the period specified below. Where no such restriction is

in force, the Department may also deposit the Dissertation in the University of

Sheffield Library.

To be completed by the Author – Select (a) or (b) by placing a tick in the

appropriate box

If you are willing to give permission for the Information School to make your

dissertation available in these ways, please complete the following:

X (a) Subject to the General Regulation on Intellectual Property, I, the author,

agree to this dissertation being made immediately available through the

Department and/or University Library for consultation, and for the

Department and/or Library to reproduce this dissertation in whole or part in

order to supply single copies for the purpose of research or private study

(b) Subject to the General Regulation on Intellectual Property, I, the author,

request that this dissertation be withheld from loan, consultation or

reproduction for a period of [ ] years from the date of its submission.

Subsequent to this period, I agree to this dissertation being made available

through the Department and/or University Library for consultation, and for

the Department and/or Library to reproduce this dissertation in whole or

part in order to supply single copies for the purpose of research or private

study

Name Sara Michelle Urrea Aguilera

Department Information School

Signed Sara Urrea Date 01/09/2016

To be completed by the Supervisor – Select (a) or (b) by placing a tick in the

appropriate box

(a) I, the supervisor, agree to this dissertation being made immediately

available through the Department and/or University Library for loan or

consultation, subject to any special restrictions (*) agreed with external

organisations as part of a collaborative project.

*Special

restrictions

(b) I, the supervisor, request that this dissertation be withheld from loan,

consultation or reproduction for a period of [ ] years from the date of its

submission. Subsequent to this period, I, agree to this dissertation being

made available through the Department and/or University Library for loan

or consultation, subject to any special restrictions (*) agreed with external

organisations as part of a collaborative project

Name

Department

Signed Date

THIS SHEET MUST BE SUBMITTED WITH DISSERTATIONS BY

DEPARTMENTAL REQUIREMENTS.
