social network analysis



© 2013, IJOURNALS All Rights Reserved

Page 116

Social Network Analysis – A Survey on Privacy of Personal Data

Author: Dushyant Tanna

(Department of Mathematics, Marwadi Engineering College, India)

Abstract

This paper discusses the usage of social networking (SN) sites such as Facebook, Twitter and many others. Social networking sites have changed the way people share information publicly with the help of digital technology. People formed social networks long before digital technology existed, but the extensive use of social networking sites is a recent phenomenon. People create profiles on these sites to share private information, updates on their social life and personal emotions with a limited or wide audience, which enables the creation of interconnected networks and groups. The main services offered by social networking sites include chatting, messaging, email, file sharing, video calling, voice chat, blogging and discussion groups. In order of their release, some of the social networking sites are sixdegrees.com, LiveJournal, BlackPlanet, Cyworld, Friendster, LinkedIn, MySpace, Hi5, Orkut, Flickr, Facebook (Harvard), Yahoo! 360°, YouTube, Facebook (corporate), Windows Live Spaces, Twitter and Facebook (everyone). There are many others, some of which were designed only as part of a marketing strategy for the re-launch of certain brands. Since most of the software behind digital communities is free for the end user, every user of a site can modify his or her own content.

Keywords: Social Network Analysis, Node, Ego

1. INTRODUCTION

Social network analysis is a tool developed during the 1950s and 1970s by researchers and sociologists in social psychology. It rests on the basic premise that relationships among nodes are important. Borgatti and Foster [1] (2003) have shown that the literature on social network research has grown exponentially. This leads to the point that everyone should be fully aware of the proper way of using SN sites. The number of users of SN sites is increasing, and the largest group consists of teenagers, who are the least aware of the security and privacy of SN sites. There is therefore a great need to make them aware of the threats of SN sites and of how the information posted there can be misused. The number of users has grown noticeably since 1997. Thanks to the efforts of many researchers, the pace of life has increased and everyone can use SN sites to post their feelings publicly and share them with many people at a single click within a fraction of a second. Sometimes people feel so lonely that simply sharing some personal thoughts relaxes them. Chatting with an unknown person can even be a pleasure. Making friends has become very easy for the current generation.

Methodology: Questionnaire, Interviews, Chi-Square Test

2. DEFINITIONS

2.1. Node: It represents an individual actor in the network.

2.2. Ego: It is an individual focal node. In a network, the number of egos is the same as the number of nodes. An ego can be a person, a group or an entire society.

2.3. Alters: In ego-centric networks, the alters identified as connected to one ego are generally a set that is unconnected with the alters of every other ego. Alters are useful because, for example, if one can identify each of the alters connected to an ego by some relation, one can form a visual picture of the networks of social associations.

3. SOME OF THE METHODS OF STUDYING SOCIAL NETWORK ANALYSIS

3.1 Full network method: In this method, one is required to collect each actor's ties with all other actors. In essence, this approach takes a census of ties in a population of actors rather than a sample. For example, we could collect data on shipments of steel between all pairs of nation states in the world system from International Monetary Fund records; we could examine the boards of directors of all public corporations for overlapping directors; we could count the number of vehicles moving between all pairs of cities; we could look at the flows of e-mail between all pairs of employees in a company; we could ask each child in a play group to identify their friends. Full network data is necessary to properly define and measure many of the structural concepts of network analysis.
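As an illustration of what a census of ties looks like as data, the following minimal sketch builds a square actor-by-actor matrix from a list of directed ties. The play group, the children's names and the helper `adjacency_matrix` are hypothetical and are used only for illustration.

```python
def adjacency_matrix(actors, ties):
    """Build a square actor-by-actor matrix from a census of directed ties."""
    index = {actor: position for position, actor in enumerate(actors)}
    matrix = [[0] * len(actors) for _ in actors]
    for source, target in ties:
        # A 1 in row `source`, column `target` means source named target.
        matrix[index[source]][index[target]] = 1
    return matrix


# Hypothetical play group: every child is asked to identify their friends.
children = ["Ana", "Ben", "Cal"]
friend_nominations = [("Ana", "Ben"), ("Ben", "Ana"), ("Cal", "Ana")]

matrix = adjacency_matrix(children, friend_nominations)
for child, row in zip(children, matrix):
    print(child, row)
```

Because every actor appears both as a row and as a column, the matrix records the presence or absence of a tie for every ordered pair, which is what distinguishes full network data from a sample.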

3.2 Snowball method This method begins with a focal actor or set of actors. Each of these actors is asked to name some or all of their ties to other actors. Then, all the actors named (who were not part of the original list) are tracked down and asked for some or all of their ties. The process continues until no new actors are identified, or until we decide to stop (usually for reasons of time and resources, or because the new actors being named are very marginal to the group we are trying to study). The snowball method can be particularly helpful for tracking down "special" populations (often numerically small sub-sets of people mixed in with large numbers of others). Business contact networks, community elites, deviant sub-cultures, avid stamp collectors, kinship networks, and many other structures can be pretty effectively located and described by snowball methods. It is sometimes not as difficult to achieve closure in snowball "samples" as one might think. The limitations on the numbers of strong ties that most actors have, and the tendency for ties to be reciprocated often make it fairly easy to find the boundaries.
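The snowball procedure can be sketched as a wave-by-wave traversal. In this minimal sketch the function `snowball_sample`, the `max_waves` stopping rule and the contact data are assumptions made for illustration; in a real study the `name_ties` step would be an interview or questionnaire rather than a dictionary lookup.

```python
from collections import deque


def snowball_sample(seeds, name_ties, max_waves=None):
    """Follow named ties outward from the seed actors, wave by wave."""
    found = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        actor, wave = queue.popleft()
        if max_waves is not None and wave >= max_waves:
            continue  # stopping rule: actors in the last wave are not asked
        for alter in name_ties(actor):
            if alter not in found:  # only newly named actors are tracked down
                found.add(alter)
                queue.append((alter, wave + 1))
    return found


# Hypothetical contact lists; E and F are never named by anyone reachable from A.
contacts = {
    "A": ["B", "C"], "B": ["A", "D"], "C": ["A"],
    "D": ["B"], "E": ["F"], "F": ["E"],
}

reached = snowball_sample(["A"], lambda actor: contacts.get(actor, []))
one_wave = snowball_sample(["A"], lambda actor: contacts.get(actor, []), max_waves=1)
print(sorted(reached), sorted(one_wave))
```

The first call stops on its own because no new actors are named, which corresponds to reaching closure; the second call stops because of the resource limit. Actors who are not tied to the starting set, such as E and F here, are never found.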

3.3 Ego-Centric Network (with alter connections): In many cases it will not be possible (or necessary) to track down the full networks beginning with focal nodes (as in the snowball method). An alternative approach is to begin with a selection of focal nodes (egos), and identify the nodes to which they are connected. Then, we determine which of the nodes identified in the first stage are connected to one another. This can be done by contacting each of the nodes; sometimes we can ask ego to report which of the nodes that it is tied to are tied to one another.

This kind of approach can be quite effective for collecting a form of relational data from very large populations, and can be combined with attribute-based approaches. For example, we might take a simple random sample of male college students and ask them to report who are their close friends, and which of these friends know one another. This kind of approach can give us a good and reliable picture of the kinds of networks (or at least the local neighborhoods) in which individuals are embedded. We can find out such things as how many connections nodes have, and the extent to which these nodes are close-knit groups. Such data can be very useful in helping to understand the opportunities and constraints that ego has as a result of the way they are embedded in their networks.
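One way to quantify how close-knit an ego's neighborhood is can be sketched as the share of possible alter-to-alter ties that are actually present. The measure, the function name `ego_network_density` and the example ties are assumptions for illustration and are not taken from the survey data.

```python
from itertools import combinations


def ego_network_density(ego, ties):
    """Return ego's alters and the share of possible alter-alter ties present.

    `ties` is a set of frozenset pairs, i.e. undirected ties.
    """
    alters = {other for tie in ties if ego in tie for other in tie if other != ego}
    possible = list(combinations(sorted(alters), 2))
    if not possible:
        return alters, 0.0  # fewer than two alters: no alter-alter tie possible
    present = sum(1 for a, b in possible if frozenset((a, b)) in ties)
    return alters, present / len(possible)


# Hypothetical report: ego names A, B and C; A and B know each other.
ties = {frozenset(pair) for pair in [
    ("ego", "A"), ("ego", "B"), ("ego", "C"), ("A", "B"), ("C", "D"),
]}

alters, density = ego_network_density("ego", ties)
print(sorted(alters), round(density, 2))
```

Note that D is tied to C but not to ego, so D is not counted as one of ego's alters.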

3.4 Ego-Centric Network (ego only): Ego-centric methods focus on the individual rather than on the network as a whole. By collecting information on the connections among the actors connected to each focal ego, we can still get a fairly good picture of the "local" networks or "neighborhoods" of individuals. Such information is useful for understanding how networks affect individuals, and it also gives an (incomplete) picture of the general texture of the network as a whole.

Suppose, however, that we only obtained information on ego's connections to alters, but not information on the connections among those alters. Data like these are not really "network" data at all. That is, they cannot be represented as a square actor-by-actor array of ties. But this does not mean that ego-centric data without connections among the alters are of no value for analysts seeking to take a structural or network approach to understanding actors. We can know, for example, that some actors have many close friends and kin, and others have few. Knowing this, we are able to understand something about the differences in the actors' places in the social structure, and make some predictions about how these locations constrain their behavior. What we cannot know from ego-centric data with any certainty is the nature of the macro-structure or the whole network.

In ego-centric networks, the alters identified as connected to each ego are probably a set that is unconnected with those for each other ego. While we cannot assess the overall density or connectedness of the population, we can sometimes be a bit more general. If we have some good theoretical reason to think about alters in terms of their social roles, rather than as individual occupants of social roles, ego-centered networks can tell us a good bit about local social structures. For example, if we identify each of the alters connected to an ego by a friendship relation as "kin," "co-worker," "member of the same church," etc., we can build up a picture of the networks of social positions (rather than the networks of individuals) in which egos are embedded. Such an approach, of course, assumes that such categories as "kin" are real and meaningful determinants of patterns of interaction.
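The role-based reading of ego-only data described above can be sketched as a simple tally of alters by social role for each ego. The egos, alters, role labels and the helper `role_composition` below are hypothetical.

```python
from collections import Counter


def role_composition(ego_reports):
    """Count each ego's alters by social role (no alter-alter ties needed)."""
    return {
        ego: Counter(role for _alter, role in alters)
        for ego, alters in ego_reports.items()
    }


# Hypothetical ego-only reports: (alter, role of the alter relative to ego).
reports = {
    "ego1": [("a1", "kin"), ("a2", "kin"), ("a3", "co-worker")],
    "ego2": [("b1", "co-worker")],
}

composition = role_composition(reports)
for ego, roles in composition.items():
    print(ego, dict(roles))
```

The result describes networks of social positions rather than networks of individuals, so it is only as meaningful as the role categories themselves.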

4. DATA ON THE POPULARITY OF SN SITES AND THEIR NUMBER OF USERS

The list below gives the popularity of some well-known social networking sites in terms of their estimated number of users. The numbers suggest that social networking sites are gaining a lot of popularity these days.


1 Facebook: 800,000,000 - Estimated Unique Monthly Visitors

2 Twitter: 250,000,000 - Estimated Unique Monthly Visitors

3 LinkedIn: 200,000,000 - Estimated Unique Monthly Visitors

4 Pinterest: 120,000,000 - Estimated Unique Monthly Visitors

5 MySpace: 70,500,000 - Estimated Unique Monthly Visitors

6 Google Plus: 65,000,000 - Estimated Unique Monthly Visitors

7 Instagram: 50,000,000 - Estimated Unique Monthly Visitors

8 DeviantArt: 25,500,000 - Estimated Unique Monthly Visitors

9 LiveJournal: 20,500,000 - Estimated Unique Monthly Visitors

10 Tagged: 19,500,000 - Estimated Unique Monthly Visitors

11 Orkut: 17,500,000 - Estimated Unique Monthly Visitors

12 CafeMom: 12,500,000 - Estimated Unique Monthly Visitors


13 Ning: 12,000,000 - Estimated Unique Monthly Visitors

14 Meetup: 7,500,000 - Estimated Unique Monthly Visitors

15 myLife: 5,400,000 - Estimated Unique Monthly Visitors

5. SOME OF THE FACTS PRESENTED WITH SIMPLE PERCENTAGE TABLES

5.1 Time spent during leisure time: The following table summarises how respondents spend their leisure time. People were asked to categorise the time they spend during a day among their five most preferred activities.

Table 1: Time Spent during leisure time

Time spent during leisure time | Frequency | Percentage
FB, Twitter / SN site | 20 | 40
Reading a book | 17 | 34
Referring to blogs | 6 | 12
Hang out | 5 | 10
To worship place | 2 | 4
Total | 50 | 100

Figure 1: Leisure time spent

So it is evident that 52% of the respondents prefer to spend their spare time on Facebook, Twitter or blogs.

5.2 Website mostly visited

Table 2: Website Visited

Website mostly visited | Frequency | Percentage
Facebook | 24 | 48
Gmail, Yahoo Mail, etc. (email sites) | 9 | 18
YouTube | 2 | 4
Educational sites | 8 | 16
Others | 7 | 14
Total | 50 | 100


Figure 2: Website Visited

It is evident that 48% of the respondents visit Facebook most often.

5.3 Frequency of Sharing Status or thoughts on SNS

Table 3: Frequency of Sharing

Frequency of sharing status or thoughts on SNS | Frequency | Percentage
Daily 4 times or more | 8 | 16
Daily 2 times - 4 times | 20 | 40
Daily once or twice | 10 | 20
Weekly | 7 | 14
Fortnightly or monthly | 5 | 10
Total | 50 | 100

Figure 3: Frequency of Sharing

It is evident that 56% of the respondents share their thoughts on SNS two or more times a day.

5.4 Preference of Activity

Table 4: Preference of Activity on SNS

Preference of activity in SN sites | Frequency | Percentage
Download/Upload photos | 11 | 22
Chatting | 14 | 28
Browsing for unknown person's data | 2 | 4
Important Work / Reading | 15 | 30
Others | 8 | 16
Total | 50 | 100


Figure 4: Preference of Activity on SNS

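54% of the respondents prefer to pass their time on SN sites in non-useful activities (downloading or uploading photos, chatting, and browsing for unknown persons' data).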

5.5 Membership in Academic Groups

Table 5: Membership in groups

Groups related to academics | Frequency | Percentage
Less than 10 | 40 | 80
Between 10 to 30 | 7 | 14
Between 30 to 40 | 2 | 4
Between 40 to 50 | 0 | 0
More than 50 | 1 | 2
Total | 50 | 100

Figure 5: Membership in groups related to academics

It is evident that 80% of the respondents hold membership in fewer than 10 education-related groups.

6. RESEARCH PROBLEM

To find out how teenagers use social networking sites and to analyze whether they balance this usage with their study time.

6.1 RESEARCH METHODOLOGY

6.1.1 Population Size (N)

The totality of elements of the universe from which the sample is selected for the purpose of study is known as the population. The population here includes teenagers of all ages and people up to age 40. The population size here is 200.

6.1.2 Sample Size (n)

All the items considered in any field of enquiry constitute a universe or population. In this research only a few items can be selected from the population for the purpose of study. The items selected constitute what is technically called a sample. Here the sample size is 50, drawn from the total population, to conduct the study.

6.2 DATA COLLECTION

The data source: Primary and secondary

6.2.1 The research approach: Survey method

6.2.2 The research instrument: Questionnaire Method (Primary source)

6.2.3 Secondary sources: journals, magazines, articles

6.2.4 The respondents: teenagers of all ages and people up to age 40

6.3 TOOLS OF ANALYSIS

6.3.1 Simple Percentage Analysis

Here simple percentage analysis is used to calculate the percentage of SN site usage among the total respondents.
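As a minimal sketch of this calculation, the snippet below recomputes the percentages of Table 1 from its frequencies. The helper name `percentage_table` is hypothetical; the frequencies are those reported in Table 1.

```python
def percentage_table(frequencies):
    """Convert raw frequencies into percentages of the total respondents."""
    total = sum(frequencies.values())
    return {label: 100 * count / total for label, count in frequencies.items()}


# Frequencies from Table 1 (50 respondents).
leisure = {
    "FB, Twitter / SN site": 20,
    "Reading a book": 17,
    "Referring to blogs": 6,
    "Hang out": 5,
    "To worship place": 2,
}

shares = percentage_table(leisure)
# Share of respondents who prefer SN sites or blogs, as discussed under Table 1.
online_share = shares["FB, Twitter / SN site"] + shares["Referring to blogs"]
print(shares, online_share)
```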

6.3.2 Chi-Square Test

The chi-square test is applied to test goodness of fit, that is, to compare the distribution of the observed data with an assumed theoretical distribution. It is therefore a measure of the divergence between actual and expected frequencies. Karl Pearson [] developed this method to test the difference between the theoretical value and the observed value.

Chi-square statistic: $\chi^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$

Degrees of freedom: $df = (R-1)(C-1)$

where

$O_{ij}$ = observed frequency

$E_{ij}$ = expected frequency

$R$ = number of rows

$C$ = number of columns

For every chi-square test, the table value has been taken at the 5% significance level.

6.3.3 DATA ANALYSIS WITH CHI-SQUARE TEST

TEST: A chi-square test is conducted to find out the relationship between the age group and the usage of social networking sites.

HYPOTHESIS

Ho (Null): There is no significant relationship between the age group and the usage of social networking sites.

Ha (Alternate): There is a significant relationship between the age group and the usage of social networking sites.

Calculation of Observed Values

Time Spent | SN sites | Academic Act | Total
> 5 | 6 | 2 | 8
4 to 5 | 4 | 2 | 6
3 to 4 | 8 | 3 | 11
2 to 3 | 8 | 13 | 21
< 2 | 24 | 30 | 54
Total | 50 | 50 | 100

Calculation of Expected Values

Formula for expected value = (Row total x Column total) / (Grand total)

Time Spent | SN sites | Academic Act
> 5 | 4 | 4
4 to 5 | 3 | 3
3 to 4 | 5.5 | 5.5
2 to 3 | 10.5 | 10.5
< 2 | 27 | 27
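The expected values can be reproduced from the observed counts with the formula above. The following is a minimal sketch; the function name `expected_values` and the dictionary layout are assumptions, while the counts are the observed values reported in this section.

```python
# Observed counts per time-spent category: (SN sites, Academic Act).
observed = {
    "> 5": (6, 2),
    "4 to 5": (4, 2),
    "3 to 4": (8, 3),
    "2 to 3": (8, 13),
    "< 2": (24, 30),
}


def expected_values(observed):
    """Expected count per cell = row total * column total / grand total."""
    column_totals = [sum(column) for column in zip(*observed.values())]
    grand_total = sum(column_totals)
    return {
        row: tuple(sum(counts) * column_total / grand_total
                   for column_total in column_totals)
        for row, counts in observed.items()
    }


expected = expected_values(observed)
for row, values in expected.items():
    print(row, values)
```

Because both column totals equal 50 out of 100, each expected value is simply half of its row total.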

Chi-Square Test

Time Spent | Type | Oij | Eij | (Oij - Eij)^2 | (Oij - Eij)^2 / Eij
> 5 | SN | 6 | 4 | 4 | 1
> 5 | Acad | 2 | 4 | 4 | 1
4 to 5 | SN | 4 | 3 | 1 | 0.33
4 to 5 | Acad | 2 | 3 | 1 | 0.33
3 to 4 | SN | 8 | 5.5 | 6.25 | 1.14
3 to 4 | Acad | 3 | 5.5 | 6.25 | 1.14
2 to 3 | SN | 8 | 10.5 | 6.25 | 0.60
2 to 3 | Acad | 13 | 10.5 | 6.25 | 0.60
< 2 | SN | 24 | 27 | 9 | 0.33
< 2 | Acad | 30 | 27 | 9 | 0.33
Total | | | | | 6.80

Degrees of freedom (r-1)*(c-1): 4

Table value at 5% significance: 9.488

Calculated value: 6.797


CALCULATIONS

Degrees of freedom (df) = (R-1)(C-1) = (5-1)(2-1) = 4, at the 5% significance level

Table value (X2tab) = 9.488

Calculated value (X2cal) = 6.797

Since the calculated value is less than the table value, the null hypothesis is not rejected.
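The decision can be checked with a short computation. This is a minimal sketch using the observed and expected frequencies of the chi-square table; the variable names are hypothetical, and the degrees of freedom are computed on the assumption that the table has 5 rows (time-spent categories) and 2 columns (SN sites, academic activity).

```python
# Observed and expected frequencies, cell by cell, in the order of the
# chi-square table (SN then Acad for each time-spent category).
observed = [6, 2, 4, 2, 8, 3, 8, 13, 24, 30]
expected = [4, 4, 3, 3, 5.5, 5.5, 10.5, 10.5, 27, 27]

# Chi-square statistic: sum over all cells of (O - E)^2 / E.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Assumption: 5 rows and 2 columns, giving (5 - 1) * (2 - 1) = 4.
degrees_of_freedom = (5 - 1) * (2 - 1)

# Table value quoted in the text for df = 4 at the 5% significance level.
critical_value = 9.488

reject_null = chi_square > critical_value
print(round(chi_square, 3), degrees_of_freedom, reject_null)
```

The sum of the per-cell contributions agrees with the total of the chi-square table, and it stays below the table value, so the null hypothesis is not rejected.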

6.3.4 CONCLUSION There is no significant relationship between the age group and the usage of Social Networking Sites

REFERENCES

[1] Borgatti, S.P. & Foster, P.C. 2003. The network paradigm in organizational research: A review and typology. Journal of Management, 29: 991-1013

[2] Belson, William (1981). The design and understanding of research questions. Hants, England: Garner Publishing.

[3] Hanneman, Robert A. and Mark Riddle. 2005. Introduction to social network methods. Riverside, CA: University of California, Riverside

[4] Bradburn, N., Sudman, S., & Wansink, B. (2004). Asking questions: the definitive guide to questionnaire design. San Francisco: Jossey-Bass.

[5] Waller J.L., Johnson M. H., (2013), Chi-Square and T-Tests Using SAS®: Performance and Interpretation, Georgia Regents University, Augusta, Georgia, SAS Global Forum 2013, Paper 430-2013

[6] Moore, D. S., 2010. The Basic Practice of Statistics. Fifth edition. W. H. Freeman and Company, New York, NY, USA

[7] Remenyi D., Onofrei G., English J., (2009), An Introduction to Statistics using Microsoft Excel, Academic Publishing Limited, UK

[8] R.A. Fisher and F. Yates, Statistical Tables for Biological Agricultural and Medical Research, 6th ed., Table IV, Oliver & Boyd, Ltd., Edinburgh