Towards Designing a Light Weight Social Network Analysis Tool


Upload: visithan

Post on 09-Feb-2017


Developing a Tool for Social Network Analysis



Objectives

Develop a tool to analyze social networks by implementing optimized network algorithms.
We achieved:
  Visualization of the network data.
  Minimizing the time spent on searching.
  Determining the size of the network.
  Identifying the most central node.
  Identifying the communities.
  Discovering the shortest path between two nodes.


Networks

A network is composed of a set of vertices (nodes) and a set of lines (edges).
Nodes represent the selected units and edges represent the ties between units.
Nodes and links form a graph.
Additional data about nodes or lines are usually known as their properties or attributes, for example the name or label of a node, the weight of an edge, or a position.
Simply put, a network consists of a graph and data: Network = Graph + Data.
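The "Network = Graph + Data" idea can be sketched as one structure for the ties and one for the attributes. This is only an illustration: the class NetworkSketch, its fields, and its methods are hypothetical names, not the tool's actual code, and the ties are assumed to be undirected.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical container; all names here are assumptions for illustration.
final class NetworkSketch {

    // Graph part: node -> set of neighboring nodes.
    final Map<String, Set<String>> adjacency = new HashMap<>();

    // Data part: node -> (attribute name -> attribute value).
    final Map<String, Map<String, String>> nodeAttributes = new HashMap<>();

    // Adds an undirected tie, creating either node if it is new.
    void addEdge(String a, String b) {
        adjacency.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        adjacency.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    // Attaches a property such as a label to a node.
    void setAttribute(String node, String name, String value) {
        adjacency.computeIfAbsent(node, k -> new HashSet<>());
        nodeAttributes.computeIfAbsent(node, k -> new HashMap<>()).put(name, value);
    }
}
```

Keeping the attributes in a separate map means the graph algorithms only ever need to look at the adjacency structure.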

Social Networks

Online social networks are web-based communication services that allow people to interact socially with others.
A social network consists of a number of actors connected by some kind of relationship.
Actors can be individuals, groups, organizations, etc.
Relationships can be of any kind: financial, friendship, professional, etc.


Social Network Analysis

Online social network analysis is the study of online social structure: detecting and interpreting patterns of social relations among actors or individuals.
The analysis allows us to visualize the social network and to compute the following network metrics:
  Network density
  Degree of a node
  Network centrality
  Connected components
  Shortest path and shortest path length
  Communities

Visualization

A social network can be visualized as a graph.
A graph is a set of nodes (vertices) and a set of edges between pairs of nodes.
A vertex is the smallest unit in a network. In social network analysis, it represents an actor (an individual, an organization, or a country).
A social relation that is undirected (e.g., "is friend of", "is family of") is represented by an edge because both individuals are equally involved in the relation.
Visualization supports the discovery of people, the connections among them, and communities.

Visualization


Dataset

An example visualization using our tool

This is a visualization of a large data set. The data set contains three columns separated by spaces. The first and second columns give the two nodes that are connected by a relation. The third column gives the weight of the relation.
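A minimal Java sketch of how a three-column, space-separated edge list like the one described above might be loaded into memory. The class EdgeListLoader, the method parse, and the nested-map layout are assumptions made for illustration, not the tool's actual code. The sketch also assumes the relation is undirected and that the weight is numeric.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical loader; names and data layout are assumptions.
final class EdgeListLoader {

    // Parses lines of the form "nodeA nodeB weight" into an undirected,
    // weighted adjacency map: node -> (neighbor -> weight).
    static Map<String, Map<String, Double>> parse(List<String> lines) {
        Map<String, Map<String, Double>> graph = new TreeMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] cols = trimmed.split("\\s+");
            if (cols.length != 3) {
                throw new IllegalArgumentException("Expected 3 columns: " + line);
            }
            double weight = Double.parseDouble(cols[2]);
            // Store the tie in both directions (assumed undirected relation).
            graph.computeIfAbsent(cols[0], k -> new TreeMap<>()).put(cols[1], weight);
            graph.computeIfAbsent(cols[1], k -> new TreeMap<>()).put(cols[0], weight);
        }
        return graph;
    }
}
```

In practice the lines could come from java.nio.file.Files.readAllLines; the file path is left to the caller.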

Metrics in Social Network Analysis

  Network density
  Degree of a node
  Network centrality
  Connected components
  Shortest path and shortest path length
  Community or clusters


Density

Density is the number of lines in a simple network, expressed as a proportion of the maximum possible number of lines.
Density = number of edges / number of possible edges
Number of possible edges = number of nodes * (number of nodes - 1) / 2
A complete network has the maximum density.
Density is useful for determining the size of a social network.
Example (network with nodes 1 to 5): number of edges = 6, number of possible edges = 10, density = 6/10 = 0.6.
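The density formula can be written as a short Java helper. This is a sketch under the stated definition (simple, undirected network); the class DensitySketch and the method density are hypothetical names.

```java
// Hypothetical helper; the class and method names are assumptions.
final class DensitySketch {

    // Density of a simple undirected network: edges / (n * (n - 1) / 2).
    static double density(int nodeCount, int edgeCount) {
        if (nodeCount < 2) {
            return 0.0; // no ties are possible with fewer than two nodes
        }
        long possibleEdges = (long) nodeCount * (nodeCount - 1) / 2;
        return (double) edgeCount / possibleEdges;
    }
}
```

For the five-node example, DensitySketch.density(5, 6) evaluates 6 / 10, which gives 0.6.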

Degree

The degree of a node is the number of links incident with it.
In a social network, degree indicates the popularity of a vertex.


Network Centrality

Centrality identifies the central (popular) nodes in a network.
Several approaches exist for measuring centrality, such as degree centrality, betweenness centrality, and closeness centrality.
Degree centrality measures the number of direct connections that an individual node has to other nodes within a network.
Degree centrality = degree of a node / maximum possible degree
Betweenness centrality measures how many times a node occurs on a shortest path between other nodes; it is a measure of social brokerage power.
In a social network, central nodes have better access to information and better opportunities to spread information.
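A minimal Java sketch of the degree centrality ratio. It assumes a simple undirected network in which every node appears as a key of the adjacency map, so the maximum possible degree is the number of nodes minus one. The class DegreeCentralitySketch and its method are hypothetical names, not the tool's code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; names are assumptions for illustration.
final class DegreeCentralitySketch {

    // Degree centrality of every node: degree / (n - 1), where n - 1 is the
    // maximum possible degree in a simple network with n nodes.
    static Map<String, Double> degreeCentrality(Map<String, Set<String>> adjacency) {
        Map<String, Double> result = new HashMap<>();
        int maxDegree = adjacency.size() - 1;
        for (Map.Entry<String, Set<String>> entry : adjacency.entrySet()) {
            int degree = entry.getValue().size(); // number of incident links
            double centrality = maxDegree > 0 ? (double) degree / maxDegree : 0.0;
            result.put(entry.getKey(), centrality);
        }
        return result;
    }
}
```

Each node is visited once, so the cost is linear in the number of nodes when the neighbor sets report their size in constant time.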

Degree centralization

Visualization of a network
Degree centrality of each node

Betweenness centralization

Visualization of a network
Betweenness centrality of each node
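The slides do not say which algorithm the tool uses for betweenness centrality. As one possible approach, the sketch below uses Brandes' algorithm on an unweighted, undirected network, ignoring edge weights. The class BetweennessSketch is a hypothetical name, and the adjacency map is assumed to be symmetric with every node present as a key. Scores are raw counts, not normalized.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical helper: Brandes' algorithm, unweighted and undirected.
final class BetweennessSketch {

    static Map<String, Double> betweenness(Map<String, Set<String>> adjacency) {
        Map<String, Double> score = new HashMap<>();
        for (String v : adjacency.keySet()) {
            score.put(v, 0.0);
        }
        for (String source : adjacency.keySet()) {
            Deque<String> visitOrder = new ArrayDeque<>();     // stack of nodes in BFS order
            Map<String, List<String>> preds = new HashMap<>(); // predecessors on shortest paths
            Map<String, Double> pathCount = new HashMap<>();   // number of shortest paths
            Map<String, Integer> dist = new HashMap<>();       // hop distance from source
            Map<String, Double> dependency = new HashMap<>();
            for (String v : adjacency.keySet()) {
                preds.put(v, new ArrayList<>());
                pathCount.put(v, 0.0);
                dependency.put(v, 0.0);
            }
            pathCount.put(source, 1.0);
            dist.put(source, 0);
            Queue<String> queue = new ArrayDeque<>();
            queue.add(source);
            // Breadth-first search that counts shortest paths.
            while (!queue.isEmpty()) {
                String v = queue.remove();
                visitOrder.push(v);
                for (String w : adjacency.get(v)) {
                    if (!dist.containsKey(w)) {
                        dist.put(w, dist.get(v) + 1);
                        queue.add(w);
                    }
                    if (dist.get(w) == dist.get(v) + 1) {
                        pathCount.put(w, pathCount.get(w) + pathCount.get(v));
                        preds.get(w).add(v);
                    }
                }
            }
            // Accumulate dependencies from the farthest nodes back to the source.
            while (!visitOrder.isEmpty()) {
                String w = visitOrder.pop();
                for (String v : preds.get(w)) {
                    double share = pathCount.get(v) / pathCount.get(w) * (1.0 + dependency.get(w));
                    dependency.put(v, dependency.get(v) + share);
                }
                if (!w.equals(source)) {
                    score.put(w, score.get(w) + dependency.get(w));
                }
            }
        }
        // Each unordered pair was counted from both endpoints, so halve the totals.
        for (Map.Entry<String, Double> e : score.entrySet()) {
            e.setValue(e.getValue() / 2.0);
        }
        return score;
    }
}
```

On a three-node path a, b, c, only the pair (a, c) has a shortest path through another node, so b scores 1 and the end nodes score 0.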

Connected Components

Identifies the distinct components of the social network.
Each node belongs to exactly one component.
Two nodes are in the same component if there is a path from one node to the other.
Number of connected components = 13
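One common way to find connected components is a breadth-first search started from every node that has not been visited yet. The sketch below shows this approach; it is not necessarily the tool's own algorithm, and the class ComponentsSketch is a hypothetical name. The network is assumed to be undirected.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; names are assumptions for illustration.
final class ComponentsSketch {

    // Returns one set of nodes per connected component.
    static List<Set<String>> connectedComponents(Map<String, Set<String>> adjacency) {
        List<Set<String>> components = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        for (String start : adjacency.keySet()) {
            if (!visited.add(start)) {
                continue; // already assigned to an earlier component
            }
            Set<String> component = new TreeSet<>();
            Queue<String> queue = new ArrayDeque<>();
            queue.add(start);
            while (!queue.isEmpty()) {
                String node = queue.remove();
                component.add(node);
                for (String neighbor : adjacency.getOrDefault(node, new HashSet<>())) {
                    if (visited.add(neighbor)) {
                        queue.add(neighbor); // first time this node is reached
                    }
                }
            }
            components.add(component);
        }
        return components;
    }
}
```

The size of the returned list is the number of connected components.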

Shortest Path and Path Length

A shortest path between two nodes is a path of minimal length between them.
Path length: the distance between a pair of nodes in the network.

Shortest Path

An example of the shortest path from node 9 to node 5.
Our tool displays the shortest path from the source node to all other nodes that are in the same component.
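A single-source shortest path search of this kind can be sketched with a breadth-first search. The sketch assumes path length is measured in number of links (edge weights are ignored), which the slides do not confirm. The class ShortestPathSketch and its methods are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical helper: BFS shortest paths measured in hops.
final class ShortestPathSketch {

    // For every node reachable from source, records the node that precedes
    // it on one shortest path. The source maps to null.
    static Map<String, String> bfsParents(Map<String, Set<String>> adjacency, String source) {
        Map<String, String> parent = new HashMap<>();
        parent.put(source, null);
        Queue<String> queue = new ArrayDeque<>();
        queue.add(source);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            for (String neighbor : adjacency.getOrDefault(node, Collections.emptySet())) {
                if (!parent.containsKey(neighbor)) {
                    parent.put(neighbor, node);
                    queue.add(neighbor);
                }
            }
        }
        return parent;
    }

    // Rebuilds the path from the source to target; empty if target is in
    // a different component.
    static List<String> pathTo(Map<String, String> parent, String target) {
        List<String> path = new ArrayList<>();
        if (!parent.containsKey(target)) {
            return path;
        }
        for (String node = target; node != null; node = parent.get(node)) {
            path.add(node);
        }
        Collections.reverse(path);
        return path;
    }
}
```

One call to bfsParents covers every node in the source's component, so paths to many targets can be read off without searching again.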

Community or Cluster

A community within a network is a densely connected group of nodes.
We have implemented two methods for detecting communities:
  Weak link removal method
  Broker node removal method
The weak link removal method removes every link whose weight is below the threshold value.
The threshold value should lie between the minimum and maximum weights in the data set.
The broker node removal method removes the node that has the maximum betweenness centrality value.

Weak link removal method

Figure (a): Initial network
Figure (b): After removing weak links. Three communities appear. The threshold value is 74.
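A minimal Java sketch of weak link removal: links with a weight below the threshold are ignored, and the groups that remain connected are reported as communities. The class WeakLinkRemovalSketch and its method are hypothetical names, and the weighted adjacency map (node to neighbor to weight) is an assumed representation, not the tool's actual one.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; names and data layout are assumptions.
final class WeakLinkRemovalSketch {

    // Groups nodes that stay connected once weak links are dropped.
    static List<Set<String>> communities(Map<String, Map<String, Double>> graph, double threshold) {
        List<Set<String>> result = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        for (String start : graph.keySet()) {
            if (!visited.add(start)) {
                continue; // already placed in a community
            }
            Set<String> community = new TreeSet<>();
            Queue<String> queue = new ArrayDeque<>();
            queue.add(start);
            while (!queue.isEmpty()) {
                String node = queue.remove();
                community.add(node);
                Map<String, Double> ties = graph.getOrDefault(node, Collections.emptyMap());
                for (Map.Entry<String, Double> tie : ties.entrySet()) {
                    // Skipping a tie below the threshold is equivalent to
                    // removing that weak link from the network.
                    if (tie.getValue() >= threshold && visited.add(tie.getKey())) {
                        queue.add(tie.getKey());
                    }
                }
            }
            result.add(community);
        }
        return result;
    }
}
```

Raising the threshold removes more links and tends to split the network into more, smaller communities.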

Broker nodes removal method

Initial network
After removing broker nodes. Five communities appear.
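The removal step of the broker node method can be sketched as follows, assuming the betweenness centrality scores have already been computed and are passed in as a map. The class BrokerRemovalSketch and its method are hypothetical names. How the tool breaks ties, and how many brokers it removes per run, is not stated in the slides; this sketch removes a single node.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; names are assumptions for illustration.
final class BrokerRemovalSketch {

    // Returns a copy of the network without the node that has the highest
    // betweenness score. The input maps are left unchanged.
    static Map<String, Set<String>> removeTopBroker(Map<String, Set<String>> adjacency,
                                                    Map<String, Double> betweenness) {
        String broker = null;
        for (String node : adjacency.keySet()) {
            // On equal scores the node seen first in iteration order is kept.
            if (broker == null
                    || betweenness.getOrDefault(node, 0.0) > betweenness.getOrDefault(broker, 0.0)) {
                broker = node;
            }
        }
        Map<String, Set<String>> pruned = new HashMap<>();
        for (Map.Entry<String, Set<String>> entry : adjacency.entrySet()) {
            if (entry.getKey().equals(broker)) {
                continue; // drop the broker itself
            }
            Set<String> neighbors = new HashSet<>(entry.getValue());
            neighbors.remove(broker); // drop every tie that led to the broker
            pruned.put(entry.getKey(), neighbors);
        }
        return pruned;
    }
}
```

The communities are then the connected components of the pruned network, and the step can be repeated to remove further brokers.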

Implementation

We implemented algorithms to:
  find the connected components in a network
  find the degree centrality of a node
  find the betweenness centrality of a node
  find the communities (weak link removal algorithm and broker node removal algorithm)
  find the shortest path between two nodes

Implementation

We use the Java Collections Framework to store the data at run time, in particular:
  Map
  List
  Set
  Queue
The collections are used to store, retrieve, and manipulate the aggregated data.

Performance Comparison

We compare the performance of our tool with NetworkX.
NetworkX (NX) is a rich, integrated tool set for graph creation, manipulation, analysis, and visualization.
We compare the following properties between our tool and NetworkX:
  Visualization speed
  Degree centrality
  Betweenness centrality

Comparison of Visualization Speed

Comparison of Degree Centrality

Comparison of Betweenness Centrality

Conclusion

In our tool, many algorithms were developed to measure the metrics of social network analysis, such as the degree of a node, the neighbors of a node, degree centrality, and weak link removal based community detection.
We carried out a comparative study of the performance of our tool against the well-known tool NetworkX.
Our results show that our tool has a good performance lead on large data sets and that our degree centrality algorithm performs better than NetworkX's algorithm.
The visualization speed of this tool is much higher than that of NetworkX.

Thank you