an efficient privacy management system in online social
TRANSCRIPT
An Efficient Privacy Management System in
Online Social Networks
T. Sureshkumar
(ASP/IT) Department of Information Technology
Nandha College of Technology
Erode, India.
K. Ganesh, K. Manikandan,
S. Suryapprakash, V. Renugapriya
(UG Scholar) Department of Information Technology
Nandha College of Technology
Erode, India
Abstract: Online social networks (OSNs) have experienced
tremendous growth in recent years and become a defect portal for
hundreds of millions of Internet users. These OSNs offer attractive
means for digital social interactions and information sharing, but
also raise a number of security and privacy issues. While OSNs
allow users to restrict access to shared data, they currently do not
provide any mechanism to enforce privacy concerns over data
associated with multiple users on.
Keywords: Individual security, Osn (Online Social Network),
UCB (Upper Confidence Bound)
I. INTRODUCTION
Novel Topic-Sensitive Influencer Mining (TSIM)
framework in interest-based social media networks. TSIM
aims to find topical influential users and images. The
influence estimation is determined with a hyper graph
learning approach. In the hyper graph, the vertices represent
users and images, and the hyper edges are utilized to capture
multitier relations including visual-textual content relations
among images, and social links between users and images.
Algorithm wise, TSIM first learns the topic distribution by
leveraging user-contributed images, and then infers the
influence strength under different topics for each node in the
hyper graph. We pursue a systematic solution to facilitate
collaborative management of shared data in OSNs. We
begin by examining how the lack of Multi Party Access
Control (MPAC) for data sharing in OSNs can undermine
typical data sharing the protection of user data. Some
patterns with respect to multiparty authorization in OSNs
identified. Based on these sharing patterns, an the core
features of are also MPAC model is formulated to capture
multiparty authorization requirements that have not been
accommodated so far by existing access control systems Our
control and models for OSNs. Model also contains a
multiparty policy specification scheme. Meanwhile, since
conflicts are inevitable in multi-party authorization
enforcement, a voting mechanism is further provided to deal
with authorization and privacy conflicts in our model.
II. LITERATURE SURVEY
(i) Hypergraph Learning With Hyper edge Expansion
Many tasks require clustering in a graph where each edge
represents a similarity relation. Often, it is a co-occurrence
relation that involves more than two items, such as the co-
citation and co-purchase relations. The co-occurrence
relation can be represented by a hyperedge that connects
two or more vertices in a hyper graph. Therefore, hyperedge
relations are often transformed into another graph that is
easier to handle. For classification and clustering tasks, the
hyperedge are usually transformed into cliques of edges.
This category of techniques includes clique expansion, star
expansion.With a vertex expansion, evaluating the goodness
of clustering is done on the induced graph. For example, in a
hyperedge of k vertices, a cut that separates the hyperedge
into 1 and k − 1 vertices would cut k − 1 pairwise edges,
while a cut that splits the vertices in two equal halves would
have k 2/4 cut edges. Thus the vertex expansion would
prefer an unbalanced clustering. To mitigate the problem of
unbalanced clustering, it is proposed in star expansion and
NHC to use the cluster volume as a normalizer for balancing
the cluster sizes. But such normalization cannot completely
eliminate the problem. We present the following example of
vertex embedding to explain why the problem still exists.
By computing the eigenvectors of the normalized Laplacian
LN HC of the induced graph, it is possible to project the
vertices into a Euclidian space, which is called embedding
in spectral graph learning. On the left side of Figure 1, we
show the 1-dimensional vertex embedding of NHC by the
eigenvector corresponding to the second smallest eigenvalue
of LN HC. It is worth to focus on the vertices that belong to
both hyperedges (the overlapping part). Although the
hyperedges have the same weight and the cluster volume
normalizer is applied, the overlapping part is still biased to
the side with less vertices (in this case e2 side). This means
that the optimal clustering of two clusters should assign the
overlapping part and other vertices in e2 to one cluster. Such
bias might be a problem when the hyperedge sizes are
unbalanced, e.g. co-citation relations with a lot or a few
citations. Moreover, the behavior of the artificial
normalization (or “correction”) could be undesirable when
many hyperedges intersect with each other, because the cost
of the clustering would depend on how a hyperedge is split
into the clusters. An even split would introduce a different
cost compared to an uneven split. As any hyperedge that is
not entirely within the same cluster represents a relation that
is violated by the clustering, it would be natural to have the
learning result independent of the hyperedge sizes and only
depend on the hyperedge connectivity and hyperedge
weights. A presented a new transformation called hyperedge
expansion (HE) based on a network flow technique so that
the learning result is invariant to the distribution of vertices
among hyperedges. HE expansion is first carried out on the
hyperedge level. Then the learning results on hyperedges are
projected back to the vertices through the adjacency
information between hyperedges and vertices.
International Journal of Engineering Research & Technology (IJERT)
ISSN: 2278-0181
Published by, www.ijert.org
RTICCT - 2019 Conference Proceedings
Volume 7, Issue 01
Special Issue - 2019
1
(ii) Learning a Hidden Hypergraph
Is introduced an interesting combinatorial object, which we
call an independent covering family. Basically, an
independent covering family of a hypergraph is a collection
of independent sets that cover all non-edges. An interesting
observation is that the set of negative queries of any
algorithm that learns a hypergraph drawn from a class of
hypergraphs that is closed under the operation of adding an
edge is an independent covering family of that hypergraph.
Note both the class of r-uniform hypergraphs and the class
of (r, ∆)-uniform hypergraphs are closed under the operation
of adding an edge. This implies that the query complexity of
learning such a hypergraph is bounded below by the
minimum size of its independent covering families. In the
opposite direction, subroutines are given one arbitrary edge
from a hypergraph. With the help of the subroutines, we
show that if are constructed small-sized independent
covering families for some class of hypergraphs, It is able to
obtain an efficient learning algorithm for it. In this paper, we
give a randomized construction of an independent covering
family of size O(r2 2r m log n) is given for r-uniform
hypergraphs with m edges. This yields a learning algorithm
using a number of queries that is quadratic in m, which is
further improved to give an algorithm using a number of
queries that is linear in m. As mentioned in Anglin and Chen
(2004) and some other papers, the hypergraph learning
problem may also be viewed as the problem of learning a
monotone Disjunctive Normal Form (DNF) Boolean
formula using membership queries only. Each vertex of H is
represented by a variable and each edge by a term
containing all variables associated with the vertices of the
edge. A membership query assigns 1 or 0 to each variable,
and is answered 1 if the assignment satisfies at least one
term, and 0 otherwise, that is, the set of vertices
corresponding to the variables are assigned 1 of contains all
vertices of at least one edge of H. An r-uniform hypergraph
corresponds to a monotone r-DNF. An (r, ∆)-uniform
hypergraph corresponds to a monotone DNF whose terms
are of sizes in the range of [r − ∆, r]. Thus, our results apply
also to learning the corresponding classes of monotone
DNF.
Formulas Using Membership Queries.
In this section, algorithm is given that finds an arbitrary
edge in a hypergraph of dimension r using only r log n edge-
detecting queries. The algorithm is adaptive and takes r log
n rounds. The success probability in the construction of
independent covering families in the previous section can be
easily improved by drawing more samples. Using the high-
probability version of the construction, algorithm is obtained
using a number of queries that is quadratic in m that learns
an r-uniform hypergraph with m edges with high
probability. Although the first algorithm for finding one
edge is deterministic and simple, the round complexity r log
n might be too high when n is much larger than m. The
round complexity to O(log m + r) is improved using only
O(log m log n) more queries.
(iii) Image Retrieval Via Probabilistic Hypergraph
Ranking
Hypergraph based transductive algorithm is proposed to the field of image retrieval. Based on the similarity matrix computed from various feature descriptors, image is takenas a ‘centroid’ vertex and form a hyperedge by a centroid and its k-nearest neighbors. To further exploit the correlation information among images, Is proposed a novel hypergraph model called the probabilistic hypergraph, which presents not only whether a vertex vi belongs to a hyperedge ej, but also the probability that vi ∈ ej . In this way, both the higher order grouping information and the local relationship between vertices within each hyperedge are described in this model. To improve the performance of content-based image retrieval, the hypergraph-based transductive learning algorithm is proposed in to learn beneficial information from both labeled and unlabeled data for image ranking. After feedback images are provided by users or active learning techniques, the hypergraph ranking approach tends to assign the same label to vertices that share many incidental hyperedges, with the constraints that predicted labels of feedback images should be similar to their initial labels.
The contribution of this paper is threefold:
• A proposed a new image retrieval framework based
on transductive learning with hypergraph structure,
which considerably improves image search
performance;
• A probabilistic hypergraph model to exploit the
structure of the data manifold by considering not
only the local grouping information, but also the
similarities between vertices in hyperedges;• An in depth comparison between simple graph and
hypergraph based transductive learning algorithms
is conducted in the application domain of image
retrieval, which is also beneficial to other computer
vision and machine learning applications.
It presents an active learning framework, in which a fusion
of semi-supervised techniques (based on Gaussian fields and
harmonic functions) and SVM are comprised. And pairwise
graph based man if old ranking algorithm is adopted to build
an image retrieval system. Cain et al. put forward semi-
supervised discriminant analysis and active subspace
learning to relevance feedback based image retrieval. In a
simple graph both labeled and unlabeled images are taken as
vertices; two similar images are connected by an edge and
the edge weight is computed as image-to-image affinities.
Depending on the affinity relationship of a simple graph,
semi-supervised learning techniques could be utilized to
boost the image retrieval performance.
(iv) User Interest And Social Influence Based Emotion
Prediction For Individuals
Emotions are playing significant roles in daily life, making
emotion prediction important. To date, most of state-of-the
art methods make emotion prediction for the masses which
are invalid for individuals. Is proposed novel emotion
prediction method for individuals based on user interest and
social influence. To balance user interest and social
influence, Is proposed a simple yet efficient weight learning
method in which the weights are obtained from users’
behaviors.
International Journal of Engineering Research & Technology (IJERT)
ISSN: 2278-0181
Published by, www.ijert.org
RTICCT - 2019 Conference Proceedings
Volume 7, Issue 01
Special Issue - 2019
2
The problem of emotion prediction for individuals is not
trivial. So far, there are fewer works on emotion prediction
for individuals. Emotions have long been viewed as
passions produced on their own interest. However, from
social as- pact, has shown that how happy users is
influenced are users social links to people in social
networks. More recently, Tang’s work quantitatively studies
how an individual’s emotion is influenced by his friends in
social network .It can be seen from the above the existing
emotion prediction methods for individuals either focus on
user interest or social influence. However, neither user
interest nor social influence alone can predict individual’s
emotion accurately. Is proposed a novel method jointly
considering user interest and social influence in social
network platform to predict user’s emotion. Is proposed a
simple yet efficient weight learning method to balance the
weights of user interest and social influence to figure out
exactly what kind of roles they are playing in the final
emotion prediction. The conceptual framework of our
proposed emotion prediction for individuals. With the
popularity of social network, i.e. Facebook, Twitter and
Flicker, more and more people are willing to share their own
feelings towards hot events or their experiences in daily life,
whether delivering positive or negative emotions, which
makes it easier for people to know others’ minds. More or
less, users are getting increasingly easier to be influenced by
others in social network. However, different friends may
have different extents of influences on the user based on
how close they are or how much similarity they have in
common. The use the emotion similarity when treated the
same microblog to measure the social influence.
III. EXISTING SYSTEM
Cryptographic mechanism-based security social media
technology mainly uses cryptographic security techniques
for groups with dynamic memberships
The group is any community or any cluster which shows
same properties. The social media problems of security,
privacy and anti-piracy can be overcome through
cryptographic techniques like authentication, encryption etc.
• The user needs to store many community keys if
he/she belongs to several communities
• There does not exist an efficient way to revoke the
member permanently or temporarily
• There does not exist an efficient way for
anonymous authentication with the view of tracing
the behaviour of users and computer forensics.
Each tag is an explicit reference that links to a user’s space.
For the user data, current OSNs indirectly require for
regulating protection of users to be system and policy
administrators their data, where users can restrict data
sharing to a specific trusted users. OSNs often use user
relationship and between trusted and set of group
membership to distinguish untrusted users. For example, in
Facebook, users can allow friends, Friends of Friends
(FOF), groups, or public to access their personal
authorization and data, depending on privacy requirements.
Although OSNs currently provide simple access control
mechanisms allowing users to govern access to information
have no contained in their own spaces, users, unfortunately,
Control over data residing outside their spaces. For instance,
if a user posts a comment in a friend’s space, she/he cannot
specify which users can view the comment.
In another case, when a user uploads a photo and tags
friends who appear in the photo, the tagged friends cannot
restrict who can see may have this photo, even though the
tagged friends different privacy concerns about the photo.
To address such a critical issue, preliminary protection
mechanisms have been offered by existing OSNs.
IV. PROBLEM DEFINITION:
• Multi-armed bandit problem
• Detecting the conflicts among different users’
privacy policies, and then generating an aggregated
policy that can resolve the conflicts to the largest
extent collaborative privacy management in OSNs
• Aggregated policy may cause a privacy loss to
some of the users
Detection Strategy
Classification: Classification means researchers first to
build a model for a group of classes or concepts, then use
the model to predict class labels for test data. For example,
to classify whether an email is email spam, web page is web
spam.
Prediction: Prediction focuses on the continuous-valued
functions of models researchers created. For example,
scientists use the prepared models to forecast the economic
growth in the next year.
Classification and prediction are a two-step process, which
means, model construction and model applications. Model
construction means scientists first need to introduce a set of
predefined classes, which called training dataset. Training
dataset consists of tuples for building a model, and each
tuple or sample belongs to a predefined class. At the same
time, researchers need to make the classification rules,
classification models, decision trees, decision rules, or math
formulae, etc. Model application means to classify those
unseen objects: researchers need to use an independent test
data set to estimate the accuracy of the model, then use the
model to classify unknown class labels. The training dataset
needs to use some features to make a further application. As
former researchers’ experience, most of the features are web
page top domains, languages, some words (body and title),
average word length, anchor words, visibility of content,
repeating keywords, the most common keywords, n-gram
likelihood and so on. Suppose to explore the influence of
spam in one OSN to another; we do not aim to show how
great performance of detection only around one dataset. So
we chose 10% original data to do the training work so that it
can maintain the maximum independence and testability of
posts in one social network, at the same time, it is more
intuitional and beneficial to show the influence of spam
related with same topics in other social network to the spam
detection in that social network.
The strategy of process are as follows:
A. We first split TSD and FSD separately into
training and test datasets, use training datasets
to train the various classifiers, and then use
classifiers to check the test datasets. We then
get the original classified results of Twitter
and Facebook Spam Dataset.
International Journal of Engineering Research & Technology (IJERT)
ISSN: 2278-0181
Published by, www.ijert.org
RTICCT - 2019 Conference Proceedings
Volume 7, Issue 01
Special Issue - 2019
3
B. To show the influence of Facebook spam
posts in Twitter spam tweets classification,
this research combines spam of Facebook into
a Twitter training set; we then use the newly
trained classifiers to test the remaining
dataset.
C. After step 2, we then do the same procedure in
the Facebook training dataset, and then apply
the new training process to verify the original
test dataset.
D. Finally, we combine the results of
classifications on the above two social
networks.
E. And finally got the exact result and determine
given the statement about the process.
Combined filtered classifier is used to train and test
with various classify algorithms and String to
Word Vector to process natural language. We also
use precision, F1-Measure as criteria to evaluate
the classification performance. The relations of the
true positive (TP), true negative (TN), false
positive (FP) and false negative (FN) are shown.
True positive: Facebook users correctly identified
as Facebook users; False positive: Twitter users
incorrectly identified as Facebook users; True
negative: Twitter users correctly identified as
Twitter users; False negative: Facebook users
incorrectly identified as Twitter users. In general,
Positive means identified, and negative means
rejected. Therefore: True positive means correctly
identified; false positive means incorrectly
identified; True negative means correctly rejected;
false negative means incorrectly rejected.
International Journal of Engineering Research & Technology (IJERT)
ISSN: 2278-0181
Published by, www.ijert.org
RTICCT - 2019 Conference Proceedings
Volume 7, Issue 01
Special Issue - 2019
4
V. PROPOSED SYSTEM:
A high threshold indicates that the user has a relatively low
tendency to share the data with others, and only when the
majority of the involved users or users that are highly trusted
agree to post the data, the data can finally be posted. By
tuning the threshold, the user can make a trade-off between
data sharing and privacy preserving. A trust-based
mechanism is proposed for collaborative privacy
management in OSNs. The trust values between users are
associated with users’ privacy loss, and the proposed
mechanism can encourage users to be more considerate of
other users’ privacy. Trust-based privacy management
mechanism based on threshold which the user makes the
final decision on data posting. A high threshold indicates
that the user has a relatively low tendency to share the data
with others, and only when the majority of the involved
users or users that are highly trusted agree to post the data,
the data can finally be posted. By tuning the threshold, the
user can make a trade-off between data sharing and privacy
preserving. A bandit approach is proposed to adjust the
parameter of the trust-based mechanism. By applying the
UCB policy, the user can make a rational trade-off between
data sharing and privacy preserving.
Algorithm:
1: for t = 1 to K do
2: Choose arm It = t
3: Observe and record the reward rIt ,t
4: r ← rIt ,t
5: nIt ← 1
6: end for
7: for t = K + 1 to T do
8: for i = 1 to K do
9.←1/ni ∑ r 1(I=i)
10: end for
12: Observe and record the reward rIt ,t
13: rt ← rIt ,t
14: nIt ← nIt+ 1
15: end forVI. MODULES
(i) User Profile Creation
A user profile (user profile, or simply profile when used in-
context) is a collection of personal data associated to a
specific user. A profile refers therefore to the explicit digital
representation of a person's identity. A user profile can also
be considered as the computer representation of a user
model. A user profile is a visual display of personal data
associated with a specific user, or a customized desktop
environment. A profile refers therefore to the explicit digital
representation of a person's identity. A user profile can also
be considered as the computer representation of a user
model. A profile can be used to store the description of the
characteristics of person. This information can be exploited
by systems taking into account the persons' characteristics
and preferences. The user personal data store in Online
Social Networks (OSNs) database that details contain
informs like first name, last name, username, password,
email Id, gender etc.
(ii) Post Wall Creation
The Website wall post is the most social network is enabling
with photo sharing activities. Protected albums allow users
to set their albums with access protection. This is one of the
beneficial features from wallpost that who fear with photo
scams on photo sharing websites. Photo tagging the option
makes the photo search easier after a long period of time.
Here ruse can give the names or keywords for photos that
related to the photo in better to recognize easily. Although
OSNs currently provide simple access control mechanisms
allowing users to govern access to information contained in
their own spaces, users, unfortunately, have no control over
data residing outside their spaces. In this module user can
add their or interested photos in their wall. This wall posting
contains the photo, photo description, tag information are
given by the user that details are stored in the OSNs
database.
(iii) Multiparty Policy Access Control (MPAC)
Two steps are performed to evaluate an access request over
MPAC policies. The first step checks the access request
against the policy specified by each controller and yields a
decision for the controller. The accessor element in a policy
decides whether the policy is applicable to a request. If the
user who sends the request belongs to the user set derived
from the accessor of a policy, the policy is applicable and
the evaluation process returns a response with the decision
(either permit or deny) indicated by the effect element in the
policy. Otherwise, the response yields deny decision if the
policy is not applicable to the request. In the second step,
decisions from all controllers responding to the access
request are aggregated to make a final decision for the
access request. Since data controllers may generate different
decisions (permit and deny) for an access request, conflicts
may occur. To make an unambiguous decision for each
access request, it is essential to adopt a systematic conflict
resolution mechanism to resolve those conflicts during
multiparty policy evaluation.
(iv) Topic Distribution Learning
We utilize the image vertices and homogeneous hyper edges
in the hyper graph to learn the topic distribution. We
propose to develop a hyper graph regularized topic model to
fully leverage both content and context information of
images to help learn the potential topics of interest.
However, in real-world scenarios, user-contributed social
media data is inevitably noisy and the textual information
associated with images is usually sparse, which makes it
difficult to use the hyper graph regularized topic model to
learn topics of interest accurately. Therefore, we first select
the informative images with rich tags to identify the latent
topics. Then we obtain the topic distribution for all images
via collaborative representation based similarity
propagation.(v) INFLUENCE RANKING
Topic Sensitive Influence Ranking via Affinity Propagation
Based on the learned topic distribution and the constructed
hypergraph, we perform a topical affinity propagation on the
hypergraph with the heterogeneous hyperedges for
measuring influence regarding topics for each user and
image.
International Journal of Engineering Research & Technology (IJERT)
ISSN: 2278-0181
Published by, www.ijert.org
RTICCT - 2019 Conference Proceedings
Volume 7, Issue 01
Special Issue - 2019
5
The affinity propagation algorithm is originally employed
for clustering data to identify a subset of exemplars, which
are used to best account for all other data points by passing
similarity messages between data points. Affinity
propagation can be applied whenever there is a way to
measure or pre-compute a numerical value for each pair of
data points, which indicates how similar they are. In our
scenario, users exert influence on each other through
images, which is reflected in the indirect user behaviors of
favorite or comment links. Users share topical similarity
which can be computed through the connected images and
social links. We can use affinity propagation to exchange
influence messages between users and images along
heterogeneous hyperedges in the hypergraph. The final
derived influence messages between users can be viewed as
mutual social influence between users. The algorithm for
topic-sensitive influence ranking is summarized in
Algorithm 3. In the algorithm, the influence of users and
images is recursively updated until it achieves the optimal
condition.
VII. CONCLUSION
This system can make the privacy control for the individual
user during the sharing process. and efficient user can be
defined which user will be view and will not be view about
their own data. Secure transaction and affinity propagation
is possible for by this set of algorithm. Data privacy can give
businesses a competitive advantage. The general data
production regulation is a challenge but it open up huge
business benefits that boost return on investment, such as
improved customer loyalty and more efficient operation.
Product your content from potential employers and
cybercrime. Allow you to come across as a professional
Individual.
REFERENCES
1. C. Zhang, J. Sun, X. Zhu, and Y. Fang, “Privacy
and security for online social networks: Challenges
and opportunities,” IEEE Netw., vol. 24, no. 4, pp.
13–18, Jul/Aug. 2010.
2. L. Xu, C. Jiang, J. Wang, J. Yuan, and Y. Ren,
“Information security in big data: Privacy and data
mining,” IEEE Access, vol. 2, pp. 1149–1176,
2014.
3. L. Xu, C. Jiang, Y. Chen, J. Wang, and Y. Ren, “A
framework for categorizing and applying privacy-
preservation techniques in big data mining,”
Computer, vol. 49, no. 2, pp. 54–62, Feb. 2016.
4. M. Qiu, K.Gai, and Z. Xiong, “Privacy-preserving
wireless communications using bipartite matching
in social big data,” Future Gener. Comput. Syst., to
be published. [Online]. Available:
http://www.sciencedirect.com/science/article/pii/S0
167739X17301449.
5. C. Fiesler et al., “What (or who) is public?: Privacy
settings and social media content sharing,” in Proc.
ACM Conf. Comput. Supported Cooper. Work Soc.
Comput., Mar. 2017, pp. 567–580.
6. H. Hu, G.-J. Ahn, and J. Jorgensen, “Detecting and
resolving privacy conflicts for collaborative data
sharing in online social networks,” in Proc. 27th
ACM Annu. Comput. Secure. Appl. Conf., Dec.
2011, pp. 103–112.
7. M. Such and N. Criado, “Resolving multi-party
privacy conflicts in social media,” IEEE Trans.
Know. Data Eng., vol. 28, no. 7, pp. 1851–1863,
Jul. 2016.
8. P. Auer, N. Cesa-Bianchi, and P. Fischer, “Finite-
time analysis of the multigame bandit problem,”
Mach. Learn., vol. 47, no. 2, pp. 235–256, 2002.
9. H. Hu, G.-J. Ahn, Z. Zhao, and D. Yang, “Game
theoretic analysis of multiparty access control in
online social networks,” in Proc. 19th ACM Symp.
Access Control Models Technol., New York, NY,
USA, Jun. 2014, pp. 93–102.
10. H. Hu, G. J. Ahn, and J. Jorgensen, “Multiparty
access control for online social networks: Model
and mechanisms,” IEEE Trans. Knowl. Data Eng.,
vol. 25, no. 7, pp. 1614–1627, Jul. 2013.
.
International Journal of Engineering Research & Technology (IJERT)
ISSN: 2278-0181
Published by, www.ijert.org
RTICCT - 2019 Conference Proceedings
Volume 7, Issue 01
Special Issue - 2019
6