ensemble learning for online social networkslinc.ucy.ac.cy/isocial/images/stories/events... · big...

Post on 15-Aug-2020


Amira Soliman

PhD student, KTH/EES/LCN

Supervisor: Ass. Prof. Sarunas Girdzijauskas

Ensemble Learning for Online Social Networks


Outline

1. Ensemble Learning

2. Ensemble Learning for OSN

3. Peer Sampling Service

4. Merging Classifiers

5. Applications


Big Data Mining

Turning this big data into “actionable insights” in a timely fashion

Challenges:

– The three Vs of Big Data (volume, velocity, variety)

– Information sharing and data privacy

– Mining complex and dynamic data

– Mining from sparse, uncertain, and incomplete data


Ensemble Learning

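[Figure: multiple data sets D1, D2, ..., Dt-1, Dt are created from the original training data D (Step 1: create multiple data sets); a classifier C1, C2, ..., Ct-1, Ct is built on each data set (Step 2: build multiple classifiers); the classifiers are combined into C* (Step 3: combine classifiers).]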
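The three steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data sets are created by sampling with replacement, the base classifier is a one-feature threshold rule, and C* is a majority vote. None of these choices is specified on the slide, and the names `train_stump`, `build_ensemble`, and `combine` are hypothetical.

```python
import random
from collections import Counter

def train_stump(data):
    """Fit a one-feature threshold classifier on (x, label) pairs, labels 0/1."""
    best = None
    for threshold, _ in data:
        for polarity in (0, 1):
            # polarity 1 predicts label 1 when x >= threshold, polarity 0 the opposite
            errors = sum(
                (int(x >= threshold) if polarity else int(x < threshold)) != y
                for x, y in data
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, polarity)
    _, threshold, polarity = best
    return lambda x: int(x >= threshold) if polarity else int(x < threshold)

def build_ensemble(data, t, rng):
    # Step 1: create t data sets (assumption: sampling with replacement).
    datasets = [[rng.choice(data) for _ in data] for _ in range(t)]
    # Step 2: build one classifier per data set.
    return [train_stump(d) for d in datasets]

def combine(classifiers):
    # Step 3: combine the classifiers (assumption: majority vote).
    def c_star(x):
        votes = Counter(c(x) for c in classifiers)
        return votes.most_common(1)[0][0]
    return c_star

rng = random.Random(0)
# Toy data: label 1 when x >= 5.0
data = [(x / 10, int(x >= 50)) for x in range(100)]
c_star = combine(build_ensemble(data, t=5, rng=rng))
```

Calling `c_star(x)` returns the label chosen by most of the five classifiers.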

Motivations of Ensemble Learning

• Ensemble models improve accuracy and robustness over single-model methods

• Applications:

– distributed computing

– privacy-preserving applications

– large-scale data with reusable models

– multiple sources of data


Simple Distributed Training

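[Figure: the training data is split across Server 1 to Server 5; server i trains its own parameter vector w(i), giving w(1), w(2), w(3), w(4), w(5).]

The local models are combined by a weighted average:

$$ w = \sum_{i} \mu_i \, w^{(i)} $$

where μi is the fraction of data for piece i.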
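As a small worked example of this weighted average, the sketch below mixes per-server parameter vectors using the fraction of data each server holds. The function name `mix_parameters` and the toy numbers are assumptions made for illustration.

```python
def mix_parameters(local_weights, piece_sizes):
    """Return w = sum_i mu_i * w(i), where mu_i is the data fraction of piece i."""
    total = sum(piece_sizes)
    mu = [n / total for n in piece_sizes]
    dim = len(local_weights[0])
    return [
        sum(m * w[k] for m, w in zip(mu, local_weights))
        for k in range(dim)
    ]

# Two servers: the second holds three times as much data as the first,
# so mu = [0.25, 0.75].
w = mix_parameters([[1.0, 0.0], [3.0, 2.0]], piece_sizes=[1, 3])
```

Here the mixed vector is pulled toward the second server's model because it was trained on the larger piece.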

Ensemble Learning for OSN


Research Questions:

• Diversity – random overlay

• Navigation – constructing relay paths through social ties

• Merge classifiers – reflect ego-centric and socio-centric properties of the OSN

Methodologies:

• Gossip-based Peer Sampling Service

• On-the-fly Path Reconstruction

• Merging classifiers based on content richness

Our approach

1. Construct a random overlay on top of the network.

2. Train classifier with local training dataset.

3. Receive classifiers from neighbors in overlay.

4. Assign the weights of different classifiers.

5. Update the local classifier using the weighted average of the classifiers.

6. Propagate updated classifier to neighbors in overlay.
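Steps 3 to 6 can be sketched as one synchronous gossip round. This is a simplified illustration, not the presented system: classifiers are assumed to be plain parameter vectors, the overlay is given as neighbor lists, and the weight of each classifier is assumed to be proportional to a hypothetical `richness` score (for example, the number of local training examples).

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    model: list        # parameters of the locally trained classifier (step 2)
    richness: float    # hypothetical content-richness score of this node
    neighbors: list = field(default_factory=list)  # overlay neighbor ids (step 1)

def gossip_round(nodes):
    """Run steps 3 to 6 once for every node, using the models of the previous round."""
    updated = {}
    for node_id, node in nodes.items():
        # Step 3: receive classifiers from overlay neighbors (the local one is kept too).
        received = [node] + [nodes[n] for n in node.neighbors]
        # Step 4: assign weights (assumption: proportional to richness).
        total = sum(peer.richness for peer in received)
        weights = [peer.richness / total for peer in received]
        # Step 5: update the local classifier with the weighted average.
        updated[node_id] = [
            sum(w * peer.model[k] for w, peer in zip(weights, received))
            for k in range(len(node.model))
        ]
    # Step 6: the updated classifiers are what neighbors receive in the next round.
    for node_id, model in updated.items():
        nodes[node_id].model = model

nodes = {
    "A": Node(model=[1.0, 0.0], richness=1.0, neighbors=["B"]),
    "B": Node(model=[3.0, 4.0], richness=3.0, neighbors=["A"]),
}
gossip_round(nodes)
```

In this two-node example both nodes end the round with the same model, since each averages the same pair of classifiers with the same weights.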


Overlay Construction

Cyclon: inexpensive membership management for unstructured P2P overlays

– gives access to random peers

– has low diameter

– has low clustering coefficient

– is resilient to massive node failures

Basic idea: a shuffle operation that is performed periodically using gossip
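A simplified sketch of one shuffle is shown below. It follows the general shape of a Cyclon-style exchange (age the entries, contact the oldest neighbor, swap a few entries, evict entries that were just sent), but the data layout (`caches` as a dict of `{peer: age}` dicts) and the function names are assumptions, and message passing is replaced by direct dictionary access.

```python
import random

def shuffle(caches, p, shuffle_len, cache_size, rng):
    """One shuffle initiated by node p; assumes p's cache is not empty."""
    cache_p = caches[p]
    for peer in cache_p:
        cache_p[peer] += 1                      # age every entry
    q = max(cache_p, key=cache_p.get)           # contact the oldest neighbor
    del cache_p[q]
    others = rng.sample(sorted(cache_p), min(shuffle_len - 1, len(cache_p)))
    sent_to_q = {peer: cache_p[peer] for peer in others}
    sent_to_q[p] = 0                            # fresh entry pointing back to p
    cache_q = caches[q]
    reply = rng.sample(sorted(cache_q), min(shuffle_len, len(cache_q)))
    sent_to_p = {peer: cache_q[peer] for peer in reply}
    _merge(cache_q, sent_to_q, list(sent_to_p), owner=q, cache_size=cache_size)
    _merge(cache_p, sent_to_p, others, owner=p, cache_size=cache_size)
    return q

def _merge(cache, received, replaceable, owner, cache_size):
    for peer, age in received.items():
        if peer == owner or peer in cache:
            continue                            # skip self and already-known peers
        if len(cache) >= cache_size:
            # make room by evicting an entry that was just sent to the other side
            victim = next((v for v in replaceable if v in cache), None)
            if victim is None:
                continue
            del cache[victim]
        cache[peer] = age

caches = {
    "A": {"B": 0, "C": 2},
    "B": {"D": 1},
    "C": {"D": 0, "E": 1},
    "D": {},
    "E": {},
}
contacted = shuffle(caches, "A", shuffle_len=2, cache_size=2, rng=random.Random(1))
```

After the exchange, A and C have swapped parts of their views, which is how repeated shuffles keep each node's neighbor set close to a random sample.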


How Cyclon works

[Figure: nodes A to H in the overlay. Each cache entry pairs a peer with its address, e.g. C: ipAddrC, A: ipAddrA, E: ipAddrE, H: ipAddrH.]

Overlay Construction on top of OSN

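[Figure: nodes A to H. Cache entries list a sequence of nodes for each peer instead of an IP address, e.g. D: A, D; F: A, F; B: A, B; C: B, C; A: B, A; E: F, E; H: F, H. Other entries contain only the peer itself, e.g. A: A, C: C, D: D, E: E, G: G, H: H.]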

On-the-fly Path Reconstruction

[Figure: nodes A to H with the same path-based cache entries (D: A, D; F: A, F; B: A, B; A: A; C: C; D: D; E: E; G: G; H: H). The entry H: F, H is shown extended to H: A, F, H.]
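One way to read the figure is that a node receiving an entry from a neighbor rebuilds the relay path by putting that neighbor in front of it. The sketch below encodes that reading. It is an assumption about the mechanism, not a description of the actual protocol: the function name `extend_paths`, the list representation of paths, and the shortcut rule for paths that already pass through the receiver are all hypothetical.

```python
def extend_paths(received_entries, sender, receiver):
    """Rebuild relay paths for cache entries received from `sender`.

    Each entry maps a peer to the path the sender uses to reach it,
    written as a list of node ids that ends with the peer itself.
    """
    rebuilt = {}
    for peer, path in received_entries.items():
        if peer == receiver:
            continue  # a node needs no entry for itself
        if receiver in path:
            # Assumed shortcut: the sender's path already passes through the
            # receiver, so only the part after the receiver is kept.
            new_path = path[path.index(receiver) + 1:]
        else:
            # Otherwise relay through the sender first.
            new_path = [sender] + list(path)
        rebuilt[peer] = new_path
    return rebuilt

# The entry from the figure, H: F, H, received from A by some node B.
from_a = extend_paths({"H": ["F", "H"]}, sender="A", receiver="B")
# An entry whose path already passes through the receiver B.
shortcut = extend_paths({"C": ["B", "C"]}, sender="A", receiver="B")
```

With these inputs the first call reproduces the extension shown in the figure (H: A, F, H), while the second keeps only the direct hop to C.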

Our approach

1. Construct a random overlay on top of the network.

2. Train classifier with local training dataset.

3. Receive classifiers from neighbors in overlay.

4. Assign the weights of different classifiers.

5. Update the local classifier using the weighted average of the classifiers.

6. Propagate updated classifier to neighbors in overlay.


Merging Models

[Figure: models M1, M2, ..., Mn are combined into a single model M; the diagram is also labeled L(n) and LCV.]

Evaluation

1. We have generated synthetic graphs and data.

2. Data from OSNs (Facebook and Twitter)

Applications:

For our iSocial project:

1. Identity Management with INSUBRIA

2. Spam Filtering with FORTH


Thanks

Questions and comments are all welcome
