
The structure of online social networks mirrors those in the offline world. Paper by R.I.M. Dunbar, Valerio Arnaboldi, Marco Conti, Andrea Passarella. Presentation: Rohith Raj Chenna


Page 1: The structure of online social networks mirrors those in the …ravi/pdfs/talk_slides.d/chenna... · 2015. 12. 1.

The structure of online social networks mirrors those in the offline world

Paper by R.I.M. Dunbar, Valerio Arnaboldi, Marco Conti, Andrea Passarella

Presentation: Rohith Raj Chenna


Egocentric Social Network Analysis

• Studies an individual’s personal network and its effects on that individual.
• Ego: the single node around which the network is described.
• Alter: a node that is connected to the ego.


Social Networks and Their Constraints

• There has been huge growth in the usage of social networking sites over the past decade.
• This has raised fundamental questions about the constraints that exist on both the size and the pattern of social relationships, and whether online networks mirror offline social networks.
• This is of particular interest in light of the finding that there appears to be a cognitive limit on the size of natural face-to-face social networks.
• This limit is thought to arise from a combination of a cognitive constraint and a time constraint.


Cognitive Constraint

• The central cognitive constraint is based on the observation that, in primates, the typical size of social groups correlates closely with the size of the neocortex.
• If the limit is exceeded, then the social network becomes unstable and is prone to fission.
• This proposal is supported by evidence from neuroimaging performed on humans and monkeys.


Time Constraint

• There is also evidence to suggest that time imposes a constraint.
• Time becomes important because the strength of a relationship seems to be determined by how much time two individuals spend together.
• Self-rated estimates of emotional closeness for dyadic relationships correlate closely with the frequency of contact.
• These in turn correlate with willingness to behave altruistically towards the alter in question.


Determination of the size of offline Social Networks

• A comparison of the offline social networks of heavy and casual users of internet social networking sites found that they did not differ.
• A study of downloaded traffic among the followers of individual Twitter accounts, using a criterion of reciprocated exchanges to identify meaningful relationships, concluded that Twitter communities typically averaged between 100 and 200 individuals.
• In a physicists’ community, it was found that there was a marked downturn in the rate at which additional members were acquired once communities exceeded 200 individuals.


• Individuals do not, however, distribute their social effort evenly among the alters in their networks.
• Indeed, there is considerable evidence to show that, within natural social networks, individual alters can be ranked in order of declining investment by ego.
• These rankings fall into a natural series of layers with a scaling ratio of ∼3 that yields breakpoints at around 5, 15, 50 and 150 alters.
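
As a quick check of the scaling ratio quoted above, the short Python sketch below divides each breakpoint by the previous one. The breakpoints 5, 15, 50 and 150 come from the slide; the function name `scaling_ratios` is only illustrative.

```python
def scaling_ratios(layer_sizes):
    """Return the ratio of each layer size to the one before it."""
    return [outer / inner for inner, outer in zip(layer_sizes, layer_sizes[1:])]


breakpoints = [5, 15, 50, 150]           # layer breakpoints quoted on the slide
ratios = scaling_ratios(breakpoints)
print([round(r, 2) for r in ratios])     # [3.0, 3.33, 3.0]

mean_ratio = sum(ratios) / len(ratios)   # about 3.1, i.e. roughly 3
print(round(mean_ratio, 2))
```

Each layer is therefore about three times the size of the layer inside it, which is what "scaling ratio of ∼3" refers to.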


Facebook Dataset #1

• Facebook dataset #1 was obtained before 2009, when the default privacy settings allowed users inside the same regional network to have full access to each other’s personal data.
• The dataset covers the time span from the start of Facebook in September 2004 until April 2008.
• The dataset represents only a subsample of the original Facebook regional network, in terms of downloaded Facebook profiles (∼56%) and their Facebook friendships (∼37%).


• Despite the high number of missing profiles, some of their data is still present in the dataset.
• We miss only the data that is sent from a public profile to a private profile and between two private profiles.
• We do not know which profiles are public and which are not, because we have only the number of (undirected) interactions (posts or photo comments) that occurred.
• The only information we have is the percentage of non-public profiles.


• 44% of the nodes are selected at random and assumed to be private.
• The number of interactions on all these nodes is doubled.
• The resulting interaction counts are more accurate in the inner layers and less accurate in the outer layers.
• Strongly asymmetrical relationships are typically known to belong to the outermost layers of the ego networks.
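
The correction described above could be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name, the edge-count dictionary layout, and the rule that an edge is doubled once if either endpoint is in the randomly chosen "private" set are all hypothetical choices, and the toy edge counts are invented.

```python
import random


def correct_for_private_profiles(edge_counts, private_fraction=0.44, seed=0):
    """Double interaction counts on edges touching randomly chosen 'private' nodes.

    edge_counts maps an undirected edge (u, v) to its interaction count.
    Assumption: an edge is doubled once if either endpoint is private.
    """
    rng = random.Random(seed)
    nodes = sorted({node for edge in edge_counts for node in edge})
    n_private = round(private_fraction * len(nodes))
    private = set(rng.sample(nodes, n_private))

    corrected = {}
    for (u, v), count in edge_counts.items():
        factor = 2 if (u in private or v in private) else 1
        corrected[(u, v)] = count * factor
    return corrected, private


# Toy data, invented for illustration only.
edges = {("a", "b"): 3, ("b", "c"): 1, ("c", "d"): 4}
corrected, private = correct_for_private_profiles(edges, seed=1)
print(sorted(private), corrected)
```

Because the private nodes are chosen at random, the corrected counts are only an estimate for any individual edge, which is consistent with the slide's remark that accuracy differs between layers.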


• Facebook dataset #1 consists of 3+ million nodes and 23+ million edges.
• The data is classified into 4 time frames: last month, last 6 months, last year, and entire duration.
• Contact frequency is calculated.


CCDF of contact frequency for relationships in Facebook dataset #1


• The graph shows that the contact frequency is low for most relationships, but a few relationships have very high levels of interaction.
• This type of distribution is typical of social networks.
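
A CCDF like the one summarized above can be computed as in the following Python sketch. The contact frequencies are made-up illustrative values, not data from the paper, and the convention P(X > x) is an assumption, since the slides do not say whether the inequality is strict.

```python
def ccdf(values):
    """Return (x, P(X > x)) for each distinct x in values, in increasing order."""
    n = len(values)
    ordered = sorted(values)
    points = []
    for i, x in enumerate(ordered):
        # Emit a point only at the last occurrence of each distinct value.
        if i + 1 == n or ordered[i + 1] != x:
            points.append((x, (n - (i + 1)) / n))
    return points


# Hypothetical contact frequencies (messages per month) for eight relationships.
contact_freq = [0.1, 0.1, 0.2, 0.2, 0.2, 0.5, 1.0, 12.0]
for x, p in ccdf(contact_freq):
    print(x, p)
```

In this toy sample most values are small and a single relationship has a very high frequency, which is the long-tailed shape the slide describes.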


• For the analysis we consider only egos with an average of more than 10 interactions per month, thus selecting “socially active people”, since they are particularly relevant for our analysis, and discard inactive profiles.
• The final dataset contains 130,338 egos with 5,289,910 active edges.
• To extract ego networks from the datasets, we first create a series of sets, each of which contains all the social relationships of a user.
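
The two steps above, building one set of relationships per user and keeping only active egos, might look like this in Python. It is a sketch under assumptions: the `(user, user, count)` input format, the function names, and the use of one shared observation window `months` for every ego are hypothetical, and the sample interactions are invented.

```python
from collections import defaultdict


def build_ego_networks(interactions):
    """Map each user to a dict {alter: number of interactions}."""
    egos = defaultdict(lambda: defaultdict(int))
    for u, v, count in interactions:
        egos[u][v] += count   # interactions are undirected,
        egos[v][u] += count   # so record them for both endpoints
    return egos


def active_egos(egos, months, min_per_month=10):
    """Keep egos averaging more than min_per_month interactions per month."""
    selected = {}
    for ego, alters in egos.items():
        if sum(alters.values()) / months > min_per_month:
            selected[ego] = dict(alters)
    return selected


# Invented sample: (user, user, interaction count) over a 3-month window.
interactions = [("a", "b", 30), ("a", "c", 5), ("d", "b", 2)]
egos = build_ego_networks(interactions)
active = active_egos(egos, months=3)
print(sorted(active))   # ['a', 'b']
```

Here `a` and `b` exceed 10 interactions per month (35/3 and 32/3), while `c` and `d` are discarded as inactive.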


CCDF of the size of ego networks for relationships in Facebook Dataset #1


• We note that, for the majority of ego networks, the size is lower than 100.
• This means that even though people can potentially add up to 5000 friends on Facebook, they communicate with only a small subset of them.


• We further refine the dataset by selecting, for each ego network, only the relationships with a contact frequency higher than one message per year.
• This avoids considering people in whom the ego does not invest some minimum amount of time and cognitive resources.
• Selecting only social relationships with more than one message per year does not substantially change the size of the ego networks.
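
The one-message-per-year threshold can be expressed as a small filter. In this Python sketch the function name, the `{alter: count}` layout and the `years` parameter (the length of the observation window) are assumptions made for illustration, and the counts are invented.

```python
def prune_weak_ties(alters, years, min_per_year=1.0):
    """Keep alters whose contact frequency exceeds min_per_year messages per year."""
    return {alter: count for alter, count in alters.items()
            if count / years > min_per_year}


# Invented ego network: interaction counts observed over 3 years.
ego_network = {"b": 30, "c": 2, "d": 5}
print(prune_weak_ties(ego_network, years=3.0))   # {'b': 30, 'd': 5}
```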


Facebook Dataset #2

• It represents the Facebook regional network of New Orleans, and it was obtained through a crawling agent similar to the one created for downloading Facebook dataset #1.
• It represents a much smaller network (90,269 nodes and 3,646,662 social links), but the data is much more specific.
• For each user, it reports the list of Facebook friends and the list of wall posts received from those friends, with a timestamp indicating the time at which each interaction occurred.


Twitter Dataset

• A sample of 303,902 Twitter user profiles was collected by crawling Twitter in November 2012.
• To avoid including Twitter users that are not human, the data is filtered by identifying user profiles that have recognizable human characteristics.


CCDF of the contact frequency for relationships in Twitter


CCDF of the size of ego networks in Twitter


• As with the Facebook datasets, we selected only accounts that had an average of more than 10 interactions per month.
• The final dataset contains 60,790 egos and 5,323,195 social relationships.
• The CCDFs show longer tails than in Facebook. This indicates that in Twitter there are users with larger ego networks than in Facebook.
• Nevertheless, as in Facebook, more than 90% of the Twitter users have fewer than 100 relationships.


Analysis

• We use both k-means (a partition clustering technique) and DBSCAN (a density-based clustering technique) on the frequency of contact of each ego network to search for a layered structure.
• We apply k-means in two different ways.
• On the one hand, we want to find the typical number of clusters in the ego networks, as we want to verify whether Facebook and Twitter ego networks show a layered structure with a number of layers similar to the one found in offline social networks.


• To measure how well the data are clustered, we calculate the silhouette statistic for each optimal configuration.
• The results obtained with k-means may be affected by the presence of noisy data.
• Noise can affect a k-means analysis in two different ways:
• The presence of noisy points between two adjacent clusters might cause the algorithm to treat them all as a single cluster instead of two.
• The presence of a large number of noisy points in the data set could lead to the detection of more clusters than really exist.
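
To make the k-means and silhouette steps concrete, here is a minimal pure-Python sketch on one-dimensional contact frequencies. It is not the authors' procedure: it uses plain Lloyd iterations with a deterministic start, and it picks the number of clusters by the highest mean silhouette, which is an assumption (the slides do not state the selection criterion). All function names and the sample frequencies are hypothetical.

```python
def kmeans_1d(values, k, iters=100):
    """Lloyd's algorithm on scalars; returns one cluster label per value."""
    ordered = sorted(values)
    # Deterministic start: k evenly spaced order statistics.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: abs(v - centers[j]))
                      for v in values]
        if new_labels == labels:
            break                      # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:                # leave an empty cluster's center as is
                centers[j] = sum(members) / len(members)
    return labels


def mean_silhouette(values, labels):
    """Average silhouette over all points (singleton clusters score 0)."""
    clusters = {}
    for v, lab in zip(values, labels):
        clusters.setdefault(lab, []).append(v)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for v, lab in zip(values, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        a = sum(abs(v - w) for w in own) / (len(own) - 1)
        b = min(sum(abs(v - w) for w in other) / len(other)
                for m, other in clusters.items() if m != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(values)


def best_k(values, k_max=6):
    """Return (k with the highest mean silhouette, all scores by k)."""
    upper = min(k_max, len(set(values)) - 1)
    scores = {k: mean_silhouette(values, kmeans_1d(values, k))
              for k in range(2, upper + 1)}
    return max(scores, key=scores.get), scores


# Invented contact frequencies with three well-separated groups.
freqs = [1, 1.1, 0.9, 10, 10.1, 9.9, 20, 20.2, 19.8]
k_star, scores = best_k(freqs)
print(k_star)   # 3
```

A silhouette close to 1 means points sit much nearer to their own cluster than to the next one, so values around 0.67, as reported later in the deck, indicate reasonably well-separated layers.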


• DBSCAN defines two parameters, ε and MinPts.
• Any object with more than MinPts neighbours within a distance ε is defined as a “core object”.
• A “border object” of a cluster is linked to a core object at a distance less than ε.
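
The core/border definitions above can be illustrated with a small one-dimensional DBSCAN in pure Python. This is a sketch, not the implementation used in the paper. It assumes the common convention that an object is a core object when at least `min_pts` points, counting itself, lie within `eps`; the slide's wording ("more than MinPts neighbours") may correspond to a slightly different threshold. The sample values are invented.

```python
def dbscan_1d(values, eps, min_pts):
    """Return one label per value: 0, 1, ... for clusters and -1 for noise."""
    n = len(values)
    # Neighbourhood of each point (includes the point itself).
    neighbours = [[j for j in range(n) if abs(values[i] - values[j]) <= eps]
                  for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1             # noise for now; may become a border object
            continue
        cluster += 1                   # i is a core object: start a new cluster
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster    # former noise reached from a core: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])   # j is also core: keep expanding
    return labels


# Invented contact frequencies: two dense groups and one isolated point.
values = [1.0, 1.1, 1.2, 5.0, 9.0, 9.1, 9.2]
print(dbscan_1d(values, eps=0.15, min_pts=2))   # [0, 0, 0, -1, 1, 1, 1]
```

Unlike k-means, DBSCAN leaves the isolated value unassigned (label -1) instead of forcing it into a cluster, which is why it is useful as a check against the noise effects listed on the previous slide.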


Optimal number of clusters


Results

• The distributions show a marked peak around k* = 4 for all the datasets.
• For Facebook dataset #1, the ego networks have an average optimal number of clusters equal to 4.35 (with median 4), and Facebook dataset #2 has an average optimal number of clusters of 4.10 (with median 4).
• Despite a clear mode at 4, the ego networks in the Twitter dataset have an average optimal number of clusters equal to 6.60 (with median 5) due to the long tail to the right.


• The average silhouette value for the best configurations associated with the optimal number of clusters for each ego network is 0.670 for Facebook dataset #1, 0.678 for Facebook dataset #2, and 0.674 for Twitter.


Layer 0

• Previously, the innermost known layer was layer 1, with approximately 5 friends, but we see a new layer inside layer 1, called layer 0.
• This means that most people have 1 or 2 really good friends with whom they communicate far more than they do with the people in layer 1.


Conclusion

• The analyses of three different online datasets confirm the layered structure found in offline face-to-face social networks.
• These layers have previously been identified only from samples of quite modest size.
• The sizes of the entire ego networks for the three datasets are smaller than the total size of conventional offline egocentric networks.
• The mean rates of contact in each layer are extremely close, especially for the Facebook datasets, to those found in offline egocentric networks.
• This suggests that the online environments may be mapping quite closely onto everyday offline networks.