scalable methods for the analysis of network‐based data · p. smyth: networks muri meeng, june 3,...

35
P. Smyth: Networks MURI Mee5ng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Mee=ng, June 3 rd 2011, UC Irvine Principal Inves=gator: Professor Padhraic Smyth Department of Computer Science University of California, Irvine Addi5onal project informa5on online at www.datalab.uci.edu/muri

Upload: buicong

Post on 22-Aug-2019

215 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

1

Scalable Methods for the Analysis  of Network‐Based Data 

MURI Mee=ng, June 3rd 2011, UC Irvine 

Principal Inves=gator:  Professor Padhraic Smyth Department of Computer Science University of California, Irvine 

Addi5onalprojectinforma5ononlineatwww.datalab.uci.edu/muri 

Page 2: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

2

Today’sMee5ng

•  Goals–  Reviewourresearchprogress–  Discussion,ques5ons,interac5on–  Feedbackfromvisitors

•  Format–  Introduc5on–  Researchtalks

•  Regular:20minutes+5minsatendforques5ons/discussion•  Short:10minutes(sessionaVerlunch)

–  Twoopendiscussionsessions,ledbyfaculty–  Ques5on/discussionencouragedduringtalks–  Severalbreaksfordiscussion

Butts

Page 3: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

3

MURIProjectTimeline

•  Ini5al3‐yearperiod–  May12008toApril30th2011

–  Fundingactuallyarrivedtouniversi5esinOct2008

•  2‐yearextension:–  May12011toApril30th2013

•  Mee5ngs(allatUCIrvine)–  KickoffMee5ng,November2008–  WorkingMee5ngs,April2009,August2009–  AnnualReview,December2009–  WorkingMee5ng,May2010

–  AnnualReview,November2010–  Today,June2011

Page 4: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

4

Mo5va5on

2007:interdisciplinaryinterestinanalysisoflargenetworkdatasets

Manyoftheavailabletechniquesweredescrip5ve,couldnothandle

‐  Predic5on

‐  Missingdata

‐  Covariates,etc

Page 5: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

5

Mo5va5on

2007:interdisciplinaryinterestinanalysisoflargenetworkdatasets

Manyoftheavailabletechniquesweredescrip5ve,couldnothandle

‐  Predic5on

‐  Missingdata

‐  Covariates,etc

2007:significantsta5s5calbodyoftheoryavailableonnetworkmodeling

Manyoftheavailabletechniquesdidnotscaleuptolargedatasets,notwidelyknown/understood/used,etc

Page 6: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

6

Mo5va5on

2007:interdisciplinaryinterestinanalysisoflargenetworkdatasets

Manyoftheavailabletechniquesweredescrip5ve,couldnothandle

‐  Predic5on

‐  Missingdata

‐  Covariates,etc

2007:significantsta5s5calbodyoftheoryavailableonnetworkmodeling

Manyoftheavailabletechniquesdidnotscaleuptolargedatasets,notwidelyknown/understood/used,etc

Goal of this MURI project 

Develop new sta=s=cal network models  and algorithms to broaden their scope of applica=on to large, complex, dynamic  

real‐world network data sets

Page 7: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

7

MURITeam

Inves=gator  University  Department  Exper=se  Number Of PhD Students 

Number of Postdocs 

PadhraicSmyth(PI) UCIrvine ComputerScience Machinelearning 4

CarterBugs UCIrvine Sociology Sta5s5calsocialnetworkanalysis

6

MarkHandcock UCLA Sta5s5cs Sta5s5calsocialnetworkanalysis

1 1

DaveHunter PennState Sta5s5cs Computa5onalsta5s5cs

2 1

DavidEppstein UCIrvine ComputerScience Graphalgorithms 2

MichaelGoodrich UCIrvine ComputerScience Algorithmsanddatastructures

1 1

DaveMount UMaryland ComputerScience Algorithmsanddatastructures

2

TOTALS  18  3 

Page 8: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

8Collabora5onNetwork

Padhraic Smyth Dave

Hunter

Mark Handcock

Dave Mount

Mike Goodrich

David Eppstein Carter

Butts

Page 9: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

9

Emma Spiro

Lorien Jasny

Zack Almquist

Chris Marcum

Sean Fitzhugh

Nicole Pierski

Collabora5onNetwork

Padhraic Smyth Dave

Hunter

Mark Handcock

Dave Mount

Mike Goodrich

David Eppstein Carter

Butts

Chris DuBois

Minkyoung Cho

Eunhui Park

Miruna Petrescu-Prahova

Arthur Asuncion Jimmy

Foulds

Duy Vu Ruth Hummel

Michael Schweinberger

Ranran Wang

Nick Navaroli

Darren Strash

Lowell Trott

Maarten Loffler

Joe Simon

Page 10: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

10

Emma Spiro

Lorien Jasny

Zack Almquist

Chris Marcum

Sean Fitzhugh

Nicole Pierski

Collabora5onNetwork

Padhraic Smyth Dave

Hunter

Mark Handcock

Dave Mount

Mike Goodrich

David Eppstein Carter

Butts

Chris DuBois

Minkyoung Cho

Eunhui Park

Miruna Petrescu-Prahova

Arthur Asuncion Jimmy

Foulds

Duy Vu Ruth Hummel

Michael Schweinberger

Ranran Wang

Nick Navaroli

Darren Strash

Lowell Trott

Maarten Loffler

Joe Simon

Page 11: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

11

Emma Spiro

Lorien Jasny

Zack Almquist

Chris Marcum

Sean Fitzhugh

Nicole Pierski

Collabora5onNetwork

Padhraic Smyth Dave

Hunter

Mark Handcock

Dave Mount

Mike Goodrich

David Eppstein Carter

Butts

Chris DuBois

FACEBOOK

Romain Thibaux

Minkyoung Cho

Eunhui Park

Miruna Petrescu-Prahova

Arthur Asuncion Jimmy

Foulds

Duy Vu Ruth Hummel

Michael Schweinberger

Ranran Wang

Nick Navaroli

Ryan Acton

Krista Gile

Computational Social Science Initiative U Mass Amherst

GOOGLE

INTEL

Darren Strash

Lowell Trott

Maarten Loffler

Joe Simon

RAND

Page 12: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

12

Example:NetworkDynamicsinClassroomsNicole Pierski, Chris DuBois, Carter Butts

Page 13: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

13

Data: Count matrix of 200,000 email messages among 3000 individuals over 3 months

Problem: Understand communication patterns and predict future communication activity

Challenges: sparse data, missing data, non-stationarity, unseen covariates

Page 14: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

14

KeyScien5fic/TechnicalChallenges

•  Parametrizemodelsinasensibleandcomputableway–  Respecttheoriesofsocialbehavioraswellasexplainobserveddata

•  Accountforrealdata–  E.g.,understandsamplingmethods:accountformissing,error‐pronedata

•  Makeinferencebothprincipledandprac5cal–  computa5onally‐scalability:wantaccurateconclusions,butcan’twaitforeverfor

results

•  Dealwithrichanddynamicdata–  Real‐worldproblemsinvolvesystemswithcomplexcovariates(text,geography,

etc)thatchangeover5me

Insum:sta5s5callyprincipledmethodsthatrespectthereali5esofdataandcomputa5onalconstraints

Page 15: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

15

DomainTheory DataCollec5on

Sta5s5calModels

MappingtheProjectTerrain

Page 16: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

16

DomainTheory DataCollec5on

Sta5s5calModels Sta5s5calTheory

MappingtheProjectTerrain

Page 17: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

17

DataStructuresandAlgorithms

DomainTheory DataCollec5on

Sta5s5calModels Sta5s5calTheory

Es5ma5onAlgorithms

MappingtheProjectTerrain

Page 18: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

18

DataStructuresandAlgorithms

DomainTheory DataCollec5on

Sta5s5calModels Sta5s5calTheory

Es5ma5onAlgorithms

Inference

HypothesisTes5ng

Predic5on/Forecas5ng

DecisionSupport

Simula5on

MappingtheProjectTerrain

Page 19: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

19

Topic State of the Art Prior to Project 

State of the Art  Now 

Poten=al Applica=ons And Impact 

Generaltheoryforhandling

missingdatainsocialnetworks

Problemonlypar5allyunderstood.

NosoVwareavailableforsta5s5calmodeling

Generalsta5s5caltheoryfortrea5ngmissingdatainasocialnetworkcontext.

Publicly‐availablecodeinR.(GileandHandcock,2010)

Allowsapplica5onofsocialnetworkmodelingtodatasetswithsignificantmissing

data

Rela5onaleventmodels

Basicdyadiceventmodels.

Noexogenousevents.NopublicsoVware.

Muchrichermodelwithexogenousevents,egocentricsupport,mul5pleobserver

accounts,hierarchiesSoVwarepubliclyavailable

(Bugsetal,2010)

Providesageneralframeworkfordynamic

networkmodelingtolargerealis5capplica5ons

Cliquefindingalgorithms

Tooslowforuseinsta5s5calnetwork

modeling

Newlinear‐5mealgorithmforlis5ngallmaximalcliquesin

sparsegraphs(Eppstein,Loffler,Strash,

2010)

Extendsapplicabilityofsta5s5calnetworkmodelingtolargernetworksandmore

complexmodels

Accomplishments

Page 20: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

20

Topic State of the Art Prior to Project 

State of the Art  Now 

Poten=al Applica=ons And Impact 

Generaltheoryforhandling

missingdatainsocialnetworks

Problemonlypar5allyunderstood.

NosoVwareavailableforsta5s5calmodeling

Generalsta5s5caltheoryfortrea5ngmissingdatainasocialnetworkcontext.

Publicly‐availablecodeinR.(GileandHandcock,2010)

Allowsapplica5onofsocialnetworkmodelingtodatasetswithsignificantmissing

data

Rela5onaleventmodels

Basicdyadiceventmodels.

Noexogenousevents.NopublicsoVware.

Muchrichermodelwithexogenousevents,egocentricsupport,mul5pleobserver

accounts,hierarchiesSoVwarepubliclyavailable

(Bugsetal,2010)

Providesageneralframeworkfordynamic

networkmodelingtolargerealis5capplica5ons

Cliquefindingalgorithms

Tooslowforuseinsta5s5calnetwork

modeling

Newlinear‐5mealgorithmforlis5ngallmaximalcliquesin

sparsegraphs(Eppstein,Loffler,Strash,

2010)

Extendsapplicabilityofsta5s5calnetworkmodelingtolargernetworksandmore

complexmodels

Accomplishments

Page 21: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

21

Topic State of the Art Prior to Project 

State of the Art  Now 

Poten=al Applica=ons And Impact 

Generaltheoryforhandling

missingdatainsocialnetworks

Problemonlypar5allyunderstood.

NosoVwareavailableforsta5s5calmodeling

Generalsta5s5caltheoryfortrea5ngmissingdatainasocialnetworkcontext.

Publicly‐availablecodeinR.(GileandHandcock,2010)

Allowsapplica5onofsocialnetworkmodelingtodatasetswithsignificantmissing

data

Rela5onaleventmodels

Basicdyadiceventmodels.

Noexogenousevents.NopublicsoVware.

Muchrichermodelwithexogenousevents,egocentricsupport,mul5pleobserver

accounts,hierarchiesSoVwarepubliclyavailable

(Bugsetal,2010)

Providesageneralframeworkfordynamic

networkmodelingtolargerealis5capplica5ons

Cliquefindingalgorithms

Tooslowforuseinsta5s5calnetwork

modeling

Newlinear‐5mealgorithmforlis5ngallmaximalcliquesin

sparsegraphs(Eppstein,Loffler,Strash,

2010)

Extendsapplicabilityofsta5s5calnetworkmodelingtolargernetworksandmore

complexmodels

Accomplishments

Page 22: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

22

Topic State of the Art Prior to Project 

State of the Art  Now 

Poten=al Applica=ons And Impact 

Generaltheoryforhandling

missingdatainsocialnetworks

Problemonlypar5allyunderstood.

NosoVwareavailableforsta5s5calmodeling

Generalsta5s5caltheoryfortrea5ngmissingdatainasocialnetworkcontext.

Publicly‐availablecodeinR.(GileandHandcock,2010)

Allowsapplica5onofsocialnetworkmodelingtodatasetswithsignificantmissing

data

Rela5onaleventmodels

Basicdyadiceventmodels.

Noexogenousevents.NopublicsoVware.

Muchrichermodelwithexogenousevents,egocentricsupport,mul5pleobserver

accounts,hierarchiesSoVwarepubliclyavailable

(Bugsetal,2010)

Providesageneralframeworkfordynamic

networkmodelingtolargerealis5capplica5ons

Cliquefindingalgorithms

Tooslowforuseinsta5s5calnetwork

modeling

Newlinear‐5mealgorithmforlis5ngallmaximalcliquesin

sparsegraphs(Eppstein,Loffler,Strash,

2010)

Extendsapplicabilityofsta5s5calnetworkmodelingtolargernetworksandmore

complexmodels

Accomplishments

Page 23: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

23Application to Email Data: 200,000 email messages among 3000 individuals over 3 months

(DuBois and Smyth, ACM SIGKDD 2010)

Page 24: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

24

Impact:SoVware

•  RLanguageandEnvironment–  Open‐source,high‐levelenvironmentforsta5s5calcompu5ng

–  Defaultstandardamongresearchsta5s5cians‐increasinglybeingadoptedbyothers

–  Es5mated250kto1millionusers

•  Statnet–  Rlibrariesforanalysisofnetworkdata–  Newcontribu5onsfromthisMURIproject:

•  Missingdata(GileandHandcock,2010)

•  Rela5onaleventmodels(Bugs,2010)

•  Latent‐classmodels(DuBois,2010)

•  Fastclique‐finding(Strash,2010)•  +more……

Page 25: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

25

Impact:Publica5ons

•  Over40peer‐reviewedpublica5ons–  acrosscomputerscience,sta5s5cs,andsocialscience

–  Highvisibility•  Science,Bugs,2009•  Journal of the American Sta3s3cal Associa3on,Schweinberger,inpress•  Annals of Applied Sta3s3cs,GileandHandcock,2010•  Journal of the ACM,daFonsecaandMount,2010•  Journal of Machine Learning Research,Asuncion,Smyth,etc,2009

–  Highlyselec5veconferences•  ACMSIGKDD2010(16%acceptrate)•  NeuralInforma5onProcessing(NIPS)Conference2009(25%accepts)•  IEEEInfocom2010(17.5%accepts)

•  Cross‐pollina5on–  Exposingcomputerscien5ststosta5s5calandsocialnetworkingideas

–  Exposingsocialscien5stsandsta5s5cianstocomputa5onalmodelingideas

Page 26: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

26

Impact:WorkshopsandInvitedTalks

•  2010Poli5calNetworksConference–  WorkshoponNetworkAnalysis

–  PresentedandrunbyBugsandstudentsSpiro,Fitzhugh,Almquist

•  InvitedTalks:ConferencesandWorkshops–  R!2010ConferenceatNIST(Handcock,2010)–  2010SummerSchoolonSocialNetworks(Bugs)–  MiningandLearningwithGraphsWorkshop(Smyth,2010)–  NSF/SFIWorkshoponSta5s5calMethodsfortheAnalysisofNetworkData(Handcock,2009)–  Interna5onalWorkshoponGraph‐Theore5cMethodsinComputerScience(Eppstein,2009)

–  Quan5ta5veMethodsinSocialScience(QMSS)Seminar,Dublin(Almquist.2010)–  +manymore…..

•  InvitedTalks:Universi5es–  Stanford,UCLA,GeorgiaTech,UMass,Brown,etc

Page 27: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

27

Impact:theNextGenera5on

•  Facultyposi5onsatUMass–  RyanActon,KristaGile‐>AsstProfs,partofnewini5a5veinComputa5onalSocial

Science

•  Studentsspeakingatmajorsummerconferences–  SunbeltInterna5onalSocialNetworks(Jasny,Spiro,Fitzhugh,Almquist,DuBois

–  ACMSIGKDDConference(DuBois)

–  Interna5onalConferenceonMachineLearning(Vu)

–  AmericanSociologicalAssocia5onMee5ng(Marcum,Jasny,Spiro,Fitzhugh,Almquist)

•  Bestpaperawardsornomina5ons(Spiro,Hummel)

•  Na5onalfellowships:DuBois(NDSEG),Asuncion(NSF),Navaroli(NDSEG)

Page 28: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

28

…..andtheOldGenera5on

•  CarterBugs–  AmericanSociologicalAssocia5on,LeoA.Goodmanaward,2010

–  highestawardtoyoungmethodologicalresearchersinsocialscience

•  MichaelGoodrich–  ACMFellow,IEEEFellow,2009

•  PadhraicSmyth–  ACMSIGKDDInnova5onAward2009

–  AAAIFellow2010

•  MarkHandcock–  FellowoftheAmericanSta5s5calAssocia5on,2009

Page 29: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

29

WhatNext?

•  “Push”algorithmicadvancesintosta5s5calmodeling–  Willallowustoscaleexis5ngalgorithmstomuchlargerdatasets

•  Developnetworkmodelswithricherrepresenta5onalpower–  Geographicdata,temporalevents,textdata,actorcovariates,heterogeneity,etc

•  Systema5callyevaluateandtestdifferentapproaches–  evaluateabilityofmodelstopredictover5me,toimputemissingvalues,etc

•  Applytheseapproachestohighvisibilityproblemsanddatasets–  E.g.,onlinesocialinterac5onsuchasemail,Facebook,Twiger,blogs

•  MakesoVwarepubliclyavailable

Page 30: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

30

Organiza5onalCollabora5onduringtheKatrinaDisaster

Combined Network plotted by HQ Location

Almquist and Butts

Page 31: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

31SESSION 1:

9:20 Dynamic Egocentric Models for Citation Networks Dave Hunter, Professor, Statistics, Penn State University 9:45 Membership Dimension Maarten Loffler, Postdoctoral Fellow, Computer Science, UC Irvine 10:05 Multilevel Network Models for Classroom Dynamics Chris DuBois, PhD student, Statistics, UC Irvine 10:30 COFFEE BREAK

SESSION 2: 10:45 DISCUSSION: FAST CHANGE-SCORE COMPUTATION IN DYNAMIC GRAPHS Led by David Eppstein and Michael Goodrich, Computer Science, UC Irvine 11:35 Bayesian Meta-Analysis of Network Data via Reference Quantiles Carter Butts, Professor, Social Sciences, UC Irvine

12:00 Break for lunch (lunch for PIs + visitors at the University Club)

Page 32: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

32

1:30 to 2:40 Short Highlight Talks

Computational Issues with Exponential Random Graph Models Mark Handcock, Professor, Statistics, UCLA

Experimental Results on Fast Clique Finding David Eppstein, Professor, Computer Science, UC Irvine

Modeling Rates between Affiliates on Facebook from Sampled Data Emma Spiro, PhD student, Social Sciences, UC Irvine

Modeling Degree Sequences of Undirected Networks with Application to 9/11 Disaster Networks Miruna Petrescu-Prahova, Postdoctoral Fellow, Statistics, University of Washington

Statistical Models for Text and Networks Jimmy Foulds, PhD student, Computer Science, UC Irvine

Analysis of Life History Data Sean Fitzhugh, PhD student, Social Sciences, UC Irvine

Approximate Sampling for Binary Discrete Exponential Families with Fixed Execution Time and Quality Guarantees Carter Butts, Professor, Social Sciences, UC Irvine

Page 33: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

33

2:40 – 3:30: SESSION 3

2:40 Instability, Sensitivity, and Degeneracy of Discrete Exponential Families Michael Schweinberger, Postdoctoral Fellow, Penn State University

3:05 Empirical Analysis of Latent Space Embedding David Mount, Professor, University of Maryland

3:30 COFFEE BREAK

4:00 DISCUSSION: LATENT VARIABLE MODELING OF NETWORK DATA Led by Carter Butts and Padhraic Smyth

4:45 WRAP-UP, CLOSING COMMENTS

5:00 ADJOURN 5:00 ADJOURN

Page 34: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

34

Addi5onalResourcesProjectWebsite:hgp://www.datalab.uci.edu/muri/

SlidesandPostersfromAHM:hgp://www.datalab.uci.edu/muri/june2011/

Publica5ons:hgp://www.datalab.uci.edu/muri/publica5ons.php

SoVware:hgp://csde.washington.edu/statnet/

DataSets:hgp://networkdata.ics.uci.edu/resources.php

Page 35: Scalable Methods for the Analysis of Network‐Based Data · P. Smyth: Networks MURI Meeng, June 3, 2011 1 Scalable Methods for the Analysis of Network‐Based Data MURI Meeng, June

P.Smyth:NetworksMURIMee5ng,June3,2011

35

QUESTIONS?