
Page 1: Limits of Learning in Incomplete Networks

Limits of Learning in Incomplete Networks

Timothy [email protected]

In collaboration with


Timothy Sakharov Sahely Bhadra Tina Eliassi-Rad

Supported by NSF CNS-1314603 & NSF IIS-1741197

Page 2

Background: Incomplete Networks

- Network data is often incomplete

- Acquiring more data is often expensive and/or hard

- Research question: Given a networked dataset and limited resources to collect more data, how can you get the most bang for your buck?

Page 3

Two general approaches to network completion

Collect more data

Don't collect more data

Page 4

Two general approaches to network completion

Collect more data

Don't collect more data:

- Assume a network model
- Combine the network model with the incomplete data to get a model of the network structure
- Infer missing data from this model
- [Kim et al. 2011], [Chen et al. 2018]

Page 5

Two general approaches to network completion

Collect more data:

- Estimate statistics from the partially observed network: [Soundarajan et al. 2015], [Soundarajan et al. 2016]
- Utilize an explore-exploit approach: [Pfeiffer III et al. 2014], [Soundarajan et al. 2017], [Murai et al. 2018], [Madhawa et al. 2018], and this work!

Don't collect more data:

- Assume a network model
- Combine the network model with the incomplete data to get a model of the network structure
- Infer missing data from this model
- [Kim et al. 2011], [Chen et al. 2018]

Page 6

Our solution: Network Online Learning (NOL)

- Learn to grow an incomplete network through sequential, optimal queries (to some API)

- Agnostic to both data distributions and sampling method

- Interpretable features that are computable online

- Research question: Given a networked dataset and limited resources to collect more data, how can you get the most bang for your buck?

Page 7

Assumptions

We assume...

- We know the API access model (complete vs. incomplete queries).
- The underlying network is static (probing the same node twice gives no new information).

We do not assume...

- A model of the underlying graph.
- How the initial sample was collected.

Page 8

Inputs:

- An incomplete network

- b : probing budget

- r : reward function


Probe b times to maximize cumulative reward

Output:

Network after b probes

Network Online Learning (NOL)

Example reward function: number of new nodes observed
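The example reward can be made concrete with a small sketch. This is an illustrative toy, not the talk's implementation; the names `probe` and `reward` and the adjacency-dict representation are ours.

```python
# Toy illustration of the example reward: the number of new nodes observed
# when a node is probed. All names here are hypothetical.

def probe(full_graph, u):
    """Query node u against the (hidden) full graph; return its neighbors."""
    return set(full_graph[u])

def reward(observed_nodes, revealed_neighbors):
    """Reward = number of nodes never seen before this probe."""
    return len(revealed_neighbors - observed_nodes)

# Hidden graph as an adjacency dict; we have observed only nodes 0 and 1.
G = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0, 4}, 4: {3}}
observed = {0, 1}
nbrs = probe(G, 0)             # probing node 0 reveals {1, 2, 3}
print(reward(observed, nbrs))  # nodes 2 and 3 are new -> prints 2
```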

Page 9

NOL algorithm


Page 12

NOL algorithm


Observe the current state

Feature vector of node i; regression weights

Page 17


Observe the current state

Choose the next action

Take action, update network, collect reward

Update parameters

NOL algorithm

Online linear regression following Strehl et al., NIPS 2008.
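The parameter update can be pictured as a stochastic gradient step on the squared error between predicted and observed reward. This is a minimal sketch of online linear regression in general, not the specific variant of Strehl et al.; the learning rate and feature values are made up.

```python
# Minimal online linear regression sketch: after probing a node with feature
# vector x and receiving reward r, nudge the weights theta to reduce the
# squared prediction error (theta . x - r)^2. Values are illustrative.

def predict(theta, x):
    return sum(t * f for t, f in zip(theta, x))

def update(theta, x, r, alpha=0.1):
    err = predict(theta, x) - r                    # prediction error
    return [t - alpha * err * f for t, f in zip(theta, x)]

theta = [0.0, 0.0]
x_i = [1.0, 2.0]   # feature vector of the probed node
r = 3.0            # observed reward
for _ in range(200):                               # repeated updates converge
    theta = update(theta, x_i, r)
print(round(predict(theta, x_i), 6))               # approaches 3.0
```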

Page 19

NOL algorithm


1. Observe the current state

2. Choose the next action

3. Take action, update network, collect reward

4. Update parameters

5. Repeat until budget depleted
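Putting steps 1-5 together, a compressed NOL-style loop might look as follows. This is our reconstruction under simplifying assumptions (adjacency-dict graphs, reward = newly observed nodes, a single in-sample-degree feature), not the authors' code.

```python
def features(sample, node):
    # Bias term plus in-sample degree; the full method uses more features.
    return [1.0, float(len(sample[node]))]

def nol(full_graph, sample, budget, alpha=0.05):
    theta = [0.0, 0.0]                        # regression weights
    probed, total_reward = set(), 0
    for _ in range(budget):                   # 5. repeat until budget depleted
        candidates = [v for v in sample if v not in probed]
        if not candidates:
            break
        # 1-2. observe the state; choose the node with max predicted reward
        u = max(candidates,
                key=lambda v: sum(t * f for t, f in zip(theta, features(sample, v))))
        x = features(sample, u)
        # 3. take action: query u's true neighborhood, update the network
        nbrs = set(full_graph[u])
        r = len(nbrs - sample.keys())         # reward: newly observed nodes
        for v in nbrs:
            sample.setdefault(v, set()).add(u)
            sample[u].add(v)
        probed.add(u)
        total_reward += r
        # 4. update parameters with an online linear regression step
        err = sum(t * f for t, f in zip(theta, x)) - r
        theta = [t - alpha * err * f for t, f in zip(theta, x)]
    return sample, total_reward

# Toy run: hidden 5-node graph, starting from the single observed edge (0, 1).
full = {0: {1, 2}, 1: {0, 3}, 2: {0}, 3: {1, 4}, 4: {3}}
sample, total = nol(full, {0: {1}, 1: {0}}, budget=4)
print(sorted(sample), total)                  # all 5 nodes found, reward 3
```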

Page 20

Features


Page 21

Features

[Figure: node i in the observed sample]

Page 22

Feature: In-sample degree

[Figure: in-sample degree of node i]

Page 23

Feature: In-sample clustering coefficient

[Figure: in-sample clustering around node i]

Page 24

Feature: Normalized size of connected component

[Figure: connected component containing node i]

Page 25

Feature: Fraction of probed neighbors

[Figure: probed neighbors of node i]
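Each of the four features above can be computed from the observed sample alone. The following is an illustrative pure-Python sketch over an adjacency-dict sample; the function names are ours and the talk's exact definitions may differ slightly.

```python
# Illustrative computation of the four in-sample features.

def degree(sample, i):
    return len(sample[i])

def clustering(sample, i):
    """Fraction of pairs of i's neighbors that are themselves connected."""
    nbrs = list(sample[i])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a in range(k) for b in range(a + 1, k)
                if nbrs[b] in sample[nbrs[a]])
    return 2.0 * links / (k * (k - 1))

def component_fraction(sample, i):
    """Size of i's connected component, normalized by sample size."""
    seen, stack = {i}, [i]
    while stack:
        for v in sample[stack.pop()]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) / len(sample)

def probed_fraction(sample, probed, i):
    return sum(1 for v in sample[i] if v in probed) / len(sample[i])

# Toy sample: triangle {0,1,2} with pendant node 3, plus a separate pair 4-5.
S = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}, 4: {5}, 5: {4}}
print(degree(S, 0))                        # 3
print(round(clustering(S, 0), 3))          # one of three neighbor pairs -> 0.333
print(round(component_fraction(S, 0), 3))  # 4 of 6 sampled nodes -> 0.667
print(round(probed_fraction(S, {1, 3}, 0), 3))  # 2 of 3 neighbors probed
```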

Page 26

Feature: Lost Reward

[Figure: nodes u, v, and w]

Key idea: The order in which we probe nodes can impact the reward they yield.

Page 27

Feature: Lost Reward

[Figure: partially observed nodes u and v share an unobserved neighbor w, a potential reward for u or v]

Key idea: The order in which we probe nodes can impact the reward they yield.

Page 28

Research question: Given a networked dataset and limited resources to collect more data, how can you get the most bang for your buck?

- Potential bang for your buck depends on network structure!

Page 34

Heterogeneity in network properties creates a learning “spectrum”

Learning not useful: homogeneous degree distribution

Heuristics optimal¹: heterogeneous degree distribution

Potential for learning: heterogeneous degree distribution with rich structure

¹ Avrachenkov et al., IEEE INFOCOM WKSHPS (2014)

Page 35

Experiments


Page 36

Heuristic Baselines

- High degree: probe the unprobed node with maximum degree

- High degree w/ jump: probe the unprobed node with maximum degree; randomly jump with probability p

- Low degree: probe the unprobed node with minimum degree

- Random: probe a node chosen uniformly at random from the unprobed nodes
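For concreteness, the high-degree baseline (with its optional jump) might be sketched like this; the function name and adjacency-dict representation are illustrative, not from the talk.

```python
import random

# Sketch of the "high degree" baseline: probe the unprobed node with maximum
# in-sample degree; the "w/ jump" variant instead jumps to a random unprobed
# node with probability p_jump.

def high_degree_probe(sample, probed, p_jump=0.0, rng=random):
    candidates = [v for v in sample if v not in probed]
    if not candidates:
        return None
    if p_jump and rng.random() < p_jump:
        return rng.choice(candidates)                 # random jump
    return max(candidates, key=lambda v: len(sample[v]))

# Star sample: node 0 has degree 3, everything else degree 1.
S = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(high_degree_probe(S, probed=set()))             # picks the hub, node 0
```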

Page 37

kNN-UCB [Madhawa et al., arXiv preprint, 2018]

- k-Nearest Neighbors Upper Confidence Bound: a nonparametric multi-armed bandit approach

- Choose the node to probe by combining nearest-neighbor reward information (based on Euclidean distance between feature vectors) with the extent of previous exploration of similar actions
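The selection rule can be pictured as a score over candidate feature vectors. This is a loose reconstruction of a kNN-UCB-style acquisition, not Madhawa et al.'s exact formula; the exploration bonus and constant c are our choices.

```python
import math

# kNN-UCB-flavored score: mean reward of the k nearest previously probed
# nodes in feature space, plus an exploration bonus that grows with distance
# (unfamiliar regions of feature space score higher). Constants are ours.

def knn_ucb_score(x, history, k=3, c=1.0):
    """history: list of (feature_vector, reward) pairs from past probes."""
    if not history:
        return float("inf")              # nothing known: force exploration
    nearest = sorted((math.dist(x, xi), ri) for xi, ri in history)[:k]
    mean_r = sum(r for _, r in nearest) / len(nearest)
    bonus = c * nearest[-1][0]           # distance to the k-th neighbor
    return mean_r + bonus

history = [([0.0, 0.0], 1.0), ([1.0, 0.0], 2.0), ([5.0, 5.0], 0.0)]
print(round(knn_ucb_score([1.0, 1.0], history, k=2), 3))  # 1.5 + sqrt(2)
```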


Page 42

Heterogeneity in network properties creates a learning “spectrum”

Learning not useful: Erdos-Renyi model

Heuristics optimal: Barabasi-Albert model

Potential for learning: BTER model¹

¹ C. Seshadhri et al., Physical Review E (2012)
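To make the spectrum's endpoints tangible, here is an illustrative generation (not from the talk) of the two simpler models via their standard mechanisms, contrasting homogeneous with heterogeneous degrees; BTER is omitted for brevity.

```python
import random

def erdos_renyi(n, p, rng):
    """Each possible edge exists independently with probability p."""
    g = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                g[i].add(j)
                g[j].add(i)
    return g

def barabasi_albert(n, m, rng):
    """Each new node attaches m edges preferentially to high-degree nodes."""
    g = {i: set() for i in range(n)}
    targets = list(range(m))        # start from m isolated nodes
    repeated = []                   # nodes repeated proportionally to degree
    for v in range(m, n):
        for t in targets:
            g[v].add(t)
            g[t].add(v)
        repeated.extend(targets)
        repeated.extend([v] * m)
        chosen = set()
        while len(chosen) < m:      # preferential attachment sampling
            chosen.add(rng.choice(repeated))
        targets = list(chosen)
    return g

rng = random.Random(0)
er = erdos_renyi(500, 8 / 499, rng)         # mean degree ~8
ba = barabasi_albert(500, 4, rng)           # mean degree ~8
print(max(len(v) for v in er.values()))     # modest maximum degree
print(max(len(v) for v in ba.values()))     # hubs: much larger maximum degree
```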


Page 47

Results - Learning Spectrum

Learning not useful: Erdos-Renyi model. All methods are indistinguishable; learning doesn't help or hurt!

Heuristics optimal: Barabasi-Albert model. The high-degree heuristic is optimal in the BA model, but NOL learns to probe like the heuristic!

Potential for learning: BTER model.

Page 48

Heterogeneity in network properties creates a learning “spectrum”


Learning Not Useful

Heuristics Optimal

Potential for learning

DBLP: coauthorship network · Cora: citation network · Enron: email communication network

N = 23k, E = 89k, △s = 78.7k · N = 36.7k, E = 184k, △s = 727k · N = 6.7k, E = 17k, △s = 21.6k

Page 49

Heterogeneity in network properties creates a learning “spectrum”


Learning Not Useful

Heuristics Optimal

Potential for learning

DBLP ✓ · Cora ✓ · Enron ✓

Page 50

Summary

- Network Online Learning (NOL) can learn to probe online with minimal assumptions

- Success is tied to properties of the underlying network:

  - Spectrum based on the objective function being maximized (degree distribution in these experiments)

- NOL can learn to behave like the optimal heuristic

- Preliminary experiments suggest some real-world complex networks fall in the "learnable" category

Page 51

References

Kim et al. (2011). The Network Completion Problem: Inferring Missing Nodes and Edges in Networks. SIAM International Conference on Data Mining.

Seshadhri et al. (2012). Community structure and scale-free collections of Erdos-Renyi graphs. Physical Review E.

Avrachenkov et al. (2014). Pay few, influence most: Online myopic network covering. Proceedings of IEEE INFOCOM.

Pfeiffer III et al. (2014). Active Exploration in Networks: Using Probabilistic Relationships for Learning and Inference. Proceedings of the 23rd ACM International Conference on Information and Knowledge Management.

Soundarajan et al. (2015). MaxOutProbe: An Algorithm for Increasing the Size of Partially Observed Networks. The 2015 NIPS Workshop on Networks in the Social and Information Sciences.

Soundarajan et al. (2016). MaxReach: Reducing network incompleteness through node probes. 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM).

Soundarajan et al. (2017). ε-WGX: Adaptive Edge Probing for Enhancing Incomplete Networks. Proceedings of the 2017 ACM Web Science Conference.

Murai et al. (2018). Selective Harvesting over Networks. Data Mining and Knowledge Discovery.

Madhawa et al. (2018). Exploring Partially Observed Networks with Nonparametric Bandits. arXiv:1804.07059.

Chen et al. (2018). Flexible Model Selection for Mechanistic Network Models via Super Learner. arXiv:1804.00237.

Tim [email protected]

Thanks!

Slides at http://eliassi.org/larock_netsci18.pdf