The statistical analysis of personal network data
I. Cross-sectional analysis
II. Dynamic analysis
Miranda Lubbers, Autonomous University of Barcelona
Sociocentric networks
Sociocentric or complete networks consist of the set of relations among the actors of a defined group (e.g., a school class, a firm)
Personal networks
A personal network consists of the set of relations a focal person (ego) has with an unconstrained set of others (alters) and the relations among them.
Egonet, software to aid the collection of personal network data
Information about the respondent (ego; e.g., age, sex, nationality)
Information about the associates (alters) to whom ego is connected (e.g., alter’s age, sex, nationality)
Information about the ego-alter pairs (e.g., closeness, frequency and/or means of contact, how long they have known each other, geographic distance, whether they discuss a certain topic, type of relation such as family, friend, neighbour, workmate)
Information about the relations among alters as perceived by ego (simply whether they are related or not, or strong/weak/no relation)
The statistical analysis of personal versus sociocentric networks: what are the differences?
Whereas sociocentric network researchers often (yet not always) concentrate on a single network, personal network researchers typically investigate a sample of networks (ideally a random, representative sample).
The dependency structure of sociocentric networks is complex, which creates the need for specialized social network software. Personal network researchers have so far hardly used the data on alter-alter relations*, so their data have a simpler dependency structure...
Personal network data have a “multilevel structure”
E.g.: sample of 100 respondents; for each respondent, data of 45 alters were collected, so in total a collection of 4500 alters
For cross-sectional analysis, three types of analysis have been used in past research
Type I: Aggregated analysis
Type II: Disaggregated analysis (not okay, forget about it quickly!)
Type III: Multilevel analysis
Type 1: Aggregated analysis
First, aggregate all information to the ego level (this can be exported directly from Egonet):
Compositional variables (aggregated characteristics of alters or ego-alter relations), e.g., percentage of women, average closeness, average distance between ego and his or her nominees
Then use standard statistical procedures to, e.g.:
Describe the network size and/or composition or compare it across populations
Explain the size and/or composition of the networks (network as a dependent variable) with, for example, regression analysis (e.g., in SPSS, R)
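The aggregation step can be sketched in a few lines. This is only an illustration with made-up records, not the Egonet export itself; the record layout (ego id, alter sex, closeness) and the function name are assumptions.

```python
from collections import defaultdict

# Hypothetical alter-level records: (ego_id, alter_sex, closeness on a 0-4 scale)
alters = [
    ("ego1", "F", 3), ("ego1", "M", 1), ("ego1", "F", 2), ("ego1", "F", 4),
    ("ego2", "M", 2), ("ego2", "M", 0), ("ego2", "F", 4),
]

def aggregate_to_ego_level(rows):
    """Collapse alter-level rows into one record of compositional variables per ego."""
    by_ego = defaultdict(list)
    for ego_id, sex, closeness in rows:
        by_ego[ego_id].append((sex, closeness))
    summary = {}
    for ego_id, ties in by_ego.items():
        n = len(ties)
        summary[ego_id] = {
            "size": n,
            "pct_women": 100.0 * sum(1 for sex, _ in ties if sex == "F") / n,
            "mean_closeness": sum(c for _, c in ties) / n,
        }
    return summary

ego_level = aggregate_to_ego_level(alters)
for ego_id, stats in sorted(ego_level.items()):
    print(ego_id, stats)
```

The resulting one-row-per-ego table is what a standard regression at the aggregate level would take as input.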
Regression analysis
In simple linear regression, the model that describes the relation between a single dependent variable y and a single explanatory variable x is
y_i = β_0 + β_1 x_i + ε_i
β_0 and β_1 are referred to as the model parameters, and ε_i is a probabilistic error term that accounts for the variability in y that cannot be explained by the linear relationship with x.
Regression analysis
Simple linear regression:
y_i = β_0 + β_1 x_i + ε_i
More explanatory variables can be added:
y_i = β_0 + Σ_p β_p x_ip + ε_i
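As a minimal sketch of the simple regression model, the least-squares estimates of β_0 and β_1 can be computed by hand. The data below (years lived in the city, network size) are hypothetical, and in practice one would use SPSS or R as mentioned earlier.

```python
def ols_simple(x, y):
    """Return (b0, b1), the least-squares estimates for y = b0 + b1 * x + error."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Hypothetical ego-level data: one value per respondent
years_in_city = [1, 2, 3, 4, 5]
network_size = [12, 15, 19, 20, 24]

b0, b1 = ols_simple(years_in_city, network_size)
print(f"intercept = {b0:.2f}, slope = {b1:.2f}")
```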
Illustration aggregate analysis
S. G. B. Roberts, R. I. M. Dunbar, T. V. Pollet, T. Kuppens (2009). Exploring variation in active network size: Constraints and ego characteristics. Social Networks, 31, 138-146.
Illustration: explaining personal network size
1. Explaining unrelated network size
Illustration: explaining personal network size
2. Explaining related network size
Regression analysis at the aggregate level…
Is statistically correct provided that you do not make any cross-level inferences (doing so would be an ecological fallacy)
Hypothetical illustration of why you should not make cross-level inferences on the basis of aggregate results…
I ask three persons to name ten friends each
I further ask what the sex of each friend is and how close they feel to each friend on a scale from 0 (not close at all) to 4 (very close).
My question is “Do persons who have many women in their networks feel closer with their network members?”
Tie strength per friend, by sex of friend (F = female, M = male):
Network A: F 1.0, 2.0; M 0.5, 0.5, 1.0, 1.0, 1.0, 1.5, 1.5, 2.0
Network B: F 0.5, 1.0, 1.5, 2.0, 2.0; M 0.5, 1.0, 1.5, 2.0, 2.0
Network C: F 0.5, 0.5, 1.0, 1.0, 1.0, 1.5, 1.5, 2.0; M 3.0, 4.0
Example: Statistical relation at aggregate level cannot be interpreted at tie level
Network A: 20% female, av. tie strength 1.2
Network B: 50% female, av. tie strength 1.4
Network C: 80% female, av. tie strength 1.6
At tie level: 50% female, 50% male, av. tie strength women 1.3, av. tie strength men 1.5
The networks with more women have stronger ties on average, yet ties with women are on average weaker than ties with men.
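The numbers in this hypothetical example can be checked directly. The sketch below uses the tie strengths of the three networks as listed on the slide; only the variable names are my own.

```python
# Tie strengths from the three hypothetical networks, split by sex of the friend
networks = {
    "A": {"F": [1.0, 2.0], "M": [0.5, 0.5, 1.0, 1.0, 1.0, 1.5, 1.5, 2.0]},
    "B": {"F": [0.5, 1.0, 1.5, 2.0, 2.0], "M": [0.5, 1.0, 1.5, 2.0, 2.0]},
    "C": {"F": [0.5, 0.5, 1.0, 1.0, 1.0, 1.5, 1.5, 2.0], "M": [3.0, 4.0]},
}

def mean(values):
    return sum(values) / len(values)

# Aggregate (network) level: percentage of women and average tie strength per network
aggregate = {}
for name, ties in networks.items():
    all_ties = ties["F"] + ties["M"]
    aggregate[name] = (100 * len(ties["F"]) / len(all_ties), mean(all_ties))
    print(f"Network {name}: {aggregate[name][0]:.0f}% female, "
          f"average tie strength {aggregate[name][1]:.1f}")

# Tie level: pool all ties and compare women with men
women = [s for ties in networks.values() for s in ties["F"]]
men = [s for ties in networks.values() for s in ties["M"]]
print(f"Ties with women: {mean(women):.1f}, ties with men: {mean(men):.1f}")
```

At the network level, tie strength rises with the share of women; at the tie level, ties with women are weaker. Both statements are true of the same data, which is why the aggregate result says nothing about individual ties.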
Type 2: Disaggregate analysis
Disaggregated analysis of dyadic relations (e.g., a linear regression analysis on the 4500 alters) is statistically not correct, even though it has been done (e.g., Wellman et al., 1997; Suitor et al., 1997)
Observations of alters are not statistically independent, as is assumed by standard statistical procedures
If the observations of one respondent are correlated, standard errors will be underestimated, and consequently significance will be overestimated
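A rough way to see how large the problem can be is the design effect known from cluster sampling, 1 + (n - 1) × ICC, where n is the number of alters per ego and ICC is the intraclass correlation. It applies to estimating a mean with equal cluster sizes, so take it as an approximation only; the ICC value below is an assumption, not a result from the slides.

```python
import math

def design_effect(cluster_size, icc):
    """Variance inflation for a mean when observations are clustered (equal cluster sizes)."""
    return 1 + (cluster_size - 1) * icc

n_egos, alters_per_ego = 100, 45   # the example sample mentioned earlier
icc = 0.10                         # hypothetical intraclass correlation

deff = design_effect(alters_per_ego, icc)
effective_n = n_egos * alters_per_ego / deff
se_inflation = math.sqrt(deff)

print(f"design effect: {deff:.1f}")
print(f"effective sample size: {effective_n:.0f} instead of {n_egos * alters_per_ego}")
print(f"naive standard errors are too small by a factor of about {se_inflation:.2f}")
```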
Type 3: Multilevel analysis
Multilevel analysis is a generalization of linear regression, where the variance in outcome variables can be analyzed at multiple hierarchical levels. In our case, alters (level 1) are nested within egos / networks (level 2), hence the variance is decomposed into variance between and within networks.
The regression equation y_i = β_0 + β_1 x_i + R_i is now extended to y_ij = β_0j + β_1j x_ij + R_ij, where β_0j = γ_00 + U_0j
Dependent variable: some characteristic of the dyadic relationships (e.g., strength of tie).
Note: Special multilevel models have been developed for discrete dependent variables.
Explanatory variables can be (among others):
characteristics of egos (level 2),
characteristics of alters (level 1),
characteristics of the ego-alter pairs (level 1).
Software: e.g., R, MLwiN, HLM, VARCL
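The decomposition into between-network and within-network variance can be illustrated with simulated data from the empty model y_ij = γ_00 + U_0j + R_ij. This sketch uses simple moment (ANOVA-type) estimators for a balanced design, not the likelihood-based estimation of the packages listed above, and all parameter values are hypothetical.

```python
import random
from statistics import mean, variance

rng = random.Random(42)
n_egos, n_alters = 200, 45
gamma00 = 2.0   # overall mean tie strength (hypothetical)
tau = 0.5       # SD of the ego-level deviations U_0j (hypothetical)
sigma = 1.0     # SD of the tie-level residuals R_ij (hypothetical)

# Simulate tie strengths: alters (level 1) nested within egos (level 2)
data = []
for _ in range(n_egos):
    u0j = rng.gauss(0, tau)
    data.append([gamma00 + u0j + rng.gauss(0, sigma) for _ in range(n_alters)])

# Within-network variance: average of the per-network sample variances
within_var = mean([variance(ties) for ties in data])

# Between-network variance: variance of the network means,
# corrected for the sampling noise that each mean still contains
ego_means = [mean(ties) for ties in data]
between_var = max(variance(ego_means) - within_var / n_alters, 0.0)

icc = between_var / (between_var + within_var)
print(f"within = {within_var:.2f}, between = {between_var:.2f}, ICC = {icc:.2f}")
```

With these settings the true values are 1.0 (within), 0.25 (between) and an intraclass correlation of 0.2, so the estimates should land close to those.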
Type 3: Multilevel analysis
Illustrations of multilevel analysis for personal networks
G. Mollenhorst, B. Völker, H. Flap (2008). Social contexts and personal relationships: The effect of meeting opportunities on similarity for relationships of different strength. Social Networks, 30, 60-68.
Mok, D., Carrasco, J.-A., & Wellman, B. (2009). Does Distance Still Matter in the Age of the Internet? Urban Studies, forthcoming.
The effect of the context where people meet on the amount of similarity between them (Mollenhorst, Völker, Flap)
Illustration: Analysis of the importance of distance for overall contact frequency (Mok, Carrasco & Wellman)
LnDist is the natural logarithm of residential distance between ego and alter, RIMM is a dummy variable indicating whether ego is an immigrant. Bold figures are significant at p < .05,
bold and italic at p < .10.
For a good article about the possibilities of multilevel analysis of personal networks, see:
Van Duijn, M. A. J., Van Busschbach, J. T., & Snijders, T. A. B. (1999). Multilevel analysis of personal networks as dependent variables. Social Networks, 21, 187-209.
In summary, cross-sectional analysis of personal networks...
Unit of analysis: Ties
Existence of ties: -
Content of ties: What predicts the contents of ties? Multilevel analysis
Unit of analysis: Personal networks
Existence of ties: What predicts the size of the network? Regression analysis at aggregate level
Content of ties: What predicts the composition of networks? Regression analysis at aggregate level
... but what about the relationships among alters?
So far, we have only looked at the relationships a person (ego) has with his or her network members (alters)…
e.g., we ask people to nominate 45 others and to report about their relationships with them…
But data can also be collected on the relationships among network members…
... but what about the relations among alters?
Most researchers are interested in alter-alter relations only to say something about the structure of personal networks at the network level:
Compute structural measures at the aggregate level (e.g., density, betweenness centralization, number of cliques)
Predict the structure of the networks in an aggregated analysis using, for example, regression analysis
... but what about the relations among alters?
It may however be interesting to analyze which alters are related (at the tie level)
What predicts transitivity in personal relations? Or, as Louch expressed it, what predicts network integration?
Exponential Random Graph Models (ERGMs)
The class of ERGMs is a class of statistical models for the state of a social network at one time point.
The presence or absence of a tie between any pair of actors in the network is modeled as a function of structural tendencies (e.g., transitivity, popularity) and individual and dyadic covariates (e.g., similarity).
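To give a feeling for what "modeled as a function of structural tendencies" means: in an ERGM, the log odds of a tie, given the rest of the network, equal the parameters times the change in the network statistics that the tie would cause. The sketch below assumes a deliberately simple undirected model with only an edge term and a plain triangle term (not the alternating statistics used in practice), and the parameter values and network are hypothetical.

```python
import math

def conditional_tie_probability(adjacency, i, j, theta_edge, theta_triangle):
    """P(tie between i and j | rest of the graph) for an edge + triangle ERGM.

    Adding the tie i-j adds one edge and one triangle per shared neighbour,
    so the conditional log odds are theta_edge + theta_triangle * shared.
    """
    shared = sum(
        1 for k in range(len(adjacency))
        if k not in (i, j) and adjacency[i][k] and adjacency[j][k]
    )
    log_odds = theta_edge + theta_triangle * shared
    return 1 / (1 + math.exp(-log_odds))

# Hypothetical alter-alter network of four alters (ego left out), undirected
adjacency = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]

# Alters 0 and 1 share one neighbour (alter 2); alters 2 and 3 share none
p_shared = conditional_tie_probability(adjacency, 0, 1, theta_edge=-1.0, theta_triangle=0.5)
p_none = conditional_tie_probability(adjacency, 2, 3, theta_edge=-1.0, theta_triangle=0.5)
print(f"with a shared neighbour: {p_shared:.3f}, without: {p_none:.3f}")
```

A positive triangle parameter makes a tie more likely between alters who already have an acquaintance in common, which is the transitivity tendency referred to above.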
Exponential Random Graph Models (ERGMs)
ERGMs can be estimated in, among others, the software SIENA (up to version 3), statnet, pnet (e.g., in R)
Dependent variable: whether pairs of alters are related or not
Explanatory variables: characteristics of alters, characteristics of the relation alters have with ego, characteristics of the alter-alter pair, endogenous network characteristics such as transitivity (in a meta-analysis, characteristics of ego can be added as well)
Type of analysis: Apply a common ERGM to each network, then run a meta-analysis (cf. Lubbers, 2003; Snijders & Baerveldt, 2003; Lubbers & Snijders, 2007).
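The idea of the second step can be sketched with a plain inverse-variance (fixed-effect) pooling of per-network estimates plus a heterogeneity statistic Q. This is a simplification for illustration and not the procedure of the papers cited above; the estimates and standard errors are hypothetical.

```python
import math

def pool_estimates(estimates, standard_errors):
    """Inverse-variance weighted mean, its standard error, and Cochran's Q."""
    weights = [1 / se ** 2 for se in standard_errors]
    total = sum(weights)
    mu = sum(w * est for w, est in zip(weights, estimates)) / total
    se_mu = math.sqrt(1 / total)
    # Q is large when the per-network estimates differ more than their
    # standard errors can explain (reference: chi-square with k - 1 df)
    q = sum(w * (est - mu) ** 2 for w, est in zip(weights, estimates))
    return mu, se_mu, q

# Hypothetical transitivity estimates from three personal networks
estimates = [0.4, 0.6, 0.5]
standard_errors = [0.1, 0.1, 0.1]

mu, se_mu, q = pool_estimates(estimates, standard_errors)
print(f"pooled estimate = {mu:.2f} (s.e. {se_mu:.3f}), Q = {q:.2f}")
```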
Ego influences parameter estimates strongly…
… so we tend to leave ego out
Example ERGM: Predicting relations among alters in the personal networks of immigrants
Columns per parameter: estimate, s.e., (unlabeled), Q
Alternating 2-stars (degree): -0.17, 0.20, 0.81, 181.74**
Alternating 2-triangles (transitivity): 2.36**, 0.34, 1.36, 233.03**
Alter is Spanish (vs. from country of origin): -0.01, 0.04, 0.10, 53.43**
Alter is a fellow migrant (vs. from country of origin): 0.07, 0.07, 0.36, 86.40**
Two alters have same country of residence and origin: 0.51**, 0.10, 0.34, 40.56**
Two alters have shared group membership: 0.54**, 0.11, 0.44, 95.44**
Ego’s feelings of closeness with alter: 0.05*, 0.02, 0.06, 50.78**
* p < .05, ** p < .01. Conditioned on degree.
In summary, cross-sectional analysis of personal networks...
Unit of analysis: Ego-alter ties
Existence of ties: -
Contents of ties: What predicts the contents of ties? Multilevel analysis
Unit of analysis: Alter-alter ties
Existence of ties: What predicts whether there are ties among alters? ERGM
Contents of ties: What predicts the contents of ties among alters? Social Relation Model
Unit of analysis: Personal networks
Existence of ties: What predicts the size of the network? Regression analysis at aggregate level
Contents of ties: What predicts the composition / structure of networks? Regression analysis at aggregate level
Part II. Dynamic analysis
How do personal networks change over time?
Studies that collect data on personal networks in two or more waves in a panel study
Interest in dynamic analysis
“Networks at one point in time are snapshots, the results of an untraceable history” (Snijders)
E.g., personal communities in Toronto (Wellman et al.)
Changes following a focal life event (individual level)
E.g., transition from high school to university (Degenne & Lebeaux, 2005); childbearing, moving, return to school in midlife (Suitor & Keeton, 1997); retirement (Van Tilburg, 1992); marriage (Kalmijn et al., 2003); divorce (Terhell, Broese Van Groenou, & Van Tilburg, 2007); widowhood (Morgan, Neal, & Carder, 2000); migration (Lubbers, Molina, Lerner, Ávila, Brandes & McCarty, 2009)
Broader studies of social change: Social and cultural changes in countries with dramatic institutional changes
E.g., post-communism in Finland, Russia (Lonkila, 1998), Eastern Germany (Völker & Flap, 1995), Hungary (Angelusz & Tardos, 2001), China (Ruan, Freeman, Dai, Pan, & Zhang, 1997)
Sources of change in (personal) networks
Unreliability due to measurement error
Inherent instability
Systemic change
External change
Leik & Chalkley (1997), Social Networks 19, 63-74
Personal networks are layered
Personal network (± 150)
Close / active network (± 50)
Sympathy group (± 15)
Support clique (± 5)
Dependent variables in dynamic personal network studies
Level: Ego-alter ties
Existence of ties (dichotomous): Persistence of ties with alters
Contents of ties (valued): Changing contents of ties with alters
Level: Networks
Existence of ties (dichotomous): Expansion / contraction of networks
Contents of ties (valued): Changing composition of networks
Typology: Feld, Suitor, & Gartner Hoegh, 2007, Field Methods, 19, 218-236.
Type 1: Persistence of ties with alters across time
Dependent variable: whether a tie persists or not to a subsequent time (dichotomous)
Explanatory variables:
characteristics of ego at t1 (gender, job situation)
changes in characteristics of ego between t1 and t2 (e.g., change in marital status)
characteristics of alter at t1 (e.g., educational level)
characteristics of the ego-alter pair at t1 (e.g., tie strength)
cross-level interactions (e.g., ego’s marital status × kin)
Type of analysis: Logistic multilevel analysis (e.g., MLwiN, MIXNO)
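How the dichotomous dependent variable might be built from two waves of data is sketched below. The alter names are made up, and matching alters across waves by name is an assumption; in real data this matching needs care.

```python
# Hypothetical alters named by one ego in two waves of a panel study
wave1 = {"Ana", "Ben", "Carla", "Dmitri", "Eva"}
wave2 = {"Ana", "Carla", "Eva", "Farid"}

# One dichotomous outcome per wave-1 tie: 1 if the alter is named again at wave 2
persists = {alter: int(alter in wave2) for alter in sorted(wave1)}

persistence_rate = sum(persists.values()) / len(wave1)
size_change = len(wave2) - len(wave1)   # network-level change in size
new_alters = wave2 - wave1              # ties that appear only at wave 2

print(persists)
print(f"persistence rate: {persistence_rate:.0%}, size change: {size_change:+d}, new: {sorted(new_alters)}")
```

The `persists` values, stacked over all egos, form the level-1 outcome of the logistic multilevel analysis, with ego as the level-2 unit.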
Type 1: Persistence of ties with alters across time
Logistic regression is used to predict the log odds that a tie persists over time (log odds = log(p / q), where p is the probability that the tie persists and q = 1 - p).
Logistic regression is in effect a linear regression model with the log odds as the response variable.
The coefficients B in a logistic regression model are expressed in terms of the log odds: a unit increase in the explanatory variable x_1 adds β_1 to the log odds of having a tie, that is, it multiplies the odds by e^(β_1)
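To make the log-odds scale concrete: adding β_1 to the log odds is the same as multiplying the odds by e^(β_1). A small numerical sketch, with coefficients that are hypothetical and not taken from the slides:

```python
import math

def odds(b0, b1, x):
    """Odds p / (1 - p) implied by the log-odds model b0 + b1 * x."""
    return math.exp(b0 + b1 * x)

def probability(b0, b1, x):
    """Probability p recovered from the odds."""
    o = odds(b0, b1, x)
    return o / (1 + o)

b0, b1 = -1.0, 0.5  # hypothetical coefficients

for x in (0, 1, 2):
    print(f"x={x}: log odds={b0 + b1 * x:+.2f}, "
          f"odds={odds(b0, b1, x):.3f}, p={probability(b0, b1, x):.3f}")

# Each unit step in x multiplies the odds by the same factor exp(b1)
ratio = odds(b0, b1, 1) / odds(b0, b1, 0)
print(f"odds ratio per unit of x: {ratio:.3f}")
```

Note that the probability itself does not change by a constant amount per unit of x; only the odds change by a constant factor.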
Illustration type 1: Explaining persistence of ties for immigrants
Fixed effects: B, SE (B)
Constant: -3.256**, 0.520
Ego’s length of residence in Spain: 0.192, 0.109
Personal network density: -3.251*, 1.380
Ego’s frequency of contact with alter: 0.323**, 0.048
Ego’s emotional closeness with alter: 0.508**, 0.073
Alter is Spanish: 0.915, 0.513
Alter is a fellow migrant: -0.626**, 0.227
Alter is a transnational: -0.498*, 0.235
Alter’s degree centrality: 0.073**, 0.014
Ego’s length of residence × alter is Spanish: -0.365**, 0.122
* p < .05, ** p < .01. Excluded: Sex, employment status, marital status, recent visits to country of origin, changes in employment & marital status, tie duration, kin
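One way to read such a table is to convert each B into an odds ratio and to compare B with its standard error. The sketch below uses the B and SE values from the table above; the normal-approximation (Wald) test is my assumption about how significance might be judged, not necessarily the test used in the original analysis.

```python
import math

# (B, SE) pairs copied from the table above
coefficients = {
    "Constant": (-3.256, 0.520),
    "Ego's length of residence in Spain": (0.192, 0.109),
    "Personal network density": (-3.251, 1.380),
    "Ego's frequency of contact with alter": (0.323, 0.048),
    "Ego's emotional closeness with alter": (0.508, 0.073),
    "Alter is Spanish": (0.915, 0.513),
    "Alter is a fellow migrant": (-0.626, 0.227),
    "Alter is a transnational": (-0.498, 0.235),
    "Alter's degree centrality": (0.073, 0.014),
    "Ego's length of residence x alter is Spanish": (-0.365, 0.122),
}

def stars(z):
    """Two-sided significance stars under a standard normal reference."""
    if abs(z) >= 2.576:
        return "**"   # p < .01
    if abs(z) >= 1.960:
        return "*"    # p < .05
    return ""

results = {}
for name, (b, se) in coefficients.items():
    z = b / se
    results[name] = (math.exp(b), z, stars(z))
    print(f"{name}: odds ratio {math.exp(b):.2f}, z = {z:.2f}{stars(z)}")
```

For example, one point more on emotional closeness multiplies the odds that the tie persists by about 1.66, holding the other variables constant.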
Type 2: Changes in characteristics of persistent ties across time
Dependent variable: change in some characteristic of the relationship (e.g., change in strength of tie); or the characteristic at t2, with the same characteristic at t1 as covariate (auto-correlation approach)
Explanatory variables:
characteristics of ego at t1 (gender, job situation)
changes in characteristics of ego between t1 and t2 (e.g., change in marital status)
characteristics of alter at t1 (e.g., educational level)
characteristics of the ego-alter pair at t1 (e.g., tie strength)
cross-level interactions (e.g., ego’s marital status × kin)
Type of analysis: Multilevel analysis
Example
Change in contact frequency (visits and telephone calls) after an important life event
Two time points: shortly after the life event took place and four years later
Van Duijn, M. A. J., Van Busschbach, J. T., & Snijders, T. A. B. (1999).
Type 3: Changes in the size of the network across time
Dependent variable: change in number of ties in the personal network
Explanatory variables:
characteristics of ego at t1 (gender, job situation)
changes in characteristics of ego between t1 and t2 (e.g., change in marital status)
characteristics of the set of alters at t1
Type of analysis: Regression analysis at the aggregate level
Illustration of the analysis of the stability of personal networks over time (East York studies, Wellman et al.)
Multiple regression predicting network turnover (n = 33)
Type 4: Changes in overall network characteristics across time
Dependent variable: change in compositional or structural variable (e.g., percentage of alters with higher education, density of the network)
Explanatory variables, e.g.:
Characteristics of ego at t1
Characteristics of the network at t1
Type of analysis: Regression analysis at the aggregate level
Dynamic personal network analysis: More than two observations
Add an extra level to the analysis representing the observation:
One-level models become two-level models
Two-level models become three-level models
Dynamic personal network analysis: More than two observations
Example of type 2 analysis with multiple observations: Changes in contact after widowhood
Guiaux, M., van Tilburg, T., & Broese van Groenou, M. (2007). Changes in contact and support exchange in personal networks after widowhood. Personal Relationships, 14, 457-473
More than two observations: example of alternative way (type 3 analysis)
E. L. Terhell, M. I. Broese van Groenou & T. van Tilburg (2004). Network dynamics in the long-term period after divorce. Journal of Social and Personal Relationships, 21, 719-738
More than two observations: example of alternative way (type 3 analysis) – cont´d
See, for example, the chapter on longitudinal data in this book: T. A. B. Snijders & R. J. Bosker (1999). Multilevel analysis. An introduction to basic and advanced multilevel modeling. London: Sage Publications.
In summary, dynamic analysis of personal networks…
Level: Ego-alter ties
Existence of ties (dichotomous): Persistence of ties with alters. Logistic multilevel analysis
Contents of ties (valued): Changing contents of ties with alters. Multilevel analysis
Level: Networks
Existence of ties (dichotomous): Expansion / contraction of networks. Regression analysis at the aggregate level
Contents of ties (valued): Changing composition of networks. Regression analysis at the aggregate level
... but what about the dynamics of alter-alter relations? … ??
An example of a changing personal network: Time 1
Node color: Stable alters are dark blue; temporal alters light blue
Edge color: Relations among stable alters are dark blue; among / with temporal alters light blue
Node size: Ego’s closeness with alter
Labels: Spanish, Fellow Migrants, Originals, TransNationals
An example of a changing personal network
Dependent variables in dynamic personal network studies: Composition and structure
Units: Ego-alter ties
Existence of ties (dichotomous): Persistence of ties with alters
Contents of ties (valued): Changing contents of ties with alters
Units: Alter-alter ties
Existence of ties (dichotomous): Formation / decay of ties among alters
Contents of ties (valued): Changing contents of ties among alters
Units: Networks
Existence of ties (dichotomous): Expansion / contraction of networks + changing structure
Contents of ties (valued): Changing composition of networks
Type 5: Changes in ties among alters across time
Dependent variable: whether alters make new ties or break existing ties with other alters across time
Independent variables: characteristics of alters, characteristics of the relation alters have with ego, characteristics of the alter-alter pair, endogenous network characteristics such as transitivity (in a meta-analysis, characteristics of ego can be added as well)
Type of analysis: Apply a common SIENA model to each network (leaving ego out), then run a meta-analysis (cf. Lubbers, 2003; Snijders & Baerveldt, 2003; Lubbers & Snijders, 2007). A multilevel version of SIENA is on the agenda.
Just a few thoughts about the use of SIENA for personal networks
Ego influences parameter estimates considerably; therefore, ego should be left out or, alternatively, his or her relations can be coded as structural ones (to model that ego is by definition related to everyone else)
As ego reports about the relationships between his or her alters, relations tend to be symmetric, so use the non-directed model type in SIENA
Smaller networks or networks that have only a few changes per network (fewer than 40) can be combined into one or multiple multigroup project(s)
Example: Predicting the changes in ties among alters in immigrant networks
Columns per parameter: ^μ, s.e., (unlabeled), Q
Rate: 6.83*, 0.74, 2.48, 86.51*
Degree: -0.94*, 0.30, 1.49, 320.08*
Degree-related popularity (sqrt): -0.20*, 0.02, 0.0, 259.01*
Transitivity: 0.48*, 0.12, 0.75, 1371.63*
Alter is Spanish: 0.29, 0.16, 0.59, 66.81*
Alter is a fellow migrant: 0.57*, 0.13, 0.50, 155.07*
Same country residence / origin: 0.69*, 0.05, 0.0, 126.39*
Shared group membership: 0.73*, 0.05, 0.0, 79.74*
Closeness alter: 0.18*, 0.03, 0.0, 139.23*
Closeness alter 1 × alter 2: 0.01, 0.02, 0.0, 56.45
* p < .01. N = 44 respondents
In summary, dynamic analysis of personal networks…
Level: Ego-alter ties
Existence of ties (dichotomous): Persistence of ties with alters. Logistic multilevel analysis
Contents of ties (valued): Changing contents of ties with alters. Multilevel analysis
Level: Alter-alter ties
Existence of ties (dichotomous): Formation / decay of ties among alters. SIENA
Contents of ties (valued): Changing contents of ties among alters. SIENA valued data
Level: Networks
Existence of ties (dichotomous): Expansion / contraction of networks. Regression analysis at the aggregate level
Contents of ties (valued): Changing composition of networks. Regression analysis at the aggregate level
Conclusion
Multiple statistical methods for personal network research, depending on your research interest
Combining several methods probably gives the greatest insight into your data
Thanks!
My e-mail address: [email protected]