Some things to talk about
• Social and political polarization
• A cool dynamic network simulation (which we haven’t done yet)
• Statistical cutoffs and p-values (work of Wald, Berger, …)
• Survey weighting and poststratification
Andrew Gelman
Departments of Statistics and Political Science, Columbia University
7 Feb 2009
Also: Tian Zheng, Thomas DiPrete, Julien Teitler, Jiehua Chen, Tyler McCormick, Rozlyn Redd, Juli Simon Thomas, Delia Baldassarri, David Park, Yu-Sung Su, Matt Salganik, Duncan Watts, Sharad Goel
Studying social and political polarization
• Questions from sociology
• Questions from political science
• Sources of data
• Statistical challenges
Questions from sociology
• The “degree distribution”
• Characteristics of “the social network”
• Homophily
• Quantifying segregation
• Knowing and trusting
Questions from political science
• Polarization of Democrats and Republicans
• Polarization of political discourse
• How are people swayed by news media, talk radio, each other, …
• Geographic polarization
• Polarization and the perception of polarization
Sources of data
• Complete data on small social networks (schools, monks, …)
• Very sparse data on large social networks (Framingham, …)
• Complete data on other networks (scientific coauthors, …)
• Other network datasets (email, Facebook, …)
• From random sample surveys
  • Questions about close contacts (GSS 1985/2004, NES 2000)
  • Questions about acquaintances (“How many X’s do you know?”)
Statistical challenges: Misconceptions of others
• Examples
  • Name
  • Disease status
  • Sexual preference
  • Political leanings
• Challenge/opportunity: attributed and perceived attributes
• Appearance vs. reality
• How large is the “footprint” of a group?
Statistical challenges: Learning about small and large groups
• 1500 respondents x 750 acquaintances ≈ 1 million
  • Potential to learn about small groups
  • Potential to learn about people you can’t interview
• Difficulty with large groups
  • For example, “How many Democrats do you know?”
  • #known is too high to quickly estimate
  • Potential solution: look at subnetworks
• “Cube model” (individuals x groups x subnetworks)
  • Need main effects and two-way interactions
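The “How many X’s do you know?” questions can be turned into a rough estimate of a respondent's network size. The sketch below uses the basic scale-up estimator, which is one standard baseline and not necessarily the model behind these slides. The function name, the population size, and the group sizes are all hypothetical, chosen only to make the arithmetic visible.

```python
from typing import Sequence


def scale_up_degree(known_counts: Sequence[float],
                    group_sizes: Sequence[float],
                    population_size: float) -> float:
    """Basic scale-up estimate of one respondent's network size (#known).

    known_counts[k]: how many people in group k the respondent reports knowing.
    group_sizes[k]:  size of group k in the whole population.
    The estimate assumes the respondent knows people in each group in
    proportion to the group's share of the population.
    """
    if len(known_counts) != len(group_sizes):
        raise ValueError("known_counts and group_sizes must have equal length")
    total_group = sum(group_sizes)
    if total_group <= 0:
        raise ValueError("group sizes must sum to a positive number")
    return population_size * sum(known_counts) / total_group


# Hypothetical inputs (made up for illustration, not survey data):
population = 300_000_000
group_sizes = [400_000, 250_000, 850_000]  # people with three given first names
reported = [1, 0, 2]                       # one respondent's answers

degree = scale_up_degree(reported, group_sizes, population)
```

Using several groups at once, as here, averages over the uneven way any single name is spread through the network.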
Statistical challenges: Network structure
• Social network is patterned
  • Sex, age, ethnicity, SES, location
  • Names, occupations, attitudes
• Correct for non-uniform patterns by using a mix of names
• Estimate non-uniform patterns using a conditional probability matrix for ages
• Overdispersion to model unexplained variation
• Can’t do much with triangles, 4-cycles, etc.
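To make “overdispersion” concrete: if every respondent had the same expected count for a group, the counts would be roughly Poisson, with variance close to the mean. If the expected count itself varies from person to person, the variance is larger. The sketch below is a toy simulation under assumed settings (a gamma-distributed rate, a mean of 5, a fixed seed), not the model from the talk; all names and numbers are hypothetical.

```python
import math
import random
from statistics import mean, pvariance


def poisson_draw(rate: float, rng: random.Random) -> int:
    """Draw one Poisson count (Knuth's multiplication method; fine for small rates)."""
    limit = math.exp(-rate)
    count = 0
    running_product = rng.random()
    while running_product > limit:
        count += 1
        running_product *= rng.random()
    return count


def dispersion_index(counts) -> float:
    """Variance-to-mean ratio: about 1 for Poisson data, above 1 if overdispersed."""
    return pvariance(counts) / mean(counts)


rng = random.Random(0)   # fixed seed so the sketch is repeatable
n_respondents = 5000     # assumed sample size
mean_known = 5.0         # assumed average count
shape = 2.0              # assumed gamma shape; smaller means more variation

# Everyone shares the same rate: plain Poisson counts.
poisson_counts = [poisson_draw(mean_known, rng) for _ in range(n_respondents)]

# Each respondent has their own rate, gamma-distributed with the same mean.
mixed_counts = [
    poisson_draw(rng.gammavariate(shape, mean_known / shape), rng)
    for _ in range(n_respondents)
]

poisson_ratio = dispersion_index(poisson_counts)
mixed_ratio = dispersion_index(mixed_counts)
```

Under these assumptions the mixed counts have a theoretical variance-to-mean ratio of 1 + mean_known / shape, so the gap between the two ratios is a simple measure of unexplained variation.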
Statistical challenges: Recall bias
• Some people are easier to recall than others
  • David, Olga, Sharad
  • For some sets of names, can be quantified: Nicole/Christine/Michael
• Sliding definitions
  • Who are your friends?
  • Estimates of average #known range from 300 to 750 to …
  • Estimates of average #trusted range from 1.5 to 15 to 150
Statistical challenges: Returning to the social science questions
• Polarization as political segregation in the social network
• Comparing polarization to perceived polarization
• Answering conjectures such as: People in big cities know more people but trust fewer people
• Getting geography back in the picture
Forming Voting Blocs and Coalitions as a Prisoner's Dilemma: A Possible Theoretical Explanation for Political Instability
Dynamic network model for political coalitions
• Mathematics of coalitions
  • Forming a coalition helps the subgroup (or they wouldn’t do it)
  • But it hurts the general population (negative externality)
• Coalitions are inherently unstable
  • Coalitions of coalitions
  • Opportunistic acts of secession, poaching, and dissolution
• The simulation I want to do:
  • Set up a political setting: “agents” with attributes and locations
  • Payoff function for agents
  • Locally optimal moves
  • Scheduling
  • Implementation
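The claim that a coalition helps its members but hurts everyone else can be checked in a toy setting. The sketch below is not the simulation described above (which the slides say has not been done). It assumes a payoff equal to the probability that a voter's preference decides a majority vote when all preferences are independent coin flips, and it assumes each bloc casts all of its votes for its internal majority. The function name and the nine-voter setup are hypothetical.

```python
from itertools import product
from typing import Dict, List


def decisive_probabilities(blocs: List[List[int]]) -> Dict[int, float]:
    """Probability that flipping each voter's preference flips the outcome.

    blocs is a partition of voter ids. Each bloc votes as a unit, casting
    all of its votes for its internal majority. Bloc sizes and the total
    number of voters must be odd so that ties cannot occur.
    """
    voters = sorted(v for bloc in blocs for v in bloc)
    if len(voters) % 2 == 0 or any(len(bloc) % 2 == 0 for bloc in blocs):
        raise ValueError("total voters and every bloc size must be odd")

    def outcome(prefs: Dict[int, int]) -> int:
        total = 0
        for bloc in blocs:
            bloc_majority = 1 if sum(prefs[v] for v in bloc) > 0 else -1
            total += len(bloc) * bloc_majority
        return 1 if total > 0 else -1

    decisive_counts = {v: 0 for v in voters}
    n_profiles = 0
    # Enumerate every equally likely preference profile.
    for signs in product((-1, 1), repeat=len(voters)):
        prefs = dict(zip(voters, signs))
        base = outcome(prefs)
        n_profiles += 1
        for v in voters:
            flipped = dict(prefs)
            flipped[v] = -prefs[v]
            if outcome(flipped) != base:
                decisive_counts[v] += 1
    return {v: c / n_profiles for v, c in decisive_counts.items()}


# Nine voters acting alone, versus voters 0-2 forming a bloc.
no_blocs = decisive_probabilities([[i] for i in range(9)])
one_bloc = decisive_probabilities([[0, 1, 2]] + [[i] for i in range(3, 9)])
```

With nine voters, each voter is decisive with probability about 0.27 when nobody forms a bloc. Once three voters form a bloc, each member rises to about 0.39 and each outsider falls to about 0.16, so the average across all nine drops. That is the prisoner's-dilemma structure in miniature: the outsiders now have the same incentive to form blocs of their own.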
Statistical cutoffs and p-values
• Setting a cutoff for selecting patterns for further study
• Old problem in statistics: Neyman, Wald, Berger, …
• Also of interest to biologists!
• Some different goals:
  • Finding patterns that are “statistically significant”
  • Classifying into those to study further, and those to set aside
• Mathematical framework: distribution of a “score”
• Solution depends upon:
  • Distribution of the score among “uninteresting” cases
  • Distribution of the score among “interesting” cases
  • Number of uninteresting and interesting cases
  • Cost of follow-up of uninteresting cases
  • Cost of follow-up of interesting cases
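The five ingredients listed above are enough to compute a cutoff once distributions are chosen. The sketch below is one possible formalization, not the analysis from the talk. It assumes normal scores with equal variance in both groups, and it folds the two follow-up costs into a net loss per uninteresting case followed up and a net gain per interesting case followed up. All names and numbers are hypothetical.

```python
import math
from statistics import NormalDist


def optimal_cutoff(mu0: float, mu1: float, sigma: float,
                   n0: float, n1: float,
                   cost_uninteresting: float, gain_interesting: float) -> float:
    """Score above which follow-up has positive expected net value.

    Follow up at score s when
        n1 * f1(s) * gain_interesting > n0 * f0(s) * cost_uninteresting,
    where f0 and f1 are the normal densities of the uninteresting and
    interesting scores. With equal variances this has a closed form.
    """
    if mu1 <= mu0:
        raise ValueError("interesting cases are assumed to score higher")
    log_ratio = math.log((n0 * cost_uninteresting) / (n1 * gain_interesting))
    return (mu0 + mu1) / 2 + sigma ** 2 * log_ratio / (mu1 - mu0)


def expected_net_value(cutoff: float, mu0: float, mu1: float, sigma: float,
                       n0: float, n1: float,
                       cost_uninteresting: float, gain_interesting: float) -> float:
    """Expected total payoff from following up every case scoring above cutoff."""
    hits = n1 * (1 - NormalDist(mu1, sigma).cdf(cutoff))
    false_alarms = n0 * (1 - NormalDist(mu0, sigma).cdf(cutoff))
    return gain_interesting * hits - cost_uninteresting * false_alarms


# Hypothetical setting: 9000 uninteresting and 1000 interesting cases.
settings = dict(mu0=0.0, mu1=2.0, sigma=1.0, n0=9000, n1=1000,
                cost_uninteresting=1.0, gain_interesting=5.0)
cutoff = optimal_cutoff(**settings)
value_at_cutoff = expected_net_value(cutoff, **settings)
```

The cutoff moves up when uninteresting cases are more numerous or more costly to chase, and down when interesting cases are more valuable, which is why a fixed significance level cannot be right for every problem.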
Survey weighting and poststratification
• General framework for adjusting for differences between sample and population
• Population estimate = average over poststratification cells, weighted by cell size
• You might have to model:
  • The survey response
  • Size of poststratification cells
  • Probabilities of selection
• Respondent-driven sampling example:
  • Cells determined by “gregariousness” and “distance”
  • Could approximate correlations using clustering
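The population estimate described above is a weighted average of cell-level estimates, with weights given by the population size of each cell. The sketch below shows only that final averaging step. It assumes the cell estimates and cell sizes are already known, whereas the slides note that both may need to be modeled. The function name and all numbers are hypothetical.

```python
from typing import Sequence


def poststratified_estimate(cell_estimates: Sequence[float],
                            cell_sizes: Sequence[float]) -> float:
    """Average of cell-level estimates, weighted by the given cell sizes."""
    if len(cell_estimates) != len(cell_sizes):
        raise ValueError("need one size per cell")
    total = sum(cell_sizes)
    if total <= 0:
        raise ValueError("cell sizes must sum to a positive number")
    weighted_sum = sum(est * size for est, size in zip(cell_estimates, cell_sizes))
    return weighted_sum / total


# Hypothetical example with three cells (made-up numbers):
cell_estimates = [0.40, 0.55, 0.70]    # estimated response within each cell
population_sizes = [5000, 3000, 2000]  # cell sizes in the population
sample_counts = [100, 300, 600]        # cell sizes in the sample

# Weighting by sample counts reproduces the unadjusted sample mean.
raw_sample_mean = poststratified_estimate(cell_estimates, sample_counts)
# Weighting by population sizes gives the poststratified estimate.
adjusted = poststratified_estimate(cell_estimates, population_sizes)
```

In this made-up example the sample over-represents the third cell, so the unadjusted mean (0.625) sits well above the poststratified estimate (0.505).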