
Searching for “Phases” in Complex Simulation Output using Evolutionary Knowledge Discovery Techniques

Bruce Edmonds and other SCID Project Members

Evidence-Led Specification

Common-Sense Entities

Secondary Analysis of Simulation Output

Initial party preference inherited: Party preference can be linked to learning from parents (e.g. Verba, Schlozman et al. 2005).

People vote out of habit: Going to the polls in one election will lead to a greater likelihood of returning to the polls in a subsequent election (e.g. Gerber, Green et al. 2003).

Voting is a social norm: Civic duty is an important rationale for individual-level turnout (e.g. Riker and Ordeshook, 1968).

People share the political views of their greater networks: Probability of agreement within a network depends on the distribution of political opinion within one’s network (autoregressive networks) (e.g. Huckfeldt, Johnson, and Sprague, 2004).

Electors can be mobilised to vote by family, friends and political parties: Household members, friends and political parties will ask people to vote on election day (e.g. Cutts and Fieldhouse, 2009).

There are high amounts of homophily in social networks: Individuals have more contact with similar people (e.g. McPherson, Smith-Lovin et al. 2001).

Education increases the level of political interest: Exposure to (political) information increases when pursuing higher education (e.g. Lewis-Beck, 2008).

Political experts are more influential within political discussion networks: People will tend to listen to people they believe are political experts (those who have higher levels of political interest/involvement) (e.g. Huckfeldt, 2001).

Satisfaction with the outcome of an election increases future turnout: Positive reinforcement from voting will lead to further voting (e.g. Bendor, Diermeier and Ting, 2003).

Voting can be hindered by personal shocks: The birth of a child disturbs habit (Plutzer, 2002).

Voting varies with age: Declining health, mobility, and energy levels impede voting (e.g. Strate et al. 1989).

Discuss-politics-with person-23 blue expert=false neighbour-network year=10 month=3

Lots-family-discussions year=10 month=2

Etc.

Memory

Level-of-Political-Interest

Age

Ethnicity

Class

Activities

A Small District

A Household

An Agent’s Memory of Events

Etc.

Given that models adequate to many social phenomena will necessarily be highly complex, we are left with the need to understand them. The approach here is to construct relevant but complex simulation models first (Data Integration Models) and then try to model these with simpler models.

Here, to aid in this search, we use evolutionary techniques to look for hypotheses about the model behaviour. The search is not over the whole parameter “space”; rather, it aims to identify clusters where local patterns hold, maybe akin to the “phases” found in some physical systems. These might suggest context-dependent rules for a simpler model, or summaries of the complex model behaviour to use in the (relative) validation of simpler models. This is achieved using a locally evaluated Genetic Programming algorithm which simultaneously develops arithmetic predictors of a target output (voter turnout) and their scope.
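The poster does not spell out how local evaluation works, so the following is only a minimal sketch of one possible reading: a candidate arithmetic predictor is scored solely on the runs that fall within its scope, here assumed to be a disc around a centre in a two-parameter space. The function `local_error`, the record keys, and the toy data are all hypothetical and not taken from the SCID model.

```python
import math

def local_error(predictor, centre, radius, records, x_key, y_key, target="t"):
    """Mean absolute error of `predictor` over the records whose (x, y)
    position lies within `radius` of `centre`. Returns None when the
    scope contains no records (assumed convention for an empty scope)."""
    errors = []
    for r in records:
        if math.dist((r[x_key], r[y_key]), centre) <= radius:
            errors.append(abs(predictor(r) - r[target]))
    return sum(errors) / len(errors) if errors else None

# Hypothetical toy data: two runs in one region, one run far away.
records = [
    {"p1": 0.1, "p2": 0.1, "m1": 0.2, "t": 0.4},
    {"p1": 0.2, "p2": 0.1, "m1": 0.3, "t": 0.6},
    {"p1": 0.9, "p2": 0.9, "m1": 0.5, "t": 0.1},
]

def predictor(r):
    """Hypothetical candidate expression: t is predicted as 2 * m1."""
    return 2.0 * r["m1"]

# The same expression fits well in one region and poorly in another.
err_local = local_error(predictor, (0.15, 0.1), 0.2, records, "p1", "p2")
err_far = local_error(predictor, (0.9, 0.9), 0.2, records, "p1", "p2")
```

Under this reading, an expression that is poor globally can still survive where it fits locally, which is what would allow distinct clusters of candidate “phases” to emerge.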

Comparison with evidence is facilitated if the entities in the simulation correspond to entities that are observed in a naturalistic manner.

The specification of simulation rules and agent behaviour is informed by available evidence (as much as possible). Some examples are listed below.

What Next? The output (clusters and expressions) suggests hypotheses that can then: (a) be checked using specific simulation experiments and standard statistical tests; (b) be explored in simpler and more abstract models (in particular, to capture any significant “phase” changes that these indicate).

The model is run many times (7000 in this case), each time using randomly selected parameter values (p1, p2, …). Many different measures of the outcomes are recorded from different layers of the model (m1, m2, …, including the predicted variable, t). This provides a rich data set for the secondary analysis.
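The sampling procedure can be sketched as follows. This is an illustration under stated assumptions rather than the project's code: parameters are assumed to be drawn uniformly from fixed ranges, and `toy_model` is a hypothetical stand-in for the complex simulation, returning one measure `m1` and the target `t`.

```python
import random

def sample_parameters(ranges, rng):
    """Draw one value per parameter, uniformly from its (low, high) range.
    Uniform sampling is an assumption; the source only says 'randomly'."""
    return {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}

def toy_model(params, rng):
    """Hypothetical stand-in for the simulation: returns recorded measures."""
    noise = rng.gauss(0.0, 0.05)
    turnout = 0.5 + 0.3 * params["p1"] - 0.2 * params["p2"] + noise
    return {"m1": params["p1"] * params["p2"], "t": turnout}

def run_sweep(n_runs, ranges, seed=0):
    """Run the model n_runs times, storing parameters and measures together."""
    rng = random.Random(seed)
    records = []
    for _ in range(n_runs):
        params = sample_parameters(ranges, rng)
        measures = toy_model(params, rng)
        records.append({**params, **measures})
    return records

ranges = {"p1": (0.0, 1.0), "p2": (0.0, 1.0)}  # hypothetical ranges
data = run_sweep(7000, ranges)
```

Each record keeps the sampled parameters alongside the recorded measures, so later analysis can relate any measure to any position in the parameter space.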

This data is then distributed over a space according to the values of a couple of the parameters (the grey background patchwork indicates the density of this data in the space).
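A density patchwork of this kind could be computed along the following lines. This is a rough sketch, assuming a regular grid of bins over two parameters that each range from 0 to 1; the key names `emigration_rate` and `move_nearby`, the bin count, and the three example records are hypothetical.

```python
from collections import Counter

def density_grid(records, x_key, y_key, bins=10,
                 x_range=(0.0, 1.0), y_range=(0.0, 1.0)):
    """Count how many records fall into each cell of a bins x bins grid
    laid over the two chosen parameters."""
    def bin_index(value, low, high):
        frac = (value - low) / (high - low)
        # Clamp so that values on or beyond the edges land in a valid cell.
        return min(bins - 1, max(0, int(frac * bins)))

    counts = Counter()
    for r in records:
        cell = (bin_index(r[x_key], *x_range), bin_index(r[y_key], *y_range))
        counts[cell] += 1
    return counts

# Hypothetical runs positioned by two of their parameter values.
records = [
    {"emigration_rate": 0.05, "move_nearby": 0.95},
    {"emigration_rate": 0.07, "move_nearby": 0.91},
    {"emigration_rate": 0.55, "move_nearby": 0.15},
]
grid = density_grid(records, "emigration_rate", "move_nearby")
```

Cells with higher counts would be drawn darker; empty cells do not appear in the counter and read as zero.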

Figure axes: Emigration Rate and Propensity for Moving Nearby