synthesizing agents and relationships for land use / transportation modelling
David Pritchard, Civil Engineering, University of Toronto
September 12, 2008
Synthesizing Agents and Relationships for Land Use / Transportation Modelling
2
Lecture Outline
● Introduction
● Previous Work
● Data
● New Methods
● Results
3
Introduction
● How would land use, transportation patterns and emissions react to...
  ● High congestion charge?
  ● Greenbelt policy?
  ● “Do nothing” while population grows
  ● Major transportation projects
● Major extrapolations from current behaviour
● Too hard to predict conventionally
4
Introduction
Traditional 4-stage
5
Introduction
Integrated Land Use/Transportation Environment (ILUTE) model
6
Introduction
● We can’t build such a complicated model using conventional methods
● Instead, the preferred approach is a microsimulation model
● What is microsimulation?
7
Introduction
Conventional Model
Simulation Model
8
Introduction
● Microsimulation = Simulation + Agents
● Models the state of agents
● Combined behaviour of agents yields system state
● 1. Begin with initial population in start year
● 2. Update population, year by year
  ● age persons, change family structures
  ● change jobs, move homes
  ● use this to predict annual travel patterns
● 3. Obtain travel patterns in forecast year
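The three numbered steps above can be sketched as a simple loop. This is only an illustration in Python, not the ILUTE model: the `Person` class, the `job_change_prob` rule, and the three sample agents are all invented for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class Person:
    age: int
    employed: bool


def simulate(population, start_year, forecast_year, job_change_prob=0.1, seed=0):
    """Step 2 of the outline: update every agent once per simulated year."""
    rng = random.Random(seed)
    for _year in range(start_year, forecast_year):
        for person in population:
            person.age += 1  # age persons
            if rng.random() < job_change_prob:  # hypothetical job-change rule
                person.employed = not person.employed
    return population


def system_state(population):
    """The system state is an aggregate over the individual agents' states."""
    return {
        "persons": len(population),
        "employed": sum(p.employed for p in population),
        "mean_age": sum(p.age for p in population) / len(population),
    }


# Step 1: initial population in the start year (made-up agents).
population = [Person(30, True), Person(40, False), Person(50, True)]
# Steps 2 and 3: advance to the forecast year and read off the system state.
state = system_state(simulate(population, 1986, 1991))
```

A real model would replace the coin-flip rule with behavioural sub-models; the point here is only that the forecast is an aggregate over agents that were updated one year at a time.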
9
Introduction
● Need an initial population in the start year
● List of agents and their attributes, e.g.,
  ● Number of persons, and their ages
  ● Number of vehicles
  ● Type of dwelling
  ● etc.
● But the complete list is unknown
● “Population Synthesis” used instead
  ● Use known data to create initial agents
  ● Result has known statistical properties
  ● Best estimate from limited data
10
Introduction
My results:
● Improved method for population synthesis
  ● Allows more attributes for each agent
● New method for relationship synthesis
  ● Allows correct set of agents and correct set of relationships
● Created a synthetic population for ILUTE
  ● Persons, families, households and dwellings
  ● Complete 1986 population for GTHA
11
Previous Work
● Two representations of a set of agents
  ● List of agents and their attributes (as categories)
  ● Contingency table
    ● One cell for each combination of attributes
    ● Cell contains count of number of agents
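The two representations carry the same information and can be converted back and forth. A minimal Python sketch, using an invented four-agent list with hypothetical `age` and `sex` categories:

```python
from collections import Counter

# List representation: one record per agent (made-up data).
agents = [
    {"age": "18-64", "sex": "F"},
    {"age": "18-64", "sex": "F"},
    {"age": "18-64", "sex": "M"},
    {"age": "65+", "sex": "M"},
]
attributes = ["age", "sex"]


def to_contingency_table(agents, attributes):
    """One cell per attribute combination; the cell holds a count of agents."""
    return Counter(tuple(a[attr] for attr in attributes) for a in agents)


def to_agent_list(table, attributes):
    """Expand each cell back into that many identical agent records."""
    result = []
    for cell, count in table.items():
        result.extend(dict(zip(attributes, cell)) for _ in range(count))
    return result


table = to_contingency_table(agents, attributes)
round_trip = to_agent_list(table, attributes)
```

Note that the `Counter` stores only the combinations that actually occur, whereas a full array would also hold the empty cells such as `("65+", "F")`.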
12
Previous Work
● Data limitations
  ● Patchwork of partial data
  ● Mostly, we have one-way margins
    ● Breakdown of a single attribute into a few categories
● Example: look at how we can use one-way margins
13
Previous Work
14
Previous Work
Iterative Proportional Fitting
15
Previous Work
Iterative Proportional Fitting
16
Previous Work
● Iterative Proportional Fitting
  ● e.g., “Biproportional Updating” of O/D tables
● Exactly satisfies target margins
● Also minimizes discrimination information relative to source population
● Information theory: maximum entropy
  ● Resulting PDF satisfies the constraints without assuming any information we do not possess
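For a two-way table with two one-way target margins, Iterative Proportional Fitting alternates between rescaling rows and rescaling columns. The sketch below is a plain-Python illustration; the seed table and the targets are made-up numbers, and the function name `ipf_2d` is not from the talk.

```python
def ipf_2d(seed, row_targets, col_targets, iterations=100, tol=1e-10):
    """Scale a seed table until its row and column sums match the targets."""
    table = [row[:] for row in seed]  # work on a copy of the source table
    for _ in range(iterations):
        # Fit the row margin: scale each row to its target total.
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:
                table[i] = [v * target / total for v in table[i]]
        # Fit the column margin: scale each column to its target total.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                for row in table:
                    row[j] *= target / total
        # Columns are now exact; stop once the rows are also close enough.
        error = max(abs(sum(table[i]) - t) for i, t in enumerate(row_targets))
        if error < tol:
            break
    return table


seed = [[1.0, 2.0], [3.0, 4.0]]  # source sample counts (invented)
fitted = ipf_2d(seed, row_targets=[40.0, 60.0], col_targets=[30.0, 70.0])
```

Because every step only multiplies whole rows or whole columns, the cross-product ratio of the seed is carried through unchanged, which is one way to see that the fitted table stays as close as possible to the source while meeting the margins.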
17
Previous Work
Many options for margins in 3D
18
Previous Work
● Beckman, Baggerly & McKay (1996)
● State-of-the-art application of IPF for census
● Geography attribute gets special treatment
  ● Due to nature of data in PUMS and census tables
  ● Two approaches: zone-by-zone, or all zones at once
● Treats final table as a PMF
  ● Monte Carlo draws used to integerize
  ● Hurts fit to target margins
● Limited number of attributes
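The Monte Carlo integerization step can be illustrated as drawing whole agents from the fitted table, treated as a probability mass function. This is a generic Python sketch rather than Beckman et al.'s code; the fitted weights and the function name are invented.

```python
import random


def monte_carlo_integerize(table, n_agents, seed=0):
    """Draw n_agents cells with probability proportional to the fitted weight."""
    rng = random.Random(seed)
    cells = list(table)
    weights = [table[c] for c in cells]
    draws = rng.choices(cells, weights=weights, k=n_agents)
    counts = {c: 0 for c in cells}
    for cell in draws:
        counts[cell] += 1
    return counts


# Fractional cell weights, as IPF would produce (made-up values).
fitted = {("18-64", "F"): 12.4, ("18-64", "M"): 11.1, ("65+", "F"): 3.5, ("65+", "M"): 0.0}
counts = monte_carlo_integerize(fitted, n_agents=27)
```

The integer counts always add up to the requested number of agents, but the individual margins only match the targets on average, which is the loss of fit noted above.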
19
Previous Work
● Williamson, Birkin and Rees (1998)
● Not IPF: “Combinatorial Optimisation”
● List-based, instead of tables
● Pros:
  ● good fit to target margins
  ● may handle more attributes
● Cons:
  ● no guarantees about relationship with source sample
  ● not entropy maximizing
  ● slow
20
Data
● Summary Tables
  ● Usually one attribute, by zone (2D margin)
  ● Contingency table
  ● Large sample: 20% or 100%
  ● Sometimes 2-3 attributes by zone
  ● Used as Target Margins
● Public Use Microdata Sample (PUMS)
  ● List; almost all attributes, except zones
  ● Small sample (1-2%)
  ● Canada: defined for each large Census Metropolitan Area (CMA)
  ● Used as Source Sample
21
Data
22
Data
23
Data
24
Data
● Canadian Census includes three PUMS
  ● Persons
  ● Census families
  ● Households & Dwellings
● Also summary tables related to each
25
New Methods: Sparsity
● Beckman et al.’s approach doesn’t work well with many attributes
  ● Computation becomes hard
  ● Huge memory requirement
  ● Slow
● Thirteen attributes on family agent:
  ● Beckman Zone-by-Zone needs 1.4 GB memory
  ● Beckman Multizone needs 1,036 GB memory
26
New Methods: Sparsity
Number of cells in multiway table grows exponentially with number of attributes (dimensions)
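The exponential growth is easy to check with arithmetic: the cell count of a complete table is the product of the category counts of its attributes. The sketch below assumes, purely for illustration, five categories per attribute and 8 bytes per cell; these are not the category counts used in the talk.

```python
import math


def table_cells(category_counts):
    """Cells in a complete multiway table: product of the category counts."""
    return math.prod(category_counts)


def dense_size_gb(category_counts, bytes_per_cell=8):
    """Memory for a complete array, assuming 8-byte cells (an assumption)."""
    return table_cells(category_counts) * bytes_per_cell / 1e9


# Hypothetical attributes with 5 categories each.
for n_attributes in (3, 8, 13):
    counts = [5] * n_attributes
    print(n_attributes, table_cells(counts), round(dense_size_gb(counts), 3))
```

Each extra attribute multiplies the table size by its number of categories, so the cell count quickly exceeds the number of records in any sample.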
27
New Methods: Sparsity
28
New Methods: Sparsity
● Large number of bins
● Most bins are zero
● Number of bins is larger than sample!
29
New Methods: Sparsity
● Is it meaningful to use many attributes?
  ● Tentatively, yes
  ● Not a meaningful 13-way distribution
  ● But, a link between many statistically valid low-order distributions (e.g., 3-way)
● If acceptable, can we do better than standard IPF?
  ● Yes: use a sparse data structure instead of a complete array to represent table
  ● Store only non-zero cells in table
30
New Methods: Sparsity
● Same representation as Williamson’s “Combinatorial Optimisation”
● But, uses IPF algorithm
  ● Maximum entropy guarantee; fast
● Can implement either zone-by-zone or multizone IPF using sparse data structure
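One way to picture IPF over a sparse table is a dictionary that holds only the non-zero cells, with each margin fit done in two passes over those cells. This is a Python sketch of the idea, not the author's R implementation; the `(dims, targets)` margin format and the sample data are assumptions made for the example.

```python
from collections import defaultdict


def sparse_ipf(cells, margins, iterations=100):
    """IPF over a dict {cell tuple: weight} that stores only non-zero cells.

    Each margin is (dims, targets): the attribute positions it covers and a
    dict mapping the category tuple on those positions to its target total.
    """
    table = dict(cells)
    for _ in range(iterations):
        for dims, targets in margins:
            # Pass 1: current totals of this margin over the non-zero cells.
            totals = defaultdict(float)
            for cell, weight in table.items():
                totals[tuple(cell[d] for d in dims)] += weight
            # Pass 2: rescale every stored cell towards its margin target.
            for cell in table:
                key = tuple(cell[d] for d in dims)
                if totals[key] > 0:
                    table[cell] *= targets.get(key, 0.0) / totals[key]
    return table


# Three binary attributes: 8 possible cells, only 4 observed in the sample.
source = {(0, 0, 0): 1.0, (0, 1, 1): 2.0, (1, 0, 1): 1.0, (1, 1, 0): 1.0}
margins = [
    ((0,), {(0,): 60.0, (1,): 40.0}),  # one-way margin on attribute 0
    ((1,), {(0,): 50.0, (1,): 50.0}),  # one-way margin on attribute 1
]
fitted = sparse_ipf(source, margins)
```

Memory and run time scale with the number of stored cells, which is bounded by the sample size, rather than with the size of the full multiway table. Cells that are zero in the source stay zero, exactly as they would in a complete array.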
31
New Methods: Relationships
● Land use/transportation models have more types of agents
  ● Agents: Persons, families, households, business establishments
  ● Objects: Vehicles, dwellings
32
New Methods: Relationships
● Need to synthesize correct relationships
● Examples:
  ● Which persons are married?
    ● Opposite sex, similar ages (usually)
  ● Which household owns/rents a given dwelling?
    ● Number of rooms and number of persons should be correlated
● Earlier methods could guarantee correct PDF for one agent type, but not all simultaneously
33
New Methods: Relationships
● Family PUMS contains information about persons in family
  ● husband/wife ages; child ages
● Can synthesize “family” agent
  ● Include some “person” attributes in family
34
New Methods: Relationships
● Then, conditionally synthesize persons on family attributes
● IPF result is a joint probability mass function
  P(AGE, EDU, INCOME, OCCUP, SEX, ...)
● Can convert to a conditional PMF
  P(EDU, INCOME, OCCUP, ... | AGE, SEX)
● Synthesize, repeating for husband, wife, children
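Converting the joint PMF to a conditional one amounts to grouping cells by the conditioning attributes and normalizing within each group. The Python sketch below uses a made-up joint over (AGE, SEX, EDU) only; the helper names and probabilities are invented, and the real person table has many more attributes.

```python
import random
from collections import defaultdict


def conditional_pmf(joint, given_idx):
    """Turn P(all attributes) into P(remaining attributes | given attributes)."""
    groups = defaultdict(lambda: defaultdict(float))
    for cell, p in joint.items():
        given = tuple(cell[i] for i in given_idx)
        rest = tuple(v for i, v in enumerate(cell) if i not in given_idx)
        groups[given][rest] += p
    cond = {}
    for given, dist in groups.items():
        total = sum(dist.values())
        if total > 0:  # skip conditioning values with no probability mass
            cond[given] = {rest: p / total for rest, p in dist.items()}
    return cond


def synthesize_person(cond, age, sex, rng):
    """Draw the remaining attributes for a person whose AGE and SEX are fixed."""
    dist = cond[(age, sex)]
    options = list(dist)
    rest = rng.choices(options, weights=[dist[o] for o in options], k=1)[0]
    return (age, sex) + rest


# Hypothetical joint PMF over (AGE, SEX, EDU), standing in for an IPF result.
joint = {
    ("25-44", "M", "univ"): 0.1, ("25-44", "M", "hs"): 0.3,
    ("25-44", "F", "univ"): 0.2, ("25-44", "F", "hs"): 0.2,
    ("0-17", "M", "none"): 0.1, ("0-17", "F", "none"): 0.1,
}
cond = conditional_pmf(joint, given_idx=(0, 1))
rng = random.Random(0)
husband = synthesize_person(cond, "25-44", "M", rng)
child = synthesize_person(cond, "0-17", "F", rng)
```

The AGE and SEX values passed in would come from the already synthesized family agent, so each person is consistent with the family while the remaining attributes follow the person-level distribution.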
35
New Methods: Relationships
● Guarantees good fit for both agent types
  ● Correct Family PDF
  ● Correct Person PDF
● Simple, data-driven
  ● No rules
  ● No special data sources, models
● Provided that attributes can be aligned between agents
36
Results
37
Results
38
Results
● Programmed in R
  ● A statistical programming platform
  ● Dynamic language, fast prototyping
  ● Good support for categorical data, contingency tables
● Toronto CMA: 1.1 million households, 1.0 million families, 3.3 million persons
  ● Run time: 2 hours, 7 minutes on older 1.5 GHz computer
● Repeated for Hamilton and Oshawa CMAs
39
Results
40
Results
● Experiment
  ● Is there value in using really rich input data?
  ● Or does PUMS + 1D tables give enough?
● Calculated fit against all available data
  ● SRMSE and G² information-theoretic statistics
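Both fit statistics compare a synthesized table against a validation table cell by cell. The Python sketch below uses common textbook definitions, which are an assumption here since the slides do not spell out the formulas: SRMSE as the root mean squared cell error divided by the mean observed cell count, and G² as the likelihood-ratio statistic 2 Σ obs · ln(obs / est). The two small tables are invented.

```python
import math


def srmse(observed, estimated):
    """Root mean squared cell error, standardized by the mean observed count."""
    n = len(observed)
    mean_squared_error = sum((o - e) ** 2 for o, e in zip(observed, estimated)) / n
    return math.sqrt(mean_squared_error) / (sum(observed) / n)


def g_squared(observed, estimated):
    """Likelihood-ratio statistic; cells with zero observed count contribute 0."""
    total = 0.0
    for o, e in zip(observed, estimated):
        if o > 0:
            if e <= 0:
                return math.inf  # an observed cell the estimate rules out
            total += o * math.log(o / e)
    return 2.0 * total


observed = [10.0, 20.0, 30.0]   # validation table cells (made up)
estimated = [12.0, 18.0, 30.0]  # synthesized table cells (made up)
fit_srmse = srmse(observed, estimated)
fit_g2 = g_squared(observed, estimated)
```

Both statistics are zero for a perfect fit and grow as the synthesized table moves away from the validation data, so lower values indicate a better synthetic population.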
41
Results
42
Results
● Improvement of result with additional data evident
  ● However, no statistical tests possible
● Monte Carlo stage causes some error
● My conditional synthesis introduces small amount of additional error
● Little difference between zone-by-zone and multizone methods
43
Questions?