Enhancing the Quality of Transferred Household Travel Survey Data:
A Bayesian Updating Approach Using MCMC with Gibbs Sampling

Yongping Zhang
Kouros Mohammadian, PhD
Department of Civil and Materials Engineering
University of Illinois at Chicago
The 11th TRB National Transportation Planning Applications Conference
May 7, 2007

Data Transferability

• The idea is to use data collected in one context in a new context. This can reduce or eliminate the need for a large data collection in the application context.
• Previous Studies
  • ITE trip generation tables
  • NCHRP 365 (Nancy McGuckin, et al.): highly aggregate
  • ORNL's NPTS/NHTS transferability study (Pat Hu, et al.): aggregate (CT level)
  • Data simulation (Stopher and Greaves): disaggregate (HH level), C&RT classification method, limited number of independent variables

Project Approach

• Consider a larger set of variables
  • NHTS and CTPP datasets
• Use quantifiable variables that can be easily predicted or are available from other sources (e.g., PUMS)
• Consider variables representing land use, urban form, and transportation system characteristics
• Advanced clustering, updating, and simulation approaches

Data

• Data Sources
  • 2001 NHTS, 2000 CTPP, PUMS, 2003 TTI, Tiger/Line GIS data files
• Data Cleaning
  • 33 variables of demographics, socio-economics, and land use
  • Individual level: age group, race/ethnicity, education, occupation
  • Household level: HH size, income, adults, vehicles, drivers, workers
  • Census tract level: housing, employment, and population densities

New Variables

• Intersection density (Tiger/Line): No. of intersections / Area
• Road density (Tiger/Line): Road length / Area
• Pedestrian environment (Tiger/Line): Block size = Road length / No. of intersections
• Transit friendly environment (CTPP): Transit users / Total no. of workers; Transit trips / Total no. of trips
• Congestion factor: Travel time index (TTI report for 85 MSAs) = Avg. travel time / Free flow TT in that region
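
The ratios listed for the new variables can be sketched in a few lines. This is only an illustration: the function name, argument names, units, and input numbers are hypothetical and do not come from the slides.

```python
def built_environment_indices(n_intersections, road_length, area,
                              transit_commuters, total_workers,
                              avg_travel_time, free_flow_time):
    """Compute the tract-level ratios named on the slide.

    All argument names and units are assumptions made for this sketch.
    """
    return {
        "intersection_density": n_intersections / area,
        "road_density": road_length / area,
        # Block size as defined on the slide: road length per intersection.
        "block_size": road_length / n_intersections,
        "transit_share_of_workers": transit_commuters / total_workers,
        "travel_time_index": avg_travel_time / free_flow_time,
    }


# Hypothetical census tract, for illustration only.
tract = built_environment_indices(
    n_intersections=120, road_length=30.0, area=2.0,
    transit_commuters=450, total_workers=3000,
    avg_travel_time=26.0, free_flow_time=20.0,
)
```

A travel time index above 1 means average travel takes longer than free-flow travel in that region.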

Dependent Variables

Travel Characteristics (from NHTS trip file aggregated to HH level)

• VMT for each household
• No. of trips
• No. of mandatory trips
• No. of maintenance trips
• No. of discretionary trips
• No. of transit trips in the HH
• No. of private vehicle trips
• No. of non-motorized (bicycle and walk) trips
• No. of tours
• Average trips per tour
• Average trip distance in miles for all HH members
• No. of transit users in the HH
• No. of carpool users in the HH
• Percentage of public transit usage in the HH
• Percentage of carpool usage among workers in the HH
• Total commute distance in the HH
• Average commute distance in the HH
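
A rough sketch of what "trip file aggregated to HH level" can look like is given below. The record layout (`hh_id`, `purpose`, `mode`, `miles`) is a made-up stand-in, not the NHTS file format, and only a few of the listed dependent variables are computed.

```python
from collections import defaultdict


def aggregate_trips_to_households(trips):
    """Roll trip records up to one summary record per household."""
    households = defaultdict(lambda: {
        "n_trips": 0, "n_mandatory": 0, "n_transit": 0,
        "n_non_motorized": 0, "total_miles": 0.0,
    })
    for trip in trips:
        hh = households[trip["hh_id"]]
        hh["n_trips"] += 1
        # Booleans add as 0 or 1, so these lines count matching trips.
        hh["n_mandatory"] += trip["purpose"] == "mandatory"
        hh["n_transit"] += trip["mode"] == "transit"
        hh["n_non_motorized"] += trip["mode"] in ("walk", "bicycle")
        hh["total_miles"] += trip["miles"]
    for hh in households.values():
        hh["avg_trip_miles"] = hh["total_miles"] / hh["n_trips"]
    return dict(households)


# Hypothetical trip records, for illustration only.
sample_trips = [
    {"hh_id": 1, "purpose": "mandatory", "mode": "auto", "miles": 12.0},
    {"hh_id": 1, "purpose": "maintenance", "mode": "walk", "miles": 0.5},
    {"hh_id": 1, "purpose": "discretionary", "mode": "transit", "miles": 5.5},
    {"hh_id": 2, "purpose": "mandatory", "mode": "auto", "miles": 8.0},
]
hh_summary = aggregate_trips_to_households(sample_trips)
```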

[Figure: project framework flowchart. Boxes: NHTS 2001; Add-on Data in NHTS 2001; Formatted PUMS; Population Synthesizer; Synthetic Households with 33 Variables; ANN Module; Transferability Model; Synthetic HHs with Cluster Membership; Look-up Tables; Transferred Travel Attributes; Bayesian Updating; Updated Transferred Travel Attributes; Monte Carlo Simulation; Simulated Travel Attributes; Validation Data with Cluster Membership; Validation with Simulation.]

Clustering

• Classification schema is a critical issue
• Clustering methods tested include: K-Means, hierarchical, C&RT, TwoStep, ANN
• 11 clusters were generated using the TwoStep clustering method
• ONLY national data is used

Clusters

1. Rich and Smart:
  • middle age families
  • professional or managerial white collar jobs
  • graduate degrees
  • high incomes
  • majority live in suburbs
  • greater part are White but also some Asian
2. Young Achievers:
  • young couples without children or mainly with pre-school children
  • college degrees
  • white collar jobs in sales, service, technical, and professional
  • mid-range income
  • higher percentages live in suburb or rural areas
3. Kids-centered Families:
  • middle aged and working class families
  • pre-school and school age children
  • usually have college education
  • mid-range to high level income
  • primarily White and live in suburb or town

Clusters, cont.

4. Rural Blues:
  • working class, middle aged families
  • pre-school and school age children
  • mainly high school graduates
  • blue collar jobs (farming, manufacturing, etc.)
  • low to mid-range income
  • greater part are White and mainly live in rural areas or small towns
5. Working Mixing Pot:
  • working class White, Black, Asian, or Hispanic
  • single adults or couples
  • college or high school education
  • low to mid-range income
6. Mainstream Families:
  • mid-scale, upper mid age, White
  • large working class couples or families with older children
  • college or high school education
  • mid-range to high level income
  • suburb or rural areas

Clusters, cont.

7. Senior Couples:
  • senior couples, majority working and some are retired
  • greater part is White but includes some Black, Asian, or American Indian
  • suburb or rural areas
8. Sustaining Minority Families:
  • low income, middle aged, working class families
  • mainly Hispanic or Black but also some Asian and White
  • majority have not finished high school
  • service, sales, manufacturing, farming, or construction jobs
9. Forever Youngs:
  • White senior couples, empty nesters
  • mostly retired but some have sales, service, or managerial jobs
  • low to mid-range income

Clusters, cont.

10. Traditional Seniors:
  • mainly retired single individuals and some retired couples
  • low income
  • majority are White but some Black, Asian, or American Indian
11. Neo Urbans:
  • small families/couples or single individuals
  • dense urban areas
  • college education
  • low to mid-range income
  • sales, service, or professional jobs
  • dominant race is White but a significant number are Black, Asian, and Hispanic

Cluster-Based Travel Characteristics

Household-level measures (Non-Mot. = No. of non-motorized trips; % Transit = % of transit; % Carpool = % of carpool; Commute = avg commute distance; Tours = No. of tours; Trips/Tour = avg trips per tour; Annual VMT = avg annual VMT)

Cluster Stat     HH Trips  Auto Trips    Non-Mot.   % Transit   % Carpool     Commute       Tours  Trips/Tour  Annual VMT
1       Mean        14.84       12.90        1.47        0.03        0.05       19.43       10.77        1.41      13,008
        S.D.         8.47       11.74        3.18        0.11        0.19      132.95        9.43        0.40      11,984
2       Mean        12.95       11.50        1.10        0.02        0.07       15.90        9.47        1.39      13,450
        S.D.         8.17       11.08        2.95        0.11        0.23       36.80        8.81        0.44      14,713
3       Mean        17.02       14.88        1.43        0.02        0.04       13.53       12.26        1.41      13,501
        S.D.         9.69       13.67        3.13        0.09        0.16       40.93       10.30        0.37      12,627
4       Mean        15.68       13.69        1.24        0.02        0.07       12.54       11.41        1.40      12,705
        S.D.         8.86       12.78        2.72        0.10        0.23       22.21        9.73        0.44      14,721
5       Mean         5.62        4.97        0.51        0.04        0.04       15.26        4.01        1.45      11,309
        S.D.         3.49        4.67        1.46        0.18        0.18       59.75        3.39        0.60      13,337
6       Mean        13.06       11.94        0.83        0.02        0.07       14.30        9.68        1.36      12,655
        S.D.         7.68       11.01        2.31        0.09        0.21       47.65        8.67        0.39      12,233
7       Mean         9.16        8.42        0.63        0.02        0.04       15.77        6.56        1.44      10,995
        S.D.         5.31        7.65        1.85        0.10        0.18       47.41        5.75        0.53      12,107
8       Mean        12.38       10.40        1.29        0.05        0.10       12.09        9.14        1.39      10,998
        S.D.         8.24       11.76        3.43        0.15        0.25       21.55        9.74        0.45      13,200
9       Mean         8.91        8.11        0.70        0.01        0.04       10.49        6.06        1.57       8,606
        S.D.         4.95        6.86        1.79        0.09        0.18       20.18        4.95        0.68       8,484
10      Mean         4.61        3.95        0.51        0.04        0.02        8.60        3.21        1.54       5,972
        S.D.         2.72        3.53        1.35        0.19        0.15       14.03        2.64        0.77       8,278
11      Mean         6.58        3.39        2.03        0.27        0.05       21.71        4.90        1.41       7,504
        S.D.         4.27        4.80        3.35        0.39        0.17      174.94        4.69        0.51      12,759

Per-person measures (trips per person by purpose and mode, and tours per person)

Cluster Stat    Mandatory Maintenance    Discret.        Auto     Transit    Non-Mot.       Tours
1       Mean         0.68        1.27        1.10        3.96        0.05        0.44        3.28
        S.D.         0.70        1.10        0.95        2.06        0.21        0.74        1.45
2       Mean         0.63        1.09        0.91        3.47        0.04        0.32        2.87
        S.D.         0.63        1.02        0.91        1.92        0.22        0.68        1.39
3       Mean         0.62        1.20        1.04        3.79        0.03        0.36        3.11
        S.D.         0.64        1.05        0.92        2.07        0.14        0.63        1.44
4       Mean         0.60        1.08        0.93        3.47        0.04        0.31        2.89
        S.D.         0.57        0.96        0.90        1.94        0.24        0.57        1.34
5       Mean         0.77        1.21        1.05        3.87        0.08        0.40        3.12
        S.D.         1.02        1.32        1.24        2.65        0.43        0.99        1.72
6       Mean         0.79        1.09        0.95        3.91        0.03        0.26        3.14
        S.D.         0.74        1.07        0.92        2.09        0.18        0.56        1.41
7       Mean         0.59        1.19        1.01        3.80        0.03        0.29        2.96
        S.D.         0.79        1.16        1.02        2.21        0.19        0.70        1.56
8       Mean         0.49        0.97        0.77        2.92        0.09        0.34        2.51
        S.D.         0.54        0.97        0.83        2.01        0.34        0.68        1.37
9       Mean         0.14        1.41        1.21        3.84        0.01        0.32        2.86
        S.D.         0.40        1.25        1.07        2.27        0.14        0.69        1.57
10      Mean         0.12        1.45        1.25        3.69        0.07        0.47        2.99
        S.D.         0.46        1.50        1.25        2.58        0.42        1.00        1.64
11      Mean         0.56        1.05        0.90        1.88        0.64        1.19        2.72
        S.D.         0.73        1.19        1.03        2.18        1.07        1.63        1.60

Transferability

• An ANN model (with genetic algorithm) is used to simulate cluster membership as a function of 11 factors for each HH in the add-on datasets
• The model has 92.4% prediction potential
• Travel characteristics are transferred from national clusters to add-on data according to their cluster membership
• Weighted observed and predicted travel characteristics are compared
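
The transfer-and-compare step can be illustrated with a small look-up sketch. The cluster means are the "No. of HH trips" means from the cluster table in this deck. The household records, weights, and function name are hypothetical, and the ANN that assigns cluster membership is not reproduced here.

```python
# Mean number of household trips by national cluster (from the cluster table).
NATIONAL_MEAN_HH_TRIPS = {1: 14.84, 5: 5.62, 11: 6.58}

# Hypothetical add-on households: predicted cluster, survey weight,
# and the observed household trip count.
add_on_households = [
    {"cluster": 1, "weight": 120.0, "observed_trips": 16},
    {"cluster": 1, "weight": 80.0, "observed_trips": 11},
    {"cluster": 5, "weight": 200.0, "observed_trips": 6},
    {"cluster": 11, "weight": 150.0, "observed_trips": 5},
    {"cluster": 11, "weight": 50.0, "observed_trips": 9},
]


def compare_transferred_to_observed(households, lookup):
    """Per cluster: weighted observed mean next to the transferred mean."""
    totals = {}
    for hh in households:
        weighted_sum, weight_sum = totals.get(hh["cluster"], (0.0, 0.0))
        totals[hh["cluster"]] = (
            weighted_sum + hh["weight"] * hh["observed_trips"],
            weight_sum + hh["weight"],
        )
    return {
        cluster: {
            "observed": weighted_sum / weight_sum,
            "transferred": lookup[cluster],
        }
        for cluster, (weighted_sum, weight_sum) in totals.items()
    }


comparison = compare_transferred_to_observed(
    add_on_households, NATIONAL_MEAN_HH_TRIPS
)
```

Each household simply inherits the national mean of its cluster, so the gap between the two columns is a direct measure of how well that cluster transfers.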

Comparison of Weighted Trip Count per Person

[Figure: trip count per person (y-axis, 0 to 5) by cluster (x-axis, 1 to 11). Series: National, New York, Wisconsin, Baltimore.]

Comparison of Weighted Mandatory Trips per Person

[Figure: mandatory trips per person (y-axis, 0.0 to 1.0) by cluster (x-axis, 1 to 11). Series: National, New York, Wisconsin, Baltimore.]

Original Comparison of Transit Usage

Not so good! Some clusters need improvement.

• Compared to No. of Trips, the prediction of transit usage is not so good.
• Clusters 5, 8, 10, and 11 show significant differences and need improvement.

[Figure: Percentage of Transit Usage in the Household (No. of Transit Users / Household Size), y-axis 0.00 to 0.50, by cluster (x-axis, 1 to 11). Series: National, Baltimore, New York, Wisconsin.]

Improvement to Clusters Using C&RT

1. The first level of the tree is grown on the difference in the number of vehicles in the household (own a vehicle or not).
2. The improvement of the model due to a level is defined as improvement / (variance of Node 0).
3. For example, here 0.0017 corresponds to 13.3%, 0.0009 to 7.05%, and 0.0002 to 1.57%.
4. Total model improvement is about 22%.
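
The arithmetic behind these percentages can be checked with a short sketch. The Node 0 variance is not stated on the slide; it is backed out here from the first pair (0.0017 and 13.3%), and the second improvement is taken as 0.0009, the value consistent with the reported 7.05%. Variable names are illustrative.

```python
# Improvements at successive tree levels, as read from the slide.
level_improvements = [0.0017, 0.0009, 0.0002]

# Assumption: variance of Node 0 implied by "0.0017 corresponds to 13.3%".
node0_variance = 0.0017 / 0.133

# Relative improvement of each level = improvement / variance of Node 0.
relative_improvements = [imp / node0_variance for imp in level_improvements]

# Summing the levels gives the total model improvement (about 22%).
total_improvement = sum(relative_improvements)
```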

Considering Distributions: Trip Rate

[Figure: Contrast of Distributions of Trip Rate per Person in Cluster 5. x-axis: trips per person per day (0 to 25); y-axis: probability (0.00 to 0.20). Series: National, NY.]

Nice match shown! However, this is not always the case. How to improve the transferability?

Considering Distributions: Trip Distance

[Figure: Comparison of National and NY Add-on Distributions for Cluster 2. x-axis: trip distance per person in miles (0 to 140); y-axis: probability (0 to 0.03). Series: National, NY.]

Not so good! Needs to be improved.

Considering Distributions

• Various distributions were fitted to the dataset, including: Normal, Gamma, Weibull, Exponential, Max Extreme, Lognormal, Logistic, Student's t, Min Extreme, Triangular, General Beta, Pareto, Uniform, Binomial, Geometric, Hyper Geometric, and Poisson.
• The fitting results are interpreted by
  • examining the rankings of the three fit statistics: A-D, K-S, and Chi-squared
  • visually judging the plots, density and cumulative curves
  • p-values and critical values at different significance levels
• Non-normal distributions are dominant (e.g., Gamma)
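
To show how ranking candidates by a fit statistic works, here is a minimal standard-library sketch that computes the K-S statistic for just two candidates (Normal and Exponential) on a synthetic skewed sample. The sample, the moment-based fits, and the function name are assumptions made for illustration; the fitting software, the A-D and Chi-squared statistics, and the full candidate list from the slide are not reproduced.

```python
import math
import random
import statistics


def ks_statistic(sample, cdf):
    """Kolmogorov-Smirnov distance between the empirical CDF and `cdf`."""
    ordered = sorted(sample)
    n = len(ordered)
    distance = 0.0
    for i, x in enumerate(ordered):
        fitted = cdf(x)
        # Compare against the empirical CDF just after and just before x.
        distance = max(distance, (i + 1) / n - fitted, fitted - i / n)
    return distance


# Synthetic right-skewed data standing in for a travel variable.
rng = random.Random(1)
sample = [rng.gammavariate(1.3, 20.0) for _ in range(2000)]

mean = statistics.fmean(sample)
sd = statistics.stdev(sample)

# Candidate 1: Normal fitted by sample mean and standard deviation.
normal_fit = statistics.NormalDist(mean, sd)
ks_normal = ks_statistic(sample, normal_fit.cdf)

# Candidate 2: Exponential fitted by the sample mean.
ks_exponential = ks_statistic(sample, lambda x: 1.0 - math.exp(-x / mean))

# Smaller K-S distance ranks first.
ranking = sorted([("Normal", ks_normal), ("Exponential", ks_exponential)],
                 key=lambda item: item[1])
```

On skewed, non-negative data like this, the Normal candidate puts noticeable probability below zero, which is one reason non-normal forms tend to rank better.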

Gamma Distribution

[Equations: PDF, CDF, and the Gamma function.]

• k > 0 is the shape parameter
• θ > 0 is the scale parameter
• the location parameter determines where the origin is located
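
For reference, the standard three-parameter (shifted) Gamma forms are sketched below. The slide's own equations did not survive extraction, so this is the textbook parameterization rather than a copy of the slide; the symbol L for the location parameter is an assumption.

```latex
% PDF of the shifted Gamma distribution, defined for x > L
f(x; k, \theta, L) = \frac{(x - L)^{k-1} \, e^{-(x - L)/\theta}}{\Gamma(k)\, \theta^{k}}

% CDF, written with the lower incomplete gamma function \gamma(k, \cdot)
F(x; k, \theta, L) = \frac{\gamma\!\left(k, \frac{x - L}{\theta}\right)}{\Gamma(k)}

% Gamma function
\Gamma(k) = \int_{0}^{\infty} t^{k-1} e^{-t} \, dt

% Mean and variance
\mathrm{E}[X] = L + k\theta, \qquad \mathrm{Var}[X] = k\theta^{2}
```

Under this parameterization the mean is location plus shape times scale, which matches the Location, Shape, Scale, and Mean columns reported in the summary of updating results.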

Fitted Distribution with Parameters for each Variable by Cluster

Bayesian Updating

• Local updating can significantly improve the quality of the transferred data
• Used Bayesian updating
  • Traditionally, the transferability literature has studied only variables with normal distributions, because the posterior is simple to calculate from a normal prior and likelihood.
  • In practice, the variables of interest (i.e., the likelihood) can take various distributional forms.

$$k(\theta \mid x) = \frac{f(x \mid \theta)\, g(\theta)}{\int_{0}^{\infty} f(x \mid \theta)\, g(\theta)\, d\theta}$$

• f(x|θ) is the probability function for the observed data x (i.e., local sample), given the unknown parameter θ
• g(θ) is the prior distribution for θ
• k(θ|x) is the posterior distribution for θ given observed data x
• The technique can be expanded to situations when no prior data is available.
• The analyst can do successive updating, using the new information without losing the gains from the old one.

Bayesian Updating (2)

• The national sample of NHTS 2001 is used as the source for the prior information
• A small local sample is randomly selected from the NY add-on, leaving the rest for validation
• The bootstrap method is used to resample the data and justify the prior distribution assumptions of the parameters of interest (i.e., scale and shape for Normal distribution)
  • A Normal distribution is fitted to each of the resampled datasets.
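
One possible reading of the bootstrap step is sketched below: resample the source data with replacement, fit the distribution to each resample, and summarize how the fitted shape and scale vary, which gives a mean and standard deviation for a Normal prior on each parameter. The slides do not spell out the procedure, so the Gamma method-of-moments fit, the synthetic data, and all names here are assumptions.

```python
import random
import statistics


def fit_gamma_by_moments(sample):
    """Method-of-moments Gamma fit (location fixed at 0): shape, scale."""
    mean = statistics.fmean(sample)
    variance = statistics.variance(sample)
    return mean * mean / variance, variance / mean


def bootstrap_parameter_priors(sample, n_resamples=200, seed=0):
    """Mean and SD of fitted shape and scale across bootstrap resamples."""
    rng = random.Random(seed)
    shapes, scales = [], []
    for _ in range(n_resamples):
        # Resample with replacement, same size as the original sample.
        resample = rng.choices(sample, k=len(sample))
        shape, scale = fit_gamma_by_moments(resample)
        shapes.append(shape)
        scales.append(scale)
    return {
        "shape": (statistics.fmean(shapes), statistics.stdev(shapes)),
        "scale": (statistics.fmean(scales), statistics.stdev(scales)),
    }


# Synthetic stand-in for a national sample of one travel variable.
data_rng = random.Random(42)
national_like = [data_rng.gammavariate(1.45, 21.28) for _ in range(1000)]
priors = bootstrap_parameter_priors(national_like)
```

The resulting (mean, SD) pairs can then serve as Normal priors on shape and scale in the updating step.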

Bayesian Updating (3)

• Then, Markov Chain Monte Carlo (MCMC) simulation with Gibbs Sampling is utilized to update the prior with the small local sample.
• Assuming the updated variables of interest are still Gamma distributed, the posteriors of the parameters are used to derive the updated means and SDs of the variables.
• Updated parameters are then compared with the validation data and national data to test the effectiveness of the updating procedure.
• The comparisons show that significant improvement is achieved.
• The improvement increases with the local sample size
  • a relatively cost-effective sample size is suggested
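
The slides do not give the sampler's details, so the following is only a rough sketch of one way to run Gibbs sampling for a Gamma likelihood with Normal priors on shape and scale: a "griddy" Gibbs sampler that draws each parameter from its full conditional evaluated on a grid. The prior means use the national Cluster 2 trip-distance shape and scale from the summary table; the prior SDs, grids, local sample, and the choice to fix the location at 0 are all assumptions.

```python
import math
import random


def log_likelihood(shape, scale, n, sum_x, sum_log_x):
    """Gamma log-likelihood (location fixed at 0) from sufficient statistics."""
    return ((shape - 1.0) * sum_log_x - sum_x / scale
            - n * (math.lgamma(shape) + shape * math.log(scale)))


def log_normal_prior(value, mean, sd):
    """Normal log-density up to an additive constant."""
    return -0.5 * ((value - mean) / sd) ** 2


def draw_from_grid(grid, log_weights, rng):
    """Draw one grid point with probability proportional to exp(log_weight)."""
    top = max(log_weights)
    weights = [math.exp(w - top) for w in log_weights]
    return rng.choices(grid, weights=weights, k=1)[0]


def griddy_gibbs(sample, shape_prior, scale_prior, shape_grid, scale_grid,
                 n_iter=1500, burn_in=500, seed=0):
    """Alternate draws of shape | scale and scale | shape on fixed grids."""
    rng = random.Random(seed)
    n, sum_x = len(sample), sum(sample)
    sum_log_x = sum(math.log(x) for x in sample)
    shape, scale = shape_prior[0], scale_prior[0]
    draws = []
    for iteration in range(n_iter):
        log_w = [log_likelihood(g, scale, n, sum_x, sum_log_x)
                 + log_normal_prior(g, *shape_prior) for g in shape_grid]
        shape = draw_from_grid(shape_grid, log_w, rng)
        log_w = [log_likelihood(shape, g, n, sum_x, sum_log_x)
                 + log_normal_prior(g, *scale_prior) for g in scale_grid]
        scale = draw_from_grid(scale_grid, log_w, rng)
        if iteration >= burn_in:
            draws.append((shape, scale))
    return draws


# Hypothetical local sample of 75 observations (synthetic, not NHTS data).
data_rng = random.Random(7)
local_sample = [data_rng.gammavariate(1.32, 20.84) for _ in range(75)]

shape_grid = [0.02 * i for i in range(10, 251)]   # 0.20 to 5.00
scale_grid = [0.25 * i for i in range(20, 241)]   # 5.0 to 60.0
draws = griddy_gibbs(local_sample, (1.45, 0.3), (21.28, 3.0),
                     shape_grid, scale_grid)

# Updated mean of the variable: posterior average of shape * scale.
updated_mean = sum(k * t for k, t in draws) / len(draws)
```

The grid approach trades exactness for simplicity: it needs no conjugacy, which matters here because a Normal prior on Gamma parameters has no closed-form conditional.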

[Figure: RMSE Change with Sample Size. x-axis: sample size (30 to 190); y-axis: RMSE (0 to 1).]

• Root Mean Square Error (RMSE) decreases with the increase of sample size.
• There is instability when the sample size within each cluster is smaller than 45 observations.
• A sample size of 75 per cluster seems to be the most cost-effective plan.
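
As a small illustration of the RMSE measure, the sketch below compares the trip-rate means reported in the summary of updating results (clusters 2, 3, 4, 8, and 11) against the New York values. How the RMSE on the chart was actually computed is not stated in the slides, so this cross-cluster version is an assumption.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between two equal-length sequences."""
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Mean trip rates per person for clusters 2, 3, 4, 8, 11 (summary table).
national_means = [3.94, 4.38, 3.95, 3.48, 3.75]
updated_means = [3.91, 4.22, 3.85, 3.13, 3.58]
new_york_means = [3.66, 3.99, 3.53, 3.16, 3.39]

rmse_before = rmse(national_means, new_york_means)
rmse_after = rmse(updated_means, new_york_means)
```

A lower `rmse_after` than `rmse_before` indicates that updating moved the transferred means toward the local values.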

Updating Results

[Figure: Comparisons of Updated, National, and NY Add-on Distributions for Trip Distance per Person in Cluster 2. x-axis: trip distance per person per day in miles (0 to 120); y-axis: probability (0 to 0.03). Series: National, NY, Updated.]

• Updated mean values are significantly improved towards the validation data.

Summary of Updating Results

Trip Rates per Person

         National                          National-updated                  State of New York
Cluster  Location   Shape   Scale    Mean  Location   Shape   Scale    Mean  Location   Shape   Scale    Mean
2           -0.83    5.42    0.88    3.94     -0.83    5.15    0.92    3.91     -0.30    3.47    1.14    3.66
3           -3.13   12.31    0.61    4.38     -3.13   12.05    0.61    4.22     -1.66    8.44    0.67    3.99
4           -0.99    6.42    0.77    3.95     -0.99    6.05    0.80    3.85     -0.42    4.43    0.89    3.53
8           -0.13    3.14    1.15    3.48     -0.13    2.90    1.12    3.13      0.18    2.40    1.24    3.16
11           0.04    2.52    1.47    3.75      0.04    2.44    1.45    3.58      0.32    2.20    1.40    3.39

Trip Distance per Person

         National                          National-updated                  State of New York
Cluster  Location   Shape   Scale    Mean  Location   Shape   Scale    Mean  Location   Shape   Scale    Mean
2           -0.09    1.45   21.28   30.67     -0.09    1.34   21.04   28.10     -0.07    1.32   20.84   27.33
3           -0.49    1.68   18.91   31.18     -0.49    1.62   18.93   30.18      0.11    1.53   19.31   29.62
4           -0.22    1.61   18.55   29.59     -0.22    1.45   19.98   28.75     -0.02    1.30   20.59   26.67
5           -0.09    1.20   24.93   29.93     -0.09    1.20   24.03   28.84     -0.09    1.19   23.97   28.36
6           -0.43    1.91   18.12   34.18     -0.43    1.89   18.22   34.01     -0.08    1.58   21.40   33.69
7            0.11    1.48   22.69   33.58      0.11    1.54   21.69   33.51     -0.08    1.52   20.75   31.55
8           -0.12    1.06   24.08   25.38     -0.12    1.03   24.03   24.63     -0.09    0.90   22.91   20.53
9           -0.09    1.16   21.43   24.72     -0.09    1.16   22.23   25.65     -0.03    1.17   22.17   25.91
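
The Mean columns are consistent with the mean of a shifted Gamma distribution, location + shape × scale. The sketch below checks this for the trip-rate rows; the tuple layout and names are illustrative, and the tolerance only allows for the two-decimal rounding of the reported parameters.

```python
# (cluster, source, location, shape, scale, reported mean) for trip rates.
TRIP_RATE_ROWS = [
    (2, "National", -0.83, 5.42, 0.88, 3.94),
    (2, "National-updated", -0.83, 5.15, 0.92, 3.91),
    (2, "New York", -0.30, 3.47, 1.14, 3.66),
    (3, "National", -3.13, 12.31, 0.61, 4.38),
    (3, "National-updated", -3.13, 12.05, 0.61, 4.22),
    (3, "New York", -1.66, 8.44, 0.67, 3.99),
    (4, "National", -0.99, 6.42, 0.77, 3.95),
    (4, "National-updated", -0.99, 6.05, 0.80, 3.85),
    (4, "New York", -0.42, 4.43, 0.89, 3.53),
    (8, "National", -0.13, 3.14, 1.15, 3.48),
    (8, "National-updated", -0.13, 2.90, 1.12, 3.13),
    (8, "New York", 0.18, 2.40, 1.24, 3.16),
    (11, "National", 0.04, 2.52, 1.47, 3.75),
    (11, "National-updated", 0.04, 2.44, 1.45, 3.58),
    (11, "New York", 0.32, 2.20, 1.40, 3.39),
]


def shifted_gamma_mean(location, shape, scale):
    """Mean of a Gamma distribution shifted by `location`."""
    return location + shape * scale


# Largest gap between the implied mean and the reported mean.
max_gap = max(
    abs(shifted_gamma_mean(loc, shape, scale) - reported)
    for _, _, loc, shape, scale, reported in TRIP_RATE_ROWS
)
```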

Population Synthesizing and Travel Data Simulation

• Using PUMS data, the NYC population is synthesized.
  • All of the contextual factors were calculated for each HH.
  • A synthetic population with all 33 required variables was generated.
  • Using the ANN model, cluster memberships are obtained.
• Travel data are simulated for each HH using Monte Carlo simulation of each travel attribute with the updated parameters of the fitted distributions.
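
A minimal sketch of the Monte Carlo step follows, drawing a trip rate for each synthetic household from a shifted Gamma with the National-updated parameters reported for clusters 2, 8, and 11. The household list, the random cluster assignment, and the function name are hypothetical stand-ins for the synthesized population and the ANN output. Draws below zero are possible when the location is negative; how that case was handled is not stated in the slides, so the sketch leaves such draws as they are.

```python
import random
import statistics

# National-updated trip-rate parameters: cluster -> (location, shape, scale).
UPDATED_TRIP_RATE_PARAMS = {
    2: (-0.83, 5.15, 0.92),
    8: (-0.13, 2.90, 1.12),
    11: (0.04, 2.44, 1.45),
}


def simulate_trip_rate(cluster, rng):
    """Draw one trip rate per person from the cluster's shifted Gamma."""
    location, shape, scale = UPDATED_TRIP_RATE_PARAMS[cluster]
    # random.gammavariate takes shape first, then scale.
    return location + rng.gammavariate(shape, scale)


rng = random.Random(2007)

# Hypothetical synthetic households with an assigned cluster membership.
household_clusters = [rng.choice([2, 8, 11]) for _ in range(30000)]
simulated = [(c, simulate_trip_rate(c, rng)) for c in household_clusters]

# Simulated mean for cluster 2, to set beside the reported mean of 3.91.
cluster2_mean = statistics.fmean(rate for c, rate in simulated if c == 2)
```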

Comparison of Simulated and Add-on NYC Samples (Trips per Person)

[Figure: trips per person (y-axis, 0 to 5) against an x-axis running from 0 to 12. Series: Simulated, NYC-Addon.]