A Spatio-temporal Query Interface for Analysing Individual
Biographies : Report on a Practical Experience
Marius Thériault (CRAD, Laval University), Christophe Claramunt (French Naval Academy) & Anne-Marie Séguin (INRS-UCS)
ISPRS Workshop: Spatial, Temporal and Multi-Dimensional Data Modelling and Analysis, Québec, October 2-3, 2003
Research funded by SSHRC, GEOIDE and NSERC
ISPRS Workshop, October 2003
Introduction
Urban modelling must consider the decision-making behaviour of urban actors, using disaggregate data in order to relate
• Activity location, home choice, commuting and travel decisions
• Household, individual and professional profiles of persons
A temporal GIS is needed for analysing urban systems because
• Uncertainties exist in the system (aggregation is not straightforward)
• Emergent behaviour is occurring
• Decision rules for individuals and households are intricate
• System processes are time-path and location dependent
• Future system state depends partly on past and current states
Issues of Modelling Evolution Paths Within GIS
However, current GIS database concepts are mostly static
• Time is supported using Date formats and low-level operators: <, =, > and, possibly, Allen’s primitives
Enhancing ST operators to improve their semantic expressiveness
• Extending Allen’s primitives: Before, After, During, Precede, etc.
• Providing Rank operators: First, Second, Third, …, Last
• Introducing Duration operators: Shortest, …, Longest
• Set operators: All Before, All After, All During, All Shorter, etc.
Database modelling approaches for analysing evolution paths (combine specific facts to define application-dependent trajectories)
Query interface for searching ordered patterns of facts
• Example: Select First Two Children Born Before their Parents Buy their Second Home
Integrated spatial, temporal and thematic query mechanisms within a unified language and/or interface
Context and Objectives of this Research
Context
• Develop GIS tools for analysing the unintentional consequences, at the macro scale (e.g. urban spread), of intentional actions and strategies occurring at the micro scale (aggregation of individual decisions)
• Provide GIS resources for studying the influence of the neighbourhood on individual decisions and for summarising their combined effect on the evolution of the urban system
Objectives
• Develop a generic logical database model to handle evolution paths (e.g. personal biographies) and a query interface combining temporal, spatial and thematic criteria
• Reshuffle ST data in order to describe specific evolutions, providing flat files (one for each question at hand) suitable for analysis with statistical packages such as SPSS and SAS
Studying Individual Biographies
Focus of this application
• Household, residential and professional history of citizens
The life course of most individuals is built around interlocking series of events
• During the last decades, these trajectories have generated patterns of events of increasing complexity:
  - more divorces,
  - extension of contractual short-term employment,
  - increasing geographical mobility, telecommuting, etc.
• Within cities, these individual trajectories intersect and combine, yielding demographic and residential patterns that drive city evolution and transportation demand
• Evolution processes within personal biographies cannot be understood from censuses, which give only snapshot reports on complex situations (aggregated data) and do not relate successive facts
The 1996 Retrospective Survey for the Quebec Metro Area
A survey collecting, in one interview, information about all changes that occurred over a long period of time, since the respondent’s departure from the parental home
A spatially stratified sample of four cohorts of professional workers
• Sample of 418 respondents stratified by municipality, gender and age (36-40 and 46-50)
• Interviews conducted at the respondent’s home, mean duration 1.5 hours (27,167 facts)
Three trajectories
• Residential trajectory: every home occupied (three months or more) since leaving the parents’ home, with its location (civic address) and other characteristics (tenure, price, choice criteria, reasons to leave, etc.)
• Household trajectory: each change in the composition of the respondent’s household (arrival or departure of a spouse, birth, death, arrival of a child from another household, relatives, roommates, cotenants, etc.)
• Professional trajectory: each change of employer and each work place, with their characteristics (including secondary jobs, education and unemployment episodes)
Collecting dates and location of every change (starting and ending time of each episode)
Complex Evolution Processes
[Figure: Personal Biography. A time line running from leaving the parents' home to the survey date, showing three sets of lifelines made of events and episodes, each with a location:
• Residential trajectory: main home (room, studio, flat, apartment, town house) and secondary house (chalet)
• Career trajectory: occupation (student, technician, professional, consultant, unemployed)
• Household trajectory: marital status (single, in couple, married, divorced), family (spouse, son, mother, uncle) and other persons (roommates)]
Changes in Personal Life
An individual’s history is altered
• When an event occurs, modifying at least one important aspect of personal status (marital, family, job, home, education, income, etc.)
• Such an event may simultaneously alter statuses on more than one trajectory, or may affect several individuals in the family
• Some events (e.g. a newborn baby) can be anticipated and may lead to prior adjustment (actions linked to expectation)
• Effects can also be delayed (after the enabling event occurs)
Life trajectories show interlocked evolution
• Behaviour is based on personal values, beliefs and strategy
• Facts report events and episodes (time periods with stable attributes) which intersect to depict the global life status of the person along lifelines
Hypothesis: the ordering of facts builds logical sequences (evolution patterns) related to life cycles (e.g. young couples, retired persons, etc.)
• Studying these patterns is more relevant to urban studies than knowing the exact timing of events for each individual
Issues in Modelling Life Trajectories
How can we express the temporal structure of a biography as an ordered sequence of intertwined statuses (episodes) and events, using database modelling concepts, while retaining its behavioural meaning?
• Personal biographies are a complex mix of real-world phenomena (e.g. persons, dwellings, etc.) described using facts (e.g. episodes, events)
• Facts are ordered along lifelines to form sequences of independent or joint evolution processes (linked trajectories or related individuals)
• Processes use aggregation (a household is made of persons), combination (a mix of jobs held simultaneously) and collaboration (renting or buying a dwelling uses another type of entity and starts a new residential episode)
Tentative Ontology of Lifelines and Trajectories
[Figure: UML class diagram of the tentative ontology. Classes and members named in the diagram:
• Trajectory: -name; +build trajectory()
• Lifestate: -contributing fact classes; +get facts()
• Lifeline: +beginning() : Date; +ending() : Date
• The Geographical Space: +MBR()
• Spatial Reference System: datum, projection, units
• Temporal Reference System: origin, granularity, units
• Fact: -details (specialised into Event and Episode)
• Place: -name, -longitude : float, -latitude : float; +get distance(), +get orientation()
• Geographical Feature: -geometry
• Spatial Relationships (operations): location relationship(), distance relationship(), orientation relationship(), topology relationship()
• Time Period: -begin : Date, -end : Date; +get duration(), +get age()
• Temporal Ordering: -ordering type; +get chronological order(), +get duration order(), +get historical order()
• Historical Ordering, Chronological Ordering: -anterior facts, -posterior facts
• Temporal Relationships (operations): historical relationship(), chronological relationship(), duration relationship()
• Pattern of Facts: -pattern type; +get facts()
• Change, Stable State
• Individual Biography: -owner name : wchar_t; +select trajectories()
• The History
• Individual Lifeline: -birthdate : Date, -survey time : Date; +get facts()
• Pattern Relationships: get age() : float; chronological ordering set, duration ordering set, historical ordering set
• Sequence Relationships: get age() : float; chronological ordering set, duration ordering set, historical ordering set
• Time-varying Attributes]
Database Modelling Concepts for Trajectories
• A lifeline combines facts (events and episodes) describing a specific aspect of personal life (e.g. employment)
• A trajectory (e.g. household) combines a set of related lifelines (e.g. marital status, family composition) using application-specific semantic relationships
• Each lifeline orders facts (periods of time) during which a given status was stable (e.g. single or married)
• When an event occurs, there is some change in status, leading to at least one new episode (e.g. the birth of a child in a household changes its composition); this defines evolution patterns
• Lifelines define multi-dimensional networks of evolution paths (directional, from past to future)
• Finally, each fact can be located in space (using a list of locations)
Database Modelling of Evolution in Trajectories
Developing a generic (application-independent) spatio-temporal data model to handle historical orderings and querying patterns of facts in order to produce flat files needed for event-history analysis
[Figure: entity-relationship diagram, generic part of the ST data model (PK = primary key, FK = foreign key). Tables and columns:
• Respondents: RespondId (PK, FK); Gender; Cohort
• FactOwners: OwnerId (PK); BirthDate; Survey Time
• Individuals: PersonId (PK); Name; SurName; Gender
• ActingIndividuals: ActingId (PK); PersonId (FK); FactId (FK)
• Facts: FactId (PK); OwnerId (FK); SpatialId (FK); LifeStateId (FK); PeriodBeg; PeriodEnd; ObsTime
• HistoricalOrdering: HistoryId (PK); FBeforeId (FK); FAfterId (FK)
• LifeStates: LifeStateId (PK); LifeStateName; Episode
• TrajectoryStates: LifeDimId (PK); LifeStateId (FK); TrajectId (FK)
• Trajectories: TrajectId (PK); TrajectName
• Spatial: SpatialId (PK); Longitude; Latitude
• MapInfo_MapCatalog: SpatialType; TableName; CoordinateSystem; Symbol; XColumnName; YColumnName (link to Spatial)
Relationship labels: belongs to, is before, is after, is involved in, is, uses, defines, is located at
Diagram annotations: application semantics; facts (events and episodes); historical ordering of facts; location of facts]
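The core of this schema can be sketched with Python's built-in sqlite3 module. This is only an illustrative sketch, not the authors' implementation: the column types, the sample rows and the query are assumptions; only the table and column names come from the diagram. It stores three facts for one owner and uses HistoricalOrdering to retrieve the facts recorded as anterior to a given fact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table and column names follow the slide; the column types are assumed.
conn.executescript("""
CREATE TABLE LifeStates (
    LifeStateId   INTEGER PRIMARY KEY,
    LifeStateName TEXT,
    Episode       INTEGER            -- assumed: 1 = episode, 0 = event
);
CREATE TABLE Facts (
    FactId      INTEGER PRIMARY KEY,
    OwnerId     INTEGER,
    SpatialId   INTEGER,
    LifeStateId INTEGER REFERENCES LifeStates(LifeStateId),
    PeriodBeg   TEXT,
    PeriodEnd   TEXT,
    ObsTime     TEXT
);
CREATE TABLE HistoricalOrdering (
    HistoryId INTEGER PRIMARY KEY,
    FBeforeId INTEGER REFERENCES Facts(FactId),
    FAfterId  INTEGER REFERENCES Facts(FactId)
);
""")

# Hypothetical sample data for a single fact owner.
conn.executemany("INSERT INTO LifeStates VALUES (?, ?, ?)",
                 [(1, "tenant", 1), (2, "child birth", 0), (3, "home owner", 1)])
conn.executemany("INSERT INTO Facts VALUES (?, ?, ?, ?, ?, ?, ?)", [
    (10, 1, None, 1, "1980-01-01", "1985-06-30", "1996-01-01"),
    (11, 1, None, 2, "1983-03-10", "1983-03-10", "1996-01-01"),
    (12, 1, None, 3, "1985-07-01", "1996-01-01", "1996-01-01"),
])
conn.executemany("INSERT INTO HistoricalOrdering VALUES (?, ?, ?)",
                 [(1, 10, 12), (2, 11, 12)])

# Which life states precede the home-owner episode (fact 12)?
rows = conn.execute("""
    SELECT s.LifeStateName
    FROM HistoricalOrdering h
    JOIN Facts b      ON b.FactId = h.FBeforeId
    JOIN LifeStates s ON s.LifeStateId = b.LifeStateId
    WHERE h.FAfterId = 12
    ORDER BY b.PeriodBeg
""").fetchall()
print(rows)
```

Storing the ordering explicitly, rather than recomputing it from dates, is what lets a generic interface search for patterns of facts without knowing the application semantics.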
Modelling the probability of a status change considering the context: Cox regression combines survival tables and logistic regression
• A status change is modelled using a set of change-enabling facts, some change-motivating facts and a target changed status
• For example, the propensity of a couple of tenants (enabling facts) to buy their first house (target status: home owner) after the birth of their second child if they hold a stable job (motivating facts)
• Time elapsed after the enabling and/or motivating facts, and the local context, are relevant
[Figure: time line (elapsed time) showing enabling facts, change-motivating facts and target status]
Enhancing Expression of ST Relationships
• Time ordering should use time stamps (chronological), historical (topological: first…last) and/or duration (shortest…longest) criteria
• Semantics of trajectories are application dependent and should be modelled accordingly, as well as explicitly handled during the query
• Query mechanisms should be provided to search patterns of facts (e.g. second child birth after longest unemployment episode), possibly using time buffers (delayed and anticipated actions)
• Operation of the interface should be close to natural language and should maximize semantic expressiveness
• Spatial and temporal operators should be integrated and handled together within a query interface/language combining filters (selecting facts used to build ad hoc lifelines) and criteria (selecting specific facts)
Temporal Operators on Two Time Intervals
(Allen’s operators are identified with grey tones on the original slide)

a) Comparison between the time limits of two time intervals (periods or instants), extended from Allen

Commutative | Operator | Operational definition
yes | T Equal U | (T’ = U’) ∧ (T” = U”)
no | T MeetBeg U | T” = U’
no | T MeetEnd U | U MeetBeg T
yes | T Touch U | (T MeetBeg U) ∨ (T MeetEnd U)
no | T During U | (T’ > U’) ∧ (T” < U”)
no | T Start U | (T’ = U’) ∧ (T” < U”)
no | T Finish U | (T’ > U’) ∧ (T” = U”)
no | T Inside U | (T During U) ∨ (T Start U) ∨ (T Finish U)
no | T Contain U | U Inside T
no | T CoverBeg U | T’ < U’ < T” < U”
no | T CoverEnd U | U CoverBeg T
yes | T Overlap U | ((T CoverBeg U) ∨ (T CoverEnd U)) ∧ ~(T Contain U)
no | T Before U | T” < U’
no | T After U | U Before T
yes | T Disjoint U | (T Before U) ∨ (T After U)
yes | T Outside U | (T Disjoint U) ∨ (T Touch U)
yes | T Intersect U | ~(T Disjoint U)
no | T Anterior U | (T Before U) ∨ (T MeetBeg U)
no | T Posterior U | (T After U) ∨ (T MeetEnd U)
no | T Precede U | (T Before U) ∨ (T MeetBeg U) ∨ (T CoverBeg U)
no | T Succeed U | (T After U) ∨ (T MeetEnd U) ∨ (T CoverEnd U)
no | T Bound U | ((T Start U) ∨ (T Finish U)) ∧ (T Inside U)
no | T Initiate U | (T Start U) ∨ (T CoverEnd U)
no | T Terminate U | (T Finish U) ∨ (T CoverBeg U)
no | T Begin U | (T Initiate U) ∨ (T Equal U)
no | T End U | (T Terminate U) ∨ (T Equal U)

b) Comparison between the durations of two time intervals (periods or instants)

Commutative | Operator | Operational definition
yes | T Equivalent U | (T” - T’) = (U” - U’)
no | T Shorter U | (T” - T’) < (U” - U’)
no | T Longer U | (T” - T’) > (U” - U’)
no | T ShorterEquiv U | (T Shorter U) ∨ (T Equivalent U)
no | T LongerEquiv U | (T Longer U) ∨ (T Equivalent U)
yes | T Different U | ~(T Equivalent U)

Temporal operands (T and U) are delimited by their beginning (T’ and U’) and ending (T” and U”) time stamps
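A few of these operators can be sketched as plain Python predicates. This is a minimal sketch rather than the authors' implementation: the `Interval` type, the function names and the sample intervals are assumptions, and the composite operators (Touch, Inside, Disjoint, Precede) are read as disjunctions of their parts, which is an interpretation of the definitions.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    beg: float  # T' (beginning time stamp)
    end: float  # T" (ending time stamp)


def equal(t, u):     return t.beg == u.beg and t.end == u.end
def meet_beg(t, u):  return t.end == u.beg
def meet_end(t, u):  return meet_beg(u, t)
def touch(t, u):     return meet_beg(t, u) or meet_end(t, u)
def during(t, u):    return t.beg > u.beg and t.end < u.end
def start(t, u):     return t.beg == u.beg and t.end < u.end
def finish(t, u):    return t.beg > u.beg and t.end == u.end
def inside(t, u):    return during(t, u) or start(t, u) or finish(t, u)
def contain(t, u):   return inside(u, t)
def cover_beg(t, u): return t.beg < u.beg < t.end < u.end
def before(t, u):    return t.end < u.beg
def after(t, u):     return before(u, t)
def disjoint(t, u):  return before(t, u) or after(t, u)
def intersect(t, u): return not disjoint(t, u)
def precede(t, u):   return before(t, u) or meet_beg(t, u) or cover_beg(t, u)


# Hypothetical episodes, expressed in years.
studies = Interval(1980, 1984)
first_job = Interval(1984, 1990)
marriage = Interval(1986, 1988)

print(touch(studies, first_job))     # studies end exactly when the job begins
print(during(marriage, first_job))   # marriage lies strictly within the job episode
print(precede(studies, marriage))    # studies end before the marriage begins
```

Note that under these definitions two intervals that merely touch are not Disjoint, because Before requires a strict inequality; they are Outside each other but still Intersect.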
Spatial Operators on Two Spatial Objects
(Clementini’s primitive operators are identified with grey tones on the original slide)

Commutative | Operator | Operational definition
yes | E Equal F | (E° ∩ F° = E° ∪ F°) ∧ (∂E ∩ ∂F = ∂E ∪ ∂F)
yes | E Touch F | (E° ∩ F° = ∅) ∧ (E ∩ F ≠ ∅)
no | E Inside F | (E ∩ F = E) ∧ (E° ∩ F° ≠ ∅)
no | E Contain F | F Inside E
yes | E Overlap F | (E ∩ F ≠ E) ∧ (E ∩ F ≠ F) ∧ (E° ∩ F° ≠ ∅)
yes | E Disjoint F | E ∩ F = ∅
yes | E Outside F | (E Disjoint F) ∨ (E Touch F)
yes | E Intersect F | ~(E Disjoint F)

Spatial operands (E and F) are formed by their interiors (E° and F°) and boundaries (∂E and ∂F)
Duration Operators Between Two Time Periods
Commutative | Duration operator | Operational definition | Exceptions
yes | T DSpan U | Maximum (T”, U”) - Minimum (T’, U’) | none
yes | T DMerge U | Maximum (T”, U”) - Minimum (T’, U’) | If (T Disjoint U) then 0
yes | T DCommon U | If (T Inside U) then T” - T’; if (T Contain U) then U” - U’; if (T CoverBeg U) then T” - U’; if (T CoverEnd U) then U” - T’; if (T Equal U) then T” - T’ | If (T Outside U) then 0
yes | T Distance U | If (T Before U) then U’ - T” else T’ - U” | If ~(T Disjoint U) then 0
no | T DBefore U | U’ - T” | If ~(T Before U) then 0
no | T DAfter U | T’ - U” | If ~(T After U) then 0
no | T DAnterior U | U’ - T’ | If ~(T Anterior U) then 0
no | T DPosterior U | T” - U” | If ~(T Posterior U) then 0

Temporal operands (T and U) are delimited by their beginning (T’ and U’) and ending (T” and U”) time stamps
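Some of the duration operators can be sketched in Python as follows. This is a hedged sketch: intervals are represented as `(beginning, ending)` tuples and the function names are assumptions. `d_common` uses a single closed form, the earlier ending minus the later beginning clipped at zero, which gives the same value as each case listed for DCommon.

```python
def before(t, u):
    """T Before U: T ends strictly before U begins."""
    return t[1] < u[0]


def d_span(t, u):
    """DSpan: extent from the earliest beginning to the latest ending."""
    return max(t[1], u[1]) - min(t[0], u[0])


def d_common(t, u):
    """DCommon: length of the shared part, 0 when T is Outside U."""
    return max(0, min(t[1], u[1]) - max(t[0], u[0]))


def distance(t, u):
    """Distance: gap separating two disjoint intervals, else 0."""
    if before(t, u):
        return u[0] - t[1]
    if before(u, t):
        return t[0] - u[1]
    return 0


def d_before(t, u):
    """DBefore: U' - T" when T is Before U, else 0 (not commutative)."""
    return u[0] - t[1] if before(t, u) else 0


# Hypothetical episodes measured in months since leaving the parental home.
unemployed = (0, 4)
job = (6, 9)
lease = (0, 5)
course = (3, 9)

print(d_span(unemployed, job), distance(unemployed, job), d_before(job, unemployed))
print(d_common(lease, course))
```

Distance is commutative whereas DBefore is not: `d_before(unemployed, job)` returns the two-month gap, while `d_before(job, unemployed)` falls under the exception and returns 0.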
Distance Operators Between Two Spatial Objects
Commutative | Distance operator | Operational definition (Euclidean distances) | Exceptions
yes | E DisCtrs F | Length (Line (Ec_long:Ec_lat; Fc_long:Fc_lat)) | none
yes | E Distance F | Length (Shortest Line (E_long:E_lat; F_long:F_lat)) | If ~(E Outside F) then null
no | E DistInside F | Length (Shortest Line (E_long:E_lat; F_long:F_lat)) | If ~(E Inside F) then null
no | E DistContain F | Length (Shortest Line (E_long:E_lat; F_long:F_lat)) | If ~(E Contain F) then null

Spatial operands (E and F) are defined by their respective boundaries (∂E and ∂F) and centre points (Ec and Fc)
Spatio-temporal Query of Patterns of Facts within Trajectories
We developed a query interface combining georelational GIS capabilities and temporal/historical ordering of facts (including search of patterns) using ODBC links
[Figure: screenshot of the query interface dialog, with callouts:
• Specifying target trajectory/fact
• Specifying time ordering
• Specifying patterns of facts
• Specifying temporal conditions
• Specifying duration condition
• Specifying spatial location condition
• Specifying spatial distance condition
• Specifying other status condition]
Linking to Event History Regression Analysis
• Evolution phenomena are related to facts giving evidence of change
• These facts and their possible relationships are recorded using relational databases
• We want to submit these data, and expressions based on them, to statistical analysis in order to build event history models
• Ordinary multiple regression is ill-suited to the analysis of biographies because of two peculiarities: censoring and time-varying explanatory variables
• Censoring refers to the fact that the value of a variable may be unknown at the time of the survey, generally because the event has not occurred (e.g. duration of marriage for a person who has never divorced); computation of the divorce rate should take censoring into account
• Considering time-varying explanatory factors: to study the effect of family composition on residential location choice, one needs to consider time-varying information
• A bio-statistical method called event history regression analysis can handle such a problem (it combines survival tables and logistic regression)
• The query interface facilitates the data restructuring needed for this kind of statistical analysis
Example of ST Query on Personal Trajectories
Within the Quebec Metro Area, considering only facts at a distance >= 500 metres from the respondent’s first owned home (filtering), retain the arrival or birth events of the first three children (before any fourth: censoring), provided their ending time was not during (Disjoint) the first tenant episode and they were separated by more than 2 months from at least one (Any) job episode (criteria). The periods of the selected facts are extended by 60 days before and 30 days after the actual time stamps (time buffering).
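Part of this query can be sketched in Python to show how a rank operator, a Disjoint criterion and time buffering combine. This is an illustrative sketch under stated assumptions, not the interface's actual logic: the `Fact` type, the state labels and the sample dates are hypothetical, the spatial filter and the job-episode criterion are left out, and the criterion is applied to the unbuffered period before buffering.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Fact:
    state: str   # life state label (hypothetical)
    begin: date
    end: date


def first_n(facts, state, n):
    """Rank operator: the first n facts of a given state, chronologically."""
    chosen = sorted((f for f in facts if f.state == state), key=lambda f: f.begin)
    return chosen[:n]


def disjoint(t, u):
    """T Disjoint U: one period ends strictly before the other begins."""
    return t.end < u.begin or u.end < t.begin


def buffered(fact, days_before, days_after):
    """Time buffering: extend the period on both sides."""
    return Fact(fact.state,
                fact.begin - timedelta(days=days_before),
                fact.end + timedelta(days=days_after))


facts = [
    Fact("tenant", date(1980, 1, 1), date(1984, 6, 30)),
    Fact("child birth", date(1983, 3, 10), date(1983, 3, 10)),
    Fact("child birth", date(1986, 5, 2), date(1986, 5, 2)),
    Fact("child birth", date(1989, 9, 20), date(1989, 9, 20)),
    Fact("child birth", date(1992, 1, 15), date(1992, 1, 15)),  # fourth child, censored
]

first_tenant = first_n(facts, "tenant", 1)[0]
births = first_n(facts, "child birth", 3)
selected = [buffered(b, 60, 30) for b in births if disjoint(b, first_tenant)]

for fact in selected:
    print(fact.state, fact.begin, fact.end)
```

Here the first birth falls within the first tenant episode and is dropped, so two buffered birth events remain.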
Event History Analysis
• Survival tables use conditional probabilities to estimate the mean proportion of people experiencing some change in their life after a significant event occurs (e.g. proportion of tenants buying a home after the arrival of the second child), computing the time delay after a specified enabling event (e.g. time to divorce after marriage)
• However, these probabilities are not exactly the same for everyone, because specific conditions may influence the propensity to change
• Finding the specific factors that condition an individual's propensity to do something requires a combination of survival tables and logistic regression, to estimate the marginal effect of other personal attributes on the probability that an event occurs
• The purpose of Event History Analysis (also called Cox Regression) is to model specific variations of the probability of state transition through time for individuals, considering independent (even time-varying) variables describing their personal situation on other lifelines (e.g. What is the marginal effect of a 6-month unemployment period that occurred less than five years ago on the propensity to buy a home after the second child is born? Is there a significant effect? Is this effect stable over time and space?)
Probability for tenants to buy a house after their first child is born

$$\mathrm{probability}(\mathit{BuyHouse}) = \bigl(1 - \mathrm{survival}(\mathit{Tenant})\bigr) \times \mathrm{propensity}(\mathit{Buy})$$

$$\mathrm{odds}(\mathit{Buy}) = e^{0.137\,\mathit{duresepis}} \; e^{0.007\,\mathit{distmove}} \; e^{-2.115\,\mathit{sixties}} \; e^{-1.574\,\mathit{seventies}} \; e^{-0.267\,\mathit{eighties}}$$

$$\mathrm{propensity}(\mathit{Buy}) = \frac{\mathrm{odds}(\mathit{Buy})}{1 + \mathrm{odds}(\mathit{Buy})}$$

duresepis: duration of residential episode (years)
distmove: distance between the tenant and the new home (km)
sixties: first child birth was during the sixties
seventies: first child birth was during the seventies
eighties: first child birth was during the eighties

Modelling propensity of tenants for buying a home after the first child is born

Variable | Units | Quantity | Coefficient | Effect
Duration of residential episode in the new house (proxy for employment status and stability) | years | 5 | 0.137 | 1.98377
Distance between the place where the child was born and the new home location (proxy of willingness to move far away to enhance market opportunities) | km | 5 | 0.007 | 1.03562
Decade during which the child was born (retain only one choice): 1960-69 | | 0 | -2.115 | 1
Decade during which the child was born: 1970-79 | | 0 | -1.574 | 1
Decade during which the child was born: 1980-89 | | 1 | -0.267 | 0.76567
Decade during which the child was born: 1990-96 | | 0 | 0 | 1

Time elapsed after the first child is born: 2 years
Odds ratio: 1.57302
Survival table cumulative proportion: 30.0%
Marginal probability: 61.1%
Probability to buy a house: 18.3%

[Figure: Survival Functions of Tenants After the First Child is Born (cumulative proportions). X axis: years after the birth of the first child (0 to 20). Y axis: cumulative proportion (0 to 1). Series: Tenant, Home owner.]
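The worked figures on this slide can be retraced with a short Python calculation. This is a sketch for checking the arithmetic, assuming that the 30.0% "survival table cumulative proportion" is the factor (1 - survival(Tenant)) two years after the birth; the function and variable names are illustrative.

```python
import math

# Coefficients as reported on the slide (decimal commas converted to points).
COEF = {
    "duresepis": 0.137,   # duration of residential episode (years)
    "distmove": 0.007,    # distance to the new home (km)
    "sixties": -2.115,    # decade indicators; 1990-96 is the reference category
    "seventies": -1.574,
    "eighties": -0.267,
}


def odds_buy(**values):
    """odds(Buy): exponential of the weighted sum of the explanatory variables."""
    return math.exp(sum(COEF[name] * value for name, value in values.items()))


def propensity(odds):
    """propensity(Buy) = odds / (1 + odds)."""
    return odds / (1 + odds)


# Slide example: 5-year episode, 5 km move, first child born in the eighties.
odds = odds_buy(duresepis=5, distmove=5, sixties=0, seventies=0, eighties=1)
prop = propensity(odds)

cumulative_proportion = 0.30  # assumed to equal 1 - survival(Tenant) at 2 years
probability = cumulative_proportion * prop

print(f"odds ratio  = {odds:.5f}")
print(f"propensity  = {prop:.1%}")
print(f"probability = {probability:.1%}")
```

The individual "effect" values in the table are the separate factors of the same product: exp(0.137 × 5), exp(0.007 × 5) and exp(-0.267).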
Example of Event-History Analysis Results
[Figure: Propensity of Tenants to Buy a Home Conditional to Decade and Elapsed Time after Birth of the First Child. X axis: years after birth of first child (0 to 20). Y axis: probability of being home owner (0 to 1). Series: 1960-69, 1970-79, 1980-89, 1990-95.]
Rate of access to property ownership significantly increases through time, from the sixties to the eighties
[Figure: Propensity of Tenants to Buy a Home Conditional to Employment Status and Elapsed Time after Birth of the First Child. X axis: years after birth of first child (0 to 20). Y axis: probability of being home owner (0 to 1). Series: Unstable, Stable, Very stable.]
How much stability in employment increases propensity to buy a home
Discussion and Conclusion
The modelling approach and the query interface
• Use standard entity-relationship principles, combined with geo-relational technology
• Encapsulate application semantics within the database structure, allowing for the development of a generic query interface
• Provide means for combining facts (events and episodes), locations, timings, lifelines and trajectories within a unified framework, allowing for exploration of patterns of facts and evolution networks
• Integrate spatial, temporal and thematic operators within a unified dialog
• Provide original temporal rank and set operators in addition to Allen’s and Clementini’s operators
Conclusion
• To the best of our knowledge, this type of application for the spatial monitoring of changes in population behaviour is original
• Keeping track of dynamics using GIS has a strong potential to enhance urban and transportation planning