Dracones: Web-Based Mapping and Spatial Analysis for Public Health Surveillance Christian Jauvin David Buckeridge McGill University


Page 1:

Dracones: Web-Based Mapping and Spatial Analysis for Public Health Surveillance

Christian Jauvin

David Buckeridge

McGill University

Page 2:

Summary

Dracones: built with MapServer/PostGIS

We'll be covering:
  Public Health context
  Software architecture
  Some specific problems

Page 3:

Public Health - Two Perspectives

Case management
  Individual cases of notifiable diseases
  Relationship networks

Population surveillance
  Larger risk patterns

Page 4:

Case Management

Questions/problems:
  Is a case due to recent transmission?
  If so, does the case share any feature with other, recent cases?

Ways it's being done:
  Investigations/interviews
  Meeting with other investigators

Page 5:

Population Surveillance

Questions/problems:
  Are more cases happening than expected?
  Does an excess suggest ongoing transmission in a specific region?

Way it's being done:
  Semi-automated routine temporal and space-time statistical analysis (SaTScan)

Page 6:

Montreal DSP

Département de santé publique de Montréal (Public Health Agency)

Need: incorporate spatial data + analysis capabilities within workflow

One reason: research shows that spatial information helps

Answer: Dracones project
  Funded in part by GeoConnections
  Led by David Buckeridge, MD, PhD
  15-month contract

Page 7:

Case Management at the DSP

Current Situation
  Information on paper entered into system (Oracle DB + Forms)
  System contains sensitive data (names, addresses)
  Limited tools for analyzing case data

Project Goal
  Capture spatial data
  Visualize and analyze spatial distribution of cases

Page 8:

Population Surveillance at the DSP

Current Situation
  Routine temporal and space-time statistical analysis
  Capacity to visualize time-series but not maps

Project Goal
  Add mapping capacity
  Extend range of analytic methods

Page 9:

Why Location Matters - Case Management

If you are studying a case of a certain disease that was just declared

It is harder to picture the situation by looking at something like this...

Page 10:

Why Location Matters - Case Management

Page 11:

Why Location Matters - Case Management

Than by looking at this...

Page 12:

Why Location Matters - Case Management

Page 13:

Why Location Matters - Population Surveillance

If you are studying the spatial distribution of a set of disease clusters

This would seem more difficult...

Page 14:

Why Location Matters - Population Surveillance

Page 15:

Why Location Matters - Population Surveillance

Than this...

Page 16:

Why Location Matters - Population Surveillance

Page 17:

Development Process

Management Team
  Led by public health MD with informatics training
  Members from each area of DSP involved

User Involvement
  Users on management team
  Input throughout requirements, design, development

Page 18:

Software Required and Our Choices

Software Type Required          Our Choice
~GIS                            MapServer
General + Spatial DB            PostgreSQL + PostGIS
Cartography-enabled client      HTML/JavaScript
Analytical / statistical tools  SaTScan, R, Python

Page 19:

Web Architecture Benefits

Usually lighter/simpler technologies
Cross-platform
Ease of deployment and integration
Builds on existing set of conventions and behaviours

Page 20:

System Architecture

[Diagram: system architecture]
Current Case Management System: Oracle DB, Oracle Forms
Bridge
Dracones: Web client; Apache + PHP; MapServer + MapScript; PostgreSQL/PostGIS DB; Python, R, SaTScan

Page 21:

Client Side - UI

UI is 100% JavaScript (ExtJS library)
Future project: extract the map-manipulation parts:
  Tile-based panning
  Zooming
  Layer activation
and release them under an open-source license

Page 22:

Client Side - Functions

From the results of a query performed in the Oracle client, launch the application to visualize the results

Inspect those results by varying certain parameters

Launch external analysis tools

Page 23:

Server Side - MapServer

MapServer: open-source tool that adds geospatial content to web applications
  Can be used as a CGI
  Can interface with many programming languages
  Works very closely with PostGIS

Page 24:

Server Side - MapServer

MapServer with Apache 2.2, using PHP5
Linux and Windows
Since it's stateless, each interaction:
  Builds a map object from a base mapfile
  Modifies the map object (according to client parameters)
  Returns the rendered map as a file to the client (which will display it)

Page 25:

MapServer - Layers

A map object is made of layers
A layer can be loaded from a shapefile (ESRI open format), which specifies its geometry
Or it can be loaded directly from a PostGIS table

Page 26:

PostGIS

PostGIS: spatial extension for PostgreSQL

Adds geometry types (points, lines, polygons, etc)

Spatial functions and operators (distance, convex hull, intersection, etc)

Spatial indexes

Page 27:

PostGIS

Queries that mix spatial and non-spatial aspects of the data

If you have a case table:

case_id  condition  region_id
1        TB         10
2        Gastro     20

Page 28:

PostGIS

And a region table:

region_id  name        geom
10         Centre-Sud  POLYGON(…)
20         Hochelaga   POLYGON(…)

Page 29:

PostGIS

You can then build a query like this:

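-- "case" is quoted because CASE is a reserved word in SQL
SELECT * FROM "case", region
WHERE "case".condition = 'TB'
  AND "case".region_id = region.region_id
  -- ST_Within / ST_GeomFromText are the current PostGIS names of within / GeomFromText
  AND ST_Within(region.geom, ST_GeomFromText('POLYGON(…)'))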

Page 30:

PostGIS

A MapServer layer can be built simply by adding a connection attribute pointing to the PostgreSQL table (two lines really!)

Shapefile and table sources can be mixed

Page 31:

Analysis Tools - SaTScan

Requirement: interfacing with analysis tools

SaTScan: detection of space-time clusters
  Scan for areas where the probability of being a case is significantly higher than being a non-case
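To make the scanning idea concrete, here is a minimal sketch of a log-likelihood ratio of the kind used by Bernoulli (case / non-case) scan statistics. It is not SaTScan's code: the function name, the two candidate zones and all counts are hypothetical, and the Monte Carlo step normally used to assess significance is left out.

```python
import math


def bernoulli_llr(c, n, total_cases, total_pop):
    """Log-likelihood ratio for one candidate zone.

    c cases among n individuals inside the zone, compared with
    total_cases among total_pop individuals in the whole study area.
    Returns 0.0 unless the case proportion is higher inside than outside.
    """
    outside_pop = total_pop - n
    outside_cases = total_cases - c
    if n == 0 or outside_pop == 0:
        return 0.0
    p_in = c / n
    p_out = outside_cases / outside_pop
    if p_in <= p_out:
        return 0.0

    def xlogy(x, y):
        # Convention: 0 * log(0) is treated as 0.
        return 0.0 if x == 0 else x * math.log(y)

    p_null = total_cases / total_pop
    ll_alt = (xlogy(c, p_in) + xlogy(n - c, 1 - p_in)
              + xlogy(outside_cases, p_out)
              + xlogy(outside_pop - outside_cases, 1 - p_out))
    ll_null = (xlogy(total_cases, p_null)
               + xlogy(total_pop - total_cases, 1 - p_null))
    return ll_alt - ll_null


# Hypothetical counts: (cases, individuals) per candidate zone.
zones = {"Centre-Sud": (5, 10), "Hochelaga": (1, 10)}
TOTAL_CASES, TOTAL_POP = 10, 100

# The most likely cluster is the zone with the largest ratio.
best_zone = max(
    zones, key=lambda z: bernoulli_llr(*zones[z], TOTAL_CASES, TOTAL_POP)
)
```

A real scan evaluates many overlapping candidate zones (and time windows) rather than two fixed regions.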

Page 32:

Analysis Tools

Since it's a command-line tool without an open API, we use Python to run it, parse the results and plot them using MapServer

We do the same for some external R routines
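The run-and-parse pattern described above could look roughly like the sketch below. Everything specific in it is an assumption: the real SaTScan command line and output format are not reproduced, the "region_id,p_value" line format is made up for illustration, and a child Python process stands in for the external tool.

```python
import csv
import io
import subprocess
import sys


def run_tool(command):
    """Run a command-line tool and return its standard output as text."""
    completed = subprocess.run(
        command, capture_output=True, text=True, check=True
    )
    return completed.stdout


def parse_clusters(text):
    """Parse hypothetical 'region_id,p_value' lines into a list of dicts."""
    rows = csv.DictReader(
        io.StringIO(text), fieldnames=["region_id", "p_value"]
    )
    return [
        {"region_id": int(r["region_id"]), "p_value": float(r["p_value"])}
        for r in rows
    ]


def significant_regions(clusters, alpha=0.05):
    """Keep the region ids whose p-value is below alpha, ready to be mapped."""
    return [c["region_id"] for c in clusters if c["p_value"] < alpha]


# Stand-in for the real tool: a child process printing two result lines.
fake_tool = [sys.executable, "-c", "print('10,0.01'); print('20,0.40')"]
clusters = parse_clusters(run_tool(fake_tool))
highlighted = significant_regions(clusters)
```

The list of region ids is what would then be handed to the mapping layer for display.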

Page 33:

System Data Sources

Health data
  Reportable disease database
  Ancillary data on contacts

Geographical data
  Street networks and postal code file
  Health regions, census, postal boundaries

Page 34:

Using Address Data from a Public Health Database

Problem: addresses are stored as character fields:

  No validation at the entry point
  Data quality is compromised

Address:

1500-a Sherbroooke St. Ouest

Page 35:

Two Problems with Address Processing

The addresses need to be parsed, and the numerous possible transcription errors and ambiguities must be resolved

Addresses that refer to the same place must be identified and treated as a single object

Page 36:

Possible Solutions

These could be solved in a more SQL-integrated manner: edit distance module for PG (?)

We decided however to go the procedural way (using Python)
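As an illustration of the procedural route, here is a minimal edit-distance (Levenshtein) sketch in Python. It is not the project's code: the function names, the list of known streets and the distance threshold are assumptions chosen for the example.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def closest_street(name, known_streets, max_distance=2):
    """Return the known street closest to name, or None if all are too far."""
    best = min(
        known_streets,
        key=lambda street: edit_distance(name.lower(), street.lower()),
    )
    if edit_distance(name.lower(), best.lower()) <= max_distance:
        return best
    return None


# Hypothetical reference list; the misspelling comes from the example address.
streets = ["Sherbrooke", "Saint-Denis"]
match = closest_street("Sherbroooke", streets)
```

A threshold that is too generous would start merging distinct streets, so it would need tuning against real data.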

Page 37:

Address Validation Algorithm - Requirements

A database with:
  (1) the street network geometry
  (2) the street segment address ranges
  (3) the postal code geometry and street range association

Page 38:

Address Validation Algorithm

So you will know for instance that:

[Diagram: two segments of Sherbrooke Street, with address range labels 1001, 2001, 3001 and 998, 1998, 2998, and postal codes H2X2T1 and H2X2T2]

Page 39:

Address Validation Algorithm - Steps

Parse the text addresses into 3 tokens: {S#, SN, PC} (street number, street name, postal code)

For each triplet:
  Try to find an exact match, while being tolerant on SN (maximum coverage, edit distance...)
  Still tolerant on SN, try to vary PC
  Likewise tolerant on SN, fix PC and vary S#
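The steps above can be sketched as follows. This is one possible reading, not the project's implementation: the reference rows, the regular expression, the noise-word list and the status labels are all hypothetical, and street-name tolerance uses difflib's similarity ratio from the standard library in place of a true edit distance.

```python
import difflib
import re

# Hypothetical reference rows: (street name, low number, high number, postal code).
SEGMENTS = [
    ("SHERBROOKE", 998, 1998, "H2X2T1"),
    ("SHERBROOKE", 1998, 2998, "H2X2T2"),
    ("SAINT-DENIS", 100, 900, "H2X3K2"),
]
# Words dropped from the street name before matching (assumed list).
STREET_NOISE = {"ST", "STREET", "RUE", "AVE", "AVENUE",
                "OUEST", "EST", "WEST", "EAST"}
ADDRESS_RE = re.compile(
    r"^\s*(?P<number>\d+)\S*\s+(?P<street>.+?)\s*"
    r"(?P<postal>[A-Za-z]\d[A-Za-z]\s?\d[A-Za-z]\d)?\s*$"
)


def parse_address(text):
    """Split a free-text address into (S#, SN, PC); PC may be None."""
    found = ADDRESS_RE.match(text)
    if found is None:
        return None
    words = [w.strip(".,").upper() for w in found.group("street").split()]
    name = " ".join(w for w in words if w not in STREET_NOISE)
    postal = (found.group("postal") or "").replace(" ", "").upper()
    return int(found.group("number")), name, postal or None


def match_street(name):
    """Tolerant street-name lookup (difflib ratio, cutoff chosen arbitrarily)."""
    names = sorted({segment[0] for segment in SEGMENTS})
    hits = difflib.get_close_matches(name, names, n=1, cutoff=0.8)
    return hits[0] if hits else None


def validate(text):
    """Return (status, segment) following the three-step cascade."""
    parsed = parse_address(text)
    street = match_street(parsed[1]) if parsed else None
    if street is None:
        return "unrecoverable", None
    number, _, postal = parsed
    on_street = [s for s in SEGMENTS if s[0] == street]
    # Step 1: number and postal code both agree with one segment.
    for seg in on_street:
        if seg[1] <= number <= seg[2] and seg[3] == postal:
            return "exact", seg
    # Step 2: trust the number and let the postal code vary.
    for seg in on_street:
        if seg[1] <= number <= seg[2]:
            return "recovered", seg
    # Step 3: trust the postal code and let the number vary.
    for seg in on_street:
        if seg[3] == postal:
            return "recovered", seg
    return "unrecoverable", None


result = validate("1500-a Sherbroooke St. Ouest H2X 2T1")
```

The order of the fallbacks encodes which token is trusted most; a different ordering would recover a different set of addresses.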

Page 40:

Address Validation Algorithm - Batch Results

By doing a batch analysis of the DSP data (105K records), we found that:
  84% of the address records were "exact"
  14.5% were recoverable errors
  1.5% were non-recoverable errors

Page 41:

Last Address Processing Step: Geocoding

Geocoding by interpolation:

[Diagram: the address "1500 Sherbrooke" placed along two segments of Sherbrooke Street, with address range labels 1001, 2001, 3001 and 998, 1998, 2998, and postal codes H2X2T1 and H2X2T2]
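Geocoding by interpolation can be illustrated with a short sketch. The function name and the segment coordinates are hypothetical, and the segment is treated as a straight line in arbitrary planar units; only the street numbers come from the slide.

```python
def interpolate_address(number, low, high, start, end):
    """Place a street number along a segment by linear interpolation.

    low and high are the address range of the segment; start and end are
    the (x, y) coordinates of the two ends of the segment.
    """
    if not low <= number <= high:
        raise ValueError("street number is outside the segment's address range")
    fraction = 0.0 if high == low else (number - low) / (high - low)
    x = start[0] + fraction * (end[0] - start[0])
    y = start[1] + fraction * (end[1] - start[1])
    return x, y


# Hypothetical geometry: the 998 to 1998 range laid out from (0, 0) to (1000, 0).
point = interpolate_address(1500, 998, 1998, (0.0, 0.0), (1000.0, 0.0))
```

With these made-up coordinates, 1500 lands roughly halfway along the segment, since it is 502 numbers into a range of 1000.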

Page 42:

A Last Problem

DSP management system is read-only (for us)

Not spatially enabled
Must not affect performance

Page 43:

And its Solution

Create a mirror of the DSP data model, using PG

Augmented with spatial aspects (and more adapted address handling)

Refreshed periodically
  Reprocessing of the content that has changed
  Extraction of the new content
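One way a periodic refresh could tell new records from changed ones is sketched below. The slides do not say how changes are detected; fingerprinting each record with a hash is an assumption, as are the function names and the sample records.

```python
import hashlib


def fingerprint(record):
    """Stable hash of a record's fields, used to detect changed content."""
    joined = "\x1f".join(f"{key}={record[key]}" for key in sorted(record))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def plan_refresh(source_rows, mirror_fingerprints):
    """Split source rows into new ids and changed ids.

    source_rows: dict of record id -> record (dict) read from the source.
    mirror_fingerprints: dict of record id -> fingerprint stored at the
    previous refresh.
    """
    new_ids, changed_ids = [], []
    for record_id, record in source_rows.items():
        stored = mirror_fingerprints.get(record_id)
        if stored is None:
            new_ids.append(record_id)        # to be extracted
        elif stored != fingerprint(record):
            changed_ids.append(record_id)    # to be reprocessed
    return new_ids, changed_ids


# Hypothetical state: record 1 unchanged, record 2 edited, record 3 new.
old = {1: {"address": "1500 Sherbrooke"}, 2: {"address": "10 Saint-Denis"}}
mirror = {rid: fingerprint(rec) for rid, rec in old.items()}
source = {
    1: {"address": "1500 Sherbrooke"},
    2: {"address": "12 Saint-Denis"},
    3: {"address": "998 Sherbrooke"},
}
new_ids, changed_ids = plan_refresh(source, mirror)
```

Only the records in the two returned lists would need address validation and geocoding again, which keeps the load on the read-only source low.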

Page 44:

A Challenge

Interface and extend existing:
  System
  Environment (including an important community of users and developers)

Page 45:

Lessons Learned

Very strong interest in using spatial information at the DSP, but infrastructure, skills and data quality are limiting
  Large effort to validate and correct all addresses

The science of spatial analysis in public health often lags the technology
  How to analyze multiple locations for each individual?
  How important is spatial location in an urban area?

Open-source, web-based mapping software and spatial databases (MapServer, PostGIS) are robust and easy to work with for skilled developers

Page 46:

Acknowledgements

GeoConnections, CIHR

McGill University
  Aman Verma, Sherry Olsen, Andrew Carter

Montreal DSP
  Louise Marcotte
  Robert Allard, Lucie Bedard, André Bilodeau

Montreal Chest Institute
  Kevin Schwartzman, Jonathan Richard
  Alice Zwerling, Marie-Josee Dion