Data.gov Wiki: A Semantic Web Approach to Government Data Li Ding, Dominic DiFranzo, Sarah Magidson, Alvaro Graves, James R. Michaelis, Xian Li, Deborah L. McGuinness, Jim Hendler Tetherless World Constellation Nov 2, 2009


Page 1

Data.gov Wiki: A Semantic Web Approach to Government Data

Li Ding, Dominic DiFranzo, Sarah Magidson, Alvaro Graves, James R. Michaelis, Xian Li, Deborah L. McGuinness, Jim Hendler

Tetherless World Constellation

Nov 2, 2009

Page 2

Synergy

• Government: data is out there “as is”
• Loop: gov data and linked data
• Loop: gov data and web developers
• Loop: gov data and end users

Page 3

Government Data on the Web

Page 4

Objectives

• Investigate the role of the Semantic Web in producing, processing and utilizing government datasets
– To enrich the value of data via normalizing, linking and information extraction
– To realize the value of data via applications, esp. visualization
– To support web developers via machine-friendly data access and web services

Page 5

Semantic Web Architecture for Government Data

[Figure: architecture diagram. Stages: Convert Data, Link & Enrich Data, View & Use Data. Labeled components: GOV data (RDF), Linked Data, RDF/XML, Sem Wiki, Link Annotator, SPARQL End Point. Data Processors (Web Services & Analyzers): SPARQL Web Service, XSLT Service, Diff Service, RSS Generator. Output formats and viewers: Google Viz, MIT Exhibit, RSS 1.0, tagCloud, CSV, XSL, …, Tabulator.]

Li Ding, Dominic DiFranzo, Sarah Magidson, and Jim Hendler · Tetherless World Constellation · Rensselaer Polytechnic Institute · Aug 7 2009 · http://data-gov.tw.rpi.edu/
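The "Convert Data" stage of this architecture turns tabular government data into RDF. The slides do not show how the conversion is done, so the following is only a minimal sketch of one common approach: each CSV row becomes a subject and each non-empty cell becomes one triple in N-Triples syntax. The `BASE` namespace, the URI layout, the function names and the sample rows are all assumptions made for illustration.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical base namespace; the slides do not give the real URI scheme.
BASE = "http://example.org/dataset/"


def escape_literal(value):
    """Escape a string so it can be used as an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def csv_to_ntriples(csv_text, dataset_id):
    """Turn each CSV row into a subject and each non-empty cell into a triple."""
    triples = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        subject = f"<{BASE}{dataset_id}/row/{row_number}>"
        for column, value in row.items():
            # Skip cells without a header and empty or missing cells.
            if column is None or not value:
                continue
            name = quote(column.strip(), safe="")
            predicate = f"<{BASE}{dataset_id}/property/{name}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


# Made-up sample rows, not taken from any data.gov dataset.
sample = "site_id,state,ozone\nS1,CT,0.041\nS2,ME,\n"
for line in csv_to_ntriples(sample, "8"):
    print(line)
```

Keeping every value as a plain literal avoids guessing datatypes; typing and linking can then be handled in the later "Link & Enrich Data" stage.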

Page 6

The Landscape

Page 7

The catalog data

Page 8

Data-gov Cloud (Aug 2009)

[Figure: cloud diagram of converted datasets.]

Datasets shown:
• (#10) Residential Energy Consumption Survey
• (#401) Budget Authority and offsetting receipts 1976-2014
• (#403) Governmental Receipts 1962-2014
• (#402) Outlays and offsetting receipts 1962-2014
• (#249) 2006 Toxics Release Inventory
• (#90) 2005-2007 ACS PUMS Housing
• (#191) 2005 Toxics Release Inventory
• (#91) 2005-2007 ACS PUMS Population
• (#34) Worldwide M1+ Earthquakes past 7 days
• (#9) CASTNET Visibility
• (#397) 2007 Toxics Release Inventory
• (#8) CASTNET Ozone
• (@10001) CASTNET sites

Categories shown: Budget, Population, Energy and Utilities, Geography and Environment

Page 9

Data-gov Cloud (Oct 2009)

[Figure: cloud diagram of converted datasets and linked data.]

Datasets shown:
• US-COMMUNITY (2005-2007)
• CASTNET (1990 – Present)
• RECS (2005)
• GOV-BUDGET (1962-2014)
• TOXIC-RELEASE (2005-2008)
• EARTHQUAKE (Present)
• STATE-LIB (2006-2007)
• PUBLIC-LIB (1992-2006)
• MED-COST (1994-2009)
• LABOR-STAT (19xx-Present)
• DATA-GOV-CATALOG (present)
• USAspending (2008-2010)

Categories shown: Government, Community, Services, Environment

Linked Data: CASTNET sites, RECS code, US agency, US location, GeoNames

Li Ding and Jim Hendler · Tetherless World Constellation · Rensselaer Polytechnic Institute · Oct 2009 · http://data-gov.tw.rpi.edu/

Page 10

More statistics

Page 11

Demos

Page 12

Data.gov + epa.gov

Page 13

Gov Data + Corporate Data + User Data

Page 14

Computing Difference of Revisions
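The slide gives only the title of this demo, so the following is a rough sketch of one simple way to compare two revisions of an RDF dataset, not a description of the Diff Service itself. It assumes both revisions are serialized as N-Triples with one triple per line and contain no blank nodes, so that a plain set difference over lines is meaningful. The function names and the sample triples are made up for illustration.

```python
def parse_ntriples(text):
    """Collect non-empty, non-comment lines as a set of triple strings."""
    lines = (line.strip() for line in text.splitlines())
    return {line for line in lines if line and not line.startswith("#")}


def diff_revisions(old_text, new_text):
    """Return the triples added to and removed from the old revision."""
    old = parse_ntriples(old_text)
    new = parse_ntriples(new_text)
    return {"added": sorted(new - old), "removed": sorted(old - new)}


# Made-up revisions of a tiny dataset description.
old_revision = """
<http://example.org/d1> <http://example.org/title> "Sample dataset" .
<http://example.org/d1> <http://example.org/rows> "100" .
"""
new_revision = """
<http://example.org/d1> <http://example.org/title> "Sample dataset" .
<http://example.org/d1> <http://example.org/rows> "120" .
"""

changes = diff_revisions(old_revision, new_revision)
for kind in ("added", "removed"):
    for triple in changes[kind]:
        print(kind, triple)
```

A changed value shows up as one removed triple plus one added triple; datasets with blank nodes would need graph-aware comparison instead of this line-based one.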

Page 15

More demos?

• http://data-gov.tw.rpi.edu/wiki/demos

Page 16

Technical Issues

Page 17

Issues in Data.gov

• Duplicated Datasets - Some datasets are part of another dataset.

– Dataset 140 (2005 Toxics Release Inventory data for the state of California (EPA)) is a subset of Dataset 191.

• Formatting Issues - The format of some datasets is not friendly to machine processing.

– Dataset 37 (Lower Colorado River Daily Average Water Elevations and Releases (US Bureau of Reclamation)).

– Dataset 335 (National Longitudinal Surveys (US Bureau of Labor Statistics)) only tells you how to order data from the government.

• Access Point Issues - Some access points are interactive web pages, which are not friendly to machine access.

– Dataset 330 (Local Area Unemployment Statistics (US Bureau of Labor Statistics)).

Sarah

Page 18

Linking Data

1. link similar datasets by reusing property namespace

2. link to rdfs:label (via rdfs:subPropertyOf) using semantic wiki

3. link to DBpedia (via owl:sameAs) using Wikipedia widget

4. link instances (via common <property, literal-value> pair)

5. link government data with web data (via time and location)

6. link revisions of government data (via knowledge provenance)
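Item 4 in the list above can be sketched as follows. This is only an illustration under stated assumptions: triples are plain `(subject, property, literal)` tuples, both datasets already reuse the same property names (item 1), and the identifiers `a:site1`, `b:record9`, `ex:siteId` and the like are hypothetical. The function names are made up as well.

```python
from collections import defaultdict


def index_by_pair(triples):
    """Map each (property, literal) pair to the subjects that carry it."""
    index = defaultdict(set)
    for subject, prop, literal in triples:
        index[(prop, literal)].add(subject)
    return index


def candidate_links(triples_a, triples_b, link_properties):
    """Pair subjects from A and B that share a <property, literal-value> pair."""
    index_b = index_by_pair(triples_b)
    links = set()
    for subject, prop, literal in triples_a:
        if prop not in link_properties:
            continue  # only properties judged identifying should create links
        for other in index_b.get((prop, literal), ()):
            links.add((subject, other))
    return sorted(links)


# Made-up triples from two hypothetical datasets.
dataset_a = [
    ("a:site1", "ex:siteId", "S1"),
    ("a:site1", "ex:state", "CT"),
    ("a:site2", "ex:siteId", "S2"),
]
dataset_b = [
    ("b:record9", "ex:siteId", "S1"),
    ("b:record9", "ex:state", "CT"),
    ("b:record7", "ex:state", "CT"),
]

print(candidate_links(dataset_a, dataset_b, {"ex:siteId"}))
```

Restricting the match to chosen properties matters: matching on `ex:state` here would also link `a:site1` to `b:record7`, which merely share a state.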

Page 19

Semantic mapping: AI + CI

need manual disambiguation!

Map to Wikipedia/DBpedia Name

Page 20

RDF => SPARQL => Web

• We use SPARQL to bridge Web developers and Semantic Web data.

• A triple store is used to handle RDF datasets containing millions of triples.
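To illustrate how a web developer might reach RDF data through SPARQL, here is a minimal sketch that builds an HTTP GET request following the SPARQL Protocol (the query goes in the `query` parameter, and JSON results are requested through the `Accept` header). The endpoint URL is a placeholder, not the project's actual endpoint, and the query is a generic one that lists a few triples. The request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder endpoint; substitute the SPARQL endpoint you want to query.
ENDPOINT = "http://example.org/sparql"

# Generic query: fetch any ten triples from the store.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
"""


def build_sparql_request(endpoint, query):
    """Build a SPARQL Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
print(request.full_url)

# To execute against a live endpoint:
#   import json
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       results = json.load(response)
#   for binding in results["results"]["bindings"]:
#       print(binding["s"]["value"], binding["p"]["value"], binding["o"]["value"])
```

Because the result is ordinary JSON over HTTP, it can be consumed from a web page or script without any RDF-specific library.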

Page 21

Conclusion

• Semantic Web enabled portal for linked government data
• 5 billion triples from data.gov
• hosts apps, demos & services
• provides education services
• integrates web users’ contributions