toward next generation of gazetteer: utilizing geosparql for developing linked geoname data
This is the presentation for the session on spatiotemporal linked data at PNC 2014.
Toward Next Generation of Gazetteer: Utilizing GeoSPARQL For Developing
Linked Geoname Data
Dongpo Deng
Geospatial Information Specialist
Institute of Information Science, Academia Sinica
[email protected]
2014 Pacific Neighborhoods Consortium Annual Conference and Joint Meeting
Place and Name
• A place is a meaningful location for people.
• To identify places, people give them names, which separate them from undifferentiated space.
• A place is a concept of geography.
• The name of a place is the subject of toponymy.
Gazetteer
• Hill (2000) defines a gazetteer as a geospatial dictionary of geographic names with the following core components:
• A name (possibly with variant names);
• A location (coordinates representing a point, line, or areal location);
• A type (selected from a type scheme of categories for places/features).
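The three core components can be pictured as a small record type. The following is a minimal sketch, not a schema from any of the gazetteers named in this talk; the class name, field names, and sample values are assumptions chosen for illustration, and the location is simplified to a single point.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class GazetteerEntry:
    """One gazetteer record holding the three core components."""
    name: str                        # primary name
    location: Tuple[float, float]    # (longitude, latitude); a point footprint only
    feature_type: str                # category taken from some type scheme
    variant_names: List[str] = field(default_factory=list)

    def all_names(self) -> List[str]:
        """Return the primary name followed by any variant names."""
        return [self.name, *self.variant_names]


# Illustrative values only; the coordinates are approximate.
taipei = GazetteerEntry(
    name="Taipei",
    location=(121.56, 25.03),
    feature_type="city",
    variant_names=["Taipei City"],
)
print(taipei.all_names())
```

A real gazetteer would also need line and areal footprints, which this sketch leaves out.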
ADL Gazetteer
Geonames.org
Ordnance Survey 50k Gazetteer
Getty TGN
Research data management (CKAN) (taijiang.tw)
Metadata management
Place name as Controlled Vocabulary
Ambiguity of Place names
• Many places share the same name
• A place can have many names
• Long place names are often shortened
• Spatial footprints of place names are often difficult to define
Linked Data
• Tim Berners-Lee (2006) proposed four principles:
1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).
4. Include links to other URIs, so that they can discover more things.
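The four principles can be illustrated with a few lines of code that mint an HTTP URI, describe it with RDF triples, and link it to another URI. This is only a sketch: the `http://example.org/` namespace, the local identifier, and the link target are placeholders, and the serializer handles plain string literals without escaping.

```python
BASE = "http://example.org/resource/"  # hypothetical namespace, not a real dataset

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def mint_uri(local_id):
    """Principles 1 and 2: name a thing with an HTTP URI."""
    return BASE + local_id


def to_ntriples(triples):
    """Principle 3: describe things in a standard format (RDF as N-Triples)."""
    lines = []
    for s, p, o in triples:
        # Objects starting with "http" are treated as URIs, everything else as a literal.
        obj = f"<{o}>" if o.startswith("http") else f'"{o}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


place = mint_uri("Place/0001")
triples = [
    (place, RDFS_LABEL, "Taipei"),
    # Principle 4: link to another URI; this target is a placeholder.
    (place, OWL_SAME_AS, "http://example.com/other-dataset/taipei"),
]
print(to_ntriples(triples))
```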
Why Linked Data?
• To create web of data
• To semantically integrate data
• To facilitate data reuse
• The more linked data there is, the more knowledge can be discovered
Linked Open Data Cloud (9/2008)
Linked Open Data Cloud (7/2009)
Linked Open Data Cloud (9/2010)
Linked Open Data Cloud (9/2011)
Linked Open Data Cloud (4/2014)
OGC GeoSPARQL
• GeoSPARQL is a new OGC standard that provides three main components for encoding geographic information:
• (1) Definitions of vocabularies for representing features, geometries, and their relationships;
• (2) A set of domain-specific spatial functions for use in SPARQL queries;
• (3) A set of query transformation rules.
GeoSPARQL Vocabulary: Basic Classes and Relations
Topological Relations between geo:SpatialObject
• OGC simple feature relation family
• Also supports the RCC8 and Egenhofer relation families
[Figure: pairs of regions A and B illustrating each relation]
geo:sfEquals geo:sfTouches geo:sfOverlaps geo:sfContains
geo:sfWithin geo:sfDisjoint geo:sfIntersects geo:sfCrosses
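To give a feel for what these relations test, here is a deliberately simplified sketch that evaluates a few of them on axis-aligned rectangles with positive area. It is not how a GeoSPARQL engine works on real geometries; the `Box` type and the function names are assumptions made for this illustration.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle with positive area, standing in for a real geometry."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def sf_equals(a, b):
    return a == b


def sf_disjoint(a, b):
    """True when the rectangles share no point at all."""
    return (a.max_x < b.min_x or b.max_x < a.min_x or
            a.max_y < b.min_y or b.max_y < a.min_y)


def sf_intersects(a, b):
    """Intersects is the negation of disjoint."""
    return not sf_disjoint(a, b)


def sf_within(a, b):
    """True when a lies inside b (boundaries may coincide)."""
    return (b.min_x <= a.min_x and a.max_x <= b.max_x and
            b.min_y <= a.min_y and a.max_y <= b.max_y)


def sf_contains(a, b):
    """Contains is the inverse of within."""
    return sf_within(b, a)


city = Box(0, 0, 10, 10)
campus = Box(2, 2, 3, 3)
elsewhere = Box(20, 20, 21, 21)
print(sf_within(campus, city), sf_disjoint(campus, elsewhere))
```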
Components of GeoSPARQL
• Vocabulary for Query Patterns
  • Classes: Spatial Object, Feature, Geometry
  • Properties: topological relations; links between features and geometries
  • Datatypes for geometry literals: geo:wktLiteral, geo:gmlLiteral
• Query Functions
  • Topological relations, distance, buffer, intersection, …
• Entailment Components
  • RDFS entailment
  • RIF rules to compute topological relations
Some GeoSPARQL examples
:City rdfs:subClassOf geo:Feature .
:School rdfs:subClassOf geo:Feature .
:Taipei rdf:type :City .
:NTU rdf:type :School .
:NTU :isDeveloped "1928-03-16"^^xsd:date .
:Taipei geo:hasGeometry :geo_001 .
:geo_001 geo:asWKT "Polygon((…))"^^geo:wktLiteral .
:NTU geo:hasGeometry :geo_002 .
:geo_002 geo:asWKT "Polygon((…))"^^geo:wktLiteral .
:NTU geo:sfWithin :Taipei .
meta information
non-geospatial information
geospatial information
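The class statements in the example matter because of RDFS entailment: once :City and :School are subclasses of geo:Feature, :Taipei and :NTU can be treated as features. The sketch below applies that single RDFS rule to a handful of the example triples held as plain tuples. It is an illustration under simplifying assumptions (prefixed names kept as strings, one rule only), not the reasoning machinery of any triple store.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

# A subset of the example triples, written as (subject, predicate, object) tuples.
triples = {
    (":City", SUBCLASS, "geo:Feature"),
    (":School", SUBCLASS, "geo:Feature"),
    (":Taipei", RDF_TYPE, ":City"),
    (":NTU", RDF_TYPE, ":School"),
    (":NTU", "geo:sfWithin", ":Taipei"),
}


def infer_types(graph):
    """Apply the rule (x type C) and (C subClassOf D) => (x type D) until nothing changes."""
    inferred = set(graph)
    changed = True
    while changed:
        changed = False
        subclass_pairs = [(s, o) for s, p, o in inferred if p == SUBCLASS]
        for x, p, c in list(inferred):
            if p != RDF_TYPE:
                continue
            for sub, sup in subclass_pairs:
                if sub == c and (x, RDF_TYPE, sup) not in inferred:
                    inferred.add((x, RDF_TYPE, sup))
                    changed = True
    return inferred


def features(graph):
    """List every subject typed as geo:Feature."""
    return sorted(s for s, p, o in graph if p == RDF_TYPE and o == "geo:Feature")


print(features(triples))               # nothing is typed as a feature explicitly
print(features(infer_types(triples)))  # both instances are features after entailment
```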
BBN Parliament: http://parliament.semwebcentral.org/
A procedure for interlinking data
• Specification: distinguish concepts of place names; URI design
• Modeling: develop ontologies
• Transform: transform data to RDF
• Publish: publish the RDF/OWL
• Utility: utilize the RDF/OWL data for services
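The transform step of this procedure can be sketched as turning spreadsheet rows into N-Triples. The column names, the sample row, and the `http://example.org/` base URI below are invented for illustration; only the tpn: and geo: namespaces and the terms tpn:Place, geo:hasGeometry, and geo:asWKT come from this talk. Literal escaping is omitted for brevity.

```python
import csv
import io

TPN = "http://lod.tw/ontologies/geoname.owl#"
GEO = "http://www.opengis.net/ont/geosparql#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
BASE = "http://example.org/resource/"  # hypothetical URI design

# Hypothetical spreadsheet content; the real dataset's columns are not shown in the talk.
SHEET = """id,name,lon,lat
p001,Example Place,120.10,23.10
"""


def row_to_ntriples(row):
    """Turn one spreadsheet row into N-Triples lines for a place and its point geometry."""
    place = f"<{BASE}Place/{row['id']}>"
    geom = f"<{BASE}Point/{row['id']}>"
    wkt = f"POINT({row['lon']} {row['lat']})"
    return [
        f"{place} <{RDF_TYPE}> <{TPN}Place> .",
        f'{place} <{RDFS_LABEL}> "{row["name"]}" .',
        f"{place} <{GEO}hasGeometry> {geom} .",
        f'{geom} <{GEO}asWKT> "{wkt}"^^<{GEO}wktLiteral> .',
    ]


def transform(sheet_text):
    """Read CSV text and return all N-Triples lines."""
    lines = []
    for row in csv.DictReader(io.StringIO(sheet_text)):
        lines.extend(row_to_ntriples(row))
    return lines


print("\n".join(transform(SHEET)))
```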
Taiwan Geographic Name Information System
(http://placesearch.moi.gov.tw/)
A place name ontology
[Figure: class diagram of the place name ontology; the labels that appear in the diagram are listed below]
Classes and datatypes: tpn:Place, tpn:FeatureType, tpn:Footprint, tpn:Name (NameCollection), tpn:PlaceName, geo:Feature, geo:Geometry, geo:Point, geo:wktLiteral, skos:Concept, time:Interval, time:Instant, event:Event
Relations: tpn:featureClass, tpn:is_in, tpn:memberOf, tpn:name, tpn:altName, tpn:startToUse, tpn:endToUse, geo:hasGeometry, geo:asWKT, geo:inside, time:hasBeginning, time:hasEnd, event:place, event:time, owl:subClassOf
Prefixes:
geo="http://www.opengis.net/ont/geosparql#"
time="http://www.w3.org/2006/time#"
xsd="http://www.w3.org/2001/XMLSchema#"
tpn="http://lod.tw/ontologies/geoname.owl#"
owl="http://www.w3.org/2002/07/owl#"
event="http://purl.org/NET/c4dm/event.owl#"
http://geo.lod.tw
D2R server
Link to Geonames
The place names used in the Dutch colonial period
Disambiguate place names
• ‘大山腳’ is both a name of places and a URI
• There are three places named ‘大山腳’
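One way to picture the disambiguation is an index from a name to the URIs of the distinct places that carry it. The sketch below is hypothetical throughout: the URIs and the "Area A/B/C" labels are placeholders, not records from geo.lod.tw.

```python
from collections import defaultdict

# Hypothetical records: (place URI, place name, distinguishing area).
records = [
    ("http://example.org/resource/Place/a", "大山腳", "Area A"),
    ("http://example.org/resource/Place/b", "大山腳", "Area B"),
    ("http://example.org/resource/Place/c", "大山腳", "Area C"),
    ("http://example.org/resource/Place/d", "Another Name", "Area A"),
]


def build_name_index(rows):
    """Map each name to the list of (URI, area) pairs of the places that carry it."""
    index = defaultdict(list)
    for uri, name, area in rows:
        index[name].append((uri, area))
    return index


index = build_name_index(records)
for uri, area in index["大山腳"]:
    # The shared name is ambiguous; the URI identifies exactly one place.
    print(uri, area)
```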
GeoSPARQL Endpoint
GeoSPARQL query (1)
SELECT DISTINCT ?Place ?Place_wkt
WHERE {
  ?Place a tpn:Place ;
         geo:hasGeometry ?Place_geo .
  ?Place_geo geo:asWKT ?Place_wkt .

  FILTER (geof:sfWithin(?Place_wkt, "POLYGON((119.99912 23.24348,120.25398 23.24482,120.25398 23.24482,120.25130 23.24348,120.25130 23.24348,120.25666 23.00203,120.00449 23.01276,119.99912 23.24348))"^^sf:wktLiteral)) .
}
To find place names within a spatial extent
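What the sfWithin filter does for point geometries can be approximated with a ray-casting point-in-polygon test. The sketch below reuses the polygon from the query; the candidate points are made up, the parser only understands a single-ring `POLYGON((...))` string, and the test works on plain planar coordinates, so it is an illustration rather than a substitute for the endpoint.

```python
EXTENT_WKT = (
    "POLYGON((119.99912 23.24348,120.25398 23.24482,120.25398 23.24482,"
    "120.25130 23.24348,120.25130 23.24348,120.25666 23.00203,"
    "120.00449 23.01276,119.99912 23.24348))"
)


def parse_wkt_polygon(wkt):
    """Parse a single-ring 'POLYGON((x y,x y,...))' string into (x, y) tuples."""
    body = wkt.strip()[len("POLYGON(("):-2]
    return [tuple(float(v) for v in pair.split()) for pair in body.split(",")]


def point_in_polygon(x, y, ring):
    """Ray casting: count how many ring edges a ray going in +x from the point crosses."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):  # the edge spans the ray's y value
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


ring = parse_wkt_polygon(EXTENT_WKT)

# Hypothetical candidate places as (name, longitude, latitude).
candidates = [("inside place", 120.10, 23.10), ("outside place", 121.00, 23.10)]
hits = [name for name, lon, lat in candidates if point_in_polygon(lon, lat, ring)]
print(hits)
```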
Query result (1)
GeoSPARQL query (2)
SELECT DISTINCT ?p_wkt ?Place_wkt ?distance
WHERE {
  ?Place a tpn:Place ;
         geo:hasGeometry ?Place_geo .
  ?Place_geo geo:asWKT ?Place_wkt .
  <http://geo.lod.tw/resource/Point/01c2db36d23bdadda4beca046ce85e47> geo:asWKT ?p_wkt .

  LET (?buff := geof:buffer(?p_wkt, 3000, units:metre)) .
  FILTER (geof:sfWithin(?Place_wkt, ?buff)) .
  LET (?distance := geof:distance(?Place_wkt, ?p_wkt, units:metre)) .
}
To find place names within a 3 km buffer and obtain their coordinates and distances
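For point geometries, the buffer-then-filter pattern amounts to keeping every place whose distance from the reference point is at most 3000 m. The sketch below approximates this with the haversine great-circle distance on a spherical Earth; the reference point and candidates are hypothetical, and an endpoint's geof:buffer and geof:distance may compute slightly different values.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in metres


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points on a sphere."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_buffer(center, candidates, radius_m=3000):
    """Return (name, distance) pairs within radius_m of center, nearest first."""
    result = []
    for name, (lon, lat) in candidates.items():
        d = haversine_m(center[0], center[1], lon, lat)
        if d <= radius_m:
            result.append((name, d))
    return sorted(result, key=lambda item: item[1])


# Hypothetical reference point and candidate places as (longitude, latitude).
center = (120.20, 23.00)
places = {"near place": (120.21, 23.00), "far place": (120.30, 23.00)}
for name, dist in within_buffer(center, places):
    print(name, round(dist), "m")
```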
GeoSPARQL query (3)
SELECT DISTINCT ?pName ?tName ?Time_xsd
WHERE {
  ?PN a tpn:PlaceName ;
      rdfs:label ?pName ;
      tpn:startToUse ?Start_Interval .

  ?Start_Interval a time:Interval ;
                  rdfs:label ?tName ;
                  time:hasBeginning ?begin .

  ?begin time:xsdDateTime ?Time_xsd .

  FILTER (?Time_xsd < "1900-12-19T16:00:00Z"^^xsd:dateTime) .
}
To find place names that came into use before Dec. 19, 1900, together with the time each came into use
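The temporal filter is a plain comparison of xsd:dateTime values against a cutoff. A minimal sketch of the same comparison follows; the records are invented, and the parser only accepts UTC timestamps of the form used in the query.

```python
from datetime import datetime, timezone

# The cutoff used in the query, as a timezone-aware UTC datetime.
CUTOFF = datetime(1900, 12, 19, 16, 0, 0, tzinfo=timezone.utc)


def parse_xsd_datetime(text):
    """Parse an xsd:dateTime such as '1900-12-19T16:00:00Z' (UTC 'Z' form only)."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def names_in_use_before(records, cutoff=CUTOFF):
    """Keep the names whose start of use is strictly earlier than the cutoff."""
    return [name for name, start in records if parse_xsd_datetime(start) < cutoff]


# Hypothetical (place name, start of use) pairs.
records = [
    ("Name A", "1895-01-01T00:00:00Z"),
    ("Name B", "1920-01-01T00:00:00Z"),
]
print(names_in_use_before(records))
```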
Query result (3)
Concluding remarks
• By using the place name ontology, the Taiwanese place name dataset is transformed from spreadsheets into RDF triples.
• The unified place names can serve as controlled vocabularies.
• The Taiwanese place name dataset is not only linked forward to Geonames.org, but also linked backward to historical place names.
• A front-end linked data server (D2R) is established to demonstrate the linked place names.
• A GeoSPARQL endpoint (BBN Parliament) is developed for serving spatiotemporal SPARQL queries.
[email protected]
twitter: @dongpo
facebook: dongpo.deng
Slides are available on http://tinyurl.com/pnc2014ddp