(almost) everything you ever wanted to know about geo (with woeids)

Post on 11-Sep-2014

DESCRIPTION

"(Almost) Everything You Ever Wanted To Know About Geo (with WOEIDs)"; presented on March 10th, 2010 at the London Twitter DevNest 7, at the Sun Customer Briefing Centre in London.

TRANSCRIPT

London Twitter #devnest 7, March 2010

(Almost) Everything You Ever Wanted To Know About Geo (with WOEIDs)…

Gary Gale, Yahoo! Geo Technologies

the agenda

louisvolant on Flickr : http://www.flickr.com/photos/27048731@N03/4003756731/

the agenda

• the hello

• the WOEIDs

• the WTF?

• the background

• the geocoding and the geoparsing

• the frustration

• the WOEIDs redux

• the APIs

• the demo

• the goodbye

KELLYLEEBARRETT on Flickr : http://www.flickr.com/photos/kellylee/4177529745/

Gary Gale on Flickr : http://www.flickr.com/photos/vicchi/4414198544/

WOEIDs

stevefaeembra on Flickr : http://www.flickr.com/photos/stevefaeembra/3567750853/

1258934244418

David Armano on Flickr : http://www.flickr.com/photos/7855449@N02/3158864420/

some background

blakophoto on Flickr : http://www.flickr.com/photos/cleveralias/3158810304/

let’s talk about geocoding

inF! on Flickr : http://www.flickr.com/photos/nathanbarrow/3339245753/

geocoding is the process of finding associated geographic coordinates (often expressed as latitude and longitude) from other geographic data, such as street addresses, or zip codes (postal codes).

reverse geocoding is the process of back (reverse) coding of a point location (latitude, longitude) to a readable address or place name.
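
The two definitions above can be illustrated with a toy lookup table. This is a minimal sketch, not a real geocoder: the `GAZETTEER` table, the function names, and the Paris coordinates are assumptions for illustration, while the London centroid is the one shown in the GeoPlanet response later in this deck. Reverse geocoding is approximated here as "nearest known place by great-circle distance".

```python
import math

# Toy gazetteer: place name -> (latitude, longitude) in decimal degrees.
# The London centroid comes from this deck; the Paris entry is an
# illustrative assumption.
GAZETTEER = {
    "London, England, GB": (51.506321, -0.127140),
    "Paris, France": (48.8566, 2.3522),
}


def geocode(name):
    """Forward geocoding: place name -> (lat, lon), or None if unknown."""
    return GAZETTEER.get(name)


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def reverse_geocode(point):
    """Reverse geocoding: (lat, lon) -> name of the nearest known place."""
    return min(GAZETTEER, key=lambda name: haversine_km(GAZETTEER[name], point))
```

A real service would return a street address or a place hierarchy rather than the nearest entry in a two-row table, but the direction of each lookup is the same.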

noway on Flickr : http://www.flickr.com/photos/noway/78606643/

what? where?

what? (maybe) where? (maybe)

this is not geocoding, this is geoparsing

szim90 on Flickr : http://www.flickr.com/photos/szim90/272670479/

geoparsing is the process of assigning geographic identifiers (e.g., codes or geographic coordinates expressed as latitude-longitude) to textual words and phrases that occur in unstructured content.

cheap flights from london to paris in october

“I’m sorry dave; I can’t find that place”

Jamison Judd on Flickr : http://www.flickr.com/photos/jamisonjudd/2433102356/

web servers

51° 30' 50.0868", 0° 7' 42.8514" (125 Shaftesbury Avenue, London, UK)

163.1.117.210 (Oxford, UK)

20442/6015 (Brest, France)

#C5243B212 (Wilmington, Delaware, USA)
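
The first identifier on this slide is a coordinate pair in degrees, minutes and seconds, whereas the API responses later in the deck use decimal degrees. A small sketch of the conversion follows; the function name and the `negative` flag are assumptions, and the slide gives no hemisphere letters, so treating the longitude as west of Greenwich is also an assumption (it matches the negative London longitude shown later).

```python
import re


def dms_to_decimal(dms, negative=False):
    """Convert a string such as 51° 30' 50.0868" to decimal degrees."""
    match = re.fullmatch(r"""\s*(\d+)°\s*(\d+)'\s*([\d.]+)"\s*""", dms)
    if match is None:
        raise ValueError(f"unrecognised DMS string: {dms!r}")
    degrees, minutes, seconds = (float(g) for g in match.groups())
    # One minute is 1/60 of a degree, one second is 1/3600 of a degree.
    value = degrees + minutes / 60 + seconds / 3600
    return -value if negative else value


lat = dms_to_decimal("51° 30' 50.0868\"")
lon = dms_to_decimal("0° 7' 42.8514\"", negative=True)  # assumed west
```

Here `lat` works out to about 51.5139 and `lon` to about -0.1286.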

National Library NZ on The Commons on Flickr : http://www.flickr.com/photos/nationallibrarynz_commons/3326203787/

web surfers

The West End

Downtown

The Shops

The High Street

The Online World

Formal, normalised, structured, regular

The Offline World

Informal, eccentric, bizarre, irregular

The Real World

“We Are Here”

cheap flights from london to paris in october

London

Paris

1) Tokenize

2) Remove common words

3) Remove words not in gazetteer
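
The three steps above can be sketched in a few lines. This is a deliberately naive illustration, not how Placemaker works: the stop-word list and the gazetteer are tiny hypothetical stand-ins, and the gazetteer includes "in" and "to" on purpose, since short place names and country codes of that kind are what trip this approach up.

```python
import re

# Hypothetical toy data. A real gazetteer holds millions of names,
# including very short ones that collide with ordinary words.
STOP_WORDS = {"from", "the", "a", "of"}
GAZETTEER = {"london", "paris", "in", "to"}


def naive_geoparse(text, stop_words=STOP_WORDS, gazetteer=GAZETTEER):
    """Return the tokens of text that survive the three naive steps."""
    # 1) Tokenize
    tokens = re.findall(r"\w+", text.lower())
    # 2) Remove common words
    tokens = [t for t in tokens if t not in stop_words]
    # 3) Remove words not in gazetteer
    return [t for t in tokens if t in gazetteer]


query = "cheap flights from london to paris in october"
candidates = naive_geoparse(query)
```

With this toy data `candidates` contains "london" and "paris", but also "to" and "in", because the gazetteer alone cannot tell a preposition from a place.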

“in”… India?

bodhitjal on Flickr : http://www.flickr.com/photos/bodhithaj/361857780/

“in”… Indiana?

OZinOH on Flickr : http://www.flickr.com/photos/75905404@N00/505688957/

“to”… Tonga?

j_buswell on Flickr : http://www.flickr.com/photos/j_buswell/3683814556/

language

Jovike on Flickr : http://www.flickr.com/photos/jvk/19894053/

Thé? a town in Burgundy, France

To? a town in Ibaraki prefecture, Japan

AND? ISO 3166-1 Alpha-3 for Andorra

Å? a town in Nordland Fylke, Norway

IN? ISO 3166-1 Alpha-2 for India

Is? another town in Burgundy, France

IT? ISO 3166-1 Alpha-2 for Italy

You? a town in Yatenga, Burkina Faso

That? a town in Rajasthan, India

may cause frustration

paloaltosoftware on Flickr : http://www.flickr.com/photos/paloalto/3038701605/

disambiguation

Koen Vereeken on Flickr : http://www.flickr.com/photos/koenvereeken/2088902012/

this is peru …

and so is this (in argentina)

and so is this (in bolivia)

semantics required

dullhunk on Flickr : http://www.flickr.com/photos/dullhunk/3525013547/

Hilton, Paris vs. Paris Hilton

London vs. Jack London

Panama vs. Panama Hats

who uses official names anyway?

takomabibelot on Flickr : http://www.flickr.com/photos/takomabibelot/234301712/

MOMA NYC

paula moya on Flickr : http://www.flickr.com/photos/40351463@N00/745012335/

Museum of Modern Art, New York

Millennium Wheel

hismith83 on Flickr : http://www.flickr.com/photos/hismith83/200701961/

London Eye

San Francisco

SF Brit on Flickr : http://www.flickr.com/photos/cnbattson/192162591/

City and County of San Francisco

WOEIDs (redux)

stevefaeembra on Flickr : http://www.flickr.com/photos/stevefaeembra/3567750853/

1258934244418

51° 30' 50.0868", 0° 7' 42.8514"

Unique

Permanent

Global

Language Neutral

• London = Londra = Londres = ロンドン

• United States = États-Unis = Stati Uniti = 미국
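
The language-neutral property amounts to many names resolving to one identifier. A minimal sketch, assuming a hand-built lookup table as a stand-in for a real gazetteer; 44418 is the WOEID this deck uses for London, and the table and function names are hypothetical.

```python
# Several spellings of the same place, all mapped to one WOEID.
# The table is a hypothetical stand-in for a real gazetteer.
NAME_TO_WOEID = {
    "london": 44418,
    "londra": 44418,
    "londres": 44418,
    "ロンドン": 44418,
}


def woeid_for(name):
    """Return the WOEID for a place name, ignoring case, or None."""
    return NAME_TO_WOEID.get(name.strip().casefold())
```

Code that stores the WOEID rather than the name can compare places without caring which language the user typed in.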

Ensures that geography can be employed consistently and globally

straup on Flickr : http://www.flickr.com/photos/straup/3504862388/

GeoPlanet

A Global Location Repository

Names + Geometry + Topology

WOEIDs for

• cities and towns

• postal codes, airports

• admin regions, time zones

• telephone code areas

• marketing areas

• points of interest

• colloquial areas

• neighbourhoods

woodleywonderworks on Flickr : http://www.flickr.com/photos/wwworks/2222523978/

Continents

Countries

Counties

Regions

Colloquials

Targeting Zones

Postal Codes

Area Codes

Boroughs

Neighbourhoods

POIs

Stratford-upon-Avon: 36424 (Town)

CV37: 26787646 (ZIP)

Stratford-on-Avon: 12696101 (District)

Warwickshire: 12602190 (County)

England: 24554868 (Country)

United Kingdom: 23424975 (Country), also Vereinigtes Königreich, Royaume Uni, イギリス

Earth: 1 (Supername)

Europe: 24865675 (Continent)

Great Britain: 28298150 (Colloquial)

Worcestershire: 12602192 (County)

Warwick: 39228 (Town)
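
The hierarchy on this slide can be modelled as a table keyed by WOEID and walked upwards. This is only a sketch: the names and WOEIDs come from the slide, the place types are paired with them in slide order (an assumption), and the parent links are a guess at how the diagram connects the places, not something the transcript states.

```python
# WOEID -> (name, place type, parent WOEID).
# Parent links are assumed for illustration.
PLACES = {
    36424: ("Stratford-upon-Avon", "Town", 12696101),
    12696101: ("Stratford-on-Avon", "District", 12602190),
    12602190: ("Warwickshire", "County", 24554868),
    24554868: ("England", "Country", 23424975),
    23424975: ("United Kingdom", "Country", 1),
    1: ("Earth", "Supername", None),
}


def ancestors(woeid):
    """Return [(woeid, name, type), ...] from the parent up to the root."""
    chain = []
    parent = PLACES[woeid][2]
    while parent is not None:
        name, place_type, next_parent = PLACES[parent]
        chain.append((parent, name, place_type))
        parent = next_parent
    return chain
```

Calling `ancestors(36424)` walks from the district up to Earth, which is the kind of answer the GeoPlanet relationship queries later in the deck return.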

http://engineering.twitter.com/2010/02/woeids-in-twitters-trends.html

http://isithackday.com/hacks/placemaker/tweet-locations.php

http://wherein.yahooapis.com/v1/document

unlock your api

sam.d on Flickr : http://www.flickr.com/photos/samd/65693717/

https://developer.apps.yahoo.com/wsregapp/

Placemaker Parameters

appid • 100% mandatory

inputLanguage • en-US, fr-CA, …

outputType • XML or RSS

documentContent • text to geoparse

documentTitle • optional title

documentURL • URL to geoparse

documentType • MIME type of doc

autoDisambiguate • remove duplicates

focusWoeid • filter around a WOEID
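
The parameters above end up as a form-encoded POST body. A minimal sketch of building one for a plain-text document, using only parameter names from the list; the function name, the defaults and the `MY_APP_ID` placeholder are assumptions, and nothing is sent over the network.

```python
from urllib.parse import urlencode

PLACEMAKER_URL = "http://wherein.yahooapis.com/v1/document"


def placemaker_post_body(appid, text, input_language="en-US", output_type="xml"):
    """Return the URL-encoded POST body for a plain-text request."""
    params = {
        "appid": appid,
        "documentContent": text,
        "documentType": "text/plain",
        "inputLanguage": input_language,
        "outputType": output_type,
    }
    # urlencode escapes spaces, slashes and non-ASCII text for us.
    return urlencode(params)


body = placemaker_post_body(
    "MY_APP_ID", "cheap flights from london to paris in october"
)
```

Letting a library do the encoding avoids hand-concatenating query strings.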

// POST to Placemaker
// $key (your application ID), $content (the text to geoparse) and
// $lang (an optional extra query-string suffix, possibly empty)
// must be set before this point
define('POSTURL', 'http://wherein.yahooapis.com/v1/document');
define('POSTVARS', 'appid='.$key.
    '&documentContent='.urlencode($content).
    '&documentType=text/plain&outputType=xml'.$lang);
$ch = curl_init(POSTURL);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, POSTVARS);
// return the response as a string instead of printing it
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$placemaker = curl_exec($ch);
curl_close($ch);

places

that_james on Flickr : http://www.flickr.com/photos/that_james/496797309/

<placeDetails><place><woeId>44418</woeId><type>Town</type><name><![CDATA[London, England, GB]]></name><centroid><latitude>51.5063</latitude><longitude>-0.12714</longitude></centroid></place><matchType>0</matchType><weight>1</weight><confidence>10</confidence></placeDetails>

One place for WOEID 44418
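
A fragment like the one above can be read with any XML parser. A minimal Python sketch, assuming the fragment arrives exactly as shown on the slide, with no XML namespace (a full response may well declare one); the function and key names are hypothetical.

```python
import xml.etree.ElementTree as ET

# The <placeDetails> fragment from the slide, wrapped across lines.
SAMPLE = """<placeDetails><place><woeId>44418</woeId><type>Town</type>
<name><![CDATA[London, England, GB]]></name>
<centroid><latitude>51.5063</latitude><longitude>-0.12714</longitude></centroid>
</place><matchType>0</matchType><weight>1</weight><confidence>10</confidence>
</placeDetails>"""


def parse_place_details(xml_text):
    """Return a dict describing the place in one <placeDetails> element."""
    details = ET.fromstring(xml_text)
    place = details.find("place")
    return {
        "woeid": int(place.findtext("woeId")),
        "type": place.findtext("type"),
        "name": place.findtext("name"),  # CDATA is returned as plain text
        "lat": float(place.findtext("centroid/latitude")),
        "lon": float(place.findtext("centroid/longitude")),
        "confidence": int(details.findtext("confidence")),
    }
```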

references

misterbisson on Flickr : http://www.flickr.com/photos/maisonbisson/117720946/

<reference><woeIds>44418</woeIds><start>1079</start><end>1089</end><isPlaintextMarker>1</isPlaintextMarker><text><![CDATA[London, UK]]></text><type>plaintext</type><xpath><![CDATA[]]></xpath></reference><reference><woeIds>44418</woeIds><start>1116</start><end>1126</end><isPlaintextMarker>1</isPlaintextMarker><text><![CDATA[London, UK]]></text><type>plaintext</type><xpath><![CDATA[]]></xpath></reference>

Two references for WOEID 44418
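
Each reference carries the character offsets at which a place was mentioned, so grouping references by WOEID shows where each place occurs in the text. A sketch under stated assumptions: the `<referenceList>` wrapper is inferred from the PHP elsewhere in this deck, and treating `<woeIds>` as a space-separated list is a guess; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# The two references from the slide inside an assumed <referenceList> root.
SAMPLE = """<referenceList>
<reference><woeIds>44418</woeIds><start>1079</start><end>1089</end>
<isPlaintextMarker>1</isPlaintextMarker><text><![CDATA[London, UK]]></text>
<type>plaintext</type><xpath><![CDATA[]]></xpath></reference>
<reference><woeIds>44418</woeIds><start>1116</start><end>1126</end>
<isPlaintextMarker>1</isPlaintextMarker><text><![CDATA[London, UK]]></text>
<type>plaintext</type><xpath><![CDATA[]]></xpath></reference>
</referenceList>"""


def references_by_woeid(xml_text):
    """Group (start, end, text) spans by the WOEID they refer to."""
    grouped = {}
    for ref in ET.fromstring(xml_text).iter("reference"):
        span = (
            int(ref.findtext("start")),
            int(ref.findtext("end")),
            ref.findtext("text"),
        )
        # Assumption: <woeIds> may hold several space-separated IDs.
        for woeid in ref.findtext("woeIds").split():
            grouped.setdefault(int(woeid), []).append(span)
    return grouped
```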


// turn into a PHP object and loop over the results
$places = simplexml_load_string($placemaker, 'SimpleXMLElement', LIBXML_NOCDATA);
if ($places->document->placeDetails) {
    $foundplaces = array();
    // create a hashmap of the places found to mix with
    // the references found
    foreach ($places->document->placeDetails as $p) {
        $wkey = 'woeid'.$p->place->woeId;
        // appending '' casts each SimpleXML node to a plain string
        $foundplaces[$wkey] = array(
            'name' => str_replace(', ZZ', '', $p->place->name).'',
            'type' => $p->place->type.'',
            'woeId' => $p->place->woeId.'',
            'lat' => $p->place->centroid->latitude.'',
            'lon' => $p->place->centroid->longitude.''
        );
    }
}

// loop over references and filter out duplicates
$refs = $places->document->referenceList->reference;
$usedwoeids = array();
$placelist = '';
foreach ($refs as $r) {
    foreach ($r->woeIds as $wi) {
        if (in_array($wi.'', $usedwoeids)) {
            continue;
        } else {
            $usedwoeids[] = $wi.'';
        }
        $currentloc = $foundplaces["woeid".$wi];
        if ($r->text != '' && $currentloc['name'] != '' &&
            $currentloc['lat'] != '' && $currentloc['lon'] != '') {
            $text = preg_replace('/\s+/', ' ', $r->text);
            $name = addslashes(str_replace(', ZZ', '', $currentloc['name']));
            $desc = addslashes($text);
            $lat = $currentloc['lat'];
            $lon = $currentloc['lon'];
            $class = stripslashes($desc)."|$name|$lat|$lon";
            $placelist .= "<li>".
            // (the slide cuts this statement off here)
        }
    }
}

http://www.vicchi.org/speaking

the internet is broken

Nesster on Flickr : http://www.flickr.com/photos/nesster/3168425434/

// load the URL, using YQL to filter the HTML
// and fix UTF-8 nasties
$url = 'http://www.vicchi.org/speaking';
$realurl = 'http://query.yahooapis.com/v1/public/yql'.
    '?q=select%20*%20from%20html%20where%20url%20%3D%20%22'.
    urlencode($url).'%22&format=xml';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $realurl);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$c = curl_exec($ch);
curl_close($ch);
if (strstr($c, '<')) {
    // keep only what sits between <results> and </results>
    $c = preg_replace("/.*<results>|<\/results>.*/", '', $c);
    // drop the XML declaration
    $c = preg_replace("/<\?xml version=\"1\.0\" encoding=\"UTF-8\"\?>/", '', $c);
    // remove the markup, then collapse line breaks into single spaces
    $c = strip_tags($c);
    $c = preg_replace("/[\r\n]+/", " ", $c);
}

minor annoyances

swooshthesnail on Flickr : http://www.flickr.com/photos/swooshthesnail/3281681399/

50,000 bytes

ASurroca on Flickr : http://www.flickr.com/photos/asurroca/147049402/
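
One way to live with a size cap is to split long text before sending it. The sketch below assumes the 50,000 bytes on the slide is a limit on the document text and that it counts UTF-8 bytes; both are assumptions, as is the function name. Splitting on whitespace can still separate a multi-word place name across two chunks, so this is a workaround, not a fix.

```python
MAX_BYTES = 50_000  # figure from the slide; assumed to cap the document text


def chunk_text(text, max_bytes=MAX_BYTES):
    """Split text on whitespace into pieces of at most max_bytes UTF-8 bytes."""
    chunks, current, size = [], [], 0
    for word in text.split():
        word_size = len(word.encode("utf-8"))
        if word_size > max_bytes:
            raise ValueError("a single word exceeds the byte limit")
        # +1 accounts for the joining space when the chunk is non-empty.
        extra = word_size + (1 if current else 0)
        if size + extra > max_bytes:
            chunks.append(" ".join(current))
            current, size = [word], word_size
        else:
            current.append(word)
            size += extra
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk would then be posted separately and the results merged by WOEID.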

no json

X

post not get

sludgegulper on Flickr : http://www.flickr.com/photos/sludgeulper/2645478209/

http://where.yahooapis.com/v1/

collections

bradman334 on Flickr : http://www.flickr.com/photos/bradman334/3402569690/

• lists of related resources, such as places

• e.g. find all places called “london”

http://where.yahooapis.com/v1/places.q('london');count=0?appid=[your id]

• e.g. find the most likely place called “london”

http://where.yahooapis.com/v1/places.q('london')?appid=[your id]

<places xmlns="http://where.yahooapis.com/v1/schema.rng" xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" yahoo:start="0" yahoo:count="1" yahoo:total="22"><place yahoo:uri="http://where.yahooapis.com/v1/place/44418" xml:lang="en-us"><woeid>44418</woeid><placeTypeName code="7">Town</placeTypeName><name>London</name><country type="Country" code="GB">United Kingdom</country><admin1 type="Country" code="GB-ENG">England</admin1><admin2 type="County" code="">Greater London</admin2><admin3></admin3><locality1 type="Town">London</locality1><locality2></locality2><postal></postal><centroid><latitude>51.506321</latitude><longitude>-0.127140</longitude></centroid><boundingBox><southWest><latitude>51.261318</latitude><longitude>-0.563000</longitude></southWest><northEast><latitude>51.686031</latitude><longitude>0.280360</longitude></northEast></boundingBox></place></places>
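
The response above declares a default XML namespace, so element lookups have to be namespace-qualified. A minimal parsing sketch in Python; `SAMPLE` is an abridged copy of the response on the slide, and the `gp` prefix, the function name and the dictionary keys are assumptions.

```python
import xml.etree.ElementTree as ET

# Map a local prefix to the default namespace used by the response.
NS = {"gp": "http://where.yahooapis.com/v1/schema.rng"}

# Abridged copy of the response shown on the slide.
SAMPLE = (
    '<places xmlns="http://where.yahooapis.com/v1/schema.rng" '
    'xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" '
    'yahoo:start="0" yahoo:count="1" yahoo:total="22">'
    '<place yahoo:uri="http://where.yahooapis.com/v1/place/44418" '
    'xml:lang="en-us"><woeid>44418</woeid>'
    '<placeTypeName code="7">Town</placeTypeName><name>London</name>'
    "<centroid><latitude>51.506321</latitude>"
    "<longitude>-0.127140</longitude></centroid></place></places>"
)


def parse_places(xml_text):
    """Return one dict per <place> element in a places response."""
    root = ET.fromstring(xml_text)
    places = []
    for place in root.findall("gp:place", NS):
        places.append({
            "woeid": int(place.findtext("gp:woeid", namespaces=NS)),
            "type": place.findtext("gp:placeTypeName", namespaces=NS),
            "name": place.findtext("gp:name", namespaces=NS),
            "lat": float(place.findtext("gp:centroid/gp:latitude", namespaces=NS)),
            "lon": float(place.findtext("gp:centroid/gp:longitude", namespaces=NS)),
        })
    return places
```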

resources

joshuarichards on Flickr : http://www.flickr.com/photos/joshywoshywoo/124671979/

• unique objects that contain multiple attributes, such as a place

• e.g. get attributes for WOEID 44418

http://where.yahooapis.com/v1/place/44418?appid=[your id]

• e.g. find the most likely place called “london”

http://where.yahooapis.com/v1/places.q('london')?appid=[your id]

• e.g. get places related to WOEID 44418

http://where.yahooapis.com/v1/place/44418/relation?appid=[your id]

• parent, ancestors, belongsto, neighbours, siblings, children
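
The collection and resource URLs on these slides follow a regular pattern, so they can be built with two small helpers. A sketch only: the function names are hypothetical, the relation segment is passed through unchanged, and the exact spelling the API expects for each relation should be checked against the GeoPlanet documentation rather than taken from the bullet above.

```python
from urllib.parse import quote

BASE = "http://where.yahooapis.com/v1"


def places_query_url(name, appid, count=None):
    """URL for the places collection matching a name."""
    url = f"{BASE}/places.q('{quote(name)}')"
    if count is not None:
        url += f";count={count}"
    return f"{url}?appid={quote(appid)}"


def place_url(woeid, appid, relation=None):
    """URL for one place resource, or for places related to it."""
    url = f"{BASE}/place/{int(woeid)}"
    if relation:
        url += f"/{relation}"
    return f"{url}?appid={quote(appid)}"
```

For example, `places_query_url("london", "APPID", count=0)` reproduces the "find all places called london" URL from the collections slide, with `APPID` standing in for a real application ID.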

<?xml version="1.0" encoding="UTF-8"?><places xmlns="http://where.yahooapis.com/v1/schema.rng" xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" yahoo:start="0" yahoo:count="10" yahoo:total="34"><place yahoo:uri="http://where.yahooapis.com/v1/place/12695806" xml:lang="en-us"><woeid>12695806</woeid><placeTypeName code="10">Local Administrative Area</placeTypeName><name>City of London</name></place><place yahoo:uri="http://where.yahooapis.com/v1/place/12695807" xml:lang="en-us"><woeid>12695807</woeid><placeTypeName code="10">Local Administrative Area</placeTypeName><name>London Borough of Camden</name></place><place yahoo:uri="http://where.yahooapis.com/v1/place/12695808" xml:lang="en-us"><woeid>12695808</woeid><placeTypeName code="10">Local Administrative Area</placeTypeName><name>London Borough of Hackney</name></place>…</places>

Far more than you could ever want

http://delicious.com/codepo8/geotoys

never work with children, animals or live demos

elephipelephi on Flickr : http://www.flickr.com/photos/elephipelephi/1493013250/

not taking notes?

selva on Flickr : http://www.flickr.com/photos/selva/24604141/

http://slideshare.net/vicchi

thanks for listening

Paul Keleher on Flickr : http://www.flickr.com/photos/pkeleher/1658311814/

www.ygeoblog.com

twitter.com/vicchi

twitter.com/yahoogeo
