The OECD Delta Project – Providing Easier Access to Data Through APIs
Presented at the IMAODBC - Neuchatel, 16-20 September 2013
The OECD Delta Project
IMAODBC - Neuchatel, 16-20 September 2013
Jonathan Challener, OECD
{“Providing easier access to data”:“Through APIs”}
Accessible: Find, Understand, Use
Open: Machine-readable, Indexable, Re-usable
Free: Available without charge
Making OECD data Open, Accessible, Free
{“the oecd experience”:“2007-2012”}
Machine-readable
SDMX is the global standard for statistical data and metadata exchange
SDMX-ML 2.0 Web Service
<?xml version="1.0"?>
<message:MessageGroup
    xmlns:message="http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.SDMX.org/resources/SDMXML/schemas/v2_0/generic http://www.sdmx.org/docs/2_0/SDMXGenericData.xsd http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message http://www.sdmx.org/docs/2_0/SDMXMessage.xsd"
    xmlns:common="http://www.SDMX.org/resources/SDMXML/schemas/v2_0/common"
    xmlns="http://www.SDMX.org/resources/SDMXML/schemas/v2_0/generic">
  <Header xmlns="http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message">
    <ID>none</ID>
    <Test>false</Test>
    <Truncated>false</Truncated>
    <Prepared>2013-04-15T11:46:19</Prepared>
    <Sender id="OECD">
      <Name xml:lang="en">Organisation for Economic Co-operation and Development</Name>
      <Name xml:lang="fr">Organisation de coopération et de développement économiques</Name>
    </Sender>
  </Header>
  <DataSet keyFamilyURI="http://stats.oecd.org/RestSDMX/sdmx.ashx/GetKeyFamily/REFSERIES/OECD/?resolveRef=true">
    <KeyFamilyRef>REFSERIES</KeyFamilyRef>
    <Series>
      <SeriesKey>
        <Value value="FRA" concept="LOCATION"/>
        <Value value="YPTTTTL1_ST" concept="SUBJECT"/>
        <Value value="A" concept="FREQUENCY"/>
      </SeriesKey>
      <Attributes>
        <Value value="P1Y" concept="TIME_FORMAT"/>
      </Attributes>
      <Obs><Time>2010</Time><ObsValue value="62927.11"/></Obs>
      <Obs><Time>2011</Time><ObsValue value="63249.09"/></Obs>
    </Series>
  </DataSet>
</message:MessageGroup>
OECD.Stat, the OECD data warehouse solution, was the first dissemination system to provide a means of disseminating data in the internationally agreed standard, SDMX, having implemented a fully conforming SDMX 2.0 web service in 2007.
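To give a feel for what consuming such a response involves, here is a minimal Python sketch using only the standard library. It assumes a trimmed copy of the sample message shown above (header omitted); the names `SAMPLE` and `read_series` are illustrative and are not part of any OECD tooling.

```python
import xml.etree.ElementTree as ET

# Namespace of the SDMX-ML 2.0 generic data elements, as declared in the sample.
NS = {"g": "http://www.SDMX.org/resources/SDMXML/schemas/v2_0/generic"}

# Trimmed copy of the sample response (assumption: header left out for brevity).
SAMPLE = """
<message:MessageGroup
    xmlns:message="http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message"
    xmlns="http://www.SDMX.org/resources/SDMXML/schemas/v2_0/generic">
  <DataSet>
    <KeyFamilyRef>REFSERIES</KeyFamilyRef>
    <Series>
      <SeriesKey>
        <Value value="FRA" concept="LOCATION"/>
        <Value value="YPTTTTL1_ST" concept="SUBJECT"/>
        <Value value="A" concept="FREQUENCY"/>
      </SeriesKey>
      <Obs><Time>2010</Time><ObsValue value="62927.11"/></Obs>
      <Obs><Time>2011</Time><ObsValue value="63249.09"/></Obs>
    </Series>
  </DataSet>
</message:MessageGroup>
"""


def read_series(xml_text):
    """Return a list of (series key, [(period, value), ...]) pairs."""
    root = ET.fromstring(xml_text.strip())
    result = []
    for series in root.iterfind(".//g:Series", NS):
        # The series key is a set of concept/value pairs, e.g. LOCATION=FRA.
        key = {
            value.get("concept"): value.get("value")
            for value in series.iterfind("g:SeriesKey/g:Value", NS)
        }
        # Each observation carries a time period and a numeric value attribute.
        observations = [
            (
                obs.findtext("g:Time", namespaces=NS),
                float(obs.find("g:ObsValue", NS).get("value")),
            )
            for obs in series.iterfind("g:Obs", NS)
        ]
        result.append((key, observations))
    return result


for key, observations in read_series(SAMPLE):
    print(key, observations)
```

Even for a single series, the reader has to deal with namespaces and nested key/value elements, which hints at the complexity discussed in the following slides.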
Automated data exchange
The OECD then began to introduce a number of automated data exchanges through the SDMX 2.0 interface with several organisations and data resellers and aggregators.
Flexible
However, although SDMX is highly flexible...
but complex
...it is also highly complex. This slowed both its uptake and its reuse.
{“the oecd experience”:“2012-beyond”}
This began a period in which users' needs changed and demand for reuse grew steadily.
KISS
Thinking then turned to KISS…
KISS Principle
No, not the band, but the KISS Principle: Keep it simple, stupid (and short)!
Date: Mon, 12 Mar 2012 13:30:01 CET
Location: OECD, Paris
Meeting: OECD’s Statistical Information System Collaboration Community (SIS-CC) workshop
Presenter: Xavier Badosa
Title: Standards for statistical data dissemination
Content-Type: Simplify+JSONify/SDMX
In March 2012, the second OECD Statistical Information System Collaboration Community (SIS-CC) workshop took place. Xavier Badosa gave a presentation pushing for a simplified SDMX and a common JSON-based format for statistical dissemination.
XML, JSON, SDMX-JS?
SDMX-ML
Xavier proposed combining SDMX-ML (XML) with JSON (JavaScript Object Notation), a text-based open standard designed for human-readable data interchange...
"generic:Attributes" : {"generic:Value" : {"@concept" : "OBS_STATUS", "@value" : "P"}}}, {"generic:Time" : "2011-04", "generic:ObsValue" : {"@value" : "106.56246"}, "generic:Attributes" : {"generic:Value" : {"@concept" : "OBS_STATUS", "@value" : "P"}}}, {"generic:Time" : "2011-05", "generic:ObsValue" : {"@value" : "113.26596"}, "generic:Attributes" : {"generic:Value" : {"@concept" : "OBS_STATUS", "@value" : "P"}}}, {"generic:Time" : "2011-06", "generic:ObsValue" : {"@value" : "114.22037"}, "generic:Attributes" : {"generic:Value" : {"@concept" : "OBS_STATUS", "@value" : "P"}}}, {"generic:Time" : "2011-07", "generic:ObsValue" : {"@value" : "108.77534"}, "generic:Attributes" : {"generic:Value" : {"@concept" : "OBS_STATUS", "@value" : "P"}}}, {"generic:Time" : "2011-08", "generic:ObsValue" : {"@value" : "116.37424"}, "generic:Attributes" : {"generic:Value" : {"@concept" : "OBS_STATUS", "@value" : "P"}}}, {"generic:Time" : "2011-09", "generic:ObsValue" : {"@value" : "116.89853"}, "generic:Attributes" : {"generic:Value" : {"@concept" : "OBS_STATUS", "@value" : "P"}}}, {"generic:Time" : "2011-10", "generic:ObsValue" : {"@value" : "118.59833"}, "generic:Attributes" : {"generic:Value" : {"@concept" : "OBS_STATUS", "@value" : "P"}}}, {"generic:Time" :
“XMLish” JSON
...to create an XMLish JSON!
Standard definition led by SDMX TWG subgroup
This was the start of an important parallel evolution in SDMX, led by a subgroup of the SDMX TWG.
{"sdmx-proto-json": "2013-03-26", "structure": { "dimensions": [ { "id": "LOCATION", "name": "Country", "role": null, "codes": [{"selected": true, "id": "FRA", "name": "France"}]}, { "id": "SUBJECT", "name": "Subject", "role": null, "codes": [{"selected": true, "id": "YPTTTTL1_ST", "name": "Population mid-year estimates Total, Annual, ('000)"}]}, { "id": "FREQUENCY", "name": "Frequency", "role": null, "codes": [{"selected": true, "id": "A", "name": "Annual"}]}, { "id": "TIME_PERIOD", "name": "Time", "role": null, "codes": [{"selected": false, "id": "2010", "name": "2010"}, {"selected": false, "id": "2011","name": "2011"}]} ]}, "header": {"name": "Reference Series", "id": "none", "test": "false", "truncated": "false", "prepared": "2013-04-15T12:03:41", "sender": {"id": "OECD", "name": "Organisation for Economic Co-operation and Development"}}, "dataSets": []}
{"sdmx-proto-json": "2013-02-05", "structure": {"structure": "BISWEB_EERDATAFLOW", "href": "todo", "components": { "LOCATION": {"id": "LOCATION", "name": "Country", "values": [{"id": "FRA", "name": "France"}]}, "SUBJECT": {"id": "SUBJECT", "name": "Subject", "values": [{"id": "YPTTTTL1_ST", "name": "Population mid-year estimates Total, Annual, ('000)"}]}, "FREQUENCY": {"id": "FREQUENCY", "name": "Frequency", "values": [{"id": "A", "name": "Annual"}]}, "TIME_PERIOD": {"id": "TIME_PERIOD", "name": "Time", "values": [{"id": "2010", "name": "2010"}, {"id": "2011", "name": "2011"}]}, "packaging": { "dataSetDimensions": [], "seriesDimensions": ["LOCATION", "SUBJECT", "FREQUENCY"], "observationDimensions": ["TIME_PERIOD"] }}}, "header": {"name": "Reference Series", "id": "none", "test": "false", "truncated": "false", "prepared": "2013-04-15T13:48:40", "sender": {"id": "OECD", "name": "Organisation for Economic Co-operation and Development"}}, "dataSets": []}
Two formats proposed
CodeIndex / Slice
In late 2012, with two formats being proposed, JSON-Codeindex and JSON-Slice, the OECD set about creating an API to serve users with both proposed JSON-based formats. The OECD SDMX-PROTO-JSON API was launched in early 2013, ahead of the 2013 SIS-CC workshop.
Converged into…
After much discussion and real-world feedback from the developer community, the two formats converged...
Simplified SDMX
{“SDMX”:“JSON”}
...to create a single format…
{“SDMX”:“JSON”}
{"header":{"id":"ed9dd4bb-3725-4064-b173-842887bae108","test":false,"prepared":"2013-08-23T23:08:34Z","sender":{"id":"OECD","name":"Organisation for Economic Co-operation and Development"},"request":{"uri":"http://stats.oecd.org/SDMX-JSON/data/MEI_FIN/IRLT.AUS+AUT+BEL+CAN+CHL+CZE+DNK+EST+FIN+FRA+DEU+GRC+HUN+ISL+IRL+ISR+ITA+JPN+KOR+LUX+MEX+NLD+NZL+NOR+POL+PRT+SVK+SVN+ESP+SWE+CHE+TUR+GBR+USA+EA17+SDR+NMEC+BRA+CHN+IND+IDN+RUS+ZAF.M/OECD?startPeriod=2012&endPeriod=2013"}},"dataSets":[{"action":"Informational","annotations":[],"attributes":[],"series":{"0:0:0":{"attributes":[0],"observations":{"0":[3.795],"1":[3.97],"2":[4.15],"3":[3.8575],"4":[3.2775],"5":[2.995],"6":[2.8875],"7":[3.1875],"8":[3.0925],"9":[3.0225],"10":[3.0875],"11":[3.2275]}},"0:1:0":{"attributes":[0],"observations":{"0":[3.27],"1":[3],"2":[2.87],"3":[2.83],"4":[2.49],"5":[2.29],"6":[2.07],"7":[1.97],"8":[2.04],"9":[2.02],"10":[1.85],"11":[1.77]}},"0:2:0":{"attributes":[0],"observations":{"0":[4.01],"1":[3.48],"2":[3.33],"3":[3.42],"4":[3.22],"5":[3.09],"6":[2.74],"7":[2.55],"8":[2.59],"9":[2.53],"10":[2.4],"11":[2.19]}},"0:3:0":{"attributes":[0],"observations":{"0":[3.9171],"1":[3.7524],"2":[3.2886],"3":[3.3889],"4":[3.5291],"5":[3.4122],"6":[3.2504],"7":[3.0141],"8":[2.4256],"9":[2.3061],"10":[2.2492],"11":[2.1039]}},"0:33:0":{"attributes":[0],"observations":{"0":[8.35],"1":[8.06],"2":[8.09],"3":[8.18],"4":[8.13],"5":[8.26],"6":[8.09],"7":[8.17],"8":[8.25],"9":[8.21],"10":[7.96],"11":[8.1]}},"0:34:0":{"attributes":[0],"observations":{"0":[8.4],"1":[8.23],"2":[8.37],"3":[8.28],"4":[8.28],"5":[8.16],"6":[7.52],"7":[7.48],"8":[7.4],"9":[7.67],"10":[7.64],"11":[7.37]}}}}],"structure":{"uri":"http://stats.oecd.org/SDMX-JSON/dataflow/MEI_FIN","name":"MEI_FIN","description":"Monthly Monetary and Financial Statistics (MEI)","dimensions":{"dataSet":[],"series":[{"keyPosition":0,"id":"SUBJECT","name":"Subject","values":[{"id":"IRLT","name":"Long-term interest rates, Per cent per 
annum"}],"role":null},{"keyPosition":1,"id":"LOCATION","name":"Country","values":[{"id":"AUS","name":"Australia"},{"id":"AUT","name":"Austria"},{"id":"BEL","name":"Belgium"},{"id":"CAN","name":"Canada"},{"id":"CHL","name":"Chile"},{"id":"CZE","name":"Czech Republic"},{"id":"DNK","name":"Denmark"},{"id":"FIN","name":"Finland"},{"id":"FRA","name":"France"},{"id":"DEU","name":"Germany"},{"id":"GRC","name":"Greece"},{"id":"HUN","name":"Hungary"},{"id":"ISL","name":"Iceland"},{"id":"IRL","name":"Ireland"},{"id":"ISR","name":"Israel"},{"id":"ITA","name":"Italy"},{"id":"JPN","name":"Japan"},{"id":"KOR","name":"Korea"},{"id":"LUX","name":"Luxembourg"},{"id":"MEX","name":"Mexico"},{"id":"NLD","name":"Netherlands"},{"id":"NZL","name":"New Zealand"},{"id":"NOR","name":"Norway"},{"id":"POL","name":"Poland"},{"id":"PRT","name":"Portugal"},{"id":"SVK","name":"Slovak Republic"},{"id":"SVN","name":"Slovenia"},{"id":"ESP","name":"Spain"},{"id":"SWE","name":"Sweden"},{"id":"CHE","name":"Switzerland"},{"id":"GBR","name":"United Kingdom"},{"id":"USA","name":"United States"},{"id":"EA17","name":"Euro area (17 countries)"},{"id":"RUS","name":"Russian Federation"},{"id":"ZAF","name":"South Africa"}],"role":null},{"keyPosition":2,"id":"FREQUENCY","name":"Frequency","values":[{"id":"M","name":"Monthly"}],"role":null}],"observation":[{"keyPosition":0,"id":"TIME_PERIOD","name":"Time","values":[{"id":"2012-01","name":"Jan-2012"},{"id":"2012-02","name":"Feb-2012"},{"id":"2012-03","name":"Mar-2012"},{"id":"2012-04","name":"Apr-2012"},{"id":"2012-05","name":"May-2012"},{"id":"2012-06","name":"Jun-2012"},{"id":"2012-07","name":"Jul-2012"},{"id":"2012-08","name":"Aug-2012"},{"id":"2012-09","name":"Sep-2012"},{"id":"2012-10","name":"Oct-2012"},{"id":"2012-11","name":"Nov-2012"},{"id":"2012-12","name":"Dec-2012"}],"role":"time"}]},"attributes":{"dataSet":[],"series":[{"id":"TIME_FORMAT","name":"Time 
format","values":[{"id":"P1M","name":"Monthly"}],"role":null}],"observation":[]},"annotations":[]},"errors":[]}
BETA: stats.oecd.org/opendataapi/Json.htm
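The compactness of the single format comes from replacing repeated codes with positional indexes. The Python sketch below shows one way to expand them again, assuming (as the sample above suggests) that a series key such as "0:1:0" holds colon-separated indexes into the series dimensions' value lists, and that each observation key is an index into the TIME_PERIOD values. `SAMPLE` is a hand-trimmed copy of the response and `iter_observations` is an illustrative name, not an OECD function.

```python
# Hand-trimmed copy of the sample response: two countries, two months.
SAMPLE = {
    "dataSets": [{
        "series": {
            "0:0:0": {"attributes": [0], "observations": {"0": [3.795], "1": [3.97]}},
            "0:1:0": {"attributes": [0], "observations": {"0": [3.27], "1": [3]}},
        }
    }],
    "structure": {
        "dimensions": {
            "series": [
                {"keyPosition": 0, "id": "SUBJECT", "values": [{"id": "IRLT"}]},
                {"keyPosition": 1, "id": "LOCATION",
                 "values": [{"id": "AUS"}, {"id": "AUT"}]},
                {"keyPosition": 2, "id": "FREQUENCY", "values": [{"id": "M"}]},
            ],
            "observation": [
                {"keyPosition": 0, "id": "TIME_PERIOD",
                 "values": [{"id": "2012-01"}, {"id": "2012-02"}]},
            ],
        }
    },
}


def iter_observations(message):
    """Yield (dimension codes, time period, value) for every observation."""
    dims = message["structure"]["dimensions"]
    series_dims = sorted(dims["series"], key=lambda d: d["keyPosition"])
    periods = dims["observation"][0]["values"]
    for data_set in message["dataSets"]:
        for series_key, series in data_set["series"].items():
            # "0:1:0" -> [0, 1, 0], one index per series dimension.
            indexes = [int(part) for part in series_key.split(":")]
            codes = {
                dim["id"]: dim["values"][index]["id"]
                for dim, index in zip(series_dims, indexes)
            }
            ordered = sorted(series["observations"].items(), key=lambda kv: int(kv[0]))
            for obs_index, obs in ordered:
                # The first element of each observation array is the value.
                yield codes, periods[int(obs_index)]["id"], obs[0]


for codes, period, value in iter_observations(SAMPLE):
    print(codes["LOCATION"], period, value)
```

A client therefore needs only plain dictionary and list lookups, with no XML namespaces or schema handling.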
{“SDMX”:“JSON”}
<request> ::= "http://stats.oecd.org/SDMX-JSON/data/" <dataset name> "/" <filter expression> "/" <agency name> [ "?" <additional parameters> ]
<additional parameters> ::= "startPeriod=" <start time> [ "&" <additional parameters> ]
    | "endPeriod=" <end time> [ "&" <additional parameters> ]
    | "json-lang=" <language> [ "&" <additional parameters> ]
    | "json-format=" <json format> [ "&" <additional parameters> ]
    | "dimensionAtObservation=" <code of dimension at observation level> [ "&" <additional parameters> ]
    | "AllDimensions" [ "&" <additional parameters> ]
<filter expression> ::= <dimensions>
<dimensions> ::= <dimension> [ "." <dimensions> ]
<dimension> ::= [ <dimension values> ]
<dimension values> ::= <dimension value> [ "+" <dimension values> ]
<language> ::= "en" | "fr"
The OECD has since updated the API to reflect this single format, which can be accessed through the link shown.
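The request grammar can be turned into a small URL builder. The following Python sketch is one possible reading of it: dimension values are joined with "+", dimensions with ".", and optional parameters are appended as a query string. The function name `build_request` and its argument layout are assumptions made for illustration; the bare "AllDimensions" parameter is not covered, and no request is actually sent.

```python
from urllib.parse import urlencode

# Base address taken from the <request> rule of the grammar.
BASE_URL = "http://stats.oecd.org/SDMX-JSON/data/"


def build_request(dataset, dimensions, agency, params=None):
    """Build a data request URL.

    dimensions is a list with one entry per dimension, each entry being a
    list of codes; an empty list leaves that dimension unfiltered, which the
    grammar allows because <dimension values> is optional.
    """
    filter_expression = ".".join("+".join(codes) for codes in dimensions)
    url = f"{BASE_URL}{dataset}/{filter_expression}/{agency}"
    if params:
        # e.g. startPeriod, endPeriod, json-lang, dimensionAtObservation
        url += "?" + urlencode(params)
    return url


print(build_request(
    "MEI_FIN",
    [["IRLT"], ["AUS", "AUT"], ["M"]],
    "OECD",
    {"startPeriod": "2012", "endPeriod": "2013"},
))
```

Taking the parameters as a dictionary rather than keyword arguments keeps names such as "json-lang", which are not valid Python identifiers, usable.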
Supporting the wider developer community
However, the OECD recognised the need to reach a much wider audience, and thus...
Standard protocol for retrieving any kind of data over HTTP
...introduced an OData API, an implementation of the Microsoft OData protocol connected to OECD.Stat...
Simple lightweight data format
{"odata.metadata": "http://stats.oecd.org/OECDStatWCF_OData/OData.svc/$metadata#DotStat.OData.DataSource.DatawarehouseEntities/REFSERIES", "value": [ { "FREQUENCY": "A", "LOCATION": "FRA", "SUBJECT": "YPTTTTL1_ST", "TIME": "2010", "Flags": "", "Metadata": null, "Value": 62927.11}, { "FREQUENCY": "A", "LOCATION": "FRA", "SUBJECT": "YPTTTTL1_ST", "TIME": "2011", "Flags": "", "Metadata": null, "Value": 63249.09} ]}
...to retrieve data in a simple, lightweight format, available as both JSON and XML.
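Because each OData row is a flat record, very little code is needed to work with it. The Python sketch below assumes the JSON payload shown on the slide, with the metadata entry left out, and groups values by series. `SAMPLE` and `values_by_series` are illustrative names, not part of the OECD API.

```python
import json

# Copy of the two rows shown on the slide (assumption: metadata entry omitted).
SAMPLE = """
{"value": [
  {"FREQUENCY": "A", "LOCATION": "FRA", "SUBJECT": "YPTTTTL1_ST",
   "TIME": "2010", "Flags": "", "Metadata": null, "Value": 62927.11},
  {"FREQUENCY": "A", "LOCATION": "FRA", "SUBJECT": "YPTTTTL1_ST",
   "TIME": "2011", "Flags": "", "Metadata": null, "Value": 63249.09}
]}
"""


def values_by_series(payload):
    """Group rows into {(location, subject, frequency): {period: value}}."""
    grouped = {}
    for row in json.loads(payload)["value"]:
        key = (row["LOCATION"], row["SUBJECT"], row["FREQUENCY"])
        grouped.setdefault(key, {})[row["TIME"]] = row["Value"]
    return grouped


print(values_by_series(SAMPLE))
```

The trade-off compared with SDMX-JSON is repetition: every row carries its full set of dimension codes.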
Reaching out to the wider developer community
Having built it, we needed them to come.
OpenDataAPI
The OECD’s Statistical Information System Collaboration Community Open Data Initiative for Developers
So, in early 2013, ahead of the OECD’s Statistical Information System Collaboration Community workshop in April 2013, the Open Data Initiative for Developers was launched.
OpenDataAPI
Objectives: benefits and ease of use
The objective of the Open Data Initiative for Developers was to demonstrate the benefits and ease of use of the OECD.Stat Open Data web services. It exposed OECD data through two new beta web services so that interested web developers could build innovative applications based on OECD public data, and it gathered feedback from developers to improve the web services in future versions.
The Open Data Initiative for Developers was a great success. It concluded at the SIS-CC 2013 workshop held in Paris, where a group of developers reviewed each submission, drew up a shortlist, and presented the final submissions and overall feedback to the wider group.
http://www.drasticdata.nl/ProjectOECD
http://www3.inegi.org.mx/rnm/sdmx
http://www.uis.unesco.org/das/Country/OdataTest
Shortlisted submissions
SDMX-JSON: very simple to use
OData: a simple model that is popular with developers
Overall, the feedback was positive and directly contributed to the further development of the APIs...
{“SDMX”:“JSON”}
…and in particular the final single JSON format
Potential future formats
The OECD may introduce APIs serving more formats, which could include:
- Google Data (a REST-inspired technology)
- Google Dataset Publishing Language (DSPL)
- Google KML, a geospatial file format
Wider benefits
The APIs are also contributing to the OECD data portal project and have supported the development of a beta mobile version of OECD.Stat.
References
1. sdmx.org
2. xavierbadosa.com/oecd
3. github.com/sdmx-twg
4. odata.org
5. stats.oecd.org/opendataapi/Json.htm (BETA)
6. stats.oecd.org/opendataapi/OData.html (BETA)
Jonathan Challener, OECD
[email protected]
@Challener
Thank you