CKAN 2.0 Introduction (updated 2014/6/18)
CKAN 2 Introduction
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Taiwan License.
Presenter: 李承錱 Cheng-Jen Lee (Sol)
Email: cjlee AT iis.sinica.edu.tw
2014/6/18 2
Agenda
● About CKAN
● Feature Tour
  – Publish & Find Datasets
  – Store & Manage Data
  – Engage with Users & Others
  – Customise & Extend
● CKAN and 5★ Open Data
● Showcase
● Issues
About CKAN
● The Comprehensive Knowledge Archive Network
● A powerful data management system
  – Publishing
  – Sharing
  – Finding
  – Using Data
About CKAN
83 instances around the world in May 2014
CKAN 2
Feature Tour
Feature Tour (1) Publish & Find Datasets
Add Dataset Basic Information
Feature Tour (1) Publish & Find Datasets
Add Data Under the Dataset
Feature Tour (1) Publish & Find Datasets
Add Metadata About the Dataset
Feature Tour (1) Publish & Find Datasets
Filter By Keywords
Feature Tour (1) Publish & Find Datasets
Filter By Geographical Features
Feature Tour (2) Store & Manage Data
Data Explorer:
– recline_preview (csv, xls)
– json_preview
– pdf_preview
– ckanext-spatial
Feature Tour (2) Store & Manage Data
Graphing data
Feature Tour (3) Engage with Users & Others
Share
Feature Tour (3) Engage with Users & Others
Organization
Feature Tour (3) Engage with Users & Others
Manage Users of an Organization
Feature Tour (3) Engage with Users & Others
Manage Member Roles
● Admin (管理者, "administrator"): edit datasets & members
● Editor (編輯, "editor"): edit datasets
● Viewer (成員, "member"): view (private) datasets
Note: Public datasets are visible to everyone
Feature Tour (3) Engage with Users & Others
Harvest and Federation
Feature Tour (3) Engage with Users & Others
History
Feature Tour (4) Customise & Extend
● RESTful JSON APIs
  – The Action API
  – The DataStore API
  – The FileStore API
  – ...
● Extensions (over 60)
  – ckanext-harvest
  – ckanext-spatial
● Themable
● Integrates with other CMS (e.g. Drupal)
Open source is good!
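To make the Action API bullet concrete, here is a minimal Python sketch of a GET call to package_search. It assumes the version 3 URL layout (/api/3/action/<name>) and the standard response envelope with "success" and "result" members. The site address http://Mydomain.com is only a placeholder, and the helper names action_url, unwrap and call_action are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def action_url(site_url, action, **params):
    """Build the URL for a GET call to a CKAN Action API function."""
    url = "{}/api/3/action/{}".format(site_url.rstrip("/"), action)
    if params:
        # Sort the parameters so the generated URL is deterministic.
        url += "?" + urlencode(sorted(params.items()))
    return url


def unwrap(response_text):
    """Return the "result" member of an Action API reply, or raise on failure."""
    body = json.loads(response_text)
    if not body.get("success"):
        raise RuntimeError(body.get("error"))
    return body["result"]


def call_action(site_url, action, **params):
    """Perform the HTTP request (needs network access to a CKAN site)."""
    with urlopen(action_url(site_url, action, **params)) as resp:
        return unwrap(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholder site; replace with a real CKAN instance before calling call_action().
    print(action_url("http://Mydomain.com", "package_search", q="gold", rows=5))
```

Against a live site, call_action("http://Mydomain.com", "package_search", q="gold") would return a dict whose "results" member lists the matching datasets.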
CKAN and 5★ Open Data
CKAN and 5★ Open Data
● ★ Make your stuff available on the Web (whatever format) under an open license
CKAN and 5★ Open Data
● ★★ Make it available as structured data (e.g., Excel instead of image scan of a table)
● ★★★ Use non-proprietary formats (e.g., CSV instead of Excel)
  – Accepts any data format
  – Attractive data previews
  – DataStore: indexing for structured data
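As an illustration of what the DataStore adds for structured data, the sketch below prepares a datastore_search call as a JSON POST and extracts the matching rows from a reply. It assumes the resource has already been loaded into the DataStore. The resource id, the "year" field and the function names are hypothetical, and http://Mydomain.com is a placeholder.

```python
import json
from urllib.request import Request


def datastore_search_request(site_url, resource_id, filters=None, limit=10):
    """Prepare (but do not send) a POST request for the datastore_search action."""
    payload = {"resource_id": resource_id, "limit": limit}
    if filters:
        # Exact-match conditions, e.g. {"year": 2014} (hypothetical field name).
        payload["filters"] = filters
    return Request(
        site_url.rstrip("/") + "/api/3/action/datastore_search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def records_from(response_text):
    """Pull the list of rows out of a datastore_search reply."""
    body = json.loads(response_text)
    if not body.get("success"):
        raise RuntimeError(body.get("error"))
    return body["result"]["records"]


if __name__ == "__main__":
    req = datastore_search_request(
        "http://Mydomain.com", "hypothetical-resource-id", filters={"year": 2014}
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the prepared request to urllib.request.urlopen sends it. The sketch stops short of that so it can be read and run without a live CKAN site.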
CKAN and 5★ Open Data
● ★★★★ Use URIs to denote things, so that people can point at your stuff
  – Permanent link for each dataset
  – Get dataset URI through API
● ★★★★★ Link your data to other data to provide context
  – Linked data and RDF for metadata
CKAN and 5★ Open Data
● RDF for metadata
  – DCAT and Dublin Core
  – curl -L -H "Accept:application/rdf+xml" http://thedatahub.org/dataset/gold-prices
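The curl command above asks for RDF through HTTP content negotiation. A rough Python equivalent, using only the standard library, might look like this. The function names are made up for the example, and whether the URL on the slide still serves RDF is not checked here.

```python
from urllib.request import Request, urlopen

RDF_XML = "application/rdf+xml"


def rdf_request(dataset_url):
    """Prepare a GET request that asks for the RDF/XML form of a dataset page."""
    return Request(dataset_url, headers={"Accept": RDF_XML})


def fetch_rdf(dataset_url):
    """Send the request; urlopen follows redirects by default, like curl -L."""
    with urlopen(rdf_request(dataset_url)) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    req = rdf_request("http://thedatahub.org/dataset/gold-prices")
    print(req.get_method(), req.full_url, req.get_header("Accept"))
```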
Showcase
United Kingdom: DATA.GOV.UK
United States: DATA.GOV
Brazil: DADOS.GOV.BR
European Union: PUBLICDATA.EU
Japan: DATA.GO.JP
Tainan: DATA.TAINAN.GOV.TW (NEW!)
Geospatial Data Explorer: Lat/Long field
Geospatial Data Explorer: GeoJSON
Geospatial Data Explorer: WMS
Issues
● CJK Support
  – CJK search
  – Some broken translations
  – File names
● Extension compatibility
● Tons of tweaks needed
● Performance issues
● Complicated architecture
System Architecture
Issues (Cont'd)
● What You Should Know
  – Python & Pylons
  – CKAN plugins toolkit
  – SQLAlchemy & SQL
  – HTML, JavaScript
  – Babel (translation)
  – Web server (UNIX, Apache, Nginx ...)
Resources
● Official Documents:
  – http://docs.ckan.org/en/latest/
● Installation Notes (in Chinese):
  – https://ckan-docs-tw.readthedocs.org/
● CKAN Development Discussions:
  – http://lists.okfn.org/mailman/listinfo/ckan-dev
● CKAN Taiwan Interest Group:
  – https://groups.google.com/forum/#!forum/ckan-taiwan-interest-group
Thanks for your attention! Any questions?
Email: u10313335 AT citi.sinica.edu.tw
http://about.me/sollee
CKAN 2: Additional Topics
Presenter: 李承錱 Cheng-Jen Lee (Sol)
Email: u10313335 AT citi.sinica.edu.tw
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Taiwan License.
Agenda
● Extended Topic 1: Installation
● Extended Topic 2: Harvesters
Install from Source
● Virtual environment
● Check out the source (via Git)
  – https://github.com/okfn/ckan
● Create a CKAN config file
● Set up Jetty & Solr
● Initialize the database (user, db)
● Link to who.ini
● Create a sysadmin user
● Deployment (Apache + Nginx)
● Install other extensions...
Installation Notes
● https://ckan-docs-tw.readthedocs.org/
Harvesters
● ckanext-harvest
  – Remote harvesting extension
  – https://github.com/okfn/ckanext-harvest
● Source Type
  – CKAN (built-in)
  – CSW
  – WAF
  – Custom (csv/xls/website, etc.)
Harvested from TGOS CSW service
Harvesters
http://Mydomain.com/harvest
Harvesters
Add a new harvest source
Harvesters
Create a harvest job
Harvesters
Overview of harvested datasets
Harvesters
Background Process
● Manually
  – (pyenv) $ paster --plugin=ckanext-harvest harvester gather_consumer -c /etc/ckan/default/production.ini
  – (pyenv) $ paster --plugin=ckanext-harvest harvester fetch_consumer -c /etc/ckan/default/production.ini
  – (pyenv) $ paster --plugin=ckanext-harvest harvester run -c /etc/ckan/default/production.ini
Harvesters
Background Process
● Automatically
  – Supervisor (for gather & fetch consumer)
  – Cron (for run)
Harvesters
Custom harvester
● Implement the harvester interface to perform harvesting operations
● Three stages
  – gather: get the identifiers of the remote records
  – fetch: fetch the contents
  – import: create the CKAN package (dataset)
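The three stages can be pictured with a toy pipeline. This is only a sketch of the control flow under simplifying assumptions: ToyHarvester is a made-up stand-in that does not use CKAN or ckanext-harvest, the remote source is a plain dict, and the real extension runs the stages through background consumers and passes harvest job/object records rather than raw ids.

```python
class ToyHarvester:
    """Hypothetical stand-in for a harvester; NOT the ckanext-harvest base class."""

    def __init__(self, remote):
        self.remote = remote    # pretend remote catalogue: id -> raw record
        self.packages = {}      # stands in for the CKAN dataset store

    def gather_stage(self):
        # Gather: only collect the identifiers of what should be harvested.
        return sorted(self.remote)

    def fetch_stage(self, remote_id):
        # Fetch: retrieve the content for one identifier (None if it vanished).
        return self.remote.get(remote_id)

    def import_stage(self, remote_id, content):
        # Import: turn the fetched content into a package (dataset).
        if content is None:
            return False
        self.packages[remote_id] = {"name": remote_id, "title": content["title"]}
        return True


def run(harvester):
    """Drive the three stages in order and report which ids were imported."""
    imported = []
    for remote_id in harvester.gather_stage():
        content = harvester.fetch_stage(remote_id)
        if harvester.import_stage(remote_id, content):
            imported.append(remote_id)
    return imported


if __name__ == "__main__":
    demo = ToyHarvester({"survey-b": {"title": "Survey B"}, "survey-a": {"title": "Survey A"}})
    print(run(demo))
    print(demo.packages)
```

Keeping gather cheap (ids only) and doing the heavy work per object in fetch and import is what lets the work be queued and processed object by object.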
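Harvesters
The harvesting interface
# Absolute form of the slide's package-relative "from base import HarvesterBase"
from ckanext.harvest.harvesters.base import HarvesterBase

class SRDAHarvester(HarvesterBase):
    def _set_config(self, config_str):
        pass  # parse the harvest source's configuration string
    def info(self):
        pass  # return a dict describing this harvester
    # ...
    def gather_stage(self, harvest_job):
        pass  # collect the remote identifiers for this job
    def fetch_stage(self, harvest_object):
        pass  # fetch the remote content for one harvest object
    def import_stage(self, harvest_object):
        pass  # create or update the CKAN package (dataset)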
See the extension site for details
An example (SRDA): http://goo.gl/ZMnND7