ckan 2.0 introduction (20140618 updated)


Upload: chengjen-lee

Post on 06-Jul-2015


Page 1: ckan 2.0 Introduction (20140618 updated)

CKAN 2 Introduction

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Taiwan License.

Presenter: 李承錱 Cheng-Jen Lee (Sol)

Email: cjlee AT iis.sinica.edu.tw


2014/6/18 2

Agenda

● About CKAN
● Feature Tour
  – Publish & Find Datasets
  – Store & Manage Data
  – Engage with Users & Others
  – Customise & Extend
● CKAN and 5★ Open Data
● Showcase
● Issues


About CKAN

● The Comprehensive Knowledge Archive Network
● A powerful data management system
  – Publishing
  – Sharing
  – Finding
  – Using Data

About CKAN

83 instances around the world in May 2014


CKAN 2


Feature Tour

Demo Site: demo.ckan.org


Feature Tour (1) Publish & Find Datasets

Add Dataset Basic Information

Feature Tour (1) Publish & Find Datasets

Add Data Under the Dataset


Feature Tour (1) Publish & Find Datasets

Add Metadata About the Dataset

Feature Tour (1) Publish & Find Datasets

Filter By Keywords

Feature Tour (1) Publish & Find Datasets

Filter By Geographical Features
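
The keyword and geographical filters shown on these slides can also be expressed as a search request against the Action API. The following is a minimal sketch, not the demo site's own code: the helper name build_search_url is hypothetical, and the ext_bbox parameter is assumed to be available only when ckanext-spatial's search integration is enabled.

```python
from urllib.parse import urlencode


def build_search_url(site_url, keyword=None, tags=(), bbox=None, rows=10):
    """Build a package_search URL (hypothetical helper, for illustration)."""
    params = {"rows": rows}
    if keyword:
        # Free-text query, as typed into the search box.
        params["q"] = keyword
    if tags:
        # fq uses Solr filter syntax; every listed tag must match.
        params["fq"] = " ".join('tags:"{}"'.format(t) for t in tags)
    if bbox:
        # Assumption: ckanext-spatial is installed and reads ext_bbox
        # as minx,miny,maxx,maxy in longitude/latitude order.
        params["ext_bbox"] = ",".join(str(v) for v in bbox)
    base = site_url.rstrip("/") + "/api/3/action/package_search"
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url("http://demo.ckan.org", keyword="water"))
    print(build_search_url("http://demo.ckan.org", tags=["economy"],
                           bbox=(120.0, 22.8, 120.7, 23.5)))
```

The function only builds the URL, so it can be tried without network access; fetching the URL returns a JSON document whose result field lists the matching datasets.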

Feature Tour (2) Store & Manage Data

Data Explorer:

– recline_preview (csv, xls)
– json_preview
– pdf_preview
– ckanext-spatial

Feature Tour (2) Store & Manage Data

Graphing data

Feature Tour (3) Engage with Users & Others

Share

Feature Tour (3) Engage with Users & Others

Organization

Feature Tour (3) Engage with Users & Others

Manage Users of an Organization

Feature Tour (3) Engage with Users & Others

Manage Roles of Members

● Admin (管理者): edit datasets & members
● Editor (編輯): edit datasets
● Viewer (成員): view (private) datasets

Note: Public datasets are visible to everyone
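
The role rules above can be restated as a small lookup table. This is only an illustrative model of the slide, not CKAN's authorization code: the names ROLES and is_allowed are hypothetical, and it assumes that admins and editors can also view their organization's private datasets.

```python
# Simplified model of the three organization roles listed on the slide.
# Assumption: admins and editors may also view private datasets.
ROLES = {
    "admin": {"view_private", "edit_dataset", "edit_members"},
    "editor": {"view_private", "edit_dataset"},
    "viewer": {"view_private"},
}


def is_allowed(role, action, private=True):
    """Return True if a user with `role` may perform `action`.

    `role` is None for anonymous visitors or non-members.
    """
    granted = ROLES.get(role, set())
    if action == "view":
        # Public datasets are visible to everyone.
        return (not private) or ("view_private" in granted)
    return action in granted


if __name__ == "__main__":
    for role in (None, "viewer", "editor", "admin"):
        print(role, is_allowed(role, "view"), is_allowed(role, "edit_dataset"))
```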

Feature Tour (3) Engage with Users & Others

Harvest and Federation

Feature Tour (3) Engage with Users & Others

History

Feature Tour (4) Customise & Extend

● RESTful JSON APIs
  – The Action API
  – The DataStore API
  – The FileStore API...
● Extensions (over 60)
  – ckanext-harvest
  – ckanext-spatial
● Themable
● Integrates with other CMS (e.g. Drupal)

Open source is good!
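
As a rough illustration of the Action API listed above, the sketch below prepares a JSON request for an action and unwraps the standard response envelope. The helper names (action_request, unwrap, call_action) are hypothetical; the /api/3/action/ path and the success/result fields follow the Action API as commonly documented, but check them against the version you run.

```python
import json
from urllib.request import Request, urlopen


def action_request(site_url, action, data=None, api_key=None):
    """Prepare a POST request for one Action API call (hypothetical helper)."""
    url = "{}/api/3/action/{}".format(site_url.rstrip("/"), action)
    body = json.dumps(data or {}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Needed only for actions that require authorization.
        headers["Authorization"] = api_key
    return Request(url, data=body, headers=headers)


def unwrap(response_text):
    """Return the 'result' field, or raise if the call reported failure."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise RuntimeError(payload.get("error"))
    return payload["result"]


def call_action(site_url, action, data=None, api_key=None):
    """Send the request over the network and return the unwrapped result."""
    with urlopen(action_request(site_url, action, data, api_key)) as resp:
        return unwrap(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Requires network access to a running CKAN site.
    print(call_action("http://demo.ckan.org", "package_list")[:5])
```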

CKAN and 5★ Open Data

CKAN and 5★ Open Data

● ★ Make your stuff available on the Web (whatever format) under an open license

CKAN and 5★ Open Data

● ★★ Make it available as structured data (e.g., Excel instead of image scan of a table)
● ★★★ Use non-proprietary formats (e.g., CSV instead of Excel)
  – Accept any data format
  – Beautiful data demonstration
  – Datastore: Indexing for structured data
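
Once a resource has been loaded into the DataStore, its rows can be queried through the datastore_search action. A minimal sketch follows; the helper name datastore_search_url and the resource id in the usage example are made up for illustration.

```python
import json
from urllib.parse import urlencode


def datastore_search_url(site_url, resource_id, q=None, filters=None, limit=5):
    """Build a datastore_search URL (hypothetical helper, for illustration)."""
    params = {"resource_id": resource_id, "limit": limit}
    if q:
        # Full-text query across the indexed columns.
        params["q"] = q
    if filters:
        # Exact-match filters are passed as a JSON object.
        params["filters"] = json.dumps(filters, sort_keys=True)
    base = site_url.rstrip("/") + "/api/3/action/datastore_search"
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    # "abc-123" is a placeholder, not a real resource id.
    print(datastore_search_url("http://demo.ckan.org", "abc-123", q="taipei"))
```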

CKAN and 5★ Open Data

● ★★★★ Use URIs to denote things, so that people can point at your stuff
  – Permanent link for each dataset
  – Get Dataset URI through API
● ★★★★★ Link your data to other data to provide context
  – Linked data and RDF for metadata
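
One way to read "permanent link for each dataset" is that the dataset page URL can be derived from the metadata returned by the API. The sketch below assumes the usual /dataset/<name-or-id> URL pattern and a package dictionary with "id" and "name" keys; the function name and the sample id are hypothetical.

```python
def dataset_uri(site_url, package, prefer_id=False):
    """Derive a dataset's page URI from its metadata dictionary.

    Assumption: the site serves datasets under /dataset/<name or id>.
    The id does not change when a dataset is renamed, so prefer_id=True
    gives the more stable link.
    """
    key = package["id"] if prefer_id else package["name"]
    return "{}/dataset/{}".format(site_url.rstrip("/"), key)


if __name__ == "__main__":
    # Made-up metadata in the shape of an API result.
    pkg = {"id": "0000-placeholder-id", "name": "gold-prices"}
    print(dataset_uri("http://thedatahub.org", pkg))
    print(dataset_uri("http://thedatahub.org", pkg, prefer_id=True))
```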

CKAN and 5★ Open Data

● RDF for metadata
  – DCAT and Dublin Core
  – curl -L -H "Accept:application/rdf+xml" http://thedatahub.org/dataset/gold-prices
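
The curl command above asks for the RDF representation through HTTP content negotiation. A roughly equivalent request in Python might look like the following sketch; the function names are hypothetical, and whether the URL on the slide still responds is not something this example can confirm.

```python
from urllib.request import Request, urlopen


def rdf_request(dataset_url):
    """Prepare a GET request that asks for RDF/XML instead of HTML."""
    return Request(dataset_url, headers={"Accept": "application/rdf+xml"})


def fetch_rdf(dataset_url):
    """Fetch the RDF/XML text; urlopen follows redirects, like curl -L."""
    with urlopen(rdf_request(dataset_url)) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Requires network access; the URL is the one shown on the slide.
    print(fetch_rdf("http://thedatahub.org/dataset/gold-prices")[:300])
```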

Agenda

● About CKAN
● Feature Tour
  – Publish & Find Datasets
  – Store & Manage Data
  – Engage with Users & Others
  – Customise & Extend
● CKAN and 5★ Open Data
● Showcase
● Issues

United Kingdom: DATA.GOV.UK

United States: DATA.GOV

Brazil: DADOS.GOV.BR

European Union: PUBLICDATA.EU

Japan: DATA.GO.JP

Tainan: DATA.TAINAN.GOV.TW (NEW!)

Geospatial Data Explorer: Lat/Long field

Geospatial Data Explorer: GeoJSON
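
The two map previews are closely related: tabular rows with latitude and longitude columns can be turned into GeoJSON points. The sketch below shows that conversion under assumed column names ("lat" and "lng"); it is not the Data Explorer's own implementation.

```python
import json


def rows_to_geojson(rows, lat_field="lat", lon_field="lng"):
    """Convert dict rows with lat/long columns into a GeoJSON FeatureCollection.

    Rows whose coordinates are missing or not numeric are skipped.
    """
    features = []
    for row in rows:
        try:
            lat = float(row[lat_field])
            lon = float(row[lon_field])
        except (KeyError, TypeError, ValueError):
            continue
        properties = {k: v for k, v in row.items()
                      if k not in (lat_field, lon_field)}
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    sample = [
        {"name": "Tainan", "lat": "22.99", "lng": "120.21"},
        {"name": "incomplete row", "lat": "", "lng": "x"},
    ]
    print(json.dumps(rows_to_geojson(sample), indent=2))
```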

Geospatial Data Explorer: WMS

Agenda

● About CKAN
● Feature Tour
  – Publish & Find Datasets
  – Store & Manage Data
  – Engage with Users & Others
  – Customise & Extend
● CKAN and 5★ Open Data
● Showcase
● Issues

Issues

● CJK Support
  – CJK Search
  – Some broken translations
  – File names
● Extension compatibility
● Tons of tweaks needed
● Performance issues
● Complicated architecture


System Architecture

Issues (Cont'd)

● What You Should Know
  – Python & Pylons
  – ckan plugins toolkit
  – SQLAlchemy & SQL
  – HTML, JavaScript
  – Babel (Translation)
  – Web Server (UNIX, Apache, Nginx ...)

Resources

● Official Documents:
  – http://docs.ckan.org/en/latest/
● Installation Notes (in Chinese):
  – https://ckan-docs-tw.readthedocs.org/
● CKAN Development Discussions:
  – http://lists.okfn.org/mailman/listinfo/ckan-dev
● CKAN Taiwan Interest Group:
  – https://groups.google.com/forum/#!forum/ckan-taiwan-interest-group

Thanks for your attention! Any questions?

Email: u10313335 AT citi.sinica.edu.tw

http://about.me/sollee


CKAN 2: Additional Topics

Presenter: 李承錱 Cheng-Jen Lee (Sol)

Email: u10313335 AT citi.sinica.edu.tw

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Taiwan License.

Agenda

● Extended Topic 1: Installation
● Extended Topic 2: Harvesters

Install from Source

● Create a virtual environment
● Check out the source (via Git)
  – https://github.com/okfn/ckan
● Create a CKAN config file
● Set up Jetty & Solr
● Initialize the database (user, db)
● Link to who.ini
● Create a sysadmin user
● Deployment (Apache + Nginx)
● Install other extensions...
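
Several of the steps above (database, Solr, who.ini) end up as settings in the CKAN config file. As a hedged illustration, the sketch below parses a small sample of such a file and pulls out the values worth double-checking after installation. The sample values, including the password, are placeholders, and read_ckan_settings is a hypothetical helper rather than part of CKAN.

```python
from configparser import ConfigParser

# Placeholder excerpt in the style of a CKAN .ini file; the values
# are assumptions for illustration, not a recommended configuration.
SAMPLE_INI = """
[app:main]
ckan.site_url = http://localhost:5000
sqlalchemy.url = postgresql://ckan_default:pass@localhost/ckan_default
solr_url = http://127.0.0.1:8983/solr
ckan.plugins = stats text_preview recline_preview
who.config_file = %(here)s/who.ini
"""


def read_ckan_settings(ini_text):
    """Return the main connection settings from CKAN-style ini text."""
    # Interpolation is disabled because %(here)s is resolved by the
    # application loader, not by configparser.
    parser = ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    main = parser["app:main"]
    return {
        "site_url": main.get("ckan.site_url"),
        "database": main.get("sqlalchemy.url"),
        "solr": main.get("solr_url"),
        "plugins": main.get("ckan.plugins", "").split(),
        "who_ini": main.get("who.config_file"),
    }


if __name__ == "__main__":
    for key, value in read_ckan_settings(SAMPLE_INI).items():
        print(key, "=", value)
```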

Installation Notes

● https://ckan-docs-tw.readthedocs.org/

Agenda

● Extended Topic 1: Installation
● Extended Topic 2: Harvesters

Harvesters

● ckanext-harvest
  – Remote harvesting extension
  – https://github.com/okfn/ckanext-harvest
● Source Type
  – CKAN (built-in)
  – CSW
  – WAF
  – Custom (csv, xls, website, etc.)

Harvested from TGOS CSW service

Harvesters

http://Mydomain.com/harvest

Harvesters

Add a new harvest source

Harvesters

Create a harvest job

Harvesters

Overview of harvested datasets

Harvesters

Background Process

● Manually
  – (pyenv) $ paster --plugin=ckanext-harvest harvester gather_consumer -c /etc/ckan/default/production.ini
  – (pyenv) $ paster --plugin=ckanext-harvest harvester fetch_consumer -c /etc/ckan/default/production.ini
  – (pyenv) $ paster --plugin=ckanext-harvest harvester run -c /etc/ckan/default/production.ini
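
The three commands differ only in their subcommand, so a small wrapper can build them from one place. This is a sketch under the assumption that paster is on the PATH of the active virtual environment; harvester_command and run_once are hypothetical names.

```python
import subprocess

# Config path as shown on the slide; adjust for your installation.
CONFIG = "/etc/ckan/default/production.ini"


def harvester_command(subcommand, config=CONFIG):
    """Return the argument list for one ckanext-harvest paster command."""
    return ["paster", "--plugin=ckanext-harvest", "harvester",
            subcommand, "-c", config]


def run_once(config=CONFIG):
    """Trigger pending harvest jobs; returns the command's exit code.

    Assumes the gather_consumer and fetch_consumer processes are
    already running in the background.
    """
    return subprocess.call(harvester_command("run", config))


if __name__ == "__main__":
    for sub in ("gather_consumer", "fetch_consumer", "run"):
        print(" ".join(harvester_command(sub)))
```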

Harvesters

Background Process

● Automatically
  – Supervisor (for gather & fetch consumer)
  – Cron (for run)

Harvesters

Custom harvester

● Implement the harvester interface to perform harvesting operations
● Three stages
  – gather: get the identification
  – fetch: fetch the contents
  – import: create ckan package (dataset)
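
To make the division of labour between the three stages concrete, here is a toy, self-contained simulation. It does not use ckanext-harvest at all: the "remote source" is a plain dictionary, and the method signatures are simplified compared with the real interface, which passes harvest job and harvest object instances.

```python
class InMemoryHarvester:
    """Toy harvester whose remote source is a dict of id -> raw text."""

    def __init__(self, remote):
        self.remote = remote
        self.datasets = {}  # stands in for the CKAN catalogue

    def gather_stage(self):
        # Gather: only collect the identifiers of the remote records.
        return sorted(self.remote)

    def fetch_stage(self, remote_id):
        # Fetch: retrieve the raw content for one identifier.
        return self.remote.get(remote_id)

    def import_stage(self, remote_id, content):
        # Import: turn the raw content into a dataset record.
        if content is None:
            return False
        title, _, notes = content.partition("\n")
        self.datasets[remote_id] = {
            "name": remote_id, "title": title, "notes": notes,
        }
        return True


def run_harvest(harvester):
    """Run gather, then fetch and import for each gathered identifier."""
    imported = 0
    for remote_id in harvester.gather_stage():
        content = harvester.fetch_stage(remote_id)
        if harvester.import_stage(remote_id, content):
            imported += 1
    return imported


if __name__ == "__main__":
    h = InMemoryHarvester({"survey-a": "Survey A\nFirst wave",
                           "survey-b": "Survey B\nSecond wave"})
    print(run_harvest(h), "datasets imported")
    print(h.datasets["survey-a"])
```

Splitting the work this way lets gathering and fetching run as separate queue consumers, which matches the gather_consumer and fetch_consumer processes shown earlier.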

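Harvesters

The harvesting interface

    # Absolute import, for a harvester that lives outside ckanext-harvest itself
    from ckanext.harvest.harvesters.base import HarvesterBase

    class SRDAHarvester(HarvesterBase):

        def _set_config(self, config_str):
            pass  # typically: parse the harvest source's configuration string

        def info(self):
            pass  # return the harvester's name, title and description

        # ...

        def gather_stage(self, harvest_job):
            pass  # collect remote identifiers, return harvest object ids

        def fetch_stage(self, harvest_object):
            pass  # download the remote content for one harvest object

        def import_stage(self, harvest_object):
            pass  # create or update the CKAN package (dataset)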

See the extension site for details.
An example (SRDA): http://goo.gl/ZMnND7