plone integration with exist-db - structured content rocks

Structured Content Rocks! Integration of eXist-db with Plone Andreas Jung/@MacYET ZOPYX • www.zopyx.com Plone Conference 2014 • Bristol, UK

Upload: andreas-jung

Post on 24-Jun-2015


DESCRIPTION

Integration of Plone with eXist-db (XML database). Given at the Plone Conference 2014 in Bristol.

TRANSCRIPT

Page 1: Plone Integration with eXist-db - Structured Content rocks

Structured Content Rocks!
Integration of eXist-db with Plone

Andreas Jung/@MacYET ZOPYX • www.zopyx.com

Plone Conference 2014 • Bristol, UK


Python, Plone, Zope nerd
Publishing wizard
Dinosaur of Zope (Paul Everitt)



www.produce-and-publish.com Professional XML Publishing (C) 2014 ZOPYX

Agenda

‣ XML-based publication workflows
  ‣ context:
    ‣ DOCX ➝ XML conversion
    ‣ XML ➝ PDF/EPUB conversion
‣ Integration of Plone with the XML database eXist-db


What is Structured Content?

‣ XML of course
‣ HTML is not suitable for publishing purposes in general
‣ XML Schemas or Document Type Definitions for
  ‣ defining the exact structure of a document
  ‣ syntactic and semantic validation
‣ industry standard in the publishing world
‣ de facto exchange format with third-party applications
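A minimal sketch of what "validation" means in practice, using only the Python standard library. It checks well-formedness and one hand-written structural rule as a stand-in for a real DTD or XML Schema check (which would need a validating parser). The element names `guideline`, `title` and `section` are hypothetical and not taken from the Onkopedia schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule set: element name -> child elements it must contain.
REQUIRED_CHILDREN = {"guideline": ["title", "section"]}


def check_document(xml_text: str) -> list[str]:
    """Return a list of problems; an empty list means the document passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # The text is not even well-formed XML.
        return [f"not well-formed: {exc}"]
    problems = []
    for element in root.iter():
        for child in REQUIRED_CHILDREN.get(element.tag, []):
            if element.find(child) is None:
                problems.append(f"<{element.tag}> is missing <{child}>")
    return problems


good = "<guideline><title>T</title><section/></guideline>"
bad = "<guideline><title>T</title></guideline>"
print(check_document(good))
print(check_document(bad))
```

A schema or DTD expresses such rules declaratively, so the same rule file can be shared with third-party applications.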


What is eXist-db?

‣ A NoSQL document database and application platform
‣ Open-source XML database written in Java
‣ stores documents: XML/HTML
‣ stores arbitrary (binary) data (DOCX, PDF, images, …)
‣ XML technology: XPath 3, XForms, XSLT 2, XQuery 3, XUpdate
‣ comes with Lucene for full-text indexing
‣ open for all related Java XML technology


Why eXist-db?

‣ Hierarchical storage model (collections -> folders)
‣ Content and scripts accessible through WebDAV
‣ Scripting using XQuery
‣ XQuery scripts callable through REST API
‣ Script results serializable to JSON, HTML, XML
‣ Very good experience during evaluation period
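To illustrate the REST access mentioned above, here is a small sketch that only builds the URL for an ad-hoc query; it sends nothing. `_query` and `_howmany` are parameters of the eXist-db REST interface. The host, port and the collection path `/db/de/onkopedia` are assumptions for illustration.

```python
from urllib.parse import quote, urlencode

# Assumed base URL of an eXist-db REST endpoint; adjust to the real server.
EXIST_REST = "http://localhost:8080/exist/rest"


def build_query_url(collection: str, xquery: str, howmany: int = 10) -> str:
    """Return a GET URL asking eXist-db to evaluate `xquery` in `collection`."""
    params = urlencode({"_query": xquery, "_howmany": howmany})
    return f"{EXIST_REST}{quote(collection)}?{params}"


url = build_query_url("/db/de/onkopedia", "//title")
print(url)
```

Any HTTP client can then fetch the URL, which is what makes the database easy to reach from Plone or from external systems.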


How do we use eXist-db?

‣ storing XML documents
‣ indexing XML documents
‣ searching XML documents
‣ aggregation of XML documents
‣ manipulation of XML documents


Onkopedia project

‣ www.dgho-onkopedia.de
‣ www.onkopedia-guidelines.info

‣ Plone project since 2010

‣ Portal for medical guidelines on the diagnosis and treatment of hematological and oncological diseases

‣ DOCX ➝ HTML ➝ PDF (Produce & Publish)

‣ Owned by Deutsche Gesellschaft für Hämatologie und Medizinische Onkologie in cooperation with further medical societies (AT, CH)


Current editorial workflow

Word -> XHTML (OpenOffice, webservice)

Editorial fine-tuning for images, imagemaps, linking

Conversion to EPUB and PDF

Publishing


Reasons for switching to XML

‣ HTML not suitable for further requirements
‣ implementation too tightly coupled to Plone
‣ a lot of fragile and workaround code for Plone
‣ need for better production-safety
‣ need for better automated production
‣ interfaces and APIs for external systems requested by other vendors


Content structure inside eXist-db


root
  de, en
    onkopedia, my-onkopedia, onkopedia-p, knowledge-database
      mammakarzinom-der-frau, mammakarzinom-des-mannes
        current
          source/incoming.docx
          xml/index.xml
          html/index.html
          media/1.jpg, media/2.jpg
          pdf/index.pdf
        archive
          Version 01.04.2013, Version 07.08.2014, Version 25.03.2012
            source/incoming.docx
            xml/index.xml
            html/index.html
            media/1.jpg, media/2.jpg
            pdf/index.pdf
        draft
          source/incoming.docx
          xml/index.xml
          html/index.html
          media/1.jpg, media/2.jpg
          pdf/index.pdf


Publish


Archive


How to map this into Plone?


Connector

http://host/de/my-onkopedia/mammakarzinom-der-frau/archive/version-25.03.2014/@@view/xml/index.xml
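The connector URL has two visible parts: the path traversed through Plone and the connector, and the resource path after the `@@view` segment. The sketch below only splits a URL at that segment. How zopyx.existdb resolves each part internally is not shown on the slides, so the function name and return shape are assumptions for illustration.

```python
from urllib.parse import urlsplit


def split_connector_url(url: str, view: str = "@@view") -> tuple[str, str]:
    """Split a URL into (path before the view segment, path after it)."""
    path = urlsplit(url).path
    before_view, found, after_view = path.partition(f"/{view}/")
    if not found:
        raise ValueError(f"no {view} segment in {url!r}")
    return before_view, after_view


example = (
    "http://host/de/my-onkopedia/mammakarzinom-der-frau/"
    "archive/version-25.03.2014/@@view/xml/index.xml"
)
before_view, after_view = split_connector_url(example)
print(before_view)
print(after_view)
```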


zopyx.existdb

‣ Plone content-type (Dexterity)
‣ maps a subtree from eXist-db into Plone (similar to Reflecto)
‣ traversal support
‣ UI for managing collections (add, remove, rename)
‣ ACE editor integration
‣ pluggable view registry for eXist-db content (by-suffix)
‣ ZIP import/export
‣ support for XQuery scripts called through the RESTXQ layer of eXist-db
‣ persistent per-connector logging
‣ small and extensible
‣ Plone security & rights management apply on the connector level
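The idea of a view registry keyed by file suffix can be sketched in a few lines. This is not the zopyx.existdb API: `VIEWS`, `register_view` and `render` are hypothetical names, and the renderers return placeholder strings rather than real page fragments.

```python
from pathlib import PurePosixPath
from typing import Callable

# Hypothetical registry: file suffix -> function that renders the raw bytes.
VIEWS: dict[str, Callable[[bytes], str]] = {}


def register_view(suffix: str):
    """Decorator that registers a renderer for one file suffix."""
    def decorator(func):
        VIEWS[suffix.lower()] = func
        return func
    return decorator


@register_view(".xml")
def render_xml(data: bytes) -> str:
    return f"<pre>{len(data)} bytes of XML</pre>"


@register_view(".jpg")
def render_image(data: bytes) -> str:
    return f"<img alt='image, {len(data)} bytes'/>"


def render(path: str, data: bytes) -> str:
    """Pick a renderer by the path's suffix, else fall back to a download hint."""
    name = PurePosixPath(path)
    view = VIEWS.get(name.suffix.lower())
    return view(data) if view else f"download {name.name}"


print(render("xml/index.xml", b"<a/>"))
print(render("source/incoming.docx", b""))
```

New content types are supported by registering one more function, which keeps the connector itself small.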


Use cases and anti-patterns

‣ Use cases:
  ‣ Mapping existing collections of XML documents and associated resources into Plone
  ‣ Building supplementary (web) applications and functionality on top of XML collections
‣ Anti-patterns:
  ‣ not a general storage replacement for content-types
  ‣ not a transparent storage like AttributeStorage, SQLStorage (AT) etc.


Architecture

[Diagram: components and labelled data flows]

‣ Components: Plone CMS, eXist-db XML database, Word2XML, Produce & Publish (XML to PDF), Query Server, DGHO Member Database
‣ Stored in eXist-db: Guidelines (XML), Addendums (XML), Assets (Images, Styles), PDF, DOCX
‣ Actors: Onkopedia Editor (internal), Onkopedia Site Visitor
‣ Editor clients: Mac and Windows, XML editing and assets editing over WebDAV
‣ External Systems (clinical systems, medical applications, medical databases): HTTP REST API
‣ Flow labels: Authentication; Authorization; DOCX; XML, Assets; PDF, EPUB; HTML, XML + CSS; XQuery; XML, HTML, JSON; WebDAV


Hidden gem: pyfilesystem


pyfilesystem

‣ unified Python API for accessing different filesystems
  ‣ local
  ‣ WebDAV
  ‣ Dropbox
  ‣ SFTP/SSH
  ‣ S3
  ‣ (Plone)
‣ Write portable code independent of the underlying FS
‣ the filesystem is just a configuration option


pyfilesystem

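from fs.contrib.davfs import DAVFS

# connect to the WebDAV endpoint of eXist-db
handle = DAVFS("http://host/existdb/webdavdb")

# list the entries of the root collection
files = handle.listdir()

# write a small text file
with handle.open("foo.txt", "w") as fp:
    fp.write("hello world")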
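The point that "the filesystem is just a configuration option" can be shown without any server. The sketch below is a standard-library stand-in and does not use the pyfilesystem API: `LocalFS`, `MemoryFS`, `make_fs` and `export_greeting` are hypothetical names. The calling code works against either backend unchanged.

```python
import os
import tempfile


class LocalFS:
    """Files below one directory on the local disk."""

    def __init__(self, root: str):
        self.root = root

    def listdir(self) -> list[str]:
        return sorted(os.listdir(self.root))

    def write_text(self, name: str, text: str) -> None:
        with open(os.path.join(self.root, name), "w", encoding="utf-8") as fp:
            fp.write(text)


class MemoryFS:
    """Files kept in a dictionary, handy for tests."""

    def __init__(self):
        self.files: dict[str, str] = {}

    def listdir(self) -> list[str]:
        return sorted(self.files)

    def write_text(self, name: str, text: str) -> None:
        self.files[name] = text


def make_fs(kind: str):
    """Choose the backend from a configuration value."""
    if kind == "memory":
        return MemoryFS()
    if kind == "local":
        return LocalFS(tempfile.mkdtemp())
    raise ValueError(f"unknown filesystem kind: {kind}")


def export_greeting(fs) -> list[str]:
    """Backend-independent code: it only uses write_text() and listdir()."""
    fs.write_text("foo.txt", "hello world")
    return fs.listdir()


for kind in ("memory", "local"):
    print(kind, export_greeting(make_fs(kind)))
```

pyfilesystem applies the same idea to real backends such as WebDAV, so switching from a local directory to eXist-db only changes how the handle is created.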


Conclusion

‣ much better production-safety through XML by applying validations, schema/DTD checks etc.
‣ replaced tons of fragile, Plone-specific code
‣ well-defined DOCX ➝ XML conversion workflow
‣ much smaller code base
‣ easy to build Plone-XML apps on top of zopyx.existdb


Questions?