funnelweb ploneconf2010

[email protected] Plone Conf 2010 Dylan Jay FunnelWeb Easy Content Conversions Dylan Jay PretaWeb

Upload: dylan-jay

Post on 27-Jun-2015

DESCRIPTION

A Plone Conference 2010 talk about FunnelWeb, a framework for easy content conversion that aims to make importing an existing site into Plone straightforward.

TRANSCRIPT

Page 1: Funnelweb ploneconf2010

FunnelWeb

Easy Content Conversions

Dylan Jay, PretaWeb

Page 2: Funnelweb ploneconf2010

Content Conversions suck

• Large existing sites
• Static HTML or old CMS
• Hard to quote on
• Content audit
• Use Plone to fix content
• Convert Docs to Pages (coming...)

Page 3: Funnelweb ploneconf2010

History

• 2008 – Obrien Intranet
• 2009 – pretaweb.funnelweb (deprecated): Plone UI > Actions > Import
• 2010 – transmogrify.* release on PyPI
• 2010 – collective.developermanual: Sphinx to Plone
• 2010 – funnelweb Recipe + Script
• Thanks – Dylan Jay, Vitaliy Podoba, Rok Garbas, Mikko Ohtamaa, Tim Knap

Page 4: Funnelweb ploneconf2010

Demo

Page 5: Funnelweb ploneconf2010

funnelweb.recipe

Add to buildout

[funnelweb]
recipe = funnelweb
crawler-url=http://www.whitehouse.gov

Page 6: Funnelweb ploneconf2010

bin/funnelweb

• Crawls
• Caches locally
• Filters
• Removes template
• Restructures
• Determines title, hidden etc.
• Uploads to Plone
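The stages listed above can be pictured as a chain of Python generators, each consuming the items produced by the one before it. This is only an illustrative sketch: the function names, the item keys (`_path`, `text`) and the in-memory `SITE` dict are assumptions made for the example, not FunnelWeb's actual code.

```python
# Illustrative only: each stage is a generator wrapping the previous one.
SITE = {
    "index.html": "<html><body><div id='nav'>menu</div><h1>Home</h1></body></html>",
    "about.html": "<html><body><div id='nav'>menu</div><h1>About</h1></body></html>",
    "logo.tmp": "scratch file",
}


def crawl(site):
    """Yield one item (a dict) per page found."""
    for path, html in sorted(site.items()):
        yield {"_path": path, "text": html}


def filter_items(items, ignore_suffix=".tmp"):
    """Drop items that should not be imported."""
    for item in items:
        if not item["_path"].endswith(ignore_suffix):
            yield item


def remove_template(items, boilerplate="<div id='nav'>menu</div>"):
    """Strip markup that is shared by every page."""
    for item in items:
        item["text"] = item["text"].replace(boilerplate, "")
        yield item


def upload(items, target):
    """Stand-in for the upload step: store each item in a dict."""
    for item in items:
        target[item["_path"]] = item["text"]
        yield item


def run_pipeline(site):
    target = {}
    stages = upload(remove_template(filter_items(crawl(site))), target)
    for _ in stages:  # pulling from the last stage drives the whole chain
        pass
    return target


result = run_pipeline(SITE)
```

Because every stage is lazy, items flow through one at a time, so a large site does not have to be held in memory all at once.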

Page 7: Funnelweb ploneconf2010

Common Options

• crawler:site_url
• crawler:ignore
• ploneupload:target
• template1:description
• template1:text
• *-disable

Page 8: Funnelweb ploneconf2010

Command Line

bin/funnelweb --crawler:max=50 --localupload:output=var/funnelwebdebug
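One plausible way to read such `--section:option=value` arguments is as per-section overrides of the pipeline configuration. The sketch below shows that mapping in plain Python. `parse_overrides` is a hypothetical helper written for this example and is not funnelweb's own argument parser.

```python
def parse_overrides(argv):
    """Turn ['--crawler:max=50'] into {'crawler': {'max': '50'}}."""
    overrides = {}
    for arg in argv:
        if not arg.startswith("--") or "=" not in arg:
            continue  # not a section:option=value argument
        key, value = arg[2:].split("=", 1)
        section, _, option = key.partition(":")
        if not option:
            continue  # e.g. --pipeline=pipeline.cfg has no section part
        overrides.setdefault(section, {})[option] = value
    return overrides


args = ["--crawler:max=50", "--localupload:output=var/funnelwebdebug"]
overrides = parse_overrides(args)
```

Values stay as strings here; converting `max` to an integer would be up to whichever section consumes it.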

Page 9: Funnelweb ploneconf2010

Viewing the Pipeline

bin/funnelweb --pipeline

Page 10: Funnelweb ploneconf2010

Custom pipeline

bin/funnelweb --pipeline > pipeline.cfg
{edit} pipeline.cfg
bin/funnelweb --pipeline=pipeline.cfg
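A transmogrifier pipeline file is INI-style: a `[transmogrifier]` section lists the section names in order, and each named section points at a blueprint. The sketch below reads such a file with the standard library to show the structure you would be editing. The section names and the exact contents of `PIPELINE_CFG` are assumptions for illustration, not the pipeline that funnelweb prints.

```python
import configparser

# Hypothetical pipeline definition; section names are invented for the example.
PIPELINE_CFG = """
[transmogrifier]
pipeline =
    crawler
    drop-drafts
    upload

[crawler]
blueprint = transmogrify.webcrawler

[drop-drafts]
blueprint = collective.transmogrifier.sections.condition
condition = python:'draft' not in item['_path']

[upload]
blueprint = transmogrify.ploneremote.remoteconstructor
"""


def load_pipeline(text):
    """Return [(section name, blueprint name), ...] in pipeline order."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    names = parser["transmogrifier"]["pipeline"].split()
    return [(name, parser[name]["blueprint"]) for name in names]


steps = load_pipeline(PIPELINE_CFG)
```

Reordering, removing or adding names in the `pipeline` list is what changes the order of processing.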

Page 11: Funnelweb ploneconf2010

Making your own blueprint

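from zope.interface import classProvides, implements
from collective.transmogrifier.interfaces import ISection, ISectionBlueprint

class MyBlueprint(object):
    # The class is the blueprint; its instances are pipeline sections.
    classProvides(ISectionBlueprint)
    implements(ISection)

    def __init__(self, transmogrifier, name, options, previous):
        self.previous = previous  # iterator over items from the previous section

    def __iter__(self):
        for item in self.previous:
            dosomethingto(item)  # placeholder: put your own transformation here
            yield item

<utility component=".myblueprint.MyBlueprint"
         name="transmogrify.myblueprint" />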
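The logic of a section can be tried outside Plone by handing it a plain list as `previous`. The sketch below uses the same constructor and `__iter__` shape as the blueprint above but leaves out the zope.interface declarations, so it cannot be registered as it stands. `UppercaseTitle` and its `key` option are invented for the example.

```python
class UppercaseTitle:
    """Hypothetical section with the same constructor shape as a blueprint."""

    def __init__(self, transmogrifier, name, options, previous):
        self.previous = previous
        self.key = options.get("key", "title")  # 'key' is an invented option

    def __iter__(self):
        for item in self.previous:
            if self.key in item:
                item[self.key] = item[self.key].upper()
            yield item  # always pass the item on, changed or not


source = [{"_path": "/about", "title": "About us"}, {"_path": "/logo.png"}]
section = UppercaseTitle(None, "upper", {"key": "title"}, iter(source))
items = list(section)
```

Yielding every item, including the ones the section does not touch, is what keeps the rest of the pipeline fed.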

Page 12: Funnelweb ploneconf2010

transmogrify.webcrawler

• transmogrify.webcrawler: crawls site or cache for content
• transmogrify.webcrawler.typerecognitor: sets Plone content type based on MIME type
• transmogrify.webcrawler.cache: saves content to disk
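Choosing a content type from a MIME type can be sketched in a few lines. The mapping below (HTML to `Document`, images to `Image`, everything else to `File`) is an assumption for illustration and is not taken from the typerecognitor source. `guess_portal_type` is a hypothetical helper.

```python
import mimetypes


def guess_portal_type(path):
    """Pick a Plone content type name from the file's guessed MIME type."""
    mime, _encoding = mimetypes.guess_type(path)
    if mime is None:
        return "File"  # unknown types are kept as plain files
    if mime in ("text/html", "application/xhtml+xml"):
        return "Document"
    if mime.startswith("image/"):
        return "Image"
    return "File"


types = {p: guess_portal_type(p) for p in ("index.html", "logo.png", "report.pdf")}
```

A crawler would normally have the server's `Content-Type` header available as well; guessing from the file name is only the simplest stand-in.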

Page 13: Funnelweb ploneconf2010

transmogrify.htmlcontentextractor

• transmogrify.htmlcontentextractor: provide XPaths for title, description, text etc.
• transmogrify.htmlcontentextractor.auto: guesses XPaths from content
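The idea of mapping fields to XPaths can be shown with the standard library, which supports a limited XPath subset and needs well-formed markup. The sample page, the `RULES` dict and the `extract` helper are all assumptions for this sketch. It keeps only the text of each match, which may differ from what the real blueprint stores.

```python
import xml.etree.ElementTree as ET

PAGE = """<html><body>
<div id="header"><h1>About us</h1></div>
<div class="summary">Who we are.</div>
<div id="content"><p>First paragraph.</p><p>Second paragraph.</p></div>
</body></html>"""

# One XPath per field, in the spirit of the template1:* options.
RULES = {
    "title": ".//div[@id='header']/h1",
    "description": ".//div[@class='summary']",
    "text": ".//div[@id='content']",
}


def extract(page, rules):
    """Return a dict holding the text found at each field's XPath."""
    root = ET.fromstring(page)
    item = {}
    for field, xpath in rules.items():
        node = root.find(xpath)
        if node is not None:
            item[field] = "".join(node.itertext()).strip()
    return item


item = extract(PAGE, RULES)
```

Real-world HTML is rarely well formed, so a tolerant parser would be needed in practice.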

Page 14: Funnelweb ploneconf2010

transmogrify.siteanalyser

• transmogrify.siteanalyser.relinker: moves, renames, URL tidying
• transmogrify.siteanalyser.title: guesses page titles
• transmogrify.siteanalyser.defaultpage: moves index pages into folders
• transmogrify.siteanalyser.attach: moves attachments closer to pages
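As a rough illustration of the default page step, the sketch below finds index pages and records them as the default view of their parent folder. The list of index names and the `find_default_pages` helper are assumptions for the example, not the siteanalyser implementation.

```python
import posixpath

INDEX_NAMES = ("index.html", "index.htm", "default.htm")  # assumed list


def find_default_pages(items):
    """Map each folder path to the index page that should be its default view."""
    defaults = {}
    for item in items:
        folder, name = posixpath.split(item["_path"])
        if name.lower() in INDEX_NAMES and folder not in defaults:
            defaults[folder] = name
    return defaults


items = [
    {"_path": "news/index.html"},
    {"_path": "news/2010.html"},
    {"_path": "about/team.html"},
]
defaults = find_default_pages(items)
```

Folders without an index page, such as `about` here, simply get no default page.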

Page 15: Funnelweb ploneconf2010

transmogrify.ploneremote

• Remoteconstructor: adds content to Plone via XML-RPC
• Remoteschemaupdater: updates the content of an existing object
• Remotenavigationexcluder: hides content not in the original site's navigation
• Remoteworkflowupdater: publishes content
• Remoteredirector: creates aliases for items that have moved
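Creating content over XML-RPC amounts to calling a method such as `invokeFactory` on the target folder. The sketch below builds the request payload with the standard library instead of contacting a server, so it runs anywhere. That the remote constructor issues exactly this call is an assumption, and the URL and credentials in the comment are placeholders.

```python
import xmlrpc.client


def build_create_request(portal_type, object_id):
    """Return the XML-RPC payload for an invokeFactory(type, id) call."""
    return xmlrpc.client.dumps((portal_type, object_id), methodname="invokeFactory")


payload = build_create_request("Document", "about-us")
params, method = xmlrpc.client.loads(payload)

# Against a live site the call would look roughly like this (not run here):
#   folder = xmlrpc.client.ServerProxy("http://admin:admin@localhost:8080/Plone")
#   folder.invokeFactory("Document", "about-us")
```

Working over XML-RPC means the import can run from any machine that can reach the site, without installing code into the Plone instance.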

Page 16: Funnelweb ploneconf2010

Other blueprints

• transmogrify.pathsorter: puts folders before content, and content in the right order
• collective.transmogrifier.sections.condition: useful to drop certain content
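Both ideas are easy to sketch: sort items so that a folder always precedes what it contains, then drop items that fail a condition. The helpers below are hypothetical, and the real pathsorter may order sibling items differently (for example by their original position in the site).

```python
def sort_parents_first(items):
    """Order items so every folder comes before the content inside it."""
    return sorted(items, key=lambda item: item["_path"].strip("/").split("/"))


def keep(item):
    """Hypothetical condition: drop anything under a 'private' folder."""
    return "private" not in item["_path"].split("/")


items = [
    {"_path": "/news/2010/conf"},
    {"_path": "/private/notes"},
    {"_path": "/news"},
    {"_path": "/news/2010"},
]
ordered = [i["_path"] for i in sort_parents_first(items) if keep(i)]
```

Sorting on the list of path segments rather than the raw string keeps a folder ahead of its children, since a shorter list that is a prefix of a longer one sorts first.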

Page 17: Funnelweb ploneconf2010

Where to get it

• http://github.com:djay/funnelweb.git
• http://github.com:djay/transmogrify.*
• PyPI release TBA

Page 18: Funnelweb ploneconf2010

#TODO

• Extract content styles into visual editor

Page 19: Funnelweb ploneconf2010

Thanks

[email protected]

• IRC: djjay

• Twitter: djay75