Funnelweb ploneconf2010

Post on 27-Jun-2015

description

Plone Conf 2010 talk about funnelweb, a framework for easy content conversion. It aims to make importing any existing site into Plone easy.

transcript

dylan@pretaweb.com | Plone Conf 2010 | Dylan Jay

FunnelWeb

Easy Content Conversions

Dylan Jay, PretaWeb

Content Conversions suck

• Large existing sites

• Static HTML or old CMS

• Hard to quote on

• Content audit

• Use Plone to fix content

• Convert Docs to Pages (coming...)

History

• 2008 - Obrien Intranet

• 2009 - pretaweb.funnelweb (deprecated): Plone UI > Actions > Import

• 2010 - transmogrify.* release on PyPI

• 2010 - collective.developermanual: Sphinx to Plone

• 2010 - funnelweb Recipe + Script

• Thanks - Dylan Jay, Vitaliy Podoba, Rok Garbas, Mikko Ohtamaa, Tim Knap

Demo

funnelweb.recipe

Add to buildout

[funnelweb]

recipe = funnelweb

crawler-url=http://www.whitehouse.gov

bin/funnelweb

• Crawls

• Caches locally

• Filters

• Removes template

• Restructures

• Determines title, hidden etc.

• Uploads to Plone
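
The steps above run as a chain in which each stage consumes the items produced by the stage before it. A minimal Python sketch of that idea, using plain generators and an in-memory dict in place of a real crawl. The stage functions, `FAKE_SITE`, and the `text`/`title` keys are illustrative assumptions, not funnelweb's actual code:

```python
# Illustrative only: a toy pipeline of chained generators.
FAKE_SITE = {
    "index.html": "<h1>Home</h1>",
    "about.html": "<h1>About</h1>",
    "logo.png": "binary",
}

def crawl(site):
    """Yield one item (a dict) per resource found on the site."""
    for path, body in sorted(site.items()):
        yield {"_path": path, "text": body}

def drop_images(items):
    """Filter stage: skip anything that looks like a PNG image."""
    for item in items:
        if not item["_path"].endswith(".png"):
            yield item

def guess_title(items):
    """Take the title from the first <h1>, if the page has one."""
    for item in items:
        text = item["text"]
        start = text.find("<h1>")
        end = text.find("</h1>")
        if start != -1 and end != -1:
            item["title"] = text[start + 4:end]
        yield item

def upload(items, target):
    """Stand-in for the upload step: store each item in a dict."""
    for item in items:
        target[item["_path"]] = item
        yield item

def run_pipeline(site):
    target = {}
    stages = upload(guess_title(drop_images(crawl(site))), target)
    for _ in stages:  # pulling from the last stage drives the whole chain
        pass
    return target
```

Nothing happens until the last stage is iterated, so items flow through one at a time rather than being held in memory all at once.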

Common Options

• crawler:site_url

• crawler:ignore

• ploneupload:target

• template1:description

• template1:text

• *-disable

Command Line

bin/funnelweb --crawler:max=50 --localupload:output=var/funnelwebdebug
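
Each `--section:option=value` argument overrides one option of one pipeline section. As a rough sketch of how such arguments could be grouped by section (the function `parse_overrides` is hypothetical and is not funnelweb's own parser):

```python
def parse_overrides(argv):
    """Group --section:option=value arguments by section name."""
    overrides = {}
    for arg in argv:
        if not arg.startswith("--"):
            continue
        key, _, value = arg[2:].partition("=")
        section, sep, option = key.partition(":")
        if not sep:
            continue  # e.g. --pipeline, which is not a section override
        overrides.setdefault(section, {})[option] = value
    return overrides


args = ["--crawler:max=50", "--localupload:output=var/funnelwebdebug"]
overrides = parse_overrides(args)
```

With the command line from the slide, `overrides` maps `crawler` to `{"max": "50"}` and `localupload` to `{"output": "var/funnelwebdebug"}`.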

Viewing the Pipeline

bin/funnelweb --pipeline

Custom pipeline

bin/funnelweb --pipeline > pipeline.cfg

{edit} pipeline.cfg

bin/funnelweb --pipeline=pipeline.cfg
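
A transmogrifier pipeline file is an INI-style config: a `[transmogrifier]` section lists the section names in order, and each named section points at a blueprint. The sketch below reads such a file with Python's standard `configparser`. The section names and the exact contents of `PIPELINE_CFG` are made up for illustration (including the full dotted name of the upload blueprint), and the real output of `bin/funnelweb --pipeline` will be longer:

```python
import configparser

# Hypothetical pipeline.cfg content; section names are invented.
PIPELINE_CFG = """
[transmogrifier]
pipeline =
    crawler
    titleguess
    upload

[crawler]
blueprint = transmogrify.webcrawler

[titleguess]
blueprint = transmogrify.siteanalyser.title

[upload]
blueprint = transmogrify.ploneremote.remoteconstructor
"""

def list_sections(cfg_text):
    """Return (section name, blueprint name) pairs in pipeline order."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    names = parser.get("transmogrifier", "pipeline").split()
    return [(name, parser.get(name, "blueprint")) for name in names]


sections = list_sections(PIPELINE_CFG)
```

Reordering, removing, or adding names under `pipeline =` is how a custom pipeline changes what gets run.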

Making your own blueprint

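from zope.interface import classProvides, implements
from collective.transmogrifier.interfaces import ISection, ISectionBlueprint


class MyBlueprint(object):
    # The class is the blueprint; its instances are pipeline sections.
    classProvides(ISectionBlueprint)
    implements(ISection)

    def __init__(self, transmogrifier, name, options, previous):
        # previous iterates over the items from the preceding section
        self.previous = previous

    def __iter__(self):
        for item in self.previous:
            dosomethingto(item)  # placeholder for your own transformation
            yield item

Register it in ZCML:

<utility component=".myblueprint.MyBlueprint"
         name="transmogrify.myblueprint" />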
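
To try the same pattern without Zope or a Plone site, the section can be driven by hand with a plain list as `previous`. This is a sketch under that assumption: `TitleCaseSection` and its `key` option are hypothetical, and the interface declarations and ZCML registration are left out because nothing here looks the blueprint up by name.

```python
class TitleCaseSection(object):
    """Toy section: title-cases one field of each item, if present."""

    def __init__(self, transmogrifier, name, options, previous):
        self.previous = previous
        # Which item key to change; "key" is a made-up option name.
        self.key = options.get("key", "title")

    def __iter__(self):
        for item in self.previous:
            if self.key in item:
                item[self.key] = item[self.key].title()
            yield item  # always pass the item on, changed or not


items = [{"_path": "a", "title": "hello world"}, {"_path": "b"}]
section = TitleCaseSection(None, "titlecase", {"key": "title"}, iter(items))
result = list(section)
```

Items without the field pass through untouched, which is the usual contract for a section: change what you recognise and yield everything else as-is.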

transmogrify.webcrawler

• transmogrify.webcrawler: Crawls site or cache for content

• transmogrify.webcrawler.typerecognitor: Sets Plone content type based on mime-type

• transmogrify.webcrawler.cache: Saves content to disk
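
As a rough illustration of type recognition, the sketch below assigns a Plone content type from an item's mime-type. The `_mimetype` and `_type` keys and the mapping table are assumptions made for this example; the real blueprint's rules may differ.

```python
# Hypothetical mapping; the real blueprint's table may differ.
MIME_TO_TYPE = {
    "text/html": "Document",
    "application/pdf": "File",
}

def recognise(items):
    """Set item['_type'] from item['_mimetype'], defaulting to File."""
    for item in items:
        mime = item.get("_mimetype", "")
        if mime.startswith("image/"):
            item["_type"] = "Image"
        else:
            item["_type"] = MIME_TO_TYPE.get(mime, "File")
        yield item


crawled = [
    {"_path": "index.html", "_mimetype": "text/html"},
    {"_path": "logo.png", "_mimetype": "image/png"},
    {"_path": "report.pdf", "_mimetype": "application/pdf"},
]
typed = list(recognise(crawled))
```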

transmogrify.htmlcontentextractor

• transmogrify.htmlcontentextractor: Provide XPath for title, description, text etc.

• transmogrify.htmlcontentextractor.auto: Guesses XPaths from content
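
The idea of pulling fields out of a page with one XPath per field can be sketched with the standard library alone. This is not the blueprint's implementation: `PAGE`, `RULES`, and `extract` are invented for the example, and `xml.etree.ElementTree` supports only a subset of XPath and needs well-formed markup, so real-world HTML would call for a more forgiving parser.

```python
import xml.etree.ElementTree as ET

PAGE = """<html><body>
<div id="header">Site banner</div>
<h1>About us</h1>
<div id="content"><p>We convert sites.</p></div>
</body></html>"""

# Field name -> XPath locating that field in the page.
RULES = {
    "title": ".//h1",
    "text": ".//div[@id='content']",
}

def extract(page, rules):
    """Return a dict holding the text found at each rule's XPath."""
    root = ET.fromstring(page)
    item = {}
    for field, xpath in rules.items():
        node = root.find(xpath)
        if node is not None:
            item[field] = "".join(node.itertext()).strip()
    return item


item = extract(PAGE, RULES)
```

Everything outside the matched nodes, such as the banner `div`, is left behind, which is how the site template gets removed.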

transmogrify.siteanalyser

• transmogrify.siteanalyser.relinker: Moves, renames, URL tidying

• transmogrify.siteanalyser.title: Guesses page titles

• transmogrify.siteanalyser.defaultpage: Moves index pages into folders

• transmogrify.siteanalyser.attach: Moves attachments closer to pages
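
One way to picture the default-page step: when a crawled path ends in an index file name, emit a folder item that names that page as its default view. The sketch below assumes the `INDEX_NAMES` list and the `_defaultpage` key for illustration and does not reproduce the blueprint's real logic.

```python
import posixpath

INDEX_NAMES = ("index.html", "index.htm", "default.htm")  # assumed list

def add_default_pages(items):
    """Emit a Folder item ahead of each index page found in a folder."""
    for item in items:
        folder, name = posixpath.split(item["_path"])
        if folder and name.lower() in INDEX_NAMES:
            yield {"_path": folder, "_type": "Folder", "_defaultpage": name}
        yield item


pages = [{"_path": "docs/index.html"}, {"_path": "docs/faq.html"}]
out = list(add_default_pages(pages))
```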

transmogrify.ploneremote

• Remoteconstructor: Adds content to Plone via XML-RPC

• Remoteschemaupdater: Updates content of existing object

• Remotenavigationexcluder: Hides content not in the original site's navigation

• Remoteworkflowupdater: Publishes content

• Remoteredirector: Creates aliases for items that have moved

Other blueprints

• transmogrify.pathsorter: Puts folders before content and content in right order

• collective.transmogrifier.sections.condition: Useful to drop certain content
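
Folders have to exist before anything can be uploaded into them, which is why path ordering matters. A minimal sketch of the folder-first part only, assuming each item carries its location in `_path` (`sort_by_path` is a hypothetical helper, not the blueprint itself):

```python
def sort_by_path(items):
    """Return items ordered so every folder precedes its contents."""
    # Comparing lists of path segments puts "docs" before "docs/faq.html".
    return sorted(items, key=lambda item: item["_path"].strip("/").split("/"))


unsorted_items = [
    {"_path": "docs/faq.html"},
    {"_path": "about.html"},
    {"_path": "docs"},
    {"_path": "docs/img/logo.png"},
    {"_path": "docs/img"},
]
ordered = [item["_path"] for item in sort_by_path(unsorted_items)]
```

Sorting needs the full list in memory, so unlike most sections this step cannot pass items along one at a time.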

Where to get it

• http://github.com:djay/funnelweb.git

• http://github.com:djay/transmogrify.*

• PyPI release TBA

#TODO

• Extract content styles into visual editor

Thanks

• djay@pretaweb.com

• IRC: djjay

• Twitter: djay75