Post on 27-Jun-2015
dylan@pretaweb.com Plone Conf 2010 Dylan Jay
FunnelWeb
Easy Content Conversions
Dylan Jay, PretaWeb
Content Conversions suck
• Large existing sites
• Static HTML or old CMS
• Hard to quote on
• Content audit
• Use Plone to fix content
• Convert Docs to Pages (coming...)
History
2008 - Obrien Intranet
2009 - pretaweb.funnelweb (deprecated)
    Plone UI > Actions > Import
2010 - transmogrify.* release on PyPI
2010 - collective.developermanual
    sphinx to Plone
2010 - funnelweb Recipe + Script
Thanks - Dylan Jay, Vitaliy Podoba, Rok Garbas, Mikko Ohtamaa, Tim Knap
Demo
funnelweb.recipe
Add to buildout
[funnelweb]
recipe = funnelweb
crawler-url=http://www.whitehouse.gov
bin/funnelweb
• Crawls
• Caches locally
• Filters
• Removes template
• Restructures
• Determines title, hidden etc
• Uploads to Plone
Common Options
• crawler:site_url
• crawler:ignore
• ploneupload:target
• template1:description
• template1:text
• *-disable
Command Line
bin/funnelweb --crawler:max=50 --localupload:output=var/funnelwebdebug
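The command-line overrides above use a --section:key=value pattern. As a rough illustration of how such arguments could be grouped by pipeline section, here is a minimal Python sketch. The function name parse_overrides is hypothetical, and this is not funnelweb's actual argument parser.

```python
def parse_overrides(argv):
    """Group --section:key=value arguments into {section: {key: value}}."""
    overrides = {}
    for arg in argv:
        if not arg.startswith("--") or "=" not in arg:
            continue  # not a --name=value argument
        name, value = arg[2:].split("=", 1)
        if ":" not in name:
            continue  # no section part, e.g. --pipeline=pipeline.cfg
        section, key = name.split(":", 1)
        overrides.setdefault(section, {})[key] = value
    return overrides


# The two overrides shown on the slide.
args = ["--crawler:max=50", "--localupload:output=var/funnelwebdebug"]
result = parse_overrides(args)
print(result)
```

Values stay as strings in this sketch; any conversion (for example max to an integer) would be up to the section that receives the option.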
Viewing the Pipeline
bin/funnelweb --pipeline
Custom pipeline
bin/funnelweb --pipeline > pipeline.cfg
{edit} pipeline.cfg
bin/funnelweb --pipeline=pipeline.cfg
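A transmogrifier pipeline file is an INI-style config: a [transmogrifier] section lists the section names in order, and each named section points at a blueprint. The sketch below reads such a file with Python's configparser to show that structure. The section names (crawler, typerecognitor, cache) and the inlined file contents are illustrative assumptions, not the pipeline funnelweb actually ships.

```python
import configparser

# Hypothetical pipeline.cfg contents; section names are illustrative only.
PIPELINE_CFG = """
[transmogrifier]
pipeline =
    crawler
    typerecognitor
    cache

[crawler]
blueprint = transmogrify.webcrawler

[typerecognitor]
blueprint = transmogrify.webcrawler.typerecognitor

[cache]
blueprint = transmogrify.webcrawler.cache
"""

parser = configparser.ConfigParser()
parser.read_string(PIPELINE_CFG)

# The pipeline option lists section names in the order items flow through them.
section_names = parser.get("transmogrifier", "pipeline").split()
blueprints = [(name, parser.get(name, "blueprint")) for name in section_names]

for name, blueprint in blueprints:
    print(name, "->", blueprint)
```

Editing a custom pipeline then comes down to reordering the names under pipeline, or adding a new section with its own blueprint line.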
Making your own blueprint
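    from zope.interface import classProvides, implements
    from collective.transmogrifier.interfaces import ISection, ISectionBlueprint

    class MyBlueprint(object):
        classProvides(ISectionBlueprint)
        implements(ISection)

        def __init__(self, transmogrifier, name, options, previous):
            # previous is the iterator over items from the preceding section
            self.previous = previous

        def __iter__(self):
            for item in self.previous:
                dosomethingto(item)  # placeholder for your own processing
                yield item

    <utility component=".myblueprint.MyBlueprint"
             name="transmogrify.myblueprint" />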
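To see why a blueprint only needs previous and __iter__, it can help to chain a couple of sections by hand. The following is a plain-Python sketch with no Zope or transmogrifier imports. The class names, the run_pipeline helper and the "_path" and "title" keys are assumptions made for illustration.

```python
class Source(object):
    """First section: passes on anything upstream, then emits its own items."""

    def __init__(self, previous):
        self.previous = previous

    def __iter__(self):
        for item in self.previous:
            yield item
        for path in ("/index.html", "/about.html"):
            yield {"_path": path}


class GuessTitle(object):
    """Middle section: modifies each item and passes it on."""

    def __init__(self, previous):
        self.previous = previous

    def __iter__(self):
        for item in self.previous:
            name = item["_path"].strip("/").rsplit(".", 1)[0]
            item["title"] = name.title()
            yield item


def run_pipeline(section_classes):
    """Chain sections so each one iterates over the one before it."""
    pipeline = iter(())  # empty upstream for the first section
    for section_class in section_classes:
        pipeline = section_class(pipeline)
    return list(pipeline)


items = run_pipeline([Source, GuessTitle])
print(items)
```

Because every section is a generator over the previous one, items flow through the chain one at a time, and a section can drop an item simply by not yielding it.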
transmogrify.webcrawler
• transmogrify.webcrawler - Crawls site or cache for content
• transmogrify.webcrawler.typerecognitor - Sets Plone content type based on mime-type
• transmogrify.webcrawler.cache - Saves content to disk
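The idea of choosing a Plone content type from a mime-type can be sketched with the standard mimetypes module. The mapping below (HTML to Document, images to Image, everything else to File) and the function name guess_portal_type are assumptions for illustration, not the rules the typerecognitor blueprint actually applies.

```python
import mimetypes


def guess_portal_type(path):
    """Pick a Plone content type name from the mime-type guessed for a path."""
    mime, _encoding = mimetypes.guess_type(path)
    if mime in ("text/html", "application/xhtml+xml"):
        return "Document"
    if mime is not None and mime.startswith("image/"):
        return "Image"
    return "File"  # fallback for PDFs, office documents, unknown types


for path in ("about.html", "logo.png", "report.pdf"):
    print(path, "->", guess_portal_type(path))
```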
transmogrify.htmlcontentextractor
• transmogrify.htmlcontentextractor - Provide XPath for title, description, text etc.
• transmogrify.htmlcontentextractor.auto - Guesses XPaths from content
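As a rough picture of XPath-based extraction, the sketch below applies one XPath per field to a small, well-formed page using the standard library's ElementTree (which supports only a subset of XPath). The sample page, the RULES mapping and the extract function are hypothetical. The real blueprint works on crawled HTML and may behave differently, for instance by keeping markup rather than plain text.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page standing in for crawled content.
PAGE = """<html>
  <head><title>About us</title></head>
  <body>
    <div id="header">Site navigation</div>
    <div id="content"><p class="lead">Who we are.</p><p>More detail.</p></div>
  </body>
</html>"""

# One XPath per field, in the spirit of the template options.
RULES = {
    "title": ".//head/title",
    "description": ".//p[@class='lead']",
    "text": ".//div[@id='content']",
}


def extract(html, rules):
    """Return {field: text} for every rule whose XPath matches."""
    root = ET.fromstring(html)
    item = {}
    for field, xpath in rules.items():
        node = root.find(xpath)
        if node is not None:
            item[field] = "".join(node.itertext()).strip()
    return item


item = extract(PAGE, RULES)
print(item["title"])
print(item["description"])
```

Anything outside the matched nodes, such as the header div here, is left behind, which is how the site template gets removed.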
transmogrify.siteanalyser
• transmogrify.siteanalyser.relinker - Moves, renames, URL tidying
• transmogrify.siteanalyser.title - Guess page titles
• transmogrify.siteanalyser.defaultpage - Move index pages into folders
• transmogrify.siteanalyser.attach - Move attachments closer to pages
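The default-page step can be pictured as spotting index pages and associating each with its parent folder. The following is a minimal sketch of that idea only. The list of index file names and the function name find_default_pages are assumptions, and the real blueprint may use other heuristics.

```python
import posixpath

# Assumed set of file names treated as index pages.
INDEX_NAMES = ("index.html", "index.htm", "default.htm")


def find_default_pages(paths):
    """Map each folder path to the index page that could become its default view."""
    defaults = {}
    for path in paths:
        folder, name = posixpath.split(path)
        if name.lower() in INDEX_NAMES:
            defaults.setdefault(folder, path)  # first index page found wins
    return defaults


paths = ["about/index.html", "about/team.html", "news/index.htm", "contact.html"]
defaults = find_default_pages(paths)
print(defaults)
```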
transmogrify.ploneremote
• Remoteconstructor - Adds content to Plone via XML-RPC
• Remoteschemaupdater - Updates content of existing object
• Remotenavigationexcluder - Hides content not in original site's navigation
• Remoteworkflowupdater - Publish content
• Remoteredirector - Creates aliases for items that have moved
Other blueprints
• transmogrify.pathsorter - Puts folders before content and content in right order
• collective.transmogrifier.sections.condition - Useful to drop certain content
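Ordering matters because a folder has to exist before content can be uploaded into it. A minimal sketch of the "folders before content" part: sorting by path components puts every parent ahead of its children. The "_path" key and the sort_paths helper are assumptions for illustration, and this does not reproduce whatever ordering rules the pathsorter blueprint applies within a folder.

```python
def sort_paths(items):
    """Sort items so that a parent path always precedes its children."""
    return sorted(items, key=lambda item: item["_path"].strip("/").split("/"))


items = [
    {"_path": "about/team/alice"},
    {"_path": "news"},
    {"_path": "about"},
    {"_path": "about/team"},
]
ordered = [item["_path"] for item in sort_paths(items)]
print(ordered)
```

Comparing lists of components rather than raw strings avoids surprises with names that share a prefix, since a parent's component list is always a strict prefix of its child's.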
Where to get it
http://github.com:djay/funnelweb.git
http://github.com:djay/transmogrify.*
PyPI release TBA
#TODO
• Extract content styles into visual editor
Thanks
• djay@pretaweb.com
• IRC: djjay
• Twitter: djay75