[email protected] Conf 2010 Dylan Jay
Content Conversions suck
Large existing sites
Static html or old CMS
Hard to quote on
Content audit
Use Plone to fix content
Convert Docs to Pages (coming...)
History
2008 - Obrien Intranet
2009 - pretaweb.funnelweb (deprecated)
    Plone UI > Actions > Import
2010 - transmogrify.* release on pypi
2010 - collective.developermanual
    sphinx to plone
2010 - funnelweb Recipe + Script
Thanks - Dylan Jay, Vitaliy Podoba, Rok Garbas, Mikko Ohtamaa, Tim Knap
funnelweb.recipe
Add to buildout
[funnelweb]
recipe = funnelweb
crawler-url=http://www.whitehouse.gov
bin/funnelweb
Crawls
Caches locally
Filters
Removes template
Restructures
Determines title, hidden etc.
Uploads to Plone
Common Options
crawler:site_url
crawler:ignore
ploneupload:target
template1:description
template1:text
*-disable
Command Line
bin/funnelweb --crawler:max=50 --localupload:output=var/funnelwebdebug
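The overrides above follow a `--section:option=value` shape. As a rough illustration of that shape only, here is a minimal sketch that splits such arguments into per-section settings. The function name `parse_overrides` is hypothetical and this is not how bin/funnelweb itself parses its arguments.

```python
def parse_overrides(argv):
    """Group '--section:option=value' arguments by section (illustrative only)."""
    overrides = {}
    for arg in argv:
        # Skip anything that is not of the form --key=value.
        if not arg.startswith("--") or "=" not in arg:
            continue
        key, value = arg[2:].split("=", 1)
        section, _, option = key.partition(":")
        # Arguments without a 'section:option' key (e.g. --pipeline=...) are ignored here.
        if option:
            overrides.setdefault(section, {})[option] = value
    return overrides


overrides = parse_overrides([
    "--crawler:max=50",
    "--localupload:output=var/funnelwebdebug",
])
```

All values stay strings in this sketch, much as they would in a buildout or pipeline configuration file.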
Custom pipeline
bin/funnelweb --pipeline > pipeline.cfg
{edit} pipeline.cfg
bin/funnelweb --pipeline=pipeline.cfg
Making your own blueprint
from zope.interface import classProvides, implements
from collective.transmogrifier.interfaces import ISection, ISectionBlueprint

class MyBlueprint(object):
    classProvides(ISectionBlueprint)
    implements(ISection)

    def __init__(self, transmogrifier, name, options, previous):
        # previous is the iterator of items from the preceding section
        self.previous = previous

    def __iter__(self):
        for item in self.previous:
            dosomethingto(item)
            yield item

<utility component=".myblueprint.MyBlueprint"
         name="transmogrify.myblueprint" />
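To see how `previous` chains sections together, here is a minimal plain-Python sketch that runs without Zope or collective.transmogrifier. The `Source` and `UpperTitle` classes and the `build_pipeline` helper are hypothetical stand-ins; the real transmogrifier builds the chain from the pipeline configuration.

```python
class Source(object):
    """Hypothetical first section: emits items and ignores `previous`."""

    def __init__(self, transmogrifier, name, options, previous):
        self.paths = options["paths"].split()

    def __iter__(self):
        for path in self.paths:
            yield {"_path": path}


class UpperTitle(object):
    """Hypothetical section: derives a title from each item's path."""

    def __init__(self, transmogrifier, name, options, previous):
        self.previous = previous

    def __iter__(self):
        for item in self.previous:
            item["title"] = item["_path"].rsplit("/", 1)[-1].upper()
            yield item


def build_pipeline(sections):
    """Chain (blueprint, options) pairs so each section wraps the one before it."""
    previous = iter(())
    for index, (blueprint, options) in enumerate(sections):
        # No real transmogrifier object exists in this sketch, so None is passed.
        previous = blueprint(None, str(index), options, previous)
    return previous


pipeline = build_pipeline([
    (Source, {"paths": "about/team news/launch"}),
    (UpperTitle, {}),
])
items = list(pipeline)
```

Nothing runs until the last section is iterated, so items flow through the whole chain one at a time.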
transmogrify.webcrawler
transmogrify.webcrawler: Crawls site or cache for content
transmogrify.webcrawler.typerecognitor: Sets Plone content type based on mime-type
transmogrify.webcrawler.cache: Saves content to disk
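As an illustration of choosing a content type from a mime-type, here is a minimal sketch. The mapping table, the `_mimetype` and `_type` item keys, and the `recognise_type` name are assumptions made for this example; the actual blueprint's rules may differ.

```python
import mimetypes

# Hypothetical mapping from mime-type to Plone content type.
TYPE_BY_MIME = {
    "text/html": "Document",
    "image/png": "Image",
    "image/jpeg": "Image",
    "application/pdf": "File",
}


def recognise_type(item, default="File"):
    """Set item['_type'] from its mime-type, guessing from the path if needed."""
    mime = item.get("_mimetype")
    if mime is None:
        # Fall back to the file extension when the crawler recorded no mime-type.
        mime, _encoding = mimetypes.guess_type(item["_path"])
    item["_type"] = TYPE_BY_MIME.get(mime, default)
    return item


page = recognise_type({"_path": "about/index.html"})
logo = recognise_type({"_path": "images/logo", "_mimetype": "image/png"})
other = recognise_type({"_path": "data/blob", "_mimetype": "application/x-unknown"})
```

Unknown mime-types fall back to `File` here, which keeps the content uploadable even when it cannot be classified.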
transmogrify.htmlcontentextractor
transmogrify.htmlcontentextractor: Provide XPath for title, description, text etc.
transmogrify.htmlcontentextractor.auto: Guesses XPaths from content
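The idea of pulling fields out of a page with one XPath per field can be sketched with the standard library. This is a simplified illustration, not the blueprint's code: `PAGE`, `RULES` and `extract` are hypothetical, the sketch keeps only text rather than HTML, and `xml.etree.ElementTree` handles only well-formed markup and a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page with template parts around the content.
PAGE = """<html><body>
<div id="header">Site banner</div>
<h1 class="title">About us</h1>
<div id="content"><p>We convert sites.</p></div>
</body></html>"""

# One XPath per field, in the spirit of template1:description and template1:text.
RULES = {
    "title": ".//h1[@class='title']",
    "text": ".//div[@id='content']",
}


def extract(html, rules):
    """Return the text found by each field's XPath; unmatched fields are skipped."""
    root = ET.fromstring(html)
    item = {}
    for field, xpath in rules.items():
        node = root.find(xpath)
        if node is not None:
            item[field] = "".join(node.itertext()).strip()
    return item


item = extract(PAGE, RULES)
```

Anything not matched by a rule, such as the header banner, is left behind, which is how the site template gets removed.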
transmogrify.siteanalyser
transmogrify.siteanalyser.relinker: Moves, renames, url tidying
transmogrify.siteanalyser.title: Guess page titles
transmogrify.siteanalyser.defaultpage: Move index pages into folders
transmogrify.siteanalyser.attach: Move attachments closer to pages
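To illustrate moving an index page into its folder, here is a minimal sketch using a deliberately simple rule: a page such as about.html that sits next to a folder named about is moved inside it. The rule, the `_orig_path` and `_defaultpage` keys, and the function name are assumptions for this example; the actual blueprint may decide differently.

```python
import posixpath


def move_index_pages(items):
    """Move pages like 'about.html' into a sibling folder 'about/' (assumed rule)."""
    items = list(items)
    # Every directory that contains at least one item counts as a folder.
    folders = {posixpath.dirname(item["_path"]) for item in items}
    for item in items:
        stem, ext = posixpath.splitext(item["_path"])
        if ext in (".html", ".htm") and stem in folders:
            item["_orig_path"] = item["_path"]
            item["_path"] = posixpath.join(stem, "index" + ext)
            item["_defaultpage"] = True
        yield item


moved = list(move_index_pages([
    {"_path": "about.html"},
    {"_path": "about/team.html"},
    {"_path": "contact.html"},
]))
```

Keeping the original path on the item leaves enough information for a later step to fix links or create redirects.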
transmogrify.ploneremote
Remoteconstructor: Adds content to Plone via xmlrpc
Remoteschemaupdater: Updates content of existing object
Remotenavigationexcluder: Hides content not in original site's navigation
Remoteworkflowupdater: Publish content
Remoteredirector: Creates aliases for items that have moved
Other blueprints
transmogrify.pathsorter: Puts folders before content and content in right order
collective.transmogrifier.sections.condition: Useful to drop certain content
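Both ideas can be sketched in a few lines of plain Python. The helper names `sort_folders_first` and `keep_if` are hypothetical, and the sketch only illustrates the ordering and filtering ideas, not the blueprints' actual implementations or option syntax.

```python
def sort_folders_first(items):
    """Order items so every folder comes before the content inside it."""
    # Comparing lists of path segments puts 'about' ahead of 'about/team'.
    return sorted(items, key=lambda item: item["_path"].strip("/").split("/"))


def keep_if(items, condition):
    """Yield only the items for which condition(item) is true."""
    for item in items:
        if condition(item):
            yield item


crawled = [
    {"_path": "about/team"},
    {"_path": "news"},
    {"_path": "about"},
    {"_path": "about/contact.pdf"},
]
ordered = sort_folders_first(crawled)
kept = list(keep_if(ordered, lambda item: not item["_path"].endswith(".pdf")))
```

Sorting first matters for upload, since a folder has to exist before anything can be created inside it.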
Where to get it
http://github.com:djay/funnelweb.git
http://github.com:djay/transmogrify.*
Pypi release TBA