The golden rules | Locating CPU usage | CPU optimization | Memory optimization | Massive sites deployment
How to minimize CPU and memory usage of Zope and Plone applications
Gaël LE MIGNOT – Pilot Systems
October 2007
Gaël LE MIGNOT – Pilot Systems How to minimize CPU and memory usage of Zope and Plone applications
Plan
1 The golden rules
    Use efficient algorithm
    Use C code
    Do things once
    The 90-10 or 80-20 law
2 Locating CPU usage
    Using decorators
    Issues and tips about benchmarks
    Using DeadlockDebugger
3 CPU optimization
    Use catalog and product code
    Using caches
    The hidden devil: conflict errors
4 Memory optimization
    Locating memory usage
    Optimizing memory
5 Massive sites deployment
    ZEO pros and cons
    Squid caching overview
    Site exporting strategies
Why this talk?
The current situation
- Plone is very powerful
- But Plone is slow!
- High-level application developers tend to forget about optimization
The goals of this talk
- Remind you of the golden rules
- Provide tips, tricks and ways to write faster Plone code
- Consider both CPU and memory usage
- A quick coverage of deployment solutions, the work-around
Use efficient algorithm | Use C code | Do things once | The 90-10 or 80-20 law
Oh, no, not maths!
Algorithm is the key
- An efficient algorithm is the key to fast programs
- The gains from efficient algorithms are in powers of ten
- This is not made obsolete by high-level languages!
Program complexity
- A general approximation of speed for large datasets
- Counts the number of times you execute the core loop
- Two nested loops: O(n²), three nested loops: O(n³)
- Remember: sorting is O(n ln(n)), list lookup is O(n), btree lookup is O(ln(n)), dict lookup is between O(1) and O(n), O(1) on average
The test case
Presenting the test case
- We have two lists of words
- We want the words that are in both lists
- I used Debian’s French and English wordlists
Comparable uses in Plone
- Finding users in two groups
- Handling tags (for tag clouds, for example)
- Automatically finding related documents
The test program
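The naive version
    french = open("french")
    english = open("english")

    # One word per line; the trailing newlines are kept on both sides
    french = list(french)
    english = list(english)

    # "word in english" scans the whole list for every French word
    words = [ word for word in french
              if word in english ]

    print len(french), len(english), len(words)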
The first results
Big dataset
- French words: 139704, English words: 98326
- Program took 7m52s; replace list with set: 0.17s
- Speed-up factor: 2700x
More reasonable dataset
- At 1000 words: 140ms and 4.1ms; speed-up: 34x
- At 100 words: 1.7ms and 0.32ms; speed-up: 5x
Moral
- Use an efficient algorithm; the correct datatype is often enough
- Don’t forget Python’s powerful set type!
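The "replace list with set" change can be sketched as follows. This is a minimal Python 3 illustration, not the exact program that was timed; the function name common_words is made up for the example.

```python
def common_words(first, second):
    """Return the words of `first` that also appear in `second`, in order."""
    lookup = set(second)   # built once, in O(len(second))
    # Each membership test on a set is O(1) on average, instead of O(n) on a list
    return [word for word in first if word in lookup]
```

Only the datatype of the lookup side changes; the list comprehension is the same as in the naive version.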
Use C code: trust Python
The theory
- On the other hand, some people like to hand-optimize code
- In addition to being less readable, it’s often slower
- Use C code when you can; trust Python’s standard library
Applying to the previous program
- Our lists are already sorted
- We can save time, and have an O(n) algorithm
- But the sets version is still faster: 0.17s, compared to 0.21s for my manually optimized one (which takes about 23% longer)
My beautiful hand-crafted code
    french = list(open("french"))
    english = list(open("english"))

    pos = 0
    maxpos = len(english) - 1
    words = []
    for word in french:
        # Both lists are sorted: advance in english until we reach or pass word
        while (word > english[pos]) and (pos < maxpos):
            pos += 1
        if word == english[pos]:
            words.append(word)

    print len(french), len(english), len(words)
Do things once
The theory
- Avoid doing the same thing many times
- Use local variables to store results
- In a Zope context, it’s even more true with acquisition
- And in restricted code, with security checks, even more!
Applying to the previous program
- Move the maxpos computation from before the loop to inside it
- We go from 0.21s to 0.23s; that’s almost 10% slower
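The two variants being compared can be sketched side by side. This is a Python 3 reconstruction under assumptions: the function names are invented for the example, and the guard for an empty second list is an addition.

```python
def common_sorted(first, second):
    """Intersect two sorted lists; the upper bound is computed once."""
    if not second:
        return []
    pos = 0
    maxpos = len(second) - 1          # done once, before the loop
    words = []
    for word in first:
        while word > second[pos] and pos < maxpos:
            pos += 1
        if word == second[pos]:
            words.append(word)
    return words


def common_sorted_recomputing(first, second):
    """Same algorithm, but the bound is recomputed at every test."""
    if not second:
        return []
    pos = 0
    words = []
    for word in first:
        while word > second[pos] and pos < len(second) - 1:   # done every time
            pos += 1
        if word == second[pos]:
            words.append(word)
    return words
```

Both return the same result; only the amount of repeated work differs.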
The last golden rule
- It states that 10% of the code takes 90% of the resources
- The lighter version is that 20% takes 80% of the resources
- It’s already used in Zope: some parts, like ZODB or security, are implemented in C
- But it applies to your code too!
This rule is also a nice transition...
- How to detect those?
- This will be our next part!
Using decorators | Issues and tips about benchmarks | Using DeadlockDebugger
The easy way: using decorators
The idea
- Write a simple decorator printing the time taken by its function execution
- Use it around functions you want to test
How to get more accurate values
- On Unix systems, you can use resource.getrusage
- It’ll give you user-space and kernel-space CPU usage
The limits
- You need to decorate all your functions
- It’ll consider all threads, not just your own
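Such a decorator could look like the sketch below. It is written for Python 3 on a Unix system; the names timed and build_squares are illustrative, and the output format is an arbitrary choice.

```python
import functools
import resource
import time


def timed(func):
    """Print wall-clock, user and system time spent in each call of func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        before = resource.getrusage(resource.RUSAGE_SELF)
        start = time.time()
        try:
            return func(*args, **kwargs)
        finally:
            real = time.time() - start
            after = resource.getrusage(resource.RUSAGE_SELF)
            # RUSAGE_SELF covers the whole process, all threads included
            print("%s: real %.4fs, user %.4fs, system %.4fs" % (
                func.__name__, real,
                after.ru_utime - before.ru_utime,
                after.ru_stime - before.ru_stime))
    return wrapper


@timed
def build_squares(count):
    """Example workload, only here to have something to measure."""
    return [i * i for i in range(count)]
```

Calling build_squares(100000) prints one line with the three timings and returns the list as usual.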
System, user or real? ... I’ll say blue!
System and user time
- Only account for CPU time used by your Zope process
- System is the time spent in kernel mode, like IO, thread handling, ...
- System doesn’t account for IO waiting
- User is the time spent in your code, Python, Zope, Plone
Real-life clock
- Is disturbed by other processes
- But accounts for time spent waiting for IO (disk spinning, ZEO server answer, swapping pages in and out, ...)
Schroedinger’s cat will bite us!
Benchmarking affects your application
- You can’t observe a process (be it a cat or a Zope) without disturbing it
- getrusage, time and logging use CPU time
- But if you don’t abuse them, the cost stays very low
Benchmarking moves you away from real life
- Unix doesn’t provide per-thread accounting
- Benchmarking a full multi-threaded system is unreliable
- Running a single thread moves you away from real life
- It can change swapping or ZEO access
Beware the power of chaos
Zope is a complex system
- It contains caches and a lot of other stateful devices
- Running a benchmark twice can lead to different results
The fresh load strategy
- Do all your benchmarks on a freshly started Zope
- That’ll be the worst case, when all caches are empty
The second test strategy
1 Start a fresh Zope
2 Run the code once, discard the result
3 Run it again, once caches are full
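Steps 2 and 3 of the second strategy can be sketched as a small helper. It assumes the process was freshly started beforehand (it does not restart anything itself), and the name cold_and_warm is invented for the example.

```python
import time


def cold_and_warm(func, *args, **kwargs):
    """Time two consecutive calls: the first fills caches, the second uses them."""
    timings = []
    for _ in range(2):
        start = time.perf_counter()
        func(*args, **kwargs)          # the result is discarded on purpose
        timings.append(time.perf_counter() - start)
    cold, warm = timings
    return cold, warm
```

The first value corresponds to the fresh load strategy, the second to the measurement with full caches.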
The quick and dirty profiler
What is DeadlockDebugger?
- A useful package listing the current state of all threads
- Its goal was to locate deadlocks
- But you can use it to find where your code spends its time
- It is Free Software, while the profiler module is not
The magical formula
    ~ $ while true; do
          date >> bench ;
          lynx -dump "$DEADLOCK_DEBUG_URL" >> bench ;
        done
Final remarks
Don’t disturb the beast
- When you do benchmarks, try to keep your computer otherwise idle
- That includes between two benchmarks!
- Running Firefox or OpenOffice will screw up your memory cache
Don’t mix chemicals
- The DeadlockDebugger trick is highly intrusive
- It’ll give you a good overall view, but is not very accurate
- Don’t mix it with serious benchmarking
Use catalog and product code | Using caches | The hidden devil: conflict errors
The ABC of efficient Zope code
Use the catalog
It prevents loading objects from the ZODBUnpickling is slow, even more when using ZEOBut don’t abuse from it, either
Use product codeZope security checks have a costPage Templates and Python Scripts are slowContent types, tools, Zope3 code and external methodsare fastBut be careful not to create security holes !
A good cache will save your life
How to do it?
- Cache the results of (moderately) slow methods
- You can write a cache decorator
Cache tricks and tips
- Don’t store Zope objects, but simple Python types
- You can use the parameters as the key
- Don’t forget the context
- Consider wisely what to use as the key
- For example, roles or userid?
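A cache decorator with an explicit key function could be sketched like this. It is a generic Python 3 illustration, not the API of any particular package; cached, render_menu and the choice of (path, roles) as the key are assumptions made for the example.

```python
import functools


def cached(key_func):
    """Cache results of the decorated function under key_func(*args, **kwargs)."""
    def decorator(func):
        store = {}

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            key = key_func(*args, **kwargs)
            if key not in store:
                store[key] = func(*args, **kwargs)
            return store[key]

        wrapper.flush = store.clear    # hook for manual expiry
        return wrapper
    return decorator


calls = []   # records real executions, to show when the cache is bypassed


# Keyed on the roles rather than the userid: users with the same roles share entries
@cached(lambda path, roles: (path, tuple(sorted(roles))))
def render_menu(path, roles):
    calls.append(path)
    return "menu for %s as %s" % (path, ", ".join(sorted(roles)))
```

The cached value is a plain string, and the key only contains simple Python types, in line with the tips above.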
The evil twin of caches: expiry
Automatic expiry
- Real-life time based
- Maximal number of objects
- LRU or scoring?
Manual expiry
- Easy and ugly: restarting the Zope
- Including a flush button somewhere
Semi-automatic expiry
- Quite complex to handle
- But the most efficient
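The automatic and manual schemes can be combined in a small class. The sketch below is a Python 3 illustration under assumptions: the class name, its parameters and the injectable clock are invented for the example, and it covers time-based expiry, an LRU size limit and a manual flush only.

```python
import time
from collections import OrderedDict


class ExpiringCache(object):
    """Cache with a maximum age, a maximum size (LRU) and a manual flush."""

    def __init__(self, max_items=100, max_age=60.0, clock=time.time):
        self.max_items = max_items
        self.max_age = max_age
        self.clock = clock
        self._entries = OrderedDict()   # key -> (stored_at, value), least recent first

    def get(self, key, default=None):
        entry = self._entries.get(key)
        if entry is None:
            return default
        stored_at, value = entry
        if self.clock() - stored_at > self.max_age:
            del self._entries[key]          # time-based expiry
            return default
        self._entries.move_to_end(key)      # mark as most recently used
        return value

    def set(self, key, value):
        self._entries[key] = (self.clock(), value)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_items:
            self._entries.popitem(last=False)   # drop the least recently used

    def flush(self):
        """Manual expiry, for example behind a flush button."""
        self._entries.clear()
```

Passing the clock as a parameter makes the time-based part easy to test without waiting.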
The advertising page
GenericCache
- A GPLed pure Python module
- It provides a decorator and a class
- Supports custom key marshallers
- Supports several expiry schemes (time, LRU, manual)
- Can easily support semi-automatic expiry
Using it in Zope
- You need to write your own key marshallers
- You can integrate the flushing in your UI
Why and how do they happen?
Threads and transactions
- Nothing is written to the ZODB until the end
- But two threads can change the same object at the same time
- That is a conflict: Zope will raise an error
Zope default behaviour
- Zope will then restart the transaction
- The user will not see anything
- But the processing will be done twice...
- Zope will give up after three tries
How to limit them
Make changes local
- Try not to change too many objects in the same transaction
- This is a design decision, you must think about it beforehand
- It is not always possible
Commit your transaction
- You can manually commit your transaction
- This will lower the risk of collision
- But it may also break consistency
- Use it on heavy processing, at safe points
- Example: reindexing a catalog, processing a daily batch
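Committing at safe points during a batch can be sketched as below. The commit callable is injected so that the example runs without Zope; in a Zope process one would typically pass transaction.commit from the transaction package. The function name and the batch size are assumptions.

```python
def process_in_batches(items, handle, commit, batch_size=100):
    """Call handle(item) for each item, calling commit() every batch_size items.

    Returns the number of commits performed.
    """
    pending = 0
    commits = 0
    for item in items:
        handle(item)
        pending += 1
        if pending >= batch_size:
            commit()          # safe point: a whole batch has been processed
            commits += 1
            pending = 0
    if pending:
        commit()              # commit the last, partial batch
        commits += 1
    return commits
```

Each commit shortens the window in which another thread can touch the same objects, at the price of the batch no longer being all-or-nothing.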
How to resolve them
Some conflicts may be resolved
- Most recent access date is the max of both
- Counters are easily resolved
- Container conflicts can sometimes be resolved too
Zope allows you to do so
- With the _p_resolveConflict method
- It takes three objects: old state, saved state, current state
- It’s up to you to do a three-way merge
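For a counter, the three-way merge could look like the sketch below. To stay runnable without ZODB it uses a plain class; a real persistent object would derive from persistent.Persistent. The class name and the parameter names are illustrative.

```python
class HitCounter(object):
    """Counter whose concurrent increments can be merged instead of retried."""

    def __init__(self):
        self.count = 0

    def hit(self):
        self.count += 1

    def _p_resolveConflict(self, old_state, saved_state, new_state):
        # Each state is the attribute dictionary of one version of the object:
        # old_state   - the common ancestor both transactions started from
        # saved_state - what the other transaction already committed
        # new_state   - what the current transaction is trying to commit
        resolved = dict(new_state)
        added_by_other = saved_state["count"] - old_state["count"]
        resolved["count"] = new_state["count"] + added_by_other
        return resolved
```

With an ancestor at 10, a committed value of 13 and a current value of 12, the merged counter is 15: both sets of increments are kept.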
Please draw me a sheep
Locating memory usage | Optimizing memory
Peeling the onion’s layers
Base idea
- Many layers can delay memory release
- It’s hard to know how much memory is really used
Layers
1 Python’s garbage collector may delay the release of objects
2 Libraries may recycle objects (like linked-list items)
3 Your libc’s free may delay release to the operating system
4 The operating system may delay the actual munmap or sbrk operation
A bit of theory about swap
Swap and memory
- In modern operating systems, the memory is just a cache for the disk
- Swapping, unmapping binaries or releasing cache is the same
- You can have swap used without it being a problem
Working set and thrashing
- The problem is in frequency, not amount
- During an operation, a set of pages is used, called the “working set”
- When it doesn’t fit in memory, your system is “thrashing”
So, do I have a memory problem?
The vmstat command
- Gives regular information about the virtual memory system
- Exists on most Unixes, including GNU/Linux and FreeBSD
- Typical invocation is vmstat 2
- Most interesting columns are si and so on Linux, pi and po on FreeBSD
More information
- It is a global command, accounting for the whole system
- It also contains “normal” I/O to disks and CPU state
A few tips
Using the gc module
- You can check for uncollectable cycles
- You can ask for manual collection
- You can get a list of all objects
- Warning: basic types like strings are not returned, you need to walk into containers
Monkey-patching malloc
- You can track malloc/free calls with ltrace
- On the GNU C library, you can write hooks to malloc in C
- Tools like dmalloc can provide debug logs
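The three uses of the gc module listed above can be sketched in a few lines of Python 3. The helper names object_census and uncollectable_count are made up for the example.

```python
import gc
from collections import Counter


def object_census(limit=10):
    """Return the `limit` most common type names among gc-tracked objects."""
    gc.collect()                       # manual collection first
    # gc.get_objects() only lists container objects tracked by the collector
    counts = Counter(type(obj).__name__ for obj in gc.get_objects())
    return counts.most_common(limit)


def uncollectable_count():
    """Number of objects the collector found unreachable but could not free."""
    gc.collect()
    return len(gc.garbage)
```

Comparing two censuses taken before and after an operation gives a rough idea of which types are accumulating.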
Sometimes, you get both at once
Use the catalog
- Using the catalog instead of awakening objects will reduce memory usage
- It’ll also reduce disk usage, and therefore the cost of swapping
Store big files in the filesystem
- It’ll avoid polluting the ZODB cache
- Even better: use Apache to serve them directly
- You can use products like tramline
- You can just serve the BlobFile directory with Apache
Reference counting and big memory chunks
The simple theory
- Objects are only destroyed when there is no reference left
- Local variables are destroyed at the end of the function
- You can use the del statement to force it
- Use it on big objects before any slow operation
But sometimes it’s not that simple
- A reference can be held in a calling function
- You cannot destroy it then
- You need to pass by reference
Let’s find a perfect culprit
The code
    def process(target):
        what = generate_data()
        send(target, what)

    def send(target, what):
        encoded = encode(what)
        # Only drops the local name: process() still holds a reference
        del what
        target.write(encoded)
A solution
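The code
    def process(target):
        what = generate_data()
        # Hand the data over inside a container, then drop our own reference
        params = { "data": what }
        del what
        send(target, params)

    def send(target, params):
        encoded = encode(params["data"])
        # The dictionary held the last reference: the data can be freed now
        del params["data"]
        target.write(encoded)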
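The difference between the culprit and the solution can be observed with a weak reference. The sketch below relies on CPython's reference counting, which destroys an object as soon as its last reference goes away; BigData and the two function names are stand-ins invented for the example.

```python
import weakref


class BigData(object):
    """Stand-in for a large object; a plain class can be weakly referenced."""


def freed_with_plain_argument():
    what = BigData()
    probe = weakref.ref(what)

    def send(what):
        del what                   # only the callee's name goes away
        return probe() is None     # True if the object was destroyed

    return send(what)              # the caller still holds its own reference


def freed_with_container():
    what = BigData()
    probe = weakref.ref(what)
    params = {"data": what}
    del what                       # the dictionary now holds the only reference

    def send(params):
        del params["data"]         # last reference dropped here
        return probe() is None

    return send(params)
```

On CPython the first function returns False (the object survives the del) and the second returns True.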
ZEO pros and cons | Squid caching overview | Site exporting strategies
The two faces of ZEO
The good face of ZEO
- It is very easy to set up
- It makes Zope parallelizable on many CPUs
- It supports networking, and can allow clustering
- It can do both failover and load-balancing
The bad face of ZEO
- It’ll slow any ZODB access
- The database itself is centralised and can be a bottleneck
- Session objects will need to go through ZODB marshalling too
Using a proxy-cache
Do things only once
- Rendering a Plone page is complex
- But many visitors will have exactly the same results
- Using a proxy-cache like Squid (or Apache) can avoid doing it many times
- But that only covers anonymous users
Squid can do varying
- You can use a multilingual site behind a Squid
- It’ll detect the headers and serve the correct version from the cache
- Example: EADS Astrium
If one ZODB is not enough...
Static HTML export
- Not too hard to implement, but still tricky
- Removes all dynamic features like search
Exporting a site
- You can copy a ZODB from a running Zope
- Or you can use zexp and XML-RPC or POST
Resyncing back
- It requires some hacks
- But we did it on forum (PloneGossip) and event subscription (SignableEvent)
Conclusion
Plone is not doomed to be slow!
- Optimization has to be part of all steps, from design to code
- Even with fast CPUs, it’s still an important issue
Thanks
- To the Pilot Systems team for feedback
- To the Free Software community for all the tools available
The usual...
- Any questions?