Date posted: 13-Aug-2015
Shubhra Kar | Products & Education | Twitter: @shubhrakar
Profiling & Memory Leak Diagnosis
nodejs @ hyper-scale
About me
J2EE and SOA architect
Performance architect
Node, mBaaS & APIs
These guys sent me!
Bert Belder
Ben Noordhuis
Node/io.js Core
Raymond Feng
Ritchie Martori
LoopBack & Express Core
Sam Roberts
Miroslav Bajtos
Ryan Graham
Latency demands are uncompromising
[Chart: “The new curve”: latency vs. concurrent users vs. adoption, across Web, SaaS, Mobile, and IoT]
That’s not your API, is it?
So how do we tune performance?
How does GC work in V8 – almost the same!
The concept of reachability:
• Roots are reachable or in-live-scope objects: everything referenced from anywhere in the call stack (all local variables and parameters in the functions currently being invoked), plus any global variables.
• Objects are kept in memory while they are accessible from roots through a reference or a chain of references.
• Root objects are pointed to directly by V8 or by the host environment, such as DOM elements in a web browser.
• The garbage collector identifies dead memory regions, i.e. objects not reachable through any chain of pointers from a live object, and reallocates them or releases them to the OS.
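As a toy illustration (plain JavaScript, no V8 internals; all names are made up): an object survives as long as some chain of references from a root reaches it, and becomes collectable the moment the last such reference is dropped.

```javascript
// Roots here are module-level variables; names are illustrative only.
var root = { child: { grandchild: { data: 'still reachable' } } }

var orphan = { data: 'only reachable via this variable' }
orphan = null // the old object is now unreachable from any root: garbage

// The chain root -> child -> grandchild keeps all three objects alive.
console.log(root.child.grandchild.data) // prints "still reachable"
```

The collector never frees `root.child.grandchild` here, no matter how long it sits unused, because reachability, not recency of use, is what keeps an object alive.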
Easy, right? Hell no!
Pause and then Stop the World
V8 essentially:
• Stops program execution when performing a garbage collection cycle.
• Processes only part of the object heap in most garbage collection cycles, which minimizes the impact of stopping the application.
• Accurately keeps track of all objects and pointers in memory, which avoids falsely identifying objects as pointers (a mistake that can result in memory leaks).
In V8 the object heap is segmented into many parts; hence, if an object is moved in a garbage collection cycle, V8 updates all pointers to the object.
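These stop-the-world pauses can be made visible with V8’s --trace-gc flag, which node passes through. A hedged sketch; the inline script is just illustrative allocation churn, not part of the deck:

```shell
# --trace-gc logs one line per collection cycle (scavenge or
# mark-sweep/mark-compact) together with its pause time, making the
# stop-the-world cost visible.
node --trace-gc -e '
  var junk = [];
  for (var i = 0; i < 1e6; i++) {
    junk.push({ i: i });           // allocate in new space
    if (i % 1e5 === 0) junk = [];  // drop references so scavenges can reclaim
  }
  console.log("done");
'
```

Each logged line shows heap size before and after the cycle plus the pause duration in milliseconds.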
Short, Full GC and some algorithms to know
V8 divides the heap into two generations:
Short GC / scavenging
Objects are allocated in “new space” (between 1 and 8 MB). Allocation in new space is very cheap: we simply increment an allocation pointer to reserve space for a new object. When the pointer reaches the end of new space, a scavenge (minor garbage collection cycle) is triggered, quickly removing dead objects from new space. Scavenging has a large space overhead, since physical memory must back both to-space and from-space, but it is fast by design, which is why it is used for short GC cycles. That overhead is acceptable as long as new space stays small (a few MB).
Full GC / mark-sweep & mark-compact
Objects which have survived two minor garbage collections are promoted to “old space.” Old space is collected in a full GC (major garbage collection cycle), which is much less frequent; a full GC cycle is triggered based on a memory threshold. To collect old space, which may contain several hundred megabytes of data, V8 uses two closely related algorithms: mark-sweep and mark-compact.
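Both generation sizes are tunable from the command line. A hedged sketch; the 16 MB / 2048 MB values are arbitrary examples, not recommendations:

```shell
# --max-semi-space-size caps new space (per semi-space, in MB);
# --max-old-space-size caps old space (in MB). Values are examples only.
node --max-semi-space-size=16 --max-old-space-size=2048 -e '
  var v8 = require("v8");
  console.log("heap limit MB:",
    Math.round(v8.getHeapStatistics().heap_size_limit / 1048576));
'
```

Raising the old-space limit trades memory footprint for fewer (but individually longer) full GC cycles, so it is worth measuring rather than guessing.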
New Algorithm implementation
Incremental marking & lazy sweeping
In mid-2012, Google introduced two improvements that reduced garbage collection pauses significantly: incremental marking and lazy sweeping.
Incremental marking means doing a bit of marking work, then letting the mutator (the JavaScript program) run a bit, then doing another bit of marking work, with short pauses on the order of 5–10 ms each. It is threshold based: once the threshold is reached, execution is paused at allocations to perform an incremental marking step.
Lazy sweeping cleans up one set of objects at a time, eventually sweeping all pages.
OK… so how can I do heap analysis?
StrongLoop CLI
$ slc start
$ slc ctl
Service ID: 1
Service Name: express-example-app
Environment variables: No environment variables defined
Instances:
  Version  Agent version  Cluster size
  4.1.0    1.5.1          4
Processes:
  ID         PID    WID  Listening Ports  Tracking objects?  CPU profiling?
  1.1.50320  50320  0
  1.1.50321  50321  1    0.0.0.0:3001
  1.1.50322  50322  2    0.0.0.0:3001
  1.1.50323  50323  3    0.0.0.0:3001
  1.1.50324  50324  4    0.0.0.0:3001
$ slc ctl heap-snapshot 1.1.1
StrongLoop API
Programmatic heap snapshots (timer based)
Programmatic heap snapshots (threshold based)
var heapdump = require('heapdump')
...
setInterval(function () {
  heapdump.writeSnapshot()
}, 6000 * 30)
var heapdump = require('heapdump')
var nextMBThreshold = 0

setInterval(function () {
  var memMB = process.memoryUsage().rss / 1048576
  if (memMB > nextMBThreshold) {
    heapdump.writeSnapshot()
    nextMBThreshold += 100
  }
}, 6000 * 2)
StrongLoop Arc
StrongLoop Arc – Memory Analysis
CPU hotspots & event loop blocks?
Don’t Block the EventLoop
Node.js Server
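A minimal sketch of the idea (illustrative helper names, not StrongLoop code): CPU-bound work done synchronously starves every other callback, while chunking the same work with setImmediate lets the event loop keep servicing requests between chunks.

```javascript
// sumBlocking/sumChunked are made-up names for illustration.
function sumBlocking(n) {
  var total = 0
  for (var i = 0; i < n; i++) total += i // nothing else runs until this ends
  return total
}

function sumChunked(n, cb) {
  var total = 0
  var i = 0
  function chunk() {
    var end = Math.min(i + 1e5, n)
    for (; i < end; i++) total += i
    if (i < n) setImmediate(chunk) // yield so the event loop can run callbacks
    else cb(total)
  }
  chunk()
}

sumChunked(1e6, function (total) {
  console.log(total === sumBlocking(1e6)) // prints true: same result, no block
})
```

The chunked version trades a little latency on the computation for responsiveness of everything else the server is doing.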
StrongLoop CLI
$ slc start
$ slc ctl
Service ID: 1
Service Name: express-example-app
Environment variables: No environment variables defined
Instances:
  Version  Agent version  Cluster size
  4.1.0    1.5.1          4
Processes:
  ID         PID    WID  Listening Ports  Tracking objects?  CPU profiling?
  1.1.50320  50320  0
  1.1.50321  50321  1    0.0.0.0:3001
  1.1.50322  50322  2    0.0.0.0:3001
  1.1.50323  50323  3    0.0.0.0:3001
  1.1.50324  50324  4    0.0.0.0:3001
$ slc ctl cpu-start 50320 $ slc ctl cpu-stop 50320
CPU profiling in StrongLoop Arc
The upside-down wedding cake
Blocked event loop in Meteor Atmosphere
node-fibers implements co-routines. Meteor uses this to hack thread-local storage, allowing V8 to run multiple execution contexts, each mapped to a co-routine.
FindOrAllocatePerThreadDataForThisThread() is used when switching context between co-routines.
Co-routines are cooperative: the current co-routine has to yield control before another one can run, and that is what Meteor does in its process.nextTick() callback; it essentially builds concurrent (but not parallel) green threads on a round-robin scheduler.
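This scheduling model can be sketched with plain generators instead of node-fibers (a toy round-robin scheduler, not Meteor’s actual code): each green thread runs until it yields, then goes to the back of the queue.

```javascript
// Toy round-robin scheduler over generator functions; illustrative only.
function runRoundRobin(gens) {
  var tasks = gens.map(function (g) { return g() })
  var out = []
  while (tasks.length > 0) {
    var task = tasks.shift()         // take the next co-routine in line
    var step = task.next()           // run it until its next yield
    if (step.value !== undefined) out.push(step.value)
    if (!step.done) tasks.push(task) // rotate it to the back of the queue
  }
  return out
}

console.log(runRoundRobin([
  function* () { yield 'a1'; yield 'a2' },
  function* () { yield 'b1'; yield 'b2' },
]).join(', ')) // prints "a1, b1, a2, b2": interleaved, never parallel
```

Because scheduling only happens at yield points, one co-routine that never yields stalls all the others, which is exactly the failure mode being diagnosed here.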
Too many tiny tasks, rather than one long-running one, were blocking the event loop.
process.nextTick() has a failsafe mechanism where it processes “x” tick callbacks before deferring the remaining ones to the next event loop tick.
The native MongoDB driver disabled this failsafe to silence a warning message in node v0.10 about maxTickDepth being reached.
ticks  parent  name
 2274    7.3%  v8::internal::Isolate::FindOrAllocatePerThreadDataForThisThread()
 1325   58.3%    LazyCompile: ~<anonymous> packages/meteor.js:683
 1325  100.0%      LazyCompile: _tickCallback node.js:399
The fix
The workaround: switch from process.nextTick() to setImmediate()
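A small sketch of why the switch helps: process.nextTick callbacks run before the event loop is allowed to continue, while setImmediate defers to the next loop iteration, so timers and I/O still get a turn.

```javascript
// Illustrative ordering demo; names are arbitrary.
var order = []
setImmediate(function () { order.push('immediate') })
process.nextTick(function () { order.push('tick1') })
process.nextTick(function () { order.push('tick2') })
setTimeout(function () {
  order.push('timeout')
  // All queued nextTicks drain first, then the immediate, then the timer.
  console.log(order.join(' -> ')) // tick1 -> tick2 -> immediate -> timeout
}, 10)
```

With recursively scheduled nextTicks the “tick” phase never ends and the loop starves; recursively scheduled setImmediates interleave with timers and I/O instead.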
StrongLoop Smart Profiling… Arc support coming soon
$ slc ctl cpu-start 1.1.76901 12                           # local Linux host
$ slc ctl -C http://my.remote.host cpu-start 1.1.76901 12  # remote Linux host
• Sniff for event loop blocks
• Trigger a deep profile when blocking is encountered
And finally the winner is…
Deep Transaction Tracing
StrongLoop – node.js Development to Production
Build and Deploy
Automate Lifecycle
Performance Metrics: Real-time production monitoring
Profiler: Root-cause CPU & memory analysis
API Composer: Visual modeling
StrongLoop Arc
Process Manager: Scale applications
Mesh (Q2 2015): Deploy containerized
ORM, mBaaS, Realtime