Performance, profiling, and production troubleshooting tools
on the Java platform
About CONTEXTWEB Exchange
• Core platform written in Lightweight Java
• 10,000+ Requests per second
• <50ms Average Response Time
• May query 40+ parties per auction
• Multiple daily code pushes
Company Confidential 2
About this talk
• Internal Training
• Developers – don’t always know troubleshooting
• Ops – don’t always know Java tooling
• Introduction to some basics and tooling distributed with the JDK
Problems We’ve Seen
• Stability
  • CPU under load
  • System Stability
• Scalability
  • Maximum throughput per application
  • Average Response time under load
  • Hardware utilization
Things we look at
1. Garbage Collection
2. Thread Contention
3. Resource Usage
GC
JVM Garbage Collection
-Xloggc:/path/to/logfile (log GC activity)
9.302: [GC 1146880K->75819K(3002368K), 0.0723170 secs]
9.819: [Full GC 152814K->75275K(3002368K), 0.2690010 secs]
13.829: [Full GC 368689K->79768K(3002368K), 0.3586360 secs]
22.635: [GC 1226648K->108287K(3002368K), 0.0190360 secs]
26.485: [GC 1255167K->94485K(3002368K), 0.0176960 secs]
27.039: [Full GC 174388K->99964K(3002368K), 0.4527120 secs]
34.997: [GC 1246844K->170532K(3002368K), 0.1357100 secs]
193.227: [GC 1317366K->244168K(3002368K), 0.1290920 secs]
197.768: [GC 1391023K->259567K(3002368K), 0.2124620 secs]
341.069: [GC 1406447K->332544K(3002368K), 0.2839650 secs]
346.165: [GC 1479424K->315776K(3002368K), 0.0939770 secs]
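Log lines in this shape are easy to summarize mechanically. The following is a minimal sketch, assuming the simple one-line format shown above (more verbose GC logging options produce other layouts, which this pattern skips). The class name GcLogSummary and its fields are hypothetical, not part of the JDK or of our codebase.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: counts minor and full collections and sums their pauses.
class GcLogSummary {
    // timestamp: [GC|Full GC beforeK->afterK(totalK), pause secs]
    private static final Pattern LINE = Pattern.compile(
            "(\\d+\\.\\d+): \\[(Full GC|GC) (\\d+)K->(\\d+)K\\((\\d+)K\\), (\\d+\\.\\d+) secs\\]");

    int minorCount;
    int fullCount;
    double minorPauseSecs;
    double fullPauseSecs;
    double maxPauseSecs;

    static GcLogSummary parse(Iterable<String> lines) {
        GcLogSummary s = new GcLogSummary();
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (!m.find()) {
                continue; // not a GC line in the assumed format
            }
            double pause = Double.parseDouble(m.group(6));
            if ("Full GC".equals(m.group(2))) {
                s.fullCount++;
                s.fullPauseSecs += pause;
            } else {
                s.minorCount++;
                s.minorPauseSecs += pause;
            }
            s.maxPauseSecs = Math.max(s.maxPauseSecs, pause);
        }
        return s;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "9.302: [GC 1146880K->75819K(3002368K), 0.0723170 secs]",
                "9.819: [Full GC 152814K->75275K(3002368K), 0.2690010 secs]",
                "22.635: [GC 1226648K->108287K(3002368K), 0.0190360 secs]");
        GcLogSummary s = parse(sample);
        System.out.println("minor GCs: " + s.minorCount + ", total pause " + s.minorPauseSecs + " s");
        System.out.println("full GCs:  " + s.fullCount + ", total pause " + s.fullPauseSecs + " s");
        System.out.println("longest pause: " + s.maxPauseSecs + " s");
    }
}
```

Frequent Full GC entries, or a longest pause close to the response-time budget, are the first things worth checking in such a summary.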
JVM Garbage Collection
Jmap
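Lets you take memory snapshots and heap dumps of a running, or even hung, JVM
/usr/java/default/bin/jmap
jmap -F -J-d64 -dump:live,format=b,file=dump.hprof JAVA_PID
Note: per the JDK jmap documentation, the live suboption is not supported together with -F. Use -F only when the JVM does not respond; otherwise drop it.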
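A heap dump can also be triggered from inside the process, which is handy for shutdown hooks or admin endpoints. This is a minimal sketch, assuming a HotSpot JVM (HotSpotDiagnosticMXBean lives in com.sun.management and is not available on every JVM). The class name HeapDumper and the default file name are illustrative only.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.io.IOException;
import java.lang.management.ManagementFactory;

// Hypothetical helper: in-process equivalent of "jmap -dump" on HotSpot.
class HeapDumper {
    private static final String HOTSPOT_BEAN = "com.sun.management:type=HotSpotDiagnostic";

    // liveOnly=true dumps only reachable objects, like jmap's "live" suboption.
    // dumpHeap throws IOException if the target file already exists.
    static void dump(File target, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                HOTSPOT_BEAN,
                HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(target.getAbsolutePath(), liveOnly);
    }

    public static void main(String[] args) throws IOException {
        // Recent JDKs require the file name to end in ".hprof".
        File target = new File(args.length > 0 ? args[0] : "dump.hprof");
        dump(target, true);
        System.out.println("Wrote " + target.length() + " bytes to " + target);
    }
}
```

Unlike jmap -F, this only works while the JVM is still responsive enough to run application code.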
JVM (not) Garbage Collection
• java.lang.OutOfMemoryError: Java heap space
• java.lang.OutOfMemoryError: PermGen space
• java.lang.OutOfMemoryError: request <size> bytes for <reason>.
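The first two errors correspond to JVM-managed memory pools, which can be inspected at runtime through the standard java.lang.management API. The sketch below lists each pool with its fill percentage; the class name MemoryPools is hypothetical. Pool names vary by JVM version and collector (for example, a "Perm Gen" pool on older HotSpot releases), so nothing here should be hard-coded against a name. The third error is a native allocation failure and will not show up in these pools.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: reports how full each JVM memory pool is.
class MemoryPools {
    // Pool name -> percent of max in use, or -1 when the pool has no defined max.
    static Map<String, Double> percentUsed() {
        Map<String, Double> result = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            long max = usage.getMax();
            double percent = max > 0 ? 100.0 * usage.getUsed() / max : -1.0;
            result.put(pool.getName(), percent);
        }
        return result;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Double> e : percentUsed().entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```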
Garbage Collection: Tools we use
• Verbose GC Logs – every JVM
• Jstat – every 5 seconds
• Jconsole – monitoring
• JvisualVM – visualization
• Jmap – memory dumps
• Yourkit – memory profiling
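The collection counts and accumulated GC times that these tools display are also exposed inside the process through GarbageCollectorMXBean. This is a small sketch of reading them, assuming you want cumulative totals to sample periodically; the class name GcStats is hypothetical and this is not how jstat itself is implemented.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: cumulative GC counters since JVM start.
class GcStats {
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 when undefined for this collector
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when undefined for this collector
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
    }
}
```

Sampling these totals at a fixed interval and logging the deltas gives a rough GC rate without any external tool attached.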
Thread Contention
Full Java Thread Dump Java HotSpot(TM) 64-Bit Server VM 19.1-b02 Sun Microsystems Inc.
Number of threads: 545
Number of daemon threads: 485
Peak live thread count since the Java virtual machine started or peak was reset: 972
Is support for thread contention monitoring available on this JVM? [true]
Is thread contention monitoring enabled? [false]. If false, some thread synchronization statistics are not be available.
Is support for CPU time measurement for any thread available on this JVM? [true]
Is thread CPU time measurement enabled? [true]. If false, thread execution times are not available for any thread.
--------------------------------------------------------------------------------
Thread Execution Information:
-----------------------
Thread "http-thread-pool-8080-(862241)" thread-id: 879,948 thread-state: TIMED_WAITING Waiting on lock: java.util.concurrent.CountDownLatch$Sync@e3963a8
  at: sun.misc.Unsafe.park(Native Method)
  at: java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
  at: java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1011)
  at: java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1303)
  at: java.util.concurrent.CountDownLatch.await(CountDownLatch.java:253)
  at: com.contextweb.commons.client.http.batch.httpcorenio2.BatchClient$BatchClientJob.execute(BatchClient.java:115)
  at: com.contextweb.commons.client.http.batch.util.AbstractBatchExecutionJob.execute(AbstractBatchExecutionJob.java:18)
  at: com.contextweb.adserving.rt.buyer.RTBuyerBatchExecutorImpl.executeJob(RTBuyerBatchExecutorImpl.java:48)
  at: com.contextweb.adserving.rt.buyer.RTBuyerBidRequestProcessor.executeJob(RTBuyerBidRequestProcessor.java:135)
Thread Contention
• Thread dumps are your best friend
  • kill -3
  • Jstack (-J-d64 -l -m)
• Look for:
  • Many threads in a blocked or waiting state
  • Thread pool usage
  • Thread/resource pools missing items
  • Synchronization/concurrency issues
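A quick first pass over a dump is simply counting threads per state. The sketch below does that in-process with the standard ThreadMXBean; the class name ThreadStateCounter is hypothetical. A large and growing BLOCKED count across successive samples is the kind of signal listed above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper: histogram of live threads by Thread.State.
class ThreadStateCounter {
    static Map<Thread.State, Integer> countByState() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        // false, false: skip locked monitor/synchronizer details to keep the dump cheap
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            if (info == null) {
                continue; // thread ended while the dump was being taken
            }
            Integer old = counts.get(info.getThreadState());
            counts.put(info.getThreadState(), old == null ? 1 : old + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        for (Map.Entry<Thread.State, Integer> e : countByState().entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```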
Thread Contention
• Thread-dump analysis requires several dumps taken over time (a single dump cannot show whether a thread is stuck or just passing through)
• lsof/netstat/other system tools often provide hints on contention points
• (Historically) app servers/libs shipped with ridiculously low defaults
  • Worker threads
  • DB connections
  • File descriptors (Linux)
Thread Contention: Tools we use
• Thread dumps – every 15 seconds
• Shutdown scripts
  • Auto thread dump 5x
  • Lsof output
  • Netstat output
  • (Optional) heap dump
Resource Usage
• Memory
  • Don’t forget PermGen
  • Don’t forget Stack (per thread)
• File Descriptors
• DB/Network Connections
• Timeouts/Expirations
  • Connection/Read default to forever
  • LDAP/DNS default to forever
  • URL – blocking DNS call
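The "default to forever" point is easy to see with java.net.URLConnection, where a timeout of 0 means wait indefinitely. This is a minimal sketch of setting both timeouts explicitly; the class name TimeoutDefaults and the millisecond values are illustrative assumptions, not recommended settings.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper: never rely on the default (infinite) timeouts.
class TimeoutDefaults {
    static URLConnection openWithTimeouts(String url, int connectMillis, int readMillis)
            throws IOException {
        // openConnection() only creates the object; no network I/O happens yet.
        URLConnection connection = new URL(url).openConnection();
        connection.setConnectTimeout(connectMillis); // 0 would mean wait forever
        connection.setReadTimeout(readMillis);       // 0 would mean wait forever
        return connection;
    }

    public static void main(String[] args) throws IOException {
        URLConnection byDefault = new URL("http://localhost:8080/").openConnection();
        System.out.println("default connect timeout: " + byDefault.getConnectTimeout());
        System.out.println("default read timeout:    " + byDefault.getReadTimeout());

        URLConnection bounded = openWithTimeouts("http://localhost:8080/", 500, 2000);
        System.out.println("bounded connect timeout: " + bounded.getConnectTimeout());
        System.out.println("bounded read timeout:    " + bounded.getReadTimeout());
    }
}
```

With 40+ parties queried per auction, one unbounded read is enough to pin a worker thread, so every outbound client should get explicit values like these.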
Resource Usage: Tools we use
• ab (Apache Benchmark)
  • Dirt simple
  • ab -n 10000000 -k -c 200 'http://localhost:8080/bls/check?urlreferrercheck=geI0qJVZa/Uu7;5DtxzgRilmw=='
• Siege
  • Mud simple
  • siege -c 20 'http://localhost:8080/bls/check?useragenthash=adfasasdasd'
• Instrumentation
  • Home-grown – RollingCounter
  • Timers
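The slides do not show how RollingCounter is built, so the following is only a sketch of one common way to implement a rolling-window counter: a ring of fixed-width time buckets, where stale buckets are reset on reuse. The class name RollingCounterSketch, its constructor parameters, and the explicit nowMillis argument (which makes the behavior easy to test) are all assumptions, not the in-house API.

```java
import java.util.Arrays;

// Hypothetical sketch of a rolling-window event counter.
class RollingCounterSketch {
    private final long bucketMillis;
    private final long[] counts;
    private final long[] bucketIds; // which time slice each slot currently holds

    RollingCounterSketch(int buckets, long bucketMillis) {
        this.bucketMillis = bucketMillis;
        this.counts = new long[buckets];
        this.bucketIds = new long[buckets];
        Arrays.fill(bucketIds, -1L); // -1 marks a slot that was never used
    }

    // Records one event at the given time (non-negative milliseconds).
    synchronized void increment(long nowMillis) {
        long id = nowMillis / bucketMillis;
        int slot = (int) (id % counts.length);
        if (bucketIds[slot] != id) {
            // Slot still holds an older time slice: recycle it.
            bucketIds[slot] = id;
            counts[slot] = 0;
        }
        counts[slot]++;
    }

    // Sums events in the window of counts.length buckets ending at nowMillis.
    synchronized long sum(long nowMillis) {
        long newest = nowMillis / bucketMillis;
        long oldest = newest - counts.length + 1;
        long total = 0;
        for (int i = 0; i < counts.length; i++) {
            if (bucketIds[i] >= oldest && bucketIds[i] <= newest) {
                total += counts[i];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        RollingCounterSketch requests = new RollingCounterSketch(5, 1000);
        long now = System.currentTimeMillis();
        requests.increment(now);
        requests.increment(now);
        System.out.println("requests in last 5s: " + requests.sum(now));
    }
}
```

The synchronized methods keep the sketch short; at 10,000+ requests per second a lock-free variant would likely be preferable.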
Conclusions
• Many free tools give insight
• You can never have enough historical data
• Many default settings will get you in trouble
References
• http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html
• http://prefetch.net/blog/index.php/2008/01/16/monitoring-garbage-collection-with-jstat/
• http://download.oracle.com/javase/6/docs/technotes/tools/share/jstack.html
• http://java.sun.com/developer/technicalArticles/J2SE/monitoring/
• http://www.yourkit.com