NYC Java Meetup - Profiling and Performance

Post on 26-Jan-2015


description

A brief overview of some of the tools that ship with the Java platform that can be used to troubleshoot performance issues, and common production/performance problems

transcript

Performance, profiling, and production troubleshooting tools on the Java platform

About CONTEXTWEB Exchange

• Core platform written in Lightweight Java
• 10,000+ requests per second
• <50 ms average response time
• May query 40+ parties per auction
• Multiple daily code pushes

Company Confidential 2

About this talk

• Internal training
• Developers - don’t always know troubleshooting
• Ops - don’t always know Java tooling
• Introduction to some basics and tooling distributed with the JDK


Problems We’ve Seen

• Stability
  • CPU under load
  • System stability
• Scalability
  • Maximum throughput per application
  • Average response time under load
  • Hardware utilization


Things we look at

1. Garbage Collection
2. Thread Contention
3. Resource Usage

GC


JVM Garbage Collection

-Xloggc:/path/to/logfile (log GC activity)

9.302: [GC 1146880K->75819K(3002368K), 0.0723170 secs]
9.819: [Full GC 152814K->75275K(3002368K), 0.2690010 secs]
13.829: [Full GC 368689K->79768K(3002368K), 0.3586360 secs]
22.635: [GC 1226648K->108287K(3002368K), 0.0190360 secs]
26.485: [GC 1255167K->94485K(3002368K), 0.0176960 secs]
27.039: [Full GC 174388K->99964K(3002368K), 0.4527120 secs]
34.997: [GC 1246844K->170532K(3002368K), 0.1357100 secs]
193.227: [GC 1317366K->244168K(3002368K), 0.1290920 secs]
197.768: [GC 1391023K->259567K(3002368K), 0.2124620 secs]
341.069: [GC 1406447K->332544K(3002368K), 0.2839650 secs]
346.165: [GC 1479424K->315776K(3002368K), 0.0939770 secs]
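Log lines in this format are easy to summarize mechanically. The sketch below is one possible way to do it, assuming only the simple `timestamp: [GC|Full GC before->after(total), pause secs]` layout shown above; the class name `GcLogSummary` and its methods are hypothetical, and logs written with extra flags such as -XX:+PrintGCDetails would need a different pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: summarizes simple -Xloggc lines like the ones above. */
class GcLogSummary {
    // uptime: [GC|Full GC beforeK->afterK(heapK), pause secs]
    private static final Pattern EVENT = Pattern.compile(
        "(\\d+\\.\\d+): \\[(Full GC|GC) (\\d+)K->(\\d+)K\\((\\d+)K\\), (\\d+\\.\\d+) secs\\]");

    static final class Event {
        final double uptimeSecs;
        final boolean full;
        final long beforeKb;
        final long afterKb;
        final long heapKb;
        final double pauseSecs;

        Event(double uptimeSecs, boolean full, long beforeKb,
              long afterKb, long heapKb, double pauseSecs) {
            this.uptimeSecs = uptimeSecs;
            this.full = full;
            this.beforeKb = beforeKb;
            this.afterKb = afterKb;
            this.heapKb = heapKb;
            this.pauseSecs = pauseSecs;
        }

        /** Kilobytes freed by this collection. */
        long reclaimedKb() {
            return beforeKb - afterKb;
        }
    }

    /** Extracts every GC event found in the text; unrecognized text is skipped. */
    static List<Event> parse(String log) {
        List<Event> events = new ArrayList<>();
        Matcher m = EVENT.matcher(log);
        while (m.find()) {
            events.add(new Event(
                Double.parseDouble(m.group(1)),
                m.group(2).equals("Full GC"),
                Long.parseLong(m.group(3)),
                Long.parseLong(m.group(4)),
                Long.parseLong(m.group(5)),
                Double.parseDouble(m.group(6))));
        }
        return events;
    }

    static double totalPauseSecs(List<Event> events) {
        double total = 0;
        for (Event e : events) {
            total += e.pauseSecs;
        }
        return total;
    }

    static long fullGcCount(List<Event> events) {
        long count = 0;
        for (Event e : events) {
            if (e.full) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String sample =
            "9.302: [GC 1146880K->75819K(3002368K), 0.0723170 secs]\n"
          + "9.819: [Full GC 152814K->75275K(3002368K), 0.2690010 secs]\n";
        List<Event> events = parse(sample);
        System.out.printf("events=%d fullGCs=%d totalPause=%.4fs%n",
            events.size(), fullGcCount(events), totalPauseSecs(events));
    }
}
```

Frequent Full GC entries early in the log, as in the sample above, are the kind of pattern a summary like this makes easy to spot.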


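JVM Garbage Collection

Jmap

Lets you take memory snapshots and dumps against a running, or even hung, JVM

jmap -F -J-d64 -dump:live,format=b,file=dump.hprof JAVA_PID

Note: -F forces a dump from a JVM that does not respond, and the live suboption is not supported in that mode. Drop -F to dump only live objects from a responsive JVM.

/usr/java/default/bin/jmap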
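A heap dump can also be triggered from inside the application, which is handy when nobody can get a shell on the box in time. The sketch below assumes a HotSpot-based JVM (Java 7 or later) and uses the `com.sun.management.HotSpotDiagnosticMXBean` that ships with it; the class name `HeapDumper` and the default file name are made up for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: writes an .hprof heap dump of the current JVM. */
class HeapDumper {
    /**
     * @param file     target path; dumpHeap fails if the file already exists
     * @param liveOnly true to dump only reachable objects (comparable to -dump:live)
     */
    static void dump(String file, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(file, liveOnly);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the working directory is writable and has room for the dump.
        String file = args.length > 0 ? args[0] : "dump.hprof";
        dump(file, true);
        System.out.println("heap dump written to " + file);
    }
}
```

Keep the `.hprof` extension on the file name; newer JDKs reject other extensions. The resulting file opens in the same tools as a jmap dump.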

Company Confidential 14

JVM (not) Garbage Collection

• java.lang.OutOfMemoryError: Java heap space

• java.lang.OutOfMemoryError: PermGen space

• java.lang.OutOfMemoryError: request <size> bytes for <reason>.


Garbage Collection: Tools we use

• Verbose GC logs - every JVM
• jstat - every 5 seconds
• jconsole - monitoring
• jvisualvm - visualization
• jmap - memory dumps
• YourKit - memory profiling
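The same counters that jstat and jconsole read are available in-process through the standard `java.lang.management` API. As a rough sketch of periodic sampling (the class name `GcSampler` and the output format are assumptions, not what the deck's team ran), something like this could log a line every 5 seconds:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Hypothetical sampler: prints heap usage and per-collector GC totals. */
class GcSampler {
    /** Builds one sample line from the platform MXBeans. */
    static String sample() {
        StringBuilder sb = new StringBuilder();
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        sb.append("heap usedKB=").append(heap.getUsed() / 1024)
          .append(" committedKB=").append(heap.getCommitted() / 1024);
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Counts and times are cumulative since JVM start; -1 means "undefined".
            sb.append(" | ").append(gc.getName())
              .append(" count=").append(gc.getCollectionCount())
              .append(" timeMs=").append(gc.getCollectionTime());
        }
        return sb.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        int samples = args.length > 0 ? Integer.parseInt(args[0]) : 3;
        for (int i = 0; i < samples; i++) {
            System.out.println(sample());
            if (i < samples - 1) {
                Thread.sleep(5000); // same interval as the jstat sampling above
            }
        }
    }
}
```

Because the values are cumulative, the interesting number is usually the difference between two consecutive samples.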

Thread Contention


• Full Java Thread Dump Java HotSpot(TM) 64-Bit Server VM 19.1-b02 Sun Microsystems Inc.
Number of threads: 545
Number of daemon threads: 485
Peak live thread count since the Java virtual machine started or peak was reset: 972
Is support for thread contention monitoring available on this JVM? [true]
Is thread contention monitoring enabled? [false]. If false, some thread synchronization statistics are not be available.
Is support for CPU time measurement for any thread available on this JVM? [true]
Is thread CPU time measurement enabled? [true]. If false, thread execution times are not available for any thread.
--------------------------------------------------------------------------------
Thread Execution Information:
-----------------------
Thread "http-thread-pool-8080-(862241)" thread-id: 879,948 thread-state: TIMED_WAITING Waiting on lock: java.util.concurrent.CountDownLatch$Sync@e3963a8
     at: sun.misc.Unsafe.park(Native Method)
     at: java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
     at: java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1011)
     at: java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1303)
     at: java.util.concurrent.CountDownLatch.await(CountDownLatch.java:253)
     at: com.contextweb.commons.client.http.batch.httpcorenio2.BatchClient$BatchClientJob.execute(BatchClient.java:115)
     at: com.contextweb.commons.client.http.batch.util.AbstractBatchExecutionJob.execute(AbstractBatchExecutionJob.java:18)
     at: com.contextweb.adserving.rt.buyer.RTBuyerBatchExecutorImpl.executeJob(RTBuyerBatchExecutorImpl.java:48)
     at: com.contextweb.adserving.rt.buyer.RTBuyerBidRequestProcessor.executeJob(RTBuyerBidRequestProcessor.java:135)

Thread Contention

• Thread dumps are your best friend
  • kill -3
  • jstack (-J-d64 -l -m)
• Look for:
  • Many threads blocked or waiting
  • Threadpool usage
  • Thread/resource pools missing items
  • Synchronization/concurrency issues
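The "look for" checks above can be partly automated from inside the JVM with `ThreadMXBean`. The following is a minimal sketch, assuming a HotSpot-style JVM; `ThreadStateSummary` is a hypothetical name, and a real dump from kill -3 or jstack still carries more detail than this.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical helper: counts threads per state and lists blocked ones. */
class ThreadStateSummary {
    /** Returns a count for every Thread.State, including states with zero threads. */
    static Map<Thread.State, Integer> countByState() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread.State s : Thread.State.values()) {
            counts.put(s, 0);
        }
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            if (info != null) {
                counts.put(info.getThreadState(), counts.get(info.getThreadState()) + 1);
            }
        }
        return counts;
    }

    /** Prints each BLOCKED thread with the lock it wants and the current owner. */
    static void printBlocked() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            if (info != null && info.getThreadState() == Thread.State.BLOCKED) {
                System.out.println(info.getThreadName()
                    + " blocked on " + info.getLockName()
                    + " held by " + info.getLockOwnerName());
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(countByState());
        printBlocked();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // findDeadlockedThreads also covers java.util.concurrent locks where supported.
        long[] deadlocked = threads.isSynchronizerUsageSupported()
            ? threads.findDeadlockedThreads()
            : threads.findMonitorDeadlockedThreads();
        System.out.println("deadlocked threads: " + (deadlocked == null ? 0 : deadlocked.length));
    }
}
```

A state histogram taken at intervals shows trends (for example a growing BLOCKED count) that a single dump cannot.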


Thread Contention

• Thread-dump analysis requires many different dumps

• lsof/netstat/other system tools often provide hints on contention points

• (historically) AppServers/libs shipped with ridiculously low defaults
  • Worker threads
  • DB connections
  • File descriptors (Linux)


Thread Contention: Tools we use

• Thread dumps - every 15 seconds
• Shutdown scripts
  • Auto thread dump 5x
  • lsof output
  • netstat output
  • (optional) heap dump

Resource Usage

• Memory
  • Don’t forget PermGen
  • Don’t forget stack (per thread)
• File descriptors
• DB/network connections
• Timeouts/expirations
  • Connection/read timeouts default to forever
  • LDAP/DNS timeouts default to forever
  • java.net.URL - equals()/hashCode() make a blocking DNS call
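For the timeout bullets, `java.net.URLConnection` is a concrete case: a timeout of 0 means "wait forever", and that is what you get unless you set one. A small sketch of setting both explicitly follows; the class name `TimeoutDefaults`, the URL, and the millisecond values are illustrative assumptions, not recommended settings.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLConnection;

/** Hypothetical helper: opens a connection with explicit timeouts. */
class TimeoutDefaults {
    static URLConnection open(String url, int connectMs, int readMs) throws IOException {
        // Parsing through URI avoids URL.equals()/hashCode(), which may resolve DNS.
        // openConnection() only creates the object; nothing is sent until connect().
        URLConnection conn = URI.create(url).toURL().openConnection();
        conn.setConnectTimeout(connectMs); // 0 would mean an infinite connect timeout
        conn.setReadTimeout(readMs);       // 0 would mean an infinite read timeout
        return conn;
    }

    public static void main(String[] args) throws IOException {
        URLConnection conn = open("http://localhost:8080/", 2000, 5000);
        System.out.println("connect timeout ms: " + conn.getConnectTimeout());
        System.out.println("read timeout ms: " + conn.getReadTimeout());
    }
}
```

The same habit applies to other clients: wherever a library exposes connect and read timeouts, set them rather than trusting the default.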


Resource Usage: Tools we use

• ab (ApacheBench)
  • Dirt simple
  • ab -n 10000000 -k -c 200 'http://localhost:8080/bls/check?urlreferrercheck=geI0qJVZa/Uu7;5DtxzgRilmw=='
• Siege
  • Mud simple
  • siege -c 20 'http://localhost:8080/bls/check?useragenthash=adfasasdasd'
• Instrumentation
  • Home-grown - RollingCounter
  • Timers
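The deck names a home-grown RollingCounter but does not show it, so the following is only a guess at the general idea: count events in per-second buckets and report the total over a sliding window. Every detail here (bucket size, API, the injected clock value) is an assumption made for illustration.

```java
/** Illustrative sliding-window counter; not the implementation from the talk. */
class RollingCounter {
    private final int windowSecs;
    private final long[] counts;    // events seen in each bucket
    private final long[] bucketSec; // which second each bucket currently holds

    RollingCounter(int windowSecs) {
        if (windowSecs <= 0) {
            throw new IllegalArgumentException("windowSecs must be positive");
        }
        this.windowSecs = windowSecs;
        this.counts = new long[windowSecs];
        this.bucketSec = new long[windowSecs];
    }

    /** Records one event at the given time (milliseconds, e.g. System.currentTimeMillis()). */
    synchronized void increment(long nowMillis) {
        long sec = nowMillis / 1000;
        int i = (int) (sec % windowSecs);
        if (bucketSec[i] != sec) {
            // The bucket still holds an older second: recycle it.
            bucketSec[i] = sec;
            counts[i] = 0;
        }
        counts[i]++;
    }

    /** Sums events recorded during the last windowSecs seconds, including the current one. */
    synchronized long sum(long nowMillis) {
        long sec = nowMillis / 1000;
        long total = 0;
        for (int i = 0; i < windowSecs; i++) {
            long age = sec - bucketSec[i];
            if (age >= 0 && age < windowSecs) {
                total += counts[i];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        RollingCounter requests = new RollingCounter(60);
        long now = System.currentTimeMillis();
        requests.increment(now);
        requests.increment(now);
        System.out.println("requests in last 60s: " + requests.sum(now));
    }
}
```

Passing the time in as a parameter keeps the counter easy to test. At very high request rates the single lock may become a contention point of its own, in which case per-bucket atomic counters would be one alternative.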


Conclusions

• Many free tools give insight
• You can never have enough historical data
• Many default settings will get you in trouble
