
Post on 26-Jan-2015


FP and Performance: Beyond Big “O” Notation

Jamie Allen, Director of Consulting

Who Am I?

• Director of Consulting for
• Author of Effective Akka
• @jamie_allen

Big “O” Notation


Many Developers Don’t Look Further


Reactive Applications


What is Performance?


• Throughput
• Latency
• Footprint
• Power Consumption!

There are no standards

The only rules you must follow are your non-functional requirements!


We Love Functional Programming!
• But what is it?
• Just first-class functions?
• Referential transparency?
• Immutability?
• Category theory?

Abstractions!
• They help us reason about our logic
• Decoupling
• Simplicity
• Correctness
• Reuse

Double-Edged Sword
• We live in a world of abstractions already!
• Languages on the JVM are DSLs for bytecode
• Bytecode is an abstraction over macro instructions
• Macro instructions are an abstraction over micro instructions

JVM == Imperative
• The JVM is built to execute imperative logic very fast
• The more we stray from imperative logic constructs, the more we pay in terms of performance

Penalties
• Increased allocations
• More bytecode executed to perform tasks
• Less control over runtime performance

There Is a Fine Line Here
• We love to write “elegant” code, but “elegant” is defined by your personal aesthetic
• Some people love prefix notation and s-expressions
• Some people see beauty in C-style constructs

Rap Genius


What About The Environment?


We Can’t Ignore the Cost!


Languages Impose Constraints
• “Choose-a-phone” languages versus those with strictly defined language constructs

Languages “Pick” Abstractions for You
• Example: CSP versus Actor models
• Why must we choose?
• Why can’t we have both in libraries?
• Why can’t both be relevant to solving problems in an application?

We Need to Be Treated Like Adults!
• Every program will behave differently
• Choosing a language that imposes strict rules forces us to make ugly choices

How Many Libs Have Performance Tests?
• Very few
• You give up control of your ability to optimize when you use libraries

Asynchrony
• A wonderful tool for leveraging cores on a machine
• Not faster than a single thread per se
• We must pay attention to Amdahl’s Law
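Amdahl’s Law is why asynchrony is not a free lunch: the serial fraction of the work bounds the speedup regardless of core count. A minimal sketch of the formula (the `Amdahl` class and its numbers are illustrative, not from the talk):

```java
// Amdahl's Law: speedup(p, n) = 1 / ((1 - p) + p / n), where p is the
// parallelizable fraction of the work and n is the number of cores.
public class Amdahl {
    static double speedup(double parallelFraction, int cores) {
        return 1.0 / ((1.0 - parallelFraction) + parallelFraction / cores);
    }

    public static void main(String[] args) {
        // Even with 95% of the work parallelizable, 8 cores give under 6x.
        System.out.printf("p=0.95, n=8    -> %.2fx%n", speedup(0.95, 8));
        // The serial 5% caps the speedup below 20x no matter how many cores.
        System.out.printf("p=0.95, n=1000 -> %.2fx%n", speedup(0.95, 1000));
    }
}
```

The takeaway matches the slide: adding cores attacks only the parallel term, so profiling the serial portion usually pays off more than adding hardware.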

Hardware
From the bottom up, computers understand queues and message passing

Things to Avoid
• Parallel Collections
• STM
• Returns out of closures
• Boxing
• Lambdas! :)
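The boxing item above is easy to demonstrate. In the sketch below (class and method names are mine, and this is not benchmark-quality code; use JMH for real measurements), the boxed accumulator forces an object allocation on nearly every iteration while the primitive version allocates nothing:

```java
// Boxing penalty: a Long accumulator is unboxed, added, and re-boxed on
// every loop iteration, allocating a new Long object each time; the
// primitive long version does the same arithmetic with zero allocation.
public class BoxingDemo {
    static long sumBoxed(int n) {
        Long sum = 0L;                 // boxed accumulator
        for (int i = 0; i < n; i++) {
            sum += i;                  // unbox, add, re-box
        }
        return sum;
    }

    static long sumPrimitive(int n) {
        long sum = 0L;                 // primitive accumulator
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        // Same answer, very different allocation profiles.
        System.out.println(sumBoxed(n) == sumPrimitive(n));
    }
}
```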

Digression: Don’t Use OS X
• It’s lousy for doing performance analysis
• The JVM has issues on it, particularly with visibility

Tools Also Have Issues
• Many tools only give you information after the JVM is at a safepoint
• What does that tell us about our memory consumption or thread stacks?

Tools That I Trust
• jHiccup (pauses and stalls)
• Java Microbenchmark Harness (JMH)
• PrintGCStats (post-mortem tool on GC output in a file)
• GC:
  • jClarity Censum
  • GarbageCat
  • VisualVM with the Visual GC plugin
• htop (visualize user vs. kernel usage)
• dstat (replaces vmstat, iostat and ifstat)
  • --cpu, --mem, --aio, --top-io, --top-bio
• OProfile (system-wide Linux profiler)

Pinning to Cores
• Use numactl for pinning to both cores and sockets
• taskset pins to cores only, with no control over cross-socket communication (which carries latency penalties), but it is useful for laptop tests
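For reference, the two pinning approaches above might look like this on a Linux box (`app.jar` is a hypothetical application; the core and node numbers are arbitrary):

```shell
# taskset: core affinity only -- fine for laptop tests.
taskset -c 0-3 java -jar app.jar

# numactl: bind to the cores AND the memory of NUMA node 0,
# avoiding cross-socket memory traffic on multi-socket machines.
numactl --cpunodebind=0 --membind=0 java -jar app.jar
```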

JVM Flags to Use
• Pay close attention to the output from:
  • -XX:+PrintGCDetails
  • -XX:+PrintGCDateStamps
  • -XX:+PrintGCCause
  • -XX:+PrintGCApplicationStoppedTime
  • -XX:+PrintTenuringDistribution (age of objects in the survivor spaces)
  • -XX:+PrintSafepointStatistics (reasons and timings)
  • -XX:+PrintFlagsFinal (shows the entire JVM config)
  • -XX:+PrintAssembly (requires -XX:+UnlockDiagnosticVMOptions)
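A typical GC-logging invocation combining several of these flags might look like the following (`app.jar` is hypothetical; these are the HotSpot 7/8-era flag names from the slide, most of which were replaced by unified `-Xlog:gc*` logging in JDK 9):

```shell
java -XX:+PrintGCDetails \
     -XX:+PrintGCDateStamps \
     -XX:+PrintGCCause \
     -XX:+PrintGCApplicationStoppedTime \
     -XX:+PrintTenuringDistribution \
     -Xloggc:gc.log \
     -jar app.jar
```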

Don’t Overuse Memory Barriers
• Volatile variables aren’t required for every field
• Think about how to organize updates to fields, then update ONE volatile var last to publish them all
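One way to read the advice above is the safe-publication idiom: keep the individual fields plain, do all the writes, then store to a single volatile field last, so that one volatile write publishes everything written before it (via happens-before). A sketch with hypothetical field names:

```java
// Safe publication via ONE volatile write: the plain writes to x and y
// happen-before the volatile write to ready, so any thread that reads
// ready == true is guaranteed to see the published x and y values.
public class Publisher {
    private int x;                  // plain fields: no barrier per write
    private int y;
    private volatile boolean ready; // the single volatile used to publish

    void publish(int newX, int newY) {
        x = newX;
        y = newY;
        ready = true;               // one volatile write publishes both fields
    }

    int readSum() {
        if (ready) {                // volatile read pairs with the write above
            return x + y;
        }
        return -1;                  // nothing published yet
    }

    public static void main(String[] args) {
        Publisher p = new Publisher();
        System.out.println(p.readSum()); // -1: not yet published
        p.publish(3, 4);
        System.out.println(p.readSum()); // 7
    }
}
```

This keeps the per-write barrier cost to a single volatile store instead of one per field.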

Thank You!
