Date post: 16-Jan-2016
Page 1: Scalable Apache for Beginners Aaron Bannert aaron@apache.orgaaron@apache.org / aaron@codemass.comaaron@codemass.com.

Scalable Apache for Beginners

Aaron Bannert

aaron@apache.org / aaron@codemass.com



Measuring Performance

What is Performance?


How do we measure performance?

Benchmarks
  Requests per Second
  Bandwidth
  Latency
  Concurrency (Scalability)


Real-world Scenarios

Can benchmarks tell us how it will perform in the real world?


What makes a good Web Server?

Correctness
Reliability
Scalability
Stability
Speed


Correctness

Does it conform to the HTTP specification?

Does it work with every browser?

Does it handle erroneous input gracefully?


Reliability

Can you sleep at night?
Are you being paged during dinner?
Is it an appliance?


Scalability

Does it handle nominal load?

Have you been Slashdotted?

And did you survive?

What is your peak load?


Speed (Latency)

Does it feel fast?
Do pages snap in quickly?
Do users often reload pages?


Apache the General Purpose Webserver

Apache developers strive for correctness first, and speed second.


Apache 1.3

Fast enough for most sites
  Particularly on 1 and 2 CPU systems.


Apache 2.0

Adds more features
  filters
  threads
  portability (has excellent Windows support)

Scales to much higher loads.


Apache HTTP Server

Architecture Overview


Classic “Prefork” Model

Apache 1.3, and Apache 2.0 Prefork

Many Children
Each child handles one connection at a time.

[Diagram: one Parent process with 100s of Child processes]


Multithreaded “Worker” Model

Apache 2.0 Worker

Few Children
Each child handles many concurrent connections.

[Diagram: one Parent process with 10s of Child processes, each running 10s of threads]


Dynamic Content: Modules

Extensive API
Pluggable Interface
Dynamic or Static Linkage


In-process Modules

Run from inside the httpd process
  CGI (mod_cgi)
  mod_perl
  mod_php
  mod_python
  mod_tcl


Out-of-process Modules

Processing happens outside of httpd (e.g. an Application Server)
  Tomcat
    mod_jk/jk2, mod_jserv
  mod_proxy
  mod_jrun

[Diagram: Parent with several Child processes; a Child connects to Tomcat]
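
As a rough illustration of the out-of-process approach, the fragment below sketches how mod_proxy might forward requests to a separate application server. The /app path and the localhost:8080 backend address are assumptions made for this example; they do not come from the slides.

```apache
# Hypothetical backend: an application server listening on localhost:8080.
# Assumes mod_proxy and mod_proxy_http are loaded.

# Act only as a reverse proxy, never as an open forward proxy.
ProxyRequests Off

# Forward /app to the backend and rewrite redirect headers on the way back.
ProxyPass        /app http://localhost:8080/app
ProxyPassReverse /app http://localhost:8080/app
```

With a setup like this, httpd children stay small and the heavy processing lives in the backend process.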


Architecture: The Big Picture

[Diagram labels: Parent; Child processes (10s), each with 10s of threads; modules mod_jk, mod_rewrite, mod_php, mod_perl; Tomcat; DB; 100s of threads]


Terms and Definitions

Terms from the Documentation and the Configuration


“HTTP”

HyperText Transfer Protocol

A network protocol used to communicate between web servers and web clients (e.g. a Web Browser).


“Request” and “Response”

Web browsers request pages and web servers respond with the result.

[Diagram: Web Browser (Mosaic) sends a Request to Web Server (Apache), which returns a Response]


“MPM”

Multi-Processing Module
An MPM defines how the server will receive and manage incoming requests.
Allows OS-specific optimizations.
Allows vastly different server models (e.g. threaded vs. multiprocess).


“Child Process” aka “Server”

Called a “Server” in httpd.conf

A single httpd process.

May handle one or more concurrent requests (depending on the MPM).

[Diagram: one Parent with 100s of Child processes; the children are labeled “Servers”]


“Parent Process”

The main httpd process.

Does not handle connections itself.

Only creates and destroys children.

[Diagram: one Parent with 100s of Child processes; the Parent is labeled “Only one Parent”]


“Client”

Single HTTP connection (e.g. from a web browser).
Note that many web browsers open up multiple connections.
Apache treats each connection as a separate client.

[Diagram: Web Browser (Mosaic) connected to Web Server (Apache)]


“Thread”

In multi-threaded MPMs (eg. Worker).

Each thread handles a single connection.

Allows Children to handle many connections at once.


Apache Configuration

httpd.conf walkthrough


Prefork MPM

Apache 1.3 and Apache 2.0 Prefork
Each child handles one connection at a time
Many children
High memory requirements

“You’ll run out of memory before CPU”


Prefork Directives (Apache 2.0)

StartServers
MinSpareServers
MaxSpareServers
MaxClients
MaxRequestsPerChild
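
To make these directives concrete, here is a minimal sketch of a prefork section in httpd.conf. The numbers are illustrative assumptions, not recommendations from the talk. One common rule of thumb (also an assumption here) is to keep MaxClients below the memory available to Apache divided by the typical size of one child.

```apache
# Illustrative values only; tune against measured per-child memory use.
<IfModule prefork.c>
    # Children created at startup.
    StartServers          5
    # Keep at least this many idle children ready for new connections.
    MinSpareServers       5
    # Kill idle children above this count.
    MaxSpareServers      10
    # Hard cap on children, i.e. on simultaneous connections under prefork.
    MaxClients          150
    # 0 means a child is never recycled after a fixed number of requests.
    MaxRequestsPerChild   0
</IfModule>
```

Setting MaxRequestsPerChild to a non-zero value can limit the damage from a leaky module, at the cost of extra fork() calls.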


Worker MPM

Apache 2.0 and later
Multithreaded within each child
Dramatically reduced memory footprint
Only a few children (fewer than prefork)


Worker Directives

MinSpareThreads
MaxSpareThreads
ThreadsPerChild
MaxClients
MaxRequestsPerChild
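
A comparable sketch for the Worker MPM follows. The values are again illustrative assumptions. StartServers is not on the slide but is a standard directive that usually appears alongside the others.

```apache
# Illustrative values only.
<IfModule worker.c>
    # Child processes created at startup.
    StartServers          2
    # Cap on simultaneous connections, counted in threads across all children.
    # 150 threads / 25 threads per child = at most 6 children.
    MaxClients          150
    # Idle-thread bounds, counted across the whole server.
    MinSpareThreads      25
    MaxSpareThreads      75
    # Fixed number of worker threads in each child.
    ThreadsPerChild      25
    MaxRequestsPerChild   0
</IfModule>
```

Under Worker, MaxClients counts threads rather than processes, which is why a handful of children can serve the same load as hundreds of prefork children.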


KeepAlive Requests

Persistent connections
Multiple requests over one TCP socket

Directives:
  KeepAlive
  MaxKeepAliveRequests
  KeepAliveTimeout
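
A minimal sketch of the three KeepAlive directives together; the values are assumptions chosen for illustration.

```apache
# Allow several requests over one TCP connection.
KeepAlive            On
# Close the connection after this many requests (0 would mean unlimited).
MaxKeepAliveRequests 100
# Seconds to wait for the next request before closing an idle connection.
KeepAliveTimeout     15
```

An idle keep-alive connection still occupies a child (prefork) or a thread (worker), so a long KeepAliveTimeout trades client latency against server capacity.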


Apache 1.3 and 2.0 Performance Characteristics

Multi-process, Multi-threaded, or Both?


Prefork

High memory usage
Highly tolerant of faulty modules
Highly tolerant of crashing children
Fast
Well-suited for 1 and 2-CPU systems
Tried-and-tested model from Apache 1.3
“You’ll run out of memory before CPU.”


Worker

Low to moderate memory usage
Moderately tolerant of faulty modules
  Faulty threads can affect all threads in child
Highly-scalable
Well-suited for multiple processors
Requires a mature threading library (Solaris, AIX, Linux 2.6 and others work well)

Memory is no longer the bottleneck.


Important Performance Considerations

sendfile() support
DNS considerations
stat() calls
Unnecessary modules


sendfile() Support

No more double-copy
Zero-copy*
Dramatic improvement for static files
Available on
  Linux 2.4.x
  Solaris 8+
  FreeBSD/NetBSD/OpenBSD
  ...

* Zero-copy requires both OS support and NIC driver support.
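
In Apache 2.0.44 and later, sendfile use can be switched with the EnableSendfile directive. The sketch below assumes a hypothetical network-mounted directory, /mnt/nfs/htdocs, where sendfile is turned off as a precaution.

```apache
# Use the kernel's sendfile() for static files where the OS supports it.
EnableSendfile On

# Hypothetical NFS-mounted content: sendfile can misbehave on network
# filesystems, so fall back to ordinary read/write here.
<Directory "/mnt/nfs/htdocs">
    EnableSendfile Off
</Directory>
```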


DNS Considerations

HostNameLookups
  DNS query for each incoming request
  Use logresolve instead.

Name-based Allow/Deny clauses
  Two DNS queries per request for each allow/deny clause.
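
Both points can be sketched in configuration. The directory path and the 192.0.2.0/24 range below are placeholders assumed for this example.

```apache
# Log client IP addresses; resolve names later, offline, with logresolve.
HostnameLookups Off

# Hypothetical restricted area. An address range needs no DNS lookups,
# whereas a rule like "Allow from example.com" would trigger them.
<Directory "/usr/local/apache2/htdocs/private">
    Order deny,allow
    Deny from all
    Allow from 192.0.2.0/24
</Directory>
```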


stat() for Symlinks

Options FollowSymLinks
  Symlinks are trusted.

Options SymLinksIfOwnerMatch
  Must stat() and lstat() each symlink, yuck!
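
A minimal sketch of the cheaper setting. Applying it at the filesystem root is an assumption of this example; it only avoids the extra calls if no narrower section switches back to SymLinksIfOwnerMatch.

```apache
# Trust symlinks everywhere so httpd does not need to lstat() each
# path component to check link ownership.
<Directory />
    Options FollowSymLinks
</Directory>
```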


stat() for .htaccess files

AllowOverride
  stat() for .htaccess in each path component of a request
  Happens for any AllowOverride setting other than None
  Try to disable or limit to specific sub-dirs
  Avoid use at the DocumentRoot
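
One way to follow this advice is sketched below. The users sub-directory and the choice of AuthConfig are hypothetical; they stand for any area that really needs .htaccess files.

```apache
# No .htaccess lookups anywhere by default.
<Directory />
    AllowOverride None
</Directory>

# Hypothetical sub-directory where .htaccess files are still allowed,
# limited to authentication directives.
<Directory "/usr/local/apache2/htdocs/users">
    AllowOverride AuthConfig
</Directory>
```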


stat() for Content Negotiation

DirectoryIndex
  Don’t use wildcards like “index”
  Use something like this instead

    DirectoryIndex index.html index.php index.shtml

mod_negotiation
  Use a type-map instead of MultiViews if possible
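
A sketch of the mod_negotiation side, assuming type-map files use the .var extension and that the DocumentRoot path shown is only a placeholder.

```apache
# Hypothetical DocumentRoot: turn off MultiViews so httpd does not scan
# the directory for candidate files on each negotiated request.
<Directory "/usr/local/apache2/htdocs">
    Options -MultiViews
</Directory>

# Treat *.var files as type-maps that list the available variants explicitly.
AddHandler type-map .var
```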


Remove Unused Modules

Saves Memory
  Reduces code and data footprint

Reduces some processing (e.g. filters)
Makes calls to fork() faster

Static modules are faster than dynamic
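
For dynamically loaded modules, removal can be as simple as commenting out LoadModule lines. Which modules a site needs is an assumption in this sketch. Statically linked modules can only be dropped at build time.

```apache
# Modules this hypothetical site actually uses.
LoadModule dir_module  modules/mod_dir.so
LoadModule mime_module modules/mod_mime.so

# Not used on this hypothetical site, so left commented out.
#LoadModule autoindex_module modules/mod_autoindex.so
#LoadModule userdir_module   modules/mod_userdir.so
```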


Testing Performance

Benchmarking Tools


Some Popular (Free) Tools

ab
flood
httperf
JMeter
...and many others


ab

Simple Load on a Single URL
Comes with Apache
Good for sanity check
Scales poorly


flood

Profile-driven load tester
Useful for generating real-world scenarios
I co-authored it
Part of the httpd-test project at the ASF
Built to be highly-scalable
Designed to be extremely flexible


JMeter

Has a graphical interface
Built on Java
Part of Apache Jakarta project
Depends heavily on JVM performance


Benchmarking Metrics

What are we interested in testing?
Recall that we want our web server to be
  Correct
  Reliable
  Scalable
  Stable
  Fast


Benchmarking Metrics: Correctness

No errors
No data corruption
Protocol compliant

Should not be an everyday concern for admins


Benchmarking Metrics: Reliability

MTBF - Mean Time Between Failures

Difficult to measure programmatically
Easy to judge subjectively


Benchmarking Metrics: Scalability

Predicted concurrency
Maximum concurrent connections
Requests per Second (rps)
Concurrent Users


Benchmarking Metrics: Stability

Consistency, Predictability
Errors per Thousand
Correctness under Stress
  Never returns invalid information

Common problem with custom web-apps
  Works well with 10 users, but chokes on 1000.


Benchmarking Metrics: Speed

Requests per Second (rps)
Latency
  time until connected
  time to first byte
  time to last byte
  time to close

Easy to test with current tools
Highly related to Scalability/Concurrency


Method

1. Define the problem
   e.g. Test Max Concurrency, Correctness, etc...

2. Narrow the scope of the problem
   Simplify the problem

3. Use tools to collect data

4. Come up with a hypothesis

5. Make minimal changes, retest


Troubleshooting

Common pitfalls and their solutions


Check your error_log

The first place to look
Increase the LogLevel if needed

Make sure to turn it back down (but not off) in production
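
A minimal sketch of raising and lowering the log level. The log file path is an assumption; the level names are standard LogLevel values.

```apache
# Hypothetical log location, relative to ServerRoot.
ErrorLog logs/error_log

# While troubleshooting: very verbose.
LogLevel debug

# Back in production, swap the line above for a quieter (but not silent) level:
# LogLevel warn
```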


Check System Health

vmstat, systat, iostat, mpstat, lockstat, etc...

Check interrupt load
  NIC might be overloaded

Are you swapping memory?
  A web server should never swap

Check system logs
  /var/log/messages, /var/log/syslog, etc...


Check Apache Health

server-status
  ExtendedStatus (see next slide)

Verify “httpd -V”

ps -elf | grep httpd | wc -l

How many httpd processes are running?
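
The server-status page comes from mod_status. A minimal sketch of enabling it follows; restricting access to 127.0.0.1 is an assumption of this example, and the module must be loaded or compiled in.

```apache
# Collect per-request details for the status page (adds a little overhead).
ExtendedStatus On

<Location /server-status>
    SetHandler server-status
    # Only allow requests from the local machine.
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>
```

Requesting /server-status from the server itself then shows what each child or thread is doing.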


server-status Example


Other Possibilities

Set up a staging environment
Set up duplicate hardware

Check for known bugs
  http://nagoya.apache.org/bugzilla/


Common Bottlenecks

No more File Descriptors
Sockets stuck in TIME_WAIT
High Memory Use (swapping)
CPU Overload
Interrupt (IRQ) Overload


File Descriptors

Symptoms
  entry in error_log
  new httpd children fail to start
  fork() failing across the system

Solutions
  Increase system-wide limits
  Increase ulimit settings in apachectl


TIME_WAIT

Symptoms
  Unable to accept new connections
  CPU under-utilized, httpd processes sit idle
  Not Swapping
  netstat shows huge numbers of sockets in TIME_WAIT

Many TIME_WAIT are to be expected
  Only when new connections are failing is it a problem

Decrease system-wide TCP/IP FIN timeout


Memory Overload, Swapping

Symptoms
  Ignore system free memory, it is misleading!
  Lots of Disk Activity
  top/free show high swap usage
  Load gradually increasing
  ps shows processes blocking on Disk I/O

Solutions
  Add more memory
  Use less dynamic content, cache as much as possible
  Try the Worker MPM


How much free memory do I really have?

Output from top/free is misleading.
Kernels use buffers
File I/O uses cache
Programs share memory
  Explicit shared memory
  Copy-On-Write after fork()

The only time you can be sure is when it starts swapping.


CPU Overload

Symptoms
  top shows little or no idle CPU time
  System is not Swapping
  High system load
  System feels sluggish
  Much of the CPU time is spent in userspace

Solutions
  Add another CPU, get a faster machine
  Use less dynamic content, cache as much as possible


Interrupt (IRQ) Overload

Symptoms
  Frequent on big machines (8-CPUs and above)
  Not Swapping
  One or two CPUs are busy, the rest are idle
  Low overall system load

Solutions
  Add another NIC
    bind it to the first or use two IP addresses in Apache
    put NICs on different PCI busses if possible
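
The "two IP addresses in Apache" option might look like the sketch below, with one address per NIC. Both addresses are documentation-range placeholders assumed for this example.

```apache
# One Listen line per NIC address; both placeholders are hypothetical.
Listen 192.0.2.10:80
Listen 192.0.2.11:80
```

Clients still need to be spread across the two addresses, for instance with multiple DNS records, for the interrupt load to be shared.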


Next Generation Improvements


Linux 2.6

NPTL and NGPT
  Next-Gen Thread Libraries for Linux
  Available in RedHat 9 already

O(1) scheduling patch
Preemptive Kernel patch

All improvements affect Apache, but the Worker MPM will likely be the most affected.


Solaris 9

1:1 threads
  Decreases thread library overhead
  Improves CPU load sharing

sendfile()-like support (since late Solaris 7)
  Zero-copy


64-bit Native Support

Sparc had it for a long time
G5s now have it (sort-of)
AMD64 (Opteron and Athlon64) have it

Noticeable improvement in Apache 2.0
  Increased Requests-per-second
  Faster 64-bit time calculations

Huge Virtual Memory Address-space
  mmap/sendfile


The End

Thank You!

