Post on 26-Mar-2015
CS193H: High Performance Web Sites
Lecture 8: Rule 4 – Gzip Components
Steve Souders
Google
souders@cs.stanford.edu
Announcements
Web 100 Performance Profile (round 1) class project has been graded; contact Aravind if you want to know your grade
Compression (encoding)
typically reduces size by 70%
(6230 - 2066) / 6230 = 67%

GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 (…) Gecko/2008070208 Firefox/3.0.1

HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
Content-Length: 6230

function d(s) {...

GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 (…) Gecko/2008070208 Firefox/3.0.1
Accept-Encoding: gzip,deflate

HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
Content-Length: 2066
Content-Encoding: gzip

XmoÛHþ\ÿFÖvã*wØoq...
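The savings arithmetic on this slide can be reproduced with a few lines of Python. This is only a sketch: the `savings` helper and the repeated `script` payload are made up for illustration, and the ratio you get depends heavily on the content being compressed.

```python
import gzip


def savings(original_size: int, compressed_size: int) -> float:
    """Fraction of bytes saved by compression."""
    return (original_size - compressed_size) / original_size


# The slide's numbers: 6230 bytes uncompressed, 2066 bytes gzipped.
print(f"{savings(6230, 2066):.0%}")  # prints 67%

# Hypothetical stand-in for a script file (not the Blogger script from the slide).
script = b"function d(s) { return document.getElementById(s); }\n" * 100
compressed = gzip.compress(script)

# Compression is lossless: decompressing gives back the original bytes.
assert gzip.decompress(compressed) == script
print(len(script), len(compressed), f"{savings(len(script), len(compressed)):.0%}")
```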
Gzip vs. Deflate
gzip (default settings) compresses more
                      Gzip               Deflate
            Size      Size    Savings    Size    Savings
Script      3.3K      1.1K    67%        1.1K    66%
Script      39.7K     14.5K   64%        16.6K   58%
Stylesheet  1.0K      0.4K    56%        0.5K    52%
Stylesheet  14.1K     3.7K    73%        4.7K    67%
Pros and Cons
Pro: smaller transfer size
Con: CPU cycles, on client and server
Don't compress resources < 1K
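One way to see why tiny resources are poor candidates: the gzip container adds a fixed header and trailer, so a very small payload can come out larger than it went in, on top of the CPU cost. The sketch below assumes a hypothetical `worth_compressing` helper and reads the slide's 1K rule of thumb as 1024 bytes.

```python
import gzip

MIN_COMPRESS_BYTES = 1024  # the slide's "1K" rule of thumb, assumed to mean 1024 bytes


def worth_compressing(body: bytes, threshold: int = MIN_COMPRESS_BYTES) -> bool:
    """Hypothetical helper: skip compression for bodies below the threshold."""
    return len(body) >= threshold


# A made-up 12-byte stylesheet: the gzipped form is larger than the original
# because of the gzip header and trailer.
tiny = b"a{color:red}"
print(len(tiny), len(gzip.compress(tiny)))
print(worth_compressing(tiny))  # False
```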
Gzip configuration
Apache 1.3: mod_gzip
mod_gzip_item_include file \.html$
mod_gzip_item_include mime ^text/html$
mod_gzip_item_include file \.js$
mod_gzip_item_include mime ^application/x-javascript$
mod_gzip_item_include file \.css$
mod_gzip_item_include mime ^text/css$

Apache 2.x: mod_deflate
AddOutputFilterByType DEFLATE text/html text/css application/x-javascript
control compression level: DeflateCompressionLevel
http://httpd.apache.org/docs/2.0/mod/mod_deflate.html
Gzip: not just for HTML
gzip scripts, stylesheets, XML, JSON (not images, Flash, PDF)

March 2007
HTML Scripts Stylesheets
amazon.com x
aol.com x some some
cnn.com
ebay.com x
froogle.google.com x x x
msn.com x deflate deflate
myspace.com x x x
wikipedia.org x x x
yahoo.com x x x
youtube.com x some some

October 2008
HTML Scripts Stylesheets
aol.com x x x
ebay.com x some
facebook.com x x x
google.com/search x x na
search.live.com/results x x x
msn.com x x x
myspace.com x x x
en.wikipedia.org/wiki x some some
yahoo.com x x x
youtube.com x x x
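The "compress text, skip already-compressed formats" rule can be sketched as a simple content-type check. The `COMPRESSIBLE_TYPES` set and the `is_compressible` helper are illustrative assumptions, not a complete or authoritative list of MIME types.

```python
# Text formats the slide says to gzip: HTML, scripts, stylesheets, XML, JSON.
# This set is an illustrative assumption; a real server would need a fuller list.
COMPRESSIBLE_TYPES = {
    "text/html",
    "text/css",
    "application/x-javascript",
    "text/xml",
    "application/json",
}


def is_compressible(content_type: str) -> bool:
    """Return True if the response's Content-Type is a text format worth gzipping."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    mime = content_type.split(";", 1)[0].strip().lower()
    return mime in COMPRESSIBLE_TYPES


print(is_compressible("text/html; charset=utf-8"))  # True
print(is_compressible("image/png"))                 # False: already compressed
print(is_compressible("application/pdf"))           # False: already compressed
```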
Edge Case: Proxies
Proxy    Origin Server
1 GET main.js Accept-Encoding: gzip
2 GET main.js Accept-Encoding: gzip
3 main.js Content-Encoding: gzip
4 main.js Content-Encoding: gzip
5 main.js Content-Encoding: gzip
6 GET main.js (no Accept-Encoding)
7 main.js Content-Encoding: gzip
proxies may serve gzipped content to browsers that don't support it, and vice versa
Edge Case: Proxies w/ Vary
Proxy    Origin Server
1 GET main.js Accept-Encoding: gzip
2 GET main.js Accept-Encoding: gzip
3 main.js Content-Encoding: gzip Vary: Accept-Encoding
4 main.js Content-Encoding: gzip [Accept-Encoding: gzip]
5 main.js Content-Encoding: gzip
6 GET main.js (no Accept-Encoding)
7 GET main.js (no Accept-Encoding)
8 main.js Vary: Accept-Encoding
9 main.js [Accept-Encoding: ]
10 main.js (no gzip)
11 GET main.js Accept-Encoding: gzip
12 main.js Content-Encoding: gzip
13 GET main.js (no Accept-Encoding)
14 main.js (no gzip)
add Vary: Accept-Encoding
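The two proxy scenarios can be sketched as a toy cache. Everything here is a simplified assumption for illustration: `origin` and `CachingProxy` are hypothetical names, responses are plain dicts of headers, and real proxies implement much more of HTTP caching than this.

```python
def origin(request_headers: dict, send_vary: bool) -> dict:
    """Toy origin server: gzips only when the request advertises gzip support."""
    response = {}
    if "gzip" in request_headers.get("Accept-Encoding", ""):
        response["Content-Encoding"] = "gzip"
    if send_vary:
        response["Vary"] = "Accept-Encoding"
    return response


class CachingProxy:
    """Toy proxy cache that stores one entry per Accept-Encoding value when told to vary."""

    def __init__(self, send_vary: bool):
        self.send_vary = send_vary
        self.cache = {}  # url -> list of (request Accept-Encoding, response headers)

    def get(self, url: str, request_headers: dict) -> dict:
        accept = request_headers.get("Accept-Encoding", "")
        for cached_accept, response in self.cache.get(url, []):
            varies = response.get("Vary") == "Accept-Encoding"
            # Without Vary, any cached copy is reused regardless of the request.
            if not varies or cached_accept == accept:
                return response
        response = origin(request_headers, self.send_vary)
        self.cache.setdefault(url, []).append((accept, response))
        return response


# Without Vary: the second browser gets gzipped content it did not ask for.
proxy = CachingProxy(send_vary=False)
proxy.get("main.js", {"Accept-Encoding": "gzip"})
print(proxy.get("main.js", {}))

# With Vary: Accept-Encoding, the proxy keeps separate copies per request header.
proxy = CachingProxy(send_vary=True)
proxy.get("main.js", {"Accept-Encoding": "gzip"})
print(proxy.get("main.js", {}))
```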
Edge Case: Bad Browsers
< 1% of browsers have problems with gzip
IE 5.5: http://support.microsoft.com/default.aspx?scid=kb;en-us;Q313712
IE 6.0: http://support.microsoft.com/default.aspx?scid=kb;en-us;Q31249
Netscape 3.x, 4.x: http://www.schroepl.net/projekte/mod_gzip/browser.htm
User-Agent white list for gzip
Apache 1.3:
mod_gzip_item_include reqheader "User-Agent: MSIE [6-9]"
mod_gzip_item_include reqheader "User-Agent: Mozilla/[5-9]"
Apache 2.0:
BrowserMatch ^MSIE [6-9] gzip
BrowserMatch ^Mozilla/[5-9] gzip
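The same white-list idea can be sketched in Python with regular expressions. Note one deliberate difference from the Apache 2.0 lines above: the MSIE pattern here is not anchored with `^`, on the assumption that Internet Explorer User-Agent strings begin with `Mozilla/4.0 (compatible; MSIE ...`. The `browser_may_gzip` helper and the sample User-Agent strings are illustrative.

```python
import re

# Patterns modelled on the slide's white list. The MSIE pattern is unanchored
# because IE User-Agent strings are assumed to start with "Mozilla/4.0 (compatible; MSIE ...".
GZIP_WHITELIST = [
    re.compile(r"MSIE [6-9]"),
    re.compile(r"^Mozilla/[5-9]"),
]


def browser_may_gzip(user_agent: str) -> bool:
    """Return True if the User-Agent matches any white-listed pattern."""
    return any(pattern.search(user_agent) for pattern in GZIP_WHITELIST)


print(browser_may_gzip("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"))  # True
print(browser_may_gzip("Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"))      # False
print(browser_may_gzip("Mozilla/4.08 [en] (WinNT; U)"))                        # False
```

A white list keyed on User-Agent has the proxy-caching consequences discussed on the next slide.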
Edge Case: Bad Browsers (cont'd)
proxies could mix up responses
give cached response from useragent1 to useragent2
could add Vary: User-Agent
so many possibilities, defeats proxy caching
better to add Cache-Control: private
downside: disables all proxy caches
is it a serious problem?
hard to diagnose; problem getting smaller
Edge Case: ETags
what happens when proxy makes Conditional GET requests?
Last-Modified date for gzipped vs. ungzipped is different => If-Modified-Since works fine
ETag is the same in Apache for gzipped & ungzipped => If-None-Match succeeds, proxy could give browser mismatched content
remove ETags! (Rule 13)
http://issues.apache.org/bugzilla/show_bug.cgi?id=39727
Edge Case: ETags present
Proxy    Origin Server
1 GET main.js Accept-Encoding: gzip
2 GET main.js Accept-Encoding: gzip
3 main.js Content-Encoding: gzip Cache-Control: max-age=0 ETag: "de158-e58-c7ee4140"
4 main.js Content-Encoding: gzip Cache-Control: max-age=0 ETag: "de158-e58-c7ee4140"
5 main.js Content-Encoding: gzip
6 GET main.js (no Accept-Encoding)
7 GET main.js If-None-Match: "de158-e58-c7ee4140"
8 304 Not Modified
9 main.js Content-Encoding: gzip
proxy gives browser mismatched content
Edge Case: ETags removed
Proxy    Origin Server
1 GET main.js Accept-Encoding: gzip
2 GET main.js Accept-Encoding: gzip
3 main.js Content-Encoding: gzip Cache-Control: max-age=0 Last-Modified: Thu, 21 Aug 2008 23:53:57 GMT
4 main.js Content-Encoding: gzip Cache-Control: max-age=0 Last-Modified: Thu, 21 Aug 2008 23:53:57 GMT
5 main.js Content-Encoding: gzip
6 GET main.js (no Accept-Encoding)
7 GET main.js If-Modified-Since: Thu, 21 Aug 2008 23:53:57 GMT
8 main.js Cache-Control: max-age=0 Last-Modified: Fri, 22 Aug 2008 09:43:15 GMT
9 main.js Cache-Control: max-age=0 Last-Modified: Fri, 22 Aug 2008 09:43:15 GMT
10 main.js (no gzip)
removing ETags avoids the problem
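The two ETag scenarios can be compared with a toy simulation of the proxy's revalidation step. This is a sketch under simplifying assumptions: `origin` and `proxy_serves` are hypothetical helpers, the header values are copied from the slides, and the proxy is assumed to revalidate its cached gzipped copy on behalf of a browser that sent no Accept-Encoding.

```python
from email.utils import parsedate_to_datetime

ETAG = '"de158-e58-c7ee4140"'                    # same for both variants (slide's Apache case)
GZIP_MODIFIED = "Thu, 21 Aug 2008 23:53:57 GMT"  # Last-Modified of the gzipped variant
PLAIN_MODIFIED = "Fri, 22 Aug 2008 09:43:15 GMT" # Last-Modified of the ungzipped variant


def origin(request: dict, use_etags: bool) -> dict:
    """Toy origin: answers 304 when the request's validator matches, else a full response."""
    gzip_ok = "gzip" in request.get("Accept-Encoding", "")
    response = {"status": 200, "Cache-Control": "max-age=0"}
    if gzip_ok:
        response["Content-Encoding"] = "gzip"
    if use_etags:
        response["ETag"] = ETAG
        if request.get("If-None-Match") == ETAG:
            return {"status": 304}
    else:
        last_modified = GZIP_MODIFIED if gzip_ok else PLAIN_MODIFIED
        response["Last-Modified"] = last_modified
        since = request.get("If-Modified-Since")
        if since and parsedate_to_datetime(last_modified) <= parsedate_to_datetime(since):
            return {"status": 304}
    return response


def proxy_serves(use_etags: bool) -> dict:
    """What a browser without gzip support receives after the proxy revalidates."""
    # First browser supports gzip; the proxy caches the gzipped response.
    cached = origin({"Accept-Encoding": "gzip"}, use_etags)
    # Second browser sends no Accept-Encoding; max-age=0 forces a conditional GET.
    if "ETag" in cached:
        conditional = {"If-None-Match": cached["ETag"]}
    else:
        conditional = {"If-Modified-Since": cached["Last-Modified"]}
    reply = origin(conditional, use_etags)
    return cached if reply["status"] == 304 else reply


print(proxy_serves(use_etags=True).get("Content-Encoding"))   # gzip (mismatched content)
print(proxy_serves(use_etags=False).get("Content-Encoding"))  # None (ungzipped content)
```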
Edge Case Fixes
Vary: Accept-Encoding    Cache-Control: private    ETag
aol.com x
ebay.com x x x (IIS)
facebook.com x
google.com/search x
search.live.com/results x x (IIS)
msn.com x (IIS)
myspace.com x x (Apa)
en.wikipedia.org/wiki x (Apa)
yahoo.com x
youtube.com x some
Vary: User-Agent is not used
March 2007
October 2008
Homework
"Improving Top Site" class project:
• add improvements for Rule 4
• measure improvements using Hammerhead
• record results in your personal Web 100 sheet
read Chapter 5 of HPWS for 10/17
Questions
How much are file sizes typically reduced by using gzip compression?
What types of resources (images, scripts, etc.) should not be compressed?
For the resource types that should be compressed, should they always be compressed?
How do you prevent proxies from serving gzipped resources to browsers that don't support gzip?
How can ETags cause proxies to serve mismatched content to browsers?