CS193H: High Performance Web Sites Lecture 8: Rule 4 – Gzip Components Steve Souders Google...

Post on 26-Mar-2015


CS193H: High Performance Web Sites

Lecture 8: Rule 4 – Gzip Components

Steve Souders, Google

souders@cs.stanford.edu

Announcements

Web 100 Performance Profile (round 1) class project has been graded – contact Aravind if you want to know your grade

Compression (encoding)

typically reduces size by 70%

(6230 - 2066) / 6230 = 67%

GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 (…) Gecko/2008070208 Firefox/3.0.1

HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
Content-Length: 6230

function d(s) {...

GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 (…) Gecko/2008070208 Firefox/3.0.1
Accept-Encoding: gzip,deflate

HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
Content-Length: 2066
Content-Encoding: gzip

XmoÛHþ\ÿFÖvã*wØoq...
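The savings figure above is just the relative drop in Content-Length. A minimal Python sketch of that arithmetic follows; the function name `gzip_savings` and the sample script text are illustrative assumptions, not part of the lecture, and the sample is far more repetitive than a real script, so its savings will be higher than the typical 70%.

```python
import gzip


def gzip_savings(original_size: int, compressed_size: int) -> float:
    """Fraction of bytes saved when a response shrinks from original to compressed size."""
    return (original_size - compressed_size) / original_size


# The two Content-Length values from the blogger.com example on the slide.
print(f"slide example: {gzip_savings(6230, 2066):.0%} saved")

# Hypothetical script body, compressed locally to show the same measurement.
sample = b"function d(s) { return document.getElementById(s); }\n" * 100
compressed = gzip.compress(sample)
print(f"sample: {len(sample)} -> {len(compressed)} bytes "
      f"({gzip_savings(len(sample), len(compressed)):.0%} saved)")
```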

Gzip vs. Deflate

gzip (default settings) compresses more

                     Gzip              Deflate
            Size     Size    Savings   Size    Savings
Script      3.3K     1.1K    67%       1.1K    66%
Script      39.7K    14.5K   64%       16.6K   58%
Stylesheet  1.0K     0.4K    56%       0.5K    52%
Stylesheet  14.1K    3.7K    73%       4.7K    67%

Pros and Cons

Pro: smaller transfer size

Con: CPU cycles – on client and server

Don't compress resources < 1K

Gzip configuration

Apache 1.3: mod_gzip

mod_gzip_item_include file \.html$
mod_gzip_item_include mime ^text/html$
mod_gzip_item_include file \.js$
mod_gzip_item_include mime ^application/x-javascript$
mod_gzip_item_include file \.css$
mod_gzip_item_include mime ^text/css$

Apache 2.x: mod_deflate

AddOutputFilterByType DEFLATE text/html text/css application/x-javascript

control compression level: DeflateCompressionLevel

http://httpd.apache.org/docs/2.0/mod/mod_deflate.html
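To get a feel for what a compression-level setting trades off, here is a small Python sketch using the standard zlib module, which accepts levels 1 to 9. It is only an illustration under assumptions: the sample data is made up, and the numbers it prints say nothing about what Apache would produce for a real site.

```python
import zlib

# Hypothetical sample: repetitive JavaScript-like text, not a real script.
sample = b"".join(
    b"function handler%d(e) { return document.getElementById('item%d'); }\n" % (i, i)
    for i in range(200)
)

# Higher levels spend more CPU looking for matches; lower levels are faster.
for level in (1, 6, 9):
    compressed = zlib.compress(sample, level)
    savings = 100 * (len(sample) - len(compressed)) / len(sample)
    print(f"level {level}: {len(sample)} -> {len(compressed)} bytes ({savings:.0f}% saved)")
```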

Gzip: not just for HTML

March 2007

                          HTML   Scripts   Stylesheets
amazon.com                x
aol.com                   x      some      some
cnn.com
ebay.com                  x
froogle.google.com        x      x         x
msn.com                   x      deflate   deflate
myspace.com               x      x         x
wikipedia.org             x      x         x
yahoo.com                 x      x         x
youtube.com               x      some      some

October 2008

                          HTML   Scripts   Stylesheets
aol.com                   x      x         x
ebay.com                  x      some
facebook.com              x      x         x
google.com/search         x      x         na
search.live.com/results   x      x         x
msn.com                   x      x         x
myspace.com               x      x         x
en.wikipedia.org/wiki     x      some      some
yahoo.com                 x      x         x
youtube.com               x      x         x

gzip scripts, stylesheets, XML, JSON (not images, Flash, PDF)
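The two rules of thumb from these slides (compress text formats but not images, Flash, or PDF; skip resources under 1K) can be combined into one check. This is a hedged Python sketch: the function name `should_gzip` and the exact MIME types chosen for XML and JSON are assumptions for illustration.

```python
# MIME types treated as compressible. HTML, CSS, and JavaScript match the
# Apache configuration on the earlier slide; the XML and JSON types are assumed.
COMPRESSIBLE_TYPES = {
    "text/html",
    "text/css",
    "application/x-javascript",
    "text/xml",
    "application/json",
}

# "Don't compress resources < 1K"
MIN_SIZE_BYTES = 1024


def should_gzip(content_type: str, size_bytes: int) -> bool:
    """Return True when a response is a text format and large enough to be worth gzipping."""
    mime = content_type.split(";")[0].strip().lower()  # drop "; charset=..." parameters
    return mime in COMPRESSIBLE_TYPES and size_bytes >= MIN_SIZE_BYTES


print(should_gzip("text/css; charset=utf-8", 14438))  # large stylesheet
print(should_gzip("image/png", 50000))                # image: already compressed
print(should_gzip("text/css", 500))                   # under the 1K threshold
```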

Edge Case: Proxies

(diagram: Proxy, Origin Server)

1 GET main.js Accept-Encoding: gzip
2 GET main.js Accept-Encoding: gzip
3 main.js Content-Encoding: gzip
4 main.js Content-Encoding: gzip
5 main.js Content-Encoding: gzip
6 GET main.js (no Accept-Encoding)
7 main.js Content-Encoding: gzip

proxies may serve gzipped content to browsers that don't support it, and vice versa

Edge Case: Proxies w/ Vary

(diagram: Proxy, Origin Server)

1 GET main.js Accept-Encoding: gzip
2 GET main.js Accept-Encoding: gzip
3 main.js Content-Encoding: gzip Vary: Accept-Encoding
4 main.js Content-Encoding: gzip [Accept-Encoding: gzip]
5 main.js Content-Encoding: gzip
6 GET main.js (no Accept-Encoding)
7 GET main.js (no Accept-Encoding)
8 main.js Vary: Accept-Encoding
9 main.js [Accept-Encoding: ]
10 main.js (no gzip)
11 GET main.js Accept-Encoding: gzip
12 main.js Content-Encoding: gzip
13 GET main.js (no Accept-Encoding)
14 main.js (no gzip)

add Vary: Accept-Encoding
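The two proxy scenarios can be simulated with a toy cache. This Python sketch is a simplification under stated assumptions: `origin_server` and `Proxy` are hypothetical names, responses are reduced to a dict of headers, and real proxies implement far more of HTTP caching than this. It only shows how Vary: Accept-Encoding changes the cache key.

```python
from typing import Dict, Optional, Tuple

Response = Dict[str, str]  # response headers only; bodies are omitted


def origin_server(accept_encoding: str, send_vary: bool) -> Response:
    """Hypothetical origin: gzips main.js only when the request allows it."""
    response: Response = {}
    if "gzip" in accept_encoding:
        response["Content-Encoding"] = "gzip"
    if send_vary:
        response["Vary"] = "Accept-Encoding"
    return response


class Proxy:
    """Toy caching proxy that keys entries on Accept-Encoding only if told to via Vary."""

    def __init__(self, origin_sends_vary: bool) -> None:
        self.origin_sends_vary = origin_sends_vary
        self.cache: Dict[Tuple[str, Optional[str]], Response] = {}

    def get(self, url: str, accept_encoding: str = "") -> Response:
        # Look for a variant matching this Accept-Encoding, then a shared entry.
        for key in ((url, accept_encoding), (url, None)):
            if key in self.cache:
                return self.cache[key]
        response = origin_server(accept_encoding, self.origin_sends_vary)
        if response.get("Vary") == "Accept-Encoding":
            self.cache[(url, accept_encoding)] = response  # one entry per variant
        else:
            self.cache[(url, None)] = response  # one entry shared by all browsers
        return response


for origin_sends_vary in (False, True):
    proxy = Proxy(origin_sends_vary)
    proxy.get("main.js", "gzip")        # first browser supports gzip
    second = proxy.get("main.js", "")   # second browser does not
    print(origin_sends_vary, second.get("Content-Encoding", "(no gzip)"))
```

Without Vary the second browser receives the cached gzipped response; with Vary the proxy goes back to the origin and caches a separate uncompressed variant.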

Edge Case: Bad Browsers

< 1% of browsers have problems with gzip

IE 5.5: http://support.microsoft.com/default.aspx?scid=kb;en-us;Q313712

IE 6.0: http://support.microsoft.com/default.aspx?scid=kb;en-us;Q31249

Netscape 3.x, 4.x: http://www.schroepl.net/projekte/mod_gzip/browser.htm

User-Agent white list for gzip

Apache 1.3:
mod_gzip_item_include reqheader "User-Agent: MSIE [6-9]"
mod_gzip_item_include reqheader "User-Agent: Mozilla/[5-9]"

Apache 2.0:
BrowserMatch ^MSIE [6-9] gzip
BrowserMatch ^Mozilla/[5-9] gzip
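The white-list idea can be tried out in a few lines of Python. This sketch reuses the two patterns from the Apache 1.3 lines above as unanchored searches; the function name `browser_gets_gzip` and the sample User-Agent strings are assumptions chosen for illustration.

```python
import re

# Patterns copied from the mod_gzip white list above, applied as substring searches.
GZIP_WHITELIST = [re.compile(r"MSIE [6-9]"), re.compile(r"Mozilla/[5-9]")]


def browser_gets_gzip(user_agent: str) -> bool:
    """Return True when the User-Agent matches any white-listed pattern."""
    return any(pattern.search(user_agent) for pattern in GZIP_WHITELIST)


# Illustrative User-Agent strings (assumed, not taken from the lecture).
samples = [
    "Mozilla/5.0 (Windows; U) Gecko/2008070208 Firefox/3.0.1",
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)",
    "Mozilla/4.08 [en] (Win98; I ;Nav)",
]
for ua in samples:
    print(browser_gets_gzip(ua), ua)
```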

Edge Case: Bad Browsers (cont'd)

proxies could mix up responses: give cached response from useragent1 to useragent2

could add Vary: User-Agent
so many possibilities, defeats proxy caching

better to add Cache-Control: private
downside: disables all proxy caches

is it a serious problem?
hard to diagnose; problem getting smaller

Edge Case: ETags

what happens when proxy makes Conditional GET requests?

Last-Modified date for gzipped vs. ungzipped is different => If-Modified-Since works fine

ETag is the same in Apache for gzipped & ungzipped => If-None-Match succeeds, proxy could give browser mismatched content

remove ETags! (Rule 13)

http://issues.apache.org/bugzilla/show_bug.cgi?id=39727

Edge Case: ETags present

(diagram: Proxy, Origin Server)

1 GET main.js Accept-Encoding: gzip
2 GET main.js Accept-Encoding: gzip
3 main.js Content-Encoding: gzip Cache-Control: max-age=0 ETag: "de158-e58-c7ee4140"
4 main.js Content-Encoding: gzip Cache-Control: max-age=0 ETag: "de158-e58-c7ee4140"
5 main.js Content-Encoding: gzip
6 GET main.js (no Accept-Encoding)
7 GET main.js If-None-Match: "de158-e58-c7ee4140"
8 304 Not Modified
9 main.js Content-Encoding: gzip

proxy gives browser mismatched content

Edge Case: ETags removed

(diagram: Proxy, Origin Server)

1 GET main.js Accept-Encoding: gzip
2 GET main.js Accept-Encoding: gzip
3 main.js Content-Encoding: gzip Cache-Control: max-age=0 Last-Modified: Thu, 21 Aug 2008 23:53:57 GMT
4 main.js Content-Encoding: gzip Cache-Control: max-age=0 Last-Modified: Thu, 21 Aug 2008 23:53:57 GMT
5 main.js Content-Encoding: gzip
6 GET main.js (no Accept-Encoding)
7 GET main.js If-Modified-Since: Thu, 21 Aug 2008 23:53:57 GMT
8 main.js Cache-Control: max-age=0 Last-Modified: Fri, 22 Aug 2008 09:43:15 GMT
9 main.js Cache-Control: max-age=0 Last-Modified: Fri, 22 Aug 2008 09:43:15 GMT
10 main.js (no gzip)

removing ETags avoids the problem
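The difference between the two validators can be sketched in Python. This is a simplified model, not Apache's actual logic: `Variant`, `origin_status`, and `proxy_serves` are hypothetical names, and the ETag and Last-Modified values are the ones shown on the slides. The proxy holds the gzipped variant and revalidates it on behalf of a browser that needs the uncompressed one.

```python
from dataclasses import dataclass
from email.utils import parsedate_to_datetime
from typing import Optional


@dataclass
class Variant:
    body: str
    etag: str
    last_modified: str


# Same ETag for both variants, different Last-Modified dates, as on the slides.
ETAG = '"de158-e58-c7ee4140"'
GZIPPED = Variant("main.js (gzip)", ETAG, "Thu, 21 Aug 2008 23:53:57 GMT")
PLAIN = Variant("main.js (no gzip)", ETAG, "Fri, 22 Aug 2008 09:43:15 GMT")


def origin_status(requested: Variant, *, if_none_match: Optional[str] = None,
                  if_modified_since: Optional[str] = None) -> int:
    """Return 304 when the validator sent by the proxy matches the requested variant."""
    if if_none_match is not None:
        return 304 if if_none_match == requested.etag else 200
    if if_modified_since is not None:
        unchanged = (parsedate_to_datetime(requested.last_modified)
                     <= parsedate_to_datetime(if_modified_since))
        return 304 if unchanged else 200
    return 200


def proxy_serves(cached: Variant, requested: Variant, use_etag: bool) -> str:
    """Body the proxy hands to the browser after revalidating its cached variant."""
    if use_etag:
        status = origin_status(requested, if_none_match=cached.etag)
    else:
        status = origin_status(requested, if_modified_since=cached.last_modified)
    return cached.body if status == 304 else requested.body


print("with ETags:   ", proxy_serves(GZIPPED, PLAIN, use_etag=True))
print("without ETags:", proxy_serves(GZIPPED, PLAIN, use_etag=False))
```

With ETags the origin answers 304 and the browser gets the cached gzipped body; with only Last-Modified the dates differ, so the origin sends the uncompressed variant.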

Edge Case Fixes

Vary: Accept-Encoding

Cache-Control: private

ETag

aol.com x

ebay.com x x x (IIS)

facebook.com x

google.com/search x

search.live.com/results x x (IIS)

msn.com x (IIS)

myspace.com x x (Apa)

en.wikipedia.org/wiki x (Apa)

yahoo.com x

youtube.com x some

Vary: User-Agent – not used

March 2007 October 2008

Homework

"Improving Top Site" class project:
• add improvements for Rule 4
• measure improvements using Hammerhead
• record results in your personal Web 100 sheet

read Chapter 5 of HPWS for 10/17

Questions

How much are file sizes typically reduced by using gzip compression?

What types of resources (images, scripts, etc.) should not be compressed?

For the resource types that should be compressed, should they always be compressed?

How do you prevent proxies from serving gzipped resources to browsers that don't support gzip?

How can ETags cause proxies to serve mismatched content to browsers?