
Developing the fastest HTTP/2 server

Date post: 21-Apr-2017
Upload: kazuho-oku
Copyright (C) 2016 DeNA Co., Ltd. All Rights Reserved. Developing the fastest HTTP/2 server. DeNA Co., Ltd., Kazuho Oku
Transcript
Page 1: Developing the fastest HTTP/2 server

Copyright (C) 2016 DeNA Co., Ltd. All Rights Reserved.

Developing the fastest HTTP/2 server

DeNA Co., Ltd.
Kazuho Oku

Page 2: Developing the fastest HTTP/2 server

Who am I?

■ Kazuho Oku
■ Major works:
  ⁃ Palmscape / Xiino (web browser for Palm OS)
    • awarded M.I.T. TR 100/2002
  ⁃ Mitoh project 2004 super creator
  ⁃ Q4M (message queue plugin for MySQL)
    • MySQL Conference Community Awards 2011
  ⁃ H2O (HTTP/2 server)
    • Japan OSS Contribution Award 2015

Page 3: Developing the fastest HTTP/2 server

Background

Page 4: Developing the fastest HTTP/2 server

Responsiveness is important

■ 500ms increase → -1.2% revenue

source: http://radar.oreilly.com/2009/06/bing-and-google-agree-slow-pag.html

Page 5: Developing the fastest HTTP/2 server

Increasing size and # of requests

source: http://httparchive.org/trends.php?s=All&minlabel=Aug+1+2011&maxlabel=Aug+1+2015#bytesTotal&reqTotal

Page 6: Developing the fastest HTTP/2 server

Bandwidth is also increasing

■ end-users' B/W increases 50% every year (Nielsen's Law)

source: http://www.nngroup.com/articles/law-of-bandwidth/

Page 7: Developing the fastest HTTP/2 server

More bandwidth doesn't matter

* effective B/W reaches ceiling at around 1.6Mbps

source: More Bandwidth Doesn't Matter - 2011 Mike Belshe (Google)

Page 8: Developing the fastest HTTP/2 server

Latency is the new bottleneck

source: More Bandwidth Doesn't Matter - 2011 Mike Belshe (Google)

Page 9: Developing the fastest HTTP/2 server

Latency cannot be optimized

■ latency is limited by the speed of light
  ⁃ round-trip between Japan and US: 80ms
■ mobile carriers have huge latency
  ⁃ LTE ~ 50ms
■ the Web is becoming more and more complex

Page 10: Developing the fastest HTTP/2 server

Web is becoming slower ... unless we do something.

Page 11: Developing the fastest HTTP/2 server

Solution: new protocol

Page 12: Developing the fastest HTTP/2 server

HTTP/2!

Page 13: Developing the fastest HTTP/2 server

The reasons HTTP/1.1 is slow

■ concurrency is too small
  ⁃ multiple round-trips required when issuing many requests
■ no prioritization between requests
  ⁃ cannot suspend HTML / image streams in favor of CSS / JS
■ big request / response headers
  ⁃ typically hundreds of octets
  ⁃ becomes an overhead when issuing many requests

Page 14: Developing the fastest HTTP/2 server

HTTP/2

■ RFC 7540 (2015/5)
  ⁃ based on SPDY by Google
■ key features:
  ⁃ binary protocol
  ⁃ header compression
  ⁃ multiplexing
  ⁃ prioritization

Page 15: Developing the fastest HTTP/2 server

Benchmark

■ red bar: time spent until first-paint
■ big difference between server implementations
■ reason: quality of prioritization logic
■ H2O shows the true potential of HTTP/2

Page 16: Developing the fastest HTTP/2 server

Have we reached the limit?

Page 17: Developing the fastest HTTP/2 server

Let's consider what would be the ideal HTTP flow.

Page 18: Developing the fastest HTTP/2 server

TCP slow start

■ Initial Congestion Window (IW)=10
  ⁃ only 10 packets can be sent in first RTT
  ⁃ used to be IW=3
■ window increase: 1.5x/RTT

[Figure: TCP slow start (IW 10, MSS 1460); x-axis: RTT (1 to 8); y-axis: bytes transmitted (0 to 800,000)]
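The growth described on this slide can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming a loss-free connection, a window that grows by exactly 1.5x per RTT, and whole packets only; the function name `slow_start_bytes` is hypothetical and not taken from the slides.

```c
#include <stddef.h>

/*
 * Cumulative bytes a sender could have transmitted after `rtts` round trips,
 * assuming the window starts at `iw` packets and grows by 1.5x every RTT
 * (the growth rate quoted on the slide), with no packet loss.
 */
static size_t slow_start_bytes(unsigned rtts, unsigned iw, size_t mss)
{
    double cwnd = iw;   /* congestion window, in packets */
    size_t total = 0;

    for (unsigned i = 0; i < rtts; i++) {
        total += (size_t)cwnd * mss;   /* only whole packets are sent */
        cwnd *= 1.5;
    }
    return total;
}
```

Under these assumptions, `slow_start_bytes(1, 10, 1460)` gives 14,600 bytes for the first RTT, and `slow_start_bytes(8, 10, 1460)` gives 712,480 bytes, which fits within the range of the chart.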

Page 19: Developing the fastest HTTP/2 server

Flow of the ideal HTTP

■ fastest within the limits of TCP/IP
■ receive a request 0-RTT, and:
  ⁃ first send CSS/JS*
  ⁃ then send the HTML
  ⁃ then send the images*

*: but only the ones not cached by the browser

[Diagram: client sends the request, server returns the response; 1 RTT in total]

Page 20: Developing the fastest HTTP/2 server

The reality in HTTP/2

■ TCP establishment: +1 RTT
■ TLS handshake: +2 RTT*
■ HTML fetch: +1 RTT
■ JS,CSS fetch: +2 RTT**
■ Total: 6 RTT

*: 1 RTT on reconnection
**: servers often cannot switch to sending JS,CSS instantly, due to the output buffered in TCP send buffer

[Diagram: client/server sequence, 1 RTT per exchange: TCP SYN, TCP SYN ACK, TLS Handshake (x4), GET /, HTML, GET css,js, css,js]

Page 21: Developing the fastest HTTP/2 server

Ongoing optimizations

■ TCP Fast Open
  ⁃ connection establishment in 0 RTT
■ TLS 1.3
  ⁃ initial handshake complete in 1 RTT
  ⁃ resumption in 0 RTT
■ what can be done in the HTTP/2 layer?
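To see how much of the 6 RTT remains after these transport-level optimizations, the per-phase counts from "The reality in HTTP/2" can be summed up again with the new numbers. This is only an illustrative sketch; the struct and function names are hypothetical.

```c
/* Round trips spent in each phase before CSS/JS arrives. */
struct rtt_budget {
    unsigned tcp_connect;
    unsigned tls_handshake;
    unsigned html_fetch;
    unsigned asset_fetch;
};

/* Total number of round trips until the CSS/JS has been fetched. */
static unsigned total_rtt(const struct rtt_budget *b)
{
    return b->tcp_connect + b->tls_handshake + b->html_fetch + b->asset_fetch;
}
```

With `{1, 2, 1, 2}` the total is the 6 RTT of the previous slide. With TCP Fast Open and a TLS 1.3 initial handshake (`{0, 1, 1, 2}`) it drops to 4 RTT, and with TLS 1.3 resumption (`{0, 0, 1, 2}`) to 3 RTT. The remaining round trips are all spent in the HTTP layer, which is what the rest of the talk addresses.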

Page 22: Developing the fastest HTTP/2 server

Further optimizations in HTTP/2 layer

■ optimize TCP for responsiveness
■ Cache-aware server push

Page 23: Developing the fastest HTTP/2 server

Optimizing TCP for responsiveness

Page 24: Developing the fastest HTTP/2 server

Typical sequence of HTTP/2

[Diagram: client/server sequence]
client → server: GET /
server → client: HTTP/2 200 OK, <!DOCTYPE HTML>…<SCRIPT SRC="jquery.js">… (*)
client → server: GET /jquery.js (arrives 1 RTT into the HTML response)

The server needs to switch from sending HTML to sending JS at this very moment (which means that the amount of data sent in * must be smaller than IW).

Page 25: Developing the fastest HTTP/2 server

Buffering in TCP and TLS layer

    // ordinary code (non-blocking): write until OpenSSL reports that it would block
    while (SSL_write(…) > 0)
        ;
    // SSL_get_error() now returns SSL_ERROR_WANT_WRITE

[Diagram: HTTP/2 frames → TLS records → BIO buf. → TCP send buffer. Inside the TCP send buffer, data within CWND (including unacked data) is sent immediately; data between the end of CWND and the poll threshold is not immediately sent]

Page 26: Developing the fastest HTTP/2 server

Why do we have buffers?

■ TCP send buffer:
  ⁃ reduce ping-pong between kernel and application
■ BIO buffer:
  ⁃ for data that couldn't be stored in TCP send buffer

[Diagram: HTTP/2 frames → TLS records → BIO buf. → TCP send buffer (unacked, CWND, poll threshold); data within CWND is sent immediately, the rest is not immediately sent]

Page 27: Developing the fastest HTTP/2 server

Improvement: poll-then-write

    // only call SSL_write when poll notifies the app.
    while (poll_for_write(fd) == SOCKET_IS_READY)
        SSL_write(…);

[Diagram: HTTP/2 frames → TLS records → TCP send buffer (unacked, CWND, poll threshold); no BIO buf. appears in this diagram]
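The pseudocode on this slide leaves `poll_for_write` undefined. The sketch below shows one way the pattern could look with POSIX `poll()`, assuming a plain file descriptor and using `write()` as a stand-in for `SSL_write()` so that the example stays self-contained. The function name `write_when_ready` is hypothetical.

```c
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Wait until `fd` is reported writable, then issue exactly one write().
 * Returns the number of bytes written, 0 on timeout, or -1 on error.
 * Plain write() stands in for SSL_write() in this sketch.
 */
static ssize_t write_when_ready(int fd, const void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLOUT };
    int ready = poll(&pfd, 1, timeout_ms);

    if (ready < 0)
        return -1;                 /* poll() itself failed */
    if (ready == 0)
        return 0;                  /* not writable within the timeout */
    if (!(pfd.revents & POLLOUT))
        return -1;                 /* POLLERR / POLLHUP / POLLNVAL */
    return write(fd, buf, len);    /* data is handed over as late as possible */
}
```

Because the data is only handed to the kernel once the socket is reported writable, the application can still decide which stream to send right up to that moment.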

Page 28: Developing the fastest HTTP/2 server

Adjust poll threshold

■ set poll threshold to the end of CWND?
  ⁃ setsockopt(TCP_NOTSENT_LOWAT)
  ⁃ in Linux, the minimum is CWND + 1 octet
    • becomes unstable when set to CWND + 0

[Diagram: HTTP/2 frames → TLS records → TCP send buffer (unacked, CWND, poll threshold); data within CWND is sent immediately, the rest is not immediately sent]

Page 29: Developing the fastest HTTP/2 server

Adjust poll threshold

    // only call SSL_write when poll notifies the app.
    while (poll_for_write(fd) == SOCKET_IS_READY)
        SSL_write(…);

[Diagram: HTTP/2 frames → TLS records → TCP send buffer (unacked, CWND, poll threshold); data within CWND is sent immediately, the rest is not immediately sent]

Page 30: Developing the fastest HTTP/2 server

Further improvement: read TCP states

    // calc size of data to send by calling getsockopt(TCP_INFO)
    if (poll_for_write(fd) == SOCKET_IS_READY) {
        capacity = CWND - unacked + ONE_MSS - TLS_overhead;
        SSL_write(prepare_http2_frames(capacity));
    }

[Diagram: HTTP/2 frames → TLS records → TCP send buffer (unacked, CWND, poll threshold); data within CWND is sent immediately, the rest is not immediately sent]

Page 31: Developing the fastest HTTP/2 server

Issues in the proposed approach

■ increased delay between ACK recv. → data send
  ⁃ leads to slower peak speed
  ⁃ reason:
    • traditional approach: completes within kernel
    • this approach: application needs to be notified to generate new data
■ solution:
  ⁃ use the approach only when necessary
    • i.e. when RTT is big and CWND is small
    • increased delay can be ignored if: delay << RTT

Page 32: Developing the fastest HTTP/2 server

Code for calculating size of data to send

    size_t get_suggested_write_size() {
        struct tcp_info tcp_info;
        socklen_t len = sizeof(tcp_info);
        size_t tls_overhead, packets_sendable;

        /* read RTT, CWND, unacked and MSS from the kernel (Linux) */
        if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &tcp_info, &len) != 0)
            return UNKNOWN;
        /* the approach only pays off when RTT is big and CWND is small */
        if (tcp_info.tcpi_rtt < min_rtt || tcp_info.tcpi_snd_cwnd > max_cwnd)
            return UNKNOWN;

        switch (SSL_CIPHER_get_id(SSL_get_current_cipher(ssl))) {
        case TLS1_CK_RSA_WITH_AES_128_GCM_SHA256:
        case …:
            tls_overhead = 5 + 8 + 16; /* record header + explicit nonce + GCM tag */
            break;
        default:
            return UNKNOWN;
        }

        packets_sendable = tcp_info.tcpi_snd_cwnd > tcp_info.tcpi_unacked
            ? tcp_info.tcpi_snd_cwnd - tcp_info.tcpi_unacked : 0;
        return (packets_sendable + 1) * (tcp_info.tcpi_snd_mss - tls_overhead);
    }
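The final two statements of the function are pure arithmetic, so they can be tried out without a socket. Below is a minimal sketch that takes the TCP state as plain parameters instead of reading it through `getsockopt(TCP_INFO)`; the name `suggested_write_size` and its parameter list are assumptions made for illustration.

```c
#include <stddef.h>

/*
 * Payload bytes to hand to the TLS layer so that the resulting records fill
 * the unused part of the congestion window plus one extra packet.
 * `cwnd` and `unacked` are in packets; `mss` and `tls_overhead` in octets.
 */
static size_t suggested_write_size(unsigned cwnd, unsigned unacked,
                                   size_t mss, size_t tls_overhead)
{
    size_t packets_sendable = cwnd > unacked ? cwnd - unacked : 0;

    return (packets_sendable + 1) * (mss - tls_overhead);
}
```

For a fresh connection with CWND = 10, nothing unacked, MSS = 1460 and the 29-octet AES-GCM overhead, this yields 11 × 1,431 = 15,741 bytes. With 4 packets still unacked it yields 7 × 1,431 = 10,017 bytes.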

Page 33: Developing the fastest HTTP/2 server

Benchmark

■ conditions:
  ⁃ server in Ireland, client in Japan (RTT 250ms)
  ⁃ load tiny js at the top of a large HTML
■ result: delay decreased from 511ms to 250ms
  ⁃ i.e. JS fetch latency was 2 RTT, became 1 RTT
    • similar results in other environments

Page 34: Developing the fastest HTTP/2 server

Conclusion

■ near-optimal result can be achieved
  ⁃ by adjusting poll threshold and reading TCP states
  ⁃ 1-packet overhead due to restriction in Linux kernel
■ 1-RTT improvement in H2O
  ⁃ estimated 1-RTT improvement per the depth of the load graph

Page 35: Developing the fastest HTTP/2 server

Same problem exists with load balancers

■ L4 L/B or TLS terminator also act as buffers
  ⁃ impact bigger than that of TCP send buffer of httpd
■ solution:
  ⁃ best: don't use L/B
  ⁃ next to best: implement mitigations in L/B
  ⁃ long-term: TCP migration + L3 NAT or DSR
    • i.e. accept in L/B, then transfer the connection to HTTP/2 server

Page 36: Developing the fastest HTTP/2 server

Cache-aware Server Push

Page 37: Developing the fastest HTTP/2 server

What is server-push?

■ start the delivery of CSS / JS when receiving a request for HTML
■ effect:
  ⁃ 1 RTT reduction, or more

Page 38: Developing the fastest HTTP/2 server

Use-case: conceal request process time

■ ex. RTT=50ms, process time=200ms

[Diagram: two client/server timelines]
without push: req. → process request → HTML → req. → asset (x4); 450ms (5 RTT + processing time)
with push: req. → push-asset (x4) while the request is processed → HTML; 250ms (1 RTT + processing time)

Page 39: Developing the fastest HTTP/2 server

Use-case: conceal network distance

■ CDNs' use-case
  ⁃ utilize the conn. while waiting for app. response
  ⁃ side-effect: reduce the number of app DCs

[Diagram: client, edge server (CDN), app. server. The client sends req. to the edge server, which forwards req. to the app. server; the edge server sends push-asset (x4) to the client while waiting for the HTML from the app. server, then relays the HTML]

Page 40: Developing the fastest HTTP/2 server

Issues of server-push

■ how to determine if a resource is already cached
  ⁃ shouldn't push a resource already in cache
    • waste of bandwidth (and time)
  ⁃ can't issue a request to identify the cache state
    • since it would waste 1 RTT we are trying to reduce!

Page 41: Developing the fastest HTTP/2 server

Cache-aware server push

■ experimental feature since H2O 1.5
■ create a digest of URLs found in browser cache
  ⁃ uses Golomb coded sets
    • space-efficient variant of Bloom filter
■ server uses the digest to determine whether or not to push

Page 42: Developing the fastest HTTP/2 server

Memo: fresh vs. stale

■ two states of a cached resource
■ fresh:
  ⁃ resource that can be used
  ⁃ example: Expires: Jan 1 2030
■ stale:
  ⁃ needs revalidation before use
    • i.e. issue GET with if-modified-since

Page 43: Developing the fastest HTTP/2 server

Generating a digest

1. calc hashcode of URLs of every fresh cached resource
   ⁃ range: 0 .. #-of-URL / false-positive-rate
2. sort the hashcodes, remove duplicates
3. emit the first element using the following encoding:
   1. "value * FPR" (rounded down) using unary coding
   2. "value mod (1/false-positive-rate)" using binary coding
4. for every other element, emit the delta from the preceding element, minus one, using the same encoding
5. pad with 1 up to the byte boundary

Page 44: Developing the fastest HTTP/2 server

Generating a digest

■ scenario:
  ⁃ FPR: 1/256
  ⁃ URLs of fresh resources in cache:
    • https://example.com/ecma.js
    • https://example.com/style.css
■ calc hash modulo 512 (= 2 URLs * 256): 0x3d, 0x16b
■ sort, remove dupes, and emit the delta:
  ⁃ 0x3d → 0 00111101
  ⁃ 0x16b - 0x3d - 1 → 0x12d → 10 00101101
  ⁃ padding → 11111 (19 bits padded to 24)
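The encoding steps can be written down compactly. The sketch below is one possible implementation, assuming the hash values are already computed, sorted and de-duplicated (the slides do not name the hash function), that bits are packed most-significant-bit first, and that the unary part is written as one-bits terminated by a zero-bit, which matches the bit patterns shown in the scenario. The names `bit_writer`, `put_bit` and `gcs_encode` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct bit_writer {
    uint8_t *buf;   /* output buffer, zero-filled by gcs_encode() */
    size_t cap;     /* capacity of buf in bytes */
    size_t nbits;   /* number of bits written so far */
};

/* Append one bit, most significant bit of each byte first. */
static int put_bit(struct bit_writer *w, int bit)
{
    if (w->nbits / 8 >= w->cap)
        return -1;
    if (bit)
        w->buf[w->nbits / 8] |= (uint8_t)(0x80 >> (w->nbits % 8));
    w->nbits++;
    return 0;
}

/*
 * Encode `n` sorted, de-duplicated hash values with a false-positive rate of
 * 1 / 2^log2_inv_fpr (log2_inv_fpr must be below 32). Returns the number of
 * bytes written, or -1 if `out` is too small.
 */
static long gcs_encode(const uint32_t *values, size_t n, unsigned log2_inv_fpr,
                       uint8_t *out, size_t cap)
{
    struct bit_writer w = { out, cap, 0 };

    memset(out, 0, cap);
    for (size_t i = 0; i < n; i++) {
        /* first element as-is, then delta from the preceding element minus one */
        uint32_t delta = i == 0 ? values[0] : values[i] - values[i - 1] - 1;
        uint32_t quotient = delta >> log2_inv_fpr;

        /* unary part: `quotient` one-bits terminated by a zero-bit */
        for (uint32_t q = 0; q < quotient; q++)
            if (put_bit(&w, 1) != 0)
                return -1;
        if (put_bit(&w, 0) != 0)
            return -1;
        /* binary part: the low log2_inv_fpr bits of delta, high bit first */
        for (unsigned b = log2_inv_fpr; b-- > 0;)
            if (put_bit(&w, (int)((delta >> b) & 1)) != 0)
                return -1;
    }
    /* pad with one-bits up to the byte boundary */
    while (w.nbits % 8 != 0)
        if (put_bit(&w, 1) != 0)
            return -1;
    return (long)(w.nbits / 8);
}
```

Feeding the two values of the scenario (0x3d and 0x16b, with `log2_inv_fpr` = 8) produces three bytes, 0x1e 0xc5 0xbf, which is the bit string 0 00111101 10 00101101 followed by one-bit padding.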

Page 45: Developing the fastest HTTP/2 server

Overhead of sending the digest

■ size: #-of-URLs * (log2(1/FPR) + 1.x) bits
■ 1,400 URLs can be stored in 1 packet
  ⁃ when false-positive-rate set to 1/128
■ can raise FPR to cram more URLs
  ⁃ false-positive means the resource is not pushed, browser can just pull it
  ⁃ pushing some of the required resources is better than none

Page 46: Developing the fastest HTTP/2 server

Where to store the digest?

■ cookie
  ⁃ pros: runs on any browser, anytime
  ⁃ cons: digest becomes inaccurate
    • only the browser knows what's in the browser cache
■ ServiceWorker (+ServiceWorker Cache)
  ⁃ pros: runs on Chrome, Firefox
  ⁃ cons: doesn't start until leaving the landing page
■ HTTP/2 frame
  ⁃ pros: minimal octets transferred
    • thanks to the knowledge of HTTP/2 connection
  ⁃ cons: needs to be implemented by browser developer

Page 47: Developing the fastest HTTP/2 server

Discussion at IETF

■ IETF 95 (April)
  ⁃ initial submission of the internet draft
    • co-author: Mark Nottingham (HTTP WG Chair)
  ⁃ defines the HTTP/2 frame
    • since it's the best way in the long-term
    • store the frame in headers / cookies for the short-term
■ IETF 96, HTTP Workshop (July)
  ⁃ to define digest calculation of stale resources

Page 48: Developing the fastest HTTP/2 server

Handling stale resources

■ hash key changed to URL + ETag
  ⁃ anyone needs support for last-modified?
■ server uses URL + ETag of the resource to check the digest
  ⁃ push the resource in case a match is not found
  ⁃ push 304 Not Modified in case a match is found
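On the server side, this check amounts to decoding the digest until the wanted hash is found or passed. The sketch below assumes the same bit layout as in the "Generating a digest" scenario (most-significant-bit first, one-terminated unary part, one-bit padding) and takes the hash of URL + ETag as an already-computed number. All names (`push_action`, `digest_contains`, `decide_push`) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum push_action { PUSH_RESOURCE, PUSH_304_NOT_MODIFIED };

/* Read one bit (MSB first); returns -1 past the end of the digest. */
static int get_bit(const uint8_t *buf, size_t len, size_t *pos)
{
    if (*pos >= len * 8)
        return -1;
    int bit = (buf[*pos / 8] >> (7 - *pos % 8)) & 1;
    ++*pos;
    return bit;
}

/* Returns 1 if `target` is among the values encoded in the digest. */
static int digest_contains(const uint8_t *digest, size_t len,
                           unsigned log2_inv_fpr, uint64_t target)
{
    size_t pos = 0;
    uint64_t value = 0;
    int first = 1, bit;

    for (;;) {
        uint64_t quotient = 0, remainder = 0;

        while ((bit = get_bit(digest, len, &pos)) == 1)
            quotient++;
        if (bit < 0)
            return 0;   /* only one-bit padding was left */
        for (unsigned b = 0; b < log2_inv_fpr; b++) {
            if ((bit = get_bit(digest, len, &pos)) < 0)
                return 0;   /* truncated digest */
            remainder = (remainder << 1) | (uint64_t)bit;
        }
        uint64_t delta = (quotient << log2_inv_fpr) | remainder;
        value = first ? delta : value + delta + 1;
        first = 0;
        if (value == target)
            return 1;
        if (value > target)
            return 0;   /* values are sorted, no need to read further */
    }
}

/* Match found: push 304 Not Modified. No match: push the resource itself. */
static enum push_action decide_push(const uint8_t *digest, size_t len,
                                    unsigned log2_inv_fpr, uint64_t key_hash)
{
    return digest_contains(digest, len, log2_inv_fpr, key_hash)
               ? PUSH_304_NOT_MODIFIED
               : PUSH_RESOURCE;
}
```

The bit string from the earlier scenario packs into the bytes 0x1e 0xc5 0xbf. With that digest, `decide_push` selects the 304 for the hashes 0x3d and 0x16b and a full push for any other value. Padding with one-bits is what lets the decoder stop cleanly: the trailing bits can never form a complete element.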

Page 49: Developing the fastest HTTP/2 server

Difficulties in pushing 304

■ ETag cannot always be obtained immediately
  ⁃ cannot build If-None-Match request header without ETag
  ⁃ the "request*" of a pushed resource SHOULD be sent before the main response
■ proposed solution:
  ⁃ allow 304 against a non-conditional GET

*: in case of server-push, the server generates both request and response, and sends them to the client.

Page 50: Developing the fastest HTTP/2 server

Using server-push from Ruby

■ Link: rel=preload header
  ⁃ web server pushes the specified URL

      HTTP/1.1 200 OK
      Content-Type: text/html
      Link: </style.css>; rel=preload    # this header!!!

  ⁃ supported by:
    • H2O, nghttpx (nghttp2), mod_h2 (Apache)
  ⁃ patch for nginx exists

Page 51: Developing the fastest HTTP/2 server

The issue with Link: rel=preload

■ cannot initiate push while processing the request

[Diagram: client, HTTP/2 server, Web app. The client sends GET / to the HTTP/2 server, which forwards GET / to the Web app. While the Web app processes the request, the HTTP/2 server can't push. The Web app then returns 200 OK with Link: …, and the HTTP/2 server returns 200 OK to the client]

Page 52: Developing the fastest HTTP/2 server

1xx Early Metadata

■ send Link: rel=preload as interim response
  ⁃ application sends 1xx then processes the request
■ supported in H2O 2.1
■ might propose for standardization in IETF

    GET / HTTP/1.1
    Host: example.com

    HTTP/1.1 1xx Early Metadata
    Link: </style.css>; rel=preload

    HTTP/1.1 200 OK
    Content-Type: text/html; charset=utf-8

    <!DOCTYPE HTML>...

Page 53: Developing the fastest HTTP/2 server

Sending 1xx from Rack

■ in case of Unicorn:

    Proc.new do |env|
      env["unicorn.socket"].write(
        "HTTP/1.1 1xx Early Metadata\r\n" +
        "Link: </style.js>; rel=preload\r\n" +
        "\r\n")
      # time-consuming operation ...
      [ 200, [ ... ], [ ... ] ]
    end

...we need to define the formal API

Page 54: Developing the fastest HTTP/2 server

Conclusion

Page 55: Developing the fastest HTTP/2 server

Conclusion

■ the Web has become faster with HTTP/2
■ HTTP/2 can be made as fast as the limits of TCP/IP allow with:
  ⁃ optimizing TCP for responsiveness
  ⁃ Cache Digest
  ⁃ 1xx Early Metadata

Page 56: Developing the fastest HTTP/2 server

Q&A

■ Q. Can it be made faster than the limits of TCP/IP?
■ A. Yes!
  ⁃ shorten the RTT!
    • CDNs' approach
  ⁃ make DNS query part of TLS handshake
    • was part of TLS 1.3 draft (removed as too premature)
  ⁃ fairness isn't an issue for a private network!
    • TCP optimizer for mobile carriers

