How GZIP compression works - JS Conf EU 2014

Date post: 14-Dec-2014
Upload: raul-fraile
Description:
Data compression is an amazing topic. Even in today’s world, with fast networks and almost unlimited storage, data compression is still relevant, especially for mobile devices and countries with poor Internet connections. For better or worse, GZIP is the de facto lossless compression method for text data on websites. It is neither the fastest nor the best, but it provides an excellent tradeoff between speed and compression ratio. The way the Internet works also makes it difficult to adopt newer compression methods. This talk examines how GZIP works internally, explaining the DEFLATE algorithm, which is a combination of LZ77 and Huffman coding. Different implementations are compared, such as GNU GZIP, 7-ZIP and zopfli, focusing on why and how some of them perform better than others. Finally, we try to go beyond GZIP by preprocessing our data to achieve better results, for example by transposing JSON.
Transcript
Page 1: How GZIP compression works - JS Conf EU 2014

HOW GZIP COMPRESSION WORKS
RAUL FRAILE

JSCONF EU BERLIN

Page 2: How GZIP compression works - JS Conf EU 2014

ABOUT ME

• PHP/JS SOFTWARE DEVELOPER
• MS (RES) STUDENT IN COMPUTING TECHNOLOGIES.
• MADE IN SPAIN.

Page 3: How GZIP compression works - JS Conf EU 2014

DATA COMPRESSION

Page 4: How GZIP compression works - JS Conf EU 2014

NOT AN EXPERT*

Page 5: How GZIP compression works - JS Conf EU 2014

DATA COMPRESSION IS AN AMAZING TOPIC

Page 6: How GZIP compression works - JS Conf EU 2014

REALLY!

Page 7: How GZIP compression works - JS Conf EU 2014

IT CAN BE SEEN LIKE… MAGIC

flickr.com/photos/jeffkrause/6799254170

Page 8: How GZIP compression works - JS Conf EU 2014

flickr.com/photos/t_e_brown/8677750589

… IT’S NOT

Page 9: How GZIP compression works - JS Conf EU 2014

INFORMATION THEORY
CLAUDE SHANNON

Page 10: How GZIP compression works - JS Conf EU 2014

ENTROPY

flickr.com/photos/95303997@N07/10074330416

Page 11: How GZIP compression works - JS Conf EU 2014

$H = -\sum_{x} p(x) \log_2 p(x)$

AVERAGE AMOUNT OF INFORMATION CONTAINED IN EACH MESSAGE

≈ NUMBER OF BITS TO REPRESENT THE MESSAGE
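
As a rough illustration of the entropy formula on this slide, the following Python sketch estimates H from the symbol frequencies of a string. The function name `shannon_entropy` and the sample string are assumptions made for this example; they do not come from the talk.

```python
import math
from collections import Counter


def shannon_entropy(message: str) -> float:
    """Return the entropy in bits per symbol, using observed frequencies as p(x)."""
    if not message:
        return 0.0
    counts = Counter(message)
    total = len(message)
    # H = -sum over symbols of p(x) * log2 p(x)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    text = "HELLO WORLD"
    h = shannon_entropy(text)
    print(f"{h:.3f} bits/symbol, about {h * len(text):.1f} bits for the whole message")
```

For "HELLO WORLD" the estimate is roughly 2.85 bits per symbol, which is a useful yardstick for the Huffman slides later in the deck.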

Page 12: How GZIP compression works - JS Conf EU 2014

225 days/year 62 %

17 days/year 6 %

flickr.com/photos/aigle_dore/5952296478
flickr.com/photos/mariano-mantel/13955110319

Page 13: How GZIP compression works - JS Conf EU 2014

HUMAN BRAIN
IS DESIGNED TO COMPRESS DATA

flickr.com/photos/birthintobeing/11841180046

Page 14: How GZIP compression works - JS Conf EU 2014

flickr.com/photos/neolao/3105372669
flickr.com/photos/tommiephotography/6840025942

flickr.com/photos/earlysound/2186172726

Page 15: How GZIP compression works - JS Conf EU 2014

MORSE CODE
SHORTER SEQUENCES FOR COMMON CHARACTERS

flickr.com/photos/amboo213/9044879245

Page 16: How GZIP compression works - JS Conf EU 2014

DATA COMPRESSION IN HTTP

Page 17: How GZIP compression works - JS Conf EU 2014

GZIP + HTTP

GET index.html
Accept-Encoding: gzip, deflate
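
To make the request on this slide concrete, here is a minimal Python sketch of what a server might do with the `Accept-Encoding` header. The function `negotiate_encoding` is hypothetical, and its header parsing is deliberately simplified (it ignores q-values), so treat it as an illustration and not as a complete HTTP implementation.

```python
import gzip


def negotiate_encoding(body: bytes, accept_encoding: str):
    """Return (headers, payload), gzipping the body only if the client lists gzip."""
    # Simplified parsing (an assumption of this sketch): split on commas and
    # drop any parameters such as ";q=0.5" without interpreting them.
    accepted = {token.split(";")[0].strip().lower() for token in accept_encoding.split(",")}
    if "gzip" in accepted:
        payload = gzip.compress(body)
        headers = {"Content-Encoding": "gzip", "Content-Length": str(len(payload))}
        return headers, payload
    return {"Content-Length": str(len(body))}, body


if __name__ == "__main__":
    page = b"<p>hello</p>" * 50
    headers, payload = negotiate_encoding(page, "gzip, deflate")
    print(headers, len(page), "->", len(payload))
```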

Page 18: How GZIP compression works - JS Conf EU 2014

GZIP COMPRESSION

Page 19: How GZIP compression works - JS Conf EU 2014

GZIP

• BASED ON THE DEFLATE ALGORITHM
• DEFLATE WAS DESIGNED BY PHIL KATZ
• DEFLATE IS USED IN HTTP, PNG AND PDF

Page 20: How GZIP compression works - JS Conf EU 2014

DEFLATE

LZ77 + HUFFMAN CODING

Page 21: How GZIP compression works - JS Conf EU 2014

LZ77 (VARIATION)

THIS FILE IS HUGE! THAT'S BECAUSE THE FILE IS NOT COMPRESSED

<33,9>

SEARCH BUFFER (UP TO 32 KB) · LOOK-AHEAD

Page 22: How GZIP compression works - JS Conf EU 2014

LZ77 (VARIATION)

THIS FILE IS HUGE! THAT'S BECAUSE THE FILE IS NOT COMPRESSED

<33,9>

LITERALS · LENGTHS · DISTANCES
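
The pair <33,9> reads as <distance, length>: go back 33 characters and copy 9 of them. The Python sketch below reproduces that match with a brute-force search. It is a toy under stated assumptions: the function names are invented for this example, the 32 KB window, minimum match of 3 and maximum match of 258 follow DEFLATE's limits, and production compressors use much faster search structures than this loop.

```python
TEXT = "THIS FILE IS HUGE! THAT'S BECAUSE THE FILE IS NOT COMPRESSED"


def longest_match(data, pos, window=32 * 1024, max_len=258):
    """Return (distance, length) of the longest earlier match for data[pos:]."""
    best_dist, best_len = 0, 0
    for cand in range(max(0, pos - window), pos):
        length = 0
        while (length < max_len and pos + length < len(data)
               and data[cand + length] == data[pos + length]):
            length += 1
        # ">=" keeps the nearest candidate when several matches tie on length.
        if length > 0 and length >= best_len:
            best_dist, best_len = pos - cand, length
    return best_dist, best_len


def tokenize(data, min_len=3):
    """Greedy pass: emit literals, or (distance, length) pairs for long enough matches."""
    tokens, pos = [], 0
    while pos < len(data):
        dist, length = longest_match(data, pos)
        if length >= min_len:
            tokens.append((dist, length))
            pos += length
        else:
            tokens.append(data[pos])
            pos += 1
    return tokens


def expand(tokens):
    """Rebuild the text by replaying literals and back-references."""
    out = []
    for token in tokens:
        if isinstance(token, tuple):
            dist, length = token
            for _ in range(length):
                out.append(out[-dist])  # copying one symbol at a time allows overlaps
        else:
            out.append(token)
    return "".join(out)


if __name__ == "__main__":
    print(longest_match(TEXT, TEXT.rfind(" FILE IS ")))
    print(tokenize(TEXT))
```

The resulting stream mixes literals with length and distance values, which is the input to the Huffman stage that follows.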

Page 23: How GZIP compression works - JS Conf EU 2014

HUFFMAN CODING

HELLO WORLD

01001000 01000101 01001100 01001100 01001111 00100000 01010111 01001111 01010010 01001100 01000100

88 BITS

FIXED-LENGTH CODES

CHAR  CODE
H     000
E     001
L     010
O     011
W     100
R     101
D     110
_     111

000 001 010 010 011 111 100 011 101 010 110

33 BITS

Page 24: How GZIP compression works - JS Conf EU 2014

HUFFMAN CODING

VARIABLE-LENGTH CODES

CHARACTER FREQUENCY:

CHAR  FREQ  CODE
L     3     0
O     2     1
H     1     00
E     1     01
W     1     10
R     1     11
D     1     000
_     1     001

HELLO WORLD

00 01 0 0 1 001 10 1 11 0 000

19 BITS

IT’S AMBIGUOUS

H EL H OD O…

Page 25: How GZIP compression works - JS Conf EU 2014

HUFFMAN CODING

CHAR  FREQ  CODE
L     3     10
O     2     111
H     1     001
E     1     1100
W     1     011
R     1     000
D     1     1101
_     1     010

Page 26: How GZIP compression works - JS Conf EU 2014

HUFFMAN CODING

CHAR  FREQ  CODE
L     3     10
O     2     111
H     1     001
E     1     1100
W     1     011
R     1     000
D     1     1101
_     1     010

HELLO WORLD

001 1100 10 10 111 010 011 111 000 10 1101

32 BITS
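
The prefix-free table above can be built mechanically. The following Python sketch is one way to do it, using a heap to merge the two least frequent subtrees repeatedly. The function names are assumptions of this example, and because ties can be broken differently, the individual codes it produces may differ from the slide even though the total length is the same.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(text):
    """Return {symbol: bit string} for a Huffman code built from symbol counts."""
    tie = count()  # unique tie-breaker so the heap never has to compare dicts
    heap = [(n, next(tie), {sym: ""}) for sym, n in Counter(text).items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # a single distinct symbol still needs one bit
        return {sym: "0" for sym in heap[0][2]}
    while len(heap) > 1:
        n0, _, zeros = heapq.heappop(heap)
        n1, _, ones = heapq.heappop(heap)
        merged = {sym: "0" + code for sym, code in zeros.items()}
        merged.update({sym: "1" + code for sym, code in ones.items()})
        heapq.heappush(heap, (n0 + n1, next(tie), merged))
    return heap[0][2]


def encode(text, codes):
    return "".join(codes[sym] for sym in text)


def decode(bits, codes):
    """Decode by reading bits until they match a code; works because no code is a prefix of another."""
    lookup = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:
            out.append(lookup[current])
            current = ""
    return "".join(out)


if __name__ == "__main__":
    codes = huffman_codes("HELLO WORLD")
    bits = encode("HELLO WORLD", codes)
    print(codes)
    print(len(bits), "bits")
```

Unlike the ambiguous variable-length table shown earlier, the bit stream here decodes in exactly one way.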

Page 27: How GZIP compression works - JS Conf EU 2014

HUFFMAN CODING

TABLE 1: LITERALS + LENGTHS

TABLE 2: DISTANCES

Page 28: How GZIP compression works - JS Conf EU 2014

BLOCKS

BLOCK 1   BLOCK 2   …   BLOCK N
M         M         M   M

MODE 1: NO COMPRESSION

MODE 2: FIXED CODE TABLES

MODE 3: GENERATED CODE TABLES
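
The mode marker of each block can be read straight from a raw DEFLATE stream. In the DEFLATE format every block starts with one "final block" bit followed by a two-bit type: 0 for stored (no compression), 1 for fixed tables and 2 for generated (dynamic) tables. The Python sketch below inspects only the first block; the helper name `first_block_type` is an assumption of this example.

```python
import zlib

BLOCK_TYPES = {
    0: "no compression (stored)",
    1: "fixed code tables",
    2: "generated code tables",
}


def first_block_type(data: bytes, level: int):
    """Compress to raw DEFLATE and return (is_final, block_type) of the first block."""
    # wbits=-15 asks zlib for a raw DEFLATE stream without zlib or gzip framing.
    compressor = zlib.compressobj(level, wbits=-15)
    raw = compressor.compress(data) + compressor.flush()
    header = raw[0]
    is_final = bool(header & 0b1)      # bit 0: is this the last block?
    block_type = (header >> 1) & 0b11  # bits 1-2: the block mode
    return is_final, block_type


if __name__ == "__main__":
    sample = b"this file is huge! that's because the file is not compressed " * 4
    for level in (0, 9):
        final, kind = first_block_type(sample, level)
        print(f"level {level}: final={final}, mode={BLOCK_TYPES[kind]}")
```

Level 0 forces stored blocks; at higher levels the compressor chooses between fixed and generated tables depending on which produces the smaller block.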

Page 29: How GZIP compression works - JS Conf EU 2014

flickr.com/photos/functoruser/2436979033

Page 30: How GZIP compression works - JS Conf EU 2014

GZIP COMPRESSION
IMPLEMENTATIONS

Page 31: How GZIP compression works - JS Conf EU 2014

IMPLEMENTATIONS

GNU GZIP · 7-ZIP · ZOPFLI

MODE FAST

MODE NORMAL

MODE HIGH COMPRESSION

GENERAL RULE: MORE TIME, BETTER COMPRESSION RATIO
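
The general rule can be observed without installing any of the three tools. As a stand-in (an assumption of this sketch, not the comparison made in the talk), the Python code below times the standard library's zlib at levels 1, 6 and 9, which loosely correspond to fast, normal and high compression modes.

```python
import time
import zlib


def compare_levels(data: bytes, levels=(1, 6, 9)):
    """Return {level: (compressed_size, seconds)} for each zlib compression level."""
    results = {}
    for level in levels:
        start = time.perf_counter()
        packed = zlib.compress(data, level)
        elapsed = time.perf_counter() - start
        results[level] = (len(packed), elapsed)
    return results


if __name__ == "__main__":
    # Hypothetical sample input: semi-repetitive text generated for this example.
    sample = "".join(
        f"line {i}: the quick brown fox #{i % 97}\n" for i in range(20000)
    ).encode("utf-8")
    for level, (size, seconds) in compare_levels(sample).items():
        print(f"level {level}: {size} bytes in {seconds * 1000:.1f} ms")
```

Timings vary by machine and input, so the numbers are only meaningful relative to each other on the same run.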

Page 32: How GZIP compression works - JS Conf EU 2014

GZIP COMPRESSION
WHY GZIP?

Page 33: How GZIP compression works - JS Conf EU 2014

TRADEOFF

• GOOD COMPRESSION RATIO.
• FAST TO (UN)COMPRESS.
• IN THE WORST CASE, EXPANDS THE DATA SLIGHTLY.
• MEMORY FOOTPRINT INDEPENDENT OF THE INPUT SIZE.
• FREE IMPLEMENTATIONS THAT AVOID PATENTS.
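
The worst-case point is easy to check: data with no redundancy cannot be squeezed, so gzip falls back to storing it and only adds a small amount of framing. The Python sketch below compares pseudo-random bytes with repetitive text. The helper name `gzip_overhead`, the seed and the sample sizes are assumptions chosen for this example.

```python
import gzip
import random


def gzip_overhead(data: bytes) -> int:
    """Return how many bytes gzip adds (positive) or saves (negative)."""
    return len(gzip.compress(data)) - len(data)


if __name__ == "__main__":
    rng = random.Random(2014)  # fixed seed so the run is repeatable
    noise = bytes(rng.getrandbits(8) for _ in range(64 * 1024))
    text = b"this file is huge! that's because the file is not compressed\n" * 1000
    print("random bytes:   ", gzip_overhead(noise))
    print("repetitive text:", gzip_overhead(text))
```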

Page 34: How GZIP compression works - JS Conf EU 2014

NEWER ALGORITHMS
ISSUES TRYING TO ADD BZIP2 SUPPORT TO CHROME

Page 35: How GZIP compression works - JS Conf EU 2014

GZIP COMPRESSION
BEYOND GZIP

Page 36: How GZIP compression works - JS Conf EU 2014

PREPROCESS DATA TO OPTIMIZE MATCHES

Page 37: How GZIP compression works - JS Conf EU 2014
Page 38: How GZIP compression works - JS Conf EU 2014

GZIP(T(DATA)) < GZIP(DATA)

Page 39: How GZIP compression works - JS Conf EU 2014

TRANSPOSING JSON

[
  { "name": "John", "country": "USA" },
  { "name": "Stephan", "country": "Germany" },
  { "name": "Rob", "country": "USA" }
]

{ "name": [ "John", "Stephan", "Rob" ], "country": [ "USA", "Germany", "USA" ] }
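
The transformation T for this slide can be sketched in a few lines of Python. The helpers `transpose`, `untranspose` and `gzipped_size` are hypothetical names for this example, and the sketch assumes every record has the same keys. With only three records the size difference may be negligible or even reversed; the effect is meant to show up on larger, regular datasets.

```python
import gzip
import json


def transpose(records):
    """Turn a list of same-keyed objects into one object of column lists."""
    keys = list(records[0]) if records else []
    return {key: [record[key] for record in records] for key in keys}


def untranspose(columns):
    """Inverse of transpose: rebuild the original list of objects."""
    keys = list(columns)
    rows = zip(*(columns[key] for key in keys))
    return [dict(zip(keys, row)) for row in rows]


def gzipped_size(value) -> int:
    """Size in bytes of the gzip-compressed JSON encoding of value."""
    return len(gzip.compress(json.dumps(value).encode("utf-8")))


if __name__ == "__main__":
    records = [
        {"name": "John", "country": "USA"},
        {"name": "Stephan", "country": "Germany"},
        {"name": "Rob", "country": "USA"},
    ]
    print("rows:   ", gzipped_size(records))
    print("columns:", gzipped_size(transpose(records)))
```

Because the transform is reversible, the client can restore the original records after decompression.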

Page 40: How GZIP compression works - JS Conf EU 2014

XML/HTML ATTRIBUTES ORDER

<input id='f1' class='field' name="f1" type="text" /> <input class="field" id="f2" type="text" name="f2" />
17.76%

<input id="f1" class="field" name="f1" type="text" /> <input class="field" id="f2" type="text" name="f2" />
27.10%

<input id="f1" class="field" name="f1" type="text" /> <input id="f2" class="field" name="f2" type="text" />
38.32%

<input type="text" class="field" id="f1" name="f1" /> <input type="text" class="field" id="f2" name="f2" />
38.32%

http://goo.gl/GgMw26
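
The effect of consistent quoting and attribute order can be measured with a short Python sketch. It assumes raw DEFLATE at zlib level 9 as the compressor, which is a guess about how the slide's figures were produced, so the percentages it prints may not match the slide exactly. The four strings are the variants shown above.

```python
import zlib

VARIANTS = [
    # 1. Mixed quotes, different attribute order in each tag.
    ("<input id='f1' class='field' " 'name="f1" type="text" /> '
     '<input class="field" id="f2" type="text" name="f2" />'),
    # 2. Consistent quotes, different attribute order.
    ('<input id="f1" class="field" name="f1" type="text" /> '
     '<input class="field" id="f2" type="text" name="f2" />'),
    # 3. Consistent quotes and attribute order.
    ('<input id="f1" class="field" name="f1" type="text" /> '
     '<input id="f2" class="field" name="f2" type="text" />'),
    # 4. Consistent quotes and order, with the identical attributes grouped first.
    ('<input type="text" class="field" id="f1" name="f1" /> '
     '<input type="text" class="field" id="f2" name="f2" />'),
]


def deflate_saving(markup: str) -> float:
    """Return the percentage of bytes saved by raw DEFLATE at level 9."""
    raw = markup.encode("utf-8")
    compressor = zlib.compressobj(9, wbits=-15)  # wbits=-15: no zlib/gzip framing
    packed = compressor.compress(raw) + compressor.flush()
    return 100.0 * (1 - len(packed) / len(raw))


if __name__ == "__main__":
    for number, markup in enumerate(VARIANTS, start=1):
        print(f"variant {number}: {deflate_saving(markup):.2f}% saved")
```

The more the second tag repeats the first byte for byte, the longer the LZ77 matches and the fewer tokens the compressor has to emit.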

Page 41: How GZIP compression works - JS Conf EU 2014

REFERENCES

Page 42: How GZIP compression works - JS Conf EU 2014

“Compressor Head”, Colt McAnlis

Page 43: How GZIP compression works - JS Conf EU 2014

“Data Compression: The Complete Reference”, David Salomon

Page 44: How GZIP compression works - JS Conf EU 2014

“A Universal Algorithm for Sequential Data Compression”, Jacob Ziv & Abraham Lempel

Page 45: How GZIP compression works - JS Conf EU 2014

“A method for the construction of minimum redundancy codes”, David A. Huffman

Page 46: How GZIP compression works - JS Conf EU 2014

THANK YOU

Raúl Fraile @raulfraile

