
How GZIP works... in 10 minutes

Date post: 18-Dec-2014
Upload: raul-fraile
Description:
Slides of the talk at the deSymfonyDay unconference
Page 1: How GZIP works... in 10 minutes

How GZIP Compression Works Raul Fraile …in 10 minutes

Page 2: How GZIP works... in 10 minutes

About me

• PHP/Symfony2 developer at

• PHP 5.3 Zend Certified Engineer

• Symfony Certified Developer

• BS in Computer Science. Ms(Res) student in Computing Technologies.

• Open source: LadybugPHP

Page 3: How GZIP works... in 10 minutes

What is GZIP?

• GZIP is a lossless compression method: decompressing recovers the original data exactly.

• It has become the de facto lossless compression method for textual data on websites.
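The "lossless" claim can be checked with a round trip. This is a minimal sketch that assumes PHP's zlib extension is available; the sample string is made up for illustration.

```php
<?php
// Hypothetical sample text; any string would do.
$original = str_repeat('GZIP is lossless. ', 50);

$compressed = gzencode($original);   // gzip-compress
$restored   = gzdecode($compressed); // decompress again

var_dump($restored === $original);                 // bool(true)
var_dump(strlen($compressed) < strlen($original)); // bool(true)
```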

Page 4: How GZIP works... in 10 minutes

What is GZIP?

Web server

GET index.html
Accept-Encoding: gzip
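On the server side, the negotiation in this request could be sketched roughly as follows. `encodeResponse()` is a hypothetical helper, not part of PHP or Symfony, and it ignores quality values such as `gzip;q=0`; in practice the web server usually handles this itself.

```php
<?php
// Hypothetical helper: gzip the body only if the client advertised support.
// Simplification: quality values ("gzip;q=0.5") are stripped and ignored.
function encodeResponse(string $acceptEncoding, string $body): array
{
    $codings = [];
    foreach (explode(',', strtolower($acceptEncoding)) as $token) {
        $codings[] = trim(explode(';', $token)[0]);
    }

    if (in_array('gzip', $codings, true)) {
        return [
            'headers' => ['Content-Encoding' => 'gzip', 'Vary' => 'Accept-Encoding'],
            'body'    => gzencode($body),
        ];
    }

    // Client did not ask for gzip: send the body untouched.
    return ['headers' => ['Vary' => 'Accept-Encoding'], 'body' => $body];
}

$html = str_repeat('<p>Hello deSymfonyDay</p>', 40);
$response = encodeResponse('gzip, deflate', $html);

printf(
    "%s: %d -> %d bytes\n",
    $response['headers']['Content-Encoding'],
    strlen($html),
    strlen($response['body'])
);
```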

Page 5: How GZIP works... in 10 minutes

How does it work?

• It is based on the DEFLATE algorithm, which is a combination of LZ77 and Huffman coding.

• First, the LZ77 algorithm replaces repeated occurrences of data with references.

• Second, Huffman coding assigns shorter codes to more frequent “characters”.
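To see that a gzip file is essentially a DEFLATE stream in a small wrapper, the output of `gzencode()` can be taken apart by hand. This sketch assumes the header is exactly 10 bytes (no optional fields such as a file name), which is what `gzencode()` is expected to produce, and a 64-bit PHP build so that `unpack('V', ...)` yields unsigned values.

```php
<?php
$data = str_repeat('How GZIP works... in 10 minutes. ', 20);
$gzip = gzencode($data, 9);

// Header: magic bytes 1f 8b, then the compression method (8 = DEFLATE).
$magic  = bin2hex(substr($gzip, 0, 2));
$method = ord($gzip[2]);

// Assumption: 10-byte header, 8-byte trailer, raw DEFLATE data in between.
$deflateStream = substr($gzip, 10, -8);

// Trailer: CRC-32 and length of the uncompressed data, both little-endian.
$crc  = unpack('V', substr($gzip, -8, 4))[1];
$size = unpack('V', substr($gzip, -4))[1];

var_dump($magic === '1f8b');                   // bool(true)
var_dump($method === 8);                       // bool(true)
var_dump(gzinflate($deflateStream) === $data); // bool(true)
var_dump($crc === crc32($data));               // bool(true)
var_dump($size === strlen($data));             // bool(true)
```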

Page 6: How GZIP works... in 10 minutes

How does it work?

This file is huge! That's because the file is not compressed

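<33, 9>

LZ77: the second " file is " (9 characters, spaces included) is replaced by a reference to the first one, 33 characters back.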
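The reference on this slide can be reproduced with a naive longest-match search. This is only a sketch of the idea: `findLongestMatch()` is a hypothetical brute-force helper, whereas real DEFLATE encoders use hash chains and further heuristics. The 32768-byte window and the 258-byte maximum match length are the DEFLATE limits.

```php
<?php
// Brute-force search for the longest earlier occurrence of the text
// starting at $pos. Returns [distance, length], or [0, 0] if none.
function findLongestMatch(string $data, int $pos, int $window = 32768, int $maxLength = 258): array
{
    $bestDistance = 0;
    $bestLength = 0;
    $end = strlen($data);

    for ($candidate = max(0, $pos - $window); $candidate < $pos; $candidate++) {
        $length = 0;
        while ($pos + $length < $end
            && $length < $maxLength
            && $data[$candidate + $length] === $data[$pos + $length]) {
            $length++;
        }
        // ">=" keeps the closest candidate when several are equally long.
        if ($length > 0 && $length >= $bestLength) {
            $bestLength = $length;
            $bestDistance = $pos - $candidate;
        }
    }

    return [$bestDistance, $bestLength];
}

$text = "This file is huge! That's because the file is not compressed";

// Offset 37 is the space in front of the second "file".
print_r(findLongestMatch($text, 37)); // [33, 9]
```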

Page 7: How GZIP works... in 10 minutes

How does it work?

“compressed”

Huffman coding

Frequencies: c: 1, o: 1, m: 1, p: 1, r: 1, e: 2, s: 2, d: 1

ASCII (80 bits): 01100011 01101111 01101101 01110000 01110010 01100101 01110011 01110011 01100101 01100100

Huffman (30 bits): 1100 011 010 000 001 111 10 10 111 1101
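The 30-bit total can be checked by building a Huffman tree for “compressed”. This is a minimal sketch, assuming PHP 7 or later; `huffmanCodeLengths()` and `encodedBits()` are hypothetical helpers that compute code lengths only. The exact codes depend on how ties are broken, so they may differ from the slide, but the total is the same for any Huffman tree.

```php
<?php
// Returns symbol => code length in bits, built by repeatedly merging
// the two lightest nodes (text with a single distinct symbol gives 0).
function huffmanCodeLengths(string $text): array
{
    $nodes = [];
    $lengths = [];
    foreach (count_chars($text, 1) as $byte => $count) {
        $symbol = chr($byte);
        $nodes[] = ['weight' => $count, 'symbols' => [$symbol]];
        $lengths[$symbol] = 0;
    }

    while (count($nodes) > 1) {
        usort($nodes, function (array $a, array $b): int {
            return $a['weight'] <=> $b['weight'];
        });
        $first = array_shift($nodes);
        $second = array_shift($nodes);

        // Every symbol below the new parent node gets one bit deeper.
        $symbols = array_merge($first['symbols'], $second['symbols']);
        foreach ($symbols as $symbol) {
            $lengths[$symbol]++;
        }
        $nodes[] = [
            'weight'  => $first['weight'] + $second['weight'],
            'symbols' => $symbols,
        ];
    }

    return $lengths;
}

// Total size of the text once every symbol is replaced by its code.
function encodedBits(string $text): int
{
    $lengths = huffmanCodeLengths($text);
    $bits = 0;
    foreach (count_chars($text, 1) as $byte => $count) {
        $bits += $count * $lengths[chr($byte)];
    }
    return $bits;
}

var_dump(encodedBits('compressed'));  // int(30)
var_dump(8 * strlen('compressed'));   // int(80)
```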

Page 8: How GZIP works... in 10 minutes

Why GZIP?

• GZIP is not the best compression method, but there are a few good reasons to use it.

• It provides a good trade-off between speed and compression ratio.

• It is difficult to add newer compression methods.
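GZIP's own compression levels give a rough feel for the speed/ratio trade-off. The sketch below is illustrative only: the input is generated dummy data, `measureLevel()` is a hypothetical helper, and timings vary by machine, so no expected output is shown.

```php
<?php
// Hypothetical test data: 5000 similar JSON-like lines.
$text = '';
for ($i = 0; $i < 5000; $i++) {
    $text .= sprintf("{\"id\": %d, \"name\": \"user%d\"}\n", $i, $i);
}

// Compress at one level and report the resulting size and elapsed time.
function measureLevel(string $data, int $level): array
{
    $start = microtime(true);
    $compressed = gzencode($data, $level);
    $seconds = microtime(true) - $start;

    return ['level' => $level, 'bytes' => strlen($compressed), 'seconds' => $seconds];
}

foreach ([1, 6, 9] as $level) {
    $result = measureLevel($text, $level);
    printf(
        "level %d: %d bytes (%.1f %% of original) in %.4f s\n",
        $result['level'],
        $result['bytes'],
        100 * $result['bytes'] / strlen($text),
        $result['seconds']
    );
}
```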

Page 9: How GZIP works... in 10 minutes

Implementations

• GNU GZIP

• 7-zip

• Zopfli

Different implementations, different results

Page 10: How GZIP works... in 10 minutes

GZIP + PHP

$originalFile = __DIR__ . '/jquery-1.11.0.min.js';
$gzipFile = __DIR__ . '/jquery-1.11.0.min.js.gz';

$originalData = file_get_contents($originalFile);
$gzipData = gzencode($originalData, 9);
file_put_contents($gzipFile, $gzipData);

var_dump(filesize($originalFile)); // int(96380)
var_dump(filesize($gzipFile)); // int(33305)

Page 11: How GZIP works... in 10 minutes

Beyond GZIP

• Preprocessing the text can have an impact on the compression ratio.

• How? By optimizing matches, so that LZ77 finds longer and more frequent repetitions.

Page 12: How GZIP works... in 10 minutes

Beyond GZIP

Page 13: How GZIP works... in 10 minutes

Beyond GZIP

{ "name": "Raul", "country": "Spain" }, { "name": "Pablo", "country": "USA" }, { "name": "Pedro", "country": "Spain" }

Transposing JSON

{ "name": [ "Raul", "Pablo", "Pedro" ], "country": [ "Spain", "USA", "Spain" ] }
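The transposition on this slide could be sketched as below. `transposeRecords()` is a hypothetical helper that assumes every record has the same keys. The size comparison uses generated dummy data, and whether the transposed form compresses better depends on the data, so no sizes are claimed here.

```php
<?php
// Turns a list of records into one list of values per key.
// Assumption: all records share the same set of keys.
function transposeRecords(array $records): array
{
    $columns = [];
    foreach ($records as $record) {
        foreach ($record as $key => $value) {
            $columns[$key][] = $value;
        }
    }
    return $columns;
}

$records = [
    ['name' => 'Raul',  'country' => 'Spain'],
    ['name' => 'Pablo', 'country' => 'USA'],
    ['name' => 'Pedro', 'country' => 'Spain'],
];

$transposed = transposeRecords($records);
echo json_encode($transposed), "\n";
// {"name":["Raul","Pablo","Pedro"],"country":["Spain","USA","Spain"]}

// Compare gzip sizes on a larger, made-up data set.
$many = [];
for ($i = 0; $i < 1000; $i++) {
    $many[] = ['name' => "user$i", 'country' => $i % 3 === 0 ? 'USA' : 'Spain'];
}
printf("row-wise:   %d bytes\n", strlen(gzencode(json_encode($many), 9)));
printf("transposed: %d bytes\n", strlen(gzencode(json_encode(transposeRecords($many)), 9)));
```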

Page 14: How GZIP works... in 10 minutes

Beyond GZIP

Ordering XML/HTML attributes

Mixed quotes, mixed attribute order: 17.76 %

<input id='f1' class='field' name="f1" type="text" /> <input class="field" id="f2" type="text" name="f2" />

Consistent quotes, mixed attribute order: 27.10 %

<input id="f1" class="field" name="f1" type="text" /> <input class="field" id="f2" type="text" name="f2" />

Consistent quotes, same attribute order: 38.32 %

<input id="f1" class="field" name="f1" type="text" /> <input id="f2" class="field" name="f2" type="text" />

Consistent quotes, same attribute order (type first): 38.32 %

<input type="text" class="field" id="f1" name="f1" /> <input type="text" class="field" id="f2" name="f2" />
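A normalization step like the one shown on this slide could be sketched as follows. `normalizeTag()` and `gzipSavings()` are hypothetical helpers. The regular expression only handles simple self-closing tags with quoted attribute values (no embedded quotes), so this is not a general HTML rewriter. The percentages on the slide appear to be size reductions, and the figures printed here may differ from them because of the gzip header and the encoder used.

```php
<?php
// Rewrites a simple tag with double-quoted attributes in a fixed order.
// Attributes not listed in $order are placed last, alphabetically.
function normalizeTag(string $tag, array $order = ['type', 'class', 'id', 'name']): string
{
    if (!preg_match('/^<([a-zA-Z][a-zA-Z0-9]*)/', $tag, $nameMatch)) {
        return $tag; // not something this sketch understands
    }

    preg_match_all('/([a-zA-Z-]+)=(["\'])(.*?)\2/', $tag, $matches, PREG_SET_ORDER);
    $attributes = [];
    foreach ($matches as $match) {
        $attributes[$match[1]] = $match[3];
    }

    uksort($attributes, function (string $a, string $b) use ($order): int {
        $ia = array_search($a, $order, true);
        $ib = array_search($b, $order, true);
        $ia = $ia === false ? count($order) : $ia;
        $ib = $ib === false ? count($order) : $ib;
        return [$ia, $a] <=> [$ib, $b];
    });

    $parts = [];
    foreach ($attributes as $name => $value) {
        $parts[] = sprintf('%s="%s"', $name, $value);
    }

    return sprintf('<%s %s />', $nameMatch[1], implode(' ', $parts));
}

// Size reduction in percent after gzip (can be negative for tiny inputs).
function gzipSavings(string $data): float
{
    return 100 * (1 - strlen(gzencode($data, 9)) / strlen($data));
}

$before = "<input id='f1' class='field' name=\"f1\" type=\"text\" /> "
        . '<input class="field" id="f2" type="text" name="f2" />';
$after = implode(' ', array_map('normalizeTag', explode(' /> ', $before . ' ')));

echo $after, "\n";
printf("before: %.2f %%, after: %.2f %%\n", gzipSavings($before), gzipSavings($after));
```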

Page 15: How GZIP works... in 10 minutes

Thank you!
