Assembly 2005, Helsinki, July 2005 1
Crinkler
- compressing Windows 4k intros to EXE files
Aske Simon Christensen
Rune L. H. Stubbe
Overview
• Background
• Compression method
• Function import
• Header layout
• Demo
• Future plans
Why another one?
EXE file -> EXE optimizer -> CAB compressor -> BAT inserter -> BAT file
• Most common method: CAB dropping
• Dropping is a mess
• We want EXE files!
How is Crinkler different?
• The normal build process:
C/C++ files -> Compiler -> object / library files
ASM files -> Assembler -> object / library files
object / library files -> Linker -> Cruncher -> EXE file
How is Crinkler different?
• The Crinkler way:
C/C++ files -> Compiler -> object / library files
ASM files -> Assembler -> object / library files
object / library files -> Crinkler -> EXE file
Why another one?
• Control over code and data placement
  – Choose base address
  – Optimize order for best compression
  – Separate code and data
  – Put in extra code
• Import code
• Code transformations
Compression method
• Context modelling
+ Much better compression ratio than LZX
+ Well suited for small amounts of data
+ Small decompression code (< 250 bytes)
+ Pays off even with the extra header
- Extremely slow
- Very memory-hungry
Data compression basics
• Take advantage of self-similarity
• Find patterns and eliminate them
• Dictionary compression
• Statistical compression
Dictionary compression
• LZ77: Refer repetitions back to original
• Reasonable compression ratio
• Fast compression
• Very fast decompression
Example: M I S S I S S I P P I
Encoded: M I S S [ISSI] P P I, where [ISSI] is a reference back to the earlier occurrence
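The MISSISSIPPI example can be reproduced with a minimal greedy LZ77 sketch. The token format (literal characters and `(offset, length)` pairs), the window size and the minimum match length are assumptions chosen for illustration, not something taken from the slides.

```python
def lz77_encode(data, window=255, min_match=3):
    """Greedy LZ77: emit literal characters or (offset, length) back-references."""
    tokens = []
    pos = 0
    while pos < len(data):
        best_len, best_off = 0, 0
        for start in range(max(0, pos - window), pos):
            length = 0
            # A match may run past `pos`, i.e. overlap the text it is producing.
            while (pos + length < len(data)
                   and data[start + length] == data[pos + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, pos - start
        if best_len >= min_match:
            tokens.append((best_off, best_len))
            pos += best_len
        else:
            tokens.append(data[pos])
            pos += 1
    return tokens


def lz77_decode(tokens):
    """Rebuild the text by copying literals and resolving back-references."""
    out = []
    for tok in tokens:
        if isinstance(tok, tuple):
            offset, length = tok
            for _ in range(length):
                out.append(out[-offset])  # copy one symbol at a time (handles overlap)
        else:
            out.append(tok)
    return "".join(out)


print(lz77_encode("MISSISSIPPI"))
# ['M', 'I', 'S', 'S', (3, 4), 'P', 'P', 'I']
```

The `(3, 4)` token is the repeated ISSI: go back 3 symbols and copy 4.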
Statistical compression
• Estimate probability distribution of each symbol based on earlier data
• PPM (Prediction by Partial Matching):
  M I S S I S S I P P I
• Problem: local
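To make "estimate the probability of each symbol from earlier data" concrete, here is a small sketch that sums the ideal code length, -log2(p), under an adaptive order-k model. It uses plain add-one smoothing rather than PPM's escape mechanism, so it is a simplification and not PPM itself; the function name and the `alphabet_size` parameter are assumptions.

```python
import math
from collections import defaultdict


def ideal_code_length(text, order, alphabet_size=256):
    """Bits an ideal entropy coder would need under an adaptive order-`order` model."""
    counts = defaultdict(lambda: defaultdict(int))
    total_bits = 0.0
    for i, symbol in enumerate(text):
        context = text[max(0, i - order):i]   # the `order` preceding symbols
        seen = counts[context]
        total = sum(seen.values())
        # Add-one smoothing keeps unseen symbols at a nonzero probability.
        p = (seen[symbol] + 1) / (total + alphabet_size)
        total_bits += -math.log2(p)
        seen[symbol] += 1                      # learn only after coding the symbol
    return total_bits


text = "AB" * 50
print(ideal_code_length(text, 0, alphabet_size=2))
print(ideal_code_length(text, 1, alphabet_size=2))
```

On strictly alternating input the order-1 model needs far fewer bits than the order-0 model, because the preceding symbol fully determines the next one.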
Context modelling
M I S S I S S I P P I
• Generalization of PPM
• Look at combinations of recent symbols
• A bit mask describes a model: 0 0 0 0 0 1 0 0
• Problem: Many masks to choose from
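A sketch of how a bit mask can select which recent symbols form the context. The bit order is an assumption made for illustration (bit 0 = most recent byte, bit 7 = the byte eight positions back); the slides do not state which end of the mask is which.

```python
def masked_context(history, mask):
    """Return the previous bytes selected by an 8-bit mask, as a tuple.

    Assumed convention: bit 0 selects the most recent byte,
    bit 7 the byte eight positions back.
    """
    selected = []
    for distance in range(1, 9):
        if mask & (1 << (distance - 1)):
            # Positions before the start of the data are treated as 0.
            selected.append(history[-distance] if distance <= len(history) else 0)
    return tuple(selected)


history = b"MISSISS"
print(masked_context(history, 0b00000100))  # only the byte 3 positions back
print(masked_context(history, 0b00000011))  # the two most recent bytes
```

Under this convention a PPM-style order-k context is the mask with the k lowest bits set, while other masks skip over bytes, which is what makes the scheme a generalization.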
Implementation
• Estimation for each single bit
• Context is current byte + selection of last 8
• Estimate the best collection of masks
• Estimate the best weights of the masks
• Keep track of contexts in a hash table
• Ignore hash collisions
• Find hash table size with few collisions
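The points above can be put together in a rough size estimator: one prediction per bit, contexts built from the bits of the current byte seen so far plus mask-selected previous bytes, counts kept in a hash table whose collisions are ignored. The count-based weighting, the smoothing and the use of Python's built-in `hash` are assumptions for illustration only and should not be read as Crinkler's actual formulas.

```python
import math


def estimate_size(data, masks, weights, table_bits=16):
    """Estimate the compressed size of `data` in bits under a weighted-count bit model."""
    size = 1 << table_bits
    zeros = [0] * size  # per-slot count of 0 bits seen
    ones = [0] * size   # per-slot count of 1 bits seen
    total_bits = 0.0
    for pos, byte in enumerate(data):
        history = data[max(0, pos - 8):pos]
        partial = 1  # leading 1 marks how many bits of the current byte are known
        for bit_index in range(7, -1, -1):
            bit = (byte >> bit_index) & 1
            slots = []
            n0 = n1 = 0.0
            for mask, weight in zip(masks, weights):
                ctx = tuple(history[-d] if d <= len(history) else 0
                            for d in range(1, 9) if mask & (1 << (d - 1)))
                # Collisions are ignored: colliding contexts simply share counts.
                slot = hash((mask, partial, ctx)) % size
                slots.append(slot)
                n0 += weight * zeros[slot]
                n1 += weight * ones[slot]
            p_one = (n1 + 1.0) / (n0 + n1 + 2.0)
            total_bits += -math.log2(p_one if bit else 1.0 - p_one)
            for slot in slots:  # update every model with the actual bit
                if bit:
                    ones[slot] += 1
                else:
                    zeros[slot] += 1
            partial = (partial << 1) | bit
    return total_bits


data = b"MISSISSIPPI" * 20
print(estimate_size(data, [0b0], [1]))
print(estimate_size(data, [0b0, 0b1, 0b11], [1, 2, 4]))
```

Searching for a good collection of masks and weights then amounts to calling such an estimator many times with different candidates, which is one way to see why this style of compression is slow.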
Function import
• Import by name: Name of each function
  – The import table is a big part of an EXE file
• Import by ordinal: Number instead of name
  – Much smaller but quite incompatible
• Import by hash: Hash code of each function
  – Small and compatible
  – Not supported directly
• Import by hashed ordinal range
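Import by hash can be sketched as follows: store only a hash code per function and, at load time, scan the export names of a DLL until one hashes to the stored value. The hash function here (rotate left by 5, then add) is a hypothetical choice, and the export list is a plain Python list standing in for a DLL's export table.

```python
def name_hash(name):
    """Hypothetical 32-bit rotate-and-add hash; not necessarily the one Crinkler uses."""
    h = 0
    for ch in name.encode("ascii"):
        h = ((h << 5) | (h >> 27)) & 0xFFFFFFFF  # rotate left by 5 bits
        h = (h + ch) & 0xFFFFFFFF
    return h


def resolve_by_hash(export_names, wanted_hash):
    """Return the index of the first export whose name hashes to `wanted_hash`."""
    for index, name in enumerate(export_names):
        if name_hash(name) == wanted_hash:
            return index
    return None  # no export matches


# Stand-in for a DLL's export name table.
exports = ["CreateFileA", "ExitProcess", "GetTickCount", "Sleep"]
stored = name_hash("Sleep")  # the only thing the EXE would need to store
print(resolve_by_hash(exports, stored))
```

The EXE then carries 4 bytes per function instead of the full name, at the cost of a small lookup loop in the extra import code.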
Header optimization
[Figure: EXE header layout showing DOS header, DOS stub, PE offset, PE header, data directories and section header]
544 bytes!
[Figure: the same header layout, with fields marked "Ignored"]
196 bytes!
[Figure: the same header layout, with fields marked "Hash code"]
124 bytes + 18 hash codes!
Demo
Future plans
• Windows 2000 compatibility
• Even better compression
• Section reordering
• Transformations
• More feedback
• 64k specialized version
Thank you
Questions?
Comments?
Suggestions?