Post on 06-Jun-2015
Windows 8 Disk Deduplication Deep Dive
Ronald Beekelaar Virsoft Solutions
ronald@beekelaar.com
Schiphol, 19 jan 2012
Introductions
• Presenter
– MVP Security – MVP Virtual Machine Technology – E-mail: ronald@beekelaar.com
• Work
– Security consultancy – Virtualization consultancy – Create many VM-based labs and demos – Software to optimize, manage and run VM – Maintain four datacenters world-wide
• Running Hyper-V labs for customers (MOC, training and demo purposes)
Objectives
• Discuss one interesting new aspect of Windows 8: Disk Deduplication
What is Disk Deduplication ?
• Goal:
– Use less storage space
• Method:
– Ensure that identical content in multiple (large) files is only stored once
• Is a block-based, post-process, transparent solution
Standard deduplication modes
• "Source" – Prevent transferring data, if duplicate
• Used by Remote Differential Compression
• "Inline" – Perform deduplication when data is written
• Used by NTFS file compression • Write process is slowed down
• "Post-Process" (or "Background")
– Perform deduplication later, in background, when idle • Used by Windows 8 Data Deduplication
Other methods to save disk space
• SIS (single-instance-store) in Win2000 – Is file-based, not block-based
• NTFS file compression – Is inline, not post-process – Much more CPU intensive
• NTFS hard links – Is not transparent – Is file-based, not block-based
NTFS Hard Links
• Multiple file entries pointing to same data
• Manage – Create: mklink /h link.ext target.ext
– List: fsutil hardlink list file.ext
• Is not transparent – Editing a file through one hard link also changes it under every other name
• Windows uses thousands of hard links (!) – Good reason not to touch C:\Windows\winsxs
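The "not transparent" point can be tried out with a short Python sketch. It assumes a file system that supports hard links; the file names are the placeholders from the slide, created in a temporary folder. `os.link` plays the role of `mklink /h`.

```python
import os
import tempfile


def demo_hard_link():
    """Create a file plus a hard link to it, then edit through the link."""
    with tempfile.TemporaryDirectory() as folder:
        target = os.path.join(folder, "target.ext")
        link = os.path.join(folder, "link.ext")

        with open(target, "w") as f:
            f.write("original")

        # Second directory entry for the same data (like: mklink /h link.ext target.ext)
        os.link(target, link)

        # Overwrite the content in place through the second name
        with open(link, "w") as f:
            f.write("changed")

        # Read back through the first name
        with open(target) as f:
            content_via_target = f.read()

        # Number of directory entries pointing at this data
        link_count = os.stat(target).st_nlink
        return content_via_target, link_count


content, count = demo_hard_link()
print(content, count)
```

Both names show the new content because they share one set of data. An application that saves by writing a new file and renaming it would break the link instead.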
Windows 8 dedup architecture
• Is file-system filter driver
– Coordinates between file entry, regular storage and 'chunk' storage
• Dedup service (ddpsvc) runs jobs to deduplicate files
How does Windows 8 dedup work?
• Dedup service recognizes common 'chunks' in files, and places those in Chunk Store – In System Volume Information folder
• Dedup filter driver ensures that applications read correct file content
• File "size" (= content length) does not change in Explorer – Explorer reports "size-on-disk" as 4 KB
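The interplay between file entry, chunk store and filter driver can be sketched as a toy model in Python. This is an illustration under loud assumptions, not the Windows implementation: `ToyVolume` and all of its methods are hypothetical names, the chunks are tiny and fixed-size (the real chunks are variable-size, 32-128 KB), and SHA-256 is just a convenient way to recognize identical chunks.

```python
import hashlib

CHUNK_SIZE = 4  # assumption: tiny fixed-size chunks, for readability only


class ToyVolume:
    def __init__(self):
        self.raw = {}          # file name -> bytes, not yet optimized
        self.recipes = {}      # file name -> list of chunk hashes, optimized
        self.chunk_store = {}  # chunk hash -> chunk bytes, each stored once

    def write(self, name, data):
        # Writes land as regular data; nothing is deduplicated at write time.
        self.recipes.pop(name, None)
        self.raw[name] = data

    def optimize(self):
        # Post-process job: move file content into the chunk store.
        for name in list(self.raw):
            data = self.raw.pop(name)
            recipe = []
            for i in range(0, len(data), CHUNK_SIZE):
                chunk = data[i:i + CHUNK_SIZE]
                key = hashlib.sha256(chunk).hexdigest()
                self.chunk_store.setdefault(key, chunk)  # keep first copy only
                recipe.append(key)
            self.recipes[name] = recipe

    def read(self, name):
        # Role of the filter driver: return the original content either way.
        if name in self.raw:
            return self.raw[name]
        return b"".join(self.chunk_store[key] for key in self.recipes[name])

    def stored_bytes(self):
        raw_bytes = sum(len(d) for d in self.raw.values())
        chunk_bytes = sum(len(c) for c in self.chunk_store.values())
        return raw_bytes + chunk_bytes


vol = ToyVolume()
vol.write("a.vhd", b"AAAABBBBCCCC")
vol.write("b.vhd", b"AAAABBBBDDDD")
before = vol.stored_bytes()
vol.optimize()
after = vol.stored_bytes()
print(before, after, vol.read("b.vhd"))
```

The two files share the chunks `AAAA` and `BBBB`, so those are stored once after the job runs, while `read` still returns the full content of each file.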
Windows 8 dedup details
• Dedup works per volume – Also works on portable disks – Dedup does NOT work on C: (Windows) volume
• Chunk size is 32-128 KB (average 80 KB)
• By default
– Chunks are compressed in chunk store
• Avoids re-compressing compressed files (zip, etc)
– Dedup service ignores files < 64 KB
– Dedup service ignores files changed in last 30 days
– Dedup service ignores NTFS encrypted files
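The default file-selection rules above can be written as a small predicate. This is a sketch of the rules as listed on the slide, not of the service's real logic. The function name and parameters are hypothetical, and treating a file changed exactly 30 days ago as eligible is an assumption.

```python
MIN_FILE_SIZE = 64 * 1024   # files smaller than 64 KB are ignored
DEFAULT_MIN_AGE_DAYS = 30   # files changed in the last 30 days are ignored


def is_dedup_candidate(size_bytes, days_since_change, encrypted,
                       min_age_days=DEFAULT_MIN_AGE_DAYS):
    """Return True if the default rules would let the file be deduplicated."""
    if encrypted:                          # NTFS encrypted files are skipped
        return False
    if size_bytes < MIN_FILE_SIZE:         # too small
        return False
    if days_since_change < min_age_days:   # changed too recently
        return False
    return True


print(is_dedup_candidate(100 * 1024, 45, False))                  # old, large file
print(is_dedup_candidate(100 * 1024, 3, False))                   # changed recently
print(is_dedup_candidate(100 * 1024, 3, False, min_age_days=0))   # age rule lowered to 0
```

Lowering `min_age_days` to 0 mirrors what the `-minimumfileagedays 0` setting does for a volume: recently changed files become candidates too.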
Savings?
• Depends on file content of course
• Microsoft reported averages:
– General: 50-60% savings
• Documents: 30-50% saving
• Application library: 70-80% savings
• VHD library: 80-95% savings
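To see what these percentages mean in disk space, a short calculation helps. The 1000 GB library size below is an invented example value, and the function name is hypothetical; only the 80-95% range comes from the figures above.

```python
def stored_size_range(logical_gb, low_savings_pct, high_savings_pct):
    """Return (best_case_gb, worst_case_gb) occupied on disk after dedup."""
    best = logical_gb * (100 - high_savings_pct) / 100
    worst = logical_gb * (100 - low_savings_pct) / 100
    return best, worst


# Hypothetical 1000 GB VHD library at the reported 80-95% savings
best, worst = stored_size_range(1000, 80, 95)
print(best, worst)  # 50.0 200.0
```

With these numbers, a 1000 GB VHD library would occupy roughly 50 to 200 GB after optimization.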
Performance?
• Write has no direct performance hit – Dedup operations are done post-process
• Read has a ~3% performance hit (if not in cache) – Due to more disk head operations
– Compare with disk fragmentation
• Windows caching is dedup-aware (!) – Dedup improves caching efficiency
Reliable?
• My opinion: Yes - 100%
• Data is check-summed
– Means: invalid data is detected
• Operations are crash consistent – Means: can interrupt/crash operation at any time without losing data
• Data is self-describing – Means: it can be read without external data
• Popular 'chunks' (>100x) are stored multiple times – Means: avoids creating IO hotspots on disk
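"Invalid data is detected" can be illustrated with a minimal checksum sketch. The slides do not say which checksum Windows uses, so CRC32 from Python's `zlib` module is an assumption chosen for brevity, and `store_chunk` / `read_chunk` are hypothetical helper names.

```python
import zlib


def store_chunk(data):
    """Keep the chunk together with a checksum of its content."""
    return {"data": data, "crc": zlib.crc32(data)}


def read_chunk(record):
    """Return the chunk, or raise if it no longer matches its checksum."""
    if zlib.crc32(record["data"]) != record["crc"]:
        raise ValueError("chunk checksum mismatch")
    return record["data"]


record = store_chunk(b"hello")
print(read_chunk(record))

record["data"] = b"hellp"  # simulate corruption on disk
try:
    read_chunk(record)
except ValueError as err:
    print("detected:", err)
```

The corrupted chunk is rejected on read instead of being returned as if it were valid.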
January 20, 2012 NIC 2012
How to enable Windows 8 dedup?
• Install Data Deduplication role service
• Start Data Deduplication Service (ddpsvc)
• PowerShell
– Import-Module Deduplication
– help dedup
– Enable-DedupVolume D:
– Set-DedupVolume D: -MinimumFileAgeDays 0
• Default is 30 days
– Start-DedupJob D: -Type Optimization
• Use Unoptimization to undo
– Get-DedupJob
– Get-DedupStatus
– Get-DedupMetadata
Questions ?
• Thanks for your attention