Energy-Efficiency of Long-term Storage
Irina Tolokonnikova
Seminar "Energy-Efficient Programming"
Arbeitsbereich Wissenschaftliches Rechnen
Fachbereich Informatik
Fakultät für Mathematik, Informatik und Naturwissenschaften
Universität Hamburg
2015-01-14
Archive Data Storage Devices State of Research Conclusion References
Agenda
1 Archive
2 Data Storage Devices (data storage methods, tape, HDD, MAIDs)
3 State of Research
4 Conclusion
5 References
Irina Tolokonnikova Energy-Efficiency of Long-term Storage 1 / 28
Archive
storage of digital data for many years
requirements:
preservation, retrieval, auditing
archival data ≠ backup data
needs to be cheap to obtain, cheap to operate, easy to expand
high costs for energy consumption
→ room for improvement
How much data are we talking about?
DKRZ: > 100 PetaBytes total capacity [1]
Google: ∼ 15 ExaBytes (in 2013) = 15000 PetaBytes (an estimate only)
Figure: 15 ExaBytes of punch cards would be enough to cover New England, to a depth of about 4.5 kilometers [2]
not this
Figure: LP [wikipedia.org]
not this
Figure: punch card [wikipedia.org]
not this
Figure: a United States National Archives Records Service facility in 1959. Each carton could hold 2000 cards [wikipedia.org]
not this
Figure: 3.5-inch floppy disk
not this?
Figure: compact cassette [wikipedia.org]
Tape
used as a cartridge with a single reel
holds several tens to thousands of GB (as of 13.01.15, according to wikipedia.org)
Oracle StorageTek T10000 T2 cartridges hold 8.5 TB
Figure: components of a tape drive [3]
DKRZ
7 automated Oracle/StorageTek SL8500 tape libraries
8 robots per library
over 67000 slots for magnetic tape cassettes
Figure: Inside the Tape library of DKRZ [1]
lifetime and costs
lifetime: 30 years
costs: less than 1 cent per GB
238X less energy over 12 years than HDD
Figure: upgrades save money and space [3]
pros and cons
Pros                                 Cons
cheap                                needs special, expensive equipment
long lifetime                        sequential access pattern
no power needed when not accessed
Hard Drives
easy and fast access to stored data
searching, consistency checking and inter-media reliability operations
costs: 0.07 $ per GB and falling
lifetime: 10 years, but the mechanics break easily
Figure: Laptop hard disk drive, 500 GB Western Digital Scorpio [wikipedia.org]
pros and cons
Pros                                 Cons
easy access, simple system           needs much power, even when turned off
matches requirements of big data     easy to break
higher bandwidth (200X)              needs extra space for redundancy
Colarelli, Grunwald et al. (2002) [4]
massive array of idle disks = MAID
aim: storage densities matching those of tape, with reduced energy consumption
but operating the same data volume on disks costs 10X more than on tape
idea: use a cache manager to keep only part of the disks in an array powered up
varying spin-down delays
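The effect of the spin-down delay can be explored with a minimal sketch, assuming a single disk that stays spinning for a fixed delay after its last request and is then powered down. The power and energy figures (`ACTIVE_W`, `STANDBY_W`, `SPINUP_J`) are hypothetical placeholders, not values from the MAID paper, and the name `disk_energy` is made up for this illustration.

```python
# Hypothetical drive parameters, chosen only for illustration.
ACTIVE_W = 8.0     # power while spinning (watts)
STANDBY_W = 1.0    # power while spun down (watts)
SPINUP_J = 100.0   # extra energy for one spin-up (joules)


def disk_energy(request_times, spin_down_delay, horizon):
    """Return (energy in joules, number of spin-ups) for one disk.

    request_times: sorted request arrival times in seconds.
    spin_down_delay: idle seconds after which the disk is spun down.
    horizon: end of the simulated interval, at or after the last request.
    The disk starts spun down; request service time is ignored.
    """
    energy = 0.0
    spin_ups = 0
    now = 0.0
    spinning_until = None  # None while the disk is in standby
    for t in request_times:
        if spinning_until is not None and t <= spinning_until:
            # The disk was still spinning when the request arrived.
            energy += ACTIVE_W * (t - now)
        else:
            if spinning_until is not None:
                # Finish the active period, then account for standby.
                energy += ACTIVE_W * (spinning_until - now)
                now = spinning_until
            energy += STANDBY_W * (t - now) + SPINUP_J
            spin_ups += 1
        now = t
        spinning_until = t + spin_down_delay
    if spinning_until is not None:
        end_active = min(spinning_until, horizon)
        energy += ACTIVE_W * (end_active - now)
        now = end_active
    energy += STANDBY_W * (horizon - now)
    return energy, spin_ups
```

Calling `disk_energy` on the same request trace with several delays shows the trade-off: a short delay saves idle power but causes more spin-ups, while a long delay does the opposite.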
Results
good trade-off between performance and energy efficiency
read performance still affected by the spin-down delay
but 82% of read requests were satisfied by the cache
least energy consumed with 4 sec spin-down delay
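A rough worked bound may show why the cache hit rate matters for read performance. The symbols $t_{\text{cache}}$, $t_{\text{disk}}$ and $t_{\text{spinup}}$ are assumptions introduced here for illustration and do not come from the paper. The bound is pessimistic because it treats every cache miss as if it reached a spun-down disk.

```latex
\[
  E[t_{\text{read}}] \;\le\; h \, t_{\text{cache}}
    + (1 - h)\,\bigl(t_{\text{spinup}} + t_{\text{disk}}\bigr),
  \qquad h = 0.82
\]
```

With $h = 0.82$, at most 18% of reads can pay the spin-up penalty.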
SSD
costs: 0.66 $ per GB, still too expensive
lifetime depends on usage, ∼10 years
still unclear how data that is never accessed behaves on SSD
coming soon?
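The per-GB prices quoted in the slides can be put side by side with a short sketch. It assumes decimal units (1 PB = 10^6 GB) and uses the tape figure as an upper bound, since the slides only say "less than 1 cent per GB". The names `PRICE_PER_GB` and `media_cost` are made up for this illustration, and the calculation covers media only, not drives, libraries, redundancy or energy.

```python
# Per-GB prices quoted in the slides (US dollars).
# The tape value is an upper bound ("less than 1 cent per GB").
PRICE_PER_GB = {"tape": 0.01, "hdd": 0.07, "ssd": 0.66}

GB_PER_PB = 1_000_000  # assumption: decimal units, 1 PB = 10^6 GB


def media_cost(capacity_pb, medium):
    """Media purchase cost in dollars for capacity_pb petabytes."""
    return capacity_pb * GB_PER_PB * PRICE_PER_GB[medium]


if __name__ == "__main__":
    # 100 PB is the order of magnitude quoted for the DKRZ archive.
    for medium in PRICE_PER_GB:
        print(f"{medium}: {media_cost(100, medium):,.0f} $")
```

At these prices the gap between tape and SSD is roughly a factor of 66 for the media alone.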
Pergamum tomes by Storer et al. (2008)
interfaces and protocols change slowly
using inter- and intra-device redundancy
works energy-efficiently by not spinning up idle disks
→ intelligent, self-managing storage device
Figure: Pergamum tome, redrawn
Results of Pergamum
size of the hard drive
nonvolatile RAM handles many types of requests (e.g. hashes) without spinning up the disk
using signatures for redundancy checking in the entire inter-disk group
using trees of hash values to reduce signature data
once added to the network, the tome automatically joins a redundancy group or builds a new one
→ makes storage management easier
using intra-device redundancy, recovering from small errors without other devices
aim to be price-competitive with tape
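The idea of a tree of hash values can be sketched as follows. This is a minimal illustration, not Pergamum's actual scheme: SHA-256 from the standard library stands in for the signatures used by the tomes, and the function names are made up for this example.

```python
import hashlib


def _h(data: bytes) -> bytes:
    """SHA-256 digest; a stand-in for the signature function."""
    return hashlib.sha256(data).digest()


def hash_tree_root(blocks):
    """Return the root hash of a binary hash tree built over blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    level = [_h(b) for b in blocks]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Duplicate the last node so every node has a sibling.
            level.append(level[-1])
        # Each parent is the hash of its two children.
        level = [_h(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]
```

Two devices that hold the same blocks only need to exchange the single root value to confirm that they agree. Lower levels of the tree have to be compared only when the roots differ, which is what keeps the amount of signature data small.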
Problems and improvements
still not included in data archives(?)
redundancy overhead, but much energy saved
"disposable" tomes
encoding time 10X longer than on a laptop processor BUT 10X less power consumed
future work:
better algorithms
parallel processes (distributed searching)
A Spin-Up Saved is Energy Earned, Greenan et al. (2008)
idea: use redundancies on active devices instead of waking up inactive ones
→ Power-aware coding
three conditions needed:
Figure: Three conditions for a power-aware system
Power Aware Techniques
rules known from Pergamum tome
Power Schedule
each code instance should have its own write policy
write in parallel across disk groups
Power-Aware Read Algorithm
minimize the number of disk activations
first find out whether lost data is recoverable
like solving a matrix where inactive devices are treated as erased
Disk Activation Algorithm
perform a search to find the best activation
how and when is a spin-down performed?
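The power-aware read idea can be sketched for the simplest possible code, a stripe with one XOR parity block. This is an illustration under stated assumptions, not the algorithm from the paper, which handles general erasure codes by solving a matrix. With single parity, a block on a sleeping disk is recoverable exactly when every other disk of the stripe is active. The names `xor_blocks` and `power_aware_read` are made up for this example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column)
                 for column in zip(*blocks))


def power_aware_read(stripe, active, index):
    """Read block `index` of a stripe, avoiding a spin-up if possible.

    stripe: list of blocks, one per disk; the last one is XOR parity.
    active: set of disk indices that are currently spun up.
    Returns (data, disks_to_activate).
    """
    if index in active:
        # The target disk is already spinning: read it directly.
        return stripe[index], []
    others = [i for i in range(len(stripe)) if i != index]
    sleeping = [i for i in others if i not in active]
    if not sleeping:
        # Treat the sleeping target as erased and rebuild it from
        # the active disks, so nothing has to be woken up.
        return xor_blocks([stripe[i] for i in others]), []
    # Not recoverable from active disks alone: waking only the
    # target disk costs a single activation.
    return stripe[index], [index]
```

A stripe would be built as `stripe = data_blocks + [xor_blocks(data_blocks)]`. The returned activation list is empty whenever the read could be served from disks that were already spinning.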
observation while testing
mind the trade-off trilemma!
Figure: the trade-off when trying power-aware coding
open questions:
which environments will benefit from power-aware coding?
how to find optimal policies?
robust metrics have to be developed for evaluating the power-reliability-performance trade-off
Conclusion
Figure: Disk compared to Tape [3]
Pergamum tomes by Storer et al.
Pergamum tomes added to networks
redundancy overhead used to recover from errors
energy saved by not spinning up other disks
self-managing system with "disposable" nodes
Power Aware Programming
use as few disks as possible, as efficiently as you can
mind the trade-off trilemma between fault tolerance, space efficiency and power efficiency
"Initial results show that power-aware coding may be well suited for the write-once, read-maybe workload of long-term archival storage systems." [6]
How would you store...
...(your own) private medical data?
...research data of a medical study?
...data of all patients of a hospital?
References
[1] https://www.dkrz.de/Klimarechner-en/datenarchiv (13.01.2015)
[2] https://what-if.xkcd.com/63/ (13.01.2015)
[3] Watson, Mark L. Advanced Tape Technologies for Future Archive Storage Systems. MSST, Media II (Tape Media and Libraries), 2013.
[4] Colarelli, Dennis, Dirk Grunwald, and Michael Neufeld. The case for massive arrays of idle disks (MAID). The 2002 Conference on File and Storage Technologies, 2002.
[5] Storer, Mark W., et al. Pergamum: Replacing tape with energy efficient, reliable, disk-based archival storage. Proceedings of the 6th USENIX Conference on File and Storage Technologies. USENIX Association, 2008.
[6] Greenan, Kevin M., et al. A Spin-Up Saved Is Energy Earned: Achieving Power-Efficient, Erasure-Coded Storage. HotDep, 2008.