Object-based Storage Devices and Intelligent Storage
“Yes – it’s a crazy idea but is it crazy enough?”
HP Labs
Thomas M. Ruwart
University of Minnesota
Digital Technology Center
Intelligent Storage [email protected]
January 27, 2003
2
Overview
• Introduction – The University of Minnesota Digital Technology Center and DISC
• The Big Picture
• What is OSD?
• Why OSD? What general problem(s) need to be addressed?
• Extensible Storage Architectures
• Intelligent Storage
• Summary
3
DTC Initiative
Digital Technology Center: Proposed in 1998 by President Mark Yudof to be the hub of innovation and excellence at the University of Minnesota in the digital technologies, serving the industrial, educational, and public needs of the state and the nation.
4
DTC Mission
• Promote communication and collaboration among the university, government, and industry, and serve as a point of entry into research and development partnerships.
• Create, promote, and coordinate cooperative interdisciplinary advanced technology initiatives between various university colleges and programs.
• Generate new ideas/learning and educate outstanding graduates who are prepared for the high technology industries of the 21st century.
5
DTC Role
[Figure: the Digital Technology Center as the U of M focal point. Resources, programs and research areas shown: Graphics & Visualization; Signal Processing & Wireless; System Recognition & Verification; Storage; Supercomputing; Networks; Databases & Data Mining; LCSE; Industrial Partners/Consortiums. Activities shown: Research, Education, Affiliate Programs, New Initiatives, Outreach, Partnership. Output shown: Outcomes and Results.]
6
Research Areas and Interaction
[Figure: research areas and participating faculty]
• Digital Design Consortium: Anderson (Arch), Chen (CSE), Interrante (CSE), Meyer (CSE), Piotrowski (Arch)
• Graphics & Visualization: Chen (CSE), Interrante (CSE), Meyer (CSE)
• Computational Biology: Gao (Chemistry), Grosberg (Physics), Kaznessis (CEMS), Othmer (Mathematics)
• Systems Recognition & Verification: Hsu (CSE), Lilja (ECE), Nadathur (CSE), Roychowdhury (ECE)
• Databases, Data Mining: Carlis (CSE), Karypis (CSE)
• Storage: Du (CSE), Weissman (CSE), Kim (CSE)
• Networks: Du (CSE), Tewfik (ECE), Zhang (CSE)
• Signal Processing, Wireless: Giannakis (ECE), Ottesen (ECE/Rochester), Sidiropoulos (ECE)
• AI, Robotics, and Vision: Gini (CSE), Papanikolopoulos (CSE), Voyles (CSE)
• Also shown in the figure: Woodward (LCSE), Sapiro (ECE), Tewfik (ECE), Lilja (ECE), Zhang (CSE)
7
DTC Intelligent Storage Consortium (DISC)
• Emphasize the application of Advanced Storage Technologies
• A balanced approach to research that includes:
  – Applications that need/use storage
  – Advanced and Emerging Storage Architectures
  – Advanced and Emerging Storage Technologies, both software and hardware
  – Business cases and aspects of the storage industry
    • Market Trends
    • Product Directions
    • Effects of these disruptive technologies
    • Adoption rates
• Provide consortium members with not just technology research but a more complete and significant outcome
8
Goal of this presentation
• Explore possibilities of storage evolution over the next 3, 5, 10, … years
  – Object-based Storage Devices
  – Intelligent Storage Devices and Systems
• Explain the benefits of OSD and Intelligent Storage from the technical and business perspectives
• Generate awareness of some current efforts in these two areas
9
What is OSD?
• Object-based Storage Devices – An Enabling Technology
• Grew out of the Network Attached Secure Disks (NASD) project at CMU
• A flexible and powerful protocol used to communicate with storage devices
• Proposed as a protocol extension to the SCSI command set
• Actively being pursued by the OSD Technical Working Group in the Storage Networking Industry Association (SNIA) and by the ASCI Trilabs
• It is a natural step in the evolution of storage interface protocols
• For some, however, it is very new and very different
ST506 SMD SCSI FC SCSI SCSI OSD OSD
1902 1985 1990 1998 2002? 200X
10
What OSD is NOT
• It is not intended or expected that the object abstraction be a complete file system
• There is NO notion of
  – Naming
  – Hierarchical relationships
  – Streams
  – File-system-style ownership and access control
• The omitted features are assumed still to be the responsibility of the OS file system
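The object abstraction described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a hypothetical in-memory device: the names `ObjectStore`, `create`, `write`, `read`, and `get_attr` are invented for this example and are not the SCSI OSD command set. Objects are addressed only by a flat numeric ID and carry attributes; naming, hierarchy, and ownership stay with the file system.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class StoredObject:
    data: bytearray = field(default_factory=bytearray)
    attrs: dict = field(default_factory=dict)


class ObjectStore:
    """Hypothetical object device: flat object IDs, no names or hierarchy."""

    def __init__(self):
        self._objects = {}
        self._next_id = count(1)

    def create(self, **attrs):
        # The device, not the host, picks the object ID.
        oid = next(self._next_id)
        self._objects[oid] = StoredObject(attrs=dict(attrs))
        return oid

    def write(self, oid, offset, payload):
        obj = self._objects[oid]
        end = offset + len(payload)
        if len(obj.data) < end:
            # Grow the object with zero bytes so the write fits.
            obj.data.extend(b"\x00" * (end - len(obj.data)))
        obj.data[offset:end] = payload

    def read(self, oid, offset, length):
        return bytes(self._objects[oid].data[offset:offset + length])

    def get_attr(self, oid, key):
        return self._objects[oid].attrs.get(key)


store = ObjectStore()
oid = store.create(content_type="text/plain")
store.write(oid, 0, b"hello")
store.write(oid, 5, b" world")
print(store.read(oid, 0, 11))
```

Note that the host never sees block addresses: where the bytes land is entirely the device's decision.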
11
The General Application: Storage Architectures Today
[Figure: three stacks, each listed top to bottom.
Direct Attached Storage, DAS (blocks): I/O Application, File System, Storage Device.
Storage Area Network, SAN (blocks): I/O Application, File System, Interconnect, Storage Device.
Network Attached Storage, NAS (files): I/O Application, Interconnect, File System, Storage Device.]
Architecture defined by location of file system & storage devices
12
OSD System Architecture
[Figure: two stacks side by side, each listed top to bottom.
SAN Architecture: I/O Application, File System User Component, File System Storage Component, Interconnect, Block Storage Device.
OSD Architecture: I/O Application, File System User Component, Interconnect, File System Storage Component, Block Storage Device.]
13
Typical File System Components
• User File System Component
  – Hierarchy Management
  – Naming
  – User Access Control
  – Data Properties (Attributes)
• File System Storage Component
  – Free space management
  – Storage allocation for data entities
  – Attribute Interpretation
[Figure labels: I/O Application, File System User Component, File System Storage Component, Block Storage Device, Interconnect]
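The split between the two components can be made concrete with a small Python sketch. Everything here is hypothetical (`StorageComponent`, `UserComponent`, the fixed block count); it only illustrates that naming and access control never touch block numbers, while free-space management and allocation never see names.

```python
class StorageComponent:
    """Free-space management and allocation over fixed-size blocks."""

    def __init__(self, num_blocks):
        self.free = list(range(num_blocks))
        self.extents = {}  # entity id -> list of block numbers

    def allocate(self, entity_id, nblocks):
        if nblocks > len(self.free):
            raise OSError("no space left")
        blocks, self.free = self.free[:nblocks], self.free[nblocks:]
        self.extents.setdefault(entity_id, []).extend(blocks)
        return blocks

    def release(self, entity_id):
        # Return the entity's blocks to the free list.
        self.free.extend(self.extents.pop(entity_id, []))


class UserComponent:
    """Naming, hierarchy, and user access control; knows nothing about blocks."""

    def __init__(self, storage):
        self.storage = storage
        self.names = {}  # path -> (entity id, owner)
        self._next = 1

    def create(self, path, owner, nblocks):
        entity_id = self._next
        self._next += 1
        self.storage.allocate(entity_id, nblocks)
        self.names[path] = (entity_id, owner)
        return entity_id

    def unlink(self, path, user):
        entity_id, owner = self.names[path]
        if user != owner:
            raise PermissionError(path)
        del self.names[path]
        self.storage.release(entity_id)


storage = StorageComponent(num_blocks=8)
fs = UserComponent(storage)
fs.create("/a", owner="alice", nblocks=3)
fs.create("/b", owner="bob", nblocks=2)
print(len(storage.free))
fs.unlink("/a", user="alice")
print(len(storage.free))
```

In an OSD architecture, the `StorageComponent` half is what would move across the interconnect into the device.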
14
Example: How an OSD-based file system would work
[Figure: two hosts (each an I/O Application over a File System User Component), a Metadata Manager, and a storage device (File System Storage Component over Block Storage Device). Flows labeled: Data Transfer, Object Location, Security (shown twice).]
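One plausible reading of the flow on this slide, sketched in Python: the host asks the metadata manager where an object lives and receives a capability, then transfers data directly to the storage device, which checks the capability itself. This is a sketch under stated assumptions, not the OSD security protocol: `MetadataManager`, `ObjectDevice`, `sign`, the round-robin placement, and the HMAC-over-shared-secret capability format are all hypothetical.

```python
import hashlib
import hmac


def sign(secret, object_id, rights):
    # Hypothetical capability: HMAC over the object ID and the granted rights.
    msg = f"{object_id}:{rights}".encode()
    return hmac.new(secret, msg, hashlib.sha256).hexdigest()


class ObjectDevice:
    """Storage device that validates capabilities without asking the manager."""

    def __init__(self, secret):
        self._secret = secret
        self._objects = {}

    def _check(self, object_id, rights, capability):
        expected = sign(self._secret, object_id, rights)
        if not hmac.compare_digest(expected, capability):
            raise PermissionError("invalid capability")

    def write(self, object_id, data, capability):
        self._check(object_id, "w", capability)
        self._objects[object_id] = bytes(data)

    def read(self, object_id, capability):
        self._check(object_id, "r", capability)
        return self._objects[object_id]


class MetadataManager:
    """Maps names to (device, object ID) and issues capabilities."""

    def __init__(self, secret, device_names):
        self._secret = secret
        self._device_names = sorted(device_names)
        self._table = {}  # file name -> (device name, object id)

    def open(self, name, rights):
        if name not in self._table:
            # Simple round-robin placement for new objects.
            device = self._device_names[len(self._table) % len(self._device_names)]
            self._table[name] = (device, len(self._table) + 1)
        device, object_id = self._table[name]
        return device, object_id, sign(self._secret, object_id, rights)


secret = b"shared-secret"
devices = {"osd0": ObjectDevice(secret), "osd1": ObjectDevice(secret)}
manager = MetadataManager(secret, devices)

dev, oid, cap = manager.open("/data/report", "w")
devices[dev].write(oid, b"payload", cap)       # data path bypasses the manager
dev, oid, read_cap = manager.open("/data/report", "r")
print(devices[dev].read(oid, read_cap))
```

The point of the design is that the metadata manager sits only on the control path; bulk data moves directly between host and device.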
15
Why OSD? What problems need to be addressed?
• Depends on the APPLICATION
• Different people are trying to solve different problems for different reasons
• Management
  – Device Management
  – Storage Management
  – Data Management
• Cost
• Performance
• Security
• The “ilities”
  – Reliability
  – Availability
  – Serviceability
  – Maintainability
  – Scalability or Extensibility
16
Extensible Storage Architectures
• Density – the number of bytes/IOPS/bandwidth per unit volume
• Scalability – what does that word really mean?
  • Capacity: number of bytes, number of objects, number of files, number of actuators, …etc.
  • Performance: Bandwidth, IOPS, Latency, …etc.
  • Connectivity: number of disks, hosts, arrays, …etc.
  • Geographic: LAN, SAN, WAN, …etc.
  • Processing Power
• Adaptability – to changing applications
• Capability – can add functionality for different applications
• Manageability – can be managed as a system rather than just a box of storage devices
• Reliability – connection integrity capabilities
• Availability – fail-over capabilities
• Serviceability – hot-plug capability
• Interoperability – supported by many vendors – heterogeneous by nature
• Cost – address issues such as $/MB, $/sqft, $/IOP, $/MB/sec, TCO, …etc.
17
Intelligent Storage Devices
• An Intelligent Storage Device:
  – Is aware of the data objects it stores
  – Can be aware of the contents of data objects
  – Can be aware of the relationships between data objects
  – Can act on data objects and manipulate the objects and/or their contents
• Questions:
  – At what level do you communicate with an intelligent storage device?
  – If you are going to teach a storage device to do something, what do you teach it to do?
18
Management
• Self-managed devices
  – More autonomous because the devices are aware of the data objects they store
    • Internal device failure management is simpler and more robust
    • Space management is simpler for the same reason
• Data management is easier because these tasks could be offloaded onto the storage devices themselves
• Data Sharing
  – Heterogeneous OS support is implicit
  – Physical device sharing is implicit
• Policy-driven backup, recovery, and hierarchical storage management are simpler to implement
• Managing objects is easier than managing blocks
• Managing Object-based Storage Devices is easier than managing block-based storage devices
19
Performance Virtualization
• Three I/O Performance Metrics
  – Bandwidth – number of sustained bytes per second
  – Latency – time to first byte of data
  – IOPS – number of sustained transactions per second
• Applications need only specify values for these three metrics as “attributes” of the object being created or accessed
• The Storage Device can then decide where/how best to store the object in order to meet the performance requirements (see Hybrid Storage Devices)
• Abstracts the physical storage device performance characteristics
• Attributes can also be used to make more informed decisions about cache usage
• Performance attributes can also be used to manage performance as a resource (e.g., bandwidth reservation)
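A minimal Python sketch of attribute-driven placement, assuming a hypothetical hybrid device with three internal tiers. The tier names, performance figures, and costs are placeholders invented for illustration, not measurements; `place` simply picks the cheapest tier that satisfies the three metrics the application supplies as attributes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    bandwidth_mb_s: float  # sustained MB per second
    latency_ms: float      # time to first byte
    iops: float            # sustained transactions per second
    cost_per_gb: float


# Hypothetical tiers; all figures are placeholders, not measurements.
TIERS = [
    Tier("ramdisk", 2000.0, 0.01, 100000.0, 100.0),
    Tier("disk", 50.0, 8.0, 150.0, 2.0),
    Tier("tape", 15.0, 60000.0, 0.02, 0.2),
]


def place(min_bandwidth_mb_s=0.0, max_latency_ms=float("inf"), min_iops=0.0):
    """Return the cheapest tier that meets all requested performance attributes."""
    candidates = [
        t for t in TIERS
        if t.bandwidth_mb_s >= min_bandwidth_mb_s
        and t.latency_ms <= max_latency_ms
        and t.iops >= min_iops
    ]
    if not candidates:
        raise ValueError("no tier satisfies the requested attributes")
    return min(candidates, key=lambda t: t.cost_per_gb)


print(place().name)                    # no requirements: cheapest tier
print(place(max_latency_ms=10).name)   # interactive access
print(place(min_iops=1000).name)       # transaction-heavy object
```

The application states what it needs; which medium satisfies it remains the device's private decision, which is the abstraction this slide argues for.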
20
Security
• OSD has a Security Model built into it from the beginning rather than as an afterthought
• The OSD Security Model enables a secure exchange and storage/execution of objects
• Using this security model, Active Object Storage Devices can effectively implement encryption
• The inclusion of a Security Model gives OSDs more autonomy than plain disk drives
• Intelligent Storage Devices can implement other security mechanisms such as those specified in MPEG-21, on-disk encryption, secure-erase, …etc.
21
Technology Shifts
• What happens when…
  – NEC announces a 10 Terabit Memory Chip
  – MEMS devices bridge the gap between RAM and Disk
  – DVDR replaces Tape
  – Disk densities hit 1 terabit/in², 10 Tb/in²
• Must decouple the physical storage technology from the application(s) and the file systems
  – OSD is the ultimate virtualization technology but it is a standard
  – Intelligence in the storage device allows for exploitation of the Technology Shifts rather than reaction to them
• Underlying storage technologies can evolve independently of the data that they store and the protocols that access them
22
Intelligent Storage and Active Objects
• Normal disks or storage devices only Read and Write data
• An Intelligent Disk is actually a storage device that understands the content, structure, and context of the data it manages
• An object can be:
  – A simple block of data
  – A meta-object that is a dynamic collection of other objects
  – A method or executable procedure
  – Any or all of the above
• Intelligent storage devices can be Hybrid devices made up of disks, tapes, DVDR, RAMDISK, Flash memory, …etc.
• Hybrid Active Storage devices can store data based on performance, security, or other attributes
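The three kinds of object listed above can be sketched as follows. This is an illustrative Python model only: `ActiveStore` and its methods are hypothetical, and a real device would need sandboxing and cycle detection that are omitted here. It shows a data object, a meta-object that collects other objects, and a method object that the device applies to data in place.

```python
class ActiveStore:
    """Hypothetical active device holding data, collection, and method objects."""

    def __init__(self):
        self._objects = {}  # object id -> (kind, value)

    def put_data(self, oid, data):
        self._objects[oid] = ("data", bytes(data))

    def put_collection(self, oid, members):
        self._objects[oid] = ("collection", list(members))

    def put_method(self, oid, func):
        self._objects[oid] = ("method", func)

    def resolve(self, oid):
        """Return the data payloads an object refers to, expanding collections."""
        kind, value = self._objects[oid]
        if kind == "data":
            return [value]
        if kind == "collection":
            payloads = []
            for member in value:
                payloads.extend(self.resolve(member))
            return payloads
        raise TypeError("method objects hold code, not data")

    def invoke(self, method_oid, target_oid):
        """Run a stored method on the device against every payload in the target."""
        kind, func = self._objects[method_oid]
        if kind != "method":
            raise TypeError("not a method object")
        return [func(payload) for payload in self.resolve(target_oid)]


store = ActiveStore()
store.put_data(1, b"alpha")
store.put_data(2, b"beta gamma")
store.put_collection(3, [1, 2])   # meta-object: a collection of objects 1 and 2
store.put_method(4, len)          # method object: here just the built-in len
print(store.invoke(4, 3))
```

Only the results cross the interconnect; the payloads themselves never leave the device.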
23
Business Perspective
• Why would a storage vendor want OSD/Intelligent Storage?
  – Differentiation
  – Higher margins because of increased “value”
• Why would an end user want to purchase OSD/Intelligent Storage?
  – Simpler management on all levels
  – More functionality/capability
  – LOWER TCO
• Issues
  – Market adoption
  – Technology adoption
  – How to deal with these?
• Need a technology/business focused R&D consortium
24
Summary
• Answer the original question: Is it crazy enough?
  – Yes it is – at least for the next 10 years.
• OSD is the next step in the evolution of storage devices
• OSD-like products are beginning to emerge
• Standards are beginning to take shape through SNIA OSD TWG
• OSD is a foundation on which to build more intelligent, more capable storage devices
• Need to work more closely with “application” vendors to refine and deploy the OSD and Intelligent Storage protocols
• Need to develop an understanding of OSD and Intelligent Storage as viable solutions that solve significant problems that translate into $$$ for both the technology provider and the customer
25
Click to add subliminal message
• Click to add subliminal message
26
Information, R&D, Standards
• Standards Work
  – SNIA OSD Technical Working Group www.snia.org/osd
  – ANSI SCSI www.t10.org/scsi-3.htm
  – National Storage Industry Consortium www.nsic.org/nasd
• Research
  – University of Minnesota, Digital Technology Center, Intelligent Storage Consortium (DISC) www.dtc.umn.edu
  – Carnegie Mellon University (CMU) Parallel Data Lab (PDL) www.pdl.cmu.edu
  – University of California Santa Cruz, Storage Systems Research Center www.ucsc.edu/ssrc
  – University of California San Diego, Center for Magnetic Recording Research (CMRR) cmrr.ucsd.edu
• Research & Development
  – Intel Labs www.intel.com/labs/storage/osd
  – IBM Research www.haifa.il.ibm.com/storage.html
  – Others….
27
Backup Slides
28
Common OSD-like Examples
• Digital “Appliances”
  – Digital Cameras
  – MP3 Players
  – CD/DVD Players
• Systems
  – Napster, Morpheus, …etc.
  – Protocols and standards: CORBA, UML, XML, …etc.
29
Issues
• Where is OSD implemented?
  – OSD on disk drives?
  – Disk arrays?
  – Removable media devices?
• Market acceptance?
• How does OSD compete with current technologies (e.g., ATA disks) that are “good enough”?
• Support for legacy applications
• Where does Microsoft fit into this picture?
• Where do the software application vendors fit into the picture?
• Where does Linux fit into this picture?
• Where do all the other OS vendors fit in?