
Content Addressed Storage Chapter 9 ISMDR:BEIT:VIII:chap 6:Madhu N PIIT1.

Date post: 12-Jan-2016

ISMDR:BEIT:VIII:chap 6:Madhu N PIIT 1

Content Addressed Storage

Chapter 9


Chapter Objective

Upon completion of this chapter, you will be able to:

• Describe CAS, fixed content and archives, traditional storage solutions for archive

• Describe the features and benefits of a CAS based storage strategy

• List the physical and logical elements of CAS

• Describe the storage and retrieval process for CAS data objects

• Describe the best suited operational environments for CAS solutions


Lesson: CAS Overview

Upon completion of this lesson, you will be able to:
• Define fixed content
• Describe traditional archival solutions and their shortcomings
• Define Content Addressed Storage (CAS)
• List benefits of CAS


What are Fixed Content and Archives

Digital Assets Retained For Active Reference And Value

Electronic Documents
• Contracts, claims, etc.
• E-mail and attachments
• Financial spreadsheets
• CAD/CAM designs
• Presentations

Digital Records
• Documents
– Checks, securities trades
– Historical preservation
• Photographs
– Personal / professional
• Surveys
– Seismic, astronomic, geographic

Rich Media
• Medical
– X-rays, MRIs, CT scans
• Video
– News / media, movies
– Security surveillance
• Audio
– Voicemail
– Radio

Leverage Historical Value

Improve Service Levels

Generate New Revenues


Challenges of Storing Fixed Content

• Fixed content is growing at more than 90% annually
– Significant amount of newly created information falls into this category
– New regulations require retention and data protection
• Often, long-term preservation is required (years-decades)
• Simultaneous multi-user online access is preferable to offline storage
• Need faster access to fixed content
• Need for location independent data, enabling technology refresh and migration
• Traditional storage methods are inadequate


Traditional storage solutions for Archive

• Three categories of archival solution, based on the means of access:
– Online, nearline, and offline
• Traditional archival solutions were offline
– Traditional archival process used optical disks and tapes as media for archival
– An archive is often stored on a Write Once Read Many (WORM) device, such as a CD-ROM


Shortcomings of Traditional Archiving Solutions

• Tape is slow, and standards are always changing
• Optical is expensive, and requires vast amounts of media
• Recovering files from tape and optical is often time consuming
• Data on tape and optical is subject to media degradation
• Both solutions require sophisticated media management

CAS has emerged as an alternative to traditional archiving solutions


What is Content Addressed Storage (CAS)

• Object-oriented, location-independent approach to data storage

• Repository for the “Objects”
• Access mechanism to interface with repository
• Globally unique identifiers provide access to objects


Benefits of CAS

• Content authenticity
• Content integrity
• Location independence
• Single-instance storage (SiS)
• Retention enforcement
• Record-level protection and disposition
• Technology independence
• Fast record retrieval
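
Single-instance storage and location independence both follow from addressing an object by its content. The sketch below is a toy in-memory repository, not any vendor's implementation: the `ContentStore` class and its method names are hypothetical, and SHA-256 is only an assumed stand-in for whatever digest a real CAS system uses.

```python
import hashlib


class ContentStore:
    """Toy in-memory object repository keyed by content address."""

    def __init__(self):
        self._objects = {}  # content address -> object bytes

    @staticmethod
    def content_address(data: bytes) -> str:
        # Assumption: SHA-256 stands in for the digest a real CAS would use.
        return hashlib.sha256(data).hexdigest()

    def put(self, data: bytes) -> str:
        address = self.content_address(data)
        # Identical content yields the same address, so it is kept only once.
        self._objects.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        # The caller needs only the address, never a path or device name.
        return self._objects[address]

    def object_count(self) -> int:
        return len(self._objects)


store = ContentStore()
a1 = store.put(b"scanned contract, page 1")
a2 = store.put(b"scanned contract, page 1")  # same content submitted again
a3 = store.put(b"scanned contract, page 2")
```

Here `a1` and `a2` are the same address and the repository holds two objects, not three, which is the single-instance behavior the list refers to.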


Lesson Summary

Key points covered in this lesson:
• CAS Definition
• Challenges of Storing Fixed Content
• Shortcomings of Traditional Archiving Solutions
• Benefits of CAS


Lesson: CAS Architecture

Upon completion of this lesson, you will be able to:

• Describe CAS architecture
• Describe physical and logical elements of CAS
• Describe data storage and retrieval process in CAS environment
• Discuss CAS solution examples


Physical Elements of CAS

• Storage devices (CAS Based)
– Storage node
– Access node
• Servers (to which storage devices get connected)
• Client

[Figure: a server connects over IP, through the API, to the CAS system, whose access nodes and storage nodes are linked by a private LAN.]


CAS Terminology

• Application Programming Interface (API)
– A set of function calls that enables communication between applications or between an application and an operating system
• Binary Large Object (BLOB)
– The Distinct Bit Sequence (DBS) of user data represents the actual content of a file and is independent of the filename and physical location


CAS Terminology (Cont.)

• C-Clip
– A package containing the user's data and associated metadata
– C-Clip ID (C-Clip handle or C-Clip reference) is the CA that the system returns to the client application
• Content Address (CA)
– An identifier that uniquely addresses the content of a file and not its location. Unlike location-based addresses, content addresses are inherently stable and, once calculated, they never change and always refer to the same content
• C-Clip Descriptor File (CDF)
– The additional XML file that the system creates when making a C-Clip. This file includes the content addresses for all referenced BLOBs and associated metadata
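
To make the relationship between BLOB, CA, CDF, and C-Clip ID concrete, here is a minimal sketch. It is not the real Centera descriptor format: the XML element and attribute names, the `build_cdf` helper, and the use of SHA-256 are all assumptions made for illustration, as is deriving the C-Clip ID by hashing the descriptor.

```python
import hashlib
import xml.etree.ElementTree as ET


def build_cdf(blobs: dict, metadata: dict) -> str:
    """Return an XML descriptor listing BLOB content addresses and metadata.

    Element and attribute names are invented for this example.
    """
    root = ET.Element("cdf")
    meta = ET.SubElement(root, "metadata")
    for key, value in metadata.items():
        ET.SubElement(meta, "attribute", name=key, value=value)
    for label, data in blobs.items():
        # Each BLOB is referenced by the address of its content, not a path.
        ca = hashlib.sha256(data).hexdigest()
        ET.SubElement(root, "blob", label=label, ca=ca, size=str(len(data)))
    return ET.tostring(root, encoding="unicode")


cdf_xml = build_cdf(
    blobs={"xray": b"image bytes"},
    metadata={"patient-id": "hypothetical-0001"},
)
# Assumption: the C-Clip ID is modeled as the content address of the CDF.
clip_id = hashlib.sha256(cdf_xml.encode("utf-8")).hexdigest()
```

The application would keep only `clip_id`; everything else about the object can be reached through the descriptor.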


How CAS Stores a Data Object

1 Client presents data to API to be archived
2 Unique Content Address is calculated
3 Object is sent to Centera via Centera API over IP
4 Centera validates the Content Address and stores the object
5 Acknowledgement returned to application
6 Clip ID is retained and stored for future use

[Figure: the client and application server hand the data to the API, which packages it as a C-Clip (the object plus its CDF) and sends it to the CAS system.]
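
The store sequence can be sketched in a few lines. This is an illustration under stated assumptions, not the Centera API: `CasSystem`, `archive`, and `content_address` are hypothetical names, the network hop is replaced by a direct method call, and SHA-256 stands in for the real digest.

```python
import hashlib


def content_address(data: bytes) -> str:
    # Assumption: SHA-256 stands in for the digest a real CAS would use.
    return hashlib.sha256(data).hexdigest()


class CasSystem:
    """Toy CAS back end that validates the address before storing."""

    def __init__(self):
        self._objects = {}

    def store(self, claimed_address: str, data: bytes) -> bool:
        # Recalculate the address from the received bytes and compare it
        # with the one the client sent; reject the object on a mismatch.
        if content_address(data) != claimed_address:
            return False
        self._objects[claimed_address] = data
        return True  # acknowledgement

    def contains(self, address: str) -> bool:
        return address in self._objects


def archive(cas: CasSystem, data: bytes) -> str:
    """Client-side API call: compute the address, send, return the ID."""
    address = content_address(data)       # address calculated from content
    acknowledged = cas.store(address, data)  # object sent to the CAS
    if not acknowledged:
        raise IOError("CAS rejected the object: address mismatch")
    return address  # the application keeps this ID for later retrieval


cas = CasSystem()
clip_id = archive(cas, b"check image #1")
```

Recomputing the address on the storage side is what lets the system detect an object that was corrupted in transit.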


How CAS Retrieves a Data Object

1 Object is needed by an application
2 Application finds Content Address of object to be retrieved
3 Retrieval request is sent to the CAS via CAS API over IP
4 CAS authenticates the request and delivers the object

[Figure: the client and application server pass the C-Clip ID through the API to the CAS system.]
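
A matching sketch of retrieval follows. As before, the names are hypothetical and the design is an assumption: authentication is reduced to a set of permitted application IDs, the application's own catalog is a plain dictionary, and the integrity check on read is an illustrative extra rather than something the slide specifies.

```python
import hashlib


class CasSystem:
    """Toy CAS back end with a simple allow-list for authentication."""

    def __init__(self, authorized_apps):
        self._authorized = set(authorized_apps)
        self._objects = {}

    def store(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        self._objects[address] = data
        return address

    def retrieve(self, app_id: str, address: str) -> bytes:
        # Authenticate the request before touching the repository.
        if app_id not in self._authorized:
            raise PermissionError(f"application {app_id!r} is not authorized")
        data = self._objects[address]
        # Re-hash on read so silent corruption is detected, not delivered.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored object no longer matches its address")
        return data


cas = CasSystem(authorized_apps={"imaging-app"})

# The application keeps its own mapping from record names to C-Clip IDs.
catalog = {"patient-0001/xray": cas.store(b"x-ray pixels")}

wanted = catalog["patient-0001/xray"]        # application finds the address
image = cas.retrieve("imaging-app", wanted)  # request sent, object delivered
```

Note that the application never supplies a file path or volume name; the ID it saved at store time is enough.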


CAS Features

• Features available with most CAS systems are:
– Integrity checking
– Data protection
• Local replication
• Remote replication
– Load balancing
– Scalability
– Self-diagnosis and repair
– Report generation and event notification
– Fault tolerance
– Audit trails


Example 1: CAS Healthcare Solution

• Each X-ray image ranges from about 15 MB to over 1 GB
• Patient record is stored online for a period of 60-90 days
• Beyond 90 days patient records are archived

[Figure: Hospital. Patient studies are stored locally for short-term use (60 days); the application server sends data through the API to the CAS system, where it is stored.]


Example 2: CAS Financial Solution

• Check image size is about 25 KB
• Check imaging service provider may process 50–90 million check images per month
• Checks are stored online for a period of 60 days
• Beyond 60 days data is archived

[Figure: Bank. The application server sends data through the API to the CAS system.]
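
The figures on this slide imply a monthly data volume that is easy to work out. The short calculation below is a back-of-the-envelope sketch; it assumes decimal units (1 KB = 1,000 bytes, 1 TB = 10^12 bytes), treats 60 days as two months of intake, and ignores metadata and replication overhead.

```python
CHECK_IMAGE_KB = 25           # from the slide: about 25 KB per check image
MONTHLY_LOW = 50_000_000      # from the slide: 50 million images per month
MONTHLY_HIGH = 90_000_000     # from the slide: 90 million images per month
ONLINE_MONTHS = 2             # assumption: 60 days is roughly two months


def monthly_terabytes(images: int, size_kb: int = CHECK_IMAGE_KB) -> float:
    """Raw image volume per month in decimal terabytes."""
    total_bytes = images * size_kb * 1000
    return total_bytes / 1e12


low_tb = monthly_terabytes(MONTHLY_LOW)
high_tb = monthly_terabytes(MONTHLY_HIGH)
online_low_tb = low_tb * ONLINE_MONTHS
online_high_tb = high_tb * ONLINE_MONTHS

print(f"New data per month: {low_tb:.2f} to {high_tb:.2f} TB")
print(f"Held online for 60 days: {online_low_tb:.2f} to {online_high_tb:.2f} TB")
```

Under these assumptions the provider adds roughly 1.25 to 2.25 TB of check images per month, so about 2.5 to 4.5 TB sits online before being moved to the archive.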


Lesson Summary

Key points covered in this lesson:
• CAS architecture
• Physical and logical elements of CAS
• CAS storage and retrieval process
• CAS solution examples


Concept in Practice – EMC Centera

• Centera Architecture
– Based on RAIN (Redundant Array of Independent Nodes)
• Access Node
• Storage Node

[Figure: Centera architecture. Labels: Access/Storage Nodes, Storage Nodes, Private LAN, Content and Mirrored Content (objects 1-6 and their mirrored copies), Power Rails, two Ethernet switches, LAN to server.]
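
The figure pairs each content object with a mirrored copy on the storage nodes. The sketch below illustrates that idea only; it is not Centera's placement algorithm. The `MirroredCluster` class, the rule of putting the mirror on the next node, and the SHA-256 addresses are all assumptions chosen to keep the example small.

```python
import hashlib


class StorageNode:
    def __init__(self, name: str):
        self.name = name
        self.objects = {}   # content address -> bytes
        self.online = True


class MirroredCluster:
    """Toy node array that keeps every object on two different nodes."""

    def __init__(self, node_names):
        if len(node_names) < 2:
            raise ValueError("mirroring needs at least two nodes")
        self.nodes = [StorageNode(name) for name in node_names]

    def _placement(self, address: str):
        # Hypothetical rule: the address picks the primary node and the
        # mirror goes on the next node, so the two are always distinct.
        primary = int(address, 16) % len(self.nodes)
        mirror = (primary + 1) % len(self.nodes)
        return self.nodes[primary], self.nodes[mirror]

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        for node in self._placement(address):
            node.objects[address] = data
        return address

    def get(self, address: str) -> bytes:
        # Serve the object from whichever copy is on a reachable node.
        for node in self._placement(address):
            if node.online and address in node.objects:
                return node.objects[address]
        raise KeyError(address)


cluster = MirroredCluster(["node1", "node2", "node3", "node4"])
address = cluster.put(b"archived record")

primary, mirror = cluster._placement(address)
primary.online = False               # simulate a single node failure
recovered = cluster.get(address)     # still served from the mirror
```

With one node offline the object is still readable, which is the fault tolerance that a redundant array of nodes is meant to provide.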

