
Big Data, Big Records NOVA ARMA NCC-AIIM US. Department of the Interior Office of the Chief...

Date post: 15-Dec-2015

Big Data, Big Records
NOVA ARMA NCC-AIIM

U.S. Department of the Interior
Office of the Chief Information Officer

John Montel
Office of the Chief Information Officer
Policy Planning and Management
February 27, 2013

Carrie Mallen
IQ Business Group
eDiscovery Practice

U.S. Department of the Interior 2

Department of the Interior

• Cabinet-level agency
• 14 Bureaus/Offices
• Employs ~70,000 people, with 280,000 volunteers
• Manages a $16.8B operating budget
• Manages 500 million acres of surface land
• Manages 479 dams and 348 reservoirs
• Supplies 30% of the nation's energy production
• Produces 55,000 different maps each year
• Protects ~500 million recreational and cultural visitors

http://www.doi.gov/facts.html


IT Transformation

• Unified Messaging (BisonConnect)
– Google Apps for Government
• Enterprise Information (eERDMS)
– Enterprise eArchive System
– Enterprise Content System
– Enterprise Forms System
– Enterprise Dashboard System

eERDMS Program Vision

Provide the Department of the Interior with a single, cohesive, integrated information management program designed to support and manage departmental records related to email, documents, and content in the Cloud.

eERDMS Program Objectives

• Capture all unified messaging journaled email records
• Capture all mobile content records
• Capture all lines of business records
• Capture all business system records
• Develop a super bucket records schedule
• Develop an online automated litigation hold process
• Support Freedom of Information Act requests
• Support litigation early case assessment needs
• Support Congressional and Department inquiries

Program Capabilities

• Records Management DoD 5015 v3
• Records, Document and Email Archiving/Journaling
• Records and Document Auto Classification
• Records and Document Content Management
• Records and Document Imaging
• Records and Document Management
• Records and Document Scanning
• Records and Document Workflow
• Records and Document Collaborating Workspaces
• Records and Document Auditing
• Records and Document Advanced Early Case Assessment & Review
• Records and Document Mobility Content Management
• Section 508 Compliance out of the box
• Optional: Advanced Legal Review, Social Media Capture, Email Management, National Shredding Program & National Digitization Program, Migration Services and Support Staff Services


OMB Directive M-12-18

Requires agencies, to the fullest extent possible, to eliminate paper and use electronic recordkeeping.

Expected benefits:
• improved performance and promotion of openness and accountability
• further identification and transfer to the National Archives and Records Administration (NARA) of the permanently valuable historical records
• minimizing costs and operating more efficiently


eERDMS Environment

[Diagram. Systems: Enterprise Records System, Enterprise Content System, Enterprise Forms System, Enterprise Dashboard System, Enterprise Fax System, Enterprise Social System. Business areas: Finance, Personnel, Operations, Security, Administration, Logistics, Programs, Human Resources, Contracts. Other labels: NIEM XML, ERA.]


Big Data, Big Business

• 600+ million emails a year
– 70 million in Jan 2013
– 100 million estimated for February 2013
– 1.2B emails received
– 15.5M records produced a day
• 22 billion data points generated
• 5,500+ FOIA cases a year
• 200+ ongoing litigation cases
• 100+ million printed pages a year
• 4,100+ mobile devices
• 15,000 fax devices
• Exabyte / Zettabyte of electronic content


Records Management Objectives

Provide the Department with:
• a single, simplified, integrated Records Retention Schedule for managing Bureau/Office records
• a Retention Schedule based on Lines of Business shared across Bureaus/Offices
• a Retention Schedule which reduces the complexity of the existing Schedules to allow for the use of auto-classification tools for assigning retention periods to Department records

We are integrating knowledge for tomorrow's workforce


Starting Point

• 14 Bureaus/Offices in DOI
• 200 existing Retention Schedules
• 2,330 retention instructions
• Some Big Bucket Schedules
• Some Traditional Schedules
• Some schedules in draft
• Some schedules at NARA awaiting approval


Department Records Schedule (DRS) Strategy

• Started with the existing DOI Retention Schedules
• Identified the Department’s Lines of Business
• Created Crosswalks
• Created Summary Worksheets
• Drafted Super Bucket Retention Schedules, Ver 1
• Entered Super Bucket Retention Schedules, Ver 1 in eERDMS
• and then... Auto-Classification


Policy Bucket

• Controls and Oversight
• Planning and Budgeting
• Litigation and Judicial Activities
• Regulatory Development


Mission Bucket

• Biological Resources
• Culture & Heritage
• Disaster Management
• Education
• Energy
• Environmental Management
• Financial Management
• Geospatial Services
• Grants & Cooperative Agreements
• Intelligence Operations
• Land & Marine Conservation
• Land Management Planning
• Land Use
• Minerals
• Public Health & Safety
• Water
• Water Quality
• Wildland Fire


Administrative Bucket

• Accounting
• Administration/Housekeeping
– Ultra Transitory?
– Transitory: out-of-office replies, Amazon, eBay, Twitter, early dismissal, marketplace, Credit Union, advisory notices, holiday notices, Department-wide notices
• Human Resources
• Information and Technology


Crosswalks

• Mapped each schedule item in every schedule to the Department’s Lines of Business
• Developed crosswalks
• Vetted crosswalks with Bureaus/Offices Records Officers
• Some Bureaus/Offices were very involved with the process
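
As a rough illustration of what a crosswalk holds, the Python sketch below maps schedule items to Lines of Business and groups them for review. The bureau names and item numbers are hypothetical placeholders, not taken from any DOI schedule; only the Line of Business names (Water, Wildland Fire, Accounting) appear in the slides.

```python
# Hypothetical crosswalk rows: (schedule, schedule item) -> Line of Business.
# Bureau names and item numbers are placeholders for illustration only.
CROSSWALK = {
    ("Bureau A schedule", "Item 101"): "Water",
    ("Bureau A schedule", "Item 102"): "Wildland Fire",
    ("Bureau B schedule", "Item 7"): "Water",
    ("Bureau B schedule", "Item 8"): "Accounting",
}


def items_by_line_of_business(crosswalk):
    """Group schedule items under the Line of Business they map to."""
    grouped = {}
    for schedule_item, line_of_business in crosswalk.items():
        grouped.setdefault(line_of_business, []).append(schedule_item)
    return grouped


def unmapped(all_items, crosswalk):
    """Return schedule items that have no crosswalk entry yet."""
    return [item for item in all_items if item not in crosswalk]
```

Grouping by Line of Business shows which items from different Bureaus/Offices would land in the same bucket, and `unmapped` flags items that still need a decision before vetting.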


Results


Auto-Classification

• Definition/How it Works
• Exemplars/Why
• Testing and Refinement
• Training
• Implementation
• Legal Defensibility


Auto-Classification

Definition of auto-classification:
• Tool that provides automatic identification, classification, retrieval, and archival and disposal capabilities for electronic records
• Tool that uses a hybrid approach that combines machine learning, rules, and content analytics
• Tool that uses a rules engine and scans content for words, phrases, tone, etc. to identify semantic relationships to assign records classification and retention periods to content (Open Text)
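
The rules part of this definition can be pictured with a minimal Python sketch. It is not the Open Text product's logic: the keyword lists are hypothetical, only the bucket names (Policy, Mission, Administrative) come from the slides, and the machine learning and content analytics parts of the hybrid approach are left out.

```python
import re

# Hypothetical keyword rules; only the bucket names come from the slides.
RULES = {
    "Policy": ["litigation", "regulatory", "budget"],
    "Mission": ["wildland fire", "water quality", "minerals"],
    "Administrative": ["out of office", "holiday notice", "payroll"],
}


def classify(text, rules=RULES):
    """Return the bucket whose phrases match most often, or None if none match."""
    lowered = text.lower()
    scores = {}
    for bucket, phrases in rules.items():
        # Count whole-phrase matches bounded by word boundaries.
        scores[bucket] = sum(
            len(re.findall(r"\b" + re.escape(phrase) + r"\b", lowered))
            for phrase in phrases
        )
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

A retention period would then be looked up from the bucket's schedule entry; the sketch stops at the bucket because the slides do not list retention periods.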


Auto-Classification Process

• System uses exemplars of each file node to train system to recognize patterns, tone, etc.
• Find “like” (similar) feature used to gather additional exemplars
• Use exemplars to create a model
• Precision and recall numbers need to be 75% or better
• Refine model with additional exemplars over time
• Auto-classification run on incoming email content to assign retention periods
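
The 75% threshold on precision and recall can be checked per file node with a short calculation. Precision is the share of items the model assigned to a node that truly belong there; recall is the share of items truly in the node that the model found. The Python sketch below uses made-up evaluation labels, and the function names are illustrative rather than part of any product.

```python
def precision_recall(true_labels, predicted_labels, target):
    """Precision and recall for one file node (class) named `target`."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def meets_threshold(true_labels, predicted_labels, target, threshold=0.75):
    """True when both precision and recall reach the threshold."""
    precision, recall = precision_recall(true_labels, predicted_labels, target)
    return precision >= threshold and recall >= threshold


# Made-up evaluation data for illustration only.
truth = ["Water", "Water", "Water", "Water", "Energy", "Energy", "Energy", "Energy"]
preds = ["Water", "Water", "Water", "Energy", "Energy", "Energy", "Energy", "Water"]
```

With this sample, the Water node has three correct assignments, one false positive, and one miss, so both measures sit exactly at 0.75 and the node just passes. A node that falls short would go back for more exemplars, as the refinement bullet describes.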

Hold Options

• Search-Based Holds
• User-Based Holds
• Location-Based Holds
• Classification-Based Holds

Other Considerations

– Journaling
– “Live” Content
– Content at Risk

Copyright © 2010 Open Text Corporation. All rights reserved.

Select Users to be on Hold - per Matter

• Option for selecting entire results set

User Based Holds

• Date ranges can be applied
• Applies a hold to all items created by, owned by, or with a version added by the selected users in the specified date range
• Users can be removed
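
The selection logic for a user-based hold can be sketched in a few lines of Python. This is a sketch under assumptions, not the product's implementation: the `Item` shape and field names are hypothetical, and because the slide does not say which date is compared against the range, the sketch assumes a single `modified` date per item.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Item:
    # Hypothetical record shape; field names are illustrative.
    item_id: str
    created_by: str
    owned_by: str
    modified: date
    version_authors: list = field(default_factory=list)


def items_on_hold(items, users, start, end):
    """Return IDs of items tied to any held user and dated within [start, end]."""
    held = set(users)
    result = []
    for item in items:
        involved = {item.created_by, item.owned_by, *item.version_authors}
        if involved & held and start <= item.modified <= end:
            result.append(item.item_id)
    return result


# Made-up sample data for illustration only.
items = [
    Item("a", "alice", "alice", date(2013, 1, 10)),
    Item("b", "bob", "carol", date(2013, 1, 15), ["alice"]),
    Item("c", "alice", "alice", date(2012, 6, 1)),
    Item("d", "bob", "bob", date(2013, 1, 20)),
]
```

Calling `items_on_hold(items, ["alice"], date(2013, 1, 1), date(2013, 1, 31))` selects items "a" and "b": the first because alice created it, the second because she added a version. Item "c" falls outside the date range and item "d" does not involve her. Removing a user from the hold amounts to rerunning the selection without that user.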

eDiscovery Early Case Assessment

[Diagram. Labels: More Advanced Search, Culling & Deduplication, Processing, SharePoint, Desktops, EES Suite, File Servers, Email, ECA, Any Review Platform.]

• Live exploration
– Search and explore data before collection and preservation
• Reduce involvement of IT in collection
• Only relevant ESI required for hold is automatically collected to central hold repository
• Further cull and deduplicate prior to export of fully processed ESI
• Remote collection from disparate enterprise data sources, including ECM Suite
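
The slides do not say how the deduplication step works. One common approach, shown in the Python sketch below as an assumption rather than the product's method, is exact-duplicate detection by content hash: documents with identical bytes share a digest, and only the first copy is kept for export. The function name and input shape are illustrative.

```python
import hashlib


def deduplicate(documents):
    """Keep the first document for each distinct content hash.

    `documents` is an iterable of (doc_id, content_bytes) pairs.
    Returns (kept_ids, duplicates), where duplicates maps each
    dropped id to the id of the copy that was kept.
    """
    first_seen = {}
    kept, duplicates = [], {}
    for doc_id, content in documents:
        digest = hashlib.sha256(content).hexdigest()
        if digest in first_seen:
            duplicates[doc_id] = first_seen[digest]
        else:
            first_seen[digest] = doc_id
            kept.append(doc_id)
    return kept, duplicates
```

Recording which copy each duplicate maps to keeps the cull auditable, which matters when the collected ESI has to be defended later. Near-duplicates, such as the same email with different footers, would need a different technique.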


Communication and Outreach

• Shared vision and goals up, down, and across the organization
• Bureau/Office Records Officers Work Group
• Records Officer Task Force with leadership role
• Staff dedicated to supporting the effort with the client


Thank you

John Montel
eRecords Service Manager
Service Planning and Management
Department of the Interior
Office of the Chief Information Officer
1849 C. Street, N.W.
Room 7444
Washington, DC. 20040
T. (202) 208-3939
C. (202) 604-1149
F. (202) 501-2360
E. [email protected]

Carrie Mallen
eDiscovery SME
IQ Business Group
Prime for eERDMS
Department of Interior
Room 2012
Washington, DC 20040
C. 415 577-3982
E. [email protected]

