
By Annie Stoehr, MSI MWR Information Management Consultant.

Date post: 11-Jan-2016
Page 1

Avoiding Datageddon: Science Information Management in the Midwest Region

By Annie Stoehr, MSI, MWR Information Management Consultant

Page 2

About me...

MSI from the School of Information, University of Michigan

Specialization in Human-Computer Interaction and Archives & Records Mgmt

May 2011 – began internship at the Great Lakes Science Center in Ann Arbor, MI

Activities: inventory and assess science information mgmt best practices across the MWR, design and administer usability tests for data mgmt tools, archive records, deliver training and education, and provide guidance on the use of SharePoint

Interests: providing accessibility and discoverability to records that were previously unknown, leveraging existing assets to make the USGS more competitive despite funding cuts, and building a science information mgmt network that supports consistent data mgmt practices across USGS

Page 3

The Plan: Science Information Management

2012 – Midwest Region commissioned the Science Information Management Team to inventory and assess the science data management practices

How well do practices fit the CDI’s Science Data Lifecycle Model?

Are current practices meeting Centers’ needs?

What tools are needed to help leverage USGS data?

Page 4

The Model: Science Data Lifecycle

What does this look like in practice?

Page 5

The SIM Business Plan

The goal of this assessment is to create best management practices (BMP) for science data from which MWR centers can develop or enhance data management plans.

These data management plans will ultimately lead to the exposure and sharing of MWR data to increase scientific integration and partnerships.

Currently, the USGS has two teams working on data management and its BMPs. The Community for Data Integration (CDI) Data Management Working Group and the Science Data Coordinator Network (SDCN) are compiling information about science data storage, retrieval, and sharing. They have developed a framework that breaks down the data life cycle.

The Planning, Acquisition, Processing, and Analysis sections of the Data Life Cycle pertain to scientists as they collect and interpret the data. To promote the continued use and availability of science data, the last two steps, Preservation and Publication, are essential. Currently, in many centers, scientists are also responsible for these steps; however, available data stewards may assist when needed.

Page 6

SIM Business Plan Cont.

The MWR currently includes 19 Science Centers, each creating a multitude of datasets. The data they invest in and create have enduring value only if they are maintained. For the most part, the Planning, Acquisition, Processing, and Analysis sections of the Data Life Cycle are handled by the scientists and their project teams. However, outside of a few CDI-sponsored projects, little effort is put into the sustainable use and accessibility of science data, an important piece of the data life cycle that makes data-sharing possible in the years to come. This is a prime opportunity for the MWR to set the standard and lead the way toward science information management and data preservation. For the MWR to be well-positioned for the increasing need for integrated science, it is essential to leverage existing data management standards and practices to provide optimal effectiveness in generating science for the needs of society.

Protecting research data, increasing the exposure of science to audiences, and saving costs will add value to the scientists, science centers, the Region, and ultimately the Bureau.

Our research will identify the Centers’ current needs. Through this “inventory and assessment”, we will uncover what activities are working well and what activities are not working. As the Centers work through or develop their individualized plans, science data will be managed and stored for easier sharing across the Bureau as well as sustained for future use as indicated by their plans.

Continuing to maintain science data demonstrates accountability, transparency, and reliability to each other, our partners, and the public.

Not only does the Bureau reap benefits from data management activities, but so do our stakeholders and scientists.

With an organized approach, we can ensure that science data will be available for any Bureau-level activities that promote data integration and discoverability.

Opportunity Benefits

Page 7

A Tale of 3 Centers…

UMESC

NWHC

GLSC

Page 8

Great Lakes Science Center

The GLSC has a number of long-running data sets that are well known to professionals in the field of fishery biology. The Center prioritized reformatting and preserving its older, inactive records for continued use because they form the basis for some of the long-running data sets. To maintain the credibility, authenticity, and reliability of these data sets, original data sheets were scanned to PDF format and entered into Oracle databases. Now that most of the older records have been processed and reformatted, the Center is trying to link all parts of a study together in an Archives Catalogue to facilitate discovery. While the GLSC has an internal network that facilitates sharing between staff, there was a need for additional ways to manage workflows such as study planning. The Center established a SharePoint Intranet site and designated a group to oversee the site’s development. Since the fall of 2010, graduate students from the University of Michigan School of Information Records Management concentration have made much of the recent work a reality.

The framework at GLSC:

Plan – Scientists create project plans and then enter relevant parts into Basis+

Acquire –
  Collect: from the field/lab (collection practices are outlined in the study plan on Basis+)
  Convert: Legacy collection is being indexed and reformatted
  Share: MOUs and MOAs in place; also share through FOIA requests
  Purchase: Unknown

Preserve – Reformatting paper records; developing Center SharePoint site for teams’ DM; building an Archive Catalog; archiving scientists’ work; robust backup system; enter data into Oracle

Publish – Make findings available through ScienceBase, Center website, IDPS, literature/journals, presentations, guest speakers, public outreach, answering data requests

Page 9

National Wildlife Health Center

The NWHC is more concerned with active records, as this Center handles diagnostics and investigations of current events. Entities (public, private, other government agencies, and individuals) ask the NWHC to investigate something. These requests are sporadic, so timely execution of work is critical. To better serve partners, interested parties, and our own scientists, the Center needed to address its active records and data. Most of the Center’s data is held in various databases, each serving its own function. To expedite investigations and research, the Center made working with these databases a priority. The NWHC also wanted to ensure that data from future collection efforts and studies will be properly documented and captured. Once a team was identified, they began working on the Center’s immediate needs through an assessment and the development of a metadata template. While the Center does have some older, inactive records, it is unclear whether they have retained value given the nature of the studies at the Center.

The framework at NWHC:

Plan – Scientists enter plans into Basis+

Acquire –
  Collect: from the field/lab
  Convert: older records are being reformatted with plans to send them to NARA
  Share: MOUs, MOAs, FOIA; sent by entities
  Purchase: Unknown

Preserve – Metadata template to ensure context with data; data entry into multiple databases

Publish/Share – Fill data requests; share photos; send results back to the inquiring entity

Page 10

Upper Midwest Environmental Sciences Center

At UMESC, a majority of scientists have complete control over their data, which means that their individual practices are largely unknown. There is no unified vision for a data management plan. The exceptions are the Long Term Resources Monitoring Program, an element of the Upper Mississippi River Restoration Environmental Management Program, and the regulated studies within the Aquatic Ecosystems Health Branch. The regulated studies must be managed because they are regularly audited. The Librarian handles the paper records from these studies and scans each study’s index file. These studies will be sent to NARA; however, there are 120 cubic feet of backlogged archives that have yet to be processed.

The framework at UMESC:

Plan – Scientists enter plans into Basis+ after completing Administrative Policy and Procedure Form 045.2

Acquire –
  Collect: from the field/lab
  Convert: Librarian is working through backlog of archives
  Share: MOUs/MOAs in place, up to scientists’ discretion
  Purchase: Aerial photography is purchased for some geospatial projects
  Create: Some projects require the creation of geospatial data

Preserve – Librarian scans index sheet of archived regulated studies, preparing to send them to NARA. There is a tape library (on-site and off-site) of data from 1989–2006. The catalog of data is searchable on the UMESC Intranet site.

Publish/Share – Make findings available through ScienceBase, Center website, literature/journals, presentations, guest speakers, public outreach, answering data requests
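One way to compare the three frameworks is to record the Acquire steps from the slides above as plain data and look for gaps. The sketch below is only an illustration: the wording of each entry is condensed from the slides, and the names `ACQUIRE`, `unknown_steps`, and `steps_missing` are hypothetical, not part of any USGS tool.

```python
# Acquire-stage practices as listed on the GLSC, NWHC, and UMESC slides.
# Entry text is condensed from the slides; the structure itself is an assumption.
ACQUIRE = {
    "GLSC": {
        "Collect": "from the field/lab",
        "Convert": "legacy collection is being indexed and reformatted",
        "Share": "MOUs and MOAs in place; FOIA requests",
        "Purchase": "Unknown",
    },
    "NWHC": {
        "Collect": "from the field/lab",
        "Convert": "older records are being reformatted for NARA",
        "Share": "MOUs, MOAs, FOIA; sent by entities",
        "Purchase": "Unknown",
    },
    "UMESC": {
        "Collect": "from the field/lab",
        "Convert": "Librarian is working through backlog of archives",
        "Share": "MOUs/MOAs in place, up to scientists' discretion",
        "Purchase": "aerial photography for some geospatial projects",
        "Create": "some projects require the creation of geospatial data",
    },
}


def unknown_steps(frameworks):
    """Return (center, step) pairs whose practice is recorded as 'Unknown'."""
    return [
        (center, step)
        for center, steps in frameworks.items()
        for step, practice in steps.items()
        if practice.strip().lower() == "unknown"
    ]


def steps_missing(frameworks):
    """Return, per center, the steps that other centers list but this one does not."""
    all_steps = set()
    for steps in frameworks.values():
        all_steps.update(steps)
    return {
        center: sorted(all_steps - set(steps))
        for center, steps in frameworks.items()
    }


if __name__ == "__main__":
    print(unknown_steps(ACQUIRE))
    print(steps_missing(ACQUIRE))
```

A missing step here only means the slide did not list it, not that the Center does not perform it.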

Page 11

The Model Applied

Page 12

GLSC Model: Archives Focus

Page 13

NWHC Model: Future Projects Focus

Page 14

UMESC Model: No Unified Focus

Page 15

Lessons Learned

The three centers had varying levels of application and interest in data management, reinforcing our idea that a one-size-fits-all model will not work in the Ecosystems Mission Area within the Midwest Region.

The type of science dictates how a center will tackle data management. For instance, centers with long-term data sets tend to focus on their archives to ensure continued credibility of those data sets. By contrast, centers that collect science data through opportunistic events or research tend to focus on securing the data and making it immediately searchable and retrievable.

Although it is thought of as a rare occurrence, there have been instances of technology failure at all three Centers. Data and files are more likely to be recovered when the Center has a robust backup system coupled with staff who are knowledgeable about how to use it.

After a Center visit, it became apparent that individual backup practices actually put the Center at risk of data loss. Center management has since designated a network drive for backups to protect against data loss.

Leveraging local resources (universities, other agencies, etc.) can be vital to establishing and maintaining a center’s data management program.

Despite having different approaches to data management, all three centers need more staff, data management tools that can be adapted to specific Center needs, and a directory of “who is doing what”.

It is more cost-efficient to plan for data management activities at the outset of a project. When data are managed upon creation, the process of archiving is much faster. This reduces not only the amount of effort, but also the amount of time it takes to make the data shareable to a wider audience. The USGS recognizes data management as a legitimate cost of science and touts it as a best management practice (BMP).

Page 16

Value

By assisting Biology Centers in developing data management plans utilizing BMPs, the MWR is helping to make data-sharing possible on a larger scale. This will afford Centers the opportunity to enrich their science, products, and contribution to the scientific community.

Not only will DMPs facilitate better science, they are also tools for cost savings. For example, a Center with 100 staff can save roughly $500,000 a year just by having a more efficient search and retrieval process.

Staff will also experience an increase in productivity as they can more easily find what they need and can thus “get back to the science”. As staff are able to work more efficiently, the turnaround time for reporting findings shrinks. This adds to USGS’s reputation: known not only for authoritative science, but also for timely results.
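The savings example above (roughly $500,000 a year for a Center with 100 staff) works out to about $5,000 per staff member per year. The short sketch below makes that arithmetic explicit. The function names are hypothetical, and scaling the figure linearly to other headcounts is an assumption made here for illustration, not a claim from the presentation.

```python
def per_staff_savings(total_savings: float, staff: int) -> float:
    """Average yearly savings per staff member."""
    if staff <= 0:
        raise ValueError("staff must be a positive number")
    return total_savings / staff


def scaled_savings(per_staff: float, staff: int) -> float:
    """Estimate yearly savings for another headcount.

    Assumes savings scale linearly with staff size, which is an
    illustrative simplification and not stated in the slides.
    """
    return per_staff * staff


if __name__ == "__main__":
    rate = per_staff_savings(500_000, 100)  # figures from the Value slide
    print(rate)
    print(scaled_savings(rate, 40))  # 40 is an arbitrary example headcount
```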

Page 17

Recommendations

Continue the SIM Project

Each Biology Center has a unique focus/approach to science. No one tool will work for them all. In order to get all the biology centers to a place where they can share data, the remaining biology centers will need to be surveyed.

Continue to work with the previously visited Centers

The CDI Management Group released a DM website that is packed with tools to help with DM. The SIM team can assist these Centers with applying these tools.

These Centers need a conduit that can tailor these tools to 1) organize their data, 2) share their data, and 3) continue to use their data for future projects.

Examples:

▪ NWHC - needs help developing/testing/implementing the metadata template (Plan); determining which datasets are “shareable” (Publish/Share); and which only need to be archived (Preserve)

▪ UMESC - developing a unified DM Plan for the Center (Plan, Acquire, Preserve, Publish/Share) and implementing it

Develop more DM tools based on previously visited Centers

These tools will be more tailored to specific Center needs

▪ NWHC: series of questions to assist in the metadata template development; UX design and testing of database improvements/acquisition

▪ UMESC: series of adapted procedures for developing a tailored DM plan based on the CDI DM website

Develop a directory of expertise to further assist with DM and other study-specific issues

This has been repeatedly requested by all Centers

Includes name, number, email address, location, abstract of current work, list of past projects/products, list of associated teams/groups, web links

Location/platform for this directory also needs to be investigated

Extend this project to include Water Centers

CDs at the Water Centers have had to face DM issues for a while; assessing their practices would be beneficial to the rest of the Centers

Although the majority of data at the Water SCs is structured, there is also an amount that is unstructured that could be valuable to the Center, MWR, USGS, and the larger scientific community.
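The directory of expertise recommended above lists its fields explicitly, so a record could be sketched as follows. This is a minimal illustration only: the class name `ExpertiseEntry`, the field names, the `find_by_keyword` helper, and the sample entry are all hypothetical, and the slides leave the actual location/platform open.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ExpertiseEntry:
    """One directory record, using the fields named in the recommendation."""
    name: str
    phone: str
    email: str
    location: str
    current_work: str  # abstract of current work
    past_projects: List[str] = field(default_factory=list)
    teams: List[str] = field(default_factory=list)
    web_links: List[str] = field(default_factory=list)


def find_by_keyword(entries: List[ExpertiseEntry], keyword: str) -> List[ExpertiseEntry]:
    """Return entries whose current work or past projects mention the keyword."""
    kw = keyword.lower()
    return [
        e for e in entries
        if kw in e.current_work.lower()
        or any(kw in project.lower() for project in e.past_projects)
    ]


if __name__ == "__main__":
    # Placeholder data for illustration; not a real person or contact.
    directory = [
        ExpertiseEntry(
            name="Example Scientist",
            phone="000-000-0000",
            email="example@example.org",
            location="Example Science Center",
            current_work="Archiving long-term fishery data sets",
            past_projects=["Metadata template testing"],
            teams=["SIM Team"],
        )
    ]
    for entry in find_by_keyword(directory, "metadata"):
        print(entry.name, entry.email)
```

Keeping the record as plain structured data would let it move to whichever platform is eventually chosen, for example a SharePoint list or a simple database table.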

Page 18

What’s going on now?

IN THE MIDWEST REGION…

In light of the sequester, the Center visits for Phase II of the project are on hold.

The focus is developing templates for the application of science data and records management.

How can Center activities be leveraged at the Regional level?

AT THE GLSC…

Began working with Bruce Manny, a Fishery Biologist.

40 years’ worth of data from the Great Lakes’ Connecting Channels.

As we work through the process, we are documenting what we do to serve as a template.

Page 19

Archiving the Connecting Channels

Created a SharePoint site

Center staff can see how we are building the site

They can track the progress of the project

Attempting to demonstrate the benefits of data management and preservation

Submitted a Data Rescue Project Application

Demonstrating, through accessibility and discoverability, that data are not just for one-time use

Demonstrating cost savings by archiving while Bruce is still working with the USGS, rather than waiting until after he retires

Page 20

The Vision

A more integrated network of science information management support across the Midwest Region…

Page 21

~Fin~

Thank You!

Any Questions?

