Date posted: 11-Aug-2014
CCAMBIO and the mARS project
Anton Van de Putte
CCAMBIO Annual Meeting
12 May 2014
Microbial Antarctic Resources System
An information system dedicated to facilitating the discovery, access and analysis of georeferenced, molecular microbial diversity (meta)data generated by Antarctic researchers, in an open fashion.
What’s happened so far
• mARS Workshop hosted at the Belgian Science Policy Office (BELSPO, Brussels) in May 2012
• mARS Workshop held during the SCAR Open Science Conference (Portland, OR) in July 2012
• Technical mARS Workshop hosted at the Université Libre de Bruxelles in December 2013
• Development of the database and web platform initiated
Near future planning
• mARS Workshop to be held during the SCAR Open Science Conference (Auckland, NZ) on 27 August 2014
• Present a proof of concept of the data infrastructure to be used for mARS
The Vision
4 incremental steps
Step 1: Data description and discovery
Integrated Publishing Toolkit (IPT)
Step 2: Habitat and Microbial Sequence Metadata Entry
MiMARKS and mARS Sequence Set Template
Step 3: Georeferenced molecular sequence database integration
Step 4: Processing batch sequence data
Circum-Antarctic microbial diversity
Standard Operating Procedure
How to get started
Getting Data into mARS
• Requires that
• Data is accessible in a public repository (GenBank, IMG-M or another web-accessible repository)
• 2 additional metadata files
• MiMARKS
• Microbial Sequence spreadsheet
0. Before you start
• Clearly identify your needs
• You have a project that you would like to register with mARS
• No sequence data or environmental data at this point: skip Steps 1, 2, 4 and 7.
• Environmental data, but no publicly available sequences yet: follow all Steps below, but do not enter Sequence IDs in the forms.
• Environmental data and publicly available sequences: follow all Steps below.
• Send an email to request a username and password from the IPT administrator
• Make a copy of the MiMarks Googlesheet from the RDP MiMarks Googlesheet (click on “Make copy” from the “File” menu).
• Make a copy of the Microbial Sequence Set from the mARS Googlesheet (click on “Make copy” from the “File” menu).
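The three cases listed at the start of step 0 can be written down as a small lookup. This is only a sketch: the function name and its two boolean parameters are hypothetical, and the step numbers are copied from the slide as-is. The slide does not say what to do with public sequences but no environmental data, so that combination is rejected here rather than guessed.

```python
def steps_to_skip(has_environmental_data, has_public_sequences):
    """Return (steps to skip, whether to enter Sequence IDs in the forms).

    Hypothetical helper that mirrors the three cases on the slide.
    """
    if not has_environmental_data and not has_public_sequences:
        # Project registration only: the slide says to skip Steps 1, 2, 4 and 7.
        return {1, 2, 4, 7}, False
    if has_environmental_data and not has_public_sequences:
        # Follow all steps, but leave Sequence IDs out of the forms.
        return set(), False
    if has_environmental_data and has_public_sequences:
        # Follow all steps, including Sequence IDs.
        return set(), True
    # Sequences without environmental data is not covered by the slide.
    raise ValueError("case not described: sequences without environmental data")


if __name__ == "__main__":
    print(steps_to_skip(True, False))
```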
1. Prepare your MiMarks spreadsheet
• In the MiMarks Googlesheet you’ve created in step 0, fill in your environmental metadata details using the “Google Documents” interface, following the instructions available from the MiMarks Googlesheet documentation at RDP. Example files are available from the mARS website.
• In the header for each column that will hold your sequence set data, list the unique identifier of your sequence set.
• Once you are finished, download your spreadsheet as a CSV (Comma-separated Values) file to your computer.
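Before uploading, it may help to check that every sequence set column in the downloaded CSV has a unique, non-empty identifier in its header. The sketch below uses only Python's standard `csv` module. It assumes the first row is the header and that the first `label_columns` columns hold field labels rather than sequence sets; that layout, the function name and the sample column names are assumptions, not part of the MiMarks template.

```python
import csv
import io
from collections import Counter


def sequence_set_header_problems(csv_text, label_columns=1):
    """List problems with the sequence set identifiers in the header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    # Everything after the label columns is treated as a sequence set column.
    ids = [cell.strip() for cell in header[label_columns:]]
    problems = []
    if not ids:
        problems.append("no sequence set columns found")
    for position, value in enumerate(ids, start=label_columns + 1):
        if not value:
            problems.append(f"column {position}: empty sequence set identifier")
    for value, count in Counter(v for v in ids if v).items():
        if count > 1:
            problems.append(f"identifier {value!r} is used by {count} columns")
    return problems


if __name__ == "__main__":
    # Hypothetical content; real files come from the Googlesheet download.
    sample = "field,setA,,setA\nsome_field,1,2,3\n"
    for problem in sequence_set_header_problems(sample):
        print(problem)
```

An empty result means the header row passed both checks; it says nothing about whether the metadata values themselves are valid.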
2. Prepare your Microbial Sequence Set spreadsheet
• In the Microbial Sequence Set Googlesheet you’ve created in step 0, fill in all the fields (replacing the examples provided in the Googlesheet).
• Once you are finished, download your spreadsheet as a CSV (Comma-separated Values) file to your computer.
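Since all fields should be filled in, a quick scan for empty cells in the downloaded CSV can catch omissions before upload. This is a minimal sketch under the assumption that the file has one header row and one row per sequence set; the actual template may be laid out differently, and the column names in the example are hypothetical.

```python
import csv
import io


def empty_cells(csv_text):
    """Return (row number, column name) for every empty or missing cell.

    Row numbers count the header as row 1, as in a spreadsheet view.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = []
    for row_number, row in enumerate(reader, start=2):
        for column in reader.fieldnames or []:
            value = row.get(column)
            # DictReader yields None when a row has fewer cells than the header.
            if value is None or not value.strip():
                missing.append((row_number, column))
    return missing


if __name__ == "__main__":
    # Hypothetical content; real files come from the Googlesheet download.
    sample = "set_id,repository,accession\nsetA,GenBank,\nsetB,,X1\n"
    print(empty_cells(sample))
```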
3. Describe your data in the IPT
• Log in to the IPT using your credentials.
• Use the form at the bottom of the “Manage Resource” page to create a new resource. Provide a unique "shortname" for your dataset.
• Click the “Create” button. You will arrive on the Resource Management page.
• Click on the “Edit” button in the Metadata section on the left and fill in the details for the different metadata sections. Detailed instructions are available from the IPT quick reference guide. Hint: mention your grant number in the “Project Data” section, to allow us to link your resource to relevant projects in the GCMD/AMD.
4. Upload your MiMarks and Microbial Sequence Set
• 1. In your IPT session, from your Resource Management page, click on the “Choose file” button in the “Source data” section on the left of the page.
• 2. Point to your completed MiMarks CSV, and click on “Choose”
• 3. Click on the “add” button in the “Source data” section on the left of the page then click on the “Save” button on the bottom. Your MiMarks CSV file is now uploaded on the IPT.
• 5. From your Resource Management page, click on the “Choose file” button in the “Source data” section on the left of the page.
• 6. Point to your completed Microbial Sequence Set CSV, and click on “Choose”
• 7. Click on the “add” button in the Source data section on the left of the page, then click on the “Save” button. Your Microbial Sequence Set CSV file is now uploaded on the IPT.
5. Publish and register your data
• From your Resource Management page, click on the “Publish” button in the “Published release” section on the left of the page. Do not worry if you see the warning message “Source data or Darwin Core mappings missing. No data archive generated”.
• By default, your resource’s visibility is set to “Private”. To allow your resource to become visible on the IPT for all users, click on the “Public” button in the “Visibility” section.
• Ask one of the administrators to “Register” your dataset.