Date post: | 13-Apr-2017 |
Category: | Science |
Upload: | gigascience-bgi-hong-kong |
Where are we now? Views of the genome data wars from the field.
0000-0001-6444-1436
@SCEdmunds
Circa 2002: Genome Wars pt. II
Rice was a key battle between the Bermuda & Fort Lauderdale meetings.
Commercial (Syngenta) v academic research community.
As with the Celera paper, Science was again willing to publish a genome without the data in the public domain.
https://www.newscientist.com/article/dn2061-fears-over-rice-genome-access/
Genome Wars: the Empire Strikes Back
“A maximum of 15 Kb of DNA or 15 K amino acids can be submitted in a FASTA format, and appropriate BLAST searches will be performed by SBI. Alignment results of the search will be sent via e-mail to the requestor. Rice contigs identified by these alignments can be requested for further analysis using the sequence submission/contig request form. Up to 100 Kb of sequence information may be downloaded per week under your account.”
“TMRI will make its sequence assembly of the whole rice genome available on a CD-ROM under the terms of the Free Public Access Agreement for TMRI Whole Genome Sequence.”
https://web.archive.org/web/20021009130336/http://portal.tmri.org/rice/RiceAccess.html
Meanwhile in China…
“Science congratulates Chinese scientists”
Back to back publication, April 2002
Yu et al. (BGI) & Goff et al. (Syngenta/Myriad), Science 296, 79
BGI data public [AAAA00000000]
Circa 2002: Genome Wars
5 April, 2002 Beijing
http://www.agbioforum.org/v8n23/v8n23a07-pray.htm
Syngenta closed the TMRI database; the data became part of the IRGSP consortium paper published in 2005.
Fort Lauderdale, January 2003.
NAS: “UPSIDE: the Uniform Principle for Sharing Integral Data and Materials Expeditiously”.
AAAS: “All data necessary to understand, assess, and extend the conclusions of the manuscript must be available to any reader of Science”.
Circa 2003: The aftermath
[Chart: number of papers (0 to 700) by year (1996 to 2008), rice v wheat]
Rice v Wheat: consequences of publicly available genome data.
http://www.tandfonline.com/doi/abs/10.1080/08109028.2011.631275
Circa 2003-date: The Legacy
IRRI GALAXY
Rice 3K project: 3,000 rice genomes, 13.4 TB public data
Circa 2014: Big Data
http://www.gigasciencejournal.com/content/3/1/7
IRRI GALAXY
Rice 3K project: 3,000 rice genomes, 120 TB public data
Circa 2015: Bigger Data
https://aws.amazon.com/public-data-sets/3000-rice-genome/
http://www.gigasciencejournal.com/content/3/1/23
http://www.gigasciencejournal.com/content/4/1/19
Compute publishing: Virtual Machines
• Downloadable as virtual hard disk/available as Amazon Machine Image
http://www.gigasciencejournal.com/content/4/1/33
http://www.gigasciencejournal.com/content/4/1/47
Compute publishing: Containers
• Archived Docker images/available via Docker Hub & bioboxes registry
Compute publishing: consequences?
• Cost us $1,000 in AWS credits to review one paper. Scalable?
• Is the era of free open data over?
• Are we happy with the AWSification of research? Research-as-a-Service?
• If not, who will pay?