
Post on 06-Jan-2018


O Log in to Amazon BioLinux
O For Mac users
    O ssh ubuntu@public_dns_address
O For Windows users
    O use PuTTY
    O Hostname: public_dns_address
    O Username: ubuntu

mkdir bioperl
cd bioperl
wget http://biobase.ist.unomaha.edu/~ithapa/myfile.gbk

BioPerl
Ishwor Thapa (02/17/2012)

How Perl saved the human genome project

http://www.bioperl.org/wiki/How_Perl_saved_human_genome

O DATE: Early February, 1996

O LOCATION: Cambridge, England, in the conference room of the largest DNA sequencing center in Europe.

O OCCASION: A high level meeting between the computer scientists of this center and the largest DNA sequencing center in the United States.

O THE PROBLEM: Although the two centers use almost identical laboratory techniques, almost identical databases, and almost identical data analysis tools, they still can't interchange data or meaningfully compare results.

O THE SOLUTION: Perl.

Installing BioPerl
O BioLinux comes with BioPerl
O For other machines (Linux, Mac, Windows):
    O http://www.bioperl.org/wiki/Main_Page

Programming in Perl

print "Hello World!\n";

# Perl declares loop variables with "my"; there is no "int" type keyword
for (my $i = 0; $i < 10; $i++) {
    print "$i\n";
}
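As a small aside to the counting loop above, here is a sketch of the range-based form that Perl programmers often prefer. The array @numbers is not part of the slides; it is added here only so the result can be inspected afterwards.

```perl
use strict;
use warnings;

my @numbers;

# foreach over a range: 0 .. 9 yields the integers 0 through 9
foreach my $i (0 .. 9) {
    print "$i\n";
    push @numbers, $i;    # kept only so the values can be checked later
}
```

Both forms print the same ten lines; the range form avoids writing the counter update and loop condition by hand.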

BioPerl
O Two Main Classes in BioPerl
    O Bio::SeqIO
    O Bio::Seq

Using Bio::SeqIO
O 3 Main Methods
    O new
    O next_seq
    O write_seq

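GenBank to FASTA converter

use strict;
use warnings;
use Bio::SeqIO;

# Read GenBank records from myfile.gbk
my $in = Bio::SeqIO->new(-file => "myfile.gbk", -format => 'Genbank');
# Write FASTA records; the leading ">" opens myfile.fasta for writing
my $out = Bio::SeqIO->new(-file => ">myfile.fasta", -format => 'Fasta');

# Copy every sequence from the input stream to the output stream
while ( my $seq = $in->next_seq() ) {
    $out->write_seq($seq);
}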

Bio::Seq
O Main Methods
    O new
    O seq
    O subseq
    O display_id
    O desc
    O revcom
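The Bio::Seq methods listed above can be tried without any input file by building a sequence object in memory. This is a minimal sketch that assumes BioPerl is installed; the sequence string, the ID "example_1", and the description are made-up illustration values, not data from the slides.

```perl
use strict;
use warnings;
use Bio::Seq;

# Hypothetical example values; any DNA string, ID, and description would do.
my $seq = Bio::Seq->new(
    -seq        => 'ATGGCCTAA',
    -display_id => 'example_1',
    -desc       => 'made-up example sequence',
    -alphabet   => 'dna',
);

print $seq->display_id, "\n";      # example_1
print $seq->desc, "\n";            # made-up example sequence
print $seq->seq, "\n";             # ATGGCCTAA
print $seq->subseq(1, 3), "\n";    # ATG (positions are 1-based and inclusive)
print $seq->revcom->seq, "\n";     # TTAGGCCAT
```

Note that revcom returns a new Bio::Seq object rather than a string, which is why the example calls seq on its result.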

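Using Bio::Seq

use Bio::SeqIO;

my $in = Bio::SeqIO->new(-file => "myfile.gbk", -format => 'Genbank');

while ( my $seq = $in->next_seq() ) {
    print $seq->display_id, "\n";          # sequence identifier
    print $seq->desc, "\n";                # description line
    #print $seq->seq, "\n";                # full sequence string
    #print $seq->subseq(10,20), "\n";      # residues 10 through 20 (1-based, inclusive)
    #print $seq->revcom->seq, "\n";        # reverse complement
}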

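SeqFeatures

use Bio::SeqIO;

my $seq_io = Bio::SeqIO->new(-file => "myfile.gbk", -format => 'Genbank');

while ( my $seq = $seq_io->next_seq() ) {
    my @features = $seq->get_SeqFeatures();
    foreach my $feat (@features) {
        # Only CDS features carry protein_id and translation tags
        if ( $feat->primary_tag eq "CDS" ) {
            # get_tag_values dies on a missing tag (e.g. pseudogenes), so check first
            next unless $feat->has_tag('protein_id') && $feat->has_tag('translation');
            my @pid         = $feat->get_tag_values('protein_id');
            my @translation = $feat->get_tag_values('translation');
            # Print each protein as a FASTA record
            for ( my $index = 0; $index < scalar @pid; $index++ ) {
                print ">$pid[$index]\n";
                print "$translation[$index]\n";
            }
        }
    }
}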
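To see what the feature loop above iterates over, it can help to attach a feature to a sequence by hand and inspect it. The following is a sketch only, assuming BioPerl is installed: the sequence, the ID, and the tag values "EXAMPLE_0001" and "MA" are placeholders invented for illustration, and Bio::SeqFeature::Generic is used here as one convenient feature class, not necessarily the class a GenBank parse returns.

```perl
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqFeature::Generic;

# Made-up 9-base sequence standing in for a GenBank record
my $seq = Bio::Seq->new(
    -seq        => 'ATGGCCTAA',
    -display_id => 'example_1',
    -alphabet   => 'dna',
);

# A hand-built CDS feature; the tag values are placeholders
my $cds = Bio::SeqFeature::Generic->new(
    -start       => 1,
    -end         => 9,
    -strand      => 1,
    -primary_tag => 'CDS',
    -tag         => { protein_id => 'EXAMPLE_0001', translation => 'MA' },
);
$seq->add_SeqFeature($cds);

my @features = $seq->get_SeqFeatures();
foreach my $feat (@features) {
    # primary_tag, start, end and strand say what the feature is and where it lies
    print join( "\t", $feat->primary_tag, $feat->start, $feat->end, $feat->strand ), "\n";

    # get_all_tags lists the tag names attached to this feature
    my @tags = $feat->get_all_tags();
    foreach my $tag ( sort @tags ) {
        my @values = $feat->get_tag_values($tag);
        print "  $tag = ", join( ", ", @values ), "\n";
    }
}
```

Printing every tag this way is a quick check of which tags a record actually provides before relying on a specific one such as protein_id.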