Date post: | 11-Jun-2015 |
by Aureliano Bombarely Gomez
Boyce Thompson Institute for Plant Research
Tower Road, Ithaca, New York 14853-1801
U.S.A.
Writing Perl Scripts
Writing Perl Scripts:
1. Four mandatory lines.
2. Useful modules I: Files.
3. Useful modules II: Options.
4. Documentation and being verbose.
5. Exercise: Assembly stats.
1. Four mandatory lines.
LINE 1: #!/usr/bin/perl
Where? On the very first line of the script.
Why? It tells the operating system which program to use to execute the script.
LINE 2: use warnings;
Where? Before loading modules and declaring variables (the sooner the better).
Why? It enables warnings, at compile time and at run time, about suspicious code such as the use of uninitialized values.
LINE 3: use strict;
Where? Before loading modules and declaring variables (the sooner the better).
Why? It makes Perl reject unsafe constructs, such as undeclared variables, so a script containing these errors will not run.
LINE 4: 1;
Where? At the end of the script.
Why? It ends the script with a true value. Perl only requires this in files loaded with require or use (modules), which must return true; in a standalone script it is a harmless convention.
#!/usr/bin/perl
use strict;
use warnings;

############# MY CODE ###########

1;
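To see what use strict actually catches, the following sketch compiles two snippets with a string eval, which inherits the strict setting of the surrounding script. The variable names ($count and the misspelled $cuont) are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

## A correctly declared variable compiles and runs.
my $good = eval q{ my $count = 1; $count + 1 };

## A misspelled (undeclared) variable is a compile-time error under strict,
## so eval returns undef and the error message is left in $@.
my $bad   = eval q{ my $count = 1; $cuont + 1 };
my $error = $@;

print "good snippet returned: $good\n";
print "bad snippet failed: $error" if !defined $bad;

1;
```

Without use strict the misspelled variable would silently be treated as a new, empty global variable, which is the kind of bug that is hard to track down in a long script.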
2. Useful modules I: Files
JUST A REMINDER: How to open, read, write and close files.
1. OPEN FUNCTION.
open(FILEHANDLE, MODE, REFERENCE);
FILEHANDLE: an undefined scalar variable; open autovivifies it into a filehandle.
MODE:
  <    read (input only)
  >    write (output only; an existing file is emptied first)
  >>   append to a file
  +<   read/write update access
  +>   write/read update access (an existing file is emptied first)
  +>>  read/append update access
REFERENCE: the filename, or a reference, to open.
SUGGESTION: use autodie instead of adding "or die(...)" to every open.
With autodie, a failed open stops the script with an error message on its own:
use autodie;
open(my $ifh, '<', $input_filename);
open(my $ofh, '>', $output_filename);
Without autodie, check the result yourself ($! holds the system error message):
open(my $ifh, '<', $input_filename)
    or die("ERROR OPENING FILE: $!");
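The two styles can be compared side by side. This is a minimal sketch: the path in $missing is a hypothetical one chosen because it should not exist, and eval is used only so that the demonstration can show the autodie message instead of stopping.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $missing = '/no/such/dir/input.txt';   ## hypothetical path, assumed not to exist

## Without autodie: open returns false and the script has to check it.
my $manual_error = '';
open(my $ifh, '<', $missing) or $manual_error = "ERROR OPENING FILE: $!";

## With autodie (lexically scoped to this block): the same failure
## throws an exception without any explicit check.
my $auto_error = '';
{
    use autodie;
    eval { open(my $fh, '<', $missing); 1 } or $auto_error = "$@";
}

print STDERR "manual check: $manual_error\n";
print STDERR "autodie:      $auto_error\n";

1;
```

In a real script the eval would normally be left out, so that the autodie exception ends the program with its message.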
2. READING OPENED FILES.
Line by line:
while (<$ifh>) {
    chomp;
    ## BLOCK USING $_ AS THE LINE
}
SUGGESTION: know the status of the file. Reading all the lines into an array gives the total number of lines, so progress can be reported:
my @filelines = <$ifh>;
my $L = scalar(@filelines);
my $l = 0;
foreach my $line (@filelines) {
    $l++;
    print STDERR "Reading line $l of $L lines \r";
}
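As a self-contained illustration of the line-by-line loop, the sketch below first writes a small throwaway file with File::Temp (the FASTA-like content is made up) and then reads it back, using a named variable instead of $_ and the built-in $. variable for the line number.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use autodie;
use File::Temp qw(tempfile);

## Create a small throwaway input file (contents are made up).
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print $tmp_fh ">seq1\nACGT\n>seq2\nGGCC\n";
close($tmp_fh);

## Read it line by line; only one line is held in memory at a time.
open(my $ifh, '<', $tmp_name);
my @headers;
my $last_line_number = 0;
while (my $line = <$ifh>) {
    chomp($line);                            ## remove the trailing newline
    push @headers, $line if $line =~ /^>/;   ## keep the header lines
    $last_line_number = $.;                  ## $. is the current line number
}
close($ifh);

print "Read $last_line_number lines, ", scalar(@headers), " headers\n";

1;
```

Compared with reading the whole file into an array, this loop cannot report "line X of Y", but it works on files that are larger than the available memory.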
3. WRITING TO OPENED FILES.
print $ofh "This is printed to the file";
4. CLOSING FILES.
close($ofh);
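The difference between the write (>) and append (>>) modes is easy to check with a small round trip. This sketch works in a temporary directory; the file name notes.txt and the helper read_lines are hypothetical names used only here.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use autodie;
use File::Spec;
use File::Temp qw(tempdir);

## Work in a throwaway directory; 'notes.txt' is a made-up file name.
my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, 'notes.txt');

## Helper: return the lines currently stored in a file, without newlines.
sub read_lines {
    my ($name) = @_;
    open(my $ifh, '<', $name);
    my @lines = <$ifh>;
    close($ifh);
    chomp(@lines);
    return @lines;
}

## '>' creates the file (or empties an existing one) before writing.
open(my $ofh, '>', $file);
print $ofh "first line\n";
close($ofh);

## '>>' keeps the existing content and writes at the end.
open(my $afh, '>>', $file);
print $afh "second line\n";
close($afh);
my @after_append = read_lines($file);

## Opening with '>' again discards what was there.
open(my $ofh2, '>', $file);
print $ofh2 "third line\n";
close($ofh2);
my @after_rewrite = read_lines($file);

print "after append:  ", join(' | ', @after_append), "\n";
print "after rewrite: ", join(' | ', @after_rewrite), "\n";

1;
```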
a) File::Basename;
Parses file paths into directory, filename and suffix.
use File::Basename;
## All three parts at once:
my ($name, $path, $suffix) = fileparse($fullname, @suffixlist);
## Or each part on its own:
my $filename = fileparse($fullname, @suffixlist);
my $basename = basename($fullname, @suffixlist);
my $dirname = dirname($fullname);
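A worked example may make the three functions easier to compare. The path and the suffix list below are hypothetical, and the values in the comments assume a Unix-style system.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;

my $fullname   = '/home/user/data/contigs.fasta';   ## hypothetical path
my @suffixlist = ('.fasta', '.fa');                 ## suffixes to strip

## fileparse splits the path into its three parts.
my ($name, $path, $suffix) = fileparse($fullname, @suffixlist);
## $name = 'contigs', $path = '/home/user/data/', $suffix = '.fasta'

## basename returns the file name, without the suffix if it is in the list.
my $basename = basename($fullname, @suffixlist);    ## 'contigs'

## dirname returns the directory, without the trailing slash.
my $dirname = dirname($fullname);                   ## '/home/user/data'

print "name=$name path=$path suffix=$suffix\n";
print "basename=$basename dirname=$dirname\n";

1;
```

A typical use is building an output name from the input name, for example "$dirname/$basename.stats.txt".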
b) File::Spec;
Portable operations on file names.
use File::Spec;
my $curdir = File::Spec->curdir();
my $tmpdir = File::Spec->tmpdir();
my $path = File::Spec->catfile($curdir, $filename);
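The sketch below shows these calls building and taking apart a path without hard-coding a directory separator. The directory and file names (results, stats.txt) are invented for the example, and nothing is created on disk.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

## Directory for temporary files on this system (for example /tmp on Linux).
my $tmpdir = File::Spec->tmpdir();

## Join directory and file names with the separator of the current system.
## 'results' and 'stats.txt' are made-up names; no file is created here.
my $path = File::Spec->catfile($tmpdir, 'results', 'stats.txt');

## Split the path again into volume, directories and file name.
my ($volume, $dirs, $file) = File::Spec->splitpath($path);

## Check whether the path is absolute.
my $is_absolute = File::Spec->file_name_is_absolute($path) ? 'yes' : 'no';

print "tmpdir:   $tmpdir\n";
print "path:     $path\n";
print "file:     $file\n";
print "absolute: $is_absolute\n";

1;
```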
3. Useful modules II: Options
Usual way to pass options: using @ARGV
#!/usr/bin/perl
use strict;
use warnings;
use autodie;

my ($arg1, $arg2) = @ARGV;

1;
user@comp$ myscript.pl argument1 argument2
PROBLEM:
With multiple arguments, positional options become confusing.
Mandatory arguments are difficult to check.
SOLUTION:
Use the modules Getopt::Std or Getopt::Long.
Getopt::Std;
Processes single-character switches from the command line.
use Getopt::Std;
our ($opt_i, $opt_o, $opt_V, $opt_H);
getopts('i:o:VH');
## i: and o: expect something after the switch.
my $input = $opt_i || die("ERROR: -i <input> was not supplied.");
my $output = $opt_o || die("ERROR: -o <output> was not supplied.");
## V and H don't expect anything after the switch.
if ($opt_H) {
    print $help;    ## $help: usage text defined elsewhere in the script
}
user@comp$ myscript.pl -i argument1 -o argument2 -V -H
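Getopt::Long, the other module named above, accepts long option names as well. The following is a minimal sketch rather than a full script: the option names (input, output, verbose, help) are assumptions chosen to mirror the switches above, and the command line is simulated with an array so the example runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

## Simulated command line, equivalent to:
##   myscript.pl --input contigs.fasta --output stats.txt --verbose
my @args = ('--input', 'contigs.fasta', '--output', 'stats.txt', '--verbose');

my ($input, $output, $verbose, $help);
GetOptionsFromArray(
    \@args,
    'input|i=s'  => \$input,     ## =s : expects a string after the option
    'output|o=s' => \$output,
    'verbose|V'  => \$verbose,   ## no =s : a simple on/off flag
    'help|H'     => \$help,
) or die("ERROR: could not parse the command line options.\n");

## Mandatory arguments are checked explicitly.
defined($input)  or die("ERROR: --input <input> was not supplied.\n");
defined($output) or die("ERROR: --output <output> was not supplied.\n");

print "input=$input output=$output verbose=", ($verbose ? 'on' : 'off'), "\n";

1;
```

In a real script GetOptions(...) would be called instead, with the same option list, so that the values come from @ARGV.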
4. Documentation and being verbose
Three types of documentation:
1) Document the code with # comments.
   GOOD: Useful for developers.
   BAD: Inaccessible to users unless they open the script.
2) Document using POD, read with perldoc.
   GOOD: Clear, formatted information.
   BAD: perldoc is not always installed on the system.
3) Document using a print function inside the script.
   GOOD: Usually easy to access. Intuitive.
   BAD: It increases the size of your script.
Documenting through a function;
sub help {
    print STDERR <<EOF;
$0:
  Description:
    My program description.
  Synopsis:
    myscript.pl [-H] [-V] -i <input>
  Arguments:
    -i <input>    input file (mandatory)
    -H <help>     print help.
    -V <verbose>  be verbose
EOF
    exit(1);
}
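For comparison, the perldoc style of documentation can be sketched as follows. This is one possible layout, not the only one: the POD text repeats the made-up description used in the help function, and the Pod::Usage module is an addition that is not part of the original slides. It lets the -H switch print the same POD that perldoc would show.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Std;
use Pod::Usage;

## Hypothetical script: only the -H switch is handled in this sketch.
my %opt;
getopts('H', \%opt) or pod2usage(-verbose => 0, -exitval => 2);

## -verbose => 1 prints the SYNOPSIS and the ARGUMENTS sections below.
pod2usage(-verbose => 1, -exitval => 0) if $opt{H};

print "Running without -H; use 'perldoc $0' to read the documentation.\n";

=head1 NAME

myscript.pl - my program description

=head1 SYNOPSIS

myscript.pl [-H] [-V] -i <input>

=head1 ARGUMENTS

 -i <input>   input file (mandatory)
 -H           print help
 -V           be verbose

=cut

1;
```

The POD block is ignored when the script runs, so it can sit anywhere between statements; many scripts place it at the very end of the file.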
Calling help;
use Getopt::Std;
our ($opt_i, $opt_o, $opt_V, $opt_H);
getopts('i:o:VH');
## i: and o: expect something after the switch.
my $input = $opt_i || die("ERROR: -i <input> was not supplied.");
my $output = $opt_o || die("ERROR: -o <output> was not supplied.");
## V and H don't expect anything after the switch.
if ($opt_H) {
    help();
}
Being verbose;
use Getopt::Std;
our ($opt_i, $opt_o, $opt_V, $opt_H);
getopts('i:o:VH');
## i: and o: expect something after the switch.
my $input = $opt_i || die("ERROR: -i <input> was not supplied.");
my $output = $opt_o || die("ERROR: -o <output> was not supplied.");
if ($opt_V) {
    my $date = `date`;
    chomp($date);
    print STDERR "Step 1 [$date]:\n\tParsing -i $input file.\n";
}
The -V switch can also control progress messages while reading a file:
my @filelines = <$ifh>;
my $L = scalar(@filelines);
my $l = 0;
foreach my $line (@filelines) {
    $l++;
    if ($opt_V) {
        print STDERR "Reading line $l of $L lines \r";
    }
}
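When a script has many steps, the if ($opt_V) test can be wrapped in a small helper. This is only a sketch: verbose_message is a hypothetical name, $opt_V is set by hand here instead of by getopts, and Perl's built-in localtime is swapped in for the external date command so that the example does not depend on the operating system.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $opt_V = 1;    ## stand-in for the -V switch normally set by getopts

## Print a timestamped message to STDERR when verbose mode is on.
## Returns the message that was printed, or '' when verbose mode is off.
sub verbose_message {
    my ($enabled, $step, $text) = @_;
    return '' unless $enabled;
    my $date    = localtime();    ## scalar context gives a readable date string
    my $message = "Step $step [$date]:\n\t$text\n";
    print STDERR $message;
    return $message;
}

my $shown  = verbose_message($opt_V, 1, 'Parsing -i input file.');
my $hidden = verbose_message(0,      2, 'This step is not reported.');

1;
```

Writing the messages to STDERR keeps them separate from the results on STDOUT, so the output of the script can still be redirected to a file.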
5. Exercise: Assembly Stats
GOAL: Create a script to calculate:
1) Number of sequences in a file.
2) Total bp in the file.
3) Longest sequence.
4) Shortest sequence.
5) Average length and SD.
6) N25, N50, N75, N90, N95 (length and index).
Just a reminder:
N50 length is the length of the shortest sequence in the set of longest sequences that together contain at least 50% of the total size of the file (in bp), with the sequences ordered by decreasing length.
N50 index is the number of sequences in that set.
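As a starting point for goal 6, the N50 definition can be turned into a short function. This is one possible sketch, not a reference solution: the name nx_stats and the five sequence lengths are invented for the example, and reading the lengths from a sequence file is left to the exercise.

```perl
#!/usr/bin/perl
use strict;
use warnings;

## Return the (length, index) pair for a given percentage, for example
## 50 for N50. $lengths is a reference to an array of sequence lengths.
sub nx_stats {
    my ($lengths, $percent) = @_;

    ## Order the lengths from longest to shortest.
    my @sorted = sort { $b <=> $a } @{$lengths};

    ## Total size in bp and the amount that has to be reached.
    my $total = 0;
    $total += $_ for @sorted;
    my $threshold = $total * $percent / 100;

    ## Add up the lengths until the threshold is reached.
    my ($cumulative, $index) = (0, 0);
    foreach my $length (@sorted) {
        $cumulative += $length;
        $index++;
        return ($length, $index) if $cumulative >= $threshold;
    }
    return (0, 0);    ## no sequences were given
}

## Made-up lengths: total 30 bp, so the N50 threshold is 15 bp.
my @lengths = (2, 10, 6, 8, 4);

my ($n50_length, $n50_index) = nx_stats(\@lengths, 50);   ## 10 + 8 = 18 >= 15
my ($n90_length, $n90_index) = nx_stats(\@lengths, 90);   ## 10 + 8 + 6 + 4 = 28 >= 27

print "N50 length: $n50_length, N50 index: $n50_index\n";  ## 8 and 2
print "N90 length: $n90_length, N90 index: $n90_index\n";  ## 4 and 4

1;
```

The same function covers N25, N75 and N95 by changing the percentage, so the script only needs one loop over the list (25, 50, 75, 90, 95).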