PerlScripting

Date post: 11-Jun-2015
Upload: aureliano-bombarely
Page 1: PerlScripting

by Aureliano Bombarely Gomez

Boyce Thompson Institute for Plant Research

Tower Road

Ithaca, New York 14853-1801

U.S.A.

Writing Perl Scripts

Page 2: PerlScripting

Writing Perl Scripts:

1. Four mandatory lines.

2. Useful modules I: Files.

3. Useful modules II: Options.

4. Documentation and being verbose.

5. Exercise: Assembly stats.

Page 4: PerlScripting

1. Four mandatory lines.

1. LINE: #!/usr/bin/perl

Where? At the beginning of the script.

Why? It tells the operating system which program to use to execute the script.

2. LINE: use warnings;

Where? Before declaring modules and variables (the sooner, the better).

Why? It prints warnings about questionable code, both at compile time and at run time.

3. LINE: use strict;

Where? Before declaring modules and variables (the sooner, the better).

Why? It enforces stricter rules (for example, every variable must be declared) and it will not let a script run if it breaks them.

Page 5: PerlScripting

1. Four mandatory lines.

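4. LINE: 1;

Where? At the end of the script.

Why? It makes the file return a true value. This is required for files loaded with require or use; in a standalone script it is optional and simply marks the end of the code.

#!/usr/bin/perl

use strict;
use warnings;

#############
## MY CODE ##
#############

1;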
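
To illustrate what "use strict" refuses to run, here is a minimal sketch. The offending code is compiled from a string with eval only so that the error can be caught and printed instead of stopping the example; the variable name $total is made up for illustration.

```perl
use strict;
use warnings;

## The code below is compiled from a string so that the compile-time
## error raised under "use strict" can be caught and shown.
my $ok = eval q{
    use strict;
    $total = 10;    ## not declared with "my": rejected under strict
    1;
};

## On failure eval returns undef and the error message is left in $@.
my $error = $ok ? '' : $@;
print "strict refused the code: $error" if (!$ok);

1;
```

Declaring the variable with "my $total = 10;" would let the same code compile.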

Page 7: PerlScripting

2. Useful modules I: Files

JUST A REMINDER: How to open, read, write and close files.

1. OPEN FUNCTION.

open (FILEHANDLE, MODE, REFERENCE);

FILEHANDLE: an undefined scalar variable (autovivified by open).

MODE:
    read, input only:            <
    write, output only:          >
    append to a file:            >>
    read/write update access:    +<
    write/read update access:    +>
    read/append update access:   +>>

REFERENCE: Filename or reference to open.
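
As a minimal sketch of the three most common modes, the following writes, appends to and then reads back a scratch file. The file name modes_demo.txt and its content are made up for illustration.

```perl
use strict;
use warnings;

## Hypothetical scratch file, used only for this illustration.
my $filename = 'modes_demo.txt';

## '>' creates the file (or empties an existing one) for writing.
open(my $ofh, '>', $filename) or die("ERROR OPENING FILE: $!");
print $ofh "first line\n";
close($ofh);

## '>>' also opens for writing, but keeps the existing content.
open(my $afh, '>>', $filename) or die("ERROR OPENING FILE: $!");
print $afh "second line\n";
close($afh);

## '<' opens the file for reading only.
open(my $ifh, '<', $filename) or die("ERROR OPENING FILE: $!");
my @lines = <$ifh>;
close($ifh);

## Remove the scratch file.
unlink($filename);

print "Read " . scalar(@lines) . " lines\n";

1;
```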

Page 8: PerlScripting

2. Useful modules I: Files

JUST A REMINDER: How to open, read, write and close files.

1. OPEN FUNCTION.

SUGGESTION: "use autodie;" instead of "or die("my error")".

## With autodie, open dies by itself if it fails.
use autodie;
open(my $ifh, '<', $input_filename);
open(my $ofh, '>', $output_filename);

## Without autodie, check every open by hand.
open(my $ifh, '<', $input_filename)
    or die("ERROR OPENING FILE: $!");
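
The effect of autodie can be sketched as follows. The path is hypothetical and assumed not to exist; eval is used here only to catch the error so that its message can be printed.

```perl
use strict;
use warnings;
use autodie;

## Hypothetical path, assumed not to exist on this system.
my $input_filename = '/nonexistent_dir_for_demo/input.txt';

## With autodie, a failed open() dies without any "or die(...)".
my $ok = eval {
    open(my $ifh, '<', $input_filename);
    close($ifh);
    1;
};

## The error raised by autodie names the file and the reason.
my $error = $ok ? '' : "$@";
print STDERR "open failed: $error\n" if (!$ok);

1;
```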

Page 9: PerlScripting

2. Useful modules I: Files

JUST A REMINDER: How to open, read, write and close files.

2. READING OPENED FILES.

while (<FILEHANDLE>) {
    ## BLOCK USING $_ AS LINE (don't forget chomp)
}

SUGGESTION: "Know the status of the file".

my @filelines = <FILEHANDLE>;
my $L = scalar(@filelines);
my $l = 0;

foreach my $line (@filelines) {
    $l++;
    print STDERR "Reading line $l of $L lines \r";
}
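
A minimal sketch of the line-by-line loop with chomp, using a lexical filehandle. To keep it self-contained it reads from an in-memory string (open accepts a reference to a scalar in place of a filename); the content is made up for illustration.

```perl
use strict;
use warnings;

## In-memory text standing in for a real file.
my $text = "ATGC\nGGCC\nTTAA\n";
open(my $ifh, '<', \$text) or die("ERROR OPENING FILE: $!");

my @seqs = ();
while (my $line = <$ifh>) {
    chomp($line);                       ## remove the trailing newline
    push(@seqs, $line);
    print STDERR "Reading line $.\r";   ## $. is the current line number
}
print STDERR "\n";
close($ifh);

print "Stored " . scalar(@seqs) . " lines\n";

1;
```

Unlike loading the whole file into an array, this loop keeps one line in memory at a time, but the total number of lines is not known in advance.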

Page 10: PerlScripting

2. Useful modules I: Files

JUST A REMINDER: How to open, read, write and close files.

3. WRITING TO OPENED FILES.

print $ofh "This is printed to the file\n";

4. CLOSING FILES.

close($ofh);

Page 11: PerlScripting

2. Useful modules I: Files

a) File::Basename;

Parse file paths into directory, filename and suffix.

use File::Basename;

## List context: filename, directory and suffix.
my ($name, $path, $suffix) = fileparse($fullname, @suffixlist);

## Scalar context: only the filename.
my $filename = fileparse($fullname, @suffixlist);

my $basename = basename($fullname, @suffixlist);

my $dirname = dirname($fullname);
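
A small worked example of these calls. The path and the suffix list are made up for illustration, and a Unix-like system is assumed.

```perl
use strict;
use warnings;
use File::Basename;

## Hypothetical Unix-style path used only for illustration.
my $fullname = '/home/user/data/assembly.fasta';

## Suffixes are given as patterns (regular expressions).
my @suffixlist = ('\.fasta', '\.fa');

my ($name, $path, $suffix) = fileparse($fullname, @suffixlist);
my $basename = basename($fullname);   ## no suffix list: suffix is kept
my $dirname = dirname($fullname);

print "name=$name path=$path suffix=$suffix\n";
print "basename=$basename dirname=$dirname\n";

1;
```

Here fileparse splits the path into "assembly", "/home/user/data/" and ".fasta", which is handy for building an output name from an input name.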

Page 12: PerlScripting

2. Useful modules I: Files

b) File::Spec;

Operations on file names.

use File::Spec;

my $currdir = File::Spec->curdir();
my $tempdir = File::Spec->tmpdir();

my $path = File::Spec->catfile($currdir, $filename);
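
A minimal sketch of these File::Spec methods in a complete script. The file name assembly_stats.txt is made up for illustration; the exact paths printed depend on the operating system.

```perl
use strict;
use warnings;
use File::Spec;

## Hypothetical file name used only for illustration.
my $filename = 'assembly_stats.txt';

my $currdir = File::Spec->curdir();   ## '.' on Unix-like systems
my $tempdir = File::Spec->tmpdir();   ## a writable temporary directory

## catfile joins directories and a file name with the right separator.
my $relpath = File::Spec->catfile($currdir, $filename);
my $temppath = File::Spec->catfile($tempdir, $filename);

print "Relative path: $relpath\n";
print "Temp path: $temppath\n";

1;
```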

Page 14: PerlScripting

3. Useful modules II: Options

Usual way to pass options: Using @ARGV

#!/usr/bin/perl

use strict;
use warnings;
use autodie;

my ($arg1, $arg2) = @ARGV;

1;

user@comp$ myscript.pl argument1 argument2

Page 15: PerlScripting

3. Useful modules II: Options

Usual way to pass options: Using @ARGV

PROBLEM:

With multiple arguments, the order is easy to confuse.

Mandatory arguments are difficult to check!

SOLUTION:

Use the modules Getopt::Std or Getopt::Long.

Page 16: PerlScripting

3. Useful modules II: Options

Getopt::Std;

Process single-character arguments from the command line.

use Getopt::Std;

our ($opt_i, $opt_o, $opt_V, $opt_H);
getopts('i:o:VH');

## i: and o: expect something after the switch.
my $input = $opt_i || die("ERROR: -i <input> was not supplied.");
my $output = $opt_o || die("ERROR: -o <output> was not supplied.");

## V and H don't expect anything after the switch.
## $help is assumed to hold the usage text, defined elsewhere in the script.
if ($opt_H) {
    print $help;
}

user@comp$ myscript.pl -i argument1 -o argument2 -V -H
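
The slides show only Getopt::Std. As a hedged sketch, the same options could be handled with Getopt::Long roughly as follows. The long option names (--input, --output, --verbose, --help) and the simulated command line are assumptions made for illustration.

```perl
use strict;
use warnings;
use Getopt::Long;

## Simulated command line, standing in for:
##   myscript.pl --input reads.fasta --output stats.txt --verbose
@ARGV = ('--input', 'reads.fasta', '--output', 'stats.txt', '--verbose');

my ($input, $output, $verbose, $help);
GetOptions(
    'input|i=s'  => \$input,     ## =s: expects a string after the switch
    'output|o=s' => \$output,
    'verbose|V'  => \$verbose,   ## no value expected: simple flag
    'help|H'     => \$help,
) or die("ERROR: wrong command line options.\n");

## Mandatory arguments are checked explicitly.
defined($input)  or die("ERROR: --input <input> was not supplied.\n");
defined($output) or die("ERROR: --output <output> was not supplied.\n");

print "input=$input output=$output\n";
print STDERR "Verbose mode is on.\n" if ($verbose);

1;
```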

Page 18: PerlScripting

4. Documentation and being verbose

Three types of documentation:

1) Document code with #.
    GOOD: Useful for developers.
    BAD: Inaccessible to users unless they open the script.

2) Document using perldoc.
    GOOD: Clear and formatted information.
    BAD: perldoc is not always installed on the system.

3) Document using a print function inside the script.
    GOOD: Frequently easy to access. Intuitive.
    BAD: It increases the size of your script.
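
For the second type, a minimal sketch of documentation written in POD, the format that perldoc reads. The section texts are placeholders; the interpreter skips everything between a POD command such as =head1 and =cut, so the block can sit next to the code.

```perl
use strict;
use warnings;

=head1 NAME

myscript.pl - My program description.

=head1 SYNOPSIS

myscript.pl [-H] [-V] -i <input>

=head1 ARGUMENTS

-i <input> input file (mandatory)

=cut

## Code continues after =cut; the POD block above is ignored when running.
my $after_pod = 'code after the POD block runs normally';
print "$after_pod\n";

1;
```

Running "perldoc myscript.pl" would then display the NAME, SYNOPSIS and ARGUMENTS sections.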

Page 20: PerlScripting

4. Documentation and being verbose

Documenting through a function:

sub help {
    print STDERR <<EOF;
$0:

Description:
    My program description.

Synopsis:
    myscript.pl [-H] [-V] -i <input>

Arguments:
    -i <input>    input file (mandatory)
    -H <help>     print help.
    -V <verbose>  be verbose

EOF
    exit(1);
}

Page 21: PerlScripting

4. Documentation and being verbose

Calling help:

use Getopt::Std;

our ($opt_i, $opt_o, $opt_V, $opt_H);
getopts('i:o:VH');

## i: and o: expect something after the switch.
my $input = $opt_i || die("ERROR: -i <input> was not supplied.");
my $output = $opt_o || die("ERROR: -o <output> was not supplied.");

## V and H don't expect anything after the switch.
if ($opt_H) {
    help();
}

Page 22: PerlScripting

4. Documentation and being verbose

Being verbose:

use Getopt::Std;

our ($opt_i, $opt_o, $opt_V, $opt_H);
getopts('i:o:VH');

## i: and o: expect something after the switch.
my $input = $opt_i || die("ERROR: -i <input> was not supplied.");
my $output = $opt_o || die("ERROR: -o <output> was not supplied.");

if ($opt_V) {
    my $date = `date`;
    chomp($date);
    print STDERR "Step 1 [$date]:\n\tParsing -i $input file.\n";
}
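
On systems without an external date command, the built-in localtime function can provide the timestamp instead. The sketch below is an alternative to the backtick call, not the method used in the slides; the helper name verbose_message and the option values are made up for illustration.

```perl
use strict;
use warnings;

## Hypothetical values standing in for the parsed options.
my $opt_V = 1;
my $input = 'reads.fasta';

## Hypothetical helper: builds a verbose message with a timestamp.
sub verbose_message {
    my ($step, $message) = @_;

    ## In scalar context localtime returns a human-readable date string.
    my $date = localtime();
    return "Step $step [$date]:\n\t$message\n";
}

my $msg = verbose_message(1, "Parsing -i $input file.");
print STDERR $msg if ($opt_V);

1;
```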

Page 23: PerlScripting

4. Documentation and being verbose

Being verbose:

my @filelines = <FILEHANDLE>;
my $L = scalar(@filelines);
my $l = 0;

foreach my $line (@filelines) {
    $l++;
    if ($opt_V) {
        print STDERR "Reading line $l of $L lines \r";
    }
}

Page 25: PerlScripting

5. Exercise: Assembly Stats

GOAL: Create a script to calculate:

1) Number of sequences in a file.

2) Total bp in the file.

3) Longest sequence.

4) Shortest sequence.

5) Average and SD of the sequence length.

6) N25, N50, N75, N90, N95 (length and indexes).

Page 26: PerlScripting

5. Exercise: Assembly Stats

6) N25, N50, N75, N90, N95 (length and indexes)

Just a reminder:

With the sequences sorted by decreasing length, take the longest sequences one by one until they add up to at least 50% of the total size of the file (in bp).

N50 length is the length of the shortest sequence in that set.

N50 index is the number of sequences in that set.
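
The N50 definition can be sketched as a small function. This is one possible reading of the reminder, not a reference solution: the function name nxx and the list of lengths are made up for illustration, and the set is considered complete as soon as the running sum reaches the requested fraction of the total.

```perl
use strict;
use warnings;

## Hypothetical helper: returns (length, index) for a fraction of the
## total size, e.g. 0.5 for N50 or 0.9 for N90.
sub nxx {
    my ($fraction, @lengths) = @_;

    my @sorted = sort { $b <=> $a } @lengths;   ## decreasing length
    my $total = 0;
    $total += $_ foreach (@sorted);

    my $sum = 0;
    my $index = 0;
    foreach my $len (@sorted) {
        $sum += $len;
        $index++;
        if ($sum >= $total * $fraction) {
            return ($len, $index);
        }
    }
    return (0, 0);   ## empty input
}

## Made-up sequence lengths: total 55 bp, so 50% is 27.5 bp.
my @lengths = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

## 10 + 9 + 8 = 27 is not enough; adding 7 reaches 34.
my ($n50_length, $n50_index) = nxx(0.5, @lengths);
print "N50 length: $n50_length; N50 index: $n50_index\n";

1;
```

With these lengths the four longest sequences are needed to pass 27.5 bp, so the N50 length is 7 and the N50 index is 4.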