Introduction to Perl Bioinformatics. What is Perl? Practical Extraction and Report Language A...

Date post: 19-Dec-2015
Introduction to Perl Bioinformatics
Transcript
Page 1: Introduction to Perl Bioinformatics. What is Perl? Practical Extraction and Report Language A scripting language Components an interpreter scripts: text.

Introduction to Perl

Bioinformatics

What is Perl?
  Practical Extraction and Report Language
  A scripting language
  Components
    an interpreter
    scripts: text files created by the user, describing a sequence of steps to be performed by the interpreter

Installation
  Create a Perl directory under C:\
  Either
    Download AP.msi from the course website (http://curry.ateneo.net/~jpv/BioInf07/) and execute it (installs into the C:\Perl directory)
    Or download and unzip AP.zip into C:\Perl
  Reset the path variable first (or edit C:\autoexec.bat) so that you can execute scripts from MSDOS
    C> path=%path%;c:\Perl\bin

Writing and Running Perl Scripts
  Create/edit script (extension: .pl)
    C> edit first.pl
  Execute script
    C> perl first.pl
  * Tip: place your scripts in a separate work directory

    # my first script
    print "Hello World";
    print "this is my first script";
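
As a minimal sketch, here is the same script with one change: each message ends in "\n", since without it the two messages are printed on the same line. The use strict and use warnings lines and the variable names are additions for illustration, not part of the original slide.

```perl
use strict;
use warnings;

# my first script, with a newline at the end of each message
my $greeting = "Hello World\n";
my $note     = "this is my first script\n";

print $greeting;
print $note;
```

Save it as first.pl in your work directory and run it with: C> perl first.pl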

Perl Features
  Statements
  Strings
  Numbers and Computation
  Variables and Interpolation
  Input and Output
  Files
  Conditions and Loops
  Pattern Matching
  Arrays and Lists

Statements
  A Perl script is a sequence of statements
  Examples of statements

    print "Type in a value";
    $value = <>;
    $square = $value * $value;
    print "The square is ", $square, "\n";

Comments
  Lines that start with # are ignored by the Perl interpreter
    # this is a comment line
  In a line, characters that follow # are also ignored
    $count = $count + 1; # increment $count

Strings
  String
    Sequence of characters
    Text
  In Perl, characters should be surrounded by quotes
    'I am a string'
    "I am a string"
  Special characters are specified through escape sequences (preceded by a \ )
    "a newline\n and a tab\t"

Numbers
  Integers are specified as a sequence of digits
    6
    453
  Decimal numbers:
    33.2
    6.04E24 (scientific notation)

Variables
  Variable: named storage for values (such as strings and numbers)
  Names are preceded by a $
  Sample use:

    $count = 5;          # assignment statement
    $message = "Hello";  # another assignment
    print $count;        # print the value of a variable

Computation
  Fundamental arithmetic operations: + - * /
  Others
    ** exponentiation
    () grouping
  Example (try this out as a Perl script)

    $x = 4;
    $y = 2;
    $z = (3 + $x) ** $y;
    print $z, "\n";
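
To see why the grouping parentheses matter, the following sketch compares the slide's expression with ungrouped variants. The variable names other than $x and $y are made up for illustration.

```perl
use strict;
use warnings;

my $x = 4;
my $y = 2;

my $grouped   = (3 + $x) ** $y;   # (3 + 4) ** 2 = 49
my $ungrouped = 3 + $x ** $y;     # ** binds tighter than +, so 3 + 16 = 19
my $chained   = 2 ** 3 ** 2;      # ** groups right to left: 2 ** 9 = 512
my $quotient  = 7 / 2;            # division is not truncated: 3.5

print $grouped, "\n";
print $ungrouped, "\n";
print $chained, "\n";
print $quotient, "\n";
```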

Interpolation
  Given the following script:

    $x = "Smith";
    print "Good morning, Mr. $x";
    print 'Good morning, Mr. $x';

  Strings quoted with "" perform expansions on variables
  Escape characters like \n are also interpreted when strings are quoted with "" but not when they are quoted with ''
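
The difference between the two quoting styles can be checked with a short sketch. The variables $double and $single are names chosen here for illustration.

```perl
use strict;
use warnings;

my $x = "Smith";

# Double quotes: $x is replaced by its value and \n becomes a newline
my $double = "Good morning, Mr. $x\n";

# Single quotes: the characters $x and \n are kept literally
my $single = 'Good morning, Mr. $x\n';

print $double;          # Good morning, Mr. Smith
print $single, "\n";    # Good morning, Mr. $x\n
```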

Input and Output
  Output
    print function
    Escape characters
    Interpolation
  Input
    Bracket operator (e.g., $line = <>; )
    Not typed (takes in strings or numbers)
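
A sketch of what "not typed" means in practice: the same input line can be used as a number or as a string. So that the example runs without anyone typing, it reads from an in-memory filehandle ($in, an assumption made here) instead of the keyboard; with real input you would write $line = <>; as on the slide.

```perl
use strict;
use warnings;

# Stand-in for keyboard input: what <> would return if the user typed 42
my $typed = "42\n";
open(my $in, '<', \$typed) or die "Cannot open in-memory input: $!";
my $line = <$in>;
close($in);

chomp $line;                     # drop the trailing newline

my $doubled = $line * 2;         # used as a number: 84
my $label   = "value=$line";     # used as a string: value=42

print $doubled, "\n";
print $label, "\n";
```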

Input Files
  Opening a file
    open INFILE, 'data.txt';
  Input
    $line = <INFILE>;
  Closing a file
    close INFILE;
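
The three steps above can be combined into a loop that reads a file line by line. This is only a sketch: it first creates its own small sample file with File::Temp (a core module) so that it is self-contained, and it adds an "or die" check and the three-argument form of open, neither of which appears on the slide.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small sample file so the sketch does not depend on data.txt
my ($tmp, $filename) = tempfile();
print $tmp "first line\nsecond line\n";
close($tmp);

# Open for reading; stop with a message if the file cannot be opened
open(INFILE, '<', $filename) or die "Cannot open $filename: $!";

my $count = 0;
my $last  = '';
while (defined(my $line = <INFILE>)) {   # <INFILE> returns undef at end of file
    chomp $line;
    $count = $count + 1;
    $last  = $line;
}
close(INFILE);
unlink $filename;                         # remove the sample file

print "Read $count lines; the last one was '$last'\n";
```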

Output Files
  Opening
    open OUTFILE, '>result.txt';
    Or, open OUTFILE, '>>result.txt'; # append
  Output
    print OUTFILE "Hello";
  Closing files
    close OUTFILE;
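
To illustrate the difference between > (overwrite) and >> (append), the sketch below writes a file, appends to it, and reads it back. It works in a temporary directory created with File::Temp rather than in the current directory, and the "or die" checks and the @lines array are additions for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);     # temporary directory, removed at exit
my $file = "$dir/result.txt";

# '>' creates the file (or empties an existing one)
open(OUTFILE, '>', $file) or die "Cannot write $file: $!";
print OUTFILE "Hello\n";
close(OUTFILE);

# '>>' keeps the existing contents and adds to the end
open(OUTFILE, '>>', $file) or die "Cannot append to $file: $!";
print OUTFILE "World\n";
close(OUTFILE);

# Read the file back to check what it contains
open(INFILE, '<', $file) or die "Cannot read $file: $!";
my @lines = <INFILE>;                 # all lines at once
close(INFILE);

print @lines;
```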

Conditions
  Can execute statements conditionally

  Syntax:
    if ( condition )
    {
        statement
        statement
        …
    }

  Example:
    if ( $num > 1000 )
    {
        print "Large";
    }

If - Else

    $num = <>;
    if ( $num > 1000 )
    {
        print "Large number\n";
    }
    else
    {
        print "Small number\n";
    }
    print "Thanks\n";
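
When there are more than two cases, Perl's elsif keyword chains the tests. It is not covered on the slides, so treat this as a supplementary sketch; the thresholds and the variable $size are made up, and $num is set directly instead of being read with <> so the result is predictable.

```perl
use strict;
use warnings;

my $num = 250;      # fixed value for the example; normally read with <>
my $size;

if ( $num > 1000 )
{
    $size = "Large";
}
elsif ( $num > 100 )     # tested only if the first condition was false
{
    $size = "Medium";
}
else
{
    $size = "Small";
}

print "$size number\n";
```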

Loops
  Repetitive execution

  Syntax:
    while ( condition )
    {
        statement
        statement
        …
    }

  Example:
    $count = 0;
    while ( $count < 10 )
    {
        print "counting-", $count;
        $count = $count + 1;
    }
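
A small variation on the counting loop, as a sketch: besides printing, it accumulates a running total, which shows that the loop body runs for $count = 0 through 9 and stops once $count reaches 10. The variable $total is an addition for illustration.

```perl
use strict;
use warnings;

my $count = 0;
my $total = 0;

while ( $count < 10 )
{
    print "counting-", $count, "\n";
    $total = $total + $count;    # add the current count to the running total
    $count = $count + 1;
}

print "final count: $count, total: $total\n";
```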

Conditions
  ( expr symbol expr )
  Numbers
    ==  equal
    !=  not equal
    <   less than
    >   greater than
    <=  less than or equal
    >=  greater than or equal
  Strings
    eq ne lt gt le ge
    =~  pattern match
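
Choosing between the numeric and the string operators changes the result, because the string operators compare character by character. The following sketch illustrates this; the variable names and the ?: operator used to record each outcome are additions not found on the slides.

```perl
use strict;
use warnings;

# Same values, compared as numbers and as strings
my $numeric_equal = ( "10" == 10.0 )   ? "yes" : "no";   # yes: both are the number 10
my $string_equal  = ( "10" eq "10.0" ) ? "yes" : "no";   # no: the characters differ

my $numeric_less  = ( 9 < 10 )         ? "yes" : "no";   # yes: 9 is smaller than 10
my $string_less   = ( "9" lt "10" )    ? "yes" : "no";   # no: "9" sorts after "1"

print "== $numeric_equal, eq $string_equal\n";
print "<  $numeric_less, lt $string_less\n";
```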

Functions
  length $str
    returns the number of characters in $str
  defined $str
    tests whether $str holds a defined value (useful for testing if $line = <>; succeeded)
  chomp $str
    removes a trailing newline from $str, if there is one (useful because $line = <>; includes the newline character)
  print $var
    displays $var on the output device
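
A short sketch of these functions on a sample string. The sequence "ACGT" and the variable names are made up for illustration. Note that chomp removes only a trailing newline, so calling it a second time leaves the string unchanged.

```perl
use strict;
use warnings;

my $line = "ACGT\n";              # as if read with $line = <>;

my $before = length $line;        # 5: four letters plus the newline
chomp $line;
my $after = length $line;         # 4: the newline is gone
chomp $line;                      # no trailing newline left, nothing happens
my $again = length $line;         # still 4

my $unset;                        # declared but never given a value
my $status = defined($unset) ? "defined" : "undefined";

print "$before $after $again $status\n";
```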

Pattern Matching
  <string> =~ <pattern> is a condition that checks if a string matches a pattern
  Simplest case: <pattern> specifies a search substring
  Example:
    if ( $s =~ /bio/ ) …
  holds TRUE if $s is "molecular biology", "bioinformatics", "the bionic man";
  FALSE if $s is "chemistry", "bicycle", "a BiOpsy" (matching is case-sensitive)
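
The six sample strings can be tested in a loop, as sketched below. The array, the foreach loop and the /i modifier (which makes the match ignore case) go beyond what the slides cover and are included only to show why "a BiOpsy" fails the plain /bio/ test.

```perl
use strict;
use warnings;

my @candidates = (
    "molecular biology", "bioinformatics", "the bionic man",
    "chemistry", "bicycle", "a BiOpsy",
);

my $exact_hits  = 0;    # strings matching /bio/
my $nocase_hits = 0;    # strings matching /bio/i

foreach my $s (@candidates) {
    if ( $s =~ /bio/ )  { $exact_hits  = $exact_hits  + 1; }
    if ( $s =~ /bio/i ) { $nocase_hits = $nocase_hits + 1; }
}

print "case-sensitive matches: $exact_hits\n";      # 3
print "case-insensitive matches: $nocase_hits\n";   # 4
```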

Special pattern matching characters
  \w  word character (letter, digit, or underscore)
  \d  digit
  \s  space character (space, tab, \n)
  if ( $s =~ /\w\w\s\d\d\d/ ) …
    holds TRUE for "CS 123 course", "Take Ma 101 today"
    FALSE for "Only 1 number here"

Special pattern matching characters
  .  any character (except newline, by default)
  ^  beginning of string/line
  $  end of string or line
  if ( $s =~ /^\d\d\d\ss..r/ ) …
    holds TRUE for "300 spartans"
    FALSE for "all 100 stars"
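
The TRUE/FALSE claims for the patterns /\w\w\s\d\d\d/ and /^\d\d\d\ss..r/ can be checked with a sketch like the following. The result variables are names chosen here; the last test drops the ^ anchor to show that it is the anchor, not the rest of the pattern, that rejects "all 100 stars".

```perl
use strict;
use warnings;

# Two word characters, a space character, then three digits
my $r1 = ( "CS 123 course"      =~ /\w\w\s\d\d\d/ ) ? "TRUE" : "FALSE";
my $r2 = ( "Take Ma 101 today"  =~ /\w\w\s\d\d\d/ ) ? "TRUE" : "FALSE";
my $r3 = ( "Only 1 number here" =~ /\w\w\s\d\d\d/ ) ? "TRUE" : "FALSE";

# Three digits at the very start, a space, then s, any two characters, r
my $r4 = ( "300 spartans"  =~ /^\d\d\d\ss..r/ ) ? "TRUE" : "FALSE";
my $r5 = ( "all 100 stars" =~ /^\d\d\d\ss..r/ ) ? "TRUE" : "FALSE";

# Same pattern without ^: "100 star" may now match anywhere in the string
my $r6 = ( "all 100 stars" =~ /\d\d\d\ss..r/ )  ? "TRUE" : "FALSE";

print "$r1 $r2 $r3 $r4 $r5 $r6\n";
```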

Groups and Quantifiers
  [xyz]  character set
  |      alternatives
  *      zero or more
  +      1 or more
  ?      0 or 1
  {M}    exactly M
  {M,N}  between M and N characters
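
A sketch of each construct applied to short strings. The DNA sequence and all variable names are invented for illustration and are not taken from the slides.

```perl
use strict;
use warnings;

my $dna = "ATGGCCAAATTTGA";    # made-up sequence

# [ACGT]+ between ^ and $: the whole string consists of A, C, G or T
my $is_dna   = ( $dna =~ /^[ACGT]+$/ )    ? 1 : 0;   # 1
# {3}: the letter A exactly three times in a row
my $triple_a = ( $dna =~ /A{3}/ )         ? 1 : 0;   # 1 ("AAA")
# |: any one of three alternatives
my $has_stop = ( $dna =~ /TAA|TAG|TGA/ )  ? 1 : 0;   # 1 ("TGA" at the end)

# ?: the u is optional
my $us = ( "color"  =~ /colou?r/ ) ? 1 : 0;          # 1
my $uk = ( "colour" =~ /colou?r/ ) ? 1 : 0;          # 1

# {2,3}: two or three T's between A and G
my $three_t = ( "ATTTG" =~ /AT{2,3}G/ ) ? 1 : 0;     # 1
my $one_t   = ( "ATG"   =~ /AT{2,3}G/ ) ? 1 : 0;     # 0

# * allows zero T's, + requires at least one
my $star = ( "AG" =~ /AT*G/ ) ? 1 : 0;               # 1
my $plus = ( "AG" =~ /AT+G/ ) ? 1 : 0;               # 0

print "$is_dna $triple_a $has_stop $us $uk $three_t $one_t $star $plus\n";
```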

NCBI file Example
    /VERSION\s+(\S+)\s+GI:(\S+)/
  Matches a version line
  Parentheses group characters for future retrieval
  $1 stands for the first version number, $2 gets the number after "GI:"
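
As a sketch of how the captured groups might be used, the pattern is applied below to a single line. The line itself is a hypothetical sample written for this example, not data from a real NCBI record, and $version and $gi are names chosen here.

```perl
use strict;
use warnings;

# Hypothetical sample line in the layout the pattern expects
my $line = "VERSION     AB000001.1  GI:1234567";

my $version;
my $gi;

if ( $line =~ /VERSION\s+(\S+)\s+GI:(\S+)/ )
{
    $version = $1;    # text captured by the first pair of parentheses
    $gi      = $2;    # text captured by the second pair, after "GI:"
    print "version: $version, GI: $gi\n";
}
else
{
    print "not a version line\n";
}
```

Copy $1 and $2 into named variables right away, as done here, because the next successful pattern match overwrites them.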

