
Gen712/812: Module 1 – Session 1 Introduction to Perl programming for Bioinformatics

Date post: 05-Jan-2016
Upload: dory
Page 1: Gen712/812: Module 1 – Session 1 Introduction  to  Perl programming for Bioinformatics

Outline
1.0 What to expect
1.1 Introduction to programming: hardware vs. software
1.2 Types of programming languages
1.3 Types of operating systems
1.4 Introduction to Linux/Unix
1.5 SSH

• Introduction to Linux/Unix
  • Working on a Linux/Unix system
  • Logging in
  • Managing passwords
  • Navigating the Linux/Unix file system

• Core Perl
  • Basic language syntax

• Applied Perl
  • Using Perl to do bioinformatics

Computer
A device capable of performing computations and making logical decisions at speeds millions of times faster than human beings.

Computers process data under the control of sets of instructions called computer programs, or scripts.

A computer has two inseparable components:
• Hardware
  – The various physical components comprising a computer
    • Keyboard, screen, mouse, disks, memory, CD-ROM, central processing units
• Software
  – Tested and working step-by-step instructions that are bundled as a unit and can run on a computer

Six logical units in every computer:

1. Input unit
   Obtains information from input devices (keyboard, mouse)
2. Output unit
   Outputs information (to screen, to printer, to control other devices)
3. Memory unit
   Rapid access, low capacity, stores input information
4. Arithmetic and logic unit (ALU)
   Performs arithmetic calculations and logical decisions
5. Central processing unit (CPU)
   Supervises, coordinates and prioritizes computing jobs
6. Secondary storage unit
   Cheap, long-term, high-capacity storage
   Stores programs and data for loading and processing by the ALU/CPU

Three types of programming languages

1. Machine languages
   Strings of numbers giving machine-specific instructions

   Example:
   +1300042774 (100111101011001001010)
   +1400593419 (101010101111100010001)
   +1200274027 (100100101000010010010)

2. Assembly languages
   English-like abbreviations representing elementary computer operations (translated via assemblers)

   Example:
   LOAD BASEPAY
   ADD OVERPAY
   STORE GROSSPAY

3. High-level languages
   Code that is:
   • Written in a non-redundant, unambiguous subset of everyday English and mathematical notation
   • Human readable
   • Incomprehensible to computers unless translated by compiler or interpreter programs

   Example:
   grossPay = basePay + overTimePay
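For comparison with the LOAD / ADD / STORE assembly sequence, the same calculation can be sketched as a single high-level statement in the shell. The variable names and pay figures below are made up for illustration.

```shell
# Hypothetical pay figures in whole dollars (not from the slides).
base_pay=1200
overtime_pay=300

# One high-level statement does the work of LOAD, ADD and STORE.
gross_pay=$((base_pay + overtime_pay))

echo "gross pay: $gross_pay"
```

The shell, acting as the interpreter, translates the statement into machine instructions when the line runs.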

Examples of high-level languages

C, C++, C#, Java
   Used for major applications in various fields
FORTRAN
   Used for scientific and engineering applications
COBOL
   Used to manipulate large amounts of data
Pascal
   Intended for academic/instructional use
Perl, Python, Ruby …
   Interpreted, good for web applications, fast text and data manipulation …

What is Linux/Unix?
An architecture-independent operating system:
- Multiuser
- Multitasking
- Secure mode
  * Each user is restricted to his/her home directory
  * Users can’t access other people’s home directories
- Secure login shell
- Case sensitive
- Uses the forward slash “/” as the directory separator
- Prime development environment for bioinformatics software and web server applications

Logging in/out
• Type your username at the prompt
• Supply your password when prompted
• Issue the exit or logout command to log out

Password resetting:
• Type passwd at the command prompt
• Enter the current password when prompted
• Enter the new password when prompted
• Re-enter the new password to confirm

Choose a good password
• Longer than six letters and digits
• Mix upper case, lower case, digits and punctuation marks
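The password guidelines can be written as a small check. This is only a sketch: the function name is_good_password is hypothetical, and it assumes the slide means that all four character classes must be present. The real passwd program applies whatever policy the system administrator has configured.

```shell
# Return success (0) if the candidate follows the guidelines above.
# Assumption: all four character classes are required.
is_good_password() {
    pw=$1
    # Longer than six characters.
    [ "${#pw}" -gt 6 ] || return 1
    # At least one upper case letter, lower case letter, digit, punctuation mark.
    case $pw in *[[:upper:]]*) ;; *) return 1 ;; esac
    case $pw in *[[:lower:]]*) ;; *) return 1 ;; esac
    case $pw in *[[:digit:]]*) ;; *) return 1 ;; esac
    case $pw in *[[:punct:]]*) ;; *) return 1 ;; esac
    return 0
}

# Made-up candidates, for illustration only.
for candidate in 'perl' 'Bioinformatics2016' 'Gen712-Perl!'; do
    if is_good_password "$candidate"; then
        echo "$candidate: ok"
    else
        echo "$candidate: too weak"
    fi
done
```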

http://www.linux-tutorial.info/modules.php?name=MContent&pageid=5

1. Overview of the Linux System
2. Basic Commands
3. Relative & Absolute Path
4. Redirection and Pipe
5. File/Directory Permissions
6. Process Management
7. The nano Text Editor

The kernel is the main program of a Unix system. It controls the hardware: CPU, memory, hard disk, network card, etc.

The shell is an interface between the user and the kernel. The shell interprets your input as commands and passes them to the kernel.

[Diagram: User → input → Shell → Kernel]

Directory Structure
• Files are placed in directories (folders).
• All directories are arranged in a tree-like hierarchical structure.
• Users can add and remove files and directories on the tree if they have the proper level of authority.
• The topmost directory is “/”, which is called the root.
• Users have their own directory, called the home directory.
• Users can create and delete files and folders they own.
• Users can give permission to others on their files and folders.

When you log on to a Linux machine, you will see a prompt.
• On cisunix/wildcats it is a % with a blinking cursor
• On other machines it looks like: [user@host ~]$

[feseha@perl ~]$

The prompt is called a shell prompt and waits for user commands.

A user command consists of three basic parts (in order):
• command name, e.g. ls
• options (modifiers), e.g. -la (l = long form, a = all files, including hidden ones)
• arguments: the entity to be acted upon, e.g. /usr/local/bin

NB: options and arguments are optional

[user@host ~]$ ls -la /usr/local/bin
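The three parts of a command can be tried out one at a time. The sketch below uses a scratch directory made with mktemp and two made-up file names (visible.txt and .hidden) so that it does not depend on what is installed in /usr/local/bin.

```shell
# Scratch directory with one ordinary file and one hidden (dot) file.
demo_dir=$(mktemp -d)
touch "$demo_dir/visible.txt" "$demo_dir/.hidden"
cd "$demo_dir"

ls                    # command name only: lists the current directory
ls -a                 # command + option: also shows names starting with "."
ls -la "$demo_dir"    # command + options + argument: long listing of that directory

# Keep two of the listings for comparison.
names_default=$(ls)
names_all=$(ls -a)
```

Comparing names_default with names_all shows what the a option adds: the entries ".", ".." and ".hidden".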

ls      list the files in the current directory
cd      change directory
cp      copy a file or directory (use -r for directories)
mv      move a file or directory
rm      remove a file or directory (use -r for directories)
pwd     print working directory
mkdir   create a directory
rmdir   remove an empty directory
less    display file contents one screenful at a time
more    display file contents one screenful at a time
        (with less and more: press the space bar to see the next screenful,
        or press the letter ‘q’ to quit and get the prompt back)
cat     concatenate and display the contents of one or more files
        (pipe it to more or less to prevent continuous flow and read the contents)
man     display the online manual
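A short session that exercises most of these commands might look like the sketch below. The directory and file names (seqs, a.txt, b.txt, c.txt) are invented for the example, and everything happens inside a scratch directory created by mktemp.

```shell
work=$(mktemp -d)            # scratch area; its path is chosen by mktemp
cd "$work"

mkdir seqs                   # create a directory
cd seqs                      # move into it
pwd                          # print the working directory

printf 'ATGC\n' > a.txt      # make a small file to work with
cp a.txt b.txt               # copy it
mv b.txt c.txt               # move (rename) the copy
listing=$(ls)                # the directory now holds a.txt and c.txt
both=$(cat a.txt c.txt)      # concatenate and capture both files

rm a.txt c.txt               # remove the files
cd ..
rmdir seqs                   # rmdir only succeeds because seqs is empty
```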

passwd  change your password (follow its instructions)

head    show the first 10 lines of a file (or another number, set with options)

tail    show the last 10 lines of a file (or another number, set with options)

sort    sort the lines of a file in ascending alphabetical order, comparing from the first field

grep    search a file and retrieve the lines containing a pattern

wc      count lines, words and bytes
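The text-processing commands are easiest to understand on a small file. The sketch below builds a made-up five-line gene table (the contents are illustrative only) and applies each command to it.

```shell
# Hypothetical two-column table: gene name and chromosome.
data=$(mktemp)
printf '%s\n' 'TP53 chr17' 'BRCA1 chr17' 'EGFR chr7' 'MYC chr8' 'KRAS chr12' > "$data"

first_two=$(head -n 2 "$data")               # first two lines
last_one=$(tail -n 1 "$data")                # last line
sorted_first=$(sort "$data" | head -n 1)     # alphabetically first line
chr17=$(grep 'chr17' "$data")                # lines containing the pattern
line_count=$(wc -l < "$data" | tr -d ' ')    # number of lines (padding stripped)

echo "first two lines:"; echo "$first_two"
echo "last line: $last_one"
echo "first after sorting: $sorted_first"
echo "chr17 genes:"; echo "$chr17"
echo "lines: $line_count"
```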

A path refers to the position of a file or folder in the directory tree.

Paths can be expressed as relative or absolute.

Relative path expression:
• The path is not defined uniquely: what it points to depends on where you are
• It is expressed relative to the current position
    ../../../fileName          fileName is three directories up!
    ../folderName/fileName     fileName is in folderName, one directory up!

Absolute path expression:
• The path is defined uniquely
• It does not depend on your current position
• It goes all the way from the root to the target file or folder
    E.g. /home/feseha/public_html/cgi-bin/w777/junk.txt
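The difference between the two kinds of path shows up as soon as the current directory changes. The sketch below assumes a made-up layout (project/scripts and project/data inside a scratch directory) and reads the same file both ways.

```shell
# Hypothetical layout:  <root>/project/scripts  and  <root>/project/data/junk.txt
root=$(mktemp -d)
mkdir -p "$root/project/scripts" "$root/project/data"
printf 'junk\n' > "$root/project/data/junk.txt"

cd "$root/project/scripts"
rel=$(cat ../data/junk.txt)                  # relative: one directory up, then into data
abs=$(cat "$root/project/data/junk.txt")     # absolute: spelled out from the root "/"

# After moving, the same relative path points somewhere else entirely.
cd "$root/project"
if [ -f ../data/junk.txt ]; then
    moved=found
else
    moved=missing
fi
echo "relative: $rel, absolute: $abs, relative after cd: $moved"
```

The absolute path keeps working from any directory, which is why it is the safer choice inside scripts.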

By default, the output of a command, if any, is displayed on the screen. Output can instead be directed to:
• A file
• Other commands that process it further

Use >fileName to redirect command output from the screen to a file.
NB: “>” overwrites the file if it already exists, so pay attention to what you write to!

Using “>>” instead of “>” appends the output to the end of the file if the file already exists, or creates a new file if it doesn’t exist.
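A minimal sketch of the difference between the two redirection operators, using a scratch file from mktemp and made-up messages:

```shell
out=$(mktemp)

echo "first run"  > "$out"     # writes the file
echo "second run" > "$out"     # overwrites it: "first run" is gone
echo "third run" >> "$out"     # appends: "second run" is kept

contents=$(cat "$out")
echo "$contents"
```

Only the second and third messages survive, because the second ">" replaced whatever the first one wrote.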

Output can be redirected to other commands via a pipe, “|”.

E.g. 1. If the output of a command is longer than the screen, you can pipe it to less or more so that one screenful can be viewed at a time:

ls -l | more

E.g. 2. If you want to peruse many concatenated documents spanning more than one screen, pipe the output to more or less to view one screenful at a time:

head *.pl | more

(* matches anything, so *.pl means all files ending in “.pl”, i.e. Perl scripts)

E.g. 3. You just want to know how many lines are in the output of ls. You can pipe its output to wc -l as follows:

ls | wc -l

E.g. 4. You want to see lines 250 to 265 of a file that has 1000 lines in it. You can use head to get the first 265 lines and pipe them to tail to get the last 16 of those (250 to 265 inclusive is 16 lines):

head -265 targetFile.txt | tail -16

If you want the lines saved to result.txt instead of displayed on the screen, redirect the output with >:

head -265 targetFile.txt | tail -16 > result.txt
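The head/tail combination is easy to get off by one, so it helps to compute the line count instead of typing it. The sketch below assumes a stand-in file whose line N simply contains the number N (generated with seq), which makes the result easy to check. An inclusive range from 250 to 265 holds 265 - 250 + 1 = 16 lines.

```shell
# Stand-in for targetFile.txt: 1000 lines, line N contains the number N.
target=$(mktemp)
seq 1 1000 > "$target"

first=250
last=265
count=$((last - first + 1))      # inclusive range, so 16 lines

# Keep everything up to line 265, then the last 16 lines of that.
slice=$(head -n "$last" "$target" | tail -n "$count")

echo "$slice"
```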

sort
Sorts the lines of a file or list, by default comparing from the first field of each line.

Options:
-n   sort numerically, by the numeric value of the strings
-k   sort by the field whose number follows k
-r   sort in reverse (descending) order (z to a, or say 100 to 1)

Examples:
-n      sorts numerically on the first field
-k3     sorts using the third field of each line
-rnk3   sorts in reverse order, using the numeric value of field 3
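The effect of -n is clearest when the same column is sorted both ways. The sketch below uses a made-up three-line table whose third field is a number. Without -n the field is compared as text, so "1500" sorts before "25".

```shell
# Hypothetical table: gene, chromosome, length.
table=$(mktemp)
printf '%s\n' 'geneA chr1 900' 'geneB chr2 25' 'geneC chr3 1500' > "$table"

text_first=$(sort -k3 "$table" | head -n 1)       # text order on field 3:    geneC chr3 1500
numeric_first=$(sort -nk3 "$table" | head -n 1)   # numeric order on field 3: geneB chr2 25
reverse_first=$(sort -rnk3 "$table" | head -n 1)  # numeric, largest first:   geneC chr3 1500

echo "text:    $text_first"
echo "numeric: $numeric_first"
echo "reverse: $reverse_first"
```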

What does the following command achieve?

ls -la /bin/ | sort -nk5 | tail -1

First figure out what each command is supposed to do …

ls -la /bin/

sort -nk5

tail -1
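One way to work through the question is to run the same pipeline stage by stage on a directory whose contents are known. The sketch below assumes a scratch directory with three made-up files of 10, 300 and 50000 bytes. In a long listing the fifth field is the size in bytes, so the pipeline ends on the entry with the largest size.

```shell
# Scratch directory standing in for /bin, with files of known size.
root=$(mktemp -d)
mkdir "$root/bin_like"
cd "$root/bin_like"
dd if=/dev/zero of=small.bin  bs=10   count=1  2>/dev/null    # 10 bytes
dd if=/dev/zero of=medium.bin bs=300  count=1  2>/dev/null    # 300 bytes
dd if=/dev/zero of=large.bin  bs=1000 count=50 2>/dev/null    # 50000 bytes

listing=$(ls -la)                                   # stage 1: long listing, all files
sorted=$(printf '%s\n' "$listing" | sort -nk5)      # stage 2: numeric sort on field 5 (size)
largest=$(printf '%s\n' "$sorted" | tail -n 1)      # stage 3: keep only the last line

largest_name=${largest##* }                         # last word of that line: the file name
echo "largest entry: $largest_name"
```

On a real /bin the answer depends on what is installed, but the reasoning is the same: list, sort by size, keep the last line.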

All files and directories have a specific owner and set of permissions.

There are three types of permission:
• Readable, r
• Writable, w
• eXecutable, x

Permissions are set at three user levels:
• owner, u (u from user)
• group member, g
• world, o (all others outside the owner and the owner’s group)
The shorthand a (all) means u+g+o.

Example:

ls -l .bash_profile

-rw-r--r-- 1 cnotred cnotred 191 Jan 4 13:11 .bash_profile

r: readable, w: writable, x: executable
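The ten characters at the start of the listing can be split mechanically: the first is the file type, and the next nine are three rwx triplets for owner, group and others. The helper below, describe_mode, is a hypothetical name used only for this sketch.

```shell
# Split a mode string such as -rw-r--r-- into its three triplets.
describe_mode() {
    owner=$(printf '%s' "$1" | cut -c2-4)     # characters 2-4: owner (u)
    group=$(printf '%s' "$1" | cut -c5-7)     # characters 5-7: group (g)
    other=$(printf '%s' "$1" | cut -c8-10)    # characters 8-10: others (o)
    printf 'owner=%s group=%s others=%s\n' "$owner" "$group" "$other"
}

# The mode shown for .bash_profile in the example above.
describe_mode '-rw-r--r--'
```

For the example file this prints owner=rw- group=r-- others=r--, meaning the owner can read and write while everyone else can only read.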

Command   Outcome
chmod     change file mode: add or remove permissions
chown     change the owner of a file

Examples:
chmod a+w filename
    add write permission for all users

chmod o-x filename
    remove execute permission from others

chmod a+x filename
    add execute permission for all users

u: user (owner), g: group, o: others, a: all
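The effect of each symbolic chmod can be watched by listing the file after every change. The sketch below works on a scratch file from mktemp, and the helper show_mode is a name invented for this example. It starts from a known state so the results do not depend on the system defaults.

```shell
f=$(mktemp)

# Print only the ten mode characters from the long listing.
show_mode() { ls -l "$f" | cut -c1-10; }

chmod u=rw,go=r "$f"; start=$(show_mode)       # known start:  -rw-r--r--
chmod a+w "$f";       after_aw=$(show_mode)    # all may write: -rw-rw-rw-
chmod a+x "$f";       after_ax=$(show_mode)    # all may run:   -rwxrwxrwx
chmod o-x "$f";       after_ox=$(show_mode)    # others lose x: -rwxrwxrw-

printf '%s\n' "$start" "$after_aw" "$after_ax" "$after_ox"
```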

r = 4, w = 2, x = 1

rwx rwx rwx
421 421 421
 7   7   7      (777 is the same as a+rwx)

rwx --x --x
421 001 001
 7   1   1

Command line usage:
chmod 711 fileName
    Sets the permissions at all three levels at once

chown userName myFile.txt
    (I passed ownership to another user! On most Linux systems only root is allowed to do this.)
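The arithmetic behind the numeric form can be sketched as two small functions. The names triplet_to_digit and perm_to_octal are hypothetical. Each triplet contributes 4 for r, 2 for w and 1 for x, and the three sums are written side by side.

```shell
# Convert one rwx triplet (e.g. "r-x") to its digit using r=4, w=2, x=1.
triplet_to_digit() {
    digit=0
    case $1 in r??) digit=$((digit + 4)) ;; esac
    case $1 in ?w?) digit=$((digit + 2)) ;; esac
    case $1 in ??x) digit=$((digit + 1)) ;; esac
    printf '%s' "$digit"
}

# Convert nine permission characters (e.g. "rwx--x--x") to three digits.
perm_to_octal() {
    u=$(printf '%s' "$1" | cut -c1-3)
    g=$(printf '%s' "$1" | cut -c4-6)
    o=$(printf '%s' "$1" | cut -c7-9)
    printf '%s%s%s\n' "$(triplet_to_digit "$u")" "$(triplet_to_digit "$g")" "$(triplet_to_digit "$o")"
}

perm_to_octal 'rwx--x--x'    # prints 711
perm_to_octal 'rw-r--r--'    # prints 644

# Cross-check against chmod on a scratch file.
scratch=$(mktemp)
chmod 711 "$scratch"
shown=$(ls -l "$scratch" | cut -c2-10)
echo "$shown"                # rwx--x--x
```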

Check permissions:
ls -l .bash_profile
cp .bash_profile sample.txt
ls -l sample.txt

Remove read permission from all:
chmod a-r sample.txt
ls -l sample.txt
less sample.txt

Add read and write permissions for the file owner:
chmod u+rw sample.txt
ls -l sample.txt
less sample.txt
rm sample.txt

Nano is a user-friendly text editor!

Arrow keys   Move the cursor
CTRL+a       Move to the beginning of the current line
CTRL+e       Move to the end of the current line
CTRL+v       Move forward one page
CTRL+y       Move backward one page
CTRL+w       Search for text
CTRL+d       Delete the current character
CTRL+k       Remove (cut) the current line or selected text
CTRL+u       Paste (uncut) the last cut text at the cursor position
CTRL+o       Save (output) the file
CTRL+x       Exit nano (you are asked whether to save a modified file)

Create the file hello.pl
nano hello.pl

Write hello.pl as follows.

#!/usr/bin/perl
print "Hello World\n";

Make it executable
chmod u+x hello.pl

Run it!
./hello.pl
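The same steps can be scripted without opening an editor, which is handy for checking that everything is set up. This is a sketch under two assumptions: the work happens in a scratch directory from mktemp, and perl is installed at /usr/bin/perl as the first line of the script expects. If it is not, the script is still created but not run.

```shell
demo=$(mktemp -d)
cd "$demo"

# Write hello.pl with a here-document instead of typing it into nano.
# The quotes must be plain straight quotes for Perl to accept them.
cat > hello.pl <<'EOF'
#!/usr/bin/perl
print "Hello World\n";
EOF

chmod u+x hello.pl               # make it executable for the owner

if [ -x /usr/bin/perl ]; then
    greeting=$(./hello.pl)       # run it
else
    greeting="(perl not found at /usr/bin/perl)"
fi
echo "$greeting"
```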

