Post on 29-Dec-2015

Dedan Githae, d.githae@cgiar.org, BecA-ILRI Hub

Introduction to Linux / UNIX OS

MARI eBioKit Workshop; Nov 24-28, 2014

Importance of informatics to biology

Functional aspect: Involves representation, organisation, manipulation, distribution, maintenance and use of information

– Design, data formats, databases, and simple scripts that help handle data

Developmental aspect: Develop analytical tools to get information from data

– Comparing sequences to infer the function of a newly discovered gene

– Comparing 3D protein structures to understand how protein folding takes place

– Studying how protein-metabolite and protein-protein interactions contribute to cell function

Why UNIX

Optimal for developing, compiling and running programs and scientific research tools; used in high-performance computer systems.

Has long been in use in industry and academia (i.e., among software developers).

Most programs are command-line (i.e., launched by entering a command in a terminal window rather than through a GUI)

Versatile scripting and system tools readily available on Linux allow customization of any analysis

Free software!

Did you know? The World Wide Web (WWW) runs on UNIX.

The software that powers the internet was invented on UNIX; thus most servers are UNIX-based.

Moving about

Do not use your mouse

The file system

The terminal

Basic commands

ls – list stuff in the present directory (folder)

pwd – print working directory (shows the directory you are in)

cd [folder_name] – change directory: navigate to [folder_name]

cd .. – change directory to the parent

cd ~ – change directory to the default 'home'

head [filename] – list the first few lines

tail [filename] – list the last few lines

less [filename] – read the contents of a file one screen at a time

cat [filename] – display contents of the file

cp [source] [destination] – copy a file / directory from source to destination

mv [source] [destination] – Move... i.e. Cut-&-Paste

rm [filename] – remove (delete) a file
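
The basic commands above can be tried out safely in a scratch directory. This is only a minimal sketch: the file names (notes.txt, renamed.txt) are made up for illustration, and mktemp is used here just to get a throwaway directory.

```shell
# Work in a throwaway directory so nothing important is touched.
workdir=$(mktemp -d)
cd "$workdir"
pwd                               # show the directory we are now in

# Create a small practice file (hypothetical name and contents).
printf 'line1\nline2\nline3\nline4\nline5\n' > notes.txt

ls                                # list what is in the present directory
head -n 2 notes.txt               # first two lines
tail -n 2 notes.txt               # last two lines
cat notes.txt                     # the whole file at once

cp notes.txt copy_of_notes.txt    # copy: the original stays in place
mv copy_of_notes.txt renamed.txt  # move: also the way to rename a file
rm notes.txt                      # remove: there is no recycle bin

cd ..                             # up to the parent directory
cd "$workdir"                     # and back again
ls                                # only renamed.txt is left
```

less is left out of the sketch because it is interactive: run less renamed.txt by hand and press q to quit.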

Question: Why should you avoid spaces in file names and folder names?
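
As a hint for the question, the sketch below shows what the shell does with a space in a name. The file names are hypothetical and everything happens in a throwaway directory.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Unquoted, the shell splits the name at the space, so touch
# receives two arguments and creates two files: "my" and "file.txt".
touch my file.txt
ls -1

# Quoting keeps the name together as a single argument.
touch "my data.txt"

# A backslash before the space has the same effect.
touch my\ results.txt
ls -1
```

Names such as my_data.txt avoid the need for quoting altogether.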

Command line - options

Command-line options – extend the functionality of a command

ls -l : list the files in the directory in long format

ls -t : sort output by modification time

ls -S : sort output by size

ls -r : reverse the sort order of the output

ls -R : list recursively, including directories below the current level

ls -1 : list output one entry per line

ls -p : helps identify directories by adding a /

mkdir [dir_name] : make a directory called dir_name

rmdir -???
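
A short sketch of the ls options in action, assuming a scratch directory with two made-up files of different sizes and one subdirectory:

```shell
workdir=$(mktemp -d)
cd "$workdir"

mkdir results                          # make a directory called results
printf 'ACGT\n' > small.txt            # 5 bytes
printf 'ACGTACGTACGTACGT\n' > big.txt  # 17 bytes
cp big.txt results/

ls -l          # long format: permissions, owner, size, date, name
ls -t          # most recently modified first
ls -S *.txt    # largest file first: big.txt, then small.txt
ls -Sr *.txt   # options can be combined: smallest file first
ls -p          # the directory shows up as results/
ls -R          # also lists what is inside results
ls -1          # one entry per line
```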

More helpful commands

split / csplit – split a (big) file into smaller files

cut – extracts parts of each line of a file using a preset delimiter, e.g. tabs

paste – combines portions from several files into one file, side by side

sort – sorts a single file, or sorts a group of files and merges them into one output

cmp / diff [file1] [file2] – compare two files and report the differences

wc – word count (also counts lines and characters)

grep – search for a pattern in file(s)

kill – stop a process

tar – create / extract archives (optionally compressed)
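
The commands in this list are easiest to understand on a small table. The sketch below assumes a made-up tab-delimited file, genes.tsv, with three columns (gene name, chromosome, length); the values are invented for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical table: name, chromosome, length (tab-separated).
printf 'geneB\tchr2\t900\ngeneA\tchr1\t1200\ngeneC\tchr1\t300\n' > genes.tsv

cut -f 1 genes.tsv > names.txt        # column 1 (tab is the default delimiter)
cut -f 3 genes.tsv > lengths.txt      # column 3
paste names.txt lengths.txt > name_length.tsv   # glue the columns side by side

sort genes.tsv > sorted.tsv           # sort lines alphabetically
sort -k 3,3n genes.tsv                # sort numerically on the third column

wc -l genes.tsv                       # count the lines
grep chr1 genes.tsv                   # show only lines containing chr1

# diff exits with a non-zero status when the files differ,
# so "|| true" keeps a script from stopping here.
diff genes.tsv sorted.tsv || true

split -l 2 genes.tsv part_            # pieces of 2 lines: part_aa, part_ab

tar -czf genes.tar.gz genes.tsv sorted.tsv   # create a gzip-compressed archive
tar -tzf genes.tar.gz                        # list what the archive contains

sleep 60 &                            # start a process in the background
kill $!                               # stop it; $! is the ID of that process
```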

Special characters

/ separates directories in a path (on its own, the root directory)

. present directory

.. parent directory

> redirect output to a file

>> Append (add to the end) the output to existing file

# Comment (ignored by computer)

| pipe: send the output of one program into another program

< read input from a file

[] defines a set of characters
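
A minimal sketch of the redirection and path characters above, using a made-up file log.txt and a made-up subdirectory sub in a scratch directory:

```shell
workdir=$(mktemp -d)
cd "$workdir"

echo "first line" > log.txt       # > creates the file (or overwrites it)
echo "second line" >> log.txt     # >> adds to the end of the existing file

# Everything after a # is a comment and is ignored.

wc -l < log.txt                   # < feeds the file to wc as its input
sort log.txt | head -n 1          # | passes the output of sort into head

ls .                              # . is the present directory
mkdir sub
cd sub
ls ..                             # .. is the parent: shows log.txt and sub
cd ..
ls ./sub                          # / separates the parts of a path
```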

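Regular expression (regex)

\ is called ‘escape’: it changes how the next character is interpreted (e.g. \* means a literal *)

? matches any single character (in shell file name patterns)

* matches zero or more characters (in shell file name patterns)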
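
The pattern characters can be tested on file names. The sketch below uses shell file name patterns (globbing) with invented file names in a scratch directory; full regular expressions, as used by grep, follow related but not identical rules.

```shell
workdir=$(mktemp -d)
cd "$workdir"
touch sample1.txt sample2.txt sample10.txt sampleA.txt notes.md

ls sample?.txt       # ? is exactly one character: matches sample1, sample2, sampleA
ls sample*.txt       # * is zero or more characters: matches all four sample files
ls sample[12].txt    # [] is a set: matches sample1.txt and sample2.txt only
ls *.md              # matches notes.md

# A file whose name really contains a star (quoted so it is taken literally).
touch 'odd*name.txt'
ls odd\*name.txt     # the backslash escapes the star, so it is not a wildcard
```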

Thanks