
A Career in Bioinformatics Sarah Butcher Oxford University Bioinformatics Centre .

Date post: 28-Dec-2015

A Career in Bioinformatics

Sarah Butcher
Oxford University Bioinformatics Centre
http://www.molbiol.ox.ac.uk

What is Bioinformatics Anyway?

In the broadest sense - the use of computers for the storage, manipulation and analysis of biological data

IT for biology

In silico biological research

computational biology

can be seen as a tool or as a scientific discipline

different groups tend to set the boundaries according to their own particular interests!

Why Use Bioinformatics?

The most reliable way to discover the biological function of a molecule is by experimentation - BUT this can be complex, time-consuming and expensive!

It is relatively simple (i.e. fast and cheap) to sequence the corresponding DNA

So - many bioinformatics tools are designed to allow us to infer structural and functional information from sequence alone

But inference is rarely 100% accurate: we need some statistical measure of accuracy and (ideally) final proof by experimentation

The integrated approach - use in silico results to refine and direct laboratory experimentation

SQ   Sequence 898 BP; 289 A; 155 C; 186 G; 268 T; 0 other;
     ATCCCCGCAT GGATGTTTTA TAAAAAACAT GATTGACATC ATGTTGCATA TAGGTTAGAT 60
     AAAACAAGTG GCGTTATCTT TTTCCGGATT GTCTTCTTGT ATGATATATA AGTTTTCCTC 120
     GATGAAAAAT ATAACTTTCA TTTTTTTTAT TTTATTAGCA TCGCCATTAT ATGCAAATGG 180
     CGACAGATTA TACCGTGCTG ACTCTAGACC CCCAGATGAA ATAAAACGTT CCGGAGGTCT 240
     TATGCCCAGA GGGCATAATG AGTACTTCGA TAGAGGAACT CAAATGAATA TTAATCTTTA 300
     TGATCACGCG AGAGGAACAC AAACCGGCTT TGTCAGATAT GATGACGGAT ATGTTTCCAC 360
     TTCTCTTAGT TTGAGAAGTG CTCACTTAGC AGGACAGTCT ATATTATCAG GATATTCCAC 420
     TTACTATATA TATGTTATAG CGACAGCACC AAATATGTTT AATGTTAATG ATGTATTAGG 480
     CGTATACAGC CCTCACCCAT ATGAACAGGA GGTTTCTGCG TTAGGTGGAA TACCATATTC 540
     TCAGATATAT GGATGGTATC GTGTTAATTT TGGTGTGATT GATGAACGAT TACATCGTAA 600
     CAGGGAATAT AGAGACCGGT ATTACAGAAA TCTGAATATA GCTCCGGCAG AGGATGGTTA 660
     CAGATTAGCA GGTTTCCCAC CGGATCACCA AGCTTGGAGA GAAGAACCCT GGATTCATCA 720
     TGCACCACAA GGTTGTGGAA ATTCATCAAG AACAATCACA GGTGATACTT GTAATGAGGA 780
     GACCCAGAAT CTGAGCACAA TATATCTCAG GGAATATCAA TCAAAAGTTA AGAGGCAGAT 840
     ATTTTCAGAC TATCAGTCAG AGGTTGACAT ATATAACAGA ATTCGGGATG AATTATGA 898
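The header of the nucleotide entry above reports its base composition (289 A, 155 C, 186 G, 268 T). As a minimal sketch of how such counts can be produced, the Python snippet below counts bases in the first 60 bases of that entry. The function name `base_composition` is hypothetical and chosen only for illustration; the slides themselves recommend Perl for this kind of task.

```python
from collections import Counter


def base_composition(seq: str) -> Counter:
    """Count each letter in a sequence string, ignoring spaces, digits and case."""
    return Counter(ch for ch in seq.upper() if ch.isalpha())


# The first 60 bases of the entry above, copied with its spacing and position number.
fragment = "ATCCCCGCAT GGATGTTTTA TAAAAAACAT GATTGACATC ATGTTGCATA TAGGTTAGAT 60"

counts = base_composition(fragment)
total = sum(counts.values())
for base in "ACGT":
    print(f"{base}: {counts[base]} ({counts[base] / total:.1%})")
```

Running the same function over the whole entry would allow the counts in the header line to be cross-checked against the sequence itself.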

SQ   SEQUENCE 258 AA; 29902 MW; FA4CAD00 CRC32;
     MKNITFIFFI LLASPLYANG DRLYRADSRP PDEIKRSGGL MPRGHNEYFD RGTQMNINLY
     DHARGTQTGF VRYDDGYVST SLSLRSAHLA GQSILSGYST YYIYVIATAP NMFNVNDVLG
     VYSPHPYEQE VSALGGIPYS QIYGWYRVNF GVIDERLHRN REYRDRYYRN LNIAPAEDGY
     RLAGFPPDHQ AWREEPWIHH APQGCGNSSR TITGDTCNEE TQNLSTIYLR EYQSKVKRQI
     FSDYQSEVDI YNRIRDEL
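The protein entry above appears to be the translation of an open reading frame in the nucleotide entry: the 30 bases beginning ATGAAAAAT in the third row of the DNA sequence encode its first ten residues. The sketch below illustrates that step in Python, assuming the standard genetic code; the names `CODON_TABLE` and `translate` are hypothetical and not taken from the slides.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1, stopping at the first stop codon.

    Codons containing unexpected characters are reported as 'X'.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# The 30 bases beginning at the ATG in the third row of the DNA entry above.
orf_start = "ATGAAAAATATAACTTTCATTTTTTTTATT"
print(translate(orf_start))  # MKNITFIFFI
```

A fuller tool would also search all six reading frames and report every open reading frame; this sketch only translates a frame that is already known.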

Why are we here?

Prediction of Function

Protein
• Identify domains
• Predict folds
• Find active sites
• Predict interactions

Regulation
• Identify regulatory elements
• Promoters
• Transcriptional control

Cell growth and death

Tissue specificity

Developmental Control

Evolution

Alternative transcripts

The hardest challenge

Common goals - a small selection

Why Did I move careers into Bioinformatics?

Frustrated with pace of bench science

Few career prospects in short-term postdoc positions

Found in silico analyses far more interesting than wet-lab work: hypotheses can be examined in hours rather than months, and failures can be investigated

Wanted a job that still used biological knowledge, involved science on an everyday basis, meant meeting people, and stimulated learning

Posts seemed to give more self esteem – skills much in demand!

Fast expanding field with few formally trained people so chances to learn on the job

My Career Path

Degree 1: Imperial College of Science, Technology & Medicine. BSc in Applied Biology (4 yr sandwich with 2 x 6 month research placements)

Degree 2: PhD, National Institute of Medical Research (MRC), in T cell receptor gene use (4 years in total)

Postdoctoral research: NERC Institute of Virology and Environmental Microbiology, Oxford (5 years). Projects in the following areas:
• Design of diagnostic PCR primers for monitoring human enteropathogenic virus contamination of commercial shellfish and novel positive control reagents
• Phylogenetic studies of Caliciviridae (+SS RNA viruses)
• Recombinant protein expression in insect cell expression systems
• Pilot studies of transport and persistence of human enteropathogenic viruses in natural river waters

Bioinformatics Computing Officer: Oxford University Bioinformatics Centre (RS1A), 1 year

Manager, Bioinformatics Centre: Oxford University Bioinformatics Centre, 3 years

My Career Path – how

Degree 1: 2nd year course on Pascal programming (put me off computers for a while!)

Degree 2: learnt to work in a research atmosphere, learnt the basics of molecular biology, also how to present and communicate ideas and to learn independently

Postdoc 1:
• Had to learn to use bioinformatics resources early on in order to interrogate public databases and retrieve datasets to work on
• Attended in-house practical-based courses on using bioinformatics tools
• Explored and learnt to use a wide range of available sequence analysis programs… asked questions… read papers
• Started to help other people to use the same programs to analyse their data
• Acted as demonstrator for courses run by the bioinformatics centre
• Attended a bioinformatics conference and a workshop at the Sanger Centre (Protein domains)
• Applied for a job in the bioinformatics centre…

Bioinformatics post: continual learning on the job

So What Does A Bioinformatician Do?

Some, any or all of the following:

• Reformat data
• Curate biological data (may involve designing, building and maintaining databases)
• Act as ‘interpreter’ between pure biologist and computer
• Mine large complex datasets for added-value information
• Train, teach (formally or informally)
• Program
• Develop software
• Develop algorithms
• Novel research
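To make "reformat data" concrete, here is a minimal Python sketch that turns an EMBL-style sequence block (grouped in tens, with trailing position numbers) into FASTA text. The function name `embl_block_to_fasta` and the record name `example_seq` are hypothetical, and the sketch handles only the sequence lines, not a full EMBL record.

```python
def embl_block_to_fasta(name: str, block: str, width: int = 60) -> str:
    """Strip spaces and position numbers from a sequence block and wrap it as FASTA."""
    residues = "".join(ch for ch in block if ch.isalpha()).upper()
    lines = [residues[i:i + width] for i in range(0, len(residues), width)]
    return "\n".join([f">{name}"] + lines)


block = """
ATCCCCGCAT GGATGTTTTA TAAAAAACAT GATTGACATC ATGTTGCATA TAGGTTAGAT 60
AAAACAAGTG GCGTTATCTT TTTCCGGATT GTCTTCTTGT ATGATATATA AGTTTTCCTC 120
"""
print(embl_block_to_fasta("example_seq", block))
```

In practice an established library or conversion tool would usually be preferred over hand-written parsing; the point here is only how small such a reformatting step can be.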

Pros and Cons of Bioinformatics

• Fast expanding field where demand outstrips supply in both industry and academia! ‘Trendy’, so research councils are funding special initiatives
• Good career prospects at present (see above)
• Flexible working environment, as all you need is access to a computer (so working from home or flexible hours may be possible)
• Many different niches, as there are many different types of data to analyse
• Can be combined with benchwork in some instances if you so wish

BUT…

• You may be analysing someone else's datasets
• Many roles are supporting, not initiating, research
• Not everyone loves computers
• Need a broad knowledge base to do well

Most Useful General Qualities

Ability to learn! (field continually changing)

Problem solving

Accuracy and attention to detail

Good communication and inter-personal skills

Writing and presentation skills

Biologist/ biochemist Computing/Programming

Computer scientist Molecular biology

Software designer/programmer Molecular biology

Statistician/mathematician Molecular biology/ Genetics/ Population genetics

Enough biology to formulate / understand the question

Enough computing to manipulate the data / write the tools / actually do the analyses

Enough statistics (and biology) to sanity-check the results and understand their significance

Jack of all trades but master of one…

Computing Experience

Programming? Sooner or later you will have to! Shell scripting is a good start, Perl VERY useful. A formal language is good: C, C++, Java (Fortran less). HTML, XML also good but almost assumed

Platform? Although many bioinformatics tools run on PCs/Macs, most servers are still UNIX, so UNIX/Linux very useful

Programs? Thorough understanding of core bioinformatics tools, including runtime options (command line) and results interpretation. Which program to use, limitations of methods

Other Skills? Advanced UNIX (any flavour), system administration, database management (e.g. Oracle, MySQL), understanding of networks, security, interest in computers in general

Perl! - pipeline design tool of choice

Perl has been described as the glue that stuck the human genome project together; see http://www.ddj.com/articles/1997/9718/9718e/9718e.htm

Perl is a programming language which is particularly suited for processing text (database files, sequence data, output from analyses software).
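As an illustration of the kind of text processing meant here, the sketch below pulls the sequence length and base counts out of an EMBL-style SQ header line with regular expressions. It is written in Python rather than Perl purely for illustration; the function name `parse_sq_header` is hypothetical, and the patterns are assumptions that fit the one header line shown on the earlier slide, not a complete parser for the format.

```python
import re

# Header line of the nucleotide entry shown earlier in the slides.
SQ_LINE = "SQ Sequence 898 BP; 289 A; 155 C; 186 G; 268 T; 0 other;"


def parse_sq_header(line: str) -> dict:
    """Extract the total length and per-base counts from an SQ header line."""
    length_match = re.search(r"Sequence\s+(\d+)\s+BP;", line)
    if length_match is None:
        raise ValueError("not an SQ header line")
    counts = {
        label: int(number)
        for number, label in re.findall(r"(\d+)\s+(A|C|G|T|other);", line)
    }
    return {"length": int(length_match.group(1)), "counts": counts}


info = parse_sq_header(SQ_LINE)
# Sanity check: the per-base counts should add up to the stated length.
print(info["length"], sum(info["counts"].values()))
```

The consistency check at the end is the sort of small safeguard worth building into any script that reads database files.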

http://www.perl.org (the spiritual home of Perl)
http://www.bioperl.org/ (a project to develop modules for biology)
http://perl.oreilly.com/ (a Perl site by O’Reilly)

Beginning Perl for Bioinformatics – James Tisdall O’Reilly (2001)

Self-learning

Books - wide range of bioinformatics books, but no ONE book will help. Some give good overviews but are outdated quickly. Many good programming source books are available: find one that works for you

Journals - e.g. Bioinformatics (Oxford University Press), Briefings in Bioinformatics (Henry Stewart). Background and some reviews, but many papers are scattered over a wide range of journals

Web-sites - large numbers of websites carry very good teaching resources, e.g. NCBI tutorials http://www.ncbi.nlm.nih.gov

Many good public database resources and specialist sites

Newsgroups – bionet.software bioinformatics.org

http://www.expasy.org/

Good WWW services are crosslinked

Formal Training

Some universities offer a one-year MSc or MRes in bioinformatics: over 11 currently in the UK, with several in preparation (see http://www.molbiol.ox.ac.uk/Links/bioinf_links.htm for a list of some)

Distance learning MSc (e.g. Exeter, Manchester)

Short specialist courses run by some Universities on some topics in bioinformatics (e.g. http://www.conted.ox.ac.uk/courses/biosciences.html)

Research council – funded summer schools (e.g. previous BBSRC course at Leeds)

HGMP short courses and workshops http://www.hgmp.mrc.ac.uk/About/Courses/

Open University - more general computing courses/degrees/diploma

Doing

Investigate your local resources – make the most of them!

Try web-based programs if you have none locally, e.g. http://bioweb.pasteur.fr/intro-uk.html

Get Linux, install it and then try installing and running bioinformatics programs yourself

Instead of performing repetitive complex tasks manually, write scripts to do it for you, even if they are only very simple
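Even a very simple script of this kind pays off quickly. The sketch below, in Python, reports the length of every sequence in every FASTA file in a directory, a task that is tedious by hand. The function names and the `*.fasta` file pattern are assumptions made for illustration.

```python
from pathlib import Path


def fasta_lengths(path: Path) -> dict:
    """Return {sequence name: length} for one FASTA file."""
    lengths = {}
    name = None
    with path.open() as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # Use the first word of the header as the sequence name.
                parts = line[1:].split()
                name = parts[0] if parts else "unnamed"
                lengths[name] = 0
            elif line and name is not None:
                lengths[name] += len(line)
    return lengths


def summarise_directory(directory: str, pattern: str = "*.fasta") -> None:
    """Print file name, sequence name and length for every matching file."""
    for path in sorted(Path(directory).glob(pattern)):
        for name, length in fasta_lengths(path).items():
            print(f"{path.name}\t{name}\t{length}")
```

Calling `summarise_directory(".")` in a directory of FASTA files prints one tab-separated line per sequence, ready to paste into a spreadsheet or feed to another script.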

