Post on 10-May-2015
Java for Bioinformatics I (Java voor Bio-Informatica I)
Introduction
Course intro
Java as a language for Bioinformatics
Current Java APIs, Applications and Tools for Bioinformatics
Some Statistics
Books
Book: Java voor studenten
Syllabus: Java voor BioInformatica 1
On Scholar
USE THE BOOK
Like this
Online
How to Think Like a Computer Scientist, Java Version http://www.greenteapress.com/thinkapjava/thinkapjava.pdf
Thinking in Java, 3rd edition, Bruce Eckel http://www.planetpdf.com/codecuts/pdfs/eckel/TIJ3.zip
BlueJ Tutorial http://www.bluej.org/tutorial/tutorial.pdf
SUN Java http://java.sun.com/
Software
BlueJ (www.bluej.org)
http://www.youtube.com/watch?v=SRLU1bJSLVg
OO!
Similar to C(++)
Isn’t JavaScript
http://java.sun.com/
From Python to Java
Why Python?
Less effort
Regular expressions for manipulating text
Many pre-existing bioinformatics Python modules (BioPython)
Python is widely used in the biology community, and it is a language well suited for writing Web applications
Good for quick prototypes
Why not Python (Ensembl)?
“Some aspects of Python are not well suited for a software project of Ensembl's size”.
“absence of compile time checking of function prototypes and variable types is a steady source of runtime errors.”
Why Java?
Java is platform independent and hence easily distributable.
Ideal for Visualization
Ensj, BioJava APIs
Because many standards and software engineering tools (IDEs, testing frameworks, etc.) are built for Java, we can build applications that are robust and heavy-duty
Why Java (Ensembl)
“compile time type checking, enforced interfaces, multi threading, better support for graphical user interfaces, and correct garbage collection of circularly referenced objects”.
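Example: compile-time type checking and enforced interfaces
The two points from the Ensembl quote can be sketched in a few lines. This is only an illustration: the names BioSequence, DnaSequence, TypeCheckDemo and gcContent are made up for this slide and are not part of Ensj or BioJava.

```java
import java.util.Locale;

// Hypothetical interface: every sequence type must provide these two methods.
interface BioSequence {
    String residues();
    int length();
}

// The compiler rejects this class unless both interface methods are implemented.
final class DnaSequence implements BioSequence {
    private final String residues;

    DnaSequence(String residues) {
        this.residues = residues.toUpperCase(Locale.ROOT);
    }

    @Override
    public String residues() { return residues; }

    @Override
    public int length() { return residues.length(); }
}

class TypeCheckDemo {
    // The parameter type is checked at compile time: only a BioSequence is accepted.
    static double gcContent(BioSequence seq) {
        if (seq.length() == 0) {
            return 0.0;
        }
        long gc = seq.residues().chars().filter(c -> c == 'G' || c == 'C').count();
        return (double) gc / seq.length();
    }

    public static void main(String[] args) {
        BioSequence seq = new DnaSequence("atgcgc");
        System.out.println(gcContent(seq));
        // gcContent("ATGCGC");  // does not compile: a String is not a BioSequence
    }
}
```

In Python the commented-out call would only fail when that line is actually executed; in Java it is rejected before the program can run.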
Java Genomics Tools
Argo Genome Browser is a tool for visualizing and annotating whole genomes.
Broad Institute
Apollo genome editor is a tool for annotating genomic sequences
FruitFly.org
Artemis is a genome viewer and annotation tool.
The Sanger Institute
PatternHunter is a homology search tool.
Bioinformatics Solutions Inc.
Java Genomics Tools
The GBuilder is a tool for analysis and visualization of collections and assemblies of sequences.
EMBL, EBI
Sockeye is a 3D environment for comparative genomics.
Genome Sciences Centre, Canada
VCMap MapView uses an interactive Java interface.
Rat Genome Database
Jalview is a multiple alignment editor.
BBSRC UK
Java Data Management Tools
OmniGene helps to exchange biological data through the web services model and J2EE technology
Panther Informatics
Genome Directory System (GDS) is a distributed search and retrieval system for genome databases
Indiana University
Citrina is a database management tool that automates the mirroring and processing of databases that are distributed via ftp servers
Indiana University
Haystack is designed to let individuals manage all information in ways that make the most sense to them.
IBM
LuceGene is a document/object search and retrieval system for Genome and Bioinformatics Databases
Indiana University
Java Gene Expression Tools
GeneX is a gene expression database and integrated tool set
NCGR
TIGR TM4 Microarray Software Suite
Microarray Explorer (MAExplorer) - data-mining program
National Cancer Institute
Caryoscope is for viewing gene expression data in a whole-genome context
Stanford University
Java TreeView renders gene expression data into several interactive views.
Stanford University
J-Express Pro - analysis and visualization of microarray data.
MolMine
Java Bioinformatics APIs/Libraries
BioJava project - APIs for processing biological data.
Jemboss is a Java-based interface to EMBOSS (The European Molecular Biology Open Software Suite)
Ensj - API to access Ensembl databases
MartJ - API to access Ensembl's Mart database
Phylogenetic Analysis Library (PAL) - bioinformatics analysis of the evolutionary development of genomes.
CEBL New Zealand
Knowledge Discovery Object Model (KDOM) is an API to represent and manage biological knowledge during application development
Genome Sciences Centre Canada
Other Java Tools
JaMBW is a Java-based Molecular Biologist's Workbench
EMBL
DAG-Edit is an application to browse, query and edit GO
GO
ImageJ can display, edit, analyze, process, save and print images.
Research Service Branch, NIH
Cytoscape is a tool for analyzing and visualizing biological network data
The Institute for Systems Biology (ISB)
PubSearch is a web-based literature curation tool
The Arabidopsis Information Resource (TAIR)
PubFetch is a literature retrieval tool
BRC
Statistics November 12 2006
Query                        Google       Google Scholar   PubMed
Java AND Bioinformatics      1,660,000    10,500           146
Python AND Bioinformatics    2,040,000    1,220            14
C++ AND Bioinformatics       1,420,000    15               0
Perl AND Bioinformatics      1,510,000    5,350            82
Statistics November 13 2007
Query                        Google       Google Scholar   PubMed
Java AND Bioinformatics      1,920,000    13,400           180
Python AND Bioinformatics    1,640,000    1,860            25
C++ AND Bioinformatics       1,560,000    23               20
Perl AND Bioinformatics      1,710,000    9,930            102
Hello World!
// Hello.java
public class Hello {
    public static void main(String[] args) {
        System.out.println("Hello, world!");
    }
}
Hello World!
import java.awt.*;
import java.awt.event.*;
import javax.swing.*;

public class HelloSwing extends JFrame {
    public static void main(String[] args) {
        // Show the greeting in a dialog window instead of on the console
        JOptionPane.showMessageDialog(null, "Hello World");
    }
}
Syntax isn’t important for me
But the Java compiler cares!
Case sensitivity, e.g. System.out.println != system.out.println
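Example: case sensitivity
A minimal sketch of what case sensitivity means for your own names. The class CaseDemo and its fields are invented for this example, and giving two fields names that differ only in case is shown here as a trap to recognise, not as a recommended style.

```java
class CaseDemo {
    // Hypothetical fields: 'gene' and 'Gene' are two unrelated identifiers.
    static String gene = "BRCA1";
    static String Gene = "TP53";

    static String describe() {
        return gene + " / " + Gene;
    }

    public static void main(String[] args) {
        System.out.println(describe());     // prints "BRCA1 / TP53"
        // system.out.println(describe());  // does not compile: 'system' is not a known name
    }
}
```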
Summary
Java is increasingly being adopted by the Bioinformatics community
Many former Perl-based applications and APIs are currently being rewritten in Java (e.g. BioMOBY, Ensembl etc.)
With the release of more advanced Java APIs and improved Java Virtual Machines, some of the earlier drawbacks were eliminated (e.g. regular expressions, casting)
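Example: regular expressions and generics
A small sketch of the two improvements mentioned in the summary, using only the standard library (java.util.regex and generic collections). The class MotifFinder, the method findMotif, the motif and the test sequence are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MotifFinder {
    // Returns the 0-based start position of every non-overlapping match.
    static List<Integer> findMotif(String sequence, String regex) {
        Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(sequence);
        List<Integer> positions = new ArrayList<>(); // generic list: element type is known
        while (matcher.find()) {
            positions.add(matcher.start());
        }
        return positions;
    }

    public static void main(String[] args) {
        // Hypothetical input: search for the pattern TATA[AT]A
        List<Integer> hits = findMotif("GGTATAAAGGCTATATAGG", "TATA[AT]A");
        for (int position : hits) { // no (Integer) cast needed when reading elements
            System.out.println(position);
        }
    }
}
```

Before generics, a plain List returned Object and every element had to be cast back by hand; with List<Integer> the compiler checks the element type for you.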
References
Java for Bioinformatics and Java APIs for Bioinformatics, by Stephen Montgomery, at http://www.oreillynet.com
The Ensembl core software libraries. Stabenau A, McVicker G, Melsopp C, Proctor G, Clamp M, Birney E. Genome Res. 2004 May;14(5):929-33