By
Dr. Azman Firdaus Shafii
Founder Chairman & CEO
dafs@aldrix.net
Bioinformatics Symposium 2005
IT Comes Alive
Theaterette, HELP University College
Tuesday, 26th July 2005
OPEN SOURCE SYSTEMS SDN BHD
Innovation through ICT and Life Sciences
www.aldrix.net
Biotechnology is 10,000 years old now
Leavening bread and beer fermentation (4,000 BC, Egypt, Mesopotamia)
Production of cheese and wine (4,000 – 2,000 BC, Sumeria, Egypt, China)
First antibiotic (mouldy soybean curds to treat boils) (500 BC, China)
and so on…
Open Source Systems Sdn. Bhd. Page 2
The Central Dogma in Biology is still
DNA → RNA → PROTEIN
Transcription: DNA → RNA
Reverse Transcription: RNA → DNA
Translation: RNA → PROTEIN
Folding, Post-Translational Modifications? PROTEIN → STRUCTURE → FUNCTION
Hey Folks, looks like
DNA = cellular “lords”
Proteins = cellular “worker-slaves”!!
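The DNA → RNA → PROTEIN flow can be sketched in a few lines of Python. This is only a minimal illustration, assuming the standard genetic code and a coding-strand DNA sequence read in frame from its first base; the function names (`transcribe`, `reverse_transcribe`, `translate`) are made up for this example and do not come from any toolkit named in the talk.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(dna):
    """Transcription: DNA coding strand -> mRNA (T becomes U)."""
    return dna.upper().replace("T", "U")


def reverse_transcribe(rna):
    """Reverse transcription: RNA -> DNA (U becomes T)."""
    return rna.upper().replace("U", "T")


def translate(rna):
    """Translation: mRNA -> protein (one-letter codes), stopping at the first stop codon."""
    dna = reverse_transcribe(rna)
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    mrna = transcribe("ATGGCCTAA")
    print(mrna, "->", translate(mrna))
```

Folding and post-translational modification, the steps that take a protein to structure and function, are not modelled here.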
Pharmaceutical Industry is 160 years old now
• USD550B market now, USD700B in 2008, 40% still in USA.
• Thousands of firms (i.e. fragmented), including generic drug makers, CROs, wholesalers, retailers.
• On top sit the “Big Pharmas”, the Big 12, with HQs in USA/EU and half of total retail sales; Pfizer is the leader at 10% market share.
• Big Pharmas are very profitable (25% profit margins) but now under cost and new-product pressures. Can’t rely solely on the “blockbuster drugs” pipeline anymore.
• Drug discovery and development to market cost (including discounted financial opportunity costs)? Average USD800 million.
• USA Healthcare Spending = USD1.8T, ca. 15% of GDP
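Two of the figures above imply numbers that are easy to check. The sketch below is a back-of-envelope calculation only: it assumes “now” means 2005 (the year of the talk), so the USD550B to USD700B growth spans 3 years, and it treats the 15% share as exact. The helper name `cagr` is made up for this example.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Assumption: "now" is 2005, so 2005 -> 2008 is 3 years.
market_growth = cagr(550.0, 700.0, 3)

# USD1.8T healthcare spending at ca. 15% of GDP implies this GDP (in USD trillions).
implied_gdp_trillions = 1.8 / 0.15

print(f"Implied market growth per year: {market_growth:.1%}")
print(f"Implied USA GDP: USD{implied_gdp_trillions:.1f}T")
```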
The Price of Fighting Cancer
The estimated costs of some of a new wave of cancer drugs (fastest growing sector), which aim at the disease without the side effects of traditional chemotherapy.

CANCER DRUG   MANUFACTURER            APPROVED FOR USE   TYPE OF CANCER TREATED   EST. ANNUAL COST PER PATIENT
Erbitux       ImClone/Bristol-Myers   2004               Colorectal               $111,000
Avastin       Genentech               2004               Colorectal               $54,000
Herceptin     Genentech               1998               Breast                   $38,000
Tarceva       Genentech               2004               Lung                     $35,000

Note: Shares of Genentech have quadrupled in the last 2 years!
Drug price = ƒ {R&D costs, manufacturing, value to patients, etc.}
Source: Sanford C. Bernstein & Co. (NYT July 12, 2005)
Bioinformatics in Drug Discovery/Development Process
[Figure: chart with three rows (Technology, Research Area, Development Stage) laid out along the drug pipeline.]
Development Stage: Drug Discovery, Preclinical, Clinical (Phases I, II, III, IV), Approval
Research Area: Genomics, Functional Genomics, Proteomics, Cell Assays, Combinatorial Chemistry, Screening, Small Molecules, Animal Models, Human Trials, Pharmacogenomics
Pipeline items: Genes, Drug Targets, Drug Leads, Drug Tests, Drug
Technology: Bioinformatics; Positional Cloning; Parallel Sequencing; 2-D Gel, Mass Spec.; Cellular Assays; Model Organism, Gene Knock-outs; Genotyping, Phenotyping, SNPs Markers; High-Throughput and Ultra-High Throughput Screening; Structural Drug Design; Molecular Informatics; Differential Display, Expression Patterns, Reporter Gene Technologies; Chip Technologies, DNA Chips, Protein Chips, Microarrays
Source: IBM Life Science, 2002
Pharmaceutical R&D = USD55B in 2005
Traditionally, 10 – 12 years to develop 1 drug
Drugs to the public: Maximise efficacy, Minimise toxicity
10,000 Molecules Screened
250 Selected for Pre-clinical Testing
10 Ready for Clinical Testing
1 FDA/Regulator Approval
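The attrition in this funnel can be made explicit with a little arithmetic. The sketch below simply divides each stage count by the one before it; the stage counts are the ones on the slide, while the variable and function names are made up for this example.

```python
# Stage counts taken from the slide.
FUNNEL = [
    ("Molecules screened", 10_000),
    ("Selected for pre-clinical testing", 250),
    ("Ready for clinical testing", 10),
    ("FDA/regulator approval", 1),
]


def stage_pass_rates(funnel):
    """Return (from_stage, to_stage, fraction surviving) for each step of the funnel."""
    rates = []
    for (prev_name, prev_count), (name, count) in zip(funnel, funnel[1:]):
        rates.append((prev_name, name, count / prev_count))
    return rates


# Fraction of screened molecules that end up approved.
overall_rate = FUNNEL[-1][1] / FUNNEL[0][1]

for prev_name, name, rate in stage_pass_rates(FUNNEL):
    print(f"{prev_name} -> {name}: {rate:.2%}")
print(f"Overall: {overall_rate:.2%}")
```

On these numbers the steepest cut is the first one: only 2.5% of screened molecules reach pre-clinical testing.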
Trends
• Fierce competition from generics (I,C)
• Reduce D3 pipeline time frame to 7 years
• Collaborate with/acquire smaller biotech companies (1/3 of new molecules originate here now, 1/3 more from university labs)
• Maximise use of bioinformatics (and other “matics”), including computational biology, across the 60 departments involved
• R&D outsourcing (some parts, 70:30 say) to CROs in lower cost BRIC(S) countries
• Early days in pharmacogenomics or “personalised medicine”
Modern Biotechnology Industry is 30 years old
(with the founding of Genentech, USA, on April 7, 1976)
Years to IPO: 3 – 5 years
Years to First Product: 6 – 12 years
Years to Profit: 9 – 15 years
Number of biotech companies? (Ernst & Young, 2005)
USA              1,430
Canada           470
Germany          360
United Kingdom   320
Australia        220
China            150
India            100
Malaysia         20-50?
Worldwide: 4,416 companies, 641 PLCs, 78% USA.
Today, it is a USD55B industry by revenues
Market capitalisation: USD330B (2004, in USA alone), but be careful about hype and bubbles.
Use your common sense and do not be taken in by Wall Street analysts. Understand which changes are pendulum swings, which are structural market shifts, and which are fundamentally disruptive.
Conclusion:
Biotech is riskier than ICT and has a long gestation period. That’s why, in many countries where there is no venture capital industry, the government must take the initial lead.
Malaysia’s National Biotech Policy was unveiled on April 28, 2005. What is the order of priority?
Green Biotech? Agri / Food
Red Biotech? Healthcare / Pharma
White Biotech? Industrial (including biofuels, vitamins, environmental remediation, etc.)
Will impact sources and sinks of fund flows, deal flows.
ESSENTIALLY, BIOINFORMATICS HAS 3 COMPONENTS
• Creation of DATABASES (e.g. GenBank, EMBL, DDBJ for genomes ~ updated daily) allowing storage and management of large biological data sets (e.g. Sanger Centre’s 70 ABI sequencing units generate 60M bases of raw data daily, to become 600M annotated finished sequences per year). Biologists want manually curated, biologically validated annotation. (SWISS-PROT has 90 annotators, mostly females!)
• Development of ALGORITHMS and STATISTICS to determine relationships among members of large biological data sets
• Use of the TOOLS for the ANALYSIS and INTERPRETATION of various types of biological data, including DNA, RNA and protein sequences, protein structures, gene expression profiles and biochemical pathways.
EMBOSS
BLAST
STADEN
CLUSTALW
Finding the Right Data
Name                Address                 Description
Ensembl             www.ensembl.org         The Human Genome
GenBank/DDBJ/EMBL   www.ncbi.nlm.nih.gov    Nucleotide sequence
PubMed              www.ncbi.nlm.nih.gov    Literature references
NR                  www.ncbi.nlm.nih.gov    Protein sequences
SWISS-PROT          www.expasy.ch           Annotated protein sequences
InterProScan        www.ebi.ac.uk           Protein domains
OMIM                www.ncbi.nlm.nih.gov    Genetic diseases
Enzymes             www.chem.qmul.ac.uk     Enzymes
PDB                 www.rscb.gov/pdb        Protein structures
KEGG                www.genome.ad.jp        Metabolic pathways
Source: Bioinformatics For Dummies (2003)
Analyzing your DNA/RNA Sequence
Name                      Address                                   Description
Webcutter                 www.firstmarket.com/cutter                Restriction map
PCR                       biotools.umassmed.edu/bioapps             PCR primers design
Assembly                  bio.ifom-firc.it/ASSEMBLY/assemble.html   Simple DNA assembling for small sequences
GenomeScan                genes.mit.edu/genomescan/                 Gene discovery
blastn, tblastn, blastx   www.ncbi.nlm.nih.gov                      Database search
The Genome Browser        genome.cse.ucsc.edu                       Browse the ultimate data!
Mfold                     www.bioinfo.rpi.edu                       RNA structure prediction
Analyzing Your Protein Sequence
Name           Address                          Description
BLAST          www.ncbi.nlm.nih.gov             Database homology search
SRS            srs.ebi.ac.uk                    Database search
Entrez         www.ncbi.nlm.nih.gov             Database search
InterProScan   www.ebi.ac.uk                    Find protein domains
ExPASy         www.expasy.ch                    Analyze a protein
Dotlet         www.ch.embnet.org                Make a dot plot
ClustalW       www.ebi.ac.uk                    Multiple sequence alignment
T-Coffee       igs-server.cnrs-mrs.fr/Tcoffee   Evaluate multiple alignment
Jalview        www.es.embnet.org                Multiple alignment editor
PSIPRED        bioinf.cs.ucl.ac.uk/psipred      Secondary structure prediction
Cn3D           www.ncbi.nlm.nih.gov/Structure   Display and spin 3-D structures
Phylip         bioweb.pasteur.fr/into-uk.html   Tree reconstruction
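Many of the tools in these tables are built on simple ideas. As a rough illustration of what “make a dot plot” means, here is a minimal Python sketch. It is not Dotlet’s algorithm, which is not described in this talk; the function name `dot_plot` and the `window` parameter are made up for this example.

```python
def dot_plot(seq_a, seq_b, window=1):
    """Return text rows with '*' wherever a window-length word of seq_a equals one of seq_b.

    Row i corresponds to position i in seq_a, column j to position j in seq_b.
    """
    rows = []
    for i in range(len(seq_a) - window + 1):
        word_a = seq_a[i:i + window]
        row = ""
        for j in range(len(seq_b) - window + 1):
            row += "*" if word_a == seq_b[j:j + window] else "."
        rows.append(row)
    return rows


if __name__ == "__main__":
    for line in dot_plot("GATTACA", "GATTTACA", window=2):
        print(line)
```

Runs of matches along a diagonal indicate regions the two sequences share; a larger `window` suppresses chance single-letter matches.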
DATA
• Stored fact
• Inactive (they exist)
• Technology-based
• Gathered from various sources
INFORMATION
• Presented fact
• Active (enabler)
• Business-based
• Transformed from data
Role of Bioinformatics
(This role is already quite developed in Engineering and Finance/Banking, where component parts, or building blocks, are already known and their respective functions well understood)
The Typical Bioinformatics Equation:
Know Sequence, What’s the Consequence?
Adapted from Dr Hwa A Lim’s “Genetically Yours”, World Scientific Publisher, 2002.
Data → Information → Knowledge
Analyses, Modelling, Experiments (in-vivo, in-vitro, population, etc.)
THE “NEW BIOLOGY” INFORMATION-DELIVERY MODEL
[Figure: layered architecture diagram, adapted from EMC.]
Layers: Presentation Layer; Computational & Data Management Computing Layer; Information Management Layer; Data
Devices shown: Laptop, PC, Multimedia H/P, PDA, Visualization & Hi-performance Workstation
Infrastructure shown: LIMS, Server A, Parallel Linux Clusters*, SMP* clusters/dedicated irons (e.g. IBM BlueGene/L: 65,000 CPUs, 132 TF on LINPACK), Storage Management, Database Management, Grid Computing Framework, Global WAN + Great Global Grid
Data types shown: Biological, Chemical, Proteomic, Genetic, Clinical, Financial, Admin, Diagnostic, Treatment, Pathways
Note: Use of scripting languages (Perl, Python, R, JAVA, etc.), XML, Linux and other Open Source Bioinformatics Software is widespread
* An OSS White Paper on High Performance Computing is downloadable from http://www.aldrix.net
Grand Challenge I : Genomics to Biology
Elucidating the structure and function of genomes
I-1 Comprehensively identify the structural and functional components encoded in the human genome.
I-2 Elucidate the organisation of genetic networks and protein pathways and establish how they contribute to cellular and organismal phenotypes.
I-3 Develop a detailed understanding of the heritable variation in the human genome.
I-4 Understand evolutionary variation across species and the mechanisms underlying it.
I-5 Develop policy options that facilitate the widespread use of genome information in both research and clinical settings.
NHGRI, 2003
Bioinformatics in Drug Discovery / Development Process (Example)
[Figure: flow diagram linking data sources and computational methods to the stages of drug discovery and development.]
Pipeline steps: DNA sequences & maps; Gene Expression Analysis; Protein structure Prediction / Analysis; Disease model; Disease selection; Empirical Medicine; LTS; HTS; Lead Identification & optimization; Rational drug design; Pre-clinical trials; Clinical Trials P-I / II / III / IV
Data sources: DNA information (nucleotide sequence of genes); Genomics database; Target gene; Target receptor; Protein data; Cell biology database; In vivo Disease biology; Animal genetic diseases; Physiology database; Medical research database; Preclinical and experimental Data; Clinical Data; Clinical Trials; Clinical Biology
Methods: Genomics / Proteomics (Structural Genomics, Functional Genomics, Proteomics); Computational Chemistry (Protein structure prediction, Protein-protein interaction, Protein-molecule interaction); Computational physical chemistry; Medicinal chemistry; Molecular diversity; chemistry structure
Source: IBM Life Science, 2002
Grand Challenge II : Genomics to Health
Translating genome-based knowledge into health benefits
II-1 Develop robust strategies for identifying the genetic contributions to disease and drug response.
II-2 Develop strategies to identify gene variants that contribute to good health and resistance to disease.
II-3 Develop genome-based approaches to prediction of disease susceptibility and drug response, early detection of illness, and molecular taxonomy of disease states.
II-4 Use new understanding of genes and pathways to develop powerful new therapeutic approaches to disease.
II-5 Investigate how genetic risk information is conveyed in clinical settings, how that information influences health strategies and behaviours, and how these affect health outcomes and costs.
II-6 Develop genome-based tools that improve the health of all.
NHGRI, 2003
Grand Challenge III : Genomics to Society
Promoting the use of genomics to maximise benefits and minimise harms
III-1 Develop policy options for the use of genomics in medical and non-medical settings.
III-2 Understand the relationship between genomics, race, and ethnicity and the consequences of uncovering these relationships.
III-3 Understand the consequences of uncovering the genomic contributions to human traits and behaviours.
III-4 Assess how to define the ethical boundaries for uses of genomics.
NHGRI, 2003
Biodiversity: Hype or Real?
AstraZeneca (7th ranked Big Pharma @ USD22B revenues, USD70B market capitalisation) thinks it’s for real. It has invested A$100M looking for drug compounds among Australia’s flora and fauna in partnership with locals (e.g. Griffith University, Queensland), and has evaluated 41,000 samples of plants and micro-organisms.
Malaysia? Early stage only. Our RM2B natural products/herbal* industries will probably focus on nutraceuticals and cosmeceuticals first. We will probably enter the global pharmaceuticals sector via bio-generics, with or without international partners.
* Consider Pegaga (Centella asiatica):
@ Pasar malam, 1 kg = RM4.00
Dried, extracts, 1 kg = RM100.00
Upon standardisation, 1 kg = RM380.00
but once standardised for the pharmaceutical/cosmetics industry, 1 kg = RM3,800.00!
(Source: Rajen)
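The value added at each step of the Pegaga example is easier to see as a multiple of the pasar malam price. The sketch below only divides the slide’s own per-kilogram prices by the first one; the variable names are made up for this example.

```python
# Prices in RM per kg, as quoted on the slide.
PEGAGA_PRICES = [
    ("Pasar malam", 4.00),
    ("Dried, extracts", 100.00),
    ("Upon standardisation", 380.00),
    ("Standardised for pharmaceutical/cosmetics industry", 3800.00),
]

base_price = PEGAGA_PRICES[0][1]
# Multiple of the pasar malam price at each stage.
multiples = [price / base_price for _, price in PEGAGA_PRICES]

for (stage, price), multiple in zip(PEGAGA_PRICES, multiples):
    print(f"{stage}: RM{price:,.2f} per kg ({multiple:g}x)")
```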
The Last Page?
Biotech/bioinformatics are K-intensive. Basic sciences must be solid. But IMAGINATION will take you to breakthroughs. Need specialised skills to innovate and create value. Worthwhile doing your post-graduate degrees (MSc, PhD, post-doctoral, overseas stints, etc.). Computer Science people must cultivate a love/interest in molecular stuff (& vice-versa). OSS and others are recruiting!
Worldwide, nations are building clusters (Michael Porter’s diamond).
For (bio) clusters to be impactful (takes at least 10 years), all 5 synergistic elements need to be in place:
• Earth - universities, RIs
• Fire - entrepreneurship
• Water - finance, esp. venture capital
• Wood - networking (government, industry, academia)
• Gold - stock market (value realisation, IPs are protected)
Definitely the Last Page
Career development? Look for scenarios that are likely to have an impact on our lives. History is littered with corpses that only had a single view of the future.
Malaysia – just starting, has a reasonable agro-industrial base
• HR capital constraints (USA produces 10,000 PhDs in Science and Engineering per year!)
• Bioinformatics skills take time to hone. Best to interact closely with wet-lab people or bench scientists. Grads must be flexible and willing to adapt/evolve. Is it our brain that’s lacking bandwidth?
• Regional competitors abound; what’s our competitive advantage?
• But national commitment is there (will it be as impactful as our plantations and ICT industries?)
• Have right people at right places, with right tools and adequate resources, and plenty of imagination.
Remember: whatever we do, it’s all relative to the competition!
Thank You