KDDart - Knowledge Discovery System
Andrzej Kilian, Diversity Arrays Technology Pty Ltd
CBA workshop, Canberra, December 4 2018
© 2015 - Diversity Arrays Technology Pty. Ltd
• Introductions
• What is important in good genotyping
  ◦ People
  ◦ QC/QA systems
  ◦ Informatics
  ◦ Equipment
• DArT PL's genome profiling methods
  ◦ Whole genome "random distribution" markers (DArTseq, DArTseqLD)
  ◦ Targeted genotyping (DArTag, DArTmp, DArTcap, rhAMPseq?)
• Examples of applications
• Comprehensive support in data analytics and management
• Concluding remarks

• Mission: actively support more effective and sustainable use of natural resources
• Main business: genome profiling services and IT support for agriculture and ecology
• Operating since 2001: first affordable whole genome profiling service
• Global service provider: >1.5 million whole genome profiles, >1000 organisms
• Supporting international agriculture through technology development, training and service
• Collaboration in technology package transfer (Mexico, Africa)
• Team of 40 (lab, data analysis and IT) plus international visitors/trainees
• High diversity of clients, from farmers to multinational corporations; the majority are from overseas, with SMEs and public organisations (CGIAR) most involved in breeding

[Image panels: Our people; Our clients; SAGA; IGSS]

• 37 years ago: PhD in population genetics and microevolution with 8(!) isozymes (Arabidopsis thaliana)
• 33 years ago: RFLP in Cambridge, UK
• 25 years ago: plant gene cloning in the USA with RDA and RFLP subtraction, a foray into genomic representations
• 21 years ago: invention of the DArT method, combining RFLP concepts with genomic representations
• 10 years ago: DArTseq, the initial transition from the array platform to NGS platforms

• Always start with learning about user needs!
• Price of service comes only at the end of the "meeting of the minds" process
• Technology applied is often not what was initially requested
  ◦ Finding a better way to deliver the required outcome
  ◦ Dependent on application, volume of service, etc.
• Complementing the "missing bits" in clients' technological capacity
  ◦ Many molecular marker options (from dozens of markers/sample up to whole genome analysis)
  ◦ Dedicated analytics/data mining integrated with marker data production
  ◦ KDDart platform for data integration

• Not only our own staff!
• Over 100 trainees since inception
• Predominantly short-term visitors
  ◦ 2 weeks average stay
  ◦ Usually with materials for processing
  ◦ Good understanding of all processes
  ◦ Joint downstream data analysis
• Over 20 longer-term visitors (several months or more)
  ◦ Mostly in the context of technology transfer
  ◦ Competence in performing the relevant tasks and troubleshooting
  ◦ Established technology in Mexico (partnership with CIMMYT) and now in Kenya (BMGF grant, partnership with ILRI): IGSS
  ◦ IGSS operating using the DArTseq platform, preparing to roll out other DArT methods and some of the IT tools

• 18 years of continued development, nearly 100 releases
• Online ordering and LIMS/database integration: critical for sample tracking
• Integrated with:
  ◦ Raw (sequence) data production
  ◦ Sequence curation, barcode splitting
  ◦ Imaging systems
  ◦ Pipetting robots, etc.
  ◦ Secondary pipeline (DArTsoft14)
• Raw data stored from the start of DArT operations
• Data complexity/volume reduction
• Coping well with a 2-4 fold yearly increase in sample/data volumes
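
The "barcode splitting" step listed above can be illustrated with a minimal sketch. This is a generic demultiplexing example, not DArT's pipeline: it assumes inline barcodes at the start of each read, exact matching only, and a hypothetical `barcode_to_sample` mapping of the kind a LIMS might supply.

```python
from collections import defaultdict


def split_by_barcode(reads, barcode_to_sample):
    """Assign each read to a sample by its leading inline barcode.

    Assumption: barcodes may differ in length, so longer barcodes are tried
    first; a short barcode that is a prefix of a longer one then cannot
    claim the longer barcode's reads.
    """
    barcodes = sorted(barcode_to_sample, key=len, reverse=True)
    by_sample = defaultdict(list)
    unassigned = []
    for read in reads:
        for barcode in barcodes:
            if read.startswith(barcode):
                # Strip the barcode so only the insert sequence is kept.
                by_sample[barcode_to_sample[barcode]].append(read[len(barcode):])
                break
        else:
            # No barcode matched exactly; keep the read for inspection.
            unassigned.append(read)
    return dict(by_sample), unassigned


# Hypothetical barcodes and reads, for illustration only.
samples, leftovers = split_by_barcode(
    ["ACGTTTGCA", "GGATCCAAT", "TTTTTTTT"],
    {"ACGT": "sample_1", "GGA": "sample_2"},
)
```

Keeping the unassigned reads, rather than discarding them, makes it possible to track how many reads are lost to barcode errors for each sequencing run.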

• 20 years of technology development (complexity-reduced representations)
• Complexity reduction optimised for each organism
  ◦ Pair of restriction enzymes, at least one methylation sensitive
• Targeting c. 200,000 mostly low-copy sequences (methyl filtration)
  ◦ In most cases >90% of markers single copy
• Most assays read at >10X
  ◦ High call rate
  ◦ High data quality
• Extensive quality control, including QC of individual libraries
• Technical replication -> selection of best markers (SNPs & SilicoDArTs)
• Sequencing platform independent

• SNPs and SilicoDArTs provide a complementary picture of diversity
• SilicoDArT analysis provides additional QC insights for SNPs
• SilicoDArTs slightly better in technical reproducibility (higher read depth)
• Insight into epigenetic variation via SilicoDArTs
• Better performance in "deep" phylogenetics compared to SNPs

Typical results (average of 100 services using the 2.5 million reads option):

                      SNPs      SilicoDArTs
Number                53,860    44,570
Read depth            25        29
Reproducibility (%)   99.5      99.8
Call rate (%)         89.7      96.2
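
The two quality metrics in the table can be made concrete with a short sketch. The definitions used here are common ones and are assumptions, not DArT's documented formulas: call rate is the percentage of non-missing calls, and reproducibility is the percentage of concordant calls between two technical replicates where both were called. The thresholds and marker names are illustrative only.

```python
def call_rate(calls):
    """Percentage of genotype calls that are not missing (None)."""
    called = sum(1 for c in calls if c is not None)
    return 100.0 * called / len(calls)


def reproducibility(rep_a, rep_b):
    """Percentage of concordant calls between two technical replicates,
    counting only positions where both replicates produced a call."""
    pairs = [(a, b) for a, b in zip(rep_a, rep_b)
             if a is not None and b is not None]
    if not pairs:
        return None
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def select_markers(markers, min_call_rate=90.0, min_repro=99.0):
    """Keep marker names whose replicate pair passes both thresholds.

    `markers` maps a marker name to (replicate_a_calls, replicate_b_calls).
    The default thresholds are hypothetical, chosen for illustration.
    """
    kept = []
    for name, (rep_a, rep_b) in markers.items():
        repro = reproducibility(rep_a, rep_b)
        rate = call_rate(list(rep_a) + list(rep_b))
        if repro is not None and repro >= min_repro and rate >= min_call_rate:
            kept.append(name)
    return kept


# Hypothetical calls for three markers scored on four replicated samples.
example = {
    "snp1": ([0, 1, 2, 0], [0, 1, 2, 0]),
    "snp2": ([1, None, 0, 1], [1, 1, 0, 1]),
    "snp3": ([2, 2, 0, 1], [2, 2, 1, 1]),
}
best = select_markers(example)
```

In this toy data only `snp1` passes the default thresholds: `snp2` has a missing call and `snp3` disagrees between replicates, which mirrors the "technical replication -> selection of best markers" idea from the DArTseq slide.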

• DArTseqLD works well when a medium density of markers in random positions across the genome is required/sufficient.
  ◦ Representations around 10x smaller than in DArTseq.
  ◦ Method optimisation (rare-cutting RE combinations) but no specific oligo synthesis required for each organism.
  ◦ Offers a cheaper setup and no ascertainment bias when applied to new material
  ◦ Very simple analytics
  ◦ Applied by many commercial and public GS programs and in lower-resolution diversity analysis (genebanks, ecology)

Targeted genotyping complements DArTseq

• Four complementary platforms developed for GS and other lower density applications
• Platform choice depends on required marker density and the volume of demand
• Sequencing platform for assay readout
• Adopted several methods for sequence capture
• Performance, cost and FTO considerations
• Average read depth/locus >100: very high data quality

DArTmp was developed recently for applications in which a low density of markers is sufficient (<500). The method involves a two-step PCR process:
1. Selecting a short stretch of sequence around the SNP
2. Barcoding and addition of NGS platform elements

• DArTcap: regular DArTseq assay followed by capture of selected markers with "baits"
• Enriched library sequenced.
• Outperforms other low(er)plex assays in polyploids thanks to the complexity reduction step.
• Probe synthesis investment required, and some ascertainment bias possible if the probe set is not optimised for the material/application
• Enabling "exome/gene space capture" in the absence of reference sequence and/or transcriptome data (more effective in plants!)

Workflow (from the slide diagram):
1. Standard DArTseq library construction: all fragments have a complete assembly of elements required for Illumina sequencing
2. Hybridisation with biotin-RNA probes complementary to markers of interest
3. Fragments with markers of interest pulled down with magnetic beads
4. Captured fragments eluted and Illumina sequenced; DArTcap markers extracted

• Comprehensive characterisation of germplasm/diversity studies:
  ◦ Parents of breeding programs
  ◦ Germplasm collections (e.g. SeeD project)
• Genetic ID/PBR
• Seed purity/product quality testing
  ◦ Partnership with GRDC to deliver testing for grains in Australia, also internationally (mainly Africa)
• Genetic and QTL mapping
  ◦ >1000 mapping populations
• Association mapping/GWAS
  ◦ "Cloning as you go" in breeding programs
• Gene tagging/cloning (under $10,000): quantitative BSA
• Genomic Selection (plants and animals)
• Population genetics
• Eco-evo, applied ecology
• SpIDer: sort of "barcode of life" (species identification)
  ◦ Long-term project with Chevron in Australia

• Cost of assay and platform choice driven by size of demand and application type: consultation critical
• Limiting the number of platforms => lower investment (sequencing "the best bet")
• Assay densities
  ◦ Hundreds of thousands for parents
  ◦ Thousands for selections/biparental crosses
  ◦ Dozens for "targeted selection"
• High density: ~$35-40/sample depending on volume
• Cheaper versions of assay ($10-25)
  ◦ With lower depth with imputation ("on the fly")
• Markers from the same representation -> efficient imputation
• Low density (100-1000 markers): $7-10
• Lowest densities, ~30 markers ($2-4)

[Diagram: Breeding parents (>100,000 markers); Routine (GS) selection (>5,000); Targeted selection (>30)]
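
As a worked example of how the tiered pricing combines across a breeding programme, the sketch below sums per-sample price ranges over a genotyping plan. The price ranges are the ones quoted on the slide (currency as quoted, "$"); the tier names and the sample counts are hypothetical and chosen only for illustration.

```python
# Per-sample price ranges quoted on the slide; tier names are our own labels.
PRICE_PER_SAMPLE = {
    "high_density": (35, 40),
    "reduced_depth_with_imputation": (10, 25),
    "low_density": (7, 10),       # 100-1000 markers
    "lowest_density": (2, 4),     # ~30 markers
}


def cost_range(tier, n_samples):
    """Return (lowest, highest) total cost for n_samples in one tier."""
    low, high = PRICE_PER_SAMPLE[tier]
    return low * n_samples, high * n_samples


def programme_budget(plan):
    """Sum the cost ranges over a plan given as {tier: n_samples}."""
    total_low = total_high = 0
    for tier, n_samples in plan.items():
        low, high = cost_range(tier, n_samples)
        total_low += low
        total_high += high
    return total_low, total_high


# Hypothetical plan: few parents at high density, many selection candidates
# at low density, and a large targeted-selection screen.
plan = {"high_density": 200, "low_density": 5000, "lowest_density": 20000}
budget = programme_budget(plan)
```

For this hypothetical plan the parents contribute 7,000 to 8,000, the low-density selections 35,000 to 50,000, and the targeted screen 40,000 to 80,000, so the total lies between 82,000 and 138,000. The calculation shows why volume, rather than the per-sample price of the densest assay, tends to dominate the budget.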

• Initial method described by Bedo et al., BMC Genetics (2008): partnership with NICTA
• Applicable to any population type, including breeding programs
• Used either for QTL/GWAS or for genomic selection
• GEMir and GEMis
• New technology for finding interacting pairs of genes (or higher level interactions)
• Adding interactions of markers with environment
• Testing capacity to improve prediction models of GS with "epistasis"

[Figure panels: GWAS analysis (linear models); GWIS analysis (interaction search)]
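
To illustrate what an "interaction search" over marker pairs means, here is a minimal brute-force sketch. It is not the method of Bedo et al. and not GEMir or GEMis: it assumes binary (0/1) marker scores, such as presence/absence data, and ranks every pair by a simple interaction contrast between the four genotype-class means. All names and data are hypothetical.

```python
from itertools import combinations
from statistics import mean


def interaction_contrast(x1, x2, y):
    """Interaction effect for two binary (0/1) markers.

    Computes (m11 - m10) - (m01 - m00), where m_ab is the mean phenotype of
    samples with marker scores (a, b). A purely additive pair gives 0.
    Returns None if any of the four classes is empty.
    """
    cells = {(a, b): [] for a in (0, 1) for b in (0, 1)}
    for a, b, value in zip(x1, x2, y):
        cells[(a, b)].append(value)
    if any(not values for values in cells.values()):
        return None
    m = {key: mean(values) for key, values in cells.items()}
    return (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])


def rank_pairs(markers, y):
    """Score every marker pair and sort by absolute interaction contrast."""
    scored = []
    for (name1, x1), (name2, x2) in combinations(markers.items(), 2):
        contrast = interaction_contrast(x1, x2, y)
        if contrast is not None:
            scored.append((abs(contrast), name1, name2))
    return sorted(scored, reverse=True)


# Toy data: the phenotype is raised only when m1 and m2 are both present
# (epistasis), plus an independent additive effect of m3.
markers = {
    "m1": [0, 0, 0, 0, 1, 1, 1, 1],
    "m2": [0, 0, 1, 1, 0, 0, 1, 1],
    "m3": [0, 1, 0, 1, 0, 1, 0, 1],
}
phenotype = [0, 0.5, 0, 0.5, 0, 0.5, 1, 1.5]
ranked = rank_pairs(markers, phenotype)
```

The pair (`m1`, `m2`) ranks first, while the additive marker `m3` contributes no interaction signal. The number of pairs grows quadratically with the number of markers, which is why an exhaustive scan like this one does not scale to whole-genome marker sets and dedicated search methods are needed.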

[Diagram: Data production; Data integration]

• Integration with DNA profiling and IT services provided by DArT PL and IGSS
  ◦ Consultancy with users to find the best solution to their problem
• DArTseq: a generic, scalable and robust platform for any organism and application (over 1,000 species in routine analysis)
• Support in lower density assay development and delivery
  ◦ DArTmp, DArTag, DArTcap and DArTseqLD
• DArTdb and DArTsoft14 in support of marker data production
• KDDart platform for data storage and analysis
• Continued development of genotyping methods (DArTreseq, whole genome assembly/structural variants) and free tools for data collection and analysis
• Open access and partnership model for technology and service delivery
