Page 1

9/30/09 erlich@cshl.edu

Watson School of Biological Sciences

Cold Spring Harbor Laboratory

Yaniv Erlich

Compressed Sensing Approaches for High Throughput Carrier Screen

Joint work with Noam Shental, Amnon Amir and Or Zuk

Page 2

Outline

• What is a carrier screen?

• Our vision - compressed sensing carrier screen

• Unique features of our setting

• Bayesian reconstruction algorithm

• Simulations

Intro - carrier screens

CS vision Unique features BP solver Simulations

Compressed sensing carrier screen

Page 3

Rare recessive genetic diseases

Example: Cystic Fibrosis

Name        Genotype    Phenotype    Frequency
Normal      (figure)    Healthy      ~29/30
Carrier     (figure)    Healthy!     ~1/30
Affected    (figure)    Disease      0.003%

Page 4

Carrier breeding may lead to devastating results

Offspring of a carrier couple:

• No carrier: 1:4

• Carrier: 1:2

• Affected: 1:4

Page 5

What can we do?

• Several countries employ nationwide programs

- screen the bulk population

- very limited set of genes

Page 6

Carrier screen - the current mechanism

Input: Thousands of specimens.

Output: Finding carriers for rare genetic diseases

A needle in a haystack problem


Serial processing:

- sequence: 1 region of 1 person per reaction

- expensive and does not scale

Page 7

Carrier screens - our vision

Ultra-high throughput carrier screen

Many specimens + many regions

• Add more genes to the test panel while keeping the task at a tractable scale

• Increase participation by reducing the cost

Page 8

Next generation sequencers – parallel processing

• Sequence 100 million DNA molecules in a single batch (~1 week)

BUT

• On pooled samples, the output is only a histogram of the DNA sequence types.

Example: when pooling 4 normal specimens and 1 carrier

[Figure: bar chart of the fraction of reads (0% to 100%) for the WT allele and the mutant allele]

How to multiplex many specimens with next generation sequencers?

Page 9

Multiplexing - the compressed sensing approach

y = Φx

CS principle: when x is sparse, very few measurements are sufficient for faithful reconstruction.

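In y = Φx:

• x: vector over the N specimens; its nonzero entries mark the carriers

• Φ: the pooling design, a 0-1 matrix with T pools (rows) and N specimens (columns)

• y: the ratio of carrier reads in each pool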
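The relation y = Φx can be tried out on a toy design. The sketch below is illustrative only: the 3 x 6 matrix `phi`, the carrier vector `x`, and the variable names are hypothetical, and it assumes equal DNA amounts per specimen and one mutant allele out of two per carrier, so that the expected fraction of mutant reads in a pool is the fraction of carriers divided by 2.

```python
import numpy as np

# Hypothetical toy design: T = 3 pools (rows), N = 6 specimens (columns).
phi = np.array([
    [1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 1],
])

# x marks the carriers: specimen 1 (0-based) is the only carrier.
x = np.array([0, 1, 0, 0, 0, 0])

carriers_per_pool = phi @ x      # number of carriers in each pool
pool_sizes = phi.sum(axis=1)     # number of specimens in each pool

# Assumption: a carrier has one mutant allele out of two, so the expected
# fraction of mutant reads is (carriers / pool size) / 2.
expected_mutant_fraction = carriers_per_pool / (2 * pool_sizes)

# Same arithmetic for a pool of 4 normal specimens and 1 carrier.
one_carrier_in_five = 1 / (2 * 5)
```

Under these assumptions a pool of 4 normal specimens and 1 carrier is expected to show 10% mutant reads, and a pool without carriers shows none.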

Page 10

Distinctions from traditional CS

• ‘On a budget’ compressed sensing

• Not all pools were born equal

• Signal domain

Page 12

On a budget compressed sensing

• Heavy weight design requires long pooling steps and higher material consumption

• Higher compression level is more prone to technical difficulties

• We want a very sparse sensing matrix

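[Figure: the sensing matrix Φ, with pools (t) as rows and specimens (N) as columns, annotated with the weight (w) and the compression level; the example shown is a random matrix with p=0.5]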

Page 13

Light Chinese Design

Inputs: N (number of specimens in the experiment)

Weight W (pooling efforts)

Algorithm:

1. Find W numbers {x1, x2, …, xW} such that each is:

• Bigger than √N

• Pairwise coprime with the others

2. Generate W modular equations:

Pool ≡ Specimen (mod x1)

Pool ≡ Specimen (mod x2)

⋮

Pool ≡ Specimen (mod xW)

3. Construct the pooling design upon the modular equations

Output: Sparse pooling design with x1 + x2 + … + xW pools (about W·√N when the moduli are chosen just above √N)

Advantages:

• (W-1)-disjunct matrix

• The weight does not explicitly depend on the number of specimens

• The compression level is about N/xi, i.e. roughly √N

• Easy to debug

[Figure: example design with pooling windows mod 6 and mod 7]
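The three steps of the Light Chinese Design can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the greedy upward scan for pairwise-coprime moduli, the 0-based specimen numbering, and the function names `choose_moduli` and `chinese_design` are all assumptions made for the example.

```python
from math import gcd, isqrt

def choose_moduli(n_specimens, weight):
    """Pick `weight` pairwise-coprime integers, each bigger than sqrt(N).

    Greedy upward scan; an illustrative choice of moduli, not necessarily
    the one used by the authors.
    """
    moduli = []
    candidate = isqrt(n_specimens) + 1  # smallest integer strictly above sqrt(N)
    while len(moduli) < weight:
        if all(gcd(candidate, m) == 1 for m in moduli):
            moduli.append(candidate)
        candidate += 1
    return moduli

def chinese_design(n_specimens, weight):
    """Return (moduli, pools): pools[k][r] lists the specimens s with s % moduli[k] == r."""
    moduli = choose_moduli(n_specimens, weight)
    pools = [
        [[s for s in range(n_specimens) if s % m == r] for r in range(m)]
        for m in moduli
    ]
    return moduli, pools

# Toy run: 30 specimens, weight 2.
moduli, pools = chinese_design(n_specimens=30, weight=2)
```

For 30 specimens and weight 2 this scan picks the moduli 6 and 7, giving 6 + 7 = 13 pools. Every specimen sits in exactly 2 pools, and because 6 · 7 ≥ 30, the Chinese remainder theorem guarantees that two different specimens share at most one pool.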

Page 15

Not all pools were born equal

• The sequencer does not report the absolute number of carriers in the pool

• Instead:

# carrier reads ~ Binomial(r, p)

where r = # total sequence reads and p = fraction of carriers in the pool / 2

• Pools with ↑sequence reads and ↓carriers provide more reliable information.

• The noise is not additive; it is correlated with the content of the pool.

• We need a reconstruction algorithm that takes into account the reliability of the data from each pool.
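The binomial read model on this slide can be simulated directly. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the spread is computed for a simple estimator of the carrier count (2 · pool size · carrier reads / total reads), which is not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

def simulate_carrier_reads(n_carriers, pool_size, total_reads):
    """Draw # carrier reads ~ Binomial(r, p) with p = (carriers / pool size) / 2."""
    p = n_carriers / pool_size / 2
    return rng.binomial(total_reads, p)

def carrier_count_std(n_carriers, pool_size, total_reads):
    """Standard deviation of the estimate 2 * pool_size * reads / r.

    Follows from the binomial variance r * p * (1 - p) of the read count.
    """
    p = n_carriers / pool_size / 2
    return 2 * pool_size * np.sqrt(p * (1 - p) / total_reads)

reads = simulate_carrier_reads(n_carriers=1, pool_size=20, total_reads=1000)
std_few_reads = carrier_count_std(1, 20, 100)
std_many_reads = carrier_count_std(1, 20, 1000)
std_many_carriers = carrier_count_std(5, 20, 1000)
```

The standard deviation shrinks as the number of reads grows and rises with the number of carriers in the pool, which is the sense in which the noise depends on the pool content rather than being additive.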

Page 17

Signal Domain

In traditional CS: $\vec{x} \in \mathbb{R}^N$

In compressed carrier screen: $\vec{x} \in \{0,1\}^N$

Traditional CS decoder solves:

$\hat{x} = \arg\min_{x \in \mathbb{R}^N} \|x\|_1 \quad \text{s.t.} \quad \|\Phi x - y\|_2 \le \varepsilon$

• What are the implications of using a traditional decoder followed by a rounding procedure?

• Can we find a reconstruction procedure that directly finds $\vec{x} \in \{0,1\}^N$?

Page 18

Bayesian reconstruction algorithm

$\vec{x}^* = \arg\max_{\vec{x} \in \{0,1\}^N} \left\{ P(\vec{x} \mid B) \, P(D \mid \vec{x}) \right\}$

The first factor carries the biological data and the second the pooling data.

Biological expectations: biologically, the genotype of one specimen does not depend on the genotype of another (unless they are relatives).

Pooling model and sequencing: only the specimens in a pool affect that pool's results.

$\vec{x}^* = \arg\max_{\vec{x} \in \{0,1\}^N} \left\{ \prod_{i} P(x_i \mid B) \prod_{t \in T} P(D_t \mid \{x_i\}_{i \in t}) \right\}$

Approximation by loopy Belief Propagation…
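The factorized objective on this slide can be evaluated exactly on a very small instance by trying every binary vector. The sketch below does this by brute-force enumeration, which is exponential in N and only meant to make the objective concrete; it is not the loopy Belief Propagation solver the slides describe. The pool layout, read counts, prior, and function names are hypothetical, and the likelihood is the binomial read model with p = fraction of carriers in the pool / 2.

```python
from itertools import product
from math import comb, log

def log_binomial(k, r, p):
    """Log of Binomial(k; r, p) for 0 <= p < 1."""
    if p == 0.0:
        return 0.0 if k == 0 else float("-inf")
    return log(comb(r, k)) + k * log(p) + (r - k) * log(1 - p)

def map_decode(pools, carrier_reads, total_reads, carrier_prior, n_specimens):
    """Exhaustive MAP search over x in {0,1}^N (toy sizes only)."""
    best_x, best_score = None, float("-inf")
    for x in product((0, 1), repeat=n_specimens):
        # Biological data: independent prior on each specimen.
        score = sum(log(carrier_prior) if xi else log(1 - carrier_prior) for xi in x)
        # Pooling data: each pool depends only on its own members.
        for members, k, r in zip(pools, carrier_reads, total_reads):
            p = sum(x[i] for i in members) / (2 * len(members))
            score += log_binomial(k, r, p)
        if score > best_score:
            best_x, best_score = x, score
    return best_x

# Toy setup (hypothetical numbers): 6 specimens, pooling windows mod 2 and mod 3.
pools = [[0, 2, 4], [1, 3, 5], [0, 3], [1, 4], [2, 5]]
total_reads = [100] * 5
carrier_reads = [17, 0, 0, 25, 0]  # roughly what a single carrier, specimen 4, would give
x_map = map_decode(pools, carrier_reads, total_reads, carrier_prior=0.04, n_specimens=6)
```

In this toy run only specimen 4 explains the two pools that show mutant reads without contradicting the three pools that show none, so the search returns a vector with a single 1 at index 4.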

Page 19

Advantages of Belief Propagation

• Bottom up approach – weighs the reliability of each individual pool

• Bayesian – everything speaks the same language. Can incorporate a-priori medical information and familial connections.

• Encoding advantage – Chinese pooling ensures that there are no short cycles

• Binary results directly – no rounding procedure at the end

Page 20

Simulations of compressed carrier screen in Ashkenazi Jews

Genetic Disorder          Carrier rate
Tay-Sachs                 1:25
Cystic Fibrosis           1:30
Familial Dysautonomia     1:30
Usher Syndrome            1:40
Canavan                   1:40
Glycogen Storage          1:71
Fanconi Anemia C          1:80
Niemann-Pick              1:80
Mucolipidosis type 4      1:100
Bloom                     1:102
Nemaline Myopathy         1:108

• Finding carriers for two diseases prevalent among Ashkenazi Jews: Tay-Sachs and Bloom syndrome.

• Chinese pooling design

• Comparing GPSR (traditional solver) and BP

• Evaluating Nmax – the largest number of specimens for which at least 48 out of 50 runs give 100% accuracy.

Page 21

Results

[Figure: BP vs. GPSR. Panels: Bloom, Tay-Sachs. Annotations: Pools/Specimens = 6.5%; Pools/Specimens = 13%]

Page 22

Conclusions

• The CS framework can be utilized for ultra-high throughput carrier screens.

• Our setting shows several unique features not found in the traditional framework

- We suggest tailored encoding (light Chinese) and decoding (BP) procedures

• At least in our settings: a tailored decoder, BP, has an advantage over reconstruction with an off-the-shelf CS solver

• CS carrier screen has the potential to dramatically reduce the cost of sequencing.

Page 23

An ongoing study…

The real thing

[Figure labels: Introduction, Naïve Solutions, Chinese Pooling, Analysis, Results]

Page 24

Acknowledgements

Greg Hannon

Noam Shental

Or Zuk & Amnon Amir

Igor Carron (Nuit Blanche)

Funding:

Lindsay Goldberg PhD Fellowship

ACM/IEEE-CS HPC PhD Fellowship

For more information: hannonlab.cshl.edu/labmembers/erlich

Page 25

Loopy belief propagation is tricky

Damping is the key

DNA Sudoku
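The slide does not spell out the damping rule. A common form, assumed here, replaces each message by a convex combination of its previous value and the freshly computed one. The function name `damp`, the `damping` parameter, and the toy oscillating update below are hypothetical and only illustrate why damping can help an iteration settle.

```python
def damp(old_message, new_message, damping=0.5):
    """Blend the previous message with the freshly computed one.

    damping = 0 keeps the new message as is; values closer to 1 change more slowly.
    """
    return [damping * o + (1 - damping) * n for o, n in zip(old_message, new_message)]

def toy_update(message):
    """Hypothetical update that swaps the two entries, so it oscillates forever."""
    return [message[1], message[0]]

undamped = [0.9, 0.1]
damped = [0.9, 0.1]
for _ in range(10):
    undamped = toy_update(undamped)               # flips between [0.9, 0.1] and [0.1, 0.9]
    damped = damp(damped, toy_update(damped))     # settles at [0.5, 0.5]
```

Because both inputs are normalized, the damped message stays normalized; in this toy case the undamped iteration never stops oscillating while the damped one reaches the fixed point.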

Page 27

Pooling imperfections

• Background contamination

• Pooling failures (erasures)

[Figure: data from a real experiment, pooling window mod 377. Axes: Pools vs. # Reads; pools not in use are marked]

Page 28

Distinctions from traditional CS

• ‘On a budget’ compressed sensing

• Not all pools were born equal

• Pooling imperfections

• Signal domain
