
Detecting Inversions in Human Genome

Date post: 03-Jan-2016
Page 1: Detecting Inversions in Human Genome

Detecting Inversions in Human Genome

Phillip Tao

Advisor: Eleazar Eskin

Page 2: Detecting Inversions in Human Genome

Polymorphism

Structural abnormality in chromosome
  Deletion
  Duplication
  Translocation
  Inversion

Page 3: Detecting Inversions in Human Genome

Inversion

Portion of chromosome is flipped
Usually no major adverse effects
Inverted section tends to have strong LD
Small inversions are very hard to detect

Page 4: Detecting Inversions in Human Genome

Bafna’s Method

Define an inversion by its two breakpoints
Find two SNPs on each side of each breakpoint
If there is an inversion, a SNP just outside one breakpoint should correlate more strongly with a SNP just inside the other breakpoint than with the SNP just inside its own breakpoint

Page 5: Detecting Inversions in Human Genome

... A ... ... T ... ... C ... ... C ...

... A ... ... G ... ... C ... ... G ...

... C ... ... T ... ... G ... ... C ...

... C ... ... G ... ... G ... ... G ...

... A ... ... G ... ... C ... ... G ...

Page 6: Detecting Inversions in Human Genome

My Goal

Simplify Bafna’s method
Use r-correlation
Use single SNPs instead of finding multi-SNP markers

Page 7: Detecting Inversions in Human Genome

My Method

Calculate the correlation between all pairs of SNPs
For each SNP, calculate the difference between its correlations with every pair of other SNPs
Find sets of four SNPs which fit the pattern described earlier
Organize sets into groups based on position

Page 8: Detecting Inversions in Human Genome

Example

1 2 3 4 5 6 7
A T C A G C G
A G A A G T C
T G C G G C C
A T C A G C G
T T C G A C G
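The first step of the method, the correlation between all pairs of SNPs, can be sketched on the example haplotypes above. This is a minimal illustration and not the author's code: coding each SNP as 0/1 relative to its first allele, and taking the absolute value of Pearson's r so the result does not depend on that coding, are both assumptions made here.

```python
from math import sqrt

# Example haplotypes from the slide: one string per haplotype, one column per SNP.
haplotypes = ["ATCAGCG", "AGAAGTC", "TGCGGCC", "ATCAGCG", "TTCGACG"]

def encode(column):
    """Code a biallelic SNP column as 0/1 relative to its first allele (assumed coding)."""
    return [0 if allele == column[0] else 1 for allele in column]

def abs_r(x, y):
    """Absolute Pearson correlation of two equal-length 0/1 vectors.

    Assumes both SNPs are polymorphic, so neither variance is zero.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return abs(cov) / sqrt(vx * vy)

# One 0/1 vector per SNP (column of the haplotype matrix).
snps = [encode([h[i] for h in haplotypes]) for i in range(len(haplotypes[0]))]

# r_table[i][j] holds |r| between SNP i+1 and SNP j+1 (lower triangle, j <= i).
r_table = [[abs_r(snps[i], snps[j]) for j in range(i + 1)] for i in range(len(snps))]

for i, row in enumerate(r_table, start=1):
    print(i, " ".join(f"{v:.1f}" for v in row))
```

On these five haplotypes, SNPs 1 and 4 carry the same pattern and come out perfectly correlated, as do SNPs 3 and 6 and SNPs 2 and 7.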

Page 9: Detecting Inversions in Human Genome

Example r table

      1     2     3     4     5     6     7
1   1.0
2   0.2   1.0
3   0.4   0.6   1.0
4   1.0   0.2   0.4   1.0
5   0.6   0.4   0.3   0.4   1.0
6   0.4   0.6   1.0   0.4   0.3   1.0
7   0.2   1.0   0.6   0.2   0.4   0.6   1.0

Page 10: Detecting Inversions in Human Genome

Example diff table (SNP 1)

      1     2     3     4     5     6     7
1
2         0.0
3         0.2   0.0
4         0.8   0.6   0.0
5         0.4   0.2  -0.4   0.0
6         0.2   0.0  -0.6  -0.2   0.0
7         0.0  -0.2  -0.8  -0.4  -0.2   0.0

1 2 4   1 2 5   1 3 4   1 3 5
1 2 3   1 2 6

Page 11: Detecting Inversions in Human Genome

Example diff table (SNP 6)

      1     2     3     4     5     6     7
1   0.0
2  -0.2   0.0
3  -0.6  -0.4   0.0
4   0.0   0.2   0.6   0.0
5   0.1   0.3   0.7   0.1   0.0
6
7                                       0.0

2 4 6   2 5 6   3 4 6   3 5 6
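The diff tables can be reproduced from the example r table. In both tables each entry equals the correlation of the reference SNP with the farther of two SNPs minus its correlation with the nearer one, so a positive entry marks a pair where distance does not weaken correlation. The sketch below is one reading of the slides, not the author's code; the `cutoff` of 0.2 and the function names are assumptions chosen for illustration.

```python
# Lower-triangular r table copied from the "Example r table" slide (SNPs 1..7).
R_ROWS = [
    [1.0],
    [0.2, 1.0],
    [0.4, 0.6, 1.0],
    [1.0, 0.2, 0.4, 1.0],
    [0.6, 0.4, 0.3, 0.4, 1.0],
    [0.4, 0.6, 1.0, 0.4, 0.3, 1.0],
    [0.2, 1.0, 0.6, 0.2, 0.4, 0.6, 1.0],
]
N = len(R_ROWS)

def r(i, j):
    """Look up r between SNPs i and j (1-based, order-independent)."""
    hi, lo = max(i, j), min(i, j)
    return R_ROWS[hi - 1][lo - 1]

def diff(anchor, near, far):
    """r(anchor, far) - r(anchor, near), rounded to the table's one decimal."""
    return round(r(anchor, far) - r(anchor, near), 1)

def triples(anchor, cutoff=0.2):
    """Triples, in position order, where the farther SNP beats the nearer one.

    Only pairs lying on the same side of the anchor are compared.
    `cutoff` is a hypothetical threshold, not a value given on the slides.
    """
    found = []
    # Pairs to the right of the anchor: anchor < near < far.
    for near in range(anchor + 1, N + 1):
        for far in range(near + 1, N + 1):
            if diff(anchor, near, far) >= cutoff:
                found.append((anchor, near, far))
    # Pairs to the left of the anchor: far < near < anchor.
    for far in range(1, anchor):
        for near in range(far + 1, anchor):
            if diff(anchor, near, far) >= cutoff:
                found.append((far, near, anchor))
    return found

print(triples(1))
print(triples(6))
```

With this cutoff the two calls return the same triples that are listed under the diff tables for SNP 1 and SNP 6.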

Page 12: Detecting Inversions in Human Genome

Example cont.

1 2 4   1 2 5   1 3 4   1 3 5
2 4 6   2 5 6   3 4 6   3 5 6
2 4 7   2 5 7   3 4 7   3 5 7

1 2 4 6   1 2 5 6   1 3 4 6   1 3 5 6
1 2 4 7   1 2 5 7   1 3 4 7   1 3 5 7

1 2 3   1 2 6

[1 – 1] [2 – 3] [4 – 5] [6 – 7]
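The step from triples to four-SNP sets and then to a group can be sketched as follows. A triple (a, b, c) found for the left SNP is paired with a triple (b, c, d) found for the right SNP when they share the middle pair, and the group is summarized by the range of each of the four positions. The slides do not state the grouping rule, so treating all eight sets as a single group is an assumption, and the function names are hypothetical.

```python
# Triples as listed on the example slides.
left_triples = [(1, 2, 4), (1, 2, 5), (1, 3, 4), (1, 3, 5), (1, 2, 3), (1, 2, 6)]
right_triples = [(2, 4, 6), (2, 5, 6), (3, 4, 6), (3, 5, 6),
                 (2, 4, 7), (2, 5, 7), (3, 4, 7), (3, 5, 7)]

def join_triples(left, right):
    """Pair (a, b, c) with (b, c, d) into a four-SNP set (a, b, c, d)."""
    by_middle = {}
    for b, c, d in right:
        by_middle.setdefault((b, c), []).append(d)
    return [(a, b, c, d) for a, b, c in left for d in by_middle.get((b, c), [])]

def position_ranges(sets):
    """Minimum and maximum of each of the four positions over a group of sets."""
    return [(min(col), max(col)) for col in zip(*sets)]

four_sets = join_triples(left_triples, right_triples)
print(four_sets)
print(position_ranges(four_sets))
```

The triples 1 2 3 and 1 2 6 find no partner sharing their middle pair, which is consistent with them being set aside on the slide, and the ranges of the remaining eight sets match the group shown above.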

Page 13: Detecting Inversions in Human Genome

Results

Results for 8 ENCODE regions
Each ENCODE region has about one “big” inversion, and 3 or 4 smaller possible inversions
Inversion candidates range from about 20 kb to 250 kb

Page 14: Detecting Inversions in Human Genome

Encode 1 CEU

length 138206:

26933775 26961947 27061501 27080620 (x1152)

[26933311 - 26935400] [26935778 - 27001979]

[27061501 - 27073984] [27074652 - 27115799]

length 24723:

27229393 27243243 27265414 27269500 (x549)

[27222615 - 27242896] [27243243 - 27247682]

[27264662 - 27267966] [27269500 - 27290893]
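The slides do not say how the reported length is derived. The printed numbers are consistent with taking the end of the third range minus the start of the second range, that is, the widest span between the two inner ranges. The short check below uses the two Encode 1 CEU candidates; the relation itself is an inference from the numbers, and `candidate_length` is a hypothetical helper name.

```python
# The four position ranges of each Encode 1 CEU candidate, as printed on the slide.
first = [(26933311, 26935400), (26935778, 27001979),
         (27061501, 27073984), (27074652, 27115799)]
second = [(27222615, 27242896), (27243243, 27247682),
          (27264662, 27267966), (27269500, 27290893)]

def candidate_length(ranges):
    """End of the third range minus start of the second range.

    Inferred from the printed numbers, not stated on the slides.
    """
    return ranges[2][1] - ranges[1][0]

print(candidate_length(first))
print(candidate_length(second))
```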

Page 15: Detecting Inversions in Human Genome

Encode 1 JPTCHB

length 112765:

26925087 26961569 27038413 27095921 (x696)

[26925087 - 26936161] [26936185 - 26984395]

[27018432 - 27048950] [27053451 - 27098098]

length 16797:

27286339 27297153 27308501 27317801 (x430)

[27282442 - 27291838] [27292455 - 27297184]

[27308501 - 27309252] [27309746 - 27318505]

Page 16: Detecting Inversions in Human Genome

Encode 2 CEU

length 146580:

89679961 89740881 89846316 89856918 (x10169)

[89629528 - 89702509] [89703442 - 89751478]

[89842982 - 89850022] [89851175 - 89971133]

length 103202:

89984366 90038027 90141147 90162545 (x4464)

[89960639 - 90037168] [90037945 - 90074697]

[90125136 - 90141147] [90143267 - 90244055]

Page 17: Detecting Inversions in Human Genome

Encode 2 JPTCHB

length 61931:

89740469 89777036 89815696 89844587 (x7363)

[89740469 - 89753274] [89754595 - 89783950]

[89807767 - 89816526] [89817163 - 89869295]

length 241177:

90147369 90237945 90461335 90485128 (x5137)

[90071367 - 90186818] [90223524 - 90325391]

[90457540 - 90464701] [90468056 - 90493804]

Page 18: Detecting Inversions in Human Genome

Encode 3 CEU

length 53311:

126434362 126444935 126484991 126520444 (x6392)

[126430928 - 126434467] [126435292 - 126461428]

[126483937 - 126488603] [126489707 - 126537051]

length 79164:

126717787 126750681 126810226 126838912 (x4294)

[126653273 - 126730160] [126731062 - 126753794]

[126810226 - 126810226] [126811293 - 126868969]

Page 19: Detecting Inversions in Human Genome

Encode 3 JPTCHB

length 53311:

126434155 126435292 126484017 126489707 (x8664)

[126434155 - 126434467] [126435292 - 126461428]

[126483937 - 126488603] [126489707 - 126534298]

length 56719:

126499913 126517706 126563455 126598442 (x2480)

[126461428 - 126509693] [126510624 - 126536076]

[126558033 - 126567343] [126567738 - 126622425]

Page 20: Detecting Inversions in Human Genome

Problems

Grouping algorithm not very good
  Many redundant groups
  Not weighting sets
Some candidate inversions overlap others
Seems to be detecting too many
Very slow and inefficient

Page 21: Detecting Inversions in Human Genome

Extensions

Improve grouping algorithm
  Add weighting of sets
  Combine similar groups
  Filter out sets which are likely outliers
Use other inversion detection techniques
Use length constraints to filter out sets and groups
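As a rough illustration of the last extension, a length constraint could be applied to candidate groups once their lengths are known. The lengths below are copied from the results slides; the bounds of 20,000 and 200,000 bp and the function name are placeholders chosen for the example, not values proposed in the talk.

```python
# (label, length in bp) pairs copied from the results slides.
candidates = [
    ("Encode 1 CEU", 138206),
    ("Encode 1 CEU", 24723),
    ("Encode 1 JPTCHB", 112765),
    ("Encode 1 JPTCHB", 16797),
    ("Encode 2 JPTCHB", 241177),
]

def filter_by_length(items, min_len, max_len):
    """Keep candidates whose length lies within [min_len, max_len]."""
    return [(label, length) for label, length in items
            if min_len <= length <= max_len]

# Hypothetical bounds: drop candidates shorter than 20 kb or longer than 200 kb.
kept = filter_by_length(candidates, min_len=20_000, max_len=200_000)
for label, length in kept:
    print(label, length)
```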

