Genomic studies of speciation and gene flow

Why study speciation genomics?

Find speciation genes

How do genomes diverge?

Long-standing questions (role of geography/gene flow)

Genomic divergence during speciation

evolution.berkeley.edu

1. Speciation as a by-product of physical isolation

2. Speciation due to selection – without isolation

[Figure: Sd frequency against transect position (km)]

Cline theory - e.g. Barton and Gale 1993
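The cline plotted on this slide can be sketched numerically. The block below is a minimal illustration, assuming the common tanh cline shape in which the width is the inverse of the maximum gradient; the function name, the centre (130 km) and the width (20 km) are hypothetical values chosen for illustration and are not taken from the slides.

```python
import math


def cline_frequency(x_km, centre_km, width_km, p_min=0.0, p_max=1.0):
    """Expected allele frequency at transect position x_km under a tanh cline.

    width_km is defined as 1 / (maximum gradient) when p_min=0 and p_max=1.
    """
    scaled = 2.0 * (x_km - centre_km) / width_km
    return p_min + (p_max - p_min) * 0.5 * (1.0 + math.tanh(scaled))


# Hypothetical cline: centre at 130 km, width 20 km (illustrative values only).
positions = list(range(80, 181, 20))
expected = [round(cline_frequency(x, 130.0, 20.0), 3) for x in positions]

for x, p in zip(positions, expected):
    print(f"{x} km: expected frequency {p}")
```

Fitting the centre and width to observed frequencies would need a likelihood or least-squares step that this sketch leaves out.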

Genomic divergence during speciation

evolution.berkeley.edu; Wu 2001, JEB

[Figure: axis "Time"; annotation "?"]

Stage 1 - one or few loci under disruptive selection

[Figure: FST along the genome, with the gene under selection marked]

Feder, Egan and Nosil TiG

Stage 2 - Divergence hitchhiking

[Figure: FST along the genome]

Stage 2b - Inversion

Inversion links co-adapted alleles

[Figure: FST along the genome]

Stage 3 - Genome hitchhiking

[Figure: FST along the genome]

Stage 4 - Genome-wide isolation

[Figure: FST along the genome]

Some sub-species clearly in stage 1

• Wing pattern “races” of Heliconius melpomene

[Figure: allele frequency against transect position (km) for samples from 1986 and 2011. Panels: b, D, N, Yb, D, Cr and Sd frequency. Column labels: Heliconius erato, Heliconius melpomene]

Some sub-species clearly in stage 1

S. H. Martin et al. Genome Res. 23, 1817–1828 (2013). O. Seehausen et al. Nat. Rev. Genet. 15, 176–92 (2014).

• Wing pattern “races” of Heliconius melpomene

B (red/orange patterns) Yb (yellow/white patterns)

Some sub-species clearly in stage 1

• Carrion and hooded crows

Poelstra, J. W. et al. Science 344, 1410–4 (2014).

FST

And here is a recent example with multiple islands

Malinsky et al., Science 350, 1493 (2015).

Other species have islands…but are they real?

Ellegren et al. Nature 491, 756- (2012).

Other species have islands…but are they real?

Anopheles gambiae and A. coluzzii (formerly the M and S forms of A. gambiae)

Clarkson et al. 2014 Nature Communications

Other species have islands…but are they real?

Seehausen et al., Nature Reviews Genetics, 2014

What do patterns of Fst really mean?

• Fst measures relative divergence

• Peaks indicate regions of higher-than-expected between-population divergence, given the within-population diversity

• Peaks can therefore result from reduced diversity within species

• This could be due to locally lower Ne within species (selective sweeps, background selection)

• So peaks are NOT NECESSARILY due to reduced gene flow

Note that sometimes sweeps within species = speciation genes

Sweeps across the species barrier can also lead to Fst peaks

Double peaks??

Nicolas Bierne, Daniel Berner and others

Anopheles M-S divergence

Relative divergence higher in low recombination regions - not significant for absolute divergence

see also: Charlesworth 1998 MBE Measures of divergence…
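The contrast between relative (Fst) and absolute (Dxy) divergence can be made concrete with a toy calculation. The sketch below assumes one common definition, Fst = 1 - (mean within-population diversity) / Dxy; the haplotypes are invented for illustration, and the function names are hypothetical rather than taken from any package.

```python
from itertools import combinations, product


def pairwise_diff(a, b):
    """Proportion of sites that differ between two equal-length haplotypes."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def pi_within(haps):
    """Mean pairwise difference among haplotypes from one population."""
    pairs = list(combinations(haps, 2))
    return sum(pairwise_diff(a, b) for a, b in pairs) / len(pairs)


def dxy_between(haps1, haps2):
    """Mean pairwise difference between haplotypes from two populations."""
    pairs = list(product(haps1, haps2))
    return sum(pairwise_diff(a, b) for a, b in pairs) / len(pairs)


def fst(haps1, haps2):
    """Relative divergence: 1 - mean within-population diversity / Dxy."""
    within = (pi_within(haps1) + pi_within(haps2)) / 2
    return 1 - within / dxy_between(haps1, haps2)


# Toy window A: both populations carry within-population variation.
diverse_1 = ["10000000", "01000000"]
diverse_2 = ["00100000", "00010000"]

# Toy window B: same absolute divergence, but no within-population diversity
# (as might follow a sweep or strong background selection).
swept_1 = ["10000000", "10000000"]
swept_2 = ["00100000", "00100000"]

for label, h1, h2 in [("A", diverse_1, diverse_2), ("B", swept_1, swept_2)]:
    print(label, "Dxy =", dxy_between(h1, h2), "Fst =", fst(h1, h2))
```

In this toy case Dxy is identical in the two windows while Fst differs, which is the sense in which an Fst peak need not reflect reduced gene flow.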

No evidence for higher Dxy in wing pattern loci

S. H. Martin et al. Genome Res. 23, 1817–1828 (2013). O. Seehausen et al. Nat. Rev. Genet. 15, 176–92 (2014).

• Wing pattern “races” of Heliconius melpomene

B (red/orange patterns) Yb (yellow/white patterns)

Suggestion that we use absolute measures of divergence?

Understanding genomic divergence

No single statistic will capture the complex history of mutation, migration and selection

Patterns need to be interpreted in the specific context of the study species

Much better to use explicit tests for gene flow

Need to design sampling so the expectations in the absence of gene flow are clear and testable

The key is to identify ‘control’ populations that are not influenced by admixture

Explicit tests for gene flow: Neanderthal genome

• Isolated DNA from bones 38,000 yrs old in Croatia

• We diverged from Neanderthals around 270,000-440,000 yrs ago

• Evidence for gene exchange with humans (1-4% of genome?)

Green et al. 2010 Science 328:710-722

Explicit tests for gene flow: ABBA-BABA test

EXPECT: 50% ABBA 50% BABA

OBSERVE: 103612 ABBA 94029 BABA

Gene flow

Green et al. 2010 Science 328:710-722
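The comparison of expected and observed counts on this slide is usually summarised as the D statistic, D = (ABBA - BABA) / (ABBA + BABA). The sketch below applies that formula to the counts quoted above; the function name is hypothetical, and the calculation says nothing about significance, which in practice is usually assessed with a block jackknife across the genome.

```python
def patterson_d(n_abba, n_baba):
    """D statistic from counts of ABBA and BABA site patterns."""
    return (n_abba - n_baba) / (n_abba + n_baba)


# Counts quoted on the slide (Green et al. 2010).
n_abba = 103612
n_baba = 94029

d = patterson_d(n_abba, n_baba)
abba_share = n_abba / (n_abba + n_baba)

print(f"D = {d:.4f}")
print(f"ABBA share of informative sites = {abba_share:.4f}")
```

With no gene flow the ABBA share is expected to be 0.5 and D to be 0; here D is about 0.048, an ABBA share of about 52.4%.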

Explicit tests for gene flow: Combining multiple signals

The genomic landscape of Neanderthal ancestry in present-day humans - Sankararaman et al. Nature 2014

1) Derived alleles at high frequency shared with Neanderthal

2) High divergence to Africa but low to Neanderthal

3) Long haplotype blocks

Explicit tests for gene flow: Heliconius butterflies

Martin et al., Genome Research 2013

Many sources of reproductive isolation:

• Female hybrids are sterile

• Different host plant use

• Different habitat preference

• Strong assortative mating

Explicit tests for gene flow: Heliconius butterflies

• A much larger proportion of the genome shows gene flow than in the human-Neanderthal case

• Similarly strong effect on sex chromosome

Burri et al., Genome Research 2015

Sequenced 20 individuals per population at 20x coverage

An alternative is to take an explicit modelling approach

IM and IMa Jody Hey

Martin et al., Biorxiv 2015

So far models have mostly just estimated genome-wide parameters…assuming the genome is homogeneous

Where we need to go next is to incorporate genome heterogeneity in selection and recombination

High density linkage maps to map the recombination landscape

[Figure: high-density linkage maps for chromosomes 1-10; axis: chromosome length (Megabases); per-marker map positions omitted]

The effect of background selection on introgression in humans

Harris and Nielsen Biorxiv 2015

Admixture is lower in gene-rich regions, supporting this model

Population and speciation genomics: Conclusions

• Great power to detect subtle signals of selection and gene flow

• Can make more general observations about genes and regions involved in adaptation

• BUT genomic processes complicate the picture

• Best approaches combine multiple signals to infer process

• Eventually we need to combine background selection, recombination, positive selection

And finally a shameless plug….

Adaptive introgression

Photo credit Andrei Sourakov

43 species 77 species

Fritz Müller

G-test: G = 7.25, d.f. = 1, p = 0.007

[Figure: number of models attacked for H. melpomene, H. cydno and F1 hybrid models, with the expected number marked]

Merrill et al., Proc. Roy. Soc 2012
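The p-value quoted for the G-test on this slide can be checked from G and the degrees of freedom alone, because G is compared against a chi-square distribution. The sketch below covers only the 1 d.f. case, where the tail probability reduces to a complementary error function; the function name is hypothetical.

```python
import math


def chi_square_tail_df1(statistic):
    """Upper-tail probability of a chi-square distribution with 1 d.f."""
    return math.erfc(math.sqrt(statistic / 2.0))


g_value = 7.25  # G statistic quoted on the slide
p_value = chi_square_tail_df1(g_value)
print(f"G = {g_value}, d.f. = 1, p = {p_value:.3f}")
```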

102 204 76

Several major loci control Heliconius patterns

Reed et al., 2011 Science

Richard Wallbank

Cross-species sharing

Heliconius Genome Consortium Nature 2012

Phylogenies across B/D

ML tree based on 50,000 bp

Heliconius Genome Consortium Nature 2012

Okay, so introgression causes mimicry

But mimicry is weird, right?

Novelty can arise through introgression and recombination

Heliconius cydno cordula (NNbb)

Heliconius melpomene melpomene (nnBB)

Heliconius heurippa (NNBB)

Mavarez et al., Nature 2006

Camilo Salazar

[Figure: the optix region (scale bar: 100 kb), with tracks labelled Red and Blue]

Wallbank et al., PLoS Biology 2016

Generate dated trees using this node as a reference point

Wallbank et al., PLoS Biology 2016

[Figure: dated tree with tips H. cydno, H. pachinus, H. heurippa, H. tristero, H. timareta, H. melpomene, H. elevatus, H. pardalinus, H. luciana, H. athis, H. hecale, H. ethilla, H. nattereri, H. besckei, H. ismenius and H. numata; time axis: Million Years Ago (4 to 0); labels A to E]

Wallbank et al., PLoS Biology 2016

What about behaviour?

[Figure: H. melpomene X H. cydno hybrids; x-axis: genotype (bb, Bb); y-axes: number of approaches, and difference between approaches to cydno and melpomene]

Richard Merrill

Mate preference segregates with forewing colour in backcross hybrids

[Figure: relative probability of courting H. melpomene females, with the measured preferences of H. melpomene and H. cydno marked]

Large effect: 35% of the difference between parental species

G = 58.64, P << 0.001

Merrill et al. Proc Roy Soc B, 2011

Heliconius cydno cordula (NNbb)

Heliconius melpomene melpomene (nnBB)

Heliconius heurippa (NNBB)

Mavarez et al., Nature 2006

[Figure: number of approaches (0-40) for cordula, heurippa and melpomene]

Melo et al., Evolution 2009

Now working on QTL maps of species differences in behaviour:

[Figure: three panels of LOD score (0-4) by chromosome (1-20 and X)]

Lamichhaney et al., Nature 2015

Pointy

Blunt

ALX1 associated with beak shape

Most of these studies use phenotype associations to identify introgressed loci

But can we identify them a priori using the ABBA-BABA method?

Explicit tests for gene flow: ABBA-BABA test

Green et al. 2010 Science 328:710-722

D is quite dependent on the number of informative sites (the denominator)

D is not at all good at detecting outlier windows

Martin et al., MBE 2014
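One way to see why D behaves badly in small windows is to ask how much it varies by chance when there is no gene flow at all. The sketch below is a simplified illustration, assuming each informative site is independently ABBA or BABA with probability 0.5 (real sites are linked, so this is optimistic); the function name is hypothetical.

```python
from math import comb, sqrt


def null_sd_of_d(n_sites):
    """Exact standard deviation of D with n_sites informative sites and no gene flow.

    Assumes each site is independently ABBA or BABA with probability 0.5.
    """
    variance = 0.0
    for n_abba in range(n_sites + 1):
        prob = comb(n_sites, n_abba) * 0.5 ** n_sites
        n_baba = n_sites - n_abba
        d = (n_abba - n_baba) / n_sites  # (ABBA - BABA) / (ABBA + BABA)
        variance += prob * d * d  # the mean of D is 0 under this null
    return sqrt(variance)


for n in (4, 25, 100, 400):
    print(f"{n} informative sites: sd of D under the null = {null_sd_of_d(n):.3f}")
```

Under these assumptions the spread of D shrinks as one over the square root of the number of informative sites, so windows with few informative sites are the most likely to produce extreme values by chance.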

Martin’s F

Where S is the numerator from the D equation, and f is the fraction of introgression compared to the maximum possible

Martin et al., MBE 2014

But Martin’s F is quite good at finding the introgression outliers

Martin et al., MBE 2014
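The equation itself did not survive on this slide, so the sketch below rests on an assumption: that "Martin's F" corresponds to the f_d estimator, in which the numerator S of D (computed from derived allele frequencies) is divided by the value S would take under complete introgression, with the donor at each site taken as whichever of P2 and P3 has the higher derived allele frequency. Function names and the toy frequencies are hypothetical.

```python
def abba(p1, p2, p3, po):
    """Expected ABBA weight at one site from derived allele frequencies."""
    return (1 - p1) * p2 * p3 * (1 - po)


def baba(p1, p2, p3, po):
    """Expected BABA weight at one site from derived allele frequencies."""
    return p1 * (1 - p2) * p3 * (1 - po)


def s_numerator(sites):
    """S: sum over sites of ABBA minus BABA (the numerator of D)."""
    return sum(abba(*site) - baba(*site) for site in sites)


def d_statistic(sites):
    """D: S divided by the sum of ABBA plus BABA."""
    denominator = sum(abba(*site) + baba(*site) for site in sites)
    return s_numerator(sites) / denominator


def f_d(sites):
    """Assumed f_d: S divided by S under complete introgression from the donor."""
    donor_sites = [(p1, max(p2, p3), max(p2, p3), po) for p1, p2, p3, po in sites]
    return s_numerator(sites) / s_numerator(donor_sites)


# Toy window: one site, tuple is (P1, P2, P3, outgroup) derived allele frequency.
# P2 carries the P3 allele at frequency 0.5.
window = [(0.0, 0.5, 1.0, 0.0)]
print("D   =", d_statistic(window))
print("f_d =", f_d(window))
```

In this toy window D is 1.0 even though only half of P2 carries the P3 allele, while the assumed f_d is 0.5, in line with the slide's description of f as a fraction of the maximum possible.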

• Smith and Kronforst argued that introgression could be inferred where ABBA-BABA outliers showed lower Dxy compared to genome-wide average

• Be wary of window-based D statistics

• F is better than D…

• Sampling design is very important!

Implications for tree-thinking

Archaea

Eukarya

Bacteria

The tree of life is reticulated

Mallet, Hahn and Besansky BioEssays 2015 Hahn and Nakhleh Evolution 2015

Okay, so what have we learnt and where do we go from here?