Rapid Evolution of Two Odorant-Binding Protein Genes, Obp57d and Obp57e, in the Drosophila melanogaster Species Group

Takashi Matsuo¹

Department of Biological Sciences, Tokyo Metropolitan University, Tokyo 192-0397, Japan

Manuscript received July 18, 2007
Accepted for publication December 2, 2007

Genetics 178: 1061–1072 (February 2008)
Copyright © 2008 by the Genetics Society of America
DOI: 10.1534/genetics.107.079046

Sequence data from this article have been deposited with the DNA Data Bank of Japan under accession nos. AB370270–AB370291.

¹Address for correspondence: Department of Biological Sciences, Tokyo Metropolitan University, 1-1 Minami Osawa, Hachioji, Tokyo 192-0397, Japan. E-mail: [email protected]

ABSTRACT

Genes encoding odorant-binding protein (OBP) form a large family in an insect genome. Two OBP genes, Obp57d and Obp57e, were previously identified to be involved in host-plant recognition in Drosophila sechellia. Here, by comparing the genomic sequences at the Obp57d/e locus from 27 Drosophila species, we found large differences in gene number between species. Phylogenetic analysis revealed that Obp57d and Obp57e in the D. melanogaster species group arose by gene duplication of an ancestral OBP gene that remains single in the obscura species group. Further gain and loss of OBP genes were observed in several lineages in the melanogaster group. Site-specific analysis of evolutionary rate suggests that Obp57d and Obp57e have functionally diverged from each other. Thus, there are two classes of gene number differences in the Obp57d/e region: the difference of the genes that have functionally diverged from each other and the difference of the genes that appear to be functionally identical. Our analyses demonstrate that these two classes of differences can be distinguished by comparisons of many genomic sequences from closely related species.

GENES involved in the animal chemosensory system, such as the olfactory and gustatory receptor-encoding genes, tend to form large families in a genome. Size differences in these multigene families among animal species were explained by differences in selection pressure maintaining functional genes. For example, the higher proportion of olfactory receptor pseudogenes in monkeys was explained by the acquisition of full trichromatic color vision that reduced the dependence of these species on olfactory cues (Gilad et al. 2004). Also, the loss of gustatory receptor functions in primates was suggested to be the result of changes in the environment and species-specific food preference (Go et al. 2005).

With the completion of many genome sequences, however, comparisons of genomic data have raised a question of whether all the differences in multigene-family size are consequences of selection. Alternatively, they might be caused by merely a stochastic gain and loss of genes. Indeed, results from genome analyses revealed that, at least in part, the size difference in multigene families between species can be explained by neutral evolution (Karev et al. 2003, 2004; Reed and Hughes 2004; Hahn et al. 2005; De Bie et al. 2006; Rudnicki et al. 2006). On the basis of these observations, it was proposed that the size difference in multigene families is not a consequence, but a cause of evolutionary changes in phenotypes (Nei 2005).

These two theories, however, may not look at the same phenomenon. It is known that genes generated by a duplication undergo two successive but distinct stages of evolution (Lynch and Conery 2000). At the earlier stage, the two genes are functionally identical and tend to be reduced to a single gene by degeneration of either gene. Once they have functionally diverged from each other, however, both genes independently contribute to fitness, and selection pressure maintains the two genes stably for a long time. The selection model of gene-family evolution may explain differences at the later stage, while the stochastic gain-and-loss model may fit events occurring at the earlier stage. Thus, it is important to know which stage contributes more to the size difference between the gene families of interest, because it influences the conclusion of analyses.

Genes encoding odorant-binding proteins (OBP), secreted molecules that function in insects' chemosensilla, form a large family in an insect genome. In the Drosophila melanogaster genome, there are ~50 OBP genes; this number is comparable to that of odorant receptors and gustatory receptors (Galindo and Smith 2001; Graham and Davies 2002; Hekmat-Scafe et al. 2002). Similarly to those of the chemoreceptor gene families, the size of the OBP gene family also varies between species, suggesting that OBP genes are under the control of the same kind of evolutionary mechanisms that determine the sizes of chemoreceptor gene families (Xu et al. 2003; Forêt and Maleszka 2006).

Two OBP genes, Obp57d and Obp57e, have been identified to be responsible for species-specific host-plant preference in D. sechellia (Matsuo et al. 2007). These genes are expected to be under selective pressure by ecological conditions, and their evolution might have affected behavior also in other species. Here, by comparing the genomic sequences at the Obp57d/e locus from 27 Drosophila species, we revealed the rapid evolution of the two OBP genes, which resulted in gene-number differences between species. The differences were divided into two classes: the difference of the genes that have functionally diverged from each other and the difference of the genes that appear to be at the early stage of evolution after gene duplication and thus have not yet functionally diverged. Our findings demonstrate that the comparative analysis of many genomic sequences from closely related species is useful for discrimination between these two classes of differences.

MATERIALS AND METHODS

Dot-plot analysis: Dot-plot analysis between D. melanogaster and D. pseudoobscura genomic sequences was carried out using Dotlet 1.5 with the following settings: window size, 59; threshold, 30 (Junier and Pagni 2000).
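As a rough illustration of what a window size and a threshold control in a dot plot, the sketch below marks a dot wherever two windows share enough identical positions. It is a simplified stand-in and not Dotlet's own scoring: Dotlet sums scores from a substitution matrix, whereas this sketch assumes a plain identity count. The function name `dot_plot` and its parameters are hypothetical.

```python
def dot_plot(seq_a, seq_b, window=5, min_matches=4):
    """Return (i, j) window start positions that would be drawn as dots.

    A dot is placed when the windows seq_a[i:i+window] and
    seq_b[j:j+window] have at least `min_matches` identical positions.
    Assumption: identity scoring only (no substitution matrix).
    """
    hits = []
    for i in range(len(seq_a) - window + 1):
        win_a = seq_a[i:i + window]
        for j in range(len(seq_b) - window + 1):
            win_b = seq_b[j:j + window]
            # Count positions where the two windows carry the same residue.
            matches = sum(a == b for a, b in zip(win_a, win_b))
            if matches >= min_matches:
                hits.append((i, j))
    return hits


# A sequence compared with itself gives the main diagonal plus
# off-diagonal dots for the internal repeat.
dots = dot_plot("ACGTACGT", "ACGTACGT", window=4, min_matches=4)
```

A conserved gene order between two species shows up as a run of dots along one diagonal, and a duplicated gene in one species as two parallel runs against a single region in the other.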

Fly stocks: The fly stocks used in this study are listed in Table 1. All the stocks are maintained in our laboratory, except for D. melanogaster and D. pseudoobscura, whose genomic sequences were obtained from FlyBase (release 5.1 and 2.0, respectively).

Sequencing of the Obp57d/e region: The Obp57d/e genomic region was amplified by PCR with KOD plus enzyme (Toyobo, Tokyo), using the primers P1 5′-AGCCACAAACTGGAGGACAG-3′ and P2 5′-GCCTCCAGGCCGTCGAACTC-3′, which recognize highly conserved regions between GA14778 and CG18066 and between GA15677 and CG30148, respectively. For all the fly species, a single band was obtained. The PCR-amplified fragment was purified using a QIAquick spin column (QIAGEN, Valencia, CA) and directly sequenced with the primers listed in supplemental Table S1 at http://www.genetics.org/supplemental/. PCR amplification was independently carried out at least three times for each species, and the most frequent observation was adopted when there was inconsistency among replicates. D. simulans, D. sechellia, and D. mauritiana sequences were previously deposited in the DNA Data Bank of Japan (Matsuo et al. 2007).

ORF identification: The genomic sequences at the Obp57d/e region were searched for second exons (ORF) using an OBP signature cysteine motif (C-X10-C-X8-C). First exons (ORF) were determined among possible ORFs using the following criteria: starts with ATG, length is sufficient (>50 bp), and exon–intron boundary can be assigned to form an in-frame connection with the corresponding second exon. By using these criteria, the first exon and the exon–intron boundary could be uniquely determined for every second exon found by using the C-X10-C-X8-C motif.
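The motif search described above can be sketched as a regular-expression scan over translated reading frames. This is a minimal sketch, not the procedure actually used in the study: it assumes the standard genetic code, scans only the three forward frames (the reverse complement would need the same treatment), and the names `translate` and `find_obp_motif` are hypothetical.

```python
import re
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# C-X10-C-X8-C, requiring that no stop codon ('*') falls inside a spacer.
OBP_MOTIF = re.compile(r"C[^*]{10}C[^*]{8}C")


def translate(dna, frame):
    """Translate `dna` starting at offset `frame` (0, 1, or 2)."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    # Codons with ambiguous bases are written as 'X'.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def find_obp_motif(dna):
    """Return (frame, amino-acid start index, matched peptide) tuples."""
    dna = dna.upper()
    hits = []
    for frame in range(3):
        protein = translate(dna, frame)
        for match in OBP_MOTIF.finditer(protein):
            hits.append((frame, match.start(), match.group()))
    return hits
```

Each hit marks a candidate second exon; the first-exon criteria (ATG start, length, in-frame splice junction) would then be applied upstream of it.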

Amino acid sequence analyses: Alignment and phylogenetic analyses of the deduced amino acid sequences were carried out using the MEGA3.1 sequence analysis package (Kumar et al. 2004). The signal peptide sequence was predicted using SignalP 3.0 (Bendtsen et al. 2004). The ancestral amino acid sequence at each internal node was inferred by the maximum-parsimony (MP) method (Fitch 1971). An original script running on the R statistical package was used for ancestral state inference and for counting the number of amino acid substitution events for each site (see supplemental materials at http://www.genetics.org/supplemental/ for the script and a detailed description). Types I and II functional divergences between Obp57d and Obp57e were examined using DIVERGE 2.0 (Gu and Velden 2002; Gu 2006).
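To make the substitution counting concrete, the following is a minimal sketch of the textbook Fitch parsimony pass for a single alignment column. It is not the author's R script, which is available only in the supplemental materials; it assumes a strictly bifurcating tree written as nested tuples, and the leaf names are hypothetical.

```python
def fitch(tree, states):
    """Bottom-up Fitch pass for one alignment site.

    `tree` is a leaf name (str) or a pair (left, right) of subtrees.
    `states` maps each leaf name to its amino acid at this site.
    Returns (candidate ancestral states, minimum number of substitutions).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_states, left_cost = fitch(left, states)
    right_states, right_cost = fitch(right, states)
    shared = left_states & right_states
    if shared:
        # The children agree on at least one state: no extra substitution.
        return shared, left_cost + right_cost
    # The children disagree: keep both options and count one substitution.
    return left_states | right_states, left_cost + right_cost + 1


# Hypothetical four-species tree and one alignment column.
tree = (("sp1", "sp2"), ("sp3", "sp4"))
column = {"sp1": "L", "sp2": "L", "sp3": "P", "sp4": "P"}
root_states, n_substitutions = fitch(tree, column)
```

Running the function over every column of an alignment and collecting the second return value gives a per-site substitution count of the kind summarized in Table 2.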

RESULTS

Gene number difference between species at the Obp57d/e locus: In the D. melanogaster genome, genes encoding OBP form clusters on each chromosome (Graham and Davies 2002; Hekmat-Scafe et al. 2002). One of the two OBP gene clusters at cytological position 57 of the second chromosome consists of Obp57d and Obp57e (Figure 1A). These two genes are tightly surrounded by CG18066 and CG30148, forming a gene-dense region. The position and the order of these surrounding genes are conserved in the D. pseudoobscura genome, but there is only one OBP gene (GA15675) between GA14778 and GA15677, the orthologs of D. melanogaster CG18066 and CG30148, respectively (Figure 1B). A homology search using BLAST for Obp57d, Obp57e, and D. pseudoobscura GA15675 against the D. pseudoobscura genome failed to hit any other OBP genes, indicating that D. pseudoobscura has only one gene that corresponds to D. melanogaster Obp57d or Obp57e.

To determine whether the observed difference in gene number was caused by gene duplication or gene loss, we sequenced the Obp57d/e region from 25 additional species phylogenetically located between D. pseudoobscura and D. melanogaster (Table 1). Two OBP genes were found in most species, but some species have only one OBP gene. Furthermore, several species (i.e., D. takahashii, D. biarmipes, D. ficusphila, and D. elegans) have more than two OBP genes in this region (Figure 2).

Figure 3 shows the minimum evolution (ME) tree of the deduced amino acid sequences of the OBP genes. Obp57a and Obp57b in D. melanogaster were used as the outgroup. The OBP genes in the obscura group branched first, and then the other OBP genes formed two major clades, which correspond to D. melanogaster Obp57d and Obp57e. The two clades appear to be equally diverged from the OBP genes in the obscura group, indicating that Obp57d and Obp57e were generated by a gene duplication event at the very early stage of evolution of the melanogaster group.

Figure 4 shows the genomic structure of the Obp57d/e region of all the species along with the phylogenetic tree of the species. The difference in OBP gene number between the species revealed that there have been multiple duplication and deletion events during the evolution of the melanogaster group. The first duplication occurred at the very early stage of the melanogaster group evolution, generating Obp57d and Obp57e from an ancestral OBP gene, which remains single in the obscura group (Figure 4, arrow 1). Immediately after the duplication, Obp57e was lost in the ananassae subgroup (arrow 2). The BLAST search for Obp57e against the genomic sequence of D. ananassae failed to hit any other OBP genes, proving that Obp57e was not translocated to another position but was completely lost from the D. ananassae genome. In contrast to the ananassae subgroup, Obp57d was lost in the auraria–rufa lineage (arrow 3).

Further duplication of the remaining OBP gene was observed in D. varians and D. constricta. Extra copies of Obp57d genes are also observed in D. takahashii, D. biarmipes, D. ficusphila, and D. elegans. The exact number and timing of each duplication event cannot be determined because the phylogenetic relationship between these species is not clearly resolved. Nevertheless, structural similarity between these multiple genes (Figure 3), considered together with the phylogenetic relationship between species (Figure 4), indicates that most of the multiple genes were probably duplicated independently in each lineage. Indeed, Obp57d gene number differs between the two closely related species, D. takahashii and D. biarmipes, suggesting that this difference evolved rapidly. We also found that in D. takahashii and D. elegans, one of these duplicated Obp57d genes has degenerated into a pseudogene by frameshift mutations (Figure 2). Because such tandem repeats of similar genes are known to fluctuate easily in their number, there might be an intraspecies variation of gene number in these species.

Figure 1. Genomic structure around the Obp57d/e region. (A) Genomic structure around two OBP gene clusters at cytological position 57 on the second chromosome of D. melanogaster. (B) Dot-plot analysis of the Obp57d/e genomic region in D. melanogaster and D. pseudoobscura.

Site-specific analysis of functional constraint against amino acid substitution: Obp57d or Obp57e knock-out flies showed similar changes in behavioral response to octanoic acid, indicating that these two OBP genes, at least in part, share the same function in perception of octanoic acid (Matsuo et al. 2007). We searched for amino acid sites that are conserved between Obp57d and Obp57e. For the analysis, we selected species with single copies of Obp57d and Obp57e genes, to ensure better conservation of functions in each gene (Figure 5A). A bifurcating tree is not adequate to describe the phylogenetic relationship between the selected species. Thus, we employed the multibranched tree for ancestral state inference by the MP method, and the substitution events at all branches were counted. The total numbers of substitutions were almost the same between Obp57d and Obp57e, indicating that the strengths of overall constraints on these two genes are equivalent to each other (Table 2). When the distribution of the number of amino acid substitutions at each site was analyzed, 16 sites were conserved, more than expected under the negative binomial distribution (see supplemental Figure S1 at http://www.genetics.org/supplemental/). Amino acids at these sites are shown with those in D. pseudoobscura and D. obscura (Table 3). In addition to the six OBP-signature cysteines, three sites at positions 59, 97, and 124 are conserved between the obscura and melanogaster groups. They are the candidates for the amino acids that determine the common function between Obp57d and Obp57e.
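The comparison against a negative binomial expectation can be illustrated as follows. This is a sketch under stated assumptions, not the fitting procedure of the study (which is described only in the supplemental materials): it assumes a method-of-moments fit, and the per-site counts in the example are invented for illustration.

```python
def expected_zero_sites(counts):
    """Expected number of sites with zero substitutions under a
    negative binomial fitted to `counts` by the method of moments.

    With mean m and variance v (v > m), the size parameter is
    r = m^2 / (v - m) and P(count = 0) = (r / (r + m)) ** r.
    """
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    if var <= mean:
        raise ValueError("no overdispersion: a negative binomial cannot be fitted")
    size = mean ** 2 / (var - mean)
    p_zero = (size / (size + mean)) ** size
    return n * p_zero


# Hypothetical substitution counts for ten alignment sites.
counts = [0, 0, 0, 1, 2, 3, 4, 5, 6, 9]
observed_zero = counts.count(0)
expected_zero = expected_zero_sites(counts)
# More zero-substitution sites than expected suggests a class of
# sites held invariant by functional constraint.
excess = observed_zero - expected_zero
```

The same logic applied to the real per-site counts would compare the 16 conserved sites with the number the fitted distribution predicts.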

TABLE 1

Species stocks used in this research

Species         Stock ID        Origin                              Accession no. (a)
D. obscura      -               -                                   AB370270
D. ananassae    14024-0371.00   San Luis Potosi, Mexico             AB370271
D. merina       -               Madagascar                          AB370272
D. varians      k-aar001        Luzon, Philippines                  AB370273
D. simulans     S357            1979, Oiso, Japan                   AB232138
D. mauritiana   G71             1979.8, Mauritius                   AB232140
D. sechellia    SS86            1986, Praslin Island, Seychelles    AB232139
D. yakuba       L42             1979.9, Kenya                       AB370274
D. erecta       -               -                                   AB370275
D. takahashii   RGN182          1982.1, Rangoon, Myanmar            AB370276
D. biarmipes    CJB214          1981.12, Coimbatore, India          AB370277
D. eugracilis   WAU155          1981.7, Penang, Malaysia            AB370278
D. ficusphila   Yakushima       1982.10, Yakushima, Japan           AB370279
D. fuyamai      CM771           Chiengmai, Thailand                 AB370280
D. elegans      IRM1980         1980, Iriomote, Japan               AB370281
D. auraria      Miyazaki 2      1980.10, Miyazaki, Japan            AB370282
D. subauraria   -               Sapporo, Japan                      AB370283
D. constricta   -               Taiwan                              AB370284
D. rufa         TMU             Tokyo                               AB370285
D. tani         Hangzhou 91i    1991, Zhejiang, China               AB370286
D. jambulina    NHO-115         1981.12, Nagarahole, India          AB370287
D. watanabei    SWB185          1982.1, Shwebo, Myanmar             AB370288
D. barbarae     MMY317          1982, Maymyo, Myanmar               AB370289
D. kikkawai     HNL122          1981, Honolulu                      AB370290
D. lini         RGN206          Rangoon, Myanmar                    AB370291

(a) Accession numbers for the Obp57d/e genomic sequences.

Functional divergence between Obp57d and Obp57e: The ME tree of the amino acid sequences supported the two clades representing Obp57d and Obp57e (Figure 3). To examine whether this pattern reflects the functional divergence between the two genes, we examined the site-specific difference in evolutionary rate between the two clades using DIVERGE (Gu and Velden 2002; Gu 2006). The software examines two types of functional divergence: type I functional divergence, in which the site-specific evolutionary rate is different between two clades, and type II functional divergence, in which functionally different amino acids are conserved at the corresponding sites between two clades. In general, the type I functional divergence is observed when the functional constraint on particular sites was lost in either clade and maintained in the other clade, while the type II divergence is observed when a novel functional role was acquired at particular sites in either clade (Gu and Velden 2002; Roth et al. 2007). Because the software accepts only bifurcating trees, we tested two tree shapes that approximate the actual topology between the species (Figure 5, B and C). By combining maximum-likelihood estimation with ancestral sequence inference to avoid underestimation caused by multiple substitutions at a single branch, DIVERGE gave a larger estimation of the substitution number per site than that shown in Table 2 (Table 4). The coefficients for type I functional divergence (θI) between Obp57d and Obp57e are significantly different from zero for both trees 2 and 3, showing that the site-specific evolutionary rate differs between the two clades. On the other hand, the coefficients for type II functional divergence (θII) are not significantly different from zero, showing that there is no site-specific shift of amino acid property between the two clades.
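A quick way to see why one coefficient counts as different from zero and the other does not is to divide each estimate by its standard error. The sketch below applies a normal approximation to the tree 2 values reported in Table 4 (0.582 ± 0.106 for type I, −0.057 ± 0.243 for type II). The z-test is an assumption made here for illustration; it is not necessarily the test that DIVERGE itself applies.

```python
import math


def z_test(estimate, standard_error):
    """Two-sided z-test of the null hypothesis that the coefficient is zero.

    Returns (z, p), where p = P(|Z| > |z|) for a standard normal Z.
    """
    z = estimate / standard_error
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Tree 2 values from Table 4.
z_type1, p_type1 = z_test(0.582, 0.106)    # type I coefficient
z_type2, p_type2 = z_test(-0.057, 0.243)   # type II coefficient
```

The type I estimate lies more than five standard errors from zero, whereas the type II estimate lies well within one standard error, which matches the conclusion drawn in the text.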

Figure 2. Multiple alignments of Obp57d and Obp57e amino acid sequences. Deduced amino acid sequences were aligned using CLUSTAL W. "^" and "*" indicate a frameshift mutation and a nonsense mutation in putative pseudogenes, respectively. Red bars indicate the predicted signal peptide cleavage sites. Drosophila elegans Obp57d does not have a potential signal peptide sequence. Names of the sequences used in the site-specific analysis of functional constraint are boxed (see Figure 5). Orange boxes indicate the OBP signature cysteines. Green boxes indicate the amino acids conserved among Obp57d and Obp57e in all the species used in the site-specific analysis (see Table 3). Blue boxes indicate the clade-specifically conserved amino acids (see Table 5).

We further examined which site is responsible for the type I functional divergence. Figure 6 shows plots of the number of substitutions and the posterior probability of type I divergence at each site. The highest probability was observed at the sites where the number of substitutions differed greatly between Obp57d and Obp57e. Among those sites with a probability of type I divergence >0.8, six were completely conserved in either clade and substituted four times or more in the other clade (Table 5, site positions indicated by footnote a). These are candidate sites responsible for functional divergence between Obp57d and Obp57e.
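The selection rule stated in this paragraph (posterior probability above 0.8, no substitutions in one clade, four or more in the other) can be written as a small filter. The sketch below uses invented example records; the `Site` type, the function name, and all values are hypothetical and are not taken from Table 5.

```python
from typing import NamedTuple


class Site(NamedTuple):
    position: int     # alignment position
    subs_d: int       # substitutions counted in the Obp57d clade
    subs_e: int       # substitutions counted in the Obp57e clade
    posterior: float  # posterior probability of type I divergence


def candidate_sites(sites, min_posterior=0.8, min_subs=4):
    """Positions conserved in one clade and variable in the other."""
    picked = []
    for site in sites:
        conserved_in_one = min(site.subs_d, site.subs_e) == 0
        variable_in_other = max(site.subs_d, site.subs_e) >= min_subs
        if site.posterior > min_posterior and conserved_in_one and variable_in_other:
            picked.append(site.position)
    return picked


# Invented records: only positions 10 and 12 satisfy all three criteria.
example = [
    Site(10, 0, 5, 0.93),
    Site(11, 2, 6, 0.91),  # not fully conserved in either clade
    Site(12, 4, 0, 0.85),
    Site(13, 0, 4, 0.60),  # posterior too low
    Site(14, 0, 3, 0.95),  # too few substitutions in the variable clade
]
selected = candidate_sites(example)
```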

In the subfunctionalization of duplicated genes, ancestral functions are divided between the duplicated genes (Roth et al. 2007). Concerning Obp57d and Obp57e, the obscura group preserves a single OBP gene that is supposed to retain ancestral functions. We examined whether the clade-specifically conserved amino acids in the melanogaster group are also conserved in the obscura group. Among the 19 sites that are clade-specifically conserved in the melanogaster group, only 7 are conserved in the obscura group (Table 5). This indicates that at the other 12 sites, the evolutionary rate decreased after gene duplication. In other words, these 12 sites may reflect the newly acquired functional constraints in Obp57d and Obp57e, which did not exist in the ancestral OBP gene. However, it is possible that the OBP gene in the obscura group did not preserve the ancestral functions completely, and some of the ancestral functions have been lost.

DISCUSSION

Specifically conserved amino acids in Obp57d and Obp57e: In general, OBP genes evolve rapidly. Comparisons of OBP genes in the D. melanogaster genome revealed that the difference between amino acid sequences of the OBP genes is so large and saturated that the phylogenetic relationship between the OBP genes is not resolved (Graham and Davies 2002; Hekmat-Scafe et al. 2002). There are only three conditions that define OBP: (1) OBP has a signal sequence to be secreted, (2) OBP has six α-helical domains, and (3) OBP has six cysteines at particular intervals that are necessary for appropriate conformation. Most of the other sites in the OBP genes are not conserved at the amino acid level (Galindo and Smith 2001; Graham and Davies 2002; Hekmat-Scafe et al. 2002).

Because most OBP genes in a genome are supposed to have diverged from others in function, the comparison of the amino acid sequences of OBP genes within a genome is not effective for elucidating the relationship between the structure and the specific function of each OBP. Thus, it is preferable to compare orthologs from closely related but different species, which are expected to retain the same function, e.g., ligand repertoire. By comparisons of the orthologous genes from many species, we found the conserved amino acids in both Obp57d and Obp57e. They might be the key sites for the specific functions shared by these two genes. We also found the type I functionally diverged sites between the two OBP genes. They are possibly the key amino acids responsible for the specific functions of each OBP.

Figure 3. ME tree of deduced amino acid sequences. Obp57d and Obp57e genes form two clades, both of which have equally diverged from the OBP genes in the obscura group. The ME tree was constructed with the Poisson correction using the complete deletion option of MEGA 3 for the alignment shown in Figure 2. D. melanogaster Obp57a and Obp57b were used as an outgroup. Bootstrap values >80% are shown.

Figure 4. Genomic structure of the Obp57d/e region in analyzed species. The phylogenetic relationship between the species is based on Da Lage et al. (2007). Arrows indicate the position of three major gene duplication/loss events (see text).

Figure 5. Phylogenetic relationships between species selected for site-specific analysis of functional constraint. (A) Phylogenetic tree used for analysis of amino acid substitution. (B and C) Phylogenetic trees used for analysis of functional divergence using DIVERGE. Because the software does not accept multibranched trees, the realistic tree (shown in A) was approximated by bifurcating trees.

Functional divergence between Obp57d and Obp57e: There are two theories for the functional divergence of duplicated genes: subfunctionalization and neofunctionalization (Roth et al. 2007). The subfunctionalization of duplicated genes is a key process in the duplication-degeneration-complementation (DDC) model, in which the functions of an ancestral gene are divided between the duplicated genes, which functionally complement each other. On the other hand, the neofunctionalization of duplicated genes results in the acquisition of a novel function by one gene, while the other gene preserves the ancestral function. The ME tree of the amino acid sequences showed that Obp57d and Obp57e have equally diverged from the OBP gene in the obscura group, suggesting that the neofunctionalization of either gene is not likely. Also, type II functional divergence was not supported, which means that there was no radical substitution of amino acids leading to the acquisition of a novel function. However, not all of the type I diverged sites between Obp57d and Obp57e appear to be caused by the loss of functional constraints; among the 19 conserved sites that are clade specific, 12 sites are not conserved in the obscura group, in which the ancestral functions should be conserved (Table 5). The specific condition for OBP genes needs to be considered to understand these observations. Because most sites in OBP genes are evolutionarily free, acquisition of a novel function after gene duplication might be observed as an increase of functional constraints at the sites that had been free before duplication. Such site-specific differences of evolutionary rate will be detected as type I divergence, but in this case, they should be related to neofunctionalization rather than subfunctionalization. Positional shift of functionally important sites, for example, may cause such changes. Our analysis did not include insertion/deletion variations, which clearly affect positional relationships between functional amino acids. Thus, it remains possible that each of Obp57d and Obp57e inherited a subdivision of ancestral functions (subfunctionalization), and at the same time they gained a novel function that is specific to each of the two OBP genes (neofunctionalization). It has been proposed that subfunctionalization has a role as a transition state to neofunctionalization (He and Zhang 2005; Rastogi and Liberles 2005). This possibility should be examined experimentally by in vitro assay, as well as by behavioral assay of genetically manipulated flies.

TABLE 2

Summary of site-specific analysis of amino acid substitution

          No. of           Average no. of             No. of
          substitutions    substitutions per site     conserved sites
Obp57d    184              1.80                       33
Obp57e    192              1.88                       31
Total     376              3.68                       16ᵃ

ᵃ Number of sites conserved between Obp57d and Obp57e.
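The per-site averages in Table 2 can be reproduced arithmetically. The following is a minimal sketch, assuming the alignment contains 102 sites; that figure is not stated in this section and is inferred here from the sum N + C + R reported in Table 4 (43 + 25 + 34). The names `N_SITES`, `SUBS`, and `per_site` are illustrative only.

```python
# Assumption: 102 aligned sites, inferred from N + C + R in Table 4.
N_SITES = 102

# Substitution counts transcribed from Table 2.
SUBS = {"Obp57d": 184, "Obp57e": 192}


def per_site(total, n_sites=N_SITES):
    """Average number of substitutions per site, rounded to two decimals."""
    return round(total / n_sites, 2)


averages = {gene: per_site(count) for gene, count in SUBS.items()}
print(averages)

# The Total row: summed counts and summed per-clade averages.
print(sum(SUBS.values()), round(sum(averages.values()), 2))
```

Under this assumption the Total row corresponds to the sum of the two per-clade averages rather than to a separate division of 376 by the number of sites.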

TABLE 4

Summary of analysis for functional divergence between Obp57d and Obp57e

               Tree 2            Tree 3
α              1.06              1.09
Da (Obp57d)    2.19              2.19
Db (Obp57e)    2.26              2.26
θI ± SE        0.582 ± 0.106     0.606 ± 0.111
N              43                43
C              25                26
R              34                33
θII ± SE       −0.057 ± 0.243    −0.078 ± 0.247

Trees 2 and 3 are shown in Figure 5, B and C, respectively. α is the gamma shape parameter. Da and Db are the average numbers of substitutions per site. θI and θII are the coefficients of type I and type II functional divergences, respectively. SE, standard error. N, C, and R are the numbers of sites that display no difference, conserved difference, and radical differences in amino acid property between the two clades, respectively.
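One rough way to read the coefficients in Table 4 is to express each estimate in units of its standard error. The sketch below does this with a simple normal approximation; it is not the test that DIVERGE itself applies, and the 1.96 cutoff as well as the names `THETA`, `z_score`, and `exceeds_zero` are assumptions made for illustration.

```python
# (estimate, standard error) pairs transcribed from Table 4 for trees 2 and 3.
THETA = {
    "type I": [(0.582, 0.106), (0.606, 0.111)],
    "type II": [(-0.057, 0.243), (-0.078, 0.247)],
}


def z_score(estimate, se):
    """Estimate expressed in units of its standard error."""
    return estimate / se


def exceeds_zero(estimate, se, critical=1.96):
    """True if the estimate lies more than `critical` SEs above zero.

    The cutoff is an illustrative assumption, not a value from the paper.
    """
    return z_score(estimate, se) > critical


for kind, values in THETA.items():
    for tree, (est, se) in zip(("Tree 2", "Tree 3"), values):
        print(f"{kind}, {tree}: z = {z_score(est, se):.2f}, "
              f"above zero: {exceeds_zero(est, se)}")
```

On both trees the type I coefficient lies more than five standard errors above zero, whereas the type II coefficient is within one standard error of zero, which is consistent with the statement that type II divergence was not supported.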

TABLE 3

Summary of conserved amino acids between Obp57d and Obp57e in the melanogaster species group

Position    pseudoobscura/obscura    Obp57d and Obp57e
37ᵃ         C/C                      C
59          W/W                      W
60          L/P                      P
74ᵃ         C/C                      C
78ᵃ         C/C                      C
97          Y/Y                      Y
101         E/E                      G
104         E/Q                      D
109         K/D                      A
115ᵃ        C/C                      C
122         E/S                      E
124         D/D                      D
126ᵃ        C/C                      C
133         A/A                      F
135ᵃ        C/C                      C
141         A/A                      L

ᵃ Positions of the six OBP-signature cysteines.


Birth-and-death process and selection: The ananassae subgroup and the auraria–rufa lineage provide interesting examples in which either Obp57d or Obp57e has been lost. Even in the subfunctionalization process, duplicated genes require selective pressure for the preservation of ancestral functions. This selective pressure is also necessary for the maintenance of both subfunctionalized genes. In other words, the species lacking either subfunctionalized gene may exhibit defective phenotypes that had been deleterious during the subfunctionalization process. Thus, the loss of either gene in the ananassae subgroup and the auraria–rufa lineage may indicate a shift in selective pressure, such as a reduction in population size leading to genetic drift, or an environmental change leading to a shift in food availability. Indeed, some of the amino acids conserved in the other species that have both Obp57d and Obp57e (Tables 3 and 5) are not conserved in the ananassae subgroup and the auraria–rufa lineage (Figure 2), indicating that the functional constraint on the remaining gene has changed. It should also be noted that the OBP gene number at the Obp57d/e locus is under a particular selection mechanism. In natural populations of D. melanogaster, there is polymorphism at the Obp57e locus (Takahashi and Takano-Shimizu 2005). The Obp57e null allele was found worldwide, indicating the existence of balancing selection. This indicates that there is a selection-based mechanism that affects the gene number at the Obp57d/e locus. The mechanism of this selection may have great importance as a determinant of the OBP gene family size.

Contrary to the examples discussed above, the multiple Obp57d genes in D. takahashii, D. biarmipes, D. ficusphila, D. elegans, and D. vaians and the two Obp57e genes in D. constricta appear to be at the early stage of evolution after gene duplications in each lineage. Although the amino acids conserved in Obp57d genes in the other species (Tables 3 and 5) are not conserved in some of these multiple Obp57d genes (Figure 2), this may indicate, in this case, a gene degeneration process rather than functional divergence. Indeed, Obp57d pseudogenes are found in D. takahashii and D. elegans. Analyses of intraspecies variations in each species might reveal selection pressure on these extra Obp57d genes, if any. The observed number difference of genes at the earlier stage of evolution (e.g., three Obp57d genes in D. biarmipes) is larger than that at the later stage (e.g., loss of Obp57d in D. auraria), and it may contribute more to the size difference of the gene family between species. Its contribution to phenotypic differences, however, would be less than that of the number difference of the functionally diverged genes.

Figure 6. Site-specific analysis of the difference in functional constraint. (A) The number of amino acid substitutions at each site. Arrows indicate the conserved sites among all Obp57d and Obp57e sequences in the selected species (see Table 3). (B) Posterior probability of type I functional divergence (being different in evolutionary rate) at each site calculated by using DIVERGE. Sites with probabilities >0.8 have a large difference in the number of amino acid substitutions between Obp57d and Obp57e (shown in A, connected by dashed lines).


Functional relationship with receptors: In the evolution of OBP genes, not only populational and environmental factors but also local factors at the molecular level are important determinants of selection pressure. For its proper function, an OBP must be coexpressed with functionally corresponding chemoreceptors in the same sensilla (Xu et al. 2005). Changes in the structure (function) and expression pattern of the corresponding receptor are expected to change the selection pressure on OBP genes. More importantly, changes in the expression pattern of the OBP itself also alter its local environment. The evolution in the expression pattern of each OBP may affect the selection pressure on itself, possibly resulting in the gene number difference between species. It is necessary to examine whether, in the species in which the number of Obp57d/e genes is altered, these genes are expressed in the same pattern as seen in D. melanogaster or in different patterns.

Conclusion: There are two classes of gene number differences in the Obp57d/e region: the difference of the genes that have functionally diverged from each other and the difference of the genes that appear to be functionally identical. Although both of these two classes contribute to the size difference of the gene family between species, their contributions to the phenotypic differences are not equal, and the evolutionary mechanisms underlying them are different. Thus, it is important to distinguish between these two classes in the analysis of the size difference of multigene families among species. Comparisons of many genomic sequences from closely related species are effective for this purpose.

This work was supported by grant-in-aid for Young Scientists (B) 19770210 from the Ministry of Education, Culture, Sports, Science and Technology, Japan.

LITERATURE CITED

Bendtsen, J. D., H. Nielsen, G. Von Heijne and S. Brunak, 2004 Improved prediction of signal peptides: SignalP 3.0. J. Mol. Biol. 340: 783–795.

Da Lage, J.-L., G. J. Kergoat, F. Maczkowiak, J.-F. Silvain, M.-L. Cariou et al., 2007 A phylogeny of Drosophilidae using the Amyrel gene: questioning the Drosophila melanogaster species group boundaries. J. Zool. Syst. Evol. Res. 45: 47–63.

De Bie, T., N. Cristianini, J. P. Demuth and M. W. Hahn, 2006 CAFE: a computational tool for the study of gene family evolution. Bioinformatics 22: 1269–1271.

Fitch, W. M., 1971 Toward defining the course of evolution: minimum change for a specific tree topology. Syst. Zool. 20: 406–416.

Foret, S., and R. Maleszka, 2006 Function and evolution of a gene family encoding odorant binding-like proteins in a social insect, the honey bee (Apis mellifera). Genome Res. 16: 1385–1394.

Galindo, K., and D. P. Smith, 2001 A large family of divergent Drosophila odorant-binding proteins expressed in gustatory and olfactory sensilla. Genetics 159: 1059–1072.

Gilad, Y., V. Wiebe, M. Przeworski, D. Lancet and S. Paabo, 2004 Loss of olfactory receptor genes coincides with the acquisition of full trichromatic vision in primates. PLoS Biol. 2: E5.

TABLE 5

Summary of clade-specifically conserved amino acids

            pseudoobscura/    Ancestral stateᵇ      No. of substitutions    Posterior probability
                                                    within a clade          of type I divergenceᶜ
Position    obscura           Obp57d    Obp57e      Obp57d    Obp57e
40          Q/Q               AHNRTY    Q           7         2             0.85
50          E/E               E         ADNTY       2         8             0.87
56          L/L               F         L           5         1             0.92
57          D/D               K         E           4         1             0.84
68          L/L               V         I           0         3             0.81
72          F/V               H         Y           0         3             0.80
75ᵃ         Y/Y               Y         F           0         5             0.94
77          T/T               T         T           3         0             0.82
84          N/Q               GNS       G           3         0             0.82
93ᵃ         H/H               RST       Q           6         0             0.98
99ᵃ         E/E               D         K           0         4             0.90
100ᵃ        T/A               LNST      S           4         0             0.92
110ᵃ        S/S               P         M           0         6             0.96
114         N/N               R         STVW        1         6             0.94
117         Y/T               Y         AGIS        0         3             0.82
120ᵃ        A/A               R         E           0         4             0.89
121         D/D               K         D           5         1             0.83
140         R/R               K         K           3         0             0.83
142         A/A               A         A           5         1             0.82

ᵃ Highly diverged sites where the amino acid is conserved in either clade and substituted more than three times in the other clade.

ᵇ Inferred by the MP method (see materials and methods).

ᶜ Calculated using DIVERGE.
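The criterion in footnote a of Table 5 can be written as a simple filter over the substitution counts. The sketch below assumes that "conserved" means zero substitutions within a clade, which is an interpretation of the footnote rather than a definition given in the text; the names `SUBSTITUTIONS` and `highly_diverged` are illustrative.

```python
# Position -> (substitutions within the Obp57d clade, within the Obp57e clade),
# transcribed from Table 5.
SUBSTITUTIONS = {
    40: (7, 2), 50: (2, 8), 56: (5, 1), 57: (4, 1), 68: (0, 3),
    72: (0, 3), 75: (0, 5), 77: (3, 0), 84: (3, 0), 93: (6, 0),
    99: (0, 4), 100: (4, 0), 110: (0, 6), 114: (1, 6), 117: (0, 3),
    120: (0, 4), 121: (5, 1), 140: (3, 0), 142: (5, 1),
}


def highly_diverged(counts, threshold=3):
    """Positions conserved in one clade and substituted more than
    `threshold` times in the other.

    Assumption: "conserved" is read as zero substitutions in that clade.
    """
    hits = []
    for position, (n_d, n_e) in sorted(counts.items()):
        if min(n_d, n_e) == 0 and max(n_d, n_e) > threshold:
            hits.append(position)
    return hits


print(highly_diverged(SUBSTITUTIONS))  # [75, 93, 99, 100, 110, 120]
```

Under this reading the filter returns exactly the six positions marked with footnote a in the table.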


Go, Y., Y. Satta, O. Takenaka and N. Takahata, 2005 Lineage-specific loss of function of bitter taste receptor genes in humans and nonhuman primates. Genetics 170: 313–326.

Graham, L. A., and P. L. Davies, 2002 The odorant-binding proteins of Drosophila melanogaster: annotation and characterization of a divergent gene family. Gene 292: 43–55.

Gu, X., 2006 A simple statistical method for estimating type-II (cluster-specific) functional divergence of protein sequences. Mol. Biol. Evol. 23: 1937–1945.

Gu, X., and K. V. Velden, 2002 DIVERGE: phylogeny-based analysis for functional-structural divergence of a protein family. Bioinformatics 18: 500–501.

Hahn, M. W., T. De Bie, J. E. Stajich, C. Nguyen and N. Cristianini, 2005 Estimating the tempo and mode of gene family evolution from comparative genomic data. Genome Res. 15: 1153–1160.

He, X., and J. Zhang, 2005 Rapid subfunctionalization accompanied by prolonged and substantial neofunctionalization in duplicate gene evolution. Genetics 169: 1157–1164.

Hekmat-Scafe, D. S., C. R. Scafe, A. J. McKinney and M. A. Tanouye, 2002 Genome-wide analysis of the odorant-binding protein gene family in Drosophila melanogaster. Genome Res. 12: 1357–1369.

Junier, T., and M. Pagni, 2000 Dotlet: diagonal plots in a web browser. Bioinformatics 16: 178–179.

Karev, G. P., Y. I. Wolf and E. V. Koonin, 2003 Simple stochastic birth and death models of genome evolution: Was there enough time for us to evolve? Bioinformatics 19: 1889–1900.

Karev, G. P., Y. I. Wolf, F. S. Berezovskaya and E. V. Koonin, 2004 Gene family evolution: an in-depth theoretical and simulation analysis of non-linear birth-death-innovation models. BMC Evol. Biol. 4: 32.

Kumar, S., K. Tamura and M. Nei, 2004 MEGA3: integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief. Bioinform. 5: 150–163.

Lynch, M., and J. S. Conery, 2000 The evolutionary fate and consequences of duplicate genes. Science 290: 1151–1155.

Matsuo, T., S. Sugaya, J. Yasukawa, T. Aigaki and Y. Fuyama, 2007 Odorant-binding proteins Obp57d and Obp57e affect taste perception and host-plant preference in Drosophila sechellia. PLoS Biol. 5: e118.

Nei, M., 2005 Selectionism and neutralism in molecular evolution. Mol. Biol. Evol. 22: 2318–2342.

Rastogi, S., and D. A. Liberles, 2005 Subfunctionalization of duplicated genes as a transition state to neofunctionalization. BMC Evol. Biol. 5: 28.

Reed, W. J., and B. D. Hughes, 2004 A model explaining the size distribution of gene and protein families. Math. Biosci. 189: 97–102.

Roth, C., S. Rastogi, L. Arvestad, K. Dittmar, S. Light et al., 2007 Evolution after gene duplication: models, mechanisms, sequences, systems, and organisms. J. Exp. Zool. B Mol. Dev. Evol. 308: 58–73.

Rudnicki, R., J. Tiuryn and D. Wojtowicz, 2006 A model for the evolution of paralog families in genomes. J. Math. Biol. 53: 759–770.

Takahashi, A., and T. Takano-Shimizu, 2005 A high-frequency null mutant of an odorant-binding protein gene, Obp57e, in Drosophila melanogaster. Genetics 170: 709–718.

Xu, P. X., L. J. Zwiebel and D. P. Smith, 2003 Identification of a distinct family of genes encoding atypical odorant-binding proteins in the malaria vector mosquito, Anopheles gambiae. Insect Mol. Biol. 12: 549–560.

Xu, P., R. Atkinson, D. N. M. Jones and D. P. Smith, 2005 Drosophila OBP LUSH is required for activity of pheromone-sensitive neurons. Neuron 45: 193–200.

Communicating editor: N. Takahata
