  • Copyright © 1989 by the Genetics Society of America

    Sequence Evolution of the Drosophila Heat Shock Locus hsrω. I. The Nonrepeated Portion of the Gene

    James C. Garbe,1 William G. Bendena2 and Mary Lou Pardue

    Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

    Manuscript received November 28, 1988
    Accepted for publication February 28, 1989

    ABSTRACT

    The locus which we now call hsrω was originally identified as a large heat shock puff in polytene region 93D of Drosophila melanogaster. This puff was subsequently found to have several phenotypic characteristics that distinguished it from other heat shock puffs. These characteristics include induction by a number of agents that do not induce other puffs and the presence of large ribonucleotide particles that are not found elsewhere. Each Drosophila species has one heat shock puff with these phenotypes. In contrast to the strong sequence conservation seen in puffs coding for heat shock proteins, very little cross-hybridization is detected between hsrω loci in different species, suggesting that the hsrω loci are diverging rapidly. Comparative analyses of the hsrω locus from D. melanogaster, D. pseudoobscura, and D. hydei show that, despite the sequence change, the structure of the locus and its transcripts has been conserved, along with a number of short regions of the sequence. The short regions of conservation offer some clues to the function of this unusual locus. In addition, these comparisons offer a view of the evolution of a gene whose primary function does not appear to be protein coding.

    WHEN subjected to environmental stress, organisms respond by shifting the pattern of macromolecular synthesis to the production of a limited number of RNAs and proteins. This response, named the heat shock response, has been found to be a highly conserved phenomenon in all types of organisms that have been examined, including animals, plants and bacteria. The ubiquity of the response suggests that it is an ancient cellular mechanism. Molecular characterization of the genes that are active during the heat shock response has provided additional information on the extent to which the response has been conserved. There are at least four families of proteins whose synthesis is increased by heat shock and other stresses. The most abundant and highly conserved heat shock proteins (hsp's) belong to the hsp70 family (CRAIG 1985; LINDQUIST 1986). At the amino acid level Drosophila hsp70 shares 73% overall identity with human hsp70 and 50% identity with Escherichia coli hsp70 (dnaK). In some regions of these proteins, the level of identity exceeds 90%. Another family, the hsp90 family, includes hsp90 from humans, hsp82 of Drosophila and the htpG protein of E. coli (BARDWELL and CRAIG 1987). There is some 41% identity of amino acids between the Drosophila hsp82 protein and htpG of E. coli. The group of small hsps, the hsp20 family, is somewhat more variable in size and sequence (CRAIG 1985). In Drosophila there is a quartet of proteins between 22 kD and 28 kD, in yeast a single protein of 26 kD, and in plants there is a large cluster of proteins ranging from 15 to 27 kD. The nucleotide sequences of the plant hsp20 family are not very similar to the sequences of the proteins in animals, yet both the plant and animal proteins share some structural features (NAGANO et al. 1985). Another conserved feature of the hsp20 family is its presence in non-stressed gametogenic cells of both animals and yeast (KURTZ et al. 1986). No bacterial members of the 20-kD family have been reported. Recently, a fourth conserved family has been identified containing the bacterial GroEL protein, a eukaryotic mitochondrial protein (MCMULLIN and HALLBERG 1986), and the ribulose bisphosphate carboxylase (Rubisco) subunit binding protein from plants (HEMMINGSEN et al. 1988).

    1 Present address: Department of Genetics, University of California, Berkeley, California 94720.
    2 Present address: Department of Biology, Queen's University, Kingston, Ontario, Canada K7L 3N6.

    Genetics 124: 403-415 (June, 1989)

    In contrast to the general conservation of the protein-encoding heat shock loci across species and kingdoms, one heat shock locus has been found that shows evidence of much more rapid sequence evolution. This locus is hsrω of Drosophila, a locus with products that appear to act primarily as RNAs (BENDENA et al. 1989a). Like some other heat shock loci, hsrω is important in non-stressed as well as stressed cells; the name hsrω reflects the original identification as the locus encoding the ω set of heat shock RNAs. We now know that hsrω is active in almost all cells and that the activity is greatly increased by heat shock.

    The lack of sequence cross-homology for hsrω loci was first detected in experiments that used in vivo labeled heat shock RNAs as probes for in situ hybridizations to polytene chromosomes from closely and distantly related Drosophila species (PETERS, LUBSEN and SONDERMEIJER 1980). In all instances, the protein-encoding heat shock loci showed strong cross-hybridization among the species tested. One of the major heat shock puffs, however, showed no detectable cross-hybridization between distantly related Drosophila species. Only when the source of the labeled RNA was a sibling Drosophila species was cross-hybridization seen at this locus. This apparent lack of sequence conservation was puzzling because there was a body of cytological data showing that this particular heat shock puff had some distinguishing characteristics in all species studied (LAKHOTIA and SINGH 1982; LAKHOTIA 1987). The phenotypic evidence that these puffs are homologous was supported by their chromosomal locations. In each species the puff is on the chromosome arm derived from ancestral element E (PATTERSON and STONE 1952).

    The isolation of recombinant DNA clones for the hsrω loci of D. hydei (hsr2-48B: PETERS et al. 1984) and D. melanogaster (hsr93D: WALLDORF et al. 1984; GARBE and PARDUE 1986) provided the opportunity to determine the molecular structure of these loci and their transcripts. As reported previously (GARBE et al. 1986; RYSECK et al. 1987) the molecular analysis confirms the proposed homology of these two loci, which we now call hsrω. Both loci have a characteristic genomic organization in which the transcription unit consists of approximately 2.5 kb of unique sequence followed by an 8-10-kb region of short direct tandem repeats. The pattern of three transcripts produced from the loci is also similar (Figure 1), consisting of a large transcript about 10 kb long (hsrω1) and two smaller transcripts of about 2 and 1.2 kb in length (hsrω2 and hsrω3, respectively). All transcripts have the same start site (RYSECK et al. 1987; J. C. GARBE, unpublished data). The metabolism of these three transcripts has been studied most thoroughly in D. melanogaster but the pattern of puff induction in other species and the structure of their transcripts suggest that the D. melanogaster studies are applicable to the other hsrω loci. The ω1 transcript is found only in the nucleus and is relatively stable. It does not act as a precursor to the other transcripts. The ω2 transcript appears to be produced by an alternative termination near a polyadenylation signal some 2 kb from the initiation point. ω2 is the nuclear precursor for ω3, a cytoplasmic RNA whose turnover is closely associated with the rate of protein synthesis in the cell (BENDENA et al. 1989a, b).

    Surprisingly, direct sequence comparisons show that the entire transcribed region of the hsrω loci in D. melanogaster and D. hydei is highly divergent. Only short regions of sequence identity remain evident in the unique portion of the loci. Some of these conserved regions can be identified as being involved in RNA transcription and processing. The largest of the conserved regions flanks the 3' splice site of the intron. In the repeat region the degree of sequence divergence is even more striking. The length of the repeat units differs in the two species, 115 bp for D. hydei and about 280 bp for D. melanogaster. Within the repeats the only conserved sequence is the nonamer ATAGGTAGG, which is repeated once in the 115-bp repeat and twice in the 280-bp repeat.

    FIGURE 1. Schematic diagram showing the genomic organization of the hsrω locus (GENOMIC) and the structure of its transcripts (OMEGA 1, 2 and 3) in the Drosophila species studied. HSE/CTS indicates a region containing heat shock elements and at least some of the constitutive transcription signals. The bent arrow indicates the start of transcription for all three transcripts. The black box indicates the sequences that are spliced out to make ω3. The first poly-A signal is the one used for termination of ω2. The second poly-A signal is the one used for termination of ω1. Repeats indicate the 8-15 kb segment of tandem direct repeats; this region has been greatly truncated in the diagram. The unique region between the transcription start and the repeats is 2-3 kb, depending on the Drosophila species. The ω1 transcript is colinear with the entire transcription unit and is limited to the nucleus. The ω2 transcript is also nuclear and is processed to yield the cytoplasmic transcript, ω3, by removal of a 700-nucleotide intron (sequences indicated by black box). AAAAAAA indicates polyadenylation of transcripts. The ω3 transcript is 1.2 kb in D. melanogaster and D. pseudoobscura and 1.35 kb in D. hydei; ω2 is 1.9 kb in D. melanogaster and D. pseudoobscura and 2.2 kb in D. hydei. The size of ω1 varies somewhat more between species and also varies between alleles within a species.
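The transcript sizes quoted in the Figure 1 legend can be cross-checked against the stated intron length with simple arithmetic. The sketch below is only an illustration: the sizes are the rounded values from the legend, the dictionary and function names are hypothetical, and differences in poly-A tail length between the two transcripts are ignored.

```python
# Approximate transcript sizes in base pairs, as quoted in the Figure 1 legend.
# The names used here (TRANSCRIPT_SIZES_BP, implied_intron_bp) are illustrative only.
TRANSCRIPT_SIZES_BP = {
    "D. melanogaster": {"omega2": 1900, "omega3": 1200},
    "D. pseudoobscura": {"omega2": 1900, "omega3": 1200},
    "D. hydei": {"omega2": 2200, "omega3": 1350},
}


def implied_intron_bp(sizes):
    """Length removed when omega2 is spliced to give omega3.

    Assumes the two transcripts differ only by the intron, which
    ignores any difference in poly-A tail length.
    """
    return sizes["omega2"] - sizes["omega3"]


for species, sizes in TRANSCRIPT_SIZES_BP.items():
    print(f"{species}: implied intron of about {implied_intron_bp(sizes)} bp")
```

For D. melanogaster and D. pseudoobscura the difference agrees with the 700-nucleotide intron given in the legend; the D. hydei figure is somewhat larger, which is within what one would expect from sizes estimated on gels.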

    Because of the marked sequence divergence between the hsrω loci of D. melanogaster and D. hydei, which are separated by some 60 million years (BEVERLEY and WILSON 1984), we have characterized the hsrω locus of a third species. The species chosen was D. pseudoobscura, which is considered to have diverged from the line leading to D. melanogaster some 46 million years ago and is thus much closer to D. melanogaster than to D. hydei (BEVERLEY and WILSON 1984). In addition we have used hybridization analysis to compare parts of the hsrω sequence in other Drosophila species. Our results, again, show rapid evolution of the sequence but conservation of the phenotype of this locus. The basic genomic organization is conserved with both unique and repeated sequences and the transcripts produced are structurally equivalent. It appears that the unique 5' region is evolving differently than is the segment of tandem repeats in the 3' region. This report will center on the sequences of the 5' region. Studies of the repeated segments will be presented elsewhere. The sequence analysis of the unique region has provided some clues as to the function of this unusual gene (BENDENA et al. 1989a). In addition, this analysis permits us to examine the evolution of a sequence that is not constrained significantly by the demands of protein coding.

    MATERIALS AND METHODS

    Drosophila stocks: The Drosophila stocks used in this study were as follows: D. melanogaster, ry506 (Canton-S) obtained from A. SPRADLING, gt-1 and gt-X11 (PARDUE and DAWID 1981); D. simulans, D. hydei and D. virilis obtained from W. GELBART; D. mauritiana, obtained from M. YOUNG; D. miranda, obtained from R. NORMAN; and D. pseudoobscura, standard third chromosomal arrangement, obtained from R. LEWONTIN.

    Nucleic acid analysis: Genomic DNA was isolated from Drosophila adults as described by AYME-SOUTHGATE et al. (1988). RNA was isolated from larvae using guanidine-HCl as described by CHIRGWIN et al. (1979). Electrophoresis and blot hybridization conditions were as described (GARBE and PARDUE 1986). DNA for probes was labeled according to FEINBERG and VOGELSTEIN (1983, 1984). The donor and acceptor splice sites and the polyadenylation sites for D. melanogaster were determined from the sequence of cloned cDNA (GARBE et al. 1986). The transcription start sites and acceptor splice sites for all three species were determined by the method of HU and DAVIDSON (1986). The donor splice sites and polyadenylation sites for D. pseudoobscura and D. hydei have been determined by homology with the D. melanogaster sites.

    Recombinant DNA: Construction of genomic libraries for D. pseudoobscura was done by standard methods (MANIATIS, FRITSCH and SAMBROOK 1982). Genomic DNA was partially digested with Sau3AI and size fractionated on a 5-20% linear sucrose gradient. Appropriately sized fragments were ligated to BamHI-digested λ EMBL 3B (KARN, BRENNER and BARNETT 1983) arms and packaged (HOHN and MURRAY 1977). The library was screened as described (BENTON and DAVIS 1977). Restriction fragments were cloned into M13 vectors mp18 and mp19 (NORRANDER, KEMPE and MESSING 1983) and sequenced by the dideoxy chain termination method (SANGER, NICKLEN and COULSON 1977) using [35S]dATP (BIGGIN, GIBSON and HONG 1983).

    Computer analysis: Nucleic acid homologies were analyzed by the methods of MAIZEL and LENK (1981) using the programs COMPARE and DOTPLOT of the University of Wisconsin Genetics Computer Group (UWGCG; DEVEREUX, HAEBERLI and SMITHIES 1984). Nucleic acid sequence alignments were done using the IALIGN program of the National Biomedical Research Foundation (NBRF). This program uses the algorithm of NEEDLEMAN and WUNSCH (1970). Scoring values for the alignments were: identities, 3; transitions, 1; transversions, 0; and gaps, -6 (LAKE 1988).
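The scoring scheme above can be illustrated with a minimal global alignment in the style of Needleman and Wunsch. This is a sketch under stated assumptions, not a reimplementation of IALIGN: the gap value of -6 is applied here to every gapped position (the text does not say whether it is charged per position or per gap), ties in the traceback are resolved in favor of the diagonal, and the function names are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
GAP = -6  # assumption: charged once per gapped position


def pair_score(a, b):
    """Score one aligned pair: identity 3, transition 1, transversion 0."""
    if a == b:
        return 3
    same_class = (a in PURINES and b in PURINES) or (
        a in PYRIMIDINES and b in PYRIMIDINES
    )
    return 1 if same_class else 0


def needleman_wunsch(seq1, seq2):
    """Return (score, aligned_seq1, aligned_seq2) for a global alignment."""
    n, m = len(seq1), len(seq2)
    # score[i][j] is the best score for aligning seq1[:i] with seq2[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * GAP
    for j in range(1, m + 1):
        score[0][j] = j * GAP
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(seq1[i - 1], seq2[j - 1]),
                score[i - 1][j] + GAP,  # gap in seq2
                score[i][j - 1] + GAP,  # gap in seq1
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out1, out2 = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (
            i > 0
            and j > 0
            and score[i][j]
            == score[i - 1][j - 1] + pair_score(seq1[i - 1], seq2[j - 1])
        ):
            out1.append(seq1[i - 1])
            out2.append(seq2[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + GAP:
            out1.append(seq1[i - 1])
            out2.append("-")
            i -= 1
        else:
            out1.append("-")
            out2.append(seq2[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out1)), "".join(reversed(out2))


if __name__ == "__main__":
    # Short made-up sequences, used only to exercise the function.
    print(needleman_wunsch("ACGT", "ACT"))
```

With these values a single gapped position costs as much as two identities earn, so the alignment opens a gap only when doing so recovers at least three matched positions that would otherwise be lost.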

    RESULTS

    Hybridization detects a small segment of the hsrω sequence in each Drosophila species studied: As predicted by the early studies (PETERS, LUBSEN and SONDERMEIJER 1980), cross-hybridization between cloned hsrω sequences from D. melanogaster and D. hydei detected very little homology between the species. Using six small DNA fragments spanning the 1.9-kb region encoding the small nuclear transcript of D. melanogaster to probe cloned hsrω DNA from D. hydei, we found only one fragment showing significant hybridization and two showing very weak hybridization (GARBE and PARDUE 1986). The pattern of hybridization did not change when hybridization stringency was reduced. Surprisingly, direct sequence analysis showed the major cross-hybridization was due to a 60-nucleotide region, including 40 nucleotides within the intron and spanning the 3' splice site. This segment had 100% identity in the two species. To determine whether this region was conserved in other Drosophila species, we isolated genomic DNA from representative Drosophila species and analyzed it by Southern hybridization, using as probe a 750-bp piece of D. melanogaster DNA that spanned the intron and included the 3' splice site region.

    The analysis indicates that the region flanking the 3' splice site is conserved in all the tested Drosophila species (Figure 2a). The levels of hybridization detected also provide an indication of the level of conservation in the region. D. melanogaster and its sibling species, D. simulans and D. mauritiana, show equally strong hybridization to the probe. In contrast, the other species tested show a sharply reduced hybridization signal. Surprisingly, the signal detected for species outside the melanogaster sibling species group is approximately equal among the four species tested, although these species differ significantly in their relation to D. melanogaster. This data does not determine whether the reduced signal is due to a short region of high sequence identity as is seen for D. hydei or a longer sequence of lower identity.

    The Southern hybridization experiments do not identify the cross-hybridizing sequences as members of the hsrω loci but they do show a single band of hybridization for each species, as expected if there is one locus per genome. In the two additional cases where we have determined the chromosomal localization of the cross-hybridizing region, in D. pseudoobscura and D. virilis, we have found that the sequence is located in the heat shock puff that has the phenotypic characteristics of hsrω loci, as determined by LAKHOTIA and SINGH (1982). Localization of the sequence in D. pseudoobscura was done using a cloned piece of D. pseudoobscura DNA that had been isolated by hybridization with the D. melanogaster intron region. In situ hybridizations with this clone on polytene chromosomes showed a single site of hybridization at locus 58C (Figure 2b). Although the 58C puff contains sequences encoding hsp70, this puff also has characteristics of hsrω puffs (BURMA and LAKHOTIA 1984). Our characterization of this locus (see below) confirms the suggestion that the 58C puff contains the hsrω gene as well as hsp70 sequences. D. virilis is one of the most distantly related of the other species used for Southern hybridization since it, like D. hydei, belongs to the subgenus Drosophila while the other species belong to the subgenus Sophophora. Although we have not cloned hsrω sequences from D. virilis, we have localized the cross-hybridizing region on D. virilis chromosomes by in situ hybridization with the D. melanogaster intron probe and find the sequences in the expected puff at 20CD (Figure 2c) (LAKHOTIA and SINGH 1982).

    FIGURE 2. (a) Blot hybridization with the D. melanogaster hsrω intron probe detects a single band in DNA from other Drosophila species. A strong hybridization signal is seen for D. melanogaster (lanes 1-3, stocks gt-1, gt-X11, and Canton S, respectively), D. simulans (lane 4), and D. mauritiana (lane 5). Weaker hybridization signals are seen for D. miranda (lane 6), D. pseudoobscura (lane 7), D. hydei (lane 8), and D. virilis (lane 9). To allow direct comparison between the species, the lanes of DNA from D. melanogaster and its sibling species have been overexposed, obscuring the single band of hybridization that can be seen on shorter exposures. DNA was isolated from adults, cut with PstI, fractionated on an agarose gel, transferred to a nitrocellulose filter and hybridized with a ScaI/EcoRV fragment from the D. melanogaster intron. Hybridization was at 60° in 4X SET (1X SET: 0.15 M NaCl/0.03 M Tris-HCl/2 mM EDTA, pH 7.0), 0.2% polyvinylpyrrolidone, 0.2% Ficoll 400, 0.2% bovine serum albumin, and 0.5% SDS. Filters were washed at 60° in 0.15 M NaCl/0.015 M sodium citrate, pH 7.0, and 0.5% SDS. (b) The D. pseudoobscura clone that was selected by hybridization with the D. melanogaster hsrω intron clone maps to the 58C puff (arrow). This autoradiograph shows a chromosome that has been hybridized with 3H-labeled DNA from the D. pseudoobscura clone. (c) The D. melanogaster hsrω intron probe cross-hybridizes with sequences in the D. virilis hsrω locus. The D. melanogaster intron fragment used as probe for the Southern hybridization in (a) was 3H-labeled and used for in situ hybridization to polytene chromosomes of D. virilis. This autoradiograph shows the single site of hybridization that is detected in the 20CD puff, a puff with other characteristics of a hsrω heat shock puff (arrow).
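The Figure 2 legend gives the hybridization buffer as 4X SET and defines only the 1X composition. The final concentrations follow by direct scaling, as in the short sketch below. The variable and function names are hypothetical, and the values are those stated in the legend.

```python
# 1X SET composition from the Figure 2 legend, in mol/L.
SET_1X_MOLAR = {"NaCl": 0.15, "Tris-HCl": 0.03, "EDTA": 0.002}


def scale_buffer(composition, fold):
    """Return the molar concentrations of a buffer at the given fold strength."""
    # Rounding removes floating-point noise such as 0.6000000000000001.
    return {name: round(conc * fold, 6) for name, conc in composition.items()}


hybridization_buffer = scale_buffer(SET_1X_MOLAR, 4)
for name, conc in hybridization_buffer.items():
    print(f"{name}: {conc} M")
```

At 4X the hybridization salt is therefore 0.6 M NaCl, while the wash quoted in the same legend is 0.15 M NaCl, a fourfold lower salt concentration at the same temperature.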

    Transcripts from the hsrω locus of D. pseudoobscura resemble those of D. melanogaster and D. hydei: Figure 1 schematically illustrates the general structure of hsrω loci in the three species of Drosophila studied. The diagram is based on results from RNA blot hybridizations that relate specific regions of the genomic DNA to particular transcripts (GARBE 1988; GARBE and PARDUE 1986; GARBE et al. 1986). In all three species, we detect three transcripts of a characteristic pattern. There are small species-specific differences in the sizes of the ω2 and ω3 transcripts. These correspond to small insertions and deletions in the sequence of the unique portion of the locus. In addition, we have noted that the size of the D. pseudoobscura ω1 transcript is significantly larger than the equivalent transcript of D. melanogaster or D. hydei. This larger difference in ω1 appears to occur in the region containing the tandem repeats, which varies between 8 and 15 kb. The region of tandem repeats can also vary between alleles within a species (our unpublished data).

    Sequence data indicates similar structural organization in the unique region: Analysis of the cloned sequences from D. pseudoobscura showed that, in this species, as in D. melanogaster and D. hydei, the hsrω locus contains a 5' unique region, containing two exons and one intron, followed by a region of repeated sequence. A comparison of the fine structure of the hsrω loci in the three Drosophila species was performed after sequence determination of the D. pseudoobscura clone. Sequence analysis of the unique region showed the pattern of regional conservation that had been seen in the D. melanogaster/D. hydei comparisons (Figure 3). This will be discussed below in conjunction with the other two species. Our analysis of the repeated regions will be presented elsewhere (our manuscript in preparation).

    As an initial indicator of the levels of sequence identity in the hsrω loci of the three species, we have used the programs COMPARE and DOTPLOT (UWGCG, see MATERIALS AND METHODS) to provide a graphic representation of homologous regions. The program COMPARE was used to identify regions of at least 14 identical nucleotides in a search window of 21 (67% identity). The identified regions were then plotted in a dot matrix format using the program DOTPLOT. No distinctions were made for levels of identity above 67%. The results of two-way comparisons for the three loci are shown in Figure 4. The highest level of sequence homology is seen for D. pseudoobscura and D. melanogaster (Figure 4a). This homology reflects the closer evolutionary relationship between these two species as compared to D. hydei. As previously shown (GARBE et al. 1986), very little homology is detected between D. melanogaster and D. hydei (Figure 4c). Although D. pseudoobscura appears

    [FIGURE 3. Aligned nucleotide sequences of the unique region of the hsrω locus; rows are labeled DRP and DRM, with positions 1 through 2001 marked at 100-nucleotide intervals.]

    CATATGTGCTGAAAACGCACTCGGCCCGATCCCGATTC€AGCGTTATTCGAAAGCTGTGTCTGCGACCGTGACTGAGATCATATGCGTACATATATCT

    AATGTCCGGGGTCGTGGGCCAGCCAGTGC.TCGATTCTGTCAGATTGATTGTGC~ATTGTGTTATAGGAACACTffiTGTATCGACTTCTCT~TCCA

    CTATGGGTGAAGGATACCCTACCGtVVYiGGCCTTCTGTCGCTTAC.TATCATCGAACAAGTTCCG ..... TAAAGGGCAGACATACGTACA . . . . CGTGG

    CAGCATATTACGTTC AA. . . . . . T G A C A C A T C G T C T C T G G A T T A G T A G T T G A A C C A A C G A G C T T T

    . . . TGTTCTTATATTCAGTTGTAGTCAATACGGaGACATTTCCACACCAAC~~,TGTGTCACTTATGTTCTTAAATACCAGAAACTGTTTAAA...TT

    2084

    W G A G C T C ... TATAACAGAAAAGCCACAGAAAAAGTGAATWTTCUTTC

    407

    100

    200

    300

    400

    500

    600

    700

    800

    900

    1000

    1100

    1200

    1300

    1400

    1500

    1600

    1700

    1800

    1900

    2000

    FIGURE 3.-Aligned unique region sequences for D. pseudoobscura and D. melanogaster hsrw loci. DRP and DRM denote D. pseudoobscura and D. melanogaster, respectively. The sequences were aligned for maximum homology using the program IALIGN. The alignment begins at the TATA box and ends 2 nucleotides beyond the polyadenylation site as determined for D. melanogaster. Homologous regions of significance in RNA metabolism are enclosed in boxes. The 5' splice junction (GT at position 537) and the 3' splice junction (AG at position 1312) are marked with black arrowheads. The polyadenylation site (after the T at position 2082) is marked with an open arrowhead. The transcription start site (A at position 33) is indicated with the arrow at the top of the sequence. A small open reading frame conserved in D. pseudoobscura, D. hydei and D. melanogaster (see text and Figure 9) is enclosed in brackets, beginning at position 152.


    [Figure 4 dot matrix panels; the one legible axis label reads D. pseudoobscura.]

    FIGURE 4.-Dot matrix plots showing homologous regions of hsrw loci of D. pseudoobscura, D. melanogaster and D. hydei. The region of unique sequence from the TATAA box to the polyadenylation site was compared for homology by the program COMPARE. The results of the comparison were plotted with the program DOTPLOT. Each dot on the matrix represents the center of a 21-nucleotide window in which at least 14 nucleotides are identical in the two species. Regions that are homologous for all three species include those that are involved in RNA metabolism. The highly conserved 3' splice site is indicated by the arrow.

    to show slightly higher homology with D. hydei (Figure 4b) than does D. melanogaster, such homology would not be expected since both the melanogaster and obscura subgroups are believed to be equally diverged from the hydei subgroup (BEVERLEY and WILSON 1984) and the comparative alignments of the sequences (see below) do not confirm this apparent closer relationship of D. hydei and D. pseudoobscura.
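    The window criterion used for the dot matrix plots of Figure 4 (a 21-nucleotide window with at least 14 identities) can be sketched as follows. This is an illustrative reimplementation only, not the COMPARE or DOTPLOT programs themselves; the function and parameter names are assumptions.

```python
def dot_matrix(seq_a, seq_b, window=21, min_identical=14):
    """Return (i, j) window centers (0-based) at which at least
    `min_identical` of the `window` aligned positions are identical."""
    half = window // 2
    dots = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            # Compare the two windows position by position, without gaps.
            identical = sum(
                a == b
                for a, b in zip(seq_a[i:i + window], seq_b[j:j + window])
            )
            if identical >= min_identical:
                dots.append((i + half, j + half))
    return dots
```

    Each returned pair corresponds to one dot. Runs of dots along a diagonal mark regions that stay similar without insertions or deletions, which is why the short conserved blocks stand out against the diverged background.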

    Sequence alignment reveals specific regions of conservation: The dot matrix results reveal several highly conserved regions in the hsrw loci of all three species. To more precisely determine the regions of identity in the hsrw loci of D. pseudoobscura and D. melanogaster, the sequences were aligned using the program IALIGN. To align the hsrw sequences we used the conserved 60 nucleotides surrounding the 3' splice site as a starting point. The introns were aligned in a 3' to 5' direction starting at the acceptor splice junction and proceeding toward the transcription start site for 800 nucleotides to go well past the 5' splice site. (The introns range from 714 to 739 nucleotides, depending on the species.) In each species comparison, the 14-16 nucleotides (depending on the species compared) around the 5' splice site of the two sequences coincide when aligned by this program. We then aligned sequences 5' to the donor splice site by starting with the 14 nucleotides of identical sequence at that site and continuing through the transcription start. (The portion 5' to the alignment start at the 3' splice site had to be broken into two pieces because the program could not handle the total number of nucleotides.) Sequences in the second exon were aligned by beginning again with the conserved 60 nucleotides at the acceptor splice site and proceeding in a 5' to 3' direction.

    Each of the three segments was aligned for all three Drosophila species, using a scoring matrix of: identities, 3; transitions, 1; transversions, 0; gaps, -6. We have compared this matrix with several others. Results obtained with three of the matrices are shown in Table 1. For most of the nine alignments, the matrices with partial scoring for transitions give significantly better (in some cases 2-3-fold) alignment scores (alignment score: number of standard deviations from a population of randomly scrambled sequences) than a matrix which scores only identities (Table 1). Our results show consistent coincidence of the regions that we have biological reasons to think are homologous (see below). In contrast, alignments made by a scoring matrix that was not sensitive to transitions left these proposed regions of homology offset in several cases.
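    The scoring scheme and the alignment score described above can be illustrated with a short sketch. It is not the IALIGN program: it assumes a plain global alignment in which each gapped position costs the gap value, and it estimates the alignment score from 25 random permutations as described for Table 1. All function names, and the fixed random seed, are assumptions made for the example.

```python
import random
import statistics

PURINES = frozenset("AG")


def pair_score(a, b, identity=3, transition=1, transversion=0):
    """Score one aligned nucleotide pair (values of matrix A)."""
    if a == b:
        return identity
    # A<->G and C<->T are transitions; purine<->pyrimidine is a transversion.
    if (a in PURINES) == (b in PURINES):
        return transition
    return transversion


def global_score(s, t, gap=-6):
    """Best global alignment score, charging `gap` per gapped position."""
    prev = [j * gap for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        cur = [i * gap]
        for j in range(1, len(t) + 1):
            cur.append(max(
                prev[j - 1] + pair_score(s[i - 1], t[j - 1]),  # pair the bases
                prev[j] + gap,                                  # gap in t
                cur[j - 1] + gap,                               # gap in s
            ))
        prev = cur
    return prev[-1]


def alignment_score(s, t, n_permutations=25, seed=0):
    """Standard deviations separating the real score from permuted scores."""
    rng = random.Random(seed)
    observed = global_score(s, t)
    permuted = []
    for _ in range(n_permutations):
        a, b = list(s), list(t)
        rng.shuffle(a)
        rng.shuffle(b)
        permuted.append(global_score("".join(a), "".join(b)))
    return (observed - statistics.mean(permuted)) / statistics.stdev(permuted)
```

    Setting transition=0 in pair_score gives an identities-only scheme, which is one way to see how partial credit for transitions changes the score of the same pair of sequences.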

    Examination of the aligned sequences shows that the longest runs of unbroken homology that exist in all three species occur at sites that appear to be involved in RNA processing. These and other short regions of sequence conservation that appear to be


    TABLE 1

    Sequence alignment scores

    Comparison                          Matrix(a)  5' Upstream(b)  5' Exon  Intron  Intron*(c)  3' Exon

    hsrw
    D. melanogaster/D. hydei            A          4.48            10.39    19.98   9.83        20.26
                                        B          5.15            8.66     13.00               21.70
                                        C          2.14            7.33     8.54    7.33        8.34
    D. melanogaster/D. pseudoobscura    A          3.70            28.97    18.30   14.18       40.92
                                        B          2.75            23.06    20.00               37.34
                                        C          3.88            20.93    12.01   20.93       25.99
    D. pseudoobscura/D. hydei           A          5.01            14.14    13.56   9.82        18.39
                                        B          6.00            10.47    16.32               17.58
                                        C          2.35            12.53    9.38    12.53       10.18

    hsp82 intron(d)
    D. melanogaster/D. virilis          A                                   1.57
                                        B                                   0.57
                                        C                                   0.62
    D. melanogaster/D. pseudoobscura    A                                   3.79
                                        B                                   5.00
                                        C                                   2.28
    D. virilis/D. pseudoobscura         A                                   2.82
                                        B                                   3.50
                                        C                                   1.86

    The sequences were aligned by the IALIGN program of NBRF. For each comparison, the maximum similarity score was compared with the scores of 25 alignments with randomly permuted versions of the sequences being aligned. The alignment score given in this table is the number of standard deviations separating the similarity score of this alignment from the scores of the group of 25 permuted versions.

    (a) Matrix A: identities, 3; transitions, 1; transversions, 0; gaps, -6; bias, 0; matrix B: identities, 3; transitions, 1; transversions, 0; gaps, -2; bias, 0; and matrix C: identities, 1; transitions, 0; transversions, 0; gaps, -2; bias, 2.

    (b) 5' Upstream: the 360 nucleotides 5' to the transcription start site.

    (c) Intron*: intron minus the 14-16 nucleotides of homology at the 5' splice site and minus the 40 nucleotides of homology at the 3' splice site.

    (d) The hsp82 intron sequences are taken from the work of BLACKMAN and MESELSON (1986).

    biologically important are discussed below.

    Acceptor splice site: The most striking region of homology among the species revealed by the dot matrix analysis surrounds the acceptor splice site (Figures 4 and 5). This region of 60 nucleotides is 100% identical in the three species examined. It is remarkable because much of it is within the intron. The region of identity begins with 5 nucleotides that match the consensus for the branch point acceptor in both sequence and position. The sequence identity continues 20 nucleotides into the second exon. A comparison of the same region of the hsp82 3' splice region shows only 9 nucleotides in the 3' 40 nucleotides of the intron to be identical.

    [Figure 5 sequence panels: hsrw 3' splice region for DRP, DRM and DRH (upper panel) and hsp82 3' splice region for DRP, DRV and DRM (lower panel), with the potential branch acceptor TTAAC and the AG at the splice junction marked. The sequences are not reproduced here.]

    FIGURE 5.-The sequence surrounding the hsrw 3' splice site is highly conserved. Eighty nucleotides in the region of the hsrw 3' splice site are shown (upper panel). Vertical bars indicate nucleotides that are identical in D. pseudoobscura, D. melanogaster and D. hydei. Asterisks indicate nucleotides that are the same for D. pseudoobscura and D. melanogaster. The arrow indicates the splice junction and the sequence TTAAC corresponds to a potential branch acceptor. The 3' splice junction region for hsp82 (lower panel) of D. melanogaster, D. pseudoobscura and D. virilis (BLACKMAN and MESELSON 1986) shows very little conservation within the intron and no conservation in sequence or position of the potential branch acceptor.

    consensus               CAGGTRAGT
    DRP         GAGTTAGGAAGCCAGGTATAAAC
    DRM         GAAATAGGAAGCCAGGTATACAT
    DRH         GGGATAGGAAGCCAGGTATTAAC
    U1 RNA 3'       GAGAUGCGGUCCAUUCAUApppG 5'

    FIGURE 6.-The sequences around the 5' splice junction of the D. melanogaster, D. pseudoobscura and D. hydei hsrw loci are aligned for homology. Also indicated are the consensus 5' splice sequence (MOUNT 1982) and the 5' nucleotides of U1 RNA. The conserved hsrw 5' splice sequence shows only six matches to the consensus although these are adjacent nucleotides. The conserved sequences also reveal eight adjacent nucleotides that are perfectly complementary to the U1 RNA region believed to be involved in splicing.

    Donor splice site: Computer alignments of the intron sequences, beginning at the 3' splice junction and continuing toward the 5' end, also result in the alignment of a 14-16 nucleotide region of identity around the donor splice site (Figure 6). The level of identity for the hsrw loci is similar to that seen for the donor splice site of hsp82 (9 nucleotides identical). For both hsp82 and hsrw, the identical nucleotides are mainly in the exon sequences. In addition, the hsp82 splice site shows more similarity to the splice site consensus sequence (MOUNT 1982) than does hsrw. However, more sequence complementarity to U1 RNA is seen in the exon sequences of the hsrw loci than in the same region of hsp82.
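    The two observations made for the donor site, the run of identity shared by the three species and the complementarity to U1 RNA, can be checked against the sequences of Figure 6. The sketch below is illustrative only: the sequences are transcribed from that figure, only Watson-Crick pairs are counted, and the function names are not taken from the original analysis.

```python
# hsrw sequences around the 5' splice junction, as read from Figure 6.
DONOR = {
    "DRP": "GAGTTAGGAAGCCAGGTATAAAC",
    "DRM": "GAAATAGGAAGCCAGGTATACAT",
    "DRH": "GGGATAGGAAGCCAGGTATTAAC",
}
# 5' end of U1 RNA written 3'->5', as in Figure 6 (cap omitted).
U1_3_TO_5 = "GAGAUGCGGUCCAUUCAUA"


def longest_shared_run(seqs):
    """Longest run of columns identical in all pre-aligned sequences."""
    best, current = "", ""
    for column in zip(*seqs):
        if len(set(column)) == 1:
            current += column[0]
            if len(current) > len(best):
                best = current
        else:
            current = ""
    return best


def longest_common_substring(a, b):
    """Longest stretch present in both strings (simple quadratic scan)."""
    best = ""
    for i in range(len(a)):
        for j in range(len(b)):
            k = 0
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            if k > len(best):
                best = a[i:i + k]
    return best


# DNA strand (5'->3') that would pair base for base with U1 read 3'->5'.
U1_PARTNER = U1_3_TO_5.translate(str.maketrans("ACGU", "TGCA"))

shared = longest_shared_run(DONOR.values())
paired = {name: longest_common_substring(seq, U1_PARTNER)
          for name, seq in DONOR.items()}
```

    With these inputs the shared run is 15 nucleotides long and the longest stretch able to pair with U1 is 8 nucleotides in each species, in line with the 14-16 nucleotide region of identity and the eight complementary nucleotides described for Figure 6.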

    Polyadenylation site: A further region of homology detected by both the dot matrix analysis and the sequence alignments is near the polyadenylation site in the unique portion of hsrw (Figure 7). In all three species, this region is very A-rich (up to 65%). The polyadenylation signal falls within a 15-21 nucleotide (depending on species compared) segment with only one mismatch. In addition, the polyadenylation signal in each species is complementary to a sequence further downstream from the polyadenylation site. This complementarity would allow formation of a stem-loop structure with the polyadenylation signal in the stem and the cleavage site near the center of the loop. Two cDNA clones from the D. melanogaster hsrw locus have been sequenced (GARBE and PARDUE 1986; RYSECK et al. 1987) and the existence of two nearby polyadenylation sites noted. The proposed stem-loop structure can be formed with either of the polyadenylation signals and the downstream sequence. Either of these structures would have cleavage of the RNA occurring in the loop.
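    The kind of inverted repeat proposed here can be searched for mechanically. The sketch below looks for a downstream copy of the reverse complement of a polyadenylation signal. The hexanucleotide AATAAA is used as the signal by assumption, since the text does not spell out the signal sequence, and the test sequence is invented for illustration rather than taken from Figure 7.

```python
def reverse_complement(dna):
    """Reverse complement of a DNA string written 5'->3'."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def stem_partners(seq, signal="AATAAA", min_loop=3):
    """Return (signal_start, partner_start) pairs (0-based) where the
    reverse complement of `signal` lies at least `min_loop` nucleotides
    downstream of the signal, so that the two could form a stem."""
    partner = reverse_complement(signal)
    hits = []
    start = seq.find(signal)
    while start != -1:
        p = seq.find(partner, start + len(signal) + min_loop)
        while p != -1:
            hits.append((start, p))
            p = seq.find(partner, p + 1)
        start = seq.find(signal, start + 1)
    return hits


# Hypothetical sequence: one signal and one complementary block downstream.
toy = "GTGAATAAAA" + "CCTTCAATAA" + "TTCGAAACAT" + "TTTATTTCAAG"
pairs = stem_partners(toy)
```

    In such a structure the nucleotides between the two arms form the loop, which is where the cleavage site would fall under the proposal above.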

    Transcription start site: Another region of homology that we can identify as having biological significance is around the transcription start site. This region is conserved among the loci (16 nucleotides identical out of 23; Figure 8) and for all three loci shows homology to sequences at the transcription start site of other Drosophila heat shock genes. The sequence is rich in A residues in the +5 to +15 region, as are the other heat shock loci. We note that five of the six specific nucleotides which are conserved in all Drosophila hsp mRNAs, except that for hsp82 (HULTMARK, KLEMENZ and GEHRING 1986), are also conserved in the hsrw loci; the sixth of these nucleotides, an A in position 20, is replaced by T in all three hsrw transcripts. Although the position of the TATAA box is conserved with respect to the transcription start site, there is little sequence conservation in the intervening nucleotides.

    The conserved open reading frame, ORF-w: The cytoplasmic transcripts of hsrw are as large as the mRNAs for many moderately sized polypeptides. The hsrw transcript is also spliced and polyadenylated, features considered typical of mRNAs. Nevertheless the cytoplasmic transcripts in each species are distinguished by the lack of a large open reading frame (ORF). There are larger potential ORFs on the opposite, nontranscribed, strand, suggesting that the small ORFs in the transcript may also be random occurrences (GARBE et al. 1986). Sequence alignment shows only one ORF that is conserved in position and approximate size in all three species (Figure 9). This ORF, called ORF-w, is found near position +120 from the transcription start site and is the first ORF in each species that is in a sequence context thought to be favorable for translation (KOZAK 1984, 1986; CAVENER 1987). Both D. melanogaster and D. pseudoobscura have an additional ORF closer to the transcription start site but neither of these ORFs is in a favorable context. The polypeptide that would be derived from ORF-w is not strongly conserved. It would be small, 23 amino acids in D. hydei, 24 in D. pseudoobscura, and 27 in D. melanogaster. The first four amino acids (Met, Glu, Lys, Cys) are conserved, although in D. hydei Glu is encoded by its alternate codon. Only two other amino acids in the predicted polypeptide are identical in all three species but several amino acids are conserved in two of the species. We can see no evidence that the amino acid substitutions are conservative changes that would still allow maintenance of a particular polypeptide structure. In spite of the apparent sequence divergence of ORF-w, we have evidence that this ORF is translated (BENDENA et al. 1989a; FINI, BENDENA and PARDUE 1989).
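    As a rough check on the peptide lengths quoted above, the sketch below translates ORF-w from the D. pseudoobscura and D. melanogaster sequences as they can be read from Figure 9. The two printed lines for each species are assumed to be contiguous, the standard genetic code is assumed, and the function names are illustrative only.

```python
# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Region around ORF-w, transcribed from the two lines of Figure 9.
ORF_REGION = {
    "DRP": ("CAACGAAAAAATGGAAAAGTGTGGAAAACGTGTCGTAAATGCGAAACAAAGGCCATATTC"
            "CAGCATGGCAACAACGCCAACGTAGTCTTTCTCAATGTCGGGCAATATTTGCTCATAA"),
    "DRM": ("ACAACCAAAAATGGAAAAGTGTAAAAATCGTGTCCCAGCAGACGAGCAGCAGCAGTACGA"
            "GTATTGCAAAATGCAGGGGCAAGGGCCCACGTAGTATTTTTCCACGTCGGGCATTTAA"),
}


def translate_first_orf(dna):
    """Translate from the first ATG to the first in-frame stop codon."""
    start = dna.find("ATG")
    if start == -1:
        return ""
    peptide = []
    for pos in range(start, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":  # stop codon
            break
        peptide.append(residue)
    return "".join(peptide)


peptides = {name: translate_first_orf(seq) for name, seq in ORF_REGION.items()}
```

    Read this way, the D. pseudoobscura and D. melanogaster peptides are 24 and 27 residues long and both begin Met-Glu-Lys-Cys, matching the description in the text. The D. hydei sequence is left out because one nucleotide in its second line could not be read with confidence.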

    5' Flanking sequences: In contrast to the larger regions of homology seen in the transcribed regions of the hsrw locus, analysis of the upstream sequences (Figure 10) shows only one significant conserved block; this is around the TATA box. The sequence alignments reveal about 66-70% overall homology between species in the upstream regions, when the introduction of gaps is allowed. This homology level is similar to that seen for the transcribed regions; however, the alignment scores are generally lower for the upstream regions (Table 1), suggesting that the matrix chosen does not fit this region as well as the others.

    [Figure 7 sequence panels: DRP, DRM and DRH sequences in the region of the polyadenylation site, in two blocks. The sequences are not reproduced here.]

    FIGURE 7.-Sequence homology is detected in the region of the hsrw polyadenylation site. DRM, DRP and DRH refer to D. melanogaster, D. pseudoobscura and D. hydei, respectively. The last 60-100 nucleotides before the polyadenylation site for the cytoplasmic transcript are very A-rich. The aligned sequence shows most homology beginning with the polyadenylation signal and continuing through the polyadenylation site. For this figure sequences were aligned by eye. The vertical arrows indicate the two polyadenylation sites that have been determined for D. melanogaster. The rightward arrows indicate the polyadenylation signals corresponding to these two sites. The leftward arrow indicates a sequence that forms an inverted repeat with either of the polyadenylation signals.

    The sequences seen in the promoter region (Figure 10) contain heat shock elements (HSEs), as expected, since the transcripts show the pattern of induction seen for other heat shock RNAs. If HSEs are defined by the classic consensus sequence CnnGAAnnTTCnnG (PELHAM 1982), the nearest HSE (near -60 from the transcription start) varies in its match to the consensus. For D. hydei the match is 9 of 10, for D. pseudoobscura it is 8 of 10, and for D. melanogaster it is 6 of 10. If the HSE is based on the sequence TTCnnGAA as proposed by XIAO and LIS (1988) this region fits with the same relative levels of match. All three species have other fits to both consensus sequences further upstream. Positions of the upstream HSEs are not conserved with respect to the other species.
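    Matches to an HSE consensus of this kind can be counted as sketched below. The consensus is passed in as a parameter because the form printed above specifies eight positions, whereas the matches are quoted out of 10; the ten-position form CTnGAAnnTTCnAG used in the example is assumed here to be the version behind those counts. The function names and the test sites are hypothetical.

```python
def consensus_matches(site, consensus):
    """Return (matched, specified): matches at the non-n consensus positions."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    specified = [(s, c) for s, c in zip(site.upper(), consensus) if c != "n"]
    matched = sum(s == c for s, c in specified)
    return matched, len(specified)


def best_site(sequence, consensus):
    """Slide the consensus along the sequence and return
    (matched, specified, start) for the first best-scoring window."""
    width = len(consensus)
    best = None
    for start in range(len(sequence) - width + 1):
        matched, total = consensus_matches(
            sequence[start:start + width], consensus
        )
        if best is None or matched > best[0]:
            best = (matched, total, start)
    return best


# Hypothetical sites scored against the assumed ten-position consensus.
perfect = consensus_matches("CTAGAAGCTTCTAG", "CTnGAAnnTTCnAG")
one_off = consensus_matches("CTAGAAGCTTCTAC", "CTnGAAnnTTCnAG")
```

    Applying best_site to an upstream region reports where the best match lies as well as how good it is, which is the information needed to ask whether HSE positions are conserved between species.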

    In the three Drosophila species studied here the only difference that we detect in heat shock induction is seen in D. pseudoobscura, where maximal production of all heat shock products is seen below the temperature of maximal induction in the other species. This lower temperature of heat shock induction may be correlated with a lower optimal growth temperature since we find that our D. pseudoobscura stock will not grow at temperatures above 25°, although D. melanogaster and D. hydei stocks survive well at these temperatures.

    DISCUSSION

    Although the hsrw locus was originally identified because of its participation in the heat shock response, our studies of the transcripts have shown that this locus, like the hsp82 gene, is constitutively expressed in most cells but has increased activity in stressed cells (GARBE and PARDUE 1986). Previous evidence regarding this locus had indicated several unusual features that distinguished it from the other heat shock loci and, in fact, from any known gene (GARBE and PARDUE 1986; GARBE et al. 1986; HOVEMANN, WALDORF and RYSECK 1986; RYSECK et al. 1987). We suggest that the apparent rapid divergence of this locus stems from the fact that only a minor portion of the transcript is involved in protein-coding and thus most of the sequence is not constrained by the need to maintain a consecutive reading frame or by the structure of an encoded protein. Freed of this constraint, the sequence has acquired many small deletions and insertions and is evolving in much the way that has been noted for some introns and other noncoding regions (EFSTRATIADIS et al. 1980; MARTINEZ-CRUZADO et al. 1988). Such insertions and deletions rapidly eliminate the ability to detect homology by cross-hybridization, yet they have apparently not destroyed whatever function the transcripts have since each species has only one hsrw puff and that puff must therefore be functional. The deletions and insertions tend to balance out, leaving the sizes of the transcripts and the spacing of the conserved regions relatively constant in the different species.

    If an RNA molecule is to perform some conserved function, one would expect some sort of sequence or structural conservation. In spite of the marked evolutionary changes, the hsrw sequences do show some regions of strong conservation. The sequences seen in the promoter region contain typical heat shock elements, as expected, since the transcripts show the pattern of induction seen for other heat shock RNAs.

    For the Drosophila hsps 22-28, hsp68 and hsp70 mRNAs, the region at the transcription start that includes the 6 conserved nucleotides (-1, +1, +7, +12, +15 and +20) has been implicated in both heat shock translation and efficient transcription (HULTMARK, KLEMENZ and GEHRING 1986; MCGARRY and LINDQUIST 1985), although the conserved nucleotides have not been specifically tested. The conservation of 5 of these nucleotides in hsrw is consistent with both of the postulated functions since the hsrw cytoplasmic transcript is associated with polysomes under heat shock conditions (FINI, BENDENA and PARDUE 1989).
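    The six positions can be read directly off the aligned sequences of Figure 8. The sketch below does so, with the sequences transcribed from that figure and the transcription start (+1) taken as the A at position 6 of each sequence, as stated in the figure legend; the helper names are illustrative.

```python
# Sequences around the transcription start, as read from Figure 8.
HSRW = {
    "DRM": "ACGTCACTCTCAAATGAAAAGTGTT",
    "DRH": "CAGTCATTCGAAAAAAATAAGTGTT",
    "DRP": "CTGCCATTCGCAAAAAAAAAGTGTT",
}
HSP = {
    "HSP22": "CTCTCAGTTCAAAAAAACCAAACCA",
    "HSP23": "GCGTCAGTTGAATTCAAAAAGCCAA",
    "HSP26": "AGCACAGATCGAATTCAAAAATCGA",
    "HSP27": "AGCACAGTCTAAACTGAAAAATTGA",
    "HSP68": "GCTACATTTGAAATCAAACAGTCAA",
    "HSP70": "GCGTCAATTCAATTCAAACAAGCAA",
}

START_INDEX = 5  # 0-based index of +1 (the A at position 6)
POSITIONS = (-1, 1, 7, 12, 15, 20)


def to_index(position):
    """Convert a position relative to the start site (there is no
    position 0) into a 0-based string index."""
    if position > 0:
        return START_INDEX + position - 1
    return START_INDEX + position


def signature(seq):
    """Nucleotides found at the six positions of interest."""
    return "".join(seq[to_index(p)] for p in POSITIONS)


hsp_signatures = {name: signature(seq) for name, seq in HSP.items()}
hsrw_signatures = {name: signature(seq) for name, seq in HSRW.items()}
```

    For the sequences as transcribed, every hsp entry gives CAAAAA and every hsrw entry gives CAAAAT, that is, agreement at the first five positions and a T in place of the A at +20.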

    DRM     ACGTCACTCTCAAATGAAAAGTGTT
    DRH     CAGTCATTCGAAAAAAATAAGTGTT
    DRP     CTGCCATTCGCAAAAAAAAAGTGTT

    HSP22   CTCTCAGTTCAAAAAAACCAAACCA
    HSP23   GCGTCAGTTGAATTCAAAAAGCCAA
    HSP26   AGCACAGATCGAATTCAAAAATCGA
    HSP27   AGCACAGTCTAAACTGAAAAATTGA
    HSP68   GCTACATTTGAAATCAAACAGTCAA
    HSP70   GCGTCAATTCAATTCAAACAAGCAA

    FIGURE 8.-Sequences near the hsrw transcription start site are highly conserved (upper panel). The sequences at the transcription start site for the hsrw loci of D. melanogaster, D. hydei and D. pseudoobscura (DRM, DRH and DRP, respectively) are aligned. Nucleotides that are the same in at least two of the species are boxed. The transcription start site is the A at position 6 (arrow). Sequences near the transcription start site for other D. melanogaster heat shock mRNAs are shown in the lower panel. The sequences are aligned according to HULTMARK, KLEMENZ and GEHRING (1986). The A at position 6 represents the transcription start used for the alignment (arrow). Nucleotides at six positions are conserved for the hsp mRNAs; -1, +1, +7, +12, +15 and +20. The first 5 of these nucleotides are also conserved in the hsrw loci. In position six, all the hsrw loci have a T rather than an A. The two As in the consensus that are indicated by arrowheads are the only members of the consensus that are also conserved in the hsp82 sequence.

    Data on the sequences required for splicing indicate that the only absolute requirements are the consensus GT at the donor site, the AG at the acceptor site and an appropriately placed branch point (PADGETT et al. 1986). Yet the hsrw transcripts have larger regions of conserved sequence around both the donor and the acceptor splice sites. The acceptor splice site is especially remarkable in showing 100% homology over 60 nucleotides. Although it is possible that this large region might be required for splicing under heat shock conditions, the lack of conservation of this region in the hsp82 mRNA argues against this. Like hsrw, the hsp82 transcript is produced and spliced during heat shock. The hsp82 genes from D. melanogaster, D. pseudoobscura and D. virilis have been sequenced by BLACKMAN and MESELSON (1986). For hsp82, the sequence around the acceptor splice site and the sequence and position of the branch point acceptor show variation between these species. Of the last 40 nucleotides in the hsp82 intron, only 9 are identical and they are scattered. The first 20 nucleotides of the hsp82 second exon do include 18 identical nucleotides but, since the hsp82 exon encodes the protein, the sequence conservation here may be required for protein structure, an explanation that does not hold for the sequences of the hsrw second exon.

    A more attractive explanation for the conservation of the region around the 3' splice site of hsrw is that it is involved in the regulation of splicing. In some cases changes in the pattern of splicing and polyadenylation are used to modulate gene expression (LEFF, ROSENFELD and EVANS 1986). The expression of the secreted form of IgM μ heavy chain is controlled by alterations in polyadenylation site selection that subsequently determine the pattern of splicing that occurs. For the calcitonin/CGRP gene, alterations in splicing affect the choice of polyadenylation sites. In these examples, the changes in RNA processing result in clear changes in the protein products synthesized. Changes in the processing of the hsrw transcripts appear to result only in changes in the relative amounts of the three hsrw transcripts, w1, w2 or w3. If, as we propose (BENDENA et al. 1989a), the transcripts have different roles, changes in their relative levels could be significant.

    We can suggest two points at which conserved sequences around the 3' splice site might affect the processing of hsrw transcripts. First, the unspliced precursor, w2, of the cytoplasmic transcript, w3, is much more abundant than the unspliced precursors of other mRNAs. The level of the spliced w3 may be modulated by controlling the rate of splicing of w2 and the conserved sequences may play a role in this control. A second point of control is the alternative polyadenylation which appears to determine whether a transcript will be w2 or w1 (BENDENA et al. 1989b). This alternative polyadenylation occurs at a site in the unique portion of the gene and results in the w2 transcript which has none of the repeats found in w1. The large amount of w2 RNA that we detect in our RNA samples indicates that polyadenylation site selection is probably the primary factor in determining whether or not the hsrw1 transcript will be made. However, we cannot rule out the possibility that the decision to make the hsrw2 RNA results from the binding of spliceosomes to the acceptor splice site but that the splicing reaction itself proceeds much more slowly than might be expected.

    Although the mechanisms for utilization of alternative termination sites are unknown, it is intriguing that the region of alternative termination in hsrw has the potential to form a stem-loop. The ability to form a stem-loop structure has been seen at other termination sites such as histones (BIRCHMEIER et al. 1983), adenovirus E2A (MCDEVITT et al. 1984) and myosin heavy chain (J. MCCARTHY, personal communication). The only other conserved features of the region of alternative termination are the blocks of A-rich sequences in the last 100 nucleotides of the second exon. Conservation of a sequence composed almost entirely of A and T has been noted in the 3' region of many lymphokines and proto-oncogenes (SHAW and KAMEN 1986) and these sequences have been implicated in the rapid turnover of the mRNAs. It may be that the very A-rich sequence in the hsrw transcript is responsible for the very rapid turnover which we see in the w3 RNA (BENDENA et al. 1989b).

    We have focused on sequence conservations in re- gions of known function, transcript start sites, splice sites, etc. Interestingly, in the sequence alignment these were also the longest stretches of 100% identity. However, alignments based entirely on sequence iden- tity showed that, though most runs of 100% identity were short, there was significant homology for both

  • Sequence Evolution of hsro

    107 . * * * . * # * . # # .# * . 167 D R P CAACGAAAAAATGGAAAAGTGTGGAAAACGTGTCGTAAATGCGAAACAAAGGCCATATTC

    108 MetGluLysCysGlyLysArgValValAsnAlaLysGlnArgProTyrSer

    . 168 DRM ACAACCAAAAATGGAAAAGTGTAAAAATCGTGTCCCAGCAGACGAGCAGCAGCAGTACGA

    MetGluLysCysLysAsnArgValProAlaAspGluGlnGlnGlnTyrGlu 115 . 175

    DRH TACAACTGCAATGGAGAAGTGTACAGTATTTGTTGCGAAGGCGAGATTGCGGAGCTATAA MetGluLysCysThrValPheValAlaLysAlaArgLeuArgSerTyrAsn

    168 228 D R P CAGCATGGCAACAACGCCAACGTAGTCTTTCTCAATGTCGGGCAATATTTGCTCATAA

    169 SerMetAlaThrThrProThr???

    229 DRM GTATTGCAAAATGCAGGGGCAAGGGCCCACGTAGTATTTTTCCACGTCGGGCATTTAA

    1 7 6 DRH

    -360 DRP DRM

    DRH -260

    DRP DRM

    DRH

    -160 DRP DRM DRH

    -60

    DRP DRM DRH

    FIGURE

    TyrCysLysMetGlnGlyGlnGlyProThr??? 236

    rATGTCACGTCGACTGAAATAATTGCTAGCGCACCGTTCAAGCTGTTTTGCTGTGAAC MetSerArgArgLeuLys???

    413

    FIGURE 9.-A small open reading frame is conserved in both position and approxi- mate size in D. pseudoobscura, D. melanogaster and D. hydei (DRP, DRM and DRH, respec- tively). The open reading frame begins at about +I20 from the transcription start site and would encode a 23 to 27 amino acid peptide, depending on the species. The first four amino acids are the same in the three ORFs. Other data indicates that this ORF is translated in D. melanogaster and D. hydei (see text). Asterisks indicate amino acids that are conserved in all three species while the # sign indicates amino acids conserved in two spe- cies. The numbers next to the sequence in- dicate the nucleotide position from the tran- scription start site. The sequences shown in this figure include a correction for the D. hydei sequence previously reported by us (GARBE et al. 1986). This correction alters the size of the previously identified open reading frame and the C-terminal sequence of the encoded polypeptide. The corrected sequence has been submitted to the Gen- Bank database.

    -261

    -161

    -61

    : 10.-The distribution of heat shock elements in the 5’ region of the hsrw loci shows very little conservation. Heat shock elements identified by homology to the consensus sequences of PELHAM (1982) are underlined; overlapping sequences are overlined. Only the consensus sequence located near position -60 is conserved in position in all three species; however the three species show different degrees of match to the consensus at this site. Sequences were aligned only by position with respect to the splice site.

    of the exons and the intron. When we added partial scoring for transitions, the alignment scores rose markedly in the intron and second exon (and less so in the first exon), suggesting that transitions are more frequent than transversions here, as they have been found to be in protein-coding regions (BLACKMAN and MESELSON 1986; BODMER and ASHBURNER 1984; SCHAEFFER and AQUADRO 1987). This result is surprising in view of data indicating that only a small region of the ω3 transcript (the ORF-ω) is involved in protein coding (FINI, BENDENA and PARDUE 1989). The apparently similar changes occurring in the non-protein-coding regions indicate that other evolutionary constraints may be operating on the hsrω sequence.
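The effect of partial scoring for transitions can be sketched with a small global alignment in the style of NEEDLEMAN and WUNSCH (1970). The scores used here (1 for a match, 0.5 for a transition, 0 for a transversion, -1 per gap position) are assumptions chosen for illustration and are not the parameters used in the analysis; the function names are likewise hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def pair_score(a, b, transition_score=0.5):
    """Score one aligned pair: match 1, transition partial, transversion 0."""
    if a == b:
        return 1.0
    if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
        return transition_score  # A<->G or C<->T
    return 0.0


def global_alignment_score(s, t, transition_score=0.5, gap=-1.0):
    """Best global alignment score of s and t with a linear gap penalty."""
    prev = [j * gap for j in range(len(t) + 1)]
    for i, a in enumerate(s, 1):
        curr = [i * gap]
        for j, b in enumerate(t, 1):
            curr.append(max(
                prev[j - 1] + pair_score(a, b, transition_score),  # align a with b
                prev[j] + gap,                                     # gap in t
                curr[j - 1] + gap,                                 # gap in s
            ))
        prev = curr
    return prev[-1]


# One A/G transition: the score rises only when transitions earn partial credit.
print(global_alignment_score("ACGT", "GCGT", transition_score=0.0))
print(global_alignment_score("ACGT", "GCGT", transition_score=0.5))
```

A region in which most substitutions are transitions gains score when `transition_score` is raised, whereas a region dominated by transversions does not, which is the contrast drawn in the text.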

    Although we can identify two very strong regions of homology at either end of the intron, these are not solely responsible for the high alignment score of the intron. Alignment of the intron without the regions around the splice site still gives significant alignment scores. Using the same scoring to align the hsp82 intron sequences published by BLACKMAN and MESELSON (1986) we do not obtain significant alignment scores for any of the comparisons (Table 1). Our comparison includes D. hydei, while BLACKMAN and MESELSON used D. virilis, but these two species are equally distant from the others compared.

    Taken together, the conserved features of the hsrω sequence suggest strongly that it is important to transcribe and process three transcripts of the size of ω1, ω2 and ω3. Further, the conserved sequences indicate several regions of the gene that may explain some of its unusual features. We are presently analyzing the various regions of the transcripts to better understand the role that they play in the metabolism of hsrω RNAs. This information, coupled with additional knowledge of the biology of the hsrω locus, should enable us to interpret more of this sequence.

    We are grateful to WILL GILBERT, ERIC LANDER and PAUL SCHIMMEL for enlightening discussions. Sequence analyses were done through the Whittaker College Computing Facility. J. C. G. was a predoctoral trainee of the National Institutes of Health. This work was supported by a grant from the National Institutes of Health to M. L. P.

    LITERATURE CITED

    AYME-SOUTHGATE, A., P. F. LASKO, C. K. FRENCH and M. L. PARDUE, 1989 Characterization of the gene for mp20: a Drosophila muscle protein that is not found in asynchronous oscillatory flight muscle. J. Cell Biol. 108: 521-531.

    BARDWELL, J. C., and E. A. CRAIG, 1987 Eukaryotic Mr 83,000 heat shock protein has a homologue in Escherichia coli. Proc. Natl. Acad. Sci. USA 84: 5177-5181.

    BENDENA, W. G., M. E. FINI, J. C. GARBE, G. M. KIDDER, S. C. LAKHOTIA and M. L. PARDUE, 1989a hsrω: a different sort of heat shock locus. ICN-UCLA Symp. Mol. Cell. Biol. 69: 4-13.

    BENDENA, W. G., J. C. GARBE, K. L. TRAVERSE, S. C. LAKHOTIA and M. L. PARDUE, 1989b Multiple inducers of the Drosophila heat shock locus 93D (hsrω): inducer-specific patterns of the three transcripts. J. Cell Biol. (in press).

    BENTON, W. D., and R. W. DAVIS, 1977 Screening Lambda-gt recombinant clones by hybridization to single plaques in situ. Science 196: 180-182.

    BEVERLEY, S. M., and A. C. WILSON, 1984 Molecular evolution in Drosophila and the higher diptera. II. A time scale for fly evolution. J. Mol. Evol. 21: 1-13.

    BIGGEN, M. D., T. J. GIBSON and G. F. HONG, 1983 Buffer gradient gels and 35S-label as an aid to rapid DNA sequence determination. Proc. Natl. Acad. Sci. USA 80: 3963-3965.

    BIRCHMEIER, C., W. FOLK and M. L. BIRNSTIEL, 1983 The terminal RNA stem-loop structure and 80 bp of spacer DNA are required for the formation of 3' termini of sea urchin H2A mRNA. Cell 35: 433-440.

    BLACKMAN, R. K., and M. MESELSON, 1986 Interspecific nucleotide sequence comparisons used to identify regulatory and structural features of the Drosophila hsp82 gene. J. Mol. Biol. 188: 499-515.

    BODMER, M., and M. ASHBURNER, 1984 Conservation and change in the DNA sequences coding for alcohol dehydrogenase in sibling species of Drosophila. Nature 309: 425-430.

    BURMA, P. K., and S. C. LAKHOTIA, 1984 Cytological identity of 93D-like and 87C-like heat shock loci in Drosophila pseudoobscura. Indian J. Exp. Biol. 22: 577-580.

    CAVENER, D. R., 1987 Comparison of the consensus sequence flanking translational start sites in Drosophila and vertebrates. Nucleic Acids Res. 15: 1353-1361.

    CHIRGWIN, J. M., A. E. PRYZBYLA, R. J. MACDONALD and W. J. RUTTER, 1979 Isolation of biologically active ribonucleic acid from sources enriched in ribonuclease. Biochemistry 18: 5294-5298.

    CRAIG, E. A., 1985 The heat shock response. CRC Crit. Rev. Biochem. 18: 239-280.

    DEVEREUX, J., P. HAEBERLI and O. SMITHIES, 1984 A comprehensive set of sequence programs for the VAX. Nucleic Acids Res. 12: 387-395.

    EFSTRATIADIS, A., J. W. POSAKONY, T. MANIATIS, R. M. LAWN, C. O'CONNELL, R. A. SPRITZ, J. K. DERIEL, B. G. FORGET, S. M. WEISSMAN, J. L. SLIGHTOM, A. E. BLECHL, O. E. SMITHIES, F. E. BARALLE, C. C. SHOULDERS and N. J. PROUDFOOT, 1980 The structure and evolution of the human β-globin gene family. Cell 21: 653-688.

    FEINBERG, A. P., and B. VOGELSTEIN, 1983 A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 132: 6-13.

    FEINBERG, A. P., and B. VOGELSTEIN, 1984 A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity (Addendum). Anal. Biochem. 137: 266-267.

    FINI, M. E., W. G. BENDENA and M. L. PARDUE, 1989 Unusual behavior of the cytoplasmic transcript of hsr omega: an abundant, stress inducible RNA that is translated, but that yields no detectable protein product. J. Cell Biol. (in press).

    GARBE, J. C., 1988 Characterization of heat shock locus 93D of Drosophila melanogaster. Ph.D. thesis, Massachusetts Institute of Technology.

    GARBE, J. C., and M. L. PARDUE, 1986 Heat shock locus 93D of Drosophila melanogaster: a spliced RNA most strongly conserved in the intron. Proc. Natl. Acad. Sci. USA 83: 1812-1816.

    GARBE, J. C., W. G. BENDENA, M. ALFANO and M. L. PARDUE, 1986 A Drosophila heat shock locus with a rapidly diverging sequence but a conserved structure. J. Biol. Chem. 261: 16889-16894.

    HEMMINGSEN, S. M., C. WOOLFORD, S. M. VAN DER VIES, K. TILLY, D. T. DENNIS, C. P. GEORGOPOULOS, R. W. HENDRIX and R. J. ELLIS, 1988 Homologous plant and bacterial proteins chaperone oligomeric protein assembly. Nature 333: 330-334.

    HOHN, B., and K. MURRAY, 1977 Packaging recombinant DNA molecules into bacteriophage particles in vitro. Proc. Natl. Acad. Sci. USA 74: 3259-3263.

    HOVEMANN, B., U. WALLDORF and R. P. RYSECK, 1986 Heat shock locus 93D of Drosophila melanogaster: an RNA with limited coding capacity accumulates precursor transcripts after heat shock. Mol. Gen. Genet. 204: 334-340.

    HU, M. C.-T., and N. DAVIDSON, 1986 Mapping transcription start points on cloned genomic DNA with T4 DNA polymerase: a precise and convenient technique. Gene 42: 21-29.

    HULTMARK, D., R. KLEMENZ and W. GEHRING, 1986 Translational and transcriptional control elements in the untranslated leader of the heat shock gene hsp22. Cell 44: 429-438.

    KARN, J., S. BRENNER and L. BARNETT, 1983 New bacteriophage vectors with positive selection for cloned inserts. Methods Enzymol. 101: 3-19.

    KOZAK, M., 1984 Compilation and analysis of sequences upstream from the translational start site in eukaryotic mRNAs. Nucleic Acids Res. 12: 857-872.

    KOZAK, M., 1986 Point mutations define a sequence flanking the AUG initiator codon that modulates translation by eukaryotic ribosomes. Cell 44: 283-292.

    KURTZ, S., J. ROSSI, L. PETKO and S. LINDQUIST, 1986 An ancient developmental induction: heat shock proteins induced in sporulation and oogenesis. Science 231: 1154-1157.

    LAKE, J. A., 1988 Origin of the eukaryotic nucleus determined by rate-invariant analysis of rRNA sequences. Nature 331: 184-186.

    LAKHOTIA, S. C., 1987 The 93D heat shock locus in Drosophila: a review. J. Genet. 66: 139-157.

    LAKHOTIA, S. C., and A. K. SINGH, 1982 Conservation of the 93D puff of Drosophila melanogaster in different species of Drosophila. Chromosoma 86: 265-278.

    LEFF, S. E., M. G. ROSENFELD and R. M. EVANS, 1986 Complex transcriptional units: diversity in gene expression by alternative RNA processing. Annu. Rev. Biochem. 55: 1091-1117.


    LINDQUIST, S., 1986 The heat shock response. Annu. Rev. Biochem. 55: 1151-1192.

    MAIZEL, J. V., and R. P. LENK, 1981 Enhanced graphic matrix analysis of nucleic acid and protein sequences. Proc. Natl. Acad. Sci. USA 78: 7665-7669.

    MANIATIS, T., E. F. FRITSCH and J. SAMBROOK, 1982 Molecular Cloning. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.

    MARTINEZ-CRUZADO, J. C., C. SWIMMER, M. G. FENERJIAN and F. KAFATOS, 1988 Evolution of the autosomal chorion locus in Drosophila. I. General organization of the locus and sequence comparisons of genes s15 and s19 in evolutionarily distant species. Genetics 119: 663-677.

    MCDEVITT, M. A., M. J. IMPERIALE, H. ALI and J. R. NEVINS, 1984 Requirement of a downstream sequence for generation of a poly(A) addition site. Cell 37: 993-999.

    MCGARRY, T. J., and S. LINDQUIST, 1985 The preferential translation of Drosophila mRNA requires sequences in the untranslated leader. Cell 42: 903-911.

    MCMULLIN, T. W., and R. L. HALLBERG, 1986 An effect of heat shock on ribosome structure: the appearance of a new ribosome-associated protein. Mol. Cell. Biol. 6: 2527-2535.

    MOUNT, S. M., 1982 A catalogue of splice junctions. Nucleic Acids Res. 10: 459-472.

    NAGANO, R. T., E. CZARNECKA, W. B. GURLEY, F. SCHOFFL and J. L. KEY, 1985 Genes for low molecular weight heat shock proteins of soybeans: sequence analysis of a multigene family. Mol. Cell. Biol. 5: 3417-3428.

    NORRANDER, J., T. KEMPE and J. MESSING, 1983 Construction of improved M13 vectors using deoxyribonucleotide-directed mutagenesis. Gene 26: 101-106.

    NEEDLEMAN, S. B., and C. D. WUNSCH, 1970 A general method applicable to the search for similarities in amino acid sequence of two proteins. J. Mol. Biol. 48: 443-453.

    PADGETT, R. A., P. J. GRABOWSKI, M. M. KONARSKA, S. SEILER and P. A. SHARP, 1986 Splicing of messenger RNA precursors. Annu. Rev. Biochem. 55: 1119-1150.

    PARDUE, M. L., and I. B. DAWID, 1981 Chromosomal locations of two DNA segments that flank ribosomal insertion-like sequences in Drosophila: flanking sequences are mobile elements. Chromosoma 83: 29-43.

    PATTERSON, J. T., and W. S. STONE, 1952 Evolution in the Species Drosophila. Macmillan, New York, p. 161.

    PELHAM, H. R. B., 1982 A regulatory upstream promoter element in the Drosophila hsp70 heat shock gene. Cell 30: 517-528.

    PETERS, F. P. A. M. N., N. H. LUBSEN and P. J. A. SONDERMEIJER, 1980 Rapid sequence divergence in a heat shock locus of Drosophila. Chromosoma 81: 271-280.

    PETERS, F. P. A. M. N., N. H. LUBSEN, U. WALLDORF, R. J. M. MOORMANN and B. HOVEMANN, 1984 The unusual structure of heat shock locus 2-48B in Drosophila hydei. Mol. Gen. Genet. 197: 392-398.

    RYSECK, R.-P., U. WALLDORF, T. HOFFMAN and B. HOVEMANN, 1987 Heat shock loci 93D of Drosophila melanogaster and 48B of Drosophila hydei exhibit a common structural and transcriptional pattern. Nucleic Acids Res. 15: 3317-3333.

    SANGER, F., S. NICKLEN and A. R. COULSON, 1977 DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74: 5463-5467.

    SCHAEFFER, S. W., and C. F. AQUADRO, 1987 Nucleotide sequence of the Adh gene region of Drosophila pseudoobscura, evolutionary change and evidence for an ancient gene duplication. Genetics 117: 61-73.

    SHAW, G., and R. KAMEN, 1986 A conserved AU sequence from the 3' untranslated region of GM-CSF mRNA mediates selective mRNA degradation. Cell 46: 659-667.

    WALLDORF, U., S. RICHTER, R.-P. RYSECK, H. STELLER, J. E. EDSTROM, E. K. F. BAUTZ and B. HOVEMANN, 1984 Cloning of heat-shock locus 93D from Drosophila melanogaster. EMBO J. 3: 2499-2504.

    XIAO, H., and J. T. LIS, 1988 Germline transformation used to define key features of heat-shock response elements. Science 239: 1139-1142.

    Communicating editor: C. C. LAURIE

