Proc. Natl. Acad. Sci. USA
Vol. 77, No. 7, pp. 4229-4233, July 1980
Genetics

Cloning and characterization of DNA sequences surrounding the human γ-, δ-, and β-globin genes

(repetitive DNA sequences/thalassemia/electron microscopy/Southern blot analysis)

R. E. KAUFMAN*, P. J. KRETSCHMER†, J. W. ADAMS*, H. C. COON†, W. F. ANDERSON†, AND A. W. NIENHUIS*

*Clinical Hematology Branch and †Laboratory of Molecular Hematology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland 20205

Communicated by Donald S. Fredrickson, April 24, 1980

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U. S. C. §1734 solely to indicate this fact.

Abbreviations: bp, base pair(s); Hb, hemoglobin; kb, kilobase(s); HPFH, hereditary persistence of fetal hemoglobin.

ABSTRACT The human γ-, δ-, and β-globin genes are located within a 30-kilobase (kb) region of DNA, of which only 20% represents the globin genes. We have attempted to define the nature of flanking and intergenic sequences by isolating recombinants containing the human ε-, both γ-, or the 3' end of the β-globin gene from a bacteriophage library of cloned human DNA. Comparison of these recombinants and a recombinant containing the δ- and β-globin genes (HβG1) has provided the following results. The ε-globin gene is located 14 kb 5' to the Gγ gene. DNA sequence homology between the region containing the two γ genes and the δ and β gene region is limited to only a few hundred nucleotides which include the globin coding sequences. Repetitive DNA sequences have been found in the region 3' to the β-globin gene. Sequences located adjacent to the β-globin gene are repeated in the globin gene region. A repetitive DNA sequence more than 3.2 kb long is repeated frequently in the human genome but is not repeated in the globin gene region in the clones examined.

Human hemoglobins are selectively produced in a characteristic and orderly fashion during different stages of development (1-4). During early embryogenesis, expression of the ζ and ε genes leads to production of Hb Gower I (ζ₂ε₂) (5). As the primary site of erythropoiesis switches from the yolk sac to the fetal liver, production of ζ- and ε-globins dramatically decreases. These are replaced by α- and γ-globin chain production (6). Hb Gower II (α₂ε₂) is temporarily produced while fetal hemoglobin, Hb F (α₂γ₂), becomes the predominant hemoglobin during the remainder of fetal life. Two linked genes encode for γ-globin; these are designated Gγ and Aγ on the basis of a single difference at codon 136 where glycine or alanine is encoded, respectively. β-Globin production begins by the 12th week of gestation so that a small amount of adult hemoglobin, Hb A (α₂β₂), is present in fetal blood but at birth a switch from predominantly γ-globin to predominantly β-globin synthesis results in the adult pattern of hemoglobin production. δ-Globin, which differs from β-globin in only 10 of its 146 amino acids (7), is found in trace amounts in adult erythrocytes.

By both restriction endonuclease mapping of genomic DNA and molecular cloning of the human globin genes, the two γ-, the δ-, and the β-globin genes have been shown to be closely linked on a 30-kilobase (kb) segment of DNA (8-14) which is known to be on chromosome 11 (15). Each globin gene contains introns (9-11), of approximately 100 base pairs (bp), at a position corresponding to amino acids 30 and 31 and approximately 800-900 bp at a position corresponding to amino acids 104 and 105 (10). Including the exons which have a total length of approximately 540-570 bp, each structural gene occupies 1600 bp of DNA. Thus, the globin genes comprise only a small percentage of DNA in this region. The role of the remaining DNA remains unexplained although the segments between the structural genes have been postulated to contain sequences that may be important to gene regulation.

Using molecular cloning methods, we have isolated and characterized three recombinants containing the ε-globin gene (TεG51), both of the γ-globin genes (TγG1), and the 3' end of the β-globin gene (TβG41) (Fig. 1). By utilizing these and the recombinants HβG1 and HβG2 (10), both of which contain the human δ- and β-globin genes, we have examined the intergenic sequences for regions of homology by the Southern blot technique and by heteroduplex analysis by electron microscopy. We have found the ε gene to be linked to the Gγ gene and demonstrated DNA sequences that lie 3' to the β gene and are repeated many times throughout the human genome.

MATERIALS AND METHODS

Materials. Restriction endonucleases were obtained from Bethesda Research Laboratories (Rockville, MD) and Boehringer Mannheim; DNA digestions were performed in buffers recommended by the supplier. DNA polymerase I was obtained from Boehringer Mannheim. [³²P]dCTP and [³²P]dGTP were obtained from Amersham/Searle. DNA ligase was purchased from New England Biolabs (lot 9). Alkaline phosphatase was obtained from Worthington.

Escherichia coli DP50 supF, bacteriophage Charon 4A, and associated material necessary for verification of EK-2 status were obtained as a generous gift from F. Blattner (Madison, WI) and maintained as described (14). E. coli χ1776 was provided by R. Curtis III (16). Plasmid pBR322 (17) was provided by H. Boyer. Plasmid JW151 was obtained from B. Forget (18). Bacteriophage recombinants HβG1 and HβG2 (10) were kindly provided by T. Maniatis. Methods for maintenance and growth of bacterial strains, for amplification and extraction of plasmid DNA from E. coli strains χ1776 and C-600, and extraction of bacteriophage DNA from high-titer CsCl stocks were as described (19). All cloning experiments were conducted in compliance with the National Institutes of Health Guidelines for Recombinant DNA Research as approved by the National Institutes of Health Biohazard Committee under a Memorandum of Understanding and Agreement.

Isolation and Characterization of Clones Containing Globin Genes. The human Charon 4A genomic library was prepared by the procedure described in detail by Maniatis et al. (20) with the in vitro packaging system described by Blattner et al. (14). High molecular weight DNA, isolated from spleen tissue of an individual homozygous for β-thalassemia, was digested incompletely with EcoRI and size fractionated on sucrose gradients from which DNA fragments of 15-20 kb were isolated and ligated to the purified arms of Charon 4A (19). A
library of 10⁶ recombinant bacteriophage was amplified and screened by the method of Benton and Davis (21). A 3.0-kb Hpa I/Pst I fragment which contains the entire β-globin gene with its introns and 1400-bp flanking sequences was used as a molecular hybridization probe for bacteriophage screening. It was isolated from a recombinant plasmid (pRK1) made by inserting the 4.4-kb Pst I fragment from HβG1 (10) into pBR322. This fragment and all other hybridization probes were made radioactive by nick-translation as described (22, 23). Analysis of recombinants was by restriction endonuclease digestion and the Southern blot technique (24) using hybridization conditions described by Jeffreys and Flavell (25). To facilitate mapping, in some cases, specific restriction endonuclease fragments were isolated from agarose gels and subcloned into pBR322 by using standard techniques (19, 26).

FIG. 1. Restriction endonuclease map of the ε-, γ-, δ-, and β-globin regions. The relative positions of the globin genes and the EcoRI sites are represented in the upper part of the figure. The size of certain fragments is indicated in kb. EcoRI fragments contained in the recombinants TεG51 and TβG41 are defined by the vertical dashed lines on an expanded restriction endonuclease map for two recombinants presented in the lower half of the figure. The 1.8- and 1.5-kb EcoRI fragments are common to the recombinants TεG51 and TγG1.
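As a rough check on the library size quoted in Materials and Methods (10⁶ recombinants carrying 15-20 kb inserts), the standard Clarke-Carbon estimate relates the number of clones to the probability that a given single-copy sequence is present. The sketch below is illustrative only: the haploid genome size of 3 × 10⁹ bp and the 17.5-kb mean insert are assumptions not stated in the text, and the formula assumes randomly generated fragments, whereas this library was built from a partial EcoRI digest.

```python
import math

# Assumed values (not given in the paper): haploid human genome size and
# the midpoint of the 15-20 kb size-selected insert range.
GENOME_BP = 3.0e9
MEAN_INSERT_BP = 17_500
LIBRARY_SIZE = 1_000_000  # 10^6 recombinants, as stated in the text


def clones_needed(p, insert_bp, genome_bp):
    """Clarke-Carbon estimate: N = ln(1 - p) / ln(1 - f), with f = insert/genome."""
    f = insert_bp / genome_bp
    return math.log(1.0 - p) / math.log(1.0 - f)


def coverage_probability(n_clones, insert_bp, genome_bp):
    """Probability that a given single-copy sequence is in at least one clone."""
    f = insert_bp / genome_bp
    return 1.0 - (1.0 - f) ** n_clones


if __name__ == "__main__":
    n99 = clones_needed(0.99, MEAN_INSERT_BP, GENOME_BP)
    p = coverage_probability(LIBRARY_SIZE, MEAN_INSERT_BP, GENOME_BP)
    print(f"clones needed for 99% coverage: {n99:,.0f}")
    print(f"coverage probability with 10^6 clones: {p:.4f}")
```

Under these assumptions a library of 10⁶ clones is somewhat larger than the number needed for 99% representation, which is in keeping with the recovery of several distinct globin-containing recombinants.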

Heteroduplex Analysis. Equal amounts of bacteriophage DNA (total final concentration, 5 μg/ml) were denatured and rehybridized for 45-60 min in 47.6% formamide exactly as described (27). Spreading conditions, electron microscopy, photography, and measurement of heteroduplex molecules will be described in detail elsewhere.

RESULTS

Linkage of the ε- and γ-Globin Genes. Screening of the bacteriophage library produced a number of recombinants that contained globin genes; three have been analyzed in detail. Two recombinants have been designated TεG51 and TγG1 (see Fig. 1). The insert from TεG51 contained four EcoRI fragments (3.7, 4.0, 1.8, and 1.5 kb long) which were ordered as shown by using digestions with BamHI, Pst I, HindIII, and EcoRI (details available on request). In certain cases, specific fragments were extracted from agarose gels and digested with a second restriction endonuclease. The globin gene in this recombinant was identified as the ε gene based on a restriction endonuclease map identical to the human ε gene that had previously been isolated and characterized by DNA sequence determination (28).

The second recombinant analyzed, TγG1, contained an 18.5-kb insert of human DNA which includes both the Gγ and Aγ genes. These were tentatively identified by demonstrating that the Hha I fragment from JW151, which contains γ-globin-gene-coding sequences (18), hybridized strongly to the globin genes in TγG1 compared to hybridizing of the Hpa I/Pst I fragment which contains the β-globin gene. In addition, the restriction endonuclease map derived for this recombinant is identical to that previously reported by others for the γ-globin gene region as deduced by restriction endonuclease mapping of genomic DNA or characterization of recombinant clones (11-14, 29).

Both recombinants contained EcoRI fragments 1.5 and 1.8 kb long which were positioned at the ends of the inserted DNA as shown in Fig. 1. The two fragments from each of the two recombinants comigrated precisely on agarose gel electrophoresis (Fig. 2). When the 1.8-kb fragment was isolated from TεG51, nick-translated, and used as a probe in a Southern blot analysis, the 1.8-kb fragment in TγG1 gave a band of intensity equal to that in TεG51. The 1.5-kb EcoRI fragment from each contained an identically positioned Pst I site (data not shown). Furthermore, both the 1.5- and 1.8-kb EcoRI fragments from the two clones lacked sites for BamHI, HindIII, Xba I, and Bgl II. Thus, the overlap of the DNA inserted into these recombinants and therefore the linkage of the ε and Gγ genes was established.

FIG. 2. Demonstration of linkage of ε and Gγ genes. DNA fragments from EcoRI digestions of the recombinants TεG51 and TγG1 are revealed by ethidium bromide staining in the left lanes. Fragments representing inserted human DNA from TεG51 are 4.0, 3.7, 1.8, and 1.5 kb long; insert fragments from TγG1 are 7.2, 2.7, 2.3, 1.8, 1.6, 1.5, 0.8, and 0.6 kb long; 1.8- and 1.5-kb fragments are present in both. The radioautograph in the right lanes demonstrates hybridization of the 1.8-kb EcoRI fragment from TεG51 to the 1.8-kb fragment in TεG51 and in TγG1.

FIG. 3. Heteroduplex formed between TγG1 and HβG1. Heavy lines on the drawing (Inset) represent regions of hybridization. Large loops on either side of short hybridized structural gene regions demonstrate lack of homology in these flanking sequences between the Gγ and δ genes and Aγ and β genes. In no cases were heteroduplexes formed between repeated sequences 5' to the δ and Gγ genes. Several heteroduplexes demonstrated a stem-and-loop structure at the 5' end of the insert of the recombinant containing the δ and β genes (HβG1). All measurements were performed on a Numonics digitizer. Double-stranded φX174 circular phage DNA was used as a length standard. (×26,400.)
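The first step of the linkage argument, finding EcoRI fragments of the same size in both recombinants, can be sketched as a simple comparison of the fragment lists given in the text. The code below is only an illustration: the function names and the 0.05-kb matching tolerance are assumptions, and size agreement alone does not show sequence identity, which is why the authors went on to use hybridization and restriction-site comparisons.

```python
# EcoRI insert fragments (kb) as listed in the text for each recombinant.
T_EPSILON_G51 = [4.0, 3.7, 1.8, 1.5]
T_GAMMA_G1 = [7.2, 2.7, 2.3, 1.8, 1.6, 1.5, 0.8, 0.6]


def insert_size(fragments_kb):
    """Total insert length in kb, rounded to the precision of the gel data."""
    return round(sum(fragments_kb), 1)


def shared_fragments(a_kb, b_kb, tol_kb=0.05):
    """Fragments in a_kb that have a size match in b_kb within tol_kb.

    tol_kb is a hypothetical tolerance chosen so that 1.5 and 1.6 kb
    are treated as distinct bands.
    """
    return sorted(x for x in a_kb if any(abs(x - y) <= tol_kb for y in b_kb))


if __name__ == "__main__":
    print("TeG51 insert (kb):", insert_size(T_EPSILON_G51))
    print("TgG1 insert (kb):", insert_size(T_GAMMA_G1))
    print("candidate overlap fragments (kb):",
          shared_fragments(T_EPSILON_G51, T_GAMMA_G1))
```

The summed TγG1 fragments reproduce the 18.5-kb insert size quoted in the text, and the only size-matched bands are the 1.5- and 1.8-kb fragments on which the overlap rests.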

Restriction Endonuclease Map 3' to the β-Globin Gene. The third recombinant, TβG41, contained 17.6 kb of DNA extending 3' from the β-globin gene. To derive the restriction endonuclease map of this region, we subcloned the entire 17.6 kb of DNA by inserting several fragments into pBR322 to give the following recombinant plasmids: 3.6-kb EcoRI fragment (pRK12), the 3.2-kb EcoRI fragment (pRK11), 1.5-kb EcoRI/BamHI fragment from the 5' end of the 10.8-kb EcoRI fragment (pRK19), and 9.3-kb BamHI/EcoRI fragment (pRK20) which represents the remainder of the insert in the recombinant bacteriophage. By making use of the known restriction endonuclease sites in pBR322, double and single digestions with several restriction endonucleases yielded data that allowed us to generate the map shown in Fig. 1 (details available on request).

Nonhomology of the DNA Sequences Flanking the γ-, δ-, and β-Globin Genes. We attempted to determine the extent of sequence homology in DNA flanking the γ-globin genes compared to DNA flanking the δ- and β-globin genes by performing heteroduplex analysis using the electron microscope. The orientation of the inserted DNA in both TγG1 and HβG1 is such that the right arm of Charon 4A is 5' with respect to the two globin genes, and the relative positions of the genes in each recombinant are approximately the same, making them ideal for heteroduplex analysis. A typical DNA heteroduplex is represented in Fig. 3. Little DNA sequence homology was found between the inserts in the two recombinants. Seven of eight heteroduplexes analyzed in detail exhibited hybridization at a position that corresponded to the globin coding sequences of the Aγ and β genes as determined by measurements with an electronic graphics calculator. Nonhybridizing regions 800-1000 bp long, representing the large introns, were surrounded by two short hybridizing regions 30-600 bp 5' to the intron and 150-280 bp 3' to the intron. A region that corresponded to the 5' coding sequences in the Gγ and δ genes hybridized in four of the eight heteroduplexes and ranged in length from 200 to 450 bp. No hybridization occurred in the region of the large intron or in the coding sequences 3' to the intron in the case of the Gγ-δ gene heteroduplex. In addition, no hybridization occurred in the regions 5' to the Gγ and δ genes between repeated sequences demonstrated by filter hybridization studies.

FIG. 4. Repetitive DNA sequences 3' to the human β-globin gene. A 2.9-kb EcoRI/Pst I fragment that contained no globin gene structural sequences was used as a hybridization probe in Southern blot experiments against EcoRI-digested genomic DNA and EcoRI-digested DNA from several recombinants. For the recombinants, each left-hand lane shows ethidium bromide-stained DNA fragments and each right-hand lane shows the Southern blot results. In addition to the Charon 4A arms, TεG51 contained EcoRI fragments 4.0, 3.7, 1.8, and 1.5 kb long; TγG1 contained EcoRI fragments 7.2, 2.7, 2.3, 1.8, 1.6, 1.5, 0.8, and 0.6 kb long (small fragments not visible); HβG2 contained EcoRI fragments 5.2, 3.2, 2.2, 2.05, and 1.75 kb long; and TβG41 contained EcoRI fragments 10.8, 3.6, and 3.2 kb long. Fragments to which the probe hybridized are as follows: TεG51, 3.7 kb; TγG1, 7.2 kb; HβG2, 3.1 and 2.05 kb; TβG41, 3.6 kb. The large arrows indicate hybridization of the probe to a segment of DNA from which the probe was derived; small arrows indicate fragments hybridizing to the repeated sequence in the probe. Hybridization was performed at 68°C in 0.45 M NaCl/0.045 M sodium citrate, pH 7.0, for 36 hr. The filter was washed with successively less concentrated saline such that the final two washes were in 15 mM NaCl/1.5 mM sodium citrate for 30 min each.
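The salt concentrations in the Fig. 4 legend are easier to compare when expressed as multiples of standard saline citrate (SSC), taking 1× SSC as 0.15 M NaCl/0.015 M sodium citrate. The helper below is a small sketch of that conversion; the function name and the consistency check are illustrative choices, not part of the published protocol.

```python
import math

# 1x SSC by the usual definition: 0.15 M NaCl, 0.015 M sodium citrate.
SSC_NACL_M = 0.15
SSC_CITRATE_M = 0.015


def ssc_strength(nacl_molar, citrate_molar):
    """Return the buffer strength as a multiple of 1x SSC.

    Raises ValueError if the NaCl and citrate concentrations do not
    correspond to the same multiple, i.e. the buffer is not a simple
    dilution or concentrate of SSC.
    """
    by_nacl = nacl_molar / SSC_NACL_M
    by_citrate = citrate_molar / SSC_CITRATE_M
    if not math.isclose(by_nacl, by_citrate, rel_tol=1e-6):
        raise ValueError("NaCl and citrate are not in SSC proportions")
    return round(by_nacl, 3)


if __name__ == "__main__":
    print("hybridization buffer:", ssc_strength(0.45, 0.045), "x SSC")
    print("final washes:", ssc_strength(0.015, 0.0015), "x SSC")
```

On this scale the hybridization was carried out in 3× SSC and the final washes in 0.1× SSC, a 30-fold drop in salt that makes the final washes considerably more stringent than the hybridization.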

Detection of Repetitive DNA Sequences Located 3' to the Human β-Globin Gene. Repetitive DNA sequences represented in two restriction nuclease fragments were found in this region. The 2.9-kb Pst I/EcoRI fragment 3' to the β-globin gene hybridized diffusely to all size classes of DNA fragments in an EcoRI digest of human genomic DNA and also hybridized to several EcoRI fragments in the globin gene region (Fig. 4). These data do not prove, however, that the sequence hybridizing to genomic DNA is the same sequence repeated in the globin gene region because more than one repeat may be present in the 2.9-kb probe. The 3.2-kb EcoRI fragment downstream from the β-globin gene yielded a diffuse pattern of hybridization with EcoRI-digested human genomic DNA but specific bands were visible including an intense one at 3.2 kb (Fig. 5). These data suggest that this 3.2-kb EcoRI fragment is part of a long repetitive sequence of DNA represented in part by a segment of DNA 3.2 kb long defined by two EcoRI sites.

To substantiate this interpretation, we analyzed DNA from subpools of the human genomic library. During construction of the recombinant bacteriophage library, the 10⁶ unique recombinant bacteriophage were divided into five fractions after packaging and before amplification so that each fraction contained approximately 2 × 10⁵ recombinants representing roughly 20% of the human library. These subpools were amplified and small amounts of DNA were prepared. When hybridized to the 3.2-kb EcoRI probe on a Southern blot analysis, EcoRI-digested DNA from each disclosed multiple discrete bands, with a somewhat different pattern for each subpool (Fig. 5). Specifically, an intense band at 3.2 kb was found in every subpool, suggesting that this EcoRI fragment is part of a large repetitive DNA sequence. When the 3.2-kb EcoRI probe was annealed to the several globin gene-containing recombinants, no areas of homology throughout the ε-γ-δ-β gene complex were found although a 7.2-kb EcoRI fragment between the Aγ and δ genes is not represented in the recombinants analyzed.

FIG. 5. Repetitive DNA sequences 3' to the human β-globin gene. The 3.2-kb EcoRI fragment (horizontal marker) shown in the illustration was isolated and used as a probe in Southern blotting experiments against EcoRI-digested genomic DNA (Left) and against EcoRI-digested DNA derived from subpools of a recombinant DNA library (Right). For details, see text.
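The subpool experiment rests on a counting argument: a single-copy sequence should appear in only about one clone per subpool, whereas a highly repeated sequence should appear in thousands. The sketch below makes that contrast explicit under stated assumptions. The genome size (3 × 10⁹ bp) and mean insert (17.5 kb) are assumed values, each repeat copy is treated as a point sampled at random, and the copy number of 5000 is the authors' own unpublished estimate quoted later in the Discussion.

```python
# Values taken from the text.
LIBRARY_SIZE = 1_000_000       # 10^6 unique recombinants
N_SUBPOOLS = 5
REPEAT_COPIES = 5000           # authors' estimate (unpublished data)

# Assumed values, not given in the paper.
GENOME_BP = 3.0e9
MEAN_INSERT_BP = 17_500

SUBPOOL_SIZE = LIBRARY_SIZE // N_SUBPOOLS


def expected_clones(n_clones, insert_bp, genome_bp, copies=1):
    """Expected number of clones whose insert covers one of `copies` loci.

    Each locus is treated as a point and each insert as a randomly
    placed genome segment, so the result is an order-of-magnitude guide.
    """
    return n_clones * copies * insert_bp / genome_bp


if __name__ == "__main__":
    single = expected_clones(SUBPOOL_SIZE, MEAN_INSERT_BP, GENOME_BP)
    repeat = expected_clones(SUBPOOL_SIZE, MEAN_INSERT_BP, GENOME_BP,
                             copies=REPEAT_COPIES)
    print(f"subpool size: {SUBPOOL_SIZE} ({SUBPOOL_SIZE / LIBRARY_SIZE:.0%})")
    print(f"expected clones per subpool, single-copy locus: {single:.2f}")
    print(f"expected clones per subpool, repeated sequence: {repeat:.0f}")
```

With these numbers a sequence present once per genome would often be absent from a given subpool, so an intense 3.2-kb band in every subpool is what one would expect only for a sequence repeated many times.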

DISCUSSION

We have isolated recombinant bacteriophage containing DNA fragments from the ε-γ-δ-β gene region and used these to characterize further this portion of the human genome. Linkage of the ε and Gγ genes was established, the restriction endonuclease map of this region was extended, the flanking regions of the several genes were shown to have very limited homology to one another, and, finally, two fragments containing different repetitive DNA sequences were found 3' to the β-globin gene.

The restriction endonuclease sites downstream from the human β-globin gene are of particular interest because of polymorphisms and deletions that have been defined in this region of DNA. Kan and coworkers have found polymorphisms in the first Hpa I (30) and BamHI (31) sites on the 3' side of this gene. In virtually all individuals studied, the β-globin gene was found in a 7.6-kb Hpa I fragment; however, a polymorphism that yields a 13.0-kb Hpa I fragment has been found linked to 60-87% of all βS (sickle cell anemia) globin genes (30, 32). Our map indicates that the second Hpa I site 3' to the β gene is 3.8 kb from the first, so that a single point mutation causing loss of the first Hpa I site should yield a fragment of 11.4 kb rather than the 13.0-kb fragment found. However, the difference between 11.4 and 13.0 kb is probably not significant because of the difficulty in accurately sizing large fragments on Southern blot analysis of genomic DNA, so the 13.0-kb fragment containing the βS gene may in fact be 11.4 kb. If this is true, the polymorphism is the result of a point mutation rather than a general rearrangement of DNA in this region. A second polymorphism in this region involves the BamHI site located 9.3 kb 3' to the intragenic BamHI site in the β-globin gene. Thirty-three percent of nonthalassemic Sardinian individuals have a 22.0-kb BamHI fragment rather than the usual 9.3-kb BamHI fragment which contains the 3' end of the β-globin gene (31). We did not find a second BamHI site downstream from the BamHI site involved in the polymorphism, a result compatible with the polymorphism being a point mutation.

Two deletion mutations that have eliminated all (or part) of the δ- and β-globin genes extend to the region of DNA we have mapped. In one form of δβ-thalassemia, the deletion has been mapped to extend from the middle of the δ-globin gene to approximately 1.0 kb 3' to the β-globin gene (33). Several restriction endonuclease sites that we have mapped match those found in the region 3' to the δβ-thalassemia deletion. The second deletion is that found in individuals with hereditary persistence of fetal hemoglobin (HPFH). In two homozygous individuals studied (12, 13), the deletion began 4.0 kb 5' to the δ-globin gene and extended at least 2-3 kb beyond the β-globin gene. A BamHI site has been identified 3' to the deleted segment in HPFH DNA (13). In the 17 kb of DNA 3' to the β-globin gene, we found only a single BamHI site. If the two BamHI sites were identical, other restriction endonuclease sites 3' to the HPFH deletion should match restriction endonuclease sites in TβG41. Xba I (not shown) and Pst I sites are in similar positions but HindIII and Bgl II sites found in our recombinant (Fig. 1) do not match sites mapped 3' to the deletion in HPFH DNA (13). Unless there were polymorphisms affecting both the HindIII and Bgl II sites, these data suggest that the segment of DNA missing in HPFH extends beyond the 17 kb of DNA we have mapped 3' to the β-globin gene.

Our failure to find significant homology in the sequences flanking the δ- and β-globin genes or in the sequences surrounding the δ- or β- compared to the γ- or ε-globin genes suggests that the sequences that encode for globins are more strongly conserved than is the intergenic or intervening DNA. Lack of homology in the flanking and intervening sequence DNAs of the mouse βmajor and βminor genes was first reported by Leder and coworkers (34). Their extensive analysis has led to the conclusion that, although coding sequences may drift by point mutation, flanking and intervening sequence DNAs evolve by rearrangement or deletion of large blocks of nucleotide sequence (35). Lack of homology in the DNA sequences surrounding or interrupting closely linked globin genes is thought to protect such genes from crossover and deletion. Indeed, in humans, two rare mutations do involve deletion and production of fusion genes, δ-β in Hb Lepore and Aγ-β in Hb Kenya (1-5), and therefore the crossover events occur within the conserved globin gene sequences. However, the δβ-thalassemic and HPFH mutations are deletions possibly due to crossover events outside the conserved globin genes. Divergence of flanking and intron sequences of globin genes is not a universal finding. We have characterized, by restriction endonuclease analysis and DNA sequencing, cloned DNA segments containing the sheep γ and βA genes (unpublished data). These genes are expressed during the fetal and adult developmental periods, respectively. On heteroduplex analysis, homology in the flanking and intron sequences of these two genes was found to extend over an 8-kb region. Thus, the significance of sequence divergence around globin gene coding sequences or the lack thereof remains to be defined.
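The fragment-size reasoning used earlier in the Discussion for the Hpa I and BamHI polymorphisms is simple addition: when a restriction site is lost by point mutation, the fragment extends to the next site downstream. The following sketch restates that arithmetic with the sizes quoted in the text; the function names are illustrative, and the code makes no claim about the actual sizing error of genomic Southern blots.

```python
def fragment_if_site_lost(fragment_kb, distance_to_next_site_kb):
    """Size of the fragment produced when the 3' site is lost and
    cleavage instead occurs at the next site downstream."""
    return round(fragment_kb + distance_to_next_site_kb, 1)


def relative_difference(observed_kb, predicted_kb):
    """Fractional excess of the observed size over the predicted size."""
    return (observed_kb - predicted_kb) / predicted_kb


# Hpa I: normal 7.6-kb fragment; second Hpa I site mapped 3.8 kb further 3'.
predicted_hpa_kb = fragment_if_site_lost(7.6, 3.8)
hpa_excess = relative_difference(13.0, predicted_hpa_kb)

# BamHI: normal 9.3-kb fragment; variant 22.0-kb fragment. If the variant
# arises from loss of one site, the next site lies this far beyond it.
next_bam_site_kb = round(22.0 - 9.3, 1)

if __name__ == "__main__":
    print(f"predicted Hpa I fragment: {predicted_hpa_kb} kb")
    print(f"observed 13.0 kb exceeds prediction by {hpa_excess:.0%}")
    print(f"implied distance to next BamHI site: {next_bam_site_kb} kb")
```

The Hpa I prediction falls about 14% short of the reported 13.0-kb band, which is the gap the authors attribute to sizing uncertainty. The implied distance to the next BamHI site is larger than the 9.3 kb mapped beyond the polymorphic site in TβG41, consistent with no second BamHI site having been found in the clone.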

    Although homology in the globin gene flanking sequences as determined by heteroduplex analysis is limited, we have found sequences 3' to the β-globin gene that are repeated in the globin gene region and other parts of the human genome. Detailed analysis of repetitive DNA sequences around the rabbit β-globin genes (36) indicates multiple DNA sequences that are either direct or inverted repeats. Duncan et al. (37) reported that homologous sequences 5' to the human Gγ and δ genes serve as templates that yield discrete RNA products of 515-575 nucleotides when transcribed with RNA polymerase III in vitro; furthermore, they found another inverted copy of this repeat 3' to the β-globin gene. We suspect but have not proven that the hybridizing sequences in our Southern blots (Fig. 4) of the globin gene clones involve the same sequences that serve as polymerase III transcription units.

    The repetitive DNA sequence that includes the 3.2-kb EcoRI fragment appears to be of considerable length, as reflected by the inclusion of two EcoRI sites 3.2 kb apart in many copies of the repeat (Fig. 5). Further analysis of five recombinants selected from the human library because they contain this repeat unit has demonstrated that it is a continuous DNA sequence of at least 4.6 kb and also includes the BamHI site 3' to the 3.2-kb EcoRI fragment in TβG41 (unpublished data). Because repetitive DNA may be involved in recombination, it is germane to consider whether this region of DNA is constant within the human genome. The first sites for EcoRI, HindIII, Hpa I, and BamHI 3' to the β-globin gene are all within the repeated DNA sequence. Although polymorphisms in the BamHI and Hpa I sites have been found, examination of several hundred DNA samples from normal individuals in the course of searching for polymorphisms that may be important for prenatal diagnosis suggests that these sites are quite constant in the human genome (30-32, 38).

    Although 15-20% of the human genome is comprised of moderately repetitive DNA (39), most of the sequences are thought to occur in short repeats 300-500 bp long. We estimate that there are approximately 5000 copies of the long repeat in the human genome and that it comprises as much as 1% of total human DNA (unpublished data).

    We thank Dr. Tom Maniatis for the recombinants HβG1 and HβG2, Amanda Cline and Eric Schmader for excellent technical assistance, Maria Harrison for preparing the grids for electron microscopy, and Exa Murray for assistance in preparation of the manuscript.

    1. Nienhuis, A. W. & Benz, E. J., Jr. (1977) N. Engl. J. Med. 297, 1318-1328; 1371-1381; 1430-1436.

    2. Weatherall, D. J. & Clegg, J. B. (1979) Cell 16, 467-479.

    3. Forget, B. G. (1979) Ann. Intern. Med. 91, 605-616.

    4. Bank, A., Mears, J. G. & Ramirez, F. (1980) Science 207, 486-493.

    5. Gale, R. E., Clegg, J. B. & Huehns, E. R. (1979) Nature (London) 280, 162-164.

    6. Bunn, H. F., Forget, B. G. & Ranney, H. M. (1977) Human Hemoglobins (Saunders, Philadelphia), pp. 1-432.

    7. Dayhoff, M. D., ed. (1972) Atlas of Protein Sequence and Structure (Natl. Biomed. Res. Found., Silver Spring, MD), Vol. 4.

    8. Flavell, R. A., Kooter, J. M., DeBoer, E., Little, P. F. R. & Williamson, R. (1978) Cell 15, 25-41.

    9. Mears, J. G., Ramirez, F., Leibowitz, D. & Bank, A. (1978) Cell 15, 15-23.

    10. Lawn, R. M., Fritsch, E. F., Parker, R. C., Blake, G. & Maniatis, T. (1978) Cell 15, 1157-1174.

    11. Little, P. F. R., Flavell, R. A., Kooter, J. M., Annison, G. & Williamson, R. (1979) Nature (London) 278, 227-231.

    12. Fritsch, E. F., Lawn, R. M. & Maniatis, T. (1979) Nature (London) 279, 598-603.

    13. Tuan, D., Biro, P. A., DeRiel, J. K., Lazarus, H. & Forget, B. G. (1979) Nucleic Acids Res. 6, 2519-2544.

    14. Blattner, F. R., Blechl, A. E., Denniston-Thompson, K., Faber, H. E., Richards, J. E., Slightom, J. L., Tucker, P. W. & Smithies, O. (1978) Science 202, 1279-1284.

    15. Deisseroth, A., Nienhuis, A., Lawrence, J., Giles, R., Turner, P. & Ruddle, F. H. (1978) Proc. Natl. Acad. Sci. USA 75, 1456-1460.

    16. Curtis, R., III, Pereira, D. A., Hsu, J. C., Hull, S. C., Clark, J. E., Maturin, L. J., Sr., Goldschmidt, R., Moody, R., Inoue, M. & Alexander, L. (1977) in Recombinant Molecules: Impact on Science and Society, eds. Beers, R. F., Jr. & Bassett, E. G. (Raven, New York), pp. 45-56.

    17. Bolivar, F., Rodriquez, R. L., Greene, P. J., Betlach, M. C., Heyneker, H. L. & Boyer, H. W. (1977) Gene 2, 95-113.

    18. Wilson, J. T., Wilson, L. B., de Riel, J. K., Villa-Komoroff, L., Efstratiadis, A., Forget, B. G. & Weissman, S. M. (1978) Nucleic Acids Res. 5, 563-581.

    19. Kretschmer, P. J., Kaufman, R. E., Coon, H. C., Chen, M., Geist, C. C. & Nienhuis, A. W. (1980) J. Biol. Chem. 255, 3204-3211.

    20. Maniatis, T., Hardison, R. C., Lacey, E., Lauer, J., O'Connell, C., Quon, D., Sim, G. K. & Efstratiadis, A. (1978) Cell 15, 687-701.

    21. Benton, W. D. & Davis, R. W. (1977) Science 196, 180-182.

    22. Maniatis, T., Jeffrey, A. & Kleid, D. G. (1975) Proc. Natl. Acad. Sci. USA 72, 1184-1188.

    23. Benz, E. J., Jr., Kretschmer, P. J., Geist, C. E., Kantor, J. A., Turner, P. A. & Nienhuis, A. W. (1979) J. Biol. Chem. 254, 6880-6888.

    24. Southern, E. M. (1975) J. Mol. Biol. 98, 503-517.

    25. Jeffreys, A. J. & Flavell, R. A. (1977) Cell 12, 1097-1108.

    26. Ullrich, A., Shine, J., Chirgwin, J., Pictet, R. L., Tischer, E., Rutter, W. J. & Goodman, H. A. (1977) Science 196, 1313-1319.

    27. Davis, R. W., Simon, M. & Davidson, N. (1971) Methods Enzymol. 21, 413-428.

    28. Proudfoot, N. J. & Baralle, F. E. (1979) Proc. Natl. Acad. Sci. USA 76, 5435-5439.

    29. Ramirez, F., Burns, A. L., Mears, J. G., Spence, S., Starkman, D. & Bank, A. (1979) Nucleic Acids Res. 7, 1147-1162.

    30. Kan, Y. W. & Dozy, A. M. (1978) Proc. Natl. Acad. Sci. USA 75, 5631-5635.

    31. Kan, Y. W., Lee, K. Y., Furbetta, M., Angius, A. & Cao, A. (1980) N. Engl. J. Med. 302, 185-188.

    32. Feldenzer, J., Mears, J. G., Burns, A. L., Natta, C. & Bank, A. (1979) J. Clin. Invest. 64, 751-755.

    33. Bernards, R., Kooter, J. M. & Flavell, R. A. (1979) Gene 6, 265-280.

    34. Tiemeier, D. C., Tilghman, S. M., Polsky, F. I., Seidman, J. G., Leder, A., Edgell, M. H. & Leder, P. (1978) Cell 14, 237-245.

    35. Konkel, D. A., Maizel, J. V. & Leder, P. (1979) Cell 18, 865-873.

    36. Che-Kun, J. S. & Maniatis, T. (1980) Cell 19, 379-391.

    37. Duncan, C., Biro, P. A., Choudary, P. V., Elder, J. T., Wang, R. R. C., Forget, B. G., De Riel, J. K. & Weissman, S. M. (1979) Proc. Natl. Acad. Sci. USA 76, 5095-5099.

    38. Jeffreys, A. J. (1979) Cell 18, 1-10.

    39. Schmid, C. W. & Prescott, L. D. (1975) Cell 6, 345-358.
