
JOURNAL OF BACTERIOLOGY, Sept. 1987, p. 4271-4278, Vol. 169, No. 9
0021-9193/87/094271-08$02.00/0
Copyright © 1987, American Society for Microbiology

Nucleotide Sequence of a Glucosyltransferase Gene from Streptococcus sobrinus MFe28

JOSEPH J. FERRETTI,1* MARTYN L. GILPIN,2 AND ROY R. B. RUSSELL2

Department of Microbiology and Immunology, University of Oklahoma Health Sciences Center, Oklahoma City, Oklahoma 73190,1 and Dental Research Unit, Royal College of Surgeons of England, Downe, Kent BR6 7JJ, United Kingdom2

Received 3 April 1987/Accepted 2 June 1987

The complete nucleotide sequence was determined for the Streptococcus sobrinus MFe28 gtfI gene, which encodes a glucosyltransferase that produces an insoluble glucan product. A single open reading frame encodes a mature glucosyltransferase protein of 1,559 amino acids (Mr, 172,983) and a signal peptide of 38 amino acids. In the C-terminal one-third of the protein there are six repeating units containing 35 amino acids of partial homology and two repeating units containing 48 amino acids of complete homology. The functional role of these repeating units remains to be determined, although truncated forms of glucosyltransferase containing only the first two repeating units of partial homology maintained glucosyltransferase activity and the ability to bind glucan. Regions of homology with alpha-amylase and glycogen phosphorylase were identified in the glucosyltransferase protein and may represent regions involved in functionally similar domains.

The glucosyltransferases (EC 2.4.1.5) produced by various species of oral streptococci are of considerable interest because of their production of extracellular glucans from sucrose. These glucans are thought to play a key role in the development of dental plaque because of their ability to adhere to smooth surfaces and mediate the aggregation of bacterial cells and food debris (12). It is known that a single strain can produce several distinct glucosyltransferases differing in electrophoretic, antigenic, or enzymatic properties, although some of this apparent variety may be due to the use of different oral streptococcal strains and different purification procedures and activity assays by different laboratories. The properties and characteristics of the glucosyltransferases of the mutans group streptococci have been reviewed by Ciardi (3) and Mukasa (18).

Recently, several glucosyltransferase genes from various strains of streptococci have been cloned by recombinant DNA techniques and have been shown to be expressed in Escherichia coli. Robeson et al. (24) have cloned a glucosyltransferase gene (gtfA) from Streptococcus mutans UAB90 (serotype c) and shown that it produces a protein with a molecular weight of 55,000. A similar gtfA gene has also been cloned by Pucci and Macrina (23) from S. mutans LM7 (serotype e) and by Burne et al. (2) from S. mutans GS5 (serotype c). Aoki et al. (1) reported the cloning of a glucosyltransferase gene (gtfB) from S. mutans GS-5 that produces a protein with a molecular weight of about 150,000. Another glucosyltransferase gene, gtfC, which specifies a 150,000-molecular-weight polypeptide, has been obtained from S. mutans LM7 by Pucci et al. (22). Finally, Gilpin et al. (9) have cloned two glucosyltransferase genes from Streptococcus sobrinus MFe28 (serotype h): gtfS, which encodes a glucosyltransferase that synthesizes a water-soluble glucan, and gtfI, which encodes a glucosyltransferase that synthesizes a water-insoluble glucan.

The availability of these cloned genes allows further characterization of both the genes and gene products, and in this communication, we report the complete nucleotide sequence of the gtfI gene from S. sobrinus MFe28.

* Corresponding author.

MATERIALS AND METHODS

Bacteria and media. E. coli MAF1 (containing plasmid pMLG1) was the initial source of the S. sobrinus MFe28 gtfI gene (27a). E. coli MAF5 contains plasmid pMLG5, which has the same 5.0-kilobase (kb) fragment as pMLG1 and an additional 0.5-kb fragment from the bacteriophage lambda recombinant in which the gtfI insert was first cloned (9). E. coli JM109 was used as the recipient for transfection experiments with M13 bacteriophage vectors (35) and was routinely grown in 2x YT broth (19). Soft agar overlays consisted of 2x YT broth supplemented with final concentrations of 0.75% agar, 0.33 mM isopropyl-β-D-thiogalactopyranoside, and 0.02% 5-bromo-4-chloro-3-indolyl-β-galactoside for differentiating recombinant and nonrecombinant phages. For the titration of M13 recombinants carrying all or part of gtfI, phages were plated on E. coli JM109 on B-broth agar (26) to which 1% sucrose was added for detection of enzyme activity.

Enzymes and chemicals. Restriction enzymes were purchased from Bethesda Research Laboratories, Inc., Gaithersburg, Md., and were used in accordance with the specifications of the manufacturer. T4 DNA ligase was purchased from Amersham Corp., Arlington Heights, Ill., or Bethesda Research Laboratories. The Klenow fragment of DNA polymerase and the M13 15-base primer were purchased from Bethesda Research Laboratories. The deoxy- and dideoxynucleotide triphosphates were purchased from P-L Biochemicals, Inc., and [α-32P]dATP was purchased from New England Nuclear Corp., Boston, Mass. Isopropyl-β-D-thiogalactopyranoside and 5-bromo-4-chloro-3-indolyl-β-galactoside were purchased from Sigma Chemical Co., St. Louis, Mo.
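The final concentrations quoted for the soft agar overlay under "Bacteria and media" can be turned into weighed amounts with a few lines of arithmetic. The sketch below is only an illustration: the function name is hypothetical, the percentages are assumed to be weight per volume, and the molar mass of isopropyl-β-D-thiogalactopyranoside (238.30 g/mol) is a value supplied here, not one stated in the text.

```python
# Assumed molar mass of isopropyl-beta-D-thiogalactopyranoside (C9H18O5S), g/mol.
# This value is not given in the paper.
IPTG_MOLAR_MASS = 238.30


def overlay_amounts(volume_ml):
    """Return grams of each additive for `volume_ml` of 2x YT soft agar.

    Percentages are interpreted as weight per volume (g per 100 ml),
    which is an assumption; the text does not state the basis.
    """
    litres = volume_ml / 1000.0
    return {
        "agar_g": 0.75 / 100 * volume_ml,               # 0.75% agar
        "iptg_g": 0.33e-3 * IPTG_MOLAR_MASS * litres,   # 0.33 mM
        "xgal_g": 0.02 / 100 * volume_ml,               # 0.02% indolyl galactoside
    }


if __name__ == "__main__":
    for name, grams in overlay_amounts(100).items():
        print(f"{name}: {grams:.4f}")
```

For 100 ml of overlay this comes to roughly 0.75 g of agar, about 8 mg of the thiogalactoside, and 20 mg of the indolyl galactoside under the stated assumptions.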

Subcloning of the gtfI gene and nucleotide sequencing. The gtfI gene was obtained for subcloning experiments by digestion of pMLG1 with HindIII followed by electrophoresis and isolation of the 5.0-kb fragment from 0.8% type VII agarose gels as previously described (16). The fragment was unidirectionally degraded with Bal 31 by a modification of the procedure of Gilmore et al. (8), and all subcloning into M13 phages mp18 and mp19 was done as described by Ferretti et al. (7). A 0.5-kb HindIII fragment was subsequently isolated from pMLG5, cloned into M13 phages, and sequenced. This fragment contained the remainder of the glucosyltransferase sequence not present in the pMLG1 HindIII fragment.

Sequencing reactions were performed by the Sanger dideoxy chain termination method (28) by using the procedures described by Amersham. All sequences were confirmed from at least two overlapping clones, and the entire gene sequence was determined on both strands. The sequence information was analyzed by the James M. Pustell DNA/protein sequencing program obtained from International Biotechnologies, Inc., New Haven, Conn.

Gel electrophoresis. Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) and detection of glucosyltransferase activity by incubation of gels with sucrose in the presence of Triton X-100 was done as described previously (9), but the sensitivity of the method was enhanced by the use of a periodic acid-Schiff reagent procedure modified from published methods (15, 29). After incubation in sucrose (generally for 40 h at 37°C), the gels were fixed for 30 min in 75% ethanol and treated on a shaker for 30 min with 0.7% periodic acid in 5% acetic acid. They were then shaken for 60 min in several changes of 0.2% sodium metabisulfite in 5% acetic acid and placed in Schiff reagent (Sigma) for several hours. Finally, the gels were washed extensively in 45% methanol-45% acetic acid-10% water. All procedures were carried out at room temperature.

Purification of glucosyltransferase and glucan-binding peptides. Glucosyltransferase was purified from cells of E. coli MAF5 (which carries plasmid pMLG5) by subjecting the bacteria to ultrasonic disruption and using a single-step affinity chromatography procedure: the bacterial extract was passed through a column containing Sepharose 1000 (Pharmacia) and mutan; after the column was washed with buffer, bound glucosyltransferase was eluted with 5 M guanidine hydrochloride (26). The same procedure was used for detection of glucan-binding peptides; i.e., disrupted cell extracts, plus for phage M13-infected cultures, the culture supernatant, were passed through the affinity column. Eluted peptides were analyzed by SDS-PAGE, and derivation from glucosyltransferase was confirmed by Western blotting (immunoblotting) with antiserum against purified glucosyltransferase (27a).

N-terminal sequence analysis. N-terminal sequence analysis of purified glucosyltransferase was done with an Applied Biosystems 470A protein sequencer with an on-line 120A PTH analyzer in accordance with the instructions of the manufacturer.

FIG. 1. SDS-PAGE of glucosyltransferase activity in S. sobrinus MFe28, E. coli MAF4 (carrying pMLG5), and E. coli MAF1 (carrying pMLG1). The right lane contains the following protein standards: RNA polymerase β' subunit (Mr, 165,000), RNA polymerase β subunit (155,000), β-galactosidase (116,000), phosphorylase b (97,400), albumin (66,000), and ovalbumin (45,000).

RESULTS

Cloning of complete gtfI gene. The gtfI gene from S. sobrinus MFe28 was originally cloned into the BamHI sites of bacteriophage lambda L47.1 and was located in a 7.6-kb DNA insert. Subsequently, mapping experiments showed that HindIII cleaved the insert at three points and the lambda DNA at two points outside the insert to give four fragments of 5.0, 2.6, 2.5, and 0.5 kb which carried S. sobrinus DNA. Plasmid pMLG1, from which a functional glucosyltransferase is expressed in E. coli, carries the 5.0-kb fragment (27a). At an early stage of the nucleotide-sequencing program, it became clear that pMLG1 contained a long open reading frame with no termination codon. These results indicated that the entire gtfI gene was not present in pMLG1, and so a further collection of derivatives of pBR322 carrying HindIII fragments from the bacteriophage lambda recombinants was examined. One of these, pMLG5, encoded a glucosyltransferase of 173 kilodaltons (Fig. 1) and was found to have both the 5.0- and 0.5-kb fragments. Restriction mapping of pMLG5 and subsequent nucleotide sequencing confirmed that the 0.5-kb fragment carried the information for the C-terminal region of glucosyltransferase. A partial restriction site map of the 5.5-kb insert containing the gtfI gene is shown in Fig. 2.

Nucleotide sequence. The complete nucleotide sequence of the 4,995-base-pair (bp) fragment carrying the gtfI gene was determined in both orientations and is shown in Fig. 3. Previous evidence had indicated that 0.5 kb of the insert in pMLG1 and pMLG5 was derived from the bacteriophage lambda vector (27a), and this was confirmed by the finding that the first 494 bp of the 5.5-kb fragment showed 100% homology with the corresponding part of the lambda sequence (lambda sequence not shown). An open reading frame containing 4,791 bp codes for the glucosyltransferase protein. The deduced amino acid sequence, which is coded starting at the ATG codon beginning at position 160 and extending to the termination codon TAA at position 4951, contains 1,597 amino acids and has a molecular weight of 177,100. A putative ribosome-binding site sequence (AGGAGGA) is located nine nucleotides upstream from the translation initiation codon. Further upstream is a probable -35 region (TTGACG) separated by 18 bp from a proposed -10 region (TTAAAA).

Amino acid composition. The deduced amino acid composition indicated a highly hydrophilic protein, with a hydrophobic N-terminal region. This region displayed the characteristics expected of a signal peptide, i.e., a basic N-terminal region followed by a central hydrophobic region and a more polar C-terminal region. The residues surrounding amino acid 38 conform with the "-3, -1" rule proposed by von Heijne (31) for amino acids found at a cleavage site. To

[Figure 2: restriction site map drawn against a scale marked at 800-bp intervals from 800 to 4,800 bp; site labels are not legible in this copy.]

FIG. 2. Partial restriction map of the 4,995-bp fragment containing the gtfI gene and the 494-bp fragment of bacteriophage lambda (hatched) carried by plasmid pMLG5. The 5' end of the gene is at the left, and the 3' end is at the right.
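The coordinates quoted in the "Nucleotide sequence" paragraph and in the abstract are mutually consistent, which a short calculation can confirm. The sketch below is a sanity check rather than part of the original analysis; the function and constant names are hypothetical, and it assumes that "position 4951" refers to the first base of the TAA codon.

```python
# Figures quoted in the text.
ORF_START = 160           # first base of the ATG codon
STOP_CODON_START = 4951   # assumed to be the first base of TAA
ORF_BP = 4791             # open reading frame length
SIGNAL_PEPTIDE_AA = 38
LAMBDA_BP = 494
GTF_FRAGMENT_BP = 4995


def check_orf(start, stop_start, signal_aa):
    """Derive coding length and protein sizes from the ORF coordinates."""
    coding_bp = stop_start - start    # bases from ATG up to, not including, TAA
    codons = coding_bp // 3
    return {
        "coding_bp": coding_bp,
        "in_frame": coding_bp % 3 == 0,
        "precursor_aa": codons,
        "mature_aa": codons - signal_aa,
    }


if __name__ == "__main__":
    result = check_orf(ORF_START, STOP_CODON_START, SIGNAL_PEPTIDE_AA)
    print(result)
    print("matches quoted ORF length:", result["coding_bp"] == ORF_BP)
    print("total insert (bp):", LAMBDA_BP + GTF_FRAGMENT_BP)
```

Under that reading, 4,791 coding bases give 1,597 codons, and removing the 38-residue signal peptide leaves the 1,559-residue mature protein reported in the abstract; the 494-bp and 4,995-bp fragments together account for the insert described as 5.5 kb.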


[Figure 3: sequence listing of the 4,995-bp fragment, with nucleotide positions numbered every 10 bases and the deduced amino acid sequence printed beneath each line; the listing is not legible in this copy.]

FIG. 3. Nucleotide sequence of the gtjp gene and flanking re-

gions. Numbering begins at the 5' end of the sequence. The deducedamino acid sequence of glucosyltransferase is given below thenucleotide sequence; an arrow designates the cleavage site for theremoval of the signal peptide. Putative promoter and ribosome-binding site sequences are underlined. The starts of repeat regionsAl to A6, Bi, and B2 are marked.

investigate whether the gtfI gene product was indeed cleaved at this site, the enzyme expressed in E. coli MAF5 was purified by affinity chromatography and subjected to N-terminal amino acid analysis. The first 16 amino acids were identified as Asp-Thr-Glu-Thr-Val-Ser-Glu-Asp-Ser-Asn-Gln-Ala-Val-Leu-Thr-Ala; this sequence is identical to that of the deduced amino acid sequence directly following the postulated cleavage site. Thus, the signal peptide is 38 amino acids long and contains regions similar to those found in most secretory signal sequences (20). The mature glucosyltransferase contains 1,559 amino acids and has a molecular weight of 172,983. The deduced amino acid composition of glucosyltransferase with and without the signal peptide is presented in Table 1.

Amino acid sequence homology. A series of repeating units is located in the C-terminal one-third of the glucosyltransferase molecule (Fig. 4). One of the repeating units, designated A, is 35 amino acids long and is present six times. Although these repeats are not completely identical, repeating unit A4 was found to have the greatest homology with each of the other five repeating units. Based on a comparison with A4, the homologies were as follows: A1, 65%; A2, 38%; A3, 72%; A5, 65%; and A6, 50%. The gene segments corresponding to these regions are also highly conserved and, except for repeat A2, which contained only 38% identical bases, the repeats contained 65 to 72% identical bases. Repeating unit B is present twice and contains 48 amino acids, all of which are identical. The corresponding gene regions contain a stretch of 132 identical bases.

TABLE 1. Amino acid composition of glucosyltransferase deduced from the nucleotide sequence of gtfI

                          No. of residues (a)
Amino acid        With signal peptide    Without signal peptide
Alanine                   143                    136
Arginine                   43                     41
Asparagine                109                    108
Aspartic acid             137                    137
Glutamic acid              65                     63
Glutamine                  93                     93
Glycine                   135                    134
Histidine                  19                     18
Isoleucine                 50                     49
Leucine                    92                     90
Lysine                    122                    117
Methionine                 27                     24
Phenylalanine              64                     63
Proline                    30                     30
Serine                    106                    101
Threonine                 124                    122
Tryptophan                 25                     24
Tyrosine                   98                     98
Valine                    115                    111
Total                   1,597                  1,559

(a) Mr with signal peptide, 177,100; Mr without signal peptide, 172,983.
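The arithmetic behind Table 1 can be checked with a short script. The sketch below is illustrative only: the residue counts are transcribed from Table 1, while the average residue masses and the helper names (`column_totals`, `signal_peptide_counts`, `estimated_mr`) are assumptions introduced here, since the paper does not state how its Mr values were calculated.

```python
# Residue counts transcribed from Table 1:
# amino acid -> (with signal peptide, without signal peptide).
TABLE_1 = {
    "Ala": (143, 136), "Arg": (43, 41), "Asn": (109, 108), "Asp": (137, 137),
    "Glu": (65, 63), "Gln": (93, 93), "Gly": (135, 134), "His": (19, 18),
    "Ile": (50, 49), "Leu": (92, 90), "Lys": (122, 117), "Met": (27, 24),
    "Phe": (64, 63), "Pro": (30, 30), "Ser": (106, 101), "Thr": (124, 122),
    "Trp": (25, 24), "Tyr": (98, 98), "Val": (115, 111),
}

# Average residue masses in daltons. These are assumed standard values,
# not taken from the paper.
RESIDUE_MASS = {
    "Ala": 71.0788, "Arg": 156.1875, "Asn": 114.1038, "Asp": 115.0886,
    "Glu": 129.1155, "Gln": 128.1307, "Gly": 57.0519, "His": 137.1411,
    "Ile": 113.1594, "Leu": 113.1594, "Lys": 128.1741, "Met": 131.1926,
    "Phe": 147.1766, "Pro": 97.1167, "Ser": 87.0782, "Thr": 101.1051,
    "Trp": 186.2132, "Tyr": 163.1760, "Val": 99.1326,
}
WATER_MASS = 18.015  # one water added per chain for the free termini


def column_totals(table):
    """Return (total with signal peptide, total without signal peptide)."""
    with_sp = sum(w for w, _ in table.values())
    without_sp = sum(m for _, m in table.values())
    return with_sp, without_sp


def signal_peptide_counts(table):
    """Residues attributable to the signal peptide (column difference)."""
    return {aa: w - m for aa, (w, m) in table.items() if w != m}


def estimated_mr(counts):
    """Estimate Mr from residue counts using the assumed average masses."""
    return sum(RESIDUE_MASS[aa] * n for aa, n in counts.items()) + WATER_MASS


if __name__ == "__main__":
    print("column totals:", column_totals(TABLE_1))
    sp = signal_peptide_counts(TABLE_1)
    print("signal peptide residues:", sum(sp.values()), sp)
    mature = {aa: m for aa, (_, m) in TABLE_1.items()}
    print("estimated Mr, mature protein:", round(estimated_mr(mature)))
```

The column totals and the 38-residue difference follow directly from the table. Under the assumed masses, the Mr estimates come out close to the values in the table footnote, but they need not match exactly.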

Functional regions of glucosyltransferase. Since functional glucosyltransferase was expressed from both pMLG1 and pMLG5, the terminal 0.5 kb of the gtfI gene is clearly not essential for activity. However, we have previously reported that a deletion extending from the end of the gene to the SacI site at position 3085 resulted in expression of a truncated and enzymatically inactive peptide (27, 27a). To further define the length of the gene sequence required for expression of a functional glucosyltransferase, a series of derivatives of phage M13 containing various lengths of gtfI were examined for expression of peptides which had enzyme activity and the ability to bind to glucan (Fig. 5). The M13 derivatives R7-20, R5-2, and R7-3 all formed polymer when plated with E. coli JM109 on sucrose-containing medium (Fig. 6). The two shortest M13 derivatives tested, R7-34 and R3-6, did not form any polymer either on plates or in a tube assay for glucosyltransferase using 14C-labeled sucrose. Nor did they release reducing sugar from sucrose, whereas the longer derivatives did encode enzyme activity for the release of reducing sugars, as indicated by the fact that when they were plated on E. coli JM109 on sucrose indicator plates (Russell et al., in press), acid was produced by E. coli. The same pattern of function was found when glucan-binding ability was examined; derivatives R7-20, R5-2, and R7-3 all expressed peptides which were retained by a mutan-Sepharose column, whereas R7-34 and R3-6 did not.

The results presented above indicated that the genetic information essential for enzyme activity and glucan-binding function was located in the C-terminal one-third of the gene. An in-frame gene fusion was therefore made between pUC8 and the ScaI site of gtfI located at position 3290. The resultant recombinant plasmid pSF86 expressed a 65-kilodalton peptide which reacted with antiserum to glucosyltransferase, indicating that the entire C-terminal one-third of the enzyme was being made. This peptide had no detectable glucosyltransferase activity but did bind to the affinity column.

A

1100 YYFGQDGYMVTGAQNIKGSNYYFLANGAALRNTVY
1163 WRYFKNGVMALGLTTVDGHVQYFDKDGVQAKDKII
1228 YYLGKDGVAVTGAQTVGKQHLYFEANGQQVKGDFV
1293 FYLGKDGAAVTGAQTIKGQKLYFKANGQQVKGDIV
1406 WVYVKSGKVLTGAQTIGNQRVYFKDNGHQVKGQLV
1519 WLYVKDGKVLTGLQTVGNQKVYFDKNGIQAKGKAV

B

1352 VNGKTYYFGSDGTAQTQANPKGQTFKDGSGVLRFYNLEGQYVSGSGWY
1464 VNGKTYYFGSDGTAQTQANPKGQTFKDGSGVLRFYNLEGQYVSGSGWY

FIG. 4. Amino acid sequences of A and B repeating units found in the C-terminal region of the glucosyltransferase protein. The numbers at the left indicate the position number of the first amino acid of each repeating unit.
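The comparison of repeat units described under "Amino acid sequence homology" can be illustrated with a simple ungapped identity count over the Fig. 4 sequences. This is a minimal sketch, not the authors' procedure: the function names are hypothetical, and B2 is entered as 48 residues in line with the text's statement that the two B repeats are identical. The scoring behind the published percentages is not described here, so plain identities should not be expected to reproduce those figures.

```python
# Repeat units as listed in Fig. 4 (one-letter amino acid codes).
A_REPEATS = {
    "A1": "YYFGQDGYMVTGAQNIKGSNYYFLANGAALRNTVY",
    "A2": "WRYFKNGVMALGLTTVDGHVQYFDKDGVQAKDKII",
    "A3": "YYLGKDGVAVTGAQTVGKQHLYFEANGQQVKGDFV",
    "A4": "FYLGKDGAAVTGAQTIKGQKLYFKANGQQVKGDIV",
    "A5": "WVYVKSGKVLTGAQTIGNQRVYFKDNGHQVKGQLV",
    "A6": "WLYVKDGKVLTGLQTVGNQKVYFDKNGIQAKGKAV",
}
B_REPEATS = {
    "B1": "VNGKTYYFGSDGTAQTQANPKGQTFKDGSGVLRFYNLEGQYVSGSGWY",
    # Entered as identical to B1, following the statement in the text.
    "B2": "VNGKTYYFGSDGTAQTQANPKGQTFKDGSGVLRFYNLEGQYVSGSGWY",
}


def identical_positions(a, b):
    """Count positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b))


def percent_identity(a, b):
    """Ungapped percent identity between two equal-length sequences."""
    return 100.0 * identical_positions(a, b) / len(a)


if __name__ == "__main__":
    reference = A_REPEATS["A4"]
    for name, seq in A_REPEATS.items():
        if name != "A4":
            print(name, "vs A4:", round(percent_identity(seq, reference), 1))
    print("B1 vs B2:", percent_identity(B_REPEATS["B1"], B_REPEATS["B2"]))
```

A scoring scheme that also credits conservative substitutions would give higher values than this strict identity count.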

Protein homology. Comparison of the deduced amino acid sequence of glucosyltransferase with other sequenced proteins revealed partial homologies with three proteins: alpha-amylase from barley, alpha-amylase from Bacillus amyloliquefaciens, and glycogen phosphorylase from rabbits (Fig. 7). The homologies of the glucosyltransferase with the two alpha-amylases overlap in the same general region, suggesting a region of functional homology for the three proteins.

DISCUSSION

Nucleotide sequence analysis of the 5-kb fragment of S. sobrinus MFe28 showed the presence of a single open reading frame which coded for the glucosyltransferase protein. The deduced amino acid sequence of the mature protein had a molecular weight of 172,983, which agreed closely with the value derived by SDS-PAGE. This value varies from

Repeat    Base    Amino acid
A1        3457    1100
A2        3646    1163
A3        3841    1228
A4        4036    1293
B1        4213    1352
A5        4375    1406
B2        4552    1464
A6        4713    1519

[Map of the 3' end of gtfI with the termination points of pMLG5, pMLG1, R7-20, R5-2, R7-3, R7-34, R3-6, and pSF86, each scored + or - for GTF and GBP.]

FIG. 5. The 3' end of the gtfI gene showing positions of repeat regions and termination points of phage M13 derivatives and pSF86, a pUC8 vector carrying the terminal ScaI-HindIII fragment of gtfI. Adjacent to each fragment is shown the ability of its product to exhibit glucosyltransferase (GTF) activity or to function as a glucan-binding protein (GBP).
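The base and amino acid positions given for the repeat regions in Fig. 5 can be cross-checked against each other. The sketch below assumes that residue number n corresponds to a codon starting at base 3n + c for a single constant c. That mapping is an assumption made here for illustration and is not stated in the paper. The positions are entered as they appear in this copy of the figure, and the function names are hypothetical.

```python
# Repeat starts as listed in Fig. 5: repeat -> (base position, amino acid position).
REPEAT_STARTS = {
    "A1": (3457, 1100),
    "A2": (3646, 1163),
    "A3": (3841, 1228),
    "A4": (4036, 1293),
    "B1": (4213, 1352),
    "A5": (4375, 1406),
    "B2": (4552, 1464),
    "A6": (4713, 1519),
}


def codon_offset(base, residue):
    """Offset c such that base = 3 * residue + c."""
    return base - 3 * residue


def group_by_offset(table):
    """Group repeat names by the offset implied by their two coordinates."""
    groups = {}
    for name, (base, residue) in table.items():
        groups.setdefault(codon_offset(base, residue), []).append(name)
    return groups


if __name__ == "__main__":
    for offset, names in group_by_offset(REPEAT_STARTS).items():
        print(offset, names)
```

With the positions as listed, six repeats share one offset, while B2 and A6 differ from it by 3 bases and 1 base, respectively. Whether this reflects how the figure was marked or digit errors in this copy cannot be told from the numbers alone.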


FIG. 6. Accumulation of glucan above points where E. coli JM109 was infected with phage M13 derivative R7-20 on sucrose-containing medium.

previous estimates for glucosyltransferases that produce insoluble glucans (3, 18), although proteolytic degradation and problems associated with molecular weight determinations by gel analysis could easily account for the differences. The deduced amino acid composition of the glucosyltransferase indicates that it is a highly hydrophilic protein, containing 11.5% basic amino acids, 12.6% acidic amino acids, and 41.6% polar amino acids.

The restriction map generated from nucleotide sequence analysis is in agreement with previous maps established for this fragment and also supports previous speculations concerning the location of probable transcription and translation initiation sites (27, 27a). These sites are similar to transcription and translation sites reported for other streptococcal genes (6, 7, 14, 17). Downstream of the coding region, a single termination codon is present, but insufficient sequence is available to comment further about sequences involved in transcription termination.

The presence of a 38-amino-acid signal peptide was confirmed by N-terminal amino acid analysis of the purified glucosyltransferase protein, in which the sequence of the first 16 amino acids was identical to the deduced sequence. This signal peptide has properties similar to those of other signal peptides (20), i.e., a positively charged N-terminal region followed by a string of 23 hydrophobic amino acids and a more polar C-terminal region. The cleavage site between Ala and Asp and the surrounding residues are in accordance with the -3, -1 rule proposed by von Heijne (31). The 38-amino-acid signal peptide of glucosyltransferase is in the general size range of other streptococcal signal peptides (6, 14, 17, 32) and that reported for other gram-positive organisms (20).
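The composition percentages quoted earlier in the Discussion (11.5% basic, 12.6% acidic, and 41.6% polar) can be reproduced from the Table 1 counts that include the signal peptide, under one particular grouping of residues. The grouping below (basic: Arg, Lys, His; acidic: Asp, Glu; polar: Asn, Gln, Ser, Thr, Tyr, Gly) is an assumption chosen because it matches the quoted values. The paper does not list which residues were counted in each class.

```python
# Residue counts from Table 1, "with signal peptide" column.
COUNTS_WITH_SIGNAL = {
    "Ala": 143, "Arg": 43, "Asn": 109, "Asp": 137, "Glu": 65, "Gln": 93,
    "Gly": 135, "His": 19, "Ile": 50, "Leu": 92, "Lys": 122, "Met": 27,
    "Phe": 64, "Pro": 30, "Ser": 106, "Thr": 124, "Trp": 25, "Tyr": 98,
    "Val": 115,
}

# Assumed residue classes (not specified in the paper).
CLASSES = {
    "basic": ("Arg", "Lys", "His"),
    "acidic": ("Asp", "Glu"),
    "polar": ("Asn", "Gln", "Ser", "Thr", "Tyr", "Gly"),
}


def class_percentages(counts, classes):
    """Percentage of all residues falling in each class, to one decimal."""
    total = sum(counts.values())
    return {
        name: round(100.0 * sum(counts[aa] for aa in members) / total, 1)
        for name, members in classes.items()
    }


if __name__ == "__main__":
    print(class_percentages(COUNTS_WITH_SIGNAL, CLASSES))
```

Other groupings, or the counts without the signal peptide, give slightly different percentages, so agreement here does not establish which classification the authors used.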

It is apparent that E. coli is capable of recognizing the gtfI gene product and cleaving it at the site expected for removal of a secretion signal peptide. Other evidence suggests that the enzyme passes through the cytoplasmic membrane. For example, E. coli strains expressing glucosyltransferase can metabolize sucrose (27a), although sucrose can pass through only the outer membrane and not the cytoplasmic membrane (5). Glucosyltransferase would thus be expected to accumulate in the periplasmic space, but we have been unable, using conventional osmotic shock methods for release of periplasmic proteins (13, 34), to obtain release of the protein. In view of the observation that much of the C-terminal region of glucosyltransferase is not essential for function, it is tempting to speculate that the commonly observed heterogeneity of molecular sizes in enzyme preparations (18, 25) is due to sequential degradation by proteolytic action on this end of the molecule. As yet, however, there is insufficient evidence to confirm this idea.

The C-terminal region of the glucosyltransferase protein contains two sets of repeated sequences, the A repeating unit present six times and the B repeating unit present twice. The A repeating units exhibit some variability, whereas the B repeating units are completely identical. The manner or sequence in which these duplications and changes occurred is not obvious. However, duplications in other streptococcal

Alpha-amylase (EC 3.2.1.1)

69a   HSVIQNGYAFTDRYDID---ASKYGNAAELKSL
840   FQSFATKEEEYTNVVIANNVDKFVSWGITDFEMAPQYVSSTDGQFLDSVIQNGYAFTDRYDLGMSKANKYGTADQLVKA
40b   FEWYTPNDGQHWK-RLQNDAEHLSDIGITAVWIPPAYKGLSQSD-NGYGPYDLYDLGE-FQQKGYVRTKYGTKSELQDA

100a  IGALHGKGVQAIADIVINHRCA
919   IKALHAKGLKVMADWVPDQMYT
116b  IGSLHRRNVQVYGD

Glycogen phosphorylase (EC 2.4.1.1)

1107  YMVTGAQNIKGSNYYFLANGAALRNTVYTD-AQGQNHYYGNDGKRYENGYQQFGNDSWRYFKNGVMALGLTTVDGHVQY
37c   FTLVKNRNVATPRDYYFAHALTVRDHLVGRWIRTQQHYYEKDPKRI--YYLSLQFYMGRTLQNTMVNLALENACDEADY

[Column offsets and the rows of match symbols between the sequences are not reproduced.]

FIG. 7. Alignment of predicted amino acid sequences of glucosyltransferase with regions of alpha-amylase from barley (a), alpha-amylase from Bacillus amyloliquefaciens (b), and glycogen phosphorylase from rabbit (c). The sequences were aligned by the PRTALN program of Wilbur and Lipman (33). Identical (:) and conserved ( ) amino acids are indicated. The numbers at the left indicate the position number of the first amino acid of each protein shown.


genes and proteins have been recently reported, e.g., the group A type 6 M protein (14) and the group G immunoglobulin G-binding protein (6, 11).

The functional role of the repeating units is not clear, although the region containing them is essential both for glucosyltransferase activity and for binding of glucans. It is of interest that the part of the glucosyltransferase protein showing homology with glycogen phosphorylase spans the region containing the first two A repeat units. This region of glycogen phosphorylase is thought to be involved in substrate binding or catalysis (21) but is distinct from the region (amino acids 401 to 443) thought to be involved in the storage site which binds heptamylose (10). The regions of homology with the alpha-amylase proteins are found further upstream, not too distant from the repeating units. The common feature of all three enzymes is the ability to bind to glucans, and it seems likely that the identified regions of glucosyltransferase homology are also involved in or essential for glucan binding.

The relationship of the gtfI gene to other known glucosyltransferase genes is of considerable interest, especially in view of the different restriction maps reported (1, 22, 24, 27a) and the evolutionary distance between S. mutans (G+C content, 36 to 38%) and S. sobrinus (G+C content, 44 to 46%) (4). In the accompanying paper, Shiroza et al. report a sequence analysis of the S. mutans gtfB gene, which specifies a glucosyltransferase that also produces insoluble glucans (30).

ACKNOWLEDGMENT

We thank David R. Lorenz for his many contributions to this work and Tricia Stalker for excellent technical assistance.

This research was supported by Public Health Service grant DE08191 from the National Institutes of Health and by the Medical Research Council.

LITERATURE CITED

1. Aoki, H., T. Shiroza, M. Hayakawa, S. Sato, and H. K. Kuramitsu. 1986. Cloning of a Streptococcus mutans glucosyltransferase gene coding for insoluble glucan synthesis. Infect. Immun. 53:587-594.

2. Burne, R. A., B. Rubinfeld, W. H. Bowen, and R. E. Yasbin. 1986. Cloning and expression of a Streptococcus mutans glucosyltransferase gene in Bacillus subtilis. Gene 40:201-209.

3. Ciardi, J. 1983. Purification and properties of glucosyltransferases of Streptococcus mutans: a review, p. 51-64. In R. J. Doyle and J. E. Ciardi (ed.), Glucosyltransferases, glucans, sucrose and dental caries (a special supplement to Chemical Senses). Information Retrieval Limited, Washington, D.C.

4. Coykendall, A. L., and K. B. Gustafson. 1986. Taxonomy of Streptococcus mutans, p. 21-28. In S. Hamada, S. M. Michalek, H. Kiyono, L. Menaker, and J. R. McGhee (ed.), Molecular microbiology and immunobiology of Streptococcus mutans. Elsevier Science Publishing, Inc., New York.

5. Decad, G. M., and H. Nikaido. 1976. Outer membrane of gram-negative bacteria. XII. Molecular-sieving function of cell wall. J. Bacteriol. 128:325-336.

6. Fahnestock, S. R., P. Alexander, J. Nagle, and D. Filpula. 1986. Gene for an immunoglobulin-binding protein from a group G streptococcus. J. Bacteriol. 167:870-880.

7. Ferretti, J. J., K. S. Gilmore, and P. Courvalin. 1986. Nucleotide sequence analysis of the gene specifying the bifunctional 6'-aminoglycoside acetyltransferase 2"-aminoglycoside phosphotransferase enzyme in Streptococcus faecalis and identification and cloning of gene regions specifying the two activities. J. Bacteriol. 167:631-638.

8. Gilmore, M. S., K. S. Gilmore, and W. Goebel. 1985. A new strategy for "ordered" DNA sequencing based on a novel method for the rapid purification of near milligram quantities of a cloned restriction fragment. Gene Anal. Tech. 2:108-114.

9. Gilpin, M. L., R. R. B. Russell, and P. Morrissey. 1985. Cloning and expression of two Streptococcus mutans glucosyltransferases in Escherichia coli K-12. Infect. Immun. 49:414-416.

10. Goldsmith, E., and R. J. Fletterick. 1983. Oligosaccharide conformation and protein saccharide interactions in solution. Pure Appl. Chem. 55:577-588.

11. Guss, B., M. Eliasson, A. Olsson, M. Uhlen, A.-K. Frej, H. Jornvall, J.-I. Flock, and M. Lindberg. 1986. Structure of the IgG-binding regions of streptococcal protein G. EMBO J. 5:1567-1575.

12. Hamada, S., and H. D. Slade. 1980. Biology, immunology, and cariogenicity of Streptococcus mutans. Microbiol. Rev. 44:331-384.

13. Hazelbauer, G. L., and S. Harayama. 1979. Mutants in transmission of chemotactic signals from two independent receptors of E. coli. Cell 16:617-625.

14. Hollingshead, S. K., V. F. Fischetti, and J. R. Scott. 1986. Complete nucleotide sequence of type 6 M protein of the group A streptococcus. J. Biol. Chem. 261:1677-1686.

15. Konat, G., H. Offner, and J. Mellah. 1984. Improved sensitivity for detection and quantitation of glycoproteins on polyacrylamide gels. Experientia 40:303-304.

16. Kuhn, S., H.-J. Fritz, and P. Starlinger. 1979. Close vicinity of IS1 integration sites in the leader sequence of the gal operon of E. coli. Mol. Gen. Genet. 167:235-241.

17. Malke, H., B. Roe, and J. J. Ferretti. 1984. Nucleotide sequence of the streptokinase gene from Streptococcus equisimilis H46A. Gene 34:337-362.

18. Mukasa, H. 1986. Properties of Streptococcus mutans glucosyltransferases, p. 121-132. In S. Hamada, S. M. Michalek, H. Kiyono, L. Menaker, and J. R. McGhee (ed.), Molecular microbiology and immunobiology of Streptococcus mutans. Elsevier Science Publishing, Inc., New York.

19. Muller-Hill, B., L. Crapo, and W. Gilbert. 1968. Mutants that make more lac repressor. Proc. Natl. Acad. Sci. USA 59:1259-1264.

20. Oliver, D. 1985. Protein secretion in Escherichia coli. Annu. Rev. Microbiol. 39:615-648.

21. Palm, D., R. Goerl, and K. J. Burger. 1985. Evolution of catalytic and regulatory sites in phosphorylases. Nature (London) 313:500-503.

22. Pucci, M. J., K. R. Jones, and F. L. Macrina. 1987. Evidence for a duplicated DNA sequence associated with a glucosyltransferase gene in Streptococcus mutans, p. 205-208. In J. J. Ferretti and R. Curtiss III (ed.), Streptococcal genetics. American Society for Microbiology, Washington, D.C.

23. Pucci, M. J., and F. L. Macrina. 1986. Molecular organization and expression of the gtfA gene of Streptococcus mutans LM7. Infect. Immun. 54:77-84.

24. Robeson, J. P., R. G. Barletta, and R. Curtiss III. 1983. Expression of a Streptococcus mutans glucosyltransferase gene in Escherichia coli. J. Bacteriol. 155:211-221.

25. Russell, R. R. B., E. Abdulla, M. L. Gilpin, and K. Smith. 1986. Characterization of Streptococcus mutans surface antigens, p. 61-70. In S. Hamada, S. M. Michalek, H. Kiyono, L. Menaker, and J. R. McGhee (ed.), Molecular microbiology and immunobiology of Streptococcus mutans. Elsevier Science Publishing, Inc., New York.

26. Russell, R. R. B., D. Coleman, and G. Dougan. 1985. Expression of a gene for glucan-binding protein from Streptococcus mutans in Escherichia coli. J. Gen. Microbiol. 131:295-299.

27. Russell, R. R. B., and M. L. Gilpin. 1987. Identification of virulence components of mutans streptococci, p. 201-204. In J. J. Ferretti and R. Curtiss III (ed.), Streptococcal genetics. American Society for Microbiology, Washington, D.C.

27a. Russell, R. R. B., M. L. Gilpin, H. Mukasa, and G. Dougan. 1987. Characterization of glucosyltransferase expressed from a Streptococcus sobrinus gene cloned in Escherichia coli. J. Gen. Microbiol. 133:935-944.

28. Sanger, F., S. Nicklen, and A. R. Coulson. 1977. DNA sequencing with chain terminating inhibitors. Proc. Natl. Acad. Sci. USA 74:5463-5467.

29. Segrest, J. P., and R. L. Jackson. 1972. Molecular weight determinations of glycoproteins by polyacrylamide gel electrophoresis in sodium dodecyl sulphate. Methods Enzymol. 28:54-63.

30. Shiroza, T., S. Ueda, and H. K. Kuramitsu. 1987. Sequence analysis of the gtfB gene from Streptococcus mutans. J. Bacteriol. 169:4263-4270.

31. von Heijne, G. 1983. Patterns of amino acids near signal sequence cleavage sites. Eur. J. Biochem. 133:17-21.

32. Weeks, C. R., and J. J. Ferretti. 1986. Nucleotide sequence of the type A streptococcal exotoxin (erythrogenic toxin) gene from Streptococcus pyogenes bacteriophage T12. Infect. Immun. 52:144-150.

33. Wilbur, W. J., and D. J. Lipman. 1983. Rapid similarity searches of nucleic acid and protein data banks. Proc. Natl. Acad. Sci. USA 80:726-730.

34. Witholt, B., M. Boekhout, M. Brock, J. Kingma, H. van Heerikhuizen, and L. de Leij. 1976. An efficient and reproducible procedure for the formation of spheroplasts from variously grown Escherichia coli. Anal. Biochem. 74:160-170.

35. Yanisch-Perron, C., J. Vieira, and J. Messing. 1985. Improved M13 phage cloning vectors and host strains: nucleotide sequences of M13mp18 and pUC19 vectors. Gene 33:103-119.
