Supporting Information
Gloux et al. 10.1073/pnas.1000066107
[Fig. S1 graphic not reproduced. Tree labels: Crenarchaeotes; Enterobacteria; Proteobacteria; Firmicutes; Bacteroidetes/Chlorobi group; Verrucomicrobia; Fusobacteria; Mycoplasmas; H11G11-BG; Firmicutes H11G11-BG group. Labeled sequences: Escherichia coli K12 (BG activity), ZP_04439782; Anaerocellum thermophilum DSM 6725 / Z-1320 (BG annotation), YP_002573936; Ruminococcus gnavus E1 (BG activity), GI:34581788; Lactobacillus rhamnosus LMS2-1 (BG annotation), ZP_04439782; Clostridium perfringens NCTC 8239 (BG activity), ZP_02642872.]
Fig. S1. Distance tree view of H11G11-BG Blastp results and localization of the Firmicutes H11G11-BG group and of several BGs of the UidA group.
[Fig. S2 graphic not reproduced. H11G11-like BG (Firmicutes): H11G11; C7D2; C. bartlettii ZP_02210507; S. variabile ZP_03776158; B. formatexigens ZP_03687412; R. gnavus ZP_02040206; R. gnavus ZP_02041835; Paenibacillus sp. ZP_02847334; R. gnavus ZP_02042987; B. capillosus ZP_02038110; S. variabile ZP_03776204; B. formatexigens ZP_03686296; R. inulinivorans ZP_03753494; B. formatexigens ZP_03685645; S. satelles ZP_04455048; S. variabile ZP_03774451; F. prausnitzii ZP_02090684. Known BGs: E. coli P05804; L. gasseri gi12802352; R. gnavus gi34581788; C. perfringens B1RQV9. ProDom domain labels: PD819476, PD002797, PD339503, PDA2N0N2, PD002163, PDA253F2, PD884831, PDA1N8L2.]
Fig. S2. Comparative domain organization of unique Firmicutes BGs and known BGs. Domain arrangements of the proteins were determined using the ProDom database in September 2009. Conserved characterized domains are represented by colored boxes and documented in the ProDom database; uncharacterized conserved domains are numbered.
Gloux et al. www.pnas.org/cgi/content/short/1000066107 1 of 9
New PS00719 pattern: [NT]-x-[LIVMFYWD]-R-[STACNL](2)-H-Y-[PQ]-x(4)-[LIVMFYWS](2)-x(3)-[DN]-x(2)-G-[LIVMFYWA](4)
Firmicutes H11G11-BG group
>H11G11 CAT TAT CAG
>ZP_02210507 CAC TAT CAA C. bartlettii
>ZP_03776158 CAC TAC CAG S. variabile
>ZP_03687412 CAC TAT CAG B. formatexigens
>C7D2 CAT TAT CAG
>ZP_02040206 CAT TAT CAG R. gnavus
>ZP_02041835 CAT TAT CAG R. gnavus
>ZP_02847334 CAT TAC CAG Paenibacillus sp.
>ZP_02042987 CAT TAT CAG R. gnavus
>ZP_02038110 CAC TAT CAG B. capillosus
>ZP_03776204 CAC TAC CAG S. variabile
>ZP_03686296 CAC TAC CAG B. formatexigens
>ZP_03753494 CAC TAT CAG R. inulinivorans
>ZP_03685645 CAT TAT CAG B. formatexigens
>ZP_04455048 CAC TAT CAG S. satelles
>ZP_03774451 CAC TAC CAG S. variabile
>EDP22313 CAC TAC CAG F. prausnitzii

Bacteroidetes H11G11-BG group
>ZP_02065512 CAT TAT CCG B. ovatus
>ZP_03478350 CAT TAC CCG P. johnsonii
>ZP_02031489 CAT TAT CCG P. merdae

Known BGs (uidA homologs)
>P05804 CAT TAC CCT E. coli
>C2JTS9 CAT TAT CCA L. rhamnosus
>C1CAW0 CAT TAT CCA S. pneumoniae
>Q3DSU4 CAT TAT CCT S. agalactiae
>C2CG53 CAC TAC CCT A. tetradius
>B1RQV9 CAT TAT CCA C. perfringens
>B9MLH3 CAT TAT CCT A. thermophilum
>Q6W7J7 CAT TAT CCT R. gnavus
Fig. S3. A specific “HYQ” motif in the H11G11-BG group from Firmicutes. Codon usage within the HYP conserved domain of pattern 1 (Prosite PS00719) was compared between H11G11-like BGs and known BGs (UidA homologs).
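The codon comparison in Fig. S3 can be checked by translating the three codons listed for each sequence. The following is a minimal sketch, assuming the standard genetic code; the codon table is deliberately restricted to the codons that occur in the figure, and the function name `translate` and the handful of example rows are illustrative choices, not part of the original analysis.

```python
# Minimal codon table covering only the codons that appear in Fig. S3
# (standard genetic code; this is not a full translation table).
CODON_TO_AA = {
    "CAT": "H", "CAC": "H",              # histidine
    "TAT": "Y", "TAC": "Y",              # tyrosine
    "CAG": "Q", "CAA": "Q",              # glutamine
    "CCG": "P", "CCT": "P", "CCA": "P",  # proline
}


def translate(codons: str) -> str:
    """Translate a space-separated codon string such as 'CAT TAT CAG'."""
    return "".join(CODON_TO_AA[codon] for codon in codons.split())


# A few rows copied from Fig. S3: (identifier, codons, group label).
ROWS = [
    ("H11G11", "CAT TAT CAG", "Firmicutes H11G11-BG group"),
    ("ZP_02210507", "CAC TAT CAA", "Firmicutes H11G11-BG group"),
    ("ZP_02065512", "CAT TAT CCG", "Bacteroidetes H11G11-BG group"),
    ("P05804", "CAT TAC CCT", "known BGs (uidA homologs)"),
]

for ident, codons, group in ROWS:
    print(f"{ident:12s} {codons} -> {translate(codons)} ({group})")
```

With these rows, the Firmicutes entries translate to HYQ, whereas the Bacteroidetes and known-BG entries translate to HYP, which is the distinction the figure illustrates.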
[Fig. S4 graphic not reproduced. Symporter groups: Type I, Type II, Type III, Type IV. Column labels: BG; symporter; symporter group (internal group homologies); symporter best conserved domain hit (E-value). Best conserved domain hits: COG2211, MelB, Na+/melibiose symporter and related transporters [Carbohydrate transport and metabolism]; TIGR00792, gph, sugar (Glycoside-Pentoside-Hexuronide) transporter; PRK02925, glucuronate isomerase; COG0534, NorM, Na+-driven multidrug efflux pump [Defense mechanisms]. E-values range from 1e-152 to 4e-25. Genomes and clones shown: Subdoligranulum variabile DSM 15176; Bryantella formatexigens DSM 14469; Ruminococcus gnavus ATCC 29149; Roseburia inulinivorans DSM 16841; Bacteroides capillosus ATCC 29799; Faecalibacterium prausnitzii M21/2; Shuttleworthia satelles DSM 14600; Paenibacillus sp. JDR-2; H11G11 (genes 20 and 21); C7D2 (genes 6 and 7).]
Fig. S4. Genetic environments of the unique Firmicutes BGs. The conserved genetic environments close to the homolog Firmicutes BGs were restricted to symporters whose homologies were grouped into four types determined after ClustalW multiple alignment and homology search. Red arrows represent the different BG genes and green arrows the associated symporters. The best conserved domains and E values were those found after Blastp against nonredundant protein sequences (NCBI).
Fig. S5. Amino acid motifs conserved in symporters associated with the Firmicutes BGs. The different types of putative symporters and their identities/similarities within types were determined after homology search and ClustalW multiple alignment. The patterns proposed were obtained by using the pattern discovery tool PRATT (http://www.expasy.ch/tools/pratt/).
[Fig. S6 graphic not reproduced. Panels: infants (InB, InE, InM, F1U; n.d. marked on three samples) and adults and children (InA, InD, InR, F1S, F1T, F2V, F2W, F2X, F2Y, sub7, sub8). y axis: frequency in metagenomes (hits/bp), from 0.00E+00 to 2.50E-07. Series groups: H11G11-BGs, Firmicutes homologs (H11G11, Clostridium 41%; C. bartlettii ZP_02210507; S. variabile ZP_03776158; F. prausnitzii EDP22313; R. inulinivorans ZP_03753494; Paenibacillus sp. ZP_02847334); H11G11-BGs, Bacteroidetes homologs (B. ovatus ZP_02065512; P. johnsonii ZP_03478350; P. merdae ZP_02031489); known BGs, UidA homologs (Anaerococcus prevotii C1Q1E5; Desulfitobacterium hafniense B8FT70; R. gnavus Q6W7J7; Anaerocellum thermophilum B9MLH3; Clostridium perfringens B1RQV9; Anaerococcus tetradius C2CG53; Streptococcus agalactiae Q3DSU4; Streptococcus pneumoniae C1CAW0; Lactobacillus rhamnosus C2JTS9; E. coli K12 P05804).]
Fig. S6. Detailed frequencies and distribution of the unique and known BGs among individuals. This figure presents the detailed results of Fig. 7, with the frequency of each BG gene tested.
Table S1. Genes identified as potential BGs

Clone: H3D4
No. of amino acids: 585
Best PSI-BLAST result (identity %, E-value): gi29346282-aspartyl-tRNA synthetase (Bacteroides thetaiotaomicron VPI-5482) (91%, 0.0).
Similarity with glycosyl hydrolase family 2 Prosite patterns: 1: <50%, 2: <50%*
Activity, architecture, and genetic context: PNP-G deglucuronidation
Putative function: Amino acid activation
COG functional category: COG0173 Aspartyl-tRNA synthetase

Clone: C11H2
No. of amino acids: 165
Best PSI-BLAST result (identity %, E-value): gi118748750-2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase (Marinomonas sp. MWYL1) (59%, 1e-51). Pattern 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase signature (100%).
Similarity with glycosyl hydrolase family 2 Prosite patterns: 1: <50%, 2: <50%
Activity, architecture, and genetic context: PNP-G deglucuronidation
Putative function: Enzymes of the methyl erythritol phosphate pathway (terpenoid biosynthesis). Interaction with the host.
COG functional category: COG0245 Lipid transport and metabolism

Clone: H11G11
No. of amino acids: 755
Best PSI-BLAST result (identity %, E-value): gi29346167 β-galactosidase (Bacteroides thetaiotaomicron VPI-5482) (40%, 3e-118)
Similarity with glycosyl hydrolase family 2 Prosite patterns: 1: 82%, 2: 93%
Activity, architecture, and genetic context: PNP-G deglucuronidation. Glucuronide permease upstream. No β-galactosidase activity.
Putative function: β-Glucuronidase, glucuronide metabolism.
COG functional category: COG3250 Carbohydrate transport and metabolism

Clone: H11G11
No. of amino acids: 233 and ≥605
Best PSI-BLAST result (identity %, E-value): ABC-type antimicrobial peptide transport system: gi15896819-ABC transporter, ATP-binding protein (Carnobacterium sp. AT7) (70%, 5e-93). ABC transporter, permease protein (Carnobacterium sp. AT7) (28%, 1e-54).
Similarity with glycosyl hydrolase family 2 Prosite patterns: 1: <50%, 2: <50% (for both proteins)
Activity, architecture, and genetic context: PNP-G deglucuronidation permease. Pattern glycosyl hydrolase family 16 active site (65%). Xyloglucan endo-transglycosylase C terminal domain.
Putative function: SalX-ABC-type antimicrobial peptide transport system.
COG functional category: COG1136.2 defense mechanisms

Clone: H11G11
No. of amino acids: 331
Best PSI-BLAST result (identity %, E-value): gi122582656 sporulation protein and related proteins (Teth39DRAFT_0340) (Thermoanaerobacter ethanolicus ATCC 33223) (35%, 1e-44).
Similarity with glycosyl hydrolase family 2 Prosite patterns: 1: <50%, 2: 55%
Activity, architecture, and genetic context: PNP-G deglucuronidation, cell wall hydrolase, MraY family signature 1 (53%)
Putative function: Sporulation stage II, protein D Firmicutes (IPR014225).
COG functional category: COG2385 Sporulation protein and related proteins

Clone: H11B1
No. of amino acids: 916
Best PSI-BLAST result (identity %, E-value): gi149811721 serine/threonine protein kinase (Roseobacter sp. AzwK-3b) (39%, 3e-44). Length: 444 amino acids.
Similarity with glycosyl hydrolase family 2 Prosite patterns: 1: <50%, 2: 51%
Activity, architecture, and genetic context: PNP-G deglucuronidation. UDP-glycosyltransferase/glycogen phosphorylase superfamily. Domain glycosyltransferase group 1. TPR repeat SL1 domain. Pattern hexapeptide-repeat–containing transferase signature (68%)
Putative function: Bifunctional protein, signal transduction mechanisms, polysaccharide production/transport interaction with the host.
COG functional category: COG general function prediction.

Genes identified on fosmid inserts conferring deglucuronidation activity were analyzed using PSI-BLAST search on NCBI against nonredundant protein sequences of GenBank (release 163.0, http://www.ncbi.nlm.nih.gov/BLAST/) and EXPASY (http://www.expasy.ch/tools/blast/) databases. Domain architectures, potential functions, and associated patterns were analyzed using InterProScan (http://www.ebi.ac.uk/InterProScan/), MyHits from the Swiss Institute of Bioinformatics (http://www.isb-sib.ch/), PROSITE (http://www.expasy.org/prosite/), SMART (http://smart.embl-heidelberg.de/), PFAM (http://www.sanger.ac.uk/Software/Pfam/), MotifScan (http://myhits.isb-sib.ch/cgi-bin/motif_scan), and SUPERFAMILY (http://supfam.org/SUPERFAMILY/). Proteins implicated in transport systems were analyzed with the Transport Classification Database (http://www.tcdb.org/).
*1 and 2 for PS00719 and PS00608 Prosite patterns, respectively. Conserved patterns were found on PROSITE (http://www.expasy.org/prosite/), and similarities were determined on PBIL (http://npsa-pbil.ibcp.fr/) using PATTINPROT.
Table S2. Glycosyl hydrolase family 2 signatures conservation in the unique BG and homologs
Metagenomic clone or genome | NCBI reference and no. of amino acids | Prosite pattern PS00719 (position and similarity %) | Prosite pattern PS00608 (position and similarity %)
Metagenomic clone H11G11 | | 309–334, 82% | 362–376, 93%
Metagenomic clone C7D2 | | 314–339, 86% | 369–383, 90%
R. gnavus ATCC 29149 | ZP_02041835 (747 aa) | 304–329, 76% | 359–373, 90%
R. gnavus ATCC 29149 | ZP_02040205 (757 aa) | 316–341, 86% | 371–385, 90%
R. gnavus ATCC 29149 | ZP_02042987 (640 aa) | 298–323, 86% | 351–365, 89%
S. variabile DSM 15176 | ZP_03776158 (735 aa) | 312–337, 86% | 365–379, 81%
S. variabile DSM 15176 | ZP_03774451 (727 aa) | 319–344, 86% | 372–386, 93%
S. variabile DSM 15176 | ZP_03776204 (639 aa) | 329–354, 86% | 382–396, 89%
B. formatexigens DSM 14469 | ZP_03687412 (756 aa) | 322–347, 86% | 377–391, 90%
B. formatexigens DSM 14469 | ZP_03685645 (756 aa) | 331–356, 86% | 384–398, 71%
B. formatexigens DSM 14469 | ZP_03686296 (642 aa) | 305–330, 76% | 358–372, 77%
F. prausnitzii M21/2 | EDP22313 (735 aa) | 302–327, 86% | 355–369, 93%
R. inulinivorans DSM 16841 | ZP_03753494 (640 aa) | 298–323, 86% | 351–365, 93%
C. bartlettii DSM 16795 | ZP_02210507 (746 aa) | 311–336, 76% | 364–378, 93%
B. capillosus ATCC 29799 | ZP_02038110 (641 aa) | 299–324, 86% | 352–366, 89%
S. satelles DSM 14600 | ZP_04455048 (766 aa) | 321–346, 86% | 374–388, 93%
Paenibacillus sp. JDR-2 | ZP_02847334 (767 aa) | 326–351, 86% | 381–395, 90%
Homologies of signatures (Prosite patterns) of glycosyl hydrolase family 2 (galactosidases/glucuronidases) and unique BG homologs were studied using the PATTINPROT tool (http://npsa-pbil.ibcp.fr/).
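To illustrate how a Prosite-style signature such as the new PS00719 pattern given with Fig. S3 is read, the sketch below converts it into a Python regular expression and scans a sequence. It is an assumption-laden illustration, not the PATTINPROT procedure: the helper `prosite_to_regex` is hypothetical, it supports only the syntax used in that pattern (residues, `x`, bracketed alternatives, repeat counts), and a regular expression reports exact matches only, whereas Table S2 reports partial similarities. The peptide in `toy` is synthetic, built solely to satisfy the pattern.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Convert a simple Prosite pattern into a Python regex string.

    Supported elements: a single residue (R), any residue (x), a set of
    alternatives ([LIVM]), each optionally followed by (n) or (n,m).
    """
    pieces = []
    for element in pattern.split("-"):
        element = element.strip()
        m = re.fullmatch(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)(?:,(\d+))?\))?", element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = m.groups()
        piece = "." if core == "x" else core
        if low is not None:
            piece += f"{{{low},{high}}}" if high is not None else f"{{{low}}}"
        pieces.append(piece)
    return "".join(pieces)


# New PS00719 pattern as printed with Fig. S3 (26 positions in total).
PS00719_NEW = (
    "[NT]-x-[LIVMFYWD]-R-[STACNL](2)-H-Y-[PQ]-x(4)-"
    "[LIVMFYWS](2)-x(3)-[DN]-x(2)-G-[LIVMFYWA](4)"
)
regex = re.compile(prosite_to_regex(PS00719_NEW))

# Synthetic peptide (NOT a real protein): two flanking residues on each side
# of a 26-residue stretch constructed to satisfy the pattern.
toy = "MK" + "NALRSSHYQAAAALLAAADAAGLLLL" + "EE"
match = regex.search(toy)
print(regex.pattern)
print(match.start() + 1, match.end())  # 1-based start and end of the match
```

The 26-residue span of the pattern is consistent with the PS00719 position ranges in Table S2 (for example 309–334 for H11G11).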
Table S3. Search for UidA homologs in genomes possessing the unique BG sequence

Known and confirmed BG-positive strains | Protein sources for homology search | Best tBlastn, localization, and homology (% identity, % similarity) | Amino acid sequence homology with H11G11 (localization, % identity, % similarity) | Assigned as H11G11 BG-like group (position on contig, % H11G11 identity, % H11G11 similarity)
S. variabile DSM 15176 | P05804 E. coli | S_variabile-1.0.1_Cont0.25 (19849–21483) 26%, 42% | (19873–22053) 48%, 61% | ZP_03774451 (19873–22056) 45%, 57%
S. variabile DSM 15176 | gi12802352 L. gasseri | S_variabile-1.0.1_Cont0.19 (59210–60709) 25%, 44% | (59090–61003) 55%, 69% | ZP_03776204 (59090–61009) 43%, 54%
S. variabile DSM 15176 | gi34581788 R. gnavus | S_variabile-1.0.1_Cont0.19 (59192–60709) 25%, 41% | |
B. ovatus ATCC 8483 | P05804 E. coli | B_ovatus-MSIQ_Cont517 (133275–131608) 26%, 44% | (133134–131608) 25%, 44% | <50 similarity
B. ovatus ATCC 8483 | gi12802352 L. gasseri | B_ovatus-MSIQ_Cont460 (264053–265639) 27%, 42% | (263942–265777) 40%, 54% | ZP_02065512 (263912–265909) 34%, 46%
B. ovatus ATCC 8483 | gi34581788 R. gnavus | B_ovatus-MSIQ_Cont517 (133293–131590) 26%, 41% | (133134–131608) 25%, 44% | <50 similarity
P. johnsonii DSM 18315 | P05804 E. coli | P_johnsonii-1.0_Cont16.36 (8883–7312) 28%, 42% | (9015–7090) 40%, 55% | ZP_03478350 (7039–9075) 34%, 46%
P. johnsonii DSM 18315 | gi12802352 L. gasseri | P_johnsonii-1.0_Cont16.36 (8946–7312) 27%, 43% | (9015–7090) 40%, 55% |
P. johnsonii DSM 18315 | gi34581788 R. gnavus | P_johnsonii-1.0_Cont16.36 (8940–7318) 25%, 41% | (9015–7090) 40%, 55% |
F. prausnitzii M21/2 | P05804 E. coli | F_prausnitzii_M21_2-2.0.1_Cont750 (31470–33260) 35%, 53% | (31497–33233) 25%, 41% | <50 similarity
F. prausnitzii M21/2 | gi12802352 L. gasseri | F_prausnitzii_M21_2-2.0.1_Cont750 (31473–33242) 42%, 57% | |
F. prausnitzii M21/2 | gi34581788 R. gnavus | F_prausnitzii_M21_2-2.0.1_Cont750 (31473–33233) 38%, 53% | |
R. gnavus ATCC 29149 | P05804 E. coli | R_gnavus-1.0.1_Cont39.1 (2674–4113) 27%, 41% | (2491–4407) 53%, 68% | ZP_02042987 (2491–4410) 45%, 57%
R. gnavus ATCC 29149 | gi12802352 L. gasseri | R_gnavus-1.0.1_Cont39.1 (2611–4113) 26%, 45% | |
R. gnavus ATCC 29149 | gi34581788 R. gnavus | R_gnavus-1.0.1_Cont244 (46895–48577) 25%, 41% | (46892–49153) 46%, 60% | ZP_02040206 (46886–49156) 44%, 57%
B. capillosus ATCC 29799 | P05804 E. coli | B_capillosus-2.0.1_Cont317 (45901–47340) 27%, 39% | (45721–47634) 52%, 67% | ZP_02038110 (45715–47637) 44%, 56%
B. capillosus ATCC 29799 | gi12802352 L. gasseri | B_capillosus-2.0.1_Cont317 (45730–47340) 26%, 44% | |
B. capillosus ATCC 29799 | gi34581788 R. gnavus | B_capillosus-2.0.1_Cont317 (45724–47352) 24%, 41% | |
Table S3. Cont.

Known and confirmed BG-positive strains | Protein sources for homology research | Best tBlastn, localization, and homology (% identity, % similarity) | Amino acid sequence homology with H11G11 (localization, % identity, % similarity) | Assigned as H11G11 BG-like group (position on contig, % H11G11 identity, % H11G11 similarity)
B. formatexigens DSM 14469 | P05804 E. coli | B_formatexigens-1.0.1_Cont23.1 (74444–76252) 43%, 57% | (74696–76225) 26%, 41% | <50 similarity
B. formatexigens DSM 14469 | gi12802352 L. gasseri | B_formatexigens-1.0.1_Cont2.1 (232634–234424) 37%, 55% | (232901–234415) 28%, 44% |
B. formatexigens DSM 14469 | gi34581788 R. gnavus | B_formatexigens-1.0.1_Cont2.1 (232643–234421) 40%, 54% | |
P. merdae ATCC 43184 | P05804 E. coli | P_merdae-MSIQ_Cont63 (214079–212508) 28%, 42% | (214211–212286) 41%, 55% | ZP_02031489 (214271–212238) 34%, 46%
P. merdae ATCC 43184 | gi12802352 L. gasseri | P_merdae-MSIQ_Cont63 (214211–212508) 26%, 43% | |
P. merdae ATCC 43184 | gi34581788 R. gnavus | P_merdae-MSIQ_Cont63 (214091–212514) 26%, 41% | |
C. bartlettii DSM 16795 | P05804 E. coli | C_bartlettii-2.0.1_Cont15.1 (40677–42332) 24%, 39% | (40674–42899) 58%, 70% | ZP_02210507 (40674–42911) 54%, 68%
C. bartlettii DSM 16795 | gi12802352 L. gasseri | C_bartlettii-2.0.1_Cont15.1 (40656–42332) 22%, 39% | |
C. bartlettii DSM 16795 | gi34581788 R. gnavus | C_bartlettii-2.0.1_Cont15.1 (40722–42332) 23%, 39% | |
R. inulinivorans DSM 16841 | P05804 E. coli | R_inulinivorans-1.0.1_Cont419.1 (75233–73734) 26%, 42% | (75356–73458) 53%, 67% | ZP_03753494 (75356–73437) 44%, 57%
R. inulinivorans DSM 16841 | gi12802352 L. gasseri | R_inulinivorans-1.0.1_Cont419.1 (75236–73734) 27%, 45% | |
R. inulinivorans DSM 16841 | gi34581788 R. gnavus | R_inulinivorans-1.0.1_Cont419.1 (75233–73731) 25%, 40% | |
To determine the relationship between these strain activities and their sequence potentialities for both UidA (E. coli K12 P05804) and H11G11-BG homologs, UidA homologs were searched using tblastn in the genomes concerned and further aligned to H11G11-BG homologs. Comparative analysis of identity percentages between the two BG classes allowed identification of strains for which a H11G11-BG homolog was the only known BG gene able to generate an activity. The same approach was performed with the three Bacteroidetes strains studied (unique BG group) except that the incomplete 16S rRNA sequence of Parabacteroides merdae ATCC 43184T was replaced by the closest sequence from uncultured bacterium clone TS4_a02f06.
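The comparison of identity percentages described in this note can be sketched as a small decision rule. The rule below is an assumption made for illustration only: the table marks some regions as "<50 similarity", so the sketch treats a region as H11G11-BG-like when its similarity with H11G11 is at least 50% and its identity with H11G11 exceeds its identity with the E. coli UidA query. The names `Row` and `assign` and the threshold parameter are hypothetical; the numbers are copied from the P05804 rows of Table S3.

```python
from typing import NamedTuple


class Row(NamedTuple):
    strain: str
    contig: str
    uida_identity: int      # % identity, best tBlastn hit with E. coli UidA (P05804)
    h11g11_identity: int    # % identity of the region with H11G11
    h11g11_similarity: int  # % similarity of the region with H11G11


# Values copied from the P05804 rows of Table S3.
ROWS = [
    Row("S. variabile DSM 15176", "Cont0.25", 26, 48, 61),
    Row("B. ovatus ATCC 8483", "Cont517", 26, 25, 44),
    Row("F. prausnitzii M21/2", "Cont750", 35, 25, 41),
    Row("B. formatexigens DSM 14469", "Cont23.1", 43, 26, 41),
    Row("C. bartlettii DSM 16795", "Cont15.1", 24, 58, 70),
]


def assign(row: Row, min_similarity: int = 50) -> str:
    """Hypothetical rule: H11G11-like if similarity >= threshold and the
    region is closer to H11G11 than to UidA by percent identity."""
    if (row.h11g11_similarity >= min_similarity
            and row.h11g11_identity > row.uida_identity):
        return "H11G11-BG-like"
    return "not assigned"


for row in ROWS:
    print(f"{row.strain:28s} {row.contig:9s} {assign(row)}")
```

Under this assumed rule, the S. variabile and C. bartlettii regions are assigned to the H11G11-BG-like group, while the three regions marked "<50 similarity" in the table are not.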
Table S4. Oligonucleotides and PCR conditions used in this study
Gene (metagenomic clone) | Primer | Sequences
Putative aspartyl-tRNA synthetase H3D4 | F-H3D4-Synth-Gt | 5′-GGA TCC AAT GTA TAG ATC ACA CAC CTG CGG AGA ATT G-3′
Putative aspartyl-tRNA synthetase H3D4 | R-H3D4-Synth-Gt | 5′-CTG CAG TCA CTT AAA AGT GAC CTT TTT ATT CAT CAG GAA-3′
Putative 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase C11H2 | F-C11H2-Ery | 5′-GGA TCC AAT GAG TTC GGG AAT GAT GAA ATT CAG A-3′
Putative 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase C11H2 | R-C11H2-Ery | 5′-CTG CAG CTA TTC CCT GTA AAT CAG GGC GAC CGC-3′
Putative serine/threonine protein kinase H11B1 | F-H11B1 | 5′-GAA TTC GAT GCC AGC AGA AGG CCA GCC AGA C-3′
Putative serine/threonine protein kinase H11B1 | R-H11B1 | 5′-CTG CAG TTA ATT GAT TTC AAT TCC GGA GTT TTT-3′
Putative sporulation protein H11G11 | F-H11G11-Spo | 5′-GGA TCC AAT GAA AAC ATA CGG TAT TTT ATG CTT-3′
Putative sporulation protein H11G11 | R-H11G11-Spo | 5′-CTG CAG TCA AAT AAT TTC GGT GTT CGG ATA GTA-3′
Putative ABC transporter, permease protein H11G11 | F-H11G11-Perm | 5′-GGA TCC AAT GAG AAA AGA TTT TCA GCG GGA AAT A-3′
Putative ABC transporter, permease protein H11G11 | R-H11G11-Perm | 5′-CTG CAG TTA TAC CAA AGC GGC AAC AAG GAA GAA TAT-3′
Putative β-galactosidase H11G11 | F-H11G11-Bgal | 5′-GGA TCC AAT GCG AGA AGT AAT AAA TAT AAA CAA-3′
Putative β-galactosidase H11G11 | R-H11G11-Bgal | 5′-CTG CAG TTA TTT CTG CTT TTT CTT AAT CTT GTT-3′
Underlining indicates restriction enzyme sites (BamHI and PstI, except for putative serine/threonine protein kinase H11B1: EcoRI and PstI). PCR was conducted under the following conditions:
Putative aspartyl-tRNA synthetase H3D4: (F 0.3 μM, R 0.3 μM), PCR (15′ 95 °C; 5 cycles: 94 °C 1′, 62 °C 1′, 72 °C 2′; 5 cycles: 94 °C 1′, 57 °C 1′, 72 °C 2′; 10 cycles: 94 °C 1′, 52 °C 1′, 72 °C 2′10; and 10-min elongation step at 72 °C).
Putative serine/threonine protein kinase H11B1: (F 0.2 μM, R 0.2 μM), PCR (15′ 95 °C; 5 cycles: 94 °C 1′, 68 °C 1′, 72 °C 2′; 5 cycles: 94 °C 1′, 63 °C 1′, 72 °C 2′; 20 cycles: 94 °C 1′, 58 °C 1′, 72 °C 3′; and 10-min elongation step at 72 °C).
Putative ABC transporter, permease protein H11G11: (F 0.2 μM, R 0.2 μM), PCR (15′ 95 °C; 5 cycles: 94 °C 1′, 66 °C 1′, 72 °C 1′30; 20 cycles: 94 °C 1′, 61 °C 1′, 72 °C 1′30; and 10-min elongation step at 72 °C).
Putative 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase C11H2: (F 0.3 μM, R 0.3 μM, 3% DMSO), PCR (15′ 95 °C; 5 cycles: 94 °C 1′, 68 °C 1′, 72 °C 30′′; 20 cycles: 94 °C 1′, 62 °C 1′, 72 °C 45′′; and 10-min elongation step at 72 °C).
Putative sporulation protein H11G11: (F 0.3 μM, R 0.3 μM), PCR (15′ 95 °C; 5 cycles: 94 °C 1′, 63 °C 1′, 72 °C 1′; 20 cycles: 94 °C 1′, 58 °C 1′, 72 °C 1′; and 10-min elongation step at 72 °C).
Putative β-galactosidase H11G11: (F 0.6 μM, R 0.6 μM, 3% DMSO), PCR (15′ 95 °C; 5 cycles: 94 °C 1′, 52 °C 1′, 72 °C 2′; 20 cycles: 94 °C 1′, 47 °C 1′, 72 °C 2′; and 10-min elongation step at 72 °C).
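Because the underlining that marks the restriction sites does not survive in plain text, the sites can be recovered from the primer sequences themselves. The sketch below is a minimal illustration, assuming the standard recognition sequences for BamHI (GGATCC), PstI (CTGCAG), and EcoRI (GAATTC); the function name `leading_site` is hypothetical and only four of the primers from Table S4 are included.

```python
# Standard recognition sequences of the three enzymes named in the note.
SITES = {"BamHI": "GGATCC", "PstI": "CTGCAG", "EcoRI": "GAATTC"}

# A subset of the primers listed in Table S4 (5' to 3').
PRIMERS = {
    "F-H3D4-Synth-Gt": "GGA TCC AAT GTA TAG ATC ACA CAC CTG CGG AGA ATT G",
    "R-H3D4-Synth-Gt": "CTG CAG TCA CTT AAA AGT GAC CTT TTT ATT CAT CAG GAA",
    "F-H11B1": "GAA TTC GAT GCC AGC AGA AGG CCA GCC AGA C",
    "R-H11B1": "CTG CAG TTA ATT GAT TTC AAT TCC GGA GTT TTT",
}


def leading_site(primer: str):
    """Return the enzyme whose site starts the primer, or None."""
    seq = primer.replace(" ", "").upper()
    for enzyme, site in SITES.items():
        if seq.startswith(site):
            return enzyme
    return None


for name, sequence in PRIMERS.items():
    print(f"{name:16s} {leading_site(sequence)}")
```

For these four primers the check reports BamHI and PstI for the H3D4 pair and EcoRI and PstI for the H11B1 pair, in line with the note above.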