Folded-back Solution Structure of Monomeric Factor H of Human Complement by Synchrotron X-ray and Neutron Scattering, Analytical Ultracentrifugation and Constrained Molecular Modelling

Mohammed Aslam and Stephen J. Perkins*

Department of Biochemistry and Molecular Biology, Royal Free Campus, Royal Free and University College Medical School, University College London, Rowland Hill Street, London NW3 2PF, UK

doi:10.1006/jmbi.2001.4720 available online at http://www.idealibrary.com on J. Mol. Biol. (2001) 309, 1117–1138. 0022-2836/01/051117–22 $35.00/0 © 2001 Academic Press

Factor H (FH) is a regulatory cofactor for the protease factor I in the breakdown of C3b in the complement system of immune defence, and binds to heparin and other polyanionic substrates. FH is composed of 20 short consensus/complement repeat (SCR) domains, for which the overall arrangement in solution is unknown. As previous studies had shown that FH can form monomeric or dimeric structures, X-ray and neutron scattering was accordingly performed with FH in the concentration range between 0.7 and 14 mg ml⁻¹. The radius of gyration of FH was determined to be 11.1-11.3 nm by both methods, and the radii of gyration of the cross-section were 4.4 nm and 1.7 nm. The distance distribution function P(r) showed that the overall length of FH was 38 nm. The neutron data showed that FH was monomeric with a molecular mass of 165,000(±17,000) Da. Analytical ultracentrifugation data confirmed this, where sedimentation equilibrium curve fits gave a mean molecular mass of 155,000(±3,000) Da. Sedimentation velocity experiments using the g*(s) derivative method showed that FH was monodisperse and had a sedimentation coefficient of 5.3(±0.1) S. In order to construct a full model of FH for scattering curve and sedimentation coefficient fits, homology models were constructed for 17 of the 20 SCR domains using knowledge of the NMR structures for FH SCR-5, SCR-15 and SCR-16, and vaccinia coat protein SCR-3 and SCR-4. Molecular dynamics simulations were used to generate a large conformational library for each of the 19 SCR-SCR linker peptides. Peptides from these libraries were combined with the 20 SCR structures in order to generate stereochemically complete models for the FH structure. Using an automated constrained fit procedure, the analysis of 16,752 possible FH models showed that only those models in which the 20 SCR domains were bent back upon themselves were able to account for the scattering and sedimentation data. The best-fit models showed that FH had an overall length of 38 nm and is flexible. This length is significantly less than a predicted length of 73 nm if the 20 SCR structures had been arranged in an extended arrangement. This outcome is attributed to several long linker sequences. These bent-back domain structures may correspond to conformational flexibility in FH and enable the multiple FH binding sites for C3 and heparin to come into close proximity.

Keywords: complement factor H; X-ray scattering; neutron scattering; short consensus/complement repeat; analytical ultracentrifugation

*Corresponding author. E-mail address of the corresponding author: [email protected]

Abbreviations used: β2GPI, β2-glycoprotein I; CD46, membrane cofactor protein; FH, factor H; SCR, short consensus/complement repeat; VCP, vaccinia virus complement control protein.

Introduction

In the complement system, factor H (FH) is present in plasma at about 0.5 mg ml⁻¹, in which its principal function is to regulate the alternative

pathway of complement activation by acting as a cofactor for factor I in the breakdown of C3b to form iC3b.¹ It also accelerates the decay of the C3 convertase C3bBb, and competes with factor B for binding to C3b. FH consists of 20 short consensus/complement repeat (SCR) domains, also known as complement control protein repeats, each of length about 61 residues. This is the most abundant domain type in the complement proteins.¹ As shown in Figure 1, which is adapted from recent reviews,²,³ the cofactor and decay accelerating activity is located within the four N-terminal domains, SCR-1 to SCR-4, which bind to intact C3b; a second C3 site is located within SCR-6 to SCR-10 that binds to the C3c region of C3b; and a third site is located within SCR-16 and SCR-20 that binds to the C3d region of C3b.⁴,⁵ Heparin modulates the complement regulatory functions of FH. Two heparin-binding sites have been located in SCR-7 and SCR-20 in recombinant FH,⁶,⁷ and a third heparin-binding site is suggested to be located at or near SCR-13.⁸ The synergistic action of these domains is thought to enable FH to achieve differential control of complement activation on activators and non-activators of the alternative pathway of complement. FH has been identified as an adhesion ligand for human neutrophils and binds to the integrin MAC-1 (CD11b/CD18), which is also a receptor for iC3b (CR3) and ICAM-1. This interaction enhances the activation response of neutrophils.⁹ FH is a ligand for L-selectin.¹⁰ FH also binds to surface-expressed sialylated lipooligosaccharides of pathogens such as Neisseria gonorrhoeae and Streptococcus pyogenes to restrict the activation of the complement alternative pathway and enhance the pathogenicity of these microorganisms.¹¹⁻¹³

Structural studies of FH are hindered by its size, and its crystallisation is thought to be precluded by the 19 potentially flexible peptide links between the 20 SCR domains and by the existence of up to nine putative glycosylation sites in its sequence.¹⁴⁻¹⁶ Alternative structural methods have had to be used. X-ray and neutron scattering showed that FH was dimeric in a range of solution conditions, with a radius of gyration RG of 12.5 nm or more.¹⁷ A structure based on small spheres arranged as two rods of length 77 nm and joined to form a V-shape with an angle of 5° between them was successfully used to model the scattering curve. Studies of FH in the relatively harsh unphysiological conditions of electron microscopy showed that FH is extended and formed a population of folded-back structures with a typical overall length of 49.5 nm; however, this study showed that FH was monomeric.¹⁸ Accordingly, the monomeric or dimeric state of FH is unclear. Atomic structures have been determined by 2D-NMR for the single domains SCR-5, SCR-15 and SCR-16 of human FH, and for a domain pair SCR-15/16.¹⁹⁻²² This has now been augmented by the SCR-3/4 NMR structure of the vaccinia virus complement control protein (VCP),²³ and more recently by two crystal structures for the two N-terminal SCR domains of membrane cofactor protein (CD46) and all five SCR domains of β2-glycoprotein I (β2GPI).²⁴,²⁵ The combined results of these studies showed that there is no preferred structural orientation between two adjacent SCR domains, and that it is not possible to extrapolate from the knowledge of these linker structures to predict the overall 20-domain structure of FH.

X-ray and neutron scattering are powerful techniques that enable the oligomeric state of proteins and their structures to be determined in solution.²⁶ In comparison to the previous scattering study of FH,¹⁷ the availability of improved instrumentation makes it possible to obtain X-ray and neutron data over a wider range of scattering angle. In addition, the scattering data are now supplemented by analytical ultracentrifugation data. Advances in scattering and sedimentation modelling have shown that these data are fully calculable from atomic structures.²⁷ Here, a recently developed method of constrained automated curve fits based on testing a wide range of stereochemically allowed structures leaves only the best-fit structures that are consistent with the observed curves, and this has much improved the utility of the method.²⁸ Using these methods, we reinvestigated the solution properties of FH and demonstrated that this is monomeric under the conditions used here. A medium-resolution solution structure determination for FH was initiated starting from the known NMR SCR structures, which were used to construct homology models for the 17 remaining SCR domains in FH. Molecular dynamics simulations were used to construct peptide conformations that enabled the 20 domains to be linked together to produce structurally complete conformations of FH. By the use of automated curve fits in a trial-and-error approach, a total of 16,752 FH models was evaluated to show that FH has a conformationally flexible folded-back structure in physiological buffers. The immunological implications of this FH structure are discussed.

Figure 1. Schematic view of the 20 SCR domains of FH. The positions of three C3b binding sites (the first four having decay accelerating and factor I activity), three heparin-binding sites on SCR-7, SCR-13 and SCR-20 (black), and nine putative N-linked glycosylation sites are shown. The Figure is adapted from Zipfel et al. (1999) and Pangburn (2000).

Figure 2. Guinier analyses of X-ray data on FH. The curves are arbitrarily displaced on the intensity axis for clarity. Filled circles correspond to the I(Q) data used to obtain RG and RXS values. The straight lines correspond to the best fit through these points. The same three scattering curves were used in (a)-(c) and corresponded to sample concentrations of 3.6 mg ml⁻¹, 1.8 mg ml⁻¹ and 1.8 mg ml⁻¹ from top to bottom. (a) RG plots using a Q range of 0.08-0.13 nm⁻¹ for fits. The QRG value of 1.4 is based on an RG value of 11 nm. (b) RXS-1 plots using a Q range of 0.16-0.26 nm⁻¹ for fits. The QRXS-1 values are based on an RXS-1 value of 4.1 nm. (c) RXS-2 plots using a Q range of 0.4-0.8 nm⁻¹ for fits. The QRXS-2 values are based on an RXS-2 value of 1.7 nm.

Results and Discussion

X-ray scattering data for FH

The purification of FH for scattering and ultracentrifugation studies resulted in single bands when analysed by SDS-PAGE and homogeneous peaks by gel-filtration (see Materials and Methods).²⁹ X-ray solution scattering experiments were performed on native FH in the concentration range between 0.7 and 14 mg ml⁻¹ in Tris buffer. Guinier RG fits were linear in an appropriate QRG range up to 1.4 (Figure 2(a)), where $Q = 4\pi \sin\theta/\lambda$ ($2\theta$ = scattering angle; $\lambda$ = wavelength). Inspection of the ten time frames obtained during data acquisition showed no radiation damage effects. The 26 RG measurements showed no concentration dependence (Figure 3(a)), and likewise no dependence was detected for the I(0)/c values (data not shown). This showed that the oligomeric state of FH was unchanged between 0.7 and 14 mg ml⁻¹. The mean radius of gyration RG was determined to be 11.1(±0.4) nm, which is less than an apparent RG value of at least 12.5 nm reported for dimeric FH.¹⁷ In cross-sectional Guinier fits, two linear regions of fits were identified in separate Q ranges as previously reported for dimeric FH (Figure 2(b) and (c)). These Q ranges resulted in mean values of the cross-sectional radii of gyration RXS-1 and RXS-2 of 4.4 nm and 1.7 nm, respectively (Figure 3(c) and (d)). The present RXS-1 value of 4.4(±0.2) nm is slightly greater than that of 3.6(±0.4) nm for dimeric FH, while the present RXS-2 value of 1.7(±0.1) nm agrees within error with that of 1.8(±0.3) nm for dimeric FH. If the solution structure of FH is approximated as an elongated elliptical cylinder, $L = [12(R_G^2 - R_{XS}^2)]^{1/2}$, where L is its length,³⁰ the values of RG and RXS-1 or RXS-2 resulted in a length estimate of 35-38 nm. The length L is also given by $L = \pi I(0)/[I(Q)\,Q]_{Q \to 0}$ from the RXS-1 cross-sectional analysis,³¹ and this was determined to be similar at 39(±3) nm.
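As a rough illustration of the Guinier analysis and the elliptical-cylinder length estimate described above, the following Python sketch fits ln I(Q) against Q² and then evaluates L = [12(RG² − RXS²)]^(1/2) with the mean X-ray values quoted in the text. The function names and the synthetic, noise-free curve are illustrative assumptions and are not the authors' software.

```python
import math


def guinier_fit(q_values, intensities):
    """Fit ln I(Q) = ln I(0) - (RG^2 / 3) * Q^2 by least squares.

    Returns (RG, I0). Units follow the inputs (here Q in nm^-1, RG in nm).
    """
    x = [q * q for q in q_values]
    y = [math.log(i) for i in intensities]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.sqrt(-3.0 * slope), math.exp(intercept)


def cylinder_length(rg, rxs):
    """Length of an elongated elliptical cylinder: L = [12 (RG^2 - RXS^2)]^(1/2)."""
    return math.sqrt(12.0 * (rg ** 2 - rxs ** 2))


# Synthetic, noise-free Guinier curve (assumed RG = 11.1 nm, I(0) = 1),
# sampled over the Q range of 0.08-0.13 nm^-1 used for the RG fits.
q_values = [0.08 + 0.005 * k for k in range(11)]
intensities = [math.exp(-(11.1 ** 2) * q ** 2 / 3.0) for q in q_values]
rg_fit, i0_fit = guinier_fit(q_values, intensities)

# Length estimates from the mean X-ray RG, RXS-1 and RXS-2 values in the text.
length_rxs1 = cylinder_length(11.1, 4.4)
length_rxs2 = cylinder_length(11.1, 1.7)

print(f"fitted RG = {rg_fit:.2f} nm, I(0) = {i0_fit:.3f}")
print(f"L from RXS-1 = {length_rxs1:.1f} nm, L from RXS-2 = {length_rxs2:.1f} nm")
```

With RG = 11.1 nm, the two cross-sectional radii give lengths of about 35 nm and 38 nm, which is the 35-38 nm range quoted above.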

The indirect transformation of the scattering data I(Q) in reciprocal space into real space gives the distance distribution function P(r), which represents all the distance vectors between pairs of atoms within FH. This gives a different calculation of the RG and I(0) values that is based on the full scattering curve in the Q range between 0.08 and 2 nm⁻¹, and gives another determination of L. From the concentration-dependence of 22 RG values calculated from GNOM, an RG value of 11.3(±0.5) nm was determined at zero concentration (Figure 3(a)). This is in good agreement with the Guinier value of 11.1 nm, despite the experimental difficulties of measuring large RG values at the lowest Q range near the beamstop.
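To make the real-space quantities concrete, the sketch below builds a histogram of pair distances for a set of points and reads off the maximum dimension L, the most frequent distance M, and RG. The two bead models (a straight rod and a simple hairpin of 20 beads at 4 nm spacing) are hypothetical toy constructions chosen only for illustration. They are not the FH models of this study, and the histogram is a plain pair count rather than the GNOM indirect transformation.

```python
import math
from itertools import combinations


def pair_distances(points):
    """All distances between distinct pairs of points (i < j)."""
    return [math.dist(a, b) for a, b in combinations(points, 2)]


def histogram(distances, bin_width):
    """Counts of pair distances per bin: a discrete stand-in for P(r)."""
    counts = [0] * (int(max(distances) / bin_width) + 1)
    for d in distances:
        counts[int(d / bin_width)] += 1
    return counts


def rg_from_coordinates(points):
    """RG as the root-mean-square distance from the centre of the points."""
    n = len(points)
    centre = tuple(sum(p[k] for p in points) / n for k in range(3))
    return math.sqrt(sum(math.dist(p, centre) ** 2 for p in points) / n)


def rg_from_distances(distances, n_points):
    """RG from pair distances: RG^2 = (1 / N^2) * sum over i<j of d_ij^2."""
    return math.sqrt(sum(d * d for d in distances) / n_points ** 2)


SPACING = 4.0    # nm per bead, assumed to mimic one SCR domain
BIN_WIDTH = 2.0  # nm, arbitrary choice for the histogram

# Toy models (assumptions): a straight rod and a hairpin folded back once.
extended = [(SPACING * i, 0.0, 0.0) for i in range(20)]
folded = [(SPACING * i, 0.0, 0.0) for i in range(10)] + [
    (SPACING * (9 - i), SPACING, 0.0) for i in range(10)
]

results = {}
for name, beads in (("extended", extended), ("folded", folded)):
    distances = pair_distances(beads)
    counts = histogram(distances, BIN_WIDTH)
    results[name] = {
        "L": max(distances),
        "M": (counts.index(max(counts)) + 0.5) * BIN_WIDTH,
        "RG_coords": rg_from_coordinates(beads),
        "RG_pairs": rg_from_distances(distances, len(beads)),
    }
    print(name, {key: round(value, 2) for key, value in results[name].items()})
```

The two RG routes agree exactly for point sets, which mirrors the way the P(r) analysis provides an independent check on the Guinier RG. Folding the toy rod back on itself roughly halves its maximum dimension.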

Figure 3. Concentration dependence of the X-ray and neutron RG, RXS-1 and RXS-2 values for FH. The standard deviations of individual data points are shown when large enough to be visible. (a) The mean of the 26 X-ray RG values derived from Guinier fits (*) resulted in an RG value of 11.1(±0.4) nm, while the mean of the 22 RG values derived from GNOM P(r) analyses (*) likewise gave an RG value of 11.3(±0.5) nm. (b) The mean of the 13 neutron RG values derived from Guinier fits (*) resulted in an RG value of 11.3(±0.4) nm at zero concentration, while the mean of the 14 RG values derived from GNOM P(r) analyses (*) likewise gave an RG value of 11.7(±0.2) nm. (c) The mean of 18 X-ray RXS-1 values (*) gave 4.4(±0.2) nm, while the mean of seven neutron RXS-1 values (*) likewise gave 3.9(±0.2) nm. (d) The mean of 20 X-ray RXS-2 values (*) gave 1.7(±0.03) nm, while the mean of seven neutron RXS-2 values (*) likewise gave 1.5(±0.1) nm.

The X-ray P(r) curve for FH (Figure 4(a)) shows that the maximum M, the most frequently occurring distance within FH, occurs at 11 nm, and that the length L is 40 nm. As the NMR structures of single SCR domains show that each is approximately 4 nm long, the length of FH would be about 80 nm if the 20 SCR domains were in a fully extended arrangement. As the value of L is about half the expected length, this clearly showed that the SCR domains in FH are bent back upon themselves in solution.

Neutron scattering data for FH

Neutron scattering experiments were performed on native FH at concentrations between 0.4 and 9.6 mg ml⁻¹ in PBS in 99.9 % ²H₂O (Materials and Methods). Neutrons were used to verify the absence of radiation damage effects on the scattering curve that can be encountered with X-rays, as previously observed with dimeric FH.¹⁷ Because of the large RG value for FH, only Instruments D11 and D22 at the ILL were capable of measuring sufficiently accurate neutron data at low Q values in the Guinier analyses (data not shown). The use of the same X-ray Q range to obtain the neutron Guinier RG values resulted in a mean RG value of 11.3(±0.4) nm (Figure 3(b)). The RXS-1 and RXS-2 values were determined using the same Q ranges used in the X-ray analyses to be 3.9(±0.2) nm and 1.51(±0.06) nm, respectively (Figure 3(c) and (d)). From the RG and the RXS-1 or RXS-2 values, L was determined to be 37-39 nm. From the ratio of intensities using the first cross-sectional analysis, L was estimated to be 46(±6) nm.

Neutron data were also obtained for FH using Instrument LOQ at the ISIS facility. Since the minimum Q value of these measurements was effectively 0.1 nm⁻¹, this precluded the determination of RG values. Nonetheless, it was possible to estimate the molecular mass of FH from the mean I(0)/c value of 0.188 ± 0.019 for data obtained at 3.7 mg ml⁻¹ and 6.1 mg ml⁻¹. From a conversion graph,²⁸ a value of 165,000(±17,000) Da was determined, in good agreement with a monomeric FH molecular mass of 150,000 Da calculated from its sequence.

The neutron distance distribution function P(r) indicated a maximum M at 12 nm and a length L of 40 nm, both of which agreed well with the X-ray data. The mean RG value determined from the P(r) curves was 11.7(±0.5) nm. Together with the agreements noted above, the good agreement of these values with the corresponding X-ray values indicates that high-quality scattering data had been obtained.

Sedimentation equilibrium and velocity data for FH

To determine whether FH was monomeric or dimeric, sedimentation equilibrium experiments were performed for FH at nine concentrations between 0.2 and 6.2 mg ml⁻¹ in PBS (see Materials and Methods). Using both absorbance data monitored at 280 nm and interferometry data, the molecular mass of FH was determined to be 153,000 Da in Figure 5(a) and to be 159,000 Da in Figure 5(b) on the assumption that a single species was present. No concentration-dependence of the molecular mass for FH was observed at 11,000 r.p.m. (Figure 5(c)). At higher speeds of 14,000 r.p.m. and 17,000 r.p.m., apparent concentration dependences were observed, and these gave molecular masses of 158,000 Da and 148,000 Da on extrapolation to zero concentration. The existence of these concentration-dependences was attributed to the sensitivity of the molecular mass determination to the high intensity of the equilibrium curve at the largest radius in the cell, at which measurements were less accurate. These experiments confirmed that FH is a monomer.

The sedimentation coefficient s°20,w monitors the macromolecular elongation. As this is independent of the RG value, it provides a control of the scattering data. Velocity experiments were performed at 20,000 r.p.m., 25,000 r.p.m., 30,000 r.p.m. and 40,000 r.p.m. using FH samples at concentrations between 0.2 and 1.8 mg ml⁻¹ in PBS. Analysis using the g(s*) method to analyse pairs of scans showed that the s°20,w value was 5.34 S from the peak position in Figure 6. From a total of 14 determinations at these rotor speeds, the mean s°20,w value was determined to be 5.3(±0.1) S. This is in good agreement with previous determinations of the s°20,w value of 5.5-5.6 S.³²,³³ From the Gaussian peak width in Figure 6, which gives the diffusion coefficient, the molecular mass was determined from this and the s°20,w value by the Svedberg equation to be 147,000 Da under measurement conditions in which time-broadening effects were negligible.

Figure 4. X-ray and neutron distance distribution functions P(r) for FH. The maximum of the P(r) curve is denoted by M, the most frequently occurring distance within FH, which occurs at 11 nm (X-rays) and 12 nm (neutrons), and the maximum dimension is denoted by L, which is 40 nm for both the X-ray and the neutron data.

Figure 5. Sedimentation equilibrium data for FH by analytical ultracentrifugation. (a) and (b) Best-fit curves for FH at 0.4 mg ml⁻¹ and 1.2 mg ml⁻¹ based on the assumption of a single species, both recorded at a rotor speed of 11,000 r.p.m. using absorbance and interference optics, respectively. In the upper panels, the residuals of the curve fits are distributed randomly. (c) The observed molecular mass values of FH are shown as a function of concentration based on the interference data. Three rotor speeds were used (*, 11,000 r.p.m.; *, 14,000 r.p.m.; ~, 17,000 r.p.m.). Regressions to zero concentrations gave molecular mass values of 155,000(±3000) Da, 158,000(±2000) Da, and 148,000(±2000) Da, respectively. The error bars represent the standard deviation of the mean molecular mass from three scans at equilibrium.

Figure 6. Fit of the g(s*) distribution to a Gaussian function based on sedimentation velocity data for FH by analytical ultracentrifugation. For this, ten scans were taken that were recorded at ten minute intervals using absorbance optics at 280 nm during a run at 25,000 r.p.m. for an FH sample at 0.7 mg ml⁻¹ in PBS. From this, s°20,w was determined to be 5.34 S, and the molecular mass was 147,000 Da. For this fit, the maximum permissible measurable molecular mass was over 16 times the expected value,⁶² meaning that time-broadening effects were negligible.
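The Svedberg equation mentioned in the velocity analysis relates the molecular mass M to the sedimentation and diffusion coefficients, M = sRT / [D(1 − v̄ρ)]. The sketch below is a minimal illustration in cgs units. The partial specific volume (0.72 ml g⁻¹) and buffer density (0.998 g ml⁻¹) are placeholder assumptions that are not taken from the paper, and the diffusion coefficient is back-calculated from the reported mass rather than measured, so only the form of the relation is shown.

```python
R_GAS = 8.314462618e7  # gas constant in erg mol^-1 K^-1 (cgs units)


def svedberg_mass(s_svedberg, diffusion_cm2_s, vbar_ml_g, rho_g_ml, temp_k=293.15):
    """Molecular mass (g/mol) from M = s R T / [D (1 - vbar * rho)]."""
    s_seconds = s_svedberg * 1e-13  # 1 S = 1e-13 s
    buoyancy = 1.0 - vbar_ml_g * rho_g_ml
    return s_seconds * R_GAS * temp_k / (diffusion_cm2_s * buoyancy)


def implied_diffusion(s_svedberg, mass_da, vbar_ml_g, rho_g_ml, temp_k=293.15):
    """Diffusion coefficient (cm^2/s) implied by a given s and M (same relation inverted)."""
    s_seconds = s_svedberg * 1e-13
    buoyancy = 1.0 - vbar_ml_g * rho_g_ml
    return s_seconds * R_GAS * temp_k / (mass_da * buoyancy)


# Placeholder values (assumptions, not reported in the text above).
ASSUMED_VBAR = 0.72   # ml/g, hypothetical partial specific volume
ASSUMED_RHO = 0.998   # g/ml, approximate density of water at 20 degrees C

# Reported values: s = 5.34 S and M = 147,000 Da from the Figure 6 analysis.
d_implied = implied_diffusion(5.34, 147_000.0, ASSUMED_VBAR, ASSUMED_RHO)
mass_back = svedberg_mass(5.34, d_implied, ASSUMED_VBAR, ASSUMED_RHO)

print(f"implied D = {d_implied:.3e} cm^2/s under the assumed vbar and rho")
print(f"round-trip mass = {mass_back:.0f} Da")
```

Because the buoyancy term depends directly on v̄ and ρ, the implied diffusion coefficient shifts with those assumed values. The round trip only confirms that the two functions are consistent with each other.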

Homology modelling of 17 SCR domains of FH

In order to construct a full model of FH that will account for the scattering and sedimentation data, homology models for the 17 FH SCR domains of unknown structure were constructed using the experimentally determined NMR structures of SCR-5, SCR-15 and SCR-16 in FH, and two more SCR domains in VCP. This modelling employed the alignment of the FH SCR domains in Figure 7, which was constructed using residue conservation in the sequence alignment of 101 SCR domains,³⁴ the secondary structure assignments from DSSP,³⁵ and the side-chain solvent-accessibilities from COMPARER.³⁶,³⁷ In this alignment, Cys4, Cys33, Cys47, Trp52, and Cys58 were highly conserved. This sequence alignment resulted in the alignment of the six antiparallel β-strands that make up the structural core of the SCR domain. The seven reference SCR structures were assessed using PROCHECK to show from Ramachandran plots of the mainchain torsion angles that 42-70 % of the residues occurred in the most-favoured stereochemical regions.³⁸,³⁹ In comparison, two recent crystal structures of β2GPI with five SCR domains and the N-terminal pair of CD46 showed that 85-87 % of residues occurred in the most favoured regions.²⁴,²⁵ Given the benchmark that 90 % of the residues should be in the most favoured region, it was inferred that the NMR structures are less accurate than those derived from crystallography, but are sufficient for the modelling of FH by scattering and the assessment of the biological function of FH.

The 17 homology models were optimally constructed from four of the seven NMR structures with only short insertions and deletions using standard homology modelling methods (Table 1; see Materials and Methods). The models were assessed using Ramachandran plots to show that 40-71 % of residues occurred in the most favoured regions (this percentage generally reflected that reported for the reference structure, being the same or slightly reduced).

A randomised search for modelling the X-ray solution structure of FH: method 1

The lengths of the 19 linker sequences in human FH varied between three and eight residues, between Cys58 in the preceding SCR domain and Cys4 in the following SCR domain. Five of the 19 linkers contained a single Pro residue immediately preceding Cys4 at the start of the SCR domain. This will restrict the allowed linker main-chain conformation and cause this to be kinked (Figure 7 and Table 1); none of the six known SCR linker structures possesses a Pro residue in this position. A high proportion of 36 residues (40 %) of the 89 present in the 19 linker sequences are charged, where 21 basic residues occur between one and three times in 15 linkers, and 15 acidic residues occur one or two times in 12 linkers. Glu (12 %) and Lys (17 %) residues occur the most frequently. This is in contrast to the 20 SCR domains of FH, within which a lower proportion of 24 % of residues are charged.
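The linker statistics quoted in this section can be checked with a few lines of arithmetic. The sketch below uses the linker sizes as transcribed from Table 1 (the dictionary name and layout are illustrative assumptions) and recomputes the range, the mean and sample standard deviation, the number of three- or four-residue linkers, and the charged fraction of 36 out of 89 residues.

```python
import statistics

# Linker sizes in residues following each SCR domain, as listed in Table 1.
# SCR-20 is the last domain and has no following linker.
LINKER_SIZES = {
    "SCR-1": 4, "SCR-2": 4, "SCR-3": 4, "SCR-4": 4, "SCR-5": 4,
    "SCR-6": 3, "SCR-7": 5, "SCR-8": 3, "SCR-9": 4, "SCR-10": 6,
    "SCR-11": 6, "SCR-12": 8, "SCR-13": 7, "SCR-14": 5, "SCR-15": 4,
    "SCR-16": 4, "SCR-17": 4, "SCR-18": 6, "SCR-19": 3,
}

sizes = list(LINKER_SIZES.values())
n_linkers = len(sizes)
shortest, longest = min(sizes), max(sizes)
mean_size = statistics.mean(sizes)
sd_size = statistics.stdev(sizes)  # sample standard deviation
n_short = sum(1 for size in sizes if size in (3, 4))

# Charged residues in the linkers, using the counts quoted in the text.
charged_percent = 100.0 * 36 / 89

print(f"{n_linkers} linkers, {shortest}-{longest} residues")
print(f"mean size = {mean_size:.1f} +/- {sd_size:.1f} residues")
print(f"linkers with three or four residues: {n_short}")
print(f"charged linker residues: {charged_percent:.0f} %")
```

The recomputed mean and standard deviation round to the 4.6(±1.4) residues given in the footnote to Table 1, and 12 of the 19 linkers have three or four residues.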

Table 1. Summary of the homology modelling of the 17 SCR domains of FH

Domain   PDB code(a)  Insertions   Deletions    Linker size(b)  Linker length
                      (residues)   (residues)   (residues)      (nm)
SCR-1    1hfh-1       1,3          1            4               1.59 (*)
SCR-2    1vvc-4       1,1          1            4               1.66
SCR-3    1vvc-4       4            None         4               1.66
SCR-4    1hcc         None         1            4               1.66
SCR-5    SCR-5        None         None         4               1.59 (*)
SCR-6    1hfh-1       2,3          1            3               1.32
SCR-7    SCR-5        None         None         5               1.98
SCR-8    1hfh-1       1,3          3            3               1.32
SCR-9    SCR-5        2            None         4               1.66
SCR-10   1vvc-4       None         1            6               2.32
SCR-11   1hcc         1            None         6               2.32
SCR-12   1hcc         None         None         8               2.98
SCR-13   SCR-5        2            1,4          7               2.64
SCR-14   1hcc         None         None         5               1.91 (*)
SCR-15   1hfh-1       None         None         4               1.59 (*)
SCR-16   1hcc         None         None         4               1.66
SCR-17   1hcc         1            None         4               1.66
SCR-18   1hcc         1            None         6               2.32
SCR-19   1hcc         1            None         3               1.26 (*)
SCR-20   1hfh-1       1,2,3        1

(a) The PDB code denotes the reference structure used for homology modelling. The coordinates for SCR-5 are not in the PDB and were kindly provided by Dr P. N. Barlow (Edinburgh, UK).

(b) A linker sequence is defined by the number of residues between Cys58 and Cys4 in two adjacent SCR domains. The 19 inter-domain linkers in FH possess a mean size of 4.6(±1.4) residues (their sequences are reported in Figure 7). The length of the linker is defined by the distance between the Cα atoms of Cys58 and Cys4, and the values reported here are when the linker is extended as a β-strand, noting that its value will depend on whether a Pro residue is present in the linker (*).

Eight starting models for FH were created based on known or predicted linker structures. Seven inter-SCR orientations are known from two NMR and two crystal structures (Table 2). Six of these were four residues in size, and the distances between the Cys58 and Cys4 Cα atoms range between 1.29 and 1.63 nm. That between SCR-4 and SCR-5 in β2GPI contained three residues and is of length 1.30 nm. Only 12 of the 19 linker sequences in FH possess three or four residues. The remainder contained between five and eight residues, meaning that the modelling of these based on known linker structures will be arbitrary (Table 1). The use of the first linker structure in β2GPI to define all 19 interdomain orientations in FH resulted in the most extended FH model of tip-to-tip length 73 nm (Figure 8; Table 2). This most extended FH model resulted in a predicted X-ray RG value of 21.0 nm, and both the RXS values were similar at 0.88-0.90 nm, all of which deviate significantly from the observed RG values of 11.1-11.3 nm and the two observed RXS values of 4.4 nm and 1.7 nm (Figure 2). The visual comparison of the experimental and predicted curves showed wide deviations (Figure 9(a)). The goodness-of-fit R-factor defined by analogy with protein crystallography and based on the X-ray experimental curve in the Q range extending to 1.4 nm⁻¹ (denoted as R1.4)⁴⁰,⁴¹ was 37.1 %, which is high compared to values of 1.2-8.7 % obtained in typical good curve fits.²⁷ In contrast, the use of the VCP linker resulted in the most compact FH model of tip-to-tip length 6 nm. This model also resulted in large differences between the experimental and predicted curves (not shown), where the predicted RG value was 3.4 nm, the R1.4 value was 41.9 %, and the predicted curve visibly deviated from experiment. Helical FH models such as those based on the FH SCR 15/16 and the β2GPI SCR 3/4 linkers could likewise be ruled out from these trial simulations (Figure 9(b) and (c)).
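The R-factor comparison used above to reject the extended and compact models can be sketched as follows. The text states only that the R-factor is defined by analogy with protein crystallography, so the explicit form used here, R = 100 × Σ| |I_exp| − η|I_calc| | / Σ|I_exp| with a least-squares scale factor η, is an assumption, as are the function names. The Gaussian curves are synthetic stand-ins and are not measured or modelled FH data, so the numbers printed are not expected to reproduce the 37.1 % quoted above.

```python
import math


def r_factor(q_values, i_exp, i_calc, q_max=1.4):
    """Crystallography-style R-factor (%) between two curves for Q <= q_max.

    Assumed form: 100 * sum(| |I_exp| - eta * |I_calc| |) / sum(|I_exp|),
    with eta chosen by least squares so that overall scaling does not matter.
    """
    rows = [(e, c) for q, e, c in zip(q_values, i_exp, i_calc) if q <= q_max]
    eta = sum(e * c for e, c in rows) / sum(c * c for e, c in rows)
    numerator = sum(abs(abs(e) - eta * abs(c)) for e, c in rows)
    denominator = sum(abs(e) for e, c in rows)
    return 100.0 * numerator / denominator


def guinier_curve(q_values, rg):
    """Synthetic Gaussian curve I(Q) = exp(-RG^2 Q^2 / 3), used as a stand-in."""
    return [math.exp(-(rg * q) ** 2 / 3.0) for q in q_values]


# Q grid in nm^-1 from 0.08 up to about 1.4 (points above q_max are ignored).
q_grid = [0.08 + 0.01 * k for k in range(133)]

stand_in_experiment = guinier_curve(q_grid, 11.1)  # RG near the observed value
extended_model = guinier_curve(q_grid, 21.0)       # RG of the most extended model
rescaled_copy = [2.0 * value for value in stand_in_experiment]

r_self = r_factor(q_grid, stand_in_experiment, stand_in_experiment)
r_scaled = r_factor(q_grid, stand_in_experiment, rescaled_copy)
r_extended = r_factor(q_grid, stand_in_experiment, extended_model)

print(f"R (identical curves) = {r_self:.2f} %")
print(f"R (rescaled copy)    = {r_scaled:.2f} %")
print(f"R (RG 21.0 vs 11.1)  = {r_extended:.1f} %")
```

A curve compared with itself, or with a rescaled copy of itself, gives an R-factor of zero, while a curve with a very different RG gives a large value. This is the sense in which low R-factors identify good curve fits.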

Information to guide the modelling of FH was obtained from Table 2. Whilst the agreement between experiment and prediction was generally poor in the FH models of intermediate lengths, it was noticeable that the models based on the FH SCR 15/16 and the β2GPI SCR 3/4 linkers were of length 35-39 nm, which is comparable with the length of 40 nm for FH determined from the P(r) curve. The RG values of 10.2-10.8 nm for these two models were comparable with experiment. The RXS-2 values of 1.50 nm and 1.43 nm for the models based on the FH SCR 15/16 and the β2GPI SCR 3/4 linkers are close to the observed RXS-2 value of 1.7 nm. This indicated that the RXS-2 values monitored the averaged orientation of adjacent pairs of SCR domains. The greater the kink between two adjacent SCR domains, the larger the RXS-2 values become (Table 2; Figure 8). In contrast, no correlation could be observed between the RXS-1 values of these models and the observed RXS-1 value in Figure 2(b), from which it was inferred that the RXS-1 value monitored the higher-order SCR domain structure in FH. This is consistent with the existence of five longer linkers between the SCR domains of FH (Table 1), which would permit the folding back of these domains. If the 19 linkers in FH were modelled in β-strand conformations (Table 1), the resulting FH structure exhibited an extended but bent arrangement of SCR domains. These bends resulted from positional variations in the linker coordinates used to superimpose each linker with its neighbouring SCR domains and the presence of Pro residues. However, this model was too extended to account for the solution structure data (Table 2; Figure 9(d)).

Based on these guidelines, automated modelling strategies based on selections of random linker structures were developed for FH in order to provide a good representation of its structure in solution. Preliminary searches had shown that, if length constraints were not used with the linker structures, the resulting FH models frequently presented intertwined and overlapping convoluted structures. Consequently, automated scattering curve fit searches were initiated with the extended


Figure 7. Sequence and structure alignment of the FH SCR sequences used for homology modelling with NMR structures for three SCR domains from FH and two SCR domains from VCP (Table 1). Nine putative N-glycosylation sites are in bold and underlined, of which the three sites within the SCR-4, SCR-12 and SCR-18 domains were assumed to be non-glycosylated in the FH models. Potential heparin-binding residues are denoted by plus symbols. The residue conservation in the FH SCR domains is compared with that observed in 101 SCR domains.34 The observed secondary structures in the NMR structures were identified using DSSP,35 where β-strand residues are denoted by E. The consensus six β-strands (B1-B6) consistently identified in these structures are shown above the alignment. The observed side-chain accessibilities in these structures (e, exposed; b, buried) were identified using COMPARER,37 where 0 denotes 0-9 % accessibility, 1 denotes 10-19 % accessibility, and so on up to 9. Residues with accessibilities up to 19 % are denoted as buried.


Table 2. Summary of modelling searches to determine compatible solution structures for FH

Orientation of linker | Minimum length of each linker (nm) | Number of models | Good-fit models (a) | Tip-to-tip distance (b) (nm) | X-ray RG (nm) | X-ray RXS-1 (nm) | X-ray RXS-2 (nm) | R-factor (%) | s°20,w (S)
Experimental values | | | | | 11.1-11.3 | 4.4 | 1.7 | n.a. | 5.3
β2GPI SCR 1/2 | (four residues) | 1 | 0 | 73 | 21.0 | 0.88 | 0.90 | 37 | 4.1
β2GPI SCR 2/3 | (four residues) | 1 | 0 | 70 | 20.3 | 1.10 | 0.91 | 35 | 4.3
CD46 SCR 1/2 | (four residues) | 1 | 0 | 69 | 19.5 | 0.70 | 0.84 | 40 | 4.2
β2GPI SCR 4/5 | (three residues) | 1 | 0 | 51 | 14.5 | 0.68 | 1.13 | 28 | 5.1
FH SCR 15/16 | (four residues) | 1 | 0 | 39 | 10.8 | 0.23 | 1.50 | 32 | 5.6
β2GPI SCR 3/4 | (four residues) | 1 | 0 | 35 | 10.2 | 2.52 | 1.43 | 19 | 5.6
VCP SCR 3/4 | (four residues) | 1 | 0 | 6 | 3.4 | 3.85 | 2.71 | 42 | 9.4
Extended | Defined in Table 1 | 1 | 0 | 49 | 16.4 | 2.66 | 1.00 | 31 | 4.5
Randomised-1 | (Peptides × 0.35) − 0.25 | 2011 | 13 | 21(±6) | 9.8(±0.3) | 4.27(±0.41) | 1.52(±0.08) | 22(±2) | 5.2(±0.1)
Randomised-2 | (Peptides × 0.35) − 0.15 | 2005 | 28 | 24(±5) | 9.9(±0.4) | 4.23(±0.40) | 1.55(±0.13) | 21(±3) | 5.1(±0.1)
Randomised-3 | (Peptides × 0.35) − 0.05 | 2010 | 31 | 23(±7) | 9.9(±0.2) | 4.25(±0.40) | 1.52(±0.10) | 19(±3) | 5.0(±0.1)
Randomised-4 | 95% of Table 1 lengths | 2009 | 19 | 22(±6) | 9.8(±0.3) | 4.34(±0.48) | 1.49(±0.07) | 20(±2) | 5.1(±0.1)
Randomised-5 | 98% of Table 1 lengths | 2010 | 16 | 24(±4) | 9.9(±0.3) | 4.30(±0.36) | 1.56(±0.11) | 20(±2) | 5.1(±0.1)
Randomised-6 | 105% of Table 1 lengths | 1997 | 34 | 21(±7) | 10.1(±0.4) | 4.39(±0.40) | 1.56(±0.11) | 21(±4) | 5.0(±0.1)
Rotational | FH SCR 15/16 | 2700 | 9 | 32(±3) | 7.7(±0.1) | 2.12(±0.20) | 1.52(±0.02) | 28(±1) | 5.6(±0.1)
Randomised-7 | Hybrid (see the text) | 2010 | 11 | 23(±4) | 9.7(±0.1) | 4.16(±0.19) | 1.58(±0.05) | 25(±1) | 5.3(±0.1)

(a) The good-fit models in the last eight rows were defined by filters that were set to be 15 % of the RG, RXS-1 and RXS-2 values. For the rotational search, the filters that yielded nine best-fit models were set to 30-60 % (see the text).

(b) The tip-to-tip distance is measured between the Cα atoms of Glu1 in the SCR-1 domain and Arg1213 in the SCR-20 domain.
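The tip-to-tip distance in Table 2 is a simple interatomic distance, which can be sketched as follows. This is a minimal illustration, assuming atomic coordinates in Ångström as is usual in PDB files; the function name and the two coordinate triples are hypothetical and are not taken from any FH model.

```python
import math

def tip_to_tip_nm(ca_first, ca_last):
    """Distance in nm between two C-alpha positions given in Angstrom."""
    # math.dist returns the Euclidean distance; 10 Angstrom = 1 nm.
    return math.dist(ca_first, ca_last) / 10.0

# Hypothetical C-alpha coordinates (Angstrom) standing in for Glu1 and Arg1213.
glu1_ca = (0.0, 0.0, 0.0)
arg1213_ca = (120.0, 160.0, 0.0)

distance = tip_to_tip_nm(glu1_ca, arg1213_ca)
print(f"tip-to-tip distance: {distance:.1f} nm")
```

With these invented coordinates the separation is 200 Å, that is 20 nm, which happens to fall in the range reported for the randomised searches.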


Figure 8. Seven models for FH based on 19 identical interdomain linkers determined for seven NMR and crystal structures and an extended linker. That shown for CD46 was derived from molecule B in the PDB file. The extended model is derived from Table 1. The scattering and sedimentation parameters for these eight models are reported in Table 2. The SCR domains are numbered 1 to 20 on the left to follow Figures 1 and 7. The six carbohydrate chains assumed to be on SCR-9, SCR-13, SCR-14, SCR-15 (two) and SCR-17 are shown in black.


FH structure of Figure 9(d) in order to evaluate SCR arrangements that would account for the FH scattering curve. These were developed using molecular dynamics simulations to generate stereochemically allowed linker conformations, in which several types of constraints were applied to retain the lengths of the 19 linkers, while leaving the start and end of the 19 linkers to vary randomly (see Materials and Methods). A library of 10,000 peptide conformations was generated for each of the 19 linkers. In each automated search, a linker peptide was taken randomly from each library in order to construct each full FH model from superimpositions between the 20 SCR models and 19 linkers. Even though the NMR structure for the linker between SCR-15 and SCR-16 is known, randomised structures for this were included in the modelling as well for consistency. As the modelling procedure caused the 20 SCR domains to bend into all orientations relative to each other, this resulted in the generation of FH structures that ranged from compact to extended. Six variations of this search resulted in between 1997 and 2011 models in each one (the Randomised-1 to Randomised-6 searches in Table 2), giving a total of 12,042 models.
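The random selection step described above can be sketched in a few lines. This is only an illustration of the sampling logic, assuming each of the 19 libraries is indexed from 0 to 9,999; the function name and the fixed seed are hypothetical, and the superimposition of linkers onto SCR domains is not shown. The model counts are those listed for Randomised-1 to Randomised-6 in Table 2.

```python
import random

N_LINKERS = 19          # linkers joining the 20 SCR domains
LIBRARY_SIZE = 10_000   # peptide conformations generated per linker

def draw_linker_indices(rng):
    """Pick one conformation index at random from each of the 19 libraries."""
    return [rng.randrange(LIBRARY_SIZE) for _ in range(N_LINKERS)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
choice = draw_linker_indices(rng)

# Models produced by the six randomised searches (Table 2).
models_per_search = [2011, 2005, 2010, 2009, 2010, 1997]
total_models = sum(models_per_search)
print(len(choice), total_models)
```

Each call to `draw_linker_indices` defines one candidate FH model; the sum of the six search sizes reproduces the total of 12,042 models quoted in the text.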

Best-fit filters based on deviations of 15 % from the experimental X-ray data were applied to the 12,042 models in turn to reject the unsatisfactory ones. The first filter corresponded to the absence of steric overlap between SCR domains (defined as a requirement for >95 % of the FH volume to be present in each sphere model). Subsequent filters progressed to the RG value (12.9 nm > RG > 9.5 nm), the RXS-1 value (5.1 nm > RXS-1 > 3.7 nm), and the RXS-2 value (1.96 nm > RXS-2 > 1.45 nm). The RG, RXS-1 and RXS-2 filters proved to be discriminatory between good-fit and poor-fit models. Between 13 and 34 good-fit models were identified from each search. By this approach, a broad population of FH structures was generated. For example, in the 2010 models of the Randomised-3 search, the tip-to-tip distance between SCR-1 and SCR-20 varied from 1 to 58 nm (Figure 10(a)). The 31 best-fit RG values


Figure 9. Four trial X-ray curve fits for the solution structure of FH. The modelled curves (continuous lines) are compared with X-ray data for FH (filled circles). The Q ranges corresponding to those used for the RG, RXS-1 and RXS-2 determinations in Figure 2 are indicated. (a) This FH model was based on the interdomain SCR orientation found between the β2GPI SCR-1 and SCR-2 domains. This resulted in the most extended FH model (Figure 8), for which the scattering curve resembled that of a rod-like cylinder. (b) This FH model was based on the interdomain SCR orientation found in the NMR structure of SCR-15 and SCR-16 in FH. This resulted in a helical structure. (c) This FH model was based on the interdomain SCR orientation found between the β2GPI SCR-3 and SCR-4 domains. This resulted in a helical structure and a pronounced dip in the curve. (d) This extended FH model was created by setting the 19 linker peptide conformations to be extended β-strands (Table 1; Figure 8). The overall arrangement of the SCR domains depended on the orientations of the Cys4 and Cys58 residues in each of the 20 SCR structures.


from this search gave good RXS-1 and RXS-2 values that were located in the centre of this population, showing that a sufficiently wide population had been sampled in the search. The comparison of the R1.4 and RG values in Figure 10(b) showed a V-shaped distribution with a shallow minimum at an R1.4 value of 12 % that extended over RG values between 8 and 10.5 nm. The 31 best-fit RG values occurred at the upper limit of this minimum, and this showed that the RG, RXS-1 and RXS-2 filters had selected modelled curves that gave good fits to the experimental data. Only the Randomised-3 search showed this good agreement. The other five searches resulted in V-shaped distributions that were narrower than those in Figure 10(b) and placed the 13-34 good-fit structures in regions that were not close to the minimum in R1.4 values. In the Randomised-3 search, four of these 31 models gave low R1.4 values of 14-15 % and exhibited higher RG values in better agreement with experiment. These models were accordingly selected as the final representative best-fit structures for FH (Figure 11). All four good-fit FH structures showed folded-back SCR arrangements with tip-to-tip distances in a range of 19-33 nm. No structural families could be identified from the modelling, where Figure 12(a) showed that the orientation of the SCR domains is random in all cases. However, since the longest model in Table 2 had a maximum tip-to-tip distance in FH of 73 nm, this showed the extent to which a given FH domain structure has to be folded back in order to provide a good fit to the experimental X-ray scattering curve.
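The sequence of best-fit filters used in the randomised searches can be illustrated with a short sketch. It assumes that each model has already been reduced to four summary numbers; the class and function names are hypothetical, while the windows are the ones quoted in the text for the steric overlap, RG, RXS-1 and RXS-2 filters.

```python
from dataclasses import dataclass

# Filter windows quoted in the text (nm), applied as strict inequalities.
WINDOWS_NM = {"rg": (9.5, 12.9), "rxs1": (3.7, 5.1), "rxs2": (1.45, 1.96)}
MIN_VOLUME_FRACTION = 0.95  # >95 % of the FH volume must be retained

@dataclass
class ModelSummary:
    volume_fraction: float  # fraction of FH volume present in the sphere model
    rg: float               # radius of gyration (nm)
    rxs1: float             # cross-sectional radius of gyration 1 (nm)
    rxs2: float             # cross-sectional radius of gyration 2 (nm)

def passes_filters(model):
    """Apply the steric-overlap filter first, then the three size windows."""
    if model.volume_fraction <= MIN_VOLUME_FRACTION:
        return False
    return all(lo < getattr(model, name) < hi
               for name, (lo, hi) in WINDOWS_NM.items())

# RG and RXS values from Table 2; the volume fractions are invented.
fh_scr_15_16 = ModelSummary(1.00, rg=10.8, rxs1=0.23, rxs2=1.50)
randomised_3_mean = ModelSummary(0.98, rg=9.9, rxs1=4.25, rxs2=1.52)

print(passes_filters(fh_scr_15_16), passes_filters(randomised_3_mean))
```

In this sketch the FH SCR 15/16 trial model is rejected by its RXS-1 value alone, whereas a model with the mean Randomised-3 parameters passes all four filters.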

A rotational search for modelling the X-ray solution structure of FH: method 2

A control modelling search based on rotations of four-domain segments was carried out in order to refine the conclusions of the randomised model searches. As the orientation of the SCR 15/16 pair in FH gave an RXS-2 value close to the experimental value, this domain pair constituted the basis of a systematic rotational conformational search for the FH structure. The 20 SCR domains were modelled as five segments of four SCR domains each. Within each segment, the three interdomain orientations followed that of SCR 15/16 in FH. Each of four SCR segments was rotated by equal increments while the preceding segment was held fixed, so that rotations could be applied systematically to each of the four segments in order to encompass the full range of possible orientations in the FH models (see Materials and Methods). This gave 2700 models. The resulting RG values ranged from 11.3 nm for the most extended model to 3.9 nm for the most compact structure. The R1.4 values ranged from 29 % for the best curve fit to 43 % for the worst fit. Significantly larger filters of 30 % in the RG values and 60 % in the RXS-1 values were needed to identify nine best-fit models. Models with RG values close to the experimental value of 11.1-11.3 nm resulted in high R1.4 values, while models with low R1.4 values were incompatible with the RG data. It was concluded that this search had failed to yield any good fits, and this showed that the SCR domains are not orientated in a regular fashion within FH.


Figure 10. Structural analysis of the 2010 FH models obtained from the Randomised-3 search. The RG values of the 31 best curve-fit FH models are indicated by the green and red circles, of which the four best-fit FH models are indicated by red circles. (a) Comparison of the RG values for the FH models (yellow circles) with the distance between the Cα atoms of Glu1 in the SCR-1 domain and Arg1213 in the SCR-20 domain. (b) Comparison of the R-factor (R1.4) values for the FH models with their RG values. Note that the four best-fit FH models are within the set of lowest R-factor values characterised by values between 12 and 15 %.


Another control modelling search was based on a hybrid strategy that investigated whether the conformation of FH was determined primarily by the conformations of the five longest linkers with six to eight residues. Four of these occur as a cluster between the SCR-10 and SCR-15 domains, and one is between SCR-18 and SCR-19 (Table 1). These longer linkers may permit a greater degree of folding back between neighbouring SCR domains than would be the case with the linkers containing three to five residues. For this search, the 14 linkers with three to five residues were modelled on the conformation found in the SCR 15/16 pair. The five longest linkers were modelled in random orientations using the peptide linkers in the Randomised-3 library (Table 2). A total of 2010 models was generated. Their RG values ranged from 5.6 nm for the most compact model to


Figure 11. X-ray and neutron curve fits for the four best-fit models for FH from the Randomised-3 search. In each panel (a)-(d), a Cα trace of each best-fit model is shown with carbohydrate attached. The positions of SCR-1 and SCR-20 are indicated as numbered. In each panel, the upper curve fit corresponds to X-ray data, while the lower curve fit corresponds to D11 neutron data. The continuous lines correspond to the calculated curve from each model. The Q ranges used in the RG, RXS-1 and RXS-2 analyses of Figures 2 and 3 are indicated by the arrowed ranges in (a).


11.5 nm for the most extended model, the RXS-1 values ranged from 0.07 to 4.94 nm, the RXS-2 values ranged from 0.09 to 2.04 nm, and the R1.4 values ranged from 13 to 34 % (data not shown). These ranges showed that a suitably wide range of conformational models had been created. The use of the best-fit filters of 15 % identified 11 models that satisfied the RG, RXS-1 and RXS-2 filters. These 11 models showed fits to the X-ray scattering curve that were comparable to those derived from the six randomised modelling searches, although the mean R-factor was slightly higher at 25(±1) % in place of the range of 19-22 % seen with the other six searches (Table 2). It was concluded that alternative modelling assumptions based on five flexible linker conformations that would yield folded-back SCR domain arrangements also resulted in good scattering curve fits for FH.

Neutron scattering curve modelling

As X-ray scattering can lead to radiation damage effects,17 and the data are recorded in a high positive solute-solvent contrast, the use of neutron scattering data with FH in heavy water provides a control of the X-ray data, in that radiation damage effects are avoided and the data are recorded in a high negative contrast. The use of two opposite contrasts ensures that possible internal inhomogeneities within FH that are caused by different scattering densities for protein and carbohydrate have no effect on the curve fits. The neutron modelling views FH as an unhydrated molecule to a good first approximation, which differs from the hydrated molecule seen by X-ray scattering. The 2010 models generated by the Randomised-3 search were accordingly compared with the D11 scattering curve for FH as an independent monitor of the scattering fit procedure. The modelled neutron curves gave RG values that ranged from 4.8 nm for the most compact model to 11.1 nm for the most extended model. Likewise, the RXS-1 values ranged from 0.22 to 5.08 nm, the RXS-2 values ranged from 0.08 to 2.85 nm, and the R1.0 values ranged from 5 to 25 %. By analogy with Figure 10, this showed that a wide range of neutron models had been created as desired. The application of 15 % filters with the neutron RG, RXS-1 and RXS-2 values (see the Neutron scattering data for FH section) resulted in the identification of 19 best models.


Figure 12. Summary of the four best-fit models for FH. The SCR-1 domains of the four best-fit FH models (a)-(d) from Figure 11 are superimposed upon each other in order to indicate the random orientations of SCR-2 to SCR-20 in these models. The views (e)-(h) are front and back electrostatic views of the model shown in Figure 11(d). This is depicted in four overlapping segments where, for clarity, the SCR domains are identified by numbers.


The means of the 19 neutron RG, RXS-1 and RXS-2 values were 10.0(±0.3) nm, 3.68(±0.26) nm and 1.37(±0.09) nm, respectively, and the mean R1.0 value was 6(±1) %. While these 19 models were different from the 31 best X-ray models (Table 2), they were comparable with them. Despite this difference, the four best-fit X-ray models showed excellent fits with the neutron data, in which the R1.0 values ranged between 6.2 and 8.0 %. It was concluded that the neutron fits substantiated the modelling of FH as a folded-back structure, in agreement with the X-ray modelling.


Sedimentation coefficient modelling of FH

The experimental sedimentation coefficient s°20,w of 5.3 S for FH provided an independent monitor of the RG value obtained by X-ray scattering. Both parameters correspond to the degree of elongation of a hydrated macromolecular structure. In the trial comparisons based on known orientations between adjacent SCR domains (Figure 8), the best agreement of the predicted s°20,w values with the experimental value of 5.3 S occurred with the β2GPI linker 4/5, from which the resulting FH model has an s°20,w value of 5.1 S. The FH SCR 15/16 and the β2GPI 3/4 linker models, however, gave the best agreements with the RG value of FH. This discrepancy showed that the RG and s°20,w values provide independent monitors of the FH structure.

The s°20,w values were calculated for all the hydrated 14,752 FH models in the eight searches of Table 2. The mean s°20,w values calculated for the nine to 34 best-fit models from each of these eight searches gave values of 5.0-5.2 S for the six randomised searches 1 to 6. These are within 0.1-0.3 S of the experimental value, which is within the precision expected for this type of modelling.42 This supports the good agreements between the experimental and predicted RG values. The four models in Figure 11 likewise showed consistent s°20,w values between 4.9 and 5.1 S. This agreement indicated that the supra-domain arrangement of FH influenced the s°20,w values, and that the scattering and sedimentation data both supported the models in Figure 11.
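The comparison of predicted and experimental sedimentation coefficients amounts to simple arithmetic, sketched below. The mean values are those listed in Table 2 for the six randomised searches; the variable names are hypothetical.

```python
EXPERIMENTAL_S = 5.3  # experimental sedimentation coefficient of FH (S)

# Mean predicted values for the good-fit models of each search (Table 2).
mean_s = {
    "Randomised-1": 5.2,
    "Randomised-2": 5.1,
    "Randomised-3": 5.0,
    "Randomised-4": 5.1,
    "Randomised-5": 5.1,
    "Randomised-6": 5.0,
}

# Difference between experiment and prediction, rounded to the table precision.
deviations = {name: round(EXPERIMENTAL_S - s, 1) for name, s in mean_s.items()}

for name, dev in deviations.items():
    print(f"{name}: {dev:.1f} S below experiment")
```

The smallest and largest deviations are 0.1 S and 0.3 S, matching the range stated in the text.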

Electrostatic surfaces of the FH models

Heparin-binding sites had been identified for FH in the SCR-7, SCR-13 and SCR-20 domains (Figure 1).6-8,43,44 To see whether these sites were visible in the four X-ray best-fit FH models, the surfaces of the 20 SCR domains in FH were examined for large basic patches by the use of electrostatic maps. The close-up views of Figure 12(f)-(h) showed that the SCR-7, SCR-13 and SCR-20 domains displayed these large basic surfaces (Figure 7), while the remaining SCR domains of FH did not. These domain assignments are in agreement with studies of heparin binding to FH.6-8 In SCR-7, residues involved in this basic surface are primarily Arg369, Lys370, Arg386, Lys387 and Lys392, with possible contributions from other residues. In SCR-13, His746, Lys748, Lys750 and Lys751 on one β-strand and Arg760, Arg762 and Arg764 on an adjacent β-strand form an appropriate patch, although the evidence for heparin binding to this domain is less clear than that for SCR-7 and SCR-20. In SCR-20, these residues involve Lys1184, Arg1185, Arg1188, Arg1192, Lys1212 and Arg1213, all of which form a cluster around the C terminus of FH. The comparisons of these basic residues with those in mouse and bovine FH show weak conservation, which suggests that heparin binding to FH may be species-dependent. Other potential heparin sites could be identified throughout the FH sequence. For example, there are 13 occurrences of a R/K-X-R/K tripeptide motif in eight other FH SCR domains; this motif has been implicated in heparin binding to SCR domains in viruses.45
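A scan for the R/K-X-R/K tripeptide motif can be sketched with a regular expression. This is an illustration only: the function name is hypothetical, the toy sequence is invented and is not part of the FH sequence, and overlapping matches are counted, which is an assumption about how occurrences were tallied.

```python
import re

# Lookahead so that overlapping tripeptides are all reported.
MOTIF = re.compile(r"(?=([RK].[RK]))")

def find_basic_tripeptides(sequence):
    """Return (1-based position, tripeptide) for every R/K-X-R/K match."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(sequence)]

toy_sequence = "ACKGRDEKRKT"  # invented one-letter sequence for demonstration
hits = find_basic_tripeptides(toy_sequence)
print(hits)
```

For the toy sequence the scan reports KGR at position 3 and KRK at position 8. Applied to a real sequence, the positions would still need to be mapped onto SCR domain boundaries before counting occurrences per domain.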

These electrostatic views also showed that large acidic patches were visible on SCR-9, SCR-10 and SCR-11 that may form a continuous surface depending on the interdomain orientation between these three domains, and another one is visible on SCR-18. In the light of the high proportion of charged residues that are present in the linker sequences, the electrostatic views were not able to show whether these contributed to any structural features of FH.

Conclusions

This study has presented the first conformational analysis of the full molecular structure of intact FH in terms of the individual structures of its 20 SCR domains and the 19 linker structures. The solution scattering and sedimentation experiments show that the solution structure of FH is monomeric. It has been commonly assumed that repeated SCR domains within a single protein would be arranged in extended conformations. This is clearly not the case for FH. The experimental data and modelling analyses show that the 20 SCR domains of FH are arranged as a folded-back conformation, and this will affect its interactions with C3b and factor I and other molecules such as heparin.

The earliest work with FH showed that this was monomeric in solution.33 Under certain preparation conditions, FH was reproducibly found to be dimeric and rapidly formed higher oligomers in the presence of zinc ions.17 Gel-filtration and electron microscopy then showed that FH was a flexible extended monomeric molecule with a length of 49.5 nm and a cross-sectional diameter of 3.4 nm.18 The discrepancies between these studies have now been resolved by the present investigation. Both neutron scattering and sedimentation equilibrium experiments showed that FH was monomeric with a molecular mass of 147,000-165,000 Da in the physiological concentration range between 0.2 and 0.6 mg ml−1. The scattering data gave an RG value of 11.1-11.3 nm for monomeric FH, which together with the sedimentation coefficient of 5.3 S indicated that its structure was elongated. The distance distribution functions P(r) agreed with the RG determinations and showed that the maximum dimension of FH in solution is 40 nm. This dimension is comparable with that of 49.5 nm obtained by electron microscopy in the sense that the harsh conditions used for microscopy can perturb the length determination if the FH structure is long and flexible.

It is unlikely that a molecule as large and flexible as FH can be crystallised intact. As a substitute at the present time for a crystal structure for FH, the scattering and ultracentrifugation data were directly modelled using molecular structures by the use of an automated procedure.27 The present study represents a novel extension of this automated procedure to the analysis of a large number of domains within a single structure. The strong modelling constraints inherent in the use of known SCR domain structures mean that the low resolution of scattering and ultracentrifugation is improved to result in a medium-resolution molecular model for FH. However, the modelling yields FH structures that are compatible only with scattering and ultracentrifugation data, and does not determine a unique structure. The constraints used here included the known FH sequence, the six carbohydrate chains, and the known NMR structures of SCR domains from FH and VCP. The modelling was implemented on the basis of tested procedures in two stages. In the first stage, the use of trial FH models showed that none of the experimentally determined interdomain linkers in NMR or crystal structures would give FH models that would fit the solution data. This was confirmed by the control calculations based on a systematic rotational search of segments of FH domains, which also failed. Nonetheless, these trials were useful in showing that the mean orientation between two adjacent SCR domains was similar to that between SCR-15 and SCR-16 of FH or between SCR-3 and SCR-4 in β2GPI. This orientation could be monitored experimentally by the RXS-2 value. In the second stage, the use of randomised extended linker conformations in a trial-and-error procedure showed from 14,052 models that only folded-back FH structures gave the best account of the scattering curves. These models were experimentally monitored by the use of both the RG and s°20,w values, which are sensitive to the overall macromolecular dimension, and by the RXS-1 value, which is sensitive to the proximity relationship of SCR domains that are not immediately adjacent to each other. The resulting best-fit FH models are consistent with the appearance of the looped and twisted FH structures visualised by electron microscopy that accounted for more than 75 % of the population of FH molecules.18

The simplest explanation for the existence of folded-back FH structures is provided by the existence of varied conformations for the 19 linker sequences in FH, in particular the longer ones. Variable interdomain orientations between adjacent SCR domains have been analysed in terms of seven known linker structures, containing three or four residues, that have been determined by NMR and crystallography.25 The different orientations from these studies result in a wide range of possible structures for FH. Even the six independent molecules seen in the crystal structure of CD46 SCR-1/2 show some flexibility at their interdomain interface.24 Further conformational variability between the SCR domains in FH is expected to arise from the different linker lengths in FH. Whilst 14 of these are between three and five residues in length, five consist of six, seven or eight residues, mostly between SCR-10 and SCR-14 (Table 1). There is uncertainty about the conformation of the longest linkers, as it is not known whether these will have an extended, looped or helical form, or will permit two adjacent SCR domains to bend back upon themselves. Finally, more variation in the linker conformations will result from the location of Pro3 residues adjacent to the Cys4 residue in five FH linkers (Figure 7). These are unique to FH and may directly perturb the orientation of the linkers.

The significance of the range of linker lengths in FH, in particular the longer linkers between SCR-10 and SCR-14, is shown by their high conservation in the sequences of mouse FH14 (SWISSPROT accession number P06909) and a partial sequence for bovine FH.16 There are only three minor exceptions, in that single-residue additions occur in the linkers between SCR-10 and SCR-11 and between SCR-12 and SCR-13 in bovine FH, and there is a two-residue shortening of the linker between SCR-13 and SCR-14 in mouse FH. A high proportion of the charged residues in the linker sequences are conserved in mouse and bovine FH. In addition, four of the five Pro3 residues in human FH are conserved in mouse FH. These observations suggest that the folded-back SCR domain structure of FH may be a common property for FH from other mammalian species. The analogue of FH in sand bass has only 17 SCR domains, each of which can be matched against equivalent SCR domains in human FH (GenBank accession number L21703).46 Of the 16 linkers in sand bass, five differ in length from those in human FH by only one residue, while two linkers (between SCR-12 and SCR-13, and between SCR-13 and SCR-14) are shorter by two and four residues, respectively. Again, the least conservation in linker length occurs in the longest ones between SCR-10 and SCR-14, and SCR-10 is deleted in sand bass FH. These observations suggest that there is an inherent degree of conformational flexibility at the centre of FH.

This FH study has shown that multiple SCR domain structures cannot be modelled using presently known SCR linker structures. The question of whether other SCR-containing proteins in the complement system may possess folded-back structures was examined using the linker lengths between SCR domains. Human C4 binding protein (the analogue of FH in the classical pathway of activation, with eight SCR domains) and the human factor H-like protein (which is comprised of the first seven SCR domains of FH) both possess linkers of length three or four residues. Human decay accelerating factor and membrane cofactor protein CD46 both contain four SCR domains, and their linkers also possess three or four residues. These proteins are less likely to show folded-back SCR orientations in solution. This is in accordance with a solution scattering analysis of C4 binding protein,31 and has formed the basis for modelling studies of decay accelerating factor.47 The linkers in rat and mouse complement receptor-related protein y (an analogue of decay accelerating factor and membrane cofactor protein) contain four or five residues between the five SCR domains in these proteins. The two largest proteins are complement receptor types 1 and 2. These possess 30 and 15 or 16 SCR domains, respectively, with many linkers consisting of three, four or five residues, and include one and four eight-residue linkers, respectively. It is possible that the latter two proteins may show folded-back structures in solution, but this requires experimental investigation. Interestingly, even for SCR domains that are joined by short four-residue linkers, there is evidence of structural cooperativity between the SCR-15, SCR-16 and SCR-17 domains in complement receptor type 1, while this is not observed between SCR-2 and SCR-3 in VCP.48,49

In conclusion, the modelling shows that these FH structures are folded back, and are not static but flexible, since the average of the four curves in Figure 11 is in better agreement with the experimental curve than any of the four curves on their own. It is likely that FH in solution will exhibit a mixture of different conformations, rather than a single well-defined one, and four possibilities are depicted in the summary of Figure 12. Equivalent curve fits were obtained when only the five longest linkers between SCR-10 and SCR-14, and between SCR-18 and SCR-19 were modelled to be conformationally free. It is intriguing that the C3b and heparin-binding sites (Figure 1) are located mostly outside these longer linker regions. This may imply that the shorter linkers contribute to the definition of the binding sites, while the longer linkers provide for conformational flexibility outside these regions in order to facilitate multivalent ligand binding. In fact, the immunological significance of flexible SCR structures in FH is that these correlate with the occurrence of multiple sites in FH for interactions with its ligands. C3b has three distinct binding sites on FH. The folded-back structure for FH implies that the third C3b site at the C terminus of FH is able to move into proximity with the first two sites for C3b (Figure 1), and that it is possible that the three C3b sites may not be independent of each other. This would facilitate the cooperative binding of the SCR domains of FH to C3b, and the biological control of this interaction before and after cleavage by factor I. FH also has multiple binding sites for heparin. It is possible that the interaction of heparin with FH may modify the interdomain arrangement in FH and alter the orientation of the three C3b binding sites. If this is the case, this would provide a mechanism for biological control of FH. The present structural analysis of FH has indicated new directions for future studies of the functional properties of FH.
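The argument that an averaged curve can fit better than any single model curve can be illustrated with a toy calculation. The sketch below assumes an R-factor of the crystallographic form R = 100 × Σ|Iexp − Icalc| / Σ|Iexp|; this form, the function names and all intensity values are assumptions made for illustration, and the definition actually used for R1.4 and R1.0 (including any scaling and Q range) may differ. The two synthetic model curves were deliberately chosen so that their errors cancel on averaging.

```python
def r_factor(i_exp, i_calc):
    """Percentage R-factor, assumed here as 100 * sum|Iexp - Icalc| / sum|Iexp|."""
    numerator = sum(abs(e - c) for e, c in zip(i_exp, i_calc))
    denominator = sum(abs(e) for e in i_exp)
    return 100.0 * numerator / denominator

def average_curves(curves):
    """Point-by-point mean of several curves sampled at the same Q values."""
    return [sum(values) / len(values) for values in zip(*curves)]

# Synthetic intensities at four Q values (not experimental data).
i_exp = [100.0, 50.0, 20.0, 5.0]
model_a = [110.0, 45.0, 22.0, 4.5]
model_b = [90.0, 55.0, 18.0, 5.5]

r_a = r_factor(i_exp, model_a)
r_b = r_factor(i_exp, model_b)
r_mean = r_factor(i_exp, average_curves([model_a, model_b]))
print(r_a, r_b, r_mean)
```

Each synthetic model gives an R-factor of 10 %, whereas their average reproduces the reference curve exactly. Real conformers would not cancel so neatly, but the same comparison could be made for the four curves of Figure 11.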

Materials and Methods

Purification of FH for scattering and ultracentrifugation experiments

FH was purified as described.33 Outdated human plasma (400 ml) was dialysed against 25 mM Tris-HCl (pH 7.4), 140 mM NaCl, 0.5 mM EDTA then applied to a lysine-Sepharose column (Amersham Pharmacia Biotech) equilibrated with 100 mM sodium phosphate (pH 7.4), 150 mM NaCl, 15 mM EDTA to remove plasmin and plasminogen, then to a guard column of non-immune IgG immobilised on Sepharose to remove proteins that bind to immunoglobulins or modified agarose. The eluate was applied to an anti-FH monoclonal antibody MRC-OX24 immobilised on a Sepharose column (kindly provided by Dr R. B. Sim) and eluted using 3 M MgCl2. Previously, elution was achieved using alkaline pH conditions (such as the use of 50 mM diethylamine or diethanolamine, 150 mM NaCl, pH 11.0-12.0), and these conditions are considered to be the likely cause of FH dimer formation in our previous work.17 Recycling the plasma through the MRC-OX24 Sepharose column maximised the yield. The eluate was dialysed into Tris buffer and concentrated under N2 (g) pressure using a YM10 membrane in a pressure cell (Amicon). Non-specific aggregates of FH were removed by gel-filtration on a Superdex 200 HiLoad 16/60 column (Amersham Pharmacia Biotech). The concentration of FH was determined from an absorption coefficient of 16.7 (1 %, 280 nm, 1 cm path-length) calculated from its composition as derived from its sequence (SWISSPROT accession number P08603) and assuming six glycosylation sites.15,17,50 FH was assayed for cofactor activity to confirm that it was functionally active.51 For X-ray scattering and ultracentrifugation, FH was dialysed at 4 °C into Tris buffer (25 mM Tris-HCl (pH 7.4), 140 mM NaCl, 0.5 mM EDTA, to which 0.1 mM Pefabloc SC was added for one X-ray session). For neutron scattering, FH was dialysed into PBS in 99.9 % (v/v) 2H2O at 6 °C for 48 hours with three changes of dialysate (PBS is 137 mM NaCl, 2.7 mM KCl, 8.1 mM Na2HPO4, 1.5 mM KH2PO4, 0.5 mM EDTA, 0.02 % (w/v) NaN3, pH 7.5; to which 1 mM Pefabloc SC was added for the session on Instrument D11). FH samples were routinely analysed by SDS-PAGE before and after scattering and ultracentrifugation experiments to confirm their integrity and homogeneity.
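The concentration determination from the 280 nm absorption coefficient can be illustrated with a minimal sketch. It assumes the usual convention that a 1 % solution corresponds to 10 mg ml⁻¹; the function name and the example absorbance are illustrative and do not come from the original work.

```python
def fh_concentration_mg_per_ml(a280, path_cm=1.0, e1pct=16.7):
    """Concentration (mg/ml) from A280 using a 1 % absorption coefficient.

    e1pct is the absorbance of a 1 % (10 mg/ml) solution in a 1 cm path;
    16.7 is the value quoted for FH in the text.
    """
    # Beer-Lambert: A = e1pct * (c / 10 mg/ml) * path
    return 10.0 * a280 / (e1pct * path_cm)


# Hypothetical reading: an absorbance of 1.67 in a 1 cm cell
print(fh_concentration_mg_per_ml(1.67))
```

An absorbance of 1.67 in a 1 cm cell therefore corresponds to about 1 mg ml⁻¹ of FH under these assumptions.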

X-ray data from Station 2.1 at the Synchrotron Radiation Source

X-ray scattering data acquisition was performed in five independent sessions at the Synchrotron Radiation Source at Daresbury, Warrington, UK at the solution scattering camera at Station 2.1 equipped with a 500-channel quadrant detector.52,53 Experiments were performed with beam currents in a range of 111-230 mA and a ring energy of 2.0 GeV. Sample-to-detector distances of 3.33-5.64 m were used, which yielded a maximum Q range of 0.05-2.2 nm⁻¹. FH concentrations between 0.1 and 16.0 mg ml⁻¹ were measured at 15 °C in Perspex cells of sample volume 20 μl, contained within mica windows of thickness between 10 and 15 μm. Other details are reported elsewhere.28

Neutron data from Instrument LOQ at ISIS and Instruments D22 and D11 at the ILL

Neutron scattering data were obtained in two different beam sessions on the LOQ instrument at the pulsed neutron source ISIS at the Rutherford Appleton Laboratory, Didcot, UK.54 The pulsed neutron beam was derived from proton beam currents of 135-180 μA. Data acquisitions were for 80 × 10⁶-320 × 10⁶ monitor counts in runs lasting two to eight hours for FH concentrations between 2.1 and 6.1 mg ml⁻¹. Other details are reported elsewhere.28

Neutron scattering experiments were also performed in three independent sessions on Instruments D22 and D11 using the high-flux reactor at the Institut Laue-Langevin (ILL) in Grenoble, France.55 FH in 2H2O buffers was measured on Instrument D22 between 0.3 and 9.6 mg ml⁻¹ and on Instrument D11 between 0.4 and 3.2 mg ml⁻¹ in 2 mm thick rectangular Hellma cells positioned in a thermostatted rack at 15 °C, typically with acquisition times of five to 20 minutes. On Instrument D22, sample-detector/collimation distances of 5.6 m/5.6 m and 1.4 m/8.0 m, a wavelength λ of 1.00 nm (Δλ/λ of 10 %) and a rectangular beam aperture of 7 mm × 10 mm were used, as these configurations avoided any need for detector deadtime corrections. The combined Q range was 0.07-2.5 nm⁻¹. On Instrument D11, sample-detector distances of 2.0 and 10.0 m, a wavelength λ of 1.00 nm (Δλ/λ of 10 %) and a rectangular beam aperture of 7 mm × 10 mm were used. The combined Q range was 0.05-1.1 nm⁻¹. Data reduction was based on standard ILL software using the DETEC, RNILS, SPOLLY, RGUIM and RPLOT routines.56

Analysis of reduced X-ray and neutron data

In a given solute-solvent contrast, the radius of gyration RG is a measure of structural elongation if the internal inhomogeneity of scattering densities has no effect. Guinier analyses at low Q give the RG and the forward scattering at zero angle I(0):30

$$\ln I(Q) = \ln I(0) - R_G^2 Q^2/3$$

This expression is valid in a QRG range up to 0.7 for extended rod-like particles, and is approximate in a QRG range up to 1.5 in which it slightly underestimates the true RG value. The relative I(0)/c values (c = sample concentration) for samples measured in the same buffer during a data session give the relative molecular masses Mr of the proteins when referenced against a suitable standard.57,58 If the structure is elongated, the mean radius of gyration of the cross-sectional structure RXS and the mean cross-sectional intensity at zero angle, $[I(Q)Q]_{Q \to 0}$,59 are obtained from:

$$\ln[I(Q)Q] = \ln[I(Q)Q]_{Q \to 0} - R_{XS}^2 Q^2/2$$
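As an illustration of the two Guinier expressions, the sketch below fits straight lines to ln I(Q) and ln[I(Q)Q] against Q². It is not the SCTPL5 program used in this work; the function names and the synthetic, noise-free data are assumptions made only for the example.

```python
import numpy as np


def guinier_rg(q, intensity):
    """Fit ln I(Q) against Q^2 and return (R_G, I(0))."""
    slope, intercept = np.polyfit(q**2, np.log(intensity), 1)
    return np.sqrt(-3.0 * slope), np.exp(intercept)


def guinier_rxs(q, intensity):
    """Fit ln[I(Q)Q] against Q^2 and return (R_XS, [I(Q)Q] at Q -> 0)."""
    slope, intercept = np.polyfit(q**2, np.log(intensity * q), 1)
    return np.sqrt(-2.0 * slope), np.exp(intercept)


# Synthetic data that obey the Guinier law exactly (R_G = 11.1 nm)
q = np.linspace(0.05, 0.12, 20)                      # nm^-1
i_q = 2.5 * np.exp(-(11.1**2) * q**2 / 3.0)
print(guinier_rg(q, i_q))
```

With experimental data, the Q range passed to each fit would need to be restricted to the region in which the corresponding approximation holds.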

The RG and RXS analyses lead to the triaxial dimensions of the macromolecule. If the structure can be represented by an elongated elliptical cylinder, $L = [12(R_G^2 - R_{XS}^2)]^{1/2}$, where L is its length.30 Alternatively, L is given by $\pi I(0)/[I(Q)Q]_{Q \to 0}$.31 The two semi-axes, A and B, of the elliptical cylinder are calculated by combining the dry or hydrated volume V ($V = \pi ABL$) with the RXS value ($R_{XS}^2 = (A^2 + B^2)/4$). The hydrated volume is obtained on the basis of a hydration of 0.3 g of water/g of glycoprotein and 0.0245 nm³ per water molecule.50 Data analyses employed an interactive graphics program SCTPL5 (A.S. Nealis & S.J.P., unpublished software) on a Silicon Graphics 4D35S Workstation.
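The elliptical cylinder relationships can be worked through numerically. The sketch below is only an illustration of the algebra: the function names are hypothetical, and the example uses the RG of 11.1 nm and the larger RXS of 4.4 nm quoted in the abstract as inputs.

```python
import math


def cylinder_length(rg, rxs):
    """L = [12 (R_G^2 - R_XS^2)]^(1/2)."""
    return math.sqrt(12.0 * (rg**2 - rxs**2))


def semi_axes(volume, length, rxs):
    """Solve V = pi*A*B*L and R_XS^2 = (A^2 + B^2)/4 for A >= B."""
    product = volume / (math.pi * length)        # A * B
    sum_sq = 4.0 * rxs**2                        # A^2 + B^2
    diff_sq = sum_sq - 2.0 * product             # (A - B)^2
    if diff_sq < 0:
        raise ValueError("no real semi-axes for these inputs")
    plus = math.sqrt(sum_sq + 2.0 * product)     # A + B
    minus = math.sqrt(diff_sq)                   # A - B
    return (plus + minus) / 2.0, (plus - minus) / 2.0


# R_G = 11.1 nm and R_XS = 4.4 nm give a length of about 35.3 nm
print(round(cylinder_length(11.1, 4.4), 1))
```

The length obtained in this way can be compared with the value from $\pi I(0)/[I(Q)Q]_{Q \to 0}$ and with the maximum dimension from P(r).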

Indirect transformation of the scattering data I(Q) in reciprocal space into real space to give the distance distribution function P(r) was performed using GNOM:60,61

$$P(r) = \frac{1}{2\pi^2} \int_0^\infty I(Q)\, Qr \sin(Qr)\, dQ$$

P(r) corresponds to the distribution of distances r between any two volume elements within one particle weighted by the product of their respective electron or nuclear densities relative to the solvent density. This offers an alternative calculation of the RG and I(0) that is based on the full scattering curve, and gives the maximum dimension of the macromolecule L. The X-ray I(Q) curve contained 462 data points in the Q range of 0.06-1.4 nm⁻¹ and was fitted with Dmax set as 40 nm. The neutron I(Q) curve contained 105 data points in the Q range of 0.1-2.5 nm⁻¹ and was fitted with Dmax again set as 40 nm. Other details of the P(r) analyses are reported elsewhere.28
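The way in which RG and the maximum dimension follow from P(r) can be sketched as below. This does not reproduce the regularised indirect transform performed by GNOM; it only applies the standard relation RG² = ∫r²P(r)dr / (2∫P(r)dr) to a P(r) curve that is already available. The function names are hypothetical, and the test curve is the analytical P(r) of a uniform sphere, chosen because its RG is known exactly.

```python
import numpy as np


def _integrate(y, x):
    """Trapezoidal rule written out explicitly."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def rg_from_pr(r, p):
    """R_G from P(r): R_G^2 = int r^2 P dr / (2 int P dr)."""
    return float(np.sqrt(_integrate(r**2 * p, r) / (2.0 * _integrate(p, r))))


def max_dimension(r, p, tol=1e-9):
    """Largest r at which P(r) still exceeds tol * max(P)."""
    return float(r[p > tol * p.max()].max())


# Uniform sphere of radius 5 nm: P(r) is known in closed form
radius = 5.0
r = np.linspace(0.0, 2.0 * radius, 2001)
p = r**2 * (1.0 - 3.0 * r / (4.0 * radius) + r**3 / (16.0 * radius**3))
print(rg_from_pr(r, p), np.sqrt(0.6) * radius)
```

For the sphere the two printed values agree closely, which is a convenient check before the same functions are applied to an experimental P(r).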

Sedimentation equilibrium and sedimentation velocity data for FH

Analytical ultracentrifugation was performed on a Beckman XLi instrument in which the FH concentration was monitored using its absorbance at 280 nm and its refractive index measured by interferometry. Sedimentation equilibrium was measured from samples of FH between 0.17 and 6.22 mg ml⁻¹. Sedimentation equilibrium data were acquired over 45 hours using six-sector cells with solution column heights of 2 mm at rotor speeds of 11,000 r.p.m., 14,000 r.p.m. and 17,000 r.p.m. in an AnTi50 rotor until equilibrium had been reached at each speed as shown by the perfect overlay of runs measured at five hour intervals. The molecular mass, M, was analysed on the basis of a single species using Beckman software provided as an add-on to Origin Version 4.1 (Microcal Inc.), where the partial specific volume $\bar{v}$ for FH was calculated to be 0.717 ml g⁻¹ from its sequence:50

$$c_r = c_{r_o} \exp\left[\frac{\omega^2}{2RT}\, M (1 - \bar{v}\rho)(r^2 - r_o^2)\right]$$

where $c_r$ is the concentration at radius $r$, $c_{r_o}$ is the concentration of the monomer at the reference radius $r_o$, $\omega$ is the angular velocity, $R$ is the gas constant, $T$ is the absolute temperature, and $\rho$ is the solvent density. Sedimentation velocity data were acquired over 16 hours at rotor speeds of 20,000 r.p.m., 25,000 r.p.m., 30,000 r.p.m. and 40,000 r.p.m. in two-sector cells with column heights of 12 mm. For the g(s*) analyses, successive absorbance and interference scans were recorded at ten minute intervals, the shortest interval possible under standard measurement conditions, for which the rotor speeds of 20,000 r.p.m. and 25,000 r.p.m. were optimal for FH. In time-derivative analyses, the subtraction of pairs of concentration scans versus radius in the cell eliminates systematic errors from baseline distortions in the cell windows and permits the averaging of many pairs of subtractions. The extrapolation of individual subtractions to the start time gives the g(s*) function, which was computed using the DCDT+ program,62 from which the sedimentation and diffusion coefficients were determined from the peak position and width, respectively (Figure 6).
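The single-species equilibrium expression can be evaluated directly, as in the sketch below. This is not the Beckman/Origin fitting software; the function names are hypothetical, and the solvent density of 1.0 g ml⁻¹ and temperature of 293.15 K are placeholder assumptions, since neither is stated in this section. CGS units are used so that radii are in cm and M is in g mol⁻¹.

```python
import math

R_GAS = 8.314e7  # gas constant in erg K^-1 mol^-1 (CGS units)


def equilibrium_concentration(r, c_ref, r_ref, mass, rpm,
                              vbar=0.717, rho=1.0, temp=293.15):
    """c(r) for a single ideal species at sedimentation equilibrium."""
    omega = rpm * 2.0 * math.pi / 60.0                  # rad s^-1
    exponent = (omega**2 / (2.0 * R_GAS * temp)) * mass \
        * (1.0 - vbar * rho) * (r**2 - r_ref**2)
    return c_ref * math.exp(exponent)


def mass_from_slope(slope, rpm, vbar=0.717, rho=1.0, temp=293.15):
    """Molecular mass from the slope of ln c against r^2 (cm^-2)."""
    omega = rpm * 2.0 * math.pi / 60.0
    return slope * 2.0 * R_GAS * temp / (omega**2 * (1.0 - vbar * rho))


# Concentration gradient across a 2 mm column for M = 155,000 at 11,000 r.p.m.
ratio = equilibrium_concentration(7.1, 1.0, 6.9, 155000.0, 11000)
print(ratio)
```

A plot of ln c against r² is linear for a single species, so the slope from such a plot returns M through `mass_from_slope` under the same assumptions.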

Homology modelling of 17 SCR domains in FH

Five NMR structures with seven SCR domains were used for the homology modelling of 17 FH SCR domains, namely the SCR-5 domain of FH (unpublished coordinates21), the SCR-15 domain of FH (PDB code 1hfi),22 the SCR-16 domain of FH (PDB code 1hcc),19,20 the SCR-15/SCR-16 domain pair of FH (PDB code 1hfh),22 and the SCR-3/SCR-4 domain pair of VCP (PDB code 1vcc).23 In order to align their sequences, secondary structures were identified using DSSP35 and the side-chain solvent-accessibilities were calculated using a probe of radius 0.14 nm to represent a water molecule using COMPARER.36,37 To verify the alignment, the five NMR structures were superimposed on the basis of the conserved β-strand residues by the use of the molecular modelling software INSIGHT II (MSI, San Diego) in conjunction with the use of Crystal Eyes stereo glasses. This facilitated the assignment of structurally conserved and loop residues.

Even though FH SCR-5 possessed two cis-peptides at Ser15 and Ile52, this structure was used to model three other SCR domains (Table 1), where the cis-peptides were removed by searching for new loops and the distorted geometries were corrected using DISCOVER refinement. The SCR-15 and SCR-16 structures of FH and the SCR-4 structure of VCP were used to model four, seven and three SCR domains, respectively. The 17 homology models were constructed using INSIGHT II, HOMOLOGY, DISCOVERY and BIOPOLYMER software (MSI, San Diego). For each SCR to be modelled, the reference SCR structure was selected on the criteria that it possessed all six β-strands, had the least residues in disallowed regions in the Ramachandran plot calculated using PROCHECK,39 and required the fewest insertions or deletions in the sequence alignment. Using HOMOLOGY, the rigid body fragment assembly method was used, in which the conserved Cys and β-strand residues were assigned as structurally conserved regions, and the loop regions of length identical with those in the reference structure were assigned as designated loops. Where insertions and deletions occurred, searched loops (defined in Figure 7 and Table 1) were modelled using a precalculated Cα distance matrix to identify the best-matching loop in a database derived from 349 crystal structures.63,64 After optimisation of the side-chain rotamer positions, and verification of the absence of steric overlaps through performing bump checks, the two disulphide bridges were created using BIOPOLYMER. Energy minimisations were performed using DISCOVER using 500 steps of steepest descent minimisation to improve the connectivity of the model and to minimise bad contacts or poor stereochemistry. During minimisation, the secondary structure was retained by immobilising the main-chain atoms in the conserved regions, and tethering these in the loop regions.

Triantennary complex-type carbohydrate structures (Man3GlcNAc6Gal3Fuc3NeuNAc1; Mr 2918) were added to each of the N-linked glycosylation sites in an extended conformation from the protein surface (Figure 1). Peptide sequencing showed that the single sites in SCR-9, SCR-13, SCR-14 and the two in SCR-15 are occupied, while the carbohydrate staining of proteolysed fragments of FH showed that the site in SCR-4 is not glycosylated and that only one site in SCR-12, SCR-17 and SCR-18 is glycosylated.15,16,33 Accordingly, for modelling purposes, it was arbitrarily assumed that SCR-4, SCR-12 and SCR-18 were not glycosylated.

The electrostatic surface of each SCR or FH model was calculated using a full Coulombic boundary condition in DELPHI (MSI, San Diego). The internal and external dielectric constants were set as 2 and 80, respectively. The solvent-accessible surface of each model was displayed using the Connolly algorithm with a solvent probe of diameter 0.14 nm. The surface was coloured red for potentials less than −5 kT (acidic), blue for potentials greater than +5 kT (basic) and white for neutral potentials of 0 kT. Colours between these values were produced using a linear interpolation.
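The colour scheme described for the electrostatic surface can be written as a short function. This is a sketch of one plausible linear interpolation in RGB space, not the scheme implemented in DELPHI or INSIGHT II; the function name and the use of RGB components between 0 and 1 are assumptions.

```python
def potential_to_rgb(potential_kt, limit=5.0):
    """Map a potential in kT to an (r, g, b) tuple with components in 0..1."""
    # Clamp so that values beyond +/- limit are fully saturated
    x = max(-limit, min(limit, potential_kt)) / limit
    if x < 0:
        # Red (1, 0, 0) at -limit, blending linearly to white at 0
        return (1.0, 1.0 + x, 1.0 + x)
    # White (1, 1, 1) at 0, blending linearly to blue (0, 0, 1) at +limit
    return (1.0 - x, 1.0 - x, 1.0)


print(potential_to_rgb(2.5))
```

A potential of +2.5 kT is therefore drawn as a pale blue that lies halfway between white and fully saturated blue.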

Randomised domain modelling of FH by constrained scattering fits (method 1)

Method 1 was based on the generation of 20-domain FH models in which the domains were connected by conformationally randomised linker peptides.28 Each linker peptide contained the Cys58 and Cys4 residues of the preceding and following SCR domains respectively (numbering as in Figure 7), and was generated initially on the basis of an extended β-strand structure using BIOPOLYMER. A molecular dynamics simulation using DISCOVER3 (MSI, San Diego) was used to generate a library of 10,000 random structures for each of the 19 linkers at a temperature of 773 K by energy minimisation for 300 iterations, followed by a temperature equilibration step of 5000 fs, then running the simulation for 1 × 10⁶ fs during which the linker structure was saved every 100 fs. To assemble a full model of FH, a linker structure was selected randomly from each of the 19 libraries in turn, each of which was superimposed onto the preceding and following SCR domains by use of the Cys58 and Cys4 coordinates. This procedure varied both the length and the orientation of the linkers. In a total of six searches, further FH models were generated by adjustments of distance restraints when constructing the 19 linker peptide conformational libraries. In each one, random conformations were generated on the condition that the Cys58 and Cys4 Cα atoms were separated by a minimum distance that was determined by the number of peptide bonds in the linker and a distance of 0.35 nm per peptide. The evaluation of 2000 full FH models in each search required six days of central processor unit time on a Silicon Graphics INDY Workstation running on IRIX 6.5 with an R5000 processor and 160-192 megabytes of memory.
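The random selection of linkers and the distance restraint can be sketched as follows. The text does not state exactly how the minimum Cα-Cα separation was derived from the number of peptide bonds, so the `fraction` parameter below is purely a hypothetical stand-in for the adjustment made between the six searches; the function names are likewise illustrative.

```python
import random


def passes_restraint(ca_distance_nm, n_peptide_bonds, fraction,
                     per_peptide_nm=0.35):
    """True if the Cys58-Cys4 C-alpha separation meets the minimum distance.

    `fraction` (hypothetical) scales the fully extended length of
    n_peptide_bonds * 0.35 nm to give the minimum allowed separation.
    """
    min_distance = fraction * n_peptide_bonds * per_peptide_nm
    return ca_distance_nm >= min_distance


def pick_linkers(libraries, rng):
    """Choose one conformation index at random from each linker library."""
    return [rng.randrange(len(library)) for library in libraries]


# 19 libraries of 10,000 conformations each, represented here by index ranges
libraries = [range(10000) for _ in range(19)]
choice = pick_linkers(libraries, random.Random(0))
print(len(choice))
```

Each call to `pick_linkers` corresponds to one trial FH model, whose 19 chosen linkers would then be superimposed onto the flanking SCR domains.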

Rotational search modelling of FH by constrained scattering fits (method 2)

Method 2 was based on FH models arranged in five segments of four SCR domains. The interdomain orientation of the three linkers in each segment was set to be the same as that in the two-domain fragment containing SCR-15 and SCR-16. The effect of systematic rotations between these five segments of the SCR domains was explored, in which the X-axis corresponded to the longest axis of the FH model. If these were set as 30° steps from 30° to 360° about the X, Y and Z-axes for the four inter-segment links in the FH assembly, a prohibitive total of 8.9 × 10¹² models would require evaluation.65

Accordingly, the search was simplified by applying the same X, Y and Z-axes rotations to all four inter-segment interfaces, and this proved to be adequate in a systematic survey of FH models. While the sequences of the linker regions were not included in these models, and this omission resulted in 88 fewer amino acid residues in factor H than would otherwise be the case, the coordinate conversion to small spheres was set to the volume of the total of 1213 amino acid residues and 96 carbohydrate residues in order to correct for the fewer residues present. A preliminary search utilised Y-axis and Z-axis rotations between −90° and 90° in 15° increments when the X-axis rotation was fixed at 0°. The final search entailed rotations of the segments between 0° and 330° in 30° steps about the X-axis, and between −105° and 105° in 15° steps about the Y and Z-axes to generate 12 × 15 × 15 = 2700 models.
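The model counts quoted for the rotational searches can be checked by enumeration. The sketch below assumes that the exhaustive figure arises from 12 angles about each of three axes at each of four independent links, a reading that reproduces the quoted 8.9 × 10¹²; the variable names are illustrative.

```python
from itertools import product

# Exhaustive scheme: 30-degree steps from 30 to 360 about X, Y and Z
steps = range(30, 361, 30)                 # 12 values per axis
per_link = len(steps) ** 3                 # orientations of one link
exhaustive = per_link ** 4                 # four independent links

# Simplified final search: one (X, Y, Z) triple shared by all four links
x_angles = range(0, 331, 30)               # 12 values
yz_angles = range(-105, 106, 15)           # 15 values
final_search = list(product(x_angles, yz_angles, yz_angles))

print(exhaustive, len(final_search))
```

Sharing one set of rotations between all four interfaces is what reduces the search from about 8.9 × 10¹² combinations to 2700 models.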

Automated Debye scattering curve modelling of FH

INSIGHT II molecular graphics software was used to generate models for the modelling searches. In the modelling method 1, a Biosym Command Language macro was written to call up sequentially the 20 individual SCR homology models and the 19 linkers selected randomly from the libraries of linkers. The fragments were joined by superimposition of the terminal Cys58 and Cys4 coordinates in each linker, after which duplicate atoms were removed. A FH sphere model was created by placing the full FH model coordinate set within a three-dimensional array of cubes, each of side length 0.820 nm. Provided that the number of atoms within a given cube exceeded a user-defined cutoff, a sphere of the same volume as the cube (sphere diameter 1.017 nm) was placed at the centre of the cube. This cutoff was determined by the requirement that the total volume of spheres was within 1 % of the dry volume of 195.0 nm³ for FH calculated from its amino acid and carbohydrate composition.50

This resulted in a total of 350 spheres for the dry volume of FH that was used for the modelling of neutron scattering curves. X-ray scattering modelling requires account of the hydration shell surrounding FH. A hydration of 0.3 g of water/g of glycoprotein and an electrostricted volume of 0.0245 nm³ per bound water molecule50 gave a hydrated FH volume of 259.2 nm³. This shell was modelled using the HYPRO procedure,63 in which an excess of extra spheres was added to the surface of the dry sphere model, and these were stripped off until the desired hydration volume of 458 spheres in total had been reached.
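The conversion of atomic coordinates to a sphere model can be sketched with NumPy. This is an illustration of the grid procedure as described, not the original Biosym macro or the HYPRO procedure; the function names, the "at least `cutoff` atoms" convention and the toy coordinates are assumptions.

```python
import math
import numpy as np


def sphere_diameter(cube_side):
    """Diameter of the sphere whose volume equals that of the cube."""
    return cube_side * (6.0 / math.pi) ** (1.0 / 3.0)


def coarse_grain(coords, cube_side, cutoff):
    """Centres of the grid cubes that contain at least `cutoff` atoms."""
    idx = np.floor(coords / cube_side).astype(int)
    cells, counts = np.unique(idx, axis=0, return_counts=True)
    return (cells[counts >= cutoff] + 0.5) * cube_side


def choose_cutoff(coords, cube_side, target_volume, max_cutoff=50):
    """Cutoff whose total sphere volume lies closest to target_volume."""
    def error(cutoff):
        n = len(coarse_grain(coords, cube_side, cutoff))
        return abs(n * cube_side**3 - target_volume)
    return min(range(1, max_cutoff + 1), key=error)


print(round(sphere_diameter(0.820), 3), round(sphere_diameter(0.880), 3))
```

The two printed diameters match the 1.017 nm and 1.092 nm quoted for cube sides of 0.820 nm and 0.880 nm.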

In modelling method 2, a Biosym Command Language macro was used to perform the rotations of each segment with four SCR domains relative to one another to generate full FH models. For each rotation of the four inter-segmental linkers, the rotational centre and origin was defined as the Cα atom of the C-terminal residue, the X-axis was defined by the line joining the N-terminal and C-terminal Cα atoms of the segment, and the plane of its Y-axis was defined by an arbitrarily-selected Cα atom. Starting from a cube side length of 0.880 nm, a total of 284 protein spheres of diameter 1.092 nm were used to represent the dry volume of FH, and a total of 375 protein spheres represented the hydrated volume of FH.

The X-ray and neutron scattering curve I(Q) was calculated assuming a uniform scattering density for the spheres using the Debye equation as adapted to spheres.28,67 X-ray curves were calculated from the hydrated sphere models without corrections for wavelength spread or beam divergence, while these corrections were applied for the neutron curves.66 The number of spheres, N, in the dry and hydrated models after grid transformation was used to assess steric overlap between the SCRs, where models showing less than 95 % of the optimal total were discarded. The modelled scattering curves were assessed by calculation of the RG, RXS-1 and RXS-2 values in the same Q ranges used in the experimental Guinier fits, where models giving values that exceeded specified ranges were discarded. The cut-off values were generously set to be ±15 % of these experimental parameters in order to identify all models within range. Models that passed these filters were then ranked using a goodness-of-fit R-factor defined by analogy with protein crystallography and based on the experimental curves in the Q range extending to 1.4 nm⁻¹ (denoted as R1.4).40,41
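The curve calculation and ranking steps can be illustrated with a compact sketch. It uses the general Debye double sum for identical spheres and assumes that the R-factor takes the form 100 × Σ|I_exp − I_calc| / Σ|I_exp|; neither is claimed to be the exact implementation used here, no smearing corrections are applied, and the function names are hypothetical.

```python
import numpy as np


def sphere_form_factor(q, radius):
    """Squared form factor of a uniform sphere, equal to 1 as Q -> 0."""
    x = q * radius
    return (3.0 * (np.sin(x) - x * np.cos(x)) / x**3) ** 2


def debye_curve(q, centres, radius):
    """I(Q)/I(0) for n identical spheres at the given centres (Q > 0)."""
    n = len(centres)
    diff = centres[:, None, :] - centres[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))               # n x n distances
    # np.sinc(t) = sin(pi t)/(pi t), giving sin(Qr)/(Qr) with value 1 at r = 0
    kernel = np.sinc(q[:, None, None] * dist[None, :, :] / np.pi)
    return sphere_form_factor(q, radius) * kernel.sum(axis=(1, 2)) / n**2


def r_factor(i_exp, i_calc):
    """Assumed form: 100 * sum|I_exp - I_calc| / sum|I_exp|."""
    return 100.0 * np.abs(i_exp - i_calc).sum() / np.abs(i_exp).sum()


q = np.linspace(0.05, 1.4, 30)                            # nm^-1
model = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
print(debye_curve(q, model, 0.5)[:3])
```

In practice the calculated curve would be scaled to the experimental one before the R-factor is evaluated, and the distance matrix would be binned for models with several hundred spheres.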

Sedimentation coefficient modelling of FH

Procedures for hydrodynamic simulations based on sphere models have been tested.40,68 The GENDIA program was used to calculate the s°20,w values starting from the hydrated FH models created by methods 1 and 2.

Protein Data Bank accession number

The four best-fit Cα coordinate models for FH have been deposited in the Protein Data Bank with the accession code 1haq.

Acknowledgements

We thank the Biotechnology and Biological Sciences Research Council for a Special Studentship Award, and Dr R. B. Sim (MRC Immunochemistry Unit, Oxford) for the MRC-OX24 Sepharose matrix and useful discussions. We thank Mrs S. Slawson, Mr A. Gleeson, and Dr J. G. Grossmann (SRS, Daresbury), Dr P. A. Timmins (ILL, Grenoble) and Dr R. K. Heenan and Dr S. M. King (ISIS, Rutherford-Appleton Laboratory) for excellent instrumental support, and Dr J. T. Eaton for generous assistance with the PDB file submission.

References

1. Law, S. K. A. & Reid, K. B. M. (1995). Complement, 2nd edit., IRL Press, Oxford, UK.

2. Zipfel, P. F., Jokiranta, T. S., Hellwage, J., Koistinen, V. & Meri, S. (1999). The factor H protein family. Immunopharmacology, 42, 53-60.

3. Pangburn, M. K. (2000). Host recognition and target differentiation by factor H, a regulator of the alternative pathway of complement. Immunopharmacology, 49, 149-157.

4. Sharma, A. K. & Pangburn, M. K. (1996). Identification of three physically and functionally distinct binding sites for C3b in human complement factor H by deletion mutagenesis. Proc. Natl Acad. Sci. USA, 93, 10996-11001.

5. Jokiranta, T. S., Hellwage, J., Koistinen, V., Zipfel, P. F. & Meri, S. (1998). Each of the three binding sites of factor H interacts with a distinct site on C3b. Mol. Immunol. 35, 360.

6. Blackmore, T. K., Sadlon, T. A., Ward, H. M., Lublin, D. M. & Gordon, D. L. (1996). Identification of a heparin binding domain in the seventh short consensus repeat of complement factor H. J. Immunol. 157, 5422-5427.

7. Blackmore, T. K., Hellwage, J., Sadlon, T. A., Higgs, N., Zipfel, P. F., Ward, H. M. & Gordon, D. L. (1998). Identification of the second heparin-binding domain in human complement factor H. J. Immunol. 160, 3342-3348.

8. Pangburn, M. K., Atkinson, M. A. L. & Meri, S. (1991). Localization of the heparin-binding site on complement factor H. J. Biol. Chem. 266, 16847-16853.

9. DiScipio, R., Daffern, P. J., Schraufstätter, I. U. & Sriramarao, P. (1998). Human polymorphonuclear leukocytes adhere to complement factor H through an interaction that involves αMβ2 (CD11b/CD18). J. Immunol. 160, 4057-4066.

10. Malhotra, R., Ward, M., Sim, R. B. & Bird, M. I. (1999). Identification of factor H as a ligand for L-selectin. Biochem. J. 341, 61-69.

11. Horstmann, R. D., Sievertsen, H. J., Knobloch, J. & Fischetti, V. A. (1988). Antiphagocytic activity of streptococcal M protein: selective binding of complement control protein factor H. Proc. Natl Acad. Sci. USA, 85, 1657-1661.

12. Ram, S., Sharma, A. K., Simpson, S. D., Gulati, S., McQuillen, D. P., Pangburn, M. K. & Rice, P. A. (1998). A novel sialic acid binding site on factor H mediates serum resistance of sialylated Neisseria gonorrhoeae. J. Exp. Med. 187, 743-752.

13. Ram, S., McQuillen, D. P., Gulati, S., Elkins, C., Pangburn, M. K. & Rice, P. A. (1998). Binding of complement factor H to loop 5 of porin protein 1A: a molecular mechanism of serum resistance of nonsialylated Neisseria gonorrhoeae. J. Exp. Med. 188, 671-680.

14. Kristensen, T. & Tack, B. F. (1986). Murine protein H is comprised of 20 repeating units, 61 amino acids in length. Proc. Natl Acad. Sci. USA, 83, 3963-3967.

15. Ripoche, J., Day, A. J., Harris, T. J. R. & Sim, R. B. (1988). The complete amino acid sequence of human complement factor H. Biochem. J. 249, 593-602.

16. Soames, C. J., Day, A. J. & Sim, R. B. (1996). Prediction from sequence comparisons of residues of factor H involved in the interaction with complement component C3b. Biochem. J. 315, 523-531.

17. Perkins, S. J., Nealis, A. S. & Sim, R. B. (1991). Oligomeric domain structure of human complement factor H by X-ray and neutron solution scattering. Biochemistry, 30, 2847-2857.

18. DiScipio, R. G. (1992). Ultrastructures and interactions of complement factors H and I. J. Immunol. 149, 2592-2599.

19. Norman, D. G., Barlow, P. N., Baron, M., Day, A. J., Sim, R. B. & Campbell, I. D. (1991). Three dimensional structure of a complement control protein module in solution. J. Mol. Biol. 219, 717-725.

20. Barlow, P. N., Baron, M., Norman, D. G., Day, A. J., Willis, A. C., Sim, R. B. & Campbell, I. D. (1991). Secondary structure of a complement control protein module by two-dimensional 1H NMR. Biochemistry, 30, 997-1004.

21. Barlow, P. N., Norman, D. G., Steinkasserer, A., Horne, T. J., Pearce, J., Driscoll, P. C., Sim, R. B. & Campbell, I. D. (1992). Solution structure of the fifth repeat of factor H: a second example of the complement control protein module. Biochemistry, 31, 3626-3634.

22. Barlow, P. N., Steinkasserer, A., Norman, D. G., Kieffer, B., Wiles, A. P., Sim, R. B. & Campbell, I. D. (1993). Solution structure of a pair of complement modules by nuclear magnetic resonance. J. Mol. Biol. 232, 268-284.

23. Wiles, A. P., Shaw, G., Bright, J., Perczel, A., Campbell, I. D. & Barlow, P. N. (1997). NMR studies of a viral protein that mimics the regulators of complement activation. J. Mol. Biol. 272, 253-265.

24. Casasnovas, J. M., Larvie, M. & Stehle, T. (1999). Crystal structure of two CD46 domains reveals an extended measles virus-binding surface. EMBO J. 18, 2911-2922.

25. Bouma, B., de Groot, P. G., van den Elsen, J. M. H., Ravelli, R. B. G., Schouten, A. & Simmelink, M. J. A. et al. (1999). Adhesion mechanism of human β2-glycoprotein I to phospholipids based on its crystal structure. EMBO J. 18, 5166-5174.

26. Perkins, S. J. (2000). High-flux X-ray and neutron scattering studies. In Protein-Ligand Interactions: A Practical Approach (Chowdhry, B. & Harding, S. E., eds), vol. 1, pp. 223-262, Oxford University Press, Oxford, UK.

27. Perkins, S. J., Ashton, A. W., Boehm, M. K. & Chamberlain, D. C. (1998). Molecular structures from low angle X-ray and neutron scattering studies. Int. J. Biol. Macromol. 22, 1-16.

28. Boehm, M. K., Woof, J. M., Kerr, M. A. & Perkins, S. J. (1999). The Fab and Fc fragments of IgA1 exhibit a different arrangement from that in IgG: a study by X-ray and neutron solution scattering and homology modelling. J. Mol. Biol. 286, 1421-1447.

29. Sim, R. B., Day, A. J., Moffatt, B. E. & Fontaine, M. (1993). Complement factor I and cofactors in control of complement system convertase enzymes. Methods Enzymol. 223, 13-35.

30. Glatter, O. & Kratky, O. (1982). Editors of Small-angle X-ray Scattering, Academic Press, New York.

31. Perkins, S. J., Chung, L. P. & Reid, K. B. M. (1986). Unusual ultrastructure of complement component C4b-binding protein of human complement by synchrotron X-ray scattering and hydrodynamic analysis. Biochem. J. 223, 779-807.

32. Whaley, K. & Ruddy, S. (1976). Modulation of the alternative complement pathways by β1H globulin. J. Exp. Med. 144, 1147.

33. Sim, R. B. & DiScipio, R. G. (1982). Purification and structural studies on the complement system control protein β1H (factor H). Biochem. J. 205, 285-293.

34. Perkins, S. J., Haris, P. I., Sim, R. B. & Chapman, D. (1988). A study of the structure of human complement factor H by Fourier transformed infrared spectroscopy and secondary structure averaging methods. Biochemistry, 27, 4004-4012.

35. Kabsch, W. & Sander, C. (1983). Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers, 22, 2577-2637.

36. Lee, B. & Richards, F. M. (1971). An interpretation of protein structures: estimation of static accessibility. J. Mol. Biol. 55, 379-400.

37. Šali, A. & Blundell, T. L. (1990). The definition of general topological equivalence in protein structures: a procedure involving comparison of properties and relationships through simulated annealing and dynamic programming. J. Mol. Biol. 212, 403-428.

38. Ramachandran, G. N. & Sasisekharan, V. (1968). Conformation of polypeptides and proteins. Advan. Protein Chem. 23, 283-437.

39. Laskowski, R. A., MacArthur, M. W., Moss, D. S. & Thornton, J. M. (1993). PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Crystallog. 26, 283-291.

40. Smith, K. F., Harrison, R. A. & Perkins, S. J. (1990). Structural comparisons of the native and reaction centre cleaved forms of α1-antitrypsin by neutron and X-ray solution scattering. Biochem. J. 267, 203-212.

41. Beavil, A. J., Young, R. J., Sutton, B. J. & Perkins, S. J. (1995). Bent domain structure of recombinant human IgE-Fc in solution by X-ray and neutron scattering in conjunction with an automated curve fitting procedure. Biochemistry, 34, 14449-14461.

42. Perkins, S. J. (1989). Hydrodynamic modelling ofcomplement. In Dynamic Properties of BiomolecularAssemblies (Harding, S. E. & Rowe, A. J., eds), pp.226-245, Royal Society of Chemistry, London.

43. Jokiranta, T. S., Hellwage, J., Male, D. A., Giannakis,E., Zipfel, P. F., Meri, S. & Gordon, D. L. (2000).Comparison of the factor H heparin binding sites onSCR domains 7 and 20. Immunopharmacology, 49, 54.

44. Giannakis, E., Male, D. A., Ormsby, R. J., Mold, C.,Ranganathan, S. & Gordon, D. L. (2000). A commonsite within factor H SCR 7 responsible for bindingheparin, C-reactive protein and streptococcal Mprotein. Immunopharmacology, 49, 57.

45. Smith, S. A., Mullin, N. P., Parkinson, J.,Shchelkunov, S. N., Totmenin, A. V. & Loparev,V. N. et al. (2000). Conserved surface-exposed K/R-X-K/R motifs and net positive charge on poxviruscomplement control proteins serve as putativeheparin binding sites and contribute to inhibition ofmolecular interactions with human endothelial cells:a novel mechanism for evasion of host defence.J. Virol. 74, 5659-5666.

46. Dahmen, A., Kaidoh, T., Zipfel, P. F. & Gigli, I.(1994). Cloning and characterization of a cDNArepresenting a putative complement-regulatoryplasma protein from barred sand bass (Parablaxneblifer). Biochem. J. 301, 391-397.

47. Kuttner-Kondo, L., Medof, M. E., Brodbeck, W. &Shoham, M. (1996). Molecular modelling and mech-anism of action of human decay-accelerating factor.Protein Eng. 9, 1143-1149.

48. Kirkitadze, M. D., Dryden, D. T. F., Kelly, S. M.,Price, N. C., Wang, X. & Krych, M. et al. (1999).Co-operativity between modules within a C3b-bind-ing site of complement receptor type 1. FEBS Letters,459, 133-138.

49. Kirkitadze, M. D., Henderson, C., Price, N. C., Kelly,S. M., Mullin, N. P. & Parkinson, J. et al. (1999).Central modules of the vaccinia virus complementcontrol protein are not in extensive contact. Biochem.J. 344, 167-175.

50. Perkins, S. J. (1986). Protein volumes and hydrationeffects: the calculation of partial speci®c volumes,neutron scattering matchpoints and 280 nm absorp-tion coef®cients for proteins and glycoproteins fromamino acid sequences. Eur. J. Biochem. 157, 169-180.

51. Sim, E. & Sim, R. B. (1983). Enzymic assay of C3b receptor on intact cells and solubilized cells. Biochem. J. 210, 567-576.

52. Towns-Andrews, E., Berry, A., Bordas, J., Mant, G. R., Murray, P. K. & Roberts, K. et al. (1989). Time-resolved X-ray diffraction station: X-ray optics, detectors and data acquisition. Rev. Sci. Instrum. 60, 2346-2349.

53. Worgan, J. S., Lewis, R., Fore, N. S., Sumner, I. L., Berry, A. & Parker, B. et al. (1990). The application of multiwire X-ray detectors to experiments using synchrotron radiation. Nucl. Instrum. Methods Phys. Res. ser. A, 291, 447-454.

54. Heenan, R. K. & King, S. M. (1993). Development of the small-angle diffractometer LOQ at the ISIS pulsed neutron source. In Proceedings of an International Seminar on Structural Investigations at Pulsed Neutron Sources, Dubna, 1st-4th September 1992. Report E3-93-65, Joint Institute for Nuclear Research-Dubna.

55. Lindner, P., May, R. P. & Timmins, P. A. (1992). Upgrading of the SANS instrument D11 at the ILL. Physica B, 180, 967-972.

56. Ghosh, R. E. (1989). A computing guide for small angle scattering experiments, ILL Internal publication 89GH02T.

57. Kratky, O. (1963). X-ray small angle scattering with substances of biological interest in diluted solutions. Prog. Biophys. Chem. 13, 105-173.

58. Wignall, G. D. & Bates, F. S. (1987). Absolute calibration of small angle neutron scattering data. J. Appl. Crystallog. 20, 28-40.

59. Hjelm, R. P. (1985). The small-angle approximation of X-ray and neutron scatter from rigid rods of non-uniform cross section and finite length. J. Appl. Crystallog. 18, 452-460.

60. Semenyuk, A. V. & Svergun, D. I. (1991). GNOM - a program package for small-angle scattering data-processing. J. Appl. Crystallog. 24, 537-540.

61. Svergun, D. I. (1992). Determination of the regularization parameter in indirect transform methods using perceptual criteria. J. Appl. Crystallog. 25, 495-503.

62. Philo, J. (2000). A method for directly fitting the time derivative of sedimentation velocity data and an alternative algorithm for calculating sedimentation coefficient distribution functions. Anal. Biochem. 279, 151-163.

63. Hobohm, U. & Sander, C. (1994). Enlarged representative set of protein structures. Protein Sci. 3, 522-524.

64. Hobohm, U., Scharf, M., Schneider, R. & Sander, C. (1992). Selection of representative protein data sets. Protein Sci. 1, 409-417.

65. Boehm, M. K., Mayans, M. O., Thornton, J. D., Begent, R. H. J., Keep, P. A. & Perkins, S. J. (1996). Extended glycoprotein structure of the seven domains in human carcinoembryonic antigen by X-ray and neutron solution scattering and an automated curve fitting procedure: implications for cellular adhesion. J. Mol. Biol. 259, 718-736.

66. Ashton, A. W., Boehm, M. K., Gallimore, J. R., Pepys, M. B. & Perkins, S. J. (1997). Pentameric and decameric structures in solution of the serum amyloid P component by X-ray and neutron scattering and molecular modelling analyses. J. Mol. Biol. 272, 408-422.

67. Perkins, S. J. & Weiss, H. (1983). Low resolution structural studies of mitochondrial ubiquinol-cytochrome c reductase in detergent solutions by neutron scattering. J. Mol. Biol. 168, 847-866.

68. Perkins, S. J., Smith, K. F., Kilpatrick, J. M., Volanakis, J. E. & Sim, R. B. (1993). Modelling of the serine protease fold by X-ray and neutron scattering and sedimentation analyses: its occurrence in factor D of the complement system. Biochem. J. 295, 87-99.

Edited by R. Huber

(Received 18 January 2001; received in revised form 20 April 2001; accepted 20 April 2001)

