
DIVINING PROTEIN ARCHITECTURE

Date post: 31-Jan-2017

SCIENCE & TECHNOLOGY

Predicting structure from sequence has advanced impressively in the past few years

STU BORMAN, C&EN WASHINGTON

EVER SINCE THE 1960S, WHEN experiments on the refolding of ribonuclease by the late biochemist Christian B. Anfinsen demonstrated that a protein's amino acid sequence determines its three-dimensional structure, researchers have been working toward using sequence information as a basis for predicting protein structure.

Applications of protein structure prediction include learning more about protein folding, designing functional proteins from scratch, and divining the structure and function of proteins from living organisms.

Five or six years ago, the protein structure prediction problem was considered to be pretty much unsolved. But since then considerable progress has been made: Scientists can now construct fairly accurate models that show where a protein's chain is supposed to go, and many of the topological features of these models closely resemble those in actual structures.

Not that all the problems have now been solved. "In terms of actually explaining protein folding from first principles, I think we are very far away," says Krzysztof Fidelis, senior scientist at Lawrence Livermore National Laboratory's Protein Structure Prediction Center. "I would like to see that bridge built, but it's not happening very easily."

Attaining more accurate predictions is one of the key challenges. "It's going to take not just two years' time, but maybe 20 years' time to solve that problem," says Michael Levitt, professor and chairman of computational structural biology at Stanford University School of Medicine. "By 'solved,' I mean correctly predicting to 1-Å resolution the structure of a large protein. We're very far from that."

Nevertheless, continual progress in the field is being made. "If you look back over multiple years, you do see an improvement over the whole range of modeling," says John Moult, professor and fellow of the Center for Advanced Research in Biotechnology at the University of Maryland Biotechnology Institute, Rockville, Md. "The field is moving forward, not always rapidly but steadily."

Protein structure prediction techniques are of three main types. An amino acid sequence that turns up in a genome-sequencing project or becomes known some other way may adopt a type of structure that's already been seen in nature. If you find a similar sequence in a protein database, you can then build a model based on the structure of the known protein. That's comparative modeling or homology modeling.

SUPERMODELS Baker and coworkers used the program Rosetta to create ab initio models (right sides of each row) that turned out to closely resemble crystal structures (left sides) of the DNA repair protein MutS and the bacterial protein HI0817.

Even if a sequence does not match any in protein databases, it may still be possible to find a suitable structure using a second technique, fold recognition. For perhaps half or three-quarters of unknown (structurally uncharacterized) proteins, there will be a suitable structure in the database one can use as a basis or template for extrapolating a 3-D model, even in the absence of a sequence match.

"The idea of fold recognition is to turn the folding problem on its head and say 'Rather than finding a structure that's suitable for the sequence, let's see if we can find whether our sequence fits on any existing structures,' " explains Rob B. Russell, group leader of structural bioinformatics at the European Molecular Biology Laboratory, Heidelberg, Germany. "You thread the sequence onto existing structures and see if those structures bury the sequence's hydrophobic residues well, among a number of other structural considerations."

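The burial criterion Russell mentions can be illustrated with a toy score. This is only a sketch under stated assumptions: the hydrophobic residue set, the per-position burial values, the template names, and the ungapped threading are all hypothetical simplifications, and real fold-recognition programs weigh many other structural terms.

```python
# Toy fold-recognition score: reward templates that bury hydrophobic residues.
HYDROPHOBIC = set("AVILMFWC")  # illustrative choice, not a standard scale


def burial_score(sequence, burial):
    """Score an ungapped threading of `sequence` onto one template.

    `burial` holds one value per template position, from 0.0 (fully
    exposed) to 1.0 (fully buried). Hydrophobic residues score higher
    when buried; all other residues score higher when exposed.
    """
    if len(sequence) != len(burial):
        raise ValueError("sequence and template must have equal length")
    score = 0.0
    for residue, b in zip(sequence, burial):
        score += b if residue in HYDROPHOBIC else 1.0 - b
    return score


def best_template(sequence, templates):
    """Return the name of the template with the highest burial score."""
    return max(templates, key=lambda name: burial_score(sequence, templates[name]))


# Two hypothetical templates with opposite burial patterns.
templates = {
    "fold_a": [0.9, 0.1, 0.8, 0.2],
    "fold_b": [0.1, 0.9, 0.2, 0.8],
}
print(best_template("LKVE", templates))
```

Here the sequence alternates hydrophobic and polar residues, so the template whose buried positions line up with the hydrophobic ones scores best.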
If there's no known sequence or known structure to which you can match your sequence, a third option is to build a 3-D structure of a protein from scratch, called de novo or ab initio modeling. This is the only viable option for proteins with new folds: structures that have not been observed before in nature.
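The choice among the three technique types can be sketched as a small decision function. The function name, its arguments, and the 25% identity cutoff are illustrative assumptions made for this sketch; in practice the boundary between the categories is not a single fixed number.

```python
def choose_method(best_identity, fold_match_found, identity_cutoff=0.25):
    """Pick a prediction strategy for a new sequence.

    best_identity    -- highest fractional sequence identity (0.0 to 1.0)
                        to any protein of known structure
    fold_match_found -- True if the sequence fits a known fold even
                        without a clear sequence match
    identity_cutoff  -- hypothetical threshold for a usable sequence match
    """
    if best_identity >= identity_cutoff:
        return "comparative modeling"
    if fold_match_found:
        return "fold recognition"
    return "de novo prediction"


for case in [(0.40, False), (0.12, True), (0.05, False)]:
    print(case, "->", choose_method(*case))
```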

SOMETHING THAT has had a huge impact on the development of these techniques in the past few years is a program called CASP, Critical Assessment of Structure Prediction Methods, founded by Moult. Held every two years, it's essentially a competition to see which structure prediction techniques are most accurate at modeling unknown proteins.

The predictions are evaluated and compared after the unknown structures have been determined crystallographically. Results are discussed in an online CASP forum called FORCASP, and papers arising from each CASP are published in a special issue of Proteins: Structure, Function & Genetics. Such an issue will be published later this year for the current program, CASP 5.

26 C&EN / AUGUST 4, 2003 HTTP://WWW.CEN-ONLINE.ORG

CASP FOUNDER Moult observes that the big news of the CASP 5 competition was in the area of fold recognition, where metaservers have greatly improved accuracy.

Comparative modeling results at CASP 5 were summarized recently by biochemistry professor Anna Tramontano of the University of Rome "La Sapienza" [Nat. Struct. Biol., 10, 87 (2003)]. "Overall, the average quality of comparative modeling predictions ... improved, with the vast majority of methods producing good models ... for targets sharing greater than 25% sequence identity with known structures," she wrote. "It is undoubtedly true that biologists can now confidently use comparative modeling for structure prediction, [although] it is still difficult to predict the structure of regions of the target that are substantially different (farther than 2.5 Å) from the template." Predictions by two groups in Poland and one in the U.S. were most accurate at CASP 5, Tramontano noted.

However, others in the field are more critical. "In comparative modeling, where your protein of interest has a sequence that is related to that of a structure that's already known, we've really been stuck for some time, in my judgment, and we still seem to be pretty much stuck in that area," Moult says.

Fidelis agrees that "the lack of progress in the area of comparative modeling is quite disappointing. We haven't seen any significant progress since CASP 2," in 1996.

"It's always been clear that homology modeling is a random displacement away from the starting model built on a homolog, and nobody can consistently move toward the right answer," adds professor of biological chemistry David Shortle of Johns Hopkins University School of Medicine. "You've demonstrated that a target protein is evolutionarily related to a protein of known structure, so you know roughly how it's folded. Can you then move that structure away from the known homolog toward the right answer in some sort of consistent way? And people cannot. Sometimes they get closer, sometimes they get further away."

However, computational structural biologist Roland L. Dunbrack Jr. of Fox Chase Cancer Center, Philadelphia, who specializes in comparative modeling, views the technique's prospects more hopefully: "Shortle is right that we don't move backbones closer to the target structure from the template structure even most of the time. However, sequence alignments have improved tremendously in the last five years, and comparative models have improved accordingly. Also, while backbones have not improved, side-chain modeling for the target sequence onto the template backbone is reasonably accurate at higher sequence identities."

GO WITH THE FLOW Protein structure prediction techniques are of three main types: comparative modeling, fold recognition, and de novo prediction. [Flowchart labels: Protein sequence; Search databases of known structures; Homologous sequence of known structure found?; Three-dimensional protein structure]

Comparative models at lower sequence identities have also been improving, he says. "We now frequently make models in the 10 to 30% sequence identity range. This is necessary, since most proteins of unknown structure are only very distantly related to proteins of known structure."
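For readers unfamiliar with the measure, percent sequence identity can be computed from a pairwise alignment roughly as follows. This is a minimal sketch: the function name and example sequences are hypothetical, gaps are written as "-", and the denominator used here (aligned, gap-free columns) is only one of several conventions in use.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of gap-free alignment columns in which the residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only the columns where neither sequence has a gap.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Hypothetical aligned target and template sequences.
print(percent_identity("MKV-LA", "MRVQLA"))
```

Because the result depends on the alignment itself, a poor alignment at low identity distorts both this number and the model built from it, which is the difficulty Dunbrack describes.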

But the task isn't easy or fast, he says: "At low identity, 10 to 20%, it's a lot of work to get the alignment right."

Dunbrack and coworkers developed SCWRL, a side-chain conformation prediction program used for comparative modeling. But he notes that Modeller, created by professor of computational biology Andrej Sali and coworkers at the University of California, San Francisco, "is the most commonly used comparative modeling software and has had a large impact on the field."

Sali agrees with Dunbrack that comparative modeling has improved considerably in recent years. "I am not saying that the problems are solved (they remain and require additional work), but I simply do not agree with the suggestion that the field is stuck," he says.

"THE BIG NEWS at CASP 5," Moult says, "was in the fold-recognition category, where you're trying to predict the structure of a protein that's not obviously related at a sequence level to known structures but does have a fold that's been seen before. What we saw there was a quite large increase in the quality of the models in terms of accuracy. And the reason for that seems to have to do with the introduction of metaservers."

Various research groups have developed computer servers that accept incoming sequence data and generate models in an automated manner. "One can also set up a metaserver," Moult explains, "a server that sends out sequences to multiple other servers, gets a number of models from them, and then uses them to make consensus models. The result is a significant improvement of results in the fold-recognition category."
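One simple way to picture a consensus step is to pick, from the models returned by several servers, the one that agrees best with all the others. The sketch below does that with a plain coordinate RMSD. It is an assumption-laden illustration, not the procedure of any actual metaserver: the server names and coordinates are made up, and the models are assumed to be already superimposed and of equal length.

```python
import math


def rmsd(a, b):
    """Root-mean-square deviation between two equal-length lists of
    (x, y, z) coordinates, assumed to be already superimposed."""
    total = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(total / len(a))


def consensus_model(models):
    """Return the name of the model closest, on average, to all others."""
    def mean_rmsd(name):
        others = [m for n, m in models.items() if n != name]
        return sum(rmsd(models[name], m) for m in others) / len(others)
    return min(models, key=mean_rmsd)


# Hypothetical two-residue models from three servers.
models = {
    "server_1": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "server_2": [(0.0, 0.1, 0.0), (1.0, 0.1, 0.0)],
    "server_3": [(0.0, 5.0, 0.0), (1.0, 5.0, 0.0)],
}
print(consensus_model(models))
```

The outlier from the third server is far from both others, so it is never selected; the idea is that independent methods tend to agree when they are right.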

Automated servers and metaservers can currently be used for both comparative modeling and fold recognition, and some of the better comparative modeling at CASP 5 was carried out by metaservers. But metaserver performance was more impressive in fold recognition. In fact, metaserver fold-recognition scores at CASP 5 were better than those of almost all human participants.

Computer science senior lecturer Daniel Fischer of Ben-Gurion University of the Negev, Israel, pioneered the metaserver concept and also runs CAFASP (Critical Assessment of Fully Automated Structure Prediction), a parallel program to CASP solely for automated servers and metaservers. CAFASP uses the same targets as those in CASP, and all CAFASP predictions become part of the CASP evaluation.

DE NOVO ASSESSOR Russell was impressed with this year's CASP results.

Other important metaserver groups include those of Leszek Rychlewski, head of the Bioinformatics Laboratory at BioInfoBank Institute, Poznan, Poland, and biochemistry and molecular biophysics professor Burkhard Rost at Columbia University. Rychlewski runs the LiveBench Project, and Rost runs EVA, services that help researchers evaluate server and metaserver performance on structure prediction problems.

Although automated metaservers did surprisingly well in fold recognition in CASP 5, some scientists are troubled by the metaserver concept. "You send sequences to servers, and they make predictions," Levitt says. "Then metaservers collect results from other servers and make consensus predictions. Then you have metaservers that go to the consensus metaservers and collect new consensus results. And whoever gets the last result seems to do better. The last person into the game wins."

The trouble is that "you can't win if the others don't do their part," Levitt says. "So it's a strange business. This is kind of a meaningless technique. It works fine for CASP because all these different machines run the CASP sequences. But if you came along with a whole genome of 50,000 sequences that you would like to predict this way, you couldn't. Because to be the top guy who's getting the best predictions, you have to rely on everyone else to do the work for you."

However, Fischer notes that there are two types of metaservers: "selectors," which gather information from other servers and just select an answer, and "added-value metaservers," which not only make selections but also enhance the input to generate better predictions. "A number of groups, including ours, are now working on developing fast, independent 'metapredictors' that run all components internally and do not depend on others," he tells C&EN. "Thus, some of the criticism attributed to the first generation of metaservers may not be justified and will certainly fade away in the future, when fast, powerful, independent metapredictors will challenge the best human predictors."

Ab initio or de novo techniques are used for proteins that don't share either sequence or structural similarity with known proteins and thus have new folds. "To predict the structure of a protein with a new fold, you might imagine in the limit solving Schrödinger's equation, getting a purely quantum mechanical solution of the problem," explains associate professor of biochemistry and Howard Hughes Medical Institute assistant investigator David A. Baker of the University of Washington, Seattle. "But you can't do that because you can't get exact solutions for molecules with tens of atoms, let alone thousands of atoms as in proteins. So you have to make approximations."

To approximate the structure of new folds, ab initio programs assign energies to different polypeptide chain conformations and then use optimization routines to find the lowest energy (most stable) conformations.
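The assign-energies-then-optimize idea can be shown on a deliberately tiny model. Everything in this sketch is an assumption made for illustration: each residue takes one of three local states, the energy is a made-up sum of per-residue preferences plus a bonus when neighbors share a state, and the "optimization routine" is brute-force enumeration.

```python
from itertools import product

STATES = ("helix", "strand", "coil")


def energy(conformation, preferences, coupling=-1.0):
    """Toy energy: per-residue state preference plus a bonus (negative
    energy) for every pair of neighboring residues in the same state."""
    e = sum(preferences[i][s] for i, s in enumerate(conformation))
    e += sum(coupling for a, b in zip(conformation, conformation[1:]) if a == b)
    return e


def lowest_energy_conformation(preferences):
    """Enumerate every conformation and return the one with minimum energy."""
    candidates = product(STATES, repeat=len(preferences))
    return min(candidates, key=lambda c: energy(c, preferences))


# Hypothetical preferences for a three-residue chain.
prefs = [
    {"helix": -1.0, "strand": 0.5, "coil": 0.0},
    {"helix": 0.2, "strand": 0.0, "coil": 0.1},
    {"helix": -0.5, "strand": 0.3, "coil": 0.0},
]
best = lowest_energy_conformation(prefs)
print(best, energy(best, prefs))
```

Enumeration visits 3^N conformations for N residues, which is hopeless for a real protein; that scaling is why practical programs rely on heuristic or stochastic searches instead.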

In new-fold prediction, Moult says, "we started with very poor results in CASP 1 but saw steady improvement through CASPs 2, 3, and 4. We actually hiccuped a bit between CASP 4 and CASP 5; you can't see much progress there. But I'm not very put off by that. There's a lot of good stuff going on in the new-fold area, and I think we'll see things pick up again next time."

Russell, who helped assess CASP 5 de novo predictions, says: "I was very impressed with the results. It's clear that a number of groups are able to do things that I never would have dreamt possible 10 years ago."

TRADITIONALLY, "having a lot of related sequences is the sort of thing that helps you a lot if you're trying to do a de novo prediction," Russell says. "It gives you some information about where the location of helices and strands might be, and so on. For some proteins in CASP 5, there was very little help of this sort. And there was at least one case where the protein didn't actually have any sequence homologs at all. It was a complete orphan, all by itself in the whole world. Nevertheless, a few groups (and Baker's certainly stands out among them) were able to get quite accurate structures. I thought this was really phenomenal."

Shortle says that in new-fold techniques, "there's been significant progress, and 80% of it derives from the results of the Baker lab. When they get it right, they get it dramatically right. I think the secret of their success has several components, but the major one is their selection of fragments from the Protein Data Bank to assemble their models with. We were told we came in second in CASP 5 in new-fold prediction, but it was a distant second. So there isn't a really dramatic story outside of the Baker lab's success. I think that will change in the future. There's a lot on the horizon, but the Baker lab has led the way."

BUILDING BLOCKS Baker uses a computer program called Rosetta that combines little bits and pieces from known proteins.

To perform their predictions, Baker and coworkers use a program called Rosetta. Essentially it makes new proteins by assembling little bits from known proteins.

"If you look at any nine-amino-acid chunk of a protein, during the folding process it doesn't immediately go to one conformation," Baker says. "It flickers between a number of different possible local conformations. Folding occurs when everything happens to be in the right place at the right time, when the different pieces are oriented so they make low-energy interactions throughout the chain."

To model this flickering between local conformations, "you have to know what distribution of conformations any given portion of this chain is going to adopt," Baker says. Rosetta gets that information from protein databases. The program then searches for combinations of local conformations that, when spliced together, produce very low energy protein tertiary structures.
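The general idea of splicing database-derived local conformations together and keeping low-energy combinations can be sketched as a greedy fragment-insertion loop. This is not Rosetta's actual algorithm or energy function: the three-residue fragments, the state energies, the coupling term, and the accept-only-if-not-worse rule are all hypothetical simplifications chosen to keep the example short.

```python
import random

# Hypothetical per-residue state energies and neighbor coupling (toy values).
STATE_ENERGY = {"helix": -1.0, "coil": 0.0, "strand": 0.5}
COUPLING = -1.0  # bonus when neighboring residues share a local state


def energy(conformation):
    """Toy energy of a chain described by one local state per residue."""
    e = sum(STATE_ENERGY[s] for s in conformation)
    e += sum(COUPLING for a, b in zip(conformation, conformation[1:]) if a == b)
    return e


def assemble(length, fragment_library, steps=200, seed=0):
    """Greedy fragment insertion.

    `fragment_library[start]` lists candidate fragments (tuples of local
    states) for the window beginning at residue `start`. Each step
    splices one randomly chosen fragment into the chain and keeps the
    change only if the energy does not go up.
    """
    rng = random.Random(seed)
    conformation = ["coil"] * length
    best = energy(conformation)
    starts = sorted(fragment_library)
    for _ in range(steps):
        start = rng.choice(starts)
        fragment = rng.choice(fragment_library[start])
        trial = conformation[:]
        trial[start:start + len(fragment)] = fragment
        trial_energy = energy(trial)
        if trial_energy <= best:
            conformation, best = trial, trial_energy
    return conformation, best


# Every window offers the same two hypothetical fragments.
library = {
    start: [("helix", "helix", "helix"), ("strand", "coil", "strand")]
    for start in range(4)
}
final, final_energy = assemble(6, library)
print(final, final_energy)
```

A production method would draw fragments from real structures and would typically accept some uphill moves to escape local minima; the greedy rule here is only the simplest variant that shows the splice-and-score loop.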

The method is thus partly empirical, in that it makes use of databases of known structures. Other ab initio prediction programs are more strictly theoretical, using molecular mechanics or molecular dynamics and making little to no use of data from known structures.

Some contend that programs like Rosetta are less pure, in a sense, than the more exclusively theoretical ones. "This has been an area of discussion, to put it politely," Moult says.

"Years ago we didn't have a lot of other structures to use as a basis for building models," Moult says, "so early ab initio methods were more or less based on physics. However, we saw early in the CASP program that those methods didn't work very well. Meanwhile, people like Baker became very clever at using the information from known structures in various ways. This has been more successful. But it has upset some people with the older methods, who feel that in some sense this is not really science, but rather information science that's not based on physics."

CLOSE MATCH Dunbrack and coworkers used their SCWRL program to generate a comparative model (yellow and orange) of key residues from a complex of BACE protease with its substrate, amyloid precursor protein (APP, red). The model closely matches a crystal structure (dark and light blue) of a complex of BACE with an APP-like inhibitor (green). The comparative model shows a likely physiologically important salt bridge between a BACE residue and one residue on APP, an interaction not present in the crystal structure, which instead has a single hydrogen bond in a different location. Note: D = aspartic acid, R = arginine, Y = tyrosine, I = isoleucine, L = leucine.

These researchers "certainly have a point," he says. "The problem is that the traditional physics-based methods are still not delivering. I think it's important that we still give them space and that they still get funded, because if not, we're never going to move forward in that area. But CASP puts an awful lot of emphasis on results, and results right now are better from methods that somehow use a knowledge base."

Baker concedes that using information from known protein structures in constructing sets of possible local conformations is not the same as starting from scratch. However, "there is no method that starts from scratch," he says. "You can't do a truly first-principles calculation, so you have to get parameters from somewhere. It's a bit of semantics."

Because Schrödinger's equation can't be solved exactly for proteins, Baker says, "you have to be able to combine information from quite different areas. I think that's really the way of the future."

Levitt and coworkers, on the other hand, are among a number of groups that continue to develop a more purely theoretical approach. "It should be possible to predict protein structure from quantum mechanics or molecular mechanics force fields, that is, from the basic physics and chemistry of a situation," Levitt says.

An ab initio approach that he and his postdoc Chen Keasar recently tried was to write an expression for the free energy of a protein and then minimize that directly [J. Mol. Biol., 329, 159 (2003)]. "We got some very interesting results," Levitt says, but the program didn't do very well in CASP 5.

Nevertheless, "ab initio approaches need to be emphasized further," Levitt says. "You could argue that just for the purity of chemistry and physics we need to be able to predict protein structures" that way.

One promising ab initio effort is the Folding@Home project run by assistant professor of chemistry and of structural biology Vijay S. Pande of Stanford University. Folding@Home uses free time on the computers of thousands of volunteers to carry out computationally intensive protein calculations.

Pande and coworkers primarily study the mechanism of protein folding, but in a study last year they applied the computer power of the Folding@Home system to protein structure prediction. They discovered that the average unfolded structure of a protein mirrors the structure of its folded state [J. Mol. Biol., 323, 153 (2002)]. By using a molecular dynamics routine to calculate average distances between pairs of residues in an unfolded protein, they were able to derive its folded structure.
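The averaging step in that description, mean distances between residue pairs over an ensemble of conformations, can be sketched as follows. This covers only the averaging; the molecular dynamics sampling and the step that turns averaged distances into a folded structure are not shown, and the function name and the tiny two-residue ensemble are hypothetical.

```python
import math


def mean_distance_matrix(ensemble):
    """Average pairwise residue distances over an ensemble.

    `ensemble` is a list of conformations; each conformation is a list
    of (x, y, z) coordinates, one per residue, all of equal length.
    """
    n = len(ensemble[0])
    totals = [[0.0] * n for _ in range(n)]
    for coords in ensemble:
        for i in range(n):
            for j in range(n):
                totals[i][j] += math.dist(coords[i], coords[j])
    return [[t / len(ensemble) for t in row] for row in totals]


# Two hypothetical snapshots of a two-residue chain.
ensemble = [
    [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)],
]
matrix = mean_distance_matrix(ensemble)
print(matrix[0][1])
```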

"The trouble is that Vijay, with all the computers in the world and three months of computer time, can fold a protein with 36 amino acids sometimes," Levitt says, "and Baker can get much better results in two minutes."


The question is whether empirical methods like Baker's will ultimately be able to fold most biological proteins, or whether more purely ab initio methods like Levitt's will in the end be needed as well. "My own preferences are for the kind of methods where there's deep understanding," says Eaton E. Lattman, professor and chairman of the department of biophysics at Johns Hopkins University and editor-in-chief of Proteins: Structure, Function & Genetics. "So I think people have to respect Baker's achievements enormously, but whether the empirical methods are going to run out of gas and only get you so far isn't clear."

The CASP program is generally believed to have helped the field. But some say CASP cycles are too fast, that the number of targets is too small for results to be valid statistically, and that head-to-head competition isn't a good way to do science.

"One of the problems may be that we're reaching the limits of what we can do," Levitt says. "Normal science is not done at all like CASP. It's done by people having ideas, thinking about them, writing careful papers, and so on." From the end of one CASP to the start of another, "you basically have a year to develop a new method, and that isn't enough time. At CASP, all that matters is how you do, and that may be a long-term negative thing."

"Because the number of targets dealt with is small and the number of groups that are uniformly successful is even smaller, it's unclear whether CASP is really assessing progress and whether we're really learning anything about protein structure prediction there," adds molecular biology professor Charles L. Brooks III of Scripps Research Institute.

However, "I've attended all five CASPs, and I don't perceive any serious problems," Shortle says. "I think the people in charge do a very reasonable job in the assessments. Every year they've gotten better, more rigorous and well defined. But every year there's a certain amount of grumbling. The rules of the game are pretty clear to anyone who has been around awhile, and I think the whining is inappropriate."

AN IMPORTANT POTENTIAL application area for protein structure prediction is structural genomics, a large-scale effort to determine the structures of proteins across entire genomes. One might think that by solving all protein structures experimentally, structural genomics will eclipse structure prediction and eventually even make it obsolete. But researchers say that's not the plan and that structural genomics is indeed depending on protein structure prediction to help achieve its aims.

An example is the University at Buffalo Center of Excellence in Bioinformatics, where director Jeffrey Skolnick and coworkers are developing comparative modeling and ab initio prediction techniques to advance the center's structural genomics goals.

"The structural genomics field has become a very strong raison d'etre for computational structure prediction," Levitt says. "It's impossible to make crystal structures of every single gene on Earth, although getting the sequence of these is very likely to happen. In some sense, the premise of structural genomics is that we need to do enough structures so that modeling can then do the rest for us."

"The current plan for structural genomics," Moult says, "is to try to sample structure space experimentally in such a way that you can build useful models of all of the other proteins. Right now, we have experimental structures for maybe 1% of the proteins for which we have a sequence. So you'd like to build models for the other 99%, and that's never going to go down to less than 90%."

Because the number of proteins in different genomes is enormous, metaservers could play an important role in determining their structures. "As servers continue to improve, they will become increasingly important in any prediction process, especially when dealing with genome-scale prediction tasks," Fischer and coworkers wrote in a CAFASP paper [Proteins: Struct., Funct., Genet., 45, 171 (2001)]. "We expect that in the near future, the performance difference between humans and machines will continue to narrow and that fully automated structure prediction will become an effective companion and complement to experimental structural genomics."

For applications related to protein function, efforts will continue to be made to refine modeling accuracy. In most current models, "if you look in detail at where the atoms are, they're not in exactly the right places," Baker says. "That means that they're not good for applications where you want to understand catalysis or you want to do drug design. So the problem really then becomes how to take a rough model and make it more accurate."

As solutions to that problem develop, new-fold modeling could prove increasingly useful for designing proteins as well as divining structures. "Many of the principles used in ab initio folding are obviously applicable to problems where you're trying to change an enzyme specificity, modify the structure of an existing protein, or design proteins from scratch," Russell says. "That seems to be where the field is moving."

"The long-term dream," Levitt says, "is being able to treat proteins as a materials science and to model in proteins the way we can model in silicon and steel and polymers. It's clear that with the right sequence you can develop proteins that do anything you want. But to do that you're going to need to solve ab initio protein folding. It's a problem that's very much at the boundary of physics, chemistry, and life. Because on the one hand we think it's a purely computationally solvable problem, and on the other hand it's what makes living things possible. So there are many reasons why this probably will not go away."
