Page 1: Neutral and Nonneutral Mutations: The creative mix—evolution of complexity in gene interaction systems

J Mol Evol (1997) 44(Suppl 1):S2-S8

Journal of Molecular Evolution

© Springer-Verlag New York Inc. 1997

Neutral and Nonneutral Mutations: The Creative Mix

Evolution of Complexity in Gene Interaction Systems

Emile Zuckerkandl

Institute of Molecular Medical Sciences, 460 Page Mill Road, Palo Alto, CA 94306, USA

Received: 3 June 1996 / Accepted: 20 August 1996

Abstract. Random drift, while indifferent to the functionality of the molecular features on which it acts, may nevertheless affect evolving molecular mechanisms. It can lead to functional novelty in either gene structure or regulation. In particular, a nearly neutral (in the sense of Ohta), somewhat deleterious mutation can result in a loss of efficiency in gene regulation, and this loss is expected at times to be compensated by a selected event of a particular type: the use of an additional regulatory factor. An accumulation of additional regulatory factors, implying a combination of events of drift and selection, can permit regulatory systems to achieve an increase in both specificity and complexity as mere byproducts of a particular repair process. Nearly neutral mutations thus may, at times, constitute a required pathway for increases in gene interaction complexity. The process seems to point to an inbuilt drive, built into the gene interaction system itself, toward the evolution of higher organisms. This is a matter worthy of experimental exploration, since the general foundations for the evolution of "higher" from "lower" organisms seem so far to have largely eluded analysis.

Key words: Nearly neutral mutations; Gene regulatory; Regulatory networks; Gene network complexity; Controller nodes; Gene duplication; Controller gene diseases; Progressive evolution; Anagenesis

Introduction

To be allowed to participate in honoring Motoo Kimura is indeed a signal honor. He was and remains one of the major figures of contemporary evolutionary science. His uniquely sharp mind, lost to the world through a cruel fate, lives on in his work and in our memory.

Kimura felt that his neutral theory was important enough if it was simply true, regardless of the role that neutral mutations may play in the evolution of biological mechanisms. He believed that knowing what is, whatever that may be, fulfills the objective of science. In fact, the fixation of mutations by random drift does bear powerfully on the evolution of biological mechanisms, although only in conjunction with other processes.

Neutrality and Function

One needs to define the relation between functional neutrality and neutrality in Kimura's sense of neutral genetic drift. Functional neutrality may be considered characteristic of a sequence change that will not be selected for or against, however large the size of a population may be, up to reasonable maximal effective population sizes to be considered. Neutral drift is not sensitive to the presence or absence of function in the drifting nucleotide or amino acid, or sequence. By their immediate effect, neutral mutations neither generate nor diminish functionality, and nearly neutral mutations do not do so to any major extent.

An important instance of the relation between Kimura's neutrality and functionality is equifunctionality, or near-equifunctionality. It occurs when the substituted amino acid or base is functional, but functionally not significantly better or worse than the amino acid or base for which it substitutes. In proteins, nonfunctionality of amino acids, at the very least in the sense of amino acid sites, is probably extremely reduced, and near-equifunctionality of amino acid residues comes into play principally at the most variable sites. Even there, judging from the wide range of replacement rates for different accepted substituents (Barnard et al. 1972; Zuckerkandl 1975), near-equifunctionality may be far from prevalent.

On the other hand, in many regions of DNA that do not code for proteins, substituent nucleotides may indeed often be near-equifunctional, and, in addition, many individual nucleotides may be expected to be nonfunctional. To be sure, in any particular case, nonfunctionality, as an alternative to equifunctionality, may be extremely difficult to demonstrate. Nonfunctionality of a great many individual nucleotides can nevertheless be assumed, and certainly has been. Interestingly, however, if one adds together nucleotides that are individually nonfunctional, one may end up with a sum of nucleotides that are collectively functional. Nucleotides belonging to heterochromatin are an example. Despite all arguments made in the past in favor of considering heterochromatin as junk, many people active in the field no longer doubt that it plays functional roles (e.g., Zuckerkandl and Hennig 1995). Just as, quite some time ago, populational thinking became a necessity in genetics, we need now to get used to populational thinking in regard to the function of nucleotides. They may individually be junk, and collectively, gold. The statement, "The majority of mutations in higher organisms spread by random drift" thus can in no way be taken as equivalent to the statement, "The major part of DNA in higher organisms is nonfunctional."

In summary, there are two reasons why, in genomes, predominant nonfunctionality, a notion increasingly in need of convincing support, is not implied by widespread drift, a process that has become increasingly well established in the wake of Kimura's oeuvre. One reason is the presumably frequent near-equifunctionality of amino acids and especially of nucleotides; the other is the presumably frequent collective functionality of nucleotides that are individually nonfunctional.

Nearly Neutral Mutations as Pathways Toward New Functions of Gene Products

How can it be that, despite its indifference to function, neutral drift bears on the evolution of biological mechanisms? The answer, well known to Kimura, is: through the opening up of new pathways toward the functional that otherwise would not exist. The slightly deleterious, nearly neutral mutations, championed by Tomoko Ohta in remarkable ways (Ohta 1992), are decisively important here, as will be suggested below.

Twenty years ago I treated as evolutionary "noise" any sequence change that did not lead to more than minute functional variation, whether the sequence change was selected or not (Zuckerkandl 1976a). I did recognize that functionally insignificant changes may "occasionally offer raw materials for significant functional change." This recognition was insufficient. The impact on evolution of functionally insignificant changes can be so great that the smallness of the proportion of such cases is irrelevant. Changes with no functional impact may often be the condition for access to sequence changes of the most significant type, namely, those that represent functional innovation. Without neutral or nearly neutral pathways, evolution would have been much more constrained, in character and extent.

An illustration of this is furnished by Huynen (1996) in an article entitled, "Exploring Phenotype Space Through Neutral Evolution." He shows how neutral sequence changes open many otherwise-forbidden doors. His approach depends on computer simulations of the evolution of populations of tRNAPhe secondary-structure landscapes. Sequences that fold into the same secondary structure are defined as neutral with respect to secondary structure. A connected network of neutral sequences emerges in sequence space and provides a stage for neutral evolution in the Kimura sense in that many of these sequences may fill the conditions of fixation by random drift. As one proceeds along Huynen's neutral network and explores new structures accessible by single substitutions from each neutral position in the landscape, one observes that the degree of structural innovation along a random walk on a neutral network is constant. Thus, thanks to the existence of a neutral network, populations are exploring more and more structures with time, at a constant rate, and neutral evolution allows sequences to "search" much larger samples of structures than ordinary adaptive walks would permit. Neutrality, not only in Huynen's sense, but, indirectly, in Kimura's sense as well, thus is shown to be an important facilitator for the emergence of novelty. It is what Motoo Kimura expected, and it certainly is helpful for understanding how this role of facilitator can be filled.

Nearly Neutral Mutations and the Building of Regulatory Networks of Genes

Biological systems seen as networks of molecular information is a key concept underlying the present meeting. The contribution of neutral mutations to the emergence of regulatory networks of genes, therefore, is relevant here, and it may be permissible to propose one possible scenario for this contribution. On this occasion, the meaning of the notion of complexity of gene interaction networks needs to be analyzed to some extent.

It has been observed that the same protein factors frequently bind to promoters and enhancers of different genes and that genes generally are controlled by a number of different factors, with the specificity of gene regulation, therefore, dependent upon their collective effect (Ptashne 1992). For full transcriptional activation, the simultaneous presence of a full set of factors often seems to be required (e.g., Robertson et al. 1995).

[Figure 1 not reproduced: panels I, II, and III.]

Fig. 1. Transcription complexes: impairment and restoration. Reprinted from Zuckerkandl (1994). See text.

The schematic Fig. 1 shows a coding sequence (the wiggly line) and its upstream promoter, as well as transcription factors (circles). Part I of the figure refers to a gene that, for simplicity's sake, is assumed, at one point in evolution, to have required only two protein factors for its full transcriptional activation. We can picture the factors as interacting. A mutation then occurs that decreases the affinity of one of the factors, say, factor B, for its cognate DNA element. The mutation can be either in the DNA element or in the gene controlling the factor. As a consequence of the mutation, factor B is bound less tightly to the DNA element and to factor A. This is shown in part II of the figure, where B is represented in the unbound state. In this situation, transcriptional activation is more or less impaired. The important point for us, here, is that this mutation can be thought of as having become fixed by random drift, as a slightly or not so slightly deleterious mutation (Ohta 1992).

Full transcriptional activability can be reestablished if a compensatory mutation occurs, for instance, as represented in part III of the figure. Here, a factor C has undergone a sequence change that prompts it to bind to factors A and B in such a way as to restabilize the binding of factor B to the DNA and to factor A. Thus, the mutation in the gene for factor C would be fixed in the population by positive selection. We would have a succession of a somewhat deleterious mutation spreading by drift and of a compensatory mutation spreading either by selection or, in the case of an only slightly adaptive compensatory mutation, also by random drift, under certain circumstances. One point that deserves to be noted is that, in this scenario, the deleterious mutation thought to have spread by drift was a necessary precondition for the evolution of a system of gene regulation that displays both higher specificity (at least potentially) and higher complexity. The increase in specificity and complexity turns out to be a byproduct of the compensation for the functional defect.

Why a possible increase in specificity? Because the additional factor C, through its own "lifestyle," namely, the time and place of its action, may define a more restricted activity pattern for the target gene. For example, if factor C occurs only in the embryo, the full activity of the duplicate gene would now be limited to the embryo. This could be acceptable to the organism if the gene is a duplicate of another gene that continues to function as before.

Why an increase in complexity? Because, in the process described in the example, the recruiting of an additional factor, C, regulated by its own gene, results in an increased number of factors now required for full transcriptional activation, and the network of gene interactions around the structural gene targeted is thereby rendered more complex.

Many variants of the process described could occur, and it could be repeated, leading to an accumulation of factors attending the regulation of a particular gene. There may, of course, be other pathways as well for establishing multifactorial gene control. Significantly, slightly deleterious mutations that spread by random drift can be required to bring about an increase in the complexity of the gene-regulatory network. This gives nearly neutral mutations, tentatively, a very important role in macro-evolution. The process described, and variants thereof, may contain part of the answer to the old question: Why, during evolution, does the complexity of some organisms increase when simpler ones are equally successful or, in a certain sense, even more successful? We do not know whether, in evolution, the process of increase in complexity of the interaction network around individual genes reached an early plateau or continued for a very long time. It may have continued for a longer time for genes playing specific roles in nervous and immune systems, for example, than it did in some other cellular systems, and the increase may have been smallest for "housekeeping genes." One can surmise, at any rate, that the increase in complexity of the interactions converging upon individual genes, and, therefore, in complexity of the interactions among genes, took place repeatedly, through a random loss, by mutation, in regulatory efficiency, and a subsequent compensation for this loss. The compensation, it is supposed, can occur at times either through a mutation in the gene of an additional regulatory factor or thanks to the involvement of an additional DNA sequence element whose affinity for a factor is increased by a mutation.
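The drift step of this scenario can be given a rough quantitative form. The sketch below is not part of the original argument. It assumes Kimura's (1962) diffusion approximation for the fixation probability of a single new mutation with additive selection coefficient s in a diploid population whose effective size is taken equal to its census size N, namely u = (1 - exp(-2s)) / (1 - exp(-4Ns)). The function name and the numerical values of s and N are illustrative assumptions.

```python
import math


def fixation_probability(s: float, n: int) -> float:
    """Fixation probability of one new mutant copy (initial frequency 1/(2n)).

    Assumes Kimura's diffusion approximation with additive selection and an
    effective population size equal to the census size n (an assumption).
    """
    if abs(s) < 1e-12:
        # Strictly neutral limit: the chance equals the initial frequency.
        return 1.0 / (2 * n)
    exponent = -4.0 * n * s
    if exponent > 700.0:
        # Strongly deleterious relative to drift: avoid overflow in expm1;
        # the fixation probability is effectively zero.
        return 0.0
    # expm1 keeps precision when s is very small.
    return math.expm1(-2.0 * s) / math.expm1(exponent)


if __name__ == "__main__":
    s = -1e-4  # hypothetical slightly deleterious regulatory mutation
    for n in (100, 1_000, 10_000, 100_000):
        # Ratio to the neutral expectation 1/(2n).
        ratio = fixation_probability(s, n) * 2 * n
        print(f"N={n:>7}  u/u_neutral={ratio:.3g}")
```

With s held at a small negative value, the ratio stays close to 1 while 4N|s| is well below 1 and collapses as N grows. Under these assumptions, a mutation that slightly impairs regulation behaves as effectively neutral only in sufficiently small populations.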

An example of a process of the type described has been given in yeast by Eric Jarvis, Karen Clark, and George Sprague in 1989. Yeast factor MCM1, which is homologous to the mammalian serum response factor SRF, binds to a DNA element dubbed the P-box, a symmetric dyad, located 5' of genes whose activity characterizes yeast cell type a. The P-box appears to be homologous to the mammalian serum response element SRE. Genes specific for the yeast α cell type, on the other hand, have a degenerate P-box with impaired dyad symmetry. The MCM1 protein cannot bind unaided to this degenerate DNA element, but this protein can bind to the DNA element in a ternary complex when the regulatory factor α1 is present and available for interaction, thus compensating, as it were, for the defect in the P-box. The presence of the α1 factor determines the cascade of regulatory interactions that leads to cell type α. In cell type a, from which α1 is absent, MCM1 binds unaided to intact P-boxes, rather exactly the situation depicted in the figure. Ira Herskowitz, in 1989, already pointed out that these circumstances provide a way to think about how cell-type-specific regulation might evolve from a constitutively expressed gene.

Evolution of the Complexity of Gene Interaction Systems

Given the complexity of the gene regulatory network even in yeast, the most likely evolutionary scheme might be one in which the greatest generalized increase in the complexity of interaction networks targeting individual genes occurred in eukaryotes prior to the appearance of multicellular organisms. The evolution of multicellular organisms would, then, have been accompanied by more selective further increases in this feature of complexity, namely, increases limited to certain subsets of genes: perhaps, for example, genes responsible for the intercellular signaling systems and for the versatility as well as specialization in the responses of cells to their environments.

To give appropriate emphasis to the notion of number of factors and cofactors interacting per targeted gene, this notion may deserve a name, and for that reason, for better or worse, a number of years ago I used the term "controller node complexity." The complexity of any given controller node will have a certain numerical value (Fig. 2). Conventions for obtaining this value in ways usable for comparative studies have been tentatively proposed (Zuckerkandl 1979) and have to be revised.

[Figure 2 not reproduced: schematic of a controller node, with the promoter region and the coding sequence labeled and the node components numbered.]

Fig. 2. Example of a controller node, in this case of complexity 7. Controller nodes comprise the following components, each either activating or inhibiting (or both, according to circumstances), involved in regulating the rate, timing, and localization of transcription: (i) the polypeptide chains (1 and 2) that interact directly with DNA sequence elements, and perhaps with one another (RNA molecules are also to be considered); (ii) these DNA sequence elements (3 and 4), located in promoter or enhancer regions; the promoter region represented includes further factor-binding DNA elements that are not recorded here; (iii) polypeptide chains (5) that interact with components of type (i), and play the role of macromolecular cofactors of these components; and (iv) other molecules, e.g., lower molecular weight cofactors (6 and 7), that specifically interact with polypeptide chains listed under (i) or (iii).

Polypeptide chains in protein factors composed of more than one type of chain (controlled by more than one gene) should each be counted separately. Homodimers or homooligomers of protein factors should be counted as single controller node components. A factor reacting with several (identical or nonidentical) DNA elements is to be counted as a single component of a controller node, even though its action may be different at different DNA sites. This difference in action is taken into account by including each of the cognate repeated DNA elements in the controller node complexity count, unless it is shown that their absence has no notable effect on the expression of the gene. However, when, in the regulatory region of the gene, the same factor binds to DNA, along most of the sector, once or more than once per nucleosome length (about 200 bp), not only is the factor counted as 1, but also the binding sites may collectively be counted as 1, since in such a case factor and sites may, locally, fill a general function in regard to chromatin structure. Conventions in these and in other respects have to be adopted if the complexities of different controller nodes within and among organisms are to be compared (Zuckerkandl 1979). Dots in the figure indicate continuity of DNA along the chromosome.

When the purpose is to provide a measure of the complexity of the organism, rather than of particular cell types (characterized by cell-type-specific inventories of sets of factors), controller node complexity is recorded across organismal time and space in the form of a representation of cumulative factor interactions as they occur at all stages and in all cell types. All factors, cofactors, and cognate DNA sequence elements that directly intervene in the transcriptional control of a gene, at some developmental time and in some tissue, are considered collectively to define the size and limits of an organismal controller node. To date, all examples are incomplete. One is the controller node for the c-fos gene (Zuckerkandl 1994; Fig. 1). The most comprehensive and enlightening studies so far of controller nodes in higher organisms are probably those by Kirchhamer and Davidson (1996), on the controller node (not so designated, of course) of the CyIIIa cytoskeletal actin gene of Strongylocentrotus purpuratus, and by Yuh and Davidson (1996) on Endo16, the gene for a cell surface glycoprotein from the same sea urchin. For the three examples, the minimum complexity values (which future work may show to be far too conservative) are 21 (16 factors, 5+ binding sites), 21 (9 factors, 12 binding sites), and 48 (14 factors, 34 binding sites), respectively. (SpGCF1 sites have been counted once per sea urchin controller node, see caption of Fig. 2.) As a counterexample, in the archaeon Methanococcus jannaschii (Bult et al. 1996), there is not enough space between most coding sequences to accommodate comparable numbers of regulatory DNA elements. For this reason alone, controller node complexity has to be much lower. Increasing gene interaction complexity surely requires, among other conditions, increasing minimum amounts of noncoding sequences for the interaction with regulatory factors.
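The counting conventions stated in the caption of Fig. 2 can be written out as a small data structure. This is only a sketch of one possible reading of those conventions, not the procedure of Zuckerkandl (1979). The class names, the field names, and the assignment of chain 1 to element 3 and chain 2 to element 4 are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BindingSite:
    """One factor-binding DNA element in a promoter or enhancer region."""
    name: str
    factor: str                # polypeptide chain that binds this element
    dispensable: bool = False  # True if its absence has no notable effect


@dataclass
class ControllerNode:
    dna_binding_chains: set[str]                             # type (i)
    sites: list[BindingSite]                                 # type (ii)
    cofactor_chains: set[str] = field(default_factory=set)   # type (iii)
    small_cofactors: set[str] = field(default_factory=set)   # type (iv)
    # Factors that bind about once per nucleosome length along most of the
    # region; their sites are pooled and counted collectively as 1.
    pooled_factors: set[str] = field(default_factory=set)

    def complexity(self) -> int:
        # Each chain type counts once, so homo-oligomers add nothing extra and
        # a factor bound at several sites is still a single component.
        chains = len(self.dna_binding_chains | self.cofactor_chains)
        counted = [s for s in self.sites if not s.dispensable]
        single_sites = sum(
            1 for s in counted if s.factor not in self.pooled_factors
        )
        pooled_groups = len(
            {s.factor for s in counted if s.factor in self.pooled_factors}
        )
        return chains + len(self.small_cofactors) + single_sites + pooled_groups


# The node of Fig. 2, using the component numbers of the caption as names.
fig2 = ControllerNode(
    dna_binding_chains={"1", "2"},
    sites=[BindingSite("3", factor="1"), BindingSite("4", factor="2")],
    cofactor_chains={"5"},
    small_cofactors={"6", "7"},
)
print(fig2.complexity())  # prints 7
```

Storing chains as sets makes the "count each chain type once" rule automatic, and the pooled_factors field is one way to express the convention under which SpGCF1 sites were counted once per node.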

In combining temporally and spatially discrete processes to generate a single blueprint of interactions involving a given gene in a given organism, future figures for the complexity of organismal controller nodes may correlate with the number of an organism's developmental stages. Also, it seems reasonable to presume that the number of factors in organismal controller nodes and the number of cell types in the organism will not be unrelated. The more cell types are present, the larger the number of discrete regulatory factors in the organism probably is. The increased number of factors should be reflected, in part, in the complexity of at least a subset of organismal controller nodes. An increase in complexity of even a small subset could have far-reaching evolutionary consequences.

Only the synthetic organismal controller node has a unique value per organism (although the value will have a margin of error because of some unavoidable ambiguities in the count). Cell-type-specific and cell-stage-specific controller nodes are expected to be less complex than organismal controller nodes, and to vary in space and time. A DNA element may be accessible, but the cognate factor may not be present; or the factor may be present and the cognate DNA element may not be accessible. Different mixes of such regulatory states define different cell types and lead to different sizes of the same controller nodes. (If a gene's regulatory region is totally inaccessible to factors, the gene has a controller node complexity of 0.)
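The relation between cell-type-specific nodes and the organismal node can be illustrated with set operations. The sketch below is a deliberate simplification that considers only DNA-binding factors and their cognate elements and ignores cofactors. The function names, the factor names A, B, C, and the cell states are hypothetical.

```python
def cell_type_node(interactions, factors_present, elements_accessible):
    """Components engaged at one gene in one cell type or stage.

    An interaction (factor, element) contributes only when the factor is
    present AND its cognate DNA element is accessible.
    """
    active = set()
    for factor, element in interactions:
        if factor in factors_present and element in elements_accessible:
            active.add(("factor", factor))
            active.add(("element", element))
    return active


def organismal_node(interactions, cell_states):
    """Cumulative node: union of the cell-type nodes over all states."""
    node = set()
    for factors_present, elements_accessible in cell_states.values():
        node |= cell_type_node(interactions, factors_present, elements_accessible)
    return node


# Hypothetical gene with three factor/element pairs.
interactions = [("A", "eA"), ("B", "eB"), ("C", "eC")]
cell_states = {
    "embryo": ({"A", "B", "C"}, {"eA", "eB"}),  # C present, eC closed
    "adult": ({"A", "C"}, {"eA", "eC"}),        # eB open or not, B is absent
    "silent": ({"A", "B"}, set()),              # regulatory region closed
}

for name, (factors, elements) in cell_states.items():
    print(name, len(cell_type_node(interactions, factors, elements)))
print("organismal", len(organismal_node(interactions, cell_states)))
```

In this toy case each expressing cell type engages four components, the fully inaccessible state has complexity 0, and the organismal node, being the union, reaches six.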

As one goes from relatively simple to very complex gene networks and organisms, the increase in the complexity of organismal controller nodes is only one complexity feature to investigate. Another is the sharing of controller node components by different controller nodes; in other words, the degree of controller node interaction and interdependence, and the multidimensional space of controller node networks. A third, simpler feature is the increase in the number, per organism or per cell type, of functional genes, a counterpart to the old complexity measure represented by the number of species of mRNA (Galau et al. 1974).

In the present context, it would, in fact, be more accurate to speak of the increase in the number of distinct controller nodes per organism, rather than in the number of genes. In principle, duplicate coding sequences can be identically controlled, at least insofar as the number and nature of factors involved are concerned; a probable example is provided by groups of histone genes. Admittedly, when no pervasive recent gene conversion events occurred that equalized all duplicated promoter or enhancer sequences, genes controlled by the same factors may nevertheless show variations in factor affinities and correlated variations in transcription rates. Yet, the action of such nearly identical controller nodes would be merely additive, as long as the coding sequences have not functionally diverged to a significant extent. As more of the same gene product is produced, the phenotype may thereby be modified, but the complexity of the gene interaction network as well as that of the organism could be considered as having remained unchanged. However, when the contingent of regulatory factors acting on a duplicate gene begins to diverge with respect to their number or nature, the number of distinct controller nodes increases by one unit, and there is a small change in the complexity of the gene interaction network.

We do not know at present how three aspects evolved over time in the ancestry of higher organisms: (1) the complexity of controller nodes (namely, the number of interacting factors, cofactors, and DNA elements per gene), (2) the extent and mode of the interweaving of controller nodes through shared components, and (3) the number of controller nodes per genome. We are aware of an evolutionary increase in the number of genes, though not informed about the somewhat different number of distinct controller nodes; we know little as yet about the multidimensional fabric woven among controller nodes, which includes time as one of the dimensions (we know, for example, that the TATA-box factor or the CCAAT-box factor is shared among many controller nodes, cf. Boulikas 1994); nor, in regard to individual controller nodes, do we have a hint about the mean and variance of their complexity as a function of what can be referred to only somewhat vaguely as evolutionary grades.

Such questions need to be answered despite the intricacies of their pursuit. Whatever the answers may be, the process of adopting a preexisting, additional factor into a regulatory complex must have been a frequent occurrence in evolution. Indeed, some mutation impairing gene expression, representing in the homozygote a mild to moderate controller gene disease (Zuckerkandl 1964), can always be expected; some such mutations will be fixed by random drift; and, among compensatory mutations that restore normal gene expression, there will, from time to time, be some that increase gene interaction complexity. A fraction of these mutations will be fixed in populations, so that the organisms may be expected, thereby, to be driven to higher complexity, as a byproduct, we may reemphasize, of a process whose immediate function is different. As a result of this type of process being repeated perhaps in many parts of the genome in very early evolution, and in a few parts in later evolution, so-called progressive evolution (anagenesis) (e.g., Zuckerkandl 1976b) occurs, provided that an anticipated correlation between the status of reputed "higher organism" and higher controller node complexity can be substantiated. As suggested, in later evolution, only subgroups of genes are likely to contribute to differences in controller node complexity among organisms. This caveat notwithstanding, there is a good chance that, at least in some respects, morphologically simpler organisms are based on simpler networks of interaction among informational macromolecules. This should apply in various degrees both to the number of network components per genome, namely, the number of genes (number of controller nodes), and to the number of interaction pathways taking aim at individual genes (complexity of individual controller nodes). The second number could be referred to informally as the size of gene-centered regulatory kits. The average size of the regulatory kits, and the range of their sizes according to categories of genes, comparing, say, archaea, bacteria, yeast, Caenorhabditis, Drosophila, Strongylocentrotus, zebrafish, and mammals, would be an informative measure at the molecular level of organismal complexity. (The parasitic crab Sacculina could be investigated for a possible evolutionary decrease in controller node complexity at the parasitic stage, a decrease perhaps characteristic of parasitism.)¹ To provide a sufficient number of such measures is technically feasible, though a monumental task.
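The first step of the sequence described above, fixation of a mildly deleterious mutation by random drift, can be illustrated numerically. The sketch below is an assumption-laden illustration rather than a calculation made in this paper: it uses the standard diffusion approximation for the fixation probability of a single new mutant, u = (1 - e^(-2s)) / (1 - e^(-4Ns)), and assumes a diploid population whose effective size equals its census size N, with additive selection coefficient s. The function name, the population size, and the selection coefficients are illustrative choices.

```python
import math


def fixation_probability(s, N):
    """Diffusion approximation for the fixation probability of one new mutant
    copy (initial frequency 1/(2N)) in a diploid population of size N.

    Assumptions: effective size equals N, additive selection coefficient s,
    and |4Ns| small enough that the exponential does not overflow.
    """
    if abs(s) < 1e-12:
        return 1.0 / (2 * N)  # neutral limit of the formula
    # (1 - e^(-2s)) / (1 - e^(-4Ns)), written with expm1 for accuracy
    return math.expm1(-2.0 * s) / math.expm1(-4.0 * N * s)


N = 1000  # illustrative population size
neutral = fixation_probability(0.0, N)
slightly_deleterious = fixation_probability(-1e-4, N)  # 4Ns = -0.4
strongly_deleterious = fixation_probability(-1e-2, N)  # 4Ns = -40

print(f"neutral:              {neutral:.3e}")
print(f"slightly deleterious: {slightly_deleterious:.3e}")
print(f"strongly deleterious: {strongly_deleterious:.3e}")
```

Under these assumptions a mutation with |4Ns| below 1 is fixed nearly as often as a neutral one, whereas a strongly deleterious mutation is effectively never fixed. This is the sense in which mild impairments of gene expression can accumulate by drift and create the occasion for compensatory changes.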

Given the nature of the gene interaction system, progressive evolution, in the guise of increased gene interaction complexity, emerges as a consequence of the structure of this system itself. "Progressive," here, as definitely should go without saying, implies no value judgment. It refers to the objective existence of a hierarchy of levels in the biological world, with the higher levels being players that appeared later in evolution. As has been implicit in the unbroken line of mechanistic explanations of natural phenomena, the higher biological levels, instead of requiring an "élan vital" or "Entelechie," would be spontaneously built up from the lower levels, a view strongly held also by Dennett (1995) and exceptionally well documented by him. Part of a testable molecular mechanism for progressive evolution is now proposed (see also Zuckerkandl 1994). In this connection, we may be able to observe, through a particular sequence of events in mutational gene impairment and restoration, an intrinsic drive toward higher gene interaction complexities and, correlatively, toward the evolution of higher organisms. For reasons uncertain, this intrinsic drive is effective along certain lines of descent only, and then only over finite evolutionary periods. For example, the trend has long since ceased to be manifest in the ancestry of contemporary Annelid worms. Although rare, it is nevertheless ever-recurring. Discovering the incidence of complexity increases or decreases in the gene and factor interaction systems, the tempo of evolution of this complexity, and the pattern of distribution of the increases in complexity over subgroups of genes, notably, genes active at different stages of development: all of this begs for an investment in a research effort of a magnitude similar to that of the genome sequencing projects. A certain deepening of our understanding of biology is available at that price. This field should receive much more attention than it has. Premature? Hardly. Attention itself would enhance timeliness.

It is remarkable that one can conceive of the fixation by random drift of some slightly deleterious mutations as a precondition for a biological process that is involved in a drive toward higher gene network and organismal complexity, one of evolution's greatest accomplishments. It is remarkable also that an intrinsic drive toward increased biological complexity could appear as a mere byproduct of a molecular process that is very simple in its mechanistic principle. Then again, much of what evolution accomplishes is "mere byproduct" of the business at hand. Byproducts of functional structures and processes are the motor of evolution's "imagination." Furthermore, the genius of evolution is to be able to make a mountain out of a molehill. This, I expect, is also the genius of some of the neutral and nearly neutral mutations.

In summary, reference has been made to the importance of neutral and nearly neutral mutations as pathways toward new functions of gene products, toward increasing specificity of gene regulation, and toward increasing the complexity of the gene interaction network and, therefore, of the organism as a whole. In the mix of basic biological mechanisms of evolution, random drift, contrary to what I, for one, thought in earlier years, thus tends to be a true partner in creativity. The creative role of genetic drift in evolution is apparently much greater than could have been realized before Motoo Kimura came and left his indelible mark.

¹ The increasing complexity of "higher" organisms would also be measurable by the number of molecular messengers that circulate from cell to cell, the messages being sent from genes in one cell to genes in others; a number that depends both on the number of genes in a subset of genes, namely, the regulatory genes (genes coding for regulatory factors), and on the complexity of the germane controller nodes. An increase in controller node complexity again enters into the picture here, since, in a given cell of a "higher" organism, the same genes also found in lower organisms often may need to "understand" a greater variety of instructions that reach them from other cells. "Understanding," on the part of a gene, simply means permitting a factor to have an effect on its expression, and a controller node is an inventory of such factors. It would, however, be hard to provide an experimental measure of organismal complexity based on the total number of different messages that are emitted by at least one type of cell and can be received and acted upon in at least one other.

Acknowledgments. The author's sincere thanks to Henry J. Vogel for a critical reading of an earlier version of this paper.


References

Barnard EA, Cohen MS, Gold MH, Kim JK (1972) Evolution of ribonuclease in relation to polypeptide folding mechanisms. Nature 240:395-398

Boulikas T (1994) A compilation and classification of DNA binding sites for protein transcription factors from vertebrates. Crit Rev Eukaryot Gene Expr 4:117-321

Bult CJ et al. (1996) Complete genome sequence of the methanogenic archaeon, Methanococcus jannaschii. Science 273:1058-1073

Dennett DC (1995) Darwin's dangerous idea. Simon & Schuster, New York, 586 pp

Galau GA, Britten RJ, Davidson EH (1974) A measurement of the sequence complexity of polysomal mRNA in sea urchin embryos. Cell 2:9-22

Herskowitz I (1989) A regulatory hierarchy for cell specialization in yeast. Nature 342:749-757

Huynen M (1996) Exploring phenotype space through neutral evolution. J Mol Evol (in press)

Jarvis EE, Clark KL, Sprague GF (1989) The yeast transcription activator PRTF, a homolog of the mammalian serum response factor, is encoded by the MCM1 gene. Genes Dev 3:936-945

Kirchhamer CV, Davidson EH (1996) Spatial and temporal information processing in the sea urchin embryo: molecular and intramodular organization of the CyIIIa gene cis-regulatory system. Development 122:333-384

Ohta T (1992) The nearly neutral theory of molecular evolution. Annu Rev Ecol Syst 23:263-286

Ptashne M (1992) A genetic switch. 2nd ed. Cell Press & Blackwell Scientific, Cambridge, MA, 192 pp

Robertson LM, Kerppola TK, Vendrell M, Luk D, Smeyne RJ, Bocchiaro C, Morgan JI, Curran T (1995) Regulation of c-fos expression in transgenic mice requires multiple interdependent transcription control elements. Neuron 14:241-252

Yuh C-H, Davidson EH (1996) Molecular cis-regulatory organization of Endo-16, a gut-specific gene of the sea urchin embryo. Development 122:1069-1082

Zuckerkandl E (1964) Controller gene diseases. J Mol Biol 8:128-147

Zuckerkandl E (1975) The appearance of new structures and functions in proteins during evolution. J Mol Evol 7:1-57

Zuckerkandl E (1976a) Evolutionary processes and evolutionary noise at the molecular level, II. J Mol Evol 7:269-311

Zuckerkandl E (1976b) Programs of gene action and progressive evolution. In: Goodman M, Tashian RE (eds) Molecular anthropology. Plenum Press, New York, pp 387-447

Zuckerkandl E (1979) Controller node complexity: a measure of the degree of gene coordination. J Mol Evol 14:311-321

Zuckerkandl E (1994) Molecular pathways to parallel evolution: I. Gene nexuses and their morphological correlates. J Mol Evol 39:661-678

Zuckerkandl E, Hennig W (1995) Tracking heterochromatin. Chromosoma 104:75-83

