Knight, Landweber and Yarus, p. 1
Tests of a stereochemical genetic code
Rob Knight, Laura Landweber† and Michael Yarus
Department of Molecular, Cellular and Developmental Biology
University of Colorado
Boulder, CO 80309-0347
† Dept. of Ecology & Evolutionary Biology
Princeton University
Princeton, NJ 08544-1003
For Translation Mechanisms (Eds. Jacques Lapointe and Lea Brakier-Gingras), Landes Bioscience
Abstract
Does the genetic code assign similar codons to similar amino acids because of chemical
interactions between them? Unlike adaptive explanations, which can only explain the
relative positions of amino acids in the code, stereochemical explanations could tie codon
assignments to absolute, verifiable rules. However, modern translation encodes amino
acid sequences without direct codon/amino acid interaction. If there is a relationship
between RNA sequences with intrinsic affinity for amino acids and the modern genetic
code, we must therefore explain a historical transition in which direct interactions were
abandoned.
We review the literature and find no evidence that interactions between short sequences
(mono-, di- or trinucleotides) and amino acids are strong or specific enough to originate
genetic coding. Instead, interactions between amino acids and longer nucleic acid
sequences appear to recapture some assignments of the modern code. For example, real
codons are concentrated in newly selected amino acid binding sites to a greater extent
than codons from similar, but randomized, codes. This implies that some initial coding
assignments were made by interaction with macromolecular RNA-like molecules, and
have survived. Thus subsequent selection, such as selection to minimize coding errors,
has not erased all primordial chemical relationships. Retention of initial stereochemical
codon assignments for three of six amino acids (arginine, isoleucine, and tyrosine, but not
glutamine, leucine or phenylalanine) is strongly supported.
Combining data for the six amino acids, significant stereochemical relationships are of
more than one type: codons and anticodons are each concentrated in some binding sites.
Further work will be required to catalog the relationships between amino acids and
binding site sequences, especially if, as now appears, more than one type of interaction
has been transmitted to the modern code.
1. The Codon Correspondence Hypothesis
The codon correspondence hypothesis, tested in any stereochemical theory of the origin
of the genetic code, may be stated:
For each amino acid there is a coding sequence for which it has the greatest association.
The association between these sequences and amino acids influenced the form and
content of the genetic code.
The codon-correspondence hypothesis is compatible with establishment of the genetic
code either before or during the RNA world. A direct association between mono-, di- or
trinucleotides and their cognate amino acids would suggest that the code arose before
complex RNA catalysts, since trinucleotides would likely occur before the reproducible
synthesis of longer oligonucleotides. Alternatively, an association between trinucleotides
and their cognate amino acids that requires RNA tertiary structure would suggest that the
genetic code arose in the RNA world (the earliest evolutionary time at which long RNA-
like molecules were available). Larger RNAs loosen the constraint on the role of the
coding sequences, which could then support the amino acid binding site but need not
comprise it entirely. Amino acid/RNA complexes might have functioned in translation
from the beginning, but alternatives abound. Their original functions may have been
varied: as coenzyme sites for ribozymes 1, to stabilize RNA double helices 2, or to label
tRNA-like genomic tags 3, 4.
2. Chemical Associations: A Historical Perspective
The idea that the genetic code might be stereochemically determined predates the
elucidation of the code. Gamow’s ‘diamond code’, in which amino acids would fit
specific pockets bounded by four DNA bases, relied on direct interaction between amino
acids and nucleic acids 5. More abstruse possibilities exist: mathematical (and even
numerological) schemes for solving the coding problem abounded before the actual
codon assignments were fully uncovered (reviewed in refs 6, 7).
The structure of the code showed clear patterns. Chemical explanations for such order
were sought by two routes. Physicochemical theorists 8, 9 hoped to measure interaction
between bases and amino acids. This might have resulted in chromatographic co-
partitioning on the early earth, which would be reproducible today by chemical
techniques. In contrast, stereochemical theorists 10, 11 assumed that molecular modeling
could reveal molecular complementarities between amino acids and coding triplets.
Stereochemistry/Molecular Models: The first chemical investigations of codon
assignments were via molecular modeling. Molecular models have been said to prove
that the genetic code was established in quite varied ways. For example, amino acids
might pair with codons 12 or anticodons 10, 13 in the tRNA. Codonic mononucleotides and
α-helical homopolymeric amino acids may bind each other specifically (this model
“correctly predicts the glycine codon GGG”, although it unfortunately fails to predict any
other)14. Free glycine and free nucleotides 15 may have affinity, or free amino acids may
intercalate into adjacent bases in the anticodon doublet through H-bonding between
methylene groups and the π-electrons of the bases 16. Specific 2’ aminoacylation of the
second position anticodon base may have been mediated by the first position anticodon
base 17. Amino acids may be able to intercalate between first and second position bases in
double-stranded RNA molecules 18. Cavities caused by removal of the second-position
codon bases in B-DNA may accept amino acids 19. Perhaps amino acids nestle into a
pentanucleotide cup with the anticodon in the center 20. Pairing between amino acid side-
chains and cavities in a complex of four nucleotides (C4N) on the acceptor stem of tRNA
21 might occur. Or perhaps amino acids can bind their codons transposed 3’→5’ 22, 23. A
double-stranded complex of the codon and anticodon has also been suggested 18, 24.
The modeling approach was tarnished early on when a claimed association between
codons and amino acids 12 relied on models that had been built backwards, 3’ to 5’ 25.
Nevertheless, even the idea that there is a relationship between reversed codons and
amino acids has been defended 22, 23.
Clearly, modeling methods used thus far are not sufficiently constrained. As a result, they
allow too many solutions. Additionally, these approaches tend to assume that the entire
code was uniquely determined by stereochemical fit (and even that modern variant codes
reflect fits induced by different environments 26). If amino acids were added to the code
over time and for different reasons, as seems probable 27, 28, such explanations are
overstatements that may prevent confirmation even if the basic hypothesis is true.
Physicochemical Effects/Chromatography: A second line of evidence comes from
chromatography. Because chromatographic properties of amino acids show regular
variation in the genetic code, any mechanism for the code’s origin must account for this
organization. Various studies have shown that the code conserves certain properties, such
as polarity. The polar requirement of amino acids (the ratio of the log relative mobility to
the log mole fraction water in a water-pyridine mixture) orders coding assignments
impressively. Amino acids with U in the second position of their codon are hydrophobic
while those with A are hydrophilic; those with C are intermediate, and those with G are
mixed. Furthermore, codons that share a doublet have almost identical polar requirements
even if not otherwise related (e.g. His and Gln; possibly Cys and Trp) 6, 8, 9, 29. Thus the
code is ordered with respect to amino acid properties, but such evidence cannot tell us
whether the code was optimized to minimize errors due to mutation or established by
direct chemical interactions 28.
Nor does such chemical order suggest a mechanism for actual codon assignments.
Partitioning of amino acids and nucleotides between aqueous and organic phases, as in a
primordial oil slick, might have associated AAA codons with Lys and UUU codons with
Phe 30. However, none of these molecules are produced in prebiotic syntheses 31 and a
further hypothesis is required to bring chromatographic partitioning to bear on codon
assignment. Analysis with two further chromatographic systems, water/micellar sodium
dodecanoate and hexane/dodecylammonium propionate-trapped water, confirmed the
previous hydrophobicity scales in a context closer to prebiotic conditions 32. The relative
hydrophobicity of the homocodonic amino acids (Phe UUU, Pro CCC, Lys AAA, Gly
GGG) and the four nucleotides in an ammonium acetate/ammonium sulfate system
showed an anticodonic association, and for dinucleoside monophosphates the association
was also with the anticodon, rather than the codon, doublets 33. Multivariate analysis of
the properties of dinucleoside monophosphates and amino acids, focusing on
hydrophobicity, revealed many strong (p < 0.001) correlations between anticodons and
amino acids, but not between codons and amino acids 34.
Thus, chromatographic data suggest anticodonic, rather than codonic interactions (note
the underlying assumption that molecules with similar properties interact). However,
although chemical partitioning on the early earth could conceivably have led to specific
cofractionation between particular nucleotides (or oligonucleotides) and prebiotic amino
acids, there do not seem to be consistent correlations. Chromatographic separation on
various plausibly prebiotic surfaces (silicates, clays, hydroxyapatite, calcium carbonate,
etc.) showed that, on a silica surface under an aqueous solution of MgCl2 and
(NH4)H2PO4, Ala comigrates with CMP and Gly comigrates with GMP 35. Ala is assigned
the GCN codon class, while Gly has the GGN codon class. However, there was no strong
separation between GMP and UMP or between AMP and CMP even on silica, and many
prebiotic amino acids (Pro, Ile, Leu, Val) fell well outside the range of the nucleotides.
The situation was even worse on other surfaces, which did not provide any amino acid-
nucleotide concordances. Thus, the data do not support the conclusion that copartitioning
of nucleotides and amino acids led to the genetic code 35, especially in the absence of a
plausible mechanism for transforming a copartition into modern codon assignments.
Physicochemical Effects/Direct Interactions: The third type of evidence comes from
tests for direct interaction between nucleotides and amino acids. Mononucleotides show
nonspecific but charge-dependent interactions with polyamino acid chains, as measured
by the change in turbidity of the cosolution 14. Affinity chromatography, which tested
retardation of the four nucleotide monophosphates by each of nine amino acids (Gly, Lys,
Pro, Met, Arg, His, Phe, Trp, Tyr) immobilized by their carboxyl groups, showed no
association between binding strength and codon or anticodon assignments 36. Interactions
between free amino acids and poly(A), as measured by the chemical shift of the C2 and C8
protons of A, are also “not easily reconcilable with the genetic code” 37. Further affinity
chromatography and NMR experiments on the interaction between amino acids and
mono-, di-, and trinucleotides showed that amino acids did selectively interact with
specific bases 38, although the interactions did not parallel the genetic code. Imidazole-
activated amino acids esterify the 2’-OH groups of RNA homopolymers with some
specificity 39. However, since the two amino acids tested, phenylalanine and glycine,
much preferred poly(U) over any other polynucleotide, the results do not support the
authors’ contention that this mechanism led to the present codon assignments.
The dissociation constants of AMP complexes with the methyl esters of amino acids also
show selectivity, ranging about seven-fold from Trp (120 mM) to Ser (850 mM) 40.
However, neither Trp (UGG) nor Ser (UCN, AGY) has particularly many or few A
residues in their codons or anticodons, while the amino acids that do (Lys AAR, Phe
UUY) have intermediate dissociation constants (320 and 196 mM respectively). These
data did show a strong negative correlation between the association constant (1/KD) and
amino acid hydrophobicity. There are positive correlations between the dissociation
constant and the number of codons assigned to the amino acid, and to frequency of the
amino acid in proteins 40. Condensation of dipeptides of the form Gly-X in the presence
of AMP, CMP, poly(A) and poly(U) was mainly enhanced by the anticodonic
nucleotides, where a pattern was apparent 41. Different amino acids differ in their ability
to stabilize poly(A)-poly(U) and poly(I)-poly(C) double helices 2, although the order is
similar in each case and so cannot have contributed to the establishment of the genetic
code. Finally, D-ribose adenosine biases esters with L-Phe but not D-Phe towards the 3’-
OH (the pattern is reversed with L-ribose adenosine). Thus, single nucleotides
moderately regio- and stereo- selectively aminoacylate themselves 42.
Recent evidence also suggests that self-assembly of purine monolayers differentially
affects adsorption of amino acids. The spacing between residues is consistent with
peptide bond distances: such self-assembly might have formed a primordial code,
although apparently one very different from the modern genetic code 43-45.
Summary: Two comprehensive reviews of these and other data 46, 47 suggested that if the
genetic code were established by interactions between simple molecules (not more
complicated than dipeptides or trinucleotides) and amino acids, then the greatest specific
interaction was between amino acids and their anticodon nucleotides. However,
individual experiments were equivocal or correlated with both anticodons and
occasionally codons, so no strong direction is evident in the data.
The absence of obvious, strong or reproducible correlations from these highly varied
approaches, considered alone or especially in sum, weakens the hypothesis that the code
rests on the chemistry of trinucleotide-amino acid interactions. We suggest instead a later
origin for the code, involving larger RNAs.
3. Adaptors and Adaptation
Perhaps the simplest explanation for the observed order in the genetic code 11, 48-50 is that
codon assignments were determined by stereochemical association between
oligonucleotides and amino acids 8-10, 12. This mechanism would assign similar amino
acids to similar codons because of intrinsic affinity, rather than as a result of natural
selection among alternative codes. Although the resulting codon assignments might
appear adaptive, in that they reduce various errors relative to other possible codes, they
would not be an adaptation.
Stereochemical pairing: Several such stereochemical schemes are conceivable. Thus,
the primordial sequences with which pairing occurred can either be the actual codons, or
some simple transform thereof 9. As detailed in Stereochemistry/Molecular Models
above, interactions have been proposed between amino acids and codons 12, anticodons 10,
13, codons read 3’→5’ instead of 5’→3’ 22, 23, a complex of four nucleotides (C4N)
formed by the three 5’ nucleotides of tRNA with the fourth nucleotide from the 3’ end 21,
and a double-stranded complex of the codon and anticodon 18, 24.
A fundamental problem that all stereochemical models share is that codons and amino
acids are never stereochemically linked in modern translation. Thus an implied
evolutionary shift has occurred in which direct associations were lost, but their logic was
nevertheless transmitted to the present. Such a conservative transition, required to make a
stereochemical origin observable, is supported by a strong argument from continuity. The
shift to indirect associations must occur in a translation apparatus that is making useful
peptides (otherwise the translation apparatus itself could not have been selected). Thus
the logic of the older direct interactions must be preserved or the altered translation
apparatus will be of no use. After consideration of the evidence, we discuss this transition
to indirect coding again.
The existence of adaptors, tRNAs and aminoacyl-tRNA synthetases, in the modern
system allows codon assignments to be readily shuffled among amino acids 51.
Accordingly, adaptive evolution can erase primordial codon assignments. Thus we would
only expect some amino acids to show codon/site associations, especially if others were
added to the code later. Consequently, it is remarkable that any associations persist to the
present 52.
Amino Acid-Binding RNA: Most attention to sequence/binding site associations
initially focused on arginine, since arginine binds specifically to two completely distinct
classes of natural RNA molecules. The first class is the guanosine-binding site of self-
splicing group I introns, which binds arginine as a competitive inhibitor. The
guanidinium side-chain of arginine is similar in structure to the Watson-Crick face of G
53. A conserved Arg codon confers this activity, and the binding site is almost invariably
composed of several Arg codons in close juxtaposition 54, 55. The second class has been
extensively studied because of potential medical importance: free arginine can mimic the
natural interaction of HIV Tat peptides with TAR RNA 56. In this case, however, no Arg
codons are conserved at the binding site 57.
Natural amino acid-binding RNAs are few; more significantly, they can provide only
anecdotal evidence for codon/binding site interactions because they are almost certainly
under strong selection for properties other than binding to the free amino acids. However,
SELEX or selection-amplification, a technique for directed molecular evolution 58-60,
makes it possible to select those RNA molecules that perform a desired catalytic or
binding function from large random pools (see ref. 61 for review). This technological
advance makes it possible to find out whether RNA molecules that bind to particular
amino acids share any characteristic motifs at their binding sites.
Aptamers have now been isolated for a variety of amino acids (Table 7.1), including
hydrophobic amino acids such as valine 62, phenylalanine/tyrosine 63, isoleucine 64,
tyrosine 65, leucine (I. Majerfeld and M. Yarus, unpublished data), and phenylalanine 65a,
and hydrophilic amino acids such as glutamine (G. Tocchini-Valentini, unpublished data)
and citrulline, which is not normally found in proteins 66. However, RNA aptamers for
arginine are most abundant in the literature, and have been independently isolated in
several different experiments 66-73. Since structural information is available for many of
these sequences, it becomes possible to ask whether particular sequences are
overrepresented at recently selected binding sites, and, if so, whether these sequences
have any relationship to the modern genetic code.
4. Statistical Evidence for Triplet/Binding Site Associations
The theory that the code arose by stereochemical means is both specific and unique; its
predictions are explicit and different from other prevalent theories. Coevolution theories
(that coding was extended along biosynthetic pathways 74) are typically agnostic about
which trinucleotide-amino acid pairing established the initial codon assignments, but
predict that such pairings, if they exist at all, can account for only a small part of the
codon catalog. Optimization theories (that coding minimizes errors in expression 75)
predict no correspondence at all between trinucleotides and amino acid binding sites.
Evolution of Binding Triplets: Assuming that original amino acid binding sites were
RNA-like, they could have evolved into any of the components of modern translation:
tRNA, rRNA, mRNA, or primitive aminoacyl-tRNA synthetases (subsequently replaced
by protein enzymes). Depending on which modern translation component descended
from ancient amino acid interactions, we predict different associations between coding
nucleotides and amino acids. If binding sites evolved into tRNAs, for instance, the
anticodons should be overrepresented in amino acid binding sites, whereas if they
evolved into mRNA the codons should be overrepresented 76.
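
The three triplet classes distinguished in this chapter (codons, anticodons, and codons reversed 3’→5’) can be made concrete with a minimal Python sketch. It assumes strict Watson-Crick complementarity (wobble and modified bases are ignored) and counts overlapping triplets in every reading frame; the function names and the counting rule are illustrative assumptions, not the procedure used for the tables.

```python
# Illustrative sketch only: names and counting rule are assumptions.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon(codon):
    """Watson-Crick reverse complement of a codon, written 5'->3' (no wobble)."""
    return "".join(COMPLEMENT[base] for base in reversed(codon))

def reversed_codon(codon):
    """The same three bases read in the opposite (3'->5') order."""
    return codon[::-1]

def count_triplets(seq, triplets):
    """Count overlapping occurrences of any listed triplet, in all frames."""
    return sum(seq[i:i + 3] in triplets for i in range(len(seq) - 2))

# Arginine's six codons in the standard code, and the derived triplet sets.
ARG_CODONS = {"CGU", "CGC", "CGA", "CGG", "AGA", "AGG"}
ARG_ANTICODONS = {anticodon(c) for c in ARG_CODONS}
ARG_REVERSED = {reversed_codon(c) for c in ARG_CODONS}
```

If binding sites became tRNAs, a site sequence would be expected to score higher against `ARG_ANTICODONS` than against `ARG_CODONS`; if they became mRNA, the reverse.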
The selection of RNA molecules (aptamers) that bind amino acid ligands has made such
conjectures testable (Table 7.1). Because in vitro selection searches a large space of
possible sequences for optimal or near-optimal “solutions” to particular binding
problems, such directed evolution might be able to recapitulate primordial interactions
between amino acids and short RNA sequences. If amino acids interact favorably with
coding RNA sequences, this relation might be observed, or even proven. Since aptamers
can be selected for each amino acid, and since the specific nucleotides important to
binding can be determined, standard statistical tests for association (such as the χ2 or G
tests) will reveal any consistent relation between binding-site nucleotides and nucleotides
in coding sequences 77.
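
As a minimal sketch of such a test, the G statistic for a 2 x 2 table can be computed directly. The table layout assumed here (triplet is or is not a cognate codon, by position inside or outside the binding site) and the counts in the usage note are hypothetical and are not taken from Tables 7.2 to 7.4.

```python
import math

def g_test_2x2(a, b, c, d):
    """G statistic and p-value (1 d.f.) for the 2x2 table [[a, b], [c, d]].

    Assumed layout (hypothetical): rows = inside / outside binding site,
    columns = cognate triplet / other triplet.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            expected = row_totals[i] * col_totals[j] / n
            if o > 0:  # a zero cell contributes nothing to G
                g += 2.0 * o * math.log(o / expected)
    # Chi-square survival function with 1 d.f.: P(X > g) = erfc(sqrt(g / 2)).
    p = math.erfc(math.sqrt(max(g, 0.0) / 2.0))
    return g, p
```

For example, `g_test_2x2(30, 10, 10, 30)` gives a G of about 20.9, while a table with no association such as `g_test_2x2(10, 10, 10, 10)` gives G = 0 and p = 1.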
Such a search for motifs faces predictable difficulties. RNA is more versatile than might
have once been thought, and many oligomers often bind an amino acid. The diversity of
RNAs that bind arginine, for example, shows that efforts to emulate a unique primordial
RNA for each amino acid would be futile 57. Recurrence of specific sequence motifs in
amino acid aptamers, such as codons or anticodons, cannot prove that similar interactions
led to the establishment of present codon assignments. However, suppose that coding
sequences embody such general interactions that they will still be detectable in the most
probable modern binding sites. Proof of any specific pairings at all would show that the
specificity existed to originate a genetic code. If specific pairings detected with in vitro
selection actually match present codon assignments, then similar processes in ancient
translation are supported. If there are frequent, strong associations between present
codons or anticodons and amino acids, their involvement in the origin of the code is the
only plausible explanation.
Binding Site Preferences: That any codon/binding site associations could survive to the
present has been questioned 78. However, the association between arginine and its binding
sites is exceptionally strong, and has proven remarkably robust to statistical
methodology, choice of binding sites, and choice of sequences from selected pools 52, 76-78.
In particular, arginine binding sites show strong associations with arginine codons (Table
7.2), but not anticodons (Table 7.3), codon or anticodon sets for other amino acids, other
groups of 4+2 codons incorporating a family box plus a doublet, or other short motifs.
This relationship remains highly significant even with many plausible modifications.
Sequences where the selected binding site overlaps the constant regions can be excluded,
the data can be corrected for nucleotide bias at binding sites and alternative sequences
can be chosen from reported pools without altering the conclusion.
Arginine might have been unique: it acts as a nucleotide mimic 53, perhaps more so than
other amino acids. However, significant associations between Tyr aptamer binding sites
and codons have been reported 52, and Ile aptamers contain conserved Ile codons at their
active sites 64. Data from several other amino acids have become available, allowing a
more general test of the association between binding sites and codons. We
now extend the analysis to all available amino acids (Table 7.1) and reassess hypotheses
about specific associations.
Testing Triplet/Site Associations: Codons occur more often in binding sites than
expected for each of the six amino acids for which data are available, an improbable
outcome itself (P = (0.5)^6 = 0.016). Individually, the arginine aptamers showed a
significant codon/site association only. Tyrosine and isoleucine aptamers showed
significant associations with both codons and anticodons: except for the association
between tyrosine and its codons, these relationships persist even when corrected for six
multiple comparisons (P < 0.01). Glutamine, leucine and phenylalanine have no
significant tendency to locate codons or anticodons in their binding sites (when corrected
for multiple comparisons). The most sensitive tests combine all data; then we observe
highly significant associations overall with both codons and anticodons, even when the
single most influential amino acid is excluded from the analysis (P < 10^-6 in all cases).
Thus there is reason to believe that codons and anticodons are associated with binding
sites, and this conclusion does not depend on any one selection or set of binding sites.
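
The arithmetic behind two of these statements can be sketched briefly. The first is the chance that all six amino acids show an excess of codons if each were equally likely to show an excess or a deficit. The second is a correction for six comparisons; the text does not name the correction used, so the Bonferroni adjustment below is an assumption chosen for simplicity.

```python
def sign_probability(n_amino_acids):
    """Chance that every one of n independent 50:50 outcomes goes the same way."""
    return 0.5 ** n_amino_acids

def bonferroni(p_value, n_comparisons):
    """Bonferroni-adjusted p-value (assumed method), capped at 1."""
    return min(1.0, p_value * n_comparisons)

p_all_six = sign_probability(6)   # 1/64 = 0.015625, quoted as 0.016
p_adjusted = bonferroni(0.001, 6) # a raw P of 0.001 becomes about 0.006
```

Under this assumed correction, a raw P below about 0.0017 is needed for an individual amino acid to remain significant at the 0.01 level after six comparisons.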
On the other hand, controls show that this method can rule out certain possibilities. There
was no significant association for any amino acid, or for the set as a whole, with the
codons reversed 3’ to 5’, indicating that this hypothesis can be clearly rejected.
It is possible that the 21 codon (or anticodon) sets are an unfair comparison class, since
they range in size from 1 to 6 codons. A less precise, but perhaps more robust, test is to
see whether there is a significant association between the amino acid binding sites and the
codon (or anticodon) that contains the cognate doublet: this reflects the intuitively
plausible idea that the primitive code may have assigned amino acids only to family
boxes. However, doublet analysis (Table 7.4) does not greatly change the outcome.
Significant associations are observed for both doublets and codons/anticodons. Thus,
again, the results to date suggest associations with both codons and anticodons.
We can carry these conclusions a step further by freeing them of the assumptions
required even for standard statistical tests. If there is an association between the triplets
found at amino acid binding sites and the modern genetic code, it should be found only
with the actual genetic code and not with randomized versions of it. Accordingly, we
generated many alternative codes, and tested for codon/binding site associations. This
preserves important aspects of the experimental results, such as the spatial correlations
within binding sites (they occur in specific sections of the molecule), and the influence of
the occurrence of each triplet on the probability of finding others. In order to eliminate
dependence on any particular method for generating variant codes, we used several quite
different permutation methods.
An ISO C program randomized the code according to the following schemes:
1. Codon permutation: a codon can randomly and independently take on any
identity (including its real one). This keeps the number of codons per amino
acid constant, but usually completely disrupts the fine structure of the code
(such as wobble relations). This potentially generates 64! = 1.3 × 10^89 possible
codes.
2. Amino acid permutation: any amino acid can randomly and independently
take any existing coding block(s), including those of stop codons. This
preserves the structure of the code entirely (the number and size of blocks for
codons are preserved, and their relative positions are preserved within the
coding table), but amino acids can be given different numbers of codons. At
one extreme, Arg, which normally has 6 codons split into a 4-block and a 2-
block, might end up with Trp’s single codon. This potentially generates 21! =
5.1 × 10^19 possible codes.
3. Codon block permutation. Keeping the structure of the code constant, we
randomly assorted amino acid identities among groups of codons of the same
size. For example, the CGN block assigned to Arg might be swapped with the
CCN block normally assigned to Pro, but could not swap with the single UGG
codon assigned to Trp. Treating the three Ile codons as a 2-block and a 1-block,
this leads to 8! × 14! × 4! = 8.4 × 10^16 codes with 8 4-blocks, 14 2-blocks,
and 4 1-blocks. This “n-block” scheme completely preserves the degeneracy
of the code, and also conserves the number of codons assigned to each amino
acid. Compared to the other randomization schemes, amino acids are far more
likely to retain some of their actual codons.
4. Base identity permutation: in addition to the block permutation of method 3,
this method randomizes the meaning of the first and second position base.
This partially disrupts the code’s structure (so that, for example, the UGN
codon block need not be split into blocks of 2, 1, and 1), but preserves the
degeneracy across a row and down a column. This multiplies the number of
codes from method 3 by a factor of (4! × 4!)/2 for a total of 2.4 × 10^19 codes,
and dramatically reduces retention of fragments of the present code.
5. Codon doublet permutation: like method 4, except that any codon doublet
independently takes on the meaning of any other codon doublet. This leads to
16!/(8! × 6! × 2!) = 360360 times as many codes as method 3, for a total of
3.0 × 10^22 possible codes. Both this and method 4 preserve the number of codons
assigned to each amino acid and their block structure (e.g. Arg will always
have a 4-block and a 2-block), but this method does not preserve the relation
between blocks of particular sizes as does method 4.
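
The sizes of these five code spaces, and the block permutation of method 3, can be sketched in a few lines of Python. This is not the ISO C program used for the analysis: the function `permute_blocks` and the six-block toy table are hypothetical and only illustrate the rule that amino acid labels are shuffled among codon blocks of equal size.

```python
import math
import random

# Number of distinct codes under each scheme, from the factorials in the list.
block_perm = math.factorial(8) * math.factorial(14) * math.factorial(4)
CODE_SPACE = {
    1: math.factorial(64),                       # about 1.27e89
    2: math.factorial(21),                       # about 5.1e19
    3: block_perm,                               # about 8.4e16
    4: block_perm * math.factorial(4) ** 2 // 2, # about 2.4e19
    5: block_perm * math.factorial(16)
       // (math.factorial(8) * math.factorial(6) * math.factorial(2)),
}

def permute_blocks(blocks, rng):
    """Shuffle amino acid labels among blocks with the same number of codons."""
    by_size = {}
    for index, (codons, _label) in enumerate(blocks):
        by_size.setdefault(len(codons), []).append(index)
    result = list(blocks)
    for indices in by_size.values():
        labels = [blocks[i][1] for i in indices]
        rng.shuffle(labels)
        for i, label in zip(indices, labels):
            result[i] = (blocks[i][0], label)
    return result

# Hypothetical fragment of the code table: (codon block, amino acid).
TOY_BLOCKS = [
    (("CGU", "CGC", "CGA", "CGG"), "Arg"),
    (("CCU", "CCC", "CCA", "CCG"), "Pro"),
    (("AGA", "AGG"), "Arg"),
    (("AAA", "AAG"), "Lys"),
    (("UGG",), "Trp"),
    (("AUG",), "Met"),
]
shuffled = permute_blocks(TOY_BLOCKS, random.Random(0))
```

Because labels move only within a size class, every amino acid keeps its number of codons, which is why this scheme retains more of the real code than the others.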
We generated 10 million randomized codes for each of the 5 schemes listed above, and
compared codon/site associations in observed amino acid binding sites with those found
in the actual code (Fig. 7.1). The “n-block” model (#3) is uniquely right-skewed, because
some of the codons can only swap with a few partners under this model (e.g. there are
only 4 blocks containing one codon) so that some of the present structure of the code will
often be preserved. Even under this highly constrained model, however, only 0.8% of
randomized codes give apparent associations between codons and binding sites better
than the actual code. For the other, more completely scrambled models, between 0.11%
(method 2) and 0.04% (methods 4 and 5) of all random codes do better than the actual
code. Said another way, real codons are more associated with real binding sites than in
99.2 to 99.96% of all randomized codes, even though randomized codes include
fragments of the actual code. Using Fisher’s method for independent probabilities rather
than performing a G test on the summed counts gave similar results (data not shown).
Thus, our result is general: it is sensitive neither to the choice of alternative codes nor
to statistical methodology. It is highly unlikely that a genetic code picked at random would
show an association between codons and binding sites as significant as that
actually seen with the real code. Randomization of anticodon assignments gives similar
results, but slightly less significant than for codons. Randomized anticodons are less
associated with binding sites than real ones in 99.2 to 99.5% of all codes. This small
difference in significance appears also in the statistical tests (Table 7.3).
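Fisher's method, mentioned above as an alternative to a G test on summed counts, combines k independent P values through the statistic -2 Σ ln p, which is chi-square distributed with 2k degrees of freedom. The sketch below is a minimal, standard-library illustration of that calculation; the function name is ours, and we do not claim that it reproduces the unpublished values referred to in the text.

```python
from math import exp, log

def fisher_combined_p(p_values):
    """Combine independent P values with Fisher's method.

    X2 = -2 * sum(ln p) follows a chi-square distribution with 2k degrees
    of freedom. For even degrees of freedom the upper tail has the closed
    form exp(-x/2) * sum_{i<k} (x/2)**i / i!.
    """
    k = len(p_values)
    half_x = -sum(log(p) for p in p_values)  # this is X2 / 2
    term, tail = 1.0, 0.0
    for i in range(k):
        tail += term
        term *= half_x / (i + 1)
    return exp(-half_x) * tail

# Per-amino-acid codon P values as listed in Table 7.3
# (Arg, Tyr, Ile, Gln, Leu, Phe).
codon_p = [3.4e-08, 4.8e-03, 6.5e-04, 4.2e-02, 1.8e-01, 2.7e-02]
combined = fisher_combined_p(codon_p)
```

With a single P value the function returns that value unchanged, which is a convenient sanity check.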
These controls argue strongly that the most probable modern RNA-amino acid binding
sites capture something of the essential nature of the code. In particular, a stereochemical
process involving macromolecular RNA-like binding sites containing codons, and
perhaps anticodons, gave rise to the present genetic code. Considering individual amino
acids, primordial RNA-like binding sites were probably relevant to the assignment of
codons for at least three of six amino acids for which we have data.
5. Concluding Remarks
We now return to the direct to indirect coding transition implied by every stereochemical
model. RNA amino acid binding sites contain sequences likely to be relevant to the
appearance of the code. Thus the logically predicted transition from direct to indirect
coding rests first on the ability of coding sequences to serve as structural elements in
amino acid binding sites, and then to subsequently serve in normal base pairing. Triplets
that became codons might begin as essential elements in binding sites (direct coding),
and later pair with primordial tRNAs (indirect coding). Triplets that became anticodons
might begin within binding sites (direct coding), then employ their better-known
base-pairing activity when they begin to act as anticodons (indirect coding). The conservative
logic of the direct to indirect transition, required by argument from continuity, is implicit
as soon as it is known that nucleotide triplets can be essential elements of amino acid
binding sites (compare the DRT theory 57).
Descendants of the original amino acid-binding sites could play four possible roles: as
tRNAs, mRNAs, ribosomes, or aminoacyl-tRNA synthetases. All these activities are
known to be possible activities for RNA 79-85, because they exist in modern selected
parallels. With present data, it appears that arginine may have been bound in primordial
sites containing sequences that became codons in mRNA. We found no strong evidence
for association between glutamine, leucine and phenylalanine and their coding sequences.
These are negative results based on limited data; however, these codons may have been
assigned by other means during later code evolution. Tyrosine and isoleucine present a
case we had not anticipated, in which both codons and anticodons are overrepresented
(though not because they are paired in the molecules). We cannot confidently specify the
descent of the coding sequences for these amino acids. Their binding sequences could
have become both tRNA-like and mRNA-like molecules, or these data may be the first
indication of the need for a new, more comprehensive theory.
Ideally, with a large sample of independently derived families of aptamer that bind each
of the amino acids, it should be possible to test associations between binding sites and
individual trinucleotides. If there are, as now appears, several classes of amino acids
with different relations to coding sequences, such high resolution may be required. It is
possible that high-throughput techniques for aptamer isolation will achieve this in the
future, but, for the moment, isolating aptamers and determining binding sites is a time-
consuming process. Consequently, it may be several years before site/triplet associations
are maximally resolved.
However, it is clearly not true that each aptamer binds its target amino acid using only the
cognate codons. Amino acid binding sites always require other nucleotides for their
construction. Where structures are known, the coding sequences can be in contact with
the amino acid or providing less central support for the site - in some cases they are in
both places 52. The fact that binding sites with detectable affinities are far more complex
than single trinucleotides strongly suggests that the code began in an RNA world,
after complex RNA molecules were prevalent. Assuming that the RNA world biota were
our immediate antecedents, translation was also probably devised in the RNA world 89.
An economical interpretation is therefore that coding assignments arose predominantly
during initial selection for templated peptide synthesis, rather than via other activities.
These techniques have substantial potential for further analysis. It may be possible to
discover why some amino acids have the actual codon assignments they do, and perhaps
why some amino acids were incorporated into the code while others available on the
early earth or as metabolic intermediates were excluded. Furthermore, with complete data
in hand it may be possible to define a minimal, stereochemically determined code, and
therefore to estimate the relative roles of chemistry and selection in shaping modern
codon assignments.
Amino Acid  Kd          Comments                                                 Reference
Arg         400 µM      Group I intron: naturally binds G                        86
Arg         4 mM        TAR: naturally binds Tat peptide in HIV                  56
Arg         1 mM        3 families selected; no structures available             67
Arg         4 mM        Selected against GMP binding                             68
Arg         2-4 mM      Selected by salt elution to mimic TAR                    70
Arg         60 µM       Derived from citrulline binder by                        66
                        mutagenesis/reselection; NMR structure available
Arg         330 nM      Intensive selection with heat-denaturation; only one     72
                        sequence structurally characterized, though many selected
Val         12 mM       No structural data                                       62
Ile         200-500 µM  Only one family survived selection                       64
Phe/Tyr     2-25 mM     No structural data                                       63
Trp         18 µM       Binds D-Trp-agarose, not free L-Trp; no structural data  87
Tyr         35 µM       Also binds Trp; evolved from L-DOPA binder               65
Phe         <1 mM       Some clones bind only Phe-agarose                        65a
Leu         ~1 mM                                                                Majerfeld & Yarus, unpublished data
Gln         18-20 mM                                                             Mannironi et al., unpublished data
Table 7.1: Natural and Artificial Amino Acid-Binding RNA. Entries in bold are those
with sufficient structural information to define binding site nucleotides, used to test for
statistical association between binding sites and triplet motifs. Natural RNA sequences
that bind arginine were excluded from the analysis, because they are probably under
selection for other properties.
Codons     Arg     Tyr     Ile     Gln     Phe     Leu
Ter       0.05    1.28   -5.02   -4.19   15.86    2.65
Ala       0.09  -16.95  -11.97  -11.57  -18.51   -0.38
Cys     -16.97   -0.66   -8.42   -3.32   -4.79    0.04
Asp       0.15    3.96   -3.45    2.89   -1.82   -1.08
Glu       3.44   -3.17   -1.79   -0.01    6.81    1.47
Phe      -3.38   -2.38   -8.42    1.26    3.73   -2.00
Gly       0.35    0.25   31.57    8.94    2.25    0.00
His      -1.04   -1.87   -6.14   -0.02   -3.69   -0.68
Ile       2.86    9.18   10.35    3.43    0.01   -4.60
Lys       1.34  -14.86    0.00    1.74    1.39    0.62
Leu     -19.92   -4.16    8.14  -10.60   -7.57    0.83
Met      -5.60    3.06    0.00   -0.02   -0.15   -1.35
Asn       5.46   -0.04   -1.79    3.25    1.04    0.01
Pro       0.00   -2.30  -11.17   -9.55   -8.26   -0.15
Gln       0.27    2.30   -2.85    2.98    2.00    0.62
Arg      29.11    0.24  -25.10    1.66    0.17   -0.78
Ser      -6.07   -4.95  -15.73   -7.54  -11.32    5.65
Thr      -0.10    0.57  -16.54    1.94   -7.32    2.61
Val      -0.13    4.45   -0.04   -0.38    5.53    2.82
Trp      -7.26    0.04   42.58   -1.14   -2.52    0.28
Tyr      -3.38    6.69   10.90   -0.33    0.03   -0.12
Rank         1       2       4       4       4       6
Table 7.2: Tests for association between amino acid binding sites and their cognate
codons. Rows: codon sets for each amino acid; columns: amino acids for which aptamers
with known structures have been reported. Bold values indicate the cognate codon sets
for each amino acid aptamer; values in italics indicate codon sets with at least as strong
an association as the actual codon set. Tabulated numbers are G values for association
between codons and binding sites, with the Williams correction 88; negative values
indicate codon sets that are found less frequently at binding sites than would be expected
by chance. ‘Rank’ indicates the rank order of the cognate amino acid’s codon set.
Binding sites for this table and all others are taken from ref. 52 where applicable (Arg, Ile,
Tyr), or otherwise from personal communications from the specific aptamer laboratories.
See ref. 76 for discussion of the effects of different choices of binding site.
Codons           n  +b+c  +b-c  -b+c  -b-c      G        P
Arg              5    36    16    38   106   29.1  3.4E-08
Tyr              3    12    71     9   179    6.7  4.8E-03
Ile              5    15    25    30   181   10.4  6.5E-04
Gln              3     6    36     6   108    3.0  4.2E-02
Leu              2    16    46    19    78    0.8  1.8E-01
Phe              8    11    74    35   504    3.7  2.7E-02
Total           26    96   268   137  1156   51.6  3.5E-13
Total - Arg     21    60   252    99  1050   25.1  2.7E-07

Anticodons   # seq  +b+c  +b-c  -b+c  -b-c      G        P
Arg              5    20    32    37   107    2.9  4.5E-02
Tyr              3    18    65     6   182   21.7  1.6E-06
Ile              5    16    24    23   188   17.1  1.7E-05
Gln              3     1    41    17    97   -5.9  9.9E-01
Leu              2    27    35    23    74    6.7  4.7E-03
Phe              8    12    73    40   499    3.7  2.8E-02
Total           26    94   270   146  1147   43.1  2.6E-11
Total - Tyr     21    74   238   109  1040   39.6  1.6E-10

Rev. Codons  # seq  +b+c  +b-c  -b+c  -b-c      G        P
Arg              5    16    36    42   102   0.05  8.3E-01
Tyr              3     3    80     6   182   0.03  8.6E-01
Ile              1    10    30    25   186   4.10  4.3E-02
Gln              3     7    35    11   103   1.34  2.5E-01
Leu              2    12    50    29    68  -2.22  1.4E-01
Phe              8     2    83    43   496  -4.33  3.7E-02
Total           22    50   314   156  1137   0.71  4.0E-01
Table 7.3: Test for association between binding sites and the cognate codons, anticodons,
and codons reversed 3’ to 5’. Column headings: n, number of sequences; +b+c, number
of bases both in codons and in binding sites; +b-c, number of bases in binding sites but
not in codons; -b+c, number of bases in codons but not in binding sites; -b-c, number of
bases neither in codons nor in binding sites; G, the G test for association in a 2 x 2 table,
with the Williams correction; P, 1-tailed test for independence with 1 degree of freedom.
Values in italics are significant to P < 0.01 after correcting for 6 comparisons. There are
significant associations between some amino acid binding sites and both codons and
anticodons, even when the single most significant association is removed. However, there
is no association at all between amino acid binding sites and the reversed codons.
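As a worked illustration of the statistic tabulated above, the following Python sketch computes a G test with the Williams correction for a 2 x 2 table, using only the standard library. Two conventions are assumptions inferred from the tables rather than stated procedures: G is given a negative sign when fewer bases than expected fall in both codons and binding sites, and the 1-tailed P is taken as half the chi-square upper tail with 1 degree of freedom (or its complement for negative G). The function and argument names are ours.

```python
from math import erfc, log, sqrt

def g_test_2x2(bc, b_only, c_only, neither):
    """Signed, Williams-corrected G for a 2 x 2 table, plus a 1-tailed P.

    Arguments follow the table columns: +b+c, +b-c, -b+c, -b-c.
    """
    n = bc + b_only + c_only + neither
    rows = (bc + b_only, c_only + neither)   # in site / not in site
    cols = (bc + c_only, b_only + neither)   # in codon / not in codon
    observed = ((bc, b_only), (c_only, neither))
    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a cell of 0 contributes nothing
                g += 2.0 * o * log(o * n / (rows[i] * cols[j]))
    # Williams correction factor for a 2 x 2 table
    q = 1.0 + ((n / rows[0] + n / rows[1] - 1.0)
               * (n / cols[0] + n / cols[1] - 1.0)) / (6.0 * n)
    g = max(g / q, 0.0)  # guard against tiny negative rounding error
    # Chi-square upper tail with 1 d.f. is erfc(sqrt(g/2)); halve for 1 tail.
    p = 0.5 * erfc(sqrt(g / 2.0))
    if bc * n < rows[0] * cols[0]:  # fewer overlaps than expected (assumed sign rule)
        g, p = -g, 1.0 - p
    return g, p

g_arg, p_arg = g_test_2x2(36, 16, 38, 106)  # Arg codons row
g_gln, p_gln = g_test_2x2(1, 41, 17, 97)    # Gln anticodons row
print(round(g_arg, 1), round(g_gln, 1))     # 29.1 -5.9
```

Under these assumptions the Arg codon and Gln anticodon rows reproduce the tabulated G values to the precision shown.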
Codon Doublets      # seq  +b+c  +b-c  -b+c  -b-c      G        P
Arg                     5    24    28    24   120   16.4  2.5E-05
Tyr                     3    22    61    20   168   10.2  7.1E-04
Ile                     5    15    25    30   181   10.4  6.5E-04
Gln                     3     9    33    15    99    1.5  1.1E-01
Leu                     2     7    55    21    76   -2.9  9.6E-01
Phe                     8    17    68    96   443    0.2  3.2E-01
Total                  26    94   270   206  1087   17.5  1.4E-05
Total - Arg            21    70   242   182   967    7.1  3.9E-03

Anticodon Doublets  # seq  +b+c  +b-c  -b+c  -b-c      G        P
Arg                     5    11    41    27   117    0.1  3.6E-01
Tyr                     3    23    60    19   169   12.5  2.1E-04
Ile                     5     8    32    45   166    0.0  5.7E-01
Gln                     3     5    37    46    68  -12.6  1.0E+00
Leu                     2    22    40    16    81    7.2  3.6E-03
Phe                     8    27    58    72   467   15.6  3.8E-05
Total                  26    96   268   225  1068   13.8  1.0E-04
Total - Phe            18    85   227   198   951   14.8  6.1E-05
Table 7.4: Test for association between binding sites and codon doublets (XYN) or
anticodon doublets (NY’X’), where X and Y are specified and N is any base. For
example, the codon doublet for Phe is UUN within a binding site, and the anticodon
doublet is NAA within a site. Again, the specific associations hold for both codons and
anticodons overall, although few of the results are individually significant. Italics indicate
significant values after correction for 6 comparisons.
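The four counts in each row (+b+c, +b-c, -b+c, -b-c) can be tallied along the following lines. This is a hedged sketch, not the procedure actually used: it assumes that a base counts as "in a codon" whenever any occurrence of a cognate triplet or doublet pattern, in any reading frame, covers it. The sequence and binding-site positions are invented for illustration, and the function names are ours.

```python
def matches(window, motif):
    """True if a 3-base window matches a motif; 'N' matches any base."""
    return all(m == "N" or m == w for w, m in zip(window, motif))

def site_motif_counts(seq, site_positions, motifs):
    """Return (+b+c, +b-c, -b+c, -b-c) for one sequence."""
    covered = set()
    for i in range(len(seq) - 2):
        if any(matches(seq[i:i + 3], m) for m in motifs):
            covered.update(range(i, i + 3))  # all three bases of the match
    site = set(site_positions)
    all_pos = set(range(len(seq)))
    bc = len(site & covered)
    b_only = len(site - covered)
    c_only = len(covered - site)
    neither = len(all_pos - site - covered)
    return bc, b_only, c_only, neither

toy_seq = "ACUUCGGAAU"   # hypothetical sequence, not a real aptamer
toy_site = {3, 4, 5}     # hypothetical binding-site positions (0-based)

print(site_motif_counts(toy_seq, toy_site, ["UUN"]))  # (2, 1, 1, 6)
print(site_motif_counts(toy_seq, toy_site, ["NAA"]))  # (0, 3, 3, 4)
```

Here "UUN" and "NAA" are the Phe codon and anticodon doublets given as examples in the caption; full codons can be passed in the same way, for instance ["UUU", "UUC"].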
Fig. 7.1: Distribution of likelihood for randomized genetic codes. The lines correspond to the different models for random codes described in Testing Triplet/Site Associations. The gray vertical line at the right (G = 51.5) gives the position of the actual code: very few randomized codes give a higher association between ‘codons’ and binding sites, making it highly unlikely that the observed association for the real code is due to chance. The “n-block” line (x) is skewed strongly to the right, because some codons can occupy relatively few blocks under this model. Thus n-block randomization preserves many similarities to the real code.
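The logic behind the figure can be sketched in a few lines of Python: shuffle amino acid labels among codon blocks of equal size (the "n-block" idea), score each randomized code, and report the fraction that score higher than the actual code. The sketch below is illustrative only. It uses a handful of blocks from the standard code rather than the full table, the scoring step is left out, and all function names are ours.

```python
import random

def nblock_shuffle(blocks, rng):
    """Permute amino acid labels among codon blocks of the same size."""
    by_size = {}
    for idx, (codons, _aa) in enumerate(blocks):
        by_size.setdefault(len(codons), []).append(idx)
    shuffled = list(blocks)
    for idxs in by_size.values():
        labels = [blocks[i][1] for i in idxs]
        rng.shuffle(labels)
        for i, aa in zip(idxs, labels):
            shuffled[i] = (blocks[i][0], aa)
    return shuffled

def fraction_better(actual_score, randomized_scores):
    """Fraction of randomized codes scoring strictly above the actual code."""
    better = sum(1 for s in randomized_scores if s > actual_score)
    return better / len(randomized_scores)

# A few blocks of the standard code, for illustration only.
toy_blocks = [
    (("GGU", "GGC", "GGA", "GGG"), "Gly"),
    (("GCU", "GCC", "GCA", "GCG"), "Ala"),
    (("UUU", "UUC"), "Phe"),
    (("UAU", "UAC"), "Tyr"),
    (("UGG",), "Trp"),
    (("AUG",), "Met"),
]
rng = random.Random(0)  # fixed seed so the sketch is repeatable
randomized = nblock_shuffle(toy_blocks, rng)
```

Because labels move only between blocks of the same size, each amino acid keeps its number of codons, which is why this scheme retains more of the real code than the others.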
References
1. Szathmáry E. Coding coenzyme handles: A hypothesis for the origin of the
genetic code. Proc Natl Acad Sci USA 1993;90: 9916-9920.
2. Porschke D. Differential effect of amino acid residues on the stability of double
helices formed from polyribonucleotides and its possible relation to the evolution of the
genetic code. J Mol Evol 1985;21: 192-198.
3. Maizels N, Weiner AM. Peptide-specific ribosomes, genomic tags, and the origin
of the genetic code. Cold Spring Harb Symp Quant Biol 1987;LII: 743-749.
4. Maizels N, Weiner AM. The genomic tag hypothesis: modern viruses as
molecular fossils of ancient strategies for genomic replication, in The RNA world,
Gesteland RF and Atkins JF, Eds. Cold Spring Harbor Laboratory Press: New York
1993;577-602.
5. Gamow G. Possible mathematical relation between deoxyribonucleic acid and
protein. Kgl Dansk Videnskab Selskab Biol Medd 1954;22: 1-13.
6. Woese CR. The genetic code: the molecular basis for genetic expression. New
York: Harper & Row 1967.
7. Ycas M. The biological code. North-Holland Research Monographs: Frontiers of
Biology, ed. Neuberger A and Tatum EL. Vol. 12. Amsterdam: North-Holland publishing
Company 1969.
8. Woese CR, Dugre DH, Dugre SA, et al. On the fundamental nature and evolution
of the genetic code. Cold Spring Harb Symp Quant Biol 1966;31: 723-736.
9. Woese CR, Dugre DH, Saxinger WC, et al. The molecular basis for the genetic
code. Proc Natl Acad Sci USA 1966;55: 966-974.
10. Dunnill P. Triplet nucleotide-amino acid pairing: a stereochemical basis for the
division between protein and nonprotein amino acids. Nature 1966;210: 1267-1268.
11. Pelc SR. Correlation between coding triplets and amino acids. Nature 1965;207:
597-599.
12. Pelc SR, Welton MGE. Stereochemical relationship between coding triplets and
amino-acids. Nature 1966;209: 868-872.
13. Ralph RK. A suggestion on the origin of the genetic code. Biochem Biophys Res
Comm 1968;33: 213-218.
14. Lacey Jr JC, Pruitt KM. Origin of the genetic code. Nature 1969;223: 799-804.
15. Rendell MS, Harlos JP, Rein R. Specificity in the genetic code: the role of
nucleotide base-amino acid interaction. Biopolymers 1971;10: 2083-2094.
16. Melcher G. Stereospecificity of the genetic code. J Mol Evol 1974;3: 121-140.
17. Nelsestuen GL. Amino acid-directed nucleic acid synthesis. J Mol Evol 1978;11:
109-120.
18. Hendry LB, Whitham FH. Stereochemical recognition in nucleic acid-amino acid
interactions and its implications in biological coding: a model approach. Perspect Biol
Med 1979;22: 333-345.
19. Hendry LB, Bransome Jr ED, Hutson MS, et al. First approximation of a
stereochemical rationale for the genetic code based on the topography and physicochemical
properties of "cavities" constructed from models of DNA. Proc Natl Acad Sci USA
1981;78: 7440-7444.
20. Balasubramanian R. Origin of life: a hypothesis for the origin of adaptor-mediated
ordered synthesis of proteins and an explanation for the choice of terminating codons in
the genetic code. Bio Systems 1982;15: 99-104.
21. Shimizu M. Molecular basis for the genetic code. J Mol Evol 1982;18: 297-303.
22. Root-Bernstein RS. Amino acid pairing. J theor Biol 1982;94: 885-894.
23. Root-Bernstein RS. On the origin of the genetic code. J theor Biol 1982;94: 895-
904.
24. Alberti S. The origin of the genetic code and protein synthesis. J Mol Evol
1997;45: 352-358.
25. Crick FHC. An error in model building. Nature 1967;213: 798.
26. Mellersh A. A model for the prebiotic synthesis of peptides and the genetic code.
Orig Life Evol Biosph 1993;23: 261-274.
27. Crick FHC. The origin of the genetic code. J Mol Biol 1968;38: 367-379.
28. Knight RD, Freeland SJ, Landweber LF. Selection, history and chemistry: the
three faces of the genetic code. Trends Biochem Sci 1999;24: 241-7.
29. Woese CR. Evolution of the genetic code. Naturwissenschaften 1973;60: 447-59.
30. Nagyvary J, Fendler JH. Origin of the genetic code: a physical-chemical model of
primitive codon assignments. Orig Life 1974;5: 357-362.
31. Miller SL. Which organic compounds could have occurred on the prebiotic earth?
Cold Spring Harb Symp Quant Biol 1987;LII: 17-27.
32. Fendler JH, Nome F, Nagyvary J. Compartmentalization of amino acids in
surfactant aggregates. J Mol Evol 1975;6: 215-232.
33. Weber AL, Lacey Jr JC. Genetic code correlations: amino acids and their
anticodon nucleotides. J Mol Evol 1978;11: 199-210.
34. Jungck JR. The genetic code as a periodic table. J Mol Evol 1978;11: 211-224.
35. Lehmann U. Chromatographic separation as selection process for prebiotic
evolution and the origin of the genetic code. Bio Systems 1985;17: 193-208.
36. Saxinger C, Ponnamperuma C. Experimental investigation on the origin of the
genetic code. J Mol Evol 1971;1: 63-73.
37. Raszka M, Mandel M. Is there a physical chemical basis for the present genetic
code? J Mol Evol 1972;2: 38-43.
38. Saxinger C, Ponnamperuma C. Interactions between amino acids and nucleotides
in the prebiotic milieu. Orig Life 1974;5: 189-200.
39. Lacey Jr JC, Weber AL, White Jr WE. A model for the coevolution of the genetic
code and the process of protein synthesis: review and assessment. Orig Life 1975;6: 273-
283.
40. Reuben J, Polk FE. Nucleotide-amino acid interactions and their relation to the
genetic code. J Mol Evol 1980;15: 103-112.
41. Podder SK, Basu HS. Specificity of protein-nucleic acid interaction and the
biochemical evolution. Orig Life 1984;14: 477-484.
42. Lacey Jr JC, Wickramasinghe NSMD, Cook GW, et al. Couplings of character
and of chirality in the origin of the genetic system. J Mol Evol 1993;37: 233-239.
43. Sowerby SJ, Cohn CA, Heckl WM, et al. Differential adsorption of nucleic acid
bases: relevance to the origin of life. Proc Natl Acad Sci USA 2001;98: 820-822.
44. Sowerby SJ, Heckl WM. The role of self-assembled monolayers of the purine and
pyrimidine bases in the emergence of life. Orig Life Evol Biosph 1998;28: 283-310.
45. Sowerby SJ, Stockwell PA, Heckl WM, et al. Self-programmable, self-assembling
two-dimensional genetic matter. Orig Life Evol Biosph 2000;30: 81-99.
46. Lacey Jr JC, Mullins Jr DW. Experimental studies related to the origin of the
genetic code and the process of protein synthesis-a review. Orig Life 1983;13: 3-42.
47. Lacey Jr JC. Experimental studies on the origin of the genetic code and the
process of protein synthesis: a review update. Orig Life Evol Biosph 1992;22: 243-275.
48. Epstein CJ. Role of the amino-acid 'code' and of selection for conformation in the
evolution of proteins. Nature 1966;210: 25-28.
49. Volkenstein MV. Coding of polar and non-polar amino acids. Nature 1965;207:
294-295.
50. Woese CR. Order in the genetic code. Proc Natl Acad Sci USA 1965;54: 71-75.
51. Saks ME, Sampson JR, Abelson J. Evolution of a transfer RNA gene through a
point mutation in the anticodon. Science 1998;279: 1665-1670.
52. Yarus M. RNA-ligand chemistry: a testable source for the genetic code. RNA
2000;6: 475-484.
53. Yarus M. A specific amino acid binding site composed of RNA. Science
1988;240: 1751-1758.
54. Yarus M. Specificity of arginine binding by the Tetrahymena intron.
Biochemistry 1989;28: 980-988.
55. Yarus M. An RNA-amino acid complex and the origin of the genetic code. New
Biologist 1991;3: 183-189.
56. Tao J, Frankel AD. Specific binding of arginine to TAR RNA. Proc Natl Acad Sci
USA 1992;89: 2723-2726.
57. Yarus M. Amino Acids as RNA Ligands: a Direct-RNA-Template Theory for the
Code's Origin. J Mol Evol 1998;47: 109-117.
58. Ellington AD, Szostak JW. In vitro selection of RNA molecules that bind
specific ligands. Nature 1990;346: 818-822.
59. Tuerk C, Gold L. Systematic evolution of ligands by exponential enrichment:
RNA ligands to bacteriophage T4 DNA polymerase. Science 1990;249: 505-510.
60. Robertson DL, Joyce GF. Selection in vitro of an RNA enzyme that specifically
cleaves single-stranded DNA. Nature 1990;344: 467-468.
61. Ciesiolka J, Illangasekare M, Majerfeld I, et al. Affinity selection-amplification
from randomized ribooligonucleotide pools. Meth Enzymol 1996;267: 315-335.
62. Majerfeld I, Yarus M. An RNA pocket for an aliphatic hydrophobe. Nat Struct
Biol 1994;1: 287-292.
63. Zinnen S, Yarus M. An RNA pocket for the planar aromatic side chains of
phenylalanine and tryptophane. Nucl Acid Symp Ser 1995;33: 148-151.
64. Majerfeld I, Yarus M. Isoleucine:RNA sites with essential coding sequences.
RNA 1998;4: 471-478.
65. Mannironi C, Scerch C, Fruscoloni P, et al. Molecular recognition of amino acids
by RNA aptamers: the evolution into an L-tyrosine binder of a dopamine-binding RNA
motif. RNA 2000;6: 520-527.
65a. Illangasekare M, Yarus M. Phenylalanine-binding RNAs and genetic code
evolution. J Mol Evol 2002;54: 298-311.
66. Famulok M. Molecular recognition of amino acids by RNA-aptamers: an L-
citrulline binding RNA motif and its evolution into an L-arginine binder. J Am Chem Soc
1994;116: 1698-1706.
67. Connell GJ, Illangasekare M, Yarus M. Three small ribooligonucleotides with
specific arginine sites. Biochemistry 1993;32: 5497-5502.
68. Connell GJ, Yarus M. RNAs with dual specificity and dual RNAs with similar
specificity. Science 1994;264: 1137-1141.
69. Yarus M. An RNA-amino acid affinity, in The RNA World, Gesteland RF, Atkins
JF, Editors. Cold Spring Harbor Laboratory Press: New York 1993;205-217.
70. Tao J, Frankel AD. Arginine-binding RNAs resembling TAR identified by in
vitro selection. Biochemistry 1996;35: 2229-2238.
71. Burgstaller P, Kochoyan M, Famulok M. Structural probing and damage selection
of citrulline- and arginine-specific RNA aptamers identify base positions required for
binding. Nucl Acid Res 1995;23: 4769-4776.
72. Geiger A, Burgstaller P, von der Eltz H, et al. RNA aptamers that bind L-arginine
with sub-micromolar dissociation constants and high enantioselectivity. Nucl Acid Res
1996;24: 1029-1036.
73. Yang Y, Kochoyan M, Burgstaller P, et al. Structural basis of ligand
discrimination by two related RNA aptamers resolved by NMR spectroscopy. Science
1996;272: 1343-1346.
74. Wong JT-F. A co-evolution theory of the genetic code. Proc Natl Acad Sci USA
1975;72: 1909-1912.
75. Sonneborn TM. Degeneracy of the genetic code: extent, nature, and genetic
implications, in Evolving Genes and Proteins, Bryson V and Vogel HJ, Eds. Academic
Press: New York 1965;377-397.
76. Knight RD, Landweber LF. Guilt by association: the arginine case revisited.
RNA 2000;6: 499-510.
77. Knight RD, Landweber LF. Rhyme or reason: RNA-arginine interactions and the
genetic code. Chem Biol 1998;5: R215-R220.
78. Ellington AD, Khrapov M, Shaw CA. The scene of a frozen accident. RNA
2000;6: 485-498.
79. Illangasekare M, Sanchez G, Nickles T, et al. Aminoacyl-RNA synthesis
catalyzed by an RNA. Science 1995;267: 643-647.
80. Illangasekare M, Yarus M. Specific, rapid synthesis of Phe-RNA by RNA. Proc
Natl Acad Sci USA 1999;96: 5470-5475.
81. Illangasekare M, Yarus M. A tiny RNA that catalyzes both aminoacyl-RNA and
peptidyl-RNA synthesis. RNA 1999;5: 1482-1489.
82. Welch M, Majerfeld I, Yarus M. 23S rRNA similarity from selection for peptidyl
transferase mimicry. Biochemistry 1997;36: 6614-6623.
83. Nissen P, Hansen J, Ban N, et al. The structural basis of ribosome activity in
peptide bond synthesis. Science 2000;289: 920-930.
84. Yarus M, Welch M. Peptidyl transferase: ancient and exiguous. Chem Biol
2000;7: R187-R190.
85. Kumar RK, Yarus M. RNA-catalyzed amino acid activation. Biochemistry
2001;40: 6998-7004.
86. Yarus M, Majerfeld I. Co-optimization of ribozyme substrate stacking and
L-arginine binding. J Mol Biol 1992;225: 945-949.
87. Famulok M, Szostak JW. Stereospecific recognition of tryptophan agarose by in
vitro selected RNA. J Am Chem Soc 1992;114: 3990-3991.
88. Sokal RR, Rohlf FJ. Biometry: The Principles and Practice of Statistics in
Biological Research. 3rd ed. New York: W. H. Freeman and Company 1995.
89. Yarus M. On translation by RNAs alone. Cold Spring Harb Symp Quant Biol
2001;66: 207-215.