Hendry et al.
Lawrence B. Hendry,* Virendra B. Mahesh,* Edwin D. Bransome, Jr.#, Marion S. Hutson+ and Lillian K.
*Drug Design and Development Laboratory, Department of Physiology and Endocrinology, #Department
of Medicine, +Department of Pathology, Medical College of Georgia, Augusta, Georgia 30912 and
**Scottish Rite Childrens Hospital, 1641 S. Ponce de Leon, Atlanta, Georgia 30307 U.S.A.
Correspondence should be addressed to:
Submitted for publication: August 1995
Keywords: nucleic acids, genetic code, amino acids, DNA, RNA, complementarity
Computer modeling was employed to demonstrate that the amino acid L-isoleucine fits remarkably well into
the apyrimidinic site in DNA 5' A_C 3', 3' TAG 5' derived by removal of the second codon
nucleotide T from its double stranded codon/anticodon triplet. The complex 5' A-ILE-C 3', 3' TAG 5' exemplified remarkable complementarity of van der Waals surfaces as well as highly
favorable electrostatic interactions as measured by energy calculations. Alterations in the natural amino
acid including changes in the chirality of the isoleucine side chain to those not occurring in protein resulted
in poor fitting structures. These findings confirm earlier studies made with physical models and provide
strong evidence that the genetic code has a stereochemical basis.
Understanding the relationships between proteins and nucleic acids is of fundamental importance in
unraveling the function, regulation and evolution of genes. Historically, it was not long after Watson and
Crick's publication of the structure of the double helical B form of DNA (1) that Gamow (2) raised the
question of whether there were unique lock and key relationships between the structures of the twenty
amino acids occurring naturally in proteins and the four bases in DNA. Subsequently, the laboratories of
Nirenberg and Ochoa discovered that amino acids were coded by different triplet sequences of nucleic
acid bases which were given the name codons (3, 4). The position of amino acids in a given protein
sequence (amino to carboxyl) could be read from the sequence of codons (5' to 3'). The assignments of
amino acids to particular codons have since been firmly established in the genetic code table.
Gamow's belief that there must be structural relationships between nucleic acids, amino acids and proteins
was shared by a large number of early investigators, most notably Woese (5). Various experimental and
theoretical approaches have been reported including suggestions that amino acids might interact with either
codons or anticodons (6, 7 and references therein). In a 1968 landmark paper on the origin of the genetic
code, Crick (8) stated that it was "...essential to pursue the stereochemical theory". At that time, because a
clear stereochemical model was not available, Crick hypothesized that the genetic code could however be
a frozen accident of evolution and thus uninterpretable today. He also suggested that any satisfactory
model should measure amino acid/nucleic acid interactions in terms of binding constants. Perhaps of
greatest insight was his suggestion that "it might be more useful to consider which amino acids are not used
in the code".
Our laboratory has employed various physical models to demonstrate relationships between amino acids
and codon/anticodon bases. Initially, amino acid side chains were observed to fit into cavities between base
pairs in partially unwound single stranded RNA as well as partially unwound double stranded RNA (9). In
many cases, the amino acids fit particularly well into sites constructed from the first two bases of their codons
and anticodons. Structural analogies between amino acid side chains and nucleic acid bases were then
discovered leading to the idea of replacing a base in DNA and/or RNA with the amino acid (10). In
subsequent studies, the Watson and Crick B form of DNA was employed because the regularity of this
conformation of the double helix and in particular the symmetrical property manifest in the dyad axis could
be readily used to compare the amino acid/nucleic acid complexes with one another. Physical models of
cavities in double stranded DNA in which a base was removed were constructed leading to the discovery
that amino acids were excellent fits into certain cavities (7, 11). The amino acids fit best into the cavities
derived from their codon/anticodons. Poor fits were generally found when the amino acids were inserted
into cavities not derived from their codons. Structurally altered amino acids which were not capable of
being translated into protein were poor fits. In these early studies, it was not possible to quantitate the degree
of fit of amino acids, and until recently computational methods which could reliably and rigorously quantitate
interactions of molecular structures were unavailable.
In this study, computer modeling was employed to reinvestigate our initial observations of the
stereochemical basis for the genetic code which were based upon relatively primitive physical models.
Our interest in the code problem was rekindled by two developments. First, new evidence has been
reported by Harris, Sullivan and Hickok (12) which supports a stereochemical theory for the origin of the
genetic code based upon a study of the interaction of the DNA binding domain of the glucocorticoid
receptor protein with the glucocorticoid response element sequence in DNA. These findings suggest that
precise stereochemical relationships are conserved between selected amino acid residues existing in
certain regulatory proteins and the nucleic acid sequences of genes that they regulate. The relationships
were found to be directly reflected in the genetic code. Specifically, those amino acids of the protein which
recognize the gene appear to be oriented in a manner allowing for direct interaction with sequences which
contain their codons. Second, modern computational methods including standard force field energy
calculations have progressed to the point that accurate and reproducible, quantitative measurements of the
interactions of molecules can now be readily obtained. Such calculations are based upon the fundamental
physicochemical principle of complementarity originally defined by Pauling and Delbruck (13), i.e.,
favorable steric interactions of van der Waals surfaces and electrostatic attraction of suitably oriented atoms
and functional groups.
At the outset, we wish to point out to the reader that any study which examines structural relationships
between amino acids and nucleic acids will be biased by the existence of the known genetic code table. To
avoid this problem, naturally occurring amino acid residues as well as those structures not existing in
protein were examined using L-isoleucine and various structural isomers as examples. The energy
resulting from the interactions of these amino acids with codon/anticodon sites was measured. Isoleucine
was chosen because in addition to its L- configuration, it has a chiral (asymmetric) side chain and exists as a
single enantiomer when translated into protein. It was thus possible to examine the fit of various
diastereoisomers. The rationale of varying the amino acid structure and quantitating interactions with a
given codon/anticodon site directly addresses the above stated concerns of Crick.
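The stereochemical variants available for this comparison can be enumerated explicitly. The following is a minimal sketch, not part of the original study: it assumes the standard CIP assignment of L-isoleucine as (2S,3S), and the function names are illustrative only.

```python
from itertools import product

# CIP descriptors for the two stereocenters of isoleucine:
# C2 (alpha carbon) and C3 (branch point of the side chain).
# Assumption: standard assignments, with L-isoleucine as (2S,3S).
NAMES = {
    ("S", "S"): "L-isoleucine",
    ("S", "R"): "L-alloisoleucine",
    ("R", "R"): "D-isoleucine",
    ("R", "S"): "D-alloisoleucine",
}


def stereoisomers():
    """Return every (C2, C3) configuration together with its conventional name."""
    return [(c2, c3, NAMES[(c2, c3)]) for c2, c3 in product("SR", repeat=2)]


def relation(a, b):
    """Classify two configurations as identical, enantiomers, or diastereomers."""
    if a == b:
        return "identical"
    mirrored = tuple("R" if x == "S" else "S" for x in a)
    return "enantiomers" if mirrored == b else "diastereomers"


if __name__ == "__main__":
    natural = ("S", "S")
    for c2, c3, name in stereoisomers():
        print(f"(2{c2},3{c3})  {name:18s} {relation(natural, (c2, c3))}")
```

Two stereocenters give four stereoisomers. Relative to L-isoleucine, the two allo forms are diastereomers, while D-isoleucine is the mirror image.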
Here, we report further evidence that the genetic code has a stereochemical basis. This conclusion is
based upon the finding that L-isoleucine fits remarkably well into the apyrimidinic site derived from its
codon/anticodon and the fit is highly specific as evidenced by the relatively poor fit of other structural
isomers. The results of the study of all of the existing amino acids and their fit into all 64 possible sites using
this approach will be reported in subsequent papers.
MATERIALS AND METHODS
Molecular modeling was conducted on an Indigo Extreme Silicon Graphics computer with Sybyl 6.04
software (Tripos Associates, St. Louis, MO) equipped with stereoviewing. All structures were constructed
with the Biopolymer module of the Sybyl program and assigned Kollman all atom charges. Double
stranded DNA triplets were constructed in the Watson and Crick canonical B form. Apurinic/apyrimidinic
sites were formed by removing a middle nucleotide (base with deoxyribose and 5' phosphate) from the
triplets. Energy calculations were performed with the Sybyl force field using a 1.2 Å parameter for the van
der Waals radius of hydrogen. While keeping the relative position of the bases intact, adjustments were
made to the torsional angles of the remaining backbone to permit maximal insertion of the size and shape of
a given amino acid side chain into the site. Concomitantly, the electrostatic interaction between the
negatively charged 5'-phosphate oxygen of the 3' base bordering the site and the positively charged alpha
amino group of the amino acid (salt bridge) was optimized. Alterations were made in the various amino
acid structures to create unnatural isomers which were then minimized using the force field.
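The bookkeeping behind the site notation used in this paper (e.g. 5' A_C 3', 3' TAG 5') can be sketched as follows. This handles only the sequence labels, not the three-dimensional model built in Sybyl, and the function names are hypothetical.

```python
from itertools import product

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def site_from_codon(codon):
    """Return (gapped codon strand 5'->3', partner strand written 3'->5')."""
    codon = codon.upper()
    if len(codon) != 3 or any(base not in COMPLEMENT for base in codon):
        raise ValueError(f"not a DNA triplet: {codon!r}")
    gapped = f"{codon[0]}_{codon[2]}"  # middle nucleotide removed
    # Written 3'->5', the partner strand is the base-by-base complement.
    partner = "".join(COMPLEMENT[base] for base in codon)
    return gapped, partner


def site_kind(codon):
    """Removing T or C leaves an apyrimidinic site; removing A or G, an apurinic site."""
    return "apyrimidinic" if codon.upper()[1] in "TC" else "apurinic"


def format_site(codon):
    """Format a site in the notation used in the text."""
    gapped, partner = site_from_codon(codon)
    return f"5' {gapped} 3', 3' {partner} 5'"


def all_sites():
    """Enumerate the sites derived from all 64 triplets."""
    return [format_site("".join(triplet)) for triplet in product("ACGT", repeat=3)]


if __name__ == "__main__":
    print(format_site("ATC"), "-", site_kind("ATC"))
    print(len(set(all_sites())), "distinct sites")
```

For the isoleucine codon ATC this reproduces the apyrimidinic site examined here, and enumerating every triplet gives the 64 sites referred to in the text.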
Insertion of each candidate amino acid into DNA was accomplished using van der Waals surfaces including 1.0
solvent accessible Connolly surfaces of the apurinic/apyrimidinic cavities in stereo. Poor steric contacts
were minimized using autodocking. The docking procedure was repeated several times to optimize the
distances and directions of potential hydrogen bonds and salt bridges as well as to maximize van der Waals
interactions. The conformations of the amino acids were also adjusted to maximize steric contact. The
relative fit or complementarity of each amino acid was calculated by measuring the optimal favorable
energy change resulting from docking. A convenient method was to perform the docking procedure,
define the amino acid and DNA separately as aggregates and merge the molecules into a single complex.
The change in van der Waals energy was used as a measure of steric complementarity; the change in
electrostatic energy using donor hydrogens and acceptor heteroatoms was used to assess electrostatic
complementarity. Thus, the greater the magnitude of the negative energy change resulting from complex
formation, the more stable the complex and the better the fit. The total fit of each ligand was evaluated by
adding the change in kcal of the electrostatic and van der Waals energies.
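The scoring described above can be written out as a short calculation. This sketch assumes that the change on docking is the energy of the merged complex minus the energies of the separately defined DNA and amino acid aggregates. The `Energies` container and the numbers in the example are hypothetical and are not output from Sybyl.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Energies:
    """Force field energy terms, in kcal, for one structure (hypothetical container)."""
    vdw: float
    electrostatic: float


def docking_change(complex_, dna, ligand):
    """Per-term energy change on complex formation: E(complex) - E(DNA) - E(ligand)."""
    return Energies(
        vdw=complex_.vdw - dna.vdw - ligand.vdw,
        electrostatic=complex_.electrostatic - dna.electrostatic - ligand.electrostatic,
    )


def total_fit(change):
    """Total fit: sum of the van der Waals and electrostatic changes (more negative is better)."""
    return change.vdw + change.electrostatic


if __name__ == "__main__":
    # Illustrative values only, chosen to make the arithmetic easy to follow.
    dna = Energies(vdw=-100.0, electrostatic=-200.0)
    ligand = Energies(vdw=-5.0, electrostatic=-10.0)
    merged = Energies(vdw=-119.0, electrostatic=-233.0)
    change = docking_change(merged, dna, ligand)
    print(change, total_fit(change))
```

Keeping the two terms separate until the final sum allows steric and electrostatic complementarity to be compared independently, as is done in the results.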
Before proceeding to the results, it should be noted that there are certain inherent limitations of the
currently available computational methods. To date, it has not been possible to examine all other possible
sites or conformations of DNA and/or RNA. Water surfaces need to be considered and should also play
an important role in the specificity of fits of amino acids into the apurinic/apyrimidinic sites. The amino acid
codon/anticodon complexes including solvent shells should also be examined with molecular dynamics.
RESULTS
The creation of an apyrimidinic site using computer modeling from double stranded DNA is depicted with
both skeletal and space filling models in Figures 1 and 2, respectively. The site was created by removal of
a center nucleotide in a triplet sequence. In the example shown, the specific double stranded sequence
which was used to construct the cavity, i.e. 5' ATC 3', 3' TAG 5', is a codon/anticodon triplet
for isoleucine. Note that the cavity formed by removal of the nucleotide T is bordered by: the first
codon base (A) along with the attached deoxyribose; the third codon base (C) with the attached
deoxyribose and 5' phosphate group; and the middle base of the anticodon (A). The orientation, van der Waals
surface and electrostatic properties of these bases along with the sugar-phosphate backbone determine
the overall shape and physicochemical characteristics of the cavity. The predominant electrostatic feature
of any given cavity is the negatively charged phosphate oxygen of the third codon base. Thus, in order
for a candidate molecule to fit well within the site according to the well established principles of
complementarity (13), it should form a suitable charge interaction with the phosphate group as well as
conform to the surface characteristics of the cavity. For this reason, when docking amino acids into the site,
the positively charged alpha amino group was positioned in a manner which could form a salt bridge to the
phosphate oxygen concomitant with maximizing the van der Waals interaction of the side chain within the cavity.
When attempts were made to dock L-isoleucine into the site 5' A_C 3', 3' TAG 5', the side
chain was found to be approximately the same size and shape as the cavity (Figures 1C, 2C). A complex 5' A-ILE-C 3', 3' TAG 5' could be constructed in which a salt bridge was formed with the sec-butyl
side chain fitting completely within the cavity. Moreover, the van der Waals surfaces of the middle
anticodon base and the side chain possessed contacts which were strikingly complementary to one another.
To better assess the shape of the cavity, a Connolly or solvent accessible surface was created using those
atoms most closely bordering the site (Figure 3). L-isoleucine fits very well into the Connolly surface
(Figure 3C and Figure 4A). When measured with energy calculations, L-isoleucine formed an electrostatic
interaction (salt bridge) of -23.578 kcal and a van der Waals interaction of -14.233 kcal within the site. The
total interaction energy in the 5' A-ILE-C 3', 3' TAG 5' complex was -37.811 kcal.
Attempts to dock other isomers of L-isoleucine into DNA were not as successful (Figure 4). In no case was
it possible to fully insert these isomers into the cavity in the manner of L-isoleucine without very high van der
Waals repulsion of ca. +550 kcal to > +8000 kcal. The best fits of these isomers in the site are as follows.
L-alloisoleucine had a -22.257 kcal electrostatic interaction and a -11.496 kcal van der Waals interaction for a
total of -33.753 kcal. As shown in Figure 4B, relatively little of the side chain was capable of contacting the
surface of the cavity bordered by the base A of the anticodon. Changing the configuration of the alpha
carbon to D-alloisoleucine also did not permit full insertion into the cavity. D-alloisoleucine had a -15.026
kcal electrostatic interaction and a -8.368 kcal van der Waals interaction for a total of -23.394 kcal. Even less of
the side chain of this epimer could fit within the site (Figure 4C). The worst fitting stereoisomer was D-isoleucine,
which had a -8.711 kcal electrostatic interaction and a -8.323 kcal van der Waals interaction for a total of
-17.034 kcal (Figure 4D). It was not possible to fit the L-tertiary butyl structural isomer of L-isoleucine into the site due to
the bulkiness of the side chain (Figure 4E). The L-t-butylisoleucine isomer had a -9.260 kcal electrostatic interaction
and a -7.436 kcal van der Waals interaction for a total of -16.696 kcal. The structural homolog of
L-isoleucine (L-homoisoleucine) with a carbon added between the alpha carbon and sec-butyl side chain was
also a poor fit. It was not possible to form a reasonable salt bridge and the side chain could only be partially
inserted into the site (Figure 4F). L-homoisoleucine had a -3.968 kcal electrostatic interaction and a -10.018 kcal van
der Waals interaction for a total of -13.986 kcal.
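The energies quoted above can be collected and checked in a few lines. The sketch below copies the values from the text, confirms that each reported total equals the sum of its electrostatic and van der Waals terms, and ranks the ligands. The function names are illustrative.

```python
# (ligand, electrostatic, van der Waals, reported total), in kcal, as quoted in the text.
REPORTED = [
    ("L-isoleucine",        -23.578, -14.233, -37.811),
    ("L-alloisoleucine",    -22.257, -11.496, -33.753),
    ("D-alloisoleucine",    -15.026,  -8.368, -23.394),
    ("D-isoleucine",         -8.711,  -8.323, -17.034),
    ("L-t-butylisoleucine",  -9.260,  -7.436, -16.696),
    ("L-homoisoleucine",     -3.968, -10.018, -13.986),
]


def check_totals(rows, tol=5e-4):
    """Return the names whose reported total differs from electrostatic + van der Waals."""
    return [name for name, es, vdw, total in rows if abs(es + vdw - total) > tol]


def ranked(rows):
    """Sort from best fit (most negative total) to worst."""
    return sorted(rows, key=lambda row: row[3])


if __name__ == "__main__":
    print("inconsistent totals:", check_totals(REPORTED))
    for name, es, vdw, total in ranked(REPORTED):
        print(f"{name:20s} {es:9.3f} {vdw:9.3f} {total:9.3f}")
```

All six totals are internally consistent, and L-isoleucine has the most negative total of the structures examined.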
Computer modeling, including graphics and energy calculations, has clearly shown that L-isoleucine is an
excellent fit into the apyrimidinic site created by removal of the middle nucleotide (T) from the double
stranded DNA sequence 5' ATC 3', 3' TAG 5'. L-isoleucine exhibits highly favorable
interactions within the apyrimidinic DNA cavity 5' A_C 3', 3' TAG 5' demonstrated by
simultaneous formation of: a strong electrostatic interaction manifest in a salt bridge between the alpha
amino group of the amino acid and a phosphate oxygen of the third nucleotide base (C); complementary
van der Waals surfaces between the side chain and the surrounding bases and in particular the unpaired
base (A). These results confirm earlier studies based upon physical models which demonstrated that
translated amino acids fit into apurinic/apyrimidinic sites (7, 10, 11). The apyrimidinic site in DNA chosen for
this study was derived from a codon-anticodon triplet sequence for L-isoleucine. The remarkable
complementarity manifest in the 5' A-ILE-C 3', 3' TAG 5' complex is consistent with our prior
observations and supports the notion that the genetic code has a stereochemical basis.
As stated in the introduction, prior knowledge of the existence of the genetic code table can create an
inherent bias in any study of the relationships between amino acids and their codons or anticodons. This
investigation was specifically designed to avoid such bias using the rationale of Crick (8), namely, to
examine amino acid structures which are not translated into protein. Using L-isoleucine as an example, all
possible diastereoisomers, a structural isomer and a homolog were examined for fit into the apyrimidinic site
derived from an isoleucine codon/anticodon. None of the structural variants fit into the site as well as L-
isoleucine as summarized in energy calculations plotted in Figure 5. These results provide unequivocal
and strong evidence that the fit of isoleucine into the cavity formed from its apyrimidinic codon/anticodons
cannot be fortuitous.
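The comparison can also be expressed as the loss of interaction energy relative to L-isoleucine, split into electrostatic and van der Waals contributions. The sketch below uses the values reported in the results. The decomposition and the text bars are an illustrative aid, not a reproduction of Figure 5.

```python
# (electrostatic, van der Waals) interaction energies in kcal, as reported in the results.
ENERGIES = {
    "L-isoleucine":        (-23.578, -14.233),
    "L-alloisoleucine":    (-22.257, -11.496),
    "D-alloisoleucine":    (-15.026,  -8.368),
    "D-isoleucine":        (-8.711,   -8.323),
    "L-t-butylisoleucine": (-9.260,   -7.436),
    "L-homoisoleucine":    (-3.968,  -10.018),
}
REFERENCE = "L-isoleucine"


def penalties(energies, reference=REFERENCE):
    """Map each variant to (electrostatic loss, vdW loss, total loss) vs. the reference, in kcal."""
    ref_es, ref_vdw = energies[reference]
    result = {}
    for name, (es, vdw) in energies.items():
        if name == reference:
            continue
        d_es, d_vdw = es - ref_es, vdw - ref_vdw
        result[name] = (round(d_es, 3), round(d_vdw, 3), round(d_es + d_vdw, 3))
    return result


def bar(total):
    """Text bar whose length tracks the magnitude of a favorable (negative) energy."""
    return "#" * round(-total)


if __name__ == "__main__":
    for name, (es, vdw) in ENERGIES.items():
        print(f"{name:20s} {bar(es + vdw)}")
    for name, (d_es, d_vdw, d_tot) in penalties(ENERGIES).items():
        print(f"{name:20s} electrostatic +{d_es:.3f}  vdW +{d_vdw:.3f}  total +{d_tot:.3f}")
```

Every variant loses energy in both terms. For L-homoisoleucine most of the loss is electrostatic, in line with the observation that a reasonable salt bridge could not be formed.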
We have recently reported evidence using similar computer modeling methodology that the amino acid L-
tryptophan fits very well into apurinic sites derived from its codons/anticodons (14). The amino
acid/nucleic acid complexes formed, e.g. 5' T-TRP-G 3', 3' ACC 5' and 5' T-TRP-A 3', 3' ACT 5', revealed a high degree of specificity. Namely, poor fits were generally found when
attempting to fit L-tryptophan into sites not derived from its codon/anticodons. For example, L-tryptophan
would not fit into the site which accommodates L-isoleucine. Although not shown, L-isoleucine does not fit
well into the L-tryptophan site. Computer modeling studies in progress demonstrate that in general amino
acids not only fit into apurinic/apyrimidinic sites derived from their codons/anticodons but that these fits are
highly specific when measured with energy calculations. It is also of interest that apurinic/apyrimidinic sites
in RNA appear to accommodate the amino acids at least as well if not better than the respective sites in DNA.
It will undoubtedly take a long time to rigorously examine the fits of all translated amino acids as well as
structurally altered isomers into all 64 possible apurinic/apyrimidinic sites. From the available data, it is
reasonable to conclude that the genetic code has an underlying stereochemical rationale based upon the
complementary stereochemical fits of amino acids into apurinic/apyrimidinic sites derived from their double
stranded codon/anticodon triplets. The observation that alterations in the amino acid structures including
changes in chirality from those translated into protein result in poor fits provides strong support for the
stereochemical theory and suggests that the genetic code table evolved from direct interactions of nucleic
acids with candidate amino acids. These findings may also have application in understanding the
established conservation of amino acid residues in proteins which regulate certain genes as described by
Harris et al. (12).
It should not be surprising that the original principles of the physicochemical complementarity of molecules
and their predicted importance in biological function originally described by Pauling and Delbruck would
be applicable to understanding biological coding. In fact, it would indeed be contradictory if Nature's
current systems of biological regulation and transmission of genetic information did not follow these
principles. We have proffered that such complementarity exists between nucleic acids and a wide variety
of biologically active naturally occurring small molecules (15). We also maintain that
these complementary relationships represent a stereochemical logic inherent in gene structure which
dictates constraints on biological structure, function, activity and metabolism (11).
We wish to thank others who have been involved in the studies on the stereochemical basis for the genetic
code including Robert Ivarie, Douglas Ewing, John Henke, Francis Witham, Matt Petersheim, Kristin Douglas, Kerry Hendry, Bryan Hendry and Wendy Hendry.
We also thank the Georgia Research Alliance for partial funding of computer hardware and software
used in this study.
1. Watson, J. D. and Crick, F. H. C. Molecular structure of nucleic acids. (1953) Nature 171, 737.
2. Gamow, G. Possible relation between deoxyribonucleic acid and protein structures. (1954) Nature 173,
3. Nirenberg, M. W. and Matthaei, J. H. The dependence of cell-free protein synthesis in E. coli upon
naturally occurring or synthetic polyribonucleotides. (1961) Proc. Natl. Acad. Sci. 47, 1588-1602.
4. Lengyel, P. R., Speyer, J. R. and Ochoa, S. Synthetic polynucleotides and the amino acid code. (1961)
Proc. Natl. Acad. Sci. 47, 1936-1942.
5. Woese, C. R. The Genetic Code: The Molecular Basis For Genetic Expression. New York: Harper &
6. Lacey, J. C., Jr., Wickramasinghe, N. and Cook, G. W. Experimental studies on the origin of the genetic
code and the process of protein synthesis: a review update. (1992) Origins of Life and Evolution of the
Biosphere 22, 243-275.
7. Hendry, L. B., Bransome, E. D., Jr., Hutson, M. S. and Campbell, L. K. A newly discovered
stereochemical logic in the structure of DNA suggests that the genetic code is inevitable. (1984) Perspect.
Biol. Med. 27, 623-651.
8. Crick, F. H. C. The origin of the genetic code. (1968) J. Mol. Biol. 38, 367-379.
9. Hendry, L. B. and Witham, F. H. Stereochemical recognition in nucleic acid-amino acid interactions and
its implications in biological coding: a model approach. (1979) Perspect. Biol. Med. 22, 333-345.
10. Hendry, L. B., Bransome, E. D., Jr. and Petersheim, M. Are there structural analogies between amino
acids and nucleic acids? (1981) Origins of Life 11, 203-221.
11. Hendry, L. B., Bransome, E. D., Jr., Hutson, M. S. and Campbell, L. K. First approximation of a
stereochemical rationale for the genetic code based on the topography and physicochemical properties of
"cavities" constructed from models of DNA. (1981) Proc. Natl. Acad. Sci. 78, 7440-7444.
12. Harris, L. F., Sullivan, M. R. and Hickok, D. F. Conservation of genetic information: a code for
site-specific DNA recognition. (1993) Proc. Natl. Acad. Sci. 90, 5534-5538.
13. Pauling, L. and Delbruck, M. The nature of the intermolecular forces operative in biological processes.
(1940) Science 92, 77-79.
14. Hendry, L. B., Chu, C. K., Mahesh, V. B., Bransome, E. D., Jr., Hutson, M. S. and Campbell, L. K. (1994)
Compumed 1994 Congress Proceedings, in press.
15. Hendry, L. B. Drug design with a new type of molecular modelling based on stereochemical
complementarity to gene structure. (1993) J. Clin. Pharmacol. 33, 1173-1187.
© 1995 Epress Inc.