Harris et al.
We investigated protein/DNA interactions, using molecular dynamics simulations in solvent computed
for 600 picoseconds in a 10 Angstrom water layer, between the glucocorticoid receptor (GR) DNA
binding domain (DBD) amino acids and DNA of a glucocorticoid receptor response element (GRE)
consisting of 29 nucleotide base pairs. Hydrogen bonding interactions were monitored. In addition, van
der Waals and electrostatic interaction energies were calculated. Amino acids of the GR DBD DNA
recognition helix formed both direct and water mediated hydrogen bonds at cognate codon/anticodon
nucleotide base and backbone sites within the GRE DNA right major groove halfsite. Likewise amino
acids in a beta strand structure adjacent to the DNA recognition helix formed both direct and water
mediated hydrogen bonds at cognate codon/anticodon nucleotide base and backbone sites within both the
GRE right and left major groove halfsites. In addition, amino acids within a predicted alpha helix located
on the carboxyl terminus of the GR DBD interacted at codon/anticodon nucleotide sites on the DNA
backbone of the GRE right major groove flanking nucleotides. These interactions together induced
breakage of Watson-Crick nucleotide base pairing hydrogen bonds, resulting in significant structural
changes and bending of the DNA into the protein.
Is there a code for recognition between DNA regulatory proteins and cognate DNA binding sites?
Biological experiments and molecular models of both prokaryotic and eukaryotic regulatory protein/DNA
interactions have described a very specific event (1-15). However, a code for DNA site-specific
recognition was not detected. In fact, these findings have resulted in a debate regarding the existence of a
recognition code (16-19).
Our laboratory has long been interested in the origin of the genetic code and DNA site specific
recognition by DNA regulatory proteins and has made several key observations. We observed and
reported that genetic information is conserved between prokaryotic and eukaryotic DNA regulatory
proteins' DNA binding domains and their cognate sites on DNA to which they specifically bind, operators
or response elements (18-19). As an example, we reported that genetic information is conserved between
the DNA sequence of a well characterized glucocorticoid response element (GRE) (GenBank locus
MMTPRGR1) and its flanking nucleotides and the c-DNA encoding the glucocorticoid receptor (GR) DNA
binding domain (DBD) (GenBank locus HUMGCRA) (19).
The GR DBD consists of 150 amino acids which fold into a structural motif of two "zinc finger" modules
(20). Using genetic sequence search techniques, we were the first to locate and describe the GR DNA
recognition alpha helix on the carboxyl terminus of the first zinc finger. We discovered the GR DNA
recognition helix by observing that its encoding c-DNA shares genetic information with a GRE (19). By
model building, we observed that amino acids of the DNA recognition helix were aligned with their
cognate codon-anticodon nucleotides within the GRE DNA right major groove halfsite. This conservation
of genetic information allowed us to hypothesize a code for DNA site specific recognition based on a
stereochemical relationship between functional sites on amino acids and their codon-anticodon
nucleotides.
The genomic structure of the human GR gene has been determined to consist of ten exons (21). The two
zinc fingers of the DBD are separately encoded by two of the ten exons, 3 and 4. The DNA recognition
helix is encoded in exon 3 at the splice junction site of exons 3 and 4. Adjacent to the DNA recognition
helix is a beta strand structure encoded in exon 4. The recognition helix and beta strand structures are
spliced at a conserved Gly residue and serve as a bridge which joins the two zinc fingers. Earlier, we
observed that the carboxyl terminus of the GR DBD contains a predicted alpha helix structure encoded in
exon 5 at the exon 4 and 5 splice junction site. We separately compared the nucleotide sequences of GR
DBD exons 3, 4 and 5 with a nucleotide sequence (GenBank locus MMTPRGR1) known to contain GRE
sites upstream of the mouse mammary tumor virus gene transcription start site (22). We observed
nucleotide subsequence similarity between a well characterized GRE and its flanks and nucleotide
sequences on the ends of exons 3, 4 and 5 at their splice junction sites. These sequences encode the DNA
recognition helix in exon 3, a beta strand in exon 4 and a structure predicted to be an alpha helix in exon 5.
By model building, we observed that amino acids located within the DNA recognition helix, the beta
strand, and the predicted alpha helix of the GR DBD as described above are spaced so that they align with
trinucleotides identical to cognate codon/anticodon nucleotides within the GRE major groove halfsites
and flanking regions (22). These findings suggested that these GR DBD amino acids may interact with
their codon-anticodon nucleotides within the GRE and its flanks.
Recently, using molecular dynamics simulations in solvent, we investigated protein/DNA interactions
between the GR DBD amino acids and the GRE and its flanking nucleotides. We compared findings from
a fully solvated 80 Angstrom water droplet GR DBD/GRE model with those from a 10 Angstrom water
layer GR DBD/GRE model. Our findings indicated that the interactions between the GR DBD amino acids
and the nucleotides of the GRE were independent of the hydration shell (23).
In the present study, we conducted 600 picoseconds of molecular dynamics simulations in a 10 Angstrom
water layer model, investigating interactions between the GR DBD amino acids and the GRE and its
flanking nucleotides. Hydrogen bonding interactions were monitored. In addition, van der Waals and
electrostatic interaction energies were calculated. The findings indicate that GR DBD amino acids of the
DNA recognition helix have preferential electrostatic attraction toward their cognate codon-anticodon
nucleotides and form both direct and water mediated hydrogen bonds at these nucleotide base and
backbone sites within the GRE right major groove halfsite. Likewise, amino acids of the beta strand and
the predicted alpha helix, described above, form hydrogen bonds with nucleotide base and DNA backbone
sites at cognate codon/anticodon nucleotides within the GRE major groove halfsites and GRE flanking
regions, respectively. These interactions together induce breakage of Watson-Crick nucleotide base
pairing hydrogen bonds, resulting in significant structural changes and bending of the DNA toward the protein.
MATERIALS AND METHODS
The model of the GR DBD dimer used in this study (see figure 7)
was derived from NMR atomic coordinates of the GR
DBD (personal communication, Kaptein) (20). However, residues following Arg 510 in the NMR GR
DBD structural determination were disordered, and no coordinates were reported. The amino acid
sequence ranging from Arg 510 to Lys 517 contained a predicted alpha helix encoded by exon 5 which
we reported earlier to have genetic similarity to the GRE flanking nucleotide regions (22). Therefore, in
order to study potential interactions between amino acids of this predicted alpha helix and nucleotides
flanking the GRE, we used the QUANTA program (24) from Molecular Simulations Inc. to build an
alpha helix from the exon 5 encoded amino acids 511 to 517 and attached this structure to Arg
510. This modified GR DBD structure was used in a 10 Angstrom water layer model to study GR DBD
amino acid interactions with a GRE and its flanking nucleotides. A model of B-form DNA of a naturally
occurring MMTV GRE from GenBank locus MMTPRGR1 in which we observed genetic similarity with
the c-DNA encoding the GR DBD (19, 22) was likewise created using the NUCLEIC ACID BUILDER
module from the QUANTA program (24). Solvated molecular dynamics simulation of the NMR GR
DBD/GRE model is described below.
The solvated molecular dynamics simulations were run on a CRAY YMP C-90 supercomputer using a
specially optimized version of CHARMm (release version 22.1) which has an atom limit of 15,000. The
10 Angstrom water layer model required 6717 water atoms and the GR/GRE protein/DNA complex
consisted of 2908 atoms, resulting in a model of 9625 total atoms. The molecular dynamics simulation
required 0.6 CRAY C-90 CPU hours of computational resources per picosecond of simulation.
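The atom count and the cost of the production run follow directly from the figures quoted above. The short Python sketch below reproduces that arithmetic. The constant and function names are illustrative and are not CHARMm identifiers, and the total cost is an extrapolation that assumes the quoted per-picosecond rate holds for the whole 600 picosecond production run.

```python
# Figures quoted in the text; names are illustrative, not CHARMm identifiers.
WATER_ATOMS = 6717        # atoms in the 10 Angstrom water layer
COMPLEX_ATOMS = 2908      # atoms in the GR/GRE protein/DNA complex
ATOM_LIMIT = 15_000       # atom limit of the optimized CHARMm build
CPU_HOURS_PER_PS = 0.6    # CRAY C-90 CPU hours per picosecond
PRODUCTION_PS = 600       # length of the production run in picoseconds


def total_atoms(water_atoms: int, complex_atoms: int) -> int:
    """Total size of the solvated model."""
    return water_atoms + complex_atoms


def production_cpu_hours(picoseconds: float, hours_per_ps: float) -> float:
    """Estimated CPU hours, assuming a constant per-picosecond cost."""
    return picoseconds * hours_per_ps


if __name__ == "__main__":
    atoms = total_atoms(WATER_ATOMS, COMPLEX_ATOMS)
    print(f"model size: {atoms} atoms (limit {ATOM_LIMIT})")
    hours = production_cpu_hours(PRODUCTION_PS, CPU_HOURS_PER_PS)
    print(f"estimated production cost: {hours:.0f} CPU hours")
```

Under these assumptions the model stays within the atom limit, and the production run alone corresponds to roughly 360 CPU hours.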
The solvated model was minimized for 200 cycles using the Steepest Descents method. Then the
structure was minimized for 100 cycles using the Adopted Basis Newton-Raphson method. Heating was
run for 600 cycles, at 0.001 ps per cycle for a total of 0.6 picoseconds, resulting in a 0.5 degree K
temperature increase per cycle (from 0 to 300 degrees K). Equilibration was run for 1000 cycles (1 picosecond),
resulting in an overall temperature RMS deviation of approximately 3 degrees K. Finally, molecular
dynamics were run with a step size of 0.001 picoseconds for an additional 600 picoseconds (600,000
cycles) using velocity scaling. A constant dielectric potential with an ε value of 1.00 was used. A
non-bonded cutoff of 15.00 Angstroms was used. Non-bonded parameters were updated every 20 cycles and
all energy terms were computed. For a detailed discussion of the CHARMm potential energy function see
reference (25) and for a review of molecular dynamics implementation in the biological sciences see
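The stages of the protocol described above can be laid out as a simple schedule. The Python sketch below is not a CHARMm input file; it only records the cycle counts quoted in the text and derives the simulated time of each stage. The class and function names are hypothetical, and the heating ramp is assumed to be linear, which is consistent with the quoted 0.5 degree K increase per cycle.

```python
from dataclasses import dataclass

TIMESTEP_PS = 0.001  # step size quoted in the text


@dataclass(frozen=True)
class Stage:
    name: str
    cycles: int
    is_dynamics: bool  # minimization cycles advance no simulated time


PROTOCOL = (
    Stage("steepest descents minimization", 200, False),
    Stage("ABNR minimization", 100, False),
    Stage("heating", 600, True),
    Stage("equilibration", 1000, True),
    Stage("production dynamics", 600_000, True),
)


def duration_ps(stage: Stage) -> float:
    """Simulated time covered by a stage, in picoseconds."""
    return stage.cycles * TIMESTEP_PS if stage.is_dynamics else 0.0


def heating_target_kelvin(cycle: int, start: float = 0.0,
                          end: float = 300.0, n_cycles: int = 600) -> float:
    """Target temperature after `cycle` heating cycles (linear ramp assumed)."""
    if not 0 <= cycle <= n_cycles:
        raise ValueError("cycle must lie within the heating stage")
    return start + (end - start) * cycle / n_cycles


if __name__ == "__main__":
    for stage in PROTOCOL:
        print(f"{stage.name}: {stage.cycles} cycles, {duration_ps(stage):.1f} ps")
```

Listing the stages this way makes it easy to check that the heating, equilibration and production cycle counts agree with the durations stated in the text.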
Explicit sodium counter-ions were used in the DNA model, based on geometry provided by Don Gregory
Ph.D. from Molecular Simulations Inc. Zinc atoms were placed in the GR structure and tetrahedrally
coordinated with the sulfur atoms from the "zinc-finger" cysteines. The residue topology file (RTF) for
the "zinc-finger" cysteines was altered and a new residue type was created 'ZCY' (for zinc binding
cysteine) in which the negative charges on the sulfur atoms were increased from -0.19 to -0.50 so that
the charges from the four tetrahedrally coordinated cysteine sulfur atoms would neutralize the +2.0
charge on the zinc atom. In addition, the charges on the zinc binding cysteine beta carbons were increased
from +0.19 to +0.40 and the charges on the alpha carbons were increased from +0.10 to +0.20 in order
to maintain the ZCY residue at a net 0.0 charge.
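The charge bookkeeping for the ZCY residue can be checked with a few lines of arithmetic. The Python sketch below uses only the partial charges quoted above. The atom labels SG, CB and CA are assumed here as conventional names for the cysteine sulfur, beta carbon and alpha carbon, and the function names are hypothetical.

```python
import math

ZINC_CHARGE = 2.0
CYSTEINES_PER_ZINC = 4

# (original charge, modified charge) for each altered atom of the ZCY residue.
# Atom labels are assumed conventional names, not taken from the RTF itself.
CHARGE_CHANGES = {
    "SG": (-0.19, -0.50),  # sulfur
    "CB": (0.19, 0.40),    # beta carbon
    "CA": (0.10, 0.20),    # alpha carbon
}


def zinc_plus_sulfur_charge(sulfur_charge: float) -> float:
    """Sum of the zinc charge and the four coordinating sulfur charges."""
    return ZINC_CHARGE + CYSTEINES_PER_ZINC * sulfur_charge


def net_residue_shift(changes: dict) -> float:
    """Net change in residue charge caused by the modified partial charges."""
    return sum(new - old for old, new in changes.values())


if __name__ == "__main__":
    print(f"zinc + 4 sulfurs, original: {zinc_plus_sulfur_charge(-0.19):+.2f}")
    print(f"zinc + 4 sulfurs, modified: {zinc_plus_sulfur_charge(-0.50):+.2f}")
    print(f"net shift per ZCY residue:  {net_residue_shift(CHARGE_CHANGES):+.2f}")
```

The check shows the two properties the text relies on: four sulfur charges of -0.50 offset the +2.0 zinc charge, and the changes on the sulfur, beta carbon and alpha carbon cancel so that the net charge of the residue is unchanged.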
DNA Groove Geometry Calculations:
The conformational changes of the DNA during dynamics were evaluated using the CURVES 4.1
program provided by Richard Lavery of Laboratoire de Biochimie Theorique CNRS (personal
communication). The documentation provided describes CURVES as "an algorithm for calculating a
helical parameter description for any irregular nucleic acid segment with respect to an optimal, global
helical axis. The solution is obtained by minimizing a function which represents the variations in helical
parameters between successive nucleotides as well as quantifying the kinks and dislocations which exist
between successive helical axis segments". For more detailed information
regarding the CURVES 4.1 program see references (27-28).
Interaction Energy Calculations:
Graphs of initial interaction energy between GRE nucleotides and selected GR DBD amino acids were
calculated using CHARMm (25-26). In all graphs, interaction energy was calculated using a constant
dielectric potential with an ε value of 1.00. "Total Energy" is the sum of electrostatic interaction energy
and van der Waals interaction energy. The values given for the interaction of particular amino acid and
nucleotide residues are the sum of the interaction energies of all atoms in those residues.
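The residue-level sum described above can be sketched in Python as follows. This is a minimal illustration, not the CHARMm implementation: it assumes a Coulomb term with a constant dielectric and a 12-6 Lennard-Jones term with geometric-mean well depths and additive minimum distances, uses an assumed conversion factor of 332.0716 kcal Angstrom/(mol e^2), and applies no non-bonded cutoff. The `Atom` class and all function names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple

COULOMB_CONSTANT = 332.0716  # kcal*Angstrom/(mol*e^2); assumed conversion factor


@dataclass(frozen=True)
class Atom:
    position: Tuple[float, float, float]  # Angstroms
    charge: float                         # units of elementary charge
    rmin_half: float                      # half of the LJ minimum distance, Angstroms
    well_depth: float                     # LJ well depth, kcal/mol (positive)


def electrostatic(a: Atom, b: Atom, dielectric: float = 1.0) -> float:
    """Coulomb energy of one atom pair with a constant dielectric."""
    r = math.dist(a.position, b.position)
    return COULOMB_CONSTANT * a.charge * b.charge / (dielectric * r)


def van_der_waals(a: Atom, b: Atom) -> float:
    """12-6 Lennard-Jones energy of one atom pair."""
    r = math.dist(a.position, b.position)
    rmin = a.rmin_half + b.rmin_half
    depth = math.sqrt(a.well_depth * b.well_depth)
    ratio6 = (rmin / r) ** 6
    return depth * (ratio6 * ratio6 - 2.0 * ratio6)


def residue_interaction(res_a: Sequence[Atom], res_b: Sequence[Atom],
                        dielectric: float = 1.0) -> Tuple[float, float, float]:
    """Return (electrostatic, van der Waals, total) summed over all atom pairs."""
    elec = sum(electrostatic(a, b, dielectric) for a in res_a for b in res_b)
    vdw = sum(van_der_waals(a, b) for a in res_a for b in res_b)
    return elec, vdw, elec + vdw
```

Each residue is passed as a list of atoms, so the returned total corresponds to the "Total Energy" defined above for one amino acid/nucleotide pair.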
Hydrogen Bond Calculations:
The hydrogen bond interactions for the 10 Angstrom water layer GR/GRE model were recorded at 1.0
picosecond (1000 cycle) intervals. Frequencies of H-bonding interactions (see table 1) greater than 600 reflect
multiple hydrogen bonds (i.e. when two or more of the grouped atoms from one residue interact with the
same atom from another residue) for a given amino acid/nucleotide interaction. The hydrogen bonding
interactions between amino acids encoded by exons 3, 4 and 5 of the GR DBD and nucleotides of the
GRE and flanking regions were monitored. We used a distance-angle algorithm to compute hydrogen
bonds which was based on the results of analysis of hydrogen bonding in proteins (29). The value used for
the maximum distance allowed between the hydrogen atom and the acceptor was 2.5 angstroms. The value
used for the maximum distance allowed between the atom bearing the hydrogen and the acceptor was 3.3
angstroms. The minimum angle at the acceptor was 90 degrees (limit = 0 to 180 degrees). The minimum
angle at the hydrogen was 90 degrees (limit = 0 to 180 degrees). The minimum angle at the atom bearing
the hydrogen was 90 degrees (limit = 0 to 180 degrees).
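A distance-angle test of this kind can be sketched in a few lines of Python. The example below is a simplified illustration and not the algorithm of reference (29): it applies the two distance limits and the minimum angle at the hydrogen, taken here to be the donor-hydrogen-acceptor angle, which is an assumption about the geometric definition. The angles at the acceptor and at the atom bearing the hydrogen require the coordinates of additional bonded atoms and are omitted for brevity. All names are hypothetical.

```python
import math
from typing import Sequence

MAX_H_ACCEPTOR = 2.5      # Angstroms, hydrogen to acceptor
MAX_DONOR_ACCEPTOR = 3.3  # Angstroms, atom bearing the hydrogen to acceptor
MIN_ANGLE_DEG = 90.0      # minimum angle at the hydrogen


def angle_deg(a: Sequence[float], b: Sequence[float], c: Sequence[float]) -> float:
    """Angle in degrees at point b formed by the points a-b-c."""
    v1 = [ai - bi for ai, bi in zip(a, b)]
    v2 = [ci - bi for ci, bi in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return math.degrees(math.acos(cosine))


def is_hydrogen_bond(donor: Sequence[float], hydrogen: Sequence[float],
                     acceptor: Sequence[float]) -> bool:
    """Apply the two distance limits and the angle limit at the hydrogen."""
    if math.dist(hydrogen, acceptor) > MAX_H_ACCEPTOR:
        return False
    if math.dist(donor, acceptor) > MAX_DONOR_ACCEPTOR:
        return False
    return angle_deg(donor, hydrogen, acceptor) >= MIN_ANGLE_DEG
```

A call such as `is_hydrogen_bond((0, 0, 0), (1, 0, 0), (2.9, 0, 0))` passes all three tests, whereas moving the acceptor to `(4, 0, 0)` fails the hydrogen-acceptor distance limit.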
RESULTS AND DISCUSSION
Recently, we reported that nucleotide subsequence similarity exists between a well characterized GRE
and its flanking nucleotides and the c-DNA which encodes amino acids of the GR DBD (22). We also
observed by model building that amino acids encoded at the splice junctions of exons 3, 4 and 5 of the GR
DBD are aligned with their cognate codon/anticodon nucleotides within the GRE right and left major
groove halfsites and flanks. This includes amino acids of the GR DNA recognition helix encoded in exon
3, a beta strand encoded in exon 4, adjacent to the DNA recognition helix and amino acids of a predicted
alpha helix encoded in exon 5 at the exon 4 and 5 splice junction site (see figure 1 A-H). These findings
suggested that the amino acids within the above structures may interact with their cognate
codon/anticodon nucleotides within the GRE and its flanks. To investigate this possibility, we docked the
GR DBD dimer at H-bonding distance within the DNA major groove halfsites of the GRE. Using the
CHARMm program, we conducted 600 picoseconds of molecular dynamics. A GR DBD/ 29 bp GRE
model, without the water molecules, is shown in figure 2. In this model the GR DBD is docked at
approximately 10 Angstroms from the 29 bp GRE and flanking nucleotides for visual clarity. This model
is to be used as a key for locating interactions found between the GR DBD amino acids and nucleotides
of the GRE and its flanks during molecular dynamics, see table 1.
Amino Acid-Nucleotide Hydrogen Bonding Interactions:
Hydrogen bonding interactions between amino acids encoded by exons 3, 4 and 5 of the NMR GR DBD
and nucleotides of the GRE and flanking regions were monitored. A summary of H-bonding interactions
is shown in table 1. Equivalent functional sites on the amino acids are grouped: Lysine hydrogen bond
donor sites HZ1, HZ2 and HZ3 are combined as HZ. Arginine hydrogen bond donor sites HH11 and
HH12 are combined as HH1 and hydrogen bond donor sites HH21 and HH22 are combined as HH2.
Glutamine hydrogen bond donor sites HE21 and HE22 are combined as HE2. Asparagine hydrogen bond
donor sites HD21 and HD22 are combined as HD2. Glutamic acid hydrogen bond acceptor sites OE1 and
OE2 are combined as OE. Likewise, the DNA backbone phosphate group hydrogen bond acceptor sites
O1P and O2P are combined as OP. First and last occurrences of DNA/protein hydrogen bonds over the
full simulation, which includes the minimization, heating and equilibration steps followed by the 600
picosecond production dynamics simulation, are given in picoseconds of dynamics with their frequency of
occurrence. Individual hydrogen bonds are labeled "C" for amino acid-nucleotide codon interactions,
"AC" for amino acid-nucleotide anticodon interactions, "C*" and "AC*" for amino acid-nucleotide codon
and anticodon interactions when the codon or anticodon sequence is present reading 3' to 5'.
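The grouping of equivalent sites and the first, last and frequency values can be illustrated with a short Python sketch. The record format, a tuple of time, residue, residue atom, nucleotide and nucleotide atom, is a hypothetical layout chosen for the example and is not the format of the original data reduction programs. The grouping table contains only the sites named above.

```python
from typing import Dict, Iterable, Tuple

# Equivalent functional sites, as grouped in the text.
ATOM_GROUPS = {
    "HZ1": "HZ", "HZ2": "HZ", "HZ3": "HZ",   # lysine
    "HH11": "HH1", "HH12": "HH1",            # arginine
    "HH21": "HH2", "HH22": "HH2",            # arginine
    "HE21": "HE2", "HE22": "HE2",            # glutamine
    "HD21": "HD2", "HD22": "HD2",            # asparagine
    "OE1": "OE", "OE2": "OE",                # glutamic acid
    "O1P": "OP", "O2P": "OP",                # DNA backbone phosphate
}

Record = Tuple[float, str, str, str, str]
Key = Tuple[str, str, str, str]


def group_atom(name: str) -> str:
    """Map an atom name to its group label, or return it unchanged."""
    return ATOM_GROUPS.get(name, name)


def summarize(records: Iterable[Record]) -> Dict[Key, Tuple[float, float, int]]:
    """Return (first ps, last ps, frequency) for each grouped interaction."""
    summary: Dict[Key, Tuple[float, float, int]] = {}
    for time_ps, residue, atom, nucleotide, nuc_atom in records:
        key = (residue, group_atom(atom), nucleotide, group_atom(nuc_atom))
        first, last, count = summary.get(key, (time_ps, time_ps, 0))
        summary[key] = (min(first, time_ps), max(last, time_ps), count + 1)
    return summary
```

Because grouped atoms are counted together, two hydrogens of the same lysine bonding to one acceptor in the same frame add two to the frequency, which is how a frequency can exceed the number of recorded frames.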
It can be seen in table 1 that the majority of H-bonding interactions for exon 3 encoded amino acids of the
DNA recognition helix occur at codon/anticodon nucleotide sites within the GRE right major groove
halfsite. Likewise, H-bonding interactions between the GRE right major groove halfsite and its flanking
nucleotides and the exon 4 and exon 5 encoded amino acids of the beta strand and predicted alpha helix
respectively occur at codon/anticodon nucleotide sites. In contrast, in the left major groove halfsite, with
the exception of amino acid V 468 of the DNA recognition helix encoded in exon 3, just those amino
acids of the exon 4 encoded beta strand form H-bonds at codon/anticodon nucleotide sites (see table 1).
Electrostatic and van der Waals interactions:
We calculated van der Waals and electrostatic interaction energies between amino acids of the GR DBD
and nucleotides on the sense and antisense strands of the GRE and its flanks. Calculations were
performed on the minimized, heated and equilibrated structures at the beginning of the dynamics
simulation and after 600 picoseconds in order to analyze the attractive forces between GR DBD DNA
recognition helix amino acids and GRE nucleotides. A total interaction energy (kcal/mol) consisting of
both van der Waals and electrostatic energy was determined. Total energy values were recorded for the
hydrophilic amino acids of the GR DNA recognition helix and nucleotide base pairs within the GRE DNA
right major groove halfsite. The maximal attractive energy potential for Lys 461, Lys 465, Arg 466 and
Glu 469 sidechains was with their cognate codon or anticodon nucleotide base pairs found within a
palindromic sequence, 5'-AAGAA-3'-5'-TTCTT-3', which has codons for Lys (AAG), Arg (AGA) and Glu
(GAA) in both directions, 5'-to-3' and 3'-to-5', in the GRE DNA right major groove halfsite (see figure
3A, C-E). In addition, Val 462 showed a strong van der Waals interaction at the middle nucleotide of its
codon GTT on the sense strand (see figure 3B). The van der Waals interaction of Val 462 at the middle
nucleotide of its codon site in the right major groove halfsite agrees with our original prediction for this
amino acid (19) which was recently confirmed by the findings of Luisi et al. (14). At the beginning of
dynamics (0 picoseconds), the maximal attractive energy potential for Arg 466 was not directed toward
its codon/anticodon nucleotide base pair (see figure 3D). However, during molecular dynamics, Arg 466
showed strong attractive energy potential for its codon nucleotide G38, AGA (see figure 3D). Our
results show global electrostatic attraction for GR DNA recognition helix amino acids toward their
cognate codon/anticodon nucleotides within the GRE right major groove halfsite. In addition, Gln 471 of
the exon 4 encoded beta strand has maximal attractive energy potential for its codon nucleotide on the
sense strand, CAA, reading 3'-to-5' (see figure 3F).
Specific Amino Acid-Nucleotide Interactions:
The GR DBD is reported to preferentially and specifically bind to the GRE right major groove halfsite
containing the 5'-TGTTCT-3'-5'-AGAACA-3' recognition sequence as a monomer which in turn facilitates
cooperative dimerization and subsequent non specific interaction with nucleotides of the adjacent left
major groove halfsite (30). We reported earlier that genetic information is conserved within the GRE
right major groove halfsite for amino acids of the exon 3 encoded DNA recognition helix
(22). We also reported that genetic information is conserved within both the GRE left and
right major groove halfsites and flanking regions for amino acids of the exon 4 encoded beta strand and
the exon 5 encoded amino acids of a putative DNA binding alpha helix, respectively (23). Amino acids Lys
461, Lys 465 and Arg 466 of the GR DNA recognition helix are conserved at similar positions within the
DNA recognition helices of the steroid receptor family; these amino acids of the GR DNA recognition
helix have been reported to specifically bind DNA at GRE sites and regulate gene transcription (31). We
observed that amino acids Lys 461, Lys 465 and Arg 466 form both direct and water mediated
multidentate H-bonds at cognate codon/anticodon nucleotide base sites within the GRE right major
groove halfsite, as shown in table 1 (see figures 1A-H and 2 for reference). Close up views of specific
amino acid-nucleotide H-bonding interactions are shown in figure 4A-J. In figure 4A, water mediated
H-bonds between the sidechain of Lys 461 and its codon nucleotide A36 at the N7 base site are shown; water
mediated H-bonding between Lys 461 and anticodon nucleotide C21 at the H41 site is also shown. Our
molecular dynamics simulations also indicate that Val 462 of the DNA recognition helix has a van der Waals
interaction with its codon nucleotide T19 within the GRE right major groove halfsite (see figure 4B).
H-bonding interactions for amino acid Lys 465 are shown in figure 4C. Lysine 465 forms direct H-bonds
with its codon nucleotide G38 at the N7 and O6 base sites. Lysine 465 also forms a direct H-bond with
the N7 base site of its codon nucleotide A37, as well as a direct H-bond with the O4 base site of
nucleotide T20. These interactions, in concert, disrupt the Watson-Crick (WC) H-bonds between
C21-G38, as can be seen in figure 4C. It is interesting to note that methylation of G38 has been reported to
inhibit site specific DNA binding by the GR protein (32). In figure 4D, H-bonding interactions are shown
for Arg 466: direct H-bonds are formed with its codon nucleotide A39 at the OP backbone site and with
the codon nucleotide G38 at the O5' and OP backbone sites. Glutamic acid 469 forms a water mediated
H-bond with its codon nucleotide G38 at the phosphate backbone (see figure 4E). Within the right major
groove halfsite, methylation of G18 has also been reported to inhibit site specific DNA binding by the GR
protein (32). Our results show that amino acids, Gln 471 and Asn 473, of the beta strand encoded in exon
4 at the splice junction site of exons 3 and 4 form both direct and water mediated H-bonds with
nucleotide base sites O6 and N7, respectively, on their anticodon nucleotide G18 (see table 1 and figure 4F).
In addition, we recently observed that flanking the GRE major groove halfsites are sequences rich in
purines, 5'-TAAAACGA-3' on the right and 3'-TCAAAAAC-5' on the left. These sequences contain
codons/anticodons for a cluster of hydrophilic amino acids (Arg 510, Lys 511, Thr 512, Lys 513, Lys
514, Lys 515, Ile 516 and Lys 517) located within the predicted alpha helix on the carboxyl end of the GR
DBD (22) (see figures 1 and 2). It is interesting to note that the GRE and flanking nucleotide sequence in
which we observed maximal nucleotide subsequence similarity to the GR DBD (22) is identical in
sequence and location within the MMTV 5' LTR to that described by Scheidereit et al. as a GR binding site
using nuclease footprinting (32-33) (see figure 2). It is also interesting that the GR amino acids ranging
from 510-517 are related in sequence to the nuclear localization signal of the simian virus SV40 T-
antigen: Pro, Pro, Lys, Lys, Lys, Arg, Lys and Val (34). We report herein that amino acids Arg 510, Lys
513 and Lys 517 of the predicted alpha helix within the NMR GR DBD right monomer form both direct
and water mediated H-bonds on the DNA backbone at codon/anticodon sites of the GRE right major
groove halfsite and flanking nucleotide region, see table 1 and figure 4H-J. These interactions together
induce DNA bending into the protein.
Hydrogen bonding interactions between exon 3 encoded amino acids of the left GR DBD monomer of
the dimer and nucleotides of the GRE left major groove halfsite involve the same amino acids as seen in
the right GR DBD monomer and occur at equivalent dyad symmetrical nucleotide positions as in the GRE
right major groove halfsite. However, the wild type GRE major groove halfsites consist of an imperfect
palindrome of the 5' TGTTCT 3' recognition sequence which occurs in the right major groove halfsite;
the sequence 5'TGTAAC 3' occurs in the left major groove halfsite. Therefore codon/anticodon
nucleotide sites for Lys 461, Lys 465, Arg 466 and Glu 469 of the DNA recognition helix are not present
in the GRE left major groove halfsite (see figure 1A, C-D) and interactions occur at non-codon
nucleotide sites (see table 1). However, methylation of G47 in the left major groove halfsite is reported
to inhibit site specific DNA binding by the GR protein dimer (32). Our results show that Gln 471 and Tyr
474 form both direct and water mediated H-bonds with their codon/anticodon nucleotide base pair C12-
G47, see table 1. This observation along with the specific atomic interactions which take place between
the GR DNA recognition helix amino acids and their cognate codon/anticodon nucleotide bases within the
GRE right major groove halfsite as described above, see table 1, figure 1A-D, figures 3A-F and 4A-J,
supports our hypothesis that conservation of genetic information is a determinant of site specific DNA
recognition and binding. Furthermore, the overall richness in codon/anticodon nucleotides for amino
acids of the GR DBD DNA recognition helix encoded in exon 3, the beta strand in exon 4 and predicted
alpha helix in exon 5, coupled with the atomic interactions by these amino acids at their conserved
codon/anticodon sites in the GRE major groove halfsites and flanking regions
(see figure 1A-H, table 1, figure 3A-F and figure 4A-J), offers an explanation for the DNA binding
preference reported for the GR at this particular GRE site (36) as opposed to the other GRE sites
available in the LTR upstream of the MMTV gene initiation site.
Nucleotide-Nucleotide Hydrogen Bonding Interactions for the 29 BP GRE DNA:
Hydrogen bonding interactions between sense and antisense strand GRE nucleotides during 600
picoseconds of molecular dynamics on the GR DBD/GRE 29 BP DNA model are shown in table 2. A loss
of one or more canonical Watson-Crick (WC) H-bonds can be seen occurring predominantly in the right
major groove halfsite at nucleotide base pairs: C16-G43, T17-A42, G18-C41, T20-A39, C21-G38, T22-
A37 and T23-A36. The majority of the loss in canonical H-bonds for these nucleotide base pairs can be
accounted for by amino acid-nucleotide interactions at WC sites as shown in table 1 and by the non-
canonical nucleotide-nucleotide H-bonding interactions shown in table 2. In the left major groove
halfsite a complete loss of canonical WC H-bonds, during the entire dynamics simulation, occurs between
nucleotide base pair A11-T48 due largely to the H-bonding interactions occurring between amino acid Arg
466 at WC sites on A11-T48 as shown in table 1; water mediated H-bonding interactions also occurred
between amino acid Gln 471 and nucleotide A11 at the H62 WC site, table 1. The WC H-bonding for the
other nucleotide base pairs of the GRE left major halfsite are predominantly canonical throughout the
simulation.
Structural Changes in GR DBD Protein/GRE DNA Complex:
During molecular dynamics of the GR DBD/GRE complex, structural changes occur in both the DNA and
protein. The DNA appears to wrap around the GR DBD DNA recognition alpha helices. In addition,
nucleotides flanking the GRE major groove halfsites are drawn into amino acids of the exon 5 encoded
predicted alpha helix. The minor groove between the left and right GRE major groove halfsite is
compressed. The nucleotides in particular within the GRE right major groove halfsite show a loss of
canonical WC base pairing H-bonds (see table 2). In addition, nucleotides flanking the right major groove
halfsite show a decrease in minor groove width (see figure 5A-D). A closeup view of the GRE right
major groove nucleotide sequence is shown in figure 5D. A loss of WC canonical H-bonding is apparent
at nucleotide pairs G18-C41 and C21-G38. Methylation of guanine at these sites has been shown to
inhibit binding of the GR protein (32). To further illustrate the geometric changes in the GRE DNA, using
the CURVES program (27-28), GRE DNA major and minor groove width was analyzed after 600
picoseconds of molecular dynamics for the GR DBD/GRE model compared to GRE DNA at 0
picoseconds (see figure 6). The DNA major and minor groove widths determined at 0 picoseconds, 11.4
and 5.6 Angstroms, respectively, are in close agreement with values reported for canonical B-DNA
duplexes (28). These values were used to monitor changes in DNA major and minor groove width during
molecular dynamics. An increase in width in the GRE right major groove halfsite can be seen. This
observation is in agreement with results from GR DBD/GRE co-crystal findings (14). A decrease in
minor groove width between the GRE DNA major groove halfsites was also observed. Similar findings
have also been reported for certain prokaryotic DNA regulatory protein/DNA complexes (28). In
addition, nucleotides of the minor groove flanking the GRE right major groove showed a marked
decrease in width within the poly A/T sequence reflecting bending into the GR DBD protein. Similar
findings of DNA bending have been reported for other DNA regulatory protein/DNA complexes
Interactive molecular models of the GR/GRE complex before and after 600 picoseconds of molecular dynamics
are shown in
In addition an MPEG movie showing the structural changes occurring in the GR DBD/GRE model
during 600 picoseconds of molecular dynamics is shown in
Our findings, reported herein, show that amino acids Lys 461, Lys 465, and Arg 466 of the GR DNA
recognition helix encoded in exon 3 and Gln 471 of the beta strand encoded at the splice junction site of
exons 3 and 4 adjacent to the GR DNA recognition helix specifically form both direct and water mediated
H-bonds at their cognate codon/anticodon nucleotide base sites within the 5'-CTGTTCTT-3' -5'-
AAGAACAG-3' recognition motif. In addition, Val 462 interacts by van der Waals with the middle
nucleotide of its codon, GTT, and Glu 469 has strong electrostatic attraction toward its codon nucleotide
A39, GAA. Therefore recognition of codon-anticodon nucleotides within the GRE DNA right major
groove halfsite by amino acids of the GR DNA recognition helix offers an explanation for the GR DNA
binding preference to the GRE major groove halfsite which contains the 5'-TGTTCT-3'-5'-AGAACA-3'
recognition sequence.
Our findings indicate that GR site specific DNA recognition involves overlapping reading frames. In
addition, our findings also suggest that site specific DNA recognition may be bi-directional, that is, amino
acids may recognize their cognate codon/anticodon nucleotides reading 5'-to-3' or 3'-to-5'. This appears
to be the case in the naturally occurring GRE right major groove halfsite palindrome sequence 5'-AAGAA-
3' on the antisense strand which has codon nucleotides for hydrophilic amino acids Lys (AAG), Arg
(AGA), and Glu (GAA) of the GR DNA recognition helix in overlapping reading frames in both
directions. These observations offer an explanation as to why more than one amino acid can interact with
the same nucleotide and vice-versa (11) and still satisfy site specific DNA recognition according to our
hypothesis. Unlike the 5'-TGTTCT-3' 5'-AGAACA-3' recognition motif which is conserved within the
right major groove halfsite of GREs, the nucleotide sequences of the GRE flanking regions are not
conserved (38). However, we detected conservation of genetic information between both flanks of a
GRE and the GR DBD exon 5 encoded predicted alpha helix (22), see figure 1A, C. It is significant that
this same GRE site, among the several located within the LTR nucleotide sequence upstream of the
transcription start of the MMTV gene, has been reported to preferentially bind GR and have the highest
transcription enhancing activity (35). Therefore, our findings indicate that conservation of genetic
information (19,22) and the corresponding atomic interactions of amino acids of the GR DBD DNA
recognition helix, beta strand and predicted alpha helix with cognate codon/anticodon nucleotides within a
GRE and its flanking DNA sequence as reported herein are correlated with both DNA site specific
recognition and transcription enhancement.
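The overlapping, bi-directional reading of the 5'-AAGAA-3' sequence discussed above can be checked with a short Python sketch. The codon table is deliberately limited to the codons named in the text (standard genetic code, written in the DNA alphabet), and the function names are hypothetical. Reading 3'-to-5' is modeled here simply as reading the same strand in reverse order, which is an assumption about how the two directions are compared.

```python
from typing import Dict, List

# Only the codons named in the text are listed.
CODONS = {"AAG": "Lys", "AGA": "Arg", "GAA": "Glu", "GTT": "Val", "CAA": "Gln"}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Complementary strand, written 5'-to-3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def codons_in_all_frames(seq: str) -> List[str]:
    """Amino acids whose codons occur in any overlapping frame of seq."""
    triplets = (seq[i:i + 3] for i in range(len(seq) - 2))
    return [CODONS[t] for t in triplets if t in CODONS]


def bidirectional_codons(seq_5_to_3: str) -> Dict[str, List[str]]:
    """Codons found when the same strand is read in either direction."""
    return {
        "5'-to-3'": codons_in_all_frames(seq_5_to_3),
        "3'-to-5'": codons_in_all_frames(seq_5_to_3[::-1]),
    }


if __name__ == "__main__":
    print(bidirectional_codons("AAGAA"))
    print(reverse_complement("AAGAA"))
```

For AAGAA the three overlapping frames yield Lys, Arg and Glu in both reading directions, because the sequence reads identically forward and backward, and its complementary strand is TTCTT.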
Our findings described herein and elsewhere (18-19, 22-23) strongly support the idea of a
stereochemical basis for the origin of the genetic code (39-47) because amino acids within regulatory
proteins' DNA recognition helices are consistently found lining up with cognate codon/anticodon
nucleotides within their specific DNA binding sites. These findings also suggest that these structures may
have been template dependent in their evolution (i.e., peptides acting as templates for nucleotide
polymerization or vice-versa) (48-50). Our observations that genetic information is conserved between the
GRE and its flanking nucleotides and nucleotide sub-sequences at the splice junction sites of exons 3, 4
and 5, which encode the DNA recognition helix, beta strand and predicted alpha helix, respectively, of the
GR DBD imply that these structures are primordial molecular recognition modules which have been
conserved. Therefore, we propose that prebiotic, template directed autocatalytic synthesis of mutually
cognate peptides and polynucleotides resulted in their amplification and evolutionary conservation in a
contemporary eukaryotic organism as a modular genetic regulatory apparatus. Finally, the amino
acid-nucleotide atomic interactions described herein confirm our original prediction that conservation of
genetic information is a determinant of site specific DNA recognition for DNA regulatory proteins (18-19, 22-23).
We thank Don Gregory of Molecular Simulations Inc. for providing geometry for explicit sodium
counter-ions used in all simulations and for Zn atom placement and charge parameters for Zn binding
cysteines in the "zinc fingers" of the GR DBD structures. We also thank the Molecular Simulations Inc.
staff for software support with QUANTA, Michael Fenton of Fentonnet.com for data reduction programs,
Barry Bolding of Cray Research Inc. for CHARMm software optimization on the CRAY C-90, Minnesota
Supercomputer Institute Scientific Director Don Truhlar for support and encouragement, the Minnesota
Supercomputer Center user services representatives for technical support on the CRAY-2 and C-90, R.
Kaptein for personal communication of GR NMR structural coordinates, and R. Lavery for providing
CURVES 4.1 software. Special thanks are due to Charlie Larson of Silicon Graphics Inc. for hardware
support with the IRIS 4D 320-GTX workstation. This work was supported in part by a research grant from
the Minnesota Supercomputer Institute, Minneapolis MN.
This work was also supported by a research fellowship in memory of William Lang Jr.
- Ptashne, M. Specific binding of Lambda phage repressor to Lambda DNA. Nature 214, 232-234
- McKay, D., Weber, I. and Steitz, T. Structure of catabolite gene activator at 2.9 angstroms
resolution. Incorporation of amino acid sequence and interactions with cyclic AMP. J. Biol. Chem.
257, 9518-9524 (1982).
- Takeda, Y., Ohlendorf, D., Anderson, W., Matthews, B. DNA-binding proteins. Science 221,
- Pabo, C., Sauer, R. Protein-DNA recognition. Annu. Rev. of Biochem. 53, 293-321 (1984).
- Marx, J. A crystalline view of protein-DNA binding. Science 229, 846-848 (1985).
- Schleif, R. DNA binding by proteins. Science 241, 1182-1187 (1988).
- Otwinowski, Z., Schevitz, R., Zhang, R., Lawson, C., Joachimiak, A., Marmorstein, R., Luisi, B. F.
and Sigler, P. Crystal structure of trp repressor/operator complex at atomic resolution. Nature 335,
- Aggarwal, A., Rodgers, D., Drottar, M., Ptashne, M. and Harrison, S. Recognition of a DNA
operator by the repressor of phage 434: A view at high resolution. Science 242, 899-907 (1988).
- Harrison, S., Anderson, J., Koudelka, G., Mondragon, A., Subbiah, S., Wharton, R., Wolberger, C.
and Ptashne, M. Recognition of DNA sequences by the repressor of bacteriophage 434. Biophys.
Chem. 29, 31-37 (1988)
- Brenowitz, M., Senear, D. and Ackers, G. Flanking DNA-sequences contribute to the specific
binding of cI-repressor and OR1. Nucleic Acids Res. 17, 3747-3755 (1989)
- Harrison, S. and Aggarwal, A. DNA recognition by proteins with the helix-turn-helix motif.
Annu. Rev. of Biochem. 59, 933-969 (1990).
- Schwabe, J., Neuhaus, D. and Rhodes, D. Solution structure of the DNA-binding domain of the
oestrogen receptor. Nature 348, 458-461 (1990).
- Baleja, J. and Sykes, B. Comparison of the structures of operator DNA free and in complex with
Lambda repressor. Biochemistry and Cell Biology 69, 202-205 (1991).
- Luisi, B., Xu, W., Otwinowski, Z., Freedman, L., Yamamoto, K. and Sigler, P. Crystallographic
analysis of the interaction of the glucocorticoid receptor with DNA. Nature 352, 497-505 (1991).
- Schultz, S., Shields, G. and Steitz, T. Crystal structure of a CAP-DNA complex: The DNA is
bent by 90 degrees. Science 253, 1001-1007 (1991)
- Beato, M. Modulation of gene expression through DNA binding proteins: Is there a regulatory
code? Haemat. Blood Transf. 29, 217-223 (1985).
- Matthews, B. No code for recognition. Nature 335, 294-295
- Harris, L., Sullivan, M. and Hickok, D. Conservation of Genetic information between regulatory
protein DNA binding alpha helices and their cognate operator sites: A simple code for site-specific
Comp. Math. with Appl. 20, 1-23 (1990)
- Harris, L., Sullivan, M. and Hickok, D. Genetic sequences of hormone response elements share
similarity with predicted alpha helices within DNA binding domains of steroid receptor proteins: A
basis for site-specific recognition. Comp. Math. with Appl. 20, 25-48 (1990).
- Hard, T., Kellenbach, E., Boelens, R., Maler, B., Dahlman, K., Freedman, L., Carlstedt-Duke, J.,
Yamamoto, K., Gustafsson, J. and Kaptein, R. Solution structure of the glucocorticoid receptor
DNA-binding domain. Science 249, 157-160 (1990).
- Encio, I., Detera-Wadleigh, S. The genomic structure of the human glucocorticoid receptor. J
Biol. Chem. 266, 7182-7188 (1990).
- Harris, L., Sullivan, M. and Hickok, D. Conservation of Genetic information: A code for site-
specific DNA recognition. Proc. Natl. Acad. Sci. USA 90, 5534-5538 (1993).
- Harris, L., Sullivan, M., Popken-Harris, P. and Hickok, D. Molecular dynamics simulations in
solvent of the glucocorticoid receptor protein in complex with a glucocorticoid response element
DNA sequence. J. Biomol. Struct. Dyn. 12, 249-270 (1994).
- Quanta is a molecular modeling and display tool developed by Molecular Simulations Inc. (200
Fifth Avenue, Waltham, Massachusetts 02254), which allows the construction of molecular models of
DNA sequences, point mutations of existing models and the modeling of small peptides with a
selected secondary
- Brooks, B., Bruccoleri, R., Olafson, B., States, D., Swaminathan, S. and Karplus, M. CHARMm:
A program for macromolecular energy, minimization and dynamics calculations. J. Comput.
Chem. 4, 187-217 (1983)
- Karplus, M. and Petsko, G. Molecular dynamics simulations in biology (review). Nature 347,
- Lavery, R. and Sklenar, H. The definition of generalized helicoidal parameters and of axis
curvature for irregular nucleic acids. J. Biomol. Struct. Dyn. 6, 63-91 (1988).
- Stofer, E. and Lavery, R. Measuring the geometry of DNA grooves. Biopolymers 34, 337-346
- Baker, E. and Hubbard, R. Hydrogen bonding in globular proteins. Prog. Biophys. Mol. Biol. 44,
- Tsai, S., Carlstedt-Duke, J., Weigel, N., Dahlman, K., Gustafsson, J-A., Tsai, M-J. and O'Malley,
B. Molecular interactions of steroid hormone receptor with its enhancer element: Evidence for
formation. Cell 55, 361-369 (1988).
- Hollenberg, S. and Evans, R. Multiple and cooperative trans-activation domains of the human
glucocorticoid receptor. Cell 55, 899-906 (1988).
- Scheidereit, C., Beato, M. Contacts between hormone receptor and DNA double helix within a
glucocorticoid regulatory element of mouse mammary tumor virus. Proc. Natl. Acad. Sci. USA
81, 3029-3034 (1984).
- Scheidereit, C., Geisse, S., Westphal, H., Beato, M. The glucocorticoid receptor binds to
defined nucleotide sequences near the promoter of mouse mammary tumor virus. Nature 304,
- Picard, D. and Yamamoto, K. Two signals mediate hormone-dependent nuclear localization of the
glucocorticoid receptor. EMBO J. 6, 3333-3340 (1987).
- Buetti, E., Kuhnel, B. Distinct sequence elements involved in the glucocorticoid regulation of
the mouse mammary tumor virus promoter identified by linker scanning mutagenesis. J. Mol. Biol.
190, 379-389 (1986).
- Leidig, F., Baxter, J. and Eberhardt, N. Thyroid hormone receptors induce DNA bending: potential
importance for receptor action. Transactions of the Association of American Physicians 103, 154-
- Nardulli, A. and Shapiro, D. Binding of the estrogen receptor DNA-binding domain to the
estrogen response element induces DNA bending. Mol. Cell. Biol. 12, 2037-2042 (1992).
- Chalepakis, G., Postma, J., Beato, M. A model for hormone receptor binding to the mouse
mammary tumour virus regulatory element based on hydroxyl radical footprinting. Nucleic Acids
Res. 16, 10237-10247 (1988).
- Woese, C. Models for the evolution of codon assignments. J. Mol. Biol. 43, 235-240 (1969).
- Woese, C. The fundamental nature of the genetic code: Prebiotic interactions between
polynucleotides and polyamino acids or their derivatives. Proc. Natl. Acad. Sci. USA 59, 110-117
- Hendry, L., Bransome Jr., E., Hutson, M. and Campbell, L. A newly discovered stereochemical
logic in the structure of DNA suggests that the genetic code is inevitable. Perspect. Biol. Med. 27,
- Hendry, L., Mahesh, V., Bransome Jr., E., Hutson, M. and Campbell, L. A stereochemical
rationale for the genetic code derived from complementary fit of amino acids into cavities
formed in codon/anticodon sequences in double stranded DNA: Further evidence based upon
noncomplementarity of untranslated amino acids. The World Wide Web Journal of Biology 1 (1995).
- Lacey, J. and Mullins Jr. D. Experimental studies related to the origin of the genetic code and the
process of protein synthesis - a review. Origins of Life 13, 3-42 (1983).
- Lacey, J. and Mullins Jr. D. The case for the anticode. Origins of Life 14, 505-511 (1984).
- Lacey, J., Wickramasinghe, N. and Cook, G. Experimental studies related to the origin of the
genetic code and the process of protein synthesis - a review update. Origins of Life 22, 243-275
- Yarus, M. & Christian, E. Genetic code origins. Nature 342, 349-350 (1989).
- Yarus, M. An RNA-amino acid complex and the origin of the genetic code. New Biologist 3,
- Nelsestuen, G. Amino acid - directed nucleic acid synthesis. A possible mechanism in the origin
of life. J. Mol. Evol. 11, 109-120 (1978).
- Nelsestuen, G. Amino acid catalyzed condensation of purines and pyrimidines with 2-
deoxyribose. Biochemistry 18, 2843-2846 (1979).
- Lacey, J., Staves, M. and Thomas, K. Ribonucleic acids may be catalysts for the preferential synthesis
of L-amino acid peptides: a minireview. J. Mol. Evol. 31, 244-248 (1990).
- Dayhoff, M. Atlas of protein sequence and structure. National Biomedical Research Foundation,
Silver Spring, MD. (1978).
- Miesfeld, R., Godowski, P., Maler, B. and Yamamoto, K. Glucocorticoid receptor mutants that
define a small region sufficient for enhancer activation. Science 236, 423-425 (1987).
- Payvar, F., DeFranco, D., Firestone, G., Edgar, B., Wrange, O., Okret, S., Gustafsson, J. and
Yamamoto, K. Sequence-specific binding of glucocorticoid receptor to MTV DNA at sites within and
upstream of the transcribed region. Cell 35, 381-392 (1983).
- Carson, M. & Bugg, C. E. Algorithm for ribbon models of proteins. J. Mol. Graphics 4, 121-122
© 1995 Epress Inc.