Title: Molecular Basis of Genetic Instability of Triplet Repeats
Abstract: A veritable explosion is taking place in our understanding of the human genetics, biochemistry, and DNA structural issues related to human hereditary neuromuscular and neurodegenerative diseases. Also, the non-Mendelian expansion process that elicits these disease manifestations (anticipation) is under intense investigation. Within the last 3 years, the molecular basis of 10 human genetic disorders (including fragile X syndrome (FRAXA and FRAXE), myotonic dystrophy (DM), Kennedy's disease, Huntington's disease (HD), spinocerebellar ataxia type 1 (SCA1), and dentatorubral-pallidoluysian atrophy (DRPLA)) has been partially established (reviewed in (1. Davies K. Warren S. Genome Analysis. 7. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1993; 2. Bates G. Lehrach H. BioEssays. 1994; 16: 277-284; 3. Sutherland G.R. Richards R.I. Proc. Natl. Acad. Sci. U. S. A. 1995; 92: 3636-3641; 4. Panzer S. Kuhl D.P.A. Caskey C.T. Stem Cells. 1995; 13: 146-157; 5. Krahe R. Ashizawa T. Wetherall J. Groth D. Hypervariable Genetic Markers. CRC Press, Inc., Boca Raton, FL, 1995: 29-60)). (Footnote 1: The abbreviations used are: DM, myotonic dystrophy; HD, Huntington's disease; SCA1, spinocerebellar ataxia type 1; DRPLA, dentatorubral-pallidoluysian atrophy; bp, base pair(s); SBMA, spinobulbar muscular atrophy.) The diseases are characterized at the molecular level by the expansion of a simple triplet repeat (CTG and CGG) from less than 15 copies of the repeat in normal individuals to scores of copies in affected cases; thousands of copies are found in some cases of fragile X and myotonic dystrophy. These increases in size occur upon passage of an expanded repeat in the chromosome to offspring.
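As a rough illustration of what "copies of the repeat" means at the sequence level, the following minimal sketch counts the longest uninterrupted run of a repeat unit in a DNA string. The function name `longest_repeat_tract` and the toy sequences are hypothetical and serve only as an illustration; real allele sizing relies on laboratory assays and is not captured by this simple string scan.

```python
import re


def longest_repeat_tract(sequence, unit="CTG"):
    """Return the copy number of the longest uninterrupted run of `unit`.

    Hypothetical helper for illustration; it scans one strand only and
    treats any deviation from the unit as an interruption.
    """
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [
        len(match.group(0)) // len(unit)
        for match in pattern.finditer(sequence.upper())
    ]
    return max(runs, default=0)


# Toy sequences (assumed, not taken from any real allele).
short_allele = "AA" + "CTG" * 12 + "TT"
long_allele = "AA" + "CTG" * 80 + "TT"

print(longest_repeat_tract(short_allele))
print(longest_repeat_tract(long_allele))
```

Under this sketch, the first toy allele falls below the 15-copy range described for normal individuals, whereas the second carries scores of copies.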
Moreover, the symptoms of these diseases follow an unusual genetic pattern called anticipation, in which the disease becomes more severe and has an earlier age of onset with each successive generation (reviewed in (1-5, and 6. Wells R.D. Sinden R.R. Davies K. Warren S. Genome Analysis. 7. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1993: 107-138)). The instability of repeats in the genome has also been linked to hereditary nonpolyposis colon cancer, which may involve mutations in mismatch repair functions (4, 5; 7. Thibodeau S.N. Bren G. Schaid D. Science. 1993; 260: 816-819; 8. Shibata D. Peinado M.A. Ionov Y. Malkhosyan S. Perucho M. Nat. Genet. 1994; 6: 273-281; 9. Baker S.M. Bronner C.E. Zhang L. Plug A.W. Robatzek M. Warren G. Elliott E.A. Yu J. Ashley T. Arnheim N. Flavell R.A. Liskay R.M. Cell. 1995; 82: 309-319).
For example, Huntington's disease shows anticipation and has expanded CAG triplet repeats (1-5; 10. Huntington's Disease Collaborative Research Group. Cell. 1993; 72: 971-983). A CAG tract of between 11 and 34 repeats in the normal population encodes a polyglutamine tract in the IT15 gene. Expansion to about 90 repeats occurs in HD patients. The age of onset correlates with the length of the triplet repeat, and the largest changes in repeat length are seen upon paternal transmission (11. Duyao M. Ambrose C. Myers R. Novelletto A. Persichetti F. Frontali M. Folstein S. Ross C. Franz M. Abbott M. Gray J. Conneally P. Young A. Penney J. Hollingsworth Z. Shoulson I. Lazzarini A. Falek A. Koroshetz W. Sax D. Bird E. Vonsattel J. Bonilla E. Alvir J. Conde J.Bickham Cha J.-H. Dure L. Gomex F. Ramos M. Sanchez-Ramos J. Snodgrass S. de Young M. Wexler N. Moscowitz C. Penchaszadeh G. MacFarlane H. Anderson M. Jenkins B. Srinidhi J. Barnes G. Gusella J. MacDonald M. Nat. Genet. 1993; 4: 387-392). Sperm display a heterogeneous expanded repeat length. An intermediate allele, IA, containing 30-38 (or 34-38) repeats (perhaps similar to a premutation in fragile X or DM) has been identified. Initial reports suggest that sporadic expansion of the IA allele occurs only through paternal transmission (12. Goldberg Y.P. Kremer B. Andrew S.E.
Theilmann J. Graham R.K. Squitieri F. Telenius H. Adam S. Sajoo A. Starr E. Heiberg A. Wolff G. Hayden M.R. Nat. Genet. 1993; 5: 174-179; 13. Myers R.H. MacDonald M.E. Koroshetz W.J. Duyao M.P. Ambrose C.M. Taylor S.A.M. Barnes G. Srinidhi J. Lin C.S. Whaley W.L. Lazzarini A.M. Schwarz M. Wolff G. Bird E.D. Vonsattel J.-P. G. Gusella J.F. Nat. Genet. 1993; 5: 168-173). The function of the gene product is uncertain (14. Trottier Y. Devys D. Imbert G. Sandou F. An I. Lutz Y. Weber C. Agid Y. Hirsch E.C. Mandel J.-L. Nat. Genet. 1995; 10: 104-110).
Considering Mendelian genetic principles, anticipation was an enigma. The discovery of expanding triplet repeats (or “mutable mutations”) in diseases showing anticipation afforded a physical basis for this unusual genetic phenomenon. Expansion of the triplet repeat is responsible for the genetic defect, influencing the activity of a glutamine-containing protein (SBMA, HD, SCA1, and DRPLA) or influencing the level of expression of a gene with which the repeat is associated (fragile X and DM) (1-5). All triplet repeat genetic diseases identified to date show anticipation. Several other diseases also show anticipation, including spinocerebellar ataxia type 2 (15. Pulst S.-M. Nechiporuk A. Starkman S. Nat. Genet.
1993; 5: 8-10), bipolar affective disorder (16. McInnis M.G. McMahon F.J. Chase G.A. Simpson S.G. Ross C.A. DePaulo Jr., J.R. Am. J. Hum. Genet. 1993; 53: 385-390), and hereditary spastic paraparesis (Strumpell's disease) (17. Bruyn R.P.M. van Deutekom J. Frants R.R. Padberg G.W. Clin. Neurol. Neurosurg. 1993; 95: 125-129). If a correlation exists between anticipation and triplet repeats, many more diseases showing anticipation may be identified since there are more than 40 genes containing associated triplet repeats.
An understanding of the molecular mechanisms of triplet repeat instabilities (expansions and deletions) is important for the comprehension of anticipation. Kang et al. (18. Kang S. Jaworski A. Ohshima K. Wells R.D. Nat. Genet. 1995; 10: 213-218) have established a defined genetic system that shows promise for the dissection of this process. The frequency of genetic expansions or deletions in Escherichia coli depends on the direction of replication (18). Large expansions occur predominantly when the CTGs are in the leading template strand rather than the lagging strand. However, deletions are more prominent when the CTGs are in the opposite orientation (Fig. 1). Most deletions generate products of defined size classes. Strand slippage coupled with non-classical DNA structures (Fig. 2) probably accounts for these observations and relates to expansion-deletion mechanisms in eukaryotic chromosomes. To study expansions, these workers determined whether a plasmid that contains (CTG)130 is completely homogeneous as a cloned molecule or whether deletions and expansions had occurred that gave rise to sequence heterogeneity, even in a tiny percentage of the molecules.
The insert containing the triplet repeat was excised from the vector and separated by gel electrophoresis. The regions of the gel either above or below the insert band were eluted and “recloned”; recombinant plasmids were obtained that contained successively larger or smaller inserts, respectively. The family of inserts characterized by these methods contained repeat units ranging from 17 to 300. Hence, expansion and deletion occur in E. coli. This discovery lays the foundation for evaluating host cell genetic factors (replication, recombination, mismatch repair, etc.) that may elicit genetic instabilities. DNA sequence analyses showed that expansion and contraction always occurred in multiples of 3 bp. Prior investigations (19. Jaworski A. Higgins N.P. Wells R.D. Zacharias W. J. Biol. Chem. 1991; 266: 2576-2581) showed that deletions in dinucleotide repeat sequences occurred in multiple units of 2 bp.
Figure 2: DNA structures of triplet repeats. Slipped structures and toroids are not mutually exclusive.
Fig. 1 outlines a possible mechanism for the expansion and deletion behaviors. For expansion, a hairpin loop may form on the lagging strand nascent DNA (CTG strand). NMR investigations (20. Smith G.K. Jie J. Fox G.E. Gao X. Nucleic Acids Res. 1995; 23: 4303-4311) revealed that CTG oligomers form a stable anti-parallel duplex with TT pairs, whereas the complementary CAG strand forms a metastable conformation. When the CTG is the lagging strand template (orientation II), a loop may form on the lagging strand that will be bypassed during DNA synthesis to generate deletions. Multiple slippages (6. Wells R.D. Sinden R.R. Davies K. Warren S. Genome Analysis. 7.
Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1993: 107-138) may be promoted by an “idling polymerase” caused by a strong block such as a DNA structure or the presence of proteins (21. Yano-Yanagisawa H. Li Y. Wang H. Kohwi Y. Nucleic Acids Res. 1995; 23: 2654-2660), which causes continuous slippage (primer realignment) resulting in the expansion of larger sequences. Other workers (22. Jeffreys A.J. Tamaki K. MacLeod A. Monckton D.G. Neil D.L. Armour J.A.L. Nat. Genet. 1994; 6: 136-145) favor gene conversion events to explain germline mutations at human minisatellites. Evolutionary studies (23. Eichler E.E. Kunst C.B. Lungenbeel K.A. Ryder O.A. Davison D. Warren S.T. Nelson D.L. Nat. Genet. 1995; 11: 301-308) on the cryptic FMR1 CGG repeat suggest that replication slippage and unequal crossing over have been operative for >150 million years.
Ohshima et al. (24. Ohshima K. Kang S. Wells R.D. J. Biol. Chem. 1996; 271: 1853-1856) have recently discovered that the CTG triplet repeat is the dominant genetic expansion product in E. coli. This extraordinary discovery was made possible by the successful cloning and characterization of all 10 repeating triplet sequences (footnote 2: K. Ohshima and R. D. Wells, unpublished work). The relative capacity of the 10 repeating triplet sequences to be expanded in E. coli (18) was explored with a competition study. Surprisingly, the CTG triplet repeat was expanded at least nine times more frequently than any of the other nine triplets (24).
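The figure of 10 repeating triplet sequences can be checked by enumeration. The sketch below assumes that two triplets describe the same duplex repeat when one is a cyclic rotation of the other or of its reverse complement, and that the four homopolymer triplets are excluded; the function names are hypothetical and this is only one plausible way of doing the count, not a procedure taken from the cited work.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def rotations(unit):
    """Return all cyclic rotations of a repeat unit."""
    return {unit[i:] + unit[:i] for i in range(len(unit))}


def triplet_repeat_classes():
    """Group the 60 non-homopolymer triplets into duplex repeat classes.

    Assumption: triplets related by rotation or reverse complementation
    give the same double-stranded repeat, e.g. CTG, TGC, GCT, CAG, AGC, GCA.
    """
    classes = set()
    for bases in product("ACGT", repeat=3):
        unit = "".join(bases)
        if len(set(unit)) == 1:
            continue  # skip AAA, CCC, GGG, TTT
        members = rotations(unit) | rotations(reverse_complement(unit))
        classes.add(frozenset(members))
    return classes


classes = triplet_repeat_classes()
print(len(classes))
for members in sorted(sorted(c) for c in classes):
    print(" ".join(members))
```

Under these assumptions, CTG and CAG fall in one class and CGG and CCG in another, which is consistent with the text naming each duplex repeat by a single strand.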
Low levels of expansion were found also for CGG, GTG, and GTC. Thus, the structure of the CTG repeat and/or its utilization by the DNA synthetic systems in vivo must be quite different from the other triplets. The surprising discovery that CTG triplet repeats are the dominant expansion products in E. coli, as found (1-5) in clinical samples from human hereditary diseases, suggests the importance of DNA structural properties (25. Wells R.D. J. Biol. Chem. 1988; 263: 1095-1098). Other investigations have revealed that duplex CTG and CGG repeats have unorthodox properties including nucleosome assembly (26. Wang Y.-H. Amirhaeri S. Kang S. Wells R.D. Griffith J. Science. 1994; 265: 669-671), their capacity to cause DNA polymerases to pause within the repeat sequences (27. Kang S. Ohshima K. Shimizu M. Amirhaeri S. Wells R.D. J. Biol. Chem. 1995; 270: 27014-27021), as well as conformational features as revealed by helical repeat and polyacrylamide gel migrations (footnote 3: R. Gellibolian, M. Shimizu, S. Amirhaeri, S. Kang, K. Ohshima, J. E. Larson, Y.-H. Fu, C. T. Caskey, B. A. Oostra, and R. D. Wells, unpublished observations) (discussed below).
Further elucidation of the CTG repeat structural features along with the genetic factors responsible for expansion may explain why most (8 out of 10) (1-5) triplet repeat hereditary disease genes contain CTG repeats. Although other triplet repeats are found in the human genome (29. Gastier J.M. Pulido J.C. Sunden S. Brody T. Buetow K.H. Murray J.C. Weber J.L. Hudson T.J. Sheffield V.C. Duyk G.M. Hum. Mol. Genet. 1995; 4: 1829-1836), the lengths are shorter (generally <15 repeats) than found for these disease genes. Other work (30. Kang, S., Ohshima, K., Jaworski, A., Wells, R. D. (1996) J. Mol. Biol., in press) has shown that the CTG triplet repeat is expanded in E. coli distal to the replication origin as a single large event of ∼120 bp. In summary, these investigations (18, 24, 30) establish a genetically defined system for studying the molecular mechanisms of this non-Mendelian process. A recent report of a transgenic mouse model for SBMA (31. Bingham P.M. Scott M.O. Wang S. McPhaul M.J. Wilson E.M.
Garbern J.Y. Merry D.E. Fischbeck K.H. Nat. Genet. 1995; 9: 191-196) found no change in length with transmission. Bacterial systems may provide useful mechanistic information until a genetically defined eukaryotic system can be established. In fact, a number of similarities exist between the behaviors observed in humans and this E. coli system (reviewed in (24)).
The pausing of DNA synthesis in vitro at specific loci in double-stranded CTG and CGG triplet repeats was discovered accidentally during chemical probe analyses (27). The DNA syntheses of CTG triplets ranging from 17 to 180 repeats and CGG triplets ranging from 9 to 160 repeats in length were studied in vitro. Primer extensions using the Klenow fragment of DNA polymerase I, the modified T7 DNA polymerase (Sequenase), or the human DNA polymerase β paused strongly at specific loci in the CTG repeats. The pausings were abolished by heating at 70°C. As the length of the triplet repeats in duplex DNA, but not in single-stranded DNA, was increased, the magnitude of pausings increased. CGG triplet repeats showed similar, but not identical, patterns of pausings. These results indicate that appropriate lengths of the triplets adopt a non-B conformation(s) that blocks DNA polymerase progression; the resultant idling polymerase may catalyze slippages (Fig. 1) to give expanded sequences and, hence, provide the molecular basis for this non-Mendelian genetic process. Also, recent in vivo replication studies in E. coli (footnote 4: S. M. Mirkin, personal communication) with plasmids containing the CGG repeat revealed length-dependent pause sites. Other studies (32. Usdin K. Woodford K.J.
Nucleic Acids Res. 1995; 23: 4202-4209) with single-stranded (CGG)20 as a template suggest a K+-dependent structure (tetraplex) that serves as a barrier to DNA synthesis in vitro.
Mismatch repair-deficient E. coli (33. Modrich P. Science. 1995; 266: 1959-1960) were studied in order to further elucidate the factors involved in genetic instabilities as well as DNA structural issues in vivo. Long CTG repeats are stabilized in ColE1-derived plasmids in E. coli containing mutations in the methyl-directed mismatch repair genes (mutS, mutL, or mutH) (34. Jaworski A. Rosche W.A. Gellibolian R. Kang S. Shimizu M. Sinden R.R. Wells R.D. Proc. Natl. Acad. Sci. U. S. A. 1995; 92: 11019-11023). When plasmids containing (CTG)180 were grown for about 100 generations in mutS, mutL, or mutH strains, 60-85% of the plasmids contained a full-length repeat, whereas in the parent strain only about 20% of the plasmids contained the full-length repeat. The deletions occur only in the (CTG)180 insert, not in DNA flanking the repeat. While many products of the deletions are heterogeneous in length, preferential deletion products of about 140, 100, 60, and 20 repeats were observed. The E. coli mismatch repair proteins apparently recognize three-base loops formed during replication and then generate long single-stranded gaps where stable hairpin structures may form, which can be bypassed by DNA polymerase during the resynthesis of duplex DNA. Similar studies were conducted with plasmids containing CGG repeats; no stabilization of these triplets was found in the mismatch repair mutants. Since prokaryotic and human mismatch repair proteins are similar (33; 35. Kolodner R.D. Alani E. Curr. Opin. Biotechnol.
1994; 5: 585-589) and since several carcinoma cell lines, which are defective in mismatch repair, show instability of simple DNA microsatellites (7-9), these mechanistic investigations in a bacterial cell may provide insights into the molecular basis for some human genetic diseases.
Simple repeat sequences in plasmids adopt non-B conformations under appropriate conditions (such as negative supercoil density, ionic strength, etc.) in vitro (reviewed in (6) and (36. Sinden R.R. DNA Structure and Function. Academic Press, San Diego, CA, 1995)). For example, mirror repeat purine•pyrimidine sequences form triplexes (H-DNA) and (in certain cases) nodule DNA, alternating purine-pyrimidine sequences adopt left-handed Z-DNA, inverted repeats form cruciforms, and repeating A tracts exist in bent (curved) conformations. Some unusual structures were proven to exist in vivo in plasmids (6, 36) and in chromosomes (37. Lukomski S. Wells R.D. Proc. Natl. Acad. Sci. U. S. A. 1994; 91: 9980-9984). Several recent biophysical studies were reported (38. Fry M. Loeb L.A. Proc. Natl. Acad.
Sci. U. S. A. 1994; 91: 4950-4954; 39. Mitas M. Yu A. Dill J. Haworth I.S. Biochemistry. 1995; 34: 12803-12811, and references therein; 40. Chen X. Mariappan S.V.S. Catasti P. Ratliff R. Moyzis R.K. Laayoun A. Smith S.S. Bradbury E.M. Gupta G. Proc. Natl. Acad. Sci. U. S. A. 1995; 92: 5199-5203; 41. Gacy A.M. Goellner G. Juranic N. Macura S. McMurray C.T. Cell. 1995; 81: 533-540; 42. Mitchell J.E. Newbury S.F. McClellan J.A. Nucleic Acids Res. 1995; 23: 1876-1881; 43. Gao X. Huang X. Smith G.K. Zheng M. Liu H. J. Am. Chem. Soc. 1995; 117: 8883-8884) on short (generally <24 bp) synthetic oligonucleotides with CTG or CGG triplets, which, in general, support the concept of hairpin loops (Figs. 1 and 2) and other ordered conformations.
Long CGG and CTG triplet repeat duplex sequences adopt intrinsic structures best explained as toroids (footnote 3) (Fig. 2) that are unlike other previously described non-B DNA conformations as concluded from apparent helical repeat studies (44. Wang J.C. Proc. Natl. Acad. Sci. U. S. A. 1979; 76: 200-203). These toroids, intrinsically curved DNA, have a fully paired helical duplex structure with a periodic repeat of ∼81 bp (27 triplets). Furthermore, polyacrylamide gel electrophoresis studies on fragments containing these triplet repeats show that the fragments migrate up to 30% more rapidly than expected whereas they migrate at the expected rate on agarose gel electrophoresis (45. Chastain P. Eichler E. Kang S. Nelson D. Levene S.D. Sinden R.R. Biochemistry. 1995; 34: 16125-16131) (footnote 3). These analyses also confirm the unusual conformation of CTG and CGG triplet repeats.
Similar polyacrylamide gel electrophoresis investigations were conducted with the other eight triplet repeat sequences (footnote 2); all fragments showed normal gel mobilities except for the longest lengths of ACC and GTC, which showed some characteristics similar to CTG and CGG but to a smaller extent. Chemical and enzymatic probe analyses as well as two-dimensional agarose gel electrophoretic investigations showed that the triplet repeat structures are fully base paired and negative supercoiling does not generate a non-B DNA structure.
Electron microscopic investigations were conducted to evaluate the nucleosome assembly properties at DNA triplet repeats (26) since the toroidal conformations (Fig. 2) might provide a suitable homing site. Nucleosomes are the basic structural elements of chromosomes and consist of 146 bp of DNA coiled about an octamer of histone proteins that mediate general transcriptional repression. Plasmids containing lengths of CTG from 0 to 250 repeats were investigated (26). The efficiency of nucleosome formation increased with expanded triplet blocks, suggesting that such blocks may repress transcription through the creation of stable nucleosomes (Fig. 3). In fact, the expanded CTG triplet repeats are the strongest known nucleosome positioning element (46. Wang Y.-H. Griffith J. Genomics. 1994; 25: 570-573), even compared to the Xenopus borealis somatic 5 S RNA gene, one of the strongest known natural nucleosome positioning sequences.
In summary, we believe that three types of non-B DNA conformations are important for triplet repeats (Fig. 2). The toroid structure formed with duplex CCG and CTG sequences is dictated solely by these triplet repeat sequences.
We presume that the toroid is a suitable homing site for histone octamer binding. Slipped structures are the only reasonable explanation for the observed mismatch repair results (34). Hence, this may be the first case where a non-B structure has been detected in vivo prior to its in vitro characterization. Also, hairpin loops may be formed by single-stranded regions during DNA replication (Fig. 1).
Several factors influence the stability of the triplet repeat inserts. First, the type of sequence plays a major role, with CGG being the most difficult to stably maintain in E. coli (49. Shimizu, M., Gellibolian, R., Oostra, B. A., Wells, R. D. (1996) J. Mol. Biol., in press). Second, the length of the repeats is very important since longer tracts, especially for CGG, show a greater degree of instability compared with shorter inserts (30 or less). This behavior in E. coli is consistent with the mechanism of genetic anticipation for the fragile X syndrome (47. Fu Y.-H. Kuhl D.P.A. Pizzuti A. Pieretti M. Sutcliffe J.S. Richards S. Verkerk A.J.M.H. Holden J.J.A. Fenwick Jr., R.G. Warren S.T. Oostra B.A. Nelson D.L. Caskey C.T. Cell. 1991; 67: 1047-1058). Third, the presence of interruptions greatly enhances the stability of triplet repeats, especially for CGG. Alleles derived from human patients show the presence of stable and unstable CGG triplets of similar size, suggesting that a feature other than length, but intrinsic to the repeat, was responsible for stability. Eichler et al. (48. Eichler E.E. Holden J.J.A. Popovich B.W. Reiss A.L. Snow K. Thibodeau S.N. Richards C.S. Ward P.A. Nelson D.L. Nat. Genet.
1994; 8: 88-94) found that lengths of >33 uninterrupted CGGs showed marked instability, regardless of total repeat length, suggesting that the loss of the AGG interruptions is an important mutational event in the generation of alleles predisposed to the fragile X syndrome. Fourth, the orientation of the insert relative to the unidirectional replication origin was discussed above (Fig. 1). Fifth, the strains of E. coli used as host cells are critical; E. coli SURE was the best choice for maintaining CGG triplet repeats of up to 160 repeats in pUC-derived plasmids (compared with HB101, STBL2, and RS2). Inserts containing more than 160 CGG repeats were extremely unstable in pUC19 and were prone to delete to smaller sized plasmids. Hence, the vector of choice is significant also. Sixth, the location of the insert in the vector is important and may relate to the pausing observed at the DNA polymerase I/III switch site (27). Seventh, the copy number of the vector may be important.
Substantial progress has been made in the past 4 years in understanding several hereditary diseases, but the molecular basis of genetic instabilities of long triplet repeats remains to be elucidated. The establishment of expansion systems provides hope for molecular and genetic insights. The concept of a “mutatable mutation” is novel (i.e. DNA itself or its structures may be mutagenic). Hence, it is not surprising that major challenges lie before this field. Since a number of other diseases also show anticipation, the field may be just in its infancy. In the future, these issues represent a fertile arena for a broad range of clinical, human genetic, transgenic animal model, prokaryotic genetic, biochemical, as well as physical determinations.
The goal is to understand the molecular mechanisms responsible for genetic instabilities and to eventually eradicate these devastating human neuromuscular and neurodegenerative diseases.
I thank my collaborators for diligent and talented efforts that made this review possible and R. Iyer for assistance in preparation of figures.