Friedreich ataxia (FRDA, OMIM #229300), the most common inherited ataxia, is an autosomal recessive neurodegenerative disorder. The first symptoms, limb and gait ataxia and loss of tendon reflexes in the lower limbs, appear before the age of 25; patients subsequently develop dysarthria, loss of sensation and loss of the ability to walk. Scoliosis is also present in most cases of FRDA and is clearly progressive. There is no treatment for FRDA and most patients die of progressive hypertrophic cardiomyopathy in early adulthood. FRDA is a rare genetic disorder with an estimated prevalence of one case per 50,000 individuals. It is most commonly caused by hyperexpansion of a GAA triplet repeat in the first intron of both copies of the FXN gene (Campuzano et al., 1996). Only 5% of patients are compound heterozygotes, with a point mutation in one allele and a GAA expansion in the other (Pandolfo, 2002; Gellera et al., 2007). Normal individuals have up to ~40 triplets, whereas in FRDA patients the number of GAA repeats ranges from 70 to 1700 (Dürr et al., 1996; Pandolfo, 2002). The GAA repeat size determines the level of residual frataxin, and larger expansions are associated with an earlier age of onset and a more rapid rate of disease progression. The GAA repeat expansion causes a transcriptional defect of the FXN gene, and the resulting reduced levels of frataxin, a 210 amino acid mitochondrial protein, lead to mitochondrial dysfunction and oxidative damage (Campuzano et al., 1997; Pandolfo, 2008).
1.2 FXN gene structure and expression
In 1988, Chamberlain et al. carried out genetic linkage analysis on 22 families with three or more affected siblings and found that the mutation causing FRDA lies on chromosome 9 (Chamberlain et al., 1988). Further linkage studies on about 120 affected families localised the disease locus to chromosome 9q13-9q21 (Hanauer et al., 1990). In 1996 the FRDA locus was narrowed down to 9q13 and a gene called X25 was identified in that region (Campuzano et al., 1996). The gene, initially called X25 and later renamed FRDA, is now known as FXN. The coding sequences of the FXN gene were identified using direct complementary DNA (cDNA) selection and exon amplification. An inverse polymerase chain reaction (PCR) approach was then used to identify the intronic sequences flanking the known coding regions, which allowed the FXN gene structure to be determined (Campuzano et al., 1996). Screening the FXN gene for mutations in FRDA patients revealed that an expanded trinucleotide (GAA) repeat within the first intron of the FXN gene accounted for about 98% of FRDA cases. The FXN gene is composed of seven exons spanning 95 kb of genomic DNA. Exons 1 to 5a produce the major transcript, a 1.3 kb mRNA that encodes a 210-amino acid protein called frataxin. Since exon 5b has an in-frame stop codon, alternative splicing can give rise to a transcript encoding a 171-amino acid protein (Campuzano et al., 1996). Exon 6 is entirely noncoding. Two major transcription start sites, TSS1 and TSS2, were identified 221 bp and 62 bp upstream of the ORF, respectively. Greene et al. revealed that a region located between -221 and -121 bases upstream of the ATG translation start site plays a crucial role in regulation of the FXN promoter (Greene et al., 2005). No TATA box element was apparent in this region and the Initiator (Inr)-like elements associated with TSS1 did not affect FXN gene expression (Kumari et al., 2011).
Two transcription factors, serum response factor (SRF) and transcription factor activator protein 2 (TFAP2), were found to bind directly to this region, and mutagenesis of either the SRF or the TFAP2 binding site significantly decreased FXN promoter activity (Li et al., 2010). The FXN gene is expressed in all cells, but at variable, tissue-specific levels. In general, frataxin expression is high in tissues with a high metabolic rate. FXN is highly expressed in heart and spinal cord, at intermediate levels in liver, skeletal muscle and pancreas, and minimally in other tissues (Koutnikova et al., 1997). In the central nervous system (CNS), the highest frataxin expression has been detected in dorsal root ganglia (DRG), with much lower expression in cerebellum and the lowest in cerebellar cortex (Delatycki et al., 2000). The frataxin mRNA level in FRDA patients varies inversely with GAA repeat length. Depending on the GAA repeat size, FRDA patients can have 13%-30% of the normal frataxin mRNA levels (Pianese et al., 2004; Bidichandani et al., 1998).
1.3 Frataxin protein
Frataxin (FXN) is a nuclear-encoded 210 amino acid protein that is produced in the cytosol as a precursor. It contains an N-terminal mitochondrial targeting peptide that directs the precursor to the mitochondrial matrix. Transfection of mammalian cells with human FXN in a GFP-expression construct localised frataxin to the mitochondria (Babcock et al., 1997). Following import of the precursor into the mitochondria, mitochondrial processing peptidase (MPP) proteolytically removes the N-terminal sequence in two sequential steps and generates the 14.5 kDa functional mature frataxin, comprising amino acids 81-210 (m81-FXN) (Campuzano et al., 1996; Gibson et al., 1996; Cavadini et al., 2000; Schmucker et al., 2008). Recently it has been shown that, before mitochondrial import, the E3 ligase RNF126 ubiquitinates a significant amount of frataxin precursor and targets it for degradation by the ubiquitin/proteasome system. Preventing frataxin degradation is therefore considered a potential therapeutic approach for FRDA (Rufini et al., 2011; Benini et al., 2017). Solution and crystal structures of mature frataxin revealed that it is a compact, globular protein containing two α-helices, a central β-sheet region and a C-terminal coil. Frataxin is characterized by a unique fold in which the two terminal α-helices frame a platform composed of five antiparallel β-strands. The C-terminus of β5, together with β6 and β7, forms a smaller β-sheet which interacts with the planes and confers an overall compact α-β sandwich structure (Dhe-Paganon et al., 2000; Musco et al., 2000; Bencze et al., 2006). Alignment of human frataxin and CyaY sequences revealed that the C-terminal region of FXN is an evolutionarily highly conserved portion of the protein, and all missense mutations found so far in compound heterozygous FRDA individuals affect this domain. This observation suggests that the C-terminus is essential for the folding and function of FXN.
Mutations affecting the N-terminal region of FXN cause an atypical and milder FRDA phenotype in which patients show slower disease progression (Figure 1.1) (Cossée et al., 1999; De Michele et al., 2000; Musco et al., 2000).
1.3.1 Frataxin and iron homeostasis
The presence of frataxin in a wide range of species, from bacteria to humans, demonstrates that it is an evolutionarily conserved protein. Although its exact molecular function is currently unclear, early embryonic death in nullizygous Fxn mice suggests that frataxin is essential for embryonic development (Cossée et al., 2000). It is generally accepted that FXN is involved in iron homeostasis and assembly of iron-sulphur (Fe-S) clusters (Gibson et al., 1996; Adinolfi et al., 2009; Pastore and Puccio, 2013). Babcock et al. (1997) provided the first insight into the role of FXN, reporting that deletion of Yfh1 (the yeast FXN homologue) results in accumulation of mitochondrial iron and hypersensitivity to oxidative stress (Babcock et al., 1997). Further studies revealed that reintroduction of Yfh1 leads to rapid export of accumulated iron from the mitochondria into the cytosol. This observation indicates a regulatory role for the frataxin homologue Yfh1p in maintaining mitochondrial iron homeostasis (Radisky et al., 1999). Several lines of evidence indicate that FXN is involved in the assembly of Fe-S clusters. FXN deficiency has been reported to result in reduced activity of Fe-S proteins such as the Krebs cycle enzyme aconitase. Deficient activities of complexes I–III of the respiratory chain and of the aconitases have been detected in FRDA mouse models (Puccio et al., 2001), in Δyfh1 yeast and in FRDA patients (Rotig et al., 1997) (Pandolfo and Hausmann, 2013). Puccio et al. (2001) generated two different tissue-specific knock-out mouse models and reported a 77%-87% complex II deficiency and a 70%-74% aconitase deficiency without noticeable mitochondrial iron accumulation in both models (Puccio et al., 2001). Interestingly, it has been reported that Fxn-deficient mice show 50% of Fe-S enzyme activity at 4 weeks of age (Seznec et al., 2004). These results suggest an important but not essential role for FXN in Fe-S cluster biosynthesis.
1.3.2 Oxidative stress
Due to impaired mitochondrial iron efflux in FRDA, iron accumulates in mitochondria. This excess iron, which is primarily ferrous iron (Fe[II]), leads to increased Fenton-mediated production of toxic free radicals (hydroxyl radicals, OH●) and consequently to increased oxidative stress. OH● is highly toxic and can damage any biological macromolecule, including mitochondrial DNA, and may consequently result in cell death. Several studies are consistent with the idea that iron accumulation leads to oxidative stress in FRDA. Measurements of 8-hydroxy-2’-deoxyguanosine (8OH2’dG; a marker of oxidative DNA damage) in FRDA patients (Schulz et al., 2000), and of serum malondialdehyde (MDA; a marker of lipid peroxidation) in an FRDA mouse model (Al-Mahdawi et al., 2006) and in FRDA patients (Bradley et al., 2004), showed increased levels of these oxidative markers. However, results from other studies have questioned the role of oxidative stress in FRDA (Armstrong et al., 2010). Data from conditional knock-out mouse models revealed that iron accumulates in mitochondria in a time-dependent manner and that this occurs after inactivation of Fe-S dependent enzymes (Puccio et al., 2001). Moreover, it has been demonstrated that complete Fxn deficiency does not induce oxidative stress in neuronal tissues of an FRDA mouse model (Seznec et al., 2005). These conflicting results highlight the need for further research on the contribution of iron accumulation and oxidative damage to FRDA.
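For reference, the Fenton chemistry invoked in this section is usually written as the reaction below. Hydrogen peroxide (H2O2) as the oxidant is standard textbook chemistry rather than something stated in the studies cited here, so the scheme is given as background only.

```latex
\[
\mathrm{Fe^{2+}} + \mathrm{H_2O_2} \;\longrightarrow\; \mathrm{Fe^{3+}} + \mathrm{OH^{-}} + \mathrm{OH^{\bullet}}
\]
```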
1.4 GAA repeat expansion in FRDA
Campuzano et al. (1997) showed that, compared with normal individuals, intron 1 in FRDA patients is larger and contains a pure GAA repeat expansion. 96% of FRDA cases are due to homozygous expansion of the GAA repeat within the first intron of the FXN gene. The remaining cases are heterozygous for a GAA repeat expansion on one allele and a point mutation on the other (Campuzano et al., 1996). A normal allele carries ~6-36 triplet repeats, whereas affected individuals have approximately 70 to 1700 triplets, most commonly 600–900 GAA triplets, on both alleles of the FXN gene (Pandolfo, 2009; Cossee et al., 1997; Bidichandani et al., 1998). Analysis of GAA repeat size in normal chromosomes revealed length polymorphism in normal alleles: 83% of normal alleles carry 6-10 GAA units and are termed small normal (SN), while 17% contain 12-36 triplets and form the large normal (LN) group (Montermini et al., 1997; Pandolfo, 1998; Monticelli et al., 2004). As with all simple-sequence repeats, the basic polymorphism-generating mechanism can account for size heterogeneity in normal alleles. According to this mechanism, occasional ‘stuttering’ of DNA polymerase during DNA replication can result in size heterogeneity in normal alleles (Montermini et al., 1997). Because of slippage during replication of repetitive sequences, the new strand mispairs with the template strand, creating an unpaired repeat loop. After a second round of replication this ‘slip out’ can give rise to an expanded repeat (Mirkin, 2006). The jump from the small normal to the large normal group was probably caused by a rare or singular event. Several studies have demonstrated that the large normal alleles are interrupted by hexanucleotide repeats (GAGGAA), which prevent them from expanding further into pathological expanded alleles.
It has been directly observed that uninterrupted large normal alleles, termed premutations, can undergo large expansions into alleles containing >650 triplets (Montermini et al., 1997; Patel and Isaya, 2001; Holloway et al., 2011).
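The allele size classes described in this section can be illustrated with a minimal Python sketch. It assumes the allele is available as a plain DNA string and uses only the triplet ranges quoted in the text (6-10 small normal, 12-36 large normal, 70-1700 expanded); sizes falling between those ranges are reported as unclassified rather than guessed. The function names are hypothetical and are not taken from any of the cited studies.

```python
import re


def longest_pure_gaa_run(seq: str) -> int:
    """Return the largest number of consecutive, uninterrupted GAA triplets."""
    runs = re.findall(r"(?:GAA)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)


def has_gaggaa_interruption(seq: str) -> bool:
    """Report whether the hexanucleotide GAGGAA occurs in the sequence."""
    return "GAGGAA" in seq.upper()


def classify_allele(triplets: int) -> str:
    """Map a triplet count onto the size classes quoted in the text."""
    if 6 <= triplets <= 10:
        return "small normal (SN)"
    if 12 <= triplets <= 36:
        return "large normal (LN)"
    if 70 <= triplets <= 1700:
        return "expanded (FRDA range)"
    # Ranges such as 11 or 37-69 triplets are not assigned a class in the text.
    return "outside the ranges quoted in the text"


if __name__ == "__main__":
    # Hypothetical alleles built only for illustration.
    pure = "GAA" * 9
    interrupted = "GAA" * 5 + "GAGGAA" + "GAA" * 5
    for name, allele in (("pure", pure), ("interrupted", interrupted)):
        n = longest_pure_gaa_run(allele)
        print(name, n, classify_allele(n), has_gaggaa_interruption(allele))
```

Counting the longest uninterrupted run, rather than every GAA in the sequence, mirrors the distinction the text draws between pure and GAGGAA-interrupted alleles.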
1.4.1 Instability of GAA expanded repeats
The size of the pathological expanded allele is highly variable and shows both somatic (mitotic) and intergenerational instability; that is, the GAA repeat size differs between tissues and across generations (Pianese et al., 1997). Several studies have demonstrated that the sex of the transmitting parent plays an important role in GAA repeat size variation. In maternal transmission both contraction and expansion of the GAA allele can be detected, whereas following paternal transmission the GAA repeat tends to contract (Pianese et al., 1997; De Michele et al., 1998). In addition, with increasing parental age, the contractions in paternal transmission and the expansions in maternal transmission become more evident (Kaytor et al., 1997; De Michele et al., 1998). Siblings who inherit the same parental mutant chromosome have been observed to show different repeat lengths. Somatic mosaicism in FRDA patients has been documented to be length-dependent, tissue-specific and age-dependent (Clark et al., 2007). Mitotic GAA instability shows a strong tendency to contract. The expanded GAA alleles in peripheral leukocytes revert to normal size, whereas in the DRG a predilection for further expansion has been detected (Sharma et al., 2002). In addition, alleles containing fewer than 25 triplets are completely stable, and somatic instability can be detected when the number of triplets exceeds 44 (Sharma et al., 2002; Clark et al., 2007). Comparison of somatic instability in an 18-week FRDA foetus and a 24-year-old FRDA patient, both homozygous with the same expanded (E) allele sizes, revealed that foetal tissue shows significantly lower levels of somatic instability. Furthermore, an analysis of data from FRDA patients and carriers aged 2 to 49 years has shown that mutation loads increase from 17% to 78.7%, respectively, in an age-dependent manner.
Thus it can be concluded that somatic instability occurs mainly after early embryonic development and develops progressively throughout life (De Biase et al., 2007). Several models explain the mechanisms of repeat instability, and all of them support a role for replication (Pearson et al., 2005). The formation of secondary structures during replication can provide an explanation for repeat contraction and expansion. During replication of a triplet repeat, the single-stranded part of the lagging strand forms a hairpin. Replication across hairpins results in contraction, whereas slippage of the nascent lagging strand leads to expansion (Figure 1.2) (Heidenfelder et al., 2003; Mirkin, 2006). Recently, it was demonstrated that expanded GAA repeats at the FXN locus in FRDA obstructed the progression of the replication machinery and allowed the replication forks to proceed predominantly in a 3′-5′ direction, which could promote repeat expansion (Gerhardt et al., 2016). In addition, recent studies have shown that changes in the epigenetic status surrounding repeat expansions also contribute to repeat instability (Libby et al., 2008; Dion and Wilson, 2009). A comparative analysis of the epigenetic context of disease-associated trinucleotide repeats (TNRs) and a random set of repeats revealed that expansion-prone repeats differ from background TNRs in their epigenetic profiles. These data indicate that a certain epigenetic status is associated with repeat expansion (Essebier et al., 2016).
Figure 1.2: Repeat instability during replication.
1.5 Abnormal structures
1.5.1 Abnormal triplex structure
The exact mechanism through which the GAA repeat expansion leads to FXN gene silencing is not yet fully understood. Several mechanisms have been proposed to explain how the triplet expansion inhibits frataxin expression. It has been suggested that GAA repeat expansion promotes the formation of non-B-DNA structures, such as triplex-based sticky DNA and R-loops, and of heterochromatin at pathogenic FXN alleles, which can lead to frataxin deficiency. Bidichandani et al. (1998) demonstrated that GAA expansion is associated with an unusual DNA structure and transcription deficiency in a length-dependent manner (Bidichandani et al., 1998; Rajeswari, 2012). The GAA.TTC repeat is a homopurine (R) – homopyrimidine (Y) mirror repeat, which can adopt non-B-DNA conformations, including triplexes (H-DNA), sticky DNA or hairpin structures. The GAA repeat can form either R·R·Y or Y·R·Y intramolecular triplexes by a fold-back mechanism. In both motifs, the purine strand has the central position and forms normal Watson-Crick pairs with the complementary pyrimidine strand (Figure 1.3). Binding of the third strand, containing either purines (R·R·Y) or pyrimidines (Y·R·Y), occurs through ‘Hoogsteen’ or ‘reverse-Hoogsteen’ hydrogen bonding (Sakamoto et al., 1999; Grabczyk and Usdin, 2000c; Sakamoto et al., 2001c; Mirkin, 2006).
1.5.2 Sticky DNA
Sticky DNA is one of the unusual DNA conformations proposed for GAA.TTC repeats. Sakamoto et al. (1999) reported that in a negatively supercoiled plasmid at neutral pH, association of two long GAA.TTC repeats in the R·R·Y configuration leads to the formation of sticky DNA. They proposed a strand exchange model in which two R·R·Y triplexes exchange their Y strands to form an extremely stable hybrid triplex (Figure 1.3) (Sakamoto et al., 1999). It has been reported that the sticky DNA structure binds directly to RNA polymerase, sequesters it and thus strongly impairs in vitro transcription both in cis and in trans (Sakamoto et al., 2001c). Sakamoto et al. (2001) showed that introducing interruptions into GAA repeats abolishes the formation of sticky DNA and therefore reduces its negative effect on transcription through the GAA repeat (Sakamoto et al., 2001a).
Several studies have provided evidence that the unusual DNA structure within the GAA repeat impedes transcription of the FXN mRNA. It has been proposed that triplex formation may prevent binding of trans-acting factors (TF) to transcription-control sequences and abolish transcriptional enhancement (Kohwi and Kohwi-Shigematsu, 1991), or alternatively that it may trap the RNA polymerase and create a pause site (Sakamoto et al., 2001c). Grabczyk and Usdin proposed a model to explain the role of DNA triplexes in impeding transcription through the GAA repeat. Their model suggests that transient formation of DNA triplexes behind an advancing RNA polymerase within a long GAA.TTC tract leads to R-loop (RNA.DNA hybrid) formation. R-loop formation negatively affects FXN transcription elongation and consequently results in reduced frataxin mRNA and protein levels (Figure 1.4) (Grabczyk and Usdin, 2000a; Grabczyk and Usdin, 2000c). It has been demonstrated that RNA.DNA hybrid formation is a fundamental feature of GAA repeat transcription (Grabczyk et al., 2007). Groh et al. (2014) observed R-loop formation over the expanded GAA region; the R-loop level correlated with the GAA expansion length, and the R-loops co-localised with the H3K9me2 chromatin mark (Groh et al., 2014). Several studies have demonstrated that elimination of RNA.DNA hybrids by RNase H, which specifically disrupts RNA.DNA hybrids, significantly improves mRNA elongation (Huertas and Aguilera, 2003; Groh et al., 2014). Moreover, the observation of this hybrid formation even in ‘pre-mutation’ size alleles suggests that RNA.DNA hybrid formation contributes to GAA repeat instability (Grabczyk et al., 2007).
Figure 1.4: An intramolecular triplex impedes transcription and gives rise to RNA.DNA hybrid.
1.6 Epigenetic changes in FRDA
The word “epigenetic” literally means “on top of or in addition to genetics”. Epigenetics is now defined as ‘‘the study of changes in gene function that are mitotically and/or meiotically heritable and that do not involve a change in the DNA sequence’’ (Dupont et al., 2009; Kanherkar et al., 2014). The major epigenetic mechanisms include DNA methylation, histone modification and ncRNA-mediated gene silencing. Disruption of these epigenetic mechanisms can alter gene expression through chromatin remodeling. Accumulating evidence suggests that mutations and non-B-DNA structures could account for epigenetic changes and heterochromatin formation, thereby impeding gene transcription (Chen et al., 1995; Al-Mahdawi et al., 2008). Several epigenetic changes and abnormal heterochromatinisation have been identified in the immediate vicinity of the expanded GAA repeats of the FXN gene (Sandi et al., 2014). The first evidence for epigenetic involvement in FRDA came from a study by Saveliev and colleagues. In eukaryotes, DNA wraps around histone proteins to form a nucleoprotein complex called “chromatin”. Heterochromatin is a highly condensed and transcriptionally inaccessible type of chromatin, which contains large blocks of repetitive DNA sequences. Heterochromatin is invasive and tends to affect the expression of nearby genes (Yandim et al., 2013). Saveliev et al. showed that, in transgenic mice, juxtaposition of expanded GAA repeats with a heterochromatin-sensitive human CD2 (hCD2) transgene alters its expression through position effect variegation (PEV) (Saveliev et al., 2003). They showed that, irrespective of the chromosomal location of the hCD2 transgene, inclusion of the GAA repeats can suppress hCD2 expression.
In addition, they reported that overexpression of a powerful modifier of PEV, heterochromatin protein 1 (HP1), in hCD2-GAA transgenic mouse lines significantly decreases hCD2 expression, whereas in the absence of GAA repeats HP1 overexpression did not affect hCD2 expression (Saveliev et al., 2003). PEV occurs when a gene that was previously located in a euchromatic region is repositioned close to a heterochromatic region. The relocated gene exhibits variable expression because of the change in its position (Elgin and Reuter, 2013). PEV is a hallmark of heterochromatin-mediated gene silencing. Recent studies strongly suggest that epigenetic changes, including DNA methylation, histone modifications and repressive chromatin formation, are involved in FXN gene silencing in FRDA (Elgin and Grewal, 2003; Herman et al., 2006).
1.6.1 DNA methylation
DNA methylation is a silencing epigenetic mark that involves the addition of a methyl group to the C5 position of cytosine in the context of CpG dinucleotides. Gene silencing is the predominant consequence of DNA methylation. Accordingly, aberrant DNA methylation is a feature of a number of important human diseases, such as cancer, in which DNA methylation in the promoter regions of key tumour suppressor genes promotes oncogenic progression (Pook, 2012). Recently, a large amount of evidence has emerged to show that DNA methylation has an important role in repeat expansion diseases. Disease-associated DNA methylation has been identified in trinucleotide repeat expansion diseases such as myotonic dystrophy type 1 (DM1) (López Castel et al., 2010), fragile X syndrome (FRAXA) and FRDA. The expansion of CGG repeats to over 200 triplets within the 5′UTR of the FMR1 gene leads to hypermethylation of the CpG sites along the FMR1 promoter region, transcriptional silencing and development of fragile X syndrome (Chandler et al., 2003; Naumann et al., 2009). Pathological expansion of a G4C2 repeat in C9orf72 is the most common genetic cause of frontotemporal lobar degeneration (FTLD) and amyotrophic lateral sclerosis (ALS). A recent study showed that increasing C9orf72 repeat length correlates with earlier disease onset, increased methylation states of 5′ CpG islands and reduced promoter activity (Gijselinck et al., 2016). Similar patterns have been reported in FRDA patients. The FXN gene has three putative CpG islands (Figure 1.5) (http://cpgislands.usc.edu/) and several studies have investigated the role of DNA methylation in FRDA. Analysis of DNA methylation at the FXN locus has revealed an overall increase in methylation of the CpG island upstream of the GAA repeat in FRDA-derived lymphoblastoid cells (Greene et al., 2007).
Al-Mahdawi and colleagues also reported an increase in DNA methylation upstream of the GAA repeats (UP) in two lines of FRDA YAC transgenic mice (YG8 and YG22) and in clinically important FRDA tissues such as heart, brain and cerebellum (Al-Mahdawi et al., 2008). A similar pattern was demonstrated in two large-scale studies, confirming the previous observations of hypermethylation in the UP region of the FXN gene in FRDA individuals. These studies reported a positive correlation between triplet expansion size, disease severity and the level of upstream methylation (Evans‐Galea et al., 2012) and a strong negative correlation between methylation level and the age of disease onset (Castaldo et al., 2008; Sandi et al., 2013; Yandim et al., 2013). Lorincz and colleagues demonstrated that intragenic DNA methylation may also reduce transcription elongation efficiency by inducing heterochromatin formation, suggesting that intronic DNA methylation in the region upstream of the GAA repeat is involved in FXN gene silencing (Lorincz et al., 2004).
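To make the notion of a putative CpG island more concrete, the following Python sketch computes the two quantities most often used to flag one: GC content and the observed/expected CpG ratio. The thresholds (length of at least 200 bp, GC content of at least 50%, observed/expected ratio of at least 0.6) are the widely used Gardiner-Garden and Frommer heuristics and are an assumption here; the criteria actually applied by the tool cited above are not stated in the text. The function names are hypothetical.

```python
def cpg_stats(seq: str) -> tuple[float, float]:
    """Return (GC content, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # CpG dinucleotides on the given strand
    gc_content = (c_count + g_count) / n if n else 0.0
    # Expected CpG count under independence is (C * G) / N.
    if c_count and g_count:
        obs_exp = (cpg_count * n) / (c_count * g_count)
    else:
        obs_exp = 0.0
    return gc_content, obs_exp


def looks_like_cpg_island(seq: str, min_len: int = 200,
                          min_gc: float = 0.5, min_oe: float = 0.6) -> bool:
    """Apply the assumed length, GC-content and obs/exp thresholds."""
    gc_content, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_content >= min_gc and obs_exp >= min_oe


if __name__ == "__main__":
    # Synthetic sequences for illustration only, not FXN sequence.
    print(cpg_stats("CG" * 100), looks_like_cpg_island("CG" * 100))
    print(cpg_stats("AT" * 100), looks_like_cpg_island("AT" * 100))
```

A real analysis would scan a sliding window along the locus rather than score a single fragment; the fixed-fragment form is used here only to keep the calculation visible.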
1.6.2 Histone modifications
Eukaryotic DNA is tightly packed into nucleosomes, each composed of DNA coiled around an octamer made up of two copies of each of the histone proteins H2A, H2B, H3 and H4. There are different types of histone modifications, which can affect gene expression by determining the accessibility of the DNA for transcription. The simplest and most extensively studied classes of histone modifications are the acetylation and methylation of lysine (K) residues. Acetylation of lysine is a hallmark of transcriptionally active chromatin. This is probably because acetyl groups (CH3CO-) neutralise the positive charge of the lysine side chain, which can weaken the binding between the histones and the negatively charged DNA backbone and allow the chromatin structure to open up (Allfrey et al., 1964; Halleck and Gurley, 1981; Yandim et al., 2013). Histone acetylation also results in the recruitment of chromatin remodeling factors that lead to a more permissive and transcriptionally competent chromatin structure. Such factors are recruited through their acetyl-lysine recognition domains, known as bromodomains (BRD) (Swygert and Peterson, 2014; Ferri et al., 2016). Hyperacetylation of histones H3 and H4 is regarded as an active mark, whereas methylation of H3K9 and H3K27 is a hallmark of heterochromatin formation and subsequent gene silencing (Karmodiya et al., 2012; Wang et al., 2008). It is well established that a conserved histone methyltransferase (HMTase), SUV39H1/2, selectively methylates lysine 9 of histone H3 (Rea et al., 2000) and thereby generates a binding site for HP1 proteins, a family of heterochromatic adaptor molecules (Lachner et al., 2001). In turn, HP1 recruits more HMTases and creates a positive feedback loop, resulting in heterochromatin formation (Figure 1.6). H3K27 methylation, on the other hand, is catalysed by polycomb repressive complex 2 (PRC2).
Previous studies suggested that the two marks recruit distinct protein machineries. However, recent studies showed that these marks frequently colocalise in the genome and cooperate in the maintenance of gene silencing (Boros et al., 2014; Saksouk et al., 2015; van Kruijsbergen et al., 2015). Hypoacetylation of histones H3 and H4, together with increased levels of H3K9 di- and trimethylation upstream of the GAA repeat, were the first histone modifications reported within the FXN locus (Herman et al., 2006). Investigation of FRDA patient brain, cerebellum and heart tissues and of YAC transgenic mice revealed significant H3 and H4 histone deacetylation and increased H3K9 di- and trimethylation at the promoter and both upstream and downstream of the GAA repeats (Al-Mahdawi et al., 2008). Furthermore, repressive marks such as H3K9me3 and H3K27me3, as well as HP1, were highly enriched in the 5′UTR of the FXN gene (De Biase et al., 2009). Kim et al. (2011) reported significantly decreased levels of H3K4me3 at a region immediately upstream of the GAA repeats. Numerous studies suggest that the H3K4me3 modification is often positively correlated with transcription initiation and with a chromatin structure that is more accessible to transcription factors. In addition, it has been shown that H3K4me3 facilitates recruitment of transcription post-initiation factors and enhances the efficiency of transcript elongation. This finding indicates a defect in FXN transcription elongation rather than in transcriptional initiation in FRDA (Sims et al., 2007; Kim et al., 2011). H3K36me3 and H3K79me2 are known hallmarks of transcription elongation. Decreased methylation of H3K79 and H3K36 in FRDA cell lines supports the idea that frataxin deficiency results from defective FXN transcription elongation (Kim et al., 2011; Sandi et al., 2013; Sandi et al., 2014).
Figure 1.6: Spreading of heterochromatin via HP1 and SUV39H.
CCCTC-binding factor (CTCF) is an 82-kDa evolutionarily conserved and ubiquitously expressed 11-zinc-finger DNA-binding protein. The 100% evolutionary conservation of the entire CTCF 11-Zn-finger region indicates that these domains are involved in a highly conserved biological function (Filippova et al., 1996). CTCF is a multifunctional protein, initially characterised as a transcriptional repressor of the chicken c-myc gene (Lobanenkov et al., 1990; Klenova et al., 1993; Kim et al., 2015). Later, CTCF was found to also function as a chromatin insulator. Indeed, it is the only insulator protein known in vertebrates. Insulators are DNA-protein complexes that protect genes from inappropriate signals emanating from their surrounding environments (West et al., 2002). Insulators can function as enhancer-blockers, which block enhancer-promoter interactions, or act as barriers against the spread of heterochromatin into a neighbouring domain (Yang and Corces, 2011). Several recent studies propose that insulators may mediate three-dimensional looping of genomic regions with the primary goal of organising the eukaryotic genome into epigenetically heritable states (Yang and Corces, 2011; Herold et al., 2012). The insulator activity of CTCF was first identified at the 5′ end of the chicken β-globin locus. Bell et al. (1999) showed that CTCF was able to interfere with enhancer-promoter communication in a directional manner and separate the locus from neighbouring heterochromatin (Bell et al., 1999). The insulator function of CTCF was also reported at the imprinted Igf2 (insulin-like growth factor 2)–H19 locus. It was shown that CTCF can bind to the unmethylated imprinting control region (ICR) and block the access of Igf2 to the distal enhancer (Figure 1.7) (Bartolomei et al., 1991; Bell and Felsenfeld, 2000).
Figure 1.7: The neighbouring Igf2 and H19 genes are reciprocally imprinted.
Some studies have suggested that CTCF may act as a barrier at heterochromatin boundaries and protect genes against PEV. Genome-wide studies of the localisation of CTCF in three cell types revealed that CTCF-binding sites are significantly enriched at the boundaries between the repressive H3K27me3 and active H2AK5ac domains, indicating that CTCF may be involved in the chromatin barrier function (Cuddapah et al., 2009). A recent study by Weth et al. (2014) has demonstrated that upon CTCF depletion, H3K27me3 spreads into the nearby active domain and thereby causes gene repression. They also showed that CTCF removes H3K27me3, a hallmark of repressed domain, and actively converts repressed chromatin into active chromatin (Weth et al., 2014).
CTCF plays a role in several TNR diseases. Genome-wide mapping studies and studies of TNR diseases such as myotonic dystrophy (DM1), spinocerebellar ataxia type 2 (SCA2) and type 7 (SCA7), and Huntington’s disease (HD) show that CTCF binding sites are present on one or both sides of repeat expansion regions (Filippova et al., 2001; Libby et al., 2008). The role of CTCF in establishing the local chromatin structure at repeats was first shown at the DM1 locus, where two CTCF binding sites flank the DM1 CTG repeats and restrict the repressive chromatin structure to the repeat region. Loss of CTCF binding in congenital DM1 was associated with the spread of heterochromatin into nearby regions (Cho et al., 2005). In FRDA, a single CTCF binding site has been identified in the 5′ untranslated region (5′UTR) of the FXN gene. De Biase et al. (2009) reported that the GAA repeat expansion in FRDA is associated with severe depletion of CTCF in the 5′UTR of the FXN gene. However, since other genes were not found to be affected, CTCF depletion does not appear to be a generalised defect in FRDA (De Biase et al., 2009). Analysis of FRDA cerebellum also demonstrated that CTCF occupancy was reduced to only 20% of that in unaffected cerebellum controls (Al-Mahdawi et al., 2013). It has also been reported that CTCF depletion in FRDA is associated with higher levels of an antisense transcript and heterochromatin formation at the FXN locus (De Biase et al., 2009).
1.6.4 The role of antisense transcription in FRDA
Antisense transcription refers to transcription from the strand opposite the protein-coding (sense) strand, which produces a single-stranded RNA known as an antisense transcript. This antisense transcript may overlap partially or completely with the sense transcript. Antisense transcripts are widespread in eukaryotic genomes, and it has been estimated that at least 20% of human genes have an antisense partner (Chen et al., 2004). Similar to sense transcription, antisense transcription is driven by a promoter, which can be either an independent unidirectional promoter or a divergent (bidirectional) promoter (Pelechano and Steinmetz, 2013). Various models have been suggested to explain how antisense transcripts are involved in the regulation of gene expression. In many organisms, antisense transcripts have emerged as key epigenetic regulators of gene expression. A classic example is mammalian X chromosome inactivation, in which XIST silences one of the two X chromosomes in females. The action of XIST is negatively regulated in cis by its antisense transcript, TSIX. TSIX transcription induces histone modifications and alters chromatin conformation in the XIST promoter region. It has been shown that loss of TSIX transcription abolishes heterochromatin formation at the XIST promoter and is associated with reduced CpG methylation and aberrant histone modification in the 5′ region of the Xist gene (Ohhata et al., 2008).
Antisense RNA can also affect expression of its sense partner through DNA methylation and chromatin modifications at the promoter region. Yu et al. (2008) reported that in leukaemia, expression of p15, a well-known tumour suppressor gene (TSG), is negatively regulated in cis and in trans by its antisense transcript, p15AS. p15AS transcription leads to Dicer-independent heterochromatin formation through an increase in H3K9 dimethylation and a decrease in H3K4 dimethylation (Yu et al., 2008). Antisense RNA transcribed at the locus of p21, another TSG, represses p21 gene expression. The p21 antisense RNA acts as an effector molecule, which directs epigenetic regulatory complexes and H3K27me3 enrichment to the sense promoter region (Morris et al., 2008; Faghihi and Wahlestedt, 2009). Important evidence of the relationship between antisense RNA and DNA methylation was provided by Tufarelli et al. (2003). They showed that heavy methylation of the CpG island and silencing of the haemoglobin α2 gene (HBA2) in patients with a class of α-thalassemia are correlated with the transcription of an antisense transcript. An aberrant LUC7L transcript, which is transcribed from the opposite strand to the HBA2 gene, extends into the CpG island region of HBA2 and causes hypermethylation and silencing of the gene (Tufarelli et al., 2003; Morris, 2012; Pelechano and Steinmetz, 2013). Antisense transcripts with a potential pathogenic impact have been detected in several repeat-associated diseases. At the Huntington’s disease (HD) locus, a natural antisense transcript has been detected that regulates the level of HTT expression (Chung et al., 2011). It has been proposed that an antisense transcript spanning the CGG repeat region of FMR1 might contribute to the pathogenesis of fragile X syndrome (FXS) and fragile X-associated tremor and ataxia syndrome (FXTAS) (Ladd et al., 2007).
Recent studies show that the FMR1 antisense transcript, ASFMR1, supports a non-canonical type of protein translation called repeat-associated non-AUG (RAN) translation, which drives the production of toxic proteins from CGG repeats (Krans et al., 2016). In addition, studies suggest that antisense transcription in spinocerebellar ataxia type 8 (SCA8) (Moseley et al., 2006), amyotrophic lateral sclerosis/frontotemporal dementia (ALS/FTD) (Zu et al., 2013) and DM1 (Cho et al., 2005) is a fundamental pathological feature of these repeat-associated diseases. The expanded allele in congenital DM1 is associated with an antisense transcript emanating from the adjacent SIX5 regulatory region, the loss of CTCF binding and the propagation of heterochromatin to the surrounding regions (Cho et al., 2005; Gudde et al., 2017).
Literature supporting the notion that antisense transcripts are involved in heterochromatin formation and in regulating the expression of their partner mRNAs inspired De Biase et al. (2009) to investigate whether an antisense transcript is present in FRDA. They performed strand-specific reverse transcription PCR with a primer located upstream of FXN transcription start site 3 (TSS3) and discovered an antisense transcript overlapping the CTCF binding site (Figure 1.8). This novel antisense transcript, named FAST-1 (FXN Antisense Transcript-1), was produced at significantly higher levels in FRDA fibroblasts. Sequencing and BLAST analysis of the FAST-1 transcript confirmed that it originated from the FXN locus. Higher levels of FAST-1 were associated with severe CTCF depletion and concomitant heterochromatin formation in the 5′ untranslated region (5′UTR) of the FXN gene (De Biase et al., 2009). A comparison of CTCF occupancy at the 5′UTR in fibroblast cell lines from two normal individuals and two FRDA patients revealed a four-fold reduction in CTCF at the 5′UTR in the FRDA samples. In addition, they examined CTCF occupancy at two other loci on chromosome 9q in FRDA and normal cell lines and showed that FRDA cells do not have a generalised defect of CTCF binding. The +1 nucleosome of the FXN gene includes the CTCF binding site in the 5′UTR. It is followed by a long nucleosome-free region, which contains the three reported TSSs of the FXN gene (Figure 1.8) (De Biase et al., 2009). De Biase and colleagues observed that knocking down CTCF by siRNA in wild-type fibroblasts resulted in high levels of FAST-1 and reduced FXN transcription. Moreover, they detected significant H3K9me3, H3K27me3 and HP1 enrichment in FRDA fibroblasts, suggesting repressive heterochromatin formation involving the +1 nucleosome and ensuing transcriptional silencing of the FXN gene (De Biase et al., 2009).
To determine the exact size and location of FAST-1, its characteristics were investigated in our lab, and a full-length FAST-1 transcript of 523 bp was identified. Mapping the 3′ and 5′ ends of the FAST-1 transcript onto the genome precisely localised them to nucleotides -359 and 164 of the FXN gene, respectively. A poly(A) signal was also identified in the FAST-1 sequence at nucleotide positions -341 to -346 (Figure 1.9) (Sandi and Pook, 2015). FAST-1 also comprises a 462 bp open reading frame, which putatively encodes 154 amino acids. The open reading frame is intact in humans and chimpanzees, but it does not contain a Kozak consensus sequence (De Biase et al., 2009).
Figure 1.9: The 5′ end of FXN gene showing the region corresponding to the full length FAST-1 transcript.
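The coordinates reported for FAST-1 can be checked for internal consistency with a short calculation. The sketch below is illustrative only and is not part of the cited studies. It assumes that nucleotide positions are numbered without a position 0 (that is, -1 is immediately followed by +1), which is an assumption about the numbering convention rather than something stated in the source. The function and constant names are hypothetical. Whether the 154 codons of the open reading frame include the stop codon is not stated in the source and is not resolved here.

```python
def span_length(pos_a: int, pos_b: int) -> int:
    """Inclusive number of bases between two signed positions.

    Assumes a numbering scheme with no position 0 (..., -2, -1, +1, +2, ...).
    """
    if pos_a == 0 or pos_b == 0:
        raise ValueError("position 0 does not exist in this numbering scheme")
    low, high = sorted((pos_a, pos_b))
    length = high - low + 1
    if low < 0 < high:
        length -= 1  # the span crosses the -1/+1 boundary; there is no base 0
    return length


# Values taken from the text (Sandi and Pook, 2015; De Biase et al., 2009).
FAST1_3PRIME_END = -359
FAST1_5PRIME_END = 164
POLYA_SIGNAL = (-346, -341)
ORF_LENGTH_BP = 462

transcript_length = span_length(FAST1_3PRIME_END, FAST1_5PRIME_END)
polya_length = span_length(*POLYA_SIGNAL)
codons, remainder = divmod(ORF_LENGTH_BP, 3)

print(transcript_length)  # 523, matching the reported transcript length
print(polya_length)       # 6, the length of a canonical hexameric poly(A) signal
print(codons, remainder)  # 154 0, i.e. 154 complete codons
```

Under this numbering assumption the reported end positions reproduce the stated 523 bp length, and the poly(A) signal lies 13 to 18 nucleotides from the mapped 3′ end at -359.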
1.7 Mouse models
1.7.1 Knock-out and knock-in mouse models
Cossée and colleagues generated an Fxn knock-out mouse model by deleting exon 4, the most conserved and functionally important sequence of the Fxn gene. Homozygous frataxin deletion resulted in the complete loss of frataxin function and early embryonic lethality (Cossée et al., 2000), indicating that residual frataxin is needed for survival until after birth. Puccio and colleagues then developed two viable lines of conditional knock-out mice, in which Fxn was deleted in specific tissues. They generated a muscle frataxin-deficient line and a neuron/cardiac muscle frataxin-deficient line, which together could mimic several pathophysiological and biochemical characteristics of FRDA (Puccio et al., 2001). Since GAA repeat expansion is the main cause of FRDA and none of the knock-out mouse models had GAA repeat expansions, Miranda et al. inserted, or ‘knocked in’, a (GAA)230 repeat within the first intron of the Fxn gene. To achieve a further reduction in frataxin levels, the knock-in (KI) mice were crossed with knock-out (KO) mice to generate KIKO mice (Fxn-/230GAA). KIKO mice showed frataxin expression at 25-30% of wild-type levels, but these mice did not initially show defects in motor coordination, iron metabolism or response to iron loading, nor did they show GAA repeat instability (Miranda et al., 2002). However, neurobehavioural deficits have recently been discovered in KIKO mice (McMackin et al., 2017).
1.7.2 FRDA YAC transgenic mouse models
To investigate whether human frataxin can remain functional in a mouse environment and rescue the embryonic lethal phenotype of homozygous Fxn knock-out mice, a human wild-type FRDA yeast artificial chromosome (YAC) transgenic mouse line was crossed twice with heterozygous Fxn exon 4 deletion knock-out mice. The result was a phenotypically normal mouse line with no endogenous mouse frataxin that expressed only YAC-derived human frataxin (Pook et al., 2001). To study the relationship between GAA repeat expansion and FRDA, Al-Mahdawi and colleagues introduced GAA repeat expansions into human YACs and generated two lines of transgenic mice, YG8 and YG22. Both of these mouse lines showed intergenerational and age-related somatic instability of the GAA expansion (Al-Mahdawi et al., 2004), as well as reduced levels of frataxin mRNA and protein expression and decreased aconitase activity (Al-Mahdawi et al., 2006).
Currently, there is no effective cure for FRDA; however, different therapeutic approaches have been investigated to either treat the symptoms or address the cause of the disease (Figure 1.10). Although the exact molecular mechanisms underlying FRDA are not fully understood, unusual DNA structures and repressive chromatin formation resulting from GAA repeat expansion have been proposed to play a role. It has been reported that the presence of complementary oligodeoxyribonucleotides (ODNs) during transcription inhibits particular types of triplex formation in vitro and consequently increases the yield of full-length RNA. However, due to the high concentration required and the potential for degradation, using unmodified ODNs in vivo is not practical (Grabczyk and Usdin, 2000a). Other molecules, such as polyamides, can disrupt the formation of FRDA triplexes and sticky DNA by locking the GAA repeat into a double-stranded B-DNA structure, and subsequently increase frataxin mRNA and protein levels in FRDA lymphoblast cells. However, microarray analysis of polyamide effects on global gene expression revealed that the polyamides also significantly changed the expression of a small number of genes (Burnett et al., 2006). In a recent study, Corey and colleagues developed two distinct synthetic agents, an anti-GAA duplex RNA and a single-stranded locked nucleic acid (LNA). These two synthetic nucleic acids employ the same strategy to restore FXN levels in FRDA cells: they target GAA repeats and thereby prevent R-loop formation and the resultant repressive chromatin formation. This study demonstrated that simple binding to the mutant transcript is sufficient to reactivate frataxin expression in FRDA cells. No changes in the expression of other GAA repeat-containing genes were reported in this study (Li et al., 2016).
Another novel therapeutic strategy is to deliver a functional FXN gene to the main site of pathology. It has been reported that delivering functional FXN into heart cells of a mouse model with complete deletion of frataxin in cardiac and skeletal muscle can completely prevent the development of cardiac disease and even reverse it when cardiomyopathy is already established (Perdomini et al., 2014).
FRDA is caused by a partial deficiency of FXN protein. Therefore, exogenous delivery of functional protein to the main site of pathology may rescue the FRDA phenotype. Payne and colleagues used a cell-penetrant peptide, TAT, to deliver human FXN to mitochondria in FRDA fibroblast cells and conditional knock-out mice. They reported reduced sensitivity to oxidant stress in FRDA fibroblast cells, and increased cardiac function, aconitase activity, growth velocity and lifespan in conditional Fxn knock-out mice after treatment with TAT-FXN fusion protein (Vyas et al., 2012). Since GAA repeat expansions induce epigenetic changes and heterochromatin formation, resulting in transcriptional silencing of the FXN gene, the potential reversibility of epigenetic changes has made them attractive targets for FRDA treatment. In particular, histone deacetylase inhibitors (HDACi) have been used to boost frataxin expression. It has been reported that treating KIKI mice with a pimelic o-aminobenzamide HDACi (compound 106) restores normal frataxin levels and increases acetylation at lysine residues in histones H3 and H4 near the GAA repeat (Rai et al., 2008). In addition, long-term administration of HDACi compound 109 to YG8R FRDA mice increased frataxin protein in the brain and improved motor coordination and locomotor activity (Chutake et al., 2016; Sandi et al., 2011). A phase Ib clinical trial with HDACi compound 109/RG2833 has been successfully completed (Soragni et al., 2014; Bürk, 2017). Nicotinamide (vitamin B3) is another HDACi that has been reported to upregulate FXN expression in different FRDA models (Chan et al., 2013). A clinical study revealed that nicotinamide can significantly upregulate frataxin protein and attenuate heterochromatin formation in FRDA patients; however, no significant clinical improvement was reported (Libri et al., 2014).
Furthermore, compound 5 (C5) is a small molecule with HDACi activity, which has been shown to restore FRDA histone acetylation to wild-type levels and increase FXN expression in FRDA patient-derived primary lymphocytes. The C5 molecule was identified by using an FRDA genomic DNA reporter cell model to screen and pre-select potential therapeutic compounds. Using this type of cell model system can considerably increase the efficiency and speed of primary compound screening (Lufino et al., 2013). To justify further development of such candidate therapeutic compounds, FRDA mouse models can be used to validate the initial results obtained from in vitro studies (Al-Mahdawi et al., 2008).