Molecular Vision 2015; 21:955-973
<http://www.molvis.org/molvis/v21/955>
Received 29 June 2015 | Accepted 26 August 2015 | Published 28 August 2015
Jian Sun,1,2 Shira Rockowitz,2 Daniel Chauss,3 Ping Wang,4 Marc Kantorow,3 Deyou Zheng,2,4,5 Ales Cvekl1,2
1Department of Ophthalmology and Visual Sciences, Albert Einstein College of Medicine, Bronx, NY; 2Department of Genetics, Albert Einstein College of Medicine, Bronx, NY; 3Department of Biomedical Science, Florida Atlantic University, Boca Raton, FL; 4Department of Neurology, Albert Einstein College of Medicine, Bronx, NY; 5Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY
Correspondence to: Ales Cvekl, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461; Phone: (718) 430-3217, FAX: (718) 430-8778; email: ales.cvekl@einstein.yu.edu
Purpose: Gene expression correlates with local chromatin structure. Our studies have mapped histone post-translational modifications, RNA polymerase II (pol II), and transcription factor Pax6 in lens chromatin. These data represent the first genome-wide insights into the relationship between lens chromatin structure and lens transcriptomes and serve as an excellent source for additional data analysis and refinement. The principal lens proteins, the crystallins, are encoded by predominantly expressed mRNAs; however, the regulatory mechanisms underlying their high expression in the lens remain poorly understood.
Methods: Formaldehyde-assisted isolation of regulatory elements followed by sequencing (FAIRE-seq) was employed to analyze newborn lens chromatin. ChIP-seq and RNA-seq data published earlier (GSE66961) were used to assist in FAIRE-seq data interpretation. RNA transcriptomes from murine lens epithelium, lens fibers, erythrocytes, forebrain, liver, neurons, and pancreas were compared to establish the gene expression levels of the most abundant mRNAs versus median gene expression across other differentiated cells.
Results: Normalized RNA expression data from multiple tissues show that crystallins rank among the most highly expressed genes in mammalian cells. These findings correlate with the extremely high abundance of pol II all across the crystallin loci, including crystallin genes clustered on chromosomes 1 and 5, as well as within regions of “open” chromatin, as identified by FAIRE-seq. The expression levels of mRNAs encoding DNA-binding transcription factors (e.g., Foxe3, Hsf4, Maf, Pax6, Prox1, Sox1, and Tfap2a) revealed that their transcripts form “clusters” of abundant mRNAs in either lens fibers or lens epithelium. The expression of three autophagy regulatory mRNAs, encoding Tfeb, FoxO1, and Hif1α, was found within a group of lens preferentially expressed transcription factors compared to the E12.5 forebrain.
Conclusions: This study reveals novel features of lens chromatin, including the remarkably high abundance of pol II at the crystallin loci that exhibit features of “open” chromatin. Hsf4 ranks among the most abundant fiber cell-preferred DNA-binding transcription factors. Notable transcripts, including Atf4, Ctcf, E2F4, Hey1, Hmgb1, Mycn, RXRβ, Smad4, Sp1, and Taf1 (transcription factors) and Ctsd, Gabarapl1, and Park7 (autophagy regulators) have been identified with high levels of expression in lens fibers, which suggests specific roles in lens fiber cell terminal differentiation.
Genome-wide studies provide unbiased opportunities to better understand the molecular mechanisms of gene control. The source data include quantitative mapping of transcriptional units, including mRNAs and ncRNAs, by RNA-seq and the identification of DNA-binding transcription factors, their co-activators (e.g., chromatin remodeling complexes), RNA polymerase II (pol II), histone post-translational modifications (PTMs), and DNA methylation by ChIP-seq and similar methods. The integration of individual data and their analysis result in a comprehensive understanding of chromatin structure and RNA synthesis and processing [1-3]. To achieve this, we have recently mapped the key histone PTMs, including H3K4me1, H3K4me3, H3K27ac, and H3K27me3; pol II; and the DNA-binding transcription factor Pax6 in newborn lens chromatin [4]. Several RNA-seq studies have recently been conducted using newborn mouse lenses microdissected into the lens epithelium and lens fibers [4,5] and E13 chicken lenses microdissected into central anterior lens epithelium, equatorial epithelium, cortical fibers, and central fibers [6]. Genome-wide studies of mRNAs more abundant in embryonic lens compared to whole embryonic tissue yielded data that led to the establishment of a powerful iSyTE database enriched for lens disease-causing genes [7]. This database and other emerging data can be further mined by mapping additional features of chromatin, by performing computational re-analyses beyond the original studies, or by combining both approaches.
Numerous studies have shown that promoters and enhancers are accessible to nuclease digestion in vivo. This accessibility is thought to reflect less compacted chromatin and the presence of nucleosome-free regions [3]. Genome-wide “open” chromatin structure can be mapped using multiple approaches [3], including FAIRE-seq [8-10], DNaseI-seq [11], MNase-seq [12], and ATAC-seq [13,14]. FAIRE-seq [10] is based on earlier findings of nucleosome-free regions that may encompass as much as 2% of the genome [15]. These nucleosome-free regions, with an average length of 149 bp, represent platforms where clusters of multiple DNA-binding transcription factors are frequently located. Thus, the identification of nucleosome-free regions aids in the identification of transcription factors that might occupy these regions.
Lens development is an excellent model system for studying gene regulation, chromatin structure, and the degradation of subcellular organelles, as both the nuclei and mitochondria need to be degraded in terminally differentiated primary lens fibers [16-18]. A hallmark of lens fiber cell differentiation is a high level of crystallin gene expression [19], but the expression levels of crystallins related to other highly transcribed genes in different tissues have not been compared. In addition, the chromatin structure of crystallin loci remains poorly understood. Although a basic set of DNA-binding transcription factors that regulate crystallin gene expression has been identified [20,21], further studies are required to evaluate nucleosome-free regions and the distribution of pol II at the genome-wide level. Similarly, an analysis of genome-wide data pertinent to the expression of DNA-binding transcription factors and regulators of fiber cell-specific processes, such as the degradation of the subcellular organelles, should reveal novel features of the transcriptional control of these important families of genes and provide critical information for selecting the best candidates for future genetic studies of lens differentiation.
In this study, we conducted FAIRE-seq analysis of lens chromatin and integrated these data with histone PTMs and pol II occupancy. Our data establish αA-crystallin (Cryaa) as the most expressed gene in lens epithelium and as one of the most highly expressed genes in the mammalian body. Our data demonstrate a remarkable pol II abundance across the transcribed and frequently duplicated crystallin genes. The expression of crystallins in lens fibers is quantitatively comparable to hemoglobin gene expression in red blood cells. Additionally, among the DNA-binding transcription factors, Hsf4 expression is most highly enriched in the lens fiber cells. Finally, we found that the most abundant transcription factors are those with well-established roles in lens development and differentiation.
The RNA-seq data were downloaded from the Gene Expression Omnibus (GEO) with the following accession numbers: erythrocytes (GSM1464982 and GSM1464983), liver (GSM1002564, GSM1182941, GSM1182942, and GSM1182943), neurons (GSM818951), and pancreas (GSM543645, GSM543646, GSM1002565, GSM1002566, GSM1611337, GSM1611338, GSM1611339, GSM1611340, and GSM1150322). Lens fiber, lens epithelium, and forebrain RNA-seq data were described in our previous study (GSE66961 [4]). All RNA-seq reads were processed as described in our previous publication [4], using the TopHat/Cufflinks software [22-24] to determine gene expression levels as reads per kilobase per million mapped reads (RPKM) [25]. We used the pre-computed RPKM values if they were available in the GEO. Chicken RNA-seq data are from the GSE53976 data sets [6].
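For readers unfamiliar with the RPKM unit, the following minimal sketch shows the standard definition (reads per kilobase of transcript per million mapped reads). It is an illustration only and does not reproduce the TopHat/Cufflinks pipeline used here; the function name `rpkm` and the numbers in the example are hypothetical.

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = reads / (gene length in kb * mapped reads in millions)
         = reads * 1e9 / (gene length in bp * total mapped reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical values for illustration: 500 reads on a 2-kb transcript
# in a library of 10 million mapped reads.
print(rpkm(read_count=500, gene_length_bp=2_000, total_mapped_reads=10_000_000))  # 25.0
```

Because the value is normalized to both transcript length and library size, RPKM values can be compared between genes within one sample and, with caution, between samples.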
The FAIRE experiments were performed following a protocol published by Giresi et al. [26]. Briefly, 300 newborn mouse lenses (CD-1 strain) were fixed with 1% formaldehyde for 10 min at room temperature, and the crosslinking was stopped by adding 2.5 M glycine to reach a final concentration of 125 mM. The crosslinked tissues were homogenized in a Dounce glass homogenizer (pestle B) in 10 ml of ice-cold lysis buffer (10 mM Tris-HCl pH 7.5, 10 mM NaCl, 3 mM MgCl2, 0.5% NP-40) and incubated on ice for 10 min. Following centrifugation (4 min at 1,500 ×g), the pellet was re-suspended in 3 ml of sonication buffer (0.1% SDS, 10 mM EDTA, 50 mM Tris-HCl pH 8.0, proteinase inhibitor cocktail) and sheared by sonication. Following centrifugation (1 min at 10,000 ×g), the supernatant was transferred to a new 1.5 ml test tube. Next, 300 μl of the chromatin sample was subjected to phenol/chloroform extraction to recover the DNA not bound by nucleosomes, which is found in the aqueous phase. The recovered DNA was subsequently treated with RNase A (final concentration of 50 μg/ml), purified using a MinElute PCR purification kit (Qiagen, Valencia, CA), and used for subsequent Illumina sequencing library preparation. The FAIRE-seq reads were aligned to the mouse genome using Bowtie software [27], and MACS software (version 2 [28]) was used for calling peaks. Two biologic replicates of FAIRE-seq were conducted. Analyzing the genome-wide read distribution (counting reads in 5-kb windows), we found that the two replicates were highly correlated (Pearson correlation coefficient = 0.89). For simplicity, we present data from one replicate (GEO accession number, GSM1841114) deposited within the GSE66961 data set.
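The replicate comparison described above (read counts in 5-kb windows, followed by a Pearson correlation) can be sketched as follows. This is a minimal illustration, not the code used in the study. It assumes that each aligned read is reduced to a (chromosome, start position) pair and that only windows containing at least one read in either replicate are compared; both are assumptions, as are all function names.

```python
import math
from collections import Counter

WINDOW_BP = 5_000  # window size stated in the text


def window_counts(reads, window_bp=WINDOW_BP):
    """Count reads per (chromosome, window index); reads are (chrom, start) pairs."""
    counts = Counter()
    for chrom, start in reads:
        counts[(chrom, start // window_bp)] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def replicate_correlation(reads_a, reads_b, window_bp=WINDOW_BP):
    """Correlate windowed read counts of two replicates.

    Assumption: windows with no reads in both replicates are ignored.
    """
    a = window_counts(reads_a, window_bp)
    b = window_counts(reads_b, window_bp)
    windows = sorted(set(a) | set(b))
    return pearson([a[w] for w in windows], [b[w] for w in windows])
```

In practice, the two replicates differ in sequencing depth; the Pearson coefficient is insensitive to such a uniform scaling of the counts, which is one reason it is a convenient summary here.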
The pol II peak identification and RNA-seq data have been described elsewhere [4].
The lists of DNA-binding transcription factors and autophagy genes were downloaded from AmiGO2.
Lens crystallins have been identified as the most abundant transcripts in the mouse lens by RNA-seq [5]; however, it is unknown how their expression ranks among genes highly expressed in terminally differentiated cells. To address this issue, we examined transcriptomes obtained from mouse lens epithelium, lens fibers, erythrocytes, forebrain, liver, neurons, and pancreas generated by RNA-seq [4,29-32]. Individual transcript abundance was determined from normalized reads per kilobase per million mapped reads (RPKM). Next, to compare the expression of top expressed genes across tissues, we selected the 10 most abundant transcripts in each tissue/organ and computed the ratios of their expression to the median expression level of all transcribed genes in the same tissue/organ, resulting in a relative “fold change” over the median (log2 scale). In Figure 1, we present the data for the top 10 lens fiber cell transcripts and the top five transcripts each from lens epithelium, erythrocytes, the forebrain, the liver, neurons, and the pancreas. This analysis revealed that five hemoglobin-encoding mRNAs (Hbb-bs, Hbb-b1, Hbb-b2, Hbb-bt, and Hba-a2) in red blood cells; calcium channel, voltage-dependent, α2/δ subunit 1 (Cacna2d1) in neurons; and 10 crystallins (Cryga, Cryge, Crygb, Cryba1, Crybb3, Cryaa, Crybb1, Crygd, Cryba4, and Cryba2) in lens fiber cells represent a group of the most abundant transcripts, followed by insulins (Ins2 and Ins1) and glucagon (Gcg) in the pancreas (Figure 1). In lens epithelium, the most abundant transcripts were in the order of Cryaa > Crybb3 > Rprl3 > Rn45s > Cryba1 (Figure 1). Taken together, these data quantitatively establish that in lens fiber cells, crystallins reach such prominent expression levels that they can only be compared with the adult hemoglobin genes in erythrocytes and calcium channel transcripts encoding Cacna2d1 in neurons.
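The "fold change over the median" calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used to generate Figure 1: "transcribed genes" are taken to be those with RPKM greater than zero, and the function name and gene labels are hypothetical.

```python
import math
from statistics import median


def top_fold_over_median(rpkm_by_gene, top_n=10):
    """Return the top_n most abundant genes with log2(RPKM / median RPKM).

    Assumption: the median is taken over genes with RPKM > 0 only.
    """
    expressed = {gene: value for gene, value in rpkm_by_gene.items() if value > 0}
    if not expressed:
        return []
    med = median(expressed.values())
    ranked = sorted(expressed.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
    return [(gene, math.log2(value / med)) for gene, value in ranked]


# Hypothetical RPKM values for a single tissue (illustration only).
tissue = {"geneA": 64.0, "geneB": 8.0, "geneC": 4.0, "geneD": 2.0, "geneE": 1.0, "geneF": 0.0}
for gene, log2_fold in top_fold_over_median(tissue, top_n=2):
    print(gene, log2_fold)
```

Expressing each transcript relative to the median of its own tissue makes the values comparable across data sets that were generated in different laboratories and at different sequencing depths.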
A known common denominator between lens crystallins, insulin, and glucagon (Figure 1) is that these genes share common transcription regulators, including Pax6 and large Maf proteins. Interestingly, Prox1, another transcriptional regulator of crystallins [31,33,34], is also expressed in the pancreas, where it controls exocrine functions [35]. Specifically, genetic and molecular studies have established multiple roles of Pax6 and c-Maf in crystallin gene regulation [36-42]. Studies in cell culture systems added MafA and MafB [39,40], although MafA−/−; MafB−/− and MafA−/−; MafB−/− compound lenses appear normal [43]. In the pancreas, the loss of Pax6 results in the decreased expression of insulin, glucagon, and somatostatin [44], and numerous Pax6-binding sites have been identified in the regulatory regions of these genes [45-48]. MafA is critical for glucose-responsive insulin gene expression in β-cells [49] and is regulated by Pax6 [50]. MafB regulates glucagon gene expression [51] and is required for β-cell maturation [52]. Both glucagon and insulin promoters also bind c-Maf [53], although its global role in pancreas biology remains to be established. Finally, Pax6 has been shown to directly regulate the expression of insulin-like peptides in insulin-producing neurons in Drosophila [54].
We were thus intrigued to compare the expression levels of crystallins in the pancreas and of insulins and glucagon in the lens. The RNA expression analysis of lens fibers revealed increased levels of glucagon (Gcg) over the median; however, the expression of both insulin genes was significantly below the median level (Figure 2A). In pancreas cells, the expression of three crystallins, αB- (Cryab), βA2- (Cryba2), and βB3- (Crybb3), was above the median level (Figure 2B); however, these levels were significantly below the expression levels of the top 10 pancreas transcripts. From these data, we concluded that even a full complement of DNA-binding transcription factors that regulate crystallin gene expression is not sufficient to elicit their expression in pancreas cells, raising the possibility that the regulatory activities of these factors are modified by lens- and pancreas-specific signaling and/or posttranslational modifications. It is also possible that gene control of αB-, βA2-, and βB3-crystallins (Figure 2B) utilizes a different set of transcription factors compared to the lens, as shown for αB-crystallin in cardiac and muscle cells [55]. Interestingly, the upregulation of crystallins has been identified in various diabetic eye models [56,57], raising the possibility that specific pathological conditions overcome the tight control of crystallin gene expression. Alternatively, it remains possible that individual crystallin gene expression in pancreas requires additional DNA-binding transcription factors, such as Hsf4 (see below), weakly expressed in the pancreas [58].
It has previously been shown that there is a correlation between chromatin structure and transcription levels [59,60]. Because the expression of crystallins in lens fibers reaches an exceptional level, we wanted to map the nucleosome density using FAIRE-seq (see the Materials and Methods section) and correlate the data with pol II occupancy and transcription output measured by RNA-seq analysis. Examining the global distribution of the FAIRE-seq signal, we found extended open chromatin regions encompassing crystallin loci located on the mouse chromosomes 1, 5, 9, 11, 16, and 17, often as clusters of duplicated genes (Figure 3). Many of those loci also contained a high abundance of pol II, especially at the Cryaa locus. While the overwhelming majority of pol II occupancy was found at the crystallin loci, a large enrichment was also observed in a few additional loci on chromosomes 2 and 19 (Figure 3), including Lrrc4c and lncRNA Malat1 regions.
We identified 107,737 FAIRE-seq peaks using MACS2 software [28], ranging from 133 bp to 5,374 bp and covering a total of ~37 Mb of the mouse genome (Table 1). Based on the refSeq annotation, 5.6% of the peaks were mapped to promoters (±2 kb of the annotated transcription start sites) and 52.9% of the peaks were within 50 kb of the annotated genes. To better link open chromatin with gene regulation, we compared and integrated the FAIRE-seq data with other chromatin modification data. Previously, we identified a total of 301 regions with extended H3K27ac enrichment in lenses [4] using the recently described method (“ROSE”) for identifying “super enhancers” [61]. For comparison, 308 super enhancers were found in the embryonic forebrain chromatin (Table 1). We found 420 FAIRE-seq peaks within the 301 lens super enhancers; 190 genes were found to be within 50 kb of these peaks using GREAT software [62], and the involved genes were significantly enriched for functions related to both eye development and eye diseases (Appendix 1). As the converse of the “super enhancer” analysis, we analyzed extended H3K27me3 blocks (>15 kb) and more specifically those extending beyond the start and end of the annotated genes (Table 1). A total of 164 such extended regions (“super repressors”) were found in the lenses, covering 237 genes, of which 99 encode well-studied transcription factors that control patterning and terminal differentiation, such as Myod1, Neurod1, Hand1, and 29 Hox genes. The repression of these genes is strongly supported by their low expression values in the global RNA-seq data (data not shown). Nevertheless, 113 FAIRE-seq peaks were located within these super repressors. As expected, they were significantly enriched at loci encoding transcription factors (e.g., Lhx3 and Otx2) involved in cell fate commitment and system development, including eye development.
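The assignment of peaks to promoters (±2 kb of annotated transcription start sites) can be illustrated with the following minimal sketch. It is not the annotation procedure used in this study. The text does not state the exact overlap criterion, so the sketch assumes that a peak counts as a promoter peak if any part of it overlaps a ±2 kb window around a transcription start site (TSS). The function name and the data layout (peaks as (chromosome, start, end) tuples, TSS positions as sorted lists per chromosome) are hypothetical.

```python
from bisect import bisect_left

PROMOTER_FLANK_BP = 2_000  # +/- 2 kb around each TSS, as stated in the text


def fraction_in_promoters(peaks, tss_by_chrom, flank_bp=PROMOTER_FLANK_BP):
    """Fraction of peaks overlapping a +/- flank_bp window around any TSS.

    peaks: list of (chrom, start, end) tuples.
    tss_by_chrom: dict mapping chrom -> list of TSS positions, sorted ascending.
    A peak [start, end] overlaps [tss - flank, tss + flank] exactly when
    start - flank <= tss <= end + flank.
    """
    if not peaks:
        return 0.0
    hits = 0
    for chrom, start, end in peaks:
        tss = tss_by_chrom.get(chrom, [])
        # First TSS at or to the right of (start - flank).
        i = bisect_left(tss, start - flank_bp)
        if i < len(tss) and tss[i] <= end + flank_bp:
            hits += 1
    return hits / len(peaks)


# Hypothetical coordinates for illustration only.
tss_positions = {"chr1": [10_000, 50_000]}
example_peaks = [
    ("chr1", 11_500, 11_800),  # within 2 kb of the TSS at 10,000
    ("chr1", 20_000, 20_300),  # far from any TSS
    ("chr1", 47_900, 48_100),  # right edge reaches the window of the TSS at 50,000
    ("chr2", 100, 300),        # chromosome without annotated TSS
]
print(fraction_in_promoters(example_peaks, tss_positions))  # 0.5
```

The same interval test, with the flank set to 50 kb and gene bodies used in place of single TSS positions, would correspond to the "within 50 kb of the annotated genes" category, although that variant is not shown here.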
Transcriptional regulation of the mouse αA-crystallin (Cryaa) locus has been deciphered using a combination of transgenic mouse and molecular biology studies, leading to the identification of two distal developmentally controlled enhancers within a 16 kb genomic region [63-66]. In the Cryaa locus (chromosome 17), prominent pol II enrichment is found in the coding gene sequences and the extended 3′-UTR (Figure 4A). In contrast, the Cryab gene is located in the head-to-head orientation with Hspb2 (chromosome 9, Figure 4B). The transcription start sites of Cryab (in the lens) and the sequence-related Hspb2 differ by only 866 bp; however, only the Cryab gene is expressed in the lens [67,68]. Interestingly, the Cryab regulatory regions extend into a 4 kb 5′-flanking region encompassing the Hspb2 gene [69]. As pol II is detected in the 5′-region of Cryab, including the coding regions of Hspb2, it appears that there are multiple upstream start sites of Cryab gene transcription [67-69]. In the forebrain, we did not find any significant numbers of αA- and αB-crystallin transcripts, nor did we observe pol II enrichment at these loci (Figure 4A,B).
Mouse chromosome 1 contains a cluster of five γ-crystallin genes (Cryga-Cryge); the sixth member, Crygf, is separated from this cluster by a ~0.8 Mbp genomic region and is located in the opposite orientation relative to the Cryga-Cryge cluster (Figure 4C). In the γ-crystallin cluster, pol II is localized in both the transcribed regions (as expected) and the intergenic regions linking these genes (Figure 4C). This suggests the possibility that the enzyme leaving the template can be “recycled” by the next transcriptional unit without “leaving” the active chromatin domain. Mouse chromosome 5 harbors five β-crystallin genes: Cryba4, Crybb1, Crybb2, Crybb3 (Figure 4D,E), and Crygn (data not shown). Cryba4 and Crybb1 are arranged in a head-to-head orientation and are expressed in both lens fibers and lens epithelium (Figure 4D). There are six non-crystallin genes between the Cryba4 and Crybb1 pair and the Crybb2 and Crybb3 pair. The spacer region of 3.3 kb between Cryba4 and Crybb1 does not harbor any appreciable amount of pol II (Figure 4D). In contrast to Crybb3, Crybb2 is not yet transcribed in newborn mouse lenses (Figure 1 and Figure 4E), as Crybb2 expression is augmented after birth [70]. Note that in the E12.5 forebrain, the αA- and γF-crystallin encoding mRNAs are of low abundance [4]. From these data, we concluded that crystallin loci are marked as open chromatin. Very high levels of crystallin gene expression correlate with an extremely high abundance of pol II all across the individual transcriptional units. Ongoing experiments are focused on discovering if these crystallin transcriptional units form “transcriptional hub/factories” in the 3D nuclear space [71,72].
Although an appreciable number of DNA-binding transcription factors have been studied during lens differentiation, with notable differences in their expression domains in the lens [18,73], an unbiased analysis of transcripts encoding transcription factors enriched in the lens epithelium and lens fibers has not yet been conducted. Here, we analyzed lens transcriptomes using three pair-wise comparisons: lens epithelium versus embryonic forebrain, lens fibers versus embryonic forebrain, and lens fibers versus lens epithelium. The embryonic E12.5 forebrain was included for comparison as both the lens and the forebrain are of common ectodermal origin and both tissues require Pax6 for their formation [18,74]. Furthermore, the analysis of transcript abundance between tissues greatly aids in the identification of functional and disease-related genes [7]. The transcripts identified as being more abundant in lens epithelium than in the forebrain include Tfap2a (AP-2α), Foxe3, Pitx3, and Prox1 (Figure 5A), while Pitx3, Hsf4, Prox1, and Maf (c-Maf) are more abundant in lens fibers (Figure 5B). The most notable lens fiber-enriched transcripts include Hsf4, Maf, Sox1, and Prox1 (Figure 5C). In contrast, compared to lens fibers, lens epithelium is enriched for Pax6, Jun, Foxe3, and Hey2 transcripts (Figure 5C). In both lens compartments, Ctnnb1 (β-catenin) and Atf4 show very high levels of expression, consistent with their roles in lens differentiation [75-77]. It should be noted that only a fraction of cellular β-catenin is engaged in transcription through the canonical Wnt/β-catenin signaling pathway [78].
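A pair-wise comparison of transcript abundance between two tissues can be sketched as a ranking by log2 ratio of RPKM values. This is only an illustration: the text does not specify how the comparisons were computed, and the pseudocount of 1 RPKM (used to avoid division by zero and to dampen ratios between very low values), the function names, and the gene labels are all assumptions.

```python
import math

PSEUDOCOUNT = 1.0  # assumption: added to both tissues before taking the ratio


def log2_ratio(rpkm_a, rpkm_b, pseudocount=PSEUDOCOUNT):
    """log2 of (tissue A + pseudocount) / (tissue B + pseudocount)."""
    return math.log2((rpkm_a + pseudocount) / (rpkm_b + pseudocount))


def rank_enriched(expr_a, expr_b, genes, pseudocount=PSEUDOCOUNT):
    """Rank the given genes from most A-enriched to most B-enriched.

    expr_a, expr_b: dicts mapping gene -> RPKM; missing genes count as 0.
    genes: the gene set of interest, e.g. a list of DNA-binding transcription factors.
    """
    scored = [
        (gene, log2_ratio(expr_a.get(gene, 0.0), expr_b.get(gene, 0.0), pseudocount))
        for gene in genes
    ]
    return sorted(scored, key=lambda kv: kv[1], reverse=True)


# Hypothetical RPKM values for two tissues (illustration only).
tissue_a = {"tf1": 31.0, "tf2": 3.0, "tf3": 0.0}
tissue_b = {"tf1": 1.0, "tf2": 3.0, "tf3": 7.0}
for gene, score in rank_enriched(tissue_a, tissue_b, ["tf1", "tf2", "tf3"]):
    print(gene, score)
```

Applying the same ranking to each of the three tissue pairs would yield three ordered lists, one per comparison.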
Our previous analysis of Pax6 and Prox1 identified novel distal enhancers of both genes [4]. Here, we show chromatin data regarding selected genes that represent the most abundantly expressed transcription factors in the lens (Tfap2a, Hsf4 and Maf), that are expressed highly in the lens and forebrain (Ctnnb1), that are expressed in lens epithelium and the forebrain (Sox2), and that are expressed highly in the forebrain (Sox11; Figure 6 and Figure 7). Both Tfap2a and Foxe3 mRNAs are abundant in the lens epithelium (Figure 6A), which is in agreement with earlier studies of these transcription factors in the lens [79,80]. Both loci are marked with a much lower level of H3K27me3 in the lens compared to forebrain chromatin. Hsf4 appears to be one of the most important DNA-binding transcription factors in lens fibers [81]; it controls the expression of all γ-crystallins [58] and αB-crystallin [82]. In lens chromatin, the Hsf4 locus is marked by low levels of H3K27me3 (Figure 6C). In contrast, the Hsf4 locus is marked by large H3K27me3 domains in the forebrain. Maf is more highly expressed in fibers than in lens epithelium [38] and is only weakly expressed in the forebrain, consistent with the differential enrichment of H3K27me3 in the Maf locus (Figure 6D). In lens chromatin, the Maf locus is marked by H3K27ac, and its promoter/gene body are also marked by H3K4me3 (Figure 6D). β-catenin (Ctnnb1) is a multifunctional protein with a specific role in transcription; Wnt signaling stimulates the translocation of β-catenin into the nucleus, where it forms a specific complex with HMG box DNA-binding factors Lef/Tcf [83]. Ctnnb1 is highly expressed in the lens and forebrain, consistent with the low abundance of H3K27me3 along this gene in both tissues (Figure 7A). In addition to its role in the ES cell core Oct4-Sox2-Nanog regulatory network, Sox2 is an important regulator of lens placode formation [84-86] and neural stem cells [63,87]. 
The present data suggest different regulatory mechanisms of the Sox2 gene in lens epithelium and the forebrain, as the H3K4me1, H3K27ac, H3K4me3, and H3K27me3 core histone PTMs show different patterns of intensities in lens and forebrain chromatins (Figure 7B), consistent with the different utilization of tissue-specific enhancers [88]. Finally, Sox11 is expressed in the invaginating lens placode and is required for the separation of lens vesicles from the surface ectoderm [89]. Its expression is reduced in the differentiating lens [90]. The expression of Sox11 is high in the neonatal cortex [91] and is essential for embryonic and adult neurogenesis [92]. The chromatin structure of Sox11 in the forebrain is marked by abundant H3K4me1, H3K27ac, and H3K4me3, while these histone PTMs are less abundant in lens chromatin (Figure 7C).
The current analysis of the most abundantly expressed DNA-binding transcription factors in the mouse lens supports the functional roles of these proteins in controlling lens development [21,73]. It also supports the rational design of iSyTE for identifying lens disease-causing genes based on their relative abundance in the lens compared to whole embryonic tissues [7]. Many of these DNA-binding transcription factors (e.g., Atf4, Maf, Pax6, Prox1, Sox2, Tfap2a, etc.) bind the CREB-binding protein (CBP) and p300 histone acetyltransferases, which are essential for embryonic lens induction [93] and display different temporal and spatial expressions in the embryonic mouse lens [94]. The loss of three CBP/p300 alleles results in cataracts [92]. Some of these factors (e.g., Gata3, Hsf4, and Pax6) interact with SWI/SNF ATP-dependent chromatin remodeling complexes [65,95].
Lens fiber cell differentiation and homeostasis require the degradation of nuclei and mitochondria. Lens differentiation includes autophagy [6,96], mitophagy [97], chromatin degradation by lens-preferred DNase IIβ endonuclease [98], and proteasome-mediated protein degradation [99,100]. Here, we analyzed the expression and chromatin features of autophagy genes (Figure 8, Figure 9). Eight transcripts encoding Ctsd, Gabarapl1, Park7, Fis1, Optn, Wipi1, Wipi2, and Mtor show lens-preferred expression. Ctsd is the major proteolytic enzyme and marker of catabolic activity in various ocular tissues [101,102]. Its transcripts are more abundant in the lens compared to the embryonic forebrain (Figure 8A,B). γ-aminobutyric acid (GABA) A receptor-associated protein-like 1 (Gabarapl1, Atg8l) is the mammalian homolog of yeast Atg8, which regulates autophagosome formation [103]. It is highly expressed in the lens and is more abundant in lens fibers (Figure 8C). Parkinson’s disease (autosomal recessive, early onset) 7 (Park7) is the most abundant transcript among the autophagy group examined here (Figure 8B,C). Park7 is required for mitochondrial homeostasis and turnover [104]. The pro-fission mitochondrial Fis1 (Fission 1) is highly expressed in the lens, with a higher abundance in lens fibers (Figure 8C); it induces mitochondrial fragmentation before mitophagy [105]. Interestingly, Park7 promotes the proteasomal degradation of Fis1 [106].
Together with sequestosome 1 (SQSTM1/p62) and ubiquilin-2 (Ubqln2), optineurin (Optn) functions as an autophagy receptor protein [107]. Optn is more abundant in mouse lens fibers compared to lens epithelium (Figure 8C). The WD repeat domain, phosphoinositide interacting 1 (Wipi1, ATG18) and 2 (Wipi2, ATG18B) proteins function as essential phosphatidylinositol 3-phosphate (PtdIns3P) effectors at the nascent autophagosome [108]. Both Wipi1 and Wipi2 are more abundant in lens fibers compared to the forebrain (Figure 8B). Finally, the mechanistic target of rapamycin (serine/threonine kinase; Mtor) is the central regulator of autophagy [109,110], with increased expression in lens fibers compared to the forebrain (Figure 8B), and has an established role in lens organelle degradation [96].
The transcription of autophagy genes is regulated by several DNA-binding factors: Tfeb [111]; a subfamily of FoxO proteins, including FoxO1, FoxO3a, FoxO4, and FoxO6 [112]; and Hif-1α (gene name: Hif1a). Hif-1α is regulated at multiple levels, as its degradation is controlled by chaperone-mediated autophagy [113] and transcriptional control of BNIP3 [114]. Here, we found increased expression of Tfeb in both lens epithelium and lens fibers relative to the forebrain (Figure 8A,B), increased expression of Hif1a in lens epithelium compared to the fibers (Figure 8C), and moderately increased expression of Foxo1 in lens epithelium compared to the E12.5 forebrain (Figure 8A). Tfeb (bHLHe35) is a helix–loop–helix DNA-binding factor and serves as a master regulator of the lysosomal gene network, lysosomal biogenesis, autophagy, and other related processes [111]. The chromatin structure of the Tfeb locus and the RNA-seq data show highly active transcription in both lens epithelium and fibers and reveal an intragenic enhancer, marked by H3K4me1 and H3K27ac (Figure 9A). RNA-seq data at Foxo1 show abundant pre-spliced RNAs in lens epithelium, raising the possibility that splicing may control the availability of the Foxo1 mRNAs (Figure 9B). Its lower level of expression in the forebrain correlates with increased H3K27me3 in the promoter region (Figure 9B). The chromatin structure of the Hif1a locus suggests that its expression is regulated by the promoter; in forebrain chromatin, pol II is abundant, although RNA-seq data show much lower levels of expression compared to the lens epithelium (Figure 9C). Taken together, the targeting of Tfeb, Foxo1, and Hif1a in the lens by MLR10-cre [115] represents an excellent opportunity to determine the function of these genes in lens differentiation and subcellular organelle turnover.
Four sets of transcriptome data have been generated in E13 embryonic chicken lens compartments [6]. Herein, we analyzed spatial gene expression changes of those 11 mouse genes, grouped as autophagy regulators (Figure 10A) and transcription factors (Figure 10B). Mouse Fis1 does not have any known chicken homolog. In both mouse and chicken lenses, the abundance of autophagy transcription factors is in the order Foxo1 < Tfeb < Hif1a (Figure 8, Figure 10). The chicken data show major upregulation of Mtor and Wipi1 in lens fiber samples compared to lens epithelium at E13. These changes were not detected in the mouse data, most likely due to the differences in the stages of lens cell differentiation between the E13 chicken and newborn (P1) mouse lenses. The E13 chicken lens fiber portions are still establishing an organelle-free zone, while in newborn mouse lenses, the organelle-free zone is already established. We conclude that both mouse and chicken lenses express cohorts of genes implicated in autophagy, and the high abundance of some of their transcripts points to specific functions in lens homeostasis and organelle degradation and identifies candidate genes for functional studies in both systems.
Taken together, the present study establishes that crystallins are among a sparse group of the most abundantly expressed genes in mammalian tissues and organs. Their expression levels are further increased by 2–3 orders of magnitude in lens fibers relative to lens epithelium, where, for example, αA-crystallin is already abundantly expressed (at least 1,000-fold relative to the median expression level). Based on these data, it appears that a significant portion of the transcriptional apparatus is committed to transcribing crystallin genes, as evidenced by the high density of pol II across the coding exons, introns, and extended 3′-UTRs. In addition, crystallin loci show increased “open” chromatin accessibility, as evaluated by FAIRE-seq. These findings suggest several possible models, for example, formation of 3D transcriptional factories in which multiple crystallin loci are physically tethered in the 3D-space of the nuclei [116]. Finally, transcriptional machinery can be visualized through high-resolution two-photon microscopy of fluorescently tagged proteins [117,118]. Thus, the crystallin loci can be used as excellent sites to study the pol II elongation rate, nucleosome dynamics, and transcriptional termination. All these models are testable, and ongoing experiments are focused on addressing these outstanding questions.
Although RNA studies can be conducted using 10,000 to 100,000 cells, or even single cells following amplification [119], chromatin studies are more limited: while 10,000 cells are sufficient for special applications, the majority of experiments require large amounts of material (e.g., 20–300 lenses) and high-quality antibodies. The present chromatin data were obtained from a mixture of primary lens cells, and additional studies will be required to assess chromatin interactions in lens epithelium and lens fibers. It is also noteworthy that cross-linking stabilizes actual protein-DNA interactions but does not reveal the dynamics of transcription factor–DNA interactions in the specific nucleosome or linker region. Nevertheless, the RNA-seq studies were conducted with isolated lens epithelium and lens fibers and aid in the interpretation of chromatin data obtained from the whole lens.
Appendix 1. Functional annotation of genes next to FAIRE-seq peaks in super-enhancers.
This work was supported by NIH grants R01 EY012200 (AC), EY014237 (AC), EY013022 (MK), and R21 MH099452 (DZ), and by an unrestricted departmental grant from Research to Prevent Blindness, Inc. to the Department of Ophthalmology and Visual Sciences. We thank the Einstein Genomics and Epigenomics and Proteomics Shared Facilities for their services. Data in this paper are from a thesis submitted in partial fulfillment of the requirements for the Degree of Doctor of Philosophy in the Graduate Division of Medical Sciences, Albert Einstein College of Medicine, Yeshiva University (JS).